Re: [Qemu-devel] [PATCH 01/18] block: move AioContext, QEMUTimer, main-loop to libqemuutil

2017-02-13 Thread Fam Zheng
On Mon, 02/13 14:52, Paolo Bonzini wrote:
> --- /dev/null
> +++ b/util/aiocb.c
> @@ -0,0 +1,55 @@
> +/*
> + * BlockAIOCB allocation
> + *
> + * Copyright (c) 2003-2017 Fabrice Bellard and the QEMU team

Hmm, I'm not a lawyer, just wondering if the QEMU team is a legal entity that can
hold copyright? :)

Fam



Re: [Qemu-devel] [RFC PATCH 06/41] block: Involve block drivers in permission granting

2017-02-13 Thread Fam Zheng
On Mon, 02/13 18:22, Kevin Wolf wrote:
> +int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
> +Error **errp)
> +{
> +int ret;
> +
> +ret = bdrv_child_check_perm(c, perm, shared, errp);
> +if (ret < 0) {
> +return ret;
> +}
> +
> +bdrv_child_set_perm(c, perm, shared);

This has a TOCTOU (time-of-check to time-of-use) issue: the state can change
between the check and the set, which means image locking cannot fit in easily.
Maybe squash them into one callback (.bdrv_try_set_perm) that can return an error?
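
To make the concern concrete, here is a minimal sketch in plain C, not QEMU
code. The Image type, its locked_by_other flag, and the image_* functions are
all hypothetical stand-ins for a driver that takes an external image lock. It
only illustrates why a set step that cannot fail leaves a window after the
check, and how a single try-set callback that returns an error would close it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image with an external lock; not a QEMU structure. */
typedef struct Image {
    bool locked_by_other;   /* simulates another process holding the lock */
    uint64_t perm;          /* permissions we currently hold */
} Image;

/* Check step: only reports whether the lock looks available right now. */
static int image_check_perm(const Image *img)
{
    assert(img != NULL);
    return img->locked_by_other ? -EBUSY : 0;
}

/*
 * Combined step: validates and commits in one call and can report
 * failure, so a lock taken by someone else after an earlier check is
 * still caught here instead of being silently ignored.
 */
static int image_try_set_perm(Image *img, uint64_t perm)
{
    assert(img != NULL);
    if (img->locked_by_other) {
        return -EBUSY;
    }
    img->perm = perm;
    return 0;
}
```

With a separate void set step, a caller that saw image_check_perm() succeed
has no way to learn that the lock was taken in the meantime; with
image_try_set_perm() the failure is reported at the point of use.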

> +
>  return 0;
>  }
>  
> @@ -1322,6 +1445,7 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>  child->role->drained_end(child);
>  }
>  QLIST_REMOVE(child, next_parent);
> +bdrv_update_perm(old_bs);
>  }
>  
>  child->bs = new_bs;
> @@ -1331,6 +1455,7 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>  if (new_bs->quiesce_counter && child->role->drained_begin) {
>  child->role->drained_begin(child);
>  }
> +bdrv_update_perm(new_bs);
>  }
>  }
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index f36b064..8578e17 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -320,6 +320,45 @@ struct BlockDriver {
>  void (*bdrv_del_child)(BlockDriverState *parent, BdrvChild *child,
> Error **errp);
>  
> +/**
> + * Checks whether the requested set of cumulative permissions in @perm
> + * can be granted for accessing @bs and whether no other users are using
> + * permissions other than those given in @shared (both arguments take
> + * BLK_PERM_* bitmasks).
> + *
> + * If both conditions are met, 0 is returned. Otherwise, -errno is returned
> + * and errp is set to an error describing the conflict.
> + */
> +int (*bdrv_check_perm)(BlockDriverState *bs, uint64_t perm,
> +   uint64_t shared, Error **errp);
> +
> +/**
> + * Called to inform the driver that the set of cumulative set of used
> + * permissions for @bs has changed to @perm, and the set of sharable
> + * permission to @shared. The driver can use this to propagate changes to
> + * its children (i.e. request permissions only if a parent actually needs
> + * them).
> + *
> + * If permissions are added to @perm or dropped from @shared, callers must
> + * use bdrv_check_perm() first to ensure that this operation is valid.
> + * Dropping from @perm or adding to @shared is always allowed without a
> + * previous check.
> + */
> +void (*bdrv_set_perm)(BlockDriverState *bs, uint64_t perm, uint64_t shared);
> +
> +/**
> + * Returns in @nperm and @nshared the permissions that the driver for @bs
> + * needs on its child @c, based on the cumulative permissions requested by
> + * the parents in @parent_perm and @parent_shared.
> + *
> + * If @c is NULL, return the permissions for attaching a new child for the
> + * given @role.
> + */
> + void (*bdrv_child_perm)(BlockDriverState* bs, BdrvChild *c,
> + const BdrvChildRole *role,
> + uint64_t parent_perm, uint64_t parent_shared,
> + uint64_t *nperm, uint64_t *nshared);
> +
>  QLIST_ENTRY(BlockDriver) list;
>  };
>  
> @@ -832,6 +871,13 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>void *opaque, Error **errp);
>  void bdrv_root_unref_child(BdrvChild *child);
>  
> +int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
> +  Error **errp);
> +void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared);
> +int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
> +Error **errp);
> +
> +
>  const char *bdrv_get_parent_name(const BlockDriverState *bs);
>  void blk_dev_change_media_cb(BlockBackend *blk, bool load);
>  bool blk_dev_has_removable_media(BlockBackend *blk);
> -- 
> 1.8.3.1
> 



Re: [Qemu-devel] [PATCH 1/2] ppc/xics: remove set_nr_irqs() handler from XICSStateClass

2017-02-13 Thread David Gibson
On Mon, Feb 13, 2017 at 03:09:16PM +0100, Cédric Le Goater wrote:
> Today, the ICS (Interrupt Controller Source) object is created and
> realized by the init and realize routines of the XICS object, but some
> of the parameters are only known at the machine level.
> 
> These parameters are passed from the sPAPR machine to the ICS object
> in a rather convoluted way using property handlers and a class handler
> of the XICS object. The number of irqs required to allocate the IRQ
> state objects in the ICS realize routine is one of them.
> 
> Let's simplify the process by creating the ICS object along with the
> XICS object at the machine level and link the ICS into the XICS list
> of ICSs at this level also. In the sPAPR machine, there is only a
> single ICS but that will change with the PowerNV machine.
> 
> Also, QOMify the creation of the objects and get rid of the
> superfluous code.
> 
> Signed-off-by: Cédric Le Goater 

I like the basic idea here: while the ics and icp objects are pretty
straightforward, the "xics" object has always been a bit of a hack,
with logic that really belongs in the machine.

But.. I don't think the approach here really works.  Specifically..

[snip]
> -static XICSState *try_create_xics(const char *type, int nr_servers,
> -  int nr_irqs, Error **errp)
> -{
> -Error *err = NULL;
> -DeviceState *dev;
> +static XICSState *try_create_xics(const char *type, const char *type_ics,
> +  int nr_servers, int nr_irqs, Error **errp)
> +{
> +Error *err = NULL, *local_err = NULL;
> +XICSState *xics;
> +ICSState *ics = NULL;
> +
> +xics = XICS_COMMON(object_new(type));
> +qdev_set_parent_bus(DEVICE(xics), sysbus_get_default());
> +object_property_set_int(OBJECT(xics), nr_servers, "nr_servers", &err);
> +object_property_set_bool(OBJECT(xics), true, "realized", &local_err);
> +error_propagate(&err, local_err);
> +if (err) {
> +goto error;
> +}
>  
> -dev = qdev_create(NULL, type);
> -qdev_prop_set_uint32(dev, "nr_servers", nr_servers);
> -qdev_prop_set_uint32(dev, "nr_irqs", nr_irqs);
> -object_property_set_bool(OBJECT(dev), true, "realized", &err);
> +ics = ICS_SIMPLE(object_new(type_ics));
> +object_property_add_child(OBJECT(xics), "ics", OBJECT(ics), NULL);
> +object_property_set_int(OBJECT(ics), nr_irqs, "nr-irqs", &err);
> +object_property_set_bool(OBJECT(ics), true, "realized", &local_err);
> +error_propagate(&err, local_err);
>  if (err) {
> -error_propagate(errp, err);
> -object_unparent(OBJECT(dev));
> -return NULL;
> +goto error;
> +}
> +
> +ics->xics = xics;
> +QLIST_INSERT_HEAD(&xics->ics, ics, list);

Poking into the ics and xics objects directly from the machine here
violates abstraction even worse than the existing xics device does.
In fact, avoiding that is basically why the xics device exists in the
first place.

I've thought about this a bit more, and I think I know how to solve
this better now.

I think that "xics" shouldn't be a concrete object at all, instead it
should be a QOM interface, implemented by the machine type.  Both ICS
and ICP would take a link property to find their xics.  The xics
interface would provide methods to return an ics object given an irq
number and to return an icp object given a server number. This gives
full control of the irq and server number spaces back to the machine
type, which is really where it belongs.
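
As a rough illustration of the shape I have in mind, here is a sketch in
plain C using a table of function pointers. It deliberately does not use the
real QOM interface machinery, and every name in it (XICSOps, Machine,
machine_ics_get, machine_icp_get, the field names) is made up for the example.
The point is only that the machine owns the irq and server number spaces and
answers the lookups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real device states. */
typedef struct ICS { int first_irq; int nr_irqs; } ICS;
typedef struct ICP { int server; } ICP;

/* The "xics" interface: two lookups, implemented by the machine. */
typedef struct XICSOps {
    ICS *(*ics_get)(void *machine, int irq);
    ICP *(*icp_get)(void *machine, int server);
} XICSOps;

/* Hypothetical machine that owns one ICS and a few ICPs. */
typedef struct Machine {
    const XICSOps *xics;
    ICS ics;
    ICP icps[4];
    int nr_servers;
} Machine;

/* Return the ICS covering @irq, or NULL if no source owns that number. */
static ICS *machine_ics_get(void *opaque, int irq)
{
    Machine *m = opaque;

    assert(m != NULL);
    if (irq >= m->ics.first_irq && irq < m->ics.first_irq + m->ics.nr_irqs) {
        return &m->ics;
    }
    return NULL;
}

/* Return the ICP for @server, or NULL if the server number is invalid. */
static ICP *machine_icp_get(void *opaque, int server)
{
    Machine *m = opaque;

    assert(m != NULL);
    if (server < 0 || server >= m->nr_servers) {
        return NULL;
    }
    return &m->icps[server];
}

static const XICSOps machine_xics_ops = {
    .ics_get = machine_ics_get,
    .icp_get = machine_icp_get,
};
```

An ICS or ICP would then hold only a link to the implementer and call through
the ops table, rather than having the machine poke into their fields.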

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[Qemu-devel] [Bug 912983] Re: Unable to install OS/2 Warp v3 past disk 2

2017-02-13 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/912983

Title:
  Unable to install OS/2 Warp v3 past disk 2

Status in QEMU:
  Expired

Bug description:
  To whom it may concern,

  As you may (or may not) be aware, QEMU is currently unable to readily
  install OS/2 Warp v3 (OS2W3) when asked for Installation Diskette 2
  
(http://www.claunia.com/qemu/objectManager.php?sClass=version=132=138).

  QEMU 0.8.2 is the last known (to me) release to successfully install
  OS2W3. QEMU version 1.0 and the latest development version (as of
  2012-01-05) have been verified not to work.

  A 'git bisect' reveals the issue was introduced during a migration to
  new removable media handling prior to the QEMU 0.9.0 release:

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
66c6ef7678939f2119eb649074babf5d5b2666f6
ea185bbda732dae6b6a5a44699f90c83e21f1494
19cb37389f4641d48803f0c5d72708749cbcf318
We cannot bisect more!

  For testing, the 'qcow' hard drive format was chosen due to QEMU 0.8.2
  not having 'qcow2':

  $ qemu -M isapc -m 8 -localtime -soundhw sb16 -hda os2.qcow
  -fda install.img -boot a

  Of note, the ISA-only PC (isapc) was needed for QEMU 0.8.2 to 0.9.0.
  Otherwise QEMU hangs on start-up. Later versions of QEMU hit a
  segmentation fault when attempting to use '-M isapc', though they boot
  correctly when using the default PC machine.

  
  The currently preferred method to install OS2W3 is to use another
  application (such as VirtualBox or VMware) with a QEMU-compatible disk
  image format. Once installed, QEMU can then run OS2W3, which it does
  phenomenally well.

  However, I've identified a way to install OS2W3 exclusively with QEMU,
  which may also shed additional light on the issue.

  1. Using a relatively new QEMU (I'm on 0.11.1), install OS2W3 as you
     normally would on to a 'qcow2' hard drive.
  2. When Installation Diskette 2 is reached, save a VM snapshot.
  3. Quit QEMU and re-run, loading the VM state *with* the Installation
     Diskette 2 image in the floppy drive.
     $ qemu -m 8 -localtime -soundhw sb16 -hda os2.qcow2 -fda disk2.img -loadvm install
  The installation process will then continue as normal.

  This same method can be used once OS2W3 continues installing from its
  GUI. Installation Diskette 7 experiences the same issue of not being
  recognised when inserted.

  Of note, as an unrelated issue, I was unable to save VM snapshots in
  QEMU 1.0 or later.

  
  Thank you for a fantastic emulator.

  
  cheers,
  multitude

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/912983/+subscriptions



[Qemu-devel] [Bug 922355] Re: qemu crashes when invoked on Pandaboard

2017-02-13 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/922355

Title:
  qemu crashes when invoked on Pandaboard

Status in QEMU:
  Expired

Bug description:
  root@omap:~# uname -a
  Linux omap 3.1.6-x6 #1 SMP Thu Dec 22 11:17:51 UTC 2011 armv7l armv7l
  armv7l GNU/Linux

  root@omap:~# qemu
  Could not initialize KVM, will disable KVM support
  /build/buildd/qemu-kvm-0.14.1+noroms/tcg/arm/tcg-target.c:848: tcg fatal error




[Qemu-devel] [Bug 938945] Re: Slirp cannot be forward and makes segmentation faults

2017-02-13 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/938945

Title:
  Slirp cannot be forward and makes segmentation faults

Status in QEMU:
  Expired

Bug description:
  Hi,

  Let's consider the following lines:

  $ qemu -enable-kvm -name opeth -hda debian1.img -k fr -localtime -m
  512 -net user,vlan=0 -net
  nic,vlan=0,model=$model,macaddr=a2:00:00:00:00:10 -net
  socket,vlan=1,listen=127.0.0.1:5900 -net
  nic,vlan=1,model=$model,macaddr=a2:00:00:00:00:04

  $qemu -enable-kvm -name nightwish -hda debian2.img -k fr -localtime -m
  512 -net socket,vlan=0,connect=127.0.0.1:5900 -net
  nic,vlan=0,model=$model,macaddr=a2:00:00:00:00:02

  
  My configuration is clear and allows packets to be transmitted between
  Slirp and the guest nightwish.
  But when I try to run the following on nightwish:

  $ wget www.qemu.org

  The opeth QEMU segfaults: "11586 Segmentation Fault"

  This does not always happen. If the segfault does not
  appear, nightwish cannot establish a connection to the internet :(

  
  Thanks
  Vince




[Qemu-devel] [Bug 906864] Re: Always mouse grabbing with -usbdevice tablet

2017-02-13 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/906864

Title:
  Always mouse grabbing with -usbdevice tablet

Status in QEMU:
  Expired

Bug description:
  version: QEMU emulator version 1.0 (qemu-kvm 1.0)
   QEMU emulator version 1.0 (qemu 1.0)
   (source builds)
  os: archlinux x86-64
  last working version: qemu-kvm 0.15.1

  commandline: each with "-usb -usbdevice tablet" and sdl output

  expected behavior:  mouse gets grabbed only by forcing it (pressing
  release/grab-combination (CTRL-ALT))

  actual behavior:
  When moving the mouse over the window it gets instantly grabbed, so I
  cannot use window-manager-specific hotkeys anymore. After pressing the
  release combination, every mouse movement over or within the window grabs
  the mouse again.
  I have tried this with several VGA types and window managers: no difference.




Re: [Qemu-devel] [PATCH 00/22] target/openrisc updates

2017-02-13 Thread Programmingkid

On Feb 13, 2017, at 10:25 PM, Richard Henderson wrote:

> On 02/10/2017 11:39 AM, Stafford Horne wrote:
>> On Thu, Feb 09, 2017 at 09:10:51AM -0500, G 3 wrote:
>>> 
>>> On Feb 8, 2017, at 11:52 PM, qemu-devel-requ...@nongnu.org wrote:
>>> 
>>>> Message: 6
>>>> Date: Wed,  8 Feb 2017 20:51:32 -0800
>>>> From: Richard Henderson 
>>>> To: qemu-devel@nongnu.org
>>>> Cc: sho...@gmail.com
>>>> Subject: [Qemu-devel] [PATCH 00/22] target/openrisc updates
>>>> Message-ID: <20170209045154.16868-1-...@twiddle.net>
>>>> 
>>>> The bulk of this patch set is 2-3 years old, and was mostly
>>>> reviewed by Bastian Koppelmann.  But it languished because
>>>> there were reports of it not booting kernel images, and I
>>>> had problems putting together a set of tools that could even
>>>> build a kernel.
>>>> 
>>>> The OpenRISC community has picked up activity recently,
>>>> with Stafford Horne upstreaming some of the compiler tools.
>>>> He has even done some testing for me of this patch set.
>>>> 
>>>> 
>>>> r~
>>> 
>>> I see you are working on OpenRISC. Would you be able to help
>>> me improve its wiki page?
>>> 
>>> http://wiki.qemu-project.org/Documentation/Platforms/OpenRISC
>>> 
>>> Right now there isn't much information on OpenRISC here. I'm
>>> hoping to make it a lot more useful to anyone who is interested in this
>>> platform.
>>> 
>>> Would you know of any links to software that works in OpenRISC?
>>> 
>>> Pictures of the OpenRISC target running anything would be great also.
>>> 
>>> What is your suggested command-line for using OpenRISC?
>>> 
>>> If you have any suggestions on how to improve the page, please don't
>>> hesitate to let me know.
>>> 
>> 
>> Hi Richard, G 3,
>> 
>> I am not sure if Richard has the time for this.  But I kind of got him
>> started back on the openrisc work, it would be my fault to burden him
>> with even more work :).  Also, I was helping to do some tests so I have
>> some example commands.
> 
> :-)
> 
>> I don't have a lot of time, but I think I can help to update the above.
>> Probably Richard and I could work together?
>> 
>> I was recently working on updating the openrisc page
>> 
>>  http://openrisc.io
>> 
>> I think I could link between the two for toolchain compile guides and
>> software guides (i.e. debugging and linux).
>> 
>> Richard what do you think?
> 
> Probably the most beneficial thing that we can do is create a kernel+initrd 
> that boots into a busybox root shell.  Similar to the other test images that 
> we have at
> 
>  http://wiki.qemu-project.org/Testing/System_Images
> 
> 
> r~
> 

We could do both the test image and a wiki page.


Re: [Qemu-devel] [PATCH v6 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-13 Thread Ketan Nilangekar


On 2/13/17, 3:23 PM, "Jeff Cody"  wrote:

On Mon, Feb 13, 2017 at 10:36:53PM +, Ketan Nilangekar wrote:
> 
> 
> On 2/13/17, 8:32 AM, "Jeff Cody"  wrote:
> 
> On Mon, Feb 13, 2017 at 01:37:25PM +, Stefan Hajnoczi wrote:
> > On Tue, Feb 07, 2017 at 03:12:36PM -0800, ashish mittal wrote:
> > > On Tue, Nov 8, 2016 at 12:44 PM, Jeff Cody  
wrote:
> > > > On Mon, Nov 07, 2016 at 04:59:45PM -0800, Ashish Mittal wrote:
> > > >> These changes use a vxhs test server that is a part of the 
following
> > > >> repository:
> > > >> https://github.com/MittalAshish/libqnio.git
> > > >>
> > > >> Signed-off-by: Ashish Mittal 
> > > >> ---
> > > >> v6 changelog:
> > > >> (1) Added iotests for VxHS block device.
> > > >>
> > > >>  tests/qemu-iotests/common|  6 ++
> > > >>  tests/qemu-iotests/common.config | 13 +
> > > >>  tests/qemu-iotests/common.filter |  1 +
> > > >>  tests/qemu-iotests/common.rc | 19 +++
> > > >>  4 files changed, 39 insertions(+)
> > > >>
> > > >> diff --git a/tests/qemu-iotests/common 
b/tests/qemu-iotests/common
> > > >> index d60ea2c..41430d8 100644
> > > >> --- a/tests/qemu-iotests/common
> > > >> +++ b/tests/qemu-iotests/common
> > > >
> > > > When using raw format, I was able to run the test successfully 
for all
> > > > supported test cases (26 of them).
> > > >
> > > > With qcow2, they fail - but not the fault of this patch, I 
think; but
> > > > rather, the fault of the test server.  Can qnio_server be 
modified so that
> > > > it does not work on just raw files?
> > > >
> > > >
> > > 
> > > VxHS supports and uses only the raw format.
> > 
> > That's like saying only ext4 guest file systems are supported on 
VxHS
> > and not ZFS.  The VxHS driver should not care what file system is 
used,
> > it just does block I/O without interpreting the data.
> > 
> > It must be possible to use any format on top of the VxHS protocol.
> > After all, the image format drivers just do block I/O.  If there is 
a
> > case where qcow2 on VxHS fails then it needs to be investigated.
> > 
> > The VxHS driver can't be merged until we at least understand the 
cause
> > of the qcow2 test failures.
> >
> 
> A quick run with the test server and a QEMU process showed an abort() 
in the
> test server, so I just sent a pull req to libqnio to fix that.  
> 
> But playing with it in gdb right now with a test qcow2 file, I see 
that we
> are waiting in aio_poll() forever for the test server to respond to a 
read
> request, when using qcow2 format.  
> 
> As Stefan said, this doesn't really make any sense - why would VXHS 
behave
> differently based on the file contents?
> 
> [Ketan] To read/write a qcow2 backed device VxHS server implementation
> will need to understand the qcow2 format. This is not just block IO but
> actually does involve interpreting the qcow2 header and cluster formats.
> Clearly the test server implementation does not handle it as it was never
> intended to. VxHS backend won't handle it either because VxHS virtual
> disks are written as non-sparse files.  There are space savings with the
> qcow2 format but performance penalties as well because of the metadata
> overhead. As a block storage provider, VxHS does not support sparse file
> formats like qcow2 primarily because of performance reasons.  Implementing
> a qcow2 backend in the test server would be a non-trivial and non-useful
> exercise since the VxHS server won't support it.
>

What?  Why would the backend need to know anything about qcow2 formats; are
you manipulating the guest image data directly yourself?  But regardless,
since the test server is naive and just reads and writes data, the fact that
the test server breaks on qcow2 image formats means that the test server is
broken for raw images, as well [1].

The backend should not need to know anything about the image file format in
order to serve data, as the backend is essentially serving bytes. (I guess
the only thing the backend would need to be aware of is how to invoke
qemu-img to create a qcow2 image file initially).


QEMU is what interprets the qcow2 format (or any image format).  Every read
from QEMU will look like a raw file read from the perspective of libqnio /
vxhs.  I can't see any reason why vxhs would need to know anything about the
contents of the image file itself.

[Qemu-devel] [Bug 1662050] Re: qemu-img convert a overlay qcow2 image into a entire image

2017-02-13 Thread wayen
** Attachment added: "qemu-img map delta.qcow2 output"
   
https://bugs.launchpad.net/qemu/+bug/1662050/+attachment/4818562/+files/qemu_img_map_delta_qcow2.txt

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1662050

Title:
  qemu-img convert a overlay qcow2 image into a entire image

Status in QEMU:
  Incomplete

Bug description:
  I have a base image file "base.qcow2" and a delta qcow2 image file
  "delta.qcow2" whose backing file is "base.qcow2".

  Now I use qemu-img to convert "delta.qcow2" and get a new image
  file "new.qcow2" which is complete and equivalent to the combination of
  "base.qcow2" and "delta.qcow2".

  In fact, I don't want to get a complete image. I just want to convert
  delta qcow2 image file "A" to a new delta overlay qcow2 image "B"
  which is equivalent to "A". So "new.qcow2" is not what I want. I
  have to admit that this is not a bug. Could you please take this as a
  new feature and enable qemu-img to convert just the qcow2 overlay itself?




Re: [Qemu-devel] [PATCH 2/2] virtio-blk: Inline request init, complete and free functions

2017-02-13 Thread Fam Zheng
On Mon, 02/13 14:28, Stefan Hajnoczi wrote:
> On Tue, Feb 07, 2017 at 02:52:38PM +0100, Laszlo Ersek wrote:
> > On 02/07/17 14:27, Fam Zheng wrote:
> > > These are used in each request handling, inline them.
> > > 
> > > Signed-off-by: Fam Zheng 
> > > ---
> > >  hw/block/virtio-blk.c | 9 +
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > > index 2858c31..1da9570 100644
> > > --- a/hw/block/virtio-blk.c
> > > +++ b/hw/block/virtio-blk.c
> > > @@ -29,8 +29,8 @@
> > >  #include "hw/virtio/virtio-bus.h"
> > >  #include "hw/virtio/virtio-access.h"
> > >  
> > > -static void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
> > > -VirtIOBlockReq *req)
> > > +static inline void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
> > > +   VirtIOBlockReq *req)
> > >  {
> > >  req->dev = s;
> > >  req->vq = vq;
> > > @@ -40,12 +40,13 @@ static void virtio_blk_init_request(VirtIOBlock *s, 
> > > VirtQueue *vq,
> > >  req->mr_next = NULL;
> > >  }
> > >  
> > > -static void virtio_blk_free_request(VirtIOBlockReq *req)
> > > +static inline void virtio_blk_free_request(VirtIOBlockReq *req)
> > >  {
> > >  g_free(req);
> > >  }
> > >  
> > > > -static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
> > > +static inline void virtio_blk_req_complete(VirtIOBlockReq *req,
> > > +   unsigned char status)
> > >  {
> > >  VirtIOBlock *s = req->dev;
> > >  VirtIODevice *vdev = VIRTIO_DEVICE(s);
> > > 
> > 
> > Hm, virtio_blk_req_complete() looks a bit too "meaty" and seems to be
> > called from a little too many places for me to feel convenient about
> > inlining it. I guess I'd leave it to the compiler to optimize the
> > function call. Does the explicit hint offer a noticeable perf improvement?
> > 
> > Inlining virtio_blk_free_request() looks reasonable.
> > 
> > virtio_blk_init_request() looks okay too.
> > 
> > Other reviewers should feel free to override my concerns :) My view on
> > this is distant.
> 
> I'm not a big fan of manually inlining functions.  Let the compiler
> decide whether these static functions should be inlined.
> 

Fair enough, let's drop this one.

Fam



Re: [Qemu-devel] [PATCH 13/24] qcow2: add .bdrv_store_persistent_dirty_bitmaps()

2017-02-13 Thread John Snow


On 02/03/2017 04:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Realize block bitmap storing interface, to allow qcow2 images store
> persistent bitmaps.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  block/qcow2-bitmap.c | 489 +--
>  block/qcow2.c|   1 +
>  block/qcow2.h|   1 +
>  3 files changed, 476 insertions(+), 15 deletions(-)
> 


> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 9af658a0f4..151f5e9173 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -28,6 +28,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "exec/log.h"
> +#include "qemu/cutils.h"
>  
>  #include "block/block_int.h"
>  #include "block/qcow2.h"
> @@ -43,6 +44,10 @@
>  #define BME_MIN_GRANULARITY_BITS 9
>  #define BME_MAX_NAME_SIZE 1023
>  
> +#if BME_MAX_TABLE_SIZE * 8ULL > INT_MAX
> +#error In the code bitmap table physical size assumed to fit into int
> +#endif
> +
>  /* Bitmap directory entry flags */
>  #define BME_RESERVED_FLAGS 0xfffcU
>  #define BME_FLAG_IN_USE 1
> @@ -67,13 +72,22 @@ typedef struct QEMU_PACKED Qcow2BitmapDirEntry {
>  /* name follows  */
>  } Qcow2BitmapDirEntry;
>  
> +typedef struct Qcow2BitmapTable {
> +uint64_t offset;
> +uint32_t size; /* number of 64bit entries */
> +QSIMPLEQ_ENTRY(Qcow2BitmapTable) entry;
> +} Qcow2BitmapTable;
> +typedef QSIMPLEQ_HEAD(Qcow2BitmapTableList, Qcow2BitmapTable)
> +Qcow2BitmapTableList;
> +
>  typedef struct Qcow2Bitmap {
> -uint64_t table_offset;
> -uint32_t table_size;
> +Qcow2BitmapTable table;
>  uint32_t flags;
>  uint8_t granularity_bits;
>  char *name;
>  
> +BdrvDirtyBitmap *dirty_bitmap;
> +
>  QSIMPLEQ_ENTRY(Qcow2Bitmap) entry;
>  } Qcow2Bitmap;
>  typedef QSIMPLEQ_HEAD(Qcow2BitmapList, Qcow2Bitmap) Qcow2BitmapList;
> @@ -117,6 +131,15 @@ static inline void bitmap_table_to_cpu(uint64_t *bitmap_table, size_t size)
>  }
>  }
>  
> +static inline void bitmap_table_to_be(uint64_t *bitmap_table, size_t size)
> +{
> +size_t i;
> +
> +for (i = 0; i < size; ++i) {
> +cpu_to_be64s(&bitmap_table[i]);
> +}
> +}
> +
>  static int check_table_entry(uint64_t entry, int cluster_size)
>  {
>  uint64_t offset;
> @@ -144,7 +167,50 @@ static int check_table_entry(uint64_t entry, int cluster_size)
>  return 0;
>  }
>  
> -static int bitmap_table_load(BlockDriverState *bs, Qcow2Bitmap *bm,
> +static int check_constraints_on_bitmap(BlockDriverState *bs,
> +   const char *name,
> +   uint32_t granularity)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int granularity_bits = ctz32(granularity);
> +int64_t len = bdrv_getlength(bs);
> +bool fail;
> +
> +assert(granularity > 0);
> +assert((granularity & (granularity - 1)) == 0);
> +
> +if (len < 0) {
> +return len;
> +}
> +
> +fail = (granularity_bits > BME_MAX_GRANULARITY_BITS) ||
> +   (granularity_bits < BME_MIN_GRANULARITY_BITS) ||
> +   (len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
> +   (len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
> +  granularity_bits) ||
> +   (strlen(name) > BME_MAX_NAME_SIZE);
> +
> +return fail ? -EINVAL : 0;
> +}
> +
> +static void clear_bitmap_table(BlockDriverState *bs, uint64_t *bitmap_table,
> +   uint32_t bitmap_table_size)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int i;
> +
> +for (i = 0; i < bitmap_table_size; ++i) {
> +uint64_t addr = bitmap_table[i] & BME_TABLE_ENTRY_OFFSET_MASK;
> +if (!addr) {
> +continue;
> +}
> +
> +qcow2_free_clusters(bs, addr, s->cluster_size, QCOW2_DISCARD_OTHER);
> +bitmap_table[i] = 0;
> +}
> +}
> +
> +static int bitmap_table_load(BlockDriverState *bs, Qcow2BitmapTable *tb,
>   uint64_t **bitmap_table)

It would have been nicer to have factored out the size and offset from
the very beginning to avoid churn within this series, but... oh well.
Next time you write a 24 patch series, OK? :)

>  {
>  int ret;
> @@ -152,20 +218,20 @@ static int bitmap_table_load(BlockDriverState *bs, Qcow2Bitmap *bm,
>  uint32_t i;
>  uint64_t *table;
>  
> -assert(bm->table_size != 0);
> -table = g_try_new(uint64_t, bm->table_size);
> +assert(tb->size != 0);
> +table = g_try_new(uint64_t, tb->size);
>  if (table == NULL) {
>  return -ENOMEM;
>  }
>  
> -assert(bm->table_size <= BME_MAX_TABLE_SIZE);
> -ret = bdrv_pread(bs->file, bm->table_offset,
> - table, bm->table_size * sizeof(uint64_t));
> +assert(tb->size <= BME_MAX_TABLE_SIZE);
> +ret = bdrv_pread(bs->file, tb->offset,
> + table, tb->size 

Re: [Qemu-devel] [PATCH v6 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-13 Thread Jeff Cody
On Mon, Feb 13, 2017 at 10:36:53PM +, Ketan Nilangekar wrote:
> 
> 
> On 2/13/17, 8:32 AM, "Jeff Cody"  wrote:
> 
> On Mon, Feb 13, 2017 at 01:37:25PM +, Stefan Hajnoczi wrote:
> > On Tue, Feb 07, 2017 at 03:12:36PM -0800, ashish mittal wrote:
> > > On Tue, Nov 8, 2016 at 12:44 PM, Jeff Cody  wrote:
> > > > On Mon, Nov 07, 2016 at 04:59:45PM -0800, Ashish Mittal wrote:
> > > >> These changes use a vxhs test server that is a part of the 
> following
> > > >> repository:
> > > >> https://github.com/MittalAshish/libqnio.git
> > > >>
> > > >> Signed-off-by: Ashish Mittal 
> > > >> ---
> > > >> v6 changelog:
> > > >> (1) Added iotests for VxHS block device.
> > > >>
> > > >>  tests/qemu-iotests/common|  6 ++
> > > >>  tests/qemu-iotests/common.config | 13 +
> > > >>  tests/qemu-iotests/common.filter |  1 +
> > > >>  tests/qemu-iotests/common.rc | 19 +++
> > > >>  4 files changed, 39 insertions(+)
> > > >>
> > > >> diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
> > > >> index d60ea2c..41430d8 100644
> > > >> --- a/tests/qemu-iotests/common
> > > >> +++ b/tests/qemu-iotests/common
> > > >
> > > > When using raw format, I was able to run the test successfully for 
> all
> > > > supported test cases (26 of them).
> > > >
> > > > With qcow2, they fail - but not the fault of this patch, I think; 
> but
> > > > rather, the fault of the test server.  Can qnio_server be modified 
> so that
> > > > it does not work on just raw files?
> > > >
> > > >
> > > 
> > > VxHS supports and uses only the raw format.
> > 
> > That's like saying only ext4 guest file systems are supported on VxHS
> > and not ZFS.  The VxHS driver should not care what file system is used,
> > it just does block I/O without interpreting the data.
> > 
> > It must be possible to use any format on top of the VxHS protocol.
> > After all, the image format drivers just do block I/O.  If there is a
> > case where qcow2 on VxHS fails then it needs to be investigated.
> > 
> > The VxHS driver can't be merged until we at least understand the cause
> > of the qcow2 test failures.
> >
> 
> A quick run with the test server and a QEMU process showed an abort() in 
> the
> test server, so I just sent a pull req to libqnio to fix that.  
> 
> But playing with it in gdb right now with a test qcow2 file, I see that we
> are waiting in aio_poll() forever for the test server to respond to a read
> request, when using qcow2 format.  
> 
> As Stefan said, this doesn't really make any sense - why would VXHS behave
> differently based on the file contents?
> 
> [Ketan] To read/write a qcow2 backed device VxHS server implementation
> will need to understand the qcow2 format. This is not just block IO but
> actually does involve interpreting the qcow2 header and cluster formats.
> Clearly the test server implementation does not handle it as it was never
> intended to. VxHS backend won't handle it either because VxHS virtual
> disks are written as non-sparse files.  There are space savings with the
> qcow2 format but performance penalties as well because of the metadata
> overhead. As a block storage provider, VxHS does not support sparse file
> formats like qcow2 primarily because of performance reasons.  Implementing
> a qcow2 backend in the test server would be a non-trivial and non-useful
> exercise since the VxHS server won't support it.
>

What?  Why would the backend need to know anything about qcow2 formats; are
you manipulating the guest image data directly yourself?  But regardless,
since the test server is naive and just reads and writes data, the fact that
the test server breaks on qcow2 image formats means that the test server is
broken for raw images, as well [1].

The backend should not need to know anything about the image file format in
order to serve data, as the backend is essentially serving bytes. (I guess
the only thing the backend would need to be aware of is how to invoke
qemu-img to create a qcow2 image file initially).


QEMU is what interprets the qcow2 format (or any image format).  Every read
from QEMU will look like a raw file read from the perspective of libqnio /
vxhs.  I can't see any reason why vxhs would need to know anything about the
contents of the image file itself.

The only thing the test server needs to do, in order to be able to serve
qcow2 files correctly, is to be able to serve raw files correctly.
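
To make that concrete, here is a rough sketch of a format-agnostic read path.
This is my own illustration, not libqnio code: `read_full` is a made-up name,
and I am assuming a plain POSIX file descriptor on the server side.  The point
is only that the server loops until it has the bytes or hits end-of-file, and
never looks at what the bytes mean.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libqnio): read up to len bytes starting
 * at offset, retrying after short reads.  Returns the number of bytes read,
 * which is less than len only when end-of-file was reached, or -1 on error
 * with errno set by pread().
 */
static ssize_t read_full(int fd, void *buf, size_t len, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, p + done, len - done, offset + (off_t)done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before any data: retry */
            }
            return -1;
        }
        if (n == 0) {
            break;              /* end of file: report what we have */
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller that asks for 16 bytes from a 10-byte file gets 10 back and can
complete the request with that count instead of waiting for more data.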

Looking at the libqnio implementation from the test server with gdb, I think
that the reason why qcow2 format does not work is that the current
libqnio implementation does not handle short reads correctly.  For instance,
if the file is not an even multiple of the read size, and we try 

Re: [Qemu-devel] [PATCH 1/1] mirror: do not increase offset during initial zero_or_discard phase - pls consider this as V3 patch

2017-02-13 Thread Denis V. Lunev
On 02/13/2017 10:13 PM, Jeff Cody wrote:
> On Mon, Feb 13, 2017 at 06:16:36PM +0100, Max Reitz wrote:
>> On 13.02.2017 08:10, Denis V. Lunev wrote:
>>> On 02/03/2017 06:08 PM, Denis V. Lunev wrote:
>>>> On 02/03/2017 06:06 PM, Denis V. Lunev wrote:
>>>>> From: Anton Nefedov
>>>>>
>>>>> If explicit zeroing out before mirroring is required for the target image,
>>>>> it moves the block job offset counter to EOF, then offset and len counters
>>>>> count the image size twice. There is no harm but stats are confusing,
>>>>> specifically the progress of the operation is always reported as 99% by
>>>>> management tools.
>>>>>
>>>>> The patch skips offset increase for the first "technical" pass over the
>>>>> image. This should not cause any further harm.
>>>>>
>>>>> Signed-off-by: Anton Nefedov
>>>>> Signed-off-by: Denis V. Lunev
>>>>> Reviewed-by: Eric Blake
>>>>> Reviewed-by: Stefan Hajnoczi
>>>>> CC: Jeff Cody
>>>>> CC: Kevin Wolf
>>>>> CC: Max Reitz
>>>> actually this is V3 patch. Sorry for broken subject.
>>>>
>>>> Den
>>> ping
>> Didn't Jeff merge v2?
>>
>> http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg01319.html
>>
>> Max
>>
>
> Yes, I did.
Thank you very much for the clarification.

Den



[Qemu-devel] [PULL 19/24] target/openrisc: Implement muld, muldu, macu, msbu

2017-02-13 Thread Richard Henderson
Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 108 
 1 file changed, 108 insertions(+)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 82b8bec..ce9672e 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -362,6 +362,56 @@ static void gen_divu(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 gen_ove_cy(dc);
 }
 
+static void gen_muld(DisasContext *dc, TCGv srca, TCGv srcb)
+{
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+
+tcg_gen_ext_tl_i64(t1, srca);
+tcg_gen_ext_tl_i64(t2, srcb);
+if (TARGET_LONG_BITS == 32) {
+tcg_gen_mul_i64(cpu_mac, t1, t2);
+tcg_gen_movi_tl(cpu_sr_ov, 0);
+} else {
+TCGv_i64 high = tcg_temp_new_i64();
+
+tcg_gen_muls2_i64(cpu_mac, high, t1, t2);
+tcg_gen_sari_i64(t1, cpu_mac, 63);
+tcg_gen_setcond_i64(TCG_COND_NE, t1, t1, high);
+tcg_temp_free_i64(high);
+tcg_gen_trunc_i64_tl(cpu_sr_ov, t1);
+tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
+
+gen_ove_ov(dc);
+}
+tcg_temp_free_i64(t1);
+tcg_temp_free_i64(t2);
+}
+
+static void gen_muldu(DisasContext *dc, TCGv srca, TCGv srcb)
+{
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+
+tcg_gen_extu_tl_i64(t1, srca);
+tcg_gen_extu_tl_i64(t2, srcb);
+if (TARGET_LONG_BITS == 32) {
+tcg_gen_mul_i64(cpu_mac, t1, t2);
+tcg_gen_movi_tl(cpu_sr_cy, 0);
+} else {
+TCGv_i64 high = tcg_temp_new_i64();
+
+tcg_gen_mulu2_i64(cpu_mac, high, t1, t2);
+tcg_gen_setcondi_i64(TCG_COND_NE, high, high, 0);
+tcg_gen_trunc_i64_tl(cpu_sr_cy, high);
+tcg_temp_free_i64(high);
+
+gen_ove_cy(dc);
+}
+tcg_temp_free_i64(t1);
+tcg_temp_free_i64(t2);
+}
+
 static void gen_mac(DisasContext *dc, TCGv srca, TCGv srcb)
 {
 TCGv_i64 t1 = tcg_temp_new_i64();
@@ -388,6 +438,25 @@ static void gen_mac(DisasContext *dc, TCGv srca, TCGv srcb)
 gen_ove_ov(dc);
 }
 
+static void gen_macu(DisasContext *dc, TCGv srca, TCGv srcb)
+{
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+
+tcg_gen_extu_tl_i64(t1, srca);
+tcg_gen_extu_tl_i64(t2, srcb);
+tcg_gen_mul_i64(t1, t1, t2);
+tcg_temp_free_i64(t2);
+
+/* Note that overflow is only computed during addition stage.  */
+tcg_gen_add_i64(cpu_mac, cpu_mac, t1);
+tcg_gen_setcond_i64(TCG_COND_LTU, t1, cpu_mac, t1);
+tcg_gen_trunc_i64_tl(cpu_sr_cy, t1);
+tcg_temp_free_i64(t1);
+
+gen_ove_cy(dc);
+}
+
 static void gen_msb(DisasContext *dc, TCGv srca, TCGv srcb)
 {
 TCGv_i64 t1 = tcg_temp_new_i64();
@@ -414,6 +483,25 @@ static void gen_msb(DisasContext *dc, TCGv srca, TCGv srcb)
 gen_ove_ov(dc);
 }
 
+static void gen_msbu(DisasContext *dc, TCGv srca, TCGv srcb)
+{
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+
+tcg_gen_extu_tl_i64(t1, srca);
+tcg_gen_extu_tl_i64(t2, srcb);
+tcg_gen_mul_i64(t1, t1, t2);
+
+/* Note that overflow is only computed during subtraction stage.  */
+tcg_gen_setcond_i64(TCG_COND_LTU, t2, cpu_mac, t1);
+tcg_gen_sub_i64(cpu_mac, cpu_mac, t1);
+tcg_gen_trunc_i64_tl(cpu_sr_cy, t2);
+tcg_temp_free_i64(t2);
+tcg_temp_free_i64(t1);
+
+gen_ove_cy(dc);
+}
+
 static void gen_lwa(DisasContext *dc, TCGv rd, TCGv ra, int32_t ofs)
 {
 TCGv ea = tcg_temp_new();
@@ -590,6 +678,11 @@ static void dec_calc(DisasContext *dc, uint32_t insn)
 gen_mul(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
 return;
 
+case 0x7: /* l.muld */
+LOG_DIS("l.muld r%d, r%d\n", ra, rb);
+gen_muld(dc, cpu_R[ra], cpu_R[rb]);
+break;
+
 case 0x9: /* l.div */
 LOG_DIS("l.div r%d, r%d, r%d\n", rd, ra, rb);
 gen_div(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
@@ -604,6 +697,11 @@ static void dec_calc(DisasContext *dc, uint32_t insn)
 LOG_DIS("l.mulu r%d, r%d, r%d\n", rd, ra, rb);
 gen_mulu(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
 return;
+
+case 0xc: /* l.muldu */
+LOG_DIS("l.muldu r%d, r%d\n", ra, rb);
+gen_muldu(dc, cpu_R[ra], cpu_R[rb]);
+return;
 }
 break;
 }
@@ -916,6 +1014,16 @@ static void dec_mac(DisasContext *dc, uint32_t insn)
 gen_msb(dc, cpu_R[ra], cpu_R[rb]);
 break;
 
+case 0x0003:/* l.macu */
+LOG_DIS("l.macu r%d, r%d\n", ra, rb);
+gen_macu(dc, cpu_R[ra], cpu_R[rb]);
+break;
+
+case 0x0004:/* l.msbu */
+LOG_DIS("l.msbu r%d, r%d\n", ra, rb);
+gen_msbu(dc, cpu_R[ra], cpu_R[rb]);
+break;
+
 default:
 gen_illegal_exception(dc);
 break;
-- 
2.9.3
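
As a reading aid for the unsigned accumulate paths above, the carry tests can
be modelled in plain C.  This is only a sketch: `macu_model` and `msbu_model`
are hypothetical names, and a 32-bit target is assumed so that the product of
two registers always fits in 64 bits.  The ordering mirrors the TCG code: the
carry for the add is tested after the addition, the borrow for the subtract is
tested before it.

```c
#include <stdbool.h>
#include <stdint.h>

/* mac += a * b (unsigned).  Returns true when the 64-bit addition wrapped. */
static bool macu_model(uint64_t *mac, uint32_t a, uint32_t b)
{
    uint64_t prod = (uint64_t)a * b;    /* 32x32 -> 64 cannot overflow */

    *mac += prod;
    return *mac < prod;                 /* sum smaller than an addend: carry */
}

/* mac -= a * b (unsigned).  Returns true when the subtraction borrowed. */
static bool msbu_model(uint64_t *mac, uint32_t a, uint32_t b)
{
    uint64_t prod = (uint64_t)a * b;
    bool borrow = *mac < prod;          /* must be tested before subtracting */

    *mac -= prod;
    return borrow;
}
```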




[Qemu-devel] [PULL 14/24] target/openrisc: Use movcond where appropriate

2017-02-13 Thread Richard Henderson
Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 6c745d3..f91ab6a 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -214,12 +214,16 @@ static void gen_jump(DisasContext *dc, int32_t n26, uint32_t reg, uint32_t op0)
 case 0x03: /* l.bnf */
 case 0x04: /* l.bf  */
 {
-TCGLabel *lab = gen_new_label();
-tcg_gen_movi_tl(jmp_pc, dc->pc+8);
-tcg_gen_brcondi_tl(op0 == 0x03 ? TCG_COND_NE : TCG_COND_EQ,
-   cpu_sr_f, 0, lab);
-tcg_gen_movi_tl(jmp_pc, tmp_pc);
-gen_set_label(lab);
+TCGv t_next = tcg_const_tl(dc->pc + 8);
+TCGv t_true = tcg_const_tl(tmp_pc);
+TCGv t_zero = tcg_const_tl(0);
+
+tcg_gen_movcond_tl(op0 == 0x03 ? TCG_COND_EQ : TCG_COND_NE,
+   jmp_pc, cpu_sr_f, t_zero, t_true, t_next);
+
+tcg_temp_free(t_next);
+tcg_temp_free(t_true);
+tcg_temp_free(t_zero);
 }
 break;
 case 0x11: /* l.jr */
@@ -502,14 +506,10 @@ static void dec_calc(DisasContext *dc, uint32_t insn)
 case 0xe: /* l.cmov */
 LOG_DIS("l.cmov r%d, r%d, r%d\n", rd, ra, rb);
 {
-TCGLabel *lab = gen_new_label();
-TCGv res = tcg_temp_local_new();
-tcg_gen_mov_tl(res, cpu_R[rb]);
-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_sr_f, 0, lab);
-tcg_gen_mov_tl(res, cpu_R[ra]);
-gen_set_label(lab);
-tcg_gen_mov_tl(cpu_R[rd], res);
-tcg_temp_free(res);
+TCGv zero = tcg_const_tl(0);
+tcg_gen_movcond_tl(TCG_COND_NE, cpu_R[rd], cpu_sr_f, zero,
+   cpu_R[ra], cpu_R[rb]);
+tcg_temp_free(zero);
 }
 return;
 
-- 
2.9.3
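
For anyone checking the condition flip in the l.bnf/l.bf hunk above, here is a
small plain-C model of both shapes.  It is only a sketch with hypothetical
names (`jmp_pc_branch`, `jmp_pc_select`); `is_bnf` stands for `op0 == 0x03`.
The old code branches over the overwrite, so its condition is "skip when";
movcond selects directly, so its condition is "take when", hence the inverted
TCG_COND.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old shape: default to the fall-through PC, branch over the overwrite. */
static uint32_t jmp_pc_branch(bool is_bnf, uint32_t sr_f,
                              uint32_t target, uint32_t next)
{
    uint32_t jmp_pc = next;
    bool skip = is_bnf ? (sr_f != 0) : (sr_f == 0);

    if (!skip) {
        jmp_pc = target;
    }
    return jmp_pc;
}

/* New shape: a single conditional select, as movcond expresses it. */
static uint32_t jmp_pc_select(bool is_bnf, uint32_t sr_f,
                              uint32_t target, uint32_t next)
{
    bool taken = is_bnf ? (sr_f == 0) : (sr_f != 0);

    return taken ? target : next;
}
```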




[Qemu-devel] [PULL 16/24] target/openrisc: Enable trap, csync, msync, psync for user mode

2017-02-13 Thread Richard Henderson
Not documented as disabled for user mode.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 32 
 1 file changed, 32 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index f91ab6a..6c8f05c 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1134,52 +1134,20 @@ static void dec_sys(DisasContext *dc, uint32_t insn)
 
 case 0x100:/* l.trap */
 LOG_DIS("l.trap %d\n", K16);
-#if defined(CONFIG_USER_ONLY)
-return;
-#else
-if (dc->mem_idx == MMU_USER_IDX) {
-gen_illegal_exception(dc);
-return;
-}
 tcg_gen_movi_tl(cpu_pc, dc->pc);
 gen_exception(dc, EXCP_TRAP);
-#endif
 break;
 
 case 0x300:/* l.csync */
 LOG_DIS("l.csync\n");
-#if defined(CONFIG_USER_ONLY)
-return;
-#else
-if (dc->mem_idx == MMU_USER_IDX) {
-gen_illegal_exception(dc);
-return;
-}
-#endif
 break;
 
 case 0x200:/* l.msync */
 LOG_DIS("l.msync\n");
-#if defined(CONFIG_USER_ONLY)
-return;
-#else
-if (dc->mem_idx == MMU_USER_IDX) {
-gen_illegal_exception(dc);
-return;
-}
-#endif
 break;
 
 case 0x270:/* l.psync */
 LOG_DIS("l.psync\n");
-#if defined(CONFIG_USER_ONLY)
-return;
-#else
-if (dc->mem_idx == MMU_USER_IDX) {
-gen_illegal_exception(dc);
-return;
-}
-#endif
 break;
 
 default:
-- 
2.9.3




[Qemu-devel] [PULL 13/24] target/openrisc: Keep SR_CY and SR_OV in a separate variables

2017-02-13 Thread Richard Henderson
This significantly streamlines carry and overflow production.
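
A rough plain-C illustration of why the split helps, assuming 32-bit
registers.  The names `Flags`, `add_with_flags` and `overflowed` are
hypothetical, and the XOR formula is one common way to derive signed overflow,
not necessarily what the translator emits.  Because only the sign bit of sr_ov
is meaningful, the producer can store the raw XOR/AND result without masking,
and the consumer just tests for a negative value.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t sr_cy;     /* carry, 0 or 1 */
    int32_t  sr_ov;     /* signed overflow, valid in the sign bit only */
} Flags;

/* 32-bit add that records carry and overflow without touching SR itself. */
static uint32_t add_with_flags(Flags *f, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;

    f->sr_cy = res < a;         /* unsigned wrap-around */
    /*
     * Overflow happened when both operands share a sign and the result's
     * sign differs.  The low 31 bits are junk and are never inspected.
     * (Relies on the usual two's complement conversion to int32_t.)
     */
    f->sr_ov = (int32_t)((res ^ a) & ~(a ^ b));
    return res;
}

static bool overflowed(const Flags *f)
{
    return f->sr_ov < 0;
}
```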

Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.h  |  13 +++-
 target/openrisc/exception_helper.c |  31 --
 target/openrisc/helper.h   |   4 +-
 target/openrisc/translate.c| 119 +
 4 files changed, 78 insertions(+), 89 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index bb5d363..e693461 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -287,7 +287,9 @@ typedef struct CPUOpenRISCState {
 target_ulong eear;/* Exception EA register */
 
 target_ulong sr_f;/* the SR_F bit, values 0, 1.  */
-uint32_t sr;  /* Supervisor register, without SR_F */
+target_ulong sr_cy;   /* the SR_CY bit, values 0, 1.  */
+target_long  sr_ov;   /* the SR_OV bit (in the sign bit only) */
+uint32_t sr;  /* Supervisor register, without SR_{F,CY,OV} */
 uint32_t vr;  /* Version register */
 uint32_t upr; /* Unit presence register */
 uint32_t cpucfgr; /* CPU configure register */
@@ -414,13 +416,18 @@ static inline int cpu_mmu_index(CPUOpenRISCState *env, bool ifetch)
 
 static inline uint32_t cpu_get_sr(const CPUOpenRISCState *env)
 {
-return env->sr + env->sr_f * SR_F;
+return (env->sr
++ env->sr_f * SR_F
++ env->sr_cy * SR_CY
++ (env->sr_ov < 0) * SR_OV);
 }
 
 static inline void cpu_set_sr(CPUOpenRISCState *env, uint32_t val)
 {
 env->sr_f = (val & SR_F) != 0;
-env->sr = (val & ~SR_F) | SR_FO;
+env->sr_cy = (val & SR_CY) != 0;
+env->sr_ov = (val & SR_OV ? -1 : 0);
+env->sr = (val & ~(SR_F | SR_CY | SR_OV)) | SR_FO;
 }
 
 #define CPU_INTERRUPT_TIMER   CPU_INTERRUPT_TGT_INT_0
diff --git a/target/openrisc/exception_helper.c b/target/openrisc/exception_helper.c
index 5147da6..1536053 100644
--- a/target/openrisc/exception_helper.c
+++ b/target/openrisc/exception_helper.c
@@ -30,13 +30,32 @@ void HELPER(exception)(CPUOpenRISCState *env, uint32_t excp)
 raise_exception(cpu, excp);
 }
 
-void HELPER(ove)(CPUOpenRISCState *env, target_ulong test)
+static void QEMU_NORETURN do_range(CPUOpenRISCState *env, uintptr_t pc)
 {
-if (unlikely(test)) {
-OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
-CPUState *cs = CPU(cpu);
+OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
+CPUState *cs = CPU(cpu);
+
+cs->exception_index = EXCP_RANGE;
+cpu_loop_exit_restore(cs, pc);
+}
+
+void HELPER(ove_cy)(CPUOpenRISCState *env)
+{
+if (env->sr_cy) {
+do_range(env, GETPC());
+}
+}
+
+void HELPER(ove_ov)(CPUOpenRISCState *env)
+{
+if (env->sr_ov < 0) {
+do_range(env, GETPC());
+}
+}
 
-cs->exception_index = EXCP_RANGE;
-cpu_loop_exit_restore(cs, GETPC());
+void HELPER(ove_cyov)(CPUOpenRISCState *env)
+{
+if (env->sr_cy || env->sr_ov < 0) {
+do_range(env, GETPC());
 }
 }
diff --git a/target/openrisc/helper.h b/target/openrisc/helper.h
index c2c8098..f4d97a2 100644
--- a/target/openrisc/helper.h
+++ b/target/openrisc/helper.h
@@ -19,7 +19,9 @@
 
 /* exception */
 DEF_HELPER_FLAGS_2(exception, 0, void, env, i32)
-DEF_HELPER_FLAGS_2(ove, TCG_CALL_NO_WG, void, env, tl)
+DEF_HELPER_FLAGS_1(ove_cy, TCG_CALL_NO_WG, void, env)
+DEF_HELPER_FLAGS_1(ove_ov, TCG_CALL_NO_WG, void, env)
+DEF_HELPER_FLAGS_1(ove_cyov, TCG_CALL_NO_WG, void, env)
 
 /* float */
 DEF_HELPER_FLAGS_2(itofd, 0, i64, env, i64)
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 405a1a0..6c745d3 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -55,6 +55,8 @@ static TCGv jmp_pc;/* l.jr/l.jalr temp pc */
 static TCGv cpu_npc;
 static TCGv cpu_ppc;
 static TCGv cpu_sr_f;   /* bf/bnf, F flag taken */
+static TCGv cpu_sr_cy;  /* carry (unsigned overflow) */
+static TCGv cpu_sr_ov;  /* signed overflow */
 static TCGv cpu_lock_addr;
 static TCGv cpu_lock_value;
 static TCGv_i32 fpcsr;
@@ -90,6 +92,10 @@ void openrisc_translate_init(void)
 offsetof(CPUOpenRISCState, jmp_pc), "jmp_pc");
 cpu_sr_f = tcg_global_mem_new(cpu_env,
   offsetof(CPUOpenRISCState, sr_f), "sr_f");
+cpu_sr_cy = tcg_global_mem_new(cpu_env,
+   offsetof(CPUOpenRISCState, sr_cy), "sr_cy");
+cpu_sr_ov = tcg_global_mem_new(cpu_env,
+   offsetof(CPUOpenRISCState, sr_ov), "sr_ov");
 cpu_lock_addr = tcg_global_mem_new(cpu_env,
offsetof(CPUOpenRISCState, lock_addr),
"lock_addr");
@@ -233,27 +239,24 @@ static void gen_jump(DisasContext *dc, int32_t n26, uint32_t reg, uint32_t op0)
 gen_sync_flags(dc);
 }
 
-static void gen_ove_cy(DisasContext *dc, TCGv cy)

[Qemu-devel] [PULL 15/24] target/openrisc: Set flags on helpers

2017-02-13 Thread Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/helper.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/target/openrisc/helper.h b/target/openrisc/helper.h
index f4d97a2..78a123d 100644
--- a/target/openrisc/helper.h
+++ b/target/openrisc/helper.h
@@ -18,26 +18,26 @@
  */
 
 /* exception */
-DEF_HELPER_FLAGS_2(exception, 0, void, env, i32)
+DEF_HELPER_FLAGS_2(exception, TCG_CALL_NO_WG, void, env, i32)
 DEF_HELPER_FLAGS_1(ove_cy, TCG_CALL_NO_WG, void, env)
 DEF_HELPER_FLAGS_1(ove_ov, TCG_CALL_NO_WG, void, env)
 DEF_HELPER_FLAGS_1(ove_cyov, TCG_CALL_NO_WG, void, env)
 
 /* float */
-DEF_HELPER_FLAGS_2(itofd, 0, i64, env, i64)
-DEF_HELPER_FLAGS_2(itofs, 0, i32, env, i32)
-DEF_HELPER_FLAGS_2(ftoid, 0, i64, env, i64)
-DEF_HELPER_FLAGS_2(ftois, 0, i32, env, i32)
+DEF_HELPER_FLAGS_2(itofd, TCG_CALL_NO_WG, i64, env, i64)
+DEF_HELPER_FLAGS_2(itofs, TCG_CALL_NO_WG, i32, env, i32)
+DEF_HELPER_FLAGS_2(ftoid, TCG_CALL_NO_WG, i64, env, i64)
+DEF_HELPER_FLAGS_2(ftois, TCG_CALL_NO_WG, i32, env, i32)
 
 #define FOP_MADD(op) \
-DEF_HELPER_FLAGS_3(float_ ## op ## _s, 0, i32, env, i32, i32)\
-DEF_HELPER_FLAGS_3(float_ ## op ## _d, 0, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(float_ ## op ## _s, TCG_CALL_NO_WG, i32, env, i32, i32) \
+DEF_HELPER_FLAGS_3(float_ ## op ## _d, TCG_CALL_NO_WG, i64, env, i64, i64)
 FOP_MADD(muladd)
 #undef FOP_MADD
 
 #define FOP_CALC(op)\
-DEF_HELPER_FLAGS_3(float_ ## op ## _s, 0, i32, env, i32, i32)\
-DEF_HELPER_FLAGS_3(float_ ## op ## _d, 0, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(float_ ## op ## _s, TCG_CALL_NO_WG, i32, env, i32, i32) \
+DEF_HELPER_FLAGS_3(float_ ## op ## _d, TCG_CALL_NO_WG, i64, env, i64, i64)
 FOP_CALC(add)
 FOP_CALC(sub)
 FOP_CALC(mul)
@@ -46,8 +46,8 @@ FOP_CALC(rem)
 #undef FOP_CALC
 
 #define FOP_CMP(op)  \
-DEF_HELPER_FLAGS_3(float_ ## op ## _s, 0, i32, env, i32, i32)\
-DEF_HELPER_FLAGS_3(float_ ## op ## _d, 0, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(float_ ## op ## _s, TCG_CALL_NO_WG, i32, env, i32, i32) \
+DEF_HELPER_FLAGS_3(float_ ## op ## _d, TCG_CALL_NO_WG, i64, env, i64, i64)
 FOP_CMP(eq)
 FOP_CMP(lt)
 FOP_CMP(le)
@@ -61,4 +61,4 @@ DEF_HELPER_FLAGS_1(rfe, 0, void, env)
 
 /* sys */
 DEF_HELPER_FLAGS_4(mtspr, 0, void, env, tl, tl, tl)
-DEF_HELPER_FLAGS_4(mfspr, 0, tl, env, tl, tl, tl)
+DEF_HELPER_FLAGS_4(mfspr, TCG_CALL_NO_WG, tl, env, tl, tl, tl)
-- 
2.9.3




[Qemu-devel] [PULL 22/24] target/openrisc: Tidy ppc/npc implementation

2017-02-13 Thread Richard Henderson
The NPC SPR is really only supposed to be used for FPGA debugging.
It contains the same contents as PC, unless one plays games.  Follow
the or1ksim implementation in flushing delayed branch state when it
is changed.

The PPC SPR need not be updated every instruction, merely when we
exit the TB or attempt to read its contents.
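
The intended NPC semantics can be modelled in a few lines of plain C.  This is
a sketch with hypothetical names (`CpuModel`, `write_npc`, `read_npc`), not the
helper code itself: reads return PC, and a write only disturbs the delayed
branch state when it actually moves PC.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t pc;
    uint32_t flags;     /* nonzero while a delayed branch is pending */
} CpuModel;

/* NPC reads back as PC. */
static uint32_t read_npc(const CpuModel *cpu)
{
    return cpu->pc;
}

/*
 * Write NPC.  Returns true when PC changed, in which case the caller would
 * have to leave the current TB and resume at the new PC.
 */
static bool write_npc(CpuModel *cpu, uint32_t val)
{
    if (cpu->pc == val) {
        return false;   /* "jump" to the current insn: keep branch state */
    }
    cpu->pc = val;
    cpu->flags = 0;     /* flush delayed branch state */
    return true;
}
```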

Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.h  |  2 +-
 target/openrisc/gdbstub.c  | 13 +++
 target/openrisc/interrupt_helper.c |  1 -
 target/openrisc/machine.c  |  5 ++---
 target/openrisc/sys_helper.c   | 44 ++
 target/openrisc/translate.c| 29 ++---
 6 files changed, 39 insertions(+), 55 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 0694038..8294636 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -58,6 +58,7 @@ typedef struct OpenRISCCPUClass {
 } OpenRISCCPUClass;
 
 #define NB_MMU_MODES3
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 enum {
 MMU_NOMMU_IDX = 0,
@@ -273,7 +274,6 @@ typedef struct CPUOpenRISCTLBContext {
 typedef struct CPUOpenRISCState {
 target_ulong gpr[32]; /* General registers */
 target_ulong pc;  /* Program counter */
-target_ulong npc; /* Next PC */
 target_ulong ppc; /* Prev PC */
 target_ulong jmp_pc;  /* Jump PC */
 
diff --git a/target/openrisc/gdbstub.c b/target/openrisc/gdbstub.c
index 31ea013..2a4821f 100644
--- a/target/openrisc/gdbstub.c
+++ b/target/openrisc/gdbstub.c
@@ -34,8 +34,8 @@ int openrisc_cpu_gdb_read_register(CPUState *cs, uint8_t *mem_buf, int n)
 case 32:/* PPC */
 return gdb_get_reg32(mem_buf, env->ppc);
 
-case 33:/* NPC */
-return gdb_get_reg32(mem_buf, env->npc);
+case 33:/* NPC (equals PC) */
+return gdb_get_reg32(mem_buf, env->pc);
 
 case 34:/* SR */
 return gdb_get_reg32(mem_buf, cpu_get_sr(env));
@@ -68,8 +68,13 @@ int openrisc_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 env->ppc = tmp;
 break;
 
-case 33: /* NPC */
-env->npc = tmp;
+case 33: /* NPC (equals PC) */
+/* If setting PC to something different,
+   also clear delayed branch status.  */
+if (env->pc != tmp) {
+env->pc = tmp;
+env->flags = 0;
+}
 break;
 
 case 34: /* SR */
diff --git a/target/openrisc/interrupt_helper.c b/target/openrisc/interrupt_helper.c
index c7fa97a..56620e0 100644
--- a/target/openrisc/interrupt_helper.c
+++ b/target/openrisc/interrupt_helper.c
@@ -32,7 +32,6 @@ void HELPER(rfe)(CPUOpenRISCState *env)
  (cpu->env.esr & (SR_SM | SR_IME | SR_DME));
 #endif
 cpu->env.pc = cpu->env.epcr;
-cpu->env.npc = cpu->env.epcr;
 cpu_set_sr(&cpu->env, cpu->env.esr);
 cpu->env.lock_addr = -1;
 
diff --git a/target/openrisc/machine.c b/target/openrisc/machine.c
index 4100957..686eaa3 100644
--- a/target/openrisc/machine.c
+++ b/target/openrisc/machine.c
@@ -47,12 +47,11 @@ static const VMStateInfo vmstate_sr = {
 
 static const VMStateDescription vmstate_env = {
 .name = "env",
-.version_id = 3,
-.minimum_version_id = 3,
+.version_id = 4,
+.minimum_version_id = 4,
 .fields = (VMStateField[]) {
 VMSTATE_UINTTL_ARRAY(gpr, CPUOpenRISCState, 32),
 VMSTATE_UINTTL(pc, CPUOpenRISCState),
-VMSTATE_UINTTL(npc, CPUOpenRISCState),
 VMSTATE_UINTTL(ppc, CPUOpenRISCState),
 VMSTATE_UINTTL(jmp_pc, CPUOpenRISCState),
 VMSTATE_UINTTL(lock_addr, CPUOpenRISCState),
diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index 9841a5b..0968901 100644
--- a/target/openrisc/sys_helper.c
+++ b/target/openrisc/sys_helper.c
@@ -29,11 +29,10 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
target_ulong ra, target_ulong rb, target_ulong offset)
 {
 #ifndef CONFIG_USER_ONLY
-int spr = (ra | offset);
-int idx;
-
 OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
 CPUState *cs = CPU(cpu);
+int spr = (ra | offset);
+int idx;
 
 switch (spr) {
 case TO_SPR(0, 0): /* VR */
@@ -41,7 +40,14 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
 break;
 
 case TO_SPR(0, 16): /* NPC */
-env->npc = rb;
+cpu_restore_state(cs, GETPC());
+/* ??? Mirror or1ksim in not trashing delayed branch state
+   when "jumping" to the current instruction.  */
+if (env->pc != rb) {
+env->pc = rb;
+env->flags = 0;
+cpu_loop_exit(cs);
+}
 break;
 
 case TO_SPR(0, 17): /* SR */
@@ -170,7 +176,6 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
 cpu_openrisc_timer_update(cpu);
 break;
 default:
-
 break;
 }
 #endif
@@ 

[Qemu-devel] [PULL 12/24] target/openrisc: Keep SR_F in a separate variable

2017-02-13 Thread Richard Henderson
This avoids having to keep merging and extracting the flag from SR.
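
A standalone sketch of the accessor pair, for illustration only.  `SrModel`,
`get_sr` and `set_sr` are hypothetical stand-ins for the cpu.h functions, and
the bit positions of SR_F and SR_FO are my assumptions and should be checked
against the architecture manual.  The flag lives in its own word, and the full
SR value is only assembled at the points where it is observed.

```c
#include <stdint.h>

#define SR_F   (1u << 9)    /* assumed bit position of the F flag */
#define SR_FO  (1u << 15)   /* assumed bit position of the fixed-one bit */

typedef struct {
    uint32_t sr_f;          /* the F flag, 0 or 1 */
    uint32_t sr;            /* everything else; SR_F is always clear here */
} SrModel;

/* Assemble the architectural SR value from the split state. */
static uint32_t get_sr(const SrModel *s)
{
    return s->sr + s->sr_f * SR_F;
}

/* Split an architectural SR value into the flag and the remainder. */
static void set_sr(SrModel *s, uint32_t val)
{
    s->sr_f = (val & SR_F) != 0;
    s->sr = (val & ~SR_F) | SR_FO;
}
```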

Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c   |   3 +-
 linux-user/main.c  |   3 +-
 target/openrisc/cpu.h  |  15 +-
 target/openrisc/gdbstub.c  |   4 +-
 target/openrisc/interrupt.c|   2 +-
 target/openrisc/interrupt_helper.c |   2 +-
 target/openrisc/machine.c  |  38 +-
 target/openrisc/sys_helper.c   |   5 +-
 target/openrisc/translate.c| 104 ++---
 9 files changed, 98 insertions(+), 78 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index c66cbbe..8271227 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1054,9 +1054,8 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
 for (i = 0; i < 32; i++) {
 (*regs)[i] = tswapreg(env->gpr[i]);
 }
-
 (*regs)[32] = tswapreg(env->pc);
-(*regs)[33] = tswapreg(env->sr);
+(*regs)[33] = tswapreg(cpu_get_sr(env));
 }
 #define ELF_HWCAP 0
 #define ELF_PLATFORM NULL
diff --git a/linux-user/main.c b/linux-user/main.c
index 001f71c..4fd49ce 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -4765,9 +4765,8 @@ int main(int argc, char **argv, char **envp)
 for (i = 0; i < 32; i++) {
 env->gpr[i] = regs->gpr[i];
 }
-
-env->sr = regs->sr;
 env->pc = regs->pc;
+cpu_set_sr(env, regs->sr);
 }
 #elif defined(TARGET_SH4)
 {
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index ef90e49..bb5d363 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -286,7 +286,8 @@ typedef struct CPUOpenRISCState {
 target_ulong epcr;/* Exception PC register */
 target_ulong eear;/* Exception EA register */
 
-uint32_t sr;  /* Supervisor register */
+target_ulong sr_f;/* the SR_F bit, values 0, 1.  */
+uint32_t sr;  /* Supervisor register, without SR_F */
 uint32_t vr;  /* Version register */
 uint32_t upr; /* Unit presence register */
 uint32_t cpucfgr; /* CPU configure register */
@@ -301,7 +302,6 @@ typedef struct CPUOpenRISCState {
 
 uint32_t flags;   /* cpu_flags, we only use it for exception
  in solt so far.  */
-uint32_t btaken;  /* the SR_F bit */
 
 /* Fields up to this point are cleared by a CPU reset */
 struct {} end_reset_fields;
@@ -412,6 +412,17 @@ static inline int cpu_mmu_index(CPUOpenRISCState *env, bool ifetch)
 return (env->sr & SR_SM) == 0 ? MMU_USER_IDX : MMU_SUPERVISOR_IDX;
 }
 
+static inline uint32_t cpu_get_sr(const CPUOpenRISCState *env)
+{
+return env->sr + env->sr_f * SR_F;
+}
+
+static inline void cpu_set_sr(CPUOpenRISCState *env, uint32_t val)
+{
+env->sr_f = (val & SR_F) != 0;
+env->sr = (val & ~SR_F) | SR_FO;
+}
+
 #define CPU_INTERRUPT_TIMER   CPU_INTERRUPT_TGT_INT_0
 
 #endif /* OPENRISC_CPU_H */
diff --git a/target/openrisc/gdbstub.c b/target/openrisc/gdbstub.c
index cb16e76..31ea013 100644
--- a/target/openrisc/gdbstub.c
+++ b/target/openrisc/gdbstub.c
@@ -38,7 +38,7 @@ int openrisc_cpu_gdb_read_register(CPUState *cs, uint8_t *mem_buf, int n)
 return gdb_get_reg32(mem_buf, env->npc);
 
 case 34:/* SR */
-return gdb_get_reg32(mem_buf, env->sr);
+return gdb_get_reg32(mem_buf, cpu_get_sr(env));
 
 default:
 break;
@@ -73,7 +73,7 @@ int openrisc_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 break;
 
 case 34: /* SR */
-env->sr = tmp;
+cpu_set_sr(env, tmp);
 break;
 
 default:
diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index a981638..042506f 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -54,7 +54,7 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
we need flush TLB when we enter EXCP.  */
 tlb_flush(cs);
 
-env->esr = env->sr;
+env->esr = cpu_get_sr(env);
 env->sr &= ~SR_DME;
 env->sr &= ~SR_IME;
 env->sr |= SR_SM;
diff --git a/target/openrisc/interrupt_helper.c b/target/openrisc/interrupt_helper.c
index a6d4df3..c7fa97a 100644
--- a/target/openrisc/interrupt_helper.c
+++ b/target/openrisc/interrupt_helper.c
@@ -33,7 +33,7 @@ void HELPER(rfe)(CPUOpenRISCState *env)
 #endif
 cpu->env.pc = cpu->env.epcr;
 cpu->env.npc = cpu->env.epcr;
-cpu->env.sr = cpu->env.esr;
+cpu_set_sr(&cpu->env, cpu->env.esr);
 cpu->env.lock_addr = -1;
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/openrisc/machine.c b/target/openrisc/machine.c
index d0b47ef..b723138 100644
--- a/target/openrisc/machine.c
+++ b/target/openrisc/machine.c
@@ -24,6 +24,27 @@
 #include "hw/boards.h"
 #include 

[Qemu-devel] [PULL 23/24] target/openrisc: Tidy handling of delayed branches

2017-02-13 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.h| 12 +---
 target/openrisc/gdbstub.c|  2 +-
 target/openrisc/interrupt.c  |  4 ++--
 target/openrisc/sys_helper.c |  2 +-
 target/openrisc/translate.c  | 40 
 5 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 8294636..50a36ba 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -83,9 +83,6 @@ enum {
 /* Version Register */
 #define SPR_VR 0xFFFF003F
 
-/* Internal flags, delay slot flag */
-#define D_FLAG1
-
 /* Interrupt */
 #define NR_IRQS  32
 
@@ -298,8 +295,7 @@ typedef struct CPUOpenRISCState {
 target_ulong lock_addr;
 target_ulong lock_value;
 
-uint32_t flags;   /* cpu_flags, we only use it for exception
- in solt so far.  */
+uint32_t dflag;   /* In delay slot (boolean) */
 
 /* Fields up to this point are cleared by a CPU reset */
 struct {} end_reset_fields;
@@ -392,14 +388,16 @@ int cpu_openrisc_get_phys_data(OpenRISCCPU *cpu,
 
 #include "exec/cpu-all.h"
 
+#define TB_FLAGS_DFLAG 1
+#define TB_FLAGS_OVE   SR_OVE
+
 static inline void cpu_get_tb_cpu_state(CPUOpenRISCState *env,
 target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
 *pc = env->pc;
 *cs_base = 0;
-/* D_FLAG -- branch instruction exception, OVE overflow trap enable.  */
-*flags = (env->flags & D_FLAG) | (env->sr & SR_OVE);
+*flags = env->dflag | (env->sr & SR_OVE);
 }
 
 static inline int cpu_mmu_index(CPUOpenRISCState *env, bool ifetch)
diff --git a/target/openrisc/gdbstub.c b/target/openrisc/gdbstub.c
index 2a4821f..b18c7e9 100644
--- a/target/openrisc/gdbstub.c
+++ b/target/openrisc/gdbstub.c
@@ -73,7 +73,7 @@ int openrisc_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
also clear delayed branch status.  */
 if (env->pc != tmp) {
 env->pc = tmp;
-env->flags = 0;
+env->dflag = 0;
 }
 break;
 
diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index 042506f..a2eec6f 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -34,8 +34,8 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 CPUOpenRISCState *env = &cpu->env;
 
 env->epcr = env->pc;
-if (env->flags & D_FLAG) {
-env->flags &= ~D_FLAG;
+if (env->dflag) {
+env->dflag = 0;
 env->sr |= SR_DSX;
 env->epcr -= 4;
 } else {
diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index 0968901..60c3193 100644
--- a/target/openrisc/sys_helper.c
+++ b/target/openrisc/sys_helper.c
@@ -45,7 +45,7 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
when "jumping" to the current instruction.  */
 if (env->pc != rb) {
 env->pc = rb;
-env->flags = 0;
+env->dflag = 0;
 cpu_loop_exit(cs);
 }
 break;
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 10f0633..313dae2 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -40,11 +40,11 @@
 typedef struct DisasContext {
 TranslationBlock *tb;
 target_ulong pc;
-uint32_t tb_flags, synced_flags, flags;
 uint32_t is_jmp;
 uint32_t mem_idx;
-int singlestep_enabled;
+uint32_t tb_flags;
 uint32_t delayed_branch;
+bool singlestep_enabled;
 } DisasContext;
 
 static TCGv_env cpu_env;
@@ -60,7 +60,7 @@ static TCGv cpu_lock_addr;
 static TCGv cpu_lock_value;
 static TCGv_i32 fpcsr;
 static TCGv_i64 cpu_mac;/* MACHI:MACLO */
-static TCGv_i32 env_flags;
+static TCGv_i32 cpu_dflag;
 #include "exec/gen-icount.h"
 
 void openrisc_translate_init(void)
@@ -77,9 +77,9 @@ void openrisc_translate_init(void)
 tcg_ctx.tcg_env = cpu_env;
 cpu_sr = tcg_global_mem_new(cpu_env,
 offsetof(CPUOpenRISCState, sr), "sr");
-env_flags = tcg_global_mem_new_i32(cpu_env,
-   offsetof(CPUOpenRISCState, flags),
-   "flags");
+cpu_dflag = tcg_global_mem_new_i32(cpu_env,
+   offsetof(CPUOpenRISCState, dflag),
+   "dflag");
 cpu_pc = tcg_global_mem_new(cpu_env,
 offsetof(CPUOpenRISCState, pc), "pc");
 cpu_ppc = tcg_global_mem_new(cpu_env,
@@ -111,15 +111,6 @@ void openrisc_translate_init(void)
 }
 }
 
-static inline void gen_sync_flags(DisasContext *dc)
-{
-/* Sync the tb dependent flag between translate and runtime.  */
-if ((dc->tb_flags ^ dc->synced_flags) & D_FLAG) {
-tcg_gen_movi_tl(env_flags, dc->tb_flags & D_FLAG);
-dc->synced_flags = 

[Qemu-devel] [PULL 21/24] target/openrisc: Optimize l.jal to next

2017-02-13 Thread Richard Henderson
This allows the tcg optimizer to see, and fold, all of the
constants involved in a GOT base register load sequence.
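
The check itself is tiny.  A plain-C sketch follows, with hypothetical names
(`jump_target`, `jal_is_pc_load`) and assuming n26 is the sign-extended word
offset of the jump: an l.jal whose target is the instruction after its own
delay slot does nothing except put pc + 8 in the link register, so no jump
needs to be recorded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Target of a pc-relative jump; n26 is a signed offset counted in words. */
static uint32_t jump_target(uint32_t pc, int32_t n26)
{
    return pc + (uint32_t)n26 * 4u;
}

/* True when l.jal only serves to load pc + 8 into the link register. */
static bool jal_is_pc_load(uint32_t pc, int32_t n26)
{
    return jump_target(pc, n26) == pc + 8;
}
```

With no jump in the way, the link register value is a known constant that
later address arithmetic can be folded against.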

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 66064e1..cda84b6 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -198,7 +198,11 @@ static void gen_jump(DisasContext *dc, int32_t n26, uint32_t reg, uint32_t op0)
 tcg_gen_movi_tl(jmp_pc, tmp_pc);
 break;
 case 0x01: /* l.jal */
-tcg_gen_movi_tl(cpu_R[9], (dc->pc + 8));
+tcg_gen_movi_tl(cpu_R[9], dc->pc + 8);
+/* Optimize jal being used to load the PC for PIC.  */
+if (tmp_pc == dc->pc + 8) {
+return;
+}
 tcg_gen_movi_tl(jmp_pc, tmp_pc);
 break;
 case 0x03: /* l.bnf */
-- 
2.9.3
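
For readers following the thread, the case this patch targets can be sketched in plain C. Position-independent code may issue an l.jal whose target is simply the instruction after the delay slot, so that the only useful effect is loading the PC into r9. The sketch below assumes 4-byte instructions and a single delay slot; the function names are hypothetical and are not QEMU APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value l.jal writes to the link register r9: the insn after the delay slot. */
uint32_t jal_link_value(uint32_t pc)
{
    return pc + 8;
}

/* Branch target; n26 is the already sign-extended word offset. */
uint32_t jal_target(uint32_t pc, int32_t n26)
{
    /* Unsigned arithmetic wraps modulo 2^32, so negative offsets work. */
    return pc + (uint32_t)n26 * 4u;
}

/* True when the jump goes to the same address that is stored in r9,
   i.e. the l.jal is only being used to read the PC. */
bool jal_is_pc_load(uint32_t pc, int32_t n26)
{
    return jal_target(pc, n26) == jal_link_value(pc);
}
```

With an offset of 2 words the target equals pc + 8, which is the condition the patch tests before skipping the write to jmp_pc.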




[Qemu-devel] [PULL 11/24] target/openrisc: Invert the decoding in dec_calc

2017-02-13 Thread Richard Henderson
Decoding the opcodes in the right order reduces the function by 100+ lines.
Also, it happens to put the opcodes in the same order as Chapter 17.

Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 302 ++--
 1 file changed, 95 insertions(+), 207 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index b8116ba..1f3f22c 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -465,133 +465,95 @@ static void dec_calc(DisasContext *dc, uint32_t insn)
 rb = extract32(insn, 11, 5);
 rd = extract32(insn, 21, 5);
 
-switch (op0) {
-case 0x0000:
-switch (op1) {
-case 0x00:/* l.add */
+switch (op1) {
+case 0:
+switch (op0) {
+case 0x0: /* l.add */
 LOG_DIS("l.add r%d, r%d, r%d\n", rd, ra, rb);
 gen_add(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
+return;
 
-case 0x0001:/* l.addc */
-switch (op1) {
-case 0x00:
+case 0x1: /* l.addc */
 LOG_DIS("l.addc r%d, r%d, r%d\n", rd, ra, rb);
 gen_addc(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
+return;
 
-case 0x0002:/* l.sub */
-switch (op1) {
-case 0x00:
+case 0x2: /* l.sub */
 LOG_DIS("l.sub r%d, r%d, r%d\n", rd, ra, rb);
 gen_sub(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
+return;
 
-case 0x0003:/* l.and */
-switch (op1) {
-case 0x00:
+case 0x3: /* l.and */
 LOG_DIS("l.and r%d, r%d, r%d\n", rd, ra, rb);
 tcg_gen_and_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
+return;
 
-case 0x0004:/* l.or */
-switch (op1) {
-case 0x00:
+case 0x4: /* l.or */
 LOG_DIS("l.or r%d, r%d, r%d\n", rd, ra, rb);
 tcg_gen_or_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
+return;
 
-case 0x0005:
-switch (op1) {
-case 0x00:/* l.xor */
+case 0x5: /* l.xor */
 LOG_DIS("l.xor r%d, r%d, r%d\n", rd, ra, rb);
 tcg_gen_xor_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
-
-case 0x0006:
-switch (op1) {
-case 0x03:/* l.mul */
-LOG_DIS("l.mul r%d, r%d, r%d\n", rd, ra, rb);
-gen_mul(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
-
-case 0x0009:
-switch (op1) {
-case 0x03:/* l.div */
-LOG_DIS("l.div r%d, r%d, r%d\n", rd, ra, rb);
-gen_div(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
-
-default:
-gen_illegal_exception(dc);
-break;
-}
-break;
-
-case 0x000a:
-switch (op1) {
-case 0x03:/* l.divu */
-LOG_DIS("l.divu r%d, r%d, r%d\n", rd, ra, rb);
-gen_divu(dc, cpu_R[rd], cpu_R[ra], cpu_R[rb]);
-break;
+return;
 
-default:
-gen_illegal_exception(dc);
+case 0x8:
+switch (op2) {
+case 0: /* l.sll */
+LOG_DIS("l.sll r%d, r%d, r%d\n", rd, ra, rb);
+tcg_gen_shl_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
+return;
+case 1: /* l.srl */
+LOG_DIS("l.srl r%d, r%d, r%d\n", rd, ra, rb);
+tcg_gen_shr_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
+return;
+case 2: /* l.sra */
+LOG_DIS("l.sra r%d, r%d, r%d\n", rd, ra, rb);
+tcg_gen_sar_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
+return;
+case 3: /* l.ror */
+LOG_DIS("l.ror r%d, r%d, r%d\n", rd, ra, rb);
+tcg_gen_rotr_tl(cpu_R[rd], cpu_R[ra], cpu_R[rb]);
+return;
+}
 break;
-}
-break;
 
-case 0x000b:
-switch (op1) {
-case 0x03:/* l.mulu */
-LOG_DIS("l.mulu r%d, r%d, r%d\n", rd, ra, rb);
-gen_mulu(dc, cpu_R[rd], 
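
The diff above is cut off by the archive, so here is a minimal standalone sketch of the inverted decode. It switches on op1 first and then on op0, with op2 selecting the shift variant, and every unmatched encoding falls through to a single illegal result. The bit positions (op0 in bits 3:0, op2 in bits 7:6, op1 in bits 9:8) are an assumption, as they are not visible in the quoted hunk, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum calc_op {
    OP_ILLEGAL, OP_ADD, OP_ADDC, OP_SUB,
    OP_SLL, OP_SRL, OP_SRA, OP_ROR,
    OP_MUL, OP_DIV
};

/* Extract an unsigned bit field of `len` bits starting at bit `start`. */
static uint32_t field(uint32_t insn, int start, int len)
{
    assert(start >= 0 && len > 0 && len < 32);
    return (insn >> start) & ((1u << len) - 1);
}

enum calc_op decode_calc(uint32_t insn)
{
    uint32_t op0 = field(insn, 0, 4);   /* assumed position */
    uint32_t op1 = field(insn, 8, 2);   /* assumed position */
    uint32_t op2 = field(insn, 6, 2);   /* assumed position */

    switch (op1) {
    case 0:
        switch (op0) {
        case 0x0: return OP_ADD;
        case 0x1: return OP_ADDC;
        case 0x2: return OP_SUB;
        case 0x8:
            switch (op2) {
            case 0: return OP_SLL;
            case 1: return OP_SRL;
            case 2: return OP_SRA;
            case 3: return OP_ROR;
            }
            break;
        }
        break;
    case 3:
        switch (op0) {
        case 0x6: return OP_MUL;
        case 0x9: return OP_DIV;
        }
        break;
    }
    /* One shared exit replaces the per-opcode default: branches. */
    return OP_ILLEGAL;
}
```

Because each recognised opcode returns directly, the illegal-instruction handling appears once at the end instead of once per case, which is where most of the line savings come from.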

[Qemu-devel] [PULL 07/24] target/openrisc: Tidy insn dumping

2017-02-13 Thread Richard Henderson
Avoids warnings from unused variables etc.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 36 
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index c207875..ac0c409 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -34,14 +34,8 @@
 #include "trace-tcg.h"
 #include "exec/log.h"
 
-
-#define OPENRISC_DISAS
-
-#ifdef OPENRISC_DISAS
-#  define LOG_DIS(...) qemu_log_mask(CPU_LOG_TB_IN_ASM, ## __VA_ARGS__)
-#else
-#  define LOG_DIS(...) do { } while (0)
-#endif
+#define LOG_DIS(str, ...) \
+qemu_log_mask(CPU_LOG_TB_IN_ASM, "%08x: " str, dc->pc, ## __VA_ARGS__)
 
 typedef struct DisasContext {
 TranslationBlock *tb;
@@ -766,9 +760,7 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 {
 uint32_t op0, op1;
 uint32_t ra, rb, rd;
-#ifdef OPENRISC_DISAS
 uint32_t L6, K5;
-#endif
 uint32_t I16, I5, I11, N26, tmp;
 TCGMemOp mop;
 
@@ -777,10 +769,8 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 ra = extract32(insn, 16, 5);
 rb = extract32(insn, 11, 5);
 rd = extract32(insn, 21, 5);
-#ifdef OPENRISC_DISAS
 L6 = extract32(insn, 5, 6);
 K5 = extract32(insn, 0, 5);
-#endif
 I16 = extract32(insn, 0, 16);
 I5 = extract32(insn, 21, 5);
 I11 = extract32(insn, 0, 11);
@@ -1387,13 +1377,10 @@ static void dec_compi(DisasContext *dc, uint32_t insn)
 static void dec_sys(DisasContext *dc, uint32_t insn)
 {
 uint32_t op0;
-#ifdef OPENRISC_DISAS
 uint32_t K16;
-#endif
+
 op0 = extract32(insn, 16, 10);
-#ifdef OPENRISC_DISAS
 K16 = extract32(insn, 0, 16);
-#endif
 
 switch (op0) {
 case 0x000:/* l.sys */
@@ -1723,6 +1710,13 @@ void gen_intermediate_code(CPUOpenRISCState *env, struct 
TranslationBlock *tb)
 max_insns = TCG_MAX_INSNS;
 }
 
+if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
+&& qemu_log_in_addr_range(pc_start)) {
+qemu_log_lock();
+qemu_log("\n");
+qemu_log("IN: %s\n", lookup_symbol(pc_start));
+}
+
 gen_tb_start(tb);
 
 do {
@@ -1807,18 +1801,12 @@ void gen_intermediate_code(CPUOpenRISCState *env, 
struct TranslationBlock *tb)
 tb->size = dc->pc - pc_start;
 tb->icount = num_insns;
 
-#ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
 && qemu_log_in_addr_range(pc_start)) {
-qemu_log_lock();
-qemu_log("\n");
-qemu_log("IN: %s\n", lookup_symbol(pc_start));
-log_target_disas(cs, pc_start, dc->pc - pc_start, 0);
-qemu_log("\nisize=%d osize=%d\n",
- dc->pc - pc_start, tcg_op_buf_count());
+log_target_disas(cs, pc_start, tb->size, 0);
+qemu_log("\n");
 qemu_log_unlock();
 }
-#endif
 }
 
 void openrisc_cpu_dump_state(CPUState *cs, FILE *f,
-- 
2.9.3
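
As a standalone illustration of the new LOG_DIS shape: the macro prepends the current PC to every disassembly line and relies on a variable named dc being in scope at the call site. The sketch below writes into a buffer with snprintf instead of calling qemu_log_mask, so it can be run outside QEMU; struct ctx, log_buf and the two wrapper functions are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct ctx {
    uint32_t pc;
};

static char log_buf[128];

/* ", ## __VA_ARGS__" is a GNU extension (accepted by GCC and Clang) that
   drops the preceding comma when no variadic arguments are supplied.
   Like the macro in the patch, this expects `dc` to be in scope. */
#define LOG_DIS_SKETCH(str, ...) \
    snprintf(log_buf, sizeof(log_buf), "%08x: " str, \
             (unsigned)dc->pc, ## __VA_ARGS__)

const char *log_add(const struct ctx *dc, int rd, int ra, int rb)
{
    LOG_DIS_SKETCH("l.add r%d, r%d, r%d\n", rd, ra, rb);
    return log_buf;
}

const char *log_msync(const struct ctx *dc)
{
    LOG_DIS_SKETCH("l.msync\n");   /* no variadic arguments */
    return log_buf;
}
```

Since the macro body now always references its arguments, the fields extracted only for logging (L6, K5, K16) are always used, which is why the #ifdef guards around them can go.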




[Qemu-devel] [PULL 17/24] target/openrisc: Implement msync

2017-02-13 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 6c8f05c..dd4ba8c 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1144,6 +1144,7 @@ static void dec_sys(DisasContext *dc, uint32_t insn)
 
 case 0x200:/* l.msync */
 LOG_DIS("l.msync\n");
+tcg_gen_mb(TCG_MO_ALL);
 break;
 
 case 0x270:/* l.psync */
-- 
2.9.3




[Qemu-devel] [PULL 20/24] target/openrisc: Fix madd

2017-02-13 Thread Richard Henderson
Note that the specification for lf.madd.s is confused.  It's
the only mention of supposed FPMADDHI/FPMADDLO special registers.
On the other hand, or1ksim implements a somewhat normal non-fused
multiply and add.  Mirror that.

Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.h|  3 --
 target/openrisc/fpu_helper.c | 68 
 target/openrisc/helper.h |  7 ++---
 target/openrisc/translate.c  | 13 +++--
 4 files changed, 30 insertions(+), 61 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 9528277..0694038 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -279,9 +279,6 @@ typedef struct CPUOpenRISCState {
 
 uint64_t mac; /* Multiply registers MACHI:MACLO */
 
-target_ulong fpmaddhi;/* Multiply and add float register FPMADDHI */
-target_ulong fpmaddlo;/* Multiply and add float register FPMADDLO */
-
 target_ulong epcr;/* Exception PC register */
 target_ulong eear;/* Exception EA register */
 
diff --git a/target/openrisc/fpu_helper.c b/target/openrisc/fpu_helper.c
index c54404b..1375cea 100644
--- a/target/openrisc/fpu_helper.c
+++ b/target/openrisc/fpu_helper.c
@@ -146,52 +146,32 @@ FLOAT_CALC(div)
 FLOAT_CALC(rem)
 #undef FLOAT_CALC
 
-#define FLOAT_TERNOP(name1, name2)\
-uint64_t helper_float_ ## name1 ## name2 ## _d(CPUOpenRISCState *env, \
-   uint64_t fdt0, \
-   uint64_t fdt1) \
-{ \
-uint64_t result, temp, hi, lo;\
-uint32_t val1, val2;  \
-OpenRISCCPU *cpu = openrisc_env_get_cpu(env); \
-hi = env->fpmaddhi;   \
-lo = env->fpmaddlo;   \
-set_float_exception_flags(0, &cpu->env.fp_status);\
-result = float64_ ## name1(fdt0, fdt1, &cpu->env.fp_status);  \
-lo &= 0xffffffff; \
-hi &= 0xffffffff; \
-temp = (hi << 32) | lo;   \
-result = float64_ ## name2(result, temp, &cpu->env.fp_status);\
-val1 = result >> 32;  \
-val2 = (uint32_t) (result & 0xffffffff);  \
-update_fpcsr(cpu);\
-cpu->env.fpmaddlo = val2; \
-cpu->env.fpmaddhi = val1; \
-return 0; \
-} \
-  \
-uint32_t helper_float_ ## name1 ## name2 ## _s(CPUOpenRISCState *env, \
-uint32_t fdt0, uint32_t fdt1) \
-{ \
-uint64_t result, temp, hi, lo;\
-uint32_t val1, val2;  \
-OpenRISCCPU *cpu = openrisc_env_get_cpu(env); \
-hi = cpu->env.fpmaddhi;   \
-lo = cpu->env.fpmaddlo;   \
-set_float_exception_flags(0, &cpu->env.fp_status);\
-result = float64_ ## name1(fdt0, fdt1, &cpu->env.fp_status);  \
-temp = (hi << 32) | lo;   \
-result = float64_ ## name2(result, temp, &cpu->env.fp_status);\
-val1 = result >> 32;  \
-val2 = (uint32_t) (result & 0xffffffff);  \
-update_fpcsr(cpu);\
-cpu->env.fpmaddlo = val2; \
-cpu->env.fpmaddhi = val1; \
-return 0; \
+
+uint64_t helper_float_madd_d(CPUOpenRISCState *env, uint64_t a,
+ uint64_t b, uint64_t c)
+{
+OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
+uint64_t result;
+set_float_exception_flags(0, &cpu->env.fp_status);
+/* Note that or1ksim doesn't use merged operation.  */
+result = float64_mul(b, c, &cpu->env.fp_status);
+result = float64_add(result, 
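
The patch text is truncated here. To show what "non-fused" means in practice, the sketch below compares a multiply-then-add with two roundings against a version whose product is not rounded before the add. It assumes the operation is a + b * c (the addend is not visible in the truncated hunk), uses native float rather than QEMU's softfloat, and both function names are hypothetical.

```c
/* Non-fused: the product is rounded to float before the add.  The
   volatile store stops the compiler from contracting the two steps
   into one fused multiply-add instruction. */
float madd_unfused(float a, float b, float c)
{
    volatile float product = b * c;
    return a + product;
}

/* Comparison point: a float * float product is exact in double, so the
   product is not rounded before the add.  This is only an approximation
   of a true fused operation, because the final result is still rounded
   twice (to double, then to float). */
float madd_exact_product(float a, float b, float c)
{
    double product = (double)b * (double)c;
    return (float)(product + (double)a);
}
```

For b = c = 1 + 2^-12 the exact product is 1 + 2^-11 + 2^-24, which rounds to 1 + 2^-11 in single precision. Adding a = -(1 + 2^-11) therefore gives 0 in the non-fused version and 2^-24 when the product is kept exact, so the two behaviours are observably different.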

[Qemu-devel] [PULL 06/24] target/openrisc: Implement lwa, swa

2017-02-13 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.c  |  1 +
 target/openrisc/cpu.h  |  3 ++
 target/openrisc/interrupt.c|  1 +
 target/openrisc/interrupt_helper.c |  1 +
 target/openrisc/machine.c  | 24 ++--
 target/openrisc/mmu.c  |  1 +
 target/openrisc/translate.c| 58 ++
 7 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index 422139d..7fd2b9a 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -48,6 +48,7 @@ static void openrisc_cpu_reset(CPUState *s)
 
 cpu->env.pc = 0x100;
 cpu->env.sr = SR_FO | SR_SM;
+cpu->env.lock_addr = -1;
 s->exception_index = -1;
 
 cpu->env.upr = UPR_UP | UPR_DMP | UPR_IMP | UPR_PICP | UPR_TTP;
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 231c163..06d0e89 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -296,6 +296,9 @@ typedef struct CPUOpenRISCState {
 uint32_t fpcsr;   /* Float register */
 float_status fp_status;
 
+target_ulong lock_addr;
+target_ulong lock_value;
+
 uint32_t flags;   /* cpu_flags, we only use it for exception
  in solt so far.  */
 uint32_t btaken;  /* the SR_F bit */
diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index a243eb2..a981638 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -62,6 +62,7 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 env->sr &= ~SR_TEE;
 env->tlb->cpu_openrisc_map_address_data = &cpu_openrisc_get_phys_nommu;
 env->tlb->cpu_openrisc_map_address_code = &cpu_openrisc_get_phys_nommu;
+env->lock_addr = -1;
 
 if (cs->exception_index > 0 && cs->exception_index < EXCP_NR) {
 env->pc = (cs->exception_index << 8);
diff --git a/target/openrisc/interrupt_helper.c 
b/target/openrisc/interrupt_helper.c
index 0ed5146..a6d4df3 100644
--- a/target/openrisc/interrupt_helper.c
+++ b/target/openrisc/interrupt_helper.c
@@ -34,6 +34,7 @@ void HELPER(rfe)(CPUOpenRISCState *env)
 cpu->env.pc = cpu->env.epcr;
 cpu->env.npc = cpu->env.epcr;
 cpu->env.sr = cpu->env.esr;
+cpu->env.lock_addr = -1;
 
 #ifndef CONFIG_USER_ONLY
 if (cpu->env.sr & SR_DME) {
diff --git a/target/openrisc/machine.c b/target/openrisc/machine.c
index 17b0c77..d0b47ef 100644
--- a/target/openrisc/machine.c
+++ b/target/openrisc/machine.c
@@ -26,18 +26,26 @@
 
 static const VMStateDescription vmstate_env = {
 .name = "env",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
 .fields = (VMStateField[]) {
-VMSTATE_UINT32_ARRAY(gpr, CPUOpenRISCState, 32),
+VMSTATE_UINTTL_ARRAY(gpr, CPUOpenRISCState, 32),
+VMSTATE_UINTTL(pc, CPUOpenRISCState),
+VMSTATE_UINTTL(npc, CPUOpenRISCState),
+VMSTATE_UINTTL(ppc, CPUOpenRISCState),
+VMSTATE_UINTTL(jmp_pc, CPUOpenRISCState),
+VMSTATE_UINTTL(lock_addr, CPUOpenRISCState),
+VMSTATE_UINTTL(lock_value, CPUOpenRISCState),
+VMSTATE_UINTTL(epcr, CPUOpenRISCState),
+VMSTATE_UINTTL(eear, CPUOpenRISCState),
 VMSTATE_UINT32(sr, CPUOpenRISCState),
-VMSTATE_UINT32(epcr, CPUOpenRISCState),
-VMSTATE_UINT32(eear, CPUOpenRISCState),
+VMSTATE_UINT32(vr, CPUOpenRISCState),
+VMSTATE_UINT32(upr, CPUOpenRISCState),
+VMSTATE_UINT32(cpucfgr, CPUOpenRISCState),
+VMSTATE_UINT32(dmmucfgr, CPUOpenRISCState),
+VMSTATE_UINT32(immucfgr, CPUOpenRISCState),
 VMSTATE_UINT32(esr, CPUOpenRISCState),
 VMSTATE_UINT32(fpcsr, CPUOpenRISCState),
-VMSTATE_UINT32(pc, CPUOpenRISCState),
-VMSTATE_UINT32(npc, CPUOpenRISCState),
-VMSTATE_UINT32(ppc, CPUOpenRISCState),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/target/openrisc/mmu.c b/target/openrisc/mmu.c
index 505dcdc..56b11d3 100644
--- a/target/openrisc/mmu.c
+++ b/target/openrisc/mmu.c
@@ -174,6 +174,7 @@ static void cpu_openrisc_raise_mmu_exception(OpenRISCCPU 
*cpu,
 
 cs->exception_index = exception;
 cpu->env.eear = address;
+cpu->env.lock_addr = -1;
 }
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 03fa7db..c207875 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -61,6 +61,8 @@ static TCGv jmp_pc;/* l.jr/l.jalr temp pc */
 static TCGv cpu_npc;
 static TCGv cpu_ppc;
 static TCGv_i32 env_btaken;/* bf/bnf , F flag taken */
+static TCGv cpu_lock_addr;
+static TCGv cpu_lock_value;
 static TCGv_i32 fpcsr;
 static TCGv machi, maclo;
 static TCGv fpmaddhi, fpmaddlo;
@@ -95,6 +97,12 @@ void openrisc_translate_init(void)
 env_btaken = tcg_global_mem_new_i32(cpu_env,
 offsetof(CPUOpenRISCState, btaken),
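
The translate.c part of this patch is cut off in the archive, so the actual code generation for l.lwa and l.swa is not visible here. The sketch below is only a single-threaded model of generic load-linked/store-conditional behaviour built on the two new fields. It assumes that lwa records the address and the loaded value, that swa succeeds only if the reservation is intact and memory still holds the recorded value, and that exception entry drops the reservation by setting lock_addr to -1 as the hunks above do. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_RESERVATION UINT32_MAX   /* models lock_addr = -1 */

struct cpu_model {
    uint32_t lock_addr;
    uint32_t lock_value;
};

/* Load and remember both the address and the value that was seen. */
uint32_t model_lwa(struct cpu_model *cpu, const uint32_t *mem, uint32_t addr)
{
    uint32_t val = mem[addr / 4];
    cpu->lock_addr = addr;
    cpu->lock_value = val;
    return val;
}

/* Store only if the reservation matches and memory is unchanged. */
bool model_swa(struct cpu_model *cpu, uint32_t *mem, uint32_t addr,
               uint32_t newval)
{
    bool ok = cpu->lock_addr == addr && mem[addr / 4] == cpu->lock_value;
    if (ok) {
        mem[addr / 4] = newval;
    }
    cpu->lock_addr = NO_RESERVATION;   /* a reservation is used only once */
    return ok;
}

/* Exception entry and l.rfe drop any reservation. */
void model_exception(struct cpu_model *cpu)
{
    cpu->lock_addr = NO_RESERVATION;
}

/* Small scenario: increment a word with lwa/swa, optionally disturbed. */
bool swa_succeeds(bool exception_between, bool other_write)
{
    uint32_t mem[1] = { 41 };
    struct cpu_model cpu = { NO_RESERVATION, 0 };
    uint32_t old = model_lwa(&cpu, mem, 0);

    if (exception_between) {
        model_exception(&cpu);
    }
    if (other_write) {
        mem[0] = 7;
    }
    return model_swa(&cpu, mem, 0, old + 1) && mem[0] == 42;
}
```

Comparing the stored value rather than tracking every write is a common emulation shortcut; it cannot detect a value that was changed and then changed back.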
  

[Qemu-devel] [PULL 01/24] target/openrisc: Rename the cpu from or32 to or1k

2017-02-13 Thread Richard Henderson
This is in keeping with the toolchain and or1ksim.

Signed-off-by: Richard Henderson 
---
 configure   | 6 +++---
 default-configs/or1k-linux-user.mak | 1 +
 default-configs/or1k-softmmu.mak| 4 
 default-configs/or32-linux-user.mak | 1 -
 default-configs/or32-softmmu.mak| 4 
 hw/openrisc/openrisc_sim.c  | 4 ++--
 target/openrisc/cpu.h   | 2 +-
 tests/tcg/openrisc/Makefile | 4 ++--
 8 files changed, 13 insertions(+), 13 deletions(-)
 create mode 100644 default-configs/or1k-linux-user.mak
 create mode 100644 default-configs/or1k-softmmu.mak
 delete mode 100644 default-configs/or32-linux-user.mak
 delete mode 100644 default-configs/or32-softmmu.mak

diff --git a/configure b/configure
index 6325339..1c9655e 100755
--- a/configure
+++ b/configure
@@ -5843,7 +5843,7 @@ target_name=$(echo $target | cut -d '-' -f 1)
 target_bigendian="no"
 
 case "$target_name" in
-  
armeb|hppa|lm32|m68k|microblaze|mips|mipsn32|mips64|moxie|or32|ppc|ppcemb|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus|xtensaeb)
+  
armeb|hppa|lm32|m68k|microblaze|mips|mipsn32|mips64|moxie|or1k|ppc|ppcemb|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus|xtensaeb)
   target_bigendian=yes
   ;;
 esac
@@ -5937,7 +5937,7 @@ case "$target_name" in
   ;;
   nios2)
   ;;
-  or32)
+  or1k)
 TARGET_ARCH=openrisc
 TARGET_BASE_ARCH=openrisc
   ;;
@@ -6145,7 +6145,7 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
   nios2)
 disas_config "NIOS2"
   ;;
-  or32)
+  or1k)
 disas_config "OPENRISC"
   ;;
   ppc*)
diff --git a/default-configs/or1k-linux-user.mak 
b/default-configs/or1k-linux-user.mak
new file mode 100644
index 0000000..20e03c1
--- /dev/null
+++ b/default-configs/or1k-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for or1k-linux-user
diff --git a/default-configs/or1k-softmmu.mak b/default-configs/or1k-softmmu.mak
new file mode 100644
index 000..10bfa7a
--- /dev/null
+++ b/default-configs/or1k-softmmu.mak
@@ -0,0 +1,4 @@
+# Default configuration for or1k-softmmu
+
+CONFIG_SERIAL=y
+CONFIG_OPENCORES_ETH=y
diff --git a/default-configs/or32-linux-user.mak 
b/default-configs/or32-linux-user.mak
deleted file mode 100644
index 808c1f9..0000000
--- a/default-configs/or32-linux-user.mak
+++ /dev/null
@@ -1 +0,0 @@
-# Default configuration for or32-linux-user
diff --git a/default-configs/or32-softmmu.mak b/default-configs/or32-softmmu.mak
deleted file mode 100644
index cce4746..0000000
--- a/default-configs/or32-softmmu.mak
+++ /dev/null
@@ -1,4 +0,0 @@
-# Default configuration for or32-softmmu
-
-CONFIG_SERIAL=y
-CONFIG_OPENCORES_ETH=y
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 6d06d5b..fc0d096 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -139,10 +139,10 @@ static void openrisc_sim_init(MachineState *machine)
 
 static void openrisc_sim_machine_init(MachineClass *mc)
 {
-mc->desc = "or32 simulation";
+mc->desc = "or1k simulation";
 mc->init = openrisc_sim_init;
 mc->max_cpus = 1;
 mc->is_default = 1;
 }
 
-DEFINE_MACHINE("or32-sim", openrisc_sim_machine_init)
+DEFINE_MACHINE("or1k-sim", openrisc_sim_machine_init)
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 508ef56..231c163 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -32,7 +32,7 @@ struct OpenRISCCPU;
 #include "fpu/softfloat.h"
 #include "qom/cpu.h"
 
-#define TYPE_OPENRISC_CPU "or32-cpu"
+#define TYPE_OPENRISC_CPU "or1k-cpu"
 
 #define OPENRISC_CPU_CLASS(klass) \
 OBJECT_CLASS_CHECK(OpenRISCCPUClass, (klass), TYPE_OPENRISC_CPU)
diff --git a/tests/tcg/openrisc/Makefile b/tests/tcg/openrisc/Makefile
index 7e65888..fb5ceda 100644
--- a/tests/tcg/openrisc/Makefile
+++ b/tests/tcg/openrisc/Makefile
@@ -1,8 +1,8 @@
 -include ../../config-host.mak
 
-CROSS = or32-linux-
+CROSS = or1k-linux-
 
-SIM = qemu-or32
+SIM = qemu-or1k
 
 CC = $(CROSS)gcc
 
-- 
2.9.3




[Qemu-devel] [PULL 09/24] target/openrisc: Streamline arithmetic and OVE

2017-02-13 Thread Richard Henderson
Fix incorrect overflow calculation.  Move overflow exception check
to a helper function, to eliminate inline branches.  Remove some
incorrect special casing of R0.  Implement multiply inline.

Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/Makefile.objs  |   2 +-
 target/openrisc/exception_helper.c |  12 ++
 target/openrisc/helper.h   |   4 +-
 target/openrisc/int_helper.c   |  61 --
 target/openrisc/translate.c| 426 +++--
 5 files changed, 191 insertions(+), 314 deletions(-)
 delete mode 100644 target/openrisc/int_helper.c

diff --git a/target/openrisc/Makefile.objs b/target/openrisc/Makefile.objs
index 397d016..918b1c6 100644
--- a/target/openrisc/Makefile.objs
+++ b/target/openrisc/Makefile.objs
@@ -1,5 +1,5 @@
 obj-$(CONFIG_SOFTMMU) += machine.o
 obj-y += cpu.o exception.o interrupt.o mmu.o translate.o
-obj-y += exception_helper.o fpu_helper.o int_helper.o \
+obj-y += exception_helper.o fpu_helper.o \
  interrupt_helper.o mmu_helper.o sys_helper.o
 obj-y += gdbstub.o
diff --git a/target/openrisc/exception_helper.c 
b/target/openrisc/exception_helper.c
index 329a9e4..7e54c97 100644
--- a/target/openrisc/exception_helper.c
+++ b/target/openrisc/exception_helper.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
+#include "exec/exec-all.h"
 #include "exception.h"
 
 void HELPER(exception)(CPUOpenRISCState *env, uint32_t excp)
@@ -28,3 +29,14 @@ void HELPER(exception)(CPUOpenRISCState *env, uint32_t excp)
 
 raise_exception(cpu, excp);
 }
+
+void HELPER(ove)(CPUOpenRISCState *env, target_ulong test)
+{
+if (unlikely(test) && (env->sr & SR_OVE)) {
+OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
+CPUState *cs = CPU(cpu);
+
+cs->exception_index = EXCP_RANGE;
+cpu_loop_exit_restore(cs, GETPC());
+}
+}
diff --git a/target/openrisc/helper.h b/target/openrisc/helper.h
index bcc7245..c2c8098 100644
--- a/target/openrisc/helper.h
+++ b/target/openrisc/helper.h
@@ -19,6 +19,7 @@
 
 /* exception */
 DEF_HELPER_FLAGS_2(exception, 0, void, env, i32)
+DEF_HELPER_FLAGS_2(ove, TCG_CALL_NO_WG, void, env, tl)
 
 /* float */
 DEF_HELPER_FLAGS_2(itofd, 0, i64, env, i64)
@@ -53,9 +54,6 @@ FOP_CMP(gt)
 FOP_CMP(ge)
 #undef FOP_CMP
 
-/* int */
-DEF_HELPER_FLAGS_3(mul32, 0, i32, env, i32, i32)
-
 /* interrupt */
 DEF_HELPER_FLAGS_1(rfe, 0, void, env)
 
diff --git a/target/openrisc/int_helper.c b/target/openrisc/int_helper.c
deleted file mode 100644
index ba0fd27..0000000
--- a/target/openrisc/int_helper.c
+++ /dev/null
@@ -1,61 +0,0 @@
-/*
- * OpenRISC int helper routines
- *
- * Copyright (c) 2011-2012 Jia Liu 
- * Feng Gao 
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "qemu/osdep.h"
-#include "cpu.h"
-#include "exec/helper-proto.h"
-#include "exception.h"
-#include "qemu/host-utils.h"
-
-uint32_t HELPER(mul32)(CPUOpenRISCState *env,
-   uint32_t ra, uint32_t rb)
-{
-uint64_t result;
-uint32_t high, cy;
-
-OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
-
-result = (uint64_t)ra * rb;
-/* regisiers in or32 is 32bit, so 32 is NOT a magic number.
-   or64 is not handled in this function, and not implement yet,
-   TARGET_LONG_BITS for or64 is 64, it will break this function,
-   so, we didn't use TARGET_LONG_BITS here.  */
-high = result >> 32;
-cy = result >> (32 - 1);
-
-if ((cy & 0x1) == 0x0) {
-if (high == 0x0) {
-return result;
-}
-}
-
-if ((cy & 0x1) == 0x1) {
-if (high == 0xffffffff) {
-return result;
-}
-}
-
-cpu->env.sr |= (SR_OV | SR_CY);
-if (cpu->env.sr & SR_OVE) {
-raise_exception(cpu, EXCP_RANGE);
-}
-
-return result;
-}
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index d999d2f..7c6cd1c 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -247,6 +247,166 @@ static void gen_jump(DisasContext *dc, int32_t n26, 
uint32_t reg, uint32_t op0)
 gen_sync_flags(dc);
 }
 
+static void gen_ove_cy(DisasContext *dc, TCGv cy)
+{
+gen_helper_ove(cpu_env, cy);
+}
+
+static void 
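
The rest of this diff is truncated, so the new gen_add and friends are not shown. As a rough guide to the calculation involved, the following sketch computes the carry and signed-overflow flags for a 32-bit add and models the condition the new ove helper tests. It is written in plain C rather than TCG ops, it is not claimed to match the patch line for line, and the SR_OVE bit position and all names are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_OVE_SKETCH (1u << 12)   /* assumed bit position */

struct add_result {
    uint32_t value;
    bool cy;   /* unsigned carry out */
    bool ov;   /* signed overflow */
};

struct add_result add32_flags(uint32_t a, uint32_t b)
{
    struct add_result r;

    r.value = a + b;
    /* Carry: the wrapped sum is smaller than an operand. */
    r.cy = r.value < a;
    /* Overflow: operands share a sign and the result's sign differs. */
    r.ov = ((~(a ^ b) & (a ^ r.value)) >> 31) != 0;
    return r;
}

/* Mirrors the helper's test: trap only if the flag is set and the
   overflow exception is enabled in SR. */
bool ove_should_trap(uint32_t test, uint32_t sr)
{
    return test != 0 && (sr & SR_OVE_SKETCH) != 0;
}
```

Keeping the flag computation branch-free and pushing the rare trap into a helper is what lets the generated code stay a straight line in the common case.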

[Qemu-devel] [PULL 10/24] target/openrisc: Put SR[OVE] in TB flags

2017-02-13 Thread Richard Henderson
Removes a call at execution time for overflow exceptions.

Signed-off-by: Richard Henderson 
---
 target/openrisc/cpu.h  |  4 ++--
 target/openrisc/exception_helper.c |  2 +-
 target/openrisc/translate.c| 24 +++-
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 06d0e89..ef90e49 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -400,8 +400,8 @@ static inline void cpu_get_tb_cpu_state(CPUOpenRISCState 
*env,
 {
 *pc = env->pc;
 *cs_base = 0;
-/* D_FLAG -- branch instruction exception */
-*flags = (env->flags & D_FLAG);
+/* D_FLAG -- branch instruction exception, OVE overflow trap enable.  */
+*flags = (env->flags & D_FLAG) | (env->sr & SR_OVE);
 }
 
 static inline int cpu_mmu_index(CPUOpenRISCState *env, bool ifetch)
diff --git a/target/openrisc/exception_helper.c 
b/target/openrisc/exception_helper.c
index 7e54c97..5147da6 100644
--- a/target/openrisc/exception_helper.c
+++ b/target/openrisc/exception_helper.c
@@ -32,7 +32,7 @@ void HELPER(exception)(CPUOpenRISCState *env, uint32_t excp)
 
 void HELPER(ove)(CPUOpenRISCState *env, target_ulong test)
 {
-if (unlikely(test) && (env->sr & SR_OVE)) {
+if (unlikely(test)) {
 OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
 CPUState *cs = CPU(cpu);
 
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 7c6cd1c..b8116ba 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -132,8 +132,8 @@ static inline void wb_SR_F(void)
 static inline void gen_sync_flags(DisasContext *dc)
 {
 /* Sync the tb dependent flag between translate and runtime.  */
-if (dc->tb_flags != dc->synced_flags) {
-tcg_gen_movi_tl(env_flags, dc->tb_flags);
+if ((dc->tb_flags ^ dc->synced_flags) & D_FLAG) {
+tcg_gen_movi_tl(env_flags, dc->tb_flags & D_FLAG);
 dc->synced_flags = dc->tb_flags;
 }
 }
@@ -249,20 +249,26 @@ static void gen_jump(DisasContext *dc, int32_t n26, 
uint32_t reg, uint32_t op0)
 
 static void gen_ove_cy(DisasContext *dc, TCGv cy)
 {
-gen_helper_ove(cpu_env, cy);
+if (dc->tb_flags & SR_OVE) {
+gen_helper_ove(cpu_env, cy);
+}
 }
 
 static void gen_ove_ov(DisasContext *dc, TCGv ov)
 {
-gen_helper_ove(cpu_env, ov);
+if (dc->tb_flags & SR_OVE) {
+gen_helper_ove(cpu_env, ov);
+}
 }
 
 static void gen_ove_cyov(DisasContext *dc, TCGv cy, TCGv ov)
 {
-TCGv t0 = tcg_temp_new();
-tcg_gen_or_tl(t0, cy, ov);
-gen_helper_ove(cpu_env, t0);
-tcg_temp_free(t0);
+if (dc->tb_flags & SR_OVE) {
+TCGv t0 = tcg_temp_new();
+tcg_gen_or_tl(t0, cy, ov);
+gen_helper_ove(cpu_env, t0);
+tcg_temp_free(t0);
+}
 }
 
 static void gen_add(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
@@ -1606,7 +1612,7 @@ void gen_intermediate_code(CPUOpenRISCState *env, struct 
TranslationBlock *tb)
 dc->flags = cpu->env.cpucfgr;
 dc->mem_idx = cpu_mmu_index(&cpu->env, false);
 dc->synced_flags = dc->tb_flags = tb->flags;
-dc->delayed_branch = !!(dc->tb_flags & D_FLAG);
+dc->delayed_branch = (dc->tb_flags & D_FLAG) != 0;
 dc->singlestep_enabled = cs->singlestep_enabled;
 
 next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-- 
2.9.3
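
The idea in this patch can be modelled in a few lines: once SR[OVE] is part of the TB flags, the translator knows at translation time whether overflow traps are enabled, and can simply emit nothing when they are not. The sketch below assumes D_FLAG is bit 0 and SR_OVE is bit 12 (neither value is shown in the quoted hunks); the function names are hypothetical.

```c
#include <stdint.h>

#define D_FLAG_SKETCH (1u << 0)    /* assumed value */
#define SR_OVE_SKETCH (1u << 12)   /* assumed value */

/* Models cpu_get_tb_cpu_state(): pack both bits into the TB flags.
   The two masks must not overlap for this to be lossless. */
uint32_t pack_tb_flags(uint32_t env_flags, uint32_t sr)
{
    return (env_flags & D_FLAG_SKETCH) | (sr & SR_OVE_SKETCH);
}

/* Models gen_ove_*(): the number of helper calls emitted for one
   overflow check, decided while translating rather than while running. */
int ove_calls_emitted(uint32_t tb_flags)
{
    return (tb_flags & SR_OVE_SKETCH) ? 1 : 0;
}
```

A block translated with OVE clear contains no helper call at all; if the guest later sets OVE, the TB flags differ and a separate translation with the check is used instead.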




[Qemu-devel] [PULL 05/24] target/openrisc: Fix exception handling status registers

2017-02-13 Thread Richard Henderson
From: Stafford Horne 

I am working on testing instruction emulation patches for the linux
kernel. During testing I found these 2 issues:

 - QEMU sets DSX (delay slot exception) but never clears it
 - EEAR for illegal insns should point to the bad instruction (as per
   the openrisc spec) but it does not

This patch fixes these two issues by clearing the DSX flag when not in a
delay slot and by setting EEAR to exception PC when handling illegal
instruction exceptions.

After this patch the openrisc kernel with latest patches boots great on
qemu and instruction emulation works.

Cc: qemu-triv...@nongnu.org
Cc: openr...@lists.librecores.org
Signed-off-by: Stafford Horne 
Message-Id: <20170113220028.29687-1-sho...@gmail.com>
Signed-off-by: Richard Henderson 
---
 target/openrisc/interrupt.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index e43fc84..a243eb2 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -38,10 +38,17 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 env->flags &= ~D_FLAG;
 env->sr |= SR_DSX;
 env->epcr -= 4;
+} else {
+env->sr &= ~SR_DSX;
 }
 if (cs->exception_index == EXCP_SYSCALL) {
 env->epcr += 4;
 }
+/* When we have an illegal instruction the error effective address
+   shall be set to the illegal instruction address.  */
+if (cs->exception_index == EXCP_ILLEGAL) {
+env->eear = env->pc;
+}
 
 /* For machine-state changed between user-mode and supervisor mode,
we need flush TLB when we enter EXCP.  */
-- 
2.9.3
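
The behaviour after this fix can be summarised with a small model of exception entry. It assumes EPCR starts out as the faulting PC (that assignment is outside the quoted hunk), and the SR_DSX bit position and the exception numbers are illustrative values; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_DSX_SKETCH (1u << 13)   /* assumed bit position */

enum {
    EXCP_ILLEGAL_SKETCH = 0x7,     /* illustrative numbers */
    EXCP_SYSCALL_SKETCH = 0xc
};

struct exc_state {
    uint32_t pc, sr, epcr, eear;
};

/* Returns the state after taking exception `excp`. */
struct exc_state enter_exception(struct exc_state s, int excp,
                                 bool in_delay_slot)
{
    s.epcr = s.pc;                 /* assumption, see lead-in */

    if (in_delay_slot) {
        s.sr |= SR_DSX_SKETCH;
        s.epcr -= 4;               /* restart at the branch itself */
    } else {
        s.sr &= ~SR_DSX_SKETCH;    /* first fix: clear a stale DSX */
    }
    if (excp == EXCP_SYSCALL_SKETCH) {
        s.epcr += 4;               /* resume after l.sys */
    }
    if (excp == EXCP_ILLEGAL_SKETCH) {
        s.eear = s.pc;             /* second fix: address of the bad insn */
    }
    return s;
}
```

Without the else branch, a DSX bit left over from an earlier delay-slot exception would still be set, so a handler that emulates instructions would wrongly conclude that the faulting instruction was in a delay slot.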




[Qemu-devel] [PULL 00/24] target/openrisc patches

2017-02-13 Thread Richard Henderson
This is the v2 patch set that I posted last week.
It even acquired some new R-b.  Whee!


r~



The following changes since commit 305e6c8a2ff7a6e3f4942b50e853230f18eeb5a9:

  Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' 
into staging (2017-02-13 16:44:04 +0000)

are available in the git repository at:

  git://github.com/rth7680/qemu.git tags/pull-or-20170214

for you to fetch changes up to 6597c28d618a3d16d468770b7c30a0237a8c8ea9:

  target/openrisc: Optimize for r0 being zero (2017-02-14 08:15:00 +1100)


Queued openrisc patches


Richard Henderson (23):
  target/openrisc: Rename the cpu from or32 to or1k
  linux-user: Add MMAP_SHIFT for openrisc
  linux-user: Fix openrisc cpu_loop
  linux-user: Honor CLONE_SETTLS for openrisc
  target/openrisc: Implement lwa, swa
  target/openrisc: Tidy insn dumping
  target/openrisc: Rationalize immediate extraction
  target/openrisc: Streamline arithmetic and OVE
  target/openrisc: Put SR[OVE] in TB flags
  target/openrisc: Invert the decoding in dec_calc
  target/openrisc: Keep SR_F in a separate variable
  target/openrisc: Keep SR_CY and SR_OV in a separate variables
  target/openrisc: Use movcond where appropriate
  target/openrisc: Set flags on helpers
  target/openrisc: Enable trap, csync, msync, psync for user mode
  target/openrisc: Implement msync
  target/openrisc: Represent MACHI:MACLO as a single unit
  target/openrisc: Implement muld, muldu, macu, msbu
  target/openrisc: Fix madd
  target/openrisc: Optimize l.jal to next
  target/openrisc: Tidy ppc/npc implementation
  target/openrisc: Tidy handling of delayed branches
  target/openrisc: Optimize for r0 being zero

Stafford Horne (1):
  target/openrisc: Fix exception handling status registers

 configure|6 +-
 default-configs/or1k-linux-user.mak  |1 +
 default-configs/or1k-softmmu.mak |4 +
 default-configs/or32-linux-user.mak  |1 -
 default-configs/or32-softmmu.mak |4 -
 hw/openrisc/openrisc_sim.c   |4 +-
 linux-user/elfload.c |3 +-
 linux-user/main.c|   98 +--
 linux-user/openrisc/target_cpu.h |4 +-
 linux-user/openrisc/target_syscall.h |2 +
 target/openrisc/Makefile.objs|2 +-
 target/openrisc/cpu.c|1 +
 target/openrisc/cpu.h|   50 +-
 target/openrisc/exception_helper.c   |   32 +
 target/openrisc/fpu_helper.c |   68 +-
 target/openrisc/gdbstub.c|   17 +-
 target/openrisc/helper.h |   33 +-
 target/openrisc/int_helper.c |   61 --
 target/openrisc/interrupt.c  |   14 +-
 target/openrisc/interrupt_helper.c   |4 +-
 target/openrisc/machine.c|   62 +-
 target/openrisc/mmu.c|1 +
 target/openrisc/sys_helper.c |   62 +-
 target/openrisc/translate.c  | 1389 --
 tests/tcg/openrisc/Makefile  |4 +-
 25 files changed, 915 insertions(+), 1012 deletions(-)
 create mode 100644 default-configs/or1k-linux-user.mak
 create mode 100644 default-configs/or1k-softmmu.mak
 delete mode 100644 default-configs/or32-linux-user.mak
 delete mode 100644 default-configs/or32-softmmu.mak
 delete mode 100644 target/openrisc/int_helper.c



[Qemu-devel] [PULL 08/24] target/openrisc: Rationalize immediate extraction

2017-02-13 Thread Richard Henderson
The architecture manual is consistent in using "I" for signed
fields and "K" for unsigned fields.  Mirror that.

Reviewed-by: Bastian Koppelmann 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 98 ++---
 1 file changed, 40 insertions(+), 58 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index ac0c409..d999d2f 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -129,23 +129,6 @@ static inline void wb_SR_F(void)
 gen_set_label(label);
 }
 
-static inline int zero_extend(unsigned int val, int width)
-{
-return val & ((1 << width) - 1);
-}
-
-static inline int sign_extend(unsigned int val, int width)
-{
-int sval;
-
-/* LSL */
-val <<= TARGET_LONG_BITS - width;
-sval = val;
-/* ASR.  */
-sval >>= TARGET_LONG_BITS - width;
-return sval;
-}
-
 static inline void gen_sync_flags(DisasContext *dc)
 {
 /* Sync the tb dependent flag between translate and runtime.  */
@@ -221,11 +204,9 @@ static void gen_goto_tb(DisasContext *dc, int n, target_ulong dest)
 }
 }
 
-static void gen_jump(DisasContext *dc, uint32_t imm, uint32_t reg, uint32_t op0)
+static void gen_jump(DisasContext *dc, int32_t n26, uint32_t reg, uint32_t op0)
 {
-target_ulong tmp_pc;
-/* N26, 26bits imm */
-tmp_pc = sign_extend((imm<<2), 26) + dc->pc;
+target_ulong tmp_pc = dc->pc + n26 * 4;
 
 switch (op0) {
 case 0x00: /* l.j */
@@ -760,8 +741,8 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 {
 uint32_t op0, op1;
 uint32_t ra, rb, rd;
-uint32_t L6, K5;
-uint32_t I16, I5, I11, N26, tmp;
+uint32_t L6, K5, K16, K5_11;
+int32_t I16, I5_11, N26;
 TCGMemOp mop;
 
 op0 = extract32(insn, 26, 6);
@@ -771,11 +752,11 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 rd = extract32(insn, 21, 5);
 L6 = extract32(insn, 5, 6);
 K5 = extract32(insn, 0, 5);
-I16 = extract32(insn, 0, 16);
-I5 = extract32(insn, 21, 5);
-I11 = extract32(insn, 0, 11);
-N26 = extract32(insn, 0, 26);
-tmp = (I5<<11) + I11;
+K16 = extract32(insn, 0, 16);
+I16 = (int16_t)K16;
+N26 = sextract32(insn, 0, 26);
+K5_11 = (extract32(insn, 21, 5) << 11) | extract32(insn, 0, 11);
+I5_11 = (int16_t)K5_11;
 
 switch (op0) {
 case 0x00:/* l.j */
@@ -821,12 +802,12 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 break;
 
 case 0x13:/* l.maci */
-LOG_DIS("l.maci %d, r%d, %d\n", I5, ra, I11);
+LOG_DIS("l.maci r%d, %d\n", ra, I16);
 {
 TCGv_i64 t1 = tcg_temp_new_i64();
 TCGv_i64 t2 = tcg_temp_new_i64();
 TCGv_i32 dst = tcg_temp_new_i32();
-TCGv ttmp = tcg_const_tl(tmp);
+TCGv ttmp = tcg_const_tl(I16);
 tcg_gen_mul_tl(dst, cpu_R[ra], ttmp);
 tcg_gen_ext_i32_i64(t1, dst);
 tcg_gen_concat_i32_i64(t2, maclo, machi);
@@ -936,7 +917,7 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 do_load:
 {
 TCGv t0 = tcg_temp_new();
-tcg_gen_addi_tl(t0, cpu_R[ra], sign_extend(I16, 16));
+tcg_gen_addi_tl(t0, cpu_R[ra], I16);
 tcg_gen_qemu_ld_tl(cpu_R[rd], t0, dc->mem_idx, mop);
 tcg_temp_free(t0);
 }
@@ -954,7 +935,7 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 TCGv_i32 res = tcg_temp_local_new_i32();
 TCGv_i32 sr_ove = tcg_temp_local_new_i32();
 tcg_gen_extu_i32_i64(ta, cpu_R[ra]);
-tcg_gen_addi_i64(td, ta, sign_extend(I16, 16));
+tcg_gen_addi_i64(td, ta, I16);
 tcg_gen_extrl_i64_i32(res, td);
 tcg_gen_shri_i64(td, td, 32);
 tcg_gen_andi_i64(td, td, 0x3);
@@ -989,7 +970,7 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 tcg_gen_andi_i32(sr_cy, cpu_sr, SR_CY);
 tcg_gen_shri_i32(sr_cy, sr_cy, 10);
 tcg_gen_extu_i32_i64(tcy, sr_cy);
-tcg_gen_addi_i64(td, ta, sign_extend(I16, 16));
+tcg_gen_addi_i64(td, ta, I16);
 tcg_gen_add_i64(td, td, tcy);
 tcg_gen_extrl_i64_i32(res, td);
 tcg_gen_shri_i64(td, td, 32);
@@ -1013,18 +994,18 @@ static void dec_misc(DisasContext *dc, uint32_t insn)
 break;
 
 case 0x29:/* l.andi */
-LOG_DIS("l.andi r%d, r%d, %d\n", rd, ra, I16);
-tcg_gen_andi_tl(cpu_R[rd], cpu_R[ra], zero_extend(I16, 16));
+LOG_DIS("l.andi r%d, r%d, %d\n", rd, ra, K16);
+tcg_gen_andi_tl(cpu_R[rd], cpu_R[ra], K16);
 break;
 
 case 0x2a:/* l.ori */
-LOG_DIS("l.ori r%d, r%d, %d\n", rd, ra, I16);
-tcg_gen_ori_tl(cpu_R[rd], cpu_R[ra], zero_extend(I16, 16));
+LOG_DIS("l.ori r%d, r%d, %d\n", rd, ra, K16);
+
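
The signed ("I") versus unsigned ("K") distinction that the patch mirrors can be sketched in plain C. This is a minimal illustration, not QEMU's code: field_u and field_s are hypothetical stand-ins for what extract32 and sextract32 do in the diff, and split_k16 is a hypothetical helper that reassembles the split 5+11-bit immediate the way K5_11 is built above.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned ("K") field: `length` bits of `insn` starting at bit `start`. */
uint32_t field_u(uint32_t insn, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (insn >> start) & (~0U >> (32 - length));
}

/* Signed ("I") field: the same bits, sign-extended from the field's top bit. */
int32_t field_s(uint32_t insn, int start, int length)
{
    uint32_t v = field_u(insn, start, length);
    uint32_t sign = 1U << (length - 1);
    return (int32_t)((v ^ sign) - sign);
}

/* Split 16-bit immediate: bits 25..21 are the high 5 bits, bits 10..0 the low 11. */
uint32_t split_k16(uint32_t insn)
{
    return (field_u(insn, 21, 5) << 11) | field_u(insn, 0, 11);
}
```

With the sign handled once at extraction time, callers can pass the value straight to an add-immediate, which is what lets the patch drop the sign_extend() and zero_extend() helpers.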

[Qemu-devel] [PULL 04/24] linux-user: Honor CLONE_SETTLS for openrisc

2017-02-13 Thread Richard Henderson
Threads work much better when you set the TLS register.
This was fixed in the upstream kernel for Linux 4.9.

Signed-off-by: Richard Henderson 
---
 linux-user/openrisc/target_cpu.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/linux-user/openrisc/target_cpu.h b/linux-user/openrisc/target_cpu.h
index a21ed1a..f283d96 100644
--- a/linux-user/openrisc/target_cpu.h
+++ b/linux-user/openrisc/target_cpu.h
@@ -30,9 +30,7 @@ static inline void cpu_clone_regs(CPUOpenRISCState *env, target_ulong newsp)
 
 static inline void cpu_set_tls(CPUOpenRISCState *env, target_ulong newtls)
 {
-/* Linux kernel 3.10 does not pay any attention to CLONE_SETTLS
- * in copy_thread(), so QEMU need not do so either.
- */
+env->gpr[10] = newtls;
 }
 
 #endif
-- 
2.9.3
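
A small sketch of the behaviour the patch enables: when a clone request carries CLONE_SETTLS, the child's TLS register (gpr[10] in the diff) receives the new TLS pointer. The guest_regs type and both function names are hypothetical stand-ins, and the 32-bit register width is an assumption; only the flag value and the register number come from Linux and from the patch respectively.

```c
#include <stdint.h>

#define CLONE_SETTLS_FLAG 0x00080000u   /* value of Linux's CLONE_SETTLS */
#define TLS_GPR 10                      /* register written by the patch */

typedef struct {
    uint32_t gpr[32];                   /* assumption: 32-bit guest registers */
} guest_regs;                           /* hypothetical stand-in for CPUOpenRISCState */

/* Mirrors the one-line body the patch gives cpu_set_tls(). */
void set_tls(guest_regs *env, uint32_t newtls)
{
    env->gpr[TLS_GPR] = newtls;
}

/* Copy the parent's registers, then honour CLONE_SETTLS if it was requested. */
void clone_regs_for_child(guest_regs *child, const guest_regs *parent,
                          unsigned long flags, uint32_t newtls)
{
    *child = *parent;
    if (flags & CLONE_SETTLS_FLAG) {
        set_tls(child, newtls);
    }
}
```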




[Qemu-devel] [PULL 03/24] linux-user: Fix openrisc cpu_loop

2017-02-13 Thread Richard Henderson
We need to handle EXCP_DEBUG and EXCP_INTERRUPT.
We need to send signals to the guest using queue_signal.

Signed-off-by: Richard Henderson 
---
 linux-user/main.c | 95 ---
 1 file changed, 41 insertions(+), 54 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index e588f58..001f71c 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2574,52 +2574,17 @@ kuser_fail:
 void cpu_loop(CPUOpenRISCState *env)
 {
 CPUState *cs = CPU(openrisc_env_get_cpu(env));
-int trapnr, gdbsig;
+int trapnr;
 abi_long ret;
+target_siginfo_t info;
 
 for (;;) {
 cpu_exec_start(cs);
 trapnr = cpu_exec(cs);
 cpu_exec_end(cs);
 process_queued_cpu_work(cs);
-gdbsig = 0;
 
 switch (trapnr) {
-case EXCP_RESET:
-qemu_log_mask(CPU_LOG_INT, "\nReset request, exit, pc is %#x\n", env->pc);
-exit(EXIT_FAILURE);
-break;
-case EXCP_BUSERR:
-qemu_log_mask(CPU_LOG_INT, "\nBus error, exit, pc is %#x\n", env->pc);
-gdbsig = TARGET_SIGBUS;
-break;
-case EXCP_DPF:
-case EXCP_IPF:
-cpu_dump_state(cs, stderr, fprintf, 0);
-gdbsig = TARGET_SIGSEGV;
-break;
-case EXCP_TICK:
-qemu_log_mask(CPU_LOG_INT, "\nTick time interrupt pc is %#x\n", env->pc);
-break;
-case EXCP_ALIGN:
-qemu_log_mask(CPU_LOG_INT, "\nAlignment pc is %#x\n", env->pc);
-gdbsig = TARGET_SIGBUS;
-break;
-case EXCP_ILLEGAL:
-qemu_log_mask(CPU_LOG_INT, "\nIllegal instructionpc is %#x\n", env->pc);
-gdbsig = TARGET_SIGILL;
-break;
-case EXCP_INT:
-qemu_log_mask(CPU_LOG_INT, "\nExternal interruptpc is %#x\n", env->pc);
-break;
-case EXCP_DTLBMISS:
-case EXCP_ITLBMISS:
-qemu_log_mask(CPU_LOG_INT, "\nTLB miss\n");
-break;
-case EXCP_RANGE:
-qemu_log_mask(CPU_LOG_INT, "\nRange\n");
-gdbsig = TARGET_SIGSEGV;
-break;
 case EXCP_SYSCALL:
 env->pc += 4;   /* 0xc00; */
 ret = do_syscall(env,
@@ -2636,32 +2601,54 @@ void cpu_loop(CPUOpenRISCState *env)
 env->gpr[11] = ret;
 }
 break;
+case EXCP_DPF:
+case EXCP_IPF:
+case EXCP_RANGE:
+info.si_signo = TARGET_SIGSEGV;
+info.si_errno = 0;
+info.si_code = TARGET_SEGV_MAPERR;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
+case EXCP_ALIGN:
+info.si_signo = TARGET_SIGBUS;
+info.si_errno = 0;
+info.si_code = TARGET_BUS_ADRALN;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
+case EXCP_ILLEGAL:
+info.si_signo = TARGET_SIGILL;
+info.si_errno = 0;
+info.si_code = TARGET_ILL_ILLOPC;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 case EXCP_FPE:
-qemu_log_mask(CPU_LOG_INT, "\nFloating point error\n");
+info.si_signo = TARGET_SIGFPE;
+info.si_errno = 0;
+info.si_code = 0;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
-case EXCP_TRAP:
-qemu_log_mask(CPU_LOG_INT, "\nTrap\n");
-gdbsig = TARGET_SIGTRAP;
+case EXCP_INTERRUPT:
+/* We processed the pending cpu work above.  */
 break;
-case EXCP_NR:
-qemu_log_mask(CPU_LOG_INT, "\nNR\n");
+case EXCP_DEBUG:
+trapnr = gdb_handlesig(cs, TARGET_SIGTRAP);
+if (trapnr) {
+info.si_signo = trapnr;
+info.si_errno = 0;
+info.si_code = TARGET_TRAP_BRKPT;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+}
 break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
 break;
 default:
-EXCP_DUMP(env, "\nqemu: unhandled CPU exception %#x - aborting\n",
- trapnr);
-gdbsig = TARGET_SIGILL;
-break;
-}
-if (gdbsig) {
-gdb_handlesig(cs, gdbsig);
-if (gdbsig != TARGET_SIGTRAP) {
-exit(EXIT_FAILURE);
-}
+g_assert_not_reached();
 }
-
 process_pending_signals(env);
 }
 }
-- 
2.9.3
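
The new cpu_loop cases all follow one pattern: pick a signal number and code for the exception, record the faulting pc, and queue the signal for the guest. A table-free sketch of that mapping follows. Every enum, the fault_info struct, and fault_info_for are hypothetical names chosen so the example stands alone; they do not carry QEMU's or Linux's numeric values.

```c
#include <stdint.h>

enum guest_excp { GX_DPF, GX_IPF, GX_RANGE, GX_ALIGN, GX_ILLEGAL, GX_FPE, GX_OTHER };
enum guest_sig  { GS_NONE, GS_SEGV, GS_BUS, GS_ILL, GS_FPE };
enum guest_code { GC_NONE, GC_SEGV_MAPERR, GC_BUS_ADRALN, GC_ILL_ILLOPC };

struct fault_info {
    int signo;      /* which signal the guest receives */
    int err;        /* always 0 in the patch */
    int code;       /* si_code-style detail */
    uint32_t addr;  /* the patch reports env->pc here */
};

/* Fill `info` for a faulting exception; returns 1 if a signal should be queued. */
int fault_info_for(int excp, uint32_t pc, struct fault_info *info)
{
    info->err = 0;
    info->addr = pc;
    switch (excp) {
    case GX_DPF:
    case GX_IPF:
    case GX_RANGE:
        info->signo = GS_SEGV;
        info->code = GC_SEGV_MAPERR;
        return 1;
    case GX_ALIGN:
        info->signo = GS_BUS;
        info->code = GC_BUS_ADRALN;
        return 1;
    case GX_ILLEGAL:
        info->signo = GS_ILL;
        info->code = GC_ILL_ILLOPC;
        return 1;
    case GX_FPE:
        info->signo = GS_FPE;
        info->code = GC_NONE;
        return 1;
    default:
        info->signo = GS_NONE;
        info->code = GC_NONE;
        return 0;   /* not a fault that maps to a guest signal */
    }
}
```

Compared with the removed code, which logged the event and exited via gdb_handlesig, this shape lets the guest's own signal handlers see the fault.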




[Qemu-devel] [Bug 1490611] Re: Using qemu >=2.2.1 to convert raw->VHD (fixed) adds extra padding to the result file, which Microsoft Azure rejects as invalid

2017-02-13 Thread Alexandre
Is it correct to assume that the current 16.04.2 Xenial, with the QEMU 2.5
package, doesn't have this patch and can't generate MiB-aligned Azure
images with qemu-img (no force_size support)?

Is there a recommended PPA backport of QEMU from 16.10 (2.6+)?

cheers.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1490611

Title:
  Using qemu >=2.2.1 to convert raw->VHD (fixed) adds extra padding to
  the result file, which Microsoft Azure rejects as invalid

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  In Progress

Bug description:
  [Impact]

   * Starting with a raw disk image, using "qemu-img convert" to convert
  from raw to VHD results in the output VHD file's virtual size being
  aligned to the nearest 516096 bytes (16 heads x 63 sectors per head x
  512 bytes per sector), instead of preserving the input file's size as
  the output VHD's virtual disk size.

   * Microsoft Azure requires that disk images (VHDs) submitted for
  upload have virtual sizes aligned to a megabyte boundary. (Ex. 4096MB,
  4097MB, 4098MB, etc. are OK, 4096.5MB is rejected with an error.) This
  is reflected in Microsoft's documentation: https://azure.microsoft.com
  /en-us/documentation/articles/virtual-machines-linux-create-upload-
  vhd-generic/

   * The fix for this bug is a backport from upstream.
  http://git.qemu.org/?p=qemu.git;a=commitdiff;h=fb9245c2610932d33ce14

  [Test Case]

   * This is reproducible with the following set of commands (including
  the Azure command line tools from https://github.com/Azure/azure-
  xplat-cli). For the following example, I used qemu version 2.2.1:

  $ dd if=/dev/zero of=source-disk.img bs=1M count=4096

  $ stat source-disk.img
    File: ‘source-disk.img’
    Size: 4294967296  Blocks: 798656 IO Block: 4096   regular file
  Device: fc01h/64513d    Inode: 13247963    Links: 1
  Access: (0644/-rw-r--r--)  Uid: ( 1000/  smkent)   Gid: ( 1000/  smkent)
  Access: 2015-08-18 09:48:02.613988480 -0700
  Modify: 2015-08-18 09:48:02.825985646 -0700
  Change: 2015-08-18 09:48:02.825985646 -0700
   Birth: -

  $ qemu-img convert -f raw -o subformat=fixed -O vpc source-disk.img
  dest-disk.vhd

  $ stat dest-disk.vhd
    File: ‘dest-disk.vhd’
    Size: 4296499712  Blocks: 535216 IO Block: 4096   regular file
  Device: fc01h/64513d    Inode: 13247964    Links: 1
  Access: (0644/-rw-r--r--)  Uid: ( 1000/  smkent)   Gid: ( 1000/  smkent)
  Access: 2015-08-18 09:50:22.252077624 -0700
  Modify: 2015-08-18 09:49:24.424868868 -0700
  Change: 2015-08-18 09:49:24.424868868 -0700
   Birth: -

  $ azure vm image create testimage1 dest-disk.vhd -o linux -l "West US"
  info:Executing command vm image create
  + Retrieving storage accounts
  info:VHD size : 4097 MB
  info:Uploading 4195800.5 KB
  Requested:100.0% Completed:100.0% Running:   0 Time: 1m 0s Speed:  6744 KB/s
  info:https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd was uploaded successfully
  error:   The VHD https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd has an unsupported virtual size of 4296499200 bytes.  The size must be a whole number (in MBs).
  info:Error information has been recorded to /home/smkent/.azure/azure.err
  error:   vm image create command failed

   * A fixed qemu-img will not result in an error during azure image
  creation. It will require passing -o force_size, which will leverage
  the backported functionality.

  [Regression Potential]

   * The upstream fix introduces a qemu-img option (-o force_size) which
  is unset by default. The regression potential is very low, as a
  result.

  ...

  I also ran the above commands using qemu 2.4.0, which resulted in the
  same error as the conversion behavior is the same.

  However, qemu 2.1.1 and earlier (including qemu 2.0.0 installed by
  Ubuntu 14.04) does not pad the virtual disk size during conversion.
  Using qemu-img convert from qemu versions <=2.1.1 results in a VHD
  that is exactly the size of the raw input file plus 512 bytes (for the
  VHD footer). Those qemu versions do not attempt to realign the disk.
  As a result, Azure accepts VHD files created using those versions of
  qemu-img convert for upload.

  Is there a reason why newer qemu realigns the converted VHD file? It
  would be useful if an option were added to disable this feature, as
  current versions of qemu cannot be used to create VHD files for Azure
  using Microsoft's official instructions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1490611/+subscriptions
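
The numbers in the report can be checked with a few lines of C. This sketch only verifies arithmetic quoted in the bug description (the 16 x 63 x 512 cylinder size, the 4096 MiB input, the 4296499200-byte virtual size, and the 512-byte footer); it does not reproduce qemu-img's geometry calculation, and the macro and function names are made up for the example.

```c
#include <stdint.h>

#define CHS_CYLINDER_BYTES (16ULL * 63ULL * 512ULL)   /* heads x sectors x bytes */
#define MIB_BYTES          (1024ULL * 1024ULL)
#define VHD_FOOTER_BYTES   512ULL

/* True when `size` is an exact multiple of `unit`. */
int is_whole_multiple(uint64_t size, uint64_t unit)
{
    return size % unit == 0;
}

/* Virtual disk size of a fixed VHD, given the size of the file on disk. */
uint64_t fixed_vhd_virtual_size(uint64_t file_size)
{
    return file_size - VHD_FOOTER_BYTES;
}
```

The raw input (4294967296 bytes) is a whole number of MiB, while the converted image's virtual size (4296499200 bytes) is a whole number of 516096-byte cylinders but not of MiB, which is the condition Azure rejects.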



[Qemu-devel] [Bug 1012023] Re: Windows 7 bluescreen STOP: 00000005D

2017-02-13 Thread Thomas Huth
Looks like this is a duplicate of
https://bugs.launchpad.net/qemu/+bug/921208 ... so closing this ticket
here.

** Changed in: qemu
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1012023

Title:
  Windows 7 bluescreen STOP: 00000005D

Status in QEMU:
  Invalid

Bug description:
  Hello, with installed windows, or with install cd I have a blue screen
  (crash) after the first windows logo, see the screenshot.

  Thanks for fixing it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1012023/+subscriptions



[Qemu-devel] [Bug 921208] Re: win7/x64 installer hangs on startup with 0x0000005d.

2017-02-13 Thread Thomas Huth
Bug has also been reported here:
https://bugs.launchpad.net/qemu/+bug/1012023

https://bugs.launchpad.net/bugs/921208

Title:
  win7/x64 installer hangs on startup with 0x0000005d.

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  Triaged

Bug description:
  hi,

  during booting win7/x64 installer i'm observing a bsod with 0x0000005d
  ( msdn: unsupported_processor ).

  used command line: qemu-system-x86_64 -m 2048 -hda w7-system.img
  -cdrom win7_x64.iso -boot d

  adding '-machine accel=kvm' instead of default tcg accel helps to
  boot.

  
  installed software:

  qemu-1.0
  linux-3.2.1
  glibc-2.14.1
  gcc-4.6.2

  hw cpu:

  processor   : 0..7
  vendor_id   : GenuineIntel
  cpu family  : 6
  model   : 42
  model name  : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
  stepping: 7
  microcode   : 0x14
  cpu MHz : 1995.739
  cache size  : 6144 KB
  physical id : 0
  siblings: 8
  core id : 3
  cpu cores   : 4
  apicid  : 7
  initial apicid  : 7
  fpu : yes
  fpu_exception   : yes
  cpuid level : 13
  wp  : yes
  flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx 
lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
  bogomips: 3992.23
  clflush size: 64
  cache_alignment : 64
  address sizes   : 36 bits physical, 48 bits virtual

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/921208/+subscriptions



[Qemu-devel] [Bug 994412] Re: reverse vnc to unix domain sockets does not work

2017-02-13 Thread Thomas Huth
Looks like this should work nowadays (of course you need to start a
listening program first), so closing this bug ticket now.

** Changed in: qemu
   Status: New => Fix Released

https://bugs.launchpad.net/bugs/994412

Title:
  reverse vnc to unix domain sockets does not work

Status in QEMU:
  Fix Released

Bug description:
  I tried to connect to a unix domain socket, but failed.

  $ qemu -vnc unix:/tmp/my.sock,reverse
  connect(unix:/tmp/my.sock,reverse): No such file or directory

  I guess it is because unix_connect does not remove characters after
  first comma.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/994412/+subscriptions
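
The reporter's guess is that the socket path was passed to connect() with the ",reverse" option still attached. A minimal sketch of the missing step, cutting the path at the first comma, is below. The function name unix_path_from_spec is hypothetical and this is not QEMU's parser; it only illustrates the idea under the assumption that options follow the first comma.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the part of `spec` before the first comma into `out`.
 * Returns the path length, or -1 if `out` is too small.
 */
int unix_path_from_spec(const char *spec, char *out, size_t outlen)
{
    size_t n = strcspn(spec, ",");   /* length up to the first comma or end */

    if (n + 1 > outlen) {
        return -1;
    }
    memcpy(out, spec, n);
    out[n] = '\0';
    return (int)n;
}
```

For the argument from the report, "/tmp/my.sock,reverse", this yields "/tmp/my.sock", which is the path a listening program would have created.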



[Qemu-devel] [Bug 966316] Re: Can't load Android VBOX image or even linux test image as well

2017-02-13 Thread Thomas Huth
Triaging old bug tickets ... Can you still reproduce this problem with
the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

https://bugs.launchpad.net/bugs/966316

Title:
  Can't load Android VBOX image or even linux test image as well

Status in QEMU:
  Incomplete

Bug description:
  Can't load Android X86 ICS 4.0 VBOX image.
  It worked with the previous version, before the /qemu/hw/pc_sysfw.c file was added (tested with version 1.0).

  x86_64-softmmu# ./qemu-system-x86_64 ~/kvm-test-image/x86-linux-0.2.img
  qemu: PC system firmware (pflash) must be a multiple of 0x1000

  On the QEMU website (http://wiki.qemu.org/Testing) there is a Linux test image,
  but the new version can't load that image either, because of the error above.
  linux-0.2.img.bz2 (8 MB)  Small Linux disk image containing a 2.6.20 Linux kernel, X11 and various utilities to test QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/966316/+subscriptions



Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-13 Thread Stefano Stabellini
On Wed, 8 Feb 2017, Eric Blake wrote:
> On 02/07/2017 11:24 PM, Zhang Chen wrote:
> > We can call this qmp command to do checkpoint outside of qemu.
> > Xen colo will need this function.
> > 
> > Signed-off-by: Zhang Chen 
> > Signed-off-by: Wen Congyang 
> > ---
> >  migration/colo.c | 17 
> >  qapi-schema.json | 60 
> > 
> >  2 files changed, 77 insertions(+)
> > 
> 
> Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?



[Qemu-devel] [Bug 1004050] Re: qemu-system-ppc64 by default has non-working keyboard

2017-02-13 Thread Thomas Huth
AFAIK an OHCI driver was added to OpenBIOS in 2014, so marking this
bug as fixed now. If you still have issues with OpenBIOS, please report
them to the OpenBIOS project instead of the QEMU bug tracker, thanks!

** Changed in: qemu
   Status: New => Fix Released

https://bugs.launchpad.net/bugs/1004050

Title:
  qemu-system-ppc64 by default has non-working keyboard

Status in QEMU:
  Fix Released

Bug description:
  Compile qemu from git and do:

./ppc64-softmmu/qemu-system-ppc64

  (ie. no parameters).  It boots to an OpenBIOS prompt.  However the
  keyboard doesn't work.  After ~10 keypresses, qemu just says:

  usb-kbd: warning: key event queue full
  usb-kbd: warning: key event queue full
  usb-kbd: warning: key event queue full
  usb-kbd: warning: key event queue full

  There is no indication inside the guest that OpenBIOS is seeing
  keyboard events.

  Also there's no indication of what type of keyboard devices are
  available, nor what we should use.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1004050/+subscriptions



[Qemu-devel] [Bug 1018530] Re: No write access in a 9p/virtfs shared folder

2017-02-13 Thread Thomas Huth
Sounds like this was an Ubuntu- or libvirt-specific bug ... so closing
this in the upstream QEMU bug tracker.

** Changed in: qemu
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1018530

Title:
  No write access in a 9p/virtfs shared folder

Status in QEMU:
  Invalid
Status in qemu-kvm package in Ubuntu:
  Fix Released

Bug description:
  Ubuntu version:  Ubuntu 12.04 LTS
  Kernel: 3.2.0-25-generic
  Version of qemu-kvm: 1.0+noroms-0ubuntu13

  I have created a shared folder for a virtual machine which is
  managed by libvirt.

  
  
  
  
  

  I mounted it in the virtual machine with this command:  mount -t 9p -o trans=virtio,version=9p2000.L data /data
  The filesystem permissions of all files and folders in the shared folder are set to 777. I expected to have full permissions in the virtual machine as well.

  Regardless of the permissions on the filesystem, I cannot write or create files and folders in the virtual machine. The original filesystem (/storage) is XFS.
  In another shared folder (similar config in libvirt), which is originally NTFS, I have no problems.

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: qemu-kvm 1.0+noroms-0ubuntu13
  ProcVersionSignature: Ubuntu 3.2.0-25.40-generic 3.2.18
  Uname: Linux 3.2.0-25-generic x86_64
  ApportVersion: 2.0.1-0ubuntu8
  Architecture: amd64
  Date: Wed Jun 27 20:15:20 2012
  InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Beta amd64 
(20120409)
  MachineType: To be filled by O.E.M. To be filled by O.E.M.
  ProcEnviron:
   TERM=xterm
   LANG=de_DE.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-25-generic 
root=/dev/mapper/system-root ro
  SourcePackage: qemu-kvm
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/18/2012
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 1208
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: M5A99X EVO
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev 1.xx
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 3
  dmi.chassis.vendor: To Be Filled By O.E.M.
  dmi.chassis.version: To Be Filled By O.E.M.
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvr1208:bd04/18/2012:svnTobefilledbyO.E.M.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnASUSTeKCOMPUTERINC.:rnM5A99XEVO:rvrRev1.xx:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
  dmi.product.name: To be filled by O.E.M.
  dmi.product.version: To be filled by O.E.M.
  dmi.sys.vendor: To be filled by O.E.M.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1018530/+subscriptions



Re: [Qemu-devel] [PATCH v12 12/24] tcg: handle EXCP_ATOMIC exception for system emulation

2017-02-13 Thread Richard Henderson

On 02/14/2017 06:33 AM, Pranith Kumar wrote:
> On Mon, Feb 13, 2017 at 2:19 PM, Richard Henderson  wrote:
>> On 02/13/2017 11:10 PM, Alex Bennée wrote:
>>>
>>> @@ -239,9 +240,16 @@ static void cpu_exec_step(CPUState *cpu)
>>>   1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
>>>  tb->orig_tb = NULL;
>>>  tb_unlock();
>>> -/* execute the generated code */
>>> -trace_exec_tb_nocache(tb, pc);
>>> -cpu_tb_exec(cpu, tb);
>>> +
>>> +cc->cpu_exec_enter(cpu);
>>> +
>>> +if (sigsetjmp(cpu->jmp_env, 0) == 0) {
>>> +/* execute the generated code */
>>> +trace_exec_tb_nocache(tb, pc);
>>> +cpu_tb_exec(cpu, tb);
>>> +}
>>
>> I don't understand this, since cpu_tb_exec has its own sigsetjmp.  Where is
>> the exception supposed to come from that escapes?
>
> cpu_exec() has its own sigsetjmp, not cpu_tb_exec(). The exception is
> the debug exception from the generated code. Without this new
> sigsetjmp, it'll jump to cpu_exec() instead of coming back here.

Bah.  Sorry, ENOCOFFEE.

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v3 1/4] sd: sdhci: check transfer mode register in multi block transfer

2017-02-13 Thread Alistair Francis
On Sat, Feb 11, 2017 at 7:06 AM, P J P  wrote:
> From: Prasad J Pandit 
>
> In the SDHCI protocol, the transfer mode register value
> is used during multi block transfer to check if block count
> register is enabled and should be updated. Transfer mode
> register could be set such that, block count register would
> not be updated, thus leading to an infinite loop. Add check
> to avoid it.
>
> Reported-by: Wjjzhang 
> Reported-by: Jiang Xin 
> Signed-off-by: Prasad J Pandit 

Reviewed-by: Alistair Francis 

Thanks,

Alistair

> ---
>  hw/sd/sdhci.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> Update: use qemu_log_mask(LOG_UNIMP, ...)
>   -> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg02354.html
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index 5bd5ab6..a9c744b 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -486,6 +486,11 @@ static void sdhci_sdma_transfer_multi_blocks(SDHCIState 
> *s)
>  uint32_t boundary_chk = 1 << (((s->blksize & 0xf000) >> 12) + 12);
>  uint32_t boundary_count = boundary_chk - (s->sdmasysad % boundary_chk);
>
> +if (!(s->trnmod & SDHC_TRNS_BLK_CNT_EN) || !s->blkcnt) {
> +qemu_log_mask(LOG_UNIMP, "infinite transfer is not supported\n");
> +return;
> +}
> +
>  /* XXX: Some sd/mmc drivers (for example, u-boot-slp) do not account for
>   * possible stop at page boundary if initial address is not page aligned,
>   * allow them to work properly */
> @@ -797,11 +802,6 @@ static void sdhci_data_transfer(void *opaque)
>  if (s->trnmod & SDHC_TRNS_DMA) {
>  switch (SDHC_DMA_TYPE(s->hostctl)) {
>  case SDHC_CTRL_SDMA:
> -if ((s->trnmod & SDHC_TRNS_MULTI) &&
> -(!(s->trnmod & SDHC_TRNS_BLK_CNT_EN) || s->blkcnt == 0)) 
> {
> -break;
> -}
> -
>  if ((s->blkcnt == 1) || !(s->trnmod & SDHC_TRNS_MULTI)) {
>  sdhci_sdma_transfer_single_block(s);
>  } else {
> --
> 2.9.3
>
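
The hazard described in the commit message can be sketched as a loop whose counter only counts down when the block-count-enable bit is set. This is a simplified model, not the QEMU function: struct sdhci_sketch and multi_block_transfer are hypothetical, and the TRNS_BLK_CNT_EN value is an assumption about SDHC_TRNS_BLK_CNT_EN that should be checked against the QEMU headers.

```c
#include <stdint.h>

#define TRNS_BLK_CNT_EN 0x0002   /* assumed value of SDHC_TRNS_BLK_CNT_EN */

struct sdhci_sketch {
    uint16_t trnmod;   /* transfer mode register */
    uint16_t blkcnt;   /* block count register */
};

/* Move blocks until blkcnt reaches zero; returns how many were moved. */
unsigned multi_block_transfer(struct sdhci_sketch *s)
{
    unsigned moved = 0;

    /* The guard added by the patch: refuse a transfer that cannot finish. */
    if (!(s->trnmod & TRNS_BLK_CNT_EN) || !s->blkcnt) {
        return 0;
    }

    while (s->blkcnt) {
        moved++;                        /* stands in for copying one block */
        if (s->trnmod & TRNS_BLK_CNT_EN) {
            s->blkcnt--;                /* without the guard, a clear bit means
                                           this never runs and the loop spins */
        }
    }
    return moved;
}
```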



Re: [Qemu-devel] [PATCH v3 3/4] sd: sdhci: conditionally invoke multi block transfer

2017-02-13 Thread Alistair Francis
On Sat, Feb 11, 2017 at 7:07 AM, P J P  wrote:
> From: Prasad J Pandit 
>
> In sdhci_write invoke multi block transfer if it is enabled
> in the transfer mode register 's->trnmod'.
>
> Signed-off-by: Prasad J Pandit 

Reviewed-by: Alistair Francis 

Thanks,

Alistair

> ---
>  hw/sd/sdhci.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> Update: test if (s->trnmod & SDHC_TRNS_MULTI) is true
>   -> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg02357.html
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index 0307b8c..99fa0c7 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -1022,7 +1022,11 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
> unsigned size)
>  /* Writing to last byte of sdmasysad might trigger transfer */
>  if (!(mask & 0xFF000000) && TRANSFERRING_DATA(s->prnsts) && s->blkcnt &&
>  s->blksize && SDHC_DMA_TYPE(s->hostctl) == SDHC_CTRL_SDMA) {
> -sdhci_sdma_transfer_multi_blocks(s);
> +if (s->trnmod & SDHC_TRNS_MULTI) {
> +sdhci_sdma_transfer_multi_blocks(s);
> +} else {
> +sdhci_sdma_transfer_single_block(s);
> +}
>  }
>  break;
>  case SDHC_BLKSIZE:
> --
> 2.9.3
>
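
The dispatch added to sdhci_write reduces to one test on the transfer mode register. A tiny sketch, with a hypothetical enum and function name and an assumed value for the multi-block bit (check SDHC_TRNS_MULTI in the QEMU headers):

```c
#include <stdint.h>

#define TRNS_MULTI 0x0020   /* assumed value of SDHC_TRNS_MULTI */

enum sdma_kind { SDMA_SINGLE_BLOCK, SDMA_MULTI_BLOCK };

/* Choose the SDMA transfer routine from the transfer mode register. */
enum sdma_kind pick_sdma_transfer(uint16_t trnmod)
{
    return (trnmod & TRNS_MULTI) ? SDMA_MULTI_BLOCK : SDMA_SINGLE_BLOCK;
}
```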



Re: [Qemu-devel] [PATCH v3 2/4] sd: sdhci: mask transfer mode register value

2017-02-13 Thread Alistair Francis
On Sat, Feb 11, 2017 at 7:06 AM, P J P  wrote:
> From: Prasad J Pandit 
>
> In SDHCI protocol, the transfer mode register is defined
> to be of 6 bits. Mask its value with '0x0037' so that an
> invalid value couldn't be assigned.
>
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/sd/sdhci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Update: mask s->trnmod register value
>   -> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg02354.html
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index a9c744b..0307b8c 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -1050,7 +1050,7 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
> unsigned size)
>  if (!(s->capareg & SDHC_CAN_DO_DMA)) {
>  value &= ~SDHC_TRNS_DMA;
>  }
> -MASKED_WRITE(s->trnmod, mask, value);
> +MASKED_WRITE(s->trnmod, mask, value & 0x0037);

This looks good.

Can you use a macro for the value so then it is explained and easier
to change in the future?

Once you have done that:

Reviewed-by: Alistair Francis 

Thanks,

Alistair

>  MASKED_WRITE(s->cmdreg, mask >> 16, value >> 16);
>
>  /* Writing to the upper byte of CMDREG triggers SD command 
> generation */
> --
> 2.9.3
>
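
One way to act on the request for a macro is to build the mask from named bits, so that 0x0037 is explained by its parts. The sketch below is only a suggestion: TRNMOD_MASK and trnmod_sanitize are hypothetical names, and the individual bit values are assumptions that should be checked against QEMU's SDHC_TRNS_* definitions.

```c
#include <stdint.h>

/* Assumed transfer mode bits (compare with the SDHC_TRNS_* macros). */
#define TRNS_DMA        0x0001
#define TRNS_BLK_CNT_EN 0x0002
#define TRNS_ACMD12     0x0004
#define TRNS_READ       0x0010
#define TRNS_MULTI      0x0020

/* Hypothetical name for the 0x0037 literal used in the patch. */
#define TRNMOD_MASK (TRNS_DMA | TRNS_BLK_CNT_EN | TRNS_ACMD12 | \
                     TRNS_READ | TRNS_MULTI)

/* Drop any bits the guest sets outside the defined transfer mode fields. */
uint16_t trnmod_sanitize(uint16_t value)
{
    return value & TRNMOD_MASK;
}
```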



[Qemu-devel] [Bug 821078] Re: virtio-serial-bus: Unexpected port id

2017-02-13 Thread Thomas Huth
Triaging old bug tickets ... Can you still reproduce this problem with
the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

https://bugs.launchpad.net/bugs/821078

Title:
  virtio-serial-bus: Unexpected port id

Status in QEMU:
  Incomplete

Bug description:
  With qemu-kvm-0.15.0-rc1 virtio-serial-bus reports an error, and windows 
vdagent can not start.  qemu-0.15.0-rc1 behaves as expected, ie vdagent runs in 
the guest, mouse passes seamlessly between spicec and host and copy/paste works 
between guest and host.
  qemu-kvm has been configured with
  ./configure --target-list=x86_64-softmmu --disable-curses  --disable-curl 
--audio-drv-list=alsa --audio-card-list=sb16,ac97,hda --enable-vnc-thread 
--disable-bluez --enable-vhost-net --enable-spice
  and is started with
  qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid 
-drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,aio=native -m 1536 -name 
WinXP -net nic,model -net user -localtime -usb -vga qxl -device 
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -chardev 
spicevmc,name=vdagent,id=vdagent -device 
virtserialport,nr=1,bus=virtio-serial0.0,chardev=vdagent,name=com.redhat.spice.0
 -spice port=1234,disable-ticketing -monitor stdio

  I've also tried starting qemu like
  qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid 
-drive file=/home/rick/qemu/hds/wxp.raw,if=virtio -m 768 -name WinXP -net 
nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial 
-chardev spicevmc,name=vdagent,id=vdagent -device 
virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice 
port=1234,disable-ticketing -monitor stdio
  and observed the same results.

  the host runs 2.6.39.4 vanilla kernel.  the guest uses the most recent
  virtio-serial, vga-qxl and vdagent from spice-space.org

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/821078/+subscriptions



[Qemu-devel] [Bug 988909] Re: Assert failed in arp_table.c

2017-02-13 Thread Thomas Huth
Triaging old bug tickets ... Can you still reproduce this problem with
the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

https://bugs.launchpad.net/bugs/988909

Title:
  Assert failed in arp_table.c

Status in QEMU:
  Incomplete

Bug description:
  
  With latest git (8001954) when running:

  qemu-system-64 -hda $VDISK -kernel arch/x86/boot/bzImage \
  -append "ro root=/dev/hda1 console=ttyS0 init=/bin/systemd" \
  -curses \
  -net nic  -smp 3 -m 312 $@

  I'm getting this:

   qemu-system-x86_64: slirp/arp_table.c:75: arp_table_search: Assertion
  `(ip_addr & (__extension__ ({ register unsigned int __v, __x = (~(0xf
  << 28)); if (__builtin_constant_p (__x)) __v = ((((__x) & 0xff000000)
  >> 24) | (((__x) & 0x00ff0000) >> 8) | (((__x) & 0x0000ff00) << 8) |
  (((__x) & 0x000000ff) << 24)); else __asm__ ("bswap %0" : "=r" (__v) :
  "0" (__x)); __v; }))) != 0' failed.

  Bug #824650 seems to be related to this one, but it is not. Fix for that one 
is already upstream. 
  I can help on testing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/988909/+subscriptions
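
The long assertion text is what the preprocessor makes of a byte-swap macro; read back, it appears to be a check of the form (ip_addr & htonl(~(0xf << 28))) != 0. The sketch below restates that condition so it can be tried outside QEMU. The function name is hypothetical, and treating the expanded macro as htonl() is an inference from the bswap expansion, not something the report states.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the check from the assertion would pass for an address in
 * network byte order: at least one of the low 28 bits (host order) is set.
 */
bool arp_ip_passes_check(uint32_t ip_addr_be)
{
    return (ip_addr_be & htonl(~(0xfU << 28))) != 0;
}
```

An address of 0.0.0.0 fails the check, which would fit a guest sending packets before it has an address configured.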



Re: [Qemu-devel] [PATCH v12 12/24] tcg: handle EXCP_ATOMIC exception for system emulation

2017-02-13 Thread Pranith Kumar
On Mon, Feb 13, 2017 at 2:19 PM, Richard Henderson  wrote:
> On 02/13/2017 11:10 PM, Alex Bennée wrote:
>>
>> @@ -239,9 +240,16 @@ static void cpu_exec_step(CPUState *cpu)
>>   1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
>>  tb->orig_tb = NULL;
>>  tb_unlock();
>> -/* execute the generated code */
>> -trace_exec_tb_nocache(tb, pc);
>> -cpu_tb_exec(cpu, tb);
>> +
>> +cc->cpu_exec_enter(cpu);
>> +
>> +if (sigsetjmp(cpu->jmp_env, 0) == 0) {
>> +/* execute the generated code */
>> +trace_exec_tb_nocache(tb, pc);
>> +cpu_tb_exec(cpu, tb);
>> +}
>
>
> I don't understand this, since cpu_tb_exec has its own sigsetjmp.  Where is
> the exception supposed to come from that escapes?

cpu_exec() has its own sigsetjmp, not cpu_tb_exec(). The exception is
the debug exception from the generated code. Without this new
sigsetjmp, it'll jump to cpu_exec() instead of coming back here.
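
A small standalone illustration of the point above (not QEMU code): a
siglongjmp() lands wherever the jump buffer was most recently armed. All
names here are hypothetical stand-ins; jmp_env plays the role of
cpu->jmp_env, outer_loop() that of cpu_exec(), and the two step functions
that of cpu_exec_step() without and with its own sigsetjmp(). As a
simplification, the step runs nested inside the outer loop so that the outer
frame is still live when the jump happens.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-ins for illustration only, not QEMU code. */
static sigjmp_buf jmp_env;  /* plays the role of cpu->jmp_env */
static int landed_in;       /* 0 = nowhere, 1 = outer loop, 2 = step function */

/* Stands in for generated code raising a debug exception. */
static void raise_exception(void)
{
    siglongjmp(jmp_env, 1);
}

/* Step function that does not arm the jump buffer itself. */
static void step_without_setjmp(void)
{
    raise_exception();  /* control never returns here */
}

/* Step function that arms the jump buffer before running the code. */
static void step_with_setjmp(void)
{
    if (sigsetjmp(jmp_env, 0) == 0) {
        raise_exception();
    } else {
        landed_in = 2;  /* the exception came back to the step function */
    }
}

/* Outer loop that arms the jump buffer first, as cpu_exec() does. */
static void outer_loop(void (*step)(void))
{
    if (sigsetjmp(jmp_env, 0) == 0) {
        step();
    } else {
        landed_in = 1;  /* the exception escaped to the outer loop */
    }
}

/* Run one scenario and report where the exception landed. */
static int run(void (*step)(void))
{
    landed_in = 0;
    outer_loop(step);
    return landed_in;
}
```

Calling run(step_without_setjmp) reports the outer loop as the landing site,
while run(step_with_setjmp) reports the step function, which is the behaviour
the new sigsetjmp() in cpu_exec_step() is meant to provide.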

Thanks,
-- 
Pranith



[Qemu-devel] [Bug 989504] Re: assertion failed when attaching USB MSD device

2017-02-13 Thread Thomas Huth
Triaging old bug tickets ... Can you still reproduce this problem with
the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/989504

Title:
  assertion failed when attaching USB MSD device

Status in QEMU:
  Incomplete

Bug description:
  version: git rev be5ea8ed4481f0ffa4ea0f7ba13e465701536001
  commandline: qemu-system-i386 -usb -fda dosusb.img -drive 
if=none,id=usbstick,file=usb.img -device usb-storage,bus=usb.0,drive=usbstick 
-boot a -L d:\_programs\qemu

  ---
  Microsoft Visual C++ Runtime Library
  ---
  Assertion failed!

  Program: E:\qemu-system-i386.exe
  File: C:/msys/home/User/qemu/hw/usb/hcd-uhci.c
  Line: 968

  Expression: ret == TD_RESULT_ASYNC_START

  For information on how your program can cause an assertion
  failure, see the Visual C++ documentation on asserts

  (Press Retry to debug the application - JIT must be enabled)
  ---
  Abort   Retry   Ignore   
  ---

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/989504/+subscriptions



Re: [Qemu-devel] [PATCH] linux-user: fill target sigcontext struct accordingly

2017-02-13 Thread joserz
Up

On Wed, Feb 01, 2017 at 09:43:57PM +0100, Laurent Vivier wrote:
> Le 31/01/2017 à 23:05, Jose Ricardo Ziviani a écrit :
> > A segfault is noticed when an emulated program uses any of ucontext
> > regs fields. Risu detected this issue in the following operation when
> > handling a signal:
> >   ucontext_t *uc = (ucontext_t*)uc;
> >   uc->uc_mcontext.regs->nip += 4;
> > 
> > but this works fine:
> >   uc->uc_mcontext.gp_regs[PT_NIP] += 4;
> > 
> > This patch set regs to a valid location as well as other sigcontext
> > fields.
> > 
> > Signed-off-by: Jose Ricardo Ziviani 
> > ---
> >  linux-user/signal.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/linux-user/signal.c b/linux-user/signal.c
> > index 5064de0..8209539 100644
> > --- a/linux-user/signal.c
> > +++ b/linux-user/signal.c
> > @@ -5155,6 +5155,7 @@ static void setup_rt_frame(int sig, struct 
> > target_sigaction *ka,
> >  target_ulong rt_sf_addr, newsp = 0;
> >  int i, err = 0;
> >  #if defined(TARGET_PPC64)
> > +struct target_sigcontext *sc = 0;
> >  struct image_info *image = ((TaskState *)thread_cpu->opaque)->info;
> >  #endif
> >  
> > @@ -5183,6 +5184,10 @@ static void setup_rt_frame(int sig, struct 
> > target_sigaction *ka,
> >  #if defined(TARGET_PPC64)
> >  mctx = &rt_sf->uc.tuc_sigcontext.mcontext;
> >  trampptr = &rt_sf->trampoline[0];
> > +
> > +sc = &rt_sf->uc.tuc_sigcontext;
> > +__put_user(h2g(mctx), &sc->regs);
> > +__put_user(sig, &sc->signal);
> >  #else
> >  mctx = &rt_sf->uc.tuc_mcontext;
> >  trampptr = (uint32_t *)&rt_sf->uc.tuc_mcontext.tramp;
> > 
> 
> Reviewed-by: Laurent Vivier 
> 
> This is correct, but QEMU and kernel implementation are really different.
> 
> In the kernel:
> 
> handle_rt_signal64()
> ...
> frame = get_sigframe(ksig, get_tm_stackpointer(tsk),
>  sizeof(*frame), 0);
> ...
> err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
> NULL,
> (unsigned long)ksig->ka.sa.sa_handler,
> 1);
> 
> static long setup_sigcontext(struct sigcontext __user *sc,
> struct task_struct *tsk, int signr, sigset_t *set,
> unsigned long handler, int ctx_has_vsx_region)
> 
> err |= __put_user(&sc->gp_regs, &sc->regs);
> ...
> err |= __put_user(signr, &sc->signal);
> ...
> 
> According to kernel definition of ucontext:
> 
> struct ucontext {
> ...
> #ifdef __powerpc64__
> sigset_t __unused[15];   /* Allow for uc_sigmask growth */
> struct sigcontext uc_mcontext;  /* last for extensibility */
> #else
> ...
> }
> 
> kernel &frame->uc.uc_mcontext is qemu &rt_sf->uc.tuc_sigcontext
> 
> uc_sigcontext.mcontext doesn't exist in the kernel.
> 
> But QEMU code works because tuc_sigcontext.mcontext is where we have the
> CPU registers in sigcontext:
> 
> kernel:
> 
> struct sigcontext {
> unsigned long   _unused[4];
> int signal;
> #ifdef __powerpc64__
> int _pad0;
> #endif
> unsigned long   handler;
> unsigned long   oldmask;
> struct pt_regs  __user *regs;
> #ifdef __powerpc64__
> elf_gregset_t   gp_regs;
> elf_fpregset_t  fp_regs;
> ...
> 
> Qemu:
> 
> struct target_sigcontext {
> target_ulong _unused[4];
> int32_t signal;
> #if defined(TARGET_PPC64)
> int32_t pad0;
> #endif
> target_ulong handler;
> target_ulong oldmask;
> target_ulong regs;  /* struct pt_regs __user * */
> #if defined(TARGET_PPC64)
> struct target_mcontext mcontext;
> #endif
> };
> 
> struct target_mcontext {
> target_ulong mc_gregs[48];
> /* Includes fpscr.  */
> uint64_t mc_fregs[33];
> ...
> 
> I think we do it like that to use the same
> save_user_regs()/restore_user_regs() functions with PPC and PPC64... but
> comparison with kernel becomes harder.
> 
> Laurent
> 




Re: [Qemu-devel] [PATCH v12 12/24] tcg: handle EXCP_ATOMIC exception for system emulation

2017-02-13 Thread Richard Henderson

On 02/13/2017 11:10 PM, Alex Bennée wrote:

@@ -239,9 +240,16 @@ static void cpu_exec_step(CPUState *cpu)
  1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
 tb->orig_tb = NULL;
 tb_unlock();
-/* execute the generated code */
-trace_exec_tb_nocache(tb, pc);
-cpu_tb_exec(cpu, tb);
+
+cc->cpu_exec_enter(cpu);
+
+if (sigsetjmp(cpu->jmp_env, 0) == 0) {
+/* execute the generated code */
+trace_exec_tb_nocache(tb, pc);
+cpu_tb_exec(cpu, tb);
+}


I don't understand this, since cpu_tb_exec has its own sigsetjmp.  Where is the 
exception supposed to come from that escapes?



+} else if (r == EXCP_ATOMIC) {
+qemu_mutex_unlock_iothread();
+cpu_exec_step_atomic(cpu);
+qemu_mutex_lock_iothread();

...

+case EXCP_ATOMIC:
+qemu_mutex_unlock_iothread();
+cpu_exec_step_atomic(cpu);
+qemu_mutex_lock_iothread();



I just noticed this, but if you have to do a v13, it might be best to move 
these locks inside cpu_exec_step_atomic, as with tcg_cpu_exec.  Otherwise leave 
it for later.



r~



Re: [Qemu-devel] [PATCH v2 00/16] Postcopy: Hugepage support

2017-02-13 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
>  Hello David!

Hi Alexey,

> I have checked you series with 1G hugepage, but only in 1 Gbit/sec network
> environment.

Can you show the qemu command line you're using?  I'm just trying
to make sure I understand where your hugepages are; running 1G hostpages
across a 1Gbit/sec network for postcopy would be pretty poor - it would take
~10 seconds to transfer the page.

> I started Ubuntu just with console interface and gave to it only 1G of
> RAM, inside Ubuntu I started stress command

> (stress --cpu 4 --io 4 --vm 4 --vm-bytes 25600 &)
> in such environment precopy live migration was impossible, it never
> being finished, in this case it infinitely sends pages (it looks like
> dpkg scenario).
> 
> Also I modified stress utility
> http://people.seas.harvard.edu/~apw/stress/stress-1.0.4.tar.gz
> due to it wrote into memory every time the same value `Z`. My
> modified version writes every allocation new incremented value.

I use google's stressapptest normally; although remember to turn
off the bit where it pauses.

> I'm using Arcangeli's kernel only at the destination.
> 
> I got controversial results. Downtime for 1G hugepage is close to 2Mb
> hugepage and it took around 7 ms (in 2Mb hugepage scenario downtime was
> around 8 ms).
> I made that opinion by query-migrate.
> {"return": {"status": "completed", "setup-time": 6, "downtime": 6, 
> "total-time": 9668, "ram": {"total": 1091379200, "postcopy-requests": 1, 
> "dirty-sync-count": 2, "remaining": 0, "mbps": 879.786851, "transferred": 
> 1063007296, "duplicate": 7449, "dirty-pages-rate": 0, "skipped": 0, 
> "normal-bytes": 1060868096, "normal": 259001}}}
> 
> Documentation says about downtime field - measurement unit is ms.

The downtime measurement field is pretty meaningless for postcopy; it's only
the time from stopping the VM until the point where we tell the destination it
can start running.  Meaningful measurements are only from inside the guest
really, or the place latencies.

> So I traced it (I added additional trace into postcopy_place_page
> trace_postcopy_place_page_start(host, from, pagesize); )
> 
> postcopy_ram_fault_thread_request Request for HVA=7f6dc000 
> rb=/objects/mem offset=0
> postcopy_place_page_start host=0x7f6dc000 from=0x7f6d7000, 
> pagesize=4000
> postcopy_place_page_start host=0x7f6e0e80 from=0x55b665969619, 
> pagesize=1000
> postcopy_place_page_start host=0x7f6e0e801000 from=0x55b6659684e8, 
> pagesize=1000
> several pages with 4Kb step ...
> postcopy_place_page_start host=0x7f6e0e817000 from=0x55b6659694f0, 
> pagesize=1000
> 
> 4K pages, started from 0x7f6e0e80 address it's
> vga.ram, /rom@etc/acpi/tables etc.
> 
> Frankly saying, right now, I don't have any ideas why hugepage wasn't
> resent. Maybe my expectation of it is wrong as well as understanding )

That's pretty much what I expect to see - before you get into postcopy
mode everything is sent as individual 4k pages (in order); once we're
in postcopy mode we send each page no more than once.  So your
huge page comes across once - and there it is.

> stress utility also duplicated for me value into appropriate file:
> sec_since_epoch.microsec:value
> 1487003192.728493:22
> 1487003197.335362:23
> *1487003213.367260:24*
> *1487003238.480379:25*
> 1487003243.315299:26
> 1487003250.775721:27
> 1487003255.473792:28
> 
> It mean rewriting 256Mb of memory per byte took around 5 sec, but at
> the moment of migration it took 25 sec.

Right, now this is the thing that's more useful to measure.
That's not too surprising; when it migrates that data is changing rapidly
so it's going to have to pause and wait for that whole 1GB to be transferred.
Your 1Gbps network is going to take about 10 seconds to transfer that
1GB page - and that's if you're lucky and it saturates the network.
So it's going to take at least 10 seconds longer than it normally
would, plus any other overheads - so at least 15 seconds.
This is why I say it's a bad idea to use 1GB host pages with postcopy.
Of course it would be fun to find where the other 10 seconds went!
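
To make the arithmetic above concrete, here is a rough sketch of the best-case
time to push a single host page across the wire. The helper name and
parameters are made up for illustration; it assumes the link is fully
saturated and ignores protocol and migration-stream overhead, so real numbers
will be worse.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: lower bound, in seconds, for transferring one page
 * of page_bytes over a link running at link_bits_per_sec.
 * Assumes a saturated link and no protocol overhead.
 */
static double page_transfer_seconds(uint64_t page_bytes,
                                    double link_bits_per_sec)
{
    return (double)page_bytes * 8.0 / link_bits_per_sec;
}

/* Convenience wrappers for the two page sizes discussed in this thread. */
static double transfer_1g_page(double link_bits_per_sec)
{
    return page_transfer_seconds(UINT64_C(1) << 30, link_bits_per_sec);
}

static double transfer_2m_page(double link_bits_per_sec)
{
    return page_transfer_seconds(UINT64_C(2) << 20, link_bits_per_sec);
}
```

With a 1 Gbit/sec link (1e9 bits per second), transfer_1g_page() comes out at
roughly 8.6 seconds, in line with the "about 10 seconds" figure once overheads
are added, while transfer_2m_page() is under 20 milliseconds. The guest
thread that faulted on the page is stalled for at least that long.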

You might like to add timing to the tracing so you can see the time between the
fault thread requesting the page and it arriving.

> Another one request.
> QEMU could use mem_path in hugefs with share key simultaneously
> (-object 
> memory-backend-file,id=mem,size=${mem_size},mem-path=${mem_path},share=on) 
> and vm
> in this case will start and will properly work (it will allocate memory
> with mmap), but in case of destination for postcopy live migration
> UFFDIO_COPY ioctl will fail for
> such region, in Arcangeli's git tree there is such prevent check
> (if (!vma_is_shmem(dst_vma) && dst_vma->vm_flags & VM_SHARED).
> Is it possible to handle such situation at qemu?

Imagine that you had shared memory; what semantics would you like
to see ?  What happens to the other process?

Dave

> On Mon, Feb 06, 2017 at 05:45:30PM +, Dr. David Alan Gilbert wrote:

[Qemu-devel] [PATCH 1/6] coroutine-lock: make CoMutex thread-safe

2017-02-13 Thread Paolo Bonzini
This uses the lock-free mutex described in the paper '"Blocking without
Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and
Papatriantafilou.  The same technique is used in OSv, and in fact
the code is essentially a conversion to C of OSv's code.

Signed-off-by: Paolo Bonzini 
---
 include/qemu/coroutine.h |  17 -
 tests/test-aio-multithread.c |  86 
 util/qemu-coroutine-lock.c   | 153 ---
 util/trace-events|   1 +
 4 files changed, 245 insertions(+), 12 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 12584ed..fce228f 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -160,10 +160,23 @@ bool qemu_co_queue_empty(CoQueue *queue);
 /**
  * Provides a mutex that can be used to synchronise coroutines
  */
+struct CoWaitRecord;
 typedef struct CoMutex {
-bool locked;
+/* Count of pending lockers; 0 for a free mutex, 1 for an
+ * uncontended mutex.
+ */
+unsigned locked;
+
+/* A queue of waiters.  Elements are added atomically in front of
+ * from_push.  to_pop is only populated, and popped from, by whoever
+ * is in charge of the next wakeup.  This can be an unlocker or,
+ * through the handoff protocol, a locker that is about to go to sleep.
+ */
+QSLIST_HEAD(, CoWaitRecord) from_push, to_pop;
+
+unsigned handoff, sequence;
+
 Coroutine *holder;
-CoQueue queue;
 } CoMutex;
 
 /**
diff --git a/tests/test-aio-multithread.c b/tests/test-aio-multithread.c
index 534807d..ada8c48 100644
--- a/tests/test-aio-multithread.c
+++ b/tests/test-aio-multithread.c
@@ -196,6 +196,88 @@ static void test_multi_co_schedule_10(void)
 test_multi_co_schedule(10);
 }
 
+/* CoMutex thread-safety.  */
+
+static uint32_t atomic_counter;
+static uint32_t running;
+static uint32_t counter;
+static CoMutex comutex;
+
+static void test_multi_co_mutex_entry(void *opaque)
+{
+while (!atomic_mb_read(&now_stopping)) {
+qemu_co_mutex_lock(&comutex);
+counter++;
+qemu_co_mutex_unlock(&comutex);
+
+/* Increase atomic_counter *after* releasing the mutex.  Otherwise
+ * there is a chance (it happens about 1 in 3 runs) that the iothread
+ * exits before the coroutine is woken up, causing a spurious
+ * assertion failure.
+ */
+atomic_inc(&atomic_counter);
+}
+atomic_dec(&running);
+}
+
+static void test_multi_co_mutex(int threads, int seconds)
+{
+int i;
+
+qemu_co_mutex_init(&comutex);
+counter = 0;
+atomic_counter = 0;
+now_stopping = false;
+
+create_aio_contexts();
+assert(threads <= NUM_CONTEXTS);
+running = threads;
+for (i = 0; i < threads; i++) {
+Coroutine *co1 = qemu_coroutine_create(test_multi_co_mutex_entry, NULL);
+aio_co_schedule(ctx[i], co1);
+}
+
+g_usleep(seconds * 1000000);
+
+atomic_mb_set(&now_stopping, true);
+while (running > 0) {
+g_usleep(10);
+}
+
+join_aio_contexts();
+g_test_message("%d iterations/second\n", counter / seconds);
+g_assert_cmpint(counter, ==, atomic_counter);
+}
+
+/* Testing with NUM_CONTEXTS threads focuses on the queue.  The mutex however
+ * is too contended (and the threads spend too much time in aio_poll)
+ * to actually stress the handoff protocol.
+ */
+static void test_multi_co_mutex_1(void)
+{
+test_multi_co_mutex(NUM_CONTEXTS, 1);
+}
+
+static void test_multi_co_mutex_10(void)
+{
+test_multi_co_mutex(NUM_CONTEXTS, 10);
+}
+
+/* Testing with fewer threads stresses the handoff protocol too.  Still, the
+ * case where the locker _can_ pick up a handoff is very rare, happening
+ * about 10 times in 1 million, so increase the runtime a bit compared to
+ * other "quick" testcases that only run for 1 second.
+ */
+static void test_multi_co_mutex_2_3(void)
+{
+test_multi_co_mutex(2, 3);
+}
+
+static void test_multi_co_mutex_2_30(void)
+{
+test_multi_co_mutex(2, 30);
+}
+
 /* End of tests.  */
 
 int main(int argc, char **argv)
@@ -206,8 +288,12 @@ int main(int argc, char **argv)
 g_test_add_func("/aio/multi/lifecycle", test_lifecycle);
 if (g_test_quick()) {
 g_test_add_func("/aio/multi/schedule", test_multi_co_schedule_1);
+g_test_add_func("/aio/multi/mutex/contended", test_multi_co_mutex_1);
+g_test_add_func("/aio/multi/mutex/handoff", test_multi_co_mutex_2_3);
 } else {
 g_test_add_func("/aio/multi/schedule", test_multi_co_schedule_10);
+g_test_add_func("/aio/multi/mutex/contended", test_multi_co_mutex_10);
+g_test_add_func("/aio/multi/mutex/handoff", test_multi_co_mutex_2_30);
 }
 return g_test_run();
 }
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index e6afd1a..25da9fa 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -20,6 +20,10 @@
  * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 

Re: [Qemu-devel] [PATCH 1/1] mirror: do not increase offset during initial zero_or_discard phase - pls consider this as V3 patch

2017-02-13 Thread Jeff Cody
On Mon, Feb 13, 2017 at 06:16:36PM +0100, Max Reitz wrote:
> On 13.02.2017 08:10, Denis V. Lunev wrote:
> > On 02/03/2017 06:08 PM, Denis V. Lunev wrote:
> >> On 02/03/2017 06:06 PM, Denis V. Lunev wrote:
> >>> From: Anton Nefedov 
> >>>
> >>> If explicit zeroing out before mirroring is required for the target image,
> >>> it moves the block job offset counter to EOF, then offset and len counters
> >>> count the image size twice. There is no harm but stats are confusing,
> >>> specifically the progress of the operation is always reported as 99% by
> >>> management tools.
> >>>
> >>> The patch skips offset increase for the first "technical" pass over the
> >>> image. This should not cause any further harm.
> >>>
> >>> Signed-off-by: Anton Nefedov 
> >>> Signed-off-by: Denis V. Lunev 
> >>> Reviewed-by: Eric Blake 
> >>> Reviewed-by: Stefan Hajnoczi 
> >>> CC: Jeff Cody 
> >>> CC: Kevin Wolf 
> >>> CC: Max Reitz 
> >> actually this is V3 patch. Sorry for broken subject.
> >>
> >> Den
> > ping
> 
> Didn't Jeff merge v2?
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg01319.html
> 
> Max
> 


Yes, I did.



[Qemu-devel] [PATCH v4] migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable

2017-02-13 Thread Ashijeet Acharya
Commit a3a3d8c7 introduced a segfault bug while checking for
'dc->vmsd->unmigratable' which caused QEMU to crash when trying to add
devices which do not set their 'dc->vmsd' during initialization.
Place a 'dc->vmsd' check prior to it so that we do not segfault for
such devices.

NOTE: This doesn't compromise the functioning of --only-migratable
option as all the unmigratable devices do set their 'dc->vmsd'.

Introduce a new function check_migratable() and move the
only_migratable check inside it, also use stubs to avoid user-mode qemu
build failures.

Signed-off-by: Ashijeet Acharya 
---
Changes in v4:
- introduce new check_migratable() function and use stubs
Changes in v3:
- move only_migratable check inside device_set_realized() to avoid code
  duplication
- I have dropped Juan's R-b tag for this one
Changes in v2:
- place dc->vmsd check in hw/usb/bus.c as well
---
 hw/core/qdev.c|  7 +++
 hw/usb/bus.c  | 19 ---
 include/migration/migration.h |  3 +++
 migration/migration.c | 15 +++
 qdev-monitor.c|  9 -
 stubs/vmstate.c   |  6 ++
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 5783442..4f49cfe 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -37,6 +37,7 @@
 #include "hw/boards.h"
 #include "hw/sysbus.h"
 #include "qapi-event.h"
+#include "migration/migration.h"
 
 int qdev_hotplug = 0;
 static bool qdev_hot_added = false;
@@ -889,6 +890,7 @@ static void device_set_realized(Object *obj, bool value, 
Error **errp)
 Error *local_err = NULL;
 bool unattached_parent = false;
 static int unattached_count;
+int ret;
 
 if (dev->hotplugged && !dc->hotpluggable) {
 error_setg(errp, QERR_DEVICE_NO_HOTPLUG, object_get_typename(obj));
@@ -896,6 +898,11 @@ static void device_set_realized(Object *obj, bool value, 
Error **errp)
 }
 
 if (value && !dev->realized) {
+ret = check_migratable(obj, &local_err);
+if (ret < 0) {
+goto fail;
+}
+
 if (!obj->parent) {
 gchar *name = g_strdup_printf("device[%d]", unattached_count++);
 
diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 1dcc35c..25913ad 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -8,7 +8,6 @@
 #include "monitor/monitor.h"
 #include "trace.h"
 #include "qemu/cutils.h"
-#include "migration/migration.h"
 
 static void usb_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent);
 
@@ -687,8 +686,6 @@ USBDevice *usbdevice_create(const char *cmdline)
 const char *params;
 int len;
 USBDevice *dev;
-ObjectClass *klass;
-DeviceClass *dc;
 
 params = strchr(cmdline,':');
 if (params) {
@@ -723,22 +720,6 @@ USBDevice *usbdevice_create(const char *cmdline)
 return NULL;
 }
 
-klass = object_class_by_name(f->name);
-if (klass == NULL) {
-error_report("Device '%s' not found", f->name);
-return NULL;
-}
-
-dc = DEVICE_CLASS(klass);
-
-if (only_migratable) {
-if (dc->vmsd->unmigratable) {
-error_report("Device %s is not migratable, but --only-migratable "
- "was specified", f->name);
-return NULL;
-}
-}
-
 if (f->usbdevice_init) {
 dev = f->usbdevice_init(bus, params);
 } else {
diff --git a/include/migration/migration.h b/include/migration/migration.h
index af9135f..a6868cd 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -22,6 +22,7 @@
 #include "qapi-types.h"
 #include "exec/cpu-common.h"
 #include "qemu/coroutine_int.h"
+#include "qom/object.h"
 
 #define QEMU_VM_FILE_MAGIC   0x5145564d
 #define QEMU_VM_FILE_VERSION_COMPAT  0x0002
@@ -305,6 +306,8 @@ int migrate_add_blocker(Error *reason, Error **errp);
  */
 void migrate_del_blocker(Error *reason);
 
+int check_migratable(Object *obj, Error **err);
+
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 
diff --git a/migration/migration.c b/migration/migration.c
index 2766d2f..00b33f3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1146,6 +1146,21 @@ void migrate_del_blocker(Error *reason)
 migration_blockers = g_slist_remove(migration_blockers, reason);
 }
 
+int check_migratable(Object *obj, Error **err)
+{
+DeviceClass *dc = DEVICE_GET_CLASS(obj);
+if (only_migratable && dc->vmsd) {
+if (dc->vmsd->unmigratable) {
+error_setg(err, "Device %s is not migratable, but "
+   "--only-migratable was specified",
+   object_get_typename(obj));
+return -1;
+}
+}
+
+return 0;
+}
+
 void qmp_migrate_incoming(const char *uri, Error **errp)
 {
 Error *local_err = NULL;
diff --git a/qdev-monitor.c b/qdev-monitor.c
index 549f45f..5f2fcdf 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -29,7 +29,6 @@
 #include 

[Qemu-devel] [PULL 14/14] virtio/migration: Migrate virtio-net to VMState

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Message-Id: <20170203160651.19917-5-dgilb...@redhat.com>
Signed-off-by: Dr. David Alan Gilbert 
  Merge fix against Halil's removal of the '_start' field in
VMSTATE_VBUFFER_MULTIPLY
---
 hw/net/virtio-net.c| 316 +++--
 include/hw/virtio/virtio-net.h |   4 +-
 2 files changed, 213 insertions(+), 107 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7b3ad4a..354a19e 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1557,119 +1557,22 @@ static void virtio_net_set_multiqueue(VirtIONet *n, 
int multiqueue)
 virtio_net_set_queues(n);
 }
 
-static void virtio_net_save_device(VirtIODevice *vdev, QEMUFile *f)
+static int virtio_net_post_load_device(void *opaque, int version_id)
 {
-VirtIONet *n = VIRTIO_NET(vdev);
-int i;
-
-qemu_put_buffer(f, n->mac, ETH_ALEN);
-qemu_put_be32(f, n->vqs[0].tx_waiting);
-qemu_put_be32(f, n->mergeable_rx_bufs);
-qemu_put_be16(f, n->status);
-qemu_put_byte(f, n->promisc);
-qemu_put_byte(f, n->allmulti);
-qemu_put_be32(f, n->mac_table.in_use);
-qemu_put_buffer(f, n->mac_table.macs, n->mac_table.in_use * ETH_ALEN);
-qemu_put_buffer(f, (uint8_t *)n->vlans, MAX_VLAN >> 3);
-qemu_put_be32(f, n->has_vnet_hdr);
-qemu_put_byte(f, n->mac_table.multi_overflow);
-qemu_put_byte(f, n->mac_table.uni_overflow);
-qemu_put_byte(f, n->alluni);
-qemu_put_byte(f, n->nomulti);
-qemu_put_byte(f, n->nouni);
-qemu_put_byte(f, n->nobcast);
-qemu_put_byte(f, n->has_ufo);
-if (n->max_queues > 1) {
-qemu_put_be16(f, n->max_queues);
-qemu_put_be16(f, n->curr_queues);
-for (i = 1; i < n->curr_queues; i++) {
-qemu_put_be32(f, n->vqs[i].tx_waiting);
-}
-}
-
-if (virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)) {
-qemu_put_be64(f, n->curr_guest_offloads);
-}
-}
-
-static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f,
-  int version_id)
-{
-VirtIONet *n = VIRTIO_NET(vdev);
+VirtIONet *n = opaque;
+VirtIODevice *vdev = VIRTIO_DEVICE(n);
 int i, link_down;
 
-qemu_get_buffer(f, n->mac, ETH_ALEN);
-n->vqs[0].tx_waiting = qemu_get_be32(f);
-
-virtio_net_set_mrg_rx_bufs(n, qemu_get_be32(f),
+virtio_net_set_mrg_rx_bufs(n, n->mergeable_rx_bufs,
virtio_vdev_has_feature(vdev,
VIRTIO_F_VERSION_1));
 
-n->status = qemu_get_be16(f);
-
-n->promisc = qemu_get_byte(f);
-n->allmulti = qemu_get_byte(f);
-
-n->mac_table.in_use = qemu_get_be32(f);
 /* MAC_TABLE_ENTRIES may be different from the saved image */
-if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {
-qemu_get_buffer(f, n->mac_table.macs,
-n->mac_table.in_use * ETH_ALEN);
-} else {
-int64_t i;
-
-/* Overflow detected - can happen if source has a larger MAC table.
- * We simply set overflow flag so there's no need to maintain the
- * table of addresses, discard them all.
- * Note: 64 bit math to avoid integer overflow.
- */
-for (i = 0; i < (int64_t)n->mac_table.in_use * ETH_ALEN; ++i) {
-qemu_get_byte(f);
-}
-n->mac_table.multi_overflow = n->mac_table.uni_overflow = 1;
+if (n->mac_table.in_use > MAC_TABLE_ENTRIES) {
 n->mac_table.in_use = 0;
 }
- 
-qemu_get_buffer(f, (uint8_t *)n->vlans, MAX_VLAN >> 3);
-
-if (qemu_get_be32(f) && !peer_has_vnet_hdr(n)) {
-error_report("virtio-net: saved image requires vnet_hdr=on");
-return -1;
-}
-
-n->mac_table.multi_overflow = qemu_get_byte(f);
-n->mac_table.uni_overflow = qemu_get_byte(f);
-
-n->alluni = qemu_get_byte(f);
-n->nomulti = qemu_get_byte(f);
-n->nouni = qemu_get_byte(f);
-n->nobcast = qemu_get_byte(f);
-
-if (qemu_get_byte(f) && !peer_has_ufo(n)) {
-error_report("virtio-net: saved image requires TUN_F_UFO support");
-return -1;
-}
 
-if (n->max_queues > 1) {
-if (n->max_queues != qemu_get_be16(f)) {
-error_report("virtio-net: different max_queues ");
-return -1;
-}
-
-n->curr_queues = qemu_get_be16(f);
-if (n->curr_queues > n->max_queues) {
-error_report("virtio-net: curr_queues %x > max_queues %x",
- n->curr_queues, n->max_queues);
-return -1;
-}
-for (i = 1; i < n->curr_queues; i++) {
-n->vqs[i].tx_waiting = qemu_get_be32(f);
-}
-}
-
-if (virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)) {
-n->curr_guest_offloads = qemu_get_be64(f);

[Qemu-devel] [PULL 07/14] migration: consolidate VMStateField.start

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Halil Pasic 

The member VMStateField.start is used for two things, partial data
migration for VBUFFER data (basically provide migration for a
sub-buffer) and for locating next in QTAILQ.

The implementation of the VBUFFER feature is broken when VMSTATE_ALLOC
is used. This however goes unnoticed because actually partial migration
for VBUFFER is not used at all.

Let's consolidate the usage of VMStateField.start by removing support
for partial migration for VBUFFER.

Signed-off-by: Halil Pasic 

Message-Id: <20170203175217.45562-1-pa...@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
---
 hw/char/exynos4210_uart.c   |  2 +-
 hw/display/g364fb.c |  2 +-
 hw/dma/pl330.c  |  8 
 hw/intc/exynos4210_gic.c|  2 +-
 hw/ipmi/isa_ipmi_bt.c   |  6 ++
 hw/net/vmxnet3.c|  2 +-
 hw/nvram/mac_nvram.c|  2 +-
 hw/nvram/spapr_nvram.c  |  2 +-
 hw/sd/sdhci.c   |  2 +-
 hw/timer/m48t59.c   |  2 +-
 include/migration/vmstate.h | 21 -
 migration/savevm.c  |  2 +-
 migration/vmstate.c |  4 ++--
 target/s390x/machine.c  |  2 +-
 util/fifo8.c|  2 +-
 15 files changed, 27 insertions(+), 34 deletions(-)

diff --git a/hw/char/exynos4210_uart.c b/hw/char/exynos4210_uart.c
index 7c16e89..b75f28d 100644
--- a/hw/char/exynos4210_uart.c
+++ b/hw/char/exynos4210_uart.c
@@ -561,7 +561,7 @@ static const VMStateDescription 
vmstate_exynos4210_uart_fifo = {
 .fields = (VMStateField[]) {
 VMSTATE_UINT32(sp, Exynos4210UartFIFO),
 VMSTATE_UINT32(rp, Exynos4210UartFIFO),
-VMSTATE_VBUFFER_UINT32(data, Exynos4210UartFIFO, 1, NULL, 0, size),
+VMSTATE_VBUFFER_UINT32(data, Exynos4210UartFIFO, 1, NULL, size),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/display/g364fb.c b/hw/display/g364fb.c
index 70ef2c7..8cdc205 100644
--- a/hw/display/g364fb.c
+++ b/hw/display/g364fb.c
@@ -464,7 +464,7 @@ static const VMStateDescription vmstate_g364fb = {
 .minimum_version_id = 1,
 .post_load = g364fb_post_load,
 .fields = (VMStateField[]) {
-VMSTATE_VBUFFER_UINT32(vram, G364State, 1, NULL, 0, vram_size),
+VMSTATE_VBUFFER_UINT32(vram, G364State, 1, NULL, vram_size),
 VMSTATE_BUFFER_UNSAFE(color_palette, G364State, 0, 256 * 3),
 VMSTATE_BUFFER_UNSAFE(cursor_palette, G364State, 0, 9),
 VMSTATE_UINT16_ARRAY(cursor, G364State, 512),
diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
index c0bd9fe..32cf839 100644
--- a/hw/dma/pl330.c
+++ b/hw/dma/pl330.c
@@ -173,8 +173,8 @@ static const VMStateDescription vmstate_pl330_fifo = {
 .version_id = 1,
 .minimum_version_id = 1,
 .fields = (VMStateField[]) {
-VMSTATE_VBUFFER_UINT32(buf, PL330Fifo, 1, NULL, 0, buf_size),
-VMSTATE_VBUFFER_UINT32(tag, PL330Fifo, 1, NULL, 0, buf_size),
+VMSTATE_VBUFFER_UINT32(buf, PL330Fifo, 1, NULL, buf_size),
+VMSTATE_VBUFFER_UINT32(tag, PL330Fifo, 1, NULL, buf_size),
 VMSTATE_UINT32(head, PL330Fifo),
 VMSTATE_UINT32(num, PL330Fifo),
 VMSTATE_UINT32(buf_size, PL330Fifo),
@@ -282,8 +282,8 @@ static const VMStateDescription vmstate_pl330 = {
 VMSTATE_STRUCT(manager, PL330State, 0, vmstate_pl330_chan, PL330Chan),
 VMSTATE_STRUCT_VARRAY_UINT32(chan, PL330State, num_chnls, 0,
  vmstate_pl330_chan, PL330Chan),
-VMSTATE_VBUFFER_UINT32(lo_seqn, PL330State, 1, NULL, 0, num_chnls),
-VMSTATE_VBUFFER_UINT32(hi_seqn, PL330State, 1, NULL, 0, num_chnls),
+VMSTATE_VBUFFER_UINT32(lo_seqn, PL330State, 1, NULL, num_chnls),
+VMSTATE_VBUFFER_UINT32(hi_seqn, PL330State, 1, NULL, num_chnls),
 VMSTATE_STRUCT(fifo, PL330State, 0, vmstate_pl330_fifo, PL330Fifo),
 VMSTATE_STRUCT(read_queue, PL330State, 0, vmstate_pl330_queue,
PL330Queue),
diff --git a/hw/intc/exynos4210_gic.c b/hw/intc/exynos4210_gic.c
index fd7a8f3..2a55817 100644
--- a/hw/intc/exynos4210_gic.c
+++ b/hw/intc/exynos4210_gic.c
@@ -393,7 +393,7 @@ static const VMStateDescription vmstate_exynos4210_irq_gate 
= {
 .version_id = 2,
 .minimum_version_id = 2,
 .fields = (VMStateField[]) {
-VMSTATE_VBUFFER_UINT32(level, Exynos4210IRQGateState, 1, NULL, 0, n_in),
+VMSTATE_VBUFFER_UINT32(level, Exynos4210IRQGateState, 1, NULL, n_in),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/ipmi/isa_ipmi_bt.c b/hw/ipmi/isa_ipmi_bt.c
index f036617..1c69cb3 100644
--- a/hw/ipmi/isa_ipmi_bt.c
+++ b/hw/ipmi/isa_ipmi_bt.c
@@ -471,10 +471,8 @@ static const VMStateDescription vmstate_ISAIPMIBTDevice = {
 VMSTATE_BOOL(bt.use_irq, ISAIPMIBTDevice),
 VMSTATE_BOOL(bt.irqs_enabled, ISAIPMIBTDevice),
 VMSTATE_UINT32(bt.outpos, ISAIPMIBTDevice),
-

[Qemu-devel] [PULL 06/14] migrate: Introduce zero RAM checks to skip RAM migration

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Ashijeet Acharya 

Migration of a "none" machine with no RAM crashes abruptly as
bitmap_new() fails and thus aborts. Instead place zero RAM checks at
appropriate places to skip migration of RAM in this case and complete
migration successfully for devices only.

Signed-off-by: Ashijeet Acharya 
Message-Id: <1486564125-31366-1-git-send-email-ashijeetacha...@gmail.com>
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 67f2efb..f289fcd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1346,6 +1346,11 @@ static int ram_find_and_save_block(QEMUFile *f, bool 
last_stage,
 ram_addr_t dirty_ram_abs; /* Address of the start of the dirty page in
  ram_addr_t space */
 
+/* No dirty page as there is zero RAM */
+if (!ram_bytes_total()) {
+return pages;
+}
+
 pss.block = last_seen_block;
 pss.offset = last_offset;
 pss.complete_round = false;
@@ -1952,14 +1957,17 @@ static int ram_save_init_globals(void)
 bytes_transferred = 0;
 reset_ram_globals();
 
-ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
 migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
-migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
-bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
-
-if (migrate_postcopy_ram()) {
-migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
-bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
+/* Skip setting bitmap if there is no RAM */
+if (ram_bytes_total()) {
+ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
+migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
+bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
+
+if (migrate_postcopy_ram()) {
+migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
+bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
+}
 }
 
 /*
-- 
2.9.3
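
The guard described in the commit message can be sketched outside QEMU as follows. This is a minimal model, not the code in migration/ram.c: struct dirty_bitmap, dirty_bitmap_init(), count_dirty_pages() and the 4 KiB PAGE_BITS value are all assumptions made for illustration. It only shows the shape of the fix, namely skipping bitmap allocation and the dirty-page search when there is no RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS 12                      /* assumed 4 KiB target pages */
#define BITS_PER_ULONG (sizeof(unsigned long) * 8)

/* Hypothetical stand-in for the migration dirty bitmap. */
struct dirty_bitmap {
    unsigned long *bmap;                  /* NULL when there is no RAM */
    size_t pages;
};

/* Returns false only on allocation failure; zero RAM is not an error. */
static bool dirty_bitmap_init(struct dirty_bitmap *b, uint64_t ram_bytes)
{
    b->bmap = NULL;
    b->pages = (size_t)(ram_bytes >> PAGE_BITS);
    if (b->pages == 0) {
        return true;                      /* nothing to track, devices only */
    }
    size_t longs = (b->pages + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
    b->bmap = calloc(longs, sizeof(unsigned long));
    if (!b->bmap) {
        return false;
    }
    /* Mark every page dirty, as the initial bitmap_set() does. */
    for (size_t i = 0; i < b->pages; i++) {
        b->bmap[i / BITS_PER_ULONG] |= 1UL << (i % BITS_PER_ULONG);
    }
    return true;
}

/* Early return mirrors the "no dirty page as there is zero RAM" check. */
static size_t count_dirty_pages(const struct dirty_bitmap *b)
{
    size_t dirty = 0;
    if (b->pages == 0) {
        return dirty;
    }
    for (size_t i = 0; i < b->pages; i++) {
        if (b->bmap[i / BITS_PER_ULONG] & (1UL << (i % BITS_PER_ULONG))) {
            dirty++;
        }
    }
    return dirty;
}
```

With ram_bytes equal to zero the sketch never touches the allocator, which is the property the patch relies on to avoid the bitmap_new() abort.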




Re: [Qemu-devel] [PATCH 2/4] xhci: add qemu xhci controller

2017-02-13 Thread Marcel Apfelbaum

On 02/06/2017 01:55 PM, Gerd Hoffmann wrote:

Turn existing TYPE_XHCI into an abstract base class.
Create two child classes, TYPE_NEC_XHCI (same name as old xhci
controller) and TYPE_QEMU_XHCI (using an ID from our namespace).

Signed-off-by: Gerd Hoffmann 
---
 docs/specs/pci-ids.txt |  1 +
 hw/usb/hcd-xhci.c  | 40 
 include/hw/pci/pci.h   |  1 +
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
index 16fdb0c..95adee0 100644
--- a/docs/specs/pci-ids.txt
+++ b/docs/specs/pci-ids.txt
@@ -61,6 +61,7 @@ PCI devices (other than virtio):
 1b36:0009  PCI Expander Bridge (-device pxb)
 1b36:000a  PCI-PCI bridge (multiseat)
 1b36:000b  PCIe Expander Bridge (-device pxb-pcie)
+1b36:000d  PCI xhci usb host adapter

 All these devices are documented in docs/specs.

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 74184ac..887bb39 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -487,7 +487,9 @@ struct XHCIState {
 XHCIRing cmd_ring;
 };

-#define TYPE_XHCI "nec-usb-xhci"
+#define TYPE_XHCI "base-xhci"
+#define TYPE_NEC_XHCI "nec-usb-xhci"
+#define TYPE_QEMU_XHCI "qemu-xhci"

 #define XHCI(obj) \
 OBJECT_CHECK(XHCIState, (obj), TYPE_XHCI)
@@ -3868,10 +3870,7 @@ static void xhci_class_init(ObjectClass *klass, void 
*data)
 set_bit(DEVICE_CATEGORY_USB, dc->categories);
 k->realize  = usb_xhci_realize;
 k->exit = usb_xhci_exit;
-k->vendor_id= PCI_VENDOR_ID_NEC;
-k->device_id= PCI_DEVICE_ID_NEC_UPD720200;
 k->class_id = PCI_CLASS_SERIAL_USB;
-k->revision = 0x03;
 k->is_express   = 1;
 }

@@ -3880,11 +3879,44 @@ static const TypeInfo xhci_info = {
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(XHCIState),
 .class_init= xhci_class_init,
+.abstract  = true,
+};
+
+static void nec_xhci_class_init(ObjectClass *klass, void *data)
+{
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->vendor_id= PCI_VENDOR_ID_NEC;
+k->device_id= PCI_DEVICE_ID_NEC_UPD720200;
+k->revision = 0x03;
+}
+
+static const TypeInfo nec_xhci_info = {
+.name  = TYPE_NEC_XHCI,
+.parent= TYPE_XHCI,
+.class_init= nec_xhci_class_init,
+};
+
+static void qemu_xhci_class_init(ObjectClass *klass, void *data)
+{
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->vendor_id= PCI_VENDOR_ID_REDHAT;
+k->device_id= PCI_DEVICE_ID_REDHAT_XHCI;
+k->revision = 0x01;
+}
+
+static const TypeInfo qemu_xhci_info = {
+.name  = TYPE_QEMU_XHCI,
+.parent= TYPE_XHCI,
+.class_init= qemu_xhci_class_init,
 };

 static void xhci_register_types(void)
 {
 type_register_static(&xhci_info);
+type_register_static(&nec_xhci_info);
+type_register_static(&qemu_xhci_info);
 }

 type_init(xhci_register_types)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index cbc1fdf..05ef14b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -97,6 +97,7 @@
 #define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT 0x000a
 #define PCI_DEVICE_ID_REDHAT_PXB_PCIE0x000b
 #define PCI_DEVICE_ID_REDHAT_PCIE_RP 0x000c
+#define PCI_DEVICE_ID_REDHAT_XHCI0x000d
 #define PCI_DEVICE_ID_REDHAT_QXL 0x0100

 #define FMT_PCIBUS  PRIx64




Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel
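
For readers less familiar with the pattern in the patch, here is a toy model of an abstract base type with two concrete children. It is not QOM: struct type_info, type_class_init() and type_instantiate() are hypothetical names, and the real object model does considerably more. The IDs are the ones the patch uses (1b36:000d for qemu-xhci); the NEC values 0x1033/0x0194 and the USB class code 0x0c03 are quoted from memory of QEMU's PCI ID headers and should be treated as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pci_class {
    uint16_t vendor_id, device_id, class_id;
    uint8_t revision;
};

/* Hypothetical, much simplified analogue of TypeInfo. */
struct type_info {
    const char *name;
    const struct type_info *parent;
    bool abstract;
    void (*class_init)(struct pci_class *k);
};

/* Shared setup lives in the base class. */
static void base_class_init(struct pci_class *k)
{
    k->class_id = 0x0c03;                 /* serial bus / USB (assumed value) */
}

static void nec_class_init(struct pci_class *k)
{
    k->vendor_id = 0x1033;                /* NEC (assumed value) */
    k->device_id = 0x0194;                /* uPD720200 (assumed value) */
    k->revision = 0x03;
}

static void qemu_class_init(struct pci_class *k)
{
    k->vendor_id = 0x1b36;                /* Red Hat */
    k->device_id = 0x000d;                /* xhci, from pci-ids.txt in the patch */
    k->revision = 0x01;
}

static const struct type_info base_xhci = { "base-xhci", NULL, true, base_class_init };
static const struct type_info nec_xhci = { "nec-usb-xhci", &base_xhci, false, nec_class_init };
static const struct type_info qemu_xhci = { "qemu-xhci", &base_xhci, false, qemu_class_init };

/* Run class_init hooks parent first, so children can override. */
static void type_class_init(const struct type_info *t, struct pci_class *k)
{
    if (t->parent) {
        type_class_init(t->parent, k);
    }
    if (t->class_init) {
        t->class_init(k);
    }
}

/* Abstract types cannot be instantiated directly. */
static bool type_instantiate(const struct type_info *t, struct pci_class *k)
{
    if (t->abstract) {
        return false;
    }
    *k = (struct pci_class){ 0 };
    type_class_init(t, k);
    return true;
}
```

Keeping "nec-usb-xhci" as the name of a concrete child is what lets existing command lines keep working while the new device gets an ID from QEMU's own namespace.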



Re: [Qemu-devel] [PATCH] audio: make audio poll timer deterministic

2017-02-13 Thread Yurii Zubrytskyi
Hi,

It looks to me that this behavior can be achieved with "timer_mod_anticipate()"
function instead of a separate check.
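
To make the comparison concrete, here is a toy model of the two strategies. It does not use QEMU's timer code: struct toy_timer and the toy_* helpers are hypothetical, and the second helper only mimics my understanding of timer_mod_anticipate(), i.e. keep whichever deadline comes earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a QEMUTimer. */
struct toy_timer {
    bool pending;
    int64_t expire;
};

static void toy_timer_mod(struct toy_timer *t, int64_t expire)
{
    t->pending = true;
    t->expire = expire;
}

/* The patch's approach: leave an already armed timer alone. */
static void toy_timer_arm_if_idle(struct toy_timer *t, int64_t expire)
{
    if (!t->pending) {
        toy_timer_mod(t, expire);
    }
}

/* Assumed "anticipate" semantics: only ever move the deadline earlier. */
static void toy_timer_mod_anticipate(struct toy_timer *t, int64_t expire)
{
    if (!t->pending || expire < t->expire) {
        toy_timer_mod(t, expire);
    }
}
```

The two agree whenever the new deadline is later than the pending one, which should be the usual case here because the virtual clock only moves forward and the period is fixed. They differ only if a caller asks for an earlier deadline.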

On Sun, Feb 12, 2017 at 9:04 PM, Pavel Dovgalyuk  wrote:

> Ping?
>
> Pavel Dovgalyuk
>
>
> > -Original Message-
> > From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru]
> > Sent: Tuesday, January 31, 2017 2:59 PM
> > To: qemu-devel@nongnu.org
> > Cc: pbonz...@redhat.com; dovga...@ispras.ru; kra...@redhat.com
> > Subject: [PATCH] audio: make audio poll timer deterministic
> >
> > This patch changes resetting strategy of the audio polling timer.
> > It does not change expiration time if the timer is already set.
> >
> > Signed-off-by: Pavel Dovgalyuk 
> > ---
> >  audio/audio.c |6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/audio/audio.c b/audio/audio.c
> > index c845a44..1ee95a5 100644
> > --- a/audio/audio.c
> > +++ b/audio/audio.c
> > @@ -1112,8 +1112,10 @@ static int audio_is_timer_needed (void)
> >  static void audio_reset_timer (AudioState *s)
> >  {
> >  if (audio_is_timer_needed ()) {
> > -timer_mod (s->ts,
> > -qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + conf.period.ticks);
> > +if (!timer_pending(s->ts)) {
> > +timer_mod (s->ts,
> > +qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> conf.period.ticks);
> > +}
> >  }
> >  else {
> >  timer_del (s->ts);
>
>
>
>


-- 
Thanks, Yurii


[Qemu-devel] [PULL 03/14] migration: add MigrationState arg for ram_save_/compressed_/page()

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Pavel Butsykin 

Cosmetic patch. Using the ms variable instead of migrate_get_current()
looks nicer, especially when it is reused.

Signed-off-by: Pavel Butsykin 
Message-Id: <20170203152321.19739-2-pbutsy...@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index ef8fadf..91443b3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -713,13 +713,14 @@ static int save_zero_page(QEMUFile *f, RAMBlock *block, 
ram_addr_t offset,
  *  >=0 - Number of pages written - this might legally be 0
  *if xbzrle noticed the page was the same.
  *
+ * @ms: The current migration state.
  * @f: QEMUFile where to send the data
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  * @last_stage: if we are at the completion stage
  * @bytes_transferred: increase it with the number of transferred bytes
  */
-static int ram_save_page(QEMUFile *f, PageSearchStatus *pss,
+static int ram_save_page(MigrationState *ms, QEMUFile *f, PageSearchStatus 
*pss,
  bool last_stage, uint64_t *bytes_transferred)
 {
 int pages = -1;
@@ -765,8 +766,7 @@ static int ram_save_page(QEMUFile *f, PageSearchStatus *pss,
  */
 xbzrle_cache_zero_page(current_addr);
 } else if (!ram_bulk_stage &&
-   !migration_in_postcopy(migrate_get_current()) &&
-   migrate_use_xbzrle()) {
+   !migration_in_postcopy(ms) && migrate_use_xbzrle()) {
 pages = save_xbzrle_page(f, &p, current_addr, block,
  offset, last_stage, bytes_transferred);
 if (!last_stage) {
@@ -893,14 +893,15 @@ static int compress_page_with_multi_thread(QEMUFile *f, 
RAMBlock *block,
  *
  * Returns: Number of pages written.
  *
+ * @ms: The current migration state.
  * @f: QEMUFile where to send the data
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  * @last_stage: if we are at the completion stage
  * @bytes_transferred: increase it with the number of transferred bytes
  */
-static int ram_save_compressed_page(QEMUFile *f, PageSearchStatus *pss,
-bool last_stage,
+static int ram_save_compressed_page(MigrationState *ms, QEMUFile *f,
+PageSearchStatus *pss, bool last_stage,
 uint64_t *bytes_transferred)
 {
 int pages = -1;
@@ -1231,11 +1232,11 @@ static int ram_save_target_page(MigrationState *ms, 
QEMUFile *f,
 if (migration_bitmap_clear_dirty(dirty_ram_abs)) {
 unsigned long *unsentmap;
 if (compression_switch && migrate_use_compression()) {
-res = ram_save_compressed_page(f, pss,
+res = ram_save_compressed_page(ms, f, pss,
last_stage,
bytes_transferred);
 } else {
-res = ram_save_page(f, pss, last_stage,
+res = ram_save_page(ms, f, pss, last_stage,
 bytes_transferred);
 }
 
-- 
2.9.3




Re: [Qemu-devel] [RFC PATCH 00/41] New op blocker system

2017-02-13 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH 00/41] New op blocker system
Message-id: 1487006583-24350-1-git-send-email-kw...@redhat.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  
patchew/1486747506-15876-1-git-send-email-abolo...@redhat.com -> 
patchew/1486747506-15876-1-git-send-email-abolo...@redhat.com
 * [new tag] patchew/1487006583-24350-1-git-send-email-kw...@redhat.com 
-> patchew/1487006583-24350-1-git-send-email-kw...@redhat.com
Switched to a new branch 'test'
74e9065 block: Assertions for write permissions
e39ccbe block: Pass BdrvChild to bdrv_aligned_preadv/pwritev
7b6003a tests: Remove FIXME comments
779ef72 nbd/server: Use real permissions for NBD exports
2a88fb5 migration/block: Use real permissions
b7208e1 hmp: Request permissions in qemu-io
2fd0afd stream: Use real permissions in streaming block job
d1bf1e8 mirror: Use real permissions in mirror/active commit block job
b53e0a7 block: Allow backing file links in change_parent_backing_link()
04ce3a6 block: BdrvChildRole.attach/detach() callbacks
d27507e block: Fix pending requests check in bdrv_append()
c28f58a backup: Use real permissions in backup block job
9c179f9 commit: Use real permissions for HMP 'commit'
d0affd9 commit: Use real permissions in commit block job
fc0d620 block: Add bdrv_new_open_driver()
c4670f2 block: Factor out bdrv_open_driver()
a0b9c4e blockjob: Add permissions to block_job_add_bdrv()
7ae2792 block: Add BdrvChildRole.stay_at_node
6790ced block: Include details on permission errors in message
135cb50 block: Add BdrvChildRole.get_link_name()
aee95b7 blockjob: Add permissions to block_job_create()
e65d733 hw/block: Introduce share-rw qdev property
abf8d2f hw/block: Request permissions
016d652 block: Allow error return in BlockDevOps.change_media_cb()
f0ce22b block: Request real permissions in blk_new_open()
0560a02 block: Add error parameter to blk_insert_bs()
616c5f5 block: Add permissions to blk_new()
9d7081b block: Add permissions to BlockBackend
53aa264 block: Request real permissions in bdrv_attach_child()
2dc4523 block: Require .bdrv_child_perm() with child nodes
a1cb388 vvfat: Implement .bdrv_child_perm()
c3c7960 block: Request child permissions in format drivers
49ff6dd block: Default .bdrv_child_perm() for format drivers
126dedf block: Request child permissions in filter drivers
6eebee7 block: Default .bdrv_child_perm() for filter drivers
017216c block: Involve block drivers in permission granting
18c9ee8 tests: Use opened block node for block job tests
aa32b84 block: Let callers request permissions when attaching a child node
7abb303 block: Add Error argument to bdrv_attach_child()
0a3b05b block: Add op blocker permission constants
e693833 block: Attach bs->file only during .bdrv_open()

=== OUTPUT BEGIN ===
Checking PATCH 1/41: block: Attach bs->file only during .bdrv_open()...
Checking PATCH 2/41: block: Add op blocker permission constants...
Checking PATCH 3/41: block: Add Error argument to bdrv_attach_child()...
Checking PATCH 4/41: block: Let callers request permissions when attaching a 
child node...
Checking PATCH 5/41: tests: Use opened block node for block job tests...
Checking PATCH 6/41: block: Involve block drivers in permission granting...
ERROR: "foo* bar" should be "foo *bar"
#228: FILE: include/block/block_int.h:357:
+ void (*bdrv_child_perm)(BlockDriverState* bs, BdrvChild *c,

total: 1 errors, 0 warnings, 214 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 7/41: block: Default .bdrv_child_perm() for filter drivers...
Checking PATCH 8/41: block: Request child permissions in filter drivers...
Checking PATCH 9/41: block: Default .bdrv_child_perm() for format drivers...
Checking PATCH 10/41: block: Request child permissions in format drivers...
Checking PATCH 11/41: vvfat: Implement .bdrv_child_perm()...
Checking PATCH 12/41: block: Require .bdrv_child_perm() with child nodes...
Checking PATCH 13/41: block: Request real permissions in bdrv_attach_child()...
Checking PATCH 14/41: block: Add permissions to BlockBackend...
Checking PATCH 15/41: block: Add permissions to blk_new()...
Checking PATCH 16/41: block: Add error parameter to blk_insert_bs()...
Checking PATCH 

Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image

2017-02-13 Thread John Snow


On 02/13/2017 12:16 PM, Daniel P. Berrange wrote:
> On Mon, Feb 13, 2017 at 12:03:35PM -0500, John Snow wrote:
>> Also keep in mind that changing the cluster size will give you different
>> answers, too -- but that different cluster sizes will affect the runtime
>> performance of the image as well.
> 
> This means that apps trying to figure out this future usage have to
> understand fine internal impl details of qcow2 to correctly calculate
> it.
> 

Well, as long as they just want an *estimate* ...

Plus, the spec for qcow2 is open source! :) What internal details? O:-)

>>> We think that the best way to solve this issue is to return this info
>>> from qemu-img, maybe as a flag to qemu-img convert that will
>>> calculate the size of the converted image without doing any writes.
>>>
>>
>> Might not be too hard to add, but it wouldn't necessarily be any more
>> accurate than if you implemented the same logic, I think.
>>
>> Still, it'd be up to us to keep it up to date, but I don't know what
>> guarantees we could provide about the accuracy of the estimate or
>> preventing it from bitrot if there are format changes..
> 
> As opposed to every application trying to implement the logic
> themselves...it'll likely bitrot even worse in 3rd party apps
> as their maintainers won't notice format changes until they
> see a bug report.  Likewise, app developers aren't in a much
> better position wrt accuracy - if anything they'll do a worse
> job at calculating it since they might miss subtle nuances of
> the qcow2 format that qemu developers would more likely get right.
> 

Sure, just cautioning against the idea that we'll be able to provide
anything better than an *estimate*, for all the same reasons it would be
difficult for anyone else to provide anything better than an educated guess.

Was not seriously campaigning against us adding it -- just offering a
pathway to not have to wait for us to do it, since ours likely won't be
much more accurate or stable in any meaningful sense.
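
As a rough illustration of what such an estimate could look like, here is a sketch that computes an upper bound for a fully allocated image. It assumes 8-byte L1/L2/refcount-table entries, 16-bit refcounts and a single header cluster, and ignores snapshots, compression, bitmaps and preallocation modes. qcow2_estimate_clusters() is a made-up name, not a qemu-img interface.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

/*
 * Estimated number of clusters in a fully allocated qcow2 file.
 * Assumptions: 8-byte table entries, 16-bit refcounts, one header
 * cluster, no snapshots. This is an estimate, not a guarantee.
 */
static uint64_t qcow2_estimate_clusters(uint64_t virtual_size,
                                        uint64_t cluster_size)
{
    uint64_t data = div_round_up(virtual_size, cluster_size);
    uint64_t l2_entries = cluster_size / 8;
    uint64_t l2_tables = div_round_up(data, l2_entries);
    uint64_t l1_clusters = div_round_up(l2_tables * 8, cluster_size);
    uint64_t base = 1 + l1_clusters + l2_tables + data;

    /* Refcount structures must also cover themselves, so iterate. */
    uint64_t refs_per_block = cluster_size / 2;
    uint64_t blocks = 0, table = 0, prev;
    do {
        prev = blocks + table;
        blocks = div_round_up(base + blocks + table, refs_per_block);
        table = div_round_up(blocks * 8, cluster_size);
    } while (blocks + table != prev);

    return base + blocks + table;
}
```

Under these assumptions a 1 GiB image needs 16390 clusters of 64 KiB but 517 clusters of 2 MiB, so the answer does move with the cluster size, and a sparse source image would come out smaller than either figure.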

--js

> This isn't just a problem wrt to the usage scenario mentioned in
> this thread. For active VMs, consider you want to determine whether
> you are at risk of overcommitting the filesystem or not. You cannot
> simply sum up the image capacity - you need to know the largest
> size that the qcow2 file is going to grow to in future[1] - this
> again requires the app to calculate overhead of qcow2 metadata to
> understand what they've committed to providing in terms of storage
> 
> Regards,
> Daniel
> 
> [1] There is no upper limit if internal snapshots are used, but if
> we assume use of external snapshots, we should be able to
> calculate the file size commitment.
> 



[Qemu-devel] [PULL 01/14] migration: remove myself as maintainer

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Amit Shah 

I'm switching jobs, and I'm not sure I can continue maintaining migration.

Signed-off-by: Amit Shah 
Message-Id: <1486120416-11566-1-git-send-email-amit.s...@redhat.com>
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Signed-off-by: Dr. David Alan Gilbert 
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7afbada..d8ea161 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1431,7 +1431,6 @@ F: scripts/checkpatch.pl
 
 Migration
 M: Juan Quintela 
-M: Amit Shah 
 M: Dr. David Alan Gilbert 
 S: Maintained
 F: include/migration/
-- 
2.9.3




[Qemu-devel] [PATCH 6/6] coroutine-lock: make CoRwlock thread-safe and fair

2017-02-13 Thread Paolo Bonzini
This adds a CoMutex around the existing CoQueue.  Because the write-side
can just take CoMutex, the old "writer" field is not necessary anymore.
Instead of removing it altogether, count the number of pending writers
during a read-side critical section and forbid further readers from
entering.

Signed-off-by: Paolo Bonzini 
---
 include/qemu/coroutine.h   |  3 ++-
 util/qemu-coroutine-lock.c | 35 ---
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index d2de268..e60beaf 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -204,8 +204,9 @@ bool qemu_co_queue_empty(CoQueue *queue);
 
 
 typedef struct CoRwlock {
-bool writer;
+int pending_writer;
 int reader;
+CoMutex mutex;
 CoQueue queue;
 } CoRwlock;
 
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index b0a554f..6328eed 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -346,16 +346,22 @@ void qemu_co_rwlock_init(CoRwlock *lock)
 {
 memset(lock, 0, sizeof(*lock));
 qemu_co_queue_init(&lock->queue);
+qemu_co_mutex_init(&lock->mutex);
 }
 
 void qemu_co_rwlock_rdlock(CoRwlock *lock)
 {
 Coroutine *self = qemu_coroutine_self();
 
-while (lock->writer) {
-qemu_co_queue_wait(&lock->queue, NULL);
+qemu_co_mutex_lock(&lock->mutex);
+/* For fairness, wait if a writer is in line.  */
+while (lock->pending_writer) {
+qemu_co_queue_wait(&lock->queue, &lock->mutex);
 }
 lock->reader++;
+qemu_co_mutex_unlock(&lock->mutex);
+
+/* The rest of the read-side critical section is run without the mutex.  */
 self->locks_held++;
 }
 
@@ -364,10 +370,13 @@ void qemu_co_rwlock_unlock(CoRwlock *lock)
 Coroutine *self = qemu_coroutine_self();
 
 assert(qemu_in_coroutine());
-if (lock->writer) {
-lock->writer = false;
+if (!lock->reader) {
+/* The critical section started in qemu_co_rwlock_wrlock.  */
 qemu_co_queue_restart_all(&lock->queue);
 } else {
+self->locks_held--;
+
+qemu_co_mutex_lock(&lock->mutex);
 lock->reader--;
 assert(lock->reader >= 0);
 /* Wakeup only one waiting writer */
@@ -375,16 +384,20 @@ void qemu_co_rwlock_unlock(CoRwlock *lock)
 qemu_co_queue_next(&lock->queue);
 }
 }
-self->locks_held--;
+qemu_co_mutex_unlock(&lock->mutex);
 }
 
 void qemu_co_rwlock_wrlock(CoRwlock *lock)
 {
-Coroutine *self = qemu_coroutine_self();
-
-while (lock->writer || lock->reader) {
-qemu_co_queue_wait(&lock->queue, NULL);
+qemu_co_mutex_lock(&lock->mutex);
+lock->pending_writer++;
+while (lock->reader) {
+qemu_co_queue_wait(&lock->queue, &lock->mutex);
 }
-lock->writer = true;
-self->locks_held++;
+lock->pending_writer--;
+
+/* The rest of the write-side critical section is run with
+ * the mutex taken, so that lock->reader remains zero.
+ * There is no need to update self->locks_held.
+ */
 }
-- 
2.9.3
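
The admission policy in this patch can be modelled without coroutines. The sketch below is single-threaded and non-blocking: each helper reports whether the caller would be allowed to proceed instead of sleeping. struct toy_rwlock and the toy_* functions are hypothetical, and the write_held flag stands in for "the writer still holds the CoMutex", which is how the patch keeps readers out during a write-side critical section.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the lock state; not the CoRwlock layout. */
struct toy_rwlock {
    int pending_writer;   /* writers waiting for readers to drain */
    int reader;           /* readers inside their critical section */
    bool write_held;      /* models the writer keeping the mutex */
};

/* A reader is refused while a writer holds the lock or is in line. */
static bool toy_try_rdlock(struct toy_rwlock *l)
{
    if (l->write_held || l->pending_writer > 0) {
        return false;
    }
    l->reader++;
    return true;
}

static void toy_rdunlock(struct toy_rwlock *l)
{
    assert(l->reader > 0);
    l->reader--;
}

/* A writer first announces itself, which blocks new readers. */
static void toy_writer_arrive(struct toy_rwlock *l)
{
    l->pending_writer++;
}

/* It may enter only once the existing readers have left. */
static bool toy_writer_try_enter(struct toy_rwlock *l)
{
    if (l->reader > 0 || l->write_held) {
        return false;
    }
    l->pending_writer--;
    l->write_held = true;
    return true;
}

static void toy_wrunlock(struct toy_rwlock *l)
{
    l->write_held = false;
}
```

The point of counting pending writers is visible in the sequence "reader in, writer arrives, second reader refused": without that refusal a steady stream of readers could starve the writer.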




[Qemu-devel] [PATCH 5/6] coroutine-lock: add mutex argument to CoQueue APIs

2017-02-13 Thread Paolo Bonzini
All that CoQueue needs in order to become thread-safe is help
from an external mutex.  Add this to the API.
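
A sketch of the calling convention, under stated assumptions: the toy_* names and the trace buffer are made up, and the yield is replaced by a trace event. The ordering shown (enqueue while the mutex is still held, then unlock, yield, relock) is my assumption about what such a wait has to do in order not to miss a wakeup; it is not copied from the patch body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_mutex { int locked; };
struct toy_queue { int waiters; };

/* Event trace: L = lock, U = unlock, E = enqueue, Y = yield. */
static char trace[32];
static size_t trace_len;

static void trace_event(char c)
{
    if (trace_len < sizeof(trace) - 1) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

static void toy_mutex_lock(struct toy_mutex *m)
{
    assert(!m->locked);
    m->locked = 1;
    trace_event('L');
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
    assert(m->locked);
    m->locked = 0;
    trace_event('U');
}

/* Wait on the queue, dropping the caller's mutex while asleep. */
static void toy_queue_wait(struct toy_queue *q, struct toy_mutex *m)
{
    q->waiters++;             /* enqueue under the caller's lock */
    trace_event('E');
    if (m) {
        toy_mutex_unlock(m);
    }
    trace_event('Y');         /* stand-in for yielding the coroutine */
    q->waiters--;             /* woken up again */
    if (m) {
        toy_mutex_lock(m);
    }
}
```

Callers that do not need a mutex pass NULL, which matches most of the conversions in the diff. Callers that previously wrapped the wait in an unlock/lock pair can pass their mutex instead.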

Signed-off-by: Paolo Bonzini 
---
 block/backup.c |  2 +-
 block/io.c |  4 ++--
 block/nbd-client.c |  2 +-
 block/qcow2-cluster.c  |  4 +---
 block/sheepdog.c   |  2 +-
 block/throttle-groups.c|  2 +-
 hw/9pfs/9p.c   |  2 +-
 include/qemu/coroutine.h   |  8 +---
 util/qemu-coroutine-lock.c | 24 +---
 9 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index ea38733..fe010e7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -64,7 +64,7 @@ static void coroutine_fn 
wait_for_overlapping_requests(BackupBlockJob *job,
 retry = false;
 QLIST_FOREACH(req, &job->inflight_reqs, list) {
 if (end > req->start && start < req->end) {
-qemu_co_queue_wait(&req->wait_queue);
+qemu_co_queue_wait(&req->wait_queue, NULL);
 retry = true;
 break;
 }
diff --git a/block/io.c b/block/io.c
index a5c7d36..d5c4544 100644
--- a/block/io.c
+++ b/block/io.c
@@ -539,7 +539,7 @@ static bool coroutine_fn 
wait_serialising_requests(BdrvTrackedRequest *self)
  * (instead of producing a deadlock in the former case). */
 if (!req->waiting_for) {
 self->waiting_for = req;
-qemu_co_queue_wait(&req->wait_queue);
+qemu_co_queue_wait(&req->wait_queue, NULL);
 self->waiting_for = NULL;
 retry = true;
 waited = true;
@@ -2275,7 +2275,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 
 /* Wait until any previous flushes are completed */
 while (bs->active_flush_req) {
-qemu_co_queue_wait(&bs->flush_queue);
+qemu_co_queue_wait(&bs->flush_queue, NULL);
 }
 
 bs->active_flush_req = true;
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 10fcc9e..0dc12c2 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -182,7 +182,7 @@ static void nbd_coroutine_start(NBDClientSession *s,
 /* Poor man semaphore.  The free_sema is locked when no other request
  * can be accepted, and unlocked after receiving one reply.  */
 if (s->in_flight == MAX_NBD_REQUESTS) {
-qemu_co_queue_wait(&s->free_sema);
+qemu_co_queue_wait(&s->free_sema, NULL);
 assert(s->in_flight < MAX_NBD_REQUESTS);
 }
 s->in_flight++;
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 928c1e2..78c11d4 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -932,9 +932,7 @@ static int handle_dependencies(BlockDriverState *bs, 
uint64_t guest_offset,
 if (bytes == 0) {
 /* Wait for the dependency to complete. We need to recheck
  * the free/allocated clusters when we continue. */
-qemu_co_mutex_unlock(&s->lock);
-qemu_co_queue_wait(&old_alloc->dependent_requests);
-qemu_co_mutex_lock(&s->lock);
+qemu_co_queue_wait(&old_alloc->dependent_requests, &s->lock);
 return -EAGAIN;
 }
 }
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 32c4e4c..860ba61 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -486,7 +486,7 @@ static void wait_for_overlapping_aiocb(BDRVSheepdogState 
*s, SheepdogAIOCB *acb)
 retry:
 QLIST_FOREACH(cb, &s->inflight_aiocb_head, aiocb_siblings) {
 if (AIOCBOverlapping(acb, cb)) {
-qemu_co_queue_wait(&s->overlapping_queue);
+qemu_co_queue_wait(&s->overlapping_queue, NULL);
 goto retry;
 }
 }
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index aade5de..b73e7a8 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -326,7 +326,7 @@ void coroutine_fn 
throttle_group_co_io_limits_intercept(BlockBackend *blk,
 if (must_wait || blkp->pending_reqs[is_write]) {
 blkp->pending_reqs[is_write]++;
 qemu_mutex_unlock(&tg->lock);
-qemu_co_queue_wait(&blkp->throttled_reqs[is_write]);
+qemu_co_queue_wait(&blkp->throttled_reqs[is_write], NULL);
 qemu_mutex_lock(&tg->lock);
 blkp->pending_reqs[is_write]--;
 }
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 99e9472..3af1c93 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2374,7 +2374,7 @@ static void coroutine_fn v9fs_flush(void *opaque)
 /*
  * Wait for pdu to complete.
  */
-qemu_co_queue_wait(&cancel_pdu->complete);
+qemu_co_queue_wait(&cancel_pdu->complete, NULL);
 cancel_pdu->cancelled = 0;
 pdu_free(cancel_pdu);
 }
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9f68579..d2de268 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -160,7 +160,8 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 
 /**
  * 

[Qemu-devel] [PATCH 2/6] coroutine-lock: add limited spinning to CoMutex

2017-02-13 Thread Paolo Bonzini
Running a very small critical section on pthread_mutex_t and CoMutex
shows that pthread_mutex_t is much faster because it doesn't actually
go to sleep.  What happens is that the critical section is shorter
than the latency of entering the kernel and thus FUTEX_WAIT always
fails.  With CoMutex there is no such latency but you still want to
avoid wait and wakeup.  So introduce it artificially.

This only works with one waiter; because CoMutex is fair, it will
always have more waits and wakeups than a pthread_mutex_t.
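
The idea of spinning briefly before queueing can be sketched with C11 atomics. This is not the CoMutex code: struct toy_lock, toy_lock_fast() and SPIN_LIMIT are illustrative names, the same-AioContext shortcut is left out, and the slow path is reduced to "the caller has been counted as a waiter and must now sleep".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_LIMIT 1000          /* same bound as the loop in the patch */

/* locked == 0: free; locked == n: one holder plus n - 1 waiters. */
struct toy_lock {
    atomic_uint locked;
};

/*
 * Returns true if the lock was taken on the fast path. Returns false
 * if the caller was registered as a waiter and must use a slow path.
 * *spins_out reports how many spin iterations were consumed.
 */
static bool toy_lock_fast(struct toy_lock *l, int *spins_out)
{
    int i = 0;

    for (;;) {
        unsigned observed = 0;
        if (atomic_compare_exchange_strong(&l->locked, &observed, 1)) {
            *spins_out = i;
            return true;                  /* uncontended */
        }

        /* Spin only while there is a holder and nobody else waiting. */
        bool retry = false;
        while (observed == 1 && ++i < SPIN_LIMIT) {
            if (atomic_load(&l->locked) == 0) {
                retry = true;             /* released: try the CAS again */
                break;
            }
            /* a pause instruction (cpu_relax) would go here */
        }
        if (!retry) {
            break;
        }
    }

    *spins_out = i;
    /* Give up spinning and join the waiters. */
    return atomic_fetch_add(&l->locked, 1) == 0;
}
```

The spin is skipped entirely once a waiter is already queued, which is the reason the commit message says the trick only helps with a single waiter.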

Signed-off-by: Paolo Bonzini 
---
 include/qemu/coroutine.h   |  5 +
 util/qemu-coroutine-lock.c | 51 --
 util/qemu-coroutine.c  |  2 +-
 3 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index fce228f..12ce8e1 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -167,6 +167,11 @@ typedef struct CoMutex {
  */
 unsigned locked;
 
+/* Context that is holding the lock.  Useful to avoid spinning
+ * when two coroutines on the same AioContext try to get the lock. :)
+ */
+AioContext *ctx;
+
 /* A queue of waiters.  Elements are added atomically in front of
  * from_push.  to_pop is only populated, and popped from, by whoever
  * is in charge of the next wakeup.  This can be an unlocker or,
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 25da9fa..73fe77c 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -30,6 +30,7 @@
 #include "qemu-common.h"
 #include "qemu/coroutine.h"
 #include "qemu/coroutine_int.h"
+#include "qemu/processor.h"
 #include "qemu/queue.h"
 #include "block/aio.h"
 #include "trace.h"
@@ -181,7 +182,18 @@ void qemu_co_mutex_init(CoMutex *mutex)
 memset(mutex, 0, sizeof(*mutex));
 }
 
-static void coroutine_fn qemu_co_mutex_lock_slowpath(CoMutex *mutex)
+static void coroutine_fn qemu_co_mutex_wake(CoMutex *mutex, Coroutine *co)
+{
+/* Read co before co->ctx; pairs with smp_wmb() in
+ * qemu_coroutine_enter().
+ */
+smp_read_barrier_depends();
+mutex->ctx = co->ctx;
+aio_co_wake(co);
+}
+
+static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
+ CoMutex *mutex)
 {
 Coroutine *self = qemu_coroutine_self();
 CoWaitRecord w;
@@ -206,10 +218,11 @@ static void coroutine_fn 
qemu_co_mutex_lock_slowpath(CoMutex *mutex)
 if (co == self) {
 /* We got the lock ourselves!  */
 assert(to_wake == );
+mutex->ctx = ctx;
 return;
 }
 
-aio_co_wake(co);
+qemu_co_mutex_wake(mutex, co);
 }
 
 qemu_coroutine_yield();
@@ -218,13 +231,39 @@ static void coroutine_fn 
qemu_co_mutex_lock_slowpath(CoMutex *mutex)
 
 void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex)
 {
+AioContext *ctx = qemu_get_current_aio_context();
 Coroutine *self = qemu_coroutine_self();
+int waiters, i;
+
+/* Running a very small critical section on pthread_mutex_t and CoMutex
+ * shows that pthread_mutex_t is much faster because it doesn't actually
+ * go to sleep.  What happens is that the critical section is shorter
+ * than the latency of entering the kernel and thus FUTEX_WAIT always
+ * fails.  With CoMutex there is no such latency but you still want to
+ * avoid wait and wakeup.  So introduce it artificially.
+ */
+i = 0;
+retry_fast_path:
+waiters = atomic_cmpxchg(&mutex->locked, 0, 1);
+if (waiters != 0) {
+while (waiters == 1 && ++i < 1000) {
+if (atomic_read(&mutex->ctx) == ctx) {
+break;
+}
+if (atomic_read(&mutex->locked) == 0) {
+goto retry_fast_path;
+}
+cpu_relax();
+}
+waiters = atomic_fetch_inc(&mutex->locked);
+}
 
-if (atomic_fetch_inc(&mutex->locked) == 0) {
+if (waiters == 0) {
 /* Uncontended.  */
 trace_qemu_co_mutex_lock_uncontended(mutex, self);
+mutex->ctx = ctx;
 } else {
-qemu_co_mutex_lock_slowpath(mutex);
+qemu_co_mutex_lock_slowpath(ctx, mutex);
 }
 mutex->holder = self;
 self->locks_held++;
@@ -240,6 +279,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
 assert(mutex->holder == self);
 assert(qemu_in_coroutine());
 
+mutex->ctx = NULL;
 mutex->holder = NULL;
 self->locks_held--;
 if (atomic_fetch_dec(&mutex->locked) == 1) {
@@ -252,8 +292,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
 unsigned our_handoff;
 
 if (to_wake) {
-Coroutine *co = to_wake->co;
-aio_co_wake(co);
+qemu_co_mutex_wake(mutex, to_wake->co);
 break;
 }
 
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 415600d..72412e5 100644
--- a/util/qemu-coroutine.c
+++ 

[Qemu-devel] [PATCH 3/6] test-aio-multithread: add performance comparison with thread-based mutexes

2017-02-13 Thread Paolo Bonzini
Add two implementations of the same benchmark as the previous patch,
but using pthreads.  One uses a normal QemuMutex, the other is Linux
only and implements a fair mutex based on MCS locks and futexes.
This shows that the slower performance of the 5-thread case is due to
the fairness of CoMutex, rather than to coroutines.  If fairness does
not matter, as is the case with two threads, CoMutex can actually be
faster than pthreads.

Signed-off-by: Paolo Bonzini 
---
 tests/test-aio-multithread.c | 164 +++
 1 file changed, 164 insertions(+)

diff --git a/tests/test-aio-multithread.c b/tests/test-aio-multithread.c
index ada8c48..e7256a9 100644
--- a/tests/test-aio-multithread.c
+++ b/tests/test-aio-multithread.c
@@ -278,6 +278,162 @@ static void test_multi_co_mutex_2_30(void)
 test_multi_co_mutex(2, 30);
 }
 
+/* Same test with fair mutexes, for performance comparison.  */
+
+#ifdef CONFIG_LINUX
+#include "qemu/futex.h"
+
+/* The nodes for the mutex reside in this structure (on which we try to avoid
+ * false sharing).  The head of the mutex is in the "mutex_head" variable.
+ */
+static struct {
+int next, locked;
+int padding[14];
+} nodes[NUM_CONTEXTS] __attribute__((__aligned__(64)));
+
+static int mutex_head = -1;
+
+static void mcs_mutex_lock(void)
+{
+int prev;
+
+nodes[id].next = -1;
+nodes[id].locked = 1;
+prev = atomic_xchg(&mutex_head, id);
+if (prev != -1) {
+atomic_set(&nodes[prev].next, id);
+qemu_futex_wait(&nodes[id].locked, 1);
+}
+}
+
+static void mcs_mutex_unlock(void)
+{
+int next;
+if (nodes[id].next == -1) {
+if (atomic_read(&mutex_head) == id &&
+atomic_cmpxchg(&mutex_head, id, -1) == id) {
+/* Last item in the list, exit.  */
+return;
+}
+while (atomic_read(&nodes[id].next) == -1) {
+/* mcs_mutex_lock did the xchg, but has not updated
+ * nodes[prev].next yet.
+ */
+}
+}
+
+/* Wake up the next in line.  */
+next = nodes[id].next;
+nodes[next].locked = 0;
+qemu_futex_wake(&nodes[next].locked, 1);
+}
+
+static void test_multi_fair_mutex_entry(void *opaque)
+{
+while (!atomic_mb_read(&now_stopping)) {
+mcs_mutex_lock();
+counter++;
+mcs_mutex_unlock();
+atomic_inc(&atomic_counter);
+}
+atomic_dec(&running);
+}
+
+static void test_multi_fair_mutex(int threads, int seconds)
+{
+int i;
+
+assert(mutex_head == -1);
+counter = 0;
+atomic_counter = 0;
+now_stopping = false;
+
+create_aio_contexts();
+assert(threads <= NUM_CONTEXTS);
+running = threads;
+for (i = 0; i < threads; i++) {
+Coroutine *co1 = qemu_coroutine_create(test_multi_fair_mutex_entry, 
NULL);
+aio_co_schedule(ctx[i], co1);
+}
+
+g_usleep(seconds * 100);
+
+atomic_mb_set(&now_stopping, true);
+while (running > 0) {
+g_usleep(10);
+}
+
+join_aio_contexts();
+g_test_message("%d iterations/second\n", counter / seconds);
+g_assert_cmpint(counter, ==, atomic_counter);
+}
+
+static void test_multi_fair_mutex_1(void)
+{
+test_multi_fair_mutex(NUM_CONTEXTS, 1);
+}
+
+static void test_multi_fair_mutex_10(void)
+{
+test_multi_fair_mutex(NUM_CONTEXTS, 10);
+}
+#endif
+
+/* Same test with pthread mutexes, for performance comparison and
+ * portability.  */
+
+static QemuMutex mutex;
+
+static void test_multi_mutex_entry(void *opaque)
+{
+while (!atomic_mb_read(&now_stopping)) {
+qemu_mutex_lock(&mutex);
+counter++;
+qemu_mutex_unlock(&mutex);
+atomic_inc(&atomic_counter);
+}
+atomic_dec(&running);
+}
+
+static void test_multi_mutex(int threads, int seconds)
+{
+int i;
+
+qemu_mutex_init(&mutex);
+counter = 0;
+atomic_counter = 0;
+now_stopping = false;
+
+create_aio_contexts();
+assert(threads <= NUM_CONTEXTS);
+running = threads;
+for (i = 0; i < threads; i++) {
+Coroutine *co1 = qemu_coroutine_create(test_multi_mutex_entry, NULL);
+aio_co_schedule(ctx[i], co1);
+}
+
+g_usleep(seconds * 1000000);
+
+atomic_mb_set(&now_stopping, true);
+while (running > 0) {
+g_usleep(100000);
+}
+
+join_aio_contexts();
+g_test_message("%d iterations/second\n", counter / seconds);
+g_assert_cmpint(counter, ==, atomic_counter);
+}
+
+static void test_multi_mutex_1(void)
+{
+test_multi_mutex(NUM_CONTEXTS, 1);
+}
+
+static void test_multi_mutex_10(void)
+{
+test_multi_mutex(NUM_CONTEXTS, 10);
+}
+
 /* End of tests.  */
 
 int main(int argc, char **argv)
@@ -290,10 +446,18 @@ int main(int argc, char **argv)
 g_test_add_func("/aio/multi/schedule", test_multi_co_schedule_1);
 g_test_add_func("/aio/multi/mutex/contended", test_multi_co_mutex_1);
 g_test_add_func("/aio/multi/mutex/handoff", test_multi_co_mutex_2_3);
+#ifdef CONFIG_LINUX
+g_test_add_func("/aio/multi/mutex/mcs", test_multi_fair_mutex_1);

Re: [Qemu-devel] [PATCH 09/17] migration: Start of multiple fd work

2017-02-13 Thread Daniel P. Berrange
On Mon, Jan 23, 2017 at 10:32:13PM +0100, Juan Quintela wrote:
> We create new channels for each new thread created. We only send through
> them a character to be sure that we are creating the channels in the
> right order.
> 
> Note: Reference count/freeing of channels is not done
> 
> Signed-off-by: Juan Quintela 
> ---
>  include/migration/migration.h |  6 +
>  migration/ram.c   | 45 +-
>  migration/socket.c| 56 
> +--

BTW, right now libvirt never uses QEMU's tcp: protocol - it does everything
with the fd: protocol.  So either we need multi-fd support for fd: protocol,
or libvirt needs to switch to use tcp:

In fact, having said that, we're going to have to switch to using the tcp:
protocol anyway in order to support TLS, so this is just another good
reason for the switch.

We avoided tcp: in the past because QEMU was incapable of reporting error
messages when the connection failed. That's fixed since

  commit d59ce6f34434bf47a9b26138c908650bf9a24be1
  Author: Daniel P. Berrange 
  Date:   Wed Apr 27 11:05:00 2016 +0100

migration: add reporting of errors for outgoing migration

so libvirt should be ok to use tcp: now.

>  3 files changed, 104 insertions(+), 3 deletions(-)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index f119ba0..3989bd6 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -22,6 +22,7 @@
>  #include "qapi-types.h"
>  #include "exec/cpu-common.h"
>  #include "qemu/coroutine_int.h"
> +#include "io/channel.h"
> 
>  #define QEMU_VM_FILE_MAGIC   0x5145564d
>  #define QEMU_VM_FILE_VERSION_COMPAT  0x0002
> @@ -218,6 +219,11 @@ void tcp_start_incoming_migration(const char *host_port, 
> Error **errp);
> 
>  void tcp_start_outgoing_migration(MigrationState *s, const char *host_port, 
> Error **errp);
> 
> +QIOChannel *socket_recv_channel_create(void);
> +int socket_recv_channel_destroy(QIOChannel *recv);
> +QIOChannel *socket_send_channel_create(void);
> +int socket_send_channel_destroy(QIOChannel *send);
> +
>  void unix_start_incoming_migration(const char *path, Error **errp);
> 
>  void unix_start_outgoing_migration(MigrationState *s, const char *path, 
> Error **errp);
> diff --git a/migration/ram.c b/migration/ram.c
> index 939f364..5ad7cb3 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -386,9 +386,11 @@ void migrate_compress_threads_create(void)
> 
>  struct MultiFDSendParams {
>  QemuThread thread;
> +QIOChannel *c;
>  QemuCond cond;
>  QemuMutex mutex;
>  bool quit;
> +bool started;
>  };
>  typedef struct MultiFDSendParams MultiFDSendParams;
> 
> @@ -397,6 +399,13 @@ static MultiFDSendParams *multifd_send;
>  static void *multifd_send_thread(void *opaque)
>  {
>  MultiFDSendParams *params = opaque;
> +char start = 's';
> +
> +qio_channel_write(params->c, &start, 1, &error_abort);
> +qemu_mutex_lock(&params->mutex);
> +params->started = true;
> +qemu_cond_signal(&params->cond);
> +qemu_mutex_unlock(&params->mutex);
> 
>  qemu_mutex_lock(&params->mutex);
>  while (!params->quit){
> @@ -433,6 +442,7 @@ void migrate_multifd_send_threads_join(void)
>  qemu_thread_join(&multifd_send[i].thread);
>  qemu_mutex_destroy(&multifd_send[i].mutex);
>  qemu_cond_destroy(&multifd_send[i].cond);
> +socket_send_channel_destroy(multifd_send[i].c);
>  }
>  g_free(multifd_send);
>  multifd_send = NULL;
> @@ -452,18 +462,31 @@ void migrate_multifd_send_threads_create(void)
>  qemu_mutex_init(&multifd_send[i].mutex);
>  qemu_cond_init(&multifd_send[i].cond);
>  multifd_send[i].quit = false;
> +multifd_send[i].started = false;
> +multifd_send[i].c = socket_send_channel_create();
> +if(!multifd_send[i].c) {
> +error_report("Error creating a send channel");
> +exit(0);
> +}
>  snprintf(thread_name, 15, "multifd_send_%d", i);
>  qemu_thread_create(&multifd_send[i].thread, thread_name,
> multifd_send_thread, &multifd_send[i],
> QEMU_THREAD_JOINABLE);
> +qemu_mutex_lock(&multifd_send[i].mutex);
> +while (!multifd_send[i].started) {
> +qemu_cond_wait(&multifd_send[i].cond, &multifd_send[i].mutex);
> +}
> +qemu_mutex_unlock(&multifd_send[i].mutex);
>  }
>  }
> 
>  struct MultiFDRecvParams {
>  QemuThread thread;
> +QIOChannel *c;
>  QemuCond cond;
>  QemuMutex mutex;
>  bool quit;
> +bool started;
>  };
>  typedef struct MultiFDRecvParams MultiFDRecvParams;
> 
> @@ -472,7 +495,14 @@ static MultiFDRecvParams *multifd_recv;
>  static void *multifd_recv_thread(void *opaque)
>  {
>  MultiFDRecvParams *params = opaque;
> - 
> +char start;
> +
> +qio_channel_read(params->c, &start, 1, &error_abort);
> +qemu_mutex_lock(&params->mutex);
> +params->started = true;
> +

[Qemu-devel] [PULL 10/14] COLO: Don't process failover request while loading VM's state

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: zhanghailiang 

We should not do failover work while the main thread is loading
the VM's state. Otherwise the consistency of the VM's memory and
device state will be broken.

We will restart the failover process once the loading stage is over.
The new failover status 'RELAUNCH' records whether we need to
restart the process.

Cc: Eric Blake 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Reviewed-by: Dr. David Alan Gilbert 
Message-Id: <1484657864-21708-4-git-send-email-zhang.zhanghaili...@huawei.com>
Signed-off-by: Dr. David Alan Gilbert 
   Added a missing '(Since 2.9)'
---
 migration/colo.c | 26 ++
 qapi-schema.json |  4 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 3222812..712308e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -20,6 +20,8 @@
 #include "qapi/error.h"
 #include "migration/failover.h"
 
+static bool vmstate_loading;
+
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
 bool colo_supported(void)
@@ -51,6 +53,19 @@ static void secondary_vm_do_failover(void)
 int old_state;
 MigrationIncomingState *mis = migration_incoming_get_current();
 
+/* Can not do failover during the process of VM's loading VMstate, Or
+ * it will break the secondary VM.
+ */
+if (vmstate_loading) {
+old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
+FAILOVER_STATUS_RELAUNCH);
+if (old_state != FAILOVER_STATUS_ACTIVE) {
+error_report("Unknown error while do failover for secondary VM,"
+ "old_state: %s", FailoverStatus_lookup[old_state]);
+}
+return;
+}
+
 migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
 
@@ -548,13 +563,23 @@ void *colo_process_incoming_thread(void *opaque)
 
 qemu_mutex_lock_iothread();
 qemu_system_reset(VMRESET_SILENT);
+vmstate_loading = true;
 if (qemu_loadvm_state(fb) < 0) {
 error_report("COLO: loadvm failed");
 qemu_mutex_unlock_iothread();
 goto out;
 }
+
+vmstate_loading = false;
 qemu_mutex_unlock_iothread();
 
+if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
+failover_set_state(FAILOVER_STATUS_RELAUNCH,
+FAILOVER_STATUS_NONE);
+failover_request_active(NULL);
+goto out;
+}
+
 colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
   &local_err);
 if (local_err) {
@@ -563,6 +588,7 @@ void *colo_process_incoming_thread(void *opaque)
 }
 
 out:
+vmstate_loading = false;
 /* Throw the unreported error message after exited from loop */
 if (local_err) {
 error_report_err(local_err);
diff --git a/qapi-schema.json b/qapi-schema.json
index 9330541..5edb08d 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1193,10 +1193,12 @@
 #
 # @completed: finish the process of failover
 #
+# @relaunch: restart the failover process, from 'none' -> 'completed' (Since 2.9)
+#
 # Since: 2.8
 ##
 { 'enum': 'FailoverStatus',
-  'data': [ 'none', 'require', 'active', 'completed'] }
+  'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
 
 ##
 # @x-colo-lost-heartbeat:
-- 
2.9.3




[Qemu-devel] [PATCH 4/6] coroutine-lock: place CoMutex before CoQueue in header

2017-02-13 Thread Paolo Bonzini
This will avoid forward references in the next patch.  It is also
more logical because CoQueue is no longer the basic primitive.

Signed-off-by: Paolo Bonzini 
---
 include/qemu/coroutine.h | 89 
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 12ce8e1..9f68579 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -112,51 +112,6 @@ bool qemu_in_coroutine(void);
  */
 bool qemu_coroutine_entered(Coroutine *co);
 
-
-/**
- * CoQueues are a mechanism to queue coroutines in order to continue executing
- * them later. They provide the fundamental primitives on which coroutine locks
- * are built.
- */
-typedef struct CoQueue {
-QSIMPLEQ_HEAD(, Coroutine) entries;
-} CoQueue;
-
-/**
- * Initialise a CoQueue. This must be called before any other operation is used
- * on the CoQueue.
- */
-void qemu_co_queue_init(CoQueue *queue);
-
-/**
- * Adds the current coroutine to the CoQueue and transfers control to the
- * caller of the coroutine.
- */
-void coroutine_fn qemu_co_queue_wait(CoQueue *queue);
-
-/**
- * Restarts the next coroutine in the CoQueue and removes it from the queue.
- *
- * Returns true if a coroutine was restarted, false if the queue is empty.
- */
-bool coroutine_fn qemu_co_queue_next(CoQueue *queue);
-
-/**
- * Restarts all coroutines in the CoQueue and leaves the queue empty.
- */
-void coroutine_fn qemu_co_queue_restart_all(CoQueue *queue);
-
-/**
- * Enter the next coroutine in the queue
- */
-bool qemu_co_enter_next(CoQueue *queue);
-
-/**
- * Checks if the CoQueue is empty.
- */
-bool qemu_co_queue_empty(CoQueue *queue);
-
-
 /**
  * Provides a mutex that can be used to synchronise coroutines
  */
@@ -202,6 +157,50 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
  */
 void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 
+
+/**
+ * CoQueues are a mechanism to queue coroutines in order to continue executing
+ * them later.
+ */
+typedef struct CoQueue {
+QSIMPLEQ_HEAD(, Coroutine) entries;
+} CoQueue;
+
+/**
+ * Initialise a CoQueue. This must be called before any other operation is used
+ * on the CoQueue.
+ */
+void qemu_co_queue_init(CoQueue *queue);
+
+/**
+ * Adds the current coroutine to the CoQueue and transfers control to the
+ * caller of the coroutine.
+ */
+void coroutine_fn qemu_co_queue_wait(CoQueue *queue);
+
+/**
+ * Restarts the next coroutine in the CoQueue and removes it from the queue.
+ *
+ * Returns true if a coroutine was restarted, false if the queue is empty.
+ */
+bool coroutine_fn qemu_co_queue_next(CoQueue *queue);
+
+/**
+ * Restarts all coroutines in the CoQueue and leaves the queue empty.
+ */
+void coroutine_fn qemu_co_queue_restart_all(CoQueue *queue);
+
+/**
+ * Enter the next coroutine in the queue
+ */
+bool qemu_co_enter_next(CoQueue *queue);
+
+/**
+ * Checks if the CoQueue is empty.
+ */
+bool qemu_co_queue_empty(CoQueue *queue);
+
+
 typedef struct CoRwlock {
 bool writer;
 int reader;
-- 
2.9.3





Re: [Qemu-devel] [PATCH v2 00/16] Postcopy: Hugepage support

2017-02-13 Thread Andrea Arcangeli
On Mon, Feb 13, 2017 at 06:57:22PM +0100, Andrea Arcangeli wrote:
> Hello,
> 
> On Mon, Feb 13, 2017 at 08:11:06PM +0300, Alexey Perevalov wrote:
> > Another one request.
> > QEMU could use mem_path in hugefs with share key simultaneously
> > (-object 
> > memory-backend-file,id=mem,size=${mem_size},mem-path=${mem_path},share=on) 
> > and vm
> > in this case will start and will properly work (it will allocate memory
> > with mmap), but in case of destination for postcopy live migration
> > UFFDIO_COPY ioctl will fail for
> > such region, in Arcangeli's git tree there is such prevent check
> > (if (!vma_is_shmem(dst_vma) && dst_vma->vm_flags & VM_SHARED).
> > Is it possible to handle such situation at qemu?
> 
> It'd be nice to lift this hugetlbfs !VM_SHARED restriction I agree, I
> already asked Mike (CC'ed) why is there, because I'm afraid it's a

I Cc'ed a nonexistent address (mail client autocompletion error); I have
corrected the CC.

> leftover from the anon version where VM_SHARED means a very different
> thing but it was already lifted for shmem. share=on should already
> work on top of tmpfs and also with THP on tmpfs enabled.
> 
> For hugetlbfs and shmem it should be generally more complicated to
> cope with private mappings than shared ones, shared is just the native
> form of the pseudofs without having to deal with private COWs aliases
> so it's hard to imagine something going wrong for VM_SHARED if the
> MAP_PRIVATE mapping already works fine. If it turns out to be
> superflous the check may be just turned into
> "vma_is_anonymous(dst_vma) && dst_vma->vm_flags & VM_SHARED".
> 
> Thanks,
> Andrea



[Qemu-devel] [RFC PATCH 40/41] block: Pass BdrvChild to bdrv_aligned_preadv/pwritev

2017-02-13 Thread Kevin Wolf
This is where we want to check the permissions, so we need to have the
BdrvChild around where they are stored.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/block/io.c b/block/io.c
index c42b34a..cb2feff 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1001,10 +1001,11 @@ err:
  * handles copy on read, zeroing after EOF, and fragmentation of large
  * reads; any other features must be implemented by the caller.
  */
-static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
+static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
 BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
 int64_t align, QEMUIOVector *qiov, int flags)
 {
+BlockDriverState *bs = child->bs;
 int64_t total_bytes, max_bytes;
 int ret = 0;
 uint64_t bytes_remaining = bytes;
@@ -1158,7 +1159,7 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child,
 }
 
 tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_READ);
-ret = bdrv_aligned_preadv(bs, &req, offset, bytes, align,
+ret = bdrv_aligned_preadv(child, &req, offset, bytes, align,
   use_local_qiov ? &local_qiov : qiov,
   flags);
 tracked_request_end(&req);
@@ -1306,10 +1307,11 @@ fail:
  * Forwards an already correctly aligned write request to the BlockDriver,
  * after possibly fragmenting it.
  */
-static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
+static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
 int64_t align, QEMUIOVector *qiov, int flags)
 {
+BlockDriverState *bs = child->bs;
 BlockDriver *drv = bs->drv;
 bool waited;
 int ret;
@@ -1397,12 +1399,13 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
 return ret;
 }
 
-static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
+static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
 int64_t offset,
 unsigned int bytes,
 BdrvRequestFlags flags,
 BdrvTrackedRequest *req)
 {
+BlockDriverState *bs = child->bs;
 uint8_t *buf = NULL;
 QEMUIOVector local_qiov;
 struct iovec iov;
@@ -1430,7 +1433,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
 mark_request_serialising(req, align);
 wait_serialising_requests(req);
 bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_HEAD);
-ret = bdrv_aligned_preadv(bs, req, offset & ~(align - 1), align,
+ret = bdrv_aligned_preadv(child, req, offset & ~(align - 1), align,
   align, &local_qiov, 0);
 if (ret < 0) {
 goto fail;
@@ -1438,7 +1441,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
 bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_AFTER_HEAD);
 
 memset(buf + head_padding_bytes, 0, zero_bytes);
-ret = bdrv_aligned_pwritev(bs, req, offset & ~(align - 1), align,
+ret = bdrv_aligned_pwritev(child, req, offset & ~(align - 1), align,
 align, &local_qiov,
flags & ~BDRV_REQ_ZERO_WRITE);
 if (ret < 0) {
@@ -1452,7 +1455,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
 if (bytes >= align) {
 /* Write the aligned part in the middle. */
 uint64_t aligned_bytes = bytes & ~(align - 1);
-ret = bdrv_aligned_pwritev(bs, req, offset, aligned_bytes, align,
+ret = bdrv_aligned_pwritev(child, req, offset, aligned_bytes, align,
NULL, flags);
 if (ret < 0) {
 goto fail;
@@ -1468,7 +1471,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
 mark_request_serialising(req, align);
 wait_serialising_requests(req);
 bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_TAIL);
-ret = bdrv_aligned_preadv(bs, req, offset, align,
+ret = bdrv_aligned_preadv(child, req, offset, align,
   align, &local_qiov, 0);
 if (ret < 0) {
 goto fail;
@@ -1476,7 +1479,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
 bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);
 
 memset(buf, 0, bytes);
-ret = bdrv_aligned_pwritev(bs, req, offset, align, align,
+ret = bdrv_aligned_pwritev(child, req, offset, align, align,
 &local_qiov, flags & ~BDRV_REQ_ZERO_WRITE);
 }
 fail:
@@ -1523,7 +1526,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
 tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
 
 if (!qiov) {
-ret 

[Qemu-devel] [PATCH 0/6] Make CoMutex/CoQueue/CoRwlock thread-safe

2017-02-13 Thread Paolo Bonzini
This is yet another tiny bit of the multiqueue work, this time affecting
the synchronization infrastructure for coroutines.  Currently, coroutines
synchronize between the main I/O thread and the dataplane iothread through
the AioContext lock.  However, for multiqueue a single BDS will be used
by multiple iothreads and hence multiple AioContexts.  This calls for
a different approach to coroutine synchronization, and this series is my
attempt.

After the previous series, coroutines are already bound to an AioContext
while they wait on a CoMutex.  Of course currently a CoMutex is generally
not used across multiple iothreads, because you have to acquire/release
the AioContext around CoMutex critical sections.  This series is the
missing link between the aio_co_schedule/aio_co_wake infrastructure and
making BDS thread-safe: by making CoMutex thread-safe, it removes
the need for a "thread-based" mutex around it.

This will still need some changes in the formats because, for multiqueue,
CoMutexes would need to be used like "real" thread mutexes.  Code like
this:

...
qemu_co_mutex_unlock()
... /* still access shared data, but don't yield */
qemu_coroutine_yield()

might be required to use this other pattern:

... /* access shared data, but don't yield */
qemu_co_mutex_unlock()
qemu_coroutine_yield()

because adding a second AioContext is already introducing concurrency that
wasn't there before.  Still, even if you have to take concurrency into
account, multitasking between coroutines remains non-preemptive.  So for
example, it is easy to synchronize between qemu_co_mutex_lock's yield
and the qemu_coroutine_enter in aio_co_schedule's bottom half.

CoMutex puts coroutines to sleep with qemu_coroutine_yield and wakes them
up with aio_co_wake.  I could have wrapped CoMutex's CoQueue with
a "regular" thread mutex or spinlock.  The resulting code would
have looked a lot like RFifoLock (with CoQueue replacing RFifoLock's
condition variable).  Instead, CoMutex is implemented from scratch and
CoQueue is made to depend on a CoMutex, similar to condition variables.
Most CoQueues already have a corresponding CoMutex so this is not a big
deal; converting the others is left for a future series, but a surprising
number of drivers actually need no change.
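
To illustrate the "queue depends on a mutex" shape mentioned above, here is
a plain pthread analogue. It is not QEMU code and does not use the CoQueue
API; Mailbox and the mailbox_* functions are names made up for this example.
The condition variable plays the role of the CoQueue: the waiter passes the
mutex it already holds, and the wait releases and re-acquires that mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical example type: a one-slot mailbox guarded by a mutex. */
typedef struct Mailbox {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;   /* plays the role of the CoQueue */
    bool full;
    int value;
} Mailbox;

static void mailbox_init(Mailbox *m)
{
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->nonempty, NULL);
    m->full = false;
    m->value = 0;
}

static void mailbox_put(Mailbox *m, int value)
{
    pthread_mutex_lock(&m->lock);
    assert(!m->full);          /* single slot: caller must not overwrite */
    m->value = value;
    m->full = true;
    /* Wake one waiter, if any; it re-takes the mutex before returning. */
    pthread_cond_signal(&m->nonempty);
    pthread_mutex_unlock(&m->lock);
}

static int mailbox_take(Mailbox *m)
{
    int value;

    pthread_mutex_lock(&m->lock);
    while (!m->full) {
        /* The wait needs the mutex: it drops it while sleeping and
         * holds it again once it returns. */
        pthread_cond_wait(&m->nonempty, &m->lock);
    }
    value = m->value;
    m->full = false;
    pthread_mutex_unlock(&m->lock);
    return value;
}
```

The predicate is always rechecked in a loop under the mutex, which is why a
queue that takes the mutex as an argument is enough and no extra lock around
the queue itself is needed.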

The mutex algorithm comes from OSv; it only needs two to four atomic ops
for a lock-unlock pair (two when uncontended) and if necessary
we could even take OSv's support for wait morphing (which avoids the
thundering herd problem) and add it to CoMutex and CoQueue.

Performance of CoMutex is comparable to pthread mutexes.  However, you
cannot make a direct comparison between CoMutex (fair) and pthread_mutex_t
(unfair).  For this reason the testcase also measures performance of
a quick-and-dirty implementation of a fair mutex, based on MCS locks
and futexes.
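
For readers unfamiliar with MCS locks, the queueing idea can be sketched
roughly as follows. This is only an illustration under assumptions: it uses
C11 atomics with pointers and busy-waiting where the testcase uses array
indices and futexes, and McsNode, McsLock, mcs_lock and mcs_unlock are names
invented here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* One queue node per thread (hypothetical names, not from the series). */
typedef struct McsNode {
    _Atomic(struct McsNode *) next;
    atomic_int locked;
} McsNode;

typedef struct McsLock {
    _Atomic(McsNode *) tail;   /* last waiter in the queue, or NULL */
} McsLock;

static void mcs_lock(McsLock *lock, McsNode *self)
{
    McsNode *prev;

    atomic_store(&self->next, NULL);
    atomic_store(&self->locked, 1);
    /* Append ourselves to the queue; the previous tail is our predecessor. */
    prev = atomic_exchange(&lock->tail, self);
    if (prev != NULL) {
        atomic_store(&prev->next, self);
        while (atomic_load(&self->locked)) {
            /* Spin until the predecessor hands the lock over. */
        }
    }
}

static void mcs_unlock(McsLock *lock, McsNode *self)
{
    McsNode *next = atomic_load(&self->next);

    if (next == NULL) {
        McsNode *expected = self;
        /* Nobody queued behind us: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&lock->tail, &expected, NULL)) {
            return;
        }
        /* A new waiter did the exchange but has not linked itself yet. */
        while ((next = atomic_load(&self->next)) == NULL) {
        }
    }
    assert(next != NULL);
    /* Hand the lock to the next waiter in FIFO order. */
    atomic_store(&next->locked, 0);
}
```

Waiters are served strictly in the order of their exchange on the tail,
which is what makes the lock fair and a closer match for CoMutex than
pthread_mutex_t.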

Paolo

Paolo Bonzini (6):
  coroutine-lock: make CoMutex thread-safe
  coroutine-lock: add limited spinning to CoMutex
  test-aio-multithread: add performance comparison with thread-based
mutexes
  coroutine-lock: place CoMutex before CoQueue in header
  coroutine-lock: add mutex argument to CoQueue APIs
  coroutine-lock: make CoRwlock thread-safe and fair

 block/backup.c   |   2 +-
 block/io.c   |   4 +-
 block/nbd-client.c   |   2 +-
 block/qcow2-cluster.c|   4 +-
 block/sheepdog.c |   2 +-
 block/throttle-groups.c  |   2 +-
 hw/9pfs/9p.c |   2 +-
 include/qemu/coroutine.h |  84 +--
 tests/test-aio-multithread.c | 250 +++
 util/qemu-coroutine-lock.c   | 247 ++
 util/qemu-coroutine.c|   2 +-
 util/trace-events|   1 +
 12 files changed, 537 insertions(+), 65 deletions(-)

-- 
2.9.3




[Qemu-devel] [PULL 00/14] migration queue

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>

The following changes since commit df96bfab49dab2d0373e49b51bbb51ce72e1601e:

  Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20170213-1' into staging (2017-02-13 10:54:49 +0000)

are available in the git repository at:

  git://github.com/dagrh/qemu.git tags/pull-migration-20170213a

for you to fetch changes up to 982b78c5e37864c06fd7b5f156d80bf02628a855:

  virtio/migration: Migrate virtio-net to VMState (2017-02-13 17:27:14 +0000)


Migration

  Amit: migration: remove myself as maintainer
MAINTAINERS: update my email address
  Ashijeet: migrate: Introduce zero RAM checks to skip RAM migration
  Pavel: Postcopy release RAM
  Halil: consolidate VMStateField.start
  Hailiang: COLO: fix setting checkpoint-delay not working properly
 COLO: Shutdown related socket fd while do failover
 COLO: Don't process failover request while loading VM's state
  Me:
 migration: Add VMSTATE_UNUSED_VARRAY_UINT32
 migration: Add VMSTATE_WITH_TMP
 tests/migration: Add test for VMSTATE_WITH_TMP
 virtio-net VMState conversion and new VMSTATE macros


Amit Shah (2):
  migration: remove myself as maintainer
  MAINTAINERS: update my email address

Ashijeet Acharya (1):
  migrate: Introduce zero RAM checks to skip RAM migration

Dr. David Alan Gilbert (4):
  migration: Add VMSTATE_UNUSED_VARRAY_UINT32
  migration: Add VMSTATE_WITH_TMP
  tests/migration: Add test for VMSTATE_WITH_TMP
  virtio/migration: Migrate virtio-net to VMState

Halil Pasic (1):
  migration: consolidate VMStateField.start

Pavel Butsykin (3):
  migration: add MigrationState arg for ram_save_/compressed_/page()
  add 'release-ram' migrate capability
  migration: discard non-dirty ram pages after the start of postcopy

zhanghailiang (3):
  COLO: fix setting checkpoint-delay not working properly
  COLO: Shutdown related socket fd while do failover
  COLO: Don't process failover request while loading VM's state

 MAINTAINERS|   5 +-
 hw/char/exynos4210_uart.c  |   2 +-
 hw/display/g364fb.c|   2 +-
 hw/dma/pl330.c |   8 +-
 hw/intc/exynos4210_gic.c   |   2 +-
 hw/ipmi/isa_ipmi_bt.c  |   6 +-
 hw/net/virtio-net.c| 316 +++--
 hw/net/vmxnet3.c   |   2 +-
 hw/nvram/mac_nvram.c   |   2 +-
 hw/nvram/spapr_nvram.c |   2 +-
 hw/sd/sdhci.c  |   2 +-
 hw/timer/m48t59.c  |   2 +-
 include/hw/virtio/virtio-net.h |   4 +-
 include/migration/colo.h   |   2 +
 include/migration/migration.h  |  10 ++
 include/migration/qemu-file.h  |   3 +-
 include/migration/vmstate.h|  51 +--
 migration/colo.c   | 102 +++--
 migration/migration.c  |  16 +++
 migration/qemu-file.c  |  59 +++-
 migration/ram.c|  78 --
 migration/savevm.c |   2 +-
 migration/vmstate.c|  44 +-
 qapi-schema.json   |   9 +-
 target/s390x/machine.c |   2 +-
 tests/test-vmstate.c   |  98 -
 util/fifo8.c   |   2 +-
 27 files changed, 648 insertions(+), 185 deletions(-)



[Qemu-devel] [RFC PATCH 35/41] stream: Use real permissions in streaming block job

2017-02-13 Thread Kevin Wolf
The correct permissions are relatively obvious here (and explained in
code comments). For intermediate streaming, we need to reopen the top
node read-write before creating the job now because the permissions
system catches attempts to get the BLK_PERM_WRITE_UNCHANGED permission
on a read-only node.

Signed-off-by: Kevin Wolf 
---
 block/stream.c | 38 ++
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 47f0ffb..e562f57 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -229,27 +229,35 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 BlockDriverState *iter;
 int orig_bs_flags;
 
-/* FIXME Use real permissions */
-s = block_job_create(job_id, &stream_job_driver, bs, 0, BLK_PERM_ALL,
- speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
-if (!s) {
-return;
-}
-
 /* Make sure that the image is opened in read-write mode */
 orig_bs_flags = bdrv_get_flags(bs);
 if (!(orig_bs_flags & BDRV_O_RDWR)) {
 if (bdrv_reopen(bs, orig_bs_flags | BDRV_O_RDWR, errp) != 0) {
-block_job_unref(&s->common);
 return;
 }
 }
 
-/* Block all intermediate nodes between bs and base, because they
- * will disappear from the chain after this operation */
+/* Prevent concurrent jobs trying to modify the graph structure here, we
+ * already have our own plans. Also don't allow resize as the image size is
+ * queried only at the job start and then cached. */
+s = block_job_create(job_id, &stream_job_driver, bs,
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
+ BLK_PERM_GRAPH_MOD,
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
+ BLK_PERM_WRITE,
+ speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
+if (!s) {
+goto fail;
+}
+
+/* Block all intermediate nodes between bs and base, because they will
+ * disappear from the chain after this operation. The streaming job reads
+ * every block only once, assuming that it doesn't change, so block writes
+ * and resizes. */
 for (iter = backing_bs(bs); iter && iter != base; iter = backing_bs(iter)) {
-/* FIXME Use real permissions */
-block_job_add_bdrv(&s->common, iter, 0, BLK_PERM_ALL, &error_abort);
+block_job_add_bdrv(&s->common, iter, 0,
+   BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED,
+   &error_abort);
 }
 
 s->base = base;
@@ -259,4 +267,10 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 s->on_error = on_error;
 trace_stream_start(bs, base, s);
 block_job_start(&s->common);
+return;
+
+fail:
+if (orig_bs_flags != bdrv_get_flags(bs)) {
+bdrv_reopen(bs, orig_bs_flags, NULL);
+}
 }
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v2 00/16] Postcopy: Hugepage support

2017-02-13 Thread Andrea Arcangeli
Hello,

On Mon, Feb 13, 2017 at 08:11:06PM +0300, Alexey Perevalov wrote:
> Another one request.
> QEMU could use mem_path in hugefs with share key simultaneously
> (-object 
> memory-backend-file,id=mem,size=${mem_size},mem-path=${mem_path},share=on) 
> and vm
> in this case will start and will properly work (it will allocate memory
> with mmap), but in case of destination for postcopy live migration
> UFFDIO_COPY ioctl will fail for
> such region, in Arcangeli's git tree there is such prevent check
> (if (!vma_is_shmem(dst_vma) && dst_vma->vm_flags & VM_SHARED).
> Is it possible to handle such situation at qemu?

I agree it'd be nice to lift this hugetlbfs !VM_SHARED restriction. I
already asked Mike (CC'ed) why it is there, because I'm afraid it's a
leftover from the anon version, where VM_SHARED means a very different
thing, but it was already lifted for shmem. share=on should already
work on top of tmpfs, also with THP on tmpfs enabled.

For hugetlbfs and shmem it should generally be more complicated to
cope with private mappings than with shared ones: shared is just the
native form of the pseudofs, without having to deal with private COW
aliases, so it's hard to imagine something going wrong for VM_SHARED if
the MAP_PRIVATE mapping already works fine. If it turns out to be
superfluous, the check may just be turned into
"vma_is_anonymous(dst_vma) && dst_vma->vm_flags & VM_SHARED".
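
As a sketch of what that change would mean for the case reported above
(hugetlbfs with share=on), the two predicates can be modelled on a mock
structure. This is not kernel code: MockVma, MOCK_VM_SHARED and both helper
names are made up for illustration, and the booleans stand in for
vma_is_shmem() and vma_is_anonymous().

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_VM_SHARED 0x8UL  /* stand-in flag bit for this example only */

/* Mock of the few VMA properties the two checks look at. */
typedef struct MockVma {
    bool is_shmem;
    bool is_anonymous;
    unsigned long vm_flags;
} MockVma;

/* Check quoted above: reject shared mappings unless they are shmem. */
static bool current_check_rejects(const MockVma *vma)
{
    return !vma->is_shmem && (vma->vm_flags & MOCK_VM_SHARED);
}

/* Possible relaxed check: reject only shared anonymous mappings. */
static bool relaxed_check_rejects(const MockVma *vma)
{
    return vma->is_anonymous && (vma->vm_flags & MOCK_VM_SHARED);
}
```

For a mapping that is neither shmem nor anonymous and has the shared flag
set, the first predicate rejects and the second does not, which is the
difference the proposed change is after.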

Thanks,
Andrea



[Qemu-devel] [PULL 13/14] tests/migration: Add test for VMSTATE_WITH_TMP

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Add a test for VMSTATE_WITH_TMP to tests/test-vmstate.c

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20170203160651.19917-4-dgilb...@redhat.com>
Signed-off-by: Dr. David Alan Gilbert 
---
 tests/test-vmstate.c | 98 
 1 file changed, 92 insertions(+), 6 deletions(-)

diff --git a/tests/test-vmstate.c b/tests/test-vmstate.c
index 9d87faf..d0dd390 100644
--- a/tests/test-vmstate.c
+++ b/tests/test-vmstate.c
@@ -90,7 +90,7 @@ static void save_buffer(const uint8_t *buf, size_t buf_size)
 qemu_fclose(fsave);
 }
 
-static void compare_vmstate(uint8_t *wire, size_t size)
+static void compare_vmstate(const uint8_t *wire, size_t size)
 {
 QEMUFile *f = open_test_file(false);
 uint8_t result[size];
@@ -113,7 +113,7 @@ static void compare_vmstate(uint8_t *wire, size_t size)
 }
 
 static int load_vmstate_one(const VMStateDescription *desc, void *obj,
-int version, uint8_t *wire, size_t size)
+int version, const uint8_t *wire, size_t size)
 {
 QEMUFile *f;
 int ret;
@@ -137,7 +137,7 @@ static int load_vmstate_one(const VMStateDescription *desc, void *obj,
 static int load_vmstate(const VMStateDescription *desc,
 void *obj, void *obj_clone,
 void (*obj_copy)(void *, void*),
-int version, uint8_t *wire, size_t size)
+int version, const uint8_t *wire, size_t size)
 {
 /* We test with zero size */
 obj_copy(obj_clone, obj);
@@ -289,7 +289,6 @@ static void test_simple_primitive(void)
 FIELD_EQUAL(i64_1);
 FIELD_EQUAL(i64_2);
 }
-#undef FIELD_EQUAL
 
 typedef struct TestStruct {
 uint32_t a, b, c, e;
@@ -474,7 +473,6 @@ static void test_load_skip(void)
 qemu_fclose(loading);
 }
 
-
 typedef struct {
 int32_t i;
 } TestStructTriv;
@@ -688,6 +686,94 @@ static void test_load_q(void)
 qemu_fclose(fload);
 }
 
+typedef struct TmpTestStruct {
+TestStruct *parent;
+int64_t diff;
+} TmpTestStruct;
+
+static void tmp_child_pre_save(void *opaque)
+{
+struct TmpTestStruct *tts = opaque;
+
+tts->diff = tts->parent->b - tts->parent->a;
+}
+
+static int tmp_child_post_load(void *opaque, int version_id)
+{
+struct TmpTestStruct *tts = opaque;
+
+tts->parent->b = tts->parent->a + tts->diff;
+
+return 0;
+}
+
+static const VMStateDescription vmstate_tmp_back_to_parent = {
+.name = "test/tmp_child_parent",
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(f, TestStruct),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_tmp_child = {
+.name = "test/tmp_child",
+.pre_save = tmp_child_pre_save,
+.post_load = tmp_child_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(diff, TmpTestStruct),
+VMSTATE_STRUCT_POINTER(parent, TmpTestStruct,
+   vmstate_tmp_back_to_parent, TestStruct),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_with_tmp = {
+.name = "test/with_tmp",
+.version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(a, TestStruct),
+VMSTATE_UINT64(d, TestStruct),
+VMSTATE_WITH_TMP(TestStruct, TmpTestStruct, vmstate_tmp_child),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static void obj_tmp_copy(void *target, void *source)
+{
+memcpy(target, source, sizeof(TestStruct));
+}
+
+static void test_tmp_struct(void)
+{
+TestStruct obj, obj_clone;
+
+uint8_t const wire_with_tmp[] = {
+/* u32 a */ 0x00, 0x00, 0x00, 0x02,
+/* u64 d */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
+/* diff  */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02,
+/* u64 f */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08,
+QEMU_VM_EOF, /* just to ensure we won't get EOF reported prematurely */
+};
+
+memset(&obj, 0, sizeof(obj));
+obj.a = 2;
+obj.b = 4;
+obj.d = 1;
+obj.f = 8;
+save_vmstate(&vmstate_with_tmp, &obj);
+
+compare_vmstate(wire_with_tmp, sizeof(wire_with_tmp));
+
+memset(&obj, 0, sizeof(obj));
+SUCCESS(load_vmstate(&vmstate_with_tmp, &obj, &obj_clone,
+ obj_tmp_copy, 1, wire_with_tmp,
+ sizeof(wire_with_tmp)));
+g_assert_cmpint(obj.a, ==, 2); /* From top level vmsd */
+g_assert_cmpint(obj.b, ==, 4); /* from the post_load */
+g_assert_cmpint(obj.d, ==, 1); /* From top level vmsd */
+g_assert_cmpint(obj.f, ==, 8); /* From the child->parent */
+}
+
 int main(int argc, char **argv)
 {
 temp_fd = mkstemp(temp_file);
@@ -708,7 +794,7 @@ int main(int argc, char **argv)
 test_arr_ptr_str_no0_load);
 g_test_add_func("/vmstate/qtailq/save/saveq", test_save_q);
 

[Qemu-devel] [RFC PATCH 30/41] backup: Use real permissions in backup block job

2017-02-13 Thread Kevin Wolf
The backup block job doesn't have very complicated requirements: It
needs to read from the source and write to the target, but it's fine
with either side being changed. The only restriction is that we can't
resize the image because the job uses a cached value.

qemu-iotests 055 needs to be changed because it used a target which was
already attached to a virtio-blk device. The permission system correctly
forbids this (virtio-blk can't accept another writer with its default
share-rw=off).
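
The conflict described above can be illustrated with a small stand-alone sketch. It is not the QEMU implementation: the PERM_* bit values, the NodeUser type and user_can_attach() are hypothetical names made up for this example, and the rule it encodes (each user's required permissions must be shared by every other user of the same node) is an assumption based on the description in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative permission bits; the names echo the patch, the values are assumed. */
enum {
    PERM_CONSISTENT_READ = 1u << 0,
    PERM_WRITE           = 1u << 1,
    PERM_WRITE_UNCHANGED = 1u << 2,
    PERM_RESIZE          = 1u << 3,
    PERM_GRAPH_MOD       = 1u << 4,
    PERM_ALL             = 0x1f,
};

/* One user of a node: what it needs, and what it tolerates from other users. */
typedef struct NodeUser {
    uint64_t perm;
    uint64_t shared;
} NodeUser;

/*
 * Hypothetical helper: a newcomer may attach only if every existing user
 * shares what the newcomer needs, and the newcomer shares what every
 * existing user already holds.
 */
static bool user_can_attach(const NodeUser *existing, size_t n,
                            NodeUser newcomer)
{
    assert(existing != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (newcomer.perm & ~existing[i].shared) {
            return false;
        }
        if (existing[i].perm & ~newcomer.shared) {
            return false;
        }
    }
    return true;
}
```

With a guest device that holds write access but does not share it, a backup target that needs PERM_WRITE is rejected, while the backup source (which only needs PERM_CONSISTENT_READ) is accepted. A user that wants PERM_RESIZE is rejected by the backup source because resize is not in its shared set.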

Signed-off-by: Kevin Wolf 
---
 block/backup.c | 15 ++-
 tests/qemu-iotests/055 | 11 +++
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 22171f4..47fadfb 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -618,15 +618,20 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 goto error;
 }
 
-/* FIXME Use real permissions */
-job = block_job_create(job_id, &backup_job_driver, bs, 0, BLK_PERM_ALL,
+/* job->common.len is fixed, so we can't allow resize */
+job = block_job_create(job_id, &backup_job_driver, bs,
+   BLK_PERM_CONSISTENT_READ,
+   BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
+   BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
speed, creation_flags, cb, opaque, errp);
 if (!job) {
 goto error;
 }
 
-/* FIXME Use real permissions */
-job->target = blk_new(0, BLK_PERM_ALL);
+/* The target must match the source in size, so no resize here either */
+job->target = blk_new(BLK_PERM_WRITE,
+  BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
+  BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD);
 ret = blk_insert_bs(job->target, target, errp);
 if (ret < 0) {
 goto error;
@@ -657,7 +662,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
 }
 
-/* FIXME Use real permissions */
+/* Required permissions are already taken with target's blk_new() */
 block_job_add_bdrv(&job->common, target, 0, BLK_PERM_ALL, &error_abort);
 job->common.len = len;
 block_job_txn_add_job(txn, &job->common);
diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 1d3fd04..aafcd24 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -48,7 +48,8 @@ class TestSingleDrive(iotests.QMPTestCase):
 def setUp(self):
 qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, 
str(image_len))
 
-self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
+self.vm = iotests.VM().add_drive(test_img)
+self.vm.add_drive(blockdev_target_img, interface="none")
 if iotests.qemu_default_machine == 'pc':
 self.vm.add_drive(None, 'media=cdrom', 'ide')
 self.vm.launch()
@@ -164,7 +165,8 @@ class TestSetSpeed(iotests.QMPTestCase):
 def setUp(self):
 qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, 
str(image_len))
 
-self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
+self.vm = iotests.VM().add_drive(test_img)
+self.vm.add_drive(blockdev_target_img, interface="none")
 self.vm.launch()
 
 def tearDown(self):
@@ -247,7 +249,8 @@ class TestSingleTransaction(iotests.QMPTestCase):
 def setUp(self):
 qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, 
str(image_len))
 
-self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
+self.vm = iotests.VM().add_drive(test_img)
+self.vm.add_drive(blockdev_target_img, interface="none")
 if iotests.qemu_default_machine == 'pc':
 self.vm.add_drive(None, 'media=cdrom', 'ide')
 self.vm.launch()
@@ -460,7 +463,7 @@ class TestDriveCompression(iotests.QMPTestCase):
 
 qemu_img('create', '-f', fmt, blockdev_target_img,
  str(TestDriveCompression.image_len), *args)
-self.vm.add_drive(blockdev_target_img, format=fmt)
+self.vm.add_drive(blockdev_target_img, format=fmt, interface="none")
 
 self.vm.launch()
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 36/41] hmp: Request permissions in qemu-io

2017-02-13 Thread Kevin Wolf
The HMP command 'qemu-io' is a bit tricky because it wants to work on
the original BlockBackend, but additional permissions could be required.
The details are explained in a comment in the code, but in summary, just
request whatever permissions the current qemu-io command needs.

Signed-off-by: Kevin Wolf 
---
 block/block-backend.c  |  6 ++
 hmp.c  | 26 +-
 include/qemu-io.h  |  1 +
 include/sysemu/block-backend.h |  1 +
 qemu-io-cmds.c | 28 
 5 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index a314284..94db555 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -573,6 +573,12 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, 
uint64_t shared_perm,
 return 0;
 }
 
+void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
+{
+*perm = blk->perm;
+*shared_perm = blk->shared_perm;
+}
+
 static int blk_do_attach_dev(BlockBackend *blk, void *dev)
 {
 if (blk->dev) {
diff --git a/hmp.c b/hmp.c
index 801fddb..fde5016 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2045,7 +2045,6 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 if (!blk) {
 BlockDriverState *bs = bdrv_lookup_bs(NULL, device, );
 if (bs) {
-/* FIXME Use real permissions */
 blk = local_blk = blk_new(0, BLK_PERM_ALL);
 ret = blk_insert_bs(blk, bs, );
 if (ret < 0) {
@@ -2059,6 +2058,31 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 aio_context = blk_get_aio_context(blk);
 aio_context_acquire(aio_context);
 
+/*
+ * Notably absent: Proper permission management. This is sad, but it seems
+ * almost impossible to achieve without changing the semantics and thereby
+ * limiting the use cases of the qemu-io HMP command.
+ *
+ * In an ideal world we would unconditionally create a new BlockBackend for
+ * qemuio_command(), but we have commands like 'reopen' and want them to
+ * take effect on the exact BlockBackend whose name the user passed instead
+ * of just on a temporary copy of it.
+ *
+ * Another problem is that deleting the temporary BlockBackend involves
+ * draining all requests on it first, but some qemu-iotests cases want to
+ * issue multiple aio_read/write requests and expect them to complete in
+ * the background while the monitor has already returned.
+ *
+ * This is also what prevents us from saving the original permissions and
+ * restoring them later: We can't revoke permissions until all requests
+ * have completed, and we don't know when that is nor can we really let
+ * anything else run before we have revoked them to avoid race conditions.
+ *
+ * What happens now is that command() in qemu-io-cmds.c can extend the
+ * permissions if necessary for the qemu-io command. And they simply stay
+ * extended, possibly resulting in a read-only guest device keeping write
+ * permissions. Ugly, but it appears to be the lesser evil.
+ */
 qemuio_command(blk, command);
 
 aio_context_release(aio_context);
diff --git a/include/qemu-io.h b/include/qemu-io.h
index 4d402b9..196fde0 100644
--- a/include/qemu-io.h
+++ b/include/qemu-io.h
@@ -36,6 +36,7 @@ typedef struct cmdinfo {
 const char  *args;
 const char  *oneline;
 helpfunc_t  help;
+uint64_t    perm;
 } cmdinfo_t;
 
 extern bool qemuio_misalign;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 65bd081..b400212 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -101,6 +101,7 @@ bool bdrv_has_blk(BlockDriverState *bs);
 bool bdrv_is_root_node(BlockDriverState *bs);
 int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
  Error **errp);
+void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm);
 
 void blk_set_allow_write_beyond_eof(BlockBackend *blk, bool allow);
 void blk_iostatus_enable(BlockBackend *blk);
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index e415b03..035cb96 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -83,6 +83,29 @@ static int command(BlockBackend *blk, const cmdinfo_t *ct, 
int argc,
 }
 return 0;
 }
+
+/* Request additional permissions if necessary for this command. The caller
+ * is responsible for restoring the original permissions afterwards if this
+ * is what it wants. */
+if (ct->perm && blk_is_available(blk)) {
+uint64_t orig_perm, orig_shared_perm;
+blk_get_perm(blk, &orig_perm, &orig_shared_perm);
+
+if (ct->perm & ~orig_perm) {
+uint64_t new_perm;
+Error *local_err = NULL;
+int ret;
+
+new_perm = orig_perm | ct->perm;
+
+ret = blk_set_perm(blk, new_perm, orig_shared_perm, &local_err);
+  

[Qemu-devel] [PULL 09/14] COLO: Shutdown related socket fd while do failover

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: zhanghailiang 

If the network connection between the primary host and the secondary host
breaks while the COLO/COLO incoming threads are in read() or write(), they
block until the connection times out, and the failover process is blocked
along with them.

So it is necessary to shut down all the socket fds used by COLO to avoid
this situation. In addition, the corresponding file descriptors must be
closed only after the failover BH has shut them down; otherwise there will
be an error.
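
The effect this patch relies on can be seen in a stand-alone sketch outside QEMU. It only assumes POSIX sockets. read_after_shutdown() and demo_shutdown_wakes_reader() are hypothetical helpers, and for simplicity the shutdown and the read happen in the same thread, whereas in COLO the failover BH shuts the fd down while another thread is blocked on it.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: shut down a connected stream socket in both
 * directions, then try to read from it. Without the shutdown(), the
 * read() would block because the peer never sends anything.
 */
static ssize_t read_after_shutdown(int fd)
{
    char byte;

    assert(fd >= 0);
    if (shutdown(fd, SHUT_RDWR) < 0) {
        return -2;              /* shutdown itself failed */
    }
    return read(fd, &byte, 1);  /* end-of-file instead of blocking */
}

/* Builds a connected AF_UNIX pair and runs the helper on one end. */
static ssize_t demo_shutdown_wakes_reader(void)
{
    int sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -3;              /* could not create the socket pair */
    }
    n = read_after_shutdown(sv[0]);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Note that shutdown() leaves the descriptor number allocated. Only close() releases it for reuse, which is why the order of shutdown and close matters in the patch.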

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Reviewed-by: Dr. David Alan Gilbert 
Cc: Dr. David Alan Gilbert 
Message-Id: <1484657864-21708-3-git-send-email-zhang.zhanghaili...@huawei.com>
Signed-off-by: Dr. David Alan Gilbert 
---
 include/migration/migration.h |  3 +++
 migration/colo.c  | 43 +++
 2 files changed, 46 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index cb83f16..1735d66 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -116,6 +116,7 @@ struct MigrationIncomingState {
 QemuThread colo_incoming_thread;
 /* The coroutine we should enter (back) after failover */
 Coroutine *migration_incoming_co;
+QemuSemaphore colo_incoming_sem;
 
 /* See savevm.c */
 LoadStateEntry_Head loadvm_handlers;
@@ -187,6 +188,8 @@ struct MigrationState
 QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest) 
src_page_requests;
 /* The RAMBlock used in the last src_page_request */
 RAMBlock *last_req_rb;
+/* The semaphore is used to notify COLO thread that failover is finished */
+QemuSemaphore colo_exit_sem;
 
 /* The semaphore is used to notify COLO thread to do checkpoint */
 QemuSemaphore colo_checkpoint_sem;
diff --git a/migration/colo.c b/migration/colo.c
index 08b2e46..3222812 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -59,6 +59,18 @@ static void secondary_vm_do_failover(void)
 /* recover runstate to normal migration finish state */
 autostart = true;
 }
+/*
+ * Make sure the COLO incoming thread is not blocked in recv() or send().
+ * If mis->from_src_file and mis->to_src_file use the same fd, the second
+ * shutdown() will return -1. We ignore this value; it is harmless.
+ */
+if (mis->from_src_file) {
+qemu_file_shutdown(mis->from_src_file);
+}
+if (mis->to_src_file) {
+qemu_file_shutdown(mis->to_src_file);
+}
 
 old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
FAILOVER_STATUS_COMPLETED);
@@ -67,6 +79,8 @@ static void secondary_vm_do_failover(void)
  "secondary VM", FailoverStatus_lookup[old_state]);
 return;
 }
+/* Notify COLO incoming thread that failover work is finished */
+qemu_sem_post(&mis->colo_incoming_sem);
 /* For Secondary VM, jump to incoming co */
 if (mis->migration_incoming_co) {
 qemu_coroutine_enter(mis->migration_incoming_co);
@@ -81,6 +95,18 @@ static void primary_vm_do_failover(void)
 migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
 
+/*
+ * Wake up the COLO thread, which may be blocked in recv() or send().
+ * s->rp_state.from_dst_file and s->to_dst_file may use the same fd,
+ * but shutting the fd down twice is harmless.
+ */
+if (s->to_dst_file) {
+qemu_file_shutdown(s->to_dst_file);
+}
+if (s->rp_state.from_dst_file) {
+qemu_file_shutdown(s->rp_state.from_dst_file);
+}
+
 old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
FAILOVER_STATUS_COMPLETED);
 if (old_state != FAILOVER_STATUS_ACTIVE) {
@@ -88,6 +114,8 @@ static void primary_vm_do_failover(void)
  FailoverStatus_lookup[old_state]);
 return;
 }
+/* Notify COLO thread that failover work is finished */
+qemu_sem_post(&s->colo_exit_sem);
 }
 
 void colo_do_failover(MigrationState *s)
@@ -361,6 +389,14 @@ out:
 
 timer_del(s->colo_delay_timer);
 
+/* Hopefully this wait will not be too long */
+qemu_sem_wait(&s->colo_exit_sem);
+qemu_sem_destroy(&s->colo_exit_sem);
+/*
+ * Must be called after the failover BH has completed,
+ * or the failover BH may shut down the wrong fd, one that was
+ * reused by another thread after we released it here.
+ */
 if (s->rp_state.from_dst_file) {
 qemu_fclose(s->rp_state.from_dst_file);
 }
@@ -385,6 +421,7 @@ void migrate_start_colo_process(MigrationState *s)
 s->colo_delay_timer =  timer_new_ms(QEMU_CLOCK_HOST,
 colo_checkpoint_notify, s);
 
+qemu_sem_init(&s->colo_exit_sem, 0);
 migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
   

[Qemu-devel] [PULL 12/14] migration: Add VMSTATE_WITH_TMP

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

VMSTATE_WITH_TMP is for handling structures where some calculation
or rearrangement of the data needs to be performed before the data
hits the wire.
For example, where the value on the wire is an offset from a
non-migrated base, but the data in the structure is the actual pointer.

To use it, a temporary type is created and a vmsd used on that type.
The first element of the type must be 'parent', a pointer back to the
type of the main structure.  VMSTATE_WITH_TMP takes care of allocating
the temporary before running the child vmsd and freeing it afterwards.

The post_load/pre_save on the child vmsd can copy things from the parent
to the temporary using the parent pointer and do any other calculations
needed; it can then use normal VMSD entries to do the actual data
storage without having to fiddle around with qemu_get_*/qemu_put_*.
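
The offset-from-base example can be sketched without any QEMU infrastructure. Everything in the block below (Device, DeviceTmp, device_save(), device_load()) is hypothetical and only imitates the pattern: the temporary starts with a 'parent' pointer, a pre_save-style hook derives the wire value from the parent, and a post_load-style hook writes the result back. A single uint32_t stands in for the migration stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Main structure: 'cur' points into 'buf', so it cannot be migrated as is. */
typedef struct Device {
    uint8_t buf[64];
    uint8_t *cur;
} Device;

/* Temporary: the parent pointer comes first, then what goes on the wire. */
typedef struct DeviceTmp {
    Device *parent;
    uint32_t offset;
} DeviceTmp;

_Static_assert(offsetof(DeviceTmp, parent) == 0, "parent must be first");

/* pre_save-style hook: turn the pointer into an offset from the base. */
static void tmp_pre_save(DeviceTmp *tmp)
{
    tmp->offset = (uint32_t)(tmp->parent->cur - tmp->parent->buf);
}

/* post_load-style hook: rebuild the pointer from the loaded offset. */
static void tmp_post_load(DeviceTmp *tmp)
{
    tmp->parent->cur = tmp->parent->buf + tmp->offset;
}

/* Allocate the temporary, run the hook, hand back the wire value, free. */
static uint32_t device_save(Device *dev)
{
    DeviceTmp *tmp = calloc(1, sizeof(*tmp));
    uint32_t wire;

    assert(tmp != NULL);
    tmp->parent = dev;
    tmp_pre_save(tmp);
    wire = tmp->offset;
    free(tmp);
    return wire;
}

/* Same lifecycle on the destination, ending with the post_load hook. */
static void device_load(Device *dev, uint32_t wire)
{
    DeviceTmp *tmp = calloc(1, sizeof(*tmp));

    assert(tmp != NULL);
    tmp->parent = dev;
    tmp->offset = wire;
    tmp_post_load(tmp);
    free(tmp);
}
```

The pointer value itself never leaves the source, so the destination's buffer may live at a different address.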

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: David Gibson 
Reviewed-by: Juan Quintela 
Message-Id: <20170203160651.19917-3-dgilb...@redhat.com>
Signed-off-by: Dr. David Alan Gilbert 
---
 include/migration/vmstate.h | 19 +++
 migration/vmstate.c | 40 
 2 files changed, 59 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 7339594..63e7b02 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -259,6 +259,7 @@ extern const VMStateInfo vmstate_info_cpudouble;
 extern const VMStateInfo vmstate_info_timer;
 extern const VMStateInfo vmstate_info_buffer;
 extern const VMStateInfo vmstate_info_unused_buffer;
+extern const VMStateInfo vmstate_info_tmp;
 extern const VMStateInfo vmstate_info_bitmap;
 extern const VMStateInfo vmstate_info_qtailq;
 
@@ -649,6 +650,24 @@ extern const VMStateInfo vmstate_info_qtailq;
 .offset = offsetof(_state, _field),  \
 }
 
+/* Allocate a temporary of type 'tmp_type', set tmp->parent to _state
+ * and execute the vmsd on the temporary.  Note that we're working with
+ * the whole of _state here, not a field within it.
+ * We compile time check that:
+ *That _tmp_type contains a 'parent' member that's a pointer to the
+ *'_state' type
+ *That the pointer is right at the start of _tmp_type.
+ */
+#define VMSTATE_WITH_TMP(_state, _tmp_type, _vmsd) { \
+.name = "tmp",   \
+.size = sizeof(_tmp_type) +  \
+QEMU_BUILD_BUG_ON_ZERO(offsetof(_tmp_type, parent) != 0) + \
+type_check_pointer(_state,   \
+typeof_field(_tmp_type, parent)),\
+.vmsd = &(_vmsd),\
+.info = &vmstate_info_tmp,   \
+}
+
 #define VMSTATE_UNUSED_BUFFER(_test, _version, _size) {  \
 .name = "unused",\
 .field_exists = (_test), \
diff --git a/migration/vmstate.c b/migration/vmstate.c
index 520341a..b4d8ae9 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -935,6 +935,46 @@ const VMStateInfo vmstate_info_unused_buffer = {
 .put  = put_unused_buffer,
 };
 
+/* vmstate_info_tmp, see VMSTATE_WITH_TMP, the idea is that we allocate
+ * a temporary buffer and the pre_load/pre_save methods in the child vmsd
+ * copy stuff from the parent into the child and do calculations to fill
+ * in fields that don't really exist in the parent but need to be in the
+ * stream.
+ */
+static int get_tmp(QEMUFile *f, void *pv, size_t size, VMStateField *field)
+{
+int ret;
+const VMStateDescription *vmsd = field->vmsd;
+int version_id = field->version_id;
+void *tmp = g_malloc(size);
+
+/* Writes the parent field which is at the start of the tmp */
+*(void **)tmp = pv;
+ret = vmstate_load_state(f, vmsd, tmp, version_id);
+g_free(tmp);
+return ret;
+}
+
+static int put_tmp(QEMUFile *f, void *pv, size_t size, VMStateField *field,
+QJSON *vmdesc)
+{
+const VMStateDescription *vmsd = field->vmsd;
+void *tmp = g_malloc(size);
+
+/* Writes the parent field which is at the start of the tmp */
+*(void **)tmp = pv;
+vmstate_save_state(f, vmsd, tmp, vmdesc);
+g_free(tmp);
+
+return 0;
+}
+
+const VMStateInfo vmstate_info_tmp = {
+.name = "tmp",
+.get = get_tmp,
+.put = put_tmp,
+};
+
 /* bitmaps (as defined by bitmap.h). Note that size here is the size
  * of the bitmap in bits. The on-the-wire format of a bitmap is 64
  * bit words with the bits in big endian order. The in-memory format
-- 
2.9.3




[Qemu-devel] [RFC PATCH 31/41] block: Fix pending requests check in bdrv_append()

2017-02-13 Thread Kevin Wolf
bdrv_append() cares about isolation of the node that it modifies, but
not about activity in some subtree below it. Instead of using the
recursive bdrv_requests_pending(), directly check bs->in_flight, which
considers only the node in question.
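
The difference between the two checks can be shown on a toy graph. This is only a sketch: Node, subtree_requests_pending() and node_requests_pending() are hypothetical stand-ins for the recursive check and the per-node counter, not the block layer's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Toy node: a request counter and a few children. */
typedef struct Node {
    unsigned in_flight;
    struct Node *children[MAX_CHILDREN];
    size_t n_children;
} Node;

/* Recursive check: true if this node or anything below it is busy. */
static bool subtree_requests_pending(const Node *node)
{
    assert(node->n_children <= MAX_CHILDREN);

    if (node->in_flight > 0) {
        return true;
    }
    for (size_t i = 0; i < node->n_children; i++) {
        if (subtree_requests_pending(node->children[i])) {
            return true;
        }
    }
    return false;
}

/* Node-local check: only looks at the node's own counter. */
static bool node_requests_pending(const Node *node)
{
    return node->in_flight > 0;
}
```

For an idle top node with a busy child, the recursive check reports pending requests while the node-local check does not, which is the case the commit message says bdrv_append() should tolerate.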

Signed-off-by: Kevin Wolf 
---
 block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 7d84d43..1e01647 100644
--- a/block.c
+++ b/block.c
@@ -2802,8 +2802,8 @@ static void change_parent_backing_link(BlockDriverState 
*from,
  */
 void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
 {
-assert(!bdrv_requests_pending(bs_top));
-assert(!bdrv_requests_pending(bs_new));
+assert(!atomic_read(&bs_top->in_flight));
+assert(!atomic_read(&bs_new->in_flight));
 
 bdrv_ref(bs_top);
 
-- 
1.8.3.1




[Qemu-devel] [PULL 08/14] COLO: fix setting checkpoint-delay not working properly

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: zhanghailiang 

If we set checkpoint-delay through the 'migrate-set-parameters' command,
it does not take effect until the current checkpoint-delay sleep has
finished. That is a problem especially when we want to change the value
from an extremely large one to a reasonable one.

Fix it by using a timer to implement checkpoint-delay.
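
Why re-arming a timer helps where a sleep does not can be modelled in a few lines. The sketch below is a simplified model, not the patch: CheckpointTimer, checkpoint_notify() and set_delay() are hypothetical, the clock is passed in as a plain millisecond value, and a counter stands in for the semaphore.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the checkpoint timer state. */
typedef struct CheckpointTimer {
    int64_t delay_ms;     /* current checkpoint delay */
    int64_t deadline_ms;  /* when the timer fires next */
    unsigned pending;     /* stands in for the checkpoint semaphore */
} CheckpointTimer;

/* Request a checkpoint now and re-arm the timer with the current delay. */
static void checkpoint_notify(CheckpointTimer *t, int64_t now_ms)
{
    assert(t->delay_ms >= 0);
    t->pending++;
    t->deadline_ms = now_ms + t->delay_ms;
}

/* Changing the delay notifies immediately, so the old deadline is dropped. */
static void set_delay(CheckpointTimer *t, int64_t now_ms, int64_t delay_ms)
{
    t->delay_ms = delay_ms;
    checkpoint_notify(t, now_ms);
}
```

A timer armed for a deadline far in the future is replaced as soon as set_delay() runs. A thread already inside a long sleep would have had to finish that sleep before seeing the new value.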

Signed-off-by: zhanghailiang 
Message-Id: <1484657864-21708-2-git-send-email-zhang.zhanghaili...@huawei.com>
Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Dr. David Alan Gilbert 
---
 include/migration/colo.h  |  2 ++
 include/migration/migration.h |  5 +
 migration/colo.c  | 33 +++--
 migration/migration.c |  3 +++
 4 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index e32eef4..2bbff9e 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -35,4 +35,6 @@ COLOMode get_colo_mode(void);
 
 /* failover */
 void colo_do_failover(MigrationState *s);
+
+void colo_checkpoint_notify(void *opaque);
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 71ce190..cb83f16 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -188,6 +188,11 @@ struct MigrationState
 /* The RAMBlock used in the last src_page_request */
 RAMBlock *last_req_rb;
 
+/* The semaphore is used to notify COLO thread to do checkpoint */
+QemuSemaphore colo_checkpoint_sem;
+int64_t colo_checkpoint_time;
+QEMUTimer *colo_delay_timer;
+
 /* The last error that occurred */
 Error *error;
 };
diff --git a/migration/colo.c b/migration/colo.c
index 93c85c5..08b2e46 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -302,7 +302,7 @@ static void colo_process_checkpoint(MigrationState *s)
 {
 QIOChannelBuffer *bioc;
 QEMUFile *fb = NULL;
-int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+int64_t current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 Error *local_err = NULL;
 int ret;
 
@@ -332,26 +332,21 @@ static void colo_process_checkpoint(MigrationState *s)
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("stop", "run");
 
+timer_mod(s->colo_delay_timer,
+current_time + s->parameters.x_checkpoint_delay);
+
 while (s->state == MIGRATION_STATUS_COLO) {
 if (failover_get_state() != FAILOVER_STATUS_NONE) {
 error_report("failover request");
 goto out;
 }
 
-current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-if (current_time - checkpoint_time <
-s->parameters.x_checkpoint_delay) {
-int64_t delay_ms;
+qemu_sem_wait(&s->colo_checkpoint_sem);
 
-delay_ms = s->parameters.x_checkpoint_delay -
-   (current_time - checkpoint_time);
-g_usleep(delay_ms * 1000);
-}
 ret = colo_do_checkpoint_transaction(s, bioc, fb);
 if (ret < 0) {
 goto out;
 }
-checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 }
 
 out:
@@ -364,14 +359,32 @@ out:
 qemu_fclose(fb);
 }
 
+timer_del(s->colo_delay_timer);
+
 if (s->rp_state.from_dst_file) {
 qemu_fclose(s->rp_state.from_dst_file);
 }
 }
 
+void colo_checkpoint_notify(void *opaque)
+{
+MigrationState *s = opaque;
+int64_t next_notify_time;
+
+qemu_sem_post(&s->colo_checkpoint_sem);
+s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+next_notify_time = s->colo_checkpoint_time +
+s->parameters.x_checkpoint_delay;
+timer_mod(s->colo_delay_timer, next_notify_time);
+}
+
 void migrate_start_colo_process(MigrationState *s)
 {
 qemu_mutex_unlock_iothread();
+qemu_sem_init(&s->colo_checkpoint_sem, 0);
+s->colo_delay_timer =  timer_new_ms(QEMU_CLOCK_HOST,
+colo_checkpoint_notify, s);
+
 migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
   MIGRATION_STATUS_COLO);
 colo_process_checkpoint(s);
diff --git a/migration/migration.c b/migration/migration.c
index 2a26a20..c6ae69d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -891,6 +891,9 @@ void qmp_migrate_set_parameters(MigrationParameters 
*params, Error **errp)
 
 if (params->has_x_checkpoint_delay) {
 s->parameters.x_checkpoint_delay = params->x_checkpoint_delay;
+if (migration_in_colo_state()) {
+colo_checkpoint_notify(s);
+}
 }
 }
 
-- 
2.9.3




[Qemu-devel] [RFC PATCH 33/41] block: Allow backing file links in change_parent_backing_link()

2017-02-13 Thread Kevin Wolf
Now that the backing file child role implements .attach/.detach
callbacks, nothing prevents us from modifying the graph even if that
involves changing backing file links.

Signed-off-by: Kevin Wolf 
---
 block.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index ff328f1..8224dde 100644
--- a/block.c
+++ b/block.c
@@ -2788,9 +2788,9 @@ static void change_parent_backing_link(BlockDriverState 
*from,
 continue;
 }
 if (c->role == &child_backing) {
-/* @from is generally not allowed to be a backing file, except for
- * when @to is the overlay. In that case, @from may not be replaced
- * by @to as @to's backing node. */
+/* If @from is a backing file of @to, ignore the child to avoid
+ * creating a loop. We only want to change the pointer of other
+ * parents. */
 QLIST_FOREACH(to_c, &to->children, next) {
 if (to_c == c) {
 break;
@@ -2801,7 +2801,6 @@ static void change_parent_backing_link(BlockDriverState 
*from,
 }
 }
 
-assert(c->role != &child_backing);
 bdrv_ref(to);
 bdrv_replace_child(c, to);
 bdrv_unref(from);
-- 
1.8.3.1




[Qemu-devel] [PULL 11/14] migration: Add VMSTATE_UNUSED_VARRAY_UINT32

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

VMSTATE_UNUSED_VARRAY_UINT32 is used to skip a chunk of the stream
that's an n-element array;  note the array size and the dynamic value
read never get multiplied so there's no overflow risk.
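
One way to read "never get multiplied" is that the elements are discarded one at a time instead of computing a total byte count up front. The sketch below follows that reading. It is an assumption about the approach rather than the vmstate code, and skip_elements() is a hypothetical helper working on a position within a buffer of known length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Skip 'n' elements of 'elem_size' bytes each, starting at *pos in a
 * stream of 'stream_len' bytes. The product n * elem_size is never
 * formed, so it cannot overflow. Returns false if the stream is too
 * short; *pos then points just past the last element that did fit.
 */
static bool skip_elements(size_t stream_len, size_t *pos,
                          uint32_t n, size_t elem_size)
{
    assert(*pos <= stream_len);

    for (uint32_t i = 0; i < n; i++) {
        if (elem_size > stream_len - *pos) {
            return false;
        }
        *pos += elem_size;
    }
    return true;
}
```

Even with a hostile element count and a huge element size, the loop stops at the first element that does not fit instead of wrapping around.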

Signed-off-by: Dr. David Alan Gilbert 
Message-Id: <20170203160651.19917-2-dgilb...@redhat.com>
Reviewed-by: Juan Quintela 
Signed-off-by: Dr. David Alan Gilbert 
---
 include/migration/vmstate.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 39db47e..7339594 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -658,6 +658,17 @@ extern const VMStateInfo vmstate_info_qtailq;
 .flags= VMS_BUFFER,  \
 }
 
+/* Discard size * field_num bytes, where field_num is a uint32 member */
+#define VMSTATE_UNUSED_VARRAY_UINT32(_state, _test, _version, _field_num, _size) {\
+.name = "unused",\
+.field_exists = (_test), \
+.num_offset   = vmstate_offset_value(_state, _field_num, uint32_t),\
+.version_id   = (_version),  \
+.size = (_size), \
+.info = &vmstate_info_unused_buffer, \
+.flags= VMS_VARRAY_UINT32 | VMS_BUFFER,  \
+}
+
 /* _field_size should be a int32_t field in the _state struct giving the
  * size of the bitmap _field in bits.
  */
-- 
2.9.3




[Qemu-devel] [RFC PATCH 29/41] commit: Use real permissions for HMP 'commit'

2017-02-13 Thread Kevin Wolf
This is a little simpler than the commit block job because it's
synchronous and only commits into the immediate backing file, but
otherwise it does more or less the same.

Signed-off-by: Kevin Wolf 
---
 block/commit.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/block/commit.c b/block/commit.c
index 49ffddb..581d161 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -396,11 +396,14 @@ fail:
 int bdrv_commit(BlockDriverState *bs)
 {
 BlockBackend *src, *backing;
+BlockDriverState *backing_file_bs = NULL;
+BlockDriverState *commit_top_bs = NULL;
 BlockDriver *drv = bs->drv;
 int64_t sector, total_sectors, length, backing_length;
 int n, ro, open_flags;
 int ret = 0;
 uint8_t *buf = NULL;
+Error *local_err = NULL;
 
 if (!drv)
 return -ENOMEDIUM;
@@ -423,17 +426,31 @@ int bdrv_commit(BlockDriverState *bs)
 }
 }
 
-/* FIXME Use real permissions */
-src = blk_new(0, BLK_PERM_ALL);
-backing = blk_new(0, BLK_PERM_ALL);
+src = blk_new(BLK_PERM_CONSISTENT_READ, BLK_PERM_ALL);
+backing = blk_new(BLK_PERM_WRITE, BLK_PERM_ALL);
+
+ret = blk_insert_bs(src, bs, &local_err);
+if (ret < 0) {
+error_report_err(local_err);
+goto ro_cleanup;
+}
+
+/* Insert commit_top block node above backing, so we can write to it */
+backing_file_bs = backing_bs(bs);
 
-ret = blk_insert_bs(src, bs, NULL);
+ret = bdrv_new_open_driver(&bdrv_commit_top, &commit_top_bs, NULL,
+   BDRV_O_RDWR, &local_err);
 if (ret < 0) {
+error_report_err(local_err);
 goto ro_cleanup;
 }
 
-ret = blk_insert_bs(backing, bs->backing->bs, NULL);
+bdrv_set_backing_hd(commit_top_bs, backing_file_bs);
+bdrv_set_backing_hd(bs, commit_top_bs);
+
+ret = blk_insert_bs(backing, backing_file_bs, &local_err);
 if (ret < 0) {
+error_report_err(local_err);
 goto ro_cleanup;
 }
 
@@ -507,6 +524,10 @@ int bdrv_commit(BlockDriverState *bs)
 ro_cleanup:
 qemu_vfree(buf);
 
+if (backing_file_bs) {
+bdrv_set_backing_hd(bs, backing_file_bs);
+}
+bdrv_unref(commit_top_bs);
 blk_unref(src);
 blk_unref(backing);
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 27/41] block: Add bdrv_new_open_driver()

2017-02-13 Thread Kevin Wolf
This function allows creating more or less normal BlockDriverStates
even for BlockDrivers that aren't globally registered (e.g. helper
filters for block jobs).

Signed-off-by: Kevin Wolf 
---
 block.c   | 31 ++-
 include/block/block.h |  2 ++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 9c80cba..7d84d43 100644
--- a/block.c
+++ b/block.c
@@ -948,13 +948,16 @@ static int bdrv_open_driver(BlockDriverState *bs, 
BlockDriver *drv,
 }
 
 bs->drv = drv;
+bs->read_only = !(bs->open_flags & BDRV_O_RDWR);
 bs->opaque = g_malloc0(drv->instance_size);
 
 if (drv->bdrv_file_open) {
 assert(!drv->bdrv_needs_filename || bs->filename[0]);
 ret = drv->bdrv_file_open(bs, options, open_flags, &local_err);
-} else {
+} else if (drv->bdrv_open) {
 ret = drv->bdrv_open(bs, options, open_flags, &local_err);
+} else {
+ret = 0;
 }
 
 if (ret < 0) {
@@ -995,6 +998,32 @@ free_and_fail:
 return ret;
 }
 
+int bdrv_new_open_driver(BlockDriver *drv, BlockDriverState **result,
+ const char *node_name, int flags, Error **errp)
+{
+BlockDriverState *bs;
+int ret;
+
+bs = bdrv_new();
+bs->open_flags = flags;
+bs->explicit_options = qdict_new();
+bs->options = qdict_new();
+bs->opaque = NULL;
+
+update_options_from_flags(bs->options, flags);
+
+ret = bdrv_open_driver(bs, drv, node_name, bs->options, flags, errp);
+if (ret < 0) {
+QDECREF(bs->explicit_options);
+QDECREF(bs->options);
+bdrv_unref(bs);
+return ret;
+}
+
+*result = bs;
+return 0;
+}
+
 QemuOptsList bdrv_runtime_opts = {
 .name = "bdrv_common",
 .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
diff --git a/include/block/block.h b/include/block/block.h
index 93812df..3238850 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -215,6 +215,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
const char *bdref_key, Error **errp);
 BlockDriverState *bdrv_open(const char *filename, const char *reference,
 QDict *options, int flags, Error **errp);
+int bdrv_new_open_driver(BlockDriver *drv, BlockDriverState **result,
+ const char *node_name, int flags, Error **errp);
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 BlockDriverState *bs,
 QDict *options, int flags);
-- 
1.8.3.1




[Qemu-devel] [PULL 04/14] add 'release-ram' migrate capability

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Pavel Butsykin 

This feature frees the migrated memory on the source during postcopy-ram
migration. In the second step of postcopy-ram migration, when the source VM
is paused, we can free unnecessary memory. In particular, this makes it
possible to start relieving memory pressure on the source host in a
load-balancing scenario.

Signed-off-by: Pavel Butsykin 
Message-Id: <20170203152321.19739-3-pbutsy...@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
   Manually merged in Pavel's 'migration: madvise error_report fixup!'
---
 include/migration/migration.h |  1 +
 include/migration/qemu-file.h |  3 ++-
 migration/migration.c |  9 +++
 migration/qemu-file.c | 59 ++-
 migration/ram.c   | 22 +++-
 qapi-schema.json  |  5 +++-
 6 files changed, 89 insertions(+), 10 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 7528cc2..b9b706a 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -304,6 +304,7 @@ int migrate_add_blocker(Error *reason, Error **errp);
  */
 void migrate_del_blocker(Error *reason);
 
+bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 
diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index abedd46..0cd648a 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -132,7 +132,8 @@ void qemu_put_byte(QEMUFile *f, int v);
  * put_buffer without copying the buffer.
  * The buffer should be available till it is sent asynchronously.
  */
-void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size);
+void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
+   bool may_free);
 bool qemu_file_mode_is_not_valid(const char *mode);
 bool qemu_file_is_writable(QEMUFile *f);
 
diff --git a/migration/migration.c b/migration/migration.c
index 2b179c6..68afc07 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1297,6 +1297,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
 qmp_migrate_set_parameters(&p, errp);
 }
 
+bool migrate_release_ram(void)
+{
+MigrationState *s;
+
+s = migrate_get_current();
+
+return s->enabled_capabilities[MIGRATION_CAPABILITY_RELEASE_RAM];
+}
+
 bool migrate_postcopy_ram(void)
 {
 MigrationState *s;
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index e9fae31..195fa94 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -49,6 +49,7 @@ struct QEMUFile {
 int buf_size; /* 0 when writing */
 uint8_t buf[IO_BUF_SIZE];
 
+DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
 struct iovec iov[MAX_IOV_SIZE];
 unsigned int iovcnt;
 
@@ -132,6 +133,41 @@ bool qemu_file_is_writable(QEMUFile *f)
 return f->ops->writev_buffer;
 }
 
+static void qemu_iovec_release_ram(QEMUFile *f)
+{
+struct iovec iov;
+unsigned long idx;
+
+/* Find and release all the contiguous memory ranges marked as may_free. */
+idx = find_next_bit(f->may_free, f->iovcnt, 0);
+if (idx >= f->iovcnt) {
+return;
+}
+iov = f->iov[idx];
+
+/* The madvise() in the loop is called for iov within a continuous range and
+ * then reinitialize the iov. And in the end, madvise() is called for the
+ * last iov.
+ */
+while ((idx = find_next_bit(f->may_free, f->iovcnt, idx + 1)) < f->iovcnt) {
+/* check for adjacent buffer and coalesce them */
+if (iov.iov_base + iov.iov_len == f->iov[idx].iov_base) {
+iov.iov_len += f->iov[idx].iov_len;
+continue;
+}
+if (qemu_madvise(iov.iov_base, iov.iov_len, QEMU_MADV_DONTNEED) < 0) {
+error_report("migrate: madvise DONTNEED failed %p %zd: %s",
+ iov.iov_base, iov.iov_len, strerror(errno));
+}
+iov = f->iov[idx];
+}
+if (qemu_madvise(iov.iov_base, iov.iov_len, QEMU_MADV_DONTNEED) < 0) {
+error_report("migrate: madvise DONTNEED failed %p %zd: %s",
+ iov.iov_base, iov.iov_len, strerror(errno));
+}
+memset(f->may_free, 0, sizeof(f->may_free));
+}
+
 /**
  * Flushes QEMUFile buffer
  *
@@ -151,6 +187,8 @@ void qemu_fflush(QEMUFile *f)
 if (f->iovcnt > 0) {
 expect = iov_size(f->iov, f->iovcnt);
 ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt, f->pos);
+
+qemu_iovec_release_ram(f);
 }
 
 if (ret >= 0) {
@@ -304,13 +342,19 @@ int qemu_fclose(QEMUFile *f)
 return ret;
 }
 
-static void add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size)
+static void add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
+ bool may_free)
 {
 /* check for adjacent buffer and coalesce them */
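
The coalescing step in qemu_iovec_release_ram() above can be sketched outside of QEMU as follows. This is only a minimal illustration, not the patch's code: release_coalesced(), release_fn and sum_bytes() are hypothetical names, a plain bool array stands in for the may_free bitmap, and the callback stands in for the qemu_madvise(..., QEMU_MADV_DONTNEED) call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical callback type; in the patch the equivalent step is
 * qemu_madvise(base, len, QEMU_MADV_DONTNEED). */
typedef void (*release_fn)(void *base, size_t len, void *opaque);

/* Walk iov[0..cnt), merge flagged entries whose memory is back-to-back,
 * and call release() once per merged range.
 * Returns the number of release() calls made. */
static size_t release_coalesced(const struct iovec *iov, const bool *may_free,
                                size_t cnt, release_fn release, void *opaque)
{
    struct iovec cur = { .iov_base = NULL, .iov_len = 0 };
    bool have = false;
    size_t calls = 0;

    for (size_t i = 0; i < cnt; i++) {
        if (!may_free[i]) {
            continue;               /* entry must not be released */
        }
        if (have &&
            (char *)cur.iov_base + cur.iov_len == (char *)iov[i].iov_base) {
            cur.iov_len += iov[i].iov_len;   /* adjacent: extend the range */
            continue;
        }
        if (have) {
            release(cur.iov_base, cur.iov_len, opaque);
            calls++;
        }
        cur = iov[i];               /* start a new range */
        have = true;
    }
    if (have) {                     /* flush the last pending range */
        release(cur.iov_base, cur.iov_len, opaque);
        calls++;
    }
    return calls;
}

/* Example callback: accumulate the total number of bytes released. */
static void sum_bytes(void *base, size_t len, void *opaque)
{
    (void)base;
    *(size_t *)opaque += len;
}
```

As in the patch, only flagged entries are considered, so two flagged buffers that are adjacent in memory get merged even if an unflagged entry sits between them in the array. Calling release_coalesced(iov, may_free, cnt, sum_bytes, &total) returns how many ranges were released and leaves the byte count in total.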
 

[Qemu-devel] [PULL 05/14] migration: discard non-dirty ram pages after the start of postcopy

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Pavel Butsykin 

After the start of postcopy migration there are some non-dirty pages which have
already been migrated. These pages are no longer needed on the source VM, so
we can free them, and doing so doesn't hurt completing the migration.

Signed-off-by: Pavel Butsykin 
Message-Id: <20170203152321.19739-4-pbutsy...@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert 
---
 include/migration/migration.h |  1 +
 migration/migration.c |  4 
 migration/ram.c   | 19 +++
 3 files changed, 24 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index b9b706a..71ce190 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -285,6 +285,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms);
 int ram_discard_range(MigrationIncomingState *mis, const char *block_name,
   uint64_t start, size_t length);
 int ram_postcopy_incoming_init(MigrationIncomingState *mis);
+void ram_postcopy_migrated_memory_release(MigrationState *ms);
 
 /**
  * @migrate_add_blocker - prevent migration from proceeding
diff --git a/migration/migration.c b/migration/migration.c
index 68afc07..2a26a20 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1722,6 +1722,10 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
  */
 qemu_savevm_send_ping(ms->to_dst_file, 4);
 
+if (migrate_release_ram()) {
+ram_postcopy_migrated_memory_release(ms);
+}
+
 ret = qemu_file_get_error(ms->to_dst_file);
 if (ret) {
 error_report("postcopy_start: Migration stream errored");
diff --git a/migration/ram.c b/migration/ram.c
index c22209d..67f2efb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1537,6 +1537,25 @@ void ram_debug_dump_bitmap(unsigned long *todump, bool expected)
 
 /*  functions for postcopy * */
 
+void ram_postcopy_migrated_memory_release(MigrationState *ms)
+{
+struct RAMBlock *block;
+unsigned long *bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+unsigned long first = block->offset >> TARGET_PAGE_BITS;
+unsigned long range = first + (block->used_length >> TARGET_PAGE_BITS);
+unsigned long run_start = find_next_zero_bit(bitmap, range, first);
+
+while (run_start < range) {
+unsigned long run_end = find_next_bit(bitmap, range, run_start + 1);
+ram_discard_range(NULL, block->idstr, run_start << TARGET_PAGE_BITS,
+  (run_end - run_start) << TARGET_PAGE_BITS);
+run_start = find_next_zero_bit(bitmap, range, run_end + 1);
+}
+}
+}
+
 /*
  * Callback from postcopy_each_ram_send_discard for each RAMBlock
  * Note: At this point the 'unsentmap' is the processed bitmap combined
-- 
2.9.3
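
The run-finding loop in ram_postcopy_migrated_memory_release() can be sketched in isolation like this. It is a hedged illustration only: next_bit_equal() is a simple linear stand-in for QEMU's find_next_bit()/find_next_zero_bit(), and for_each_clean_run(), run_fn and count_pages() are hypothetical names, with the callback taking the place of ram_discard_range().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

static bool test_bit(const unsigned long *map, size_t nr)
{
    return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* First index in [start, end) whose bit equals `value`, or `end` if none.
 * Linear stand-in for find_next_bit()/find_next_zero_bit(). */
static size_t next_bit_equal(const unsigned long *map, size_t end,
                             size_t start, bool value)
{
    if (start >= end) {
        return end;
    }
    while (start < end && test_bit(map, start) != value) {
        start++;
    }
    return start;
}

/* Hypothetical callback; the patch calls ram_discard_range() here. */
typedef void (*run_fn)(size_t first_page, size_t npages, void *opaque);

/* Call fn() for every maximal run of clear (non-dirty) bits in
 * [first, end). Returns the number of runs found. */
static size_t for_each_clean_run(const unsigned long *dirty, size_t first,
                                 size_t end, run_fn fn, void *opaque)
{
    size_t runs = 0;
    size_t run_start = next_bit_equal(dirty, end, first, false);

    while (run_start < end) {
        /* The run ends at the next dirty bit, or at `end`. */
        size_t run_end = next_bit_equal(dirty, end, run_start + 1, true);

        fn(run_start, run_end - run_start, opaque);
        runs++;
        /* Bit run_end is dirty (or out of range), so resume after it. */
        run_start = next_bit_equal(dirty, end, run_end + 1, false);
    }
    return runs;
}

/* Example callback: accumulate the number of pages in all runs. */
static void count_pages(size_t first_page, size_t npages, void *opaque)
{
    (void)first_page;
    *(size_t *)opaque += npages;
}
```

Reporting whole runs rather than single pages keeps the number of discard calls proportional to the number of clean regions, which is the same motivation as in the patch.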




[Qemu-devel] [RFC PATCH 19/41] hw/block: Request permissions

2017-02-13 Thread Kevin Wolf
This makes all device emulations with a qdev drive property request
permissions on their BlockBackend. We don't block anything yet.

Signed-off-by: Kevin Wolf 
---
 hw/block/block.c | 19 ++-
 hw/block/fdc.c   | 25 +++--
 hw/block/m25p80.c|  8 
 hw/block/nand.c  |  7 +++
 hw/block/nvme.c  |  8 +++-
 hw/block/onenand.c   |  7 +++
 hw/block/pflash_cfi01.c  | 18 --
 hw/block/pflash_cfi02.c  | 19 +--
 hw/block/virtio-blk.c|  8 +++-
 hw/core/qdev-properties-system.c |  1 -
 hw/ide/qdev.c|  7 +--
 hw/nvram/spapr_nvram.c   |  8 
 hw/scsi/scsi-disk.c  |  8 ++--
 hw/sd/sd.c   |  6 ++
 hw/usb/dev-storage.c |  6 +-
 include/hw/block/block.h |  3 ++-
 tests/qemu-iotests/051.pc.out|  6 +++---
 17 files changed, 137 insertions(+), 27 deletions(-)

diff --git a/hw/block/block.c b/hw/block/block.c
index 8dc9d84..c3d3901 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -51,11 +51,28 @@ void blkconf_blocksizes(BlockConf *conf)
 }
 }
 
-void blkconf_apply_backend_options(BlockConf *conf)
+void blkconf_apply_backend_options(BlockConf *conf, bool readonly,
+   Error **errp)
 {
 BlockBackend *blk = conf->blk;
 BlockdevOnError rerror, werror;
+uint64_t perm;
 bool wce;
+int ret;
+
+perm = BLK_PERM_CONSISTENT_READ;
+if (!readonly) {
+perm |= BLK_PERM_WRITE;
+}
+
+/* TODO Remove BLK_PERM_WRITE unless explicitly configured so */
+ret = blk_set_perm(blk, perm,
+   BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
+   BLK_PERM_GRAPH_MOD | BLK_PERM_RESIZE | BLK_PERM_WRITE,
+   errp);
+if (ret < 0) {
+return;
+}
 
 switch (conf->wce) {
 case ON_OFF_AUTO_ON:wce = true; break;
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 5f6c496..b3fa4c7 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -186,6 +186,7 @@ typedef enum FDiskFlags {
 struct FDrive {
 FDCtrl *fdctrl;
 BlockBackend *blk;
+BlockConf *conf;
 /* Drive status */
 FloppyDriveType drive;/* CMOS drive type*/
 uint8_t perpendicular;/* 2.88 MB access mode*/
@@ -472,6 +473,19 @@ static void fd_revalidate(FDrive *drv)
 static void fd_change_cb(void *opaque, bool load, Error **errp)
 {
 FDrive *drive = opaque;
+Error *local_err = NULL;
+
+if (!load) {
+blk_set_perm(drive->blk, 0, BLK_PERM_ALL, &error_abort);
+} else {
+blkconf_apply_backend_options(drive->conf,
+  blk_is_read_only(drive->blk),
+  &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
 
 drive->media_changed = 1;
 drive->media_validated = false;
@@ -508,6 +522,7 @@ static int floppy_drive_init(DeviceState *qdev)
 FloppyDrive *dev = FLOPPY_DRIVE(qdev);
 FloppyBus *bus = FLOPPY_BUS(qdev->parent_bus);
 FDrive *drive;
+Error *local_err = NULL;
 int ret;
 
 if (dev->unit == -1) {
@@ -533,7 +548,6 @@ static int floppy_drive_init(DeviceState *qdev)
 
 if (!dev->conf.blk) {
 /* Anonymous BlockBackend for an empty drive */
-/* FIXME Use real permissions */
 dev->conf.blk = blk_new(0, BLK_PERM_ALL);
 ret = blk_attach_dev(dev->conf.blk, qdev);
 assert(ret == 0);
@@ -552,7 +566,13 @@ static int floppy_drive_init(DeviceState *qdev)
  * blkconf_apply_backend_options(). */
 dev->conf.rerror = BLOCKDEV_ON_ERROR_AUTO;
 dev->conf.werror = BLOCKDEV_ON_ERROR_AUTO;
-blkconf_apply_backend_options(&dev->conf);
+
+blkconf_apply_backend_options(&dev->conf, blk_is_read_only(dev->conf.blk),
+  &local_err);
+if (local_err) {
+error_report_err(local_err);
+return -1;
+}
 
 /* 'enospc' is the default for -drive, 'report' is what blk_new() gives us
  * for empty drives. */
@@ -566,6 +586,7 @@ static int floppy_drive_init(DeviceState *qdev)
 return -1;
 }
 
+drive->conf = &dev->conf;
 drive->blk = dev->conf.blk;
 drive->fdctrl = bus->fdc;
 
diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 2d6eb46..190573c 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -1215,6 +1215,7 @@ static void m25p80_realize(SSISlave *ss, Error **errp)
 {
 Flash *s = M25P80(ss);
 M25P80Class *mc = M25P80_GET_CLASS(s);
+int ret;
 
 s->pi = mc->pi;
 
@@ -1222,6 +1223,13 @@ static void m25p80_realize(SSISlave *ss, Error **errp)
 s->dirty_page = -1;
 
 if (s->blk) {
+uint64_t perm = BLK_PERM_CONSISTENT_READ |
+

[Qemu-devel] [RFC PATCH 26/41] block: Factor out bdrv_open_driver()

2017-02-13 Thread Kevin Wolf
This is a function that doesn't do any option parsing, but just does
some basic BlockDriverState setup and calls the .bdrv_open() function of
the block driver.

Signed-off-by: Kevin Wolf 
---
 block.c | 112 +---
 1 file changed, 65 insertions(+), 47 deletions(-)

diff --git a/block.c b/block.c
index 842ac78..9c80cba 100644
--- a/block.c
+++ b/block.c
@@ -934,6 +934,67 @@ out:
 g_free(gen_node_name);
 }
 
+static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
+const char *node_name, QDict *options,
+int open_flags, Error **errp)
+{
+Error *local_err = NULL;
+int ret;
+
+bdrv_assign_node_name(bs, node_name, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return -EINVAL;
+}
+
+bs->drv = drv;
+bs->opaque = g_malloc0(drv->instance_size);
+
+if (drv->bdrv_file_open) {
+assert(!drv->bdrv_needs_filename || bs->filename[0]);
+ret = drv->bdrv_file_open(bs, options, open_flags, &local_err);
+} else {
+ret = drv->bdrv_open(bs, options, open_flags, &local_err);
+}
+
+if (ret < 0) {
+if (local_err) {
+error_propagate(errp, local_err);
+} else if (bs->filename[0]) {
+error_setg_errno(errp, -ret, "Could not open '%s'", bs->filename);
+} else {
+error_setg_errno(errp, -ret, "Could not open image");
+}
+goto free_and_fail;
+}
+
+ret = refresh_total_sectors(bs, bs->total_sectors);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not refresh total sector count");
+goto free_and_fail;
+}
+
+bdrv_refresh_limits(bs, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto free_and_fail;
+}
+
+assert(bdrv_opt_mem_align(bs) != 0);
+assert(bdrv_min_mem_align(bs) != 0);
+assert(is_power_of_2(bs->bl.request_alignment));
+
+return 0;
+
+free_and_fail:
+/* FIXME Close bs first if already opened*/
+g_free(bs->opaque);
+bs->opaque = NULL;
+bs->drv = NULL;
+return ret;
+}
+
 QemuOptsList bdrv_runtime_opts = {
 .name = "bdrv_common",
 .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
@@ -1028,14 +1089,6 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
 trace_bdrv_open_common(bs, filename ?: "", bs->open_flags,
drv->format_name);
 
-node_name = qemu_opt_get(opts, "node-name");
-bdrv_assign_node_name(bs, node_name, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
-ret = -EINVAL;
-goto fail_opts;
-}
-
 bs->read_only = !(bs->open_flags & BDRV_O_RDWR);
 
 if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
@@ -1101,54 +1154,19 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
 }
 pstrcpy(bs->exact_filename, sizeof(bs->exact_filename), bs->filename);
 
-bs->drv = drv;
-bs->opaque = g_malloc0(drv->instance_size);
-
 /* Open the image, either directly or using a protocol */
 open_flags = bdrv_open_flags(bs, bs->open_flags);
-if (drv->bdrv_file_open) {
-assert(file == NULL);
-assert(!drv->bdrv_needs_filename || filename != NULL);
-ret = drv->bdrv_file_open(bs, options, open_flags, &local_err);
-} else {
-ret = drv->bdrv_open(bs, options, open_flags, &local_err);
-}
-
-if (ret < 0) {
-if (local_err) {
-error_propagate(errp, local_err);
-} else if (bs->filename[0]) {
-error_setg_errno(errp, -ret, "Could not open '%s'", bs->filename);
-} else {
-error_setg_errno(errp, -ret, "Could not open image");
-}
-goto free_and_fail;
-}
+node_name = qemu_opt_get(opts, "node-name");
 
-ret = refresh_total_sectors(bs, bs->total_sectors);
+assert(!drv->bdrv_file_open || file == NULL);
+ret = bdrv_open_driver(bs, drv, node_name, options, open_flags, errp);
 if (ret < 0) {
-error_setg_errno(errp, -ret, "Could not refresh total sector count");
-goto free_and_fail;
-}
-
-bdrv_refresh_limits(bs, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
-ret = -EINVAL;
-goto free_and_fail;
+goto fail_opts;
 }
 
-assert(bdrv_opt_mem_align(bs) != 0);
-assert(bdrv_min_mem_align(bs) != 0);
-assert(is_power_of_2(bs->bl.request_alignment));
-
 qemu_opts_del(opts);
 return 0;
 
-free_and_fail:
-g_free(bs->opaque);
-bs->opaque = NULL;
-bs->drv = NULL;
 fail_opts:
 qemu_opts_del(opts);
 return ret;
-- 
1.8.3.1




[Qemu-devel] [PULL 02/14] MAINTAINERS: update my email address

2017-02-13 Thread Dr. David Alan Gilbert (git)
From: Amit Shah 

I'm leaving my job at Red Hat, this email address will stop working next week.
Update it to one that I will have access to later.

Signed-off-by: Amit Shah 
Message-Id: <1486120433-11628-1-git-send-email-amit.s...@redhat.com>
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Signed-off-by: Dr. David Alan Gilbert 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d8ea161..fb57d8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1034,7 +1034,7 @@ F: hw/input/virtio-input*.c
 F: include/hw/virtio/virtio-input.h
 
 virtio-serial
-M: Amit Shah 
+M: Amit Shah 
 S: Supported
 F: hw/char/virtio-serial-bus.c
 F: hw/char/virtio-console.c
@@ -1043,7 +1043,7 @@ F: tests/virtio-console-test.c
 F: tests/virtio-serial-test.c
 
 virtio-rng
-M: Amit Shah 
+M: Amit Shah 
 S: Supported
 F: hw/virtio/virtio-rng.c
 F: include/hw/virtio/virtio-rng.h
-- 
2.9.3




Re: [Qemu-devel] [PATCH v7 2/2] mach-virt: Provide sample configuration files

2017-02-13 Thread Laszlo Ersek
On 02/10/17 18:25, Andrea Bolognani wrote:
> These are very much like the sample configuration files
> for q35, and can be used both as documentation and as
> a starting point for creating your own guest.
> 
> Two sample configuration files are provided:
> 
>   * mach-virt-graphical.cfg can be used to start a
> fully-featured (USB, graphical console, etc.)
> guest that uses VirtIO devices;
> 
>   * mach-virt-serial.cfg is similar but has a minimal
> set of devices and uses the serial console.
> 
> All configuration files are fully commented and neatly
> organized.
> ---
>  docs/mach-virt-graphical.cfg | 281 +++
>  docs/mach-virt-serial.cfg| 243 +
>  2 files changed, 524 insertions(+)
>  create mode 100644 docs/mach-virt-graphical.cfg
>  create mode 100644 docs/mach-virt-serial.cfg
> 

[snip]

> +[drive "optical-disk"]
> +  file = "install.iso"  # CHANGE ME
> +  format = "raw"
> +  if = "none"

I usually add

  readonly = "on"

here -- more precisely, at the corresponding location on the command
line --, but I'm unsure if that justifies v8 :)

Reviewed-by: Laszlo Ersek 

Thanks!
Laszlo




[Qemu-devel] [RFC PATCH 23/41] block: Include details on permission errors in message

2017-02-13 Thread Kevin Wolf
Instead of just telling that there was some conflict, we can be specific
and tell which permissions were in conflict and which way the conflict
is.

Signed-off-by: Kevin Wolf 
---
 block.c | 66 ++---
 1 file changed, 55 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 2116542..d743f50 100644
--- a/block.c
+++ b/block.c
@@ -1381,6 +1381,43 @@ static void bdrv_update_perm(BlockDriverState *bs)
 bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
 }
 
+static char *bdrv_child_link_name(BdrvChild *c)
+{
+if (c->role->get_link_name) {
+return c->role->get_link_name(c);
+}
+
+return g_strdup("another user");
+}
+
+static char *bdrv_perm_names(uint64_t perm)
+{
+struct perm_name {
+uint64_t perm;
+const char *name;
+} permissions[] = {
+{ BLK_PERM_CONSISTENT_READ, "consistent read" },
+{ BLK_PERM_WRITE,   "write" },
+{ BLK_PERM_WRITE_UNCHANGED, "write unchanged" },
+{ BLK_PERM_RESIZE,  "resize" },
+{ BLK_PERM_GRAPH_MOD,   "change children" },
+{ 0, NULL }
+};
+
+char *result = g_strdup("");
+struct perm_name *p;
+
+for (p = permissions; p->name; p++) {
+if (perm & p->perm) {
+char *old = result;
+result = g_strdup_printf("%s%s%s", old, *old ? ", " : "", p->name);
+g_free(old);
+}
+}
+
+return result;
+}
+
 static int bdrv_check_update_perm(BlockDriverState *bs, uint64_t new_used_perm,
   uint64_t new_shared_perm,
   BdrvChild *ignore_child, Error **errp)
@@ -1397,17 +1434,24 @@ static int bdrv_check_update_perm(BlockDriverState *bs, uint64_t new_used_perm,
 continue;
 }
 
-if ((new_used_perm & c->shared_perm) != new_used_perm ||
-(c->perm & new_shared_perm) != c->perm)
-{
-const char *user = NULL;
-if (c->role->get_name) {
-user = c->role->get_name(c);
-if (user && !*user) {
-user = NULL;
-}
-}
-error_setg(errp, "Conflicts with %s", user ?: "another operation");
+if ((new_used_perm & c->shared_perm) != new_used_perm) {
+char *link = bdrv_child_link_name(c);
+char *perm_names = bdrv_perm_names(new_used_perm & ~c->shared_perm);
+error_setg(errp, "Conflicts with %s, which does not allow '%s' "
+ "on %s",
+   link, perm_names, bdrv_get_node_name(c->bs));
+g_free(link);
+g_free(perm_names);
+return -EPERM;
+}
+
+if ((c->perm & new_shared_perm) != c->perm) {
+char *link = bdrv_child_link_name(c);
+char *perm_names = bdrv_perm_names(c->perm & ~new_shared_perm);
+error_setg(errp, "Conflicts with %s, which uses '%s' on %s",
+   link, perm_names, bdrv_get_node_name(c->bs));
+g_free(link);
+g_free(perm_names);
 return -EPERM;
 }
 
-- 
1.8.3.1
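
The two conflict checks and the name formatting in this patch can be sketched without glib as follows. This is an illustration under stated assumptions, not the QEMU code: the SKETCH_PERM_* bit values are made up for the example (the real BLK_PERM_* constants live in QEMU's headers), and perm_names(), unshared_perms() and unallowed_uses() are hypothetical helpers that write into a caller-supplied buffer instead of returning a g_strdup'ed string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative bit values only; not QEMU's actual BLK_PERM_* values. */
enum {
    SKETCH_PERM_CONSISTENT_READ = 1 << 0,
    SKETCH_PERM_WRITE           = 1 << 1,
    SKETCH_PERM_WRITE_UNCHANGED = 1 << 2,
    SKETCH_PERM_RESIZE          = 1 << 3,
    SKETCH_PERM_GRAPH_MOD       = 1 << 4,
};

/* Write a comma-separated list of the permission names set in `perm`
 * into `out`; the list is cut short if the buffer is too small. */
static void perm_names(uint64_t perm, char *out, size_t out_size)
{
    static const struct {
        uint64_t perm;
        const char *name;
    } table[] = {
        { SKETCH_PERM_CONSISTENT_READ, "consistent read" },
        { SKETCH_PERM_WRITE,           "write" },
        { SKETCH_PERM_WRITE_UNCHANGED, "write unchanged" },
        { SKETCH_PERM_RESIZE,          "resize" },
        { SKETCH_PERM_GRAPH_MOD,       "change children" },
    };
    size_t used = 0;

    if (out_size == 0) {
        return;
    }
    out[0] = '\0';
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        int n;

        if (!(perm & table[i].perm)) {
            continue;
        }
        n = snprintf(out + used, out_size - used, "%s%s",
                     used ? ", " : "", table[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            break;                  /* output truncated, stop here */
        }
        used += (size_t)n;
    }
}

/* Permissions the new user wants but an existing user does not share. */
static uint64_t unshared_perms(uint64_t new_used, uint64_t existing_shared)
{
    return new_used & ~existing_shared;
}

/* Permissions an existing user holds but the new user will not share. */
static uint64_t unallowed_uses(uint64_t existing_used, uint64_t new_shared)
{
    return existing_used & ~new_shared;
}
```

A non-zero result from either helper corresponds to one of the two error branches in the patch, and passing that result to perm_names() yields the text that would appear between the quotes in the message.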




[Qemu-devel] [RFC PATCH 17/41] block: Request real permissions in blk_new_open()

2017-02-13 Thread Kevin Wolf
We can figure out the necessary permissions from the flags that the
caller passed.

Signed-off-by: Kevin Wolf 
---
 block/block-backend.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1f80854..e10a278 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -160,6 +160,7 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
 {
 BlockBackend *blk;
 BlockDriverState *bs;
+uint64_t perm;
 
 blk = blk_new(0, BLK_PERM_ALL);
 bs = bdrv_open(filename, reference, options, flags, errp);
@@ -168,9 +169,20 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
 return NULL;
 }
 
-/* FIXME Use real permissions */
+/* blk_new_open() is mainly used in .bdrv_create implementations and the
+ * tools where sharing isn't a concern because the BDS stays private, so we
+ * just request permission according to the flags.
+ *
+ * The exceptions are xen_disk and blockdev_init(); in these cases, the
+ * caller of blk_new_open() doesn't make use of the permissions, but they
+ * shouldn't hurt either. We can still share everything here because the
+ * guest devices will add their own blockers if they can't share. */
+perm = BLK_PERM_CONSISTENT_READ;
+if (flags & BDRV_O_RDWR) {
+perm |= BLK_PERM_WRITE;
+}
 blk->root = bdrv_root_attach_child(bs, "root", &child_root,
-   0, BLK_PERM_ALL, blk, &error_abort);
+   perm, BLK_PERM_ALL, blk, &error_abort);
 
 return blk;
 }
-- 
1.8.3.1
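
The flag-to-permission mapping introduced here is small enough to show on its own. The sketch below assumes made-up constant values (SKETCH_O_RDWR and SKETCH_PERM_* are placeholders for BDRV_O_RDWR and BLK_PERM_*), and perm_from_open_flags() is a hypothetical helper name rather than a function in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration; QEMU defines the real constants. */
enum {
    SKETCH_PERM_CONSISTENT_READ = 1 << 0,
    SKETCH_PERM_WRITE           = 1 << 1,
};
enum {
    SKETCH_O_RDWR = 1 << 1,
};

/* Every opener needs consistent reads; write permission is requested
 * only when the caller asked for a read-write open. */
static uint64_t perm_from_open_flags(int flags)
{
    uint64_t perm = SKETCH_PERM_CONSISTENT_READ;

    if (flags & SKETCH_O_RDWR) {
        perm |= SKETCH_PERM_WRITE;
    }
    return perm;
}
```

The shared mask is not derived from the flags at all in the patch; it stays BLK_PERM_ALL, so only the requested side changes.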




[Qemu-devel] [RFC PATCH 41/41] block: Assertions for write permissions

2017-02-13 Thread Kevin Wolf
This adds assertions that ensure that the necessary write permissions
have been granted before someone attempts to write to a node.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index cb2feff..74929e5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -925,9 +925,11 @@ bdrv_driver_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
 return drv->bdrv_co_pwritev_compressed(bs, offset, bytes, qiov);
 }
 
-static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
+static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
 int64_t offset, unsigned int bytes, QEMUIOVector *qiov)
 {
+BlockDriverState *bs = child->bs;
+
 /* Perform I/O through a temporary buffer so that users who scribble over
  * their read buffer while the operation is in progress do not end up
  * modifying the image file.  This is critical for zero-copy guest I/O
@@ -943,6 +945,8 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
 size_t skip_bytes;
 int ret;
 
+assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+
 /* Cover entire cluster so no additional backing file I/O is required when
  * allocating cluster in the image file.
  */
@@ -1051,7 +1055,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
 }
 
 if (!ret || pnum != nb_sectors) {
-ret = bdrv_co_do_copy_on_readv(bs, offset, bytes, qiov);
+ret = bdrv_co_do_copy_on_readv(child, offset, bytes, qiov);
 goto out;
 }
 }
@@ -1334,6 +1338,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 assert(!waited || !req->serialising);
 assert(req->overlap_offset <= offset);
 assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
+assert(child->perm & BLK_PERM_WRITE);
 
 ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
 
-- 
1.8.3.1



