Re: [Qemu-devel] [for-2.10 PATCH v2] 9pfs: local: fix fchmodat_nofollow() limitations

2017-08-09 Thread Greg Kurz
On Wed, 9 Aug 2017 10:59:46 -0500
Eric Blake  wrote:

> On 08/09/2017 10:22 AM, Greg Kurz wrote:
> 
> >>>
> >>> The solution is to use O_PATH: openat() now succeeds in both cases, and we
> >>> can ensure the path isn't a symlink with fstat(). The associated entry in
> >>> "/proc/self/fd" can hence be safely passed to the regular chmod() 
> >>> syscall.
> >>
> >> Hey - should we point this out as a viable solution to the glibc folks,
> >> since their current user-space emulation of AT_SYMLINK_NOFOLLOW is broken?
> >>  
> > 
> > Probably. What's the best way to do that ?  
> 
> I've added a comment to
> https://sourceware.org/bugzilla/show_bug.cgi?id=14578; you'll also want
> to point to the lkml discussion in that bug.  And reading that bug, it
> also looks like your hack with /proc/self/fd has been proposed by Rich
> Felker since 2013! (although fstat() didn't work until Linux 3.6, even
> though O_PATH predates that time) - so there is that one additional
> concern of whether we need to cater to the window of kernels where
> O_PATH exists but fstat() on that fd can't learn whether we opened a
> symlink.
> 

BTW, what happens with fstat() and O_PATH before Linux 3.6? Does it
fail, or does it return something wrong in the stat buf?




Re: [Qemu-devel] [PATCH 3/4] block-backend: shift in-flight counter to BB from BDS

2017-08-09 Thread Kevin Wolf
Am 08.08.2017 um 19:57 hat John Snow geschrieben:
> From: Kevin Wolf 
> 
> This allows us to detect errors in cache flushing (ENOMEDIUM)
> without choking on a null dereference because we assume that
> blk_bs(bb) is always defined.
> 
> Signed-off-by: Kevin Wolf 
> Signed-off-by: John Snow 
> ---
>  block.c   |  2 +-
>  block/block-backend.c | 40 ++--
>  2 files changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/block.c b/block.c
> index ce9cce7..834b836 100644
> --- a/block.c
> +++ b/block.c
> @@ -4476,7 +4476,7 @@ out:
>  
>  AioContext *bdrv_get_aio_context(BlockDriverState *bs)
>  {
> -return bs->aio_context;
> +return bs ? bs->aio_context : qemu_get_aio_context();
>  }

This should probably be a separate patch; it's not really related to
moving the in-flight counter, but fixes another NULL dereference in
blk_aio_prwv().

>  void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 968438c..efd7e92 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -68,6 +68,9 @@ struct BlockBackend {
>  NotifierList remove_bs_notifiers, insert_bs_notifiers;
>  
>  int quiesce_counter;
> +
> +/* Number of in-flight requests. Accessed with atomic ops. */
> +unsigned int in_flight;
>  };
>  
>  typedef struct BlockBackendAIOCB {
> @@ -1109,6 +1112,16 @@ int blk_make_zero(BlockBackend *blk, BdrvRequestFlags 
> flags)
>  return bdrv_make_zero(blk->root, flags);
>  }
>  
> +static void blk_inc_in_flight(BlockBackend *blk)
> +{
> +atomic_inc(&blk->in_flight);
> +}
> +
> +static void blk_dec_in_flight(BlockBackend *blk)
> +{
> +atomic_dec(&blk->in_flight);
> +}
> +
>  static void error_callback_bh(void *opaque)
>  {
>  struct BlockBackendAIOCB *acb = opaque;
> @@ -1147,7 +1160,7 @@ static const AIOCBInfo blk_aio_em_aiocb_info = {
>  static void blk_aio_complete(BlkAioEmAIOCB *acb)
>  {
>  if (acb->has_returned) {
> -bdrv_dec_in_flight(acb->common.bs);
> +blk_dec_in_flight(acb->rwco.blk);
>  acb->common.cb(acb->common.opaque, acb->rwco.ret);
>  qemu_aio_unref(acb);
>  }
> @@ -1168,7 +1181,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, 
> int64_t offset, int bytes,
>  BlkAioEmAIOCB *acb;
>  Coroutine *co;
>  
> -bdrv_inc_in_flight(blk_bs(blk));
> +blk_inc_in_flight(blk);
>  acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
>  acb->rwco = (BlkRwCo) {
>  .blk= blk,
> @@ -1405,14 +1418,28 @@ int blk_flush(BlockBackend *blk)
>  
>  void blk_drain(BlockBackend *blk)
>  {
> -if (blk_bs(blk)) {
> -bdrv_drain(blk_bs(blk));
> +AioContext *ctx = blk_get_aio_context(blk);
> +
> +while (atomic_read(&blk->in_flight)) {
> +aio_context_acquire(ctx);
> +aio_poll(ctx, false);
> +aio_context_release(ctx);
> +
> +if (blk_bs(blk)) {
> +bdrv_drain(blk_bs(blk));
> +}
>  }
>  }
>  
>  void blk_drain_all(void)
>  {
> -bdrv_drain_all();
> +BlockBackend *blk = NULL;
> +
> +bdrv_drain_all_begin();
> +while ((blk = blk_all_next(blk)) != NULL) {
> +blk_drain(blk);
> +}
> +bdrv_drain_all_end();
>  }

We still need to check that everyone who should call blk_drain_all()
rather than bdrv_drain_all() actually does so.
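
To make the counter-plus-drain idea from the hunks above concrete, here is a minimal standalone sketch. It uses C11 atomics rather than QEMU's own atomic helpers, and `Backend`, `backend_drain()` and `complete_one_request()` are hypothetical stand-ins, not QEMU APIs; the poll callback only simulates what one event loop iteration might do.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a block backend; not a QEMU type. */
typedef struct Backend {
    atomic_uint in_flight;   /* requests started but not yet completed */
} Backend;

static void backend_inc_in_flight(Backend *b)
{
    atomic_fetch_add(&b->in_flight, 1);
}

static void backend_dec_in_flight(Backend *b)
{
    unsigned int old = atomic_fetch_sub(&b->in_flight, 1);

    assert(old > 0);         /* a completion without a matching start is a bug */
    (void)old;               /* silence unused warning when NDEBUG is set */
}

/* Simulated event-loop step: pretend one pending request completed. */
static void complete_one_request(Backend *b)
{
    backend_dec_in_flight(b);
}

/*
 * Drain: keep polling until nothing is in flight.  The return value
 * (number of poll iterations) exists only for illustration.
 */
static unsigned int backend_drain(Backend *b, void (*poll)(Backend *))
{
    unsigned int iterations = 0;

    while (atomic_load(&b->in_flight) > 0) {
        poll(b);
        iterations++;
    }
    return iterations;
}
```

Because the counter lives in the backend rather than in the attached node, the drain loop in this sketch never has to dereference a node pointer, which is the property the commit message is after.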

>  void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
> @@ -1453,10 +1480,11 @@ static void send_qmp_error_event(BlockBackend *blk,
>   bool is_read, int error)
>  {
>  IoOperationType optype;
> +BlockDriverState *bs = blk_bs(blk);
>  
>  optype = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
>  qapi_event_send_block_io_error(blk_name(blk),
> -   bdrv_get_node_name(blk_bs(blk)), optype,
> +   bs ? bdrv_get_node_name(bs) : "", optype,
> action, blk_iostatus_is_enabled(blk),
> error == ENOSPC, strerror(error),
> &error_abort);

And this is another independent NULL dereference fix.

Kevin



Re: [Qemu-devel] [for-2.10 PATCH v3] 9pfs: local: fix fchmodat_nofollow() limitations

2017-08-09 Thread Eric Blake
On 08/09/2017 11:00 AM, Greg Kurz wrote:
> This function has to ensure it doesn't follow a symlink that could be used
> to escape the virtfs directory. This could be easily achieved if fchmodat()
> on linux honored the AT_SYMLINK_NOFOLLOW flag as described in POSIX, but
> it doesn't. There was an attempt to implement a new fchmodat2() syscall
> with the correct semantics:
> 
> https://patchwork.kernel.org/patch/9596301/
> 
> but it didn't gain much momentum. Also it was suggested to look at a O_PATH

s/a O_PATH/an O_PATH/

> based solution in the first place.
> 
> The current implementation covers most use-cases, but it notably fails if:
> - the target path has access rights equal to  (openat() returns EPERM),
>   => once you've done chmod() on a file, you can never chmod() again
> - the target path is UNIX domain socket (openat() returns ENXIO)
>   => bind() of UNIX domain sockets fails if the file is on 9pfs
> 
> The solution is to use O_PATH: openat() now succeeds in both cases, and we
> can ensure the path isn't a symlink with fstat(). The associated entry in
> "/proc/self/fd" can hence be safely passed to the regular chmod() syscall.

My late-breaking question from v2 remains: fstat() on O_PATH only works
in kernel 3.6 and newer; are we worried about kernels in the window of
2.6.39 (when O_PATH was introduced) and 3.5?  Or at this point, are we
reasonably sure that platforms are either too old for O_PATH at all
(Hello RHEL 6, with 2.6.32), or else new enough that we aren't going to
have spurious failures due to fstat() not doing what we want?

I don't actually know the failure mode of fstat() on kernel 3.5, so if
someone cares about that working (presumably because they are on a
platform with such a kernel), please speak up. (Or even run my test
program included on the v1 thread, to show us what happens)
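
For readers following along, the O_PATH approach described in the commit message can be sketched as a standalone helper. This is only an illustration under stated assumptions: `chmod_nofollow()` is a hypothetical name, it calls plain openat() with O_NOFOLLOW instead of the 9pfs openat_file() wrapper, and it assumes /proc is mounted and that fstat() works on O_PATH descriptors (Linux 3.6 or newer, per the discussion above).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for O_PATH with glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not the function from the patch. */
static int chmod_nofollow(int dirfd, const char *name, mode_t mode)
{
    struct stat stbuf;
    char proc_path[64];
    int fd, ret, saved_errno;

    assert(name != NULL);

    /* O_PATH needs no read permission on the file and works on sockets. */
    fd = openat(dirfd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd == -1) {
        return -1;
    }

    /* With O_PATH | O_NOFOLLOW a symlink is opened itself: refuse it. */
    ret = fstat(fd, &stbuf);
    if (ret == 0 && S_ISLNK(stbuf.st_mode)) {
        errno = ELOOP;
        ret = -1;
    }

    if (ret == 0) {
        /* chmod() resolves the /proc entry to the already-opened file. */
        snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
        ret = chmod(proc_path, mode);
    }

    saved_errno = errno;     /* close() must not clobber the real error */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

On a kernel in the 2.6.39 to 3.5 window the fstat() call is the uncertain step; this sketch simply propagates whatever error it returns, which is the behaviour the review comments below are asking about.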

> +fd = openat_file(dirfd, name, O_RDONLY | O_PATH_9P_UTIL, 0);
> +#ifndef O_PATH

Please make this '#if O_PATH' or even '#if O_PATH_9P_UTIL'; as it might
be feasible for someone to

#ifndef O_PATH
#define O_PATH 0
#endif

where the macro is defined but the feature is not present, messing up
our code if we only check for a definition.
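
The difference between the two preprocessor tests can be shown with a tiny standalone example. `DEMO_O_PATH` is a hypothetical macro standing in for a flag that some header defines as 0 because the feature is unavailable; it is not a real system macro.

```c
/* Hypothetical flag: defined, but to 0 because the feature is absent. */
#define DEMO_O_PATH 0

/* Checks only for a definition, so it wrongly reports the feature. */
static int detected_with_ifdef(void)
{
#ifdef DEMO_O_PATH
    return 1;
#else
    return 0;
#endif
}

/* Checks the value, so a macro defined as 0 counts as "not available". */
static int detected_with_if(void)
{
#if DEMO_O_PATH
    return 1;
#else
    return 0;
#endif
}
```

An undefined macro evaluates to 0 inside `#if`, so the value test covers both the "not defined" and the "defined as 0" cases.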

> +#else
> +/* Now we handle racing symlinks. */
> +ret = fstat(fd, &stbuf);
> +if (ret) {
> +goto out;

This may leave errno at an unusual value for fchmodat(), if we are on
kernel 3.5.  But until someone speaks up that it matters, I'm okay
saving any cleanup work in that area for a followup patch.

> +}
> +if (S_ISLNK(stbuf.st_mode)) {
> +errno = ELOOP;
> +ret = -1;
> +goto out;
> +}
> +
> +{
> +char *proc_path = g_strdup_printf("/proc/self/fd/%d", fd);
> +ret = chmod(proc_path, mode);
> +g_free(proc_path);
> +}
> +#endif
> +out:

Swap these two lines - your only uses of 'goto out' are under the O_PATH
branch, so you would otherwise get a compilation failure about an unused
label on older glibc.

With the #if condition fixed and the scope of the #endif fixed,

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices

2017-08-09 Thread Kinsella, Ray
Marcel,

The findings are pretty consistent with what I identified.
Although it looks like SeaBIOS fares better than UEFI.

Thanks for the heads-up, will reply on the thread itself.

Ray K

-Original Message-
From: Marcel Apfelbaum [mailto:mar...@redhat.com] 
Sent: Wednesday, August 9, 2017 3:53 AM
To: Kinsella, Ray ; Kevin O'Connor 
Cc: Tan, Jianfeng ; seab...@seabios.org; Michael 
Tsirkin ; qemu-devel@nongnu.org; Gerd Hoffmann 

Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices

On 07/08/2017 22:00, Kinsella, Ray wrote:
> Hi Marcel,
> 

Hi Ray,

Please have a look at this thread, I think Laszlo and Paolo
found the root cause.
 https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01368.html
It seems hot-plugging the devices would not help.

Thanks,
Marcel

> Yup - I am using Seabios by default.
> I took all the measurements from the kernel time reported in syslog,
> as SeaBIOS wasn't exhibiting any obvious scaling problem.
> 
> Ray K
> 
> -Original Message-
> From: Marcel Apfelbaum [mailto:mar...@redhat.com]
> Sent: Wednesday, August 2, 2017 5:43 AM
> To: Kinsella, Ray ; Kevin O'Connor 
> 
> Cc: Tan, Jianfeng ; seab...@seabios.org; Michael 
> Tsirkin ; qemu-devel@nongnu.org; Gerd Hoffmann 
> 
> Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
> 
> It is an issue worth looking into. One more question: are all the
> measurements from OS boot? Do you use SeaBIOS?
> No problems with the firmware?
> 
> Thanks,
> Marcel
> 
> 



Re: [Qemu-devel] No video for Windows 2000 guest

2017-08-09 Thread Paolo Bonzini
On 09/08/2017 16:56, Programmingkid wrote:
> The default vga card no longer works with a Windows 2000 guest. All I see is
> a black screen after the Windows splash screen.
> 
> This is the command-line I used: 
> 
> qemu-system-i386 -hda Windows2000HD.qcow2 -boot c -m 512
> 
> When using the -vga cirrus option video works. Testing was done with QEMU 
> v2.10.0 rc2. 

Did it work in 2.9?

Paolo



Re: [Qemu-devel] [PATCH v4 19/22] libqtest: Add qmp_args_dict() helper

2017-08-09 Thread Eric Blake
On 08/09/2017 10:59 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> Leaving interpolation into JSON to qobject_from_jsonf() is more
>> robust than building QMP input manually; however, we have a few
>> places where code is already creating a QDict to interpolate
>> individual arguments, which cannot be reproduced with the
>> qobject_from_jsonf() parser.  Expose a public wrapper
>> qmp_args_dict() for the internal helper qmp_args_dict_async()
>> that we needed earlier for qmp_args(), and fix a test to use
>> the new helper.
>>
>> Signed-off-by: Eric Blake 
>> ---

>> +++ b/tests/device-introspect-test.c
>> @@ -36,8 +36,7 @@ static QList *qom_list_types(const char *implements, bool 
>> abstract)
>>  if (implements) {
>>  qdict_put_str(args, "implements", implements);
>>  }
>> -resp = qmp("{'execute': 'qom-list-types',"
>> -   " 'arguments': %p }", args);
>> +resp = qmp_args_dict("qom-list-types", args);
>>  g_assert(qdict_haskey(resp, "return"));
>>  ret = qdict_get_qlist(resp, "return");
>>  QINCREF(ret);
> 
> If we had five of these, the helper would be worth its keep.

This patch only has one client, but 20/22 adds another.  Is having 2
clients sufficient to keep it (not quite the 5 that makes it obvious,
but still a good reuse of code)?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page

2017-08-09 Thread Paolo Bonzini
On 08/08/2017 21:14, Dr. David Alan Gilbert wrote:
> >> There is no barrier there that I can see.  I know that it probably works
> >> on x86, but on others?  I think that it *** HERE  we need that
>> memory barrier that we don't have.
> Yes, I think that's smp_mb_release() - and you have to do an
> smp_mb_acquire after reading the pages->num before accessing the iov.

Yes, I think that's correct.

Paolo

> (Probably worth checking with Paolo).
> Or just stick with mutexes.
> 
> 
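
The release/acquire pairing discussed above can be sketched with C11 fences as a rough analogue of smp_mb_release() and smp_mb_acquire(). This is an assumption-laden illustration, not the multifd code: `PageBatch`, `batch_publish()` and `batch_consume()` are hypothetical names, and the mapping from QEMU's barrier macros to C11 fences is only approximate.

```c
#include <stdatomic.h>
#include <stddef.h>

#define BATCH_MAX 16

/* Hypothetical batch of pages handed from one thread to another. */
typedef struct PageBatch {
    void *iov[BATCH_MAX];
    atomic_size_t num;       /* 0 means nothing has been published yet */
} PageBatch;

/* Producer: fill the array first, then publish the count. */
static int batch_publish(PageBatch *b, void *const *pages, size_t n)
{
    if (n > BATCH_MAX) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        b->iov[i] = pages[i];
    }
    /* Release fence: the iov[] writes are ordered before the num store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&b->num, n, memory_order_relaxed);
    return 0;
}

/* Consumer: read the count, then only the entries it covers. */
static size_t batch_consume(PageBatch *b, void **out)
{
    size_t n = atomic_load_explicit(&b->num, memory_order_relaxed);

    /* Acquire fence: the iov[] reads cannot be hoisted above the load. */
    atomic_thread_fence(memory_order_acquire);
    for (size_t i = 0; i < n; i++) {
        out[i] = b->iov[i];
    }
    return n;
}
```

Without the two fences, a weakly ordered CPU could let the consumer observe the new count while still seeing stale array entries, which matches the "works on x86, but on others?" concern quoted above.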




Re: [Qemu-devel] [PATCH RFC 0/6] QEMU: kvm: cleanup kvm_slot handling

2017-08-09 Thread Paolo Bonzini
On 09/08/2017 15:33, David Hildenbrand wrote:
> If I am not missing something important here, we can heavily simplify
> the kvm_slot code. Flatview will make sure that we don't have to deal
> with overlapping slots. E.g. when a memory section is resized, we are
> first notified about the removal and then about the new memory section.
> 
> So basically, we can directly always map one memory section to one
> kvm slot (if the fixed up size is > 0).
> 
> Only very briefly tested. Will do some more testing if we agree that this
> is the right thing to do.

Yes, it all looks very sane.

Paolo

> David Hildenbrand (6):
>   kvm: require JOIN_MEMORY_REGIONS_WORKS
>   kvm: factor out alignment of memory section
>   kvm: use start + size for memory ranges
>   kvm: we never have overlapping slots in kvm_set_phys_mem()
>   kvm: kvm_log_start/stop are only called with known sections
>   kvm: kvm_log_sync() is only called with known memory sections
> 
>  accel/kvm/kvm-all.c | 276 
> +---
>  1 file changed, 89 insertions(+), 187 deletions(-)
> 




Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-08-09 Thread Michael S. Tsirkin
I only had time for a quick look. More review when
you repost after release.


On Thu, Aug 10, 2017 at 06:12:29PM +0800, Changpeng Liu wrote:
> This commit introduces a new vhost-user device for block. It uses a
> chardev to connect to the backend; as with the Qemu virtio-blk device,
> the Guest OS still uses the virtio-blk frontend driver.
> 
> To use it, start Qemu with command line like this:
> 
> qemu-system-x86_64 \
> -chardev socket,id=char0,path=/path/vhost.socket \
> -device vhost-user-blk-pci,chardev=char0,num_queues=...
> 
> Unlike the existing Qemu virtio-blk host device, it makes it easier
> for users to implement their own I/O processing logic, such as an all
> user space I/O stack against a hardware block device. It uses the new
> vhost message (VHOST_USER_GET_CONFIG) to get block virtio config
> information from the backend process.

I took a quick look. I think I would prefer a more direct approach
where qemu is more of a driver. So user specifies properties and
they get sent to backend at init time. Only handle geometry changes
specially.

> 
> Signed-off-by: Changpeng Liu 
> ---
>  configure  |  11 ++
>  hw/block/Makefile.objs |   3 +
>  hw/block/vhost-user-blk.c  | 360 
> +
>  hw/virtio/virtio-pci.c |  55 ++
>  hw/virtio/virtio-pci.h |  18 ++
>  include/hw/virtio/vhost-user-blk.h |  40 +
>  6 files changed, 487 insertions(+)
>  create mode 100644 hw/block/vhost-user-blk.c
>  create mode 100644 include/hw/virtio/vhost-user-blk.h
> 
> diff --git a/configure b/configure
> index dd73cce..1452c66 100755
> --- a/configure
> +++ b/configure
> @@ -305,6 +305,7 @@ tcg="yes"
>  
>  vhost_net="no"
>  vhost_scsi="no"
> +vhost_user_blk="no"
>  vhost_vsock="no"
>  vhost_user=""
>  kvm="no"
> @@ -779,6 +780,7 @@ Linux)
>kvm="yes"
>vhost_net="yes"
>vhost_scsi="yes"
> +  vhost_user_blk="yes"
>vhost_vsock="yes"
>QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers 
> $QEMU_INCLUDES"
>supported_os="yes"
> @@ -1136,6 +1138,10 @@ for opt do
>;;
>--enable-vhost-scsi) vhost_scsi="yes"
>;;
> +  --disable-vhost-user-blk) vhost_user_blk="no"
> +  ;;
> +  --enable-vhost-user-blk) vhost_user_blk="yes"
> +  ;;
>--disable-vhost-vsock) vhost_vsock="no"
>;;
>--enable-vhost-vsock) vhost_vsock="yes"
> @@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if 
> available:
>cap-ng  libcap-ng support
>attrattr and xattr support
>vhost-net   vhost-net acceleration support
> +  vhost-user-blk  VM virtio-blk acceleration in user space
>spice   spice
>rbd rados block device (rbd)
>libiscsiiscsi support
> @@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
>  echo "libcap-ng support $cap_ng"
>  echo "vhost-net support $vhost_net"
>  echo "vhost-scsi support $vhost_scsi"
> +echo "vhost-user-blk support $vhost_user_blk"
>  echo "vhost-vsock support $vhost_vsock"
>  echo "vhost-user support $vhost_user"
>  echo "Trace backends$trace_backends"
> @@ -5776,6 +5784,9 @@ fi
>  if test "$vhost_scsi" = "yes" ; then
>echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
>  fi
> +if test "$vhost_user_blk" = "yes" ; then
> +  echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
> +fi
>  if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
>echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
>  fi
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index e0ed980..4c19a58 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
>  
>  obj-$(CONFIG_VIRTIO) += virtio-blk.o
>  obj-$(CONFIG_VIRTIO) += dataplane/
> +ifeq ($(CONFIG_VIRTIO),y)
> +obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> +endif
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> new file mode 100644
> index 000..8aa9fa9
> --- /dev/null
> +++ b/hw/block/vhost-user-blk.c
> @@ -0,0 +1,360 @@
> +/*
> + * vhost-user-blk host device
> + *
> + * Copyright IBM, Corp. 2011
> + * Copyright(C) 2017 Intel Corporation.
> + *
> + * Authors:
> + *  Stefan Hajnoczi 
> + *  Changpeng Liu 
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/typedefs.h"
> +#include "qemu/cutils.h"
> +#include "qom/object.h"
> +#include "hw/qdev-core.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-user-blk.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +
> +static const int user_feature_bits[] = {
> +VIRTIO_BLK_F_SIZE_MAX,
> +VIRTIO_BLK_F_SEG_MAX,
> +   

Re: [Qemu-devel] [PATCH v2 for 2.10] iotests: fix 185

2017-08-09 Thread Eric Blake
On 08/09/2017 10:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> 09.08.2017 18:17, Vladimir Sementsov-Ogievskiy wrote:
>> 185 can sometimes produce wrong output like this:
>>

>>
>> This is because quite happens before first mirror request is done
> 
> s/quite/quit/
> 
>> (and, in specified case, even before block-job len field is set).
>> To prevent it let's just add a sleep for 0.3 seconds before quite.

Here, as well.

Maybe:

This is because, under heavy load, the quit can happen before the first
iteration of the mirror request has occurred.  To make sure we've had
time to iterate, let's just add a sleep for 0.3 seconds before quitting.


>>   "return"
>>   +# If we don't sleep here 'quit' command may be handled before
>> +# the first mirror iteration is done
>> +sleep 0.5

The commit message disagrees with the code (.3 vs. .5) - which one is
correct?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v4 15/22] libqtest: Delete qtest_qmp() wrappers

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> None of our tests were directly using qtest_qmp() and friends;
> even tests like postcopy-test.c that manage multiple connections
> get along just fine changing global_qtest as needed (other than
> one callsite where it forgot to use the inlined form).  It's
> also annoying that we have qmp_async() but qtest_async_qmp(),
> with inconsistent naming for tracing through the wrappers.
>
> As future patches are about to add some convenience functions
> for easier generation of QMP commands, it's easier if we don't
> have to percolate the changes through as many layers of the stack,
> by getting rid of the layer that no one uses, and just documenting
> that callers have to massage the global variable as needed. (Yes,
> this is backwards from good design that says all state should be
> passed as parameters rather than via a global variable - but such
> is life in maintaining a testsuite, where it is easier to write
> concise tests than it is to retrofit all existing tests to pass
> the extra parameter everywhere.)
>
> Internally, we rename qmp_fd_sendv() to qtest_qmp_sendv(), as
> well as give it a ... counterpart qmp_fd_send(), but the overall
> reduction in code makes this file a bit less repetitive.
>
> Signed-off-by: Eric Blake 

Ah!  I see you're as fed up with this nonsense as I am :)

What about all the other functions taking a QTestState?  Aren't they
just as silly?

Having two of every function is tiresome, but consistent.

Having just one is easier to maintain, so if it serves our needs,
possibly with the occasional state switch, I like it.

What I don't like is a mix of the two.

> ---
>  tests/libqtest.h  | 75 
> +--
>  tests/libqtest.c  | 71 +---
>  tests/postcopy-test.c |  2 +-
>  3 files changed, 25 insertions(+), 123 deletions(-)
>
> diff --git a/tests/libqtest.h b/tests/libqtest.h
> index 6bae0223aa..684cfb3507 100644
> --- a/tests/libqtest.h
> +++ b/tests/libqtest.h
> @@ -21,6 +21,11 @@
>
>  typedef struct QTestState QTestState;
>
> +/*
> + * The various qmp_*() commands operate on global_qtest.  Tests can
> + * alternate between two parallel connections by switching which state
> + * is current before issuing commands.
> + */
>  extern QTestState *global_qtest;
>
>  /**
> @@ -48,49 +53,7 @@ QTestState *qtest_init_without_qmp_handshake(const char 
> *extra_args);
>  void qtest_quit(QTestState *s);
>
>  /**
> - * qtest_qmp:
> - * @s: #QTestState instance to operate on.
> - * @fmt...: QMP message to send to qemu; formats arguments through
> - * json-lexer.c (only understands '%(PRI[ud]64|(l|ll)?[du]|[ipsf%])').
> - *
> - * Sends a QMP message to QEMU and returns the response.
> - */
> -QDict *qtest_qmp(QTestState *s, const char *fmt, ...);
> -
> -/**
> - * qtest_async_qmp:
> - * @s: #QTestState instance to operate on.
> - * @fmt...: QMP message to send to qemu; formats arguments through
> - * json-lexer.c (only understands '%(PRI[ud]64|(l|ll)?[du]|[ipsf%])').
> - *
> - * Sends a QMP message to QEMU and leaves the response in the stream.
> - */
> -void qtest_async_qmp(QTestState *s, const char *fmt, ...);
> -
> -/**
> - * qtest_qmpv:
> - * @s: #QTestState instance to operate on.
> - * @fmt: QMP message to send to QEMU; formats arguments through
> - * json-lexer.c (only understands '%(PRI[ud]64|(l|ll)?[du]|[ipsf%])').
> - * @ap: QMP message arguments
> - *
> - * Sends a QMP message to QEMU and returns the response.
> - */
> -QDict *qtest_qmpv(QTestState *s, const char *fmt, va_list ap);
> -
> -/**
> - * qtest_async_qmpv:
> - * @s: #QTestState instance to operate on.
> - * @fmt: QMP message to send to QEMU; formats arguments through
> - * json-lexer.c (only understands '%(PRI[ud]64|(l|ll)?[du]|[ipsf%])').
> - * @ap: QMP message arguments
> - *
> - * Sends a QMP message to QEMU and leaves the response in the stream.
> - */
> -void qtest_async_qmpv(QTestState *s, const char *fmt, va_list ap);
> -
> -/**
> - * qtest_receive:
> + * qtest_qmp_receive:
>   * @s: #QTestState instance to operate on.
>   *
>   * Reads a QMP message from QEMU and returns the response.
> @@ -117,32 +80,6 @@ void qtest_qmp_eventwait(QTestState *s, const char 
> *event);
>  QDict *qtest_qmp_eventwait_ref(QTestState *s, const char *event);
>
>  /**
> - * qtest_hmp:
> - * @s: #QTestState instance to operate on.
> - * @fmt...: HMP command to send to QEMU, formats arguments like sprintf().
> - *
> - * Send HMP command to QEMU via QMP's human-monitor-command.
> - * QMP events are discarded.
> - *
> - * Returns: the command's output.  The caller should g_free() it.
> - */
> -char *qtest_hmp(QTestState *s, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
> -
> -/**
> - * qtest_hmpv:
> - * @s: #QTestState instance to operate on.
> - * @fmt: HMP command to send to QEMU, formats arguments like vsprintf().
> - * @ap: HMP command arguments
> - *
> - * Send HMP command to QEMU via QMP's 

Re: [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-09 Thread Aleksandr Bezzubikov
2017-08-09 13:18 GMT+03:00 Laszlo Ersek :
> On 08/08/17 21:21, Aleksandr Bezzubikov wrote:
>> 2017-08-08 18:11 GMT+03:00 Laszlo Ersek :
>>> one comment below
>>>
>>> On 08/05/17 22:27, Aleksandr Bezzubikov wrote:
>>>
>>>> +Capability layout (defined in include/hw/pci/pci_bridge.h):
>>>> +
>>>> +    uint8_t id;         Standard PCI capability header field
>>>> +    uint8_t next;       Standard PCI capability header field
>>>> +    uint8_t len;        Standard PCI vendor-specific capability header field
>>>> +
>>>> +    uint8_t type;       Red Hat vendor-specific capability type
>>>> +                        List of currently existing types:
>>>> +                            QEMU_RESERVE = 1
>>>> +
>>>> +
>>>> +    uint32_t bus_res;   Minimum number of buses to reserve
>>>> +
>>>> +    uint64_t io;        IO space to reserve
>>>> +    uint64_t mem;       Non-prefetchable memory to reserve
>>>> +    uint64_t mem_pref;  Prefetchable memory to reserve
>>>
>>> (I apologize if I missed any concrete points from the past messages
>>> regarding this structure.)
>>>
>>> How is the firmware supposed to know whether the prefetchable MMIO
>>> reservation should be made in 32-bit or 64-bit address space? If we
>>> reserve prefetchable MMIO outside of the 32-bit address space, then
>>> hot-plugging a device without 64-bit MMIO support could fail.
>>>
>>> My earlier request, to distinguish "prefetchable_32" from
>>> "prefetchable_64" (mutually exclusively), was so that firmware would
>>> know whether to restrict the MMIO reservation to 32-bit address
>>> space.
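
As a reading aid, the capability layout quoted above can be written out as a packed C struct. This is a sketch under stated assumptions: the struct name is hypothetical, the field order and widths are taken only from the quoted documentation text, and `__attribute__((packed))` is a GCC/Clang extension; the actual definition in include/hw/pci/pci_bridge.h may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; fields follow the quoted documentation text. */
typedef struct __attribute__((packed)) ResourceReserveCapSketch {
    uint8_t  id;        /* standard PCI capability header field */
    uint8_t  next;      /* standard PCI capability header field */
    uint8_t  len;       /* vendor-specific capability length */
    uint8_t  type;      /* QEMU_RESERVE = 1 in the quoted text */
    uint32_t bus_res;   /* minimum number of buses to reserve */
    uint64_t io;        /* IO space to reserve */
    uint64_t mem;       /* non-prefetchable memory to reserve */
    uint64_t mem_pref;  /* prefetchable memory to reserve */
} ResourceReserveCapSketch;

/* With this field order the layout has no padding: 4 + 4 + 3 * 8 bytes. */
_Static_assert(sizeof(ResourceReserveCapSketch) == 32,
               "unexpected capability size");
_Static_assert(offsetof(ResourceReserveCapSketch, mem_pref) == 24,
               "unexpected mem_pref offset");
```

Note that the single `mem_pref` field carries a size but nothing about whether the reservation must live below 4 GiB, which is the gap the comment above points out.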
>>
>> IIUC, for now (in SeaBIOS at least) we just assign these PREF registers
>> unconditionally, so the decision about the mode can be made based on
>> whether the UPPER_PREF_LIMIT register is nonzero.
>> My idea was the same - we can just check if the value doesn't fit into
>> 16-bit (PREF_LIMIT reg size, 32-bit MMIO). Do we really need separate
>> fields for that?
>
> The PciBusDxe driver in edk2 tracks 32-bit and 64-bit MMIO resources
> separately from each other, and other (independent) logic exists in it
> that, on some conditions, allocates 64-bit MMIO BARs from 32-bit address
> space. This is just to say that the distinction is intentional in
> PciBusDxe.
>
> Furthermore, the Platform Init spec v1.6 says the following (this is
> what OVMF will have to comply with, in the "platform hook" called by
> PciBusDxe):
>
>> 12.6 PCI Hot Plug PCI Initialization Protocol
>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
>> ...
>> Padding  The amount of resource padding that is required by the PCI
>>  bus under the control of the specified HPC. Because the
>>  caller does not know the size of this buffer, this buffer is
>>  allocated by the callee and freed by the caller.
>> ...
>> The padding is returned in the form of ACPI (2.0 & 3.0) resource
>> descriptors. The exact definition of each of the fields is the same as
>> in the
>> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
>> function. See the section 10.8 for the definition of this function.
>
> Following that pointer:
>
>> 10.8 PCI HostBridge Code Definitions
>> 10.8.2 PCI Host Bridge Resource Allocation Protocol
>>
>> Table 8. ACPI 2.0 & 3.0 QWORD Address Space Descriptor Usage
>>
>> ByteByteData  Description
>> Offset  Length
>> ...
>> 0x030x01  Resource type:
>> 0: Memory range
>> 1: I/O range
>> 2: Bus number range
>> ...
>> 0x050x01  Type-specific flags. Ignored except as defined
>>   in Table 3-3 and Table 3-4 below.
>>
>> 0x060x08  Address Space Granularity. Used to differentiate
>>   between a 32-bit memory request and a 64-bit
>>   memory request. For a 32-bit memory request,
>>   this field should be set to 32. For a 64-bit
>>   memory request, this field should be set to 64.
>>   Ignored for I/O and bus resource requests.
>>   Ignored during GetProposedResources().
>
> The "Table 3-3" and "Table 3-4" references under "Type-specific flags"
> are out of date (spec bug); in reality those are:
> - Table 10. I/O Resource Flag (Resource Type = 1) Usage,
> - Table 11. Memory Resource Flag (Resource Type = 0) Usage.
>
> The latter is relevant here:
>
>> Table 11. Memory Resource Flag (Resource Type = 0) Usage
>>
>> Bits  Meaning
>> ...
>> Bit[2:1]  _MEM. Memory attributes.
>>   Value and Meaning:
>> 0 The memory is nonprefetchable.
>> 1 Invalid.
>> 2 Invalid.
>> 3 The memory is prefetchable.
>>   Note: The interpretation of these bits is somewhat different
>>   from the ACPI Specification. According to the ACPI
>>   Specification, a value of 0 implies noncacheable memory and
>>   the value of 3 indicates prefetchable and 

Re: [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu ***

2017-08-09 Thread Michael S. Tsirkin
On Thu, Aug 10, 2017 at 06:12:27PM +0800, Changpeng Liu wrote:
> Although the virtio scsi specification was designed as a replacement for
> virtio_blk, there are still many users of virtio_blk. Qemu 2.9 introduced a
> new device, vhost user scsi, which can process I/O in user space for
> virtio_scsi. This commit introduces a new vhost user block host device,
> which supports virtio_blk in the Guest OS, with I/O processing in another
> I/O target.
> 
> Because of a limitation in the virtio_blk specification, a virtio_blk device
> cannot get block information such as capacity and block size via the
> specification. Several new vhost user messages were therefore added to
> deliver virtio config space information between Qemu and the I/O target:
> VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG get/set the config space
> from/to the I/O target, and VHOST_USER_SET_CONFIG_FD provides an event
> notifier for changes to the virtio config space. These messages can also
> be used for vhost device live migration.

As we are busy wrapping up a QEMU release, please remember to repost after the
release.

> Changpeng Liu (4):
>   vhost-user: add new vhost user messages to support virtio config space
>   vhost-user-blk: introduce a new vhost-user-blk host device
>   contrib/libvhost-user: enable virtio config space messages
>   contrib/vhost-user-blk: introduce a vhost-user-blk sample application
> 
>  .gitignore  |   1 +
>  Makefile|   3 +
>  Makefile.objs   |   2 +
>  configure   |  11 +
>  contrib/libvhost-user/libvhost-user.c   |  51 +++
>  contrib/libvhost-user/libvhost-user.h   |  14 +
>  contrib/vhost-user-blk/Makefile.objs|   1 +
>  contrib/vhost-user-blk/vhost-user-blk.c | 735 
> 
>  docs/interop/vhost-user.txt |  31 ++
>  hw/block/Makefile.objs  |   3 +
>  hw/block/vhost-user-blk.c   | 360 
>  hw/virtio/vhost-user.c  |  86 
>  hw/virtio/vhost.c   |  63 +++
>  hw/virtio/virtio-pci.c  |  55 +++
>  hw/virtio/virtio-pci.h  |  18 +
>  include/hw/virtio/vhost-backend.h   |   8 +
>  include/hw/virtio/vhost-user-blk.h  |  40 ++
>  include/hw/virtio/vhost.h   |  16 +
>  18 files changed, 1498 insertions(+)
>  create mode 100644 contrib/vhost-user-blk/Makefile.objs
>  create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
>  create mode 100644 hw/block/vhost-user-blk.c
>  create mode 100644 include/hw/virtio/vhost-user-blk.h
> 
> -- 
> 1.9.3



Re: [Qemu-devel] Qemu and 32 PCIe devices

2017-08-09 Thread Michael S. Tsirkin
On Wed, Aug 09, 2017 at 09:26:11AM +0200, Paolo Bonzini wrote:
> On 09/08/2017 03:06, Laszlo Ersek wrote:
> >>   20.14%  qemu-system-x86_64  [.] render_memory_region
> >>   17.14%  qemu-system-x86_64  [.] subpage_register
> >>   10.31%  qemu-system-x86_64  [.] int128_add
> >>7.86%  qemu-system-x86_64  [.] addrrange_end
> >>7.30%  qemu-system-x86_64  [.] int128_ge
> >>4.89%  qemu-system-x86_64  [.] int128_nz
> >>3.94%  qemu-system-x86_64  [.] phys_page_compact
> >>2.73%  qemu-system-x86_64  [.] phys_map_node_alloc
> 
> Yes, this is the O(n^3) thing.  An optimized build should be faster
> because int128 operations will be inlined and become much more efficient.
> 
> > With this patch, I only tested the "93 devices" case, as the slowdown
> > became visible to the naked eye from the trace messages, as the firmware
> > enabled more and more BARs / command registers (and inversely, the
> > speedup was perceivable when the firmware disabled more and more BARs /
> > command registers).
> 
> This is an interesting observation, and it's expected.  Looking at the
> O(n^3) complexity more in detail you have N operations, where the "i"th
> operates on "i" DMA address spaces, all of which have at least "i"
> memory regions (at least 1 BAR per device).
> 
> So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(n^3).
> Expressing it as a sum shows why it gets slower as time progresses.
> 
> The solution is to note that those "i" address spaces are actually all
> the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(n^2).
> 
> Thanks,
> 
> Paolo
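
The two sums quoted above can be checked numerically. The following is a minimal sketch of that cost model only, not of QEMU's memory code; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cost model from the explanation above: step i operates on i DMA address
 * spaces, each of which holds at least i memory regions, so step i costs
 * i * i units.
 */
static uint64_t sketch_cost_per_device_as(uint64_t n)
{
    uint64_t total = 0;

    assert(n < 1000000); /* keeps the sum well inside 64 bits */
    for (uint64_t i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;
}

/*
 * Cost when the i address spaces are recognised as identical: step i only
 * has to process its i memory regions once.
 */
static uint64_t sketch_cost_shared_as(uint64_t n)
{
    uint64_t total = 0;

    assert(n < 1000000);
    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

For the "93 devices" case mentioned above, the first function gives 93*94*187/6 = 272459 units and the second gives 93*94/2 = 4371, which is roughly a 60x difference under this simplified model.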

We'll probably run into more issues with the vIOMMU, but I guess we
can look into that later.

Resolving addresses lazily somehow might be interesting. Would the
caching work that went in a while ago, but got disabled since we
couldn't iron out all the small issues, help go in that direction
somehow?

-- 
MST



Re: [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-09 Thread Laszlo Ersek
On 08/09/17 18:52, Aleksandr Bezzubikov wrote:
> 2017-08-09 13:18 GMT+03:00 Laszlo Ersek :
>> On 08/08/17 21:21, Aleksandr Bezzubikov wrote:
>>> 2017-08-08 18:11 GMT+03:00 Laszlo Ersek :
>>>> one comment below
>>>>
>>>> On 08/05/17 22:27, Aleksandr Bezzubikov wrote:
>>>>
>>>>> +Capability layout (defined in include/hw/pci/pci_bridge.h):
>>>>> +
>>>>> +    uint8_t id;         Standard PCI capability header field
>>>>> +    uint8_t next;       Standard PCI capability header field
>>>>> +    uint8_t len;        Standard PCI vendor-specific capability header field
>>>>> +
>>>>> +    uint8_t type;       Red Hat vendor-specific capability type
>>>>> +                        List of currently existing types:
>>>>> +                            QEMU_RESERVE = 1
>>>>> +
>>>>> +
>>>>> +    uint32_t bus_res;   Minimum number of buses to reserve
>>>>> +
>>>>> +    uint64_t io;        IO space to reserve
>>>>> +    uint64_t mem;       Non-prefetchable memory to reserve
>>>>> +    uint64_t mem_pref;  Prefetchable memory to reserve
>>>>
>>>> (I apologize if I missed any concrete points from the past messages
>>>> regarding this structure.)
>>>>
>>>> How is the firmware supposed to know whether the prefetchable MMIO
>>>> reservation should be made in 32-bit or 64-bit address space? If we
>>>> reserve prefetchable MMIO outside of the 32-bit address space, then
>>>> hot-plugging a device without 64-bit MMIO support could fail.
>>>>
>>>> My earlier request, to distinguish "prefetchable_32" from
>>>> "prefetchable_64" (mutually exclusively), was so that firmware would
>>>> know whether to restrict the MMIO reservation to 32-bit address
>>>> space.
>>>
>>> IIUC, right now (in SeaBIOS at least) we just assign these PREF
>>> registers unconditionally, so the decision about the mode can be made
>>> based on a non-zero UPPER_PREF_LIMIT register.
>>> My idea was the same: we can just check whether the value doesn't fit
>>> into 16 bits (the PREF_LIMIT register size, 32-bit MMIO). Do we really
>>> need separate fields for that?
>>
>> The PciBusDxe driver in edk2 tracks 32-bit and 64-bit MMIO resources
>> separately from each other, and other (independent) logic exists in it
>> that, on some conditions, allocates 64-bit MMIO BARs from 32-bit address
>> space. This is just to say that the distinction is intentional in
>> PciBusDxe.
>>
>> Furthermore, the Platform Init spec v1.6 says the following (this is
>> what OVMF will have to comply with, in the "platform hook" called by
>> PciBusDxe):
>>
>>> 12.6 PCI Hot Plug PCI Initialization Protocol
>>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
>>> ...
>>> Padding  The amount of resource padding that is required by the PCI
>>>  bus under the control of the specified HPC. Because the
>>>  caller does not know the size of this buffer, this buffer is
>>>  allocated by the callee and freed by the caller.
>>> ...
>>> The padding is returned in the form of ACPI (2.0 & 3.0) resource
>>> descriptors. The exact definition of each of the fields is the same as
>>> in the
>>> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
>>> function. See the section 10.8 for the definition of this function.
>>
>> Following that pointer:
>>
>>> 10.8 PCI HostBridge Code Definitions
>>> 10.8.2 PCI Host Bridge Resource Allocation Protocol
>>>
>>> Table 8. ACPI 2.0 & 3.0 QWORD Address Space Descriptor Usage
>>>
>>> Byte    Byte    Data  Description
>>> Offset  Length
>>> ...
>>> 0x03    0x01          Resource type:
>>>                         0: Memory range
>>>                         1: I/O range
>>>                         2: Bus number range
>>> ...
>>> 0x05    0x01          Type-specific flags. Ignored except as defined
>>>                       in Table 3-3 and Table 3-4 below.
>>>
>>> 0x06    0x08          Address Space Granularity. Used to differentiate
>>>                       between a 32-bit memory request and a 64-bit
>>>                       memory request. For a 32-bit memory request,
>>>                       this field should be set to 32. For a 64-bit
>>>                       memory request, this field should be set to 64.
>>>                       Ignored for I/O and bus resource requests.
>>>                       Ignored during GetProposedResources().
>>
>> The "Table 3-3" and "Table 3-4" references under "Type-specific flags"
>> are out of date (spec bug); in reality those are:
>> - Table 10. I/O Resource Flag (Resource Type = 1) Usage,
>> - Table 11. Memory Resource Flag (Resource Type = 0) Usage.
>>
>> The latter is relevant here:
>>
>>> Table 11. Memory Resource Flag (Resource Type = 0) Usage
>>>
>>> Bits  Meaning
>>> ...
>>> Bit[2:1]  _MEM. Memory attributes.
>>>   Value and Meaning:
>>> 0 The memory is nonprefetchable.
>>> 1 Invalid.
>>> 2 Invalid.
>>> 3 The memory is prefetchable.
>>>   Note: The interpretation of these bits is somewhat different
>>>   from the ACPI 

Re: [Qemu-devel] No video for Windows 2000 guest

2017-08-09 Thread Michael S. Tsirkin
On Wed, Aug 09, 2017 at 01:54:23PM -0400, Programmingkid wrote:
> 
> > On Aug 9, 2017, at 1:18 PM, Michael S. Tsirkin  wrote:
> > 
> > On Wed, Aug 09, 2017 at 06:37:12PM +0200, Paolo Bonzini wrote:
> >> On 09/08/2017 16:56, Programmingkid wrote:
> >>> The default vga card no longer works with a Windows 2000 guest. All I 
> >>> see is a black screen after the Windows splash screen.
> >>> 
> >>> This is the command-line I used: 
> >>> 
> >>> qemu-system-i386 -hda Windows2000HD.qcow2 -boot c -m 512
> >>> 
> >>> When using the -vga cirrus option video works. Testing was done with QEMU 
> >>> v2.10.0 rc2. 
> >> 
> >> Did it work in 2.9?
> >> 
> >> Paolo
> > 
> > Generally bisect is extremely helpful to debug these issues.
> 
> I tried but the acpi issue kept Windows 2000 from booting. 

You can just revert that on top of each bisect step.



Re: [Qemu-devel] No video for Windows 2000 guest

2017-08-09 Thread Programmingkid

> On Aug 9, 2017, at 12:37 PM, Paolo Bonzini  wrote:
> 
> On 09/08/2017 16:56, Programmingkid wrote:
>> The default vga card no longer works with a Windows 2000 guest. All I see 
>> is a black screen after the Windows splash screen.
>> 
>> This is the command-line I used: 
>> 
>> qemu-system-i386 -hda Windows2000HD.qcow2 -boot c -m 512
>> 
>> When using the -vga cirrus option video works. Testing was done with QEMU 
>> v2.10.0 rc2. 
> 
> Did it work in 2.9?
> 
> Paolo

I haven't tested QEMU 2.9.0, but I did test version 2.8.0 and it has the 
same problem. Starting up Windows 2000 in VGA mode allowed me to access the 
operating system. I found out that QEMU's default video controller is not 
recognized. In the Device Manager I can see a yellow question mark for Video 
Controller (VGA Compatible). I remember Windows 2000 being able to use the 
default video card in QEMU in the past. I may be able to bisect this issue 
after all.


Re: [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2017-08-09 Thread Marc-André Lureau
Hi

- Original Message -
> This commit introduces a vhost-user-blk backend device that uses a UNIX
> domain socket to communicate with QEMU. The vhost-user-blk sample
> application should be used with the QEMU vhost-user-blk-pci device.
> 
> To use it, compile with:
> make vhost-user-blk
> 
> and start like this:
> vhost-user-blk -b /dev/sdb -s /path/vhost.socket

I guess it could be a regular file instead (fallocate/trunc to desired size).

> 
> Signed-off-by: Changpeng Liu 
> ---
>  .gitignore  |   1 +
>  Makefile|   3 +
>  Makefile.objs   |   2 +
>  contrib/vhost-user-blk/Makefile.objs|   1 +
>  contrib/vhost-user-blk/vhost-user-blk.c | 735
>  5 files changed, 742 insertions(+)
>  create mode 100644 contrib/vhost-user-blk/Makefile.objs
>  create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
> 
> diff --git a/.gitignore b/.gitignore
> index cf65316..dbe5c13 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -51,6 +51,7 @@
>  /module_block.h
>  /vscclient
>  /vhost-user-scsi
> +/vhost-user-blk
>  /fsdev/virtfs-proxy-helper
>  *.[1-9]
>  *.a
> diff --git a/Makefile b/Makefile
> index 97a58a0..e68e339 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -270,6 +270,7 @@ dummy := $(call unnest-vars,, \
>  ivshmem-server-obj-y \
>  libvhost-user-obj-y \
>  vhost-user-scsi-obj-y \
> +vhost-user-blk-obj-y \
>  qga-vss-dll-obj-y \
>  block-obj-y \
>  block-obj-m \
> @@ -478,6 +479,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y)
> $(COMMON_LDADDS)
>  endif
>  vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y)
>   $(call LINK, $^)
> +vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y)
> + $(call LINK, $^)
>  
>  module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
>   $(call quiet-command,$(PYTHON) $< $@ \
> diff --git a/Makefile.objs b/Makefile.objs
> index 24a4ea0..6b81548 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -114,6 +114,8 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
>  vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
>  vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
>  vhost-user-scsi-obj-y += contrib/libvhost-user/libvhost-user.o
> +vhost-user-blk-obj-y = contrib/vhost-user-blk/
> +vhost-user-blk-obj-y += contrib/libvhost-user/libvhost-user.o
>  
>  ##
>  trace-events-subdirs =
> diff --git a/contrib/vhost-user-blk/Makefile.objs
> b/contrib/vhost-user-blk/Makefile.objs
> new file mode 100644
> index 000..72e2cdc
> --- /dev/null
> +++ b/contrib/vhost-user-blk/Makefile.objs
> @@ -0,0 +1 @@
> +vhost-user-blk-obj-y = vhost-user-blk.o
> diff --git a/contrib/vhost-user-blk/vhost-user-blk.c
> b/contrib/vhost-user-blk/vhost-user-blk.c

My bad, I didn't review vhost-user-scsi.c, and you reproduce a lot of its code 
here.

IMHO, there is no need to check for memory allocation failure; it's a test app, 
and glib will terminate if allocation fails anyway.

I should also say that libvhost-user is supposed to be glib-free, and it isn't 
fully (it is 99%). That also creates some confusion, I believe. And some docs 
are lacking.

> new file mode 100644
> index 000..9b90164
> --- /dev/null
> +++ b/contrib/vhost-user-blk/vhost-user-blk.c
> @@ -0,0 +1,735 @@
> +/*
> + * vhost-user-blk sample application
> + *
> + * Copyright IBM, Corp. 2007
> + * Copyright (c) 2016 Nutanix Inc. All rights reserved.
> + * Copyright (c) 2017 Intel Corporation. All rights reserved.
> + *
> + * Author:
> + *  Anthony Liguori 
> + *  Felipe Franciosi 
> + *  Changpeng Liu 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 only.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/virtio/virtio-blk.h"
> +#include "contrib/libvhost-user/libvhost-user.h"
> +
> +#include 
> +
> +/* Small compat shim from glib 2.32 */
> +#ifndef G_SOURCE_CONTINUE
> +#define G_SOURCE_CONTINUE TRUE
> +#endif
> +#ifndef G_SOURCE_REMOVE
> +#define G_SOURCE_REMOVE FALSE
> +#endif
> +

Should probably be in glib-compat.h

> +/* And this is the final byte of request*/
> +#define VIRTIO_BLK_S_OK 0
> +#define VIRTIO_BLK_S_IOERR 1
> +#define VIRTIO_BLK_S_UNSUPP 2
> +
> +typedef struct vhost_blk_dev {
> +VuDev vu_dev;
> +int server_sock;
> +int blk_fd;
> +struct virtio_blk_config blkcfg;
> +char *blk_name;
> +GMainLoop *loop;
> +GTree *fdmap;   /* fd -> gsource context id */

Why a tree? I would rather have a hash map, or even a fixed-size array, since 
the app isn't supposed to have that many FDs open...
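
To illustrate the fixed-size array idea, here is a minimal sketch. The type, the function names and the bound of 64 descriptors are all hypothetical and not part of the patch; it also assumes that a source id of 0 can serve as the "no watch" marker, since GLib source ids are greater than 0.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_FDS 64 /* hypothetical upper bound on open descriptors */

/* fd -> GSource id, indexed directly by the file descriptor number */
typedef struct SketchFdMap {
    unsigned int source_id[SKETCH_MAX_FDS]; /* 0 means "no watch installed" */
} SketchFdMap;

/* Remember the source id for fd; returns -1 if fd or id is out of range. */
static int sketch_fdmap_set(SketchFdMap *map, int fd, unsigned int id)
{
    assert(map != NULL);
    if (fd < 0 || fd >= SKETCH_MAX_FDS || id == 0) {
        return -1;
    }
    map->source_id[fd] = id;
    return 0;
}

/* Return the source id for fd and forget it; 0 if none was registered. */
static unsigned int sketch_fdmap_take(SketchFdMap *map, int fd)
{
    unsigned int id;

    assert(map != NULL);
    if (fd < 0 || fd >= SKETCH_MAX_FDS) {
        return 0;
    }
    id = map->source_id[fd];
    map->source_id[fd] = 0;
    return id;
}
```

Lookup and removal are O(1) with no allocation; the cost is a hard limit on the descriptor number, which seems acceptable for a sample application.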

> +} vhost_blk_dev_t;
> +
> +typedef struct vhost_blk_request {
> +VuVirtqElement *elem;
> +int64_t sector_num;
> +size_t size;

Re: [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages

2017-08-09 Thread Marc-André Lureau
Hi

- Original Message -
> Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG/VHOST_USER_SET_CONFIG_FD
> messages in the libvhost-user library, so users can implement their own I/O
> target based on the library. This enables the virtio config space to be
> delivered between the QEMU host device and the I/O target; an event notifier
> is also added for virtio config space changes.
> 
> Signed-off-by: Changpeng Liu 
> ---
>  contrib/libvhost-user/libvhost-user.c | 51 +++
>  contrib/libvhost-user/libvhost-user.h | 14 ++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/contrib/libvhost-user/libvhost-user.c
> b/contrib/libvhost-user/libvhost-user.c
> index 9efb9da..002cf15 100644
> --- a/contrib/libvhost-user/libvhost-user.c
> +++ b/contrib/libvhost-user/libvhost-user.c
> @@ -63,6 +63,9 @@ vu_request_to_string(int req)
>  REQ(VHOST_USER_SET_VRING_ENABLE),
>  REQ(VHOST_USER_SEND_RARP),
>  REQ(VHOST_USER_INPUT_GET_CONFIG),
> +REQ(VHOST_USER_GET_CONFIG),
> +REQ(VHOST_USER_SET_CONFIG),
> +REQ(VHOST_USER_SET_CONFIG_FD),
>  REQ(VHOST_USER_MAX),
>  };
>  #undef REQ
> @@ -744,6 +747,43 @@ vu_set_vring_enable_exec(VuDev *dev, VhostUserMsg *vmsg)
>  }
>  
>  static bool
> +vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
> +{
> +if (dev->iface->get_config) {
> +dev->iface->get_config(dev, vmsg->payload.config, vmsg->size);

Better to check the return value, to avoid sending garbage back to the master on error.
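
A minimal, self-contained sketch of what such a check could look like. The types and names below are simplified stand-ins, not the libvhost-user API, and signalling failure with a zeroed, zero-length payload is an assumption; the protocol side of error reporting would still need to be defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CONFIG_SIZE 256

/* Simplified stand-in for a vhost-user message carrying a config payload */
typedef struct SketchMsg {
    uint32_t size;
    uint8_t config[SKETCH_MAX_CONFIG_SIZE];
} SketchMsg;

/* Stand-in for the device callback: returns 0 on success, non-zero on error */
typedef int (*sketch_get_config_cb)(uint8_t *config, size_t len);

/* Example callback that fills the buffer successfully */
static int sketch_cb_ok(uint8_t *config, size_t len)
{
    memset(config, 0xab, len);
    return 0;
}

/* Example callback that leaves partial data behind and then fails */
static int sketch_cb_fail(uint8_t *config, size_t len)
{
    if (len > 0) {
        config[0] = 0xff;
    }
    return -1;
}

/* Returns true when a reply should be sent back. */
static bool sketch_get_config(sketch_get_config_cb cb, SketchMsg *msg)
{
    assert(msg != NULL);
    if (!cb || msg->size > sizeof(msg->config) ||
        cb(msg->config, msg->size) != 0) {
        /* Assumption: an empty, zeroed payload tells the master it failed */
        memset(msg->config, 0, sizeof(msg->config));
        msg->size = 0;
    }
    return true;
}
```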

> +}
> +
> +return true;
> +}
> +
> +static bool
> +vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
> +{
> +if (dev->iface->set_config) {
> +dev->iface->set_config(dev, vmsg->payload.config, vmsg->size);

you could perhaps make that function return void instead (since error isn't 
reported to master)


> +}
> +
> +return false;
> +}
> +
> +static bool
> +vu_set_config_fd(VuDev *dev, VhostUserMsg *vmsg)
> +{
> +   if (vmsg->fd_num != 1) {
> +vu_panic(dev, "Invalid config_fd message");
> +return false;
> +}
> +
> +if (dev->config_fd != -1) {
> +close(dev->config_fd);
> +}
> +dev->config_fd = vmsg->fds[0];
> +DPRINT("Got config_fd: %d\n", vmsg->fds[0]);
> +
> +return false;
> +}
> +
> +static bool
>  vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
>  {
>  int do_reply = 0;
> @@ -806,6 +846,12 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
>  return vu_get_queue_num_exec(dev, vmsg);
>  case VHOST_USER_SET_VRING_ENABLE:
>  return vu_set_vring_enable_exec(dev, vmsg);
> +case VHOST_USER_GET_CONFIG:
> +return vu_get_config(dev, vmsg);
> +case VHOST_USER_SET_CONFIG:
> +return vu_set_config(dev, vmsg);
> +case VHOST_USER_SET_CONFIG_FD:
> +return vu_set_config_fd(dev, vmsg);
>  default:
>  vmsg_close_fds(vmsg);
>  vu_panic(dev, "Unhandled request: %d", vmsg->request);
> @@ -878,6 +924,10 @@ vu_deinit(VuDev *dev)
>  
>  vu_close_log(dev);
>  
> +if (dev->config_fd != -1) {
> +close(dev->config_fd);
> +}
> +
>  if (dev->sock != -1) {
>  close(dev->sock);
>  }
> @@ -907,6 +957,7 @@ vu_init(VuDev *dev,
>  dev->remove_watch = remove_watch;
>  dev->iface = iface;
>  dev->log_call_fd = -1;
> +dev->config_fd = -1;
>  for (i = 0; i < VHOST_MAX_NR_VIRTQUEUE; i++) {
>  dev->vq[i] = (VuVirtq) {
>  .call_fd = -1, .kick_fd = -1, .err_fd = -1,
> diff --git a/contrib/libvhost-user/libvhost-user.h
> b/contrib/libvhost-user/libvhost-user.h
> index 53ef222..899dee1 100644
> --- a/contrib/libvhost-user/libvhost-user.h
> +++ b/contrib/libvhost-user/libvhost-user.h
> @@ -30,6 +30,8 @@
>  
>  #define VHOST_MEMORY_MAX_NREGIONS 8
>  
> +#define VHOST_USER_MAX_CONFIG_SIZE 256
> +
>  enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_MQ = 0,
>  VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
> @@ -62,6 +64,9 @@ typedef enum VhostUserRequest {
>  VHOST_USER_SET_VRING_ENABLE = 18,
>  VHOST_USER_SEND_RARP = 19,
>  VHOST_USER_INPUT_GET_CONFIG = 20,
> +VHOST_USER_GET_CONFIG = 24,
> +VHOST_USER_SET_CONFIG = 25,
> +VHOST_USER_SET_CONFIG_FD = 26,
>  VHOST_USER_MAX
>  } VhostUserRequest;
>  
> @@ -105,6 +110,7 @@ typedef struct VhostUserMsg {
>  struct vhost_vring_addr addr;
>  VhostUserMemory memory;
>  VhostUserLog log;
> +uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
>  } payload;
>  
>  int fds[VHOST_MEMORY_MAX_NREGIONS];
> @@ -132,6 +138,9 @@ typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t
> features);
>  typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
>int *do_reply);
>  typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool
>  started);
> +typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, size_t len);
> +typedef int (*vu_set_config_cb) 

[Qemu-devel] [Bug 904308] Re: x86: BT/BTS/BTR/BTC: ZF flag is unaffected

2017-08-09 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/904308

Title:
  x86: BT/BTS/BTR/BTC: ZF flag is unaffected

Status in QEMU:
  Expired

Bug description:
  Hello!

  Bug was found in qemu.git.
  See target-i386/translate.c:

  case 0x1ba: /* bt/bts/btr/btc Gv, im */
  ot = dflag + OT_WORD;
  modrm = ldub_code(s->pc++);
  op = (modrm >> 3) & 7;
  mod = (modrm >> 6) & 3;
  rm = (modrm & 7) | REX_B(s);
  if (mod != 3) {
  s->rip_offset = 1;
  gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
  gen_op_ld_T0_A0(ot + s->mem_index);
  } else {
  gen_op_mov_TN_reg(ot, 0, rm);
  }
  /* load shift */
  val = ldub_code(s->pc++);
  gen_op_movl_T1_im(val);
  if (op < 4)
  goto illegal_op;
  op -= 4;
  goto bt_op;
  case 0x1a3: /* bt Gv, Ev */
  op = 0;
  goto do_btx;
  case 0x1ab: /* bts */
  op = 1;
  goto do_btx;
  case 0x1b3: /* btr */
  op = 2;
  goto do_btx;
  case 0x1bb: /* btc */
  op = 3;
  do_btx:
  ot = dflag + OT_WORD;
  modrm = ldub_code(s->pc++);
  reg = ((modrm >> 3) & 7) | rex_r;
  mod = (modrm >> 6) & 3;
  rm = (modrm & 7) | REX_B(s);
  gen_op_mov_TN_reg(OT_LONG, 1, reg);
  if (mod != 3) {
  gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
  /* specific case: we need to add a displacement */
  gen_exts(ot, cpu_T[1]);
  tcg_gen_sari_tl(cpu_tmp0, cpu_T[1], 3 + ot);
  tcg_gen_shli_tl(cpu_tmp0, cpu_tmp0, ot);
  tcg_gen_add_tl(cpu_A0, cpu_A0, cpu_tmp0);
  gen_op_ld_T0_A0(ot + s->mem_index);
  } else {
  gen_op_mov_TN_reg(ot, 0, rm);
  }
  bt_op:
  tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1);
  switch(op) {
  case 0:
  tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
  tcg_gen_movi_tl(cpu_cc_dst, 0);   << always set zf
  break;
  case 1:
  tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
  tcg_gen_movi_tl(cpu_tmp0, 1);
  tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
  tcg_gen_or_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
  break;
  case 2:
  tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
  tcg_gen_movi_tl(cpu_tmp0, 1);
  tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
  tcg_gen_not_tl(cpu_tmp0, cpu_tmp0);
  tcg_gen_and_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
  break;
  default:
  case 3:
  tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
  tcg_gen_movi_tl(cpu_tmp0, 1);
  tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
  tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
  break;
  }
  s->cc_op = CC_OP_SARB + ot;
  if (op != 0) {
  if (mod != 3)
  gen_op_st_T0_A0(ot + s->mem_index);
  else
  gen_op_mov_reg_T0(ot, rm);
  tcg_gen_mov_tl(cpu_cc_src, cpu_tmp4);
  tcg_gen_movi_tl(cpu_cc_dst, 0);   << always set zf
  }
  break;

  always set zf...

  There is a patch that fixes this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/904308/+subscriptions



Re: [Qemu-devel] No video for Windows 2000 guest

2017-08-09 Thread Thomas Huth
On 09.08.2017 20:12, Programmingkid wrote:
> 
>> On Aug 9, 2017, at 12:37 PM, Paolo Bonzini  wrote:
>>
>> On 09/08/2017 16:56, Programmingkid wrote:
>>> The default vga card no longer works with a Windows 2000 guest. All I see 
>>> is a black screen after the Windows splash screen.
>>>
>>> This is the command-line I used: 
>>>
>>> qemu-system-i386 -hda Windows2000HD.qcow2 -boot c -m 512
>>>
>>> When using the -vga cirrus option video works. Testing was done with QEMU 
>>> v2.10.0 rc2. 
>>
>> Did it work in 2.9?
>>
>> Paolo
> 
> I haven't tested QEMU 2.9.0, but I did test version 2.8.0 and it has the 
> same problem. Starting up Windows 2000 in VGA mode allowed me to access the 
> operating system. I found out that QEMU's default video controller is not 
> recognized. In the Device Manager I can see a yellow question mark for Video 
> Controller (VGA Compatible). I remember Windows 2000 being able to use the 
> default video card in QEMU in the past. I may be able to bisect this issue 
> after all.

I guess you'll end up with QEMU 2.1 as the last good version and 2.2 as the
first "bad" version. According to the qemu-doc:

-vga type

Select type of VGA card to emulate. Valid values for type are

cirrus

Cirrus Logic GD5446 Video card. All Windows versions starting
from Windows 95 should recognize and use this graphic card. For
optimal performances, use 16 bit color depth in the guest and
the host OS. (This card was the default before QEMU 2.2)

std

Standard VGA card with Bochs VBE extensions. If your guest OS
supports the VESA 2.0 VBE extensions (e.g. Windows XP) and if
you want to use high resolution modes (>= 1280x1024x16) then you
should use this option. (This card is the default since QEMU
2.2)

Everything is in the documentation ;-)

 Thomas



Re: [Qemu-devel] [RFC PATCH 27/56] block/dirty-bitmap: Clean up signed vs. unsigned dirty counts

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> On 08/07/2017 09:45 AM, Markus Armbruster wrote:
>> hbitmap_count() returns uint64_t.
>> 
>> Clean up test-hbitmap.c to check its value with g_assert_cmpuint()
>> instead of g_assert_cmpint().
>> 
>> bdrv_get_dirty_count() and bdrv_get_meta_dirty_count() return its
>> value converted to int64_t.  Clean them up to return it unadulterated.
>> 
>> This moves the implicit conversion to some callers, so clean them up,
>> too.
>> 
>> mirror_run() assigns the value of bdrv_get_meta_dirty_count() to a
>> local int64_t variable.  Change it to uint64_t.  Signedness still gets
>> mixed up in the computation of s->common.len, but messing with that is
>> more than I can handle right now.
>> 
>> get_remaining_dirty() tallies bdrv_get_dirty_count() values in
>> int64_t.  Its caller block_save_pending() converts it back to
>> uint64_t.  Change get_remaining_dirty() to uint64_t.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block/dirty-bitmap.c |  4 ++--
>>  block/mirror.c   |  4 ++--
>>  block/trace-events   |  8 
>>  include/block/dirty-bitmap.h |  4 ++--
>>  migration/block.c|  4 ++--
>>  tests/test-hbitmap.c | 16 +---
>>  6 files changed, 21 insertions(+), 19 deletions(-)
>
> I don't know how much this will conflict with my pending work for
> byte-based block status, but I suspect it may be easier for your RFC to
> go in after my cleanups (I think you'll still have things to fix).

I fully expect to rebase on your work.



Re: [Qemu-devel] [RFC PATCH 10/56] hmp: Make balloon's argument unsigned

2017-08-09 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> The previous commit made it unsigned in QMP.  Switch HMP's args_type
>> from 'M' to 'o'.  Loses support for expressions (QEMU pocket
>> calculator), gains support for units other than mebibytes.  Negative
>> values are no longer accepted and interpreted modulo 2^64.  Instead,
>> values between 2^63 and 2^64-1 are now accepted.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  hmp-commands.hx | 2 +-
>>  hmp.c   | 4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index 1941e19..46ce79c 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -1433,7 +1433,7 @@ ETEXI
>>  
>>  {
>>  .name   = "balloon",
>> -.args_type  = "value:M",
>> +.args_type  = "value:o",
>>  .params = "target",
>>  .help   = "request VM to change its memory allocation (in MB)",
>>  .cmd= hmp_balloon,
>> diff --git a/hmp.c b/hmp.c
>> index 4556045..1932a11 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -781,7 +781,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
>>  return;
>>  }
>>  
>> -monitor_printf(mon, "balloon: actual=%" PRIu64 "\n", info->actual >> 
>> 20);
>> +monitor_printf(mon, "balloon: actual=%" PRId64 "\n", info->actual >> 
>> 20);
>
> That looks like a partial reversion of the last patch ?

Accident, will fix, thanks!
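
For reference, the "modulo 2^64" behaviour from the commit message and the reason for keeping the unsigned format can be shown with a minimal stand-alone sketch. The helper names are hypothetical; only the message text is borrowed from the hunk above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Converting a negative signed value to uint64_t is well defined in C:
 * the result is the value reduced modulo 2^64, so -1 becomes 2^64 - 1.
 */
static uint64_t sketch_as_unsigned(int64_t v)
{
    return (uint64_t)v;
}

/*
 * Format an unsigned byte count in mebibytes with the matching unsigned
 * conversion specifier, so the type and the format agree.
 */
static int sketch_format_actual(char *buf, size_t len, uint64_t actual)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "balloon: actual=%" PRIu64, actual >> 20);
}
```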

[...]



[Qemu-devel] [PATCH for 2.11 v2 0/2] wdt_aspeed: Support reset width patterns

2017-08-09 Thread Andrew Jeffery
Hello,

These two patches add support for the reset width configuration register in the
Aspeed watchdog. Initially this was just one patch[1], but I've reworked it as
two to explicitly support the varying capabilities between Aspeed SoC versions.

Andrew

[1] http://patchwork.ozlabs.org/patch/796039/

Andrew Jeffery (2):
  watchdog: wdt_aspeed: Add support for the reset width register
  aspeed_soc: Propagate silicon-rev to watchdog

 hw/arm/aspeed_soc.c  |  2 +
 hw/watchdog/wdt_aspeed.c | 93 +++-
 include/hw/watchdog/wdt_aspeed.h |  2 +
 3 files changed, 86 insertions(+), 11 deletions(-)

-- 
2.11.0




[Qemu-devel] [PATCH for 2.11 v2 2/2] aspeed_soc: Propagate silicon-rev to watchdog

2017-08-09 Thread Andrew Jeffery
This is required to configure differences in behaviour between the
AST2400 and AST2500 watchdog IPs.

Signed-off-by: Andrew Jeffery 
---
 hw/arm/aspeed_soc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index 3034849c80bf..79804e1ee652 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -183,6 +183,8 @@ static void aspeed_soc_init(Object *obj)
 object_initialize(&s->wdt[i], sizeof(s->wdt[i]), TYPE_ASPEED_WDT);
 object_property_add_child(obj, "wdt[*]", OBJECT(&s->wdt[i]), NULL);
 qdev_set_parent_bus(DEVICE(&s->wdt[i]), sysbus_get_default());
+qdev_prop_set_uint32(DEVICE(&s->wdt[i]), "silicon-rev",
+sc->info->silicon_rev);
 }
 
 object_initialize(&s->ftgmac100, sizeof(s->ftgmac100), TYPE_FTGMAC100);
-- 
2.11.0




Re: [Qemu-devel] [PATCH for 2.11 v2 2/2] ARM: aspeed_soc: Propagate silicon-rev to watchdog

2017-08-09 Thread Andrew Jeffery
Ugh, disregard this one; I changed the subject and reissued `git
format-patch`, which naturally doesn't overwrite any existing patch in
the output directory and so the old one got sent as well.

Andrew

On Wed, 2017-08-09 at 15:58 +0930, Andrew Jeffery wrote:
> This is required to configure differences in behaviour between the
> AST2400 and AST2500 watchdog IPs.
> 
> Signed-off-by: Andrew Jeffery 
> ---
>  hw/arm/aspeed_soc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
> index 3034849c80bf..79804e1ee652 100644
> --- a/hw/arm/aspeed_soc.c
> +++ b/hw/arm/aspeed_soc.c
> @@ -183,6 +183,8 @@ static void aspeed_soc_init(Object *obj)
>  object_initialize(&s->wdt[i], sizeof(s->wdt[i]),
> TYPE_ASPEED_WDT);
>  object_property_add_child(obj, "wdt[*]", OBJECT(&s->wdt[i]), 
> NULL);
>  qdev_set_parent_bus(DEVICE(&s->wdt[i]),
> sysbus_get_default());
> +qdev_prop_set_uint32(DEVICE(&s->wdt[i]), "silicon-rev",
> +sc->info->silicon_rev);
>  }
>  
>  object_initialize(&s->ftgmac100, sizeof(s->ftgmac100),
> TYPE_FTGMAC100);

signature.asc
Description: This is a digitally signed message part


[Qemu-devel] [PULL 2/6] ppc: fix double-free in cpu_post_load()

2017-08-09 Thread David Gibson
From: Greg Kurz 

When running nested with KVM PR, ppc_set_compat() fails and QEMU crashes
because of "double free or corruption (!prev)". The crash happens because
error_report_err() has already called error_free().

Signed-off-by: Greg Kurz 
Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: David Gibson 
---
 target/ppc/machine.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index f578156dd4..abe0a1cdf0 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -239,7 +239,6 @@ static int cpu_post_load(void *opaque, int version_id)
 ppc_set_compat(cpu, cpu->compat_pvr, &local_err);
 if (local_err) {
 error_report_err(local_err);
-error_free(local_err);
 return -1;
 }
 } else
-- 
2.13.4
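
The ownership rule behind this fix (the reporting function consumes the error, so the caller must not free it again) can be illustrated with a minimal stand-alone sketch. The types and functions are stand-ins, not QEMU's Error API, and the free counter exists only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchError {
    char *msg;
} SketchError;

static int sketch_free_count; /* number of errors released so far */

static SketchError *sketch_error_new(const char *msg)
{
    SketchError *err = malloc(sizeof(*err));

    assert(err != NULL);
    err->msg = malloc(strlen(msg) + 1);
    assert(err->msg != NULL);
    strcpy(err->msg, msg);
    return err;
}

static void sketch_error_free(SketchError *err)
{
    if (err) {
        free(err->msg);
        free(err);
        sketch_free_count++;
    }
}

/* Prints the error AND releases it, mirroring the behaviour described above */
static void sketch_error_report(SketchError *err)
{
    fprintf(stderr, "%s\n", err->msg);
    sketch_error_free(err);
}

static int sketch_post_load(int fail)
{
    SketchError *local_err = NULL;

    if (fail) {
        local_err = sketch_error_new("compat mode not supported");
    }
    if (local_err) {
        sketch_error_report(local_err); /* local_err is gone after this */
        /* no second free here: that was the double free */
        return -1;
    }
    return 0;
}
```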




[Qemu-devel] [PULL 3/6] target/ppc: Implement TIDR

2017-08-09 Thread David Gibson
This adds a trivial implementation of the TIDR register added in
POWER9.  This isn't particularly important to qemu directly - it's
used by accelerator modules that we don't emulate.

However, since qemu isn't aware of it, its state is not synchronized
with KVM and therefore not migrated, which can be a problem.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
Reviewed-by: Thomas Huth 
---
 target/ppc/cpu.h| 1 +
 target/ppc/translate_init.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 6ee2a26a96..f6e5413fad 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1451,6 +1451,7 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_TEXASR            (0x082)
 #define SPR_TEXASRU           (0x083)
 #define SPR_UCTRL             (0x088)
+#define SPR_TIDR              (0x090)
 #define SPR_MPC_CMPA          (0x090)
 #define SPR_MPC_CMPB          (0x091)
 #define SPR_MPC_CMPC          (0x092)
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 01723bdfec..94800cd29d 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8841,6 +8841,11 @@ static void init_proc_POWER9(CPUPPCState *env)
 gen_spr_power8_book4(env);
 gen_spr_power8_rpr(env);
 
+/* POWER9 Specific registers */
+spr_register_kvm(env, SPR_TIDR, "TIDR", NULL, NULL,
+ spr_read_generic, spr_write_generic,
+ KVM_REG_PPC_TIDR, 0);
+
 /* env variables */
 #if !defined(CONFIG_USER_ONLY)
 env->slb_nr = 32;
-- 
2.13.4




[Qemu-devel] [PULL 4/6] target/ppc: Add stub implementation of the PSSCR

2017-08-09 Thread David Gibson
The PSSCR register added in POWER9 controls certain power saving mode
behaviours.  Mostly, it's not relevant to TCG, however because qemu
doesn't know about it yet, it doesn't synchronize the state with KVM,
and thus it doesn't get migrated.

To fix that, this adds a minimal stub implementation of the register.
This isn't complete, even to the extent that an implementation is
possible in TCG, just enough to get migration working.  We need to
come back later and at least properly filter the various fields in the
register based on privilege level.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
Reviewed-by: Thomas Huth 
---
 target/ppc/cpu.h| 1 +
 target/ppc/translate_init.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f6e5413fad..46d3dd88f6 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1771,6 +1771,7 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_IC                (0x350)
 #define SPR_VTB               (0x351)
 #define SPR_MMCRC             (0x353)
+#define SPR_PSSCR             (0x357)
 #define SPR_440_INV0          (0x370)
 #define SPR_440_INV1          (0x371)
 #define SPR_440_INV2          (0x372)
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 94800cd29d..8fb407ed73 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8846,6 +8846,11 @@ static void init_proc_POWER9(CPUPPCState *env)
  spr_read_generic, spr_write_generic,
  KVM_REG_PPC_TIDR, 0);
 
+/* FIXME: Filter fields properly based on privilege level */
+spr_register_kvm_hv(env, SPR_PSSCR, "PSSCR", NULL, NULL, NULL, NULL,
+spr_read_generic, spr_write_generic,
+KVM_REG_PPC_PSSCR, 0);
+
 /* env variables */
 #if !defined(CONFIG_USER_ONLY)
 env->slb_nr = 32;
-- 
2.13.4




[Qemu-devel] [PULL 6/6] spapr: Fix bug in h_signal_sys_reset()

2017-08-09 Thread David Gibson
From: Sam Bobroff 

The unicast case in h_signal_sys_reset() seems to be broken:
rather than selecting the target CPU, it looks like it will pick
either the first CPU or fail to find one at all.

Fix it by using the search function rather than open coding the
search.

This was found by inspection; the code appears to be unused because
the Linux kernel only uses the broadcast target.

Signed-off-by: Sam Bobroff 
Reviewed-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_hcall.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 72ea5a8247..07b3da8dc4 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1431,11 +1431,10 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
 
 } else {
 /* Unicast */
-CPU_FOREACH(cs) {
-if (cpu->cpu_dt_id == target) {
-run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
-return H_SUCCESS;
-}
+cs = CPU(ppc_get_vcpu_by_dt_id(target));
+if (cs) {
+run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
+return H_SUCCESS;
 }
 return H_PARAMETER;
 }
-- 
2.13.4
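
For anyone who wants to see the failure mode outside QEMU, here is a minimal sketch. It is not QEMU code: `struct vcpu`, `find_buggy()` and `find_fixed()` are hypothetical stand-ins. It assumes the old loop's problem was that the comparison looked at the calling CPU's id on every iteration rather than at the CPU being visited, which matches the "first CPU or none at all" behaviour described in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; not a QEMU type. */
struct vcpu {
    int dt_id;
};

/*
 * Mirrors the old loop: the test looks at the caller, not at the
 * element being visited, so the result is either the first entry
 * (caller happens to match) or NULL (caller does not match).
 */
const struct vcpu *find_buggy(const struct vcpu *cpus, size_t n,
                              const struct vcpu *caller, int target)
{
    for (size_t i = 0; i < n; i++) {
        if (caller->dt_id == target) {
            return &cpus[i];
        }
    }
    return NULL;
}

/* Mirrors the fix: search for the entry whose own id matches. */
const struct vcpu *find_fixed(const struct vcpu *cpus, size_t n, int target)
{
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].dt_id == target) {
            return &cpus[i];
        }
    }
    return NULL;
}
```

With three CPUs whose ids are 0, 8 and 16, asking the buggy version for id 16 gives NULL when called from CPU 0 and the first CPU when called from CPU 16; only the fixed version returns the CPU with id 16.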




Re: [Qemu-devel] [PATCH v4 04/22] tests: Add assertion for no qmp("")

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> Now that the previous patches have fixed all callers to avoid an
> empty message, we can tweak qmp_fd_sendv() to assert that we
> don't introduce new callers, and reindent accordingly.  The
> additional assertions will also help verify that later refactoring
> is not breaking anything.
>
> Signed-off-by: Eric Blake 
> ---
>  tests/libqtest.c | 38 ++
>  1 file changed, 18 insertions(+), 20 deletions(-)
>
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 7e5425d704..99a07c246f 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -450,6 +450,9 @@ void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  {
>  va_list ap_copy;
>  QObject *qobj;
> +int log = getenv("QTEST_LOG") != NULL;

Use the opportunity to make this bool?

> +QString *qstr;
> +const char *str;
>
>  /* qobject_from_jsonv() silently eats leading 0xff as invalid
>   * JSON, but we want to test sending them over the wire to force
> @@ -458,6 +461,7 @@ void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  socket_send(fd, fmt, 1);
>  fmt++;
>  }
> +assert(*fmt);

I prefer assertions on arguments going first, for extra visibility.

>  /* Going through qobject ensures we escape strings properly.
>   * This seemingly unnecessary copy is required in case va_list
> @@ -466,29 +470,23 @@ void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  va_copy(ap_copy, ap);
>  qobj = qobject_from_jsonv(fmt, &ap_copy, &error_abort);
>  va_end(ap_copy);
> +qstr = qobject_to_json(qobj);
>
> -/* No need to send anything for an empty QObject.  */
> -if (qobj) {
> -int log = getenv("QTEST_LOG") != NULL;
> -QString *qstr = qobject_to_json(qobj);
> -const char *str;
> +/*
> + * BUG: QMP doesn't react to input until it sees a newline, an
> + * object, or an array.  Work-around: give it a newline.
> + */
> +qstring_append_chr(qstr, '\n');
> +str = qstring_get_str(qstr);
>
> -/*
> - * BUG: QMP doesn't react to input until it sees a newline, an
> - * object, or an array.  Work-around: give it a newline.
> - */
> -qstring_append_chr(qstr, '\n');
> -str = qstring_get_str(qstr);
> -
> -if (log) {
> -fprintf(stderr, "%s", str);
> -}
> -/* Send QMP request */
> -socket_send(fd, str, qstring_get_length(qstr));
> -
> -QDECREF(qstr);
> -qobject_decref(qobj);
> +if (log) {
> +fprintf(stderr, "%s", str);
>  }
> +/* Send QMP request */
> +socket_send(fd, str, qstring_get_length(qstr));
> +
> +QDECREF(qstr);
> +qobject_decref(qobj);
>  }
>
>  void qtest_async_qmpv(QTestState *s, const char *fmt, va_list ap)

Preferably with the assertion moved:
Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time

2017-08-09 Thread Peter Xu
On Wed, Aug 09, 2017 at 10:05:19AM +0200, Juan Quintela wrote:
> Peter Xu  wrote:
> > On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
> >> Peter Xu  wrote:
> >> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> >> >
> >> > [...]
> >> >
> >> >>  static int multifd_send_page(uint8_t *address)
> >> >>  {
> >> >> -int i;
> >> >> +int i, j;
> >> >>  MultiFDSendParams *p = NULL; /* make happy gcc */
> >> >> +static multifd_pages_t pages;
> >> >> +static bool once;
> >> >> +
> >> >> +if (!once) {
> >> >> +multifd_init_group();
> >> >> +once = true;
> >> >
> >> > Would it be good to put the "pages" into multifd_send_state? One is to
> >> > stick globals together; another benefit is that we can remove the
> >> > "once" here: we can then init the "pages" when init multifd_send_state
> >> > struct (but maybe with a better name?...).
> >> 
> >> I did to be able to free it.
> >
> > Free it? But they are static variables, then how can we free them?
> >
> > (I thought the only way to free it is putting it into
> >  multifd_send_state...)
> >
> > Something I must have missed here. :(
> 
> I did the change that you suggested in response to a comment from Dave
> that asked where I freed it.   I see that my sentence was ambiguous.

Oh! Then it's clear now. Thanks!

(Sorry I may have missed some of the emails in the threads)

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 10:23:04 +0200
Thomas Huth  wrote:

> On 09.08.2017 09:17, Cornelia Huck wrote:
> > Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> > on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> > device.
> > 
> > Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> > CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > 
> > Changes v1->v2: drop extraneous spaces, fix build on cris
> > 
> > ---
> >  default-configs/s390x-softmmu.mak | 1 +
> >  fsdev/Makefile.objs   | 9 +++--
> >  hw/Makefile.objs  | 2 +-
> >  3 files changed, 5 insertions(+), 7 deletions(-)
> > 
> > diff --git a/default-configs/s390x-softmmu.mak 
> > b/default-configs/s390x-softmmu.mak
> > index 51191b77df..e4c5236ceb 100644
> > --- a/default-configs/s390x-softmmu.mak
> > +++ b/default-configs/s390x-softmmu.mak
> > @@ -8,3 +8,4 @@ CONFIG_S390_FLIC=y
> >  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
> >  CONFIG_VFIO_CCW=$(CONFIG_LINUX)
> >  CONFIG_WDT_DIAG288=y
> > +CONFIG_VIRTIO_CCW=y
> > diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
> > index 659df6e187..3d157add31 100644
> > --- a/fsdev/Makefile.objs
> > +++ b/fsdev/Makefile.objs
> > @@ -1,10 +1,7 @@
> > -ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
> >  # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
> > -# only pull in the actual virtio-9p device if we also enabled virtio.
> > -common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
> > -else
> > -common-obj-y = qemu-fsdev-dummy.o
> > -endif
> > +# only pull in the actual virtio-9p device if we also enabled a virtio 
> > backend.
> > +common-obj-$(call land,$(CONFIG_VIRTFS),$(call 
> > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))= qemu-fsdev.o 9p-marshal.o 
> > 9p-iov-marshal.o
> > +common-obj-$(call lnot,$(call land,$(CONFIG_VIRTFS),$(call 
> > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))) = qemu-fsdev-dummy.o
> >  common-obj-y += qemu-fsdev-opts.o qemu-fsdev-throttle.o
> >  
> >  # Toplevel always builds this; targets without virtio will put it in
> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > index a2c61f6b09..335f26b65e 100644
> > --- a/hw/Makefile.objs
> > +++ b/hw/Makefile.objs
> > @@ -1,4 +1,4 @@
> > -devices-dirs-$(call land, $(CONFIG_VIRTIO),$(call 
> > land,$(CONFIG_VIRTFS),$(CONFIG_PCI))) += 9pfs/
> > +devices-dirs-$(call land,$(CONFIG_VIRTFS),$(call 
> > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW))) += 9pfs/
> >  devices-dirs-$(CONFIG_SOFTMMU) += acpi/
> >  devices-dirs-$(CONFIG_SOFTMMU) += adc/
> >  devices-dirs-$(CONFIG_SOFTMMU) += audio/  
> 
> Patch should be fine now, I think...
> 
> But thinking about this again, I wonder whether it would be enough to
> simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
> sufficient to assert that there is also at least one kind of virtio
> transport available, right?
> Otherwise this will look really horrible as soon as somebody also tries
> to add support for virtio-mmio here later ;-)

Do all virtio transports have support for 9p, though? I thought it was
only virtio-pci and virtio-ccw...



Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Thomas Huth
On 09.08.2017 09:17, Cornelia Huck wrote:
> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> device.
> 
> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> 
> Signed-off-by: Cornelia Huck 
> ---
> 
> Changes v1->v2: drop extraneous spaces, fix build on cris
> 
> ---
>  default-configs/s390x-softmmu.mak | 1 +
>  fsdev/Makefile.objs   | 9 +++--
>  hw/Makefile.objs  | 2 +-
>  3 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/default-configs/s390x-softmmu.mak 
> b/default-configs/s390x-softmmu.mak
> index 51191b77df..e4c5236ceb 100644
> --- a/default-configs/s390x-softmmu.mak
> +++ b/default-configs/s390x-softmmu.mak
> @@ -8,3 +8,4 @@ CONFIG_S390_FLIC=y
>  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
>  CONFIG_VFIO_CCW=$(CONFIG_LINUX)
>  CONFIG_WDT_DIAG288=y
> +CONFIG_VIRTIO_CCW=y
> diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
> index 659df6e187..3d157add31 100644
> --- a/fsdev/Makefile.objs
> +++ b/fsdev/Makefile.objs
> @@ -1,10 +1,7 @@
> -ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
>  # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
> -# only pull in the actual virtio-9p device if we also enabled virtio.
> -common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
> -else
> -common-obj-y = qemu-fsdev-dummy.o
> -endif
> +# only pull in the actual virtio-9p device if we also enabled a virtio 
> backend.
> +common-obj-$(call land,$(CONFIG_VIRTFS),$(call 
> lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))= qemu-fsdev.o 9p-marshal.o 
> 9p-iov-marshal.o
> +common-obj-$(call lnot,$(call land,$(CONFIG_VIRTFS),$(call 
> lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))) = qemu-fsdev-dummy.o
>  common-obj-y += qemu-fsdev-opts.o qemu-fsdev-throttle.o
>  
>  # Toplevel always builds this; targets without virtio will put it in
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index a2c61f6b09..335f26b65e 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -1,4 +1,4 @@
> -devices-dirs-$(call land, $(CONFIG_VIRTIO),$(call 
> land,$(CONFIG_VIRTFS),$(CONFIG_PCI))) += 9pfs/
> +devices-dirs-$(call land,$(CONFIG_VIRTFS),$(call 
> lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW))) += 9pfs/
>  devices-dirs-$(CONFIG_SOFTMMU) += acpi/
>  devices-dirs-$(CONFIG_SOFTMMU) += adc/
>  devices-dirs-$(CONFIG_SOFTMMU) += audio/

Patch should be fine now, I think...

But thinking about this again, I wonder whether it would be enough to
simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
sufficient to assert that there is also at least one kind of virtio
transport available, right?
Otherwise this will look really horrible as soon as somebody also tries
to add support for virtio-mmio here later ;-)

 Thomas



Re: [Qemu-devel] [RFC PATCH 03/56] monitor: Rewrite comment describing HMP .args_type

2017-08-09 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> "Dr. David Alan Gilbert"  writes:
>> 
>> > * Markus Armbruster (arm...@redhat.com) wrote:
>> >> Signed-off-by: Markus Armbruster 
>> >> ---
>> >>  monitor.c | 75 
>> >> +++
>> >>  1 file changed, 47 insertions(+), 28 deletions(-)
>> >> 
>> >> diff --git a/monitor.c b/monitor.c
>> >> index e0f8801..8b54ba1 100644
>> >> --- a/monitor.c
>> >> +++ b/monitor.c
>> >> @@ -85,37 +85,56 @@
>> >>  #endif
>> >>  
>> >>  /*
>> >> - * Supported types:
>> >> + * Command handlers (mon_cmd_t member @cmd) receive actual arguments
>> >> + * in a QDict, which is built by the HMP core according to mon_cmd_t
>> >> + * member @args_type.  It's a list of NAME:TYPE separated by comma.
>> >>   *
>> >> - * 'F'  filename
>> >> - * 'B'  block device name
>> >> - * 's'  string (accept optional quote)
>> >> - * 'S'  it just appends the rest of the string (accept optional 
>> >> quote)
>> >> - * 'O'  option string of the form NAME=VALUE,...
>> >> - *  parsed according to QemuOptsList given by its name
>> >> - *  Example: 'device:O' uses qemu_device_opts.
>> >> - *  Restriction: only lists with empty desc are supported
>> >> - *  TODO lift the restriction
>> >> - * 'i'  32 bit integer
>> >> - * 'l'  target long (32 or 64 bit)
>> >> - * 'M'  Non-negative target long (32 or 64 bit), in user mode the
>> >> - *  value is multiplied by 2^20 (think Mebibyte)
>> >> - * 'o'  octets (aka bytes)
>> >> - *  user mode accepts an optional E, e, P, p, T, t, G, g, M, 
>> >> m,
>> >> - *  K, k suffix, which multiplies the value by 2^60 for 
>> >> suffixes E
>> >> - *  and e, 2^50 for suffixes P and p, 2^40 for suffixes T 
>> >> and t,
>> >> - *  2^30 for suffixes G and g, 2^20 for M and m, 2^10 for K 
>> >> and k
>> >> - * 'T'  double
>> >> - *  user mode accepts an optional ms, us, ns suffix,
>> >> - *  which divides the value by 1e3, 1e6, 1e9, respectively
>> >> - * '/'  optional gdb-like print format (like "/10x")
>> >> + * TYPEs that put a string value with key NAME into the QDict:
>> >> + * 's'Argument is enclosed in '"' or delimited by whitespace.  In
>> >> + *the former case, escapes \n \r \\ \' and \" are recognized.
>> >> + * 'F'File name, like 's' except for completion.
>> >> + * 'B'BlockBackend name, like 's' except for completion.
>> >> + * 'S'Argument is the remainder of the line, less leading
>> >> + *whitespace.
>> >> +
>> >>   *
>> >> - * '?'  optional type (for all types, except '/')
>> >> - * '.'  other form of optional type (for 'i' and 'l')
>> >> - * 'b'  boolean
>> >> - *  user mode accepts "on" or "off"
>> >> - * '-'  optional parameter (eg. '-f')
>> >> + * TYPEs that put an int64_t value with key NAME:
>> >> + * 'l'Argument is an expression (QEMU pocket calculator).
>> >> + * 'i'Like 'l' except value must fit into 32 bit unsigned.
>> >> + * 'M'Like 'l' except value must not be negative and is multiplied
>> >> + *by 2^20 (think "mebibyte").
>> >>   *
>> >> + * TYPEs that put an uint64_t value with key NAME:
>> >> + * 'o'Argument is a size (think "octets").  Without suffix the
>> >> + *value is multiplied by 2^20 (mebibytes), with suffix E or e
>> >> + *by 2^60 (exbibytes), with P or p by 2^50 (pebibytes), with T
>> >> + *or t by 2^40 (tebibytes), with G or g by 2^30 (gibibytes),
>> >> + *with M or m by 2^20 (mebibytes), with K or k by 2^10
>> >> + *(kibibytes).
>> >
>> > 'o' is messy.  It uses qemu_strtosz_MiB, which uses a 'double' intermediate
>> > so I fear it can round.
>> 
>> It does, but only when you have more than 53 significant bits.
>> 
>> >  It also has a note it can't take all f's due to
>> > an overflow from the conversion.
>> 
>> Correct, because values between 0xfc00 and 2^64-1 round up
>> to 2^64.
>
> Right, so these bother me not for normal sizes, but if we were to start
> to use them for hex values with meanings, like addresses for example.
> (Although I guess that's unlikely with the default assumption of MB)

Yes, 'o' is convenient in some cases, inconvenient in others, and
incapable when you need more than 53 significant bits.
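
To make the 53-bit limit concrete, here is a small sketch of what happens when a 64-bit value passes through a double on the way in. The helper names `to_double()` and `survives_double()` are made up for illustration and are not part of QEMU; it assumes IEEE 754 binary64 doubles with round-to-nearest, which is what the usual hosts provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Go through a volatile so the value is really rounded to double. */
double to_double(uint64_t v)
{
    volatile double d = (double)v;
    return d;
}

/* True if v comes back unchanged after a round trip through double. */
bool survives_double(uint64_t v)
{
    double d = to_double(v);

    /* 2^64 does not fit in uint64_t, so reject it before casting back. */
    if (d >= 18446744073709551616.0) {
        return false;
    }
    return (uint64_t)d == v;
}
```

Under those assumptions 2^53 survives the round trip, 2^53 + 1 does not, and UINT64_MAX rounds up to 2^64, which is the overflow case mentioned above.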

>> If it bothers you, feel free to explore the following: feed the string
>> both to strtod() and to strtoll().  Whichever eats more characters wins.
>
> Is the reason we're using strtod because we actively want users to be
> able to say 3.5G ?  I guess that's a reason to keep it.

Early (and flawed) version(s) of the patch introducing strtosz() used
strtoll().  Jes decided to switch to 

Re: [Qemu-devel] [RFC PATCH 32/56] hmp: Make block_set_io_throttle's arguments unsigned

2017-08-09 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> The previous commit made them unsigned in QMP.  Switch HMP's args_type
>> from 'l' to 'o'.  Loses support for expressions (QEMU pocket
>> calculator), gains support for unit suffixes.  Negative values are no
>> longer accepted and interpreted modulo 2^64.  Instead, values between
>> 2^63 and 2^64-1 are now accepted.
>
> But that also means all these values are assumed to be in MB by default?

Yes.

We could debate whether that's acceptable, as HMP is not a stable
interface, but as a matter of fact, I'm no friend of defaulting the unit
to anything but one.  Looks like we have a customer for your proposed
args_type '6'.



[Qemu-devel] [PULL 0/6] ppc patch queue 2017-08-09

2017-08-09 Thread David Gibson
The following changes since commit 54affb3a3623b1d36c95e34faa722a5831323a74:

  Update version for v2.10.0-rc2 release (2017-08-08 19:07:46 +0100)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.10-20170809

for you to fetch changes up to f57467e3b326c7736f8e481fd6b680f30e575c87:

  spapr: Fix bug in h_signal_sys_reset() (2017-08-09 14:04:28 +1000)


ppc patch queue 2017-08-09

This series contains a number of bugfixes for ppc and related
machines, for the qemu-2.10 release.  Some are true regressions,
others are serious enough and non-invasive enough to fix that it's
worth putting in 2.10 this late.



I haven't completed a Travis build for this, which is part of my usual
test regime, since the first dozen or so Travis builds are failing
more often than not on master as well.  I don't know why this is -
seems to be failing some of the x86 tests.


David Gibson (2):
  target/ppc: Implement TIDR
  target/ppc: Add stub implementation of the PSSCR

Greg Kurz (2):
  ppc: fix double-free in cpu_post_load()
  spapr_drc: abort if object_property_add_child() fails

KONRAD Frederic (1):
  booke206: fix MAS update on tlb miss

Sam Bobroff (1):
  spapr: Fix bug in h_signal_sys_reset()

 hw/ppc/spapr_drc.c  |  2 +-
 hw/ppc/spapr_hcall.c|  9 -
 target/ppc/cpu.h|  2 ++
 target/ppc/machine.c|  1 -
 target/ppc/mmu_helper.c |  2 +-
 target/ppc/translate_init.c | 10 ++
 6 files changed, 18 insertions(+), 8 deletions(-)



[Qemu-devel] [PULL 1/6] booke206: fix MAS update on tlb miss

2017-08-09 Thread David Gibson
From: KONRAD Frederic 

When a tlb instruction miss happens, rw is set to 0 at the bottom
of cpu_ppc_handle_mmu_fault, which causes the MAS update function to miss
the SAS and TS bits in MAS6, MAS1 in booke206_update_mas_tlb_miss.

Just calling booke206_update_mas_tlb_miss with rw = 2 solves the issue.

Signed-off-by: KONRAD Frederic 
Signed-off-by: David Gibson 
---
 target/ppc/mmu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index b7b9088842..f06b9382b4 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1551,7 +1551,7 @@ static int cpu_ppc_handle_mmu_fault(CPUPPCState *env, 
target_ulong address,
 env->spr[SPR_40x_ESR] = 0x;
 break;
 case POWERPC_MMU_BOOKE206:
-booke206_update_mas_tlb_miss(env, address, rw);
+booke206_update_mas_tlb_miss(env, address, 2);
 /* fall through */
 case POWERPC_MMU_BOOKE:
 cs->exception_index = POWERPC_EXCP_ITLB;
-- 
2.13.4




Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time

2017-08-09 Thread Peter Xu
On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
> Peter Xu  wrote:
> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> >
> > [...]
> >
> >>  static int multifd_send_page(uint8_t *address)
> >>  {
> >> -int i;
> >> +int i, j;
> >>  MultiFDSendParams *p = NULL; /* make happy gcc */
> >> +static multifd_pages_t pages;
> >> +static bool once;
> >> +
> >> +if (!once) {
> >> +multifd_init_group();
> >> +once = true;
> >
> > Would it be good to put the "pages" into multifd_send_state? One is to
> > stick globals together; another benefit is that we can remove the
> > "once" here: we can then init the "pages" when init multifd_send_state
> > struct (but maybe with a better name?...).
> 
> I did to be able to free it.

Free it? But they are static variables, then how can we free them?

(I thought the only way to free it is putting it into
 multifd_send_state...)

Something I must have missed here. :(

> 
> > (there are similar static variables in multifd_recv_page() as well, if
> >  this one applies, then we can possibly use multifd_recv_state for
> >  that one)
> 
> Also there.
> 
> >> +}
> >> +
> >> +pages.iov[pages.num].iov_base = address;
> >> +pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> >> +pages.num++;
> >> +
> >> +if (pages.num < (pages.size - 1)) {
> >> +return UINT16_MAX;
> >
> > Nit: shall we define something for readability?  Like:
> >
> > #define  MULTIFD_FD_INVALID  UINT16_MAX
> 
> Also done.
> 
> MULTIFD_CONTINUE
> 
> But I am open to changes.

It's clear enough at least to me. Thanks!
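
For readers skimming the thread, the "queue pages, return a sentinel until the group is full" pattern looks roughly like the sketch below. It is only an illustration: `BATCH_CONTINUE`, `BATCH_SIZE`, `struct batch` and `batch_add()` are hypothetical names, the full-batch condition is simplified compared to the patch, and the flush step is reduced to resetting the counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "page queued, no channel used yet". */
#define BATCH_CONTINUE UINT16_MAX
#define BATCH_SIZE 4

struct batch {
    void *base[BATCH_SIZE];
    size_t num;
};

/*
 * Queue one page.  Returns BATCH_CONTINUE while the batch is still
 * filling; once it is full, "sends" it and returns the channel used.
 */
uint16_t batch_add(struct batch *b, void *page, uint16_t channel)
{
    b->base[b->num++] = page;
    if (b->num < BATCH_SIZE) {
        return BATCH_CONTINUE;
    }
    /* A real sender would hand the batch to a worker thread here. */
    b->num = 0;
    return channel;
}
```

The caller only has to compare the return value against the named sentinel, which is the readability gain discussed above.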

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work

2017-08-09 Thread Peter Xu
On Tue, Aug 08, 2017 at 11:19:35AM +0200, Juan Quintela wrote:
> Peter Xu  wrote:
> > On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:
> >
> > [...]
> >
> >>  int multifd_load_setup(void)
> >>  {
> >>  int thread_count;
> >> -uint8_t i;
> >>  
> >>  if (!migrate_use_multifd()) {
> >>  return 0;
> >>  }
> >>  thread_count = migrate_multifd_threads();
> >>  multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
> >> -multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> >> +multifd_recv_state->params = g_new0(MultiFDRecvParams *, 
> >> thread_count);
> >>  multifd_recv_state->count = 0;
> >> -for (i = 0; i < thread_count; i++) {
> >> -char thread_name[16];
> >> -MultiFDRecvParams *p = &multifd_recv_state->params[i];
> >> -
> >> -qemu_mutex_init(&p->mutex);
> >> -qemu_sem_init(&p->sem, 0);
> >> -p->quit = false;
> >> -p->id = i;
> >> -snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
> >> -qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, 
> >> p,
> >> -   QEMU_THREAD_JOINABLE);
> >> -multifd_recv_state->count++;
> >> -}
> >
> > Could I ask why we explicitly switched from MultiFDRecvParams[] array
> > into a pointer array? Can we still use the old array?  Thanks,
> 
> Now, we could receive the channels out of order (the wonders of
> networking).  So, we have two options that I can see:
> 
> * Add interesting global locking to be able to modify inplace (I know
>   that it should be safe, but yet).
> * Create a new struct in the new connection, and then atomically switch
>   the pointer to the right instruction.
> 
> I can assure you that the second one makes it much easier to detect
> when you use the "channel" before you have fully created it O:-)

Oh, so it's possible that we start to recv pages even if the recv
channel has not yet been established...

Then would current code be problematic? Like in multifd_recv_page() we
have:

static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
{
...
p = multifd_recv_state->params[fd_num];
qemu_sem_wait(&p->ready);
...
}

Here can p==NULL if channel is not ready yet?

(If so, I think a static array makes more sense...)
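
To illustrate the concern, here is a sketch of the "build the struct, then publish the pointer" option using C11 atomics. Everything in it (`struct channel`, `slots`, `channel_publish()`, `channel_lookup()`) is hypothetical and not taken from the series; the point is only that a reader can observe NULL until the slot has been published, so a lookup like the one above would need a check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-channel state; not the MultiFDRecvParams layout. */
struct channel {
    int id;
};

#define MAX_CHANNELS 4

/* Slots start out NULL and are filled in as connections arrive. */
static _Atomic(struct channel *) slots[MAX_CHANNELS];

/* Publish a fully initialised channel; it may arrive in any order. */
void channel_publish(struct channel *c)
{
    atomic_store_explicit(&slots[c->id], c, memory_order_release);
}

/* Returns NULL if the channel has not been published yet. */
struct channel *channel_lookup(int id)
{
    return atomic_load_explicit(&slots[id], memory_order_acquire);
}
```

A caller that dereferences the result of `channel_lookup()` without testing it hits exactly the p==NULL case asked about above.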

Thanks,

-- 
Peter Xu



Re: [Qemu-devel] [RFC PATCH 09/56] balloon: Make balloon size unsigned in QAPI/QMP

2017-08-09 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> Sizes should use QAPI type 'size' (uint64_t).  balloon parameter
>> @value is 'int' (int64_t).  qmp_balloon() implicitly converts to
>> ram_addr_t, i.e. uint64_t.  BALLOON_CHANGE parameter @actual and
>> BalloonInfo member @actual are also 'int'.
>> virtio_balloon_set_config() and virtio_balloon_stat() implicitly
>> convert from ram_addr_t.
>> 
>> Change all three to 'size', and adjust the code using them.
>> 
>> balloon now accepts size values between 2^63 and 2^64-1.  It accepts
>> negative values as before, because that's how the QObject input
>> visitor works for backward compatibility.
>> 
>> Doing the same for HMP's balloon deserves its own commit (the next
>> one).
>> 
>> BALLOON_CHANGE and query-balloon now report sizes above 2^63-1
>> correctly instead of their (negative) two's complement.
>> 
>> So does HMP's "info balloon".
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  balloon.c| 2 +-
>>  hmp.c| 2 +-
>>  qapi-schema.json | 4 ++--
>>  qapi/event.json  | 2 +-
>>  4 files changed, 5 insertions(+), 5 deletions(-)
>> 
>> diff --git a/balloon.c b/balloon.c
>> index 1d720ff..2ecca76 100644
>> --- a/balloon.c
>> +++ b/balloon.c
>> @@ -102,7 +102,7 @@ BalloonInfo *qmp_query_balloon(Error **errp)
>>  return info;
>>  }
>>  
>> -void qmp_balloon(int64_t target, Error **errp)
>> +void qmp_balloon(uint64_t target, Error **errp)
>>  {
>>  if (!have_balloon(errp)) {
>>  return;
>
> Can't you remove the:
>   if (target <= 0) {
>
> check?

Functional change when target == 0.  Impact is not clear to me.

> (The type of the trace_balloon_event probably needs fixing
> to be uint64_t rather than the unsigned long)

You're right.  I'll fix it.

>> diff --git a/hmp.c b/hmp.c
>> index 8257dd0..4556045 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -781,7 +781,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
>>  return;
>>  }
>>  
>> -monitor_printf(mon, "balloon: actual=%" PRId64 "\n", info->actual >> 
>> 20);
>> +monitor_printf(mon, "balloon: actual=%" PRIu64 "\n", info->actual >> 
>> 20);
>>  
>>  qapi_free_BalloonInfo(info);
>>  }
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 3ad2bc0..23eb60d 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -2003,7 +2003,7 @@
>>  # Since: 0.14.0
>>  #
>>  ##
>> -{ 'struct': 'BalloonInfo', 'data': {'actual': 'int' } }
>> +{ 'struct': 'BalloonInfo', 'data': {'actual': 'size' } }
>>  
>>  ##
>>  # @query-balloon:
>> @@ -2603,7 +2603,7 @@
>>  # <- { "return": {} }
>>  #
>>  ##
>> -{ 'command': 'balloon', 'data': {'value': 'int'} }
>> +{ 'command': 'balloon', 'data': {'value': 'size'} }
>>  
>>  ##
>>  # @Abort:
>> diff --git a/qapi/event.json b/qapi/event.json
>> index 6d22b02..9dfc70b 100644
>> --- a/qapi/event.json
>> +++ b/qapi/event.json
>> @@ -488,7 +488,7 @@
>>  #
>>  ##
>>  { 'event': 'BALLOON_CHANGE',
>> -  'data': { 'actual': 'int' } }
>> +  'data': { 'actual': 'size' } }
>
> I was going to ask whether that was a problem for any external users,
> but there again libvirt looks like it reads it into an unsigned long
> long.

Yes.  See also my reply to Juan's review of PATCH 15.
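
As a quick illustration of why the format change in hmp.c matters, the sketch below prints the same 64-bit value once as signed and once as unsigned. The helper names are invented for the example; it assumes a two's complement host, where converting an out-of-range uint64_t to int64_t wraps.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format v the old way: treated as int64_t and printed with PRId64. */
void format_signed(char *buf, size_t len, uint64_t v)
{
    snprintf(buf, len, "%" PRId64, (int64_t)v);
}

/* Format v the new way: kept unsigned and printed with PRIu64. */
void format_unsigned(char *buf, size_t len, uint64_t v)
{
    snprintf(buf, len, "%" PRIu64, v);
}
```

For a value of 2^63 the signed variant yields "-9223372036854775808", while the unsigned variant yields "9223372036854775808".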



[Qemu-devel] [PATCH for 2.11 v2 1/2] watchdog: wdt_aspeed: Add support for the reset width register

2017-08-09 Thread Andrew Jeffery
The reset width register controls how the pulse on the SoC's WDTRST{1,2}
pins behaves. A pulse is emitted if the external reset bit is set in
WDT_CTRL. On the AST2500 WDT_RESET_WIDTH can consume magic bit patterns
to configure push-pull/open-drain and active-high/active-low
behaviours and thus needs some special handling in the write path.

As some of the capabilities depend on the SoC version a silicon-rev
property is introduced, which is used to guard version-specific
behaviour.

Signed-off-by: Andrew Jeffery 
---
 hw/watchdog/wdt_aspeed.c | 93 +++-
 include/hw/watchdog/wdt_aspeed.h |  2 +
 2 files changed, 84 insertions(+), 11 deletions(-)

diff --git a/hw/watchdog/wdt_aspeed.c b/hw/watchdog/wdt_aspeed.c
index 8bbe579b6b66..22bce364d7b5 100644
--- a/hw/watchdog/wdt_aspeed.c
+++ b/hw/watchdog/wdt_aspeed.c
@@ -8,16 +8,19 @@
  */
 
 #include "qemu/osdep.h"
+
+#include "qapi/error.h"
 #include "qemu/log.h"
+#include "qemu/timer.h"
 #include "sysemu/watchdog.h"
+#include "hw/misc/aspeed_scu.h"
 #include "hw/sysbus.h"
-#include "qemu/timer.h"
 #include "hw/watchdog/wdt_aspeed.h"
 
-#define WDT_STATUS  (0x00 / 4)
-#define WDT_RELOAD_VALUE(0x04 / 4)
-#define WDT_RESTART (0x08 / 4)
-#define WDT_CTRL(0x0C / 4)
+#define WDT_STATUS  (0x00 / 4)
+#define WDT_RELOAD_VALUE(0x04 / 4)
+#define WDT_RESTART (0x08 / 4)
+#define WDT_CTRL(0x0C / 4)
 #define   WDT_CTRL_RESET_MODE_SOC   (0x00 << 5)
 #define   WDT_CTRL_RESET_MODE_FULL_CHIP (0x01 << 5)
 #define   WDT_CTRL_1MHZ_CLK BIT(4)
@@ -25,18 +28,41 @@
 #define   WDT_CTRL_WDT_INTR BIT(2)
 #define   WDT_CTRL_RESET_SYSTEM BIT(1)
 #define   WDT_CTRL_ENABLE   BIT(0)
+#define WDT_RESET_WIDTH (0x18 / 4)
+#define   WDT_RESET_WIDTH_ACTIVE_HIGH   BIT(31)
+#define WDT_POLARITY_MASK   (0xFF << 24)
+#define WDT_ACTIVE_HIGH_MAGIC   (0xA5 << 24)
+#define WDT_ACTIVE_LOW_MAGIC(0x5A << 24)
+#define   WDT_RESET_WIDTH_PUSH_PULL BIT(30)
+#define WDT_DRIVE_TYPE_MASK (0xFF << 24)
+#define WDT_PUSH_PULL_MAGIC (0xA8 << 24)
+#define WDT_OPEN_DRAIN_MAGIC(0x8A << 24)
 
-#define WDT_TIMEOUT_STATUS  (0x10 / 4)
-#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
-#define WDT_RESET_WDITH (0x18 / 4)
+#define WDT_TIMEOUT_STATUS  (0x10 / 4)
+#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
 
-#define WDT_RESTART_MAGIC   0x4755
+#define WDT_RESTART_MAGIC   0x4755
 
 static bool aspeed_wdt_is_enabled(const AspeedWDTState *s)
 {
 return s->regs[WDT_CTRL] & WDT_CTRL_ENABLE;
 }
 
+static bool is_ast2500(const AspeedWDTState *s)
+{
+switch (s->silicon_rev) {
+case AST2500_A0_SILICON_REV:
+case AST2500_A1_SILICON_REV:
+return true;
+case AST2400_A0_SILICON_REV:
+case AST2400_A1_SILICON_REV:
+default:
+break;
+}
+
+return false;
+}
+
 static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, unsigned size)
 {
 AspeedWDTState *s = ASPEED_WDT(opaque);
@@ -55,9 +81,10 @@ static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, 
unsigned size)
 return 0;
 case WDT_CTRL:
 return s->regs[WDT_CTRL];
+case WDT_RESET_WIDTH:
+return s->regs[WDT_RESET_WIDTH];
 case WDT_TIMEOUT_STATUS:
 case WDT_TIMEOUT_CLEAR:
-case WDT_RESET_WDITH:
 qemu_log_mask(LOG_UNIMP,
   "%s: uninmplemented read at offset 0x%" HWADDR_PRIx "\n",
   __func__, offset);
@@ -119,9 +146,27 @@ static void aspeed_wdt_write(void *opaque, hwaddr offset, 
uint64_t data,
 timer_del(s->timer);
 }
 break;
+case WDT_RESET_WIDTH:
+{
+uint32_t property = data & WDT_POLARITY_MASK;
+
+if (property && is_ast2500(s)) {
+if (property == WDT_ACTIVE_HIGH_MAGIC) {
+s->regs[WDT_RESET_WIDTH] |= WDT_RESET_WIDTH_ACTIVE_HIGH;
+} else if (property == WDT_ACTIVE_LOW_MAGIC) {
+s->regs[WDT_RESET_WIDTH] &= ~WDT_RESET_WIDTH_ACTIVE_HIGH;
+} else if (property == WDT_PUSH_PULL_MAGIC) {
+s->regs[WDT_RESET_WIDTH] |= WDT_RESET_WIDTH_PUSH_PULL;
+} else if (property == WDT_OPEN_DRAIN_MAGIC) {
+s->regs[WDT_RESET_WIDTH] &= ~WDT_RESET_WIDTH_PUSH_PULL;
+}
+}
+s->regs[WDT_RESET_WIDTH] &= ~s->ext_pulse_width_mask;
+s->regs[WDT_RESET_WIDTH] |= data & s->ext_pulse_width_mask;
+break;
+}
 case WDT_TIMEOUT_STATUS:
 case WDT_TIMEOUT_CLEAR:
-case WDT_RESET_WDITH:
 qemu_log_mask(LOG_UNIMP,
   "%s: uninmplemented write at offset 0x%" HWADDR_PRIx 
"\n",
   __func__, offset);
@@ -167,6 +212,7 @@ static void aspeed_wdt_reset(DeviceState *dev)

[Qemu-devel] [PATCH for 2.11 v2 2/2] ARM: aspeed_soc: Propagate silicon-rev to watchdog

2017-08-09 Thread Andrew Jeffery
This is required to configure differences in behaviour between the
AST2400 and AST2500 watchdog IPs.

Signed-off-by: Andrew Jeffery 
---
 hw/arm/aspeed_soc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index 3034849c80bf..79804e1ee652 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -183,6 +183,8 @@ static void aspeed_soc_init(Object *obj)
 object_initialize(&s->wdt[i], sizeof(s->wdt[i]), TYPE_ASPEED_WDT);
 object_property_add_child(obj, "wdt[*]", OBJECT(&s->wdt[i]), NULL);
 qdev_set_parent_bus(DEVICE(&s->wdt[i]), sysbus_get_default());
+qdev_prop_set_uint32(DEVICE(&s->wdt[i]), "silicon-rev",
+sc->info->silicon_rev);
 }
 
 object_initialize(&s->ftgmac100, sizeof(s->ftgmac100), TYPE_FTGMAC100);
-- 
2.11.0




Re: [Qemu-devel] Qemu and 32 PCIe devices

2017-08-09 Thread Paolo Bonzini
On 09/08/2017 03:06, Laszlo Ersek wrote:
>>   20.14%  qemu-system-x86_64  [.] render_memory_region
>>   17.14%  qemu-system-x86_64  [.] subpage_register
>>   10.31%  qemu-system-x86_64  [.] int128_add
>>7.86%  qemu-system-x86_64  [.] addrrange_end
>>7.30%  qemu-system-x86_64  [.] int128_ge
>>4.89%  qemu-system-x86_64  [.] int128_nz
>>3.94%  qemu-system-x86_64  [.] phys_page_compact
>>2.73%  qemu-system-x86_64  [.] phys_map_node_alloc

Yes, this is the O(n^3) thing.  An optimized build should be faster
because int128 operations will be inlined and become much more efficient.

> With this patch, I only tested the "93 devices" case, as the slowdown
> became visible to the naked eye from the trace messages, as the firmware
> enabled more and more BARs / command registers (and inversely, the
> speedup was perceivable when the firmware disabled more and more BARs /
> command registers).

This is an interesting observation, and it's expected.  Looking at the
O(N^3) complexity in more detail, you have N operations, where the "i"th
operates on "i" DMA address spaces, all of which have at least "i"
memory regions (at least 1 BAR per device).

So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(N^3).
Expressing it as a sum shows why it gets slower as time progresses.

The solution is to note that those "i" address spaces are actually all
the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(N^2).
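
To put numbers on the two sums above, here is a minimal sketch. It only
models the counting argument (one unit of work per memory region rendered);
the function names are made up for illustration and nothing here is taken
from the QEMU memory API.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Step i renders i address spaces of i regions each: sum of i^2. */
static uint64_t cost_per_address_space(uint64_t n)
{
    uint64_t total = 0;

    for (uint64_t i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;
}

/* Step i renders the (identical) address space only once: sum of i. */
static uint64_t cost_shared_address_space(uint64_t n)
{
    uint64_t total = 0;

    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Print both costs and check them against the closed forms in the mail. */
static void print_costs(uint64_t n)
{
    uint64_t cubic = cost_per_address_space(n);
    uint64_t quadratic = cost_shared_address_space(n);

    assert(cubic == n * (n + 1) * (2 * n + 1) / 6);
    assert(quadratic == n * (n + 1) / 2);
    printf("N=%" PRIu64 ": per-AS %" PRIu64 ", shared %" PRIu64 "\n",
           n, cubic, quadratic);
}
```

For the "93 devices" case this model gives 272459 units against 4371,
which is consistent with the slowdown becoming visible as more BARs are
enabled.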

Thanks,

Paolo



Re: [Qemu-devel] [PATCH v4 05/22] qobject: Simplify qobject_from_jsonv()

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> qobject_from_jsonv() was unusual in that it took a va_list*, instead
> of the more typical va_list; this was so that callers could pass
> NULL to avoid % interpolation.  While this works under the hood, it
> is awkward for callers, so move the magic into qjson.c rather than
> in the public interface, and finally improve the documentation of
> qobject_from_jsonf().
>
> test-qobject-input-visitor.c's visitor_input_test_init_internal()
> was the only caller passing NULL, fix it to instead use a QObject
> created by the various callers, who now use the appropriate form
> of qobject_from_json*() according to whether % interpolation is
> desired.
>
> Once that is done, all remaining callers to qobject_from_jsonv() are
> passing &error_abort; drop this parameter to match the counterpart
> qobject_from_jsonf() which assert()s success instead.  Besides, all
> current callers that need interpolation live in the testsuite, where
> enforcing well-defined input by asserts can help catch typos, and
> where we should not be operating on dynamic untrusted arbitrary
> input in the first place.
>
> Asserting success has another nice benefit: if we pass more than one
> %p, but could return failure, we would have to worry about whether
> all arguments in the va_list had consistent refcount treatment (it
> should be an all-or-none decision on whether each QObject in the
> va_list had its reference count altered - but whichever way we
> prefer, it's a lot of overhead to ensure we do it for everything
> in the va_list even if we failed halfway through).  But now that we
> promise success, we can now easily promise that all %p arguments will
> now be cleaned up when freeing the returned object.
>
> Signed-off-by: Eric Blake 
> ---
>  include/qapi/qmp/qjson.h   |  2 +-
>  tests/libqtest.c   | 10 ++--
>  qobject/qjson.c| 49 
> +++---
>  tests/test-qobject-input-visitor.c | 18 --
>  4 files changed, 60 insertions(+), 19 deletions(-)
>
> diff --git a/include/qapi/qmp/qjson.h b/include/qapi/qmp/qjson.h
> index 6e84082d5f..9aacb1ccf6 100644
> --- a/include/qapi/qmp/qjson.h
> +++ b/include/qapi/qmp/qjson.h
> @@ -19,7 +19,7 @@
>
>  QObject *qobject_from_json(const char *string, Error **errp);
>  QObject *qobject_from_jsonf(const char *string, ...) GCC_FMT_ATTR(1, 2);
> -QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
> +QObject *qobject_from_jsonv(const char *string, va_list ap)
>  GCC_FMT_ATTR(1, 0);
>
>  QString *qobject_to_json(const QObject *obj);
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index 99a07c246f..cde737ec5a 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -448,7 +448,6 @@ QDict *qtest_qmp_receive(QTestState *s)
>   */
>  void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  {
> -va_list ap_copy;
>  QObject *qobj;
>  int log = getenv("QTEST_LOG") != NULL;
>  QString *qstr;
> @@ -463,13 +462,8 @@ void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  }
>  assert(*fmt);
>
> -/* Going through qobject ensures we escape strings properly.
> - * This seemingly unnecessary copy is required in case va_list
> - * is an array type.
> - */
> -va_copy(ap_copy, ap);
> -qobj = qobject_from_jsonv(fmt, &ap_copy, &error_abort);
> -va_end(ap_copy);
> +/* Going through qobject ensures we escape strings properly. */
> +qobj = qobject_from_jsonv(fmt, ap);
>  qstr = qobject_to_json(qobj);
>
>  /*

Wait!  Oh, the va_copy() moves into qobject_from_jsonv().  Okay, I
guess.
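
For readers following along, the pattern looks roughly like the sketch
below: the public v-style function takes a plain va_list, and the copy
needed to obtain a va_list * for the internal helper lives inside it.
The names (sum_internal, sumv, sum) are hypothetical and merely stand in
for the qobject_from_json*() family; this is not the qjson.c code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Internal helper: a NULL ap means "nothing to interpolate". */
static int sum_internal(int count, va_list *ap)
{
    int total = 0;

    assert(count >= 0);
    if (ap == NULL) {
        return 0;
    }
    for (int i = 0; i < count; i++) {
        total += va_arg(*ap, int);
    }
    return total;
}

/* Public v-style entry point: callers pass a plain va_list. */
static int sumv(int count, va_list ap)
{
    va_list ap_copy;
    int total;

    /*
     * Where va_list is an array type, the parameter 'ap' has decayed to
     * a pointer, so &ap would not be a va_list *.  Copying into a local
     * yields an object whose address has the right type.
     */
    va_copy(ap_copy, ap);
    total = sum_internal(count, &ap_copy);
    va_end(ap_copy);
    return total;
}

/* Variadic convenience wrapper around sumv(). */
static int sum(int count, ...)
{
    va_list ap;
    int total;

    va_start(ap, count);
    total = sumv(count, ap);
    va_end(ap);
    return total;
}
```

The point of the design is that callers such as qmp_fd_sendv() no longer
need their own va_copy()/va_end() pair.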

> diff --git a/qobject/qjson.c b/qobject/qjson.c
> index 2e0930884e..210c290aa9 100644
> --- a/qobject/qjson.c
> +++ b/qobject/qjson.c
> @@ -35,7 +35,8 @@ static void parse_json(JSONMessageParser *parser, GQueue 
> *tokens)
>  s->result = json_parser_parse_err(tokens, s->ap, &s->err);
>  }
>
> -QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
> +static QObject *qobject_from_json_internal(const char *string, va_list *ap,
> +   Error **errp)
>  {
>  JSONParsingState state = {};
>
> @@ -50,12 +51,31 @@ QObject *qobject_from_jsonv(const char *string, va_list 
> *ap, Error **errp)
>  return state.result;
>  }
>
> +/*
> + * Parses JSON input without interpolation.

Imperative mood, please.  Same elsewhere.

Suggest "without interpolation of % sequences".

> + *
> + * Returns a QObject matching the input on success, or NULL with
> + * an error set if the input is not valid JSON.
> + */
>  QObject *qobject_from_json(const char *string, Error **errp)
>  {
> -return qobject_from_jsonv(string, NULL, errp);
> +return qobject_from_json_internal(string, NULL, errp);
>  }
>
>  /*
> + * Parses JSON input with interpolation of % sequences.
> + *
> + * The set of sequences recognized is compatible with gcc's -Wformat
> + * warnings, although not all 

Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time

2017-08-09 Thread Juan Quintela
Peter Xu  wrote:
> On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
>> Peter Xu  wrote:
>> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
>> >
>> > [...]
>> >
>> >>  static int multifd_send_page(uint8_t *address)
>> >>  {
>> >> -int i;
>> >> +int i, j;
>> >>  MultiFDSendParams *p = NULL; /* make happy gcc */
>> >> +static multifd_pages_t pages;
>> >> +static bool once;
>> >> +
>> >> +if (!once) {
>> >> +multifd_init_group();
>> >> +once = true;
>> >
>> > Would it be good to put the "pages" into multifd_send_state? One is to
>> > stick globals together; another benefit is that we can remove the
>> > "once" here: we can then init the "pages" when init multifd_send_state
>> > struct (but maybe with a better name?...).
>> 
>> I did to be able to free it.
>
> Free it? But they are static variables, then how can we free them?
>
> (I thought the only way to free it is putting it into
>  multifd_send_state...)
>
> Something I must have missed here. :(

I made the change that you suggested in response to a comment from Dave,
who asked where I freed it.  I see that my sentence was ambiguous.

>
>> 
>> > (there are similar static variables in multifd_recv_page() as well, if
>> >  this one applies, then we can possibly use multifd_recv_state for
>> >  that one)
>> 
>> Also there.
>> 
>> >> +}
>> >> +
>> >> +pages.iov[pages.num].iov_base = address;
>> >> +pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>> >> +pages.num++;
>> >> +
>> >> +if (pages.num < (pages.size - 1)) {
>> >> +return UINT16_MAX;
>> >
>> > Nit: shall we define something for readability?  Like:
>> >
>> > #define  MULTIFD_FD_INVALID  UINT16_MAX
>> 
>> Also done.
>> 
>> MULTIFD_CONTINUE
>> 
>> But I am open to changes.
>
> It's clear enough at least to me. Thanks!

Thanks, Juan.
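
As an illustration of the named-sentinel idea discussed above, a minimal
sketch follows. Everything in it (PageBatch, BATCH_CAPACITY,
BATCH_CONTINUE, the round-robin channel choice) is hypothetical and is
not the code from the series; in particular it flushes when the batch is
completely full, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_CAPACITY 4           /* hypothetical group size */
#define BATCH_CONTINUE UINT16_MAX  /* page queued, nothing sent yet */

typedef struct {
    const uint8_t *pages[BATCH_CAPACITY];
    size_t num;             /* pages currently queued */
    uint16_t next_channel;  /* channel that gets the next full batch */
} PageBatch;

/*
 * Queue one page.  Returns BATCH_CONTINUE while the batch is filling,
 * otherwise the id of the channel the full batch was handed to.
 */
static uint16_t batch_add_page(PageBatch *b, const uint8_t *page,
                               uint16_t channels)
{
    uint16_t used;

    assert(channels > 0 && channels < BATCH_CONTINUE);
    b->pages[b->num++] = page;
    if (b->num < BATCH_CAPACITY) {
        return BATCH_CONTINUE;
    }

    /* Batch full: a real sender would consume b->pages here. */
    b->num = 0;
    used = b->next_channel;
    b->next_channel = (uint16_t)((b->next_channel + 1) % channels);
    return used;
}
```

A caller then compares the return value against BATCH_CONTINUE instead
of a bare UINT16_MAX, which is the readability gain being asked for.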



Re: [Qemu-devel] [PATCH] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 07:12:51 +0200
Thomas Huth  wrote:

> On 08.08.2017 18:26, Greg Kurz wrote:
> > On Tue,  8 Aug 2017 17:38:27 +0200
> > Cornelia Huck  wrote:
> >   
> >> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> >> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> >> device.
> >>
> >> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> >> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> >>
> >> Signed-off-by: Cornelia Huck 
> >> ---
> >>
> >> This is the alternative approach to "9pfs: fix and simplify dependencies".
> >> Uglier; but probably not broken...
> >>  
> > 
> > Yikes. I don't know why yet but this doesn't work for PCI-less targets
> > like cris-softmmu...
> > 
> >   LINKcris-softmmu/qemu-system-cris
> > vl.o: In function `fsdev_init_func':
> > vl.c:2360: undefined reference to `qemu_fsdev_add'

Hmpf. Added cris to my buildlist...

> >   
> >> ---
> >>  default-configs/s390x-softmmu.mak | 1 +
> >>  fsdev/Makefile.objs   | 9 +++--
> >>  hw/Makefile.objs  | 2 +-
> >>  3 files changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/default-configs/s390x-softmmu.mak 
> >> b/default-configs/s390x-softmmu.mak
> >> index 51191b77df..e4c5236ceb 100644
> >> --- a/default-configs/s390x-softmmu.mak
> >> +++ b/default-configs/s390x-softmmu.mak
> >> @@ -8,3 +8,4 @@ CONFIG_S390_FLIC=y
> >>  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
> >>  CONFIG_VFIO_CCW=$(CONFIG_LINUX)
> >>  CONFIG_WDT_DIAG288=y
> >> +CONFIG_VIRTIO_CCW=y
> >> diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
> >> index 659df6e187..10d8caa291 100644
> >> --- a/fsdev/Makefile.objs
> >> +++ b/fsdev/Makefile.objs
> >> @@ -1,10 +1,7 @@
> >> -ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
> >>  # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
> >> -# only pull in the actual virtio-9p device if we also enabled virtio.
> >> -common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
> >> -else
> >> -common-obj-y = qemu-fsdev-dummy.o
> >> -endif
> >> +# only pull in the actual virtio-9p device if we also enabled a virtio 
> >> backend.
> >> +common-obj-$(call land, $(CONFIG_VIRTFS),$(call lor, 
> >> $(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))= qemu-fsdev.o 9p-marshal.o 
> >> 9p-iov-marshal.o
> >> +common-obj-$(call lnot, $(call land, $(CONFIG_VIRTFS),$(call lor, 
> >> $(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))) = qemu-fsdev-dummy.o
> >>  common-obj-y += qemu-fsdev-opts.o qemu-fsdev-throttle.o
> >>  
> >>  # Toplevel always builds this; targets without virtio will put it in
> >> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> >> index a2c61f6b09..10942fe0b4 100644
> >> --- a/hw/Makefile.objs
> >> +++ b/hw/Makefile.objs
> >> @@ -1,4 +1,4 @@
> >> -devices-dirs-$(call land, $(CONFIG_VIRTIO),$(call 
> >> land,$(CONFIG_VIRTFS),$(CONFIG_PCI))) += 9pfs/
> >> +devices-dirs-$(call land, $(CONFIG_VIRTFS),$(call 
> >> lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW))) += 9pfs/
> >>  devices-dirs-$(CONFIG_SOFTMMU) += acpi/
> >>  devices-dirs-$(CONFIG_SOFTMMU) += adc/
> >>  devices-dirs-$(CONFIG_SOFTMMU) += audio/  
> 
> I think the problem is the whitespace after a ",". For example, for
> the following test code in a makefile:
> 
>   @echo test1: $(call lnot, n)
>   @echo test2: $(call lnot,n)
> 
> I get the following output:
> 
>  test1: n
>  test2: y
> 
> Hope that helps,
>  Thomas
> 

Yeah, my fingers are trained to type a blank after a comma... currently
testing an updated patch.



[Qemu-devel] [Bug 1357175] Re: qemu fails to build on powerpc64

2017-08-09 Thread Thomas Huth
OK, thanks for checking!

** Changed in: qemu
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1357175

Title:
  qemu fails to build on powerpc64

Status in QEMU:
  Fix Released

Bug description:
  Qemu fails to build on powerpc64, ELFv1 ABI, since the introduction of
  the ELFv2 ABI support.  On FreeBSD/powerpc64 I see the following error
  building HEAD from today (8/14/2014):

  In file included from /home/chmeee/qemu-git/tcg/tcg.c:264:
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:1737:3: error: #error "Unhandled 
abi"
  In file included from /home/chmeee/qemu-git/tcg/tcg.c:264:
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c: In function 
'tcg_target_qemu_prologue':
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:1766: error: 'LINK_AREA_SIZE' 
undeclared (first use in this function)
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:1766: error: (Each undeclared 
identifier is reported only once
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:1766: error: for each function it 
appears in.)
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:1778: error: 'LR_OFFSET' 
undeclared (first use in this function)
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c: At top level:
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:2579: error: 'LINK_AREA_SIZE' 
undeclared here (not in a function)
  /home/chmeee/qemu-git/tcg/ppc/tcg-target.c:2605: error: 'LR_OFFSET' 
undeclared here (not in a function)
  gmake[1]: *** [tcg/tcg.o] Error 1




[Qemu-devel] [PULL 5/6] spapr_drc: abort if object_property_add_child() fails

2017-08-09 Thread David Gibson
From: Greg Kurz 

object_property_add_child() can only fail in two cases:
- the child already has a parent, which shouldn't happen since the DRC was
  allocated a few lines above
- the parent already has a child with the same name, which would mean the
  caller tries to create a DRC that already exists

In both cases, this is a QEMU bug and we should abort.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_drc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 47d94e782a..5260b5d363 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -541,7 +541,7 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner, 
const char *type,
 drc->owner = owner;
 prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
 spapr_drc_index(drc));
-object_property_add_child(owner, prop_name, OBJECT(drc), NULL);
+object_property_add_child(owner, prop_name, OBJECT(drc), &error_abort);
 object_property_set_bool(OBJECT(drc), true, "realized", NULL);
 g_free(prop_name);
 
-- 
2.13.4




[Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
device.

Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).

Signed-off-by: Cornelia Huck 
---

Changes v1->v2: drop extraneous spaces, fix build on cris

---
 default-configs/s390x-softmmu.mak | 1 +
 fsdev/Makefile.objs   | 9 +++--
 hw/Makefile.objs  | 2 +-
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/default-configs/s390x-softmmu.mak 
b/default-configs/s390x-softmmu.mak
index 51191b77df..e4c5236ceb 100644
--- a/default-configs/s390x-softmmu.mak
+++ b/default-configs/s390x-softmmu.mak
@@ -8,3 +8,4 @@ CONFIG_S390_FLIC=y
 CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
 CONFIG_VFIO_CCW=$(CONFIG_LINUX)
 CONFIG_WDT_DIAG288=y
+CONFIG_VIRTIO_CCW=y
diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
index 659df6e187..3d157add31 100644
--- a/fsdev/Makefile.objs
+++ b/fsdev/Makefile.objs
@@ -1,10 +1,7 @@
-ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
-# only pull in the actual virtio-9p device if we also enabled virtio.
-common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
-else
-common-obj-y = qemu-fsdev-dummy.o
-endif
+# only pull in the actual virtio-9p device if we also enabled a virtio backend.
+common-obj-$(call land,$(CONFIG_VIRTFS),$(call 
lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))= qemu-fsdev.o 9p-marshal.o 
9p-iov-marshal.o
+common-obj-$(call lnot,$(call land,$(CONFIG_VIRTFS),$(call 
lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))) = qemu-fsdev-dummy.o
 common-obj-y += qemu-fsdev-opts.o qemu-fsdev-throttle.o
 
 # Toplevel always builds this; targets without virtio will put it in
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index a2c61f6b09..335f26b65e 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -1,4 +1,4 @@
-devices-dirs-$(call land, $(CONFIG_VIRTIO),$(call 
land,$(CONFIG_VIRTFS),$(CONFIG_PCI))) += 9pfs/
+devices-dirs-$(call land,$(CONFIG_VIRTFS),$(call 
lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW))) += 9pfs/
 devices-dirs-$(CONFIG_SOFTMMU) += acpi/
 devices-dirs-$(CONFIG_SOFTMMU) += adc/
 devices-dirs-$(CONFIG_SOFTMMU) += audio/
-- 
2.13.4




Re: [Qemu-devel] [PATCH for-2.11 0/4] ppc64: add e6500

2017-08-09 Thread David Gibson
On Mon, Aug 07, 2017 at 05:50:44PM +0200, KONRAD Frederic wrote:
> Hi,
> 
> These are some patches to add basic e6500 support; for the moment it is an
> e5500 with a correct MMU configuration and supported instructions.
> Some (maybe a lot of) things are missing (i.e. thread support), but it is
> enough to boot a proprietary OS on my side.
> 
> The first two patches are fixes when using MAV 2.0 MMU.
> The last two patches introduce the e6500.
> 
> This can be cloned here:
> https://github.com/FredKonrad/qemu.git branch e6500

Looks sane as far as my minimal knowledge of e500 goes.  Applied to
ppc-for-2.11.  Alex, if you (or anyone with knowledge of this
platform) have objections, I'll reconsider.

> 
> Thanks,
> Fred
> 
> KONRAD Frederic (4):
>   booke206: fix booke206_tlbnps for mav 2.0
>   booke206: fix tlbnps for fixed size TLB
>   booke206: allow to specify an mmucfg value at the init
>   ppc64: introduce e6500
> 
>  target/ppc/cpu-models.c |   2 +
>  target/ppc/cpu-models.h |   1 +
>  target/ppc/cpu.h|  26 +++-
>  target/ppc/mmu_helper.c |  16 ---
>  target/ppc/translate_init.c | 100 
> +---
>  5 files changed, 132 insertions(+), 13 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v3 1/1] ppc: spapr: Make VCPU ID handling private to SPAPR

2017-08-09 Thread David Gibson
On Wed, Aug 09, 2017 at 03:38:56PM +1000, Sam Bobroff wrote:
> The concept of a VCPU ID that differs from the CPU's index
> (cpu->cpu_index) exists only within SPAPR machines so, move the
> functions ppc_get_vcpu_id() and ppc_get_cpu_by_vcpu_id() into spapr.c
> and rename them appropriately.
> 
> Signed-off-by: Sam Bobroff 

Applied to ppc-for-2.11, thanks.

> ---
> Changes in v3:
> 
> * Implemented spapr_find_cpu() using spapr_vcpu_id() rather than direct access
>   to vcpu_id.
> 
>  hw/ppc/ppc.c   | 21 -
>  hw/ppc/spapr.c | 40 +---
>  hw/ppc/spapr_hcall.c   |  4 ++--
>  hw/ppc/spapr_rtas.c|  4 ++--
>  include/hw/ppc/spapr.h |  3 +++
>  target/ppc/cpu.h   | 18 --
>  target/ppc/kvm.c   |  2 +-
>  7 files changed, 41 insertions(+), 51 deletions(-)
> 
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index 4477d4ad89..f76886f4d3 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -1358,27 +1358,6 @@ void PPC_debug_write (void *opaque, uint32_t addr, 
> uint32_t val)
>  }
>  }
>  
> -/* CPU device-tree ID helpers */
> -int ppc_get_vcpu_id(PowerPCCPU *cpu)
> -{
> -return cpu->vcpu_id;
> -}
> -
> -PowerPCCPU *ppc_get_cpu_by_vcpu_id(int vcpu_id)
> -{
> -CPUState *cs;
> -
> -CPU_FOREACH(cs) {
> -PowerPCCPU *cpu = POWERPC_CPU(cs);
> -
> -if (cpu->vcpu_id == vcpu_id) {
> -return cpu;
> -}
> -}
> -
> -return NULL;
> -}
> -
>  void ppc_cpu_parse_features(const char *cpu_model)
>  {
>  CPUClass *cc;
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d6c9b3e334..cd6eb2d4a9 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -208,7 +208,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
> PowerPCCPU *cpu,
>  int i, ret = 0;
>  uint32_t servers_prop[smt_threads];
>  uint32_t gservers_prop[smt_threads * 2];
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  
>  if (cpu->compat_pvr) {
>  ret = fdt_setprop_cell(fdt, offset, "cpu-version", cpu->compat_pvr);
> @@ -237,7 +237,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
> PowerPCCPU *cpu,
>  
>  static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
>  {
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  uint32_t associativity[] = {cpu_to_be32(0x5),
>  cpu_to_be32(0x0),
>  cpu_to_be32(0x0),
> @@ -341,7 +341,7 @@ static int spapr_fixup_cpu_dt(void *fdt, 
> sPAPRMachineState *spapr)
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  CPUPPCState *env = &cpu->env;
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
>  
>  if ((index % smt) != 0) {
> @@ -493,7 +493,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void 
> *fdt, int offset,
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  CPUPPCState *env = &cpu->env;
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> 0x, 0x};
>  uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq()
> @@ -626,7 +626,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, 
> sPAPRMachineState *spapr)
>   */
>  CPU_FOREACH_REVERSE(cs) {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
>  int offset;
>  
> @@ -2982,7 +2982,7 @@ static void *spapr_populate_hotplug_cpu_dt(CPUState 
> *cs, int *fdt_offset,
>  {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -int id = ppc_get_vcpu_id(cpu);
> +int id = spapr_vcpu_id(cpu);
>  void *fdt;
>  int offset, fdt_size;
>  char *nodename;
> @@ -3392,7 +3392,7 @@ static void spapr_ics_resend(XICSFabric *dev)
>  
>  static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>  {
> -PowerPCCPU *cpu = ppc_get_cpu_by_vcpu_id(vcpu_id);
> +PowerPCCPU *cpu = spapr_find_cpu(vcpu_id);
>  
>  return cpu ? ICP(cpu->intc) : NULL;
>  }
> @@ -3412,6 +3412,32 @@ static void 
> spapr_pic_print_info(InterruptStatsProvider *obj,
>  ics_pic_print_info(spapr->ics, mon);
>  }
>  
> +int spapr_vcpu_id(PowerPCCPU *cpu)
> +{
> +CPUState *cs = CPU(cpu);
> +
> +if (kvm_enabled()) {
> +return kvm_arch_vcpu_id(cs);
> +} else {
> +return cs->cpu_index;
> +}
> +}
> +
> +PowerPCCPU *spapr_find_cpu(int vcpu_id)
> +{
> +CPUState *cs;
> +
> +CPU_FOREACH(cs) {
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +if 

Re: [Qemu-devel] [for-2.10 PATCH] 9pfs: local: fix fchmodat_nofollow() limitations

2017-08-09 Thread Greg Kurz
On Tue, 8 Aug 2017 14:14:18 -0500
Eric Blake  wrote:

> On 08/08/2017 12:28 PM, Greg Kurz wrote:
> > This function has to ensure it doesn't follow a symlink that could be used
> > to escape the virtfs directory. This could be easily achieved if fchmodat()
> > on linux honored the AT_SYMLINK_NOFOLLOW flag as described in POSIX, but
> > it doesn't.
> > 
> > The current implementation covers most use-cases, but it notably fails if:
> > - the target path has access rights equal to  (openat() returns EPERM), 
> >  
> >   => once you've done chmod() on a file, you can never chmod() again  
> > - the target path is UNIX domain socket (openat() returns ENXIO)  
> >   => bind() of UNIX domain sockets fails if the file is on 9pfs  
> 
> Did your attempt at a kernel patch for AT_SYMLINK_NOFOLLOW ever get
> anywhere?
> 

No.

> > 
> > The solution is to use O_PATH: openat() now succeeds in both cases, and we
> > can ensure the path isn't a symlink with fstat(). The associated entry in
> > "/proc/self/fd" can hence be safely passed to the regular chmod() syscall.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >  hw/9pfs/9p-local.c |   44 
> >  hw/9pfs/9p-util.h  |   10 +++---
> >  2 files changed, 35 insertions(+), 19 deletions(-)
> > 
> > diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
> > index 6e478f4765ef..b178d627c764 100644
> > --- a/hw/9pfs/9p-local.c
> > +++ b/hw/9pfs/9p-local.c
> > @@ -333,30 +333,42 @@ update_map_file:
> >  
> >  static int fchmodat_nofollow(int dirfd, const char *name, mode_t mode)
> >  {
> > +struct stat stbuf;
> >  int fd, ret;
> > +char *proc_path;
> >  
> >  /* FIXME: this should be handled with fchmodat(AT_SYMLINK_NOFOLLOW).
> > - * Unfortunately, the linux kernel doesn't implement it yet. As an
> > - * alternative, let's open the file and use fchmod() instead. This
> > - * may fail depending on the permissions of the file, but it is the
> > - * best we can do to avoid TOCTTOU. We first try to open read-only
> > - * in case name points to a directory. If that fails, we try write-only
> > - * in case name doesn't point to a directory.
> > + * Unfortunately, the linux kernel doesn't implement it yet.
> >   */
> > -fd = openat_file(dirfd, name, O_RDONLY, 0);
> > -if (fd == -1) {
> > -/* In case the file is writable-only and isn't a directory. */
> > -if (errno == EACCES) {
> > -fd = openat_file(dirfd, name, O_WRONLY, 0);
> > -}
> > -if (fd == -1 && errno == EISDIR) {
> > -errno = EACCES;
> > -}
> > +if (fstatat(dirfd, name, &stbuf, AT_SYMLINK_NOFOLLOW)) {
> > +return -1;
> >  }  
> 
> Checking the file...
> 
> > +
> > +if (S_ISLNK(stbuf.st_mode)) {
> > +errno = ELOOP;
> > +return -1;
> > +}
> > +
> > +fd = openat_file(dirfd, name, O_RDONLY | O_PATH, 0);  
> 
> ...and then opening the file is a TOCTTOU race (although it works most
> of the time and avoids the open where it is easy)...
> 

Exactly. It is globally assumed in the 9p code that symbolic links are
resolved by the client. The earlier we detect the file is a symbolic
link, the earlier we can error out. The fstat()+S_ISLNK() is a deliberate
fast error path actually :)

> >  if (fd == -1) {
> >  return -1;
> >  }
> > -ret = fchmod(fd, mode);
> > +
> > +ret = fstat(fd, &stbuf);
> > +if (ret) {
> > +goto out;
> > +}  
> 
> ...so you are double-checking that you got a non-symlink after all (your
> insurance against the race having done the wrong thing [but see below])...
> 

Yes, this is the *real* check.

> > +
> > +if (S_ISLNK(stbuf.st_mode)) {
> > +errno = ELOOP;
> > +ret = -1;
> > +goto out;
> > +}
> > +
> > +proc_path = g_strdup_printf("/proc/self/fd/%d", fd);
> > +ret = chmod(proc_path, mode);  
> 
> ...at which point you now have a valid file name that represents the
> file you wanted to chmod() in the first place (even if another rename
> occurs in the meantime, you are changing the mode tied to the
> non-symlink fd that you double-checked, which ends up behaving as if you
> had won the race and made the chmod() call before the rename).
> 

Yes.

> > +g_free(proc_path);
> > +out:
> >  close_preserve_errno(fd);
> >  return ret;  
> 
> Might be worth littering some comments in the code explaining why you
> have to call both fstatat() and stat(), or perhaps you could drop the
> first fstatat() and just always do the open().
> 

I like the idea of erroring out right away when a symbolic link is
detected. I'll add comments.
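
For reference, the core of the approach discussed above (open with
O_PATH, verify with fstat(), then chmod() through /proc/self/fd) can be
sketched as a standalone function. This is a simplified illustration and
not the patch itself: chmod_nofollow() is a made-up name, the fast-path
fstatat() is left out, and it assumes Linux with /proc mounted and a
kernel where fstat() works on O_PATH descriptors (3.6 or later).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* for O_PATH */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* chmod() 'name' relative to dirfd, refusing to act on a symlink. */
static int chmod_nofollow(int dirfd, const char *name, mode_t mode)
{
    struct stat st;
    char proc_path[64];
    int fd, ret, saved_errno;

    assert(name != NULL);

    /*
     * O_PATH needs no read or write permission on the file and also
     * works for sockets.  With O_NOFOLLOW, a symlink yields a
     * descriptor for the link itself instead of its target.
     */
    fd = openat(dirfd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd == -1) {
        return -1;
    }

    /* The real check: look at what the descriptor refers to. */
    ret = fstat(fd, &st);
    if (ret == 0 && S_ISLNK(st.st_mode)) {
        errno = ELOOP;
        ret = -1;
    }

    if (ret == 0) {
        /* The /proc entry names exactly the file we just checked. */
        snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
        ret = chmod(proc_path, mode);
    }

    /* Keep errno from the failing call across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Because the mode change is tied to the descriptor, a rename or symlink
swap between the check and the chmod() cannot redirect it.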

> Reading 'man open', it looks like O_PATH will chase symlinks UNLESS you
> also use O_NOFOLLOW.  So I had to code up a simple test program to
> verify if things work...
> 
> =
> #define _GNU_SOURCE 1
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> int 

Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page

2017-08-09 Thread Peter Xu
On Tue, Aug 08, 2017 at 06:04:54PM +0200, Juan Quintela wrote:
> Peter Xu  wrote:
> > On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
> >> * Juan Quintela (quint...@redhat.com) wrote:
> 
> >> >  struct MultiFDSendParams {
> >> > +/* not changed */
> >> >  uint8_t id;
> >> >  QemuThread thread;
> >> >  QIOChannel *c;
> >> >  QemuSemaphore sem;
> >> >  QemuMutex mutex;
> >> > +/* protected by param mutex */
> >> >  bool quit;
> >> 
> >> Should probably comment to say what address space address is in - this
> >> is really a qemu pointer - and that's why we can treat 0 as special?
> >
> > I believe this comment is for "address" below.
> >
> > Yes, it would be nice to comment it. IIUC it belongs to virtual
> > address space of QEMU, so it should be okay to use zero as a "special"
> > value.
> 
> See new comments.
> 
> >> 
> >> > +uint8_t *address;
> >> > +/* protected by multifd mutex */
> >> > +bool done;
> >> 
> >> done needs a comment to explain what it is because
> >> it sounds similar to quit;  I think 'done' is saying that
> >> the thread is idle having done what was asked?
> >
> > Since we know that valid address won't be zero, not sure whether we
> > can just avoid introducing the "done" field (even, not sure whether we
> > will need lock when modifying "address", I think not as well? Please
> > see below). For what I see this, it works like a state machine, and
> > "address" can be the state:
> >
> > +  send thread -+
> > |   |
> >\|/  |
> > address==0 (IDLE)   address!=0 (ACTIVE)
> > |  /|\
> > |   |
> > +  main thread -+
> >
> > Then...
> 
> It is needed; we change things later in the series.  We could treat
> page.num == 0 as a special case. But then we can differentiate the case
> where we have finished the last round and that we are in the beginning
> of the new one.

(Will comment below)
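
To make the "address as state" suggestion concrete, here is a minimal
sketch using C11 atomics. It only illustrates the IDLE/ACTIVE handoff
drawn in the diagram above; PageSlot and the slot_*() helpers are
hypothetical names, QEMU's own atomic_set()/semaphores are not used, and
it does not cover the later changes in the series that Juan refers to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One-page handoff slot: NULL means IDLE, non-NULL means ACTIVE. */
typedef struct {
    _Atomic(uint8_t *) address;
} PageSlot;

/* Main thread: IDLE -> ACTIVE.  Fails if the worker is still busy. */
static bool slot_submit(PageSlot *slot, uint8_t *page)
{
    uint8_t *expected = NULL;

    assert(page != NULL);   /* NULL is reserved for the IDLE state */
    return atomic_compare_exchange_strong(&slot->address, &expected, page);
}

/* Worker thread: return the pending page, or NULL when idle. */
static uint8_t *slot_pending(PageSlot *slot)
{
    return atomic_load(&slot->address);
}

/* Worker thread: ACTIVE -> IDLE, to be called after the page is sent. */
static void slot_complete(PageSlot *slot)
{
    atomic_store(&slot->address, (uint8_t *)NULL);
}
```

In a real implementation the worker would post a semaphore after
slot_complete() so that a waiting main thread can retry slot_submit().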

> 
> >> 
> >> >  };
> >> >  typedef struct MultiFDSendParams MultiFDSendParams;
> >> >  
> >> > @@ -375,6 +381,8 @@ struct {
> >> >  MultiFDSendParams *params;
> >> >  /* number of created threads */
> >> >  int count;
> >> > +QemuMutex mutex;
> >> > +QemuSemaphore sem;
> >> >  } *multifd_send_state;
> >> >  
> >> >  static void terminate_multifd_send_threads(void)
> >> > @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
> >> >  } else {
> >> >  qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
> >> >  }
> >> > +qemu_sem_post(&multifd_send_state->sem);
> >> >  
> >> >  while (!exit) {
> >> >  qemu_mutex_lock(&p->mutex);
> >> > @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
> >> >  qemu_mutex_unlock(&p->mutex);
> >> >  break;
> >> >  }
> >> > +if (p->address) {
> >> > +p->address = 0;
> >> > +qemu_mutex_unlock(&p->mutex);
> >> > +qemu_mutex_lock(&multifd_send_state->mutex);
> >> > +p->done = true;
> >> > +qemu_mutex_unlock(&multifd_send_state->mutex);
> >> > +qemu_sem_post(&multifd_send_state->sem);
> >> > +continue;
> >
> > Here instead of setting up address=0 at the entry, can we do this (no
> > "done" for this time)?
> >
> >  // send the page before clearing p->address
> >  send_page(p->address);
> >  // clear p->address to switch to "IDLE" state
> >  atomic_set(&p->address, 0);
> >  // tell main thread, in case it's waiting
> >  qemu_sem_post(&multifd_send_state->sem);
> >
> > And on the main thread...
> >
> >> > +}
> >> >  qemu_mutex_unlock(&p->mutex);
> >> >  qemu_sem_wait(&p->sem);
> >> >  }
> >> > @@ -469,6 +487,8 @@ int multifd_save_setup(void)
> >> >  multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> >> >  multifd_send_state->params = g_new0(MultiFDSendParams, 
> >> > thread_count);
> >> >  multifd_send_state->count = 0;
> >> > +qemu_mutex_init(&multifd_send_state->mutex);
> >> > +qemu_sem_init(&multifd_send_state->sem, 0);
> >> >  for (i = 0; i < thread_count; i++) {
> >> >  char thread_name[16];
> >> >  MultiFDSendParams *p = &multifd_send_state->params[i];
> >> > @@ -477,6 +497,8 @@ int multifd_save_setup(void)
> >> >  qemu_sem_init(&p->sem, 0);
> >> >  p->quit = false;
> >> >  p->id = i;
> >> > +p->done = true;
> >> > +p->address = 0;
> >> >  p->c = socket_send_channel_create();
> >> >  if (!p->c) {
> >> >  error_report("Error creating a send channel");
> >> > @@ -491,6 +513,30 @@ int multifd_save_setup(void)
> >> >  return 0;
> >> >  }
> >> >  
> >> > +static int multifd_send_page(uint8_t *address)
> 
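
The "address as state" idea discussed above can be sketched outside QEMU. This is only a minimal illustration under stated assumptions, not the code from the series: it uses C11 atomics and POSIX semaphores in place of QEMU's atomic_set() and QemuSemaphore, and the names Channel, sender_thread, queue_page and run_channel_demo are hypothetical. The sender clears address only after the "send" step, so a zero address always means IDLE.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one multifd send channel. */
typedef struct {
    _Atomic uintptr_t address; /* 0 == IDLE, non-zero == ACTIVE */
    atomic_bool quit;
    sem_t wake;                /* main -> sender: work queued, or quit */
    sem_t idle;                /* sender -> main: channel is IDLE again */
    unsigned sent;             /* pages "sent"; written by the sender only */
} Channel;

static void *sender_thread(void *opaque)
{
    Channel *c = opaque;

    for (;;) {
        sem_wait(&c->wake);
        if (atomic_load(&c->quit)) {
            break;
        }
        if (atomic_load(&c->address)) {
            c->sent++;                    /* stand-in for sending the page */
            atomic_store(&c->address, 0); /* ACTIVE -> IDLE, after the send */
            sem_post(&c->idle);           /* tell the main thread */
        }
    }
    return NULL;
}

/* Main-thread side: wait for IDLE, then switch IDLE -> ACTIVE. */
static void queue_page(Channel *c, uintptr_t address)
{
    sem_wait(&c->idle);
    atomic_store(&c->address, address);
    sem_post(&c->wake);
}

/* Push `pages` fake page addresses through one channel; return pages sent. */
static unsigned run_channel_demo(unsigned pages)
{
    Channel c;
    pthread_t thread;

    atomic_init(&c.address, 0);
    atomic_init(&c.quit, false);
    c.sent = 0;
    sem_init(&c.wake, 0, 0);
    sem_init(&c.idle, 0, 1);    /* the channel starts out IDLE */
    pthread_create(&thread, NULL, sender_thread, &c);

    for (unsigned i = 1; i <= pages; i++) {
        queue_page(&c, (uintptr_t)i * 4096);
    }

    sem_wait(&c.idle);          /* wait until the last page has been sent */
    atomic_store(&c.quit, true);
    sem_post(&c.wake);
    pthread_join(thread, NULL);

    sem_destroy(&c.wake);
    sem_destroy(&c.idle);
    return c.sent;
}
```

With a single channel no separate "done" flag is needed in this sketch; whether that still holds once several channels share one semaphore is exactly the question raised in the thread.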

Re: [Qemu-devel] [PATCH v2 0/5] tests/pxe-test: add testcase using vhost-user-bridge

2017-08-09 Thread Jens Freimann

On Wed, Aug 09, 2017 at 04:17:05AM +0300, Michael S. Tsirkin wrote:

On Tue, Aug 08, 2017 at 04:59:02PM -0700, no-re...@patchew.org wrote:

Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 20170808203900.7661-1-jfreim...@redhat.com
Subject: [Qemu-devel] [PATCH v2 0/5] tests/pxe-test: add testcase using 
vhost-user-bridge
Type: series
/tmp/qemu-test/src/tests/pxe-test.c: In function ‘test_pxe_vhost_user’:
/tmp/qemu-test/src/tests/pxe-test.c:106: error: ‘G_SPAWN_SEARCH_PATH_FROM_ENVP’ 
undeclared (first use in this function)
/tmp/qemu-test/src/tests/pxe-test.c:106: error: (Each undeclared identifier is 
reported only once
/tmp/qemu-test/src/tests/pxe-test.c:106: error: for each function it appears 
in.)
make: *** [tests/pxe-test.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 382, in <module>
sys.exit(main())
  File "./tests/docker/docker.py", line 379, in main
return args.cmdobj.run(args, argv)
  File "./tests/docker/docker.py", line 237, in run
return Docker().run(argv, args.keep, quiet=args.quiet)
  File "./tests/docker/docker.py", line 205, in run
quiet=quiet)
  File "./tests/docker/docker.py", line 123, in _do_check
return subprocess.check_call(self._command + cmd, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 186, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['docker', 'run', '--label', 
'com.qemu.instance.uuid=534034f67c9511e78af152540069c830', '-u', '0', '-t', 
'--rm', '--net=none', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', 
'-e', 'V=', '-e', 'J=8', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 
'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/var/tmp/patchew-tester-tmp-ku3bj8a1/src/docker-src.2017-08-08-19.57.20.28326:/var/tmp/qemu:z,ro',
 '-v', '/root/.cache/qemu-docker-ccache:/var/tmp/ccache:z', 'qemu:centos6', 
'/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2
make[1]: *** [tests/docker/Makefile.include:139: docker-run] Error 1
make[1]: Leaving directory '/var/tmp/patchew-tester-tmp-ku3bj8a1/src'
make: *** [tests/docker/Makefile.include:168: docker-run-test-quick@centos6] 
Error 2

real    1m41.298s
user    0m5.189s
sys     0m1.753s
=== OUTPUT END ===

Test command exited with code: 2



OK given this - it seems clear the right thing to do is to merge 2-4
and leave 1 and 5 for 2.11.


Agreed. I'll fix this and send a new version. 


Thanks!

regards,
Jens 



Re: [Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> We want -Wformat to catch blatant programming errors in format
> strings that we pass to qobject_from_jsonf().  But if someone
> were to pass a JSON string "'%s'" as the format string, gcc would
> insist that it be paired with a char* argument, even though our
> lexer does not recognize % sequences inside a JSON string.  You can
> bypass -Wformat checking by passing the Unicode escape \u0025 for
> %, but who wants to remember to type that?  So the solution is that
> anywhere we are relying on -Wformat checking, the caller should
> pass the usual printf %% escape sequence where a literal % is
> wanted on output.
>
> Note that since % can only appear in JSON inside a string, we don't
> have to teach the lexer how to parse any new % sequences, but instead
> only have to add code to the parser.  Likewise, the parser is still
> where we reject any attempt at mid-string % interpolation other than
> %% (this is only a runtime failure, rather than compile-time), but
> since we already document that qobject_from_jsonf() asserts on invalid
> usage, presumably anyone that is adding a new usage will have tested
> that their usage doesn't always fail.
>
> Simplify qstring_from_escaped_str() while touching it, by using
> bool, a more compact conditional, and qstring_append_chr().
> Likewise, improve the error message when parse_escape() is reached
> without interpolation (for example, if a client sends garbage
> rather than JSON over a QMP connection).
>
> The testsuite additions pass under valgrind, proving that we are
> indeed passing the reference of anything given through %p to the
> returned containing object, even when more than one object is
> interpolated.
>
> Signed-off-by: Eric Blake 
> ---
>  qobject/json-lexer.c  |  6 --
>  qobject/json-parser.c | 49 -
>  qobject/qjson.c   |  4 ++--
>  tests/check-qjson.c   | 50 ++
>  4 files changed, 80 insertions(+), 29 deletions(-)
>
> diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
> index b846d2852d..599b7446b7 100644
> --- a/qobject/json-lexer.c
> +++ b/qobject/json-lexer.c
> @@ -32,9 +32,11 @@
>   * Extension for vararg handling in JSON construction, when using
>   * qobject_from_jsonf() instead of qobject_from_json() (this lexer
>   * actually accepts multiple forms of PRId64, but parse_escape() later
> - * filters to only parse the current platform's spelling):
> + * filters to only parse the current platform's spelling; meanwhile,
> + * JSON only allows % inside strings, where the parser handles %%, so
> + * we do not need to lex it here):

The parenthesis is becoming unwieldy.  Turn it into a note...

>   *
> - * %(PRI[du]64|(l|ll)?[ud]|[ipsf])
> + * %(PRI[du]64|(l|ll)?[ud]|[ipsf%])
>   *

... here?

>   */
>
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index 388aa7a386..daafe77a0c 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -120,25 +120,21 @@ static int hex2decimal(char ch)
>   *  \n
>   *  \r
>   *  \t
> - *  \u four-hex-digits 
> + *  \u four-hex-digits
> + *
> + * Additionally, if @percent is true, all % in @token must be doubled,
> + * and each %% pair is replaced by a single % in the result; this
> + * allows -Wformat processing of qobject_from_jsonf().
>   */
>  static QString *qstring_from_escaped_str(JSONParserContext *ctxt,
> - JSONToken *token)
> + JSONToken *token, bool percent)
>  {
>  const char *ptr = token->str;
>  QString *str;
> -int double_quote = 1;
> -
> -if (*ptr == '"') {
> -double_quote = 1;
> -} else {
> -double_quote = 0;
> -}
> -ptr++;
> +bool double_quote = *ptr++ == '"';
>
>  str = qstring_new();
> -while (*ptr && 
> -   ((double_quote && *ptr != '"') || (!double_quote && *ptr != 
> '\''))) {
> +while (*ptr && *ptr != "'\""[double_quote]) {

Simpler:

   char quote = *ptr++;

and then

   while (*ptr && *ptr != quote) {

Have you considered splitting the patch into one to simplify, and one to
implement %%?
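
To make the two ideas in this review concrete (remembering the opening quote character, and collapsing %% to a single %), here is a small standalone sketch. It is not the QEMU parser: copy_quoted() is a hypothetical helper that writes into a caller-supplied buffer instead of a QString, and it ignores backslash escapes entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the body of a quoted token ('...' or "...")
 * into out.  If percent is true, every % in the token must be doubled,
 * and each %% pair becomes a single % in the output.  Returns false on
 * a stray %, a missing closing quote, or a buffer that is too small.
 */
static bool copy_quoted(const char *token, bool percent,
                        char *out, size_t outsz)
{
    const char *ptr = token;
    char quote = *ptr++;   /* remember which quote opened the string */
    size_t n = 0;

    if (outsz == 0 || (quote != '"' && quote != '\'')) {
        return false;
    }

    while (*ptr && *ptr != quote) {
        if (*ptr == '%' && percent) {
            if (*++ptr != '%') {
                return false;  /* only %% is allowed mid-string */
            }
        }
        if (n + 1 >= outsz) {
            return false;      /* keep room for the terminating NUL */
        }
        out[n++] = *ptr++;
    }

    if (*ptr != quote) {
        return false;          /* unterminated string */
    }
    out[n] = '\0';
    return true;
}
```

For example, with percent set, the token '100%%' yields 100%, while '%s' is rejected; with percent cleared, "%s" is copied through unchanged.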

>  if (*ptr == '\\') {
>  ptr++;
>
> @@ -205,12 +201,13 @@ static QString 
> *qstring_from_escaped_str(JSONParserContext *ctxt,
>  goto out;
>  }
>  } else {
> -char dummy[2];
> -
> -dummy[0] = *ptr++;
> -dummy[1] = 0;
> -
> -qstring_append(str, dummy);
> +if (*ptr == '%' && percent) {
> +if (*++ptr != '%') {
> +parse_error(ctxt, token, "invalid %% sequence in 
> string");
> +goto out;
> +}
> +}
> +qstring_append_chr(str, *ptr++);
>  }
>  }
>
> @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, 
> va_list *ap)

Re: [Qemu-devel] [PATCH v4 07/22] numa-test: Use hmp()

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> Don't open-code something that has a convenient helper available.
>
> Signed-off-by: Eric Blake 

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 11:07:38 +0200
Thomas Huth  wrote:

> On 09.08.2017 10:27, Cornelia Huck wrote:
> > On Wed, 9 Aug 2017 10:23:04 +0200
> > Thomas Huth  wrote:
> >   
> >> On 09.08.2017 09:17, Cornelia Huck wrote:  
> >>> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> >>> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> >>> device.
> >>>
> >>> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> >>> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> >>>
> >>> Signed-off-by: Cornelia Huck 
> >>> ---
> >>>
> >>> Changes v1->v2: drop extraneous spaces, fix build on cris
> >>>
> >>> ---
> >>>  default-configs/s390x-softmmu.mak | 1 +
> >>>  fsdev/Makefile.objs   | 9 +++--
> >>>  hw/Makefile.objs  | 2 +-
> >>>  3 files changed, 5 insertions(+), 7 deletions(-)  
> [...]
> >>
> >> Patch should be fine now, I think...
> >>
> >> But thinking about this again, I wonder whether it would be enough to
> >> simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
> >> sufficient to assert that there is also at least one kind of virtio
> >> transport available, right?
> >> Otherwise this will look really horrible as soon as somebody also tries
> >> to add support for virtio-mmio here later ;-)  
> > 
> > Do all virtio transports have support for 9p, though? I thought it was
> > only virtio-pci and virtio-ccw...  
> 
> While virtio-pci and virtio-ccw seem to require separate dedicated
> devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
> virtio-mmio seems to work differently. As far as I can see, there are no
> dedicated virtio-xxx-mmio devices in the code at all. According to
> https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
> you simply have to use virtio-xxx-device here instead. And a
> virtio-9p-device is available. So theoretically, the 9p code should work
> with virtio-mmio, too, or is there a problem that I did not see yet?
> 
> Anyway, we likely should not blindly enable this, so unless somebody has
> a setup to test it, we should go with your current patch instead, I think.

Yes, I'd prefer if somebody with a virtio-mmio setup could chime in.

Given the current Makefiles, this cannot have worked for !pci anyway...



Re: [Qemu-devel] [PATCH for-2.11 v2 1/5] qmp-shell: Use optparse module

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 05:39:31PM -0300, Eduardo Habkost wrote:
> It makes command-line parsing and generation of help text much
> simpler.
> 
> The optparse module is deprecated since Python 2.7, but argparse
> is not available in Python 2.6 (the minimum Python version
> required for building QEMU).
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> * Use optparse module, as the minimum Python version for building
>   QEMU is 2.6
>   * Reported-by: Stefan Hajnoczi 
>   * Suggested-by: "Daniel P. Berrange" 
> ---
>  scripts/qmp/qmp-shell | 63 
> +++
>  1 file changed, 23 insertions(+), 40 deletions(-)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH v3 5/7] block: add throttle block filter driver

2017-08-09 Thread Manos Pitsidianakis

On Tue, Aug 08, 2017 at 05:04:48PM +0200, Alberto Garcia wrote:

On Tue 08 Aug 2017 04:56:20 PM CEST, Manos Pitsidianakis wrote:

So basically if we have anonymous groups, we accept limits in the
driver options but only without a group-name.


In the commit message you do however have limits and a group name, is
that a mistake?

   -drive driver=throttle,file.filename=foo.qcow2, \
  limits.iops-total=...,throttle-group=bar


Sorry this wasn't clear, I'm actually proposing to remove limits from
the throttle driver options and only create/config throttle groups via
-object/object-add.


Sorry I think it was me who misunderstood :-) Anyway in the new
command-line API I would be more inclined to have limits defined using
"-object throttle-group" and -drive would only reference the group id.

I understand that this implies that it wouldn't be possible to create
anonymous groups (at least not from the command line), is that a
problem?



We can accept anonymous groups if a user specifies limits but not a 
group name in the throttle driver. (The only case where limits would be 
accepted)


Not creating eponymous throttle groups via the throttle driver means we 
don't need throttle_groups anymore, since even anonymous ones don't need 
to be accounted for in a list. I will send a new revision for this 
series but I can make this change in a later patch if everyone agrees.




Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 10:24:13 +0100
"Daniel P. Berrange"  wrote:

> On Wed, Aug 09, 2017 at 11:07:38AM +0200, Thomas Huth wrote:
> > On 09.08.2017 10:27, Cornelia Huck wrote:  
> > > On Wed, 9 Aug 2017 10:23:04 +0200
> > > Thomas Huth  wrote:
> > >   
> > >> On 09.08.2017 09:17, Cornelia Huck wrote:  
> > >>> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> > >>> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> > >>> device.
> > >>>
> > >>> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> > >>> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> > >>>
> > >>> Signed-off-by: Cornelia Huck 
> > >>> ---
> > >>>
> > >>> Changes v1->v2: drop extraneous spaces, fix build on cris
> > >>>
> > >>> ---
> > >>>  default-configs/s390x-softmmu.mak | 1 +
> > >>>  fsdev/Makefile.objs   | 9 +++--
> > >>>  hw/Makefile.objs  | 2 +-
> > >>>  3 files changed, 5 insertions(+), 7 deletions(-)  
> > [...]  
> > >>
> > >> Patch should be fine now, I think...
> > >>
> > >> But thinking about this again, I wonder whether it would be enough to
> > >> simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
> > >> sufficient to assert that there is also at least one kind of virtio
> > >> transport available, right?
> > >> Otherwise this will look really horrible as soon as somebody also tries
> > >> to add support for virtio-mmio here later ;-)  
> > > 
> > > Do all virtio transports have support for 9p, though? I thought it was
> > > only virtio-pci and virtio-ccw...  
> > 
> > While virtio-pci and virtio-ccw seem to require separate dedicated
> > devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
> > virtio-mmio seems to work differently. As far as I can see, there are no
> > dedicated virtio-xxx-mmio devices in the code at all. According to
> > https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
> > you simply have to use virtio-xxx-device here instead. And a
> > virtio-9p-device is available. So theoretically, the 9p code should work
> > with virtio-mmio, too, or is there a problem that I did not see yet?
> > 
> > Anyway, we likely should not blindly enable this, so unless somebody has
> > a setup to test it, we should go with your current patch instead, I think.  
> 
> qemu-system-arm supports virtio-mmio so you can use that to test it

Hm, the default config for arm enables CONFIG_PCI, so machines using
virtio-mmio and 9p would be broken with that patch... should we rather
depend on PCI || VIRTIO_CCW?

(Any arches not enabling PCI that use virtio-mmio? Or is arm the only
user of virtio-mmio?)



Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Greg Kurz
On Wed, 9 Aug 2017 10:27:37 +0200
Cornelia Huck  wrote:

> On Wed, 9 Aug 2017 10:23:04 +0200
> Thomas Huth  wrote:
> 
> > On 09.08.2017 09:17, Cornelia Huck wrote:  
> > > Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> > > on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> > > device.
> > > 
> > > Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> > > CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> > > 
> > > Signed-off-by: Cornelia Huck 
> > > ---
> > > 
> > > Changes v1->v2: drop extraneous spaces, fix build on cris
> > > 
> > > ---
> > >  default-configs/s390x-softmmu.mak | 1 +
> > >  fsdev/Makefile.objs   | 9 +++--
> > >  hw/Makefile.objs  | 2 +-
> > >  3 files changed, 5 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/default-configs/s390x-softmmu.mak 
> > > b/default-configs/s390x-softmmu.mak
> > > index 51191b77df..e4c5236ceb 100644
> > > --- a/default-configs/s390x-softmmu.mak
> > > +++ b/default-configs/s390x-softmmu.mak
> > > @@ -8,3 +8,4 @@ CONFIG_S390_FLIC=y
> > >  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
> > >  CONFIG_VFIO_CCW=$(CONFIG_LINUX)
> > >  CONFIG_WDT_DIAG288=y
> > > +CONFIG_VIRTIO_CCW=y
> > > diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
> > > index 659df6e187..3d157add31 100644
> > > --- a/fsdev/Makefile.objs
> > > +++ b/fsdev/Makefile.objs
> > > @@ -1,10 +1,7 @@
> > > -ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
> > >  # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
> > > -# only pull in the actual virtio-9p device if we also enabled virtio.
> > > -common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
> > > -else
> > > -common-obj-y = qemu-fsdev-dummy.o
> > > -endif
> > > +# only pull in the actual virtio-9p device if we also enabled a virtio 
> > > backend.
> > > +common-obj-$(call land,$(CONFIG_VIRTFS),$(call 
> > > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))= qemu-fsdev.o 
> > > 9p-marshal.o 9p-iov-marshal.o
> > > +common-obj-$(call lnot,$(call land,$(CONFIG_VIRTFS),$(call 
> > > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW)))) = qemu-fsdev-dummy.o
> > >  common-obj-y += qemu-fsdev-opts.o qemu-fsdev-throttle.o
> > >  
> > >  # Toplevel always builds this; targets without virtio will put it in
> > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > index a2c61f6b09..335f26b65e 100644
> > > --- a/hw/Makefile.objs
> > > +++ b/hw/Makefile.objs
> > > @@ -1,4 +1,4 @@
> > > -devices-dirs-$(call land, $(CONFIG_VIRTIO),$(call 
> > > land,$(CONFIG_VIRTFS),$(CONFIG_PCI))) += 9pfs/
> > > +devices-dirs-$(call land,$(CONFIG_VIRTFS),$(call 
> > > lor,$(CONFIG_VIRTIO_PCI),$(CONFIG_VIRTIO_CCW))) += 9pfs/
> > >  devices-dirs-$(CONFIG_SOFTMMU) += acpi/
> > >  devices-dirs-$(CONFIG_SOFTMMU) += adc/
> > >  devices-dirs-$(CONFIG_SOFTMMU) += audio/
> > 
> > Patch should be fine now, I think...
> > 
> > But thinking about this again, I wonder whether it would be enough to
> > simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
> > sufficient to assert that there is also at least one kind of virtio
> > transport available, right?
> > Otherwise this will look really horrible as soon as somebody also tries
> > to add support for virtio-mmio here later ;-)  
> 

And virtio isn't the only transport for 9p: we also have a Xen backend,
which happens to be built because targets that support Xen also have
CONFIG_PCI I guess.

Cc'ing Stefano and Paolo who had a discussion during the review of
9p Xen backend patches:

https://patchwork.kernel.org/patch/9622325/

> Do all virtio transports have support for 9p, though? I thought it was
> only virtio-pci and virtio-ccw...

Hmm... I don't see any device-specific code in virtio-mmio.. why would it
be different for 9p than for block or net ?




Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Peter Maydell
On 9 August 2017 at 10:07, Thomas Huth  wrote:
> While virtio-pci and virtio-ccw seem to require separate dedicated
> devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
> virtio-mmio seems to work differently. As far as I can see, there are no
> dedicated virtio-xxx-mmio devices in the code at all. According to
> https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
> you simply have to use virtio-xxx-device here instead. And a
> virtio-9p-device is available. So theoretically, the 9p code should work
> with virtio-mmio, too, or is there a problem that I did not see yet?
>
> Anyway, we likely should not blindly enable this, so unless somebody has
> a setup to test it, we should go with your current patch instead, I think.

As you say, we already compile the virtio-9p-device that can
plug into any virtio transport. So why not just build it
whenever virtio of any form is enabled? Having it only
build if PCI is also enabled seems very odd: the backend
should not care at all about what transport it is using.

thanks
-- PMM



[Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2017-08-09 Thread Changpeng Liu
This commit introduces a new vhost-user device for block. It uses a
chardev to connect to the backend. As with the QEMU virtio-blk device,
the guest OS still uses the virtio-blk frontend driver.

To use it, start QEMU with a command line like this:

qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num_queues=...

Unlike the existing QEMU virtio-blk host device, it makes it easier
for users to implement their own I/O processing logic, such as an
all-user-space I/O stack on top of a hardware block device. It uses
the new vhost message (VHOST_USER_GET_CONFIG) to get the virtio block
config information from the backend process.

Signed-off-by: Changpeng Liu 
---
 configure  |  11 ++
 hw/block/Makefile.objs |   3 +
 hw/block/vhost-user-blk.c  | 360 +
 hw/virtio/virtio-pci.c |  55 ++
 hw/virtio/virtio-pci.h |  18 ++
 include/hw/virtio/vhost-user-blk.h |  40 +
 6 files changed, 487 insertions(+)
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

diff --git a/configure b/configure
index dd73cce..1452c66 100755
--- a/configure
+++ b/configure
@@ -305,6 +305,7 @@ tcg="yes"
 
 vhost_net="no"
 vhost_scsi="no"
+vhost_user_blk="no"
 vhost_vsock="no"
 vhost_user=""
 kvm="no"
@@ -779,6 +780,7 @@ Linux)
   kvm="yes"
   vhost_net="yes"
   vhost_scsi="yes"
+  vhost_user_blk="yes"
   vhost_vsock="yes"
   QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers 
$QEMU_INCLUDES"
   supported_os="yes"
@@ -1136,6 +1138,10 @@ for opt do
   ;;
   --enable-vhost-scsi) vhost_scsi="yes"
   ;;
+  --disable-vhost-user-blk) vhost_user_blk="no"
+  ;;
+  --enable-vhost-user-blk) vhost_user_blk="yes"
+  ;;
   --disable-vhost-vsock) vhost_vsock="no"
   ;;
   --enable-vhost-vsock) vhost_vsock="yes"
@@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   cap-ng  libcap-ng support
   attrattr and xattr support
   vhost-net   vhost-net acceleration support
+  vhost-user-blk  VM virtio-blk acceleration in user space
   spice   spice
   rbd rados block device (rbd)
   libiscsiiscsi support
@@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
 echo "libcap-ng support $cap_ng"
 echo "vhost-net support $vhost_net"
 echo "vhost-scsi support $vhost_scsi"
+echo "vhost-user-blk support $vhost_user_blk"
 echo "vhost-vsock support $vhost_vsock"
 echo "vhost-user support $vhost_user"
 echo "Trace backends$trace_backends"
@@ -5776,6 +5784,9 @@ fi
 if test "$vhost_scsi" = "yes" ; then
   echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
 fi
+if test "$vhost_user_blk" = "yes" ; then
+  echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
+fi
 if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
   echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
 fi
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index e0ed980..4c19a58 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
 
 obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO) += dataplane/
+ifeq ($(CONFIG_VIRTIO),y)
+obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
new file mode 100644
index 0000000..8aa9fa9
--- /dev/null
+++ b/hw/block/vhost-user-blk.c
@@ -0,0 +1,360 @@
+/*
+ * vhost-user-blk host device
+ *
+ * Copyright IBM, Corp. 2011
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ *  Stefan Hajnoczi 
+ *  Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+
+static const int user_feature_bits[] = {
+VIRTIO_BLK_F_SIZE_MAX,
+VIRTIO_BLK_F_SEG_MAX,
+VIRTIO_BLK_F_GEOMETRY,
+VIRTIO_BLK_F_BLK_SIZE,
+VIRTIO_BLK_F_TOPOLOGY,
+VIRTIO_BLK_F_SCSI,
+VIRTIO_BLK_F_MQ,
+VIRTIO_BLK_F_RO,
+VIRTIO_BLK_F_FLUSH,
+VIRTIO_BLK_F_BARRIER,
+VIRTIO_BLK_F_WCE,
+VIRTIO_F_VERSION_1,
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
+}
+
+static void vhost_user_blk_set_config(VirtIODevice 

[Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-08-09 Thread Changpeng Liu
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages, which can be
used for live migration of vhost-user devices. vhost-user devices can
also benefit from these messages to get/set the virtio config space
from/to an I/O target other than QEMU. To support virtio config space
changes, a VHOST_USER_SET_CONFIG_FD message is added as the event
notifier for virtio config space changes.

Signed-off-by: Changpeng Liu 
---
 docs/interop/vhost-user.txt   | 31 ++
 hw/virtio/vhost-user.c| 86 +++
 hw/virtio/vhost.c | 63 
 include/hw/virtio/vhost-backend.h |  8 
 include/hw/virtio/vhost.h | 16 
 5 files changed, 204 insertions(+)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 954771d..19dfc61 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -116,6 +116,10 @@ Depending on the request type, payload can be:
 - 3: IOTLB invalidate
 - 4: IOTLB access fail
 
+ * Virtio device config space
+
+   256 Bytes static virtio config space
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -129,6 +133,7 @@ typedef struct VhostUserMsg {
 VhostUserMemory memory;
 VhostUserLog log;
 struct vhost_iotlb_msg iotlb;
+uint8_t config[256];
 };
 } QEMU_PACKED VhostUserMsg;
 
@@ -596,6 +601,32 @@ Master message types
   and expect this message once (per VQ) during device configuration
   (ie. before the master starts the VQ).
 
+ * VHOST_USER_GET_CONFIG
+  Id: 24
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+
+  Submitted by the vhost-user master to fetch the contents of the virtio
+  config space. The vhost-user master may cache the contents to avoid
+  repeated VHOST_USER_GET_CONFIG calls.
+
+* VHOST_USER_SET_CONFIG
+  Id: 25
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+
+  Submitted by the vhost-user master when the guest writes to virtio
+  config space and also after live migration on the destination host.
+
+* VHOST_USER_SET_CONFIG_FD
+  Id: 26
+  Equivalent ioctl: N/A
+  Master payload: N/A
+
+  Sets the notifier file descriptor, which is passed as ancillary data.
+  Vhost-user master uses the file descriptor as event callback when the
+  virtio config space changed.
+
 Slave message types
 ---
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 093675e..4b402c5 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,6 +26,11 @@
 #define VHOST_MEMORY_MAX_NREGIONS8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -65,6 +70,9 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
 VHOST_USER_SET_VRING_ENDIAN = 23,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
+VHOST_USER_SET_CONFIG_FD = 26,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -109,6 +117,7 @@ typedef struct VhostUserMsg {
 VhostUserMemory memory;
 VhostUserLog log;
 struct vhost_iotlb_msg iotlb;
+uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
 } payload;
 } QEMU_PACKED VhostUserMsg;
 
@@ -922,6 +931,80 @@ static void vhost_user_set_iotlb_callback(struct vhost_dev 
*dev, int enabled)
 /* No-op as the receive channel is not dedicated to IOTLB messages. */
 }
 
+static int vhost_user_get_config(struct vhost_dev *dev, uint8_t *config,
+ size_t config_len)
+{
+VhostUserMsg msg = {
+.request = VHOST_USER_GET_CONFIG,
+.flags = VHOST_USER_VERSION,
+.size = config_len,
+};
+
+if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
+error_report("bad config length");
+return -1;
+}
+
+if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+return -1;
+}
+
+if (vhost_user_read(dev, &msg) < 0) {
+return -1;
+}
+
+if (msg.request != VHOST_USER_GET_CONFIG) {
+error_report("Received unexpected msg type. Expected %d received %d",
+ VHOST_USER_GET_CONFIG, msg.request);
+return -1;
+}
+
+if (msg.size != config_len) {
+error_report("Received bad msg size.");
+return -1;
+}
+
+memcpy(config, &msg.payload.config, config_len);
+
+return 0;
+}
+
+static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t *config,
+ size_t config_len)
+{
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_CONFIG,
+.flags = VHOST_USER_VERSION,
+.size = config_len,
+   

[Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu ***

2017-08-09 Thread Changpeng Liu
Although the virtio-scsi specification was designed as a replacement for
virtio_blk, there are still many users of virtio_blk. QEMU 2.9 introduced
a new device, vhost-user-scsi, which can process I/O in user space for
virtio_scsi. This series introduces a new vhost-user block host device,
which supports virtio_blk in the guest OS with I/O processing in a
separate I/O target.

Due to a limitation of the virtio_blk specification, a virtio_blk device
cannot get block information such as capacity, block size, etc. via the
specification, so several new vhost-user messages were added to deliver
virtio config space information between QEMU and the I/O target:
VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG are used to get/set the
config space from/to the I/O target, and VHOST_USER_SET_CONFIG_FD was
added as an event notifier for changes to the virtio config space. These
messages can also be used for vhost device live migration.
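
As a rough illustration of the GET_CONFIG round trip described above, the sketch below shows the kind of checks a master might apply to a reply before copying the config bytes out. It is an assumption-laden sketch, not the code from the series: ConfigMsg is a simplified, hypothetical message layout (not the real VhostUserMsg), and copy_config_reply() is a made-up helper. Only the request id 24 and the 256-byte limit are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CONFIG_SPACE_MAX 256  /* same value as VHOST_USER_MAX_CONFIG_SIZE */
#define REQ_GET_CONFIG   24   /* id of VHOST_USER_GET_CONFIG in the patch */

/* Hypothetical, simplified message; not the real VhostUserMsg layout. */
typedef struct {
    uint32_t request;
    uint32_t size;
    uint8_t config[CONFIG_SPACE_MAX];
} ConfigMsg;

/*
 * Validate a GET_CONFIG reply and copy config_len bytes into config.
 * Returns 0 on success, -1 if the length is out of range or the reply
 * does not match what was asked for.
 */
static int copy_config_reply(const ConfigMsg *reply, uint8_t *config,
                             size_t config_len)
{
    if (config_len == 0 || config_len > CONFIG_SPACE_MAX) {
        return -1;  /* never copy more than the payload can hold */
    }
    if (reply->request != REQ_GET_CONFIG) {
        return -1;  /* unexpected message type */
    }
    if (reply->size != config_len) {
        return -1;  /* backend answered with a different length */
    }
    memcpy(config, reply->config, config_len);
    return 0;
}
```

Bounding the length by the size of the config array itself, rather than by the size of the whole payload, keeps the copy inside the buffer even if the payload union later grows.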

Changpeng Liu (4):
  vhost-user: add new vhost user messages to support virtio config space
  vhost-user-blk: introduce a new vhost-user-blk host device
  contrib/libvhost-user: enable virtio config space messages
  contrib/vhost-user-blk: introduce a vhost-user-blk sample application

 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   2 +
 configure   |  11 +
 contrib/libvhost-user/libvhost-user.c   |  51 +++
 contrib/libvhost-user/libvhost-user.h   |  14 +
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 735 
 docs/interop/vhost-user.txt |  31 ++
 hw/block/Makefile.objs  |   3 +
 hw/block/vhost-user-blk.c   | 360 
 hw/virtio/vhost-user.c  |  86 
 hw/virtio/vhost.c   |  63 +++
 hw/virtio/virtio-pci.c  |  55 +++
 hw/virtio/virtio-pci.h  |  18 +
 include/hw/virtio/vhost-backend.h   |   8 +
 include/hw/virtio/vhost-user-blk.h  |  40 ++
 include/hw/virtio/vhost.h   |  16 +
 18 files changed, 1498 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

-- 
1.9.3




[Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages

2017-08-09 Thread Changpeng Liu
Enable the VHOST_USER_GET_CONFIG, VHOST_USER_SET_CONFIG and
VHOST_USER_SET_CONFIG_FD messages in the libvhost-user library, so that users
can implement their own I/O target based on the library. This allows the
virtio config space to be delivered between the QEMU host device and the I/O
target. An event notifier is also added for virtio config space changes.
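
A minimal sketch of what an I/O target's config callbacks might look like,
assuming a flat 256-byte config buffer. ExampleTarget, EXAMPLE_CONFIG_SIZE and
the example_* functions are hypothetical names used only for illustration; the
callbacks in this patch take a VuDev * as their first argument instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: mirrors VHOST_USER_MAX_CONFIG_SIZE from this patch. */
#define EXAMPLE_CONFIG_SIZE 256

/* Hypothetical target state; a real target would be reached through VuDev. */
typedef struct ExampleTarget {
    uint8_t config[EXAMPLE_CONFIG_SIZE];
} ExampleTarget;

/* Copy the device config space out to the caller; -1 on an oversized request. */
static int example_get_config(ExampleTarget *t, uint8_t *config, size_t len)
{
    if (len > sizeof(t->config)) {
        return -1;
    }
    memcpy(config, t->config, len);
    return 0;
}

/* Store a new config space written by the master; -1 on an oversized request. */
static int example_set_config(ExampleTarget *t, const uint8_t *config,
                              size_t len)
{
    if (len > sizeof(t->config)) {
        return -1;
    }
    memcpy(t->config, config, len);
    return 0;
}
```

The length check is done before any copy so that a malformed message size
cannot overrun the fixed-size buffer.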

Signed-off-by: Changpeng Liu 
---
 contrib/libvhost-user/libvhost-user.c | 51 +++
 contrib/libvhost-user/libvhost-user.h | 14 ++
 2 files changed, 65 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index 9efb9da..002cf15 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -63,6 +63,9 @@ vu_request_to_string(int req)
 REQ(VHOST_USER_SET_VRING_ENABLE),
 REQ(VHOST_USER_SEND_RARP),
 REQ(VHOST_USER_INPUT_GET_CONFIG),
+REQ(VHOST_USER_GET_CONFIG),
+REQ(VHOST_USER_SET_CONFIG),
+REQ(VHOST_USER_SET_CONFIG_FD),
 REQ(VHOST_USER_MAX),
 };
 #undef REQ
@@ -744,6 +747,43 @@ vu_set_vring_enable_exec(VuDev *dev, VhostUserMsg *vmsg)
 }
 
 static bool
+vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+if (dev->iface->get_config) {
+dev->iface->get_config(dev, vmsg->payload.config, vmsg->size);
+}
+
+return true;
+}
+
+static bool
+vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+if (dev->iface->set_config) {
+dev->iface->set_config(dev, vmsg->payload.config, vmsg->size);
+}
+
+return false;
+}
+
+static bool
+vu_set_config_fd(VuDev *dev, VhostUserMsg *vmsg)
+{
+   if (vmsg->fd_num != 1) {
+vu_panic(dev, "Invalid config_fd message");
+return false;
+}
+
+if (dev->config_fd != -1) {
+close(dev->config_fd);
+}
+dev->config_fd = vmsg->fds[0];
+DPRINT("Got config_fd: %d\n", vmsg->fds[0]);
+
+return false;
+}
+
+static bool
 vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 {
 int do_reply = 0;
@@ -806,6 +846,12 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 return vu_get_queue_num_exec(dev, vmsg);
 case VHOST_USER_SET_VRING_ENABLE:
 return vu_set_vring_enable_exec(dev, vmsg);
+case VHOST_USER_GET_CONFIG:
+return vu_get_config(dev, vmsg);
+case VHOST_USER_SET_CONFIG:
+return vu_set_config(dev, vmsg);
+case VHOST_USER_SET_CONFIG_FD:
+return vu_set_config_fd(dev, vmsg);
 default:
 vmsg_close_fds(vmsg);
 vu_panic(dev, "Unhandled request: %d", vmsg->request);
@@ -878,6 +924,10 @@ vu_deinit(VuDev *dev)
 
 vu_close_log(dev);
 
+if (dev->config_fd != -1) {
+close(dev->config_fd);
+}
+
 if (dev->sock != -1) {
 close(dev->sock);
 }
@@ -907,6 +957,7 @@ vu_init(VuDev *dev,
 dev->remove_watch = remove_watch;
 dev->iface = iface;
 dev->log_call_fd = -1;
+dev->config_fd = -1;
 for (i = 0; i < VHOST_MAX_NR_VIRTQUEUE; i++) {
 dev->vq[i] = (VuVirtq) {
 .call_fd = -1, .kick_fd = -1, .err_fd = -1,
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
index 53ef222..899dee1 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -30,6 +30,8 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -62,6 +64,9 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_VRING_ENABLE = 18,
 VHOST_USER_SEND_RARP = 19,
 VHOST_USER_INPUT_GET_CONFIG = 20,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
+VHOST_USER_SET_CONFIG_FD = 26,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -105,6 +110,7 @@ typedef struct VhostUserMsg {
 struct vhost_vring_addr addr;
 VhostUserMemory memory;
 VhostUserLog log;
+uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
 } payload;
 
 int fds[VHOST_MEMORY_MAX_NREGIONS];
@@ -132,6 +138,9 @@ typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
 typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
   int *do_reply);
 typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
+typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, size_t len);
+typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *config,
+ size_t len);
 
 typedef struct VuDevIface {
 /* called by VHOST_USER_GET_FEATURES to get the features bitmask */
@@ -148,6 +157,10 @@ typedef struct VuDevIface {
 vu_process_msg_cb process_msg;
 /* tells when queues can be processed */
 vu_queue_set_started_cb queue_set_started;
+/* get the config space of the device */
+vu_get_config_cb get_config;
+/* set the config space of the device */
+

[Qemu-devel] [PATCH v4 0/7] add throttle block driver filter

2017-08-09 Thread Manos Pitsidianakis
This series adds a throttle block driver filter. Currently throttling is done
at the BlockBackend level. Using block driver interfaces we can move the
throttling to any point in the BDS graph using a throttle node which uses the
existing throttling code. This allows for potentially more complex
configurations (throttling at any point in the graph, chained filters).

v4:
  fix suggestions in block/throttle.c
  fix suggestions in block/throttle_groups.c
  add doc note in BlockDevOptionsThrottle
  
v3:
  fix style error in 'add aio_context field in ThrottleGroupMember'

v2:
  change QOM throttle group object name
  print valid ranges for uint on error
  move frees in throttle_group_obj_finalize()
  split throttle_group_{set,get}()
  add throttle_recurse_is_first_non_filter()

Manos Pitsidianakis (7):
  block: move ThrottleGroup membership to ThrottleGroupMember
  block: add aio_context field in ThrottleGroupMember
  block: tidy ThrottleGroupMember initializations
  block: convert ThrottleGroup to object with QOM
  block: add throttle block filter driver
  block: add BlockDevOptionsThrottle to QAPI
  block: add throttle block filter driver interface tests

 block/Makefile.objs |   1 +
 block/block-backend.c   |  62 ++--
 block/qapi.c|   8 +-
 block/throttle-groups.c | 730 +---
 block/throttle.c| 315 +
 blockdev.c  |   4 +-
 include/block/throttle-groups.h |  47 ++-
 include/qemu/throttle-options.h |  60 ++--
 include/qemu/throttle.h |   3 +
 include/sysemu/block-backend.h  |  20 +-
 qapi/block-core.json|  70 +++-
 tests/qemu-iotests/184  | 310 +
 tests/qemu-iotests/184.out  | 422 +++
 tests/qemu-iotests/group|   1 +
 tests/test-throttle.c   | 111 +++---
 util/throttle.c | 151 +
 16 files changed, 1995 insertions(+), 320 deletions(-)
 create mode 100644 block/throttle.c
 create mode 100755 tests/qemu-iotests/184
 create mode 100644 tests/qemu-iotests/184.out

-- 
2.11.0




[Qemu-devel] [PATCH v4 2/7] block: add aio_context field in ThrottleGroupMember

2017-08-09 Thread Manos Pitsidianakis
timer_cb() needs to know about the current Aio context of the throttle
request that is woken up. In order to make ThrottleGroupMember backend
agnostic, this information is stored in an aio_context field instead of
accessing it from BlockBackend.
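
A minimal sketch of the design difference described above, assuming simplified
stand-in types. ExampleContext, ExampleMember, ExampleBackend and the functions
below are hypothetical and are not QEMU's definitions; they only contrast a
container_of-style lookup with a context stored in the member itself.

```c
#include <stddef.h>

/* Illustrative container_of; recovers the owner from an embedded member. */
#define example_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct ExampleContext {
    int id;
} ExampleContext;

typedef struct ExampleMember {
    ExampleContext *aio_context;   /* stored at registration time */
} ExampleMember;

typedef struct ExampleBackend {
    ExampleContext *ctx;
    ExampleMember member;
} ExampleBackend;

/* Old style: only valid when the member is embedded in an ExampleBackend. */
static ExampleContext *context_via_container(ExampleMember *m)
{
    ExampleBackend *b = example_container_of(m, ExampleBackend, member);
    return b->ctx;
}

/* New style: works for any owner, because the member carries its context. */
static ExampleContext *context_via_field(ExampleMember *m)
{
    return m->aio_context;
}

/* Registration records the context once, as the extra ctx argument does. */
static void example_register(ExampleMember *m, ExampleContext *ctx)
{
    m->aio_context = ctx;
}
```

With the stored field, a member that lives outside any backend (for example
inside a filter node) can still find its context.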

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
---
 block/block-backend.c   | 15 +-
 block/throttle-groups.c | 38 -
 include/block/throttle-groups.h |  7 -
 tests/test-throttle.c   | 63 +
 4 files changed, 70 insertions(+), 53 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index bde6948d0e..6687a90660 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1685,17 +1685,15 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
 void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
 {
 BlockDriverState *bs = blk_bs(blk);
-ThrottleTimers *tt;
+ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
 
 if (bs) {
-if (blk->public.throttle_group_member.throttle_state) {
-tt = &blk->public.throttle_group_member.throttle_timers;
-throttle_timers_detach_aio_context(tt);
+if (tgm->throttle_state) {
+throttle_group_detach_aio_context(tgm);
 }
 bdrv_set_aio_context(bs, new_context);
-if (blk->public.throttle_group_member.throttle_state) {
-tt = &blk->public.throttle_group_member.throttle_timers;
-throttle_timers_attach_aio_context(tt, new_context);
+if (tgm->throttle_state) {
+throttle_group_attach_aio_context(tgm, new_context);
 }
 }
 }
@@ -1929,7 +1927,8 @@ void blk_io_limits_disable(BlockBackend *blk)
 void blk_io_limits_enable(BlockBackend *blk, const char *group)
 {
 assert(!blk->public.throttle_group_member.throttle_state);
-throttle_group_register_tgm(&blk->public.throttle_group_member, group);
+throttle_group_register_tgm(&blk->public.throttle_group_member,
+group, blk_get_aio_context(blk));
 }
 
 void blk_io_limits_update_group(BlockBackend *blk, const char *group)
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index c8ed16ddf8..3b07b25f39 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -391,9 +391,6 @@ static void coroutine_fn throttle_group_restart_queue_entry(void *opaque)
 
 static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write)
 {
-BlockBackendPublic *blkp = container_of(tgm, BlockBackendPublic,
-throttle_group_member);
-BlockBackend *blk = blk_by_public(blkp);
 Coroutine *co;
 RestartData rd = {
 .tgm = tgm,
@@ -401,7 +398,7 @@ static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write
 };
 
 co = qemu_coroutine_create(throttle_group_restart_queue_entry, &rd);
-aio_co_enter(blk_get_aio_context(blk), co);
+aio_co_enter(tgm->aio_context, co);
 }
 
 void throttle_group_restart_tgm(ThrottleGroupMember *tgm)
@@ -449,13 +446,11 @@ void throttle_group_get_config(ThrottleGroupMember *tgm, ThrottleConfig *cfg)
 /* ThrottleTimers callback. This wakes up a request that was waiting
  * because it had been throttled.
  *
- * @blk:   the BlockBackend whose request had been throttled
+ * @tgm:   the ThrottleGroupMember whose request had been throttled
  * @is_write:  the type of operation (read/write)
  */
-static void timer_cb(BlockBackend *blk, bool is_write)
+static void timer_cb(ThrottleGroupMember *tgm, bool is_write)
 {
-BlockBackendPublic *blkp = blk_get_public(blk);
-ThrottleGroupMember *tgm = &blkp->throttle_group_member;
 ThrottleState *ts = tgm->throttle_state;
 ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
 
@@ -484,18 +479,18 @@ static void write_timer_cb(void *opaque)
  *
  * @tgm:   the ThrottleGroupMember to insert
  * @groupname: the name of the group
+ * @ctx:   the AioContext to use
  */
 void throttle_group_register_tgm(ThrottleGroupMember *tgm,
- const char *groupname)
+ const char *groupname,
+ AioContext *ctx)
 {
 int i;
-BlockBackendPublic *blkp = container_of(tgm, BlockBackendPublic,
-throttle_group_member);
-BlockBackend *blk = blk_by_public(blkp);
 ThrottleState *ts = throttle_group_incref(groupname);
 ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
 
 tgm->throttle_state = ts;
+tgm->aio_context = ctx;
 
 qemu_mutex_lock(&tg->lock);
 /* If the ThrottleGroup is new set this ThrottleGroupMember as the token */
@@ -508,11 +503,11 @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
 QLIST_INSERT_HEAD(&tg->head, tgm, round_robin);
 
 throttle_timers_init(&tgm->throttle_timers,
- 

[Qemu-devel] [PATCH v4 5/7] block: add throttle block filter driver

2017-08-09 Thread Manos Pitsidianakis
block/throttle.c uses the existing I/O throttle infrastructure inside a
block filter driver. I/O operations are intercepted in the filter's
read/write coroutines and handed off to block/throttle-groups.c.

The driver can be used with the syntax
-drive driver=throttle,file.filename=foo.qcow2, \
limits.iops-total=...,throttle-group=bar

The configuration flags and their semantics are identical to the
hardcoded throttling ones.

A node can be created that refers to an existing group. It overwrites the
group's limits if any are specified; otherwise the existing limits are retained.
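
A minimal sketch of the "overwrite only what was specified" rule, assuming
simplified stand-in types. ExampleOpt, ExampleLimits and example_apply() are
hypothetical names and are not part of this patch; the range check mirrors the
UINT_MAX limit on burst lengths.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parsed option: records whether the user supplied a value. */
typedef struct ExampleOpt {
    bool is_set;
    uint64_t value;
} ExampleOpt;

/* Hypothetical subset of a group's limits. */
typedef struct ExampleLimits {
    uint64_t iops_total;
    unsigned burst_length;
} ExampleLimits;

/*
 * Apply only the options that were supplied; anything left unset keeps the
 * value the group already has. Returns -1 if the burst length does not fit
 * in an unsigned int, leaving cfg untouched.
 */
static int example_apply(ExampleLimits *cfg, ExampleOpt iops, ExampleOpt burst)
{
    if (burst.is_set && burst.value > UINT_MAX) {
        return -1;
    }
    if (iops.is_set) {
        cfg->iops_total = iops.value;
    }
    if (burst.is_set) {
        cfg->burst_length = (unsigned)burst.value;
    }
    return 0;
}
```

Unlike the macros in the patch, this sketch validates before modifying
anything, so a rejected option leaves the configuration unchanged.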

Signed-off-by: Manos Pitsidianakis 
---
 block/Makefile.objs |   1 +
 block/throttle.c| 315 
 include/qemu/throttle-options.h |   1 +
 3 files changed, 317 insertions(+)
 create mode 100644 block/throttle.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 2aaede4ae1..6eaf78a046 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -25,6 +25,7 @@ block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
 block-obj-y += backup.o
 block-obj-$(CONFIG_REPLICATION) += replication.o
+block-obj-y += throttle.o
 
 block-obj-y += crypto.o
 
diff --git a/block/throttle.c b/block/throttle.c
new file mode 100644
index 00..3e6cb1de7b
--- /dev/null
+++ b/block/throttle.c
@@ -0,0 +1,315 @@
+/*
+ * QEMU block throttling filter driver infrastructure
+ *
+ * Copyright (c) 2017 Manos Pitsidianakis
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "block/throttle-groups.h"
+#include "qemu/throttle-options.h"
+#include "qapi/error.h"
+
+#undef THROTTLE_OPT_PREFIX
+#define THROTTLE_OPT_PREFIX "limits."
+static QemuOptsList throttle_opts = {
+.name = "throttle",
+.head = QTAILQ_HEAD_INITIALIZER(throttle_opts.head),
+.desc = {
+THROTTLE_OPTS,
+{
+.name = QEMU_OPT_THROTTLE_GROUP_NAME,
+.type = QEMU_OPT_STRING,
+.help = "throttle group name",
+},
+{ /* end of list */ }
+},
+};
+
+/* Extract ThrottleConfig options. Assumes cfg is initialized and will be
+ * checked for validity.
+ *
+ * Returns -1 and sets errp if a burst_length value is over UINT_MAX.
+ */
+static int throttle_extract_options(QemuOpts *opts, ThrottleConfig *cfg,
+Error **errp)
+{
+#define IF_OPT_SET(rvalue, opt_name) \
+if (qemu_opt_get(opts, THROTTLE_OPT_PREFIX opt_name)) { \
+rvalue = qemu_opt_get_number(opts, THROTTLE_OPT_PREFIX opt_name, 0); }
+
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_TOTAL].avg, QEMU_OPT_BPS_TOTAL);
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_READ].avg, QEMU_OPT_BPS_READ);
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_WRITE].avg, QEMU_OPT_BPS_WRITE);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_TOTAL].avg, QEMU_OPT_IOPS_TOTAL);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_READ].avg, QEMU_OPT_IOPS_READ);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_WRITE].avg, QEMU_OPT_IOPS_WRITE);
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_TOTAL].max, QEMU_OPT_BPS_TOTAL_MAX);
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_READ].max, QEMU_OPT_BPS_READ_MAX);
+IF_OPT_SET(cfg->buckets[THROTTLE_BPS_WRITE].max, QEMU_OPT_BPS_WRITE_MAX);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_TOTAL].max, QEMU_OPT_IOPS_TOTAL_MAX);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_READ].max, QEMU_OPT_IOPS_READ_MAX);
+IF_OPT_SET(cfg->buckets[THROTTLE_OPS_WRITE].max, QEMU_OPT_IOPS_WRITE_MAX);
+IF_OPT_SET(cfg->op_size, QEMU_OPT_IOPS_SIZE);
+
+#define IF_OPT_UINT_SET(rvalue, opt_name) \
+if (qemu_opt_get(opts, THROTTLE_OPT_PREFIX opt_name)) { \
+if (qemu_opt_get_number(opts,  \
+THROTTLE_OPT_PREFIX opt_name, 1) > UINT_MAX) { \
+error_setg(errp, "%s value must be in the range [0, %u]", \
+   THROTTLE_OPT_PREFIX opt_name, UINT_MAX); \
+return -1; \
+} \
+rvalue = qemu_opt_get_number(opts, THROTTLE_OPT_PREFIX opt_name, 1); \
+}
+
+IF_OPT_UINT_SET(cfg->buckets[THROTTLE_BPS_TOTAL].burst_length,
+QEMU_OPT_BPS_TOTAL_MAX_LENGTH);
+IF_OPT_UINT_SET(cfg->buckets[THROTTLE_BPS_READ].burst_length,
+QEMU_OPT_BPS_READ_MAX_LENGTH);
+IF_OPT_UINT_SET(cfg->buckets[THROTTLE_BPS_WRITE].burst_length,
+

Re: [Qemu-devel] [PATCH v4 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-09 Thread Laszlo Ersek
On 08/08/17 21:21, Aleksandr Bezzubikov wrote:
> 2017-08-08 18:11 GMT+03:00 Laszlo Ersek :
>> one comment below
>>
>> On 08/05/17 22:27, Aleksandr Bezzubikov wrote:
>>
>>> +Capability layout (defined in include/hw/pci/pci_bridge.h):
>>> +
>>> +uint8_t id;         Standard PCI capability header field
>>> +uint8_t next;       Standard PCI capability header field
>>> +uint8_t len;        Standard PCI vendor-specific capability header field
>>> +
>>> +uint8_t type;       Red Hat vendor-specific capability type
>>> +                    List of currently existing types:
>>> +                        QEMU_RESERVE = 1
>>> +
>>> +
>>> +uint32_t bus_res;   Minimum number of buses to reserve
>>> +
>>> +uint64_t io;        IO space to reserve
>>> +uint64_t mem        Non-prefetchable memory to reserve
>>> +uint64_t mem_pref;  Prefetchable memory to reserve
>>
>> (I apologize if I missed any concrete points from the past messages
>> regarding this structure.)
>>
>> How is the firmware supposed to know whether the prefetchable MMIO
>> reservation should be made in 32-bit or 64-bit address space? If we
>> reserve prefetchable MMIO outside of the 32-bit address space, then
>> hot-plugging a device without 64-bit MMIO support could fail.
>>
>> My earlier request, to distinguish "prefetchable_32" from
>> "prefetchable_64" (mutually exclusively), was so that firmware would
>> know whether to restrict the MMIO reservation to 32-bit address
>> space.
>
> IIUC now (in SeaBIOS at least) we just assign these PREF registers
> unconditionally, so the decision about the mode can be made based on
> whether the UPPER_PREF_LIMIT register is non-zero.
> My idea was the same - we can just check if the value doesn't fit into
> 16-bit (PREF_LIMIT reg size, 32-bit MMIO). Do we really need separate
> fields for that?

The PciBusDxe driver in edk2 tracks 32-bit and 64-bit MMIO resources
separately from each other, and other (independent) logic exists in it
that, on some conditions, allocates 64-bit MMIO BARs from 32-bit address
space. This is just to say that the distinction is intentional in
PciBusDxe.

Furthermore, the Platform Init spec v1.6 says the following (this is
what OVMF will have to comply with, in the "platform hook" called by
PciBusDxe):

> 12.6 PCI Hot Plug PCI Initialization Protocol
> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
> ...
> Padding  The amount of resource padding that is required by the PCI
>  bus under the control of the specified HPC. Because the
>  caller does not know the size of this buffer, this buffer is
>  allocated by the callee and freed by the caller.
> ...
> The padding is returned in the form of ACPI (2.0 & 3.0) resource
> descriptors. The exact definition of each of the fields is the same as
> in the
> EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources()
> function. See the section 10.8 for the definition of this function.

Following that pointer:

> 10.8 PCI HostBridge Code Definitions
> 10.8.2 PCI Host Bridge Resource Allocation Protocol
>
> Table 8. ACPI 2.0 & 3.0 QWORD Address Space Descriptor Usage
>
> Byte    Byte    Data  Description
> Offset  Length
> ...
> 0x03    0x01          Resource type:
>                         0: Memory range
>                         1: I/O range
>                         2: Bus number range
> ...
> 0x05    0x01          Type-specific flags. Ignored except as defined
>                       in Table 3-3 and Table 3-4 below.
>
> 0x06    0x08          Address Space Granularity. Used to differentiate
>                       between a 32-bit memory request and a 64-bit
>                       memory request. For a 32-bit memory request,
>                       this field should be set to 32. For a 64-bit
>                       memory request, this field should be set to 64.
>                       Ignored for I/O and bus resource requests.
>                       Ignored during GetProposedResources().

The "Table 3-3" and "Table 3-4" references under "Type-specific flags"
are out of date (spec bug); in reality those are:
- Table 10. I/O Resource Flag (Resource Type = 1) Usage,
- Table 11. Memory Resource Flag (Resource Type = 0) Usage.

The latter is relevant here:

> Table 11. Memory Resource Flag (Resource Type = 0) Usage
>
> Bits  Meaning
> ...
> Bit[2:1]  _MEM. Memory attributes.
>   Value and Meaning:
> 0 The memory is nonprefetchable.
> 1 Invalid.
> 2 Invalid.
> 3 The memory is prefetchable.
>   Note: The interpretation of these bits is somewhat different
>   from the ACPI Specification. According to the ACPI
>   Specification, a value of 0 implies noncacheable memory and
>   the value of 3 indicates prefetchable and cacheable memory.

So whatever OVMF sees in the capability, it must be able to translate to
the above representation.
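
A minimal sketch of such a translation, assuming the capability can tell
prefetchable 32-bit from prefetchable 64-bit reservations. ExamplePadding and
example_mem_padding() are hypothetical names, not edk2 or QEMU definitions;
only the resource type, the _MEM bits and the granularity follow the tables
quoted above, and the length field is an illustrative stand-in for the rest of
the descriptor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Resource type values from Table 8 (byte offset 0x03). */
enum {
    EX_RES_MEMORY = 0,
    EX_RES_IO     = 1,
    EX_RES_BUS    = 2
};

/* Hypothetical, simplified view of one padding descriptor. */
typedef struct ExamplePadding {
    uint8_t  res_type;     /* offset 0x03 */
    uint8_t  type_flags;   /* offset 0x05, Bit[2:1] = _MEM */
    uint64_t granularity;  /* offset 0x06: 32 or 64 for memory requests */
    uint64_t length;       /* amount to reserve (illustrative field) */
} ExamplePadding;

/* Build a memory padding request from a reservation size and its attributes. */
static ExamplePadding example_mem_padding(uint64_t size, bool prefetchable,
                                          bool is_64bit)
{
    ExamplePadding p = {0};

    p.res_type = EX_RES_MEMORY;
    /* Table 11: _MEM is 3 for prefetchable, 0 for non-prefetchable. */
    p.type_flags = (uint8_t)((prefetchable ? 3u : 0u) << 1);
    p.granularity = is_64bit ? 64 : 32;
    p.length = size;
    return p;
}
```

Without a 32-bit versus 64-bit hint in the capability, the is_64bit argument
has no well-defined source, which is the gap pointed out above.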

Thanks
Laszlo

>
>>
>> This is based on an earlier email from Alex to me:
>>

Re: [Qemu-devel] [PATCH for 2.11 v2 1/2] watchdog: wdt_aspeed: Add support for the reset width register

2017-08-09 Thread Andrew Jeffery


On Wed, Aug 9, 2017, at 18:28, Cédric Le Goater wrote:
> On 08/09/2017 08:28 AM, Andrew Jeffery wrote:
> > The reset width register controls how the pulse on the SoC's WDTRST{1,2}
> > pins behaves. A pulse is emitted if the external reset bit is set in
> > WDT_CTRL. On the AST2500 WDT_RESET_WIDTH can consume magic bit patterns
> > to configure push-pull/open-drain and active-high/active-low
> > behaviours and thus needs some special handling in the write path.
> > 
> > As some of the capabilities depend on the SoC version a silicon-rev
> > property is introduced, which is used to guard version-specific
> > behaviour.
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> One minor comment below. Nevertheless :
> 
> Reviewed-by: Cédric Le Goater 
> 
> > ---
> >  hw/watchdog/wdt_aspeed.c | 93 
> > +++-
> >  include/hw/watchdog/wdt_aspeed.h |  2 +
> >  2 files changed, 84 insertions(+), 11 deletions(-)
> > 
> > diff --git a/hw/watchdog/wdt_aspeed.c b/hw/watchdog/wdt_aspeed.c
> > index 8bbe579b6b66..22bce364d7b5 100644
> > --- a/hw/watchdog/wdt_aspeed.c
> > +++ b/hw/watchdog/wdt_aspeed.c
> > @@ -8,16 +8,19 @@
> >   */
> >  
> >  #include "qemu/osdep.h"
> > +
> > +#include "qapi/error.h"
> >  #include "qemu/log.h"
> > +#include "qemu/timer.h"
> >  #include "sysemu/watchdog.h"
> > +#include "hw/misc/aspeed_scu.h"
> >  #include "hw/sysbus.h"
> > -#include "qemu/timer.h"
> >  #include "hw/watchdog/wdt_aspeed.h"
> >  
> > -#define WDT_STATUS  (0x00 / 4)
> > -#define WDT_RELOAD_VALUE(0x04 / 4)
> > -#define WDT_RESTART (0x08 / 4)
> > -#define WDT_CTRL(0x0C / 4)
> > +#define WDT_STATUS  (0x00 / 4)
> > +#define WDT_RELOAD_VALUE(0x04 / 4)
> > +#define WDT_RESTART (0x08 / 4)
> > +#define WDT_CTRL(0x0C / 4)
> >  #define   WDT_CTRL_RESET_MODE_SOC   (0x00 << 5)
> >  #define   WDT_CTRL_RESET_MODE_FULL_CHIP (0x01 << 5)
> >  #define   WDT_CTRL_1MHZ_CLK BIT(4)
> > @@ -25,18 +28,41 @@
> >  #define   WDT_CTRL_WDT_INTR BIT(2)
> >  #define   WDT_CTRL_RESET_SYSTEM BIT(1)
> >  #define   WDT_CTRL_ENABLE   BIT(0)
> > +#define WDT_RESET_WIDTH (0x18 / 4)
> > +#define   WDT_RESET_WIDTH_ACTIVE_HIGH   BIT(31)
> > +#define WDT_POLARITY_MASK   (0xFF << 24)
> > +#define WDT_ACTIVE_HIGH_MAGIC   (0xA5 << 24)
> > +#define WDT_ACTIVE_LOW_MAGIC(0x5A << 24)
> > +#define   WDT_RESET_WIDTH_PUSH_PULL BIT(30)
> > +#define WDT_DRIVE_TYPE_MASK (0xFF << 24)
> > +#define WDT_PUSH_PULL_MAGIC (0xA8 << 24)
> > +#define WDT_OPEN_DRAIN_MAGIC(0x8A << 24)
> >  
> > -#define WDT_TIMEOUT_STATUS  (0x10 / 4)
> > -#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
> > -#define WDT_RESET_WDITH (0x18 / 4)
> > +#define WDT_TIMEOUT_STATUS  (0x10 / 4)
> > +#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
> >  
> > -#define WDT_RESTART_MAGIC   0x4755
> > +#define WDT_RESTART_MAGIC   0x4755
> >  
> >  static bool aspeed_wdt_is_enabled(const AspeedWDTState *s)
> >  {
> >  return s->regs[WDT_CTRL] & WDT_CTRL_ENABLE;
> >  }
> >  
> > +static bool is_ast2500(const AspeedWDTState *s)
> 
> I think we could use this routine in other controllers (scu, sdmc). 
> So maybe, in a follow-up patch, we could move it to aspeed_scu.h

Right, I figured we would move it when we came to need it elsewhere.

Thanks for the review.

Cheers,

Andrew
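
A minimal sketch of the magic-pattern handling described in the commit message,
using the values from the defines in the patch. example_reset_width_write() and
the EX_* names are hypothetical; treating the low 24 bits as the pulse width
that is simply copied from the written value is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mode bits and magic patterns, with values taken from the patch's defines. */
#define EX_ACTIVE_HIGH        (1u << 31)
#define EX_PUSH_PULL          (1u << 30)
#define EX_MAGIC_MASK         (0xFFu << 24)
#define EX_ACTIVE_HIGH_MAGIC  (0xA5u << 24)
#define EX_ACTIVE_LOW_MAGIC   (0x5Au << 24)
#define EX_PUSH_PULL_MAGIC    (0xA8u << 24)
#define EX_OPEN_DRAIN_MAGIC   (0x8Au << 24)

/*
 * Return the new register value after a guest write of 'data'.
 * On the AST2500 a magic top byte flips one mode bit; other SoCs ignore it.
 */
static uint32_t example_reset_width_write(uint32_t reg, uint32_t data,
                                          bool is_ast2500)
{
    uint32_t magic = data & EX_MAGIC_MASK;

    if (is_ast2500) {
        if (magic == EX_ACTIVE_HIGH_MAGIC) {
            reg |= EX_ACTIVE_HIGH;
        } else if (magic == EX_ACTIVE_LOW_MAGIC) {
            reg &= ~EX_ACTIVE_HIGH;
        } else if (magic == EX_PUSH_PULL_MAGIC) {
            reg |= EX_PUSH_PULL;
        } else if (magic == EX_OPEN_DRAIN_MAGIC) {
            reg &= ~EX_PUSH_PULL;
        }
    }

    /* Assumption: keep the mode bits, take the low 24 bits from the write. */
    return (reg & EX_MAGIC_MASK) | (data & ~EX_MAGIC_MASK);
}
```

The unsigned suffixes avoid shifting a signed int into its sign bit, which
plain (0xA5 << 24) would do.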

> 
> Thanks,
> 
> C. 
>  
> > +{
> > +switch (s->silicon_rev) {
> > +case AST2500_A0_SILICON_REV:
> > +case AST2500_A1_SILICON_REV:
> > +return true;
> > +case AST2400_A0_SILICON_REV:
> > +case AST2400_A1_SILICON_REV:
> > +default:
> > +break;
> > +}
> > +
> > +return false;
> > +}
> > +
> >  static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, unsigned size)
> >  {
> >  AspeedWDTState *s = ASPEED_WDT(opaque);
> > @@ -55,9 +81,10 @@ static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, unsigned size)
> >  return 0;
> >  case WDT_CTRL:
> >  return s->regs[WDT_CTRL];
> > +case WDT_RESET_WIDTH:
> > +return s->regs[WDT_RESET_WIDTH];
> >  case WDT_TIMEOUT_STATUS:
> >  case WDT_TIMEOUT_CLEAR:
> > -case WDT_RESET_WDITH:
> >  qemu_log_mask(LOG_UNIMP,
> >"%s: uninmplemented read at offset 0x%" HWADDR_PRIx 
> > "\n",
> >__func__, offset);
> > @@ -119,9 +146,27 @@ static void aspeed_wdt_write(void *opaque, hwaddr offset, uint64_t data,
> >  timer_del(s->timer);
> >  }
> >  break;
> > +case WDT_RESET_WIDTH:
> > +{
> > +uint32_t property = data & WDT_POLARITY_MASK;
> > +
> > +if (property && is_ast2500(s)) {
> > +if (property == WDT_ACTIVE_HIGH_MAGIC) {
> > +   

Re: [Qemu-devel] [PATCH 0/3] build configuration query tool and conditional (qemu-io)test skip

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 04:52:25PM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > On Tue, Aug 08, 2017 at 10:06:04AM +0200, Markus Armbruster wrote:
> >> Stefan Hajnoczi  writes:
> >> 
> >> > On Wed, Jul 26, 2017 at 02:24:02PM -0400, Cleber Rosa wrote:
> >> >> 
> >> >> 
> >> >> On 07/26/2017 01:58 PM, Stefan Hajnoczi wrote:
> >> >> > On Tue, Jul 25, 2017 at 12:16:13PM -0400, Cleber Rosa wrote:
> >> >> >> On 07/25/2017 11:49 AM, Stefan Hajnoczi wrote:
> >> >> >>> On Fri, Jul 21, 2017 at 10:21:24AM -0400, Cleber Rosa wrote:
> >> >>  On 07/21/2017 10:01 AM, Daniel P. Berrange wrote:
> >> >> > On Fri, Jul 21, 2017 at 01:33:25PM +0100, Stefan Hajnoczi wrote:
> >> >> >> On Thu, Jul 20, 2017 at 11:47:27PM -0400, Cleber Rosa wrote:
> >> >>  Without the static capabilities defined, the dynamic check would be
> >> >>  influenced by the run time environment.  It would really mean "qemu-io
> >> >>  running on this environment (filesystem?) can do native aio".  Again,
> >> >>  that's not the best type of information to depend on when writing tests.
> >> >> >>>
> >> >> >>> Can you explain this more?
> >> >> >>>
> >> >> >>> It seems logical to me that if qemu-io in this environment cannot do
> >> >> >>> aio=native then we must skip those tests.
> >> >> >>>
> >> >> >>> Stefan
> >> >> >>>
> >> >> >>
> >> >> >> OK, let's abstract a bit more.  Let's take this part of your statement:
> >> >> >>
> >> >> >>  "if qemu-io in this environment cannot do aio=native"
> >> >> >>
> >> >> >> Let's call that a feature check.  Depending on how the *feature check*
> >> >> >> is written, a negative result may hide a test failure, because it would
> >> >> >> now be skipped.
> >> >> > 
> >> >> > You are saying a pass->skip transition can hide a failure but ./check
> >> >> > tracks skipped tests.  See tests/qemu-iotests/check.log for a
> >> >> > pass/fail/skip history.
> >> >> > 
> >> >> 
> >> >> You're not focusing on the problem here.  The problem is that a test
> >> >> that *was not* supposed to be skipped, would be skipped.
> >> >
> >> > As Daniel Berrange mentioned, ./configure has the same problem.  You
> >> > cannot just run it blindly because it silently disables features.
> >> >
> >> > What I'm saying is that in addition to watching ./configure closely, you
> >> > also need to look at the skipped tests that ./check reports.  If you do
> >> > that then you can be sure the expected set of tests is passing.
> >> >
> >> >> > It is the job of the CI system to flag up pass/fail/skip transitions.
> >> >> > You're no worse off using feature tests.
> >> >> > 
> >> >> > Stefan
> >> >> > 
> >> >> 
> >> >> What I'm trying to help us achieve here is a reliable and predictable
> >> >> way for the same test job execution to be comparable across
> >> >> environments.  From the individual developer workstation, CI, QA etc.
> >> >
> >> > 1. Use ./configure --enable-foo options for all desired features.
> >> > 2. Run the ./check command-line and there should be no unexpected skips
> >> >like this:
> >> >
> >> > 087 [not run] missing aio=native support
> >> >
> >> > To me this seems to address the problem.
> >> >
> >> > I have mentioned the issues with the build flags solution: it creates a
> >> > dependency on the build environment and forces test feature checks to
> >> > duplicate build dependency logic.  This is why I think feature tests are
> >> > a cleaner solution.
> >> 
> >> I suspect the actual problem here is that the qemu-iotests harness is
> >> not integrated in the build process.  For other tests, we specify the
> >> tests to run in a Makefile, and use the same configuration mechanism as
> >> for building stuff conditionally.
> >
> > The ability to run tests against QEMU binaries without a build
> > environment is useful though.  It would still be possible to symlink to
> > external binaries but then the build feature information could be
> > incorrect.
> 
> I don't dispute it's useful.  "make check" doesn't do it, though.
> 
> I think we can either have a standalone test suite (introspects the
> binaries under test to figure out what to test), or an integrated test
> suite (tests exactly what is configured).  "make check" is the latter.
> qemu-iotests is kind-of-sort-of the former.

Yes, originally qemu-iotests was a separate repo.  It was moved into
qemu.git so that it's easier to include tests in a patch series.  But as
a result of this history it has the ability to run against any QEMU.

Actually I'm not sure how important that ability is anymore.  Some
testing teams use qemu-iotests against QEMU binaries from elsewhere, so
we'd inconvenience them by tying it to a build.  But they could update
their process to get the QEMU tree that matches their binaries, if
necessary.

Stefan




Re: [Qemu-devel] [PATCH for 2.11 v2 2/2] ARM: aspeed_soc: Propagate silicon-rev to watchdog

2017-08-09 Thread Cédric Le Goater
On 08/09/2017 08:28 AM, Andrew Jeffery wrote:
> This is required to configure differences in behaviour between the
> AST2400 and AST2500 watchdog IPs.
> 
> Signed-off-by: Andrew Jeffery 

Reviewed-by: Cédric Le Goater 

> ---
>  hw/arm/aspeed_soc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
> index 3034849c80bf..79804e1ee652 100644
> --- a/hw/arm/aspeed_soc.c
> +++ b/hw/arm/aspeed_soc.c
> @@ -183,6 +183,8 @@ static void aspeed_soc_init(Object *obj)
>  object_initialize(>wdt[i], sizeof(s->wdt[i]), TYPE_ASPEED_WDT);
>  object_property_add_child(obj, "wdt[*]", OBJECT(>wdt[i]), NULL);
>  qdev_set_parent_bus(DEVICE(>wdt[i]), sysbus_get_default());
> +qdev_prop_set_uint32(DEVICE(>wdt[i]), "silicon-rev",
> +sc->info->silicon_rev);
>  }
>  
>  object_initialize(>ftgmac100, sizeof(s->ftgmac100), TYPE_FTGMAC100);
> 




Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Thomas Huth
On 09.08.2017 10:27, Cornelia Huck wrote:
> On Wed, 9 Aug 2017 10:23:04 +0200
> Thomas Huth  wrote:
> 
>> On 09.08.2017 09:17, Cornelia Huck wrote:
>>> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
>>> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
>>> device.
>>>
>>> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
>>> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
>>>
>>> Signed-off-by: Cornelia Huck 
>>> ---
>>>
>>> Changes v1->v2: drop extraneous spaces, fix build on cris
>>>
>>> ---
>>>  default-configs/s390x-softmmu.mak | 1 +
>>>  fsdev/Makefile.objs   | 9 +++--
>>>  hw/Makefile.objs  | 2 +-
>>>  3 files changed, 5 insertions(+), 7 deletions(-)
[...]
>>
>> Patch should be fine now, I think...
>>
>> But thinking about this again, I wonder whether it would be enough to
>> simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
>> sufficient to assert that there is also at least one kind of virtio
>> transport available, right?
>> Otherwise this will look really horrible as soon as somebody also tries
>> to add support for virtio-mmio here later ;-)
> 
> Do all virtio transports have support for 9p, though? I thought it was
> only virtio-pci and virtio-ccw...

While virtio-pci and virtio-ccw seem to require separate dedicated
devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
virtio-mmio seems to work differently. As far as I can see, there are no
dedicated virtio-xxx-mmio devices in the code at all. According to
https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
you simply have to use virtio-xxx-device here instead. And a
virtio-9p-device is available. So theoretically, the 9p code should work
with virtio-mmio, too, or is there a problem that I did not see yet?

Anyway, we likely should not blindly enable this, so unless somebody has
a setup to test it, we should go with your current patch instead, I think.

 Thomas



Re: [Qemu-devel] [PATCH for-2.11 v2 4/5] qmp-shell: Accept QMP command as argument

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 05:39:34PM -0300, Eduardo Habkost wrote:
> This is useful for testing QMP commands in scripts.
> 
> Example usage, combined with 'jq' for filtering the results:
> 
>   $ ./scripts/qmp/qmp-shell /tmp/qmp qom-list path=/ | jq -r .return[].name
>   machine
>   type
>   chardevs
>   backend
>   $
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> * Rewritten using optparse module
> ---
>  scripts/qmp/qmp-shell | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH 1/2] tests/boot-sector: Do not overwrite the x86 buffer on other architectures

2017-08-09 Thread Thomas Huth
On 09.08.2017 11:05, Cornelia Huck wrote:
> On Wed,  9 Aug 2017 06:59:37 +0200
> Thomas Huth  wrote:
> 
>> Re-using the boot_sector code buffer from x86 for other architectures
>> is not very nice, especially if we add more architectures later. It's
>> also ugly that the test uses a huge pre-initialized array - the size
>> of the executable is very huge due to this array. So let's use a
>> separate buffer for each architecture instead, allocated from the heap,
>> so that we really just use the memory that we need.
>>
>> Suggested-by: Michael Tsirkin 
>> Signed-off-by: Thomas Huth 
>> ---
>>  tests/boot-sector.c | 37 -
>>  1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/tests/boot-sector.c b/tests/boot-sector.c
>> index e3880f4..4ea1373 100644
>> --- a/tests/boot-sector.c
>> +++ b/tests/boot-sector.c
>> @@ -21,13 +21,12 @@
>>  #define SIGNATURE 0xdead
>>  #define SIGNATURE_OFFSET 0x10
>>  #define BOOT_SECTOR_ADDRESS 0x7c00
>> +#define SIGNATURE_ADDR (BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET)
> 
> Do you want to use this new #define in boot_sector_test() as well?

Yes, sounds like a good idea.

>>  
>> -/* Boot sector code: write SIGNATURE into memory,
>> +/* x86 boot sector code: write SIGNATURE into memory,
>>   * then halt.
>> - * Q35 machine requires a minimum 0x7e000 bytes disk.
>> - * (bug or feature?)
>>   */
>> -static uint8_t boot_sector[0x7e000] = {
>> +static uint8_t x86_boot_sector[512] = {
>>  /* The first sector will be placed at RAM address 7C00, and
>>   * the BIOS transfers control to 7C00
>>   */
>> @@ -50,8 +49,8 @@ static uint8_t boot_sector[0x7e000] = {
>>  [0x07] = HIGH(SIGNATURE),
>>  /* 7c08:  mov %ax,0x7c10 */
>>  [0x08] = 0xa3,
>> -[0x09] = LOW(BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET),
>> -[0x0a] = HIGH(BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET),
>> +[0x09] = LOW(SIGNATURE_ADDR),
>> +[0x0a] = HIGH(SIGNATURE_ADDR),
>>  
>>  /* 7c0b cli */
>>  [0x0b] = 0xfa,
>> @@ -72,7 +71,9 @@ static uint8_t boot_sector[0x7e000] = {
>>  int boot_sector_init(char *fname)
>>  {
>>  int fd, ret;
>> -size_t len = sizeof boot_sector;
>> +size_t len;
>> +char *boot_code;
>> +const char *arch = qtest_get_arch();
>>  
>>  fd = mkstemp(fname);
>>  if (fd < 0) {
>> @@ -80,16 +81,26 @@ int boot_sector_init(char *fname)
>>  return 1;
>>  }
>>  
>> -/* For Open Firmware based system, we can use a Forth script instead */
>> -if (strcmp(qtest_get_arch(), "ppc64") == 0) {
>> -len = sprintf((char *)boot_sector, "\\ Bootscript\n%x %x c! %x %x c!\n",
>> -LOW(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET,
>> -HIGH(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET + 1);
>> +if (g_str_equal(arch, "i386") || g_str_equal(arch, "x86_64")) {
>> +/* Q35 requires a minimum 0x7e000 bytes disk (bug or feature?) */
>> +len = 0x7e000;
> 
> Use the maximum of (0x7e000, sizeof(x86_boot_sector))? (Not that it is
> likely that the boot sector will ever grow, but I think it is cleaner.)

Sounds like a little bit too much sanity checking to me, but ok, I
can add it.

>> +boot_code = g_malloc(len);
> 
> Would g_malloc0() be better?

Good idea, the test is likely more predictable if we don't have random
data in the file here (it should not really matter, but let's better be
safe than sorry).

>> +memcpy(boot_code, x86_boot_sector, sizeof x86_boot_sector);
> 
> sizeof(x86_boot_sector)?

The original code uses sizeof without parentheses, so I think we should
stay with that coding style.

>> +} else if (g_str_equal(arch, "ppc64")) {
>> +/* For Open Firmware based system, use a Forth script */
>> +boot_code = g_strdup_printf("\\ Bootscript\n%x %x c! %x %x c!\n",
>> +LOW(SIGNATURE), SIGNATURE_ADDR,
>> +HIGH(SIGNATURE), SIGNATURE_ADDR + 1);
>> +len = strlen(boot_code);
>> +} else {
>> +g_assert_not_reached();
>>  }
>>  
>> -ret = write(fd, boot_sector, len);
>> +ret = write(fd, boot_code, len);
>>  close(fd);
>>  
>> +g_free(boot_code);
>> +
>>  if (ret != len) {
>>  fprintf(stderr, "Could not write \"%s\"", fname);
>>  return 1;
> 
> This makes the code much nicer :)

Thanks for the review!

I'll wait for some more feedback, then send a v2...

 Thomas
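
The two review suggestions in this mail (size the buffer as the larger of 0x7e000 and the boot code, and zero-fill it) can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: make_boot_image() and MIN_DISK_SIZE are hypothetical names that do not appear in the patch, and calloc() stands in for glib's g_malloc0() so that the sketch builds without glib.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Minimum disk size quoted in the patch comment for Q35. */
#define MIN_DISK_SIZE 0x7e000

/*
 * Hypothetical helper (not part of the patch): build a zero-filled disk
 * image that is at least MIN_DISK_SIZE bytes long and starts with `code`.
 * On success the image length is stored in *len_out; the caller frees
 * the returned buffer.
 */
static uint8_t *make_boot_image(const uint8_t *code, size_t code_len,
                                size_t *len_out)
{
    /* Take the maximum, so a larger boot sector can never overflow. */
    size_t len = code_len > MIN_DISK_SIZE ? code_len : MIN_DISK_SIZE;
    uint8_t *image;

    assert(code != NULL && len_out != NULL);

    /* calloc() zero-fills, so the padding after the code is predictable. */
    image = calloc(1, len);
    if (image == NULL) {
        return NULL;
    }
    memcpy(image, code, code_len);
    *len_out = len;
    return image;
}
```

With a 512-byte boot sector the image is 0x7e000 bytes long, and every byte after the copied code is zero.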



Re: [Qemu-devel] [PATCH 1/2] tests/boot-sector: Do not overwrite the x86 buffer on other architectures

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 11:18:33 +0200
Thomas Huth  wrote:

> On 09.08.2017 11:05, Cornelia Huck wrote:
> > On Wed,  9 Aug 2017 06:59:37 +0200
> > Thomas Huth  wrote:

> >> @@ -80,16 +81,26 @@ int boot_sector_init(char *fname)
> >>  return 1;
> >>  }
> >>  
> >> -/* For Open Firmware based system, we can use a Forth script instead */
> >> -if (strcmp(qtest_get_arch(), "ppc64") == 0) {
> >> -len = sprintf((char *)boot_sector, "\\ Bootscript\n%x %x c! %x %x c!\n",
> >> -LOW(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET,
> >> -HIGH(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET + 1);
> >> +if (g_str_equal(arch, "i386") || g_str_equal(arch, "x86_64")) {
> >> +/* Q35 requires a minimum 0x7e000 bytes disk (bug or feature?) */
> >> +len = 0x7e000;  
> > 
> > Use the maximum of (0x7e000, sizeof(x86_boot_sector))? (Not that it is
> > likely that the boot sector will ever grow, but I think it is cleaner.)  
> 
> Sounds like a little bit too much sanity checking to me, but ok, I
> can add it.

It's probably a bit paranoid, but I don't think it hurts. 

> 
> >> +boot_code = g_malloc(len);  
> > 
> > Would g_malloc0() be better?  
> 
> Good idea, the test is likely more predictable if we don't have random
> data in the file here (it should not really matter, but let's better be
> safe than sorry).
> 
> >> +memcpy(boot_code, x86_boot_sector, sizeof x86_boot_sector);  
> > 
> > sizeof(x86_boot_sector)?  
> 
> The original code uses sizeof without parentheses, so I think we should
> stay with that coding style.

After your patch, the original sizeof callers are gone, no? (I really
prefer the sizeof() variant...)
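
For readers following the ppc64 branch quoted in this thread, here is a minimal sketch of what the Forth bootscript format string expands to. The LOW() and HIGH() definitions are assumptions (the quoted hunks use them but do not show them), format_bootscript() is a hypothetical name, and snprintf() is used in place of the patch's g_strdup_printf() so that the sketch needs no glib.

```c
#include <assert.h>
#include <stdio.h>

#define SIGNATURE 0xdead
#define SIGNATURE_OFFSET 0x10
#define BOOT_SECTOR_ADDRESS 0x7c00
#define SIGNATURE_ADDR (BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET)

/* Assumed helpers: low and high byte of a 16-bit value. */
#define LOW(x)  ((x) & 0xff)
#define HIGH(x) ((x) >> 8)

/*
 * Hypothetical helper: write the Forth script that stores the two
 * signature bytes at SIGNATURE_ADDR and SIGNATURE_ADDR + 1.
 * Returns the number of characters the full script needs.
 */
static int format_bootscript(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "\\ Bootscript\n%x %x c! %x %x c!\n",
                    (unsigned)LOW(SIGNATURE), (unsigned)SIGNATURE_ADDR,
                    (unsigned)HIGH(SIGNATURE), (unsigned)(SIGNATURE_ADDR + 1));
}
```

Under these assumptions the second line of the script reads "ad 7c10 c! de 7c11 c!", that is, one byte store per signature byte.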



Re: [Qemu-devel] [Qemu-block] [PATCH 1/4] IDE: Do not flush empty CDROM drives

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 01:57:08PM -0400, John Snow wrote:
> The block backend changed in a way that flushing empty CDROM drives
> is now an error. Amend IDE to avoid doing so until the root problem
> can be addressed for 2.11.
> 
> Reported-by: Kieron Shorrock 
> Signed-off-by: John Snow 
> ---
>  hw/ide/core.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 0b48b64..6cbca43 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -1053,17 +1053,21 @@ static void ide_flush_cb(void *opaque, int ret)
>  ide_set_irq(s->bus);
>  }
>  
> -static void ide_flush_cache(IDEState *s)
> +static bool ide_flush_cache(IDEState *s)

Previously this function invoked ide_flush_cb() to complete the request
and raise an IRQ.  Now it may return true instead of invoking
ide_flush_cb().  It's no longer a helper function, it's more like an IDE
command handler.

cmd_set_features() must be updated:

  case 0x82: /* write cache disable */
  blk_set_enable_write_cache(s->blk, false);
  identify_data = (uint16_t *)s->identify_data;
  put_le16(identify_data + 85, (1 << 14) | 1);
  ide_flush_cache(s);
  return false;  <--- should be "return ide_flush_cache(s)"

To make the code clearer I suggest deleting ide_flush_cache() and
calling cmd_flush_cache() directly instead.  That way it's obvious that
this is an IDE command handler and it may return true to indicate that
the command completed immediately.

>  {
>  if (s->blk == NULL) {
>  ide_flush_cb(s, 0);
> -return;
> +return false;
> +} else if (!blk_bs(s->blk)) {
> +/* Nothing to flush */

Please move information from the commit description into this comment if
you respin.  Someone looking at the code might think this is a nop that
is safe to remove.  Once blk_*() is fixed this code path can be removed
again.
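
The return-value convention discussed in this review can be sketched with a toy model. Everything below is an assumption made for illustration: the Toy* types and toy_* functions are not QEMU code, and the model only captures the three cases described above (no backend, empty drive, real flush) plus the point that a caller has to propagate the handler's result rather than return false unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the IDE state; not QEMU's IDEState. */
typedef struct ToyDrive {
    bool has_backend;    /* models s->blk != NULL */
    bool has_medium;     /* models blk_bs(s->blk) != NULL */
    int  callbacks_run;  /* completions signalled via the flush callback */
    int  flushes_issued; /* asynchronous flushes started */
} ToyDrive;

/* Models the completion callback that finishes the request. */
static void toy_flush_cb(ToyDrive *d)
{
    d->callbacks_run++;
}

/*
 * Command handler: returns true when the command completed immediately
 * and the caller must finish it; returns false when the completion
 * callback has finished it or will finish it later.
 */
static bool toy_cmd_flush_cache(ToyDrive *d)
{
    assert(d != NULL);
    if (!d->has_backend) {
        toy_flush_cb(d);        /* callback completes the request */
        return false;
    }
    if (!d->has_medium) {
        return true;            /* empty drive: nothing to flush */
    }
    d->flushes_issued++;        /* callback would run on completion */
    return false;
}

/* Caller that propagates the result instead of returning false. */
static bool toy_cmd_write_cache_disable(ToyDrive *d)
{
    return toy_cmd_flush_cache(d);
}
```

If the caller returned false for the empty-drive case, nothing would ever complete the command in this model, which is the problem the review points out for cmd_set_features().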




Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Cornelia Huck
On Wed, 9 Aug 2017 10:47:18 +0100
Peter Maydell  wrote:

> On 9 August 2017 at 10:07, Thomas Huth  wrote:
> > While virtio-pci and virtio-ccw seem to require separate dedicated
> > devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
> > virtio-mmio seems to work differently. As far as I can see, there are no
> > dedicated virtio-xxx-mmio devices in the code at all. According to
> > https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
> > you simply have to use virtio-xxx-device here instead. And a
> > virtio-9p-device is available. So theoretically, the 9p code should work
> > with virtio-mmio, too, or is there a problem that I did not see yet?
> >
> > Anyway, we likely should not blindly enable this, so unless somebody has
> > a setup to test it, we should go with your current patch instead, I think.  
> 
> As you say, we already compile the virtio-9p-device that can
> plug into any virtio transport. So why not just build it
> whenever virtio of any form is enabled? Having it only
> build if PCI is also enabled seems very odd: the backend
> should not care at all about what transport it is using.

Given the previous discussions, I think just dropping the PCI
dependency is indeed the way to go. I'll send a v3.



[Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2017-08-09 Thread Changpeng Liu
This commit introduces a vhost-user-blk backend device that uses a UNIX
domain socket to communicate with QEMU. The vhost-user-blk sample
application should be used with the QEMU vhost-user-blk-pci device.

To use it, compile with:
make vhost-user-blk

and start like this:
vhost-user-blk -b /dev/sdb -s /path/vhost.socket

Signed-off-by: Changpeng Liu 
---
 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   2 +
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 735 
 5 files changed, 742 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c

diff --git a/.gitignore b/.gitignore
index cf65316..dbe5c13 100644
--- a/.gitignore
+++ b/.gitignore
@@ -51,6 +51,7 @@
 /module_block.h
 /vscclient
 /vhost-user-scsi
+/vhost-user-blk
 /fsdev/virtfs-proxy-helper
 *.[1-9]
 *.a
diff --git a/Makefile b/Makefile
index 97a58a0..e68e339 100644
--- a/Makefile
+++ b/Makefile
@@ -270,6 +270,7 @@ dummy := $(call unnest-vars,, \
 ivshmem-server-obj-y \
 libvhost-user-obj-y \
 vhost-user-scsi-obj-y \
+vhost-user-blk-obj-y \
 qga-vss-dll-obj-y \
 block-obj-y \
 block-obj-m \
@@ -478,6 +479,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
 endif
 vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y)
$(call LINK, $^)
+vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y)
+   $(call LINK, $^)
 
 module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
diff --git a/Makefile.objs b/Makefile.objs
index 24a4ea0..6b81548 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -114,6 +114,8 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
 vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
 vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
 vhost-user-scsi-obj-y += contrib/libvhost-user/libvhost-user.o
+vhost-user-blk-obj-y = contrib/vhost-user-blk/
+vhost-user-blk-obj-y += contrib/libvhost-user/libvhost-user.o
 
 ##
 trace-events-subdirs =
diff --git a/contrib/vhost-user-blk/Makefile.objs b/contrib/vhost-user-blk/Makefile.objs
new file mode 100644
index 000..72e2cdc
--- /dev/null
+++ b/contrib/vhost-user-blk/Makefile.objs
@@ -0,0 +1 @@
+vhost-user-blk-obj-y = vhost-user-blk.o
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
new file mode 100644
index 000..9b90164
--- /dev/null
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -0,0 +1,735 @@
+/*
+ * vhost-user-blk sample application
+ *
+ * Copyright IBM, Corp. 2007
+ * Copyright (c) 2016 Nutanix Inc. All rights reserved.
+ * Copyright (c) 2017 Intel Corporation. All rights reserved.
+ *
+ * Author:
+ *  Anthony Liguori 
+ *  Felipe Franciosi 
+ *  Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 only.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/virtio-blk.h"
+#include "contrib/libvhost-user/libvhost-user.h"
+
+#include 
+
+/* Small compat shim from glib 2.32 */
+#ifndef G_SOURCE_CONTINUE
+#define G_SOURCE_CONTINUE TRUE
+#endif
+#ifndef G_SOURCE_REMOVE
+#define G_SOURCE_REMOVE FALSE
+#endif
+
+/* And this is the final byte of request*/
+#define VIRTIO_BLK_S_OK 0
+#define VIRTIO_BLK_S_IOERR 1
+#define VIRTIO_BLK_S_UNSUPP 2
+
+typedef struct vhost_blk_dev {
+VuDev vu_dev;
+int server_sock;
+int blk_fd;
+struct virtio_blk_config blkcfg;
+char *blk_name;
+GMainLoop *loop;
+GTree *fdmap;   /* fd -> gsource context id */
+} vhost_blk_dev_t;
+
+typedef struct vhost_blk_request {
+VuVirtqElement *elem;
+int64_t sector_num;
+size_t size;
+struct virtio_blk_inhdr *in;
+struct virtio_blk_outhdr *out;
+vhost_blk_dev_t *vdev_blk;
+struct VuVirtq *vq;
+} vhost_blk_request_t;
+
+/**  refer util/iov.c  **/
+static size_t vu_blk_iov_size(const struct iovec *iov,
+  const unsigned int iov_cnt)
+{
+size_t len;
+unsigned int i;
+
+len = 0;
+for (i = 0; i < iov_cnt; i++) {
+len += iov[i].iov_len;
+}
+return len;
+}
+
+/** glib event loop integration for libvhost-user and misc callbacks **/
+
+QEMU_BUILD_BUG_ON((int)G_IO_IN != (int)VU_WATCH_IN);
+QEMU_BUILD_BUG_ON((int)G_IO_OUT != (int)VU_WATCH_OUT);
+QEMU_BUILD_BUG_ON((int)G_IO_PRI != (int)VU_WATCH_PRI);
+QEMU_BUILD_BUG_ON((int)G_IO_ERR != (int)VU_WATCH_ERR);
+QEMU_BUILD_BUG_ON((int)G_IO_HUP != (int)VU_WATCH_HUP);
+
+typedef struct vu_blk_gsrc {
+GSource parent;
+vhost_blk_dev_t *vdev_blk;
+GPollFD gfd;
+

[Qemu-devel] [PATCH v4 1/7] block: move ThrottleGroup membership to ThrottleGroupMember

2017-08-09 Thread Manos Pitsidianakis
This commit eliminates the 1:1 relationship between BlockBackend and
throttle group state.  Users will be able to create multiple throttle
nodes, each with its own throttle group state, in the future.  The
throttle group state cannot be per-BlockBackend anymore, it must be
per-throttle node. This is done by gathering ThrottleGroup membership
details from BlockBackendPublic into ThrottleGroupMember and refactoring
existing code to use the structure.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
---
 block/block-backend.c   |  66 +
 block/qapi.c|   8 +-
 block/throttle-groups.c | 288 
 blockdev.c  |   4 +-
 include/block/throttle-groups.h |  39 +-
 include/sysemu/block-backend.h  |  20 +--
 tests/test-throttle.c   |  53 
 7 files changed, 252 insertions(+), 226 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 968438c149..bde6948d0e 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -215,9 +215,9 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
 blk->shared_perm = shared_perm;
 blk_set_enable_write_cache(blk, true);
 
-qemu_co_mutex_init(&blk->public.throttled_reqs_lock);
-qemu_co_queue_init(&blk->public.throttled_reqs[0]);
-qemu_co_queue_init(&blk->public.throttled_reqs[1]);
+qemu_co_mutex_init(&blk->public.throttle_group_member.throttled_reqs_lock);
+qemu_co_queue_init(&blk->public.throttle_group_member.throttled_reqs[0]);
+qemu_co_queue_init(&blk->public.throttle_group_member.throttled_reqs[1]);
 block_acct_init(&blk->stats);
 
 notifier_list_init(&blk->remove_bs_notifiers);
@@ -285,7 +285,7 @@ static void blk_delete(BlockBackend *blk)
 assert(!blk->refcnt);
 assert(!blk->name);
 assert(!blk->dev);
-if (blk->public.throttle_state) {
+if (blk->public.throttle_group_member.throttle_state) {
 blk_io_limits_disable(blk);
 }
 if (blk->root) {
@@ -596,9 +596,12 @@ BlockBackend *blk_by_public(BlockBackendPublic *public)
  */
 void blk_remove_bs(BlockBackend *blk)
 {
+ThrottleTimers *tt;
+
 notifier_list_notify(&blk->remove_bs_notifiers, blk);
-if (blk->public.throttle_state) {
-throttle_timers_detach_aio_context(&blk->public.throttle_timers);
+if (blk->public.throttle_group_member.throttle_state) {
+tt = &blk->public.throttle_group_member.throttle_timers;
+throttle_timers_detach_aio_context(tt);
 }
 
 blk_update_root_state(blk);
@@ -620,9 +623,10 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
 bdrv_ref(bs);
 
 notifier_list_notify(&blk->insert_bs_notifiers, blk);
-if (blk->public.throttle_state) {
+if (blk->public.throttle_group_member.throttle_state) {
 throttle_timers_attach_aio_context(
-&blk->public.throttle_timers, bdrv_get_aio_context(bs));
+&blk->public.throttle_group_member.throttle_timers,
+bdrv_get_aio_context(bs));
 }
 
 return 0;
@@ -984,8 +988,9 @@ int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
 bdrv_inc_in_flight(bs);
 
 /* throttling disk I/O */
-if (blk->public.throttle_state) {
-throttle_group_co_io_limits_intercept(blk, bytes, false);
+if (blk->public.throttle_group_member.throttle_state) {
+throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
+bytes, false);
 }
 
 ret = bdrv_co_preadv(blk->root, offset, bytes, qiov, flags);
@@ -1008,10 +1013,10 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
 }
 
 bdrv_inc_in_flight(bs);
-
 /* throttling disk I/O */
-if (blk->public.throttle_state) {
-throttle_group_co_io_limits_intercept(blk, bytes, true);
+if (blk->public.throttle_group_member.throttle_state) {
+throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
+bytes, true);
 }
 
 if (!blk->enable_write_cache) {
@@ -1680,15 +1685,17 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
 void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
 {
 BlockDriverState *bs = blk_bs(blk);
+ThrottleTimers *tt;
 
 if (bs) {
-if (blk->public.throttle_state) {
-throttle_timers_detach_aio_context(&blk->public.throttle_timers);
+if (blk->public.throttle_group_member.throttle_state) {
+tt = &blk->public.throttle_group_member.throttle_timers;
+throttle_timers_detach_aio_context(tt);
 }
 bdrv_set_aio_context(bs, new_context);
-if (blk->public.throttle_state) {
-throttle_timers_attach_aio_context(&blk->public.throttle_timers,
-   new_context);
+if (blk->public.throttle_group_member.throttle_state) {
+tt = 

[Qemu-devel] [PATCH v4 6/7] block: add BlockDevOptionsThrottle to QAPI

2017-08-09 Thread Manos Pitsidianakis
This is needed to configure throttle filter driver nodes with QAPI.

Signed-off-by: Manos Pitsidianakis 
---
 qapi/block-core.json | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0bdc69aa5f..12fd749a94 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -,6 +,7 @@
 # Drivers that are supported in block device operations.
 #
 # @vxhs: Since 2.10
+# @throttle: Since 2.11
 #
 # Since: 2.9
 ##
@@ -2231,7 +2232,7 @@
 'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
 'null-aio', 'null-co', 'parallels', 'qcow', 'qcow2', 'qed',
 'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
-'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
+'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
 
 ##
 # @BlockdevOptionsFile:
@@ -3095,6 +3096,24 @@
 '*tls-creds': 'str' } }
 
 ##
+# @BlockdevOptionsThrottle:
+#
+# Driver specific block device options for the throttle driver
+#
+# @throttle-group:   the name of the throttle-group object to use. It will be
+#created if it doesn't already exist. If not specified, an
+#anonymous group will be created, which cannot be
+#referenced by other throttle nodes.
+# @file: reference to or definition of the data source block device
+# @limits:   ThrottleLimits options
+# Since: 2.11
+##
+{ 'struct': 'BlockdevOptionsThrottle',
+  'data': { '*throttle-group': 'str',
+'file' : 'BlockdevRef',
+'*limits' : 'ThrottleLimits'
+ } }
+##
 # @BlockdevOptions:
 #
 # Options for creating a block device.  Many options are available for all
@@ -3155,6 +3174,7 @@
   'replication':'BlockdevOptionsReplication',
   'sheepdog':   'BlockdevOptionsSheepdog',
   'ssh':'BlockdevOptionsSsh',
+  'throttle':   'BlockdevOptionsThrottle',
   'vdi':'BlockdevOptionsGenericFormat',
   'vhdx':   'BlockdevOptionsGenericFormat',
   'vmdk':   'BlockdevOptionsGenericCOWFormat',
-- 
2.11.0




[Qemu-devel] [PATCH v4 7/7] block: add throttle block filter driver interface tests

2017-08-09 Thread Manos Pitsidianakis
Signed-off-by: Manos Pitsidianakis 
---
 tests/qemu-iotests/184 | 310 +
 tests/qemu-iotests/184.out | 422 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 733 insertions(+)
 create mode 100755 tests/qemu-iotests/184
 create mode 100644 tests/qemu-iotests/184.out

diff --git a/tests/qemu-iotests/184 b/tests/qemu-iotests/184
new file mode 100755
index 00..5c11d1123d
--- /dev/null
+++ b/tests/qemu-iotests/184
@@ -0,0 +1,310 @@
+#!/bin/bash
+#
+# Test I/O throttle block filter driver interface
+#
+# Copyright (C) 2017 Manos Pitsidianakis
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner="Manos Pitsidianakis"
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@" | _filter_imgfmt
+$QEMU -nographic -qmp-pretty stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qemu | _filter_qmp\
+  | _filter_qemu_io | _filter_generated_node_ids
+}
+
+_make_test_img 64M
+test_throttle=$($QEMU_IMG --help|grep throttle)
+[ "$test_throttle" = "" ] && _supported_fmt throttle
+
+echo
+echo "== checking interface =="
+
+run_qemu <

[Qemu-devel] [PATCH v4 4/7] block: convert ThrottleGroup to object with QOM

2017-08-09 Thread Manos Pitsidianakis
ThrottleGroup is converted to an object. This will allow the future
throttle block filter driver easy creation and configuration of throttle
groups in QMP and the CLI.

A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.

ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.
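
The "individual properties are frozen after initialization" rule described above can be illustrated with a small toy model. All names here (ToyLimits, ToyGroup, toy_*) are hypothetical and are not QEMU code, and the validity rule used (a total limit cannot be combined with a read limit) is only an assumed example of a forbidden combination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy limits with just two fields; real throttle limits have many more. */
typedef struct ToyLimits {
    unsigned iops_total;
    unsigned iops_read;
} ToyLimits;

typedef struct ToyGroup {
    ToyLimits limits;
    bool is_initialized;   /* set once object creation is complete */
} ToyGroup;

/* Assumed example rule: total and read limits are mutually exclusive. */
static bool toy_limits_valid(const ToyLimits *l)
{
    return !(l->iops_total != 0 && l->iops_read != 0);
}

/* Per-property setters: accepted only before initialization completes. */
static bool toy_set_iops_total(ToyGroup *g, unsigned value)
{
    assert(g != NULL);
    if (g->is_initialized) {
        return false;
    }
    g->limits.iops_total = value;
    return true;
}

static bool toy_set_iops_read(ToyGroup *g, unsigned value)
{
    assert(g != NULL);
    if (g->is_initialized) {
        return false;
    }
    g->limits.iops_read = value;
    return true;
}

/* Completion step: validate the combination once, then freeze. */
static bool toy_complete(ToyGroup *g)
{
    assert(g != NULL);
    if (!toy_limits_valid(&g->limits)) {
        return false;
    }
    g->is_initialized = true;
    return true;
}

/* Whole-struct update: validated and applied as one transaction. */
static bool toy_set_limits(ToyGroup *g, const ToyLimits *new_limits)
{
    assert(g != NULL && new_limits != NULL);
    if (!toy_limits_valid(new_limits)) {
        return false;
    }
    g->limits = *new_limits;
    return true;
}
```

The design point is that a per-field setter cannot see the final combination, whereas the whole-struct update can validate everything before anything changes.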

ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct.  It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:

{ "execute": "object-add",
  "arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
  "limits": {
  "iops-total": 100
  }
}
  }
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}

This also means a group's configuration can be fetched with qom-get.

ThrottleGroups can be anonymous, which means they can't be accessed by
other users, i.e. they will always be units instead of groups (because
they have only one ThrottleGroupMember).

Signed-off-by: Manos Pitsidianakis 
---
v4 major changes:
  removed throttle_groups_lock
  changed reference counting of ThrottleGroup objects
Notes:
I tested Markus Armbruster's patch in <87wp7fghi9@dusky.pond.sub.org>
on master and I can use this syntax successfully:
-object '{ "qom-type" : "throttle-group", "id" : "foo", "props" : { "limits" : { "iops-total" : 1000 } } }'
If this gets merged, using -object will be a little more verbose, but at
least we won't have separate properties, which is a good thing, so the x-*
properties should be dropped.

 block/throttle-groups.c | 421 
 include/block/throttle-groups.h |   3 +
 include/qemu/throttle-options.h |  59 --
 include/qemu/throttle.h |   3 +
 qapi/block-core.json|  48 +
 tests/test-throttle.c   |   1 +
 util/throttle.c | 151 ++
 7 files changed, 626 insertions(+), 60 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index f711a3dc62..751e86c676 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -25,9 +25,17 @@
 #include "qemu/osdep.h"
 #include "sysemu/block-backend.h"
 #include "block/throttle-groups.h"
+#include "qemu/throttle-options.h"
 #include "qemu/queue.h"
 #include "qemu/thread.h"
 #include "sysemu/qtest.h"
+#include "qapi/error.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+
+static void throttle_group_obj_init(Object *obj);
+static void throttle_group_obj_complete(UserCreatable *obj, Error **errp);
 
 /* The ThrottleGroup structure (with its ThrottleState) is shared
  * among different ThrottleGroupMembers and it's independent from
@@ -54,6 +62,10 @@
  * that ThrottleGroupMember has throttled requests in the queue.
  */
 typedef struct ThrottleGroup {
+Object parent_obj;
+
+/* refuse individual property change if initialization is complete */
+bool is_initialized;
 char *name; /* This is constant during the lifetime of the group */
 
 QemuMutex lock; /* This lock protects the following four fields */
@@ -63,12 +75,11 @@ typedef struct ThrottleGroup {
 bool any_timer_armed[2];
 QEMUClockType clock_type;
 
-/* These two are protected by the global throttle_groups_lock */
-unsigned refcount;
+/* This field is protected by the global QEMU mutex */
 QTAILQ_ENTRY(ThrottleGroup) list;
 } ThrottleGroup;
 
-static QemuMutex throttle_groups_lock;
+/* This is protected by the global QEMU mutex */
 static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
 QTAILQ_HEAD_INITIALIZER(throttle_groups);
 
@@ -77,7 +88,11 @@ static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
  * If no ThrottleGroup is found with the given name a new one is
  * created.
  *
- * @name: the name of the ThrottleGroup
+ * This function edits throttle_groups and must be called under the global
+ * mutex.
+ *
+ * @name: the name of the ThrottleGroup, NULL means a new anonymous group will
+ *be created.
  * @ret:  the ThrottleState member of the ThrottleGroup
  */
 ThrottleState *throttle_group_incref(const char *name)
@@ -85,37 +100,26 @@ ThrottleState *throttle_group_incref(const char *name)
 ThrottleGroup *tg = NULL;
 ThrottleGroup *iter;
 
-qemu_mutex_lock(&throttle_groups_lock);
-
-/* Look for 

Re: [Qemu-devel] Qemu and 32 PCIe devices

2017-08-09 Thread Paolo Bonzini
On 09/08/2017 12:00, Laszlo Ersek wrote:
> On 08/09/17 09:26, Paolo Bonzini wrote:
>> On 09/08/2017 03:06, Laszlo Ersek wrote:
   20.14%  qemu-system-x86_64  [.] render_memory_region
   17.14%  qemu-system-x86_64  [.] subpage_register
   10.31%  qemu-system-x86_64  [.] int128_add
7.86%  qemu-system-x86_64  [.] addrrange_end
7.30%  qemu-system-x86_64  [.] int128_ge
4.89%  qemu-system-x86_64  [.] int128_nz
3.94%  qemu-system-x86_64  [.] phys_page_compact
2.73%  qemu-system-x86_64  [.] phys_map_node_alloc
>>
>> Yes, this is the O(n^3) thing.  An optimized build should be faster
>> because int128 operations will be inlined and become much more efficient.
>>
>>> With this patch, I only tested the "93 devices" case, as the slowdown
>>> became visible to the naked eye from the trace messages, as the firmware
>>> enabled more and more BARs / command registers (and inversely, the
>>> speedup was perceivable when the firmware disabled more and more BARs /
>>> command registers).
>>
>> This is an interesting observation, and it's expected.  Looking at the
>> O(n^3) complexity more in detail you have N operations, where the "i"th
>> operates on "i" DMA address spaces, all of which have at least "i"
>> memory regions (at least 1 BAR per device).
> 
> - Can you please give me a pointer to the code where the "i"th operation
> works on "i" DMA address spaces? (Not that I dream about patching *that*
> code, wherever it may live :) )

It's all driven by actions of the guest.

Simply, by the time you get to the "i"th command register, you have
enabled bus-master DMA on "i" devices (so that "i" DMA address spaces
are non-empty) and you have enabled BARs on "i" devices (so that their
BARs are included in the address spaces).

> - You mentioned that changing this is on the ToDo list. I couldn't find
> it under . Is it tracked somewhere
> else?

I've added it to https://wiki.qemu.org/index.php/ToDo/MemoryAPI (thanks
for the nudge).

Paolo

> (I'm not trying to urge any changes in the area, I'd just like to learn
> about the code & the tracker item, if there's one.)
> 
> Thanks!
> Laszlo
> 
>>
>> So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(n^3).
>> Expressing it as a sum shows why it gets slower as time progresses.
>>
>> The solution is to note that those "i" address spaces are actually all
>> the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(n^2).
>>
>> Thanks,
>>
>> Paolo
>>
> 
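
The two sums quoted above can be checked numerically with a small C
sketch. This is only an illustration under a simplified cost model (one
unit of work per address space / memory region pair); the function names
are made up and nothing here is QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model: the i-th enable walks i DMA address spaces, each of
 * which holds i memory regions, so step i costs i * i units. */
uint64_t cost_per_device_address_spaces(uint64_t n)
{
    uint64_t total = 0;

    assert(n <= 1000000);   /* keeps the running sum well inside uint64_t */
    for (uint64_t i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;           /* closed form: n(n+1)(2n+1)/6 */
}

/* Same model once the i address spaces are treated as one shared
 * address space: step i costs only i units. */
uint64_t cost_shared_address_space(uint64_t n)
{
    uint64_t total = 0;

    assert(n <= 1000000);
    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;           /* closed form: n(n+1)/2 */
}
```

Because the per-step cost grows with i, later enables are more expensive
than earlier ones, which matches the slowdown seen as the firmware
enables more BARs.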




Re: [Qemu-devel] [PATCH for 2.11 v2 1/2] watchdog: wdt_aspeed: Add support for the reset width register

2017-08-09 Thread Cédric Le Goater
On 08/09/2017 08:28 AM, Andrew Jeffery wrote:
> The reset width register controls how the pulse on the SoC's WDTRST{1,2}
> pins behaves. A pulse is emitted if the external reset bit is set in
> WDT_CTRL. On the AST2500 WDT_RESET_WIDTH can consume magic bit patterns
> to configure push-pull/open-drain and active-high/active-low
> behaviours and thus needs some special handling in the write path.
> 
> As some of the capabilities depend on the SoC version a silicon-rev
> property is introduced, which is used to guard version-specific
> behaviour.
> 
> Signed-off-by: Andrew Jeffery 

One minor comment below. Nevertheless :

Reviewed-by: Cédric Le Goater 

> ---
>  hw/watchdog/wdt_aspeed.c | 93 +++-
>  include/hw/watchdog/wdt_aspeed.h |  2 +
>  2 files changed, 84 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/watchdog/wdt_aspeed.c b/hw/watchdog/wdt_aspeed.c
> index 8bbe579b6b66..22bce364d7b5 100644
> --- a/hw/watchdog/wdt_aspeed.c
> +++ b/hw/watchdog/wdt_aspeed.c
> @@ -8,16 +8,19 @@
>   */
>  
>  #include "qemu/osdep.h"
> +
> +#include "qapi/error.h"
>  #include "qemu/log.h"
> +#include "qemu/timer.h"
>  #include "sysemu/watchdog.h"
> +#include "hw/misc/aspeed_scu.h"
>  #include "hw/sysbus.h"
> -#include "qemu/timer.h"
>  #include "hw/watchdog/wdt_aspeed.h"
>  
> -#define WDT_STATUS  (0x00 / 4)
> -#define WDT_RELOAD_VALUE (0x04 / 4)
> -#define WDT_RESTART (0x08 / 4)
> -#define WDT_CTRL (0x0C / 4)
> +#define WDT_STATUS  (0x00 / 4)
> +#define WDT_RELOAD_VALUE (0x04 / 4)
> +#define WDT_RESTART (0x08 / 4)
> +#define WDT_CTRL (0x0C / 4)
>  #define   WDT_CTRL_RESET_MODE_SOC   (0x00 << 5)
>  #define   WDT_CTRL_RESET_MODE_FULL_CHIP (0x01 << 5)
>  #define   WDT_CTRL_1MHZ_CLK BIT(4)
> @@ -25,18 +28,41 @@
>  #define   WDT_CTRL_WDT_INTR BIT(2)
>  #define   WDT_CTRL_RESET_SYSTEM BIT(1)
>  #define   WDT_CTRL_ENABLE   BIT(0)
> +#define WDT_RESET_WIDTH (0x18 / 4)
> +#define   WDT_RESET_WIDTH_ACTIVE_HIGH   BIT(31)
> +#define WDT_POLARITY_MASK   (0xFF << 24)
> +#define WDT_ACTIVE_HIGH_MAGIC   (0xA5 << 24)
> +#define WDT_ACTIVE_LOW_MAGIC (0x5A << 24)
> +#define   WDT_RESET_WIDTH_PUSH_PULL BIT(30)
> +#define WDT_DRIVE_TYPE_MASK (0xFF << 24)
> +#define WDT_PUSH_PULL_MAGIC (0xA8 << 24)
> +#define WDT_OPEN_DRAIN_MAGIC (0x8A << 24)
>  
> -#define WDT_TIMEOUT_STATUS  (0x10 / 4)
> -#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
> -#define WDT_RESET_WDITH (0x18 / 4)
> +#define WDT_TIMEOUT_STATUS  (0x10 / 4)
> +#define WDT_TIMEOUT_CLEAR   (0x14 / 4)
>  
> -#define WDT_RESTART_MAGIC   0x4755
> +#define WDT_RESTART_MAGIC   0x4755
>  
>  static bool aspeed_wdt_is_enabled(const AspeedWDTState *s)
>  {
>  return s->regs[WDT_CTRL] & WDT_CTRL_ENABLE;
>  }
>  
> +static bool is_ast2500(const AspeedWDTState *s)

I think we could use this routine in other controllers (scu, sdmc).
So maybe, in a follow-up patch, we could move it to aspeed_scu.h.

Thanks,

C. 
 
> +{
> +switch (s->silicon_rev) {
> +case AST2500_A0_SILICON_REV:
> +case AST2500_A1_SILICON_REV:
> +return true;
> +case AST2400_A0_SILICON_REV:
> +case AST2400_A1_SILICON_REV:
> +default:
> +break;
> +}
> +
> +return false;
> +}
> +
>  static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, unsigned size)
>  {
>  AspeedWDTState *s = ASPEED_WDT(opaque);
> @@ -55,9 +81,10 @@ static uint64_t aspeed_wdt_read(void *opaque, hwaddr offset, unsigned size)
>  return 0;
>  case WDT_CTRL:
>  return s->regs[WDT_CTRL];
> +case WDT_RESET_WIDTH:
> +return s->regs[WDT_RESET_WIDTH];
>  case WDT_TIMEOUT_STATUS:
>  case WDT_TIMEOUT_CLEAR:
> -case WDT_RESET_WDITH:
>  qemu_log_mask(LOG_UNIMP,
>"%s: uninmplemented read at offset 0x%" HWADDR_PRIx "\n",
>__func__, offset);
> @@ -119,9 +146,27 @@ static void aspeed_wdt_write(void *opaque, hwaddr offset, uint64_t data,
>  timer_del(s->timer);
>  }
>  break;
> +case WDT_RESET_WIDTH:
> +{
> +uint32_t property = data & WDT_POLARITY_MASK;
> +
> +if (property && is_ast2500(s)) {
> +if (property == WDT_ACTIVE_HIGH_MAGIC) {
> +s->regs[WDT_RESET_WIDTH] |= WDT_RESET_WIDTH_ACTIVE_HIGH;
> +} else if (property == WDT_ACTIVE_LOW_MAGIC) {
> +s->regs[WDT_RESET_WIDTH] &= ~WDT_RESET_WIDTH_ACTIVE_HIGH;
> +} else if (property == WDT_PUSH_PULL_MAGIC) {
> +s->regs[WDT_RESET_WIDTH] |= WDT_RESET_WIDTH_PUSH_PULL;
> +} else if (property == WDT_OPEN_DRAIN_MAGIC) {
> +

Re: [Qemu-devel] [PATCH 1/2] tests/boot-sector: Do not overwrite the x86 buffer on other architectures

2017-08-09 Thread Cornelia Huck
On Wed,  9 Aug 2017 06:59:37 +0200
Thomas Huth  wrote:

> Re-using the boot_sector code buffer from x86 for other architectures
> is not very nice, especially if we add more architectures later. It's
> also ugly that the test uses a huge pre-initialized array - the size
> of the executable is very huge due to this array. So let's use a
> separate buffer for each architecture instead, allocated from the heap,
> so that we really just use the memory that we need.
> 
> Suggested-by: Michael Tsirkin 
> Signed-off-by: Thomas Huth 
> ---
>  tests/boot-sector.c | 37 -
>  1 file changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/tests/boot-sector.c b/tests/boot-sector.c
> index e3880f4..4ea1373 100644
> --- a/tests/boot-sector.c
> +++ b/tests/boot-sector.c
> @@ -21,13 +21,12 @@
>  #define SIGNATURE 0xdead
>  #define SIGNATURE_OFFSET 0x10
>  #define BOOT_SECTOR_ADDRESS 0x7c00
> +#define SIGNATURE_ADDR (BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET)

Do you want to use this new #define in boot_sector_test() as well?

>  
> -/* Boot sector code: write SIGNATURE into memory,
> +/* x86 boot sector code: write SIGNATURE into memory,
>   * then halt.
> - * Q35 machine requires a minimum 0x7e000 bytes disk.
> - * (bug or feature?)
>   */
> -static uint8_t boot_sector[0x7e000] = {
> +static uint8_t x86_boot_sector[512] = {
>  /* The first sector will be placed at RAM address 7C00, and
>   * the BIOS transfers control to 7C00
>   */
> @@ -50,8 +49,8 @@ static uint8_t boot_sector[0x7e000] = {
>  [0x07] = HIGH(SIGNATURE),
>  /* 7c08:  mov %ax,0x7c10 */
>  [0x08] = 0xa3,
> -[0x09] = LOW(BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET),
> -[0x0a] = HIGH(BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET),
> +[0x09] = LOW(SIGNATURE_ADDR),
> +[0x0a] = HIGH(SIGNATURE_ADDR),
>  
>  /* 7c0b cli */
>  [0x0b] = 0xfa,
> @@ -72,7 +71,9 @@ static uint8_t boot_sector[0x7e000] = {
>  int boot_sector_init(char *fname)
>  {
>  int fd, ret;
> -size_t len = sizeof boot_sector;
> +size_t len;
> +char *boot_code;
> +const char *arch = qtest_get_arch();
>  
>  fd = mkstemp(fname);
>  if (fd < 0) {
> @@ -80,16 +81,26 @@ int boot_sector_init(char *fname)
>  return 1;
>  }
>  
> -/* For Open Firmware based system, we can use a Forth script instead */
> -if (strcmp(qtest_get_arch(), "ppc64") == 0) {
> -len = sprintf((char *)boot_sector, "\\ Bootscript\n%x %x c! %x %x c!\n",
> -LOW(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET,
> -HIGH(SIGNATURE), BOOT_SECTOR_ADDRESS + SIGNATURE_OFFSET + 1);
> +if (g_str_equal(arch, "i386") || g_str_equal(arch, "x86_64")) {
> +/* Q35 requires a minimum 0x7e000 bytes disk (bug or feature?) */
> +len = 0x7e000;

Use the maximum of (0x7e000, sizeof(x86_boot_sector))? (Not that it is
likely that the boot sector will ever grow, but I think it is cleaner.)

> +boot_code = g_malloc(len);

Would g_malloc0() be better?

> +memcpy(boot_code, x86_boot_sector, sizeof x86_boot_sector);

sizeof(x86_boot_sector)?

> +} else if (g_str_equal(arch, "ppc64")) {
> +/* For Open Firmware based system, use a Forth script */
> +boot_code = g_strdup_printf("\\ Bootscript\n%x %x c! %x %x c!\n",
> +LOW(SIGNATURE), SIGNATURE_ADDR,
> +HIGH(SIGNATURE), SIGNATURE_ADDR + 1);
> +len = strlen(boot_code);
> +} else {
> +g_assert_not_reached();
>  }
>  
> -ret = write(fd, boot_sector, len);
> +ret = write(fd, boot_code, len);
>  close(fd);
>  
> +g_free(boot_code);
> +
>  if (ret != len) {
>  fprintf(stderr, "Could not write \"%s\"", fname);
>  return 1;

This makes the code much nicer :)
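
To make the two suggestions above concrete (take the maximum of the
minimum disk size and the code size, and zero-fill the buffer), here is
a minimal stand-alone sketch. It uses plain calloc() so that it builds
without GLib; in the test itself g_malloc0() and MAX() would play the
same roles. The helper name and signature are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a zero-filled image of at least min_len
 * bytes with "code" copied to its start.  The caller releases the
 * result with free().  Returns NULL if the allocation fails. */
uint8_t *make_padded_image(const uint8_t *code, size_t code_len,
                           size_t min_len, size_t *out_len)
{
    size_t len;
    uint8_t *image;

    assert(code != NULL && out_len != NULL && code_len > 0);

    /* Never truncate the code, even if it outgrows the minimum size. */
    len = code_len > min_len ? code_len : min_len;

    /* calloc() zero-fills, so the padding written to disk is defined. */
    image = calloc(len, 1);
    if (image == NULL) {
        return NULL;
    }
    memcpy(image, code, code_len);
    *out_len = len;
    return image;
}
```

With a plain malloc-style allocation the bytes after the copied boot
code would be indeterminate, which is the reason for preferring the
zeroing variant.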



Re: [Qemu-devel] [PATCH 2/2] tests/pxe: Check virtio-net-ccw on s390x

2017-08-09 Thread Cornelia Huck
On Wed,  9 Aug 2017 06:59:38 +0200
Thomas Huth  wrote:

> Now that we've got a firmware that can do TFTP booting on s390x (i.e.
> the pc-bios/s390-netboot.img), we can enable the PXE tester for this
> architecture, too.
> 
> Signed-off-by: Thomas Huth 
> ---
>  tests/Makefile.include |  1 +
>  tests/boot-sector.c| 20 
>  tests/pxe-test.c   |  7 +++
>  3 files changed, 28 insertions(+)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index eb4895f..639371e4 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -337,6 +337,7 @@ check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
>  check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
>  
>  check-qtest-s390x-y = tests/boot-serial-test$(EXESUF)
> +check-qtest-s390x-y += tests/pxe-test$(EXESUF)
>  
>  check-qtest-generic-y += tests/qom-test$(EXESUF)
>  check-qtest-generic-y += tests/test-hmp$(EXESUF)
> diff --git a/tests/boot-sector.c b/tests/boot-sector.c
> index 4ea1373..bc5837a 100644
> --- a/tests/boot-sector.c
> +++ b/tests/boot-sector.c
> @@ -67,6 +67,21 @@ static uint8_t x86_boot_sector[512] = {
>  [0x1FF] = 0xAA,
>  };
>  
> +/* For s390x, use a mini "kernel" with the appropriate signature */
> +static const uint8_t s390x_psw[] = {
> +0x00, 0x08, 0x00, 0x00, 0x80, 0x01, 0x00, 0x00
> +};
> +static const uint8_t s390x_code[] = {
> +0xa7, 0xf4, 0x00, 0x0a,/* j 0x10010 */
> +0x00, 0x00, 0x00, 0x00,
> +'S', '3', '9', '0',
> +'E', 'P', 0x00, 0x01,
> +0xa7, 0x38, HIGH(SIGNATURE_ADDR), LOW(SIGNATURE_ADDR), /* lhi r3,0x7c10 */
> +0xa7, 0x48, LOW(SIGNATURE), HIGH(SIGNATURE),   /* lhi r4,0xadde */
> +0x40, 0x40, 0x30, 0x00,/* sth r4,0(r3) */
> +0xa7, 0xf4, 0xff, 0xfa /* j 0x10010 */
> +};
> +
>  /* Create boot disk file.  */
>  int boot_sector_init(char *fname)
>  {
> @@ -92,6 +107,11 @@ int boot_sector_init(char *fname)
>  LOW(SIGNATURE), SIGNATURE_ADDR,
>  HIGH(SIGNATURE), SIGNATURE_ADDR + 1);
>  len = strlen(boot_code);
> +} else if (g_str_equal(arch, "s390x")) {
> +len = 0x1 + sizeof s390x_code;
> +boot_code = g_malloc(len);

g_malloc0()?

> +memcpy(boot_code, s390x_psw, sizeof s390x_psw);
> +memcpy(&boot_code[0x1], s390x_code, sizeof s390x_code);

I'd prefer the sizeof() notation.

>  } else {
>  g_assert_not_reached();
>  }
> diff --git a/tests/pxe-test.c b/tests/pxe-test.c
> index cf6e225..0d70afc 100644
> --- a/tests/pxe-test.c
> +++ b/tests/pxe-test.c
> @@ -51,6 +51,11 @@ static void test_pxe_spapr_vlan(void)
>  test_pxe_one("-device spapr-vlan,netdev=" NETNAME, true);
>  }
>  
> +static void test_pxe_virtio_ccw(void)
> +{
> +test_pxe_one("-device virtio-net-ccw,bootindex=1,netdev=" NETNAME, false);
> +}
> +
>  int main(int argc, char *argv[])
>  {
>  int ret;
> @@ -68,6 +73,8 @@ int main(int argc, char *argv[])
>  } else if (strcmp(arch, "ppc64") == 0) {
>  qtest_add_func("pxe/virtio", test_pxe_virtio_pci);
>  qtest_add_func("pxe/spapr-vlan", test_pxe_spapr_vlan);
> +} else if (g_str_equal(arch, "s390x")) {
> +qtest_add_func("pxe/virtio-ccw", test_pxe_virtio_ccw);
>  }
>  ret = g_test_run();
>  boot_sector_cleanup(disk);

Otherwise, looks good.



Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Daniel P. Berrange
On Wed, Aug 09, 2017 at 11:07:38AM +0200, Thomas Huth wrote:
> On 09.08.2017 10:27, Cornelia Huck wrote:
> > On Wed, 9 Aug 2017 10:23:04 +0200
> > Thomas Huth  wrote:
> > 
> >> On 09.08.2017 09:17, Cornelia Huck wrote:
> >>> Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
> >>> on CONFIG_VIRTFS and on the presence of an appropriate virtio transport
> >>> device.
> >>>
> >>> Let's introduce CONFIG_VIRTIO_CCW to cover s390x and check for
> >>> CONFIG_VIRTFS && (CONFIG_VIRTIO_PCI || CONFIG_VIRTIO_CCW).
> >>>
> >>> Signed-off-by: Cornelia Huck 
> >>> ---
> >>>
> >>> Changes v1->v2: drop extraneous spaces, fix build on cris
> >>>
> >>> ---
> >>>  default-configs/s390x-softmmu.mak | 1 +
> >>>  fsdev/Makefile.objs   | 9 +++--
> >>>  hw/Makefile.objs  | 2 +-
> >>>  3 files changed, 5 insertions(+), 7 deletions(-)
> [...]
> >>
> >> Patch should be fine now, I think...
> >>
> >> But thinking about this again, I wonder whether it would be enough to
> >> simply check for CONFIG_VIRTIO=y here instead. CONFIG_VIRTIO=y should be
> >> sufficient to assert that there is also at least one kind of virtio
> >> transport available, right?
> >> Otherwise this will look really horrible as soon as somebody also tries
> >> to add support for virtio-mmio here later ;-)
> > 
> > Do all virtio transports have support for 9p, though? I thought it was
> > only virtio-pci and virtio-ccw...
> 
> While virtio-pci and virtio-ccw seem to require separate dedicated
> devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,
> virtio-mmio seems to work differently. As far as I can see, there are no
> dedicated virtio-xxx-mmio devices in the code at all. According to
> https://www.redhat.com/archives/libvir-list/2013-August/msg9.html
> you simply have to use virtio-xxx-device here instead. And a
> virtio-9p-device is available. So theoretically, the 9p code should work
> with virtio-mmio, too, or is there a problem that I did not see yet?
> 
> Anyway, we likely should not blindly enable this, so unless somebody has
> a setup to test it, we should go with your current patch instead, I think.

qemu-system-arm supports virtio-mmio, so you can use that to test it.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [Qemu-block] [PATCH 2/4] IDE: test flush on empty CDROM

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 01:57:09PM -0400, John Snow wrote:
> From: Kevin Wolf 
> 
> Signed-off-by: Kevin Wolf 
> Signed-off-by: John Snow 
> ---
>  tests/ide-test.c | 19 +++
>  1 file changed, 19 insertions(+)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH for-2.10] throttle: Make LeakyBucket.avg and LeakyBucket.max integer types

2017-08-09 Thread Stefan Hajnoczi
On Mon, Aug 07, 2017 at 07:15:29PM +0300, Alberto Garcia wrote:
> Both the throttling limits set with the throttling.iops-* and
> throttling.bps-* options and their QMP equivalents defined in the
> BlockIOThrottle struct are integer values.
> 
> Those limits are also reported in the BlockDeviceInfo struct and they
> are integers there as well.
> 
> Therefore there's no reason to store them internally as double and do
> the conversion every time we're setting or querying them, so this patch
> uses int64_t for those types.
> 
> LeakyBucket.level and LeakyBucket.burst_level do however remain double
> because their value changes depending on the fraction of time elapsed
> since the previous I/O operation.
> 
> There's one particular instance of the previous code where bkt->max
> could have a non-integer value: that's in throttle_fix_bucket() when
> bkt->max is initialized to bkt->avg / 10. This is now an integer
> division and the result is rounded. We don't need to worry about this
> because:
> 
>a) with the magnitudes we're dealing with (bytes per second, I/O
>   operations per second) the limits are likely to be always
>   multiples of 10.
> 
>b) even if they weren't this doesn't affect the actual limits, only
>   the algorithm that makes the throttling smoother.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  include/qemu/throttle.h | 4 ++--
>  util/throttle.c | 7 ++-
>  2 files changed, 4 insertions(+), 7 deletions(-)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan
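
As a small illustration of the avg / 10 case discussed in the commit
message: with int64_t operands, C division rounds toward zero. The
function below is a made-up stand-in that models only that division,
not throttle_fix_bucket() itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for deriving a default burst size from the
 * average rate.  Integer "/" discards the fractional part. */
int64_t default_max_from_avg(int64_t avg)
{
    assert(avg >= 0);
    return avg / 10;
}
```

For limits that are multiples of 10 the result is exact; otherwise the
fractional part is simply dropped, which only affects the smoothing and
not the configured limit.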




Re: [Qemu-devel] [PATCH v2] 9pfs: fix dependencies

2017-08-09 Thread Peter Maydell
On 9 August 2017 at 10:07, Thomas Huth  wrote:
> While virtio-pci and virtio-ccw seem to require separate dedicated
> devices (e.g. virtio-9p-pci and virtio-9p-ccw) for everything,

You don't *have* to use the dedicated virtio-foo-pci device;
if you like you can manually plug together the virtio-pci
transport and the virtio-foo-device backend yourself. The
'fused' device is just for convenience and compatibility
with existing command lines. There is no fused device for
virtio-mmio because the board itself creates the transport
devices so the user only needs to create the backends
(which then auto-plug into the virtio bus).

thanks
-- PMM



Re: [Qemu-devel] Qemu and 32 PCIe devices

2017-08-09 Thread Laszlo Ersek
On 08/09/17 09:26, Paolo Bonzini wrote:
> On 09/08/2017 03:06, Laszlo Ersek wrote:
>>>   20.14%  qemu-system-x86_64  [.] render_memory_region
>>>   17.14%  qemu-system-x86_64  [.] subpage_register
>>>   10.31%  qemu-system-x86_64  [.] int128_add
>>>7.86%  qemu-system-x86_64  [.] addrrange_end
>>>7.30%  qemu-system-x86_64  [.] int128_ge
>>>4.89%  qemu-system-x86_64  [.] int128_nz
>>>3.94%  qemu-system-x86_64  [.] phys_page_compact
>>>2.73%  qemu-system-x86_64  [.] phys_map_node_alloc
> 
> Yes, this is the O(n^3) thing.  An optimized build should be faster
> because int128 operations will be inlined and become much more efficient.
> 
>> With this patch, I only tested the "93 devices" case, as the slowdown
>> became visible to the naked eye from the trace messages, as the firmware
>> enabled more and more BARs / command registers (and inversely, the
>> speedup was perceivable when the firmware disabled more and more BARs /
>> command registers).
> 
> This is an interesting observation, and it's expected.  Looking at the
> O(n^3) complexity more in detail you have N operations, where the "i"th
> operates on "i" DMA address spaces, all of which have at least "i"
> memory regions (at least 1 BAR per device).

- Can you please give me a pointer to the code where the "i"th operation
works on "i" DMA address spaces? (Not that I dream about patching *that*
code, wherever it may live :) )

- You mentioned that changing this is on the ToDo list. I couldn't find
it under . Is it tracked somewhere
else?

(I'm not trying to urge any changes in the area, I'd just like to learn
about the code & the tracker item, if there's one.)

Thanks!
Laszlo

> 
> So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(n^3).
> Expressing it as a sum shows why it gets slower as time progresses.
> 
> The solution is to note that those "i" address spaces are actually all
> the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(n^2).
> 
> Thanks,
> 
> Paolo
> 




[Qemu-devel] [PATCH v4 3/7] block: tidy ThrottleGroupMember initializations

2017-08-09 Thread Manos Pitsidianakis
Move the CoMutex and CoQueue inits inside throttle_group_register_tgm()
which is called whenever a ThrottleGroupMember is initialized. There's
no need for them to be separate.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
---
 block/block-backend.c   | 3 ---
 block/throttle-groups.c | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6687a90660..df0200fc49 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -215,9 +215,6 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
 blk->shared_perm = shared_perm;
 blk_set_enable_write_cache(blk, true);
 
-qemu_co_mutex_init(&blk->public.throttle_group_member.throttled_reqs_lock);
-qemu_co_queue_init(&blk->public.throttle_group_member.throttled_reqs[0]);
-qemu_co_queue_init(&blk->public.throttle_group_member.throttled_reqs[1]);
 block_acct_init(&blk->stats);
 
 notifier_list_init(&blk->remove_bs_notifiers);
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index a979e86243..f711a3dc62 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -508,6 +508,9 @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
  read_timer_cb,
  write_timer_cb,
  tgm);
+qemu_co_mutex_init(&tgm->throttled_reqs_lock);
+qemu_co_queue_init(&tgm->throttled_reqs[0]);
+qemu_co_queue_init(&tgm->throttled_reqs[1]);
 
 qemu_mutex_unlock(&tg->lock);
 }
-- 
2.11.0




[Qemu-devel] Making QEMU build with Python 3

2017-08-09 Thread Stefan Hajnoczi
Ross created a bug to track Python 3 support:

  https://bugs.launchpad.net/qemu/+bug/1708462

Currently most Python code in QEMU is for Python 2.6+ only.  There
have only been a few patches adding Python 3 support to certain
scripts so far.

In this email I want to highlight the most important scripts that need
Python 3 support.  Volunteers are welcome!

Python scripts needed to build QEMU are the highest priority.  They
are invoked by ./configure or make.  I've identified the following:

scripts/signrom.py
scripts/qapi*.py
scripts/modules/module_block.py
scripts/tracetool*

Anyone wishing to tackle a script listed here, please reply to this
email thread to avoid duplicating work.

The fundamentals of adding Python 3 support are:

1. The script must work correctly under both Python 2.6+ and Python 3.
Only use language or standard library features that are available in
both Python versions.
2. Compare Python 2.6 vs Python 3 documentation to find a common
subset.  There is often a Pythonic solution that does not require
writing explicit wrappers.
3. Avoid third-party package dependencies - QEMU currently has none!
That means do not use 'six' or 'python-future'.  Our use of Python
isn't that fancy, but if you feel a third party package is essential
then please justify it.
4. If you decide to do PEP8 cleanups, make them separate patches so
review is easy.

Getting started info (but do not rely on 'python-future'):
http://www.python-future.org/compatible_idioms.html

Once the build scripts are converted the next most important group of
Python scripts are the tests.  This is where the bulk of the work
lies.

I have tracetool on my todo list and hope to add Python 3 support in QEMU 2.11.

Stefan



Re: [Qemu-devel] [PATCH v4 10/22] libqtest: Skip round-trip through QObject

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> When we don't have to do any % interpolation in qmp() and friends,
> there is no point wasting time allocating a QObject from the format
> string only to then format it back into the string we send over
> the wire.

True, but there's also no point in complicating things for efficiency
here.

> This is a temporary measure: it becomes important in the next
> patch, where test-qga will be refactored to do interpolation in
> place, and where we must not re-interpolate the string; but will
> go away when further refactoring makes it easier to directly
> output a string without going through qmp_fd_sendv().

Okay, let's see how that works out.

> Signed-off-by: Eric Blake 
> ---
>  tests/libqtest.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index cde737ec5a..0cb439eefa 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -448,7 +448,7 @@ QDict *qtest_qmp_receive(QTestState *s)
>   */
>  void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  {
> -QObject *qobj;
> +QObject *qobj = NULL;
>  int log = getenv("QTEST_LOG") != NULL;
>  QString *qstr;
>  const char *str;
> @@ -462,9 +462,17 @@ void qmp_fd_sendv(int fd, const char *fmt, va_list ap)
>  }
>  assert(*fmt);
>
> -/* Going through qobject ensures we escape strings properly. */
> -qobj = qobject_from_jsonv(fmt, ap);
> -qstr = qobject_to_json(qobj);
> +/*
> + * A round trip through QObject is only needed if % interpolation
> + * is used.  We interpolate through QObject rather than sprintf in
> + * order to escape strings properly.
> + */
> +if (strchr(fmt, '%')) {
> +qobj = qobject_from_jsonv(fmt, ap);
> +qstr = qobject_to_json(qobj);
> +} else {

qobj = NULL here would be clearer than the initializer.

> +qstr = qstring_from_str(fmt);
> +}
>
>  /*
>   * BUG: QMP doesn't react to input until it sees a newline, an
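
For reference, the decision made by the quoted hunk boils down to a
one-line predicate. The sketch below uses a hypothetical function name
and is not libqtest code; it only shows what strchr(fmt, '%') reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical predicate mirroring the strchr() test in the hunk: any
 * '%' in the format, including a literal "%%", selects the QObject
 * round trip; a format without '%' can be sent as-is. */
bool fmt_needs_interpolation(const char *fmt)
{
    assert(fmt != NULL);
    return strchr(fmt, '%') != NULL;
}
```

Note that the test is purely textual, so a format containing only "%%"
still takes the interpolating path.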



Re: [Qemu-devel] [PATCH v4 01/10] qemu.py: Pylint/style fixes

2017-08-09 Thread Stefan Hajnoczi
On Tue, Aug 08, 2017 at 02:56:47PM +0200, Lukáš Doktor wrote:
> Dne 8.8.2017 v 14:38 Stefan Hajnoczi napsal(a):
> > On Wed, Jul 26, 2017 at 04:42:17PM +0200, Lukáš Doktor wrote:
> >>  def command(self, cmd, conv_keys=True, **args):
> >> +'''
> >> +Invoke a QMP command and on success report result dict or on failure
> > 
> > s/report/return/ ?
> > 
> Don't see much difference, but I'll use the "return" in the next version.

"report" could mean the function logs or prints a message.  I have not
heard it used to mean "return".

Stefan




Re: [Qemu-devel] [PATCH] target/i386: set rip_offset for further SSE instructions

2017-08-09 Thread Paolo Bonzini
On 09/08/2017 01:51, Joseph Myers wrote:
> It turns out that my recent fix to set rip_offset when emulating some
> SSE4.1 instructions needs generalizing to cover a wider class of
> instructions.  Specifically, every instruction in the sse_op_table7
> table, coming from various instruction set extensions, has an 8-bit
> immediate operand that comes after any memory operand, and so needs
> rip_offset set for correctness if there is a memory operand that is
> rip-relative, and my patch only set it for a subset of those
> instructions.  This patch moves the rip_offset setting to cover the
> wider class of instructions, so fixing 9 further gcc testsuite
> failures in my GCC 6-based testing.  (I do not know whether there
> might be still further classes of instructions missing this setting.)
> 
> Signed-off-by: Joseph Myers 
> 
> ---
> 
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 5fdadf9..95f7261 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4077,10 +4077,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
>  if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask))
>  goto illegal_op;
>  
> +s->rip_offset = 1;
> +
>  if (sse_fn_eppi == SSE_SPECIAL) {
>  ot = mo_64_32(s->dflag);
>  rm = (modrm & 7) | REX_B(s);
> -s->rip_offset = 1;
>  if (mod != 3)
>  gen_lea_modrm(env, s, modrm);
>  reg = ((modrm >> 3) & 7) | rex_r;
> 


Queued, thanks.

Paolo



Re: [Qemu-devel] [PATCH] target/i386: fix pmovsx/pmovzx in-place operations

2017-08-09 Thread Paolo Bonzini
On 08/08/2017 22:21, Joseph Myers wrote:
> The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
> or 4-byte inputs and sign-extend or zero-extend them to a wider vector
> output.  The associated helpers for these instructions do the
> extension on each element in turn, starting with the lowest.  If the
> input and output are the same register, this means that all the input
> elements after the first have been overwritten before they are read.
> This patch makes the helpers extend starting with the highest element,
> not the lowest, to avoid such overwriting.  This fixes many GCC test
> failures (161 in the gcc testsuite in my GCC 6-based testing) when
> testing with a default CPU setting enabling those instructions.
> 
> Signed-off-by: Joseph Myers 
> 
> ---
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 16509d0..d578216 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -1617,18 +1617,18 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
>  #define SSE_HELPER_F(name, elem, num, F)\
>  void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
>  {   \
> -d->elem(0) = F(0);  \
> -d->elem(1) = F(1);  \
>  if (num > 2) {  \
> -d->elem(2) = F(2);  \
> -d->elem(3) = F(3);  \
>  if (num > 4) {  \
> -d->elem(4) = F(4);  \
> -d->elem(5) = F(5);  \
> -d->elem(6) = F(6);  \
>  d->elem(7) = F(7);  \
> +d->elem(6) = F(6);  \
> +d->elem(5) = F(5);  \
> +d->elem(4) = F(4);  \
>  }   \
> +d->elem(3) = F(3);  \
> +d->elem(2) = F(2);  \
>  }   \
> +d->elem(1) = F(1);  \
> +d->elem(0) = F(0);  \
>  }
>  
>  SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B)
> 

Queued, thanks.

Paolo
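
The overwrite problem described in the commit message can be reproduced
outside QEMU with a small stand-alone sketch. The Vec union and both
function names are invented for this illustration; only the ordering
idea is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Stand-in for a vector register viewed as 16 bytes or as 8 words. */
typedef union {
    uint8_t b[16];
    int16_t w[8];
} Vec;

/* Ascending order: storing w[0] covers b[0] and b[1], so b[1] has
 * already been clobbered by the time it is read for w[1]. */
void widen_bw_ascending(Vec *v)
{
    assert(v != NULL);
    for (int i = 0; i < 8; i++) {
        v->w[i] = (int8_t)v->b[i];
    }
}

/* Descending order: storing w[i] touches bytes 2*i and 2*i + 1 only,
 * which are never below byte i, so every source byte that is still to
 * be read is intact. */
void widen_bw_descending(Vec *v)
{
    assert(v != NULL);
    for (int i = 7; i >= 0; i--) {
        v->w[i] = (int8_t)v->b[i];
    }
}
```

When source and destination are different registers both orders give
the same result; the difference only shows up for in-place operation,
which is the case the patch fixes.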


