date:20190815

[Qemu-devel] [Bug 1840249] [NEW] Cancelling 'make docker-test-build' does not cancel running containers

2019-08-15 Thread Philippe Mathieu-Daudé

Public bug reported:

version: v4.1.0-rc5

Run 'make -k docker-test-build', wait a while, then cancel with ^C:

$ make -k docker-test-build 2>&1 > /dev/null
^C

$ docker ps
CONTAINER ID   IMAGE                     COMMAND                  CREATED          STATUS
62264a2d777a   qemu:debian-mips-cross    "/var/tmp/qemu/run t…"   10 minutes ago   Up 10 minutes
80807c47d0df   qemu:debian-armel-cross   "/var/tmp/qemu/run t…"   10 minutes ago   Up 10 minutes
06027b5dfd4a   qemu:debian-amd64         "/var/tmp/qemu/run t…"   10 minutes ago   Up 10 minutes

The Docker containers are still up, building QEMU.

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: docker

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840249

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840249/+subscriptions

[Qemu-devel] Running docker cross-tests with SELinux (was: Re: [PATCH v3 20/29] Include qemu/main-loop.h less)

2019-08-15 Thread Philippe Mathieu-Daudé

Hi Alex,

On 8/10/19 9:34 PM, Markus Armbruster wrote:
> 
> There are a few SELinux gripes in my logs, like this one:
> 
> type=AVC msg=audit(1565418107.93:125036): avc:  denied  { module_request } 
> for  pid=19599 comm="configure" kmod="binfmt-464c" 
> scontext=system_u:system_r:container_t:s0:c611,c653 
> tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=0

A few notes from chatting with Markus.

Another interesting syslog entry:

AVC avc:  denied  { mounton } for  pid=24489 comm="mount"
path="/proc/sys/fs/binfmt_misc" dev="proc" ino=3907274
scontext=system_u:system_r:container_t:s0:c497,c743
tcontext=system_u:object_r:sysctl_fs_t:s0 tclass=dir permissive=0

The distribution is Fedora 30 with SELinux enforcing:

$ getenforce
Enforcing

$ make -k docker-test-build
[...]
  BUILD   binfmt debian-powerpc-user (debootstrapped)
No binfmt_misc entry for qemu-ppc
make: *** [tests/docker/Makefile.include:66:
docker-binfmt-image-debian-powerpc-user] Error 1make -k docker-test-build
make[1]: Entering directory 'bld'
  GEN bld/docker-src.2019-08-11-23.50.37.5117/qemu.tar
  COPYRUNNER
RUN test-build in qemu:debian-powerpc-user-cross
Unable to find image 'qemu:debian-powerpc-user-cross' locally
Trying to pull repository docker.io/library/qemu ...
Trying to pull repository quay.io/qemu ...
Trying to pull repository docker.io/library/qemu ...
/usr/bin/docker-current: repository docker.io/qemu not found: does not
exist or no pull access.
See '/usr/bin/docker-current run --help'.
Traceback (most recent call last):
  File "tests/docker/docker.py", line 615, in <module>
    sys.exit(main())
  File "tests/docker/docker.py", line 611, in main
    return args.cmdobj.run(args, argv)
  File "tests/docker/docker.py", line 338, in run
    return Docker().run(argv, args.keep, quiet=args.quiet)
  File "tests/docker/docker.py", line 300, in run
    quiet=quiet)
  File "tests/docker/docker.py", line 207, in _do_check
    return subprocess.check_call(self._command + cmd, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run',
'--label', 'com.qemu.instance.uuid=0e8b34a8bc8211e98734d8cb8ae0c842',
'-u', '1000', '--security-opt', 'seccomp=unconfined', '--rm',
'--net=none', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e',
'V=', '-e', 'J=', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e',
'CCACHE_DIR=/var/tmp/ccache', '-v',
'/home/armbru/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v',
'bld/docker-src.2019-08-11-23.50.37.5117:/var/tmp/qemu:z,ro',
'qemu:debian-powerpc-user-cross', '/var/tmp/qemu/run', 'test-build']'
returned non-zero exit status 125
make[1]: *** [tests/docker/Makefile.include:207: docker-run] Error 1
make[1]: Leaving directory 'bld'
make: *** [tests/docker/Makefile.include:241:
docker-run-test-build@debian-powerpc-user-cross] Error 2

Note the "No binfmt_misc entry for qemu-ppc" and syslog entry:

'AVC denied comm="mount" path="/proc/sys/fs/binfmt_misc" dev="proc"'.

Does the selinux-policy require tuning?

Re: [Qemu-devel] [PATCH v3 0/4] qcow2: async handling of fragmented io

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> 01: - use coroutine_fn where appropriate !!!

:-)



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 1/4] block: introduce aio task pool

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> Common interface for aio task loops. To be used for improving
> performance of synchronous io loops in qcow2, block-stream,
> copy-on-read, and maybe other places.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/aio_task.h |  54 +
>  block/aio_task.c | 124 +++
>  block/Makefile.objs  |   2 +
>  3 files changed, 180 insertions(+)
>  create mode 100644 include/block/aio_task.h
>  create mode 100644 block/aio_task.c
> 
> diff --git a/include/block/aio_task.h b/include/block/aio_task.h
> new file mode 100644
> index 00..58b4d99e59
> --- /dev/null
> +++ b/include/block/aio_task.h

[...]

> +AioTaskPool *aio_task_pool_new(int max_busy_tasks);

Because aio_task_pool_wait_one() asserts that it runs in the same
coroutine as aio_task_pool_new(), this should be a coroutine_fn as well.
O:-)

But I don’t want to be responsible for breaking your “1” key (assuming
you have the exclamation mark there):

Reviewed-by: Max Reitz 




Re: [Qemu-devel] Question about libvhost-user and vhost-user-bridge.c

2019-08-15 Thread Stefan Hajnoczi

On Wed, Aug 14, 2019 at 10:54:34AM -0700, William Tu wrote:
> Hi,
> 
> I'm using libvhost-user.a to write a vhost backend, in order to receive and
> send packets from/to VMs from OVS. I started by reading the 
> vhost-user-bridge.c.
> I can now pass the initialization stage, seeing .queue_set_started get 
> invoked.
> 
> However, I am stuck at receiving the packet from VM.
> So is it correct to do:
> 1) check vu_queue_empty, started, and aval_bytes, if OK, then

This step can be skipped because vu_queue_pop() returns NULL if there
are no virtqueue elements available.

> 2) elem = vu_queue_pop(>vudev, vq, sizeof(VuVirtqElement));
> 3) the packet payload should be at elem->in_sg->iov_base + hdrlen? or
> at elem->out_sg?

The driver->device buffers are elem->out_sg and the device->driver
buffers are elem->in_sg.

Device implementations must not make assumptions about the layout of
out_sg and in_sg (e.g. you cannot assume that in_sg[0].iov_len ==
sizeof(struct virtio_net_hdr) and you must handle the case where
in_sg[0].iov_len == 1).

> I tried to hex dump the iov_base, but the content doesn't look like
> having a ethernet header. I saw in vubr_backend_recv_cb at 
> vhost-user-bridge.c,
> we're creating another iovec and recvmsg(vubr->backend_udp_sock, , 0);
> I don't think I have to create backend UDP sock, am I correct?

Please see the VIRTIO specification for details of the virtio-net rx/tx
virtqueue formats:
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-2050006

I think you may need to handle the struct virtio_net_hdr that comes
before the Ethernet header.
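
To illustrate the point about not assuming any iovec layout, here is a
minimal sketch of a helper that copies a frame out of a scatter-gather
list after skipping a header, wherever the element boundaries happen to
fall. sg_copy_from() is a hypothetical name and not part of
libvhost-user; the caller is assumed to know the length of the
virtio-net header that was negotiated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy up to dst_len bytes out of a scatter-gather
 * list, starting skip bytes into it.  Returns the number of bytes copied.
 * No assumption is made about where element boundaries fall, so a header
 * split across several elements (or a 1-byte first element) is fine.
 */
static size_t sg_copy_from(const struct iovec *sg, unsigned int sg_num,
                           size_t skip, void *dst, size_t dst_len)
{
    unsigned char *out = dst;
    size_t copied = 0;

    assert(sg != NULL || sg_num == 0);

    for (unsigned int i = 0; i < sg_num && copied < dst_len; i++) {
        size_t len = sg[i].iov_len;

        if (skip >= len) {
            /* This element lies entirely inside the skipped region. */
            skip -= len;
            continue;
        }

        size_t avail = len - skip;
        size_t want = dst_len - copied;
        size_t n = avail < want ? avail : want;

        memcpy(out + copied, (const unsigned char *)sg[i].iov_base + skip, n);
        copied += n;
        skip = 0;   /* only the first copied element is entered mid-way */
    }

    return copied;
}
```

For packets sent by the guest this would be called on elem->out_sg with
elem->out_num, with skip set to the size of the virtio-net header.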

Stefan


Re: [Qemu-devel] [PATCH] usb: reword -usb command-line option and mention xHCI

2019-08-15 Thread Stefan Hajnoczi

On Thu, Aug 15, 2019 at 08:13:40AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > > -Enable the USB driver (if it is not used by default yet).
> > > > +Enable USB emulation on machine types with an on-board USB host 
> > > > controller (if
> > > > +not enabled by default).  Note that on-board USB host controllers may 
> > > > not
> > > > +support USB 3.0.  In this case -device nec-usb-xhci can be used 
> > > > instead on
> > > 
> > > Should we maybe rather recommend qemu-xhci instead?
> > 
> > I think nec-usb-xhci is preferred because there are Windows drivers.
> > IIRC qemu-xhci works under Linux but not under Windows (just because the
> > PCI Vendor/Device ID aren't covered by any driver).
> > 
> > Gerd: Can you confirm this?
> 
> That applies to windows 7 only, which is EOL next year.
> 
> win7 doesn't ship with xhci drivers, but you can download and use
> nec/renesas drivers which require nec-usb-xhci.
> 
> win8+ ships with generic xhci drivers which works with all xhci
> hardware, including qemu-xhci.
> 
> So it indeed makes sense to refer to qemu-xhci.

Thanks, will fix in v2!

Stefan



Re: [Qemu-devel] [PATCH v3 4/4] block/qcow2: introduce parallel subrequest handling in read and write

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> It improves performance for fragmented qcow2 images.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2.h  |   3 ++
>  block/qcow2.c  | 125 -
>  block/trace-events |   1 +
>  3 files changed, 117 insertions(+), 12 deletions(-)

Reviewed-by: Max Reitz 




Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-15 Thread Markus Armbruster

Kevin Wolf writes:

> Am 14.08.2019 um 23:08 hat Eric Blake geschrieben:
>> On 8/14/19 3:22 PM, Maxim Levitsky wrote:
>> 
>> > This is an issue that was raised today on IRC with Kevin Wolf. Really 
>> > thanks
>> > for the idea!
>> > 
>> > We agreed that this new qmp interface should take the same options as
>> > blockdev-create does, however since we want to be able to edit the 
>> > encryption
>> > slots separately, this implies that we sort of need to allow this on 
>> > creation
>> > time as well.
>> > 
>> > Also the BlockdevCreateOptions is a union, which is specialized by the 
>> > driver name
>> > which is great for creation, but for update, the driver name is already 
>> > known,
>> > and thus the user should not be forced to pass it again.
>> > However qmp doesn't seem to support union type guessing based on actual 
>> > fields
>> > given (this might not be desired either), which complicates this somewhat.
>> 
>> Does the idea of a union type with a default value for the discriminator
>> help?  Maybe we have a discriminator which defaults to 'auto', and add a
>> union branch 'auto':'any'.  During creation, if the "driver":"auto"
>> branch is selected (usually implicitly by omitting "driver", but also
>> possible explicitly), the creation attempt is rejected as invalid
>> regardless of the contents of the remaining 'any'.  But during amend
>> usage, if the 'auto' branch is selected, we then add in the proper
>> "driver":"xyz" and reparse the QAPI object to determine if the remaining
>> fields in 'any' still meet the specification for the required driver branch.
>> 
>> This idea may still require some tweaks to the QAPI generator, but it's
>> the best I can come up with for a way to parse an arbitrary JSON object
>> with unknown validation, then reparse it again after adding more
>> information that would constrain the parse differently.
>
> Feels like this would be a lot of code just to allow the client to omit
> passing a value that it knows anyway. If this were a human interface, I
> could understand the desire to make commands less verbose, but for QMP I
> honestly don't see the point when it's not trivial.

Seconded.

Re: [Qemu-devel] [PATCH v0] Implement new cache mode "target"

2019-08-15 Thread Kevin Wolf

Am 15.08.2019 um 15:53 hat Stefan Hajnoczi geschrieben:
> On Wed, Aug 07, 2019 at 04:09:54PM +0300, Artemy Kapitula wrote:
> 
> Hi,
> Please use "scripts/get_maintainer.pl -f block.c" to find out which
> maintainers to email.  qemu-devel@nongnu.org is a high-traffic list and
> patches not CCed to the right maintainer may not get quick review.
> 
> > There is an issue with databases in VMs that perform too slowly
> > on generic SAN storage. The key point is fdatasync, which flushes
> > the disk on the SCSI target.
> > 
> > The QEMU blockdev "target" cache mode is intended to be used with
> > SAN storage and is a mix of "none" (it uses direct I/O) and
> > "unsafe" (it omits the device flush).
> > 
> > Such storage has its own data integrity protection and can
> > be operated with direct I/O without additional fdatasync().
> > 
> > With generic SCSI targets like LIO or SCST it boosts performance
> > up to 100% on some profiles like databases with a transaction journal
> > (postgresql/mssql/oracle etc) or virtualized SDS (ceph/rook inside
> > VMs) which perform a block device cache flush on each journal record.
> 
> If the physical storage controller has a Battery Backed Unit (BBU) or
> similar then flush requests are not required with O_DIRECT.  This has
> been a common enterprise storage configuration for many years and is
> already supported in QEMU today:
> 
> Configure the guest with cache=none and disable the emulated storage
> controller's write cache (e.g. -device virtio-blk-pci,write-cache=off).
> Inside the guest /sys/block/$BLKDEV/queue/write_cache should show "write
> through".
> 
> I think this patch is not necessary since write-cache=off already
> exists.  cache=target is also slower since the guest sends unnecessary
> flush requests to the emulated storage controller.

Two more comments:

1. The proposed cache mode can already be configured as
   cache.direct=on,cache.no-flush=on. I don't think we intend to add new
   aliases for combinations of these options. The existing aliases exist
   for compatibility reasons.

2. If fdatasync() takes noticeable time on such storage, this is a host
   kernel problem. If we know that there is nothing to be synced, the
   kernel should just return immediately without involving any I/O.

Kevin



Re: [Qemu-devel] [PATCH v1 1/2] accel/tcg: adding integration with linux perf

2019-08-15 Thread Stefan Hajnoczi

On Wed, Aug 14, 2019 at 11:37:24PM -0300, vandersonmr wrote:
> This commit adds support for Linux perf in order
> to be able to analyze QEMU's jitted code and
> also to be able to see the TBs' PCs in it.

Is there any reference to the file format?  Please include it in a code
comment, if such a thing exists.

> diff --git a/accel/tcg/perf/jitdump.c b/accel/tcg/perf/jitdump.c
> new file mode 100644
> index 00..6f4c0911c2
> --- /dev/null
> +++ b/accel/tcg/perf/jitdump.c
> @@ -0,0 +1,180 @@

License header?

> +#ifdef __linux__

If the entire source file is #ifdef __linux__ then Makefile.objs should
probably contain obj-$(CONFIG_LINUX) += jitdump.o instead.  Letting the
build system decide what to compile is cleaner than ifdeffing large
amounts of code.

> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "jitdump.h"
> +#include "qemu-common.h"

Please follow QEMU's header ordering conventions.  See ./HACKING "1.2.
Include directives".

> +void start_jitdump_file(void)
> +{
> +GString *dumpfile_name = g_string_new(NULL);;
> +g_string_printf(dumpfile_name, "./jit-%d.dump", getpid());

Simpler:

  gchar *dumpfile_name = g_strdup_printf("./jit-%d.dump", getpid());
  ...
  g_free(dumpfile_name);

> +dumpfile = fopen(dumpfile_name->str, "w+");

getpid() and the global dumpfile variable make me wonder what happens
with multi-threaded TCG?

> +
> +perf_marker = mmap(NULL, sysconf(_SC_PAGESIZE),

Please mention the point of this mmap in a comment.  My best guess is
that perf stores the /proc/$PID/maps and this is how it finds the
jitdump file?

> +  PROT_READ | PROT_EXEC,
> +  MAP_PRIVATE,
> +  fileno(dumpfile), 0);
> +
> +if (perf_marker == MAP_FAILED) {
> +printf("Failed to create mmap marker file for perf %d\n", 
> fileno(dumpfile));
> +fclose(dumpfile);
> +return;
> +}
> +
> +g_string_free(dumpfile_name, TRUE);
> +
> +struct jitheader *header = g_new0(struct jitheader, 1);

Why g_new this struct?  It's small and can be declared on the stack.

Please use g_free() with g_malloc/new/etc().  It's not safe to mismatch
glib and libc memory allocation functions.
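
As a rough sketch of the stack-allocated variant, assuming a header
layout along the lines of what the patch fills in: the struct below is
an assumption (the pad1 and flags fields in particular are not visible
in the quoted hunk), and write_jit_header() is a hypothetical name. With
no heap allocation there is nothing to free and no g_new()/free()
mismatch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout; the jitdump.h in the patch is authoritative. */
struct jitheader {
    uint32_t magic;
    uint32_t version;
    uint32_t total_size;
    uint32_t elf_mach;
    uint32_t pad1;
    uint32_t pid;
    uint64_t timestamp;
    uint64_t flags;
};

/* Hypothetical helper: returns 0 on success, -1 on a short write. */
static int write_jit_header(FILE *f, uint32_t elf_mach, uint32_t pid,
                            uint64_t timestamp)
{
    struct jitheader header;    /* on the stack: no g_new0(), no free() */

    assert(f != NULL);

    memset(&header, 0, sizeof(header));
    header.magic = 0x4A695444;  /* same constant as in the patch */
    header.version = 1;
    header.total_size = sizeof(header);
    header.elf_mach = elf_mach;
    header.pid = pid;
    header.timestamp = timestamp;

    if (fwrite(&header, sizeof(header), 1, f) != 1) {
        return -1;
    }
    return fflush(f) == 0 ? 0 : -1;
}
```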

> +header->magic = 0x4A695444;
> +header->version = 1;
> +header->elf_mach = get_e_machine();
> +header->total_size = sizeof(struct jitheader);
> +header->pid = getpid();
> +header->timestamp = get_timestamp();
> +
> +fwrite(header, header->total_size, 1, dumpfile);
> +
> +free(header);
> +fflush(dumpfile);
> +}
> +
> +void append_load_in_jitdump_file(TranslationBlock *tb)
> +{
> +GString *func_name = g_string_new(NULL);
> +g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');

The explicit NUL character looks strange to me.  I think the idea is to
avoid func_name->len + 1?  Adding NUL characters to C strings can be a
source of bugs, I would stick to convention and do len + 1 instead of
putting NUL characters into the GString.  This is a question of style
though.
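
A minimal sketch of the "len + 1" convention in plain C, without glib:
the string is formatted normally and the caller is told how many bytes
the record needs including the terminating NUL. format_tb_name() is a
hypothetical name, and the "%016" PRIx64 format only stands in for
TARGET_FMT_lx under the assumption of a 64-bit target.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the symbol name into buf and return the
 * number of bytes a NUL-terminated record needs (string length + 1),
 * or 0 if buf is too small.  No NUL is embedded via the format string.
 */
static size_t format_tb_name(char *buf, size_t buf_size, uint64_t pc)
{
    assert(buf != NULL);

    int len = snprintf(buf, buf_size, "TB virt:0x%016" PRIx64, pc);

    if (len < 0 || (size_t)len >= buf_size) {
        return 0;               /* encoding error or truncated output */
    }
    return (size_t)len + 1;     /* include the terminating NUL */
}
```

The returned size can then be passed directly to fwrite() when the name
is appended to the record.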

> +
> +struct jr_code_load *load_event = g_new0(struct jr_code_load, 1);

No need to allocate load_event on the heap.

> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9621e934c0..1c26eeeb9c 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -4147,6 +4147,18 @@ STEXI
>  Enable FIPS 140-2 compliance mode.
>  ETEXI
>  
> +#ifdef __linux__
> +DEF("perf", 0, QEMU_OPTION_perf,
> +"-perf           dump jitdump files to help linux perf JIT code visualization\n",
> +QEMU_ARCH_ALL)
> +#endif
> +STEXI
> +@item -perf
> +@findex -perf
> +Dumps jitdump files to help linux perf JIT code visualization

Suggestions on expanding the documentation:

Where are the jitdump files dumped?  The current working directory?

Anything to say about the naming scheme for these files?

Can you include an example of how to load them into perf(1)?


[Qemu-devel] [PATCH v3 3/4] block/qcow2: refactor qcow2_co_pwritev_part

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

Similarly to the previous commit, prepare for parallelizing write-loop
iterations.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/qcow2.c | 153 +-
 1 file changed, 89 insertions(+), 64 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 89afb4272e..3aaa180e2b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2234,6 +2234,87 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
 return 0;
 }
 
+/*
+ * qcow2_co_pwritev_task
+ * Called with s->lock unlocked
+ * l2meta  - if not NULL, qcow2_co_do_pwritev() will consume it. Caller must not
+ *   use it somehow after qcow2_co_pwritev_task() call
+ */
+static coroutine_fn int qcow2_co_pwritev_task(BlockDriverState *bs,
+  uint64_t file_cluster_offset,
+  uint64_t offset, uint64_t bytes,
+  QEMUIOVector *qiov,
+  uint64_t qiov_offset,
+  QCowL2Meta *l2meta)
+{
+int ret;
+BDRVQcow2State *s = bs->opaque;
+void *crypt_buf = NULL;
+int offset_in_cluster = offset_into_cluster(s, offset);
+QEMUIOVector encrypted_qiov;
+
+if (bs->encrypted) {
+assert(s->crypto);
+assert(bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
+crypt_buf = qemu_try_blockalign(bs->file->bs, bytes);
+if (crypt_buf == NULL) {
+ret = -ENOMEM;
+goto out_unlocked;
+}
+qemu_iovec_to_buf(qiov, qiov_offset, crypt_buf, bytes);
+
+if (qcow2_co_encrypt(bs, file_cluster_offset, offset,
+ crypt_buf, bytes) < 0) {
+ret = -EIO;
+goto out_unlocked;
+}
+
+qemu_iovec_init_buf(&encrypted_qiov, crypt_buf, bytes);
+qiov = &encrypted_qiov;
+qiov_offset = 0;
+}
+
+/* Try to efficiently initialize the physical space with zeroes */
+ret = handle_alloc_space(bs, l2meta);
+if (ret < 0) {
+goto out_unlocked;
+}
+
+/*
+ * If we need to do COW, check if it's possible to merge the
+ * writing of the guest data together with that of the COW regions.
+ * If it's not possible (or not necessary) then write the
+ * guest data now.
+ */
+if (!merge_cow(offset, bytes, qiov, qiov_offset, l2meta)) {
+BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
+trace_qcow2_writev_data(qemu_coroutine_self(),
+file_cluster_offset + offset_in_cluster);
+ret = bdrv_co_pwritev_part(s->data_file,
+   file_cluster_offset + offset_in_cluster,
+   bytes, qiov, qiov_offset, 0);
+if (ret < 0) {
+goto out_unlocked;
+}
+}
+
+qemu_co_mutex_lock(&s->lock);
+
+ret = qcow2_handle_l2meta(bs, &l2meta, true);
+goto out_locked;
+
+out_unlocked:
+qemu_co_mutex_lock(&s->lock);
+
+out_locked:
+qcow2_handle_l2meta(bs, &l2meta, false);
+qemu_co_mutex_unlock(&s->lock);
+
+qemu_vfree(crypt_buf);
+
+return ret;
+}
+
 static coroutine_fn int qcow2_co_pwritev_part(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, size_t qiov_offset, int flags)
@@ -2243,15 +2324,10 @@ static coroutine_fn int qcow2_co_pwritev_part(
 int ret;
 unsigned int cur_bytes; /* number of sectors in current iteration */
 uint64_t cluster_offset;
-QEMUIOVector encrypted_qiov;
-uint64_t bytes_done = 0;
-uint8_t *cluster_data = NULL;
 QCowL2Meta *l2meta = NULL;
 
 trace_qcow2_writev_start_req(qemu_coroutine_self(), offset, bytes);
 
-qemu_co_mutex_lock(&s->lock);
-
 while (bytes != 0) {
 
 l2meta = NULL;
@@ -2265,6 +2341,8 @@ static coroutine_fn int qcow2_co_pwritev_part(
 - offset_in_cluster);
 }
 
+qemu_co_mutex_lock(&s->lock);
+
 ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
  &cluster_offset, &l2meta);
 if (ret < 0) {
@@ -2282,73 +2360,20 @@ static coroutine_fn int qcow2_co_pwritev_part(
 
 qemu_co_mutex_unlock(&s->lock);
 
-if (bs->encrypted) {
-assert(s->crypto);
-if (!cluster_data) {
-cluster_data = qemu_try_blockalign(bs->file->bs,
-   QCOW_MAX_CRYPT_CLUSTERS
-   * s->cluster_size);
-if (cluster_data == NULL) {
-ret = -ENOMEM;
-goto out_unlocked;
-}
-}
-
-assert(cur_bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
-qemu_iovec_to_buf(qiov, qiov_offset + bytes_done,
-  cluster_data, cur_bytes);
-
-if (qcow2_co_encrypt(bs, cluster_offset,

[Qemu-devel] [PATCH v3 2/4] block/qcow2: refactor qcow2_co_preadv_part

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

A later patch will run the partial requests issued by the iterations of
qcow2_co_preadv in parallel for performance reasons. To prepare for
this, move the part that can be parallelized into a separate function
(qcow2_co_preadv_task).

While here, also move the reading of encrypted clusters into its own
function, as is already done for compressed reading.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json |   2 +-
 block/qcow2.c| 205 +++
 2 files changed, 111 insertions(+), 96 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0d43d4f37c..dd80aa11db 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3266,7 +3266,7 @@
 'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
 'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
 'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
-'cor_write', 'cluster_alloc_space', 'none'] }
+'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
 
 ##
 # @BlkdebugIOType:
diff --git a/block/qcow2.c b/block/qcow2.c
index 93ab7edcea..89afb4272e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1967,17 +1967,114 @@ out:
 return ret;
 }
 
+static coroutine_fn int
+qcow2_co_preadv_encrypted(BlockDriverState *bs,
+   uint64_t file_cluster_offset,
+   uint64_t offset,
+   uint64_t bytes,
+   QEMUIOVector *qiov,
+   uint64_t qiov_offset)
+{
+int ret;
+BDRVQcow2State *s = bs->opaque;
+uint8_t *buf;
+
+assert(bs->encrypted && s->crypto);
+assert(bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
+
+/*
+ * For encrypted images, read everything into a temporary
+ * contiguous buffer on which the AES functions can work.
+ * Also, decryption in a separate buffer is better as it
+ * prevents the guest from learning information about the
+ * encrypted nature of the virtual disk.
+ */
+
+buf = qemu_try_blockalign(s->data_file->bs, bytes);
+if (buf == NULL) {
+return -ENOMEM;
+}
+
+BLKDBG_EVENT(bs->file, BLKDBG_READ_ENCRYPTED);
+ret = bdrv_co_pread(s->data_file,
+file_cluster_offset + offset_into_cluster(s, offset),
+bytes, buf, 0);
+if (ret < 0) {
+goto fail;
+}
+
+assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+if (qcow2_co_decrypt(bs, file_cluster_offset, offset, buf, bytes) < 0) {
+ret = -EIO;
+goto fail;
+}
+qemu_iovec_from_buf(qiov, qiov_offset, buf, bytes);
+
+fail:
+qemu_vfree(buf);
+
+return ret;
+}
+
+static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
+ QCow2ClusterType cluster_type,
+ uint64_t file_cluster_offset,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov,
+ size_t qiov_offset)
+{
+BDRVQcow2State *s = bs->opaque;
+int offset_in_cluster = offset_into_cluster(s, offset);
+
+switch (cluster_type) {
+case QCOW2_CLUSTER_ZERO_PLAIN:
+case QCOW2_CLUSTER_ZERO_ALLOC:
+/* Both zero types are handled in qcow2_co_preadv_part */
+g_assert_not_reached();
+
+case QCOW2_CLUSTER_UNALLOCATED:
+assert(bs->backing); /* otherwise handled in qcow2_co_preadv_part */
+
+BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+return bdrv_co_preadv_part(bs->backing, offset, bytes,
+   qiov, qiov_offset, 0);
+
+case QCOW2_CLUSTER_COMPRESSED:
+return qcow2_co_preadv_compressed(bs, file_cluster_offset,
+  offset, bytes, qiov, qiov_offset);
+
+case QCOW2_CLUSTER_NORMAL:
+if ((file_cluster_offset & 511) != 0) {
+return -EIO;
+}
+
+if (bs->encrypted) {
+return qcow2_co_preadv_encrypted(bs, file_cluster_offset,
+ offset, bytes, qiov, qiov_offset);
+}
+
+BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
+return bdrv_co_preadv_part(s->data_file,
+   file_cluster_offset + offset_in_cluster,
+   bytes, qiov, qiov_offset, 0);
+
+default:
+g_assert_not_reached();
+}
+
+g_assert_not_reached();
+}
+
 static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
  uint64_t offset, uint64_t bytes,
  QEMUIOVector *qiov,
  size_t qiov_offset, int flags)
 {
 BDRVQcow2State *s = bs->opaque;
-int

[Qemu-devel] [PATCH v3 1/4] block: introduce aio task pool

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

Common interface for aio task loops. To be used for improving
performance of synchronous io loops in qcow2, block-stream,
copy-on-read, and maybe other places.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/aio_task.h |  54 +
 block/aio_task.c | 124 +++
 block/Makefile.objs  |   2 +
 3 files changed, 180 insertions(+)
 create mode 100644 include/block/aio_task.h
 create mode 100644 block/aio_task.c

diff --git a/include/block/aio_task.h b/include/block/aio_task.h
new file mode 100644
index 00..58b4d99e59
--- /dev/null
+++ b/include/block/aio_task.h
@@ -0,0 +1,54 @@
+/*
+ * Aio tasks loops
+ *
+ * Copyright (c) 2019 Virtuozzo International GmbH.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_AIO_TASK_H
+#define BLOCK_AIO_TASK_H
+
+#include "qemu/coroutine.h"
+
+typedef struct AioTaskPool AioTaskPool;
+typedef struct AioTask AioTask;
+typedef int coroutine_fn (*AioTaskFunc)(AioTask *task);
+struct AioTask {
+AioTaskPool *pool;
+AioTaskFunc func;
+int ret;
+};
+
+AioTaskPool *aio_task_pool_new(int max_busy_tasks);
+void aio_task_pool_free(AioTaskPool *);
+
+/* error code of failed task or 0 if all is OK */
+int aio_task_pool_status(AioTaskPool *pool);
+
+bool aio_task_pool_empty(AioTaskPool *pool);
+
+/* User provides filled @task, however task->pool will be set automatically */
+void coroutine_fn aio_task_pool_start_task(AioTaskPool *pool, AioTask *task);
+
+void coroutine_fn aio_task_pool_wait_slot(AioTaskPool *pool);
+void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool);
+void coroutine_fn aio_task_pool_wait_all(AioTaskPool *pool);
+
+#endif /* BLOCK_AIO_TASK_H */
diff --git a/block/aio_task.c b/block/aio_task.c
new file mode 100644
index 00..3eacfd1f40
--- /dev/null
+++ b/block/aio_task.c
@@ -0,0 +1,124 @@
+/*
+ * Aio tasks loops
+ *
+ * Copyright (c) 2019 Virtuozzo International GmbH.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "block/aio.h"
+#include "block/aio_task.h"
+
+struct AioTaskPool {
+Coroutine *main_co;
+int status;
+int max_busy_tasks;
+int busy_tasks;
+bool waiting;
+};
+
+static void coroutine_fn aio_task_co(void *opaque)
+{
+AioTask *task = opaque;
+AioTaskPool *pool = task->pool;
+
+assert(pool->busy_tasks < pool->max_busy_tasks);
+pool->busy_tasks++;
+
+task->ret = task->func(task);
+
+pool->busy_tasks--;
+
+if (task->ret < 0 && pool->status == 0) {
+pool->status = task->ret;
+}
+
+g_free(task);
+
+if (pool->waiting) {
+pool->waiting = false;
+aio_co_wake(pool->main_co);
+}
+}
+
+void coroutine_fn aio_task_pool_wait_one(AioTaskPool *pool)
+{
+assert(pool->busy_tasks > 0);
+assert(qemu_coroutine_self() == pool->main_co);
+
+pool->waiting = true;
+

[Qemu-devel] [Bug 1840252] Re: Infinite loop over ERANGE from getsockopt

2019-08-15 Thread Fritz Katze

*** This bug is a duplicate of bug 1823790 ***
https://bugs.launchpad.net/bugs/1823790

** Description changed:

  Host system: Ubuntu 18.04.3 AMD64
  Qemu Version: qemu-arm-static --version
  qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.17)
  
- Emulated System: 
+ Emulated System:
  Root file system taken from RaspberryPi 3 image
  ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img
  from http://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img.xz.
  
  Then using systemd-nspawn with /usr/bin/qemu-arm-static copied in.
  
- When executing commands like 
-   dpkg -i (--force-all) <...>.deb
+ When executing commands like
+   dpkg -i (--force-all) <...>.deb
  or
-   tar tvf ..
+   tar tvf ..
  or
-   tar xvf ..
+   tar xvf ..
  the hosting qemu-arm-static process goes into an infinite loop of getsockopt 
calls of the form:
  getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of range)
  I assume that this is because of an infinite retry without checking the 
actual error code of the call.
  
  strace:
  openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/librt.so.1", O_RDONLY|O_CLOEXEC) = 
12
  read(12, 
"\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\20\30\0\0004\0\0\0"..., 512) = 
512
  lseek(12, 21236, SEEK_SET)  = 21236
  read(12, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1240) = 
1240
  lseek(12, 20856, SEEK_SET)  = 20856
  read(12, "A2\0\0\0aeabi\0\1(\0\0\0\0057-A\0\6\n\7A\10\1\t\2\n\4\22"..., 51) = 
51
  fstat(12, {st_mode=S_IFREG|0644, st_size=22476, ...}) = 0
  mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_DENYWRIT
  E, -1, 0) = 0x7f419952c000
  mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0) = 0x
  7f419952c000
  mprotect(0x7f4199531000, 61440, PROT_NONE) = 0
  mmap(0x7f419954, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0x4000)
-  = 0x7f419954
+  = 0x7f419954
  close(12)   = 0
  mprotect(0x7f419954, 4096, PROT_READ) = 0
  mprotect(0x7f4199578000, 8192, PROT_READ) = 0
- mmap(0x7f419957b000, 28672, PROT_NONE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) 
+ mmap(0x7f419957b000, 28672, PROT_NONE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0)
  = 0x7f419957b000
  rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
  rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
  rt_sigprocmask(SIG_SETMASK, [HUP USR1 USR2 PIPE ALRM CHLD TSTP URG VTALRM 
PROF WINCH IO], NULL, 8
  ) = 0
  access("/etc/systemd/dont-synthesize-nobody", F_OK) = -1 ENOENT (No such file 
or directory)
  getpid()= 26
  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 12
  getsockopt(12, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
  setsockopt(12, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM 
(Operation not permitted)
  setsockopt(12, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
  getsockopt(12, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
  setsockopt(12, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM 
(Operation not permitted)
  setsockopt(12, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
  connect(12, {sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, 29) 
= 0
  getsockopt(12, SOL_SOCKET, SO_PEERCRED, {pid=0, uid=0, gid=0}, [12]) = 0
- getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of 
+ getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of
  range)
+ 
+ And this last entry repeats endlessly.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840252

Title:
  Infinite loop over  ERANGE from getsockopt

Status in QEMU:
  New

Bug description:
  Host system: Ubuntu 18.04.3 AMD64
  Qemu Version: qemu-arm-static --version
  qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.17)

  Emulated System:
  Root file system taken from RaspberryPi 3 image
  ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img
  from 
http://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img.xz.

  Then using systemd-nspawn with /usr/bin/qemu-arm-static copied in.

  When executing commands like
    dpkg -i (--force-all) <...>.deb
  or
    tar tvf ..
  or
    tar xvf ..
  the hosting qemu-arm-static process goes into an infinite loop of getsockopt 
calls of the form:
  getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of range)
  I assume that this is because of an infinite retry without checking the 
actual error code of the call.

  strace:
  openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/librt.so.1", O_RDONLY|O_CLOEXEC) = 
12
  read(12,

Re: [Qemu-devel] [PATCH] vhost-vsock: report QMP event when set running

2019-08-15 Thread Stefan Hajnoczi

On Tue, Jul 30, 2019 at 04:29:35PM -0400, Michael S. Tsirkin wrote:
> On Tue, Jul 30, 2019 at 12:24:27PM +, N. B. wrote:
> > From: Ning Bo 
> > 
> > Report vsock running event so that the upper application can
> > control boot sequence.
> > see https://github.com/kata-containers/runtime/pull/1918
> > 
> > Signed-off-by: Ning Bo 
> 
> Cc Stefan.
> 
> Stefan, are you willing to maintain virtio/vhost-vsock in qemu, too?
> 
> If yes let's add an entry to MAINTAINERS, ok?

Yes, I'll send a patch.

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH] MAINTAINERS: add Stefan Hajnoczi as vhost-vsock maintainer

2019-08-15 Thread Stefan Hajnoczi

A MAINTAINERS entry wasn't added when this code was merged.  Add it now
so that scripts/get_maintainer.pl works for vhost-vsock.

Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6de200453..b8fc408bf3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1571,6 +1571,13 @@ F: hw/virtio/virtio-crypto.c
 F: hw/virtio/virtio-crypto-pci.c
 F: include/hw/virtio/virtio-crypto.h
 
+vhost-vsock
+M: Stefan Hajnoczi 
+S: Supported
+F: hw/virtio/vhost-vsock.c
+F: hw/virtio/vhost-vsock-pci.c
+F: include/hw/virtio/vhost-vsock.h
+
 nvme
 M: Keith Busch 
 L: qemu-bl...@nongnu.org
-- 
2.21.0

Re: [Qemu-devel] [PATCH v3 0/4] qcow2: async handling of fragmented io

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> Here is an asynchronous scheme for handling fragmented qcow2
> reads and writes. Both the qcow2 read and write functions loop through
> sequential portions of data. This series aims to parallelize the
> iterations of these loops.
> It improves performance for fragmented qcow2 images, I've tested it
> as described below.

Looks good to me, but I can’t take it yet because I need to wait for
Stefan’s branch to be merged, of course.

Speaking of which, why didn’t you add any tests for the *_part()
methods?  I find it a bit unsettling that nothing would have caught the
bug you had in v2 in patch 3.

Max


[Qemu-devel] [PATCH v2] usb: reword -usb command-line option and mention xHCI

2019-08-15 Thread Stefan Hajnoczi

The -usb section of the man page is not very clear on what exactly -usb
does and fails to mention xHCI as a modern alternative (-device
nec-usb-xhci).

Signed-off-by: Stefan Hajnoczi 
---
v2:
 * Use @option{-device ...} [Thomas]
 * Suggest qemu-xhci instead of nec-usb-xhci [Thomas and David]
---
 qemu-options.hx | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 9621e934c0..1fb362f06f 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1436,12 +1436,15 @@ STEXI
 ETEXI
 
 DEF("usb", 0, QEMU_OPTION_usb,
-    "-usb            enable the USB driver (if it is not used by default yet)\n",
+    "-usb            enable on-board USB host controller (if not enabled by default)\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -usb
 @findex -usb
-Enable the USB driver (if it is not used by default yet).
+Enable USB emulation on machine types with an on-board USB host controller (if
+not enabled by default).  Note that on-board USB host controllers may not
+support USB 3.0.  In this case @option{-device qemu-xhci} can be used instead
+on machines with PCI.
 ETEXI
 
 DEF("usbdevice", HAS_ARG, QEMU_OPTION_usbdevice,
-- 
2.21.0

Re: [Qemu-devel] [PATCH v3 04/13] fpu: use min/max values from stdint.h for integral overflow

2019-08-15 Thread Aleksandar Markovic

On 13.08.2019 at 14:52, "Alex Bennée" wrote:
>
> Remove some more use of LIT64 while making the meaning more clear. We
> also avoid the need of casts as the results by definition fit into the
> return type.
>
> Signed-off-by: Alex Bennée 
> ---
>  fpu/softfloat.c | 30 ++
>  1 file changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 9e57b7b5933..a1e1e9a8559 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -3444,9 +3444,7 @@ static int64_t roundAndPackInt64(flag zSign,
uint64_t absZ0, uint64_t absZ1,
>  if ( z && ( ( z < 0 ) ^ zSign ) ) {
>   overflow:
>  float_raise(float_flag_invalid, status);
> -return
> -  zSign ? (int64_t) LIT64( 0x8000 )
> -: LIT64( 0x7FFF );
> +return zSign ? INT64_MIN : INT64_MAX;
>  }

In function roundAndPackInt32 there is the following segment:

if ( ( absZ>>32 ) || ( z && ( ( z < 0 ) ^ zSign ) ) ) {
float_raise(float_flag_invalid, status);
return zSign ? (int32_t) 0x8000 : 0x7FFF;
}

Perhaps replace these constants with INT32_MIN, INT32_MAX, for similar
reasons, in the same or a separate patch?
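
A small stand-alone check of that equivalence, as a sketch only: the two
helper names are hypothetical and not part of fpu/softfloat.c, and the
cast of the hex literal is assumed to wrap the way it does on the
two's-complement hosts QEMU is built on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: overflow result spelled with the stdint.h limits. */
int32_t overflow_result_stdint(bool z_sign)
{
    return z_sign ? INT32_MIN : INT32_MAX;
}

/* Hypothetical helper: the same result spelled with cast hex literals. */
int32_t overflow_result_literal(bool z_sign)
{
    return z_sign ? (int32_t) 0x80000000 : 0x7FFFFFFF;
}
```

Both helpers should return the same value for either sign, which is why
the stdint.h spelling can drop the cast.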

Aleksandar


>  if (absZ1) {
>  status->float_exception_flags |= float_flag_inexact;
> @@ -3497,7 +3495,7 @@ static int64_t roundAndPackUint64(flag zSign,
uint64_t absZ0,
>  ++absZ0;
>  if (absZ0 == 0) {
>  float_raise(float_flag_invalid, status);
> -return LIT64(0x);
> +return UINT64_MAX;
>  }
>  absZ0 &= ~(((uint64_t)(absZ1<<1) == 0) & roundNearestEven);
>  }
> @@ -5518,9 +5516,9 @@ int64_t floatx80_to_int64(floatx80 a, float_status
*status)
>  if ( shiftCount ) {
>  float_raise(float_flag_invalid, status);
>  if (!aSign || floatx80_is_any_nan(a)) {
> -return LIT64( 0x7FFF );
> +return INT64_MAX;
>  }
> -return (int64_t) LIT64( 0x8000 );
> +return INT64_MIN;
>  }
>  aSigExtra = 0;
>  }
> @@ -5561,10 +5559,10 @@ int64_t floatx80_to_int64_round_to_zero(floatx80
a, float_status *status)
>  if ( ( a.high != 0xC03E ) || aSig ) {
>  float_raise(float_flag_invalid, status);
>  if ( ! aSign || ( ( aExp == 0x7FFF ) && aSig ) ) {
> -return LIT64( 0x7FFF );
> +return INT64_MAX;
>  }
>  }
> -return (int64_t) LIT64( 0x8000 );
> +return INT64_MIN;
>  }
>  else if ( aExp < 0x3FFF ) {
>  if (aExp | aSig) {
> @@ -6623,7 +6621,7 @@ int32_t float128_to_int32_round_to_zero(float128 a,
float_status *status)
>  if ( ( z < 0 ) ^ aSign ) {
>   invalid:
>  float_raise(float_flag_invalid, status);
> -return aSign ? (int32_t) 0x8000 : 0x7FFF;
> +return aSign ? INT32_MIN : INT32_MAX;
>  }
>  if ( ( aSig0<  status->float_exception_flags |= float_flag_inexact;
> @@ -6662,9 +6660,9 @@ int64_t float128_to_int64(float128 a, float_status
*status)
>&& ( aSig1 || ( aSig0 != LIT64( 0x0001
) ) )
>  )
> ) {
> -return LIT64( 0x7FFF );
> +return INT64_MAX;
>  }
> -return (int64_t) LIT64( 0x8000 );
> +return INT64_MIN;
>  }
>  shortShift128Left( aSig0, aSig1, - shiftCount, &aSig0, &aSig1 );
>  }
> @@ -6710,10 +6708,10 @@ int64_t float128_to_int64_round_to_zero(float128
a, float_status *status)
>  else {
>  float_raise(float_flag_invalid, status);
>  if ( ! aSign || ( ( aExp == 0x7FFF ) && ( aSig0 | aSig1
) ) ) {
> -return LIT64( 0x7FFF );
> +return INT64_MAX;
>  }
>  }
> -return (int64_t) LIT64( 0x8000 );
> +return INT64_MIN;
>  }
>  z = ( aSig0<>( ( - shiftCount ) & 63 ) );
>  if ( (uint64_t) ( aSig1< @@ -6764,19 +6762,19 @@ uint64_t float128_to_uint64(float128 a,
float_status *status)
>  if (aSign && (aExp > 0x3FFE)) {
>  float_raise(float_flag_invalid, status);
>  if (float128_is_any_nan(a)) {
> -return LIT64(0x);
> +return UINT64_MAX;
>  } else {
>  return 0;
>  }
>  }
>  if (aExp) {
> -aSig0 |= LIT64(0x0001);
> +aSig0 |= UINT64_C(0x0001);
>  }
>  shiftCount = 0x402F - aExp;
>  if (shiftCount <= 0) {
>  if (0x403E < aExp) {
>  float_raise(float_flag_invalid, status);
> -return LIT64(0x);
> +return UINT64_MAX;
>  }
>

[Qemu-devel] [PATCH] trace: Clarify DTrace/SystemTap help message

2019-08-15 Thread Philippe Mathieu-Daudé

Most tracing backends are implemented within QEMU, except the
DTrace/SystemTap backends.

One side effect is when running 'qemu -trace help', an incomplete
list of trace events is displayed when using the DTrace/SystemTap
backends.

This is partly because trace events are registered as modules with
trace_init(), and since the events are not used within QEMU,
the linker optimizes away the unused modules (which is
OK in this particular case).
Currently only the events compiled in trace-root.o and in the
last trace.o member of libqemuutil.a are linked, resulting in
an incomplete list of events.

To avoid confusion, improve the help message, recommending to
use the proper systemtap script to display the events list.

Before:

  $ lm32-softmmu/qemu-system-lm32 -trace help 2>&1 | wc -l
  70

After:

  $ lm32-softmmu/qemu-system-lm32 -trace help
  Run 'qemu-trace-stap list qemu-system-lm32' to print a list
  of names of trace points with the DTrace/SystemTap backends.

  $ qemu-trace-stap list qemu-system-lm32 | wc -l
  1136

Signed-off-by: Philippe Mathieu-Daudé 
---
 trace/control.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/trace/control.c b/trace/control.c
index 43fb7868db..bc2fe0859d 100644
--- a/trace/control.c
+++ b/trace/control.c
@@ -159,12 +159,19 @@ TraceEvent *trace_event_iter_next(TraceEventIter *iter)
 
 void trace_list_events(void)
 {
+#ifdef CONFIG_TRACE_DTRACE
+    fprintf(stderr, "Run 'qemu-trace-stap list %s' to print a list\n"
+                    "of names of trace points with the DTrace/SystemTap"
+                    " backends.\n",
+                    error_get_progname());
+#else
     TraceEventIter iter;
     TraceEvent *ev;
     trace_event_iter_init(&iter, NULL);
     while ((ev = trace_event_iter_next(&iter)) != NULL) {
         fprintf(stderr, "%s\n", trace_event_get_name(ev));
     }
+#endif
 }
 
 static void do_trace_enable_events(const char *line_buf)
-- 
2.20.1

Re: [Qemu-devel] [PATCH 7/7] target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word

2019-08-15 Thread Peter Maydell

On Thu, 15 Aug 2019 at 12:56, Richard Henderson
 wrote:
>
>
>
>
> >
> >This patch is fine, but I noticed while reviewing it that tcg/README
> >labels both the extrl_i64_i32 and extrh_i64_i32 operations as
> >"for 64-bit hosts only". Presumably that's a documentation error,
> >since we're not guarding the existing uses of the extrl_i64_i32
> >here with any kind of ifdeffery to restrict them to 64-bit hosts ?
> >
>
>
> A documentation unclarity in that the opcodes are for 64-bit hosts. The 
> tcg_gen_* functions are always available, and expand to INDEX_op_mov_i32 on 
> 32-bit hosts.

Oh, I see. We should probably split that document out properly
into a primary "what you need to know to generate TCG code as
a target" (which is the main audience) and "what you need to
implement for a TCG backend" (which I guess is relevant to fewer
people).
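
For the first of those two documents, the semantics of the two ops are
easy to state in plain C. A minimal sketch follows; the function names
are made up for illustration and are not TCG API, they only model what
"extract the low/high 32 bits" means for a 64-bit input.

```c
#include <stdint.h>

/* Hypothetical model of extrl_i64_i32: keep the low 32 bits. */
uint32_t model_extract_low(uint64_t value)
{
    return (uint32_t)value;
}

/* Hypothetical model of extrh_i64_i32: keep the high 32 bits. */
uint32_t model_extract_high(uint64_t value)
{
    return (uint32_t)(value >> 32);
}
```

For 0x1122334455667788 the low word is 0x55667788 and the high word is
0x11223344.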

thanks
-- PMM

Re: [Qemu-devel] [PATCH] Fix hw/rdma/vmw/pvrdma_cmd.c build

2019-08-15 Thread Stephen Kitt

On Thu, 15 Aug 2019 13:57:05 +0300, Yuval Shaia 
wrote:

> On Sun, Aug 11, 2019 at 09:42:47PM +0200, Stephen Kitt wrote:
> > This was broken by the cherry-pick in 41dd30f. Fix by handling errors
> > as in the rest of the function: "goto out" instead of "return rc".
> > 
> > Signed-off-by: Stephen Kitt 
> > ---
> >  hw/rdma/vmw/pvrdma_cmd.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/rdma/vmw/pvrdma_cmd.c b/hw/rdma/vmw/pvrdma_cmd.c
> > index bb9a9f1cd1..a3a86d7c8e 100644
> > --- a/hw/rdma/vmw/pvrdma_cmd.c
> > +++ b/hw/rdma/vmw/pvrdma_cmd.c
> > @@ -514,7 +514,7 @@ static int create_qp(PVRDMADev *dev, union pvrdma_cmd_req *req,
> >                       cmd->recv_cq_handle, rings, &resp->qpn);
> >  if (resp->hdr.err) {
> >  destroy_qp_rings(rings);
> > -return rc;
> > +goto out;  
> 
> This label was removed, can you please check master branch?

Sorry, it wasn’t clear from my message — my patch is against the stable-3.1
branch.

Regards,

Stephen



[Qemu-devel] [PATCH v3 0/4] qcow2: async handling of fragmented io

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

Hi all!

Here is an asynchronous scheme for handling fragmented qcow2
reads and writes. Both the qcow2 read and write functions loop through
sequential portions of data. This series aims to parallelize the
iterations of these loops.
It improves performance for fragmented qcow2 images, I've tested it
as described below.

v3 (by Max's comments) [perf results not updated]:

01: - use coroutine_fn where appropriate !!!
    - add aio_task_pool_free
    - add some comments
    - move header to include/block
    - s/wait_done/waiting
02: - Rewrite note about decryption in guest buffers [thx to Eric]
    - separate g_assert_not_reached for QCOW2_CLUSTER_ZERO_*
    - drop return after g_assert_not_reached
03: - drop bytes_done and correctly use qiov_offset
    - fix comment
04: - move QCOW2_MAX_WORKERS to block/qcow2.h
    - initialize ret in qcow2_co_preadv_part
Based-on: https://github.com/stefanha/qemu/commits/block


v2: changed a lot, as
 1. a lot of preparations around locks, hd_qiovs, threads for encryption
    are done
 2. I decided to create a separate file with an async request handling API,
    to reuse it for backup, stream and copy-on-read to improve their
    performance too. Mirror and qemu-img convert have their own async
    request handling; maybe we'll finally be able to merge all this similar
    code into one feature.
    Note that not all API calls are used in qcow2; some will be needed in
    following steps for parallelizing other io loops.
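
The bookkeeping such a task pool needs can be modelled without
coroutines. The sketch below is only an illustration: the field names
follow the AioTaskPool struct from patch 01, but PoolModel and both
functions are hypothetical and are not the API added by this series.

```c
#include <assert.h>

/* Hypothetical stand-in for the pool's counters; no coroutines involved. */
typedef struct PoolModel {
    int status;         /* first negative return value seen, or 0 */
    int max_busy_tasks; /* upper bound on tasks in flight */
    int busy_tasks;     /* tasks currently in flight */
} PoolModel;

/* Returns 1 if a task slot was taken, 0 if the caller must wait first. */
int pool_model_start(PoolModel *pool)
{
    if (pool->busy_tasks >= pool->max_busy_tasks) {
        return 0;
    }
    pool->busy_tasks++;
    return 1;
}

/* Record completion of one task; only the first error is kept. */
void pool_model_finish(PoolModel *pool, int ret)
{
    assert(pool->busy_tasks > 0);
    pool->busy_tasks--;
    if (ret < 0 && pool->status == 0) {
        pool->status = ret;
    }
}
```

Keeping only the first error means the request loop can stop submitting
new tasks as soon as the status becomes non-zero, while tasks already in
flight still finish normally.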

About testing:

I have four 4G qcow2 images (with default 64k block size) on my ssd disk:
t-seq.qcow2 - sequentially written qcow2 image
t-reverse.qcow2 - filled by writing 64k portions from end to the start
t-rand.qcow2 - filled by writing 64k portions (aligned) in random order
t-part-rand.qcow2 - filled by shuffling order of 64k writes in 1m clusters
(see source code of image generation in the end for details)

and I've done several runs like the following (sequential io by 1mb chunks):

out=/tmp/block; echo > $out
cat /tmp/files | while read file; do
    for wr in {"","-w"}; do
        echo "$file" $wr
        ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none $wr "$file" |
            grep 'Run completed in' | awk '{print $4}' >> $out
    done
done


short info about parameters:
  -w - do writes (otherwise do reads)
  -c - count of blocks
  -s - block size
  -t none - disable cache
  -n - native aio
  -d 1 - don't use parallel requests provided by qemu-img bench itself

results:

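+---------------------------+---------+---------+
| file                      | master  | async   |
+---------------------------+---------+---------+
| /ssd/t-part-rand.qcow2    | 14.671  | 9.193   |
+---------------------------+---------+---------+
| /ssd/t-part-rand.qcow2 -w | 11.434  | 8.621   |
+---------------------------+---------+---------+
| /ssd/t-rand.qcow2         | 20.421  | 10.05   |
+---------------------------+---------+---------+
| /ssd/t-rand.qcow2 -w      | 11.097  | 8.915   |
+---------------------------+---------+---------+
| /ssd/t-reverse.qcow2      | 17.515  | 9.407   |
+---------------------------+---------+---------+
| /ssd/t-reverse.qcow2 -w   | 11.255  | 8.649   |
+---------------------------+---------+---------+
| /ssd/t-seq.qcow2          | 9.081   | 9.072   |
+---------------------------+---------+---------+
| /ssd/t-seq.qcow2 -w       | 8.761   | 8.747   |
+---------------------------+---------+---------+
| /tmp/t-part-rand.qcow2    | 41.179  | 41.37   |
+---------------------------+---------+---------+
| /tmp/t-part-rand.qcow2 -w | 54.097  | 55.323  |
+---------------------------+---------+---------+
| /tmp/t-rand.qcow2         | 711.899 | 514.339 |
+---------------------------+---------+---------+
| /tmp/t-rand.qcow2 -w      | 546.259 | 642.114 |
+---------------------------+---------+---------+
| /tmp/t-reverse.qcow2      | 86.065  | 96.522  |
+---------------------------+---------+---------+
| /tmp/t-reverse.qcow2 -w   | 46.557  | 48.499  |
+---------------------------+---------+---------+
| /tmp/t-seq.qcow2          | 33.804  | 33.862  |
+---------------------------+---------+---------+
| /tmp/t-seq.qcow2 -w       | 34.299  | 34.233  |
+---------------------------+---------+---------+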


Performance gain is obvious, especially for read and especially for ssd.
For hdd there is a degradation for the reverse case, but this is the most
unlikely case and does not seem critical.
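
To put numbers on that, the gain can be expressed as the ratio of the
two columns. This is just a sketch of the arithmetic; the function name
is hypothetical and the inputs are whatever unit qemu-img bench reports
(the ratio does not depend on it).

```c
/* Ratio of the "master" time to the "async" time; > 1 means async is faster. */
double speedup(double master_time, double async_time)
{
    return master_time / async_time;
}
```

With the figures from the table, speedup(20.421, 10.05) is about 2.03
for /ssd/t-rand.qcow2 reads, while speedup(86.065, 96.522) is about 0.89
for /tmp/t-reverse.qcow2 reads.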

How images are generated:

 gen-writes ==
#!/usr/bin/env python
import random
import sys

size = 4 * 1024 * 1024 * 1024
block = 64 * 1024
block2 = 1024 * 1024

arg = sys.argv[1]

if arg in ('rand', 'reverse', 'seq'):
    writes = list(range(0, size, block))

if arg == 'rand':
    random.shuffle(writes)
elif arg == 'reverse':
    writes.reverse()
elif arg == 'part-rand':
    writes = []

[Qemu-devel] [PATCH v3 4/4] block/qcow2: introduce parallel subrequest handling in read and write

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

It improves performance for fragmented qcow2 images.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/qcow2.h  |   3 ++
 block/qcow2.c  | 125 -
 block/trace-events |   1 +
 3 files changed, 117 insertions(+), 12 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 998bcdaef1..fdfa9c31cd 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -65,6 +65,9 @@
 #define QCOW2_MAX_BITMAPS 65535
 #define QCOW2_MAX_BITMAP_DIRECTORY_SIZE (1024 * QCOW2_MAX_BITMAPS)
 
+/* Maximum of parallel sub-request per guest request */
+#define QCOW2_MAX_WORKERS 8
+
 /* indicate that the refcount of the referenced cluster is exactly one. */
 #define QCOW_OFLAG_COPIED (1ULL << 63)
 /* indicate that the cluster is compressed (they never have the copied flag) */
diff --git a/block/qcow2.c b/block/qcow2.c
index 3aaa180e2b..36b41e8536 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -40,6 +40,7 @@
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "crypto.h"
+#include "block/aio_task.h"
 
 /*
   Differences with QCOW:
@@ -2017,6 +2018,60 @@ fail:
 return ret;
 }
 
+typedef struct Qcow2AioTask {
+    AioTask task;
+
+    BlockDriverState *bs;
+    QCow2ClusterType cluster_type; /* only for read */
+    uint64_t file_cluster_offset;
+    uint64_t offset;
+    uint64_t bytes;
+    QEMUIOVector *qiov;
+    uint64_t qiov_offset;
+    QCowL2Meta *l2meta; /* only for write */
+} Qcow2AioTask;
+
+static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task);
+static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
+                                       AioTaskPool *pool,
+                                       AioTaskFunc func,
+                                       QCow2ClusterType cluster_type,
+                                       uint64_t file_cluster_offset,
+                                       uint64_t offset,
+                                       uint64_t bytes,
+                                       QEMUIOVector *qiov,
+                                       size_t qiov_offset,
+                                       QCowL2Meta *l2meta)
+{
+    Qcow2AioTask local_task;
+    Qcow2AioTask *task = pool ? g_new(Qcow2AioTask, 1) : &local_task;
+
+    *task = (Qcow2AioTask) {
+        .task.func = func,
+        .bs = bs,
+        .cluster_type = cluster_type,
+        .qiov = qiov,
+        .file_cluster_offset = file_cluster_offset,
+        .offset = offset,
+        .bytes = bytes,
+        .qiov_offset = qiov_offset,
+        .l2meta = l2meta,
+    };
+
+    trace_qcow2_add_task(qemu_coroutine_self(), bs, pool,
+                         func == qcow2_co_preadv_task_entry ? "read" : "write",
+                         cluster_type, file_cluster_offset, offset, bytes,
+                         qiov, qiov_offset);
+
+    if (!pool) {
+        return func(&task->task);
+    }
+
+    aio_task_pool_start_task(pool, &task->task);
+
+    return 0;
+}
+
 static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
                                              QCow2ClusterType cluster_type,
                                              uint64_t file_cluster_offset,
@@ -2066,18 +2121,28 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
     g_assert_not_reached();
 }
 
+static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task)
+{
+    Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
+
+    assert(!t->l2meta);
+
+    return qcow2_co_preadv_task(t->bs, t->cluster_type, t->file_cluster_offset,
+                                t->offset, t->bytes, t->qiov, t->qiov_offset);
+}
+
 static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov,
                                              size_t qiov_offset, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int ret;
+    int ret = 0;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t cluster_offset = 0;
+    AioTaskPool *aio = NULL;
 
-    while (bytes != 0) {
-
+    while (bytes != 0 && aio_task_pool_status(aio) == 0) {
         /* prepare next request */
         cur_bytes = MIN(bytes, INT_MAX);
         if (s->crypto) {
@@ -2089,7 +2154,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
-            return ret;
+            goto out;
         }
 
         if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
@@ -2098,11 +2163,14 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
-            ret = qcow2_co_preadv_task(bs, ret,
-                                       cluster_offset, offset, cur_bytes,
-

[Qemu-devel] [Bug 1840250] [NEW] 'make -j1 docker-test-build' uses more than one job

2019-08-15 Thread Philippe Mathieu-Daudé

Public bug reported:

version: v4.1.0-rc5

Run 'make -j1 docker-test-build', wait a few, various containers get
instantiated.

$ make -j1 docker-test-build 2>&1 > /dev/null

On another terminal:

$ docker ps
CONTAINER ID  IMAGE                    COMMAND                 CREATED         STATUS
62264a2d777a  qemu:debian-mips-cross   "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes
80807c47d0df  qemu:debian-armel-cross  "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes
06027b5dfd4a  qemu:debian-amd64        "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: docker

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840250

Title:
  'make -j1 docker-test-build' uses more than one job

Status in QEMU:
  New

Bug description:
  version: v4.1.0-rc5

  Run 'make -j1 docker-test-build', wait a few, various containers get
  instantiated.

  $ make -j1 docker-test-build 2>&1 > /dev/null

  On another terminal:

  $ docker ps
  CONTAINER ID  IMAGE                    COMMAND                 CREATED         STATUS
  62264a2d777a  qemu:debian-mips-cross   "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes
  80807c47d0df  qemu:debian-armel-cross  "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes
  06027b5dfd4a  qemu:debian-amd64        "/var/tmp/qemu/run t…"  10 minutes ago  Up 10 minutes

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840250/+subscriptions

Re: [Qemu-devel] [RFC PATCH v2 01/17] fuzz: Move initialization from main to qemu_init

2019-08-15 Thread Darren Kenny


On Mon, Aug 05, 2019 at 09:43:06AM +0200, Paolo Bonzini wrote:

On 05/08/19 09:11, Oleinik, Alexander wrote:

Using this, we avoid needing a special case to break out of main(),
early, when initializing the fuzzer, as we can just call qemu_init.
There is still a #define around main(), since it otherwise conflicts
with the libfuzzer main().

Signed-off-by: Alexander Oleinik 
---
 include/sysemu/sysemu.h |  5 +
 vl.c| 25 +++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 984c439ac9..a63d5ccce3 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -184,6 +184,8 @@ QemuOpts *qemu_get_machine_opts(void);

 bool defaults_enabled(void);

+int qemu_init(int argc, char **argv, char **envp);
+
 extern QemuOptsList qemu_legacy_drive_opts;
 extern QemuOptsList qemu_common_drive_opts;
 extern QemuOptsList qemu_drive_opts;
@@ -197,4 +199,7 @@ extern QemuOptsList qemu_global_opts;
 extern QemuOptsList qemu_mon_opts;
 extern QemuOptsList qemu_semihosting_config_opts;

+#ifdef CONFIG_FUZZ
+int real_main(int argc, char **argv, char **envp);
+#endif
 #endif
diff --git a/vl.c b/vl.c
index 130a389712..914bb9b2de 100644
--- a/vl.c
+++ b/vl.c
@@ -130,6 +130,10 @@ int main(int argc, char **argv)
 #include "sysemu/iothread.h"
 #include "qemu/guest-random.h"

+#ifdef CONFIG_FUZZ
+#include "tests/libqtest.h"
+#endif


Why is this #include needed?

If you leave out the changes to introduce real_main, the patch can be
committed independent of the rest.  Those can be introduced in patch 2
or even 12 ("Add fuzzer skeleton").


The build actually fails for me due to this include, because it has its own,
different declaration of qtest_init:

 In file included from vl.c:134:
 .../qemu-upstream-libfuzz/./tests/libqtest.h:57:13: error: conflicting types 
for 'qtest_init'
 QTestState *qtest_init(const char *extra_args);
 ^
 .../qemu-upstream-libfuzz/include/sysemu/qtest.h:27:6: note: previous 
declaration is here
 void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
  ^
 In file included from vl.c:134:
 .../qemu-upstream-libfuzz/./tests/libqtest.h:640:35: error: too few arguments 
to function call, expected 3, have 1
 global_qtest = qtest_init(args);
~~ ^
 .../qemu-upstream-libfuzz/include/sysemu/qtest.h:27:1: note: 'qtest_init' 
declared here
 void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
 ^
 2 errors generated.

(It's probably a separate issue as to why there are 2 functions that have
the same name, are not static and have different signatures in the
first place)

Thanks,

Darren.

[Qemu-devel] [Bug 1840252] Re: Infinite loop over ERANGE from getsockopt

2019-08-15 Thread Peter Maydell

*** This bug is a duplicate of bug 1823790 ***
https://bugs.launchpad.net/bugs/1823790

Hi; thanks for this bug report. It looks like it's the same as
LP:1823790. The underlying cause is that we don't implement the
SO_PEERSEC getsockopt option properly. Unfortunately this option appears
to be completely undocumented, which makes it pretty hard for us to
implement :-(
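
As a sketch of why the guest spins: a caller that treats ERANGE as
"buffer too small, grow and retry" never terminates if the reported
length does not change between calls. The code below is my assumption of
that general pattern with a retry bound added; it is not systemd's or
QEMU's actual code. peersec_getter is a hypothetical callback standing
in for getsockopt(fd, SOL_SOCKET, SO_PEERSEC, buf, &len), and the two
fake_getter_* functions are stubs that exist only to exercise it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical callback: returns 0 on success and stores the label length
 * in *len; on failure returns -1 with errno set and, for ERANGE, is
 * assumed to store the required size in *len.
 */
typedef int (*peersec_getter)(char *buf, size_t *len);

/* Fetch the label with a bounded number of retries; the caller frees it. */
char *fetch_peersec(peersec_getter get, int max_tries)
{
    size_t cap = 4;

    for (int i = 0; i < max_tries; i++) {
        char *buf = malloc(cap + 1);
        size_t len = cap;
        int err;

        if (!buf) {
            return NULL;
        }
        if (get(buf, &len) == 0) {
            if (len > cap) {
                len = cap;  /* never trust a length beyond our buffer */
            }
            buf[len] = '\0';
            return buf;
        }
        err = errno;
        free(buf);
        if (err != ERANGE) {
            errno = err;
            return NULL;
        }
        /* Grow to the reported size; double if no larger size was reported. */
        cap = len > cap ? len : cap * 2;
    }
    errno = ERANGE;
    return NULL;
}

/* Stub: reports the needed size on ERANGE, then succeeds. */
int fake_getter_ok(char *buf, size_t *len)
{
    static const char label[] = "unconfined";
    size_t need = sizeof(label) - 1;

    if (*len < need) {
        *len = need;
        errno = ERANGE;
        return -1;
    }
    memcpy(buf, label, need);
    *len = need;
    return 0;
}

/* Stub: always fails with ERANGE and never updates *len. */
int fake_getter_stuck(char *buf, size_t *len)
{
    (void)buf;
    (void)len;
    errno = ERANGE;
    return -1;
}
```

With fake_getter_stuck the bounded version gives up after max_tries
attempts instead of looping forever, which is the difference between the
strace in this report and a clean failure.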


** This bug has been marked a duplicate of bug 1823790
   QEMU mishandling of SO_PEERSEC forces systemd into tight loop

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840252

Title:
  Infinite loop over  ERANGE from getsockopt

Status in QEMU:
  New

Bug description:
  Host system: Ubuntu 18.04.3 AMD64
  Qemu Version: qemu-arm-static --version
  qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.17)

  Emulated System:
  Root file system taken from RaspberryPi 3 image
  ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img
  from 
http://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img.xz.

  Then using systemd-nspawn with /usr/bin/qemu-arm-static copied in.

  When executing commands like
    dpkg -i (--force-all) <...>.deb
  or
    tar tvf ..
  or
    tar xvf ..
  the hosting qemu-arm-static process goes into an infinite loop of getsockopt 
calls of the form:
  getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of range)
  I assume that this is because of an infinite retry without checking the 
actual error code of the call.

  strace:
  openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/librt.so.1", O_RDONLY|O_CLOEXEC) = 
12
  read(12, 
"\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\20\30\0\0004\0\0\0"..., 512) = 
512
  lseek(12, 21236, SEEK_SET)  = 21236
  read(12, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1240) = 
1240
  lseek(12, 20856, SEEK_SET)  = 20856
  read(12, "A2\0\0\0aeabi\0\1(\0\0\0\0057-A\0\6\n\7A\10\1\t\2\n\4\22"..., 51) = 
51
  fstat(12, {st_mode=S_IFREG|0644, st_size=22476, ...}) = 0
  mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_DENYWRIT
  E, -1, 0) = 0x7f419952c000
  mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0) = 0x
  7f419952c000
  mprotect(0x7f4199531000, 61440, PROT_NONE) = 0
  mmap(0x7f419954, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0x4000)
   = 0x7f419954
  close(12)   = 0
  mprotect(0x7f419954, 4096, PROT_READ) = 0
  mprotect(0x7f4199578000, 8192, PROT_READ) = 0
  mmap(0x7f419957b000, 28672, PROT_NONE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0)
  = 0x7f419957b000
  rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
  rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
  rt_sigprocmask(SIG_SETMASK, [HUP USR1 USR2 PIPE ALRM CHLD TSTP URG VTALRM 
PROF WINCH IO], NULL, 8
  ) = 0
  access("/etc/systemd/dont-synthesize-nobody", F_OK) = -1 ENOENT (No such file 
or directory)
  getpid()= 26
  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 12
  getsockopt(12, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
  setsockopt(12, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM 
(Operation not permitted)
  setsockopt(12, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
  getsockopt(12, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
  setsockopt(12, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM 
(Operation not permitted)
  setsockopt(12, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
  connect(12, {sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, 29) 
= 0
  getsockopt(12, SOL_SOCKET, SO_PEERCRED, {pid=0, uid=0, gid=0}, [12]) = 0
  getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of
  range)

  And this last entry repeats endlessly.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840252/+subscriptions

Re: [Qemu-devel] [PATCH v4] blockjob: drain all job nodes in block_job_drain

2019-08-15 Thread Max Reitz

On 02.08.19 11:52, Vladimir Sementsov-Ogievskiy wrote:
> Instead of draining additional nodes in each job's code, let's do it in
> the common block_job_drain, draining all of the job's children.
> BlockJobDriver.drain becomes unused, so drop it entirely.
> 
> It's also a first step to finally get rid of blockjob->blk.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

What do you think of Kevin’s comment that draining the block nodes may
actually be entirely unnecessary?

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 2/4] block/qcow2: refactor qcow2_co_preadv_part

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> A further patch will run partial requests of qcow2_co_preadv
> iterations in parallel for performance reasons. To prepare for
> this, move the part that may be parallelized into a separate function
> (qcow2_co_preadv_task).
> 
> While here, also move the reading of encrypted clusters into its own
> function, as is already done for compressed reading.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json |   2 +-
>  block/qcow2.c| 205 +++
>  2 files changed, 111 insertions(+), 96 deletions(-)

Reviewed-by: Max Reitz 




Re: [Qemu-devel] [PATCH v2] Add git-publish profile for security bugs

2019-08-15 Thread Stefan Hajnoczi

On Tue, Aug 13, 2019 at 11:07:27AM +0200, Gerd Hoffmann wrote:
> Simplifies sending security patches to all people listed in
> https://wiki.qemu.org/SecurityProcess.  Should also make it
> harder to send a copy to the mailing list by accident.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  .gitpublish | 12 
>  1 file changed, 12 insertions(+)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] trace: Clarify DTrace/SystemTap help message

2019-08-15 Thread Stefan Hajnoczi

On Thu, Aug 15, 2019 at 02:02:47PM +0200, Philippe Mathieu-Daudé wrote:
> Most tracing backends are implemented within QEMU, except the
> DTrace/SystemTap backends.
> 
> One side effect is when running 'qemu -trace help', an incomplete
> list of trace events is displayed when using the DTrace/SystemTap
> backends.
> 
> This is partly because trace events are registered as modules with
> trace_init(); since the events are not used within QEMU, the
> linker optimizes away the unused modules (which is OK in this
> particular case).
> Currently only the events compiled in trace-root.o and in the
> last trace.o member of libqemuutil.a are linked, resulting in
> an incomplete list of events.
> 
> To avoid confusion, improve the help message, recommending to
> use the proper systemtap script to display the events list.
> 
> Before:
> 
>   $ lm32-softmmu/qemu-system-lm32 -trace help 2>&1 | wc -l
>   70
> 
> After:
> 
>   $ lm32-softmmu/qemu-system-lm32 -trace help
>   Run 'qemu-trace-stap list qemu-system-lm32' to print a list
>   of names of trace points with the DTrace/SystemTap backends.
> 
>   $ qemu-trace-stap list qemu-system-lm32 | wc -l
>   1136
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  trace/control.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/trace/control.c b/trace/control.c
> index 43fb7868db..bc2fe0859d 100644
> --- a/trace/control.c
> +++ b/trace/control.c
> @@ -159,12 +159,19 @@ TraceEvent *trace_event_iter_next(TraceEventIter *iter)
>  
>  void trace_list_events(void)
>  {
> +#ifdef CONFIG_TRACE_DTRACE
> +fprintf(stderr, "Run 'qemu-trace-stap list %s' to print a list\n"
> +"of names of trace points with the DTrace/SystemTap"
> +" backends.\n",
> +error_get_progname());
> +#else
>  TraceEventIter iter;
>  TraceEvent *ev;
>  trace_event_iter_init(&iter, NULL);
>  while ((ev = trace_event_iter_next(&iter)) != NULL) {
>  fprintf(stderr, "%s\n", trace_event_get_name(ev));
>  }
> +#endif

Multiple trace backends can be built into QEMU.  In that case the list
might be complete and the user may not be using stap at all.  Perhaps
the message should be turned into a warning instead and the list should
still be printed:

  This list of trace events may be incomplete.  Run 'qemu-trace-stap
  list %s' to print a list of names of trace events with the
  DTrace/SystemTap backends.
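
A minimal sketch of that suggestion, not the actual trace/control.c code.
The helper name list_trace_events(), the 'dtrace_built_in' flag (standing in
for the CONFIG_TRACE_DTRACE compile-time check) and the plain array of names
(standing in for the TraceEventIter walk) are all assumptions made for
illustration.  The point is only that the warning is printed in addition to
the list rather than instead of it:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for trace_list_events().
 * Prints an "incomplete list" warning when the DTrace backend is built
 * in, then prints every known event name regardless.
 * Returns the number of event names printed.
 */
size_t list_trace_events(FILE *out, const char *progname,
                         bool dtrace_built_in,
                         const char *const *names, size_t n_names)
{
    size_t i;

    if (dtrace_built_in) {
        /* Warn, but do not suppress the list. */
        fprintf(out,
                "This list of trace events may be incomplete.  Run\n"
                "'qemu-trace-stap list %s' to print a list of names of\n"
                "trace events with the DTrace/SystemTap backends.\n",
                progname);
    }
    for (i = 0; i < n_names; i++) {
        fprintf(out, "%s\n", names[i]);
    }
    return n_names;
}
```

With several backends built in, the user still gets whatever list QEMU can
produce, plus a hint about where the complete one lives.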

Stefan



Re: [Qemu-devel] [PATCH v3 3/4] block/qcow2: refactor qcow2_co_pwritev_part

2019-08-15 Thread Max Reitz

On 15.08.19 14:10, Vladimir Sementsov-Ogievskiy wrote:
> Similarly to previous commit, prepare for parallelizing write-loop
> iterations.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2.c | 153 +-
>  1 file changed, 89 insertions(+), 64 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 89afb4272e..3aaa180e2b 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2234,6 +2234,87 @@ static int handle_alloc_space(BlockDriverState *bs, 
> QCowL2Meta *l2meta)
>  return 0;
>  }
>  
> +/*
> + * qcow2_co_pwritev_task
> + * Called with s->lock unlocked
> + * l2meta  - if not NULL, qcow2_co_do_pwritev() will consume it. Caller must 
> not

You missed this instance of “qcow2_co_do_pwritev()”.

With that fixed:

Reviewed-by: Max Reitz 




[Qemu-devel] Exposing feature deprecation to machine clients (was: [PATCH 2/2] qapi: deprecate implicit filters)

2019-08-15 Thread Markus Armbruster

John Snow  writes:

> On 8/14/19 6:07 AM, Vladimir Sementsov-Ogievskiy wrote:
>> To get rid of implicit filters related workarounds in future let's
>> deprecate them now.
>> 
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
[...]
>> diff --git a/blockdev.c b/blockdev.c
>> index 36e9368e01..b3cfaccce1 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3292,6 +3292,11 @@ void qmp_block_commit(bool has_job_id, const char 
>> *job_id, const char *device,
>>  BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
>>  int job_flags = JOB_DEFAULT;
>>  
>> +if (!has_filter_node_name) {
>> +warn_report("Omitting filter-node-name parameter is deprecated, it "
>> +"will be required in future");
>> +}
>> +
>>  if (!has_speed) {
>>  speed = 0;
>>  }
>> @@ -3990,6 +3995,11 @@ void qmp_blockdev_mirror(bool has_job_id, const char 
>> *job_id,
>>  Error *local_err = NULL;
>>  int ret;
>>  
>> +if (!has_filter_node_name) {
>> +warn_report("Omitting filter-node-name parameter is deprecated, it "
>> +"will be required in future");
>> +}
>> +
>>  bs = qmp_get_root_bs(device, errp);
>>  if (!bs) {
>>  return;
>> 
>
> This might be OK to do right away, though.
>
> I asked Markus this not too long ago; do we want to amend the QAPI
> schema specification to allow commands to return with "Warning" strings,
> or "Deprecated" strings to allow in-band deprecation notices for cases
> like these?
>
> example:
>
> { "return": {},
>   "deprecated": True,
>   "warning": "Omitting filter-node-name parameter is deprecated, it will
> be required in the future"
> }
>
> There's no "error" key, so this should be recognized as success by
> compatible clients, but they'll definitely see the extra information.

This is a compatible evolution of the QMP protocol.

> Part of my motivation is to facilitate a more aggressive deprecation of
> legacy features by ensuring that we are able to rigorously notify users
> through any means that they need to adjust their scripts.

Yes, we should help libvirt etc. with detecting use of deprecated
features.  We discussed this at the KVM Forum 2018 BoF on deprecating
stuff.  Minutes:

Message-ID: <87mur0ls8o@dusky.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg05828.html

Last item is relevant here.

Adding deprecation information to QMP's success response belongs to "We
can also pass the buck to the next layer up", next to "emit a QMP
event".

Let's compare the two, i.e. "deprecation info in success response"
vs. "deprecation event".

1. Possible triggers

Anything we put in the success response should only ever apply to the
(successful) command.  So this one's limited to QMP commands.

A QMP event is not limited to QMP commands.  For instance, it could be
emitted for deprecated CLI features (long after the fact, in addition to
human-readable warnings on stderr), or when we detect use of a
deprecated feature only after we sent the success response, say in a
job.  Neither use case is particularly convincing.  Reporting use of
deprecated CLI in QMP feels like a work-around for the CLI's
machine-unfriendliness.  Job-like commands should really check their
arguments upfront.

2. Connection to trigger

Connecting responses to commands is the QMP protocol's responsibility.
Transmitting deprecation information in the response trivially ties it
to the offending command.

The QMP protocol doesn't tie events to anything.  Thus, a deprecation
event needs an event-specific tie to its trigger.

The obvious way to tie it to a command mirrors how the QMP protocol ties
responses to commands: by command ID.  The event either has to be sent
just to the offending monitor (currently, all events are broadcast to
all monitors), or include a suitable monitor ID.

For non-command triggers, we'd have to invent some other tie.

3. Interface complexity

Tying the event to some arbitrary trigger adds complexity.

Do we need non-command triggers, and badly enough to justify the
additional complexity?

4. Implementation complexity 

Emitting an event could be as simple as

qapi_event_send_deprecated(qmp_command_id(),
   "Omitting 'filter-node-name'");

where qmp_command_id() returns the ID of the currently executing
command.  Making qmp_command_id() work is up to the QMP core.  Simple
enough as long as each QMP command runs to completion before its monitor
starts the next one.

The event is "fire and forget".  There is no warning object propagated
up the call chain into the QMP core like error objects are.

"Fire and forget" is ideal for letting arbitrary code decide "this is
deprecated".

Note the QAPI schema remains untouched.
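
To make the "fire and forget" shape concrete, here is a rough, self-contained
sketch.  Everything prefixed sketch_ is hypothetical: it models a dispatcher
that remembers the ID of the command it is currently running, and an event
emitter that arbitrary code can call without passing anything back up the
call chain.  It assumes, as above, that each command runs to completion
before the next one starts.

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_EVENTS 8

/* One recorded deprecation event: command ID plus a short description. */
struct sketch_event {
    char cmd_id[32];
    char what[64];
};

static struct sketch_event sketch_events[SKETCH_MAX_EVENTS];
static size_t sketch_n_events;
static const char *sketch_current_id;   /* written by the dispatcher only */

/* Hypothetical equivalent of qmp_command_id(). */
const char *sketch_command_id(void)
{
    return sketch_current_id ? sketch_current_id : "";
}

/* Fire and forget: record the event; nothing is returned to the caller. */
void sketch_event_send_deprecated(const char *cmd_id, const char *what)
{
    if (sketch_n_events < SKETCH_MAX_EVENTS) {
        struct sketch_event *ev = &sketch_events[sketch_n_events++];

        snprintf(ev->cmd_id, sizeof(ev->cmd_id), "%s", cmd_id);
        snprintf(ev->what, sizeof(ev->what), "%s", what);
    }
}

/* The "QMP core": tracks which command is running while it runs. */
void sketch_dispatch(const char *cmd_id, void (*handler)(void))
{
    sketch_current_id = cmd_id;
    handler();
    sketch_current_id = NULL;
}

/* Example handler that notices a deprecated usage somewhere deep inside. */
void sketch_commit_without_filter_name(void)
{
    sketch_event_send_deprecated(sketch_command_id(),
                                 "Omitting 'filter-node-name'");
}

size_t sketch_event_count(void)
{
    return sketch_n_events;
}

const struct sketch_event *sketch_event_at(size_t i)
{
    return i < sketch_n_events ? &sketch_events[i] : NULL;
}
```

The handler never sees a warning object; the tie between event and command
lives entirely in the dispatcher's bookkeeping.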

Unlike an event, which can be emitted anywhere, the success response
gets built in the QMP core.  To have the core add deprecation info to
it, we need to get the info to the core.

If deprecation info originates in

Re: [Qemu-devel] [PATCH v5 10/10] linux-user: dumping hot TBs at the end of the execution

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 04:37, "vandersonmr" wrote:
>
> dumps, in linux-user mode, the hottest TBs if -d tb_stats is used.
>
> Signed-off-by: Vanderson M. do Rosario 
> --

Hi, Vanderson,

Can you please provide an illustrative example of the dump output, within
the commit message?

Thanks,
Aleksandar

>  linux-user/exit.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/linux-user/exit.c b/linux-user/exit.c
> index bdda720553..7226104959 100644
> --- a/linux-user/exit.c
> +++ b/linux-user/exit.c
> @@ -28,6 +28,10 @@ extern void __gcov_dump(void);
>
>  void preexit_cleanup(CPUArchState *env, int code)
>  {
> +if (tb_stats_collection_enabled()) {
> +dump_tbs_info(max_num_hot_tbs_to_dump, SORT_BY_HOTNESS, false);
> +}
> +
>  #ifdef TARGET_GPROF
>  _mcleanup();
>  #endif
> --
> 2.22.0
>
>

Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-15 Thread Maxim Levitsky

On Thu, 2019-08-15 at 16:18 +0200, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > On 14.08.2019 at 23:08, Eric Blake wrote:
> > > On 8/14/19 3:22 PM, Maxim Levitsky wrote:
> > > 
> > > > > This is an issue that was raised today on IRC with Kevin Wolf. Really
> > > > > thanks for the idea!
> > > > > 
> > > > > We agreed that this new qmp interface should take the same options as
> > > > > blockdev-create does, however since we want to be able to edit the
> > > > > encryption slots separately, this implies that we sort of need to
> > > > > allow this on creation time as well.
> > > > > 
> > > > > Also the BlockdevCreateOptions is a union, which is specialized by the
> > > > > driver name which is great for creation, but for update, the driver
> > > > > name is already known, and thus the user should not be forced to pass
> > > > > it again.
> > > > > However qmp doesn't seem to support union type guessing based on actual
> > > > > fields given (this might not be desired either), which complicates this
> > > > > somewhat.
> > > 
> > > Does the idea of a union type with a default value for the discriminator
> > > help?  Maybe we have a discriminator which defaults to 'auto', and add a
> > > union branch 'auto':'any'.  During creation, if the "driver":"auto"
> > > branch is selected (usually implicitly by omitting "driver", but also
> > > possible explicitly), the creation attempt is rejected as invalid
> > > regardless of the contents of the remaining 'any'.  But during amend
> > > usage, if the 'auto' branch is selected, we then add in the proper
> > > "driver":"xyz" and reparse the QAPI object to determine if the remaining
> > > > fields in 'any' still meet the specification for the required driver
> > > > branch.
> > > 
> > > This idea may still require some tweaks to the QAPI generator, but it's
> > > the best I can come up with for a way to parse an arbitrary JSON object
> > > with unknown validation, then reparse it again after adding more
> > > information that would constrain the parse differently.
> > 
> > Feels like this would be a lot of code just to allow the client to omit
> > passing a value that it knows anyway. If this were a human interface, I
> > could understand the desire to make commands less verbose, but for QMP I
> > honestly don't see the point when it's not trivial.
> 
> Seconded.


But what about my suggestion of adding something like:

{ 'union': 'BlockdevAmendOptions',

  'base': {
  'node-name': 'str' },

  'discriminator': { 'get_block_driver(node-name)' } ,

  'data': {
  'file':   'BlockdevCreateOptionsFile',
  'gluster':'BlockdevCreateOptionsGluster',
  'luks':   'BlockdevCreateOptionsLUKS',
  'nfs':'BlockdevCreateOptionsNfs',
  'parallels':  'BlockdevCreateOptionsParallels',
  'qcow':   'BlockdevCreateOptionsQcow',
  'qcow2':  'BlockdevCreateOptionsQcow2',
  'qed':'BlockdevCreateOptionsQed',
  'rbd':'BlockdevCreateOptionsRbd',
  'sheepdog':   'BlockdevCreateOptionsSheepdog',
  'ssh':'BlockdevCreateOptionsSsh',
  'vdi':'BlockdevCreateOptionsVdi',
  'vhdx':   'BlockdevCreateOptionsVhdx',
  'vmdk':   'BlockdevCreateOptionsVmdk',
  'vpc':'BlockdevCreateOptionsVpc'
  } }


This shouldn't be hard to do IMHO.
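
Roughly what I have in mind, as a toy model only.  The node table, the
branch table and both function bodies below are made up for illustration
(get_block_driver() does not exist in this form, and the QAPI generator
would have to grow support for a computed discriminator):

```c
#include <stddef.h>
#include <string.h>

struct sketch_node {
    const char *node_name;
    const char *driver;
};

struct sketch_branch {
    const char *driver;
    const char *options_type;
};

/* Hypothetical node table standing in for the block graph. */
static const struct sketch_node sketch_nodes[] = {
    { "disk0", "qcow2" },
    { "keys0", "luks" },
};

/* A subset of the 'data' mapping of the proposed union. */
static const struct sketch_branch sketch_branches[] = {
    { "luks",  "BlockdevCreateOptionsLUKS" },
    { "qcow2", "BlockdevCreateOptionsQcow2" },
};

/* Look up the driver of an existing node; NULL if the node is unknown. */
const char *get_block_driver(const char *node_name)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_nodes) / sizeof(sketch_nodes[0]); i++) {
        if (strcmp(sketch_nodes[i].node_name, node_name) == 0) {
            return sketch_nodes[i].driver;
        }
    }
    return NULL;
}

/*
 * Pick the union branch from the node itself rather than from a
 * client-supplied "driver" field.  Returns NULL if there is no branch.
 */
const char *amend_options_type(const char *node_name)
{
    const char *driver = get_block_driver(node_name);
    size_t i;

    if (!driver) {
        return NULL;
    }
    for (i = 0; i < sizeof(sketch_branches) / sizeof(sketch_branches[0]); i++) {
        if (strcmp(sketch_branches[i].driver, driver) == 0) {
            return sketch_branches[i].options_type;
        }
    }
    return NULL;
}
```

The client passes only node-name; the remaining fields are then validated
against whichever branch the lookup selects.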

Best regards,
Maxim Levitsky

Re: [Qemu-devel] [PATCH v2] qemu-img convert: Deprecate using -n and -o together

2019-08-15 Thread Eric Blake

On 8/15/19 6:06 AM, Kevin Wolf wrote:
> bdrv_create options specified with -o have no effect when skipping image
> creation with -n, so this doesn't make sense. Warn against the misuse
> and deprecate the combination so we can make it a hard error later.
> 
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [PATCH v2] qemu-img convert: Deprecate using -n and -o together

2019-08-15 Thread Max Reitz

On 15.08.19 13:06, Kevin Wolf wrote:
> bdrv_create options specified with -o have no effect when skipping image
> creation with -n, so this doesn't make sense. Warn against the misuse
> and deprecate the combination so we can make it a hard error later.
> 
> Signed-off-by: Kevin Wolf 
> ---
> 
> - Hopefully removed the "finger-wagging" that John saw, without stating
>   that the combination doesn't have a well-defined behaviour (because
>   skipping image creation and therefore ignoring -o is well-defined
>   behaviour).

The problem is that e.g. the man page never says that -o gives creation
options.  It says that “options is a [...] list of format specific
options”, nothing more.

>  qemu-img.c   | 5 +
>  qemu-deprecated.texi | 7 +++
>  2 files changed, 12 insertions(+)

Reviewed-by: Max Reitz 




[Qemu-devel] [Bug 1840252] [NEW] Infinite loop over ERANGE from getsockopt

2019-08-15 Thread Fritz Katze

Public bug reported:

Host system: Ubuntu 18.04.3 AMD64
Qemu Version: qemu-arm-static --version
qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.17)

Emulated System: 
Root file system taken from RaspberryPi 3 image
ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img
from 
http://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.3-preinstalled-server-armhf+raspi3.img.xz.

Then using systemd-nspawn with /usr/bin/qemu-arm-static copied in.

When executing commands like 
  dpkg -i (--force-all) <...>.deb
or
  tar tvf ..
or
  tar xvf ..
the hosting qemu-arm-static process goes into an infinite loop of getsockopt 
calls of the form:
getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of range)
I assume that this is because of an infinite retry without checking the actual 
error code of the call.
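
To illustrate what I would expect instead, here is a small sketch of a
bounded retry.  It is not taken from systemd or QEMU.  fake_query() is a
made-up stand-in for getsockopt(fd, SOL_SOCKET, SO_PEERSEC, buf, &len), and
its 'report_size' switch models whether the required length is reported
back through the length argument on ERANGE (an assumption about how the
call is supposed to behave):

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical data source standing in for the SO_PEERSEC query. */
struct fake_source {
    size_t needed;      /* bytes the source wants to return */
    int report_size;    /* nonzero: store 'needed' in *len on ERANGE */
    int calls;          /* number of queries seen so far */
};

/* Returns 0 on success, or -1 with errno set (ERANGE if buf is too small). */
int fake_query(struct fake_source *src, char *buf, size_t *len)
{
    src->calls++;
    if (*len < src->needed) {
        if (src->report_size) {
            *len = src->needed;
        }
        errno = ERANGE;
        return -1;
    }
    memset(buf, 'x', src->needed);
    *len = src->needed;
    return 0;
}

/*
 * Query with a growing buffer.  Retries only on ERANGE and at most
 * max_tries times, so a source that never succeeds cannot spin forever.
 * Returns a malloc'ed buffer (caller frees) or NULL with errno set.
 */
char *query_with_retry(struct fake_source *src, size_t initial,
                       int max_tries, size_t *out_len)
{
    size_t cap = initial ? initial : 1;
    char *buf = NULL;
    int i;

    for (i = 0; i < max_tries; i++) {
        size_t len = cap;
        char *tmp = realloc(buf, cap);

        if (!tmp) {
            free(buf);
            errno = ENOMEM;
            return NULL;
        }
        buf = tmp;
        if (fake_query(src, buf, &len) == 0) {
            *out_len = len;
            return buf;
        }
        if (errno != ERANGE) {
            int saved = errno;

            free(buf);
            errno = saved;
            return NULL;
        }
        /* Prefer the size reported by the source, otherwise double. */
        cap = len > cap ? len : cap * 2;
    }
    free(buf);
    errno = ERANGE;
    return NULL;
}
```

If the reported length never changes (as the constant [4] in the trace
below suggests), a loop without the max_tries cap or the buffer growth
would repeat the same failing call indefinitely.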

strace:
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/librt.so.1", O_RDONLY|O_CLOEXEC) = 12
read(12, 
"\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\20\30\0\0004\0\0\0"..., 512) = 
512
lseek(12, 21236, SEEK_SET)  = 21236
read(12, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
1240) = 1240
lseek(12, 20856, SEEK_SET)  = 20856
read(12, "A2\0\0\0aeabi\0\1(\0\0\0\0057-A\0\6\n\7A\10\1\t\2\n\4\22"..., 51) = 51
fstat(12, {st_mode=S_IFREG|0644, st_size=22476, ...}) = 0
mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_DENYWRIT
E, -1, 0) = 0x7f419952c000
mmap(0x7f419952c000, 90112, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0) = 0x
7f419952c000
mprotect(0x7f4199531000, 61440, PROT_NONE) = 0
mmap(0x7f419954, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 12, 0x4000)
 = 0x7f419954
close(12)   = 0
mprotect(0x7f419954, 4096, PROT_READ) = 0
mprotect(0x7f4199578000, 8192, PROT_READ) = 0
mmap(0x7f419957b000, 28672, PROT_NONE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) 
= 0x7f419957b000
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [HUP USR1 USR2 PIPE ALRM CHLD TSTP URG VTALRM PROF 
WINCH IO], NULL, 8
) = 0
access("/etc/systemd/dont-synthesize-nobody", F_OK) = -1 ENOENT (No such file 
or directory)
getpid()= 26
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 12
getsockopt(12, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
setsockopt(12, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM (Operation 
not permitted)
setsockopt(12, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
getsockopt(12, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
setsockopt(12, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM (Operation 
not permitted)
setsockopt(12, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
connect(12, {sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, 29) = 0
getsockopt(12, SOL_SOCKET, SO_PEERCRED, {pid=0, uid=0, gid=0}, [12]) = 0
getsockopt(12, SOL_SOCKET, SO_PEERSEC, 0x7fff7cac49d8, [4]) = -1 ERANGE 
(Numerical result out of 
range)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840252

Title:
  Infinite loop over  ERANGE from getsockopt

Status in QEMU:
  New


Re: [Qemu-devel] [PATCH v5 01/10] accel: introducing TBStatistics structure

2019-08-15 Thread Alex Bennée



vandersonmr  writes:

> To store statistics for each TB, we created a TBStatistics structure
> which is linked with the TBs. TBStatistics can stay alive after
> tb_flush and be relinked to a regenerated TB. So the statistics can
> be accumulated even through flushes.
>
> The goal is to have all present and future qemu/tcg statistics and
> meta-data stored in this new structure.
>
> Signed-off-by: Vanderson M. do Rosario 

Reviewed-by: Alex Bennée 

> ---
>  accel/tcg/Makefile.objs  |  2 +-
>  accel/tcg/perf/Makefile.objs |  1 +
>  accel/tcg/tb-stats.c | 39 
>  accel/tcg/translate-all.c| 57 
>  include/exec/exec-all.h  | 15 +++---
>  include/exec/tb-context.h| 12 
>  include/exec/tb-hash.h   |  7 +
>  include/exec/tb-stats.h  | 43 +++
>  util/log.c   |  2 ++
>  9 files changed, 166 insertions(+), 12 deletions(-)
>  create mode 100644 accel/tcg/perf/Makefile.objs
>  create mode 100644 accel/tcg/tb-stats.c
>  create mode 100644 include/exec/tb-stats.h
>
> diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
> index d381a02f34..49ffe81b5d 100644
> --- a/accel/tcg/Makefile.objs
> +++ b/accel/tcg/Makefile.objs
> @@ -2,7 +2,7 @@ obj-$(CONFIG_SOFTMMU) += tcg-all.o
>  obj-$(CONFIG_SOFTMMU) += cputlb.o
>  obj-y += tcg-runtime.o tcg-runtime-gvec.o
>  obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
> -obj-y += translator.o
> +obj-y += translator.o tb-stats.o
>
>  obj-$(CONFIG_USER_ONLY) += user-exec.o
>  obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o
> diff --git a/accel/tcg/perf/Makefile.objs b/accel/tcg/perf/Makefile.objs
> new file mode 100644
> index 00..f82fba35e5
> --- /dev/null
> +++ b/accel/tcg/perf/Makefile.objs
> @@ -0,0 +1 @@
> +obj-y += jitdump.o
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> new file mode 100644
> index 00..02844717cb
> --- /dev/null
> +++ b/accel/tcg/tb-stats.c
> @@ -0,0 +1,39 @@
> +#include "qemu/osdep.h"
> +
> +#include "disas/disas.h"
> +
> +#include "exec/tb-stats.h"
> +
> +void init_tb_stats_htable_if_not(void)
> +{
> +if (tb_stats_collection_enabled() && !tb_ctx.tb_stats.map) {
> +qht_init(&tb_ctx.tb_stats, tb_stats_cmp,
> +CODE_GEN_HTABLE_SIZE, QHT_MODE_AUTO_RESIZE);
> +}
> +}
> +
> +void enable_collect_tb_stats(void)
> +{
> +init_tb_stats_htable_if_not();
> +tcg_collect_tb_stats = TB_STATS_RUNNING;
> +}
> +
> +void disable_collect_tb_stats(void)
> +{
> +tcg_collect_tb_stats = TB_STATS_PAUSED;
> +}
> +
> +void pause_collect_tb_stats(void)
> +{
> +tcg_collect_tb_stats = TB_STATS_STOPPED;
> +}
> +
> +bool tb_stats_collection_enabled(void)
> +{
> +return tcg_collect_tb_stats == TB_STATS_RUNNING;
> +}
> +
> +bool tb_stats_collection_paused(void)
> +{
> +return tcg_collect_tb_stats == TB_STATS_PAUSED;
> +}
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 5d1e08b169..b7bccacd3b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1118,6 +1118,23 @@ static inline void code_gen_alloc(size_t tb_size)
>  }
>  }
>
> +/*
> + * This is the more or less the same compare as tb_cmp(), but the
> + * data persists over tb_flush. We also aggregate the various
> + * variations of cflags under one record and ignore the details of
> + * page overlap (although we can count it).
> + */
> +bool tb_stats_cmp(const void *ap, const void *bp)
> +{
> +const TBStatistics *a = ap;
> +const TBStatistics *b = bp;
> +
> +return a->phys_pc == b->phys_pc &&
> +a->pc == b->pc &&
> +a->cs_base == b->cs_base &&
> +a->flags == b->flags;
> +}
> +
>  static bool tb_cmp(const void *ap, const void *bp)
>  {
>  const TranslationBlock *a = ap;
> @@ -1137,6 +1154,7 @@ static void tb_htable_init(void)
>  unsigned int mode = QHT_MODE_AUTO_RESIZE;
>
>  qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
> +init_tb_stats_htable_if_not();
>  }
>
>  /* Must be called before using the QEMU cpus. 'tb_size' is the size
> @@ -1666,6 +1684,34 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t 
> phys_pc,
>  return tb;
>  }
>
> +static TBStatistics *tb_get_stats(tb_page_addr_t phys_pc, target_ulong pc,
> +  target_ulong cs_base, uint32_t flags,
> +  TranslationBlock *current_tb)
> +{
> +TBStatistics *new_stats = g_new0(TBStatistics, 1);
> +uint32_t hash = tb_stats_hash_func(phys_pc, pc, flags);
> +void *existing_stats = NULL;
> +new_stats->phys_pc = phys_pc;
> +new_stats->pc = pc;
> +new_stats->cs_base = cs_base;
> +new_stats->flags = flags;
> +new_stats->tb = current_tb;
> +
> +qht_insert(&tb_ctx.tb_stats, new_stats, hash, &existing_stats);
> +
> +if (unlikely(existing_stats)) {
> +/*
> + * If there is already a TBStatistic for this TB from a previous 
>
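
For readers following along, the get-or-create idea behind tb_get_stats()
can be sketched in isolation as below.  This is a simplification under
stated assumptions: it uses a plain linked list where the patch uses QEMU's
qht hash table, it looks up before allocating (the patch allocates first and
relies on qht_insert() reporting an existing entry), and all Sketch/sketch_
names are made up.  The comparison uses the same four fields as
tb_stats_cmp().

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct SketchStats {
    uint64_t phys_pc;
    uint64_t pc;
    uint64_t cs_base;
    uint32_t flags;
    unsigned long executions;      /* survives "flushes" of the TB itself */
    struct SketchStats *next;
} SketchStats;

static SketchStats *sketch_stats_list;

/* Same fields as tb_stats_cmp() in the patch. */
static int sketch_stats_match(const SketchStats *s, uint64_t phys_pc,
                              uint64_t pc, uint64_t cs_base, uint32_t flags)
{
    return s->phys_pc == phys_pc && s->pc == pc &&
           s->cs_base == cs_base && s->flags == flags;
}

/*
 * Return the existing record for this key, or create a zeroed one.
 * Returns NULL only if allocation fails.
 */
SketchStats *sketch_get_stats(uint64_t phys_pc, uint64_t pc,
                              uint64_t cs_base, uint32_t flags)
{
    SketchStats *s;

    for (s = sketch_stats_list; s; s = s->next) {
        if (sketch_stats_match(s, phys_pc, pc, cs_base, flags)) {
            return s;               /* relink to the regenerated TB */
        }
    }
    s = calloc(1, sizeof(*s));
    if (!s) {
        return NULL;
    }
    s->phys_pc = phys_pc;
    s->pc = pc;
    s->cs_base = cs_base;
    s->flags = flags;
    s->next = sketch_stats_list;
    sketch_stats_list = s;
    return s;
}
```

Because the record is keyed by code location rather than by the
TranslationBlock pointer, counters accumulate across regenerations.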

Re: [Qemu-devel] [PATCH v5 02/10] accel: collecting TB execution count

2019-08-15 Thread Alex Bennée



vandersonmr  writes:

> If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
> enabled, then we instrument the start code of this TB
> to atomically count the number of times it is executed.
> We count both the number of "normal" executions and atomic
> executions of a TB.
>
> The execution count of the TB is stored in its respective
> TBS.
>
> All TBStatistics are created by default with the flags from
> default_tbstats_flag.
>
> Signed-off-by: Vanderson M. do Rosario 
> ---
>  accel/tcg/cpu-exec.c  |  4 
>  accel/tcg/tb-stats.c  |  5 +
>  accel/tcg/tcg-runtime.c   |  7 +++
>  accel/tcg/tcg-runtime.h   |  2 ++
>  accel/tcg/translate-all.c |  7 +++
>  accel/tcg/translator.c|  1 +
>  include/exec/gen-icount.h |  9 +
>  include/exec/tb-stats.h   | 19 +++
>  util/log.c|  1 +
>  9 files changed, 55 insertions(+)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 6c85c3ee1e..e54be69499 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -252,6 +252,10 @@ void cpu_exec_step_atomic(CPUState *cpu)
>
>  start_exclusive();
>
> +if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
> +tb->tb_stats->executions.atomic++;
> +}
> +
>  /* Since we got here, we know that parallel_cpus must be true.  */
>  parallel_cpus = false;
>  in_exclusive_region = true;
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index 02844717cb..3489133e9e 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -37,3 +37,8 @@ bool tb_stats_collection_paused(void)
>  {
>  return tcg_collect_tb_stats == TB_STATS_PAUSED;
>  }
> +
> +uint32_t get_default_tbstats_flag(void)
> +{
> +return default_tbstats_flag;
> +}
> diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
> index 8a1e408e31..6f4aafba11 100644
> --- a/accel/tcg/tcg-runtime.c
> +++ b/accel/tcg/tcg-runtime.c
> @@ -167,3 +167,10 @@ void HELPER(exit_atomic)(CPUArchState *env)
>  {
>  cpu_loop_exit_atomic(env_cpu(env), GETPC());
>  }
> +
> +void HELPER(inc_exec_freq)(void *ptr)
> +{
> +TBStatistics *stats = (TBStatistics *) ptr;
> +g_assert(stats);
> +atomic_inc(&stats->executions.normal);
> +}
> diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
> index 4fa61b49b4..bf0b75dbe8 100644
> --- a/accel/tcg/tcg-runtime.h
> +++ b/accel/tcg/tcg-runtime.h
> @@ -28,6 +28,8 @@ DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, 
> env)
>
>  DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
>
> +DEF_HELPER_FLAGS_1(inc_exec_freq, TCG_CALL_NO_RWG, void, ptr)
> +
>  #ifdef CONFIG_SOFTMMU
>
>  DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG,
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index b7bccacd3b..df08d183df 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1785,6 +1785,13 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>   */
>  if (tb_stats_collection_enabled()) {
>  tb->tb_stats = tb_get_stats(phys_pc, pc, cs_base, flags, tb);
> +uint32_t flag = get_default_tbstats_flag();
> +
> +if (qemu_log_in_addr_range(tb->pc)) {

Minor nit: the compiler should spot it but you could move the flag
fetching inside this block.

> +if (flag & TB_EXEC_STATS) {
> +tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
> +}
> +}
>  } else {
>  tb->tb_stats = NULL;
>  }
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 9226a348a3..396a11e828 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -46,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, 
> DisasContextBase *db,
>
>  ops->init_disas_context(db, cpu);
>  tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
> +gen_tb_exec_count(tb);
>
>  /* Reset the temp count so that we can identify leaks */
>  tcg_clear_temp_count();
> diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
> index f7669b6841..b3efe41894 100644
> --- a/include/exec/gen-icount.h
> +++ b/include/exec/gen-icount.h
> @@ -7,6 +7,15 @@
>
>  static TCGOp *icount_start_insn;
>
> +static inline void gen_tb_exec_count(TranslationBlock *tb)
> +{
> +if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
> +TCGv_ptr ptr = tcg_const_ptr(tb->tb_stats);
> +gen_helper_inc_exec_freq(ptr);
> +tcg_temp_free_ptr(ptr);
> +}
> +}
> +
>  static inline void gen_tb_start(TranslationBlock *tb)
>  {
>  TCGv_i32 count, imm;
> diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
> index cc8f8a6ce6..0265050b79 100644
> --- a/include/exec/tb-stats.h
> +++ b/include/exec/tb-stats.h
> @@ -6,6 +6,9 @@
>  #include "exec/tb-context.h"
>  #include "tcg.h"
>
> +#define tb_stats_enabled(tb, JIT_STATS) \
> +(tb && tb->tb_stats && (tb->tb_stats->stats_enabled & JIT_STATS))
> +
>  typedef struct TBStatistics TBStatistics;
>
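
A standalone sketch of the counting scheme, for illustration only.  It uses
C11 <stdatomic.h> instead of QEMU's atomic helpers, an inline function
instead of the tb_stats_enabled() macro, and made-up Sketch/SKETCH_ names;
'exclusive' corresponds to executions.atomic in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EXEC_STATS (1u << 0)   /* stands in for TB_EXEC_STATS */

typedef struct {
    uint32_t stats_enabled;
    struct {
        atomic_ulong normal;          /* bumped from generated code */
        unsigned long exclusive;      /* bumped under the exclusive lock */
    } executions;
} SketchTBStats;

typedef struct {
    SketchTBStats *tb_stats;          /* NULL when collection is off */
} SketchTB;

/* Function form of the macro: arguments are evaluated exactly once. */
static inline bool sketch_stats_enabled(const SketchTB *tb, uint32_t flag)
{
    return tb && tb->tb_stats && (tb->tb_stats->stats_enabled & flag);
}

/* Counterpart of HELPER(inc_exec_freq): many vCPUs may race here. */
void sketch_count_normal(SketchTB *tb)
{
    if (sketch_stats_enabled(tb, SKETCH_EXEC_STATS)) {
        atomic_fetch_add(&tb->tb_stats->executions.normal, 1);
    }
}

/* Counterpart of the cpu_exec_step_atomic() hunk: no other vCPU runs. */
void sketch_count_exclusive(SketchTB *tb)
{
    if (sketch_stats_enabled(tb, SKETCH_EXEC_STATS)) {
        tb->tb_stats->executions.exclusive++;
    }
}
```

The plain increment is only acceptable on the exclusive path because that
code runs with all other vCPUs stopped.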

Re: [Qemu-devel] [PATCH for-4.2 v10 03/15] virtio-iommu: Add skeleton

2019-08-15 Thread Peter Xu

On Tue, Jul 30, 2019 at 07:21:25PM +0200, Eric Auger wrote:
> +static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
> +struct virtio_iommu_req_head head;
> +struct virtio_iommu_req_tail tail;

[1]

> +VirtQueueElement *elem;
> +unsigned int iov_cnt;
> +struct iovec *iov;
> +size_t sz;
> +
> +for (;;) {
> +elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +if (!elem) {
> +return;
> +}
> +
> +if (iov_size(elem->in_sg, elem->in_num) < sizeof(tail) ||
> +iov_size(elem->out_sg, elem->out_num) < sizeof(head)) {
> +virtio_error(vdev, "virtio-iommu bad head/tail size");
> +virtqueue_detach_element(vq, elem, 0);
> +g_free(elem);
> +break;
> +}
> +
> +iov_cnt = elem->out_num;
> +iov = g_memdup(elem->out_sg, sizeof(struct iovec) * elem->out_num);

Could I ask why memdup is needed here?

> +sz = iov_to_buf(iov, iov_cnt, 0, &head, sizeof(head));
> +if (unlikely(sz != sizeof(head))) {
> +tail.status = VIRTIO_IOMMU_S_DEVERR;

Do you need to zero the reserved bits to make sure it won't contain
garbage?  Same question to below uses of tail.
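
One way to address the reserved-bits concern is to clear the whole reply
structure before filling in its status. The following is a minimal sketch,
assuming a stand-in tail layout of one status byte plus three reserved
bytes; `struct req_tail`, `S_DEVERR` and `make_tail()` are hypothetical
names for illustration, not the patch's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the reply tail: one status byte plus reserved bytes. */
struct req_tail {
    uint8_t status;
    uint8_t reserved[3];
};

enum { S_OK = 0, S_DEVERR = 3 };   /* illustrative values only */

/* Build a tail whose reserved bytes are guaranteed to be zero. */
static struct req_tail make_tail(uint8_t status)
{
    struct req_tail tail;

    memset(&tail, 0, sizeof(tail));  /* clears reserved[] as well */
    tail.status = status;
    return tail;
}

/* Returns 1 if every reserved byte is zero, 0 otherwise. */
static int tail_reserved_is_zero(const struct req_tail *tail)
{
    for (size_t i = 0; i < sizeof(tail->reserved); i++) {
        if (tail->reserved[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Declaring the variable with an initializer (`struct req_tail tail = { 0 };`)
would have the same effect for the named fields.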

> +goto out;
> +}
> +qemu_mutex_lock(&s->mutex);
> +switch (head.type) {
> +case VIRTIO_IOMMU_T_ATTACH:
> +tail.status = virtio_iommu_handle_attach(s, iov, iov_cnt);
> +break;
> +case VIRTIO_IOMMU_T_DETACH:
> +tail.status = virtio_iommu_handle_detach(s, iov, iov_cnt);
> +break;
> +case VIRTIO_IOMMU_T_MAP:
> +tail.status = virtio_iommu_handle_map(s, iov, iov_cnt);
> +break;
> +case VIRTIO_IOMMU_T_UNMAP:
> +tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
> +break;
> +default:
> +tail.status = VIRTIO_IOMMU_S_UNSUPP;
> +}
> +qemu_mutex_unlock(&s->mutex);
> +
> +out:
> +sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> +  &tail, sizeof(tail));
> +assert(sz == sizeof(tail));
> +
> +virtqueue_push(vq, elem, sizeof(tail));

s/tail/head/ (though they are the same size)?

> +virtio_notify(vdev, vq);
> +g_free(elem);
> +}
> +}

[...]

> +static void virtio_iommu_set_features(VirtIODevice *vdev, uint64_t val)
> +{
> +VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
> +
> +dev->acked_features = val;
> +trace_virtio_iommu_set_features(dev->acked_features);
> +}
> +
> +static const VMStateDescription vmstate_virtio_iommu_device = {
> +.name = "virtio-iommu-device",
> +.unmigratable = 1,

Curious, is there explicit reason to not support migration from the
first version? :)

> +};
> +
> +static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> +{
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
> +
> +virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
> +sizeof(struct virtio_iommu_config));
> +
> +s->req_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE,
> + virtio_iommu_handle_command);
> +s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);
> +
> +s->config.page_size_mask = TARGET_PAGE_MASK;
> +s->config.input_range.end = -1UL;
> +s->config.domain_range.start = 0;

Zero input_range.start = 0?  After all domain_range.start is zeroed.

> +s->config.domain_range.end = 32;
> +
> +virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
> +virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
> +virtio_add_feature(&s->features, VIRTIO_F_VERSION_1);
> +virtio_add_feature(&s->features, VIRTIO_IOMMU_F_INPUT_RANGE);
> +virtio_add_feature(&s->features, VIRTIO_IOMMU_F_DOMAIN_RANGE);
> +virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
> +virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
> +virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
> +}

Regards,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v5 03/10] accel: collecting JIT statistics

2019-08-15 Thread Alex Bennée



vandersonmr  writes:

> If a TB has a TBS (TBStatistics) with the TB_JIT_STATS
> enabled then we collect statistics of its translation
> processes and code translation.
>
> Collecting the number of host instructions seems to be
> not simple as it would imply in having to modify several
> target source files. So, for now, we are only collecting
> the size of the host gen code.
>
> Signed-off-by: Vanderson M. do Rosario 

Reviewed-by: Alex Bennée 
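
For readers following along, the pattern the patch applies throughout
(check that the statistics group is enabled, then bump a counter
atomically) can be sketched on its own. This is only an illustration using
C11 atomics; `struct tb_stats`, the flag values and `record_translation()`
are stand-ins I made up, not QEMU's types or atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define EXEC_STATS (1u << 0)   /* stand-in flag bits, not the patch's values */
#define JIT_STATS  (1u << 2)

struct tb_stats {
    unsigned stats_enabled;        /* bitmask of enabled statistic groups */
    atomic_ulong translations;     /* bumped once per (re)translation */
    atomic_uint out_len;           /* accumulated host code size in bytes */
};

/* Mirrors the macro's intent: stats exist and the group is enabled. */
static int stats_enabled(const struct tb_stats *st, unsigned group)
{
    return st != NULL && (st->stats_enabled & group) != 0;
}

/* Record one translation that produced code_size bytes of host code. */
static void record_translation(struct tb_stats *st, unsigned code_size)
{
    if (!stats_enabled(st, JIT_STATS)) {
        return;                    /* collection disabled: do nothing */
    }
    atomic_fetch_add(&st->translations, 1);
    atomic_fetch_add(&st->out_len, code_size);
}
```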

> ---
>  accel/tcg/translate-all.c | 14 ++
>  accel/tcg/translator.c|  4 
>  include/exec/tb-stats.h   | 15 +++
>  tcg/tcg.c | 23 +++
>  tcg/tcg.h |  2 ++
>  5 files changed, 58 insertions(+)
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index df08d183df..85c6b7b409 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1696,6 +1696,7 @@ static TBStatistics *tb_get_stats(tb_page_addr_t 
> phys_pc, target_ulong pc,
>  new_stats->cs_base = cs_base;
>  new_stats->flags = flags;
>  new_stats->tb = current_tb;
> +new_stats->translations.total = 1;
>
>  qht_insert(&tb_ctx.tb_stats, new_stats, hash, &existing_stats);
>
> @@ -1705,6 +1706,7 @@ static TBStatistics *tb_get_stats(tb_page_addr_t 
> phys_pc, target_ulong pc,
>   * then just make the new TB point to the older TBStatistic
>   */
>  g_free(new_stats);
> +((TBStatistics *) existing_stats)->tb = current_tb;
>  return existing_stats;
>  } else {
>  return new_stats;
> @@ -1792,6 +1794,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>  tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
>  }
>  }
> +
> +if (flag & TB_JIT_STATS) {
> +tb->tb_stats->stats_enabled |= TB_JIT_STATS;
> +atomic_inc(&tb->tb_stats->translations.total);
> +}
>  } else {
>  tb->tb_stats = NULL;
>  }
> @@ -1869,6 +1876,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>  atomic_set(&prof->search_out_len, prof->search_out_len + search_size);
>  #endif
>
> +if (tb_stats_enabled(tb, TB_JIT_STATS)) {
> +atomic_add(&tb->tb_stats->code.out_len, gen_code_size);
> +}
> +
>  #ifdef DEBUG_DISAS
>  if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM) &&
>  qemu_log_in_addr_range(tb->pc)) {
> @@ -1926,6 +1937,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>  phys_page2 = -1;
>  if ((pc & TARGET_PAGE_MASK) != virt_page2) {
>  phys_page2 = get_page_addr_code(env, virt_page2);
> +if (tb_stats_enabled(tb, TB_JIT_STATS)) {
> +atomic_inc(&tb->tb_stats->translations.spanning);
> +}
>  }
>  /*
>   * No explicit memory barrier is required -- tb_link_page() makes the
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 396a11e828..834265d5be 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -117,6 +117,10 @@ void translator_loop(const TranslatorOps *ops, 
> DisasContextBase *db,
>  db->tb->size = db->pc_next - db->pc_first;
>  db->tb->icount = db->num_insns;
>
> +if (tb_stats_enabled(tb, TB_JIT_STATS)) {
> +atomic_add(&db->tb->tb_stats->code.num_guest_inst, db->num_insns);
> +}
> +
>  #ifdef DEBUG_DISAS
>  if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
>  && qemu_log_in_addr_range(db->pc_first)) {
> diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
> index 0265050b79..3c219123c2 100644
> --- a/include/exec/tb-stats.h
> +++ b/include/exec/tb-stats.h
> @@ -34,6 +34,20 @@ struct TBStatistics {
>  unsigned long atomic;
>  } executions;
>
> +struct {
> +unsigned num_guest_inst;
> +unsigned num_tcg_ops;
> +unsigned num_tcg_ops_opt;
> +unsigned spills;
> +unsigned out_len;
> +} code;
> +
> +struct {
> +unsigned long total;
> +unsigned long uncached;
> +unsigned long spanning;
> +} translations;
> +
>  /* current TB linked to this TBStatistics */
>  TranslationBlock *tb;
>  };
> @@ -47,6 +61,7 @@ enum TBStatsStatus { TB_STATS_RUNNING, TB_STATS_PAUSED, 
> TB_STATS_STOPPED };
>
>  #define TB_NOTHING0
>  #define TB_EXEC_STATS 1
> +#define TB_JIT_STATS  (1 << 2)
>
>  extern int tcg_collect_tb_stats;
>  extern uint32_t default_tbstats_flag;
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index be2c33c400..446e3d1708 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3126,6 +3126,11 @@ static void temp_sync(TCGContext *s, TCGTemp *ts, 
> TCGRegSet allocated_regs,
>  case TEMP_VAL_REG:
>  tcg_out_st(s, ts->type, ts->reg,
> ts->mem_base->reg, ts->mem_offset);
> +
> +/* Count number of spills */
> +if (tb_stats_enabled(s->current_tb, TB_JIT_STATS)) {
> +atomic_inc(&s->current_tb->tb_stats->code.spills);
> +}
>  break;
>
>  case TEMP_VAL_MEM:
> @@ -3997,6

Re: [Qemu-devel] [PATCH v5 04/10] accel: replacing part of CONFIG_PROFILER with TBStats

2019-08-15 Thread Alex Bennée



vandersonmr  writes:

> We add some of the statistics collected in the TCGProfiler
> into the TBStats, having the statistics not only for the whole
> emulation but for each TB. Then, we removed these stats
> from TCGProfiler and reconstruct the information for the
> "info jit" using the sum of all TBStats statistics.
>
> The goal is to have one unique and better way of collecting
> emulation statistics. Moreover, checking dynamically if the
> profiling is enabled showed to have an insignificant impact
> on the performance:
> https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality#Overheads.
>
> Signed-off-by: Vanderson M. do Rosario 
> ---
>  accel/tcg/tb-stats.c  | 95 +++
>  accel/tcg/translate-all.c |  8 +---
>  include/exec/tb-stats.h   | 11 +
>  tcg/tcg.c | 93 +-
>  tcg/tcg.h | 10 -
>  5 files changed, 118 insertions(+), 99 deletions(-)
>
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index 3489133e9e..9b720d9b86 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -1,9 +1,104 @@
>  #include "qemu/osdep.h"
>
>  #include "disas/disas.h"
> +#include "exec/exec-all.h"
> +#include "tcg.h"
> +
> +#include "qemu/qemu-print.h"
>
>  #include "exec/tb-stats.h"
>
> +struct jit_profile_info {
> +uint64_t translations;
> +uint64_t aborted;
> +uint64_t ops;
> +unsigned ops_max;
> +uint64_t del_ops;
> +uint64_t temps;
> +unsigned temps_max;
> +uint64_t host;
> +uint64_t guest;
> +uint64_t search_data;
> +};
> +
> +/* accumulate the statistics from all TBs */
> +static void collect_jit_profile_info(void *p, uint32_t hash, void *userp)
> +{
> +struct jit_profile_info *jpi = userp;
> +TBStatistics *tbs = p;
> +
> +jpi->translations += tbs->translations.total;
> +jpi->ops += tbs->code.num_tcg_ops;
> +if (stat_per_translation(tbs, code.num_tcg_ops) > jpi->ops_max) {
> +jpi->ops_max = stat_per_translation(tbs, code.num_tcg_ops);
> +}
> +jpi->del_ops += tbs->code.deleted_ops;
> +jpi->temps += tbs->code.temps;
> +if (stat_per_translation(tbs, code.temps) > jpi->temps_max) {
> +jpi->temps_max = stat_per_translation(tbs, code.temps);
> +}
> +jpi->host += tbs->code.out_len;
> +jpi->guest += tbs->code.in_len;
> +jpi->search_data += tbs->code.search_out_len;
> +}
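
The accumulation step above can be read in isolation as a fold over the
per-TB records. Below is a reduced sketch of that idea. It assumes that a
"per translation" statistic means the summed counter divided by the number
of translations (the macro's definition is not shown in this hunk), and
all names are stand-ins rather than the patch's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-TB counters; a cut-down stand-in for the patch's TBStatistics. */
struct tb_entry {
    unsigned long translations;   /* times this TB was translated */
    unsigned num_ops;             /* TCG ops summed over all translations */
};

struct totals {
    uint64_t translations;
    uint64_t ops;
    unsigned ops_max;             /* largest per-translation op count seen */
};

/* Assumed meaning of a per-translation statistic: sum / translations. */
static unsigned per_translation(const struct tb_entry *e)
{
    return e->translations ? (unsigned)(e->num_ops / e->translations) : 0;
}

/* Fold every per-TB record into one summary, tracking the maximum. */
static struct totals accumulate(const struct tb_entry *tbs, size_t n)
{
    struct totals t = { 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        t.translations += tbs[i].translations;
        t.ops += tbs[i].num_ops;
        if (per_translation(&tbs[i]) > t.ops_max) {
            t.ops_max = per_translation(&tbs[i]);
        }
    }
    return t;
}

/* Average ops per translated TB, guarding against an empty table. */
static double avg_ops(const struct totals *t)
{
    return t->translations ? (double)t->ops / (double)t->translations : 0.0;
}
```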
> +
> +/* dump JIT statistics using TCGProfile and TBStats */
> +void dump_jit_profile_info(TCGProfile *s)
> +{
> +if (!tb_stats_collection_enabled()) {
> +return;
> +}
> +
> +struct jit_profile_info *jpi = g_new0(struct jit_profile_info, 1);
> +
> +qht_iter(&tb_ctx.tb_stats, collect_jit_profile_info, jpi);
> +
> +if (jpi->translations) {
> +qemu_printf("translated TBs  %" PRId64 "\n", jpi->translations);
> +qemu_printf("avg ops/TB  %0.1f max=%d\n",
> +jpi->ops / (double) jpi->translations, jpi->ops_max);
> +qemu_printf("deleted ops/TB  %0.2f\n",
> +jpi->del_ops / (double) jpi->translations);
> +qemu_printf("avg temps/TB%0.2f max=%d\n",
> +jpi->temps / (double) jpi->translations, jpi->temps_max);
> +qemu_printf("avg host code/TB%0.1f\n",
> +jpi->host / (double) jpi->translations);
> +qemu_printf("avg search data/TB  %0.1f\n",
> +jpi->search_data / (double) jpi->translations);
> +
> +if (s) {
> +int64_t tot = s->interm_time + s->code_time;
> +qemu_printf("JIT cycles  %" PRId64 " (%0.3f s at 2.4 GHz)\n",
> +tot, tot / 2.4e9);
> +qemu_printf("cycles/op   %0.1f\n",
> +jpi->ops ? (double)tot / jpi->ops : 0);
> +qemu_printf("cycles/in byte  %0.1f\n",
> +jpi->guest ? (double)tot / jpi->guest : 0);
> +qemu_printf("cycles/out byte %0.1f\n",
> +jpi->host ? (double)tot / jpi->host : 0);
> +qemu_printf("cycles/search byte %0.1f\n",
> +jpi->search_data ? (double)tot / jpi->search_data : 0);
> +if (tot == 0) {
> +tot = 1;
> +}
> +qemu_printf("  gen_interm time   %0.1f%%\n",
> +(double)s->interm_time / tot * 100.0);
> +qemu_printf("  gen_code time %0.1f%%\n",
> +(double)s->code_time / tot * 100.0);
> +qemu_printf("optim./code time%0.1f%%\n",
> +(double)s->opt_time / (s->code_time ? s->code_time : 1)
> +* 100.0);
> +qemu_printf("liveness/code time  %0.1f%%\n",
> +(double)s->la_time / (s->code_time ? s->code_time : 1) * 100.0);
> +qemu_printf("cpu_restore count   %" PRId64 "\n",
> +

Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-15 Thread Eric Blake

On 8/15/19 9:44 AM, Maxim Levitsky wrote:

>>>> Does the idea of a union type with a default value for the discriminator
>>>> help?  Maybe we have a discriminator which defaults to 'auto', and add a
>>>> union branch 'auto':'any'.  During creation, if the "driver":"auto"
>>>> branch is selected (usually implicitly by omitting "driver", but also
>>>> possible explicitly), the creation attempt is rejected as invalid
>>>> regardless of the contents of the remaining 'any'.  But during amend
>>>> usage, if the 'auto' branch is selected, we then add in the proper
>>>> "driver":"xyz" and reparse the QAPI object to determine if the remaining
>>>> fields in 'any' still meet the specification for the required driver
>>>> branch.
>>>>
>>>> This idea may still require some tweaks to the QAPI generator, but it's
>>>> the best I can come up with for a way to parse an arbitrary JSON object
>>>> with unknown validation, then reparse it again after adding more
>>>> information that would constrain the parse differently.
>>>
>>> Feels like this would be a lot of code just to allow the client to omit
>>> passing a value that it knows anyway. If this were a human interface, I
>>> could understand the desire to make commands less verbose, but for QMP I
>>> honestly don't see the point when it's not trivial.
>>
>> Seconded.
> 
> 
> But what about my suggestion of adding something like:
> 
> { 'union': 'BlockdevAmendOptions',
> 
>   'base': {
>   'node-name': 'str' },
> 
>   'discriminator': { 'get_block_driver(node-name)' } ,

Not worth it. It makes the QAPI generator more complex (to invoke
arbitrary code instead of a fixed name) just to avoid a little bit of
complexity in the caller (which is assumed to be a computer, and thus
shouldn't have a hard time providing a sane 'driver' unconditionally).
An HMP wrapper around the QMP command can do whatever magic it needs to
omit driver, but making driver mandatory for QMP is just fine.
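
To illustrate why a fixed, mandatory discriminator keeps things simple,
here is a hand-written sketch of what the consumer side boils down to. It
is not QAPI-generated code; `struct amend_options`, `enum driver` and
`parse_driver()` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum driver { DRIVER_NONE, DRIVER_QCOW2, DRIVER_LUKS };

/* Hypothetical amend request: 'driver' selects which branch is valid. */
struct amend_options {
    const char *node_name;
    const char *driver;      /* NULL models a client that omitted it */
};

/* Map the discriminator string to a branch; DRIVER_NONE means reject. */
static enum driver parse_driver(const struct amend_options *opts)
{
    if (opts->driver == NULL) {
        return DRIVER_NONE;          /* mandatory: no guessing from the node */
    }
    if (strcmp(opts->driver, "qcow2") == 0) {
        return DRIVER_QCOW2;
    }
    if (strcmp(opts->driver, "luks") == 0) {
        return DRIVER_LUKS;
    }
    return DRIVER_NONE;
}
```

A human-facing wrapper could look up the node's driver and fill in the
field before calling into this path, which keeps the lookup out of the
parser itself.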

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-15 Thread Laszlo Ersek

On 08/14/19 16:04, Paolo Bonzini wrote:
> On 14/08/19 15:20, Yao, Jiewen wrote:
>>> - Does this part require a new branch somewhere in the OVMF SEC code?
>>>   How do we determine whether the CPU executing SEC is BSP or
>>>   hot-plugged AP?
>> [Jiewen] I think this is blocked from hardware perspective, since the first 
>> instruction.
>> There are some hardware specific registers can be used to determine if the 
>> CPU is new added.
>> I don’t think this must be same as the real hardware.
>> You are free to invent some registers in device model to be used in OVMF hot 
>> plug driver.
> 
> Yes, this would be a new operation mode for QEMU, that only applies to
> hot-plugged CPUs.  In this mode the AP doesn't reply to INIT or SMI, in
> fact it doesn't reply to anything at all.
> 
>>> - How do we tell the hot-plugged AP where to start execution? (I.e. that
>>>   it should execute code at a particular pflash location.)
>> [Jiewen] Same real mode reset vector at :FFF0.
> 
> You do not need a reset vector or INIT/SIPI/SIPI sequence at all in
> QEMU.  The AP does not start execution at all when it is unplugged, so
> no cache-as-RAM etc.
> 
> We only need to modify QEMU so that hot-plugged APs do not reply to
> INIT/SIPI/SMI.
> 
>> I don’t think there is problem for real hardware, who always has CAR.
>> Can QEMU provide some CPU specific space, such as MMIO region?
> 
> Why is a CPU-specific region needed if every other processor is in SMM
> and thus trusted?

I was going through the steps Jiewen and Yingwen recommended.

In step (02), the new CPU is expected to set up RAM access. In step
(03), the new CPU, executing code from flash, is expected to "send board
message to tell host CPU (GPIO->SCI) -- I am waiting for hot-add
message." For that action, the new CPU may need a stack (minimally if we
want to use C function calls).

Until step (03), there had been no word about any other (= pre-plugged)
CPUs (more precisely, Jiewen even confirmed "No impact to other
processors"), so I didn't assume that other CPUs had entered SMM.

Paolo, I've attempted to read Jiewen's response, and yours, as carefully
as I can. I'm still very confused. If you have a better understanding,
could you please write up the 15-step process from the thread starter
again, with all QEMU customizations applied? Such as, unnecessary steps
removed, and platform specifics filled in.

One more comment below:

> 
>>>   Does CPU hotplug apply only at the socket level? If the CPU is
>>>   multi-core, what is responsible for hot-plugging all cores present in
>>>   the socket?
> 
> I can answer this: the SMM handler would interact with the hotplug
> controller in the same way that ACPI DSDT does normally.  This supports
> multiple hotplugs already.
> 
> Writes to the hotplug controller from outside SMM would be ignored.
> 
>>>> (03) New CPU: (Flash) send board message to tell host CPU (GPIO->SCI)
>>>>      -- I am waiting for hot-add message.
>>>
>>> Maybe we can simplify this in QEMU by broadcasting an SMI to existent
>>> processors immediately upon plugging the new CPU.
> 
> The QEMU DSDT could be modified (when secure boot is in effect) to OUT
> to 0xB2 when hotplug happens.  It could write a well-known value to
> 0xB2, to be read by an SMI handler in edk2.

(My comment below is general, and may not apply to this particular
situation. I'm too confused to figure that out myself, sorry!)

I dislike involving QEMU's generated DSDT in anything SMM (even
injecting the SMI), because the AML interpreter runs in the OS.

If a malicious OS kernel is a bit too enlightened about the DSDT, it
could willfully diverge from the process that we design. If QEMU
broadcast the SMI internally, the guest OS could not interfere with that.

If the purpose of the SMI is specifically to force all CPUs into SMM
(and thereby force them into trusted state), then the OS would be
explicitly counter-interested in carrying out the AML operations from
QEMU's DSDT.

I'd be OK with an SMM / SMI involvement in QEMU's DSDT if, by diverging
from that DSDT, the OS kernel could only mess with its own state, and
not with the firmware's.
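
To make the trust boundary concrete, here is a toy model of a register
that honours writes only from SMM, along the lines of the "writes from
outside SMM would be ignored" idea quoted above. It is purely
illustrative: `struct hotplug_ctrl` and `ctrl_write()` are invented for
this sketch and do not describe an existing QEMU device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a control register that only SMM code may change. */
struct hotplug_ctrl {
    uint32_t value;
};

/* Apply a write only when the writer is in SMM; return whether it took. */
static bool ctrl_write(struct hotplug_ctrl *c, uint32_t v, bool in_smm)
{
    if (!in_smm) {
        return false;       /* untrusted context: write is silently dropped */
    }
    c->value = v;
    return true;
}
```

With a gate like this, an OS that deviates from the DSDT can at worst fail
to trigger the flow; it cannot alter the state the firmware relies on.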

Thanks
Laszlo

> 
> 
>>>
>>>>      (NOTE: Host CPU can only send instruction in SMM mode. -- The
>>>>      register is SMM only)
>>>
>>> Sorry, I don't follow -- what register are we talking about here, and
>>> why is the BSP needed to send anything at all? What "instruction" do you
>>> have in mind?
>> [Jiewen] The new CPU does not enable SMI at reset.
>> At some point of time later, the CPU need enable SMI, right?
>> The "instruction" here means, the host CPUs need tell to CPU to enable SMI.
> 
> Right, this would be a write to the CPU hotplug controller
> 
>>>> (04) Host CPU: (OS) get message from board that a new CPU is added.
>>>>      (GPIO -> SCI)
>>>>
>>>> (05) Host CPU: (OS) All CPUs enter SMM (SCI->SWSMI) (NOTE: New CPU
>>>>      will not enter CPU because SMI is disabled)
>>>
>>> I don't understand the

Re: [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 04:23, "Jan Bobek" wrote:
>
> Introduce a helper function to take care of instruction CPUID checks.
>
> Signed-off-by: Jan Bobek 
> ---

Jan, what is the origin of "CK"? If it is a QEMU internal thing, perhaps
use "CHECK".

The function should be called check_cpuid(), imho. I know, Richard would
like c_ci(), or simpler cc(), better.

Aleksandar

>  target/i386/translate.c | 48 +
>  1 file changed, 48 insertions(+)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 6296a02991..0cffa2226b 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4500,6 +4500,54 @@ static void gen_sse(CPUX86State *env, DisasContext
*s, int b)
>  #define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)\
>  tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
>
> +typedef enum {
> +CK_CPUID_MMX = 1,
> +CK_CPUID_3DNOW,
> +CK_CPUID_SSE,
> +CK_CPUID_SSE2,
> +CK_CPUID_CLFLUSH,
> +CK_CPUID_SSE3,
> +CK_CPUID_SSSE3,
> +CK_CPUID_SSE4_1,
> +CK_CPUID_SSE4_2,
> +CK_CPUID_SSE4A,
> +CK_CPUID_AVX,
> +CK_CPUID_AVX2,
> +} CkCpuidFeat;
> +
> +static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
> +{
> +switch (feat) {
> +case CK_CPUID_MMX:
> +return !(s->cpuid_features & CPUID_MMX)
> +|| !(s->cpuid_ext2_features & CPUID_EXT2_MMX);
> +case CK_CPUID_3DNOW:
> +return !(s->cpuid_ext2_features & CPUID_EXT2_3DNOW);
> +case CK_CPUID_SSE:
> +return !(s->cpuid_features & CPUID_SSE);
> +case CK_CPUID_SSE2:
> +return !(s->cpuid_features & CPUID_SSE2);
> +case CK_CPUID_CLFLUSH:
> +return !(s->cpuid_features & CPUID_CLFLUSH);
> +case CK_CPUID_SSE3:
> +return !(s->cpuid_ext_features & CPUID_EXT_SSE3);
> +case CK_CPUID_SSSE3:
> +return !(s->cpuid_ext_features & CPUID_EXT_SSSE3);
> +case CK_CPUID_SSE4_1:
> +return !(s->cpuid_ext_features & CPUID_EXT_SSE41);
> +case CK_CPUID_SSE4_2:
> +return !(s->cpuid_ext_features & CPUID_EXT_SSE42);
> +case CK_CPUID_SSE4A:
> +return !(s->cpuid_ext3_features & CPUID_EXT3_SSE4A);
> +case CK_CPUID_AVX:
> +return !(s->cpuid_ext_features & CPUID_EXT_AVX);
> +case CK_CPUID_AVX2:
> +return !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
> +default:
> +g_assert_not_reached();
> +}
> +}
> +
>  static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
>  {
>  enum {
> --
> 2.20.1
>
>

Re: [Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-15 Thread BALATON Zoltan


On Thu, 15 Aug 2019, Gerd Hoffmann wrote:

>>>> +static void ati_vga_update_irq(ATIVGAState *s)
>>>> +{
>>>> +pci_set_irq(&s->dev, s->regs.gen_int_status & 1);
>>>
>>> This should be "s->regs.gen_int_status & s->regs.gen_int_cntl" I guess?
>>
>> Probably, but we only try to emulate VBlank yet so to avoid any problems due
>> to raising irq for unknown bits I restricted it for that now.
>
> Well, qemu doesn't set unknown status bits, only vblank.  The guest
> can't set them either due to status register having write-one-to-clear
> semantics.  So, that should not happen.  If you want an extra check to
> catch programming errors I'd suggest to add an assert() for that.

OK I'll change that then.

>>>> +s->regs.gen_int_status &= ~data;
>>>
>>> ati_vga_update_irq() needed here too.
>>
>> Thanks. Indeed I forgot this. With that it works a bit better, mouse now can
>> be moved but only vertically... No idea why, I'll have to check,
>
> Still progress.  One step at a time ;)


Got this too (and cursor color as well). These are because MacOS accesses
these regs with sizes smaller than 4 bytes, so we need to support unaligned
access for them. It's a bit tricky because we could get the reg content in
pieces and need to decide when it's complete to act on it. I'll try to fix
that, and then it will hopefully work (well, at least boot far enough to
see it fail whenever it tries to use any unimplemented acceleration, but at
least we can test emulation with it when further pieces are implemented in
the future).
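
For the sub-word accesses mentioned above, the merging part can be
sketched roughly as follows. This assumes little-endian byte lanes within
a 32-bit register and uses a made-up helper name, `reg_write_partial()`;
it does not address the harder question of deciding when a register that
arrives in pieces is complete.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Merge a 1-, 2- or 4-byte little-endian write at byte offset 'off'
 * (0..3) into a 32-bit register value and return the new value.
 */
static uint32_t reg_write_partial(uint32_t reg, unsigned off,
                                  unsigned size, uint32_t data)
{
    assert(size == 1 || size == 2 || size == 4);
    assert(off + size <= 4);

    /* Mask covering 'size' bytes; avoid shifting a 32-bit value by 32. */
    uint32_t mask = (size == 4) ? 0xffffffffu : ((1u << (size * 8)) - 1);
    unsigned shift = off * 8;

    /* Clear the target byte lanes, then drop in the new bytes. */
    return (reg & ~(mask << shift)) | ((data & mask) << shift);
}
```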


Regards,
BALATON Zoltan
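
As a reference for the interrupt logic discussed earlier in this message,
here is a small stand-alone model of a write-one-to-clear status register
whose IRQ level is the status masked by the enable register. The struct
and function names and the bit position are stand-ins for illustration,
not the ati-vga code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INT_VBLANK (1u << 0)   /* stand-in bit position */

struct int_regs {
    uint32_t cntl;      /* interrupt enable bits, guest-writable */
    uint32_t status;    /* pending bits, write-one-to-clear */
};

/* IRQ line is asserted when any pending interrupt is also enabled. */
static bool irq_level(const struct int_regs *r)
{
    return (r->status & r->cntl) != 0;
}

/* Device side: latch a pending interrupt. */
static void raise_int(struct int_regs *r, uint32_t bits)
{
    r->status |= bits;
}

/* Guest write to status: bits written as 1 are cleared, 0 bits are kept. */
static void status_write(struct int_regs *r, uint32_t data)
{
    r->status &= ~data;
}
```

The IRQ level has to be re-evaluated after every change to either
register, which is why the update call is needed on the status write path
too.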

Re: [Qemu-devel] [libvirt] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-15 Thread Peter Krempa

On Thu, Aug 15, 2019 at 12:49:28 +0200, Kevin Wolf wrote:
> On 14.08.2019 at 21:27, John Snow wrote:

[...]

> > example:
> > 
> > { "return": {},
> >   "deprecated": True,
> >   "warning": "Omitting filter-node-name parameter is deprecated, it will
> > be required in the future"
> > }
> > 
> > There's no "error" key, so this should be recognized as success by
> > compatible clients, but they'll definitely see the extra information.
> > 
> > Part of my motivation is to facilitate a more aggressive deprecation of
> > legacy features by ensuring that we are able to rigorously notify users
> > through any means that they need to adjust their scripts.
> 
> Who would read this, though? In the best case it ends up deep in a
> libvirt log that nobody will look at because there was no error. In the
> more common case, the debug level is configured so that QMP traffic
> isn't even logged.

The best we could do here is to log a warning. Thankfully we have one
central function which always checks the returned JSON from qemu so we
could do that universally.

That would end up in the system log and alternatively also in the VM
log file. I agree with Kevin that the possibility of it being noticed
is rather small.

From my experience users report non-fatal messages mostly only if they
are spamming the system log. One-off instances are very unlikely to be
noticed.

In my experience it's better to notify us in libvirt of such change and
we will try our best to fix it.


Re: [Qemu-devel] [PATCH 7/7] target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word

2019-08-15 Thread Richard Henderson





>
>This patch is fine, but I noticed while reviewing it that tcg/README
>labels both the extrl_i64_i32 and extrh_i64_i32 operations as
>"for 64-bit hosts only". Presumably that's a documentation error,
>since we're not guarding the existing uses of the extrl_i64_i32
>here with any kind of ifdeffery to restrict them to 64-bit hosts ?
>


A documentation unclarity in that the opcodes are for 64-bit hosts. The 
tcg_gen_* functions are always available, and expand to INDEX_op_mov_i32 on 
32-bit hosts.
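
In plain C terms, the two operations compute the following. This shows
only the arithmetic result, not how TCG generates code for it, and the
function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a 64-bit value (the result of an "extract low"). */
static uint32_t extract_low32(uint64_t v)
{
    return (uint32_t)v;
}

/* High 32 bits of a 64-bit value (the result of an "extract high"). */
static uint32_t extract_high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```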


r~

Re: [Qemu-devel] [PATCH v0] Implement new cache mode "target"

2019-08-15 Thread Stefan Hajnoczi

On Wed, Aug 07, 2019 at 04:09:54PM +0300, Artemy Kapitula wrote:

Hi,
Please use "scripts/get_maintainer.pl -f block.c" to find out which
maintainers to email.  qemu-devel@nongnu.org is a high-traffic list and
patches not CCed to the right maintainer may not get quick review.

> There is an issue with databases in VM that perform too slow
> on generic SAN storages. The key point is fdatasync that flushes
> disk on SCSI target.
> 
> The QEMU blockdev "target" cache mode intended to be used with
> SAN storages and is a mix of "none" by using direct I/O and
> "unsafe" that omit device flush.
> 
> Such storage has its own data integrity protection and can
> be operated with direct I/O without additional fdatasync().
> 
> With generic SCSI targets like LIO or SCST it boost performance
> up to 100% on some profiles like database with transaction journal
> (postgresql/mssql/oracle etc) or virtualized SDS (ceph/rook inside
> VMs) which performs block device cache flush on journal record.

If the physical storage controller has a Battery Backed Unit (BBU) or
similar then flush requests are not required with O_DIRECT.  This has
been a common enterprise storage configuration for many years and is
already supported in QEMU today:

Configure the guest with cache=none and disable the emulated storage
controller's write cache (e.g. -device virtio-blk-pci,write-cache=off).
Inside the guest /sys/block/$BLKDEV/queue/write_cache should show "write
through".

I think this patch is not necessary since write-cache=off already
exists.  cache=target is also slower since the guest sends unnecessary
flush requests to the emulated storage controller.
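
For reference, the existing cache= shortcuts expand to three independent
flags, which is easiest to see as a lookup table. The sketch below is a
stand-alone illustration (not the block.c parser; the struct and function
names are mine) of the documented modes. The proposed "target" would be
the on/on/on combination, i.e. cache=none plus no-flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct cache_flags {
    bool writeback;   /* cache.writeback */
    bool direct;      /* cache.direct: bypass the host page cache */
    bool no_flush;    /* cache.no-flush: ignore guest flush requests */
};

/* Fill *out for a known mode name; return false for an unknown mode. */
static bool parse_cache_mode(const char *mode, struct cache_flags *out)
{
    static const struct {
        const char *name;
        struct cache_flags flags;
    } modes[] = {
        { "writeback",    { true,  false, false } },
        { "none",         { true,  true,  false } },
        { "writethrough", { false, false, false } },
        { "directsync",   { false, true,  false } },
        { "unsafe",       { true,  false, true  } },
    };

    for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
        if (strcmp(mode, modes[i].name) == 0) {
            *out = modes[i].flags;
            return true;
        }
    }
    return false;
}
```

The write-cache=off approach leaves this table alone: the host side stays
at cache=none, and the guest simply stops issuing flushes because the
device reports a write-through cache.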

Thanks,
Stefan

> Signed-off-by: Artemy Kapitula 
> ---
>  block.c| 4 
>  qemu-options.hx| 3 ++-
>  tests/qemu-iotests/026 | 2 +-
>  tests/qemu-iotests/091 | 2 +-
>  4 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/block.c b/block.c
> index cbd8da5f3b..60919d82ff 100644
> --- a/block.c
> +++ b/block.c
> @@ -884,6 +884,10 @@ int bdrv_parse_cache_mode(const char *mode, int *flags, 
> bool *writethrough)
>  } else if (!strcmp(mode, "unsafe")) {
>  *writethrough = false;
>  *flags |= BDRV_O_NO_FLUSH;
> +} else if (!strcmp(mode, "target")) {
> +*writethrough = false;
> +*flags |= BDRV_O_NOCACHE;
> +*flags |= BDRV_O_NO_FLUSH;
>  } else if (!strcmp(mode, "writethrough")) {
>  *writethrough = true;
>  } else {
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9621e934c0..01f1f4ad34 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1065,7 +1065,7 @@ This option defines the type of the media: disk or 
> cdrom.
>  @var{snapshot} is "on" or "off" and controls snapshot mode for the given 
> drive
>  (see @option{-snapshot}).
>  @item cache=@var{cache}
> -@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough"
> +@var{cache} is "none", "writeback", "unsafe", "target", "directsync" or 
> "writethrough"
>  and controls how the host cache is used to access block data. This is a
>  shortcut that sets the @option{cache.direct} and @option{cache.no-flush}
>  options (as in @option{-blockdev}), and additionally 
> @option{cache.writeback},
> @@ -1084,6 +1084,7 @@ none         │ on                on             off
>  writethrough │ off               off            off
>  directsync   │ off               on             off
>  unsafe       │ on                off            on
> +target       │ on                on             on
>  @end example
>  The default mode is @option{cache=writeback}.
> diff --git a/tests/qemu-iotests/026 b/tests/qemu-iotests/026
> index e30243608b..e7179b0de4 100755
> --- a/tests/qemu-iotests/026
> +++ b/tests/qemu-iotests/026
> @@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>  _supported_fmt qcow2
>  _supported_proto file
>  _default_cache_mode "writethrough"
> -_supported_cache_modes "writethrough" "none"
> +_supported_cache_modes "writethrough" "none" "target"
>  # The refcount table tests expect a certain minimum width for refcount 
> entries
>  # (so that the refcount table actually needs to grow); that minimum is 16 
> bits,
>  # being the default refcount entry width.
> diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
> index d62ef18a02..2eaf258c8a 100755
> --- a/tests/qemu-iotests/091
> +++ b/tests/qemu-iotests/091
> @@ -47,7 +47,7 @@ _supported_fmt qcow2
>  _supported_proto file
>  _supported_os Linux
>  _default_cache_mode "none"
> -_supported_cache_modes "writethrough" "none" "writeback"
> +_supported_cache_modes "writethrough" "none" "writeback" "target"
>  size=1G
> -- 
> 2.21.0
> 
> 
> 



Re: [Qemu-devel] [libvirt] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-15 Thread Markus Armbruster

Peter Krempa  writes:

> On Thu, Aug 15, 2019 at 12:49:28 +0200, Kevin Wolf wrote:
>> On 14.08.2019 at 21:27, John Snow wrote:
>
> [...]
>
>> > example:
>> > 
>> > { "return": {},
>> >   "deprecated": True,
>> >   "warning": "Omitting filter-node-name parameter is deprecated, it will
>> > be required in the future"
>> > }
>> > 
>> > There's no "error" key, so this should be recognized as success by
>> > compatible clients, but they'll definitely see the extra information.
>> > 
>> > Part of my motivation is to facilitate a more aggressive deprecation of
>> > legacy features by ensuring that we are able to rigorously notify users
>> > through any means that they need to adjust their scripts.
>> 
>> Who would read this, though? In the best case it ends up deep in a
>> libvirt log that nobody will look at because there was no error. In the
>> more common case, the debug level is configured so that QMP traffic
>> isn't even logged.
>
> The best we could do here is to log a warning. Thankfully we have one
> central function which always checks the returned JSON from qemu so we
> could do that universally.
>
> That would end up in the system log and alternatively also in the VM
> log file. I agree with Kevin that the possibility of it being noticed
> is rather small.
>
> From my experience users report non-fatal messages mostly only if they
> are spamming the system log. One-off instances are very unlikely to be
> noticed.
>
> In my experience it's better to notify us in libvirt of such change and
> we will try our best to fix it.

How to best alert the layers above QEMU was one of the topics of the KVM
Forum 2018 BoF on deprecating stuff.  Minutes:

Message-ID: <87mur0ls8o@dusky.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg05828.html

Relevant part:

* We need to communicate "you're using something that is deprecated".
  How?  Right now, we print a deprecation message.  Okay when humans use
  QEMU directly in a shell.  However, when QEMU sits at the bottom of a
  software stack, the message will likely end up in a log file that is
  effectively write-only.
 
  - The one way to get people to read log files is crashing their
application.  A command line option --future could make QEMU crash
right after printing a deprecation message.  This could help with
finding use of deprecated features in a testing environment.

  - A less destructive way to grab people's attention is to make things
run really, really slow: have QEMU go to sleep for a while after
printing a deprecation message.

  - We can also pass the buck to the next layer up: emit a QMP event.

Sadly, by the time the next layer connects to QMP, plenty of stuff
already happened.  We'd have to buffer deprecation events somehow.

What would libvirt do with such an event?  Log it, taint the domain,
emit a (libvirt) event to pass it on to the next layer up.

  - A completely different idea is to have a configuratin linter.  To
support doing this at the libvirt level, QEMU could expose "is
deprecated" in interface introspection.  Feels feasible for QMP,
where we already have sufficiently expressive introspection.  For
CLI, we'd first have to provide that (but we want that anyway).

  - We might also want to dispay deprecation messages in QEMU's GUI
somehow, or on serial consoles.
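
The "buffer deprecation events somehow" idea from the minutes can be
sketched roughly as below.  This is only an illustration under stated
assumptions: DeprecationLog, deprecation_note(), deprecation_flush() and
print_notice() are hypothetical names, not QEMU or QMP code, and the
fixed-size buffer is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PENDING 16
#define MAX_NAME    64

/* Hypothetical buffer for deprecation notices; not QEMU code. */
typedef struct DeprecationLog {
    char pending[MAX_PENDING][MAX_NAME];
    size_t n_pending;   /* notices recorded before a client connected */
    size_t n_dropped;   /* notices lost because the buffer was full */
    int connected;      /* non-zero once a management client is attached */
} DeprecationLog;

typedef void (*EmitFn)(const char *feature, void *opaque);

/* Example emitter: print the notice and bump an optional counter. */
static void print_notice(const char *feature, void *opaque)
{
    size_t *count = opaque;

    printf("deprecated feature in use: %s\n", feature);
    if (count) {
        (*count)++;
    }
}

/* Record use of a deprecated feature; emit at once if a client listens. */
static void deprecation_note(DeprecationLog *log, const char *feature,
                             EmitFn emit, void *opaque)
{
    assert(log && feature && emit);
    if (log->connected) {
        emit(feature, opaque);
        return;
    }
    if (log->n_pending == MAX_PENDING) {
        log->n_dropped++;       /* buffer full: count the loss */
        return;
    }
    snprintf(log->pending[log->n_pending++], MAX_NAME, "%s", feature);
}

/* Called when the client connects: replay buffered notices in order. */
static size_t deprecation_flush(DeprecationLog *log, EmitFn emit, void *opaque)
{
    size_t i, n;

    assert(log && emit);
    n = log->n_pending;
    for (i = 0; i < n; i++) {
        emit(log->pending[i], opaque);
    }
    log->n_pending = 0;
    log->connected = 1;
    return n;
}
```

Notices raised during startup stay in the buffer until the flush, after
which later ones go straight to the emitter.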

Re: [Qemu-devel] [PATCH v2] usb: reword -usb command-line option and mention xHCI

2019-08-15 Thread Thomas Huth

On 8/15/19 4:14 PM, Stefan Hajnoczi wrote:
> The -usb section of the man page is not very clear on what exactly -usb
> does and fails to mention xHCI as a modern alternative (-device
> nec-usb-xhci).
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
> v2:
>  * Use @option{-device ...} [Thomas]
>  * Suggest qemu-xhci instead of nec-usb-xhci [Thomas and David]
> ---
>  qemu-options.hx | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9621e934c0..1fb362f06f 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1436,12 +1436,15 @@ STEXI
>  ETEXI
>  
>  DEF("usb", 0, QEMU_OPTION_usb,
> -"-usbenable the USB driver (if it is not used by default 
> yet)\n",
> +"-usbenable on-board USB host controller (if not enabled by 
> default)\n",
>  QEMU_ARCH_ALL)
>  STEXI
>  @item -usb
>  @findex -usb
> -Enable the USB driver (if it is not used by default yet).
> +Enable USB emulation on machine types with an on-board USB host controller 
> (if
> +not enabled by default).  Note that on-board USB host controllers may not
> +support USB 3.0.  In this case @option{-device qemu-xhci} can be used instead
> +on machines with PCI.
>  ETEXI

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH] trace: Clarify DTrace/SystemTap help message

2019-08-15 Thread Philippe Mathieu-Daudé

On 8/15/19 4:45 PM, Stefan Hajnoczi wrote:
> On Thu, Aug 15, 2019 at 02:02:47PM +0200, Philippe Mathieu-Daudé wrote:
>> Most tracing backends are implemented within QEMU, except the
>> DTrace/SystemTap backends.
>>
>> One side effect is when running 'qemu -trace help', an incomplete
>> list of trace events is displayed when using the DTrace/SystemTap
>> backends.
>>
>> This is partly due to trace events registered as modules with
>> trace_init(), and since the events are not used within QEMU,
>> the linker optimizes and removes the unused modules (which is
>> OK in this particular case).
>> Currently only the events compiled in trace-root.o and in the
>> last trace.o member of libqemuutil.a are linked, resulting in
>> an incomplete list of events.
>>
>> To avoid confusion, improve the help message, recommending to
>> use the proper systemtap script to display the events list.
>>
>> Before:
>>
>>   $ lm32-softmmu/qemu-system-lm32 -trace help 2>&1 | wc -l
>>   70
>>
>> After:
>>
>>   $ lm32-softmmu/qemu-system-lm32 -trace help
>>   Run 'qemu-trace-stap list qemu-system-lm32' to print a list
>>   of names of trace points with the DTrace/SystemTap backends.
>>
>>   $ qemu-trace-stap list qemu-system-lm32 | wc -l
>>   1136
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  trace/control.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/trace/control.c b/trace/control.c
>> index 43fb7868db..bc2fe0859d 100644
>> --- a/trace/control.c
>> +++ b/trace/control.c
>> @@ -159,12 +159,19 @@ TraceEvent *trace_event_iter_next(TraceEventIter *iter)
>>  
>>  void trace_list_events(void)
>>  {
>> +#ifdef CONFIG_TRACE_DTRACE
>> +fprintf(stderr, "Run 'qemu-trace-stap list %s' to print a list\n"
>> +"of names of trace points with the DTrace/SystemTap"
>> +" backends.\n",
>> +error_get_progname());
>> +#else
>>  TraceEventIter iter;
>>  TraceEvent *ev;
>>  trace_event_iter_init(&iter, NULL);
>>  while ((ev = trace_event_iter_next(&iter)) != NULL) {
>>  fprintf(stderr, "%s\n", trace_event_get_name(ev));
>>  }
>> +#endif
> 
> Multiple trace backends can be built into QEMU.  In that case the list

I did not know that; it explains the final 's' in the
--enable-trace-backends option.

The "== Trace backends ==" of docs/devel/tracing.txt is not clear about
this.

> might be complete and the user may not be using stap at all.  Perhaps
> the message should be turned into a warning instead and the list should
> still be printed:
> 
>   This list of trace events may be incomplete.  Run 'qemu-trace-stap
>   list %s' to print a list of names of trace events with the
>   DTrace/SystemTap backends.

OK, thanks!

Phil.
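
A rough sketch of the suggested behaviour (warn, then still print the
list) could look like the following.  It is a standalone model, not the
patch: demo_events and list_trace_events() are made-up names, and the
dtrace_built_in flag stands in for the CONFIG_TRACE_DTRACE check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in event table; QEMU itself walks events with a TraceEventIter. */
static const char *const demo_events[] = { "event_a", "event_b", "event_c" };

/*
 * Print a warning when the DTrace/SystemTap backend is built in, then
 * list the events known to this binary anyway.  Returns the number of
 * event names printed.
 */
static size_t list_trace_events(FILE *out, const char *progname,
                                bool dtrace_built_in)
{
    size_t i;
    size_t n = sizeof(demo_events) / sizeof(demo_events[0]);

    assert(out && progname);
    if (dtrace_built_in) {
        fprintf(out, "This list of trace events may be incomplete.  "
                     "Run 'qemu-trace-stap list %s' to print a list of "
                     "names of trace events with the DTrace/SystemTap "
                     "backends.\n", progname);
    }
    for (i = 0; i < n; i++) {
        fprintf(out, "%s\n", demo_events[i]);
    }
    return n;
}
```

With several backends built in, the user still gets whatever list the
binary knows about, plus a pointer to qemu-trace-stap.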

Re: [Qemu-devel] [PATCH] usb: reword -usb command-line option and mention xHCI

2019-08-15 Thread Gerd Hoffmann

  Hi,

> > > -Enable the USB driver (if it is not used by default yet).
> > > +Enable USB emulation on machine types with an on-board USB host 
> > > controller (if
> > > +not enabled by default).  Note that on-board USB host controllers may not
> > > +support USB 3.0.  In this case -device nec-usb-xhci can be used instead 
> > > on
> > 
> > Should we maybe rather recommend qemu-xhci instead?
> 
> I think nec-usb-xhci is preferred because there are Windows drivers.
> IIRC qemu-xhci works under Linux but not under Windows (just because the
> PCI Vendor/Device ID aren't covered by any driver).
> 
> Gerd: Can you confirm this?

That applies to windows 7 only, which is EOL next year.

win7 doesn't ship with xhci drivers, but you can download and use
nec/renesas drivers which require nec-usb-xhci.

win8+ ships with generic xhci drivers which work with all xhci
hardware, including qemu-xhci.

So it indeed makes sense to refer to qemu-xhci.

cheers,
  Gerd

Re: [Qemu-devel] [RFC v2] hw/sd/aspeed_sdhci: New device

2019-08-15 Thread Cédric Le Goater

Hello Eddie,

On 14/08/2019 22:27, Eddie James wrote:
> The Aspeed SOCs have two SD/MMC controllers. Add a device that
> encapsulates both of these controllers and models the Aspeed-specific
> registers and behavior.
> 
> Tested by reading from mmcblk0 in Linux:
> qemu-system-arm -machine romulus-bmc -nographic -serial mon:stdio \
>  -drive file=_tmp/flash-romulus,format=raw,if=mtd \
>  -device sd-card,drive=sd0 -drive file=_tmp/kernel,format=raw,if=sd
> 
> Signed-off-by: Eddie James 
> ---
> This patch applies on top of Cedric's set of recent Aspeed changes. Therefore,
> I'm sending as an RFC rather than a patch.

Yes, we can work that out when the patch is reviewed. You can base it on
mainline when ready. My tree serves as an overall test base.

> Changes since v1:
>  - Move slot realize code into the Aspeed SDHCI realize function
>  - Fix interrupt handling by creating input irqs and connecting them to the
>slot irqs.
>  - Removed card device creation code

I think all the code is here but it needs some more reshuffling :) 
 
The raspi machine is a good source for modelling practices.

>  hw/arm/aspeed.c  |   1 -
>  hw/arm/aspeed_soc.c  |  24 ++
>  hw/sd/Makefile.objs  |   1 +
>  hw/sd/aspeed_sdhci.c | 190 +++
>  include/hw/arm/aspeed_soc.h  |   3 +
>  include/hw/sd/aspeed_sdhci.h |  35 
>  6 files changed, 253 insertions(+), 1 deletion(-)
>  create mode 100644 hw/sd/aspeed_sdhci.c
>  create mode 100644 include/hw/sd/aspeed_sdhci.h
> 
> diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> index 2574425..aeed5b6 100644
> --- a/hw/arm/aspeed.c
> +++ b/hw/arm/aspeed.c
> @@ -480,7 +480,6 @@ static void aspeed_machine_class_init(ObjectClass *oc, 
> void *data)
>  mc->desc = board->desc;
>  mc->init = aspeed_machine_init;
>  mc->max_cpus = ASPEED_CPUS_NUM;
> -mc->no_sdcard = 1;
>  mc->no_floppy = 1;
>  mc->no_cdrom = 1;
>  mc->no_parallel = 1;
> diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
> index 8df96f2..a12f14a 100644
> --- a/hw/arm/aspeed_soc.c
> +++ b/hw/arm/aspeed_soc.c
> @@ -22,6 +22,7 @@
>  #include "qemu/error-report.h"
>  #include "hw/i2c/aspeed_i2c.h"
>  #include "net/net.h"
> +#include "sysemu/blockdev.h"

I would expect block devices to be handled at the machine level in 
aspeed.c, like the flash devices are. Something like :

/* Create and plug in the SD cards */
for (i = 0; i < ASPEED_SDHCI_NUM_SLOTS; i++) {
BusState *bus;
DriveInfo *di = drive_get_next(IF_SD);
BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
DeviceState *carddev;

bus = qdev_get_child_bus(DEVICE(>soc), "sd-bus");
if (!bus) {
error_report("No SD bus found for SD card %d", i);
exit(1);
}
carddev = qdev_create(bus, TYPE_SD_CARD);
qdev_prop_set_drive(carddev, "drive", blk, &error_fatal);
object_property_set_bool(OBJECT(carddev), true, "realized",
 &error_fatal);
}

>  
>  #define ASPEED_SOC_IOMEM_SIZE   0x0020
>  
> @@ -62,6 +63,7 @@ static const hwaddr aspeed_soc_ast2500_memmap[] = {
>  [ASPEED_XDMA]   = 0x1E6E7000,
>  [ASPEED_ADC]= 0x1E6E9000,
>  [ASPEED_SRAM]   = 0x1E72,
> +[ASPEED_SDHCI]  = 0x1E74,
>  [ASPEED_GPIO]   = 0x1E78,
>  [ASPEED_RTC]= 0x1E781000,
>  [ASPEED_TIMER1] = 0x1E782000,
> @@ -100,6 +102,7 @@ static const hwaddr aspeed_soc_ast2600_memmap[] = {
>  [ASPEED_XDMA]   = 0x1E6E7000,
>  [ASPEED_ADC]= 0x1E6E9000,
>  [ASPEED_VIDEO]  = 0x1E70,
> +[ASPEED_SDHCI]  = 0x1E74,
>  [ASPEED_GPIO]   = 0x1E78,
>  [ASPEED_RTC]= 0x1E781000,
>  [ASPEED_TIMER1] = 0x1E782000,
> @@ -146,6 +149,7 @@ static const int aspeed_soc_ast2400_irqmap[] = {
>  [ASPEED_ETH1]   = 2,
>  [ASPEED_ETH2]   = 3,
>  [ASPEED_XDMA]   = 6,
> +[ASPEED_SDHCI]  = 26,
>  };
>  
>  #define aspeed_soc_ast2500_irqmap aspeed_soc_ast2400_irqmap
> @@ -163,6 +167,7 @@ static const int aspeed_soc_ast2600_irqmap[] = {
>  [ASPEED_SDMC]   = 0,
>  [ASPEED_SCU]= 12,
>  [ASPEED_XDMA]   = 6,
> +[ASPEED_SDHCI]  = 43,
>  [ASPEED_ADC]= 46,
>  [ASPEED_GPIO]   = 40,
>  [ASPEED_RTC]= 13,
> @@ -350,6 +355,15 @@ static void aspeed_soc_init(Object *obj)
>  sysbus_init_child_obj(obj, "fsi[*]", OBJECT(&s->fsi[0]),
>sizeof(s->fsi[0]), TYPE_ASPEED_FSI);
>  }
> +
> +sysbus_init_child_obj(obj, "sdhci", OBJECT(&s->sdhci), sizeof(s->sdhci),
> +  TYPE_ASPEED_SDHCI);

This is the Aspeed SD host interface. Maybe we should call it sdhost?

I suppose this is our "sd-bus" device?

> +/* Init sd card slot class here so that they're under the correct parent 
> */
> +for (i = 0; i < ASPEED_SDHCI_NUM_SLOTS; ++i) {

and these are the slots, I would put them at the SoC level.

> +

Re: [Qemu-devel] [PULL 5/7] file-posix: Support BDRV_REQ_NO_FALLBACK for zero writes

2019-08-15 Thread Kevin Wolf

On 15.08.2019 at 04:44, Eric Blake wrote:
> On 3/26/19 10:51 AM, Kevin Wolf wrote:
> > We know that the kernel implements a slow fallback code path for
> > BLKZEROOUT, so if BDRV_REQ_NO_FALLBACK is given, we shouldn't call it.
> > The other operations we call in the context of .bdrv_co_pwrite_zeroes
> > should usually be quick, so no modification should be needed for them.
> > If we ever notice that there are additional problematic cases, we can
> > still make these conditional as well.
> 
> Are there cases where fallocate(FALLOC_FL_ZERO_RANGE) falls back to slow
> writes?  It may be fast on some file systems, but when used on a block
> device, that may equally trigger slow fallbacks.  The man page is not
> clear on that fact; I suspect that there may be cases in there that need
> to be made conditional (it would be awesome if the kernel folks would
> give us another FALLOC_ flag when we want to guarantee no fallback).

The NO_FALLBACK changes were based on the Linux code rather than
documentation because no interface is explicitly documented to forbid
fallbacks.

I think for file systems, we can generally assume that we don't get
fallbacks because for file systems, just deallocating blocks is the
easiest way to implement the function anyway. (Hm, or is it when we
don't punch holes...?)

And for block devices, we don't try FALLOC_FL_ZERO_RANGE because it also
involves the same slow fallback as BLKZEROOUT. In other words,
bdrv_co_pwrite_zeroes() with NO_FALLBACK, but without MAY_UNMAP, always
fails on Linux block devices, and we fall back to emulation in user
space.

We would need a kernel interface that calls blkdev_issue_zeroout() with
BLKDEV_ZERO_NOUNMAP | BLKDEV_ZERO_NOFALLBACK, but no such interface
exists.

When I talked to some file system people, they insisted that "efficient"
or "fast" wasn't well-defined enough for them or something, so if we
want to get a kernel change, maybe a new block device ioctl would be the
most realistic thing.

We do use FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE for MAY_UNMAP,
which works for both file systems (I assume - each file system has a
separate implementation) and block devices without slow fallbacks.

qemu-img create sets MAY_UNMAP, so the case we are most interested in is
covered with a fast implementation.
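
As a rough summary of the cases described above, the choice of
operation can be modelled as below.  This is a simplified sketch of the
description in this message, not the logic of block/file-posix.c; the
REQ_* constants, the ZeroOp enum and pick_zero_op() are hypothetical
names, and the file system branch assumes that FALLOC_FL_ZERO_RANGE does
not fall back to slow writes there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BDRV_REQ_MAY_UNMAP / BDRV_REQ_NO_FALLBACK. */
#define REQ_MAY_UNMAP   (1u << 0)
#define REQ_NO_FALLBACK (1u << 1)

typedef enum ZeroOp {
    ZERO_OP_PUNCH_HOLE,     /* FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE */
    ZERO_OP_ZERO_RANGE,     /* FALLOC_FL_ZERO_RANGE */
    ZERO_OP_BLKZEROOUT,     /* BLKZEROOUT ioctl, may be slow in the kernel */
    ZERO_OP_NOT_SUPPORTED   /* fail; the caller emulates in user space */
} ZeroOp;

/* Pick the operation for a zero write, following the cases in the text. */
static ZeroOp pick_zero_op(bool is_block_device, unsigned flags)
{
    assert((flags & ~(REQ_MAY_UNMAP | REQ_NO_FALLBACK)) == 0);

    if (flags & REQ_MAY_UNMAP) {
        /* Works for files and block devices without slow fallbacks. */
        return ZERO_OP_PUNCH_HOLE;
    }
    if (!is_block_device) {
        /* Assumed fast on file systems. */
        return ZERO_OP_ZERO_RANGE;
    }
    if (!(flags & REQ_NO_FALLBACK)) {
        /* Slow fallback is acceptable, so BLKZEROOUT may be used. */
        return ZERO_OP_BLKZEROOUT;
    }
    /* Block device, NO_FALLBACK, no MAY_UNMAP: nothing suitable exists. */
    return ZERO_OP_NOT_SUPPORTED;
}
```

The last branch is the gap a new kernel interface would close.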

> By the way, is there an easy setup to prove (maybe some qemu-img convert
> command on a specially-prepared source image) whether the no fallback
> flag makes a difference?  I'm about to cross-post a series of patches to
> nbd/qemu/nbdkit/libnbd that adds a new NBD_CMD_FLAG_FAST_ZERO which fits
> the bill of BDRV_REQ_NO_FALLBACK, but would like to include some
> benchmark numbers in my cover letter if I can reproduce a setup where it
> matters.

Hm, the original case came from Nir, maybe he can suggest something.

You'll definitely need a block device that doesn't support
FALLOC_FL_PUNCH_HOLE, otherwise you can't trigger the fallback. My
first thought was a loop device, but this actually does support the
operation and passes it through to the underlying file system. So maybe
if you know a file system that doesn't support it. Or if you have an old
hard disk handy.

Or... Actually there is an easily available block device that doesn't
support zero writes: The in-kernel NBD client! :-)

And I think I remember now how I tested this back then:

1. qemu-nbd exports an image with very slow throttling enabled
   (throttling affects writes, but not write_zeroes)

2. Attach the NBD device to /dev/nbd0

3. Convert to there (use a second NBD connection to test the fix)

> And this patch has a bug:
> 
> > +++ b/block/file-posix.c
> > @@ -652,7 +652,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> > *options,
> >  }
> >  #endif
> >  
> > -bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
> > +bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK;
> >  ret = 0;
> >  fail:
> >  if (filename && (bdrv_flags & BDRV_O_TEMPORARY)) {
> > @@ -1500,14 +1500,19 @@ static ssize_t 
> > handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
> {
> int ret = -ENOTSUP;
> BDRVRawState *s = aiocb->bs->opaque;
> 
> if (!s->has_write_zeroes) {
> return -ENOTSUP;
> >  }
> 
> At this point, ret is -ENOTSUP.
> 
> >  
> >  #ifdef BLKZEROOUT
> > -do {
> > -uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
> > -if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
> > -return 0;
> > -}
> > -} while (errno == EINTR);
> > +/* The BLKZEROOUT implementation in the kernel doesn't set
> > + * BLKDEV_ZERO_NOFALLBACK, so we can't call this if we have to avoid 
> > slow
> > + * fallbacks. */
> > +if (!(aiocb->aio_type & QEMU_AIO_NO_FALLBACK)) {
> > +do {
> > +uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
> > +if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
> > +return 0;
> > +

Re: [Qemu-devel] [PATCH v2 00/12] block: qiov_offset parameter for io

2019-08-15 Thread Vladimir Sementsov-Ogievskiy

29.07.2019 18:24, Stefan Hajnoczi wrote:
> On Tue, Jun 04, 2019 at 07:15:02PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>>
>> Here is a new parameter, qiov_offset, for the io path, to avoid
>> a lot of places with same pattern of creating local_qiov or hd_qiov
>> variables.
>>
>> This series also includes my
>> "[Qemu-devel] [PATCH 0/2] block/io: refactor padding"
>> with some changes [described in 01 and 03 emails]
>>
>> Vladimir Sementsov-Ogievskiy (12):
>>util/iov: introduce qemu_iovec_init_extended
>>util/iov: improve qemu_iovec_is_zero
>>block/io: refactor padding
>>block: define .*_part io handlers in BlockDriver
>>block/io: bdrv_co_do_copy_on_readv: use and support qiov_offset
>>block/io: bdrv_co_do_copy_on_readv: lazy allocation
>>block/io: bdrv_aligned_preadv: use and support qiov_offset
>>block/io: bdrv_aligned_pwritev: use and support qiov_offset
>>block/io: introduce bdrv_co_p{read,write}v_part
>>block/qcow2: refactor qcow2_co_preadv to use buffer-based io
>>block/qcow2: implement .bdrv_co_preadv_part
>>block/qcow2: implement .bdrv_co_pwritev(_compressed)_part
>>
>>   block/qcow2.h |   1 +
>>   include/block/block_int.h |  21 ++
>>   include/qemu/iov.h|  10 +-
>>   block/backup.c|   2 +-
>>   block/io.c| 532 ++
>>   block/qcow2-cluster.c |  14 +-
>>   block/qcow2.c | 131 +-
>>   qemu-img.c|   4 +-
>>   util/iov.c| 153 +--
>>   9 files changed, 559 insertions(+), 309 deletions(-)
>>
>> -- 
>> 2.18.0
> 
> Thanks, applied to my block tree:
> https://github.com/stefanha/qemu/commits/block
> 
> Stefan
> 

Could you please squash this into 01:

diff --git a/util/iov.c b/util/iov.c
index 0ed75e764c..5059e10431 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -422,7 +422,7 @@ void qemu_iovec_init_extended(
  void *tail_buf, size_t tail_len)
  {
  size_t mid_head, mid_tail;
-int total_niov, mid_niov;
+int total_niov, mid_niov = 0;
  struct iovec *p, *mid_iov;

  if (mid_len) {



-- 
Best regards,
Vladimir
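
The pattern behind this one-line fix is presumably a counter that is
only assigned inside a conditional branch but read afterwards.  The
sketch below shows that pattern in isolation; count_entries() and its
parameters are hypothetical and are not taken from util/iov.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of iovec entries needed for an optional
 * head buffer, an optional middle part and an optional tail buffer.
 */
static int count_entries(size_t head_len, size_t mid_len, int mid_entries,
                         size_t tail_len)
{
    int mid_niov = 0;   /* must start at 0: the branch below may be skipped */

    assert(mid_entries >= 0);
    if (mid_len) {
        mid_niov = mid_entries;
    }
    /* Without the initializer, mid_len == 0 would read an indeterminate value. */
    return (head_len ? 1 : 0) + mid_niov + (tail_len ? 1 : 0);
}
```

For example, count_entries(512, 0, 7, 512) yields 2 because the middle
part is empty and its entry count is ignored.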

Re: [Qemu-devel] [PATCH v5 07/10] monitor: adding tb_stats hmp command

2019-08-15 Thread Dr. David Alan Gilbert

* vandersonmr (vanderson...@gmail.com) wrote:
> Adding tb_stats [start|pause|stop|filter] command to hmp.
> This allows controlling the collection of statistics.
> It is also possible to set the level of collection:
> all, jit, or exec.
> 
> tb_stats filter allows collecting statistics only for the TBs
> in the last_search list.
> 
> The goal of this command is to allow the dynamic exploration
> of the TCG behavior and quality. Therefore, for now, a
> corresponding QMP command is not worthwhile.
> 
> Signed-off-by: Vanderson M. do Rosario 

For HMP:

Acked-by: Dr. David Alan Gilbert 

> ---
>  accel/tcg/tb-stats.c| 111 
>  hmp-commands.hx |  17 ++
>  include/exec/tb-stats.h |  12 +
>  include/qemu-common.h   |   1 +
>  monitor/misc.c  |  49 ++
>  vl.c|   6 +++
>  6 files changed, 196 insertions(+)
> 
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index dddb9d4537..f28fd7b434 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -9,6 +9,9 @@
>  
>  #include "exec/tb-stats.h"
>  
> +/* only accessed in safe work */
> +static GList *last_search;
> +
>  uint64_t dev_time;
>  
>  struct jit_profile_info {
> @@ -140,6 +143,96 @@ void dump_jit_profile_info(TCGProfile *s)
>  g_free(jpi);
>  }
>  
> +static void free_tbstats(void *p, uint32_t hash, void *userp)
> +{
> +g_free(p);
> +}
> +
> +static void clean_tbstats(void)
> +{
> +/* remove all tb_stats */
> +qht_iter(_ctx.tb_stats, free_tbstats, NULL);
> +qht_destroy(_ctx.tb_stats);
> +}
> +
> +void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data icmd)
> +{
> +struct TbstatsCommand *cmdinfo = icmd.host_ptr;
> +int cmd = cmdinfo->cmd;
> +uint32_t level = cmdinfo->level;
> +
> +switch (cmd) {
> +case START:
> +if (tb_stats_collection_paused()) {
> +set_tbstats_flags(level);
> +} else {
> +if (tb_stats_collection_enabled()) {
> +qemu_printf("TB information already being recorded");
> +return;
> +}
> +qht_init(_ctx.tb_stats, tb_stats_cmp, CODE_GEN_HTABLE_SIZE,
> +QHT_MODE_AUTO_RESIZE);
> +}
> +
> +set_default_tbstats_flag(level);
> +enable_collect_tb_stats();
> +tb_flush(cpu);
> +break;
> +case PAUSE:
> +if (!tb_stats_collection_enabled()) {
> +qemu_printf("TB information not being recorded");
> +return;
> +}
> +
> +/* Continue to create TBStatistic structures but stop collecting 
> statistics */
> +pause_collect_tb_stats();
> +set_default_tbstats_flag(TB_NOTHING);
> +set_tbstats_flags(TB_PAUSED);
> +tb_flush(cpu);
> +break;
> +case STOP:
> +if (!tb_stats_collection_enabled()) {
> +qemu_printf("TB information not being recorded");
> +return;
> +}
> +
> +/* Dissalloc all TBStatistics structures and stop creating new ones 
> */
> +disable_collect_tb_stats();
> +clean_tbstats();
> +tb_flush(cpu);
> +break;
> +case FILTER:
> +if (!tb_stats_collection_enabled()) {
> +qemu_printf("TB information not being recorded");
> +return;
> +}
> +if (!last_search) {
> +qemu_printf("no search on record! execute info tbs before 
> filtering!");
> +return;
> +}
> +
> +set_default_tbstats_flag(TB_NOTHING);
> +
> +/* Set all tbstats as paused, then return only the ones from 
> last_search */
> +pause_collect_tb_stats();
> +set_tbstats_flags(TB_PAUSED);
> +
> +for (GList *iter = last_search; iter; iter = g_list_next(iter)) {
> +TBStatistics *tbs = iter->data;
> +tbs->stats_enabled = level;
> +}
> +
> +tb_flush(cpu);
> +
> +break;
> +default: /* INVALID */
> +g_assert_not_reached();
> +break;
> +}
> +
> +g_free(cmdinfo);
> +}
> +
>  
>  void init_tb_stats_htable_if_not(void)
>  {
> @@ -175,6 +268,24 @@ bool tb_stats_collection_paused(void)
>  return tcg_collect_tb_stats == TB_STATS_PAUSED;
>  }
>  
> +static void reset_tbstats_flag(void *p, uint32_t hash, void *userp)
> +{
> +uint32_t flag = *((int *)userp);
> +TBStatistics *tbs = p;
> +tbs->stats_enabled = flag;
> +}
> +
> +void set_default_tbstats_flag(uint32_t flag)
> +{
> +default_tbstats_flag = flag;
> +}
> +
> +void set_tbstats_flags(uint32_t flag)
> +{
> +/* iterate over tbstats setting their flag as TB_NOTHING */
> +qht_iter(_ctx.tb_stats, reset_tbstats_flag, );
> +}
> +
>  uint32_t get_default_tbstats_flag(void)
>  {
>  return default_tbstats_flag;
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index bfa5681dd2..419898751e 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1885,6 +1885,23

Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API

2019-08-15 Thread Andrea Bolognani

On Wed, 2019-08-14 at 17:14 +0200, Philippe Mathieu-Daudé wrote:
> On 8/14/19 4:51 PM, Andrea Bolognani wrote:
> > On Wed, 2019-08-14 at 16:15 +0200, Philippe Mathieu-Daudé wrote:
> > > On 8/14/19 3:27 PM, Andrea Bolognani wrote:
> > > > On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote:
> > > > > Suggested-by: Andrea Bolognani 
> > > > > Signed-off-by: Philippe Mathieu-Daudé 
> > > > > ---
> > > > >  block/ssh.c | 2 +-
> > > > >  configure   | 7 +++
> > > > >  2 files changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > Did I really suggest this? I have no recollection of doing so, or
> > > > even getting involved with libssh support in QEMU at all for that
> > > > matter.
> > > 
> > > I took this suggestion from
> > > https://www.redhat.com/archives/libvir-list/2018-May/msg00597.html
> > 
> > I see :)
> > 
> > I feel like adding a Suggested-by because of something that was
> > posted on an unrelated project's mailing list is stretching the
> > definition of the tag a bit, so if you end up having to respin I
> > think it would be reasonable to drop it, but honestly it's not a
> > big deal either way: I was just curious.
> 
> Understood, sorry.

Nothing to apologize for! :)

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [Qemu-devel] RISC-V: Vector && DSP Extension

2019-08-15 Thread Aleksandar Markovic

> We can accept draft
> extensions in QEMU as long as they are disabled by default.
>
> Alistair
>

Hi, Alistair, Palmer,

Is this an official stance of the QEMU community, or perhaps Alistair's
personal judgement, or maybe a rule within the RISC-V subcommunity?

Yours,
Aleksandar

Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-15 Thread Kevin Wolf

On 14.08.2019 at 23:08, Eric Blake wrote:
> On 8/14/19 3:22 PM, Maxim Levitsky wrote:
> 
> > This is an issue that was raised today on IRC with Kevin Wolf. Really thanks
> > for the idea!
> > 
> > We agreed that this new qmp interface should take the same options as
> > blockdev-create does, however since we want to be able to edit the 
> > encryption
> > slots separately, this implies that we sort of need to allow this on 
> > creation
> > time as well.
> > 
> > Also the BlockdevCreateOptions is a union, which is specialized by the 
> > driver name
> > which is great for creation, but for update, the driver name is already 
> > known,
> > and thus the user should not be forced to pass it again.
> > However qmp doesn't seem to support union type guessing based on actual 
> > fields
> > given (this might not be desired either), which complicates this somewhat.
> 
> Does the idea of a union type with a default value for the discriminator
> help?  Maybe we have a discriminator which defaults to 'auto', and add a
> union branch 'auto':'any'.  During creation, if the "driver":"auto"
> branch is selected (usually implicitly by omitting "driver", but also
> possible explicitly), the creation attempt is rejected as invalid
> regardless of the contents of the remaining 'any'.  But during amend
> usage, if the 'auto' branch is selected, we then add in the proper
> "driver":"xyz" and reparse the QAPI object to determine if the remaining
> fields in 'any' still meet the specification for the required driver branch.
> 
> This idea may still require some tweaks to the QAPI generator, but it's
> the best I can come up with for a way to parse an arbitrary JSON object
> with unknown validation, then reparse it again after adding more
> information that would constrain the parse differently.

Feels like this would be a lot of code just to allow the client to omit
passing a value that it knows anyway. If this were a human interface, I
could understand the desire to make commands less verbose, but for QMP I
honestly don't see the point when it's not trivial.

Kevin



Re: [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 11:55, "Richard Henderson" wrote:
>
> On 8/15/19 8:02 AM, Aleksandar Markovic wrote:
> > A question for you: What about FISTTP, MONITOR, MWAIT, that belong to SSE3, but
> > are not mentioned in this patch?
> >
>
> They are also not vector instructions, which is the subject of this patch set.
>

The subject of the patch and the patch set says "SSE3", not "vector", read
it again.

>
> r~

Re: [Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-15 Thread Gerd Hoffmann

On Thu, Aug 15, 2019 at 02:25:07AM +0200, BALATON Zoltan wrote:
> The MacOS driver exits if the card does not have an interrupt. If we
> set PCI_INTERRUPT_PIN to 1 then it enables VBlank interrupts and it
> boots but the mouse pointer can not be moved. This patch implements a
> dummy VBlank interrupt by a timer triggered at 60 Hz to test if it
> helps. Unfortunately it doesn't: MacOS with this patch hangs during
> boot just polling interrupts and acknowledging them so maybe it needs
> something else or there may be some other problem with this
> implementation.
> 
> This is posted for comments and to let others experiment with it but
> probably should not be committed upstream yet.
> 
> Signed-off-by: BALATON Zoltan 
> ---
>  hw/display/ati.c  | 41 +
>  hw/display/ati_dbg.c  |  1 +
>  hw/display/ati_int.h  |  4 
>  hw/display/ati_regs.h |  1 +
>  4 files changed, 47 insertions(+)
> 
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index a365e2455d..e06cbf3e91 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -243,6 +243,21 @@ static uint64_t ati_i2c(bitbang_i2c_interface *i2c, 
> uint64_t data, int base)
>  return data;
>  }
>  
> +static void ati_vga_update_irq(ATIVGAState *s)
> +{
> +pci_set_irq(&s->dev, s->regs.gen_int_status & 1);

This should be "s->regs.gen_int_status & s->regs.gen_int_cntl" I guess?

> +static void ati_vga_vblank_irq(void *opaque)
> +{
> +ATIVGAState *s = opaque;
> +
> +timer_mod(&s->vblank_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> +  NANOSECONDS_PER_SECOND / 60);
> +s->regs.gen_int_status |= 1;

#defines for the irq status bits would be nice.

> +case GEN_INT_CNTL:
> +s->regs.gen_int_cntl = data;
> +if (data & 1) {
> +ati_vga_vblank_irq(s);
> +} else {
> +timer_del(&s->vblank_timer);
> +}

ati_vga_update_irq() needed here.

> +break;
> +case GEN_INT_STATUS:
> +data &= (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF ?
> + 0x000f040fUL : 0xfc080effUL);

Add IRQ_MASK #define ?

> +s->regs.gen_int_status &= ~data;

ati_vga_update_irq() needed here too.

cheers,
  Gerd
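
Taken together, the review comments could be modelled as in the sketch
below.  It is a standalone illustration, not the ati-vga device:
DemoVGA, DEMO_INT_VBLANK and the demo_* functions are invented names, a
plain bool replaces pci_set_irq(), and the timer handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_VBLANK (1u << 0)   /* hypothetical name for status bit 0 */

typedef struct DemoVGA {
    uint32_t gen_int_cntl;      /* interrupt enable bits */
    uint32_t gen_int_status;    /* pending interrupt bits */
    bool irq_level;             /* stands in for the PCI interrupt line */
} DemoVGA;

/* The line is raised only for interrupts that are pending AND enabled. */
static void demo_update_irq(DemoVGA *s)
{
    assert(s);
    s->irq_level = (s->gen_int_status & s->gen_int_cntl) != 0;
}

/* Timer callback model: latch the VBlank status bit. */
static void demo_vblank_tick(DemoVGA *s)
{
    s->gen_int_status |= DEMO_INT_VBLANK;
    demo_update_irq(s);
}

/* Guest write to the control register: re-evaluate the line. */
static void demo_write_cntl(DemoVGA *s, uint32_t data)
{
    s->gen_int_cntl = data;
    demo_update_irq(s);
}

/* Guest write to the status register: written bits are cleared. */
static void demo_write_status(DemoVGA *s, uint32_t data)
{
    s->gen_int_status &= ~data;
    demo_update_irq(s);
}
```

Calling the update helper after every register change keeps the line
low while the interrupt is masked and drops it once the guest
acknowledges the status bit.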

Re: [Qemu-devel] [libvirt] [PATCH 1/2] qapi: deprecate drive-mirror and drive-backup

2019-08-15 Thread Peter Krempa

On Wed, Aug 14, 2019 at 15:22:15 -0400, John Snow wrote:
> 
> 
> On 8/14/19 6:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> > It's hard and not necessary to maintain outdated versions of these
> > commands.
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > ---
> >  qemu-deprecated.texi  |  4 
> >  qapi/block-core.json  |  4 
> >  qapi/transaction.json |  2 +-
> >  blockdev.c| 10 ++
> >  4 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > index fff07bb2a3..2753fafd0b 100644
> > --- a/qemu-deprecated.texi
> > +++ b/qemu-deprecated.texi
> > @@ -179,6 +179,10 @@ and accurate ``query-qmp-schema'' command.
> >  Character devices creating sockets in client mode should not specify
> >  the 'wait' field, which is only applicable to sockets in server mode
> >  
> > +@subsection drive-mirror, drive-backup and drive-backup transaction action 
> > (since 4.2)
> > +
> > +Use blockdev-mirror and blockdev-backup instead.
> > +
> >  @section Human Monitor Protocol (HMP) commands
> >  
> >  @subsection The hub_id parameter of 'hostfwd_add' / 'hostfwd_remove' 
> > (since 3.1)

[...]

> > @@ -3831,6 +3838,9 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
> >  const char *format = arg->format;
> >  int ret;
> >  
> > +warn_report("drive-mirror command is deprecated and will disappear in "
> > +"future. Use blockdev-mirror instead");
> > +
> >  bs = qmp_get_root_bs(arg->device, errp);
> >  if (!bs) {
> >  return;
> > 
> 
> Hm!
> 
> I wonder if this is ever-so-slightly too soon for our friends over at
> the libvirt project.
> 
> I don't think they have fully moved away from the non-blockdev
> interfaces *just yet*, and I might encourage seeing the first full
> libvirt release that does support and use it before we start the
> deprecation clock.
> 
> (Just in case.)
> 
> That's just me being very, very cautious though.
> 
> Peter Krempa, how do you feel about this?

Thanks for the heads up!

Currently libvirt does not use 'drive-backup' at all so that one can be
deprecated immediately.

In case of 'drive-mirror' the situation is a bit more complex:

Libvirt currently uses 'drive-mirror' in the following places:

1) virDomainBlockCopy API
With blockdev integration enabled this will go away. Patches are being
reviewed:

https://www.redhat.com/archives/libvir-list/2019-August/msg00295.html

2) VM migration with non-shared storage
Currently uses 'drive-mirror' in most cases but there is pre-existing
integration for blockdev-mirror for nbd+tls. I need to make sure that
the blockdev version will be used unconditionally once the integration
is enabled. This is a TODO.

There is also one gotcha. When an 'sd' card device is used for
the VM, libvirt disables all of blockdev, because SD cards can't be
expressed with blockdev. There are too many code paths which would need
checking to be worth it. To be fair, I'm not even sure when an SD card
can be emulated by qemu, as all of my basic tests failed and I did not
look into it further.

For libvirt to enable blockdev there's one more part missing and that's
snapshot integration. I'm currently testing patches to integrate it with
external snapshots, which should be posted soon.

I also found a bug in qemu, which prevents creation of internal
snapshots when -blockdev is used:

When savevm HMP command is used (via QMP->HMP bridge) qemu invokes
save_snapshot(), which calls bdrv_all_can_snapshot(). That function uses
bdrv_next() to iterate all nodes which correspond to a block backend
first, but then also iterates any other node which is monitor-owned.

Since with blockdev all nodes, including the ones for the 'file' protocol,
are monitor-owned, and 'file' does not support snapshots, that check
fails. A simple hack of skipping the second part in bdrv_next() actually
allows a snapshot to be taken. Kevin told me that the idea is that
non-attached nodes should also be considered for internal snapshots, which is
okay in my opinion, but given how snapshots work for the files
attached to backends (and also in pre-blockdev use), only the top level of
a chain should ever be considered for a snapshot.
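
To make the failure mode above concrete, here is a minimal sketch in C. It is
not QEMU code: the struct and both functions are hypothetical stand-ins, and
the only assumption taken from the description is that the check visits nodes
attached to a backend and also any monitor-owned node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified block node; not QEMU's BlockDriverState. */
struct demo_node {
    const char *driver;        /* e.g. "qcow2" or "file" */
    bool supports_snapshot;    /* can this driver hold internal snapshots? */
    bool attached_to_backend;  /* top of a chain used by a guest device */
    bool monitor_owned;        /* created explicitly, e.g. with -blockdev */
};

/* Behaviour as described: every attached or monitor-owned node must pass. */
bool demo_all_can_snapshot(const struct demo_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((nodes[i].attached_to_backend || nodes[i].monitor_owned) &&
            !nodes[i].supports_snapshot) {
            return false;
        }
    }
    return true;
}

/* The "hack" variant: only nodes attached to a backend are checked. */
bool demo_top_level_can_snapshot(const struct demo_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].attached_to_backend && !nodes[i].supports_snapshot) {
            return false;
        }
    }
    return true;
}
```

With a qcow2 node on top of a monitor-owned 'file' node, the first function
rejects the chain while the second accepts it, which matches the observation
above.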

So the summary is that I'm pretty hopeful that we should be able to get
rid of all reasonable uses of drive-mirror very soon after I finish
snapshot integration. The only question is how much we care about SD
card users being able to do a drive-mirror in the future.


Re: [Qemu-devel] [PATCH v3 00/15] target/arm/kvm: enable SVE in guests

2019-08-15 Thread Peter Maydell

On Fri, 2 Aug 2019 at 13:25, Andrew Jones  wrote:
>
> Since Linux kernel v5.2-rc1 KVM has support for enabling SVE in guests.
> This series provides the QEMU bits for that enablement. First, we
> select existing CPU properties representing features we want to
> advertise in addition to the SVE vector lengths and prepare
> them for a qmp query. Then we introduce the qmp query, applying
> it immediately to those selected features. We also document ARM CPU
> features at this time. We next add a qtest for the selected CPU
> features that uses the qmp query for its tests - and we continue to
> add tests as we add CPU features with the following patches. So then,
> once we have the support we need for CPU feature querying and testing,
> we add our first SVE CPU feature property, 'sve', which just allows
> SVE to be completely enabled/disabled. Following that feature property,
> we add all 16 vector length properties along with the input validation
> they need and tests to prove the validation works. At this point the
> SVE features are still only for TCG, so we provide some patches to
> prepare for KVM and then a patch that allows the 'max' CPU type to
> enable SVE with KVM, but at first without vector length properties.
> After a bit more preparation we add the SVE vector length properties
> to the KVM-enabled 'max' CPU type along with the additional input
> validation and tests that that needs.  Finally we allow the 'host'
> CPU type to also enjoy these properties by simply sharing them with it.

Hi -- I see there have been some review comments on these patches
that mean there'll be a v4. In the meantime, patches 1, 2, 5, 6, 9, 10
seem to me to be independent bugfixes/cleanups that have been reviewed.
Would you like me to take those into target-arm.next to reduce the
size of the patchset for v4, or is that going to make rebasing
painful on your end?

thanks
-- PMM

Re: [Qemu-devel] [PATCH-for-4.2 v9 01/12] hw/acpi: Make ACPI IO address space configurable

2019-08-15 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Linuxarm [mailto:linuxarm-boun...@huawei.com] On Behalf Of Shameer
> Kolothum
> Sent: 13 August 2019 22:05
> To: qemu-devel@nongnu.org; qemu-...@nongnu.org;
> eric.au...@redhat.com; imamm...@redhat.com
> Cc: peter.mayd...@linaro.org; sa...@linux.intel.com;
> ard.biesheu...@linaro.org; Linuxarm ;
> shannon.zha...@gmail.com; sebastien.bo...@intel.com; ler...@redhat.com
> Subject: [PATCH-for-4.2 v9 01/12] hw/acpi: Make ACPI IO address space
> configurable
> 
> This is in preparation for adding support for ARM64 platforms
> where it doesn't use port mapped IO for ACPI IO space. We are
> making changes so that MMIO region can be accommodated
> and board can pass the base address into the aml build function.

It looks like this now breaks "make check" on x86_64 and needs
bios-tables-test-allowed-diff.h to be updated with DSDT entries. But I am
not sure what changed compared to v8 (and older ones) that makes
it complain now!

Patchew URL: 
https://patchew.org/QEMU/20190813210539.31164-1-shameerali.kolothum.th...@huawei.com/

ERROR:/tmp/qemu-test/src/tests/bios-tables-test.c:447:test_acpi_asl: assertion 
failed: (all_tables_match)

Thanks,
Shameer

> Also move few MEMORY_* definitions to header so that other memory
> hotplug event signalling mechanisms (eg. Generic Event Device on
> HW-reduced acpi platforms) can use the same from their respective
> event handler code.
> 
> Signed-off-by: Shameer Kolothum 
> ---
> v8 --> v9
>   -base address is an input into build_memory_hotplug_aml()
>   -Removed R-by tags from Igor and Eric for now.
> ---
>  hw/acpi/memory_hotplug.c | 29 ++---
>  hw/i386/acpi-build.c |  4 +++-
>  hw/i386/pc.c |  3 +++
>  include/hw/acpi/memory_hotplug.h |  9 +++--
>  include/hw/i386/pc.h |  3 +++
>  5 files changed, 30 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
> index 297812d5f7..1734d4b44f 100644
> --- a/hw/acpi/memory_hotplug.c
> +++ b/hw/acpi/memory_hotplug.c
> @@ -29,12 +29,7 @@
>  #define MEMORY_SLOT_PROXIMITY_METHOD "MPXM"
>  #define MEMORY_SLOT_EJECT_METHOD "MEJ0"
>  #define MEMORY_SLOT_NOTIFY_METHOD"MTFY"
> -#define MEMORY_SLOT_SCAN_METHOD  "MSCN"
>  #define MEMORY_HOTPLUG_DEVICE"MHPD"
> -#define MEMORY_HOTPLUG_IO_LEN 24
> -#define MEMORY_DEVICES_CONTAINER "\\_SB.MHPC"
> -
> -static uint16_t memhp_io_base;
> 
>  static ACPIOSTInfo *acpi_memory_device_status(int slot, MemStatus *mdev)
>  {
> @@ -209,7 +204,7 @@ static const MemoryRegionOps
> acpi_memory_hotplug_ops = {
>  };
> 
>  void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
> -  MemHotplugState *state, uint16_t
> io_base)
> +  MemHotplugState *state, hwaddr
> io_base)
>  {
>  MachineState *machine = MACHINE(qdev_get_machine());
> 
> @@ -218,12 +213,10 @@ void acpi_memory_hotplug_init(MemoryRegion *as,
> Object *owner,
>  return;
>  }
> 
> -assert(!memhp_io_base);
> -memhp_io_base = io_base;
>  state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count);
>  memory_region_init_io(&state->io, owner, &acpi_memory_hotplug_ops,
> state,
>"acpi-mem-hotplug",
> MEMORY_HOTPLUG_IO_LEN);
> -memory_region_add_subregion(as, memhp_io_base, &state->io);
> +memory_region_add_subregion(as, io_base, &state->io);
>  }
> 
>  /**
> @@ -342,7 +335,8 @@ const VMStateDescription vmstate_memory_hotplug
> = {
> 
>  void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
>const char *res_root,
> -  const char *event_handler_method)
> +  const char *event_handler_method,
> +  AmlRegionSpace rs, hwaddr
> memhp_io_base)
>  {
>  int i;
>  Aml *ifctx;
> @@ -365,14 +359,19 @@ void build_memory_hotplug_aml(Aml *table,
> uint32_t nr_mem,
>  aml_name_decl("_UID", aml_string("Memory hotplug
> resources")));
> 
>  crs = aml_resource_template();
> -aml_append(crs,
> -aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
> -   MEMORY_HOTPLUG_IO_LEN)
> -);
> +if (rs == AML_SYSTEM_IO) {
> +aml_append(crs,
> +aml_io(AML_DECODE16, memhp_io_base,
> memhp_io_base, 0,
> +   MEMORY_HOTPLUG_IO_LEN)
> +);
> +} else {
> +aml_append(crs, aml_memory32_fixed(memhp_io_base,
> +MEMORY_HOTPLUG_IO_LEN,
> AML_READ_WRITE));
> +}
>  aml_append(mem_ctrl_dev, aml_name_decl("_CRS", crs));
> 
>  aml_append(mem_ctrl_dev, aml_operation_region(
> -MEMORY_HOTPLUG_IO_REGION, AML_SYSTEM_IO,
> +MEMORY_HOTPLUG_IO_REGION, rs,
>  aml_int(memhp_io_base), MEMORY_HOTPLUG_IO_LEN)
>  );
> 
> diff --git

Re: [Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-15 Thread BALATON Zoltan


On Thu, 15 Aug 2019, Gerd Hoffmann wrote:

On Thu, Aug 15, 2019 at 02:25:07AM +0200, BALATON Zoltan wrote:

The MacOS driver exits if the card does not have an interrupt. If we
set PCI_INTERRUPT_PIN to 1 then it enables VBlank interrupts and it
boots but the mouse pointer cannot be moved. This patch implements a
dummy VBlank interrupt by a timer triggered at 60 Hz to test if it
helps. Unfortunately it doesn't: MacOS with this patch hangs during
boot just polling interrupts and acknowledging them so maybe it needs
something else or there may be some other problem with this
implementation.

This is posted for comments and to let others experiment with it but
probably should not be committed upstream yet.

Signed-off-by: BALATON Zoltan 
---
 hw/display/ati.c  | 41 +
 hw/display/ati_dbg.c  |  1 +
 hw/display/ati_int.h  |  4 
 hw/display/ati_regs.h |  1 +
 4 files changed, 47 insertions(+)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index a365e2455d..e06cbf3e91 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -243,6 +243,21 @@ static uint64_t ati_i2c(bitbang_i2c_interface *i2c, 
uint64_t data, int base)
 return data;
 }

+static void ati_vga_update_irq(ATIVGAState *s)
+{
+pci_set_irq(&s->dev, s->regs.gen_int_status & 1);


This should be "s->regs.gen_int_status & s->regs.gen_int_cntl" I guess?


Probably, but for now we only try to emulate VBlank, so to avoid any problems 
due to raising irq for unknown bits I restricted it to that.



+static void ati_vga_vblank_irq(void *opaque)
+{
+ATIVGAState *s = opaque;
+
+timer_mod(&s->vblank_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+  NANOSECONDS_PER_SECOND / 60);
+s->regs.gen_int_status |= 1;


#defines for the irq status bits would be nice.


Yes, I thought about that but this was only for quick testing. I'll add 
a constant for this in the next version.



+case GEN_INT_CNTL:
+s->regs.gen_int_cntl = data;
+if (data & 1) {
+ati_vga_vblank_irq(s);
+} else {
+timer_del(&s->vblank_timer);
+}


ati_vga_update_irq() needed here.


+break;
+case GEN_INT_STATUS:
+data &= (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF ?
+ 0x000f040fUL : 0xfc080effUL);


Add IRQ_MASK #define ?


I'd leave these as constants because there are many of them (basically 
reserved bit mask for regs where we care or in this case writable bits) 
and one has to check docs to verify them and also in some cases we combine 
rage128p and rv100 so hiding them behind a constant would just make code 
less readable in my opinion. (This would become 3 lines for example with 
defines you'd have to look up in a different header so it's easier to see 
this way.)



+s->regs.gen_int_status &= ~data;


ati_vga_update_irq() needed here too.


Thanks. Indeed I forgot this. With that it works a bit better, mouse now 
can be moved but only vertically... No idea why, I'll have to check.


Regards,
BALATON Zoltan

Re: [Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-15 Thread Gerd Hoffmann

  Hi,

> > > +static void ati_vga_update_irq(ATIVGAState *s)
> > > +{
> > > +pci_set_irq(&s->dev, s->regs.gen_int_status & 1);
> > 
> > This should be "s->regs.gen_int_status & s->regs.gen_int_cntl" I guess?
> 
> Probably, but we only try to emulate VBlank yet so to avoid any problems due
> to raising irq for unknown bits I restricted it for that now.

Well, qemu doesn't set unknown status bits, only vblank.  The guest
can't set them either due to status register having write-one-to-clear
semantics.  So, that should not happen.  If you want an extra check to
catch programming errors I'd suggest to add an assert() for that.
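
As a rough illustration of the points above, here is a small standalone
sketch in C. It is not the ati-vga code: the struct, the helper names and the
DEMO_INT_VBLANK constant are made up for this example. It assumes bit 0 is the
VBlank bit, that the status register is write-one-to-clear, and that the IRQ
level is the status masked by the enable register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_VBLANK 0x1u  /* hypothetical name for status/enable bit 0 */

struct demo_regs {
    uint32_t gen_int_cntl;    /* interrupt enable bits */
    uint32_t gen_int_status;  /* pending interrupt bits */
};

/* IRQ line is asserted only for pending bits that are also enabled. */
bool demo_irq_level(const struct demo_regs *r)
{
    return (r->gen_int_status & r->gen_int_cntl) != 0;
}

/* Device side: only the VBlank bit is ever set by the emulation. */
void demo_raise_vblank(struct demo_regs *r)
{
    r->gen_int_status |= DEMO_INT_VBLANK;
    /* Extra check to catch programming errors, as suggested above. */
    assert((r->gen_int_status & ~DEMO_INT_VBLANK) == 0);
}

/* Guest write to the status register: writing 1 clears a bit, 0 is a no-op. */
void demo_write_status(struct demo_regs *r, uint32_t data)
{
    r->gen_int_status &= ~data;
}
```

In a real device model the IRQ line would be re-evaluated after each of
these state changes, which is what the ati_vga_update_irq() calls requested in
the review are for.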

> > > +s->regs.gen_int_status &= ~data;
> > 
> > ati_vga_update_irq() needed here too.
> 
> Thanks. Indeed I forgot this. With that it works a bit better, mouse now can
> be moved but only vertically... No idea why, I'll have to check,

Still progress.  One step at a time ;)

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] Fix hw/rdma/vmw/pvrdma_cmd.c build

2019-08-15 Thread Yuval Shaia

On Sun, Aug 11, 2019 at 09:42:47PM +0200, Stephen Kitt wrote:
> This was broken by the cherry-pick in 41dd30f. Fix by handling errors
> as in the rest of the function: "goto out" instead of "return rc".
> 
> Signed-off-by: Stephen Kitt 
> ---
>  hw/rdma/vmw/pvrdma_cmd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/rdma/vmw/pvrdma_cmd.c b/hw/rdma/vmw/pvrdma_cmd.c
> index bb9a9f1cd1..a3a86d7c8e 100644
> --- a/hw/rdma/vmw/pvrdma_cmd.c
> +++ b/hw/rdma/vmw/pvrdma_cmd.c
> @@ -514,7 +514,7 @@ static int create_qp(PVRDMADev *dev, union pvrdma_cmd_req 
> *req,
>   cmd->recv_cq_handle, rings, &resp->qpn);
>  if (resp->hdr.err) {
>  destroy_qp_rings(rings);
> -return rc;
> +goto out;

This label was removed, can you please check master branch?

>  }
>  
>  resp->max_send_wr = cmd->max_send_wr;
> -- 
> 2.20.1
> 
>

Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext

2019-08-15 Thread Aleksandar Markovic

15.08.2019. 04.13, "Jan Bobek"  wrote:
>
> From: Richard Henderson 
>
> Treat this the same as we already do for other rex bits.
>
> Signed-off-by: Richard Henderson 
> ---
>  target/i386/translate.c | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index d74dbfd585..c0866c2797 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -44,11 +44,13 @@
>  #define REX_X(s) ((s)->rex_x)
>  #define REX_B(s) ((s)->rex_b)
>  #define REX_R(s) ((s)->rex_r)
> +#define REX_W(s) ((s)->rex_w)
>  #else
>  #define CODE64(s) 0
>  #define REX_X(s) 0
>  #define REX_B(s) 0
>  #define REX_R(s) 0
> +#define REX_W(s) -1

The commit message says "treat rex_w the same as other rex bits". Why, then,
is REX_W() treated differently here?

Yours,
Aleksandar

>  #endif
>
>  #ifdef TARGET_X86_64
> @@ -100,7 +102,7 @@ typedef struct DisasContext {
>  #ifdef TARGET_X86_64
>  int lma;/* long mode active */
>  int code64; /* 64 bit code segment */
> -int rex_x, rex_b, rex_r;
> +int rex_x, rex_b, rex_r, rex_w;
>  #endif
>  int vex_l;  /* vex vector length */
>  int vex_v;  /* vex  register, without 1's complement.  */
> @@ -4495,7 +4497,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  int modrm, reg, rm, mod, op, opreg, val;
>  target_ulong next_eip, tval;
>  target_ulong pc_start = s->base.pc_next;
> -int rex_w;
>
>  s->pc_start = s->pc = pc_start;
>  s->override = -1;
> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  s->rex_x = 0;
>  s->rex_b = 0;
>  s->rex_r = 0;
> +s->rex_w = -1;
>  s->x86_64_hregs = false;
>  #endif
>  s->rip_offset = 0; /* for relative ip address */
> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  }
>
>  prefixes = 0;
> -rex_w = -1;
>
>   next_byte:
>  b = x86_ldub_code(env, s);
> @@ -4557,7 +4558,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  case 0x40 ... 0x4f:
>  if (CODE64(s)) {
>  /* REX prefix */
> -rex_w = (b >> 3) & 1;
> +s->rex_w = (b >> 3) & 1;
>  s->rex_r = (b & 0x4) << 1;
>  s->rex_x = (b & 0x2) << 2;
>  s->rex_b = (b & 0x1) << 3;
> @@ -4606,7 +4607,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  s->rex_b = (~vex2 >> 2) & 8;
>  #endif
>  vex3 = x86_ldub_code(env, s);
> -rex_w = (vex3 >> 7) & 1;
> +#ifdef TARGET_X86_64
> +s->rex_w = (vex3 >> 7) & 1;
> +#endif
>  switch (vex2 & 0x1f) {
>  case 0x01: /* Implied 0f leading opcode bytes.  */
>  b = x86_ldub_code(env, s) | 0x100;
> @@ -4631,9 +4634,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  /* Post-process prefixes.  */
>  if (CODE64(s)) {
>  /* In 64-bit mode, the default data size is 32-bit.  Select
64-bit
> -   data with rex_w, and 16-bit data with 0x66; rex_w takes
precedence
> +   data with REX_W, and 16-bit data with 0x66; REX_W takes
precedence
> over 0x66 if both are present.  */
> -dflag = (rex_w > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
> +dflag = (REX_W(s) > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
>  /* In 64-bit mode, 0x67 selects 32-bit addressing.  */
>  aflag = (prefixes & PREFIX_ADR ? MO_32 : MO_64);
>  } else {
> @@ -5029,7 +5032,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  /* operand size for jumps is 64 bit */
>  ot = MO_64;
>  } else if (op == 3 || op == 5) {
> -ot = dflag != MO_16 ? MO_32 + (rex_w == 1) : MO_16;
> +ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
>  } else if (op == 6) {
>  /* default push size is 64 bit */
>  ot = mo_pushpop(s, dflag);
> --
> 2.20.1
>
>

Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-15 Thread Maxim Levitsky

On Wed, 2019-08-14 at 16:08 -0500, Eric Blake wrote:
> On 8/14/19 3:22 PM, Maxim Levitsky wrote:
> 
> > This is an issue that was raised today on IRC with Kevin Wolf. Really thanks
> > for the idea!
> > 
> > We agreed that this new qmp interface should take the same options as
> > blockdev-create does, however since we want to be able to edit the 
> > encryption
> > slots separately, this implies that we sort of need to allow this on 
> > creation
> > time as well.
> > 
> > Also the BlockdevCreateOptions is a union, which is specialized by the 
> > driver name
> > which is great for creation, but for update, the driver name is already 
> > known,
> > and thus the user should not be forced to pass it again.
> > However qmp doesn't seem to support union type guessing based on actual 
> > fields
> > given (this might not be desired either), which complicates this somewhat.
> 
> Does the idea of a union type with a default value for the discriminator
> help?  Maybe we have a discriminator which defaults to 'auto', and add a
> union branch 'auto':'any'.  During creation, if the "driver":"auto"
> branch is selected (usually implicitly by omitting "driver", but also
> possible explicitly), the creation attempt is rejected as invalid
> regardless of the contents of the remaining 'any'.  But during amend
> usage, if the 'auto' branch is selected, we then add in the proper
> "driver":"xyz" and reparse the QAPI object to determine if the remaining
> fields in 'any' still meet the specification for the required driver branch.
> 
> This idea may still require some tweaks to the QAPI generator, but it's
> the best I can come up with for a way to parse an arbitrary JSON object
> with unknown validation, then reparse it again after adding more
> information that would constrain the parse differently.
> 

This could work, but the idea of doing the parsing twice might not be easy to 
implement.
We currently have the qmp parser completely separated from the rest of qemu,
so only once the qmp command parsing is done is the corresponding callback 
called.


I am thinking: since any 'update' command would need to specify the node to 
work on,
one could add some kind of expression for the qmp frontend to query the driver 
of that
node itself, which would solve that problem.


Something like this:

{ 'union': 'BlockdevAmendOptions',

  'base': {
  'node-name': 'str' },

  'discriminator': { 'get_block_driver(node-name)' } ,

  'data': {
  'file':   'BlockdevCreateOptionsFile',
  'gluster':'BlockdevCreateOptionsGluster',
  'luks':   'BlockdevCreateOptionsLUKS',
  'nfs':'BlockdevCreateOptionsNfs',
  'parallels':  'BlockdevCreateOptionsParallels',
  'qcow':   'BlockdevCreateOptionsQcow',
  'qcow2':  'BlockdevCreateOptionsQcow2',
  'qed':'BlockdevCreateOptionsQed',
  'rbd':'BlockdevCreateOptionsRbd',
  'sheepdog':   'BlockdevCreateOptionsSheepdog',
  'ssh':'BlockdevCreateOptionsSsh',
  'vdi':'BlockdevCreateOptionsVdi',
  'vhdx':   'BlockdevCreateOptionsVhdx',
  'vmdk':   'BlockdevCreateOptionsVmdk',
  'vpc':'BlockdevCreateOptionsVpc'
  } }


The 'get_block_driver' expression will make the QMP frontend take the value of 
the node-name union field,
look up the block driver associated with it, and use that as the discriminator.

Syntax-wise we can (at some expense of readability) use JSON to express the 
same thing, like:

'discriminator': { 'field' : 'node-name', 'transform': 'getdrivername' },
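
The lookup step proposed above could be sketched roughly as follows in C.
This is only an illustration under assumptions: the node table, struct and
function are hypothetical, and nothing like this exists in the QAPI generator
today. The idea is simply that the frontend resolves node-name to a driver
name first and then uses that string to pick the union branch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of a named block node and the driver behind it. */
struct demo_node_entry {
    const char *node_name;
    const char *driver;
};

/*
 * Resolve the discriminator for an amend request: return the driver name of
 * the node called node_name, or NULL if no such node is known.
 */
const char *demo_get_block_driver(const struct demo_node_entry *nodes,
                                  size_t n, const char *node_name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(nodes[i].node_name, node_name) == 0) {
            return nodes[i].driver;
        }
    }
    return NULL;
}
```

A NULL result would have to be reported as an error before any of the
remaining fields are validated, since no branch can be selected without a
driver.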

Best regards,
Maxim Levitsky

Re: [Qemu-devel] [PATCH v1] s390x/tcg: Fix VERIM with 32/64 bit elements

2019-08-15 Thread Cornelia Huck

On Wed, 14 Aug 2019 17:12:42 +0200
David Hildenbrand  wrote:

> Wrong order of operands. The constant always comes last. Makes QEMU crash
> reliably on specific git fetch invocations.
> 
> Reported-by: Stefano Brivio 
> Signed-off-by: David Hildenbrand 
> ---
> 
> I guess it is too late for 4.1 :(
> 
> ---
>  target/s390x/translate_vx.inc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
> index 41d5cf869f..0caddb3958 100644
> --- a/target/s390x/translate_vx.inc.c
> +++ b/target/s390x/translate_vx.inc.c
> @@ -213,7 +213,7 @@ static void get_vec_element_ptr_i64(TCGv_ptr ptr, uint8_t 
> reg, TCGv_i64 enr,
> vec_full_reg_offset(v3), ptr, 16, 16, data, fn)
>  #define gen_gvec_3i(v1, v2, v3, c, gen) \
>  tcg_gen_gvec_3i(vec_full_reg_offset(v1), vec_full_reg_offset(v2), \
> -vec_full_reg_offset(v3), c, 16, 16, gen)
> +vec_full_reg_offset(v3), 16, 16, c, gen)
>  #define gen_gvec_4(v1, v2, v3, v4, gen) \
>  tcg_gen_gvec_4(vec_full_reg_offset(v1), vec_full_reg_offset(v2), \
> vec_full_reg_offset(v3), vec_full_reg_offset(v4), \

Reviewed-by: Cornelia Huck 
Fixes: 5c4b0ab460ef ("s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT 
UNDER MASK")
Cc: qemu-sta...@nongnu.org

Thanks, applied.
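
For readers wondering why the swapped operands compiled at all, here is a
small standalone sketch in C. The function below is a hypothetical stand-in,
not the TCG API; it only assumes, as the corrected macro implies, that the two
size arguments come before the constant. Because all three are integer types,
the compiler accepts either order silently.

```c
#include <stdint.h>

/* Records what the callee actually received. */
struct demo_call {
    uint32_t oprsz;
    uint32_t maxsz;
    int64_t c;
};

/* Hypothetical stand-in: offsets, then operation size, max size, constant. */
struct demo_call demo_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs,
                              uint32_t oprsz, uint32_t maxsz, int64_t c)
{
    (void)dofs;
    (void)aofs;
    (void)bofs;
    struct demo_call r = { oprsz, maxsz, c };
    return r;
}

/* Pre-fix order: the constant lands in the operation size slot. */
#define DEMO_3I_BROKEN(c) demo_gvec_3i(0, 16, 32, (c), 16, 16)
/* Post-fix order: sizes first, the constant last. */
#define DEMO_3I_FIXED(c)  demo_gvec_3i(0, 16, 32, 16, 16, (c))
```

With the broken macro a constant of 5 is taken as the operation size and 16
becomes the constant; with the fixed macro each value ends up in the intended
slot.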

Re: [Qemu-devel] [RFC PATCH v2 16/39] target/i386: introduce instruction operand infrastructure

2019-08-15 Thread Richard Henderson

On 8/15/19 1:00 AM, Jan Bobek wrote:
> On 8/13/19 2:07 AM, Richard Henderson wrote:
>> On 8/10/19 5:12 AM, Jan Bobek wrote:
>>> +#define INSNOP_INIT(opT, init_stmt)\
>>> +static int insnop_init(opT)(CPUX86State *env, DisasContext *s, \
>>> +int modrm, insnop_t(opT) *op)  \
>>> +{  \
>>> +init_stmt; \
>>> +}
>> ...
>>> +#define INSNOP_INIT_FAILreturn 1
>>> +#define INSNOP_INIT_OK(x)   return ((*(op) = (x)), 0)
>>
>> Return bool and true on success.
> 
> So, the reason why I did this "inverted" logic (0 = success, 1 =
> failure) is because I was anticipating I might need to differentiate
> between two or more different failures, in which case returning
> different non-zero values for different error cases makes perfect
> sense. I have not made use of it yet, but I'd rather hold on to this
> idiom at least for now, until I am 100 % sure it really is
> unnecessary.

In that case "int" still isn't the best return type -- an enumeration would be.
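
A minimal sketch of what an enumeration return type could look like, in
standalone C. All names here are invented for illustration and are not taken
from the patch set; the only facts assumed are the x86 ModRM layout (mod in
bits 7:6, rm in bits 2:0) and that mod == 3 selects a register operand.

```c
/* Hypothetical result type; the second failure code is just a placeholder. */
typedef enum {
    DEMO_INIT_OK = 0,
    DEMO_INIT_NOT_REGISTER,   /* ModRM does not encode a register operand */
    DEMO_INIT_UNSUPPORTED,    /* reserved for some other failure reason */
} DemoInitResult;

/* Initialize a register operand from the ModRM byte's rm field. */
DemoInitResult demo_init_reg_operand(int modrm, int *op)
{
    int mod = (modrm >> 6) & 3;

    if (mod != 3) {
        return DEMO_INIT_NOT_REGISTER;
    }
    *op = modrm & 7;
    return DEMO_INIT_OK;
}
```

Callers can still test for success with a comparison against DEMO_INIT_OK,
while a switch over the result lets the compiler warn about unhandled failure
cases.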


r~

Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext

2019-08-15 Thread Richard Henderson

On 8/15/19 8:30 AM, Aleksandar Markovic wrote:
> 
> 15.08.2019. 04.13, "Jan Bobek" wrote:
>>
>> From: Richard Henderson <r...@twiddle.net>
>>
>> Treat this the same as we already do for other rex bits.
>>
>> Signed-off-by: Richard Henderson <r...@twiddle.net>
>> ---
>>  target/i386/translate.c | 19 +++
>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/target/i386/translate.c b/target/i386/translate.c
>> index d74dbfd585..c0866c2797 100644
>> --- a/target/i386/translate.c
>> +++ b/target/i386/translate.c
>> @@ -44,11 +44,13 @@
>>  #define REX_X(s) ((s)->rex_x)
>>  #define REX_B(s) ((s)->rex_b)
>>  #define REX_R(s) ((s)->rex_r)
>> +#define REX_W(s) ((s)->rex_w)
>>  #else
>>  #define CODE64(s) 0
>>  #define REX_X(s) 0
>>  #define REX_B(s) 0
>>  #define REX_R(s) 0
>> +#define REX_W(s) -1
> 
> The commit message says "treat rex_w the same as other rex bits". Why, then, is
> REX_W() treated differently here?

"Treated the same" in terms of being referenced by a macro instead of a local
variable.  As for the -1, if you look at the rest of the patch you can clearly
see it preserves existing behaviour.
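
A small standalone sketch of the operand-size selection quoted in the patch
may help here. The enum and flag values below are hypothetical stand-ins for
MO_16/MO_32/MO_64 and PREFIX_DATA; only the shape of the expression is taken
from the diff. It shows that the "no REX prefix seen" value of -1 selects the
same size as an explicit REX.W of 0.

```c
/* Hypothetical stand-ins for the TCG memory operation sizes. */
enum demo_memop {
    DEMO_MO_16 = 1,
    DEMO_MO_32 = 2,
    DEMO_MO_64 = 3,
};

#define DEMO_PREFIX_DATA 0x08  /* hypothetical flag for the 0x66 prefix */

/*
 * 64-bit mode data size: REX.W selects 64-bit and takes precedence over
 * the 0x66 prefix, which selects 16-bit; otherwise the default is 32-bit.
 * rex_w is -1 when no REX prefix was decoded.
 */
enum demo_memop demo_dflag64(int rex_w, int prefixes)
{
    return rex_w > 0 ? DEMO_MO_64
         : (prefixes & DEMO_PREFIX_DATA) ? DEMO_MO_16
         : DEMO_MO_32;
}
```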

>> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
> CPUState *cpu)
>>      s->rex_x = 0;
>>      s->rex_b = 0;
>>      s->rex_r = 0;
>> +    s->rex_w = -1;
>>      s->x86_64_hregs = false;
>>  #endif
>>      s->rip_offset = 0; /* for relative ip address */
>> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
> CPUState *cpu)
>>      }
>>
>>      prefixes = 0;
>> -    rex_w = -1;


r~

Re: [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-15 Thread Yao, Jiewen

Hi Paolo
I am not sure what you mean by "You do not need a reset vector ...".
If so, where is the first instruction of the new CPU in the virtualization 
environment?
Please help me understand that first. Then we can continue the discussion.

Thank you
Yao Jiewen

> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Wednesday, August 14, 2019 10:05 PM
> To: Yao, Jiewen ; Laszlo Ersek
> ; edk2-devel-groups-io 
> Cc: edk2-rfc-groups-io ; qemu devel list
> ; Igor Mammedov ;
> Chen, Yingwen ; Nakajima, Jun
> ; Boris Ostrovsky ;
> Joao Marcal Lemos Martins ; Phillip Goerl
> 
> Subject: Re: CPU hotplug using SMM with QEMU+OVMF
> 
> On 14/08/19 15:20, Yao, Jiewen wrote:
> >> - Does this part require a new branch somewhere in the OVMF SEC code?
> >>   How do we determine whether the CPU executing SEC is BSP or
> >>   hot-plugged AP?
> > [Jiewen] I think this is blocked from hardware perspective, since the first
> instruction.
> > There are some hardware specific registers can be used to determine if the
> CPU is new added.
> > I don’t think this must be same as the real hardware.
> > You are free to invent some registers in device model to be used in OVMF
> hot plug driver.
> 
> Yes, this would be a new operation mode for QEMU, that only applies to
> hot-plugged CPUs.  In this mode the AP doesn't reply to INIT or SMI, in
> fact it doesn't reply to anything at all.
> 
> >> - How do we tell the hot-plugged AP where to start execution? (I.e. that
> >>   it should execute code at a particular pflash location.)
> > [Jiewen] Same real mode reset vector at :FFF0.
> 
> You do not need a reset vector or INIT/SIPI/SIPI sequence at all in
> QEMU.  The AP does not start execution at all when it is unplugged, so
> no cache-as-RAM etc.

> We only need to modify QEMU so that hot-plugged APs do not reply to
> INIT/SIPI/SMI.
> 
> > I don’t think there is problem for real hardware, who always has CAR.
> > Can QEMU provide some CPU specific space, such as MMIO region?
> 
> Why is a CPU-specific region needed if every other processor is in SMM
> and thus trusted?
> >>   Does CPU hotplug apply only at the socket level? If the CPU is
> >>   multi-core, what is responsible for hot-plugging all cores present in
> >>   the socket?
> 
> I can answer this: the SMM handler would interact with the hotplug
> controller in the same way that ACPI DSDT does normally.  This supports
> multiple hotplugs already.
> 
> Writes to the hotplug controller from outside SMM would be ignored.
> 
> >>> (03) New CPU: (Flash) send board message to tell host CPU (GPIO->SCI)
> >>>  -- I am waiting for hot-add message.
> >>
> >> Maybe we can simplify this in QEMU by broadcasting an SMI to existent
> >> processors immediately upon plugging the new CPU.
> 
> The QEMU DSDT could be modified (when secure boot is in effect) to OUT
> to 0xB2 when hotplug happens.  It could write a well-known value to
> 0xB2, to be read by an SMI handler in edk2.
> 
> 
> >>
> >>>(NOTE: Host CPU can
> only
> >> send
> >>>  instruction in SMM mode. -- The register is SMM only)
> >>
> >> Sorry, I don't follow -- what register are we talking about here, and
> >> why is the BSP needed to send anything at all? What "instruction" do you
> >> have in mind?
> > [Jiewen] The new CPU does not enable SMI at reset.
> > At some point of time later, the CPU need enable SMI, right?
> > The "instruction" here means, the host CPUs need tell to CPU to enable
> SMI.
> 
> Right, this would be a write to the CPU hotplug controller
> 
> >>> (04) Host CPU: (OS) get message from board that a new CPU is added.
> >>>  (GPIO -> SCI)
> >>>
> >>> (05) Host CPU: (OS) All CPUs enter SMM (SCI->SWSMI) (NOTE: New CPU
> >>>  will not enter CPU because SMI is disabled)
> >>
> >> I don't understand the OS involvement here. But, again, perhaps QEMU
> can
> >> force all existent CPUs into SMM immediately upon adding the new CPU.
> > [Jiewen] OS here means the Host CPU running code in OS environment, not
> in SMM environment.
> 
> See above.
> 
> >>> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
> >>>  rebase code.
> >>>
> >>> (07) Host CPU: (SMM) Send message to New CPU to Enable SMI.
> >>
> >> Aha, so this is the SMM-only register you mention in step (03). Is the
> >> register specified in the Intel SDM?
> > [Jiewen] Right. That is the register to let host CPU tell new CPU to enable
> SMI.
> > It is platform specific register. Not defined in SDM.
> > You may invent one in device model.
> 
> See above.
> 
> >>> (10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE
> to
> >>>  TSEG.
> >>
> >> What code does the new CPU execute after it completes step (10)? Does
> it
> >> halt?
> >
> > [Jiewen] The new CPU exits SMM and return to original place - where it is
> > interrupted to enter SMM - running code on the flash.
> 
> So in our case we'd need an INIT/SIPI/SIPI sequence between (06) and

Re: [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h

2019-08-15 Thread Richard Henderson

On 8/15/19 8:02 AM, Aleksandar Markovic wrote:
> A question for you: What about FISTTP, MONITOR, MWAIT, that belong to SSE3, 
> but
> are not mentioned in this patch?
> 

They are also not vector instructions, which is the subject of this patch set.


r~

Re: [Qemu-devel] RISC-V: Vector && DSP Extension

2019-08-15 Thread Aleksandar Markovic

15.08.2019. 11.07, "Peter Maydell"  wrote:
>
> On Thu, 15 Aug 2019 at 09:53, Aleksandar Markovic
>  wrote:
> >
> > > We can accept draft
> > > extensions in QEMU as long as they are disabled by default.
>
> > Hi, Alistair, Palmer,
> >
> > Is this an official stance of QEMU community, or perhaps Alistair's
> > personal judgement, or maybe a rule within the RISC-V subcommunity?
>
> Alistair asked on a previous thread; my view was:
> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03364.html
> and nobody else spoke up disagreeing (summary: should at least be
> disabled-by-default and only enabled by setting an explicit
> property whose name should start with the 'x-' prefix).
>
> In general QEMU does sometimes introduce experimental extensions
> (we've had them in the block layer, for example) and so the 'x-'
> property to enable them is a reasonably established convention.
> I think it's a reasonable compromise to allow this sort of work
> to start and not have to live out-of-tree for a long time, without
> confusing users or getting into a situation where some QEMU
> versions behave differently or to obsolete drafts of a spec
> without it being clear from the command line that experimental
> extensions are being enabled.
>
> There is also an element of "submaintainer judgement" to be applied
> here -- upstream is probably not the place for a draft extension
> to be implemented if it is:
>  * still fast moving or subject to major changes of design direction
>  * major changes to the codebase (especially if it requires
>changes to core code) that might later need to be redone
>entirely differently
>  * still experimental
>

OK.

Thanks for the detailed response.

Aleksandar

> thanks
> -- PMM

Re: [Qemu-devel] [PATCH 0/7] target/arm: Misc cleanups

2019-08-15 Thread Peter Maydell

On Thu, 8 Aug 2019 at 21:26, Richard Henderson
 wrote:
>
> Some of these were cleanups that I was making simultaneous
> with the decodetree split.  Let's do those beforehand to
> make the split easier to read.
>
> Some of these are new, noticed while I was in the area.
>
>
> r~
>
>
> Richard Henderson (7):
>   target/arm: Use tcg_gen_extract_i32 for shifter_out_im
>   target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
>   target/arm: Remove redundant shift tests
>   target/arm: Use ror32 instead of open-coding the operation
>   target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
>   target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
>   target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word

Applied to target-arm.next, thanks. (I had a comment on patch 6
but it was about the tcg docs, not the patch itself.)

-- PMM

Re: [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 04:10, "Jan Bobek" wrote:
>
> The variable aflag is not used in most of disas_insn; make this clear
> by explicitly reducing its scope to the block where it is used.
>
> Suggested-by: Richard Henderson 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Jan Bobek 
> ---

Jan, the new block between { and } should be indented.

Yours,
Aleksandar

>  target/i386/translate.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index c0866c2797..bda96277e4 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4493,11 +4493,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
>  CPUX86State *env = cpu->env_ptr;
>  int b, prefixes;
>  int shift;
> -TCGMemOp ot, aflag, dflag;
> +TCGMemOp ot, dflag;
>  int modrm, reg, rm, mod, op, opreg, val;
>  target_ulong next_eip, tval;
>  target_ulong pc_start = s->base.pc_next;
>
> +{
> +TCGMemOp aflag;
> +
>  s->pc_start = s->pc = pc_start;
>  s->override = -1;
>  #ifdef TARGET_X86_64
> @@ -4657,6 +4660,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
>  s->prefix = prefixes;
>  s->aflag = aflag;
>  s->dflag = dflag;
> +}
>
>  /* now check op code */
>   reswitch:
> --
> 2.20.1
>
>

Re: [Qemu-devel] RISC-V: Vector && DSP Extension

2019-08-15 Thread Peter Maydell

On Thu, 15 Aug 2019 at 09:53, Aleksandar Markovic wrote:
>
> > We can accept draft
> > extensions in QEMU as long as they are disabled by default.

> Hi, Alistair, Palmer,
>
> Is this an official stance of the QEMU community, or perhaps Alistair's
> personal judgement, or maybe a rule within the RISC-V subcommunity?

Alistair asked on a previous thread; my view was:
https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03364.html
and nobody else spoke up disagreeing (summary: should at least be
disabled-by-default and only enabled by setting an explicit
property whose name should start with the 'x-' prefix).

In general QEMU does sometimes introduce experimental extensions
(we've had them in the block layer, for example) and so the 'x-'
property to enable them is a reasonably established convention.
I think it's a reasonable compromise to allow this sort of work
to start and not have to live out-of-tree for a long time, without
confusing users or getting into a situation where some QEMU
versions behave differently, or according to obsolete drafts of a spec,
without it being clear from the command line that experimental
extensions are being enabled.

There is also an element of "submaintainer judgement" to be applied
here -- upstream is probably not the place for a draft extension
to be implemented if it is:
 * still fast moving or subject to major changes of design direction
 * major changes to the codebase (especially if it requires
   changes to core code) that might later need to be redone
   entirely differently
 * still experimental

thanks
-- PMM
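
To illustrate the convention described in this message, here is a minimal
sketch in plain C. It is not QEMU code: the feature table, struct cpu_feature,
and the helper functions are hypothetical names made up for this example. It
only models the two rules stated above: a draft extension is reachable solely
through a property whose name starts with 'x-', and it stays disabled unless
the user sets that property explicitly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table; none of these names are real QEMU symbols. */
struct cpu_feature {
    const char *prop;   /* property name as the user would spell it */
    bool enabled;       /* current state; draft features start disabled */
};

static struct cpu_feature features[] = {
    { "m",   true  },   /* ratified extension: enabled by default */
    { "x-v", false },   /* draft extension: 'x-' prefix, disabled by default */
};

#define NUM_FEATURES (sizeof(features) / sizeof(features[0]))

/* True if the property name marks the feature as experimental. */
static bool feature_is_experimental(const char *prop)
{
    return strncmp(prop, "x-", 2) == 0;
}

/* Set a property by its exact name; returns false if the name is unknown. */
static bool cpu_feature_set(const char *prop, bool value)
{
    for (size_t i = 0; i < NUM_FEATURES; i++) {
        if (strcmp(features[i].prop, prop) == 0) {
            features[i].enabled = value;
            return true;
        }
    }
    return false;
}

/* Report whether a property is currently enabled (false if unknown). */
static bool cpu_feature_enabled(const char *prop)
{
    for (size_t i = 0; i < NUM_FEATURES; i++) {
        if (strcmp(features[i].prop, prop) == 0) {
            return features[i].enabled;
        }
    }
    return false;
}
```

Because the lookup matches the exact name, asking for "v" fails while "x-v"
succeeds, so the experimental status is always visible in whatever the user
typed.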

Re: [Qemu-devel] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-15 Thread Kevin Wolf

On 14.08.2019 at 21:27, John Snow wrote:
> 
> 
> On 8/14/19 6:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> > To get rid of implicit filters related workarounds in future let's
> > deprecate them now.
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > ---
> >  qemu-deprecated.texi  |  7 +++
> >  qapi/block-core.json  |  6 --
> >  include/block/block_int.h | 10 +-
> >  blockdev.c| 10 ++
> >  4 files changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > index 2753fafd0b..8222440148 100644
> > --- a/qemu-deprecated.texi
> > +++ b/qemu-deprecated.texi
> > @@ -183,6 +183,13 @@ the 'wait' field, which is only applicable to sockets in server mode
> >  
> >  Use blockdev-mirror and blockdev-backup instead.
> >  
> > +@subsection implicit filters (since 4.2)
> > +
> > +Mirror and commit jobs inserts filters, which becomes implicit if user
> > +omitted filter-node-name parameter. So omitting it is deprecated, set it
> > +always. Note, that drive-mirror don't have this parameter, so it will
> > +create implicit filter anyway, but drive-mirror is deprecated itself too.
> > +
> >  @section Human Monitor Protocol (HMP) commands
> >  
> >  @subsection The hub_id parameter of 'hostfwd_add' / 'hostfwd_remove' (since 3.1)
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 4e35526634..0505ac9d8b 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -1596,7 +1596,8 @@
> >  # @filter-node-name: the node name that should be assigned to the
> >  #filter driver that the commit job inserts into the graph
> >  #above @top. If this option is not given, a node name is
> > -#autogenerated. (Since: 2.9)
> > +#autogenerated. Omitting this option is deprecated, it will
> > +#be required in future. (Since: 2.9)
> >  #
> >  # @auto-finalize: When false, this job will wait in a PENDING state after it has
> >  # finished its work, waiting for @block-job-finalize before
> > @@ -2249,7 +2250,8 @@
> >  # @filter-node-name: the node name that should be assigned to the
> >  #filter driver that the mirror job inserts into the graph
> >  #above @device. If this option is not given, a node name is
> > -#autogenerated. (Since: 2.9)
> > +#autogenerated. Omitting this option is deprecated, it will
> > +#be required in future. (Since: 2.9)
> >  #
> >  # @copy-mode: when to copy data to the destination; defaults to 'background'
> >  # (Since: 3.0)
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index 3aa1e832a8..624da0b4a2 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -762,7 +762,15 @@ struct BlockDriverState {
> >  bool sg;/* if true, the device is a /dev/sg* */
> >  bool probed;/* if true, format was probed rather than specified */
> >  bool force_share; /* if true, always allow all shared permissions */
> > -bool implicit;  /* if true, this filter node was automatically inserted */
> > +
> > +/*
> > + * @implicit field is deprecated, don't set it to true for new filters.
> > + * If true, this filter node was automatically inserted and user don't
> > + * know about it and unprepared for any effects of it. So, implicit
> > + * filters are workarounded and skipped in many places of the block
> > + * layer code.
> > + */
> > +bool implicit;
> >  
> >  BlockDriver *drv; /* NULL means no media */
> >  void *opaque;
> > diff --git a/blockdev.c b/blockdev.c
> > index 36e9368e01..b3cfaccce1 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -3292,6 +3292,11 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
> >  BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
> >  int job_flags = JOB_DEFAULT;
> >  
> > +if (!has_filter_node_name) {
> > +warn_report("Omitting filter-node-name parameter is deprecated, it "
> > +"will be required in future");
> > +}
> > +
> >  if (!has_speed) {
> >  speed = 0;
> >  }
> > @@ -3990,6 +3995,11 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
> >  Error *local_err = NULL;
> >  int ret;
> >  
> > +if (!has_filter_node_name) {
> > +warn_report("Omitting filter-node-name parameter is deprecated, it "
> > +"will be required in future");
> > +}
> > +
> >  bs = qmp_get_root_bs(device, errp);
> >  if (!bs) {
> >  return;
> > 
> 
> This might be OK to do right away, though.
> 
> I asked Markus this not too long ago; do we want to amend the QAPI
> schema specification to allow

Re: [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 04:51, "Jan Bobek" wrote:
>
> Add all the SSE2 instruction entries to sse-opcode.inc.h.
>
> Signed-off-by: Jan Bobek 
> ---

I gather the order of items in this file is based on instruction
functionality. This means, however, that, for example, SSE2 items will be
scattered all over the place.

In that light, consider providing a comment somewhere close to the top of
the file similar to this:

/*
* SSE2 instructions
* -
*
*   MOVD xmm,r/m32
*   MOVD r/m32,xmm
*   MOVQ xmm,r/m64
*   MOVQ xmm2/m64, xmm1
... etc ...
*
*/

(the same for SSE3, MMX)

That would provide the reader with a nice overview per instruction set
generation, and would also serve you as a convenient checkpoint against
specifications.

Yours,
Aleksandar

>  target/i386/sse-opcode.inc.h | 323 ++-
>  1 file changed, 322 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
> index 39947aeb51..efa67b7ce2 100644
> --- a/target/i386/sse-opcode.inc.h
> +++ b/target/i386/sse-opcode.inc.h
> @@ -43,241 +43,535 @@
>  OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
>  /* NP 0F 7E /r: MOVD r/m32,mm */
>  OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
> +/* 66 0F 6E /r: MOVD xmm,r/m32 */
> +OPCODE(movd, LEG(66, 0F, 0, 0x6e), SSE2, WR, Vdq, Ed)
> +/* 66 0F 7E /r: MOVD r/m32,xmm */
> +OPCODE(movd, LEG(66, 0F, 0, 0x7e), SSE2, WR, Ed, Vdq)
>  /* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
>  OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
>  /* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
>  OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
> +/* 66 REX.W 0F 6E /r: MOVQ xmm,r/m64 */
> +OPCODE(movq, LEG(66, 0F, 1, 0x6e), SSE2, WR, Vdq, Eq)
> +/* 66 REX.W 0F 7E /r: MOVQ r/m64,xmm */
> +OPCODE(movq, LEG(66, 0F, 1, 0x7e), SSE2, WR, Eq, Vdq)
>  /* NP 0F 6F /r: MOVQ mm, mm/m64 */
>  OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
>  /* NP 0F 7F /r: MOVQ mm/m64, mm */
>  OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
> +/* F3 0F 7E /r: MOVQ xmm1, xmm2/m64 */
> +OPCODE(movq, LEG(F3, 0F, 0, 0x7e), SSE2, WR, Vdq, Wq)
> +/* 66 0F D6 /r: MOVQ xmm2/m64, xmm1 */
> +OPCODE(movq, LEG(66, 0F, 0, 0xd6), SSE2, WR, UdqMq, Vq)
>  /* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
>  OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
>  /* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
>  OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
> +/* 66 0F 28 /r: MOVAPD xmm1, xmm2/m128 */
> +OPCODE(movapd, LEG(66, 0F, 0, 0x28), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 29 /r: MOVAPD xmm2/m128, xmm1 */
> +OPCODE(movapd, LEG(66, 0F, 0, 0x29), SSE2, WR, Wdq, Vdq)
> +/* 66 0F 6F /r: MOVDQA xmm1, xmm2/m128 */
> +OPCODE(movdqa, LEG(66, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 7F /r: MOVDQA xmm2/m128, xmm1 */
> +OPCODE(movdqa, LEG(66, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
>  /* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
>  OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
>  /* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
>  OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
> +/* 66 0F 10 /r: MOVUPD xmm1, xmm2/m128 */
> +OPCODE(movupd, LEG(66, 0F, 0, 0x10), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 11 /r: MOVUPD xmm2/m128, xmm1 */
> +OPCODE(movupd, LEG(66, 0F, 0, 0x11), SSE2, WR, Wdq, Vdq)
> +/* F3 0F 6F /r: MOVDQU xmm1,xmm2/m128 */
> +OPCODE(movdqu, LEG(F3, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
> +/* F3 0F 7F /r: MOVDQU xmm2/m128,xmm1 */
> +OPCODE(movdqu, LEG(F3, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
>  /* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
>  OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
>  /* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
>  OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
> +/* F2 0F 10 /r: MOVSD xmm1, xmm2/m64 */
> +OPCODE(movsd, LEG(F2, 0F, 0, 0x10), SSE2, WRRR, Vdq, Vdq, Wq, modrm_mod)
> +/* F2 0F 11 /r: MOVSD xmm1/m64, xmm2 */
> +OPCODE(movsd, LEG(F2, 0F, 0, 0x11), SSE2, WR, Wq, Vq)
> +/* F3 0F D6 /r: MOVQ2DQ xmm, mm */
> +OPCODE(movq2dq, LEG(F3, 0F, 0, 0xd6), SSE2, WR, Vdq, Nq)
> +/* F2 0F D6 /r: MOVDQ2Q mm, xmm */
> +OPCODE(movdq2q, LEG(F2, 0F, 0, 0xd6), SSE2, WR, Pq, Uq)
>  /* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
>  /* NP 0F 12 /r: MOVLPS xmm1, m64 */
>  OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
>  /* 0F 13 /r: MOVLPS m64, xmm1 */
>  OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
> +/* 66 0F 12 /r: MOVLPD xmm1,m64 */
> +OPCODE(movlpd, LEG(66, 0F, 0, 0x12), SSE2, WR, Vq, Mq)
> +/* 66 0F 13 /r: MOVLPD m64,xmm1 */
> +OPCODE(movlpd, LEG(66, 0F, 0, 0x13), SSE2, WR, Mq, Vq)
>  /* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
>  /* NP 0F 16 /r: MOVHPS xmm1, m64 */
>  OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
>  /* NP 0F 17 /r: MOVHPS m64, xmm1 */
>  OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
> +/* 66 0F 16 /r: MOVHPD xmm1, m64 */
> +OPCODE(movhpd, LEG(66, 0F, 0, 0x16), SSE2, WRR, Vdq, Vd, Mq)
> +/* 66 0F 17 /r: MOVHPD m64, xmm1 */
> +OPCODE(movhpd, LEG(66, 0F, 0, 0x17), SSE2, WR, Mq, Vdq)
>  /* NP 0F D7 /r:

Re: [Qemu-devel] [PATCH v3 00/15] target/arm/kvm: enable SVE in guests

2019-08-15 Thread Andrew Jones

On Thu, Aug 15, 2019 at 09:31:35AM +0100, Peter Maydell wrote:
> On Fri, 2 Aug 2019 at 13:25, Andrew Jones  wrote:
> >
> > Since Linux kernel v5.2-rc1 KVM has support for enabling SVE in guests.
> > This series provides the QEMU bits for that enablement. First, we
> > select existing CPU properties representing features we want to
> > advertise in addition to the SVE vector lengths and prepare
> > them for a qmp query. Then we introduce the qmp query, applying
> > it immediately to those selected features. We also document ARM CPU
> > features at this time. We next add a qtest for the selected CPU
> > features that uses the qmp query for its tests - and we continue to
> > add tests as we add CPU features with the following patches. So then,
> > once we have the support we need for CPU feature querying and testing,
> > we add our first SVE CPU feature property, 'sve', which just allows
> > SVE to be completely enabled/disabled. Following that feature property,
> > we add all 16 vector length properties along with the input validation
> > they need and tests to prove the validation works. At this point the
> > SVE features are still only for TCG, so we provide some patches to
> > prepare for KVM and then a patch that allows the 'max' CPU type to
> > enable SVE with KVM, but at first without vector length properties.
> > After a bit more preparation we add the SVE vector length properties
> > to the KVM-enabled 'max' CPU type along with the additional input
> > validation and tests that that needs.  Finally we allow the 'host'
> > CPU type to also enjoy these properties by simply sharing them with it.
> 
> Hi -- I see there have been some review comments on these patches
> that mean there'll be a v4. In the meantime, patches 1, 2, 5, 6, 9, 10
> seem to me to be independent bugfixes/cleanups that have been reviewed.
> Would you like me to take those into target-arm.next to reduce the
> size of the patchset for v4, or is that going to make rebasing
> painful on your end?
>

Hi Peter,

Please do take the fixups. I think the rebasing should go fine, and indeed
reducing the number of patches in the patchset should reduce some of my
maintenance and also some reviewer strain for v4.

Thanks,
drew

Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext

2019-08-15 Thread Aleksandar Markovic

On 15.08.2019 at 11:55, "Richard Henderson" wrote:
>
> On 8/15/19 8:30 AM, Aleksandar Markovic wrote:
> >
> > On 15.08.2019 at 04:13, "Jan Bobek" wrote:
> >>
> >> From: Richard Henderson <r...@twiddle.net>
> >>
> >> Treat this the same as we already do for other rex bits.
> >>
> >> Signed-off-by: Richard Henderson
> >> ---
> >>  target/i386/translate.c | 19 +++
> >>  1 file changed, 11 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/target/i386/translate.c b/target/i386/translate.c
> >> index d74dbfd585..c0866c2797 100644
> >> --- a/target/i386/translate.c
> >> +++ b/target/i386/translate.c
> >> @@ -44,11 +44,13 @@
> >>  #define REX_X(s) ((s)->rex_x)
> >>  #define REX_B(s) ((s)->rex_b)
> >>  #define REX_R(s) ((s)->rex_r)
> >> +#define REX_W(s) ((s)->rex_w)
> >>  #else
> >>  #define CODE64(s) 0
> >>  #define REX_X(s) 0
> >>  #define REX_B(s) 0
> >>  #define REX_R(s) 0
> >> +#define REX_W(s) -1
> >
> > The commit message says "treat rex_w the same as other rex bits". Why,
> > then, is REX_W() treated differently here?
>
> "Treated the same" in terms of being referenced by a macro instead of a local
> variable.  As for the -1, if you look at the rest of the patch you can clearly
> see it preserves existing behaviour.
>

That is exactly what I dislike about your commit messages: they often
introduce ambiguity, without any real need, and with really bad
consequences to the reader. Is adding "in terms of being referenced by a
macro instead of a local variable" to the commit message that hard?

When writing commit messages, you need to try to put yourself in the shoes
of the reader.

Aleksandar

> >> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
> >>  s->rex_x = 0;
> >>  s->rex_b = 0;
> >>  s->rex_r = 0;
> >> +s->rex_w = -1;
> >>  s->x86_64_hregs = false;
> >>  #endif
> >>  s->rip_offset = 0; /* for relative ip address */
> >> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
> >>  }
> >>
> >>  prefixes = 0;
> >> -rex_w = -1;
>
>
> r~

Re: [Qemu-devel] [RFC v3 0/2] Add live migration support in the PVRDMA device

2019-08-15 Thread Yuval Shaia

On Sun, Jul 21, 2019 at 05:18:01AM +0530, Sukrit Bhatnagar wrote:
> In v2, we had successful migration of PCI and MSIX states as well as
> various DMA addresses and ring page information.
> This series enables the migration of various GIDs used by the device.
> 
> We have switched to a setup having two hosts and two VMs running atop them.
> Migrations are now performed over the local network. This has settled the
> same-host issue with libvirt.
> 
> We also have performed various ping-pong tests (ibv_rc_pingpong) in the
> guest(s) after adding GID migration support and this is the current status:
> - ping-pong to localhost succeeds, when performed before starting the
>   migration and after the completion of migration.
> - ping-pong to a peer succeeds, both before and after migration as above,
>   provided that both VMs are running on/migrated to the same host.
>   So, if two VMs were started on two different hosts, and one of them
>   was migrated to the other host, the ping-pong was successful.
>   Similarly, if two VMs are migrated to the same host, then after migration,
>   the ping-pong was successful.
> - ping-pong to a peer on the remote host is not working as of now.
> 
> Our next goal is to achieve successful migration with live traffic.

As this is a major milestone which enables live migration (albeit only while
there are no QPs), I believe we are OK for a patch.

Yuval

> 
> This series can be also found at:
> https://github.com/skrtbhtngr/qemu/tree/gsoc19
> 
> 
> History:
> 
> v2 -> v3:
> - remove struct PVRDMAMigTmp and VMSTATE_WITH_TMP
> - use predefined PVRDMA_HW_NAME for the vmsd name
> - add vmsd for gids and a gid table field in pvrdma_state
> - perform gid registration in pvrdma_post_load
> - define pvrdma_post_save to unregister gids in the source host
> 
> v1 -> v2:
> - modify load_dsr() to make it idempotent
> - switch to VMStateDescription
> - add fields for PCI and MSIX state
> - define a temporary struct PVRDMAMigTmp to use WITH_TMP macro
> - perform mappings to CQ and event notification rings at load
> - vmxnet3 issue solved by Marcel's patch
> - BounceBuffer issue solved automatically by switching to VMStateDescription
> 
> 
> Link(s) to v2:
> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg01848.html
> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg01849.html
> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg01850.html
> 
> Link(s) to v1:
> https://lists.gnu.org/archive/html/qemu-devel/2019-06/msg04924.html
> https://lists.gnu.org/archive/html/qemu-devel/2019-06/msg04923.html
> 
> Sukrit Bhatnagar (2):
>   hw/pvrdma: make DSR mapping idempotent in load_dsr()
>   hw/pvrdma: add live migration support
> 
>  hw/rdma/vmw/pvrdma_main.c | 94 +++
>  1 file changed, 86 insertions(+), 8 deletions(-)
> 
> -- 
> 2.21.0
>

Re: [Qemu-devel] [PATCH 0/3] colo: Add support for continious replication

2019-08-15 Thread Zhang, Chen

Hi Lukas,

Please fix this issue and add more comments in the commit log.

Thanks
Zhang Chen

> -Original Message-
> From: no-re...@patchew.org [mailto:no-re...@patchew.org]
> Sent: Thursday, August 15, 2019 11:20 AM
> To: lukasstra...@web.de
> Cc: Zhang, Chen ; qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] [PATCH 0/3] colo: Add support for continious
> replication
> 
> Patchew URL:
> https://patchew.org/QEMU/cover.1565814686.git.lukasstra...@web.de/
> 
> 
> 
> Hi,
> 
> This series failed build test on s390x host. Please find the details below.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> # Testing script will be invoked under the git checkout with
> # HEAD pointing to a commit that has the patches applied on top of "base"
> # branch
> set -e
> 
> echo
> echo "=== ENV ==="
> env
> 
> echo
> echo "=== PACKAGES ==="
> rpm -qa
> 
> echo
> echo "=== UNAME ==="
> uname -a
> 
> CC=$HOME/bin/cc
> INSTALL=$PWD/install
> BUILD=$PWD/build
> mkdir -p $BUILD $INSTALL
> SRC=$PWD
> cd $BUILD
> $SRC/configure --cc=$CC --prefix=$INSTALL
> make -j4
> # XXX: we need reliable clean up
> # make check -j4 V=1
> make install
> === TEST SCRIPT END ===
> 
>  from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/net/filter.h:13,
>  from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:14:
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c: In function ‘netfilter_complete’:
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/qemu/queue.h:412:44: error: ‘position’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   412 | (listelm)->field.tqe_circ.tql_prev = &(elm)->field.tqe_circ;  \
>   |^
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:237:21: note: ‘position’ was declared here
> 
> 
> The full log is available at
> http://patchew.org/logs/cover.1565814686.git.lukasstra...@web.de/testing.s390x/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH v5 08/10] Adding info [tbs|tb|coverset] commands to HMP. These commands allow the exploration of TBs generated by the TCG. Understand which one hotter, with more guest/host ins

2019-08-15 Thread Dr. David Alan Gilbert

* vandersonmr (vanderson...@gmail.com) wrote:
> The goal of this command is to allow the dynamic exploration
> of TCG behavior and code quality. Therefore, for now, a
> corresponding QMP command is not worthwhile.
> 
> Signed-off-by: Vanderson M. do Rosario 
> ---
>  accel/tcg/tb-stats.c | 398 ++-
>  accel/tcg/translate-all.c|   2 +-
>  disas.c  |  31 ++-
>  hmp-commands-info.hx |  24 +++
>  include/exec/tb-stats.h  |  43 +++-
>  include/qemu/log-for-trace.h |   4 +
>  include/qemu/log.h   |   2 +
>  monitor/misc.c   |  74 +++
>  util/log.c   |  52 -
>  9 files changed, 609 insertions(+), 21 deletions(-)
> 
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index f28fd7b434..f5e519bdb7 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -11,9 +11,36 @@
>  
>  /* only accessed in safe work */
>  static GList *last_search;
> -
> +int id = 1; /* display_id increment counter */
>  uint64_t dev_time;
>  
> +static TBStatistics *get_tbstats_by_id(int id)
> +{
> +GList *iter;
> +
> +for (iter = last_search; iter; iter = g_list_next(iter)) {
> +TBStatistics *tbs = iter->data;
> +if (tbs && tbs->display_id == id) {
> +return tbs;
> +break;
> +}
> +}
> +return NULL;
> +}
> +
> +static TBStatistics *get_tbstats_by_addr(target_ulong pc)
> +{
> +GList *iter;
> +for (iter = last_search; iter; iter = g_list_next(iter)) {
> +TBStatistics *tbs = iter->data;
> +if (tbs && tbs->pc == pc) {
> +return tbs;
> +break;
> +}
> +}
> +return NULL;
> +}
> +
>  struct jit_profile_info {
>  uint64_t translations;
>  uint64_t aborted;
> @@ -155,6 +182,7 @@ static void clean_tbstats(void)
>  qht_destroy(_ctx.tb_stats);
>  }
>  
> +
>  void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data icmd)
>  {
>  struct TbstatsCommand *cmdinfo = icmd.host_ptr;
> @@ -242,6 +270,374 @@ void init_tb_stats_htable_if_not(void)
>  }
>  }
>  
> +static void collect_tb_stats(void *p, uint32_t hash, void *userp)
> +{
> +last_search = g_list_prepend(last_search, p);
> +}
> +
> +static void dump_tb_targets(TBStatistics *tbs)
> +{
> +if (tbs && tbs->tb) {
> +uintptr_t dst1 = atomic_read(tbs->tb->jmp_dest);
> +uintptr_t dst2 = atomic_read(tbs->tb->jmp_dest + 1);
> +TranslationBlock* tb_dst1 = dst1 > 1 ? (TranslationBlock *) dst1 : 0;
> +TranslationBlock* tb_dst2 = dst2 > 1 ? (TranslationBlock *) dst2 : 0;
> +target_ulong pc1 = tb_dst1 ? tb_dst1->pc : 0;
> +target_ulong pc2 = tb_dst2 ? tb_dst2->pc : 0;
> +
> +/* if there is no display id from the last_search, then create one */
> +TBStatistics *tbstats_pc1 = get_tbstats_by_addr(pc1);
> +TBStatistics *tbstats_pc2 = get_tbstats_by_addr(pc2);
> +
> +if (!tbstats_pc1 && tb_dst1 && tb_dst1->tb_stats) {
> +last_search = g_list_append(last_search, tb_dst1->tb_stats);
> +tbstats_pc1 = tb_dst1->tb_stats;
> +}
> +
> +if (!tbstats_pc2 && tb_dst2 && tb_dst2->tb_stats) {
> +last_search = g_list_append(last_search, tb_dst2->tb_stats);
> +tbstats_pc2 = tb_dst2->tb_stats;
> +}
> +
> +if (tbstats_pc1 && tbstats_pc1->display_id == 0) {
> +tbstats_pc1->display_id = id++;
> +}
> +
> +if (tbstats_pc2 && tbstats_pc2->display_id == 0) {
> +tbstats_pc2->display_id = id++;
> +}
> +
> +if (pc1 && !pc2) {
> +qemu_log("\t| targets: 0x"TARGET_FMT_lx" (id:%d)\n",
> +pc1, tb_dst1 ? tbstats_pc1->display_id : -1);
> +} else if (pc1 && pc2) {
> +qemu_log("\t| targets: 0x"TARGET_FMT_lx" (id:%d), "
> + "0x"TARGET_FMT_lx" (id:%d)\n",
> +pc1, tb_dst1 ? tbstats_pc1->display_id : -1,
> +pc2, tb_dst2 ? tbstats_pc2->display_id : -1);
> +} else {
> +qemu_log("\t| targets: no direct target\n");
> +}
> +}
> +}
> +
> +static void dump_tb_header(TBStatistics *tbs)
> +{
> +unsigned g = stat_per_translation(tbs, code.num_guest_inst);
> +unsigned ops = stat_per_translation(tbs, code.num_tcg_ops);
> +unsigned ops_opt = stat_per_translation(tbs, code.num_tcg_ops_opt);
> +unsigned spills = stat_per_translation(tbs, code.spills);
> +unsigned h = stat_per_translation(tbs, code.out_len);
> +
> +float guest_host_prop = g ? ((float) h / g) : 0;
> +
> +qemu_log("TB id:%d | phys:0x"TB_PAGE_ADDR_FMT" virt:0x"TARGET_FMT_lx
> + " flags:%#08x\n", tbs->display_id, tbs->phys_pc, tbs->pc, 
> tbs->flags);
> +
> +if (tbs_stats_enabled(tbs, TB_EXEC_STATS)) {
> +qemu_log("\t| exec:%lu/%lu\n", tbs->executions.normal, 
> tbs->executions.atomic);
> +}
> +
> +if

Re: [Qemu-devel] [PATCH v1 0/2] Integrating qemu to Linux Perf

2019-08-15 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20190815023725.2659-1-vanderson...@gmail.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  x86_64-softmmu/hw/intc/apic_common.o
  CC  x86_64-softmmu/hw/intc/ioapic.o
  CC  x86_64-softmmu/hw/isa/lpc_ich9.o
/tmp/qemu-test/src/accel/tcg/perf/jitdump.c:11:10: fatal error: 'exec/tb-stats.h' file not found
#include "exec/tb-stats.h"
 ^
  CC  x86_64-softmmu/hw/misc/ivshmem.o


The full log is available at
http://patchew.org/logs/20190815023725.2659-1-vanderson...@gmail.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH v5 09/10] monitor: adding new info cfg command

2019-08-15 Thread Dr. David Alan Gilbert

* vandersonmr (vanderson...@gmail.com) wrote:
> Adding "info cfg id depth" commands to HMP.
> This command allows the exploration of a TB's
> neighbors by dumping [and opening] a .dot
> file with the TB CFG neighbors colorized
> by their hotness.
> 
> The goal of this command is to allow the dynamic exploration
> of TCG behavior and code quality. Therefore, for now, a
> corresponding QMP command is not worthwhile.
> 
> Signed-off-by: Vanderson M. do Rosario 
> ---
>  accel/tcg/tb-stats.c| 177 
>  hmp-commands-info.hx|   7 ++
>  include/exec/tb-stats.h |   1 +
>  monitor/misc.c  |  22 +
>  4 files changed, 207 insertions(+)
> 
> diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
> index f5e519bdb7..5fda2bed9e 100644
> --- a/accel/tcg/tb-stats.c
> +++ b/accel/tcg/tb-stats.c
> @@ -637,6 +637,182 @@ void dump_tb_info(int id, int log_mask, bool use_monitor)
>  /* tbdi free'd by do_dump_tb_info_safe */
>  }
>  
> +/* TB CFG xdot/dot dump implementation */
> +#define MAX_CFG_NUM_NODES 1000
> +static int cfg_tb_id;
> +static GHashTable *cfg_nodes;
> +static uint64_t root_count;
> +
> +static void fputs_jump(TBStatistics *from, TBStatistics *to, FILE *dot)
> +{
> +if (!from || !to) {
> +return;
> +}
> +
> +int *from_id = (int *) g_hash_table_lookup(cfg_nodes, from);
> +int *to_id   = (int *) g_hash_table_lookup(cfg_nodes, to);
> +
> +if (!from_id || !to_id) {
> +return;
> +}
> +
> +GString *line = g_string_new(NULL);
> +
> +g_string_printf(line, "   node_%d -> node_%d;\n", *from_id, *to_id);
> +
> +fputs(line->str, dot);
> +
> +g_string_free(line, true);
> +}
> +
> +static void fputs_tbstats(TBStatistics *tbs, FILE *dot, int log_flags)
> +{
> +if (!tbs) {
> +return;
> +}
> +
> +GString *line = g_string_new(NULL);;
> +
> +uint32_t color = 0xFF666;
> +uint64_t count = tbs->executions.normal;
> +if (count > 1.6 * root_count) {
> +color = 0xFF000;
> +} else if (count > 1.2 * root_count) {
> +color = 0xFF333;
> +} else if (count < 0.4 * root_count) {
> +color = 0xFFCCC;
> +} else if (count < 0.8 * root_count) {
> +color = 0xFF999;
> +}
> +
> +GString *code_s = get_code_string(tbs, log_flags);
> +
> +for (int i = 0; i < code_s->len; i++) {
> +if (code_s->str[i] == '\n') {
> +code_s->str[i] = ' ';
> +code_s = g_string_insert(code_s, i, "\\l");
> +i += 2;
> +}
> +}
> +
> +g_string_printf(line,
> +"   node_%d [fillcolor=\"#%xFF\" shape=\"record\" "
> +"label=\"TB %d\\l"
> +"-\\l"
> +"PC:\t0x"TARGET_FMT_lx"\\l"
> +"exec count:\t%lu\\l"
> +"\\l %s\"];\n",
> +cfg_tb_id, color, cfg_tb_id, tbs->pc,
> +tbs->executions.normal, code_s->str);
> +
> +fputs(line->str, dot);
> +
> +int *id = g_new(int, 1);
> +*id = cfg_tb_id;
> +g_hash_table_insert(cfg_nodes, tbs, id);
> +
> +cfg_tb_id++;
> +
> +g_string_free(line, true);
> +g_string_free(code_s, true);
> +}
> +
> +static void fputs_preorder_walk(TBStatistics *tbs, int depth, FILE *dot, int 
> log_flags)
> +{
> +if (tbs && depth > 0
> +&& cfg_tb_id < MAX_CFG_NUM_NODES
> +&& !g_hash_table_contains(cfg_nodes, tbs)) {
> +
> +fputs_tbstats(tbs, dot, log_flags);
> +
> +if (tbs->tb) {
> +TranslationBlock *left_tb  = NULL;
> +TranslationBlock *right_tb = NULL;
> +if (tbs->tb->jmp_dest[0]) {
> +left_tb = (TranslationBlock *) 
> atomic_read(tbs->tb->jmp_dest);
> +}
> +if (tbs->tb->jmp_dest[1]) {
> +right_tb = (TranslationBlock *) 
> atomic_read(tbs->tb->jmp_dest + 1);
> +}
> +
> +if (left_tb) {
> +fputs_preorder_walk(left_tb->tb_stats, depth - 1, dot, 
> log_flags);
> +fputs_jump(tbs, left_tb->tb_stats, dot);
> +}
> +if (right_tb) {
> +fputs_preorder_walk(right_tb->tb_stats, depth - 1, dot, 
> log_flags);
> +fputs_jump(tbs, right_tb->tb_stats, dot);
> +}
> +}
> +}
> +}
> +
> +struct PreorderInfo {
> +TBStatistics *tbs;
> +int depth;
> +int log_flags;
> +};
> +
> +static void fputs_preorder_walk_safe(CPUState *cpu, run_on_cpu_data icmd)
> +{
> +struct PreorderInfo *info = icmd.host_ptr;
> +
> +GString *file_name = g_string_new(NULL);;
> +g_string_printf(file_name, "/tmp/qemu-cfg-tb-%d-%d.dot", id, 
> info->depth);

It's probably better to pass the filename in?

> +FILE *dot = fopen(file_name->str, "w+");
> +
> +fputs(
> +"digraph G {\n"
> +"   mclimit=1.5;\n"
> +"   rankdir=TD; ordering=out;\n"
> +"   graph[fontsize=10 fontname=\"Verdana\"];\n"
>

Re: [Qemu-devel] [PATCH v4 0/3] High downtime with 95+ throttle pct

2019-08-15 Thread Yury Kotov

Ping ping

07.08.2019, 10:42, "Yury Kotov" :
> Ping
>
> 23.07.2019, 16:42, "Yury Kotov" :
>>  Hi,
>>
>>  V4:
>>  * The test was simplified to prevent false fails.
>>
>>  V3:
>>  * Rebase fixes (migrate_set_parameter -> migrate_set_parameter_int)
>>
>>  V2:
>>  * Added a test
>>  * Fixed qemu_cond_timedwait for qsp
>>
>>  I wrote a test for migration auto converge and found out a strange thing:
>>  1. Enable auto converge
>>  2. Set max-bandwidth 1Gb/s
>>  3. Set downtime-limit 1ms
>>  4. Run standard test (just writes a byte per page)
>>  5. Wait for converge
>>  6. It's converged with 99% throttle percentage
>>  7. The result downtime was about 300-600ms 
>>
>>  It's much higher than expected 1ms. I figured out that cpu_throttle_thread()
>>  function sleeps for 100ms+ for high throttle percentage (>=95%) in VCPU 
>> thread.
>>  And it sleeps even after a cpu kick.
>>
>>  Fixed it by using timedwait for ms part of sleep.
>>  E.g timedwait(halt_cond, 1ms) + usleep(500).
>>
>>  Regards,
>>  Yury
>>
>>  Yury Kotov (3):
>>    qemu-thread: Add qemu_cond_timedwait
>>    cpus: Fix throttling during vm_stop
>>    tests/migration: Add a test for auto converge
>>
>>   cpus.c | 27 +++---
>>   include/qemu/thread.h | 18 +++
>>   tests/migration-test.c | 103 ++-
>>   util/qemu-thread-posix.c | 40 ++-
>>   util/qemu-thread-win32.c | 16 ++
>>   util/qsp.c | 18 +++
>>   6 files changed, 191 insertions(+), 31 deletions(-)
>>
>>  --
>>  2.22.0
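
The split described in the cover letter (millisecond part in a timed wait on the halt condition, remainder in a plain sleep) comes down to simple arithmetic. The sketch below shows only that arithmetic; `SleepSplit` and `split_throttle_sleep` are hypothetical names, not functions from the series, and the actual waiting calls are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU code: split a throttle sleep into a
 * millisecond part (to be spent in an interruptible timed wait on a
 * condition variable) and a sub-millisecond tail (to be spent in a
 * plain, uninterruptible sleep).
 */
typedef struct SleepSplit {
    int64_t wait_ms;  /* duration for the timed wait, in milliseconds */
    int64_t tail_us;  /* duration for the plain sleep, always below 1000 */
} SleepSplit;

SleepSplit split_throttle_sleep(int64_t sleep_us)
{
    SleepSplit s;

    assert(sleep_us >= 0);
    s.wait_ms = sleep_us / 1000;  /* whole milliseconds */
    s.tail_us = sleep_us % 1000;  /* leftover microseconds */
    return s;
}
```

With a 1500 us sleep this gives a 1 ms wait plus a 500 us tail, which matches the "timedwait(halt_cond, 1ms) + usleep(500)" example above. Only the tail stays uninterruptible, so a kick is delayed by less than a millisecond.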

Re: [Qemu-devel] [PATCH 7/7] target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word

2019-08-15 Thread Peter Maydell

On Thu, 8 Aug 2019 at 21:26, Richard Henderson
 wrote:
>
> Separate shift + extract low will result in one extra insn
> for hosts like RISC-V, MIPS, and Sparc.
>
> Signed-off-by: Richard Henderson 
> ---
>  target/arm/translate.c | 18 ++
>  1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 77154be743..9e103e4fad 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -1761,8 +1761,7 @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t 
> insn)
>  if (insn & ARM_CP_RW_BIT) { /* TMRRC */
>  iwmmxt_load_reg(cpu_V0, wrd);
>  tcg_gen_extrl_i64_i32(cpu_R[rdlo], cpu_V0);
> -tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
> -tcg_gen_extrl_i64_i32(cpu_R[rdhi], cpu_V0);
> +tcg_gen_extrh_i64_i32(cpu_R[rdhi], cpu_V0);
>  } else {/* TMCRR */
>  tcg_gen_concat_i32_i64(cpu_V0, cpu_R[rdlo], cpu_R[rdhi]);
>  iwmmxt_store_reg(cpu_V0, wrd);

This patch is fine, but I noticed while reviewing it that tcg/README
labels both the extrl_i64_i32 and extrh_i64_i32 operations as
"for 64-bit hosts only". Presumably that's a documentation error,
since we're not guarding the existing uses of the extrl_i64_i32
here with any kind of ifdeffery to restrict them to 64-bit hosts ?

thanks
-- PMM
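
For readers unfamiliar with the two ops, the value-level equivalence the patch relies on can be sketched in plain C. These `model_*` functions are illustrative stand-ins for the semantics only; they are not TCG calls and say nothing about the code a given host backend emits.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C models of the value semantics only; these are not TCG calls. */

/* Models extrl_i64_i32: keep the low 32 bits of a 64-bit value. */
uint32_t model_extrl_i64_i32(uint64_t v)
{
    return (uint32_t)v;
}

/* Models extrh_i64_i32: keep the high 32 bits of a 64-bit value. */
uint32_t model_extrh_i64_i32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Models the old two-step sequence: shift right by 32, then extract low. */
uint32_t model_shri_then_extrl(uint64_t v)
{
    uint64_t shifted = v >> 32;

    return model_extrl_i64_i32(shifted);
}

/* Both ways of getting the high word must agree for every input. */
void model_check_equivalence(uint64_t v)
{
    assert(model_extrh_i64_i32(v) == model_shri_then_extrl(v));
}
```

The single-op form yields the same value as the two-op form, which is why the replacement is a pure simplification at the value level.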

[Qemu-devel] [PATCH v2] qemu-img convert: Deprecate using -n and -o together

2019-08-15 Thread Kevin Wolf

bdrv_create options specified with -o have no effect when skipping image
creation with -n, so this doesn't make sense. Warn against the misuse
and deprecate the combination so we can make it a hard error later.

Signed-off-by: Kevin Wolf 
---

- Hopefully removed the "finger-wagging" that John saw, without stating
  that the combination doesn't have a well-defined behaviour (because
  skipping image creation and therefore ignoring -o is well-defined
  behaviour).

 qemu-img.c   | 5 +
 qemu-deprecated.texi | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/qemu-img.c b/qemu-img.c
index 79983772de..d9321f6418 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2231,6 +2231,11 @@ static int img_convert(int argc, char **argv)
 goto fail_getopt;
 }
 
+if (skip_create && options) {
+warn_report("-o has no effect when skipping image creation");
+warn_report("This will become an error in future QEMU versions.");
+}
+
 s.src_num = argc - optind - 1;
 out_filename = s.src_num >= 1 ? argv[argc - 1] : NULL;
 
diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index fff07bb2a3..f7680c08e1 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -305,6 +305,13 @@ to just export the entire image and then mount only 
/dev/nbd0p1 than
 it is to reinvoke @command{qemu-nbd -c /dev/nbd0} limited to just a
 subset of the image.
 
+@subsection qemu-img convert -n -o (since 4.2.0)
+
+All options specified in @option{-o} are image creation options, so
+they have no effect when used with @option{-n} to skip image creation.
+Silently ignored options can be confusing, so this combination of
+options will be made an error in future versions.
+
 @section Build system
 
 @subsection Python 2 support (since 4.1.0)
-- 
2.20.1

[Qemu-devel] [PATCH v2 3/7] iotests: Keep testing broken relative extent paths

2019-08-15 Thread Max Reitz

We had a test for a case where relative extent paths did not work, but
unfortunately we just fixed the underlying problem, so it works now.
This patch adds a new test case that still fails.

Signed-off-by: Max Reitz 
Reviewed-by: John Snow 
---
 tests/qemu-iotests/059 | 27 +++
 tests/qemu-iotests/059.out |  4 
 2 files changed, 31 insertions(+)

diff --git a/tests/qemu-iotests/059 b/tests/qemu-iotests/059
index fbed5f9483..10bfbaecec 100755
--- a/tests/qemu-iotests/059
+++ b/tests/qemu-iotests/059
@@ -114,6 +114,8 @@ $QEMU_IMG convert -f qcow2 -O vmdk -o 
subformat=streamOptimized "$TEST_IMG.qcow2
 
 echo
 echo "=== Testing monolithicFlat with internally generated JSON file name ==="
+
+echo '--- blkdebug ---'
 # Should work, because bdrv_dirname() works fine with blkdebug
 IMGOPTS="subformat=monolithicFlat" _make_test_img 64M
 $QEMU_IO -c "open -o 
driver=$IMGFMT,file.driver=blkdebug,file.image.filename=$TEST_IMG,file.inject-error.0.event=read_aio"
 \
@@ -122,6 +124,31 @@ $QEMU_IO -c "open -o 
driver=$IMGFMT,file.driver=blkdebug,file.image.filename=$TE
 | _filter_testdir | _filter_imgfmt | _filter_img_info
 _cleanup_test_img
 
+echo '--- quorum ---'
+# Should not work, because bdrv_dirname() does not work with quorum
+IMGOPTS="subformat=monolithicFlat" _make_test_img 64M
+cp "$TEST_IMG" "$TEST_IMG.orig"
+
+filename="json:{
+\"driver\": \"$IMGFMT\",
+\"file\": {
+\"driver\": \"quorum\",
+\"children\": [ {
+\"driver\": \"file\",
+\"filename\": \"$TEST_IMG\"
+}, {
+\"driver\": \"file\",
+\"filename\": \"$TEST_IMG.orig\"
+} ],
+\"vote-threshold\": 1
+} }"
+
+filename=$(echo "$filename" | tr '\n' ' ' | sed -e 's/\s\+/ /g')
+$QEMU_IMG info "$filename" 2>&1 \
+| sed -e "s/'json:[^']*'/\$QUORUM_FILE/g" \
+| _filter_testdir | _filter_imgfmt | _filter_img_info
+
+
 echo
 echo "=== Testing version 3 ==="
 _use_sample_img iotest-version3.vmdk.bz2
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 120cddd207..f8895ba434 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -2049,10 +2049,14 @@ wrote 512/512 bytes at offset 10240
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 === Testing monolithicFlat with internally generated JSON file name ===
+--- blkdebug ---
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 format name: IMGFMT
 cluster size: 0 bytes
 vm state offset: 0 bytes
+--- quorum ---
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+qemu-img: Could not open $QUORUM_FILE: Cannot use relative paths with VMDK 
descriptor file $QUORUM_FILE: Cannot generate a base directory for quorum nodes
 
 === Testing version 3 ===
 image: TEST_DIR/iotest-version3.IMGFMT
-- 
2.21.0

[Qemu-devel] [PATCH v2 0/7] vmdk: Misc fixes

2019-08-15 Thread Max Reitz

I made the mistake of trying to run the iotests with all non-default
subformats our vmdk driver has to offer:
- monolithicFlat
- twoGbMaxExtentSparse
- twoGbMaxExtentFlat
- streamOptimized

Many things broke, so this series fixes what I found.  It’s mostly just
iotest fixes, but there are actually two real fixes in here.


v2:
- Patch 2: Don’t treat extent filenames with protocol prefixes as
  absolute filenames – this may be the right thing to do, but:
  (1) path_combine() doesn’t (it just ignores whether the supposed
  relative filename has a potential protocol prefix), so this is how
  we handled it so far,
  (2) It would break other cases (when a filename contains a colon for
  no particular reason), as seen in iotest 126.
  That means you cannot have an extent file e.g. on an http server while
  the descriptor is on a local filesystem, but I hope nobody would ever
  want to do that.

- Patch 3: Fix paste-o [John]

- Patch 7: twoGbMaxExtentSparse works now with the change to patch 2, so
  we no longer have to mark it unsupported [Thanks for the insistent
  inquiry, John :-)]
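
To illustrate point (2) of the patch 2 note: a prefix test that looks for a colon before the first slash will also match ordinary file names that merely contain a colon. The function below is a simplified, hypothetical check for POSIX-style paths, written for this example; it is not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Simplified, hypothetical prefix check (POSIX-style paths only):
 * a name is treated as "protocol:rest" when a ':' appears before
 * the first '/'.
 */
bool looks_like_protocol(const char *path)
{
    const char *p;

    assert(path != NULL);
    p = path + strcspn(path, ":/");  /* first ':' or '/', or the end */
    return *p == ':';
}
```

Under this check, "http://host/extent.vmdk" and "image:base-f001.vmdk" both look like they carry a protocol prefix, although the second is just a local file name with a colon in it. Treating every such name as absolute would therefore change how plain relative extent names are resolved.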


git-backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/7:[] [--] 'iotests: Fix _filter_img_create()'
002/7:[0002] [FC] 'vmdk: Use bdrv_dirname() for relative extent paths'
003/7:[0002] [FC] 'iotests: Keep testing broken relative extent paths'
004/7:[] [--] 'vmdk: Reject invalid compressed writes'
005/7:[] [--] 'iotests: Disable broken streamOptimized tests'
006/7:[] [--] 'iotests: Disable 110 for vmdk.twoGbMaxExtentSparse'
007/7:[0006] [FC] 'iotests: Disable 126 for some vmdk subformats'


Max Reitz (7):
  iotests: Fix _filter_img_create()
  vmdk: Use bdrv_dirname() for relative extent paths
  iotests: Keep testing broken relative extent paths
  vmdk: Reject invalid compressed writes
  iotests: Disable broken streamOptimized tests
  iotests: Disable 110 for vmdk.twoGbMaxExtentSparse
  iotests: Disable 126 for flat vmdk subformats

 block/vmdk.c | 64 ++--
 tests/qemu-iotests/002   |  1 +
 tests/qemu-iotests/003   |  1 +
 tests/qemu-iotests/005   |  3 +-
 tests/qemu-iotests/009   |  1 +
 tests/qemu-iotests/010   |  1 +
 tests/qemu-iotests/011   |  1 +
 tests/qemu-iotests/017   |  3 +-
 tests/qemu-iotests/018   |  3 +-
 tests/qemu-iotests/019   |  3 +-
 tests/qemu-iotests/020   |  3 +-
 tests/qemu-iotests/027   |  1 +
 tests/qemu-iotests/032   |  1 +
 tests/qemu-iotests/033   |  1 +
 tests/qemu-iotests/034   |  3 +-
 tests/qemu-iotests/037   |  3 +-
 tests/qemu-iotests/059   | 34 -
 tests/qemu-iotests/059.out   | 24 +++-
 tests/qemu-iotests/063   |  3 +-
 tests/qemu-iotests/072   |  1 +
 tests/qemu-iotests/105   |  3 +-
 tests/qemu-iotests/110   |  3 +-
 tests/qemu-iotests/126   |  2 +
 tests/qemu-iotests/197   |  1 +
 tests/qemu-iotests/215   |  1 +
 tests/qemu-iotests/251   |  1 +
 tests/qemu-iotests/common.filter |  4 +-
 27 files changed, 127 insertions(+), 43 deletions(-)

-- 
2.21.0

Re: [Qemu-devel] [PATCH-for-4.2 v2 2/6] s390x/tcg: Rework MMU selection for instruction fetches

2019-08-15 Thread Cornelia Huck

On Wed, 14 Aug 2019 09:23:51 +0200
David Hildenbrand  wrote:

> Instructions are always fetched from primary address space, except when
> in home address mode. Perform the selection directly in cpu_mmu_index().
> 
> get_mem_index() is only used to perform data access, instructions are
> fetched via cpu_lduw_code(), which translates to cpu_mmu_index(env, true).
> 
> We don't care about restricting the access permissions of the TLB
> entries anymore, as we no longer enter PRIMARY entries into the
> SECONDARY MMU. Cleanup related code a bit.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/cpu.h|  7 +++
>  target/s390x/mmu_helper.c | 38 +++---
>  2 files changed, 22 insertions(+), 23 deletions(-)
> 

(...)

> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
> index 6e9c4d6151..c34e8d2021 100644
> --- a/target/s390x/mmu_helper.c
> +++ b/target/s390x/mmu_helper.c
> @@ -349,8 +349,9 @@ int mmu_translate(CPUS390XState *env, target_ulong vaddr, 
> int rw, uint64_t asc,
>  {
>  static S390SKeysState *ss;
>  static S390SKeysClass *skeyclass;
> -int r = -1;
> +uint64_t asce;
>  uint8_t key;
> +int r;
>  
>  if (unlikely(!ss)) {
>  ss = s390_get_skeys_device();
> @@ -380,36 +381,21 @@ int mmu_translate(CPUS390XState *env, target_ulong 
> vaddr, int rw, uint64_t asc,
>  
>  if (!(env->psw.mask & PSW_MASK_DAT)) {
>  *raddr = vaddr;
> -r = 0;
> -goto out;
> +goto nodat;
>  }
>  
>  switch (asc) {
>  case PSW_ASC_PRIMARY:
>  PTE_DPRINTF("%s: asc=primary\n", __func__);
> -r = mmu_translate_asce(env, vaddr, asc, env->cregs[1], raddr, flags,
> -   rw, exc);
> +asce = env->cregs[1];
>  break;
>  case PSW_ASC_HOME:
>  PTE_DPRINTF("%s: asc=home\n", __func__);
> -r = mmu_translate_asce(env, vaddr, asc, env->cregs[13], raddr, flags,
> -   rw, exc);
> +asce = env->cregs[13];
>  break;
>  case PSW_ASC_SECONDARY:
>  PTE_DPRINTF("%s: asc=secondary\n", __func__);
> -/*
> - * Instruction: Primary
> - * Data: Secondary
> - */
> -if (rw == MMU_INST_FETCH) {
> -r = mmu_translate_asce(env, vaddr, PSW_ASC_PRIMARY, 
> env->cregs[1],
> -   raddr, flags, rw, exc);
> -*flags &= ~(PAGE_READ | PAGE_WRITE);
> -} else {
> -r = mmu_translate_asce(env, vaddr, PSW_ASC_SECONDARY, 
> env->cregs[7],
> -   raddr, flags, rw, exc);
> -*flags &= ~(PAGE_EXEC);
> -}
> +asce = env->cregs[7];
>  break;
>  case PSW_ASC_ACCREG:
>  default:
> @@ -417,11 +403,17 @@ int mmu_translate(CPUS390XState *env, target_ulong 
> vaddr, int rw, uint64_t asc,
>  break;
>  }
>  
> - out:
> +/* perform the DAT translation */
> +r = mmu_translate_asce(env, vaddr, asc, asce, raddr, flags, rw, exc);
> +if (r) {
> +return r;
> +}
> +
> +nodat:
>  /* Convert real address -> absolute address */
>  *raddr = mmu_real2abs(env, *raddr);
>  
> -if (r == 0 && *raddr < ram_size) {
> +if (*raddr < ram_size) {
>  if (skeyclass->get_skeys(ss, *raddr / TARGET_PAGE_SIZE, 1, &key)) {
>  trace_get_skeys_nonzero(r);

I think you might end up here with an uninitialized r before patch 4?

>  return 0;
> @@ -441,7 +433,7 @@ int mmu_translate(CPUS390XState *env, target_ulong vaddr, 
> int rw, uint64_t asc,
>  }
>  }
>  
> -return r;
> +return 0;
>  }
>  
>  /**
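
The control flow behind the question above can be modelled in a few lines. This is a hypothetical miniature, not the s390x code: `model_translate_asce` stands in for the DAT translation, and the `last_traced_r` global stands in for the trace call that reads `r`. Initialising `r` at its declaration is an assumption made here to keep the nodat path well-defined; the series may resolve it differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the value handed to the trace point. */
int last_traced_r = -1;

/* Stand-in for the DAT translation: non-zero means a fault. */
int model_translate_asce(bool fault)
{
    return fault ? 4 : 0;
}

int model_mmu_translate(bool dat_enabled, bool fault)
{
    int r = 0;  /* initialised, so the nodat path never reads garbage */

    if (!dat_enabled) {
        goto nodat;  /* skips the only other assignment to r */
    }

    r = model_translate_asce(fault);
    if (r) {
        return r;
    }

nodat:
    last_traced_r = r;  /* models the trace call that reads r */
    assert(r == 0);     /* both paths arrive here with r == 0 */
    return 0;
}
```

Without the initialiser, the `goto nodat` path would reach the read of `r` before any assignment, which is the situation the review comment points at.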

[Qemu-devel] ANNOUNCE: libnbd 0.9.8 - prerelease of high performance NBD client library

2019-08-15 Thread Richard W.M. Jones

I'm pleased to announce a new high performance Network Block Device
(NBD) client library called libnbd.  It's written in C and there are
also bindings available for Python, OCaml and (soon) Rust.

0.9.8 is the third pre-release before the stable 1.0 version where we
freeze the API, so feedback on API-related issues is very welcome now.

Download:   http://download.libguestfs.org/libnbd/
Documentation:  https://github.com/libguestfs/libnbd/blob/master/docs/libnbd.pod
Fedora package: https://koji.fedoraproject.org/koji/packageinfo?packageID=28807
Debian package: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=933223
Git repo:   https://github.com/libguestfs/libnbd
Mailing list:   libgues...@redhat.com (no subscription required)

Here are some of the things you can do with this library ...

Connect to an NBD server and grab the first sector of the disk:
https://github.com/libguestfs/libnbd/blob/a5f8fd2f0f48e9cf2487e23750b55f67b166014f/examples/simple-fetch-first-sector.c#L14

High performance multi-threaded reads and writes, with multiple
connections and multiple commands in flight on each connection:
https://github.com/libguestfs/libnbd/blob/master/examples/threaded-reads-and-writes.c

Integrate with glib main loop:
https://github.com/libguestfs/libnbd/blob/master/examples/glib-main-loop.c

Connect to an NBD server from an interactive shell:

  $ nbdkit -f linuxdisk . &
  $ nbdsh --connect nbd://localhost

  Welcome to nbdsh, the shell for interacting with
  Network Block Device (NBD) servers.

  nbd> h.get_size()
  716266496
  nbd> buf = h.pread (512, 0)
  nbd> print ("%r" % buf)
  [prints the first sector]

Use ‘fio’ to benchmark an NBD server:

  $ nbdkit -U - memory size=256M \
--run 'export unixsocket ; fio examples/nbd.fio '

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Re: [Qemu-devel] [PATCH v1 1/2] accel/tcg: adding integration with linux perf

2019-08-15 Thread Alex Bennée



Stefan Hajnoczi  writes:

> On Wed, Aug 14, 2019 at 11:37:24PM -0300, vandersonmr wrote:
>> This commit adds support to Linux Perf in order
>> to be able to analyze qemu jitted code and
>> also to be able to see the TBs PC in it.
>
> Is there any reference to the file format?  Please include it in a code
> comment, if such a thing exists.
>
>> diff --git a/accel/tcg/perf/jitdump.c b/accel/tcg/perf/jitdump.c
>> new file mode 100644
>> index 00..6f4c0911c2
>> --- /dev/null
>> +++ b/accel/tcg/perf/jitdump.c
>> @@ -0,0 +1,180 @@
>
> License header?
>
>> +#ifdef __linux__
>
> If the entire source file is #ifdef __linux__ then Makefile.objs should
> probably contain obj-$(CONFIG_LINUX) += jitdump.o instead.  Letting the
> build system decide what to compile is cleaner than ifdeffing large
> amounts of code.
>
>> +
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "jitdump.h"
>> +#include "qemu-common.h"
>
> Please follow QEMU's header ordering conventions.  See ./HACKING "1.2.
> Include directives".
>
>> +void start_jitdump_file(void)
>> +{
>> +GString *dumpfile_name = g_string_new(NULL);;
>> +g_string_printf(dumpfile_name, "./jit-%d.dump", getpid());
>
> Simpler:
>
>   gchar *dumpfile_name = g_strdup_printf("./jit-%d.dump", getpid());
>   ...
>   g_free(dumpfile_name);
>
>> +dumpfile = fopen(dumpfile_name->str, "w+");
>
> getpid() and the global dumpfile variable make me wonder what happens
> with multi-threaded TCG?
>
>> +
>> +perf_marker = mmap(NULL, sysconf(_SC_PAGESIZE),
>
> Please mention the point of this mmap in a comment.  My best guess is
> that perf stores the /proc/$PID/maps and this is how it finds the
> jitdump file?
>
>> +  PROT_READ | PROT_EXEC,
>> +  MAP_PRIVATE,
>> +  fileno(dumpfile), 0);
>> +
>> +if (perf_marker == MAP_FAILED) {
>> +printf("Failed to create mmap marker file for perf %d\n", 
>> fileno(dumpfile));
>> +fclose(dumpfile);
>> +return;
>> +}
>> +
>> +g_string_free(dumpfile_name, TRUE);
>> +
>> +struct jitheader *header = g_new0(struct jitheader, 1);
>
> Why g_new this struct?  It's small and can be declared on the stack.
>
> Please use g_free() with g_malloc/new/etc().  It's not safe to mismatch
> glib and libc memory allocation functions.
>
>> +header->magic = 0x4A695444;
>> +header->version = 1;
>> +header->elf_mach = get_e_machine();
>> +header->total_size = sizeof(struct jitheader);
>> +header->pid = getpid();
>> +header->timestamp = get_timestamp();
>> +
>> +fwrite(header, header->total_size, 1, dumpfile);
>> +
>> +free(header);
>> +fflush(dumpfile);
>> +}
>> +
>> +void append_load_in_jitdump_file(TranslationBlock *tb)
>> +{
>> +GString *func_name = g_string_new(NULL);
>> +g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');
>
> The explicit NUL character looks strange to me.  I think the idea is to
> avoid func_name->len + 1?  Adding NUL characters to C strings can be a
> source of bugs, I would stick to convention and do len + 1 instead of
> putting NUL characters into the GString.  This is a question of style
> though.

The glib functions always add null characters so you shouldn't need to
manually.

>
>> +
>> +struct jr_code_load *load_event = g_new0(struct jr_code_load, 1);
>
> No need to allocate load_event on the heap.
>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 9621e934c0..1c26eeeb9c 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -4147,6 +4147,18 @@ STEXI
>>  Enable FIPS 140-2 compliance mode.
>>  ETEXI
>>
>> +#ifdef __linux__
>> +DEF("perf", 0, QEMU_OPTION_perf,
>> +"-perfdump jitdump files to help linux perf JIT code 
>> visualization\n",
>> +QEMU_ARCH_ALL)
>> +#endif
>> +STEXI
>> +@item -perf
>> +@findex -perf
>> +Dumps jitdump files to help linux perf JIT code visualization
>
> Suggestions on expanding the documentation:
>
> Where are the jitdump files dumped?  The current working directory?
>
> Anything to say about the naming scheme for these files?
>
> Can you include an example of how to load them into perf(1)?


--
Alex Bennée
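
As a rough illustration of the "declare it on the stack" suggestion: the sketch below builds a header by value and writes it out, so there is no allocation to release and no g_new/free mismatch to get wrong. `struct demo_jitheader` holds only the fields set in the quoted patch, in an assumed order; the real jitdump layout is defined by perf and should be taken from its documentation. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative layout only: the fields assigned in the quoted patch,
 * in an assumed order. Not the authoritative jitdump header.
 */
struct demo_jitheader {
    uint32_t magic;
    uint32_t version;
    uint32_t total_size;
    uint32_t elf_mach;
    uint32_t pid;
    uint64_t timestamp;
};

/* Build the header by value; no heap allocation involved. */
struct demo_jitheader demo_make_jitheader(uint32_t elf_mach, uint32_t pid,
                                          uint64_t timestamp)
{
    struct demo_jitheader h;

    memset(&h, 0, sizeof(h));  /* also zeroes any padding bytes */
    h.magic = 0x4A695444;      /* value used in the quoted patch */
    h.version = 1;
    h.total_size = sizeof(h);
    h.elf_mach = elf_mach;
    h.pid = pid;
    h.timestamp = timestamp;
    return h;
}

/* Write the header to an already opened file; returns 0 on success. */
int demo_write_jitheader(FILE *f, uint32_t elf_mach, uint32_t pid,
                         uint64_t timestamp)
{
    struct demo_jitheader h = demo_make_jitheader(elf_mach, pid, timestamp);

    assert(f != NULL);
    if (fwrite(&h, sizeof(h), 1, f) != 1) {
        return -1;
    }
    return fflush(f) == 0 ? 0 : -1;
}
```

The same reasoning applies to the load event later in the patch: a fixed-size record that lives for one function call does not need g_new0().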

Re: [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-15 Thread Paolo Bonzini

On 15/08/19 18:07, Igor Mammedov wrote:
> Looking at Q35 code and Seabios SMM relocation as example, if I see it
> right QEMU has:
> - SMRAM is aliased from DRAM at 0xa
> - and TSEG steals from the top of low RAM when configured
> 
> Now problem is that default SMBASE at 0x3 isn't backed by anything
> in SMRAM address space and default SMI entry falls-through to the same
> location in System address space.
> 
> The latter is not trusted and entry into SMM mode will corrupt area + might
> jump to 'random' SMI handler (hence save/restore code in Seabios).
> 
> Here is an idea, can we map a memory region at 0x3 in SMRAM address
> space with relocation space/code reserved. It could be a part of TSEG
> (so we don't have to invent ABI to configure that)?

No, there could be real mode code using it.  What we _could_ do is
initialize SMBASE to 0xa, but I think it's better to not deviate too
much from processor behavior (even if it's admittedly a 20-year legacy
that doesn't make any sense).

Paolo

[Qemu-devel] [PULL 01/33] migration: Add error_desc for file channel errors

2019-08-15 Thread Dr. David Alan Gilbert (git)

From: Yury Kotov 

Currently, no error information is reported when an outgoing migration fails
because of file channel errors.
Example (QMP session):
-> { "execute": "migrate", "arguments": { "uri": "exec:head -c 1" }}
<- { "return": {} }
...
-> { "execute": "query-migrate" }
<- { "return": { "status": "failed" }} // There is no error description

QEMU's own output contains nothing either.

This patch:
1) Adds errp to most of the QEMUFileOps
2) Adds qemu_file_get_error_obj/qemu_file_set_error_obj
3) Uses qemu_file_get_error_obj in migration.c

With this change, the status for the failure above becomes:
-> { "execute": "query-migrate" }
<- { "return": { "status": "failed",
 "error-desc": "Unable to write to command: Broken pipe" }}

Signed-off-by: Yury Kotov 
Message-Id: <20190422103420.15686-1-yury-ko...@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 10 --
 migration/qemu-file-channel.c | 30 +
 migration/qemu-file.c | 63 ---
 migration/qemu-file.h | 15 ++---
 migration/savevm.c|  6 ++--
 5 files changed, 88 insertions(+), 36 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8a607fe1e2..28342969ea 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2963,6 +2963,7 @@ static MigThrError migration_detect_error(MigrationState 
*s)
 {
 int ret;
 int state = s->state;
+Error *local_error = NULL;
 
 if (state == MIGRATION_STATUS_CANCELLING ||
 state == MIGRATION_STATUS_CANCELLED) {
@@ -2971,13 +2972,18 @@ static MigThrError 
migration_detect_error(MigrationState *s)
 }
 
 /* Try to detect any file errors */
-ret = qemu_file_get_error(s->to_dst_file);
-
+ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
 if (!ret) {
 /* Everything is fine */
+assert(!local_error);
 return MIG_THR_ERR_NONE;
 }
 
+if (local_error) {
+migrate_set_error(s, local_error);
+error_free(local_error);
+}
+
 if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
 /*
  * For postcopy, we allow the network to be down for a
diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index 8e639eb496..c382ea2d78 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -33,7 +33,8 @@
 static ssize_t channel_writev_buffer(void *opaque,
  struct iovec *iov,
  int iovcnt,
- int64_t pos)
+ int64_t pos,
+ Error **errp)
 {
 QIOChannel *ioc = QIO_CHANNEL(opaque);
 ssize_t done = 0;
@@ -47,7 +48,7 @@ static ssize_t channel_writev_buffer(void *opaque,
 
 while (nlocal_iov > 0) {
 ssize_t len;
-len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
+len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp);
 if (len == QIO_CHANNEL_ERR_BLOCK) {
 if (qemu_in_coroutine()) {
 qio_channel_yield(ioc, G_IO_OUT);
@@ -57,7 +58,6 @@ static ssize_t channel_writev_buffer(void *opaque,
 continue;
 }
 if (len < 0) {
-/* XXX handle Error objects */
 done = -EIO;
 goto cleanup;
 }
@@ -75,13 +75,14 @@ static ssize_t channel_writev_buffer(void *opaque,
 static ssize_t channel_get_buffer(void *opaque,
   uint8_t *buf,
   int64_t pos,
-  size_t size)
+  size_t size,
+  Error **errp)
 {
 QIOChannel *ioc = QIO_CHANNEL(opaque);
 ssize_t ret;
 
 do {
-ret = qio_channel_read(ioc, (char *)buf, size, NULL);
+ret = qio_channel_read(ioc, (char *)buf, size, errp);
 if (ret < 0) {
 if (ret == QIO_CHANNEL_ERR_BLOCK) {
 if (qemu_in_coroutine()) {
@@ -90,7 +91,6 @@ static ssize_t channel_get_buffer(void *opaque,
 qio_channel_wait(ioc, G_IO_IN);
 }
 } else {
-/* XXX handle Error * object */
 return -EIO;
 }
 }
@@ -100,18 +100,20 @@ static ssize_t channel_get_buffer(void *opaque,
 }
 
 
-static int channel_close(void *opaque)
+static int channel_close(void *opaque, Error **errp)
 {
+int ret;
 QIOChannel *ioc = QIO_CHANNEL(opaque);
-qio_channel_close(ioc, NULL);
+ret = qio_channel_close(ioc, errp);
 object_unref(OBJECT(ioc));
-return 0;
+return ret;
 }
 
 
 static int channel_shutdown(void *opaque,
 bool rd,
-bool wr)
+bool wr,
+
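
The idea behind the patch (low-level file operations report why they failed, and the caller turns that into a description) can be sketched without any QEMU types. Everything named `demo_*` / `DemoError` below is hypothetical and much simpler than QEMU's real Error machinery: the error is a caller-owned buffer rather than an allocated object, and NULL means "caller is not interested".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature of an error object (caller-owned storage). */
typedef struct DemoError {
    char msg[96];
} DemoError;

/* Convenience instance for callers that want a place to collect errors. */
DemoError demo_last_error;

/* Record a message if the caller supplied somewhere to put it. */
void demo_error_set(DemoError *err, const char *msg)
{
    if (err != NULL) {
        snprintf(err->msg, sizeof(err->msg), "%s", msg);
    }
}

/* Low-level operation: returns a code and, now, also says what went wrong. */
int demo_channel_write(int broken_pipe, DemoError *err)
{
    if (broken_pipe) {
        demo_error_set(err, "Unable to write to command: Broken pipe");
        return -1;
    }
    return 0;
}

/* Caller: maps the result to a status, leaving the description in *err. */
const char *demo_query_status(int broken_pipe, DemoError *err)
{
    assert(err != NULL);
    err->msg[0] = '\0';
    return demo_channel_write(broken_pipe, err) < 0 ? "failed" : "completed";
}
```

Passing NULL for the error still works and simply drops the description, which mirrors how the callbacks behaved before they gained the extra parameter.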

[Qemu-devel] [PULL 03/33] migration: consolidate time info into populate_time_info

2019-08-15 Thread Dr. David Alan Gilbert (git)

From: Wei Yang 

Consolidate the filling of time information into its own function for
better readability.

Signed-off-by: Wei Yang 
Message-Id: <20190716005411.4156-1-richardw.y...@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 28342969ea..7c66da3a83 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -823,6 +823,25 @@ bool migration_is_setup_or_active(int state)
 }
 }
 
+static void populate_time_info(MigrationInfo *info, MigrationState *s)
+{
+info->has_status = true;
+info->has_setup_time = true;
+info->setup_time = s->setup_time;
+if (s->state == MIGRATION_STATUS_COMPLETED) {
+info->has_total_time = true;
+info->total_time = s->total_time;
+info->has_downtime = true;
+info->downtime = s->downtime;
+} else {
+info->has_total_time = true;
+info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) -
+   s->start_time;
+info->has_expected_downtime = true;
+info->expected_downtime = s->expected_downtime;
+}
+}
+
 static void populate_ram_info(MigrationInfo *info, MigrationState *s)
 {
 info->has_ram = true;
@@ -908,16 +927,8 @@ static void fill_source_migration_info(MigrationInfo *info)
 case MIGRATION_STATUS_DEVICE:
 case MIGRATION_STATUS_POSTCOPY_PAUSED:
 case MIGRATION_STATUS_POSTCOPY_RECOVER:
- /* TODO add some postcopy stats */
-info->has_status = true;
-info->has_total_time = true;
-info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
-- s->start_time;
-info->has_expected_downtime = true;
-info->expected_downtime = s->expected_downtime;
-info->has_setup_time = true;
-info->setup_time = s->setup_time;
-
+/* TODO add some postcopy stats */
+populate_time_info(info, s);
 populate_ram_info(info, s);
 populate_disk_info(info);
 break;
@@ -926,14 +937,7 @@ static void fill_source_migration_info(MigrationInfo *info)
 /* TODO: display COLO specific information (checkpoint info etc.) */
 break;
 case MIGRATION_STATUS_COMPLETED:
-info->has_status = true;
-info->has_total_time = true;
-info->total_time = s->total_time;
-info->has_downtime = true;
-info->downtime = s->downtime;
-info->has_setup_time = true;
-info->setup_time = s->setup_time;
-
+populate_time_info(info, s);
 populate_ram_info(info, s);
 break;
 case MIGRATION_STATUS_FAILED:
-- 
2.21.0
