date:20190724

Re: [Qemu-devel] [PATCH for-4.2 14/14] icount: clean up cpu_can_io before jumping to the next block

2019-07-24 Thread Pavel Dovgalyuk

> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> On 24/07/19 10:44, Pavel Dovgalyuk wrote:
> > From: Pavel Dovgalyuk 
> >
> > Most of IO instructions can be executed only at the end of the block in
> > icount mode. Therefore translator can set cpu_can_io flag when translating
> > the last instruction.
> > But when the blocks are chained, then this flag is not reset and may
> > remain set at the beginning of the next block.
> > This patch resets the flag before "chaining" the translation blocks.
> >
> > Signed-off-by: Pavel Dovgalyuk 
> > ---
> >  accel/tcg/tcg-runtime.c |2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
> > index 8a1e408e31..fe6b83d0fc 100644
> > --- a/accel/tcg/tcg-runtime.c
> > +++ b/accel/tcg/tcg-runtime.c
> > @@ -151,6 +151,8 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> >  target_ulong cs_base, pc;
> >  uint32_t flags;
> >
> > +/* We are going to jump to the next block. can_do_io should be reset */
> > +cpu->can_do_io = !use_icount;
> >  tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
> >  if (tb == NULL) {
> >  return tcg_ctx->code_gen_epilogue;
> >
> 
> This only fixes indirect jumps though.
> 
> I think you do not need this patch if you remove the assignment in
> cpu_tb_exec, and compile a "move 0 to cpu->can_do_io" in gen_tb_start
> instead.

"move 0 to cpu->can_do_io" only for icount mode?
And we'll also need to set can_do_io to 1 somewhere, because it
is checked in non-icount mode too.

Pavel Dovgalyuk
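
The concern in this thread can be illustrated with a toy model. This is only a sketch and not QEMU code: ToyCPU, use_icount_model and the toy_* helpers are hypothetical names that stand in for cpu->can_do_io, use_icount and the code emitted at block boundaries. It shows that a flag set by the last instruction of one block stays set in a chained block unless something resets it on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the CPU state; not a real QEMU definition. */
typedef struct {
    bool can_do_io;   /* models cpu->can_do_io */
} ToyCPU;

static bool use_icount_model = true;   /* models the global use_icount */

/* Models the reset at the start of a block: in icount mode I/O is
 * forbidden until the last instruction explicitly allows it. */
static void toy_block_start(ToyCPU *cpu)
{
    cpu->can_do_io = !use_icount_model;
}

/* Models the code generated for the last instruction of a block. */
static void toy_block_last_insn(ToyCPU *cpu)
{
    cpu->can_do_io = true;
}

/* Runs two chained blocks and returns the flag value observed when the
 * body of the second block starts executing. */
static bool toy_flag_at_second_block(ToyCPU *cpu, bool reset_on_entry)
{
    toy_block_start(cpu);
    toy_block_last_insn(cpu);      /* first block ends with an I/O insn */
    if (reset_on_entry) {
        toy_block_start(cpu);      /* chained block resets the flag */
    }
    return cpu->can_do_io;
}
```

With use_icount_model set, the function returns true when the reset is skipped (the stale flag leaks into the next block) and false when the reset runs. With use_icount_model cleared, the reset itself sets the flag to true, which matches the remark that can_do_io must be 1 in non-icount mode.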

Re: [Qemu-devel] [PATCH] docs/nvdimm: add example on persistent backend setup

2019-07-24 Thread Pankaj Gupta



> 
> Persistent backend setup requires some knowledge about nvdimm and the ndctl
> tool. Some users report that they struggle to gather this knowledge and
> have difficulty setting it up properly.
> 
> Here we provide two examples for persistent backend and give the link
> to ndctl. By doing so, users can try it directly and investigate
> persistent backend setup with ndctl further.
> 
> Signed-off-by: Wei Yang 
> ---
>  docs/nvdimm.txt | 28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
> index b531cacd35..baba7a940d 100644
> --- a/docs/nvdimm.txt
> +++ b/docs/nvdimm.txt
> @@ -171,6 +171,32 @@ guest software that this vNVDIMM device contains a
> region that cannot
>  accept persistent writes. In result, for example, the guest Linux
>  NVDIMM driver, marks such vNVDIMM device as read-only.
>  
> +Backend File Setup Example
> +..
> +
> +Here are two examples of how to set up these persistent backends on
> +Linux, using the ndctl tool [3].
> +
> +It is easy to set up a DAX device backend file.
> +
> +A. DAX device
> +
> +ndctl create-namespace -f -e namespace0.0 -m devdax
> +
> +The /dev/dax0.0 could be used directly in "mem-path" option.
> +
> +For DAX file, it is more than creating the proper namespace. The
> +block device should be partitioned and mounted (with dax option).
> +
> +B. DAX file
> +
> +ndctl create-namespace -f -e namespace0.0 -m fsdax
> +(partition /dev/pmem0 with name pmem0p1)
> +mount -o dax /dev/pmem0p1 /mnt
> +(dd a file with proper size in /mnt)
> +
> +Then the new file in /mnt could be used in "mem-path" option.
> +
>  NVDIMM Persistence
>  --
>  
> @@ -212,3 +238,5 @@ References
>  
> https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
>  [2] Persistent Memory Development Kit (PMDK), formerly known as NVML
>  project, home page:
>  http://pmem.io/pmdk/
> +[3] ndctl-create-namespace - provision or reconfigure a namespace
> +http://pmem.io/ndctl/ndctl-create-namespace.html
> --

Reviewed-by: Pankaj Gupta 

> 2.17.1
> 
> 
>

Re: [Qemu-devel] [PATCH] docs/nvdimm: add example on persistent backend setup

2019-07-24 Thread Pankaj Gupta



> >
> >> 
> >> Persistent backend setup requires some knowledge about nvdimm and the ndctl
> >> tool. Some users report that they struggle to gather this knowledge and
> >> have difficulty setting it up properly.
> >> 
> >> Here we provide two examples for persistent backend and give the link
> >> to ndctl. By doing so, users can try it directly and investigate
> >> persistent backend setup with ndctl further.
> >> 
> >> Signed-off-by: Wei Yang 
> >> ---
> >>  docs/nvdimm.txt | 28 
> >>  1 file changed, 28 insertions(+)
> >> 
> >> diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
> >> index b531cacd35..baba7a940d 100644
> >> --- a/docs/nvdimm.txt
> >> +++ b/docs/nvdimm.txt
> >> @@ -171,6 +171,32 @@ guest software that this vNVDIMM device contains a
> >> region that cannot
> >>  accept persistent writes. In result, for example, the guest Linux
> >>  NVDIMM driver, marks such vNVDIMM device as read-only.
> >>  
> >> +Backend File Setup Example
> >> +..
> >> +
> >> +Here are two examples of how to set up these persistent backends on
> >> +Linux, using the ndctl tool [3].
> >> +
> >> +It is easy to set up a DAX device backend file.
> >> +
> >> +A. DAX device
> >> +
> >> +ndctl create-namespace -f -e namespace0.0 -m devdax
> >> +
> >> +The /dev/dax0.0 could be used directly in "mem-path" option.
> >> +
> >> +For DAX file, it is more than creating the proper namespace. The
> >> +block device should be partitioned and mounted (with dax option).
> >> +
> >> +B. DAX file
> >> +
> >> +ndctl create-namespace -f -e namespace0.0 -m fsdax
> >> +(partition /dev/pmem0 with name pmem0p1)
> >> +mount -o dax /dev/pmem0p1 /mnt
> >> +(dd a file with proper size in /mnt)
> >
> >This is not clear to me. Why is a 'dd' file required in /mnt?
> >Do you mean for creating a backend file?
> >
> 
> Yes, create a backend file. You need to give a file instead of a directory to
> qemu command line.
o.k

Thanks,
Pankaj 
> 
> >> +
> >> +Then the new file in /mnt could be used in "mem-path" option.
> >> +
> >>  NVDIMM Persistence
> >>  --
> >>  
> >> @@ -212,3 +238,5 @@ References
> >>  
> >> https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
> >>  [2] Persistent Memory Development Kit (PMDK), formerly known as NVML
> >>  project, home page:
> >>  http://pmem.io/pmdk/
> >> +[3] ndctl-create-namespace - provision or reconfigure a namespace
> >> +http://pmem.io/ndctl/ndctl-create-namespace.html
> >> --
> >
> >Looks good to me. Just a small comment above.
> >Other than that: Reviewed-by: Pankaj Gupta 
> >
> 
> Thanks
> 
> >> 2.17.1
> >> 
> >> 
> >> 
> 
> --
> Wei Yang
> Help you, Help me
> 
>
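
For readers who wonder what the "dd a file with proper size" step amounts to, the following is a minimal sketch in C, assuming all that is needed is a regular file of a fixed size on the DAX-mounted filesystem. The helper names make_backend_file and backend_file_size are hypothetical and are not part of QEMU or ndctl; the patch itself only suggests using dd.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or extend) a regular file at `path` with `size` bytes allocated.
 * Returns 0 on success, -1 on failure. Hypothetical helper. */
static int make_backend_file(const char *path, off_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -1;
    }
    /* posix_fallocate returns an error number instead of setting errno. */
    int err = posix_fallocate(fd, 0, size);
    close(fd);
    return err == 0 ? 0 : -1;
}

/* Report the size of `path` in bytes, or -1 if it cannot be examined. */
static off_t backend_file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : -1;
}
```

The resulting path is what would be handed to the "mem-path" option, as Wei Yang notes: QEMU needs a file, not the mount directory.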

Re: [Qemu-devel] [PATCH v4 0/3] restrict bridge interface name to IFNAMSIZ

2019-07-24 Thread Jason Wang




On 2019/7/24 1:44 AM, no-re...@patchew.org wrote:

Patchew URL: https://patchew.org/QEMU/20190723104754.29324-1-ppan...@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.


Prasad, this looks unrelated to the series? Please double check.

Thanks

Re: [Qemu-devel] [RFC 00/19] Add virtual device fuzzing support

2019-07-24 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20190725032321.12721-1-alx...@bu.edu/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [RFC 00/19] Add virtual device fuzzing support
Message-id: 20190725032321.12721-1-alx...@bu.edu

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20190725032321.12721-1-alx...@bu.edu -> 
patchew/20190725032321.12721-1-alx...@bu.edu
 * [new tag] 
patchew/20190725032722.32271-1-richardw.y...@linux.intel.com -> 
patchew/20190725032722.32271-1-richardw.y...@linux.intel.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path

[Qemu-devel] [RFC 19/19] fuzz: Add documentation about the fuzzer to docs/

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 docs/devel/fuzzing.txt | 145 +
 1 file changed, 145 insertions(+)
 create mode 100644 docs/devel/fuzzing.txt

diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt
new file mode 100644
index 00..321e005e8c
--- /dev/null
+++ b/docs/devel/fuzzing.txt
@@ -0,0 +1,145 @@
+= Fuzzing =
+
+== Introduction ==
+
+This document describes the fuzzing infrastructure in QEMU and how to use it
+to add additional fuzzing targets.
+
+== Basics ==
+
+Fuzzing operates by passing inputs to an entry point/target function. The
+fuzzer tracks the code coverage triggered by the input. Based on these
+findings, the fuzzer mutates the input and repeats the fuzzing. 
+
+To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer
+is an _in-process_ fuzzer. For the developer, this means that it is their
+responsibility to ensure that state is reset between fuzzing-runs.
+
+libfuzzer provides its own main() and expects the developer to implement the
+entrypoint "LLVMFuzzerTestOneInput".
+
+Currently, Fuzz targets are built out to fuzz virtual-devices from guests. The
+fuzz targets can use qtest and qos functions to pass inputs to virtual devices.
+
+== Main Modifications required for Fuzzing ==
+
+Fuzzing is enabled with the -enable-fuzzing flag, which adds the needed cflags
+to enable Libfuzzer and AddressSanitizer. In the code, most of the changes to
+existing qemu source are surrounded by #ifdef CONFIG_FUZZ statements. Here are
+the key areas that are changed:
+
+=== General Changes ===
+
+vl.c:main renamed to real_main to avoid conflicts when libfuzzer is linked in.
+Also, real_main returns where it would normally call main_loop. 
+
+The fuzzer adds an accelerator. The accelerator does not do anything, much
+like the qtest accelerator.
+
+=== Changes to SaveVM ===
+
+There aren't any particular changes to SaveVM, but the fuzzer adds a type
+of file "ramfile" implemented in tests/fuzz/ramfile.c which allocates a buffer
+on the heap to which it saves the vmstate.
+
+=== Changes to QTest ===
+
+QEMU-fuzz modifies the qtest server (qtest.c) and qtest client
+(tests/libqtest.c) so that they communicate within the same QEMU process. In
+the qtest server, there is a qtest_init_fuzz function to initialize the
+QTestState. Normally, qtest commands are passed to socket_send which
+communicates the command to the server/QEMU process over a socket. The fuzzer,
+instead, directly calls the qtest server receive function with the command
+string as an argument. The server usually responds to commands with an "OK"
+command. To support this, there is an added qtest_client_recv function in
+libqtest.c, which the server calls directly.
+
+At the moment, qtest's qmp wrapper functions are not supported.
+
+=== Changes to QOS ===
+
+QOS tests are usually linked against the compiled tests/qos-test.c. The main
+function in this file initializes the QOS graph and uses some QMP commands to
+query the qtest server for the available devices. It also registers the tests
+implemented in all of the linked qos test-case files. Then it uses a DFS walker
+to iterate over QOS graph and determine the required QEMU devices/arguments and
+device initialization functions to perform each test.
+
+The fuzzer doesn't link against qos-test, but re-uses most of the functionality
+in tests/fuzz/qos_helpers.c. The major change is that the walker simply saves
+the last QGraph path for later use in the fuzzer. The
+qos_set_machines_devices_available function is changed to directly use qmp_*
+commands. Note that to populate the QGraph, the fuzzer still needs to be linked
+against the devices described in tests/libqos/*.o
+
+== The Fuzzer's Lifecycle ==
+
+The fuzzer has two entrypoints that libfuzzer calls.
+
+LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
+necessary state.
+
+LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
+resets the state at the end of each run.
+
+In more detail:
+
+LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
+dashes, so they are ignored by libfuzzer main()). Currently, the arguments
+select the fuzz target. Then, the qtest client is initialized. If the target
+requires qos, qgraph is set up and the QOM/LIBQOS modules are initialized.
+Then the QGraph is walked and the QEMU cmd_line is determined and saved.
+
+After this, the vl.c:real_main is called to set up the guest. After this, the
+fuzzer saves the initial vm/device state to ram, after which the initialization
+is complete.
+
+LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz
+input. It is also responsible for manually calling the main loop/main_loop_wait
+to ensure that bottom halves are executed. Finally, it calls reset() which
+restores state from the ramfile and/or resets the guest.
+
+
+Since the same process is reused for many fuzzing runs, QEMU state needs to
+be reset
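
The lifecycle described in the fuzzing.txt patch above can be sketched with a toy in-process target. The two entry points use the signatures that libFuzzer expects; everything else (ToyDevice, the device and snapshot variables) is a hypothetical stand-in for the guest and the ramfile, not code from the series. The point is only the shape: initialize once, save a snapshot, then restore it at the end of every run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "device" state standing in for the guest; purely illustrative. */
typedef struct {
    uint32_t reg;
    uint32_t writes;
} ToyDevice;

static ToyDevice device;      /* live state mutated by each run */
static ToyDevice snapshot;    /* models the state saved to the ramfile */

/* libFuzzer calls this once before fuzzing starts. */
int LLVMFuzzerInitialize(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    device.reg = 0x1234;
    device.writes = 0;
    memcpy(&snapshot, &device, sizeof(device));   /* "save state" */
    return 0;
}

/* libFuzzer calls this once per input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        device.reg ^= data[i];     /* act on the input */
        device.writes++;
    }
    memcpy(&device, &snapshot, sizeof(device));   /* "reset" */
    return 0;
}
```

Because the process is reused, any state that the run touches and the snapshot does not cover would leak into the next input, which is why the document stresses that resetting is the developer's responsibility.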

[Qemu-devel] [RFC PATCH] migration/postcopy: skip compression when postcopy is active

2019-07-24 Thread Wei Yang

Now postcopy is not compatible with compression. And we disable setting
these two capability at the same time. While we can still leverage
compress before postcopy is active, for example at the bulk stage.

This patch skips compression when postcopy is active instead of
forbidding setting these capability at the same time.

Signed-off-by: Wei Yang 
---
 migration/migration.c | 11 ---
 migration/ram.c   | 10 ++
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5a496addbd..33c373033d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -995,17 +995,6 @@ static bool migrate_caps_check(bool *cap_list,
 #endif
 
 if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
-if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
-/* The decompression threads asynchronously write into RAM
- * rather than use the atomic copies needed to avoid
- * userfaulting.  It should be possible to fix the decompression
- * threads for compatibility in future.
- */
-error_setg(errp, "Postcopy is not currently compatible "
-   "with compression");
-return false;
-}
-
 /* This check is reasonably expensive, so only when it's being
  * set the first time, also it's only the destination that needs
  * special support.
diff --git a/migration/ram.c b/migration/ram.c
index da12774a24..a0d3bc60b2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2384,6 +2384,16 @@ static bool save_page_use_compression(RAMState *rs)
 return false;
 }
 
+/*
+ * The decompression threads asynchronously write into RAM
+ * rather than use the atomic copies needed to avoid
+ * userfaulting.  It should be possible to fix the decompression
+ * threads for compatibility in future.
+ */
+if (migration_in_postcopy()) {
+return false;
+}
+
 /*
  * If xbzrle is on, stop using the data compression after first
  * round of migration even if compression is enabled. In theory,
-- 
2.17.1
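
The decision this patch moves into save_page_use_compression can be modelled as a small pure function. This is a sketch under assumptions: ToyMigrationState and its fields are hypothetical and simply collect the inputs the real function queries (the compress capability, migration_in_postcopy(), the xbzrle capability and the bulk stage flag).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the real function consults. */
typedef struct {
    bool compress_enabled;   /* "compress" capability set */
    bool in_postcopy;        /* models migration_in_postcopy() */
    bool xbzrle_enabled;     /* "xbzrle" capability set */
    bool bulk_stage;         /* still in the first pass over RAM */
} ToyMigrationState;

static bool toy_save_page_use_compression(const ToyMigrationState *ms)
{
    if (!ms->compress_enabled) {
        return false;
    }
    /* Decompression threads write into RAM asynchronously, so pages sent
     * once postcopy is active must go out uncompressed. */
    if (ms->in_postcopy) {
        return false;
    }
    /* With xbzrle on, compression is only used during the bulk stage. */
    if (ms->bulk_stage || !ms->xbzrle_enabled) {
        return true;
    }
    return false;
}
```

Compared with rejecting the capability combination up front, this keeps compression available during precopy and only switches it off per page once postcopy starts.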

[Qemu-devel] [RFC 17/19] fuzz: add general qtest fuzz target

2019-07-24 Thread Oleinik, Alexander

These fuzz targets perform a range of qtest operations over mmio and
port i/o addresses mapped to devices.

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/qtest_fuzz.c | 261 
 tests/fuzz/qtest_fuzz.h |  38 ++
 2 files changed, 299 insertions(+)
 create mode 100644 tests/fuzz/qtest_fuzz.c
 create mode 100644 tests/fuzz/qtest_fuzz.h

diff --git a/tests/fuzz/qtest_fuzz.c b/tests/fuzz/qtest_fuzz.c
new file mode 100644
index 00..6d6670838d
--- /dev/null
+++ b/tests/fuzz/qtest_fuzz.c
@@ -0,0 +1,261 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+#include 
+#include "qemu-common.h"
+#include "fuzzer_hooks.h"
+
+
+#include "fuzz.h"
+#include "qtest_fuzz.h"
+#include "tests/libqtest.h"
+#include "fuzz/qos_fuzz.h"
+
+
+/* Make sure that the io_port is mapped to some device */
+static uint16_t normalize_io_port(uint64_t addr) {
+addr = addr%total_io_mem;
+fuzz_memory_region *fmr = fuzz_memory_region_head;
+while(addr!=0) {
+if(!fmr->io){
+fmr = fmr->next;
+continue;
+}
+if(addr <= fmr->length)
+{
+addr= fmr->start + addr;
+break;
+}
+addr -= fmr->length +1;
+fmr = fmr->next;
+}
+/* Stuff that times out or hotplugs.. */
+if(addr>=0x5655 && addr<=0x565b)
+return 0;
+if(addr>=0x510 && addr<=0x518)
+return 0;
+if(addr>=0xae00 && addr<=0xae13) // PCI Hotplug
+return 0;
+if(addr>=0xaf00 && addr<=0xaf1f) // CPU Hotplug
+return 0;
+return addr;
+}
+
+/* Make sure that the memory address is mapped to some interesting device */
+static uint16_t normalize_mem_addr(uint64_t addr) {
+addr = addr%total_ram_mem;
+fuzz_memory_region *fmr = fuzz_memory_region_head;
+while(addr!=0) {
+if(fmr->io){
+fmr = fmr->next;
+continue;
+}
+if(addr <= fmr->length)
+{
+return fmr->start + addr;
+}
+addr -= fmr->length +1;
+fmr = fmr->next;
+}
+return addr;
+}
+
+static void qtest_fuzz(const unsigned char *Data, size_t Size){
+const unsigned char *pos = Data;
+const unsigned char *End = Data + Size;
+
+qtest_cmd *cmd;
+
+while(pos < Data+Size)
+{
+/* Translate the fuzz input to a qtest command */
+cmd = &commands[(*pos)%(sizeof(commands)/sizeof(qtest_cmd))];
+pos++;
+
+if(strcmp(cmd->name, "clock_step") == 0){
+// TODO: This times out
+/* qtest_clock_step_next(s); */
+} 
+else if(strcmp(cmd->name, "outb") == 0) {
+if(pos + sizeof(uint16_t) + sizeof(uint8_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+uint8_t val = *(uint8_t*)(pos);
+pos += sizeof(uint8_t);
+addr = normalize_io_port(addr);
+qtest_outb(s, addr, val);
+}
+}
+else if(strcmp(cmd->name, "outw") == 0) {
+if(pos + sizeof(uint16_t) + sizeof(uint16_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+uint16_t val = *(uint16_t*)(pos);
+pos += sizeof(uint16_t);
+addr = normalize_io_port(addr);
+qtest_outw(s, addr, val);
+}
+}
+else if(strcmp(cmd->name, "outl") == 0) {
+if(pos + sizeof(uint16_t) + sizeof(uint32_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+uint32_t val = *(uint32_t*)(pos);
+pos += sizeof(uint32_t);
+addr = normalize_io_port(addr);
+qtest_outl(s, addr, val);
+}
+}
+else if(strcmp(cmd->name, "inb") == 0) {
+if(pos + sizeof(uint16_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+addr = normalize_io_port(addr);
+qtest_inb(s, addr);
+}
+}
+else if(strcmp(cmd->name, "inw") == 0) {
+if(pos + sizeof(uint16_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+addr = normalize_io_port(addr);
+qtest_inw(s, addr);
+}
+}
+else if(strcmp(cmd->name, "inl") == 0) {
+if(pos + sizeof(uint16_t) < End) {
+uint16_t addr = *(int16_t*)(pos);
+pos += sizeof(uint16_t);
+addr = normalize_io_port(addr);
+qtest_inl(s, addr);
+}
+}
+else if(strcmp(cmd->name, "writeb") == 0) {
+
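
The idea behind normalize_io_port and normalize_mem_addr, mapping an arbitrary fuzzer-chosen value onto an address that some device actually claims, can be sketched independently of QEMU. The version below is an illustration and not the patch's algorithm: ToyRegion and toy_normalize are hypothetical, the regions live in an array instead of the fuzz_memory_region list, and the offset arithmetic treats length as a count of addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool io;          /* true for port I/O, false for MMIO */
    uint64_t start;   /* first address of the region */
    uint64_t length;  /* number of addresses in the region */
} ToyRegion;

/* Map `value` onto an address inside one of the regions whose `io` flag
 * equals `want_io`. Returns false if no such region exists. */
static bool toy_normalize(const ToyRegion *regions, size_t n, bool want_io,
                          uint64_t value, uint64_t *out)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (regions[i].io == want_io) {
            total += regions[i].length;
        }
    }
    if (total == 0) {
        return false;
    }
    uint64_t offset = value % total;   /* index into the concatenated space */
    for (size_t i = 0; i < n; i++) {
        if (regions[i].io != want_io) {
            continue;
        }
        if (offset < regions[i].length) {
            *out = regions[i].start + offset;
            return true;
        }
        offset -= regions[i].length;
    }
    return false;   /* not reached when the lengths are consistent */
}
```

Reducing the value modulo the total mapped size means every input byte pattern lands on a mapped address, so the fuzzer does not waste runs on unmapped ports. A deny list such as the hotplug ranges in the patch would be applied to the result afterwards.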

[Qemu-devel] [RFC 16/19] fuzz: add general fuzzer entrypoints

2019-07-24 Thread Oleinik, Alexander

Defines LLVMFuzzerInitialize and LLVMFuzzerTestOneInput

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/fuzz.c | 262 ++
 tests/fuzz/fuzz.h |  96 +
 2 files changed, 358 insertions(+)
 create mode 100644 tests/fuzz/fuzz.c
 create mode 100644 tests/fuzz/fuzz.h

diff --git a/tests/fuzz/fuzz.c b/tests/fuzz/fuzz.c
new file mode 100644
index 00..0421b9402c
--- /dev/null
+++ b/tests/fuzz/fuzz.c
@@ -0,0 +1,262 @@
+#include "tests/fuzz/ramfile.h"
+#include "migration/qemu-file.h"
+#include "migration/global_state.h"
+#include "migration/savevm.h"
+#include "tests/libqtest.h"
+#include "exec/memory.h"
+#include "migration/migration.h"
+#include "fuzz.h"
+#include "tests/libqos/qgraph.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+QTestState *s;
+
+QEMUFile *ramfile;
+QEMUFile *writefile;
+ram_disk *rd; 
+typedef QSLIST_HEAD(, FuzzTarget) FuzzTargetList;
+
+FuzzTargetList* fuzz_target_list;
+
+uint64_t total_mr_size = 0;
+uint64_t mr_index = 0;
+
+const MemoryRegion* mrs[1000];
+
+
+// Save just the VMStateDescriptors
+void save_device_state(void)
+{
+writefile = qemu_fopen_ram();
+global_state_store();
+qemu_save_device_state(writefile);
+qemu_fflush(writefile);
+ramfile = qemu_fopen_ro_ram(rd);
+}
+
+// Save the entire vm state including RAM
+void save_vm_state(void) 
+{
+writefile = qemu_fopen_ram();
+vm_stop(RUN_STATE_SAVE_VM);
+global_state_store();
+qemu_savevm_state(writefile, NULL);
+qemu_fflush(writefile);
+ramfile = qemu_fopen_ro_ram(rd);
+}
+
+/* Reset state by rebooting */
+void reboot()
+{
+qemu_system_reset(SHUTDOWN_CAUSE_NONE);
+}
+
+/* Restore device state */
+void load_device_state()
+{
+qemu_freopen_ro_ram(ramfile);
+
+int ret = qemu_load_device_state(ramfile);
+if (ret < 0){
+printf("reset error\n");
+exit(-1);
+}
+}
+
+/* Restore full vm state */
+void load_vm_state()
+{
+qemu_freopen_ro_ram(ramfile);
+
+vm_stop(RUN_STATE_RESTORE_VM);
+/* qemu_system_reset(SHUTDOWN_CAUSE_NONE); */
+
+int ret = qemu_loadvm_state(ramfile);
+if (ret < 0){
+printf("reset error\n");
+exit(-1);
+}
+migration_incoming_state_destroy();
+vm_start();
+}
+
+void qtest_setup()
+{
+s = qtest_init_fuzz(NULL, NULL);
+global_qtest = s;
+}
+
+void fuzz_add_target(const char* name,
+const char* description,
+void(*init_pre_main)(void),
+void(*init_pre_save)(void),
+void(*save_state)(void),
+void(*reset)(void),
+void(*pre_fuzz)(void),
+void(*fuzz)(const unsigned char*, size_t),
+void(*post_fuzz)(void),
+int* main_argc,
+char*** main_argv)
+{
+
+FuzzTarget *target;
+FuzzTarget *tmp;
+if(!fuzz_target_list)
+fuzz_target_list = g_new0(FuzzTargetList, 1);
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (g_strcmp0(tmp->name->str, name) == 0) {
+fprintf(stderr, "Error: Fuzz target name %s already in use\n", name);
+abort();
+}
+}
+target = g_new0(FuzzTarget, 1);
+target->name = g_string_new(name);
+target->description = g_string_new(description);
+target->init_pre_main = init_pre_main;
+target->init_pre_save = init_pre_save;
+target->save_state = save_state;
+target->reset = reset;
+target->pre_fuzz = pre_fuzz;
+target->fuzz = fuzz;
+target->post_fuzz = post_fuzz;
+target->main_argc = main_argc;
+target->main_argv = main_argv;
+QSLIST_INSERT_HEAD(fuzz_target_list, target, target_list);
+}
+
+
+FuzzTarget* fuzz_get_target(char* name)
+{
+FuzzTarget* tmp;
+if(!fuzz_target_list){
+fprintf(stderr, "Fuzz target list not initialized");
+abort();
+}
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (g_strcmp0(tmp->name->str, name) == 0) {
+break;
+}
+}
+return tmp;
+}
+
+FuzzTarget* fuzz_target;
+
+
+
+static void usage(void)
+{
+printf("Usage: ./fuzz --FUZZ_TARGET [LIBFUZZER ARGUMENTS]\n");
+printf("where --FUZZ_TARGET is one of:\n");
+FuzzTarget* tmp;
+if(!fuzz_target_list){
+fprintf(stderr, "Fuzz target list not initialized");
+abort();
+}
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+printf(" --%s  : %s\n", tmp->name->str, tmp->description->str);
+}
+exit(0);
+}
+
+// TODO: Replace this with QEMU's built-in linked list
+static void enum_memory(void)
+{
+mtree_info(true, true, true);
+fuzz_memory_region *fmr = g_new0(fuzz_memory_region, 1);
+
+fmr->io = false;
+fmr->start = 0x10;
+fmr->length = 0x1;
+fmr->next = fuzz_memory_region_head;
+fuzz_memory_region_tail->next = fmr;
+fuzz_memory_region_tail = fmr;
+fmr =
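
The registration scheme used by fuzz_add_target and fuzz_get_target (a singly linked list keyed by name, with duplicate names rejected) can be sketched without the QSLIST macros or GLib. ToyTarget, toy_add_target and toy_get_target are hypothetical names; unlike the patch, this sketch reports a duplicate through the return value instead of calling abort().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyTarget {
    const char *name;
    const char *description;
    void (*fuzz)(const unsigned char *data, size_t size);
    struct ToyTarget *next;
} ToyTarget;

static ToyTarget *toy_targets;   /* head of a singly linked list */

/* Look a target up by name; returns NULL when it is not registered. */
static ToyTarget *toy_get_target(const char *name)
{
    for (ToyTarget *t = toy_targets; t != NULL; t = t->next) {
        if (strcmp(t->name, name) == 0) {
            return t;
        }
    }
    return NULL;
}

/* Register a target. Returns 0 on success, -1 if the name is already
 * taken or the allocation fails. */
static int toy_add_target(const char *name, const char *description,
                          void (*fuzz)(const unsigned char *, size_t))
{
    if (toy_get_target(name) != NULL) {
        return -1;
    }
    ToyTarget *t = calloc(1, sizeof(*t));
    if (t == NULL) {
        return -1;
    }
    t->name = name;
    t->description = description;
    t->fuzz = fuzz;
    t->next = toy_targets;   /* insert at the head, as the patch does */
    toy_targets = t;
    return 0;
}
```

A usage() routine like the one in the patch then only needs to walk the list once and print each name and description.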

[Qemu-devel] [RFC 14/19] fuzz: hard-code a main-loop timeout

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 util/main-loop.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/util/main-loop.c b/util/main-loop.c
index e3eaa55866..708e6be5eb 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -513,6 +513,9 @@ void main_loop_wait(int nonblocking)
 timeout_ns = qemu_soonest_timeout(timeout_ns,
                                   timerlistgroup_deadline_ns(
                                       &main_loop_tlg));
+#ifdef CONFIG_FUZZ
+timeout_ns = 5;
+#endif
 
 ret = os_host_main_loop_wait(timeout_ns);
 mlpoll.state = ret < 0 ? MAIN_LOOP_POLL_ERR : MAIN_LOOP_POLL_OK;
-- 
2.20.1
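
The effect of this hunk can be sketched as follows. The helper names are hypothetical, and a runtime flag replaces the CONFIG_FUZZ preprocessor check so that both paths can be exercised in one build. The first helper mirrors the idea of qemu_soonest_timeout, where -1 means "no deadline".

```c
#include <assert.h>
#include <stdint.h>

/* Pick the sooner of two deadlines, where -1 means "no deadline".
 * Comparing as unsigned makes -1 the largest value, so it always loses. */
static int64_t toy_soonest_timeout(int64_t a, int64_t b)
{
    return (uint64_t)a < (uint64_t)b ? a : b;
}

/* Compute the poll timeout, optionally forcing the fuzzing value. */
static int64_t toy_poll_timeout(int64_t requested_ns, int64_t timer_ns,
                                int fuzzing)
{
    int64_t timeout_ns = toy_soonest_timeout(requested_ns, timer_ns);
    if (fuzzing) {
        timeout_ns = 5;   /* the constant hard-coded by the patch */
    }
    return timeout_ns;
}
```

Note that the value is in nanoseconds, so the fuzzing build effectively polls without blocking, regardless of what the timers requested.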

[Qemu-devel] [RFC 13/19] fuzz: add ctrl vq support to virtio-net in libqos

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 tests/libqos/virtio-net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/libqos/virtio-net.c b/tests/libqos/virtio-net.c
index 66405b646e..247a0a17a8 100644
--- a/tests/libqos/virtio-net.c
+++ b/tests/libqos/virtio-net.c
@@ -51,7 +51,7 @@ static void virtio_net_setup(QVirtioNet *interface)
 if (features & (1u << VIRTIO_NET_F_MQ)) {
 interface->n_queues = qvirtio_config_readw(vdev, 8) * 2;
 } else {
-interface->n_queues = 2;
+interface->n_queues = 3;
 }
 
 interface->queues = g_new(QVirtQueue *, interface->n_queues);
-- 
2.20.1
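
The queue count after this change can be sketched as below. The names are hypothetical; the only facts taken from outside the patch are that VIRTIO_NET_F_MQ is feature bit 22 and that, without multiqueue, a virtio-net device orders its queues as receive, transmit, control. The sketch mirrors the patched function, including the unchanged multiqueue branch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VIRTIO_NET_F_MQ 22   /* feature bit number of VIRTIO_NET_F_MQ */

/* Queue indices for a single-pair device. */
enum { TOY_RX_QUEUE = 0, TOY_TX_QUEUE = 1, TOY_CTRL_QUEUE = 2 };

static unsigned toy_virtio_net_n_queues(uint64_t features, uint16_t max_pairs)
{
    if (features & (UINT64_C(1) << TOY_VIRTIO_NET_F_MQ)) {
        return (unsigned)max_pairs * 2;   /* unchanged by the patch */
    }
    return TOY_CTRL_QUEUE + 1;            /* rx, tx and ctrl */
}
```

With a count of 3, index 2 becomes a valid entry in the queues array, which is the index the ctrl fuzz target later in the series uses.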

[Qemu-devel] [RFC 18/19] fuzz: Add virtio-net tx and ctrl fuzz targets

2019-07-24 Thread Oleinik, Alexander

These virtio-net fuzz targets use libqos abstractions to virtio-net
virtqueues.

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/virtio-net-fuzz.c | 226 +++
 1 file changed, 226 insertions(+)
 create mode 100644 tests/fuzz/virtio-net-fuzz.c

diff --git a/tests/fuzz/virtio-net-fuzz.c b/tests/fuzz/virtio-net-fuzz.c
new file mode 100644
index 00..4b6c788498
--- /dev/null
+++ b/tests/fuzz/virtio-net-fuzz.c
@@ -0,0 +1,226 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+
+#include "hw/virtio/virtio-net.h"
+#include "hw/virtio/virtio.h"
+#include "tests/libqos/virtio-net.h"
+#include "fuzzer_hooks.h"
+
+#include "fuzz.h"
+#include "qos_fuzz.h"
+
+typedef struct vq_action {
+uint8_t queue;
+uint8_t length;
+uint8_t write;
+uint8_t next;
+bool kick;
+} vq_action;
+
+static void virtio_net_ctrl_fuzz(const unsigned char *Data, size_t Size)
+{
+uint64_t req_addr[10];
+int reqi =0;
+uint32_t free_head;
+
+QGuestAllocator *t_alloc = qos_alloc;
+
+QVirtioNet *net_if = qos_obj;
+QVirtioDevice *dev = net_if->vdev;
+QVirtQueue *q;
+vq_action vqa;
+int iters=0;
+while(true) {
+if(Size < sizeof(vqa)) {
+break;
+}
+vqa = *((vq_action*)Data);
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+q = net_if->queues[2];
+
+vqa.length = vqa.length >= Size ? Size :  vqa.length;
+
+req_addr[reqi] = guest_alloc(t_alloc, vqa.length);
+memwrite(req_addr[reqi], Data, vqa.length);
+if(iters == 0)
+free_head = qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+else
+qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+iters++;
+reqi++;
+if(iters==10)
+break;
+Data += vqa.length;
+Size -= vqa.length;
+}
+if(iters){
+qvirtqueue_kick(dev, q, free_head);
+qtest_clock_step_next(s);
+main_loop_wait(false);
+for(int i =0; ivdev;
+QVirtQueue *q;
+vq_action vqa;
+int iters=0;
+while(Size >= sizeof(vqa)) {
+vqa = *((vq_action*)Data);
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+if(vqa.kick && free_head)
+{
+qvirtqueue_kick(dev, q, free_head);
+qtest_clock_step_next(s);
+main_loop_wait(false);
+for(int i =0; iqueues[2];
+
+vqa.length = vqa.length >= Size ? Size :  vqa.length;
+
+req_addr[reqi] = guest_alloc(t_alloc, vqa.length);
+memwrite(req_addr[reqi], Data, vqa.length);
+if(iters == 0)
+free_head = qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+else
+qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+iters++;
+reqi++;
+if(iters==10)
+break;
+Data += vqa.length;
+Size -= vqa.length;
+}
+}
+qtest_clear_rxbuf(s);
+qos_object_queue_destroy(qos_obj);
+}
+
+int *sv;
+static void virtio_net_tx_fuzz(const unsigned char *Data, size_t Size)
+{
+uint64_t req_addr[10];
+int reqi =0;
+uint32_t free_head;
+
+QGuestAllocator *t_alloc = qos_alloc;
+
+QVirtioNet *net_if = qos_obj;
+QVirtioDevice *dev = net_if->vdev;
+QVirtQueue *q;
+vq_action vqa;
+int iters=0;
+while(true) {
+if(Size < sizeof(vqa)) {
+break;
+}
+vqa = *((vq_action*)Data);
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+q = net_if->queues[1];
+
+vqa.length = vqa.length >= Size ? Size :  vqa.length;
+
+req_addr[reqi] = guest_alloc(t_alloc, vqa.length);
+memwrite(req_addr[reqi], Data, vqa.length);
+if(iters == 0)
+free_head = qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+else
+qvirtqueue_add(q, req_addr[reqi], vqa.length, vqa.write, vqa.next);
+iters++;
+reqi++;
+if(iters==10)
+break;
+Data += vqa.length;
+Size -= vqa.length;
+}
+if(iters){
+qvirtqueue_kick(dev, q, free_head);
+qtest_clock_step_next(s);
+main_loop_wait(false);
+for(int i =0; i

[Qemu-devel] [RFC 15/19] fuzz: add fuzz accelerator type

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 accel/fuzz.c  | 47 +++
 include/sysemu/fuzz.h | 15 ++
 2 files changed, 62 insertions(+)
 create mode 100644 accel/fuzz.c
 create mode 100644 include/sysemu/fuzz.h

diff --git a/accel/fuzz.c b/accel/fuzz.c
new file mode 100644
index 00..1694cf46e8
--- /dev/null
+++ b/accel/fuzz.c
@@ -0,0 +1,47 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+#include "qemu/option.h"
+#include "qemu/config-file.h"
+#include "sysemu/accel.h"
+#include "sysemu/fuzz.h"
+#include "sysemu/cpus.h"
+
+
+static void fuzz_setup_post(MachineState *ms, AccelState *accel) {
+}
+
+static int fuzz_init_accel(MachineState *ms)
+{
+QemuOpts *opts = qemu_opts_create(qemu_find_opts("icount"), NULL, 0,
+  &error_abort);
+qemu_opt_set(opts, "shift", "0", &error_abort);
+configure_icount(opts, &error_abort);
+qemu_opts_del(opts);
+return 0;
+}
+
+static void fuzz_accel_class_init(ObjectClass *oc, void *data)
+{
+AccelClass *ac = ACCEL_CLASS(oc);
+ac->name = "fuzz";
+ac->init_machine = fuzz_init_accel;
+ac->setup_post = fuzz_setup_post;
+ac->allowed = &fuzz_allowed;
+}
+
+#define TYPE_FUZZ_ACCEL ACCEL_CLASS_NAME("fuzz")
+
+static const TypeInfo fuzz_accel_type = {
+.name = TYPE_FUZZ_ACCEL,
+.parent = TYPE_ACCEL,
+.class_init = fuzz_accel_class_init,
+};
+
+static void fuzz_type_init(void)
+{
+type_register_static(&fuzz_accel_type);
+}
+
+type_init(fuzz_type_init);
+
diff --git a/include/sysemu/fuzz.h b/include/sysemu/fuzz.h
new file mode 100644
index 00..09a2a9ffdf
--- /dev/null
+++ b/include/sysemu/fuzz.h
@@ -0,0 +1,15 @@
+#ifndef FUZZ_H
+#define FUZZ_H
+
+bool fuzz_allowed;
+
+static inline bool fuzz_enabled(void)
+{
+return fuzz_allowed;
+}
+
+bool fuzz_driver(void);
+
+void fuzz_init(const char *fuzz_chrdev, const char *fuzz_log, Error **errp);
+
+#endif
-- 
2.20.1

[Qemu-devel] [RFC 08/19] fuzz: add shims to intercept libfuzzer init

2019-07-24 Thread Oleinik, Alexander

Intercept coverage buffer registration calls and use this information to
copy them to shared memory, if using fork() to avoid resetting device
state.
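
As a rough illustration of the bookkeeping described above (not code from this patch), the sketch below records each registered coverage region in a fixed-size table and refuses to overflow it. The names `coverage_region`, `record_region` and `total_region_length` are hypothetical. In the patch itself the recording is done inside `__wrap_` functions that the linker substitutes via `-Wl,--wrap`; that interception is left out here so the example builds without special linker flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 10

/* One coverage buffer registered by the instrumentation (hypothetical). */
typedef struct {
    uint8_t *start;
    size_t length;
} coverage_region;

static coverage_region regions[MAX_REGIONS];
static size_t region_count;

/* Record [start, stop). Returns false if the range is invalid or the
 * table is already full, so callers never write past the array. */
static bool record_region(uint8_t *start, uint8_t *stop)
{
    if (!start || stop < start || region_count == MAX_REGIONS) {
        return false;
    }
    regions[region_count].start = start;
    regions[region_count].length = (size_t)(stop - start);
    region_count++;
    return true;
}

/* Size of the shared-memory area needed to hold every recorded region. */
static size_t total_region_length(void)
{
    size_t total = 0;

    for (size_t i = 0; i < region_count; i++) {
        total += regions[i].length;
    }
    return total;
}
```

Filling in both fields and then incrementing the index exactly once per call keeps each table entry complete, and the summed length is what a shared mapping for all regions would need.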

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/fuzzer_hooks.c | 106 ++
 tests/fuzz/fuzzer_hooks.h |   9 
 2 files changed, 115 insertions(+)
 create mode 100644 tests/fuzz/fuzzer_hooks.c
 create mode 100644 tests/fuzz/fuzzer_hooks.h

diff --git a/tests/fuzz/fuzzer_hooks.c b/tests/fuzz/fuzzer_hooks.c
new file mode 100644
index 00..5a0bbec413
--- /dev/null
+++ b/tests/fuzz/fuzzer_hooks.c
@@ -0,0 +1,106 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "fuzzer_hooks.h"
+
+#include 
+#include 
+
+
+extern void* _ZN6fuzzer3TPCE;
+// The libfuzzer handlers
+void __real___sanitizer_cov_8bit_counters_init(uint8_t*, uint8_t*);
+void __real___sanitizer_cov_trace_pc_guard_init(uint8_t*, uint8_t*);
+
+void __wrap___sanitizer_cov_8bit_counters_init(uint8_t *Start, uint8_t *Stop);
+void __wrap___sanitizer_cov_trace_pc_guard_init(uint8_t *Start, uint8_t *Stop);
+
+
+void* counter_shm;
+
+typedef struct CoverageRegion {
+uint8_t* start;
+size_t length;
+bool store; /* Set this if it needs to be copied to the forked process */
+} CoverageRegion;
+
+CoverageRegion regions[10];
+int region_index = 0;
+
+void __wrap___sanitizer_cov_8bit_counters_init(uint8_t *Start, uint8_t *Stop)
+{
+regions[region_index].start = Start;
+regions[region_index].length = Stop-Start;
+regions[region_index].store = true;
+region_index++;
+__real___sanitizer_cov_8bit_counters_init(Start, Stop);
+}
+
+void __wrap___sanitizer_cov_trace_pc_guard_init(uint8_t *Start, uint8_t *Stop)
+{
+regions[region_index].start = Start;
+regions[region_index].length = Stop-Start;
+regions[region_index].store = true;
+region_index++;
+__real___sanitizer_cov_trace_pc_guard_init(Start, Stop);
+}
+
+static void add_tpc_region(void)
+{
+/* Got symbol and length from readelf. Horrible way to do this! */
+regions[region_index].start = (uint8_t*)(&_ZN6fuzzer3TPCE);
+regions[region_index].length = 0x443c00; 
+regions[region_index].store = true;
+region_index++;
+}
+
+void counter_shm_init(void)
+{
+/*
+ * Add the  internal libfuzzer object that gets modified by cmp, etc
+ * callbacks
+ */
+add_tpc_region(); 
+
+size_t length = 0;
+for(int i=0; i

[Qemu-devel] [RFC 11/19] fuzz: add direct send/receive in qtest client

2019-07-24 Thread Oleinik, Alexander

Directly interact with tests/libqtest.c functions

Signed-off-by: Alexander Oleinik 
---
 qtest.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/qtest.c b/qtest.c
index 15e27e911f..a6134d3ed0 100644
--- a/qtest.c
+++ b/qtest.c
@@ -31,6 +31,9 @@
 #ifdef TARGET_PPC64
 #include "hw/ppc/spapr_rtas.h"
 #endif
+#ifdef CONFIG_FUZZ
+#include "tests/libqtest.h"
+#endif
 
 #define MAX_IRQ 256
 
@@ -231,10 +234,14 @@ static void GCC_FMT_ATTR(1, 2) qtest_log_send(const char *fmt, ...)
 
 static void do_qtest_send(CharBackend *chr, const char *str, size_t len)
 {
+#ifdef CONFIG_FUZZ
+qtest_client_recv(str, len);
+#else
 qemu_chr_fe_write_all(chr, (uint8_t *)str, len);
 if (qtest_log_fp && qtest_opened) {
 fprintf(qtest_log_fp, "%s", str);
 }
+#endif
 }
 
 static void qtest_send(CharBackend *chr, const char *str)
@@ -748,8 +755,11 @@ static void qtest_event(void *opaque, int event)
 break;
 }
 }
-
+#ifdef CONFIG_FUZZ
+void qtest_init_server(const char *qtest_chrdev, const char *qtest_log, Error **errp)
+#else
 void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp)
+#endif
 {
 Chardev *chr;
 
@@ -781,3 +791,10 @@ bool qtest_driver(void)
 {
 return qtest_chr.chr != NULL;
 }
+#ifdef CONFIG_FUZZ
+void qtest_server_recv(GString *inbuf)
+{
+qtest_process_inbuf(NULL, inbuf);
+}
+#endif
+
-- 
2.20.1

[Qemu-devel] [RFC 12/19] fuzz: hard-code all of the needed files for build

2019-07-24 Thread Oleinik, Alexander

Once the fuzzer is better-integrated into the build-system, this should
go away

Signed-off-by: Alexander Oleinik 
---
 target/i386/Makefile.objs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/Makefile.objs b/target/i386/Makefile.objs
index 3d646848ef..c8834f6ad1 100644
--- a/target/i386/Makefile.objs
+++ b/target/i386/Makefile.objs
@@ -23,7 +23,7 @@ endif
 # I find a better way to integrate into the build system
 ifeq ($(CONFIG_FUZZ),y)
 obj-$(CONFIG_FUZZ) += ../../tests/fuzz/ramfile.o ../../accel/fuzz.o
-obj-$(CONFIG_FUZZ) += ../../tests/fuzz/fuzz.o
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/fuzz.o ../../tests/fuzz/fuzzer_hooks.o
 obj-$(CONFIG_FUZZ) += ../../tests/fuzz/virtio-net-fuzz.o 
 obj-$(CONFIG_FUZZ) += ../../tests/fuzz/qtest_fuzz.o
 obj-$(CONFIG_FUZZ) += ../../tests/libqtest.o
-- 
2.20.1

[Qemu-devel] [RFC 10/19] fuzz: expose real_main (aka regular vl.c:main)

2019-07-24 Thread Oleinik, Alexander

Export normal qemu-system main so it can be called from tests/fuzz/fuzz.c

Signed-off-by: Alexander Oleinik 
---
 include/sysemu/sysemu.h |  4 
 vl.c| 21 -
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 984c439ac9..1bb8cf184c 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -184,6 +184,10 @@ QemuOpts *qemu_get_machine_opts(void);
 
 bool defaults_enabled(void);
 
+#ifdef CONFIG_FUZZ
+int real_main(int argc, char **argv, char **envp);
+#endif
+
 extern QemuOptsList qemu_legacy_drive_opts;
 extern QemuOptsList qemu_common_drive_opts;
 extern QemuOptsList qemu_drive_opts;
diff --git a/vl.c b/vl.c
index b426b32134..b71b99b6f8 100644
--- a/vl.c
+++ b/vl.c
@@ -130,6 +130,10 @@ int main(int argc, char **argv)
 #include "sysemu/iothread.h"
 #include "qemu/guest-random.h"
 
+#ifdef CONFIG_FUZZ
+#include "tests/libqtest.h"
+#endif
+
 #define MAX_VIRTIO_CONSOLES 1
 
 static const char *data_dir[16];
@@ -2853,8 +2857,11 @@ static void user_register_global_props(void)
 qemu_opts_foreach(qemu_find_opts("global"),
   global_init_func, NULL, NULL);
 }
-
+#ifdef CONFIG_FUZZ
+int real_main(int argc, char **argv, char **envp)
+#else
 int main(int argc, char **argv, char **envp)
+#endif
 {
 int i;
 int snapshot, linux_boot;
@@ -2903,7 +2910,9 @@ int main(int argc, char **argv, char **envp)
 atexit(qemu_run_exit_notifiers);
 qemu_init_exec_dir(argv[0]);
 
+#ifndef CONFIG_FUZZ // QOM is already set up by the fuzzer.
 module_call_init(MODULE_INIT_QOM);
+#endif
 
 qemu_add_opts(&qemu_drive_opts);
 qemu_add_drive_opts(&qemu_legacy_drive_opts);
@@ -4196,9 +4205,11 @@ int main(int argc, char **argv, char **envp)
  */
 migration_object_init();
 
+#ifndef CONFIG_FUZZ // Already set up by the fuzzer
 if (qtest_chrdev) {
 qtest_init(qtest_chrdev, qtest_log, &error_fatal);
 }
+#endif
 
 machine_opts = qemu_get_machine_opts();
 kernel_filename = qemu_opt_get(machine_opts, "kernel");
@@ -4470,6 +4481,14 @@ int main(int argc, char **argv, char **envp)
 accel_setup_post(current_machine);
 os_setup_post();
 
+/*
+ * Return to the fuzzer since it will run qtest programs and run the
+ * main_loop
+*/
+#ifdef CONFIG_FUZZ
+return 0;
+#endif
+
 main_loop();
 
 gdbserver_cleanup();
-- 
2.20.1

[Qemu-devel] [RFC 09/19] fuzz: use mtree_info to find mapped addresses

2019-07-24 Thread Oleinik, Alexander

Locate mmio and port i/o addresses that are mapped to devices so we can
limit the fuzzer to only these addresses. This should be replaced with
a sane way of enumerating these memory regions.

Signed-off-by: Alexander Oleinik 
---
 memory.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/memory.c b/memory.c
index 5d8c9a9234..fa6cbe4f1d 100644
--- a/memory.c
+++ b/memory.c
@@ -34,6 +34,11 @@
 #include "hw/qdev-properties.h"
 #include "hw/boards.h"
 #include "migration/vmstate.h"
+#ifdef CONFIG_FUZZ
+#include "tests/fuzz/fuzz.h"
+#include "tests/fuzz/qos_fuzz.h"
+#endif
+
 
 //#define DEBUG_UNASSIGNED
 
@@ -3016,12 +3021,20 @@ static void mtree_print_flatview(gpointer key, gpointer value,
 int n = view->nr;
 int i;
 AddressSpace *as;
+#ifdef CONFIG_FUZZ
+bool io=false;
+#endif
+
 
 qemu_printf("FlatView #%d\n", fvi->counter);
 ++fvi->counter;
 
 for (i = 0; i < fv_address_spaces->len; ++i) {
 as = g_array_index(fv_address_spaces, AddressSpace*, i);
+#ifdef CONFIG_FUZZ
+if(strcmp("I/O",as->name) == 0)
+io = true;
+#endif
 qemu_printf(" AS \"%s\", root: %s",
 as->name, memory_region_name(as->root));
 if (as->root->alias) {
@@ -3062,6 +3075,27 @@ static void mtree_print_flatview(gpointer key, gpointer value,
 range->readonly ? "rom" : memory_region_type(mr),
 memory_region_name(mr));
 }
+#ifdef CONFIG_FUZZ
+if(strcmp("i/o", memory_region_type(mr))==0 && strcmp("io", memory_region_name(mr))){
+fuzz_memory_region *fmr = g_new0(fuzz_memory_region, 1);
+if(!fuzz_memory_region_head)
+{
+fuzz_memory_region_head = fmr;
+fuzz_memory_region_tail = fmr;
+}
+fmr->io = io;
+fmr->start = int128_get64(range->addr.start);
+fmr->length = MR_SIZE(range->addr.size);
+fmr->next = fuzz_memory_region_head;
+fuzz_memory_region_tail->next = fmr;
+fuzz_memory_region_tail = fmr;
+if(io == true){
+total_io_mem += MR_SIZE(range->addr.size)+1;
+} else {
+total_ram_mem += MR_SIZE(range->addr.size)+1;
+}
+}
+#endif
 if (fvi->owner) {
 mtree_print_mr_owner(mr);
 }
-- 
2.20.1
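
To show how a list of mapped regions can be used to limit the fuzzer to those addresses, here is a minimal sketch (not code from this series). It assumes the regions have been collected into an array; the names `mapped_region` and `pick_address` are hypothetical, and the patch itself stores the regions in a linked list instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One device-mapped range (hypothetical layout for illustration). */
typedef struct {
    uint64_t start;
    uint64_t length;  /* size in bytes */
    bool io;          /* true for port I/O, false for MMIO */
} mapped_region;

/*
 * Map an arbitrary fuzzer-provided selector onto an address that lies
 * inside one of the regions. Returns false when there is nothing mapped.
 */
static bool pick_address(const mapped_region *r, size_t n,
                         uint64_t selector, uint64_t *addr)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += r[i].length;
    }
    if (total == 0) {
        return false;
    }

    /* Treat the regions as one concatenated range and index into it. */
    selector %= total;
    for (size_t i = 0; i < n; i++) {
        if (selector < r[i].length) {
            *addr = r[i].start + selector;
            return true;
        }
        selector -= r[i].length;
    }
    return false; /* not reached: selector < total */
}
```

With this mapping every input byte sequence lands on a mapped address, so no fuzzing time is spent on unassigned memory.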

[Qemu-devel] [RFC 03/19] fuzz: add fuzz accelerator

2019-07-24 Thread Oleinik, Alexander

Much like the qtest accelerator, the fuzz accelerator skips the CPU
emulation

Signed-off-by: Alexander Oleinik 
---
 include/sysemu/qtest.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index cd114b8d80..adfbd10d20 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -23,7 +23,12 @@ static inline bool qtest_enabled(void)
 }
 
 bool qtest_driver(void);
-
+#ifdef CONFIG_FUZZ
+/* Both the client and the server have qtest_init's. Rename one of them... */
+void qtest_init_server(const char *qtest_chrdev, const char *qtest_log, Error **errp);
+void qtest_server_recv(GString *inbuf); /* Client sends commands using this */
+#else
 void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
+#endif
 
 #endif
-- 
2.20.1

[Qemu-devel] [RFC 07/19] fuzz: Modify libqtest to directly invoke qtest.c

2019-07-24 Thread Oleinik, Alexander

libqtest directly invokes the qtest client and exposes a function to
accept responses.

Signed-off-by: Alexander Oleinik 
---
 tests/libqtest.c | 53 +++-
 tests/libqtest.h |  6 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index 3c5c3f49d8..a68a7287cb 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -30,12 +30,18 @@
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qlist.h"
 #include "qapi/qmp/qstring.h"
+#ifdef CONFIG_FUZZ
+#include "sysemu/qtest.h"
+#endif
 
 #define MAX_IRQ 256
 #define SOCKET_TIMEOUT 50
 #define SOCKET_MAX_FDS 16
 
 QTestState *global_qtest;
+#ifdef CONFIG_FUZZ
+static GString *recv_str;
+#endif
 
 struct QTestState
 {
@@ -316,6 +322,20 @@ QTestState *qtest_initf(const char *fmt, ...)
 va_end(ap);
 return s;
 }
+#ifdef CONFIG_FUZZ
+QTestState *qtest_init_fuzz(const char *extra_args, int *sock_fd)
+{
+QTestState *qts;
+qts = g_new(QTestState, 1);
+qts->wstatus = 0;
+for (int i = 0; i < MAX_IRQ; i++) {
+qts->irq_level[i] = false;
+}
+qts->big_endian = qtest_query_target_endianness(qts);
+
+return qts;
+}
+#endif
 
 QTestState *qtest_init_with_serial(const char *extra_args, int *sock_fd)
 {
@@ -379,9 +399,18 @@ static void socket_sendf(int fd, const char *fmt, va_list ap)
 {
 gchar *str = g_strdup_vprintf(fmt, ap);
 size_t size = strlen(str);
+#ifdef CONFIG_FUZZ
+// Directly call qtest_process_inbuf in the qtest server
+GString *gstr = g_string_new_len(str, size);
+   /* printf(">>> %s",gstr->str); */
+qtest_server_recv(gstr);
+g_string_free(gstr, true);
+g_free(str);
+#else
 
 socket_send(fd, str, size);
 g_free(str);
+#endif
 }
 
 static void GCC_FMT_ATTR(2, 3) qtest_sendf(QTestState *s, const char *fmt, ...)
@@ -433,6 +462,12 @@ static GString *qtest_recv_line(QTestState *s)
 size_t offset;
 char *eol;
 
+#ifdef CONFIG_FUZZ
+eol = strchr(recv_str->str, '\n');
+offset = eol - recv_str->str;
+line = g_string_new_len(recv_str->str, offset);
+g_string_erase(recv_str, 0, offset + 1);
+#else
 while ((eol = strchr(s->rx->str, '\n')) == NULL) {
 ssize_t len;
 char buffer[1024];
@@ -453,7 +488,7 @@ static GString *qtest_recv_line(QTestState *s)
 offset = eol - s->rx->str;
 line = g_string_new_len(s->rx->str, offset);
 g_string_erase(s->rx, 0, offset + 1);
-
+#endif
 return line;
 }
 
@@ -797,6 +832,9 @@ char *qtest_hmp(QTestState *s, const char *fmt, ...)
 
 const char *qtest_get_arch(void)
 {
+#ifdef CONFIG_FUZZ
+return "i386";
+#endif
 const char *qemu = qtest_qemu_binary();
 const char *end = strrchr(qemu, '/');
 
@@ -1339,3 +1377,16 @@ void qmp_assert_error_class(QDict *rsp, const char *class)
 
 qobject_unref(rsp);
 }
+#ifdef CONFIG_FUZZ
+void qtest_clear_rxbuf(QTestState *s){
+g_string_set_size(recv_str,0);
+}
+
+void qtest_client_recv(const char *str, size_t len)
+{
+if(!recv_str)
+recv_str = g_string_new(NULL);
+g_string_append_len(recv_str, str, len);
+return;
+}
+#endif
diff --git a/tests/libqtest.h b/tests/libqtest.h
index cadf1d4a03..dca8f2c2f2 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -1001,4 +1001,10 @@ void qmp_assert_error_class(QDict *rsp, const char *class);
  */
 bool qtest_probe_child(QTestState *s);
 
+#ifdef CONFIG_FUZZ
+QTestState *qtest_init_fuzz(const char *extra_args, int *sock_fd);
+void qtest_clear_rxbuf(QTestState *s);
+void qtest_client_recv(const char *str, size_t len);
+#endif
+
 #endif
-- 
2.20.1
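
The in-process exchange above replaces a socket with a shared receive buffer: the server appends its responses and the client pops complete lines. A minimal standalone sketch of that pattern follows (not code from this series). It uses plain `malloc`/`realloc` rather than GLib's `GString`, the names `rx_buf`, `rx_append` and `rx_pop_line` are hypothetical, and it returns NULL when no full line has arrived yet instead of assuming one is present.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable receive buffer shared by the in-process client and server. */
typedef struct {
    char *data;   /* always NUL-terminated once allocated */
    size_t len;   /* bytes buffered, excluding the terminator */
} rx_buf;

/* Server side: append a response. Returns 0 on success, -1 on OOM. */
static int rx_append(rx_buf *rx, const char *str, size_t len)
{
    char *p = realloc(rx->data, rx->len + len + 1);

    if (!p) {
        return -1;
    }
    memcpy(p + rx->len, str, len);
    rx->len += len;
    p[rx->len] = '\0';
    rx->data = p;
    return 0;
}

/*
 * Client side: remove and return one '\n'-terminated line, without the
 * newline. Returns a malloc'd string, or NULL if no complete line is
 * buffered (or on allocation failure).
 */
static char *rx_pop_line(rx_buf *rx)
{
    char *eol, *line;
    size_t n;

    if (!rx->data) {
        return NULL;
    }
    eol = memchr(rx->data, '\n', rx->len);
    if (!eol) {
        return NULL;
    }
    n = (size_t)(eol - rx->data);
    line = malloc(n + 1);
    if (!line) {
        return NULL;
    }
    memcpy(line, rx->data, n);
    line[n] = '\0';

    /* Shift the remainder, including the terminator, to the front. */
    memmove(rx->data, eol + 1, rx->len - n);
    rx->len -= n + 1;
    return line;
}
```

Checking for a missing newline matters in the direct-call setup, because nothing blocks waiting for more data the way a socket read would.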

[Qemu-devel] [RFC 02/19] fuzz: add FUZZ_TARGET type to qemu module system

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 include/qemu/module.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/qemu/module.h b/include/qemu/module.h
index db3065381d..531fe7ae29 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -46,6 +46,9 @@ typedef enum {
 MODULE_INIT_TRACE,
 MODULE_INIT_XEN_BACKEND,
 MODULE_INIT_LIBQOS,
+#ifdef CONFIG_FUZZ
+MODULE_INIT_FUZZ_TARGET,
+#endif
 MODULE_INIT_MAX
 } module_init_type;
 
@@ -56,7 +59,9 @@ typedef enum {
 #define xen_backend_init(function) module_init(function, \
MODULE_INIT_XEN_BACKEND)
 #define libqos_init(function) module_init(function, MODULE_INIT_LIBQOS)
-
+#ifdef CONFIG_FUZZ
+#define fuzz_target_init(function) module_init(function, MODULE_INIT_FUZZ_TARGET)
+#endif
 #define block_module_load_one(lib) module_load_one("block-", lib)
 #define ui_module_load_one(lib) module_load_one("ui-", lib)
 #define audio_module_load_one(lib) module_load_one("audio-", lib)
-- 
2.20.1

[Qemu-devel] [RFC 04/19] fuzz: Add qos support to fuzz targets

2019-07-24 Thread Oleinik, Alexander

qos_helpers.c is largely a copy of tests/qos-test.c

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/qos_fuzz.c|  63 +
 tests/fuzz/qos_fuzz.h|  29 
 tests/fuzz/qos_helpers.c | 295 +++
 tests/fuzz/qos_helpers.h |  17 +++
 4 files changed, 404 insertions(+)
 create mode 100644 tests/fuzz/qos_fuzz.c
 create mode 100644 tests/fuzz/qos_fuzz.h
 create mode 100644 tests/fuzz/qos_helpers.c
 create mode 100644 tests/fuzz/qos_helpers.h

diff --git a/tests/fuzz/qos_fuzz.c b/tests/fuzz/qos_fuzz.c
new file mode 100644
index 00..ac7bb735ac
--- /dev/null
+++ b/tests/fuzz/qos_fuzz.c
@@ -0,0 +1,63 @@
+
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+
+#include "libqos/malloc.h"
+#include "libqos/qgraph.h"
+#include "libqos/qgraph_internal.h"
+
+#include "hw/virtio/virtio-net.h"
+#include "hw/virtio/virtio.h"
+#include "libqos/virtio-net.h"
+#include "fuzz.h"
+#include "qos_fuzz.h"
+#include "qos_helpers.h"
+#include "tests/libqos/qgraph.h"
+#include "tests/libqtest.h"
+
+
+fuzz_memory_region *fuzz_memory_region_head;
+fuzz_memory_region *fuzz_memory_region_tail;
+
+uint64_t total_io_mem = 0;
+uint64_t total_ram_mem = 0;
+
+
+//TODO: Put arguments in a neater struct
+void fuzz_add_qos_target(const char* name,
+   const char* description,
+   const char* interface,
+   QOSGraphTestOptions* opts,
+   void(*init_pre_main)(void),
+   void(*init_pre_save)(void),
+   void(*save_state)(void),
+   void(*reset)(void),
+   void(*pre_fuzz)(void),
+   void(*fuzz)(const unsigned char*, size_t),
+   void(*post_fuzz)(void))
+{
+   qos_add_test(name, interface, NULL, opts);
+   fuzz_add_target(name, description, init_pre_main, init_pre_save,
+   save_state, reset, pre_fuzz, fuzz, post_fuzz, &qos_argc, &qos_argv);
+}
+
+
+// Do what is normally done in qos-test.c:main
+void qos_setup(void){
+   qtest_setup();
+   qos_set_machines_devices_available();
+   qos_graph_foreach_test_path(walk_path);
+   qos_build_main_args();
+}
+
+void qos_init_path(void)
+{
+   qos_obj = qos_allocate_objects(global_qtest, &qos_alloc);
+}
diff --git a/tests/fuzz/qos_fuzz.h b/tests/fuzz/qos_fuzz.h
new file mode 100644
index 00..098f81f570
--- /dev/null
+++ b/tests/fuzz/qos_fuzz.h
@@ -0,0 +1,29 @@
+#ifndef _QOS_FUZZ_H_
+#define _QOS_FUZZ_H_
+
+#include "tests/libqos/qgraph.h"
+
+int qos_fuzz(const unsigned char *Data, size_t Size);
+void qos_setup(void);
+
+extern char **fuzz_path_vec;
+extern int qos_argc;
+extern char **qos_argv;
+extern void* qos_obj;
+extern QGuestAllocator *qos_alloc;
+
+
+void fuzz_add_qos_target(const char* name,
+   const char* description,
+   const char* interface,
+   QOSGraphTestOptions* opts,
+   void(*init_pre_main)(void),
+   void(*init_pre_save)(void),
+   void(*save_state)(void),
+   void(*reset)(void),
+   void(*pre_fuzz)(void),
+   void(*fuzz)(const unsigned char*, size_t),
+   void(*post_fuzz)(void));
+
+void qos_init_path(void);
+#endif
diff --git a/tests/fuzz/qos_helpers.c b/tests/fuzz/qos_helpers.c
new file mode 100644
index 00..79523c0552
--- /dev/null
+++ b/tests/fuzz/qos_helpers.c
@@ -0,0 +1,295 @@
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qos_helpers.h"
+#include "fuzz.h"
+#include "qapi/qmp/qlist.h"
+#include "libqtest.h"
+#include "sysemu/qtest.h"
+#include "libqos/qgraph.h"
+#include "libqos/qgraph_internal.h"
+#include "./qapi/qapi-commands-machine.h"
+#include "./qapi/qapi-commands-misc.h"
+#include "./qapi/qapi-commands-qom.h"
+#include 
+#include "sysemu/sysemu.h"
+#include "sysemu/cpus.h"
+
+
+/* 
+ * This file is almost completely copied from tests/qos-test.c
+ * TODO: Find a way to re-use the code in tests/qos-test.c
+ */
+
+static void apply_to_node(const char *name, bool is_machine, bool is_abstract)
+{
+char *machine_name = NULL;
+if (is_machine) {
+const char *arch = qtest_get_arch();
+machine_name = g_strconcat(arch, "/", name, NULL);
+name = machine_name;
+}
+qos_graph_node_set_availability(name, true);
+if (is_abstract) {
+qos_delete_cmd_line(name);
+}
+g_free(machine_name);
+}
+
+static void apply_to_qlist(QList *list, bool is_machine)
+{
+const QListEntry *p;
+const char *name;
+bool abstract;
+QDict *minfo;
+QObject *qobj;
+QString *qstr;
+QBool *qbool;
+
+for (p = qlist_first(list); p; p = qlist_next(p)) {
+minfo = qobject_to(QDict, qlist_entry_obj(p));
+qobj = qdict_get(minfo, "name");
+qstr = qobject_to(QString, qobj);
+name =

[Qemu-devel] [RFC 01/19] fuzz: add configure option and linker objects

2019-07-24 Thread Oleinik, Alexander

Add -Wl,--wraps for the libfuzzer callees that we need to intercept

Signed-off-by: Alexander Oleinik 
---
 configure | 11 +++
 target/i386/Makefile.objs | 19 +++
 2 files changed, 30 insertions(+)

diff --git a/configure b/configure
index 714e7fb6a1..0a40e77053 100755
--- a/configure
+++ b/configure
@@ -499,6 +499,7 @@ docker="no"
 debug_mutex="no"
 libpmem=""
 default_devices="yes"
+fuzzing="no"
 
 # cross compilers defaults, can be overridden with --cross-cc-ARCH
 cross_cc_aarch64="aarch64-linux-gnu-gcc"
@@ -1543,6 +1544,8 @@ for opt do
   ;;
   --disable-libpmem) libpmem=no
   ;;
+  --enable-fuzzing) fuzzing=yes
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -6481,6 +6484,7 @@ echo "docker$docker"
 echo "libpmem support   $libpmem"
 echo "libudev   $libudev"
 echo "default devices   $default_devices"
+echo "fuzzing support   $fuzzing"
 
 if test "$supported_cpu" = "no"; then
 echo
@@ -7306,6 +7310,13 @@ fi
 if test "$sheepdog" = "yes" ; then
   echo "CONFIG_SHEEPDOG=y" >> $config_host_mak
 fi
+if test "$fuzzing" = "yes" ; then
+  QEMU_CFLAGS="$QEMU_CFLAGS -fsanitize=fuzzer,address -fprofile-instr-generate"
+  QEMU_INCLUDES="-iquote \$(SRC_PATH)/tests $QEMU_INCLUDES"
+  QEMU_LDFLAGS="$QEMU_LDFLAGS -fsanitize=fuzzer,address"
+  QEMU_LDFLAGS="$QEMU_LDFLAGS -Wl,--wrap=__sanitizer_cov_8bit_counters_init,--wrap=__sanitizer_cov_trace_pc_guard_init"
+  echo "CONFIG_FUZZ=y" >> $config_host_mak
+fi
 
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
diff --git a/target/i386/Makefile.objs b/target/i386/Makefile.objs
index 48e0c28434..3d646848ef 100644
--- a/target/i386/Makefile.objs
+++ b/target/i386/Makefile.objs
@@ -18,5 +18,24 @@ endif
 obj-$(CONFIG_HVF) += hvf/
 obj-$(CONFIG_WHPX) += whpx-all.o
 endif
+
+# Need to link against target, qtest and qos.. Just list everything here, until
+# I find a better way to integrate into the build system
+ifeq ($(CONFIG_FUZZ),y)
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/ramfile.o ../../accel/fuzz.o
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/fuzz.o
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/virtio-net-fuzz.o 
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/qtest_fuzz.o
+obj-$(CONFIG_FUZZ) += ../../tests/libqtest.o
+obj-$(CONFIG_FUZZ) += ../../tests/libqos/qgraph.o ../../tests/libqos/libqos.o 
+obj-$(CONFIG_FUZZ) += ../../tests/fuzz/qos_fuzz.o ../../tests/fuzz/qos_helpers.o
+obj-$(CONFIG_FUZZ) += ../../tests/libqos/malloc.o ../../tests/libqos/pci-pc.o \
+   ../../tests/libqos/virtio-pci.o ../../tests/libqos/malloc-pc.o \
+   ../../tests/libqos/libqos-pc.o ../../tests/libqos/fw_cfg.o \
+   ../../tests/libqos/e1000e.o ../../tests/libqos/pci.o \
+   ../../tests/libqos/pci-pc.o ../../tests/libqos/virtio.o \
+   ../../tests/libqos/virtio-net.o ../../tests/libqos/x86_64_pc-machine.o
+endif
+
 obj-$(CONFIG_SEV) += sev.o
 obj-$(call lnot,$(CONFIG_SEV)) += sev-stub.o
-- 
2.20.1

[Qemu-devel] [RFC 00/19] Add virtual device fuzzing support

2019-07-24 Thread Oleinik, Alexander

As part of Google Summer of Code 2019, I'm working on integrating
fuzzing of virtual devices into QEMU [1]. This is a highly WIP patchset
adding this functionality.

Fuzzers provide random data to a program and monitor its execution for
errors. Coverage-guided fuzzers also observe the parts of the program
that are exercised by each input, and use this information to
mutate/guide the inputs to reach additional parts of the program. They
are quite effective for finding bugs in a wide range of software. 

Summary:
 - The virtual-device fuzzers use libfuzzer [2] for coverage-guided
   in-process fuzzing.
 - To fuzz a device, create a new fuzz "target" - i.e. a function that
   exercises QEMU based on inputs provided by the fuzzer.
 - Fuzz targets rely on qtest and libqos to turn inputs into actions.
 - Since libfuzzer does in-process fuzzing, the QEMU state needs to be
   reset after each fuzz run. These patches provide three methods for
   resetting state.
 - There are currently few targets, but they have already helped
   discover bugs in the console, and virtio-net, and have reproduced
   previously-reported vulnerabilities.
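
To make the notion of a fuzz target turning inputs into actions more concrete, here is a minimal sketch (not code from this series) of splitting the raw fuzzer bytes into a bounded sequence of header-plus-payload actions. The `action_hdr` layout, `MAX_ACTIONS` and `parse_actions` are hypothetical names chosen for illustration. Headers are copied out with `memcpy` so that unaligned input is read safely, and every payload length is clamped to the bytes that remain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ACTIONS 10

/* Hypothetical fixed-size header that precedes each payload. */
typedef struct {
    uint16_t length;  /* requested payload length in bytes */
    uint8_t write;    /* should the device be allowed to write? */
    uint8_t kick;     /* notify the device after this action? */
} action_hdr;

typedef struct {
    action_hdr hdr;
    const uint8_t *payload;  /* points into the fuzzer input */
} action;

/* Split the input into at most MAX_ACTIONS actions; returns the count. */
static size_t parse_actions(const uint8_t *data, size_t size,
                            action out[MAX_ACTIONS])
{
    size_t n = 0;

    while (n < MAX_ACTIONS && size >= sizeof(action_hdr)) {
        action_hdr hdr;

        memcpy(&hdr, data, sizeof(hdr));
        data += sizeof(hdr);
        size -= sizeof(hdr);

        /* Never trust the length field beyond the remaining input. */
        if (hdr.length > size) {
            hdr.length = (uint16_t)size;
        }
        out[n].hdr = hdr;
        out[n].payload = data;
        data += hdr.length;
        size -= hdr.length;
        n++;
    }
    return n;
}
```

A target would then walk the returned array and issue one qtest or libqos call per action, which keeps the parsing separate from the device-specific part.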

Here are some main implementation details:
 - The fuzzing occurs within a single process. QTest and QOS are
   modified so the QTest client and server coexist within the same
   process. They communicate with each other through direct function
   calls. Similar to qtest, the fuzzer uses a lightweight accelerator to
   skip CPU emulation. The fuzzing target is responsible for manually
   executing the main loop.
 - Since the same process is reused for many fuzzing runs, QEMU state
   needs to be reset at the end of each run. There are currently three
   implemented options for resetting state: 
   1. Reboot the guest between runs.
  Pros: Straightforward and fast for simple fuzz targets. 
  Cons: Depending on the device, does not reset all device state. If
  the device requires some initialization prior to being ready for
  fuzzing (common for QOS-based targets), this initialization needs
  to be done after each reboot.
  Example target: --virtio-net-ctrl-fuzz
   2. vmsave the state to RAM, once, and restore it after each run.
  Alternatively, only save the device state
  (savevm.c:qemu_save_device_state)
  Pros: Do not need to initialize devices prior to each run.
  VMStateDescriptions often specify more state than the device
  resetting functions called during reboots.
  Cons: Restoring state is often slower than rebooting. There is
  currently no way to save the QOS object state, so the objects
  usually need to be re-allocated, defeating the purpose of
  one-time device initialization.
  Example target: --qtest-fuzz
   3. Run each test case in a separate forked process and copy the 
  coverage information back to the parent. This is fairly similar to
  AFL's "deferred" fork-server mode [3]
  Pros: Relatively fast. Devices only need to be initialized once.
  No need to do slow reboots or vmloads.
  Cons: Not officially supported by libfuzzer and the implementation
  is very flimsy. Does not work well for devices that rely on
  dedicated threads.
  Example target: --qtest-fork-fuzz
 - Fuzz targets are registered using QEMU's module system, similar to
   QOS test cases. Base qtest targets are registered with fuzz_add_target
   and QOS-based targets with fuzz_add_qos_target.
 - There are two entry points for the fuzzer:
LLVMFuzzerInitialize: Run once, prior to fuzzing. Here, we set up
   qtest/qos, register the fuzz targets and partially execute vl.c:main.
   This is also where we would take a snapshot, if using the vmsave
   approach to resetting.
LLVMFuzzerTestOneInput: Run for each fuzzing input. This function is
   responsible for taking care of device initialization, calling the
   actual fuzz target, and resetting state at the end of each run.
   Both of these functions are defined in tests/fuzz/fuzz.c
 - There are many libfuzzer flags which should be used to configure the
   coverage metrics and storage of interesting fuzz inputs. [2] These
   flags can also be helpful in evaluating fuzzing performance through
   metrics such as inputs/second and line coverage.
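
The fork-based reset (option 3 above) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the implementation in this series: `counters` stands in for the coverage counters the fuzzing engine reads, `run_target` and `device_state` are hypothetical, and a fresh anonymous shared mapping is created per run for clarity. The child runs the input and publishes its counters; the parent copies them back, while its own device state stays untouched.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_COUNTERS 64

/* Stand-in for the coverage counters the fuzzing engine inspects. */
static uint8_t counters[NUM_COUNTERS];

/* Hypothetical target: mutates process state and records coverage. */
static int device_state;

static void run_target(const uint8_t *data, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        device_state += data[i];
        counters[data[i] % NUM_COUNTERS]++;
    }
}

/* Run one input in a child process. Returns 0 on success, -1 if the
 * child could not be created or did not exit cleanly. */
static int run_forked(const uint8_t *data, size_t size)
{
    uint8_t *shm = mmap(NULL, NUM_COUNTERS, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    int status;
    int ret = -1;
    pid_t pid;

    if (shm == MAP_FAILED) {
        return -1;
    }
    pid = fork();
    if (pid == 0) {
        /* Child: run the input, publish coverage, discard everything else. */
        run_target(data, size);
        memcpy(shm, counters, NUM_COUNTERS);
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* Parent: adopt the child's coverage; device_state is unchanged. */
        memcpy(counters, shm, NUM_COUNTERS);
        ret = 0;
    }
    munmap(shm, NUM_COUNTERS);
    return ret;
}
```

Because only the counters cross the process boundary, whatever state the input corrupts is thrown away with the child, which is what makes per-run reboots or vmloads unnecessary in this mode.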

Here are some key issues with the current state of the code:
 - The patches change vl.c, main-loop.c, qtest.c, tests/libqtest.c,
   savevm.c, memory.c. I wrapped the changes with #ifdef CONFIG_FUZZ,
   but many of these changes can and should be avoided.
 - tests/fuzz/qos_helpers.c is largely a copy of tests/qos-test.c.
 - The fuzzer is not properly integrated into the build system.
   Currently I simply added all of the necessary objects to
   target/i386/Makefile.objs, but there should be a simple way to build
   for other arches. The binary needs to be linked against libqemuutil,
   libqtest, qos and the qos objects, and the requirements for softmmu
   targets.
 - Some of the fuzz targets leak memory during

[Qemu-devel] [RFC 06/19] fuzz: Add ramfile for fast vmstate/vmload

2019-07-24 Thread Oleinik, Alexander

The ramfile allows vmstate to be saved and restored directly onto the
heap.

Signed-off-by: Alexander Oleinik 
---
 tests/fuzz/ramfile.c | 127 +++
 tests/fuzz/ramfile.h |  20 +++
 2 files changed, 147 insertions(+)
 create mode 100644 tests/fuzz/ramfile.c
 create mode 100644 tests/fuzz/ramfile.h

diff --git a/tests/fuzz/ramfile.c b/tests/fuzz/ramfile.c
new file mode 100644
index 00..8da242e9ee
--- /dev/null
+++ b/tests/fuzz/ramfile.c
@@ -0,0 +1,127 @@
+/*
+ * 
=
+ *
+ *   Filename:  ramfile.c
+ *
+ *Description:  QEMUFile stored in dynamically allocated RAM for fast VMRestore
+ *
+ * Author:  Alexander Oleinik (), alx...@bu.edu
+ *   Organization:  
+ *
+ * 
=
+ */
+#include 
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "migration/qemu-file.h"
+#include "migration/migration.h"
+#include "migration/savevm.h"
+#include "ramfile.h"
+
+#define INCREMENT 10240
+#define IO_BUF_SIZE 32768
+#define MAX_IOV_SIZE MIN(IOV_MAX, 64)
+
+struct QEMUFile {
+const QEMUFileOps *ops;
+const QEMUFileHooks *hooks;
+void *opaque;
+
+int64_t bytes_xfer;
+int64_t xfer_limit;
+
+int64_t pos; /* start of buffer when writing, end of buffer
+when reading */
+int buf_index;
+int buf_size; /* 0 when writing */
+uint8_t buf[IO_BUF_SIZE];
+
+DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
+struct iovec iov[MAX_IOV_SIZE];
+unsigned int iovcnt;
+
+int last_error;
+};
+
+static ssize_t ram_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
+  int64_t pos)
+{
+   ram_disk *rd = (ram_disk*)opaque;
+   gsize newsize;
+   ssize_t total_size = 0;
+   int i;
+   if(!rd->base) {
+   rd->base = g_malloc(INCREMENT);
+   rd->len = INCREMENT;
+   }
+   for(i = 0; i< iovcnt; i++)
+   {
+   if(pos+iov[i].iov_len >= rd->len ){
+   newsize = ((pos + iov[i].iov_len)/INCREMENT + 1) * INCREMENT;
+   rd->base = g_realloc(rd->base, newsize);
+   rd->len = newsize;
+   }
+   memcpy(rd->base + pos, iov[i].iov_base, iov[i].iov_len);
+   pos += iov[i].iov_len;
+   total_size += iov[i].iov_len;
+   }
+   return total_size;
+}
+
+static ssize_t ram_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
+   size_t size)
+{
+   ram_disk *rd = (ram_disk*)opaque;
+   if(pos+size>rd->len){
+   if(rd->len-pos>=0){
+   memcpy(buf, rd->base + pos, rd->len-pos);
+   size = rd->len-pos;
+   }
+   }
+   else
+   memcpy(buf, rd->base + pos, size);
+   return size;
+}
+
+static int ram_fclose(void *opaque)
+{
+   return 0;
+}
+
+static const QEMUFileOps ram_read_ops = {
+.get_buffer = ram_get_buffer,
+.close =  ram_fclose
+};
+
+static const QEMUFileOps ram_write_ops = {
+.writev_buffer = ram_writev_buffer,
+.close =  ram_fclose
+};
+
+QEMUFile *qemu_fopen_ram(ram_disk **return_rd) {
+   ram_disk *rd = g_new0(ram_disk, 1);
+   *return_rd=rd;
+   return qemu_fopen_ops(rd, &ram_write_ops);
+}
+
+QEMUFile *qemu_fopen_ro_ram(ram_disk* rd) {
+return qemu_fopen_ops(rd, &ram_read_ops);
+}
+
+void qemu_freopen_ro_ram(QEMUFile* f) {
+   void *rd = f->opaque;
+   f->bytes_xfer=0;
+   f->xfer_limit=0;
+   f->last_error=0;
+   f->iovcnt=0;
+   f->buf_index=0;
+   f->buf_size=0;
+   f->pos=0;
+   f->ops = &ram_read_ops;
+   f->opaque = rd;
+   return;
+}
diff --git a/tests/fuzz/ramfile.h b/tests/fuzz/ramfile.h
new file mode 100644
index 0000000000..b51cc72950
--- /dev/null
+++ b/tests/fuzz/ramfile.h
@@ -0,0 +1,20 @@
+#ifndef RAMFILE_H
+#define RAMFILE_H
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/iov.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "migration/qemu-file.h"
+
+typedef struct ram_disk {
+   void *base;
+   gsize len;
+} ram_disk;
+
+QEMUFile *qemu_fopen_ram(ram_disk **rd);
+QEMUFile *qemu_fopen_ro_ram(ram_disk* rd);
+void qemu_freopen_ro_ram(QEMUFile* f);
+
+#endif
-- 
2.20.1
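
For readers following the ramfile patch above, the grow-on-write and bounded-read behaviour can be sketched in isolation. This is only a minimal sketch under stated assumptions, not the code from the patch: mem_disk, CHUNK, mem_disk_write and mem_disk_read are hypothetical names, and plain realloc() stands in for g_malloc()/g_realloc(). The read helper also clamps when pos is at or past the end of the buffer, because gsize is unsigned and a check such as "rd->len-pos>=0" is then always true on typical 64-bit builds.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 10240 /* growth step; mirrors INCREMENT in the patch */

/* Hypothetical stand-in for the patch's ram_disk. */
typedef struct {
    unsigned char *base; /* heap buffer, NULL until the first write */
    size_t len;          /* allocated capacity in bytes */
} mem_disk;

/* Round a required size up to the next multiple of CHUNK above it. */
static size_t round_up_chunk(size_t n)
{
    return (n / CHUNK + 1) * CHUNK;
}

/*
 * Copy size bytes to offset pos, growing the buffer when needed.
 * Returns the number of bytes written, or -1 if allocation fails.
 */
static long mem_disk_write(mem_disk *d, size_t pos, const void *src,
                           size_t size)
{
    if (pos + size > d->len) {
        size_t newlen = round_up_chunk(pos + size);
        /* realloc(NULL, n) behaves like malloc(n) */
        unsigned char *p = realloc(d->base, newlen);
        if (!p) {
            return -1;
        }
        d->base = p;
        d->len = newlen;
    }
    memcpy(d->base + pos, src, size);
    return (long)size;
}

/*
 * Read up to size bytes starting at pos. The request is clamped to the
 * allocated capacity; a read at or past the end returns 0.
 */
static size_t mem_disk_read(const mem_disk *d, size_t pos, void *dst,
                            size_t size)
{
    if (pos >= d->len) {
        return 0;
    }
    if (size > d->len - pos) {
        size = d->len - pos;
    }
    memcpy(dst, d->base + pos, size);
    return size;
}
```

Like the patch, this sketch tracks only the allocated capacity, not how many bytes were actually written, so a caller that needs an exact end-of-data position would have to record it separately.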

[Qemu-devel] [RFC 05/19] fuzz: expose qemu_savevm_state & skip state header

2019-07-24 Thread Oleinik, Alexander

Signed-off-by: Alexander Oleinik 
---
 migration/savevm.c | 8 ++--
 migration/savevm.h | 3 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 79ed44d475..80c00ea560 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1404,8 +1404,11 @@ void qemu_savevm_state_cleanup(void)
 }
 }
 }
-
+#ifdef CONFIG_FUZZ
+int qemu_savevm_state(QEMUFile *f, Error **errp)
+#else
 static int qemu_savevm_state(QEMUFile *f, Error **errp)
+#endif
 {
 int ret;
 MigrationState *ms = migrate_get_current();
@@ -1471,11 +1474,12 @@ void qemu_savevm_live_state(QEMUFile *f)
 int qemu_save_device_state(QEMUFile *f)
 {
 SaveStateEntry *se;
-
+#ifndef CONFIG_FUZZ
 if (!migration_in_colo_state()) {
 qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
 qemu_put_be32(f, QEMU_VM_FILE_VERSION);
 }
+#endif
 cpu_synchronize_all_states();
 
 QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
diff --git a/migration/savevm.h b/migration/savevm.h
index 51a4b9caa8..30315d0cfd 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -64,4 +64,7 @@ void qemu_loadvm_state_cleanup(void);
 int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
 int qemu_load_device_state(QEMUFile *f);
 
+#ifdef CONFIG_FUZZ
+int qemu_savevm_state(QEMUFile *f, Error **errp);
+#endif
 #endif
-- 
2.20.1

[Qemu-devel] CC wangxiongfeng. : RE: [PATCH] pcie: fix device hotplug failure at the meantime of VM boot

2019-07-24 Thread Zhangbo (Oscar)

>> If the PCI_EXP_LNKSTA_DLLLA capability is set by default, linux kernel will send
>> PDC event to detect whether there is a device in pcie slot. If a device is plugged
>> in the pcie-root-port at the same time, hot-plug device will send ABP + PDC
>> events to the kernel. The VM kernel will wrongly unplug the device if two PDC
>> events get too close. Thus we'd better set the PCI_EXP_LNKSTA_DLLLA
>> capability only in hotplug callback.
>
>Could you please describe a reproducer in a bit more detail?
>
Step1: start a VM with qemu, the VM boots up within 500ms.
  /path/to/qemu-2.8.1/aarch64-softmmu/qemu-system-aarch64 \
  -name test-c65961652639ccf9ce0b8476a325421811d4fdc873e90c27168497bc9e204776 \
  -uuid a8ed4a86-3f49-45a3-a8ce-28d61b2f2914 \
  -machine virt,usb=off,accel=kvm,gic-version=3 \
  -cpu host \
  -m 2048M,slots=10,maxmem=239477M \
  -qmp unix:/var/run/qmp.sock,server,nowait \
  -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
  -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
  -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
  -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
  -device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
  -device pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
  -device pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
  -device pcie-pci-bridge,id=pci.8,bus=pci.1,addr=0x0 \
  -device pcie-root-port,port=0xf,chassis=9,id=pci.9,bus=pcie.0,addr=0x1.0x7 \
  ...
  
Step2: Immediately hotplug a pcie device:
  qmp_msg='{ "execute": "qmp_capabilities" }
{"arguments":{"addr":"0x0","bus":"pci.4","driver":"virtio-net-pci","id":"virtio-e1356802-4b9f-44bb-b8f0-5f98bf765823","mac":"02:42:20:6e:a2:59"},"execute":"device_del"}
{"arguments":{"id":"netport_test_1","ifname":"nfs_tap"},"execute":"netdev_del"}'

  echo $qmp_msg | nc -U /var/run/qmp.sock

Expected result: the hotplug succeeds and the pcie device can be seen inside the VM.

Actual result: we found both a "hotplug" and an "unplug" message inside the VM, and the hotplug failed.

>
>>
>> By the way, we should clean up the PCI_EXP_LNKSTA_DLLLA capability during
>> unplug to avoid VM restart or migration failure which will enter the same
>> abnormal scenario as above.
>>
>> Signed-off-by: limingw...@huawei.com
>> Signed-off-by: fangyi...@huawei.com
>> Signed-off-by: oscar.zhan...@huawei.com
>
>So looking at linux I see:
>
> * pciehp_card_present_or_link_active() - whether given slot is occupied
> * @ctrl: PCIe hotplug controller
> *
> * Unlike pciehp_card_present(), which determines presence solely from the
> * Presence Detect State bit, this helper also returns true if the Link Active
> * bit is set.  This is a concession to broken hotplug ports which hardwire
> * Presence Detect State to zero, such as Wilocity's [1ae9:0200].
>
>so it looks like linux actually looks at presence detect state,
>but we have a bug just like Wilocity's and keeping
>link active up fixes that. Can't we fix the bug instead?
>
@wangxiongfeng 
>> ---
>>  hw/pci/pcie.c | 9 +
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
>> index a6beb56..174f392 100644
>> --- a/hw/pci/pcie.c
>> +++ b/hw/pci/pcie.c
>> @@ -75,10 +75,6 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
>>  QEMU_PCI_EXP_LNKSTA_NLW(QEMU_PCI_EXP_LNK_X1) |
>>  QEMU_PCI_EXP_LNKSTA_CLS(QEMU_PCI_EXP_LNK_2_5GT));
>>
>> -if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
>> -pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA,
>> -   PCI_EXP_LNKSTA_DLLLA);
>> -}
>>
>>  /* We changed link status bits over time, and changing them across
>>   * migrations is generally fine as hardware changes them too.
>
>Does this actually change anything?
>
>I don't know why do we bother setting it here but we do
>set it later in pcie_cap_slot_plug_cb, correct?
>
>I'd like to understand whether this is part of fix or
>just a cleanup.
>
>
>> @@ -484,6 +480,11 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
>>  return;
>>  }
>>
>> +if (pci_dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
>> +pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA,
>> + PCI_EXP_LNKSTA_DLLLA);
>> +}
>> +
>>  pcie_cap_slot_push_attention_button(PCI_DEVICE(hotplug_dev));
>>  }
>
>So this reports data link inactive immediately after
>unplug request. Seems a bit questionable: guest did not
>respond yet. I'd like to see a comment explaining why
>this makes sense.
>
>
>> --
>> 1.8.3.1
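
To make the discussion above concrete, here is a small sketch of what setting and clearing the Data Link Layer Link Active bit in a little-endian 16-bit config word amounts to. It is an illustration under stated assumptions, not QEMU's implementation: word_get, word_set, word_test_and_set_mask and word_test_and_clear_mask are hypothetical helpers, and the LNKSTA_OFFSET (0x12) and LNKSTA_DLLLA (0x2000) values are taken from the Linux pci_regs.h definitions of PCI_EXP_LNKSTA and PCI_EXP_LNKSTA_DLLLA.

```c
#include <stdint.h>

/* Assumed values, following Linux's pci_regs.h. */
#define LNKSTA_OFFSET 0x12   /* Link Status offset inside the PCIe capability */
#define LNKSTA_DLLLA  0x2000 /* Data Link Layer Link Active */

/* Read a little-endian 16-bit word from config space. */
static uint16_t word_get(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a little-endian 16-bit word into config space. */
static void word_set(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Set the mask bits; return the bits of mask that were already set. */
static uint16_t word_test_and_set_mask(uint8_t *p, uint16_t mask)
{
    uint16_t val = word_get(p);
    word_set(p, (uint16_t)(val | mask));
    return (uint16_t)(val & mask);
}

/* Clear the mask bits; return the bits of mask that were set before. */
static uint16_t word_test_and_clear_mask(uint8_t *p, uint16_t mask)
{
    uint16_t val = word_get(p);
    word_set(p, (uint16_t)(val & ~mask));
    return (uint16_t)(val & mask);
}
```

With helpers like these, the patch's proposal corresponds to calling the "set" variant only from the hotplug callback and the "clear" variant on unplug, which is the ordering the review comments question.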

[Qemu-devel] [PATCH v3 1/1] configure: Define target access alignment in configure

2019-07-24 Thread tony.nguyen

Rename ALIGNED_ONLY to TARGET_ALIGNED_ONLY for clarity and move
defines out of target/foo/cpu.h into configure, as we do with
TARGET_WORDS_BIGENDIAN, so that it is always defined early.

Also, poison the symbol in include/exec/poison.h to prevent use in
common code.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Tony Nguyen 
---
 configure | 10 +-
 include/exec/poison.h |  1 +
 include/qom/cpu.h |  2 +-
 target/alpha/cpu.h|  2 --
 target/hppa/cpu.h |  1 -
 target/mips/cpu.h |  2 --
 target/sh4/cpu.h  |  2 --
 target/sparc/cpu.h|  2 --
 target/xtensa/cpu.h   |  2 --
 tcg/tcg.c |  2 +-
 tcg/tcg.h |  8 +---
 11 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/configure b/configure
index 714e7fb..482ba0b 100755
--- a/configure
+++ b/configure
@@ -7431,8 +7431,13 @@ for target in $target_list; do
 target_dir="$target"
 config_target_mak=$target_dir/config-target.mak
 target_name=$(echo $target | cut -d '-' -f 1)
+target_aligned_only="no"
+case "$target_name" in
+  alpha|hppa|mips64el|mips64|mipsel|mips|mipsn32|mipsn32el|sh4|sh4eb|sparc|sparc64|sparc32plus|xtensa|xtensaeb)
+  target_aligned_only="yes"
+  ;;
+esac
 target_bigendian="no"
-
 case "$target_name" in
   armeb|aarch64_be|hppa|lm32|m68k|microblaze|mips|mipsn32|mips64|moxie|or1k|ppc|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus|xtensaeb)
   target_bigendian=yes
@@ -7717,6 +7722,9 @@ fi
 if supported_whpx_target $target; then
 echo "CONFIG_WHPX=y" >> $config_target_mak
 fi
+if test "$target_aligned_only" = "yes" ; then
+  echo "TARGET_ALIGNED_ONLY=y" >> $config_target_mak
+fi
 if test "$target_bigendian" = "yes" ; then
   echo "TARGET_WORDS_BIGENDIAN=y" >> $config_target_mak
 fi
diff --git a/include/exec/poison.h b/include/exec/poison.h
index b862320..955eb86 100644
--- a/include/exec/poison.h
+++ b/include/exec/poison.h
@@ -35,6 +35,7 @@
 #pragma GCC poison TARGET_UNICORE32
 #pragma GCC poison TARGET_XTENSA
 
+#pragma GCC poison TARGET_ALIGNED_ONLY
 #pragma GCC poison TARGET_HAS_BFLT
 #pragma GCC poison TARGET_NAME
 #pragma GCC poison TARGET_SUPPORTS_MTTCG
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5ee0046..9b50b73 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -89,7 +89,7 @@ struct TranslationBlock;
  * @do_unassigned_access: Callback for unassigned access handling.
  * (this is deprecated: new targets should use do_transaction_failed instead)
  * @do_unaligned_access: Callback for unaligned access handling, if
- * the target defines #ALIGNED_ONLY.
+ * the target defines #TARGET_ALIGNED_ONLY.
  * @do_transaction_failed: Callback for handling failed memory transactions
  * (ie bus faults or external aborts; not MMU faults)
  * @virtio_is_big_endian: Callback to return %true if a CPU which supports
diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index b3e8a82..16eb804 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -23,8 +23,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 /* Alpha processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO  (0)
 
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index aab251b..2be67c2 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -30,7 +30,6 @@
basis.  It's probably easier to fall back to a strong memory model.  */
 #define TCG_GUEST_DEFAULT_MO        TCG_MO_ALL
 
-#define ALIGNED_ONLY
 #define MMU_KERNEL_IDX   0
 #define MMU_USER_IDX 3
 #define MMU_PHYS_IDX 4
diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index 21c0615..c13cd4e 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -1,8 +1,6 @@
 #ifndef MIPS_CPU_H
 #define MIPS_CPU_H
 
-#define ALIGNED_ONLY
-
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 #include "fpu/softfloat.h"
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index aee733e..ecaa7a1 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -23,8 +23,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 /* CPU Subtypes */
 #define SH_CPU_SH7750  (1 << 0)
 #define SH_CPU_SH7750S (1 << 1)
diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index 8ed2250..1406f0b 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -5,8 +5,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 #if !defined(TARGET_SPARC64)
 #define TARGET_DPREGS 16
 #else
diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 2c27713..0459243 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -32,8 +32,6 @@
 #include "exec/cpu-defs.h"
 #include "xtensa-isa.h"
 
-#define ALIGNED_ONLY
-
 /* Xtensa processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO  (0)
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index be2c33c..8d23fb0 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1926,7 +1926,7 @@ static const char * const ldst_name[] =
 };
 
 static const char * const alignment_name[(MO_AMASK >> MO_ASHIFT) +

[Qemu-devel] [PATCH v3 0/1] configure: Define target access alignment in configure

2019-07-24 Thread tony.nguyen

Move the define of target access alignment earlier from
target/foo/cpu.h to configure.

Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp
is now accelerator independent MemOp"

Analysed target/foo/cpu.h for more candidates to define earlier but
did not spot any other straightforward predicates.

Possible future clean ups:
- TCG_GUEST_DEFAULT_MO and TCG_TARGET_DEFAULT_MO seem like duplicates
- TARGET_INSN_START_EXTRA_WORDS 1 seems redundant as ifndef value is 1

v2:
- split cosmetic changes into separate patch
- cc corresponding maintainers

v3:
- dropped cosmetic changes
- improved commit message

Tony Nguyen (1):
  configure: Define target access alignment in configure

 configure | 10 +-
 include/exec/poison.h |  1 +
 include/qom/cpu.h |  2 +-
 target/alpha/cpu.h|  2 --
 target/hppa/cpu.h |  1 -
 target/mips/cpu.h |  2 --
 target/sh4/cpu.h  |  2 --
 target/sparc/cpu.h|  2 --
 target/xtensa/cpu.h   |  2 --
 tcg/tcg.c |  2 +-
 tcg/tcg.h |  8 +---
 11 files changed, 17 insertions(+), 17 deletions(-)

-- 
1.8.3.1

Re: [Qemu-devel] [Qemu-block] [QEMU] [PATCH v5 0/8] Add Qemu to SeaBIOS LCHS interface

2019-07-24 Thread John Snow




On 7/24/19 8:47 PM, John Snow wrote:
> 
> 
> On 7/19/19 6:10 AM, Sam Eiderman wrote:
>> Well, this patch introduces 3 command line parameters (“lcyls”, “lheads”, “lsecs”)
>> to “scsi-hd” “ide-hd” and “virtio-pci-blk” so this somehow has something to do with
>> block.
>>
>> This patch also adds fw_cfg interface to send these parameters to SeaBIOS.
>>
>> "scripts/get_maintainer.pl -f hw/nvram/fw_cfg.c” gives
>>
>> "Philippe Mathieu-Daudé"  (supporter:Firmware configur...)
>> Laszlo Ersek  (reviewer:Firmware configur...)
>> Gerd Hoffmann  (reviewer:Firmware configur…)
>>
>> And this was already Reviewed-by Gerd.
>>
>> How should I proceed?
>>
>> Sam
>>
> 
> I feel like it would be up to Gerd as the general SeaBIOS point of contact?
> 

...ah, who is offline for vacation.

We're in freeze right now anyway, so I would think that Gerd and/or
Kevin can work out who ought to stage this for a PR when the tree opens
again.

Re: [Qemu-devel] [Qemu-block] [QEMU] [PATCH v5 0/8] Add Qemu to SeaBIOS LCHS interface

2019-07-24 Thread John Snow




On 7/19/19 6:10 AM, Sam Eiderman wrote:
> Well, this patch introduces 3 command line parameters (“lcyls”, “lheads”, “lsecs”)
> to “scsi-hd” “ide-hd” and “virtio-pci-blk” so this somehow has something to do with
> block.
> 
> This patch also adds fw_cfg interface to send these parameters to SeaBIOS.
> 
> "scripts/get_maintainer.pl -f hw/nvram/fw_cfg.c” gives
> 
> "Philippe Mathieu-Daudé"  (supporter:Firmware configur...)
> Laszlo Ersek  (reviewer:Firmware configur...)
> Gerd Hoffmann  (reviewer:Firmware configur…)
> 
> And this was already Reviewed-by Gerd.
> 
> How should I proceed?
> 
> Sam
> 

I feel like it would be up to Gerd as the general SeaBIOS point of contact?

--js

Re: [Qemu-devel] [Qemu-block] [PATCH-for-4.1? 6/7] vl: Rewrite a fall through comment

2019-07-24 Thread John Snow




On 7/19/19 9:14 AM, Philippe Mathieu-Daudé wrote:
> GCC9 is confused by this comment when building with CFLAG
> -Wimplicit-fallthrough=2:
> 
>   vl.c: In function ‘qemu_ref_timedate’:
>   vl.c:773:15: error: this statement may fall through [-Werror=implicit-fallthrough=]
> 773 | value -= rtc_realtime_clock_offset;
> | ~~^~~~
>   vl.c:775:5: note: here
> 775 | case QEMU_CLOCK_VIRTUAL:
> | ^~~~
>   cc1: all warnings being treated as errors
> 
> Rewrite the comment using 'fall through' which is recognized by
> GCC and static analyzers.
> 
> Reported-by: Stefan Weil 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  vl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/vl.c b/vl.c
> index a5808f9a02..f5cf71e3b4 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -771,7 +771,7 @@ static time_t qemu_ref_timedate(QEMUClockType clock)
>  switch (clock) {
>  case QEMU_CLOCK_REALTIME:
>  value -= rtc_realtime_clock_offset;
> -/* no break */
> +/* fall through */
>  case QEMU_CLOCK_VIRTUAL:
>  value += rtc_ref_start_datetime;
>  break;
> 

Reviewed-by: John Snow
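
As a standalone illustration of the pattern the patch touches, the sketch below shows an intentional fall through marked with the 'fall through' comment that the commit message says GCC and static analyzers recognize. It is a minimal example under assumptions, not the vl.c code: clock_kind, its enumerators and adjust() are hypothetical names chosen for this sketch.

```c
/* Hypothetical clock kinds, loosely modelled on the switch in the patch. */
typedef enum {
    CLOCK_KIND_REALTIME,
    CLOCK_KIND_VIRTUAL,
    CLOCK_KIND_HOST
} clock_kind;

/*
 * Adjust a time value depending on the clock kind. The realtime case
 * subtracts its offset and then deliberately continues into the virtual
 * case, which adds the reference start time.
 */
static long adjust(clock_kind kind, long value, long realtime_offset,
                   long ref_start)
{
    switch (kind) {
    case CLOCK_KIND_REALTIME:
        value -= realtime_offset;
        /* fall through */
    case CLOCK_KIND_VIRTUAL:
        value += ref_start;
        break;
    case CLOCK_KIND_HOST:
        break;
    }
    return value;
}
```

The comment must sit directly before the next case label, with no statement in between, for the marker to be associated with that fall through.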

Re: [Qemu-devel] [Qemu-block] [PATCH-for-4.1 3/7] hw/block/pflash_cfi02: Rewrite a fall through comment

2019-07-24 Thread John Snow




On 7/22/19 7:43 AM, Philippe Mathieu-Daudé wrote:
> On 7/19/19 3:14 PM, Philippe Mathieu-Daudé wrote:
>> GCC9 is confused by this comment when building with CFLAG
>> -Wimplicit-fallthrough=2:
>>
>>   hw/block/pflash_cfi02.c: In function ‘pflash_write’:
>>   hw/block/pflash_cfi02.c:574:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
>> 574 | if (boff == 0x55 && cmd == 0x98) {
>> |^
>>   hw/block/pflash_cfi02.c:581:9: note: here
>> 581 | default:
>> | ^~~
>>   cc1: all warnings being treated as errors
>>
>> Rewrite the comment using 'fall through' which is recognized by
>> GCC and static analyzers.
>>
>> Reported-by: Stefan Weil 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  hw/block/pflash_cfi02.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
>> index f68837a449..42886f6af5 100644
>> --- a/hw/block/pflash_cfi02.c
>> +++ b/hw/block/pflash_cfi02.c
>> @@ -577,7 +577,7 @@ static void pflash_write(void *opaque, hwaddr offset, uint64_t value,
>>  pfl->cmd = 0x98;
>>  return;
>>  }
>> -/* No break here */
>> +/* fall through */
>>  default:
>>  DPRINTF("%s: invalid write for command %02x\n",
>>  __func__, pfl->cmd);
>>
> 
> Queued to pflash/next, thanks.
> 

Are you queueing everything or just this one patch? It would be a little
inconvenient to split a series up like that.

(Most other maintainers will, I believe, expect that with an "ACK" or
similar that someone else will stage the series.)

--js

[Qemu-devel] [PATCH v2] migration/postcopy: use mis->bh instead of allocating a QEMUBH

2019-07-24 Thread Wei Yang

On the migration incoming side, the migration finishes in either precopy
or postcopy. It is safe to use mis->bh for both instead of allocating a
dedicated QEMUBH for postcopy.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 

---
v2: fix a typo in change log
---
 migration/savevm.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 20eb116e7f..f87444ae4e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1860,16 +1860,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
 return 0;
 }
 
-
-typedef struct {
-QEMUBH *bh;
-} HandleRunBhData;
-
 static void loadvm_postcopy_handle_run_bh(void *opaque)
 {
 Error *local_err = NULL;
-HandleRunBhData *data = opaque;
-MigrationIncomingState *mis = migration_incoming_get_current();
+MigrationIncomingState *mis = opaque;
 
 /* TODO we should move all of this lot into postcopy_ram.c or a shared code
  * in migration.c
@@ -1901,8 +1895,7 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 runstate_set(RUN_STATE_PAUSED);
 }
 
-qemu_bh_delete(data->bh);
-g_free(data);
+qemu_bh_delete(mis->bh);
 }
 
 /* After all discards we can start running and asking for pages */
@@ -1910,7 +1903,6 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
 {
 PostcopyState old_ps = POSTCOPY_INCOMING_LISTENING;
 PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING, &old_ps);
-HandleRunBhData *data;
 
 trace_loadvm_postcopy_handle_run();
 if (ps != old_ps) {
@@ -1918,9 +1910,8 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
 return -1;
 }
 
-data = g_new(HandleRunBhData, 1);
-data->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, data);
-qemu_bh_schedule(data->bh);
+mis->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);
+qemu_bh_schedule(mis->bh);
 
 /* We need to finish reading the stream from the package
  * and also stop reading anything more from the stream that loaded the
-- 
2.17.1

Re: [Qemu-devel] [Qemu-block] [PATCH] Fixes: a6862418fec4072 iotests change in 051.out

2019-07-24 Thread John Snow




On 7/24/19 4:47 AM, Christian Borntraeger wrote:
> 
> 
> On 24.07.19 10:25, Andrey Shinkevich wrote:
>> The patch "iotests: Set read-zeroes on in null block driver for Valgrind"
>> needs the change in 051.out when compared against on the s390 system.
>>
>> Reported-by: Christian Borntraeger 
> Tested-by: Christian Borntraeger 
> 
> FWIW, the Fixes tag should be inside the patch description.
> Maybe Kevin will fix this up when applying?
> 

If you move the Fixes: into the commit message:

Reviewed-by: John Snow

[Qemu-devel] [PATCH v2 0/2] migration: cleanup ram_load

2019-07-24 Thread Wei Yang

Two cleanups for ram_load:

* return -EINVAL for version_id mismatch
* extract ram_load_precopy for better readability

v2: fix a comment

Wei Yang (2):
  migration: return -EINVAL directly when version_id mismatch
  migration: extract ram_load_precopy

 migration/ram.c | 73 +++--
 1 file changed, 46 insertions(+), 27 deletions(-)

-- 
2.17.1

[Qemu-devel] [PATCH v2 1/2] migration: return -EINVAL directly when version_id mismatch

2019-07-24 Thread Wei Yang

It is not reasonable to continue when the version_id does not match.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 66792568e2..69c8a6bb0f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4217,7 +4217,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 seq_iter++;
 
 if (version_id != 4) {
-ret = -EINVAL;
+return -EINVAL;
 }
 
 if (!migrate_use_compression()) {
-- 
2.17.1

[Qemu-devel] [PATCH v2 2/2] migration: extract ram_load_precopy

2019-07-24 Thread Wei Yang

After the cleanup, it is clear to the reader that ram_load has two
cases:

  * precopy
  * postcopy

And it is not necessary to check postcopy_running on each iteration for
precopy.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 

---
v2: fix a comment
---
 migration/ram.c | 73 +++--
 1 file changed, 46 insertions(+), 27 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 69c8a6bb0f..bee6a88c6d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4201,40 +4201,26 @@ static void colo_flush_ram_cache(void)
 trace_colo_flush_ram_cache_end();
 }
 
-static int ram_load(QEMUFile *f, void *opaque, int version_id)
+/**
+ * ram_load_precopy: load pages in precopy case
+ *
+ * Returns 0 for success or -errno in case of error
+ *
+ * Called in precopy mode by ram_load().
+ * rcu_read_lock is taken prior to this being called.
+ *
+ * @f: QEMUFile where to send the data
+ */
+static int ram_load_precopy(QEMUFile *f)
 {
-int flags = 0, ret = 0, invalid_flags = 0;
-static uint64_t seq_iter;
-int len = 0;
-/*
- * If system is running in postcopy mode, page inserts to host memory must
- * be atomic
- */
-bool postcopy_running = postcopy_is_running();
+int flags = 0, ret = 0, invalid_flags = 0, len = 0;
 /* ADVISE is earlier, it shows the source has the postcopy capability on */
 bool postcopy_advised = postcopy_is_advised();
-
-seq_iter++;
-
-if (version_id != 4) {
-return -EINVAL;
-}
-
 if (!migrate_use_compression()) {
 invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
 }
-/* This RCU critical section can be very long running.
- * When RCU reclaims in the code start to become numerous,
- * it will be necessary to reduce the granularity of this
- * critical section.
- */
-rcu_read_lock();
-
-if (postcopy_running) {
-ret = ram_load_postcopy(f);
-}
 
-while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
+while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
 ram_addr_t addr, total_ram_bytes;
 void *host = NULL;
 uint8_t ch;
@@ -4391,6 +4377,39 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 }
 }
 
+return ret;
+}
+
+static int ram_load(QEMUFile *f, void *opaque, int version_id)
+{
+int ret = 0;
+static uint64_t seq_iter;
+/*
+ * If system is running in postcopy mode, page inserts to host memory must
+ * be atomic
+ */
+bool postcopy_running = postcopy_is_running();
+
+seq_iter++;
+
+if (version_id != 4) {
+return -EINVAL;
+}
+
+/*
+ * This RCU critical section can be very long running.
+ * When RCU reclaims in the code start to become numerous,
+ * it will be necessary to reduce the granularity of this
+ * critical section.
+ */
+rcu_read_lock();
+
+if (postcopy_running) {
+ret = ram_load_postcopy(f);
+} else {
+ret = ram_load_precopy(f);
+}
+
 ret |= wait_for_decompress_done();
 rcu_read_unlock();
 trace_ram_load_complete(ret, seq_iter);
-- 
2.17.1
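
The control flow that the two ram_load patches arrive at can be summarised in a few lines. This is a simplified sketch under assumptions rather than the migration code itself: load_dispatch, load_precopy_stub and load_postcopy_stub are hypothetical names, the stubs only count their calls, and the RCU locking, decompression wait and tracing from the real function are left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Call counters so the dispatch can be observed from outside. */
static int precopy_calls;
static int postcopy_calls;

/* Hypothetical stand-in for the precopy loader. */
static int load_precopy_stub(void)
{
    precopy_calls++;
    return 0;
}

/* Hypothetical stand-in for the postcopy loader. */
static int load_postcopy_stub(void)
{
    postcopy_calls++;
    return 0;
}

/*
 * Reject an unexpected stream version up front, then pick exactly one
 * of the two loaders. Returns 0 on success or a negative errno value.
 */
static int load_dispatch(int version_id, bool postcopy_running)
{
    if (version_id != 4) {
        return -EINVAL; /* return early instead of carrying on with ret set */
    }
    if (postcopy_running) {
        return load_postcopy_stub();
    }
    return load_precopy_stub();
}
```

Deciding between the two loaders once, before any loop runs, is what lets the precopy loop drop its per-iteration postcopy_running check.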

Re: [Qemu-devel] [PATCH v3] tests/boot_linux_console: add a test for riscv64 + virt

2019-07-24 Thread Alistair Francis

On Tue, Jul 23, 2019 at 11:46 PM Chih-Min Chao  wrote:
>
> Similar to the mips + malta test, it boots a Linux kernel on a virt
> board and verifies the serial is working.  Also, it relies on the serial
> device set by the machine itself.
>
> If riscv64 is a target being built, "make check-acceptance" will
> automatically include this test by the use of the "arch:riscv64" tags.
>
> Alternatively, this test can be run using:
>
>   $ avocado run -t arch:riscv64 tests/acceptance
>
> packages
>   debian official
> binutils-riscv64-linux-gnu_2.32-8
> opensbi_0.4-1_all
> linux-image-5.0.0-trunk-riscv64 5.0.2-1~exp1
>   third-party
> https://github.com/groeck/linux-build-test/rootfs/riscv64/rootfs.cpio.gz
> (the repo is also used in mips target acceptance)
>
> Signed-off-by: Chih-Min Chao 
> ---
>  .travis.yml|  2 +-
>  tests/acceptance/boot_linux_console.py | 67 ++
>  2 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/.travis.yml b/.travis.yml
> index caf0a1f..7ba9952 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -232,7 +232,7 @@ matrix:
>
>  # Acceptance (Functional) tests
>  - env:
> -- CONFIG="--python=/usr/bin/python3 --target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu"
> +- CONFIG="--python=/usr/bin/python3 --target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu,riscv64-softmmu"
>  - TEST_CMD="make check-acceptance"
>after_failure:
>  - cat tests/results/latest/job.log
> diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> index 3215950..b0569b9 100644
> --- a/tests/acceptance/boot_linux_console.py
> +++ b/tests/acceptance/boot_linux_console.py
> @@ -354,3 +354,70 @@ class BootLinuxConsole(Test):
>  self.vm.launch()
>  console_pattern = 'Kernel command line: %s' % kernel_command_line
>  self.wait_for_console_pattern(console_pattern)
> +
> +def test_riscv64_virt(self):
> +"""
> +:avocado: tags=arch:riscv64
> +:avocado: tags=machine:virt
> +"""
> +deb_url = ('https://snapshot.debian.org/archive/debian/'
> + '20190424T171759Z/pool/main/b/binutils/'
> + 'binutils-riscv64-linux-gnu_2.32-8_amd64.deb')
> +deb_hash = ('7fe376fd4452696c03acd508d6d613ca553ea15e')
> +deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash)
> +objcopy_path = '/usr/bin/riscv64-linux-gnu-objcopy'
> +objcopy_path = self.extract_from_deb(deb_path, objcopy_path)
> +libbfd_path = '/usr/lib/x86_64-linux-gnu/libbfd-2.32-riscv64.so'
> +libbfd_path = self.extract_from_deb(deb_path, libbfd_path)
> +process.run('ls -al %s' % (objcopy_path))

Why do we need objcopy? Won't this fail on non-x86 architectures?

> +
> +deb_url = ('https://snapshot.debian.org/archive/debian/'
> +   '20190708T032337Z/pool/main/o/opensbi/'
> +   'opensbi_0.4-1_all.deb')
> +deb_hash = ('2319dcd702958291d323acf5649fd98a11d90112')
> +deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash)
> +opensbi_path = ('/usr/lib/riscv64-linux-gnu/opensbi/'
> +'qemu/virt/fw_jump.elf')
> +opensbi_path = self.extract_from_deb(deb_path, opensbi_path)
> +
> +deb_url = ('https://snapshot.debian.org/archive/debian-ports/'
> +   '20190319T205124Z/pool-riscv64/main/l/linux/'
> +   'linux-image-5.0.0-trunk-riscv64_5.0.2-1~exp1_riscv64.deb')
> +deb_hash = ('90155ed4b36673cbf7746a37cf3159c8f0b2857a')
> +deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash)
> +kernel_path = '/boot/vmlinux-5.0.0-trunk-riscv64'

I thought we were swapping to using an Image file?

Alistair

> +kernel_path = self.extract_from_deb(deb_path, kernel_path)
> +kimage_path = self.workdir + "/Image"
> +env = os.environ
> +env['LD_LIBRARY_PATH'] = ('%s:' % (os.path.dirname(libbfd_path)) +
> + env.get('LD_LIBRARY_PATH', ''))
> +process.run(('%s -O binary -O binary -R'
> + '.note -R .note.gnu.build-id -R .comment -S %s %s') %
> + (objcopy_path, kernel_path, kimage_path))
> +
> +initrd_url = ('https://github.com/groeck/linux-build-test/raw/'
> +  '8584a59ed9e5eb5ee7ca91f6d74bbb06619205b8/rootfs/'
> +  'riscv64/rootfs.cpio.gz')
> +initrd_hash = 'f4867d263754961b6f626cdcdc0cb334c47e3b49'
> +initrd_path = self.fetch_asset(initrd_url, asset_hash=initrd_hash)
> +
> +self.vm.set_machine('virt')
> +self.vm.set_console()
> +kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE
> +

[Qemu-devel] Sphinx and docs/index.rst: dead code?

2019-07-24 Thread John Snow

Does anything actually use this file? It doesn't appear to be used for
generating the HTML manuals.

It looks like we might use it for latex, man and texinfo output from
sphinx judging by docs/conf.py, but I don't think we actually use sphinx
to generate such output, so I think this is all dead code.

Am I mistaken?

--js

Re: [Qemu-devel] [PATCH v2 06/13] doc: update AMD SEV to include Live migration flow

2019-07-24 Thread Venu Busireddy

On 2019-07-10 20:23:03 +, Singh, Brijesh wrote:
> Signed-off-by: Brijesh Singh 
> ---
>  docs/amd-memory-encryption.txt | 42 +-
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/amd-memory-encryption.txt b/docs/amd-memory-encryption.txt
> index abb9a976f5..374f4b0a94 100644
> --- a/docs/amd-memory-encryption.txt
> +++ b/docs/amd-memory-encryption.txt
> @@ -89,7 +89,47 @@ TODO
>  
>  Live Migration
>  
> -TODO
> +AMD SEV encrypts the memory of VMs and because a different key is used
> +in each VM, the hypervisor will be unable to simply copy the
> +ciphertext from one VM to another to migrate the VM. Instead the AMD SEV Key
> +Management API provides sets of function which the hypervisor can use
> +to package a guest page for migration, while maintaining the confidentiality
> +provided by AMD SEV.
> +
> +SEV guest VMs have the concept of private and shared memory. The private
> +memory is encrypted with the guest-specific key, while shared memory may
> +be encrypted with the hypervisor key. The migration APIs provided by the
> +SEV API spec should be used for migrating the private pages. The
> +KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page encryption
> +bitmap. The bitmap can be used to check if the given guest page is
> +private or shared.
> +
> +Before initiating the migration, we need to know the target machine's public
> +Diffie-Hellman key (PDH) and certificate chain. It can be retrieved
> +with the 'query-sev-capabilities' QMP command or using the sev-tool. The
> +migrate-set-sev-info object can be used to pass the target machine's PDH and
> +certificate chain.
> +
> +e.g
> +(QMP) migrate-sev-set-info pdh= plat-cert= \

'migrate-sev-set-info' needs to be changed to 'migrate-set-sev-info'.

> +   amd-cert=
> +(QMP) migrate tcp:0:
> +
> +
> +During the migration flow, the SEND_START is called on the source hypervisor
> +to create outgoing encryption context. The SEV guest policy dictates whether
> +the certificate passed through the migrate-sev-set-info command will be

Same here.

> +validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> +After migration is completed, SEND_FINISH is called to destroy the encryption
> +context and make the VM non-runnable to protect it against the cloning.
> +
> +On the target machine, RECEIVE_START is called first to create an
> +incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> +the received encrypted page into guest memory. After migration has
> +completed, RECEIVE_FINISH is called to make the VM runnable.
> +
> +For more information about the migration see SEV API Appendix A
> +Usage flow (Live migration section).
>  
>  References
>  -
> -- 
> 2.17.1
> 
>

Re: [Qemu-devel] [PATCH 2/2] migration: extract ram_load_precopy

2019-07-24 Thread Wei Yang

On Wed, Jul 24, 2019 at 01:10:24PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> On Tue, Jul 23, 2019 at 05:47:03PM +0100, Dr. David Alan Gilbert wrote:
>> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> >> After cleanup, it would be clear to audience there are two cases
>> >> ram_load:
>> >> 
>> >>   * precopy
>> >>   * postcopy
>> >> 
>> >> And it is not necessary to check postcopy_running on each iteration for
>> >> precopy.
>> >> 
>> >> Signed-off-by: Wei Yang 
>> >> ---
>> >>  migration/ram.c | 73 +++--
>> >>  1 file changed, 46 insertions(+), 27 deletions(-)
>> >> 
>> >> diff --git a/migration/ram.c b/migration/ram.c
>> >> index 6bfdfae16e..5f6f07b255 100644
>> >> --- a/migration/ram.c
>> >> +++ b/migration/ram.c
>> >> @@ -4200,40 +4200,26 @@ static void colo_flush_ram_cache(void)
>> >>  trace_colo_flush_ram_cache_end();
>> >>  }
>> >>  
>> >> -static int ram_load(QEMUFile *f, void *opaque, int version_id)
>> >> +/**
>> >> + * ram_load_precopy: load a page in precopy case
>> >
>> >This comment is wrong - although I realise you copied it from the
>> >postcopy case; they don't just load a single page; they load 'pages'
>> >
>> 
>> Thanks for pointing out.
>> 
>> Actually, I got one confusion in these two load. Compare these two cases, I
>> found precopy would handle two more cases:
>> 
>>   * precopy:  RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
>>   RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE
>>   * postcopy: RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE
>> 
>> Why postcopy doesn't need to handle the other two cases? Function
>> ram_save_target_page() does the same thing in precopy and postcopy. I don't
>> find the reason the behavior differs. Would you mind giving me a hint?
>
>Because we don't support either compression or xbzrle with postcopy.
>Compression could be fixed, but it needs to make sure it uses the
>place-page function to atomically place the page.
>

Thanks for the explanation. Sounds like I missed some points.

Is postcopy_place_page() the place-page function to use?

>xbzrle never gets used during the postcopy stage; it gets used
>in the precopy stage in a migration that might switch to postcopy
>though.  Since xbzrle relies on optimising differences between
>passes, it's
>   1) Not needed in postcopy where there's only one final pass
>   2) Since the destination is changing RAM, you can't transmit
>  deltas relative to the old data, since that data may have
>  changed.
>
>Dave
>
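Point 2 above can be illustrated with a toy delta scheme. The sketch below is not QEMU's XBZRLE encoder (which also run-length encodes the result); it only XORs an old and a new copy of a page, and the helper names are made up for illustration. A delta computed against the source's old copy reproduces the new page only if the destination still holds that same old copy:

```python
def make_delta(old_page: bytes, new_page: bytes) -> bytes:
    """XOR two equally sized pages; unchanged bytes become zero."""
    return bytes(a ^ b for a, b in zip(old_page, new_page))


def apply_delta(base_page: bytes, delta: bytes) -> bytes:
    """Rebuild a page by XOR-ing the delta onto the receiver's base copy."""
    return bytes(a ^ b for a, b in zip(base_page, delta))


old = bytes([1, 2, 3, 4])          # page contents sent in an earlier pass
new = bytes([1, 2, 9, 4])          # same page after the guest dirtied it
delta = make_delta(old, new)

# Precopy: the destination still holds `old`, so the delta is enough.
precopy_result = apply_delta(old, delta)

# Postcopy: the destination guest is running and has modified its copy,
# so the same delta no longer reconstructs `new`.
changed_on_destination = bytes([7, 2, 3, 4])
postcopy_result = apply_delta(changed_on_destination, delta)
```

In this toy model precopy_result equals new while postcopy_result does not, which is the situation described for a destination that is already changing RAM.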
>> >Other than that, I think it's OK, so:
>> >
>> >
>> >Reviewed-by: Dr. David Alan Gilbert 
>> >
>> 
>> -- 
>> Wei Yang
>> Help you, Help me
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me

Re: [Qemu-devel] [PATCH v3] qapi: add dirty-bitmaps to query-named-block-nodes result

2019-07-24 Thread John Snow




On 7/24/19 12:47 AM, Markus Armbruster wrote:
> John Snow  writes:
> 
>> From: Vladimir Sementsov-Ogievskiy 
>>
>> Let's add a possibility to query dirty-bitmaps not only on root nodes.
>> It is useful when dealing both with snapshots and incremental backups.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> [Added deprecation information. --js]
>> Signed-off-by: John Snow 
>> ---
>>  block/qapi.c |  5 +
>>  qapi/block-core.json |  6 +-
>>  qemu-deprecated.texi | 12 
>>  3 files changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/qapi.c b/block/qapi.c
>> index 917435f022..15f1030264 100644
>> --- a/block/qapi.c
>> +++ b/block/qapi.c
>> @@ -79,6 +79,11 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>>  info->backing_file = g_strdup(bs->backing_file);
>>  }
>>  
>> +if (!QLIST_EMPTY(&bs->dirty_bitmaps)) {
>> +info->has_dirty_bitmaps = true;
>> +info->dirty_bitmaps = bdrv_query_dirty_bitmaps(bs);
>> +}
>> +
>>  info->detect_zeroes = bs->detect_zeroes;
>>  
>>  if (blk && blk_get_public(blk)->throttle_group_member.throttle_state) {
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 0d43d4f37c..9210ae233d 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -360,6 +360,9 @@
>>  # @write_threshold: configured write threshold for the device.
>>  #   0 if disabled. (Since 2.3)
>>  #
>> +# @dirty-bitmaps: dirty bitmaps information (only present if node
>> +# has one or more dirty bitmaps) (Since 4.2)
>> +#
>>  # Since: 0.14.0
>>  #
>>  ##
>> @@ -378,7 +381,7 @@
>>  '*bps_wr_max_length': 'int', '*iops_max_length': 'int',
>>  '*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int',
>>  '*iops_size': 'int', '*group': 'str', 'cache': 
>> 'BlockdevCacheInfo',
>> -'write_threshold': 'int' } }
>> +'write_threshold': 'int', '*dirty-bitmaps': ['BlockDirtyInfo'] 
>> } }
>>  
>>  ##
>>  # @BlockDeviceIoStatus:
>> @@ -656,6 +659,7 @@
>>  #
>>  # @dirty-bitmaps: dirty bitmaps information (only present if the
>>  # driver has one or more dirty bitmaps) (Since 2.0)
>> +# Deprecated in 4.2; see BlockDirtyInfo instead.
>>  #
>>  # @io-status: @BlockDeviceIoStatus. Only present if the device
>>  # supports it and the VM is configured to stop on errors
>> diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
>> index c90b08d553..6374b66546 100644
>> --- a/qemu-deprecated.texi
>> +++ b/qemu-deprecated.texi
>> @@ -134,6 +134,18 @@ The ``status'' field of the ``BlockDirtyInfo'' 
>> structure, returned by
>>  the query-block command is deprecated. Two new boolean fields,
>>  ``recording'' and ``busy'' effectively replace it.
>>  
>> +@subsection query-block result field dirty-bitmaps (Since 4.2)
>> +
>> +The ``dirty-bitmaps`` field of the ``BlockInfo`` structure, returned by
>> +the query-block command is itself now deprecated. The ``dirty-bitmaps``
>> +field of the ``BlockDeviceInfo`` struct should be used instead, which is the
>> +type of the ``inserted`` field in query-block replies, as well as the
>> +type of array items in query-named-block-nodes.
> 
> Would the text be clearer if it talked only about commands, not about
> types?
> 
> Here's my (laconic) try:
> 
>@subsection query-block result field dirty-bitmaps (Since 4.2)
> 
>In the result of query-block, member ``dirty-bitmaps'' has been moved
>into member ``inserted''.
> 

Yeah, that's probably better in terms of strictly what the deprecation
is. I was trying to imply that the output will also now be visible in
other commands as well, but that's not the deprecation -- that's the new
feature.

ACK

> Aside: same for existing @subsection query-block result field
> dirty-bitmaps[i].status (since 4.0).
> 

(Probably not worth editing deprecation text that was already published.)

>> +Since the ``dirty-bitmaps`` field is optionally present in both the old and
>> +new locations, clients must use introspection to learn where to anticipate
>> +the field if/when it does appear in command output.
>> +
> 
> I find this hint a bit confusing.  Do we need it?
> 

I think so, yes: it's nice to inform readers of how to cope with the
deprecation.

> If yes, laconic me again:
> 
>Clients should use introspection to learn whether ``dirty-bitmaps'' is
>in the new location.
> 

Too terse. I want my documentation to greet me in the morning by reading
me the local newspaper while I brush my teeth.

Yours says the "how", but I think a hint should have the "why":

"Since the ``dirty-bitmaps`` field is not always present in command
output, Clients should use introspection to learn the location of this
field."

But I'm only willing to give you a self-deprecating joke and a final
nudge to keep a more informative hint, and then I'll capitulate and take
your suggestion if you give me a stern look.

--js

>>

[Qemu-devel] [PATCH] ati-vga: Fix GPIO_MONID register write

2019-07-24 Thread BALATON Zoltan

Also update bitbang_i2c state when output bits are changed while
enable bits are set. This fixes EDID access by the ATI FCode ROM.

Signed-off-by: BALATON Zoltan 
---
 hw/display/ati.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index d2116d2ab0..b849f5d510 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -564,12 +564,15 @@ static void ati_mm_write(void *opaque, hwaddr addr,
addr - GPIO_MONID, data, size);
 /*
  * Rage128p accesses DDC used to get EDID via these bits.
- * Only touch i2c when write overlaps 3rd byte because some
- * drivers access this reg via multiple partial writes and
- * without this spurious bits would be sent.
+ * Because some drivers access this via multiple byte writes
+ * we have to be careful when we send bits to avoid spurious
+ * changes in bitbang_i2c state. So only do it when mask is set
+ * and either the enable bits are changed or output bits changed
+ * while enabled.
  */
 if ((s->regs.gpio_monid & BIT(25)) &&
-addr <= GPIO_MONID + 2 && addr + size > GPIO_MONID + 2) {
+((addr <= GPIO_MONID + 2 && addr + size > GPIO_MONID + 2) ||
+ (addr == GPIO_MONID && (s->regs.gpio_monid & 0x60000)))) {
 s->regs.gpio_monid = ati_i2c(&s->bbi2c, s->regs.gpio_monid, 1);
 }
 }
-- 
2.13.7
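
The address test in the condition above can be sketched outside of QEMU as follows. This is a Python illustration with made-up names, and the register offset is a placeholder rather than the real GPIO_MONID value; it shows only the check that a partial write covers the third byte of the register:

```python
REG_BASE = 0x100  # placeholder offset, not the real GPIO_MONID value


def write_covers_byte(addr: int, size: int, byte_addr: int) -> bool:
    """True if a write of `size` bytes starting at `addr` includes `byte_addr`."""
    return addr <= byte_addr < addr + size


third_byte = REG_BASE + 2
full_write = write_covers_byte(REG_BASE, 4, third_byte)           # 32-bit write
low_byte_only = write_covers_byte(REG_BASE, 1, third_byte)        # byte 0 only
third_byte_only = write_covers_byte(REG_BASE + 2, 1, third_byte)  # byte 2 only
```

A full 32-bit write and a single-byte write to the third byte both pass this test, while a write to byte 0 alone does not. The patch adds a second trigger for writes starting at the register base while the enable bits are set.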

[Qemu-devel] Exploring Sphinx, autodoc, apidoc, and coverage tools for python/qemu

2019-07-24 Thread John Snow

Has anyone on this list experimented with these tools?

I was hoping to use them to document things like the python/machine.py
and python/qmp.py modules to help demonstrate some of our internal
tooling API (for test writers, GSoC/Outreachy interns, folks who want to
script QEMU at a level between writing a CLI driver and using libvirt.)

What follows below is my process trying to enable this and some of the
problems I'm still stuck with, summarized below at the end of this more
exploratory text.


Enabling autodoc:

First, it appears as if enabling the "sphinx.ext.autodoc" extension is not
sufficient for actually generating anything at all when you invoke the
sphinx-generated "make html" target. It just enables understanding
certain directives.

So apparently you need to generate module "stubs" using sphinx-apidoc.
Sphinx uses the autodoc extension to understand how to consume
the directives in these stubs.

That strikes me as odd, because these stubs might need to be changed
frequently as code comes and goes; it seems strange that it isn't
integrated at the top level. (Do I have the wrong idea on how these
tools should be used?)

So you need to run:
> sphinx-apidoc --separate --module-first -o docs/ python/qemu/

which generates stubs to docs:

Creating file docs/qemu.machine.rst.
Creating file docs/qemu.qmp.rst.
Creating file docs/qemu.qtest.rst.
Creating file docs/qemu.rst.
Creating file docs/modules.rst.

And then you can edit e.g. the top-level index.rst TOC in docs/index.rst
to look like this:

```
.. toctree::
   :maxdepth: 2
   :caption: Contents:

   interop/index
   devel/index
   specs/index
   modules
```

And then finally you generate the build, manually removing the -W option
from the Makefile because there are a lot of warnings in here:

> sphinx-build -n -b html -D version=4.0.92 -D release="4.0.92 (v4.1.0-rc2-34-g160802eb07-dirty)" -d .doctrees/ /home/bos/jhuston/src/qemu/docs/ docs/

Great! That will generate output to docs/index.html, which indeed shows
APIdoc comments generated from our Python files. Good.

However, where this gets a little strange is if you look at the
generated stubs. For instance, qemu.machine.rst looks like this:

```
.. automodule:: qemu.machine
    :members:
    :undoc-members:
    :show-inheritance:
```

:undoc-members: says that we want to "document" any members that don't
have a matching apidoc comment by generating a stub.

Oops, but the presence of that stub will cause the sphinx coverage tool
to happily report 100% coverage.

Further oops, pylint doesn't understand apidoc comments and can't be
used as the linter in this case, either.

You can edit the stubs to remove these directives, but these stubs are
generated -- and it doesn't appear like there's a command line option to
change this behavior. ...Hmm.
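
One thing that may help here: sphinx-apidoc documents a SPHINX_APIDOC_OPTIONS environment variable, a comma-separated list of the options it puts on the generated automodule directives. A minimal sketch, assuming that variable behaves as documented and using a made-up helper name, builds the command and environment for a run that leaves out undoc-members:

```python
import os


def apidoc_invocation(source_dir, output_dir,
                      options=("members", "show-inheritance")):
    """Return (argv, env) for a sphinx-apidoc run with custom directive options.

    `apidoc_invocation` is a hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # Comma-separated automodule options; "undoc-members" is left out on purpose.
    env["SPHINX_APIDOC_OPTIONS"] = ",".join(options)
    argv = ["sphinx-apidoc", "--separate", "--module-first",
            "-o", output_dir, source_dir]
    return argv, env


argv, env = apidoc_invocation("python/qemu/", "docs/")
```

Passing the result to subprocess.run(argv, env=env, check=True) would then regenerate the stubs; whether the output matches what is wanted still has to be checked against the installed Sphinx version.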

And either way, the coverage tool only generates a report and not
something with an error code that I could use to gate the build. Same
goes for the general build: if I remove the :undoc-members: parameter,
there's nothing in the autodoc module that appears to throw warnings
when it encounters undocumented parameters or members.

That seems disappointing, because it's hard to keep docstrings up to
date unless they are checked conclusively at build time.


Conclusions:

- the autodoc documentation page doesn't seem to document examples of
how you're expected to write meaningful docstrings for the tool to extract.

- autodoc fools the coverage tool into reporting 100% coverage.

- autodoc can be configured to omit non-documented members to allow the
coverage tool to work, but the configuration is auto-generated and
defaults to always generating documentation for these entities.

- the coverage tool doesn't appear to be usable for gating the build
natively on missing python docs; it only generates a report.

- Even if we script to block on a non-empty report, the coverage tool
only works at the function/class level and does not understand the
concept of missing parameter or return value tags.

- It would seem to be the autodoc module's job to understand
incomplete documentation, but it doesn't appear to do so. The :param name:
syntax is just a ReST "field list" and isn't parsed semantically by
autodoc, sadly.


It looks to me, at a glance, that there's nothing in Sphinx that knows
how to look for and warn about undocumented parameters, exception types,
return values, etc. Hopefully I've missed something and it is possible.
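
As a stopgap outside of Sphinx, a small script can compare each function's signature with the :param name: fields in its docstring. The following is only a sketch: undocumented_params and the example launch function are made up for illustration and are not part of python/qemu or of Sphinx.

```python
import inspect
import re

# Matches field list entries such as ":param name:" or ":param str name:"
# and captures the parameter name.
_PARAM_FIELD = re.compile(r"^\s*:param\s+(?:[^:]*\s+)?(\w+)\s*:", re.MULTILINE)


def undocumented_params(func):
    """Return the parameters of `func` that have no ":param name:" field."""
    doc = inspect.getdoc(func) or ""
    documented = set(_PARAM_FIELD.findall(doc))
    return [
        name
        for name in inspect.signature(func).parameters
        if name not in ("self", "cls") and name not in documented
    ]


def launch(binary, args=None, timeout=30):
    """
    Hypothetical example function, not part of python/qemu.

    :param binary: path to the emulator binary
    :param list args: extra command line arguments
    """
    return binary, args, timeout


missing = undocumented_params(launch)
```

Here missing is ['timeout'], and a wrapper script could exit non-zero whenever the list is non-empty in order to gate the build. Return values and exception types are not covered by this sketch.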

--js

Re: [Qemu-devel] [PATCH for-4.2 11/24] target/arm: Add the hypervisor virtual counter

2019-07-24 Thread Alex Bennée



Richard Henderson  writes:

> Signed-off-by: Richard Henderson 

Reviewed-by: Alex Bennée 

> ---
>  target/arm/cpu-qom.h |  1 +
>  target/arm/cpu.h | 11 +
>  target/arm/cpu.c |  2 ++
>  target/arm/helper.c  | 57 
>  4 files changed, 66 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
> index 2049fa9612..43fc8296db 100644
> --- a/target/arm/cpu-qom.h
> +++ b/target/arm/cpu-qom.h
> @@ -76,6 +76,7 @@ void arm_gt_ptimer_cb(void *opaque);
>  void arm_gt_vtimer_cb(void *opaque);
>  void arm_gt_htimer_cb(void *opaque);
>  void arm_gt_stimer_cb(void *opaque);
> +void arm_gt_hvtimer_cb(void *opaque);
>
>  #define ARM_AFF0_SHIFT 0
>  #define ARM_AFF0_MASK  (0xFFULL << ARM_AFF0_SHIFT)
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index e37008a4f7..bba4e1f984 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -144,11 +144,12 @@ typedef struct ARMGenericTimer {
>  uint64_t ctl; /* Timer Control register */
>  } ARMGenericTimer;
>
> -#define GTIMER_PHYS 0
> -#define GTIMER_VIRT 1
> -#define GTIMER_HYP  2
> -#define GTIMER_SEC  3
> -#define NUM_GTIMERS 4
> +#define GTIMER_PHYS 0
> +#define GTIMER_VIRT 1
> +#define GTIMER_HYP  2
> +#define GTIMER_SEC  3
> +#define GTIMER_HYPVIRT  4
> +#define NUM_GTIMERS 5
>
>  typedef struct {
>  uint64_t raw_tcr;
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 1959467fdc..90352decc5 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1218,6 +1218,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>arm_gt_htimer_cb, cpu);
>  cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
>arm_gt_stimer_cb, cpu);
> +cpu->gt_timer[GTIMER_HYPVIRT] = timer_new(QEMU_CLOCK_VIRTUAL, 
> GTIMER_SCALE,
> +  arm_gt_hvtimer_cb, cpu);
>  #endif
>
>  cpu_exec_realizefn(cs, _err);
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 3124d682a2..329548e45d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -2527,6 +2527,7 @@ static uint64_t gt_tval_read(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>
>  switch (timeridx) {
>  case GTIMER_VIRT:
> +case GTIMER_HYPVIRT:
>  offset = gt_virt_cnt_offset(env);
>  break;
>  }
> @@ -2543,6 +2544,7 @@ static void gt_tval_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>
>  switch (timeridx) {
>  case GTIMER_VIRT:
> +case GTIMER_HYPVIRT:
>  offset = gt_virt_cnt_offset(env);
>  break;
>  }
> @@ -2698,6 +2700,34 @@ static void gt_sec_ctl_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>  gt_ctl_write(env, ri, GTIMER_SEC, value);
>  }
>
> +static void gt_hv_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +gt_timer_reset(env, ri, GTIMER_HYPVIRT);
> +}
> +
> +static void gt_hv_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
> + uint64_t value)
> +{
> +gt_cval_write(env, ri, GTIMER_HYPVIRT, value);
> +}
> +
> +static uint64_t gt_hv_tval_read(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +return gt_tval_read(env, ri, GTIMER_HYPVIRT);
> +}
> +
> +static void gt_hv_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
> + uint64_t value)
> +{
> +gt_tval_write(env, ri, GTIMER_HYPVIRT, value);
> +}
> +
> +static void gt_hv_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
> +uint64_t value)
> +{
> +gt_ctl_write(env, ri, GTIMER_HYPVIRT, value);
> +}
> +
>  void arm_gt_ptimer_cb(void *opaque)
>  {
>  ARMCPU *cpu = opaque;
> @@ -2726,6 +2756,13 @@ void arm_gt_stimer_cb(void *opaque)
>  gt_recalc_timer(cpu, GTIMER_SEC);
>  }
>
> +void arm_gt_hvtimer_cb(void *opaque)
> +{
> +ARMCPU *cpu = opaque;
> +
> +gt_recalc_timer(cpu, GTIMER_HYPVIRT);
> +}
> +
>  static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
>  /* Note that CNTFRQ is purely reads-as-written for the benefit
>   * of software; writing it doesn't actually change the timer frequency.
> @@ -6852,6 +6889,26 @@ void register_cp_regs_for_features(ARMCPU *cpu)
>.opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
>.access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
>.fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
> +#ifndef CONFIG_USER_ONLY
> +{ .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
> +  .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 2,
> +  .fieldoffset =
> +offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].cval),
> +  .type = ARM_CP_IO, .access = PL2_RW,
> +  .writefn = gt_hv_cval_write, .raw_writefn = raw_write },
> +{ .name = "CNTHV_TVAL_EL2", .state =

Re: [Qemu-devel] [RFC PATCH] pci: Use PCI aliases when determining device IOMMU address space

2019-07-24 Thread Alex Williamson

On Wed, 24 Jul 2019 08:43:55 -0600
Alex Williamson  wrote:

> On Wed, 24 Jul 2019 18:03:31 +0800
> Peter Xu  wrote:
> 
> > On Wed, Jul 24, 2019 at 05:39:22AM -0400, Michael S. Tsirkin wrote:  
> > > On Wed, Jul 24, 2019 at 03:14:39PM +0800, Peter Xu wrote:
> > > > On Tue, Jul 23, 2019 at 11:26:18AM -0600, Alex Williamson wrote:
> > > > > > On 3/29/19 11:49 AM, Alex Williamson wrote:
> > > > > > > [Cc +Brijesh]
> > > > > > > 
> > > > > > > Hi Brijesh, will the change below require the IVRS to be updated 
> > > > > > > to
> > > > > > > include aliases for all BDF ranges behind a conventional bridge?  
> > > > > > > I
> > > > > > > think the Linux code handles this regardless of the firmware 
> > > > > > > provided
> > > > > > > aliases, but is it required per spec for the ACPI tables to 
> > > > > > > include
> > > > > > > bridge aliases?  Thanks,
[snip]

For a data point, I fired up an old 990FX system which includes a
PCIe-to-PCI bridge and I added a plugin PCIe-to-PCI bridge just to keep
things interesting... guess how many alias ranges are in the IVRS...
Yep, just the one built into the motherboard:

AMD-Vi: Using IVHD type 0x10
AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
AMD-Vi:mmio-addr: fec3
AMD-Vi:   DEV_SELECT_RANGE_START devid: 00:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 00:00.2
AMD-Vi:   DEV_SELECT devid: 00:09.0 flags: 00
AMD-Vi:   DEV_SELECT devid: 01:00.0 flags: 00
AMD-Vi:   DEV_SELECT devid: 00:0a.0 flags: 00
AMD-Vi:   DEV_SELECT devid: 02:00.0 flags: 00
AMD-Vi:   DEV_SELECT devid: 00:11.0 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START devid: 00:12.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 00:12.2
AMD-Vi:   DEV_SELECT_RANGE_START devid: 00:13.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 00:13.2
AMD-Vi:   DEV_SELECT devid: 00:14.0 flags: d7
AMD-Vi:   DEV_SELECT devid: 00:14.2 flags: 00
AMD-Vi:   DEV_SELECT devid: 00:14.3 flags: 00
AMD-Vi:   DEV_SELECT devid: 00:14.4 flags: 00
AMD-Vi:   DEV_ALIAS_RANGE devid: 03:00.0 flags: 00 devid_to: 00:14.4
AMD-Vi:   DEV_RANGE_END  devid: 03:1f.7

[Everything on bus 03:xx.x is aliased to device 00:14.4, the builtin 
PCIe-to-PCI bridge]

AMD-Vi:   DEV_SELECT devid: 00:14.5 flags: 00
AMD-Vi:   DEV_SELECT devid: 00:15.0 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START devid: 04:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 04:1f.7
AMD-Vi:   DEV_SELECT devid: 00:15.1 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START devid: 05:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 05:1f.7
AMD-Vi:   DEV_SELECT devid: 00:15.2 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START devid: 06:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 06:1f.7
AMD-Vi:   DEV_SELECT devid: 00:15.3 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START devid: 08:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 08:1f.7
AMD-Vi:   DEV_SELECT_RANGE_START devid: 00:16.0 flags: 00
AMD-Vi:   DEV_RANGE_END  devid: 00:16.2
AMD-Vi:   DEV_SPECIAL(IOAPIC[8])devid: 00:14.0
AMD-Vi:   DEV_SPECIAL(HPET[0])  devid: 00:14.0
AMD-Vi:   DEV_SPECIAL(IOAPIC[8])devid: 00:00.1

-[:00]-+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] RD9x0/RX980 Host 
Bridge
   +-00.2  Advanced Micro Devices, Inc. [AMD/ATI] RD890S/RD990 I/O 
Memory Management Unit (IOMMU)
   +-09.0-[01]00.0  Etron Technology, Inc. EJ168 USB 3.0 Host 
Controller
   +-0a.0-[02]00.0  Marvell Technology Group Ltd. 88SE9172 SATA 
6Gb/s Controller
   +-11.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 
SATA Controller [AHCI mode]
   +-12.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB 
OHCI0 Controller
   +-12.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB 
EHCI Controller
   +-13.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB 
OHCI0 Controller
   +-13.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB 
EHCI Controller
   +-14.0  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller
   +-14.2  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel 
HDA)
   +-14.3  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC 
host controller
   +-14.4-[03]--+-06.0  NVidia / SGS Thomson (Joint Venture) Riva128
   |\-0e.0  VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] 
IEEE 1394 OHCI Controller
   +-14.5  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB 
OHCI2 Controller
   +-15.0-[04]00.0  Realtek Semiconductor Co., Ltd. 
RTL8111/8168/8411

Re: [Qemu-devel] [PATCH v7 02/11] numa: move numa global variable nb_numa_nodes into MachineState

2019-07-24 Thread Eduardo Habkost

On Wed, Jul 24, 2019 at 05:48:11PM +0200, Igor Mammedov wrote:
> On Wed, 24 Jul 2019 12:02:41 -0300
> Eduardo Habkost  wrote:
> 
> > On Wed, Jul 24, 2019 at 04:27:21PM +0200, Igor Mammedov wrote:
> > > On Tue, 23 Jul 2019 12:23:57 -0300
> > > Eduardo Habkost  wrote:
> > > 
> > > > On Tue, Jul 23, 2019 at 04:56:41PM +0200, Igor Mammedov wrote:
> > > > > On Tue, 16 Jul 2019 22:51:12 +0800
> > > > > Tao Xu  wrote:
> > > > > 
> > > > > > Add struct NumaState in MachineState and move existing numa global
> > > > > > nb_numa_nodes(renamed as "num_nodes") into NumaState. And add 
> > > > > > variable
> > > > > > numa_support into MachineClass to decide which submachines support 
> > > > > > NUMA.
> > > > > > 
> > > > > > Suggested-by: Igor Mammedov 
> > > > > > Suggested-by: Eduardo Habkost 
> > > > > > Signed-off-by: Tao Xu 
> > > > > > ---
> > > > > > 
> > > > > > No changes in v7.
> > > > > > 
> > > > > > Changes in v6:
> > > > > > - Rebase to upstream, move globals in arm/sbsa-ref and use
> > > > > >   numa_mem_supported
> > > > > > - When used once or twice in the function, use
> > > > > >   ms->numa_state->num_nodes directly
> > > > > > - Correct some mistakes
> > > > > > - Use once monitor_printf in hmp_info_numa
> > > > > > ---
> > > > [...]
> > > > > >  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
> > > > > > -pxb->numa_node >= nb_numa_nodes) {
> > > > > > +pxb->numa_node >= ms->numa_state->num_nodes) {
> > > > > this will crash if user tries to use device on machine that doesn't 
> > > > > support numa
> > > > > check that numa_state is not NULL before dereferencing 
> > > > 
> > > > That's exactly why the machine_num_numa_nodes() was created in
> > > > v5, but then you asked for its removal.
> > > V4 to more precise.
> > > I dislike small wrappers because they usually doesn't simplify code and 
> > > make it more obscure,
> > > forcing to jump around to see what's really going on.
> > > Like it's implemented in this patch it's obvious what's wrong right away.
> > > 
> > > In that particular case machine_num_numa_nodes() was also misused since 
> > > only a handful
> > > of places (6) really need NULL check while majority (48) can directly 
> > > access ms->numa_state->num_nodes.
> > > without NULL check.
> > 
> > I strongly disagree, here.  Avoiding a ms->numa_state==NULL check
> > is pointless optimization,
> I see it not as optimization (compiler probably would manage to optimize out 
> most of them)
> but as rather properly self documented code. Doing check in places where it's
> not needed is confusing at best and can mask/introduce later subtle bugs at 
> worst.
> 
> > and leads to hard to spot bugs like
> > the one you saw above.
> That one was actually easy to spot because of the way it's written in this 
> patch.

When somebody is looking at a line of code containing
"ms->numa_state->num_nodes", how exactly are they supposed to
know if ms->numa_state is already guaranteed to be non-NULL, or
not?

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3] block/rbd: add preallocation support

2019-07-24 Thread Jason Dillaman

On Tue, Jul 23, 2019 at 3:13 AM Stefano Garzarella  wrote:
>
> This patch adds the support of preallocation (off/full) for the RBD
> block driver.
> If rbd_writesame() is available and supports zeroed buffers, we use
> it to quickly fill the image when full preallocation is required.
>
> Signed-off-by: Stefano Garzarella 
> ---
> v3:
>  - rebased on master
>  - filled with zeroed buffer [Max]
>  - used rbd_writesame() only when we can disable the discard of zeroed
>buffers
>  - added 'since: 4.2' in qapi/block-core.json [Max]
>  - used buffer as large as the "stripe unit"
> ---
>  block/rbd.c  | 202 ---
>  qapi/block-core.json |   5 +-
>  2 files changed, 192 insertions(+), 15 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 59757b3120..d923a5a26c 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -64,6 +64,7 @@
>  #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
>
>  #define RBD_MAX_SNAPS 100
> +#define RBD_DEFAULT_CONCURRENT_OPS 10
>
>  /* The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */
>  #ifdef LIBRBD_SUPPORTS_IOVEC
> @@ -104,6 +105,7 @@ typedef struct BDRVRBDState {
>  char *image_name;
>  char *snap;
>  uint64_t image_size;
> +bool ws_zero_supported; /* rbd_writesame() supports zeroed buffers */
>  } BDRVRBDState;
>
>  static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
> @@ -333,6 +335,155 @@ static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs)
>  }
>  }
>
> +static int qemu_rbd_get_max_concurrent_ops(rados_t cluster)
> +{
> +char buf[16];
> +int ret, max_concurrent_ops;
> +
> +ret = rados_conf_get(cluster, "rbd_concurrent_management_ops", buf,
> + sizeof(buf));
> +if (ret < 0) {
> +return RBD_DEFAULT_CONCURRENT_OPS;
> +}
> +
> +ret = qemu_strtoi(buf, NULL, 10, &max_concurrent_ops);
> +if (ret < 0) {
> +return RBD_DEFAULT_CONCURRENT_OPS;
> +}
> +
> +return max_concurrent_ops;
> +}
> +
> +static int qemu_rbd_do_truncate(rados_t cluster, rbd_image_t image,
> +int64_t offset, PreallocMode prealloc,
> +bool ws_zero_supported, Error **errp)
> +{
> +uint64_t current_length;
> +char *buf = NULL;
> +int ret;
> +
> +ret = rbd_get_size(image, &current_length);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Failed to get file length");
> +goto out;
> +}
> +
> +if (current_length > offset && prealloc != PREALLOC_MODE_OFF) {
> +error_setg(errp, "Cannot use preallocation for shrinking files");
> +ret = -ENOTSUP;
> +goto out;
> +}
> +
> +switch (prealloc) {
> +case PREALLOC_MODE_FULL: {
> +uint64_t buf_size, current_offset = current_length;
> +ssize_t bytes;
> +
> +ret = rbd_get_stripe_unit(image, &buf_size);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Failed to get stripe unit");
> +goto out;
> +}
> +
> +ret = rbd_resize(image, offset);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Failed to resize file");
> +goto out;
> +}
> +
> +buf = g_malloc0(buf_size);
> +
> +#ifdef LIBRBD_SUPPORTS_WRITESAME
> +if (ws_zero_supported) {
> +uint64_t writesame_max_size;
> +int max_concurrent_ops;
> +
> +max_concurrent_ops = qemu_rbd_get_max_concurrent_ops(cluster);
> +/*
> + * We limit the rbd_writesame() size to avoid spawning more than
> + * 'rbd_concurrent_management_ops' concurrent operations.
> + */
> +writesame_max_size = MIN(buf_size * max_concurrent_ops, INT_MAX);

In the most efficient world, the 'buf_size' would be some small, fixed
power of 2 value (like 512 bytes) since there isn't much need to send
extra zeroes. You would then want to writesame the full stripe period
(if possible), where a stripe period is the data block object size
(defaults to 4MiB and is available via 'rbd_stat') * the stripe count.
In this case, the stripe count becomes the number of in-flight IOs.
Therefore, you could substitute its value w/ the max_concurrent_ops to
ensure you are issuing exactly max_concurrent_ops IOs per
rbd_writesame call.
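
The sizing described above can be sketched as follows. This is an illustration in Python rather than librbd code: plan_writesame_chunks is a made-up helper, and the 512-byte pattern, 4 MiB object size and 10 concurrent operations are simply the example figures from this thread. Each span is rounded down to a multiple of the pattern size, as the loop in the patch does:

```python
def plan_writesame_chunks(start, end, pattern_size, max_span):
    """Return (offset, length) spans covering [start, end) for writesame calls.

    Every length is a multiple of `pattern_size` and at most `max_span`.
    A tail of at most `pattern_size` bytes is left for a plain write.
    """
    if max_span < pattern_size:
        raise ValueError("max_span must be at least pattern_size")
    spans = []
    offset = start
    while end - offset > pattern_size:
        length = min(end - offset, max_span)
        length -= length % pattern_size  # keep a whole number of patterns
        spans.append((offset, length))
        offset += length
    return spans


PATTERN_SIZE = 512                    # small fixed power of two
OBJECT_SIZE = 4 * 1024 * 1024         # default object size mentioned above
MAX_CONCURRENT_OPS = 10               # stands in for the stripe count

spans = plan_writesame_chunks(0, 100 * 1024 * 1024, PATTERN_SIZE,
                              OBJECT_SIZE * MAX_CONCURRENT_OPS)
```

With these figures each full span covers MAX_CONCURRENT_OPS objects, so one call would correspond to that many in-flight IOs under the assumption stated in the reply.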

> +
> +while (offset - current_offset > buf_size) {
> +bytes = MIN(offset - current_offset, writesame_max_size);
> +/*
> + * rbd_writesame() supports only request where the size of 
> the
> + * operation is multiple of buffer size.
> + */
> +bytes -= bytes % buf_size;
> +
> +bytes = rbd_writesame(image, current_offset, bytes, buf,
> +  buf_size, 0);

If the RBD in-memory cache is enabled during this operation, the
writesame will effectively just be turned into a write. Therefore,
when

Re: [Qemu-devel] [PATCH for-4.1 2/2] xics/kvm: Fix fallback to emulated XICS

2019-07-24 Thread Cédric Le Goater

On 24/07/2019 18:57, Greg Kurz wrote:
> Commit 4812f2615288 tried to fix rollback path of xics_kvm_connect() but
> it isn't enough. If we fail to create the KVM device, the guest fails
> to boot later on with:
> 
> [0.010817] pci :00:00.0: Adding to iommu group 0
> [0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping 
> (rc=-22)
> [0.010923] pci :00:01.0: Adding to iommu group 0
> [0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping 
> (rc=-22)
> [0.011543] EEH: No capable adapters found
> [0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping 
> (rc=-22)
> [0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized 
> audit_enabled=0 res=1
> [0.011703] [ cut here ]
> [0.011729] event-sources: Unable to allocate interrupt number for 
> /event-sources/epow-events
> [0.011776] WARNING: CPU: 0 PID: 1 at 
> arch/powerpc/platforms/pseries/event_sources.c:34 
> request_event_sources_irqs+0xbc/0x150
> [0.011828] Modules linked in:
> [0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 5.1.17-300.fc30.ppc64le #1
> [0.011886] NIP:  c00d4fac LR: c00d4fa8 CTR: 
> c18f
> [0.011923] REGS: c0001e4c38d0 TRAP: 0700   Not tainted  
> (5.1.17-300.fc30.ppc64le)
> [0.011966] MSR:  82029033   CR: 
> 28000284  XER: 2004
> [0.012012] CFAR: c011b42c IRQMASK: 0
> [0.012012] GPR00: c00d4fa8 c0001e4c3b60 c15fc400 
> 0051
> [0.012012] GPR04: 0001  0081 
> 772d6576656e7473
> [0.012012] GPR08: 1edf c14d4830 c14d4830 
> 6e6576652f20726f
> [0.012012] GPR12:  c18f c0010bf0 
> 
> [0.012012] GPR16:    
> 
> [0.012012] GPR20:    
> 
> [0.012012] GPR24:   c0ebbf00 
> c00d5570
> [0.012012] GPR28: c0ebc008 c0001fff8248  
> 
> [0.012372] NIP [c00d4fac] request_event_sources_irqs+0xbc/0x150
> [0.012409] LR [c00d4fa8] request_event_sources_irqs+0xb8/0x150
> [0.012445] Call Trace:
> [0.012462] [c0001e4c3b60] [c00d4fa8] 
> request_event_sources_irqs+0xb8/0x150 (unreliable)
> [0.012513] [c0001e4c3bf0] [c1042848] 
> __machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8
> [0.012563] [c0001e4c3c20] [c0010810] 
> do_one_initcall+0x60/0x254
> [0.012611] [c0001e4c3cf0] [c1024538] 
> kernel_init_freeable+0x35c/0x444
> [0.012655] [c0001e4c3db0] [c0010c14] kernel_init+0x2c/0x148
> [0.012693] [c0001e4c3e20] [c000bdc4] 
> ret_from_kernel_thread+0x5c/0x78
> [0.012736] Instruction dump:
> [0.012759] 38a0 7c7f1b78 7f64db78 2c1f 2fbf 78630020 4180002c 
> 409effa8
> [0.012805] 7fa4eb78 7f43d378 48046421 6000 <0fe0> 3bde0001 
> 2c1e0010 7fde07b4
> [0.012851] ---[ end trace aa5785707323fad3 ]---
> 
> This happens because QEMU fell back on XICS emulation but didn't unregister
> the RTAS calls from KVM. The emulated RTAS calls are hence never called and
> the KVM ones return an error to the guest since the KVM device is absent.
> 
> The sanity checks in xics_kvm_disconnect() are abusive since we're freeing
> the KVM device. Simply drop them.
> 
> Fixes: 4812f2615288 "xics/kvm: Add proper rollback to xics_kvm_init()"
> Signed-off-by: Greg Kurz 



Reviewed-by: Cédric Le Goater 

Thanks,

C.

> ---
>  hw/intc/xics_kvm.c |   11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 2df1f3e92c7e..65c35f90f9af 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -430,17 +430,6 @@ fail:
>  
>  void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp)
>  {
> -/* The KVM XICS device is not in use */
> -if (kernel_xics_fd == -1) {
> -return;
> -}
> -
> -if (!kvm_enabled() || !kvm_check_extension(kvm_state, KVM_CAP_IRQ_XICS)) {
> -error_setg(errp,
> -   "KVM and IRQ_XICS capability must be present for KVM XICS device");
> -return;
> -}
> -
>  /*
>   * Only on P9 using the XICS-on XIVE KVM device:
>   *
>

[Qemu-devel] [PATCH v2 07/11] vdi: Fix .bdrv_has_zero_init()

2019-07-24 Thread Max Reitz

Static VDI images cannot guarantee to be zero-initialized.  If the image
has been statically allocated, forward the call to the underlying
storage node.

Reported-by: Stefano Garzarella 
Signed-off-by: Max Reitz 
Reviewed-by: Stefan Weil 
Acked-by: Stefano Garzarella 
Tested-by: Stefano Garzarella 
---
 block/vdi.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/vdi.c b/block/vdi.c
index b9845a4cbd..0caa3f281d 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -988,6 +988,17 @@ static void vdi_close(BlockDriverState *bs)
 error_free(s->migration_blocker);
 }
 
+static int vdi_has_zero_init(BlockDriverState *bs)
+{
+BDRVVdiState *s = bs->opaque;
+
+if (s->header.image_type == VDI_TYPE_STATIC) {
+return bdrv_has_zero_init(bs->file->bs);
+} else {
+return 1;
+}
+}
+
 static QemuOptsList vdi_create_opts = {
 .name = "vdi-create-opts",
 .head = QTAILQ_HEAD_INITIALIZER(vdi_create_opts.head),
@@ -1028,7 +1039,7 @@ static BlockDriver bdrv_vdi = {
 .bdrv_child_perm  = bdrv_format_default_perms,
 .bdrv_co_create  = vdi_co_create,
 .bdrv_co_create_opts = vdi_co_create_opts,
-.bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init  = vdi_has_zero_init,
 .bdrv_co_block_status = vdi_co_block_status,
 .bdrv_make_empty = vdi_make_empty,
 
-- 
2.21.0

[Qemu-devel] [PATCH v2 11/11] iotests: Full mirror to existing non-zero image

2019-07-24 Thread Max Reitz

The result of a sync=full mirror should always be equal to the
input.  Therefore, existing images should be treated as potentially
non-zero and thus should be explicitly initialized to be zero
beforehand.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/041 | 62 +++---
 tests/qemu-iotests/041.out |  4 +--
 2 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 26bf1701eb..8bc8f81db7 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -741,8 +741,15 @@ class TestUnbackedSource(iotests.QMPTestCase):
 def setUp(self):
 qemu_img('create', '-f', iotests.imgfmt, test_img,
  str(TestUnbackedSource.image_len))
-self.vm = iotests.VM().add_drive(test_img)
+self.vm = iotests.VM()
 self.vm.launch()
+result = self.vm.qmp('blockdev-add', node_name='drive0',
+ driver=iotests.imgfmt,
+ file={
+ 'driver': 'file',
+ 'filename': test_img,
+ })
+self.assert_qmp(result, 'return', {})
 
 def tearDown(self):
 self.vm.shutdown()
@@ -751,7 +758,7 @@ class TestUnbackedSource(iotests.QMPTestCase):
 
 def test_absolute_paths_full(self):
 self.assert_no_active_block_jobs()
-result = self.vm.qmp('drive-mirror', device='drive0',
+result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
  sync='full', target=target_img,
  mode='absolute-paths')
 self.assert_qmp(result, 'return', {})
@@ -760,7 +767,7 @@ class TestUnbackedSource(iotests.QMPTestCase):
 
 def test_absolute_paths_top(self):
 self.assert_no_active_block_jobs()
-result = self.vm.qmp('drive-mirror', device='drive0',
+result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
  sync='top', target=target_img,
  mode='absolute-paths')
 self.assert_qmp(result, 'return', {})
@@ -769,13 +776,60 @@ class TestUnbackedSource(iotests.QMPTestCase):
 
 def test_absolute_paths_none(self):
 self.assert_no_active_block_jobs()
-result = self.vm.qmp('drive-mirror', device='drive0',
+result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
  sync='none', target=target_img,
  mode='absolute-paths')
 self.assert_qmp(result, 'return', {})
 self.complete_and_wait()
 self.assert_no_active_block_jobs()
 
+def test_existing_full(self):
+qemu_img('create', '-f', iotests.imgfmt, target_img,
+ str(self.image_len))
+qemu_io('-c', 'write -P 42 0 64k', target_img)
+
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
+ sync='full', target=target_img, mode='existing')
+self.assert_qmp(result, 'return', {})
+self.complete_and_wait()
+self.assert_no_active_block_jobs()
+
+result = self.vm.qmp('blockdev-del', node_name='drive0')
+self.assert_qmp(result, 'return', {})
+
+self.assertTrue(iotests.compare_images(test_img, target_img),
+'target image does not match source after mirroring')
+
+def test_blockdev_full(self):
+qemu_img('create', '-f', iotests.imgfmt, target_img,
+ str(self.image_len))
+qemu_io('-c', 'write -P 42 0 64k', target_img)
+
+result = self.vm.qmp('blockdev-add', node_name='target',
+ driver=iotests.imgfmt,
+ file={
+ 'driver': 'file',
+ 'filename': target_img,
+ })
+self.assert_qmp(result, 'return', {})
+
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('blockdev-mirror', job_id='drive0', device='drive0',
+ sync='full', target='target')
+self.assert_qmp(result, 'return', {})
+self.complete_and_wait()
+self.assert_no_active_block_jobs()
+
+result = self.vm.qmp('blockdev-del', node_name='drive0')
+self.assert_qmp(result, 'return', {})
+
+result = self.vm.qmp('blockdev-del', node_name='target')
+self.assert_qmp(result, 'return', {})
+
+self.assertTrue(iotests.compare_images(test_img, target_img),
+'target image does not match source after mirroring')
+
 class TestGranularity(iotests.QMPTestCase):
 image_len = 10 * 1024 * 1024 # MB
 
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
index e071d0b261..2c448b4239 100644
--- a/tests/qemu-iotests/041.out
+++

Re: [Qemu-devel] [PATCH v6 14/14] qemu-iotest: enable testing with qemu-io aio options

2019-07-24 Thread Julia Suvorova


On 7/19/19 3:35 PM, Aarushi Mehta wrote:
 @@ -225,6 +227,10 @@ s/ .*//p

  CACHEMODE_IS_DEFAULT=false
  cachemode=false
  continue
+elif $aiomode
+then
+AIOMODE="$r"
+aiomode=false


'continue' is missing here.

Best regards, Julia Suvorova.

[Qemu-devel] [PATCH v2 05/11] block: Use bdrv_has_zero_init_truncate()

2019-07-24 Thread Max Reitz

Signed-off-by: Max Reitz 
---
 block/parallels.c | 2 +-
 block/vhdx.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 00fae125d1..7cd2714b69 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -835,7 +835,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail_options;
 }
 
-if (!bdrv_has_zero_init(bs->file->bs)) {
+if (!bdrv_has_zero_init_truncate(bs->file->bs)) {
 s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
 }
 
diff --git a/block/vhdx.c b/block/vhdx.c
index d6070b6fa8..a02d1c99a7 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1282,7 +1282,7 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 /* Queue another write of zero buffers if the underlying file
  * does not zero-fill on file extension */
 
-if (bdrv_has_zero_init(bs->file->bs) == 0) {
+if (bdrv_has_zero_init_truncate(bs->file->bs) == 0) {
 use_zero_buffers = true;
 
 /* zero fill the front, if any */
-- 
2.21.0

[Qemu-devel] [PATCH v2 10/11] iotests: Test convert -n to pre-filled image

2019-07-24 Thread Max Reitz

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/122 | 17 +
 tests/qemu-iotests/122.out |  8 
 2 files changed, 25 insertions(+)

diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
index 85c3a8d047..059011ebb1 100755
--- a/tests/qemu-iotests/122
+++ b/tests/qemu-iotests/122
@@ -257,6 +257,23 @@ for min_sparse in 4k 8k; do
 $QEMU_IMG map --output=json "$TEST_IMG".orig | _filter_qemu_img_map
 done
 
+
+echo
+echo '=== -n to a non-zero image ==='
+echo
+
+# Keep source zero
+_make_test_img 64M
+
+# Output is not zero, but has bdrv_has_zero_init() == 1
+TEST_IMG="$TEST_IMG".orig _make_test_img 64M
+$QEMU_IO -c "write -P 42 0 64k" "$TEST_IMG".orig | _filter_qemu_io
+
+# Convert with -n, which should not assume that the target is zeroed
+$QEMU_IMG convert -O $IMGFMT -n "$TEST_IMG" "$TEST_IMG".orig
+
+$QEMU_IMG compare "$TEST_IMG" "$TEST_IMG".orig
+
 # success, all done
 echo '*** done'
 rm -f $seq.full
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index c576705284..849b6cc2ef 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -220,4 +220,12 @@ convert -c -S 8k
 { "start": 9216, "length": 8192, "depth": 0, "zero": true, "data": false},
 { "start": 17408, "length": 1024, "depth": 0, "zero": false, "data": true},
 { "start": 18432, "length": 67090432, "depth": 0, "zero": true, "data": false}]
+
+=== -n to a non-zero image ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+Formatting 'TEST_DIR/t.IMGFMT.orig', fmt=IMGFMT size=67108864
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Images are identical.
 *** done
-- 
2.21.0

[Qemu-devel] [PATCH v2 08/11] vhdx: Fix .bdrv_has_zero_init()

2019-07-24 Thread Max Reitz

Fixed VHDX images cannot guarantee to be zero-initialized.  If the image
has the "fixed" subformat, forward the call to the underlying storage
node.

Reported-by: Stefano Garzarella 
Signed-off-by: Max Reitz 
---
 block/vhdx.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index a02d1c99a7..6a09d0a55c 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -2075,6 +2075,30 @@ static int coroutine_fn vhdx_co_check(BlockDriverState 
*bs,
 return 0;
 }
 
+static int vhdx_has_zero_init(BlockDriverState *bs)
+{
+BDRVVHDXState *s = bs->opaque;
+int state;
+
+/*
+ * Check the subformat: Fixed images have all BAT entries present,
+ * dynamic images have none (right after creation).  It is
+ * therefore enough to check the first BAT entry.
+ */
+if (!s->bat_entries) {
+return 1;
+}
+
+state = s->bat[0] & VHDX_BAT_STATE_BIT_MASK;
+if (state == PAYLOAD_BLOCK_FULLY_PRESENT) {
+/* Fixed subformat */
+return bdrv_has_zero_init(bs->file->bs);
+}
+
+/* Dynamic subformat */
+return 1;
+}
+
 static QemuOptsList vhdx_create_opts = {
 .name = "vhdx-create-opts",
 .head = QTAILQ_HEAD_INITIALIZER(vhdx_create_opts.head),
@@ -2128,7 +2152,7 @@ static BlockDriver bdrv_vhdx = {
 .bdrv_co_create_opts= vhdx_co_create_opts,
 .bdrv_get_info  = vhdx_get_info,
 .bdrv_co_check  = vhdx_co_check,
-.bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init = vhdx_has_zero_init,
 
 .create_opts= _create_opts,
 };
-- 
2.21.0

[Qemu-devel] [PATCH v2 09/11] iotests: Convert to preallocated encrypted qcow2

2019-07-24 Thread Max Reitz

Add a test case for converting an empty image (which only returns zeroes
when read) to a preallocated encrypted qcow2 image.
qcow2_has_zero_init() should return 0 then, thus forcing qemu-img
convert to create zero clusters.

Signed-off-by: Max Reitz 
Acked-by: Stefano Garzarella 
Tested-by: Stefano Garzarella 
---
 tests/qemu-iotests/188 | 20 +++-
 tests/qemu-iotests/188.out |  4 
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/188 b/tests/qemu-iotests/188
index be7278aa65..afca44df54 100755
--- a/tests/qemu-iotests/188
+++ b/tests/qemu-iotests/188
@@ -48,7 +48,7 @@ SECRETALT="secret,id=sec0,data=platypus"
 
 _make_test_img --object $SECRET -o 
"encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10" $size
 
-IMGSPEC="driver=$IMGFMT,file.filename=$TEST_IMG,encrypt.key-secret=sec0"
+IMGSPEC="driver=$IMGFMT,encrypt.key-secret=sec0,file.filename=$TEST_IMG"
 
 QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
 
@@ -68,6 +68,24 @@ echo
 echo "== verify open failure with wrong password =="
 $QEMU_IO --object $SECRETALT -c "read -P 0xa 0 $size" --image-opts $IMGSPEC | 
_filter_qemu_io | _filter_testdir
 
+_cleanup_test_img
+
+echo
+echo "== verify that has_zero_init returns false when preallocating =="
+
+# Empty source file
+if [ -n "$TEST_IMG_FILE" ]; then
+TEST_IMG_FILE="${TEST_IMG_FILE}.orig" _make_test_img $size
+else
+TEST_IMG="${TEST_IMG}.orig" _make_test_img $size
+fi
+
+$QEMU_IMG convert -O "$IMGFMT" --object $SECRET \
+-o "encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10,preallocation=metadata" \
+"${TEST_IMG}.orig" "$TEST_IMG"
+
+$QEMU_IMG compare --object $SECRET --image-opts "${IMGSPEC}.orig" "$IMGSPEC"
+
 
 # success, all done
 echo "*** done"
diff --git a/tests/qemu-iotests/188.out b/tests/qemu-iotests/188.out
index 97b1402671..c568ef3701 100644
--- a/tests/qemu-iotests/188.out
+++ b/tests/qemu-iotests/188.out
@@ -15,4 +15,8 @@ read 16777216/16777216 bytes at offset 0
 
 == verify open failure with wrong password ==
 qemu-io: can't open: Invalid password, cannot unlock any keyslot
+
+== verify that has_zero_init returns false when preallocating ==
+Formatting 'TEST_DIR/t.IMGFMT.orig', fmt=IMGFMT size=16777216
+Images are identical.
 *** done
-- 
2.21.0

[Qemu-devel] [PATCH v2 06/11] qcow2: Fix .bdrv_has_zero_init()

2019-07-24 Thread Max Reitz

If a qcow2 file is preallocated, it can no longer guarantee that it
initially appears as filled with zeroes.

So implement .bdrv_has_zero_init() by checking whether the file is
preallocated; if so, forward the call to the underlying storage node,
except for when it is encrypted: Encrypted preallocated images always
return effectively random data, so .bdrv_has_zero_init() must always
return 0 for them.

.bdrv_has_zero_init_truncate() can remain bdrv_has_zero_init_1(),
because it presupposes PREALLOC_MODE_OFF.

Reported-by: Stefano Garzarella 
Signed-off-by: Max Reitz 
---
 block/qcow2.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 5c40f54d64..b4e73aa443 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4631,6 +4631,33 @@ static ImageInfoSpecific 
*qcow2_get_specific_info(BlockDriverState *bs,
 return spec_info;
 }
 
+static int qcow2_has_zero_init(BlockDriverState *bs)
+{
+BDRVQcow2State *s = bs->opaque;
+bool preallocated;
+
+if (qemu_in_coroutine()) {
+qemu_co_mutex_lock(&s->lock);
+}
+/*
+ * Check preallocation status: Preallocated images have all L2
+ * tables allocated, nonpreallocated images have none.  It is
+ * therefore enough to check the first one.
+ */
+preallocated = s->l1_size > 0 && s->l1_table[0] != 0;
+if (qemu_in_coroutine()) {
+qemu_co_mutex_unlock(&s->lock);
+}
+
+if (!preallocated) {
+return 1;
+} else if (bs->encrypted) {
+return 0;
+} else {
+return bdrv_has_zero_init(s->data_file->bs);
+}
+}
+
 static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
   int64_t pos)
 {
@@ -5186,7 +5213,7 @@ BlockDriver bdrv_qcow2 = {
 .bdrv_child_perm  = bdrv_format_default_perms,
 .bdrv_co_create_opts  = qcow2_co_create_opts,
 .bdrv_co_create   = qcow2_co_create,
-.bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init   = qcow2_has_zero_init,
 .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
 .bdrv_co_block_status = qcow2_co_block_status,
 
-- 
2.21.0

[Qemu-devel] [PATCH v2 04/11] block: Implement .bdrv_has_zero_init_truncate()

2019-07-24 Thread Max Reitz

We need to implement .bdrv_has_zero_init_truncate() for every block
driver that supports truncation and has a .bdrv_has_zero_init()
implementation.

Implement it the same way each driver implements .bdrv_has_zero_init().
This is at least not any more unsafe than what we had before.

Signed-off-by: Max Reitz 
---
 block/file-posix.c | 1 +
 block/file-win32.c | 1 +
 block/gluster.c| 4 
 block/nfs.c| 1 +
 block/qcow2.c  | 1 +
 block/qed.c| 1 +
 block/raw-format.c | 6 ++
 block/rbd.c| 1 +
 block/sheepdog.c   | 1 +
 block/ssh.c| 1 +
 10 files changed, 18 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4479cc7ab4..0208006f3c 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2924,6 +2924,7 @@ BlockDriver bdrv_file = {
 .bdrv_co_create = raw_co_create,
 .bdrv_co_create_opts = raw_co_create_opts,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
 .bdrv_co_block_status = raw_co_block_status,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
diff --git a/block/file-win32.c b/block/file-win32.c
index 6b2d67b239..41f55dfece 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -635,6 +635,7 @@ BlockDriver bdrv_file = {
 .bdrv_close = raw_close,
 .bdrv_co_create_opts = raw_co_create_opts,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
 
 .bdrv_aio_preadv= raw_aio_preadv,
 .bdrv_aio_pwritev   = raw_aio_pwritev,
diff --git a/block/gluster.c b/block/gluster.c
index f64dc5b01e..64028b2cba 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1567,6 +1567,7 @@ static BlockDriver bdrv_gluster = {
 .bdrv_co_writev   = qemu_gluster_co_writev,
 .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
 .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
+.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
 #endif
@@ -1598,6 +1599,7 @@ static BlockDriver bdrv_gluster_tcp = {
 .bdrv_co_writev   = qemu_gluster_co_writev,
 .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
 .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
+.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
 #endif
@@ -1629,6 +1631,7 @@ static BlockDriver bdrv_gluster_unix = {
 .bdrv_co_writev   = qemu_gluster_co_writev,
 .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
 .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
+.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
 #endif
@@ -1666,6 +1669,7 @@ static BlockDriver bdrv_gluster_rdma = {
 .bdrv_co_writev   = qemu_gluster_co_writev,
 .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
 .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
+.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
 #endif
diff --git a/block/nfs.c b/block/nfs.c
index d93241b3bb..97c815085f 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -863,6 +863,7 @@ static BlockDriver bdrv_nfs = {
 .create_opts= _create_opts,
 
 .bdrv_has_zero_init = nfs_has_zero_init,
+.bdrv_has_zero_init_truncate= nfs_has_zero_init,
 .bdrv_get_allocated_file_size   = nfs_get_allocated_file_size,
 .bdrv_co_truncate   = nfs_file_co_truncate,
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..5c40f54d64 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5187,6 +5187,7 @@ BlockDriver bdrv_qcow2 = {
 .bdrv_co_create_opts  = qcow2_co_create_opts,
 .bdrv_co_create   = qcow2_co_create,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
+.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
 .bdrv_co_block_status = qcow2_co_block_status,
 
 .bdrv_co_preadv = qcow2_co_preadv,
diff --git a/block/qed.c b/block/qed.c
index 77c7cef175..daaedb6864 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1668,6 +1668,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_co_create   = bdrv_qed_co_create,
 .bdrv_co_create_opts  = bdrv_qed_co_create_opts,
 .bdrv_has_zero_init   = bdrv_has_zero_init_1,
+.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
 .bdrv_co_block_status = bdrv_qed_co_block_status,
 .bdrv_co_readv= bdrv_qed_co_readv,
 .bdrv_co_writev   = bdrv_qed_co_writev,
diff --git a/block/raw-format.c b/block/raw-format.c

[Qemu-devel] [PATCH v2 02/11] mirror: Fix bdrv_has_zero_init() use

2019-07-24 Thread Max Reitz

bdrv_has_zero_init() only has meaning for newly created images or image
areas.  If the mirror job itself did not create the image, it cannot
rely on bdrv_has_zero_init()'s result to carry any meaning.

This is the case for drive-mirror with mode=existing and always for
blockdev-mirror.

Note that we only have to zero-initialize the target with sync=full,
because other modes actually do not promise that the target will contain
the same data as the source after the job -- sync=top only promises to
copy anything allocated in the top layer, and sync=none will only copy
new I/O.  (Which is how mirror has always handled it.)
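
As a rough illustration of the rule described above, the decision a
caller has to make before starting the job could look like the sketch
below. The enum and the helper are hypothetical stand-ins, not the
actual blockdev.c code, and the sketch assumes that a target created
by the job itself may still rely on bdrv_has_zero_init().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sync modes; names are illustrative */
enum sync_mode { SYNC_FULL, SYNC_TOP, SYNC_NONE };

/*
 * Decide whether the mirror target must be explicitly zeroed.
 *
 * job_created_target   - true only if the job created the image itself
 *                        (drive-mirror without mode=existing); always
 *                        false for blockdev-mirror
 * target_has_zero_init - what bdrv_has_zero_init() reports for the target
 */
static bool need_zero_target(enum sync_mode sync, bool job_created_target,
                             bool target_has_zero_init)
{
    /* Only sync=full promises that the target equals the source */
    if (sync != SYNC_FULL) {
        return false;
    }

    /* A pre-existing image has to be treated as potentially non-zero */
    return !job_created_target || !target_has_zero_init;
}
```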

Signed-off-by: Max Reitz 
---
 include/block/block_int.h   |  2 ++
 block/mirror.c  | 11 ---
 blockdev.c  | 16 +---
 tests/test-block-iothread.c |  2 +-
 4 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3aa1e832a8..6a0b1b5008 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1116,6 +1116,7 @@ BlockJob *commit_active_start(const char *job_id, 
BlockDriverState *bs,
  * @buf_size: The amount of data that can be in flight at one time.
  * @mode: Whether to collapse all images in the chain to the target.
  * @backing_mode: How to establish the target's backing chain after completion.
+ * @zero_target: Whether the target should be explicitly zero-initialized
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
  * @unmap: Whether to unmap target where source sectors only contain zeroes.
@@ -1135,6 +1136,7 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
   int creation_flags, int64_t speed,
   uint32_t granularity, int64_t buf_size,
   MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
+  bool zero_target,
   BlockdevOnError on_source_error,
   BlockdevOnError on_target_error,
   bool unmap, const char *filter_node_name,
diff --git a/block/mirror.c b/block/mirror.c
index 8cb75fb409..50188ce6e9 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -51,6 +51,8 @@ typedef struct MirrorBlockJob {
 Error *replace_blocker;
 bool is_none_mode;
 BlockMirrorBackingMode backing_mode;
+/* Whether the target image requires explicit zero-initialization */
+bool zero_target;
 MirrorCopyMode copy_mode;
 BlockdevOnError on_source_error, on_target_error;
 bool synced;
@@ -763,7 +765,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int ret;
 int64_t count;
 
-if (base == NULL && !bdrv_has_zero_init(target_bs)) {
+if (s->zero_target) {
 if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
 bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
 return 0;
@@ -1501,6 +1503,7 @@ static BlockJob *mirror_start_job(
  const char *replaces, int64_t speed,
  uint32_t granularity, int64_t buf_size,
  BlockMirrorBackingMode backing_mode,
+ bool zero_target,
  BlockdevOnError on_source_error,
  BlockdevOnError on_target_error,
  bool unmap,
@@ -1628,6 +1631,7 @@ static BlockJob *mirror_start_job(
 s->on_target_error = on_target_error;
 s->is_none_mode = is_none_mode;
 s->backing_mode = backing_mode;
+s->zero_target = zero_target;
 s->copy_mode = copy_mode;
 s->base = base;
 s->granularity = granularity;
@@ -1713,6 +1717,7 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
   int creation_flags, int64_t speed,
   uint32_t granularity, int64_t buf_size,
   MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
+  bool zero_target,
   BlockdevOnError on_source_error,
   BlockdevOnError on_target_error,
   bool unmap, const char *filter_node_name,
@@ -1728,7 +1733,7 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
 base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
 mirror_start_job(job_id, bs, creation_flags, target, replaces,
- speed, granularity, buf_size, backing_mode,
+ speed, granularity, buf_size, backing_mode, zero_target,
  on_source_error, on_target_error, unmap, NULL, NULL,
  _job_driver, is_none_mode, base, false,
  filter_node_name, true, copy_mode, errp);
@@ -1755,7 +1760,7 @@ BlockJob *commit_active_start(const char *job_id, 
BlockDriverState *bs,
 
 ret = mirror_start_job(
  job_id, bs,

Re: [Qemu-devel] [PATCH v6 11/14] qemu-io: adds option to use aio engine

2019-07-24 Thread Julia Suvorova


On 7/19/19 3:35 PM, Aarushi Mehta wrote:

@@ -489,7 +493,7 @@ static QemuOptsList file_opts = {
  int main(int argc, char **argv)
  {
  int readonly = 0;
-const char *sopt = "hVc:d:f:rsnCmkt:T:U";
+const char *sopt = "hVc:d:f:rsnCmit:T:U";


Add ':' after 'i' so that the short -i option accepts an argument, as
the long variant already does.

Best regards, Julia Suvorova.

[Qemu-devel] [PATCH v2 01/11] qemu-img: Fix bdrv_has_zero_init() use in convert

2019-07-24 Thread Max Reitz

bdrv_has_zero_init() only has meaning for newly created images or image
areas.  If qemu-img convert did not create the image itself, it cannot
rely on bdrv_has_zero_init()'s result to carry any meaning.

Signed-off-by: Max Reitz 
---
 qemu-img.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 79983772de..0f4be80c10 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1578,6 +1578,7 @@ typedef struct ImgConvertState {
 bool has_zero_init;
 bool compressed;
 bool unallocated_blocks_are_zero;
+bool target_is_new;
 bool target_has_backing;
 int64_t target_backing_sectors; /* negative if unknown */
 bool wr_in_order;
@@ -1975,9 +1976,11 @@ static int convert_do_copy(ImgConvertState *s)
 int64_t sector_num = 0;
 
 /* Check whether we have zero initialisation or can get it efficiently */
-s->has_zero_init = s->min_sparse && !s->target_has_backing
- ? bdrv_has_zero_init(blk_bs(s->target))
- : false;
+if (s->target_is_new && s->min_sparse && !s->target_has_backing) {
+s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
+} else {
+s->has_zero_init = false;
+}
 
 if (!s->has_zero_init && !s->target_has_backing &&
 bdrv_can_write_zeroes_with_unmap(blk_bs(s->target)))
@@ -2423,6 +2426,8 @@ static int img_convert(int argc, char **argv)
 }
 }
 
+s.target_is_new = !skip_create;
+
 flags = s.min_sparse ? (BDRV_O_RDWR | BDRV_O_UNMAP) : BDRV_O_RDWR;
 ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
 if (ret < 0) {
-- 
2.21.0

[Qemu-devel] [PATCH v2 03/11] block: Add bdrv_has_zero_init_truncate()

2019-07-24 Thread Max Reitz

No .bdrv_has_zero_init() implementation returns 1 if growing the file
would add non-zero areas (at least with PREALLOC_MODE_OFF), so using it
in lieu of this new function was always safe.

But on the other hand, it is possible that growing an image that is not
zero-initialized would still add a zero-initialized area, like when
using nonpreallocating truncation on a preallocated image.  For callers
that care only about truncation, not about creation with potential
preallocation, this new function is useful.

Alternatively, we could have added a PreallocMode parameter to
bdrv_has_zero_init().  But the only user would have been qemu-img
convert, which does not have a plain PreallocMode value right now -- it
would have to parse the creation option to obtain it.  Therefore, the
simpler solution is to let bdrv_has_zero_init() inquire the
preallocation status and add the new bdrv_has_zero_init_truncate() that
presupposes PREALLOC_MODE_OFF.
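
The relationship between the two callbacks can be written down as a
small consistency check. This is only a sketch: the struct and the
function are hypothetical and do not exist in the block layer, they
merely model the two return values a driver reports.

```c
#include <stdbool.h>

/* Hypothetical pair of answers a driver gives for one node */
struct zero_init_caps {
    int create;   /* result of .bdrv_has_zero_init() */
    int truncate; /* result of .bdrv_has_zero_init_truncate() */
};

/*
 * A driver must not claim zero-initialized new images while growing
 * an image with PREALLOC_MODE_OFF may add non-zero areas.  The reverse
 * combination is fine, e.g. a preallocated image that is grown without
 * preallocating the new area.
 */
static bool zero_init_caps_consistent(struct zero_init_caps c)
{
    return !(c.create == 1 && c.truncate == 0);
}
```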

Signed-off-by: Max Reitz 
---
 include/block/block.h |  1 +
 include/block/block_int.h |  7 +++
 block.c   | 21 +
 3 files changed, 29 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 50a07c1c33..5321d8afdf 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -438,6 +438,7 @@ int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t 
bytes);
 int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
+int bdrv_has_zero_init_truncate(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
 int bdrv_block_status(BlockDriverState *bs, int64_t offset,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6a0b1b5008..d7fc6b296b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -420,9 +420,16 @@ struct BlockDriver {
 /*
  * Returns 1 if newly created images are guaranteed to contain only
  * zeros, 0 otherwise.
+ * Must return 0 if .bdrv_has_zero_init_truncate() returns 0.
  */
 int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+/*
+ * Returns 1 if new areas added by growing the image with
+ * PREALLOC_MODE_OFF contain only zeros, 0 otherwise.
+ */
+int (*bdrv_has_zero_init_truncate)(BlockDriverState *bs);
+
 /* Remove fd handlers, timers, and other event loop callbacks so the event
  * loop is no longer in use.  Called with no in-flight requests and in
  * depth-first traversal order with parents before child nodes.
diff --git a/block.c b/block.c
index cbd8da5f3b..81ae44dcf3 100644
--- a/block.c
+++ b/block.c
@@ -5066,6 +5066,27 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 return 0;
 }
 
+int bdrv_has_zero_init_truncate(BlockDriverState *bs)
+{
+if (!bs->drv) {
+return 0;
+}
+
+if (bs->backing) {
+/* Depends on the backing image length, but better safe than sorry */
+return 0;
+}
+if (bs->drv->bdrv_has_zero_init_truncate) {
+return bs->drv->bdrv_has_zero_init_truncate(bs);
+}
+if (bs->file && bs->drv->is_filter) {
+return bdrv_has_zero_init_truncate(bs->file->bs);
+}
+
+/* safe default */
+return 0;
+}
+
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs)
 {
 BlockDriverInfo bdi;
-- 
2.21.0

[Qemu-devel] [PATCH v2 00/11] block: Fix some things about bdrv_has_zero_init()

2019-07-24 Thread Max Reitz

Hi,

See the previous cover letter for the reason for patches 6 through 9:
https://lists.nongnu.org/archive/html/qemu-block/2019-07/msg00563.html

But not only are some bdrv_has_zero_init() implementations wrong, some
callers also use it the wrong way.

First, qemu-img and mirror use it for pre-existing images, where it
doesn’t have any meaning.  Both should consider pre-existing images to
always be non-zero and not look at bdrv_has_zero_init() (patches 1, 2,
and the tests in 10 and 11).

Second, vhdx and parallels call bdrv_has_zero_init() when they do not
really care about an image’s post-create state but only about what
happens when you grow an image.  That is a bit ugly, and also overly
safe when growing preallocated images without preallocating the new
areas.  So this series adds a new function bdrv_has_zero_init_truncate()
that is more suited to vhdx's and parallels' needs (patches 3 through
5).
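
To make the intended use a bit more concrete, here is a minimal standalone
sketch of how a format driver could decide whether it has to zero a newly
grown area itself.  All Sketch* names are hypothetical stand-ins, not QEMU
types; only the lookup order (no driver, backing file, driver callback,
filter passthrough, safe default) is taken from patch 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockDriver / BlockDriverState. */
typedef struct SketchNode SketchNode;

typedef struct SketchDriver {
    bool is_filter;
    int (*has_zero_init_truncate)(SketchNode *n);   /* may be NULL */
} SketchDriver;

struct SketchNode {
    const SketchDriver *drv;   /* NULL when no driver is attached */
    SketchNode *backing;
    SketchNode *file;
};

/* Same lookup order as the new block-layer function in patch 3. */
int sketch_has_zero_init_truncate(SketchNode *n)
{
    assert(n != NULL);
    if (!n->drv) {
        return 0;
    }
    if (n->backing) {
        return 0;   /* depends on the backing length; be conservative */
    }
    if (n->drv->has_zero_init_truncate) {
        return n->drv->has_zero_init_truncate(n);
    }
    if (n->file && n->drv->is_filter) {
        return sketch_has_zero_init_truncate(n->file);
    }
    return 0;       /* safe default */
}

/* Caller side: after growing without preallocation, must we write zeros? */
bool sketch_grow_needs_explicit_zeroing(SketchNode *file_node)
{
    return !sketch_has_zero_init_truncate(file_node);
}

/* Example drivers, purely for illustration. */
int sketch_always_zero(SketchNode *n) { (void)n; return 1; }
const SketchDriver sketch_zeroing_driver = { false, sketch_always_zero };
const SketchDriver sketch_filter_driver  = { true,  NULL };
const SketchDriver sketch_plain_driver   = { false, NULL };
```

A driver without the callback that is not a filter ends up on the safe
default, so the caller keeps zeroing explicitly.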


v2:
- Simplified preallocation checks in qcow2 and vhdx [Kevin]
- Added patches 1 – 5, 10, 11


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/11:[down] 'qemu-img: Fix bdrv_has_zero_init() use in convert'
002/11:[down] 'mirror: Fix bdrv_has_zero_init() use'
003/11:[down] 'block: Add bdrv_has_zero_init_truncate()'
004/11:[down] 'block: Implement .bdrv_has_zero_init_truncate()'
005/11:[down] 'block: Use bdrv_has_zero_init_truncate()'
006/11:[0077] [FC] 'qcow2: Fix .bdrv_has_zero_init()'
007/11:[----] [--] 'vdi: Fix .bdrv_has_zero_init()'
008/11:[0021] [FC] 'vhdx: Fix .bdrv_has_zero_init()'
009/11:[----] [--] 'iotests: Convert to preallocated encrypted qcow2'
010/11:[down] 'iotests: Test convert -n to pre-filled image'
011/11:[down] 'iotests: Full mirror to existing non-zero image'


Max Reitz (11):
  qemu-img: Fix bdrv_has_zero_init() use in convert
  mirror: Fix bdrv_has_zero_init() use
  block: Add bdrv_has_zero_init_truncate()
  block: Implement .bdrv_has_zero_init_truncate()
  block: Use bdrv_has_zero_init_truncate()
  qcow2: Fix .bdrv_has_zero_init()
  vdi: Fix .bdrv_has_zero_init()
  vhdx: Fix .bdrv_has_zero_init()
  iotests: Convert to preallocated encrypted qcow2
  iotests: Test convert -n to pre-filled image
  iotests: Full mirror to existing non-zero image

 include/block/block.h   |  1 +
 include/block/block_int.h   |  9 ++
 block.c | 21 +
 block/file-posix.c  |  1 +
 block/file-win32.c  |  1 +
 block/gluster.c |  4 +++
 block/mirror.c  | 11 +--
 block/nfs.c |  1 +
 block/parallels.c   |  2 +-
 block/qcow2.c   | 30 +-
 block/qed.c |  1 +
 block/raw-format.c  |  6 
 block/rbd.c |  1 +
 block/sheepdog.c|  1 +
 block/ssh.c |  1 +
 block/vdi.c | 13 +++-
 block/vhdx.c| 28 +++--
 blockdev.c  | 16 --
 qemu-img.c  | 11 +--
 tests/test-block-iothread.c |  2 +-
 tests/qemu-iotests/041  | 62 ++---
 tests/qemu-iotests/041.out  |  4 +--
 tests/qemu-iotests/122  | 17 ++
 tests/qemu-iotests/122.out  |  8 +
 tests/qemu-iotests/188  | 20 +++-
 tests/qemu-iotests/188.out  |  4 +++
 26 files changed, 254 insertions(+), 22 deletions(-)

-- 
2.21.0

Re: [Qemu-devel] [PATCH for-4.1 1/2] spapr/irq: Inform the user when falling back to emulated IC

2019-07-24 Thread Cédric Le Goater

On 24/07/2019 18:57, Greg Kurz wrote:
> Just to give an indication to the user that the error condition is
> handled and how.
> 
> Reported-by: Satheesh Rajendran 
> Signed-off-by: Greg Kurz 



Reviewed-by: Cédric Le Goater 

Thanks,

C.

> ---
>  hw/ppc/spapr_irq.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index ff3df0bbd8cf..d07aed8ca9f9 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -86,6 +86,7 @@ static void spapr_irq_init_kvm(SpaprMachineState *spapr,
>   * emulated mode
>   */
>  error_prepend(&local_err, "kernel_irqchip allowed but unavailable: ");
> +error_append_hint(&local_err, "Falling back to kernel-irqchip=off\n");
>  warn_report_err(local_err);
>  }
>  }
>

Re: [Qemu-devel] [Qemu-stable] [PATCH 00/36] Patch Round-up for stable 3.1.1, freeze on 2019-07-29

2019-07-24 Thread Cole Robinson

On 7/23/19 1:00 PM, Michael Roth wrote:
> Hi everyone,  
> 
> 
> The following new patches are queued for QEMU stable v3.1.1:
> 
>   https://github.com/mdroth/qemu/commits/stable-3.1-staging
> 
> The release is planned for 2019-08-01:
> 
>   https://wiki.qemu.org/Planning/3.1
> 
> Please respond here or CC qemu-sta...@nongnu.org on any patches you
> think should be included in the release.
> 
> Note that this update falls outside the normal stable release support
> window (~1 development cycle), but is being released now since it was
> delayed from its intended release date.
> 

Here are some extra patches we are carrying in Fedora 30:

Thanks,
Cole

commit e014dbe74e0484188164c61ff6843f8a04a8cb9d
Author: Prasanna Kumar Kalever 
Date:   Tue Mar 5 16:46:33 2019 +0100

gluster: Handle changed glfs_ftruncate signature

commit 0e3b891fefacc0e49f3c8ffa3a753b69eb7214d2
Author: Niels de Vos 
Date:   Tue Mar 5 16:46:34 2019 +0100

gluster: the glfs_io_cbk callback function pointer adds pre/post
stat args

commit cce648613bc802be1b894227f7fd94d88476ea07
Author: Prasad J Pandit 
Date:   Wed Dec 12 23:28:17 2018 +0530

pvrdma: release device resources in case of an error

commit 2aa86456fb938a11f2b7bd57c8643c213218681c
Author: Prasad J Pandit 
Date:   Thu Dec 13 01:00:35 2018 +0530

pvrdma: add uar_read routine

commit e909ff93698851777faac3c45d03c1b73f311ea6
Author: Paolo Bonzini 
Date:   Fri Jan 11 17:27:31 2019 +0100

scsi-generic: avoid possible out-of-bounds access to r->buf

commit a7104eda7dab99d0cdbd3595c211864cba415905
Author: Prasad J Pandit 
Date:   Sun Jan 13 23:29:48 2019 +0530

slirp: check data length while emulating ident function

commit b05b267840515730dbf6753495d5b7bd8b04ad1c
Author: Gerd Hoffmann 
Date:   Tue Jan 8 11:23:01 2019 +0100

i2c-ddc: fix oob read

commit 9a1565a03b79d80b236bc7cc2dbce52a2ef3a1b8
Author: Daniel P. Berrangé 
Date:   Wed Mar 13 09:49:03 2019 +

seccomp: don't kill process for resource control syscalls

commit d52680fc932efb8a2f334cc6993e705ed1e31e99
Author: Prasad J Pandit 
Date:   Thu Apr 25 12:05:34 2019 +0530

qxl: check release info object

commit ad280559c68360c9f1cd7be063857853759e6a73
Author: Prasad J Pandit 
Date:   Fri Jan 4 15:19:10 2019 +0530

sun4u: add power_mem_read routine

commit da885fe1ee8b4589047484bd7fa05a4905b52b17
Author: Peter Maydell 
Date:   Fri Dec 14 13:30:52 2018 +

device_tree.c: Don't use load_image()

commit 065e6298a75164b4347682b63381dbe752c2b156
Author: Markus Armbruster 
Date:   Tue Apr 9 19:40:18 2019 +0200

device_tree: Fix integer overflowing in load_device_tree()

[Qemu-devel] [PATCH for-4.1 1/2] spapr/irq: Inform the user when falling back to emulated IC

2019-07-24 Thread Greg Kurz

Just to give an indication to the user that the error condition is
handled and how.

Reported-by: Satheesh Rajendran 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_irq.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index ff3df0bbd8cf..d07aed8ca9f9 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -86,6 +86,7 @@ static void spapr_irq_init_kvm(SpaprMachineState *spapr,
  * emulated mode
  */
 error_prepend(&local_err, "kernel_irqchip allowed but unavailable: ");
+error_append_hint(&local_err, "Falling back to kernel-irqchip=off\n");
 warn_report_err(local_err);
 }
 }

[Qemu-devel] [PATCH for-4.1 2/2] xics/kvm: Fix fallback to emulated XICS

2019-07-24 Thread Greg Kurz

Commit 4812f2615288 tried to fix rollback path of xics_kvm_connect() but
it isn't enough. If we fail to create the KVM device, the guest fails
to boot later on with:

[0.010817] pci :00:00.0: Adding to iommu group 0
[0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping 
(rc=-22)
[0.010923] pci :00:01.0: Adding to iommu group 0
[0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping 
(rc=-22)
[0.011543] EEH: No capable adapters found
[0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping 
(rc=-22)
[0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized 
audit_enabled=0 res=1
[0.011703] [ cut here ]
[0.011729] event-sources: Unable to allocate interrupt number for 
/event-sources/epow-events
[0.011776] WARNING: CPU: 0 PID: 1 at 
arch/powerpc/platforms/pseries/event_sources.c:34 
request_event_sources_irqs+0xbc/0x150
[0.011828] Modules linked in:
[0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.1.17-300.fc30.ppc64le #1
[0.011886] NIP:  c00d4fac LR: c00d4fa8 CTR: c18f
[0.011923] REGS: c0001e4c38d0 TRAP: 0700   Not tainted  
(5.1.17-300.fc30.ppc64le)
[0.011966] MSR:  82029033   CR: 28000284  
XER: 2004
[0.012012] CFAR: c011b42c IRQMASK: 0
[0.012012] GPR00: c00d4fa8 c0001e4c3b60 c15fc400 
0051
[0.012012] GPR04: 0001  0081 
772d6576656e7473
[0.012012] GPR08: 1edf c14d4830 c14d4830 
6e6576652f20726f
[0.012012] GPR12:  c18f c0010bf0 

[0.012012] GPR16:    

[0.012012] GPR20:    

[0.012012] GPR24:   c0ebbf00 
c00d5570
[0.012012] GPR28: c0ebc008 c0001fff8248  

[0.012372] NIP [c00d4fac] request_event_sources_irqs+0xbc/0x150
[0.012409] LR [c00d4fa8] request_event_sources_irqs+0xb8/0x150
[0.012445] Call Trace:
[0.012462] [c0001e4c3b60] [c00d4fa8] 
request_event_sources_irqs+0xb8/0x150 (unreliable)
[0.012513] [c0001e4c3bf0] [c1042848] 
__machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8
[0.012563] [c0001e4c3c20] [c0010810] do_one_initcall+0x60/0x254
[0.012611] [c0001e4c3cf0] [c1024538] 
kernel_init_freeable+0x35c/0x444
[0.012655] [c0001e4c3db0] [c0010c14] kernel_init+0x2c/0x148
[0.012693] [c0001e4c3e20] [c000bdc4] 
ret_from_kernel_thread+0x5c/0x78
[0.012736] Instruction dump:
[0.012759] 38a0 7c7f1b78 7f64db78 2c1f 2fbf 78630020 4180002c 
409effa8
[0.012805] 7fa4eb78 7f43d378 48046421 6000 <0fe0> 3bde0001 2c1e0010 
7fde07b4
[0.012851] ---[ end trace aa5785707323fad3 ]---

This happens because QEMU fell back on XICS emulation but didn't unregister
the RTAS calls from KVM. The emulated RTAS calls are hence never called and
the KVM ones return an error to the guest since the KVM device is absent.

The sanity checks in xics_kvm_disconnect() are excessive since we're freeing
the KVM device anyway. Simply drop them.
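
For readers who want the failure mode spelled out, the following standalone
model may help.  It is only a sketch: the SketchIc type and every function
name are made up, and the ordering (RTAS calls handed to KVM before the
device is created) is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of who handles the interrupt controller. */
typedef struct SketchIc {
    int  kernel_fd;            /* -1 when no KVM device exists */
    bool rtas_in_kvm;          /* RTAS calls are routed to the kernel */
    bool emulated;             /* QEMU emulation is active */
} SketchIc;

/* Connect: hand RTAS calls to KVM, then try to create the device. */
bool sketch_kvm_connect(SketchIc *s, bool device_create_ok)
{
    assert(s != NULL);
    s->rtas_in_kvm = true;
    if (!device_create_ok) {
        return false;          /* caller is expected to roll back */
    }
    s->kernel_fd = 3;          /* arbitrary valid descriptor */
    return true;
}

/* Old behaviour: bails out early when there is no device. */
void sketch_kvm_disconnect_guarded(SketchIc *s)
{
    if (s->kernel_fd == -1) {
        return;                /* RTAS registration is left behind */
    }
    s->rtas_in_kvm = false;
    s->kernel_fd = -1;
}

/* Behaviour after this patch: always undo everything. */
void sketch_kvm_disconnect(SketchIc *s)
{
    s->rtas_in_kvm = false;
    s->kernel_fd = -1;
}

/* Guest interrupts work only if whoever owns RTAS can serve the calls. */
bool sketch_guest_irqs_work(const SketchIc *s)
{
    return s->rtas_in_kvm ? s->kernel_fd != -1 : s->emulated;
}
```

With the guarded variant, a failed device creation leaves rtas_in_kvm set
while kernel_fd is still -1, which corresponds to the broken guest above.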

Fixes: 4812f2615288 "xics/kvm: Add proper rollback to xics_kvm_init()"
Signed-off-by: Greg Kurz 
---
 hw/intc/xics_kvm.c |   11 ---
 1 file changed, 11 deletions(-)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 2df1f3e92c7e..65c35f90f9af 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -430,17 +430,6 @@ fail:
 
 void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp)
 {
-/* The KVM XICS device is not in use */
-if (kernel_xics_fd == -1) {
-return;
-}
-
-if (!kvm_enabled() || !kvm_check_extension(kvm_state, KVM_CAP_IRQ_XICS)) {
-error_setg(errp,
-   "KVM and IRQ_XICS capability must be present for KVM XICS 
device");
-return;
-}
-
 /*
  * Only on P9 using the XICS-on XIVE KVM device:
  *

[Qemu-devel] [PATCH for-4.1 0/2] spapr/xics: Last minute fixes

2019-07-24 Thread Greg Kurz

KVM on POWER9 doesn't use the XIVE VP space optimally. This currently
limits the number of VMs we can start to 127. Starting with the 128th
one, KVM fails to create the XIVE or the XICS-on-XIVE device and we
go through the fallback paths in QEMU.

The XICS error path still has an issue that prevents interrupts from
working in the guest, even after QEMU fell back on XICS emulation. This is
fixed with patch 2.

Patch 1 is just a _trivial_ improvement of the warning that gets
emitted when falling back to emulated IC. Feel free to apply it to
4.2 or even to drop it if you don't want it in 4.1.

--
Greg

---

Greg Kurz (2):
  spapr/irq: Inform the user when falling back to emulated IC
  xics/kvm: Fix fallback to emulated XICS


 hw/intc/xics_kvm.c |   11 ---
 hw/ppc/spapr_irq.c |1 +
 2 files changed, 1 insertion(+), 11 deletions(-)

Re: [Qemu-devel] [PATCH for 4.1?] pl330: fix vmstate description

2019-07-24 Thread Dr. David Alan Gilbert

* Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> On 7/24/19 4:35 PM, Damien Hedde wrote:
> > Fix the pl330 main and queue vmstate description.
> > There were missing POINTER flags causing crashes during
> > incoming migration because:
> > + PL330State chan field is a pointer to an array
> > + PL330Queue queue field is a pointer to an array
> > 
> > Also bump corresponding vmsd version numbers.
> > 
> > Signed-off-by: Damien Hedde 
> > ---
> > 
> > I found this while working on reset with xilinx-zynq machine.
> > 
> > I'm not sure what's the vmsd version policy in such cases (for
> > backward compatibility). I've simply bumped them since migration
> > was not working anyway (vmstate_load_state was erasing critical part
> > of PL330State and causing segfaults while loading following fields).
> 
> I still don't understand versioning and migration

Incrementing the version (and minimum) is the right thing
to do if you conclude the old one was hopelessly broken.
Migration to and from old qemu breaks, but who cares since it was toast
anyway.
As far as I can tell pl330 is only on our zynq and exynos models,
so it won't break our versioned 'virt' type.
So from a migration point of view:


Acked-by: Dr. David Alan Gilbert 
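
As a rough illustration of why bumping both fields cuts off old streams:
the loader accepts an incoming section only if its version lies between
minimum_version_id and version_id.  The sketch below models just that
check with hypothetical names; it is not the real vmstate code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced view of a VMStateDescription. */
typedef struct SketchVMSD {
    int version_id;
    int minimum_version_id;
} SketchVMSD;

/* Return 0 if a stream of the given version may be loaded, -EINVAL if not. */
int sketch_check_version(const SketchVMSD *vmsd, int stream_version)
{
    assert(vmsd != NULL);
    if (stream_version > vmsd->version_id) {
        return -EINVAL;   /* stream is newer than we understand */
    }
    if (stream_version < vmsd->minimum_version_id) {
        return -EINVAL;   /* stream is older than we still accept */
    }
    return 0;
}
```

With version_id = minimum_version_id = 2, a version 1 stream from an old
QEMU is refused up front instead of being loaded with the broken layout.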


> so I can't say, but
> then you use the correct macro, since we have:
> 
> s->chan = g_new0(PL330Chan, s->num_chnls);
> 
> So:
> Reviewed-by: Philippe Mathieu-Daude 
> 
> > 
> > Tested doing migration with the xilinx-zynq-a9 machine.
> > 
> > ---
> >  hw/dma/pl330.c | 17 +
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
> > index 58df965a46..a56a3e7771 100644
> > --- a/hw/dma/pl330.c
> > +++ b/hw/dma/pl330.c
> > @@ -218,11 +218,12 @@ typedef struct PL330Queue {
> >  
> >  static const VMStateDescription vmstate_pl330_queue = {
> >  .name = "pl330_queue",
> > -.version_id = 1,
> > -.minimum_version_id = 1,
> > +.version_id = 2,
> > +.minimum_version_id = 2,
> >  .fields = (VMStateField[]) {
> > -VMSTATE_STRUCT_VARRAY_UINT32(queue, PL330Queue, queue_size, 1,
> > - vmstate_pl330_queue_entry, 
> > PL330QueueEntry),
> > +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(queue, PL330Queue, queue_size,
> > + vmstate_pl330_queue_entry,
> > + PL330QueueEntry),
> >  VMSTATE_END_OF_LIST()
> >  }
> >  };
> > @@ -278,12 +279,12 @@ struct PL330State {
> >  
> >  static const VMStateDescription vmstate_pl330 = {
> >  .name = "pl330",
> > -.version_id = 1,
> > -.minimum_version_id = 1,
> > +.version_id = 2,
> > +.minimum_version_id = 2,
> >  .fields = (VMStateField[]) {
> >  VMSTATE_STRUCT(manager, PL330State, 0, vmstate_pl330_chan, 
> > PL330Chan),
> > -VMSTATE_STRUCT_VARRAY_UINT32(chan, PL330State, num_chnls, 0,
> > - vmstate_pl330_chan, PL330Chan),
> > +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(chan, PL330State, num_chnls,
> > + vmstate_pl330_chan, 
> > PL330Chan),
> >  VMSTATE_VBUFFER_UINT32(lo_seqn, PL330State, 1, NULL, num_chnls),
> >  VMSTATE_VBUFFER_UINT32(hi_seqn, PL330State, 1, NULL, num_chnls),
> >  VMSTATE_STRUCT(fifo, PL330State, 0, vmstate_pl330_fifo, PL330Fifo),
> > 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v4 0/3] pc: mmap kernel (ELF image) and initrd

2019-07-24 Thread Montes, Julio

Hi Stefano

Here are the results:

https://pasteboard.co/Ipu3DO4.png
https://pasteboard.co/Ipu3L69.png

boot time with initrd is a little bit better

Thanks

-
Julio


On Wed, 2019-07-24 at 16:31 +0200, Stefano Garzarella wrote:
> In order to reduce the memory footprint when PVH kernel and initrd
> are used, we map them into memory instead of reading them.
> In this way we can share them between multiple instances of QEMU.
> 
> v4:
>   - Patch 1: fix the rom_add_elf_program() comment [Paolo]
>   - Patch 2:
> ~ fix the missing of g_mapped_file_unref() in the success case
> [Paolo]
> ~ fix the rom_add_elf_program() comment [Paolo]
> 
> v3: 
> https://patchew.org/QEMU/20190724112531.232260-1-sgarz...@redhat.com/
> v2: 
> https://patchew.org/QEMU/20190723140445.12748-1-sgarz...@redhat.com/
> 
> These are the results using a PVH kernel and initrd (cpio):
> - memory footprint (using smem) [MB]
>         QEMU              before                   now
>     # instances        USS      PSS            USS      PSS
>          1           102.0M   105.8M         102.3M   106.2M
>          2            94.6M   101.2M          72.3M    90.1M
>          4            94.1M    98.0M          72.0M    81.5M
>          8            94.0M    96.2M          71.8M    76.9M
>         16            93.9M    95.1M          71.6M    74.3M
>
>     Initrd size: 3.0M
>     Kernel
>         image size: 28M
>         sections size [size -A -d vmlinux]:  18.9M
>
> - boot time [ms]
>                           before                   now
>  qemu_init_end:            63.85                   55.91
>  linux_start_kernel:       82.11 (+18.26)          74.51 (+18.60)
>  linux_start_user:        169.94 (+87.83)         159.06 (+84.56)
> 
> QEMU command used:
> ./qemu-system-x86_64 -bios /path/to/seabios/out/bios.bin -no-hpet \
> -machine q35,accel=kvm,kernel_irqchip,nvdimm,sata=off,smbus=off,vmport=off \
> -cpu host -m 1G -smp 1 -vga none -display none -no-user-config -nodefaults \
> -kernel /path/to/vmlinux -initrd /path/to/rootfs.cpio \
> -append 'root=/dev/mem0 ro console=hvc0 pci=lastbus=0 nosmap'
> 
> Stefano Garzarella (3):
>   loader: Handle memory-mapped ELFs
>   elf-ops.h: Map into memory the ELF to load
>   hw/i386/pc: Map into memory the initrd
> 
>  hw/core/loader.c | 38 +++-
>  hw/i386/pc.c | 17 ---
>  include/hw/elf_ops.h | 71 ++--
> 
>  include/hw/i386/pc.h |  1 +
>  include/hw/loader.h  |  5 ++--
>  5 files changed, 89 insertions(+), 43 deletions(-)
>

Re: [Qemu-devel] qemu-iotests 069 and 111 are failing on NetBSD

2019-07-24 Thread Paolo Bonzini

On 24/07/19 11:34, Thomas Huth wrote:
> In case somebody is interested, two of the "auto" iotests are failing
> on NetBSD due to non-matching output:
> 
>   TESTiotest-qcow2: 069 [fail]
> --- /var/tmp/qemu-test.1BMupF/tests/qemu-iotests/069.out2019-07-24 
> 09:19:22.0 +
> +++ /var/tmp/qemu-test.1BMupF/tests/qemu-iotests/069.out.bad2019-07-24 
> 09:21:34.0 +
> @@ -4,5 +4,5 @@
>  
>  Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=131072
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 
> backing_file=TEST_DIR/t.IMGFMT.base
> -qemu-io: can't open device TEST_DIR/t.IMGFMT: Could not open backing file: 
> Could not open 'TEST_DIR/t.IMGFMT.base': No such file or directory
> +qemu-io: can't open device TEST_DIR/t.IMGFMT: Could not open backing file: 
> TEST_DIR/t.IMGFMT.base: stat failed: No such file or directory
>  *** done
> 
> and:
> 
>   TESTiotest-qcow2: 111 [fail]
> --- /var/tmp/qemu-test.1BMupF/tests/qemu-iotests/111.out2019-07-24 
> 09:19:22.0 +
> +++ /var/tmp/qemu-test.1BMupF/tests/qemu-iotests/111.out.bad2019-07-24 
> 09:21:40.0 +
> @@ -1,4 +1,4 @@
>  QA output created by 111
> -qemu-img: TEST_DIR/t.IMGFMT: Could not open 'TEST_DIR/t.IMGFMT.inexistent': 
> No such file or directory
> +qemu-img: TEST_DIR/t.IMGFMT: TEST_DIR/t.IMGFMT.inexistent: stat failed: No 
> such file or directory
>  Could not open backing image to determine size.
>  *** done
> 
> It's currently not a problem yet since we're not running the
> iotests on NetBSD yet (since our netbsd VM image does not have
> bash and gsed installed yet), but if somebody has some spare
> minutes, it would be great if this could be fixed so that we
> can enable the iotests on NetBSD, too, one day...

Is this (slightly ridiculous but effective) patch enough?

diff --git a/block/file-posix.c b/block/file-posix.c
index 73a001ceb7..ce847f4d62 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -217,7 +217,7 @@ static int raw_normalize_devicepath(const char **filename, 
Error **errp)
 fname = *filename;
 dp = strrchr(fname, '/');
 if (lstat(fname, ) < 0) {
-error_setg_errno(errp, errno, "%s: stat failed", fname);
+error_setg_errno(errp, errno, "Could not open: '%s'", fname);
 return -errno;
 }
 
Paolo

Re: [Qemu-devel] [PATCH for 4.1?] pl330: fix vmstate description

2019-07-24 Thread Philippe Mathieu-Daudé

On 7/24/19 4:35 PM, Damien Hedde wrote:
> Fix the pl330 main and queue vmstate description.
> There were missing POINTER flags causing crashes during
> incoming migration because:
> + PL330State chan field is a pointer to an array
> + PL330Queue queue field is a pointer to an array
> 
> Also bump corresponding vmsd version numbers.
> 
> Signed-off-by: Damien Hedde 
> ---
> 
> I found this while working on reset with xilinx-zynq machine.
> 
> I'm not sure what's the vmsd version policy in such cases (for
> backward compatibility). I've simply bumped them since migration
> was not working anyway (vmstate_load_state was erasing critical part
> of PL330State and causing segfaults while loading following fields).

I still don't understand versioning and migration, so I can't say, but
then you use the correct macro, since we have:

s->chan = g_new0(PL330Chan, s->num_chnls);

So:
Reviewed-by: Philippe Mathieu-Daude 
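
To illustrate why the POINTER flag matters here, a small standalone sketch
(hypothetical Sketch* names, not the real vmstate helpers): without the
flag the loader treats the address of the field itself as the start of the
array, so incoming data lands on the pointer and on whatever follows it in
the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_FLAG_POINTER = 1 };   /* hypothetical flag value */

typedef struct SketchChan { uint32_t pc; } SketchChan;

typedef struct SketchState {
    SketchChan *chan;       /* allocated separately, like g_new0() above */
    uint32_t num_chnls;
} SketchState;

/* Where would the loader start writing array elements for this field? */
void *sketch_field_base(void *opaque, size_t offset, int flags)
{
    assert(opaque != NULL);
    void *addr = (char *)opaque + offset;

    if (flags & SKETCH_FLAG_POINTER) {
        /* The field holds a pointer: follow it to the real array. */
        void *target;
        memcpy(&target, addr, sizeof(target));
        addr = target;
    }
    return addr;
}
```

With the flag the base is the heap array; without it the base is
&s->chan, i.e. inside the device state itself.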

> 
> Tested doing migration with the xilinx-zynq-a9 machine.
> 
> ---
>  hw/dma/pl330.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
> index 58df965a46..a56a3e7771 100644
> --- a/hw/dma/pl330.c
> +++ b/hw/dma/pl330.c
> @@ -218,11 +218,12 @@ typedef struct PL330Queue {
>  
>  static const VMStateDescription vmstate_pl330_queue = {
>  .name = "pl330_queue",
> -.version_id = 1,
> -.minimum_version_id = 1,
> +.version_id = 2,
> +.minimum_version_id = 2,
>  .fields = (VMStateField[]) {
> -VMSTATE_STRUCT_VARRAY_UINT32(queue, PL330Queue, queue_size, 1,
> - vmstate_pl330_queue_entry, PL330QueueEntry),
> +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(queue, PL330Queue, queue_size,
> + vmstate_pl330_queue_entry,
> + PL330QueueEntry),
>  VMSTATE_END_OF_LIST()
>  }
>  };
> @@ -278,12 +279,12 @@ struct PL330State {
>  
>  static const VMStateDescription vmstate_pl330 = {
>  .name = "pl330",
> -.version_id = 1,
> -.minimum_version_id = 1,
> +.version_id = 2,
> +.minimum_version_id = 2,
>  .fields = (VMStateField[]) {
>  VMSTATE_STRUCT(manager, PL330State, 0, vmstate_pl330_chan, 
> PL330Chan),
> -VMSTATE_STRUCT_VARRAY_UINT32(chan, PL330State, num_chnls, 0,
> - vmstate_pl330_chan, PL330Chan),
> +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(chan, PL330State, num_chnls,
> + vmstate_pl330_chan, PL330Chan),
>  VMSTATE_VBUFFER_UINT32(lo_seqn, PL330State, 1, NULL, num_chnls),
>  VMSTATE_VBUFFER_UINT32(hi_seqn, PL330State, 1, NULL, num_chnls),
>  VMSTATE_STRUCT(fifo, PL330State, 0, vmstate_pl330_fifo, PL330Fifo),
>

Re: [Qemu-devel] [PATCH v3 2/3] qapi: implement block-dirty-bitmap-remove transaction action

2019-07-24 Thread John Snow




On 7/24/19 9:58 AM, Vladimir Sementsov-Ogievskiy wrote:
> 09.07.2019 1:05, John Snow wrote:
>> It is used to do transactional movement of the bitmap (which is
>> possible in conjunction with merge command). Transactional bitmap
>> movement is needed in scenarios with external snapshot, when we don't
>> want to leave copy of the bitmap in the base image.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: John Snow 
>> ---
>>   block.c|  2 +-
>>   block/dirty-bitmap.c   | 15 +++
>>   blockdev.c | 79 +++---
>>   include/block/dirty-bitmap.h   |  2 +-
>>   migration/block-dirty-bitmap.c |  2 +-
>>   qapi/transaction.json  |  2 +
>>   6 files changed, 85 insertions(+), 17 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index c139540f2b..5195d4b910 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -5316,7 +5316,7 @@ static void coroutine_fn 
>> bdrv_co_invalidate_cache(BlockDriverState *bs,
>>   for (bm = bdrv_dirty_bitmap_next(bs, NULL); bm;
>>bm = bdrv_dirty_bitmap_next(bs, bm))
>>   {
>> -bdrv_dirty_bitmap_set_migration(bm, false);
>> +bdrv_dirty_bitmap_skip_store(bm, false);
>>   }
>>   
>>   ret = refresh_total_sectors(bs, bs->total_sectors);
>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>> index 95a9c2a5d8..a308e1f84b 100644
>> --- a/block/dirty-bitmap.c
>> +++ b/block/dirty-bitmap.c
>> @@ -48,10 +48,9 @@ struct BdrvDirtyBitmap {
>>   bool inconsistent;  /* bitmap is persistent, but inconsistent.
>>  It cannot be used at all in any way, 
>> except
>>  a QMP user can remove it. */
>> -bool migration; /* Bitmap is selected for migration, it 
>> should
>> -   not be stored on the next inactivation
>> -   (persistent flag doesn't matter until 
>> next
>> -   invalidation).*/
>> +bool skip_store;/* We are either migrating or deleting this
>> + * bitmap; it should not be stored on the 
>> next
>> + * inactivation. */
>>   QLIST_ENTRY(BdrvDirtyBitmap) list;
>>   };
>>   
>> @@ -757,16 +756,16 @@ void 
>> bdrv_dirty_bitmap_set_inconsistent(BdrvDirtyBitmap *bitmap)
>>   }
>>   
>>   /* Called with BQL taken. */
>> -void bdrv_dirty_bitmap_set_migration(BdrvDirtyBitmap *bitmap, bool 
>> migration)
>> +void bdrv_dirty_bitmap_skip_store(BdrvDirtyBitmap *bitmap, bool skip)
>>   {
>>   qemu_mutex_lock(bitmap->mutex);
>> -bitmap->migration = migration;
>> +bitmap->skip_store = skip;
>>   qemu_mutex_unlock(bitmap->mutex);
>>   }
>>   
>>   bool bdrv_dirty_bitmap_get_persistence(BdrvDirtyBitmap *bitmap)
>>   {
>> -return bitmap->persistent && !bitmap->migration;
>> +return bitmap->persistent && !bitmap->skip_store;
>>   }
>>   
>>   bool bdrv_dirty_bitmap_inconsistent(const BdrvDirtyBitmap *bitmap)
>> @@ -778,7 +777,7 @@ bool 
>> bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs)
>>   {
>>   BdrvDirtyBitmap *bm;
>>   QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
>> -if (bm->persistent && !bm->readonly && !bm->migration) {
>> +if (bm->persistent && !bm->readonly && !bm->skip_store) {
>>   return true;
>>   }
>>   }
>> diff --git a/blockdev.c b/blockdev.c
>> index 01248252ca..800b3dcb42 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -2134,6 +2134,51 @@ static void 
>> block_dirty_bitmap_merge_prepare(BlkActionState *common,
>>   errp);
>>   }
>>   
>> +static BdrvDirtyBitmap *do_block_dirty_bitmap_remove(
>> +const char *node, const char *name, bool release,
>> +BlockDriverState **bitmap_bs, Error **errp);
>> +
>> +static void block_dirty_bitmap_remove_prepare(BlkActionState *common,
>> +  Error **errp)
>> +{
>> +BlockDirtyBitmap *action;
>> +BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> + common, common);
>> +
>> +if (action_check_completion_mode(common, errp) < 0) {
>> +return;
>> +}
>> +
>> +action = common->action->u.block_dirty_bitmap_remove.data;
>> +
>> +state->bitmap = do_block_dirty_bitmap_remove(action->node, action->name,
>> + false, &state->bs, errp);
>> +if (state->bitmap) {
>> +bdrv_dirty_bitmap_skip_store(state->bitmap, true);
>> +bdrv_dirty_bitmap_set_busy(state->bitmap, true);
>> +}
>> +}
>> +
>> +static void block_dirty_bitmap_remove_abort(BlkActionState *common)
>> +{
>> +BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> + common, common);
>> +
>> +

Re: [Qemu-devel] [PATCH v7 02/11] numa: move numa global variable nb_numa_nodes into MachineState

2019-07-24 Thread Igor Mammedov

On Wed, 24 Jul 2019 12:02:41 -0300
Eduardo Habkost  wrote:

> On Wed, Jul 24, 2019 at 04:27:21PM +0200, Igor Mammedov wrote:
> > On Tue, 23 Jul 2019 12:23:57 -0300
> > Eduardo Habkost  wrote:
> > 
> > > On Tue, Jul 23, 2019 at 04:56:41PM +0200, Igor Mammedov wrote:
> > > > On Tue, 16 Jul 2019 22:51:12 +0800
> > > > Tao Xu  wrote:
> > > > 
> > > > > Add struct NumaState in MachineState and move existing numa global
> > > > > nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable
> > > > > numa_support into MachineClass to decide which submachines support 
> > > > > NUMA.
> > > > > 
> > > > > Suggested-by: Igor Mammedov 
> > > > > Suggested-by: Eduardo Habkost 
> > > > > Signed-off-by: Tao Xu 
> > > > > ---
> > > > > 
> > > > > No changes in v7.
> > > > > 
> > > > > Changes in v6:
> > > > > - Rebase to upstream, move globals in arm/sbsa-ref and use
> > > > >   numa_mem_supported
> > > > > - When used once or twice in the function, use
> > > > >   ms->numa_state->num_nodes directly
> > > > > - Correct some mistakes
> > > > > - Use once monitor_printf in hmp_info_numa
> > > > > ---
> > > [...]
> > > > >  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
> > > > > -pxb->numa_node >= nb_numa_nodes) {
> > > > > +pxb->numa_node >= ms->numa_state->num_nodes) {
> > > > this will crash if user tries to use device on machine that doesn't 
> > > > support numa
> > > > check that numa_state is not NULL before dereferencing 
> > > 
> > > That's exactly why the machine_num_numa_nodes() was created in
> > > v5, but then you asked for its removal.
> > V4 to be more precise.
> > I dislike small wrappers because they usually don't simplify code and 
> > make it more obscure,
> > forcing to jump around to see what's really going on.
> > Like it's implemented in this patch it's obvious what's wrong right away.
> > 
> > In that particular case machine_num_numa_nodes() was also misused since 
> > only a handful
> > of places (6) really need NULL check while majority (48) can directly 
> > access ms->numa_state->num_nodes.
> > without NULL check.
> 
> I strongly disagree, here.  Avoiding a ms->numa_state==NULL check
> is pointless optimization,
I don't see it as an optimization (the compiler would probably manage to
optimize most of them out) but rather as properly self-documenting code.
Doing the check in places where it's not needed is confusing at best and
can mask or introduce subtle bugs later at worst.

> and leads to hard to spot bugs like
> the one you saw above.
That one was actually easy to spot because of the way it's written in this 
patch.


> Although I won't reject a patch just because it doesn't have a
> machine_num_numa_nodes() wrapper, I insist we use one for clarity
> and safety.
>
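
For reference, the kind of wrapper being argued about could look roughly
like the sketch below.  The Sketch* types and the -1 sentinel are
hypothetical stand-ins; only the idea (return 0 nodes when the machine has
no NUMA state instead of dereferencing NULL) comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NODE_UNASSIGNED (-1)   /* hypothetical sentinel */

typedef struct SketchNumaState { int num_nodes; } SketchNumaState;

typedef struct SketchMachine {
    SketchNumaState *numa_state;      /* NULL if NUMA is unsupported */
} SketchMachine;

/* NULL-safe accessor: a machine without NUMA support has zero nodes. */
int sketch_num_numa_nodes(const SketchMachine *ms)
{
    assert(ms != NULL);
    return ms->numa_state ? ms->numa_state->num_nodes : 0;
}

/* The pxb-style check from the hunk above, written on top of the accessor. */
bool sketch_pxb_node_ok(const SketchMachine *ms, int numa_node)
{
    return numa_node == SKETCH_NODE_UNASSIGNED ||
           numa_node < sketch_num_numa_nodes(ms);
}
```

The alternative is an explicit ms->numa_state check at the few call sites
that can actually run on machines without NUMA support.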

Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type

2019-07-24 Thread Stefano Garzarella




On Tue, Jul 23, 2019 at 1:30 PM Stefano Garzarella  wrote:
>
> On Tue, Jul 23, 2019 at 10:47:39AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez  wrote:
> > > Montes, Julio  writes:
> > >
> > > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote:
> > > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez  wrote:
> > > >> > Stefan Hajnoczi  writes:
> > > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
> > > >> > > > Stefan Hajnoczi  writes:
> > > >> > > >
> > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote:
> > > >> > > >  --
> > > >> > > >  | Conclusion |
> > > >> > > >  --
> > > >> > > >
> > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs.
> > > >> > > > 363ms),
> > > >> > > > and is smaller on all sections (QEMU initialization, firmware
> > > >> > > > overhead
> > > >> > > > and kernel start-to-user).
> > > >> > > >
> > > >> > > > Microvm's memory tree is also visibly simpler, significantly
> > > >> > > > reducing
> > > >> > > > the exposed surface to the guest.
> > > >> > > >
> > > >> > > > While we can certainly work on making Q35 smaller, I definitely
> > > >> > > > think
> > > >> > > > it's better (and way safer!) having a specialized machine type
> > > >> > > > for a
> > > >> > > > specific use case, than a minimal Q35 whose behavior
> > > >> > > > significantly
> > > >> > > > diverges from a conventional Q35.
> > > >> > >
> > > >> > > Interesting, so not a 10x difference!  This might be amenable to
> > > >> > > optimization.
> > > >> > >
> > > >> > > My concern with microvm is that it's so limited that few users
> > > >> > > will be
> > > >> > > able to benefit from the reduced attack surface and faster
> > > >> > > startup time.
> > > >> > > I think it's worth investigating slimming down Q35 further first.
> > > >> > >
> > > >> > > In terms of startup time the first step would be profiling Q35
> > > >> > > kernel
> > > >> > > startup to find out what's taking so long (firmware
> > > >> > > initialization, PCI
> > > >> > > probing, etc)?
> > > >> >
> > > >> > Some findings:
> > > >> >
> > > >> >  1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host")
> > > >> > saves a
> > > >> > whopping 120ms by avoiding the APIC timer calibration at
> > > >> > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock
> > > >> >
> > > >> > Average boot time with "-cpu host"
> > > >> >  qemu_init_end: 76.408950
> > > >> >  linux_start_kernel: 116.166142 (+39.757192)
> > > >> >  linux_start_user: 242.954347 (+126.788205)
> > > >> >
> > > >> > Average boot time with default "cpu"
> > > >> >  qemu_init_end: 77.467852
> > > >> >  linux_start_kernel: 116.688472 (+39.22062)
> > > >> >  linux_start_user: 363.033365 (+246.344893)
> > > >>
> > > >> \o/
> > > >>
> > > >> >  2. The other 130ms are a direct result of PCI and ACPI presence
> > > >> > (tested
> > > >> > with a kernel without support for those elements). I'll publish
> > > >> > some
> > > >> > detailed numbers next week.
> > > >>
> > > >> Here are the Kata Containers kernel parameters:
> > > >>
> > > >> var kernelParams = []Param{
> > > >> {"tsc", "reliable"},
> > > >> {"no_timer_check", ""},
> > > >> {"rcupdate.rcu_expedited", "1"},
> > > >> {"i8042.direct", "1"},
> > > >> {"i8042.dumbkbd", "1"},
> > > >> {"i8042.nopnp", "1"},
> > > >> {"i8042.noaux", "1"},
> > > >> {"noreplace-smp", ""},
> > > >> {"reboot", "k"},
> > > >> {"console", "hvc0"},
> > > >> {"console", "hvc1"},
> > > >> {"iommu", "off"},
> > > >> {"cryptomgr.notests", ""},
> > > >> {"net.ifnames", "0"},
> > > >> {"pci", "lastbus=0"},
> > > >> }
> > > >>
> > > >> pci lastbus=0 looks interesting and so do some of the others :).
> > > >>
> > > >
> > > > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35,
> > > > kernel won't scan the 255.. buses :)
> > >
> > > I can confirm that adding pci=lastbus=0 makes a significant
> > > improvement. In fact, it is the only option from Kata's kernel parameter
> > > list that has an impact, probably because the kernel is already quite
> > > minimalistic.
> > >
> > > Average boot time with "-cpu host" and "pci=lastbus=0"
> > >  qemu_init_end: 73.711569
> > >  linux_start_kernel: 113.414311 (+39.702742)
> > >  linux_start_user: 190.949939 (+77.535628)
> > >
> > > That's still ~40% slower than microvm, and the breach quickly widens
> > > when adding more PCI devices (each one adds 10-15ms), but it's certainly
> > > an improvement over the original numbers.
> > >
> > > On the other hand, there isn't much we can do here from QEMU's
> > > perspective, as this is basically Guest OS tuning.
> >
> > fw_cfg could expose this information so guest kernels know when to
> > stop enumerating the PCI bus.  This would make all PCI guests with new
> > kernels boot ~50 ms faster, regardless of machine type.

Re: [Qemu-devel] [PATCH v7 02/11] numa: move numa global variable nb_numa_nodes into MachineState

2019-07-24 Thread Eduardo Habkost

On Wed, Jul 24, 2019 at 04:27:21PM +0200, Igor Mammedov wrote:
> On Tue, 23 Jul 2019 12:23:57 -0300
> Eduardo Habkost  wrote:
> 
> > On Tue, Jul 23, 2019 at 04:56:41PM +0200, Igor Mammedov wrote:
> > > On Tue, 16 Jul 2019 22:51:12 +0800
> > > Tao Xu  wrote:
> > > 
> > > > Add struct NumaState in MachineState and move existing numa global
> > > > nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable
> > > > numa_support into MachineClass to decide which submachines support NUMA.
> > > > 
> > > > Suggested-by: Igor Mammedov 
> > > > Suggested-by: Eduardo Habkost 
> > > > Signed-off-by: Tao Xu 
> > > > ---
> > > > 
> > > > No changes in v7.
> > > > 
> > > > Changes in v6:
> > > > - Rebase to upstream, move globals in arm/sbsa-ref and use
> > > >   numa_mem_supported
> > > > - When used once or twice in the function, use
> > > >   ms->numa_state->num_nodes directly
> > > > - Correct some mistakes
> > > > - Use once monitor_printf in hmp_info_numa
> > > > ---
> > [...]
> > > >  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
> > > > -pxb->numa_node >= nb_numa_nodes) {
> > > > +pxb->numa_node >= ms->numa_state->num_nodes) {
> > > this will crash if user tries to use device on machine that doesn't 
> > > support numa
> > > check that numa_state is not NULL before dereferencing 
> > 
> > That's exactly why the machine_num_numa_nodes() was created in
> > v5, but then you asked for its removal.
> V4, to be more precise.
> I dislike small wrappers because they usually don't simplify code and make
> it more obscure, forcing the reader to jump around to see what's really
> going on.
> The way it's implemented in this patch, it's obvious right away what's wrong.
> 
> In that particular case machine_num_numa_nodes() was also misused, since only
> a handful of places (6) really need the NULL check, while the majority (48)
> can directly access ms->numa_state->num_nodes without a NULL check.

I strongly disagree here.  Avoiding a ms->numa_state == NULL check
is a pointless optimization, and leads to hard-to-spot bugs like
the one you saw above.

Although I won't reject a patch just because it doesn't have a
machine_num_numa_nodes() wrapper, I insist we use one for clarity
and safety.
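
To make the point concrete, here is a minimal sketch of what such a
wrapper could look like. The struct definitions are simplified stand-ins
for the real QEMU types, and returning 0 when numa_state is NULL is an
assumption about the intended semantics, not a copy of the earlier
version of the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's types; only the fields used here. */
typedef struct NumaState {
    int num_nodes;
} NumaState;

typedef struct MachineState {
    NumaState *numa_state;  /* NULL when the machine type has no NUMA support */
} MachineState;

/*
 * Return the number of NUMA nodes, or 0 if the machine does not
 * support NUMA (assumed semantics for the NULL case).
 */
static inline int machine_num_numa_nodes(const MachineState *ms)
{
    assert(ms != NULL);
    return ms->numa_state ? ms->numa_state->num_nodes : 0;
}
```

With that in place, a check like "pxb->numa_node >= machine_num_numa_nodes(ms)"
cannot dereference a NULL numa_state, whichever machine type the device
ends up on.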

-- 
Eduardo

Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc/pnv: Generate phandle for the "interrupt-parent" property

2019-07-24 Thread Amol Surati

On Wed, Jul 24, 2019 at 06:57:30PM +1000, David Gibson wrote:
> On Wed, Jul 24, 2019 at 09:11:54AM +0200, Cédric Le Goater wrote:
> > On 24/07/2019 05:23, David Gibson wrote:
> > > On Tue, Jul 23, 2019 at 11:01:38AM +0200, Cédric Le Goater wrote:
> > >> Devices such as the BT or serial devices require a valid
> > >> "interrupt-parent" phandle in the device tree and it is currently
> > >> empty (0x0). It was not a problem until now but since OpenFirmare
> > > >> started using a recent libfdt (>= 1.4.7), petitboot fails to boot the
> > >> system image with error :
> > >>
> > >>dtc_resize: fdt_open_into returned FDT_ERR_BADMAGIC
> > >>
> > >> Provide a phandle for the LPC bus.
> > >>
> > >> Suggested-by: Greg Kurz 
> > >> Signed-off-by: Cédric Le Goater 
> > > 
> > > I've applied this, since it looks to be correct.
> > > 
> > > But.. can you connect the dots for me in how this being missing
> > > results in a BADMAGIC error??
> > 
> > Some binary called by petitboot segfaults when trying to kexec an image on 
> > a system with a bogus DT (generated by QEMU). I don't know exactly which 
> > one 
> > as I only see the error message above and the segv message in dmesg
> 
> Ok, I'm still not seeing how that gets you to a BADMAGIC error.

If I may interject, as this patch is related to the qemu bug:
https://bugs.launchpad.net/qemu/+bug/1826827.

The error is printed by dtc_resize in kexec.c from kexec-lite
(antonblanchard/kexec-lite).

There are two places where dtc_resize is called -
(1) initialize_fdt, when kexec is passed a dtb file.
(2) fdt_from_fs, when kexec must make dtc read /proc/device-tree to form
a dtb.

If initialize_fdt is called with a file which is an invalid dtb, the
dtc_resize prints the FDT_ERR_BADMAGIC error.
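
As a rough illustration of why a truncated or empty file ends up as a
bad-magic error: a flattened device tree starts with the big-endian magic
number 0xd00dfeed, and a reader will refuse a buffer that does not begin
with it. The helper below is a hypothetical stand-alone check, not the
libfdt or kexec-lite code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu  /* stored big-endian at offset 0 of a dtb */

/* Return 1 if buf starts with the FDT magic number, 0 otherwise. */
static int dtb_has_valid_magic(const uint8_t *buf, size_t len)
{
    uint32_t magic;

    assert(buf != NULL);
    if (len < 4) {
        return 0;  /* too short to even hold the magic */
    }
    /* Assemble the first four bytes as a big-endian 32-bit value. */
    magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
            ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    return magic == FDT_MAGIC;
}
```

A file left behind by a tool that crashed before writing anything useful
would fail this kind of check.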

Bug #1826827 shows that dtc is one application that does crash (by
tripping an assertion) when the mentioned properties are absent. (A fix
to avoid the crash is already checked into dtc upstream, commit
8f69567622, to be released with dtc-v1.5.1.)

Assuming that the crashing app (it is not known here what it is) is
supposed to create a dtb for kexec, and its crash leaves behind an
incomplete/invalid dtb file, the initialize_fdt might receive an invalid
dtb.

Another possibility for that error exists within the fdt_from_fs function,
but that needs a version of kexec-lite at least 5 years old, which is
unlikely to be used here I guess.

If this patch fixes both the crash and the error "dtc_resize: ",
it is likely that dtc (or anything else which depends on libfdt) was the
cause of the crash, with dtc/libfdt version being < g8f69567622.

Thanks,
-amol

Re: [Qemu-devel] [PATCH] docs/nvdimm: add example on persistent backend setup

2019-07-24 Thread Wei Yang

On Wed, Jul 24, 2019 at 06:17:31AM -0400, Pankaj Gupta wrote:
>
>> 
>> Persistent backend setup requires some knowledge about nvdimm and ndctl
>> tool. Some users report they may struggle to gather these knowledge and
>> have difficulty to setup it properly.
>> 
>> Here we provide two examples for persistent backend and gives the link
>> to ndctl. By doing so, user could try it directly and do more
>> investigation on persistent backend setup with ndctl.
>> 
>> Signed-off-by: Wei Yang 
>> ---
>>  docs/nvdimm.txt | 28 
>>  1 file changed, 28 insertions(+)
>> 
>> diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
>> index b531cacd35..baba7a940d 100644
>> --- a/docs/nvdimm.txt
>> +++ b/docs/nvdimm.txt
>> @@ -171,6 +171,32 @@ guest software that this vNVDIMM device contains a
>> region that cannot
>>  accept persistent writes. In result, for example, the guest Linux
>>  NVDIMM driver, marks such vNVDIMM device as read-only.
>>  
>> +Backend File Setup Example
>> +..
>> +
>> +Here is two examples for how to setup these persistent backend on
>> +linux, which leverages the tool ndctl [3].
>> +
>> +It is easy to setup DAX device backend file.
>> +
>> +A. DAX device
>> +
>> +ndctl create-namespace -f -e namespace0.0 -m devdax
>> +
>> +The /dev/dax0.0 could be used directly in "mem-path" option.
>> +
>> +For DAX file, it is more than creating the proper namespace. The
>> +block device should be partitioned and mounted (with dax option).
>> +
>> +B. DAX file
>> +
>> +ndctl create-namespace -f -e namespace0.0 -m fsdax
>> +(partition /dev/pmem0 with name pmem0p1)
>> +mount -o dax /dev/pmem0p1 /mnt
>> +(dd a file with proper size in /mnt)
>
>This is not clear to me. why 'dd' file is required in /mnt?
>You mean for creating a backend file?
>

Yes, it is for creating a backend file. You need to pass a file, not a
directory, on the qemu command line.

>> +
>> +Then the new file in /mnt could be used in "mem-path" option.
>> +
>>  NVDIMM Persistence
>>  --
>>  
>> @@ -212,3 +238,5 @@ References
>>  
>> https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
>>  [2] Persistent Memory Development Kit (PMDK), formerly known as NVML
>>  project, home page:
>>  http://pmem.io/pmdk/
>> +[3] ndctl-create-namespace - provision or reconfigure a namespace
>> +http://pmem.io/ndctl/ndctl-create-namespace.html
>> --
>
>Looks good to me. Just a small comment above. 
>Other than that: Reviewed-by: Pankaj Gupta 
>

Thanks

>> 2.17.1
>> 
>> 
>> 

-- 
Wei Yang
Help you, Help me

Re: [Qemu-devel] [PATCH for-4.2 10/24] target/arm: Update CNTVCT_EL0 for VHE

2019-07-24 Thread Alex Bennée



Richard Henderson  writes:

> The virtual offset may be 0 depending on EL, E2H and TGE.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Alex Bennée 

> ---
>  target/arm/helper.c | 40 +---
>  1 file changed, 37 insertions(+), 3 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index da2e0627b2..3124d682a2 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -2484,9 +2484,31 @@ static uint64_t gt_cnt_read(CPUARMState *env, const 
> ARMCPRegInfo *ri)
>  return gt_get_countervalue(env);
>  }
>
> +static uint64_t gt_virt_cnt_offset(CPUARMState *env)
> +{
> +uint64_t hcr;
> +
> +switch (arm_current_el(env)) {
> +case 2:
> +hcr = arm_hcr_el2_eff(env);
> +if (hcr & HCR_E2H) {
> +return 0;
> +}
> +break;
> +case 0:
> +hcr = arm_hcr_el2_eff(env);
> +if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
> +return 0;
> +}
> +break;
> +}
> +
> +return env->cp15.cntvoff_el2;
> +}
> +
>  static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
>  {
> -return gt_get_countervalue(env) - env->cp15.cntvoff_el2;
> +return gt_get_countervalue(env) - gt_virt_cnt_offset(env);
>  }
>
>  static void gt_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -2501,7 +2523,13 @@ static void gt_cval_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>  static uint64_t gt_tval_read(CPUARMState *env, const ARMCPRegInfo *ri,
>   int timeridx)
>  {
> -uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
> +uint64_t offset = 0;
> +
> +switch (timeridx) {
> +case GTIMER_VIRT:
> +offset = gt_virt_cnt_offset(env);
> +break;
> +}
>
>  return (uint32_t)(env->cp15.c14_timer[timeridx].cval -
>(gt_get_countervalue(env) - offset));
> @@ -2511,7 +2539,13 @@ static void gt_tval_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>int timeridx,
>uint64_t value)
>  {
> -uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
> +uint64_t offset = 0;
> +
> +switch (timeridx) {
> +case GTIMER_VIRT:
> +offset = gt_virt_cnt_offset(env);
> +break;
> +}
>
>  trace_arm_gt_tval_write(timeridx, value);
>  env->cp15.c14_timer[timeridx].cval = gt_get_countervalue(env) - offset +


--
Alex Bennée

Re: [Qemu-devel] [PATCH v7 04/11] numa: move numa global variable numa_info into MachineState

2019-07-24 Thread Igor Mammedov

On Tue, 16 Jul 2019 22:51:14 +0800
Tao Xu  wrote:

> Move existing numa global numa_info (renamed as "nodes") into NumaState.
> 
> Suggested-by: Igor Mammedov 
> Suggested-by: Eduardo Habkost 
> Signed-off-by: Tao Xu 

Reviewed-by: Igor Mammedov 

> ---
> 
> No changes in v7.
> 
> Changes in v6:
> - Rebase to upstream, move globals in arm/sbsa-ref
> - Correct some mistake(Igor)
> - Use ms->numa_state->nodes directly, when use it once or twice(Igor)
> ---
>  exec.c   |  2 +-
>  hw/acpi/aml-build.c  |  6 --
>  hw/arm/boot.c|  2 +-
>  hw/arm/sbsa-ref.c|  3 ++-
>  hw/arm/virt-acpi-build.c |  7 ---
>  hw/arm/virt.c|  3 ++-
>  hw/core/numa.c   | 15 +--
>  hw/i386/pc.c |  4 ++--
>  hw/ppc/spapr.c   | 10 +-
>  hw/ppc/spapr_pci.c   |  4 +++-
>  include/sysemu/numa.h|  5 +++--
>  11 files changed, 36 insertions(+), 25 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index b6b75d2ad5..26dd7676c0 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1766,7 +1766,7 @@ long qemu_minrampagesize(void)
>  if (hpsize > mainrampagesize &&
>  (ms->numa_state == NULL ||
>   ms->numa_state->num_nodes == 0 ||
> - numa_info[0].node_memdev == NULL)) {
> + ms->numa_state->nodes[0].node_memdev == NULL)) {
>  static bool warned;
>  if (!warned) {
>  error_report("Huge page support disabled (n/a for main 
> memory).");
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 63c1cae8c9..26ccc1a3e2 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1737,8 +1737,10 @@ void build_slit(GArray *table_data, BIOSLinker 
> *linker, MachineState *ms)
>  build_append_int_noprefix(table_data, nb_numa_nodes, 8);
>  for (i = 0; i < nb_numa_nodes; i++) {
>  for (j = 0; j < nb_numa_nodes; j++) {
> -assert(numa_info[i].distance[j]);
> -build_append_int_noprefix(table_data, numa_info[i].distance[j], 
> 1);
> +assert(ms->numa_state->nodes[i].distance[j]);
> +build_append_int_noprefix(table_data,
> +  ms->numa_state->nodes[i].distance[j],
> +  1);
>  }
>  }
>  
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index e28daa5278..da228919dc 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -601,7 +601,7 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
> *binfo,
>  if (ms->numa_state != NULL && ms->numa_state->num_nodes > 0) {
>  mem_base = binfo->loader_start;
>  for (i = 0; i < ms->numa_state->num_nodes; i++) {
> -mem_len = numa_info[i].node_mem;
> +mem_len = ms->numa_state->nodes[i].node_mem;
>  rc = fdt_add_memory_node(fdt, acells, mem_base,
>   scells, mem_len, i);
>  if (rc < 0) {
> diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
> index 7e4c471717..3a243e6a53 100644
> --- a/hw/arm/sbsa-ref.c
> +++ b/hw/arm/sbsa-ref.c
> @@ -168,7 +168,8 @@ static void create_fdt(SBSAMachineState *sms)
>  idx = (i * nb_numa_nodes + j) * 3;
>  matrix[idx + 0] = cpu_to_be32(i);
>  matrix[idx + 1] = cpu_to_be32(j);
> -matrix[idx + 2] = cpu_to_be32(numa_info[i].distance[j]);
> +matrix[idx + 2] =
> +cpu_to_be32(ms->numa_state->nodes[i].distance[j]);
>  }
>  }
>  
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 461a44b5b0..89899ec4c1 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -534,11 +534,12 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  
>  mem_base = vms->memmap[VIRT_MEM].base;
>  for (i = 0; i < ms->numa_state->num_nodes; ++i) {
> -if (numa_info[i].node_mem > 0) {
> +if (ms->numa_state->nodes[i].node_mem > 0) {
>  numamem = acpi_data_push(table_data, sizeof(*numamem));
> -build_srat_memory(numamem, mem_base, numa_info[i].node_mem, i,
> +build_srat_memory(numamem, mem_base,
> +  ms->numa_state->nodes[i].node_mem, i,
>MEM_AFFINITY_ENABLED);
> -mem_base += numa_info[i].node_mem;
> +mem_base += ms->numa_state->nodes[i].node_mem;
>  }
>  }
>  
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 984f162531..174e81a3de 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -242,7 +242,8 @@ static void create_fdt(VirtMachineState *vms)
>  idx = (i * nb_numa_nodes + j) * 3;
>  matrix[idx + 0] = cpu_to_be32(i);
>  matrix[idx + 1] = cpu_to_be32(j);
> -matrix[idx + 2] = cpu_to_be32(numa_info[i].distance[j]);
> +matrix[idx + 2] =
> +

Re: [Qemu-devel] [PATCH v2 03/14] target/arm/monitor: Introduce qmp_query_cpu_model_expansion

2019-07-24 Thread Andrew Jones

On Wed, Jul 24, 2019 at 04:25:32PM +0200, Auger Eric wrote:
> > diff --git a/target/arm/monitor.c b/target/arm/monitor.c
> > index 41b32b94b258..19e3120eef95 100644
> > --- a/target/arm/monitor.c
> > +++ b/target/arm/monitor.c
> > @@ -23,7 +23,13 @@
> >  #include "qemu/osdep.h"
> >  #include "hw/boards.h"
> >  #include "kvm_arm.h"
> > +#include "qapi/error.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi/qobject-input-visitor.h"
> >  #include "qapi/qapi-commands-target.h"
> > +#include "qapi/qmp/qerror.h"
> > +#include "qapi/qmp/qdict.h"
> > +#include "qom/qom-qobject.h"
> >  
> >  static GICCapability *gic_cap_new(int version)
> >  {
> > @@ -82,3 +88,129 @@ GICCapabilityList *qmp_query_gic_capabilities(Error 
> > **errp)
> >  
> >  return head;
> >  }
> > +
> > +static const char *cpu_model_advertised_features[] = {
> > +"aarch64", "pmu",
> > +NULL
> > +};
> > +
> > +CpuModelExpansionInfo 
> > *qmp_query_cpu_model_expansion(CpuModelExpansionType type,
> > + CpuModelInfo 
> > *model,
> > + Error **errp)
> > +{
> > +CpuModelExpansionInfo *expansion_info;
> > +const QDict *qdict_in = NULL;
> > +QDict *qdict_out;
> > +ObjectClass *oc;
> > +Object *obj;
> > +const char *name;
> > +int i;
> > +
> > +if (type != CPU_MODEL_EXPANSION_TYPE_FULL) {
> > +error_setg(errp, "The requested expansion type is not 
> > supported.");
> > +return NULL;
> > +}
> > +
> > +if (!kvm_enabled() && !strcmp(model->name, "host")) {
> > +error_setg(errp, "The CPU definition '%s' requires KVM", 
> > model->name);
> > +return NULL;
> > +}
> > +
> > +oc = cpu_class_by_name(TYPE_ARM_CPU, model->name);
> > +if (!oc) {
> > +error_setg(errp, "The CPU definition '%s' is unknown.", 
> > model->name);
> > +return NULL;
> > +}
> > +
> > +if (kvm_enabled()) {
> > +const char *cpu_type = current_machine->cpu_type;
> > +int len = strlen(cpu_type) - strlen(ARM_CPU_TYPE_SUFFIX);
> > +bool supported = false;
> > +
> > +if (!strcmp(model->name, "host") || !strcmp(model->name, 
> > "max")) {
> > +/* These are kvmarm's recommended cpu types */
> > +supported = true;
> > +} else if (strlen(model->name) == len &&
> > +   !strncmp(model->name, cpu_type, len)) {
> > +/* KVM is enabled and we're using this type, so it works. 
> > */
> > +supported = true;
> > +}
> > +if (!supported) {
> > +error_setg(errp, "The CPU definition '%s' cannot "
>  use model name instead of CPU definition?
> >>>
> >>> I took that wording from s390x, but maybe I prefer "The CPU type..."
> >>> better. I'll change it for v3.
> >> This CPU type is not recognized as an ARM CPU type?
> > 
> > That's not what this error message is stating. The CPU type may well be an
> > ARM CPU type, but it's not one you can expect to use with KVM enabled. I
> > currently have
> > 
> >   "The CPU type '%s' cannot "
> >   "be used with KVM on this host", model->name)
> > 
> > queued up for v3.
> 
> decidedly, I meant the error message associated to:
> 
> +oc = cpu_class_by_name(TYPE_ARM_CPU, model->name);
> +if (!oc) {
> +error_setg(errp, "The CPU definition '%s' is unknown.",
> model->name);
> +return NULL;
> +}

Ah, OK. Yeah I can change that one too. Of course if we deviate from
s390x's generic error messages for common errors, then we're assuming
the messages aren't being parsed by upper layers using code that we'd
like to easily adapt to ARM. But I think that assumption is reasonable.

> 
> Why am I always looking at your series when we suffer heat wave?
>

Climate change generates too many heat waves. Or I generate too much
code that requires comments. Or both.

Thanks,
drew

Re: [Qemu-devel] [RFC PATCH] pci: Use PCI aliases when determining device IOMMU address space

2019-07-24 Thread Alex Williamson

On Wed, 24 Jul 2019 18:03:31 +0800
Peter Xu  wrote:

> On Wed, Jul 24, 2019 at 05:39:22AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jul 24, 2019 at 03:14:39PM +0800, Peter Xu wrote:  
> > > On Tue, Jul 23, 2019 at 11:26:18AM -0600, Alex Williamson wrote:  
> > > > > On 3/29/19 11:49 AM, Alex Williamson wrote:  
> > > > > > [Cc +Brijesh]
> > > > > > 
> > > > > > Hi Brijesh, will the change below require the IVRS to be updated to
> > > > > > include aliases for all BDF ranges behind a conventional bridge?  I
> > > > > > think the Linux code handles this regardless of the firmware 
> > > > > > provided
> > > > > > aliases, but is it required per spec for the ACPI tables to include
> > > > > > bridge aliases?  Thanks,
> > > > > > 
> > > > > 
> > > > > We do need to includes aliases in ACPI table. We need to populate the
> > > > > IVHD type 0x43 and 0x4 for alias range start and end. I believe host
> > > > > IVRS would contain similar information.
> > > > > 
> > > > > Suravee, please correct me if I am missing something?  
> > > > 
> > > > I finally found some time to investigate this a little further, yes the
> > > > types mentioned are correct for defining start and end of an alias
> > > > range.  The challenge here is that these entries require a DeviceID,
> > > > which is defined as a BDF, AIUI.  The IVRS is created in QEMU, but bus
> > > > numbers are defined by the guest firmware, and potentially redefined by
> > > > the guest OS.  This makes it non-trivial to insert a few IVHDs into the
> > > > IVRS to describe alias ranges.  I'm wondering if the solution here is
> > > > to define a new linker-loader command that would instruct the guest to
> > > > write a bus number byte to a given offset for a described device.
> > > > These commands would be inserted before the checksum command, such that
> > > > these bus number updates are calculated as part of the checksum.
> > > > 
> > > > I'm imagining the command format would need to be able to distinguish
> > > > between the actual bus number of a described device, the secondary bus
> > > > number of the device, and the subordinate bus number of the device.
> > > > For describing the device, I'm envisioning stealing from the DMAR
> > > > definition, which already includes a bus number invariant mechanism to
> > > > describe a device, starting with a segment and root bus, follow a chain
> > > > of devfns to get to the target device.  Therefore the guest firmware
> > > > would follow the path to the described device, pick the desired bus
> > > > number, and write it to the indicated table offset.
> > > > 
> > > > Does this seem like a reasonable approach?  Better ideas?  I'm not
> > > > thrilled with the increased scope demanded by IVRS support, but so long
> > > > as we have an AMD IOMMU model, I don't see how to avoid it.  Thanks,  
> > > 
> > > I don't have a better idea yet, but just want to say that accidentally
> > > I was trying to look into this as well starting from this week and I'd
> > > say that's mostly what I thought about too (I was still reading a bit
> > > seabios when I saw this email)... so at least this idea makes sense to
> > > me.
> > > 
> > > Would the guest OS still change the PCI bus number even after the
> > > firmware (BIOS/UEFI)?  Could I ask in what case would that happen?
> > > 
> > > Thanks,  
> > 
> > Guest OSes can in theory rebalance resources. Changing bus numbers
> > would be useful if new bridges are added by hotplug.
> > In practice at least Linux doesn't do the rebalancing.
> > I think that if we start reporting PNP OS support in BIOS then windows
> > might start doing that more aggressively.  
> 
> It's surprising me a bit...  IMHO if we allow the bus number to change
> then at least many scripts can even fail which might work before.
> E.g. , a very common script can run "lspci-like" program to list each
> device and then do "lspci-like -vvv" again upon the BDF it fetched
> from previous commands.  Any kind of BDF caching would be invalid
> since that from either userspace or kernel.
> 
> Also, obviously the data to be stored in IVRS is closely bound to how
> bus number is defined.  Even if we can add a new linker-loader command
> to all the open firmwares like seabios or OVMF but still we can't do
> that to Windows (or, could we?...).
> 
> Now one step back, I'm also curious on the reason behind on why AMD
> spec required the IVRS with BDF information, rather than the scope
> information like what Intel DMAR spec was asking for.

It's a deficiency of the IVRS spec, but it's really out of scope here.
It's not the responsibility of the hypervisor to resolve this sort of
design issue, we should simply maintain the bare metal behavior and the
bare metal limitations of the design.

Michael did invoke some interesting ideas regarding QEMU updating the
IVRS table though.  QEMU does know when bus apertures are programmed on
devices and the config writes for these updates could trigger IVRS
updates.  I think we'd want to

Re: [Qemu-devel] [PATCH v4 0/3] pc: mmap kernel (ELF image) and initrd

2019-07-24 Thread Dr. David Alan Gilbert

* Stefano Garzarella (sgarz...@redhat.com) wrote:
> In order to reduce the memory footprint when PVH kernel and initrd
> are used, we map them into memory instead of reading them.
> In this way we can share them between multiple instances of QEMU.
> 
> v4:
>   - Patch 1: fix the rom_add_elf_program() comment [Paolo]
>   - Patch 2:
> ~ fix the missing of g_mapped_file_unref() in the success case [Paolo]
> ~ fix the rom_add_elf_program() comment [Paolo]
> 
> v3: https://patchew.org/QEMU/20190724112531.232260-1-sgarz...@redhat.com/
> v2: https://patchew.org/QEMU/20190723140445.12748-1-sgarz...@redhat.com/

Two high level questions:
   a) What happens if someone tries to migrate the VM - I don't think
it's too unusual for people to run with -kernel/-initrd in situations
where they migrate.

   b) Are there situations where you can't mmap but you can validly
read it?  For example, running with an ELF built for 4k page alignment
on a host with 64k host pages?
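
The concern in (b) comes from the mmap() rule that the file offset must
be a multiple of the host page size. A minimal sketch of that constraint,
assuming segments were mapped individually (which may not be how this
series does it; the helper and its parameters are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: can a segment at file offset file_off, destined
 * for guest address load_addr, be mapped directly with host pages of
 * host_page bytes (which must be a power of two)?
 */
static bool segment_is_mappable(uint64_t file_off, uint64_t load_addr,
                                uint64_t host_page)
{
    assert(host_page != 0 && (host_page & (host_page - 1)) == 0);
    /* Both the file offset and the target address must be page-aligned. */
    return (file_off & (host_page - 1)) == 0 &&
           (load_addr & (host_page - 1)) == 0;
}
```

A segment at file offset 0x1000 passes with 4k host pages but fails with
64k host pages, in which case a loader would have to fall back to reading
the data.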

Dave

> 
> These are the results using a PVH kernel and initrd (cpio):
> - memory footprint (using smem) [MB]
>        QEMU              before                now
>     # instances       USS      PSS        USS      PSS
>          1          102.0M   105.8M     102.3M   106.2M
>          2           94.6M   101.2M      72.3M    90.1M
>          4           94.1M    98.0M      72.0M    81.5M
>          8           94.0M    96.2M      71.8M    76.9M
>         16           93.9M    95.1M      71.6M    74.3M
> 
> Initrd size: 3.0M
> Kernel
>     image size: 28M
>     sections size [size -A -d vmlinux]: 18.9M
> 
> - boot time [ms]
>                           before            now
>  qemu_init_end:            63.85            55.91
>  linux_start_kernel:       82.11 (+18.26)   74.51 (+18.60)
>  linux_start_user:        169.94 (+87.83)  159.06 (+84.56)
> 
> QEMU command used:
> ./qemu-system-x86_64 -bios /path/to/seabios/out/bios.bin -no-hpet \
>     -machine q35,accel=kvm,kernel_irqchip,nvdimm,sata=off,smbus=off,vmport=off \
>     -cpu host -m 1G -smp 1 -vga none -display none -no-user-config -nodefaults \
>     -kernel /path/to/vmlinux -initrd /path/to/rootfs.cpio \
>     -append 'root=/dev/mem0 ro console=hvc0 pci=lastbus=0 nosmap'
> 
> Stefano Garzarella (3):
>   loader: Handle memory-mapped ELFs
>   elf-ops.h: Map into memory the ELF to load
>   hw/i386/pc: Map into memory the initrd
> 
>  hw/core/loader.c | 38 +++-
>  hw/i386/pc.c | 17 ---
>  include/hw/elf_ops.h | 71 ++--
>  include/hw/i386/pc.h |  1 +
>  include/hw/loader.h  |  5 ++--
>  5 files changed, 89 insertions(+), 43 deletions(-)
> 
> -- 
> 2.20.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v4 0/3] pc: mmap kernel (ELF image) and initrd

2019-07-24 Thread Paolo Bonzini

On 24/07/19 16:31, Stefano Garzarella wrote:
> In order to reduce the memory footprint when PVH kernel and initrd
> are used, we map them into memory instead of reading them.
> In this way we can share them between multiple instances of QEMU.
> 
> v4:
>   - Patch 1: fix the rom_add_elf_program() comment [Paolo]
>   - Patch 2:
> ~ fix the missing of g_mapped_file_unref() in the success case [Paolo]
> ~ fix the rom_add_elf_program() comment [Paolo]
> 
> v3: https://patchew.org/QEMU/20190724112531.232260-1-sgarz...@redhat.com/
> v2: https://patchew.org/QEMU/20190723140445.12748-1-sgarz...@redhat.com/
> 
> These are the results using a PVH kernel and initrd (cpio):
> - memory footprint (using smem) [MB]
>        QEMU              before                now
>     # instances       USS      PSS        USS      PSS
>          1          102.0M   105.8M     102.3M   106.2M
>          2           94.6M   101.2M      72.3M    90.1M
>          4           94.1M    98.0M      72.0M    81.5M
>          8           94.0M    96.2M      71.8M    76.9M
>         16           93.9M    95.1M      71.6M    74.3M
> 
> Initrd size: 3.0M
> Kernel
>     image size: 28M
>     sections size [size -A -d vmlinux]: 18.9M
> 
> - boot time [ms]
>                           before            now
>  qemu_init_end:            63.85            55.91
>  linux_start_kernel:       82.11 (+18.26)   74.51 (+18.60)
>  linux_start_user:        169.94 (+87.83)  159.06 (+84.56)
> 
> QEMU command used:
> ./qemu-system-x86_64 -bios /path/to/seabios/out/bios.bin -no-hpet \
>     -machine q35,accel=kvm,kernel_irqchip,nvdimm,sata=off,smbus=off,vmport=off \
>     -cpu host -m 1G -smp 1 -vga none -display none -no-user-config -nodefaults \
>     -kernel /path/to/vmlinux -initrd /path/to/rootfs.cpio \
>     -append 'root=/dev/mem0 ro console=hvc0 pci=lastbus=0 nosmap'
> 
> Stefano Garzarella (3):
>   loader: Handle memory-mapped ELFs
>   elf-ops.h: Map into memory the ELF to load
>   hw/i386/pc: Map into memory the initrd
> 
>  hw/core/loader.c | 38 +++-
>  hw/i386/pc.c | 17 ---
>  include/hw/elf_ops.h | 71 ++--
>  include/hw/i386/pc.h |  1 +
>  include/hw/loader.h  |  5 ++--
>  5 files changed, 89 insertions(+), 43 deletions(-)
> 

Queued, thanks.

Paolo

[Qemu-devel] [PATCH for 4.1?] pl330: fix vmstate description

2019-07-24 Thread Damien Hedde

Fix the pl330 main and queue vmstate description.
There were missing POINTER flags causing crashes during
incoming migration because:
+ PL330State chan field is a pointer to an array
+ PL330Queue queue field is a pointer to an array

Also bump corresponding vmsd version numbers.

Signed-off-by: Damien Hedde 
---

I found this while working on reset with xilinx-zynq machine.

I'm not sure what the vmsd version policy is in such cases (for
backward compatibility). I've simply bumped them since migration
was not working anyway (vmstate_load_state was erasing critical parts
of PL330State and causing segfaults while loading the following fields).

Tested doing migration with the xilinx-zynq-a9 machine.
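
For reference, the difference the POINTER flag makes can be modelled in a
few lines. This is a simplified sketch with made-up types (Chan, Dev,
first_elem), not the actual vmstate loader: without the flag the field
itself is treated as the start of the array, with it the field is
dereferenced first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Chan {
    uint32_t pc;
} Chan;

typedef struct Dev {
    uint32_t num_chnls;
    Chan *chan;          /* pointer to a heap array, like PL330State.chan */
} Dev;

/*
 * Simplified model of how a loader locates the first array element of
 * the field at the given offset inside the device structure.
 */
static void *first_elem(void *opaque, size_t offset, bool is_pointer)
{
    void *field;

    assert(opaque != NULL);
    field = (char *)opaque + offset;
    if (is_pointer) {
        void *target;
        /* The field holds a pointer: load it and use what it points to. */
        memcpy(&target, field, sizeof(target));
        return target;
    }
    /* No pointer flag: the array is assumed to be embedded in place. */
    return field;
}
```

Loading array elements at the address returned without the flag writes
over the pointer itself and whatever follows it in the structure, which
matches the corruption described above.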

---
 hw/dma/pl330.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
index 58df965a46..a56a3e7771 100644
--- a/hw/dma/pl330.c
+++ b/hw/dma/pl330.c
@@ -218,11 +218,12 @@ typedef struct PL330Queue {
 
 static const VMStateDescription vmstate_pl330_queue = {
 .name = "pl330_queue",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
 .fields = (VMStateField[]) {
-VMSTATE_STRUCT_VARRAY_UINT32(queue, PL330Queue, queue_size, 1,
- vmstate_pl330_queue_entry, PL330QueueEntry),
+VMSTATE_STRUCT_VARRAY_POINTER_UINT32(queue, PL330Queue, queue_size,
+ vmstate_pl330_queue_entry,
+ PL330QueueEntry),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -278,12 +279,12 @@ struct PL330State {
 
 static const VMStateDescription vmstate_pl330 = {
 .name = "pl330",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
 .fields = (VMStateField[]) {
 VMSTATE_STRUCT(manager, PL330State, 0, vmstate_pl330_chan, PL330Chan),
-VMSTATE_STRUCT_VARRAY_UINT32(chan, PL330State, num_chnls, 0,
- vmstate_pl330_chan, PL330Chan),
+VMSTATE_STRUCT_VARRAY_POINTER_UINT32(chan, PL330State, num_chnls,
+ vmstate_pl330_chan, PL330Chan),
 VMSTATE_VBUFFER_UINT32(lo_seqn, PL330State, 1, NULL, num_chnls),
 VMSTATE_VBUFFER_UINT32(hi_seqn, PL330State, 1, NULL, num_chnls),
 VMSTATE_STRUCT(fifo, PL330State, 0, vmstate_pl330_fifo, PL330Fifo),
-- 
2.22.0

Re: [Qemu-devel] [PATCH for-4.2 00/14] Some record/replay fixes

2019-07-24 Thread Paolo Bonzini

On 24/07/19 10:43, Pavel Dovgalyuk wrote:
> The set of patches include the latest fixes for record/replay icount function:
>  - fix for icount for the case when translation blocks are chained
>  - block operation fixes for rr mode
>  - development documentation update
>  - some refactoring
> 
> These patches make record/replay functional on the latest 4.2 QEMU core.
> 
> ---
> 
> Pavel Dovgalyuk (13):
>   block: implement bdrv_snapshot_goto for blkreplay
>   replay: disable default snapshot for record/replay
>   replay: update docs for record/replay with block devices
>   replay: don't drain/flush bdrv queue while RR is working
>   replay: finish record/replay before closing the disks
>   replay: provide an accessor for rr filename
>   replay: add BH oneshot event for block layer
>   replay: document development rules
>   util/qemu-timer: refactor deadline calculation for external timers
>   replay: fix replay shutdown
>   replay: refine replay-time module
>   replay: rename step-related variables and functions
>   icount: clean up cpu_can_io before jumping to the next block
> 
> pbonz...@redhat.com (1):
>   replay: add missing fix for internal function
> 
> 
>  accel/tcg/tcg-runtime.c   |2 ++
>  block/blkreplay.c |8 
>  block/block-backend.c |8 +---
>  block/io.c|   32 +--
>  block/iscsi.c |5 +++--
>  block/nfs.c   |5 +++--
>  block/null.c  |4 +++-
>  block/nvme.c  |6 --
>  block/rbd.c   |5 +++--
>  block/vxhs.c  |5 +++--
>  cpus.c|   11 ---
>  docs/devel/replay.txt |   46 
> +
>  docs/replay.txt   |   12 +---
>  include/qemu/timer.h  |7 +++
>  include/sysemu/replay.h   |7 ++-
>  qtest.c   |2 +-
>  replay/replay-events.c|   18 +-
>  replay/replay-internal.c  |   10 +-
>  replay/replay-internal.h  |   11 ++-
>  replay/replay-snapshot.c  |6 +++---
>  replay/replay-time.c  |   36 ---
>  replay/replay.c   |   39 +++---
>  stubs/Makefile.objs   |1 +
>  stubs/replay-user.c   |9 +
>  tests/ptimer-test-stubs.c |4 ++--
>  tests/ptimer-test.c   |4 ++--
>  util/qemu-timer.c |   41 
>  vl.c  |   11 +--
>  28 files changed, 259 insertions(+), 96 deletions(-)
>  create mode 100644 docs/devel/replay.txt
>  create mode 100644 stubs/replay-user.c
> 

Please separate patches 1 and 9-14, I can merge those.

Paolo

[Qemu-devel] [PATCH for 4.2 0/2] target/mips: Misc patches for 4.2

2019-07-24 Thread Aleksandar Markovic

From: Aleksandar Markovic 

This series includes misc MIPS patches intended to be integrated after
the 4.1 release.

Aleksandar Markovic (2):
  tests/tcg: target/mips: Fix target configurations for MSA tests
  tests/tcg: target/mips: Add optional printing of more detailed failure
info

 tests/tcg/mips/include/test_utils_128.h|  21 +-
 .../mips/user/ase/msa/test_msa_compile_32r5eb.sh   | 643 +
 .../mips/user/ase/msa/test_msa_compile_32r5el.sh   | 643 +
 .../mips/user/ase/msa/test_msa_compile_32r6eb.sh   | 643 -
 .../mips/user/ase/msa/test_msa_compile_32r6el.sh   | 643 -
 tests/tcg/mips/user/ase/msa/test_msa_run_32r5eb.sh | 371 
 tests/tcg/mips/user/ase/msa/test_msa_run_32r5el.sh | 371 
 tests/tcg/mips/user/ase/msa/test_msa_run_32r6eb.sh | 371 
 tests/tcg/mips/user/ase/msa/test_msa_run_32r6el.sh | 371 
 9 files changed, 2048 insertions(+), 2029 deletions(-)
 create mode 100755 tests/tcg/mips/user/ase/msa/test_msa_compile_32r5eb.sh
 create mode 100755 tests/tcg/mips/user/ase/msa/test_msa_compile_32r5el.sh
 delete mode 100755 tests/tcg/mips/user/ase/msa/test_msa_compile_32r6eb.sh
 delete mode 100755 tests/tcg/mips/user/ase/msa/test_msa_compile_32r6el.sh
 create mode 100755 tests/tcg/mips/user/ase/msa/test_msa_run_32r5eb.sh
 create mode 100755 tests/tcg/mips/user/ase/msa/test_msa_run_32r5el.sh
 delete mode 100644 tests/tcg/mips/user/ase/msa/test_msa_run_32r6eb.sh
 delete mode 100755 tests/tcg/mips/user/ase/msa/test_msa_run_32r6el.sh

-- 
2.7.4

[Qemu-devel] [PATCH for 4.2 2/2] tests/tcg: target/mips: Add optional printing of more detailed failure info

2019-07-24 Thread Aleksandar Markovic

From: Aleksandar Markovic 

There is a need for printing input and output data for failure cases,
for debugging purposes. This patch adds that, enabled only if a
preprocessor constant is manually set to 1. (The assumption is that the
need for such a printout is relatively rare.)

Signed-off-by: Aleksandar Markovic 
---
 tests/tcg/mips/include/test_utils_128.h | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/mips/include/test_utils_128.h b/tests/tcg/mips/include/test_utils_128.h
index 2fea610..debb264 100644
--- a/tests/tcg/mips/include/test_utils_128.h
+++ b/tests/tcg/mips/include/test_utils_128.h
@@ -27,7 +27,8 @@
 #include 
 #include 
 
-#define PRINT_RESULTS 0
+#define PRINT_RESULTS    0
+#define PRINT_FAILURES   0
 
 
 static inline int32_t check_results_128(const char *isa_ase_name,
@@ -65,6 +66,24 @@ static inline int32_t check_results_128(const char *isa_ase_name,
 (b128_result[2 * i + 1] == b128_expect[2 * i + 1])) {
 pass_count++;
 } else {
+#if PRINT_FAILURES
+uint32_t ii;
+uint64_t a, b;
+
+printf("\n");
+
+printf("FAILURE for test case %d!\n", i);
+
+memcpy(&a, (b128_expect + 2 * i), 8);
+memcpy(&b, (b128_expect + 2 * i + 1), 8);
+printf("Expected result : { 0x%016llxULL, 0x%016llxULL, },\n", a, b);
+
+memcpy(&a, (b128_result + 2 * i), 8);
+memcpy(&b, (b128_result + 2 * i + 1), 8);
+printf("Actual result   : { 0x%016llxULL, 0x%016llxULL, },\n", a, b);
+
+printf("\n");
+#endif
 fail_count++;
 }
 }
-- 
2.7.4
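
As a rough illustration of the compile-time switch described in the patch
above, the following self-contained sketch counts mismatching 128-bit results
and dumps them only when PRINT_FAILURES is set to 1. The helper name
count_failures_128() is hypothetical and this is not the code from
test_utils_128.h; it also uses the PRIx64 macros from <inttypes.h> for the
64-bit words, which is a choice made for this sketch rather than what the
patch does.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#ifndef PRINT_FAILURES
#define PRINT_FAILURES 0   /* set to 1 to dump mismatching 128-bit results */
#endif

/*
 * Hypothetical helper: compare `count` 128-bit results, each stored as a
 * pair of 64-bit words, and return the number of mismatches.
 */
static int count_failures_128(const uint64_t *result, const uint64_t *expect,
                              int count)
{
    int fails = 0;

    assert(result && expect && count >= 0);
    for (int i = 0; i < count; i++) {
        if (result[2 * i] == expect[2 * i] &&
            result[2 * i + 1] == expect[2 * i + 1]) {
            continue;   /* both halves match */
        }
#if PRINT_FAILURES
        printf("\nFAILURE for test case %d!\n", i);
        printf("Expected result : { 0x%016" PRIx64 "ULL, 0x%016" PRIx64
               "ULL, },\n", expect[2 * i], expect[2 * i + 1]);
        printf("Actual result   : { 0x%016" PRIx64 "ULL, 0x%016" PRIx64
               "ULL, },\n\n", result[2 * i], result[2 * i + 1]);
#endif
        fails++;
    }
    return fails;
}
```

Building with -DPRINT_FAILURES=1 enables the dump without editing the header,
assuming the #ifndef guard shown here.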

[Qemu-devel] [PATCH v4 3/3] hw/i386/pc: Map into memory the initrd

2019-07-24 Thread Stefano Garzarella

In order to reduce the memory footprint we map into memory
the initrd using g_mapped_file_new() instead of reading it.
In this way we can share the initrd pages between multiple
instances of QEMU.

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
---
v3:
  - renamed 'GMappedFile *gmf' to 'GMappedFile *mapped_file' for readability
  - stored the initrd GMappedFile* in PCMachineState to avoid Coverity
issue [Paolo]
---
 hw/i386/pc.c | 17 +
 include/hw/i386/pc.h |  1 +
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..96f6b89f70 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1241,17 +1241,21 @@ static void load_linux(PCMachineState *pcms,
 
 /* load initrd */
 if (initrd_filename) {
+GMappedFile *mapped_file;
 gsize initrd_size;
 gchar *initrd_data;
 GError *gerr = NULL;
 
-if (!g_file_get_contents(initrd_filename, &initrd_data,
-&initrd_size, &gerr)) {
+mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+if (!mapped_file) {
 fprintf(stderr, "qemu: error reading initrd %s: %s\n",
 initrd_filename, gerr->message);
 exit(1);
 }
+pcms->initrd_mapped_file = mapped_file;
 
+initrd_data = g_mapped_file_get_contents(mapped_file);
+initrd_size = g_mapped_file_get_length(mapped_file);
 initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
 if (initrd_size >= initrd_max) {
 fprintf(stderr, "qemu: initrd is too large, cannot support."
@@ -1378,6 +1382,7 @@ static void load_linux(PCMachineState *pcms,
 
 /* load initrd */
 if (initrd_filename) {
+GMappedFile *mapped_file;
 gsize initrd_size;
 gchar *initrd_data;
 GError *gerr = NULL;
@@ -1387,12 +1392,16 @@ static void load_linux(PCMachineState *pcms,
 exit(1);
 }
 
-if (!g_file_get_contents(initrd_filename, &initrd_data,
- &initrd_size, &gerr)) {
+mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+if (!mapped_file) {
 fprintf(stderr, "qemu: error reading initrd %s: %s\n",
 initrd_filename, gerr->message);
 exit(1);
 }
+pcms->initrd_mapped_file = mapped_file;
+
+initrd_data = g_mapped_file_get_contents(mapped_file);
+initrd_size = g_mapped_file_get_length(mapped_file);
 if (initrd_size >= initrd_max) {
 fprintf(stderr, "qemu: initrd is too large, cannot support."
 "(max: %"PRIu32", need %"PRId64")\n",
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 859b64c51d..44edc6955e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -42,6 +42,7 @@ struct PCMachineState {
 FWCfgState *fw_cfg;
 qemu_irq *gsi;
 PFlashCFI01 *flash[2];
+GMappedFile *initrd_mapped_file;
 
 /* Configuration options: */
 uint64_t max_ram_below_4g;
-- 
2.20.1
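
For readers unfamiliar with file mappings, the idea behind this patch can be
sketched with plain POSIX calls. This is not the patch's code: the patch uses
GLib's g_mapped_file_new(), while the sketch below swaps in mmap() directly
so that it has no GLib dependency. ToyMapping, toy_map_file() and
toy_unmap_file() are hypothetical names. The point is that a read-only
file-backed mapping is served from the page cache, so several processes
mapping the same file can share the same physical pages instead of each
holding a private heap copy.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical wrapper type, not a QEMU or GLib structure. */
typedef struct ToyMapping {
    void *data;      /* start of the mapped file contents */
    size_t length;   /* size of the mapping in bytes */
} ToyMapping;

/* Map `path` read-only. Returns 0 on success, -1 on failure. */
static int toy_map_file(const char *path, ToyMapping *m)
{
    struct stat st;
    int fd;

    assert(path && m);
    m->data = NULL;
    m->length = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    m->data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (m->data == MAP_FAILED) {
        m->data = NULL;
        return -1;
    }
    m->length = (size_t)st.st_size;
    return 0;
}

/* Release a mapping created by toy_map_file(). */
static void toy_unmap_file(ToyMapping *m)
{
    if (m->data) {
        munmap(m->data, m->length);
    }
    m->data = NULL;
    m->length = 0;
}
```

As in the patch, the mapping has to stay alive for as long as anything still
reads the data, which is why the patch stores the GMappedFile pointer in
PCMachineState instead of dropping it at the end of load_linux().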

[Qemu-devel] [PATCH v4 1/3] loader: Handle memory-mapped ELFs

2019-07-24 Thread Stefano Garzarella

This patch allows handling a memory-mapped ELF, taking care of
the reference count of the GMappedFile* passed through
rom_add_elf_program().
In this case, the 'data' pointer is not heap-allocated, so
we cannot free it.

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
---
v4:
  - fix the rom_add_elf_program() comment [Paolo]
---
 hw/core/loader.c | 38 ++
 include/hw/elf_ops.h |  2 +-
 include/hw/loader.h  |  5 +++--
 3 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 425bf69a99..9fb93a6541 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -836,6 +836,7 @@ struct Rom {
 int isrom;
 char *fw_dir;
 char *fw_file;
+GMappedFile *mapped_file;
 
 bool committed;
 
@@ -846,10 +847,25 @@ struct Rom {
 static FWCfgState *fw_cfg;
 static QTAILQ_HEAD(, Rom) roms = QTAILQ_HEAD_INITIALIZER(roms);
 
-/* rom->data must be heap-allocated (do not use with rom_add_elf_program()) */
+/*
+ * rom->data can be heap-allocated or memory-mapped (e.g. when added with
+ * rom_add_elf_program())
+ */
+static void rom_free_data(Rom *rom)
+{
+if (rom->mapped_file) {
+g_mapped_file_unref(rom->mapped_file);
+rom->mapped_file = NULL;
+} else {
+g_free(rom->data);
+}
+
+rom->data = NULL;
+}
+
 static void rom_free(Rom *rom)
 {
-g_free(rom->data);
+rom_free_data(rom);
 g_free(rom->path);
 g_free(rom->name);
 g_free(rom->fw_dir);
@@ -1056,11 +1072,12 @@ MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
 
 /* This function is specific for elf program because we don't need to allocate
  * all the rom. We just allocate the first part and the rest is just zeros. This
- * is why romsize and datasize are different. Also, this function seize the
- * memory ownership of "data", so we don't have to allocate and copy the buffer.
+ * is why romsize and datasize are different. Also, this function takes its own
+ * reference to "mapped_file", so we don't have to allocate and copy the buffer.
  */
-int rom_add_elf_program(const char *name, void *data, size_t datasize,
-size_t romsize, hwaddr addr, AddressSpace *as)
+int rom_add_elf_program(const char *name, GMappedFile *mapped_file, void *data,
+size_t datasize, size_t romsize, hwaddr addr,
+AddressSpace *as)
 {
 Rom *rom;
 
@@ -1071,6 +1088,12 @@ int rom_add_elf_program(const char *name, void *data, size_t datasize,
 rom->romsize  = romsize;
 rom->data = data;
 rom->as   = as;
+
+if (mapped_file && data) {
+g_mapped_file_ref(mapped_file);
+rom->mapped_file = mapped_file;
+}
+
 rom_insert(rom);
 return 0;
 }
@@ -1105,8 +1128,7 @@ static void rom_reset(void *unused)
 }
 if (rom->isrom) {
 /* rom needs to be written only once */
-g_free(rom->data);
-rom->data = NULL;
+rom_free_data(rom);
 }
 /*
  * The rom loader is really on the same level as firmware in the guest
diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 690f9238c8..fede37ee9c 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -525,7 +525,7 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
 
 /* rom_add_elf_program() seize the ownership of 'data' */
-rom_add_elf_program(label, data, file_size, mem_size,
+rom_add_elf_program(label, NULL, data, file_size, mem_size,
 addr, as);
 } else {
 address_space_write(as ? as : &address_space_memory,
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 3e1b3a4566..07fd9286e7 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -258,8 +258,9 @@ MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
FWCfgCallback fw_callback,
void *callback_opaque, AddressSpace *as,
bool read_only);
-int rom_add_elf_program(const char *name, void *data, size_t datasize,
-size_t romsize, hwaddr addr, AddressSpace *as);
+int rom_add_elf_program(const char *name, GMappedFile *mapped_file, void *data,
+size_t datasize, size_t romsize, hwaddr addr,
+AddressSpace *as);
 int rom_check_and_register_reset(void);
 void rom_set_fw(FWCfgState *f);
 void rom_set_order_override(int order);
-- 
2.20.1
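
The ownership rule introduced by rom_free_data() can be summarised in a small
stand-alone sketch. ToyMapped and ToyRom are hypothetical stand-ins for
GMappedFile and struct Rom, and the counter below only imitates what
g_mapped_file_ref()/g_mapped_file_unref() do; it is meant as an illustration
of the rule, not as the loader's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted mapping. */
typedef struct ToyMapped {
    int refcount;
    int released;   /* set once the last reference is dropped */
} ToyMapped;

static ToyMapped *toy_mapped_ref(ToyMapped *m)
{
    m->refcount++;
    return m;
}

static void toy_mapped_unref(ToyMapped *m)
{
    assert(m->refcount > 0);
    if (--m->refcount == 0) {
        m->released = 1;   /* a real mapping would be unmapped here */
    }
}

/* Hypothetical stand-in for the ROM entry. */
typedef struct ToyRom {
    void *data;          /* heap buffer, or pointer into the mapping */
    ToyMapped *mapped;   /* non-NULL when data points into a mapping */
} ToyRom;

/* Attach data; take a reference only if data really comes from the mapping. */
static void toy_rom_set_data(ToyRom *rom, ToyMapped *mapped, void *data)
{
    rom->data = data;
    rom->mapped = (mapped && data) ? toy_mapped_ref(mapped) : NULL;
}

/* Release data the right way: unref a mapping, free() a heap buffer. */
static void toy_rom_free_data(ToyRom *rom)
{
    if (rom->mapped) {
        toy_mapped_unref(rom->mapped);
        rom->mapped = NULL;
    } else {
        free(rom->data);
    }
    rom->data = NULL;
}
```

Because each ROM entry holds its own reference, the caller that created the
mapping can drop its reference as soon as all segments are registered, and
the mapping goes away only when the last ROM entry is freed.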

[Qemu-devel] [PATCH v4 0/3] pc: mmap kernel (ELF image) and initrd

2019-07-24 Thread Stefano Garzarella

In order to reduce the memory footprint when PVH kernel and initrd
are used, we map them into memory instead of reading them.
In this way we can share them between multiple instances of QEMU.

v4:
  - Patch 1: fix the rom_add_elf_program() comment [Paolo]
  - Patch 2:
~ add the missing g_mapped_file_unref() in the success case [Paolo]
~ fix the rom_add_elf_program() comment [Paolo]

v3: https://patchew.org/QEMU/20190724112531.232260-1-sgarz...@redhat.com/
v2: https://patchew.org/QEMU/20190723140445.12748-1-sgarz...@redhat.com/

These are the results using a PVH kernel and initrd (cpio):
- memory footprint (using smem) [MB]
  QEMU                before              now
  # instances      USS      PSS       USS      PSS
            1   102.0M   105.8M    102.3M   106.2M
            2    94.6M   101.2M     72.3M    90.1M
            4    94.1M    98.0M     72.0M    81.5M
            8    94.0M    96.2M     71.8M    76.9M
           16    93.9M    95.1M     71.6M    74.3M

  Initrd size: 3.0M
  Kernel
      image size: 28M
      sections size [size -A -d vmlinux]: 18.9M

- boot time [ms]
                         before             now
  qemu_init_end:          63.85             55.91
  linux_start_kernel:     82.11 (+18.26)    74.51 (+18.60)
  linux_start_user:      169.94 (+87.83)   159.06 (+84.56)

QEMU command used:
./qemu-system-x86_64 -bios /path/to/seabios/out/bios.bin -no-hpet \
-machine q35,accel=kvm,kernel_irqchip,nvdimm,sata=off,smbus=off,vmport=off \
-cpu host -m 1G -smp 1 -vga none -display none -no-user-config -nodefaults \
-kernel /path/to/vmlinux -initrd /path/to/rootfs.cpio \
-append 'root=/dev/mem0 ro console=hvc0 pci=lastbus=0 nosmap'

Stefano Garzarella (3):
  loader: Handle memory-mapped ELFs
  elf-ops.h: Map into memory the ELF to load
  hw/i386/pc: Map into memory the initrd

 hw/core/loader.c | 38 +++-
 hw/i386/pc.c | 17 ---
 include/hw/elf_ops.h | 71 ++--
 include/hw/i386/pc.h |  1 +
 include/hw/loader.h  |  5 ++--
 5 files changed, 89 insertions(+), 43 deletions(-)

-- 
2.20.1

[Qemu-devel] [PATCH v4 2/3] elf-ops.h: Map into memory the ELF to load

2019-07-24 Thread Stefano Garzarella

In order to reduce the memory footprint we map into memory
the ELF to load using g_mapped_file_new_from_fd() instead of
reading each section. In this way we can share the ELF pages
between multiple instances of QEMU.

Suggested-by: Dr. David Alan Gilbert 
Suggested-by: Paolo Bonzini 
Signed-off-by: Stefano Garzarella 
---
v4:
  - add the missing g_mapped_file_unref() in the success case [Paolo]
  - fix the rom_add_elf_program() comment [Paolo]
v3:
  - renamed 'GMappedFile *gmf' to 'GMappedFile *mapped_file' for readability.
  - passed the GMappedFile* to rom_add_elf_program() to correctly handle the
    reference count. [Paolo]
  - set 'data' pointer only if 'file_size > 0' as the original behaviour
    [check-qtest-ppc64 fails without it]
v2:
  - used g_mapped_file_new_from_fd() with 'writable' set to 'true',
    since we can modify the mapped buffer. [Paolo, Peter]
---
 include/hw/elf_ops.h | 71 ++--
 1 file changed, 42 insertions(+), 29 deletions(-)

diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index fede37ee9c..1496d7e753 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -323,8 +323,9 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 struct elfhdr ehdr;
 struct elf_phdr *phdr = NULL, *ph;
 int size, i, total_size;
-elf_word mem_size, file_size;
+elf_word mem_size, file_size, data_offset;
 uint64_t addr, low = (uint64_t)-1, high = 0;
+GMappedFile *mapped_file = NULL;
 uint8_t *data = NULL;
 char label[128];
 int ret = ELF_LOAD_FAILED;
@@ -409,20 +410,32 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 }
 }
 
+/*
+ * Since we want to be able to modify the mapped buffer, we set the
+ * 'writeble' parameter to 'true'. Modifications to the buffer are not
+ * written back to the file.
+ */
+mapped_file = g_mapped_file_new_from_fd(fd, true, NULL);
+if (!mapped_file) {
+goto fail;
+}
+
 total_size = 0;
 for(i = 0; i < ehdr.e_phnum; i++) {
 ph = &phdr[i];
 if (ph->p_type == PT_LOAD) {
 mem_size = ph->p_memsz; /* Size of the ROM */
 file_size = ph->p_filesz; /* Size of the allocated data */
-data = g_malloc0(file_size);
-if (ph->p_filesz > 0) {
-if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
-goto fail;
-}
-if (read(fd, data, file_size) != file_size) {
+data_offset = ph->p_offset; /* Offset where the data is located */
+
+if (file_size > 0) {
+if (g_mapped_file_get_length(mapped_file) <
+file_size + data_offset) {
 goto fail;
 }
+
+data = (uint8_t *)g_mapped_file_get_contents(mapped_file);
+data += data_offset;
 }
 
 /* The ELF spec is somewhat vague about the purpose of the
@@ -513,25 +526,25 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 *pentry = ehdr.e_entry - ph->p_vaddr + ph->p_paddr;
 }
 
-if (mem_size == 0) {
-/* Some ELF files really do have segments of zero size;
- * just ignore them rather than trying to create empty
- * ROM blobs, because the zero-length blob can falsely
- * trigger the overlapping-ROM-blobs check.
- */
-g_free(data);
-} else {
+/* Some ELF files really do have segments of zero size;
+ * just ignore them rather than trying to create empty
+ * ROM blobs, because the zero-length blob can falsely
+ * trigger the overlapping-ROM-blobs check.
+ */
+if (mem_size != 0) {
 if (load_rom) {
 snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
 
-/* rom_add_elf_program() seize the ownership of 'data' */
-rom_add_elf_program(label, NULL, data, file_size, mem_size,
-addr, as);
+/*
+ * rom_add_elf_program() takes its own reference to
+ * 'mapped_file'.
+ */
+rom_add_elf_program(label, mapped_file, data, file_size,
+mem_size, addr, as);
 } else {
 address_space_write(as ? as : &address_space_memory,
 addr, MEMTXATTRS_UNSPECIFIED,
 data, file_size);
-g_free(data);
 }
 }
 
@@ -547,14 +560,16 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 struct elf_note *nhdr = NULL;
 
 file_size = ph->p_filesz; /* Size of the range of ELF notes */
-data =
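
The segment bounds check in the hunk above compares the mapping length with
file_size + data_offset. A check of this kind can also be written without the
addition, so that it cannot wrap around for a malformed header. Whether a
wrap is reachable in the patch depends on the width of elf_word, which this
note does not assume; toy_range_fits() below is a hypothetical helper written
only to show the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if the byte range
 * [offset, offset + size) lies entirely inside a buffer of `length` bytes.
 * The comparison is arranged so that no addition can overflow.
 */
static bool toy_range_fits(size_t length, uint64_t offset, uint64_t size)
{
    if (offset > length) {
        return false;            /* start is already past the end */
    }
    return size <= length - offset;   /* remaining room after offset */
}
```

With such a helper the check would read "if (!toy_range_fits(length,
data_offset, file_size)) goto fail;", where length stands for the value
returned by g_mapped_file_get_length().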

Re: [Qemu-devel] [PATCH for-4.2 10/14] util/qemu-timer: refactor deadline calculation for external timers

2019-07-24 Thread Paolo Bonzini

On 24/07/19 10:44, Pavel Dovgalyuk wrote:
> -int64_t qemu_clock_deadline_ns_all(QEMUClockType type)
> +int64_t virtual_clock_deadline_ns(void)
>  {
>  int64_t deadline = -1;
> +int64_t delta;
> +int64_t expire_time;
> +QEMUTimer *ts;
>  QEMUTimerList *timer_list;
> -QEMUClock *clock = qemu_clock_ptr(type);
> +QEMUClock *clock = qemu_clock_ptr(QEMU_CLOCK_VIRTUAL);
> +
> +if (!clock->enabled) {
> +return -1;
> +}
> +
>  QLIST_FOREACH(timer_list, &clock->timerlists, list) {
> -deadline = qemu_soonest_timeout(deadline,
> -timerlist_deadline_ns(timer_list));
> +qemu_mutex_lock(&timer_list->active_timers_lock);
> +ts = timer_list->active_timers;
> +/* Skip all external timers */
> +while (ts && (ts->attributes & QEMU_TIMER_ATTR_EXTERNAL)) {
> +ts = ts->next;
> +}
> +if (!ts) {
> +qemu_mutex_unlock(&timer_list->active_timers_lock);
> +continue;
> +}
> +expire_time = ts->expire_time;
> +qemu_mutex_unlock(&timer_list->active_timers_lock);
> +
> +delta = expire_time - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +if (delta <= 0) {
> +delta = 0;
> +}
> +deadline = qemu_soonest_timeout(deadline, delta);
>  }
>  return deadline;
>  }
> 

Why would this change be exclusive to QEMU_CLOCK_VIRTUAL?  I don't think
it's useful to remove the argument.  Otherwise, the patch makes sense.

Paolo
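
For reference, the deadline logic in the quoted hunk does not depend on which
clock is being queried, which can be seen in a stand-alone sketch that takes
the timer list and the current time as parameters. ToyTimer,
TOY_ATTR_EXTERNAL, toy_soonest() and toy_deadline_ns() are hypothetical names;
locking is left out, and -1 is assumed to mean "no deadline" as in the quoted
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_ATTR_EXTERNAL = 1 };   /* hypothetical "external timer" flag */

typedef struct ToyTimer {
    int64_t expire_time;      /* absolute expiry, in ns */
    int attributes;           /* TOY_ATTR_* flags */
    struct ToyTimer *next;    /* list is kept sorted by expire_time */
} ToyTimer;

/*
 * Pick the sooner of two timeouts. -1 means "no deadline"; comparing as
 * unsigned makes -1 the largest value, so any real timeout wins over it.
 */
static int64_t toy_soonest(int64_t a, int64_t b)
{
    return ((uint64_t)a < (uint64_t)b) ? a : b;
}

/*
 * Fold the first non-external timer of one list into `deadline`.
 * `now` is the current value of whichever clock the list belongs to.
 */
static int64_t toy_deadline_ns(const ToyTimer *active, int64_t now,
                               int64_t deadline)
{
    const ToyTimer *ts = active;
    int64_t delta;

    /* Skip all external timers at the head of the list. */
    while (ts && (ts->attributes & TOY_ATTR_EXTERNAL)) {
        ts = ts->next;
    }
    if (!ts) {
        return deadline;   /* nothing relevant on this list */
    }

    delta = ts->expire_time - now;
    if (delta < 0) {
        delta = 0;         /* already expired: fire as soon as possible */
    }
    return toy_soonest(deadline, delta);
}
```

Nothing in the sketch is specific to one clock type, which is in line with
the review comment about keeping the type argument.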

Re: [Qemu-devel] [PATCH v7 02/11] numa: move numa global variable nb_numa_nodes into MachineState

2019-07-24 Thread Igor Mammedov

On Tue, 23 Jul 2019 12:23:57 -0300
Eduardo Habkost  wrote:

> On Tue, Jul 23, 2019 at 04:56:41PM +0200, Igor Mammedov wrote:
> > On Tue, 16 Jul 2019 22:51:12 +0800
> > Tao Xu  wrote:
> > 
> > > Add struct NumaState in MachineState and move existing numa global
> > > nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable
> > > numa_support into MachineClass to decide which submachines support NUMA.
> > > 
> > > Suggested-by: Igor Mammedov 
> > > Suggested-by: Eduardo Habkost 
> > > Signed-off-by: Tao Xu 
> > > ---
> > > 
> > > No changes in v7.
> > > 
> > > Changes in v6:
> > > - Rebase to upstream, move globals in arm/sbsa-ref and use
> > >   numa_mem_supported
> > > - When used once or twice in the function, use
> > >   ms->numa_state->num_nodes directly
> > > - Correct some mistakes
> > > - Use once monitor_printf in hmp_info_numa
> > > ---
> [...]
> > >  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
> > > -pxb->numa_node >= nb_numa_nodes) {
> > > +pxb->numa_node >= ms->numa_state->num_nodes) {
> > > this will crash if user tries to use device on machine that doesn't
> > > support numa
> > > check that numa_state is not NULL before dereferencing
> 
> That's exactly why the machine_num_numa_nodes() was created in
> v5, but then you asked for its removal.
V4, to be more precise.
I dislike small wrappers because they usually don't simplify the code and make
it more obscure, forcing the reader to jump around to see what's really going on.
The way it's implemented in this patch, it's obvious right away what's wrong.

In that particular case machine_num_numa_nodes() was also misused, since only a
handful of places (6) really need the NULL check, while the majority (48) can
directly access ms->numa_state->num_nodes without a NULL check.

Re: [Qemu-devel] [PATCH v2 03/14] target/arm/monitor: Introduce qmp_query_cpu_model_expansion

2019-07-24 Thread Auger Eric

Hi Drew,
On 7/24/19 4:05 PM, Andrew Jones wrote:
> On Wed, Jul 24, 2019 at 02:51:08PM +0200, Auger Eric wrote:
>> Hi Drew,
>>
>> On 6/26/19 3:26 PM, Andrew Jones wrote:
>>> On Wed, Jun 26, 2019 at 09:43:09AM +0200, Auger Eric wrote:
 Hi Drew,

 On 6/21/19 6:34 PM, Andrew Jones wrote:
> Add support for the query-cpu-model-expansion QMP command to Arm. We
> do this selectively, only exposing CPU properties which represent
> optional CPU features which the user may want to enable/disable. Also,
> for simplicity, we restrict the list of queryable cpu models to 'max',
> 'host', or the current type when KVM is in use, even though there
> may exist KVM hosts where other types would also work. For example on a
> seattle you could use 'host' for the current type, but then attempt to
> query 'cortex-a57', which is also a valid CPU type to use with KVM on
> seattle hosts, but that query will fail with our simplifications. This
> shouldn't be an issue though as management layers and users have been
> preferring the 'host' CPU type for use with KVM for quite some time.
> Additionally, if the KVM-enabled QEMU instance running on a seattle
> host is using the cortex-a57 CPU type, then querying 'cortex-a57' will
> work. Finally, we only implement expansion type 'full', as Arm does not
> yet have a "base" CPU type. Below are some example calls and results
> (to save character clutter they're not in json, but are still json-ish
> to give the idea)
>
>  # expand the 'max' CPU model
>  query-cpu-model-expansion: type:full, model:{ name:max }
>
>  return: model:{ name:max, props:{ 'aarch64': true, 'pmu': true }}
>
>  # attempt to expand the 'max' CPU model with pmu=off
>  query-cpu-model-expansion:
>type:full, model:{ name:max, props:{ 'pmu': false }}
>
>  return: model:{ name:max, props:{ 'aarch64': true, 'pmu': false }}
>
>  # attempt to expand the 'max' CPU model with aarch64=off
>  query-cpu-model-expansion:
>type:full, model:{ name:max, props:{ 'aarch64': false }}
>
>  error: "'aarch64' feature cannot be disabled unless KVM is enabled
>  and 32-bit EL1 is supported"
>
> In the last example KVM was not in use so an error was returned.
>
> Note1: It's possible for features to have dependencies on other
> features. I.e. it may be possible to change one feature at a time
> without error, but when attempting to change all features at once
> an error could occur depending on the order they are processed. It's
> also possible changing all at once doesn't generate an error, because
> a feature's dependencies are satisfied with other features, but the
> same feature cannot be changed independently without error. For these
> reasons callers should always attempt to make their desired changes
> all at once in order to ensure the collection is valid.
>
> Note2: Certainly more features may be added to the list of
> advertised features, e.g. 'vfp' and 'neon'. The only requirement
> is that their property set accessors fail when invalid
> configurations are detected. For vfp we would need something like
>
>  set_vfp()
>  {
>if (arm_feature(env, ARM_FEATURE_AARCH64) &&
>cpu->has_vfp != cpu->has_neon)
>error("AArch64 CPUs must have both VFP and Neon or neither")
>
> in its set accessor, and the same for neon, rather than doing that
> check at realize time, which isn't executed at qmp query time.
>
> Signed-off-by: Andrew Jones 
> ---
>  qapi/target.json |   6 +-
>  target/arm/monitor.c | 132 +++
>  2 files changed, 135 insertions(+), 3 deletions(-)
>
> diff --git a/qapi/target.json b/qapi/target.json
> index 1d4d54b6002e..edfa2f82b916 100644
> --- a/qapi/target.json
> +++ b/qapi/target.json
> @@ -408,7 +408,7 @@
>  ##
>  { 'struct': 'CpuModelExpansionInfo',
>'data': { 'model': 'CpuModelInfo' },
> -  'if': 'defined(TARGET_S390X) || defined(TARGET_I386)' }
> +  'if': 'defined(TARGET_S390X) || defined(TARGET_I386) || defined(TARGET_ARM)' }
>  
>  ##
>  # @query-cpu-model-expansion:
> @@ -433,7 +433,7 @@
>  #   query-cpu-model-expansion while using these is not advised.
>  #
>  # Some architectures may not support all expansion types. s390x supports
> -# "full" and "static".
> +# "full" and "static". Arm only supports "full".
>  #
>  # Returns: a CpuModelExpansionInfo. Returns an error if expanding CPU models is
>  #  not supported, if the model cannot be expanded, if the model contains
> @@ -447,7 +447,7 @@
>'data': { 'type': 'CpuModelExpansionType',
>  'model': 'CpuModelInfo' },
>'returns': 'CpuModelExpansionInfo',
> -

Re: [Qemu-devel] [PATCH v2 01/14] target/arm/cpu64: Ensure kvm really supports aarch64=off

2019-07-24 Thread Auger Eric

Hi Drew,

On 7/24/19 3:52 PM, Andrew Jones wrote:
> On Wed, Jul 24, 2019 at 02:51:15PM +0200, Auger Eric wrote:
>> Hi Drew,
>>
>> On 6/25/19 3:34 PM, Andrew Jones wrote:
>>> On Tue, Jun 25, 2019 at 11:35:12AM +0200, Auger Eric wrote:
 Hi Drew,

 On 6/21/19 6:34 PM, Andrew Jones wrote:
> If -cpu ,aarch64=off is used then KVM must also be used, and it
> and the host must support running the vcpu in 32-bit mode. Also, if
>> s/and it//
> 
> "and it and the host" means "and KVM and the host", as 'it' refers to the
> last subject, which is KVM. I wanted to point out both the host (machine)
> and KVM (version of kernel with KVM) need to support the feature.
hum ok
> 
> -cpu ,aarch64=on is used, then it doesn't matter if kvm is
> enabled or not.
>
> Signed-off-by: Andrew Jones 


> ---
>  target/arm/cpu64.c   | 12 ++--
>  target/arm/kvm64.c   | 11 +++
>  target/arm/kvm_arm.h | 14 ++
>  3 files changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index 1901997a0645..946994838d8a 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -407,13 +407,13 @@ static void aarch64_cpu_set_aarch64(Object *obj, 
> bool value, Error **errp)
>   * restriction allows us to avoid fixing up functionality that 
> assumes a
>   * uniform execution state like do_interrupt.
>   */
> -if (!kvm_enabled()) {
> -error_setg(errp, "'aarch64' feature cannot be disabled "
> - "unless KVM is enabled");
> -return;
> -}
> -
>  if (value == false) {
> +if (!kvm_enabled() || !kvm_arm_aarch32_supported(CPU(cpu))) {
> +error_setg(errp, "'aarch64' feature cannot be disabled "
> + "unless KVM is enabled and 32-bit EL1 "
> + "is supported");
> +return;
> +}
>  unset_feature(&cpu->env, ARM_FEATURE_AARCH64);
>  } else {
>  set_feature(&cpu->env, ARM_FEATURE_AARCH64);
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 22d19c9aec6f..45ccda589903 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -24,7 +24,9 @@
>  #include "exec/gdbstub.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/kvm_int.h"
>  #include "kvm_arm.h"
> +#include "hw/boards.h"
>> By the way those two new headers are not needed by this patch
> 
> Really?
> 
> current_machine is defined in hw/boards.h and KVM_STATE is defined
> in sysemu/kvm_int.h.
argh my bad.

Sorry for the noise

Eric
> 
>  #include "internals.h"
>  
>  static bool have_guest_debug;
> @@ -593,6 +595,15 @@ bool 
> kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>  return true;
>  }
>  
> +bool kvm_arm_aarch32_supported(CPUState *cpu)
> +{
> +KVMState *s = KVM_STATE(current_machine->accelerator);
> +int ret;
> +
> +ret = kvm_check_extension(s, KVM_CAP_ARM_EL1_32BIT);
> +return ret > 0;
 nit: return kvm_check_extension() should be sufficient
>>>
>>> Ah yes, I forgot kvm_check_extension() already converts negative
>>> error codes to zero. I'll fix that for v3.
>>>
> +}
> +
>  #define ARM_CPU_ID_MPIDR   3, 0, 0, 0, 5
>  
>  int kvm_arch_init_vcpu(CPUState *cs)
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index 2a07333c615f..812125f805a1 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -207,6 +207,15 @@ bool 
> kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
>   */
>  void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
>  
> +/**
> + * kvm_arm_aarch32_supported:
> + * @cs: CPUState
 use kernel-doc comment style?
>>>
>>> This file (kvm_arm.h) doesn't appear to have a super consistent comment
>>> style. I see some use @var: for the parameters and some have 'Returns:
>>> ...' lines as well. I'm happy to do whatever the maintainers prefer. For
>>> now I was just trying to mimic whatever caught my eye.
>
> + *
> + * Returns true if the KVM VCPU can enable AArch32 mode and false
> + * otherwise.
> + */
> +bool kvm_arm_aarch32_supported(CPUState *cs);
> +
>  /**
>   * kvm_arm_get_max_vm_ipa_size - Returns the number of bits in the
>   * IPA address space supported by KVM
> @@ -247,6 +256,11 @@ static inline void 
> kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
>  cpu->host_cpu_probe_failed = true;
>  }
>  
> +static inline bool kvm_arm_aarch32_supported(CPUState *cs)
> +{
> +return false;
> +}
> +
>  static inline int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
>  {
>  return -ENOENT;
>

Re: [Qemu-devel] [PATCH for-4.2] hw: add compat machines for 4.2

2019-07-24 Thread Halil Pasic

On Wed, 24 Jul 2019 12:35:24 +0200
Cornelia Huck  wrote:

> Add 4.2 machine types for arm/i440fx/q35/s390x/spapr.
> 
> For i440fx and q35, unversioned cpu models are still translated
> to -v1, as 0788a56bd1ae ("i386: Make unversioned CPU models be
> aliases") states this should only transition to the latest cpu
> model version in 4.3 (or later).
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/arm/virt.c  |  9 -
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 14 +-
>  hw/i386/pc_q35.c   | 13 -
>  hw/ppc/spapr.c | 15 +--
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  include/hw/boards.h|  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  9 files changed, 71 insertions(+), 6 deletions(-)

For the s390 change:
Reviewed-by: Halil Pasic

[Qemu-devel] [PATCH for 4.2 3/3] linux-user: Add support for RNDRESEEDCRNG ioctl

2019-07-24 Thread Aleksandar Markovic

From: Aleksandar Markovic 

RNDRESEEDCRNG is a newer ioctl (added to the kernel in mid-2018), so an
"#ifdef" guard is used around it in this patch.

Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 3 +++
 linux-user/syscall_defs.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 7fac4fc..4264ff5 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -233,6 +233,9 @@
   IOCTL(RNDADDTOENTCNT, IOC_W, MK_PTR(TYPE_INT))
   IOCTL(RNDZAPENTCNT, 0, TYPE_NULL)
   IOCTL(RNDCLEARPOOL, 0, TYPE_NULL)
+#ifdef RNDRESEEDCRNG
+  IOCTL(RNDRESEEDCRNG, 0, TYPE_NULL)
+#endif
 
   IOCTL(CDROMPAUSE, 0, TYPE_NULL)
   IOCTL(CDROMSTART, 0, TYPE_NULL)
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 61c2f3c..bc3f52b 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -824,6 +824,7 @@ struct target_pollfd {
 #define TARGET_RNDADDTOENTCNT  TARGET_IOW('R', 0x01, int)
 #define TARGET_RNDZAPENTCNT    TARGET_IO('R', 0x04)
 #define TARGET_RNDCLEARPOOL    TARGET_IO('R', 0x06)
+#define TARGET_RNDRESEEDCRNG   TARGET_IO('R', 0x07)
 
 /* From  */
 
-- 
2.7.4
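
The "#ifdef" guard described in the commit message can be sketched
outside QEMU as follows. This is a minimal illustration, assuming a Linux
host with kernel headers installed; struct ioctl_entry, DEMO_IOCTL(),
demo_table and demo_lookup() are hypothetical names, and QEMU's real
IOCTL() macro carries more fields than shown here.

```c
#include <stddef.h>
#include <string.h>

#ifdef __linux__
#include <linux/random.h>  /* defines RNDRESEEDCRNG only with newer headers */
#endif

/* Hypothetical table entry: a name plus the host ioctl number. */
struct ioctl_entry {
    const char *name;
    unsigned long host_cmd;
};

#define DEMO_IOCTL(cmd) { #cmd, (unsigned long)(cmd) },

static const struct ioctl_entry demo_table[] = {
#ifdef RNDZAPENTCNT
    DEMO_IOCTL(RNDZAPENTCNT)
#endif
#ifdef RNDRESEEDCRNG          /* skipped when host headers predate it */
    DEMO_IOCTL(RNDRESEEDCRNG)
#endif
    { NULL, 0 }               /* sentinel terminating the table */
};

/* Linear search by name; returns NULL if the entry was compiled out
 * or never existed. */
static const struct ioctl_entry *demo_lookup(const char *name)
{
    for (const struct ioctl_entry *e = demo_table; e->name != NULL; e++) {
        if (strcmp(e->name, name) == 0) {
            return e;
        }
    }
    return NULL;
}
```

On hosts whose headers lack RNDRESEEDCRNG the entry simply disappears
from the table, so the build still succeeds and the ioctl is reported as
unsupported at run time.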

[Qemu-devel] [PATCH for 4.2 2/3] linux-user: Add support for FDMSGON and FDMSGOFF ioctls

2019-07-24 Thread Aleksandar Markovic

From: Aleksandar Markovic 

FDMSGON and FDMSGOFF switch informational messages of floppy drives
on and off.

Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 2 ++
 linux-user/syscall_defs.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 3ade2d2..7fac4fc 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -112,6 +112,8 @@
  IOCTL(BLKZEROOUT, IOC_W, MK_PTR(MK_ARRAY(TYPE_ULONGLONG, 2)))
 #endif
 
+ IOCTL(FDMSGON, 0, TYPE_NULL)
+ IOCTL(FDMSGOFF, 0, TYPE_NULL)
  IOCTL(FDFLUSH, 0, TYPE_NULL)
 
 #ifdef FIBMAP
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 7e22ed7..61c2f3c 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -859,6 +859,8 @@ struct target_pollfd {
 
 /* From  */
 
+#define TARGET_FDMSGON TARGET_IO(2, 0x45)
+#define TARGET_FDMSGOFF TARGET_IO(2, 0x46)
 #define TARGET_FDFLUSH TARGET_IO(2, 0x4b)
 
 #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
-- 
2.7.4
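
As a worked example of what definitions such as TARGET_IO(2, 0x45)
encode, the sketch below packs a type and a number into an ioctl command
value. It assumes the generic Linux layout (number in bits 0-7, type in
bits 8-15, size in bits 16-29, direction in bits 30-31, with "no
direction" encoded as 0); some architectures use a different layout,
which the sketch does not cover. All DEMO_* names are hypothetical and
are not QEMU macros.

```c
#include <stdint.h>

/* Bit positions of the generic Linux ioctl encoding (assumed here). */
#define DEMO_IOC_NRSHIFT    0
#define DEMO_IOC_TYPESHIFT  8
#define DEMO_IOC_SIZESHIFT  16
#define DEMO_IOC_DIRSHIFT   30
#define DEMO_IOC_NONE       0u   /* no data transfer */

#define DEMO_IOC(dir, type, nr, size)            \
    (((uint32_t)(dir)  << DEMO_IOC_DIRSHIFT)  |  \
     ((uint32_t)(type) << DEMO_IOC_TYPESHIFT) |  \
     ((uint32_t)(nr)   << DEMO_IOC_NRSHIFT)   |  \
     ((uint32_t)(size) << DEMO_IOC_SIZESHIFT))

/* An argument-less ioctl: direction "none" and size 0. */
#define DEMO_IO(type, nr) DEMO_IOC(DEMO_IOC_NONE, (type), (nr), 0)

/* The three floppy ioctls from the hunk above, under that layout. */
static const uint32_t demo_fdmsgon  = DEMO_IO(2, 0x45);
static const uint32_t demo_fdmsgoff = DEMO_IO(2, 0x46);
static const uint32_t demo_fdflush  = DEMO_IO(2, 0x4b);
```

Under this layout the three values are 0x0245, 0x0246 and 0x024b, i.e.
the type byte 2 followed by the command number.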
