date:20190111

Re: [Qemu-devel] [PATCH v4 for-4.0 0/7] vhost-user-blk: Add support for backend reconnecting

2019-01-11 Thread Yongji Xie

On Fri, 11 Jan 2019 at 23:53, Stefan Hajnoczi  wrote:
>
> On Thu, Jan 10, 2019 at 06:59:27PM +0800, Yongji Xie wrote:
> > On Thu, 10 Jan 2019 at 18:25, Stefan Hajnoczi  wrote:
> > >
> > > On Wed, Jan 09, 2019 at 07:27:21PM +0800, elohi...@gmail.com wrote:
> > > > From: Xie Yongji 
> > > >
> > > > This patchset is aimed at supporting qemu to reconnect
> > > > vhost-user-blk backend after vhost-user-blk backend crash or
> > > > restart.
> > > >
> > > > The patch 1 uses exisiting wait/nowait options to make QEMU not
> > > > do a connect on client sockets during initialization of the chardev.
> > > >
> > > > The patch 2 introduces two new messages VHOST_USER_GET_INFLIGHT_FD
> > > > and VHOST_USER_SET_INFLIGHT_FD to support providing shared
> > > > memory to backend.
> > >
> > > Can you describe the problem that the inflight I/O shared memory region
> > > solves?
> > >
> >
> > The backend need to get inflight I/O and do I/O replay after restart.
> > Now we can only get used_idx in used_ring. It's not enough. Because we
> > might not process descriptors in the same order which they have been
> > made available sometimes. A simple example:
> > https://patchwork.kernel.org/cover/10715305/#22375607. So we need a
> > shared memory to track inflight I/O.
>
> The inflight shared memory region is something that needs to be
> discussed in detail to make sure it is correct not just for vhost-blk
> but also for other vhost-user devices in the future.
>
> Please expand the protocol documentation so that someone can implement
> this feature without looking at the code.  Explain the reason for the
> inflight shared memory region and how exactly it used.
>

OK, will do it in v5.

> After a quick look at the shared memory region documentation I have a
> few questions:
>
> 1. The avail ring only contains "head" indices, each of which may chain
>non-head descriptors.  Does the inflight memory region track only the
>heads?
>

Yes, we only track the head in inflight region.

> 2. Does the inflight shared memory region preserve avail ring order?
>For example, if the guest submitted 5 2 4, will the new vhost-user
>backend keep that order or does it see 2 4 5?
>

Now we don't support to resubmit I/O in order. But I think we can add
a timestamp
for each inflight descriptor in shared memory to achieve that.

> 3. What are the exact memory region size requirements?  Imagine a device
>with a large virtqueue.  Or a device with multiple virtqueues of
>different sizes.  Can your structure hand those cases?
>

Each available virtqueue should have a region in inflight memory. The
size of region should be fixed and support handling max size elements
in one virtqueue. There may be a little waste, but I think it make the
structure more clear.

> I'm concerned that this approach to device recovery is invasive and hard
> to test.  Instead I would use VIRTIO's Device Status Field
> DEVICE_NEEDS_RESET bit to tell the guest driver that a reset is
> necessary.  This is more disruptive - drivers either have to resubmit or
> fail I/O with EIO - but it's also simple and more likely to work
> correctly (it only needs to be implemented correctly in the guest
> driver, not in the many available vhost-user backend implementations).
>

So you mean adding one way to notify guest to resubmit inflight I/O. I
think it's a good idea. But would it be more flexible to implement
this in backend. We can support old guest. And it will be easy to fix
bug or add feature.

Thanks,
Yongji

[Qemu-devel] [Bug 1523246] Re: Virtio-blk does not support TRIM

2019-01-11 Thread James Harvey

I believe this feature was just merged by Linus about a week ago, and is
in linux 5.0-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d548e65904ae43b0637d200a2441fc94e0589c30

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1523246

Title:
  Virtio-blk does not support TRIM

Status in QEMU:
  Confirmed

Bug description:
  When model=virtio is used, TRIM is not supported.

  # mount -o discard /dev/vda4 /mnt
  # mount | tail -1
  /dev/vda4 on /mnt type fuseblk 
(rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other,blksize=4096)
  # fstrim /mnt/
  fstrim: /mnt/: the discard operation is not supported

  Booting without model=virtio allows using TRIM (in Windows as well).

  Full QEMU line:

  qemu-system-x86_64 -enable-kvm -cpu host -bios
  /usr/share/ovmf/ovmf_x64.bin -smp 2 -m 7G -vga qxl -usbdevice tablet
  -net nic,model=virtio -net user -drive discard=unmap,detect-
  zeroes=unmap,cache=none,file=vms/win10.hd.img.vmdk,format=vmdk,if=virtio

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1523246/+subscriptions

[Qemu-devel] Questions about VFIO enabling MSIX vector

2019-01-11 Thread Heyi Guo


Hi folks,

I have some questions about vfio_msix_vector_do_use() in hw/vfio/pci.c, could 
you help to explain?

We can see that when guest tries to enable one specific MSIX vector by 
unmasking MSIX Vector Control, the access will be trapped and then into 
function vfio_msix_vector_do_use(). And we may go to the below branch in line 
525:


520 

 /*
521 

 * We don't want to have the host allocate all possible MSI vectors
522 

 * for a device if they're not in use, so we shutdown and incrementally
523 

 * increase them as needed.
524 

 */
525 

 if (vdev->nr_vectors < nr + 1) {
526 

 vfio_disable_irqindex(>vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
527 

 vdev->nr_vectors = nr + 1;
528 

 ret = vfio_enable_vectors(vdev, true);
529 

 if (ret) {
530 

 error_report("vfio: failed to enable vectors, %d", ret);
531 

 }

Here all MSIX vectors will be disabled first and then enabled, with one more MSIX. The 
comment is there but I still don't quite understand. It makes sense for not allocating 
all possible MSI vectors, but why shall we shutdown the whole MSI when being requested to 
enable one specific vector? Can't we just enable the user specified vector indexed by 
"nr"?

What's more, on ARM64 systems with GIC ITS, the kernel will issue an ITS 
discard command when disabling a MSI vector, which will drop currently pending 
MSI interrupt. If device driver in guest system enables some MSIs first and 
interrupts may come at any time, and then it tries to enable other MSIs, is it 
possible for the above code to cause interrupts missing?

I may misunderstand the whole thing... Any comment is appreciated.

Thanks,

Heyi

[Qemu-devel] [PATCH] smbus: Remove unneeded include

2019-01-11 Thread BALATON Zoltan

The file no longer uses error_report after review so the header is not
needed either.

Signed-off-by: BALATON Zoltan 
---
Forgot to delete this when rewriting after review. This
can be squashed into 7713f5bed in dgibson/qemu/ppc-for-4.0

 hw/i2c/smbus_eeprom.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/i2c/smbus_eeprom.c b/hw/i2c/smbus_eeprom.c
index bef24a1ca4..01b9439014 100644
--- a/hw/i2c/smbus_eeprom.c
+++ b/hw/i2c/smbus_eeprom.c
@@ -23,7 +23,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/error-report.h"
 #include "qemu/units.h"
 #include "qapi/error.h"
 #include "hw/hw.h"
-- 
2.13.7

[Qemu-devel] [PATCH v2 1/1] riscv: Ensure the kernel start address is correctly cast

2019-01-11 Thread Alistair Francis

Cast the kernel start address to the target bit length.

This ensures that we calculate the initrd offset to a valid address for
the architecture.

Signed-off-by: Alistair Francis 
Suggested-by: Alexander Graf 
Reported-by: Alexander Graf 
---
v2:
 - Remove old comment
 hw/riscv/sifive_e.c | 2 +-
 hw/riscv/sifive_u.c | 2 +-
 hw/riscv/spike.c| 2 +-
 hw/riscv/virt.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index 5d9d65ff29..e5d7fc548e 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -74,7 +74,7 @@ static const struct MemmapEntry {
 [SIFIVE_E_DTIM] = { 0x8000, 0x4000 }
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 3bd3b67507..3b3acec377 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -65,7 +65,7 @@ static const struct MemmapEntry {
 
 #define GEM_REVISION0x10070109
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 268df04c3c..79cb4c1282 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -53,7 +53,7 @@ static const struct MemmapEntry {
 [SPIKE_DRAM] = { 0x8000,0x0 },
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index e7f0716fb6..648462b18c 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -62,7 +62,7 @@ static const struct MemmapEntry {
 [VIRT_PCIE_ECAM] =   { 0x3000,0x1000 },
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
-- 
2.19.1

Re: [Qemu-devel] [PATCH v16 0/6] Add support for TPM Physical Presence interface

2019-01-11 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20190109193115.20655-1-marcandre.lur...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20190109193115.20655-1-marcandre.lur...@redhat.com
Type: series
Subject: [Qemu-devel] [PATCH v16 0/6] Add support for TPM Physical Presence 
interface

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback --color=always base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
16ddac1 tpm: clear RAM when "memory overwrite" requested
bc2cacd acpi: add ACPI memory clear interface
5e862c0 acpi: build TPM Physical Presence interface
e137a73 acpi: expose TPM/PPI configuration parameters to firmware via fw_cfg
c2780c9 tpm: allocate/map buffer for TPM Physical Presence interface
6edda68 tpm: add a "ppi" boolean property

=== OUTPUT BEGIN ===
Unknown option: color
Usage:

checkpatch.pl [OPTION]... [FILE]...
checkpatch.pl [OPTION]... [GIT-REV-LIST]

Version: 0.31

Options:
  -q, --quietquiet
  --no-tree  run without a kernel tree
  --no-signoff   do not check for 'Signed-off-by' line
  --patchtreat FILE as patchfile
  --branch   treat args as GIT revision list
  --emacsemacs compile window format
  --terseone line per report
  -f, --file treat FILE as regular source file
  --strict   fail if only warnings are found
  --root=PATHPATH to the kernel tree root
  --no-summary   suppress the per-file summary
  --mailback only produce a report in case of warnings/errors
  --summary-file include the filename in summary
  --debug KEY=[0|1]  turn on/off debugging of KEY, where KEY is one of
 'values', 'possible', 'type', and 'attr' (default
 is all off)
  --test-only=WORD   report only warnings/errors containing WORD
 literally
  -h, --help, --version  display this help and exit

When FILE is - read standard input.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190109193115.20655-1-marcandre.lur...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] Internship idea: virtio-blk oss-fuzz support

2019-01-11 Thread Jonathan Metzman via Qemu-devel

>It should be possible to turn the qtest process into a test
postprocessor,

OSS-Fuzz doesn't support AFL's preprocessors, but adding support shouldn't
be hard.

>It's much harder to
remove the QEMU process as well and turn it into a TestOneInput function.

Got it. I am not familiar with postprocessors but I guess this should work
as long as the QEMU process can interpret the buffer output by the
postprocessor.

There may be some other complications with having an AFL-only fuzzer on
OSS-Fuzz.
The main one is no pruning. On OSS-Fuzz pruning is only done by libFuzzer,
since AFL and libFuzzer share a corpus.
I'm not sure how we would handle this.

On Fri, Jan 11, 2019 at 12:27 PM Paolo Bonzini  wrote:

> On 11/01/19 20:09, Jonathan Metzman wrote:
> > Could you clarify what you think the relationship between the qtest
> > process, QEMU, and afl-fuzz will look like when fuzzing?
> >
> > Is it something like this:
> > 1. afl-fuzz mutates a buffer, starts a qtest process, and gives the
> > qtest process the mutated buffer.
> > 2. The qtest process starts a QEMU process and interacts with QEMU
> > process based on the buffer AFL gave it (qtest).
> > 3. goto 1
> >
> > I don't think this works (under normal circumstances). AFL will think it
> > is fuzzing qtest and will not learn about coverage or crashes from qsym.
> > There probably are ways to get this working, but I just want to make
> > sure I understand.
>
> It should be possible to turn the qtest process into a test
> postprocessor, and remove the second process.  It's much harder to
> remove the QEMU process as well and turn it into a TestOneInput function.
>
> Paolo
>

[Qemu-devel] [PATCH v3] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

Don't invalidate TB with the end of zero overhead loop when LBEG or LEND
change. Instead encode the distance from the start of the page where the
TB starts to the LEND in the TB cs_base and generate loopback code when
the next PC matches encoded LEND. Distance to a destination within the
same page and up to a maximum instruction length into the next page is
encoded literally, otherwise it's zero. The distance from LEND to LBEG
is also encoded in the cs_base: it's encoded literally when less than
256 or as 0 otherwise. This allows for TB chaining for the loopback
branch at the end of a loop for the most common loop sizes.

With this change the resulting emulation speed is about 10% higher in
softmmu mode on uClibc-ng and LTP tests. Emulation speed in linux
user mode is a few percent lower because there's no direct TB chaining
between different memory pages. Testing with lower limit on direct TB
chaining range shows gradual slowdown to ~15% for the block size of 64
bytes and ~50% for the block size of 32 bytes.

Signed-off-by: Max Filippov 
Reviewed-by: Richard Henderson 
---
Changes v2->v3:
- improve comment wording

Changes v1->v2:
- drop LINKABLE_*;
- add comment about LEND offset encoding;

 target/xtensa/cpu.h  | 32 ++
 target/xtensa/helper.h   |  2 --
 target/xtensa/op_helper.c| 24 
 target/xtensa/overlay_tool.h |  1 +
 target/xtensa/translate.c| 53 +---
 5 files changed, 49 insertions(+), 63 deletions(-)

diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 34e5ccd9f1d6..bf6f9a09b62c 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -400,6 +400,7 @@ struct XtensaConfig {
 int excm_level;
 int ndepc;
 unsigned inst_fetch_width;
+unsigned max_insn_size;
 uint32_t vecbase;
 uint32_t exception_vector[EXC_MAX];
 unsigned ninterrupt;
@@ -695,6 +696,11 @@ static inline int cpu_mmu_index(CPUXtensaState *env, bool 
ifetch)
 #define XTENSA_TBFLAG_CALLINC_MASK 0x18
 #define XTENSA_TBFLAG_CALLINC_SHIFT 19
 
+#define XTENSA_CSBASE_LEND_MASK 0x
+#define XTENSA_CSBASE_LEND_SHIFT 0
+#define XTENSA_CSBASE_LBEG_OFF_MASK 0x00ff
+#define XTENSA_CSBASE_LBEG_OFF_SHIFT 16
+
 static inline void cpu_get_tb_cpu_state(CPUXtensaState *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
@@ -706,6 +712,32 @@ static inline void cpu_get_tb_cpu_state(CPUXtensaState 
*env, target_ulong *pc,
 *flags |= xtensa_get_ring(env);
 if (env->sregs[PS] & PS_EXCM) {
 *flags |= XTENSA_TBFLAG_EXCM;
+} else if (xtensa_option_enabled(env->config, XTENSA_OPTION_LOOP)) {
+target_ulong lend_dist =
+env->sregs[LEND] - (env->pc & -(1u << TARGET_PAGE_BITS));
+
+/*
+ * 0 in the csbase_lend field means that there may not be a loopback
+ * for any instruction that starts inside this page. Any other value
+ * means that an instruction that ends at this offset from the page
+ * start may loop back and will need loopback code to be generated.
+ *
+ * lend_dist is 0 when LEND points to the start of the page, but
+ * no instruction that starts inside this page may end at offset 0,
+ * so it's still correct.
+ *
+ * When an instruction ends at a page boundary it may only start in
+ * the previous page. lend_dist will be encoded as TARGET_PAGE_SIZE
+ * for the TB that contains this instruction.
+ */
+if (lend_dist < (1u << TARGET_PAGE_BITS) + env->config->max_insn_size) 
{
+target_ulong lbeg_off = env->sregs[LEND] - env->sregs[LBEG];
+
+*cs_base = lend_dist;
+if (lbeg_off < 256) {
+*cs_base |= lbeg_off << XTENSA_CSBASE_LBEG_OFF_SHIFT;
+}
+}
 }
 if (xtensa_option_enabled(env->config, XTENSA_OPTION_EXTENDED_L32R) &&
 (env->sregs[LITBASE] & 1)) {
diff --git a/target/xtensa/helper.h b/target/xtensa/helper.h
index 10153c245360..2ebba0b2c2bf 100644
--- a/target/xtensa/helper.h
+++ b/target/xtensa/helper.h
@@ -12,8 +12,6 @@ DEF_HELPER_2(rotw, void, env, i32)
 DEF_HELPER_3(window_check, noreturn, env, i32, i32)
 DEF_HELPER_1(restore_owb, void, env)
 DEF_HELPER_2(movsp, void, env, i32)
-DEF_HELPER_2(wsr_lbeg, void, env, i32)
-DEF_HELPER_2(wsr_lend, void, env, i32)
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_1(simcall, void, env)
 #endif
diff --git a/target/xtensa/op_helper.c b/target/xtensa/op_helper.c
index e4b42ab3e56c..078aeb6c2c94 100644
--- a/target/xtensa/op_helper.c
+++ b/target/xtensa/op_helper.c
@@ -107,13 +107,6 @@ static void tb_invalidate_virtual_addr(CPUXtensaState 
*env, uint32_t vaddr)
 }
 }
 
-#else
-
-static void tb_invalidate_virtual_addr(CPUXtensaState *env, uint32_t vaddr)
-{
-tb_invalidate_phys_addr(vaddr);
-}
-
 #endif
 
 void HELPER(exception)(CPUXtensaState *env, uint32_t excp)
@@ -370,23 +363,6 @@ void

Re: [Qemu-devel] [PATCH v2 07/22] qemu-nbd: Avoid strtol open-coding

2019-01-11 Thread Eric Blake

On 1/4/19 5:50 PM, Eric Blake wrote:

>> hmm, as I understand, even if this compatibility exists, it's not a part
>> of standard and nothing about off_t size in POSIX..
> 
> off_t allows you to run on older systems with 32-bit offsets and newer
> systems with 64-bit offsets; but these days, even on 32-bit systems, we
> compile qemu to always ask for 64-bit off_t.  Using off_t instead of
> int64_t is probably a separate cleanup, but one that may be worth making
> prior to this patch, so I'll defer this one to my v3.
> 
>>
>> Moreover: what is the reason for using off_t in NBD code? We don't have it
>> in NBD protocol, we don't have it in generic block layer interface. Isn't it
>> always casted to int64_t or like this?

Thanks again for asking this question. In auditing the use of off_t, I
found that 'qemu-nbd -P 1' happily tries to read beyond the bounds of
the BDS.  Thankfully, I can't find an exploit (escaped the CVE bullet -
no DoS assertion, overlarge malloc, information leak, or other nasty
problem), merely a permanent EIO down the road once the client tries to
access advertised available bytes; but I'm also adding a patch to my v3
series that does sanity checking (as we should NEVER blindly trust
values in a potentially-malicious image as being in bounds, so that
clients can't even connect to such images in the first place). [I also
think that qemu-nbd -P is seldom-used...]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Richard Henderson

On 1/12/19 9:27 AM, Max Filippov wrote:
> Maybe "Any other value means that an instruction that ends at this offset
> from the page start may loop back and will need loopback code to be 
> generated"?

Much better, thanks.


r~

Re: [Qemu-devel] [PULL 00/10] Machine queue, 2019-01-10

2019-01-11 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20190110142955.23254-1-ehabk...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20190110142955.23254-1-ehabk...@redhat.com
Type: series
Subject: [Qemu-devel] [PULL 00/10] Machine queue, 2019-01-10

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback --color=always base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  patchew/20190111220131.27153-1-jcmvb...@gmail.com -> 
patchew/20190111220131.27153-1-jcmvb...@gmail.com
Switched to a new branch 'test'
62544fe qom: Don't keep error value between object_property_parse() calls
a320168 qdev: fix -device scsi-hd, help regression
917e46b machine: Use shorter format for GlobalProperty arrays
4326497 machine: Eliminate unnecessary stringify() usage
f4d8c3d spapr: Eliminate SPAPR_PCI_2_7_MMIO_WIN_SIZE macro
6ca72c7 memory-device: rewrite address assignment using ranges
e6667d7 range: add some more functions
58bda71 Mention that QMP 'cpu-add' will be deprecated
cfce683 Update that HMP 'cpu-add' is deprecated in 4.0
fd46ac3 qemu-deprecated.texi: Rename the HMP section

=== OUTPUT BEGIN ===
Unknown option: color
Usage:

checkpatch.pl [OPTION]... [FILE]...
checkpatch.pl [OPTION]... [GIT-REV-LIST]

Version: 0.31

Options:
  -q, --quietquiet
  --no-tree  run without a kernel tree
  --no-signoff   do not check for 'Signed-off-by' line
  --patchtreat FILE as patchfile
  --branch   treat args as GIT revision list
  --emacsemacs compile window format
  --terseone line per report
  -f, --file treat FILE as regular source file
  --strict   fail if only warnings are found
  --root=PATHPATH to the kernel tree root
  --no-summary   suppress the per-file summary
  --mailback only produce a report in case of warnings/errors
  --summary-file include the filename in summary
  --debug KEY=[0|1]  turn on/off debugging of KEY, where KEY is one of
 'values', 'possible', 'type', and 'attr' (default
 is all off)
  --test-only=WORD   report only warnings/errors containing WORD
 literally
  -h, --help, --version  display this help and exit

When FILE is - read standard input.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190110142955.23254-1-ehabk...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH] contrib/gitdm: Fix a typo

2019-01-11 Thread Philippe Mathieu-Daudé

On 1/11/19 10:28 PM, no-re...@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/201905.8270-1-phi...@redhat.com/
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Message-id: 201905.8270-1-phi...@redhat.com
> Type: series
> Subject: [Qemu-devel] [PATCH] contrib/gitdm: Fix a typo
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> ./scripts/checkpatch.pl --mailback --color=always base..

I suppose Paolo expected his PR to land to upgrade patchew, and patchew
is still processing from an old queue?

> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Switched to a new branch 'test'
> c995f5f contrib/gitdm: Fix a typo
> 
> === OUTPUT BEGIN ===
> Unknown option: color
> Usage:
> 
> checkpatch.pl [OPTION]... [FILE]...
> checkpatch.pl [OPTION]... [GIT-REV-LIST]
> 
> Version: 0.31
> 
> Options:
>   -q, --quietquiet
>   --no-tree  run without a kernel tree
>   --no-signoff   do not check for 'Signed-off-by' line
>   --patchtreat FILE as patchfile
>   --branch   treat args as GIT revision list
>   --emacsemacs compile window format
>   --terseone line per report
>   -f, --file treat FILE as regular source file
>   --strict   fail if only warnings are found
>   --root=PATHPATH to the kernel tree root
>   --no-summary   suppress the per-file summary
>   --mailback only produce a report in case of warnings/errors
>   --summary-file include the filename in summary
>   --debug KEY=[0|1]  turn on/off debugging of KEY, where KEY is one of
>  'values', 'possible', 'type', and 'attr' (default
>  is all off)
>   --test-only=WORD   report only warnings/errors containing WORD
>  literally
>   -h, --help, --version  display this help and exit
> 
> When FILE is - read standard input.
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> The full log is available at
> http://patchew.org/logs/201905.8270-1-phi...@redhat.com/testing.checkpatch/?type=message.
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com
>

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

On Fri, Jan 11, 2019 at 2:23 PM Max Filippov  wrote:
>
> On Fri, Jan 11, 2019 at 2:17 PM Richard Henderson
>  wrote:
> >
> > On 1/12/19 9:14 AM, Max Filippov wrote:
> > >>> +/*
> > >>> + * 0 in the csbase_lend field means that there may not be a 
> > >>> loopback
> > >>> + * for any instruction that starts inside this page. Any other 
> > >>> value
> > >>> + * means that an instruction that ends at this offset from the 
> > >>> page
> > >>> + * start may loop back.
> > >>
> > >> Nit: s/may/will/g
> > >>
> > >> Using "may" makes it seem like we may have missed a case that should have
> > >> looped back.
> > >
> > > Using "will" makes it seem like it always loops back, but it doesn't do it
> > > when LCOUNT is 0.
> >
> > Err, no, you covered that in the first sentence, where 0 means that it will 
> > not
> > loop back.
>
> What I'm saying is that "Any other value means that an instruction that ends
> at this offset from the page start will loop back." sounds like that 
> instruction
> will loop back unconditionally. LCOUNT = 0 is the condition when this doesn't
> happen, and it's not encoded in the TB flags. It's checked in the generated
> loopback code.

Maybe "Any other value means that an instruction that ends at this offset
from the page start may loop back and will need loopback code to be generated"?


-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

On Fri, Jan 11, 2019 at 2:17 PM Richard Henderson
 wrote:
>
> On 1/12/19 9:14 AM, Max Filippov wrote:
> >>> +/*
> >>> + * 0 in the csbase_lend field means that there may not be a 
> >>> loopback
> >>> + * for any instruction that starts inside this page. Any other 
> >>> value
> >>> + * means that an instruction that ends at this offset from the 
> >>> page
> >>> + * start may loop back.
> >>
> >> Nit: s/may/will/g
> >>
> >> Using "may" makes it seem like we may have missed a case that should have
> >> looped back.
> >
> > Using "will" makes it seem like it always loops back, but it doesn't do it
> > when LCOUNT is 0.
>
> Err, no, you covered that in the first sentence, where 0 means that it will 
> not
> loop back.

What I'm saying is that "Any other value means that an instruction that ends
at this offset from the page start will loop back." sounds like that instruction
will loop back unconditionally. LCOUNT = 0 is the condition when this doesn't
happen, and it's not encoded in the TB flags. It's checked in the generated
loopback code.

-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Richard Henderson

On 1/12/19 9:14 AM, Max Filippov wrote:
>>> +/*
>>> + * 0 in the csbase_lend field means that there may not be a 
>>> loopback
>>> + * for any instruction that starts inside this page. Any other 
>>> value
>>> + * means that an instruction that ends at this offset from the page
>>> + * start may loop back.
>>
>> Nit: s/may/will/g
>>
>> Using "may" makes it seem like we may have missed a case that should have
>> looped back.
> 
> Using "will" makes it seem like it always loops back, but it doesn't do it
> when LCOUNT is 0.

Err, no, you covered that in the first sentence, where 0 means that it will not
loop back.


r~

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

On Fri, Jan 11, 2019 at 2:09 PM Richard Henderson
 wrote:
>
> On 1/12/19 9:01 AM, Max Filippov wrote:
> > Don't invalidate TB with the end of zero overhead loop when LBEG or LEND
> > change. Instead encode the distance from the start of the page where the
> > TB starts to the LEND in the TB cs_base and generate loopback code when
> > the next PC matches encoded LEND. Distance to a destination within the
> > same page and up to a maximum instruction length into the next page is
> > encoded literally, otherwise it's zero. The distance from LEND to LBEG
> > is also encoded in the cs_base: it's encoded literally when less than
> > 256 or as 0 otherwise. This allows for TB chaining for the loopback
> > branch at the end of a loop for the most common loop sizes.
> >
> > With this change the resulting emulation speed is about 10% higher in
> > softmmu mode on uClibc-ng and LTP tests. Emulation speed in linux
> > user mode is a few percent lower because there's no direct TB chaining
> > between different memory pages. Testing with lower limit on direct TB
> > chainig range shows gradual slowdown to ~15% for the block size of 64
> > bytes and ~50% for the block size of 32 bytes.
> >
> > Signed-off-by: Max Filippov 
>
> Reviewed-by: Richard Henderson 
>
>
> > +/*
> > + * 0 in the csbase_lend field means that there may not be a 
> > loopback
> > + * for any instruction that starts inside this page. Any other 
> > value
> > + * means that an instruction that ends at this offset from the page
> > + * start may loop back.
>
> Nit: s/may/will/g
>
> Using "may" makes it seem like we may have missed a case that should have
> looped back.

Using "will" makes it seem like it always loops back, but it doesn't do it
when LCOUNT is 0.

-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH] contrib/gitdm: Fix a typo

2019-01-11 Thread no-reply

Patchew URL: https://patchew.org/QEMU/201905.8270-1-phi...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 201905.8270-1-phi...@redhat.com
Type: series
Subject: [Qemu-devel] [PATCH] contrib/gitdm: Fix a typo

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback --color=always base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
c995f5f contrib/gitdm: Fix a typo

=== OUTPUT BEGIN ===
Unknown option: color
Usage:

checkpatch.pl [OPTION]... [FILE]...
checkpatch.pl [OPTION]... [GIT-REV-LIST]

Version: 0.31

Options:
  -q, --quietquiet
  --no-tree  run without a kernel tree
  --no-signoff   do not check for 'Signed-off-by' line
  --patchtreat FILE as patchfile
  --branch   treat args as GIT revision list
  --emacsemacs compile window format
  --terseone line per report
  -f, --file treat FILE as regular source file
  --strict   fail if only warnings are found
  --root=PATHPATH to the kernel tree root
  --no-summary   suppress the per-file summary
  --mailback only produce a report in case of warnings/errors
  --summary-file include the filename in summary
  --debug KEY=[0|1]  turn on/off debugging of KEY, where KEY is one of
 'values', 'possible', 'type', and 'attr' (default
 is all off)
  --test-only=WORD   report only warnings/errors containing WORD
 literally
  -h, --help, --version  display this help and exit

When FILE is - read standard input.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/201905.8270-1-phi...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Richard Henderson

On 1/12/19 9:01 AM, Max Filippov wrote:
> Don't invalidate TB with the end of zero overhead loop when LBEG or LEND
> change. Instead encode the distance from the start of the page where the
> TB starts to the LEND in the TB cs_base and generate loopback code when
> the next PC matches encoded LEND. Distance to a destination within the
> same page and up to a maximum instruction length into the next page is
> encoded literally, otherwise it's zero. The distance from LEND to LBEG
> is also encoded in the cs_base: it's encoded literally when less than
> 256 or as 0 otherwise. This allows for TB chaining for the loopback
> branch at the end of a loop for the most common loop sizes.
> 
> With this change the resulting emulation speed is about 10% higher in
> softmmu mode on uClibc-ng and LTP tests. Emulation speed in linux
> user mode is a few percent lower because there's no direct TB chaining
> between different memory pages. Testing with lower limit on direct TB
> chainig range shows gradual slowdown to ~15% for the block size of 64
> bytes and ~50% for the block size of 32 bytes.
> 
> Signed-off-by: Max Filippov 

Reviewed-by: Richard Henderson 


> +/*
> + * 0 in the csbase_lend field means that there may not be a loopback
> + * for any instruction that starts inside this page. Any other value
> + * means that an instruction that ends at this offset from the page
> + * start may loop back.

Nit: s/may/will/g

Using "may" makes it seem like we may have missed a case that should have
looped back.


r~

[Qemu-devel] [PATCH v2] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

Don't invalidate TB with the end of zero overhead loop when LBEG or LEND
change. Instead encode the distance from the start of the page where the
TB starts to the LEND in the TB cs_base and generate loopback code when
the next PC matches encoded LEND. Distance to a destination within the
same page and up to a maximum instruction length into the next page is
encoded literally, otherwise it's zero. The distance from LEND to LBEG
is also encoded in the cs_base: it's encoded literally when less than
256 or as 0 otherwise. This allows for TB chaining for the loopback
branch at the end of a loop for the most common loop sizes.

With this change the resulting emulation speed is about 10% higher in
softmmu mode on uClibc-ng and LTP tests. Emulation speed in linux
user mode is a few percent lower because there's no direct TB chaining
between different memory pages. Testing with lower limit on direct TB
chainig range shows gradual slowdown to ~15% for the block size of 64
bytes and ~50% for the block size of 32 bytes.

Signed-off-by: Max Filippov 
---
Changes v1->v2:
- drop LINKABLE_*;
- add comment about LEND offset encoding;

 target/xtensa/cpu.h  | 32 ++
 target/xtensa/helper.h   |  2 --
 target/xtensa/op_helper.c| 24 
 target/xtensa/overlay_tool.h |  1 +
 target/xtensa/translate.c| 53 +---
 5 files changed, 49 insertions(+), 63 deletions(-)

diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 34e5ccd9f1d6..6c2636d06c5e 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -400,6 +400,7 @@ struct XtensaConfig {
 int excm_level;
 int ndepc;
 unsigned inst_fetch_width;
+unsigned max_insn_size;
 uint32_t vecbase;
 uint32_t exception_vector[EXC_MAX];
 unsigned ninterrupt;
@@ -695,6 +696,11 @@ static inline int cpu_mmu_index(CPUXtensaState *env, bool 
ifetch)
 #define XTENSA_TBFLAG_CALLINC_MASK 0x18
 #define XTENSA_TBFLAG_CALLINC_SHIFT 19
 
+#define XTENSA_CSBASE_LEND_MASK 0x
+#define XTENSA_CSBASE_LEND_SHIFT 0
+#define XTENSA_CSBASE_LBEG_OFF_MASK 0x00ff
+#define XTENSA_CSBASE_LBEG_OFF_SHIFT 16
+
 static inline void cpu_get_tb_cpu_state(CPUXtensaState *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
@@ -706,6 +712,32 @@ static inline void cpu_get_tb_cpu_state(CPUXtensaState 
*env, target_ulong *pc,
 *flags |= xtensa_get_ring(env);
 if (env->sregs[PS] & PS_EXCM) {
 *flags |= XTENSA_TBFLAG_EXCM;
+} else if (xtensa_option_enabled(env->config, XTENSA_OPTION_LOOP)) {
+target_ulong lend_dist =
+env->sregs[LEND] - (env->pc & -(1u << TARGET_PAGE_BITS));
+
+/*
+ * 0 in the csbase_lend field means that there may not be a loopback
+ * for any instruction that starts inside this page. Any other value
+ * means that an instruction that ends at this offset from the page
+ * start may loop back.
+ *
+ * lend_dist is 0 when LEND points to the start of the page, but
+ * no instruction that starts inside this page may end at offset 0,
+ * so it's still correct.
+ *
+ * When an instruction ends at a page boundary it may only start in
+ * the previous page. lend_dist will be encoded as TARGET_PAGE_SIZE
+ * for the TB that contains this instruction.
+ */
+if (lend_dist < (1u << TARGET_PAGE_BITS) + env->config->max_insn_size) 
{
+target_ulong lbeg_off = env->sregs[LEND] - env->sregs[LBEG];
+
+*cs_base = lend_dist;
+if (lbeg_off < 256) {
+*cs_base |= lbeg_off << XTENSA_CSBASE_LBEG_OFF_SHIFT;
+}
+}
 }
 if (xtensa_option_enabled(env->config, XTENSA_OPTION_EXTENDED_L32R) &&
 (env->sregs[LITBASE] & 1)) {
diff --git a/target/xtensa/helper.h b/target/xtensa/helper.h
index 10153c245360..2ebba0b2c2bf 100644
--- a/target/xtensa/helper.h
+++ b/target/xtensa/helper.h
@@ -12,8 +12,6 @@ DEF_HELPER_2(rotw, void, env, i32)
 DEF_HELPER_3(window_check, noreturn, env, i32, i32)
 DEF_HELPER_1(restore_owb, void, env)
 DEF_HELPER_2(movsp, void, env, i32)
-DEF_HELPER_2(wsr_lbeg, void, env, i32)
-DEF_HELPER_2(wsr_lend, void, env, i32)
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_1(simcall, void, env)
 #endif
diff --git a/target/xtensa/op_helper.c b/target/xtensa/op_helper.c
index e4b42ab3e56c..078aeb6c2c94 100644
--- a/target/xtensa/op_helper.c
+++ b/target/xtensa/op_helper.c
@@ -107,13 +107,6 @@ static void tb_invalidate_virtual_addr(CPUXtensaState 
*env, uint32_t vaddr)
 }
 }
 
-#else
-
-static void tb_invalidate_virtual_addr(CPUXtensaState *env, uint32_t vaddr)
-{
-tb_invalidate_phys_addr(vaddr);
-}
-
 #endif
 
 void HELPER(exception)(CPUXtensaState *env, uint32_t excp)
@@ -370,23 +363,6 @@ void HELPER(movsp)(CPUXtensaState *env, uint32_t pc)
 }
 }
 
-void HELPER(wsr_lbeg)(CPUXtensaState *env, uint32_t v)
-{
-if

Re: [Qemu-devel] Internship idea: virtio-blk oss-fuzz support

2019-01-11 Thread Jonathan Metzman via Qemu-devel

Could you clarify what you think the relationship between the qtest
process, QEMU, and afl-fuzz will look like when fuzzing?

Is it something like this:
1. afl-fuzz mutates a buffer, starts a qtest process, and gives the qtest
process the mutated buffer.
2. The qtest process starts a QEMU process and interacts with QEMU process
based on the buffer AFL gave it (qtest).
3. goto 1

I don't think this works (under normal circumstances). AFL will think it is
fuzzing qtest and will not learn about coverage or crashes from qsym.
There probably are ways to get this working, but I just want to make sure I
understand.

Thanks,
Jonathan

On Fri, Jan 11, 2019 at 8:16 AM Paolo Bonzini  wrote:

> On 11/01/19 16:41, Max Moroz wrote:
> >
> >
> > On Fri, Jan 11, 2019 at 7:34 AM Paolo Bonzini  > > wrote:
> >
> > On 11/01/19 16:04, Max Moroz wrote:
> > > We usually have a single fuzzing process, it starts with a fuzzing
> > > engine's main function and is calling LLVMFuzzerTestOneInput with
> > > various inputs and keep mutating them based on the coverage
> feedback.
> > > Running a second process which you don't care too much about might
> be
> > > fine, but the fuzzing process should be "replacing" or should I say
> > > "imitating" the process whose coverage you're interested in.
> >
> > What do you mean by replacing or imitating?
> >
> > To give you an example, when we fuzz ffmpeg, we do not run ffmpeg's main
> > function. We write LLVMFuzzerTestOneInput that would do the necessary
> > initialization, reset the state, etc, and then would pass (data, size)
> > provided by a fuzzing engine to the API(s) we're trying to fuzz. So, in
> > your case, there should not be a regular QEMU process, and instead the
> > fuzz target (i.e. LLVMFuzzerTestOneInput) should be doing certain
> > initialization (which is usually done by the QEMU process) and then call
> > the API you want to fuzz.
>
> The main issue is that we are not really testing an API and QEMU has a
> lot of global state.

Of course there are C functions to do the
> elementary I/O operations, but there are two problems.  The first and
> smaller is that one input would correspond to a sequence of invocations
> of the functions, not just one function invocation; the larger, is that
> the knowledge of the guest (memory map, placement of devices, etc.) is
> not easy to consume from within QEMU.  We do have a library that
> includes that knowledge, and it would be easy to use it from a mutator
> or input postprocessor, but I'm afraid that it would be very very
> intrusive to embed that library into QEMU, replacing all of main() etc.
>
> The simplest way would be to test the input by fork()-ing QEMU, followed
> by waitpid() in the parent and invoking the original main() in the
> child.  But that wouldn't buy much and would probably even be slower
> than an efficient deferred fork server.
>
> Paolo
>
> >
> >
> >
> > Avoiding fork would probably be hard.  I'm mostly afraid that some
> state
> > guest state is not resetted properly across runs, and this would
> result
> > in non-reproducible crashes.
> >
> > It seems to me that the task can be approached with AFL and a test
> case
> > postprocessor to generate the qtest input; however, my knowledge of
> > libFuzzer is very very limited.
> >
> > Paolo
> >
>
>

[Qemu-devel] [PATCH v2 13/13] spapr: enable PHB hotplug for default pseries machine type

2019-01-11 Thread Greg Kurz

From: Michael Roth 

The 'dr_phb_enabled' field of that class can be set as part of
machine-specific init code. It will be used to conditionally
enable creation of DRC objects and device-tree description to
facilitate hotplug of PHBs.

Since we can't migrate this state to older machine types,
default the option to true and disable it for older machine
types.

Signed-off-by: Michael Roth 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 96979caa5b11..59088203e0b2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4164,6 +4164,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
 spapr_caps_add_properties(smc, _abort);
 smc->irq = _irq_xics;
+smc->dr_phb_enabled = true;
 }
 
 static const TypeInfo spapr_machine_info = {
@@ -4229,6 +4230,7 @@ static void spapr_machine_3_1_class_options(MachineClass 
*mc)
 compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
 smc->update_dt_enabled = false;
+smc->dr_phb_enabled = false;
 }
 
 DEFINE_SPAPR_MACHINE(3_1, "3.1", false);

[Qemu-devel] [PATCH v2 12/13] spapr: add hotplug hooks for PHB hotplug

2019-01-11 Thread Greg Kurz

Hotplugging PHBs is a machine-level operation, but PHBs reside on the
main system bus, so we register spapr machine as the handler for the
main system bus.

We re-get the phandle of the interrupt controller systematically for
simplicity.

Signed-off-by: Michael Roth 
Signed-off-by: Greg Kurz 
---
v2: - reworked phandle handling
- sync LSIs to KVM
---
 hw/ppc/spapr.c |  125 
 hw/ppc/spapr_drc.c |1 
 hw/ppc/spapr_pci.c |   27 +-
 include/hw/ppc/spapr.h |1 
 4 files changed, 138 insertions(+), 16 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ebb6fbbb2b69..96979caa5b11 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
 register_savevm_live(NULL, "spapr/htab", -1, 1,
  _htab_handlers, spapr);
 
+if (smc->dr_phb_enabled) {
+qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
+ _fatal);
+}
+
 qemu_register_boot_set(spapr_boot_set, spapr);
 
 if (kvm_enabled()) {
@@ -3727,6 +3732,112 @@ out:
 error_propagate(errp, local_err);
 }
 
+static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+   Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+const unsigned windows_supported = spapr_phb_windows_supported(sphb);
+
+if (sphb->index == (uint32_t)-1) {
+error_setg(errp, "\"index\" for PAPR PHB is mandatory");
+return;
+}
+
+/*
+ * This will check that sphb->index doesn't exceed the maximum number of
+ * PHBs for the current machine type.
+ */
+smc->phb_placement(spapr, sphb->index,
+   >buid, >io_win_addr,
+   >mem_win_addr, >mem64_win_addr,
+   windows_supported, sphb->dma_liobn, errp);
+}
+
+static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+   Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+void *fdt = NULL;
+int fdt_start_offset;
+int fdt_size;
+Error *local_err = NULL;
+sPAPRDRConnector *drc;
+int ret;
+bool hotplugged = spapr_drc_hotplugged(dev);
+int phandle;
+
+if (!smc->dr_phb_enabled) {
+return;
+}
+
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
+/* hotplug hooks should check it's enabled before getting this far */
+assert(drc);
+
+if (hotplugged) {
+phandle = spapr->irq->get_phandle(spapr, spapr->fdt_blob,
+  _err);
+if (local_err) {
+goto out;
+}
+} else {
+phandle = PHANDLE_INTC;
+}
+
+fdt = create_device_tree(_size);
+ret = spapr_populate_pci_dt(sphb, phandle, fdt, spapr->irq->nr_msis,
+_start_offset);
+if (ret < 0) {
+error_setg(_err, "unable to create FDT for %sPHB",
+   dev->hotplugged ? "hotplugged " : "");
+goto out;
+}
+
+if (hotplugged) {
+/* generally SLOF creates these, for hotplug it's up to QEMU */
+_FDT(fdt_setprop_string(fdt, fdt_start_offset, "name", "pci"));
+}
+
+spapr_drc_attach(drc, DEVICE(dev), fdt, fdt_start_offset, _err);
+
+out:
+if (local_err) {
+error_propagate(errp, local_err);
+g_free(fdt);
+return;
+}
+
+if (hotplugged) {
+spapr_hotplug_req_add_by_index(drc);
+} else if (drc) {
+spapr_drc_reset(drc);
+}
+}
+
+void spapr_phb_release(DeviceState *dev)
+{
+object_unparent(OBJECT(dev));
+}
+
+static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+sPAPRDRConnector *drc;
+
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
+assert(drc);
+
+if (!spapr_drc_unplug_requested(drc)) {
+spapr_drc_detach(drc);
+spapr_hotplug_req_remove_by_index(drc);
+}
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
   DeviceState *dev, Error **errp)
 {
@@ -3734,6 +3845,8 @@ static void spapr_machine_device_plug(HotplugHandler 
*hotplug_dev,
 spapr_memory_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
 spapr_core_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
+spapr_phb_plug(hotplug_dev, dev, errp);
 }
 }
 
@@

Re: [Qemu-devel] [PATCH v5 4/8] linux-user: Split out preadv, pwritev, readv, writev, pread64, pwrite64

2019-01-11 Thread Richard Henderson

On 1/11/19 2:17 AM, Laurent Vivier wrote:
> On 19/12/2018 05:21, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson 
>> ---
>>  linux-user/syscall-defs.h |  14 
>>  linux-user/syscall-file.inc.c | 124 ++
>>  linux-user/syscall.c  |  93 -
>>  linux-user/strace.list|  18 -
>>  4 files changed, 138 insertions(+), 111 deletions(-)
>>
> ...
>> diff --git a/linux-user/syscall-file.inc.c b/linux-user/syscall-file.inc.c
>> index 11e75044c1..410a763eee 100644
>> --- a/linux-user/syscall-file.inc.c
>> +++ b/linux-user/syscall-file.inc.c
>> @@ -315,6 +315,104 @@ SYSCALL_IMPL(openat)
> ...
>> +
>> +/*
>> + * Both preadv and pwritev merge args 4/5 into a 64-bit offset.
>> + * Moreover, the parts are *always* in little-endian order.
>> + */
>> +#if TARGET_ABI_BITS == 32
>> +SYSCALL_ARGS(preadv_pwritev)
>> +{
>> +/* We have already assigned out[0-2].  */
>> +abi_ulong lo = in[3], hi = in[4];
>> +out[3] = ((hi << (TARGET_ABI_BITS - 1)) << 1) | lo;
>> +return def;
>> +}
>> +#else
>> +#define args_preadv_pwritev NULL
>> +#endif
>> +
>> +/* Perform the inverse operation for the host.  */
>> +static inline void host_offset64_low_high(unsigned long *l, unsigned long 
>> *h,
>> +  uint64_t off)
>> +{
>> +*l = off;
>> +*h = (off >> (HOST_LONG_BITS - 1)) >> 1;
>> +}
> 
> 
> I have an error with preadv() on a 32bit target (powerpc, LTP test preadv02).
> 
> It works if I use:
> 
> static inline void host_offset64_low_high(unsigned long *hlow,
>   unsigned long *hhigh,
>   abi_ulong tlow,
>   abi_ulong thigh)
> {
> uint64_t off = tlow |
>((unsigned long long)thigh << TARGET_LONG_BITS / 2) <<
>TARGET_LONG_BITS / 2;
> 
> *hlow = off;
> *hhigh = (off >> HOST_LONG_BITS / 2) >> HOST_LONG_BITS / 2;
> }

This doesn't make any sense.  Where are "tlow" and "thigh" coming from?

I think the bug will be

 SYSCALL_ARGS(preadv_pwritev)
 {
 /* We have already assigned out[0-2].  */
 abi_ulong lo = in[3], hi = in[4];
-out[3] = ((hi << (TARGET_ABI_BITS - 1)) << 1) | lo;
+out[3] = (((uint64_t)hi << (TARGET_ABI_BITS - 1)) << 1) | lo;
 return def;
 }



r~

Re: [Qemu-devel] [PATCH] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Richard Henderson

On 1/11/19 10:49 PM, Max Filippov wrote:
> @@ -706,6 +716,17 @@ static inline void cpu_get_tb_cpu_state(CPUXtensaState 
> *env, target_ulong *pc,
>  *flags |= xtensa_get_ring(env);
>  if (env->sregs[PS] & PS_EXCM) {
>  *flags |= XTENSA_TBFLAG_EXCM;
> +} else if (xtensa_option_enabled(env->config, XTENSA_OPTION_LOOP)) {
> +target_ulong lend_dist = env->sregs[LEND] - (env->pc & 
> LINKABLE_MASK);
> +
> +if (lend_dist < LINKABLE_SIZE + env->config->max_insn_size) {
> +target_ulong lbeg_off = env->sregs[LEND] - env->sregs[LBEG];
> +
> +*cs_base = lend_dist;
> +if (lbeg_off < 256) {
> +*cs_base |= lbeg_off << XTENSA_CSBASE_LBEG_OFF_SHIFT;
> +}
> +}
>  }
>  if (xtensa_option_enabled(env->config, XTENSA_OPTION_EXTENDED_L32R) &&
>  (env->sregs[LITBASE] & 1)) {

I think the only other thing that would be nice is some comment that describes
the loop scheme.  I can follow it now, but it took a while of re-reading.  In
particular, the fact that 0 means "disabled", and happens to work because we
(correctly) only check for LEND at the end of an instruction.  Thus the offset
from pc_first will always be non-zero when we check.

r~

[Qemu-devel] [PATCH v2 11/13] spapr_irq: Allow synchronization of a single irq state to KVM

2019-01-11 Thread Greg Kurz

When using the in-kernel interrupt controller, the state of all irqs is
synchronized to KVM at machine reset time. In the case of PHB hotplug, we
will need to synchronize LSIs manually.

Do this for the existing KVM XICS implementation and put a placeholder for
the upcoming KVM XIVE.

Signed-off-by: Greg Kurz 
---
 hw/intc/xics_kvm.c |   67 +---
 hw/ppc/spapr_irq.c |   31 
 include/hw/ppc/spapr_irq.h |2 +
 include/hw/ppc/xics.h  |2 +
 4 files changed, 73 insertions(+), 29 deletions(-)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index dff13300504c..d3bbb2bcf19c 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -253,43 +253,52 @@ static void ics_synchronize_state(ICSState *ics)
 ics_get_kvm_state(ics);
 }
 
-static int ics_set_kvm_state(ICSState *ics, int version_id)
+int ics_set_kvm_state_one(ICSState *ics, unsigned srcno, Error **errp)
 {
+ICSIRQState *irq;
 uint64_t state;
-int i;
-Error *local_err = NULL;
 
-for (i = 0; i < ics->nr_irqs; i++) {
-ICSIRQState *irq = >irqs[i];
-int ret;
+assert(srcno < ics->nr_irqs);
 
-state = irq->server;
-state |= (uint64_t)(irq->saved_priority & KVM_XICS_PRIORITY_MASK)
-<< KVM_XICS_PRIORITY_SHIFT;
-if (irq->priority != irq->saved_priority) {
-assert(irq->priority == 0xff);
-state |= KVM_XICS_MASKED;
-}
+irq = >irqs[srcno];
+state = irq->server;
+state |= (uint64_t)(irq->saved_priority & KVM_XICS_PRIORITY_MASK)
+<< KVM_XICS_PRIORITY_SHIFT;
+if (irq->priority != irq->saved_priority) {
+assert(irq->priority == 0xff);
+state |= KVM_XICS_MASKED;
+}
 
-if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
-state |= KVM_XICS_LEVEL_SENSITIVE;
-if (irq->status & XICS_STATUS_ASSERTED) {
-state |= KVM_XICS_PENDING;
-}
-} else {
-if (irq->status & XICS_STATUS_MASKED_PENDING) {
-state |= KVM_XICS_PENDING;
-}
-}
-if (irq->status & XICS_STATUS_PRESENTED) {
-state |= KVM_XICS_PRESENTED;
+if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
+state |= KVM_XICS_LEVEL_SENSITIVE;
+if (irq->status & XICS_STATUS_ASSERTED) {
+state |= KVM_XICS_PENDING;
 }
-if (irq->status & XICS_STATUS_QUEUED) {
-state |= KVM_XICS_QUEUED;
+} else {
+if (irq->status & XICS_STATUS_MASKED_PENDING) {
+state |= KVM_XICS_PENDING;
 }
+}
+if (irq->status & XICS_STATUS_PRESENTED) {
+state |= KVM_XICS_PRESENTED;
+}
+if (irq->status & XICS_STATUS_QUEUED) {
+state |= KVM_XICS_QUEUED;
+}
+
+return kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
+ srcno + ics->offset, , true, errp);
+}
+
+static int ics_set_kvm_state(ICSState *ics, int version_id)
+{
+int i;
+Error *local_err = NULL;
+
+for (i = 0; i < ics->nr_irqs; i++) {
+int ret;
 
-ret = kvm_device_access(kernel_xics_fd, KVM_DEV_XICS_GRP_SOURCES,
-i + ics->offset, , true, _err);
+ret = ics_set_kvm_state_one(ics, i, _err);
 if (local_err) {
 error_report_err(local_err);
 return ret;
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index ba0df9ae2e1b..2cf666c2ebc5 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -236,6 +236,17 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, 
Error **errp)
 /* TODO: create the KVM XICS device */
 }
 
+static void spapr_irq_sync_to_kvm_xics(sPAPRMachineState *spapr, int irq,
+   Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+ICSState *ics = spapr->ics;
+
+if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
+ics_set_kvm_state_one(ics, irq - ics->offset, errp);
+}
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -256,6 +267,7 @@ sPAPRIrq spapr_irq_xics = {
 .reset   = spapr_irq_reset_xics,
 .set_irq = spapr_irq_set_irq_xics,
 .get_phandle = spapr_get_phandle_xics,
+.sync_to_kvm = spapr_irq_sync_to_kvm_xics,
 };
 
 /*
@@ -389,6 +401,12 @@ static void spapr_irq_set_irq_xive(void *opaque, int 
srcno, int val)
 xive_source_set_irq(>xive->source, srcno, val);
 }
 
+static void spapr_irq_sync_to_kvm_xive(sPAPRMachineState *spapr, int irq,
+   Error **errp)
+{
+/* TODO: to be implemented when adding KVM XIVE support */
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -413,6 +431,7 @@ sPAPRIrq spapr_irq_xive = {
 .reset   = spapr_irq_reset_xive,

[Qemu-devel] [PATCH v2 08/13] spapr_pci: provide node start offset via spapr_populate_pci_dt()

2019-01-11 Thread Greg Kurz

From: Michael Roth 

PHB hotplug re-uses PHB device tree generation code and passes
it to a guest via RTAS. Doing this requires knowledge of where
exactly in the device tree the node describing the PHB begins.

Provide this via a new optional pointer that can be used to
store the PHB node's start offset.

Signed-off-by: Michael Roth 
Reviewed-by: David Gibson 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c  |2 +-
 hw/ppc/spapr_pci.c  |5 -
 include/hw/pci-host/spapr.h |2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ef8984286587..ebb6fbbb2b69 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1297,7 +1297,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 
 QLIST_FOREACH(phb, >phbs, list) {
 ret = spapr_populate_pci_dt(phb, PHANDLE_INTC, fdt,
-spapr->irq->nr_msis);
+spapr->irq->nr_msis, NULL);
 if (ret < 0) {
 error_report("couldn't setup PCI devices in fdt");
 exit(1);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index be9e01d84eaf..d8a285cc291d 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2130,7 +2130,7 @@ static void spapr_phb_pci_enumerate(sPAPRPHBState *phb)
 }
 
 int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
-  uint32_t nr_msis)
+  uint32_t nr_msis, int *node_offset)
 {
 int bus_off, i, j, ret;
 gchar *nodename;
@@ -2185,6 +2185,9 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t 
intc_phandle, void *fdt,
 nodename = g_strdup_printf("pci@%" PRIx64, phb->buid);
 _FDT(bus_off = fdt_add_subnode(fdt, 0, nodename));
 g_free(nodename);
+if (node_offset) {
+*node_offset = bus_off;
+}
 
 /* Write PHB properties */
 _FDT(fdt_setprop_string(fdt, bus_off, "device_type", "pci"));
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 16f42d0c0254..caee5df4d52f 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -114,7 +114,7 @@ static inline qemu_irq spapr_phb_lsi_qirq(struct 
sPAPRPHBState *phb, int pin)
 }
 
 int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
-  uint32_t nr_msis);
+  uint32_t nr_msis, int *node_offset);
 
 void spapr_pci_rtas_init(void);

[Qemu-devel] [PATCH v2 09/13] spapr_pci: add ibm, my-drc-index property for PHB hotplug

2019-01-11 Thread Greg Kurz

From: Michael Roth 

This is needed to denote a boot-time PHB as being hot-pluggable.

Signed-off-by: Michael Roth 
Reviewed-by: David Gibson 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_pci.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index d8a285cc291d..f2b9066ecabe 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2180,6 +2180,7 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t 
intc_phandle, void *fdt,
 sPAPRTCETable *tcet;
 PCIBus *bus = PCI_HOST_BRIDGE(phb)->bus;
 sPAPRFDT s_fdt;
+sPAPRDRConnector *drc;
 
 /* Start populating the FDT */
 nodename = g_strdup_printf("pci@%" PRIx64, phb->buid);
@@ -2246,6 +2247,14 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t 
intc_phandle, void *fdt,
  tcet->liobn, tcet->bus_offset,
  tcet->nb_table << tcet->page_shift);
 
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, phb->index);
+if (drc) {
+uint32_t drc_index = cpu_to_be32(spapr_drc_index(drc));
+
+_FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", _index,
+ sizeof(drc_index)));
+}
+
 /* Walk the bridges and program the bus numbers*/
 spapr_phb_pci_enumerate(phb);
 _FDT(fdt_setprop_cell(fdt, bus_off, "qemu,phb-enumerated", 0x1));

[Qemu-devel] [PATCH v2 07/13] qdev: pass an Object * to qbus_set_hotplug_handler()

2019-01-11 Thread Greg Kurz

From: Michael Roth 

Certain devices types, like memory/CPU, are now being handled using a
hotplug interface provided by a top-level MachineClass. Hotpluggable
host bridges are another such device where it makes sense to use a
machine-level hotplug handler. However, unlike those devices,
host-bridges have a parent bus (the main system bus), and devices with
a parent bus use a different mechanism for registering their hotplug
handlers: qbus_set_hotplug_handler(). This interface currently expects
a handler to be a subclass of DeviceClass, but this is not the case
for MachineClass, which derives directly from ObjectClass.

Internally, the interface only requires an ObjectClass, so expose that
in qbus_set_hotplug_handler().

Cc: Michael S. Tsirkin 
Cc: Eduardo Habkost 
Signed-off-by: Michael Roth 
Signed-off-by: Greg Kurz 
Reviewed-by: David Gibson 
---
 hw/acpi/pcihp.c   |2 +-
 hw/acpi/piix4.c   |2 +-
 hw/char/virtio-serial-bus.c   |2 +-
 hw/core/bus.c |   11 ++-
 hw/pci/pcie.c |2 +-
 hw/pci/shpc.c |2 +-
 hw/ppc/spapr_pci.c|2 +-
 hw/s390x/css-bridge.c |2 +-
 hw/s390x/s390-pci-bus.c   |6 +++---
 hw/scsi/virtio-scsi.c |2 +-
 hw/scsi/vmw_pvscsi.c  |2 +-
 hw/usb/dev-smartcard-reader.c |2 +-
 include/hw/qdev-core.h|3 +--
 13 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index 7bc7a723407b..942918132376 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -251,7 +251,7 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, 
AcpiPciHpState *s,
 object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
 PCIBus *sec = pci_bridge_get_sec_bus(PCI_BRIDGE(pdev));
 
-qbus_set_hotplug_handler(BUS(sec), DEVICE(hotplug_dev),
+qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev),
  _abort);
 /* We don't have to overwrite any other hotplug handler yet */
 assert(QLIST_EMPTY(>child));
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 88f9a9ec0912..df8c0db909ce 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -536,7 +536,7 @@ static void piix4_pm_realize(PCIDevice *dev, Error **errp)
 
 piix4_acpi_system_hot_add_init(pci_address_space_io(dev),
pci_get_bus(dev), s);
-qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), DEVICE(s), _abort);
+qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), OBJECT(s), _abort);
 
 piix4_pm_add_propeties(s);
 }
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 04e3ebe3526a..e4310c78f2dc 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -1052,7 +1052,7 @@ static void virtio_serial_device_realize(DeviceState 
*dev, Error **errp)
 /* Spawn a new virtio-serial bus on which the ports will ride as devices */
 qbus_create_inplace(>bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
 dev, vdev->bus_name);
-qbus_set_hotplug_handler(BUS(>bus), DEVICE(vser), errp);
+qbus_set_hotplug_handler(BUS(>bus), OBJECT(vser), errp);
 vser->bus.vser = vser;
 QTAILQ_INIT(>ports);
 
diff --git a/hw/core/bus.c b/hw/core/bus.c
index 4651f244864c..e09843f6abea 100644
--- a/hw/core/bus.c
+++ b/hw/core/bus.c
@@ -22,22 +22,15 @@
 #include "hw/qdev.h"
 #include "qapi/error.h"
 
-static void qbus_set_hotplug_handler_internal(BusState *bus, Object *handler,
-  Error **errp)
+void qbus_set_hotplug_handler(BusState *bus, Object *handler, Error **errp)
 {
-
 object_property_set_link(OBJECT(bus), OBJECT(handler),
  QDEV_HOTPLUG_HANDLER_PROPERTY, errp);
 }
 
-void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler, Error 
**errp)
-{
-qbus_set_hotplug_handler_internal(bus, OBJECT(handler), errp);
-}
-
 void qbus_set_bus_hotplug_handler(BusState *bus, Error **errp)
 {
-qbus_set_hotplug_handler_internal(bus, OBJECT(bus), errp);
+qbus_set_hotplug_handler(bus, OBJECT(bus), errp);
 }
 
 int qbus_walk_children(BusState *bus,
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 2d3d8a047b83..a2287c3eb997 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -534,7 +534,7 @@ void pcie_cap_slot_init(PCIDevice *dev, uint16_t slot)
 dev->exp.hpev_notified = false;
 
 qbus_set_hotplug_handler(BUS(pci_bridge_get_sec_bus(PCI_BRIDGE(dev))),
- DEVICE(dev), NULL);
+ OBJECT(dev), NULL);
 }
 
 void pcie_cap_slot_reset(PCIDevice *dev)
diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
index 45053b39b92c..52ccdc5ae3b9 100644
--- a/hw/pci/shpc.c
+++ b/hw/pci/shpc.c
@@ -648,7 +648,7 @@ int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion 
*bar,
 shpc_cap_update_dword(d);
 memory_region_add_subregion(bar, offset, >mmio);

[Qemu-devel] [PATCH v2 06/13] spapr_events: add support for phb hotplug events

2019-01-11 Thread Greg Kurz

From: Michael Roth 

Extend the existing EPOW event format we use for PCI
devices to emit PHB plug/unplug events.

Signed-off-by: Michael Roth 
Reviewed-by: David Gibson 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_events.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index d4e75211954d..6cd640b700dd 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -526,6 +526,9 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t 
hp_action,
 case SPAPR_DR_CONNECTOR_TYPE_CPU:
 hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
 break;
+case SPAPR_DR_CONNECTOR_TYPE_PHB:
+hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PHB;
+break;
 default:
 /* we shouldn't be signaling hotplug events for resources
  * that don't support them

[Qemu-devel] [PATCH v2 10/13] spapr_irq: Expose the phandle of the interrupt controller

2019-01-11 Thread Greg Kurz

This will be used by PHB hotplug in order to create the "interrupt-map"
property of the PHB node.

Signed-off-by: Greg Kurz 
---
 hw/intc/spapr_xive.c|   34 --
 hw/intc/xics_spapr.c|   28 +++-
 hw/ppc/spapr_irq.c  |   13 -
 include/hw/ppc/spapr_irq.h  |1 +
 include/hw/ppc/spapr_xive.h |2 ++
 include/hw/ppc/xics_spapr.h |2 ++
 6 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index d391177ab81f..ffae680024d7 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -1414,6 +1414,12 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
 spapr_register_hypercall(H_INT_RESET, h_int_reset);
 }
 
+static gchar *xive_nodename(sPAPRXive *xive)
+{
+return g_strdup_printf("interrupt-controller@%" PRIx64,
+   xive->tm_base + XIVE_TM_USER_PAGE * (1 << 
TM_SHIFT));
+}
+
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle)
 {
@@ -1450,8 +1456,7 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t 
nr_servers, void *fdt,
XIVE_TM_OS_PAGE * (1ull << TM_SHIFT));
 timas[3] = cpu_to_be64(1ull << TM_SHIFT);
 
-nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
-   xive->tm_base + XIVE_TM_USER_PAGE * (1 << 
TM_SHIFT));
+nodename = xive_nodename(xive);
 _FDT(node = fdt_add_subnode(fdt, 0, nodename));
 g_free(nodename);
 
@@ -1479,3 +1484,28 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t 
nr_servers, void *fdt,
 _FDT(fdt_setprop(fdt, 0, "ibm,plat-res-int-priorities",
  plat_res_int_priorities, 
sizeof(plat_res_int_priorities)));
 }
+
+uint32_t spapr_get_phandle_xive(sPAPRMachineState *spapr, void *fdt,
+Error **errp)
+{
+gchar *nodename = xive_nodename(spapr->xive);
+int phandle = -1, offset;
+
+offset = fdt_subnode_offset(fdt, 0, nodename);
+if (offset < 0) {
+error_setg(errp, "Can't find node \"%s\": %s", nodename,
+   fdt_strerror(offset));
+goto out;
+}
+
+phandle = fdt_get_phandle(spapr->fdt_blob, offset);
+if (phandle < 0) {
+error_setg(errp, "Can't get phandle of node \"%s\": %s",
+   nodename, fdt_strerror(phandle));
+goto out;
+}
+
+out:
+g_free(nodename);
+return phandle;
+}
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index de6cc15b6474..7763b81d90dd 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -35,6 +35,7 @@
 #include "hw/ppc/xics_spapr.h"
 #include "hw/ppc/fdt.h"
 #include "qapi/visitor.h"
+#include "qapi/error.h"
 
 /*
  * Guest interfaces
@@ -245,6 +246,8 @@ void xics_spapr_init(sPAPRMachineState *spapr)
 spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
+#define NODENAME "interrupt-controller"
+
 void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle)
 {
@@ -253,7 +256,7 @@ void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t 
nr_servers, void *fdt,
 };
 int node;
 
-_FDT(node = fdt_add_subnode(fdt, 0, "interrupt-controller"));
+_FDT(node = fdt_add_subnode(fdt, 0, NODENAME));
 
 _FDT(fdt_setprop_string(fdt, node, "device_type",
 "PowerPC-External-Interrupt-Presentation"));
@@ -266,3 +269,26 @@ void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t 
nr_servers, void *fdt,
 _FDT(fdt_setprop_cell(fdt, node, "linux,phandle", phandle));
 _FDT(fdt_setprop_cell(fdt, node, "phandle", phandle));
 }
+
+uint32_t spapr_get_phandle_xics(sPAPRMachineState *spapr, void *fdt,
+Error **errp)
+{
+int phandle = -1, offset;
+
+offset = fdt_subnode_offset(fdt, 0, NODENAME);
+if (offset < 0) {
+error_setg(errp, "Can't find node \"%s\": %s", NODENAME,
+   fdt_strerror(offset));
+goto out;
+}
+
+phandle = fdt_get_phandle(spapr->fdt_blob, offset);
+if (phandle < 0) {
+error_setg(errp, "Can't get phandle of node \"%s\": %s",
+   NODENAME, fdt_strerror(phandle));
+goto out;
+}
+
+out:
+return phandle;
+}
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 1da7a32348fc..ba0df9ae2e1b 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -255,6 +255,7 @@ sPAPRIrq spapr_irq_xics = {
 .post_load   = spapr_irq_post_load_xics,
 .reset   = spapr_irq_reset_xics,
 .set_irq = spapr_irq_set_irq_xics,
+.get_phandle = spapr_get_phandle_xics,
 };
 
 /*
@@ -411,6 +412,7 @@ sPAPRIrq spapr_irq_xive = {
 .post_load   = spapr_irq_post_load_xive,
 .reset   = spapr_irq_reset_xive,
 .set_irq = spapr_irq_set_irq_xive,
+.get_phandle = spapr_get_phandle_xive,
 };
 
 /*
@@ -569,6 +571,13 @@ static void

[Qemu-devel] [PATCH v2 05/13] spapr: populate PHB DRC entries for root DT node

2019-01-11 Thread Greg Kurz

From: Nathan Fontenot 

This add entries to the root OF node to advertise our PHBs as being
DR-capable in accordance with PAPR specification.

Signed-off-by: Nathan Fontenot 
Signed-off-by: Michael Roth 
Reviewed-by: David Gibson 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c |8 
 1 file changed, 8 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9eeb6792e261..ef8984286587 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1350,6 +1350,14 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 exit(1);
 }
 
+if (smc->dr_phb_enabled) {
+ret = spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_PHB);
+if (ret < 0) {
+error_report("Couldn't set up PHB DR device tree properties");
+exit(1);
+}
+}
+
 return fdt;
 }

[Qemu-devel] [PATCH v2 04/13] spapr: create DR connectors for PHBs

2019-01-11 Thread Greg Kurz

From: Michael Roth 

Signed-off-by: Michael Roth 
Reviewed-by: David Gibson 
Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c |   13 +
 hw/ppc/spapr_drc.c |   17 +
 include/hw/ppc/spapr.h |1 +
 include/hw/ppc/spapr_drc.h |8 
 4 files changed, 39 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 26f8e55cc25e..9eeb6792e261 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2797,6 +2797,19 @@ static void spapr_machine_init(MachineState *machine)
 /* We always have at least the nvram device on VIO */
 spapr_create_nvram(spapr);
 
+/*
+ * Setup hotplug / dynamic-reconfiguration connectors. top-level
+ * connectors (described in root DT node's "ibm,drc-types" property)
+ * are pre-initialized here. additional child connectors (such as
+ * connectors for a PHBs PCI slots) are added as needed during their
+ * parent's realization.
+ */
+if (smc->dr_phb_enabled) {
+for (i = 0; i < SPAPR_MAX_PHBS; i++) {
+spapr_dr_connector_new(OBJECT(machine), TYPE_SPAPR_DRC_PHB, i);
+}
+}
+
 /* Set up PCI */
 spapr_pci_rtas_init();
 
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 2edb7d1e9c8c..189ee681062a 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -696,6 +696,15 @@ static void spapr_drc_lmb_class_init(ObjectClass *k, void 
*data)
 drck->release = spapr_lmb_release;
 }
 
+static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
+{
+sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
+
+drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
+drck->typename = "PHB";
+drck->drc_name_prefix = "PHB ";
+}
+
 static const TypeInfo spapr_dr_connector_info = {
 .name  = TYPE_SPAPR_DR_CONNECTOR,
 .parent= TYPE_DEVICE,
@@ -739,6 +748,13 @@ static const TypeInfo spapr_drc_lmb_info = {
 .class_init= spapr_drc_lmb_class_init,
 };
 
+static const TypeInfo spapr_drc_phb_info = {
+.name  = TYPE_SPAPR_DRC_PHB,
+.parent= TYPE_SPAPR_DRC_LOGICAL,
+.instance_size = sizeof(sPAPRDRConnector),
+.class_init= spapr_drc_phb_class_init,
+};
+
 /* helper functions for external users */
 
 sPAPRDRConnector *spapr_drc_by_index(uint32_t index)
@@ -1189,6 +1205,7 @@ static void spapr_drc_register_types(void)
 type_register_static(_drc_cpu_info);
 type_register_static(_drc_pci_info);
 type_register_static(_drc_lmb_info);
+type_register_static(_drc_phb_info);
 
 spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
 rtas_set_indicator);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 7193ce094689..4eb80c3d888c 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -103,6 +103,7 @@ struct sPAPRMachineClass {
 
 /*< public >*/
 bool dr_lmb_enabled;   /* enable dynamic-reconfig/hotplug of LMBs */
+bool dr_phb_enabled;   /* enable dynamic-reconfig/hotplug of PHBs */
 bool update_dt_enabled;/* enable KVMPPC_H_UPDATE_DT */
 bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
 bool pre_2_10_has_unused_icps;
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index f6ff32e7e2f2..56bba36ad4da 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -70,6 +70,14 @@
 #define SPAPR_DRC_LMB(obj) OBJECT_CHECK(sPAPRDRConnector, (obj), \
 TYPE_SPAPR_DRC_LMB)
 
+#define TYPE_SPAPR_DRC_PHB "spapr-drc-phb"
+#define SPAPR_DRC_PHB_GET_CLASS(obj) \
+OBJECT_GET_CLASS(sPAPRDRConnectorClass, obj, TYPE_SPAPR_DRC_PHB)
+#define SPAPR_DRC_PHB_CLASS(klass) \
+OBJECT_CLASS_CHECK(sPAPRDRConnectorClass, klass, TYPE_SPAPR_DRC_PHB)
+#define SPAPR_DRC_PHB(obj) OBJECT_CHECK(sPAPRDRConnector, (obj), \
+TYPE_SPAPR_DRC_PHB)
+
 /*
  * Various hotplug types managed by sPAPRDRConnector
  *

[Qemu-devel] [PATCH v2 01/13] ppc: Move spapr-related prototypes from xics.h into a seperate header file

2019-01-11 Thread Greg Kurz

From: Thomas Huth 

When compiling with Clang in -std=gnu99 mode, there is a warning/error:

  CC  ppc64-softmmu/hw/intc/xics_spapr.o
In file included from /home/thuth/devel/qemu/hw/intc/xics_spapr.c:34:
/home/thuth/devel/qemu/include/hw/ppc/xics.h:203:34: error: redefinition of 
typedef 'sPAPRMachineState' is a C11 feature
  [-Werror,-Wtypedef-redefinition]
typedef struct sPAPRMachineState sPAPRMachineState;
 ^
/home/thuth/devel/qemu/include/hw/ppc/spapr_irq.h:25:34: note: previous 
definition is here
typedef struct sPAPRMachineState sPAPRMachineState;
 ^

We have to remove the duplicated typedef here and include "spapr.h" instead.
But "spapr.h" should not be included for the pnv machine files. So move
the spapr-related prototypes into a new file called "xics_spapr.h" instead.

Reviewed-by: Greg Kurz 
Reviewed-by: Cédric Le Goater 
Signed-off-by: Thomas Huth 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Greg Kurz 
---
 hw/intc/xics_kvm.c  |1 +
 hw/intc/xics_spapr.c|1 +
 hw/ppc/spapr_irq.c  |1 +
 include/hw/ppc/xics.h   |7 ---
 include/hw/ppc/xics_spapr.h |   37 +
 5 files changed, 40 insertions(+), 7 deletions(-)
 create mode 100644 include/hw/ppc/xics_spapr.h

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index ac94594b1919..dff13300504c 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -34,6 +34,7 @@
 #include "sysemu/kvm.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "hw/ppc/xics_spapr.h"
 #include "kvm_ppc.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 9c1a90d7094b..de6cc15b6474 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -32,6 +32,7 @@
 #include "qemu/timer.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "hw/ppc/xics_spapr.h"
 #include "hw/ppc/fdt.h"
 #include "qapi/visitor.h"
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 5fce72fe0f6c..1da7a32348fc 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -14,6 +14,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
+#include "hw/ppc/xics_spapr.h"
 #include "sysemu/kvm.h"
 
 #include "trace.h"
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 07508cbd217e..fad786e8b22d 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -200,13 +200,6 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon);
 void ics_resend(ICSState *ics);
 void icp_resend(ICPState *ss);
 
-typedef struct sPAPRMachineState sPAPRMachineState;
-
-void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
-   uint32_t phandle);
-int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
-void xics_spapr_init(sPAPRMachineState *spapr);
-
 Object *icp_create(Object *cpu, const char *type, XICSFabric *xi,
Error **errp);
 
diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
new file mode 100644
index ..b1ab27d022cf
--- /dev/null
+++ b/include/hw/ppc/xics_spapr.h
@@ -0,0 +1,37 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics
+ *
+ * Copyright (c) 2010, 2011 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef XICS_SPAPR_H
+#define XICS_SPAPR_H
+
+#include "hw/ppc/spapr.h"
+
+void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle);
+int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
+void xics_spapr_init(sPAPRMachineState *spapr);
+
+#endif /* XICS_SPAPR_H */

[Qemu-devel] [PATCH v2 03/13] spapr_pci: add PHB unrealize

2019-01-11 Thread Greg Kurz

To support PHB hotplug we need to clean up lingering references,
memory, child properties, etc. prior to the PHB object being
finalized. Generally this will be called as a result of calling
object_unparent() on the PHB object, which in turn would normally
be called as the result of an unplug() operation.

When the PHB is finalized, child objects will be unparented in
turn, and finalized if the PHB was the only reference holder. so
we don't bother to explicitly unparent child objects of the PHB
(spapr_iommu, spapr_drc, etc).

The formula that gives the number of DMA windows is moved to an
inline function in the hw/pci-host/spapr.h header because it
will have other users.

The unrealize function is able to cope with partially realized PHBs.
It is hence used to implement proper rollback on the realize error
path.

Signed-off-by: Michael Roth 
Signed-off-by: Greg Kurz 
---
v2: - implement rollback with unrealize function
---
 hw/ppc/spapr_pci.c  |   73 ---
 include/hw/pci-host/spapr.h |4 ++
 2 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 24f95a400623..0d680b71b5d6 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1559,6 +1559,64 @@ static void spapr_pci_unplug_request(HotplugHandler 
*plug_handler,
 }
 }
 
+static void spapr_phb_finalizefn(Object *obj)
+{
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(obj);
+
+g_free(sphb->dtbusname);
+sphb->dtbusname = NULL;
+}
+
+static void spapr_phb_unrealize(DeviceState *dev, Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+SysBusDevice *s = SYS_BUS_DEVICE(dev);
+PCIHostState *phb = PCI_HOST_BRIDGE(s);
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(phb);
+sPAPRTCETable *tcet;
+int i;
+const unsigned windows_supported = spapr_phb_windows_supported(sphb);
+
+if (sphb->msi) {
+g_hash_table_unref(sphb->msi);
+sphb->msi = NULL;
+}
+
+/*
+ * Remove IO/MMIO subregions and aliases, rest should get cleaned
+ * via PHB's unrealize->object_finalize
+ */
+for (i = windows_supported - 1; i >= 0; i--) {
+tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[i]);
+if (tcet) {
+memory_region_del_subregion(>iommu_root,
+spapr_tce_get_iommu(tcet));
+}
+}
+
+for (i = PCI_NUM_PINS - 1; i >= 0; i--) {
+if (sphb->lsi_table[i].irq) {
+spapr_irq_free(spapr, sphb->lsi_table[i].irq, 1);
+sphb->lsi_table[i].irq = 0;
+}
+}
+
+QLIST_REMOVE(sphb, list);
+
+memory_region_del_subregion(>iommu_root, >msiwindow);
+
+address_space_destroy(>iommu_as);
+
+qbus_set_hotplug_handler(BUS(phb->bus), NULL, _abort);
+pci_unregister_root_bus(phb->bus);
+
+memory_region_del_subregion(get_system_memory(), >iowindow);
+if (sphb->mem64_win_pciaddr != (hwaddr)-1) {
+memory_region_del_subregion(get_system_memory(), >mem64window);
+}
+memory_region_del_subregion(get_system_memory(), >mem32window);
+}
+
 static void spapr_phb_realize(DeviceState *dev, Error **errp)
 {
 /* We don't use SPAPR_MACHINE() in order to exit gracefully if the user
@@ -1576,8 +1634,7 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 PCIBus *bus;
 uint64_t msi_window_size = 4096;
 sPAPRTCETable *tcet;
-const unsigned windows_supported =
-sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
+const unsigned windows_supported = spapr_phb_windows_supported(sphb);
 
 if (!spapr) {
 error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries 
machine");
@@ -1734,14 +1791,14 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 if (local_err) {
 error_propagate_prepend(errp, local_err,
 "can't allocate LSIs: ");
-return;
+goto unrealize;
 }
 }
 
 spapr_irq_claim(spapr, irq, true, _err);
 if (local_err) {
 error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
-return;
+goto unrealize;
 }
 
 sphb->lsi_table[i].irq = irq;
@@ -1761,13 +1818,17 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 if (!tcet) {
 error_setg(errp, "Creating window#%d failed for %s",
i, sphb->dtbusname);
-return;
+goto unrealize;
 }
 memory_region_add_subregion(>iommu_root, 0,
 spapr_tce_get_iommu(tcet));
 }
 
 sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
+return;
+
+unrealize:
+spapr_phb_unrealize(dev, NULL);
 }
 
 static int spapr_phb_children_reset(Object *child, void *opaque)
@@ -1966,6 +2027,7 @@ static void

[Qemu-devel] [PATCH v2 02/13] spapr: Rename xics to intc in interrupt controller agnostic code

2019-01-11 Thread Greg Kurz

All this code is used with both the XICS and XIVE interrupt controllers.

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c  |6 +++---
 hw/ppc/spapr_events.c   |2 +-
 hw/ppc/spapr_pci.c  |6 +++---
 hw/ppc/spapr_vio.c  |2 +-
 include/hw/pci-host/spapr.h |2 +-
 include/hw/ppc/spapr.h  |2 +-
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 83081defde4e..26f8e55cc25e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -96,7 +96,7 @@
 
 #define MIN_RMA_SLOF128UL
 
-#define PHANDLE_XICP0x
+#define PHANDLE_INTC0x
 
 /* These two functions implement the VCPU id numbering: one to compute them
  * all and one to identify thread 0 of a VCORE. Any change to the first one
@@ -1276,7 +1276,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 
 /* /interrupt controller */
 spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
-  PHANDLE_XICP);
+  PHANDLE_INTC);
 
 ret = spapr_populate_memory(spapr, fdt);
 if (ret < 0) {
@@ -1296,7 +1296,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 }
 
 QLIST_FOREACH(phb, >phbs, list) {
-ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt,
+ret = spapr_populate_pci_dt(phb, PHANDLE_INTC, fdt,
 spapr->irq->nr_msis);
 if (ret < 0) {
 error_report("couldn't setup PCI devices in fdt");
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 32719a1b72d0..d4e75211954d 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -282,7 +282,7 @@ void spapr_dt_events(sPAPRMachineState *spapr, void *fdt)
 continue;
 }
 
-spapr_dt_xics_irq(interrupts, source->irq, false);
+spapr_dt_intc_irq(interrupts, source->irq, false);
 
 _FDT(node_offset = fdt_add_subnode(fdt, event_sources, source_name));
 _FDT(fdt_setprop(fdt, node_offset, "interrupts", interrupts,
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index b74f2632ecc6..24f95a400623 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2066,7 +2066,7 @@ static void spapr_phb_pci_enumerate(sPAPRPHBState *phb)
 
 }
 
-int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t xics_phandle, void *fdt,
+int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
   uint32_t nr_msis)
 {
 int bus_off, i, j, ret;
@@ -2164,8 +2164,8 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t 
xics_phandle, void *fdt,
 irqmap[1] = 0;
 irqmap[2] = 0;
 irqmap[3] = cpu_to_be32(j+1);
-irqmap[4] = cpu_to_be32(xics_phandle);
-spapr_dt_xics_irq([5], phb->lsi_table[lsi_num].irq, true);
+irqmap[4] = cpu_to_be32(intc_phandle);
+spapr_dt_intc_irq([5], phb->lsi_table[lsi_num].irq, true);
 }
 }
 /* Write interrupt map */
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 7e8a9ad09337..cfd9849d8dfb 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -158,7 +158,7 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
 if (dev->irq) {
 uint32_t ints_prop[2];
 
-spapr_dt_xics_irq(ints_prop, dev->irq, false);
+spapr_dt_intc_irq(ints_prop, dev->irq, false);
 ret = fdt_setprop(fdt, node_off, "interrupts", ints_prop,
   sizeof(ints_prop));
 if (ret < 0) {
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 4eb3a2ce3eb8..e0e683c32469 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -113,7 +113,7 @@ static inline qemu_irq spapr_phb_lsi_qirq(struct 
sPAPRPHBState *phb, int pin)
 return spapr_qirq(spapr, phb->lsi_table[pin].irq);
 }
 
-int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t xics_phandle, void *fdt,
+int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
   uint32_t nr_msis);
 
 void spapr_pci_rtas_init(void);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 9e01a5a12e4a..7193ce094689 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -682,7 +682,7 @@ void spapr_load_rtas(sPAPRMachineState *spapr, void *fdt, 
hwaddr addr);
  * "interrupt-controller" node has its "#interrupt-cells" property set to 2 
(ie,
  * VIO devices, RTAS event sources and PHBs).
  */
-static inline void spapr_dt_xics_irq(uint32_t *intspec, int irq, bool is_lsi)
+static inline void spapr_dt_intc_irq(uint32_t *intspec, int irq, bool is_lsi)
 {
 intspec[0] = cpu_to_be32(irq);
 intspec[1] = is_lsi ? cpu_to_be32(1) : 0;

[Qemu-devel] [PATCH v2 00/13] spapr: Add support for PHB hotplug

2019-01-11 Thread Greg Kurz

This allows to hotplug/unplug PHBs. I could successfully test:
- hotplug/unplug with e1000 device to validate LSIs
- hotplug/unplug with virtio-net device to validate MSIs
- some simple migration scenarios

Changes in v2:
- rebased on current ppc-for-4.0
- added some preliminary cleanup
- call unrealize from realize error path
- advertise PHB hotplug in last patch
- reworked phandle related code
- sync LSIs to KVM

Please comment.

--
Greg

---

Greg Kurz (5):
  spapr: Rename xics to intc in interrupt controller agnostic code
  spapr_pci: add PHB unrealize
  spapr_irq: Expose the phandle of the interrupt controller
  spapr_irq: Allow synchronization of a single irq state to KVM
  spapr: add hotplug hooks for PHB hotplug

Michael Roth (6):
  spapr: create DR connectors for PHBs
  spapr_events: add support for phb hotplug events
  qdev: pass an Object * to qbus_set_hotplug_handler()
  spapr_pci: provide node start offset via spapr_populate_pci_dt()
  spapr_pci: add ibm, my-drc-index property for PHB hotplug
  spapr: enable PHB hotplug for default pseries machine type

Nathan Fontenot (1):
  spapr: populate PHB DRC entries for root DT node

Thomas Huth (1):
  ppc: Move spapr-related prototypes from xics.h into a seperate header file


 hw/acpi/pcihp.c   |2 -
 hw/acpi/piix4.c   |2 -
 hw/char/virtio-serial-bus.c   |2 -
 hw/core/bus.c |   11 +--
 hw/intc/spapr_xive.c  |   34 -
 hw/intc/xics_kvm.c|   68 ++
 hw/intc/xics_spapr.c  |   29 +++-
 hw/pci/pcie.c |2 -
 hw/pci/shpc.c |2 -
 hw/ppc/spapr.c|  156 -
 hw/ppc/spapr_drc.c|   18 +
 hw/ppc/spapr_events.c |5 +
 hw/ppc/spapr_irq.c|   45 
 hw/ppc/spapr_pci.c|  122 +---
 hw/ppc/spapr_vio.c|2 -
 hw/s390x/css-bridge.c |2 -
 hw/s390x/s390-pci-bus.c   |6 +-
 hw/scsi/virtio-scsi.c |2 -
 hw/scsi/vmw_pvscsi.c  |2 -
 hw/usb/dev-smartcard-reader.c |2 -
 include/hw/pci-host/spapr.h   |8 ++
 include/hw/ppc/spapr.h|4 +
 include/hw/ppc/spapr_drc.h|8 ++
 include/hw/ppc/spapr_irq.h|3 +
 include/hw/ppc/spapr_xive.h   |2 +
 include/hw/ppc/xics.h |9 +-
 include/hw/ppc/xics_spapr.h   |   39 ++
 include/hw/qdev-core.h|3 -
 28 files changed, 492 insertions(+), 98 deletions(-)
 create mode 100644 include/hw/ppc/xics_spapr.h

Re: [Qemu-devel] [PATCH v3 24/35] target/riscv: Move gen_arith_imm() decoding into trans_* functions

2019-01-11 Thread Richard Henderson

On 1/12/19 12:10 AM, Bastian Koppelmann wrote:
> 
> On 10/31/18 11:18 PM, Richard Henderson wrote:
>>
>> Surely the shri and sari functions need the same shamt >= TARGET_LONG_BITS
>> check as slli.  Otherwise RV32 shri should definitely produce an assert in
>> tcg_gen_shri_tl.
>>
>> I did wonder about changing the decode of the shift functions such that only
>> the top two bits of the imm are reserved for secondary parsing of the shift
>> type, and the other 10 bits are passed down into trans_foo.  At which point 
>> the
>> TARGET_LONG_BITS check takes care of other illegalities.
>>
>> Which means that the parsing for slli and slliw are identical, and also that
>> for the far future when RV128 is a thing, we don't have to change the 
>> parsing.
> 
> 
> I don't quite understand this. Do you want to have one entry in the decode 
> file
> for slli and slliw?
> 
> How is the parsing of slli and slliw identical with this change? As far as I
> see it, they are different at least in the opcode.

I meant in the extraction and validation of operands, I think.
I'm not really sure where else I was going with this.  It has
been two months and I don't have the decode in front of me.


r~

Re: [Qemu-devel] [PATCH] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Max Filippov

On Fri, Jan 11, 2019 at 12:53 PM Richard Henderson
 wrote:
>
> On 1/11/19 10:49 PM, Max Filippov wrote:
> > +#define LINKABLE_BITS 12
> > +#define LINKABLE_SIZE (1u << LINKABLE_BITS)
> > +#define LINKABLE_MASK (-LINKABLE_SIZE)
>
> What is this for?  It seems to be replicating TARGET_PAGE_BITS.

I used it to play with different sizes of regions where direct TB chaining
is allowed. In linux-user it is possible to have it bigger than the page size.
It looks like the optimum is somewhere in the range 10..12.
I left it there because it looks logically independent from the page size,
but I sure can drop it.

-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability

2019-01-11 Thread Eduardo Habkost

On Fri, Jan 11, 2019 at 06:49:53PM +0300, Yury Kotov wrote:
> 10.01.2019, 23:12, "Dr. David Alan Gilbert" :
> > * Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> >>  Hi,
> >>
> >>  The series adds migration capability which allows to skip 'external' RAM 
> >> blocks
> >>  during migration. External block is a RAMBlock which available from the 
> >> outside
> >>  of current QEMU process (e.g. file in /dev/shm). It's useful for fast 
> >> local
> >>  migration to update QEMU for the running guests.
> >
> > Hi Yury,
> >   There have been a few similar patch series around from people wanting
> > to do similar things.
> >   In particular Lai Jiangshan's 
> > https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
> > and Cédric Le Goater wanted to skip regions for a different reason.
> >
> >   We merged some of Cédric's code last year so that we now
> > have the qemu_ram_is_migratable() function - and we should be reusing
> > that to skip things rather than adding a new check that we have to add
> > everywhere.
> >
> 
> I didn't see the series, so I'll check it, thanks!
> But I saw qemu_ram_is_migratable() function and corresponding patch.
> It's very close to my needs, but it works a bit different IIUC:
> 1. Not migratable blocks isn't validated (existence and size) during 
> migration,
> 2. "Migratable" state is determined during the block creation time.
>Such case isn't valid because of it:
>* Source has one migratable and one not migratable RAM blocks,
>* Target has the same (idstr) blocks, but both are not migratable.
>Thus, target will not expect pages for not migratable blocks.
> 
> >   Also, ypu're skipping 'external' things, I think the other suggestion
> > was to skip 'shared' things (i.e. anything with share=0); skipping
> > share=on cases sounds easier to me.
> 
> I agree that introducing new term is a complication, but 'share' and 
> 'external'
> terms have important differences (I'll describe it below).
> 
> Just to clarify:
> * 'share' means that other processes has an access to such memory,
> * 'external' means file backed memory.

If you use file backed memory with share=off, writes are not
propagated to the file (they are mapped with MAP_PRIVATE).  Would
you really want to skip file backed memory if it has share=off?

> 
> There is another use case I wanted to support (I had to write about it in
> the cover letter, sorry..):
> 1. Migrate source VM to file and kill source,
> 2. Start target VM and migrate it from file.
> In such case source VM may have memory-backend-ram with share=off, it's ok.
> 
> Thus, in the new migration capability I want to migrate memory that meets
> three conditions:
> 1. The source will not use the memory after migration ends,
> 2. The source may exit before target starts (migrate to file),
> 3. The target has an access to the memory.
> 
> I think 'external' fits them better than 'share'.
> 

In either case, defining "external" seems tricky.  A memory
region might be backed by a file on tmpfs or hugetlbfs that was
deleted, which makes the file "internal" for practical purposes.
QEMU has no way to tell if (3) is really true.

-- 
Eduardo

Re: [Qemu-devel] [PATCH] target/xtensa: rework zero overhead loops implementation

2019-01-11 Thread Richard Henderson

On 1/11/19 10:49 PM, Max Filippov wrote:
> +#define LINKABLE_BITS 12
> +#define LINKABLE_SIZE (1u << LINKABLE_BITS)
> +#define LINKABLE_MASK (-LINKABLE_SIZE)

What is this for?  It seems to be replicating TARGET_PAGE_BITS.


r~

Re: [Qemu-devel] Internship idea: virtio-blk oss-fuzz support

2019-01-11 Thread Paolo Bonzini

On 11/01/19 20:09, Jonathan Metzman wrote:
> Could you clarify what you think the relationship between the qtest
> process, QEMU, and afl-fuzz will look like when fuzzing?
> 
> Is it something like this:
> 1. afl-fuzz mutates a buffer, starts a qtest process, and gives the
> qtest process the mutated buffer.
> 2. The qtest process starts a QEMU process and interacts with QEMU
> process based on the buffer AFL gave it (qtest).
> 3. goto 1
> 
> I don't think this works (under normal circumstances). AFL will think it
> is fuzzing qtest and will not learn about coverage or crashes from qsym.
> There probably are ways to get this working, but I just want to make
> sure I understand.

It should be possible to turn the qtest process into a test
postprocessor, and remove the second process.  It's much harder to
remove the QEMU process as well and turn it into a TestOneInput function.

Paolo

Re: [Qemu-devel] [PATCH] xen: Fix event channel interface for XenDevice-s

2019-01-11 Thread Paul Durrant

> -Original Message-
> From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> Sent: 11 January 2019 18:10
> To: qemu-devel@nongnu.org
> Cc: Anthony Perard ; Stefano Stabellini
> ; Paul Durrant ; open
> list:X86 
> Subject: [PATCH] xen: Fix event channel interface for XenDevice-s
> 
> Patch "xen: add event channel interface for XenDevice-s" makes use of
> the type xenevtchn_port_or_error_t, but this isn't avaiable before Xen
> 4.7. Also the function xen_device_bind_event_channel assign the return
> value of xenevtchn_bind_interdomain to channel->local_port but check the
> result for error with xendev->local_port.
> 
> Fix by:
> - removing local_port from struct XenDevice as it isn't use anywere.
> - adding a compatibility typedef for xenevtchn_port_or_error_t for Xen
>   4.6 and earlier.
> 
> As extra, replace the type of XenEventChannel->local_port by
> evtchn_port_t.
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Paul Durrant 

> ---
>  hw/xen/xen-bus.c| 12 +++-
>  include/hw/xen/xen-bus.h|  1 -
>  include/hw/xen/xen_common.h |  1 +
>  3 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
> index f90bcf2342..3aeccec69c 100644
> --- a/hw/xen/xen-bus.c
> +++ b/hw/xen/xen-bus.c
> @@ -917,7 +917,7 @@ void xen_device_copy_grant_refs(XenDevice *xendev,
> bool to_domain,
>  }
> 
>  struct XenEventChannel {
> -unsigned int local_port;
> +evtchn_port_t local_port;
>  XenEventHandler handler;
>  void *opaque;
>  Notifier notifier;
> @@ -939,17 +939,19 @@ XenEventChannel
> *xen_device_bind_event_channel(XenDevice *xendev,
> void *opaque, Error
> **errp)
>  {
>  XenEventChannel *channel = g_new0(XenEventChannel, 1);
> +xenevtchn_port_or_error_t local_port;
> 
> -channel->local_port = xenevtchn_bind_interdomain(xendev->xeh,
> - xendev->frontend_id,
> - port);
> -if (xendev->local_port < 0) {
> +local_port = xenevtchn_bind_interdomain(xendev->xeh,
> +xendev->frontend_id,
> +port);
> +if (local_port < 0) {
>  error_setg_errno(errp, errno, "xenevtchn_bind_interdomain
> failed");
> 
>  g_free(channel);
>  return NULL;
>  }
> 
> +channel->local_port = local_port;
>  channel->handler = handler;
>  channel->opaque = opaque;
>  channel->notifier.notify = event_notify;
> diff --git a/include/hw/xen/xen-bus.h b/include/hw/xen/xen-bus.h
> index e55a5de5f1..3183f10e3c 100644
> --- a/include/hw/xen/xen-bus.h
> +++ b/include/hw/xen/xen-bus.h
> @@ -29,7 +29,6 @@ typedef struct XenDevice {
>  xengnttab_handle *xgth;
>  bool feature_grant_copy;
>  xenevtchn_handle *xeh;
> -xenevtchn_port_or_error_t local_port;
>  NotifierList event_notifiers;
>  } XenDevice;
> 
> diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
> index 2b91d199a1..9a8155e172 100644
> --- a/include/hw/xen/xen_common.h
> +++ b/include/hw/xen/xen_common.h
> @@ -32,6 +32,7 @@ extern xc_interface *xen_xc;
>  typedef xc_interface xenforeignmemory_handle;
>  typedef xc_evtchn xenevtchn_handle;
>  typedef xc_gnttab xengnttab_handle;
> +typedef evtchn_port_or_error_t xenevtchn_port_or_error_t;
> 
>  #define xenevtchn_open(l, f) xc_evtchn_open(l, f);
>  #define xenevtchn_close(h) xc_evtchn_close(h)
> --
> Anthony PERARD

[Qemu-devel] [PATCH] qemu.py: Fix error message when qemu dies from signal

2019-01-11 Thread Eric Blake

When qemu dies from a signal, the python code gets a negative
value for exitcode; but signal numbers are positive.  Copy the
pattern used in qemu-iotests/iotests.py for reporting a positive
value.

CC: qemu-triv...@nongnu.org
Signed-off-by: Eric Blake 
---
 scripts/qemu.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/qemu.py b/scripts/qemu.py
index 6e3b0e67719..0a5e02eb56e 100644
--- a/scripts/qemu.py
+++ b/scripts/qemu.py
@@ -351,7 +351,7 @@ class QEMUMachine(object):
 command = ' '.join(self._qemu_full_args)
 else:
 command = ''
-LOG.warn(msg, exitcode, command)
+LOG.warn(msg, -exitcode, command)

 self._launched = False

-- 
2.20.1

Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability

2019-01-11 Thread Dr. David Alan Gilbert

* Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> 10.01.2019, 23:12, "Dr. David Alan Gilbert" :
> > * Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> >>  Hi,
> >>
> >>  The series adds migration capability which allows to skip 'external' RAM 
> >> blocks
> >>  during migration. External block is a RAMBlock which available from the 
> >> outside
> >>  of current QEMU process (e.g. file in /dev/shm). It's useful for fast 
> >> local
> >>  migration to update QEMU for the running guests.
> >
> > Hi Yury,
> >   There have been a few similar patch series around from people wanting
> > to do similar things.
> >   In particular Lai Jiangshan's 
> > https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
> > and Cédric Le Goater wanted to skip regions for a different reason.
> >
> >   We merged some of Cédric's code last year so that we now
> > have the qemu_ram_is_migratable() function - and we should be reusing
> > that to skip things rather than adding a new check that we have to add
> > everywhere.
> >
> 
> I didn't see the series, so I'll check it, thanks!
> But I saw qemu_ram_is_migratable() function and corresponding patch.
> It's very close to my needs, but it works a bit different IIUC:
> 1. Not migratable blocks isn't validated (existence and size) during 
> migration,
> 2. "Migratable" state is determined during the block creation time.
>Such case isn't valid because of it:
>* Source has one migratable and one not migratable RAM blocks,
>* Target has the same (idstr) blocks, but both are not migratable.
>Thus, target will not expect pages for not migratable blocks.

I've added Cédric to the mail;
there were other cases people were interested in, including switching
it dynamically, just no one else used it yet.

I'd prefer it if you did modify the is_migratable - that will fix a lot
of other places in the migration code to avoid your blocks;  I think the
best thing is for you to add a spearate 'needs_migration_check' for
those blocks like yours which you want to check but you don't send the
data.  Please don't change the format of the migratino stream except
in the case where you are sending those to be checked.

> >   Also, ypu're skipping 'external' things, I think the other suggestion
> > was to skip 'shared' things (i.e. anything with share=0); skipping
> > share=on cases sounds easier to me.
> 
> I agree that introducing new term is a complication, but 'share' and 
> 'external'
> terms have important differences (I'll describe it below).
> 
> Just to clarify:
> * 'share' means that other processes has an access to such memory,
> * 'external' means file backed memory.
> 
> There is another use case I wanted to support (I had to write about it in
> the cover letter, sorry..):
> 1. Migrate source VM to file and kill source,
> 2. Start target VM and migrate it from file.
> In such case source VM may have memory-backend-ram with share=off, it's ok.
> 
> Thus, in the new migration capability I want to migrate memory that meets
> three conditions:
> 1. The source will not use the memory after migration ends,
> 2. The source may exit before target starts (migrate to file),
> 3. The target has an access to the memory.
> 
> I think 'external' fits them better than 'share'.

Are you sure that with share=off (the default), in the case where the
source shuts down, that the changes have been written to the backing
file?

I'm also wondering if perhaps we'd be better having an explicit
migrate=off property on memory objects rather than trying to guess from
the share= or the fact it's an external path.

Igor: Does that make sense to you?

Dave

> >
> > Dave
> >
> >>  Patches:
> >>  1. Add offset validation to make sure that external RAM block has the same
> >> physical offset on target side,
> >>  2. Add RAM_EXTERNAL flag to determine external RAM blocks,
> >>  3. Add ignore-external migration capability,
> >>  4. Add a test.
> >>
> >>  Usage example:
> >>  1. Start source VM:
> >> qemu-system-x86 \
> >>   -m 4G \
> >>   -object 
> >> memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
> >>   -numa node,memdev=mem0 \
> >>   -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \
> >>
> >>  2. Start target VM:
> >> qemu-system-x86 \
> >>   -m 4G \
> >>   -object 
> >> memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
> >>   -numa node,memdev=mem0 \
> >>   -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
> >>   -incoming defer
> >>
> >>  3. Enable ignore-external capability on both VMs:
> >> { "execute": "migrate-set-capabilities" , "arguments":
> >>   { "capabilities": [ { "capability": "x-ignore-external", "state": 
> >> true } ] } }
> >>
> >>  4. Start migration.
> >>
> >>  Regards,
> >>  Yury
> >>
> >>  Yury Kotov (4):
> >>migration: add RAMBlock's offset validation
> >>exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
> >>migration: introduce ignore-external capability
> >>

Re: [Qemu-devel] [PATCH v3 0/8] Promote x-nbd-server-add-bitmap to stable

2019-01-11 Thread Eric Blake

On 1/11/19 1:47 PM, Eric Blake wrote:
> Or rather, move its functionality into nbd-server-add.  And as
> a side effect, teach qemu-nbd how to export a persistent bitmap
> without having to go through a qemu process and several QMP
> commands.
> 
> Based-on: <20181221093529.23855-1-js...@redhat.com>
> [jsnow: 0/11 bitmaps: remove x- prefix from QMP api]
> Based-on: <2019063519.11457-1-phi...@redhat.com>
> [philmd: qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol]
> 
> Available at: https://repo.or.cz/qemu/ericb.git nbd-bitmap-add-v3
> 
> Since v2:
> - split old patch 1 into 3 parts
> - add even more tests of expected error messages
> - rebase on top of 'exp' rename
> - improve commit messages
> - add some R-b where it made sense
> 
> 001/8:[down] 'nbd: Add some error case testing to iotests 223'
> 002/8:[down] 'nbd: Forbid nbd-server-stop when server is not running'
> 003/8:[0011] [FC] 'nbd: Only require disabled bitmap for read-only exports'
> 004/8:[0020] [FC] 'nbd: Merge nbd_export_set_name into nbd_export_new'
> 005/8:[0008] [FC] 'nbd: Allow bitmap export during QMP nbd-server-add'
> 006/8:[] [--] 'nbd: Remove x-nbd-server-add-bitmap'
> 007/8:[0009] [FC] 'nbd: Merge nbd_export_bitmap into nbd_export_new'
> 008/8:[0004] [FC] 'qemu-nbd: Add --bitmap=NAME option'

Queuing on my NBD tree; I'm planning on sending a pull request Monday
morning that includes these patches, unless a last-minute review finding
changes things.

https://repo.or.cz/qemu/ericb.git nbd

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v3 5/8] nbd: Allow bitmap export during QMP nbd-server-add

2019-01-11 Thread Eric Blake

With the experimental x-nbd-server-add-bitmap command, there was
a window of time where an NBD client could see the export but not
the associated dirty bitmap, which can cause a client that planned
on using the dirty bitmap to be forced to treat the entire image
as dirty as a safety fallback.  Furthermore, if the QMP client
successfully exports a disk but then fails to add the bitmap, it
has to take on the burden of removing the export.  Since we don't
allow changing the exposed dirty bitmap (whether to a different
bitmap, or removing advertisement of the bitmap), it is nicer to
make the bitmap tied to the export at the time the export is
created, with automatic failure to export if the bitmap is not
available.

The experimental command included an optional 'bitmap-export-name'
field for remapping the name exposed over NBD to be different from
the bitmap name stored on disk.  However, my libvirt demo code
for implementing differential backups on top of persistent bitmaps
did not need to take advantage of that feature (it is instead
possible to create a new temporary bitmap with the desired name,
use block-dirty-bitmap-merge to merge one or more persistent
bitmaps into the temporary, then associate the temporary with the
NBD export, if control is needed over the exported bitmap name).
Hence, I'm not copying that part of the experiment over to the
stable addition. For more details on the libvirt demo, see
https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html,
https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat

This patch focuses on the user interface, and reduces (but does
not completely eliminate) the window where an NBD client can see
the export but not the dirty bitmap, with less work to clean up
after errors.  Later patches will add further cleanups now that
this interface is declared stable via a single QMP command,
including removing the race window.

Update test 223 to use the new interface.

Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block.json|  7 ++-
 blockdev-nbd.c | 12 +++-
 hmp.c  |  5 +++--
 tests/qemu-iotests/223 | 19 ---
 tests/qemu-iotests/223.out |  5 +
 5 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index 11f01f28efe..3d70420f763 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -246,6 +246,10 @@
 #
 # @writable: Whether clients should be able to write to the device via the
 # NBD connection (default false).
+
+# @bitmap: Also export the dirty bitmap reachable from @device, so the
+#  NBD client can use NBD_OPT_SET_META_CONTEXT with
+#  "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
 #
 # Returns: error if the server is not running, or export with the same name
 #  already exists.
@@ -253,7 +257,8 @@
 # Since: 1.3.0
 ##
 { 'command': 'nbd-server-add',
-  'data': {'device': 'str', '*name': 'str', '*writable': 'bool'} }
+  'data': {'device': 'str', '*name': 'str', '*writable': 'bool',
+   '*bitmap': 'str' } }

 ##
 # @NbdServerRemoveMode:
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 582ffded77f..ec8cf0ab8c3 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -140,7 +140,8 @@ void qmp_nbd_server_start(SocketAddressLegacy *addr,
 }

 void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
-bool has_writable, bool writable, Error **errp)
+bool has_writable, bool writable,
+bool has_bitmap, const char *bitmap, Error **errp)
 {
 BlockDriverState *bs = NULL;
 BlockBackend *on_eject_blk;
@@ -185,6 +186,15 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
 nbd_export_put(exp);
+
+if (has_bitmap) {
+Error *err = NULL;
+nbd_export_bitmap(exp, bitmap, bitmap, );
+if (err) {
+error_propagate(errp, err);
+nbd_export_remove(exp, NBD_SERVER_REMOVE_MODE_HARD, NULL);
+}
+}
 }

 void qmp_nbd_server_remove(const char *name,
diff --git a/hmp.c b/hmp.c
index 80aa5ab504b..8da5fd8760a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2326,7 +2326,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict 
*qdict)
 }

 qmp_nbd_server_add(info->value->device, false, NULL,
-   true, writable, _err);
+   true, writable, false, NULL, _err);

 if (local_err != NULL) {
 qmp_nbd_server_stop(NULL);
@@ -2347,7 +2347,8 @@ void hmp_nbd_server_add(Monitor *mon, const QDict *qdict)
 bool writable = qdict_get_try_bool(qdict, "writable", false);
 Error *local_err = NULL;

-qmp_nbd_server_add(device, !!name, name, true, writable, _err);
+

[Qemu-devel] [PATCH v3 6/8] nbd: Remove x-nbd-server-add-bitmap

2019-01-11 Thread Eric Blake

Now that nbd-server-add can do the same functionality (well, other
than making the exported bitmap name different than the underlying
bitamp - but we argued that was not essential, since it is just as
easy to create a new non-persistent bitmap with the desired name),
we no longer need the experimental separate command.

Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block.json | 23 ---
 blockdev-nbd.c  | 23 ---
 2 files changed, 46 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index 3d70420f763..5a79d639e8c 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -301,29 +301,6 @@
 { 'command': 'nbd-server-remove',
   'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} }

-##
-# @x-nbd-server-add-bitmap:
-#
-# Expose a dirty bitmap associated with the selected export. The bitmap search
-# starts at the device attached to the export, and includes all backing files.
-# The exported bitmap is then locked until the NBD export is removed.
-#
-# @name: Export name.
-#
-# @bitmap: Bitmap name to search for.
-#
-# @bitmap-export-name: How the bitmap will be seen by nbd clients
-#  (default @bitmap)
-#
-# Note: the client must use NBD_OPT_SET_META_CONTEXT with a query of
-# "qemu:dirty-bitmap:NAME" (where NAME matches @bitmap-export-name) to access
-# the exposed bitmap.
-#
-# Since: 3.0
-##
-  { 'command': 'x-nbd-server-add-bitmap',
-'data': {'name': 'str', 'bitmap': 'str', '*bitmap-export-name': 'str'} }
-
 ##
 # @nbd-server-stop:
 #
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index ec8cf0ab8c3..cd86b38cdaa 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -233,26 +233,3 @@ void qmp_nbd_server_stop(Error **errp)
 nbd_server_free(nbd_server);
 nbd_server = NULL;
 }
-
-void qmp_x_nbd_server_add_bitmap(const char *name, const char *bitmap,
- bool has_bitmap_export_name,
- const char *bitmap_export_name,
- Error **errp)
-{
-NBDExport *exp;
-
-if (!nbd_server) {
-error_setg(errp, "NBD server not running");
-return;
-}
-
-exp = nbd_export_find(name);
-if (exp == NULL) {
-error_setg(errp, "Export '%s' is not found", name);
-return;
-}
-
-nbd_export_bitmap(exp, bitmap,
-  has_bitmap_export_name ? bitmap_export_name : bitmap,
-  errp);
-}
-- 
2.20.1

[Qemu-devel] [PATCH v3 8/8] qemu-nbd: Add --bitmap=NAME option

2019-01-11 Thread Eric Blake

Having to fire up qemu, then use QMP commands for nbd-server-start
and nbd-server-add, just to expose a persistent dirty bitmap, is
rather tedious.  Make it possible to expose a dirty bitmap using
just qemu-nbd (of course, for now this only works when qemu-nbd is
visiting a BDS formatted as qcow2).

Of course, any good feature also needs unit testing, so expand
iotest 223 to cover it.

Signed-off-by: Eric Blake 

---
v3: rebase to earlier changes
---
 qemu-nbd.texi  |  4 
 qemu-nbd.c | 10 --
 tests/qemu-iotests/223 | 18 +-
 tests/qemu-iotests/223.out | 12 +++-
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 9a84e81eed9..96b1546006a 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -45,6 +45,10 @@ auto-detecting
 Export the disk as read-only
 @item -P, --partition=@var{num}
 Only expose partition @var{num}
+@item -B, --bitmap=@var{name}
+If @var{filename} has a qcow2 persistent bitmap @var{name}, expose
+that bitmap via the ``qemu:dirty-bitmap:@var{name}'' context
+accessible through NBD_OPT_SET_META_CONTEXT.
 @item -s, --snapshot
 Use @var{filename} as an external snapshot, create a temporary
 file with backing_file=@var{filename}, redirect the write to
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 1552274c189..51b55f2e066 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -95,6 +95,7 @@ static void usage(const char *name)
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET   offset into the image\n"
 "  -P, --partition=NUM   only expose partition NUM\n"
+"  -B, --bitmap=NAME expose a persistent dirty bitmap\n"
 "\n"
 "General purpose options:\n"
 "  --object type,id=ID,...   define an object such as 'secret' for providing\n"
@@ -509,7 +510,7 @@ int main(int argc, char **argv)
 off_t fd_size;
 QemuOpts *sn_opts = NULL;
 const char *sn_id_or_name = NULL;
-const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:D:";
+const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:D:B:";
 struct option lopt[] = {
 { "help", no_argument, NULL, 'h' },
 { "version", no_argument, NULL, 'V' },
@@ -519,6 +520,7 @@ int main(int argc, char **argv)
 { "offset", required_argument, NULL, 'o' },
 { "read-only", no_argument, NULL, 'r' },
 { "partition", required_argument, NULL, 'P' },
+{ "bitmap", required_argument, NULL, 'B' },
 { "connect", required_argument, NULL, 'c' },
 { "disconnect", no_argument, NULL, 'd' },
 { "snapshot", no_argument, NULL, 's' },
@@ -558,6 +560,7 @@ int main(int argc, char **argv)
 QDict *options = NULL;
 const char *export_name = ""; /* Default export name */
 const char *export_description = NULL;
+const char *bitmap = NULL;
 const char *tlscredsid = NULL;
 bool imageOpts = false;
 bool writethrough = true;
@@ -695,6 +698,9 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 break;
+case 'B':
+bitmap = optarg;
+break;
 case 'k':
 sockpath = optarg;
 if (sockpath[0] != '/') {
@@ -1016,7 +1022,7 @@ int main(int argc, char **argv)
 }

 export = nbd_export_new(bs, dev_offset, fd_size, export_name,
-export_description, NULL, nbdflags,
+export_description, bitmap, nbdflags,
 nbd_export_closed, writethrough, NULL,
 _fatal);

diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index 0bcc98a8677..773892dbe60 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -25,6 +25,7 @@ status=1 # failure is the default!

 _cleanup()
 {
+nbd_server_stop
 _cleanup_test_img
 _cleanup_qemu
 rm -f "$TEST_DIR/nbd"
@@ -35,6 +36,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.rc
 . ./common.filter
 . ./common.qemu
+. ./common.nbd

 _supported_fmt qcow2
 _supported_proto file # uses NBD as well
@@ -163,7 +165,7 @@ $QEMU_IMG map --output=json --image-opts \
   "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b2" | _filter_qemu_img_map

 echo
-echo "=== End NBD server ==="
+echo "=== End qemu NBD server ==="
 echo

 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
@@ -176,6 +178,20 @@ _send_qemu_cmd $QEMU_HANDLE 
'{"execute":"nbd-server-stop"}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "error" # Again
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"

+echo
+echo "=== Use qemu-nbd as server ==="
+echo
+
+nbd_server_start_unix_socket -r -f $IMGFMT -B b "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_IMG map --output=json --image-opts \
+  "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b" | _filter_qemu_img_map
+
+nbd_server_start_unix_socket -f $IMGFMT -B b2 "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_IMG map

[Qemu-devel] [PATCH v3 7/8] nbd: Merge nbd_export_bitmap into nbd_export_new

2019-01-11 Thread Eric Blake

We only have one caller that wants to export a bitmap name,
which it does right after creation of the export. But there is
still a brief window of time where an NBD client could see the
export but not the dirty bitmap, which a robust client would
have to interpret as meaning the entire image should be treated
as dirty.  Better is to eliminate the window entirely, by
inlining nbd_export_bitmap() into nbd_export_new(), and refusing
to create the bitmap in the first place if the requested bitmap
can't be located.

We also no longer need logic for setting a different bitmap
name compared to the bitmap being exported.

Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 

---
v3: rebase to 'exp' rename [Phillipe]
---
 include/block/nbd.h |  9 ++---
 blockdev-nbd.c  | 11 +-
 nbd/server.c| 87 +
 qemu-nbd.c  |  5 +--
 4 files changed, 47 insertions(+), 65 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2f9a2aeb73c..1971b557896 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -296,9 +296,9 @@ typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
   const char *name, const char *description,
-  uint16_t nbdflags, void (*close)(NBDExport *),
-  bool writethrough, BlockBackend *on_eject_blk,
-  Error **errp);
+  const char *bitmap, uint16_t nbdflags,
+  void (*close)(NBDExport *), bool writethrough,
+  BlockBackend *on_eject_blk, Error **errp);
 void nbd_export_close(NBDExport *exp);
 void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error **errp);
 void nbd_export_get(NBDExport *exp);
@@ -319,9 +319,6 @@ void nbd_client_put(NBDClient *client);
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
   Error **errp);

-void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
-   const char *bitmap_export_name, Error **errp);
-
 /* nbd_read
  * Reads @size bytes from @ioc. Returns 0 on success.
  */
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index cd86b38cdaa..c76d5416b90 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -175,7 +175,7 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 writable = false;
 }

-exp = nbd_export_new(bs, 0, -1, name, NULL,
+exp = nbd_export_new(bs, 0, -1, name, NULL, bitmap,
  writable ? 0 : NBD_FLAG_READ_ONLY,
  NULL, false, on_eject_blk, errp);
 if (!exp) {
@@ -186,15 +186,6 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
 nbd_export_put(exp);
-
-if (has_bitmap) {
-Error *err = NULL;
-nbd_export_bitmap(exp, bitmap, bitmap, );
-if (err) {
-error_propagate(errp, err);
-nbd_export_remove(exp, NBD_SERVER_REMOVE_MODE_HARD, NULL);
-}
-}
 }

 void qmp_nbd_server_remove(const char *name,
diff --git a/nbd/server.c b/nbd/server.c
index bb5438c448b..e8c56607eff 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1457,9 +1457,9 @@ static void nbd_eject_notifier(Notifier *n, void *data)

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
   const char *name, const char *description,
-  uint16_t nbdflags, void (*close)(NBDExport *),
-  bool writethrough, BlockBackend *on_eject_blk,
-  Error **errp)
+  const char *bitmap, uint16_t nbdflags,
+  void (*close)(NBDExport *), bool writethrough,
+  BlockBackend *on_eject_blk, Error **errp)
 {
 AioContext *ctx;
 BlockBackend *blk;
@@ -1507,6 +1507,43 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
 }
 exp->size -= exp->size % BDRV_SECTOR_SIZE;

+if (bitmap) {
+BdrvDirtyBitmap *bm = NULL;
+BlockDriverState *bs = blk_bs(blk);
+
+while (true) {
+bm = bdrv_find_dirty_bitmap(bs, bitmap);
+if (bm != NULL || bs->backing == NULL) {
+break;
+}
+
+bs = bs->backing->bs;
+}
+
+if (bm == NULL) {
+error_setg(errp, "Bitmap '%s' is not found", bitmap);
+goto fail;
+}
+
+if ((nbdflags & NBD_FLAG_READ_ONLY) && bdrv_is_writable(bs) &&
+bdrv_dirty_bitmap_enabled(bm)) {
+error_setg(errp,
+   "Enabled bitmap '%s' incompatible with readonly export",
+   bitmap);
+goto fail;
+

[Qemu-devel] [PATCH v3 0/8] Promote x-nbd-server-add-bitmap to stable

2019-01-11 Thread Eric Blake

Or rather, move its functionality into nbd-server-add.  And as
a side effect, teach qemu-nbd how to export a persistent bitmap
without having to go through a qemu process and several QMP
commands.

Based-on: <20181221093529.23855-1-js...@redhat.com>
[jsnow: 0/11 bitmaps: remove x- prefix from QMP api]
Based-on: <2019063519.11457-1-phi...@redhat.com>
[philmd: qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol]

Available at: https://repo.or.cz/qemu/ericb.git nbd-bitmap-add-v3

Since v2:
- split old patch 1 into 3 parts
- add even more tests of expected error messages
- rebase on top of 'exp' rename
- improve commit messages
- add some R-b where it made sense

001/8:[down] 'nbd: Add some error case testing to iotests 223'
002/8:[down] 'nbd: Forbid nbd-server-stop when server is not running'
003/8:[0011] [FC] 'nbd: Only require disabled bitmap for read-only exports'
004/8:[0020] [FC] 'nbd: Merge nbd_export_set_name into nbd_export_new'
005/8:[0008] [FC] 'nbd: Allow bitmap export during QMP nbd-server-add'
006/8:[] [--] 'nbd: Remove x-nbd-server-add-bitmap'
007/8:[0009] [FC] 'nbd: Merge nbd_export_bitmap into nbd_export_new'
008/8:[0004] [FC] 'qemu-nbd: Add --bitmap=NAME option'

Eric Blake (8):
  nbd: Add some error case testing to iotests 223
  nbd: Forbid nbd-server-stop when server is not running
  nbd: Only require disabled bitmap for read-only exports
  nbd: Merge nbd_export_set_name into nbd_export_new
  nbd: Allow bitmap export during QMP nbd-server-add
  nbd: Remove x-nbd-server-add-bitmap
  nbd: Merge nbd_export_bitmap into nbd_export_new
  qemu-nbd: Add --bitmap=NAME option

 qemu-nbd.texi  |   4 ++
 qapi/block.json|  30 ++--
 include/block/nbd.h|  12 ++--
 blockdev-nbd.c |  36 +++---
 hmp.c  |   5 +-
 nbd/server.c   | 136 +
 qemu-nbd.c |  17 +++--
 tests/qemu-iotests/223 |  50 +++---
 tests/qemu-iotests/223.out |  23 +--
 9 files changed, 160 insertions(+), 153 deletions(-)

-- 
2.20.1

[Qemu-devel] [PATCH v3 4/8] nbd: Merge nbd_export_set_name into nbd_export_new

2019-01-11 Thread Eric Blake

The existing NBD code had a weird split where nbd_export_new()
created an export but did not add it to the list of exported
names until a later nbd_export_set_name() came along and grabbed
a second reference on the object; later, the first call to
nbd_export_close() drops the second reference while removing
the export from the list.  This is in part because the QAPI
NbdServerRemoveNode enum documents the possibility of adding a
mode where we could do a soft disconnect: preventing new clients,
but waiting for existing clients to gracefully quit, based on
the mode used when calling nbd_export_close().

But in spite of all that, note that we never change the name of
an NBD export while it is exposed, which means it is easier to
just inline the process of setting the name as part of creating
the export.

Inline the contents of nbd_export_set_name() and
nbd_export_set_description() into the two points in an export
lifecycle where they matter, then adjust both callers to pass
the name up front.  Note that for creation, all callers pass a
non-NULL name, (passing NULL at creation was for old style
servers, but we removed support for that in commit 7f7dfe2a),
so we can add an assert and do things unconditionally; but for
cleanup, because of the dual nature of nbd_export_close(), we
still have to be careful to avoid use-after-free.  Along the
way, add a comment reminding ourselves of the potential of
adding a middle mode disconnect.

Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 

---
v3: add comment on potential close mode semantics [Vladimir],
rebase to 'exp' rename [Phillipe]; no major code change so R-b added
---
 include/block/nbd.h |  3 +--
 blockdev-nbd.c  |  5 ++---
 nbd/server.c| 52 -
 qemu-nbd.c  |  8 +++
 4 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 65402d33964..2f9a2aeb73c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -295,6 +295,7 @@ typedef struct NBDExport NBDExport;
 typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
+  const char *name, const char *description,
   uint16_t nbdflags, void (*close)(NBDExport *),
   bool writethrough, BlockBackend *on_eject_blk,
   Error **errp);
@@ -306,8 +307,6 @@ void nbd_export_put(NBDExport *exp);
 BlockBackend *nbd_export_get_blockdev(NBDExport *exp);

 NBDExport *nbd_export_find(const char *name);
-void nbd_export_set_name(NBDExport *exp, const char *name);
-void nbd_export_set_description(NBDExport *exp, const char *description);
 void nbd_export_close_all(void);

 void nbd_client_new(QIOChannelSocket *sioc,
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index ca584919194..582ffded77f 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -174,14 +174,13 @@ void qmp_nbd_server_add(const char *device, bool 
has_name, const char *name,
 writable = false;
 }

-exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
+exp = nbd_export_new(bs, 0, -1, name, NULL,
+ writable ? 0 : NBD_FLAG_READ_ONLY,
  NULL, false, on_eject_blk, errp);
 if (!exp) {
 return;
 }

-nbd_export_set_name(exp, name);
-
 /* The list of named exports has a strong reference to this export now and
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
diff --git a/nbd/server.c b/nbd/server.c
index 98327088cb4..bb5438c448b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1456,6 +1456,7 @@ static void nbd_eject_notifier(Notifier *n, void *data)
 }

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
+  const char *name, const char *description,
   uint16_t nbdflags, void (*close)(NBDExport *),
   bool writethrough, BlockBackend *on_eject_blk,
   Error **errp)
@@ -1471,6 +1472,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
  * that BDRV_O_INACTIVE is cleared and the image is ready for write
  * access since the export could be available before migration handover.
  */
+assert(name);
 ctx = bdrv_get_aio_context(bs);
 aio_context_acquire(ctx);
 bdrv_invalidate_cache(bs, NULL);
@@ -1494,6 +1496,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
 QTAILQ_INIT(>clients);
 exp->blk = blk;
 exp->dev_offset = dev_offset;
+exp->name = g_strdup(name);
+exp->description = g_strdup(description);
 exp->nbdflags = nbdflags;
 exp->size = size < 0 ? blk_getlength(blk) : size;
 if (exp->size < 0) {
@@ -1513,10 +1517,14 @@ NBDExport *nbd_export_new(BlockDriverState *bs,

[Qemu-devel] [PATCH v3 2/8] nbd: Forbid nbd-server-stop when server is not running

2019-01-11 Thread Eric Blake

Since we already forbid other nbd-server commands when not
in the right state, it is unlikely that any caller was relying
on a second stop to behave as a silent no-op.  Update iotest
223 to show the improved behavior.

Signed-off-by: Eric Blake 

---
v3: new patch
---
 blockdev-nbd.c | 5 +
 tests/qemu-iotests/223 | 2 +-
 tests/qemu-iotests/223.out | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1d170c80b82..ca584919194 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -214,6 +214,11 @@ void qmp_nbd_server_remove(const char *name,

 void qmp_nbd_server_stop(Error **errp)
 {
+if (!nbd_server) {
+error_setg(errp, "NBD server not running");
+return;
+}
+
 nbd_export_close_all();

 nbd_server_free(nbd_server);
diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index 61b46a2f066..a4016091b21 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -172,7 +172,7 @@ _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
   "arguments":{"name":"n2"}}' "error" # Attempt duplicate clean
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
-_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return" # Oops
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "error" # Again
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"

 # success, all done
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index e6ede0591cd..8a4d63a4fc2 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -69,6 +69,6 @@ read 2097152/2097152 bytes at offset 2097152
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "Export 'n2' is not found"}}
 {"return": {}}
-{"return": {}}
+{"error": {"class": "GenericError", "desc": "NBD server not running"}}
 {"return": {}}
 *** done
-- 
2.20.1

[Qemu-devel] [PATCH v3 3/8] nbd: Only require disabled bitmap for read-only exports

2019-01-11 Thread Eric Blake

Our initial implementation of x-nbd-server-add-bitmap put
in a restriction because of incremental backups: in that
usage, we are exporting one qcow2 file (the temporary overlay
target of a blockdev-backup sync:none job) and a dirty bitmap
owned by a second qcow2 file (the source of the
blockdev-backup, which is the backing file of the temporary).
While both qcow2 files are still writable (the target in
order to capture copy-on-write of old contents, and the
source in order to track live guest writes in the meantime),
the NBD client expects to see constant data, including the
dirty bitmap.  An enabled bitmap in the source would be
modified by guest writes, which is at odds with the NBD
export being a read-only constant view, hence the initial
code choice of enforcing a disabled bitmap (the intent is
that the exposed bitmap was disabled in the same transaction
that started the blockdev-backup job, although we don't want
to track enough state to actually enforce that).

However, consider the case of a bitmap contained in a read-only
node (including when the bitmap is found in a backing layer of
the active image).  Because the node can't be modified, the
bitmap won't change due to writes, regardless of whether it is
still enabled.  Forbidding the export unless the bitmap is
disabled is awkward, paritcularly since we can't change the
bitmap to be disabled (because the node is read-only).

Alternatively, consider the case of live storage migration,
where management directs the destination to create a writable
NBD server, then performs a drive-mirror from the source to
the mirror, prior to doing the rest of the live migration.
Since storage migration can be time-consuming, it may be wise
to let the destination include a dirty bitmap to track which
portions it has already received, where even if the migration
is interrupted and restarted, the source can query the
destination block status in order to minimize re-sending
data that has not changed in the meantime on a second attempt.
Such code has not been written, but it makes sense that
letting an active dirty bitmap be exposed and changing
alongside writes may prove useful in the future.

Solve both issues by gating the restriction against a
disabled bitmap to only happen when the caller has requested
a read-only export, and where the BDS that owns the bitmap
(whether or not it is the BDS handed to nbd_export_new() or
from its backing chain) is still writable.

Update iotest 223 to show the looser behavior by leaving
a bitmap enabled the whole run; note that we have to tear
down and re-export a node when handling an error.

Signed-off-by: Eric Blake 

---
v3: split out unrelated test changes, improve commit message [Vladimir]
v2: new patch
---
 nbd/server.c   |  7 +--
 tests/qemu-iotests/223 | 10 +++---
 tests/qemu-iotests/223.out |  3 ++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 7af0ddffb20..98327088cb4 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2456,8 +2456,11 @@ void nbd_export_bitmap(NBDExport *exp, const char 
*bitmap,
 return;
 }

-if (bdrv_dirty_bitmap_enabled(bm)) {
-error_setg(errp, "Bitmap '%s' is enabled", bitmap);
+if ((exp->nbdflags & NBD_FLAG_READ_ONLY) && bdrv_is_writable(bs) &&
+bdrv_dirty_bitmap_enabled(bm)) {
+error_setg(errp,
+   "Enabled bitmap '%s' incompatible with readonly export",
+   bitmap);
 return;
 }

diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index a4016091b21..f200e313c06 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -61,6 +61,8 @@ echo "=== Create partially sparse image, then add dirty 
bitmaps ==="
 echo

 # Two bitmaps, to contrast granularity issues
+# Also note that b will be disabled, while b2 is left enabled, to
+# check for read-only interactions
 _make_test_img -o cluster_size=4k 4M
 $QEMU_IO -c 'w -P 0x11 1M 2M' "$TEST_IMG" | _filter_qemu_io
 run_qemu <

[Qemu-devel] [PATCH v3 1/8] nbd: Add some error case testing to iotests 223

2019-01-11 Thread Eric Blake

Testing success paths is important, but it's also nice to highlight
expected failure handling, to show that we don't crash, and so that
upcoming tests that change behavior can demonstrate the resulting
effects on error paths.

Add the following errors:
Attempting to export without a running server
Attempting to start a second server
Attempting to export a bad node name
Attempting to export a name that is already exported
Attempting to export an enabled bitmap
Attempting to clean an already cleaned export
Attempting to quit server a second time

All of these properly complain except for a second server-stop,
which will be fixed next.

Signed-off-by: Eric Blake 

---
v3: split out unrelated error path testing and expand [Vladimir]
---
 tests/qemu-iotests/223 | 19 +--
 tests/qemu-iotests/223.out |  7 +++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index 5513dc62159..61b46a2f066 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -107,6 +107,7 @@ echo

 _launch_qemu 2> >(_filter_nbd)

+# Intentionally provoke some errors as well, to check error handling
 silent=
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
@@ -114,17 +115,28 @@ _send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
 "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"block-dirty-bitmap-disable",
   "arguments":{"node":"n", "name":"b"}}' "return"
-_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-dirty-bitmap-disable",
-  "arguments":{"node":"n", "name":"b2"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
+  "arguments":{"device":"n"}}' "error" # Attempt add without server
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
   "arguments":{"addr":{"type":"unix",
 "data":{"path":"'"$TEST_DIR/nbd"'"' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
+  "arguments":{"addr":{"type":"unix",
+"data":{"path":"'"$TEST_DIR/nbd"1'"' "error" # Attempt second server
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
   "arguments":{"device":"n"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
+  "arguments":{"device":"nosuch"}}' "error" # Attempt to export missing node
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
+  "arguments":{"device":"n"}}' "error" # Attempt to export same name twice
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"x-nbd-server-add-bitmap",
   "arguments":{"name":"n", "bitmap":"b"}}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
   "arguments":{"device":"n", "name":"n2"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"x-nbd-server-add-bitmap",
+  "arguments":{"name":"n2", "bitmap":"b2"}}' "error" # Attempt enabled bitmap
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-dirty-bitmap-disable",
+  "arguments":{"node":"n", "name":"b2"}}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"x-nbd-server-add-bitmap",
   "arguments":{"name":"n2", "bitmap":"b2"}}' "return"

@@ -157,7 +169,10 @@ _send_qemu_cmd $QEMU_HANDLE 
'{"execute":"nbd-server-remove",
   "arguments":{"name":"n"}}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
   "arguments":{"name":"n2"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
+  "arguments":{"name":"n2"}}' "error" # Attempt duplicate clean
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return" # Oops
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"

 # success, all done
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index 99ca172fbb8..e6ede0591cd 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -27,10 +27,15 @@ wrote 2097152/2097152 bytes at offset 2097152
 {"return": {}}
 {"return": {}}
 {"return": {}}
+{"error": {"class": "GenericError", "desc": "NBD server not running"}}
 {"return": {}}
+{"error": {"class": "GenericError", "desc": "NBD server already running"}}
 {"return": {}}
+{"error": {"class": "GenericError", "desc": "Cannot find device=nosuch nor 
node_name=nosuch"}}
+{"error": {"class": "GenericError", "desc": "NBD server already has export 
named 'n'"}}
 {"return": {}}
 {"return": {}}
+{"error": {"class": "GenericError", "desc": "Bitmap 'b2' is enabled"}}
 {"return": {}}
 {"return": {}}

@@ -62,6 +67,8 @@ read 2097152/2097152 bytes at offset 2097152

 {"return": {}}
 {"return": {}}
+{"error": {"class": "GenericError", "desc": "Export 'n2' is not found"}}
+{"return": {}}
 {"return": {}}
 {"return": {}}
 *** done
-- 
2.20.1

Re: [Qemu-devel] [PATCH v6 01/11] blockdev: abort transactions in reverse order

2019-01-11 Thread Eric Blake

On 1/11/19 11:52 AM, Eric Blake wrote:
> On 12/21/18 3:35 AM, John Snow wrote:
>> Presently, we abort transactions in the same order they were processed in.
>> Bitmap commands, though, attempt to restore backup data structures on abort.
>>
>> That's not valid, they need to be aborted in reverse chronological order.
>>
>> Replace the QSIMPLEQ data structure with a QTAILQ one, so we can iterate
>> in reverse for the abort phase of the transaction.
> 
> This patch conflicts with Paolo's improvements to QTAILQ that just
> landed; the resolution should be obvious, but at this point, you'll want
> to post a v7.
> 
> Or did you want me to try and fix the conflict, and take the series
> through my NBD tree since I have patches based on top of it?

Actually, I went ahead and queued this series in my NBD tree after
making the fixes, so I can get the removal of x- in place sooner for
easier integration testing on my libvirt code.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/1] block: Eliminate the S_1KiB, S_2KiB, ... macros

2019-01-11 Thread Eric Blake

On 1/11/19 1:14 PM, Markus Armbruster wrote:
> We define 54 macros for the powers of two >= 1024.  We use six, in six
> macro definitions.  Four of them could just as well use the common MiB
> macro, so do that.  The remaining two can't, because they get passed
> to stringify.  Replace the macro by the literal number there.
> Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a
> comment there.  The other instance is a wash: 65536 vs S_64KiB.  65536
> has been good enough for more than seven years there.
> 
> This effectively reverts commit 540b8492618 and 1240ac558d3.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  block/qcow2.h| 10 +++---
>  block/vdi.c  |  3 +-
>  include/qemu/units.h | 73 
>  3 files changed, 7 insertions(+), 79 deletions(-)

Renders part of my v3 series useless (since I effectively did the same
reversions), but is indeed the simplest baseline that can possibly work.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Alex Bennée



Greg Kurz  writes:

> On Fri, 11 Jan 2019 16:41:41 +0100
> Paolo Bonzini  wrote:
>
>> On 11/01/19 16:28, Alex Bennée wrote:
>> >> Why not g_usleep?  It already does a while loop around nanosleep (which
>> >> returns the remaining time in the wait, like select but unlike sleep and
>> >> poll).
>> > Yeah I'm testing that now. However I have managed to trigger:
>> >
>> >   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): 
>> > (35584 == 0)
>>
>> I think that's a good old SIGSEGV (0x8B00).
>>
>
> Hmmm... system() returns a "wait status" that can  be examined using the
> macros described in waitpid(2), and we have:
>
> /* If WIFEXITED(STATUS), the low-order 8 bits of the status.  */
> #define   __WEXITSTATUS(status)   (((status) & 0xff00) >> 8)
>
> So this rather looks like a 139 exit status to me... Not sure how
> this can happen though.

Yeah the child segfaulted in mcount while closing down. I've started a
new thread with the details of the remaining failure modes:

  Subject: Remaining CI failures
  Date: Fri, 11 Jan 2019 19:10:07 +
  Message-ID: <87lg3rui28@linaro.org>


--
Alex Bennée

Re: [Qemu-devel] qemu-user-linux: how could I measure performance for aarch64 and arm?

2019-01-11 Thread Matwey V. Kornilov

пт, 11 янв. 2019 г. в 12:52, Peter Maydell :
>
> On Thu, 10 Jan 2019 at 19:33, Matwey V. Kornilov
>  wrote:
> > I am running the same application compiled for aarch64 and armv7l on
> > x86_64 platform using qemu-user-linux tools.
> >
> > I see dramatic performance difference (30 times) between emulated
> > architectures: aarch64 runs for ~4 minutes, armv7l runs for ~2 hours.
> > I do understand that CPU architecture emulation is inherently slow
> > thing, but my question is about the difference.
> >
> > How could I debug to understand what is the reason for such a big
> > difference? I've already tried to run stress-ng compiled for this two
> > architectures, but it leads to the same performance per second.
> >
> > I am running qemu 2.11, should I try other version?
>
> Yes, do try 3.1 -- we have done some overall TCG performance
> improvements.

Indeed, qemu-arm from master runs for 4 minutes where 2.11 runs for 2
hours for me. It is impressive improvement.

>
> For a big difference between target architectures like that,
> I would try starting by using some host performance tools on
> the two runs to see where all the time is being taken in
> the armv7l guest run -- is it all in translated guest code,
> or is there more time (proportionally) spent in particular
> parts of the QEMU C code? Does the armv7l version do
> many more or different syscalls (check with the QEMU -strace
> option) ?
>
> Also you should check performance on h/w 32 bit vs
> 64-bit Arm if you can, to confirm that it's not just
> that the guest application runs much slower there.
> (If you don't have the arm hardware you could at least
> check x86 32-bit vs 64-bit.)
>
> thanks
> -- PMM



-- 
With best regards,
Matwey V. Kornilov

[Qemu-devel] [PATCH 1/1] block: Eliminate the S_1KiB, S_2KiB, ... macros

2019-01-11 Thread Markus Armbruster

We define 54 macros for the powers of two >= 1024.  We use six, in six
macro definitions.  Four of them could just as well use the common MiB
macro, so do that.  The remaining two can't, because they get passed
to stringify.  Replace the macro by the literal number there.
Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a
comment there.  The other instance is a wash: 65536 vs S_64KiB.  65536
has been good enough for more than seven years there.

This effectively reverts commit 540b8492618 and 1240ac558d3.

Signed-off-by: Markus Armbruster 
---
 block/qcow2.h| 10 +++---
 block/vdi.c  |  3 +-
 include/qemu/units.h | 73 
 3 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index a98d24500b..2380cbfb41 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -50,11 +50,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE S_8MiB
+#define QCOW_MAX_REFTABLE_SIZE (8 * MiB)
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE S_32MiB
+#define QCOW_MAX_L1_SIZE (32 * MiB)
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -81,15 +81,15 @@
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
 #ifdef CONFIG_LINUX
-#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
 #define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 #else
-#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+#define DEFAULT_L2_CACHE_MAX_SIZE (8 * MiB)
 /* Cache clean interval is currently available only on Linux, so must be 0 */
 #define DEFAULT_CACHE_CLEAN_INTERVAL 0
 #endif
 
-#define DEFAULT_CLUSTER_SIZE S_64KiB
+#define DEFAULT_CLUSTER_SIZE 65536
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
diff --git a/block/vdi.c b/block/vdi.c
index 2380daa583..bf1d19dd68 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -85,7 +85,8 @@
 #define BLOCK_OPT_STATIC "static"
 
 #define SECTOR_SIZE 512
-#define DEFAULT_CLUSTER_SIZE S_1MiB
+#define DEFAULT_CLUSTER_SIZE 1048576
+/* Note: can't use 1 * MiB, because it's passed to stringify() */
 
 #if defined(CONFIG_VDI_DEBUG)
 #define VDI_DEBUG 1
diff --git a/include/qemu/units.h b/include/qemu/units.h
index 1c959d182e..692db3fbb2 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,77 +17,4 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
-/*
- * The following lookup table is intended to be used when a literal string of
- * the number of bytes is required (for example if it needs to be stringified).
- * It can also be used for generic shortcuts of power-of-two sizes.
- * This table is generated using the AWK script below:
- *
- *  BEGIN {
- *  suffix="KMGTPE";
- *  for(i=10; i<64; i++) {
- *  val=2**i;
- *  s=substr(suffix, int(i/10), 1);
- *  n=2**(i%10);
- *  pad=21-int(log(n)/log(10));
- *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
- *  }
- *  }
- */
-
-#define S_1KiB  1024
-#define S_2KiB  2048
-#define S_4KiB  4096
-#define S_8KiB  8192
-#define S_16KiB16384
-#define S_32KiB32768
-#define S_64KiB65536
-#define S_128KiB  131072
-#define S_256KiB  262144
-#define S_512KiB  524288
-#define S_1MiB   1048576
-#define S_2MiB   2097152
-#define S_4MiB   4194304
-#define S_8MiB   8388608
-#define S_16MiB 16777216
-#define S_32MiB 33554432
-#define S_64MiB 67108864
-#define S_128MiB   134217728
-#define S_256MiB   268435456
-#define S_512MiB   536870912
-#define S_1GiB1073741824
-#define S_2GiB2147483648
-#define S_4GiB4294967296
-#define S_8GiB8589934592
-#define S_16GiB  17179869184
-#define S_32GiB  34359738368
-#define S_64GiB  68719476736
-#define S_128GiB137438953472
-#define S_256GiB274877906944
-#define S_512GiB549755813888
-#define S_1TiB 1099511627776
-#define S_2TiB 219902322
-#define S_4TiB 4398046511104
-#define S_8TiB 8796093022208
-#define S_16TiB   17592186044416
-#define S_32TiB   35184372088832
-#define S_64TiB   70368744177664
-#define S_128TiB 140737488355328
-#define S_256TiB 281474976710656
-#define S_512TiB 562949953421312
-#define S_1PiB  1125899906842624
-#define S_2PiB  2251799813685248
-#define S_4PiB  4503599627370496
-#define S_8PiB  9007199254740992
-#define S_16PiB

[Qemu-devel] [PATCH 0/1] block: Eliminate the S_1KiB, S_2KiB, ... macros

2019-01-11 Thread Markus Armbruster

Back in September, Leonid Block added a whole bunch of macros (commit
540b8492618) to improve readability of qcow2.h a bit (commit
b6a95c6d100).  He later used them to fix the "vdi" driver's parameter
cluster_size's default value (commit 3dd5b8f4718).  He has now
proposed a further patch[1] to auto-generate these macros.  That patch
feels overengineered to me.

On closer examination, I found I dislike the macros before his new
patch.  So did Eric Blake.

The macros exist because the common KiB, MiB, ... macros aren't usable
when you need a literal rather than a constant expression.
stringify() does, and we use it to define the QemuOpts default value.

Eric proposed to improve QemuOpts to accept integer default values,
too[2].  Before I review that patch series, I want to establish a
"stupidest solution that can possibly work" baseline.  And that's what
this patch is.

[1] [PATCH v2 0/1] include: Auto-generate the sizes lookup table
Message-ID: <20190103213320.2653-1-lbl...@janustech.com>

[2] [PATCH v3 0/6] include: Auto-generate the sizes lookup table
Message-Id: <20190110191901.5082-1-ebl...@redhat.com>

Markus Armbruster (1):
  block: Eliminate the S_1KiB, S_2KiB, ... macros

 block/qcow2.h| 10 +++---
 block/vdi.c  |  3 +-
 include/qemu/units.h | 73 
 3 files changed, 7 insertions(+), 79 deletions(-)

-- 
2.17.2

[Qemu-devel] Remaining CI failures

2019-01-11 Thread Alex Bennée



Hi,

So trying to narrow down the remaining failures in the CI system. There
is one with a patch in flight (use g_usleep instead of sleep) but there
remains two failure modes, both erratic.

tests/qht-par:

I can trigger this on my dev machine with a gprof enabled build:

  # QEMU configure log Fri Jan 11 14:10:45 GMT 2019
  # Configured with: './configure' '--disable-tools' '--disable-docs' 
'--enable-gprof' '--enable-gcov'

I only seem to be able to trigger it when running via the wrapper in the
make system:

  retry.py -n 30 --invert make check-tests/test-qht-par

Eventually this crashes with:

  ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 
0)

Leaving a core dump for the child:

  Core was generated by `tests/qht-bench -R -S0.1 -D1 -N1 -n 2 -u 20 -d 1'
  (gdb) info thread
Id   Target Id Frame
  * 1Thread 0x76a7e700 (LWP 15473) 0x5557c306 in 
call_rcu_thread (opaque=0x0) at util/rcu.c:255
2Thread 0x77fbe780 (LWP 15472) 0x555b8d50 in 
gcov_read_words ()
  (gdb) bt
  #0  __mcount_internal (frompc=, selfpc=93824992383630) at 
mcount.c:98
  #1  0x76e15e24 in mcount () at ../sysdeps/x86_64/_mcount.S:51
  #2  0x5557928e in qemu_event_reset (ev=0x3cc692b8d452f400) at 
util/qemu-thread-posix.c:408
  #3  0x5557c306 in call_rcu_thread (opaque=0x0) at util/rcu.c:255
  #4  0x55579630 in qemu_thread_start (args=0x55808080) at 
util/qemu-thread-posix.c:502
  #5  0x770e96db in start_thread (arg=0x76a7e700) at 
pthread_create.c:463
  #6  0x76e1288f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
  (gdb) thread 2
  [Switching to thread 2 (Thread 0x77fbe780 (LWP 15472))]
  #0  0x555b8d50 in gcov_read_words ()
  (gdb) bt
  #0  0x555b8d50 in gcov_read_words ()
  #1  0x555b9453 in __gcov_read_summary ()
  #2  0x555ba461 in gcov_do_dump ()
  #3  0x555bab62 in __gcov_exit ()
  #4  0x555b8c22 in _GLOBAL__sub_D_00100_1_json_lexer_init () at 
qobject/json-lexer.c:365
  #5  0x77de5b73 in _dl_fini () at dl-fini.c:138
  #6  0x76d34041 in __run_exit_handlers (status=0, listp=0x770dc718 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, 
run_dtors=run_dtors@entry=true) at
  exit.c:108
  #7  0x76d3413a in __GI_exit (status=) at exit.c:139
  #8  0x76d12b9e in __libc_start_main (main=0x55575d50 , 
argc=11, argv=0x7fffdf78, init=, fini=, 
rtld_fini=, stack_end=0x7fffdf68) at ../csu/libc-start.c:344
  #9  0x55573c1a in _start ()

To trigger the second failure I had to run on a limited Xenial machine
(16.04, 2 cores, 8Gb RAM) again with gprof build:

  # QEMU configure log Thu 10 Jan 22:22:52 GMT 2019
  # Configured with: './configure' '--enable-gprof' '--enable-gcov' 
'--disable-pie' 
'--target-list=aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu'

Running it like Travis does:

  retry.py -n 40 --invert -- make -j 3 check V=1

It eventually fails with:

  PASS: tests/test-hmp
  make: write error: stdout

It's hard to tell from the output what was running that failed. So far
I've managed to get the following information out of execsnoop:

  qemu-system-x86  1345   1332 0 x86_64-softmmu/qemu-system-x86_64 -qtest 
unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display 
none -S -M pc-i440fx-4.0
  sh   1350   1332 0 /bin/sh -c exec 
x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait 
-qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1350   1332 0 x86_64-softmmu/qemu-system-x86_64 -qtest 
unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display 
none -S -M pc-q35-3.1
  sh   1356   1332 0 /bin/sh -c exec 
x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait 
-qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1356   1332 0 x86_64-softmmu/qemu-system-x86_64 -qtest 
unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display 
none -S -M pc-i440fx-3.1
  sh   1361   1332 0 /bin/sh -c exec 
x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait 
-qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1361   1332 0 x86_64-softmmu/qemu-system-x86_64 -qtest 
unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display 
none -S -M pc-q35-4.0
  sh

Re: [Qemu-devel] [PULL 0/1] Misc 20190111 patches

2019-01-11 Thread Peter Maydell

On Fri, 11 Jan 2019 at 11:48, Gerd Hoffmann  wrote:
>
> The following changes since commit 291741033f611a4f0bbce3f7c9dead84ce315f96:
>
>   Merge remote-tracking branch 
> 'remotes/kraxel/tags/audio-20190110-pull-request' into staging (2019-01-10 
> 18:45:23 +)
>
> are available in the git repository at:
>
>   git://git.kraxel.org/qemu tags/misc-20190111-pull-request
>
> for you to fetch changes up to 78ac44af547e09bdddc75e41a525cdc8eec60be4:
>
>   roms: seabios: Rename CROSS_COMPILE to CROSS_PREFIX (2019-01-11 12:46:07 
> +0100)
>
> 
> misc: fix seabios cross build.
>
> 
>
> Roman Bolshakov (1):
>   roms: seabios: Rename CROSS_COMPILE to CROSS_PREFIX
>
>  roms/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM

Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Eduardo Habkost

On Fri, Jan 11, 2019 at 04:06:54PM +, Alex Bennée wrote:
> 
> Paolo Bonzini  writes:
> 
> > On 11/01/19 16:28, Alex Bennée wrote:
> >>> Why not g_usleep?  It already does a while loop around nanosleep (which
> >>> returns the remaining time in the wait, like select but unlike sleep and
> >>> poll).
> >> Yeah I'm testing that now. However I have managed to trigger:
> >>
> >>   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): 
> >> (35584 == 0)
> >
> > I think that's a good old SIGSEGV (0x8B00).
> 
> According to the PC in the logs:
> 
>   Line 98 of "mcount.c" starts at address 0x76e15145 
> <__mcount_internal+69> and ends at 0x76e15148 <__mcount_internal+72>.

Was this on Travis?  Which architecture?

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3 1/6] qemu-option: Allow integer defaults

2019-01-11 Thread Leonid Bloch

On 1/11/19 6:23 PM, Eric Blake wrote:
> On 1/11/19 8:14 AM, Leonid Bloch wrote:
>> On 1/10/19 9:18 PM, Eric Blake wrote:
>>> Set the framework up for declaring integer options with an integer
>>> default, instead of our current insane approach of requiring the
>>> default value to be given as a string (which then has to be reparsed
>>> at every use that wants a number).  git grep '[^.]def_value_str' says
>>> that we have done a good job of NOT abusing the internal field
>>> outside of the implementation in qemu-option.c; therefore, it is not
>>> too hard to audit that all code can either handle the new integer
>>> defaults correctly or abort because a caller violated constraints.
>>> Sadly, we DO have a constraint that qemu_opt_get() should not be
>>> called on an option that has an integer type, because we have no
>>> where to stash a cached const char * result; but callers that want
>>> an integer should be using qemu_opt_get_number() and friends
>>> anyways.
>>>
>>> Signed-off-by: Eric Blake 
>>> ---
>>>include/qemu/option.h | 12 
>>>util/qemu-option.c| 69 +--
>>>2 files changed, 72 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/qemu/option.h b/include/qemu/option.h
>>> index 844587cab39..46b80d5a6e1 100644
>>> --- a/include/qemu/option.h
>>> +++ b/include/qemu/option.h
>>> @@ -46,6 +46,18 @@ typedef struct QemuOptDesc {
>>>const char *name;
>>>enum QemuOptType type;
>>>const char *help;
>>> +/*
>>> + * For QEMU_OPT_STRING: Leave def_value_int 0, and set def_value_str
>>> + * to a default value or leave NULL for no default.
>>> + *
>>> + * For other types: Initialize at most non-zero def_value_int or a
>>> + * parseable def_value_str for a default (must use a string for an
>>> + * explicit default of 0, although an implicit default generally
>>> + * works).  If setting def_value_int, calling qemu_opt_get() on
>>> + * that option will abort(); instead, call qemu_opt_get_del() or a
>>> + * typed getter.
>>> + */
>>> +uint64_t def_value_int;
>>>const char *def_value_str;
>>>} QemuOptDesc;
>>>
> 
>>
>> If I understand correctly, the number will still be compiled into the
>> binary as an expression string, but will be printed correctly during
>> runtime?
> 
> Anyone that uses .def_value_int will compile into the binary as an
> 8-byte integer (regardless of how that number was spelled in C, either
> as a single constant, or as a constant expression), and NOT as a decimal
> string.  String conversions for code that asks for a string will happen
> by a runtime printf() (ideally, such code is limited to the case of
> printing out information during help output).  Code that is smart enough
> to request a number now gets the default value directly, rather than the
> old way of having to strtoll() decode a decimal string back into a
> number.  No one should ever be using .def_value_str = stringify(macro),
> when they can instead just use .def_value_int = macro (which is what the
> next patch fixes).

Yes, you're right. Thanks for the explanation!

Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Greg Kurz

On Fri, 11 Jan 2019 16:41:41 +0100
Paolo Bonzini  wrote:

> On 11/01/19 16:28, Alex Bennée wrote:
> >> Why not g_usleep?  It already does a while loop around nanosleep (which
> >> returns the remaining time in the wait, like select but unlike sleep and
> >> poll).  
> > Yeah I'm testing that now. However I have managed to trigger:
> > 
> >   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): 
> > (35584 == 0)  
> 
> I think that's a good old SIGSEGV (0x8B00).
> 

Hmmm... system() returns a "wait status" that can  be examined using the
macros described in waitpid(2), and we have:

/* If WIFEXITED(STATUS), the low-order 8 bits of the status.  */
#define __WEXITSTATUS(status)   (((status) & 0xff00) >> 8)

So this rather looks like a 139 exit status to me... Not sure how
this can happen though.

> Paolo
>

Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation

2019-01-11 Thread Dr. David Alan Gilbert

* Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> 10.01.2019, 23:14, "Dr. David Alan Gilbert" :
> > * Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> >>  RAM migration has a RAMBlock validation stage (flag 
> >> RAM_SAVE_FLAG_MEM_SIZE).
> >>  In this stage QEMU checks further information about RAMBlock:
> >>  1. Presence (by idstr),
> >>  2. Length (trying to resize, when differs),
> >>  3. Optional page size.
> >>
> >>  This patch adds a check for RAMBlock's offset. Currently we check it 
> >> during
> >>  RAM pages loading - every RAM page has an offset in its header. But there 
> >> is a
> >>  case when we don't send RAM pages (see below).
> >>
> >>  The following commits introduce a capability (ignore-external) to skip 
> >> some
> >>  RAM blocks from migration. In such case the migration stream contains only
> >>  meta information about RAM blocks to validate them. So, the only way to 
> >> check
> >>  block's offset is to send it explicitly.
> >>
> >>  Signed-off-by: Yury Kotov 
> >
> > But why check that offsets match? THey aren't supposed to!
> > Offset's are entirely private to each qemu and they're allowed to be
> > different; the only requirement is that the length and name of each
> > RAMBlock matches, then all the operations we do over the migration
> > stream are relative to the start of the block.
> >
> 
> Yes, you are right. It seems that instead I should check block->mr->addr.

I don't think that's guaranteed to be the same either;  for example
a video buffer or ROM on a  PCI card gets mapped into different parts of
the guests physical address space depending on writes into the PCI
bars.  Some RAMBlocks dont even have a mapping associated with them.

If you do want to add something here, please don't increment the version
- unless we're desperate I want to keep the version number the same so
that backwards migration works.

Dave

> >
> > One example where they are validly different is where you hotplug some
> > RAM, so for example:
> >
> >   source qemu
> >   -M 4G
> >   hotplug PCI card
> >   hotplug 2G
> >
> >   destination qemu
> >   -M 4G
> >   PCI card declared on the command line
> >   extra 2G declared on the command line
> >
> > The offsets are different but we can migrate that case fine.
> >
> > Dave
> >
> >>  ---
> >>   migration/ram.c | 15 +--
> >>   1 file changed, 13 insertions(+), 2 deletions(-)
> >>
> >>  diff --git a/migration/ram.c b/migration/ram.c
> >>  index 7e7deec4d8..39629254e1 100644
> >>  --- a/migration/ram.c
> >>  +++ b/migration/ram.c
> >>  @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >>   if (migrate_postcopy_ram() && block->page_size != 
> >> qemu_host_page_size) {
> >>   qemu_put_be64(f, block->page_size);
> >>   }
> >>  + qemu_put_be64(f, block->offset);
> >>   }
> >>
> >>   rcu_read_unlock();
> >>  @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
> >> version_id)
> >>
> >>   seq_iter++;
> >>
> >>  - if (version_id != 4) {
> >>  + if (version_id < 4) {
> >>   ret = -EINVAL;
> >>   }
> >>
> >>  @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int 
> >> version_id)
> >>   ret = -EINVAL;
> >>   }
> >>   }
> >>  + if (version_id >= 5) {
> >>  + ram_addr_t offset;
> >>  + offset = qemu_get_be64(f);
> >>  + if (block->offset != offset) {
> >>  + error_report("Mismatched RAM block offset %s "
> >>  + "%" PRId64 "!= %" PRId64,
> >>  + id, offset, (uint64_t)block->offset);
> >>  + ret = -EINVAL;
> >>  + }
> >>  + }
> >>   ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> >> block->idstr);
> >>   } else {
> >>  @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
> >>   void ram_mig_init(void)
> >>   {
> >>   qemu_mutex_init();
> >>  - register_savevm_live(NULL, "ram", 0, 4, _ram_handlers, 
> >> _state);
> >>  + register_savevm_live(NULL, "ram", 0, 5, _ram_handlers, 
> >> _state);
> >>   }
> >>  --
> >>  2.20.1
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> Regards,
> Yury
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PULL 2/2] tests: Disable qht-bench parallel test when using gprof

2019-01-11 Thread Eduardo Habkost

From: Philippe Mathieu-Daudé 

This test is failing on the Travis CI [*] since some time now,
disable it until it get fixed.

[*] https://travis-ci.org/qemu/qemu/builds/474821674

Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20190103150951.17592-3-phi...@redhat.com>
Signed-off-by: Eduardo Habkost 
---
 configure  | 1 +
 tests/Makefile.include | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 0c433ec043..bb1e7c9e19 100755
--- a/configure
+++ b/configure
@@ -7490,6 +7490,7 @@ alpha)
 esac
 
 if test "$gprof" = "yes" ; then
+  echo "CONFIG_GPROF=y" >> $config_host_mak
   echo "TARGET_GPROF=y" >> $config_target_mak
   if test "$target_linux_user" = "yes" ; then
 cflags="-p $cflags"
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 9c84bbd829..dfd87344bd 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -88,7 +88,8 @@ check-unit-y += tests/test-rcu-simpleq$(EXESUF)
 check-unit-y += tests/test-rcu-tailq$(EXESUF)
 check-unit-y += tests/test-qdist$(EXESUF)
 check-unit-y += tests/test-qht$(EXESUF)
-check-unit-y += tests/test-qht-par$(EXESUF)
+# FIXME: {test-qht-par + gprof} often break on Travis CI
+check-unit-$(call lnot,$(CONFIG_GPROF)) += tests/test-qht-par$(EXESUF)
 check-unit-y += tests/test-bitops$(EXESUF)
 check-unit-y += tests/test-bitcnt$(EXESUF)
 check-unit-y += tests/test-qdev-global-props$(EXESUF)
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL 1/2] configure: Let the TARGET_GPROF var use the regular 'y' for Yes

2019-01-11 Thread Eduardo Habkost

From: Philippe Mathieu-Daudé 

All other variables are set using 'y', which is what the rules.mak
functions expect to parse.

Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20190103150951.17592-2-phi...@redhat.com>
Signed-off-by: Eduardo Habkost 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index cf763d4674..0c433ec043 100755
--- a/configure
+++ b/configure
@@ -7490,7 +7490,7 @@ alpha)
 esac
 
 if test "$gprof" = "yes" ; then
-  echo "TARGET_GPROF=yes" >> $config_target_mak
+  echo "TARGET_GPROF=y" >> $config_target_mak
   if test "$target_linux_user" = "yes" ; then
 cflags="-p $cflags"
 ldflags="-p $ldflags"
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL 0/2] Work around test-qht-par + gprof issues

2019-01-11 Thread Eduardo Habkost

The following changes since commit e53f7796fbe71a5c7c24ffebf04b4aa9a759da36:

  Merge remote-tracking branch 
'remotes/ehabkost/tags/machine-next-pull-request' into staging (2019-01-11 
13:35:48 +)

are available in the Git repository at:

  git://github.com/ehabkost/qemu.git tags/machine-next-pull-request

for you to fetch changes up to ce2eefd7c21697fee87a0686353de881081d22c6:

  tests: Disable qht-bench parallel test when using gprof (2019-01-11 16:21:45 
-0200)


Work around test-qht-par + gprof issues

Travis CI jobs are failing because of test-qht-par when gprof is
enabled.  Temporarily disable test-qht-par if gprof is enabled,
until we fix the bug.



Queue for Machine Core patches


Philippe Mathieu-Daudé (2):
  configure: Let the TARGET_GPROF var use the regular 'y' for Yes
  tests: Disable qht-bench parallel test when using gprof

 configure  | 3 ++-
 tests/Makefile.include | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

-- 
2.18.0.rc1.1.g3f1ff2140

Re: [Qemu-devel] [PATCH] xen: Fix event channel interface for XenDevice-s

2019-01-11 Thread Peter Maydell

On Fri, 11 Jan 2019 at 18:13, Anthony PERARD  wrote:
>
> On Fri, Jan 11, 2019 at 06:09:41PM +, Anthony PERARD wrote:
> > Patch "xen: add event channel interface for XenDevice-s" makes use of
> > the type xenevtchn_port_or_error_t, but this isn't avaiable before Xen
> > 4.7. Also the function xen_device_bind_event_channel assign the return
> > value of xenevtchn_bind_interdomain to channel->local_port but check the
> > result for error with xendev->local_port.
> >
> > Fix by:
> > - removing local_port from struct XenDevice as it isn't use anywere.
> > - adding a compatibility typedef for xenevtchn_port_or_error_t for Xen
> >   4.6 and earlier.
> >
> > As extra, replace the type of XenEventChannel->local_port by
> > evtchn_port_t.
> >
> > Signed-off-by: Anthony PERARD 
>
> Notes:
> This patch fix "xen: add event channel interface for XenDevice-s" that
> isn't commited yet, of the patch series "Xen PV backend 'qdevification'".

Thanks for the fix. I assume you're going to squash it into the
appropriate patch in that pullrequest ?

thanks
-- PMM

[Qemu-devel] [PATCH] xen: Fix event channel interface for XenDevice-s

2019-01-11 Thread Anthony PERARD

Patch "xen: add event channel interface for XenDevice-s" makes use of
the type xenevtchn_port_or_error_t, but this isn't avaiable before Xen
4.7. Also the function xen_device_bind_event_channel assign the return
value of xenevtchn_bind_interdomain to channel->local_port but check the
result for error with xendev->local_port.

Fix by:
- removing local_port from struct XenDevice as it isn't use anywere.
- adding a compatibility typedef for xenevtchn_port_or_error_t for Xen
  4.6 and earlier.

As extra, replace the type of XenEventChannel->local_port by
evtchn_port_t.

Signed-off-by: Anthony PERARD 
---
 hw/xen/xen-bus.c| 12 +++-
 include/hw/xen/xen-bus.h|  1 -
 include/hw/xen/xen_common.h |  1 +
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index f90bcf2342..3aeccec69c 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -917,7 +917,7 @@ void xen_device_copy_grant_refs(XenDevice *xendev, bool 
to_domain,
 }
 
 struct XenEventChannel {
-unsigned int local_port;
+evtchn_port_t local_port;
 XenEventHandler handler;
 void *opaque;
 Notifier notifier;
@@ -939,17 +939,19 @@ XenEventChannel *xen_device_bind_event_channel(XenDevice 
*xendev,
void *opaque, Error **errp)
 {
 XenEventChannel *channel = g_new0(XenEventChannel, 1);
+xenevtchn_port_or_error_t local_port;
 
-channel->local_port = xenevtchn_bind_interdomain(xendev->xeh,
- xendev->frontend_id,
- port);
-if (xendev->local_port < 0) {
+local_port = xenevtchn_bind_interdomain(xendev->xeh,
+xendev->frontend_id,
+port);
+if (local_port < 0) {
 error_setg_errno(errp, errno, "xenevtchn_bind_interdomain failed");
 
 g_free(channel);
 return NULL;
 }
 
+channel->local_port = local_port;
 channel->handler = handler;
 channel->opaque = opaque;
 channel->notifier.notify = event_notify;
diff --git a/include/hw/xen/xen-bus.h b/include/hw/xen/xen-bus.h
index e55a5de5f1..3183f10e3c 100644
--- a/include/hw/xen/xen-bus.h
+++ b/include/hw/xen/xen-bus.h
@@ -29,7 +29,6 @@ typedef struct XenDevice {
 xengnttab_handle *xgth;
 bool feature_grant_copy;
 xenevtchn_handle *xeh;
-xenevtchn_port_or_error_t local_port;
 NotifierList event_notifiers;
 } XenDevice;
 
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 2b91d199a1..9a8155e172 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -32,6 +32,7 @@ extern xc_interface *xen_xc;
 typedef xc_interface xenforeignmemory_handle;
 typedef xc_evtchn xenevtchn_handle;
 typedef xc_gnttab xengnttab_handle;
+typedef evtchn_port_or_error_t xenevtchn_port_or_error_t;
 
 #define xenevtchn_open(l, f) xc_evtchn_open(l, f);
 #define xenevtchn_close(h) xc_evtchn_close(h)
-- 
Anthony PERARD

Re: [Qemu-devel] [PATCH] xen: Fix event channel interface for XenDevice-s

2019-01-11 Thread Anthony PERARD

On Fri, Jan 11, 2019 at 06:09:41PM +, Anthony PERARD wrote:
> Patch "xen: add event channel interface for XenDevice-s" makes use of
> the type xenevtchn_port_or_error_t, but this isn't avaiable before Xen
> 4.7. Also the function xen_device_bind_event_channel assign the return
> value of xenevtchn_bind_interdomain to channel->local_port but check the
> result for error with xendev->local_port.
> 
> Fix by:
> - removing local_port from struct XenDevice as it isn't use anywere.
> - adding a compatibility typedef for xenevtchn_port_or_error_t for Xen
>   4.6 and earlier.
> 
> As extra, replace the type of XenEventChannel->local_port by
> evtchn_port_t.
> 
> Signed-off-by: Anthony PERARD 

Notes:
This patch fix "xen: add event channel interface for XenDevice-s" that
isn't commited yet, of the patch series "Xen PV backend 'qdevification'".

-- 
Anthony PERARD

[Qemu-devel] [PULL 3/4] RISC-V: Implement existential predicates for CSRs

2019-01-11 Thread Palmer Dabbelt

From: Michael Clark 

CSR predicate functions are added to the CSR table.
mstatus.FS and counter enable checks are moved
to predicate functions and two new predicates are
added to check misa.S for s* CSRs and a new PMP
CPU feature for pmp* CSRs.

Processors that don't implement S-mode will trap
on access to s* CSRs and processors that don't
implement PMP will trap on accesses to pmp* CSRs.

PMP checks are disabled in riscv_cpu_handle_mmu_fault
when the PMP CPU feature is not present.

Signed-off-by: Michael Clark 
Signed-off-by: Alistair Francis 
Reviewed-by: Richard Henderson 
Signed-off-by: Palmer Dabbelt 
---
 target/riscv/cpu.c|   6 ++
 target/riscv/cpu.h|   6 +-
 target/riscv/cpu_helper.c |   3 +-
 target/riscv/csr.c| 169 +-
 4 files changed, 105 insertions(+), 79 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5e8a2cb2ba61..28d7e5302fb1 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -126,6 +126,7 @@ static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_MMU);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
@@ -135,6 +136,7 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_MMU);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv32imacu_nommu_cpu_init(Object *obj)
@@ -143,6 +145,7 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVC | RVU);
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 #elif defined(TARGET_RISCV64)
@@ -154,6 +157,7 @@ static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_MMU);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
@@ -163,6 +167,7 @@ static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_MMU);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv64imacu_nommu_cpu_init(Object *obj)
@@ -171,6 +176,7 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVC | RVU);
 set_versions(env, USER_VERSION_2_02_0, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
+set_feature(env, RISCV_FEATURE_PMP);
 }
 
 #endif
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 4aeaa3204903..743f02c8b95a 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -83,9 +83,10 @@
 /* S extension denotes that Supervisor mode exists, however it is possible
to have a core that support S mode but does not have an MMU and there
is currently no bit in misa to indicate whether an MMU exists or not
-   so a cpu features bitfield is required */
+   so a cpu features bitfield is required, likewise for optional PMP support */
 enum {
-RISCV_FEATURE_MMU
+RISCV_FEATURE_MMU,
+RISCV_FEATURE_PMP
 };
 
 #define USER_VERSION_2_02_0 0x00020200
@@ -314,6 +315,7 @@ typedef int (*riscv_csr_op_fn)(CPURISCVState *env, int 
csrno,
 target_ulong *ret_value, target_ulong new_value, target_ulong write_mask);
 
 typedef struct {
+riscv_csr_predicate_fn predicate;
 riscv_csr_read_fn read;
 riscv_csr_write_fn write;
 riscv_csr_op_fn op;
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 4ef7f5c1f93d..f257050f1282 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -404,7 +404,8 @@ int riscv_cpu_handle_mmu_fault(CPUState *cs, vaddr address, 
int size,
 qemu_log_mask(CPU_LOG_MMU,
 "%s address=%" VADDR_PRIx " ret %d physical " TARGET_FMT_plx
  " prot %d\n", __func__, address, ret, pa, prot);
-if (!pmp_hart_has_privs(env, pa, TARGET_PAGE_SIZE, 1 << rw)) {
+if (riscv_feature(env, RISCV_FEATURE_PMP) &&
+!pmp_hart_has_privs(env, pa, TARGET_PAGE_SIZE, 1 << rw)) {
 ret = TRANSLATE_FAIL;
 }
 if (ret == TRANSLATE_SUCCESS) {
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 44ea8b7cb6e8..5e7e7d16b8b5 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -42,6 +42,46 @@ void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops)
 csr_ops[csrno & (CSR_TABLE_SIZE - 1)] = *ops;
 }
 
+/* Predicates */
+static int fs(CPURISCVState *env, int csrno)
+{
+#if !defined(CONFIG_USER_ONLY)
+if (!(env->mstatus & MSTATUS_FS)) {
+return -1;
+}
+#endif
+return 0;
+}
+
+static

[Qemu-devel] [PULL 4/4] default-configs: Enable USB support for RISC-V machines

2019-01-11 Thread Palmer Dabbelt

From: Alistair Francis 

Signed-off-by: Alistair Francis 
Signed-off-by: Palmer Dabbelt 
---
 default-configs/riscv32-softmmu.mak | 1 +
 default-configs/riscv64-softmmu.mak | 1 +
 2 files changed, 2 insertions(+)

diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
index dbc93982848a..c9c59714093f 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -1,6 +1,7 @@
 # Default configuration for riscv-softmmu
 
 include pci.mak
+include usb.mak
 
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
diff --git a/default-configs/riscv64-softmmu.mak 
b/default-configs/riscv64-softmmu.mak
index dbc93982848a..c9c59714093f 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -1,6 +1,7 @@
 # Default configuration for riscv-softmmu
 
 include pci.mak
+include usb.mak
 
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
-- 
2.18.1

[Qemu-devel] [PULL 1/4] RISC-V: Implement modular CSR helper interface

2019-01-11 Thread Palmer Dabbelt

From: Michael Clark 

Previous CSR code uses csr_read_helper and csr_write_helper
to update CSR registers however this interface prevents
atomic read/modify/write CSR operations; in addition
there is no trap-free method to access to CSRs due
to the monolithic CSR functions call longjmp.

The current iCSR interface is not safe to be called by
target/riscv/gdbstub.c as privilege checks or missing CSRs
may call longjmp to generate exceptions. It needs to
indicate existence so traps can be generated in the
CSR instruction helpers.

This commit moves CSR access from the monolithic switch
statements in target/riscv/op_helper.c into modular
read/write functions in target/riscv/csr.c using a new
function pointer table for dispatch (which can later
be used to allow CPUs to hook up model specific CSRs).

A read/modify/write interface is added to support atomic
CSR operations and a non-trapping interface is added
to allow exception-free access to CSRs by the debugger.

The CSR functions and CSR dispatch table are ordered
to match The RISC-V Instruction Set Manual, Volume II:
Privileged Architecture Version 1.10, 2.2 CSR Listing.

An API is added to allow derived cpu instances to modify
or implement new CSR operations.

Signed-off-by: Michael Clark 
Signed-off-by: Alistair Francis 
Reviewed-by: Richard Henderson 
Signed-off-by: Palmer Dabbelt 
---
 target/riscv/Makefile.objs |   2 +-
 target/riscv/cpu.h |  35 +-
 target/riscv/cpu_helper.c  |   4 +-
 target/riscv/csr.c | 846 +
 target/riscv/gdbstub.c |  10 +-
 target/riscv/op_helper.c   | 613 +--
 6 files changed, 904 insertions(+), 606 deletions(-)
 create mode 100644 target/riscv/csr.c

diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
index fcc5d34c1f2e..4072abe3e45c 100644
--- a/target/riscv/Makefile.objs
+++ b/target/riscv/Makefile.objs
@@ -1 +1 @@
-obj-y += translate.o op_helper.o cpu_helper.o cpu.o fpu_helper.o gdbstub.o 
pmp.o
+obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o 
gdbstub.o pmp.o
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 4ee09b9cffe0..4aeaa3204903 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -289,9 +289,38 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState 
*env, target_ulong *pc,
 #endif
 }
 
-void csr_write_helper(CPURISCVState *env, target_ulong val_to_write,
-target_ulong csrno);
-target_ulong csr_read_helper(CPURISCVState *env, target_ulong csrno);
+int riscv_csrrw(CPURISCVState *env, int csrno, target_ulong *ret_value,
+target_ulong new_value, target_ulong write_mask);
+
+static inline void csr_write_helper(CPURISCVState *env, target_ulong val,
+int csrno)
+{
+riscv_csrrw(env, csrno, NULL, val, MAKE_64BIT_MASK(0, TARGET_LONG_BITS));
+}
+
+static inline target_ulong csr_read_helper(CPURISCVState *env, int csrno)
+{
+target_ulong val = 0;
+riscv_csrrw(env, csrno, , 0, 0);
+return val;
+}
+
+typedef int (*riscv_csr_predicate_fn)(CPURISCVState *env, int csrno);
+typedef int (*riscv_csr_read_fn)(CPURISCVState *env, int csrno,
+target_ulong *ret_value);
+typedef int (*riscv_csr_write_fn)(CPURISCVState *env, int csrno,
+target_ulong new_value);
+typedef int (*riscv_csr_op_fn)(CPURISCVState *env, int csrno,
+target_ulong *ret_value, target_ulong new_value, target_ulong write_mask);
+
+typedef struct {
+riscv_csr_read_fn read;
+riscv_csr_write_fn write;
+riscv_csr_op_fn op;
+} riscv_csr_operations;
+
+void riscv_get_csr_ops(int csrno, riscv_csr_operations *ops);
+void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops);
 
 #include "exec/cpu-all.h"
 
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 0234c2d52886..4ef7f5c1f93d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -528,7 +528,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
 get_field(s, MSTATUS_SIE) : get_field(s, MSTATUS_UIE << 
env->priv));
 s = set_field(s, MSTATUS_SPP, env->priv);
 s = set_field(s, MSTATUS_SIE, 0);
-csr_write_helper(env, s, CSR_MSTATUS);
+env->mstatus = s;
 riscv_set_mode(env, PRV_S);
 } else {
 /* No need to check MTVEC for misaligned - lower 2 bits cannot be set 
*/
@@ -553,7 +553,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
 get_field(s, MSTATUS_MIE) : get_field(s, MSTATUS_UIE << 
env->priv));
 s = set_field(s, MSTATUS_MPP, env->priv);
 s = set_field(s, MSTATUS_MIE, 0);
-csr_write_helper(env, s, CSR_MSTATUS);
+env->mstatus = s;
 riscv_set_mode(env, PRV_M);
 }
 /* TODO yield load reservation  */
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
new file mode 100644
index ..b61b0ef37971
--- /dev/null
+++ b/target/riscv/csr.c
@@ -0,0 +1,846 @@
+/*
+ * RISC-V Control and Status Registers.
+ *
+ * Copyright (c) 2016-2017 Sagar

[Qemu-devel] [PULL] RISC-V Updates for 3.2, Part 2

2019-01-11 Thread Palmer Dabbelt

The following changes since commit 147923b1a901a0370f83a0f4c58ec1baffef22f0:

  Merge remote-tracking branch 'remotes/kraxel/tags/usb-20190108-pull-request' 
into staging (2019-01-08 16:07:32 +)

are available in the Git repository at:

  git://github.com/palmer-dabbelt/qemu.git tags/riscv-for-master-3.2-part2

for you to fetch changes up to f7cdfa38f37e0985457ac03c3238861144a58b4c:

  default-configs: Enable USB support for RISC-V machines (2019-01-09 17:34:10 
-0800)


RISC-V Updates for 3.2, Part 2

This patch set contains a handful of Michael's CSR-related cleanups,
which should allow us to proceed with more outstanding bug fixes that
depend on them.

Additionally, there is a patch that turns on USB.  This works for me
when the kernel has the appropriate drivers (which will soon be in
defconfig) and I pass

-device usb-ehci
-drive id=my_usb_disk,file=usbdisk.img,if=none,format=raw
-device usb-storage,drive=my_usb_disk

to QEMU.


Alistair Francis (1):
  default-configs: Enable USB support for RISC-V machines

Michael Clark (3):
  RISC-V: Implement modular CSR helper interface
  RISC-V: Implement atomic mip/sip CSR updates
  RISC-V: Implement existential predicates for CSRs

 default-configs/riscv32-softmmu.mak |   1 +
 default-configs/riscv64-softmmu.mak |   1 +
 target/riscv/Makefile.objs  |   2 +-
 target/riscv/cpu.c  |   6 +
 target/riscv/cpu.h  |  41 +-
 target/riscv/cpu_helper.c   |   7 +-
 target/riscv/csr.c  | 863 
 target/riscv/gdbstub.c  |  10 +-
 target/riscv/op_helper.c| 613 +
 9 files changed, 935 insertions(+), 609 deletions(-)
 create mode 100644 target/riscv/csr.c

[Qemu-devel] [PULL 2/4] RISC-V: Implement atomic mip/sip CSR updates

2019-01-11 Thread Palmer Dabbelt

From: Michael Clark 

Use the new CSR read/modify/write interface to implement
atomic updates to mip/sip.

Signed-off-by: Michael Clark 
Signed-off-by: Alistair Francis 
Reviewed-by: Richard Henderson 
Signed-off-by: Palmer Dabbelt 
---
 target/riscv/csr.c | 56 +++---
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index b61b0ef37971..44ea8b7cb6e8 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -487,25 +487,31 @@ static int write_mbadaddr(CPURISCVState *env, int csrno, 
target_ulong val)
 return 0;
 }
 
-static int read_mip(CPURISCVState *env, int csrno, target_ulong *val)
-{
-*val = atomic_read(>mip);
-return 0;
-}
-
-static int write_mip(CPURISCVState *env, int csrno, target_ulong val)
+static int rmw_mip(CPURISCVState *env, int csrno, target_ulong *ret_value,
+   target_ulong new_value, target_ulong write_mask)
 {
 RISCVCPU *cpu = riscv_env_get_cpu(env);
+target_ulong mask = write_mask & delegable_ints;
+uint32_t old_mip;
+
+/* We can't allow the supervisor to control SEIP as this would allow the
+ * supervisor to clear a pending external interrupt which will result in
+ * lost a interrupt in the case a PLIC is attached. The SEIP bit must be
+ * hardware controlled when a PLIC is attached. This should be an option
+ * for CPUs with software-delegated Supervisor External Interrupts. */
+mask &= ~MIP_SEIP;
+
+if (mask) {
+qemu_mutex_lock_iothread();
+old_mip = riscv_cpu_update_mip(cpu, mask, (new_value & mask));
+qemu_mutex_unlock_iothread();
+} else {
+old_mip = atomic_read(>mip);
+}
 
-/*
- * csrs, csrc on mip.SEIP is not decomposable into separate read and
- * write steps, so a different implementation is needed
- */
-
-qemu_mutex_lock_iothread();
-riscv_cpu_update_mip(cpu, MIP_SSIP | MIP_STIP,
- (val & (MIP_SSIP | MIP_STIP)));
-qemu_mutex_unlock_iothread();
+if (ret_value) {
+*ret_value = old_mip;
+}
 
 return 0;
 }
@@ -623,17 +629,11 @@ static int write_sbadaddr(CPURISCVState *env, int csrno, 
target_ulong val)
 return 0;
 }
 
-static int read_sip(CPURISCVState *env, int csrno, target_ulong *val)
-{
-*val = atomic_read(>mip) & env->mideleg;
-return 0;
-}
-
-static int write_sip(CPURISCVState *env, int csrno, target_ulong val)
+static int rmw_sip(CPURISCVState *env, int csrno, target_ulong *ret_value,
+   target_ulong new_value, target_ulong write_mask)
 {
-target_ulong newval = (atomic_read(>mip) & ~env->mideleg)
-  | (val & env->mideleg);
-return write_mip(env, CSR_MIP, newval);
+return rmw_mip(env, CSR_MSTATUS, ret_value, new_value,
+   write_mask & env->mideleg);
 }
 
 /* Supervisor Protection and Translation */
@@ -812,7 +812,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_MEPC] ={ read_mepc,write_mepc},
 [CSR_MCAUSE] =  { read_mcause,  write_mcause  },
 [CSR_MBADADDR] ={ read_mbadaddr,write_mbadaddr},
-[CSR_MIP] = { read_mip, write_mip },
+[CSR_MIP] = { NULL, NULL, rmw_mip },
 
 /* Supervisor Trap Setup */
 [CSR_SSTATUS] = { read_sstatus, write_sstatus },
@@ -825,7 +825,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_SEPC] ={ read_sepc,write_sepc},
 [CSR_SCAUSE] =  { read_scause,  write_scause  },
 [CSR_SBADADDR] ={ read_sbadaddr,write_sbadaddr},
-[CSR_SIP] = { read_sip, write_sip },
+[CSR_SIP] = { NULL, NULL, rmw_sip },
 
 /* Supervisor Protection and Translation */
 [CSR_SATP] ={ read_satp,write_satp},
-- 
2.18.1

Re: [Qemu-devel] [PATCH v1 0/3] gitdm updates

2019-01-11 Thread Alex Bennée



Aleksandar Markovic  writes:

> On Monday, January 7, 2019, Alex Bennée  wrote:
>
>>
>> Hi,
>>
>> Added a few more updates mostly of IBMers with non corporate emails.
>
>
> Alex, it seems logical to me that you also create a section on gitdm in
> MAINTAINERS, and set yourself as the maintainer for contrib/gitdm/*.

I guess - do we do that for other contrib stuff?

>
> Alelsandar
>
>
>
>> The year-end stats as per:
>>
>>  git log --numstat --after="1/1/2018 00:00" \
>>--before="31/12/2018 23:59" | gitdm -n -l 10
>>
>> Are:
>>
>>   Top changeset contributors by employer
>>   Red Hat   3091 (43.3%)
>>   Linaro1201 (16.8%)
>>   (None) 484 (6.8%)
>>   IBM426 (6.0%)
>>   Academics (various)186 (2.6%)
>>   Virtuozzo  172 (2.4%)
>>   Wave Computing 118 (1.7%)
>>   Igalia 109 (1.5%)
>>   Xilinx 102 (1.4%)
>>   Cadence Design Systems  80 (1.1%)
>>
>>   Top lines changed by employer
>>   Red Hat   140523 (30.3%)
>>   Cadence Design Systems81010 (17.5%)
>>   Linaro78098 (16.8%)
>>   Wave Computing33134 (7.1%)
>>   IBM   18918 (4.1%)
>>   SiFive14436 (3.1%)
>>   Academics (various)   11995 (2.6%)
>>   (None)11458 (2.5%)
>>   Virtuozzo 10770 (2.3%)
>>   Oracle6698 (1.4%)
>>
>> Alex Bennée (2):
>>   contrib/gitdm: add Nokia and Proxmox to the domain-map
>>   contrib/gitdm: add two more IBM'ers to group-map-ibm
>>
>> Joel Stanley (1):
>>   contrib/gitdm: Add other IBMers
>>
>>  contrib/gitdm/domain-map| 2 ++
>>  contrib/gitdm/group-map-ibm | 7 +++
>>  2 files changed, 9 insertions(+)
>>
>> --
>> 2.17.1
>>
>>
>>


--
Alex Bennée

Re: [Qemu-devel] [PATCH 2/4] linuxboot_dma: move common functions in a new header

2019-01-11 Thread Eric Blake

On 1/11/19 11:48 AM, Michael S. Tsirkin wrote:

>>>
 diff --git a/pc-bios/optionrom/optrom.h b/pc-bios/optionrom/optrom.h
 new file mode 100644
 index 00..36f43b43fd
 --- /dev/null
 +++ b/pc-bios/optionrom/optrom.h

 +#include "../../include/standard-headers/linux/qemu_fw_cfg.h"
>>>
>>> This depends on , please include it first.
>>
>> Sure.
>>
>>
>> Thanks,
>> Stefano
> 
> Better to just pull in qemu/osdep.h

Except that qemu/osdep.h should already have been pulled in by whatever
.c file is including this header. We specifically document that .h files
shouldn't need to include osdep.h (and in turn, anything that osdep.h
already pulls in, like ).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v6 01/11] blockdev: abort transactions in reverse order

2019-01-11 Thread Eric Blake

On 12/21/18 3:35 AM, John Snow wrote:
> Presently, we abort transactions in the same order they were processed in.
> Bitmap commands, though, attempt to restore backup data structures on abort.
> 
> That's not valid, they need to be aborted in reverse chronological order.
> 
> Replace the QSIMPLEQ data structure with a QTAILQ one, so we can iterate
> in reverse for the abort phase of the transaction.

This patch conflicts with Paolo's improvements to QTAILQ that just
landed; the resolution should be obvious, but at this point, you'll want
to post a v7.

Or did you want me to try and fix the conflict, and take the series
through my NBD tree since I have patches based on top of it?

> 
> Signed-off-by: John Snow 
> Reviewed-by: Eric Blake 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> ---
>  blockdev.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 2/4] linuxboot_dma: move common functions in a new header

2019-01-11 Thread Michael S. Tsirkin

On Fri, Jan 11, 2019 at 06:42:25PM +0100, Stefano Garzarella wrote:
> On Fri, Jan 11, 2019 at 5:27 PM Stefan Hajnoczi  wrote:
> >
> > On Fri, Jan 11, 2019 at 02:18:34PM +0100, Stefano Garzarella wrote:
> > > In order to allow other option roms to use these common
> > > useful functions and definitions, this patch put them
> > > in a new C header file called optrom.h, and also add
> > > useful out*() in*() functions for different size.
> >
> > It's usually helpful to modify code in a one patch and move it in a
> > separate patch.  This patch does both, so it's a harder to review.
> > Don't worry about changing it now, but something to try in the future.
> 
> Thanks for the tip!
> 
> >
> > > +#include 
> > > +#include "optrom.h"
> > > +#include "optrom_fw_cfg.h"
> >
> > Can these be moved to the top of the file like a regular C source file?
> 
> Yes, I'll move them to the top.
> 
> >
> > > diff --git a/pc-bios/optionrom/optrom.h b/pc-bios/optionrom/optrom.h
> > > new file mode 100644
> > > index 00..36f43b43fd
> > > --- /dev/null
> > > +++ b/pc-bios/optionrom/optrom.h
> > > @@ -0,0 +1,109 @@
> > > +/*
> > > + * Common Option ROM Functions for C code
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License
> > > + * along with this program; if not, see .
> > > + *
> > > + * Copyright (c) 2015-2019 Red Hat Inc.
> > > + *   Authors:
> > > + * Marc Marí 
> > > + * Richard W.M. Jones 
> > > + * Stefano Garzarella 
> > > + */
> > > +
> > > +#ifndef OPTROM_H
> > > +#define OPTROM_H
> > > +
> > > +#include "../../include/standard-headers/linux/qemu_fw_cfg.h"
> >
> > This depends on , please include it first.
> 
> Sure.
> 
> 
> Thanks,
> Stefano

Better to just pull in qemu/osdep.h

> 
> -- 
> Stefano Garzarella
> Red Hat

Re: [Qemu-devel] [PATCH 2/4] linuxboot_dma: move common functions in a new header

2019-01-11 Thread Stefano Garzarella

On Fri, Jan 11, 2019 at 5:27 PM Stefan Hajnoczi  wrote:
>
> On Fri, Jan 11, 2019 at 02:18:34PM +0100, Stefano Garzarella wrote:
> > In order to allow other option roms to use these common
> > useful functions and definitions, this patch put them
> > in a new C header file called optrom.h, and also add
> > useful out*() in*() functions for different size.
>
> It's usually helpful to modify code in a one patch and move it in a
> separate patch.  This patch does both, so it's a harder to review.
> Don't worry about changing it now, but something to try in the future.

Thanks for the tip!

>
> > +#include 
> > +#include "optrom.h"
> > +#include "optrom_fw_cfg.h"
>
> Can these be moved to the top of the file like a regular C source file?

Yes, I'll move them to the top.

>
> > diff --git a/pc-bios/optionrom/optrom.h b/pc-bios/optionrom/optrom.h
> > new file mode 100644
> > index 00..36f43b43fd
> > --- /dev/null
> > +++ b/pc-bios/optionrom/optrom.h
> > @@ -0,0 +1,109 @@
> > +/*
> > + * Common Option ROM Functions for C code
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, see .
> > + *
> > + * Copyright (c) 2015-2019 Red Hat Inc.
> > + *   Authors:
> > + * Marc Marí 
> > + * Richard W.M. Jones 
> > + * Stefano Garzarella 
> > + */
> > +
> > +#ifndef OPTROM_H
> > +#define OPTROM_H
> > +
> > +#include "../../include/standard-headers/linux/qemu_fw_cfg.h"
>
> This depends on , please include it first.

Sure.


Thanks,
Stefano


-- 
Stefano Garzarella
Red Hat

Re: [Qemu-devel] [PATCH] ui: vnc: finish removing TABs

2019-01-11 Thread Daniel P . Berrangé

On Tue, Dec 18, 2018 at 12:16:29AM +0100, Paolo Bonzini wrote:
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Paolo Bonzini 
> ---
>  ui/vnc-enc-hextile-template.h | 268 +++
>  ui/vnc-enc-zywrle.h   | 394 +-
>  2 files changed, 331 insertions(+), 331 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC PATCH 15/15] ui/console: Add "ui/pixelformat.h" to declare PixelFormat

2019-01-11 Thread Paolo Bonzini

On 11/01/19 15:08, Philippe Mathieu-Daudé wrote:
> PixelFormat is used by "ui/console.h" and by "ui/qemu-pixman.h".

ui/console.h already includes ui/qemu-pixman.h, just move it there?

Paolo

> Create the new header "ui/pixelformat.h" to declare this structure,
> and remove the forward declaration from "qemu/typedefs.h".
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because "ui/console.h" has on license, so I added a default one.

Re: [Qemu-devel] [PATCH] qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol

2019-01-11 Thread Eric Blake

On 1/11/19 10:35 AM, Philippe Mathieu-Daudé wrote:
> The use of a variable named 'exp' prevents includes to import .
> 
> Rename it to avoid:
> 
>   qemu-nbd.c:64:19: error: ‘exp’ redeclared as different kind of symbol
>static NBDExport *exp;
>  ^~~
>   In file included from /usr/include/features.h:428,
>from /usr/include/bits/libc-header-start.h:33,
>from /usr/include/stdint.h:26,
>from /usr/lib/gcc/x86_64-redhat-linux/8/include/stdint.h:9,
>from /source/qemu/include/qemu/osdep.h:80,
>from /source/qemu/qemu-nbd.c:19:
>   /usr/include/bits/mathcalls.h:95:1: note: previous declaration of ‘exp’ was 
> here
> __MATHCALL_VEC (exp,, (_Mdouble_ __x));
> ^~
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  qemu-nbd.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)

Why did you need to import math.h?

But it's reasonable enough. Will queue through my NBD tree; in part
because it has conflicts with other pending NBD patches (so
trivial-patches taking it would actually cost me more efforts than me
just fixing the conflicts).

Reviewed-by: Eric Blake 

> +++ b/qemu-nbd.c
> @@ -61,7 +61,7 @@
>  
>  #define MBR_SIZE 512
>  
> -static NBDExport *exp;
> +static NBDExport *export;

This one is a definite problem (POSIX says you shouldn't name any static
variable the same as a standard function entry point, even if it happens
to lnik)...

>  static int verbose;
>  static char *srcpath;
>  static SocketAddress *saddr;
> @@ -335,7 +335,7 @@ static int nbd_can_accept(void)
>  return state == RUNNING && nb_fds < shared;
>  }
>  
> -static void nbd_export_closed(NBDExport *exp)
> +static void nbd_export_closed(NBDExport *export)

...this one just silences a -Wshadow but is permitted by strict C. We
have too many other -Wshadow violations in our code base, but I don't
mind getting rid of this one.

> @@ -1015,10 +1015,11 @@ int main(int argc, char **argv)
>  }
>  }
>  
> -exp = nbd_export_new(bs, dev_offset, fd_size, nbdflags, 
> nbd_export_closed,
> - writethrough, NULL, _fatal);
> -nbd_export_set_name(exp, export_name);
> -nbd_export_set_description(exp, export_description);
> +export = nbd_export_new(bs, dev_offset, fd_size, nbdflags,
> +nbd_export_closed, writethrough,
> +NULL, _fatal);
> +nbd_export_set_name(export, export_name);
> +nbd_export_set_description(export, export_description);

Here's where my pending to add 'qemu-nbd --bitmap' conflict; I don't
mind fixing it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status

2019-01-11 Thread Vladimir Sementsov-Ogievskiy



On 11.01.2019 17:04, Eric Blake wrote:
> On 1/11/19 10:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
> I suggested one: Pass large contiguous allocated ranges to the protocol
> driver, but just assume that the allocation status is correct in the
> format layer if they are small.

 So, for fully allocated image, we will call lseek always?
>>>
>>> If they are fully contiguous, yes. But that's a single lseek() call per
>>> image then instead of an lseek() for every 64k, so not a big problem.
>>
>> lseek is called on each mirror iteration, why one per image?
> 
> If the image has no holes, then lseek(0, SEEK_HOLE) will return EOF, and
> then you know that there are no holes, and you don't need to make any
> further lseek() calls.  Hence, once per image.  A fully-allocated file
> that has areas that read as known zeroes can be determined by fiemap
> (but not by lseek, which can only detect unallocated holes) - but we
> already know that while fiemap has more information, it also has more
> problems (you cannot use it safely without sync, but sync makes it too
> slow to use), so that is a non-starter.
> 
>>>
>>> In the more realistic case, you will still call lseek() occasionally
>>> because you will have some fragmentation, but the fragments can be
>>> rather large. But it should still significantly reduce them compared to
>>> now because you skip it for those parts with small contiguous
>>> allocations where lseek() would be called a lot today.
>>>
>>> Kevin
>>>
>>
>> Ok, you propose not to call lseek for small enough data regions reported by
>> format layer. And for images which are less fragmented, this helps less or 
>> don't
>> help.
> 
> Indeed - pick some threshold (maybe 16 clusters); if block status of the
> format layer returns something smaller than the threshold, don't bother
> refining the answer further by doing block status of the protocol layer
> (if the caller is iterating over an image 1 cluster at a time, then the
> threshold will never be tripped and thus you'll never do an lseek); but
> where the block status of the format layer is large, we are reading the
> file in larger chunks so we end up with fewer lseeks in the long run
> anyways.
> 
>>
>> Why do you think it is better?
>>
>> For not preallocated images it is worse, as it covers less cases. So, for our
>> scenarios it is worse.
> 
> Anywhere that you skip calling lseek(), you end up missing out on
> opportunities to optimize had you instead been able to learn from lseek
> that you were on a hole after all.

What are the cases when we'll benefit from lseek except preallocated images?

   So it becomes a balancing question:
> how much time is spent on probing for whether an optimization is even
> possible, vs. how much time is spent if the probe succeeded and you can
> then optimize.  For a fully-allocated image, all of the time spent
> probing is wasted (you never find a hole, so every probe was wasted).
> So it is indeed a tradeoff when picking your heuristics, of trying to
> balance how likely the probe will justify the time spent on the probe.
> 
>> The only case, when heuristic works better, is when user have preallocated 
>> image,
>> but don't know about new option, which returns old behavior. We are not 
>> interested
>> in this case and can't go this way, as it doesn't guarantee, that some 
>> customer will
>> not come again with lseek-related problems.
>>
>> Don't you like what Eric propose, about binding behavior switch to existing
>> detect-zeroes option?
>>
>> Or, we can add an opposite option, to enable new behavior, keeping the old 
>> one by
>> default. So, all stays as is, and who need uses new option. Heuristic may be
>> implemented then too.
>>
>

Re: [Qemu-devel] [PATCH] crypto: finish removing TABs

2019-01-11 Thread Daniel P . Berrangé

On Tue, Dec 18, 2018 at 12:16:39AM +0100, Paolo Bonzini wrote:
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Paolo Bonzini 
> ---
>  crypto/aes.c| 414 -
>  crypto/desrfb.c | 594 
>  2 files changed, 504 insertions(+), 504 deletions(-)

Acked-by: Daniel P. Berrangé 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v4 for-4.0 0/7] vhost-user-blk: Add support for backend reconnecting

2019-01-11 Thread Michael S. Tsirkin

On Fri, Jan 11, 2019 at 03:53:42PM +, Stefan Hajnoczi wrote:
> I'm concerned that this approach to device recovery is invasive and hard
> to test.  Instead I would use VIRTIO's Device Status Field
> DEVICE_NEEDS_RESET bit to tell the guest driver that a reset is
> necessary.  This is more disruptive - drivers either have to resubmit or
> fail I/O with EIO - but it's also simple and more likely to work
> correctly (it only needs to be implemented correctly in the guest
> driver, not in the many available vhost-user backend implementations).
> 
> Stefan

Unfortunately drivers don't support DEVICE_NEEDS_RESET yet.
I'll be happy to accept patches, but this means we
can't depend on DEVICE_NEEDS_RESET for basic functionality
like reconnect. And given virtio 1.0 has been there
for a while now, I suspect the only way to start using
it is by adding a new feature flag.

Unfortunate but true.

-- 
MST

Re: [Qemu-devel] [PATCH] ui: vnc: finish removing TABs

2019-01-11 Thread Paolo Bonzini

On 18/12/18 00:16, Paolo Bonzini wrote:
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Paolo Bonzini 
> ---
>  ui/vnc-enc-hextile-template.h | 268 +++
>  ui/vnc-enc-zywrle.h   | 394 +-
>  2 files changed, 331 insertions(+), 331 deletions(-)
> 
> diff --git a/ui/vnc-enc-hextile-template.h b/ui/vnc-enc-hextile-template.h
> index d868d75720..0c56262aff 100644
> --- a/ui/vnc-enc-hextile-template.h
> +++ b/ui/vnc-enc-hextile-template.h
> @@ -30,127 +30,127 @@ static void CONCAT(send_hextile_tile_, NAME)(VncState 
> *vs,
>  int n_subtiles = 0;
>  
>  for (j = 0; j < h; j++) {
> - for (i = 0; i < w; i++) {
> - switch (n_colors) {
> - case 0:
> - bg = irow[i];
> - n_colors = 1;
> - break;
> - case 1:
> - if (irow[i] != bg) {
> - fg = irow[i];
> - n_colors = 2;
> - }
> - break;
> - case 2:
> - if (irow[i] != bg && irow[i] != fg) {
> - n_colors = 3;
> - } else {
> - if (irow[i] == bg)
> - bg_count++;
> - else if (irow[i] == fg)
> - fg_count++;
> - }
> - break;
> - default:
> - break;
> - }
> - }
> - if (n_colors > 2)
> - break;
> - irow += vnc_server_fb_stride(vd) / sizeof(pixel_t);
> +for (i = 0; i < w; i++) {
> +switch (n_colors) {
> +case 0:
> +bg = irow[i];
> +n_colors = 1;
> +break;
> +case 1:
> +if (irow[i] != bg) {
> +fg = irow[i];
> +n_colors = 2;
> +}
> +break;
> +case 2:
> +if (irow[i] != bg && irow[i] != fg) {
> +n_colors = 3;
> +} else {
> +if (irow[i] == bg)
> +bg_count++;
> +else if (irow[i] == fg)
> +fg_count++;
> +}
> +break;
> +default:
> +break;
> +}
> +}
> +if (n_colors > 2)
> +break;
> +irow += vnc_server_fb_stride(vd) / sizeof(pixel_t);
>  }
>  
>  if (n_colors > 1 && fg_count > bg_count) {
> - pixel_t tmp = fg;
> - fg = bg;
> - bg = tmp;
> +pixel_t tmp = fg;
> +fg = bg;
> +bg = tmp;
>  }
>  
>  if (!*has_bg || *last_bg != bg) {
> - flags |= 0x02;
> - *has_bg = 1;
> - *last_bg = bg;
> +flags |= 0x02;
> +*has_bg = 1;
> +*last_bg = bg;
>  }
>  
>  if (n_colors < 3 && (!*has_fg || *last_fg != fg)) {
> - flags |= 0x04;
> - *has_fg = 1;
> - *last_fg = fg;
> +flags |= 0x04;
> +*has_fg = 1;
> +*last_fg = fg;
>  }
>  
>  switch (n_colors) {
>  case 1:
> - n_data = 0;
> - break;
> +n_data = 0;
> +break;
>  case 2:
> - flags |= 0x08;
> -
> - irow = (pixel_t *)row;
> -
> - for (j = 0; j < h; j++) {
> - int min_x = -1;
> - for (i = 0; i < w; i++) {
> - if (irow[i] == fg) {
> - if (min_x == -1)
> - min_x = i;
> - } else if (min_x != -1) {
> - hextile_enc_cord(data + n_data, min_x, j, i - min_x, 1);
> - n_data += 2;
> - n_subtiles++;
> - min_x = -1;
> - }
> - }
> - if (min_x != -1) {
> - hextile_enc_cord(data + n_data, min_x, j, i - min_x, 1);
> - n_data += 2;
> - n_subtiles++;
> - }
> - irow += vnc_server_fb_stride(vd) / sizeof(pixel_t);
> - }
> - break;
> +flags |= 0x08;
> +
> +irow = (pixel_t *)row;
> +
> +for (j = 0; j < h; j++) {
> +int min_x = -1;
> +for (i = 0; i < w; i++) {
> +if (irow[i] == fg) {
> +if (min_x == -1)
> +min_x = i;
> +} else if (min_x != -1) {
> +hextile_enc_cord(data + n_data, min_x, j, i - min_x, 1);
> +n_data += 2;
> +n_subtiles++;
> +min_x = -1;
> +}
> +}
> +if (min_x != -1) {
> +hextile_enc_cord(data + n_data, min_x, j, i - min_x, 1);
> +n_data += 2;
> +n_subtiles++;
> +}
> +irow += vnc_server_fb_stride(vd) / sizeof(pixel_t);
> +}
> +break;
>  case 3:
> - flags |= 0x18;
> -
> - irow = (pixel_t *)row;
> -
> - if (!*has_bg || *last_bg != bg)
> - flags |= 0x02;
> -
> - for (j = 0; j < h; j++) {
> -

Re: [Qemu-devel] [PATCH] crypto: finish removing TABs

2019-01-11 Thread Paolo Bonzini

On 18/12/18 00:16, Paolo Bonzini wrote:
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Paolo Bonzini 
> ---
>  crypto/aes.c| 414 -
>  crypto/desrfb.c | 594 
>  2 files changed, 504 insertions(+), 504 deletions(-)
> 
> diff --git a/crypto/aes.c b/crypto/aes.c
> index 773d246b00..86b3092324 100644
> --- a/crypto/aes.c
> +++ b/crypto/aes.c
> @@ -1059,109 +1059,109 @@ const uint32_t AES_Td4[256] = {
>  0xU, 0x21212121U, 0x0c0c0c0cU, 0x7d7d7d7dU,
>  };
>  static const u32 rcon[] = {
> - 0x0100, 0x0200, 0x0400, 0x0800,
> - 0x1000, 0x2000, 0x4000, 0x8000,
> - 0x1B00, 0x3600, /* for 128-bit blocks, Rijndael never uses more 
> than 10 rcon values */
> +0x0100, 0x0200, 0x0400, 0x0800,
> +0x1000, 0x2000, 0x4000, 0x8000,
> +0x1B00, 0x3600, /* for 128-bit blocks, Rijndael never uses 
> more than 10 rcon values */
>  };
>  
>  /**
>   * Expand the cipher key into the encryption key schedule.
>   */
>  int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
> - AES_KEY *key) {
> +AES_KEY *key) {
>  
> - u32 *rk;
> - int i = 0;
> - u32 temp;
> +u32 *rk;
> +int i = 0;
> +u32 temp;
>  
> - if (!userKey || !key)
> - return -1;
> - if (bits != 128 && bits != 192 && bits != 256)
> - return -2;
> +if (!userKey || !key)
> +return -1;
> +if (bits != 128 && bits != 192 && bits != 256)
> +return -2;
>  
> - rk = key->rd_key;
> +rk = key->rd_key;
>  
> - if (bits==128)
> - key->rounds = 10;
> - else if (bits==192)
> - key->rounds = 12;
> - else
> - key->rounds = 14;
> +if (bits==128)
> +key->rounds = 10;
> +else if (bits==192)
> +key->rounds = 12;
> +else
> +key->rounds = 14;
>  
> - rk[0] = GETU32(userKey );
> - rk[1] = GETU32(userKey +  4);
> - rk[2] = GETU32(userKey +  8);
> - rk[3] = GETU32(userKey + 12);
> - if (bits == 128) {
> - while (1) {
> - temp  = rk[3];
> - rk[4] = rk[0] ^
> +rk[0] = GETU32(userKey );
> +rk[1] = GETU32(userKey +  4);
> +rk[2] = GETU32(userKey +  8);
> +rk[3] = GETU32(userKey + 12);
> +if (bits == 128) {
> +while (1) {
> +temp  = rk[3];
> +rk[4] = rk[0] ^
>  (AES_Te4[(temp >> 16) & 0xff] & 0xff00) ^
>  (AES_Te4[(temp >>  8) & 0xff] & 0x00ff) ^
>  (AES_Te4[(temp  ) & 0xff] & 0xff00) ^
>  (AES_Te4[(temp >> 24)   ] & 0x00ff) ^
> - rcon[i];
> - rk[5] = rk[1] ^ rk[4];
> - rk[6] = rk[2] ^ rk[5];
> - rk[7] = rk[3] ^ rk[6];
> - if (++i == 10) {
> - return 0;
> - }
> - rk += 4;
> - }
> - }
> - rk[4] = GETU32(userKey + 16);
> - rk[5] = GETU32(userKey + 20);
> - if (bits == 192) {
> - while (1) {
> - temp = rk[ 5];
> - rk[ 6] = rk[ 0] ^
> +rcon[i];
> +rk[5] = rk[1] ^ rk[4];
> +rk[6] = rk[2] ^ rk[5];
> +rk[7] = rk[3] ^ rk[6];
> +if (++i == 10) {
> +return 0;
> +}
> +rk += 4;
> +}
> +}
> +rk[4] = GETU32(userKey + 16);
> +rk[5] = GETU32(userKey + 20);
> +if (bits == 192) {
> +while (1) {
> +temp = rk[ 5];
> +rk[ 6] = rk[ 0] ^
>  (AES_Te4[(temp >> 16) & 0xff] & 0xff00) ^
>  (AES_Te4[(temp >>  8) & 0xff] & 0x00ff) ^
>  (AES_Te4[(temp  ) & 0xff] & 0xff00) ^
>  (AES_Te4[(temp >> 24)   ] & 0x00ff) ^
> - rcon[i];
> - rk[ 7] = rk[ 1] ^ rk[ 6];
> - rk[ 8] = rk[ 2] ^ rk[ 7];
> - rk[ 9] = rk[ 3] ^ rk[ 8];
> - if (++i == 8) {
> - return 0;
> - }
> - rk[10] = rk[ 4] ^ rk[ 9];
> - rk[11] = rk[ 5] ^ rk[10];
> - rk += 6;
> - }
> - }
> - rk[6]

Re: [Qemu-devel] [PATCH] crypto: finish removing TABs

2019-01-11 Thread Paolo Bonzini

On 18/12/18 01:16, Philippe Mathieu-Daudé wrote:
> On 12/18/18 12:16 AM, Paolo Bonzini wrote:
>> Suggested-by: Daniel P. Berrange 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  crypto/aes.c| 414 -
>>  crypto/desrfb.c | 594 
>>  2 files changed, 504 insertions(+), 504 deletions(-)
>>
>> diff --git a/crypto/aes.c b/crypto/aes.c
>> index 773d246b00..86b3092324 100644
>> --- a/crypto/aes.c
>> +++ b/crypto/aes.c
>> @@ -1059,109 +1059,109 @@ const uint32_t AES_Td4[256] = {
>>  0xU, 0x21212121U, 0x0c0c0c0cU, 0x7d7d7d7dU,
>>  };
>>  static const u32 rcon[] = {
>> -0x0100, 0x0200, 0x0400, 0x0800,
>> -0x1000, 0x2000, 0x4000, 0x8000,
>> -0x1B00, 0x3600, /* for 128-bit blocks, Rijndael never uses more 
>> than 10 rcon values */
>> +0x0100, 0x0200, 0x0400, 0x0800,
>> +0x1000, 0x2000, 0x4000, 0x8000,
>> +0x1B00, 0x3600, /* for 128-bit blocks, Rijndael never uses 
>> more than 10 rcon values */
>>  };
>>  
>>  /**
>>   * Expand the cipher key into the encryption key schedule.
>>   */
>>  int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
>> -AES_KEY *key) {
>> +AES_KEY *key) {
>>  
>> -u32 *rk;
>> -int i = 0;
>> -u32 temp;
>> +u32 *rk;
>> +int i = 0;
>> +u32 temp;
>>  
>> -if (!userKey || !key)
>> -return -1;
>> -if (bits != 128 && bits != 192 && bits != 256)
>> -return -2;
>> +if (!userKey || !key)
> 
> Follow up patch: use braces to respect QEMU CODYING_STYLE.
> 
> {
> 
>> +return -1;
> 
> }
> 
>> +if (bits != 128 && bits != 192 && bits != 256)
>> +return -2;
>>  
>> -rk = key->rd_key;
>> +rk = key->rd_key;
>>  
>> -if (bits==128)
>> -key->rounds = 10;
>> -else if (bits==192)
>> -key->rounds = 12;
>> -else
>> -key->rounds = 14;
>> +if (bits==128)
>> +key->rounds = 10;
>> +else if (bits==192)
>> +key->rounds = 12;
>> +else
>> +key->rounds = 14;
>>  
>> -rk[0] = GETU32(userKey );
>> -rk[1] = GETU32(userKey +  4);
>> -rk[2] = GETU32(userKey +  8);
>> -rk[3] = GETU32(userKey + 12);
>> -if (bits == 128) {
>> -while (1) {
>> -temp  = rk[3];
>> -rk[4] = rk[0] ^
>> +rk[0] = GETU32(userKey );
>> +rk[1] = GETU32(userKey +  4);
>> +rk[2] = GETU32(userKey +  8);
>> +rk[3] = GETU32(userKey + 12);
>> +if (bits == 128) {
>> +while (1) {
>> +temp  = rk[3];
>> +rk[4] = rk[0] ^
>>  (AES_Te4[(temp >> 16) & 0xff] & 0xff00) 
>> ^
>>  (AES_Te4[(temp >>  8) & 0xff] & 0x00ff) 
>> ^
>>  (AES_Te4[(temp  ) & 0xff] & 0xff00) 
>> ^
>>  (AES_Te4[(temp >> 24)   ] & 0x00ff) 
>> ^
>> -rcon[i];
>> -rk[5] = rk[1] ^ rk[4];
>> -rk[6] = rk[2] ^ rk[5];
>> -rk[7] = rk[3] ^ rk[6];
>> -if (++i == 10) {
>> -return 0;
>> -}
>> -rk += 4;
>> -}
>> -}
>> -rk[4] = GETU32(userKey + 16);
>> -rk[5] = GETU32(userKey + 20);
>> -if (bits == 192) {
>> -while (1) {
>> -temp = rk[ 5];
>> -rk[ 6] = rk[ 0] ^
>> +rcon[i];
>> +rk[5] = rk[1] ^ rk[4];
>> +rk[6] = rk[2] ^ rk[5];
>> +rk[7] = rk[3] ^ rk[6];
>> +if (++i == 10) {
>> +return 0;
>> +}
>> +rk += 4;
>> +}
>> +}
>> +rk[4] = GETU32(userKey + 16);
>> +rk[5] = GETU32(userKey + 20);
>> +if (bits == 192) {
>> +while (1) {
>> +temp = rk[ 5];
>> +rk[ 6] = rk[ 0] ^
>>  (AES_Te4[(temp >> 16) & 0xff] & 0xff00) 
>> ^
>>  (AES_Te4[(temp >>  8) & 0xff] & 0x00ff) 
>> ^
>>  (AES_Te4[(temp  ) & 0xff] & 0xff00) 
>> ^
>>  (AES_Te4[(temp >> 24)   ] & 0x00ff) 
>> ^
>> -rcon[i];
>> -rk[ 7] = rk[ 1] ^ rk[ 6];
>> -rk[ 8] = rk[ 2] ^ rk[ 7];
>> -rk[ 9] = rk[ 3] ^ rk[ 8];
>> -if (++i ==

Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status

2019-01-11 Thread Eric Blake

On 1/11/19 10:22 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Even a dumb most-recent use cache will speed this up: both the second
>> and third queries above can be avoided because we know that both 0x4
>> and 0x3 the second query at 0x4 can be skipped (0x4 is
>> between our most recent lseek at 0x2 and hole at 0x1)
> 
> Is it correct just use results from previous iterations? In mirror source
> is active and may change.

If you keep a cache, you have to keep the cache up-to-date. Any writes
to an area that is covered by the known-hole cache have to flush the
cache, so that the next block status no longer sees a known-hole and
ends up doing another lseek.  Or, if the cache has enough state to track
unknown/known-hole/known-data, then writes update the cache to be
known-data, and future block status can skip the lseek by using the
results of the cache.

> 
>>
>> Make the cache slightly larger, or use a bitmap with 2 bits per cluster
>> (tracking unknown, known-data, known-hole), with proper flushing of the
>> cache as we write to the image, or whatever, and we should automatically
>> get some performance improvements by using fewer lseek() anywhere that
>> we remember what previous lseek() already told us, with no knobs needed.
>>
> 
> So the cache should consider all writes and discards. And it is obviously
> more difficult to implement it, than just don't call this lseek. And I
> don't understand, why cache + lseek is better for the case when we don't
> need nor the lseek neither the cache. Is this all to not add an option?
> Also Kevin objects to caching lseek in parallel sub-thread.

Keven objected to caching anything if the image has multiple writers,
where an outside process could change the file allocation in between our
reads. But multiple writers is rare - in fact, our image locking for
qcow2 formats tries to prevent multiple writers.  Having multiple
threads within one process writing is fine, as long as they properly
coordinate writes to the lseek cache so that readers never see a stale
claim of a hole - although a stale claim of data is safe.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status

2019-01-11 Thread Eric Blake

On 1/11/19 10:09 AM, Vladimir Sementsov-Ogievskiy wrote:

 I suggested one: Pass large contiguous allocated ranges to the protocol
 driver, but just assume that the allocation status is correct in the
 format layer if they are small.
>>>
>>> So, for fully allocated image, we will call lseek always?
>>
>> If they are fully contiguous, yes. But that's a single lseek() call per
>> image then instead of an lseek() for every 64k, so not a big problem.
> 
> lseek is called on each mirror iteration, why one per image?

If the image has no holes, then lseek(0, SEEK_HOLE) will return EOF, and
then you know that there are no holes, and you don't need to make any
further lseek() calls.  Hence, once per image.  A fully-allocated file
that has areas that read as known zeroes can be determined by fiemap
(but not by lseek, which can only detect unallocated holes) - but we
already know that while fiemap has more information, it also has more
problems (you cannot use it safely without sync, but sync makes it too
slow to use), so that is a non-starter.

>>
>> In the more realistic case, you will still call lseek() occasionally
>> because you will have some fragmentation, but the fragments can be
>> rather large. But it should still significantly reduce them compared to
>> now because you skip it for those parts with small contiguous
>> allocations where lseek() would be called a lot today.
>>
>> Kevin
>>
> 
> Ok, you propose not to call lseek for small enough data regions reported by
> format layer. And for images which are less fragmented, this helps less or 
> don't
> help.

Indeed - pick some threshold (maybe 16 clusters); if block status of the
format layer returns something smaller than the threshold, don't bother
refining the answer further by doing block status of the protocol layer
(if the caller is iterating over an image 1 cluster at a time, then the
threshold will never be tripped and thus you'll never do an lseek); but
where the block status of the format layer is large, we are reading the
file in larger chunks so we end up with fewer lseeks in the long run
anyways.

> 
> Why do you think it is better?
> 
> For not preallocated images it is worse, as it covers less cases. So, for our
> scenarios it is worse.

Anywhere that you skip calling lseek(), you end up missing out on
opportunities to optimize had you instead been able to learn from lseek
that you were on a hole after all.  So it becomes a balancing question:
how much time is spent on probing for whether an optimization is even
possible, vs. how much time is spent if the probe succeeded and you can
then optimize.  For a fully-allocated image, all of the time spent
probing is wasted (you never find a hole, so every probe was wasted).
So it is indeed a tradeoff when picking your heuristics, of trying to
balance how likely the probe will justify the time spent on the probe.

> The only case, when heuristic works better, is when user have preallocated 
> image,
> but don't know about new option, which returns old behavior. We are not 
> interested
> in this case and can't go this way, as it doesn't guarantee, that some 
> customer will
> not come again with lseek-related problems.
> 
> Don't you like what Eric propose, about binding behavior switch to existing
> detect-zeroes option?
> 
> Or, we can add an opposite option, to enable new behavior, keeping the old 
> one by
> default. So, all stays as is, and who need uses new option. Heuristic may be
> implemented then too.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH 2/2] vfio-pci: Use vfio_register_event_notifier in vfio_intx_enable_kvm

2019-01-11 Thread Eric Auger

We can also use vfio_register_event_notifier() helper in
vfio_intx_enable_kvm to set the signalling associated to
VFIO_PCI_INTX_IRQ_INDEX.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 38 +++---
 1 file changed, 7 insertions(+), 31 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c589a4e666..db0504ca10 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -136,6 +136,9 @@ static int vfio_register_event_notifier(VFIOPCIDevice *vdev,
 case VFIO_PCI_ERR_IRQ_INDEX:
 notifier = >err_notifier;
 break;
+case VFIO_PCI_INTX_IRQ_INDEX:
+notifier = >intx.interrupt;
+break;
 default:
 return -EINVAL;
 }
@@ -351,10 +354,8 @@ static void vfio_intx_update(PCIDevice *pdev)
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
 uint8_t pin = vfio_pci_read_config(>pdev, PCI_INTERRUPT_PIN, 1);
-int ret, argsz, retval = 0;
-struct vfio_irq_set *irq_set;
-int32_t *pfd;
 Error *err = NULL;
+int ret;
 
 if (!pin) {
 return 0;
@@ -376,34 +377,12 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error 
**errp)
 }
 #endif
 
-ret = event_notifier_init(>intx.interrupt, 0);
+ret = vfio_register_event_notifier(vdev, VFIO_PCI_INTX_IRQ_INDEX, true,
+   vfio_intx_interrupt, errp);
 if (ret) {
-error_setg_errno(errp, -ret, "event_notifier_init failed");
 return ret;
 }
 
-argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-irq_set = g_malloc0(argsz);
-irq_set->argsz = argsz;
-irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
-irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
-irq_set->start = 0;
-irq_set->count = 1;
-pfd = (int32_t *)_set->data;
-
-*pfd = event_notifier_get_fd(>intx.interrupt);
-qemu_set_fd_handler(*pfd, vfio_intx_interrupt, NULL, vdev);
-
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-if (ret) {
-error_setg_errno(errp, -ret, "failed to setup INTx fd");
-qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
-event_notifier_cleanup(>intx.interrupt);
-retval = -errno;
-goto cleanup;
-}
-
 vfio_intx_enable_kvm(vdev, );
 if (err) {
 warn_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
@@ -413,10 +392,7 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error 
**errp)
 
 trace_vfio_intx_enable(vdev->vbasedev.name);
 
-cleanup:
-g_free(irq_set);
-
-return retval;
+return 0;
 }
 
 static void vfio_intx_disable(VFIOPCIDevice *vdev)
-- 
2.17.2

[Qemu-devel] [PATCH 1/2] vfio-pci: Introduce vfio_register_event_notifier helper

2019-01-11 Thread Eric Auger

The code used to attach the eventfd handler for the ERR and
REQ irq indices can be factorized into a helper. In subsequent
patches we will extend this helper to support other irq indices.

We test the notification is allowed outside of the helper:
respectively check vdev->pci_aer and VFIO_FEATURE_ENABLE_REQ.
Depending on the returned value we set vdev->pci_aer and
vdev->req_enabled. An error handle is introduced for future usage
although not strictly useful here.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 291 ++
 1 file changed, 127 insertions(+), 164 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c0cb1ec289..c589a4e666 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -105,6 +105,95 @@ static void vfio_intx_eoi(VFIODevice *vbasedev)
 vfio_unmask_single_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
+/*
+ * vfio_register_event_notifier - setup/tear down eventfd
+ * notification and handling for IRQ indices that span over
+ * a single IRQ
+ *
+ * @vdev: VFIO device handle
+ * @index: IRQ index the eventfd/handler is associated to
+ * @target_state: true means notifier needs to be set up
+ * @handler to attach if @target_state is true
+ * @errp error handle
+ */
+static int vfio_register_event_notifier(VFIOPCIDevice *vdev,
+int index,
+bool target_state,
+void (*handler)(void *opaque),
+Error **errp)
+{
+struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info),
+  .index = index };
+struct vfio_irq_set *irq_set;
+EventNotifier *notifier;
+int argsz, ret = 0;
+int32_t *pfd, fd;
+
+switch (index) {
+case VFIO_PCI_REQ_IRQ_INDEX:
+notifier = >req_notifier;
+break;
+case VFIO_PCI_ERR_IRQ_INDEX:
+notifier = >err_notifier;
+break;
+default:
+return -EINVAL;
+}
+
+if (ioctl(vdev->vbasedev.fd,
+  VFIO_DEVICE_GET_IRQ_INFO, _info) < 0 || irq_info.count < 1) {
+error_setg_errno(errp, errno,
+ "no irq index %d available", index);
+return -EINVAL;
+}
+
+if (target_state) {
+ret = event_notifier_init(notifier, 0);
+if (ret) {
+error_setg_errno(errp, -ret,
+ "Unable to init event notifier for irq index %d",
+ index);
+return ret;
+}
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = index;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)_set->data;
+
+if (!notifier) {
+return -EINVAL;
+}
+
+fd = event_notifier_get_fd(notifier);
+
+if (target_state) {
+qemu_set_fd_handler(fd, handler, NULL, vdev);
+*pfd = fd;
+} else {
+qemu_set_fd_handler(fd, NULL, NULL, vdev);
+event_notifier_cleanup(notifier);
+*pfd = -1;
+}
+
+ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
+g_free(irq_set);
+
+if (ret) {
+error_setg_errno(errp, -ret,
+ "vfio: Failed to %s eventfd signalling for index %d",
+ *pfd < 0 ? "set up" : "tear down", index);
+}
+return ret;
+}
+
 static void vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp)
 {
 #ifdef CONFIG_KVM
@@ -2621,86 +2710,6 @@ static void vfio_err_notifier_handler(void *opaque)
 vm_stop(RUN_STATE_INTERNAL_ERROR);
 }
 
-/*
- * Registers error notifier for devices supporting error recovery.
- * If we encounter a failure in this function, we report an error
- * and continue after disabling error recovery support for the
- * device.
- */
-static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
-{
-int ret;
-int argsz;
-struct vfio_irq_set *irq_set;
-int32_t *pfd;
-
-if (!vdev->pci_aer) {
-return;
-}
-
-if (event_notifier_init(>err_notifier, 0)) {
-error_report("vfio: Unable to init event notifier for error 
detection");
-vdev->pci_aer = false;
-return;
-}
-
-argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-irq_set = g_malloc0(argsz);
-irq_set->argsz = argsz;
-irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
- VFIO_IRQ_SET_ACTION_TRIGGER;
-irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
-irq_set->start = 0;
-irq_set->count = 1;
-pfd = (int32_t *)_set->data;
-
-*pfd = event_notifier_get_fd(>err_notifier);
-qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
-
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-if (ret) {
-error_report("vfio: Failed to set up

[Qemu-devel] [PATCH 0/2] vfio-pci: Introduce vfio_register_event_notifier()

2019-01-11 Thread Eric Auger

This small series introduces the vfio_register_event_notifier()
helper which allows to set up/tear down the VFIO signalling of eventfd
for ERR, REQ and INTX irq indices.

On top of that, a new irq index is planned to signal DMA faults
in nested mode use case. This would use exactly the same mechanics.

Eric Auger (2):
  vfio-pci: Introduce vfio_register_event_notifier helper
  vfio-pci: Use vfio_register_event_notifier in vfio_intx_enable_kvm

 hw/vfio/pci.c | 329 --
 1 file changed, 134 insertions(+), 195 deletions(-)

-- 
2.17.2

Re: [Qemu-devel] [PULL v8 00/35] Misc patches for 2018-12-18

2019-01-11 Thread Peter Maydell

On Fri, 11 Jan 2019 at 14:52, Paolo Bonzini  wrote:
>
> The following changes since commit a311f891abf3833c1e4c5a62a6e5b0f1b81f22c3:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/linux-user-for-4.0-pull-request' into staging 
> (2019-01-10 17:49:54 +)
>
> are available in the Git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 7d37435bd5801bb49e1c4b550fedd1c5fe143131:
>
>   avoid TABs in files that only contain a few (2019-01-11 15:46:56 +0100)
>
> 
> * HAX support for Linux hosts (Alejandro)
> * esp bugfixes (Guenter)
> * Windows build cleanup (Marc-André)
> * checkpatch logic improvements (Paolo)
> * coalesced range bugfix (Paolo)
> * switch testsuite to TAP (Paolo)
> * QTAILQ rewrite (Paolo)
> * block/iscsi.c cancellation fixes (Stefan)
> * improve selection of the default accelerator (Thomas)
>
> 
> v7->v8: dropped Debian dockerfile changes altogether
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM

[Qemu-devel] [PATCH] hw/vfio/common: Refactor container initialization

2019-01-11 Thread Eric Auger

In vfio_connect_container() the code that selects the
iommu type can benefit from helpers such as
vfio_iommu_get_type() and vfio_init_container(). As
a result we end up with a switch/case on the iommu type
that makes the code a little bit more readable and ready
for addition of new iommu types. Also ioctl's get called
once per iommu_type.

Signed-off-by: Eric Auger 
Reviewed-by: Greg Kurz 

---

v2 -> v3:
- originally submitted in [RFC v2 00/28] vSMMUv3/pSMMUv3
  2 stage VFIO integration
- remove "nested only is selected if requested by @force_nested"
- added Greg's R-b
---
 hw/vfio/common.c | 101 ++-
 1 file changed, 64 insertions(+), 37 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7c185e5a2e..8535bfe66f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1036,12 +1036,57 @@ static void vfio_put_address_space(VFIOAddressSpace 
*space)
 }
 }
 
+/*
+ * vfio_iommu_get_type - selects the richest iommu_type (v2 first)
+ */
+static int vfio_iommu_get_type(VFIOContainer *container,
+   Error **errp)
+{
+int fd = container->fd;
+
+if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
+return VFIO_TYPE1v2_IOMMU;
+} else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
+return VFIO_TYPE1_IOMMU;
+} else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
+return VFIO_SPAPR_TCE_v2_IOMMU;
+} else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
+return VFIO_SPAPR_TCE_IOMMU;
+} else {
+error_setg(errp, "No available IOMMU models");
+return -EINVAL;
+}
+}
+
+static int vfio_init_container(VFIOContainer *container, int group_fd,
+   int iommu_type, Error **errp)
+{
+int ret;
+
+ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, >fd);
+if (ret) {
+error_setg_errno(errp, errno, "failed to set group container");
+return -errno;
+}
+
+ret = ioctl(container->fd, VFIO_SET_IOMMU, iommu_type);
+if (ret) {
+error_setg_errno(errp, errno, "failed to set iommu for container");
+return -errno;
+}
+container->iommu_type = iommu_type;
+return 0;
+}
+
 static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
   Error **errp)
 {
 VFIOContainer *container;
 int ret, fd;
 VFIOAddressSpace *space;
+int iommu_type;
+bool v2 = false;
+
 
 space = vfio_get_address_space(as);
 
@@ -1101,23 +1146,20 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 container->fd = fd;
 QLIST_INIT(>giommu_list);
 QLIST_INIT(>hostwin_list);
-if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) ||
-ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
-bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU);
+
+iommu_type = vfio_iommu_get_type(container, errp);
+if (iommu_type < 0) {
+goto free_container_exit;
+}
+
+switch (iommu_type) {
+case VFIO_TYPE1v2_IOMMU:
+case VFIO_TYPE1_IOMMU:
+{
 struct vfio_iommu_type1_info info;
 
-ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
+ret = vfio_init_container(container, group->fd, iommu_type, errp);
 if (ret) {
-error_setg_errno(errp, errno, "failed to set group container");
-ret = -errno;
-goto free_container_exit;
-}
-
-container->iommu_type = v2 ? VFIO_TYPE1v2_IOMMU : VFIO_TYPE1_IOMMU;
-ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
-if (ret) {
-error_setg_errno(errp, errno, "failed to set iommu for container");
-ret = -errno;
 goto free_container_exit;
 }
 
@@ -1137,28 +1179,16 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 }
 vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
 container->pgsizes = info.iova_pgsizes;
-} else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU) ||
-   ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
+break;
+}
+case VFIO_SPAPR_TCE_v2_IOMMU:
+v2 = true;
+case VFIO_SPAPR_TCE_IOMMU:
+{
 struct vfio_iommu_spapr_tce_info info;
-bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU);
 
-ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
+ret = vfio_init_container(container, group->fd, iommu_type, errp);
 if (ret) {
-error_setg_errno(errp, errno, "failed to set group container");
-ret = -errno;
-goto free_container_exit;
-}
-container->iommu_type =
-v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
-ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
-if (ret) {
-

[Qemu-devel] [PATCH] scsi-generic: avoid possible out-of-bounds access to r->buf

2019-01-11 Thread Paolo Bonzini

Whenever the allocation length of a SCSI request is shorter than the size of the
VPD page list, page_idx is used blindly to index into r->buf.  Even though
the stores in the insertion sort are protected against overflows, the same is 
not
true of the reads and the final store of 0xb0.

This basically does the same thing as commit 57dbb58d80 ("scsi-generic: avoid
out-of-bounds access to VPD page list", 2018-11-06), except that here the
allocation length can be chosen by the guest.  Note that according to the SCSI
standard, the contents of the PAGE LENGTH field are not altered based
on the allocation length.

The code was introduced by commit 6c219fc8a1 ("scsi-generic: keep VPD
page list sorted", 2018-11-06) but the overflow was already possible before.

Reported-by: Kevin Wolf 
Fixes: a71c775b24ebc664129eb1d9b4c360590353efd5
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-generic.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 7237b4162e..42700e8897 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -182,7 +182,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, 
SCSIDevice *s)
 /* Also take care of the opt xfer len. */
 stl_be_p(>buf[12],
 MIN_NON_ZERO(max_transfer, ldl_be_p(>buf[12])));
-} else if (s->needs_vpd_bl_emulation && page == 0x00) {
+} else if (s->needs_vpd_bl_emulation && page == 0x00 && r->buflen >= 
4) {
 /*
  * Now we're capable of supplying the VPD Block Limits
  * response if the hardware can't. Add it in the INQUIRY
@@ -193,18 +193,20 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, 
SCSIDevice *s)
  * and will use it to proper setup the SCSI device.
  *
  * VPD page numbers must be sorted, so insert 0xb0 at the
- * right place with an in-place insert.  After the initialization
- * part of the for loop is executed, the device response is
- * at r[0] to r[page_idx - 1].
+ * right place with an in-place insert.  When the while loop
+ * begins the device response is at r[0] to r[page_idx - 1].
  */
-for (page_idx = lduw_be_p(r->buf + 2) + 4;
- page_idx > 4 && r->buf[page_idx - 1] >= 0xb0;
- page_idx--) {
+page_idx = lduw_be_p(r->buf + 2) + 4;
+page_idx = MIN(page_idx, r->buflen);
+while (page_idx > 4 && r->buf[page_idx - 1] >= 0xb0) {
 if (page_idx < r->buflen) {
 r->buf[page_idx] = r->buf[page_idx - 1];
 }
+page_idx--;
+}
+if (page_idx < r->buflen) {
+r->buf[page_idx] = 0xb0;
 }
-r->buf[page_idx] = 0xb0;
 stw_be_p(r->buf + 2, lduw_be_p(r->buf + 2) + 1);
 }
 }
-- 
2.20.1

Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation

2019-01-11 Thread Yury Kotov

10.01.2019, 23:14, "Dr. David Alan Gilbert" :
> * Yury Kotov (yury-ko...@yandex-team.ru) wrote:
>>  RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
>>  In this stage QEMU checks further information about RAMBlock:
>>  1. Presence (by idstr),
>>  2. Length (trying to resize, when differs),
>>  3. Optional page size.
>>
>>  This patch adds a check for RAMBlock's offset. Currently we check it during
>>  RAM pages loading - every RAM page has an offset in its header. But there 
>> is a
>>  case when we don't send RAM pages (see below).
>>
>>  The following commits introduce a capability (ignore-external) to skip some
>>  RAM blocks from migration. In such case the migration stream contains only
>>  meta information about RAM blocks to validate them. So, the only way to 
>> check
>>  block's offset is to send it explicitly.
>>
>>  Signed-off-by: Yury Kotov 
>
> But why check that offsets match? THey aren't supposed to!
> Offset's are entirely private to each qemu and they're allowed to be
> different; the only requirement is that the length and name of each
> RAMBlock matches, then all the operations we do over the migration
> stream are relative to the start of the block.
>

Yes, you are right. It seems that instead I should check block->mr->addr.

>
> One example where they are validly different is where you hotplug some
> RAM, so for example:
>
>   source qemu
>   -M 4G
>   hotplug PCI card
>   hotplug 2G
>
>   destination qemu
>   -M 4G
>   PCI card declared on the command line
>   extra 2G declared on the command line
>
> The offsets are different but we can migrate that case fine.
>
> Dave
>
>>  ---
>>   migration/ram.c | 15 +--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>>  diff --git a/migration/ram.c b/migration/ram.c
>>  index 7e7deec4d8..39629254e1 100644
>>  --- a/migration/ram.c
>>  +++ b/migration/ram.c
>>  @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>   if (migrate_postcopy_ram() && block->page_size != 
>> qemu_host_page_size) {
>>   qemu_put_be64(f, block->page_size);
>>   }
>>  + qemu_put_be64(f, block->offset);
>>   }
>>
>>   rcu_read_unlock();
>>  @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
>> version_id)
>>
>>   seq_iter++;
>>
>>  - if (version_id != 4) {
>>  + if (version_id < 4) {
>>   ret = -EINVAL;
>>   }
>>
>>  @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int 
>> version_id)
>>   ret = -EINVAL;
>>   }
>>   }
>>  + if (version_id >= 5) {
>>  + ram_addr_t offset;
>>  + offset = qemu_get_be64(f);
>>  + if (block->offset != offset) {
>>  + error_report("Mismatched RAM block offset %s "
>>  + "%" PRId64 "!= %" PRId64,
>>  + id, offset, (uint64_t)block->offset);
>>  + ret = -EINVAL;
>>  + }
>>  + }
>>   ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>> block->idstr);
>>   } else {
>>  @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
>>   void ram_mig_init(void)
>>   {
>>   qemu_mutex_init();
>>  - register_savevm_live(NULL, "ram", 0, 4, _ram_handlers, _state);
>>  + register_savevm_live(NULL, "ram", 0, 5, _ram_handlers, _state);
>>   }
>>  --
>>  2.20.1
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Regards,
Yury

1 2 3 4 >

1 - 100 of 308 matches

Mail list logo