date:20200610

Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

2020-06-10 Thread David Hildenbrand

On 10.06.20 06:31, David Gibson wrote:
> On Tue, Jun 09, 2020 at 12:44:39PM -0400, Michael S. Tsirkin wrote:
>> On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
>>> On Tue, 9 Jun 2020 17:47:47 +0200
>>> Claudio Imbrenda  wrote:
>>>
>>>> On Tue, 9 Jun 2020 11:41:30 +0200
>>>> Halil Pasic  wrote:
>>>>
>>>> [...]
>>>>
>>>>> I don't know. Janosch could answer that, but he is on vacation. Adding
>>>>> Claudio maybe he can answer. My understanding is, that while it might
>>>>> be possible, it is ugly at best. The ability to do a transition is
>>>>> indicated by a CPU model feature. Indicating the feature to the guest
>>>>> and then failing the transition sounds wrong to me.
>>>>
>>>> I agree. If the feature is advertised, then it has to work. I don't
>>>> think we even have an architected way to fail the transition for that
>>>> reason.
>>>>
>>>> What __could__ be done is to prevent qemu from even starting if an
>>>> incompatible device is specified together with PV.
>>>
>>> AFAIU, the "specified together with PV" is the problem here. Currently
>>> we don't "specify PV" but PV is just a capability that is managed by the
>>> CPU model (like so many other).
>>
>> So if we want to keep it user friendly, there could be
>> protection property with values on/off/auto, and auto
>> would poke at host capability to figure out whether
>> it's supported.
>>
>> Both virtio and CPU would inherit from that.
> 
> Right, that's what I have in mind for my 'host-trust-limitation'
> property (a generalized version of the existing 'memory-encryption'
> machine option).  My draft patches already set virtio properties
> accordingly, it should be possible to set (default) cpu properties as
> well.

No crazy CPU model hacks please (at least speaking for the s390x).

-- 
Thanks,

David / dhildenb

[Bug 1882065] Re: Could this cause OOB bug ?

2020-06-10 Thread r1ng0hacking

You must enable QEMU's tracing to trigger this bug!

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1882065

Title:
  Could this cause OOB bug ?

Status in QEMU:
  New

Bug description:
  In function megasas_handle_scsi(hw/scsi/megasas.c):

  ```c
  static int megasas_handle_scsi(MegasasState *s, MegasasCmd *cmd,
 int frame_cmd)
  {
  

  cdb = cmd->frame->pass.cdb;
  target_id = cmd->frame->header.target_id;
  lun_id = cmd->frame->header.lun_id;
  cdb_len = cmd->frame->header.cdb_len;
  

  if (cdb_len > 16) {
  trace_megasas_scsi_invalid_cdb_len(
  mfi_frame_desc[frame_cmd], is_logical,
  target_id, lun_id, cdb_len);
  megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
  cmd->frame->header.scsi_status = CHECK_CONDITION;
  s->event_count++;
  return MFI_STAT_SCSI_DONE_WITH_ERROR;
  }
  }
  ```

  Two variables, frame_cmd and cdb_len, can be controlled by the guest OS.
  So can mfi_frame_desc[frame_cmd] cause an out-of-bounds (OOB) access?
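
The general concern can be illustrated with a small, self-contained sketch. It does not reproduce the megasas code: demo_frame_desc[] and desc_lookup() are hypothetical stand-ins, and whether the real lookup is reachable with an unchecked index depends on validation done by the callers, which the excerpt above does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder table; the entries are illustrative, not the real mfi_frame_desc[]. */
static const char *const demo_frame_desc[] = { "CMD_0", "CMD_1", "CMD_2" };
#define DEMO_FRAME_DESC_COUNT (sizeof(demo_frame_desc) / sizeof(demo_frame_desc[0]))

/* Return the description for idx, or NULL if idx falls outside the table. */
static const char *desc_lookup(const char *const *table, size_t count, int idx)
{
    assert(table != NULL);
    if (idx < 0 || (size_t)idx >= count) {
        return NULL;        /* guest-supplied index is out of range */
    }
    return table[idx];
}
```

With a check like this, an index taken from guest-controlled data can never read past the end of the table; without it, any value at or beyond the table size would.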

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1882065/+subscriptions

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 5:51 AM, Thomas Huth wrote:
> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
> required header config-devices.h is not included, so that the legacy
> IGD passthrough is currently broken. Let's include the right header
> to fix this issue.
> 
> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>("hw/vfio: Move the IGD quirk code to a separate file")

What about a shorter tag?

Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")

> Signed-off-by: Thomas Huth 
> ---
>  hw/vfio/pci-quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index f2155ddb1d..3158390db1 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "config-devices.h"

I've been wondering how we can avoid that mistake in the
future, but can't find anything beside human review.

Reviewed-by: Philippe Mathieu-Daudé 

>  #include "exec/memop.h"
>  #include "qemu/units.h"
>  #include "qemu/error-report.h"
>

Re: [PATCH 1/7] target/arm: Fix missing temp frees in do_vshll_2sh

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/9/20 6:02 PM, Peter Maydell wrote:
> The widenfn() in do_vshll_2sh() does not free the input 32-bit
> TCGv, so we need to do this in the calling code.
> 
> Signed-off-by: Peter Maydell 
> ---
>  target/arm/translate-neon.inc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
> index 664d3612607..299a61f067b 100644
> --- a/target/arm/translate-neon.inc.c
> +++ b/target/arm/translate-neon.inc.c
> @@ -1624,6 +1624,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
>  tmp = tcg_temp_new_i64();
>  
>  widenfn(tmp, rm0);
> +tcg_temp_free_i32(rm0);
>  if (a->shift != 0) {
>  tcg_gen_shli_i64(tmp, tmp, a->shift);
>  tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
> @@ -1631,6 +1632,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
>  neon_store_reg64(tmp, a->vd);
>  
>  widenfn(tmp, rm1);
> +tcg_temp_free_i32(rm1);
>  if (a->shift != 0) {
>  tcg_gen_shli_i64(tmp, tmp, a->shift);
>  tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
> 

Reviewed-by: Philippe Mathieu-Daudé
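
The ownership rule behind the fix (the widening helper does not release its input, so the caller has to) can be shown with a toy model. This is only a sketch under loud assumptions: temp_alloc(), temp_free(), widen_u32() and the live_temps counter are hypothetical and have nothing to do with the real TCG allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int live_temps;      /* number of temporaries not yet released */

static void *temp_alloc(size_t n)
{
    void *p = malloc(n);
    assert(p != NULL);
    live_temps++;
    return p;
}

static void temp_free(void *p)
{
    assert(p != NULL && live_temps > 0);
    free(p);
    live_temps--;
}

/* Widening helper: writes dst, reads src, and does NOT release src. */
static void widen_u32(uint64_t *dst, const uint32_t *src)
{
    *dst = *src;
}

/* Caller: shift is assumed to be smaller than 64. */
static uint64_t widen_and_shift(uint32_t value, unsigned shift)
{
    uint32_t *in = temp_alloc(sizeof(*in));
    uint64_t *tmp = temp_alloc(sizeof(*tmp));
    uint64_t result;

    *in = value;
    widen_u32(tmp, in);
    temp_free(in);          /* the helper left this to the caller */
    result = *tmp << shift;
    temp_free(tmp);
    return result;
}
```

Dropping the temp_free(in) call leaves live_temps at 1 after every call, which is the toy equivalent of the leak the patch addresses.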

[Bug 1882817] Re: Segfault in audio_pcm_sw_write with audio over VNC

2020-06-10 Thread Philippe Mathieu-Daudé

** Changed in: qemu
   Status: New => In Progress

-- 
https://bugs.launchpad.net/bugs/1882817

Title:
  Segfault in audio_pcm_sw_write with audio over VNC

Status in QEMU:
  In Progress

Bug description:
  QEMU 5.0.0, built with ./configure --target-list=x86_64-softmmu
  --enable-debug --disable-strip --disable-docs --disable-sdl

  Running on a headless host (Ryzen 3600), Arch Linux, 64bit latest.
  Guest is also Arch Linux, 64bit.

  Started with:

  qemu-system-x86_64 -vnc 0.0.0.0:0 -enable-kvm -m 4096 -cpu host \
    -smp cores=2,threads=1,sockets=1 -machine q35 -vga std \
    -device ich9-ahci,id=ahci \
    -drive file=vm0.qcow2,format=qcow2,if=none,id=dsk0 \
    -device ide-hd,drive=dsk0,bus=ahci.0 -soundhw hda

  So, a headless VM is running on a server and is being connected to
  over VNC. The virtual sound card is detected and speaker test is
  running inside the VM. So far so good.

  Then, I tell the VNC client to enable audio (QEMU Audio Client
  Message, 255,1,0). QEMU responds with a "stream is about to start"
  message (QEMU Audio Server Message, 255,1,1) and then promptly crashes
  without sending anything else.

  Running it in GDB produces a crash at audio/audio.c:739

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  audio_pcm_sw_write (sw=0x575bbf30, buf=0x0, size=1628) at audio/audio.c:739
  739 if (!sw->hw->pcm_ops->volume_out) {

  The exact sequence of events does not matter - I can enable sound
  before playing anything, and then it would say nothing and keep
  working, but crash with the same message once anything sound-playing
  is launched in the VM.

  Using different soundhw or adding various audiodev options does not
  seem to affect anything.

  I can't quite figure out if the QEMU Audio VNC extension is supposed
  to work at all or not, but it would be handy to me if it is.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1882817/+subscriptions

[Bug 1882065] Re: Could this cause OOB bug ?

2020-06-10 Thread r1ng0hacking

** Description changed:

- In function megasas_handle_scsi(hw/scsi/megasas.c):
- 
- ```c
- static int megasas_handle_scsi(MegasasState *s, MegasasCmd *cmd,
-int frame_cmd)
- {
- 

- cdb = cmd->frame->pass.cdb;
- target_id = cmd->frame->header.target_id;
- lun_id = cmd->frame->header.lun_id;
- cdb_len = cmd->frame->header.cdb_len;
- 

- if (cdb_len > 16) {
- trace_megasas_scsi_invalid_cdb_len(
- mfi_frame_desc[frame_cmd], is_logical,
- target_id, lun_id, cdb_len);
- megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
- cmd->frame->header.scsi_status = CHECK_CONDITION;
- s->event_count++;
- return MFI_STAT_SCSI_DONE_WITH_ERROR;
- }
- }
- ```
- 
- Two variables, frame_cmd and cdb_len, can be controlled by guest os. So
- can mfi_frame_desc[frame_cmd] cause OOB bug ?
+ close!

-- 
https://bugs.launchpad.net/bugs/1882065

Title:
  Could this cause OOB bug ?

Status in QEMU:
  New

Bug description:
  close!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1882065/+subscriptions

Re: [PATCH v3 00/20] virtio-mem: Paravirtualized memory hot(un)plug

2020-06-10 Thread David Hildenbrand

On 09.06.20 18:18, Eduardo Habkost wrote:
> On Tue, Jun 09, 2020 at 11:59:04AM -0400, Michael S. Tsirkin wrote:
>> On Tue, Jun 09, 2020 at 03:26:08PM +0200, David Hildenbrand wrote:
>>> On 09.06.20 15:11, Michael S. Tsirkin wrote:
>>>> On Wed, Jun 03, 2020 at 04:48:54PM +0200, David Hildenbrand wrote:
>>>>> This is the very basic, initial version of virtio-mem. More info on
>>>>> virtio-mem in general can be found in the Linux kernel driver v2 posting
>>>>> [1] and in patch #10. The latest Linux driver v4 can be found at [2].
>>>>>
>>>>> This series is based on [3]:
>>>>> "[PATCH v1] pc: Support coldplugging of virtio-pmem-pci devices on all
>>>>>  buses"
>>>>>
>>>>> The patches can be found at:
>>>>> https://github.com/davidhildenbrand/qemu.git virtio-mem-v3
>>>>
>>>> So given we tweaked the config space a bit, this needs a respin.
>>>
>>> Yeah, the virtio-mem-v4 branch already contains a fixed-up version. Will
>>> send during the next days.
>>
>> BTW. People don't normally capitalize the letter after ":".
>> So a better subject is
>>   virtio-mem: paravirtualized memory hot(un)plug
> 
> I'm not sure that's still the rule:
> 
> [qemu/(49ee115552...)]$ git log --oneline v4.0.0.. | egrep ': [A-Z]' | wc -l
> 5261
> [qemu/(49ee115552...)]$ git log --oneline v4.0.0.. | egrep ': [a-z]' | wc -l
> 2921
> 

The kernel is slightly different, but it does not look like there is a
real rule nowadays

t480s: ~/git/linux virtio-mem-v4 $ git log --oneline v5.6..v5.7 | egrep ': [a-z]' | wc -l
9530
t480s: ~/git/linux virtio-mem-v4 $ git log --oneline v5.6..v5.7 | egrep ': [A-Z]' | wc -l
7689
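
For anyone wanting to reproduce the classification without egrep, the two patterns used above can be approximated as follows. This is a rough sketch: colon_then_case() is a hypothetical helper, it only looks at ASCII letters, and like the egrep patterns it accepts a match anywhere in the line, so a subject with several ": " separators may count in both buckets.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the line contains ": " followed by an upper-case ASCII
 * letter (want_upper) or a lower-case one (!want_upper), roughly what
 * egrep ': [A-Z]' and egrep ': [a-z]' select.
 */
static bool colon_then_case(const char *line, bool want_upper)
{
    assert(line != NULL);
    for (const char *p = strstr(line, ": "); p != NULL; p = strstr(p + 1, ": ")) {
        unsigned char c = (unsigned char)p[2];  /* at worst the terminating NUL */
        if (want_upper ? (c >= 'A' && c <= 'Z') : (c >= 'a' && c <= 'z')) {
            return true;
        }
    }
    return false;
}
```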

-- 
Thanks,

David / dhildenb

Re: [PATCH v3] target/arm/cpu: adjust virtual time for arm cpu

2020-06-10 Thread Andrew Jones

On Wed, Jun 10, 2020 at 09:32:06AM +0800, Ying Fang wrote:
> 
> 
> On 6/8/2020 8:49 PM, Andrew Jones wrote:
> > On Mon, Jun 08, 2020 at 08:12:43PM +0800, Ying Fang wrote:
> > > From: fangying 
> > > 
> > > Virtual time adjustment was implemented for virt-5.0 machine type,
> > > but the cpu property was enabled only for host-passthrough and
> > > max cpu model. Let's add it for arm cpu which has the generic timer
> > > feature enabled.
> > > 
> > > Suggested-by: Andrew Jones 
> > 
> > This isn't true. I did suggest the way to arrange the code, after
> > Peter suggested to move the kvm_arm_add_vcpu_properties() call to
> > arm_cpu_post_init(), but I didn't suggest making this change in general,
> > which is what this tag means. In fact, I've argued that it's pretty
> I'm quite sorry for adding it here.

No problem.

> > pointless to do this, since KVM users should be using '-cpu host' or
> > '-cpu max' anyway. Since I don't need credit for the code arranging,
> As discussed in thread [1], there is a situation where a 'custom' cpu mode
> is needed for us to keep instruction set compatibility so that migration can
> be done, just like x86 does.

I understand the motivation. But, as I've said, KVM doesn't work that way.

> And we are planning to add support for it if
> nobody is currently doing that.

Great! I'm looking forward to seeing the KVM patches. Especially since,
without the KVM patches, the 'custom' CPU model isn't a custom CPU model,
it's just a misleading way to use host passthrough. Indeed, I'm a bit
opposed to allowing anything other than '-cpu host' and '-cpu max' (with
features explicitly enabled/disabled, e.g. -cpu host,pmu=off) to work
until KVM actually works with CPU models. Otherwise, how do we know the
difference between a model that actually works and one that is just
misleadingly named?

Thanks,
drew

> 
> Thanks.
> Ying
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00022.html
> > please just drop the tag. Peter can maybe do that on merge though. Also,
> > despite not agreeing that we need this change today, as there's nothing
> > wrong with it and it looks good to me
> > 
> > Reviewed-by: Andrew Jones 
> > 
> > Thanks,
> > drew
> > 
> > > Signed-off-by: Ying Fang 
> > > 
> > > ---
> > > v3:
> > > - set kvm-no-adjvtime property in kvm_arm_add_vcpu_properties
> > > 
> > > v2:
> > > - move kvm_arm_add_vcpu_properties into arm_cpu_post_init
> > > 
> > > v1:
> > > - initial commit
> > > - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg08518.html
> > > 
> > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > > index 32bec156f2..5b7a36b5d7 100644
> > > --- a/target/arm/cpu.c
> > > +++ b/target/arm/cpu.c
> > > @@ -1245,6 +1245,10 @@ void arm_cpu_post_init(Object *obj)
> > >   if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
> > > >   qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
> > >   }
> > > +
> > > +if (kvm_enabled()) {
> > > +kvm_arm_add_vcpu_properties(obj);
> > > +}
> > >   }
> > >   static void arm_cpu_finalizefn(Object *obj)
> > > @@ -2029,7 +2033,6 @@ static void arm_max_initfn(Object *obj)
> > >   if (kvm_enabled()) {
> > >   kvm_arm_set_cpu_features_from_host(cpu);
> > > -kvm_arm_add_vcpu_properties(obj);
> > >   } else {
> > >   cortex_a15_initfn(obj);
> > > @@ -2183,7 +2186,6 @@ static void arm_host_initfn(Object *obj)
> > >   if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
> > >   aarch64_add_sve_properties(obj);
> > >   }
> > > -kvm_arm_add_vcpu_properties(obj);
> > >   arm_cpu_post_init(obj);
> > >   }
> > > diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > > index cbc5c3868f..778cecc2e6 100644
> > > --- a/target/arm/cpu64.c
> > > +++ b/target/arm/cpu64.c
> > > @@ -592,7 +592,6 @@ static void aarch64_max_initfn(Object *obj)
> > >   if (kvm_enabled()) {
> > >   kvm_arm_set_cpu_features_from_host(cpu);
> > > -kvm_arm_add_vcpu_properties(obj);
> > >   } else {
> > >   uint64_t t;
> > >   uint32_t u;
> > > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > > index 4bdbe6dcac..eef3bbd1cc 100644
> > > --- a/target/arm/kvm.c
> > > +++ b/target/arm/kvm.c
> > > @@ -194,17 +194,18 @@ static void kvm_no_adjvtime_set(Object *obj, bool 
> > > value, Error **errp)
> > >   /* KVM VCPU properties should be prefixed with "kvm-". */
> > >   void kvm_arm_add_vcpu_properties(Object *obj)
> > >   {
> > > -if (!kvm_enabled()) {
> > > -return;
> > > -}
> > > +ARMCPU *cpu = ARM_CPU(obj);
> > > +CPUARMState *env = &cpu->env;
> > > -ARM_CPU(obj)->kvm_adjvtime = true;
> > > -object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
> > > - kvm_no_adjvtime_set);
> > > -object_property_set_description(obj, "kvm-no-adjvtime",
> > > -"Set on to disable the adjustment of 
> > > "
> > > -

Re: [PATCH 0/7] target/arm: Convert Neon 3-reg-diff to decodetree

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/9/20 6:02 PM, Peter Maydell wrote:
> This patchset converts the Neon insns in the "3 registers of different
> lengths" group to decodetree. Patch 1 is a bugfix for an earlier
> part of the conversion that's now in master.
> 
> I'm definitely finding that the new decodetree version of Neon
> is often easier to understand because we no longer try to
> accommodate multiple different kinds of widening/narrowing/etc
> insns in a single multi-pass loop: expanding out the loop and
> specializing it to the particular insn type helps a lot.

I agree. The TCG ARM code is well documented, but the decodetree
view makes it easier to review. Kinda obvious when you compare
with the TCG code in older QEMU architectures.
Personally I also find it easier to set breakpoints.

> (Or maybe it's just that having to read the old code and write
> the new version means I understand it better ;-))
> 
> Based-on: id:20200608183652.661386-1-richard.hender...@linaro.org
> ("[PATCH v3 0/9] decodetree: Add non-overlapping groups")
> because we use the new group syntax to set up the structure
> for the "size==0b11" vs "size!=0b11" decode which we'll fill
> in in subsequent patchsets.
> 
> thanks
> -- PMM
> 
> Peter Maydell (7):
>   target/arm: Fix missing temp frees in do_vshll_2sh
>   target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
>   target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
>   target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
>   target/arm: Convert Neon 3-reg-diff long multiplies
>   target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
>   target/arm: Convert Neon 3-reg-diff polynomial VMULL
> 
>  target/arm/translate.h  |   1 +
>  target/arm/neon-dp.decode   |  72 +
>  target/arm/translate-neon.inc.c | 521 
>  target/arm/translate.c  | 222 +-
>  4 files changed, 597 insertions(+), 219 deletions(-)
>

Re: [PATCH] iotests: Add copyright line in qcow2.py

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/9/20 10:59 PM, Eric Blake wrote:
> The file qcow2.py was originally contributed in 2012 by Kevin Wolf,
> but was not given traditional boilerplate headers at the time.  The
> missing license was just rectified (commit 16306a7b39) using the
> project-default GPLv2+, but as Vladimir is not at Red Hat, he did not
> add a Copyright line.  All earlier contributions have come from CC'd
> authors, where all but Stefan used a Red Hat address at the time of
> the contribution, and that copyright carries over to the split to
> qcow2_format.py (d5262c7124).
> 
> CC: Kevin Wolf 
> CC: Stefan Hajnoczi 
> CC: Eduardo Habkost 
> CC: Max Reitz 
> CC: Philippe Mathieu-Daudé 
> CC: Paolo Bonzini 
> Signed-off-by: Eric Blake 

Acked-by: Philippe Mathieu-Daudé 

> ---
> Commit ids above assume my bitmaps pull request does not have to be respun...
> Based-on: <20200609205245.3548257-1-ebl...@redhat.com>
> ---
>  tests/qemu-iotests/qcow2.py| 2 ++
>  tests/qemu-iotests/qcow2_format.py | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/tests/qemu-iotests/qcow2.py b/tests/qemu-iotests/qcow2.py
> index 8c187e9a7292..0910e6ac0705 100755
> --- a/tests/qemu-iotests/qcow2.py
> +++ b/tests/qemu-iotests/qcow2.py
> @@ -2,6 +2,8 @@
>  #
>  # Manipulations with qcow2 image
>  #
> +# Copyright (C) 2012 Red Hat, Inc.
> +#
>  # This program is free software; you can redistribute it and/or modify
>  # it under the terms of the GNU General Public License as published by
>  # the Free Software Foundation; either version 2 of the License, or
> diff --git a/tests/qemu-iotests/qcow2_format.py b/tests/qemu-iotests/qcow2_format.py
> index 0f65fd161d5b..cc432e7ae06c 100644
> --- a/tests/qemu-iotests/qcow2_format.py
> +++ b/tests/qemu-iotests/qcow2_format.py
> @@ -1,6 +1,7 @@
>  # Library for manipulations with qcow2 image
>  #
>  # Copyright (c) 2020 Virtuozzo International GmbH.
> +# Copyright (C) 2012 Red Hat, Inc.
>  #
>  # This program is free software; you can redistribute it and/or modify
>  # it under the terms of the GNU General Public License as published by
>

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Thomas Huth

On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
> On 6/10/20 5:51 AM, Thomas Huth wrote:
>> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
>> required header config-devices.h is not included, so that the legacy
>> IGD passthrough is currently broken. Let's include the right header
>> to fix this issue.
>>
>> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
>> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>>("hw/vfio: Move the IGD quirk code to a separate file")
> 
> What about a shorter tag?
> 
> Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")

I always forget whether to use the short or the long version for
"Fixes:" ... this can hopefully be fixed (if necessary) when the patch
gets picked up.

>> Signed-off-by: Thomas Huth 
>> ---
>>  hw/vfio/pci-quirks.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
>> index f2155ddb1d..3158390db1 100644
>> --- a/hw/vfio/pci-quirks.c
>> +++ b/hw/vfio/pci-quirks.c
>> @@ -11,6 +11,7 @@
>>   */
>>  
>>  #include "qemu/osdep.h"
>> +#include "config-devices.h"
> 
> I've been wondering how we can avoid that mistake in the
> future, but can't find anything beside human review.

I think in the long term, we should include config-devices.h in osdep.h,
just like config-host.h and config-target.h are already included there.
Everything else is just too confusing. But then we should also add a
mechanism to poison the switches from config-devices.h in common code...
thus this likely needs some work and discussion of the patch first, so I
think we should go with this change to pci-quirks.c here first to get
the regression fixed ASAP.
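
The failure mode is easy to show in isolation. The sketch below is not QEMU code: igd_quirk_compiled_in() and self_check() are hypothetical, and the file deliberately does not define CONFIG_VFIO_IGD, modelling a source file that forgot to include the generated header.

```c
#include <assert.h>

/*
 * CONFIG_VFIO_IGD is intentionally NOT defined here: this models a file
 * that never included the header which would provide it.
 */

/* Returns 1 if the guarded code was compiled in, 0 if it was dropped. */
static int igd_quirk_compiled_in(void)
{
#ifdef CONFIG_VFIO_IGD
    return 1;
#else
    return 0;   /* no warning, no error: the block simply vanishes */
#endif
}

/* The guarded branch is silently absent in this translation unit. */
static void self_check(void)
{
    assert(igd_quirk_compiled_in() == 0);
}

/*
 * One way to turn misuse into a build failure is GCC's poison pragma,
 * which makes any later appearance of the identifier a hard error:
 *
 *     #pragma GCC poison CONFIG_VFIO_IGD
 */
```

The pragma is shown only in a comment because enabling it would make the #ifdef above fail to compile, which is exactly the point of poisoning a switch where it must not be used.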

 Thomas

[PATCH v2] hmp: Make json format optional for qom-set

2020-06-10 Thread David Hildenbrand

Commit 7d2ef6dcc1cf ("hmp: Simplify qom-set") switched to the json
parser, making it possible to specify complex types. However, with this
change it is no longer possible to specify proper sizes (e.g., 2G, 128M),
making the interface harder to use for properties that consume sizes.

Let's switch back to the previous handling and allow passing json via
the "-j" parameter.

Cc: Philippe Mathieu-Daudé 
Cc: Markus Armbruster 
Cc: Dr. David Alan Gilbert 
Cc: Paolo Bonzini 
Cc: "Daniel P. Berrangé" 
Cc: Eduardo Habkost 
Signed-off-by: David Hildenbrand 
---
v1 - v2:
- keep the "value:S" as correctly noted by Paolo :)
---
 hmp-commands.hx|  7 ---
 qom/qom-hmp-cmds.c | 20 
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 28256209b5..5d12fbeebe 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1806,9 +1806,10 @@ ERST
 
 {
 .name   = "qom-set",
-.args_type  = "path:s,property:s,value:S",
-.params = "path property value",
-.help   = "set QOM property",
+.args_type  = "json:-j,path:s,property:s,value:S",
+.params = "[-j] path property value",
+.help   = "set QOM property.\n\t\t\t"
+  "-j: the property is specified in json format.",
 .cmd= hmp_qom_set,
 .flags  = "p",
 },
diff --git a/qom/qom-hmp-cmds.c b/qom/qom-hmp-cmds.c
index f704b6949a..a794e62f0b 100644
--- a/qom/qom-hmp-cmds.c
+++ b/qom/qom-hmp-cmds.c
@@ -44,15 +44,27 @@ void hmp_qom_list(Monitor *mon, const QDict *qdict)
 
 void hmp_qom_set(Monitor *mon, const QDict *qdict)
 {
+const bool json = qdict_get_try_bool(qdict, "json", false);
 const char *path = qdict_get_str(qdict, "path");
 const char *property = qdict_get_str(qdict, "property");
 const char *value = qdict_get_str(qdict, "value");
 Error *err = NULL;
-QObject *obj;
 
-obj = qobject_from_json(value, &err);
-if (err == NULL) {
-qmp_qom_set(path, property, obj, &err);
+if (!json) {
+Object *obj = object_resolve_path(path, NULL);
+
+if (!obj) {
+error_set(&err, ERROR_CLASS_DEVICE_NOT_FOUND,
+  "Device '%s' not found", path);
+} else {
+object_property_parse(obj, value, property, &err);
+}
+} else {
+QObject *obj = qobject_from_json(value, &err);
+
+if (!err) {
+qmp_qom_set(path, property, obj, &err);
+}
 }
 
 hmp_handle_error(mon, err);
-- 
2.26.2
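
To make the motivation concrete: a value such as 2G or 128M is not a valid JSON number, so a size-aware parser is needed to accept it. The following is a minimal sketch of such parsing, not the helper QEMU actually uses; parse_size() is a hypothetical name and the suffixes are assumed to be binary multiples (K = 1024).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse strings like "4096", "128M" or "2G" into a byte count.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
static int parse_size(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;
    unsigned shift = 0;

    assert(s != NULL && out != NULL);
    if (*s < '0' || *s > '9') {
        return -1;          /* reject empty strings, signs, whitespace */
    }
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno != 0) {
        return -1;
    }
    switch (*end) {
    case '\0': break;
    case 'K': case 'k': shift = 10; end++; break;
    case 'M': case 'm': shift = 20; end++; break;
    case 'G': case 'g': shift = 30; end++; break;
    default: return -1;
    }
    if (*end != '\0' || v > (UINT64_MAX >> shift)) {
        return -1;          /* trailing garbage or value too large */
    }
    *out = (uint64_t)v << shift;
    return 0;
}
```

Under these assumptions "2G" yields 2147483648 and "128M" yields 134217728, while the same strings handed to a JSON parser are rejected.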

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 9:50 AM, Thomas Huth wrote:
> On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
>> On 6/10/20 5:51 AM, Thomas Huth wrote:
>>> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
>>> required header config-devices.h is not included, so that the legacy
>>> IGD passthrough is currently broken. Let's include the right header
>>> to fix this issue.
>>>
>>> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
>>> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>>>("hw/vfio: Move the IGD quirk code to a separate file")
>>
>> What about a shorter tag?
>>
>> Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")
> 
> I always forget whether to use the short or the long version for
> "Fixes:" ... this can hopefully be fixed (if necessary) when the patch
> gets picked up.
> 
>>> Signed-off-by: Thomas Huth 
>>> ---
>>>  hw/vfio/pci-quirks.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
>>> index f2155ddb1d..3158390db1 100644
>>> --- a/hw/vfio/pci-quirks.c
>>> +++ b/hw/vfio/pci-quirks.c
>>> @@ -11,6 +11,7 @@
>>>   */
>>>  
>>>  #include "qemu/osdep.h"
>>> +#include "config-devices.h"
>>
>> I've been wondering how we can avoid that mistake in the
>> future, but can't find anything beside human review.
> 
> I think in the long term, we should include config-devices.h in osdep.h,
> just like config-host.h and config-target.h are already included there.
> Everything else is just too confusing. But then we should also add a
> mechanism to poison the switches from config-devices.h in common code...

We only need it for the files under hw/, right?

> thus this likely needs some work and discussion of the patch first, so I
> think we should go with this change to pci-quirks.c here first to get
> the regression fixed ASAP.

Sure, I'm not objecting to that.

> 
>  Thomas
>

Re: [PATCH RESEND v3 31/58] auxbus: Rename aux_init_bus() to aux_bus_init()

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 7:32 AM, Markus Armbruster wrote:
> Suggested-by: Philippe Mathieu-Daudé 
> Signed-off-by: Markus Armbruster 
> Reviewed-by: Paolo Bonzini 
> ---
>  include/hw/misc/auxbus.h | 4 ++--
>  hw/display/xlnx_dp.c | 2 +-
>  hw/misc/auxbus.c | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)

Thanks!

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH RESEND v3 52/58] microbit: Eliminate two local variables in microbit_init()

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 7:32 AM, Markus Armbruster wrote:
> Suggested-by: Philippe Mathieu-Daudé 
> Signed-off-by: Markus Armbruster 
> Reviewed-by: Paolo Bonzini 
> ---
>  hw/arm/microbit.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/arm/microbit.c b/hw/arm/microbit.c
> index d20ebd3aad..8fe42c9d6a 100644
> --- a/hw/arm/microbit.c
> +++ b/hw/arm/microbit.c
> @@ -36,15 +36,13 @@ static void microbit_init(MachineState *machine)
>  MicrobitMachineState *s = MICROBIT_MACHINE(machine);
>  MemoryRegion *system_memory = get_system_memory();
>  MemoryRegion *mr;
> -Object *soc = OBJECT(&s->nrf51);
> -Object *i2c = OBJECT(&s->i2c);

Thanks for this new patch.

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Thomas Huth

On 10/06/2020 09.53, Philippe Mathieu-Daudé wrote:
> On 6/10/20 9:50 AM, Thomas Huth wrote:
>> On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
>>> On 6/10/20 5:51 AM, Thomas Huth wrote:
>>>> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
>>>> required header config-devices.h is not included, so that the legacy
>>>> IGD passthrough is currently broken. Let's include the right header
>>>> to fix this issue.
>>>>
>>>> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
>>>> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>>>>("hw/vfio: Move the IGD quirk code to a separate file")
>>>
>>> What about a shorter tag?
>>>
>>> Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")
>>
>> I always forget whether to use the short or the long version for
>> "Fixes:" ... this can hopefully be fixed (if necessary) when the patch
>> gets picked up.
>>
>>>> Signed-off-by: Thomas Huth 
>>>> ---
>>>>  hw/vfio/pci-quirks.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
>>>> index f2155ddb1d..3158390db1 100644
>>>> --- a/hw/vfio/pci-quirks.c
>>>> +++ b/hw/vfio/pci-quirks.c
>>>> @@ -11,6 +11,7 @@
>>>>   */
>>>>  
>>>>  #include "qemu/osdep.h"
>>>> +#include "config-devices.h"
>>>
>>> I've been wondering how we can avoid that mistake in the
>>> future, but can't find anything beside human review.
>>
>> I think in the long term, we should include config-devices.h in osdep.h,
>> just like config-host.h and config-target.h are already included there.
>> Everything else is just too confusing. But then we should also add a
>> mechanism to poison the switches from config-devices.h in common code...
> 
> We only need it for the files under hw/, right?

qtest.c in the main directory includes it, too.

>> thus this likely needs some work and discussion of the patch first, so I
>> think we should go with this change to pci-quirks.c here first to get
>> the regression fixed ASAP.
> 
> Sure, I'm not objecting to that.

Sure, I just wanted to make sure that whoever (Alex?) picks up this
patch does not wait for that other solution instead.

 Thomas

[Bug 1882784] Re: Legacy IGD passthrough in QEMU 5 disabled

2020-06-10 Thread Thomas Huth

Patch is on the list now:
https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg02567.html

-- 
https://bugs.launchpad.net/bugs/1882784

Title:
  Legacy IGD passthrough in QEMU 5 disabled

Status in QEMU:
  Confirmed

Bug description:
  Bug with tag v5.0.0, or commit
  fdd76fecdde1ad444ff4deb7f1c4f7e4a1ef97d6

  As of QEMU 5 Legacy IGD PT is no longer working.

  Host is a Xeon E3-1226 v3 and my method to test is to run the
  following:

  ./qemu-system-x86_64 \
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1f' \
-device 'vfio-pci,host=00:02.0,addr=02.0' \
-L '/usr/share/kvm' \
-nographic \
-vga none \
-nodefaults

  in the hope of seeing a "IGD device :00:02.0 cannot support legacy
  mode due to existing devices at address 1f.0" error.

  The culprit appears to be this commit:

  https://github.com/qemu/qemu/commit/29d62771c81d8fd244a67c14a1d968c268d3fb19

  Specifically the following block in pci-quirks.c:

  #ifdef CONFIG_VFIO_IGD
  vfio_probe_igd_bar4_quirk(vdev, nr);
  #endif

  as the kconfig variable CONFIG_VFIO_IGD doesn't appear to be available
  outside of makefiles as described here:
  https://qemu.weilnetz.de/doc/devel/kconfig.html. I can confirm that
  the igd code is being pulled in as removing this check, as would
  defining the variable I presume, makes Legacy IGD PT work again (ie I
  see the expected "existing devices" error).

  I first spotted this in Proxmox, but have confirmed the bug by
  building QEMU sources.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1882784/+subscriptions

Re: [PATCH RESEND v3 56/58] qdev: Convert bus-less devices to qdev_realize() with Coccinelle

2020-06-10 Thread Philippe Mathieu-Daudé

Hi Markus, Peter.

On 6/10/20 7:32 AM, Markus Armbruster wrote:
> All remaining conversions to qdev_realize() are for bus-less devices.
> Coccinelle script:
> 
> // only correct for bus-less @dev!
> 
> @@
> expression errp;
> expression dev;
> @@
> -qdev_init_nofail(dev);
> +qdev_realize(dev, NULL, &error_fatal);
> 
> @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
> expression errp;
> expression dev;
> symbol true;
> @@
> -object_property_set_bool(OBJECT(dev), true, "realized", errp);
> +qdev_realize(DEVICE(dev), NULL, errp);
> 
> @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
> expression errp;
> expression dev;
> symbol true;
> @@
> -object_property_set_bool(dev, true, "realized", errp);
> +qdev_realize(DEVICE(dev), NULL, errp);

Finally. Now my earlier suggestion is easier to explain:
Rename qdev_realize() -> sysbus_realize(), extracting the qdev_realize()
part. qdev_realize() doesn't take a Bus argument anymore.
Left for later.

> 
> Note that Coccinelle chokes on ARMSSE typedef vs. macro in
> hw/arm/armsse.c.  Worked around by temporarily renaming the macro for
> the spatch run.
> 
> Signed-off-by: Markus Armbruster 
> Acked-by: Alistair Francis 
> Reviewed-by: Paolo Bonzini 
> ---
>  hw/arm/allwinner-a10.c   |  2 +-
>  hw/arm/allwinner-h3.c|  2 +-
>  hw/arm/armsse.c  | 20 ++-
>  hw/arm/armv7m.c  |  2 +-
>  hw/arm/aspeed.c  |  3 +--
>  hw/arm/aspeed_ast2600.c  |  2 +-
>  hw/arm/aspeed_soc.c  |  2 +-
>  hw/arm/bcm2836.c |  3 +--
>  hw/arm/cubieboard.c  |  2 +-
>  hw/arm/digic.c   |  2 +-
>  hw/arm/digic_boards.c|  2 +-
>  hw/arm/exynos4210.c  |  4 +--
>  hw/arm/fsl-imx25.c   |  2 +-
>  hw/arm/fsl-imx31.c   |  2 +-
>  hw/arm/fsl-imx6.c|  2 +-
>  hw/arm/fsl-imx6ul.c  |  3 +--
>  hw/arm/fsl-imx7.c|  2 +-
>  hw/arm/highbank.c|  2 +-
>  hw/arm/imx25_pdk.c   |  2 +-
>  hw/arm/integratorcp.c|  2 +-
>  hw/arm/kzm.c |  2 +-
>  hw/arm/mcimx6ul-evk.c|  2 +-
>  hw/arm/mcimx7d-sabre.c   |  2 +-
>  hw/arm/mps2-tz.c |  9 +++
>  hw/arm/mps2.c|  7 +++---
>  hw/arm/musca.c   |  6 ++---
>  hw/arm/orangepi.c|  2 +-
>  hw/arm/raspi.c   |  2 +-
>  hw/arm/realview.c|  2 +-
>  hw/arm/sabrelite.c   |  2 +-
>  hw/arm/sbsa-ref.c|  2 +-
>  hw/arm/stm32f205_soc.c   |  2 +-
>  hw/arm/stm32f405_soc.c   |  2 +-
>  hw/arm/versatilepb.c |  2 +-
>  hw/arm/vexpress.c|  2 +-
>  hw/arm/virt.c|  2 +-
>  hw/arm/xilinx_zynq.c |  2 +-
>  hw/arm/xlnx-versal.c |  2 +-
>  hw/arm/xlnx-zcu102.c |  2 +-
>  hw/arm/xlnx-zynqmp.c | 10 +++-

Peter, you might want to skim the changes (other
ARM devices outside hw/arm/ are involved), but to summarize,
basically these types are not SysBusDev:

- cpu
- soc / container
- or-gate / irq-splitter

I reviewed all of them.

Next is for Markus.

>  hw/char/serial-isa.c |  2 +-
>  hw/char/serial-pci-multi.c   |  2 +-
>  hw/char/serial-pci.c |  2 +-
>  hw/char/serial.c |  4 +--

I need to review hw/char/serial-isa.c again; it is
not clear why it is a container and not a SysBusDevice.

>  hw/ide/microdrive.c  |  3 ++-

I never had to look at the PCMCIA devices; they seem
to be an unfinished attempt to plug the devices on a bus.
Maybe it is finished, but the code is not clear (and
not documented). I need more time to review.

>  hw/intc/pnv_xive.c   |  4 +--
>  hw/intc/spapr_xive.c |  4 +--
>  hw/intc/xics.c   |  2 +-
>  hw/intc/xive.c   |  2 +-
>  hw/pci-host/pnv_phb3.c   |  6 ++---
>  hw/pci-host/pnv_phb4.c   |  2 +-
>  hw/pci-host/pnv_phb4_pec.c   |  2 +-
>  hw/pci-host/prep.c   |  3 +--
>  hw/ppc/pnv.c | 32 ++--
>  hw/ppc/pnv_bmc.c |  2 +-
>  hw/ppc/pnv_core.c|  2 +-
>  hw/ppc/pnv_psi.c |  4 +--
>  hw/ppc/spapr.c   |  5

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 9:59 AM, Thomas Huth wrote:
> On 10/06/2020 09.53, Philippe Mathieu-Daudé wrote:
>> On 6/10/20 9:50 AM, Thomas Huth wrote:
>>> On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
 On 6/10/20 5:51 AM, Thomas Huth wrote:
> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
> required header config-devices.h is not included, so that the legacy
> IGD passthrough is currently broken. Let's include the right header
> to fix this issue.
>
> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>("hw/vfio: Move the IGD quirk code to a separate file")

 What about shorter tag?

 Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")
>>>
>>> I always forget whether to use the short or the long version for
>>> "Fixes:" ... this can hopefully be fixed (if necessary) when the patch
>>> gets picked up.
>>>
> Signed-off-by: Thomas Huth 
> ---
>  hw/vfio/pci-quirks.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index f2155ddb1d..3158390db1 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "config-devices.h"

 I've been wondering how we can avoid that mistake in the
 future, but can't find anything besides human review.
>>>
>>> I think in the long term, we should include config-devices.h in osdep.h,
>>> just like config-host.h and config-target.h are already included there.
>>> Everything else is just too confusing. But then we should also add a
>>> mechanism to poison the switches from config-devices.h in common code...
>>
>> We only need it for the files under hw/, right?
> 
> qtest.c in the main directory includes it, too.

hw/ and qtests could include "hw/hw.h" instead of affecting all the
codebase via "qemu/osdep.h".

*but* we have qdevs in target/ too... so either we think about cleaning
them, or we accept it is a lost cause (micro-architecture bits are tied
with hardware) and are doomed.

> 
>>> thus this likely needs some work and discussion of the patch first, so I
>>> think we should go with this change to pci-quirks.c here first to get
>>> the regression fixed ASAP.
>>
>> Sure, I'm not objecting that.
> 
> Sure, I just wanted to make sure that whoever (Alex?) picks up this
> patch does not wait for that other solution instead.
> 
>  Thomas
>

Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()

2020-06-10 Thread Vladimir Sementsov-Ogievskiy


09.06.2020 18:29, Vladimir Sementsov-Ogievskiy wrote:

09.06.2020 18:18, Kevin Wolf wrote:

Am 09.06.2020 um 16:46 hat Eric Blake geschrieben:

On 6/9/20 9:28 AM, Vladimir Sementsov-Ogievskiy wrote:

09.06.2020 17:08, Kevin Wolf wrote:

Since commit c8bb23cbdbe, handle_alloc_space() is called for newly
allocated clusters to efficiently initialise the COW areas with zeros if
necessary. It skips the whole operation if both start_cow and end_cow
are empty. However, it requests zeroing the whole request size (possibly
multiple megabytes) even if only one end of the request actually needs
this.

This patch reduces the write_zeroes request size in this case so that we
don't unnecessarily zero-initialise a region that we're going to
overwrite immediately.





Hmm, I'm afraid that this may make things worse in some cases: with one big
write-zero request we preallocate the data region in the protocol file, so we
have better locality for the clusters we are going to write. And, at the same
time, with the BDRV_REQ_NO_FALLBACK flag write-zero must be fast anyway
(especially in comparison with the following write request).


         /*
          * instead of writing zero COW buffers,
          * efficiently zero out the whole clusters
          */
-        ret = qcow2_pre_write_overlap_check(bs, 0, m->alloc_offset,
-                                            m->nb_clusters * s->cluster_size,
-                                            true);
+        ret = qcow2_pre_write_overlap_check(bs, 0, start, len, true);
         if (ret < 0) {
             return ret;
         }
         BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_SPACE);
-        ret = bdrv_co_pwrite_zeroes(s->data_file, m->alloc_offset,
-                                    m->nb_clusters * s->cluster_size,
+        ret = bdrv_co_pwrite_zeroes(s->data_file, start, len,
                                     BDRV_REQ_NO_FALLBACK);


Good point.  If we weren't using BDRV_REQ_NO_FALLBACK, then avoiding a
pre-zero pass over the middle is essential.  But since we are insisting that
the pre-zero pass be fast or else immediately fail, the time spent in
pre-zeroing should not be a concern.  Do you have benchmark numbers stating
otherwise?


I stumbled across this behaviour (write_zeros for 2 MB, then overwrite
almost everything) in the context of a different bug, and it just didn't
make much sense to me. Is there really a file system where fragmentation
is introduced by not zeroing the area first and then overwriting it?

I'm not insisting on making this change because the behaviour is
harmless if odd, but if we think that writing twice to some blocks is an
optimisation, maybe we should actually measure and document this.


Not to the same blocks: first we do write-zeroes to the area aligned up to the
cluster bound. So it's more probable that the resulting clusters would be
contiguous on the file system. With your patch it may be split into two parts.
(A bit too theoretical; I'd better prove it by example.)

Also, we (Virtuozzo) have to support some custom distributed fs, where
allocation itself is expensive, so the additional benefit of the first (larger)
write-zero request is that we have one allocation request instead of two (with
your patch) or three (if we decide to make two write-zero operations).


Hmm, Denis Lunev told me that double allocation should not be a problem: as it
happens at almost the same time, the fs should handle this. So probably my
counter-arguments are wrong. Still, handle_alloc_space() often attracts
attention, and some benchmark tests around it will not hurt.


--
Best regards,
Vladimir
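
For readers following the thread, the range reduction under discussion can be sketched in plain C. This is an illustration only, not the code from the patch: DemoCowRegion, DemoZeroRange and demo_zero_range() are hypothetical names, and the sketch assumes absolute offsets with the head COW area located before the tail COW area.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real QEMU structures differ. */
typedef struct {
    uint64_t offset;    /* absolute offset of the COW area in the data file */
    uint64_t nb_bytes;  /* 0 means this end needs no copy-on-write */
} DemoCowRegion;

typedef struct {
    uint64_t start;
    uint64_t len;       /* 0 means nothing needs zeroing */
} DemoZeroRange;

/*
 * Compute the smallest range that covers every non-empty COW area.
 * Assumes head lies before tail in the file.
 */
DemoZeroRange demo_zero_range(DemoCowRegion head, DemoCowRegion tail)
{
    DemoZeroRange r = { 0, 0 };
    uint64_t end;

    if (head.nb_bytes == 0 && tail.nb_bytes == 0) {
        return r;  /* both ends empty: skip zeroing entirely */
    }
    /* Start at the head area if it exists, otherwise at the tail area. */
    r.start = head.nb_bytes ? head.offset : tail.offset;
    /* End after the tail area if it exists, otherwise after the head area. */
    end = tail.nb_bytes ? tail.offset + tail.nb_bytes
                        : head.offset + head.nb_bytes;
    assert(end >= r.start);
    r.len = end - r.start;
    return r;
}
```

When both ends need COW, the result still spans the whole allocation, which matches the old behaviour; only the one-sided cases shrink. Whether the smaller request is actually faster on a given file system is exactly the open question in this thread.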

Re: [PATCH v2 1/1] tricore: added triboard with tc27x_soc

2020-06-10 Thread Bastian Koppelmann

Hi,

thanks for the patch. In general this looks good to me. However, I have a few
nitpicks.

On Tue, Jun 09, 2020 at 05:25:53PM +0200, David Brenken wrote:
> From: Andreas Konopik 
> +static const int tc27x_soc_irqmap[] = {
> +};

Since this is empty, it's best to just remove it.

> +
> +static const hwaddr tc27x_soc_memmap[] = {
> +[TC27XD_DSPR2] = 0x5000,
> +[TC27XD_DCACHE2]   = 0x5001E000,
> +[TC27XD_DTAG2] = 0x500C,
> +[TC27XD_PSPR2] = 0x5010,
> +[TC27XD_PCACHE2]   = 0x50108000,
> +[TC27XD_PTAG2] = 0x501C,
> +[TC27XD_DSPR1] = 0x6000,
> +[TC27XD_DCACHE1]   = 0x6001E000,
> +[TC27XD_DTAG1] = 0x600C,
> +[TC27XD_PSPR1] = 0x6010,
> +[TC27XD_PCACHE1]   = 0x60108000,
> +[TC27XD_PTAG1] = 0x601C,
> +[TC27XD_DSPR0] = 0x7000,
> +[TC27XD_PSPR0] = 0x7010,
> +[TC27XD_PCACHE0]   = 0x70106000,
> +[TC27XD_PTAG0] = 0x701C,
> +[TC27XD_PFLASH0_C] = 0x8000,
> +[TC27XD_PFLASH1_C] = 0x8020,
> +[TC27XD_OLDA_C]= 0x8FE7,
> +[TC27XD_BROM_C]= 0x8FFF8000,
> +[TC27XD_LMURAM_C]  = 0x9000,
> +[TC27XD_EMEM_C]= 0x9F00,
> +[TC27XD_PFLASH0_U] = 0xA000,
> +[TC27XD_PFLASH1_U] = 0xA020,
> +[TC27XD_DFLASH0]   = 0xAF00,
> +[TC27XD_DFLASH1]   = 0xAF11,
> +[TC27XD_OLDA_U]= 0xAFE7,
> +[TC27XD_BROM_U]= 0xAFFF8000,
> +[TC27XD_LMURAM_U]  = 0xB000,
> +[TC27XD_EMEM_U]= 0xBF00,
> +[TC27XD_PSPRX] = 0xC000,
> +[TC27XD_DSPRX] = 0xD000,
> +};

Can we add the sizes here as well? That makes it much easier to read. See
hw/riscv/sifive_e.c
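
A rough sketch of the base-plus-size layout being suggested, for illustration only. The type and enum names are hypothetical, demo_hwaddr stands in for QEMU's hwaddr, and the addresses and sizes are placeholders that are not taken from the TC27x manual.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_hwaddr;  /* stand-in for QEMU's hwaddr */

typedef struct {
    demo_hwaddr base;
    demo_hwaddr size;
} DemoMemmapEntry;

enum { DEMO_DSPR, DEMO_PSPR, DEMO_NUM_REGIONS };

/* Placeholder values, not taken from the TC27x manual. */
static const DemoMemmapEntry demo_memmap[DEMO_NUM_REGIONS] = {
    [DEMO_DSPR] = { 0x10000000, 0x8000 },
    [DEMO_PSPR] = { 0x10100000, 0x4000 },
};

/* First address past the region, handy for overlap checks. */
demo_hwaddr demo_region_end(int idx)
{
    assert(idx >= 0 && idx < DEMO_NUM_REGIONS);
    return demo_memmap[idx].base + demo_memmap[idx].size;
}
```

Keeping the size next to the base means the region setup code can take both from one table entry instead of repeating size constants at each call site.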

Also what do the _U and _C suffixes mean? I could not find them in the user
manual [1].

> +
> +/*
> + * Initialize the auxiliary ROM region @mr and map it into
> + * the memory map at @base.
> + */
> +static void make_rom(MemoryRegion *mr, const char *name,
> + hwaddr base, hwaddr size)
> +{
> +memory_region_init_rom(mr, NULL, name, size, &error_fatal);
> +memory_region_add_subregion(get_system_memory(), base, mr);
> +}
> +
> +/*
> + * Initialize the auxiliary RAM region @mr and map it into
> + * the memory map at @base.
> + */
> +static void make_ram(MemoryRegion *mr, const char *name,
> + hwaddr base, hwaddr size)
> +{
> +memory_region_init_ram(mr, NULL, name, size, &error_fatal);
> +memory_region_add_subregion(get_system_memory(), base, mr);
> +}
> +
> +/*
> + * Create an alias of an entire original MemoryRegion @orig
> + * located at @base in the memory map.
> + */
> +static void make_alias(MemoryRegion *mr, const char *name,
> +   MemoryRegion *orig, hwaddr base)
> +{
> +memory_region_init_alias(mr, NULL, name, orig, 0,
> + memory_region_size(orig));
> +memory_region_add_subregion(get_system_memory(), base, mr);
> +}

These seem like very common idioms. It might be worthwhile to make this a
generic QEMU API. However, this is out of scope for this patchset.

> +/*
> + * TriCore QEMU executes CPU0 only, thus it is sufficient to map
> + * LOCAL.PSPR/LOCAL.DSPR exclusively onto PSPR0/DSPR0.
> + */
> +make_alias(&s->psprX, "LOCAL.PSPR", &s->cpu0mem.pspr,
> +sc->memmap[TC27XD_PSPRX]);
> +make_alias(&s->dsprX, "LOCAL.DSPR", &s->cpu0mem.dspr,
> +sc->memmap[TC27XD_DSPRX]);
  
These aliases point to reserved memory in the user manual [1].

> +static void tc27x_soc_init(Object *obj)
> +{
> +TC27XSoCState *s = TC27X_SOC(obj);
> +TC27XSoCClass *sc = TC27X_SOC_GET_CLASS(s);
> +
> +sysbus_init_child_obj(OBJECT(s), "tc27x", OBJECT(&s->cpu), sizeof(s->cpu),
> +sc->cpu_type);

Unnecessary cast. Just use sysbus_init_child_obj(obj,...)

> +static void tricore_load_kernel(const char *kernel_filename)
> +{
> +uint64_t entry;
> +long kernel_size;
> +TriCoreCPU *cpu;
> +CPUTriCoreState *env;
> +
> +kernel_size = load_elf(kernel_filename, NULL,
> +   NULL, NULL, &entry, NULL,
> +   NULL, NULL, 0,
> +   EM_TRICORE, 1, 0);
> +if (kernel_size <= 0) {
> +error_report("no kernel file '%s'", kernel_filename);
> +exit(1);
> +}
> +cpu = TRICORE_CPU(first_cpu);
> +env = &cpu->env;
> +env->PC = entry;
> +}

Just a note for the future. This seems like a function that ought to be
generalized for all TriCore boards.

Cheers,
Bastian

[1] 
https://hitex.co.uk/fileadmin/uk-files/downloads/ShieldBuddy/tc27xD_um_v2.2.pdf

Re: [RFC v2 18/18] guest memory protection: Alter virtio default properties for protected guests

2020-06-10 Thread Cornelia Huck

On Wed, 10 Jun 2020 14:39:22 +1000
David Gibson  wrote:

> On Tue, Jun 09, 2020 at 12:16:41PM +0200, Cornelia Huck wrote:
> > On Sun, 7 Jun 2020 13:07:35 +1000
> > David Gibson  wrote:
> >   
> > > On Sat, Jun 06, 2020 at 04:21:31PM -0400, Michael S. Tsirkin wrote:  
> > > > On Thu, May 21, 2020 at 01:43:04PM +1000, David Gibson wrote:
> > > > > > The default behaviour for virtio devices is not to use the platform's normal
> > > > > > DMA paths, but instead to use the fact that it's running in a hypervisor
> > > > > > to directly access guest memory.  That doesn't work if the guest's memory
> > > > > > is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> > > > > > 
> > > > > > So, if a guest memory protection mechanism is enabled, then apply the
> > > > > > iommu_platform=on option so it will go through normal DMA mechanisms.
> > > > > > Those will presumably have some way of marking memory as shared with the
> > > > > > hypervisor or hardware so that DMA will work.
> > > > > 
> > > > > Signed-off-by: David Gibson 
> > > > > ---
> > > > >  hw/core/machine.c | 11 +++
> > > > >  1 file changed, 11 insertions(+)
> > > > > 
> > > > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > > > index 88d699bceb..cb6580954e 100644
> > > > > --- a/hw/core/machine.c
> > > > > +++ b/hw/core/machine.c
> > > > > @@ -28,6 +28,8 @@
> > > > >  #include "hw/mem/nvdimm.h"
> > > > >  #include "migration/vmstate.h"
> > > > >  #include "exec/guest-memory-protection.h"
> > > > > +#include "hw/virtio/virtio.h"
> > > > > +#include "hw/virtio/virtio-pci.h"
> > > > >  
> > > > >  GlobalProperty hw_compat_5_0[] = {};
> > > > >  const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0);
> > > > > @@ -1159,6 +1161,15 @@ void machine_run_board_init(MachineState 
> > > > > *machine)
> > > > >   * areas.
> > > > >   */
> > > > >  machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > > > > +
> > > > > +/*
> > > > > + * Virtio devices can't count on directly accessing guest
> > > > > + * memory, so they need iommu_platform=on to use normal DMA
> > > > > + * mechanisms.  That requires disabling legacy virtio support
> > > > > + * for virtio pci devices
> > > > > + */
> > > > > > +object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
> > > > > > +object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
> > > > >  }
> > > > >  
> > > > 
> > > > I think it's a reasonable way to address this overall.
> > > > As Cornelia has commented, addressing ccw as well
> > > 
> > > Sure.  I was assuming somebody who actually knows ccw could do that as
> > > a follow up.  
> > 
> > FWIW, I think we could simply enable iommu_platform for protected
> > guests for ccw; no prereqs like pci's disable-legacy.  
> 
> Right, and the code above should in fact already do so, since it
> applies that to TYPE_VIRTIO_DEVICE, which is common.  The
> disable-legacy part should be harmless for s390, since this is
> effectively just setting a default, and we don't expect any
> TYPE_VIRTIO_PCI devices to be instantiated on z.

Well, virtio-pci is available on s390, so people could try to use it --
however, forcing disable-legacy won't hurt in that case, as it won't
make the situation worse (I don't expect virtio-pci to work on s390
protected guests.)

> 
> > > > as cases where user has
> > > > specified the property manually could be worth-while.
> > > 
> > > I don't really see what's to be done there.  I'm assuming that if the
> > > user specifies it, they know what they're doing - particularly with
> > > nonstandard guests there are some odd edge cases where those
> > > combinations might work, they're just not very likely.  
> > 
> > If I understood Halil correctly, devices without iommu_platform
> > apparently can crash protected guests on s390. Is that supposed to be a
> > "if it breaks, you get to keep the pieces" situation, or do we really
> > want to enforce iommu_platform?  
> 
> I actually think "if you broke it, keep the pieces" is an acceptable
> approach here, but that doesn't preclude some further enforcement to
> improve UX.

I'm worried about spreading dealing with this over too many code areas,
though.



[PATCH] hw/timer/a9gtimer: Clear pending interrupt, after the clear of Event flag

2020-06-10 Thread Vaclav Vanc

The A9 Global Timer is used with edge-triggered interrupts (this is true
at least for Zynq and i.MX6 processors).
When the Event Flag is cleared in the Interrupt Status Register and a new
interrupt was supposed to be scheduled, the interrupt request is never cleared.
Since the interrupt in the GIC is configured as edge-triggered, new interrupts
are not registered (because the interrupt is stuck at pending and the GIC thinks
it was already serviced). As a result, interrupts from the timer do not work
anymore.

Note: This happens only when the interrupt was not serviced before the next
interrupt is supposed to be scheduled. This happens, for example, under
increased load on the host system.

The interrupt is now always cleared when the Event Flag is cleared.
This is in accordance with the A9 Global Timer documentation.

Signed-off-by: Vaclav Vanc 
---
 hw/timer/a9gtimer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/timer/a9gtimer.c b/hw/timer/a9gtimer.c
index 7233068a37..732889105e 100644
--- a/hw/timer/a9gtimer.c
+++ b/hw/timer/a9gtimer.c
@@ -206,6 +206,9 @@ static void a9_gtimer_write(void *opaque, hwaddr addr, uint64_t value,
 case R_INTERRUPT_STATUS:
 a9_gtimer_update(s, false);
 gtb->status &= ~value;
+if (gtb->status == 0) {
+qemu_set_irq(gtb->irq, 0);
+}
 break;
 case R_COMPARATOR_HI:
 shift = 32;
-- 
2.20.1
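
The behaviour this patch aims for can be modelled in a few lines of standalone C. This is a simplified sketch, not the device model: DemoTimer and both functions are hypothetical, and the status register is assumed to be write-one-to-clear as in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t status;  /* pending event flags */
    int irq_level;    /* 1 while the interrupt line is raised */
} DemoTimer;

/* The comparator matched: latch the event flag and raise the line. */
void demo_timer_fire(DemoTimer *t)
{
    assert(t);
    t->status |= 1;
    t->irq_level = 1;
}

/* Guest write to the status register: written 1 bits clear the flags. */
void demo_timer_write_status(DemoTimer *t, uint32_t value)
{
    assert(t);
    t->status &= ~value;
    if (t->status == 0) {
        /*
         * Lower the line once no event is pending. Without this step the
         * line stays high, so an edge-triggered controller never sees the
         * rising edge of the next event.
         */
        t->irq_level = 0;
    }
}
```

With the line lowered on clear, the next demo_timer_fire() produces a fresh 0-to-1 transition, which is what an edge-triggered input needs in order to register a new interrupt.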

Re: [PATCH 1/2] nbd/server: Avoid long error message assertions CVE-2020-10761

2020-06-10 Thread Vladimir Sementsov-Ogievskiy


08.06.2020 21:26, Eric Blake wrote:

Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096.  At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:

- requesting NBD_OPT_INFO/GO on an export name that is not present
   (commit 0cfae925 in v2.12 echoes the name)

- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
   not present (commit e7b1948d in v2.12 echoes the name)

At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit
93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit.  (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.)  So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server.  As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.

Reported-by: Xueqiang Wei 
CC: qemu-sta...@nongnu.org
Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761
Fixes: 93676c88d7
Signed-off-by: Eric Blake 
---
  nbd/server.c   | 28 +---
  tests/qemu-iotests/143 |  4 
  tests/qemu-iotests/143.out |  2 ++
  3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 02b1ed080145..ec130303586d 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -217,7 +217,7 @@ nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,

  msg = g_strdup_vprintf(fmt, va);
  len = strlen(msg);
-assert(len < 4096);
+assert(len < NBD_MAX_STRING_SIZE);
  trace_nbd_negotiate_send_rep_err(msg);
  ret = nbd_negotiate_send_rep_len(client, type, len, errp);
  if (ret < 0) {
@@ -231,6 +231,27 @@ nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,
  return 0;
  }

+/*
+ * Truncate a potentially-long user-supplied string into something
+ * more suitable for an error reply.
+ */
+static const char *
+nbd_truncate_name(const char *name)
+{
+#define SANE_LENGTH 80
+static char buf[SANE_LENGTH + 3 + 1]; /* Trailing '...', NUL */


s/NUL/NULL/

Hmm. It may break if we use it in parallel in two coroutines or threads. I'm
not sure whether that is possible now, nor, of course, whether it will become
possible in the future.

I'd avoid creating functions that return a pointer to a static buffer; instead, use g_strdup_printf(), like

char *tmp = g_strdup_printf("%.80s...", name);

  ( OR, if you want an explicit constant: g_strdup_printf("%.*s...", SANE_LENGTH, name) )

... report error ...

g_free(tmp)

Using g_strdup_printf is also safer, as we don't need to care about the buf size.
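
The same precision-based truncation can be tried without GLib. The sketch below is an illustration under stated assumptions, not the patch: demo_truncate_name() and DEMO_SANE_LENGTH are hypothetical names, it uses standard snprintf() with a caller-supplied buffer instead of g_strdup_printf(), and it keeps short names unchanged, whereas a bare "%.80s..." format would append the dots to every name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_SANE_LENGTH 80

/*
 * Write @name into @buf, cut to DEMO_SANE_LENGTH characters plus "..."
 * when it is longer than that. The caller owns @buf, so there is no
 * shared static state. Returns the length snprintf() reports.
 */
int demo_truncate_name(char *buf, size_t buflen, const char *name)
{
    assert(buf && name && buflen > 0);
    if (strlen(name) <= (size_t)DEMO_SANE_LENGTH) {
        return snprintf(buf, buflen, "%s", name);
    }
    /* "%.*s" takes the maximum number of characters as an int argument. */
    return snprintf(buf, buflen, "%.*s...", DEMO_SANE_LENGTH, name);
}
```

A buffer of DEMO_SANE_LENGTH + 4 bytes is enough for the truncated form (80 characters, three dots, and the terminating NUL byte).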


+
+if (strlen(name) < SANE_LENGTH) {
+return name;
+}
+memcpy(buf, name, SANE_LENGTH);
+buf[SANE_LENGTH] = '.';
+buf[SANE_LENGTH + 1] = '.';
+buf[SANE_LENGTH + 2] = '.';
+buf[SANE_LENGTH + 3] = '\0';


one-line suggestion:

  sprintf(buf, "%.80s...", name);

OR

  sprintf(buf, "%.*s...", SANE_LENGTH, name);


+return buf;
+}
+
  /* Send an error reply.
   * Return -errno on error, 0 on success. */
  static int GCC_FMT_ATTR(4, 5)
@@ -597,7 +618,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
  if (!exp) {
  return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_UNKNOWN,
errp, "export '%s' not present",
-  name);
+  nbd_truncate_name(name));
  }

  /* Don't bother sending NBD_INFO_NAME unless client requested it */
@@ -996,7 +1017,8 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
  meta->exp = nbd_export_find(export_name);
  if (meta->exp == NULL) {
  return nbd_opt_drop(client, NBD_REP_ERR_UNKNOWN, errp,
-"export '%s' not present", export_name);
+"export '%s' not present",
+nbd_truncate_name(export_name));
  }



Hmm, maybe instead of the assertion, shrink the message in
nbd_negotiate_send_rep_verr() too?
This will save us from forgotten (or future) uses of the function.

Shrinking the name is better, as it gives a better resulting message. But
generally shrinking all too-long messages in nbd_negotiate_send_rep_verr()
(maybe together with error_report()) seems a good thing to me.


  ret = nbd_opt_read(client, &nb_q

Re: [PATCH V2] virtio-pci: fix queue_enable write

2020-06-10 Thread Stefano Garzarella

On Wed, Jun 10, 2020 at 01:43:51PM +0800, Jason Wang wrote:
> The spec says: The driver uses this to selectively prevent the device from
> executing requests from this virtqueue. 1 - enabled; 0 - disabled.
> 
> Though writing 0 to queue_enable is forbidden by the spec, we should not
> assume that the value is 1.
> 
> Fix this by ignoring written values other than 1.
> 
> Signed-off-by: Jason Wang 
> ---
> Changes from V1:
> - fix typo
> - warn wrong value through virtio_error
> ---
>  hw/virtio/virtio-pci.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index d028c17c24..7bc8c1c056 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -1273,16 +1273,20 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr,
>  virtio_queue_set_vector(vdev, vdev->queue_sel, val);
>  break;
>  case VIRTIO_PCI_COMMON_Q_ENABLE:
> -virtio_queue_set_num(vdev, vdev->queue_sel,
> - proxy->vqs[vdev->queue_sel].num);
> -virtio_queue_set_rings(vdev, vdev->queue_sel,
> +if (val == 1) {

Does it have to be 1 or can it be any value other than 0?

Thanks,
Stefano

> +virtio_queue_set_num(vdev, vdev->queue_sel,
> + proxy->vqs[vdev->queue_sel].num);
> +virtio_queue_set_rings(vdev, vdev->queue_sel,
> ((uint64_t)proxy->vqs[vdev->queue_sel].desc[1]) << 32 |
> proxy->vqs[vdev->queue_sel].desc[0],
> ((uint64_t)proxy->vqs[vdev->queue_sel].avail[1]) << 32 |
> proxy->vqs[vdev->queue_sel].avail[0],
> ((uint64_t)proxy->vqs[vdev->queue_sel].used[1]) << 32 |
> proxy->vqs[vdev->queue_sel].used[0]);
> -proxy->vqs[vdev->queue_sel].enabled = 1;
> +proxy->vqs[vdev->queue_sel].enabled = 1;
> +} else {
> +virtio_error(vdev, "wrong value for queue_enable %"PRIx64, val);
> +}
>  break;
>  case VIRTIO_PCI_COMMON_Q_DESCLO:
>  proxy->vqs[vdev->queue_sel].desc[0] = val;
> -- 
> 2.20.1
> 
>
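
The write handling in the quoted V2 patch reduces to the following standalone sketch. It is a toy model only: DemoQueue and demo_write_queue_enable() are hypothetical, and the broken flag merely stands in for whatever error reporting the device performs. It accepts exactly 1, which is the choice the question above is asking about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool enabled;  /* queue has been enabled by the driver */
    bool broken;   /* stand-in for the device's error state */
} DemoQueue;

/* Handle a driver write to the queue_enable field. */
void demo_write_queue_enable(DemoQueue *q, uint64_t val)
{
    assert(q);
    if (val == 1) {
        q->enabled = true;   /* the only value that enables the queue */
    } else {
        q->broken = true;    /* 0 and any other value are rejected */
    }
}
```

Treating every non-zero value as "enable" would be the alternative reading; this sketch follows the stricter one used in the patch.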

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Thomas Huth

On 10/06/2020 10.25, Philippe Mathieu-Daudé wrote:
> On 6/10/20 9:59 AM, Thomas Huth wrote:
>> On 10/06/2020 09.53, Philippe Mathieu-Daudé wrote:
>>> On 6/10/20 9:50 AM, Thomas Huth wrote:
 On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
> On 6/10/20 5:51 AM, Thomas Huth wrote:
>> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
>> required header config-devices.h is not included, so that the legacy
>> IGD passthrough is currently broken. Let's include the right header
>> to fix this issue.
>>
>> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
>> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>>("hw/vfio: Move the IGD quirk code to a separate file")
>
> What about shorter tag?
>
> Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")

 I always forget whether to use the short or the long version for
 "Fixes:" ... this can hopefully be fixed (if necessary) when the patch
 gets picked up.

>> Signed-off-by: Thomas Huth 
>> ---
>>  hw/vfio/pci-quirks.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
>> index f2155ddb1d..3158390db1 100644
>> --- a/hw/vfio/pci-quirks.c
>> +++ b/hw/vfio/pci-quirks.c
>> @@ -11,6 +11,7 @@
>>   */
>>  
>>  #include "qemu/osdep.h"
>> +#include "config-devices.h"
>
> I've been wondering how we can avoid that mistake in the
> future, but can't find anything besides human review.

 I think in the long term, we should include config-devices.h in osdep.h,
 just like config-host.h and config-target.h are already included there.
 Everything else is just too confusing. But then we should also add a
 mechanism to poison the switches from config-devices.h in common code...
>>>
>>> We only need it for the files under hw/, right?
>>
>> qtest.c in the main directory includes it, too.
> 
> hw/ and qtests could include "hw/hw.h" instead of affecting all the
> codebase via "qemu/osdep.h".

I don't think that's a good idea - in that case, you have to make sure
to include hw/hw.h everywhere again, so you don't gain that much
compared to including config-devices.h directly everywhere. osdep.h is
our header that has to be included everywhere, so if we want to make
sure that these defines are available everywhere, we have to include it
from osdep.h.
Apart from that, hw/hw.h just contains one more prototype - which likely
should be renamed to cpu_hw_error() and moved to a cpu header instead,
so that we can finally delete hw/hw.h completely.

 Thomas

Re: [PATCH RESEND v3 56/58] qdev: Convert bus-less devices to qdev_realize() with Coccinelle

2020-06-10 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> Hi Markus, Peter.
>
> On 6/10/20 7:32 AM, Markus Armbruster wrote:
>> All remaining conversions to qdev_realize() are for bus-less devices.
>> Coccinelle script:
>> 
>> // only correct for bus-less @dev!
>> 
>> @@
>> expression errp;
>> expression dev;
>> @@
>> -qdev_init_nofail(dev);
>> +qdev_realize(dev, NULL, &error_fatal);
>> 
>> @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
>> expression errp;
>> expression dev;
>> symbol true;
>> @@
>> -object_property_set_bool(OBJECT(dev), true, "realized", errp);
>> +qdev_realize(DEVICE(dev), NULL, errp);
>> 
>> @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
>> expression errp;
>> expression dev;
>> symbol true;
>> @@
>> -object_property_set_bool(dev, true, "realized", errp);
>> +qdev_realize(DEVICE(dev), NULL, errp);
>
> Finally. Now my earlier suggestion is easier to explain:
> Rename qdev_realize() -> sysbus_realize(), extracting the qdev_realize()
> part. qdev_realize() doesn't take a Bus argument anymore.
> Left for later.

I'm still confused.

Cases:

* Devices that plug into a bus: use qdev_realize() passing that bus.

  If there is a bus-specific wrapper, use that, for legibility.

  In particular, use sysbus_realize() for sysbus devices plugging into
  the main system bus.

* Devices that don't plug into a bus: use qdev_realize() passing a null
  bus.
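
The two cases above can be pictured with a toy model. None of this is QEMU code: DemoBus, DemoDevice, demo_realize() and demo_sysbus_realize() are hypothetical names, and the sketch only mirrors the calling convention described in the list (a bus pointer for plugged devices, NULL for bus-less ones, and a thin wrapper for the main system bus).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoBus {
    const char *name;
} DemoBus;

typedef struct DemoDevice {
    const DemoBus *parent_bus;  /* NULL for a bus-less device */
    bool realized;
} DemoDevice;

static const DemoBus demo_main_system_bus = { "main-system-bus" };

/* Generic entry point: pass the bus to plug into, or NULL for none. */
bool demo_realize(DemoDevice *dev, const DemoBus *bus)
{
    assert(dev && !dev->realized);
    dev->parent_bus = bus;
    dev->realized = true;
    return true;
}

/* Bus-specific wrapper, for legibility at the call site. */
bool demo_sysbus_realize(DemoDevice *dev)
{
    return demo_realize(dev, &demo_main_system_bus);
}
```

The wrapper adds no behaviour of its own; it only saves callers from spelling out the bus, which is the legibility argument made above.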

What would you like me to improve here?

>
>> 
>> Note that Coccinelle chokes on ARMSSE typedef vs. macro in
>> hw/arm/armsse.c.  Worked around by temporarily renaming the macro for
>> the spatch run.
>> 
>> Signed-off-by: Markus Armbruster 
>> Acked-by: Alistair Francis 
>> Reviewed-by: Paolo Bonzini 
>> ---
>>  hw/arm/allwinner-a10.c   |  2 +-
>>  hw/arm/allwinner-h3.c|  2 +-
>>  hw/arm/armsse.c  | 20 ++-
>>  hw/arm/armv7m.c  |  2 +-
>>  hw/arm/aspeed.c  |  3 +--
>>  hw/arm/aspeed_ast2600.c  |  2 +-
>>  hw/arm/aspeed_soc.c  |  2 +-
>>  hw/arm/bcm2836.c |  3 +--
>>  hw/arm/cubieboard.c  |  2 +-
>>  hw/arm/digic.c   |  2 +-
>>  hw/arm/digic_boards.c|  2 +-
>>  hw/arm/exynos4210.c  |  4 +--
>>  hw/arm/fsl-imx25.c   |  2 +-
>>  hw/arm/fsl-imx31.c   |  2 +-
>>  hw/arm/fsl-imx6.c|  2 +-
>>  hw/arm/fsl-imx6ul.c  |  3 +--
>>  hw/arm/fsl-imx7.c|  2 +-
>>  hw/arm/highbank.c|  2 +-
>>  hw/arm/imx25_pdk.c   |  2 +-
>>  hw/arm/integratorcp.c|  2 +-
>>  hw/arm/kzm.c |  2 +-
>>  hw/arm/mcimx6ul-evk.c|  2 +-
>>  hw/arm/mcimx7d-sabre.c   |  2 +-
>>  hw/arm/mps2-tz.c |  9 +++
>>  hw/arm/mps2.c|  7 +++---
>>  hw/arm/musca.c   |  6 ++---
>>  hw/arm/orangepi.c|  2 +-
>>  hw/arm/raspi.c   |  2 +-
>>  hw/arm/realview.c|  2 +-
>>  hw/arm/sabrelite.c   |  2 +-
>>  hw/arm/sbsa-ref.c|  2 +-
>>  hw/arm/stm32f205_soc.c   |  2 +-
>>  hw/arm/stm32f405_soc.c   |  2 +-
>>  hw/arm/versatilepb.c |  2 +-
>>  hw/arm/vexpress.c|  2 +-
>>  hw/arm/virt.c|  2 +-
>>  hw/arm/xilinx_zynq.c |  2 +-
>>  hw/arm/xlnx-versal.c |  2 +-
>>  hw/arm/xlnx-zcu102.c |  2 +-
>>  hw/arm/xlnx-zynqmp.c | 10 +++-
>
> Peter, you might want to skim the changes (other
> ARM devices outside hw/arm/ are involved), but to summarize,
> basically these types are not SysBusDev:
>
> - cpu
> - soc / container
> - or-gate / irq-splitter
>
> I reviewed all of them.
>
> Next is for Markus.
>
>>  hw/char/serial-isa.c |  2 +-
>>  hw/char/serial-pci-multi.c   |  2 +-
>>  hw/char/serial-pci.c |  2 +-
>>  hw/char/serial.c |  4 +--
>
> I need to review again hw/char/serial-isa.c, it is
> not clear why it is a container and not a SysBusDevice.

TYPE_SERIAL is a bus-less TYPE_DEVICE.

TYPE_ISA_SERIAL is its adapter for the ISA bus.  It contains one
TYPE_SERIAL child.

TYPE_SERIAL_MM is its adapter for the sysbus pseudo-bus.  It contains
one TYPE_SERIAL child.

TYPE_PCI_SERIAL, "pci-serial-2x", "pci-serial-4x" are adapters for the
PCI bus.  They contain one, two and four TYPE_SERIAL respectively.

Exemplary use of QOM, I think.

>>  hw/ide/microdrive.c

Re: [PATCH 2/2] block: Call attention to truncation of long NBD exports

2020-06-10 Thread Vladimir Sementsov-Ogievskiy


08.06.2020 21:26, Eric Blake wrote:

Commit 93676c88 relaxed our NBD client code to request export names up
to the NBD protocol maximum of 4096 bytes without NUL terminator, even
though the block layer can't store anything longer than 4096 bytes
including NUL terminator for display to the user.  Since this means
there are some export names where we have to truncate things, we can
at least try to make the truncation a bit more obvious for the user.
Note that in spite of the truncated display name, we can still
communicate with an NBD server using such a long export name; this was
deemed nicer than refusing to even connect to such a server (since the
server may not be under our control, and since determining our actual
length limits gets tricky when nbd://host:port/export and
nbd+unix:///export?socket=/path are themselves variable-length
expansions beyond the export name but count towards the block layer
name length).
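
To illustrate the truncation hint described above, here is a minimal
standalone sketch. format_with_hint() is a hypothetical helper made up for
this example, not a QEMU function; it only assumes the standard snprintf()
contract (the return value is the length that would have been written,
excluding the NUL terminator).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): format "json:<opts>" into buf.
 * If the result did not fit, overwrite the tail of the buffer with "..."
 * as a hint to the user. Returns 1 when the name was truncated, else 0.
 */
static int format_with_hint(char *buf, size_t size, const char *opts)
{
    int len = snprintf(buf, size, "json:%s", opts);

    /* A return value >= size means at least one character was dropped. */
    if (len >= 0 && (size_t)len >= size) {
        if (size >= 4) {
            /* The last four bytes become '.', '.', '.', '\0'. */
            strcpy(buf + size - 4, "...");
        }
        return 1;
    }
    return 0;
}
```

The comparison uses >= rather than >, because a return value equal to the
buffer size already means the NUL terminator took the place of the last
character.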

Reported-by: Xueqiang Wei 
Fixes: https://bugzilla.redhat.com/1843684
Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  block.c |  7 +--
  block/nbd.c | 21 +
  2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 8416376c9b71..6dbcb7e083ea 100644
--- a/block.c
+++ b/block.c
@@ -6809,8 +6809,11 @@ void bdrv_refresh_filename(BlockDriverState *bs)
  pstrcpy(bs->filename, sizeof(bs->filename), bs->exact_filename);
  } else {
  QString *json = qobject_to_json(QOBJECT(bs->full_open_options));
-snprintf(bs->filename, sizeof(bs->filename), "json:%s",
- qstring_get_str(json));
+if (snprintf(bs->filename, sizeof(bs->filename), "json:%s",
+ qstring_get_str(json)) >= sizeof(bs->filename)) {
+/* Give user a hint if we truncated things. */
+strcpy(bs->filename + sizeof(bs->filename) - 4, "...");
+}


Is 4096 really enough for json in normal cases?


  qobject_unref(json);
  }
  }
diff --git a/block/nbd.c b/block/nbd.c
index 4ac23c8f6299..eed160c5cda1 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1984,6 +1984,7 @@ static void nbd_refresh_filename(BlockDriverState *bs)
  {
  BDRVNBDState *s = bs->opaque;
  const char *host = NULL, *port = NULL, *path = NULL;
+size_t len = 0;

  if (s->saddr->type == SOCKET_ADDRESS_TYPE_INET) {
  const InetSocketAddress *inet = &s->saddr->u.inet;
@@ -1996,17 +1997,21 @@ static void nbd_refresh_filename(BlockDriverState *bs)
  } /* else can't represent as pseudo-filename */

  if (path && s->export) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd+unix:///%s?socket=%s", s->export, path);
+len = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
+   "nbd+unix:///%s?socket=%s", s->export, path);
  } else if (path && !s->export) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd+unix://?socket=%s", path);
+len = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
+   "nbd+unix://?socket=%s", path);
  } else if (host && s->export) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s:%s/%s", host, port, s->export);
+len = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
+   "nbd://%s:%s/%s", host, port, s->export);
  } else if (host && !s->export) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s:%s", host, port);
+len = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
+   "nbd://%s:%s", host, port);
+}
+if (len > sizeof(bs->exact_filename)) {
+/* Name is too long to represent exactly, so leave it empty. */
+bs->exact_filename[0] = '\0';
  }
  }




--
Best regards,
Vladimir

Re: [PATCH v6 00/16] acpi: i386 tweaks

2020-06-10 Thread Gerd Hoffmann

  Hi,

> Applied patch 1-7. Rest all look ok but couldn't apply
> since they seem to be on top of some other cleanups
> which are not upstream. Pls rebased and post and I'll apply.

Other way around, it is based on older master and we've got conflicting
patches merged upstream meanwhile.

I'll rebase & repost.

take care,
  Gerd

[PATCH v7 3/9] floppy: move cmos_get_fd_drive_type() from pc

2020-06-10 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: John Snow 
---
 include/hw/block/fdc.h |  1 +
 include/hw/i386/pc.h   |  1 -
 hw/block/fdc.c | 26 +-
 hw/i386/pc.c   | 25 -
 4 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 5d71cf972268..479cebc0a330 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -16,5 +16,6 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
+int cmos_get_fd_drive_type(FloppyDriveType fd0);
 
 #endif
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 8d764f965cd3..5e3b19ab78fc 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -176,7 +176,6 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
 void pc_i8259_create(ISABus *isa_bus, qemu_irq *i8259_irqs);
 
 ISADevice *pc_find_fdc0(void);
-int cmos_get_fd_drive_type(FloppyDriveType fd0);
 
 /* port92.c */
 #define PORT92_A20_LINE "a20"
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 8024c822cea3..ea0fb8ee15b9 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -32,7 +32,6 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/timer.h"
-#include "hw/i386/pc.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/irq.h"
 #include "hw/isa/isa.h"
@@ -2809,6 +2808,31 @@ static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
 return dev;
 }
 
+int cmos_get_fd_drive_type(FloppyDriveType fd0)
+{
+int val;
+
+switch (fd0) {
+case FLOPPY_DRIVE_TYPE_144:
+/* 1.44 Mb 3"5 drive */
+val = 4;
+break;
+case FLOPPY_DRIVE_TYPE_288:
+/* 2.88 Mb 3"5 drive */
+val = 5;
+break;
+case FLOPPY_DRIVE_TYPE_120:
+/* 1.2 Mb 5"5 drive */
+val = 2;
+break;
+case FLOPPY_DRIVE_TYPE_NONE:
+default:
+val = 0;
+break;
+}
+return val;
+}
+
 static void fdc_isa_build_aml(ISADevice *isadev, Aml *scope)
 {
 Aml *dev;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2128f3d6fe8b..c5db7be6d8b1 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -385,31 +385,6 @@ static uint64_t ioportF0_read(void *opaque, hwaddr addr, unsigned size)
 
 #define REG_EQUIPMENT_BYTE  0x14
 
-int cmos_get_fd_drive_type(FloppyDriveType fd0)
-{
-int val;
-
-switch (fd0) {
-case FLOPPY_DRIVE_TYPE_144:
-/* 1.44 Mb 3"5 drive */
-val = 4;
-break;
-case FLOPPY_DRIVE_TYPE_288:
-/* 2.88 Mb 3"5 drive */
-val = 5;
-break;
-case FLOPPY_DRIVE_TYPE_120:
-/* 1.2 Mb 5"5 drive */
-val = 2;
-break;
-case FLOPPY_DRIVE_TYPE_NONE:
-default:
-val = 0;
-break;
-}
-return val;
-}
-
 static void cmos_init_hd(ISADevice *s, int type_ofs, int info_ofs,
  int16_t cylinders, int8_t heads, int8_t sectors)
 {
-- 
2.18.4

[PATCH v7 8/9] acpi: drop build_piix4_pm()

2020-06-10 Thread Gerd Hoffmann

The _SB.PCI0.PX13.P13C opregion (holds isa device enable bits)
is not used any more, remove it from DSDT.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
---
 hw/i386/acpi-build.c | 16 
 1 file changed, 16 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 750fcf9baa37..02cf4199c2e9 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1320,21 +1320,6 @@ static void build_q35_isa_bridge(Aml *table)
 aml_append(table, scope);
 }
 
-static void build_piix4_pm(Aml *table)
-{
-Aml *dev;
-Aml *scope;
-
-scope =  aml_scope("_SB.PCI0");
-dev = aml_device("PX13");
-aml_append(dev, aml_name_decl("_ADR", aml_int(0x00010003)));
-
-aml_append(dev, aml_operation_region("P13C", AML_PCI_CONFIG,
- aml_int(0x00), 0xff));
-aml_append(scope, dev);
-aml_append(table, scope);
-}
-
 static void build_piix4_isa_bridge(Aml *table)
 {
 Aml *dev;
@@ -1486,7 +1471,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 aml_append(dsdt, sb_scope);
 
 build_hpet_aml(dsdt);
-build_piix4_pm(dsdt);
 build_piix4_isa_bridge(dsdt);
 build_isa_devices_aml(dsdt);
 build_piix4_pci_hotplug(dsdt);
-- 
2.18.4

[PATCH v7 2/9] floppy: make isa_fdc_get_drive_max_chs static

2020-06-10 Thread Gerd Hoffmann

acpi aml generator needs this, but it is in floppy code now
so we can make the function static.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: John Snow 
---
 include/hw/block/fdc.h | 2 --
 hw/block/fdc.c | 4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index c15ff4c62315..5d71cf972268 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -16,7 +16,5 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
-void isa_fdc_get_drive_max_chs(FloppyDriveType type,
-   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs);
 
 #endif
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index b4d2eaf66dcd..8024c822cea3 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2744,8 +2744,8 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 return isa->state.drives[i].drive;
 }
 
-void isa_fdc_get_drive_max_chs(FloppyDriveType type,
-   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs)
+static void isa_fdc_get_drive_max_chs(FloppyDriveType type, uint8_t *maxc,
+  uint8_t *maxh, uint8_t *maxs)
 {
 const FDFormat *fdf;
 
-- 
2.18.4

[PATCH v7 1/9] acpi: move aml builder code for floppy device

2020-06-10 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
---
 hw/block/fdc.c   | 83 
 hw/i386/acpi-build.c | 83 
 stubs/cmos.c |  7 
 stubs/Makefile.objs  |  1 +
 4 files changed, 91 insertions(+), 83 deletions(-)
 create mode 100644 stubs/cmos.c

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index c5fb9d6ece77..b4d2eaf66dcd 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -32,6 +32,8 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/timer.h"
+#include "hw/i386/pc.h"
+#include "hw/acpi/aml-build.h"
 #include "hw/irq.h"
 #include "hw/isa/isa.h"
 #include "hw/qdev-properties.h"
@@ -2765,6 +2767,85 @@ void isa_fdc_get_drive_max_chs(FloppyDriveType type,
 (*maxc)--;
 }
 
+static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
+{
+Aml *dev, *fdi;
+uint8_t maxc, maxh, maxs;
+
+isa_fdc_get_drive_max_chs(type, &maxc, &maxh, &maxs);
+
+dev = aml_device("FLP%c", 'A' + idx);
+
+aml_append(dev, aml_name_decl("_ADR", aml_int(idx)));
+
+fdi = aml_package(16);
+aml_append(fdi, aml_int(idx));  /* Drive Number */
+aml_append(fdi,
+aml_int(cmos_get_fd_drive_type(type)));  /* Device Type */
+/*
+ * the values below are the limits of the drive, and are thus independent
+ * of the inserted media
+ */
+aml_append(fdi, aml_int(maxc));  /* Maximum Cylinder Number */
+aml_append(fdi, aml_int(maxs));  /* Maximum Sector Number */
+aml_append(fdi, aml_int(maxh));  /* Maximum Head Number */
+/*
+ * SeaBIOS returns the below values for int 0x13 func 0x08 regardless of
+ * the drive type, so shall we
+ */
+aml_append(fdi, aml_int(0xAF));  /* disk_specify_1 */
+aml_append(fdi, aml_int(0x02));  /* disk_specify_2 */
+aml_append(fdi, aml_int(0x25));  /* disk_motor_wait */
+aml_append(fdi, aml_int(0x02));  /* disk_sector_siz */
+aml_append(fdi, aml_int(0x12));  /* disk_eot */
+aml_append(fdi, aml_int(0x1B));  /* disk_rw_gap */
+aml_append(fdi, aml_int(0xFF));  /* disk_dtl */
+aml_append(fdi, aml_int(0x6C));  /* disk_formt_gap */
+aml_append(fdi, aml_int(0xF6));  /* disk_fill */
+aml_append(fdi, aml_int(0x0F));  /* disk_head_sttl */
+aml_append(fdi, aml_int(0x08));  /* disk_motor_strt */
+
+aml_append(dev, aml_name_decl("_FDI", fdi));
+return dev;
+}
+
+static void fdc_isa_build_aml(ISADevice *isadev, Aml *scope)
+{
+Aml *dev;
+Aml *crs;
+int i;
+
+#define ACPI_FDE_MAX_FD 4
+uint32_t fde_buf[5] = {
+0, 0, 0, 0, /* presence of floppy drives #0 - #3 */
+cpu_to_le32(2)  /* tape presence (2 == never present) */
+};
+
+crs = aml_resource_template();
+aml_append(crs, aml_io(AML_DECODE16, 0x03F2, 0x03F2, 0x00, 0x04));
+aml_append(crs, aml_io(AML_DECODE16, 0x03F7, 0x03F7, 0x00, 0x01));
+aml_append(crs, aml_irq_no_flags(6));
+aml_append(crs,
+aml_dma(AML_COMPATIBILITY, AML_NOTBUSMASTER, AML_TRANSFER8, 2));
+
+dev = aml_device("FDC0");
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0700")));
+aml_append(dev, aml_name_decl("_CRS", crs));
+
+for (i = 0; i < MIN(MAX_FD, ACPI_FDE_MAX_FD); i++) {
+FloppyDriveType type = isa_fdc_get_drive_type(isadev, i);
+
+if (type < FLOPPY_DRIVE_TYPE_NONE) {
+fde_buf[i] = cpu_to_le32(1);  /* drive present */
+aml_append(dev, build_fdinfo_aml(i, type));
+}
+}
+aml_append(dev, aml_name_decl("_FDE",
+   aml_buffer(sizeof(fde_buf), (uint8_t *)fde_buf)));
+
+aml_append(scope, dev);
+}
+
 static const VMStateDescription vmstate_isa_fdc ={
 .name = "fdc",
 .version_id = 2,
@@ -2798,11 +2879,13 @@ static Property isa_fdc_properties[] = {
 static void isabus_fdc_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
+ISADeviceClass *isa = ISA_DEVICE_CLASS(klass);
 
 dc->realize = isabus_fdc_realize;
 dc->fw_name = "fdc";
 dc->reset = fdctrl_external_reset_isa;
 dc->vmsd = &vmstate_isa_fdc;
+isa->build_aml = fdc_isa_build_aml;
 device_class_set_props(dc, isa_fdc_properties);
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 473cbdfffd05..7726d5c0f7cb 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -937,85 +937,6 @@ static void build_hpet_aml(Aml *table)
 aml_append(table, scope);
 }
 
-static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
-{
-Aml *dev, *fdi;
-uint8_t maxc, maxh, maxs;
-
-isa_fdc_get_drive_max_chs(type, &maxc, &maxh, &maxs);
-
-dev = aml_device("FLP%c", 'A' + idx);
-
-aml_append(dev, aml_name_decl("_ADR", aml_int(idx)));
-
-fdi = aml_package(16);
-aml_append(fdi, aml_int(idx));  /* Drive Number */
-aml_append(fdi,
-aml_int(cmos_get_fd_drive_type(type)));  /* Device Type */
-/

[PATCH v7 9/9] acpi: q35: drop _SB.PCI0.ISA.LPCD opregion.

2020-06-10 Thread Gerd Hoffmann

Seems to be unused.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
---
 hw/i386/acpi-build.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 02cf4199c2e9..d93ea40c58b9 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1296,7 +1296,6 @@ static void build_q35_isa_bridge(Aml *table)
 {
 Aml *dev;
 Aml *scope;
-Aml *field;
 
 scope =  aml_scope("_SB.PCI0");
 dev = aml_device("ISA");
@@ -1306,16 +1305,6 @@ static void build_q35_isa_bridge(Aml *table)
 aml_append(dev, aml_operation_region("PIRQ", AML_PCI_CONFIG,
  aml_int(0x60), 0x0C));
 
-aml_append(dev, aml_operation_region("LPCD", AML_PCI_CONFIG,
- aml_int(0x80), 0x02));
-field = aml_field("LPCD", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
-aml_append(field, aml_named_field("COMA", 3));
-aml_append(field, aml_reserved_field(1));
-aml_append(field, aml_named_field("COMB", 3));
-aml_append(field, aml_reserved_field(1));
-aml_append(field, aml_named_field("LPTD", 2));
-aml_append(dev, field);
-
 aml_append(scope, dev);
 aml_append(table, scope);
 }
-- 
2.18.4

[PATCH v7 7/9] acpi: drop serial/parallel enable bits from dsdt

2020-06-10 Thread Gerd Hoffmann

The _STA methods for COM+LPT used to reference them,
but that isn't the case any more.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
---
 hw/i386/acpi-build.c | 23 ---
 1 file changed, 23 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index c8e47700fc53..750fcf9baa37 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1316,15 +1316,6 @@ static void build_q35_isa_bridge(Aml *table)
 aml_append(field, aml_named_field("LPTD", 2));
 aml_append(dev, field);
 
-aml_append(dev, aml_operation_region("LPCE", AML_PCI_CONFIG,
- aml_int(0x82), 0x02));
-/* enable bits */
-field = aml_field("LPCE", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
-aml_append(field, aml_named_field("CAEN", 1));
-aml_append(field, aml_named_field("CBEN", 1));
-aml_append(field, aml_named_field("LPEN", 1));
-aml_append(dev, field);
-
 aml_append(scope, dev);
 aml_append(table, scope);
 }
@@ -1348,7 +1339,6 @@ static void build_piix4_isa_bridge(Aml *table)
 {
 Aml *dev;
 Aml *scope;
-Aml *field;
 
 scope =  aml_scope("_SB.PCI0");
 dev = aml_device("ISA");
@@ -1357,19 +1347,6 @@ static void build_piix4_isa_bridge(Aml *table)
 /* PIIX PCI to ISA irq remapping */
 aml_append(dev, aml_operation_region("P40C", AML_PCI_CONFIG,
  aml_int(0x60), 0x04));
-/* enable bits */
-field = aml_field("^PX13.P13C", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
-/* Offset(0x5f),, 7, */
-aml_append(field, aml_reserved_field(0x2f8));
-aml_append(field, aml_reserved_field(7));
-aml_append(field, aml_named_field("LPEN", 1));
-/* Offset(0x67),, 3, */
-aml_append(field, aml_reserved_field(0x38));
-aml_append(field, aml_reserved_field(3));
-aml_append(field, aml_named_field("CAEN", 1));
-aml_append(field, aml_reserved_field(3));
-aml_append(field, aml_named_field("CBEN", 1));
-aml_append(dev, field);
 
 aml_append(scope, dev);
 aml_append(table, scope);
-- 
2.18.4

[PATCH v7 4/9] acpi: move aml builder code for i8042 (kbd+mouse) device

2020-06-10 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Igor Mammedov 
---
 hw/i386/acpi-build.c | 39 ---
 hw/input/pckbd.c | 31 +++
 2 files changed, 31 insertions(+), 39 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 7726d5c0f7cb..9fed13a27333 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -937,42 +937,6 @@ static void build_hpet_aml(Aml *table)
 aml_append(table, scope);
 }
 
-static Aml *build_kbd_device_aml(void)
-{
-Aml *dev;
-Aml *crs;
-
-dev = aml_device("KBD");
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0303")));
-
-aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
-
-crs = aml_resource_template();
-aml_append(crs, aml_io(AML_DECODE16, 0x0060, 0x0060, 0x01, 0x01));
-aml_append(crs, aml_io(AML_DECODE16, 0x0064, 0x0064, 0x01, 0x01));
-aml_append(crs, aml_irq_no_flags(1));
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-return dev;
-}
-
-static Aml *build_mouse_device_aml(void)
-{
-Aml *dev;
-Aml *crs;
-
-dev = aml_device("MOU");
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0F13")));
-
-aml_append(dev, aml_name_decl("_STA", aml_int(0xf)));
-
-crs = aml_resource_template();
-aml_append(crs, aml_irq_no_flags(12));
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-return dev;
-}
-
 static void build_isa_devices_aml(Aml *table)
 {
 bool ambiguous;
@@ -980,9 +944,6 @@ static void build_isa_devices_aml(Aml *table)
 Aml *scope = aml_scope("_SB.PCI0.ISA");
 Object *obj = object_resolve_path_type("", TYPE_ISA_BUS, &ambiguous);
 
-aml_append(scope, build_kbd_device_aml());
-aml_append(scope, build_mouse_device_aml());
-
 if (ambiguous) {
 error_report("Multiple ISA busses, unable to define IPMI ACPI data");
 } else if (!obj) {
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index 60a41303203a..29d633ca9478 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -26,6 +26,7 @@
 #include "qemu/log.h"
 #include "hw/isa/isa.h"
 #include "migration/vmstate.h"
+#include "hw/acpi/aml-build.h"
 #include "hw/input/ps2.h"
 #include "hw/irq.h"
 #include "hw/input/i8042.h"
@@ -561,12 +562,42 @@ static void i8042_realizefn(DeviceState *dev, Error **errp)
 qemu_register_reset(kbd_reset, s);
 }
 
+static void i8042_build_aml(ISADevice *isadev, Aml *scope)
+{
+Aml *kbd;
+Aml *mou;
+Aml *crs;
+
+crs = aml_resource_template();
+aml_append(crs, aml_io(AML_DECODE16, 0x0060, 0x0060, 0x01, 0x01));
+aml_append(crs, aml_io(AML_DECODE16, 0x0064, 0x0064, 0x01, 0x01));
+aml_append(crs, aml_irq_no_flags(1));
+
+kbd = aml_device("KBD");
+aml_append(kbd, aml_name_decl("_HID", aml_eisaid("PNP0303")));
+aml_append(kbd, aml_name_decl("_STA", aml_int(0xf)));
+aml_append(kbd, aml_name_decl("_CRS", crs));
+
+crs = aml_resource_template();
+aml_append(crs, aml_irq_no_flags(12));
+
+mou = aml_device("MOU");
+aml_append(mou, aml_name_decl("_HID", aml_eisaid("PNP0F13")));
+aml_append(mou, aml_name_decl("_STA", aml_int(0xf)));
+aml_append(mou, aml_name_decl("_CRS", crs));
+
+aml_append(scope, kbd);
+aml_append(scope, mou);
+}
+
 static void i8042_class_initfn(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
+ISADeviceClass *isa = ISA_DEVICE_CLASS(klass);
 
 dc->realize = i8042_realizefn;
 dc->vmsd = &vmstate_kbd_isa;
+isa->build_aml = i8042_build_aml;
 set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
-- 
2.18.4

[PATCH v7 0/9] acpi: i386 tweaks

2020-06-10 Thread Gerd Hoffmann

First batch of microvm patches, some generic acpi stuff.
Split the acpi-build.c monster, specifically split the
pc and q35 and pci bits into a separate file which we
can skip building at some point in the future.

v2 changes: leave acpi-build.c largely as-is, move useful
bits to other places to allow them being reused, specifically:

 * move isa device generator functions to individual isa devices.
 * move fw_cfg generator function to fw_cfg.c

v3 changes: fix rtc, support multiple lpt devices.

v4 changes:
 * drop merged patches.
 * split rtc crs change to separate patch.
 * added two cleanup patches.
 * picked up ack & review tags.

v5 changes:
 * add comment for rtc crs update.
 * add even more cleanup patches.
 * picked up ack & review tags.

v6 changes:
 * floppy: move cmos_get_fd_drive_type.
 * picked up ack & review tags.

v7 changes:
 * rebased to mst/pci branch, resolved stubs conflict.
 * dropped patches already queued up in mst/pci.
 * added missing sign-off.
 * picked up ack & review tags.

take care,
  Gerd

Gerd Hoffmann (9):
  acpi: move aml builder code for floppy device
  floppy: make isa_fdc_get_drive_max_chs static
  floppy: move cmos_get_fd_drive_type() from pc
  acpi: move aml builder code for i8042 (kbd+mouse) device
  acpi: factor out fw_cfg_add_acpi_dsdt()
  acpi: simplify build_isa_devices_aml()
  acpi: drop serial/parallel enable bits from dsdt
  acpi: drop build_piix4_pm()
  acpi: q35: drop _SB.PCI0.ISA.LPCD opregion.

 hw/i386/fw_cfg.h   |   1 +
 include/hw/block/fdc.h |   3 +-
 include/hw/i386/pc.h   |   1 -
 hw/block/fdc.c | 111 +-
 hw/i386/acpi-build.c   | 211 ++---
 hw/i386/fw_cfg.c   |  28 ++
 hw/i386/pc.c   |  25 -
 hw/input/pckbd.c   |  31 ++
 stubs/cmos.c   |   7 ++
 stubs/Makefile.objs|   1 +
 10 files changed, 184 insertions(+), 235 deletions(-)
 create mode 100644 stubs/cmos.c

-- 
2.18.4

[PATCH v7 6/9] acpi: simplify build_isa_devices_aml()

2020-06-10 Thread Gerd Hoffmann

x86 machines can have a single ISA bus only.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Igor Mammedov 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/i386/acpi-build.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 86be45eea17c..c8e47700fc53 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -940,19 +940,14 @@ static void build_hpet_aml(Aml *table)
 static void build_isa_devices_aml(Aml *table)
 {
 bool ambiguous;
-
-Aml *scope = aml_scope("_SB.PCI0.ISA");
 Object *obj = object_resolve_path_type("", TYPE_ISA_BUS, &ambiguous);
+Aml *scope;
 
-if (ambiguous) {
-error_report("Multiple ISA busses, unable to define IPMI ACPI data");
-} else if (!obj) {
-error_report("No ISA bus, unable to define IPMI ACPI data");
-} else {
-build_acpi_ipmi_devices(scope, BUS(obj), "\\_SB.PCI0.ISA");
-isa_build_aml(ISA_BUS(obj), scope);
-}
+assert(obj && !ambiguous);
 
+scope = aml_scope("_SB.PCI0.ISA");
+build_acpi_ipmi_devices(scope, BUS(obj), "\\_SB.PCI0.ISA");
+isa_build_aml(ISA_BUS(obj), scope);
 aml_append(table, scope);
 }
 
-- 
2.18.4

[PATCH v7 5/9] acpi: factor out fw_cfg_add_acpi_dsdt()

2020-06-10 Thread Gerd Hoffmann

Add helper function to add fw_cfg device,
also move code to hw/i386/fw_cfg.c.
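
For readers following the io_size computation that moves with this helper,
a rough standalone sketch of the arithmetic is below. The value of
FW_CFG_CTL_SIZE and the width of dma_addr_t are assumptions made for
illustration (they are not shown in this patch), and fw_cfg_io_size() is a
hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 16-bit control register, 64-bit DMA address. */
#define FW_CFG_CTL_SIZE 2
typedef uint64_t dma_addr_t;

/* Round n up to the next multiple of d. */
#define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

/*
 * Hypothetical helper: size of the fw_cfg i/o region.
 * Without DMA only the control register (overlapping the data register)
 * is decoded; with DMA the region is padded to 4 bytes and the DMA
 * address register follows.
 */
static uint8_t fw_cfg_io_size(bool dma_enabled)
{
    return dma_enabled
        ? (uint8_t)(ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t))
        : (uint8_t)FW_CFG_CTL_SIZE;
}
```

Under these assumptions the region is 2 bytes for port i/o only and 12
bytes when DMA is enabled.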

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Igor Mammedov 
---
 hw/i386/fw_cfg.h |  1 +
 hw/i386/acpi-build.c | 24 +---
 hw/i386/fw_cfg.c | 28 
 3 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
index 9e742787792b..275f15c1c5e8 100644
--- a/hw/i386/fw_cfg.h
+++ b/hw/i386/fw_cfg.h
@@ -25,5 +25,6 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
uint16_t apic_id_limit);
 void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
 void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
+void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg);
 
 #endif
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9fed13a27333..86be45eea17c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1759,30 +1759,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
 /* create fw_cfg node, unconditionally */
 {
-/* when using port i/o, the 8-bit data register *always* overlaps
- * with half of the 16-bit control register. Hence, the total size
- * of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
- * DMA control register is located at FW_CFG_DMA_IO_BASE + 4 */
-uint8_t io_size = object_property_get_bool(OBJECT(x86ms->fw_cfg),
-   "dma_enabled", NULL) ?
-  ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
-  FW_CFG_CTL_SIZE;
-
 scope = aml_scope("\\_SB.PCI0");
-dev = aml_device("FWCF");
-
-aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
-
-/* device present, functioning, decoding, not shown in UI */
-aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
-
-crs = aml_resource_template();
-aml_append(crs,
-aml_io(AML_DECODE16, FW_CFG_IO_BASE, FW_CFG_IO_BASE, 0x01, io_size)
-);
-aml_append(dev, aml_name_decl("_CRS", crs));
-
-aml_append(scope, dev);
+fw_cfg_add_acpi_dsdt(scope, x86ms->fw_cfg);
 aml_append(dsdt, scope);
 }
 
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index da60ada59462..c55abfb01abb 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -15,6 +15,7 @@
 #include "qemu/osdep.h"
 #include "sysemu/numa.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
 #include "hw/firmware/smbios.h"
 #include "hw/i386/fw_cfg.h"
 #include "hw/timer/hpet.h"
@@ -179,3 +180,30 @@ void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg)
 *val = cpu_to_le64(feature_control_bits | FEATURE_CONTROL_LOCKED);
 fw_cfg_add_file(fw_cfg, "etc/msr_feature_control", val, sizeof(*val));
 }
+
+void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg)
+{
+/*
+ * when using port i/o, the 8-bit data register *always* overlaps
+ * with half of the 16-bit control register. Hence, the total size
+ * of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
+ * DMA control register is located at FW_CFG_DMA_IO_BASE + 4
+ */
+Object *obj = OBJECT(fw_cfg);
+uint8_t io_size = object_property_get_bool(obj, "dma_enabled", NULL) ?
+ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
+FW_CFG_CTL_SIZE;
+Aml *dev = aml_device("FWCF");
+Aml *crs = aml_resource_template();
+
+aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
+
+/* device present, functioning, decoding, not shown in UI */
+aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
+
+aml_append(crs,
+aml_io(AML_DECODE16, FW_CFG_IO_BASE, FW_CFG_IO_BASE, 0x01, io_size));
+
+aml_append(dev, aml_name_decl("_CRS", crs));
+aml_append(scope, dev);
+}
-- 
2.18.4

Re: [PATCH V2] virtio-pci: fix queue_enable write

2020-06-10 Thread Michael S. Tsirkin

On Wed, Jun 10, 2020 at 10:57:26AM +0200, Stefano Garzarella wrote:
> On Wed, Jun 10, 2020 at 01:43:51PM +0800, Jason Wang wrote:
> > Spec said: The driver uses this to selectively prevent the device from
> > executing requests from this virtqueue. 1 - enabled; 0 - disabled.
> > 
> > Though write 0 to queue_enable is forbidden by the spec, we should not
> > assume that the value is 1.
> > 
> > Fix this by ignore the write value other than 1.
> > 
> > Signed-off-by: Jason Wang 
> > ---
> > Changes from V1:
> > - fix typo
> > - warn wrong value through virtio_error
> > ---
> >  hw/virtio/virtio-pci.c | 12 
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index d028c17c24..7bc8c1c056 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -1273,16 +1273,20 @@ static void virtio_pci_common_write(void *opaque, 
> > hwaddr addr,
> >  virtio_queue_set_vector(vdev, vdev->queue_sel, val);
> >  break;
> >  case VIRTIO_PCI_COMMON_Q_ENABLE:
> > -virtio_queue_set_num(vdev, vdev->queue_sel,
> > - proxy->vqs[vdev->queue_sel].num);
> > -virtio_queue_set_rings(vdev, vdev->queue_sel,
> > +if (val == 1) {
> 
> Does it have to be 1 or can it be any value other than 0?
> 
> Thanks,
> Stefano

spec says 1

> > +virtio_queue_set_num(vdev, vdev->queue_sel,
> > + proxy->vqs[vdev->queue_sel].num);
> > +virtio_queue_set_rings(vdev, vdev->queue_sel,
> > ((uint64_t)proxy->vqs[vdev->queue_sel].desc[1]) << 
> > 32 |
> > proxy->vqs[vdev->queue_sel].desc[0],
> > ((uint64_t)proxy->vqs[vdev->queue_sel].avail[1]) << 
> > 32 |
> > proxy->vqs[vdev->queue_sel].avail[0],
> > ((uint64_t)proxy->vqs[vdev->queue_sel].used[1]) << 
> > 32 |
> > proxy->vqs[vdev->queue_sel].used[0]);
> > -proxy->vqs[vdev->queue_sel].enabled = 1;
> > +proxy->vqs[vdev->queue_sel].enabled = 1;
> > +} else {
> > +virtio_error(vdev, "wrong value for queue_enable %"PRIx64, 
> > val);
> > +}
> >  break;
> >  case VIRTIO_PCI_COMMON_Q_DESCLO:
> >  proxy->vqs[vdev->queue_sel].desc[0] = val;
> > -- 
> > 2.20.1
> > 
> >

[PATCH] qcow2: Fix preallocation on images with unaligned sizes

2020-06-10 Thread Alberto Garcia

When resizing an image with qcow2_co_truncate() using the falloc or
full preallocation modes the code assumes that both the old and new
sizes are cluster-aligned.

There are two problems with this:

  1) The calculation of how many clusters are involved does not always
 get the right result.

 Example: creating a 60KB image and resizing it (with
 preallocation=full) to 80KB won't allocate the second cluster.

  2) No copy-on-write is performed, so in the previous example if
 there is a backing file then the first 60KB of the first cluster
 won't be filled with data from the backing file.

This patch fixes both issues.
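
The first problem can be checked with a small standalone sketch of the two
calculations. The helpers below are simplified stand-ins written for this
example (the real start_of_cluster() takes the qcow2 state, not a size), and
clusters_old()/clusters_new() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for ROUND_UP, DIV_ROUND_UP and start_of_cluster(). */
static uint64_t round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d * d;
}

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

static uint64_t start_of_cluster(uint64_t offset, uint64_t cluster_size)
{
    /* cluster_size is a power of two */
    return offset & ~(cluster_size - 1);
}

/* Old calculation: only looks at the size difference. */
static uint64_t clusters_old(uint64_t old_len, uint64_t new_len,
                             unsigned cluster_bits)
{
    return div_round_up(new_len - old_len, UINT64_C(1) << cluster_bits);
}

/* New calculation: counts every cluster touched by [old_len, new_len). */
static uint64_t clusters_new(uint64_t old_len, uint64_t new_len,
                             unsigned cluster_bits)
{
    uint64_t cluster_size = UINT64_C(1) << cluster_bits;

    return (round_up(new_len, cluster_size) -
            start_of_cluster(old_len, cluster_size)) >> cluster_bits;
}
```

With 64KB clusters, growing from 60KB to 80KB gives 1 cluster with the old
formula and 2 with the new one; for cluster-aligned sizes both agree.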

Signed-off-by: Alberto Garcia 
---
 block/qcow2.c  | 17 ++---
 tests/qemu-iotests/125 | 21 +
 tests/qemu-iotests/125.out |  9 +
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0cd2e6757e..e20590c3b7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4239,8 +4239,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 old_file_size = ROUND_UP(old_file_size, s->cluster_size);
 }
 
-nb_new_data_clusters = DIV_ROUND_UP(offset - old_length,
-s->cluster_size);
+nb_new_data_clusters = (ROUND_UP(offset, s->cluster_size) -
+start_of_cluster(s, old_length)) >> s->cluster_bits;
 
 /* This is an overestimation; we will not actually allocate space for
  * these in the file but just make sure the new refcount structures are
@@ -4317,10 +4317,21 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 int64_t nb_clusters = MIN(
 nb_new_data_clusters,
 s->l2_slice_size - offset_to_l2_slice_index(s, guest_offset));
-QCowL2Meta allocation = {
+unsigned cow_start_length = offset_into_cluster(s, guest_offset);
+QCowL2Meta allocation;
+guest_offset = start_of_cluster(s, guest_offset);
+allocation = (QCowL2Meta) {
 .offset   = guest_offset,
 .alloc_offset = host_offset,
 .nb_clusters  = nb_clusters,
+.cow_start= {
+.offset   = 0,
+.nb_bytes = cow_start_length,
+},
+.cow_end  = {
+.offset   = nb_clusters << s->cluster_bits,
+.nb_bytes = 0,
+},
 };
 qemu_co_queue_init(&allocation.dependent_requests);
 
diff --git a/tests/qemu-iotests/125 b/tests/qemu-iotests/125
index d510984045..5c249b4b23 100755
--- a/tests/qemu-iotests/125
+++ b/tests/qemu-iotests/125
@@ -164,6 +164,27 @@ for GROWTH_SIZE in 16 48 80; do
 done
 done
 
+# Test image resizing using preallocation and unaligned offsets
+$QEMU_IMG create -f raw "$TEST_IMG.base" 128k | _filter_img_create
+$QEMU_IO -c 'write -q -P 1 0 128k' -f raw "$TEST_IMG.base"
+for orig_size in 31k 33k; do
+echo "--- Resizing image from $orig_size to 96k ---"
+_make_test_img -F raw -b "$TEST_IMG.base" -o cluster_size=64k "$orig_size"
+$QEMU_IMG resize -f "$IMGFMT" --preallocation=full "$TEST_IMG" 96k
+# The first part of the image should contain data from the backing file
+$QEMU_IO -c "read -q -P 1 0 ${orig_size}" "$TEST_IMG"
+# The resized part of the image should contain zeroes
+$QEMU_IO -c "read -q -P 0 ${orig_size} 63k" "$TEST_IMG"
+# The resized image should have 7 clusters:
+# header, L1 table, L2 table, refcount table, refcount block, 2 data clusters
+expected_file_length=$((65536 * 7))
+file_length=$(stat -c '%s' "$TEST_IMG_FILE")
+if [ "$file_length" != "$expected_file_length" ]; then
+echo "ERROR: file length $file_length (expected $expected_file_length)"
+fi
+echo
+done
+
 # success, all done
 echo '*** done'
 rm -f $seq.full
diff --git a/tests/qemu-iotests/125.out b/tests/qemu-iotests/125.out
index 596905f533..7f76f7af20 100644
--- a/tests/qemu-iotests/125.out
+++ b/tests/qemu-iotests/125.out
@@ -767,4 +767,13 @@ wrote 2048000/2048000 bytes at offset 0
 wrote 81920/81920 bytes at offset 2048000
 80 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=131072
+--- Resizing image from 31k to 96k ---
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=31744 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Image resized.
+
+--- Resizing image from 33k to 96k ---
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33792 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Image resized.
+
 *** done
-- 
2.20.1

Re: [PATCH V2] virtio-pci: fix queue_enable write

2020-06-10 Thread Stefano Garzarella

On Wed, Jun 10, 2020 at 05:42:54AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jun 10, 2020 at 10:57:26AM +0200, Stefano Garzarella wrote:
> > On Wed, Jun 10, 2020 at 01:43:51PM +0800, Jason Wang wrote:
> > > Spec said: The driver uses this to selectively prevent the device from
> > > executing requests from this virtqueue. 1 - enabled; 0 - disabled.
> > > 
> > > Though write 0 to queue_enable is forbidden by the spec, we should not
> > > assume that the value is 1.
> > > 
> > > Fix this by ignoring any written value other than 1.
> > > 
> > > Signed-off-by: Jason Wang 
> > > ---
> > > Changes from V1:
> > > - fix typo
> > > - warn wrong value through virtio_error
> > > ---
> > >  hw/virtio/virtio-pci.c | 12 
> > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index d028c17c24..7bc8c1c056 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -1273,16 +1273,20 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr,
> > >  virtio_queue_set_vector(vdev, vdev->queue_sel, val);
> > >  break;
> > >  case VIRTIO_PCI_COMMON_Q_ENABLE:
> > > -virtio_queue_set_num(vdev, vdev->queue_sel,
> > > - proxy->vqs[vdev->queue_sel].num);
> > > -virtio_queue_set_rings(vdev, vdev->queue_sel,
> > > +if (val == 1) {
> > 
> > Does it have to be 1 or can it be any value other than 0?
> > 
> > Thanks,
> > Stefano
> 
> spec says 1

I was confused by "The driver MUST NOT write a 0 to queue_enable.",
interpreting it as "can write anything other than 0".

But as Jason also wrote in the commit message, the driver should write
1 to enable, so

Reviewed-by: Stefano Garzarella 

Thanks,
Stefano
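
For readers following the thread, the behaviour agreed on above (enable only
on the exact value 1, report anything else) can be sketched roughly as
follows. This is a simplified illustration, not the code from the patch:
DemoQueue and demo_write_queue_enable() are hypothetical names, and the
error flag merely stands in for the virtio_error() warning mentioned in the
V2 changelog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified queue state; not QEMU's VirtIOPCIProxy. */
typedef struct DemoQueue {
    bool enabled;
    bool error;   /* stands in for a warning raised via virtio_error() */
} DemoQueue;

/* Enable the queue only for the exact value 1; flag any other value. */
static void demo_write_queue_enable(DemoQueue *q, uint64_t val)
{
    if (val == 1) {
        q->enabled = true;
    } else {
        q->error = true;
    }
}
```

With this shape a write of 2 (or 0) leaves the queue disabled and is
reported, instead of being silently treated as "enable".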

Re: [PATCH] MAINTAINERS: Volunteer for maintaining the Renesas hardware

2020-06-10 Thread Yoshinori Sato

On Mon, 08 Jun 2020 17:28:41 +0900,
Philippe Mathieu-Daudé wrote:
> 
> Hi Aurelien,
> 
> On 6/1/20 11:41 PM, Aurelien Jarno wrote:
> > On 2020-06-01 11:20, Philippe Mathieu-Daudé wrote:
> >> I don't have much clue about the Renesas hardware, but at least
> >> I know now the source files a little bit, so I volunteer to pick
> >> up patches and send pull-requests for them during my scarce
> >> hobbyist time, until someone else with more knowledge steps up
> >> to do this job instead.
> >>
> >> Suggested-by: Alex Bennée 
> >> Signed-off-by: Philippe Mathieu-Daudé 
> >> ---
> >>  MAINTAINERS | 15 +--
> >>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> >> index 0944d9c731..cbba3ac757 100644
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -298,9 +298,7 @@ SH4 TCG CPUs
> >>  M: Aurelien Jarno 
> >>  S: Odd Fixes
> >>  F: target/sh4/
> >> -F: hw/sh4/
> >>  F: disas/sh4.c
> >> -F: include/hw/sh4/
> >>  
> >>  SPARC TCG CPUs
> >>  M: Mark Cave-Ayland 
> >> @@ -1238,6 +1236,18 @@ F: pc-bios/canyonlands.dt[sb]
> >>  F: pc-bios/u-boot-sam460ex-20100605.bin
> >>  F: roms/u-boot-sam460ex
> >>  
> >> +Renesas Hardware
> >> +
> >> +SH4 Hardware
> >> +M: Aurelien Jarno 
> >> +M: Philippe Mathieu-Daudé 
> > 
> > That's fine for me, and just to be clear I don't mind being demoted to a
> > reviewer or even removed from there. I do not really have time to work
> > on that.
> 
> I understand. Aleksandar implicitly NAcked my patch, as there is
> another contributor who is a better candidate, Yoshinori Sato.
> He recently posted Renesas series:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg708102.html
> 
> I'll simply send a patch to make this hardware orphan, and
> Yoshinori can maintain it if he has the time and is willing to.
> 
> > 
> >> +S: Odd Fixes
> >> +F: hw/sh4/
> >> +F: hw/char/sh_serial.c
> >> +F: hw/intc/sh_intc.c
> >> +F: hw/timer/sh_timer.c
> >> +F: include/hw/sh4/
> >> +
> >>  SH4 Machines
> >>  
> >>  R2D
> >> @@ -1246,6 +1256,7 @@ S: Maintained
> >>  F: hw/sh4/r2d.c
> >>  F: hw/intc/sh_intc.c
> >>  F: hw/timer/sh_timer.c
> >> +F: include/hw/sh4/sh_intc.h
> >>  
> >>  Shix
> >>  M: Magnus Damm 
> > 
> > Acked-by: Aurelien Jarno 
> > 
> 

I agree with the proposal.
Acked-by: Yosinori Sato 

-- 
Yosinori Sato

Re: [PATCH v2 2/8] MAINTAINERS: Mark SH4 based R2D & Shix machines orphan

2020-06-10 Thread Yoshinori Sato

On Tue, 09 Jun 2020 18:12:42 +0900,
Philippe Mathieu-Daudé wrote:
> 
> Hi Magnus,
> 
> On 6/9/20 10:59 AM, Magnus Damm wrote:
> > Hi Markus and Thomas,
> > 
> > On Tue, Jun 9, 2020 at 5:41 PM Markus Armbruster  wrote:
> >>
> >> Thomas Huth  writes:
> >>
> >>> On 08/06/2020 11.01, Philippe Mathieu-Daudé wrote:
> >>>> Last commit from Magnus Damm is fc8e320ef583, which date is
> >>>> Fri Nov 13 2009.  As nobody else seems to care about the patches
> >>>> posted [*] related to the R2D and Shix machines, mark them orphan.
> >>>>
> >>>> Many thanks to Magnus for his substantial contributions to QEMU,
> >>>> and for introducing these SH4 based machine!
> >>
> >> s/machine/machines/
> >>
> >>>>
> >>>> [*] https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg08519.html
> >>>>
> >>>> Cc: Magnus Damm 
> >>>> Signed-off-by: Philippe Mathieu-Daudé 
> >>>> ---
> >>>>  MAINTAINERS | 5 +++--
> >>>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/MAINTAINERS b/MAINTAINERS
> >>>> index 49d90c70de..a012d9b74e 100644
> >>>> --- a/MAINTAINERS
> >>>> +++ b/MAINTAINERS
> >>>> @@ -1250,14 +1250,15 @@ SH4 Machines
> >>>>  
> >>>>  R2D
> >>>>  M: Magnus Damm 
> >>>> -S: Maintained
> >>>> +S: Orphan
> >>>>  F: hw/sh4/r2d.c
> >>>>  F: hw/intc/sh_intc.c
> >>>>  F: hw/timer/sh_timer.c
> >>>> +F: include/hw/sh4/sh_intc.h
> >>>>
> >>>>  Shix
> >>>>  M: Magnus Damm 
> >>>> -S: Odd Fixes
> >>>> +S: Orphan
> >>>>  F: hw/sh4/shix.c
> >>>
> >>> Having both, an "M:" entry and "S: Orphan" in a section sounds weird.
> >>> Magnus, are you still interested in these sections? If not, I think the
> >>> "M:" line should be removed...?
> >>
> >> Concur.  Of course, let's give Magnus a chance to chime in.
> > 
> > Thanks guys! I'm interested but don't have so much time available to
> > commit to this I'm afraid. In particular I'm keen on trying to keep
> > R2D around since I happen to have a physical machine setup in my
> > remote access rack. SH4 with FPU used to have alright gcc + binutils
> > toolchain and glibc support once while other SuperH SoCs lacked some
> > portions. So keeping SH4 (sh775x) around would be nice IMO.
> 
> Great news!
> 
> FYI Yoshinori Sato did a great job on updating the Renesas
> hardware, see:
> https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg08584.html
> 
> He might be able to help with the UART/TIMER peripherals used by the
> R2D, see a suggestion to add a 'Renesas hardware' entry:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg708478.html
> 
> If Yoshinori accepts the suggestion to add a Renesas hardware entry, do
> you agree to be listed there too? Maybe with a 'R:' tag for
> designated reviewer instead of maintainer.
> 
> So I'll respin this series with these changes:
> 
> R2D: S: 'Maintained' -> 'Odd Fixes'
> 
> So contributors don't wait for you to take the patches, and they can go
> via qemu-trivial.
> 
> And Shix -> No maintainer, S: 'Obsolete'.
> 
> The TCG backend stays orphan.
> 
> Regards,
> 
> Phil.
> 
> > 
> > Cheers,
> > 
> > / magnus
> > 
> 

OK.
I also need sh4, so I will perform maintenance.

-- 
Yosinori Sato

[PATCH v7 2/7] block/io: refactor coroutine wrappers

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

Most of our coroutine wrappers already follow this convention:

We have 'coroutine_fn bdrv_co_NAME(ARGS)' as the core function, and a
wrapper 'bdrv_NAME(ARGS)' which packs the parameters and calls
bdrv_run_co().

The only outsiders are the bdrv_prwv_co and
bdrv_common_block_status_above wrappers. Let's refactor them to behave
like the others; this simplifies further conversion of coroutine wrappers.

This patch adds an indirection layer, but that will be compensated for by
a later commit, which will drop bdrv_co_prwv together with the is_write
logic, to keep the read and write paths separate.
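
The convention described above can be sketched, outside of QEMU, roughly as
follows. Everything here is hypothetical and simplified: demo_co_add()
stands in for a coroutine_fn core function, and demo_run_co() stands in for
bdrv_run_co() by calling the entry point directly instead of entering a
coroutine and polling; the real signature of bdrv_run_co() is not shown in
this patch and is assumed.

```c
/* Hypothetical core function; in QEMU this would be a coroutine_fn. */
static int demo_co_add(int a, int b)
{
    return a + b;
}

/* Parameters of the core function, packed for the entry point. */
typedef struct DemoAddCo {
    int a;
    int b;
} DemoAddCo;

/* Entry point: unpack the struct and call the core function. */
static int demo_add_co_entry(void *opaque)
{
    DemoAddCo *s = opaque;

    return demo_co_add(s->a, s->b);
}

/* Assumed stand-in for bdrv_run_co(): just calls the entry directly. */
static int demo_run_co(int (*entry)(void *), void *opaque)
{
    return entry(opaque);
}

/* Wrapper: pack the parameters and hand them to demo_run_co(). */
static int demo_add(int a, int b)
{
    DemoAddCo s = { .a = a, .b = b };

    return demo_run_co(demo_add_co_entry, &s);
}
```

Every wrapper then differs only in the struct fields and the core function
it calls, which is the boilerplate the rest of the series generates.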

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/io.c | 60 +-
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/block/io.c b/block/io.c
index df8f2a98d4..af6cb839bf 100644
--- a/block/io.c
+++ b/block/io.c
@@ -933,27 +933,31 @@ typedef struct RwCo {
 BdrvRequestFlags flags;
 } RwCo;
 
+static int coroutine_fn bdrv_co_prwv(BdrvChild *child, int64_t offset,
+ QEMUIOVector *qiov, bool is_write,
+ BdrvRequestFlags flags)
+{
+if (is_write) {
+return bdrv_co_pwritev(child, offset, qiov->size, qiov, flags);
+} else {
+return bdrv_co_preadv(child, offset, qiov->size, qiov, flags);
+}
+}
+
 static int coroutine_fn bdrv_rw_co_entry(void *opaque)
 {
 RwCo *rwco = opaque;
 
-if (!rwco->is_write) {
-return bdrv_co_preadv(rwco->child, rwco->offset,
-  rwco->qiov->size, rwco->qiov,
-  rwco->flags);
-} else {
-return bdrv_co_pwritev(rwco->child, rwco->offset,
-   rwco->qiov->size, rwco->qiov,
-   rwco->flags);
-}
+return bdrv_co_prwv(rwco->child, rwco->offset, rwco->qiov,
+rwco->is_write, rwco->flags);
 }
 
 /*
  * Process a vectored synchronous request using coroutines
  */
-static int bdrv_prwv_co(BdrvChild *child, int64_t offset,
-QEMUIOVector *qiov, bool is_write,
-BdrvRequestFlags flags)
+static int bdrv_prwv(BdrvChild *child, int64_t offset,
+ QEMUIOVector *qiov, bool is_write,
+ BdrvRequestFlags flags)
 {
 RwCo rwco = {
 .child = child,
@@ -971,8 +975,7 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
 {
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, bytes);
 
-return bdrv_prwv_co(child, offset, &qiov, true,
-BDRV_REQ_ZERO_WRITE | flags);
+return bdrv_prwv(child, offset, &qiov, true, BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /*
@@ -1021,7 +1024,7 @@ int bdrv_preadv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
 {
 int ret;
 
-ret = bdrv_prwv_co(child, offset, qiov, false, 0);
+ret = bdrv_prwv(child, offset, qiov, false, 0);
 if (ret < 0) {
 return ret;
 }
@@ -1045,7 +1048,7 @@ int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
 {
 int ret;
 
-ret = bdrv_prwv_co(child, offset, qiov, true, 0);
+ret = bdrv_prwv(child, offset, qiov, true, 0);
 if (ret < 0) {
 return ret;
 }
@@ -2463,14 +2466,15 @@ early_out:
 return ret;
 }
 
-static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
-   BlockDriverState *base,
-   bool want_zero,
-   int64_t offset,
-   int64_t bytes,
-   int64_t *pnum,
-   int64_t *map,
-   BlockDriverState **file)
+static int coroutine_fn
+bdrv_co_common_block_status_above(BlockDriverState *bs,
+  BlockDriverState *base,
+  bool want_zero,
+  int64_t offset,
+  int64_t bytes,
+  int64_t *pnum,
+  int64_t *map,
+  BlockDriverState **file)
 {
 BlockDriverState *p;
 int ret = 0;
@@ -2508,10 +2512,10 @@ static int coroutine_fn bdrv_block_status_above_co_entry(void *opaque)
 {
 BdrvCoBlockStatusData *data = opaque;
 
-return bdrv_co_block_status_above(data->bs, data->base,
-  data->want_zero,
-  data->offset, data->bytes,
-  data->pnum, data->map, data->file);
+return bdrv_co_common_block_status_above(data->bs, data->base,
+ data->want_zero,
+ data->offset, data->bytes,
+

[PATCH v7 0/7] coroutines: generate wrapper code

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

Hi all!

The aim of the series is to reduce code duplication and hand-written
parameter structure-packing around coroutine function wrappers.

Benefits:
 - no code duplication
 - less indirection

v7: apply Eric's suggestions
02: fix grammar in commit msg, add Eric's r-b
04: - don't create separate header for generated_co_wrapper thing
- inline aio_wait_kick() call
- use json.dumps to make style for clang-format
05: - drop "#include "block/generated-co-wrapper.h" (since the header is removed)
- add Eric's r-b

Vladimir Sementsov-Ogievskiy (7):
  block: return error-code from bdrv_invalidate_cache
  block/io: refactor coroutine wrappers
  block: declare some coroutine functions in block/coroutines.h
  scripts: add coroutine-wrapper.py
  block: generate coroutine-wrapper code
  block: drop bdrv_prwv
  block/io: refactor save/load vmstate

 Makefile |   8 +
 block/block-gen.h|  49 +
 block/coroutines.h   |  65 +++
 include/block/block.h|  31 ++--
 block.c  |  97 ++
 block/io.c   | 336 +--
 tests/test-bdrv-drain.c  |   2 +-
 block/Makefile.objs  |   1 +
 scripts/coroutine-wrapper.py | 180 +++
 9 files changed, 388 insertions(+), 381 deletions(-)
 create mode 100644 block/block-gen.h
 create mode 100644 block/coroutines.h
 create mode 100755 scripts/coroutine-wrapper.py

-- 
2.21.0

[PATCH v7 5/7] block: generate coroutine-wrapper code

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

Use the code generation implemented in the previous commit to generate
the coroutine wrappers in block.c and block/io.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h|   6 +-
 include/block/block.h |  16 ++--
 block.c   |  73 ---
 block/io.c| 212 --
 4 files changed, 13 insertions(+), 294 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index 9ce1730a09..c62b3a2697 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -34,7 +34,7 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
 int coroutine_fn
 bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
  bool is_write, BdrvRequestFlags flags);
-int
+int generated_co_wrapper
 bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
   bool is_write, BdrvRequestFlags flags);
 
@@ -47,7 +47,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
   int64_t *pnum,
   int64_t *map,
   BlockDriverState **file);
-int
+int generated_co_wrapper
 bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool want_zero,
@@ -60,7 +60,7 @@ bdrv_common_block_status_above(BlockDriverState *bs,
 int coroutine_fn
 bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
bool is_read);
-int
+int generated_co_wrapper
 bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
 bool is_read);
 
diff --git a/include/block/block.h b/include/block/block.h
index 59a002e06f..9f94c59057 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -405,8 +405,9 @@ void bdrv_refresh_filename(BlockDriverState *bs);
 int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
   PreallocMode prealloc, BdrvRequestFlags flags,
   Error **errp);
-int bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
-  PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
+int generated_co_wrapper
+bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
+  PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
 
 int64_t bdrv_nb_sectors(BlockDriverState *bs);
 int64_t bdrv_getlength(BlockDriverState *bs);
@@ -448,7 +449,8 @@ typedef enum {
 BDRV_FIX_ERRORS   = 2,
 } BdrvCheckMode;
 
-int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
+int generated_co_wrapper bdrv_check(BlockDriverState *bs, BdrvCheckResult *res,
+BdrvCheckMode fix);
 
 /* The units of offset and total_work_size may be chosen arbitrarily by the
  * block driver; total_work_size may change during the course of the amendment
@@ -471,12 +473,13 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
 
 /* Invalidate any cached metadata used by image formats */
-int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
+int generated_co_wrapper bdrv_invalidate_cache(BlockDriverState *bs,
+   Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
 /* Ensure contents are flushed to disk.  */
-int bdrv_flush(BlockDriverState *bs);
+int generated_co_wrapper bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 int bdrv_flush_all(void);
 void bdrv_close_all(void);
@@ -491,7 +494,8 @@ void bdrv_drain_all(void);
 AIO_WAIT_WHILE(bdrv_get_aio_context(bs_),  \
cond); })
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+int generated_co_wrapper bdrv_pdiscard(BdrvChild *child, int64_t offset,
+   int64_t bytes);
 int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
diff --git a/block.c b/block.c
index 2ca9267729..3046696f30 100644
--- a/block.c
+++ b/block.c
@@ -4640,43 +4640,6 @@ int coroutine_fn bdrv_co_check(BlockDriverState *bs,
 return bs->drv->bdrv_co_check(bs, res, fix);
 }
 
-typedef struct CheckCo {
-BlockDriverState *bs;
-BdrvCheckResult *res;
-BdrvCheckMode fix;
-int ret;
-} CheckCo;
-
-static void coroutine_fn bdrv_check_co_entry(void *opaque)
-{
-CheckCo *cco = opaque;
-cco->ret = bdrv_co_check(cco->bs, cco->res, cco->fix);
-aio_wait_kick();
-}
-
-int bdrv_check(BlockDriverState *bs,
-   BdrvCheckResult *res, BdrvCheckMode fix)
-{
-Coroutine *co;
-CheckCo cco = {
-.bs = bs,
-.res = res,
-.ret = -EINPROGRESS,
-.fix =

[PATCH v7 6/7] block: drop bdrv_prwv

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

Now that we are not maintaining boilerplate code for coroutine
wrappers, there is no longer any point in keeping the extra indirection
layer of bdrv_prwv().  Let's drop it and instead generate pure
bdrv_preadv() and bdrv_pwritev().

Currently, bdrv_pwritev() and bdrv_preadv() return bytes on success;
the auto-generated functions will instead return zero, like their
_co_ prototypes. Still, it's simple to make the conversion safe: the
only external user of bdrv_pwritev() is test-bdrv-drain, and it is
comfortable enough with bdrv_co_pwritev() instead. So the prototypes are
moved to the local block/coroutines.h. Next, the only internal users are
bdrv_pread() and bdrv_pwrite(), which are modified to return bytes on
success.

Of course, it would be great to convert bdrv_pread() and bdrv_pwrite()
to return 0 on success. But this requires an audit (and probably
conversion) of all their users, so let's leave that refactoring for
another day.
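
The return-value adaptation described above boils down to the following
pattern. This is a standalone sketch with hypothetical names:
demo_preadv() stands in for a generated wrapper that returns 0 or -errno
(its failure condition is arbitrary), and demo_pread() keeps the
"bytes on success" contract for its callers.

```c
#include <errno.h>

/* Hypothetical stand-in for a generated wrapper: 0 on success, -errno on error. */
static int demo_preadv(int bytes)
{
    return bytes > 4096 ? -EIO : 0;   /* arbitrary failure rule for the demo */
}

/* Caller-facing helper: still returns the byte count on success. */
static int demo_pread(int bytes)
{
    int ret;

    if (bytes < 0) {
        return -EINVAL;
    }

    ret = demo_preadv(bytes);

    return ret < 0 ? ret : bytes;
}
```

Callers that test for "ret < 0" or compare against the requested length
keep working unchanged, which is why no audit is needed for this step.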

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h  | 10 -
 include/block/block.h   |  2 --
 block/io.c  | 49 -
 tests/test-bdrv-drain.c |  2 +-
 4 files changed, 15 insertions(+), 48 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index c62b3a2697..6c63a819c9 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -31,12 +31,12 @@ int coroutine_fn bdrv_co_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix);
 int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
 
-int coroutine_fn
-bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
- bool is_write, BdrvRequestFlags flags);
 int generated_co_wrapper
-bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
-  bool is_write, BdrvRequestFlags flags);
+bdrv_preadv(BdrvChild *child, int64_t offset, unsigned int bytes,
+QEMUIOVector *qiov, BdrvRequestFlags flags);
+int generated_co_wrapper
+bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags);
 
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index 9f94c59057..280cf2a7d5 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -385,9 +385,7 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int bytes, BdrvRequestFlags flags);
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
 int bdrv_pread(BdrvChild *child, int64_t offset, void *buf, int bytes);
-int bdrv_preadv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
 int bdrv_pwrite(BdrvChild *child, int64_t offset, const void *buf, int bytes);
-int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
 int bdrv_pwrite_sync(BdrvChild *child, int64_t offset,
  const void *buf, int count);
 /*
diff --git a/block/io.c b/block/io.c
index 36fbf9e1fa..3060c7e6ed 100644
--- a/block/io.c
+++ b/block/io.c
@@ -890,23 +890,11 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
 return 0;
 }
 
-int coroutine_fn bdrv_co_prwv(BdrvChild *child, int64_t offset,
-  QEMUIOVector *qiov, bool is_write,
-  BdrvRequestFlags flags)
-{
-if (is_write) {
-return bdrv_co_pwritev(child, offset, qiov->size, qiov, flags);
-} else {
-return bdrv_co_preadv(child, offset, qiov->size, qiov, flags);
-}
-}
-
 int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int bytes, BdrvRequestFlags flags)
 {
-QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, bytes);
-
-return bdrv_prwv(child, offset, &qiov, true, BDRV_REQ_ZERO_WRITE | flags);
+return bdrv_pwritev(child, offset, bytes, NULL,
+BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /*
@@ -950,41 +938,19 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags)
 }
 }
 
-/* return < 0 if error. See bdrv_pwrite() for the return codes */
-int bdrv_preadv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
-{
-int ret;
-
-ret = bdrv_prwv(child, offset, qiov, false, 0);
-if (ret < 0) {
-return ret;
-}
-
-return qiov->size;
-}
-
 /* See bdrv_pwrite() for the return codes */
 int bdrv_pread(BdrvChild *child, int64_t offset, void *buf, int bytes)
 {
+int ret;
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, bytes);
 
 if (bytes < 0) {
 return -EINVAL;
 }
 
-return bdrv_preadv(child, offset, &qiov);
-}
-
-int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
-{
-int ret;
+ret = bdrv_preadv(child, offset, bytes, &qiov,  0);
 
-ret = bdrv_prwv(child, offset, qiov, true, 0);
-if (ret < 0) {
-return ret;
-}
-
-return qiov->size;
+return ret < 0 ? ret : bytes;
 }
 
 /* Return

[PATCH v7 1/7] block: return error-code from bdrv_invalidate_cache

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

This is the only coroutine wrapper from block.c and block/io.c which
doesn't return a value, so let's convert it to the common behavior, to
simplify moving to generated coroutine wrappers in a later commit.

Also, bdrv_invalidate_cache is a void function, returning errors only
through the **errp parameter, which is considered bad practice, as
it forces callers to define and propagate a local_err variable, so the
conversion is good anyway.

This patch leaves the conversion of .bdrv_co_invalidate_cache() driver
callbacks and bdrv_invalidate_cache_all() for another day.
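
To illustrate why the return code helps callers, here is a small
self-contained comparison of the two calling styles. It does not use QEMU's
Error API: DemoError and all demo_* functions are hypothetical, errp is
assumed to be non-NULL, and the plain pointer assignment stands in for
error_propagate().

```c
#include <stddef.h>

/* Hypothetical minimal error type; QEMU's real Error API is richer. */
typedef struct DemoError { const char *msg; } DemoError;

static DemoError demo_failure = { "demo failure" };

/* Old style: void function, failure visible only through *errp. */
static void demo_invalidate_void(int fail, DemoError **errp)
{
    if (fail) {
        *errp = &demo_failure;
    }
}

/* New style: same errp reporting, plus a negative return value. */
static int demo_invalidate(int fail, DemoError **errp)
{
    if (fail) {
        *errp = &demo_failure;
        return -1;
    }
    return 0;
}

/* Caller of the old style needs a local_err and must propagate it. */
static int demo_caller_old(int fail, DemoError **errp)
{
    DemoError *local_err = NULL;

    demo_invalidate_void(fail, &local_err);
    if (local_err) {
        *errp = local_err;   /* stands in for error_propagate() */
        return -1;
    }
    return 0;
}

/* Caller of the new style passes errp straight through. */
static int demo_caller_new(int fail, DemoError **errp)
{
    return demo_invalidate(fail, errp) < 0 ? -1 : 0;
}
```

The bdrv_invalidate_cache_all() hunk in the diff below makes the same kind
of change: the local_err variable goes away and the loop checks ret < 0.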

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 include/block/block.h |  2 +-
 block.c   | 32 ++--
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 25e299605e..46965a7780 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -464,7 +464,7 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
 
 /* Invalidate any cached metadata used by image formats */
-void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
+int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
diff --git a/block.c b/block.c
index 8416376c9b..b01551f21c 100644
--- a/block.c
+++ b/block.c
@@ -5643,8 +5643,8 @@ void bdrv_init_with_whitelist(void)
 bdrv_init();
 }
 
-static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
-  Error **errp)
+static int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
+ Error **errp)
 {
 BdrvChild *child, *parent;
 uint64_t perm, shared_perm;
@@ -5653,14 +5653,14 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
 BdrvDirtyBitmap *bm;
 
 if (!bs->drv)  {
-return;
+return -ENOMEDIUM;
 }
 
 QLIST_FOREACH(child, &bs->children, next) {
 bdrv_co_invalidate_cache(child->bs, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 
@@ -5684,7 +5684,7 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (ret < 0) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return ret;
 }
 bdrv_set_perm(bs, perm, shared_perm);
 
@@ -5693,7 +5693,7 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (local_err) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 
@@ -5705,7 +5705,7 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (ret < 0) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_setg_errno(errp, -ret, "Could not refresh total sector count");
-return;
+return ret;
 }
 }
 
@@ -5715,27 +5715,30 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (local_err) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 }
+
+return 0;
 }
 
 typedef struct InvalidateCacheCo {
 BlockDriverState *bs;
 Error **errp;
 bool done;
+int ret;
 } InvalidateCacheCo;
 
 static void coroutine_fn bdrv_invalidate_cache_co_entry(void *opaque)
 {
 InvalidateCacheCo *ico = opaque;
-bdrv_co_invalidate_cache(ico->bs, ico->errp);
+ico->ret = bdrv_co_invalidate_cache(ico->bs, ico->errp);
 ico->done = true;
 aio_wait_kick();
 }
 
-void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
+int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
 {
 Coroutine *co;
 InvalidateCacheCo ico = {
@@ -5752,22 +5755,23 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
 bdrv_coroutine_enter(bs, co);
 BDRV_POLL_WHILE(bs, !ico.done);
 }
+
+return ico.ret;
 }
 
 void bdrv_invalidate_cache_all(Error **errp)
 {
 BlockDriverState *bs;
-Error *local_err = NULL;
 BdrvNextIterator it;
 
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *aio_context = bdrv_get_aio_context(bs);
+int ret;
 
 aio_context_acquire(aio_context);
-bdrv_invalidate_cache(bs, &local_err);
+ret = bdrv_invalidate_cache(bs, errp);
 aio_context_release(aio_context);
-if (local_err) {
-error_propagate(errp, local_err);
+if (ret < 0) {
 bdrv_next_cleanup(&it);

[PATCH v7 3/7] block: declare some coroutine functions in block/coroutines.h

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

We are going to keep the coroutine-wrapper code (structure-packing
parameters, BDRV_POLL wrapper functions) in separate auto-generated
files. So, we'll need a header with declarations of the original _co_
functions, for those which are static now. We'll also need
declarations for the wrapper functions. Add these declarations now, as a
preparation step.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h | 67 ++
 block.c|  8 +++---
 block/io.c | 34 +++
 3 files changed, 88 insertions(+), 21 deletions(-)
 create mode 100644 block/coroutines.h

diff --git a/block/coroutines.h b/block/coroutines.h
new file mode 100644
index 00..9ce1730a09
--- /dev/null
+++ b/block/coroutines.h
@@ -0,0 +1,67 @@
+/*
+ * Block layer I/O functions
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_COROUTINES_INT_H
+#define BLOCK_COROUTINES_INT_H
+
+#include "block/block_int.h"
+
+int coroutine_fn bdrv_co_check(BlockDriverState *bs,
+   BdrvCheckResult *res, BdrvCheckMode fix);
+int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
+
+int coroutine_fn
+bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
+ bool is_write, BdrvRequestFlags flags);
+int
+bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
+  bool is_write, BdrvRequestFlags flags);
+
+int coroutine_fn
+bdrv_co_common_block_status_above(BlockDriverState *bs,
+  BlockDriverState *base,
+  bool want_zero,
+  int64_t offset,
+  int64_t bytes,
+  int64_t *pnum,
+  int64_t *map,
+  BlockDriverState **file);
+int
+bdrv_common_block_status_above(BlockDriverState *bs,
+   BlockDriverState *base,
+   bool want_zero,
+   int64_t offset,
+   int64_t bytes,
+   int64_t *pnum,
+   int64_t *map,
+   BlockDriverState **file);
+
+int coroutine_fn
+bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
+   bool is_read);
+int
+bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
+bool is_read);
+
+#endif /* BLOCK_COROUTINES_INT_H */
diff --git a/block.c b/block.c
index b01551f21c..2ca9267729 100644
--- a/block.c
+++ b/block.c
@@ -48,6 +48,7 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
 #include 
@@ -4625,8 +4626,8 @@ static void bdrv_delete(BlockDriverState *bs)
  * free of errors) or -errno when an internal error occurred. The results of the
  * check are stored in res.
  */
-static int coroutine_fn bdrv_co_check(BlockDriverState *bs,
-  BdrvCheckResult *res, BdrvCheckMode fix)
+int coroutine_fn bdrv_co_check(BlockDriverState *bs,
+   BdrvCheckResult *res, BdrvCheckMode fix)
 {
 if (bs->drv == NULL) {
 return -ENOMEDIUM;
@@ -5643,8 +5644,7 @@ void bdrv_init_with_whitelist(void)
 bdrv_init();
 }
 
-static int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
- Error **errp)
+int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
 {
 BdrvChild *child, *parent;
 uint64_t perm, shared_perm;
diff --git a/block/io.c b/block/io.c
index af6cb839bf..deb9ca8d82 100644
--- a/block/io.c
+++ b/block/io.c
@@ -29,6 +29,7 @@
 #include "block/blockjob.h"
 #include "b

[PATCH v7 7/7] block/io: refactor save/load vmstate

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

As for read/write in a previous commit, drop the extra indirection layer
and generate bdrv_readv_vmstate() and bdrv_writev_vmstate() directly.
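
The control flow shared by the two resulting functions can be modelled in
a few lines outside QEMU. This is only a sketch: the Node and Driver
classes are hypothetical stand-ins for BlockDriverState and BlockDriver,
and the numeric errno fallbacks are assumptions for platforms whose errno
module lacks these names.

```python
import errno

# Fallback numbers are the Linux values; an assumption for other platforms.
ENOMEDIUM = getattr(errno, 'ENOMEDIUM', 123)
ENOTSUP = getattr(errno, 'ENOTSUP', 95)


class Driver:
    """Hypothetical stand-in for BlockDriver; the callback is optional."""
    def __init__(self, load_vmstate=None):
        self.load_vmstate = load_vmstate


class Node:
    """Hypothetical stand-in for BlockDriverState."""
    def __init__(self, drv=None, file=None):
        self.drv = drv
        self.file = file        # child node, playing the role of bs->file->bs
        self.in_flight = 0


def readv_vmstate(bs, buf, pos):
    """Use the driver callback if present, else recurse into the file child."""
    if bs.drv is None:
        return -ENOMEDIUM       # early return, before touching in_flight
    bs.in_flight += 1
    try:
        if bs.drv.load_vmstate:
            return bs.drv.load_vmstate(bs, buf, pos)
        if bs.file:
            return readv_vmstate(bs.file, buf, pos)
        return -ENOTSUP
    finally:
        bs.in_flight -= 1       # always balanced, like bdrv_dec_in_flight()


protocol = Node(Driver(load_vmstate=lambda bs, buf, pos: len(buf)))
fmt = Node(Driver(), file=protocol)     # format driver without the callback
print(readv_vmstate(fmt, bytearray(16), 0))
print(readv_vmstate(Node(), bytearray(16), 0) == -ENOMEDIUM)
```

The write path is symmetric, with the save callback taking the place of
the load one.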

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h| 10 +++
 include/block/block.h |  6 ++--
 block/io.c| 67 ++-
 3 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index 6c63a819c9..f69179f5ef 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -57,11 +57,9 @@ bdrv_common_block_status_above(BlockDriverState *bs,
int64_t *map,
BlockDriverState **file);
 
-int coroutine_fn
-bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-   bool is_read);
-int generated_co_wrapper
-bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-bool is_read);
+int coroutine_fn bdrv_co_readv_vmstate(BlockDriverState *bs,
+   QEMUIOVector *qiov, int64_t pos);
+int coroutine_fn bdrv_co_writev_vmstate(BlockDriverState *bs,
+QEMUIOVector *qiov, int64_t pos);
 
 #endif /* BLOCK_COROUTINES_INT_H */
diff --git a/include/block/block.h b/include/block/block.h
index 280cf2a7d5..7826c3fe27 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -574,8 +574,10 @@ int path_has_protocol(const char *path);
 int path_is_absolute(const char *path);
 char *path_combine(const char *base_path, const char *filename);
 
-int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
-int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+int generated_co_wrapper
+bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+int generated_co_wrapper
+bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
 int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
   int64_t pos, int size);
 
diff --git a/block/io.c b/block/io.c
index 3060c7e6ed..83ffc7d390 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2489,66 +2489,67 @@ int bdrv_is_allocated_above(BlockDriverState *top,
 }
 
 int coroutine_fn
-bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-   bool is_read)
+bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 {
 BlockDriver *drv = bs->drv;
 int ret = -ENOTSUP;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 bdrv_inc_in_flight(bs);
 
-if (!drv) {
-ret = -ENOMEDIUM;
-} else if (drv->bdrv_load_vmstate) {
-if (is_read) {
-ret = drv->bdrv_load_vmstate(bs, qiov, pos);
-} else {
-ret = drv->bdrv_save_vmstate(bs, qiov, pos);
-}
+if (drv->bdrv_load_vmstate) {
+ret = drv->bdrv_load_vmstate(bs, qiov, pos);
 } else if (bs->file) {
-ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
+ret = bdrv_co_readv_vmstate(bs->file->bs, qiov, pos);
 }
 
 bdrv_dec_in_flight(bs);
+
 return ret;
 }
 
-int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
-  int64_t pos, int size)
+int coroutine_fn
+bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 {
-QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
-int ret;
+BlockDriver *drv = bs->drv;
+int ret = -ENOTSUP;
 
-ret = bdrv_writev_vmstate(bs, &qiov, pos);
-if (ret < 0) {
-return ret;
+if (!drv) {
+return -ENOMEDIUM;
 }
 
-return size;
-}
+bdrv_inc_in_flight(bs);
 
-int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
-{
-return bdrv_rw_vmstate(bs, qiov, pos, false);
+if (drv->bdrv_save_vmstate) {
+ret = drv->bdrv_save_vmstate(bs, qiov, pos);
+} else if (bs->file) {
+ret = bdrv_co_writev_vmstate(bs->file->bs, qiov, pos);
+}
+
+bdrv_dec_in_flight(bs);
+
+return ret;
 }
 
-int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
+int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
   int64_t pos, int size)
 {
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
-int ret;
-
-ret = bdrv_readv_vmstate(bs, &qiov, pos);
-if (ret < 0) {
-return ret;
-}
+int ret = bdrv_writev_vmstate(bs, &qiov, pos);
 
-return size;
+return ret < 0 ? ret : size;
 }
 
-int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
+int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
+  int64_t pos, int size)
 {
-return bdrv_rw_vmstate(bs, qiov, pos, true);
+QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
+int ret = bdrv_readv_vmstate(bs, &qiov, pos);
+
+return ret < 0 ? ret : size;
 }
 
 /

[PATCH v7 4/7] scripts: add coroutine-wrapper.py

2020-06-10 Thread Vladimir Sementsov-Ogievskiy

We have a very frequent pattern of creating a coroutine from a function
with several arguments:

  - create structure to pack parameters
  - create _entry function to call original function taking parameters
from struct
  - do different magic to handle completion: set ret to NOT_DONE or
EINPROGRESS or use separate bool field
  - fill the struct and create coroutine from _entry function and this
struct as a parameter
  - do coroutine enter and BDRV_POLL_WHILE loop

Let's reduce code duplication by generating coroutine wrappers.

This patch adds scripts/coroutine-wrapper.py together with some
friends, which will generate functions with declared prototypes marked
by 'generated_co_wrapper' specifier.

The usage of new code generation is as follows:

1. define somewhere

int coroutine_fn bdrv_co_NAME(...) {...}

   function

2. declare in some header file

int generated_co_wrapper bdrv_NAME(...);

   function with same list of parameters. (you'll need to include
   "block/generated-co-wrapper.h" to get the specifier)

3. both declarations should be available through block/coroutines.h
   header.

4. add header with generated_co_wrapper declaration into
   COROUTINE_HEADERS list in Makefile

No function is marked yet; that work is left for the following
commit.
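
As a rough illustration of the steps above, the sketch below mimics the
idea in Python: it scans header text for prototypes carrying the
generated_co_wrapper marker and emits a C wrapper skeleton for each one.
It is not the scripts/coroutine-wrapper.py added by this patch. The
regular expression, the helper names (split_args, decl, gen_wrapper,
generate) and the exact shape of the emitted C are assumptions made for
the example; only BdrvPollCo, bdrv_poll_co() and the bdrv_NAME /
bdrv_co_NAME naming come from the patch itself.

```python
import re

# Prototypes of the form "int generated_co_wrapper bdrv_NAME(...);"
PROTO_RE = re.compile(
    r'int\s+generated_co_wrapper\s+(?P<name>bdrv_\w+)\s*'
    r'\((?P<args>[^)]*)\)\s*;')


def split_args(args):
    """Split a simple, non-empty C parameter list into (type, name) pairs."""
    pairs = []
    for arg in args.split(','):
        m = re.match(r'\s*(?P<type>.*?)(?P<name>\w+)\s*$', arg, re.S)
        pairs.append((m.group('type').strip(), m.group('name')))
    return pairs


def decl(ctype, name):
    """Format 'type name', keeping '*' attached to the name."""
    return ctype + ('' if ctype.endswith('*') else ' ') + name


def gen_wrapper(name, args):
    """Emit a C skeleton: packing struct, _entry function, polling wrapper."""
    args = ' '.join(args.split())          # collapse line breaks in the list
    pairs = split_args(args)
    co_name = name.replace('bdrv_', 'bdrv_co_', 1)
    sname = 'Bdrv' + ''.join(p.capitalize() for p in name.split('_')[1:])
    fields = ''.join(f'    {decl(t, n)};\n' for t, n in pairs)
    init = ''.join(f'            .{n} = {n},\n' for _, n in pairs)
    names = ', '.join(n for _, n in pairs)
    packed = ', '.join(f's->{n}' for _, n in pairs)
    bs = pairs[0][1]   # assumption: first parameter is the BlockDriverState
    return f"""\
typedef struct {sname} {{
    BdrvPollCo poll_state;
{fields}}} {sname};

static void coroutine_fn {name}_entry(void *opaque)
{{
    {sname} *s = opaque;

    s->poll_state.ret = {co_name}({packed});
    s->poll_state.in_progress = false;
    aio_wait_kick();
}}

int {name}({args})
{{
    if (qemu_in_coroutine()) {{
        return {co_name}({names});
    }} else {{
        {sname} s = {{
            .poll_state.bs = {bs},
            .poll_state.in_progress = true,
{init}        }};
        s.poll_state.co = qemu_coroutine_create({name}_entry, &s);
        return bdrv_poll_co(&s.poll_state);
    }}
}}
"""


def generate(header_text):
    """Generate wrappers for every marked prototype found in header_text."""
    return '\n'.join(gen_wrapper(m.group('name'), m.group('args'))
                     for m in PROTO_RE.finditer(header_text))


header = """
int generated_co_wrapper
bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
                      int64_t pos, int size);
"""
print(generate(header))
```

Only the prototype carrying the marker produces output; the unmarked
bdrv_save_vmstate() declaration is ignored, which is the point of using a
do-nothing specifier as the selection mechanism.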

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 Makefile |   8 ++
 block/block-gen.h|  49 ++
 include/block/block.h|   7 ++
 block/Makefile.objs  |   1 +
 scripts/coroutine-wrapper.py | 180 +++
 5 files changed, 245 insertions(+)
 create mode 100644 block/block-gen.h
 create mode 100755 scripts/coroutine-wrapper.py

diff --git a/Makefile b/Makefile
index d1af126ea1..517a29810f 100644
--- a/Makefile
+++ b/Makefile
@@ -175,6 +175,14 @@ generated-files-y += $(TRACE_SOURCES)
 generated-files-y += $(BUILD_DIR)/trace-events-all
 generated-files-y += .git-submodule-status
 
+generated-files-y += block/block-gen.c
+
+COROUTINE_HEADERS = include/block/block.h block/coroutines.h
+block/block-gen.c: $(COROUTINE_HEADERS) scripts/coroutine-wrapper.py
+   $(call quiet-command, \
+   cat $(addprefix $(SRC_PATH)/,$(COROUTINE_HEADERS)) | \
+   $(SRC_PATH)/scripts/coroutine-wrapper.py > $@,"GEN","$(TARGET_DIR)$@")
+
 trace-group-name = $(shell dirname $1 | sed -e 's/[^a-zA-Z0-9]/_/g')
 
 tracetool-y = $(SRC_PATH)/scripts/tracetool.py
diff --git a/block/block-gen.h b/block/block-gen.h
new file mode 100644
index 00..f80cf4897d
--- /dev/null
+++ b/block/block-gen.h
@@ -0,0 +1,49 @@
+/*
+ * Block coroutine wrapping core, used by auto-generated block/block-gen.c
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2020 Virtuozzo International GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_BLOCK_GEN_H
+#define BLOCK_BLOCK_GEN_H
+
+#include "block/block_int.h"
+
+/* Base structure for argument packing structures */
+typedef struct BdrvPollCo {
+BlockDriverState *bs;
+bool in_progress;
+int ret;
+Coroutine *co; /* Keep pointer here for debugging */
+} BdrvPollCo;
+
+static inline int bdrv_poll_co(BdrvPollCo *s)
+{
+assert(!qemu_in_coroutine());
+
+bdrv_coroutine_enter(s->bs, s->co);
+BDRV_POLL_WHILE(s->bs, s->in_progress);
+
+return s->ret;
+}
+
+#endif /* BLOCK_BLOCK_GEN_H */
diff --git a/include/block/block.h b/include/block/block.h
index 46965a7780..59a002e06f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -10,6 +10,13 @@
 #include "block/blockjob.h"
 #include "qemu/hbitmap.h"
 
+/*
+ * generated_co_wrapper
+ * Function specifier, which does nothing but marking functions to be
+ * generated by scripts/coroutine-wrapper.py
+ */
+#define generated_co_wrapper
+
 /* block.c */
 typedef struct BlockDriver BlockDriver;
 typedef struct BdrvChild BdrvChild;
diff --git a/block/Makefile.ob

Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough

2020-06-10 Thread Philippe Mathieu-Daudé

On 6/10/20 11:03 AM, Thomas Huth wrote:
> On 10/06/2020 10.25, Philippe Mathieu-Daudé wrote:
>> On 6/10/20 9:59 AM, Thomas Huth wrote:
>>> On 10/06/2020 09.53, Philippe Mathieu-Daudé wrote:
>>>> On 6/10/20 9:50 AM, Thomas Huth wrote:
>>>>> On 10/06/2020 09.31, Philippe Mathieu-Daudé wrote:
>>>>>> On 6/10/20 5:51 AM, Thomas Huth wrote:
>>>>>>> The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the
>>>>>>> required header config-devices.h is not included, so that the legacy
>>>>>>> IGD passthrough is currently broken. Let's include the right header
>>>>>>> to fix this issue.
>>>>>>>
>>>>>>> Buglink: https://bugs.launchpad.net/qemu/+bug/1882784
>>>>>>> Fixes: 29d62771c81d8fd244a67c14a1d968c268d3fb19
>>>>>>>        ("hw/vfio: Move the IGD quirk code to a separate file")
>>>>>>
>>>>>> What about shorter tag?
>>>>>>
>>>>>> Fixes: 29d62771c81 ("vfio: Move the IGD quirk code to a separate file")
>>>>>
>>>>> I always forget whether to use the short or the long version for
>>>>> "Fixes:" ... this can hopefully be fixed (if necessary) when the patch
>>>>> gets picked up.
>>>>>
>>>>>>> Signed-off-by: Thomas Huth 
>>>>>>> ---
>>>>>>>  hw/vfio/pci-quirks.c | 1 +
>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
>>>>>>> index f2155ddb1d..3158390db1 100644
>>>>>>> --- a/hw/vfio/pci-quirks.c
>>>>>>> +++ b/hw/vfio/pci-quirks.c
>>>>>>> @@ -11,6 +11,7 @@
>>>>>>>   */
>>>>>>>  
>>>>>>>  #include "qemu/osdep.h"
>>>>>>> +#include "config-devices.h"
>>>>>>
>>>>>> I've been wondering how we can avoid that mistake in the
>>>>>> future, but can't find anything besides human review.
>>>>>
>>>>> I think in the long term, we should include config-devices.h in osdep.h,
>>>>> just like config-host.h and config-target.h are already included there.
>>>>> Everything else is just too confusing. But then we should also add a
>>>>> mechanism to poison the switches from config-devices.h in common code...
>>>>
>>>> We only need it for the files under hw/, right?
>>>
>>> qtest.c in the main directory includes it, too.
>>
>> hw/ and qtests could include "hw/hw.h" instead of affecting all the
>> codebase via "qemu/osdep.h".
> 
> I don't think that's a good idea - in that case, you have to make sure
> to include hw/hw.h everywhere again, so you don't gain that much
> compared to including config-devices.h directly everywhere. osdep.h is
> our header that has to be included everywhere, so if we want to make
> sure that these defines are available everywhere, we have to include it
> from osdep.h.
> Apart from that, hw/hw.h just contains one more prototype - which likely
> should be renamed to cpu_hw_error() and moved to a cpu header instead,
> so that we can finally delete hw/hw.h completely.

Yes, this discussion assumes the "hw/hw.h" cleanup discussed elsewhere is done.

Re: [RFC v2 18/18] guest memory protection: Alter virtio default properties for protected guests

2020-06-10 Thread David Gibson

On Wed, Jun 10, 2020 at 10:48:42AM +0200, Cornelia Huck wrote:
> On Wed, 10 Jun 2020 14:39:22 +1000
> David Gibson  wrote:
> 
> > On Tue, Jun 09, 2020 at 12:16:41PM +0200, Cornelia Huck wrote:
> > > On Sun, 7 Jun 2020 13:07:35 +1000
> > > David Gibson  wrote:
> > >   
> > > > On Sat, Jun 06, 2020 at 04:21:31PM -0400, Michael S. Tsirkin wrote:  
> > > > > On Thu, May 21, 2020 at 01:43:04PM +1000, David Gibson wrote:
> > > > > > The default behaviour for virtio devices is not to use the 
> > > > > > platforms normal
> > > > > > DMA paths, but instead to use the fact that it's running in a 
> > > > > > hypervisor
> > > > > > to directly access guest memory.  That doesn't work if the guest's 
> > > > > > memory
> > > > > > is protected from hypervisor access, such as with AMD's SEV or 
> > > > > > POWER's PEF.
> > > > > > 
> > > > > > So, if a guest memory protection mechanism is enabled, then apply 
> > > > > > the
> > > > > > iommu_platform=on option so it will go through normal DMA 
> > > > > > mechanisms.
> > > > > > Those will presumably have some way of marking memory as shared 
> > > > > > with the
> > > > > > hypervisor or hardware so that DMA will work.
> > > > > > 
> > > > > > Signed-off-by: David Gibson 
> > > > > > ---
> > > > > >  hw/core/machine.c | 11 +++
> > > > > >  1 file changed, 11 insertions(+)
> > > > > > 
> > > > > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > > > > index 88d699bceb..cb6580954e 100644
> > > > > > --- a/hw/core/machine.c
> > > > > > +++ b/hw/core/machine.c
> > > > > > @@ -28,6 +28,8 @@
> > > > > >  #include "hw/mem/nvdimm.h"
> > > > > >  #include "migration/vmstate.h"
> > > > > >  #include "exec/guest-memory-protection.h"
> > > > > > +#include "hw/virtio/virtio.h"
> > > > > > +#include "hw/virtio/virtio-pci.h"
> > > > > >  
> > > > > >  GlobalProperty hw_compat_5_0[] = {};
> > > > > >  const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0);
> > > > > > @@ -1159,6 +1161,15 @@ void machine_run_board_init(MachineState 
> > > > > > *machine)
> > > > > >   * areas.
> > > > > >   */
> > > > > >  machine_set_mem_merge(OBJECT(machine), false, 
> > > > > > &error_abort);
> > > > > > +
> > > > > > +/*
> > > > > > + * Virtio devices can't count on directly accessing guest
> > > > > > + * memory, so they need iommu_platform=on to use normal DMA
> > > > > > + * mechanisms.  That requires disabling legacy virtio 
> > > > > > support
> > > > > > + * for virtio pci devices
> > > > > > + */
> > > > > > +object_register_sugar_prop(TYPE_VIRTIO_PCI, 
> > > > > > "disable-legacy", "on");
> > > > > > +object_register_sugar_prop(TYPE_VIRTIO_DEVICE, 
> > > > > > "iommu_platform", "on");
> > > > > >  }
> > > > > >  
> > > > > 
> > > > > I think it's a reasonable way to address this overall.
> > > > > As Cornelia has commented, addressing ccw as well
> > > > 
> > > > Sure.  I was assuming somebody who actually knows ccw could do that as
> > > > a follow up.  
> > > 
> > > FWIW, I think we could simply enable iommu_platform for protected
> > > guests for ccw; no prereqs like pci's disable-legacy.  
> > 
> > Right, and the code above should in fact already do so, since it
> > applies that to TYPE_VIRTIO_DEVICE, which is common.  The
> > disable-legacy part should be harmless for s390, since this is
> > effectively just setting a default, and we don't expect any
> > TYPE_VIRTIO_PCI devices to be instantiated on z.
> 
> Well, virtio-pci is available on s390, so people could try to use it --
> however, forcing disable-legacy won't hurt in that case, as it won't
> make the situation worse (I don't expect virtio-pci to work on s390
> protected guests.)

Sure, and if by whatever chance it does work, then you'll need
iommu_platform, and therefore disable-legacy for it.

> > > > > as cases where user has
> > > > > specified the property manually could be worth-while.
> > > > 
> > > > I don't really see what's to be done there.  I'm assuming that if the
> > > > user specifies it, they know what they're doing - particularly with
> > > > nonstandard guests there are some odd edge cases where those
> > > > combinations might work, they're just not very likely.  
> > > 
> > > If I understood Halil correctly, devices without iommu_platform
> > > apparently can crash protected guests on s390. Is that supposed to be a
> > > "if it breaks, you get to keep the pieces" situation, or do we really
> > > want to enforce iommu_platform?  
> > 
> > I actually think "if you broke it, keep the pieces" is an acceptable
> > approach here, but that doesn't preclude some further enforcement to
> > improve UX.
> 
> I'm worried about spreading dealing with this over too many code areas,
> though.



-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_

Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

2020-06-10 Thread David Gibson

On Wed, Jun 10, 2020 at 09:22:45AM +0200, David Hildenbrand wrote:
> On 10.06.20 06:31, David Gibson wrote:
> > On Tue, Jun 09, 2020 at 12:44:39PM -0400, Michael S. Tsirkin wrote:
> >> On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
> >>> On Tue, 9 Jun 2020 17:47:47 +0200
> >>> Claudio Imbrenda  wrote:
> >>>
> >>>> On Tue, 9 Jun 2020 11:41:30 +0200
> >>>> Halil Pasic  wrote:
> >>>>
> >>>> [...]
> >>>>
> >>>>> I don't know. Janosch could answer that, but he is on vacation. Adding
> >>>>> Claudio maybe he can answer. My understanding is, that while it might
> >>>>> be possible, it is ugly at best. The ability to do a transition is
> >>>>> indicated by a CPU model feature. Indicating the feature to the guest
> >>>>> and then failing the transition sounds wrong to me.
> >>>>
> >>>> I agree. If the feature is advertised, then it has to work. I don't
> >>>> think we even have an architected way to fail the transition for that
> >>>> reason.
> >>>>
> >>>> What __could__ be done is to prevent qemu from even starting if an
> >>>> incompatible device is specified together with PV.
> >>>
> >>> AFAIU, the "specified together with PV" is the problem here. Currently
> >>> we don't "specify PV" but PV is just a capability that is managed by the
> >>> CPU model (like so many other).
> >>
> >> So if we want to keep it user friendly, there could be
> >> protection property with values on/off/auto, and auto
> >> would poke at host capability to figure out whether
> >> it's supported.
> >>
> >> Both virtio and CPU would inherit from that.
> > 
> > Right, that's what I have in mind for my 'host-trust-limitation'
> > property (a generalized version of the existing 'memory-encryption'
> > machine option).  My draft patches already set virtio properties
> > accordingly, it should be possible to set (default) cpu properties as
> > well.
> 
> No crazy CPU model hacks please (at least speaking for the s390x).

Uh... I'm not really sure what you have in mind here.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v2 1/1] virtio-ccw: auto-manage VIRTIO_F_IOMMU_PLATFORM if PV

2020-06-10 Thread David Hildenbrand

On 10.06.20 12:07, David Gibson wrote:
> On Wed, Jun 10, 2020 at 09:22:45AM +0200, David Hildenbrand wrote:
>> On 10.06.20 06:31, David Gibson wrote:
>>> On Tue, Jun 09, 2020 at 12:44:39PM -0400, Michael S. Tsirkin wrote:
>>>> On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
>>>>> On Tue, 9 Jun 2020 17:47:47 +0200
>>>>> Claudio Imbrenda  wrote:
>>>>>
>>>>>> On Tue, 9 Jun 2020 11:41:30 +0200
>>>>>> Halil Pasic  wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>> I don't know. Janosch could answer that, but he is on vacation. Adding
>>>>>>> Claudio maybe he can answer. My understanding is, that while it might
>>>>>>> be possible, it is ugly at best. The ability to do a transition is
>>>>>>> indicated by a CPU model feature. Indicating the feature to the guest
>>>>>>> and then failing the transition sounds wrong to me.
>>>>>>
>>>>>> I agree. If the feature is advertised, then it has to work. I don't
>>>>>> think we even have an architected way to fail the transition for that
>>>>>> reason.
>>>>>>
>>>>>> What __could__ be done is to prevent qemu from even starting if an
>>>>>> incompatible device is specified together with PV.
>>>>>
>>>>> AFAIU, the "specified together with PV" is the problem here. Currently
>>>>> we don't "specify PV" but PV is just a capability that is managed by the
>>>>> CPU model (like so many other).
>>>>
>>>> So if we want to keep it user friendly, there could be
>>>> protection property with values on/off/auto, and auto
>>>> would poke at host capability to figure out whether
>>>> it's supported.
>>>>
>>>> Both virtio and CPU would inherit from that.
>>>
>>> Right, that's what I have in mind for my 'host-trust-limitation'
>>> property (a generalized version of the existing 'memory-encryption'
>>> machine option).  My draft patches already set virtio properties
>>> accordingly, it should be possible to set (default) cpu properties as
>>> well.
>>
>> No crazy CPU model hacks please (at least speaking for the s390x).
> 
> Uh... I'm not really sure what you have in mind here.
> 

Reading along I got the impression that we want to glue the availability
of CPU features to other QEMU cmdline parameters (besides the
accelerator). ("to set (default) cpu properties as well"). If we are
talking about other CPU properties not expressed as CPU features (e.g.,
-cpu X,Y=on ...), then there is no issue.

-- 
Thanks,

David / dhildenb

[Bug 1882671] Re: qemu-system-x86_64 (ver 4.2) stuck at boot with OVMF bios

2020-06-10 Thread Launchpad Bug Tracker

Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: qemu (Ubuntu)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1882671

Title:
  qemu-system-x86_64 (ver 4.2) stuck at boot with OVMF bios

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  The version of QEMU (4.2.0) packaged for Ubuntu 20.04 hangs
  indefinitely at boot if an OVMF bios is used. This happens ONLY with
  qemu-system-x86_64. qemu-system-i386 works fine with the latest ia32
  OVMF bios.

  NOTE[1]: the same identical OVMF bios works fine on QEMU 2.x packaged
  with Ubuntu 18.04.
  NOTE[2]: reproducing the fatal bug requires *no* operating system:

 qemu-system-x86_64 -bios OVMF-pure-efi.fd

  On its window QEMU gets stuck at the very first stage:
 "Guest has not initialized the display (yet)."

  NOTE[3]: QEMU gets stuck no matter if KVM is used or not.

  NOTE[4]: By adding the `-d int` option it is possible to observe that
  QEMU is, apparently, stuck in an endless loop of interrupts. For the
  first few seconds, registers' values vary quickly, but at some point
  they reach a final value, while the interrupt counter increments:

2568: v=68 e= i=0 cpl=0 IP=0038:07f1d225 pc=07f1d225 SP=0030:07f0c8d0 env->regs[R_EAX]=
  RAX= RBX=07f0c920 RCX= RDX=0001
  RSI=06d18798 RDI=8664 RBP= RSP=07f0c8d0
  R8 =0001 R9 =0089 R10= R11=07f2c987
  R12= R13= R14=07087901 R15=
  RIP=07f1d225 RFL=0246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0030   00cf9300 DPL=0 DS   [-WA]
  CS =0038   00af9a00 DPL=0 CS64 [-R-]
  SS =0030   00cf9300 DPL=0 DS   [-WA]
  DS =0030   00cf9300 DPL=0 DS   [-WA]
  FS =0030   00cf9300 DPL=0 DS   [-WA]
  GS =0030   00cf9300 DPL=0 DS   [-WA]
  LDT=   8200 DPL=0 LDT
  TR =   8b00 DPL=0 TSS64-busy
  GDT= 079eea98 0047
  IDT= 0758f018 0fff
  CR0=80010033 CR2= CR3=07c01000 CR4=0668
  DR0= DR1= DR2= DR3= 
  DR6=0ff0 DR7=0400
  CCS=0044 CCD= CCO=EFLAGS  
  EFER=0d00

  
  NOTE[5]: Just to better help the investigation of the bug, I'd like to
  remark that the issue is NOT caused by an endless loop of triple-faults.
  I tried with -d cpu_reset and there is NO such loop. No triple fault
  whatsoever.

  NOTE[6]: The OVMF version used for the test has been downloaded from:
  
https://www.kraxel.org/repos/jenkins/edk2/edk2.git-ovmf-x64-0-20200515.1398.g6ff7c838d0.noarch.rpm

  but the issue is the same with older OVMF versions as well.

  
  Please take a look at it, as the bug is NOT a corner case. QEMU 4.2.0
  cannot boot with an UEFI firmware (OVMF) while virtualizing a x86_64
  machine AT ALL.

  Thank you very much,
  Vladislav K. Valtchev

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1882671/+subscriptions

Re: [PATCH v2] hmp: Make json format optional for qom-set

2020-06-10 Thread David Hildenbrand

On 10.06.20 09:51, David Hildenbrand wrote:
> Commit 7d2ef6dcc1cf ("hmp: Simplify qom-set") switched to the json
> parser, making it possible to specify complex types. However, with this
> change it is no longer possible to specify proper sizes (e.g., 2G, 128M),
> turning the interface harder to use for properties that consume sizes.
> 
> Let's switch back to the previous handling and allow to specify passing
> json via the "-j" parameter.
> 
> Cc: Philippe Mathieu-Daudé 
> Cc: Markus Armbruster 
> Cc: Dr. David Alan Gilbert 
> Cc: Paolo Bonzini 
> Cc: "Daniel P. Berrangé" 
> Cc: Eduardo Habkost 
> Signed-off-by: David Hildenbrand 
> ---
> v1 - v2:
> - keep the "value:S" as correctly noted by Paolo :)
> ---
>  hmp-commands.hx|  7 ---
>  qom/qom-hmp-cmds.c | 20 
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 28256209b5..5d12fbeebe 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1806,9 +1806,10 @@ ERST
>  
>  {
>  .name   = "qom-set",
> -.args_type  = "path:s,property:s,value:S",
> -.params = "path property value",
> -.help   = "set QOM property",
> +.args_type  = "json:-j,path:s,property:s,value:S",
> +.params = "[-j] path property value",
> +.help   = "set QOM property.\n\t\t\t"
> +  "-j: the property is specified in json format.",

Stupid mistake:

"-j: the value is specified in json format


-- 
Thanks,

David / dhildenb

Re: [PATCH v2] hmp: Make json format optional for qom-set

2020-06-10 Thread Dr. David Alan Gilbert

* David Hildenbrand (da...@redhat.com) wrote:
> Commit 7d2ef6dcc1cf ("hmp: Simplify qom-set") switched to the json
> parser, making it possible to specify complex types. However, with this
> change it is no longer possible to specify proper sizes (e.g., 2G, 128M),
> turning the interface harder to use for properties that consume sizes.
> 
> Let's switch back to the previous handling and allow to specify passing
> json via the "-j" parameter.
> 
> Cc: Philippe Mathieu-Daudé 
> Cc: Markus Armbruster 
> Cc: Dr. David Alan Gilbert 
> Cc: Paolo Bonzini 
> Cc: "Daniel P. Berrangé" 
> Cc: Eduardo Habkost 
> Signed-off-by: David Hildenbrand 

Yep OK.  Shame it's got back to even more complex but it makes sense.
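
The trade-off behind the "-j" switch can be seen in a small sketch outside
QEMU. Here parse_size is a hypothetical stand-in for the size handling of
the non-JSON path (the suffix table, the binary multiples and the function
names are assumptions of this example, not QEMU code), while json.loads
plays the role of the JSON parser. A size such as 2G is simply not valid
JSON, which is why a JSON-only qom-set could not accept it.

```python
import json

# Hypothetical suffix table; binary multiples are an assumption of this sketch.
SUFFIXES = {'K': 1 << 10, 'M': 1 << 20, 'G': 1 << 30, 'T': 1 << 40}


def parse_size(text):
    """Parse '128M' or '2G' style sizes into a byte count."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1].upper()]
    return int(text)


def qom_set_value(value, use_json=False):
    """Mimic the '-j' switch: JSON when requested, plain parsing otherwise."""
    if use_json:
        return json.loads(value)
    return parse_size(value)


print(qom_set_value('2G'))
print(qom_set_value('[1, 2]', use_json=True))
try:
    qom_set_value('2G', use_json=True)
except json.JSONDecodeError as err:
    print('not valid JSON:', err.msg)
```

Keeping the plain path as the default and making JSON opt-in preserves
both uses: human-friendly sizes by default, complex values on request.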


Reviewed-by: Dr. David Alan Gilbert 

> ---
> v1 - v2:
> - keep the "value:S" as correctly noted by Paolo :)
> ---
>  hmp-commands.hx|  7 ---
>  qom/qom-hmp-cmds.c | 20 
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 28256209b5..5d12fbeebe 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1806,9 +1806,10 @@ ERST
>  
>  {
>  .name   = "qom-set",
> -.args_type  = "path:s,property:s,value:S",
> -.params = "path property value",
> -.help   = "set QOM property",
> +.args_type  = "json:-j,path:s,property:s,value:S",
> +.params = "[-j] path property value",
> +.help   = "set QOM property.\n\t\t\t"
> +  "-j: the property is specified in json format.",
>  .cmd= hmp_qom_set,
>  .flags  = "p",
>  },
> diff --git a/qom/qom-hmp-cmds.c b/qom/qom-hmp-cmds.c
> index f704b6949a..a794e62f0b 100644
> --- a/qom/qom-hmp-cmds.c
> +++ b/qom/qom-hmp-cmds.c
> @@ -44,15 +44,27 @@ void hmp_qom_list(Monitor *mon, const QDict *qdict)
>  
>  void hmp_qom_set(Monitor *mon, const QDict *qdict)
>  {
> +const bool json = qdict_get_try_bool(qdict, "json", false);
>  const char *path = qdict_get_str(qdict, "path");
>  const char *property = qdict_get_str(qdict, "property");
>  const char *value = qdict_get_str(qdict, "value");
>  Error *err = NULL;
> -QObject *obj;
>  
> -obj = qobject_from_json(value, &err);
> -if (err == NULL) {
> -qmp_qom_set(path, property, obj, &err);
> +if (!json) {
> +Object *obj = object_resolve_path(path, NULL);
> +
> +if (!obj) {
> +error_set(&err, ERROR_CLASS_DEVICE_NOT_FOUND,
> +  "Device '%s' not found", path);
> +} else {
> +object_property_parse(obj, value, property, &err);
> +}
> +} else {
> +QObject *obj = qobject_from_json(value, &err);
> +
> +if (!err) {
> +qmp_qom_set(path, property, obj, &err);
> +}
>  }
>  
>  hmp_handle_error(mon, err);
> -- 
> 2.26.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v2] hmp: Make json format optional for qom-set

2020-06-10 Thread Dr. David Alan Gilbert

* David Hildenbrand (da...@redhat.com) wrote:
> On 10.06.20 09:51, David Hildenbrand wrote:
> > Commit 7d2ef6dcc1cf ("hmp: Simplify qom-set") switched to the json
> > parser, making it possible to specify complex types. However, with this
> > change it is no longer possible to specify proper sizes (e.g., 2G, 128M),
> > turning the interface harder to use for properties that consume sizes.
> > 
> > Let's switch back to the previous handling and allow to specify passing
> > json via the "-j" parameter.
> > 
> > Cc: Philippe Mathieu-Daudé 
> > Cc: Markus Armbruster 
> > Cc: Dr. David Alan Gilbert 
> > Cc: Paolo Bonzini 
> > Cc: "Daniel P. Berrangé" 
> > Cc: Eduardo Habkost 
> > Signed-off-by: David Hildenbrand 
> > ---
> > v1 - v2:
> > - keep the "value:S" as correctly noted by Paolo :)
> > ---
> >  hmp-commands.hx|  7 ---
> >  qom/qom-hmp-cmds.c | 20 
> >  2 files changed, 20 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hmp-commands.hx b/hmp-commands.hx
> > index 28256209b5..5d12fbeebe 100644
> > --- a/hmp-commands.hx
> > +++ b/hmp-commands.hx
> > @@ -1806,9 +1806,10 @@ ERST
> >  
> >  {
> >  .name   = "qom-set",
> > -.args_type  = "path:s,property:s,value:S",
> > -.params = "path property value",
> > -.help   = "set QOM property",
> > +.args_type  = "json:-j,path:s,property:s,value:S",
> > +.params = "[-j] path property value",
> > +.help   = "set QOM property.\n\t\t\t"
> > +  "-j: the property is specified in json format.",
> 
> Stupid mistake:
> 
> "-j: the value is specified in json format

oops; can fix that in commit

> 
> -- 
> Thanks,
> 
> David / dhildenb
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v2] hmp: Make json format optional for qom-set

2020-06-10 Thread David Hildenbrand

On 10.06.20 12:39, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (da...@redhat.com) wrote:
>> On 10.06.20 09:51, David Hildenbrand wrote:
>>> Commit 7d2ef6dcc1cf ("hmp: Simplify qom-set") switched to the json
>>> parser, making it possible to specify complex types. However, with this
>>> change it is no longer possible to specify proper sizes (e.g., 2G, 128M),
>>> turning the interface harder to use for properties that consume sizes.
>>>
>>> Let's switch back to the previous handling and allow to specify passing
>>> json via the "-j" parameter.
>>>
>>> Cc: Philippe Mathieu-Daudé 
>>> Cc: Markus Armbruster 
>>> Cc: Dr. David Alan Gilbert 
>>> Cc: Paolo Bonzini 
>>> Cc: "Daniel P. Berrangé" 
>>> Cc: Eduardo Habkost 
>>> Signed-off-by: David Hildenbrand 
>>> ---
>>> v1 - v2:
>>> - keep the "value:S" as correctly noted by Paolo :)
>>> ---
>>>  hmp-commands.hx|  7 ---
>>>  qom/qom-hmp-cmds.c | 20 
>>>  2 files changed, 20 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>>> index 28256209b5..5d12fbeebe 100644
>>> --- a/hmp-commands.hx
>>> +++ b/hmp-commands.hx
>>> @@ -1806,9 +1806,10 @@ ERST
>>>  
>>>  {
>>>  .name   = "qom-set",
>>> -.args_type  = "path:s,property:s,value:S",
>>> -.params = "path property value",
>>> -.help   = "set QOM property",
>>> +.args_type  = "json:-j,path:s,property:s,value:S",
>>> +.params = "[-j] path property value",
>>> +.help   = "set QOM property.\n\t\t\t"
>>> +  "-j: the property is specified in json format.",
>>
>> Stupid mistake:
>>
>> "-j: the value is specified in json format
> 
> oops; can fix that in commit

Perfect, let me know in case you need a respin. Thanks!


-- 
Thanks,

David / dhildenb
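
To illustrate why size suffixes matter here: a bare JSON number carries no unit, so a value such as "2G" is rejected by a JSON parser, while a size parser can expand it. The following is a minimal sketch of such a conversion. The helper name parse_size_sketch() is hypothetical and this is not QEMU's own parser; it only shows the kind of handling the non-JSON path restores.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not QEMU code): convert strings such as "128M"
 * or "2G" to a byte count. Returns 0 on success, -1 on malformed input
 * or overflow.
 */
static int parse_size_sketch(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long n;
    unsigned shift = 0;

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (*s < '0' || *s > '9') {
        return -1;
    }
    errno = 0;
    n = strtoull(s, &end, 10);
    if (errno != 0) {
        return -1;
    }

    /* Optional binary suffix. */
    switch (*end) {
    case 'K': case 'k': shift = 10; end++; break;
    case 'M': case 'm': shift = 20; end++; break;
    case 'G': case 'g': shift = 30; end++; break;
    case 'T': case 't': shift = 40; end++; break;
    case '\0': break;
    default: return -1;
    }
    if (*end != '\0') {
        return -1;              /* trailing garbage after the suffix */
    }
    if (shift != 0 && n > (UINT64_MAX >> shift)) {
        return -1;              /* would overflow 64 bits */
    }
    *out = (uint64_t)n << shift;
    return 0;
}
```

With this sketch, "2G" yields 2147483648 and "128M" yields 134217728, whereas an unknown suffix such as "2X" is rejected.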

Source for configuration for cloud-init

2020-06-10 Thread Søren Hansen

All,

I'm finding myself needing to pass in some information to cloud-init in
some VMs in a non-cloud environment.

cloud-init is a (very) widely used tool for applying some initial
configuration to VMs. It originally used only AWS EC2's metadata
and userdata service, but has since been extended to use many other
configuration sources. The volume of the configuration varies a lot. For my
use, they will be several kB (passing in various certificates, etc.). For
non-cloud environments, the traditional source for user-data configuration
has been an ISO with the relevant configuration files. This feels
anachronistic to me.

I raised a feature request with cloud-init to have it support fw_cfg as a
configuration source: https://bugs.launchpad.net/cloud-init/+bug/1879294

I was told that there was already a feature request to use SMBIOS fields to
do the same: https://bugs.launchpad.net/cloud-init/+bug/1753558

Dan Berrangé (of libvirt fame) pointed out (in that latter bug report) that
the qemu developers advised against using fw_cfg for this sort of thing. I
have no particular reason to doubt him, but I'd still like to hear from the
horse's own mouth and try to understand why.

While I can certainly find a way to serialize my config blob into a single
string with no NULLs (per SMBIOS spec) and pass it in through the SMBIOS
interface, the fw_cfg approach seems a whole lot simpler for both the user
and for cloud-init. Also, having gotten so used to cloud-init just being
there, personally it doesn't feel like much of a stretch to think of it as
a sort of firmware.

Can someone enlighten me on why using fw_cfg is the wrong way to go?

Best regards,
Soren L. Hansen
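
As an illustration of the serialization step mentioned above (turning a binary config blob into a single string without NUL bytes), here is a minimal sketch using standard base64. The function name base64_encode_sketch() is hypothetical, and base64 is only one possible encoding; nothing in this thread says which encoding cloud-init or QEMU would expect.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Encode len bytes as standard base64. Returns a malloc()ed,
 * NUL-terminated string that contains no embedded NUL bytes,
 * or NULL if allocation fails. The caller frees the result.
 */
static char *base64_encode_sketch(const unsigned char *in, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t out_len = 4 * ((len + 2) / 3);
    char *out = malloc(out_len + 1);
    size_t i, o = 0;

    if (out == NULL) {
        return NULL;
    }
    /* Full 3-byte groups map to 4 output characters. */
    for (i = 0; i + 2 < len; i += 3) {
        unsigned v = ((unsigned)in[i] << 16) | ((unsigned)in[i + 1] << 8)
                     | in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    /* One or two leftover bytes are padded with '='. */
    if (i < len) {
        unsigned v = (unsigned)in[i] << 16;
        if (i + 1 < len) {
            v |= (unsigned)in[i + 1] << 8;
        }
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return out;
}
```

The cost of this approach is a size increase of roughly one third, plus a decode step on the guest side, which is part of why a file-like interface looks simpler from the user's point of view.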

Re: [PATCH v2 1/8] MAINTAINERS: Mark SH4 hardware orphan

2020-06-10 Thread Aleksandar Markovic

On Mon, 8 Jun 2020 at 11:05, Philippe Mathieu-Daudé wrote:
>
> Aurelien Jarno expressed his desire to orphan the SH4 hardware [*]:
>
>   I don't mind being [...] removed from there.
>   I do not really have time to work on that.
>
> Mark the SH4 emulated hardware orphan.
>
> Many thanks to Aurelien for his substantial contributions to QEMU,
> and for maintaining the SH4 hardware for various years!
>
> [*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg708400.html
>
> Message-Id: <20200601214125.ga1924...@aurel32.net>
> Acked-by: Aurelien Jarno 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

The basic idea of the patch (as read from the title and the commit
message) is good and positive.

The problem is that the patch does something different than the commit
message says - pretending that it just orphans something. Which is not
good. Actually, very clumsy and bad.

It creates a whole new subsection in MAINTAINERS file (not said in the
commit message), without any consistency with the current organization
in the file. That new subsection looks completely misplaced, living
with "TCG CPUs" neighbours. On top of that, it creates a new
precedent, leaving many unanswered questions, like: Should other
targets follow the same pattern?

I personally think that creating a new subsection is just code
churn, a waste of everybody's time on unimportant things.

Wouldn't it be simpler if you just changed the statuses of all of
Aurelien's sh4 sections to "Orphaned", as he already said he approves,
and left the sh4 sections reorganization to a future maintainer?

If you really want to reorganize the sh4 sections, these changes should
be in a separate patch. The "orphaning" patch should contain only status
changes.

Regards,
Aleksandar

>  MAINTAINERS | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 6e7890ce82..49d90c70de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -299,9 +299,7 @@ SH4 TCG CPUs
>  M: Aurelien Jarno 
>  S: Odd Fixes
>  F: target/sh4/
> -F: hw/sh4/
>  F: disas/sh4.c
> -F: include/hw/sh4/
>
>  SPARC TCG CPUs
>  M: Mark Cave-Ayland 
> @@ -1948,6 +1946,14 @@ F: hw/*/*xive*
>  F: include/hw/*/*xive*
>  F: docs/*/*xive*
>
> +SH4 Hardware
> +S: Orphan
> +F: hw/sh4/
> +F: hw/char/sh_serial.c
> +F: hw/intc/sh_intc.c
> +F: hw/timer/sh_timer.c
> +F: include/hw/sh4/
> +
>  Subsystems
>  --
>  Audio
> --
> 2.21.3
>
>

Re: [PATCH v2 1/8] MAINTAINERS: Mark SH4 hardware orphan

2020-06-10 Thread Thomas Huth

On 10/06/2020 13.08, Aleksandar Markovic wrote:
> On Mon, 8 Jun 2020 at 11:05, Philippe Mathieu-Daudé wrote:
>>
>> Aurelien Jarno expressed his desire to orphan the SH4 hardware [*]:
>>
>>   I don't mind being [...] removed from there.
>>   I do not really have time to work on that.
>>
>> Mark the SH4 emulated hardware orphan.
>>
>> Many thanks to Aurelien for his substantial contributions to QEMU,
>> and for maintaining the SH4 hardware for various years!
>>
>> [*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg708400.html
>>
>> Message-Id: <20200601214125.ga1924...@aurel32.net>
>> Acked-by: Aurelien Jarno 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
> 
> The basic idea of the patch (as read from the title and the commit
> message) is good and positive.
> 
> The problem is that the patch does something different than the commit
> message says - pretending that it just orphans something. Which is not
> good. Actually, very clumsy and bad.

Aleksandar, could you please stop being so negative? If you've got
issues with a patch, that's fair, but you can then also simply express
your opinion in a professional and constructive way. Calling the work of
someone else "clumsy" is really not something that I want to read on the
qemu-devel mailing list.

 Thanks,
  Thomas

Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()

2020-06-10 Thread Kevin Wolf

On 10.06.2020 at 08:50, Vladimir Sementsov-Ogievskiy wrote:
> 09.06.2020 19:19, Eric Blake wrote:
> > On 6/9/20 10:18 AM, Kevin Wolf wrote:
> > 
> > > > > > -        ret = bdrv_co_pwrite_zeroes(s->data_file, m->alloc_offset,
> > > > > > -                                    m->nb_clusters * s->cluster_size,
> > > > > > +        ret = bdrv_co_pwrite_zeroes(s->data_file, start, len,
> > > > > >                                      BDRV_REQ_NO_FALLBACK);
> > > > 
> > > > Good point.  If we weren't using BDRV_REQ_NO_FALLBACK, then avoiding a
> > > > pre-zero pass over the middle is essential.  But since we are insisting
> > > > that the pre-zero pass be fast or else immediately fail, the time spent
> > > > in pre-zeroing should not be a concern.  Do you have benchmark numbers
> > > > stating otherwise?
> > > 
> > > I stumbled across this behaviour (write_zeros for 2 MB, then overwrite
> > > almost everything) in the context of a different bug, and it just didn't
> > > make much sense to me. Is there really a file system where fragmentation
> > > is introduced by not zeroing the area first and then overwriting it?
> > > 
> > > I'm not insisting on making this change because the behaviour is
> > > harmless if odd, but if we think that writing twice to some blocks is an
> > > optimisation, maybe we should actually measure and document this.
> > > 
> > > 
> > > Anyway, let's talk about the reported bug that made me look at the
> > > strace that showed this behaviour because I feel it supports my last
> > > point. It's a bit messy, but anyway:
> > > 
> > > >     https://bugzilla.redhat.com/show_bug.cgi?id=1666864
> > > 
> > > So initially, bad performance on a fragmented image file was reported.
> > > Not much to do there, but then in comment 16, QA reported a performance
> > > > regression in this case between 4.0 and 4.2. And this change was caused by
> > > c8bb23cbdbe, i.e. the commit that introduced handle_alloc_space().
> > > 
> > > Turns out that BDRV_REQ_NO_FALLBACK doesn't always guarantee that it's
> > > _really_ fast. fallocate(FALLOC_FL_ZERO_RANGE) causes some kind of flush
> > > on XFS and buffered writes don't. So with the old code, qemu-img convert
> > > to a file on a very full filesystem that will cause fragmentation, was
> > > much faster with writing a zero buffer than with write_zeroes (because
> > > it didn't flush the result).
> > 
> > Wow. That makes it sound like we should NOT attempt
> > fallocate(FALLOC_FL_ZERO_RANGE) on the fast path, because we don't
> > have guarantees that it is fast.
> > 
> > I really wish the kernel would give us
> > fallocate(FALLOC_FL_ZERO_RANGE|FALLOC_FL_NO_FALLBACK) which would
> > fail fast rather than doing a flush or other slow fallback.
> > 
> > > 
> > > I don't fully understand why this is and hope that XFS can do
> > > something about it. I also don't really think we should revert the
> > > change in QEMU, though I'm not completely sure. But I just wanted
> > > to share this to show that "obvious" characteristics of certain
> > > types of requests aren't always true and doing obscure
> > > optimisations based on what we think filesystems may do can
> > > actually achieve the opposite in some cases.
> > 
> > It also goes to show us that the kernel does NOT yet give us enough
> > fine-grained control over what we really want (which is: 'pre-zero
> > > this if it is fast, but don't waste time if it is not').  Most of the
> > kernel interfaces end up being 'pre-zero this, and it might be fast,
> > fail fast, or even fall back to something safe but slow, and you
> > can't tell the difference short of trying'.
> 
> Hmm, actually, for small cow areas (several bytes? several sectors?),
> I'm not surprised that direct writing zeroed buffer may work faster
> than any kind of WRITE_ZERO request. Especially, expanding
> write-request for a small amount of bytes may be faster than doing
> instead two requests. Possibly, we need some heuristics here. And I
> think, it would be good to add some benchmarks based on
> scripts/simplebench to have real numbers (we'll try).

I'll continue the discussion in the BZ, but yes, at the moment the
recommendation of the XFS people seems to be that we avoid fallocate()
(at least without FALLOC_FL_KEEP_SIZE and on local filesystems) for
small sizes.

It's not obvious what "small sizes" means in practice, but I wouldn't be
surprised if a qcow2 cluster is always in this category, even if it's
2 MB. (The pathological qemu-img convert case does use 2 MB buffers.)

Kevin
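
The size-based heuristic discussed above could look roughly like the sketch below: small ranges are zeroed with plain writes, larger ones with fallocate(FALLOC_FL_ZERO_RANGE), falling back to plain writes if the filesystem rejects the request. This is not what QEMU does; the function names are hypothetical and so is the threshold, since the thread does not settle what "small" means.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() and FALLOC_FL_* on glibc */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical cut-off; an assumption, not a value from the thread. */
#define SMALL_ZERO_THRESHOLD (4u * 1024 * 1024)

/* Zero [off, off + len) by writing a zero-filled buffer. Returns 0 or -1. */
static int zero_by_write(int fd, off_t off, size_t len)
{
    static const char zeros[65536];

    while (len > 0) {
        size_t chunk = len < sizeof(zeros) ? len : sizeof(zeros);
        ssize_t n = pwrite(fd, zeros, chunk, off);

        if (n <= 0) {
            return -1;          /* error (EINTR is not retried here) */
        }
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Zero a range: plain writes below the threshold, fallocate() at or
 * above it, with plain writes as the fallback if fallocate() fails.
 */
static int zero_range_sketch(int fd, off_t off, size_t len)
{
#ifdef FALLOC_FL_ZERO_RANGE
    if (len >= SMALL_ZERO_THRESHOLD &&
        fallocate(fd, FALLOC_FL_ZERO_RANGE, off, (off_t)len) == 0) {
        return 0;
    }
#endif
    return zero_by_write(fd, off, len);
}
```

Whether any fixed threshold is right would have to come from measurements of the kind proposed above with scripts/simplebench.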

Re: [PATCH v2 1/8] MAINTAINERS: Mark SH4 hardware orphan

2020-06-10 Thread Aleksandar Markovic

On Wed, 10 Jun 2020 at 13:17, Thomas Huth wrote:
>
> On 10/06/2020 13.08, Aleksandar Markovic wrote:
> > On Mon, 8 Jun 2020 at 11:05, Philippe Mathieu-Daudé wrote:
> >>
> >> Aurelien Jarno expressed his desire to orphan the SH4 hardware [*]:
> >>
> >>   I don't mind being [...] removed from there.
> >>   I do not really have time to work on that.
> >>
> >> Mark the SH4 emulated hardware orphan.
> >>
> >> Many thanks to Aurelien for his substantial contributions to QEMU,
> >> and for maintaining the SH4 hardware for various years!
> >>
> >> [*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg708400.html
> >>
> >> Message-Id: <20200601214125.ga1924...@aurel32.net>
> >> Acked-by: Aurelien Jarno 
> >> Signed-off-by: Philippe Mathieu-Daudé 
> >> ---
> >
> > The basic idea of the patch (as read from the title and the commit
> > message) is good and positive.
> >
> > The problem is that the patch does something different than the commit
> > message says - pretending that it just orphans something. Which is not
> > good. Actually, very clumsy and bad.
>
> Aleksandar, could you please stop being so negative? If you've got
> issues with a patch, that's fair, but you can then also simply express
> your opinion in a professional and constructive way. Calling the work of
> someone else "clumsy" is really not something that I want to read on the
> qemu-devel mailing list.
>

OK, then mentally delete that word, and focus on the substance.

>  Thanks,
>   Thomas
>

[PATCH v9 00/61] target/riscv: support vector extension v0.7.1

2020-06-10 Thread LIU Zhiwei

This patchset implements the vector extension for RISC-V on QEMU.

You can also find the patchset and all *test cases* in
my repo (https://github.com/romanheros/qemu.git, branch vector-upstream-v9).
All the test cases are in the directory qemu/tests/riscv/vector/. They are
riscv64 Linux user-mode programs.

You can test the patchset with the script qemu/tests/riscv/vector/runcase.sh.

Features:
  * support specification riscv-v-spec-0.7.1 (https://github.com/riscv/riscv-v-spec/releases/tag/0.7.1/)
  * support basic vector extension.
  * support Zvlsseg.
  * support Zvamo.
  * no support for Zvediv, as it is still changing.
  * SLEN always equals VLEN.
  * supported element widths: 8, 16, 32, and 64 bits.

Changelog:
v9
  * always set dynamic rounding mode for vector float insns.
  * bug fix atomic implementation.
  * bug fix first-only-fault.
  * some small tidy up.

v8
  * support different float rounding modes for vector instructions.
  * use latest released TCG GVEC DUP IR.
  * set RV_VLEN_MAX to 256 bits, as GVEC IR uses simd_desc.

v7
  * move vl == 0 check to translation time by adding a global cpu_vl.
  * implement vector element inline load and store function by TCG IR.
  * based on vec_element_load(store), implement some permutation instructions.
  * implement rsubs GVEC IR.
  * fixup vsmul, vmfne, vfmerge, vslidedown.
  * some other small bugs and indentation errors.

v6
  * use gvec_dup Gvec IR to accelerate move and merge.
  * a better way to implement fixed point instructions.
  * a global check when vl == 0.
  * limit some macros to only one inline function call.
  * fixup sew error when using Gvec IR.
  * fixup bugs for corner cases.

v5
  * fixup a bug in tb flags.

v4
  * no change

v3
  * move check code from execution-time to translation-time
  * use a continuous memory block for vector register description.
  * vector registers as direct fields in RISCVCPUState.
  * support VLEN configure from qemu command line.
  * support ELEN configure from qemu command line.
  * support vector specification version configure from qemu command line.
  * probe pages before real load or store access.
  * use probe_page_check for no-fault operations in linux user mode.
  * generate an atomic exit exception when in a parallel environment.
  * fixup a lot of concrete bugs.

v2
  * use float16_compare{_quiet}
  * only use GETPC() in outer most helper
  * add ctx.ext_v Property

LIU Zhiwei (61):
  target/riscv: add vector extension field in CPURISCVState
  target/riscv: implementation-defined constant parameters
  target/riscv: support vector extension csr
  target/riscv: add vector configure instruction
  target/riscv: add an internals.h header
  target/riscv: add vector stride load and store instructions
  target/riscv: add vector index load and store instructions
  target/riscv: add fault-only-first unit stride load
  target/riscv: add vector amo operations
  target/riscv: vector single-width integer add and subtract
  target/riscv: vector widening integer add and subtract
  target/riscv: vector integer add-with-carry / subtract-with-borrow
instructions
  target/riscv: vector bitwise logical instructions
  target/riscv: vector single-width bit shift instructions
  target/riscv: vector narrowing integer right shift instructions
  target/riscv: vector integer comparison instructions
  target/riscv: vector integer min/max instructions
  target/riscv: vector single-width integer multiply instructions
  target/riscv: vector integer divide instructions
  target/riscv: vector widening integer multiply instructions
  target/riscv: vector single-width integer multiply-add instructions
  target/riscv: vector widening integer multiply-add instructions
  target/riscv: vector integer merge and move instructions
  target/riscv: vector single-width saturating add and subtract
  target/riscv: vector single-width averaging add and subtract
  target/riscv: vector single-width fractional multiply with rounding
and saturation
  target/riscv: vector widening saturating scaled multiply-add
  target/riscv: vector single-width scaling shift instructions
  target/riscv: vector narrowing fixed-point clip instructions
  target/riscv: vector single-width floating-point add/subtract
instructions
  target/riscv: vector widening floating-point add/subtract instructions
  target/riscv: vector single-width floating-point multiply/divide
instructions
  target/riscv: vector widening floating-point multiply
  target/riscv: vector single-width floating-point fused multiply-add
instructions
  target/riscv: vector widening floating-point fused multiply-add
instructions
  target/riscv: vector floating-point square-root instruction
  target/riscv: vector floating-point min/max instructions
  target/riscv: vector floating-point sign-injection instructions
  target/riscv: vector floating-point compare instructions
  target/riscv: vector floating-point classify instructions
  target/riscv: vector floating-point merge instructions
  target/riscv

Re: Clarification regarding new qemu-img convert --target-is-zero flag

2020-06-10 Thread Kevin Wolf

On 10.06.2020 at 08:28, Sam Eiderman wrote:
> Hi,
> 
> My target format is a Persistent Disk on GCP.
> https://cloud.google.com/persistent-disk
> 
> And my use case is converting VMDKs to PDs so I'm just using qemu-img
> for the conversion (not using qemu as a hypervisor).
> 
> Luckily PDs are zeroed out when allocated but I was asking to
> understand the restrictions of qemu-img convert.
> 
> It could be useful for qemu-img convert to not zero out the disk, but
> do write allocated zeroes, I'm imagining cloud scenarios where instead
> of virtual disks the customer receives an attached physical SSD device
> that is not zeroed out beforehand (only encryption key changed, for
> privacy/security sake) so reads will return garbage.

But that's the default mode? Zeroing out the whole disk upfront is an
optimisation that we do if efficient zeroing is possible, but if we
can't, we just write explicit zeros where needed.

--target-is-zero means that you promise that the target is already
pre-zeroed so qemu-img can further optimise things. If you specify it
and the target doesn't contain zeros, but random data, you get garbage.

Kevin
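
The difference between the default mode and --target-is-zero can be sketched as follows. This is a simplified in-memory model with hypothetical names, not qemu-img's implementation; it leaves out the upfront whole-disk zeroing optimisation and assumes chunk > 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the n bytes at p are all zero. */
static bool chunk_is_zero(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Copy src to dst chunk by chunk. Zero chunks are written explicitly
 * unless the caller promises that dst already reads as zeros, in which
 * case they are skipped. Returns the number of bytes actually written.
 */
static size_t convert_sketch(const unsigned char *src, unsigned char *dst,
                             size_t len, size_t chunk, bool target_is_zero)
{
    size_t written = 0;

    for (size_t off = 0; off < len; off += chunk) {
        size_t n = len - off < chunk ? len - off : chunk;

        if (chunk_is_zero(src + off, n)) {
            if (target_is_zero) {
                continue;       /* trust the promise, write nothing */
            }
            memset(dst + off, 0, n);
        } else {
            memcpy(dst + off, src + off, n);
        }
        written += n;
    }
    return written;
}
```

If dst holds random data and target_is_zero is set anyway, the skipped chunks keep that random data, which is the "garbage" outcome described above.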

[PATCH v9 01/61] target/riscv: add vector extension field in CPURISCVState

2020-06-10 Thread LIU Zhiwei

The 32 vector registers will be viewed as a continuous memory block.
It avoids the conversion between element index and (regno, offset).
Thus elements can be directly accessed by offset from the first vector
base address.

Signed-off-by: LIU Zhiwei 
Acked-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu.h   | 12 
 target/riscv/translate.c |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 80569f0d44..0018a79fa3 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -59,6 +59,7 @@
 #define RVA RV('A')
 #define RVF RV('F')
 #define RVD RV('D')
+#define RVV RV('V')
 #define RVC RV('C')
 #define RVS RV('S')
 #define RVU RV('U')
@@ -88,9 +89,20 @@ typedef struct CPURISCVState CPURISCVState;
 
 #include "pmp.h"
 
+#define RV_VLEN_MAX 512
+
 struct CPURISCVState {
 target_ulong gpr[32];
 uint64_t fpr[32]; /* assume both F and D extensions */
+
+/* vector coprocessor state. */
+uint64_t vreg[32 * RV_VLEN_MAX / 64] QEMU_ALIGNED(16);
+target_ulong vxrm;
+target_ulong vxsat;
+target_ulong vl;
+target_ulong vstart;
+target_ulong vtype;
+
 target_ulong pc;
 target_ulong load_res;
 target_ulong load_val;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 43bf7e39a6..b71b7e4bc2 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -32,7 +32,7 @@
 #include "instmap.h"
 
 /* global register indices */
-static TCGv cpu_gpr[32], cpu_pc;
+static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
@@ -886,6 +886,7 @@ void riscv_translate_init(void)
 }
 
 cpu_pc = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, pc), "pc");
+cpu_vl = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, vl), "vl");
 load_res = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, load_res),
  "load_res");
 load_val = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, load_val),
-- 
2.23.0
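
To illustrate the flat layout described in the commit message: when all 32 registers sit back to back, an element is addressed by a single byte offset, and indexing past the end of one register lands in the next. The sketch below uses hypothetical names and a hypothetical VLEN of 128 bits, and it ignores host endianness, which real helpers would have to handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VLEN_BITS 128u   /* hypothetical vector register length */
#define SKETCH_NUM_VREGS 32u

/* Byte offset of element idx (esz bytes wide), counted from register regno. */
static size_t velem_offset(unsigned regno, unsigned idx, unsigned esz)
{
    return (size_t)regno * (SKETCH_VLEN_BITS / 8) + (size_t)idx * esz;
}

/* Load a 32-bit element from the flat register file (host byte order). */
static uint32_t velem_load_u32(const uint8_t *vreg, unsigned regno,
                               unsigned idx)
{
    uint32_t v;

    memcpy(&v, vreg + velem_offset(regno, idx, sizeof(v)), sizeof(v));
    return v;
}
```

With 128-bit registers and 32-bit elements, element 4 counted from v2 has the same offset (48 bytes) as element 0 of v3, so register groups need no special casing.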

Re: [PATCH] iotests: Fix 291 across more file systems

2020-06-10 Thread Kevin Wolf

On 09.06.2020 at 22:46, Eric Blake wrote:
> On 6/9/20 8:32 AM, Kevin Wolf wrote:
> > On 08.06.2020 at 21:56, Eric Blake wrote:
> > > Depending on the granularity of holes and amount of metadata consumed
> > > by a file, the 'disk size:' number of 'qemu-img info' is not reliable.
> > > Adjust our test to use a different set of filters to avoid spurious
> > > failures.
> > > 
> > > Reported-by: Kevin Wolf 
> > > Fixes: cf2d1203dc
> > > Signed-off-by: Eric Blake 
> > 
> > Thanks, applied to the block branch.
> 
> It has a conflict with one of Vladimir's bitmaps patches that I'm about to
> send a pull request for; so I'll resolve the conflict and include it in my
> bitmaps tree instead, and you can drop it from yours.  I'm assuming I can
> add your Acked-by since you were willing to stage it.

Ok, no problem.

Kevin

Re: [PATCH v7 0/9] acpi: i386 tweaks

2020-06-10 Thread Igor Mammedov

On Wed, 10 Jun 2020 11:41:22 +0200
Gerd Hoffmann  wrote:

> First batch of microvm patches, some generic acpi stuff.
> Split the acpi-build.c monster, specifically split the
> pc and q35 and pci bits into a separate file which we
> can skip building at some point in the future.
> 
It looks like the series is missing a patch to whitelist the changed ACPI
tables in bios-table-test.

Do we already have a test case for microvm in bios-table-test?
If not, it's probably time to add one.

> v2 changes: leave acpi-build.c largely as-is, move useful
> bits to other places to allow them being reused, specifically:
> 
>  * move isa device generator functions to individual isa devices.
>  * move fw_cfg generator function to fw_cfg.c
> 
> v3 changes: fix rtc, support multiple lpt devices.
> 
> v4 changes:
>  * drop merged patches.
>  * split rtc crs change to separate patch.
>  * added two cleanup patches.
>  * picked up ack & review tags.
> 
> v5 changes:
>  * add comment for rtc crs update.
>  * add even more cleanup patches.
>  * picked up ack & review tags.
> 
> v6 changes:
>  * floppy: move cmos_get_fd_drive_type.
>  * picked up ack & review tags.
> 
> v7 changes:
>  * rebased to mst/pci branch, resolved stubs conflict.
>  * dropped patches already queued up in mst/pci.
>  * added missing sign-off.
>  * picked up ack & review tags.
> 
> take care,
>   Gerd
> 
> Gerd Hoffmann (9):
>   acpi: move aml builder code for floppy device
>   floppy: make isa_fdc_get_drive_max_chs static
>   floppy: move cmos_get_fd_drive_type() from pc
>   acpi: move aml builder code for i8042 (kbd+mouse) device
>   acpi: factor out fw_cfg_add_acpi_dsdt()
>   acpi: simplify build_isa_devices_aml()
>   acpi: drop serial/parallel enable bits from dsdt
>   acpi: drop build_piix4_pm()
>   acpi: q35: drop _SB.PCI0.ISA.LPCD opregion.
> 
>  hw/i386/fw_cfg.h   |   1 +
>  include/hw/block/fdc.h |   3 +-
>  include/hw/i386/pc.h   |   1 -
>  hw/block/fdc.c | 111 +-
>  hw/i386/acpi-build.c   | 211 ++---
>  hw/i386/fw_cfg.c   |  28 ++
>  hw/i386/pc.c   |  25 -
>  hw/input/pckbd.c   |  31 ++
>  stubs/cmos.c   |   7 ++
>  stubs/Makefile.objs|   1 +
>  10 files changed, 184 insertions(+), 235 deletions(-)
>  create mode 100644 stubs/cmos.c
>

[PATCH v9 03/61] target/riscv: support vector extension csr

2020-06-10 Thread LIU Zhiwei

The v0.7.1 specification does not define vector status within mstatus.
A future revision will define the privileged portion of the vector status.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 15 +
 target/riscv/csr.c  | 75 -
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 7f64ee1174..8117e8b5a7 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -29,6 +29,14 @@
 #define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT)
 #define FSR_AEXC(FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA)
 
+/* Vector Fixed-Point round model */
+#define FSR_VXRM_SHIFT  9
+#define FSR_VXRM(0x3 << FSR_VXRM_SHIFT)
+
+/* Vector Fixed-Point saturation flag */
+#define FSR_VXSAT_SHIFT 8
+#define FSR_VXSAT   (0x1 << FSR_VXSAT_SHIFT)
+
 /* Control and Status Registers */
 
 /* User Trap Setup */
@@ -48,6 +56,13 @@
 #define CSR_FRM 0x002
 #define CSR_FCSR0x003
 
+/* User Vector CSRs */
+#define CSR_VSTART  0x008
+#define CSR_VXSAT   0x009
+#define CSR_VXRM0x00a
+#define CSR_VL  0xc20
+#define CSR_VTYPE   0xc21
+
 /* User Timers and Counters */
 #define CSR_CYCLE   0xc00
 #define CSR_TIME0xc01
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 383be0a955..ac01c835e1 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -46,6 +46,10 @@ void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops)
 static int fs(CPURISCVState *env, int csrno)
 {
 #if !defined(CONFIG_USER_ONLY)
+/* loose check condition for fcsr in vector extension */
+if ((csrno == CSR_FCSR) && (env->misa & RVV)) {
+return 0;
+}
 if (!env->debugger && !riscv_cpu_fp_enabled(env)) {
 return -1;
 }
@@ -53,6 +57,14 @@ static int fs(CPURISCVState *env, int csrno)
 return 0;
 }
 
+static int vs(CPURISCVState *env, int csrno)
+{
+if (env->misa & RVV) {
+return 0;
+}
+return -1;
+}
+
 static int ctr(CPURISCVState *env, int csrno)
 {
 #if !defined(CONFIG_USER_ONLY)
@@ -154,6 +166,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val)
 #endif
 *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
 | (env->frm << FSR_RD_SHIFT);
+if (vs(env, csrno) >= 0) {
+*val |= (env->vxrm << FSR_VXRM_SHIFT)
+| (env->vxsat << FSR_VXSAT_SHIFT);
+}
 return 0;
 }
 
@@ -166,10 +182,62 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
 env->mstatus |= MSTATUS_FS;
 #endif
 env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
+if (vs(env, csrno) >= 0) {
+env->vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
+env->vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
+}
 riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
 return 0;
 }
 
+static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->vtype;
+return 0;
+}
+
+static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->vl;
+return 0;
+}
+
+static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->vxrm;
+return 0;
+}
+
+static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->vxrm = val;
+return 0;
+}
+
+static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->vxsat;
+return 0;
+}
+
+static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->vxsat = val;
+return 0;
+}
+
+static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->vstart;
+return 0;
+}
+
+static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->vstart = val;
+return 0;
+}
+
 /* User Timers and Counters */
 static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
 {
@@ -1183,7 +1251,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_FFLAGS] =  { fs,   read_fflags,  write_fflags  },
 [CSR_FRM] = { fs,   read_frm, write_frm },
 [CSR_FCSR] ={ fs,   read_fcsr,write_fcsr},
-
+/* Vector CSRs */
+[CSR_VSTART] =  { vs,   read_vstart,  write_vstart  },
+[CSR_VXSAT] =   { vs,   read_vxsat,   write_vxsat   },
+[CSR_VXRM] ={ vs,   read_vxrm,write_vxrm},
+[CSR_VL] =  { vs,   read_vl },
+[CSR_VTYPE] =   { vs,   read_vtype  },
 /* User Timers and Counters */
 [CSR_CYCLE] =   { ctr,  read_instret},
 [CSR_INSTRET] = { ctr,  read_instr
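
The fcsr view used by the patch above can be summarised with a small sketch. The vxsat (bit 8) and vxrm (bits 10:9) positions are taken from the FSR_VXSAT_SHIFT and FSR_VXRM_SHIFT definitions in the diff; fflags in bits 4:0 and frm in bits 7:5 follow the standard RISC-V floating-point CSR layout. The function and macro names below are hypothetical and are not part of the patch.

```c
#include <stdint.h>

#define SK_FFLAGS_SHIFT 0       /* fflags: bits 4:0  */
#define SK_FRM_SHIFT    5       /* frm:    bits 7:5  */
#define SK_VXSAT_SHIFT  8       /* vxsat:  bit 8     */
#define SK_VXRM_SHIFT   9       /* vxrm:   bits 10:9 */

/* Assemble an fcsr value from its separate fields. */
static uint32_t fcsr_pack(uint32_t fflags, uint32_t frm,
                          uint32_t vxrm, uint32_t vxsat)
{
    return ((fflags & 0x1f) << SK_FFLAGS_SHIFT)
         | ((frm & 0x7) << SK_FRM_SHIFT)
         | ((vxsat & 0x1) << SK_VXSAT_SHIFT)
         | ((vxrm & 0x3) << SK_VXRM_SHIFT);
}

/* Extract the fixed-point rounding mode from an fcsr value. */
static uint32_t fcsr_get_vxrm(uint32_t fcsr)
{
    return (fcsr >> SK_VXRM_SHIFT) & 0x3;
}

/* Extract the fixed-point saturation flag from an fcsr value. */
static uint32_t fcsr_get_vxsat(uint32_t fcsr)
{
    return (fcsr >> SK_VXSAT_SHIFT) & 0x1;
}
```

This mirrors why the same state is reachable both through the dedicated vxrm and vxsat CSRs and through fcsr when the vector extension is present.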

[PATCH v9 02/61] target/riscv: implementation-defined constant parameters

2020-06-10 Thread LIU Zhiwei

vlen is the vector register length in bits.
elen is the max element size in bits.
vext_spec is the vector specification version, default value is v0.7.1.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu.c | 7 +++
 target/riscv/cpu.h | 5 +
 2 files changed, 12 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 3a6d202d03..1af79404fa 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -106,6 +106,11 @@ static void set_priv_version(CPURISCVState *env, int priv_ver)
 env->priv_ver = priv_ver;
 }
 
+static void set_vext_version(CPURISCVState *env, int vext_ver)
+{
+env->vext_ver = vext_ver;
+}
+
 static void set_feature(CPURISCVState *env, int feature)
 {
 env->features |= (1ULL << feature);
@@ -361,6 +366,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 CPURISCVState *env = &cpu->env;
 RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
 int priv_version = PRIV_VERSION_1_11_0;
+int vext_version = VEXT_VERSION_0_07_1;
 target_ulong target_misa = 0;
 Error *local_err = NULL;
 
@@ -384,6 +390,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 }
 
 set_priv_version(env, priv_version);
+set_vext_version(env, vext_version);
 
 if (cpu->cfg.mmu) {
 set_feature(env, RISCV_FEATURE_MMU);
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0018a79fa3..302e0859a0 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -78,6 +78,8 @@ enum {
 #define PRIV_VERSION_1_10_0 0x00011000
 #define PRIV_VERSION_1_11_0 0x00011100
 
+#define VEXT_VERSION_0_07_1 0x0701
+
 #define TRANSLATE_PMP_FAIL 2
 #define TRANSLATE_FAIL 1
 #define TRANSLATE_SUCCESS 0
@@ -113,6 +115,7 @@ struct CPURISCVState {
 target_ulong guest_phys_fault_addr;
 
 target_ulong priv_ver;
+target_ulong vext_ver;
 target_ulong misa;
 target_ulong misa_mask;
 
@@ -275,6 +278,8 @@ typedef struct RISCVCPU {
 
 char *priv_spec;
 char *user_spec;
+uint16_t vlen;
+uint16_t elen;
 bool mmu;
 bool pmp;
 } cfg;
-- 
2.23.0

[PATCH v9 04/61] target/riscv: add vector configure instruction

2020-06-10 Thread LIU Zhiwei

vsetvl and vsetvli are two configure instructions for vl and vtype. TB flags
should be updated after configure instructions. The (ill, lmul, sew) of vtype
and the bit of (VSTART == 0 && VL == VLMAX) will be placed within tb_flags.
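
As a worked example of how vtype determines VLMAX: with LMUL = 1 << vlmul and SEW = 8 << vsew bits, VLMAX = LMUL * VLEN / SEW, which reduces to VLEN >> (vsew + 3 - vlmul). The sketch below assumes the field layout this patch defines (vlmul in bits 1:0, vsew in bits 4:2); the function names are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* vtype fields, per the layout assumed above. */
static unsigned vtype_vlmul(uint64_t vtype)
{
    return (unsigned)(vtype & 0x3);         /* bits 1:0 */
}

static unsigned vtype_vsew(uint64_t vtype)
{
    return (unsigned)((vtype >> 2) & 0x7);  /* bits 4:2 */
}

/*
 * VLMAX = (1 << vlmul) * VLEN / (8 << vsew)
 *       = VLEN >> (vsew + 3 - vlmul)
 * The shift count is never negative because vlmul is at most 3.
 */
static uint32_t vlmax_sketch(uint32_t vlen_bits, uint64_t vtype)
{
    return vlen_bits >> (vtype_vsew(vtype) + 3 - vtype_vlmul(vtype));
}
```

For VLEN = 128: SEW = 8 with LMUL = 1 gives 16 elements, SEW = 32 with LMUL = 8 gives 32 elements, and SEW = 64 with LMUL = 1 gives 2 elements.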

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/Makefile.objs  |  2 +-
 target/riscv/cpu.h  | 63 +---
 target/riscv/helper.h   |  2 +
 target/riscv/insn32.decode  |  5 ++
 target/riscv/insn_trans/trans_rvv.inc.c | 79 +
 target/riscv/translate.c| 17 +-
 target/riscv/vector_helper.c| 53 +
 7 files changed, 209 insertions(+), 12 deletions(-)
 create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
 create mode 100644 target/riscv/vector_helper.c

diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
index ff651f69f6..ff38df6219 100644
--- a/target/riscv/Makefile.objs
+++ b/target/riscv/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o
+obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o
 obj-$(CONFIG_SOFTMMU) += pmp.o
 
 ifeq ($(CONFIG_SOFTMMU),y)
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 302e0859a0..0ad51c6580 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -21,6 +21,7 @@
 #define RISCV_CPU_H
 
 #include "hw/core/cpu.h"
+#include "hw/registerfields.h"
 #include "exec/cpu-defs.h"
 #include "fpu/softfloat-types.h"
 
@@ -93,6 +94,12 @@ typedef struct CPURISCVState CPURISCVState;
 
 #define RV_VLEN_MAX 512
 
+FIELD(VTYPE, VLMUL, 0, 2)
+FIELD(VTYPE, VSEW, 2, 3)
+FIELD(VTYPE, VEDIV, 5, 2)
+FIELD(VTYPE, RESERVED, 7, sizeof(target_ulong) * 8 - 9)
+FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 2, 1)
+
 struct CPURISCVState {
 target_ulong gpr[32];
 uint64_t fpr[32]; /* assume both F and D extensions */
@@ -352,19 +359,62 @@ void riscv_cpu_set_fflags(CPURISCVState *env, target_ulong);
 #define TB_FLAGS_MMU_MASK   3
 #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
 
+typedef CPURISCVState CPUArchState;
+typedef RISCVCPU ArchCPU;
+#include "exec/cpu-all.h"
+
+FIELD(TB_FLAGS, VL_EQ_VLMAX, 2, 1)
+FIELD(TB_FLAGS, LMUL, 3, 2)
+FIELD(TB_FLAGS, SEW, 5, 3)
+FIELD(TB_FLAGS, VILL, 8, 1)
+
+/*
+ * A simplification for VLMAX
+ * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
+ * = (VLEN << LMUL) / (8 << SEW)
+ * = (VLEN << LMUL) >> (SEW + 3)
+ * = VLEN >> (SEW + 3 - LMUL)
+ */
+static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, target_ulong vtype)
+{
+uint8_t sew, lmul;
+
+sew = FIELD_EX64(vtype, VTYPE, VSEW);
+lmul = FIELD_EX64(vtype, VTYPE, VLMUL);
+return cpu->cfg.vlen >> (sew + 3 - lmul);
+}
+
 static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
-target_ulong *cs_base, uint32_t *flags)
+target_ulong *cs_base, uint32_t *pflags)
 {
+uint32_t flags = 0;
+
 *pc = env->pc;
 *cs_base = 0;
+
+if (riscv_has_ext(env, RVV)) {
+uint32_t vlmax = vext_get_vlmax(env_archcpu(env), env->vtype);
+bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl);
+flags = FIELD_DP32(flags, TB_FLAGS, VILL,
+FIELD_EX64(env->vtype, VTYPE, VILL));
+flags = FIELD_DP32(flags, TB_FLAGS, SEW,
+FIELD_EX64(env->vtype, VTYPE, VSEW));
+flags = FIELD_DP32(flags, TB_FLAGS, LMUL,
+FIELD_EX64(env->vtype, VTYPE, VLMUL));
+flags = FIELD_DP32(flags, TB_FLAGS, VL_EQ_VLMAX, vl_eq_vlmax);
+} else {
+flags = FIELD_DP32(flags, TB_FLAGS, VILL, 1);
+}
+
 #ifdef CONFIG_USER_ONLY
-*flags = TB_FLAGS_MSTATUS_FS;
+flags |= TB_FLAGS_MSTATUS_FS;
 #else
-*flags = cpu_mmu_index(env, 0);
+flags |= cpu_mmu_index(env, 0);
 if (riscv_cpu_fp_enabled(env)) {
-*flags |= env->mstatus & MSTATUS_FS;
+flags |= env->mstatus & MSTATUS_FS;
 }
 #endif
+*pflags = flags;
 }
 
 int riscv_csrrw(CPURISCVState *env, int csrno, target_ulong *ret_value,
@@ -405,9 +455,4 @@ void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops);
 
 void riscv_cpu_register_gdb_regs_for_features(CPUState *cs);
 
-typedef CPURISCVState CPUArchState;
-typedef RISCVCPU ArchCPU;
-
-#include "exec/cpu-all.h"
-
 #endif /* RISCV_CPU_H */
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index debb22a480..3c28c7e407 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -76,3 +76,5 @@ DEF_HELPER_2(mret, tl, env, tl)
 DEF_HELPER_1(wfi, void, env)
 DEF_HELPER_1(tlb_flush, void, env)
 #endif
+/* Vector functions */
+DEF_HELPER_3(vsetvl, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b883672e63..53340bdbc4 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -62,6 +62,7
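
As a sanity check of the VLMAX simplification in the cpu.h comment of this
patch, here is a minimal standalone sketch in plain C. It is not QEMU code:
the names vlmax_direct and vlmax_shift are hypothetical, VLEN is passed in
bits, and sew/lmul are assumed to be the raw vtype field encodings
(SEW = 8 << sew bits, LMUL = 1 << lmul), matching the formula in the comment.

```c
#include <stdint.h>

/*
 * Direct form from the comment: VLMAX = (1 << LMUL) * VLEN / (8 * (1 << SEW)).
 * vlen_bits, sew and lmul are illustrative parameters, not QEMU state.
 */
static uint32_t vlmax_direct(uint32_t vlen_bits, unsigned sew, unsigned lmul)
{
    return ((uint32_t)1 << lmul) * vlen_bits / ((uint32_t)8 << sew);
}

/*
 * Simplified form: VLMAX = VLEN >> (SEW + 3 - LMUL).
 * With sew in 0..7 and lmul in 0..3 the shift count is never negative.
 */
static uint32_t vlmax_shift(uint32_t vlen_bits, unsigned sew, unsigned lmul)
{
    return vlen_bits >> (sew + 3 - lmul);
}
```

For example, VLEN = 128 with 8-bit elements and LMUL = 1 (sew = 0, lmul = 0)
gives 16 elements in both forms.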

[PATCH v9 05/61] target/riscv: add an internals.h header

2020-06-10 Thread LIU Zhiwei

internals.h keeps things that are relevant only to the implementation,
not to the actual architecture, separate.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/internals.h | 24 
 1 file changed, 24 insertions(+)
 create mode 100644 target/riscv/internals.h

diff --git a/target/riscv/internals.h b/target/riscv/internals.h
new file mode 100644
index 00..22a49af413
--- /dev/null
+++ b/target/riscv/internals.h
@@ -0,0 +1,24 @@
+/*
+ * QEMU RISC-V CPU -- internal functions and types
+ *
+ * Copyright (c) 2020 T-Head Semiconductor Co., Ltd. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#ifndef RISCV_CPU_INTERNALS_H
+#define RISCV_CPU_INTERNALS_H
+
+#include "hw/registerfields.h"
+
+#endif
-- 
2.23.0

[PATCH v9 07/61] target/riscv: add vector index load and store instructions

2020-06-10 Thread LIU Zhiwei

Vector indexed operations add the contents of each element of the
vector offset operand specified by vs2 to the base effective address
to give the effective address of each element.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   |  35 +++
 target/riscv/insn32.decode  |  13 +++
 target/riscv/insn_trans/trans_rvv.inc.c | 129 
 target/riscv/vector_helper.c| 116 +
 4 files changed, 293 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 87dfa90609..f9b3da60ca 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -183,3 +183,38 @@ DEF_HELPER_6(vsse_v_b, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse_v_h, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse_v_w, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(vsse_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(vlxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxbu_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxbu_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxbu_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxbu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxhu_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxhu_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxhu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxwu_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vlxwu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vsxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index ef521152c5..bc36df33b5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -241,6 +241,19 @@ vssh_v ... 010 . . . 101 . 0100111 @r_nfvm
 vssw_v ... 010 . . . 110 . 0100111 @r_nfvm
 vsse_v ... 010 . . . 111 . 0100111 @r_nfvm
 
+vlxb_v ... 111 . . . 000 . 111 @r_nfvm
+vlxh_v ... 111 . . . 101 . 111 @r_nfvm
+vlxw_v ... 111 . . . 110 . 111 @r_nfvm
+vlxe_v ... 011 . . . 111 . 111 @r_nfvm
+vlxbu_v... 011 . . . 000 . 111 @r_nfvm
+vlxhu_v... 011 . . . 101 . 111 @r_nfvm
+vlxwu_v... 011 . . . 110 . 111 @r_nfvm
+# Vector ordered-indexed and unordered-indexed store insns.
+vsxb_v ... -11 . . . 000 . 0100111 @r_nfvm
+vsxh_v ... -11 . . . 101 . 0100111 @r_nfvm
+vsxw_v ... -11 . . . 110 . 0100111 @r_nfvm
+vsxe_v ... -11 . . . 111 . 0100111 @r_nfvm
+
 # *** new major opcode OP-V ***
 vsetvli 0 ... . 111 . 1010111  @r2_zimm
 vsetvl  100 . . 111 . 1010111  @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index f9950ad5a0..c3a79c5232 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -432,3 +432,132 @@ GEN_VEXT_TRANS(vssb_v, 0, rnfvm, st_stride_op, st_stride_check)
 GEN_VEXT_TRANS(vssh_v, 1, rnfvm, st_stride_op, st_stride_check)
 GEN_VEXT_TRANS(vssw_v, 2, rnfvm, st_stride_op, st_stride_check)
 GEN_VEXT_TRANS(vsse_v, 3, rnfvm, st_stride_op, st_stride_check)
+
+/*
+ *** index load and store
+ */
+typedef void gen_helper_ldst_index(TCGv_ptr, TCGv_ptr, TCGv,
+   TCGv_ptr, TCGv_env, TCGv_i32);
+
+static bool ldst_index_trans(uin
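
The indexed addressing rule from the commit message (element address =
base + offset taken from vs2) can be illustrated with a small standalone
sketch. This is not the helper from the patch: gather_u8 is a hypothetical
name, guest memory is modelled as a plain byte array, offsets are treated as
unsigned 32-bit values, and masking and segment handling are left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Indexed (gather) load of vl bytes: element i is read from
 * mem[base + vs2[i]]. All parameters are illustrative stand-ins
 * for guest memory and vector registers.
 */
static void gather_u8(uint8_t *vd, const uint8_t *mem, size_t base,
                      const uint32_t *vs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        vd[i] = mem[base + vs2[i]];
    }
}
```

An indexed store would mirror this loop with the assignment reversed.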

Re: Clarification regarding new qemu-img convert --target-is-zero flag

2020-06-10 Thread Sam Eiderman

I see,

I thought qemu-img (by default) checks the virtual size of the disk
before starting to copy allocated data, zeroes out all of the virtual
size (slowly) and then writes all the allocated data except for
zeroes.

But from what I understand now, qemu-img finds that the target is raw
and can not be efficiently zeroed, so it just writes all the allocated
data, including zeroes, leaving unallocated gaps in the virtual size
unwritten.

I have an image of 800MB VMDK with virtual size of 24GB

So if the following:
qemu-img convert "${IMAGE_PATH}" -p -O raw -S 512b /dev/sdc 2>&1
Takes roughly 3 minutes and 40 seconds (qemu 3.1.0)

And:
qemu-img convert "${IMAGE_PATH}" -n --target-is-zero -p -O raw /dev/sdc 2>&1
Takes roughly 2 seconds (qemu 5.0.0)

This means that probably there are ~23GB of zeroes *allocated* in this VMDK,
I'll check that.

Sam


On Wed, Jun 10, 2020 at 2:37 PM Kevin Wolf  wrote:
>
> Am 10.06.2020 um 08:28 hat Sam Eiderman geschrieben:
> > Hi,
> >
> > My target format is a Persistent Disk on GCP.
> > https://cloud.google.com/persistent-disk
> >
> > And my use case is converting VMDKs to PDs so I'm just using qemu-img
> > for the conversion (not using qemu as a hypervisor).
> >
> > Luckily PDs are zeroed out when allocated but I was asking to
> > understand the restrictions of qemu-img convert.
> >
> > It could be useful for qemu-img convert to not zero out the disk, but
> > do write allocated zeroes, I'm imagining cloud scenarios where instead
> > of virtual disks the customer receives an attached physical SSD device
> > that is not zeroed out beforehand (only encryption key changed, for
> > privacy/security sake) so reads will return garbage.
>
> But that's the default mode? Zeroing out the whole disk upfront is an
> optimisation that we do if efficient zeroing is possible, but if we
> can't, we just write explicit zeros where needed.
>
> --target-is-zero means that you promise that the target is already
> pre-zeroed so qemu-img can further optimise things. If you specify it
> and the target doesn't contain zeros, but random data, you get garbage.
>
> Kevin
>

[PATCH v4 01/21] exec: Introduce ram_block_discard_(disable|require)()

2020-06-10 Thread David Hildenbrand

We want to replace qemu_balloon_inhibit() by something more generic.
In particular, we want to make sure that technologies that rely on RAM
block discards working reliably run mutually exclusive with
technologies that effectively break them.

E.g., vfio will usually pin all guest memory, rendering virtio-balloon
basically useless and making the VM consume more memory than reported via
the balloon. While the balloon is special already (=> no guarantees, same
behavior possible after reboots and with huge pages), this will be
different, especially with virtio-mem.

Let's implement a way to make both types of technology run mutually
exclusive. We'll convert existing balloon inhibitors in successive
patches and add some new ones. Add the check to
qemu_balloon_is_inhibited() for now. We might want to make
virtio-balloon an actual inhibitor in the future; however, that
requires more thought to not break existing setups.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Richard Henderson 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
---
 balloon.c |  3 ++-
 exec.c| 52 +++
 include/exec/memory.h | 41 ++
 3 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/balloon.c b/balloon.c
index f104b42961..5fff79523a 100644
--- a/balloon.c
+++ b/balloon.c
@@ -40,7 +40,8 @@ static int balloon_inhibit_count;
 
 bool qemu_balloon_is_inhibited(void)
 {
-return atomic_read(&balloon_inhibit_count) > 0;
+return atomic_read(&balloon_inhibit_count) > 0 ||
+   ram_block_discard_is_disabled();
 }
 
 void qemu_balloon_inhibit(bool state)
diff --git a/exec.c b/exec.c
index be4be2df3a..c4c1d9df84 100644
--- a/exec.c
+++ b/exec.c
@@ -4051,4 +4051,56 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
 }
 }
 
+/*
+ * If positive, discarding RAM is disabled. If negative, discarding RAM is
+ * required to work and cannot be disabled.
+ */
+static int ram_block_discard_disabled;
+
+int ram_block_discard_disable(bool state)
+{
+int old;
+
+if (!state) {
+atomic_dec(&ram_block_discard_disabled);
+return 0;
+}
+
+do {
+old = atomic_read(&ram_block_discard_disabled);
+if (old < 0) {
+return -EBUSY;
+}
+} while (atomic_cmpxchg(&ram_block_discard_disabled, old, old + 1) != old);
+return 0;
+}
+
+int ram_block_discard_require(bool state)
+{
+int old;
+
+if (!state) {
+atomic_inc(&ram_block_discard_disabled);
+return 0;
+}
+
+do {
+old = atomic_read(&ram_block_discard_disabled);
+if (old > 0) {
+return -EBUSY;
+}
+} while (atomic_cmpxchg(&ram_block_discard_disabled, old, old - 1) != old);
+return 0;
+}
+
+bool ram_block_discard_is_disabled(void)
+{
+return atomic_read(&ram_block_discard_disabled) > 0;
+}
+
+bool ram_block_discard_is_required(void)
+{
+return atomic_read(&ram_block_discard_disabled) < 0;
+}
+
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3e00cdbbfa..eea7d284b9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2474,6 +2474,47 @@ static inline MemOp devend_memop(enum device_endian end)
 }
 #endif
 
+/*
+ * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
+ * to manage the actual amount of memory consumed by the VM (then, the memory
+ * provided by RAM blocks might be bigger than the desired memory consumption).
+ * This *must* be set if:
+ * - Discarding parts of a RAM blocks does not result in the change being
+ *   reflected in the VM and the pages getting freed.
+ * - All memory in RAM blocks is pinned or duplicated, invalidating any previous
+ *   discards blindly.
+ * - Discarding parts of a RAM blocks will result in integrity issues (e.g.,
+ *   encrypted VMs).
+ * Technologies that only temporarily pin the current working set of a
+ * driver are fine, because we don't expect such pages to be discarded
+ * (esp. based on guest action like balloon inflation).
+ *
+ * This is *not* to be used to protect from concurrent discards (esp.,
+ * postcopy).
+ *
+ * Returns 0 if successful. Returns -EBUSY if a technology that relies on
+ * discards to work reliably is active.
+ */
+int ram_block_discard_disable(bool state);
+
+/*
+ * Inhibit technologies that disable discarding of pages in RAM blocks.
+ *
+ * Returns 0 if successful. Returns -EBUSY if discards are already set to
+ * broken.
+ */
+int ram_block_discard_require(bool state);
+
+/*
+ * Test if discarding of memory in ram blocks is disabled.
+ */
+bool ram_block_discard_is_disabled(void);
+
+/*
+ * Test if discarding of memory in ram blocks is required to work reliably.
+ */
+bool ram_block_discard_is_required(void);
+
 #endif
 
 #endif
-- 
2.26.2
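
The signed-counter scheme in the exec.c hunk of this patch can be tried
outside of QEMU with C11 atomics. The sketch below is an approximation, not
the patch itself: the discard_* names are hypothetical, and
atomic_compare_exchange_weak() from <stdatomic.h> stands in for QEMU's
atomic_cmpxchg(). A positive counter means discards are disabled, a negative
one means they are required.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* > 0: discards disabled N times; < 0: discards required N times. */
static atomic_int discard_state;

static int discard_disable(bool state)
{
    if (!state) {
        atomic_fetch_sub(&discard_state, 1);
        return 0;
    }
    int old = atomic_load(&discard_state);
    do {
        if (old < 0) {
            return -EBUSY; /* somebody requires working discards */
        }
        /* On failure, old is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak(&discard_state, &old, old + 1));
    return 0;
}

static int discard_require(bool state)
{
    if (!state) {
        atomic_fetch_add(&discard_state, 1);
        return 0;
    }
    int old = atomic_load(&discard_state);
    do {
        if (old > 0) {
            return -EBUSY; /* discards are already disabled */
        }
    } while (!atomic_compare_exchange_weak(&discard_state, &old, old - 1));
    return 0;
}

static bool discard_is_disabled(void)
{
    return atomic_load(&discard_state) > 0;
}

static bool discard_is_required(void)
{
    return atomic_load(&discard_state) < 0;
}
```

A single counter is enough because the two groups never hold it at the same
time: whichever side arrives first moves the value away from zero, and the
other side then sees the wrong sign and gets -EBUSY.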

[PATCH v9 06/61] target/riscv: add vector stride load and store instructions

2020-06-10 Thread LIU Zhiwei

Vector strided operations access the first memory element at the base address,
and then access subsequent elements at address increments given by the byte
offset contained in the x register specified by rs2.

Vector unit-stride operations access elements stored contiguously in memory
starting from the base effective address. They can be seen as a special
case of strided operations.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/helper.h   | 105 ++
 target/riscv/insn32.decode  |  32 ++
 target/riscv/insn_trans/trans_rvv.inc.c | 355 
 target/riscv/internals.h|   5 +
 target/riscv/translate.c|   7 +
 target/riscv/vector_helper.c| 410 
 6 files changed, 914 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 3c28c7e407..87dfa90609 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -78,3 +78,108 @@ DEF_HELPER_1(tlb_flush, void, env)
 #endif
 /* Vector functions */
 DEF_HELPER_3(vsetvl, tl, env, tl, tl)
+DEF_HELPER_5(vlb_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlb_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlh_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlw_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlw_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlw_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlw_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vle_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsb_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsh_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsw_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsw_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsw_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vsw_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vse_
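
The address arithmetic described in the commit message of this patch can be
written down in a few lines of standalone C. This is only a sketch under
assumptions: strided_addr and unit_stride_addr are hypothetical names,
addresses are plain 64-bit integers, and the stride is a signed byte offset
as it would come from the rs2 register.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Strided access: element i lives at base + i * stride (stride in bytes).
 * The multiplication is done in unsigned arithmetic so that negative
 * strides wrap around as expected instead of overflowing a signed type.
 */
static uint64_t strided_addr(uint64_t base, int64_t stride, size_t i)
{
    return base + (uint64_t)stride * (uint64_t)i;
}

/* Unit-stride access: a strided access whose stride is the element size. */
static uint64_t unit_stride_addr(uint64_t base, size_t elem_bytes, size_t i)
{
    return strided_addr(base, (int64_t)elem_bytes, i);
}
```

With base 0x1000 and a stride of 16 bytes, element 3 is at 0x1030; a
unit-stride access to 4-byte elements puts element 5 at 0x1014.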

[PATCH v9 08/61] target/riscv: add fault-only-first unit stride load

2020-06-10 Thread LIU Zhiwei

The unit-stride fault-only-first load instructions are used to
vectorize loops with data-dependent exit conditions (while loops).
These instructions execute as a regular load except that they
will only take a trap on element 0.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Reviewed-by: Richard Henderson 
---
 target/riscv/helper.h   |  22 +
 target/riscv/insn32.decode  |   7 ++
 target/riscv/insn_trans/trans_rvv.inc.c |  73 
 target/riscv/vector_helper.c| 110 
 4 files changed, 212 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f9b3da60ca..72ba4d9bdb 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -218,3 +218,25 @@ DEF_HELPER_6(vsxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(vsxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_5(vlbff_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbff_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhff_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vleff_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vleff_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vleff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vleff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbuff_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbuff_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbuff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlbuff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhuff_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhuff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlhuff_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwuff_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlwuff_v_d, void, ptr, ptr, tl, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index bc36df33b5..b76c09c8c0 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -224,6 +224,13 @@ vle_v  ... 000 . 0 . 111 . 111 @r2_nfvm
 vlbu_v ... 000 . 0 . 000 . 111 @r2_nfvm
 vlhu_v ... 000 . 0 . 101 . 111 @r2_nfvm
 vlwu_v ... 000 . 0 . 110 . 111 @r2_nfvm
+vlbff_v... 100 . 1 . 000 . 111 @r2_nfvm
+vlhff_v... 100 . 1 . 101 . 111 @r2_nfvm
+vlwff_v... 100 . 1 . 110 . 111 @r2_nfvm
+vleff_v... 000 . 1 . 111 . 111 @r2_nfvm
+vlbuff_v   ... 000 . 1 . 000 . 111 @r2_nfvm
+vlhuff_v   ... 000 . 1 . 101 . 111 @r2_nfvm
+vlwuff_v   ... 000 . 1 . 110 . 111 @r2_nfvm
 vsb_v  ... 000 . 0 . 000 . 0100111 @r2_nfvm
 vsh_v  ... 000 . 0 . 101 . 0100111 @r2_nfvm
 vsw_v  ... 000 . 0 . 110 . 0100111 @r2_nfvm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index c3a79c5232..299b479ec1 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -561,3 +561,76 @@ GEN_VEXT_TRANS(vsxb_v, 0, rnfvm, st_index_op, st_index_check)
 GEN_VEXT_TRANS(vsxh_v, 1, rnfvm, st_index_op, st_index_check)
 GEN_VEXT_TRANS(vsxw_v, 2, rnfvm, st_index_op, st_index_check)
 GEN_VEXT_TRANS(vsxe_v, 3, rnfvm, st_index_op, st_index_check)
+
+/*
+ *** unit stride fault-only-first load
+ */
+static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data,
+   gen_helper_ldst_us *fn, DisasContext *s)
+{
+TCGv_ptr dest, mask;
+TCGv base;
+TCGv_i32 desc;
+
+TCGLabel *over = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
+
+dest = tcg_temp_new_ptr();
+mask = tcg_temp_new_ptr();
+base = tcg_temp_new();
+desc = tcg_const_i32(simd_desc(0, s->vlen / 8, data));
+
+gen_get_gpr(base, rs1);
+tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
+tcg_gen_addi_ptr(mask, cpu_env, vreg_ofs(s, 0));
+
+fn(dest, mask, base, cpu_env, desc);
+
+tcg_temp_free_ptr(dest);
+tcg_temp_free_ptr(mask);
+tcg_temp_free(base);
+tcg_temp_free_i32(desc);
+gen_set_label(over);
+return true;
+}
+
+static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
+{
+uint32_t data = 0;
+gen_helper_ldst_us *fn;
+static gen_helper_ldst_us * const fns[7][4] = {
+{ gen_helper_vlbff_v_b,  gen_helper_vlbff_v_h,
+  gen_helper_vlbff_v_w,  gen_helper_vlbff_v_d },
+{ NULL,  gen_helper_vlhff_v_h,
+  gen_helper_vlhff_v_w,  gen_he
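
The fault-only-first behaviour from the commit message can be modelled in a
few lines. The sketch below is not the helper from this patch: ldff_u8 is a
hypothetical name, guest memory is a byte array where any access past
mem_len counts as a fault, and a trap is signalled by returning -1. Per the
RISC-V vector specification, a fault on a later element does not trap but
shortens vl to the number of elements loaded so far.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Unit-stride fault-only-first load of up to vl bytes starting at base.
 * Returns the resulting vl, or -1 if element 0 itself faults (trap).
 */
static long ldff_u8(uint8_t *vd, const uint8_t *mem, size_t mem_len,
                    size_t base, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        if (base + i >= mem_len) {
            /* Fault: trap only for element 0, otherwise truncate vl. */
            return i == 0 ? -1 : (long)i;
        }
        vd[i] = mem[base + i];
    }
    return (long)vl;
}
```

A strlen-style loop would call this repeatedly, process the elements that
were actually loaded, and continue from base plus the returned vl.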

[PATCH v4 00/21] virtio-mem: Paravirtualized memory hot(un)plug

2020-06-10 Thread David Hildenbrand

This is the very basic, initial version of virtio-mem. More info on
virtio-mem in general can be found in the Linux kernel driver v2 posting
[1] and in patch #10. The Linux driver is currently on its way upstream.

This series is based on [3]:
"[PATCH v1] pc: Support coldplugging of virtio-pmem-pci devices on all
 buses"
And [4]:
"[PATCH v2] hmp: Make json format optional for qom-set"

The patches can be found at:
https://github.com/davidhildenbrand/qemu.git virtio-mem-v4

"The basic idea of virtio-mem is to provide a flexible,
cross-architecture memory hot(un)plug solution that avoids many limitations
imposed by existing technologies, architectures, and interfaces."

There are a lot of addons in the works (esp. protection of unplugged
memory, better hugepage support (esp. when reading unplugged memory),
resizeable memory backends, support for more architectures, ...); this is
the very basic version to get the ball rolling.

The first 8 patches make sure we don't have any sudden surprises e.g., if
somebody tries to pin all memory in RAM blocks, resulting in a higher
memory consumption than desired. The remaining patches add basic virtio-mem
along with support for x86-64. The last patch indicates to the guest OS
the maximum possible PFN using ACPI SRAT, such that Linux can properly
enable the swiotlb when booting only with DMA memory.

[1] https://lkml.kernel.org/r/20200311171422.10484-1-da...@redhat.com
[2] https://lkml.kernel.org/r/20200507140139.17083-1-da...@redhat.com
[3] https://lkml.kernel.org/r/20200525084511.51379-1-da...@redhat.com
[4] https://lkml.kernel.org/r/20200610075153.33892-1-da...@redhat.com

Based-on: <20200525084511.51379-1-da...@redhat.com>
Based-on: <20200610075153.33892-1-da...@redhat.com>
Cc: teawater 
Cc: Pankaj Gupta 

v3 -> v4:
- Adapt to virtio-mem config layout change (block size now is 64bit)
- Added "numa: Auto-enable NUMA when any memory devices are possible"

v2 -> v3:
- Rebased on upstream/[3]
- "virtio-mem: Exclude unplugged memory during migration"
-- Added
- "virtio-mem: Paravirtualized memory hot(un)plug"
-- Simplify bitmap operations, find consecutive areas
-- Tweak error messages
-- Reshuffle some checks
-- Minor cleanups
- "accel/kvm: Convert to ram_block_discard_disable()"
- "target/i386: sev: Use ram_block_discard_disable()"
-- Keep asserts clean of functional things

v1 -> v2:
- Rebased to object_property_*() changes
- "exec: Introduce ram_block_discard_(disable|require)()"
-- Change the function names and rephrase/add comments
- "virtio-balloon: Rip out qemu_balloon_inhibit()"
-- Add and use "migration_in_incoming_postcopy()"
- "migration/rdma: Use ram_block_discard_disable()"
-- Add a comment regarding pin_all vs. !pin_all
- "virtio-mem: Paravirtualized memory hot(un)plug"
-- Replace virtio_mem_discard_inhibited() by
   migration_in_incoming_postcopy()
-- Drop some asserts
-- Drop virtio_mem_bad_request(), use virtio_error() directly, printing
   more information
-- Replace "Note: Discarding should never fail ..." comments by
   error_report()
-- Replace virtio_stw_p() by cpu_to_le16()
-- Drop migration_addr and migration_block_size
-- Minor cleanups
- "linux-headers: update to contain virtio-mem"
-- Updated to latest v4 in Linux
- General changes
-- Fixup the users of the renamed ram_block_discard_(disable|require)
-- Use "X: cannot disable RAM discard"-styled error messages
- Added
-- "virtio-mem: Migration sanity checks"
-- "virtio-mem: Add trace events"

David Hildenbrand (21):
  exec: Introduce ram_block_discard_(disable|require)()
  vfio: Convert to ram_block_discard_disable()
  accel/kvm: Convert to ram_block_discard_disable()
  s390x/pv: Convert to ram_block_discard_disable()
  virtio-balloon: Rip out qemu_balloon_inhibit()
  target/i386: sev: Use ram_block_discard_disable()
  migration/rdma: Use ram_block_discard_disable()
  migration/colo: Use ram_block_discard_disable()
  linux-headers: update to contain virtio-mem
  virtio-mem: Paravirtualized memory hot(un)plug
  virtio-pci: Proxy for virtio-mem
  MAINTAINERS: Add myself as virtio-mem maintainer
  hmp: Handle virtio-mem when printing memory device info
  numa: Handle virtio-mem in NUMA stats
  pc: Support for virtio-mem-pci
  virtio-mem: Allow notifiers for size changes
  virtio-pci: Send qapi events when the virtio-mem size changes
  virtio-mem: Migration sanity checks
  virtio-mem: Add trace events
  virtio-mem: Exclude unplugged memory during migration
  numa: Auto-enable NUMA when any memory devices are possible

 MAINTAINERS |   8 +
 accel/kvm/kvm-all.c |   4 +-
 balloon.c   |  17 -
 exec.c  |  52 ++
 hw/arm/virt.c   |   2 +
 hw/core/numa.c  |  17 +-
 hw/i386/Kconfig |   1 +
 hw/i386/microvm.c   |   1 +
 hw/i386/pc.c|

[PATCH v4 06/21] target/i386: sev: Use ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

AMD SEV will pin all guest memory, so mark discarding of RAM as broken. At
the time this is called, nobody that relies on discards to work properly
can be active yet, but let's still implement error handling.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Signed-off-by: David Hildenbrand 
---
 target/i386/sev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 51cdbe5496..4a4863db28 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -649,6 +649,12 @@ sev_guest_init(const char *id)
 uint32_t host_cbitpos;
 struct sev_user_data_status status = {};
 
+ret = ram_block_discard_disable(true);
+if (ret) {
+error_report("%s: cannot disable RAM discard", __func__);
+return NULL;
+}
+
 sev_state = s = g_new0(SEVState, 1);
 s->sev_info = lookup_sev_guest_info(id);
 if (!s->sev_info) {
@@ -724,6 +730,7 @@ sev_guest_init(const char *id)
 err:
 g_free(sev_state);
 sev_state = NULL;
+ram_block_discard_disable(false);
 return NULL;
 }
 
-- 
2.26.2

[PATCH v4 02/21] vfio: Convert to ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

VFIO is (except devices without a physical IOMMU or some mediated devices)
incompatible with discarding of RAM. The kernel will pin basically all VM
memory. Let's convert to ram_block_discard_disable(), which can now
fail, in contrast to qemu_balloon_inhibit().

Leave "x-balloon-allowed" named as it is for now.

Cc: Cornelia Huck 
Cc: Alex Williamson 
Cc: Christian Borntraeger 
Cc: Tony Krowiak 
Cc: Halil Pasic 
Cc: Pierre Morel 
Cc: Eric Farman 
Signed-off-by: David Hildenbrand 
---
 hw/vfio/ap.c  | 10 +++
 hw/vfio/ccw.c | 11 
 hw/vfio/common.c  | 53 +++
 hw/vfio/pci.c |  6 ++--
 include/hw/vfio/vfio-common.h |  4 +--
 5 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 95564c17ed..d0b1bc7581 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -105,12 +105,12 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
 vapdev->vdev.dev = dev;
 
 /*
- * vfio-ap devices operate in a way compatible with
- * memory ballooning, as no pages are pinned in the host.
- * This needs to be set before vfio_get_device() for vfio common to
- * handle the balloon inhibitor.
+ * vfio-ap devices operate in a way compatible with discarding of memory in
+ * RAM blocks, as no pages are pinned in the host. This needs to be
+ * set before vfio_get_device() for vfio common to handle
+ * ram_block_discard_disable().
  */
-vapdev->vdev.balloon_allowed = true;
+vapdev->vdev.ram_block_discard_allowed = true;
 
 ret = vfio_get_device(vfio_group, mdevid, &vapdev->vdev, errp);
 if (ret) {
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 63406184d2..82857f1615 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -418,12 +418,13 @@ static void vfio_ccw_get_device(VFIOGroup *group, VFIOCCWDevice *vcdev,
 
 /*
  * All vfio-ccw devices are believed to operate in a way compatible with
- * memory ballooning, ie. pages pinned in the host are in the current
- * working set of the guest driver and therefore never overlap with pages
- * available to the guest balloon driver.  This needs to be set before
- * vfio_get_device() for vfio common to handle the balloon inhibitor.
+ * discarding of memory in RAM blocks, ie. pages pinned in the host are
+ * in the current working set of the guest driver and therefore never
+ * overlap e.g., with pages available to the guest balloon driver.  This
+ * needs to be set before vfio_get_device() for vfio common to handle
+ * ram_block_discard_disable().
  */
-vcdev->vdev.balloon_allowed = true;
+vcdev->vdev.ram_block_discard_allowed = true;
 
 if (vfio_get_device(group, vcdev->cdev.mdevid, &vcdev->vdev, errp)) {
 goto out_err;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0b3593b3c0..33357140b8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -33,7 +33,6 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qemu/range.h"
-#include "sysemu/balloon.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
 #include "trace.h"
@@ -1215,31 +1214,36 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
 space = vfio_get_address_space(as);
 
 /*
- * VFIO is currently incompatible with memory ballooning insofar as the
+ * VFIO is currently incompatible with discarding of RAM insofar as the
  * madvise to purge (zap) the page from QEMU's address space does not
  * interact with the memory API and therefore leaves stale virtual to
  * physical mappings in the IOMMU if the page was previously pinned.  We
- * therefore add a balloon inhibit for each group added to a container,
+ * therefore set discarding broken for each group added to a container,
  * whether the container is used individually or shared.  This provides
  * us with options to allow devices within a group to opt-in and allow
- * ballooning, so long as it is done consistently for a group (for instance
+ * discarding, so long as it is done consistently for a group (for instance
  * if the device is an mdev device where it is known that the host vendor
  * driver will never pin pages outside of the working set of the guest
- * driver, which would thus not be ballooning candidates).
+ * driver, which would thus not be discarding candidates).
  *
  * The first opportunity to induce pinning occurs here where we attempt to
  * attach the group to existing containers within the AddressSpace.  If any
- * pages are already zapped from the virtual address space, such as from a
- * previous ballooning opt-in, new pinning will cause valid mappings to be
+ * pages are already zapped from the virtual address space, such as from
+ * previous discards, new pinning will cause valid mappings to be
  * re-established.  Likewise, when the overall MemoryListen

Re: [PATCH v2 2/8] MAINTAINERS: Mark SH4 based R2D & Shix machines orphan

2020-06-10 Thread Aleksandar Markovic

On Mon, Jun 8, 2020 at 11:03 Philippe Mathieu-Daudé wrote:
>
> Last commit from Magnus Damm is fc8e320ef583, which date is
> Fri Nov 13 2009.  As nobody else seems to care about the patches
> posted [*] related to the R2D and Shix machines, mark them orphan.
>
> Many thanks to Magnus for his substantial contributions to QEMU,
> and for introducing these SH4 based machines!
>
> [*] https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg08519.html
>
> Cc: Magnus Damm 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

I think, regarding both patches 1 and 2 of this series, we just made
things overly complicated.

I suggest simply replacing Aurelien's and Magnus' names with
Yoshimori's, with the possible exception of adding the line:

+F: include/hw/sh4/sh_intc.h

And that's it!

And let's finish this unpleasant episode!

Regards,
Aleksandar

P.S. I now expect that Thomas will complain about my usage of the
words "unpleasant" and "overly".

>  MAINTAINERS | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 49d90c70de..a012d9b74e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1250,14 +1250,15 @@ SH4 Machines
>  
>  R2D
>  M: Magnus Damm 
> -S: Maintained
> +S: Orphan
>  F: hw/sh4/r2d.c
>  F: hw/intc/sh_intc.c
>  F: hw/timer/sh_timer.c
> +F: include/hw/sh4/sh_intc.h
>
>  Shix
>  M: Magnus Damm 
> -S: Odd Fixes
> +S: Orphan
>  F: hw/sh4/shix.c
>
>  SPARC Machines
> --
> 2.21.3
>
>

[PATCH v4 03/21] accel/kvm: Convert to ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

Without KVM_CAP_SYNC_MMU, discarding memory does not work as expected. At the time this is called,
we cannot have anyone active that relies on discards to work properly.

Reviewed-by: Dr. David Alan Gilbert 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
---
 accel/kvm/kvm-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d06cc04079..fa18b2caae 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -40,7 +40,6 @@
 #include "trace.h"
 #include "hw/irq.h"
 #include "sysemu/sev.h"
-#include "sysemu/balloon.h"
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
@@ -2143,7 +2142,8 @@ static int kvm_init(MachineState *ms)
 
 s->sync_mmu = !!kvm_vm_check_extension(kvm_state, KVM_CAP_SYNC_MMU);
 if (!s->sync_mmu) {
-qemu_balloon_inhibit(true);
+ret = ram_block_discard_disable(true);
+assert(!ret);
 }
 
 return 0;
-- 
2.26.2

[PATCH v4 08/21] migration/colo: Use ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

COLO will copy all memory in a RAM block, so disable discarding of RAM.

Reviewed-by: Dr. David Alan Gilbert 
Tested-by: Lukas Straub 
Cc: "Michael S. Tsirkin" 
Cc: Hailiang Zhang 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 include/migration/colo.h |  2 +-
 migration/migration.c|  8 +++-
 migration/savevm.c   | 11 +--
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 1636e6f907..768e1f04c3 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -25,7 +25,7 @@ void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
 /* loadvm */
-void migration_incoming_enable_colo(void);
+int migration_incoming_enable_colo(void);
 void migration_incoming_disable_colo(void);
 bool migration_incoming_colo_enabled(void);
 void *colo_process_incoming_thread(void *opaque);
diff --git a/migration/migration.c b/migration/migration.c
index 14856cc930..0f6799f5d2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -338,12 +338,18 @@ bool migration_incoming_colo_enabled(void)
 
 void migration_incoming_disable_colo(void)
 {
+ram_block_discard_disable(false);
 migration_colo_enabled = false;
 }
 
-void migration_incoming_enable_colo(void)
+int migration_incoming_enable_colo(void)
 {
+if (ram_block_discard_disable(true)) {
+error_report("COLO: cannot disable RAM discard");
+return -EBUSY;
+}
 migration_colo_enabled = true;
+return 0;
 }
 
 void migrate_add_address(SocketAddress *address)
diff --git a/migration/savevm.c b/migration/savevm.c
index c00a6807d9..19b4f9600d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2111,8 +2111,15 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
 
 static int loadvm_process_enable_colo(MigrationIncomingState *mis)
 {
-migration_incoming_enable_colo();
-return colo_init_ram_cache();
+int ret = migration_incoming_enable_colo();
+
+if (!ret) {
+ret = colo_init_ram_cache();
+if (ret) {
+migration_incoming_disable_colo();
+}
+}
+return ret;
 }
 
 /*
-- 
2.26.2

[PATCH v4 05/21] virtio-balloon: Rip out qemu_balloon_inhibit()

2020-06-10 Thread David Hildenbrand

The only remaining special case is postcopy. It cannot handle
concurrent discards yet, which would result in requesting already sent
pages from the source. Special-case it in virtio-balloon instead.

Introduce migration_in_incoming_postcopy(), to find out if incoming
postcopy is active.
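
The resulting inhibit check can be sketched in isolation. This is only a
minimal model, assuming the incoming postcopy states are ordered as in the
toy enum below (the TOY_ names and both helpers are hypothetical, not QEMU
API); the range check mirrors the one added to migration.c in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the incoming postcopy states; order is assumed. */
typedef enum ToyPostcopyState {
    TOY_POSTCOPY_INCOMING_NONE = 0,
    TOY_POSTCOPY_INCOMING_ADVISE,
    TOY_POSTCOPY_INCOMING_DISCARD,
    TOY_POSTCOPY_INCOMING_LISTENING,
    TOY_POSTCOPY_INCOMING_RUNNING,
    TOY_POSTCOPY_INCOMING_END
} ToyPostcopyState;

/* True from the DISCARD state up to (but not including) END. */
bool toy_in_incoming_postcopy(ToyPostcopyState ps)
{
    return ps >= TOY_POSTCOPY_INCOMING_DISCARD &&
           ps < TOY_POSTCOPY_INCOMING_END;
}

/* The balloon must not discard if discards are disabled or postcopy runs. */
bool toy_balloon_inhibited(bool discard_disabled, ToyPostcopyState ps)
{
    return discard_disabled || toy_in_incoming_postcopy(ps);
}
```

With this split, postcopy no longer needs its own inhibit vote; the balloon
simply asks at the time it wants to discard.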

Cc: "Michael S. Tsirkin" 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 balloon.c  | 18 --
 hw/virtio/virtio-balloon.c |  8 +++-
 include/migration/misc.h   |  2 ++
 include/sysemu/balloon.h   |  2 --
 migration/migration.c  |  7 +++
 migration/postcopy-ram.c   | 23 ---
 6 files changed, 16 insertions(+), 44 deletions(-)

diff --git a/balloon.c b/balloon.c
index 5fff79523a..354408c6ea 100644
--- a/balloon.c
+++ b/balloon.c
@@ -36,24 +36,6 @@
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
 static void *balloon_opaque;
-static int balloon_inhibit_count;
-
-bool qemu_balloon_is_inhibited(void)
-{
-return atomic_read(&balloon_inhibit_count) > 0 ||
-   ram_block_discard_is_disabled();
-}
-
-void qemu_balloon_inhibit(bool state)
-{
-if (state) {
-atomic_inc(&balloon_inhibit_count);
-} else {
-atomic_dec(&balloon_inhibit_count);
-}
-
-assert(atomic_read(&balloon_inhibit_count) >= 0);
-}
 
 static bool have_balloon(Error **errp)
 {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 065cd450f1..5ce2f956df 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -63,6 +63,12 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 return pbp->base_gpa == base_gpa;
 }
 
+static bool virtio_balloon_inhibited(void)
+{
+/* Postcopy cannot deal with concurrent discards, so it's special. */
+return ram_block_discard_is_disabled() || migration_in_incoming_postcopy();
+}
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
  MemoryRegion *mr, hwaddr mr_offset,
  PartiallyBalloonedPage *pbp)
@@ -360,7 +366,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 
 trace_virtio_balloon_handle_output(memory_region_name(section.mr),
pa);
-if (!qemu_balloon_is_inhibited()) {
+if (!virtio_balloon_inhibited()) {
 if (vq == s->ivq) {
 balloon_inflate_page(s, section.mr,
  section.offset_within_region, &pbp);
diff --git a/include/migration/misc.h b/include/migration/misc.h
index d2762257aa..34e7d75713 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -69,6 +69,8 @@ bool migration_has_failed(MigrationState *);
 /* ...and after the device transmission */
 bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_global_dump(Monitor *mon);
+/* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
+bool migration_in_incoming_postcopy(void);
 
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index aea0c44985..20a2defe3a 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -23,7 +23,5 @@ typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
  QEMUBalloonStatus *stat_func, void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
-bool qemu_balloon_is_inhibited(void);
-void qemu_balloon_inhibit(bool state);
 
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index b63ad91d34..14856cc930 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1772,6 +1772,13 @@ bool migration_in_postcopy_after_devices(MigrationState *s)
 return migration_in_postcopy() && s->postcopy_after_devices;
 }
 
+bool migration_in_incoming_postcopy(void)
+{
+PostcopyState ps = postcopy_state_get();
+
+return ps >= POSTCOPY_INCOMING_DISCARD && ps < POSTCOPY_INCOMING_END;
+}
+
 bool migration_is_idle(void)
 {
 MigrationState *s = current_migration;
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index a36402722b..b41a9fe2fd 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -27,7 +27,6 @@
 #include "qemu/notify.h"
 #include "qemu/rcu.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/balloon.h"
 #include "qemu/error-report.h"
 #include "trace.h"
 #include "hw/boards.h"
@@ -520,20 +519,6 @@ int postcopy_ram_incoming_init(MigrationIncomingState *mis)
 return 0;
 }
 
-/*
- * Manage a single vote to the QEMU balloon inhibitor for all postcopy usage,
- * last caller wins.
- */
-static void postcopy_balloon_inhibit(bool state)
-{
-static bool cur_state = false;
-
-if (state != cur_state) {
-qemu_balloon_inhibit(state);
-

[PATCH v4 04/21] s390x/pv: Convert to ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

Discarding RAM does not work as expected with protected VMs. Let's
switch to ram_block_discard_disable() for now, as we want to get rid
of qemu_balloon_inhibit(). Note that it will currently never fail, but
might fail in the future with new technologies (e.g., virtio-mem).
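
Because the call can now fail, every later error path has to drop the
reference again. A minimal sketch of that unwinding order follows; the
ToyMachine type, its fields and the two fail_* knobs are hypothetical and
only stand in for the migration blocker and the PV enable step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state touched when entering protected mode. */
typedef struct ToyMachine {
    int discard_disabled;   /* references held on "discards disabled" */
    bool blocker_added;     /* migration blocker registered */
    bool protected_mode;    /* PV enabled */
    bool fail_blocker;      /* knob: make the blocker step fail */
    bool fail_enable;       /* knob: make the enable step fail */
} ToyMachine;

int toy_machine_protect(ToyMachine *m)
{
    /* Step 1: disable discards (cannot fail in this simplified model). */
    m->discard_disabled++;

    /* Step 2: add the migration blocker; undo step 1 on failure. */
    if (m->fail_blocker) {
        m->discard_disabled--;
        return -EBUSY;
    }
    m->blocker_added = true;

    /* Step 3: enable protection; undo steps 2 and 1 on failure. */
    if (m->fail_enable) {
        m->blocker_added = false;
        m->discard_disabled--;
        return -EINVAL;
    }
    m->protected_mode = true;
    return 0;
}

void toy_machine_unprotect(ToyMachine *m)
{
    assert(m->protected_mode);
    m->protected_mode = false;
    m->blocker_added = false;
    m->discard_disabled--;
}
```

Whatever step fails, the counter ends up where it started, so a failed
transition does not leave discards disabled forever.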

Cc: Richard Henderson 
Cc: Cornelia Huck 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: Janosch Frank 
Signed-off-by: David Hildenbrand 
---
 hw/s390x/s390-virtio-ccw.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 60b16fef77..c985bb56eb 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -43,7 +43,6 @@
 #include "hw/qdev-properties.h"
 #include "hw/s390x/tod.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/balloon.h"
 #include "hw/s390x/pv.h"
 #include "migration/blocker.h"
 
@@ -329,7 +328,7 @@ static void s390_machine_unprotect(S390CcwMachineState *ms)
 ms->pv = false;
 migrate_del_blocker(pv_mig_blocker);
 error_free_or_abort(&pv_mig_blocker);
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 }
 
 static int s390_machine_protect(S390CcwMachineState *ms)
@@ -338,17 +337,22 @@ static int s390_machine_protect(S390CcwMachineState *ms)
 int rc;
 
/*
-* Ballooning on protected VMs needs support in the guest for
-* sharing and unsharing balloon pages. Block ballooning for
-* now, until we have a solution to make at least Linux guests
-* either support it or fail gracefully.
+* Discarding of memory in RAM blocks does not work as expected with
+* protected VMs. Sharing and unsharing pages would be required. Disable
+* it for now, until we have a solution to make at least Linux
+* guests either support it (e.g., virtio-balloon) or fail gracefully.
 */
-qemu_balloon_inhibit(true);
+rc = ram_block_discard_disable(true);
+if (rc) {
+error_report("protected VMs: cannot disable RAM discard");
+return rc;
+}
+
 error_setg(&pv_mig_blocker,
"protected VMs are currently not migrateable.");
 rc = migrate_add_blocker(pv_mig_blocker, &local_err);
 if (rc) {
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 error_report_err(local_err);
 error_free_or_abort(&pv_mig_blocker);
 return rc;
@@ -357,7 +361,7 @@ static int s390_machine_protect(S390CcwMachineState *ms)
 /* Create SE VM */
 rc = s390_pv_vm_enable();
 if (rc) {
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 migrate_del_blocker(pv_mig_blocker);
 error_free_or_abort(&pv_mig_blocker);
 return rc;
-- 
2.26.2

[PATCH v4 11/21] virtio-pci: Proxy for virtio-mem

2020-06-10 Thread David Hildenbrand

Let's add a proxy for virtio-mem, make it a memory device, and
pass-through the properties.

Reviewed-by: Pankaj Gupta 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
---
 hw/virtio/Makefile.objs|   1 +
 hw/virtio/virtio-mem-pci.c | 129 +
 hw/virtio/virtio-mem-pci.h |  33 ++
 include/hw/pci/pci.h   |   1 +
 4 files changed, 164 insertions(+)
 create mode 100644 hw/virtio/virtio-mem-pci.c
 create mode 100644 hw/virtio/virtio-mem-pci.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 7df70e977e..b9661f9c01 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -19,6 +19,7 @@ obj-$(call land,$(CONFIG_VHOST_USER_FS),$(CONFIG_VIRTIO_PCI)) += vhost-user-fs-p
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 obj-$(CONFIG_VIRTIO_MEM) += virtio-mem.o
+common-obj-$(call land,$(CONFIG_VIRTIO_MEM),$(CONFIG_VIRTIO_PCI)) += virtio-mem-pci.o
 
 ifeq ($(CONFIG_VIRTIO_PCI),y)
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock-pci.o
diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
new file mode 100644
index 00..b325303b32
--- /dev/null
+++ b/hw/virtio/virtio-mem-pci.c
@@ -0,0 +1,129 @@
+/*
+ * Virtio MEM PCI device
+ *
+ * Copyright (C) 2020 Red Hat, Inc.
+ *
+ * Authors:
+ *  David Hildenbrand 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "virtio-mem-pci.h"
+#include "hw/mem/memory-device.h"
+#include "qapi/error.h"
+
+static void virtio_mem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+VirtIOMEMPCI *mem_pci = VIRTIO_MEM_PCI(vpci_dev);
+DeviceState *vdev = DEVICE(&mem_pci->vdev);
+
+qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void virtio_mem_pci_set_addr(MemoryDeviceState *md, uint64_t addr,
+Error **errp)
+{
+object_property_set_uint(OBJECT(md), addr, VIRTIO_MEM_ADDR_PROP, errp);
+}
+
+static uint64_t virtio_mem_pci_get_addr(const MemoryDeviceState *md)
+{
+return object_property_get_uint(OBJECT(md), VIRTIO_MEM_ADDR_PROP,
+&error_abort);
+}
+
+static MemoryRegion *virtio_mem_pci_get_memory_region(MemoryDeviceState *md,
+  Error **errp)
+{
+VirtIOMEMPCI *pci_mem = VIRTIO_MEM_PCI(md);
+VirtIOMEM *vmem = VIRTIO_MEM(&pci_mem->vdev);
+VirtIOMEMClass *vmc = VIRTIO_MEM_GET_CLASS(vmem);
+
+return vmc->get_memory_region(vmem, errp);
+}
+
+static uint64_t virtio_mem_pci_get_plugged_size(const MemoryDeviceState *md,
+Error **errp)
+{
+return object_property_get_uint(OBJECT(md), VIRTIO_MEM_SIZE_PROP,
+errp);
+}
+
+static void virtio_mem_pci_fill_device_info(const MemoryDeviceState *md,
+MemoryDeviceInfo *info)
+{
+VirtioMEMDeviceInfo *vi = g_new0(VirtioMEMDeviceInfo, 1);
+VirtIOMEMPCI *pci_mem = VIRTIO_MEM_PCI(md);
+VirtIOMEM *vmem = VIRTIO_MEM(&pci_mem->vdev);
+VirtIOMEMClass *vpc = VIRTIO_MEM_GET_CLASS(vmem);
+DeviceState *dev = DEVICE(md);
+
+if (dev->id) {
+vi->has_id = true;
+vi->id = g_strdup(dev->id);
+}
+
+/* let the real device handle everything else */
+vpc->fill_device_info(vmem, vi);
+
+info->u.virtio_mem.data = vi;
+info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM;
+}
+
+static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(klass);
+
+k->realize = virtio_mem_pci_realize;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_MEM;
+pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+pcidev_k->class_id = PCI_CLASS_OTHERS;
+
+mdc->get_addr = virtio_mem_pci_get_addr;
+mdc->set_addr = virtio_mem_pci_set_addr;
+mdc->get_plugged_size = virtio_mem_pci_get_plugged_size;
+mdc->get_memory_region = virtio_mem_pci_get_memory_region;
+mdc->fill_device_info = virtio_mem_pci_fill_device_info;
+}
+
+static void virtio_mem_pci_instance_init(Object *obj)
+{
+VirtIOMEMPCI *dev = VIRTIO_MEM_PCI(obj);
+
+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+TYPE_VIRTIO_MEM);
+object_property_add_alias(obj, VIRTIO_MEM_BLOCK_SIZE_PROP,
+  OBJECT(&dev->vdev), VIRTIO_MEM_BLOCK_SIZE_PROP);
+object_property_add_alias(obj, VIRTIO_MEM

[PATCH v4 07/21] migration/rdma: Use ram_block_discard_disable()

2020-06-10 Thread David Hildenbrand

RDMA will pin all guest memory (as documented in docs/rdma.txt). We want
to disable RAM block discards - however, to keep it simple use
ram_block_discard_is_required() instead of inhibiting.

Note: It is not sufficient to limit disabling to pin_all. Even when only
conditionally pinning 1 MB chunks, as soon as one page within such a
chunk was discarded and one page not, the discarded pages will be pinned
as well.
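
The relation between "disable" and "require" used here can be modelled with
one signed counter. This is a minimal sketch only, not the code from exec.c:
the toy_* names are hypothetical and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model: > 0 means discards are disabled by that many users,
 * < 0 means working discards are required by that many users, 0 means
 * nobody has expressed a preference.
 */
static int toy_discard_state;

/* Take (state == true) or drop (state == false) a "disabled" reference. */
int toy_discard_disable(bool state)
{
    if (!state) {
        assert(toy_discard_state > 0);
        toy_discard_state--;
        return 0;
    }
    if (toy_discard_state < 0) {
        return -EBUSY;          /* somebody requires working discards */
    }
    toy_discard_state++;
    return 0;
}

/* Take (state == true) or drop (state == false) a "required" reference. */
int toy_discard_require(bool state)
{
    if (!state) {
        assert(toy_discard_state < 0);
        toy_discard_state++;
        return 0;
    }
    if (toy_discard_state > 0) {
        return -EBUSY;          /* somebody has disabled discards */
    }
    toy_discard_state--;
    return 0;
}

bool toy_discard_is_disabled(void)
{
    return toy_discard_state > 0;
}

bool toy_discard_is_required(void)
{
    return toy_discard_state < 0;
}
```

In terms of this model, the RDMA code only queries the "required" state when
migration starts and bails out, instead of taking a "disabled" reference that
it would have to drop again later.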

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 migration/rdma.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index ec45d33ba3..bbe6f36627 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -29,6 +29,7 @@
 #include "qemu/sockets.h"
 #include "qemu/bitmap.h"
 #include "qemu/coroutine.h"
+#include "exec/memory.h"
 #include 
 #include 
 #include 
@@ -4017,8 +4018,14 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
 Error *local_err = NULL;
 
 trace_rdma_start_incoming_migration();
-rdma = qemu_rdma_data_init(host_port, &local_err);
 
+/* Avoid ram_block_discard_disable(), cannot change during migration. */
+if (ram_block_discard_is_required()) {
+error_setg(errp, "RDMA: cannot disable RAM discard");
+return;
+}
+
+rdma = qemu_rdma_data_init(host_port, &local_err);
 if (rdma == NULL) {
 goto err;
 }
@@ -4067,10 +4074,17 @@ void rdma_start_outgoing_migration(void *opaque,
 const char *host_port, Error **errp)
 {
 MigrationState *s = opaque;
-RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
 RDMAContext *rdma_return_path = NULL;
+RDMAContext *rdma;
 int ret = 0;
 
+/* Avoid ram_block_discard_disable(), cannot change during migration. */
+if (ram_block_discard_is_required()) {
+error_setg(errp, "RDMA: cannot disable RAM discard");
+return;
+}
+
+rdma = qemu_rdma_data_init(host_port, errp);
 if (rdma == NULL) {
 goto err;
 }
-- 
2.26.2

[PATCH v4 10/21] virtio-mem: Paravirtualized memory hot(un)plug

2020-06-10 Thread David Hildenbrand

This is the very basic/initial version of virtio-mem. An introduction to
virtio-mem can be found in the Linux kernel driver [1]. While it can be
used in the current state for hotplug of a smaller amount of memory, it
will heavily benefit from resizeable memory regions in the future.

Each virtio-mem device manages a memory region (provided via a memory
backend). Once the hypervisor requests a size ("requested-size"), the
guest can try to plug/unplug blocks of memory within that region in order
to reach the requested size. Initially, and after a reboot, all memory is
unplugged (except in special cases - reboot during postcopy).

The guest may only try to plug/unplug blocks of memory within the usable
region size. The usable region size is a little bigger than the
requested size, to give the device driver some flexibility. The usable
region size will only grow, except on reboots or when all memory is
requested to get unplugged. The guest can never plug more memory than
requested. Unplugged memory will get zapped/discarded, similar to what a
balloon device does.

The block size is variable, however, it is always chosen in a way such that
THP splits are avoided (e.g., 2MB). The state of each block
(plugged/unplugged) is tracked in a bitmap.
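
The per-block tracking can be illustrated with a small standalone model.
This is a sketch under simplifying assumptions, not the code in
hw/virtio/virtio-mem.c: ToyVirtioMem and the toy_vmem_* helpers are
hypothetical names, one call handles exactly one block, and no memory is
actually discarded.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical device model: one bit per block, set means "plugged". */
typedef struct ToyVirtioMem {
    uint64_t block_size;
    uint64_t nb_blocks;
    unsigned long *bitmap;
} ToyVirtioMem;

/* The region must be a non-zero multiple of the block size. */
bool toy_vmem_init(ToyVirtioMem *vm, uint64_t region_size, uint64_t block_size)
{
    if (!block_size || !region_size || region_size % block_size) {
        return false;
    }
    vm->block_size = block_size;
    vm->nb_blocks = region_size / block_size;
    vm->bitmap = calloc((vm->nb_blocks + TOY_BITS_PER_WORD - 1) /
                        TOY_BITS_PER_WORD, sizeof(unsigned long));
    return vm->bitmap != NULL;
}

bool toy_vmem_is_plugged(const ToyVirtioMem *vm, uint64_t block)
{
    assert(block < vm->nb_blocks);
    return (vm->bitmap[block / TOY_BITS_PER_WORD] >>
            (block % TOY_BITS_PER_WORD)) & 1UL;
}

/*
 * Plug or unplug the block starting at @offset. Fails for unaligned or
 * out-of-range offsets and for blocks that are already in that state.
 */
bool toy_vmem_set_plugged(ToyVirtioMem *vm, uint64_t offset, bool plugged)
{
    uint64_t block = offset / vm->block_size;

    if (offset % vm->block_size || block >= vm->nb_blocks ||
        toy_vmem_is_plugged(vm, block) == plugged) {
        return false;
    }
    vm->bitmap[block / TOY_BITS_PER_WORD] ^=
        1UL << (block % TOY_BITS_PER_WORD);
    return true;
}

/* The plugged size is the number of set bits times the block size. */
uint64_t toy_vmem_plugged_size(const ToyVirtioMem *vm)
{
    uint64_t block, count = 0;

    for (block = 0; block < vm->nb_blocks; block++) {
        count += toy_vmem_is_plugged(vm, block);
    }
    return count * vm->block_size;
}
```

With an 8 GiB region and 2 MiB blocks such a bitmap needs 4096 bits, i.e.
512 bytes.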

As virtio-mem devices (e.g., virtio-mem-pci) will be memory devices, we now
expose "VirtioMEMDeviceInfo" via "query-memory-devices".

--

There are two important follow-up items that are in the works:
1. Resizeable memory regions: Use resizeable allocations/RAM blocks to
   grow/shrink along with the usable region size. This avoids creating
   initially very big VMAs, RAM blocks, and KVM slots.
2. Protection of unplugged memory: Make sure the guest cannot actually
   make use of unplugged memory.

Other follow-up items that are in the works:
1. Exclude unplugged memory during migration (via precopy notifier).
2. Handle remapping of memory.
3. Support for other architectures.

--

Example usage (virtio-mem-pci is introduced in follow-up patches):

Start QEMU with two virtio-mem devices (one per NUMA node):
 $ qemu-system-x86_64 -m 4G,maxmem=20G \
  -smp sockets=2,cores=2 \
  -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \
  [...]
  -object memory-backend-ram,id=mem0,size=8G \
  -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,requested-size=0M \
  -object memory-backend-ram,id=mem1,size=8G \
  -device virtio-mem-pci,id=vm1,memdev=mem1,node=1,requested-size=1G

Query the configuration:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x14000
   node: 0
   requested-size: 0
   size: 0
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x34000
   node: 1
   requested-size: 1073741824
   size: 1073741824
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1

Add some memory to node 0:
 (qemu) qom-set vm0 requested-size 500M

Remove some memory from node 1:
 (qemu) qom-set vm1 requested-size 200M

Query the configuration again:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x14000
   node: 0
   requested-size: 524288000
   size: 524288000
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x34000
   node: 1
   requested-size: 209715200
   size: 209715200
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1
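
The byte values printed above follow directly from the MiB-based sizes given
on the monitor and from the 2 MiB block size. A small sketch of that
arithmetic (the toy_* helpers are hypothetical, for illustration only):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIB (UINT64_C(1) << 20)

/* Convert a size in MiB, as typed on the monitor ("500M"), to bytes. */
uint64_t toy_mib_to_bytes(uint64_t mib)
{
    return mib * TOY_MIB;
}

/* Number of blocks covering @size; @size must be a block-size multiple. */
uint64_t toy_nb_blocks(uint64_t size, uint64_t block_size)
{
    assert(block_size && size % block_size == 0);
    return size / block_size;
}
```

So "requested-size 500M" is 524288000 bytes or 250 blocks of 2097152 bytes,
and "requested-size 200M" is 209715200 bytes or 100 blocks.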

[1] https://lkml.kernel.org/r/20200311171422.10484-1-da...@redhat.com

Cc: "Michael S. Tsirkin" 
Cc: Eric Blake 
Cc: Markus Armbruster 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
---
 hw/virtio/Kconfig  |  11 +
 hw/virtio/Makefile.objs|   1 +
 hw/virtio/virtio-mem.c | 724 +
 include/hw/virtio/virtio-mem.h |  78 
 qapi/misc.json |  39 +-
 5 files changed, 852 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/virtio-mem.c
 create mode 100644 include/hw/virtio/virtio-mem.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index 83122424fa..0eda25c4e1 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -47,3 +47,14 @@ config VIRTIO_PMEM
 depends on VIRTIO
 depends on VIRTIO_PMEM_SUPPORTED
 select MEM_DEVICE
+
+config VIRTIO_MEM_SUPPORTED
+bool
+
+config VIRTIO_MEM
+bool
+default y
+depends on VIRTIO
+depends on LINUX
+depends on VIRTIO_MEM_SUPPORTED
+select MEM_DEVICE
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 4e4d39a0a4..7df70e977e 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -18,6 +18,7 @@ common-obj-$(call land,$(CONFIG_VIRTIO_PMEM),$(CONFIG_VIRTIO_PCI)) += virtio-pme
 obj-$(call land,$(CONFIG_VHOST_USER_FS),$(CONFIG_VIRTIO_PCI)) += vhost-user-fs-pci.o
 obj-$(CONFIG_V

[PATCH v4 14/21] numa: Handle virtio-mem in NUMA stats

2020-06-10 Thread David Hildenbrand

Account the memory to the configured nid.

Reviewed-by: Pankaj Gupta 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
---
 hw/core/numa.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 316bc50d75..06960918e7 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -812,6 +812,7 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
 MemoryDeviceInfoList *info;
 PCDIMMDeviceInfo *pcdimm_info;
 VirtioPMEMDeviceInfo *vpi;
+VirtioMEMDeviceInfo *vmi;
 
 for (info = info_list; info; info = info->next) {
 MemoryDeviceInfo *value = info->value;
@@ -832,6 +833,11 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
 node_mem[0].node_mem += vpi->size;
 node_mem[0].node_plugged_mem += vpi->size;
 break;
+case MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM:
+vmi = value->u.virtio_mem.data;
+node_mem[vmi->node].node_mem += vmi->size;
+node_mem[vmi->node].node_plugged_mem += vmi->size;
+break;
 default:
 g_assert_not_reached();
 }
-- 
2.26.2

[PATCH v4 09/21] linux-headers: update to contain virtio-mem

2020-06-10 Thread David Hildenbrand

To be merged hopefully soon. Then, we can replace this by a proper
header sync.

Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
---
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_mem.h | 211 
 2 files changed, 212 insertions(+)
 create mode 100644 include/standard-headers/linux/virtio_mem.h

diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index ecc27a1740..b052355ac7 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -44,6 +44,7 @@
 #define VIRTIO_ID_VSOCK19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO   20 /* virtio crypto */
 #define VIRTIO_ID_IOMMU23 /* virtio IOMMU */
+#define VIRTIO_ID_MEM  24 /* virtio mem */
 #define VIRTIO_ID_FS   26 /* virtio filesystem */
 #define VIRTIO_ID_PMEM 27 /* virtio pmem */
 #define VIRTIO_ID_MAC80211_HWSIM 29 /* virtio mac80211-hwsim */
diff --git a/include/standard-headers/linux/virtio_mem.h b/include/standard-headers/linux/virtio_mem.h
new file mode 100644
index 00..05e5ade75d
--- /dev/null
+++ b/include/standard-headers/linux/virtio_mem.h
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: BSD-3-Clause */
+/*
+ * Virtio Mem Device
+ *
+ * Copyright Red Hat, Inc. 2020
+ *
+ * Authors:
+ * David Hildenbrand 
+ *
+ * This header is BSD licensed so anyone can use the definitions
+ * to implement compatible drivers/servers:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifndef _LINUX_VIRTIO_MEM_H
+#define _LINUX_VIRTIO_MEM_H
+
+#include "standard-headers/linux/types.h"
+#include "standard-headers/linux/virtio_types.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "standard-headers/linux/virtio_config.h"
+
+/*
+ * Each virtio-mem device manages a dedicated region in physical address
+ * space. Each device can belong to a single NUMA node, multiple devices
+ * for a single NUMA node are possible. A virtio-mem device is like a
+ * "resizable DIMM" consisting of small memory blocks that can be plugged
+ * or unplugged. The device driver is responsible for (un)plugging memory
+ * blocks on demand.
+ *
+ * Virtio-mem devices can only operate on their assigned memory region in
+ * order to (un)plug memory. A device cannot (un)plug memory belonging to
+ * other devices.
+ *
+ * The "region_size" corresponds to the maximum amount of memory that can
+ * be provided by a device. The "size" corresponds to the amount of memory
+ * that is currently plugged. "requested_size" corresponds to a request
+ * from the device to the device driver to (un)plug blocks. The
+ * device driver should try to (un)plug blocks in order to reach the
+ * "requested_size". It is impossible to plug more memory than requested.
+ *
+ * The "usable_region_size" represents the memory region that can actually
+ * be used to (un)plug memory. It is always at least as big as the
+ * "requested_size" and will grow dynamically. It will only shrink when
+ * explicitly triggered (VIRTIO_MEM_REQ_UNPLUG).
+ *
+ * There are no guarantees what will happen if unplugged memory is
+ * read/written. Such memory should, in general, not be touched. E.g.,
+ * even writing might succeed, but the values will simply be discarded at
+ * random points in time.
+ *
+ * It can happen that the device cannot process a request, because it is
+ * busy. The device driver has to retry later.
+ *
+ * Usually, during system resets all memory wi

[PATCH v4 13/21] hmp: Handle virtio-mem when printing memory device info

2020-06-10 Thread David Hildenbrand

Print the memory device info just like for other memory devices.

Cc: "Dr. David Alan Gilbert" 
Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
---
 monitor/hmp-cmds.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 9c61e769ca..afc9a28069 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1818,6 +1818,7 @@ void hmp_info_memory_devices(Monitor *mon, const QDict 
*qdict)
 MemoryDeviceInfoList *info_list = qmp_query_memory_devices(&err);
 MemoryDeviceInfoList *info;
 VirtioPMEMDeviceInfo *vpi;
+VirtioMEMDeviceInfo *vmi;
 MemoryDeviceInfo *value;
 PCDIMMDeviceInfo *di;
 
@@ -1852,6 +1853,21 @@ void hmp_info_memory_devices(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "  size: %" PRIu64 "\n", vpi->size);
 monitor_printf(mon, "  memdev: %s\n", vpi->memdev);
 break;
+case MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM:
+vmi = value->u.virtio_mem.data;
+monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
+   MemoryDeviceInfoKind_str(value->type),
+   vmi->id ? vmi->id : "");
+monitor_printf(mon, "  memaddr: 0x%" PRIx64 "\n", 
vmi->memaddr);
+monitor_printf(mon, "  node: %" PRId64 "\n", vmi->node);
+monitor_printf(mon, "  requested-size: %" PRIu64 "\n",
+   vmi->requested_size);
+monitor_printf(mon, "  size: %" PRIu64 "\n", vmi->size);
+monitor_printf(mon, "  max-size: %" PRIu64 "\n", 
vmi->max_size);
+monitor_printf(mon, "  block-size: %" PRIu64 "\n",
+   vmi->block_size);
+monitor_printf(mon, "  memdev: %s\n", vmi->memdev);
+break;
 default:
 g_assert_not_reached();
 }
-- 
2.26.2

[PATCH v4 12/21] MAINTAINERS: Add myself as virtio-mem maintainer

2020-06-10 Thread David Hildenbrand

Let's make sure patches/bug reports find the right person.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Peter Maydell 
Cc: Markus Armbruster 
Signed-off-by: David Hildenbrand 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3abe3faa4e..4889485e6c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1762,6 +1762,14 @@ F: hw/virtio/virtio-crypto.c
 F: hw/virtio/virtio-crypto-pci.c
 F: include/hw/virtio/virtio-crypto.h
 
+virtio-mem
+M: David Hildenbrand 
+S: Supported
+F: hw/virtio/virtio-mem.c
+F: hw/virtio/virtio-mem-pci.h
+F: hw/virtio/virtio-mem-pci.c
+F: include/hw/virtio/virtio-mem.h
+
 nvme
 M: Keith Busch 
 L: qemu-bl...@nongnu.org
-- 
2.26.2

[PATCH v4 17/21] virtio-pci: Send qapi events when the virtio-mem size changes

2020-06-10 Thread David Hildenbrand

Let's register the notifier and trigger the qapi event with the right
device id.

MEMORY_DEVICE_SIZE_CHANGE is similar to BALLOON_CHANGE, but it is reported
per memory device.

Don't unregister the notifier (we neither have finalize() nor unrealize()
for VirtIOPCIProxy, so it's not that simple to do it) - both devices are
expected to vanish at the same time.
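
To illustrate the pattern described above, here is a minimal, self-contained
sketch of how a proxy with an embedded notifier can recover itself in the
callback and forward an optional id plus the new size. All Sketch*/sketch_*
names are hypothetical stand-ins, not QEMU's types, and the "emit" function
only records its arguments instead of sending a real QAPI event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's container_of(). */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *notifier, void *data);
};

/* Hypothetical proxy: the notifier is embedded, as in VirtIOMEMPCI. */
typedef struct {
    const char *id;                      /* NULL when no id was assigned */
    SketchNotifier size_change_notifier;
} SketchProxy;

/* Records what would be sent; stands in for the generated event emitter. */
static struct {
    bool has_id;
    const char *id;
    uint64_t size;
} sketch_last_event;

static void sketch_emit_size_change(bool has_id, const char *id, uint64_t size)
{
    sketch_last_event.has_id = has_id;
    sketch_last_event.id = id;
    sketch_last_event.size = size;
}

static void sketch_proxy_size_change_notify(SketchNotifier *notifier,
                                            void *data)
{
    /* Walk back from the embedded notifier to the proxy that owns it. */
    SketchProxy *proxy = SKETCH_CONTAINER_OF(notifier, SketchProxy,
                                             size_change_notifier);
    const uint64_t *size_p = data;

    assert(size_p != NULL);
    /* The id is optional, so a presence flag travels alongside it. */
    sketch_emit_size_change(proxy->id != NULL, proxy->id, *size_p);
}
```

Because the notifier lives inside the proxy structure, the callback needs no
extra lookup table to find the device whose id must be reported.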

Cc: "Michael S. Tsirkin" 
Cc: Markus Armbruster 
Cc: "Dr. David Alan Gilbert" 
Cc: Eric Blake 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
---
 hw/virtio/virtio-mem-pci.c | 28 
 hw/virtio/virtio-mem-pci.h |  1 +
 monitor/monitor.c  |  1 +
 qapi/misc.json | 25 +
 4 files changed, 55 insertions(+)

diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index b325303b32..1a8e854123 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c
@@ -14,6 +14,7 @@
 #include "virtio-mem-pci.h"
 #include "hw/mem/memory-device.h"
 #include "qapi/error.h"
+#include "qapi/qapi-events-misc.h"
 
 static void virtio_mem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 {
@@ -74,6 +75,21 @@ static void virtio_mem_pci_fill_device_info(const 
MemoryDeviceState *md,
 info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM;
 }
 
+static void virtio_mem_pci_size_change_notify(Notifier *notifier, void *data)
+{
+VirtIOMEMPCI *pci_mem = container_of(notifier, VirtIOMEMPCI,
+ size_change_notifier);
+DeviceState *dev = DEVICE(pci_mem);
+const uint64_t * const size_p = data;
+const char *id = NULL;
+
+if (dev->id) {
+id = dev->id;
+}
+
+qapi_event_send_memory_device_size_change(!!id, id, *size_p);
+}
+
 static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -98,9 +114,21 @@ static void virtio_mem_pci_class_init(ObjectClass *klass, 
void *data)
 static void virtio_mem_pci_instance_init(Object *obj)
 {
 VirtIOMEMPCI *dev = VIRTIO_MEM_PCI(obj);
+VirtIOMEMClass *vmc;
+VirtIOMEM *vmem;
 
 virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
 TYPE_VIRTIO_MEM);
+
+dev->size_change_notifier.notify = virtio_mem_pci_size_change_notify;
+vmem = VIRTIO_MEM(&dev->vdev);
+vmc = VIRTIO_MEM_GET_CLASS(vmem);
+/*
+ * We never remove the notifier again, as we expect both devices to
+ * disappear at the same time.
+ */
+vmc->add_size_change_notifier(vmem, &dev->size_change_notifier);
+
 object_property_add_alias(obj, VIRTIO_MEM_BLOCK_SIZE_PROP,
   OBJECT(&dev->vdev), VIRTIO_MEM_BLOCK_SIZE_PROP);
 object_property_add_alias(obj, VIRTIO_MEM_SIZE_PROP, OBJECT(&dev->vdev),
diff --git a/hw/virtio/virtio-mem-pci.h b/hw/virtio/virtio-mem-pci.h
index 8820cd6628..b51a28b275 100644
--- a/hw/virtio/virtio-mem-pci.h
+++ b/hw/virtio/virtio-mem-pci.h
@@ -28,6 +28,7 @@ typedef struct VirtIOMEMPCI VirtIOMEMPCI;
 struct VirtIOMEMPCI {
 VirtIOPCIProxy parent_obj;
 VirtIOMEM vdev;
+Notifier size_change_notifier;
 };
 
 #endif /* QEMU_VIRTIO_MEM_PCI_H */
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 125494410a..19dcb8fbe3 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -235,6 +235,7 @@ static MonitorQAPIEventConf 
monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
 [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
 [QAPI_EVENT_QUORUM_FAILURE]= { 1000 * SCALE_MS },
 [QAPI_EVENT_VSERPORT_CHANGE]   = { 1000 * SCALE_MS },
+[QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE] = { 1000 * SCALE_MS },
 };
 
 /*
diff --git a/qapi/misc.json b/qapi/misc.json
index e1c5547b65..4b25daeadb 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1432,6 +1432,31 @@
 ##
 { 'command': 'query-memory-devices', 'returns': ['MemoryDeviceInfo'] }
 
+##
+# @MEMORY_DEVICE_SIZE_CHANGE:
+#
+# Emitted when the size of a memory device changes. Only emitted for memory
+# devices that can actually change the size (e.g., virtio-mem due to guest
+# action).
+#
+# @id: device's ID
+# @size: the new size of memory that the device provides
+#
+# Note: this event is rate-limited.
+#
+# Since: 5.1
+#
+# Example:
+#
+# <- { "event": "MEMORY_DEVICE_SIZE_CHANGE",
+#  "data": { "id": "vm0", "size": 1073741824},
+#  "timestamp": { "seconds": 1588168529, "microseconds": 201316 } }
+#
+##
+{ 'event': 'MEMORY_DEVICE_SIZE_CHANGE',
+  'data': { '*id': 'str', 'size': 'size' } }
+
+
 ##
 # @MEM_UNPLUG_ERROR:
 #
-- 
2.26.2

[PATCH v4 15/21] pc: Support for virtio-mem-pci

2020-06-10 Thread David Hildenbrand

Let's wire it up similar to virtio-pmem. Also disallow unplug, so it's
harder for users to shoot themselves in the foot.

Reviewed-by: Pankaj Gupta 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Eric Blake 
Cc: Markus Armbruster 
Signed-off-by: David Hildenbrand 
---
 hw/i386/Kconfig |  1 +
 hw/i386/pc.c| 49 -
 2 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index c93f32f657..03e347b207 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -35,6 +35,7 @@ config PC
 select ACPI_PCI
 select ACPI_VMGENID
 select VIRTIO_PMEM_SUPPORTED
+select VIRTIO_MEM_SUPPORTED
 
 config PC_PCI
 bool
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c740495eb6..ee6368915b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -86,6 +86,7 @@
 #include "hw/net/ne2000-isa.h"
 #include "standard-headers/asm-x86/bootparam.h"
 #include "hw/virtio/virtio-pmem-pci.h"
+#include "hw/virtio/virtio-mem-pci.h"
 #include "hw/mem/memory-device.h"
 #include "sysemu/replay.h"
 #include "qapi/qmp/qerror.h"
@@ -1657,8 +1658,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 numa_cpu_pre_plug(cpu_slot, dev, errp);
 }
 
-static void pc_virtio_pmem_pci_pre_plug(HotplugHandler *hotplug_dev,
-DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev,
+  DeviceState *dev, Error **errp)
 {
 HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
 Error *local_err = NULL;
@@ -1669,7 +1670,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
  * order. We should never reach this point when hotplugging on x86,
  * however, better add a safety net.
  */
-error_setg(errp, "virtio-pmem-pci hotplug not supported on this bus.");
+error_setg(errp, "hotplug of virtio based memory devices not supported"
+   " on this bus.");
 return;
 }
 /*
@@ -1684,8 +1686,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
 error_propagate(errp, local_err);
 }
 
-static void pc_virtio_pmem_pci_plug(HotplugHandler *hotplug_dev,
-DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_plug(HotplugHandler *hotplug_dev,
+  DeviceState *dev, Error **errp)
 {
 HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
 Error *local_err = NULL;
@@ -1705,17 +1707,17 @@ static void pc_virtio_pmem_pci_plug(HotplugHandler 
*hotplug_dev,
 error_propagate(errp, local_err);
 }
 
-static void pc_virtio_pmem_pci_unplug_request(HotplugHandler *hotplug_dev,
-  DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_unplug_request(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
 {
-/* We don't support virtio pmem hot unplug */
-error_setg(errp, "virtio pmem device unplug not supported.");
+/* We don't support hot unplug of virtio based memory devices */
+error_setg(errp, "virtio based memory devices cannot be unplugged.");
 }
 
-static void pc_virtio_pmem_pci_unplug(HotplugHandler *hotplug_dev,
-  DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_unplug(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
 {
-/* We don't support virtio pmem hot unplug */
+/* We don't support hot unplug of virtio based memory devices */
 }
 
 static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
@@ -1725,8 +1727,9 @@ static void pc_machine_device_pre_plug_cb(HotplugHandler 
*hotplug_dev,
 pc_memory_pre_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 pc_cpu_pre_plug(hotplug_dev, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) {
-pc_virtio_pmem_pci_pre_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) ||
+   object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+pc_virtio_md_pci_pre_plug(hotplug_dev, dev, errp);
 }
 }
 
@@ -1737,8 +1740,9 @@ static void pc_machine_device_plug_cb(HotplugHandler 
*hotplug_dev,
 pc_memory_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 pc_cpu_plug(hotplug_dev, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) {
-pc_virtio_pmem_pci_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) ||
+   object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+pc_virtio_md_p

[PATCH v4 16/21] virtio-mem: Allow notifiers for size changes

2020-06-10 Thread David Hildenbrand

We want to send qapi events in case the size of a virtio-mem device
changes. This allows upper layers to always know how much memory is
actually currently consumed via a virtio-mem device.

Unfortunately, we have to report the id of our proxy device. Let's provide
an easy way for our proxy device to register, so it can send the qapi
events. Piggy-backing on the notifier infrastructure (although we'll
only ever have one notifier registered) seems to be an easy way.
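
As a rough illustration of the mechanism described above, the following
self-contained sketch shows a device that keeps a list of notifiers and
informs them whenever its size changes. The Sketch*/sketch_* names are
hypothetical, simplified stand-ins for QEMU's Notifier/NotifierList and are
not the real API. As a simplification, the sketch skips the notification when
the size is unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for a notifier and a notifier list. */
typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *notifier, void *data);
    SketchNotifier *next;
};

typedef struct {
    SketchNotifier *head;
} SketchNotifierList;

/* Hypothetical device state: only the fields the sketch needs. */
typedef struct {
    uint64_t size;
    SketchNotifierList size_change_notifiers;
} SketchMemDevice;

static void sketch_notifier_list_add(SketchNotifierList *list,
                                     SketchNotifier *n)
{
    assert(n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

static void sketch_notifier_list_notify(SketchNotifierList *list, void *data)
{
    for (SketchNotifier *n = list->head; n != NULL; n = n->next) {
        n->notify(n, data);
    }
}

/* Change the plugged size and tell every registered notifier about it. */
static void sketch_set_size(SketchMemDevice *dev, uint64_t new_size)
{
    if (dev->size == new_size) {
        return; /* nothing changed, so nothing to report */
    }
    dev->size = new_size;
    sketch_notifier_list_notify(&dev->size_change_notifiers, &dev->size);
}

/* Example listener: remembers the last size it was told about. */
static uint64_t sketch_last_seen_size;
static unsigned sketch_notify_count;

static void sketch_record_size(SketchNotifier *notifier, void *data)
{
    (void)notifier;
    sketch_last_seen_size = *(const uint64_t *)data;
    sketch_notify_count++;
}
```

The listener receives a pointer to the new size, so the device does not need
to know anything about who is listening or why.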

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
---
 hw/virtio/virtio-mem.c | 21 -
 include/hw/virtio/virtio-mem.h |  5 +
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index d8a0c974d3..2df33f9125 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -184,6 +184,7 @@ static int virtio_mem_state_change_request(VirtIOMEM *vmem, 
uint64_t gpa,
 } else {
 vmem->size -= size;
 }
+notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
 return VIRTIO_MEM_RESP_ACK;
 }
 
@@ -242,7 +243,10 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 return -EBUSY;
 }
 bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
-vmem->size = 0;
+if (vmem->size) {
+vmem->size = 0;
+notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
+}
 
 virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
 return 0;
@@ -561,6 +565,18 @@ static MemoryRegion 
*virtio_mem_get_memory_region(VirtIOMEM *vmem, Error **errp)
 return &vmem->memdev->mr;
 }
 
+static void virtio_mem_add_size_change_notifier(VirtIOMEM *vmem,
+Notifier *notifier)
+{
+notifier_list_add(&vmem->size_change_notifiers, notifier);
+}
+
+static void virtio_mem_remove_size_change_notifier(VirtIOMEM *vmem,
+   Notifier *notifier)
+{
+notifier_remove(notifier);
+}
+
 static void virtio_mem_get_size(Object *obj, Visitor *v, const char *name,
 void *opaque, Error **errp)
 {
@@ -668,6 +684,7 @@ static void virtio_mem_instance_init(Object *obj)
 VirtIOMEM *vmem = VIRTIO_MEM(obj);
 
 vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
+notifier_list_init(&vmem->size_change_notifiers);
 
 object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
 NULL, NULL, NULL);
@@ -705,6 +722,8 @@ static void virtio_mem_class_init(ObjectClass *klass, void 
*data)
 
 vmc->fill_device_info = virtio_mem_fill_device_info;
 vmc->get_memory_region = virtio_mem_get_memory_region;
+vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
+vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
 }
 
 static const TypeInfo virtio_mem_info = {
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 6981096f7c..b74c77cd42 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -64,6 +64,9 @@ typedef struct VirtIOMEM {
 
 /* block size and alignment */
 uint64_t block_size;
+
+/* notifiers to notify when "size" changes */
+NotifierList size_change_notifiers;
 } VirtIOMEM;
 
 typedef struct VirtIOMEMClass {
@@ -73,6 +76,8 @@ typedef struct VirtIOMEMClass {
 /* public */
 void (*fill_device_info)(const VirtIOMEM *vmen, VirtioMEMDeviceInfo *vi);
 MemoryRegion *(*get_memory_region)(VirtIOMEM *vmem, Error **errp);
+void (*add_size_change_notifier)(VirtIOMEM *vmem, Notifier *notifier);
+void (*remove_size_change_notifier)(VirtIOMEM *vmem, Notifier *notifier);
 } VirtIOMEMClass;
 
 #endif
-- 
2.26.2

[PATCH v4 19/21] virtio-mem: Add trace events

2020-06-10 Thread David Hildenbrand

Let's add some trace events that might come in handy later.
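
Each trace event added here pairs an argument list with a printf-style format
string built from <inttypes.h> macros. Those macros expand to string literals
that the compiler concatenates with their neighbours, so any separator between
fields has to be written inside one of the adjacent literals. A minimal sketch
of that formatting, with a hypothetical helper and plain snprintf() standing in
for the tracing backend:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the same fields as the plug-request event.
 * Note the leading space inside " nb_blocks=%"; without it the two fields
 * would run together in the output.
 */
static int sketch_format_plug_request(char *buf, size_t len,
                                      uint64_t addr, uint16_t nb_blocks)
{
    return snprintf(buf, len, "addr=0x%" PRIx64 " nb_blocks=%" PRIu16,
                    addr, nb_blocks);
}
```

For example, an address of 0x100000000 with two blocks is rendered as
"addr=0x100000000 nb_blocks=2".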

Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 hw/virtio/trace-events | 10 ++
 hw/virtio/virtio-mem.c | 10 +-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e83500bee9..c40ad5ea27 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -73,3 +73,13 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t 
sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, 
uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+
+# virtio-mem.c
+virtio_mem_send_response(uint16_t type) "type=%" PRIu16
+virtio_mem_plug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 " 
nb_blocks=%" PRIu16
+virtio_mem_unplug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 
" nb_blocks=%" PRIu16
+virtio_mem_unplugged_all(void) ""
+virtio_mem_unplug_all_request(void) ""
+virtio_mem_resized_usable_region(uint64_t old_size, uint64_t new_size) 
"old_size=0x%" PRIx64 " new_size=0x%" PRIx64
+virtio_mem_state_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 
" nb_blocks=%" PRIu16
+virtio_mem_state_response(uint16_t state) "state=%" PRIu16
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 450b8dc49d..468fb4170f 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -30,6 +30,7 @@
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
 #include "config-devices.h"
+#include "trace.h"
 
 /*
  * Use QEMU_VMALLOC_ALIGN, so no THP will have to be split when unplugging
@@ -100,6 +101,7 @@ static void virtio_mem_send_response(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 VirtIODevice *vdev = VIRTIO_DEVICE(vmem);
 VirtQueue *vq = vmem->vq;
 
+trace_virtio_mem_send_response(le16_to_cpu(resp->type));
 iov_from_buf(elem->in_sg, elem->in_num, 0, resp, sizeof(*resp));
 
 virtqueue_push(vq, elem, sizeof(*resp));
@@ -195,6 +197,7 @@ static void virtio_mem_plug_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 const uint16_t nb_blocks = le16_to_cpu(req->u.plug.nb_blocks);
 uint16_t type;
 
+trace_virtio_mem_plug_request(gpa, nb_blocks);
 type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, true);
 virtio_mem_send_response_simple(vmem, elem, type);
 }
@@ -206,6 +209,7 @@ static void virtio_mem_unplug_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 const uint16_t nb_blocks = le16_to_cpu(req->u.unplug.nb_blocks);
 uint16_t type;
 
+trace_virtio_mem_unplug_request(gpa, nb_blocks);
 type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, false);
 virtio_mem_send_response_simple(vmem, elem, type);
 }
@@ -225,6 +229,7 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem,
 return;
 }
 
+trace_virtio_mem_resized_usable_region(vmem->usable_region_size, newsize);
 vmem->usable_region_size = newsize;
 }
 
@@ -247,7 +252,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 vmem->size = 0;
 notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
 }
-
+trace_virtio_mem_unplugged_all();
 virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
 return 0;
 }
@@ -255,6 +260,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 static void virtio_mem_unplug_all_request(VirtIOMEM *vmem,
   VirtQueueElement *elem)
 {
+trace_virtio_mem_unplug_all_request();
 if (virtio_mem_unplug_all(vmem)) {
 virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_BUSY);
 } else {
@@ -272,6 +278,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 .type = cpu_to_le16(VIRTIO_MEM_RESP_ACK),
 };
 
+trace_virtio_mem_state_request(gpa, nb_blocks);
 if (!virtio_mem_valid_range(vmem, gpa, size)) {
 virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_ERROR);
 return;
@@ -284,6 +291,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 } else {
 resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_MIXED);
 }
+trace_virtio_mem_state_response(le16_to_cpu(resp.u.state.state));
 virtio_mem_send_response(vmem, elem, &resp);
 }
 
-- 
2.26.2

[PATCH v9 11/61] target/riscv: vector widening integer add and subtract

2020-06-10 Thread LIU Zhiwei

Signed-off-by: LIU Zhiwei 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/helper.h   |  49 +++
 target/riscv/insn32.decode  |  16 ++
 target/riscv/insn_trans/trans_rvv.inc.c | 186 
 target/riscv/vector_helper.c| 111 ++
 4 files changed, 362 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f791f2dbc6..608704850a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -294,3 +294,52 @@ DEF_HELPER_FLAGS_4(vec_rsubs8, TCG_CALL_NO_RWG, void, ptr, 
ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vec_rsubs16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vec_rsubs32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vec_rsubs64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_6(vwaddu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwaddu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwaddu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwadd_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_wv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwsub_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwaddu_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsubu_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwadd_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_wx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_wx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vwsub_wx_w, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index d1034a0e61..4bdbfd16fa 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -284,6 +284,22 @@ vsub_vv 10 . . . 000 . 1010111 
@r_vm
 vsub_vx 10 . . . 100 . 1010111 @r_vm
 vrsub_vx11 . . . 100 . 1010111 @r_vm
 vrsub_vi11 . . . 011 . 1010111 @r_vm
+vwaddu_vv   11 . . . 010 . 1010111 @r_vm
+vwaddu_vx   11 . . . 110 . 1010111 @r_vm
+vwadd_vv110001 . . . 010 . 1010111 @r_vm
+vwadd_vx110001 . . . 110 . 1010111 @r_vm
+vwsubu_vv   110010 . . . 010 . 1010111 @r_vm
+vwsubu_vx   110010 . . . 110 . 1010111 @r_vm
+vwsub_vv110011 . . . 010 . 1010111 @r_vm
+vwsub_vx110011 . . . 110 . 1010111 @r_vm
+vwaddu_wv   110100 . . . 010 . 1010111 @r_vm
+vwaddu_wx   110100 . . . 110 . 1010111 @r_vm
+vwadd_wv110101 . . . 010 . 1010111 @r_vm
+vwadd_wx110101 . . . 110 . 101011
