Re: Avocado error fetching QEMU boot_linux.py assets

2020-07-23 Thread Philippe Mathieu-Daudé
On 7/24/20 7:43 AM, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> [cross list post]
> 
> Using QEMU at commit 3cbc8970f5 I'm getting this error:
> 
> Fetching assets from tests/acceptance/boot_linux_console.py.
> Fetching assets from tests/acceptance/boot_linux.py.
> Traceback (most recent call last):
>   File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main
> "__main__", mod_spec)
>   File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/__main__.py",
> line 11, in <module>
> sys.exit(main.run())
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/core/app.py",
> line 91, in run
> return method(self.parser.config)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
> line 291, in run
> success, fail = fetch_assets(test_file)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
> line 200, in fetch_assets
> handler = FetchAssetHandler(test_file, klass, method)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
> line 65, in __init__
> self.visit(self.tree)
>   File "/usr/lib64/python3.7/ast.py", line 271, in visit
> return visitor(node)
>   File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
> self.visit(item)
>   File "/usr/lib64/python3.7/ast.py", line 271, in visit
> return visitor(node)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
> line 139, in visit_ClassDef
> self.generic_visit(node)
>   File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
> self.visit(item)
>   File "/usr/lib64/python3.7/ast.py", line 271, in visit
> return visitor(node)
>   File
> "/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
> line 171, in visit_Assign
> self.asgmts[cur_klass][cur_method][name] = node.value.s
> KeyError: 'launch_and_wait'

FYI here cur_klass='BootLinuxX8664' and name='chksum'.

> 
> Same if I revert these:
> 0f26d94ec9 ("tests/acceptance: skip s390x_ccw_vrtio_tcg on GitLab")
> 1c80c87c8c ("tests/acceptance: refactor boot_linux to allow code reuse")
> 
> If I remove boot_linux.py, all other files are processed correctly.
> 
> Any idea what can be wrong here?
> 
> Thanks,
> 
> Phil.
> 




Avocado error fetching QEMU boot_linux.py assets

2020-07-23 Thread Philippe Mathieu-Daudé
Hi,

[cross list post]

Using QEMU at commit 3cbc8970f5 I'm getting this error:

Fetching assets from tests/acceptance/boot_linux_console.py.
Fetching assets from tests/acceptance/boot_linux.py.
Traceback (most recent call last):
  File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
  File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/__main__.py",
line 11, in <module>
sys.exit(main.run())
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/core/app.py",
line 91, in run
return method(self.parser.config)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
line 291, in run
success, fail = fetch_assets(test_file)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
line 200, in fetch_assets
handler = FetchAssetHandler(test_file, klass, method)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
line 65, in __init__
self.visit(self.tree)
  File "/usr/lib64/python3.7/ast.py", line 271, in visit
return visitor(node)
  File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
self.visit(item)
  File "/usr/lib64/python3.7/ast.py", line 271, in visit
return visitor(node)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
line 139, in visit_ClassDef
self.generic_visit(node)
  File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
self.visit(item)
  File "/usr/lib64/python3.7/ast.py", line 271, in visit
return visitor(node)
  File
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
line 171, in visit_Assign
self.asgmts[cur_klass][cur_method][name] = node.value.s
KeyError: 'launch_and_wait'

Same if I revert these:
0f26d94ec9 ("tests/acceptance: skip s390x_ccw_vrtio_tcg on GitLab")
1c80c87c8c ("tests/acceptance: refactor boot_linux to allow code reuse")

If I remove boot_linux.py, all other files are processed correctly.

Any idea what can be wrong here?

Thanks,

Phil.




[Bug 1888601] Re: QEMU v5.1.0-rc0/rc1 hang with nested virtualization

2020-07-23 Thread Jason Wang
Hi:

It's not clear to me:

- Does the hang happen on the host or in the L1 guest?
- Is QEMU 5.1-rc0 used on the host or in the L1 guest?
- When do you see the hang? Just after launching the guest?
- Can you use gdb to get a call trace of QEMU when you see the hang?
- What are the kernel versions in the L1 and L2 guests?

Thanks

https://bugs.launchpad.net/bugs/1888601

Title:
  QEMU v5.1.0-rc0/rc1 hang with nested virtualization

Status in QEMU:
  New

Bug description:
  We're running Kata Containers using QEMU, and with v5.1.0-rc0 and -rc1
  we have noticed a problem at startup where QEMU appears to hang. We
  see this problem only on a VSI that supports nested virtualization,
  not on our bare metal nodes.

  We unfortunately see nothing at all in the QEMU logs to help
  understand the problem and a hung process is just a guess at this
  point.

  Using git bisect we first see the problem with...

  ---

  f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit
  commit f19bcdfedd53ee93412d535a842a89fa27cae7f2
  Author: Jason Wang 
  Date:   Wed Jul 1 22:55:28 2020 +0800

  virtio-pci: implement queue_enabled method

  With version 1, we can detect whether a queue is enabled via
  queue_enabled.

  Signed-off-by: Jason Wang 
  Signed-off-by: Cindy Lu 
  Message-Id: <20200701145538.22333-5-l...@redhat.com>
  Reviewed-by: Michael S. Tsirkin 
  Signed-off-by: Michael S. Tsirkin 
  Acked-by: Jason Wang 

   hw/virtio/virtio-pci.c | 13 +
   1 file changed, 13 insertions(+)

  ---

  Reverting this commit (on top of 5.1.0-rc1) seems to work and prevents
  the hang.

  ---

  Here's how kata ends up launching qemu in our environment --
  /opt/kata/bin/qemu-system-x86_64 -name 
sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 
6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu 
host,pmu=off -qmp 
unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait
 -m 4096M,slots=10,maxmem=30978M -device 
pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= 
-device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device 
virtconsole,chardev=charconsole0,id=console0 -chardev 
socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait
 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object 
rng-random,id=rng0,filename=/dev/urandom -device 
virtio-rng-pci,rng=rng0,romfile= -device 
virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev 
socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait
 -chardev 
socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock
 -device 
vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev 
tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device 
driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile=
 -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults 
-nographic -daemonize -object 
memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa 
node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 
-initrd 
/opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd
 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 
i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 
console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 debug 
panic=1 nr_cpus=4 agent.use_vsock=false scsi_mod.scan=none 
init=/usr/bin/kata-agent -pidfile 
/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/pid 
-D 
/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qemu.log
 -smp 2,cores=1,threads=1,sockets=4,maxcpus=4

  ---




[PATCH 7/7] ide: cancel pending callbacks on SRST

2020-07-23 Thread John Snow
The SRST implementation did not keep up with the rest of IDE; it is
possible to perform a weak reset on an IDE device to remove the BSY/DRQ
bits, and then issue writes to the control/device registers which can
cause chaos with the state machine.

Fix that by actually performing a real reset.
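
(Illustration only, not part of the patch.) The edge-triggered behaviour
can be sketched in a few lines of standalone C. This is a minimal model
under stated assumptions, not QEMU code: the names toy_bus,
toy_ctrl_write and TOY_CTRL_RESET are made up for the example, and only
the 0x04 bit value is taken from the series. It assumes the reset is
performed when the SRST bit goes from 1 to 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CTRL_RESET 0x04  /* same bit value as IDE_CTRL_RESET in the series */

/* Hypothetical stand-in for the bus state; not the real IDEBus. */
struct toy_bus {
    uint8_t ctrl;     /* last value written to the control register */
    int resets_done;  /* how many resets have been triggered so far */
};

/* Returns true if this write completed an SRST pulse (falling edge). */
static bool toy_ctrl_write(struct toy_bus *bus, uint8_t val)
{
    bool falling = (bus->ctrl & TOY_CTRL_RESET) && !(val & TOY_CTRL_RESET);

    if (falling) {
        bus->resets_done++;
    }
    bus->ctrl = val;  /* always remember the new value for the next write */
    return falling;
}
```

Setting the bit does nothing by itself in this model; only clearing it
again counts as a reset, and repeated writes of 0 do not retrigger it.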

Reported-by: Alexander Bulekov 
Fixes: https://bugs.launchpad.net/qemu/+bug/1878253
Fixes: https://bugs.launchpad.net/qemu/+bug/1887303
Fixes: https://bugs.launchpad.net/qemu/+bug/1887309
Signed-off-by: John Snow 
---
 hw/ide/core.c | 58 +++
 1 file changed, 40 insertions(+), 18 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index e4c69a7fde..4da689abdf 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2241,6 +2241,37 @@ uint32_t ide_status_read(void *opaque, uint32_t addr)
 return ret;
 }
 
+static void ide_perform_srst(IDEState *s)
+{
+s->status |= BUSY_STAT;
+
+/* Halt PIO (Via register state); PIO BH remains scheduled. */
+ide_transfer_halt(s);
+
+/* Cancel DMA -- may drain block device and invoke callbacks */
+ide_cancel_dma_sync(s);
+
+/* Cancel PIO callback, reset registers/signature, etc */
+ide_reset(s);
+
+if (s->drive_kind == IDE_CD) {
+/* ATAPI drives do not set READY or SEEK */
+s->status = 0x00;
+}
+}
+
+static void ide_bus_perform_srst(void *opaque)
+{
+IDEBus *bus = opaque;
+IDEState *s;
+int i;
+
+for (i = 0; i < 2; i++) {
+s = &bus->ifs[i];
+ide_perform_srst(s);
+}
+}
+
 void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val)
 {
 IDEBus *bus = opaque;
@@ -2249,26 +2280,17 @@ void ide_ctrl_write(void *opaque, uint32_t addr, 
uint32_t val)
 
 trace_ide_ctrl_write(addr, val, bus);
 
-/* common for both drives */
-if (!(bus->cmd & IDE_CTRL_RESET) &&
-(val & IDE_CTRL_RESET)) {
-/* reset low to high */
-for(i = 0;i < 2; i++) {
+/* Device0 and Device1 each have their own control register,
+ * but QEMU models it as just one register in the controller. */
+if ((bus->cmd & IDE_CTRL_RESET) &&
+!(val & IDE_CTRL_RESET)) {
+/* SRST triggers on falling edge */
+for (i = 0; i < 2; i++) {
 s = &bus->ifs[i];
-s->status = BUSY_STAT | SEEK_STAT;
-s->error = 0x01;
-}
-} else if ((bus->cmd & IDE_CTRL_RESET) &&
-   !(val & IDE_CTRL_RESET)) {
-/* high to low */
-for(i = 0;i < 2; i++) {
-s = &bus->ifs[i];
-if (s->drive_kind == IDE_CD)
-s->status = 0x00; /* NOTE: READY is _not_ set */
-else
-s->status = READY_STAT | SEEK_STAT;
-ide_set_signature(s);
+s->status |= BUSY_STAT;
 }
+aio_bh_schedule_oneshot(qemu_get_aio_context(),
+ide_bus_perform_srst, bus);
 }
 
 bus->cmd = val;
-- 
2.26.2




[PATCH 0/7] IDE: SRST and other fixes

2020-07-23 Thread John Snow
The goal of this series is to fix the Software Reset (SRST) routine.
That said, the first six patches are almost entirely unrelated...

Patches 2, 3, and 6 fix extremely minor deviations from the spec I
noticed while researching SRST. (One of them gets rid of a FIXME from
2003.)

Patches 1, 4, and 5 are very small code cleanups with no functional
changes; they should make patches 2, 3, and 6 more obvious to review.

Patch 7 fixes SRST; it depends on the other patches only for a changed
constant name. With a small rebase, it could be suitable for 5.1.

John Snow (7):
  ide: rename cmd_write to ctrl_write
  ide: don't tamper with the device register
  ide: model HOB correctly
  ide: reorder set/get sector functions
  ide: remove magic constants from the device register
  ide: clear interrupt on command write
  ide: cancel pending callbacks on SRST

 include/hw/ide/internal.h |  21 +--
 hw/ide/core.c | 124 +++---
 hw/ide/ioport.c   |   2 +-
 hw/ide/macio.c|   2 +-
 hw/ide/mmio.c |   8 +--
 hw/ide/pci.c  |  12 ++--
 hw/ide/trace-events   |   2 +-
 7 files changed, 106 insertions(+), 65 deletions(-)

-- 
2.26.2





[PATCH 3/7] ide: model HOB correctly

2020-07-23 Thread John Snow
I have been staring at this FIXME for years and I never knew what it
meant. I finally stumbled across it!

When writing to the command registers, the old value is shifted into a
HOB copy of the register and the new value is written into the primary
register. When reading registers, the value retrieved is dependent on
the HOB bit in the CONTROL register.

By setting bit 7 (0x80) in CONTROL, any register read will, if it has
one, yield the HOB value for that register instead.

Our code has a problem: We were using bit 7 of the DEVICE register to
model this. We already use bus->cmd roughly as the control register, as
it stores the value from ide_ctrl_write, so track the HOB bit there
instead.

Lastly, all command register writes reset the HOB, so fix that, too.
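
(Illustration only, not part of the patch.) The write/read rule
described above can be modelled in isolation. This is a rough sketch
under stated assumptions: toy_reg, toy_reg_write, toy_reg_read and
TOY_CTRL_HOB are hypothetical names, and only the 0x80 bit value comes
from the patch. It models a single register, whereas the real code
keeps one HOB copy per register and per device.

```c
#include <stdint.h>

#define TOY_CTRL_HOB 0x80  /* bit 7 of the control register, as in the patch */

/* Hypothetical model of one command-block register with its HOB copy. */
struct toy_reg {
    uint8_t cur;  /* most recently written value */
    uint8_t hob;  /* previously written value */
};

/* A write shifts the old value into the HOB copy and clears the HOB bit. */
static void toy_reg_write(struct toy_reg *r, uint8_t *ctrl, uint8_t val)
{
    *ctrl &= (uint8_t)~TOY_CTRL_HOB;
    r->hob = r->cur;
    r->cur = val;
}

/* A read returns the HOB copy only while the HOB bit is set in control. */
static uint8_t toy_reg_read(const struct toy_reg *r, uint8_t ctrl)
{
    return (ctrl & TOY_CTRL_HOB) ? r->hob : r->cur;
}
```

Writing 0x12 and then 0x34 leaves 0x34 visible by default; setting the
HOB bit in the control value makes the same read return 0x12, and the
next register write clears the bit again.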

Signed-off-by: John Snow 
---
 include/hw/ide/internal.h |  1 +
 hw/ide/core.c | 15 +++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 10ea6e1e23..16d806e0cf 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -58,6 +58,7 @@ typedef struct IDEDMAOps IDEDMAOps;
 #define TAG_MASK   0xf8
 
 /* Bits of Device Control register */
+#define IDE_CTRL_HOB0x80
 #define IDE_CTRL_RESET  0x04
 #define IDE_CTRL_DISABLE_IRQ0x02
 
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5cedebc408..a880b91b47 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1215,8 +1215,7 @@ static void ide_cmd_lba48_transform(IDEState *s, int 
lba48)
 static void ide_clear_hob(IDEBus *bus)
 {
 /* any write clears HOB high bit of device control register */
-bus->ifs[0].select &= ~(1 << 7);
-bus->ifs[1].select &= ~(1 << 7);
+bus->cmd &= ~(IDE_CTRL_HOB);
 }
 
 /* IOport [W]rite [R]egisters */
@@ -1256,12 +1255,14 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 return;
 }
 
+/* NOTE: Device0 and Device1 both receive incoming register writes.
+ * (They're on the same bus! They have to!) */
+
 switch (reg_num) {
 case 0:
 break;
 case ATA_IOPORT_WR_FEATURES:
 ide_clear_hob(bus);
-/* NOTE: data is written to the two drives */
 bus->ifs[0].hob_feature = bus->ifs[0].feature;
 bus->ifs[1].hob_feature = bus->ifs[1].feature;
 bus->ifs[0].feature = val;
@@ -1296,7 +1297,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 bus->ifs[1].hcyl = val;
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
-/* FIXME: HOB readback uses bit 7 */
+ide_clear_hob(bus);
 bus->ifs[0].select = val | 0xa0;
 bus->ifs[1].select = val | 0xa0;
 /* select drive */
@@ -1304,7 +1305,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 default:
 case ATA_IOPORT_WR_COMMAND:
-/* command */
+ide_clear_hob(bus);
 ide_exec_cmd(bus, val);
 break;
 }
@@ -2142,9 +2143,7 @@ uint32_t ide_ioport_read(void *opaque, uint32_t addr)
 int ret, hob;
 
 reg_num = addr & 7;
-/* FIXME: HOB readback uses bit 7, but it's always set right now */
-//hob = s->select & (1 << 7);
-hob = 0;
+hob = bus->cmd & (IDE_CTRL_HOB);
 switch (reg_num) {
 case ATA_IOPORT_RR_DATA:
 ret = 0xff;
-- 
2.26.2




[PATCH 5/7] ide: remove magic constants from the device register

2020-07-23 Thread John Snow
(In QEMU, we call this the "select" register.)

My memory isn't good enough to memorize what these magic runes
do. Label them to prevent mixups from happening in the future.

Side note: I assume it's safe to always set 0xA0 even though ATA2 claims
these bits are reserved, because ATA3 immediately reinstated that these
bits should be always on. ATA4 and subsequent specs only claim that the
fields are obsolete, so I assume it's safe to leave these set and that
it should work with the widest array of guests.
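
(Illustration only, not part of the patch.) As a quick check of how the
labelled bits carve up the register, here is a small standalone sketch.
The mask values are copied from this patch; the toy_* helper names are
hypothetical and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the patch; the helper names are hypothetical. */
#define ATA_DEV_SELECT    0x10
#define ATA_DEV_ALWAYS_ON 0xA0
#define ATA_DEV_LBA       0x40
#define ATA_DEV_HS        0x0F

/* Which device (0 or 1) the value addresses. */
static int toy_selected_unit(uint8_t select)
{
    return (select & ATA_DEV_SELECT) ? 1 : 0;
}

/* Whether the value requests LBA rather than CHS addressing. */
static bool toy_uses_lba(uint8_t select)
{
    return (select & ATA_DEV_LBA) != 0;
}

/* Head number in CHS mode (the same bits hold LBA 27:24 in LBA mode). */
static uint8_t toy_head(uint8_t select)
{
    return select & ATA_DEV_HS;
}
```

For example, a guest write of 0x15 is stored as 0xB5 once the always-on
bits are ORed in: device 1, CHS mode, head 5.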

Signed-off-by: John Snow 
---
 include/hw/ide/internal.h | 11 +++
 hw/ide/core.c | 26 ++
 2 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 16d806e0cf..d5a6ba1056 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -29,6 +29,17 @@ typedef struct IDEDMAOps IDEDMAOps;
 
 #define MAX_IDE_DEVS 2
 
+/* Device/Head ("select") Register */
+#define ATA_DEV_SELECT  0x10
+/* ATA1,3: Defined as '1'.
+ * ATA2:   Reserved.
+ * ATA4-7: obsolete. */
+#define ATA_DEV_ALWAYS_ON   0xA0
+#define ATA_DEV_LBA 0x40
+#define ATA_DEV_LBA_MSB 0x0F  /* LBA 24:27 */
+#define ATA_DEV_HS  0x0F  /* HS 3:0 */
+
+
 /* Bits of HD_STATUS */
 #define ERR_STAT   0x01
 #define INDEX_STAT 0x02
diff --git a/hw/ide/core.c b/hw/ide/core.c
index f35864070b..5f4f004312 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -367,7 +367,7 @@ fill_buffer:
 
 static void ide_set_signature(IDEState *s)
 {
-s->select &= 0xf0; /* clear head */
+s->select &= ~(ATA_DEV_HS); /* clear head */
 /* put signature */
 s->nsector = 1;
 s->sector = 1;
@@ -586,7 +586,7 @@ void ide_transfer_stop(IDEState *s)
 int64_t ide_get_sector(IDEState *s)
 {
 int64_t sector_num;
-if (s->select & 0x40) {
+if (s->select & (ATA_DEV_LBA)) {
 if (s->lba48) {
 sector_num = ((int64_t)s->hob_hcyl << 40) |
 ((int64_t) s->hob_lcyl << 32) |
@@ -595,13 +595,13 @@ int64_t ide_get_sector(IDEState *s)
 ((int64_t) s->lcyl << 8) | s->sector;
 } else {
 /* LBA28 */
-sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
-(s->lcyl << 8) | s->sector;
+sector_num = ((s->select & (ATA_DEV_LBA_MSB)) << 24) |
+(s->hcyl << 16) | (s->lcyl << 8) | s->sector;
 }
 } else {
 /* CHS */
 sector_num = ((s->hcyl << 8) | s->lcyl) * s->heads * s->sectors +
-(s->select & 0x0f) * s->sectors + (s->sector - 1);
+(s->select & (ATA_DEV_HS)) * s->sectors + (s->sector - 1);
 }
 
 return sector_num;
@@ -610,7 +610,7 @@ int64_t ide_get_sector(IDEState *s)
 void ide_set_sector(IDEState *s, int64_t sector_num)
 {
 unsigned int cyl, r;
-if (s->select & 0x40) {
+if (s->select & (ATA_DEV_LBA)) {
 if (s->lba48) {
 s->sector = sector_num;
 s->lcyl = sector_num >> 8;
@@ -620,7 +620,8 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 s->hob_hcyl = sector_num >> 40;
 } else {
 /* LBA28 */
-s->select = (s->select & 0xf0) | (sector_num >> 24);
+s->select = (s->select & ~(ATA_DEV_LBA_MSB)) |
+((sector_num >> 24) & (ATA_DEV_LBA_MSB));
 s->hcyl = (sector_num >> 16);
 s->lcyl = (sector_num >> 8);
 s->sector = (sector_num);
@@ -631,7 +632,8 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 r = sector_num % (s->heads * s->sectors);
 s->hcyl = cyl >> 8;
 s->lcyl = cyl;
-s->select = (s->select & 0xf0) | ((r / s->sectors) & 0x0f);
+s->select = (s->select & ~(ATA_DEV_HS)) |
+((r / s->sectors) & (ATA_DEV_HS));
 s->sector = (r % s->sectors) + 1;
 }
 }
@@ -1302,10 +1304,10 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
 ide_clear_hob(bus);
-bus->ifs[0].select = val | 0xa0;
-bus->ifs[1].select = val | 0xa0;
+bus->ifs[0].select = val | (ATA_DEV_ALWAYS_ON);
+bus->ifs[1].select = val | (ATA_DEV_ALWAYS_ON);
 /* select drive */
-bus->unit = (val >> 4) & 1;
+bus->unit = (val & (ATA_DEV_SELECT)) ? 1 : 0;
 break;
 default:
 case ATA_IOPORT_WR_COMMAND:
@@ -1343,7 +1345,7 @@ static void ide_reset(IDEState *s)
 s->hob_lcyl = 0;
 s->hob_hcyl = 0;
 
-s->select = 0xa0;
+s->select = (ATA_DEV_ALWAYS_ON);
 s->status = READY_STAT | SEEK_STAT;
 
 s->lba48 = 0;
-- 
2.26.2




[PATCH 6/7] ide: clear interrupt on command write

2020-07-23 Thread John Snow
Not known to fix any bug, but I couldn't help but notice that ATA
specifies that writing to this register should clear an interrupt.

ATA7: Section 5.3.3 (Command register - Effect)
ATA6: Section 7.4.4 (Command register - Effect)
ATA5: Section 7.4.4 (Command register - Effect)
ATA4: Section 7.4.4 (Command register - Effect)
ATA3: Section 5.2.2 (Command register)

Other editions: try searching for the phrase "Writing this register".

Signed-off-by: John Snow 
---
 hw/ide/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5f4f004312..e4c69a7fde 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1312,6 +1312,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 default:
 case ATA_IOPORT_WR_COMMAND:
 ide_clear_hob(bus);
+qemu_irq_lower(bus->irq);
 ide_exec_cmd(bus, val);
 break;
 }
-- 
2.26.2




[PATCH 4/7] ide: reorder set/get sector functions

2020-07-23 Thread John Snow
Reorder these just a pinch to make the addressing mode more obvious at
a glance.
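
(Illustration only, not part of the patch.) For reference, the LBA28
and CHS branches boil down to the arithmetic below. This is a
simplified sketch: struct toy_regs and toy_get_sector are hypothetical
names, the LBA48 branch is left out, and the formulas mirror the ones
visible in the diff.

```c
#include <stdint.h>

/* Hypothetical register snapshot; field names follow the diff loosely. */
struct toy_regs {
    uint8_t select, hcyl, lcyl, sector;
    unsigned heads, sectors;  /* drive geometry, used for CHS only */
};

static int64_t toy_get_sector(const struct toy_regs *r)
{
    if (r->select & 0x40) {
        /* LBA28: bits 27:24 come from the low nibble of select */
        return ((int64_t)(r->select & 0x0f) << 24) |
               ((int64_t)r->hcyl << 16) |
               ((int64_t)r->lcyl << 8) | r->sector;
    }
    /* CHS: cylinder, then head, then the 1-based sector number */
    int64_t cyl = ((int64_t)r->hcyl << 8) | r->lcyl;
    return cyl * r->heads * r->sectors +
           (int64_t)(r->select & 0x0f) * r->sectors + (r->sector - 1);
}
```

With a 16-head, 63-sector geometry, cylinder 1 / head 2 / sector 5
maps to 1 * 16 * 63 + 2 * 63 + 4.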

Signed-off-by: John Snow 
---
 hw/ide/core.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index a880b91b47..f35864070b 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -587,21 +587,23 @@ int64_t ide_get_sector(IDEState *s)
 {
 int64_t sector_num;
 if (s->select & 0x40) {
-/* lba */
-if (!s->lba48) {
-sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
-(s->lcyl << 8) | s->sector;
-} else {
+if (s->lba48) {
 sector_num = ((int64_t)s->hob_hcyl << 40) |
 ((int64_t) s->hob_lcyl << 32) |
 ((int64_t) s->hob_sector << 24) |
 ((int64_t) s->hcyl << 16) |
 ((int64_t) s->lcyl << 8) | s->sector;
+} else {
+/* LBA28 */
+sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
+(s->lcyl << 8) | s->sector;
 }
 } else {
+/* CHS */
 sector_num = ((s->hcyl << 8) | s->lcyl) * s->heads * s->sectors +
 (s->select & 0x0f) * s->sectors + (s->sector - 1);
 }
+
 return sector_num;
 }
 
@@ -609,20 +611,22 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 {
 unsigned int cyl, r;
 if (s->select & 0x40) {
-if (!s->lba48) {
-s->select = (s->select & 0xf0) | (sector_num >> 24);
-s->hcyl = (sector_num >> 16);
-s->lcyl = (sector_num >> 8);
-s->sector = (sector_num);
-} else {
+if (s->lba48) {
 s->sector = sector_num;
 s->lcyl = sector_num >> 8;
 s->hcyl = sector_num >> 16;
 s->hob_sector = sector_num >> 24;
 s->hob_lcyl = sector_num >> 32;
 s->hob_hcyl = sector_num >> 40;
+} else {
+/* LBA28 */
+s->select = (s->select & 0xf0) | (sector_num >> 24);
+s->hcyl = (sector_num >> 16);
+s->lcyl = (sector_num >> 8);
+s->sector = (sector_num);
 }
 } else {
+/* CHS */
 cyl = sector_num / (s->heads * s->sectors);
 r = sector_num % (s->heads * s->sectors);
 s->hcyl = cyl >> 8;
-- 
2.26.2




[PATCH 2/7] ide: don't tamper with the device register

2020-07-23 Thread John Snow
In real ISA operation, register writes go out to an entire bus channel
and all listening devices receive the write. The devices do not toggle
the DEV bit based on their own configuration, nor does the HBA
intermediate or tamper with that value.

The reality of the matter is that DEV0/DEV1 accordingly will react to
command register writes based on whether or not the device was selected.

This does not fix a known bug, but it makes the code slightly simpler
and more obvious.
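
(Illustration only, not part of the patch.) The broadcast behaviour can
be pictured with a toy two-device bus. The struct and function names
here are hypothetical; only the 0xa0 and bit 4 values come from the
code being changed.

```c
#include <stdint.h>

/* Hypothetical two-device bus; only the bit values come from the patch. */
struct toy_ide_bus {
    uint8_t select[2];  /* each device's copy of the device register */
    int unit;           /* index of the currently selected device */
};

static void toy_device_write(struct toy_ide_bus *bus, uint8_t val)
{
    /* Both devices latch the same value; neither rewrites the DEV bit. */
    bus->select[0] = val | 0xa0;
    bus->select[1] = val | 0xa0;
    /* Bit 4 decides which device reacts to later command writes. */
    bus->unit = (val >> 4) & 1;
}
```

After a write of 0x10, both copies read 0xb0 and unit is 1; the
devices no longer hold different views of the DEV bit.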

Signed-off-by: John Snow 
---
 hw/ide/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index b472220d65..5cedebc408 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1297,8 +1297,8 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
 /* FIXME: HOB readback uses bit 7 */
-bus->ifs[0].select = (val & ~0x10) | 0xa0;
-bus->ifs[1].select = (val | 0x10) | 0xa0;
+bus->ifs[0].select = val | 0xa0;
+bus->ifs[1].select = val | 0xa0;
 /* select drive */
 bus->unit = (val >> 4) & 1;
 break;
-- 
2.26.2




[PATCH 1/7] ide: rename cmd_write to ctrl_write

2020-07-23 Thread John Snow
It's the Control register, part of the Control block -- Command is
misleading here. Rename all related functions and constants.

Signed-off-by: John Snow 
---
 include/hw/ide/internal.h |  9 +
 hw/ide/core.c | 12 ++--
 hw/ide/ioport.c   |  2 +-
 hw/ide/macio.c|  2 +-
 hw/ide/mmio.c |  8 
 hw/ide/pci.c  | 12 ++--
 hw/ide/trace-events   |  2 +-
 7 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 1a7869e85d..10ea6e1e23 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -57,8 +57,9 @@ typedef struct IDEDMAOps IDEDMAOps;
 #define REL0x04
 #define TAG_MASK   0xf8
 
-#define IDE_CMD_RESET   0x04
-#define IDE_CMD_DISABLE_IRQ 0x02
+/* Bits of Device Control register */
+#define IDE_CTRL_RESET  0x04
+#define IDE_CTRL_DISABLE_IRQ0x02
 
 /* ACS-2 T13/2015-D Table B.2 Command codes */
 #define WIN_NOP0x00
@@ -564,7 +565,7 @@ static inline IDEState *idebus_active_if(IDEBus *bus)
 
 static inline void ide_set_irq(IDEBus *bus)
 {
-if (!(bus->cmd & IDE_CMD_DISABLE_IRQ)) {
+if (!(bus->cmd & IDE_CTRL_DISABLE_IRQ)) {
 qemu_irq_raise(bus->irq);
 }
 }
@@ -603,7 +604,7 @@ void ide_atapi_io_error(IDEState *s, int ret);
 void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_ioport_read(void *opaque, uint32_t addr1);
 uint32_t ide_status_read(void *opaque, uint32_t addr);
-void ide_cmd_write(void *opaque, uint32_t addr, uint32_t val);
+void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val);
 void ide_data_writew(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_data_readw(void *opaque, uint32_t addr);
 void ide_data_writel(void *opaque, uint32_t addr, uint32_t val);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index d997a78e47..b472220d65 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2235,25 +2235,25 @@ uint32_t ide_status_read(void *opaque, uint32_t addr)
 return ret;
 }
 
-void ide_cmd_write(void *opaque, uint32_t addr, uint32_t val)
+void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val)
 {
 IDEBus *bus = opaque;
 IDEState *s;
 int i;
 
-trace_ide_cmd_write(addr, val, bus);
+trace_ide_ctrl_write(addr, val, bus);
 
 /* common for both drives */
-if (!(bus->cmd & IDE_CMD_RESET) &&
-(val & IDE_CMD_RESET)) {
+if (!(bus->cmd & IDE_CTRL_RESET) &&
+(val & IDE_CTRL_RESET)) {
 /* reset low to high */
 for(i = 0;i < 2; i++) {
 s = &bus->ifs[i];
 s->status = BUSY_STAT | SEEK_STAT;
 s->error = 0x01;
 }
-} else if ((bus->cmd & IDE_CMD_RESET) &&
-   !(val & IDE_CMD_RESET)) {
+} else if ((bus->cmd & IDE_CTRL_RESET) &&
+   !(val & IDE_CTRL_RESET)) {
 /* high to low */
 for(i = 0;i < 2; i++) {
 s = &bus->ifs[i];
diff --git a/hw/ide/ioport.c b/hw/ide/ioport.c
index ab1f4e5d9c..b613ff3bba 100644
--- a/hw/ide/ioport.c
+++ b/hw/ide/ioport.c
@@ -46,7 +46,7 @@ static const MemoryRegionPortio ide_portio_list[] = {
 };
 
 static const MemoryRegionPortio ide_portio2_list[] = {
-{ 0, 1, 1, .read = ide_status_read, .write = ide_cmd_write },
+{ 0, 1, 1, .read = ide_status_read, .write = ide_ctrl_write },
 PORTIO_END_OF_LIST(),
 };
 
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 62a599a075..b270a10163 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -329,7 +329,7 @@ static void pmac_ide_write(void *opaque, hwaddr addr, 
uint64_t val,
 case 0x8:
 case 0x16:
 if (size == 1) {
-ide_cmd_write(&d->bus, 0, val);
+ide_ctrl_write(&d->bus, 0, val);
 }
 break;
 case 0x20:
diff --git a/hw/ide/mmio.c b/hw/ide/mmio.c
index d233bd8c01..80b8a9eb09 100644
--- a/hw/ide/mmio.c
+++ b/hw/ide/mmio.c
@@ -95,16 +95,16 @@ static uint64_t mmio_ide_status_read(void *opaque, hwaddr 
addr,
 return ide_status_read(&s->bus, 0);
 }
 
-static void mmio_ide_cmd_write(void *opaque, hwaddr addr,
-   uint64_t val, unsigned size)
+static void mmio_ide_ctrl_write(void *opaque, hwaddr addr,
+uint64_t val, unsigned size)
 {
 MMIOState *s = opaque;
-ide_cmd_write(&s->bus, 0, val);
+ide_ctrl_write(&s->bus, 0, val);
 }
 
 static const MemoryRegionOps mmio_ide_cs_ops = {
 .read = mmio_ide_status_read,
-.write = mmio_ide_cmd_write,
+.write = mmio_ide_ctrl_write,
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index 5e85c4ad17..59726ae453 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -38,7 +38,7 @@
 (IDE_RETRY_DMA | IDE_RETRY_PIO | \
 IDE_RETRY_READ | IDE_RETRY_FLUSH)
 
-static uint64_t pci_ide_cmd_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t pci_ide_status_read(void 

Re: [PATCH v2 08/22] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-07-23 Thread Vladimir Sementsov-Ogievskiy

24.07.2020 00:30, Eric Blake wrote:

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.

To clean-up the new list the following logic is used:
We need two events to consider bitmap migration finished:
1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we understand which one is
last, and on the last of them we remove bitmap migration state from the
list.
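
(Illustration only, not part of the patch.) The "whichever event comes
last cleans up" rule can be sketched as follows. The struct and handler
names are hypothetical and much simpler than the real LoadBitmapState;
the sketch only shows how each handler can tell whether it ran last.

```c
#include <stdbool.h>

/* Hypothetical per-bitmap state; the real LoadBitmapState differs. */
struct toy_bitmap_state {
    bool complete_received;  /* event 1: COMPLETE chunk arrived */
    bool vm_started;         /* event 2: before-vm-start hook ran */
};

/* Each handler returns true when it was the later of the two events,
 * which is the point where the state may be dropped from the list. */
static bool toy_on_complete(struct toy_bitmap_state *b)
{
    b->complete_received = true;
    return b->vm_started;
}

static bool toy_on_vm_start(struct toy_bitmap_state *b)
{
    b->vm_started = true;
    return b->complete_received;
}
```

Whichever order the two handlers run in, exactly one of them returns
true, so the removal happens once.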

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 64 +++---
  1 file changed, 43 insertions(+), 21 deletions(-)



@@ -484,45 +488,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
  bdrv_disable_dirty_bitmap(s->bitmap);
  if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-    LoadBitmapState *b;
-
  bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
  if (local_err) {
  error_report_err(local_err);
  return -EINVAL;
  }
-
-    b = g_new(LoadBitmapState, 1);
-    b->bs = s->bs;
-    b->bitmap = s->bitmap;
-    b->migrated = false;
-    s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
  }
+    b = g_new(LoadBitmapState, 1);
+    b->bs = s->bs;
+    b->bitmap = s->bitmap;
+    b->migrated = false;
+    b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED,
+
+    s->bitmaps = g_slist_prepend(s->bitmaps, b);


Did you really mean to use a comma operator there, or should that be ';'?



Of course, it should be ';':)
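
(Illustration only.) As far as I can tell the stray comma is harmless
here, because assignment binds tighter than the comma operator and the
comma still sequences the two assignments. A standalone sketch with
made-up names (toy_state, TOY_FLAG_ENABLED) shows both spellings
behaving the same:

```c
#include <stdbool.h>

#define TOY_FLAG_ENABLED 0x02  /* hypothetical flag value */

struct toy_state { bool enabled; int count; };

/* Trailing comma: the two assignments form one expression statement. */
static void toy_with_comma(struct toy_state *b, int flags)
{
    b->enabled = flags & TOY_FLAG_ENABLED,

    b->count = b->count + 1;
}

/* Intended form: two separate statements. */
static void toy_with_semicolon(struct toy_state *b, int flags)
{
    b->enabled = flags & TOY_FLAG_ENABLED;

    b->count = b->count + 1;
}
```

The semicolon is still the right fix: the comma version only works by
accident and would silently change meaning if a declaration or a
lower-precedence expression were added between the two lines.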

--
Best regards,
Vladimir



Re: [PATCH 2/2] ppc: Enable 2nd DAWR support on p10

2020-07-23 Thread David Gibson
On Thu, Jul 23, 2020 at 04:12:20PM +0530, Ravi Bangoria wrote:
> As per the PAPR, bit 0 of byte 64 in pa-features property indicates
> availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
> DAWR is present, otherwise not. Use KVM_CAP_PPC_DAWR1 capability to
> find whether KVM supports 2nd DAWR or not. If it's supported, set
> the pa-feature bit in guest DT so the guest kernel can support 2nd
> DAWR.
> 
> Signed-off-by: Ravi Bangoria 
> ---
>  hw/ppc/spapr.c  | 33 +
>  include/hw/ppc/spapr.h  |  1 +
>  linux-headers/asm-powerpc/kvm.h |  4 
>  linux-headers/linux/kvm.h   |  1 +
>  target/ppc/cpu.h|  2 ++
>  target/ppc/kvm.c|  7 +++
>  target/ppc/kvm_ppc.h|  6 ++
>  target/ppc/translate_init.inc.c | 17 -
>  8 files changed, 70 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0ae293ec94..4416319363 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -252,6 +252,31 @@ static void spapr_dt_pa_features(SpaprMachineState 
> *spapr,
>  /* 60: NM atomic, 62: RNG */
>  0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
>  };
> +uint8_t pa_features_310[] = { 66, 0,
> +/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
> +/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, SSO, 5: LE|CFAR|EB|LSQ 
> */
> +0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /* 0 - 5 */
> +/* 6: DS207 */
> +0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
> +/* 16: Vector */
> +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
> +/* 18: Vec. Scalar, 20: Vec. XOR, 22: HTM */
> +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
> +/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
> +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
> +/* 30: MMR, 32: LE atomic, 34: EBB + ext EBB */
> +0x80, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
> +/* 36: SPR SO, 38: Copy/Paste, 40: Radix MMU */
> +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 36 - 41 */
> +/* 42: PM, 44: PC RA, 46: SC vec'd */
> +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
> +/* 48: SIMD, 50: QP BFP, 52: String */
> +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
> +/* 54: DecFP, 56: DecI, 58: SHA */
> +0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
> +/* 60: NM atomic, 62: RNG, 64: DAWR1 */
> +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
> +};
>  uint8_t *pa_features = NULL;
>  size_t pa_size;
>  
> @@ -267,6 +292,10 @@ static void spapr_dt_pa_features(SpaprMachineState 
> *spapr,
>  pa_features = pa_features_300;
>  pa_size = sizeof(pa_features_300);
>  }
> +if (ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_10, 0, cpu->compat_pvr)) 
> {
> +pa_features = pa_features_310;
> +pa_size = sizeof(pa_features_310);
> +}
>  if (!pa_features) {
>  return;
>  }
> @@ -291,6 +320,10 @@ static void spapr_dt_pa_features(SpaprMachineState 
> *spapr,
>  pa_features[40 + 2] &= ~0x80; /* Radix MMU */
>  }
>  
> +if (kvm_enabled() && kvmppc_has_cap_dawr1()) {
> +pa_features[66] |= 0x80;
> +}

Nack.  The guest visible platform must not depend on host capabilities
because it makes a complete mess of migration.  The machine type and
properties of other devices need to define what the guest environment
will be, then qemu can either provide it, or fail outright if KVM
doesn't have the necessary support.
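
To make the principle concrete, here is a rough sketch of the shape being asked for. It is an illustration only, not how the spapr code is structured: apply_dawr1() and its parameters are hypothetical, with want_dawr1 standing for a machine-level setting and host_has_dawr1 for the result of a KVM capability probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PA_HDR_LEN  2       /* the two leading bytes of the pa_features arrays */
#define DAWR1_BYTE  64      /* byte 64, bit 0, per the commit message */
#define DAWR1_BIT   0x80

/*
 * Hypothetical helper: the guest-visible bit depends only on what the
 * machine was configured to provide. The host capability is consulted
 * solely to decide whether that configuration can be honoured.
 * Returns 0 on success, -1 if the requested environment is unavailable.
 */
int apply_dawr1(uint8_t *pa_features, size_t len,
                bool want_dawr1, bool using_kvm, bool host_has_dawr1)
{
    assert(len > PA_HDR_LEN + DAWR1_BYTE);

    if (!want_dawr1) {
        return 0;       /* guest never sees the bit, whatever the host has */
    }
    if (using_kvm && !host_has_dawr1) {
        return -1;      /* fail outright rather than silently drop the bit */
    }
    pa_features[PA_HDR_LEN + DAWR1_BYTE] |= DAWR1_BIT;
    return 0;
}
```

With this shape, two hosts given the same machine configuration either present the same pa-features to the guest or refuse to start, which is what keeps migration sane.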

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH v5 3/6] target/ppc: add vmulh{su}w instructions

2020-07-23 Thread Lijun Pan
vmulhsw: Vector Multiply High Signed Word
vmulhuw: Vector Multiply High Unsigned Word

Signed-off-by: Lijun Pan 
---
v4/v5: no change
Reviewed-by: Richard Henderson 
v3: inline the helper_vmulh{su}w multiply directly instead of using macro
v2: fix coding style
use Power ISA 3.1 flag

 target/ppc/helper.h |  2 ++
 target/ppc/int_helper.c | 19 +++
 target/ppc/translate/vmx-impl.inc.c |  6 ++
 target/ppc/translate/vmx-ops.inc.c  |  4 ++--
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 69416b6d7c..3b3013866a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -184,6 +184,8 @@ DEF_HELPER_3(vmulosw, void, avr, avr, avr)
 DEF_HELPER_3(vmuloub, void, avr, avr, avr)
 DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index bd3e6d7cc7..a3a20821fc 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1086,6 +1086,25 @@ VMUL(uw, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
+void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+int i;
+
+for (i = 0; i < 4; i++) {
+r->s32[i] = (int32_t)(((int64_t)a->s32[i] * (int64_t)b->s32[i]) >> 32);
+}
+}
+
+void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+int i;
+
+for (i = 0; i < 4; i++) {
+r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] *
+   (uint64_t)b->u32[i]) >> 32);
+}
+}
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.inc.c 
b/target/ppc/translate/vmx-impl.inc.c
index 8c89738552..50bac375fc 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -811,9 +811,15 @@ GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
 GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
+GEN_VXFORM(vmulhuw, 4, 10);
+GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
+vmulhuw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulesb, 4, 12);
 GEN_VXFORM(vmulesh, 4, 13);
 GEN_VXFORM(vmulesw, 4, 14);
+GEN_VXFORM(vmulhsw, 4, 14);
+GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
+vmulhsw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
diff --git a/target/ppc/translate/vmx-ops.inc.c 
b/target/ppc/translate/vmx-ops.inc.c
index b49787ac97..29701ad778 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -110,10 +110,10 @@ GEN_VXFORM_207(vmulosw, 4, 6),
 GEN_VXFORM_310(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
-GEN_VXFORM_207(vmuleuw, 4, 10),
+GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vmulesb, 4, 12),
 GEN_VXFORM(vmulesh, 4, 13),
-GEN_VXFORM_207(vmulesw, 4, 14),
+GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
 GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-- 
2.23.0
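
For reference, the per-element arithmetic of the two helpers in this patch can be tried outside QEMU. The sketch below lifts the same expressions into standalone functions; the names mulh_s32 and mulh_u32 are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * High 32 bits of the signed 64-bit product. Right-shifting a negative
 * value is implementation-defined in C; mainstream compilers perform an
 * arithmetic shift, which is what the helper relies on.
 */
int32_t mulh_s32(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* High 32 bits of the unsigned 64-bit product. */
uint32_t mulh_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* Small self-check with hand-computed values. */
void mulh_self_check(void)
{
    /* (2^32 - 1)^2 = 0xFFFFFFFE00000001 */
    assert(mulh_u32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);
    /* (-2^31)^2 = 2^62, whose high word is 2^30 */
    assert(mulh_s32(INT32_MIN, INT32_MIN) == 0x40000000);
}
```

Widening both operands before the multiply is what avoids overflow; casting only the product would be too late.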




[PATCH v5 2/6] target/ppc: add vmulld to INDEX_op_mul_vec case

2020-07-23 Thread Lijun Pan
Group vmuluwm and vmulld. Make vmulld-specific
changes since it belongs to new ISA 3.1.

Signed-off-by: Lijun Pan 
---
v5: no change
v4: add missing changes, and split to 5/11, 6/11, 7/11
v3: use tcg_gen_gvec_mul()
v2: fix coding style
use Power ISA 3.1 flag

 tcg/ppc/tcg-target.h |  2 ++
 tcg/ppc/tcg-target.inc.c | 12 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 4fa21f0e71..ff1249ef8e 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -63,6 +63,7 @@ typedef enum {
 tcg_isa_2_06,
 tcg_isa_2_07,
 tcg_isa_3_00,
+tcg_isa_3_10,
 } TCGPowerISA;
 
 extern TCGPowerISA have_isa;
@@ -72,6 +73,7 @@ extern bool have_vsx;
 #define have_isa_2_06  (have_isa >= tcg_isa_2_06)
 #define have_isa_2_07  (have_isa >= tcg_isa_2_07)
 #define have_isa_3_00  (have_isa >= tcg_isa_3_00)
+#define have_isa_3_10  (have_isa >= tcg_isa_3_10)
 
 /* optional instructions automatically implemented */
 #define TCG_TARGET_HAS_ext8u_i32        0 /* andi */
diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
index ee1f9227c1..caa8985b46 100644
--- a/tcg/ppc/tcg-target.inc.c
+++ b/tcg/ppc/tcg-target.inc.c
@@ -564,6 +564,7 @@ static int tcg_target_const_match(tcg_target_long val, 
TCGType type,
 #define VMULOUH    VX4(72)
 #define VMULOUW    VX4(136)   /* v2.07 */
 #define VMULUWM    VX4(137)   /* v2.07 */
+#define VMULLD VX4(457)   /* v3.10 */
 #define VMSUMUHM   VX4(38)
 
 #define VMRGHB VX4(12)
@@ -3015,6 +3016,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 return -1;
 case MO_32:
 return have_isa_2_07 ? 1 : -1;
+case MO_64:
+return have_isa_3_10;
 }
 return 0;
 case INDEX_op_bitsel_vec:
@@ -3149,6 +3152,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const uint32_t
 add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM },
 sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM },
+mul_op[4] = { 0, 0, VMULUWM, VMULLD },
 neg_op[4] = { 0, 0, VNEGW, VNEGD },
 eq_op[4]  = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD },
 ne_op[4]  = { VCMPNEB, VCMPNEH, VCMPNEW, 0 },
@@ -3199,8 +3203,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 a1 = 0;
 break;
 case INDEX_op_mul_vec:
-tcg_debug_assert(vece == MO_32 && have_isa_2_07);
-insn = VMULUWM;
+insn = mul_op[vece];
 break;
 case INDEX_op_ssadd_vec:
 insn = ssadd_op[vece];
@@ -3709,6 +3712,11 @@ static void tcg_target_init(TCGContext *s)
 have_isa = tcg_isa_3_00;
 }
 #endif
+#ifdef PPC_FEATURE2_ARCH_3_10
+if (hwcap2 & PPC_FEATURE2_ARCH_3_10) {
+have_isa = tcg_isa_3_10;
+}
+#endif
 
 #ifdef PPC_FEATURE2_HAS_ISEL
 /* Prefer explicit instruction from the kernel. */
-- 
2.23.0




[PATCH v5 5/6] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions

2020-07-23 Thread Lijun Pan
vdivsw: Vector Divide Signed Word
vdivuw: Vector Divide Unsigned Word
vdivsd: Vector Divide Signed Doubleword
vdivud: Vector Divide Unsigned Doubleword
vmodsw: Vector Modulo Signed Word
vmoduw: Vector Modulo Unsigned Word
vmodsd: Vector Modulo Signed Doubleword
vmodud: Vector Modulo Unsigned Doubleword

Signed-off-by: Lijun Pan 
---
v5: no change
v4: add a comment on undefined result of divide operation.
fix if(){} coding style issue, remove blank line.
v3: add missing divided-by-zero, divided-by-(-1) handling
v2: fix coding style
use Power ISA 3.1 flag

 target/ppc/helper.h |  8 
 target/ppc/int_helper.c | 27 +++
 target/ppc/translate.c  |  3 +++
 target/ppc/translate/vmx-impl.inc.c | 15 +++
 target/ppc/translate/vmx-ops.inc.c  | 17 +++--
 5 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 0036788919..70a14029ca 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -188,6 +188,14 @@ DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
 DEF_HELPER_3(vmulhud, void, avr, avr, avr)
+DEF_HELPER_3(vdivsw, void, avr, avr, avr)
+DEF_HELPER_3(vdivuw, void, avr, avr, avr)
+DEF_HELPER_3(vdivsd, void, avr, avr, avr)
+DEF_HELPER_3(vdivud, void, avr, avr, avr)
+DEF_HELPER_3(vmodsw, void, avr, avr, avr)
+DEF_HELPER_3(vmoduw, void, avr, avr, avr)
+DEF_HELPER_3(vmodsd, void, avr, avr, avr)
+DEF_HELPER_3(vmodud, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 57d6767f60..62b93b4568 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1121,6 +1121,33 @@ void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
 }
 
+#define VDIV_MOD_DO(name, op, element, sign, bit)   \
+void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
+{   \
+int i;  \
+\
+for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
+if (unlikely((b->element[i] == 0) ||\
+(sign &&\
+(b->element[i] == UINT##bit##_MAX) &&   \
+(a->element[i] == INT##bit##_MIN)))) {  \
+/* Undefined, No Special Registers Altered */   \
+continue;   \
+}   \
+r->element[i] = a->element[i] op b->element[i]; \
+}   \
+}
+VDIV_MOD_DO(divsw, /, s32, 1, 32)
+VDIV_MOD_DO(divuw, /, u32, 0, 32)
+VDIV_MOD_DO(divsd, /, s64, 1, 64)
+VDIV_MOD_DO(divud, /, u64, 0, 64)
+VDIV_MOD_DO(modsw, %, s32, 1, 32)
+VDIV_MOD_DO(moduw, %, u32, 0, 32)
+VDIV_MOD_DO(modsd, %, s64, 1, 64)
+VDIV_MOD_DO(modud, %, u64, 0, 64)
+#undef VDIV_MOD_DO
+
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 590c3e3bc7..55f0e1a01d 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -388,6 +388,9 @@ GEN_OPCODE3(name, opc1, opc2, opc3, opc4, inval, type, 
type2)
 #define GEN_HANDLER2_E_2(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2) 
\
 GEN_OPCODE4(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2)
 
+#define GEN_HANDLER_BOTH(name, opc1, opc2, opc3, inval0, inval1, type0, type1) 
\
+GEN_OPCODE_DUAL(name, opc1, opc2, opc3, inval0, inval1, type0, type1)
+
 typedef struct opcode_t {
 unsigned char opc1, opc2, opc3, opc4;
 #if HOST_LONG_BITS == 64 /* Explicitly align to 64 bits */
diff --git a/target/ppc/translate/vmx-impl.inc.c 
b/target/ppc/translate/vmx-impl.inc.c
index 0910807232..ac5e820541 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -798,6 +798,9 @@ static void trans_vclzd(DisasContext *ctx)
 tcg_temp_free_i64(avr);
 }
 
+static void gen_vexptefp(DisasContext *ctx);
+static void gen_vlogefp(DisasContext *ctx);
+
 GEN_VXFORM(vmuloub, 4, 0);
 GEN_VXFORM(vmulouh, 4, 1);
 GEN_VXFORM(vmulouw, 4, 2);
@@ -822,6 +825,18 @@ GEN_VXFORM(vmulhsw, 4, 14);
 GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
 vmulhsw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulhsd, 4, 15);
+GEN_VXFORM(vdivuw, 5, 2);
+GEN_VXFORM(vdivud, 5, 3);
+GEN_VXFORM(vdivsw, 5, 6);
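
The guard in VDIV_MOD_DO can be read in isolation as the sketch below. It is a simplified stand-in, not the macro itself: the function names are hypothetical, only the signed 32-bit case is shown, and the divisor is compared against -1 directly instead of against UINT32_MAX as the macro does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true and stores a / b in *out when the quotient is defined.
 * Returns false and leaves *out untouched for division by zero and for
 * INT32_MIN / -1, the two cases the helper skips.
 */
bool div_s32_checked(int32_t a, int32_t b, int32_t *out)
{
    if (b == 0 || (b == -1 && a == INT32_MIN)) {
        return false;
    }
    *out = a / b;
    return true;
}

/* Same guard for the remainder. */
bool mod_s32_checked(int32_t a, int32_t b, int32_t *out)
{
    if (b == 0 || (b == -1 && a == INT32_MIN)) {
        return false;
    }
    *out = a % b;
    return true;
}

/* Small self-check of the skipped cases. */
void div_self_check(void)
{
    int32_t v = 42;

    assert(!div_s32_checked(1, 0, &v) && v == 42);
    assert(!mod_s32_checked(INT32_MIN, -1, &v) && v == 42);
}
```

Both skipped cases are undefined behaviour in C, so the check has to come before the operator is evaluated, not after.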

[PATCH v5 4/6] target/ppc: add vmulh{su}d instructions

2020-07-23 Thread Lijun Pan
vmulhsd: Vector Multiply High Signed Doubleword
vmulhud: Vector Multiply High Unsigned Doubleword

Signed-off-by: Lijun Pan 
---
v4/v5: no change
Reviewed-by: Richard Henderson 
v3: simplify helper_vmulh{su}d 
v2: fix coding style
use Power ISA 3.1 flag

 target/ppc/helper.h |  2 ++
 target/ppc/int_helper.c | 16 
 target/ppc/translate/vmx-impl.inc.c |  2 ++
 target/ppc/translate/vmx-ops.inc.c  |  2 ++
 4 files changed, 22 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 3b3013866a..0036788919 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -186,6 +186,8 @@ DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
+DEF_HELPER_3(vmulhud, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index a3a20821fc..57d6767f60 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1105,6 +1105,22 @@ void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 }
 }
 
+void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+uint64_t discard;
+
+muls64(&discard, &r->u64[0], a->s64[0], b->s64[0]);
+muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
+}
+
+void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+uint64_t discard;
+
+mulu64(&discard, &r->u64[0], a->u64[0], b->u64[0]);
+mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
+}
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.inc.c 
b/target/ppc/translate/vmx-impl.inc.c
index 50bac375fc..0910807232 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -812,6 +812,7 @@ GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
 GEN_VXFORM(vmulhuw, 4, 10);
+GEN_VXFORM(vmulhud, 4, 11);
 GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
 vmulhuw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulesb, 4, 12);
@@ -820,6 +821,7 @@ GEN_VXFORM(vmulesw, 4, 14);
 GEN_VXFORM(vmulhsw, 4, 14);
 GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
 vmulhsw, PPC_NONE, PPC2_ISA310);
+GEN_VXFORM(vmulhsd, 4, 15);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
diff --git a/target/ppc/translate/vmx-ops.inc.c 
b/target/ppc/translate/vmx-ops.inc.c
index 29701ad778..f3f4855111 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -111,9 +111,11 @@ GEN_VXFORM_310(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
 GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhud, 4, 11),
 GEN_VXFORM(vmulesb, 4, 12),
 GEN_VXFORM(vmulesh, 4, 13),
 GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhsd, 4, 15),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
 GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-- 
2.23.0




[PATCH v5 6/6] target/ppc: add vmsumudm vmsumcud instructions

2020-07-23 Thread Lijun Pan
vmsumudm (Power ISA 3.0) - Vector Multiply-Sum Unsigned Doubleword Modulo
VA-form.
vmsumcud (Power ISA 3.1) - Vector Multiply-Sum & write Carry-out Unsigned
Doubleword VA-form.

Signed-off-by: Lijun Pan 
---
v5: update instruction flag for vmsumcud.
integrate into this isa3.1 patch series
v3: implement vmsumudm/vmsumcud through int128 functions,
suggested by Richard Henderson.

 disas/ppc.c |  2 ++
 target/ppc/helper.h |  4 ++-
 target/ppc/int_helper.c | 49 -
 target/ppc/translate.c  |  1 -
 target/ppc/translate/vmx-impl.inc.c | 39 ---
 target/ppc/translate/vmx-ops.inc.c  |  2 ++
 6 files changed, 76 insertions(+), 21 deletions(-)

diff --git a/disas/ppc.c b/disas/ppc.c
index 63e97cfe1d..bd76fae4c4 100644
--- a/disas/ppc.c
+++ b/disas/ppc.c
@@ -2261,7 +2261,9 @@ const struct powerpc_opcode powerpc_opcodes[] = {
 { "vmsumshs",  VXA(4,  41), VXA_MASK,  PPCVEC, { VD, VA, VB, VC } },
 { "vmsumubm",  VXA(4,  36), VXA_MASK,   PPCVEC,{ VD, VA, VB, 
VC } },
 { "vmsumuhm",  VXA(4,  38), VXA_MASK,   PPCVEC,{ VD, VA, VB, 
VC } },
+{ "vmsumudm",  VXA(4,  35), VXA_MASK,   PPCVEC, { VD, VA, VB, VC } },
 { "vmsumuhs",  VXA(4,  39), VXA_MASK,   PPCVEC,{ VD, VA, VB, 
VC } },
+{ "vmsumcud",  VXA(4,  23), VXA_MASK,   PPCVEC, { VD, VA, VB, VC } },
 { "vmulesb",   VX(4,  776), VX_MASK,   PPCVEC, { VD, VA, VB } },
 { "vmulesh",   VX(4,  840), VX_MASK,   PPCVEC, { VD, VA, VB } },
 { "vmuleub",   VX(4,  520), VX_MASK,   PPCVEC, { VD, VA, VB } },
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 70a14029ca..00a31d64bc 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -274,10 +274,12 @@ DEF_HELPER_3(vpkpx, void, avr, avr, avr)
 DEF_HELPER_5(vmhaddshs, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmhraddshs, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsumuhm, void, env, avr, avr, avr, avr)
+DEF_HELPER_5(vmsumudm, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsumuhs, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsumshm, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsumshs, void, env, avr, avr, avr, avr)
-DEF_HELPER_4(vmladduhm, void, avr, avr, avr, avr)
+DEF_HELPER_5(vmsumcud, void, env, avr, avr, avr, avr)
+DEF_HELPER_5(vmladduhm, void, env, avr, avr, avr, avr)
 DEF_HELPER_FLAGS_2(mtvscr, TCG_CALL_NO_RWG, void, env, i32)
 DEF_HELPER_FLAGS_1(mfvscr, TCG_CALL_NO_RWG, i32, env)
 DEF_HELPER_3(lvebx, void, env, avr, tl)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 62b93b4568..2e919a7b8e 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -913,7 +913,8 @@ void helper_vmhraddshs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 }
 
-void helper_vmladduhm(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
+void helper_vmladduhm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
+  ppc_avr_t *b, ppc_avr_t *c)
 {
 int i;
 
@@ -1051,6 +1052,52 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 }
 
+void helper_vmsumudm(CPUPPCState *env, ppc_avr_t *r,
+ ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
+{
+Int128 sum;
+uint64_t lo, hi;
+
+sum = int128_make128(c->VsrD(1), c->VsrD(0));
+
+mulu64(&lo, &hi, a->VsrD(0), b->VsrD(0));
+sum = int128_add(sum, int128_make128(lo, hi));
+
+mulu64(&lo, &hi, a->VsrD(1), b->VsrD(1));
+sum = int128_add(sum, int128_make128(lo, hi));
+
+r->VsrD(0) = int128_gethi(sum);
+r->VsrD(1) = int128_getlo(sum);
+}
+
+void helper_vmsumcud(CPUPPCState *env, ppc_avr_t *r,
+ ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
+{
+Int128 sum;
+uint64_t p1lo, p1hi, p2lo, p2hi;
+
+mulu64(&p1lo, &p1hi, a->VsrD(0), b->VsrD(0));
+mulu64(&p2lo, &p2hi, a->VsrD(1), b->VsrD(1));
+
+/* Sum lowest 64-bit elements.  */
+sum = int128_make128(c->VsrD(1), 0);
+sum = int128_add(sum, int128_make128(p1lo, 0));
+sum = int128_add(sum, int128_make128(p2lo, 0));
+
+/*
+ * Discard low 64-bits, leaving the carry into bit 64.
+ * Then sum the higher 64-bit elements.
+ */
+sum = int128_rshift(sum, 64);
+sum = int128_add(sum, int128_make128(c->VsrD(0), 0));
+sum = int128_add(sum, int128_make128(p1hi, 0));
+sum = int128_add(sum, int128_make128(p2hi, 0));
+
+/* The result is only the carry into bits 64 & 65. */
+r->VsrD(1) = int128_gethi(sum);
+r->VsrD(0) = 0;
+}
+
 #define VMUL_DO_EVN(name, mul_element, mul_access, prod_access, cast)   \
 void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
 {   \
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 55f0e1a01d..b33b2d87a4 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -7324,7 +7324,6 @@ GEN_HANDLER(lvsl, 0x1f, 0x06, 0x00, 0x0001, 

[PATCH v5 1/6] Update PowerPC AT_HWCAP2 definition

2020-07-23 Thread Lijun Pan
Add PPC_FEATURE2_ARCH_3_10 to the PowerPC AT_HWCAP2 definitions.

Signed-off-by: Lijun Pan 
---
v5: match the definition with that in linux's
arch/powerpc/include/uapi/asm/cputable.h
v4: add missing changes, and split to 5/11, 6/11, 7/11
v3: use tcg_gen_gvec_mul()
v2: fix coding style
use Power ISA 3.1 flag

 include/elf.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/elf.h b/include/elf.h
index 8fbfe60e09..4fc1276ab6 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -554,6 +554,7 @@ typedef struct {
 #define PPC_FEATURE2_HTM_NOSC           0x01000000
 #define PPC_FEATURE2_ARCH_3_00          0x00800000
 #define PPC_FEATURE2_HAS_IEEE128        0x00400000
+#define PPC_FEATURE2_ARCH_3_10          0x00040000
 
 /* Bits present in AT_HWCAP for Sparc.  */
 
-- 
2.23.0




[PATCH v5 0/6] Add several Power ISA 3.1 32/64-bit vector instructions

2020-07-23 Thread Lijun Pan
This patch series adds several newly introduced 32/64-bit vector
instructions in Power ISA 3.1. A Power ISA 3.1 flag is introduced in
this version. In v4, coding style issues were fixed and community
reviews/suggestions were taken into consideration. 1/11 - 5/11 of v4 were
accepted by David Gibson, and 9/11 of v4 was accepted by Laurent Vivier.
This v5 updates the PPC_FEATURE2_ARCH_3_10 definition in 6/11 of v4,
rebases 7/11, 8/11, 10/11 and 11/11 of v4, and integrates the
vmsumudm/vmsumcud patch.

Lijun Pan (6):
  Update PowerPC AT_HWCAP2 definition
  target/ppc: add vmulld to INDEX_op_mul_vec case
  target/ppc: add vmulh{su}w instructions
  target/ppc: add vmulh{su}d instructions
  target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  target/ppc: add vmsumudm vmsumcud instructions

 disas/ppc.c |   2 +
 include/elf.h   |   1 +
 target/ppc/helper.h |  16 +++-
 target/ppc/int_helper.c | 111 +++-
 target/ppc/translate.c  |   4 +-
 target/ppc/translate/vmx-impl.inc.c |  62 +++-
 target/ppc/translate/vmx-ops.inc.c  |  25 ++-
 tcg/ppc/tcg-target.h|   2 +
 tcg/ppc/tcg-target.inc.c|  12 ++-
 9 files changed, 208 insertions(+), 27 deletions(-)

-- 
2.23.0




[Bug 1888431] Re: v5.1.0-rc1 build fails on Mac OS X 10.11.6

2020-07-23 Thread Thomas Huth
Hmm, let's see ... the work-arounds for old Mac OS X versions have been
removed here:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=483644c25b932360018

It mentions that this commit has broken compilation earlier:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=50290c002c045280f8d

... so the newest version that still might be compilable is v4.0.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888431

Title:
  v5.1.0-rc1 build fails on Mac OS X 10.11.6

Status in QEMU:
  Won't Fix

Bug description:
  Hi all,

  build of tag v5.1.0-rc1 fails on Mac OS X 10.11.6 (El Capitan) with
  the following error:

  git clone https://git.qemu.org/git/qemu.git
  
  cd qemu
  git submodule init
  
  git submodule update --recursive
  
  ./configure
  
  make
  

CC  trace/control.o
  In file included from trace/control.c:29:
  In file included from /Users/rtb/src/qemu/include/monitor/monitor.h:4:
  In file included from /Users/rtb/src/qemu/include/block/block.h:4:
  In file included from /Users/rtb/src/qemu/include/block/aio.h:23:
  /Users/rtb/src/qemu/include/qemu/timer.h:843:9: warning: implicit declaration 
of function 'clock_gettime' is invalid in C99
[-Wimplicit-function-declaration]
  clock_gettime(CLOCK_MONOTONIC, &ts);
  ^
  /Users/rtb/src/qemu/include/qemu/timer.h:843:23: error: use of undeclared 
identifier 'CLOCK_MONOTONIC'
  clock_gettime(CLOCK_MONOTONIC, &ts);
^
  1 warning and 1 error generated.
  make: *** [trace/control.o] Error 1

  
  rtb:qemu rtb$ git log -n1
  commit c8004fe6bbfc0d9c2e7b942c418a85efb3ac4b00 (HEAD -> master, tag: 
v5.1.0-rc1, origin/master, origin/HEAD)
  Author: Peter Maydell 
  Date:   Tue Jul 21 20:28:59 2020 +0100

  Update version for v5.1.0-rc1 release
  
  Signed-off-by: Peter Maydell 
  rtb:qemu rtb$ 

  
  Please find the full output of all the commands (from git clone of the repo, 
to the make) in the attached file "buildfail.txt".

  Thank you!

  Best regards,

  Robert Ball

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1888431/+subscriptions



[Bug 1888601] Re: QEMU v5.1.0-rc0/rc1 hang with nested virtualization

2020-07-23 Thread Simon Kaegi
I believe the VSI itself is QEMU based but don't know the version or
details but suspect it's 4.1 based. We compile our own QEMU version for
use with Kata and that's where we're now using 5.1.0-rc1 with the above
commit reverted.

Host Kernel is ... 4.15.0-101-generic if that helps

re: cpu -- four of these...
```
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 61
model name  : Intel Core Processor (Broadwell, IBRS)
stepping: 2
microcode   : 0x1
cpu MHz : 2095.148
cache size  : 16384 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm 
constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx 
ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes 
xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault 
invpcid_single pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept vpid 
fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap 
xsaveopt arat md_clear
bugs: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds 
swapgs taa itlb_multihit
bogomips: 4190.29
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
```

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888601

Title:
  QEMU v5.1.0-rc0/rc1 hang with nested virtualization

Status in QEMU:
  New

Bug description:
  We're running Kata Containers using QEMU and with v5.1.0rc0 and rc1
  have noticed a problem at startup where QEMU appears to hang. We are
  not seeing this problem on our bare metal nodes and only on a VSI that
  supports nested virtualization.

  We unfortunately see nothing at all in the QEMU logs to help
  understand the problem and a hung process is just a guess at this
  point.

  Using git bisect we first see the problem with...

  ---

  f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit
  commit f19bcdfedd53ee93412d535a842a89fa27cae7f2
  Author: Jason Wang 
  Date:   Wed Jul 1 22:55:28 2020 +0800

  virtio-pci: implement queue_enabled method

  With version 1, we can detect whether a queue is enabled via
  queue_enabled.

  Signed-off-by: Jason Wang 
  Signed-off-by: Cindy Lu 
  Message-Id: <20200701145538.22333-5-l...@redhat.com>
  Reviewed-by: Michael S. Tsirkin 
  Signed-off-by: Michael S. Tsirkin 
  Acked-by: Jason Wang 

   hw/virtio/virtio-pci.c | 13 +
   1 file changed, 13 insertions(+)

  ---

  Reverting this commit (on top of 5.1.0-rc1) seems to work and prevent
  the hanging.

  ---

  Here's how kata ends up launching qemu in our environment --
  /opt/kata/bin/qemu-system-x86_64 -name 
sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 
6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu 
host,pmu=off -qmp 
unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait
 -m 4096M,slots=10,maxmem=30978M -device 
pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= 
-device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device 
virtconsole,chardev=charconsole0,id=console0 -chardev 
socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait
 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object 
rng-random,id=rng0,filename=/dev/urandom -device 
virtio-rng-pci,rng=rng0,romfile= -device 
virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev 
socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait
 -chardev 
socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock
 -device 
vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev 
tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device 
driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile=
 -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults 
-nographic -daemonize -object 
memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa 
node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 
-initrd 
/opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd
 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 
i8042.dumbkbd=1 i8042.nopnp=1 

Re: [PATCH] hw/input/virtio-input-hid.c: Don't undef CONFIG_CURSES

2020-07-23 Thread Thomas Huth
On 23/07/2020 21.24, Peter Maydell wrote:
> virtio-input-hid.c undefines CONFIG_CURSES before including
> ui/console.h. However since commits e2f82e924d057935 and b0766612d16da18
> that header does not have behaviour dependent on CONFIG_CURSES.
> Remove the now-unneeded undef.
> 
> Signed-off-by: Peter Maydell 
> ---
> NB: tested with 'make check' only.
> 
>  hw/input/virtio-input-hid.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/hw/input/virtio-input-hid.c b/hw/input/virtio-input-hid.c
> index 09cf2609854..a7a244a95db 100644
> --- a/hw/input/virtio-input-hid.c
> +++ b/hw/input/virtio-input-hid.c
> @@ -12,7 +12,6 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/virtio/virtio-input.h"
>  
> -#undef CONFIG_CURSES
>  #include "ui/console.h"
>  
>  #include "standard-headers/linux/input.h"
> 

Reviewed-by: Thomas Huth 




Re: [PATCH 2/2] e1000e: make TX reentrant

2020-07-23 Thread Jason Wang



On 2020/7/23 下午6:36, Peter Maydell wrote:

On Wed, 22 Jul 2020 at 10:00, Jason Wang  wrote:

In loopback mode, e1000e RX can DMA into the TX doorbell, which requires
TX to be reentrant. This patch makes e1000e's TX routine reentrant by
introducing a per-device boolean that records whether a TX routine is
already running, and returning early if so.

Signed-off-by: Jason Wang 
---

This feels like a sticking-plaster fix that's not really in the
right place... It stops us from calling back into
e1000e_start_xmit(), but it doesn't prevent a DMA request
from touching other device registers that update state in
the E1000ECore struct that the transmit code is not expecting
to change.



Right, so we can track the mr owner and fail the memory access if 
there's another mr transaction in memory core: 
memory_region_dispatch_read() and memory_region_dispatch_write().


But what's more interesting is that some devices use a bh for the doorbell,
which may require more thought...


Thanks




thanks
-- PMM






Re: [PATCH v2 4/7] target/riscv: Check nanboxed inputs to fp helpers

2020-07-23 Thread Richard Henderson
On 7/23/20 7:47 PM, LIU Zhiwei wrote:
> 
> 
> On 2020/7/24 8:28, Richard Henderson wrote:
>> If a 32-bit input is not properly nanboxed, then the input is
>> replaced with the default qnan.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>   target/riscv/internals.h  | 11 +++
>>   target/riscv/fpu_helper.c | 64 ---
>>   2 files changed, 57 insertions(+), 18 deletions(-)
>>
>> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
>> index 9f4ba7d617..f1a546dba6 100644
>> --- a/target/riscv/internals.h
>> +++ b/target/riscv/internals.h
>> @@ -43,4 +43,15 @@ static inline uint64_t nanbox_s(float32 f)
>>   return f | MAKE_64BIT_MASK(32, 32);
>>   }
>>   +static inline float32 check_nanbox_s(uint64_t f)
>> +{
>> +    uint64_t mask = MAKE_64BIT_MASK(32, 32);
>> +
>> +    if (likely((f & mask) == mask)) {
>> +    return (uint32_t)f;
>> +    } else {
>> +    return 0x7fc00000u; /* default qnan */
>> +    }
>> +}
>> +
> If possible,
> 
> +static inline float32 check_nanbox(uint64_t f, uint32_t flen)
> +{
> +    uint64_t mask = MAKE_64BIT_MASK(flen, 64 - flen);
> +
> +    if (likely((f & mask) == mask)) {
> +    return (uint32_t)f;
> +    } else {
> +    return (flen == 32) ? 0x7fc00000u : 0x7e00u; /* default qnan */
> +    }
> +}

The difficulty of choosing the proper default qnan is an example of why we
should *not* attempt to make this function fully general, but should instead
define separate functions for each type.  E.g.

static inline float16 check_nanbox_h(uint64_t f);
static inline bfloat16 check_nanbox_b(uint64_t f);


r~
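
A stand-alone sketch of the per-type approach, working on raw bit
patterns instead of QEMU's float32/float16/bfloat16 types. The function
names with the _bits suffix and the mask macros are hypothetical, and the
quiet-NaN constants are the usual IEEE binary32/binary16 and bfloat16
encodings, stated here as assumptions rather than taken from the patch.
It shows why the width alone is not enough: both 16-bit formats share a
mask but need different default NaNs.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bits that must be all ones for a value to count as NaN-boxed. */
#define NANBOX_MASK_32 UINT64_C(0xffffffff00000000) /* bits 63..32 */
#define NANBOX_MASK_16 UINT64_C(0xffffffffffff0000) /* bits 63..16 */

/* Single precision: unbox, or substitute the binary32 quiet NaN. */
uint32_t check_nanbox_s_bits(uint64_t f)
{
    if ((f & NANBOX_MASK_32) == NANBOX_MASK_32) {
        return (uint32_t)f;
    }
    return UINT32_C(0x7fc00000);
}

/* Half precision: unbox, or substitute the binary16 quiet NaN. */
uint16_t check_nanbox_h_bits(uint64_t f)
{
    if ((f & NANBOX_MASK_16) == NANBOX_MASK_16) {
        return (uint16_t)f;
    }
    return 0x7e00;
}

/* bfloat16: same 16-bit mask as half precision, different quiet NaN. */
uint16_t check_nanbox_b_bits(uint64_t f)
{
    if ((f & NANBOX_MASK_16) == NANBOX_MASK_16) {
        return (uint16_t)f;
    }
    return 0x7fc0;
}

/* Usage: an unboxed input yields a different default per format. */
void check_nanbox_selftest(void)
{
    assert(check_nanbox_h_bits(0) == 0x7e00);
    assert(check_nanbox_b_bits(0) == 0x7fc0);
}
```

A single function taking flen could not tell the last two cases apart
without an extra format argument.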



Re: [PATCH v2 1/7] target/riscv: Generate nanboxed results from fp helpers

2020-07-23 Thread Richard Henderson
On 7/23/20 7:35 PM, LIU Zhiwei wrote:
> 
> 
> On 2020/7/24 8:28, Richard Henderson wrote:
>> Make sure that all results from single-precision scalar helpers
>> are properly nan-boxed to 64-bits.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>   target/riscv/internals.h  |  5 +
>>   target/riscv/fpu_helper.c | 42 +--
>>   2 files changed, 28 insertions(+), 19 deletions(-)
>>
>> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
>> index 37d33820ad..9f4ba7d617 100644
>> --- a/target/riscv/internals.h
>> +++ b/target/riscv/internals.h
>> @@ -38,4 +38,9 @@ target_ulong fclass_d(uint64_t frs1);
>>   #define SEW32 2
>>   #define SEW64 3
>>   +static inline uint64_t nanbox_s(float32 f)
>> +{
>> +    return f | MAKE_64BIT_MASK(32, 32);
>> +}
>> +
> If we define it here, we can also define a more general function with flen.
> 
> +static inline uint64_t nanbox_s(float32 f, uint32_t flen)
> +{
> +    return f | MAKE_64BIT_MASK(flen, 64 - flen);
> +}
> +
> 
> So we can reuse it in fp16 or bf16 scalar instructions and in vector 
> instructions.

While we could do that, we will not encounter all possible lengths.  In the
cover letter, I mentioned defining a second function,

static inline uint64_t nanbox_h(float16 f)
{
   return f | MAKE_64BIT_MASK(16, 48);
}

Having two separate functions will, I believe, be easier to use in practice.


r~
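
For reference, a stand-alone sketch of the two boxing helpers on raw bit
patterns. MASK64() is a local macro written to behave like
MAKE_64BIT_MASK(shift, length); nanbox_s_bits() and nanbox_h_bits() are
hypothetical names, not the functions in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 'length' one-bits starting at bit 'shift' (valid for 1 <= length <= 64). */
#define MASK64(shift, length) \
    ((~UINT64_C(0) >> (64 - (length))) << (shift))

/* Box a 32-bit value: set bits 63..32 to all ones. */
uint64_t nanbox_s_bits(uint32_t f)
{
    return (uint64_t)f | MASK64(32, 32);
}

/* Box a 16-bit value: set bits 63..16 to all ones. */
uint64_t nanbox_h_bits(uint16_t f)
{
    return (uint64_t)f | MASK64(16, 48);
}

/* Usage: 1.0 in binary32 and in binary16, boxed into 64 bits. */
void nanbox_selftest(void)
{
    assert(nanbox_s_bits(UINT32_C(0x3f800000)) == UINT64_C(0xffffffff3f800000));
    assert(nanbox_h_bits(0x3c00) == UINT64_C(0xffffffffffff3c00));
}
```

The flen-based mask from the quoted suggestion, MASK64(flen, 64 - flen),
produces the same two constants for flen = 32 and flen = 16, so the
choice between the two designs is about call-site clarity, not about the
resulting bits.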



Re: [PATCH v1] hw/pci-host: save/restore pci host config register

2020-07-23 Thread Wangjing (Hogan, Cloud Infrastructure Service Product Dept.)
On Sat, Jul 25, 2020 at 10:53:03AM Hogan Wang wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Thu, Jul 23, 2020 at 02:12:54PM +0100, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Thu, Jul 23, 2020 at 08:53:03PM +0800, Hogan Wang wrote:
> > > > > From: Hogan Wang 
> > > > > 
> > > > > > The pci host config register is used to save the PCI address for 
> > > > > > reading/writing config data. If the guest writes a value to the 
> > > > > > config register and the vcpu is then paused for migration, then 
> > > > > > after the migration the guest continues by writing the pci config 
> > > > > > data, and that write is ignored because the new qemu process has 
> > > > > > lost the config register state.
> > > > > > 
> > > > > > Reproduction steps are:
> > > > > > 1. The guest is booting in seabios.
> > > > > > 2. The guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and
> > > > > >    then expects to disable the SMRAM by pci_config_writeb.
> > > > > > 3. After the guest writes the pci host config register, the vcpu is
> > > > > >    paused to finish migration.
> > > > > > 4. The guest's config data write (0x0A) fails to disable the SMRAM
> > > > > >    because the config register state was lost.
> > > > > > 5. The guest continues to boot and crashes in the ipxe option ROM
> > > > > >    because the SMRAM is still enabled.
> > > > > 
> > > > > Signed-off-by: Hogan Wang 
> > > > 
> > > > I guess this is like v3 right?
> > > > 
> > > > thanks a lot for the patch!
> > > > 
> > > > My question stands : does anyone see a way to pass this info 
> > > > around without breaking migration for all existing machine types?
> > > 
> > > You need a .needed clause in the vmstate_i440fx_pcihost and 
> > > vmstate_q35_pcihost which is a pointer to a function which enables 
> > > it on new machine types and ignores it on old ones.
> > > 
> > > Or, if it always crashes if the SMRAM is enabled, then the migration 
> > > is dead anyway; so you could make the .needed only save the config 
> > > if the SMRAM is opened, so you'd get a unknown section error, which 
> > > is nasty but it would only happen in the case it would crash anyway.
> > > 
> > > Dave
> > 
> > Problem is we never know whether it's needed.
> > 
> > For example: guest programs cf8, then cfc.
> > Guest on destination can crash if migrated after writing cf8 before 
> > writing cfc.
> > But in theory it can also crash if guest assumes
> > cf8 is unchanged and just writes cfc.
> > 
> > So what I'd prefer to do is put it in some data that old qemu ignores. 
> > Then once qemu on destination is updated, it will start interpreting 
> > it.
> 
> We don't have a way to do that; the choice is:
>   a) Not sending it for old versions, so you only get the
> fix for new machine types
> 
>   b) Trying to second guess when it will crash
> 
> I recommend (a) generally - but the format has no way to ignore unknown data.
> 
> Dave
> 

The i440fx and q35 machines integrate the i440FX or ICH9-LPC PCI device by
default. According to the i440FX and ICH9-LPC specifications, there are some
reserved configuration registers that can be used to save/restore
PCIHostState.config_reg, like i440FX.config[0x57], which is used by older
coreboot to get the RAM size from QEMU.

This is nasty, but it is friendly to old ones.

> > 
> > > > 
> > > > > ---
> > > > >  hw/pci-host/i440fx.c   | 11 +++
> > > > >  hw/pci-host/q35.c  | 11 +++
> > > > >  hw/pci/pci_host.c  | 11 +++
> > > > >  hw/pci/pcie_host.c | 11 +++
> > > > > >  include/hw/pci/pci_host.h  | 10 ++
> > > > > >  include/hw/pci/pcie_host.h | 10 ++
> > > > >  6 files changed, 64 insertions(+)
> > > > > 
> > > > 
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
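
The failure mode discussed in this thread can be reproduced in a small
stand-alone model. This is only a sketch under simplifying assumptions:
ToyPciHost and the toy_* functions are hypothetical, the low byte of the
latched address is used directly as the config offset, and offset 0x72
and the initial value 0x4a are example values. Only the data value 0x0A
comes from the report above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_CFG_SIZE 256

/* Hypothetical model of a host bridge with one device behind it. */
typedef struct ToyPciHost {
    uint32_t config_reg;           /* address latched by a write to 0xcf8 */
    uint8_t  config[TOY_CFG_SIZE]; /* the device's config space */
} ToyPciHost;

/* Guest writes the address port (0xcf8). */
void toy_write_cf8(ToyPciHost *h, uint32_t addr)
{
    assert(h);
    h->config_reg = addr;
}

/* Guest writes the data port (0xcfc). */
void toy_write_cfc(ToyPciHost *h, uint8_t val)
{
    assert(h);
    if (!(h->config_reg & (UINT32_C(1) << 31))) {
        return; /* enable bit clear: the write is silently dropped */
    }
    h->config[h->config_reg & 0xff] = val;
}

/*
 * "Migrate" src to dst. The device config space is always transferred;
 * the latched address only if save_config_reg is nonzero.
 */
void toy_migrate(const ToyPciHost *src, ToyPciHost *dst, int save_config_reg)
{
    assert(src && dst);
    memset(dst, 0, sizeof(*dst));
    memcpy(dst->config, src->config, sizeof(dst->config));
    if (save_config_reg) {
        dst->config_reg = src->config_reg;
    }
}
```

If the guest writes cf8 on the source and cfc on a destination that did
not receive config_reg, the cfc write is dropped and the register keeps
its old value; with config_reg transferred, the write lands as expected.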




[for-5.2 v4 10/10] s390: Recognize host-trust-limitation option

2020-07-23 Thread David Gibson
At least some s390 cpu models support "Protected Virtualization" (PV),
a mechanism to protect guests from eavesdropping by a compromised
hypervisor.

This is similar in function to other mechanisms like AMD's SEV and
POWER's PEF, which are controlled by the "host-trust-limitation"
machine option.  s390 is a slightly special case, because we already
supported PV, simply by using a CPU model with the required feature
(S390_FEAT_UNPACK).

To integrate this with the option used by other platforms, we
implement the following compromise:

 - When the host-trust-limitation option is set, s390 will recognize
   it, verify that the CPU can support PV (failing if not) and set
   virtio default options necessary for encrypted or protected guests,
   as on other platforms.  i.e. if host-trust-limitation is set, we
   will either create a guest capable of entering PV mode, or fail
   outright

 - If host-trust-limitation is not set, guests might still be able to
   enter PV mode, if the CPU has the right model.  This may be a
   little surprising, but shouldn't actually be harmful.

To start a guest supporting Protected Virtualization using the new
option use the command line arguments:
-object s390-pv-guest,id=pv0 -machine host-trust-limitation=pv0

Signed-off-by: David Gibson 
---
 hw/s390x/pv.c | 61 +++
 1 file changed, 61 insertions(+)

diff --git a/hw/s390x/pv.c b/hw/s390x/pv.c
index ab3a2482aa..4bf3b345b6 100644
--- a/hw/s390x/pv.c
+++ b/hw/s390x/pv.c
@@ -14,8 +14,11 @@
 #include 
 
 #include "cpu.h"
+#include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "sysemu/kvm.h"
+#include "qom/object_interfaces.h"
+#include "exec/host-trust-limitation.h"
 #include "hw/s390x/ipl.h"
 #include "hw/s390x/pv.h"
 
@@ -111,3 +114,61 @@ void s390_pv_inject_reset_error(CPUState *cs)
 /* Report that we are unable to enter protected mode */
 env->regs[r1 + 1] = DIAG_308_RC_INVAL_FOR_PV;
 }
+
+#define TYPE_S390_PV_GUEST "s390-pv-guest"
+#define S390_PV_GUEST(obj)  \
+OBJECT_CHECK(S390PVGuestState, (obj), TYPE_S390_PV_GUEST)
+
+typedef struct S390PVGuestState S390PVGuestState;
+
+/**
+ * S390PVGuestState:
+ *
+ * The S390PVGuestState object is basically a dummy used to tell the
+ * host trust limitation system to use s390's PV mechanism.
+ *
+ * # $QEMU \
+ * -object s390-pv-guest,id=pv0 \
+ * -machine ...,host-trust-limitation=pv0
+ */
+struct S390PVGuestState {
+Object parent_obj;
+};
+
+static int s390_pv_kvm_init(HostTrustLimitation *gmpo, Error **errp)
+{
+if (!s390_has_feat(S390_FEAT_UNPACK)) {
+error_setg(errp,
+   "CPU model does not support Protected Virtualization");
+return -1;
+}
+
+return 0;
+}
+
+static void s390_pv_guest_class_init(ObjectClass *oc, void *data)
+{
+HostTrustLimitationClass *gmpc = HOST_TRUST_LIMITATION_CLASS(oc);
+
+gmpc->kvm_init = s390_pv_kvm_init;
+}
+
+static const TypeInfo s390_pv_guest_info = {
+.parent = TYPE_OBJECT,
+.name = TYPE_S390_PV_GUEST,
+.instance_size = sizeof(S390PVGuestState),
+.class_init = s390_pv_guest_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOST_TRUST_LIMITATION },
+{ TYPE_USER_CREATABLE },
+{ }
+}
+};
+
+static void
+s390_pv_register_types(void)
+{
+type_register_static(&s390_pv_guest_info);
+}
+
+type_init(s390_pv_register_types);
-- 
2.26.2
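
The compromise described in the commit message amounts to a small
decision table. The sketch below is stand-alone and hypothetical
(ToyPvOutcome and toy_pv_outcome() are not part of the patch); the fourth
case, option unset and feature absent, is not spelled out in the message
and is assumed here to be an ordinary guest without PV.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyPvOutcome {
    TOY_PV_STARTUP_FAILS,      /* option set, CPU model lacks the feature */
    TOY_PV_READY_WITH_VIRTIO,  /* option set, feature present: PV-capable
                                  guest with protected-guest virtio defaults */
    TOY_PV_POSSIBLE_NO_VIRTIO, /* option unset, feature present: guest may
                                  still enter PV, no virtio defaults */
    TOY_PV_UNAVAILABLE         /* option unset, feature absent (assumed) */
} ToyPvOutcome;

/* htl_set: host-trust-limitation given; cpu_has_unpack: S390_FEAT_UNPACK. */
ToyPvOutcome toy_pv_outcome(bool htl_set, bool cpu_has_unpack)
{
    if (htl_set) {
        return cpu_has_unpack ? TOY_PV_READY_WITH_VIRTIO
                              : TOY_PV_STARTUP_FAILS;
    }
    return cpu_has_unpack ? TOY_PV_POSSIBLE_NO_VIRTIO : TOY_PV_UNAVAILABLE;
}

/* Usage: setting the option never silently yields a non-PV guest. */
void toy_pv_selftest(void)
{
    assert(toy_pv_outcome(true, false) == TOY_PV_STARTUP_FAILS);
    assert(toy_pv_outcome(true, true) == TOY_PV_READY_WITH_VIRTIO);
}
```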




[for-5.2 v4 08/10] spapr: PEF: block migration

2020-07-23 Thread David Gibson
We haven't yet implemented the fairly involved handshaking that will be
needed to migrate PEF protected guests.  For now, just use a migration
blocker so we get a meaningful error if someone attempts this (this is the
same approach used by AMD SEV).

Signed-off-by: David Gibson 
---
 target/ppc/pef.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/ppc/pef.c b/target/ppc/pef.c
index 53a6af0347..6a50efd580 100644
--- a/target/ppc/pef.c
+++ b/target/ppc/pef.c
@@ -36,6 +36,8 @@ struct PefGuestState {
 Object parent_obj;
 };
 
+static Error *pef_mig_blocker;
+
 static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
 {
 if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURE_GUEST)) {
@@ -52,6 +54,10 @@ static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
 }
 }
 
+/* add migration blocker */
+error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
+migrate_add_blocker(pef_mig_blocker, &error_abort);
+
 return 0;
 }
 
-- 
2.26.2




[for-5.2 v4 06/10] host trust limitation: Add Error ** to HostTrustLimitation::kvm_init

2020-07-23 Thread David Gibson
This allows failures to be reported richly and idiomatically.

Signed-off-by: David Gibson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  |  4 +++-
 include/exec/host-trust-limitation.h |  2 +-
 target/i386/sev.c| 31 ++--
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4b6402c12c..3f98c6be7c 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2164,9 +2164,11 @@ static int kvm_init(MachineState *ms)
 if (ms->htl) {
 HostTrustLimitationClass *htlc =
 HOST_TRUST_LIMITATION_GET_CLASS(ms->htl);
+Error *local_err = NULL;
 
-ret = htlc->kvm_init(ms->htl);
+ret = htlc->kvm_init(ms->htl, &local_err);
 if (ret < 0) {
+error_report_err(local_err);
 goto err;
 }
 }
diff --git a/include/exec/host-trust-limitation.h b/include/exec/host-trust-limitation.h
index fc30ea3f78..d93b537280 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -30,7 +30,7 @@
 typedef struct HostTrustLimitationClass {
 InterfaceClass parent;
 
-int (*kvm_init)(HostTrustLimitation *);
+int (*kvm_init)(HostTrustLimitation *, Error **);
 int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 8e3c9dcc2c..0d06976da5 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -626,7 +626,7 @@ sev_vm_state_change(void *opaque, int running, RunState state)
 }
 }
 
-static int sev_kvm_init(HostTrustLimitation *htl)
+static int sev_kvm_init(HostTrustLimitation *htl, Error **errp)
 {
 SevGuestState *sev = SEV_GUEST(htl);
 char *devname;
@@ -648,14 +648,14 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 host_cbitpos = ebx & 0x3f;
 
 if (host_cbitpos != sev->cbitpos) {
-error_report("%s: cbitpos check failed, host '%d' requested '%d'",
- __func__, host_cbitpos, sev->cbitpos);
+error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'",
+   __func__, host_cbitpos, sev->cbitpos);
 goto err;
 }
 
 if (sev->reduced_phys_bits < 1) {
-error_report("%s: reduced_phys_bits check failed, it should be >=1,"
- " requested '%d'", __func__, sev->reduced_phys_bits);
+error_setg(errp, "%s: reduced_phys_bits check failed, it should be >=1,"
+   " requested '%d'", __func__, sev->reduced_phys_bits);
 goto err;
 }
 
@@ -664,20 +664,19 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 devname = object_property_get_str(OBJECT(sev), "sev-device", NULL);
 sev->sev_fd = open(devname, O_RDWR);
 if (sev->sev_fd < 0) {
-error_report("%s: Failed to open %s '%s'", __func__,
- devname, strerror(errno));
-}
-g_free(devname);
-if (sev->sev_fd < 0) {
+error_setg(errp, "%s: Failed to open %s '%s'", __func__,
+   devname, strerror(errno));
+g_free(devname);
 goto err;
 }
+g_free(devname);
 
 ret = sev_platform_ioctl(sev->sev_fd, SEV_PLATFORM_STATUS, &status,
  &fw_error);
 if (ret) {
-error_report("%s: failed to get platform status ret=%d "
- "fw_error='%d: %s'", __func__, ret, fw_error,
- fw_error_to_str(fw_error));
+error_setg(errp, "%s: failed to get platform status ret=%d "
+   "fw_error='%d: %s'", __func__, ret, fw_error,
+   fw_error_to_str(fw_error));
 goto err;
 }
 sev->build_id = status.build;
@@ -687,14 +686,14 @@ static int sev_kvm_init(HostTrustLimitation *htl)
 trace_kvm_sev_init();
 ret = sev_ioctl(sev->sev_fd, KVM_SEV_INIT, NULL, &fw_error);
 if (ret) {
-error_report("%s: failed to initialize ret=%d fw_error=%d '%s'",
- __func__, ret, fw_error, fw_error_to_str(fw_error));
+error_setg(errp, "%s: failed to initialize ret=%d fw_error=%d '%s'",
+   __func__, ret, fw_error, fw_error_to_str(fw_error));
 goto err;
 }
 
 ret = sev_launch_start(sev);
 if (ret) {
-error_report("%s: failed to create encryption context", __func__);
+error_setg(errp, "%s: failed to create encryption context", __func__);
 goto err;
 }
 
-- 
2.26.2
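
The calling convention this patch switches to can be shown in a
stand-alone sketch. ToyError and toy_error_setg() are hypothetical
stand-ins loosely modelled on QEMU's Error and error_setg(); they are not
the real API. The callee only describes the failure through errp, and the
caller decides how to report it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Store a formatted message in *errp; a NULL errp means "not interested". */
void toy_error_setg(ToyError **errp, const char *fmt, ...)
{
    va_list ap;
    ToyError *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL); /* never overwrite an earlier error */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Callee: returns 0 on success, -1 with *errp set on failure. */
int toy_kvm_init_hook(int host_cbitpos, int requested_cbitpos, ToyError **errp)
{
    if (host_cbitpos != requested_cbitpos) {
        toy_error_setg(errp, "cbitpos check failed, host '%d' requested '%d'",
                       host_cbitpos, requested_cbitpos);
        return -1;
    }
    return 0;
}

/* Caller: owns the error, reports it into 'report', then frees it. */
int toy_machine_init(int host, int requested, char *report, size_t report_len)
{
    ToyError *local_err = NULL;
    int ret = toy_kvm_init_hook(host, requested, &local_err);

    if (report_len > 0) {
        report[0] = '\0';
    }
    if (ret < 0) {
        snprintf(report, report_len, "%s", local_err->msg);
        free(local_err);
    }
    return ret;
}
```

Compared with reporting directly inside the hook, this lets another
caller pass NULL, propagate the error further up, or abort, without
changing the hook itself.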




[for-5.2 v4 07/10] spapr: Add PEF based host trust limitation

2020-07-23 Thread David Gibson
Some upcoming POWER machines have a system called PEF (Protected
Execution Facility) which uses a small ultravisor to allow guests to
run in a way that they can't be eavesdropped by the hypervisor.  The
effect is roughly similar to AMD SEV, although the mechanisms are
quite different.

Most of the work of this is done between the guest, KVM and the
ultravisor, with little need for involvement by qemu.  However qemu
does need to tell KVM to allow secure VMs.

Because the availability of secure mode is a guest visible difference
which depends on having the right hardware and firmware, we don't
enable this by default.  In order to run a secure guest you need to
create a "pef-guest" object and set the host-trust-limitation machine
property to point to it.

Note that this just *allows* secure guests; the architecture of PEF is
such that the guest still needs to talk to the ultravisor to enter
secure mode.  Qemu has no direct way of knowing if the guest is in
secure mode, and certainly can't know until well after machine
creation time.

To start a PEF-capable guest, use the command line options:
-object pef-guest,id=pef0 -machine host-trust-limitation=pef0

Signed-off-by: David Gibson 
Acked-by: Ram Pai 
---
 target/ppc/Makefile.objs |  2 +-
 target/ppc/pef.c | 83 
 2 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 target/ppc/pef.c

diff --git a/target/ppc/Makefile.objs b/target/ppc/Makefile.objs
index e8fa18ce13..ac93b9700e 100644
--- a/target/ppc/Makefile.objs
+++ b/target/ppc/Makefile.objs
@@ -6,7 +6,7 @@ obj-y += machine.o mmu_helper.o mmu-hash32.o monitor.o arch_dump.o
 obj-$(TARGET_PPC64) += mmu-hash64.o mmu-book3s-v3.o compat.o
 obj-$(TARGET_PPC64) += mmu-radix64.o
 endif
-obj-$(CONFIG_KVM) += kvm.o
+obj-$(CONFIG_KVM) += kvm.o pef.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += dfp_helper.o
 obj-y += excp_helper.o
diff --git a/target/ppc/pef.c b/target/ppc/pef.c
new file mode 100644
index 0000000000..53a6af0347
--- /dev/null
+++ b/target/ppc/pef.c
@@ -0,0 +1,83 @@
+/*
+ * PEF (Protected Execution Facility) for POWER support
+ *
+ * Copyright David Gibson, Redhat Inc. 2020
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/kvm.h"
+#include "migration/blocker.h"
+#include "exec/host-trust-limitation.h"
+
+#define TYPE_PEF_GUEST "pef-guest"
+#define PEF_GUEST(obj)  \
+OBJECT_CHECK(PefGuestState, (obj), TYPE_PEF_GUEST)
+
+typedef struct PefGuestState PefGuestState;
+
+/**
+ * PefGuestState:
+ *
+ * The PefGuestState object is used for creating and managing a PEF
+ * guest.
+ *
+ * # $QEMU \
+ * -object pef-guest,id=pef0 \
+ * -machine ...,host-trust-limitation=pef0
+ */
+struct PefGuestState {
+Object parent_obj;
+};
+
+static int pef_kvm_init(HostTrustLimitation *gmpo, Error **errp)
+{
+if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURE_GUEST)) {
+error_setg(errp,
+   "KVM implementation does not support Secure VMs (is an ultravisor running?)");
+return -1;
+} else {
+int ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_SECURE_GUEST, 0, 1);
+
+if (ret < 0) {
+error_setg(errp,
+   "Error enabling PEF with KVM");
+return -1;
+}
+}
+
+return 0;
+}
+
+static void pef_guest_class_init(ObjectClass *oc, void *data)
+{
+HostTrustLimitationClass *gmpc = HOST_TRUST_LIMITATION_CLASS(oc);
+
+gmpc->kvm_init = pef_kvm_init;
+}
+
+static const TypeInfo pef_guest_info = {
+.parent = TYPE_OBJECT,
+.name = TYPE_PEF_GUEST,
+.instance_size = sizeof(PefGuestState),
+.class_init = pef_guest_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOST_TRUST_LIMITATION },
+{ TYPE_USER_CREATABLE },
+{ }
+}
+};
+
+static void
+pef_register_types(void)
+{
+type_register_static(&pef_guest_info);
+}
+
+type_init(pef_register_types);
-- 
2.26.2




[for-5.2 v4 09/10] host trust limitation: Alter virtio default properties for protected guests

2020-07-23 Thread David Gibson
The default behaviour for virtio devices is not to use the platform's normal
DMA paths, but instead to use the fact that it's running in a hypervisor
to directly access guest memory.  That doesn't work if the guest's memory
is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a host trust limitation mechanism is enabled, then apply the
iommu_platform=on option so it will go through normal DMA mechanisms.
Those will presumably have some way of marking memory as shared with the
hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson 
---
 hw/core/machine.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index b599b0ba65..2a723bf07b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,6 +28,8 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 #include "exec/host-trust-limitation.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-pci.h"
 
 GlobalProperty hw_compat_5_0[] = {
 { "virtio-balloon-device", "page-poison", "false" },
@@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
  * areas.
  */
 machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+
+/*
+ * Virtio devices can't count on directly accessing guest
+ * memory, so they need iommu_platform=on to use normal DMA
+ * mechanisms.  That requires disabling legacy virtio support
+ * for virtio pci devices
+ */
+object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
+object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
 }
 
 machine_class->init(machine);
-- 
2.26.2




[for-5.2 v4 01/10] host trust limitation: Introduce new host trust limitation interface

2020-07-23 Thread David Gibson
Several architectures have mechanisms which are designed to protect guest
memory from interference or eavesdropping by a compromised hypervisor.  AMD
SEV does this with in-chip memory encryption and Intel has a similar
mechanism.  POWER's Protected Execution Facility (PEF) accomplishes a
similar goal using an ultravisor and new memory protection features,
instead of encryption.

To (partially) unify handling for these, this introduces a new
HostTrustLimitation QOM interface.

Signed-off-by: David Gibson 
Acked-by: Dr. David Alan Gilbert 
Reviewed-by: Richard Henderson 
---
 backends/Makefile.objs   |  2 ++
 backends/host-trust-limitation.c | 29 
 include/exec/host-trust-limitation.h | 33 
 include/qemu/typedefs.h  |  1 +
 4 files changed, 65 insertions(+)
 create mode 100644 backends/host-trust-limitation.c
 create mode 100644 include/exec/host-trust-limitation.h

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 22d204cb48..dcb8f58d31 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -21,3 +21,5 @@ common-obj-$(CONFIG_LINUX) += hostmem-memfd.o
 common-obj-$(CONFIG_GIO) += dbus-vmstate.o
 dbus-vmstate.o-cflags = $(GIO_CFLAGS)
 dbus-vmstate.o-libs = $(GIO_LIBS)
+
+common-obj-y += host-trust-limitation.o
diff --git a/backends/host-trust-limitation.c b/backends/host-trust-limitation.c
new file mode 100644
index 0000000000..96a381cd8a
--- /dev/null
+++ b/backends/host-trust-limitation.c
@@ -0,0 +1,29 @@
+/*
+ * QEMU Host Trust Limitation interface
+ *
+ * Copyright: David Gibson, Red Hat Inc. 2020
+ *
+ * Authors:
+ *  David Gibson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "exec/host-trust-limitation.h"
+
+static const TypeInfo host_trust_limitation_info = {
+.name = TYPE_HOST_TRUST_LIMITATION,
+.parent = TYPE_INTERFACE,
+.class_size = sizeof(HostTrustLimitationClass),
+};
+
+static void host_trust_limitation_register_types(void)
+{
+type_register_static(&host_trust_limitation_info);
+}
+
+type_init(host_trust_limitation_register_types)
diff --git a/include/exec/host-trust-limitation.h b/include/exec/host-trust-limitation.h
new file mode 100644
index 0000000000..03887b1be1
--- /dev/null
+++ b/include/exec/host-trust-limitation.h
@@ -0,0 +1,33 @@
+/*
+ * QEMU Host Trust Limitation interface
+ *
+ * Copyright: David Gibson, Red Hat Inc. 2020
+ *
+ * Authors:
+ *  David Gibson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+#ifndef QEMU_HOST_TRUST_LIMITATION_H
+#define QEMU_HOST_TRUST_LIMITATION_H
+
+#include "qom/object.h"
+
+#define TYPE_HOST_TRUST_LIMITATION "host-trust-limitation"
+#define HOST_TRUST_LIMITATION(obj)\
+INTERFACE_CHECK(HostTrustLimitation, (obj),   \
+TYPE_HOST_TRUST_LIMITATION)
+#define HOST_TRUST_LIMITATION_CLASS(klass)\
+OBJECT_CLASS_CHECK(HostTrustLimitationClass, (klass), \
+   TYPE_HOST_TRUST_LIMITATION)
+#define HOST_TRUST_LIMITATION_GET_CLASS(obj)  \
+OBJECT_GET_CLASS(HostTrustLimitationClass, (obj), \
+ TYPE_HOST_TRUST_LIMITATION)
+
+typedef struct HostTrustLimitationClass {
+InterfaceClass parent;
+} HostTrustLimitationClass;
+
+#endif /* QEMU_HOST_TRUST_LIMITATION_H */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 427027a970..624d59c037 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -51,6 +51,7 @@ typedef struct FWCfgIoState FWCfgIoState;
 typedef struct FWCfgMemState FWCfgMemState;
 typedef struct FWCfgState FWCfgState;
 typedef struct HostMemoryBackend HostMemoryBackend;
+typedef struct HostTrustLimitation HostTrustLimitation;
 typedef struct I2CBus I2CBus;
 typedef struct I2SCodec I2SCodec;
 typedef struct IOMMUMemoryRegion IOMMUMemoryRegion;
-- 
2.26.2




[for-5.2 v4 00/10] Generalize memory encryption models

2020-07-23 Thread David Gibson
A number of hardware platforms are implementing mechanisms whereby the
hypervisor does not have unfettered access to guest memory, in order
to mitigate the security impact of a compromised hypervisor.

AMD's SEV implements this with in-cpu memory encryption, and Intel has
its own memory encryption mechanism.  POWER has an upcoming mechanism
to accomplish this in a different way, using a new memory protection
level plus a small trusted ultravisor.  s390 also has a protected
execution environment.

The current code (committed or draft) for these features has each
platform's version configured entirely differently.  That doesn't seem
ideal for users, or particularly for management layers.

AMD SEV introduces a notionally generic machine option
"memory-encryption", but it doesn't actually cover any cases other
than SEV.

This series is a proposal to at least partially unify configuration
for these mechanisms, by renaming and generalizing AMD's
"memory-encryption" property.  It is replaced by a
"host-trust-limitation" property pointing to a platform specific
object which configures and manages the specific details.

Please apply.

Changes since v3:
 * Rebased
 * Added first cut at handling of s390 protected virtualization
Changes since RFCv2:
 * Rebased
 * Removed preliminary SEV cleanups (they've been merged)
 * Changed name to "host trust limitation"
 * Added migration blocker to the PEF code (based on SEV's version)
Changes since RFCv1:
 * Rebased
 * Fixed some errors pointed out by Dave Gilbert

David Gibson (10):
  host trust limitation: Introduce new host trust limitation interface
  host trust limitation: Handle memory encryption via interface
  host trust limitation: Move side effect out of
machine_set_memory_encryption()
  host trust limitation: Rework the "memory-encryption" property
  host trust limitation: Decouple kvm_memcrypt_*() helpers from KVM
  host trust limitation: Add Error ** to HostTrustLimitation::kvm_init
  spapr: Add PEF based host trust limitation
  spapr: PEF: block migration
  host trust limitation: Alter virtio default properties for protected
guests
  s390: Recognize host-trust-limitation option

 accel/kvm/kvm-all.c  |  40 ++--
 accel/kvm/sev-stub.c |   7 +-
 accel/stubs/kvm-stub.c   |  10 --
 backends/Makefile.objs   |   2 +
 backends/host-trust-limitation.c |  29 ++
 hw/core/machine.c|  61 +--
 hw/i386/pc_sysfw.c   |   6 +-
 hw/s390x/pv.c|  61 +++
 include/exec/host-trust-limitation.h |  72 +
 include/hw/boards.h  |   2 +-
 include/qemu/typedefs.h  |   1 +
 include/sysemu/kvm.h |  17 ---
 include/sysemu/sev.h |   4 +-
 target/i386/sev.c| 148 ---
 target/ppc/Makefile.objs |   2 +-
 target/ppc/pef.c |  89 
 16 files changed, 387 insertions(+), 164 deletions(-)
 create mode 100644 backends/host-trust-limitation.c
 create mode 100644 include/exec/host-trust-limitation.h
 create mode 100644 target/ppc/pef.c

-- 
2.26.2




[for-5.2 v4 05/10] host trust limitation: Decouple kvm_memcrypt_*() helpers from KVM

2020-07-23 Thread David Gibson
The kvm_memcrypt_enabled() and kvm_memcrypt_encrypt_data() helper functions
don't conceptually have any connection to KVM (although it's not possible
in practice to use them without it).

They also rely on looking at the global KVMState.  But the same information
is available from the machine, and the only existing callers have natural
access to the machine state.

Therefore, move and rename them to helpers in host-trust-limitation.h,
taking an explicit machine parameter.

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  | 27 -
 accel/stubs/kvm-stub.c   | 10 
 hw/i386/pc_sysfw.c   |  6 +++--
 include/exec/host-trust-limitation.h | 36 
 include/sysemu/kvm.h | 17 -
 5 files changed, 40 insertions(+), 56 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e2d8f47f93..4b6402c12c 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -117,9 +117,6 @@ struct KVMState
 KVMMemoryListener memory_listener;
 QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 
-/* host trust limitation (e.g. by guest memory encryption) */
-HostTrustLimitation *htl;
-
 /* For "info mtree -f" to tell if an MR is registered in KVM */
 int nr_as;
 struct KVMAs {
@@ -218,28 +215,6 @@ int kvm_get_max_memslots(void)
 return s->nr_slots;
 }
 
-bool kvm_memcrypt_enabled(void)
-{
-if (kvm_state && kvm_state->htl) {
-return true;
-}
-
-return false;
-}
-
-int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
-{
-HostTrustLimitation *htl = kvm_state->htl;
-
-if (htl) {
-HostTrustLimitationClass *htlc = HOST_TRUST_LIMITATION_GET_CLASS(htl);
-
-return htlc->encrypt_data(htl, ptr, len);
-}
-
-return 1;
-}
-
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
@@ -2194,8 +2169,6 @@ static int kvm_init(MachineState *ms)
 if (ret < 0) {
 goto err;
 }
-
-kvm_state->htl = ms->htl;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 82f118d2df..78b3eef117 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -104,16 +104,6 @@ int kvm_on_sigbus(int code, void *addr)
 return 1;
 }
 
-bool kvm_memcrypt_enabled(void)
-{
-return false;
-}
-
-int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
-{
-  return 1;
-}
-
 #ifndef CONFIG_USER_ONLY
 int kvm_irqchip_add_msi_route(KVMState *s, int vector, PCIDevice *dev)
 {
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index b6c0822fe3..e8d3b795a1 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -38,6 +38,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/block/flash.h"
 #include "sysemu/kvm.h"
+#include "exec/host-trust-limitation.h"
 
 /*
  * We don't have a theoretically justifiable exact lower bound on the base
@@ -201,10 +202,11 @@ static void pc_system_flash_map(PCMachineState *pcms,
 pc_isa_bios_init(rom_memory, flash_mem, size);
 
 /* Encrypt the pflash boot ROM */
-if (kvm_memcrypt_enabled()) {
+if (host_trust_limitation_enabled(MACHINE(pcms))) {
 flash_ptr = memory_region_get_ram_ptr(flash_mem);
 flash_size = memory_region_size(flash_mem);
-ret = kvm_memcrypt_encrypt_data(flash_ptr, flash_size);
+ret = host_trust_limitation_encrypt(MACHINE(pcms),
+flash_ptr, flash_size);
 if (ret) {
 error_report("failed to encrypt pflash rom");
 exit(1);
diff --git a/include/exec/host-trust-limitation.h b/include/exec/host-trust-limitation.h
index a19f12ae14..fc30ea3f78 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -14,6 +14,7 @@
 #define QEMU_HOST_TRUST_LIMITATION_H
 
 #include "qom/object.h"
+#include "hw/boards.h"
 
 #define TYPE_HOST_TRUST_LIMITATION "host-trust-limitation"
 #define HOST_TRUST_LIMITATION(obj)\
@@ -33,4 +34,39 @@ typedef struct HostTrustLimitationClass {
 int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
+/**
+ * host_trust_limitation_enabled - return whether guest memory is protected
+ * from hypervisor access (with memory
+ * encryption or otherwise)
+ * Returns: true guest memory is not directly accessible to qemu
+ *  false guest memory is directly accessible to qemu
+ */
+static inline bool host_trust_limitation_enabled(MachineState *machine)
+{
+return !!machine->htl;
+}
+
+/**
+ * host_trust_limitation_encrypt: encrypt the memory range to make
+ *it guest accessible
+ *
+ * 

[for-5.2 v4 04/10] host trust limitation: Rework the "memory-encryption" property

2020-07-23 Thread David Gibson
Currently the "memory-encryption" property is only looked at once we get to
kvm_init().  Although protection of guest memory from the hypervisor isn't
something that could really ever work with TCG, it's not conceptually tied
to the KVM accelerator.

In addition, the way the string property is resolved to an object is
almost identical to how a QOM link property is handled.

So, create a new "host-trust-limitation" link property which sets this QOM
interface link directly in the machine.  For compatibility we keep the
"memory-encryption" property, but now implemented in terms of the new
property.

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c | 23 +++
 hw/core/machine.c   | 41 -
 include/hw/boards.h |  2 +-
 3 files changed, 44 insertions(+), 22 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d7d95eacc7..e2d8f47f93 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2186,25 +2186,16 @@ static int kvm_init(MachineState *ms)
  * if memory encryption object is specified then initialize the memory
  * encryption context.
  */
-if (ms->memory_encryption) {
-Object *obj = object_resolve_path_component(object_get_objects_root(),
-ms->memory_encryption);
-
-if (object_dynamic_cast(obj, TYPE_HOST_TRUST_LIMITATION)) {
-HostTrustLimitation *htl = HOST_TRUST_LIMITATION(obj);
-HostTrustLimitationClass *htlc
-= HOST_TRUST_LIMITATION_GET_CLASS(htl);
-
-ret = htlc->kvm_init(htl);
-if (ret < 0) {
-goto err;
-}
+if (ms->htl) {
+HostTrustLimitationClass *htlc =
+HOST_TRUST_LIMITATION_GET_CLASS(ms->htl);
 
-kvm_state->htl = htl;
-} else {
-ret = -1;
+ret = htlc->kvm_init(ms->htl);
+if (ret < 0) {
 goto err;
 }
+
+kvm_state->htl = ms->htl;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 035a1fc631..b599b0ba65 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -27,6 +27,7 @@
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
+#include "exec/host-trust-limitation.h"
 
 GlobalProperty hw_compat_5_0[] = {
 { "virtio-balloon-device", "page-poison", "false" },
@@ -422,16 +423,37 @@ static char *machine_get_memory_encryption(Object *obj, 
Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
-return g_strdup(ms->memory_encryption);
+if (ms->htl) {
+return g_strdup(object_get_canonical_path_component(OBJECT(ms->htl)));
+}
+
+return NULL;
 }
 
 static void machine_set_memory_encryption(Object *obj, const char *value,
 Error **errp)
 {
-MachineState *ms = MACHINE(obj);
+Object *htl =
+object_resolve_path_component(object_get_objects_root(), value);
+
+if (!htl) {
+error_setg(errp, "No such memory encryption object '%s'", value);
+return;
+}
 
-g_free(ms->memory_encryption);
-ms->memory_encryption = g_strdup(value);
+object_property_set_link(obj, "host-trust-limitation", htl, errp);
+}
+
+static void machine_check_host_trust_limitation(const Object *obj,
+const char *name,
+Object *new_target,
+Error **errp)
+{
+/*
+ * So far the only constraint is that the target has the
+ * TYPE_HOST_TRUST_LIMITATION interface, and that's checked by the
+ * QOM core
+ */
 }
 
 static bool machine_get_nvdimm(Object *obj, Error **errp)
@@ -852,6 +874,15 @@ static void machine_class_init(ObjectClass *oc, void *data)
 object_class_property_set_description(oc, "enforce-config-section",
 "Set on to enforce configuration section migration");
 
+object_class_property_add_link(oc, "host-trust-limitation",
+   TYPE_HOST_TRUST_LIMITATION,
+   offsetof(MachineState, htl),
+   machine_check_host_trust_limitation,
+   OBJ_PROP_LINK_STRONG);
+object_class_property_set_description(oc, "host-trust-limitation",
+"Set host trust limitation object to use");
+
+/* For compatibility */
 object_class_property_add_str(oc, "memory-encryption",
 machine_get_memory_encryption, machine_set_memory_encryption);
 object_class_property_set_description(oc, "memory-encryption",
@@ -1123,7 +1154,7 @@ void machine_run_board_init(MachineState *machine)
 }
 }
 
-if (machine->memory_encryption) {
+if (machine->htl) {
 /*
  * With host trust limitation, the host can't see the real
  * contents of 

[for-5.2 v4 02/10] host trust limitation: Handle memory encryption via interface

2020-07-23 Thread David Gibson
At the moment AMD SEV sets a special function pointer, plus an opaque
handle in KVMState to let things know how to encrypt guest memory.

Now that we have a QOM interface for handling things related to host trust
limitation, use a QOM method on that interface, rather than a bare function
pointer for this.
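
For readers unfamiliar with the pattern, the change from "opaque handle plus
bare function pointer" to "method on an interface" can be sketched in plain C.
This is only a toy model, not QEMU's QOM machinery: ToyHtl, ToyHtlClass,
toy_xor_encrypt and toy_encrypt are hypothetical names, and the XOR routine
stands in for real encryption purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyHtl ToyHtl;

/* Method table, playing the role of the interface class (hypothetical). */
typedef struct {
    int (*encrypt_data)(ToyHtl *htl, uint8_t *ptr, uint64_t len);
} ToyHtlClass;

/* The object carries its own state, so no separate opaque handle is needed. */
struct ToyHtl {
    const ToyHtlClass *klass;
    uint8_t key;
};

/* Illustrative implementation only: XOR is not real encryption. */
static int toy_xor_encrypt(ToyHtl *htl, uint8_t *ptr, uint64_t len)
{
    for (uint64_t i = 0; i < len; i++) {
        ptr[i] ^= htl->key;
    }
    return 0;
}

static const ToyHtlClass toy_xor_class = { .encrypt_data = toy_xor_encrypt };

/*
 * Caller side: dispatch through the object's class. Like the patched
 * kvm_memcrypt_encrypt_data(), it returns 1 when no object is set.
 */
static inline int toy_encrypt(ToyHtl *htl, uint8_t *ptr, uint64_t len)
{
    if (htl) {
        return htl->klass->encrypt_data(htl, ptr, len);
    }
    return 1;
}

/* Small self-check exercising both paths. */
static inline void toy_encrypt_selfcheck(void)
{
    ToyHtl h = { &toy_xor_class, 0xff };
    uint8_t buf[1] = { 0x0f };

    assert(toy_encrypt(&h, buf, 1) == 0 && buf[0] == 0xf0);
    assert(toy_encrypt(NULL, buf, 1) == 1);
}
```

The caller only needs the object pointer; which implementation runs is decided
by the object's class, which is the property the patch relies on.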

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 accel/kvm/kvm-all.c  |  38 ++---
 accel/kvm/sev-stub.c |   7 +-
 include/exec/host-trust-limitation.h |   3 +
 include/sysemu/sev.h |   4 +-
 target/i386/sev.c| 119 +++
 5 files changed, 80 insertions(+), 91 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 63ef6af9a1..d7d95eacc7 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -39,11 +39,11 @@
 #include "qemu/main-loop.h"
 #include "trace.h"
 #include "hw/irq.h"
-#include "sysemu/sev.h"
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
 #include "sysemu/reset.h"
+#include "exec/host-trust-limitation.h"
 
 #include "hw/boards.h"
 
@@ -117,9 +117,8 @@ struct KVMState
 KVMMemoryListener memory_listener;
 QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 
-/* memory encryption */
-void *memcrypt_handle;
-int (*memcrypt_encrypt_data)(void *handle, uint8_t *ptr, uint64_t len);
+/* host trust limitation (e.g. by guest memory encryption) */
+HostTrustLimitation *htl;
 
 /* For "info mtree -f" to tell if an MR is registered in KVM */
 int nr_as;
@@ -221,7 +220,7 @@ int kvm_get_max_memslots(void)
 
 bool kvm_memcrypt_enabled(void)
 {
-if (kvm_state && kvm_state->memcrypt_handle) {
+if (kvm_state && kvm_state->htl) {
 return true;
 }
 
@@ -230,10 +229,12 @@ bool kvm_memcrypt_enabled(void)
 
 int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
 {
-if (kvm_state->memcrypt_handle &&
-kvm_state->memcrypt_encrypt_data) {
-return kvm_state->memcrypt_encrypt_data(kvm_state->memcrypt_handle,
-  ptr, len);
+HostTrustLimitation *htl = kvm_state->htl;
+
+if (htl) {
+HostTrustLimitationClass *htlc = HOST_TRUST_LIMITATION_GET_CLASS(htl);
+
+return htlc->encrypt_data(htl, ptr, len);
 }
 
 return 1;
@@ -2186,13 +2187,24 @@ static int kvm_init(MachineState *ms)
  * encryption context.
  */
 if (ms->memory_encryption) {
-kvm_state->memcrypt_handle = sev_guest_init(ms->memory_encryption);
-if (!kvm_state->memcrypt_handle) {
+Object *obj = object_resolve_path_component(object_get_objects_root(),
+ms->memory_encryption);
+
+if (object_dynamic_cast(obj, TYPE_HOST_TRUST_LIMITATION)) {
+HostTrustLimitation *htl = HOST_TRUST_LIMITATION(obj);
+HostTrustLimitationClass *htlc
+= HOST_TRUST_LIMITATION_GET_CLASS(htl);
+
+ret = htlc->kvm_init(htl);
+if (ret < 0) {
+goto err;
+}
+
+kvm_state->htl = htl;
+} else {
 ret = -1;
 goto err;
 }
-
-kvm_state->memcrypt_encrypt_data = sev_encrypt_data;
 }
 
 ret = kvm_arch_init(ms, s);
diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c
index 4f97452585..9c7c897593 100644
--- a/accel/kvm/sev-stub.c
+++ b/accel/kvm/sev-stub.c
@@ -15,12 +15,7 @@
 #include "qemu-common.h"
 #include "sysemu/sev.h"
 
-int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
-{
-abort();
-}
-
-void *sev_guest_init(const char *id)
+HostTrustLimitation *sev_guest_init(const char *id)
 {
 return NULL;
 }
diff --git a/include/exec/host-trust-limitation.h 
b/include/exec/host-trust-limitation.h
index 03887b1be1..a19f12ae14 100644
--- a/include/exec/host-trust-limitation.h
+++ b/include/exec/host-trust-limitation.h
@@ -28,6 +28,9 @@
 
 typedef struct HostTrustLimitationClass {
 InterfaceClass parent;
+
+int (*kvm_init)(HostTrustLimitation *);
+int (*encrypt_data)(HostTrustLimitation *, uint8_t *, uint64_t);
 } HostTrustLimitationClass;
 
 #endif /* QEMU_HOST_TRUST_LIMITATION_H */
diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h
index 98c1ec8d38..a4aee6a87d 100644
--- a/include/sysemu/sev.h
+++ b/include/sysemu/sev.h
@@ -16,6 +16,6 @@
 
 #include "sysemu/kvm.h"
 
-void *sev_guest_init(const char *id);
-int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len);
+HostTrustLimitation *sev_guest_init(const char *id);
+
 #endif
diff --git a/target/i386/sev.c b/target/i386/sev.c
index c3ecf86704..8e3c9dcc2c 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -28,6 +28,7 @@
 #include "sysemu/runstate.h"
 #include "trace.h"
 #include "migration/blocker.h"
+#include "exec/host-trust-limitation.h"
 
 #define TYPE_SEV_GUEST "sev-guest"
 #define SEV_GUEST(obj)  

[for-5.2 v4 03/10] host trust limitation: Move side effect out of machine_set_memory_encryption()

2020-07-23 Thread David Gibson
When the "memory-encryption" property is set, we also disable KSM
merging for the guest, since it won't accomplish anything.

We want that, but doing it in the property set function itself is
theoretically incorrect, in the unlikely event of some configuration
environment that set the property then cleared it again before
constructing the guest.

More importantly, it makes some other cleanups we want more difficult.
So, instead move this logic to machine_run_board_init() conditional on
the final value of the property.

Signed-off-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 hw/core/machine.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 2f881d6d75..035a1fc631 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -432,14 +432,6 @@ static void machine_set_memory_encryption(Object *obj, 
const char *value,
 
 g_free(ms->memory_encryption);
 ms->memory_encryption = g_strdup(value);
-
-/*
- * With memory encryption, the host can't see the real contents of RAM,
- * so there's no point in it trying to merge areas.
- */
-if (value) {
-machine_set_mem_merge(obj, false, errp);
-}
 }
 
 static bool machine_get_nvdimm(Object *obj, Error **errp)
@@ -1131,6 +1123,15 @@ void machine_run_board_init(MachineState *machine)
 }
 }
 
+if (machine->memory_encryption) {
+/*
+ * With host trust limitation, the host can't see the real
+ * contents of RAM, so there's no point in it trying to merge
+ * areas.
+ */
+machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+}
+
 machine_class->init(machine);
 }
 
-- 
2.26.2




Re: [PATCH v2 4/7] target/riscv: Check nanboxed inputs to fp helpers

2020-07-23 Thread LIU Zhiwei




On 2020/7/24 8:28, Richard Henderson wrote:

If a 32-bit input is not properly nanboxed, then the input is
replaced with the default qnan.

Signed-off-by: Richard Henderson 
---
  target/riscv/internals.h  | 11 +++
  target/riscv/fpu_helper.c | 64 ---
  2 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 9f4ba7d617..f1a546dba6 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -43,4 +43,15 @@ static inline uint64_t nanbox_s(float32 f)
  return f | MAKE_64BIT_MASK(32, 32);
  }
  
+static inline float32 check_nanbox_s(uint64_t f)

+{
+uint64_t mask = MAKE_64BIT_MASK(32, 32);
+
+if (likely((f & mask) == mask)) {
+return (uint32_t)f;
+} else {
+return 0x7fc00000u; /* default qnan */
+}
+}
+

If possible,

+static inline float32 check_nanbox(uint64_t f, uint32_t flen)
+{
+uint64_t mask = MAKE_64BIT_MASK(flen, 64 - flen);
+
+if (likely((f & mask) == mask)) {
+return (uint32_t)f;
+} else {
+return (flen == 32) ? 0x7fc00000u : 0x7e00u; /* default qnan */
+}
+}
+

Reviewed-by: LIU Zhiwei 

Zhiwei

  #endif
diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 72541958a7..bb346a8249 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -81,9 +81,12 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t 
rm)
  set_float_rounding_mode(softrm, &env->fp_status);
  }
  
-static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

-   uint64_t frs3, int flags)
+static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
+   uint64_t rs3, int flags)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
+float32 frs3 = check_nanbox_s(rs3);
  return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
  }
  
@@ -139,74 +142,97 @@ uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

float_muladd_negate_product, &env->fp_status);
  }
  
-uint64_t helper_fadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fmul_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fmin_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_minnum(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fmax_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return nanbox_s(float32_maxnum(frs1, frs2, &env->fp_status));
  }
  
-uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t frs1)

+uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
  {
+float32 frs1 = check_nanbox_s(rs1);
  return nanbox_s(float32_sqrt(frs1, &env->fp_status));
  }
  
-target_ulong helper_fle_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return float32_le(frs1, frs2, &env->fp_status);
  }
  
-target_ulong helper_flt_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
  return float32_lt(frs1, frs2, &env->fp_status);
  }
  
-target_ulong helper_feq_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

+target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
  {
+float32 frs1 = check_nanbox_s(rs1);
+

Re: [PATCH v2 3/7] target/riscv: Generate nanboxed results from trans_rvf.inc.c

2020-07-23 Thread LIU Zhiwei




On 2020/7/24 8:28, Richard Henderson wrote:

Make sure that all results from inline single-precision scalar
operations are properly nan-boxed to 64-bits.

Signed-off-by: Richard Henderson 
---
  target/riscv/insn_trans/trans_rvf.inc.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index c7057482e8..264d3139f1 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -167,6 +167,7 @@ static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
  tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2], cpu_fpr[a->rs1],
  0, 31);
  }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
  mark_fs_dirty(ctx);
  return true;
  }
@@ -183,6 +184,7 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s 
*a)
  tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 31);
  tcg_temp_free_i64(t0);
  }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
  mark_fs_dirty(ctx);
  return true;
  }
@@ -199,6 +201,7 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s 
*a)
  tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
  tcg_temp_free_i64(t0);
  }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
  mark_fs_dirty(ctx);
  return true;
  }
@@ -369,6 +372,7 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
  #else
  tcg_gen_extu_i32_i64(cpu_fpr[a->rd], t0);
  #endif
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
  
  mark_fs_dirty(ctx);

  tcg_temp_free(t0);

Reviewed-by: LIU Zhiwei 

Zhiwei




Re: [PATCH v2 2/7] target/riscv: Generalize gen_nanbox_fpr to gen_nanbox_s

2020-07-23 Thread LIU Zhiwei




On 2020/7/24 8:28, Richard Henderson wrote:

Do not depend on the RVD extension, take input and output via
TCGv_i64 instead of fpu regno.  Move the function to translate.c
so that it can be used in multiple trans_*.inc.c files.

Signed-off-by: Richard Henderson 
---
  target/riscv/insn_trans/trans_rvf.inc.c | 16 +---
  target/riscv/translate.c| 11 +++
  2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index 3bfd8881e7..c7057482e8 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -23,20 +23,6 @@
  return false;   \
  } while (0)
  
-/*

- * RISC-V requires NaN-boxing of narrower width floating
- * point values.  This applies when a 32-bit value is
- * assigned to a 64-bit FP register.  Thus this does not
- * apply when the RVD extension is not present.
- */
-static void gen_nanbox_fpr(DisasContext *ctx, int regno)
-{
-if (has_ext(ctx, RVD)) {
-tcg_gen_ori_i64(cpu_fpr[regno], cpu_fpr[regno],
-MAKE_64BIT_MASK(32, 32));
-}
-}
-
  static bool trans_flw(DisasContext *ctx, arg_flw *a)
  {
  TCGv t0 = tcg_temp_new();
@@ -46,7 +32,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
  tcg_gen_addi_tl(t0, t0, a->imm);
  
  tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL);

-gen_nanbox_fpr(ctx, a->rd);
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
  
  tcg_temp_free(t0);

  mark_fs_dirty(ctx);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 9632e79cf3..12a746da97 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -90,6 +90,17 @@ static inline bool has_ext(DisasContext *ctx, uint32_t ext)
  return ctx->misa & ext;
  }
  
+/*

+ * RISC-V requires NaN-boxing of narrower width floating point values.
+ * This applies when a 32-bit value is assigned to a 64-bit FP register.
+ * For consistency and simplicity, we nanbox results even when the RVD
+ * extension is not present.
+ */
+static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
+{
+tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32));
+}
+

If possible,

+static void gen_nanbox(TCGv_i64 out, TCGv_i64 in, uint32_t flen)
+{
+tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(flen, 64 - flen));
+}
+

Reviewed-by: LIU Zhiwei 

Zhiwei

  static void generate_exception(DisasContext *ctx, int excp)
  {
  tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);





Re: [PATCH v2 1/7] target/riscv: Generate nanboxed results from fp helpers

2020-07-23 Thread LIU Zhiwei




On 2020/7/24 8:28, Richard Henderson wrote:

Make sure that all results from single-precision scalar helpers
are properly nan-boxed to 64-bits.

Signed-off-by: Richard Henderson 
---
  target/riscv/internals.h  |  5 +
  target/riscv/fpu_helper.c | 42 +--
  2 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 37d33820ad..9f4ba7d617 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -38,4 +38,9 @@ target_ulong fclass_d(uint64_t frs1);
  #define SEW32 2
  #define SEW64 3
  
+static inline uint64_t nanbox_s(float32 f)

+{
+return f | MAKE_64BIT_MASK(32, 32);
+}
+

If we define it here, we can also define a more general function with flen.

+static inline uint64_t nanbox_s(float32 f, uint32_t flen)
+{
+return f | MAKE_64BIT_MASK(flen, 64 - flen);
+}
+

So we can reuse it in fp16 or bf16 scalar instructions and in vector
instructions.


Reviewed-by: LIU Zhiwei 

Zhiwei

  #endif
diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 4379756dc4..72541958a7 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -81,10 +81,16 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t 
rm)
  set_float_rounding_mode(softrm, &env->fp_status);
  }
  
+static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

+   uint64_t frs3, int flags)
+{
+return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+}
+
  uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
  uint64_t frs3)
  {
-return float32_muladd(frs1, frs2, frs3, 0, &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, 0);
  }
  
  uint64_t helper_fmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

@@ -96,8 +102,7 @@ uint64_t helper_fmadd_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
  uint64_t helper_fmsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
  uint64_t frs3)
  {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_c,
-  &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, float_muladd_negate_c);
  }
  
  uint64_t helper_fmsub_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

@@ -110,8 +115,7 @@ uint64_t helper_fmsub_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
  uint64_t helper_fnmsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
   uint64_t frs3)
  {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_product,
-  &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, float_muladd_negate_product);
  }
  
  uint64_t helper_fnmsub_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

@@ -124,8 +128,8 @@ uint64_t helper_fnmsub_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
  uint64_t helper_fnmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
   uint64_t frs3)
  {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_c |
-  float_muladd_negate_product, &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3,
+  float_muladd_negate_c | float_muladd_negate_product);
  }
  
  uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,

@@ -137,37 +141,37 @@ uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t 
frs1, uint64_t frs2,
  
  uint64_t helper_fadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_add(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_sub(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fmul_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_mul(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_div(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fmin_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_minnum(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_minnum(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fmax_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

  {
-return float32_maxnum(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_maxnum(frs1, frs2, &env->fp_status));
  }
  
  uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t frs1)

  {
-return float32_sqrt(frs1, &env->fp_status);
+return nanbox_s(float32_sqrt(frs1, &env->fp_status));
  }
  
  target_ulong helper_fle_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)

@@ -209,23 +213,23 @@ uint64_t helper_fcvt_lu_s(CPURISCVState 

Re: [PATCH v2 0/7] target/riscv: NaN-boxing for multiple precison

2020-07-23 Thread LIU Zhiwei




On 2020/7/24 8:28, Richard Henderson wrote:

This is my take on Liu Zhiwei's patch set:
https://patchew.org/QEMU/20200626205917.4545-1-zhiwei_...@c-sky.com

This differs from Zhiwei's v1 in:

  * If a helper is involved, the helper does the boxing and unboxing.

  * Which leaves only LDW and FSGN*.S as the only instructions that
are expanded inline which need to handle nanboxing.

  * All mention of RVD is dropped vs boxing.  This means that an
RVF-only cpu will still generate and check nanboxes into the
64-bit cpu_fpu slots.  There should be no way an RVF-only cpu
can generate an unboxed cpu_fpu value.

This choice is made to speed up the common case: RVF+RVD, so
that we do not have to check whether RVD is enabled.

  * The translate.c primitives take TCGv values rather than fpu
regno, which will make it possible to use them with RVV,
since v0.9 does proper nanboxing.

Agree.

And I think this patch set should be applied if possible, because it is a
bug fix.

  * I have adjusted the current naming to be float32 specific ("*_s"),
to avoid confusion with the float16 data type supported by RVV.

It's OK.

A more general function with flen is better in my opinion, so that it
can be used everywhere, both in scalar and vector instructions, and even
in future fp16 or bf16 instructions.

Zhiwei
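
The flen-parameterised boxing and unboxing discussed in this thread can be
sketched as standalone C. This is a minimal illustration, not the code from
the series: MASK64 is a local stand-in for QEMU's MAKE_64BIT_MASK, the names
nanbox and check_nanbox follow the suggestions above, and the sketch assumes
flen is 16 or 32 with the usual default quiet-NaN patterns 0x7e00 and
0x7fc00000.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-in for QEMU's MAKE_64BIT_MASK(shift, length).
 * Assumes 1 <= length <= 64, so flen == 64 is deliberately not supported.
 */
#define MASK64(shift, length) (((~0ULL) >> (64 - (length))) << (shift))

/* Box an flen-bit value by setting every bit above it to 1. */
static inline uint64_t nanbox(uint64_t f, uint32_t flen)
{
    return f | MASK64(flen, 64 - flen);
}

/*
 * Unbox: if all upper bits are 1, return the low flen bits;
 * otherwise substitute the default quiet NaN for that width.
 */
static inline uint64_t check_nanbox(uint64_t f, uint32_t flen)
{
    uint64_t mask = MASK64(flen, 64 - flen);

    if ((f & mask) == mask) {
        return f & ~mask;
    }
    return flen == 32 ? 0x7fc00000u : 0x7e00u;
}

/* Small self-check: a boxed value round-trips, an unboxed one becomes qNaN. */
static inline void nanbox_selfcheck(void)
{
    uint64_t one_s = 0x3f800000u;   /* 1.0 as float32 */

    assert(check_nanbox(nanbox(one_s, 32), 32) == one_s);
    assert(check_nanbox(one_s, 32) == 0x7fc00000u);
}
```

With a single flen parameter the same pair of helpers covers float32 and a
16-bit format; whether the extra parameter is worth it in the hot float32
path is the trade-off raised in the replies.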


r~


LIU Zhiwei (2):
   target/riscv: Clean up fmv.w.x
   target/riscv: check before allocating TCG temps

Richard Henderson (5):
   target/riscv: Generate nanboxed results from fp helpers
   target/riscv: Generalize gen_nanbox_fpr to gen_nanbox_s
   target/riscv: Generate nanboxed results from trans_rvf.inc.c
   target/riscv: Check nanboxed inputs to fp helpers
   target/riscv: Check nanboxed inputs in trans_rvf.inc.c

  target/riscv/internals.h|  16 
  target/riscv/fpu_helper.c   | 102 
  target/riscv/insn_trans/trans_rvd.inc.c |   8 +-
  target/riscv/insn_trans/trans_rvf.inc.c |  99 ++-
  target/riscv/translate.c|  29 +++
  5 files changed, 178 insertions(+), 76 deletions(-)






Re: [PATCH 02/12] fuzz: Add general virtual-device fuzzer

2020-07-23 Thread Alexander Bulekov
On 200722 2339, Alexander Bulekov wrote:
> This is a generic fuzzer designed to fuzz a virtual device's
> MemoryRegions, as long as they exist within the Memory or Port IO (if it
> exists) AddressSpaces. The fuzzer's input is interpreted into a sequence
> of qtest commands (outb, readw, etc). The interpreted commands are
> separated by a magic separator, which should be easy for the fuzzer to
> guess. Without ASan, the separator can be specified as a "dictionary
> value" using the -dict argument (see libFuzzer documentation).
> 
> Signed-off-by: Alexander Bulekov 
> ---
>  tests/qtest/fuzz/Makefile.include |   1 +
>  tests/qtest/fuzz/general_fuzz.c   | 467 ++
>  2 files changed, 468 insertions(+)
>  create mode 100644 tests/qtest/fuzz/general_fuzz.c
> 
> diff --git a/tests/qtest/fuzz/Makefile.include 
> b/tests/qtest/fuzz/Makefile.include
> index 5bde793bf2..854322efb6 100644
> --- a/tests/qtest/fuzz/Makefile.include
> +++ b/tests/qtest/fuzz/Makefile.include
> @@ -11,6 +11,7 @@ fuzz-obj-y += tests/qtest/fuzz/qtest_wrappers.o
>  fuzz-obj-$(CONFIG_PCI_I440FX) += tests/qtest/fuzz/i440fx_fuzz.o
>  fuzz-obj-$(CONFIG_VIRTIO_NET) += tests/qtest/fuzz/virtio_net_fuzz.o
>  fuzz-obj-$(CONFIG_SCSI) += tests/qtest/fuzz/virtio_scsi_fuzz.o
> +fuzz-obj-y += tests/qtest/fuzz/general_fuzz.o
>  
>  FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
>  
> diff --git a/tests/qtest/fuzz/general_fuzz.c b/tests/qtest/fuzz/general_fuzz.c
> new file mode 100644
> index 0000000000..fd92cc5bdf
> --- /dev/null
> +++ b/tests/qtest/fuzz/general_fuzz.c
> @@ -0,0 +1,467 @@
> +/*
> + * General Virtual-Device Fuzzing Target
> + *
> + * Copyright Red Hat Inc., 2020
> + *
> + * Authors:
> + *  Alexander Bulekov   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include 
> +
> +#include "cpu.h"
> +#include "tests/qtest/libqtest.h"
> +#include "fuzz.h"
> +#include "fork_fuzz.h"
> +#include "exec/address-spaces.h"
> +#include "string.h"
> +#include "exec/memory.h"
> +#include "exec/ramblock.h"
> +#include "exec/address-spaces.h"
> +#include "hw/qdev-core.h"
> +
> +/*
> + * CMD_SEP is a random 32-bit value used to separate "commands" in the fuzz
> + * input
> + */
> +#define CMD_SEP "\x84\x05\x5C\x5E"
> +#define DEFAULT_TIMEOUT_US 10
> +
> +typedef struct {
> +size_t addr;
> +size_t len; /* The number of bytes until the end of the I/O region */
> +} address_range;
> +
> +static useconds_t timeout = 10;
> +/*
> + * List of memory regions that are children of QOM objects specified by the
> + * user for fuzzing.
> + */
> +static GPtrArray *fuzzable_memoryregions;
> +/*
> + * Here we want to convert a fuzzer-provided [io-region-index, offset] to
> + * a physical address. To do this, we iterate over all of the matched
> + * MemoryRegions. Check whether each region exists within the particular io
> + * space. Return the absolute address of the offset within the index'th 
> region
> + * that is a subregion of the io_space and the distance until the end of the
> + * memory region.
> + */
> +static bool get_io_address(address_range *result,
> +MemoryRegion *io_space,
> +uint8_t index,
> +uint32_t offset) {
> +MemoryRegion *mr, *root;
> +index = index % fuzzable_memoryregions->len;
> +int candidate_regions = 0;
> +int i = 0;
> +int ind = index;
> +size_t abs_addr;
> +
> +while (ind >= 0 && fuzzable_memoryregions->len) {
> +*result = (address_range){0, 0};
> +mr = g_ptr_array_index(fuzzable_memoryregions, i);
> +if (mr->enabled) {
> +abs_addr = mr->addr;
> +for (root = mr; root->container; ) {
> +root = root->container;
> +abs_addr += root->addr;
> +}
> +/*
> + * Only consider the region if it is rooted at the io_space we 
> want
> + */
> +if (root == io_space) {

There's a problem here. This finds an absolute address for an index
+ offset in our fuzzable_memoryregions array, but doesn't check that
the MemoryRegion has the highest priority for that address.
I think the way to solve this is to do the reverse lookup with
address_space_translate and ensure that the MemoryRegion* we get back is
the same MemoryRegion* that is in our fuzzable_memoryregions array.

Only noticed this as I was trying to fuzz an audio device and saw that
by fuzzing the device's PCI space the fuzzer would set the BAR over an
existing higher-priority device and the fuzzer was exercising code for
that device.
-Alex
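
The check described above can be illustrated with a toy model. This is a
sketch under simplifying assumptions, not QEMU code: ToyRegion is a
hypothetical flat range with a priority, and toy_translate only plays the
role that address_space_translate would play in the real fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a MemoryRegion: a flat range plus a priority. */
typedef struct {
    uint64_t base;
    uint64_t size;
    int priority;
} ToyRegion;

/* Return the highest-priority region covering addr, or NULL if none does. */
static inline const ToyRegion *toy_translate(const ToyRegion *regions,
                                             size_t n, uint64_t addr)
{
    const ToyRegion *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const ToyRegion *r = &regions[i];

        if (addr >= r->base && addr - r->base < r->size &&
            (!best || r->priority > best->priority)) {
            best = r;
        }
    }
    return best;
}

/* Accept addr only if it resolves back to the region we meant to fuzz. */
static inline bool toy_address_hits_region(const ToyRegion *regions, size_t n,
                                           const ToyRegion *want,
                                           uint64_t addr)
{
    return toy_translate(regions, n, addr) == want;
}

/* Small self-check: an overlapped address belongs to the higher priority. */
static inline void toy_region_selfcheck(void)
{
    ToyRegion rs[] = { { 0x1000, 0x100, 0 }, { 0x1080, 0x100, 1 } };

    assert(toy_address_hits_region(rs, 2, &rs[0], 0x1010));
    assert(!toy_address_hits_region(rs, 2, &rs[0], 0x1090));
}
```

In the overlap, the address computed from the lower-priority region's base
actually lands in the other region, which is the situation seen with the
audio device's BAR.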

> +ind--;
> +candidate_regions++;
> +result->addr = abs_addr + (offset % mr->size);
> +result->len = mr->size - (offset % mr->size);
> +}
> +  

Re: [PATCH for-5.1 2/2] tpm: Improve help on TPM types when none are available

2020-07-23 Thread Stefan Berger

On 7/23/20 7:58 AM, Markus Armbruster wrote:

Help is a bit awkward when no TPM types are built into QEMU:

 $ upstream-qemu -tpmdev nonexistent,id=tpm0


I hope you don't mind me replacing 'upstream-qemu' with 
'x86_64-softmmu/qemu-system-x86_64'?



 upstream-qemu: -tpmdev nonexistent,id=tpm0: Parameter 'type' expects a TPM 
backend type



and this one with 'qemu-system-x86_64:'



 Supported TPM types (choose only one):

Improve to

 upstream-qemu: -tpmdev nonexistent,id=tpm0: Parameter 'type' expects a TPM 
backend type
 No TPM backend types are available



I hope you don't mind me replacing 'upstream-qemu' with 
'x86_64-softmmu/qemu-system-x86_64'?


   Stefan



[PATCH v2 7/7] target/riscv: check before allocating TCG temps

2020-07-23 Thread Richard Henderson
From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
Message-Id: <20200626205917.4545-5-zhiwei_...@c-sky.com>
Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvd.inc.c | 8 
 target/riscv/insn_trans/trans_rvf.inc.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvd.inc.c 
b/target/riscv/insn_trans/trans_rvd.inc.c
index ea1044f13b..4f832637fa 100644
--- a/target/riscv/insn_trans/trans_rvd.inc.c
+++ b/target/riscv/insn_trans/trans_rvd.inc.c
@@ -20,10 +20,10 @@
 
 static bool trans_fld(DisasContext *ctx, arg_fld *a)
 {
-TCGv t0 = tcg_temp_new();
-gen_get_gpr(t0, a->rs1);
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
+TCGv t0 = tcg_temp_new();
+gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEQ);
@@ -35,10 +35,10 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 
 static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 {
-TCGv t0 = tcg_temp_new();
-gen_get_gpr(t0, a->rs1);
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
+TCGv t0 = tcg_temp_new();
+gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEQ);
diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index 0d04677a02..16df9c5ee2 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -25,10 +25,10 @@
 
 static bool trans_flw(DisasContext *ctx, arg_flw *a)
 {
-TCGv t0 = tcg_temp_new();
-gen_get_gpr(t0, a->rs1);
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
+TCGv t0 = tcg_temp_new();
+gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL);
@@ -41,11 +41,11 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 
 static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 {
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVF);
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 
-REQUIRE_FPU;
-REQUIRE_EXT(ctx, RVF);
 tcg_gen_addi_tl(t0, t0, a->imm);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEUL);
-- 
2.25.1
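
A minimal sketch of why the ordering in this patch may matter, assuming the REQUIRE_* macros return false from the translator function when their check fails. All names here (live_temps, toy_temp_new, toy_temp_free, TOY_REQUIRE) are hypothetical stand-ins and not TCG APIs; the counter only models "a temporary was allocated and never freed".

```c
#include <assert.h>
#include <stdbool.h>

static int live_temps;  /* stand-in for the translator's temp accounting */

static int toy_temp_new(void) { return ++live_temps; }
static void toy_temp_free(int t) { (void)t; live_temps--; }

/* Assumed shape of the REQUIRE_* checks: bail out early on failure. */
#define TOY_REQUIRE(cond) do { if (!(cond)) { return false; } } while (0)

/* Old order: the temp is allocated first and is left live on failure. */
static bool toy_trans_old(bool fpu_enabled)
{
    int t0 = toy_temp_new();
    TOY_REQUIRE(fpu_enabled);
    toy_temp_free(t0);
    return true;
}

/* New order: check first, so nothing is allocated on the failure path. */
static bool toy_trans_new(bool fpu_enabled)
{
    TOY_REQUIRE(fpu_enabled);
    int t0 = toy_temp_new();
    toy_temp_free(t0);
    return true;
}
```

In this model, calling toy_trans_old(false) leaves one temporary live, while toy_trans_new(false) leaves none.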




[PATCH v2 5/7] target/riscv: Check nanboxed inputs in trans_rvf.inc.c

2020-07-23 Thread Richard Henderson
If a 32-bit input is not properly nanboxed, then the input is replaced
with the default qnan.  The only inline expansion is for the sign-changing
set of instructions: FSGNJ.S, FSGNJX.S, FSGNJN.S.

Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvf.inc.c | 71 +++--
 target/riscv/translate.c| 18 +++
 2 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index 264d3139f1..f9a9e0643a 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -161,47 +161,86 @@ static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s 
*a)
 {
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
+
 if (a->rs1 == a->rs2) { /* FMOV */
-tcg_gen_mov_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+gen_check_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
 } else { /* FSGNJ */
-tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2], cpu_fpr[a->rs1],
-0, 31);
+TCGv_i64 rs1 = tcg_temp_new_i64();
+TCGv_i64 rs2 = tcg_temp_new_i64();
+
+gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+/* This formulation retains the nanboxing of rs2. */
+tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 31);
+tcg_temp_free_i64(rs1);
+tcg_temp_free_i64(rs2);
 }
-gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 mark_fs_dirty(ctx);
 return true;
 }
 
 static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
 {
+TCGv_i64 rs1, rs2, mask;
+
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
+
+rs1 = tcg_temp_new_i64();
+gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+
 if (a->rs1 == a->rs2) { /* FNEG */
-tcg_gen_xori_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], INT32_MIN);
+tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(31, 1));
 } else {
-TCGv_i64 t0 = tcg_temp_new_i64();
-tcg_gen_not_i64(t0, cpu_fpr[a->rs2]);
-tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 31);
-tcg_temp_free_i64(t0);
+rs2 = tcg_temp_new_i64();
+gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+/*
+ * Replace bit 31 in rs1 with inverse in rs2.
+ * This formulation retains the nanboxing of rs1.
+ */
+mask = tcg_const_i64(~MAKE_64BIT_MASK(31, 1));
+tcg_gen_andc_i64(rs2, mask, rs2);
+tcg_gen_and_i64(rs1, mask, rs1);
+tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
+
+tcg_temp_free_i64(mask);
+tcg_temp_free_i64(rs2);
 }
-gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
+tcg_temp_free_i64(rs1);
+
 mark_fs_dirty(ctx);
 return true;
 }
 
 static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
 {
+TCGv_i64 rs1, rs2;
+
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
+
+rs1 = tcg_temp_new_i64();
+gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+
 if (a->rs1 == a->rs2) { /* FABS */
-tcg_gen_andi_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], ~INT32_MIN);
+tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(31, 1));
 } else {
-TCGv_i64 t0 = tcg_temp_new_i64();
-tcg_gen_andi_i64(t0, cpu_fpr[a->rs2], INT32_MIN);
-tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
-tcg_temp_free_i64(t0);
+rs2 = tcg_temp_new_i64();
+gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+/*
+ * Xor bit 31 in rs1 with that in rs2.
+ * This formulation retains the nanboxing of rs1.
+ */
+tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(31, 1));
+tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
+
+tcg_temp_free_i64(rs2);
 }
-gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
+tcg_temp_free_i64(rs1);
+
 mark_fs_dirty(ctx);
 return true;
 }
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 12a746da97..bf35182776 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -101,6 +101,24 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32));
 }
 
+/*
+ * A narrow n-bit operation, where n < FLEN, checks that input operands
+ * are correctly Nan-boxed, i.e., all upper FLEN - n bits are 1.
+ * If so, the least-significant bits of the input are used, otherwise the
+ * input value is treated as an n-bit canonical NaN (v2.2 section 9.2).
+ *
+ * Here, the result is always nan-boxed, even the canonical nan.
+ */
+static void gen_check_nanbox_s(TCGv_i64 out, TCGv_i64 in)
+{
+TCGv_i64 t_max = tcg_const_i64(0xffffffff00000000ull);
+TCGv_i64 t_nan = tcg_const_i64(0xffffffff7fc00000ull);
+
+tcg_gen_movcond_i64(TCG_COND_GEU, out, in, t_max, in, t_nan);
+tcg_temp_free_i64(t_max);
+tcg_temp_free_i64(t_nan);
+}
+
 static void generate_exception(DisasContext *ctx, int excp)
 {
 

[PATCH v2 3/7] target/riscv: Generate nanboxed results from trans_rvf.inc.c

2020-07-23 Thread Richard Henderson
Make sure that all results from inline single-precision scalar
operations are properly nan-boxed to 64-bits.

Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvf.inc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index c7057482e8..264d3139f1 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -167,6 +167,7 @@ static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
 tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2], cpu_fpr[a->rs1],
 0, 31);
 }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 mark_fs_dirty(ctx);
 return true;
 }
@@ -183,6 +184,7 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s 
*a)
 tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 31);
 tcg_temp_free_i64(t0);
 }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 mark_fs_dirty(ctx);
 return true;
 }
@@ -199,6 +201,7 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s 
*a)
 tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
 tcg_temp_free_i64(t0);
 }
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 mark_fs_dirty(ctx);
 return true;
 }
@@ -369,6 +372,7 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
 #else
 tcg_gen_extu_i32_i64(cpu_fpr[a->rd], t0);
 #endif
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 
 mark_fs_dirty(ctx);
 tcg_temp_free(t0);
-- 
2.25.1




[PATCH v2 2/7] target/riscv: Generalize gen_nanbox_fpr to gen_nanbox_s

2020-07-23 Thread Richard Henderson
Do not depend on the RVD extension, take input and output via
TCGv_i64 instead of fpu regno.  Move the function to translate.c
so that it can be used in multiple trans_*.inc.c files.

Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvf.inc.c | 16 +---
 target/riscv/translate.c| 11 +++
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index 3bfd8881e7..c7057482e8 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -23,20 +23,6 @@
 return false;   \
 } while (0)
 
-/*
- * RISC-V requires NaN-boxing of narrower width floating
- * point values.  This applies when a 32-bit value is
- * assigned to a 64-bit FP register.  Thus this does not
- * apply when the RVD extension is not present.
- */
-static void gen_nanbox_fpr(DisasContext *ctx, int regno)
-{
-if (has_ext(ctx, RVD)) {
-tcg_gen_ori_i64(cpu_fpr[regno], cpu_fpr[regno],
-MAKE_64BIT_MASK(32, 32));
-}
-}
-
 static bool trans_flw(DisasContext *ctx, arg_flw *a)
 {
 TCGv t0 = tcg_temp_new();
@@ -46,7 +32,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 tcg_gen_addi_tl(t0, t0, a->imm);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL);
-gen_nanbox_fpr(ctx, a->rd);
+gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 
 tcg_temp_free(t0);
 mark_fs_dirty(ctx);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 9632e79cf3..12a746da97 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -90,6 +90,17 @@ static inline bool has_ext(DisasContext *ctx, uint32_t ext)
 return ctx->misa & ext;
 }
 
+/*
+ * RISC-V requires NaN-boxing of narrower width floating point values.
+ * This applies when a 32-bit value is assigned to a 64-bit FP register.
+ * For consistency and simplicity, we nanbox results even when the RVD
+ * extension is not present.
+ */
+static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
+{
+tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32));
+}
+
 static void generate_exception(DisasContext *ctx, int excp)
 {
 tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
-- 
2.25.1




[PATCH v2 6/7] target/riscv: Clean up fmv.w.x

2020-07-23 Thread Richard Henderson
From: LIU Zhiwei 

Use tcg_gen_extu_tl_i64 to avoid the ifdef.

Signed-off-by: LIU Zhiwei 
Message-Id: <20200626205917.4545-7-zhiwei_...@c-sky.com>
Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvf.inc.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvf.inc.c 
b/target/riscv/insn_trans/trans_rvf.inc.c
index f9a9e0643a..0d04677a02 100644
--- a/target/riscv/insn_trans/trans_rvf.inc.c
+++ b/target/riscv/insn_trans/trans_rvf.inc.c
@@ -406,11 +406,7 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x 
*a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 
-#if defined(TARGET_RISCV64)
-tcg_gen_mov_i64(cpu_fpr[a->rd], t0);
-#else
-tcg_gen_extu_i32_i64(cpu_fpr[a->rd], t0);
-#endif
+tcg_gen_extu_tl_i64(cpu_fpr[a->rd], t0);
 gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
 
 mark_fs_dirty(ctx);
-- 
2.25.1




[PATCH v2 1/7] target/riscv: Generate nanboxed results from fp helpers

2020-07-23 Thread Richard Henderson
Make sure that all results from single-precision scalar helpers
are properly nan-boxed to 64-bits.

Signed-off-by: Richard Henderson 
---
 target/riscv/internals.h  |  5 +
 target/riscv/fpu_helper.c | 42 +--
 2 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 37d33820ad..9f4ba7d617 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -38,4 +38,9 @@ target_ulong fclass_d(uint64_t frs1);
 #define SEW32 2
 #define SEW64 3
 
+static inline uint64_t nanbox_s(float32 f)
+{
+return f | MAKE_64BIT_MASK(32, 32);
+}
+
 #endif
diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 4379756dc4..72541958a7 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -81,10 +81,16 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t 
rm)
 set_float_rounding_mode(softrm, &env->fp_status);
 }
 
+static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t frs3, int flags)
+{
+return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+}
+
 uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
 uint64_t frs3)
 {
-return float32_muladd(frs1, frs2, frs3, 0, &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, 0);
 }
 
 uint64_t helper_fmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -96,8 +102,7 @@ uint64_t helper_fmadd_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
 uint64_t helper_fmsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
 uint64_t frs3)
 {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_c,
-  &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, float_muladd_negate_c);
 }
 
 uint64_t helper_fmsub_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -110,8 +115,7 @@ uint64_t helper_fmsub_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
 uint64_t helper_fnmsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
  uint64_t frs3)
 {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_product,
-  &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3, float_muladd_negate_product);
 }
 
 uint64_t helper_fnmsub_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -124,8 +128,8 @@ uint64_t helper_fnmsub_d(CPURISCVState *env, uint64_t frs1, 
uint64_t frs2,
 uint64_t helper_fnmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
  uint64_t frs3)
 {
-return float32_muladd(frs1, frs2, frs3, float_muladd_negate_c |
-  float_muladd_negate_product, &env->fp_status);
+return do_fmadd_s(env, frs1, frs2, frs3,
+  float_muladd_negate_c | float_muladd_negate_product);
 }
 
 uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -137,37 +141,37 @@ uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t 
frs1, uint64_t frs2,
 
 uint64_t helper_fadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_add(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_sub(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmul_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_mul(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_div(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmin_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_minnum(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_minnum(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmax_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 {
-return float32_maxnum(frs1, frs2, &env->fp_status);
+return nanbox_s(float32_maxnum(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t frs1)
 {
-return float32_sqrt(frs1, &env->fp_status);
+return nanbox_s(float32_sqrt(frs1, &env->fp_status));
 }
 
 target_ulong helper_fle_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
@@ -209,23 +213,23 @@ uint64_t helper_fcvt_lu_s(CPURISCVState *env, uint64_t 
frs1)
 
 uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1)
 {
-return int32_to_float32((int32_t)rs1, &env->fp_status);
+return nanbox_s(int32_to_float32((int32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1)
 {
-return uint32_to_float32((uint32_t)rs1, &env->fp_status);
+return nanbox_s(uint32_to_float32((uint32_t)rs1, &env->fp_status));
 }
 
 #if 

[PATCH v2 4/7] target/riscv: Check nanboxed inputs to fp helpers

2020-07-23 Thread Richard Henderson
If a 32-bit input is not properly nanboxed, then the input is
replaced with the default qnan.

Signed-off-by: Richard Henderson 
---
 target/riscv/internals.h  | 11 +++
 target/riscv/fpu_helper.c | 64 ---
 2 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 9f4ba7d617..f1a546dba6 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -43,4 +43,15 @@ static inline uint64_t nanbox_s(float32 f)
 return f | MAKE_64BIT_MASK(32, 32);
 }
 
+static inline float32 check_nanbox_s(uint64_t f)
+{
+uint64_t mask = MAKE_64BIT_MASK(32, 32);
+
+if (likely((f & mask) == mask)) {
+return (uint32_t)f;
+} else {
+return 0x7fc00000u; /* default qnan */
+}
+}
+
 #endif
diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 72541958a7..bb346a8249 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -81,9 +81,12 @@ void helper_set_rounding_mode(CPURISCVState *env, uint32_t 
rm)
 set_float_rounding_mode(softrm, &env->fp_status);
 }
 
-static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
-   uint64_t frs3, int flags)
+static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
+   uint64_t rs3, int flags)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
+float32 frs3 = check_nanbox_s(rs3);
 return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
 }
 
@@ -139,74 +142,97 @@ uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t 
frs1, uint64_t frs2,
   float_muladd_negate_product, &env->fp_status);
 }
 
-uint64_t helper_fadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fmul_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fmin_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_minnum(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fmax_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return nanbox_s(float32_maxnum(frs1, frs2, &env->fp_status));
 }
 
-uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t frs1)
+uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
 {
+float32 frs1 = check_nanbox_s(rs1);
 return nanbox_s(float32_sqrt(frs1, &env->fp_status));
 }
 
-target_ulong helper_fle_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return float32_le(frs1, frs2, &env->fp_status);
 }
 
-target_ulong helper_flt_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return float32_lt(frs1, frs2, &env->fp_status);
 }
 
-target_ulong helper_feq_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
+float32 frs1 = check_nanbox_s(rs1);
+float32 frs2 = check_nanbox_s(rs2);
 return float32_eq_quiet(frs1, frs2, &env->fp_status);
 }
 
-target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t frs1)
+target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t rs1)
 {
+float32 frs1 = check_nanbox_s(rs1);
 return float32_to_int32(frs1, &env->fp_status);
 }
 
-target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t frs1)
+target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t rs1)
 {

[PATCH v2 0/7] target/riscv: NaN-boxing for multiple precison

2020-07-23 Thread Richard Henderson
This is my take on Liu Zhiwei's patch set:
https://patchew.org/QEMU/20200626205917.4545-1-zhiwei_...@c-sky.com

This differs from Zhiwei's v1 in:

 * If a helper is involved, the helper does the boxing and unboxing.

 * This leaves LDW and FSGN*.S as the only instructions that
   are expanded inline and need to handle nanboxing.

 * All mention of RVD is dropped vs boxing.  This means that an
   RVF-only cpu will still generate and check nanboxes into the
   64-bit cpu_fpu slots.  There should be no way an RVF-only cpu
   can generate an unboxed cpu_fpu value.

   This choice is made to speed up the common case: RVF+RVD, so
   that we do not have to check whether RVD is enabled.

 * The translate.c primitives take TCGv values rather than fpu
   regno, which will make it possible to use them with RVV,
   since v0.9 does proper nanboxing.

 * I have adjusted the current naming to be float32 specific ("*_s"),
   to avoid confusion with the float16 data type supported by RVV.


r~


LIU Zhiwei (2):
  target/riscv: Clean up fmv.w.x
  target/riscv: check before allocating TCG temps

Richard Henderson (5):
  target/riscv: Generate nanboxed results from fp helpers
  target/riscv: Generalize gen_nanbox_fpr to gen_nanbox_s
  target/riscv: Generate nanboxed results from trans_rvf.inc.c
  target/riscv: Check nanboxed inputs to fp helpers
  target/riscv: Check nanboxed inputs in trans_rvf.inc.c

 target/riscv/internals.h|  16 
 target/riscv/fpu_helper.c   | 102 
 target/riscv/insn_trans/trans_rvd.inc.c |   8 +-
 target/riscv/insn_trans/trans_rvf.inc.c |  99 ++-
 target/riscv/translate.c|  29 +++
 5 files changed, 178 insertions(+), 76 deletions(-)

-- 
2.25.1
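
The boxing rules described in this series can be modelled in plain C. This is a sketch under stated assumptions, not the QEMU code: the model_* names are hypothetical, registers are modelled as 64-bit integers holding raw bit patterns, and the default NaN is taken to be the canonical single-precision quiet NaN 0x7fc00000. It shows boxing a result, checking an input, and the FSGNJN.S sign injection built on top of the two.

```c
#include <assert.h>
#include <stdint.h>

#define BOX_MASK     0xffffffff00000000ull  /* upper 32 bits all ones */
#define DEFAULT_QNAN 0x7fc00000u            /* canonical float32 quiet NaN */

/* Box a 32-bit float bit pattern into a 64-bit register value. */
static uint64_t model_nanbox_s(uint32_t f)
{
    return (uint64_t)f | BOX_MASK;
}

/* Unbox: use the low 32 bits only if the upper 32 bits are all ones. */
static uint32_t model_check_nanbox_s(uint64_t v)
{
    return (v & BOX_MASK) == BOX_MASK ? (uint32_t)v : DEFAULT_QNAN;
}

/* FSGNJN.S model: magnitude of rs1, inverted sign of rs2, boxed result. */
static uint64_t model_fsgnjn_s(uint64_t rs1, uint64_t rs2)
{
    uint32_t a = model_check_nanbox_s(rs1);
    uint32_t b = model_check_nanbox_s(rs2);
    uint32_t sign = ~b & 0x80000000u;

    return model_nanbox_s((a & 0x7fffffffu) | sign);
}
```

When both operands are the same properly boxed register, model_fsgnjn_s acts as a negate; an operand whose upper half is not all ones is treated as the default NaN before the sign is injected.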




Re: [PATCH v2 16/22] qemu-iotests/199: change discard patterns

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

iotest 40 works too long because of many discard opertion. On the same


I'm assuming you meant s/40/199/ here, as well as the typo fixes pointed 
out by Andrey.



time, postcopy period is very short, in spite of all these efforts.

So, let's use less discards (and with more interesting patterns) to
reduce test timing. In the next commit we'll increase postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 44 +-
  1 file changed, 26 insertions(+), 18 deletions(-)




--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 17/22] qemu-iotests/199: increase postcopy period

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Test wants force bitmap postcopy. Still, resulting postcopy period is


The test wants to force a bitmap postcopy. Still, the resulting postcopy 
period is very small.



very small. Let's increase it by adding more bitmaps to migrate. Also,
test disabled bitmaps migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 58 --
  1 file changed, 39 insertions(+), 19 deletions(-)


Patches 12-17:
Tested-by: Eric Blake 

As they all work without any other patches in this series, and DO make a 
dramatic difference (cutting the test from over 70 seconds to just 7, on 
my machine), I'm inclined to stage them now, even while waiting for you 
to rebase the rest of the series.  And 18 is already in the tree.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v0 3/4] migration: add background snapshot

2020-07-23 Thread Peter Xu
On Wed, Jul 22, 2020 at 11:11:32AM +0300, Denis Plotnikov wrote:
> +/**
> + * ram_copy_page: make a page copy
> + *
> + * Used in the background snapshot to make a copy of a memeory page.
> + * Ensures that the memeory page is copied only once.
> + * When a page copy is done, restores read/write access to the memory
> + * page.
> + * If a page is being copied by another thread, wait until the copying
> + * ends and exit.
> + *
> + * Returns:
> + *   -1 - on error
> + *0 - the page wasn't copied by the function call
> + *1 - the page has been copied
> + *
> + * @block: RAM block to use
> + * @page_nr:   the page number to copy
> + * @page_copy: the pointer to return a page copy
> + *
> + */
> +int ram_copy_page(RAMBlock *block, unsigned long page_nr,
> +  void **page_copy)
> +{
> +void *host_page;
> +int res = 0;
> +
> +atomic_inc(_state->page_copier_cnt);
> +
> +if (test_and_set_bit_atomic(page_nr, block->touched_map)) {
> +while (!test_bit_atomic(page_nr, block->copied_map)) {
> +/* the page is being copied -- wait for the end of the copying */
> +qemu_event_wait(_state->page_copying_done);
> +}
> +goto out;
> +}
> +
> +*page_copy = ram_page_buffer_get();
> +if (!*page_copy) {
> +res = -1;
> +goto out;
> +}
> +
> +host_page = block->host + (page_nr << TARGET_PAGE_BITS);
> +memcpy(*page_copy, host_page, TARGET_PAGE_SIZE);
> +
> +if (ram_set_rw(host_page, TARGET_PAGE_SIZE)) {
> +ram_page_buffer_free(*page_copy);
> +*page_copy = NULL;
> +res = -1;
> +goto out;
> +}
> +
> +set_bit_atomic(page_nr, block->copied_map);
> +qemu_event_set(_state->page_copying_done);
> +qemu_event_reset(_state->page_copying_done);
> +
> +res = 1;
> +out:
> +atomic_dec(_state->page_copier_cnt);
> +return res;
> +}

Is ram_set_rw() called on the page only when a page fault is triggered?
Shouldn't we also do that in the background thread when we proactively
copy the pages?

Besides the current solution, do you think we can make it simpler by only
delivering the fault request to the background thread?  We can let the
background thread do all the rest, and IIUC we can drop all the complicated
sync bitmaps and so on by doing so.  The process can look like:

  - the background thread runs the general precopy migration, and,

    - it only does the ram_bulk_stage, which is the first loop, because for
      a snapshot there is no reason to send a page twice,

    - after copying one page, it always does ram_set_rw(), so accessing this
      page will never trigger a write-protect page fault again,

    - it takes requests from unqueue_page() just like what's done in this
      series, but instead of copying the page, the page request should look
      exactly like the postcopy one.  We don't need copy_page because the
      page won't be changed before we unprotect it, so it should be safe.
      These pages will still have high priority, because being queued means
      a vcpu wrote to this protected page and faulted in userfaultfd.  We
      need to migrate these pages first to unblock them.

  - the fault handler thread only needs to do:

    - when it gets a uffd-wp message, translate it into a postcopy-like
      request (calculate ramblock and offset), then queue it.  That's all.

I believe we can avoid the copy_page parameter that was passed around, and we
can also save the two extra bitmaps and the complicated synchronizations.

Do you think this would work?

Besides, have we disabled dirty tracking of memslots?  IIUC that's not needed
for background snapshot either, so we need neither dirty tracking nor syncing
of the dirty bitmap from outside (kvm/vhost/...).

-- 
Peter Xu
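
The flow proposed above can be sketched as a single-threaded toy model. This is only an illustration under assumptions: ToySnapshot, toy_queue_fault and toy_send_next are hypothetical names, the "fault handler" and "background thread" are plain function calls rather than threads, and setting writable[] merely stands in for lifting write protection with ram_set_rw(). The point is that one bulk pass plus a queue of faulted pages sends every page exactly once, with faulted pages served first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES 8

typedef struct ToySnapshot {
    bool sent[TOY_PAGES];       /* page already written to the snapshot */
    bool queued[TOY_PAGES];     /* page already sits in the fault queue */
    bool writable[TOY_PAGES];   /* write protection lifted */
    size_t queue[TOY_PAGES];    /* FIFO of faulted page numbers */
    size_t q_head, q_tail;
    size_t next;                /* cursor of the single bulk pass */
    size_t order[TOY_PAGES];    /* order in which pages were sent */
    size_t n_sent;
} ToySnapshot;

/* Fault handler side: only record which page a vcpu is blocked on. */
static void toy_queue_fault(ToySnapshot *s, size_t page)
{
    if (page < TOY_PAGES && !s->sent[page] && !s->queued[page]) {
        s->queued[page] = true;
        s->queue[s->q_tail++] = page;
    }
}

/* Background side: send one page, preferring queued faults. */
static bool toy_send_next(ToySnapshot *s)
{
    size_t page;

    if (s->q_head < s->q_tail) {
        page = s->queue[s->q_head++];
    } else {
        while (s->next < TOY_PAGES && s->sent[s->next]) {
            s->next++;
        }
        if (s->next == TOY_PAGES) {
            return false;       /* every page has been sent once */
        }
        page = s->next++;
    }
    s->sent[page] = true;       /* still write-protected, so unchanged */
    s->writable[page] = true;   /* unprotect only after the send */
    s->order[s->n_sent++] = page;
    return true;
}
```

Because a page is unprotected only after it has been sent, no separate page copy and no "touched"/"copied" bitmaps appear in this model; whether that carries over to the real migration code is exactly the question raised above.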




[Bug 1888728] Re: Bare chroot in linux-user fails with pgb_reserved_va: Assertion `guest_base != 0' failed.

2020-07-23 Thread Laurent Vivier
** Tags added: linux-user

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888728

Title:
  Bare chroot in linux-user fails with pgb_reserved_va: Assertion
  `guest_base != 0' failed.

Status in QEMU:
  New

Bug description:
  Trying to run a bare chroot with no additional bind mounts fails on
  git master (8ffa52c20d5693d454f65f2024a1494edfea65d4) with:

  root@nofan:~/qemu> chroot /local_scratch/sid-m68k-sbuild/
  qemu-m68k-static: /root/qemu/linux-user/elfload.c:2315: pgb_reserved_va: 
Assertion `guest_base != 0' failed.
  Aborted
  root@nofan:~/qemu>

  The problem can be worked around by bind-mounting /proc from the host
  system into the target chroot:

  root@nofan:~/qemu> mount -o bind /proc/ /local_scratch/sid-m68k-sbuild/proc/
  root@nofan:~/qemu> chroot /local_scratch/sid-m68k-sbuild/
  bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
  (sid-m68k-sbuild)root@nofan:/#

  Host system is an up-to-date Debian unstable (2020-07-23).

  I have not been able to bisect the issue yet since there is another
  annoying linux-user bug (virtual memory exhaustion) that was introduced
  and fixed somewhere between v5.0.0 and HEAD and overshadows the
  original assertion failure.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1888728/+subscriptions



[PATCH v2] linux-user: Add most IFTUN ioctls

2020-07-23 Thread Shu-Chun Weng
The three options handling `struct sock_fprog` (TUNATTACHFILTER,
TUNDETACHFILTER, and TUNGETFILTER) are not implemented. The Linux kernel
keeps a user-space pointer in them, which we cannot correctly handle.

Signed-off-by: Josh Kunz 
Signed-off-by: Shu-Chun Weng 
---
v2:
  Title changed from "linux-user: Add several IFTUN ioctls"

  Properly specify the argument types for various options, including a custom
  implementation for TUNSETTXFILTER.

  #ifdef guards for macros introduced up to 5 years ago.

 linux-user/ioctls.h   | 45 +++
 linux-user/syscall.c  | 36 +++
 linux-user/syscall_defs.h | 32 
 3 files changed, 113 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 0713ae1311..b9fb01f558 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -593,3 +593,48 @@
   IOCTL(KCOV_DISABLE, 0, TYPE_NULL)
   IOCTL(KCOV_INIT_TRACE, IOC_R, TYPE_ULONG)
 #endif
+
+  IOCTL(TUNSETDEBUG, IOC_W, TYPE_INT)
+  IOCTL(TUNSETIFF,   IOC_RW, MK_PTR(MK_STRUCT(STRUCT_short_ifreq)))
+  IOCTL(TUNSETPERSIST,   IOC_W, TYPE_INT)
+  IOCTL(TUNSETOWNER, IOC_W, TYPE_INT)
+  IOCTL(TUNSETLINK,  IOC_W, TYPE_INT)
+  IOCTL(TUNSETGROUP, IOC_W, TYPE_INT)
+  IOCTL(TUNGETFEATURES,  IOC_R, MK_PTR(TYPE_INT))
+  IOCTL(TUNSETOFFLOAD,   IOC_W, TYPE_LONG)
+  IOCTL_SPECIAL(TUNSETTXFILTER, IOC_W, do_ioctl_TUNSETTXFILTER,
+/*
+ * We can't represent `struct tun_filter` in thunk so leaving
+ * this empty. do_ioctl_TUNSETTXFILTER will do the conversion.
+ */
+TYPE_NULL)
+  IOCTL(TUNGETIFF,   IOC_R, MK_PTR(MK_STRUCT(STRUCT_short_ifreq)))
+  IOCTL(TUNGETSNDBUF,IOC_R, MK_PTR(TYPE_INT))
+  IOCTL(TUNSETSNDBUF,IOC_W, MK_PTR(TYPE_INT))
+  /*
+   * TUNATTACHFILTER and TUNDETACHFILTER are not supported. Linux kernel keeps 
a
+   * user pointer in TUNATTACHFILTER, which we are not able to correctly 
handle.
+   */
+  IOCTL(TUNGETVNETHDRSZ, IOC_R, MK_PTR(TYPE_INT))
+  IOCTL(TUNSETVNETHDRSZ, IOC_W, MK_PTR(TYPE_INT))
+  IOCTL(TUNSETQUEUE, IOC_W, MK_PTR(MK_STRUCT(STRUCT_short_ifreq)))
+  IOCTL(TUNSETIFINDEX ,  IOC_W, MK_PTR(TYPE_INT))
+  /* TUNGETFILTER is not supported: see TUNATTACHFILTER. */
+  IOCTL(TUNSETVNETLE,IOC_W, MK_PTR(TYPE_INT))
+  IOCTL(TUNGETVNETLE,IOC_R, MK_PTR(TYPE_INT))
+#ifdef TUNSETVNETBE
+  IOCTL(TUNSETVNETBE,IOC_W, MK_PTR(TYPE_INT))
+  IOCTL(TUNGETVNETBE,IOC_R, MK_PTR(TYPE_INT))
+#endif
+#ifdef TUNSETSTEERINGEBPF
+  IOCTL(TUNSETSTEERINGEBPF, IOC_W, MK_PTR(TYPE_INT))
+#endif
+#ifdef TUNSETFILTEREBPF
+  IOCTL(TUNSETFILTEREBPF, IOC_W, MK_PTR(TYPE_INT))
+#endif
+#ifdef TUNSETCARRIER
+  IOCTL(TUNSETCARRIER,   IOC_W, MK_PTR(TYPE_INT))
+#endif
+#ifdef TUNGETDEVNETNS
+  IOCTL(TUNGETDEVNETNS,  IOC_R, TYPE_NULL)
+#endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1211e759c2..7f1efed189 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #ifdef CONFIG_TIMERFD
@@ -5422,6 +5423,41 @@ static abi_long do_ioctl_drm(const IOCTLEntry *ie, uint8_t *buf_temp,
 
 #endif
 
+static abi_long do_ioctl_TUNSETTXFILTER(const IOCTLEntry *ie, uint8_t *buf_temp,
+int fd, int cmd, abi_long arg)
+{
+struct tun_filter *filter = (struct tun_filter *)buf_temp;
+struct tun_filter *target_filter;
+char *target_addr;
+
+assert(ie->access == IOC_W);
+
+target_filter = lock_user(VERIFY_READ, arg, sizeof(*filter), 1);
+if (!target_filter) {
+return -TARGET_EFAULT;
+}
+filter->flags = tswap16(target_filter->flags);
+filter->count = tswap16(target_filter->count);
+unlock_user(target_filter, arg, sizeof(*filter));
+
+if (filter->count) {
+if (sizeof(*filter) + filter->count * ETH_ALEN > MAX_STRUCT_SIZE) {
+return -TARGET_EFAULT;
+}
+
+target_addr = lock_user(VERIFY_READ, arg + sizeof(*filter),
+filter->count * ETH_ALEN, 1);
+if (!target_addr) {
+return -TARGET_EFAULT;
+}
+memcpy(filter->addr, target_addr, filter->count * ETH_ALEN);
+unlock_user(target_addr, arg + sizeof(*filter),
+filter->count * ETH_ALEN);
+}
+
+return get_errno(safe_ioctl(fd, ie->host_cmd, filter));
+}
+
 IOCTLEntry ioctl_entries[] = {
 #define IOCTL(cmd, access, ...) \
 { TARGET_ ## cmd, cmd, #cmd, access, 0, {  __VA_ARGS__ } },
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 3c261cff0e..7ef0ff0328 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -891,6 +891,38 @@ struct target_rtc_pll_info {
 
 #define TARGET_SIOCGIWNAME 0x8B01  /* get name == wireless protocol */
 
+/* From  */
+
+#define TARGET_TUNSETDEBUG TARGET_IOW('T', 201, int)
+#define 

Re: Testing the virtio-vhost-user QEMU patch

2020-07-23 Thread Alyssa Ross
Stefan Hajnoczi  writes:

> On Tue, Jul 21, 2020 at 07:14:38AM +, Alyssa Ross wrote:
>> Hi -- I hope it's okay me reaching out like this.
>> 
>> I've been trying to test out the virtio-vhost-user implementation that's
>> been posted to this list a couple of times, but have been unable to get
>> it to boot a kernel following the steps listed either on
>>  or
>> .
>> 
>> Specifically, the kernel appears to be unable to write to the
>> virtio-vhost-user device's PCI registers.  I've included the full panic
>> output from the kernel at the end of this message.  The panic is
>> reproducible with two different kernels I tried (with different configs
>> and versions).  I tried both versions of the virtio-vhost-user I was
>> able to find[1][2], and both exhibited the same behaviour.
>> 
>> Is this a known issue?  Am I doing something wrong?
>
> Hi,
> Unfortunately I'm not sure what the issue is. This is an early
> virtio-pci register access before a driver for any specific device type
> (net, blk, vhost-user, etc) comes into play.

Small update here: I tried on another computer, and it worked.  Made
sure that it was exactly the same QEMU binary, command line, and VM
disk/initrd/kernel, so I think I can fairly confidently say the panic
depends on what hardware QEMU is running on.  I also set -cpu to the
same value on both (SandyBridge).

I also discovered that it works on my primary computer (the one it
panicked on before) with KVM disabled.

Note that I've only got so far as finding that it boots on the other
machine -- I haven't verified yet that it actually works.

Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
Good host CPU: AMD EPYC 7401P 24-Core Processor

May I ask what host CPUs other people have tested this on?  Having more
data would probably be useful.  Could it be an AMD vs. Intel thing?



Re: [PATCH v0 2/4] migration: add background snapshot capability

2020-07-23 Thread Peter Xu
On Wed, Jul 22, 2020 at 11:11:31AM +0300, Denis Plotnikov wrote:
> diff --git a/migration/migration.c b/migration/migration.c
> index 2ed9923227..2ec0451abe 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1086,6 +1086,32 @@ static bool migrate_caps_check(bool *cap_list,
>  error_setg(errp, "Postcopy is not compatible with ignore-shared");
>  return false;
>  }
> +
> +if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
> +error_setg(errp, "Postcopy is not compatible "
> +"with background snapshot");
> +return false;
> +}
> +}
> +
> +if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
> +if (cap_list[MIGRATION_CAPABILITY_RELEASE_RAM]) {
> +error_setg(errp, "Background snapshot is not compatible "
> +"with release ram capability");
> +return false;
> +}
> +
> +if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
> +error_setg(errp, "Background snapshot is not "
> +"currently compatible with compression");
> +return false;
> +}
> +
> +if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
> +error_setg(errp, "Background snapshot is not "
> +"currently compatible with XBZLRE");
> +return false;
> +}

Are these four the only ones that are not compatible with background snapshot?
I'm looking at:

typedef enum MigrationCapability {
MIGRATION_CAPABILITY_XBZRLE,
MIGRATION_CAPABILITY_RDMA_PIN_ALL,
MIGRATION_CAPABILITY_AUTO_CONVERGE,
MIGRATION_CAPABILITY_ZERO_BLOCKS,
MIGRATION_CAPABILITY_COMPRESS,
MIGRATION_CAPABILITY_EVENTS,
MIGRATION_CAPABILITY_POSTCOPY_RAM,
MIGRATION_CAPABILITY_X_COLO,
MIGRATION_CAPABILITY_RELEASE_RAM,
MIGRATION_CAPABILITY_BLOCK,
MIGRATION_CAPABILITY_RETURN_PATH,
MIGRATION_CAPABILITY_PAUSE_BEFORE_SWITCHOVER,
MIGRATION_CAPABILITY_MULTIFD,
MIGRATION_CAPABILITY_DIRTY_BITMAPS,
MIGRATION_CAPABILITY_POSTCOPY_BLOCKTIME,
MIGRATION_CAPABILITY_LATE_BLOCK_ACTIVATE,
MIGRATION_CAPABILITY_X_IGNORE_SHARED,
MIGRATION_CAPABILITY_VALIDATE_UUID,
MIGRATION_CAPABILITY__MAX,
} MigrationCapability;

My gut feeling is that most of them are not compatible with it... If background
snapshot is mostly used on its own, I wonder whether it's worth creating a
new QMP command rather than reusing the "migrate" command.  The thing is, it
could be confusing when people notice that all those parameters no longer
work with snapshots.

Btw, it does not mean we need to duplicate the code.  We should still be able
to leverage most of the code in qmp_migrate(), maybe even call qmp_migrate()
inside a new qmp_snapshot().

Thoughts?..

-- 
Peter Xu




Re: [PATCH v0 3/4] migration: add background snapshot

2020-07-23 Thread Peter Xu
On Wed, Jul 22, 2020 at 11:11:32AM +0300, Denis Plotnikov wrote:
> +static void *background_snapshot_thread(void *opaque)
> +{
> +MigrationState *m = opaque;
> +QIOChannelBuffer *bioc;
> +QEMUFile *fb;
> +int res = 0;
> +
> +rcu_register_thread();
> +
> +qemu_file_set_rate_limit(m->to_dst_file, INT64_MAX);
> +
> +qemu_mutex_lock_iothread();
> +vm_stop(RUN_STATE_PAUSED);
> +
> +qemu_savevm_state_header(m->to_dst_file);
> +qemu_mutex_unlock_iothread();
> +qemu_savevm_state_setup(m->to_dst_file);

Is it intentional to drop the BQL for the setup phase?  IIUC the main thread could
start the VM before we take the lock again below if we release it...

> +qemu_mutex_lock_iothread();
> +
> +migrate_set_state(&m->state, MIGRATION_STATUS_SETUP,
> +  MIGRATION_STATUS_ACTIVE);
> +
> +/*
> + * We want to save the vm state for the moment when the snapshot saving was
> + * called but also we want to write RAM content with vm running. The RAM
> + * content should appear first in the vmstate.
> + * So, we first, save non-ram part of the vmstate to the temporary, buffer,
> + * then write ram part of the vmstate to the migration stream with vCPUs
> + * running and, finally, write the non-ram part of the vmstate from the
> + * buffer to the migration stream.
> + */
> +bioc = qio_channel_buffer_new(4096);
> +qio_channel_set_name(QIO_CHANNEL(bioc), "vmstate-buffer");
> +fb = qemu_fopen_channel_output(QIO_CHANNEL(bioc));
> +object_unref(OBJECT(bioc));
> +
> +if (ram_write_tracking_start()) {
> +goto failed_resume;
> +}
> +
> +if (global_state_store()) {
> +goto failed_resume;
> +}

Is this needed?  We should always be in the stopped state here, right?

> +
> +cpu_synchronize_all_states();
> +
> +if (qemu_savevm_state_complete_precopy_non_iterable(fb, false, false)) {
> +goto failed_resume;
> +}
> +
> +vm_start();
> +qemu_mutex_unlock_iothread();
> +
> +while (!res) {
> +res = qemu_savevm_state_iterate(m->to_dst_file, false);
> +
> +if (res < 0 || qemu_file_get_error(m->to_dst_file)) {
> +goto failed;
> +}
> +}
> +
> +/*
> + * By this moment we have RAM content saved into the migration stream.
> + * The next step is to flush the non-ram content (vm devices state)
> + * right after the ram content. The device state was stored in
> + * the temporary buffer prior to the ram saving.
> + */
> +qemu_put_buffer(m->to_dst_file, bioc->data, bioc->usage);
> +qemu_fflush(m->to_dst_file);
> +
> +if (qemu_file_get_error(m->to_dst_file)) {
> +goto failed;
> +}
> +
> +migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> + MIGRATION_STATUS_COMPLETED);
> +goto exit;
> +
> +failed_resume:
> +vm_start();
> +qemu_mutex_unlock_iothread();
> +failed:
> +migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> +  MIGRATION_STATUS_FAILED);
> +exit:
> +ram_write_tracking_stop();
> +qemu_fclose(fb);
> +qemu_mutex_lock_iothread();
> +qemu_savevm_state_cleanup();
> +qemu_mutex_unlock_iothread();
> +rcu_unregister_thread();
> +return NULL;
> +}
> +
>  void migrate_fd_connect(MigrationState *s, Error *error_in)
>  {
>  Error *local_err = NULL;
> @@ -3599,8 +3694,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>  migrate_fd_cleanup(s);
>  return;
>  }
> -qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> -   QEMU_THREAD_JOINABLE);
> +if (migrate_background_snapshot()) {
> +qemu_thread_create(&s->thread, "bg_snapshot",

Maybe the name "live_snapshot" suits better (since the other one is
"live_migration")?

> +   background_snapshot_thread, s,
> +   QEMU_THREAD_JOINABLE);
> +} else {
> +qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> +   QEMU_THREAD_JOINABLE);
> +}
>  s->migration_thread_running = true;
>  }
>  

[...]

> @@ -1151,9 +1188,11 @@ static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
>  ram_counters.transferred += save_page_header(rs, rs->f, block,
>   offset | RAM_SAVE_FLAG_PAGE);
>  if (async) {
> -qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
> -  migrate_release_ram() &
> -  migration_in_postcopy());
> +bool may_free = migrate_background_snapshot() ||
> +(migrate_release_ram() &&
> + migration_in_postcopy());

Does background snapshot need to free the memory?  /me confused..

> +
> +qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE, may_free);
>  } else {
>  

Re: [PATCH v2 12/22] qemu-iotests/199: fix style

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)


With none of your series applied, I get:

$ ./check -qcow2 199
...
199  not run[16:52:34] [16:52:34]not suitable for this cache mode: writeback

Not run: 199
Passed all 0 iotests
199  fail   [16:53:37] [16:53:37]output mismatch (see 199.out.bad)
--- /home/eblake/qemu/tests/qemu-iotests/199.out	2020-07-23 16:48:56.275529368 -0500
+++ /home/eblake/qemu/build/tests/qemu-iotests/199.out.bad	2020-07-23 16:53:37.728416207 -0500

@@ -1,5 +1,13 @@
-.
+E
+==
+ERROR: test_postcopy (__main__.TestDirtyBitmapPostcopyMigration)
+--
+Traceback (most recent call last):
+  File "199", line 41, in setUp
+os.mkfifo(fifo)
+FileExistsError: [Errno 17] File exists
+
 --
 Ran 1 tests

-OK
+FAILED (errors=1)
Failures: 199
Failed 1 of 1 iotests

Ah, 'scratch/mig_fifo' was left over from some other aborted run of the 
test. I removed that file (which implies it might be nice if the test 
handled that automatically, instead of making me do it), and tried 
again; now I got the desired:


199  pass   [17:00:34] [17:01:48]  74s
Passed all 1 iotests
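
On handling the stale 'scratch/mig_fifo' automatically: a minimal sketch of
what a setUp helper could do, assuming the leftover is always a FIFO from an
aborted run. The helper name make_fresh_fifo() is hypothetical and is not
part of test 199 or of the iotests library.

```python
import os
import stat
import tempfile


def make_fresh_fifo(path):
    """Create a FIFO at path, replacing a stale FIFO from an earlier run."""
    try:
        os.mkfifo(path)
    except FileExistsError:
        # Only replace the leftover if it really is a FIFO. Anything else
        # is unexpected and should still fail loudly.
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise
        os.unlink(path)
        os.mkfifo(path)
    return path


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    fifo = os.path.join(scratch, "mig_fifo")
    make_fresh_fifo(fifo)   # first run creates it
    make_fresh_fifo(fifo)   # a later run no longer hits FileExistsError
```

Removing the FIFO in tearDown as well would cover the normal path; the
check above only matters after an aborted run.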


After trying to rebase your series, I once again got failures, but that 
could mean I botched the rebase (since quite a few of the code patches 
earlier in the series were non-trivially changed).  If you send a v3 
(which would be really nice!), I'd hoist this and 13/22 first in the 
series, to get to a point where testing 199 works, to then make it 
easier to demonstrate what the rest of the 199 enhancements do in 
relation to the non-iotest patches.  But I like that you separated the 
199 improvements from the code - testing-wise, it's easy to apply the 
iotests patches first, make sure it fails, then apply the code patches, 
and make sure it passes, to prove that the enhanced test now covers what 
the code fixes did.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 08/22] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.

To clean up the new list, the following logic is used:
We need two events to consider bitmap migration finished:
1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we understand which one is
last, and on the last of them we remove bitmap migration state from the
list.
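
The cleanup rule above can be modelled as a small state machine. This is a
toy Python model, not the C code from the patch: the class and method names
are invented for illustration, and it only shows that the state is dropped
on whichever of the two events arrives last.

```python
class BitmapLoadState:
    """Toy model of one bitmap's migration state on the destination."""

    def __init__(self, name):
        self.name = name
        self.complete_received = False  # event 1: COMPLETE chunk arrived
        self.vm_start_seen = False      # event 2: before-vm-start hook ran


class BitmapStateList:
    def __init__(self):
        self.states = {}

    def add(self, name):
        self.states[name] = BitmapLoadState(name)

    def _drop_if_finished(self, name):
        state = self.states[name]
        # Remove the entry only once both events have been observed.
        if state.complete_received and state.vm_start_seen:
            del self.states[name]

    def on_complete_chunk(self, name):
        self.states[name].complete_received = True
        self._drop_if_finished(name)

    def on_before_vm_start(self):
        # Iterate over a copy of the keys because entries may be deleted.
        for name in list(self.states):
            self.states[name].vm_start_seen = True
            self._drop_if_finished(name)
```

With two bitmaps 'a' and 'b', completing 'a' first and then starting the VM
removes only 'a'; 'b' stays in the list until its own COMPLETE chunk arrives.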

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 64 +++---
  1 file changed, 43 insertions(+), 21 deletions(-)



@@ -484,45 +488,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
  
  bdrv_disable_dirty_bitmap(s->bitmap);

  if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-LoadBitmapState *b;
-
  bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
  if (local_err) {
  error_report_err(local_err);
  return -EINVAL;
  }
-
-b = g_new(LoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
  }
  
+b = g_new(LoadBitmapState, 1);

+b->bs = s->bs;
+b->bitmap = s->bitmap;
+b->migrated = false;
+b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED,
+
+s->bitmaps = g_slist_prepend(s->bitmaps, b);


Did you really mean to use a comma operator there, or should that be ';'?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 2/3] linux-user: Add missing termbits types and values definitions

2020-07-23 Thread Max Filippov
On Thu, Jul 23, 2020 at 2:25 PM Max Filippov  wrote:
>
> On Thu, Jul 23, 2020 at 2:04 PM Filip Bozuta  wrote:
> >
> > This patch introduces missing target types ('target_flag_t', 'target_cc_t',
> > 'target_speed_t') in a few 'termbits.h' header files. Also, two missing
> > values ('TARGET_IUTF8' and 'TARGET_EXTPROC') were also added. These values
> > were also added in file 'syscall.c' in bitmask tables 'iflag_tbl[]' and
> > 'lflag_tbl[]' which are used to convert values of 'struct termios' between
> > target and host.
> >
> > Signed-off-by: Filip Bozuta 
> > ---
> >  linux-user/alpha/termbits.h   |  1 +
> >  linux-user/cris/termbits.h| 18 
> >  linux-user/hppa/termbits.h| 17 +++
> >  linux-user/mips/termbits.h| 17 +++
> >  linux-user/ppc/termbits.h | 21 --
> >  linux-user/sh4/termbits.h | 19 +
> >  linux-user/sparc/termbits.h   | 18 
> >  linux-user/sparc64/termbits.h | 18 
> >  linux-user/syscall.c  | 34 +++---
> >  linux-user/xtensa/termbits.h  | 53 ++-
> >  10 files changed, 130 insertions(+), 86 deletions(-)
>
> Curious why you did it to some targets, but not to others?
> E.g. the following headers have similar definitions:
> linux-user/aarch64/termbits.h
> linux-user/arm/termbits.h
> linux-user/i386/termbits.h
> linux-user/m68k/termbits.h
> linux-user/microblaze/termbits.h
> linux-user/nios2/termbits.h
> linux-user/riscv/termbits.h
> linux-user/s390x/termbits.h
> linux-user/tilegx/termbits.h

Never mind, I got this email before the other that adds generic headers...

Reviewed-by: Max Filippov 

-- 
Thanks.
-- Max



Re: [PATCH v2 2/3] linux-user: Add missing termbits types and values definitions

2020-07-23 Thread Max Filippov
On Thu, Jul 23, 2020 at 2:04 PM Filip Bozuta  wrote:
>
> This patch introduces missing target types ('target_flag_t', 'target_cc_t',
> 'target_speed_t') in a few 'termbits.h' header files. Also, two missing
> values ('TARGET_IUTF8' and 'TARGET_EXTPROC') were also added. These values
> were also added in file 'syscall.c' in bitmask tables 'iflag_tbl[]' and
> 'lflag_tbl[]' which are used to convert values of 'struct termios' between
> target and host.
>
> Signed-off-by: Filip Bozuta 
> ---
>  linux-user/alpha/termbits.h   |  1 +
>  linux-user/cris/termbits.h| 18 
>  linux-user/hppa/termbits.h| 17 +++
>  linux-user/mips/termbits.h| 17 +++
>  linux-user/ppc/termbits.h | 21 --
>  linux-user/sh4/termbits.h | 19 +
>  linux-user/sparc/termbits.h   | 18 
>  linux-user/sparc64/termbits.h | 18 
>  linux-user/syscall.c  | 34 +++---
>  linux-user/xtensa/termbits.h  | 53 ++-
>  10 files changed, 130 insertions(+), 86 deletions(-)

Curious why you did it to some targets, but not to others?
E.g. the following headers have similar definitions:
linux-user/aarch64/termbits.h
linux-user/arm/termbits.h
linux-user/i386/termbits.h
linux-user/m68k/termbits.h
linux-user/microblaze/termbits.h
linux-user/nios2/termbits.h
linux-user/riscv/termbits.h
linux-user/s390x/termbits.h
linux-user/tilegx/termbits.h

-- 
Thanks.
-- Max



[PATCH v2 0/3] Adding support for printing contents of 'struct termios' which is used by ioctls of group 'ioctl_tty'

2020-07-23 Thread Filip Bozuta
This series introduces strace printing functionality for
contents of 'struct termios'.

The first patch in the series introduces a generic 'termbits.h'
file for some architectures which have same 'struct termios'
flag values and 'ioctl_tty' definitions.

The second patch introduces some missing types and flag
values for 'struct termios' which are needed to print
its contents.

The third patch introduces the 'strace' argument printing
functionality itself by using existing functions and macros
in 'strace.c'.

Testing method:

The argument printing functionality was tested using mini
test programs, which were cross-compiled for certain
architectures ('ppc','ppc64','mips','mips64','mipsel')
and which use the ioctls of group 'ioctl_tty'.
These programs were executed under QEMU with "-strace"
to check whether the contents of 'struct termios' are
printed correctly.

Based-on: <20200722200437.312767-1-filip.boz...@syrmia.com>

Filip Bozuta (5):
  linux-user: Add generic 'termbits.h' for some archs
  linux-user: Add missing termbits types and values definitions
  linux-user: Add strace support for printing arguments for ioctls used
for terminals and serial lines

 include/exec/user/thunk.h|   1 +
 linux-user/aarch64/termbits.h| 228 +
 linux-user/alpha/termbits.h  |   1 +
 linux-user/arm/termbits.h| 223 +
 linux-user/cris/termbits.h   |  18 +-
 linux-user/generic/termbits.h| 318 +++
 linux-user/hppa/termbits.h   |  17 +-
 linux-user/i386/termbits.h   | 233 +
 linux-user/m68k/termbits.h   | 234 +
 linux-user/microblaze/termbits.h | 220 +---
 linux-user/mips/termbits.h   |  17 +-
 linux-user/nios2/termbits.h  | 228 +
 linux-user/openrisc/termbits.h   | 302 +-
 linux-user/ppc/termbits.h|  21 +-
 linux-user/qemu.h|   1 +
 linux-user/riscv/termbits.h  | 228 +
 linux-user/s390x/termbits.h  | 289 +
 linux-user/sh4/termbits.h|  19 +-
 linux-user/sparc/termbits.h  |  18 +-
 linux-user/sparc64/termbits.h|  18 +-
 linux-user/strace.c  | 415 ++-
 linux-user/strace.list   |  17 +-
 linux-user/syscall.c |  35 +--
 linux-user/tilegx/termbits.h | 276 +---
 linux-user/x86_64/termbits.h | 254 +--
 linux-user/xtensa/termbits.h |  53 ++--
 thunk.c  |  23 +-
 27 files changed, 900 insertions(+), 2807 deletions(-)
 create mode 100644 linux-user/generic/termbits.h

-- 
2.25.1




[PATCH v2 3/3] linux-user: Add strace support for printing arguments for ioctls used for terminals and serial lines

2020-07-23 Thread Filip Bozuta
Functions "print_ioctl()" and "print_syscall_ret_ioctl()" are used
to print arguments of "ioctl()" with "-strace". These functions
use "thunk_print()", which is defined in "thunk.c", to print the
contents of ioctl's third arguments that are not basic types.

However, this function doesn't handle ioctls of group ioctl_tty, which
are used for terminals and serial lines. These ioctls use the type
"struct termios", whose thunk type is defined in a non-standard
way using "STRUCT_SPECIAL()". This means that this type is not decoded
regularly using "thunk_convert()"; instead it uses the special converting
functions "target_to_host_termios()" and "host_to_target_termios()", which
are defined in "syscall.c", to decode its values.

For similar reasons, this type is also not printed regularly using
"thunk_print()". That is the reason why a separate printing function
"print_termios()" is defined in file "strace.c". This function decodes
and prints flag values of the "termios" structure.

Implementation notes:

Function "print_termios()" was implemented in "strace.c" using
the existing function "print_flags()" to print flag values of
"struct termios" fields. The recently implemented function
"print_enums()" was also used to print enumerated values which
are contained in the fields of 'struct termios'.

These flag values were defined using the existing macro "FLAG_TARGET()",
which generates appropriate target flag values and string representations
of these flags. The recently defined macro "ENUM_TARGET()" was
likewise used to generate appropriate enumerated values and their
respective string representations.
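
To illustrate the difference between the two kinds of tables, here is a
rough Python model of flag decoding versus enum decoding. It is only a
sketch: format_flags() and format_enum() are hypothetical names, not the
"print_flags()" and "print_enums()" functions from "strace.c", and the octal
values are the generic Linux termbits values for a handful of names.

```python
# Illustrative subset of c_iflag bits (generic termbits layout, octal).
IFLAGS = [("IGNBRK", 0o1), ("BRKINT", 0o2), ("ICRNL", 0o400), ("IXON", 0o2000)]

# CRDLY is a two-bit field inside c_oflag, so it is decoded as an enum.
CRDLY_MASK = 0o3000
CRDLY_ENUMS = [("CR0", 0o0), ("CR1", 0o1000), ("CR2", 0o2000), ("CR3", 0o3000)]


def format_flags(table, value):
    """Render every set bit that has a name; leftover bits are shown in hex."""
    names = []
    for name, bit in table:
        if value & bit:
            names.append(name)
            value &= ~bit
    if value:
        names.append(hex(value))
    return "|".join(names) if names else "0"


def format_enum(table, mask, value):
    """Render a multi-bit field: the masked value matches exactly one name."""
    field = value & mask
    for name, enum_value in table:
        if field == enum_value:
            return name
    return hex(field)


if __name__ == "__main__":
    print(format_flags(IFLAGS, 0o402))
    print(format_enum(CRDLY_ENUMS, CRDLY_MASK, 0o2004))
```

Independent bits can be OR-ed together, so a flag table may yield several
names for one value, while a masked field such as CRDLY always decodes to a
single name.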

Function "print_termios()" was declared in "qemu.h" so that it can
be accessed in "syscall.c". Type "StructEntry" defined in
"exec/user/thunk.h" contains information that is used to decode
structure values. Field "void print(void *arg)" was added in this
structure as a special print function. Also, function "thunk_print()"
was changed a little so that it uses this special print function
in case it is defined. This printing function was instantiated with
the defined "print_termios()" in "syscall.c" in "struct_termios_def".

Signed-off-by: Filip Bozuta 
---
 include/exec/user/thunk.h |   1 +
 linux-user/qemu.h |   1 +
 linux-user/strace.c   | 195 ++
 linux-user/syscall.c  |   1 +
 thunk.c   |  23 +++--
 5 files changed, 212 insertions(+), 9 deletions(-)

diff --git a/include/exec/user/thunk.h b/include/exec/user/thunk.h
index 7992475c9f..a5bbb2c733 100644
--- a/include/exec/user/thunk.h
+++ b/include/exec/user/thunk.h
@@ -55,6 +55,7 @@ typedef struct {
 int *field_offsets[2];
 /* special handling */
 void (*convert[2])(void *dst, const void *src);
+void (*print)(void *arg);
 int size[2];
 int align[2];
 const char *name;
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index f431805e57..a69a0bd347 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -706,6 +706,7 @@ static inline uint64_t target_offset64(uint64_t word0, uint64_t word1)
 }
 #endif /* TARGET_ABI_BITS != 32 */
 
+void print_termios(void *arg);
 
 /* ARM EABI and MIPS expect 64bit types aligned even on pairs or registers */
 #ifdef TARGET_ARM
diff --git a/linux-user/strace.c b/linux-user/strace.c
index 3f16bb2c53..b9ba39ce6e 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -1284,6 +1284,140 @@ UNUSED static struct flags falloc_flags[] = {
 #endif
 };
 
+UNUSED static struct flags termios_iflags[] = {
+FLAG_TARGET(IGNBRK),
+FLAG_TARGET(BRKINT),
+FLAG_TARGET(IGNPAR),
+FLAG_TARGET(PARMRK),
+FLAG_TARGET(INPCK),
+FLAG_TARGET(ISTRIP),
+FLAG_TARGET(INLCR),
+FLAG_TARGET(IGNCR),
+FLAG_TARGET(ICRNL),
+FLAG_TARGET(IUCLC),
+FLAG_TARGET(IXON),
+FLAG_TARGET(IXANY),
+FLAG_TARGET(IXOFF),
+FLAG_TARGET(IMAXBEL),
+FLAG_TARGET(IUTF8),
+FLAG_END,
+};
+
+UNUSED static struct flags termios_oflags[] = {
+FLAG_TARGET(OPOST),
+FLAG_TARGET(OLCUC),
+FLAG_TARGET(ONLCR),
+FLAG_TARGET(OCRNL),
+FLAG_TARGET(ONOCR),
+FLAG_TARGET(ONLRET),
+FLAG_TARGET(OFILL),
+FLAG_TARGET(OFDEL),
+FLAG_END,
+};
+
+UNUSED static struct enums termios_oflags_NLDLY[] = {
+ENUM_TARGET(NL0),
+ENUM_TARGET(NL1),
+ENUM_END,
+};
+
+UNUSED static struct enums termios_oflags_CRDLY[] = {
+ENUM_TARGET(CR0),
+ENUM_TARGET(CR1),
+ENUM_TARGET(CR2),
+ENUM_TARGET(CR3),
+ENUM_END,
+};
+
+UNUSED static struct enums termios_oflags_TABDLY[] = {
+ENUM_TARGET(TAB0),
+ENUM_TARGET(TAB1),
+ENUM_TARGET(TAB2),
+ENUM_TARGET(TAB3),
+ENUM_END,
+};
+
+UNUSED static struct enums termios_oflags_VTDLY[] = {
+ENUM_TARGET(VT0),
+ENUM_TARGET(VT1),
+ENUM_END,
+};
+
+UNUSED static struct enums termios_oflags_FFDLY[] = {
+ENUM_TARGET(FF0),
+ENUM_TARGET(FF1),
+ENUM_END,
+};
+
+UNUSED static struct enums 

Re: [PATCH v2 10/22] migration/block-dirty-bitmap: cancel migration on shutdown

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

If target is turned of prior to postcopy finished, target crashes


s/of/off/


because busy bitmaps are found at shutdown.
Canceling incoming migration helps, as it removes all unfinished (and
therefore busy) bitmaps.

Similarly on source we crash in bdrv_close_all which asserts that all
bdrv states are removed, because bdrv states involved into dirty bitmap
migration are referenced by it. So, we need to cancel outgoing
migration as well.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v2 1/3] linux-user: Add generic 'termbits.h' for some archs

2020-07-23 Thread Filip Bozuta
This patch introduces a generic 'termbits.h' file for the following
archs: 'aarch64', 'arm', 'i386', 'm68k', 'microblaze', 'nios2',
'openrisc', 'riscv', 's390x', 'x86_64'.

Since all of these archs have the same termios flag values and
same ioctl_tty numbers, there is no need for a separate 'termbits.h'
file for each one of them. For that reason one generic 'termbits.h'
file was added for all of them and an '#include' directive was
added for this generic file in every arch 'termbits.h' file.

Also, some of the flag values that were missing were added in this
generic file so that it matches the generic 'termbits.h' and 'ioctls.h'
files from the kernel: 'asm-generic/termbits.h' and 'asm-generic/ioctls.h'.

Signed-off-by: Filip Bozuta 
---
 linux-user/aarch64/termbits.h| 228 +-
 linux-user/arm/termbits.h| 223 +-
 linux-user/generic/termbits.h| 318 +++
 linux-user/i386/termbits.h   | 233 +-
 linux-user/m68k/termbits.h   | 234 +--
 linux-user/microblaze/termbits.h | 220 +
 linux-user/nios2/termbits.h  | 228 +-
 linux-user/openrisc/termbits.h   | 302 +
 linux-user/riscv/termbits.h  | 228 +-
 linux-user/s390x/termbits.h  | 289 +---
 linux-user/tilegx/termbits.h | 276 +--
 linux-user/x86_64/termbits.h | 254 +---
 12 files changed, 329 insertions(+), 2704 deletions(-)
 create mode 100644 linux-user/generic/termbits.h

diff --git a/linux-user/aarch64/termbits.h b/linux-user/aarch64/termbits.h
index 0ab448d090..b1d4f4fedb 100644
--- a/linux-user/aarch64/termbits.h
+++ b/linux-user/aarch64/termbits.h
@@ -1,227 +1 @@
-/* from asm/termbits.h */
-/* NOTE: exactly the same as i386 */
-
-#ifndef LINUX_USER_AARCH64_TERMBITS_H
-#define LINUX_USER_AARCH64_TERMBITS_H
-
-#define TARGET_NCCS 19
-
-struct target_termios {
-unsigned int c_iflag;   /* input mode flags */
-unsigned int c_oflag;   /* output mode flags */
-unsigned int c_cflag;   /* control mode flags */
-unsigned int c_lflag;   /* local mode flags */
-unsigned char c_line;/* line discipline */
-unsigned char c_cc[TARGET_NCCS];/* control characters */
-};
-
-/* c_iflag bits */
-#define TARGET_IGNBRK  001
-#define TARGET_BRKINT  002
-#define TARGET_IGNPAR  004
-#define TARGET_PARMRK  010
-#define TARGET_INPCK   020
-#define TARGET_ISTRIP  040
-#define TARGET_INLCR   100
-#define TARGET_IGNCR   200
-#define TARGET_ICRNL   400
-#define TARGET_IUCLC   0001000
-#define TARGET_IXON0002000
-#define TARGET_IXANY   0004000
-#define TARGET_IXOFF   001
-#define TARGET_IMAXBEL 002
-#define TARGET_IUTF8   004
-
-/* c_oflag bits */
-#define TARGET_OPOST   001
-#define TARGET_OLCUC   002
-#define TARGET_ONLCR   004
-#define TARGET_OCRNL   010
-#define TARGET_ONOCR   020
-#define TARGET_ONLRET  040
-#define TARGET_OFILL   100
-#define TARGET_OFDEL   200
-#define TARGET_NLDLY   400
-#define   TARGET_NL0   000
-#define   TARGET_NL1   400
-#define TARGET_CRDLY   0003000
-#define   TARGET_CR0   000
-#define   TARGET_CR1   0001000
-#define   TARGET_CR2   0002000
-#define   TARGET_CR3   0003000
-#define TARGET_TABDLY  0014000
-#define   TARGET_TAB0  000
-#define   TARGET_TAB1  0004000
-#define   TARGET_TAB2  001
-#define   TARGET_TAB3  0014000
-#define   TARGET_XTABS 0014000
-#define TARGET_BSDLY   002
-#define   TARGET_BS0   000
-#define   TARGET_BS1   002
-#define TARGET_VTDLY   004
-#define   TARGET_VT0   000
-#define   TARGET_VT1   004
-#define TARGET_FFDLY   010
-#define   TARGET_FF0   000
-#define   TARGET_FF1   010
-
-/* c_cflag bit meaning */
-#define TARGET_CBAUD   0010017
-#define  TARGET_B0 000 /* hang up */
-#define  TARGET_B50001
-#define  TARGET_B75002
-#define  TARGET_B110   003
-#define  TARGET_B134   004
-#define  TARGET_B150   005
-#define  TARGET_B200   006
-#define  TARGET_B300   007
-#define  TARGET_B600   010
-#define  TARGET_B1200  011
-#define  TARGET_B1800  012
-#define  TARGET_B2400  013
-#define  TARGET_B4800  014
-#define  TARGET_B9600  015
-#define  TARGET_B19200 016
-#define  TARGET_B38400 017
-#define TARGET_EXTA B19200
-#define TARGET_EXTB B38400
-#define TARGET_CSIZE   060
-#define   TARGET_CS5   000
-#define   TARGET_CS6   020
-#define   TARGET_CS7   040
-#define   TARGET_CS8   060
-#define TARGET_CSTOPB  100
-#define TARGET_CREAD   200
-#define TARGET_PARENB  400
-#define TARGET_PARODD  0001000
-#define TARGET_HUPCL   0002000
-#define TARGET_CLOCAL  0004000
-#define TARGET_CBAUDEX 001

[PATCH v2 2/3] linux-user: Add missing termbits types and values definitions

2020-07-23 Thread Filip Bozuta
This patch introduces missing target types ('target_tcflag_t', 'target_cc_t',
'target_speed_t') in a few 'termbits.h' header files. Two missing
values ('TARGET_IUTF8' and 'TARGET_EXTPROC') are also added. These values
are also added to the bitmask tables 'iflag_tbl[]' and 'lflag_tbl[]' in
'syscall.c', which are used to convert values of 'struct termios' between
target and host.

Signed-off-by: Filip Bozuta 
---
 linux-user/alpha/termbits.h   |  1 +
 linux-user/cris/termbits.h| 18 
 linux-user/hppa/termbits.h| 17 +++
 linux-user/mips/termbits.h| 17 +++
 linux-user/ppc/termbits.h | 21 --
 linux-user/sh4/termbits.h | 19 +
 linux-user/sparc/termbits.h   | 18 
 linux-user/sparc64/termbits.h | 18 
 linux-user/syscall.c  | 34 +++---
 linux-user/xtensa/termbits.h  | 53 ++-
 10 files changed, 130 insertions(+), 86 deletions(-)

diff --git a/linux-user/alpha/termbits.h b/linux-user/alpha/termbits.h
index a71425174a..4a4b1e96f2 100644
--- a/linux-user/alpha/termbits.h
+++ b/linux-user/alpha/termbits.h
@@ -159,6 +159,7 @@ struct target_termios {
 #define TARGET_FLUSHO  0x0080
 #define TARGET_PENDIN  0x2000
 #define TARGET_IEXTEN  0x0400
+#define TARGET_EXTPROC  0x1000
 
 #define TARGET_FIOCLEX TARGET_IO('f', 1)
 #define TARGET_FIONCLEXTARGET_IO('f', 2)
diff --git a/linux-user/cris/termbits.h b/linux-user/cris/termbits.h
index 475ee70fed..0c8d8fc051 100644
--- a/linux-user/cris/termbits.h
+++ b/linux-user/cris/termbits.h
@@ -5,13 +5,17 @@
 
 #define TARGET_NCCS 19
 
+typedef unsigned char   target_cc_t;/* cc_t */
+typedef unsigned inttarget_speed_t; /* speed_t */
+typedef unsigned inttarget_tcflag_t;/* tcflag_t */
+
 struct target_termios {
-unsigned int c_iflag;   /* input mode flags */
-unsigned int c_oflag;   /* output mode flags */
-unsigned int c_cflag;   /* control mode flags */
-unsigned int c_lflag;   /* local mode flags */
-unsigned char c_line;/* line discipline */
-unsigned char c_cc[TARGET_NCCS];/* control characters */
+target_tcflag_t c_iflag;   /* input mode flags */
+target_tcflag_t c_oflag;   /* output mode flags */
+target_tcflag_t c_cflag;   /* control mode flags */
+target_tcflag_t c_lflag;   /* local mode flags */
+target_cc_t c_line;/* line discipline */
+target_cc_t c_cc[TARGET_NCCS]; /* control characters */
 };
 
 /* c_iflag bits */
@@ -29,6 +33,7 @@ struct target_termios {
 #define TARGET_IXANY   0004000
 #define TARGET_IXOFF   001
 #define TARGET_IMAXBEL 002
+#define TARGET_IUTF8   004
 
 /* c_oflag bits */
 #define TARGET_OPOST   001
@@ -118,6 +123,7 @@ struct target_termios {
 #define TARGET_FLUSHO  001
 #define TARGET_PENDIN  004
 #define TARGET_IEXTEN  010
+#define TARGET_EXTPROC 020
 
 /* c_cc character offsets */
 #define TARGET_VINTR   0
diff --git a/linux-user/hppa/termbits.h b/linux-user/hppa/termbits.h
index 8fba839dd4..11fd4eed62 100644
--- a/linux-user/hppa/termbits.h
+++ b/linux-user/hppa/termbits.h
@@ -5,13 +5,17 @@
 
 #define TARGET_NCCS 19
 
+typedef unsigned char   target_cc_t;/* cc_t */
+typedef unsigned inttarget_speed_t; /* speed_t */
+typedef unsigned inttarget_tcflag_t;/* tcflag_t */
+
 struct target_termios {
-unsigned int c_iflag;   /* input mode flags */
-unsigned int c_oflag;   /* output mode flags */
-unsigned int c_cflag;   /* control mode flags */
-unsigned int c_lflag;   /* local mode flags */
-unsigned char c_line;/* line discipline */
-unsigned char c_cc[TARGET_NCCS];/* control characters */
+target_tcflag_t c_iflag;   /* input mode flags */
+target_tcflag_t c_oflag;   /* output mode flags */
+target_tcflag_t c_cflag;   /* control mode flags */
+target_tcflag_t c_lflag;   /* local mode flags */
+target_cc_t c_line;/* line discipline */
+target_cc_t c_cc[TARGET_NCCS]; /* control characters */
 };
 
 /* c_iflag bits */
@@ -120,6 +124,7 @@ struct target_termios {
 #define TARGET_FLUSHO  001
 #define TARGET_PENDIN  004
 #define TARGET_IEXTEN  010
+#define TARGET_EXTPROC 020
 
 /* c_cc character offsets */
 #define TARGET_VINTR0
diff --git a/linux-user/mips/termbits.h b/linux-user/mips/termbits.h
index 3287cf6df8..e8b4b58d87 100644
--- a/linux-user/mips/termbits.h
+++ b/linux-user/mips/termbits.h
@@ -5,13 +5,17 @@
 
 #define TARGET_NCCS 23
 
+typedef unsigned char   target_cc_t;/* cc_t */
+typedef unsigned inttarget_speed_t; /* speed_t */
+typedef unsigned 

Re: [PATCH v2 03/22] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-07-23 Thread Eric Blake

On 2/19/20 8:20 AM, Vladimir Sementsov-Ogievskiy wrote:

18.02.2020 14:00, Andrey Shinkevich wrote:

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it belongs to the save part.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)




At the next opportunity, I would suggest the name like
"dirty_bitmap_do_clean_after_saving()"
and similar for dirty_bitmap_save_cleanup()
"dirty_bitmap_clean_after_saving()".


I'd keep my naming, it corresponds to .save_cleanup handler name.


I'm fine with that explanation, so no need to rename again.

Reviewed-by: Eric Blake 





Reviewed-by: Andrey Shinkevich 





--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[Bug 1888728] [NEW] Bare chroot in linux-user fails with pgb_reserved_va: Assertion `guest_base != 0' failed.

2020-07-23 Thread John Paul Adrian Glaubitz
Public bug reported:

Trying to run a bare chroot with no additional bind mounts fails on git
master (8ffa52c20d5693d454f65f2024a1494edfea65d4) with:

root@nofan:~/qemu> chroot /local_scratch/sid-m68k-sbuild/
qemu-m68k-static: /root/qemu/linux-user/elfload.c:2315: pgb_reserved_va: Assertion `guest_base != 0' failed.
Aborted
root@nofan:~/qemu>

The problem can be worked around by bind-mounting /proc from the host
system into the target chroot:

root@nofan:~/qemu> mount -o bind /proc/ /local_scratch/sid-m68k-sbuild/proc/
root@nofan:~/qemu> chroot /local_scratch/sid-m68k-sbuild/
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
(sid-m68k-sbuild)root@nofan:/#

Host system is an up-to-date Debian unstable (2020-07-23).

I have not been able to bisect the issue yet since there is another
annoying linux-user bug (virtual memory exhaustion) that was introduced
and fixed somewhere between v5.0.0 and HEAD and that overshadows the
original assertion failure.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888728

Title:
  Bare chroot in linux-user fails with pgb_reserved_va: Assertion
  `guest_base != 0' failed.

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1888728/+subscriptions



Re: [PATCH v2 02/22] migration/block-dirty-bitmap: rename state structure types

2020-07-23 Thread Eric Blake

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Rename types to be symmetrical for load/save part and shorter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 68 ++
  1 file changed, 36 insertions(+), 32 deletions(-)


No longer applies to master, but the mechanical aspect of the change 
makes sense. If you rebase the series to make review easier,


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-07-23 Thread Eric Blake

On 5/29/20 7:16 AM, Vladimir Sementsov-Ogievskiy wrote:

29.05.2020 14:58, Eric Blake wrote:

On 4/2/20 2:42 AM, Vladimir Sementsov-Ogievskiy wrote:

Ping!

It's a fix, but not for a regression, and I'm afraid it is too big for 5.0.

Still, I think I should ping it anyway. John, I'm afraid that all of
this is for your branch :)


Just noticing this thread, now that we've shuffled bitmaps 
maintainers. Is there anything here that we still need to include in 5.1?


Yes, we need the whole series.


I'm starting to go through it now, to see what is still worth getting in 
to 5.1-rc2, but no promises as it is a long series and I don't want to 
introduce last-minute regressions (the fact that this missed 5.0 says 
that 5.1 will be no worse than 5.0 if we don't get this in until 5.2).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v4 2/4] block/nbd: define new max_write_zero_fast limit

2020-07-23 Thread Eric Blake

On 6/11/20 11:26 AM, Vladimir Sementsov-Ogievskiy wrote:

The NBD spec was recently updated to clarify that max_block doesn't
relate to NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which
mirrors the QEMU flag BDRV_REQ_NO_FALLBACK).

bs->bl.max_pwrite_zeroes_fast is zero by default, which means
max_pwrite_zeroes is used. Update the nbd driver to allow larger
requests with BDRV_REQ_NO_FALLBACK.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/block/nbd.c b/block/nbd.c
index 4ac23c8f62..b0584cf68d 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1956,6 +1956,7 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
  
  bs->bl.request_alignment = min;

  bs->bl.max_pdiscard = QEMU_ALIGN_DOWN(INT_MAX, min);
+bs->bl.max_pwrite_zeroes_fast = bs->bl.max_pdiscard;
  bs->bl.max_pwrite_zeroes = max;


Do we even need max_pwrite_zeroes_fast?  Doesn't qemu behave correctly 
if we just blindly assign max_pdiscard and max_pwrite_zeroes to the same 
value near 2G?



  bs->bl.max_transfer = max;
  



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v4 1/4] block: add max_pwrite_zeroes_fast to BlockLimits

2020-07-23 Thread Eric Blake

On 6/11/20 11:26 AM, Vladimir Sementsov-Ogievskiy wrote:

The NBD spec was recently updated to clarify that max_block doesn't
relate to NBD_CMD_WRITE_ZEROES with NBD_CMD_FLAG_FAST_ZERO (which
mirrors the QEMU flag BDRV_REQ_NO_FALLBACK). To drop the restriction we
need a new max_pwrite_zeroes_fast limit.

The default value of the new max_pwrite_zeroes_fast is zero, which means
max_pwrite_zeroes is used. So this commit changes nothing semantically.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h |  8 
  block/io.c| 17 -
  2 files changed, 20 insertions(+), 5 deletions(-)


Hmm, this is an optimization, rather than a correctness issue.  I'm 
sorry I didn't review it sooner, but at this point, I think it is better 
as 5.2 material.




diff --git a/include/block/block_int.h b/include/block/block_int.h
index 791de6a59c..277e32fe31 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -626,6 +626,14 @@ typedef struct BlockLimits {
   * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
  int32_t max_pwrite_zeroes;
  
+/*

+ * Maximum number of bytes that can zeroed at once if flag
+ * BDRV_REQ_NO_FALLBACK specified. Must be multiple of
+ * pwrite_zeroes_alignment.
+ * If 0, max_pwrite_zeroes is used for no-fallback case.
+ */
+int64_t max_pwrite_zeroes_fast;


Nice that this is 64-bit off the bat (I know you have another series 
about converting more stuff to 64-bit).



+
  /* Optimal alignment for write zeroes requests in bytes. A power
   * of 2 is best but not mandatory.  Must be a multiple of
   * bl.request_alignment, and must be less than max_pwrite_zeroes
diff --git a/block/io.c b/block/io.c
index df8f2a98d4..0af62a53fd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1774,12 +1774,13 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
  bool need_flush = false;
  int head = 0;
  int tail = 0;
-
-int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
+int max_write_zeroes;


32-bit...


  int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
  bs->bl.request_alignment);
  int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
  
+assert(alignment % bs->bl.request_alignment == 0);


Would this look any better using the QEMU_IS_ALIGNED macro?
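
For reference, an alignment check of this kind can be sketched as follows.
The name SKETCH_IS_ALIGNED is a stand-in introduced here, and the sketch
assumes the macro is a plain remainder test; the QEMU headers are the
authority on the actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for an alignment macro: true when n is a multiple of m. */
#define SKETCH_IS_ALIGNED(n, m) (((n) % (m)) == 0)

/* Same check as the open-coded assert, spelled via the macro. */
static bool alignment_is_valid(uint32_t alignment, uint32_t request_alignment)
{
    return SKETCH_IS_ALIGNED(alignment, request_alignment);
}
```

The behaviour is identical to the open-coded remainder; the macro only makes
the intent easier to read at the call site.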


+
  if (!drv) {
  return -ENOMEDIUM;
  }
@@ -1788,12 +1789,18 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
  return -ENOTSUP;
  }
  
-assert(alignment % bs->bl.request_alignment == 0);

-head = offset % alignment;
-tail = (offset + bytes) % alignment;
+if ((flags & BDRV_REQ_NO_FALLBACK) && bs->bl.max_pwrite_zeroes_fast) {
+max_write_zeroes = bs->bl.max_pwrite_zeroes_fast;


...but you try to assign something that may be 64-bit into it.  Risk of 
overflow.  Maybe we should get your 64-bit cleanup series in first.
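
One way to sidestep the narrowing problem, sketched under assumptions: do
the selection and the clamp in 64 bits and only then convert to int. The
helper names below (min_non_zero(), pick_max_write_zeroes()) are invented
for this sketch and are not the functions in block/io.c; alignment is
assumed to be positive.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Smaller of two values, treating 0 as "no limit". */
static int64_t min_non_zero(int64_t a, int64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Pick the per-request zeroing limit.
 * fast_limit == 0 means "fall back to normal_limit".
 * The choice and the clamp happen in 64 bits; only the clamped
 * value is narrowed to int.
 */
static int pick_max_write_zeroes(int no_fallback, int64_t fast_limit,
                                 int32_t normal_limit, int alignment)
{
    int64_t max = (no_fallback && fast_limit) ? fast_limit : normal_limit;

    max = min_non_zero(max, INT_MAX);
    max -= max % alignment;     /* align down */
    return (int)max;
}
```

With this ordering a fast limit above INT_MAX is clamped rather than
truncated, at the cost of still capping each request at just under 2G.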



+} else {
+max_write_zeroes = bs->bl.max_pwrite_zeroes;
+}
+max_write_zeroes = MIN_NON_ZERO(max_write_zeroes, INT_MAX);
  max_write_zeroes = QEMU_ALIGN_DOWN(max_write_zeroes, alignment);
  assert(max_write_zeroes >= bs->bl.request_alignment);
  
+head = offset % alignment;

+tail = (offset + bytes) % alignment;
+
  while (bytes > 0 && !ret) {
  int num = bytes;
  



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v11 00/11] iotests: Dump QCOW2 dirty bitmaps metadata

2020-07-23 Thread Eric Blake

On 7/23/20 2:42 PM, Eric Blake wrote:

On 7/17/20 3:14 AM, Andrey Shinkevich wrote:
Add dirty bitmap information to QCOW2 metadata dump in the 
qcow2_format.py.





  block/qcow2.c  |   2 +-
  docs/interop/qcow2.txt |   2 +-
  tests/qemu-iotests/qcow2.py    |  18 ++-
  tests/qemu-iotests/qcow2_format.py | 221 ++---

  4 files changed, 220 insertions(+), 23 deletions(-)


I still don't see any obvious coverage of the new output, which makes it 
harder to test (I have to manually run qcow2.py on a file rather than 
seeing what changes in a ???.out file).  I know we said back in v9 that 
test 291 is not the right test, but that does not stop you from adding a 
new test just for that purpose.


The bulk of this series is touching a non-installed utility. At this 
point, I feel safer deferring it to 5.2 (it is a feature addition for 
testsuite use only, and we missed soft freeze), even though it has no 
negative impact to installed binaries.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH-for-5.1] gitlab-ci: Fix Avocado cache

2020-07-23 Thread Philippe Mathieu-Daudé
In commit 6957fd98dc ("gitlab: add avocado asset caching") we
tried to save the Avocado cache (as in commit c1073e44b4 with
Travis-CI); however, it doesn't work as expected. For some reason
Avocado uses /root/avocado_cache/, which we cannot select later.

Manually generate an Avocado config to force the use of the
current directory.

See:
- https://docs.gitlab.com/ee/ci/caching/
- https://avocado-framework.readthedocs.io/en/latest/guides/writer/chapters/writing.html#fetching-asset-files

Reported-by: Thomas Huth 
Fixes: 6957fd98dc ("gitlab: add avocado asset caching")
Signed-off-by: Philippe Mathieu-Daudé 
---
 .gitlab-ci.yml | 28 +++-
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 362e5ee755..b19db22fbd 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -8,11 +8,9 @@ stages:
   - build
   - test
 
-# We assume GitLab has it's own caching set up for RPM/APT repositories so we
-# just take care of avocado assets here.
-cache:
-  paths:
-- $HOME/avocado/data/cache
+# We assume GitLab has it's own caching set up for RPM/APT repositories
+cache: &global_cache
+  policy: pull-push
 
 include:
   - local: '/.gitlab-ci.d/edk2.yml'
@@ -47,11 +45,23 @@ include:
 - find . -type f -exec touch {} +
 - make $MAKE_CHECK_ARGS
 
-.post_acceptance_template: &post_acceptance
+.acceptance_template: &acceptance_definition
+  cache:
+# inherit all global cache settings
+<<: *global_cache
+key: acceptance_cache
+paths:
+  - $CI_PROJECT_DIR/avocado_cache
+policy: pull-push
+  before_script:
+- JOBS=$(expr $(nproc) + 1)
+- mkdir -p ~/.config/avocado
+- echo "[datadir.paths]" > ~/.config/avocado/avocado.conf
+- echo "cache_dirs = ['${CI_PROJECT_DIR}/avocado_cache']" >> ~/.config/avocado/avocado.conf
   after_script:
 - cd build
 - python3 -c 'import json; r = json.load(open("tests/results/latest/results.json")); [print(t["logfile"]) for t in r["tests"] if t["status"] not in ("PASS", "SKIP")]' | xargs cat
-- du -chs $HOME/avocado/data/cache
+- du -chs $CI_PROJECT_DIR/avocado_cache
 
 build-system-ubuntu-main:
   <<: *native_build_job_definition
@@ -76,13 +86,13 @@ check-system-ubuntu-main:
 
 acceptance-system-ubuntu-main:
   <<: *native_test_job_definition
+  <<: *acceptance_definition
   needs:
 - job: build-system-ubuntu-main
   artifacts: true
   variables:
 IMAGE: ubuntu2004
 MAKE_CHECK_ARGS: check-acceptance
-  <<: *post_acceptance
 
 build-system-fedora-alt:
   <<: *native_build_job_definition
@@ -107,13 +117,13 @@ check-system-fedora-alt:
 
 acceptance-system-fedora-alt:
   <<: *native_test_job_definition
+  <<: *acceptance_definition
   needs:
 - job: build-system-fedora-alt
   artifacts: true
   variables:
 IMAGE: fedora
 MAKE_CHECK_ARGS: check-acceptance
-  <<: *post_acceptance
 
 build-disabled:
   <<: *native_build_job_definition
-- 
2.21.3




Re: [PATCH v11 01/11] qcow2: Fix capitalization of header extension constant.

2020-07-23 Thread Eric Blake

On 7/17/20 3:14 AM, Andrey Shinkevich wrote:

Make the capitalization of the hexadecimal numbers consistent for the
QCOW2 header extension constants in docs/interop/qcow2.txt.

Suggested-by: Eric Blake 
Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2.c  | 2 +-
  docs/interop/qcow2.txt | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Eric Blake 

This one is trivial; if I have a reason to do a bitmaps pull request for 
the next 5.1 -rc build, I'll include this too; if not, it doesn't hurt 
to wait for 5.2.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v11 00/11] iotests: Dump QCOW2 dirty bitmaps metadata

2020-07-23 Thread Eric Blake

On 7/17/20 3:14 AM, Andrey Shinkevich wrote:

Add dirty bitmap information to QCOW2 metadata dump in the qcow2_format.py.

v10:
   01: Fixing of issues in QCOW2 extension classes noted by Vladimir.
   02: Reading bitmap tables was moved into Qcow2BitmapTable class.
   03: Handling '-j' key was moved into "if __name__" section.
   04: Making copy of __dict__ was replaced with the method to_dict().
   05: Qcow2HeaderExtensionsDoc is introduced in the separate patch.

Andrey Shinkevich (11):
   qcow2: Fix capitalization of header extension constant.
   qcow2_format.py: make printable data an extension class member
   qcow2_format.py: change Qcow2BitmapExt initialization method
   qcow2_format.py: dump bitmap flags in human readable way.
   qcow2_format.py: Dump bitmap directory information
   qcow2_format.py: pass cluster size to substructures
   qcow2_format.py: Dump bitmap table serialized entries
   qcow2.py: Introduce '-j' key to dump in JSON format
   qcow2_format.py: collect fields to dump in JSON format
   qcow2_format.py: introduce Qcow2HeaderExtensionsDoc class
   qcow2_format.py: support dumping metadata in JSON format

  block/qcow2.c  |   2 +-
  docs/interop/qcow2.txt |   2 +-
  tests/qemu-iotests/qcow2.py|  18 ++-
  tests/qemu-iotests/qcow2_format.py | 221 ++---
  4 files changed, 220 insertions(+), 23 deletions(-)


I still don't see any obvious coverage of the new output, which makes it 
harder to test (I have to manually run qcow2.py on a file rather than 
seeing what changes in a ???.out file).  I know we said back in v9 that 
test 291 is not the right test, but that does not stop you from adding a 
new test just for that purpose.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH for-5.1? 0/4] non-blocking connect

2020-07-23 Thread Eric Blake

On 7/20/20 1:07 PM, Vladimir Sementsov-Ogievskiy wrote:

Hi! This fixes a real problem (see 04). On the other hand, it may be too
much for 5.1, and it's not a regression. So, up to you.


Given the concerns raised on 3, I think I'll wait for v2 of the series, 
and defer it to 5.2.




It's based on "[PATCH for-5.1? 0/3] Fix nbd reconnect dead-locks", or
in other words
Based-on: <20200720090024.18186-1-vsement...@virtuozzo.com>

Vladimir Sementsov-Ogievskiy (4):
   qemu-sockets: refactor inet_connect_addr
   qemu-sockets: implement non-blocking connect interface
   io/channel-socket: implement non-blocking connect
   block/nbd: use non-blocking connect: fix vm hang on connect()

  include/io/channel-socket.h | 14 +++
  include/qemu/sockets.h  |  6 +++
  block/nbd.c | 11 +++---
  io/channel-socket.c | 74 
  util/qemu-sockets.c | 76 ++---
  5 files changed, 153 insertions(+), 28 deletions(-)



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 3/4] crypto: use QOM macros for declaration/definition of secret types

2020-07-23 Thread Eric Blake

On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:

This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
families in the secret types, in order to eliminate boilerplate code.

Signed-off-by: Daniel P. Berrangé 
---

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 4/4] crypto: use QOM macros for declaration/definition of TLS creds types

2020-07-23 Thread Eric Blake

On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:

This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
families in the TLS creds types, in order to eliminate boilerplate code.

Signed-off-by: Daniel P. Berrangé 
---

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH] hw/input/virtio-input-hid.c: Don't undef CONFIG_CURSES

2020-07-23 Thread Peter Maydell
virtio-input-hid.c undefines CONFIG_CURSES before including
ui/console.h. However since commits e2f82e924d057935 and b0766612d16da18
that header does not have behaviour dependent on CONFIG_CURSES.
Remove the now-unneeded undef.

Signed-off-by: Peter Maydell 
---
NB: tested with 'make check' only.

 hw/input/virtio-input-hid.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/input/virtio-input-hid.c b/hw/input/virtio-input-hid.c
index 09cf2609854..a7a244a95db 100644
--- a/hw/input/virtio-input-hid.c
+++ b/hw/input/virtio-input-hid.c
@@ -12,7 +12,6 @@
 #include "hw/qdev-properties.h"
 #include "hw/virtio/virtio-input.h"
 
-#undef CONFIG_CURSES
 #include "ui/console.h"
 
 #include "standard-headers/linux/input.h"
-- 
2.20.1




Re: [PATCH 2/4] qom: provide convenient macros for declaring and defining types

2020-07-23 Thread Eric Blake

On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:

When creating new QOM types, there is a lot of boilerplate code that
must be repeated using a standard pattern. This is tedious to write
and liable to suffer from subtle inconsistencies. Thus it would
benefit from some simple automation.

QOM was loosely inspired by GLib's GObject, and indeed GObject suffers
from the same burden of boilerplate code, but has long provided a set of
macros to eliminate this burden in the source implementation. More
recently it has also provided a set of macros to eliminate this burden
in the header declaration.

In GLib there are the G_DECLARE_* and G_DEFINE_* families of macros
for the header declaration and source implementation respectively:

   https://developer.gnome.org/gobject/stable/chapter-gobject.html
   https://developer.gnome.org/gobject/stable/howto-gobject.html

This patch takes inspiration from GObject to provide the equivalent
functionality for QOM.





IOW, in both cases the maintainer now only has to think about the
interesting part of the code which implements useful functionality
and avoids much of the boilerplate.

Signed-off-by: Daniel P. Berrangé 
---
  include/qom/object.h | 277 +++
  1 file changed, 277 insertions(+)

diff --git a/include/qom/object.h b/include/qom/object.h
index 1f8aa2d48e..be64421089 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -304,6 +304,119 @@ typedef struct InterfaceInfo InterfaceInfo;
   *
   * The first example of such a QOM method was #CPUClass.reset,
   * another example is #DeviceClass.realize.
+ *
+ * # Standard type declaration and definition macros #
+ *
+ * A lot of the code outlined above follows a standard pattern and naming
+ * convention. To reduce the amount of boilerplate code that needs to be
+ * written for a new type there are two sets of macros to generate the
+ * common parts in a standard format.
+ *
+ * A type is declared using the OBJECT_DECLARE macro family. In types
+ * which do not require any virtual functions in the class, the
+ * OBJECT_DECLARE_SIMPLE_TYPE macro is suitable, and is commonly placed
+ * in the header file:
+ *
+ * 
+ *   Declaring a simple type
+ *   
+ * OBJECT_DECLARE_SIMPLE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)


How sensitive is this macro to trailing semicolon?  Must the user omit 
it (as shown here), supply it (by tweaking the macro to be a syntax 
error if one is not supplied), or is it optional?  I guess whatever glib 
does is fine to copy, though.
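
As a point of comparison, one common way to make a declaration macro require
the trailing semicolon is to end its expansion with a declaration that has
no ';' of its own. The macro below is a hypothetical sketch, not the macro
from this patch.

```c
#include <assert.h>

/*
 * Hypothetical macro, not QEMU's: the expansion ends in a typedef
 * with no trailing ';', so the caller has to write one. Leaving it
 * off is a syntax error.
 */
#define SKETCH_DECLARE_TYPE(Name) \
    typedef struct Name Name; \
    typedef struct Name##Class Name##Class

SKETCH_DECLARE_TYPE(MyDevice);

struct MyDevice { int irq; };
struct MyDeviceClass { int version; };

/* Uses both typedefs so the declarations are exercised. */
static int describe(const MyDevice *dev, const MyDeviceClass *klass)
{
    return dev->irq + klass->version;
}
```

If the expansion instead ended with its own ';', a semicolon added by the
user at file scope would be an empty declaration, which strict ISO C
compilers may warn about in pedantic mode.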


Hmm. I think you meant to use s/ DEVICE/ Device/ here...


+ *   
+ * 
+ *
+ * This is equivalent to the following:
+ *
+ * 
+ *   Expansion from declaring a simple type
+ *   
+ * typedef struct MyDevice MyDevice;
+ * typedef struct MyDeviceClass MyDeviceClass;
+ *
+ * G_DEFINE_AUTOPTR_CLEANUP_FUNC(MyDeviceClass, object_unref)
+ *
+ * #define MY_DEVICE_GET_CLASS(void *obj) \
+ * OBJECT_GET_CLASS(MyDeviceClass, obj, TYPE_MY_DEVICE)


How'd you manage to invoke #define inside the OBJECT_DECLARE_SIMPLE_TYPE 
macro expansion?


/me reads ahead

Oh, you didn't; you used a static inline function instead.  But the 
effect is the same, so claiming the equivalence here, while slightly 
misleading, is not horrible.



+ * #define MY_DEVICE_CLASS(void *klass) \
+ * OBJECT_CLASS_CHECK(MyDeviceClass, klass, TYPE_MY_DEVICE)
+ * #define MY_DEVICE(void *obj)
+ * OBJECT_CHECK(MyDevice, obj, TYPE_MY_DEVICE)
+ *
+ * struct MyDeviceClass {
+ * DeviceClass parent_class;


...given that this line is constructed as arg4##Class, and the fact that 
we have DeviceClass, not DEVICEClass.
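
The token-pasting point can be shown with a small stand-alone sketch. The
macro SKETCH_DECLARE_SIMPLE_TYPE and the one-field DeviceClass below are
invented for illustration and are far simpler than the real QOM types.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a parent class; QEMU's DeviceClass is richer. */
typedef struct DeviceClass { int realized; } DeviceClass;

/*
 * Hypothetical macro: Parent##Class is formed by token pasting, so
 * the argument must be spelled exactly like the type prefix.
 * Passing Device yields DeviceClass; passing DEVICE would yield
 * DEVICEClass, which does not exist.
 */
#define SKETCH_DECLARE_SIMPLE_TYPE(Name, Parent) \
    typedef struct Name##Class { \
        Parent##Class parent_class; \
    } Name##Class

SKETCH_DECLARE_SIMPLE_TYPE(MyDevice, Device);
```

Changing the second argument to DEVICE makes this sketch fail to compile
with an unknown type name, which is the mismatch pointed out above.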



+ * };
+ *   
+ * 
+ *
+ * The 'struct MyDevice' needs to be declared separately.
+ * If the type requires virtual functions to be declared in the class
+ * struct, then the alternative OBJECT_DECLARE_TYPE() macro can be
+ * used. This does the same as OBJECT_DECLARE_SIMPLE_TYPE(), but without
+ * the 'struct MyDeviceClass' definition.
+ *
+ * To implement the type, the OBJECT_DEFINE macro family is available.
+ * In the simple case the OBJECT_DEFINE_TYPE macro is suitable:
+ *
+ * 
+ *   Defining a simple type
+ *   
+ * OBJECT_DEFINE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)


Unlike the declare, here, using DEVICE looks correct...


+ *   
+ * 
+ *
+ * This is equivalent to the following:
+ *
+ * 
+ *   Expansion from defining a simple type
+ *   
+ * static void my_device_finalize(Object *obj);
+ * static void my_device_class_init(ObjectClass *oc, void *data);
+ * static void my_device_init(Object *obj);
+ *
+ * static const TypeInfo my_device_info = {
+ * .parent = TYPE_DEVICE,


...given the expansion here.


+ * .name = TYPE_MY_DEVICE,
+ * .instance_size = sizeof(MyDevice),
+ * .instance_init = my_device_init,
+ * .instance_finalize = my_device_finalize,
+ * .class_size = sizeof(MyDeviceClass),
+ *   

[PULL 1/1] KVM: fix CPU reset wrt HF2_GIF_MASK

2020-07-23 Thread Eduardo Habkost
From: Vitaly Kuznetsov 

HF2_GIF_MASK is set in env->hflags2 unconditionally on CPU reset
(see x86_cpu_reset()) but when calling KVM_SET_NESTED_STATE,
KVM_STATE_NESTED_GIF_SET is only valid for nSVM as e.g. nVMX code
looks like

if (kvm_state->hdr.vmx.vmxon_pa == -1ull) {
if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
return -EINVAL;
}

Also, when adjusting the environment after KVM_GET_NESTED_STATE we
need not reset HF2_GIF_MASK on VMX as e.g. x86_cpu_pending_interrupt()
expects it to be set.

Alternatively, we could've made env->hflags2 SVM-only.
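
The intended behaviour in both directions can be sketched as below. The
SK_* bit values and the helper names are illustrative assumptions; the real
constants live in the QEMU and KVM headers, and the real code operates on
the CPU state rather than on plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values, not copied from any header. */
#define SK_HF2_GIF_MASK          (1u << 0)
#define SK_STATE_NESTED_GIF_SET  (1u << 8)

/* "put" direction: only report GIF to the kernel when the CPU has SVM. */
static uint32_t put_nested_flags(bool has_svm, uint32_t hflags2,
                                 uint32_t flags)
{
    if (has_svm && (hflags2 & SK_HF2_GIF_MASK)) {
        flags |= SK_STATE_NESTED_GIF_SET;
    } else {
        flags &= ~SK_STATE_NESTED_GIF_SET;
    }
    return flags;
}

/* "get" direction: leave hflags2 untouched on non-SVM CPUs. */
static uint32_t get_hflags2(bool has_svm, uint32_t flags, uint32_t hflags2)
{
    if (has_svm) {
        if (flags & SK_STATE_NESTED_GIF_SET) {
            hflags2 |= SK_HF2_GIF_MASK;
        } else {
            hflags2 &= ~SK_HF2_GIF_MASK;
        }
    }
    return hflags2;
}
```

On a non-SVM CPU the first helper never sets the GIF state flag and the
second never clears the GIF bit that reset put there.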

Reported-by: Jan Kiszka 
Fixes: b16c0e20c742 ("KVM: add support for AMD nested live migration")
Signed-off-by: Vitaly Kuznetsov 
Message-Id: <20200723142701.2521161-1-vkuzn...@redhat.com>
Tested-by: Jan Kiszka 
Signed-off-by: Eduardo Habkost 
---
 target/i386/kvm.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b8455c89ed..6f18d940a5 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3877,7 +3877,9 @@ static int kvm_put_nested_state(X86CPU *cpu)
 } else {
 env->nested_state->flags &= ~KVM_STATE_NESTED_GUEST_MODE;
 }
-if (env->hflags2 & HF2_GIF_MASK) {
+
+/* Don't set KVM_STATE_NESTED_GIF_SET on VMX as it is illegal */
+if (cpu_has_svm(env) && (env->hflags2 & HF2_GIF_MASK)) {
 env->nested_state->flags |= KVM_STATE_NESTED_GIF_SET;
 } else {
 env->nested_state->flags &= ~KVM_STATE_NESTED_GIF_SET;
@@ -3919,10 +3921,14 @@ static int kvm_get_nested_state(X86CPU *cpu)
 } else {
 env->hflags &= ~HF_GUEST_MASK;
 }
-if (env->nested_state->flags & KVM_STATE_NESTED_GIF_SET) {
-env->hflags2 |= HF2_GIF_MASK;
-} else {
-env->hflags2 &= ~HF2_GIF_MASK;
+
+/* Keep HF2_GIF_MASK set on !SVM as x86_cpu_pending_interrupt() needs it */
+if (cpu_has_svm(env)) {
+if (env->nested_state->flags & KVM_STATE_NESTED_GIF_SET) {
+env->hflags2 |= HF2_GIF_MASK;
+} else {
+env->hflags2 &= ~HF2_GIF_MASK;
+}
 }
 
 return ret;
-- 
2.26.2




[PULL 0/1] x86 bug fix for -rc2

2020-07-23 Thread Eduardo Habkost
The following changes since commit 8ffa52c20d5693d454f65f2024a1494edfea65d4:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2020-07-23 13:38:21 +0100)

are available in the Git repository at:

  git://github.com/ehabkost/qemu.git tags/x86-next-for-5.1-pull-request

for you to fetch changes up to 0baa4b445e28f37243e5dc72e7efe32f0c9d7801:

  KVM: fix CPU reset wrt HF2_GIF_MASK (2020-07-23 15:03:54 -0400)


x86 bug fix for -rc2

A fix from Vitaly Kuznetsov for a CPU reset bug
reported by Jan Kiszka.



Vitaly Kuznetsov (1):
  KVM: fix CPU reset wrt HF2_GIF_MASK

 target/i386/kvm.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

-- 
2.26.2





Re: [PATCH 1/4] qom: make object_ref/unref use a void * instead of Object *.

2020-07-23 Thread Eric Blake

On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:

The object_ref/unref methods are intended for use with any subclass of
the base Object. Using "Object *" in the signature is not adding any
meaningful level of type safety, since callers simply use "OBJECT(ptr)"
and this expands to an unchecked cast "(Object *)".

By using "void *" we enable the object_unref() method to be used to
provide support for g_autoptr() with any subclass.
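
A stripped-down sketch of the idea, assuming a subclass embeds its parent
as the first struct member. The types and the sketch_object_ref() and
sketch_object_unref() helpers are stand-ins written for this example, not
the QOM implementation.

```c
#include <assert.h>

/* Minimal stand-ins; QEMU's Object carries much more state. */
typedef struct Object { unsigned int ref; } Object;

/* A "subclass" embeds its parent as the first member. */
typedef struct MyDevice {
    Object parent_obj;
    int irq;
} MyDevice;

/* Taking void * lets callers pass any subclass pointer without a cast. */
static void sketch_object_ref(void *objptr)
{
    Object *obj = objptr;   /* valid because Object is the first member */

    obj->ref++;
}

static void sketch_object_unref(void *objptr)
{
    Object *obj = objptr;

    assert(obj->ref > 0);
    obj->ref--;             /* a real implementation would finalize at 0 */
}
```

The trade-off is that a void * parameter accepts any pointer at all, so the
compiler cannot reject a non-object argument; the commit message argues
that the previous cast gave no such protection either.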

Signed-off-by: Daniel P. Berrangé 
---
  include/qom/object.h | 4 ++--
  qom/object.c | 6 --
  2 files changed, 6 insertions(+), 4 deletions(-)


Is it worth a followup patch (probably with Coccinelle) that changes:

object_ref(OBJECT(dev));

to the now-simpler

object_ref(dev);

But I don't think it belongs in this patch, so

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH] kvm: kvm_init_vcpu take Error pointer

2020-07-23 Thread Philippe Mathieu-Daudé
On 7/23/20 6:09 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Clean up the error handling in kvm_init_vcpu so we can see what went
> wrong more easily.
> 
> Make it take an Error ** and fill it out with what failed, including
> the cpu id, so you can tell if it only fails at a given ID.
> 
> Replace the remaining DPRINTF by a trace.
> 
> This turns a:
> kvm_init_vcpu failed: Invalid argument
> 
> into:
> kvm_init_vcpu: kvm_get_vcpu failed (256): Invalid argument
> 
> and with the trace you then get to see:
> 
> 19049@1595520414.310107:kvm_init_vcpu index: 169 id: 212
> 19050@1595520414.310635:kvm_init_vcpu index: 170 id: 256
> qemu-system-x86_64: kvm_init_vcpu: kvm_get_vcpu failed (256): Invalid argument
> 
> which makes stuff a lot more obvious.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Philippe Mathieu-Daudé 
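
For readers unfamiliar with the Error ** convention the patch adopts, the
following is a minimal self-contained sketch of the pattern. ToyError,
toy_error_setg_errno(), toy_create_vcpu() and the limit of 256 are all
hypothetical; QEMU's real Error API lives in include/qapi/error.h and differs
in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error object: just a formatted message. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Fill *errp with "<what> (<id>): <strerror>"; a NULL errp means "ignore". */
static void toy_error_setg_errno(ToyError **errp, int os_errno,
                                 const char *what, unsigned long id)
{
    ToyError *err;

    assert(os_errno > 0); /* this toy expects a positive errno value */
    if (!errp) {
        return;
    }
    err = calloc(1, sizeof(*err));
    if (!err) {
        return;
    }
    snprintf(err->msg, sizeof(err->msg), "%s (%lu): %s",
             what, id, strerror(os_errno));
    *errp = err;
}

/* Hypothetical stand-in for the ioctl: ids of 256 and above are rejected. */
static int toy_create_vcpu(unsigned long vcpu_id)
{
    return vcpu_id < 256 ? 0 : -EINVAL;
}

/* Returns 0 or a negative errno, and reports which step failed via errp. */
static int toy_init_vcpu(unsigned long vcpu_id, ToyError **errp)
{
    int ret = toy_create_vcpu(vcpu_id);

    if (ret < 0) {
        toy_error_setg_errno(errp, -ret,
                             "toy_init_vcpu: toy_create_vcpu failed", vcpu_id);
    }
    return ret;
}
```

The caller owns the returned error and frees it after reporting, so the
message can carry the failing id up to wherever it is printed.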

> ---
>  accel/kvm/kvm-all.c| 19 ++-
>  accel/kvm/trace-events |  1 +
>  accel/stubs/kvm-stub.c |  2 +-
>  include/sysemu/kvm.h   |  2 +-
>  softmmu/cpus.c |  6 +-
>  5 files changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 63ef6af9a1c..0fbece977c7 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -430,17 +430,18 @@ static int kvm_get_vcpu(KVMState *s, unsigned long 
> vcpu_id)
>  return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>  }
>  
> -int kvm_init_vcpu(CPUState *cpu)
> +int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  {
>  KVMState *s = kvm_state;
>  long mmap_size;
>  int ret;
>  
> -DPRINTF("kvm_init_vcpu\n");
> +trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  
>  ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>  if (ret < 0) {
> -DPRINTF("kvm_create_vcpu failed\n");
> +error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed 
> (%lu)",
> + kvm_arch_vcpu_id(cpu));
>  goto err;
>  }
>  
> @@ -451,7 +452,8 @@ int kvm_init_vcpu(CPUState *cpu)
>  mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
>  if (mmap_size < 0) {
>  ret = mmap_size;
> -DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
> +error_setg_errno(errp, -mmap_size,
> + "kvm_init_vcpu: KVM_GET_VCPU_MMAP_SIZE failed");
>  goto err;
>  }
>  
> @@ -459,7 +461,9 @@ int kvm_init_vcpu(CPUState *cpu)
>  cpu->kvm_fd, 0);
>  if (cpu->kvm_run == MAP_FAILED) {
>  ret = -errno;
> -DPRINTF("mmap'ing vcpu state failed\n");
> +error_setg_errno(errp, ret,
> + "kvm_init_vcpu: mmap'ing vcpu state failed (%lu)",
> + kvm_arch_vcpu_id(cpu));
>  goto err;
>  }
>  
> @@ -469,6 +473,11 @@ int kvm_init_vcpu(CPUState *cpu)
>  }
>  
>  ret = kvm_arch_init_vcpu(cpu);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret,
> + "kvm_init_vcpu: kvm_arch_init_vcpu failed (%lu)",
> + kvm_arch_vcpu_id(cpu));
> +}
>  err:
>  return ret;
>  }
> diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
> index a68eb665343..e15ae8980d3 100644
> --- a/accel/kvm/trace-events
> +++ b/accel/kvm/trace-events
> @@ -8,6 +8,7 @@ kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, 
> reason %d"
>  kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
>  kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to 
> retrieve ONEREG %" PRIu64 " from KVM: %s"
>  kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set 
> ONEREG %" PRIu64 " to KVM: %s"
> +kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
>  kvm_irqchip_commit_routes(void) ""
>  kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector 
> %d virq %d"
>  kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 82f118d2df9..cd573bfe3d9 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -37,7 +37,7 @@ int kvm_destroy_vcpu(CPUState *cpu)
>  return -ENOSYS;
>  }
>  
> -int kvm_init_vcpu(CPUState *cpu)
> +int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  {
>  return -ENOSYS;
>  }
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index b4174d941c2..410848af514 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -216,7 +216,7 @@ int kvm_has_many_ioeventfds(void);
>  int kvm_has_gsi_routing(void);
>  int kvm_has_intx_set_mask(void);
>  
> -int kvm_init_vcpu(CPUState *cpu);
> +int kvm_init_vcpu(CPUState *cpu, Error **errp);
>  int kvm_cpu_exec(CPUState *cpu);
>  int kvm_destroy_vcpu(CPUState *cpu);
>  
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index a802e899abb..9725fd9951f 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -1170,11 +1170,7 @@ static void 

Re: [PATCH 3/3] block/nbd: nbd_co_reconnect_loop(): don't sleep if drained

2020-07-23 Thread Eric Blake

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

We try to go to wakeable sleep, so that, if drain begins, it will break
the sleep. But what if nbd_client_co_drain_begin() was already called and
s->drained is already true? We'll go to sleep, and drain will have to
wait for the whole timeout. Let's improve it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)



How frequently did you hit this case?  At any rate, the optimization 
looks sane, and I'm happy to include it in 5.1.


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/3] block/nbd: on shutdown terminate connection attempt

2020-07-23 Thread Eric Blake

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

On shutdown the nbd driver may be in a connecting state. We should shut
it down as well, otherwise we may hang in
nbd_teardown_connection, waiting for connection_co to finish in
BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down.

How to reproduce the dead lock:



Same reproducer as in the previous patch (again, where a temporary sleep 
or well-placed gdb breakpoint may be more reliable than running two 
process loops).





Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 3/4] crypto: use QOM macros for declaration/definition of secret types

2020-07-23 Thread Eduardo Habkost
On Thu, Jul 23, 2020 at 07:14:09PM +0100, Daniel P. Berrangé wrote:
> This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
> families in the secret types, in order to eliminate boilerplate code.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  crypto/secret.c | 24 
>  crypto/secret_common.c  | 32 +---
>  crypto/secret_keyring.c | 28 +---
>  include/crypto/secret.h | 11 ++-
>  include/crypto/secret_common.h  | 13 ++---
>  include/crypto/secret_keyring.h | 18 ++
>  6 files changed, 28 insertions(+), 98 deletions(-)
> 

Beautiful.

I wonder how hard it would be to automate this.  I'm assuming
Coccinelle won't be able to deal with the macro definitions, but
a handwritten conversion script would be really useful for
dealing with our 1226 static TypeInfo structs.

-- 
Eduardo




Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-23 Thread Eric Blake

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:



I tried to reproduce this; but in the several minutes it has taken me to 
write this email, it still has not hung.  Still, your stack trace is 
fairly good evidence of the problem, where adding a temporary sleep or 
running it under gdb with a breakpoint can probably make reproduction 
easier.



1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));


Bashism, but not a problem for the commit message.


 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;


I prefer the spelling nbd:// for TCP connections, but also inconsequential.


Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
+bdrv_dec_in_flight(s->bs);

  s->connect_status = nbd_client_connect(s->bs, &local_err);
+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+bdrv_inc_in_flight(s->bs);
+


This is very similar to the code in nbd_co_reconnect_loop.  Does that 
function still need to wait on drained, since it calls 
nbd_reconnect_attempt which is now doing the same loop?  But off-hand, 
I'm not seeing a problem with keeping both places.


Reviewed-by: Eric Blake 

As a bug fix, I'll be including this in my NBD pull request for the next 
-rc build.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH 2/4] qom: provide convenient macros for declaring and defining types

2020-07-23 Thread Daniel P . Berrangé
When creating new QOM types, there is a lot of boilerplate code that
must be repeated using a standard pattern. This is tedious to write
and liable to suffer from subtle inconsistencies. Thus it would
benefit from some simple automation.

QOM was loosely inspired by GLib's GObject, and indeed GObject suffers
from the same burden of boilerplate code, but has long provided a set of
macros to eliminate this burden in the source implementation. More
recently it has also provided a set of macros to eliminate this burden
in the header declaration.

In GLib there are the G_DECLARE_* and G_DEFINE_* family of macros
for the header declaration and source implementation respectively:

  https://developer.gnome.org/gobject/stable/chapter-gobject.html
  https://developer.gnome.org/gobject/stable/howto-gobject.html

This patch takes inspiration from GObject to provide the equivalent
functionality for QOM.

In the header file, instead of:

typedef struct MyDevice MyDevice;
typedef struct MyDeviceClass MyDeviceClass;

G_DEFINE_AUTOPTR_CLEANUP_FUNC(MyDeviceClass, object_unref)

#define MY_DEVICE_GET_CLASS(void *obj) \
OBJECT_GET_CLASS(MyDeviceClass, obj, TYPE_MY_DEVICE)
#define MY_DEVICE_CLASS(void *klass) \
OBJECT_CLASS_CHECK(MyDeviceClass, klass, TYPE_MY_DEVICE)
#define MY_DEVICE(void *obj) \
OBJECT_CHECK(MyDevice, obj, TYPE_MY_DEVICE)

struct MyDeviceClass {
DeviceClass parent_class;
};

We now have

OBJECT_DECLARE_SIMPLE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

In cases where the class needs some virtual methods, it can be left
to be implemented manually using

OBJECT_DECLARE_TYPE(MyDevice, my_device, MY_DEVICE)

Note that these macros are including support for g_autoptr() for the
object types, which is something previously only supported for variables
declared as the base Object * type.

Meanwhile in the source file, instead of:

static void my_device_finalize(Object *obj);
static void my_device_class_init(ObjectClass *oc, void *data);
static void my_device_init(Object *obj);

static const TypeInfo my_device_info = {
.parent = TYPE_DEVICE,
.name = TYPE_MY_DEVICE,
.instance_size = sizeof(MyDevice),
.instance_init = my_device_init,
.instance_finalize = my_device_finalize,
.class_size = sizeof(MyDeviceClass),
.class_init = my_device_class_init,
};

static void
my_device_register_types(void)
{
type_register_static(&my_device_info);
}
type_init(my_device_register_types);

We now have

OBJECT_DEFINE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

Or, if a class needs to implement interfaces:

OBJECT_DEFINE_TYPE_WITH_INTERFACES(MyDevice, my_device, MY_DEVICE, DEVICE,
   { TYPE_USER_CREATABLE }, { NULL })

Or, if a class needs to be abstract

OBJECT_DEFINE_ABSTRACT_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

IOW, in both cases the maintainer now only has to think about the
interesting part of the code which implements useful functionality
and avoids much of the boilerplate.

Signed-off-by: Daniel P. Berrangé 
---
 include/qom/object.h | 277 +++
 1 file changed, 277 insertions(+)

diff --git a/include/qom/object.h b/include/qom/object.h
index 1f8aa2d48e..be64421089 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -304,6 +304,119 @@ typedef struct InterfaceInfo InterfaceInfo;
  *
  * The first example of such a QOM method was #CPUClass.reset,
  * another example is #DeviceClass.realize.
+ *
+ * # Standard type declaration and definition macros #
+ *
+ * A lot of the code outlined above follows a standard pattern and naming
+ * convention. To reduce the amount of boilerplate code that needs to be
+ * written for a new type there are two sets of macros to generate the
+ * common parts in a standard format.
+ *
+ * A type is declared using the OBJECT_DECLARE macro family. In types
+ * which do not require any virtual functions in the class, the
+ * OBJECT_DECLARE_SIMPLE_TYPE macro is suitable, and is commonly placed
+ * in the header file:
+ *
+ * 
+ *   Declaring a simple type
+ *   
+ * OBJECT_DECLARE_SIMPLE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)
+ *   
+ * 
+ *
+ * This is equivalent to the following:
+ *
+ * 
+ *   Expansion from declaring a simple type
+ *   
+ * typedef struct MyDevice MyDevice;
+ * typedef struct MyDeviceClass MyDeviceClass;
+ *
+ * G_DEFINE_AUTOPTR_CLEANUP_FUNC(MyDeviceClass, object_unref)
+ *
+ * #define MY_DEVICE_GET_CLASS(void *obj) \
+ * OBJECT_GET_CLASS(MyDeviceClass, obj, TYPE_MY_DEVICE)
+ * #define MY_DEVICE_CLASS(void *klass) \
+ * OBJECT_CLASS_CHECK(MyDeviceClass, klass, TYPE_MY_DEVICE)
+ * #define MY_DEVICE(void *obj)
+ * OBJECT_CHECK(MyDevice, obj, TYPE_MY_DEVICE)
+ *
+ * struct MyDeviceClass {
+ * DeviceClass parent_class;
+ *  

[PATCH 0/4] qom: reduce boilerplate required for declaring and defining objects

2020-07-23 Thread Daniel P . Berrangé
To just duplicate the patch 2 message

When creating new QOM types, there is a lot of boilerplate code that
must be repeated using a standard pattern. This is tedious to write
and liable to suffer from subtle inconsistencies. Thus it would
benefit from some simple automation.

QOM was loosely inspired by GLib's GObject, and indeed GObject suffers
from the same burden of boilerplate code, but has long provided a set of
macros to eliminate this burden in the source implementation. More
recently it has also provided a set of macros to eliminate this burden
in the header declaration.

In GLib there are the G_DECLARE_* and G_DEFINE_* family of macros
for the header declaration and source implementation respectively:

  https://developer.gnome.org/gobject/stable/chapter-gobject.html
  https://developer.gnome.org/gobject/stable/howto-gobject.html

This patch takes inspiration from GObject to provide the equivalent
functionality for QOM.

In the header file, instead of:

typedef struct MyDevice MyDevice;
typedef struct MyDeviceClass MyDeviceClass;

G_DEFINE_AUTOPTR_CLEANUP_FUNC(MyDeviceClass, object_unref)

#define MY_DEVICE_GET_CLASS(void *obj) \
OBJECT_GET_CLASS(MyDeviceClass, obj, TYPE_MY_DEVICE)
#define MY_DEVICE_CLASS(void *klass) \
OBJECT_CLASS_CHECK(MyDeviceClass, klass, TYPE_MY_DEVICE)
#define MY_DEVICE(void *obj) \
OBJECT_CHECK(MyDevice, obj, TYPE_MY_DEVICE)

struct MyDeviceClass {
DeviceClass parent_class;
};

We now have

OBJECT_DECLARE_SIMPLE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

In cases where the class needs some virtual methods, it can be left
to be implemented manually using

OBJECT_DECLARE_TYPE(MyDevice, my_device, MY_DEVICE)

Note that these macros are including support for g_autoptr() for the
object types, which is something previously only supported for variables
declared as the base Object * type.

Meanwhile in the source file, instead of:

static void my_device_finalize(Object *obj);
static void my_device_class_init(ObjectClass *oc, void *data);
static void my_device_init(Object *obj);

static const TypeInfo my_device_info = {
.parent = TYPE_DEVICE,
.name = TYPE_MY_DEVICE,
.instance_size = sizeof(MyDevice),
.instance_init = my_device_init,
.instance_finalize = my_device_finalize,
.class_size = sizeof(MyDeviceClass),
.class_init = my_device_class_init,
};

static void
my_device_register_types(void)
{
type_register_static(&my_device_info);
}
type_init(my_device_register_types);

We now have

OBJECT_DEFINE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

Or, if a class needs to implement interfaces:

OBJECT_DEFINE_TYPE_WITH_INTERFACES(MyDevice, my_device, MY_DEVICE, DEVICE,
   { TYPE_USER_CREATABLE }, { NULL })

Or, if a class needs to be abstract

OBJECT_DEFINE_ABSTRACT_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)

IOW, in both cases the maintainer now only has to think about the
interesting part of the code which implements useful functionality
and avoids much of the boilerplate.
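
To make the idea concrete, here is a minimal self-contained sketch of how a
single macro can emit the forward declarations and the type-info struct. It
is not the macro from this series: ToyTypeInfo and TOY_DEFINE_TYPE are
hypothetical, registration is left out, and the type name is simply the
stringified struct name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for a type-info struct. */
typedef struct ToyTypeInfo {
    const char *name;
    const char *parent;
    size_t instance_size;
    void (*instance_init)(void *obj);
    void (*instance_finalize)(void *obj);
} ToyTypeInfo;

/*
 * Emit the forward declarations and the info struct from the type's
 * names, so that only the function bodies remain to be written by hand.
 */
#define TOY_DEFINE_TYPE(TypeName, type_name, PARENT_NAME)      \
    static void type_name##_init(void *obj);                   \
    static void type_name##_finalize(void *obj);               \
    static const ToyTypeInfo type_name##_info = {              \
        .name = #TypeName,                                     \
        .parent = PARENT_NAME,                                 \
        .instance_size = sizeof(TypeName),                     \
        .instance_init = type_name##_init,                     \
        .instance_finalize = type_name##_finalize,             \
    }

/* What a type's source file then contains. */
typedef struct MyDevice {
    int regs[4];
    int initialized;
} MyDevice;

TOY_DEFINE_TYPE(MyDevice, my_device, "device");

static void my_device_init(void *obj)
{
    MyDevice *dev = obj;

    assert(dev != NULL);
    memset(dev->regs, 0, sizeof(dev->regs));
    dev->initialized = 1;
}

static void my_device_finalize(void *obj)
{
    MyDevice *dev = obj;

    dev->initialized = 0;
}
```

Because the macro derives every identifier from the same two names, a
mismatch between the struct, the callbacks and the info table cannot creep
in, which is the inconsistency risk described above.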


Patches 3 and 4 illustrate the usage of the new macros, and by excluding
the qom changes, and just looking at the crypto, the diffstat shows the
benefits quite nicely:

 crypto/secret.c | 24 
 crypto/secret_common.c  | 32 +---
 crypto/secret_keyring.c | 28 +---
 crypto/tlscreds.c   | 25 +++--
 crypto/tlscredsanon.c   | 23 ---
 crypto/tlscredspsk.c| 25 +
 crypto/tlscredsx509.c   | 29 -
 include/crypto/secret.h | 11 ++-
 include/crypto/secret_common.h  | 13 ++---
 include/crypto/secret_keyring.h | 18 ++
 include/crypto/tlscreds.h   | 13 ++---
 include/crypto/tlscredsanon.h   | 14 ++
 include/crypto/tlscredspsk.h| 13 ++---
 include/crypto/tlscredsx509.h   | 13 ++---
 14 files changed, 52 insertions(+), 229 deletions(-)

(The 'qom' file diffstat is misled by the large amount of API doc text
 added).

Daniel P. Berrangé (4):
  qom: make object_ref/unref use a void * instead of Object *.
  qom: provide convenient macros for declaring and defining types
  crypto: use QOM macros for declaration/definition of secret types
  crypto: use QOM macros for declaration/definition of TLS creds types

 crypto/secret.c |  24 +--
 crypto/secret_common.c  |  32 +---
 crypto/secret_keyring.c |  28 +---
 crypto/tlscreds.c   |  25 +--
 crypto/tlscredsanon.c   |  23 +--
 crypto/tlscredspsk.c|  25 +--
 crypto/tlscredsx509.c   |  29 +---
 include/crypto/secret.h |  11 +-
 include/crypto/secret_common.h  |  13 

[PATCH 1/4] qom: make object_ref/unref use a void * instead of Object *.

2020-07-23 Thread Daniel P . Berrangé
The object_ref/unref methods are intended for use with any subclass of
the base Object. Using "Object *" in the signature is not adding any
meaningful level of type safety, since callers simply use "OBJECT(ptr)"
and this expands to an unchecked cast "(Object *)".

By using "void *" we enable the object_unref() method to be used to
provide support for g_autoptr() with any subclass.

Signed-off-by: Daniel P. Berrangé 
---
 include/qom/object.h | 4 ++--
 qom/object.c | 6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/qom/object.h b/include/qom/object.h
index 0f3a60617c..1f8aa2d48e 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -1035,7 +1035,7 @@ GSList *object_class_get_list_sorted(const char 
*implements_type,
  * as its reference count is greater than zero.
  * Returns: @obj
  */
-Object *object_ref(Object *obj);
+Object *object_ref(void *obj);
 
 /**
  * object_unref:
@@ -1044,7 +1044,7 @@ Object *object_ref(Object *obj);
  * Decrease the reference count of a object.  A object cannot be freed as long
  * as its reference count is greater than zero.
  */
-void object_unref(Object *obj);
+void object_unref(void *obj);
 
 /**
  * object_property_try_add:
diff --git a/qom/object.c b/qom/object.c
index 00fdf89b3b..b1822a2ef4 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -1124,8 +1124,9 @@ GSList *object_class_get_list_sorted(const char 
*implements_type,
 object_class_cmp);
 }
 
-Object *object_ref(Object *obj)
+Object *object_ref(void *objptr)
 {
+Object *obj = OBJECT(objptr);
 if (!obj) {
 return NULL;
 }
@@ -1133,8 +1134,9 @@ Object *object_ref(Object *obj)
 return obj;
 }
 
-void object_unref(Object *obj)
+void object_unref(void *objptr)
 {
+Object *obj = OBJECT(objptr);
 if (!obj) {
 return;
 }
-- 
2.26.2




[PATCH 4/4] crypto: use QOM macros for declaration/definition of TLS creds types

2020-07-23 Thread Daniel P . Berrangé
This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
families in the TLS creds types, in order to eliminate boilerplate code.

Signed-off-by: Daniel P. Berrangé 
---
 crypto/tlscreds.c | 25 +++--
 crypto/tlscredsanon.c | 23 ---
 crypto/tlscredspsk.c  | 25 +
 crypto/tlscredsx509.c | 29 -
 include/crypto/tlscreds.h | 13 ++---
 include/crypto/tlscredsanon.h | 14 ++
 include/crypto/tlscredspsk.h  | 13 ++---
 include/crypto/tlscredsx509.h | 13 ++---
 8 files changed, 24 insertions(+), 131 deletions(-)

diff --git a/crypto/tlscreds.c b/crypto/tlscreds.c
index b68735f06f..c238ff7d4b 100644
--- a/crypto/tlscreds.c
+++ b/crypto/tlscreds.c
@@ -24,6 +24,9 @@
 #include "tlscredspriv.h"
 #include "trace.h"
 
+OBJECT_DEFINE_ABSTRACT_TYPE(QCryptoTLSCreds, qcrypto_tls_creds,
+QCRYPTO_TLS_CREDS, OBJECT)
+
 #define DH_BITS 2048
 
 #ifdef CONFIG_GNUTLS
@@ -258,25 +261,3 @@ qcrypto_tls_creds_finalize(Object *obj)
 g_free(creds->dir);
 g_free(creds->priority);
 }
-
-
-static const TypeInfo qcrypto_tls_creds_info = {
-.parent = TYPE_OBJECT,
-.name = TYPE_QCRYPTO_TLS_CREDS,
-.instance_size = sizeof(QCryptoTLSCreds),
-.instance_init = qcrypto_tls_creds_init,
-.instance_finalize = qcrypto_tls_creds_finalize,
-.class_init = qcrypto_tls_creds_class_init,
-.class_size = sizeof(QCryptoTLSCredsClass),
-.abstract = true,
-};
-
-
-static void
-qcrypto_tls_creds_register_types(void)
-{
-type_register_static(&qcrypto_tls_creds_info);
-}
-
-
-type_init(qcrypto_tls_creds_register_types);
diff --git a/crypto/tlscredsanon.c b/crypto/tlscredsanon.c
index 30275b6847..dc1b77e37c 100644
--- a/crypto/tlscredsanon.c
+++ b/crypto/tlscredsanon.c
@@ -26,6 +26,9 @@
 #include "qom/object_interfaces.h"
 #include "trace.h"
 
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(QCryptoTLSCredsAnon, qcrypto_tls_creds_anon,
+   QCRYPTO_TLS_CREDS_ANON, QCRYPTO_TLS_CREDS,
+   { TYPE_USER_CREATABLE }, { NULL })
 
 #ifdef CONFIG_GNUTLS
 
@@ -191,25 +194,7 @@ qcrypto_tls_creds_anon_class_init(ObjectClass *oc, void 
*data)
 }
 
 
-static const TypeInfo qcrypto_tls_creds_anon_info = {
-.parent = TYPE_QCRYPTO_TLS_CREDS,
-.name = TYPE_QCRYPTO_TLS_CREDS_ANON,
-.instance_size = sizeof(QCryptoTLSCredsAnon),
-.instance_finalize = qcrypto_tls_creds_anon_finalize,
-.class_size = sizeof(QCryptoTLSCredsAnonClass),
-.class_init = qcrypto_tls_creds_anon_class_init,
-.interfaces = (InterfaceInfo[]) {
-{ TYPE_USER_CREATABLE },
-{ }
-}
-};
-
-
 static void
-qcrypto_tls_creds_anon_register_types(void)
+qcrypto_tls_creds_anon_init(Object *obj)
 {
-type_register_static(&qcrypto_tls_creds_anon_info);
 }
-
-
-type_init(qcrypto_tls_creds_anon_register_types);
diff --git a/crypto/tlscredspsk.c b/crypto/tlscredspsk.c
index e26807b899..0c66be3647 100644
--- a/crypto/tlscredspsk.c
+++ b/crypto/tlscredspsk.c
@@ -27,6 +27,10 @@
 #include "trace.h"
 
 
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(QCryptoTLSCredsPSK, qcrypto_tls_creds_psk,
+   QCRYPTO_TLS_CREDS_PSK, QCRYPTO_TLS_CREDS,
+   { TYPE_USER_CREATABLE }, { NULL })
+
 #ifdef CONFIG_GNUTLS
 
 static int
@@ -281,26 +285,7 @@ qcrypto_tls_creds_psk_class_init(ObjectClass *oc, void 
*data)
   qcrypto_tls_creds_psk_prop_set_username);
 }
 
-
-static const TypeInfo qcrypto_tls_creds_psk_info = {
-.parent = TYPE_QCRYPTO_TLS_CREDS,
-.name = TYPE_QCRYPTO_TLS_CREDS_PSK,
-.instance_size = sizeof(QCryptoTLSCredsPSK),
-.instance_finalize = qcrypto_tls_creds_psk_finalize,
-.class_size = sizeof(QCryptoTLSCredsPSKClass),
-.class_init = qcrypto_tls_creds_psk_class_init,
-.interfaces = (InterfaceInfo[]) {
-{ TYPE_USER_CREATABLE },
-{ }
-}
-};
-
-
 static void
-qcrypto_tls_creds_psk_register_types(void)
+qcrypto_tls_creds_psk_init(Object *obj)
 {
-type_register_static(&qcrypto_tls_creds_psk_info);
 }
-
-
-type_init(qcrypto_tls_creds_psk_register_types);
diff --git a/crypto/tlscredsx509.c b/crypto/tlscredsx509.c
index dd7267ccdb..a39555e5e6 100644
--- a/crypto/tlscredsx509.c
+++ b/crypto/tlscredsx509.c
@@ -28,6 +28,10 @@
 #include "trace.h"
 
 
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(QCryptoTLSCredsX509, qcrypto_tls_creds_x509,
+   QCRYPTO_TLS_CREDS_X509, QCRYPTO_TLS_CREDS,
+   { TYPE_USER_CREATABLE }, { NULL })
+
 #ifdef CONFIG_GNUTLS
 
 #include 
@@ -814,28 +818,3 @@ qcrypto_tls_creds_x509_class_init(ObjectClass *oc, void 
*data)
   qcrypto_tls_creds_x509_prop_get_passwordid,
   qcrypto_tls_creds_x509_prop_set_passwordid);
 }
-
-
-static const TypeInfo 

[PATCH 3/4] crypto: use QOM macros for declaration/definition of secret types

2020-07-23 Thread Daniel P . Berrangé
This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
families in the secret types, in order to eliminate boilerplate code.

Signed-off-by: Daniel P. Berrangé 
---
 crypto/secret.c | 24 
 crypto/secret_common.c  | 32 +---
 crypto/secret_keyring.c | 28 +---
 include/crypto/secret.h | 11 ++-
 include/crypto/secret_common.h  | 13 ++---
 include/crypto/secret_keyring.h | 18 ++
 6 files changed, 28 insertions(+), 98 deletions(-)

diff --git a/crypto/secret.c b/crypto/secret.c
index 281cb81f0f..55b406f79e 100644
--- a/crypto/secret.c
+++ b/crypto/secret.c
@@ -25,6 +25,9 @@
 #include "qemu/module.h"
 #include "trace.h"
 
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(QCryptoSecret, qcrypto_secret,
+   QCRYPTO_SECRET, QCRYPTO_SECRET_COMMON,
+   { TYPE_USER_CREATABLE }, { NULL })
 
 static void
 qcrypto_secret_load_data(QCryptoSecretCommon *sec_common,
@@ -140,26 +143,7 @@ qcrypto_secret_class_init(ObjectClass *oc, void *data)
   qcrypto_secret_prop_set_file);
 }
 
-
-static const TypeInfo qcrypto_secret_info = {
-.parent = TYPE_QCRYPTO_SECRET_COMMON,
-.name = TYPE_QCRYPTO_SECRET,
-.instance_size = sizeof(QCryptoSecret),
-.instance_finalize = qcrypto_secret_finalize,
-.class_size = sizeof(QCryptoSecretClass),
-.class_init = qcrypto_secret_class_init,
-.interfaces = (InterfaceInfo[]) {
-{ TYPE_USER_CREATABLE },
-{ }
-}
-};
-
-
 static void
-qcrypto_secret_register_types(void)
+qcrypto_secret_init(Object *obj)
 {
-type_register_static(&qcrypto_secret_info);
 }
-
-
-type_init(qcrypto_secret_register_types);
diff --git a/crypto/secret_common.c b/crypto/secret_common.c
index b03d530867..9a054b90b5 100644
--- a/crypto/secret_common.c
+++ b/crypto/secret_common.c
@@ -28,6 +28,9 @@
 #include "trace.h"
 
 
+OBJECT_DEFINE_ABSTRACT_TYPE(QCryptoSecretCommon, qcrypto_secret_common,
+QCRYPTO_SECRET_COMMON, OBJECT)
+
 static void qcrypto_secret_decrypt(QCryptoSecretCommon *secret,
const uint8_t *input,
size_t inputlen,
@@ -269,7 +272,7 @@ qcrypto_secret_prop_get_keyid(Object *obj,
 
 
 static void
-qcrypto_secret_finalize(Object *obj)
+qcrypto_secret_common_finalize(Object *obj)
 {
 QCryptoSecretCommon *secret = QCRYPTO_SECRET_COMMON(obj);
 
@@ -279,7 +282,7 @@ qcrypto_secret_finalize(Object *obj)
 }
 
 static void
-qcrypto_secret_class_init(ObjectClass *oc, void *data)
+qcrypto_secret_common_class_init(ObjectClass *oc, void *data)
 {
 object_class_property_add_bool(oc, "loaded",
qcrypto_secret_prop_get_loaded,
@@ -297,6 +300,10 @@ qcrypto_secret_class_init(ObjectClass *oc, void *data)
   qcrypto_secret_prop_set_iv);
 }
 
+static void
+qcrypto_secret_common_init(Object *obj)
+{
+}
 
 int qcrypto_secret_lookup(const char *secretid,
   uint8_t **data,
@@ -380,24 +387,3 @@ char *qcrypto_secret_lookup_as_base64(const char *secretid,
 g_free(data);
 return ret;
 }
-
-
-static const TypeInfo qcrypto_secret_info = {
-.parent = TYPE_OBJECT,
-.name = TYPE_QCRYPTO_SECRET_COMMON,
-.instance_size = sizeof(QCryptoSecretCommon),
-.instance_finalize = qcrypto_secret_finalize,
-.class_size = sizeof(QCryptoSecretCommonClass),
-.class_init = qcrypto_secret_class_init,
-.abstract = true,
-};
-
-
-static void
-qcrypto_secret_register_types(void)
-{
-type_register_static(&qcrypto_secret_info);
-}
-
-
-type_init(qcrypto_secret_register_types);
diff --git a/crypto/secret_keyring.c b/crypto/secret_keyring.c
index 8bfc58ebf4..463aefe5dc 100644
--- a/crypto/secret_keyring.c
+++ b/crypto/secret_keyring.c
@@ -26,6 +26,9 @@
 #include "trace.h"
 #include "crypto/secret_keyring.h"
 
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(QCryptoSecretKeyring, 
qcrypto_secret_keyring,
+   QCRYPTO_SECRET_KEYRING, 
QCRYPTO_SECRET_COMMON,
+   { TYPE_USER_CREATABLE }, { NULL })
 
 static inline
 long keyctl_read(int32_t key, uint8_t *buffer, size_t buflen)
@@ -109,6 +112,11 @@ qcrypto_secret_keyring_complete(UserCreatable *uc, Error 
**errp)
 }
 
 
+static void
+qcrypto_secret_keyring_finalize(Object *obj)
+{
+}
+
 static void
 qcrypto_secret_keyring_class_init(ObjectClass *oc, void *data)
 {
@@ -124,25 +132,7 @@ qcrypto_secret_keyring_class_init(ObjectClass *oc, void 
*data)
   NULL, NULL);
 }
 
-
-static const TypeInfo qcrypto_secret_info = {
-.parent = TYPE_QCRYPTO_SECRET_COMMON,
-.name = TYPE_QCRYPTO_SECRET_KEYRING,
-.instance_size = sizeof(QCryptoSecretKeyring),
-.class_size = sizeof(QCryptoSecretKeyringClass),
-.class_init = 

Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement

2020-07-23 Thread Matthew Rosato

On 7/23/20 12:29 PM, Alex Williamson wrote:

On Thu, 23 Jul 2020 11:13:55 -0400
Matthew Rosato  wrote:


I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
and block MMIO access on disabled memory' vfio-pci via qemu on s390x
fails spectacularly, with errors in qemu like:

qemu-system-s390x: vfio_region_read(0001:00:00.0:region0+0x0, 4) failed: 
Input/output error

 From read to bar 0 originating out of hw/s390x/s390-pci-inst.c:zpci_read_bar().

So, I'm trying to figure out how to get vfio-pci happy again on s390x.  From
a bit of tracing, we seem to be triggering the new trap in
__vfio_pci_memory_enabled().  Sure enough, if I just force this function to
return 'true' as a test case, things work again.
The included patch attempts to enforce the setting, which restores everything
to working order but also triggers vfio_bar_restore() in the process.  So
this isn't the right answer, more of a proof-of-concept.

@Alex: Any guidance on what needs to happen to make qemu-s390x happy with this
recent kernel change?


Bummer!  I won't claim to understand s390 PCI, but if we have a VF
exposed to the "host" (ie. the first level where vfio-pci is being
used), but we can't tell that it's a VF, how do we know whether the
memory bit in the command register is unimplemented because it's a VF
or unimplemented because the device doesn't support MMIO?  How are the
device ID, vendor ID, and BAR registers virtualized to the host?  Could


On s390 this info is all advertised/accessed via special instructions 
(my PoC qemu patch was intercepting one of these sorts of instructions 
from the guest and modifying it).



the memory enable bit also be emulated by that virtualization, much
like vfio-pci does for userspace?  If the other registers are
virtualized, but these command register bits are left unimplemented, do
we need code to deduce that we have a VF based on the existence of MMIO
BARs, but lack of memory enable bit?  Thanks,


Yeah, I'm thinking we might need something to this effect for s390 at 
least.  But I'm curious if Niklas/Pierre have any add'l thoughts about 
that since they touched the virtfn stuff recently for s390 PCI.




Alex


@Niklas/@Pierre: I wonder if this might be related to host device is_virtfn?
I note that my host device lspci output looks like:

:00:00.0 Ethernet controller: Mellanox Technologies MT27710 Family 
[ConnectX-4 Lx Virtual Function]

But the device is not marked as is_virtfn.  Otherwise, Alex's fix
from https://lkml.org/lkml/2020/6/25/628 should cover the case.



Matthew Rosato (1):
   s390x/pci: Enforce PCI_COMMAND_MEMORY for vfio-pci

  hw/s390x/s390-pci-inst.c | 10 ++
  1 file changed, 10 insertions(+)








Re: [PULL 0/9] acpi,virtio,pc: bugfixes

2020-07-23 Thread Peter Maydell
On Wed, 22 Jul 2020 at 13:09, Michael S. Tsirkin  wrote:
>
> The following changes since commit c8004fe6bbfc0d9c2e7b942c418a85efb3ac4b00:
>
>   Update version for v5.1.0-rc1 release (2020-07-21 20:28:59 +0100)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
>
> for you to fetch changes up to ccec7e9603f446fe75c6c563ba335c00cfda6a06:
>
>   virtio-pci: Changed vdev to proxy for VirtIO PCI BAR callbacks. (2020-07-22 
> 08:05:37 -0400)
>
> 
> acpi,virtio,pc: bugfixes
>
> Fix bugs in ACPI which were tripping up guests.
> Fix a use-after-free with hotplug of virtio devices.
> Block ability to create legacy devices which shouldn't have been
> there in the first place.
> Fix migration error handling with balloon.
> Drop some dead code in virtio.
> vtd emulation fixup.
>
> Signed-off-by: Michael S. Tsirkin 
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM


