date:20171211

Re: [Qemu-devel] [PATCH] Add ability for user to specify mouse ungrab key

2017-12-11 Thread no-reply

Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 20171210195931.26042-1-programmingk...@gmail.com
Subject: [Qemu-devel] [PATCH] Add ability for user to specify mouse ungrab key
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
time make docker-test-build@min-glib
time make docker-test-mingw@fedora
# iotests is broken now, skip
# time make docker-test-block@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20171210195931.26042-1-programmingk...@gmail.com -> patchew/20171210195931.26042-1-programmingk...@gmail.com
Switched to a new branch 'test'
27712d6200 Add ability for user to specify mouse ungrab key

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-451ut46n/src/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-451ut46n/src'
  GEN     /var/tmp/patchew-tester-tmp-451ut46n/src/docker-src.2017-12-10-15.04.43.31944/qemu.tar
Cloning into '/var/tmp/patchew-tester-tmp-451ut46n/src/docker-src.2017-12-10-15.04.43.31944/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-451ut46n/src/docker-src.2017-12-10-15.04.43.31944/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into '/var/tmp/patchew-tester-tmp-451ut46n/src/docker-src.2017-12-10-15.04.43.31944/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out '10739aa26051a5d49d88132604539d3ed085e72e'
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
bison-2.4.1-5.el6.x86_64
bzip2-devel-1.0.5-7.el6_0.x86_64
ccache-3.1.6-2.el6.x86_64
csnappy-devel-0-6.20150729gitd7bc683.el6.x86_64
flex-2.5.35-9.el6.x86_64
gcc-4.4.7-18.el6.x86_64
gettext-0.17-18.el6.x86_64
git-1.7.1-9.el6_9.x86_64
glib2-devel-2.28.8-9.el6.x86_64
libepoxy-devel-1.2-3.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
librdmacm-devel-1.0.21-0.el6.x86_64
lzo-devel-2.03-3.1.el6_5.1.x86_64
make-3.81-23.el6.x86_64
mesa-libEGL-devel-11.0.7-4.el6.x86_64
mesa-libgbm-devel-11.0.7-4.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
spice-glib-devel-0.26-8.el6.x86_64
spice-server-devel-0.12.4-16.el6.x86_64
tar-1.23-15.el6_8.x86_64
vte-devel-0.25.1-9.el6.x86_64
xen-devel-4.6.6-2.el6.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=bison bzip2-devel ccache csnappy-devel flex g++
 gcc gettext git glib2-devel libepoxy-devel libfdt-devel
 librdmacm-devel lzo-devel make mesa-libEGL-devel mesa-libgbm-devel pixman-devel SDL-devel spice-glib-devel spice-server-devel tar vte-devel xen-devel zlib-devel
HOSTNAME=97e5010c8368
MAKEFLAGS= -j8
J=8
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
TARGET_LIST=
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
FEATURES= dtc
DEBUG=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/tmp/qemu-test/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /tmp/qemu-test/install
BIOS directory    /tmp/qemu-test/install/share/qemu
firmware path     /tmp/qemu-test/install/share/qemu-firmware
binary directory  /tmp/qemu-test/install/bin
library directory /tmp/qemu-test/install/lib
module directory  /tmp/qemu-test/install/lib/qemu
libexec directory /tmp/qemu-test/install/libexec
include directory /tmp/qemu-test/install/include
config directory  /tmp/qemu-test/install/etc
local state directory   /tmp/qemu-test/install/var
Manual directory  /tmp/qemu-test/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
GIT binary        git
GIT submodules
C compiler        cc
Host C compiler   cc
C++ compiler
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1   -I$(SRC_PATH)/dtc/libfdt -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include   -DNCURSES_WIDECHAR   -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE

Re: [Qemu-devel] [PATCH v3] rcu: reduce more than 7MB heap memory by malloc_trim()

2017-12-11 Thread Shannon Zhao



On 2017/12/12 14:54, Yang Zhong wrote:
>> 2) what effect it has on boot time in Shannon's case.
>   Hello Shannon,
> 
>   It's hard for me to reproduce your commands in my x86 environment. As a
>   comparison test, would you please use the above two TEMP patches to verify
>   the VM boot time again?
> 
>   That data can help Paolo decide which patch to use or how to adjust the
>   delta parameter.  Many thanks!
> 
Sure, I'll test these patches.

Thanks,
-- 
Shannon

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

2017-12-11 Thread Alexey Kardashevskiy

On 12/12/17 17:06, Alexey Kardashevskiy wrote:
> On 12/12/17 16:54, Alex Williamson wrote:
>> On Tue, 12 Dec 2017 16:21:31 +1100
>> Alexey Kardashevskiy  wrote:
>>
>>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
>>> which tells that a region with MSIX data can be mapped entirely, i.e.
>>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
>>>
>>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
>>> by default and "false" for pseries-2.12+ machines.
>>>
>>> This requires the kernel's "vfio-pci: Allow mapping MSIX BAR"
>>> https://www.spinics.net/lists/kvm/msg160282.html
>>>
>>> Signed-off-by: Alexey Kardashevskiy 
>>> ---
>>>
>>> This is an RFC as it requires kernel headers update which is not there yet.
>>>
>>> I'd like to make it "msix-mmap" (without "no") but could not find a way
>>> of enabling a device property for machine versions newer than some value.
>>>
>>> I changed 2.11 machine just for the demonstration purpose.
>>>
>>>
>>> ---
>>>  hw/vfio/pci.h |  1 +
>>>  include/hw/vfio/vfio-common.h |  1 +
>>>  linux-headers/linux/vfio.h|  5 +
>>>  hw/ppc/spapr.c| 10 +-
>>>  hw/vfio/common.c  | 15 +++
>>>  hw/vfio/pci.c | 11 +++
>>>  6 files changed, 42 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>>> index a8fb3b3..53912ef 100644
>>> --- a/hw/vfio/pci.h
>>> +++ b/hw/vfio/pci.h
>>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>>>  bool no_kvm_intx;
>>>  bool no_kvm_msi;
>>>  bool no_kvm_msix;
>>> +bool msix_no_mmap;
>>>  } VFIOPCIDevice;
>>>  
>>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>>> index f3a2ac9..927d600 100644
>>> --- a/include/hw/vfio/vfio-common.h
>>> +++ b/include/hw/vfio/vfio-common.h
>>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int 
>>> index,
>>>   struct vfio_region_info **info);
>>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>>   uint32_t subtype, struct vfio_region_info 
>>> **info);
>>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int 
>>> region);
>>>  #endif
>>>  extern const MemoryListener vfio_prereg_listener;
>>>  
>>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>>> index 4e7ab4c..bce9baf 100644
>>> --- a/linux-headers/linux/vfio.h
>>> +++ b/linux-headers/linux/vfio.h
>>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG (2)
>>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG  (3)
>>>  
>>> +/*
>>> + * The MSIX mappable capability informs that MSIX data of a BAR can be 
>>> mmapped.
>>> + */
>>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE 3
>>> +
>>>  /**
>>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>>>   * struct vfio_irq_info)
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 9de63f0..1dfc386 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>>>  /*
>>>   * pseries-2.11
>>>   */
>>> +#define SPAPR_COMPAT_2_11 \
>>> +HW_COMPAT_2_10\
>>> +{ \
>>> +.driver = "vfio-pci", \
>>> +.property = "msix-no-mmap",   \
>>> +.value= "on", \
>>> +},\
>>> +
>>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>>>  {
>>>  }
>>>  
>>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>>  {
>>> -/* Defaults for the latest behaviour inherited from the base class */
>>> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>>>  }
>>>  
>>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index ed7717d..593514c 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, 
>>> uint32_t type,
>>>  return -ENODEV;
>>>  }
>>>  
>>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
>>> +{
>>> +    struct vfio_region_info *info = NULL;
>>> +    bool ret = false;
>>> +
>>> +    if (!vfio_get_region_info(vbasedev, region, &info)) {
>>> +        if (vfio_get_region_info_cap(info, cap_type)) {
>>> +            ret = true;
>>> +        }
>>> +        g_free(info);
>>> +    }
>>> +
>>> +    return ret;
>>>

Re: [Qemu-devel] [PATCH v3] rcu: reduce more than 7MB heap memory by malloc_trim()

2017-12-11 Thread Yang Zhong

On Mon, Dec 11, 2017 at 05:31:43PM +0100, Paolo Bonzini wrote:
> On 07/12/2017 16:06, Yang Zhong wrote:
> >  These show that each trim takes less than 1 ms, and that call_rcu_thread()
> > does 10 batch frees, so the trim also runs 10 times.
> > 
> >  I also made the changes below:
> > delta=1000,  and 
> > next_trim_time = qemu_clock_get_ns(QEMU_CLOCK_HOST) + delta * last_trim_time
> > 
> >  The whole VM bootup will trim 3 times.
> 
> For any adaptive mechanism (either this one or the simple "if (n == 0)"
> one), the question is:
> 
> 1) what effect it has on RSS in your case
  Hello Paolo,

  I list the two TEMP patches here:

  (1). if (n==0) patch
  /*
   * Global grace period counter.  Bit 0 is always one in rcu_gp_ctr.
   * Bits 1 and above are defined in synchronize_rcu.
  @@ -246,6 +246,7 @@ static void *call_rcu_thread(void *opaque)
  qemu_event_reset(&rcu_call_ready_event);
  n = atomic_read(&rcu_call_count);
  if (n == 0) {
  +malloc_trim(4 * 1024 * 1024);
  qemu_event_wait(&rcu_call_ready_event);
  }
  }

  (2). adaptive patch

   rcu_register_thread();

  @@ -272,6 +273,21 @@ static void *call_rcu_thread(void *opaque)
 node->func(node);
 }
 qemu_mutex_unlock_iothread();
  +
  +static uint64_t next_trim_time, last_trim_time;
  +int delta=1000;
  +
  +if ( qemu_clock_get_ns(QEMU_CLOCK_HOST) < next_trim_time ) {
  +next_trim_time -= last_trim_time / delta;   /* or higher */
  +last_trim_time -= last_trim_time / delta;   /* same as previous line */
  +} else {
  +uint64_t trim_start_time = qemu_clock_get_ns(QEMU_CLOCK_HOST);
  +malloc_trim(4 * 1024 *1024);
  +last_trim_time = qemu_clock_get_ns(QEMU_CLOCK_HOST) - trim_start_time;
  +next_trim_time = qemu_clock_get_ns(QEMU_CLOCK_HOST) + delta * last_trim_time;
  +   }
  +


   I tested with the two TEMP patches; the results are below.

   My test command:
   sudo ./qemu-system-x86_64 -enable-kvm -cpu host -m 2G -smp cpus=4,cores=4,threads=1,sockets=1 \
       -drive format=raw,file=/home/yangzhon/icx/workspace/eywa.img,index=0,media=disk -nographic
  
  (1) if (n==0) patch
563015d84000-563016fd6000 rw-p  00:00 0  [heap]
Size:             18760 kB
KernelPageSize:       4 kB
MMUPageSize:          4 kB
Rss:               3176 kB
Pss:               3176 kB

  (2) adaptive patch
55bd5975a000-55bd5a9ac000 rw-p  00:00 0  [heap]
Size:             18760 kB
KernelPageSize:       4 kB
MMUPageSize:          4 kB
Rss:               3196 kB
Pss:               3196 kB

  With delta=10, I get the result below:

56043a2e1000-56043b533000 rw-p  00:00 0  [heap]
Size:             18760 kB
KernelPageSize:       4 kB
MMUPageSize:          4 kB
Rss:               3168 kB
Pss:               3168 kB

 
  With my test command, the n==0 patch cuts the number of trims to half.
  With delta=1000 in patch (2), the trim runs 3 times; with delta=10, it runs 10 times.
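
  The back-off logic of the adaptive patch (2) can be tried outside QEMU with
  a simulated clock. The following is only a sketch under assumptions: the
  TrimThrottle struct and the trim_throttle_tick() and count_trims() helpers
  are hypothetical names, the trim cost is passed in instead of being measured
  around malloc_trim(), and the tick interval is arbitrary, so the counts it
  produces are not the boot-time numbers reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state holder; the TEMP patch uses two static variables. */
typedef struct {
    uint64_t next_trim_time; /* earliest time (ns) at which to trim again */
    uint64_t last_trim_time; /* duration (ns) of the most recent trim */
    uint64_t delta;          /* back-off factor, e.g. 1000 or 10 */
} TrimThrottle;

/*
 * One pass through the throttle at time 'now'. 'trim_cost' stands in for
 * the measured duration of malloc_trim(). Returns true if a trim ran.
 */
static bool trim_throttle_tick(TrimThrottle *t, uint64_t now,
                               uint64_t trim_cost)
{
    assert(t->delta > 0);

    if (now < t->next_trim_time) {
        /* Too early: pull the deadline in a little, as the patch does. */
        uint64_t step = t->last_trim_time / t->delta;
        t->next_trim_time -= step;
        t->last_trim_time -= step;
        return false;
    }

    /* The real code would call malloc_trim(4 * 1024 * 1024) here. */
    t->last_trim_time = trim_cost;
    t->next_trim_time = now + trim_cost + t->delta * trim_cost;
    return true;
}

/* Count how many trims happen over 'ticks' evenly spaced passes. */
static unsigned count_trims(uint64_t delta, uint64_t trim_cost,
                            uint64_t tick_ns, unsigned ticks)
{
    TrimThrottle t = { 0, 0, delta };
    uint64_t now = 0;
    unsigned trims = 0;

    for (unsigned i = 0; i < ticks; i++) {
        if (trim_throttle_tick(&t, now, trim_cost)) {
            trims++;
            now += trim_cost; /* the trim itself takes time */
        }
        now += tick_ns;
    }
    return trims;
}
```

  In this model a smaller delta shortens the wait after each trim, so
  count_trims(10, ...) never returns fewer trims than count_trims(1000, ...)
  for the same cost and tick interval, which matches the direction of the
  measurements above.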

  Regards,

  Yang 

> 2) what effect it has on boot time in Shannon's case.
  Hello Shannon,

  It's hard for me to reproduce your commands in my x86 environment. As a
  comparison test, would you please use the above two TEMP patches to verify
  the VM boot time again?

  That data can help Paolo decide which patch to use or how to adjust the
  delta parameter.  Many thanks!

  Regards,

  Yang


> Either patch is okay if you can justify it with these two performance
> indices.
> 
> Thanks,
> 
> Paolo

[Qemu-devel] [PATCH] MAINTAINERS: replace the unavailable email address

2017-12-11 Thread Shannon Zhao

From: Zhaoshenglong 

Since I no longer work as an assignee at Linaro, replace the Linaro email
address with my personal one.

Signed-off-by: Zhaoshenglong 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0255113..45e2e20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -543,7 +543,7 @@ F: include/hw/*/xlnx*.h
 
 ARM ACPI Subsystem
 M: Shannon Zhao 
-M: Shannon Zhao 
+M: Shannon Zhao 
 L: qemu-...@nongnu.org
 S: Maintained
 F: hw/arm/virt-acpi-build.c
-- 
2.0.4

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

2017-12-11 Thread Alexey Kardashevskiy

On 12/12/17 16:54, Alex Williamson wrote:
> On Tue, 12 Dec 2017 16:21:31 +1100
> Alexey Kardashevskiy  wrote:
> 
>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
>> which tells that a region with MSIX data can be mapped entirely, i.e.
>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
>>
>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
>> by default and "false" for pseries-2.12+ machines.
>>
>> This requires the kernel's "vfio-pci: Allow mapping MSIX BAR"
>> https://www.spinics.net/lists/kvm/msg160282.html
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>>
>> This is an RFC as it requires kernel headers update which is not there yet.
>>
>> I'd like to make it "msix-mmap" (without "no") but could not find a way
>> of enabling a device property for machine versions newer than some value.
>>
>> I changed 2.11 machine just for the demonstration purpose.
>>
>>
>> ---
>>  hw/vfio/pci.h |  1 +
>>  include/hw/vfio/vfio-common.h |  1 +
>>  linux-headers/linux/vfio.h|  5 +
>>  hw/ppc/spapr.c| 10 +-
>>  hw/vfio/common.c  | 15 +++
>>  hw/vfio/pci.c | 11 +++
>>  6 files changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index a8fb3b3..53912ef 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>>  bool no_kvm_intx;
>>  bool no_kvm_msi;
>>  bool no_kvm_msix;
>> +bool msix_no_mmap;
>>  } VFIOPCIDevice;
>>  
>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index f3a2ac9..927d600 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>>   struct vfio_region_info **info);
>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>   uint32_t subtype, struct vfio_region_info 
>> **info);
>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int 
>> region);
>>  #endif
>>  extern const MemoryListener vfio_prereg_listener;
>>  
>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>> index 4e7ab4c..bce9baf 100644
>> --- a/linux-headers/linux/vfio.h
>> +++ b/linux-headers/linux/vfio.h
>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG  (2)
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG   (3)
>>  
>> +/*
>> + * The MSIX mappable capability informs that MSIX data of a BAR can be 
>> mmapped.
>> + */
>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE  3
>> +
>>  /**
>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>>   *  struct vfio_irq_info)
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 9de63f0..1dfc386 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>>  /*
>>   * pseries-2.11
>>   */
>> +#define SPAPR_COMPAT_2_11 \
>> +HW_COMPAT_2_10\
>> +{ \
>> +.driver = "vfio-pci", \
>> +.property = "msix-no-mmap",   \
>> +.value= "on", \
>> +},\
>> +
>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>>  {
>>  }
>>  
>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>  {
>> -/* Defaults for the latest behaviour inherited from the base class */
>> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>>  }
>>  
>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index ed7717d..593514c 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, 
>> uint32_t type,
>>  return -ENODEV;
>>  }
>>  
>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
>> +{
>> +    struct vfio_region_info *info = NULL;
>> +    bool ret = false;
>> +
>> +    if (!vfio_get_region_info(vbasedev, region, &info)) {
>> +        if (vfio_get_region_info_cap(info, cap_type)) {
>> +            ret = true;
>> +        }
>> +        g_free(info);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>  /*
>>   * Interfaces for IBM EEH (Enhanced Error Handling)
>>   */
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index c977ee3..d9aeae8 100644
>> ---

[Qemu-devel] [Bug 902413] Re: qemu-i386-user on ARM host: wine hangs/spins when trying to run anything

2017-12-11 Thread Juan Melgarejo Ludeña

A year has passed since the last update by Nathan Shearer, but the status is
still labeled 'Incomplete'. Please check whether this is solved with wine 3.0.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/902413

Title:
  qemu-i386-user on ARM host: wine hangs/spins when trying to run
  anything

Status in QEMU:
  Incomplete
Status in wine package in Gentoo Linux:
  New

Bug description:
  With qemu built from git from 217bfb445b54db618a30f3a39170bebd9fd9dbf2
  and configured with './configure --target-list=i386-linux-user
  --static --interp-prefix=/home/pgriffais/natty-i386/', trying to run
  wine 1.3.15 from an Ubuntu 11.04 chroot results in hangs. If I run an
  i386 emulated wineserver, wineserver hangs in:

  0x600c7f8c in read () at ../sysdeps/unix/syscall-template.S:82
  82../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
  (gdb) bt
  #0  0x600c7f8c in read () at ../sysdeps/unix/syscall-template.S:82
  #1  0x6004a316 in read (cpu_env=0x622c3ee8, num=3, arg1=6, arg2=1121255519, 
  arg3=1, arg4=134875664, arg5=1, arg6=1121255528, arg7=0, arg8=0)
  at /usr/include/bits/unistd.h:45
  #2  do_syscall (cpu_env=0x622c3ee8, num=3, arg1=6, arg2=1121255519, arg3=1, 
  arg4=134875664, arg5=1, arg6=1121255528, arg7=0, arg8=0)
  at /home/ubuntu/src/qemu/linux-user/syscall.c:4691
  #3  0x600262f0 in cpu_loop (env=0x622c3ee8)
  at /home/ubuntu/src/qemu/linux-user/main.c:321
  #4  0x60026bbc in main (argc=, 
  argv=, envp=)
  at /home/ubuntu/src/qemu/linux-user/main.c:3817

  While wine hangs in:

  0x600c84ac in recvmsg () at ../sysdeps/unix/syscall-template.S:82
  82../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
  (gdb) bt
  #0  0x600c84ac in recvmsg () at ../sysdeps/unix/syscall-template.S:82
  #1  0x60041c4e in do_sendrecvmsg (fd=4, target_msg=, 
  flags=1073741824, send=0)
  at /home/ubuntu/src/qemu/linux-user/syscall.c:1834
  #2  0x600497ec in do_socketcall (cpu_env=, num=102, 
  arg1=17, arg2=1122504544, arg3=2076831732, arg4=1122504568, 
  arg5=2076942688, arg6=1122504888, arg7=0, arg8=0)
  at /home/ubuntu/src/qemu/linux-user/syscall.c:2235
  #3  do_syscall (cpu_env=, num=102, arg1=17, 
  arg2=1122504544, arg3=2076831732, arg4=1122504568, arg5=2076942688, 
  arg6=1122504888, arg7=0, arg8=0)
  at /home/ubuntu/src/qemu/linux-user/syscall.c:6085
  #4  0x600262f0 in cpu_loop (env=0x622c3f08)
  at /home/ubuntu/src/qemu/linux-user/main.c:321
  #5  0x60026bbc in main (argc=, 
  argv=, envp=)
  at /home/ubuntu/src/qemu/linux-user/main.c:3817

  However if I build wineserver 1.3.15 natively for ARM and run it on
  the host while wine is emulated, I get the following:

  root@tiberiusstation:/home/ubuntu# ./natty-i386/usr/bin/wine notepad
  Unsupported ancillary data: 1/2
  Unsupported ancillary data: 1/2
  Unsupported ancillary data: 1/2
  err:process:__wine_kernel_init boot event wait timed out

  I assume the last one is due to wineboot.exe hanging. The main wine
  process hangs in there:

  tcg_temp_new_internal_i32 (temp_local=)
  at /home/ubuntu/src/qemu/tcg/tcg.c:483
  483   }
  (gdb) bt
  #0  tcg_temp_new_internal_i32 (temp_local=)
  at /home/ubuntu/src/qemu/tcg/tcg.c:483
  #1  0x60052ac6 in tcg_temp_new_i32 (val=6)
  at /home/ubuntu/src/qemu/tcg/tcg.h:442
  #2  tcg_const_i32 (val=6) at /home/ubuntu/src/qemu/tcg/tcg.c:530
  #3  0x6005ef0c in tcg_gen_shri_i32 (ot=2, op1=2, op2=7, is_right=1, 
  is_arith=0, s=)
  at /home/ubuntu/src/qemu/tcg/tcg-op.h:605
  #4  gen_shift_rm_im (ot=2, op1=2, op2=7, is_right=1, is_arith=0, 
  s=)
  at /home/ubuntu/src/qemu/target-i386/translate.c:1514
  #5  0x6006df90 in gen_shifti (s=0xbefea970, pc_start=)
  at /home/ubuntu/src/qemu/target-i386/translate.c:1946
  #6  disas_insn (s=0xbefea970, pc_start=)
  at /home/ubuntu/src/qemu/target-i386/translate.c:5397
  #7  0x60091758 in gen_intermediate_code_internal (env=0x625656f8, 
  tb=0x402cdf48) at /home/ubuntu/src/qemu/target-i386/translate.c:7825
  #8  gen_intermediate_code_pc (env=0x625656f8, tb=0x402cdf48)
  at /home/ubuntu/src/qemu/target-i386/translate.c:7896
  #9  0x60054bf2 in cpu_restore_state (tb=0x402cdf48, env=0x62565690, 
  searched_pc=1617393812) at /home/ubuntu/src/qemu/translate-all.c:126
  #10 0x60091d9e in handle_cpu_signal (host_signum=, 
  pinfo=, puc=0xbefeab70)
  at /home/ubuntu/src/qemu/user-exec.c:117
  #11 cpu_x86_signal_handler (host_signum=, 
  pinfo=, puc=0xbefeab70)
  at /home/ubuntu/src/qemu/user-exec.c:458
  #12 0x6003c764 in host_signal_handler (host_signum=11, info=0xbefeaaf0, 
  puc=)
  at /home/ubuntu/src/qemu/linux-user/signal.c:492
  #13 
  #14 0x60677894 in static_code_gen_buffer ()
  #15 0x6000a260 in cpu_x86_exec (env=0x0)
  at

Re: [Qemu-devel] [PATCH 2/3] hw/arm/virt: Add another UART to the virt board

2017-12-11 Thread Shannon Zhao



On 2017/12/8 23:02, Peter Maydell wrote:
> Currently we only provide one non-secure UART on the virt
> board. This is OK for most purposes, but there are some
> use cases where having a second UART would be useful (like
> bare-metal testing where you don't really want to have to
> probe and set up a PCI device just to have a second comms
> channel).
> 
> Add a second NS UART to the virt board. This will be the
> second serial device if 'secure=no' (the default), and the
> third serial device if 'secure=yes'.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/virt.h |  2 ++
>  hw/arm/virt.c | 19 ---
>  2 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 33b0ff3..685009a 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -72,6 +72,7 @@ enum {
>  VIRT_GPIO,
>  VIRT_SECURE_UART,
>  VIRT_SECURE_MEM,
> +VIRT_UART_2,
>  };
>  
>  typedef struct MemMapEntry {
> @@ -85,6 +86,7 @@ typedef struct {
>  bool no_its;
>  bool no_pmu;
>  bool claim_edge_triggered_timers;
> +bool no_second_uart;
>  } VirtMachineClass;
>  
>  typedef struct {
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 543f9bd..e234f55 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -139,6 +139,7 @@ static const MemMapEntry a15memmap[] = {
>  [VIRT_FW_CFG] = { 0x0902, 0x0018 },
>  [VIRT_GPIO] =   { 0x0903, 0x1000 },
>  [VIRT_SECURE_UART] ={ 0x0904, 0x1000 },
> +[VIRT_UART_2] = { 0x0905, 0x1000 },
>  [VIRT_MMIO] =   { 0x0a00, 0x0200 },
>  /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size 
> */
>  [VIRT_PLATFORM_BUS] =   { 0x0c00, 0x0200 },
> @@ -157,6 +158,7 @@ static const int a15irqmap[] = {
>  [VIRT_PCIE] = 3, /* ... to 6 */
>  [VIRT_GPIO] = 7,
>  [VIRT_SECURE_UART] = 8,
> +[VIRT_UART_2] = 9,
>  [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
>  [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
>  [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
> @@ -676,7 +678,7 @@ static void create_uart(const VirtMachineState *vms, 
> qemu_irq *pic, int uart,
>  
>  if (uart == VIRT_UART) {
>  qemu_fdt_setprop_string(vms->fdt, "/chosen", "stdout-path", 
> nodename);
> -} else {
> +} else if (uart == VIRT_SECURE_UART) {
>  /* Mark as not usable by the normal world */
>  qemu_fdt_setprop_string(vms->fdt, nodename, "status", "disabled");
>  qemu_fdt_setprop_string(vms->fdt, nodename, "secure-status", "okay");
> @@ -1260,6 +1262,7 @@ static void machvirt_init(MachineState *machine)
>  int n, virt_max_cpus;
>  MemoryRegion *ram = g_new(MemoryRegion, 1);
>  bool firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
> +int uart_count = 0;
>  
>  /* We can probe only here because during property set
>   * KVM is not available yet
> @@ -1419,11 +1422,16 @@ static void machvirt_init(MachineState *machine)
>  
>  fdt_add_pmu_nodes(vms);
>  
> -create_uart(vms, pic, VIRT_UART, sysmem, serial_hds[0]);
> +create_uart(vms, pic, VIRT_UART, sysmem, serial_hds[uart_count++]);
>  
>  if (vms->secure) {
>  create_secure_ram(vms, secure_sysmem);
> -create_uart(vms, pic, VIRT_SECURE_UART, secure_sysmem, 
> serial_hds[1]);
> +create_uart(vms, pic, VIRT_SECURE_UART, secure_sysmem,
> +serial_hds[uart_count++]);
> +}
> +
> +if (!vmc->no_second_uart) {
> +create_uart(vms, pic, VIRT_UART_2, sysmem, serial_hds[uart_count++]);
>  }
>  
>  create_rtc(vms, pic);
> @@ -1693,8 +1701,13 @@ static void virt_2_11_instance_init(Object *obj)
>  
>  static void virt_machine_2_11_options(MachineClass *mc)
>  {
> +VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
> +
>  virt_machine_2_12_options(mc);
>  SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_11);
> +
> +/* The second NS UART was added in 2.12 */
> +vmc->no_second_uart = true;
>  }
>  DEFINE_VIRT_MACHINE(2, 11)
>  
> 
I'm wondering whether we need to provide a machine option that lets the user
choose whether to add the second UART.

Thanks,
-- 
Shannon
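
For reference, the serial device numbering described in the quoted commit
message (the new UART is the second serial device with 'secure=no' and the
third with 'secure=yes') follows directly from the uart_count++ pattern in the
patch. A minimal standalone sketch, assuming a hypothetical UartSlots struct
and assign_uart_slots() helper that only mirror the order of the create_uart()
calls and are not QEMU code:

```c
#include <stdbool.h>

/* Hypothetical result: index into serial_hds[] per UART, -1 if absent. */
typedef struct {
    int ns_uart;     /* VIRT_UART */
    int secure_uart; /* VIRT_SECURE_UART */
    int ns_uart2;    /* VIRT_UART_2 */
} UartSlots;

/* Hand out serial backends in the same order as the patched machvirt_init(). */
static UartSlots assign_uart_slots(bool secure, bool no_second_uart)
{
    UartSlots s = { -1, -1, -1 };
    int uart_count = 0;

    s.ns_uart = uart_count++;          /* always the first serial device */
    if (secure) {
        s.secure_uart = uart_count++;  /* only present with secure=yes */
    }
    if (!no_second_uart) {
        s.ns_uart2 = uart_count++;     /* second or third serial device */
    }
    return s;
}
```

With no_second_uart set, as the patch does for the 2.11 machine type, the
assignments are the same as before the patch.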

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

2017-12-11 Thread Alex Williamson

On Tue, 12 Dec 2017 16:21:31 +1100
Alexey Kardashevskiy  wrote:

> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> which tells that a region with MSIX data can be mapped entirely, i.e.
> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> 
> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> by default and "false" for pseries-2.12+ machines.
> 
> This requires the kernel's "vfio-pci: Allow mapping MSIX BAR"
> https://www.spinics.net/lists/kvm/msg160282.html
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> This is an RFC as it requires kernel headers update which is not there yet.
> 
> I'd like to make it "msix-mmap" (without "no") but could not find a way
> of enabling a device property for machine versions newer than some value.
> 
> I changed 2.11 machine just for the demonstration purpose.
> 
> 
> ---
>  hw/vfio/pci.h |  1 +
>  include/hw/vfio/vfio-common.h |  1 +
>  linux-headers/linux/vfio.h|  5 +
>  hw/ppc/spapr.c| 10 +-
>  hw/vfio/common.c  | 15 +++
>  hw/vfio/pci.c | 11 +++
>  6 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8fb3b3..53912ef 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>  bool no_kvm_intx;
>  bool no_kvm_msi;
>  bool no_kvm_msix;
> +bool msix_no_mmap;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f3a2ac9..927d600 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>   struct vfio_region_info **info);
>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>   uint32_t subtype, struct vfio_region_info 
> **info);
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int 
> region);
>  #endif
>  extern const MemoryListener vfio_prereg_listener;
>  
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4e7ab4c..bce9baf 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG   (2)
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG(3)
>  
> +/*
> + * The MSIX mappable capability informs that MSIX data of a BAR can be 
> mmapped.
> + */
> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE   3
> +
>  /**
>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>   *   struct vfio_irq_info)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9de63f0..1dfc386 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>  /*
>   * pseries-2.11
>   */
> +#define SPAPR_COMPAT_2_11 \
> +HW_COMPAT_2_10\
> +{ \
> +.driver = "vfio-pci", \
> +.property = "msix-no-mmap",   \
> +.value= "on", \
> +},\
> +
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>  {
>  }
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -/* Defaults for the latest behaviour inherited from the base class */
> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ed7717d..593514c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, 
> uint32_t type,
>  return -ENODEV;
>  }
>  
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> +{
> +    struct vfio_region_info *info = NULL;
> +    bool ret = false;
> +
> +    if (!vfio_get_region_info(vbasedev, region, &info)) {
> +        if (vfio_get_region_info_cap(info, cap_type)) {
> +            ret = true;
> +        }
> +        g_free(info);
> +    }
> +
> +    return ret;
> +}
> +
>  /*
>   * Interfaces for IBM EEH (Enhanced Error Handling)
>   */
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index c977ee3..d9aeae8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice 
> *vdev)
>  off_t start, end;
>  VFIORegion *region =
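
The capability test that vfio_is_cap_present() performs in the quoted patch
amounts to walking the capability chain attached to a region info buffer. The
following standalone sketch illustrates that walk under assumptions: the
struct layouts are simplified stand-ins for the kernel's struct
vfio_region_info and struct vfio_info_cap_header, region_has_cap() is a
hypothetical helper, and the value 3 for the MSIX mappable capability is the
one proposed in this RFC, not a merged kernel constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct vfio_info_cap_header. */
struct cap_header {
    uint16_t id;
    uint16_t version;
    uint32_t next;       /* byte offset of the next capability, 0 = end */
};

/* Stand-in for struct vfio_region_info. */
struct region_info {
    uint32_t argsz;
    uint32_t flags;
    uint32_t index;
    uint32_t cap_offset; /* byte offset of the first capability */
    uint64_t size;
    uint64_t offset;
};

#define REGION_INFO_FLAG_CAPS (1u << 3) /* mirrors VFIO_REGION_INFO_FLAG_CAPS */
#define CAP_MSIX_MAPPABLE     3         /* value proposed in the RFC */

/* Return true if the capability chain in 'buf' contains capability 'id'. */
static bool region_has_cap(const void *buf, size_t len, uint16_t id)
{
    struct region_info info;
    uint32_t off;

    if (len < sizeof(info)) {
        return false;
    }
    memcpy(&info, buf, sizeof(info));
    if (!(info.flags & REGION_INFO_FLAG_CAPS)) {
        return false;
    }

    off = info.cap_offset;
    /* The guard bounds the walk in case the chain is cyclic. */
    for (size_t guard = 0; off != 0 && guard < len; guard++) {
        struct cap_header hdr;

        if (off > len - sizeof(hdr)) {
            return false; /* header would run past the buffer */
        }
        memcpy(&hdr, (const uint8_t *)buf + off, sizeof(hdr));
        if (hdr.id == id) {
            return true;
        }
        off = hdr.next;
    }
    return false;
}
```

In QEMU the buffer comes from vfio_get_region_info() and is released with
g_free(), as in the patch; the sketch leaves buffer ownership to the caller.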

Re: [Qemu-devel] [PATCH 3/3] hw/arm/virt-acpi-build: Add second UART to ACPI tables

2017-12-11 Thread Shannon Zhao



On 2017/12/8 23:02, Peter Maydell wrote:
> Add the second UART to the ACPI tables.
> 
> Signed-off-by: Peter Maydell 
> ---
> Pure guesswork, as I don't have a UEFI setup to hand and
> am not familiar with ACPI table formats either...
> ---
>  hw/arm/virt-acpi-build.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 3d78ff6..a38287b 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -689,6 +689,7 @@ static void build_fadt(GArray *table_data, BIOSLinker 
> *linker,
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  {
> +VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>  Aml *scope, *dsdt;
>  const MemMapEntry *memmap = vms->memmap;
>  const int *irqmap = vms->irqmap;
> @@ -706,6 +707,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>  acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> (irqmap[VIRT_UART] + ARM_SPI_BASE));
> +if (!vmc->no_second_uart) {
> +acpi_dsdt_add_uart(scope, &memmap[VIRT_UART_2],
> +   (irqmap[VIRT_UART_2] + ARM_SPI_BASE));
> +}
>  acpi_dsdt_add_flash(scope, &memmap[VIRT_FLASH]);
>  acpi_dsdt_add_fw_cfg(scope, &memmap[VIRT_FW_CFG]);
>  acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
> 
Reviewed-by: Shannon Zhao 

-- 
Shannon

Re: [Qemu-devel] [PATCH] sparc: Make sure we mmap at SHMLBA alignment

2017-12-11 Thread Richard Henderson

On 12/08/2017 08:57 AM, Peter Maydell wrote:
> SPARC Linux has an oddity that it insists that mmap()
> of MAP_FIXED memory must be at an alignment defined by
> SHMLBA, which is more aligned than the page size
> (typically, SHMLBA alignment is to 16K, and pages are 8K).
> This is a relic of ancient hardware that had cache
> aliasing constraints, but even on modern hardware the
> kernel still insists on the alignment.
> 
> To ensure that we get mmap() alignment sufficient to
> make the kernel happy, change QEMU_VMALLOC_ALIGN,
> qemu_fd_getpagesize() and qemu_mempath_getpagesize()
> to use the maximum of getpagesize() and SHMLBA.
> 
> In particular, this allows 'make check' to pass on Sparc:
> we were previously failing the ivshmem tests.
> 
> Signed-off-by: Peter Maydell 
> ---

Reviewed-by: Richard Henderson 

r~
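
As an illustration of the alignment rule described in the commit message, here is a minimal sketch; it is not the patch itself. The helper names mmap_alignment() and align_up() are hypothetical, and the page size and SHMLBA are passed in as plain numbers so the example does not depend on a SPARC host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: use the larger of the page size and SHMLBA. */
size_t mmap_alignment(size_t page_size, size_t shmlba)
{
    return page_size > shmlba ? page_size : shmlba;
}

/* Round addr up to a multiple of align; align must be a power of two. */
size_t align_up(size_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With the typical values quoted above (8K pages, 16K SHMLBA), mmap_alignment(8192, 16384) yields 16384, so an address that is only page-aligned gets rounded up to the next 16K boundary.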

Re: [Qemu-devel] [PATCH qemu] RFC: spapr/iommu: Enable in-kernel TCE acceleration via VFIO KVM device

2017-12-11 Thread Alex Williamson

On Tue, 12 Dec 2017 16:18:53 +1100
Alexey Kardashevskiy  wrote:

> In order to enable TCE operations support in KVM, we have to inform
> KVM about VFIO groups being attached to specific LIOBNs. KVM
> already knows about VFIO groups; the only bit missing is which
> in-kernel TCE table (the one with user-visible TCEs) should update
> the attached groups. There is a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
> attribute of the VFIO KVM device which receives a groupfd/tablefd pair.
> 
> This adds get_attr()/set_attr() to IOMMUMemoryRegionClass, like
> iommu_ops::domain_get_attr/domain_set_attr in the Linux kernel.
> 
> This implements get_attr() for the sPAPR IOMMU to return a TCE table fd
> as an IOMMU_ATTR_KVM_FD attribute. This also now reads
> the KVM_CAP_SPAPR_TCE_VFIO capability to prevent the TCE table from
> being reallocated in userspace if KVM can accelerate TCE operations.
> 
> Finally, this notifies the VFIO KVM device about a new group being attached
> to a LIOBN.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> Assuming it is accepted, does it make sense to split
> include/exec/memory.h out and get merged separately?
> 
> ---
>  include/exec/memory.h | 10 ++
>  target/ppc/kvm_ppc.h  |  6 ++
>  hw/ppc/spapr_iommu.c  | 19 +++
>  hw/vfio/common.c  | 24 
>  target/ppc/kvm.c  |  7 ++-
>  hw/vfio/trace-events  |  1 +
>  6 files changed, 66 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 5ed4042..6395c6f 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -190,6 +190,10 @@ struct MemoryRegionOps {
>  const MemoryRegionMmio old_mmio;
>  };
>  
> +enum IOMMUMemoryRegionAttr {
> +IOMMU_ATTR_KVM_FD

You're generalizing the wrong thing here, this is specifically a
SPAPR_TCE_FD, call it that.

> +};
> +
>  typedef struct IOMMUMemoryRegionClass {
>  /* private */
>  struct DeviceClass parent_class;
> @@ -210,6 +214,12 @@ typedef struct IOMMUMemoryRegionClass {
>  IOMMUNotifierFlag new_flags);
>  /* Set this up to provide customized IOMMU replay function */
>  void (*replay)(IOMMUMemoryRegion *iommu, IOMMUNotifier *notifier);
> +
> +/* Get/set IOMMU misc attributes */
> +int (*get_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr,
> +void *data);
> +int (*set_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr,
> +void *data);
>  } IOMMUMemoryRegionClass;
>  
>  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index d6be38e..2b985e1 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -48,6 +48,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t 
> page_shift,
>  int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
>  int kvmppc_reset_htab(int shift_hint);
>  uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
> +bool kvmppc_has_cap_spapr_vfio(void);
>  #endif /* !CONFIG_USER_ONLY */
>  bool kvmppc_has_cap_epr(void);
>  int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
> @@ -231,6 +232,11 @@ static inline bool 
> kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
>  return true;
>  }
>  
> +static inline bool kvmppc_has_cap_spapr_vfio(void)
> +{
> +return false;
> +}
> +
>  #endif /* !CONFIG_USER_ONLY */
>  
>  static inline bool kvmppc_has_cap_epr(void)
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 5ccd785..ce8a769 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -17,6 +17,7 @@
>   * License along with this library; if not, see 
> .
>   */
>  #include "qemu/osdep.h"
> +#include 
>  #include "qemu/error-report.h"
>  #include "hw/hw.h"
>  #include "qemu/log.h"
> @@ -160,6 +161,19 @@ static uint64_t 
> spapr_tce_get_min_page_size(IOMMUMemoryRegion *iommu)
>  return 1ULL << tcet->page_shift;
>  }
>  
> +static int spapr_tce_get_attr(IOMMUMemoryRegion *iommu,
> +  enum IOMMUMemoryRegionAttr attr, void *data)
> +{
> +sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
> +
> +if (attr == IOMMU_ATTR_KVM_FD && kvmppc_has_cap_spapr_vfio()) {
> +*(int *) data = tcet->fd;
> +return 0;
> +}
> +
> +return -EINVAL;
> +}
> +
>  static void spapr_tce_notify_flag_changed(IOMMUMemoryRegion *iommu,
>IOMMUNotifierFlag old,
>IOMMUNotifierFlag new)
> @@ -284,6 +298,10 @@ void spapr_tce_set_need_vfio(sPAPRTCETable *tcet, bool 
> need_vfio)
>  
>  tcet->need_vfio = need_vfio;
>  
> +if (!need_vfio || (tcet->fd != -1 && kvmppc_has_cap_spapr_vfio())) {
> +return;
> +}
> +
>  oldtable = tcet->table;
>  
>

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

2017-12-11 Thread Alexey Kardashevskiy

On 12/12/17 16:21, Alexey Kardashevskiy wrote:
> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> which tells that a region with MSIX data can be mapped entirely, i.e.
> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> 
> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> by default and "false" for pseries-2.12+ machines.
> 
> This requires kernel's "vfio-pci: Allow mapping MSIX BAR"
> https://www.spinics.net/lists/kvm/msg160282.html
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> This is an RFC as it requires kernel headers update which is not there yet.
> 
> I'd like to make it "msix-mmap" (without "no") but could not find a way
> of enabling a device property for machine versions newer than some value.

Ah, this remark is wrong; making it a "no" property does not help.

How do we enforce some property on some device depending on a machine type?
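
To make the question concrete, here is a minimal standalone sketch of a per-machine-type override table using the same driver/property/value triples that appear in the diff below. CompatProp, compat_lookup() and the table contents are made up for the illustration; they are not QEMU's actual types or API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one compat entry: driver/property/value. */
typedef struct CompatProp {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Example table for an "old" machine type; contents are illustrative. */
const CompatProp old_machine_compat[] = {
    { "vfio-pci", "msix-no-mmap", "on" },
};
const size_t old_machine_compat_len =
    sizeof(old_machine_compat) / sizeof(old_machine_compat[0]);

/*
 * Return the override for driver.property from the table, or def
 * when the machine type does not override it.
 */
const char *compat_lookup(const CompatProp *props, size_t n,
                          const char *driver, const char *property,
                          const char *def)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].driver, driver) == 0 &&
            strcmp(props[i].property, property) == 0) {
            return props[i].value;
        }
    }
    return def;
}
```

In this model the device default carries the new behaviour and only older machine types list an override, which is why a table like this cannot by itself switch something on for "versions newer than some value".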



> 
> I changed 2.11 machine just for the demonstration purpose.
> 
> 
> ---
>  hw/vfio/pci.h |  1 +
>  include/hw/vfio/vfio-common.h |  1 +
>  linux-headers/linux/vfio.h|  5 +
>  hw/ppc/spapr.c| 10 +-
>  hw/vfio/common.c  | 15 +++
>  hw/vfio/pci.c | 11 +++
>  6 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8fb3b3..53912ef 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>  bool no_kvm_intx;
>  bool no_kvm_msi;
>  bool no_kvm_msix;
> +bool msix_no_mmap;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f3a2ac9..927d600 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>   struct vfio_region_info **info);
>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>   uint32_t subtype, struct vfio_region_info 
> **info);
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int 
> region);
>  #endif
>  extern const MemoryListener vfio_prereg_listener;
>  
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4e7ab4c..bce9baf 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG   (2)
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG(3)
>  
> +/*
> + * The MSIX mappable capability informs that MSIX data of a BAR can be 
> mmapped.
> + */
> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE   3
> +
>  /**
>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>   *   struct vfio_irq_info)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9de63f0..1dfc386 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>  /*
>   * pseries-2.11
>   */
> +#define SPAPR_COMPAT_2_11 \
> +HW_COMPAT_2_10\
> +{ \
> +.driver = "vfio-pci", \
> +.property = "msix-no-mmap",   \
> +.value= "on", \
> +},\
> +
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>  {
>  }
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -/* Defaults for the latest behaviour inherited from the base class */
> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ed7717d..593514c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, 
> uint32_t type,
>  return -ENODEV;
>  }
>  
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> +{
> +struct vfio_region_info *info = NULL;
> +bool ret = false;
> +
> +if (!vfio_get_region_info(vbasedev, region, &info)) {
> +if (vfio_get_region_info_cap(info, cap_type)) {
> +ret = true;
> +}
> +g_free(info);
> +}
> +
> +return ret;
> +}
> +
>  /*
>   * Interfaces for IBM EEH (Enhanced Error Handling)
>   */
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index c977ee3..d9aeae8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1289,6 +1289,12 @@ static

[Qemu-devel] [PATCH V4] pci: removed the is_express field since a uniform interface was inserted

2017-12-11 Thread Yoni Bettan

* according to Eduardo Habkost's commit
  fd3b02c8896d597dd8b9e053dec579cf0386aee1

* since all PCIEs now implement INTERFACE_PCIE_DEVICE we
  don't need this field anymore

* Devices that were only INTERFACE_PCIE_DEVICE (is_express == 1)
  or
  devices that were only INTERFACE_CONVENTIONAL_PCI_DEVICE
  (is_express == 0)
  were not affected by the change

  The only devices that were affected are those that are hybrid and also
  had (is_express == 1) - therefore only:
- hw/vfio/pci.c
- hw/usb/hcd-xhci.c

  For both I made sure that QEMU_PCI_CAP_EXPRESS is on

Signed-off-by: Yoni Bettan 
---
 docs/pcie_pci_bridge.txt   | 2 +-
 hw/block/nvme.c| 1 -
 hw/net/e1000e.c| 1 -
 hw/pci-bridge/pcie_pci_bridge.c| 1 -
 hw/pci-bridge/pcie_root_port.c | 1 -
 hw/pci-bridge/xio3130_downstream.c | 1 -
 hw/pci-bridge/xio3130_upstream.c   | 1 -
 hw/pci-host/xilinx-pcie.c  | 1 -
 hw/pci/pci.c   | 8 ++--
 hw/scsi/megasas.c  | 4 
 hw/usb/hcd-xhci.c  | 9 -
 hw/vfio/pci.c  | 5 -
 include/hw/pci/pci.h   | 3 ---
 13 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/docs/pcie_pci_bridge.txt b/docs/pcie_pci_bridge.txt
index 5a4203f97c..ab35ebf3ca 100644
--- a/docs/pcie_pci_bridge.txt
+++ b/docs/pcie_pci_bridge.txt
@@ -110,5 +110,5 @@ To enable device hot-plug into the bridge on Linux there're 
3 ways:
 Implementation
 ==
 The PCIE-PCI bridge is based on PCI-PCI bridge, but also accumulates PCI 
Express
-features as a PCI Express device (is_express=1).
+features as a PCI Express device.
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 441e21ed1f..9325bc0911 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1087,7 +1087,6 @@ static void nvme_class_init(ObjectClass *oc, void *data)
 pc->vendor_id = PCI_VENDOR_ID_INTEL;
 pc->device_id = 0x5845;
 pc->revision = 2;
-pc->is_express = 1;
 
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 dc->desc = "Non-Volatile Memory Express";
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index f1af279e8d..c360f0d8c9 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -675,7 +675,6 @@ static void e1000e_class_init(ObjectClass *class, void 
*data)
 c->revision = 0;
 c->romfile = "efi-e1000e.rom";
 c->class_id = PCI_CLASS_NETWORK_ETHERNET;
-c->is_express = 1;
 
 dc->desc = "Intel 82574L GbE Controller";
 dc->reset = e1000e_qdev_reset;
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index a4d827c99d..b7d9ebbec2 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -169,7 +169,6 @@ static void pcie_pci_bridge_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
 
-k->is_express = 1;
 k->is_bridge = 1;
 k->vendor_id = PCI_VENDOR_ID_REDHAT;
 k->device_id = PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE;
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index 9b6e4ce512..45f9e8cd4a 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -145,7 +145,6 @@ static void rp_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->is_express = 1;
 k->is_bridge = 1;
 k->config_write = rp_write_config;
 k->realize = rp_realize;
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index 1e09d2afb7..613a0d6bb7 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -177,7 +177,6 @@ static void xio3130_downstream_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->is_express = 1;
 k->is_bridge = 1;
 k->config_write = xio3130_downstream_write_config;
 k->realize = xio3130_downstream_realize;
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 227997ce46..d4645bddee 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -148,7 +148,6 @@ static void xio3130_upstream_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->is_express = 1;
 k->is_bridge = 1;
 k->config_write = xio3130_upstream_write_config;
 k->realize = xio3130_upstream_realize;
diff --git a/hw/pci-host/xilinx-pcie.c b/hw/pci-host/xilinx-pcie.c
index 7659253090..a4ca3ba30f 100644
--- a/hw/pci-host/xilinx-pcie.c
+++ b/hw/pci-host/xilinx-pcie.c
@@ -298,7 +298,6 @@ static void xilinx_pcie_root_class_init(ObjectClass *klass,

[Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

2017-12-11 Thread Alexey Kardashevskiy

This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
which tells that a region with MSIX data can be mapped entirely, i.e.
the VFIO PCI driver won't prevent MSIX vectors area from being mapped.

This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
by default and "false" for pseries-2.12+ machines.

This requires kernel's "vfio-pci: Allow mapping MSIX BAR"
https://www.spinics.net/lists/kvm/msg160282.html

Signed-off-by: Alexey Kardashevskiy 
---

This is an RFC as it requires kernel headers update which is not there yet.

I'd like to make it "msix-mmap" (without "no") but could not find a way
of enabling a device property for machine versions newer than some value.

I changed 2.11 machine just for the demonstration purpose.


---
 hw/vfio/pci.h |  1 +
 include/hw/vfio/vfio-common.h |  1 +
 linux-headers/linux/vfio.h|  5 +
 hw/ppc/spapr.c| 10 +-
 hw/vfio/common.c  | 15 +++
 hw/vfio/pci.c | 11 +++
 6 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index a8fb3b3..53912ef 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
 bool no_kvm_intx;
 bool no_kvm_msi;
 bool no_kvm_msix;
+bool msix_no_mmap;
 } VFIOPCIDevice;
 
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f3a2ac9..927d600 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
  struct vfio_region_info **info);
 int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
  uint32_t subtype, struct vfio_region_info **info);
+bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
 #endif
 extern const MemoryListener vfio_prereg_listener;
 
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4e7ab4c..bce9baf 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG (2)
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG  (3)
 
+/*
+ * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
+ */
+#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE 3
+
 /**
  * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
  * struct vfio_irq_info)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9de63f0..1dfc386 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
 /*
  * pseries-2.11
  */
+#define SPAPR_COMPAT_2_11 \
+HW_COMPAT_2_10\
+{ \
+.driver = "vfio-pci", \
+.property = "msix-no-mmap",   \
+.value= "on", \
+},\
+
 static void spapr_machine_2_11_instance_options(MachineState *machine)
 {
 }
 
 static void spapr_machine_2_11_class_options(MachineClass *mc)
 {
-/* Defaults for the latest behaviour inherited from the base class */
+SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
 }
 
 DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ed7717d..593514c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, 
uint32_t type,
 return -ENODEV;
 }
 
+bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
+{
+struct vfio_region_info *info = NULL;
+bool ret = false;
+
+if (!vfio_get_region_info(vbasedev, region, &info)) {
+if (vfio_get_region_info_cap(info, cap_type)) {
+ret = true;
+}
+g_free(info);
+}
+
+return ret;
+}
+
 /*
  * Interfaces for IBM EEH (Enhanced Error Handling)
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c977ee3..d9aeae8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice 
*vdev)
 off_t start, end;
 VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
 
+if (!vdev->msix_no_mmap &&
+vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
+vdev->msix->table_bar)) {
+return;
+}
+
 /*
  * We expect to find a single mmap covering the whole BAR, anything else
  * means it's either unsupported or already

[Qemu-devel] [PATCH qemu] RFC: spapr/iommu: Enable in-kernel TCE acceleration via VFIO KVM device

2017-12-11 Thread Alexey Kardashevskiy

In order to enable TCE operations support in KVM, we have to inform
KVM about VFIO groups being attached to specific LIOBNs. KVM
already knows about VFIO groups; the only bit missing is which
in-kernel TCE table (the one with user-visible TCEs) should update
the attached groups. There is a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
attribute of the VFIO KVM device which receives a groupfd/tablefd pair.

This adds get_attr()/set_attr() to IOMMUMemoryRegionClass, like
iommu_ops::domain_get_attr/domain_set_attr in the Linux kernel.

This implements get_attr() for the sPAPR IOMMU to return a TCE table fd
as an IOMMU_ATTR_KVM_FD attribute. This also now reads
the KVM_CAP_SPAPR_TCE_VFIO capability to prevent the TCE table from
being reallocated in userspace if KVM can accelerate TCE operations.

Finally, this notifies the VFIO KVM device about a new group being attached
to a LIOBN.

Signed-off-by: Alexey Kardashevskiy 
---

Assuming it is accepted, does it make sense to split
include/exec/memory.h out and get merged separately?

---
 include/exec/memory.h | 10 ++
 target/ppc/kvm_ppc.h  |  6 ++
 hw/ppc/spapr_iommu.c  | 19 +++
 hw/vfio/common.c  | 24 
 target/ppc/kvm.c  |  7 ++-
 hw/vfio/trace-events  |  1 +
 6 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5ed4042..6395c6f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -190,6 +190,10 @@ struct MemoryRegionOps {
 const MemoryRegionMmio old_mmio;
 };
 
+enum IOMMUMemoryRegionAttr {
+IOMMU_ATTR_KVM_FD
+};
+
 typedef struct IOMMUMemoryRegionClass {
 /* private */
 struct DeviceClass parent_class;
@@ -210,6 +214,12 @@ typedef struct IOMMUMemoryRegionClass {
 IOMMUNotifierFlag new_flags);
 /* Set this up to provide customized IOMMU replay function */
 void (*replay)(IOMMUMemoryRegion *iommu, IOMMUNotifier *notifier);
+
+/* Get/set IOMMU misc attributes */
+int (*get_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr,
+void *data);
+int (*set_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr,
+void *data);
 } IOMMUMemoryRegionClass;
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index d6be38e..2b985e1 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -48,6 +48,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t 
page_shift,
 int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
 int kvmppc_reset_htab(int shift_hint);
 uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
+bool kvmppc_has_cap_spapr_vfio(void);
 #endif /* !CONFIG_USER_ONLY */
 bool kvmppc_has_cap_epr(void);
 int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
@@ -231,6 +232,11 @@ static inline bool 
kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
 return true;
 }
 
+static inline bool kvmppc_has_cap_spapr_vfio(void)
+{
+return false;
+}
+
 #endif /* !CONFIG_USER_ONLY */
 
 static inline bool kvmppc_has_cap_epr(void)
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 5ccd785..ce8a769 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -17,6 +17,7 @@
  * License along with this library; if not, see .
  */
 #include "qemu/osdep.h"
+#include 
 #include "qemu/error-report.h"
 #include "hw/hw.h"
 #include "qemu/log.h"
@@ -160,6 +161,19 @@ static uint64_t 
spapr_tce_get_min_page_size(IOMMUMemoryRegion *iommu)
 return 1ULL << tcet->page_shift;
 }
 
+static int spapr_tce_get_attr(IOMMUMemoryRegion *iommu,
+  enum IOMMUMemoryRegionAttr attr, void *data)
+{
+sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
+
+if (attr == IOMMU_ATTR_KVM_FD && kvmppc_has_cap_spapr_vfio()) {
+*(int *) data = tcet->fd;
+return 0;
+}
+
+return -EINVAL;
+}
+
 static void spapr_tce_notify_flag_changed(IOMMUMemoryRegion *iommu,
   IOMMUNotifierFlag old,
   IOMMUNotifierFlag new)
@@ -284,6 +298,10 @@ void spapr_tce_set_need_vfio(sPAPRTCETable *tcet, bool 
need_vfio)
 
 tcet->need_vfio = need_vfio;
 
+if (!need_vfio || (tcet->fd != -1 && kvmppc_has_cap_spapr_vfio())) {
+return;
+}
+
 oldtable = tcet->table;
 
 tcet->table = spapr_tce_alloc_table(tcet->liobn,
@@ -643,6 +661,7 @@ static void 
spapr_iommu_memory_region_class_init(ObjectClass *klass, void *data)
 imrc->translate = spapr_tce_translate_iommu;
 imrc->get_min_page_size = spapr_tce_get_min_page_size;
 imrc->notify_flag_changed = spapr_tce_notify_flag_changed;
+imrc->get_attr = spapr_tce_get_attr;
 }
 
 static const TypeInfo spapr_iommu_memory_region_info = {
diff --git
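
The get_attr() hook added above can be sketched in isolation as follows. This is a reduced model under stated assumptions: FakeIommu, fake_get_attr() and query_kvm_fd() are hypothetical names, and the kvm_can_accel flag merely stands in for the capability check done in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute IDs, mirroring the enum added by the patch. */
enum IommuAttr {
    ATTR_KVM_FD,
};

/* Hypothetical, much reduced stand-in for an IOMMU region object. */
typedef struct FakeIommu {
    int fd;             /* in-kernel table fd, or -1 if none */
    bool kvm_can_accel; /* stands in for a capability check */
    int (*get_attr)(struct FakeIommu *iommu, enum IommuAttr attr, void *data);
} FakeIommu;

/* Store the fd in *data and return 0, or return -EINVAL when unsupported. */
int fake_get_attr(FakeIommu *iommu, enum IommuAttr attr, void *data)
{
    if (attr == ATTR_KVM_FD && iommu->kvm_can_accel) {
        *(int *)data = iommu->fd;
        return 0;
    }
    return -EINVAL;
}

/* Caller side: the hook is optional, so check it before calling. */
int query_kvm_fd(FakeIommu *iommu, int *fd)
{
    if (!iommu->get_attr) {
        return -EINVAL;
    }
    return iommu->get_attr(iommu, ATTR_KVM_FD, fd);
}
```

The void pointer keeps the hook signature fixed while new attributes are added, at the cost of type checking; each attribute has to document what its data argument points to.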

[Qemu-devel] [PATCH qemu v2] vfio-pci: Remove unused fields from VFIOMSIXInfo

2017-12-11 Thread Alexey Kardashevskiy

When support for multiple mappings per region was added, these fields
were left behind; finish the job and remove the unused bits.

Fixes: db0da029a185 "vfio: Generalize region support"
Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v2:
* updated commit log

---
 hw/vfio/pci.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 502a575..a8fb3b3 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -93,8 +93,6 @@ typedef struct VFIOMSIXInfo {
 uint16_t entries;
 uint32_t table_offset;
 uint32_t pba_offset;
-MemoryRegion mmap_mem;
-void *mmap;
 unsigned long *pending;
 } VFIOMSIXInfo;
 
-- 
2.11.0

[Qemu-devel] [PATCH qemu v2] vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1

2017-12-11 Thread Alexey Kardashevskiy

The vfio_iommu_spapr_tce driver advertises kernel support for both the
v1 and v2 IOMMU, however it is not always possible to use
the requested IOMMU type. For example, a pseries host platform does not
support dynamic DMA windows, so v2 cannot initialize and QEMU fails to
start.

This adds a fallback to the v1 IOMMU if v2 cannot be used.

Fixes: 318f67ce1371 "vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2)"
Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v2:
* updated commit log

---
 hw/vfio/common.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7b2924c..cd81cc9 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1040,6 +1040,11 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
 ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
 if (ret) {
+container->iommu_type = VFIO_SPAPR_TCE_IOMMU;
+v2 = false;
+ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
+}
+if (ret) {
 error_setg_errno(errp, errno, "failed to set iommu for container");
 ret = -errno;
 goto free_container_exit;
-- 
2.11.0
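
The fallback logic can be shown without a real VFIO container. The sketch below is an assumption-laden model: the IOMMU_V1/IOMMU_V2 values, select_iommu() and the host_* test doubles are hypothetical, and unlike the hunk above it only retries when v2 was the type that failed.

```c
#include <stdbool.h>

/* Hypothetical IOMMU type IDs for the sketch; not the kernel's values. */
enum { IOMMU_V1 = 1, IOMMU_V2 = 2 };

/* Hypothetical callback: returns 0 on success, non-zero on failure. */
typedef int (*set_iommu_fn)(int type);

/*
 * Try v2 first when want_v2 is set, then fall back to v1.
 * On success, returns 0 and stores the chosen type in *chosen.
 */
int select_iommu(bool want_v2, set_iommu_fn set_iommu, int *chosen)
{
    int type = want_v2 ? IOMMU_V2 : IOMMU_V1;
    int ret = set_iommu(type);

    if (ret && type == IOMMU_V2) {
        type = IOMMU_V1;
        ret = set_iommu(type);
    }
    if (!ret) {
        *chosen = type;
    }
    return ret;
}

/* Test doubles standing in for hosts with different capabilities. */
int host_v1_only(int type) { return type == IOMMU_V1 ? 0 : -1; }
int host_none(int type)    { (void)type; return -1; }
int host_both(int type)    { (void)type; return 0; }
```

The caller needs to know which type was finally selected, because later setup steps differ between v1 and v2; that is why the patch also resets v2 to false on fallback.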

Re: [Qemu-devel] [PATCH] display: check irq handler index before access

2017-12-11 Thread P J P

+-- On Mon, 11 Dec 2017, Peter Maydell wrote --+
| It would be more sensible to just mask off the top bits of
| 'level' before starting the loop, rather than checking every
| time around the loop:
|    level &= MAKE_64BIT_MASK(0, TC6393XB_GPIOS);

Sent a revised patch v1. Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [PATCH v1] display: limit irq handler index to TC6393XB_GPIOS

2017-12-11 Thread P J P

From: Prasad J Pandit 

The ctz32() routine could return a value greater than
TC6393XB_GPIOS=16. This could lead to an OOB array access.
Mask 'level' to avoid it.

Reported-by: Moguofang 
Signed-off-by: Prasad J Pandit 
---
 hw/display/tc6393xb.c | 1 +
 1 file changed, 1 insertion(+)

Update: mask 'level' value to TC6393XB_GPIOS=16
  -> https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg01685.html

diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
index 74d10af3d4..0ae63605f0 100644
--- a/hw/display/tc6393xb.c
+++ b/hw/display/tc6393xb.c
@@ -172,6 +172,7 @@ static void tc6393xb_gpio_handler_update(TC6393xbState *s)
 int bit;
 
 level = s->gpio_level & s->gpio_dir;
+level &= MAKE_64BIT_MASK(0, TC6393XB_GPIOS);
 
 for (diff = s->prev_level ^ level; diff; diff ^= 1 << bit) {
 bit = ctz32(diff);
-- 
2.13.6
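
For readers who want to see the effect of the mask outside of QEMU, here is a small standalone sketch. MASK64() and ctz32_sketch() are local stand-ins written for this example, GPIO_COUNT mirrors TC6393XB_GPIOS=16, and masking prev as well is a choice made for the sketch rather than something the patch does.

```c
#include <stdint.h>

#define GPIO_COUNT 16  /* mirrors TC6393XB_GPIOS=16 from the commit message */

/* Local helper: a mask of 'length' bits starting at bit 'shift'. */
#define MASK64(shift, length) (((~0ULL) >> (64 - (length))) << (shift))

/* Count trailing zeros; the caller must pass a non-zero value. */
int ctz32_sketch(uint32_t v)
{
    int n = 0;

    while ((v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * Walk the bits that differ between prev and level and return the
 * highest index visited, or -1 if nothing changed. With the mask
 * applied, the result is always below GPIO_COUNT.
 */
int highest_changed_gpio(uint32_t prev, uint32_t level)
{
    int highest = -1;
    uint32_t diff;
    int bit;

    level &= (uint32_t)MASK64(0, GPIO_COUNT);
    prev &= (uint32_t)MASK64(0, GPIO_COUNT);

    for (diff = prev ^ level; diff; diff ^= 1u << bit) {
        bit = ctz32_sketch(diff);
        if (bit > highest) {
            highest = bit;
        }
    }
    return highest;
}
```

Masking once before the loop means every index produced inside it is already in range, so no per-iteration bounds check is needed.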

[Qemu-devel] [PATCH v2] target/i386: add clflushopt to "Skylake-Server" cpu model

2017-12-11 Thread Haozhong Zhang

CPUID_7_0_EBX_CLFLUSHOPT is missing from the current "Skylake-Server" cpu
model. Add it to the "Skylake-Server" cpu model on pc-i440fx-2.11 and
pc-q35-2.11. Keep it disabled in the "Skylake-Server" cpu model on older
machine types.

Signed-off-by: Haozhong Zhang 
---
v1 can be found at 
  https://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg01659.html

I'm not sure whether this patch is too late for QEMU 2.11. If it is, I'll
rebase and resend it after the 2.12 window opens.
---
 include/hw/i386/pc.h | 5 +
 target/i386/cpu.c| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ef438bd765..085a688b26 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -383,6 +383,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 .driver   = "q35-pcihost",\
 .property = "x-pci-hole64-fix",\
 .value= "off",\
+},\
+{\
+.driver   = "Skylake-Server" "-" TYPE_X86_CPU,\
+.property = "clflushopt",\
+.value= "off",\
 },
 
 #define PC_COMPAT_2_9 \
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 045d66191f..7d033b7d30 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1376,7 +1376,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_CLWB |
 CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
 CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
-CPUID_7_0_EBX_AVX512VL,
+CPUID_7_0_EBX_AVX512VL | CPUID_7_0_EBX_CLFLUSHOPT,
 /* Missing: XSAVES (not supported by some Linux versions,
  * including v4.1 to v4.12).
  * KVM doesn't yet expose any XSAVES state save component,
-- 
2.14.1

Re: [Qemu-devel] [PATCH] x86/cpu: Enable new SSE/AVX/AVX512 cpu features

2017-12-11 Thread Yang Zhong

On Mon, Dec 11, 2017 at 05:17:15PM +0100, Paolo Bonzini wrote:
> On 22/11/2017 08:27, Yang Zhong wrote:
> > Intel IceLake CPUs add new CPU features: AVX512_VBMI2/GFNI/
> > VAES/VPCLMULQDQ/AVX512_VNNI/AVX512_BITALG. These new features
> > need to be exposed to the guest VM.
> > 
> > The bit definition:
> > CPUID.(EAX=7,ECX=0):ECX[bit 06] AVX512_VBMI2
> > CPUID.(EAX=7,ECX=0):ECX[bit 08] GFNI
> > CPUID.(EAX=7,ECX=0):ECX[bit 09] VAES
> > CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ
> > CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI
> > CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG
> > 
> > The release document is at the link below:
> > https://software.intel.com/sites/default/files/managed/c5/15/\
> > architecture-instruction-set-extensions-programming-reference.pdf
> > 
> > Signed-off-by: Yang Zhong 
> > ---
> >  target/i386/cpu.c | 6 +++---
> >  target/i386/cpu.h | 6 ++
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 045d661..a67ced2 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -437,9 +437,9 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
> > = {
> >  [FEAT_7_0_ECX] = {
> >  .feat_names = {
> >  NULL, "avx512vbmi", "umip", "pku",
> > -"ospke", NULL, NULL, NULL,
> > -NULL, NULL, NULL, NULL,
> > -NULL, NULL, "avx512-vpopcntdq", NULL,
> > +"ospke", NULL, "avx512vbmi2", NULL,
> > +"gfni", "vaes", "vpclmulqdq", "avx512vnni",
> > +"avx512bitalg", NULL, "avx512-vpopcntdq", NULL,
> >  "la57", NULL, NULL, NULL,
> >  NULL, NULL, "rdpid", NULL,
> >  NULL, NULL, NULL, NULL,
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index b086b15..cdbf8b0 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -635,6 +635,12 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
> >  #define CPUID_7_0_ECX_UMIP (1U << 2)
> >  #define CPUID_7_0_ECX_PKU  (1U << 3)
> >  #define CPUID_7_0_ECX_OSPKE (1U << 4)
> > +#define CPUID_7_0_ECX_VBMI2 (1U << 6) /* Additional VBMI Instrs */
> > +#define CPUID_7_0_ECX_GFNI (1U << 8)
> > +#define CPUID_7_0_ECX_VAES (1U << 9)
> > +#define CPUID_7_0_ECX_VPCLMULQDQ (1U << 10)
> > +#define CPUID_7_0_ECX_AVX512VNNI (1U << 11)
> > +#define CPUID_7_0_ECX_AVX512BITALG (1U << 12)
> >  #define CPUID_7_0_ECX_AVX512_VPOPCNTDQ (1U << 14) /* POPCNT for vectors of 
> > DW/QW */
> >  #define CPUID_7_0_ECX_LA57 (1U << 16)
> >  #define CPUID_7_0_ECX_RDPID (1U << 22)
> > 
> 
> Queued, thanks.
> 
  Thanks Paolo!

  Regards,

  Yang
> Paolo
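
For reference, a minimal sketch of how the bit positions listed in the commit message could be tested against a raw CPUID.(EAX=7,ECX=0):ECX value. The `demo_` names are hypothetical and are not part of the patch; the bit numbers are the ones quoted above.

```c
#include <stdint.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):ECX, as listed in the commit message.
 * The enum and function names are hypothetical, for illustration only. */
enum demo_ecx_feature {
    DEMO_AVX512_VBMI2  = 6,
    DEMO_GFNI          = 8,
    DEMO_VAES          = 9,
    DEMO_VPCLMULQDQ    = 10,
    DEMO_AVX512_VNNI   = 11,
    DEMO_AVX512_BITALG = 12,
};

/* Return 1 if the given feature bit is set in an ECX value, else 0. */
int demo_ecx_has(uint32_t ecx, enum demo_ecx_feature bit)
{
    return (int)((ecx >> bit) & 1u);
}
```

A caller would obtain the ECX value from the CPUID instruction inside the guest and pass it to `demo_ecx_has()`.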

Re: [Qemu-devel] [PATCH for-2.11?] target/arm: Generate UNDEF for 32-bit Thumb2 insns

2017-12-11 Thread Peter Xu

On Mon, Dec 11, 2017 at 08:55:56PM +0000, Peter Maydell wrote:
> On 11 December 2017 at 19:42, Emilio G. Cota  wrote:
> > On Mon, Dec 11, 2017 at 17:32:48 +0000, Peter Maydell wrote:
> >> Thanks. I think I have come down on the side of putting this into
> >> 2.11, so rolling an rc5 today, and delaying the final release
> >> a day to Wednesday.
> >
> > Glad to see it's in -rc5 -- thanks for fixing this so quickly!
> >
> > Again, apologies for not having caught this earlier ;-(
> 
> It's my own fault really -- my extremely ad-hoc approach
> to testing for Arm guests was bound to come back and
> bite me sooner or later.

Should we include the vfio fix in rc5 too?

  http://patchwork.ozlabs.org/patch/844940/

I see that the tag is there already, not sure whether it means it
missed the chance again...  Thanks,

-- 
Peter Xu

[Qemu-devel] [PATCH v1 1/2] i386: Add Intel Processor Trace feature support

2017-12-11 Thread Luwei Kang

From: Chao Peng 

Expose Intel Processor Trace feature to guest.

Signed-off-by: Chao Peng 
Signed-off-by: Luwei Kang 
---
 target/i386/cpu.c | 19 ++-
 target/i386/cpu.h |  1 +
 target/i386/kvm.c | 23 +++
 3 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 045d661..1d34a6f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -426,7 +426,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 NULL, NULL, "mpx", NULL,
 "avx512f", "avx512dq", "rdseed", "adx",
 "smap", "avx512ifma", "pcommit", "clflushopt",
-"clwb", NULL, "avx512pf", "avx512er",
+"clwb", "intel-pt", "avx512pf", "avx512er",
 "avx512cd", "sha-ni", "avx512bw", "avx512vl",
 },
 .cpuid_eax = 7,
@@ -2973,6 +2973,23 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 }
 break;
 }
+case 0x14: {
+if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
+ kvm_enabled()) {
+KVMState *s = cs->kvm_state;
+
+*eax = kvm_arch_get_supported_cpuid(s, 0x14, count, R_EAX);
+*ebx = kvm_arch_get_supported_cpuid(s, 0x14, count, R_EBX);
+*ecx = kvm_arch_get_supported_cpuid(s, 0x14, count, R_ECX);
+*edx = kvm_arch_get_supported_cpuid(s, 0x14, count, R_EDX);
+} else {
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+}
+break;
+}
 case 0x40000000:
 /*
  * CPUID code in kvm_arch_init_vcpu() ignores stuff
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index b086b15..4bdb7c6 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -624,6 +624,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EBX_PCOMMIT  (1U << 22) /* Persistent Commit */
 #define CPUID_7_0_EBX_CLFLUSHOPT (1U << 23) /* Flush a Cache Line Optimized */
 #define CPUID_7_0_EBX_CLWB (1U << 24) /* Cache Line Write Back */
+#define CPUID_7_0_EBX_INTEL_PT (1U << 25) /* Intel Processor Trace */
 #define CPUID_7_0_EBX_AVX512PF (1U << 26) /* AVX-512 Prefetch */
 #define CPUID_7_0_EBX_AVX512ER (1U << 27) /* AVX-512 Exponential and 
Reciprocal */
 #define CPUID_7_0_EBX_AVX512CD (1U << 28) /* AVX-512 Conflict Detection */
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b1e32e9..31d20c8 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -869,6 +869,29 @@ int kvm_arch_init_vcpu(CPUState *cs)
 c = &cpuid_data.entries[cpuid_i++];
 }
 break;
+case 0x14: {
+uint32_t times;
+
+c->function = i;
+c->index = 0;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+times = c->eax;
+
+for (j = 1; j <= times; ++j) {
+if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+fprintf(stderr, "cpuid_data is full, no space for "
+"cpuid(eax:0x14,ecx:0x%x)\n", j);
+abort();
+}
+c = &cpuid_data.entries[cpuid_i++];
+c->function = i;
+c->index = j;
+c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+}
+break;
+}
 default:
 c->function = i;
 c->flags = 0;
-- 
1.8.3.1

[Qemu-devel] [PATCH v1 2/2] i386: Add support to get/set/migrate Intel Processor Trace feature

2017-12-11 Thread Luwei Kang

From: Chao Peng 

Add Intel Processor Trace related definitions. Also add the
corresponding parts to kvm_get/set_msr and vmstate.

Signed-off-by: Chao Peng 
Signed-off-by: Luwei Kang 
---
 target/i386/cpu.h | 22 ++
 target/i386/kvm.c | 51 +++
 target/i386/machine.c | 37 +
 3 files changed, 110 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 4bdb7c6..0a9b8da 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -400,6 +400,21 @@
 #define MSR_MC0_ADDR                    0x402
 #define MSR_MC0_MISC                    0x403
 
+#define MSR_IA32_RTIT_OUTPUT_BASE   0x560
+#define MSR_IA32_RTIT_OUTPUT_MASK   0x561
+#define MSR_IA32_RTIT_CTL           0x570
+#define MSR_IA32_RTIT_STATUS        0x571
+#define MSR_IA32_RTIT_CR3_MATCH     0x572
+#define MSR_IA32_RTIT_ADDR0_A       0x580
+#define MSR_IA32_RTIT_ADDR0_B       0x581
+#define MSR_IA32_RTIT_ADDR1_A       0x582
+#define MSR_IA32_RTIT_ADDR1_B       0x583
+#define MSR_IA32_RTIT_ADDR2_A       0x584
+#define MSR_IA32_RTIT_ADDR2_B       0x585
+#define MSR_IA32_RTIT_ADDR3_A       0x586
+#define MSR_IA32_RTIT_ADDR3_B       0x587
+#define MAX_RTIT_ADDRS              8
+
 #define MSR_EFER                        0xc0000080
 
 #define MSR_EFER_SCE   (1 << 0)
@@ -1106,6 +1121,13 @@ typedef struct CPUX86State {
 uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
 uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
 
+uint64_t msr_rtit_ctrl;
+uint64_t msr_rtit_status;
+uint64_t msr_rtit_output_base;
+uint64_t msr_rtit_output_mask;
+uint64_t msr_rtit_cr3_match;
+uint64_t msr_rtit_addrs[MAX_RTIT_ADDRS];
+
 /* exception/interrupt handling */
 int error_code;
 int exception_is_int;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 31d20c8..655f860 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1783,6 +1783,25 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 kvm_msr_entry_add(cpu, MSR_MTRRphysMask(i), mask);
 }
 }
+if (env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) {
+int addr_num = kvm_arch_get_supported_cpuid(kvm_state,
+0x14, 1, R_EAX) & 0x7;
+
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_CTL,
+env->msr_rtit_ctrl);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_STATUS,
+env->msr_rtit_status);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_OUTPUT_BASE,
+env->msr_rtit_output_base);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_OUTPUT_MASK,
+env->msr_rtit_output_mask);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_CR3_MATCH,
+env->msr_rtit_cr3_match);
+for (i = 0; i < addr_num; i++) {
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_ADDR0_A + i,
+env->msr_rtit_addrs[i]);
+}
+}
 
 /* Note: MSR_IA32_FEATURE_CONTROL is written separately, see
  *   kvm_put_msr_feature_control. */
@@ -2130,6 +2149,20 @@ static int kvm_get_msrs(X86CPU *cpu)
 }
 }
 
+if (env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) {
+int addr_num =
+kvm_arch_get_supported_cpuid(kvm_state, 0x14, 1, R_EAX) & 0x7;
+
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_CTL, 0);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_STATUS, 0);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_OUTPUT_BASE, 0);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_OUTPUT_MASK, 0);
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_CR3_MATCH, 0);
+for (i = 0; i < addr_num; i++) {
+kvm_msr_entry_add(cpu, MSR_IA32_RTIT_ADDR0_A + i, 0);
+}
+}
+
 ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_MSRS, cpu->kvm_msr_buf);
 if (ret < 0) {
 return ret;
@@ -2370,6 +2403,24 @@ static int kvm_get_msrs(X86CPU *cpu)
 env->mtrr_var[MSR_MTRRphysIndex(index)].base = msrs[i].data;
 }
 break;
+case MSR_IA32_RTIT_CTL:
+env->msr_rtit_ctrl = msrs[i].data;
+break;
+case MSR_IA32_RTIT_STATUS:
+env->msr_rtit_status = msrs[i].data;
+break;
+case MSR_IA32_RTIT_OUTPUT_BASE:
+env->msr_rtit_output_base = msrs[i].data;
+break;
+case MSR_IA32_RTIT_OUTPUT_MASK:
+env->msr_rtit_output_mask = msrs[i].data;
+break;
+case MSR_IA32_RTIT_CR3_MATCH:
+env->msr_rtit_cr3_match = msrs[i].data;
+break;
+case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
+env->msr_rtit_addrs[index -

Re: [Qemu-devel] [PATCH v3 3/3] msi: Handle remappable format interrupt request

2017-12-11 Thread Chao Gao

On Mon, Dec 11, 2017 at 06:07:48PM +0000, Anthony PERARD wrote:
>On Fri, Nov 17, 2017 at 02:24:25PM +0800, Chao Gao wrote:
>> According to VT-d spec Interrupt Remapping and Interrupt Posting ->
>> Interrupt Remapping -> Interrupt Request Formats On Intel 64
>> Platforms, fields of MSI data register have changed. This patch
>> avoids wrongly regarding a remappable format interrupt request as
>> an interrupt binded with a pirq.
>> 
>> Signed-off-by: Chao Gao 
>> Signed-off-by: Lan Tianyu 
>> ---
>> v3:
>>  - clarify the interrupt format bit is Intel-specific, then it is
>>  improper to define MSI_ADDR_IF_MASK in a common header.
>> ---
>>  hw/i386/xen/xen-hvm.c | 10 +-
>>  hw/pci/msi.c  |  5 +++--
>>  hw/pci/msix.c |  4 +++-
>>  hw/xen/xen_pt_msi.c   |  2 +-
>>  include/hw/xen/xen.h  |  2 +-
>>  stubs/xen-hvm.c   |  2 +-
>>  6 files changed, 18 insertions(+), 7 deletions(-)
>> 
>> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
>> index 8028bed..52dc8af 100644
>> --- a/hw/i386/xen/xen-hvm.c
>> +++ b/hw/i386/xen/xen-hvm.c
>> @@ -145,8 +145,16 @@ void xen_piix_pci_write_config_client(uint32_t address, 
>> uint32_t val, int len)
>>  }
>>  }
>>  
>> -int xen_is_pirq_msi(uint32_t msi_data)
>> +int xen_is_pirq_msi(uint32_t msi_addr_lo, uint32_t msi_data)
>>  {
>> +/* If the MSI address is configured in remapping format, the MSI will 
>> not
>> + * be remapped into a pirq. This 'if' test excludes Intel-specific
>> + * remappable msi.
>> + */
>> +#define MSI_ADDR_IF_MASK 0x00000010
>
>I don't think that is the right place for a define, they also exist
>outside of the context of the function.

yes.

>That define would be better at the top of this file, I think.(There is

will do.

Thanks
Chao

>probably a better place in the common headers, but I'm not sure were.)

Re: [Qemu-devel] [PATCH v6 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2017-12-11 Thread Liu, Changpeng



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 10:33 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> 
> Subject: Re: [PATCH v6 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk
> sample application
> 
> > +static int vub_virtio_process_req(VubDev *vdev_blk,
> > + VuVirtq *vq)
> > +{
> > +VugDev *gdev = &vdev_blk->parent;
> > +VuDev *vu_dev = &gdev->parent;
> > +VuVirtqElement *elem;
> > +uint32_t type;
> > +unsigned in_num;
> > +unsigned out_num;
> > +VubReq *req;
> > +
> > +elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement));
> > +if (!elem) {
> > +return -1;
> > +}
> > +
> > +/* refer to hw/block/virtio_blk.c */
> > +if (elem->out_num < 1 || elem->in_num < 1) {
> > +fprintf(stderr, "virtio-blk request missing headers\n");
> > +free(elem);
> > +return -1;
> > +}
> > +
> > +req = g_new0(VubReq, 1);
> > +req->vdev_blk = vdev_blk;
> > +req->vq = vq;
> > +req->elem = elem;
> > +
> > +in_num = elem->in_num;
> > +out_num = elem->out_num;
> > +
> > +/* don't support VIRTIO_F_ANY_LAYOUT and virtio 1.0 only */
> > +if (elem->out_sg[0].iov_len < sizeof(struct virtio_blk_outhdr)) {
> > +fprintf(stderr, "Invalid outhdr size\n");
> > +goto err;
> > +}
> 
> QEMU has iov_discard_front() and iov_discard_back().  They make it
> pretty easy to support VIRTIO_F_ANY_LAYOUT.  If you have time, please
> consider adding it to libvhost-user, but it's not a requirement for this
> patch series.
> 
> > +req->out = (struct virtio_blk_outhdr *)elem->out_sg[0].iov_base;
> > +out_num--;
> > +
> > +if (elem->in_sg[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) 
> > {
> > +fprintf(stderr, "Invalid inhdr size\n");
> > +goto err;
> > +}
> > +req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
> > +in_num--;
> > +
> > +type = le32toh(req->out->type);
> 
> Endianness is more complicated if you want to support both VIRTIO 1.0
> and Legacy (0.9.7).  I guess it's okay to make libvhost-user code only
> support VIRTIO 1.0.
> > +static void vub_queue_set_started(VuDev *vu_dev, int idx, bool started)
> > +{
> > +VuVirtq *vq;
> > +
> > +assert(vu_dev);
> > +
> > +if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
> > +fprintf(stderr, "VQ Index out of range: %d\n", idx);
> > +vub_panic_cb(vu_dev, NULL);
> > +return;
> > +}
> 
> Is it necessary to check num_queues?  The vhost-user master should not
> be able to enable idx >= num_queues.
Yes, this part of code can be removed.

Re: [Qemu-devel] [PATCH v13 00/12] Add ARMv8 RAS virtualization support in QEMU

2017-12-11 Thread gengdongjiu


On 2017/12/11 21:32, Igor Mammedov wrote:
>> Hi maintainer,
>>
>>   This patch set seems pending about one month, could you help review for 
>> them?  Thanks.
> I'm going to look at ACPI side of it this week.

Igor, thank you very much in advance.

> 
>

Re: [Qemu-devel] [PATCH v3 2/3] xen/pt: Pass the whole msi addr/data to Xen

2017-12-11 Thread Chao Gao

On Mon, Dec 11, 2017 at 05:59:08PM +0000, Anthony PERARD wrote:
>On Fri, Nov 17, 2017 at 02:24:24PM +0800, Chao Gao wrote:
>> Previously, some fields (reserved or unalterable) are filtered by
>> Qemu. This fields are useless for the legacy interrupt format.
>> However, these fields are may meaningful (for intel platform)
>> for the interrupt of remapping format. It is better to pass the whole
>> msi addr/data to Xen without any filtering.
>> 
>> The main reason why we want this is QEMU doesn't have the knowledge
>> to decide the interrupt format after we introduce vIOMMU inside Xen.
>> Passing the whole msi message down and let arch-specific vIOMMU to
>> decide the interrupt format.
>> 
>> Signed-off-by: Chao Gao 
>> Signed-off-by: Lan Tianyu 
>> ---
>> v3:
>>  - new
>> ---
>>  hw/xen/xen_pt_msi.c | 47 ---
>>  1 file changed, 12 insertions(+), 35 deletions(-)
>> 
>> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
>> index 6d1e3bd..f7d6e76 100644
>> --- a/hw/xen/xen_pt_msi.c
>> +++ b/hw/xen/xen_pt_msi.c
>> @@ -47,25 +47,6 @@ static inline uint32_t msi_ext_dest_id(uint32_t addr_hi)
>>  return addr_hi & 0xffffff00;
>>  }
>>  
>> -static uint32_t msi_gflags(uint32_t data, uint64_t addr)
>> -{
>> -uint32_t result = 0;
>> -int rh, dm, dest_id, deliv_mode, trig_mode;
>> -
>> -rh = (addr >> MSI_ADDR_REDIRECTION_SHIFT) & 0x1;
>> -dm = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
>> -dest_id = msi_dest_id(addr);
>> -deliv_mode = (data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x7;
>> -trig_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
>> -
>> -result = dest_id | (rh << XEN_PT_GFLAGS_SHIFT_RH)
>> -| (dm << XEN_PT_GFLAGS_SHIFT_DM)
>> -| (deliv_mode << XEN_PT_GFLAGSSHIFT_DELIV_MODE)
>> -| (trig_mode << XEN_PT_GFLAGSSHIFT_TRG_MODE);
>> -
>> -return result;
>> -}
>> -
>>  static inline uint64_t msi_addr64(XenPTMSI *msi)
>>  {
>>  return (uint64_t)msi->addr_hi << 32 | msi->addr_lo;
>> @@ -160,23 +141,20 @@ static int msi_msix_update(XenPCIPassthroughState *s,
>> bool masked)
>>  {
>>  PCIDevice *d = &s->dev;
>> -uint8_t gvec = msi_vector(data);
>> -uint32_t gflags = msi_gflags(data, addr);
>> +uint32_t gflags = masked ? 0 : (1u << XEN_PT_GFLAGSSHIFT_UNMASKED);
>>  int rc = 0;
>>  uint64_t table_addr = 0;
>>  
>> -XEN_PT_LOG(d, "Updating MSI%s with pirq %d gvec %#x gflags %#x"
>> -   " (entry: %#x)\n",
>> -   is_msix ? "-X" : "", pirq, gvec, gflags, msix_entry);
>> +XEN_PT_LOG(d, "Updating MSI%s with pirq %d gvec %#x addr %"PRIx64
>> +   " data %#x gflags %#x (entry: %#x)\n",
>> +   is_msix ? "-X" : "", pirq, addr, data, gflags, msix_entry);
>>  
>>  if (is_msix) {
>>  table_addr = s->msix->mmio_base_addr;
>>  }
>>  
>> -gflags |= masked ? 0 : (1u << XEN_PT_GFLAGSSHIFT_UNMASKED);
>> -
>> -rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec,
>> -  pirq, gflags, table_addr);
>> +rc = xc_domain_update_msi_irq(xen_xc, xen_domid, pirq, addr,
>> +  data, gflags, table_addr);
>
>Are you trying to modifie an existing API? That is not going to work. We
>want to be able to build QEMU against older version of Xen, and it
>should work as well.

Yes. I thought it didn't matter, and I was definitely wrong. I will keep
compatibility by introducing a new API. A wrapper function, which calls
the old or the new API depending on the Xen version, will be used here.
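
A rough sketch of what such a wrapper could look like, assuming a compile-time version check. Everything named `demo_` or `DEMO_` here is hypothetical (including the version numbers and both mock backends); the real libxc calls and the actual version macro will differ.

```c
#include <stdint.h>

/* Hypothetical version knob; a real build would derive it from configure. */
#ifndef DEMO_XEN_VERSION
#define DEMO_XEN_VERSION 41000
#endif

/* Records what the mock backends received, for illustration only. */
struct demo_call {
    int used_new_api;
    uint8_t gvec;
    uint64_t addr;
    uint32_t data;
};
struct demo_call demo_last;

/* Mock of the existing interface: the caller pre-decodes the vector. */
int demo_update_msi_old(int pirq, uint8_t gvec, uint32_t gflags)
{
    (void)pirq;
    (void)gflags;
    demo_last.used_new_api = 0;
    demo_last.gvec = gvec;
    return 0;
}

/* Mock of the proposed interface: raw address and data are passed through. */
int demo_update_msi_new(int pirq, uint64_t addr, uint32_t data,
                        uint32_t gflags)
{
    (void)pirq;
    (void)gflags;
    demo_last.used_new_api = 1;
    demo_last.addr = addr;
    demo_last.data = data;
    return 0;
}

/* Wrapper: callers always pass addr/data; the version check picks the path. */
int demo_update_msi(int pirq, uint64_t addr, uint32_t data, uint32_t gflags)
{
#if DEMO_XEN_VERSION >= 41100
    return demo_update_msi_new(pirq, addr, data, gflags);
#else
    /* Legacy format: the vector is the low 8 bits of the MSI data. */
    (void)addr;
    return demo_update_msi_old(pirq, (uint8_t)(data & 0xff), gflags);
#endif
}
```

With this shape, msi_msix_update() would call only the wrapper, so QEMU still builds against older Xen headers.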

Thanks
Chao

Re: [Qemu-devel] [PATCH v6 3/4] contrib/libvhost-user: enable virtio config space messages

2017-12-11 Thread Liu, Changpeng



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 10:00 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> 
> Subject: Re: [PATCH v6 3/4] contrib/libvhost-user: enable virtio config space
> messages
> 
> On Tue, Dec 05, 2017 at 02:27:18PM +0800, Changpeng Liu wrote:
> > @@ -798,6 +801,70 @@ vu_set_slave_req_fd(VuDev *dev, VhostUserMsg
> *vmsg)
> >  }
> >
> >  static bool
> > +vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
> > +{
> > +int ret = -1;
> > +
> > +if (dev->iface->get_config) {
> > +ret = dev->iface->get_config(dev, vmsg->payload.config.region,
> > + vmsg->payload.config.size);
> > +}
> > +
> > +if (ret) {
> > +/* resize to zero to indicate an error to master */
> > +vmsg->size = 0;
> > +}
> 
> Please document this error case in vhost-user.txt.  I don't remember
> reading about it.
Thanks, will add it to vhost-user.txt.

Re: [Qemu-devel] [PATCH v6 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-12-11 Thread Liu, Changpeng



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Monday, December 11, 2017 9:39 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; m...@redhat.com;
> marcandre.lur...@redhat.com; fel...@nutanix.com; Harris, James R
> 
> Subject: Re: [PATCH v6 1/4] vhost-user: add new vhost user messages to support
> virtio config space
> 
> On Tue, Dec 05, 2017 at 02:27:16PM +0800, Changpeng Liu wrote:
> > +* VHOST_USER_SET_CONFIG
> > +  Id: 25
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master when the Guest changes the virtio
> > +  device configuration space and also can be used for live migration
> > +  on the destination host. The vhost-user slave must check the flags
> > +  filed, and slaves MUST NOT accept SET_CONFIG for read-only
> 
> s/filed/field/
Thanks.
> 
> > +static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t 
> > *config,
> > + uint32_t offset, uint32_t size, uint32_t 
> > flags)
> > +{
> > +uint8_t *p;
> > +bool reply_supported = virtio_has_feature(dev->protocol_features,
> > +  
> > VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > +
> > +VhostUserMsg msg = {
> > +msg.request = VHOST_USER_SET_CONFIG,
> > +msg.flags = VHOST_USER_VERSION,
> > +msg.size = VHOST_USER_CONFIG_HDR_SIZE + size,
> > +};
> > +
> > +if (reply_supported) {
> > +msg.flags |= VHOST_USER_NEED_REPLY_MASK;
> > +}
> > +
> > +msg.payload.config.offset = offset,
> > +msg.payload.config.size = size,
> > +msg.payload.config.flags = flags,
> > +p = msg.payload.config.region;
> > +memcpy(p, config + offset, size);
> 
> This function can be made more general by changing the semantics of the
> config argument:
> 
>   memcpy(p, config, size);
> 
> Now the caller can pass just a single field instead of a whole 256-byte
> config buffer.  It might be clearer to name the argument "data" or
> "region" instead of "config" though.
Good suggestion, will change it.
> 
> > @@ -1505,6 +1508,67 @@ void vhost_ack_features(struct vhost_dev *hdev,
> const int *feature_bits,
> >  }
> >  }
> >
> > +int vhost_dev_get_config(struct vhost_dev *hdev, uint8_t *config,
> > + uint32_t config_len)
> > +{
> > +assert(hdev->vhost_ops);
> > +
> > +if (hdev->vhost_ops->vhost_get_config) {
> > +return hdev->vhost_ops->vhost_get_config(hdev, config, config_len);
> > +}
> > +
> > +return 0;
> > +}
> > +
> > +int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *config,
> > + uint32_t offset, uint32_t size, uint32_t flags)
> > +{
> > +assert(hdev->vhost_ops);
> > +
> > +if (hdev->vhost_ops->vhost_set_config) {
> > +return hdev->vhost_ops->vhost_set_config(hdev, config, offset,
> > + size, flags);
> > +}
> > +
> > +return 0;
> > +}
> 
> Both vhost_dev_get_config() and vhost_dev_set_config() cannot fail
> silently.  The device will not work properly if the configuration space
> feature is not supported.
> 
> Please make these functions return an error if the callback is NULL.
Ok, will add callback check.
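
A minimal sketch of the callback check, using simplified stand-in types. The `demo_` structs and functions are hypothetical and only mirror the shape of vhost_dev and its ops table; the error value of -1 is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the vhost device and ops table. */
struct demo_vhost_dev;

struct demo_vhost_ops {
    int (*get_config)(struct demo_vhost_dev *dev, uint8_t *config,
                      uint32_t len);
};

struct demo_vhost_dev {
    const struct demo_vhost_ops *ops;
};

/* Example backend callback: fills the buffer with a marker byte. */
int demo_fill_config(struct demo_vhost_dev *dev, uint8_t *config, uint32_t len)
{
    (void)dev;
    memset(config, 0xab, len);
    return 0;
}

/* Fail loudly instead of returning 0 when the backend lacks the callback. */
int demo_dev_get_config(struct demo_vhost_dev *dev, uint8_t *config,
                        uint32_t len)
{
    if (dev->ops && dev->ops->get_config) {
        return dev->ops->get_config(dev, config, len);
    }
    return -1;
}
```

The set_config path would follow the same pattern.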
> 
> The vhost-blk code should also check that
> hdev->vhost_ops->vhost_set_config != NULL during realize.  This way
> users see the error when adding the device instead of at runtime when
> the function gets called.
For vhost-blk, get_config is mandatory, but set_config should depend on
whether the VIRTIO_BLK_F_CONFIG_WCE feature bit is enabled. Migration is,
of course, another case. So handling the error at run time should be
okay.

[Qemu-devel] [PATCH v4] qemu-img: Document --force-share / -U

2017-12-11 Thread Fam Zheng

Signed-off-by: Fam Zheng 

---

v4: "images". [Kevin]

v3: Document that the option is not allowed for read-write. [Stefan]

v2: - "code{qemu-img}". [Kashyap, Eric]
- "etc.." -> "etc.".
---
 qemu-img.texi | 9 +
 1 file changed, 9 insertions(+)

diff --git a/qemu-img.texi b/qemu-img.texi
index fdcf120f36..d93501f94f 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -57,6 +57,15 @@ exclusive with the @var{-O} parameters. It is currently 
required to also use
 the @var{-n} parameter to skip image creation. This restriction may be relaxed
 in a future release.
 
+@item --force-share (-U)
+
+If specified, @code{qemu-img} will open the image with shared permissions,
+which makes it less likely to conflict with a running guest's permissions due
+to image locking. For example, this can be used to get the image information
+(with 'info' subcommand) when the image is used by a running guest. Note that
+this could produce inconsistent results because of concurrent metadata changes,
+etc. This option is only allowed when opening images in read-only mode.
+
 @item fmt
 is the disk image format. It is guessed automatically in most cases. See below
 for a description of the supported disk formats.
-- 
2.14.3

Re: [Qemu-devel] [PATCH] blockjob: kick jobs on set-speed

2017-12-11 Thread Jeff Cody

On Mon, Dec 11, 2017 at 06:46:09PM -0500, John Snow wrote:
> If users set an unreasonably low speed (like one byte per second), the
> calculated delay may exceed many hours. While we like to punish users
> for asking for stupid things, we do also like to allow users to correct
> their wicked ways.
> 
> When a user provides a new speed, kick the job to allow it to recalculate
> its delay.
> 
> Signed-off-by: John Snow 
> ---
>  blockjob.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 715c2c2680..43f01ad190 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -483,6 +483,7 @@ static void block_job_completed_txn_success(BlockJob *job)
>  void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
>  {
>  Error *local_err = NULL;
> +int64_t old_speed = job->speed;
>  
>  if (!job->driver->set_speed) {
>  error_setg(errp, QERR_UNSUPPORTED);
> @@ -495,6 +496,10 @@ void block_job_set_speed(BlockJob *job, int64_t speed, 
> Error **errp)
>  }
>  
>  job->speed = speed;
> +/* Kick the job to recompute its delay */
> +if ((speed > old_speed) && timer_pending(&job->sleep_timer)) {

job->sleep_timer is protected by block_job_mutex (via
block_job_lock/unlock); is it safe for us to check it here outside the
mutex?

But in any case, I think we could get rid of the timer_pending check, and
just always kick the job if we have a speed increase.  block_job_enter()
should do the right thing (mutex protected check on job->busy and
job->sleep_timer).
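
Roughly what I have in mind, as a self-contained model rather than a patch against blockjob.c. The `demo_` types and functions are made up for illustration; `demo_job_enter()` stands in for block_job_enter(), which would do its own locked checks.

```c
#include <stdint.h>

/* Hypothetical, stripped-down model of a block job. */
struct demo_job {
    int64_t speed;
    int kicks;      /* counts how often the job was kicked */
};

/* Stand-in for block_job_enter(); the real one checks busy/timer state
 * under the mutex before actually re-entering the job. */
void demo_job_enter(struct demo_job *job)
{
    job->kicks++;
}

/* Always kick on a speed increase; no unlocked timer_pending() check. */
void demo_job_set_speed(struct demo_job *job, int64_t speed)
{
    int64_t old_speed = job->speed;

    job->speed = speed;
    if (speed > old_speed) {
        demo_job_enter(job);
    }
}
```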

> +block_job_enter(job);
> +}
>  }
>  
>  void block_job_complete(BlockJob *job, Error **errp)
> -- 
> 2.14.3
>

Re: [Qemu-devel] [PATCH v3 1/8] memory: address_space_iterate

2017-12-11 Thread Paolo Bonzini

On 11/12/2017 20:46, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Iterate through an address space calling a function for each
> section.  The iteration is done in order.
> 
> Signed-off-by: Dr. David Alan Gilbert 

It seems to me that you can achieve the same effect by implementing the
region_add and region_nop callbacks, and leaving out region_del.  Am I
missing something?

Thanks,

Paolo

> ---
>  include/exec/memory.h | 23 +++
>  memory.c  | 22 ++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 5ed4042f87..f5a9df642e 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -1987,6 +1987,29 @@ address_space_write_cached(MemoryRegionCache *cache, 
> hwaddr addr,
>  address_space_write(cache->as, cache->xlat + addr, 
> MEMTXATTRS_UNSPECIFIED, buf, len);
>  }
>  
> +/**
> + * ASIterateCallback: Function type called by address_space_iterate
> + *
> + * Return 0 on success or a negative error code.
> + *
> + * @mrs: Memory region section for this range
> + * @opaque: The opaque value passed in to the iterator.
> + */
> +typedef int (*ASIterateCallback)(MemoryRegionSection *mrs, void *opaque);
> +
> +/**
> + * address_space_iterate: Call the function for each address range in the
> + *AddressSpace, in sorted order.
> + *
> + * Return 0 on success or a negative error code.
> + *
> + * @as: Address space to iterate over
> + * @cb: Function to call.  If the function returns none-0 the iteration will
> + * stop.
> + * @opaque: Value to pass to the function
> + */
> +int
> +address_space_iterate(AddressSpace *as, ASIterateCallback cb, void *opaque);
>  #endif
>  
>  #endif
> diff --git a/memory.c b/memory.c
> index e26e5a3b1d..f45137f25e 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -2810,6 +2810,28 @@ void address_space_destroy(AddressSpace *as)
>  call_rcu(as, do_address_space_destroy, rcu);
>  }
>  
> +int address_space_iterate(AddressSpace *as, ASIterateCallback cb,
> +  void *opaque)
> +{
> +int res = 0;
> +FlatView *fv = address_space_to_flatview(as);
> +FlatRange *range;
> +
> +flatview_ref(fv);
> +
> +FOR_EACH_FLAT_RANGE(range, fv) {
> +MemoryRegionSection mrs = section_from_flat_range(range, fv);
> +res = cb(&mrs, opaque);
> +if (res) {
> +break;
> +}
> +}
> +
> +flatview_unref(fv);
> +
> +return res;
> +}
> +
>  static const char *memory_region_type(MemoryRegion *mr)
>  {
>  if (memory_region_is_ram_device(mr)) {
>

[Qemu-devel] [PATCH] blockjob: kick jobs on set-speed

2017-12-11 Thread John Snow

If users set an unreasonably low speed (like one byte per second), the
calculated delay may exceed many hours. While we like to punish users
for asking for stupid things, we do also like to allow users to correct
their wicked ways.

When a user provides a new speed, kick the job to allow it to recalculate
its delay.

Signed-off-by: John Snow 
---
 blockjob.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/blockjob.c b/blockjob.c
index 715c2c2680..43f01ad190 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -483,6 +483,7 @@ static void block_job_completed_txn_success(BlockJob *job)
 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
 {
 Error *local_err = NULL;
+int64_t old_speed = job->speed;
 
 if (!job->driver->set_speed) {
 error_setg(errp, QERR_UNSUPPORTED);
@@ -495,6 +496,10 @@ void block_job_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 }
 
 job->speed = speed;
+/* Kick the job to recompute its delay */
+if ((speed > old_speed) && timer_pending(&job->sleep_timer)) {
+block_job_enter(job);
+}
 }
 
 void block_job_complete(BlockJob *job, Error **errp)
-- 
2.14.3

Re: [Qemu-devel] [qemu-s390x] [RFC PATCH v2 0/3] tests for CCW IDA

2017-12-11 Thread Halil Pasic



On 12/07/2017 12:53 PM, Cornelia Huck wrote:
> On Thu, 7 Dec 2017 10:01:35 +0100
> Thomas Huth  wrote:
> 
>> On 07.12.2017 07:38, Thomas Huth wrote:
>>> On 08.11.2017 17:54, Halil Pasic wrote:  
>>>> I've keept the title althogh the scope shifted a bit: it's
>>>> more about introducing ccw-testdev than about IDA. The goal
>>>> is to facilitate testing the virtual channel subsystem
>>>> implementation, and the ccw interpretation.
>>>>
>>>> The first patch is the interesting one. See it's cover letter
>>>> for details. The RFC is about discussing some technical issues
>>>> with this patch.
>>>>
>>>> The other two patches are an out of source kernel module which
>>>> is basically only there so you can try out the first patch. The
>>>> tests there should probably be ported to something else. I don't
>>>> know what: maybe kvm-unit-tests, maybe qtest+libqos, or maybe some
>>>> bios based test image. We still have to figure out that.
>>>
>>> I think both, kvm-unit-tests or qtest+libqos would be good candidates.
>>> Please don't invent a new bios base test image, since kvm-unit-tests
>>> should be very similar already and we really don't need to duplicate
>>> work here.
>>>
>>> Anyway, you'd need to add some CSS infracture there first (in both
>>> kvm-unit-tests and the qtest environments), so it's likely a similar
>>> amount of work. qtest has the advantage that it gets checked
>>> automatically during "make check" each time, so I'd have a weak
>>> preference for that one.  
>>
>> Another thought: I'd also like to see the more complex virtio device
>> qtests enabled for virtio-ccw one day (e.g. tests/virtio-blk-test.c), so
>> I think we sooner or later should have some CSS infrastructure in the
>> qtests anyway ==> May I suggest that you have a try with the qtest approach?
> 
> Agreed, this would be helpful to get more ccw coverage in general.
> 

Yeah qtest+libqos does seem like the most likely candidate. We
are likely to go down this path. I say we, because it seems likely
that the guest counterpart with the unit test suite is going to
be done by somebody having more time to invest into this.

Regards,
Halil

[Qemu-devel] [PATCH v2 5/5] s390-ccw: interactive boot menu for scsi

2017-12-11 Thread Collin L. Walling

Interactive boot menu for scsi. This follows the same procedure
as the interactive menu for eckd dasd. An example follows:

s390x Enumerated Boot Menu.

3 entries detected. Select from index 0 to 2.

Please choose:

Signed-off-by: Collin L. Walling 
---
 pc-bios/s390-ccw/bootmap.c |  9 ++---
 pc-bios/s390-ccw/main.c|  2 ++
 pc-bios/s390-ccw/menu.c| 14 ++
 pc-bios/s390-ccw/menu.h|  1 +
 4 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index c817cf8..78e41ab 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -552,15 +552,18 @@ static void ipl_scsi(void)
 }
 
 program_table_entries++;
-if (program_table_entries == loadparm + 1) {
-break; /* selected entry found */
-}
 }
 
 debug_print_int("program table entries", program_table_entries);
 
 IPL_assert(program_table_entries != 0, "Empty Program Table");
 
+if (menu_check_flags(BOOT_MENU_FLAG_BOOT_OPTS)) {
+loadparm = menu_get_enum_boot_index(program_table_entries);
+}
+
+prog_table_entry = (ScsiBlockPtr *)(sec + pte_len * (loadparm + 1));
+
 zipl_run(prog_table_entry); /* no return */
 }
 
diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index fb0ef92..2a697a0 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -112,6 +112,8 @@ static void virtio_setup(void)
 vdev->selected_scsi_device.lun = iplb.scsi.lun;
 blk_schid.ssid = iplb.scsi.ssid & 0x3;
 found = find_dev(&blk_schid, iplb.scsi.devno);
+menu_set_parms(iplb.scsi.boot_menu_flags,
+iplb.scsi.boot_menu_timeout);
 break;
 default:
 panic("List-directed IPL not supported yet!\n");
diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index d707afb..49876c1 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -211,6 +211,20 @@ int menu_get_zipl_boot_index(const void *stage2, ZiplParms 
zipl_parms)
 return get_boot_index(ct - 1);
 }
 
+int menu_get_enum_boot_index(int entries)
+{
+char tmp[4];
+
+sclp_print("s390x Enumerated Boot Menu.\n\n");
+
+sclp_print(itostr(entries, tmp, sizeof(tmp)));
+sclp_print(" entries detected. Select from boot index 0 to ");
+sclp_print(itostr(entries - 1, tmp, sizeof(tmp)));
+sclp_print(".\n\n");
+
+return get_boot_index(entries);
+}
+
 void menu_set_parms(uint8_t boot_menu_flag, uint16_t boot_menu_timeout)
 {
 flags = boot_menu_flag;
diff --git a/pc-bios/s390-ccw/menu.h b/pc-bios/s390-ccw/menu.h
index a8727fa..4373b0c 100644
--- a/pc-bios/s390-ccw/menu.h
+++ b/pc-bios/s390-ccw/menu.h
@@ -24,5 +24,6 @@ typedef struct ZiplParms {
 void menu_set_parms(uint8_t boot_menu_flags, uint16_t boot_menu_timeout);
 bool menu_check_flags(uint8_t check_flags);
 int menu_get_zipl_boot_index(const void *stage2, ZiplParms zipl_parms);
+int menu_get_enum_boot_index(int entries);
 
 #endif /* MENU_H */
-- 
2.7.4

[Qemu-devel] [PATCH v2 2/5] s390-ccw: ipl structs for eckd cdl/ldl

2017-12-11 Thread Collin L. Walling

ECKD DASDs have different IPL structures for CDL and LDL
formats. The current Ipl1 and Ipl2 structs follow the CDL
format, so we prepend "EckdCdl" to them. Boot info for LDL
has been moved to a new struct: EckdLdlIpl1.

Also introduce structs for IPL stages 1 and 1b and for
disk geometry.
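
The layouts introduced here can be sanity-checked at compile time. What follows is a minimal sketch, assuming the struct definitions from the bootmap.h hunk of this patch and a GCC-compatible compiler; the _Static_assert checks and the helper eckd_stage1b_seek_offset() are illustrative only and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define STAGE2_BLK_CNT_MAX 24 /* Stage 1b can load up to 24 blocks */

typedef struct EckdSeekarg {
    uint16_t pad;
    uint16_t cyl;
    uint16_t head;
    uint8_t sec;
    uint8_t pad2;
} __attribute__((packed)) EckdSeekarg;

typedef struct EckdStage1b {
    uint8_t reserved[32 * STAGE2_BLK_CNT_MAX];
    struct EckdSeekarg seek[STAGE2_BLK_CNT_MAX];
    uint8_t unused[64];
} __attribute__((packed)) EckdStage1b;

typedef struct EckdStage1 {
    uint8_t reserved[72];
    struct EckdSeekarg seek[2];
} __attribute__((packed)) EckdStage1;

/* One seek argument is 8 bytes: 3 * uint16_t + 2 * uint8_t. */
_Static_assert(sizeof(EckdSeekarg) == 8, "unexpected EckdSeekarg size");
/* 32 * 24 reserved + 24 * 8 seek args + 64 unused = 1024 bytes. */
_Static_assert(sizeof(EckdStage1b) == 1024, "unexpected EckdStage1b size");
/* 72 reserved + 2 * 8 seek args = 88 bytes. */
_Static_assert(sizeof(EckdStage1) == 88, "unexpected EckdStage1 size");

/* Hypothetical helper: byte offset of the seek table inside stage 1b. */
size_t eckd_stage1b_seek_offset(void)
{
    return offsetof(EckdStage1b, seek);
}
```

Checks like these fail the build if a later edit changes a field and silently shifts the on-disk layout.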

Signed-off-by: Collin L. Walling 
Acked-by: Janosch Frank 
---
 pc-bios/s390-ccw/bootmap.c | 24 ++--
 pc-bios/s390-ccw/bootmap.h | 55 +-
 2 files changed, 53 insertions(+), 26 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 6f8e30f..5546b79 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -221,9 +221,9 @@ static void run_eckd_boot_script(block_number_t 
mbr_block_nr)
 static void ipl_eckd_cdl(void)
 {
 XEckdMbr *mbr;
-Ipl2 *ipl2 = (void *)sec;
+EckdCdlIpl2 *ipl2 = (void *)sec;
 IplVolumeLabel *vlbl = (void *)sec;
-block_number_t block_nr;
+block_number_t mbr_block_nr;
 
 /* we have just read the block #0 and recognized it as "IPL1" */
 sclp_print("CDL\n");
@@ -231,7 +231,7 @@ static void ipl_eckd_cdl(void)
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(1, ipl2, "Cannot read IPL2 record at block 1");
 
-mbr = &ipl2->u.x.mbr;
+mbr = &ipl2->mbr;
 IPL_assert(magic_match(mbr, ZIPL_MAGIC), "No zIPL section in IPL2 record.");
 IPL_assert(block_size_ok(mbr->blockptr.xeckd.bptr.size),
"Bad block size in zIPL section of IPL2 record.");
@@ -239,7 +239,7 @@ static void ipl_eckd_cdl(void)
"Non-ECKD device type in zIPL section of IPL2 record.");
 
 /* save pointer to Boot Script */
-block_nr = eckd_block_num((void *)&(mbr->blockptr));
+mbr_block_nr = eckd_block_num((void *)&(mbr->blockptr));
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(2, vlbl, "Cannot read Volume Label at block 2");
@@ -249,7 +249,7 @@ static void ipl_eckd_cdl(void)
"Invalid magic of volser block");
 print_volser(vlbl->f.volser);
 
-run_eckd_boot_script(block_nr);
+run_eckd_boot_script(mbr_block_nr);
 /* no return */
 }
 
@@ -280,8 +280,8 @@ static void print_eckd_ldl_msg(ECKD_IPL_mode_t mode)
 
 static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 {
-block_number_t block_nr;
-BootInfo *bip = (void *)(sec + 0x70); /* BootInfo is MBR for LDL */
+block_number_t mbr_block_nr;
+EckdLdlIpl1 *ipl1 = (void *)sec;
 
 if (mode != ECKD_LDL_UNLABELED) {
 print_eckd_ldl_msg(mode);
@@ -292,15 +292,17 @@ static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(0, sec, "Cannot read block 0 to grab boot info.");
 if (mode == ECKD_LDL_UNLABELED) {
-if (!magic_match(bip->magic, ZIPL_MAGIC)) {
+if (!magic_match(ipl1->boot_info.magic, ZIPL_MAGIC)) {
 return; /* not applicable layout */
 }
 sclp_print("unlabeled LDL.\n");
 }
-verify_boot_info(bip);
+verify_boot_info(&ipl1->boot_info);
+
+mbr_block_nr =
+eckd_block_num((void *)&(ipl1->boot_info.bp.ipl.bm_ptr.eckd.bptr));
 
-block_nr = eckd_block_num((void *)&(bip->bp.ipl.bm_ptr.eckd.bptr));
-run_eckd_boot_script(block_nr);
+run_eckd_boot_script(mbr_block_nr);
 /* no return */
 }
 
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index 4980838..b700d08 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -226,22 +226,47 @@ typedef struct BootInfo {  /* @ 0x70, record #0   
 */
 } bp;
 } __attribute__ ((packed)) BootInfo; /* see also XEckdMbr   */
 
-typedef struct Ipl1 {
-unsigned char key[4]; /* == "IPL1" */
-unsigned char data[24];
-} __attribute__((packed)) Ipl1;
+/*
+ * Structs for IPL
+ */
+#define STAGE2_BLK_CNT_MAX  24 /* Stage 1b can load up to 24 blocks */
 
-typedef struct Ipl2 {
-unsigned char key[4]; /* == "IPL2" */
-union {
-unsigned char data[144];
-struct {
-unsigned char reserved1[92-4];
-XEckdMbr mbr;
-unsigned char reserved2[144-(92-4)-sizeof(XEckdMbr)];
-} x;
-} u;
-} __attribute__((packed)) Ipl2;
+typedef struct EckdCdlIpl1 {
+uint8_t key[4]; /* == "IPL1" */
+uint8_t data[24];
+} __attribute__((packed)) EckdCdlIpl1;
+
+typedef struct EckdSeekarg {
+uint16_t pad;
+uint16_t cyl;
+uint16_t head;
+uint8_t sec;
+uint8_t pad2;
+} __attribute__ ((packed)) EckdSeekarg;
+
+typedef struct EckdStage1b {
+uint8_t reserved[32 * STAGE2_BLK_CNT_MAX];
+struct EckdSeekarg seek[STAGE2_BLK_CNT_MAX];
+uint8_t unused[64];
+} __attribute__ ((packed)) EckdStage1b;
+
+typedef struct EckdStage1 {
+uint8_t reserved[72];
+struct EckdSeekarg seek[2];
+} __attribute__ ((packed)) EckdStage1;
+
+typedef struct EckdCdlIpl2 {
+uint8_t key[4]; /* == "IPL2" */
+struct

[Qemu-devel] [PATCH v2 4/5] s390-ccw: interactive boot menu for eckd dasd

2017-12-11 Thread Collin L. Walling

When the boot menu options are present and the guest's
disk has been configured by the zipl tool, then the user
will be presented with an interactive boot menu with
labeled entries. An example of what the menu might look
like:

zIPL v1.37.1-build-20170714 interactive boot menu.

 0. default (linux-4.13.0)

 1. linux-4.13.0
 2. performance
 3. kvm

Please choose (default will boot in 10 seconds):

If the user's input is empty or 0, the default zipl entry will
be chosen. If the input is within the range presented by the
menu, then the selection will be booted. Any erroneous input
will cancel the timeout and prompt the user until correct
input is given.

Any value set for loadparm will override all boot menu options.
If loadparm=PROMPT, then the menu prompt will continuously wait
until correct user input is given.

If no boot options are given on the command line, the firmware
attempts to use the zipl loader values.
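
The input rules described above can be sketched as a small validation helper. This is a minimal sketch, not the code in menu.c: the name menu_parse_choice() and its parameters are hypothetical, and max_index is assumed to be the highest entry number shown in the menu (a small value).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper. Returns true and stores the selected entry in
 * *choice when the input is acceptable; returns false for erroneous
 * input, in which case the caller would cancel the timeout and prompt
 * again.
 */
bool menu_parse_choice(const char *input, int max_index, int *choice)
{
    int val = 0;
    size_t i;

    /* Empty input selects the default entry (0). */
    if (input[0] == '\0') {
        *choice = 0;
        return true;
    }

    for (i = 0; input[i]; i++) {
        if (input[i] < '0' || input[i] > '9') {
            return false; /* not a number */
        }
        val = val * 10 + (input[i] - '0');
        if (val > max_index) {
            return false; /* outside the range presented by the menu */
        }
    }

    *choice = val;
    return true;
}
```

With the four-entry example menu above (max_index of 3), "" and "0" both yield the default entry, "2" selects entry 2, and "4" or "x" are rejected.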

Signed-off-by: Collin L. Walling 
---
 pc-bios/s390-ccw/Makefile   |   2 +-
 pc-bios/s390-ccw/bootmap.c  |  71 +-
 pc-bios/s390-ccw/bootmap.h  |   2 +
 pc-bios/s390-ccw/main.c |   3 +
 pc-bios/s390-ccw/menu.c | 223 
 pc-bios/s390-ccw/menu.h |  28 ++
 pc-bios/s390-ccw/s390-ccw.h |   2 +
 pc-bios/s390-ccw/sclp.c |  20 
 pc-bios/s390-ccw/virtio.c   |   2 +-
 9 files changed, 348 insertions(+), 5 deletions(-)
 create mode 100644 pc-bios/s390-ccw/menu.c
 create mode 100644 pc-bios/s390-ccw/menu.h

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 9f7904f..1712c2d 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -9,7 +9,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw)
 
 .PHONY : all clean build-all
 
-OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o libc.o
+OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o libc.o menu.o
 QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS))
 QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float
 QEMU_CFLAGS += -march=z900 -fPIE -fno-strict-aliasing
diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 5546b79..c817cf8 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -13,6 +13,7 @@
 #include "bootmap.h"
 #include "virtio.h"
 #include "bswap.h"
+#include "menu.h"
 
 #ifdef DEBUG
 /* #define DEBUG_FALLBACK */
@@ -83,6 +84,7 @@ static void jump_to_IPL_code(uint64_t address)
 
 static unsigned char _bprs[8*1024]; /* guessed "max" ECKD sector size */
 static const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr);
+static uint8_t stage2[STAGE2_MAX_SIZE] __attribute__((__aligned__(PAGE_SIZE)));
 
 static inline void verify_boot_info(BootInfo *bip)
 {
@@ -182,7 +184,57 @@ static block_number_t load_eckd_segments(block_number_t 
blk, uint64_t *address)
 return block_nr;
 }
 
-static void run_eckd_boot_script(block_number_t mbr_block_nr)
+static void read_stage2(block_number_t s1b_block_nr)
+{
+block_number_t s2_block_nr;
+EckdStage1b *s1b = (void *)sec;
+int i;
+
+/* Get Stage1b data */
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(s1b_block_nr, s1b, "Cannot read stage1b boot loader.");
+
+/* Get Stage2 data */
+memset(stage2, FREE_SPACE_FILLER, sizeof(stage2));
+
+for (i = 0; i < STAGE2_MAX_SIZE / MAX_SECTOR_SIZE; i++) {
+s2_block_nr = eckd_block_num((void *)&(s1b->seek[i].cyl));
+
+if (!s2_block_nr) {
+break;
+}
+
+read_block(s2_block_nr, (stage2 + MAX_SECTOR_SIZE * i),
+   "Error reading Stage2 data");
+}
+}
+
+static bool find_zipl_boot_menu_data(block_number_t s1b_block_nr,
+ ZiplParms *zipl_parms)
+{
+int offset;
+void *s2_offset;
+
+read_stage2(s1b_block_nr);
+
+/* Menu banner starts with "zIPL" */
+for (offset = 0; offset < STAGE2_MAX_SIZE - 4; offset++) {
+s2_offset = stage2 + offset;
+
+if (magic_match(s2_offset, ZIPL_MAGIC_EBCDIC)) {
+zipl_parms->flag = *(uint16_t *)(s2_offset - 140);
+zipl_parms->timeout = *(uint16_t *)(s2_offset - 138);
+zipl_parms->menu_start = offset;
+return true;
+}
+}
+
+sclp_print("No zipl boot menu data found. Booting default entry.");
+return false;
+}
+
+static void run_eckd_boot_script(block_number_t mbr_block_nr,
+ block_number_t s1b_block_nr)
 {
 int i;
 unsigned int loadparm = get_loadparm_index();
@@ -190,6 +242,12 @@ static void run_eckd_boot_script(block_number_t 
mbr_block_nr)
 uint64_t address;
 ScsiMbr *bte = (void *)sec; /* Eckd bootmap table entry */
 BootMapScript *bms = (void *)sec;
+ZiplParms zipl_parms;
+
+if (menu_check_flags(BOOT_MENU_FLAG_BOOT_OPTS | BOOT_MENU_FLAG_ZIPL_OPTS)
+&&

[Qemu-devel] [PATCH v2 1/5] s390-ccw: update libc

2017-12-11 Thread Collin L. Walling

Moved:
  memcmp from bootmap.h to libc.h (renamed from _memcmp)
  strlen from sclp.c to libc.h (renamed from _strlen)

Added C standard functions:
  isdigit
  atoi

Added non-C standard function:
  itostr
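
For readers unfamiliar with the non-standard helper, an integer-to-string conversion of this kind can be sketched as below. This is an assumption-laden illustration, not the implementation from this patch (which is cut off in this archive): the name itostr_sketch() is hypothetical, and it takes an unsigned value to sidestep negative numbers.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: write the decimal form of num into str, which
 * has room for len bytes. Returns str, or NULL when the digits plus
 * the terminating null do not fit.
 */
char *itostr_sketch(unsigned int num, char *str, size_t len)
{
    size_t num_len = 1;
    unsigned int tmp = num;
    size_t i;

    /* Count the decimal digits of num. */
    while ((tmp /= 10) > 0) {
        num_len++;
    }

    /* Need space for every digit and the terminating null. */
    if (len < num_len + 1) {
        return NULL;
    }

    /* Fill the buffer from the least significant digit backwards. */
    str[num_len] = '\0';
    for (i = num_len; i > 0; i--) {
        str[i - 1] = (char)('0' + num % 10);
        num /= 10;
    }

    return str;
}
```

A four-byte buffer, as used for the entry counts in the boot menu, holds values up to 999 under this sketch.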

Signed-off-by: Collin L. Walling 
Acked-by: Christian Borntraeger 
Reviewed-by: Janosch Frank 
---
 pc-bios/s390-ccw/Makefile  |  2 +-
 pc-bios/s390-ccw/bootmap.c |  4 +--
 pc-bios/s390-ccw/bootmap.h | 16 +-
 pc-bios/s390-ccw/libc.c| 75 ++
 pc-bios/s390-ccw/libc.h| 31 +++
 pc-bios/s390-ccw/main.c| 17 +--
 pc-bios/s390-ccw/sclp.c| 10 +--
 7 files changed, 112 insertions(+), 43 deletions(-)
 create mode 100644 pc-bios/s390-ccw/libc.c

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 6d0c2ee..9f7904f 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -9,7 +9,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw)
 
 .PHONY : all clean build-all
 
-OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o
+OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o libc.o
 QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS))
 QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float
 QEMU_CFLAGS += -march=z900 -fPIE -fno-strict-aliasing
diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 67a6123..6f8e30f 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -512,7 +512,7 @@ static bool is_iso_bc_entry_compatible(IsoBcSection *s)
 "Failed to read image sector 0");
 
 /* Checking bytes 8 - 32 for S390 Linux magic */
-return !_memcmp(magic_sec + 8, linux_s390_magic, 24);
+return !memcmp(magic_sec + 8, linux_s390_magic, 24);
 }
 
 /* Location of the current sector of the directory */
@@ -641,7 +641,7 @@ static uint32_t find_iso_bc(void)
 if (vd->type == VOL_DESC_TYPE_BOOT) {
 IsoVdElTorito *et = &vd->vd.boot;
 
-if (!_memcmp(&et->el_torito[0], el_torito_magic, 32)) {
+if (!memcmp(&et->el_torito[0], el_torito_magic, 32)) {
 return bswap32(et->bc_offset);
 }
 }
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index cf99a4c..4980838 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -310,20 +310,6 @@ static inline bool magic_match(const void *data, const 
void *magic)
 return *((uint32_t *)data) == *((uint32_t *)magic);
 }
 
-static inline int _memcmp(const void *s1, const void *s2, size_t n)
-{
-int i;
-const uint8_t *p1 = s1, *p2 = s2;
-
-for (i = 0; i < n; i++) {
-if (p1[i] != p2[i]) {
-return p1[i] > p2[i] ? 1 : -1;
-}
-}
-
-return 0;
-}
-
 static inline uint32_t iso_733_to_u32(uint64_t x)
 {
 return (uint32_t)x;
@@ -416,7 +402,7 @@ const uint8_t vol_desc_magic[] = "CD001";
 
 static inline bool is_iso_vd_valid(IsoVolDesc *vd)
 {
-return !_memcmp(&vd->ident[0], vol_desc_magic, 5) &&
+return !memcmp(&vd->ident[0], vol_desc_magic, 5) &&
vd->version == 0x1 &&
vd->type <= VOL_DESC_TYPE_PARTITION;
 }
diff --git a/pc-bios/s390-ccw/libc.c b/pc-bios/s390-ccw/libc.c
new file mode 100644
index 0000000..60c4b28
--- /dev/null
+++ b/pc-bios/s390-ccw/libc.c
@@ -0,0 +1,75 @@
+/*
+ * libc-style definitions and functions
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include "libc.h"
+
+/**
+ * atoi:
+ * @str: the string to be converted.
+ *
+ * Given a string @str, convert it to an integer. Any non-numerical value
+ * will terminate the conversion.
+ *
+ * Returns: an integer converted from the string @str.
+ */
+int atoi(const char *str)
+{
+int i;
+int val = 0;
+
+for (i = 0; str[i]; i++) {
+char c = str[i];
+if (!isdigit(c)) {
+break;
+}
+val *= 10;
+val += c - '0';
+}
+
+return val;
+}
+
+/**
+ * itostr:
+ * @num: the integer to be converted.
+ * @str: a pointer to a string to store the conversion.
+ * @len: the length of the passed string.
+ *
+ * Given an integer @num, convert it to a string. The string @str must be
+ * allocated beforehand. The resulting string will be null terminated and
+ * returned.
+ *
+ * Returns: the string @str of the converted integer @num.
+ */
+char *itostr(int num, char *str, size_t len)
+{
+long num_len = 1;
+int tmp = num;
+int i;
+
+/* Count length of num */
+while ((tmp /= 10) > 0) {
+num_len++;
+}
+
+/* Check if we have enough space for num and null */
+if (len < num_len) {
+return 0;
+}
+
+/* Convert int to

[Qemu-devel] [PATCH v2 3/5] s390-ccw: parse and set boot menu options

2017-12-11 Thread Collin L. Walling

Set boot menu options for an s390 guest and store them in
the iplb. These options are set via the QEMU command line
option:

-boot menu=on|off[,splash-time=X]

or via the libvirt domain xml:


  


Where X is a positive integer giving the timeout in
milliseconds.

A loadparm other than 'prompt' will disable the menu and
just boot the specified entry.
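
The precedence between loadparm and -boot menu described above can be sketched as a pure function. This is a simplified illustration and not the code in hw/s390x/ipl.c: the name boot_menu_flags_for() is hypothetical, and loadparm is assumed to be a plain NUL-terminated string rather than the space-padded 8-byte field the real code compares against.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_MENU_FLAG_BOOT_OPTS 0x80
#define BOOT_MENU_FLAG_ZIPL_OPTS 0x40

/*
 * Hypothetical helper.
 * loadparm: "" when unset; menu: NULL when -boot menu was not given.
 */
uint8_t boot_menu_flags_for(const char *loadparm, const char *menu)
{
    if (strcmp(loadparm, "PROMPT") == 0) {
        return BOOT_MENU_FLAG_BOOT_OPTS; /* always show the menu */
    }
    if (loadparm[0] != '\0') {
        return 0; /* any other loadparm disables the menu */
    }
    if (menu == NULL) {
        return BOOT_MENU_FLAG_ZIPL_OPTS; /* fall back to zipl values */
    }
    if (strcmp(menu, "on") == 0) {
        return BOOT_MENU_FLAG_BOOT_OPTS;
    }
    return 0; /* menu=off */
}
```

At most one flag is ever returned, which matches the intent that loadparm takes priority over the -boot options.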

Signed-off-by: Collin L. Walling 
Reviewed-by: Janosch Frank 
---
 hw/s390x/ipl.c  | 55 +
 hw/s390x/ipl.h  |  8 +--
 pc-bios/s390-ccw/iplb.h |  8 +--
 3 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 0d06fc1..ed5e8d1 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -23,6 +23,8 @@
 #include "hw/s390x/ebcdic.h"
 #include "ipl.h"
 #include "qemu/error-report.h"
+#include "qemu/config-file.h"
+#include "qemu/cutils.h"
 
 #define KERN_IMAGE_START 0x010000UL
 #define KERN_PARM_AREA  0x010480UL
@@ -33,6 +35,9 @@
 #define ZIPL_IMAGE_START 0x009000UL
 #define IPL_PSW_MASK (PSW_MASK_32 | PSW_MASK_64)
 
+#define BOOT_MENU_FLAG_BOOT_OPTS 0x80
+#define BOOT_MENU_FLAG_ZIPL_OPTS 0x40
+
 static bool iplb_extended_needed(void *opaque)
 {
 S390IPLState *ipl = S390_IPL(object_resolve_path(TYPE_S390_IPL, NULL));
@@ -219,6 +224,51 @@ static Property s390_ipl_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static void s390_ipl_set_boot_menu(uint8_t *boot_menu_flags,
+   uint16_t *boot_menu_timeout)
+{
+MachineState *machine = MACHINE(qdev_get_machine());
+char *lp = object_property_get_str(OBJECT(machine), "loadparm", NULL);
+QemuOptsList *plist = qemu_find_opts("boot-opts");
+QemuOpts *opts = QTAILQ_FIRST(&plist->head);
+const char *p = qemu_opt_get(opts, "menu");
+unsigned long timeout = 0;
+
+if (memcmp(lp, "PROMPT  ", 8) == 0) {
+*boot_menu_flags = BOOT_MENU_FLAG_BOOT_OPTS;
+
+} else if (*lp) {
+/* If loadparm is set to any value, then discard boot menu */
+return;
+
+} else if (!p) {
+/* In the absence of -boot menu, use zipl loader parameters */
+*boot_menu_flags = BOOT_MENU_FLAG_ZIPL_OPTS;
+
+} else if (strncmp(p, "on", 2) == 0) {
+*boot_menu_flags = BOOT_MENU_FLAG_BOOT_OPTS;
+
+p = qemu_opt_get(opts, "splash-time");
+
+if (p && qemu_strtoul(p, NULL, 10, &timeout)) {
+error_report("splash-time value is invalid, forcing it to 0.");
+return;
+}
+
+/* Store timeout value as seconds */
+timeout /= 1000;
+
+if (timeout > 0xffff) {
+error_report("splash-time value is greater than 65535000,"
+ " forcing it to 65535000.");
+*boot_menu_timeout = 0xffff;
+return;
+}
+
+*boot_menu_timeout = timeout;
+}
+}
+
 static bool s390_gen_initial_iplb(S390IPLState *ipl)
 {
 DeviceState *dev_st;
@@ -245,6 +295,8 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
 ipl->iplb.pbt = S390_IPL_TYPE_CCW;
 ipl->iplb.ccw.devno = cpu_to_be16(ccw_dev->sch->devno);
 ipl->iplb.ccw.ssid = ccw_dev->sch->ssid & 3;
+s390_ipl_set_boot_menu(&ipl->iplb.ccw.boot_menu_flags,
+   &ipl->iplb.ccw.boot_menu_timeout);
 } else if (sd) {
 SCSIBus *bus = scsi_bus_from_device(sd);
 VirtIOSCSI *vdev = container_of(bus, VirtIOSCSI, bus);
@@ -266,6 +318,8 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
 ipl->iplb.scsi.channel = cpu_to_be16(sd->channel);
 ipl->iplb.scsi.devno = cpu_to_be16(ccw_dev->sch->devno);
 ipl->iplb.scsi.ssid = ccw_dev->sch->ssid & 3;
+s390_ipl_set_boot_menu(&ipl->iplb.scsi.boot_menu_flags,
+   &ipl->iplb.scsi.boot_menu_timeout);
 } else {
 return false; /* unknown device */
 }
@@ -273,6 +327,7 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
 if (!s390_ipl_set_loadparm(ipl->iplb.loadparm)) {
 ipl->iplb.flags |= DIAG308_FLAGS_LP_VALID;
 }
+
 return true;
 }
 
diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h
index 8a705e0..ff3b397 100644
--- a/hw/s390x/ipl.h
+++ b/hw/s390x/ipl.h
@@ -17,7 +17,9 @@
 
 struct IplBlockCcw {
 uint64_t netboot_start_addr;
-uint8_t  reserved0[77];
+uint8_t  reserved0[74];
+uint16_t boot_menu_timeout;
+uint8_t  boot_menu_flags;
 uint8_t  ssid;
 uint16_t devno;
 uint8_t  vm_flags;
@@ -51,7 +53,9 @@ struct IplBlockQemuScsi {
 uint32_t lun;
 uint16_t target;
 uint16_t channel;
-uint8_t  reserved0[77];
+uint8_t  reserved0[74];
+uint16_t boot_menu_timeout;
+uint8_t  boot_menu_flags;
 uint8_t  ssid;
 uint16_t devno;
 } QEMU_PACKED;
diff

[Qemu-devel] [PATCH v2 0/5] Interactive Boot Menu for DASD and SCSI Guests on s390x

2017-12-11 Thread Collin L. Walling

Thanks for your patience. I've been bouncing back-and-forth between different
designs.

Lots-o-changes this version.  Please call me out if I missed anything from the 
previous round of review.

--- [v2] ---

libc

- fixed up atoi and itostr and moved them to the new libc.c file

- documentation follows gtk-doc (let me know if I'm missing something)

ipl structs

- fixed up commit message

- s/BootEckd*/Eckd*

boot option parsing

- hw/s390x/ipl.c now handles *all* logic behind parsing command line values 
   and setting the appropriate values in the iplb (including interpreting 
   loadparm)

- pc-bios/s390-ccw/main.c now only sets the boot menu fields that were read 
   from the iplb (no longer interpreting loadparm here)

- timeout value is now stored as seconds instead of milliseconds 
   (maximum of 65535 seconds [~18 hours])

- error reported for invalid splash-time value
- if splash-time is invalid, then set it to 0 (wait forever)
- if splash-time is greater than max, then set it to max

- s/boot_menu_enabled/boot_menu_flags

- boot_menu_flags is set to *one* of these flags:
- BOOT_MENU_FLAG_BOOT_OPTS
- set if -boot menu=on or -machine loadparm=prompt
- BOOT_MENU_FLAG_ZIPL_OPTS
- set if no boot options or loadparm are set

- fixed ordering of the new fields in the iplb's
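
The timeout handling listed above (milliseconds on the command line, seconds in the iplb, capped at 65535) can be sketched as follows. The function name splash_time_to_timeout() and the macro BOOT_MENU_TIMEOUT_MAX are hypothetical names chosen for this illustration; error reporting for unparsable values is left out.

```c
#include <stdint.h>

/* Hypothetical name for the largest value a 16-bit timeout can hold. */
#define BOOT_MENU_TIMEOUT_MAX 0xffffUL

/* Convert a splash-time in milliseconds to whole seconds, clamped. */
uint16_t splash_time_to_timeout(unsigned long splash_ms)
{
    unsigned long secs = splash_ms / 1000;

    if (secs > BOOT_MENU_TIMEOUT_MAX) {
        return (uint16_t)BOOT_MENU_TIMEOUT_MAX;
    }
    return (uint16_t)secs;
}
```

Note that integer division means any splash-time below 1000 ms becomes 0, which the list above describes as "wait forever".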

boot menu for eckd dasd

- now supports zipl loader values

- function chs removed and now using pre-existing function eckd_block_num 
   instead

- sclp_read functionality and its helpers are now in menu.c, which is where 
   the only call to this function occurs
- renamed to read_prompt
- renamed read in sclp.c to sclp_read

- introduced header menu.h

- introduced new struct, ZiplParms that contains the following fields
   relating to zipl boot menu data:
- flags
- timeout
- menu_start

- stage2 reading cleaned up

- no longer panic if boot menu data is not found -- instead just print a 
   message and boot default (what if the user did not configure a menu?)
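
The search for zipl boot menu data comes down to scanning the stage 2 buffer for the EBCDIC encoding of "zIPL". The following is a minimal sketch of such a scan under stated assumptions: find_zipl_magic() and its return convention are hypothetical, and the buffer is an arbitrary byte array rather than the real stage 2 image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "zIPL" encoded in EBCDIC. */
static const uint8_t zipl_magic_ebcdic[4] = { 0xa9, 0xc9, 0xd7, 0xd3 };

/*
 * Hypothetical helper: return the offset of the first occurrence of
 * the magic in buf, or -1 if it is absent. A caller would treat -1 as
 * "no menu configured" and boot the default entry.
 */
long find_zipl_magic(const uint8_t *buf, size_t len)
{
    size_t off;

    /* Stop early enough that the 4-byte compare stays in bounds. */
    for (off = 0; off + sizeof(zipl_magic_ebcdic) <= len; off++) {
        if (memcmp(buf + off, zipl_magic_ebcdic,
                   sizeof(zipl_magic_ebcdic)) == 0) {
            return (long)off;
        }
    }

    return -1;
}
```

Any code that then reads fields at a negative offset from the match, as the menu patch does for the flag and timeout, would also need to confirm that the match lies far enough into the buffer.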

boot menu for scsi

- will only show a menu if BOOT_MENU_FLAG_BOOT_OPTS was set

--- [Summary] ---

These patches implement a boot menu for ECKD DASD and SCSI guests on s390x. 
The menu will only appear if the disk has been configured for IPL with the 
zIPL tool and with the following QEMU command line options:

-boot menu=on[,splash-time=X] and/or -machine loadparm='prompt'

or via the following libvirt domain xml:


  


or


  ...
  


Where X is some positive integer representing time in milliseconds.

A loadparm other than 'prompt' will disable the menu and just boot 
the specified entry.

If no boot options are specified, we will attempt to use the values
provided by zipl (ECKD DASD only).

Collin L. Walling (5):
  s390-ccw: update libc
  s390-ccw: ipl structs for eckd cdl/ldl
  s390-ccw: parse and set boot menu options
  s390-ccw: interactive boot menu for eckd dasd
  s390-ccw: interactive boot menu for scsi

 hw/s390x/ipl.c  |  55 ++
 hw/s390x/ipl.h  |   8 +-
 pc-bios/s390-ccw/Makefile   |   2 +-
 pc-bios/s390-ccw/bootmap.c  | 104 +++
 pc-bios/s390-ccw/bootmap.h  |  73 --
 pc-bios/s390-ccw/iplb.h |   8 +-
 pc-bios/s390-ccw/libc.c |  75 ++
 pc-bios/s390-ccw/libc.h |  31 ++
 pc-bios/s390-ccw/main.c |  22 ++--
 pc-bios/s390-ccw/menu.c | 237 
 pc-bios/s390-ccw/menu.h |  29 ++
 pc-bios/s390-ccw/s390-ccw.h |   2 +
 pc-bios/s390-ccw/sclp.c |  30 --
 pc-bios/s390-ccw/virtio.c   |   2 +-
 14 files changed, 600 insertions(+), 78 deletions(-)
 create mode 100644 pc-bios/s390-ccw/libc.c
 create mode 100644 pc-bios/s390-ccw/menu.c
 create mode 100644 pc-bios/s390-ccw/menu.h

-- 
2.7.4

Re: [Qemu-devel] [RFC PATCH 0/5] Scoped locks using attribute((cleanup))

2017-12-11 Thread Emilio G. Cota

On Fri, Dec 08, 2017 at 11:55:48 +0100, Paolo Bonzini wrote:
> So I'm a bit underwhelmed by this experiment.  Other opinions?

I am in the same boat. Most use cases in this patchset are arguably
adding more complexity because they substitute already very simple
code (e.g. "lock; do_something; unlock"), as others have pointed out
as well.

I usually deal with tricky cases (i.e. functions with many return
paths) with an inline "__locked" function. In most cases this will
be clearer than using the macros. I concede though that the separate
inline is not always an option.

That said, two comments:

- We might be better off just exposing the cleanup attribute
  via some convenience macros. The systemd codebase does this,
  mostly for freeing memory or closing file descriptors. I suspect
  a large percentage of goto's in our codebase could be eliminated.

  This could be also used for locks, although we'd need a variant
  of mutex_lock that returned the mutex, so that in the cleanup
  function we could just check for NULL.

- Does the cleanup attribute work on all compilers used to build QEMU?
  (I'm thinking of Windows in particular.)
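
To make the first suggestion concrete, here is a rough sketch of what such convenience macros could look like. All names (AUTO_FREE, AUTO_UNLOCK, ToyMutex and friends) are made up for illustration, ToyMutex is a stand-in for a real mutex, and the cleanup attribute is a GCC/Clang extension, so this says nothing about other compilers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a real mutex so the sketch has no threading dependency. */
typedef struct ToyMutex {
    int locked;
} ToyMutex;

/* Variant of "lock" that returns the mutex, as suggested above. */
static inline ToyMutex *toy_mutex_lock_ret(ToyMutex *m)
{
    m->locked = 1;
    return m;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static inline void cleanup_unlock(ToyMutex **m)
{
    if (*m) {
        (*m)->locked = 0;
    }
}

static inline void cleanup_free(void *p)
{
    free(*(void **)p);
}

#define AUTO_UNLOCK __attribute__((cleanup(cleanup_unlock)))
#define AUTO_FREE   __attribute__((cleanup(cleanup_free)))

ToyMutex counter_lock;
int counter;

/* Every return path below unlocks the mutex and frees the buffer. */
int bump_counter(int limit)
{
    AUTO_UNLOCK ToyMutex *guard = toy_mutex_lock_ret(&counter_lock);
    AUTO_FREE char *scratch = malloc(16);

    if (!scratch || counter >= limit) {
        return -1;
    }
    return ++counter;
}
```

The handlers run in reverse order of declaration when the variables go out of scope, so no goto-based unwinding is needed in bump_counter().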

Thanks,

Emilio

Re: [Qemu-devel] [PATCH for-2.12 2/2] net: Remove the deprecated -tftp, -bootp, -redir and -smb options

2017-12-11 Thread Peter Maydell

On 7 December 2017 at 18:02, Thomas Huth  wrote:
> These options likely do not work as expected as soon as the user
> tries to use more than one network interface at once. The parameters
> have been marked as deprecated since QEMU v2.6, so users had plenty
> of time to move their scripts to the new syntax. Time to remove the
> old parameters now.

The deprecation message says:
   error_report("The -redir option is deprecated. "
"Please use '-netdev user,hostfwd=...' instead.");

How does this work for systems which have embedded ethernet
devices and can't use -netdev ?

This is one reason I haven't bothered to update my scripts yet
(the other being that the deprecation message is basically
saying "go and do a bunch of research into command line
syntax" rather than "replace your current option '-redir xyz'
with '-netdev user,hostfwd=x,y:z'"...)

The message also doesn't point out that if you were previously
using -net + -redir you need to switch to -device + -netdev,
since -net + -netdev doesn't work AFAIK. Which is more
upheaval to a working command line setup.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 5/5] thread-pool: convert to use lock guards

2017-12-11 Thread Paolo Bonzini

On 11/12/2017 11:23, Stefan Hajnoczi wrote:
>>
>> In other words, I don't see what 'QEMU_WITH_LOCK_GUARD() {}' buys us
>> over '{ QEMU_LOCK_GUARD() }'.
> The QEMU_WITH_LOCK_GUARD() {} syntax is nice because it's similar to
> if/while/for statements.
> 
> However, { QEMU_LOCK_GUARD() } doesn't hide a for statement in a macro
> so the break statement works inside the scope.  Less chance of bugs.

The same is true of a "switch" statement.  Being able to break out of
QEMU_WITH_LOCK_GUARD could also be a feature...
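
To illustrate the difference under discussion, here is a toy sketch of both forms. The macros below are simplified stand-ins and not the ones from this series; ToyMutex and every name in the block are hypothetical, and the cleanup attribute requires GCC or Clang.

```c
#include <stddef.h>

typedef struct ToyMutex {
    int locked;
} ToyMutex;

static inline ToyMutex *toy_lock(ToyMutex *m)
{
    m->locked++;
    return m;
}

/* Unlocks once, then clears the guard so a second call is a no-op. */
static inline void toy_unlock(ToyMutex **m)
{
    if (*m) {
        (*m)->locked--;
        *m = NULL;
    }
}

/* Block-scoped guard: released when the enclosing scope ends. */
#define LOCK_GUARD(m) \
    __attribute__((cleanup(toy_unlock), unused)) \
    ToyMutex *lock_guard_ = toy_lock(m)

/* Statement-like guard: hides a single-iteration for loop. */
#define WITH_LOCK_GUARD(m)                                         \
    for (__attribute__((cleanup(toy_unlock))) ToyMutex *with_guard_ \
             = toy_lock(m);                                        \
         with_guard_;                                              \
         toy_unlock(&with_guard_))

/* break leaves only the guard; the outer loop keeps iterating. */
int count_with_for_guard(const int *vals, int n, int stop, ToyMutex *m)
{
    int visited = 0;

    for (int i = 0; i < n; i++) {
        WITH_LOCK_GUARD(m) {
            if (vals[i] == stop) {
                break;
            }
            visited++;
        }
    }
    return visited;
}

/* break leaves the outer loop, as it would without any guard. */
int count_with_block_guard(const int *vals, int n, int stop, ToyMutex *m)
{
    int visited = 0;

    for (int i = 0; i < n; i++) {
        LOCK_GUARD(m);
        if (vals[i] == stop) {
            break;
        }
        visited++;
    }
    return visited;
}
```

In both variants the mutex is released on every path; they differ only in which construct the break statement terminates.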

Paolo


> I'd be okay without QEMU_WITH_LOCK_GUARD().




Re: [Qemu-devel] [PATCH for-2.12 2/2] net: Remove the deprecated -tftp, -bootp, -redir and -smb options

2017-12-11 Thread Samuel Thibault

Thomas Huth, on jeu. 07 déc. 2017 19:02:35 +0100, wrote:
> These options likely do not work as expected as soon as the user
> tries to use more than one network interface at once. The parameters
> have been marked as deprecated since QEMU v2.6, so users had plenty
> of time to move their scripts to the new syntax. Time to remove the
> old parameters now.
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: Samuel Thibault 

> ---
>  include/net/net.h   |  3 ---
>  include/net/slirp.h |  4 
>  net/slirp.c | 58 
> -
>  os-posix.c  |  8 
>  qemu-doc.texi   | 24 --
>  qemu-options.hx | 15 --
>  vl.c| 18 -
>  7 files changed, 130 deletions(-)
> 
> diff --git a/include/net/net.h b/include/net/net.h
> index 1c55a93..670e03e 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -204,9 +204,6 @@ extern NICInfo nd_table[MAX_NICS];
>  extern const char *host_net_devices[];
>  
>  /* from net.c */
> -extern const char *legacy_tftp_prefix;
> -extern const char *legacy_bootp_filename;
> -
>  int net_client_init(QemuOpts *opts, bool is_netdev, Error **errp);
>  int net_client_parse(QemuOptsList *opts_list, const char *str);
>  int net_init_clients(void);
> diff --git a/include/net/slirp.h b/include/net/slirp.h
> index 0c98e46..2c37fa0 100644
> --- a/include/net/slirp.h
> +++ b/include/net/slirp.h
> @@ -34,10 +34,6 @@
>  void hmp_hostfwd_add(Monitor *mon, const QDict *qdict);
>  void hmp_hostfwd_remove(Monitor *mon, const QDict *qdict);
>  
> -int net_slirp_redir(const char *redir_str);
> -
> -int net_slirp_smb(const char *exported_dir);
> -
>  void hmp_info_usernet(Monitor *mon, const QDict *qdict);
>  
>  #endif
> diff --git a/net/slirp.c b/net/slirp.c
> index cb8ca23..4999a25 100644
> --- a/net/slirp.c
> +++ b/net/slirp.c
> @@ -85,8 +85,6 @@ typedef struct SlirpState {
>  } SlirpState;
>  
>  static struct slirp_config_str *slirp_configs;
> -const char *legacy_tftp_prefix;
> -const char *legacy_bootp_filename;
>  static QTAILQ_HEAD(slirp_stacks, SlirpState) slirp_stacks =
>  QTAILQ_HEAD_INITIALIZER(slirp_stacks);
>  
> @@ -96,8 +94,6 @@ static int slirp_guestfwd(SlirpState *s, const char 
> *config_str,
>int legacy_format, Error **errp);
>  
>  #ifndef _WIN32
> -static const char *legacy_smb_export;
> -
>  static int slirp_smb(SlirpState *s, const char *exported_dir,
>   struct in_addr vserver_addr, Error **errp);
>  static void slirp_smb_cleanup(SlirpState *s);
> @@ -193,13 +189,6 @@ static int net_slirp_init(NetClientState *peer, const 
> char *model,
>  return -1;
>  }
>  
> -if (!tftp_export) {
> -tftp_export = legacy_tftp_prefix;
> -}
> -if (!bootfile) {
> -bootfile = legacy_bootp_filename;
> -}
> -
>  if (vnetwork) {
>  if (get_str_sep(buf, sizeof(buf), &vnetwork, '/') < 0) {
>  if (!inet_aton(vnetwork, &net)) {
> @@ -386,9 +375,6 @@ static int net_slirp_init(NetClientState *peer, const 
> char *model,
>  }
>  }
>  #ifndef _WIN32
> -if (!smb_export) {
> -smb_export = legacy_smb_export;
> -}
>  if (smb_export) {
>  if (slirp_smb(s, smb_export, smbsrv, errp) < 0) {
>  goto error;
> @@ -586,28 +572,6 @@ void hmp_hostfwd_add(Monitor *mon, const QDict *qdict)
>  
>  }
>  
> -int net_slirp_redir(const char *redir_str)
> -{
> -struct slirp_config_str *config;
> -Error *err = NULL;
> -int res;
> -
> -if (QTAILQ_EMPTY(&slirp_stacks)) {
> -config = g_malloc(sizeof(*config));
> -pstrcpy(config->str, sizeof(config->str), redir_str);
> -config->flags = SLIRP_CFG_HOSTFWD | SLIRP_CFG_LEGACY;
> -config->next = slirp_configs;
> -slirp_configs = config;
> -return 0;
> -}
> -
> -res = slirp_hostfwd(QTAILQ_FIRST(&slirp_stacks), redir_str, 1, &err);
> -if (res < 0) {
> -error_report_err(err);
> -}
> -return res;
> -}
> -
>  #ifndef _WIN32
>  
>  /* automatic user mode samba server configuration */
> @@ -723,28 +687,6 @@ static int slirp_smb(SlirpState* s, const char 
> *exported_dir,
>  return 0;
>  }
>  
> -/* automatic user mode samba server configuration (legacy interface) */
> -int net_slirp_smb(const char *exported_dir)
> -{
> -struct in_addr vserver_addr = { .s_addr = 0 };
> -
> -if (legacy_smb_export) {
> -fprintf(stderr, "-smb given twice\n");
> -return -1;
> -}
> -legacy_smb_export = exported_dir;
> -if (!QTAILQ_EMPTY(&slirp_stacks)) {
> -Error *err = NULL;
> -int res = slirp_smb(QTAILQ_FIRST(&slirp_stacks), exported_dir,
> -vserver_addr, &err);
> -if (res < 0) {
> -error_report_err(err);
> -}
> -return res;
> -}
> -return 0;
> -}
> -
>  #endif /* !defined(_WIN32) */
>  
>  struct

Re: [Qemu-devel] [PATCH for-2.12 1/2] net: Remove the legacy "-net channel" parameter

2017-12-11 Thread Samuel Thibault

Thomas Huth, on jeu. 07 déc. 2017 19:02:34 +0100, wrote:
> It has never been documented, so hardly anybody knows about this
> parameter, and it is marked as deprecated since QEMU v2.6.
> Time to let it go now.
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: Samuel Thibault 

> ---
>  include/net/slirp.h |  2 --
>  net/net.c   |  7 ---
>  net/slirp.c | 34 --
>  qemu-doc.texi   |  5 -
>  4 files changed, 48 deletions(-)
> 
> diff --git a/include/net/slirp.h b/include/net/slirp.h
> index 64b795c..0c98e46 100644
> --- a/include/net/slirp.h
> +++ b/include/net/slirp.h
> @@ -36,8 +36,6 @@ void hmp_hostfwd_remove(Monitor *mon, const QDict *qdict);
>  
>  int net_slirp_redir(const char *redir_str);
>  
> -int net_slirp_parse_legacy(QemuOptsList *opts_list, const char *optarg, int 
> *ret);
> -
>  int net_slirp_smb(const char *exported_dir);
>  
>  void hmp_info_usernet(Monitor *mon, const QDict *qdict);
> diff --git a/net/net.c b/net/net.c
> index 39ef546..7425857 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -1565,13 +1565,6 @@ int net_init_clients(void)
>  
>  int net_client_parse(QemuOptsList *opts_list, const char *optarg)
>  {
> -#if defined(CONFIG_SLIRP)
> -int ret;
> -if (net_slirp_parse_legacy(opts_list, optarg, &ret)) {
> -return ret;
> -}
> -#endif
> -
>  if (!qemu_opts_parse_noisily(opts_list, optarg, true)) {
>  return -1;
>  }
> diff --git a/net/slirp.c b/net/slirp.c
> index 318a26e..cb8ca23 100644
> --- a/net/slirp.c
> +++ b/net/slirp.c
> @@ -956,37 +956,3 @@ int net_init_slirp(const Netdev *netdev, const char 
> *name,
>  
>  return ret;
>  }
> -
> -int net_slirp_parse_legacy(QemuOptsList *opts_list, const char *optarg, int 
> *ret)
> -{
> -if (strcmp(opts_list->name, "net") != 0 ||
> -strncmp(optarg, "channel,", strlen("channel,")) != 0) {
> -return 0;
> -}
> -
> -error_report("The '-net channel' option is deprecated. "
> - "Please use '-netdev user,guestfwd=...' instead.");
> -
> -/* handle legacy -net channel,port:chr */
> -optarg += strlen("channel,");
> -
> -if (QTAILQ_EMPTY(&slirp_stacks)) {
> -struct slirp_config_str *config;
> -
> -config = g_malloc(sizeof(*config));
> -pstrcpy(config->str, sizeof(config->str), optarg);
> -config->flags = SLIRP_CFG_LEGACY;
> -config->next = slirp_configs;
> -slirp_configs = config;
> -*ret = 0;
> -} else {
> -Error *err = NULL;
> -*ret = slirp_guestfwd(QTAILQ_FIRST(&slirp_stacks), optarg, 1, &err);
> -if (*ret < 0) {
> -error_report_err(err);
> -}
> -}
> -
> -return 1;
> -}
> -
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index db2351c..982cab5 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -2459,11 +2459,6 @@ The ``-smb /some/dir'' argument is now a synonym for 
> setting
>  the ``-netdev user,smb=/some/dir'' argument instead. The new
>  syntax allows different settings to be provided per NIC.
>  
> -@subsection -net channel (since 2.6.0)
> -
> -The ``--net channel,ARGS'' argument is now a synonym for setting
> -the ``-netdev user,guestfwd=ARGS'' argument instead.
> -
>  @subsection -net vlan (since 2.9.0)
>  
>  The ``-net vlan=NN'' argument is partially replaced with the
> -- 
> 1.8.3.1
> 

-- 
Samuel
A: Because it stupidly reverses the natural reading order!
Q: But why is quoting at the end of a post so dreadful?
A: Quoting at the end of a post
Q: What is the most annoying thing on newsgroups?

Re: [Qemu-devel] [RFC PATCH 0/5] Scoped locks using attribute((cleanup))

2017-12-11 Thread Paolo Bonzini

On 11/12/2017 15:11, Eric Blake wrote:
> I don't know if there is a way to make gcc insert stack-unwind
> directives that are honored across longjmp (I know C++ does it for
> exceptions; so there may be a way, and I just don't know it).

Probably -fexceptions.

Paolo

> Conversely, I do know that pthread_cleanup_push/pop, which does
> something similar, is permitted by POSIX to NOT work across longjmp:
> 
>Calling longjmp(3) (siglongjmp(3)) produces undefined  results
> if  any
>call  has  been made to pthread_cleanup_push() or
> pthread_cleanup_pop()
>without the matching call of the pair since the jump buffer was
> filled
>by   setjmp(3)  (sigsetjmp(3)).   Likewise,  calling  longjmp(3)
> (sig‐
>longjmp(3)) from inside a clean-up handler produces  undefined
> results
>unless  the  jump  buffer  was  also filled by setjmp(3)
> (sigsetjmp(3))
>inside the handler.
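
As a rough illustration of the scoped-lock idea under discussion, the
sketch below uses the GCC/Clang cleanup attribute to release a toy lock
on every normal exit from a scope. DemoLock, scoped_release and
demo_increment are hypothetical names made up for this example, not QEMU
APIs. The handler should only be relied upon for ordinary scope exit; a
longjmp() out of the scope should be assumed to skip it unless the
toolchain is known to provide unwinding support.

```c
#include <assert.h>

/* Toy lock used only for this sketch; a real user would wrap a mutex. */
typedef struct {
    int held;
} DemoLock;

static void demo_lock_acquire(DemoLock *l)
{
    assert(!l->held);   /* the toy lock is not recursive */
    l->held = 1;
}

/* Cleanup handler: receives a pointer to the annotated variable. */
static void scoped_release(DemoLock **guard)
{
    if (*guard) {
        (*guard)->held = 0;
    }
}

/* Take the lock now and release it when the enclosing scope is left. */
#define DEMO_SCOPED_LOCK(l)                                        \
    DemoLock *demo_guard_ __attribute__((cleanup(scoped_release))) \
        = (demo_lock_acquire(l), (l))

/* Returns the new counter value, or -1 once 'limit' has been reached. */
int demo_increment(DemoLock *l, int *counter, int limit)
{
    DEMO_SCOPED_LOCK(l);

    if (*counter >= limit) {
        return -1;      /* early return: the handler still releases l */
    }
    return ++*counter;
}
```

Both return paths of demo_increment() leave the lock released, which is
the property the cleanup attribute is meant to provide here.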





[Qemu-devel] [PATCH 08/13] imx_fec: Add support for multiple Tx DMA rings

2017-12-11 Thread Andrey Smirnov

More recent versions of the IP block support more than one Tx DMA ring,
so add the code implementing that feature.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 133 ---
 include/hw/net/imx_fec.h |  18 ++-
 2 files changed, 130 insertions(+), 21 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 825c879a28..77d27f763e 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -196,6 +196,31 @@ static const char *imx_eth_reg_name(IMXFECState *s, 
uint32_t index)
 }
 }
 
+/*
+ * Versions of this device with more than one TX descriptor save the
+ * 2nd and 3rd descriptors in a subsection, to maintain migration
+ * compatibility with previous versions of the device that only
+ * supported a single descriptor.
+ */
+static bool imx_eth_is_multi_tx_ring(void *opaque)
+{
+IMXFECState *s = IMX_FEC(opaque);
+
+return s->tx_ring_num > 1;
+}
+
+static const VMStateDescription vmstate_imx_eth_txdescs = {
+.name = "imx.fec/txdescs",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = imx_eth_is_multi_tx_ring,
+.fields = (VMStateField[]) {
+ VMSTATE_UINT32(tx_descriptor[1], IMXFECState),
+ VMSTATE_UINT32(tx_descriptor[2], IMXFECState),
+ VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_imx_eth = {
 .name = TYPE_IMX_FEC,
 .version_id = 2,
@@ -203,15 +228,18 @@ static const VMStateDescription vmstate_imx_eth = {
 .fields = (VMStateField[]) {
 VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
 VMSTATE_UINT32(rx_descriptor, IMXFECState),
-VMSTATE_UINT32(tx_descriptor, IMXFECState),
-
+VMSTATE_UINT32(tx_descriptor[0], IMXFECState),
 VMSTATE_UINT32(phy_status, IMXFECState),
 VMSTATE_UINT32(phy_control, IMXFECState),
 VMSTATE_UINT32(phy_advertise, IMXFECState),
 VMSTATE_UINT32(phy_int, IMXFECState),
 VMSTATE_UINT32(phy_int_mask, IMXFECState),
 VMSTATE_END_OF_LIST()
-}
+},
+.subsections = (const VMStateDescription * []) {
+&vmstate_imx_eth_txdescs,
+NULL
+},
 };
 
 #define PHY_INT_ENERGYON(1 << 7)
@@ -406,7 +434,7 @@ static void imx_fec_do_tx(IMXFECState *s)
 {
 int frame_size = 0, descnt = 0;
 uint8_t *ptr = s->frame;
-uint32_t addr = s->tx_descriptor;
+uint32_t addr = s->tx_descriptor[0];
 
 while (descnt++ < IMX_MAX_DESC) {
 IMXFECBufDesc bd;
@@ -447,16 +475,47 @@ static void imx_fec_do_tx(IMXFECState *s)
 }
 }
 
-s->tx_descriptor = addr;
+s->tx_descriptor[0] = addr;
 
 imx_eth_update(s);
 }
 
-static void imx_enet_do_tx(IMXFECState *s)
+static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
 {
 int frame_size = 0, descnt = 0;
+
 uint8_t *ptr = s->frame;
-uint32_t addr = s->tx_descriptor;
+uint32_t addr, int_txb, int_txf, tdsr;
+size_t ring;
+
+switch (index) {
+case ENET_TDAR:
+ring= 0;
+int_txb = ENET_INT_TXB;
+int_txf = ENET_INT_TXF;
+tdsr= ENET_TDSR;
+break;
+case ENET_TDAR1:
+ring= 1;
+int_txb = ENET_INT_TXB1;
+int_txf = ENET_INT_TXF1;
+tdsr= ENET_TDSR1;
+break;
+case ENET_TDAR2:
+ring= 2;
+int_txb = ENET_INT_TXB2;
+int_txf = ENET_INT_TXF2;
+tdsr= ENET_TDSR2;
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: bogus value for index %x\n",
+  __func__, index);
+abort();
+break;
+}
+
+addr = s->tx_descriptor[ring];
 
 while (descnt++ < IMX_MAX_DESC) {
 IMXENETBufDesc bd;
@@ -502,32 +561,32 @@ static void imx_enet_do_tx(IMXFECState *s)
 
 frame_size = 0;
 if (bd.option & ENET_BD_TX_INT) {
-s->regs[ENET_EIR] |= ENET_INT_TXF;
+s->regs[ENET_EIR] |= int_txf;
 }
 }
 if (bd.option & ENET_BD_TX_INT) {
-s->regs[ENET_EIR] |= ENET_INT_TXB;
+s->regs[ENET_EIR] |= int_txb;
 }
 bd.flags &= ~ENET_BD_R;
 /* Write back the modified descriptor.  */
 imx_enet_write_bd(&bd, addr);
 /* Advance to the next descriptor.  */
 if ((bd.flags & ENET_BD_W) != 0) {
-addr = s->regs[ENET_TDSR];
+addr = s->regs[tdsr];
 } else {
 addr += sizeof(bd);
 }
 }
 
-s->tx_descriptor = addr;
+s->tx_descriptor[ring] = addr;
 
 imx_eth_update(s);
 }
 
-static void imx_eth_do_tx(IMXFECState *s)
+static void imx_eth_do_tx(IMXFECState *s, uint32_t index)
 {
 if

[Qemu-devel] [PATCH 11/13] imx_fec: Reserve full FSL_IMX25_FEC_SIZE page for the register file

2017-12-11 Thread Andrey Smirnov

Some i.MX SoCs (e.g. i.MX7) have FEC registers going as far as offset
0x614, so to avoid aborts when the guest accesses them in QEMU, extend
the register file to cover FSL_IMX25_FEC_SIZE (16K) of address space
instead of just 1K.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c   | 2 +-
 include/hw/arm/fsl-imx25.h | 1 -
 include/hw/net/imx_fec.h   | 1 +
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index c1cf7f9c58..4fb48f62ba 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1281,7 +1281,7 @@ static void imx_eth_realize(DeviceState *dev, Error 
**errp)
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 
 memory_region_init_io(&s->iomem, OBJECT(dev), &imx_eth_ops, s,
-  TYPE_IMX_FEC, 0x400);
+  TYPE_IMX_FEC, FSL_IMX25_FEC_SIZE);
 sysbus_init_mmio(sbd, &s->iomem);
 sysbus_init_irq(sbd, &s->irq[0]);
 sysbus_init_irq(sbd, &s->irq[1]);
diff --git a/include/hw/arm/fsl-imx25.h b/include/hw/arm/fsl-imx25.h
index d0e8e9d956..65a73714ef 100644
--- a/include/hw/arm/fsl-imx25.h
+++ b/include/hw/arm/fsl-imx25.h
@@ -192,7 +192,6 @@ typedef struct FslIMX25State {
 #define FSL_IMX25_UART5_ADDR0x5002C000
 #define FSL_IMX25_UART5_SIZE0x4000
 #define FSL_IMX25_FEC_ADDR  0x50038000
-#define FSL_IMX25_FEC_SIZE  0x4000
 #define FSL_IMX25_CCM_ADDR  0x53F8
 #define FSL_IMX25_CCM_SIZE  0x4000
 #define FSL_IMX25_GPT4_ADDR 0x53F84000
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index 91ef8f89a6..7b3faa4019 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -245,6 +245,7 @@ typedef struct {
 
 #define ENET_TX_RING_NUM   3
 
+#define FSL_IMX25_FEC_SIZE  0x4000
 
 typedef struct IMXFECState {
 /*< private >*/
-- 
2.14.3

[Qemu-devel] [PATCH 12/13] sdhci: Add i.MX specific subtype of SDHCI

2017-12-11 Thread Andrey Smirnov

The IP block found on several generations of the i.MX family does not
use the vanilla SDHCI implementation and comes with a number of quirks.

Introduce an i.MX SDHCI subtype of the SDHCI block to add the code
necessary to support an unmodified Linux guest driver.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 hw/sd/sdhci-internal.h |  19 +
 hw/sd/sdhci.c  | 228 -
 include/hw/sd/sdhci.h  |   8 ++
 3 files changed, 253 insertions(+), 2 deletions(-)

diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
index 161177cf39..b86ac0791b 100644
--- a/hw/sd/sdhci-internal.h
+++ b/hw/sd/sdhci-internal.h
@@ -85,12 +85,18 @@
 
 /* R/W Host control Register 0x0 */
 #define SDHC_HOSTCTL   0x28
+#define SDHC_CTRL_LED  0x01
 #define SDHC_CTRL_DMA_CHECK_MASK   0x18
 #define SDHC_CTRL_SDMA 0x00
 #define SDHC_CTRL_ADMA1_32 0x08
 #define SDHC_CTRL_ADMA2_32 0x10
 #define SDHC_CTRL_ADMA2_64 0x18
 #define SDHC_DMA_TYPE(x)   ((x) & SDHC_CTRL_DMA_CHECK_MASK)
+#define SDHC_CTRL_4BITBUS  0x02
+#define SDHC_CTRL_8BITBUS  0x20
+#define SDHC_CTRL_CDTEST_INS   0x40
+#define SDHC_CTRL_CDTEST_EN0x80
+
 
 /* R/W Power Control Register 0x0 */
 #define SDHC_PWRCON0x29
@@ -229,4 +235,17 @@ enum {
 
 extern const VMStateDescription sdhci_vmstate;
 
+
+#define ESDHC_MIX_CTRL  0x48
+#define ESDHC_VENDOR_SPEC   0xc0
+#define ESDHC_DLL_CTRL  0x60
+
+#define ESDHC_TUNING_CTRL   0xcc
+#define ESDHC_TUNE_CTRL_STATUS  0x68
+#define ESDHC_WTMK_LVL  0x44
+
+#define ESDHC_CTRL_4BITBUS  (0x1 << 1)
+#define ESDHC_CTRL_8BITBUS  (0x2 << 1)
+
+
 #endif
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 6d6a791ee9..758af067f9 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -265,7 +265,8 @@ static void sdhci_send_command(SDHCIState *s)
 }
 }
 
-if ((s->norintstsen & SDHC_NISEN_TRSCMP) &&
+if (!(s->quirks & SDHCI_QUIRK_NO_BUSY_IRQ) &&
+(s->norintstsen & SDHC_NISEN_TRSCMP) &&
 (s->cmdreg & SDHC_CMD_RESPONSE) == SDHC_CMD_RSP_WITH_BUSY) {
 s->norintsts |= SDHC_NIS_TRSCMP;
 }
@@ -1191,6 +1192,8 @@ static void sdhci_initfn(SDHCIState *s)
 
 s->insert_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
sdhci_raise_insertion_irq, s);
 s->transfer_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sdhci_data_transfer, 
s);
+
+s->io_ops = &sdhci_mmio_ops;
 }
 
 static void sdhci_uninitfn(SDHCIState *s)
@@ -1347,7 +1350,7 @@ static void sdhci_sysbus_realize(DeviceState *dev, Error 
** errp)
 s->buf_maxsz = sdhci_get_fifolen(s);
 s->fifo_buffer = g_malloc0(s->buf_maxsz);
 sysbus_init_irq(sbd, &s->irq);
-memory_region_init_io(&s->iomem, OBJECT(s), &sdhci_mmio_ops, s, "sdhci",
+memory_region_init_io(&s->iomem, OBJECT(s), s->io_ops, s, "sdhci",
 SDHC_REGISTERS_MAP_SIZE);
 sysbus_init_mmio(sbd, &s->iomem);
 }
@@ -1386,11 +1389,232 @@ static const TypeInfo sdhci_bus_info = {
 .class_init = sdhci_bus_class_init,
 };
 
+static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
+{
+SDHCIState *s = SYSBUS_SDHCI(opaque);
+uint32_t ret;
+uint16_t hostctl;
+
+switch (offset) {
+default:
+return sdhci_read(opaque, offset, size);
+
+case SDHC_HOSTCTL:
+/*
+ * For a detailed explanation on the following bit
+ * manipulation code see comments in a similar part of
+ * usdhc_write()
+ */
+hostctl = SDHC_DMA_TYPE(s->hostctl) << (8 - 3);
+
+if (s->hostctl & SDHC_CTRL_8BITBUS) {
+hostctl |= ESDHC_CTRL_8BITBUS;
+}
+
+if (s->hostctl & SDHC_CTRL_4BITBUS) {
+hostctl |= ESDHC_CTRL_4BITBUS;
+}
+
+ret  = hostctl;
+ret |= (uint32_t)s->blkgap << 16;
+ret |= (uint32_t)s->wakcon << 24;
+
+break;
+
+case ESDHC_DLL_CTRL:
+case ESDHC_TUNE_CTRL_STATUS:
+case 0x6c:
+case ESDHC_TUNING_CTRL:
+case ESDHC_VENDOR_SPEC:
+case ESDHC_MIX_CTRL:
+case ESDHC_WTMK_LVL:
+ret = 0;
+break;
+}
+
+return ret;
+}
+
+static void
+usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
+{
+SDHCIState *s = SYSBUS_SDHCI(opaque);
+uint8_t hostctl;
+uint32_t value = (uint32_t)val;
+
+switch (offset) {
+case ESDHC_DLL_CTRL:
+case ESDHC_TUNE_CTRL_STATUS:
+case 0x6c:
+case ESDHC_TUNING_CTRL:
+case ESDHC_WTMK_LVL:
+case ESDHC_VENDOR_SPEC:
+break;
+
+case SDHC_HOSTCTL:
+/*
+ * Here's What ESDHCI has at offset 0x28

[Qemu-devel] [PATCH 09/13] imx_fec: Use correct length for packet size

2017-12-11 Thread Andrey Smirnov

Use 'frame_size' instead of 'len' when calling qemu_send_packet().
Failing to do so results in malformed packets being sent when a packet
is fragmented into multiple DMA transactions.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 77d27f763e..6cb9e2e20e 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -556,7 +556,7 @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
 }
 /* Last buffer in frame.  */
 
-qemu_send_packet(qemu_get_queue(s->nic), s->frame, len);
+qemu_send_packet(qemu_get_queue(s->nic), s->frame, frame_size);
 ptr = s->frame;
 
 frame_size = 0;
-- 
2.14.3
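
To illustrate why the accumulated size matters, here is a minimal
sketch of gathering one frame from several Tx fragments. DemoTxDesc,
DEMO_FRAME_MAX and demo_gather_frame() are hypothetical names for this
example only; the real descriptor layout and limits in imx_fec.c
differ. The point is that the length handed to the network layer has to
be the running total, not the length of the last fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FRAME_MAX 2048   /* illustrative limit, not the real one */

/* Simplified Tx descriptor: one fragment of a frame. */
typedef struct {
    const uint8_t *data;
    size_t len;
    int last;                 /* non-zero on the final fragment */
} DemoTxDesc;

/*
 * Copy fragments into 'frame' (at least DEMO_FRAME_MAX bytes) until the
 * one marked 'last'. Returns the total frame size, which is what should
 * be passed on when sending; the per-fragment 'len' would be too short
 * for any frame that spans more than one descriptor.
 */
size_t demo_gather_frame(const DemoTxDesc *descs, size_t n, uint8_t *frame)
{
    size_t frame_size = 0;

    assert(descs != NULL && frame != NULL);

    for (size_t i = 0; i < n; i++) {
        size_t len = descs[i].len;

        /* Clamp so the frame buffer is never overrun. */
        if (frame_size + len > DEMO_FRAME_MAX) {
            len = DEMO_FRAME_MAX - frame_size;
        }
        memcpy(frame + frame_size, descs[i].data, len);
        frame_size += len;

        if (descs[i].last) {
            break;
        }
    }
    return frame_size;
}
```

With two fragments of 3 and 2 bytes, the function reports 5 bytes,
whereas the last fragment's own length would only cover 2 of them.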

[Qemu-devel] [PATCH 10/13] imx_fec: Fix a typo in imx_enet_receive()

2017-12-11 Thread Andrey Smirnov

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 6cb9e2e20e..c1cf7f9c58 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1150,7 +1150,7 @@ static ssize_t imx_enet_receive(NetClientState *nc, const 
uint8_t *buf,
 size += 2;
 }
 
-/* Huge frames are truncted.  */
+/* Huge frames are truncated. */
 if (size > s->regs[ENET_FTRL]) {
 size = s->regs[ENET_FTRL];
 flags |= ENET_BD_TR | ENET_BD_LG;
-- 
2.14.3

[Qemu-devel] [PATCH 13/13] sdhci: Implement write method of ACMD12ERRSTS register

2017-12-11 Thread Andrey Smirnov

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/sd/sdhci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 758af067f9..cb9e0db9fb 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -1139,6 +1139,9 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
unsigned size)
 s->admasysaddr = (s->admasysaddr & (0xULL |
 ((uint64_t)mask << 32))) | ((uint64_t)value << 32);
 break;
+case SDHC_ACMD12ERRSTS:
+MASKED_WRITE(s->acmd12errsts, mask, value);
+break;
 case SDHC_FEAER:
 s->acmd12errsts |= value;
 s->errintsts |= (value >> 16) & s->errintstsen;
-- 
2.14.3
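
For readers unfamiliar with the masked-write pattern, the following is
a small sketch of how a sub-word MMIO write can be folded into a 32-bit
register value. It assumes, as the neighbouring SDHC_ADMASYSADDR case
suggests, that 'mask' selects the bits to preserve and 'value' is
already shifted into place. demo_subword_write() is a hypothetical
helper written for this example, not the MASKED_WRITE macro itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Merge a write of 'size' bytes at 'byte_offset' within a 32-bit
 * register into the current register contents 'reg'.
 */
uint32_t demo_subword_write(uint32_t reg, unsigned byte_offset,
                            unsigned size, uint32_t val)
{
    assert(size >= 1 && size <= 4 && byte_offset + size <= 4);

    unsigned shift = 8 * byte_offset;
    /* Bits outside the accessed bytes are kept. */
    uint32_t mask = ~(uint32_t)(((1ULL << (size * 8)) - 1) << shift);
    /* Bits inside the accessed bytes come from the written value. */
    uint32_t value = (uint32_t)((uint64_t)val << shift) & ~mask;

    return (reg & mask) | value;
}
```

A one-byte write of 0xAB at byte offset 1 into 0x11223344 therefore
only replaces the second-lowest byte.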

[Qemu-devel] [PATCH 07/13] imx_fec: Emulate SHIFT16 in ENETx_RACC

2017-12-11 Thread Andrey Smirnov

This is needed to support the latest Linux kernel driver, which relies
on that functionality.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 23 +++
 include/hw/net/imx_fec.h |  2 ++
 2 files changed, 25 insertions(+)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 6feda18742..825c879a28 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1037,6 +1037,7 @@ static ssize_t imx_enet_receive(NetClientState *nc, const 
uint8_t *buf,
 uint8_t *crc_ptr;
 unsigned int buf_len;
 size_t size = len;
+bool shift16 = s->regs[ENET_RACC] & ENET_RACC_SHIFT16;
 
 FEC_PRINTF("len %d\n", (int)size);
 
@@ -1051,6 +1052,10 @@ static ssize_t imx_enet_receive(NetClientState *nc, 
const uint8_t *buf,
 crc = cpu_to_be32(crc32(~0, buf, size));
 crc_ptr = (uint8_t *) &crc;
 
+if (shift16) {
+size += 2;
+}
+
 /* Huge frames are truncted.  */
 if (size > s->regs[ENET_FTRL]) {
 size = s->regs[ENET_FTRL];
@@ -1087,6 +1092,24 @@ static ssize_t imx_enet_receive(NetClientState *nc, 
const uint8_t *buf,
 buf_len += size - 4;
 }
 buf_addr = bd.data;
+
+if (shift16) {
+/*
+ * If SHIFT16 bit of ENETx_RACC register is set we need to
+ * align the payload to 4-byte boundary.
+ */
+const uint8_t zeros[2] = { 0 };
+
+dma_memory_write(&address_space_memory, buf_addr,
+ zeros, sizeof(zeros));
+
+buf_addr += sizeof(zeros);
+buf_len  -= sizeof(zeros);
+
+/* We only do this once per Ethernet frame */
+shift16 = false;
+}
+
 dma_memory_write(&address_space_memory, buf_addr, buf, buf_len);
 buf += buf_len;
 if (size < 4) {
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index a390d704a6..af0840a0fa 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -170,6 +170,8 @@
 #define ENET_TWFR_TFWR_LENGTH  (6)
 #define ENET_TWFR_STRFWD   (1 << 8)
 
+#define ENET_RACC_SHIFT16  BIT(7)
+
 /* Buffer Descriptor.  */
 typedef struct {
 uint16_t length;
-- 
2.14.3
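
The effect of SHIFT16 can be sketched in isolation. An Ethernet header
is 14 bytes long, so prepending two zero bytes places the payload that
follows it at offset 16, a 4-byte boundary. demo_copy_with_shift16() is
a hypothetical helper for illustration; the patch itself writes into
guest memory through dma_memory_write() rather than a local buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_HLEN 14   /* destination MAC + source MAC + EtherType */

/*
 * Copy 'frame' into 'dst', optionally preceded by two zero bytes.
 * Returns the number of bytes written, or 0 if 'dst' is too small.
 */
size_t demo_copy_with_shift16(uint8_t *dst, size_t dst_len,
                              const uint8_t *frame, size_t frame_len,
                              int shift16)
{
    size_t pad = shift16 ? 2 : 0;

    assert(dst != NULL && frame != NULL);

    if (frame_len + pad > dst_len) {
        return 0;
    }
    memset(dst, 0, pad);                 /* padding goes in once, up front */
    memcpy(dst + pad, frame, frame_len);
    return frame_len + pad;
}
```

With shift16 set, the byte following the Ethernet header ends up at
offset 2 + DEMO_ETH_HLEN = 16 in the destination buffer.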

[Qemu-devel] [PATCH 06/13] imx_fec: Use MIN instead of explicit ternary operator

2017-12-11 Thread Andrey Smirnov

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 50da91bf9e..6feda18742 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1076,7 +1076,7 @@ static ssize_t imx_enet_receive(NetClientState *nc, const 
uint8_t *buf,
   TYPE_IMX_FEC, __func__);
 break;
 }
-buf_len = (size <= s->regs[ENET_MRBR]) ? size : s->regs[ENET_MRBR];
+buf_len = MIN(size, s->regs[ENET_MRBR]);
 bd.length = buf_len;
 size -= buf_len;
 
-- 
2.14.3
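
As a small illustration of what the MIN() expression computes in the
receive loop, the sketch below splits a frame into chunks no larger
than the maximum receive buffer size. DEMO_MIN and
demo_count_buffers() are hypothetical names for this example; 'mrbr'
stands in for the value read from ENET_MRBR.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Number of receive buffers needed for a frame of 'size' bytes. */
size_t demo_count_buffers(size_t size, size_t mrbr)
{
    size_t n = 0;

    assert(mrbr > 0);   /* a zero buffer size would never make progress */

    while (size > 0) {
        size_t buf_len = DEMO_MIN(size, mrbr);

        size -= buf_len;
        n++;
    }
    return n;
}
```

A 1500-byte frame with 512-byte buffers takes three descriptors: two
full ones and a final one holding the remaining 476 bytes.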

[Qemu-devel] [PATCH 05/13] imx_fec: Use ENET_FTRL to determine truncation length

2017-12-11 Thread Andrey Smirnov

Frame truncation length, TRUNC_FL, is determined by the contents of
ENET_FTRL register, so convert the code to use it instead of a
hardcoded constant.

To avoid the case where TRUNC_FL is greater than ENET_MAX_FRAME_SIZE,
increase the value of the latter to its theoretical maximum of 16K.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 4 ++--
 include/hw/net/imx_fec.h | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 56cb72273c..50da91bf9e 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1052,8 +1052,8 @@ static ssize_t imx_enet_receive(NetClientState *nc, const 
uint8_t *buf,
 crc_ptr = (uint8_t *) &crc;
 
 /* Huge frames are truncted.  */
-if (size > ENET_MAX_FRAME_SIZE) {
-size = ENET_MAX_FRAME_SIZE;
+if (size > s->regs[ENET_FTRL]) {
+size = s->regs[ENET_FTRL];
 flags |= ENET_BD_TR | ENET_BD_LG;
 }
 
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index 67993870a2..a390d704a6 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -86,7 +86,6 @@
 #define ENET_TCCR3 393
 #define ENET_MAX   400
 
-#define ENET_MAX_FRAME_SIZE2032
 
 /* EIR and EIMR */
 #define ENET_INT_HB(1 << 31)
@@ -155,6 +154,8 @@
 #define ENET_RCR_NLC   (1 << 30)
 #define ENET_RCR_GRS   (1 << 31)
 
+#define ENET_MAX_FRAME_SIZE(1 << ENET_RCR_MAX_FL_LENGTH)
+
 /* TCR */
 #define ENET_TCR_GTS   (1 << 0)
 #define ENET_TCR_FDEN  (1 << 2)
-- 
2.14.3
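
The truncation rule described above can be sketched as a standalone
helper. DEMO_BD_TR and DEMO_BD_LG are hypothetical stand-ins for the
ENET_BD_TR and ENET_BD_LG descriptor flags (their values here are made
up), and 'trunc_fl' stands in for the value read from the ENET_FTRL
register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BD_TR (1u << 0)   /* illustrative value only */
#define DEMO_BD_LG (1u << 1)   /* illustrative value only */

/*
 * Clamp 'size' to the guest-programmed truncation length and record
 * in '*flags' that the frame was cut short.
 */
size_t demo_apply_trunc_fl(size_t size, uint32_t trunc_fl, unsigned *flags)
{
    assert(flags != NULL);

    if (size > trunc_fl) {
        size = trunc_fl;
        *flags |= DEMO_BD_TR | DEMO_BD_LG;
    }
    return size;
}
```

Because the limit is now a run-time register value rather than a
compile-time constant, any buffer sized by ENET_MAX_FRAME_SIZE has to be
large enough for the biggest value the guest can program, which is why
the constant is raised to 16K.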

[Qemu-devel] [PATCH 03/13] imx_fec: Change queue flushing heuristics

2017-12-11 Thread Andrey Smirnov

In the current implementation, the packet queue flushing logic seems to
suffer from a deadlock-like scenario if a packet is received by the
interface before the Rx ring is initialized by the Guest's driver.
Consider the following sequence of events:

1. A QEMU instance is started against a TAP device on a Linux
   host, running a Linux guest, e.g., something to the effect
   of:

   qemu-system-arm \
  -net nic,model=imx.fec,netdev=lan0 \
  -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
  ... rest of the arguments ...

2. Once QEMU starts, but before the guest reaches the point where
   the FEC driver is done initializing the HW, the Guest, via the TAP
   interface, receives a number of multicast MDNS packets from the
   Host (not necessarily true for every OS, but it happens at
   least on Fedora 25)

3. Receiving a packet in such a state results in
   imx_eth_can_receive() returning '0', which in turn causes
   tap_send() to disable the corresponding event (tap.c:203)

4. Once the Guest's driver reaches the point where it is ready to
   receive packets, it prepares Rx ring descriptors and writes
   ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
   more descriptors are ready. At this point the emulation
   layer does this:

 s->regs[index] = ENET_RDAR_RDAR;
 imx_eth_enable_rx(s);

   which, combined with:

  if (!s->regs[ENET_RDAR]) {
 qemu_flush_queued_packets(qemu_get_queue(s->nic));
  }

   results in the Rx queue never being flushed and the corresponding
   I/O event being left disabled.

To prevent the problem, change the code to always flush packet queue
when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 8b2e4b8ffe..eb034ffd0c 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -533,7 +533,7 @@ static void imx_eth_do_tx(IMXFECState *s)
 }
 }
 
-static void imx_eth_enable_rx(IMXFECState *s)
+static void imx_eth_enable_rx(IMXFECState *s, bool flush)
 {
 IMXFECBufDesc bd;
 bool rx_ring_full;
@@ -544,7 +544,7 @@ static void imx_eth_enable_rx(IMXFECState *s)
 
 if (rx_ring_full) {
 FEC_PRINTF("RX buffer full\n");
-} else if (!s->regs[ENET_RDAR]) {
+} else if (flush) {
 qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
@@ -807,7 +807,7 @@ static void imx_eth_write(void *opaque, hwaddr offset, 
uint64_t value,
 if (s->regs[ENET_ECR] & ENET_ECR_ETHEREN) {
 if (!s->regs[index]) {
 s->regs[index] = ENET_RDAR_RDAR;
-imx_eth_enable_rx(s);
+imx_eth_enable_rx(s, true);
 }
 } else {
 s->regs[index] = 0;
@@ -930,7 +930,7 @@ static int imx_eth_can_receive(NetClientState *nc)
 
 FEC_PRINTF("\n");
 
-return s->regs[ENET_RDAR] ? 1 : 0;
+return !!s->regs[ENET_RDAR];
 }
 
 static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
@@ -1020,7 +1020,7 @@ static ssize_t imx_fec_receive(NetClientState *nc, const 
uint8_t *buf,
 }
 }
 s->rx_descriptor = addr;
-imx_eth_enable_rx(s);
+imx_eth_enable_rx(s, false);
 imx_eth_update(s);
 return len;
 }
@@ -1116,7 +1116,7 @@ static ssize_t imx_enet_receive(NetClientState *nc, const 
uint8_t *buf,
 }
 }
 s->rx_descriptor = addr;
-imx_eth_enable_rx(s);
+imx_eth_enable_rx(s, false);
 imx_eth_update(s);
 return len;
 }
-- 
2.14.3
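
The sequence described in the commit message can be modelled with a
few lines of C. This is only a toy model under simplifying assumptions:
DemoNic and the demo_* functions are hypothetical, 'queued' stands for
packets the backend is holding back, and the ring-full check is reduced
to a flag. It shows why testing the register inside the helper never
flushes when the caller has already set it.

```c
#include <assert.h>

typedef struct {
    int rdar;        /* models ENET_RDAR: non-zero when Rx is armed */
    int ring_full;   /* models the "no empty descriptor" condition */
    int queued;      /* packets held back by the backend */
    int delivered;   /* packets handed to the guest so far */
} DemoNic;

static void demo_flush(DemoNic *n)
{
    n->delivered += n->queued;
    n->queued = 0;
}

/* Old heuristic: flush only if the register still reads as zero. */
static void demo_enable_rx_old(DemoNic *n)
{
    if (!n->ring_full && !n->rdar) {
        demo_flush(n);
    }
    n->rdar = !n->ring_full;
}

/* New heuristic: the caller states whether a flush is wanted. */
static void demo_enable_rx_new(DemoNic *n, int flush)
{
    if (!n->ring_full && flush) {
        demo_flush(n);
    }
    n->rdar = !n->ring_full;
}

/* Guest write to RDAR: the register is set before the helper runs. */
void demo_guest_writes_rdar(DemoNic *n, int use_new)
{
    assert(n != 0);

    if (!n->rdar) {
        n->rdar = 1;
        if (use_new) {
            demo_enable_rx_new(n, 1);
        } else {
            demo_enable_rx_old(n);
        }
    }
}
```

Starting from three queued packets and an unarmed receiver, the old
path leaves all three queued, while the new path delivers them on the
0 -> ENET_RDAR_RDAR transition.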

[Qemu-devel] [PATCH 00/13] i.MX FEC and SD changes

2017-12-11 Thread Andrey Smirnov

Hi everyone,

This patchset is a spin-off from the original i.MX7 support submission
found here [1], containing all of the patches that are more or less
agreed upon and are ready (hopefully!) for inclusion.

Changes since [1]:

- Rx buffer in FEC was moved from stack to heap to allow
  worry-free expansion to 16K limit.

- Added more comments explaining the rather convoluted bit-moving
  in the eSDHC emulation code

- Triple Tx ring DMA "VMState" code was changed to follow
  Peter's recommendations (avoiding the need to increment the
  version_id)

- FSL_IMX25_FEC_SIZE is used as a size of FEC's register file

- Removed leftover code from "imx_fec: Change queue flushing
  heuristics"

[1] https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00045.html


Andrey Smirnov (13):
  imx_fec: Do not link to netdev
  imx_fec: Refactor imx_eth_enable_rx()
  imx_fec: Change queue flushing heuristics
  imx_fec: Move Tx frame buffer away from the stack
  imx_fec: Use ENET_FTRL to determine truncation length
  imx_fec: Use MIN instead of explicit ternary operator
  imx_fec: Emulate SHIFT16 in ENETx_RACC
  imx_fec: Add support for multiple Tx DMA rings
  imx_fec: Use correct length for packet size
  imx_fec: Fix a typo in imx_enet_receive()
  imx_fec: Reserve full FSL_IMX25_FEC_SIZE page for the register file
  sdhci: Add i.MX specific subtype of SDHCI
  sdhci: Implement write method of ACMD12ERRSTS register

 hw/arm/fsl-imx6.c  |   1 +
 hw/net/imx_fec.c   | 210 -
 hw/sd/sdhci-internal.h |  19 
 hw/sd/sdhci.c  | 231 -
 include/hw/arm/fsl-imx25.h |   1 -
 include/hw/net/imx_fec.h   |  27 +-
 include/hw/sd/sdhci.h  |   8 ++
 7 files changed, 444 insertions(+), 53 deletions(-)

-- 
2.14.3

[Qemu-devel] [PATCH 02/13] imx_fec: Refactor imx_eth_enable_rx()

2017-12-11 Thread Andrey Smirnov

Refactor imx_eth_enable_rx() to have a more meaningful variable name
than 'tmp' and to reduce the number of logical negations done.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/net/imx_fec.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 88b4b049d7..8b2e4b8ffe 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -536,19 +536,19 @@ static void imx_eth_do_tx(IMXFECState *s)
 static void imx_eth_enable_rx(IMXFECState *s)
 {
 IMXFECBufDesc bd;
-bool tmp;
+bool rx_ring_full;
 
 imx_fec_read_bd(&bd, s->rx_descriptor);
 
-tmp = ((bd.flags & ENET_BD_E) != 0);
+rx_ring_full = !(bd.flags & ENET_BD_E);
 
-if (!tmp) {
+if (rx_ring_full) {
 FEC_PRINTF("RX buffer full\n");
 } else if (!s->regs[ENET_RDAR]) {
 qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
-s->regs[ENET_RDAR] = tmp ? ENET_RDAR_RDAR : 0;
+s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
 }
 
 static void imx_eth_reset(DeviceState *d)
-- 
2.14.3

[Qemu-devel] [PATCH 01/13] imx_fec: Do not link to netdev

2017-12-11 Thread Andrey Smirnov

Binding to a particular netdev doesn't seem to belong to this layer
and should probably be done as a part of board or SoC specific code.

Convert all of the users of this IP block to use
qdev_set_nic_properties() instead.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/arm/fsl-imx6.c | 1 +
 hw/net/imx_fec.c  | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index 26fd214004..2ed7146c52 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -385,6 +385,7 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 spi_table[i].irq));
 }
 
+qdev_set_nic_properties(DEVICE(&s->eth), &nd_table[0]);
 object_property_set_bool(OBJECT(&s->eth), true, "realized", &err);
 if (err) {
 error_propagate(errp, err);
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 90e6ee35ba..88b4b049d7 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -1171,8 +1171,6 @@ static void imx_eth_realize(DeviceState *dev, Error 
**errp)
 
 qemu_macaddr_default_if_unset(&s->conf.macaddr);
 
-s->conf.peers.ncs[0] = nd_table[0].netdev;
-
 s->nic = qemu_new_nic(&imx_eth_net_info, &s->conf,
   object_get_typename(OBJECT(dev)),
   DEVICE(dev)->id, s);
-- 
2.14.3

Re: [Qemu-devel] [PATCH-2.12 v2 2/3] xilinx_spips: Set all of the reset values

2017-12-11 Thread francisco iglesias

On 11 December 2017 at 18:27, Alistair Francis 
wrote:

> On Wed, Dec 6, 2017 at 3:39 PM, francisco iglesias
>  wrote:
> > Hi Alistair,
> >
> > On 6 December 2017 at 23:22, Alistair Francis <
> alistair.fran...@xilinx.com>
> > wrote:
> >>
> >> Following the ZynqMP register spec let's ensure that all reset values
> >> are set.
> >>
> >> Signed-off-by: Alistair Francis 
> >> ---
> >> V2:
> >>  - Don't bother double setting registers
> >>
> >>  hw/ssi/xilinx_spips.c | 35 ++-
> >>  include/hw/ssi/xilinx_spips.h |  2 +-
> >>  2 files changed, 31 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> >> index 899db814ee..b8182cfd74 100644
> >> --- a/hw/ssi/xilinx_spips.c
> >> +++ b/hw/ssi/xilinx_spips.c
> >> @@ -66,6 +66,7 @@
> >>
> >>  /* interrupt mechanism */
> >>  #define R_INTR_STATUS   (0x04 / 4)
> >> +#define R_INTR_STATUS_RESET (0x104)
> >>  #define R_INTR_EN   (0x08 / 4)
> >>  #define R_INTR_DIS  (0x0C / 4)
> >>  #define R_INTR_MASK (0x10 / 4)
> >> @@ -102,6 +103,9 @@
> >>  #define R_SLAVE_IDLE_COUNT  (0x24 / 4)
> >>  #define R_TX_THRES  (0x28 / 4)
> >>  #define R_RX_THRES  (0x2C / 4)
> >> +#define R_GPIO  (0x30 / 4)
> >> +#define R_LPBK_DLY_ADJ  (0x38 / 4)
> >> +#define R_LPBK_DLY_ADJ_RESET (0x33)
> >>  #define R_TXD1  (0x80 / 4)
> >>  #define R_TXD2  (0x84 / 4)
> >>  #define R_TXD3  (0x88 / 4)
> >> @@ -140,8 +144,12 @@
> >>  #define R_GQSPI_IER (0x108 / 4)
> >>  #define R_GQSPI_IDR (0x10c / 4)
> >>  #define R_GQSPI_IMR (0x110 / 4)
> >> +#define R_GQSPI_IMR_RESET   (0xfbe)
> >>  #define R_GQSPI_TX_THRESH   (0x128 / 4)
> >>  #define R_GQSPI_RX_THRESH   (0x12c / 4)
> >> +#define R_GQSPI_GPIO_THRESH (0x130 / 4)
> >
> >
> > According to doc (mentioned in patch 0/3) the address above, 0x130, is
> > "GQSPI GPIO for Write Protect". Should we rename the define to
> R_GQSPI_GPIO?
> > (Based on doc and that the other WP is named R_GPIO).
>
> Hmmm... I auto generated these names, so somewhere internally we call
> it GQSPI_GPIO_THRESH, but apparently not in the documentation.
>
> All the other auto generated code (headers for standalone
> applications) will have a similar auto generated name, so I'm tempted
> to keep it as this.


Hi Alistair,

I see your point here and since autogenerated is less error prone and the
rest of the patch is looking good:

Reviewed-by: Francisco Iglesias 

(If you decide to go on and rename the define you can keep my reviewed-by
tag).

Best regards,
Francisco Iglesias



> Otherwise the register is technically just called
> GQSPI_GPIO, according to the documentation. That doesn't seem to clash
> with anything else.
>
>
> I think changing it to GQSPI_GPIO makes the most sense then. That way
> it matches the documentation and is still searchably close to the auto
> generated string.
>
>
> Good catch!
>
> Alistair
>
> >
> > Best regards,
> > Francisco Iglesias
> >
> >>
> >> +#define R_GQSPI_LPBK_DLY_ADJ (0x138 / 4)
> >> +#define R_GQSPI_LPBK_DLY_ADJ_RESET (0x33)
> >>  #define R_GQSPI_CNFG(0x100 / 4)
> >>  FIELD(GQSPI_CNFG, MODE_EN, 30, 2)
> >>  FIELD(GQSPI_CNFG, GEN_FIFO_START_MODE, 29, 1)
> >> @@ -177,8 +185,16 @@
> >>  FIELD(GQSPI_GF_SNAPSHOT, EXPONENT, 9, 1)
> >>  FIELD(GQSPI_GF_SNAPSHOT, DATA_XFER, 8, 1)
> >>  FIELD(GQSPI_GF_SNAPSHOT, IMMEDIATE_DATA, 0, 8)
> >> -#define R_GQSPI_MOD_ID(0x168 / 4)
> >> -#define R_GQSPI_MOD_ID_VALUE  0x010A
> >> +#define R_GQSPI_MOD_ID(0x1fc / 4)
> >> +#define R_GQSPI_MOD_ID_RESET  (0x10a)
> >> +
> >> +#define R_QSPIDMA_DST_CTRL (0x80c / 4)
> >> +#define R_QSPIDMA_DST_CTRL_RESET   (0x803ffa00)
> >> +#define R_QSPIDMA_DST_I_MASK   (0x820 / 4)
> >> +#define R_QSPIDMA_DST_I_MASK_RESET (0xfe)
> >> +#define R_QSPIDMA_DST_CTRL2(0x824 / 4)
> >> +#define R_QSPIDMA_DST_CTRL2_RESET  (0x081bfff8)
> >> +
> >>  /* size of TXRX FIFOs */
> >>  #define RXFF_A  (128)
> >>  #define TXFF_A  (128)
> >> @@ -351,11 +367,20 @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
> >>  fifo8_reset(&s->rx_fifo_g);
> >>  fifo8_reset(&s->rx_fifo_g);
> >>  fifo32_reset(&s->fifo_g);
> >> +s->regs[R_INTR_STATUS] = R_INTR_STATUS_RESET;
> >> +s->regs[R_GPIO] = 1;
> >> +s->regs[R_LPBK_DLY_ADJ] = R_LPBK_DLY_ADJ_RESET;
> >> +s->regs[R_GQSPI_GFIFO_THRESH] = 0x10;
> >> +s->regs[R_MOD_ID] = 0x01090101;
> >> +s->regs[R_GQSPI_IMR] = R_GQSPI_IMR_RESET;
> >>  s->regs[R_GQSPI_TX_THRESH] = 1;
> >>  s->regs[R_GQSPI_RX_THRESH] = 1;
> >> -s->regs[R_GQSPI_GFIFO_THRESH] = 1;
> >> -s->regs[R_GQSPI_IMR] = GQSPI_IXR_MASK;
> >> -s->regs[R_MOD_ID] = 0x01090101;
> >> +s->regs[R_GQSPI_GPIO_THRESH] = 1;
> >> +s->regs[R_GQSPI_LPBK_DLY_ADJ] =

Re: [Qemu-devel] [PATCH for-2.11?] target/arm: Generate UNDEF for 32-bit Thumb2 insns

2017-12-11 Thread Peter Maydell

On 11 December 2017 at 19:42, Emilio G. Cota  wrote:
> On Mon, Dec 11, 2017 at 17:32:48 +, Peter Maydell wrote:
>> Thanks. I think I have come down on the side of putting this into
>> 2.11, so rolling an rc5 today, and delaying the final release
>> a day to Wednesday.
>
> Glad to see it's in -rc5 -- thanks for fixing this so quickly!
>
> Again, apologies for not having caught this earlier ;-(

It's my own fault really -- my extremely ad-hoc approach
to testing for Arm guests was bound to come back and
bite me sooner or later.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v1 for-2-12 04/15] s390x/flic: simplify flic initialization

2017-12-11 Thread David Hildenbrand

On 11.12.2017 18:17, Cornelia Huck wrote:
> On Mon, 11 Dec 2017 14:47:29 +0100
> David Hildenbrand  wrote:
> 
>> This makes it clearer which device is used for which accelerator.
>>
>> Signed-off-by: David Hildenbrand 
>> ---
>>  hw/intc/s390_flic.c  |  9 +++--
>>  hw/intc/s390_flic_kvm.c  | 12 
>>  include/hw/s390x/s390_flic.h |  9 -
>>  3 files changed, 7 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
>> index 6eaf178d79..a78bdf1d90 100644
>> --- a/hw/intc/s390_flic.c
>> +++ b/hw/intc/s390_flic.c
>> @@ -40,11 +40,16 @@ void s390_flic_init(void)
>>  {
>>  DeviceState *dev;
>>  
>> -dev = s390_flic_kvm_create();
>> -if (!dev) {
>> +if (kvm_enabled()) {
>> +dev = qdev_create(NULL, TYPE_KVM_S390_FLIC);
>> +object_property_add_child(qdev_get_machine(), TYPE_KVM_S390_FLIC,
>> +  OBJECT(dev), NULL);
>> +} else if (tcg_enabled()) {
>>  dev = qdev_create(NULL, TYPE_QEMU_S390_FLIC);
>>  object_property_add_child(qdev_get_machine(), TYPE_QEMU_S390_FLIC,
>>OBJECT(dev), NULL);
> 
> Can you use TYPE_S390_FLIC_COMMON for attaching the flic to the machine?

I suggest doing that in a separate patch. (I remember that changing the
name should not harm migration).

> 
>> +} else {
>> +g_assert_not_reached();
> 
> Checking for tcg_enabled() explicitly does not seem the common pattern,
> although it is fine with me (I doubt we'll support other accelerators
> for s390x in the foreseeable future).

Indeed, I can drop that.

> 
>>  }
>>  qdev_init_nofail(dev);
>>  }
> 
> Do we want to switch to the same pattern for the storage attribute
> device as well?

Yes, can have a look, thanks!

> 
> Change looks fine to me.
> 


-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH v1 for-2-12 09/15] s390x/tcg: implement TEST PENDING INTERRUPTION

2017-12-11 Thread David Hildenbrand

On 11.12.2017 19:01, Cornelia Huck wrote:
> On Mon, 11 Dec 2017 14:47:34 +0100
> David Hildenbrand  wrote:
> 
>> Use s390_cpu_virt_mem_write() so we can actually revert what we did
>> (re-inject the dequeued IO interrupt).
>>
>> Signed-off-by: David Hildenbrand 
>> ---
>>  target/s390x/helper.h  |  1 +
>>  target/s390x/insn-data.def |  1 +
>>  target/s390x/misc_helper.c | 53 ++
>>  target/s390x/translate.c   |  8 +++
>>  4 files changed, 63 insertions(+)
>>
> 
>> +uint32_t HELPER(tpi)(CPUS390XState *env, uint64_t addr)
>> +{
>> +const uintptr_t ra = GETPC();
>> +S390CPU *cpu = s390_env_get_cpu(env);
>> +QEMUS390FLICState *flic = QEMU_S390_FLIC(s390_get_flic());
>> +QEMUS390FlicIO *io = NULL;
>> +LowCore *lowcore;
>> +
>> +if (addr & 0x3) {
>> +s390_program_interrupt(env, PGM_SPECIFICATION, 4, ra);
>> +}
>> +
>> +qemu_mutex_lock_iothread();
>> +io = qemu_s390_flic_dequeue_io(flic, env->cregs[6]);
>> +if (!io) {
>> +qemu_mutex_unlock_iothread();
>> +return 0;
>> +}
>> +
>> +if (addr) {
>> +struct {
>> +uint16_t id;
>> +uint16_t nr;
>> +uint32_t parm;
>> +} tmp = {
>> +.id = cpu_to_be16(io->id),
>> +.nr = cpu_to_be16(io->nr),
>> +.parm = cpu_to_be32(io->parm),
>> +};
> 
> That's a two-word interruption code; can you call this something better
> than 'tmp'?

IMHO from the context we have here it should be pretty clear what is
happening. I mean we are defining and initializing the temporary data
structure.

But I can change the name if you can come up with a catchy variable name. :)

Thanks!

> 
>> +
>> +if (s390_cpu_virt_mem_write(cpu, addr, 0, &tmp, sizeof(tmp))) {
>> +/* writing failed, reinject and properly clean up */
>> +s390_io_interrupt(io->id, io->nr, io->parm, io->word);
>> +qemu_mutex_unlock_iothread();
>> +g_free(io);
>> +s390_cpu_virt_mem_handle_exc(cpu, ra);
>> +return 0;
>> +}
>> +} else {
>> +/* no protection applies */
>> +lowcore = cpu_map_lowcore(env);
>> +lowcore->subchannel_id = cpu_to_be16(io->id);
>> +lowcore->subchannel_nr = cpu_to_be16(io->nr);
>> +lowcore->io_int_parm = cpu_to_be32(io->parm);
>> +lowcore->io_int_word = cpu_to_be32(io->word);
>> +cpu_unmap_lowcore(lowcore);
>> +}
>> +
>> +g_free(io);
>> +qemu_mutex_unlock_iothread();
>> +return 1;
>> +}
>> +
>>  void HELPER(tsch)(CPUS390XState *env, uint64_t r1, uint64_t inst)
>>  {
>>  S390CPU *cpu = s390_env_get_cpu(env);


-- 

Thanks,

David / dhildenb
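
On the naming question above: one way to make the intent obvious is to name the helper after what it stores. The following is a minimal standalone sketch, not the code from the patch. The name demo_store_io_int_code and the byte-wise stores are assumptions made for illustration; the patch itself uses a local struct together with cpu_to_be16()/cpu_to_be32(). The field order (subchannel id, subchannel number, interruption parameter) follows the lowcore assignments quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: length in bytes of the two-word interruption code. */
enum { DEMO_IO_INT_CODE_LEN = 8 };

/*
 * Serialize the two-word I/O interruption code in big-endian byte order,
 * independent of host endianness:
 *   bytes 0-1: subchannel id
 *   bytes 2-3: subchannel number
 *   bytes 4-7: interruption parameter
 */
static void demo_store_io_int_code(uint8_t out[DEMO_IO_INT_CODE_LEN],
                                   uint16_t id, uint16_t nr, uint32_t parm)
{
    assert(out != NULL);
    out[0] = (uint8_t)(id >> 8);
    out[1] = (uint8_t)(id & 0xff);
    out[2] = (uint8_t)(nr >> 8);
    out[3] = (uint8_t)(nr & 0xff);
    out[4] = (uint8_t)(parm >> 24);
    out[5] = (uint8_t)((parm >> 16) & 0xff);
    out[6] = (uint8_t)((parm >> 8) & 0xff);
    out[7] = (uint8_t)(parm & 0xff);
}
```

Writing the bytes explicitly avoids any dependence on struct padding or host byte order, at the cost of being more verbose than the struct initializer in the patch.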

[Qemu-devel] [PATCH v3 7/8] vhost: Remove old vhost_set_memory etc

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Remove the old update mechanism, vhost_set_memory, and the functions
it uses and the memory_changed flags we no longer use.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 254 --
 include/hw/virtio/vhost.h |   3 -
 2 files changed, 257 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c00d82f96a..0bf6fb0577 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -156,160 +156,6 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
 }
 }
 
-/* Assign/unassign. Keep an unsorted array of non-overlapping
- * memory regions in dev->mem. */
-static void vhost_dev_unassign_memory(struct vhost_dev *dev,
-  uint64_t start_addr,
-  uint64_t size)
-{
-int from, to, n = dev->mem->nregions;
-/* Track overlapping/split regions for sanity checking. */
-int overlap_start = 0, overlap_end = 0, overlap_middle = 0, split = 0;
-
-for (from = 0, to = 0; from < n; ++from, ++to) {
-struct vhost_memory_region *reg = dev->mem->regions + to;
-uint64_t reglast;
-uint64_t memlast;
-uint64_t change;
-
-/* clone old region */
-if (to != from) {
-memcpy(reg, dev->mem->regions + from, sizeof *reg);
-}
-
-/* No overlap is simple */
-if (!ranges_overlap(reg->guest_phys_addr, reg->memory_size,
-start_addr, size)) {
-continue;
-}
-
-/* Split only happens if supplied region
- * is in the middle of an existing one. Thus it can not
- * overlap with any other existing region. */
-assert(!split);
-
-reglast = range_get_last(reg->guest_phys_addr, reg->memory_size);
-memlast = range_get_last(start_addr, size);
-
-/* Remove whole region */
-if (start_addr <= reg->guest_phys_addr && memlast >= reglast) {
---dev->mem->nregions;
---to;
-++overlap_middle;
-continue;
-}
-
-/* Shrink region */
-if (memlast >= reglast) {
-reg->memory_size = start_addr - reg->guest_phys_addr;
-assert(reg->memory_size);
-assert(!overlap_end);
-++overlap_end;
-continue;
-}
-
-/* Shift region */
-if (start_addr <= reg->guest_phys_addr) {
-change = memlast + 1 - reg->guest_phys_addr;
-reg->memory_size -= change;
-reg->guest_phys_addr += change;
-reg->userspace_addr += change;
-assert(reg->memory_size);
-assert(!overlap_start);
-++overlap_start;
-continue;
-}
-
-/* This only happens if supplied region
- * is in the middle of an existing one. Thus it can not
- * overlap with any other existing region. */
-assert(!overlap_start);
-assert(!overlap_end);
-assert(!overlap_middle);
-/* Split region: shrink first part, shift second part. */
-memcpy(dev->mem->regions + n, reg, sizeof *reg);
-reg->memory_size = start_addr - reg->guest_phys_addr;
-assert(reg->memory_size);
-change = memlast + 1 - reg->guest_phys_addr;
-reg = dev->mem->regions + n;
-reg->memory_size -= change;
-assert(reg->memory_size);
-reg->guest_phys_addr += change;
-reg->userspace_addr += change;
-/* Never add more than 1 region */
-assert(dev->mem->nregions == n);
-++dev->mem->nregions;
-++split;
-}
-}
-
-/* Called after unassign, so no regions overlap the given range. */
-static void vhost_dev_assign_memory(struct vhost_dev *dev,
-uint64_t start_addr,
-uint64_t size,
-uint64_t uaddr)
-{
-int from, to;
-struct vhost_memory_region *merged = NULL;
-for (from = 0, to = 0; from < dev->mem->nregions; ++from, ++to) {
-struct vhost_memory_region *reg = dev->mem->regions + to;
-uint64_t prlast, urlast;
-uint64_t pmlast, umlast;
-uint64_t s, e, u;
-
-/* clone old region */
-if (to != from) {
-memcpy(reg, dev->mem->regions + from, sizeof *reg);
-}
-prlast = range_get_last(reg->guest_phys_addr, reg->memory_size);
-pmlast = range_get_last(start_addr, size);
-urlast = range_get_last(reg->userspace_addr, reg->memory_size);
-umlast = range_get_last(uaddr, size);
-
-/* check for overlapping regions: should never happen. */
-assert(prlast < start_addr || pmlast < reg->guest_phys_addr);
-/* Not an adjacent or overlapping region - do not merge. */
-if ((prlast + 1 != start_addr || urlast + 1 != uaddr) &&
-

[Qemu-devel] [PATCH v3 8/8] vhost: Move mem_sections maintenance into commit/update routines

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Move the maintenance of mem_sections into the vhost_update_mem routines;
this removes the need for the vhost_region_add/del callbacks.

Suggested-by: Igor Mammedov 
  (and mostly written by Igor!)

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 58 +++
 1 file changed, 16 insertions(+), 42 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 0bf6fb0577..5d3f921be5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -408,6 +408,13 @@ static int vhost_update_mem_cb(MemoryRegionSection *mrs, void *opaque)
 if (!vhost_section(mrs)) {
 return 0;
 }
+++vtmp->dev->n_mem_sections;
+vtmp->dev->mem_sections = g_renew(MemoryRegionSection,
+  vtmp->dev->mem_sections,
+  vtmp->dev->n_mem_sections);
+vtmp->dev->mem_sections[vtmp->dev->n_mem_sections - 1] = *mrs;
+memory_region_ref(mrs->mr);
+
 mrs_size = int128_get64(mrs->size);
 mrs_gpa  = mrs->offset_within_address_space;
 mrs_host = (uintptr_t)memory_region_get_ram_ptr(mrs->mr) +
@@ -461,6 +468,7 @@ static int vhost_update_mem_cb(MemoryRegionSection *mrs, void *opaque)
 static int vhost_update_mem(struct vhost_dev *dev, bool *changed)
 {
 int res;
+unsigned i;
 struct vhost_update_mem_tmp vtmp;
 size_t mem_size;
 vtmp.regions = 0;
@@ -469,6 +477,14 @@ static int vhost_update_mem(struct vhost_dev *dev, bool *changed)
 
 trace_vhost_update_mem();
 *changed = false;
+/* Clear out the section list, it'll get rebuilt */
+for (i = 0; i < dev->n_mem_sections; i++) {
+memory_region_unref(dev->mem_sections[i].mr);
+}
+g_free(dev->mem_sections);
+dev->mem_sections = NULL;
+dev->n_mem_sections = 0;
+
 res = address_space_iterate(&address_space_memory,
 vhost_update_mem_cb, &vtmp);
 if (res) {
@@ -553,46 +569,6 @@ static void vhost_commit(MemoryListener *listener)
 }
 }
 
-static void vhost_region_add(MemoryListener *listener,
- MemoryRegionSection *section)
-{
-struct vhost_dev *dev = container_of(listener, struct vhost_dev,
- memory_listener);
-
-if (!vhost_section(section)) {
-return;
-}
-
-++dev->n_mem_sections;
-dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
-dev->n_mem_sections);
-dev->mem_sections[dev->n_mem_sections - 1] = *section;
-memory_region_ref(section->mr);
-}
-
-static void vhost_region_del(MemoryListener *listener,
- MemoryRegionSection *section)
-{
-struct vhost_dev *dev = container_of(listener, struct vhost_dev,
- memory_listener);
-int i;
-
-if (!vhost_section(section)) {
-return;
-}
-
-memory_region_unref(section->mr);
-for (i = 0; i < dev->n_mem_sections; ++i) {
-if (dev->mem_sections[i].offset_within_address_space
-== section->offset_within_address_space) {
---dev->n_mem_sections;
-memmove(&dev->mem_sections[i], &dev->mem_sections[i+1],
-(dev->n_mem_sections - i) * sizeof(*dev->mem_sections));
-break;
-}
-}
-}
-
 static void vhost_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
 struct vhost_iommu *iommu = container_of(n, struct vhost_iommu, n);
@@ -1165,8 +1141,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
 
 hdev->memory_listener = (MemoryListener) {
 .commit = vhost_commit,
-.region_add = vhost_region_add,
-.region_del = vhost_region_del,
 .region_nop = vhost_region_nop,
 .log_start = vhost_log_start,
 .log_stop = vhost_log_stop,
-- 
2.14.3

[Qemu-devel] [PATCH v3 6/8] vhost: Compare and copy updated region data into device state

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Compare the temporary region data with the original, and if it's
different update the original in the device state.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/trace-events |  2 ++
 hw/virtio/vhost.c  | 19 +--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 92fadec192..fac89aaba5 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -2,6 +2,8 @@
 
 # hw/virtio/vhost.c
 vhost_section(const char *name, int r) "%s:%d"
+vhost_update_mem(void) ""
+vhost_update_mem_changed(void) ""
 vhost_update_mem_cb(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
 vhost_update_mem_cb_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
 
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 646a3480c1..c00d82f96a 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -707,10 +707,12 @@ static int vhost_update_mem(struct vhost_dev *dev, bool *changed)
 {
 int res;
 struct vhost_update_mem_tmp vtmp;
+size_t mem_size;
 vtmp.regions = 0;
 vtmp.nregions = 0;
 vtmp.dev = dev;
 
+trace_vhost_update_mem();
 *changed = false;
 res = address_space_iterate(&address_space_memory,
 vhost_update_mem_cb, &vtmp);
@@ -718,8 +720,21 @@ static int vhost_update_mem(struct vhost_dev *dev, bool *changed)
 goto out;
 }
 
-/* TODO */
-*changed = dev->mem_changed_start_addr < dev->mem_changed_end_addr;
+mem_size = offsetof(struct vhost_memory, regions) +
+   (vtmp.nregions + 1) * sizeof dev->mem->regions[0];
+
+if (vtmp.nregions != dev->mem->nregions ||
+   memcmp(vtmp.regions, dev->mem->regions, mem_size)) {
+*changed = true;
+/* Update the main regions list from our tmp */
+dev->mem = g_realloc(dev->mem, mem_size);
+dev->mem->nregions = vtmp.nregions;
+memcpy(dev->mem->regions, vtmp.regions,
+   vtmp.nregions * sizeof dev->mem->regions[0]);
+used_memslots = vtmp.nregions;
+trace_vhost_update_mem_changed();
+}
+
 out:
 g_free(vtmp.regions);
 return res;
-- 
2.14.3
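
The compare-then-copy step described above can be sketched in isolation as follows. This is a simplified model, not the code from the patch: DemoMemRegion, DemoTable and demo_sync_table are hypothetical names, plain malloc/free stand in for the glib allocators, and the comparison here deliberately covers only the region entries (nregions * sizeof entry), which is an assumption of this sketch rather than a description of what the patch compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified region entry; three 64-bit fields, so no padding. */
typedef struct {
    uint64_t gpa;
    uint64_t size;
    uint64_t host;
} DemoMemRegion;

/* Hypothetical stand-in for the region table kept in the device state. */
typedef struct {
    uint32_t nregions;
    DemoMemRegion *regions;
} DemoTable;

/*
 * Replace the contents of *cur with a copy of tmp[0..tmp_n) if they differ.
 * *changed reports whether an update happened.
 * Returns 0 on success, -1 on allocation failure (cur is left untouched).
 */
static int demo_sync_table(DemoTable *cur, const DemoMemRegion *tmp,
                           uint32_t tmp_n, bool *changed)
{
    size_t bytes = (size_t)tmp_n * sizeof(*tmp);
    DemoMemRegion *copy = NULL;

    assert(cur != NULL && changed != NULL);
    *changed = false;

    /* Same count and same bytes: nothing to do. */
    if (cur->nregions == tmp_n &&
        (tmp_n == 0 || memcmp(cur->regions, tmp, bytes) == 0)) {
        return 0;
    }

    if (tmp_n) {
        copy = malloc(bytes);
        if (!copy) {
            return -1;
        }
        memcpy(copy, tmp, bytes);
    }

    free(cur->regions);
    cur->regions = copy;
    cur->nregions = tmp_n;
    *changed = true;
    return 0;
}
```

Checking the count first keeps the memcmp length valid for both buffers; a bytewise compare is only safe here because the entry type has no padding.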

[Qemu-devel] [PATCH v3 4/8] vhost: New memory update functions

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

vhost_update_mem and vhost_update_mem_cb will replace the existing update
mechanism. They make use of the FlatView we have now to make the update simpler.
This commit just adds the basic structure.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 51 ++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c7ce7baf9b..6d720036db 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -634,11 +634,51 @@ static void vhost_begin(MemoryListener *listener)
 dev->mem_changed_start_addr = -1;
 }
 
+struct vhost_update_mem_tmp {
+struct vhost_dev   *dev;
+uint32_t nregions;
+struct vhost_memory_region *regions;
+};
+
+/* Called for each MRS from vhost_update_mem */
+static int vhost_update_mem_cb(MemoryRegionSection *mrs, void *opaque)
+{
+if (!vhost_section(mrs)) {
+return 0;
+}
+
+/* TODO */
+return 0;
+}
+
+static int vhost_update_mem(struct vhost_dev *dev, bool *changed)
+{
+int res;
+struct vhost_update_mem_tmp vtmp;
+vtmp.regions = 0;
+vtmp.nregions = 0;
+vtmp.dev = dev;
+
+*changed = false;
+res = address_space_iterate(&address_space_memory,
+vhost_update_mem_cb, &vtmp);
+if (res) {
+goto out;
+}
+
+/* TODO */
+*changed = dev->mem_changed_start_addr < dev->mem_changed_end_addr;
+out:
+g_free(vtmp.regions);
+return res;
+}
+
 static void vhost_commit(MemoryListener *listener)
 {
 struct vhost_dev *dev = container_of(listener, struct vhost_dev,
  memory_listener);
 uint64_t log_size;
+bool changed;
 int r;
 
 if (!dev->memory_changed) {
@@ -647,7 +687,12 @@ static void vhost_commit(MemoryListener *listener)
 if (!dev->started) {
 return;
 }
-if (dev->mem_changed_start_addr > dev->mem_changed_end_addr) {
+if (vhost_update_mem(dev, &changed)) {
+return;
+}
+
+if (!changed) {
+/* None of the mappings we care about changed */
 return;
 }
 
@@ -1523,6 +1568,7 @@ void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 {
 int i, r;
+bool changed;
 
 /* should only be called after backend is connected */
 assert(hdev->vhost_ops);
@@ -1535,6 +1581,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 goto fail_features;
 }
 
+if (vhost_update_mem(hdev, &changed)) {
+goto fail_mem;
+}
 if (vhost_dev_has_iommu(hdev)) {
 memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
 }
-- 
2.14.3

[Qemu-devel] [PATCH v3 2/8] vhost: Move log_dirty check

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Move the log_dirty check into vhost_section.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/trace-events |  3 +++
 hw/virtio/vhost.c  | 20 +---
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 775461ae98..4a493bcd46 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -1,5 +1,8 @@
 # See docs/devel/tracing.txt for syntax documentation.
 
+# hw/virtio/vhost.c
+vhost_section(const char *name, int r) "%s:%d"
+
 # hw/virtio/virtio.c
 virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
 virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e4290ce93d..e923219e63 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -27,6 +27,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "migration/blocker.h"
 #include "sysemu/dma.h"
+#include "trace.h"
 
 /* enabled until disconnected backend stabilizes */
 #define _VHOST_DEBUG 1
@@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
  memory_listener);
 hwaddr start_addr = section->offset_within_address_space;
 ram_addr_t size = int128_get64(section->size);
-bool log_dirty =
-memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
 int s = offsetof(struct vhost_memory, regions) +
 (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
 void *ram;
 
 dev->mem = g_realloc(dev->mem, s);
 
-if (log_dirty) {
-add = false;
-}
-
 assert(size);
 
 /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
@@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
 
 static bool vhost_section(MemoryRegionSection *section)
 {
-return memory_region_is_ram(section->mr) &&
+bool result;
+bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
+ ~(1 << DIRTY_MEMORY_MIGRATION);
+result = memory_region_is_ram(section->mr) &&
 !memory_region_is_rom(section->mr);
+
+/* Vhost doesn't handle any block which is doing dirty-tracking other
+ * than migration; this typically fires on VGA areas.
+ */
+result &= !log_dirty;
+
+trace_vhost_section(section->mr->name, result);
+return result;
 }
 
 static void vhost_begin(MemoryListener *listener)
-- 
2.14.3

[Qemu-devel] [PATCH v3 5/8] vhost: update_mem_cb implementation

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Add the meat of update_mem_cb;  this is called for each region,
to add a region to our temporary list.
Our temporary list is in order, so we only need to check whether this
region can be merged with the current head of the list.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/trace-events |  2 ++
 hw/virtio/vhost.c  | 54 +-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 4a493bcd46..92fadec192 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -2,6 +2,8 @@
 
 # hw/virtio/vhost.c
 vhost_section(const char *name, int r) "%s:%d"
+vhost_update_mem_cb(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
+vhost_update_mem_cb_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
 
 # hw/virtio/virtio.c
 virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6d720036db..646a3480c1 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -643,11 +643,63 @@ struct vhost_update_mem_tmp {
 /* Called for each MRS from vhost_update_mem */
 static int vhost_update_mem_cb(MemoryRegionSection *mrs, void *opaque)
 {
+struct vhost_update_mem_tmp *vtmp = opaque;
+struct vhost_memory_region *cur_vmr;
+bool need_add = true;
+uint64_t mrs_size;
+uint64_t mrs_gpa;
+uintptr_t mrs_host;
+
 if (!vhost_section(mrs)) {
 return 0;
 }
+mrs_size = int128_get64(mrs->size);
+mrs_gpa  = mrs->offset_within_address_space;
+mrs_host = (uintptr_t)memory_region_get_ram_ptr(mrs->mr) +
+ mrs->offset_within_region;
+
+trace_vhost_update_mem_cb(mrs->mr->name, mrs_gpa, mrs_size, mrs_host);
+
+if (vtmp->nregions) {
+/* Since we already have at least one region, lets see if
+ * this extends it; since we're scanning in order, we only
+ * have to look at the last one, and the FlatView that calls
+ * us shouldn't have overlaps.
+ */
+struct vhost_memory_region *prev_vmr = vtmp->regions +
+   (vtmp->nregions - 1);
+uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
+uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
+ prev_vmr->memory_size);
+uint64_t prev_host_start = prev_vmr->userspace_addr;
+uint64_t prev_host_end   = range_get_last(prev_host_start,
+  prev_vmr->memory_size);
+
+if (prev_gpa_end + 1 == mrs_gpa &&
+prev_host_end + 1 == mrs_host &&
+(!vtmp->dev->vhost_ops->vhost_backend_can_merge ||
+vtmp->dev->vhost_ops->vhost_backend_can_merge(vtmp->dev,
+mrs_host, mrs_size,
+prev_host_start, prev_vmr->memory_size))) {
+/* The two regions abut */
+need_add = false;
+mrs_size = mrs_size + prev_vmr->memory_size;
+prev_vmr->memory_size = mrs_size;
+trace_vhost_update_mem_cb_abut(mrs->mr->name, mrs_size);
+}
+}
+
+if (need_add) {
+vtmp->nregions++;
+vtmp->regions = g_realloc_n(vtmp->regions, vtmp->nregions,
+sizeof(vtmp->regions[0]));
+cur_vmr = &vtmp->regions[vtmp->nregions - 1];
+cur_vmr->guest_phys_addr = mrs_gpa;
+cur_vmr->memory_size = mrs_size;
+cur_vmr->userspace_addr  = mrs_host;
+cur_vmr->flags_padding = 0;
+}
 
-/* TODO */
 return 0;
 }
 
-- 
2.14.3
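
The merge rule described in this patch (extend the last region when the new one abuts it in both guest-physical and host address space) can be shown as a small standalone sketch. All names here (DemoRegion, DemoList, demo_add_region) are hypothetical, plain realloc stands in for g_realloc_n, and the optional backend "can merge" hook is left out. Unlike the patch, which compares last addresses via range_get_last(), this sketch compares start + size and therefore assumes ranges do not end at the very top of the 64-bit address space.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified region: guest-physical start, host start, size. */
typedef struct {
    uint64_t gpa;
    uint64_t host;
    uint64_t size;
} DemoRegion;

/* Hypothetical growable list, built up in ascending GPA order. */
typedef struct {
    DemoRegion *regions;
    size_t nregions;
} DemoList;

/*
 * Add a region to the list. Callers must pass regions in ascending,
 * non-overlapping GPA order, so only the last entry can be a merge candidate.
 * Returns 0 on success, -1 on allocation failure.
 */
static int demo_add_region(DemoList *list, uint64_t gpa, uint64_t host,
                           uint64_t size)
{
    DemoRegion *grown;

    assert(list != NULL && size > 0);

    if (list->nregions) {
        DemoRegion *prev = &list->regions[list->nregions - 1];
        /* Merge only if the ranges abut in BOTH address spaces. */
        if (prev->gpa + prev->size == gpa &&
            prev->host + prev->size == host) {
            prev->size += size;
            return 0;
        }
    }

    grown = realloc(list->regions, (list->nregions + 1) * sizeof(*grown));
    if (!grown) {
        return -1;
    }
    list->regions = grown;
    list->regions[list->nregions++] = (DemoRegion){ gpa, host, size };
    return 0;
}
```

Requiring adjacency in host addresses as well as guest addresses matters: two regions that are contiguous for the guest but backed by unrelated host mappings cannot be described by a single (gpa, host, size) entry.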

[Qemu-devel] [PATCH v3 3/8] vhost: Simplify ring verification checks

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

vhost_verify_ring_mappings() was used to verify that
rings are still accessible and related memory hasn't
been moved after flatview is updated.

It was doing checks by mapping ring's GPA+len and
checking that HVA hadn't changed with new memory map.
To avoid maybe expensive mapping call, we were
identifying address range that changed and were doing
mapping only if ring was in changed range.

However it's not necessary to perform ring's GPA
mapping as we already have its current HVA and all
we need is to verify that ring's GPA translates to
the same HVA in updated flatview.

This will allow the following patches to simplify the range
comparison that was previously needed to avoid expensive
verify_ring_mapping calls.
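
The check described above can be sketched without any mapping call, using only address arithmetic. This is an illustrative standalone version, not the function from the patch: demo_verify_ring_part is a hypothetical name, host addresses are modelled as uintptr_t instead of void pointers, and the extra "ring starts before the region" test is an assumption added here for completeness. The error codes simply mirror the ones used in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Check one ring part against one region of the updated memory map.
 * Returns 0 if the region does not touch the ring, or if the ring's GPA
 * still translates to the same HVA; a negative errno value otherwise.
 */
static int demo_verify_ring_part(uintptr_t ring_hva, uint64_t ring_gpa,
                                 uint64_t ring_size,
                                 uintptr_t reg_hva, uint64_t reg_gpa,
                                 uint64_t reg_size)
{
    uint64_t ring_last;
    uint64_t reg_last;

    assert(ring_size > 0 && reg_size > 0);
    ring_last = ring_gpa + ring_size - 1;
    reg_last = reg_gpa + reg_size - 1;

    /* No overlap: this region says nothing about the ring. */
    if (ring_last < reg_gpa || ring_gpa > reg_last) {
        return 0;
    }
    /* Overlap, but the region does not cover the whole ring. */
    if (ring_gpa < reg_gpa || ring_last > reg_last) {
        return -EBUSY;
    }
    /* Fully covered: the GPA must still map to the HVA we already hold. */
    if (ring_hva != reg_hva + (uintptr_t)(ring_gpa - reg_gpa)) {
        return -ENOMEM;
    }
    return 0;
}
```

Because the ring's HVA is already known, the whole verification reduces to three comparisons per region.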

Signed-off-by: Igor Mammedov 
with modifications by:
Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 74 ++-
 1 file changed, 41 insertions(+), 33 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e923219e63..c7ce7baf9b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -450,35 +450,37 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
 }
 }
 
-static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
-  void *part,
-  uint64_t part_addr,
-  uint64_t part_size,
-  uint64_t start_addr,
-  uint64_t size)
+static int vhost_verify_ring_part_mapping(void *ring_hva,
+  uint64_t ring_gpa,
+  uint64_t ring_size,
+  void *reg_hva,
+  uint64_t reg_gpa,
+  uint64_t reg_size)
 {
-hwaddr l;
-void *p;
-int r = 0;
+uint64_t hva_ring_offset;
+uint64_t ring_last = range_get_last(ring_gpa, ring_size);
+uint64_t reg_last = range_get_last(reg_gpa, reg_size);
 
-if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
+if (ring_last < reg_gpa || ring_gpa > reg_last) {
 return 0;
 }
-l = part_size;
-p = vhost_memory_map(dev, part_addr, &l, 1);
-if (!p || l != part_size) {
-r = -ENOMEM;
+/* check that the whole ring is mapped */
+if (ring_last > reg_last) {
+return -EBUSY;
 }
-if (p != part) {
-r = -EBUSY;
+/* check that ring's MemoryRegion wasn't replaced */
+hva_ring_offset = ring_gpa - reg_gpa;
+if (ring_hva != reg_hva + hva_ring_offset) {
+return -ENOMEM;
 }
-vhost_memory_unmap(dev, p, l, 0, 0);
-return r;
+
+return 0;
 }
 
 static int vhost_verify_ring_mappings(struct vhost_dev *dev,
-  uint64_t start_addr,
-  uint64_t size)
+  void *reg_hva,
+  uint64_t reg_gpa,
+  uint64_t reg_size)
 {
 int i, j;
 int r = 0;
@@ -492,22 +494,25 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
 struct vhost_virtqueue *vq = dev->vqs + i;
 
 j = 0;
-r = vhost_verify_ring_part_mapping(dev, vq->desc, vq->desc_phys,
-   vq->desc_size, start_addr, size);
+r = vhost_verify_ring_part_mapping(
+vq->desc, vq->desc_phys, vq->desc_size,
+reg_hva, reg_gpa, reg_size);
 if (r) {
 break;
 }
 
 j++;
-r = vhost_verify_ring_part_mapping(dev, vq->avail, vq->avail_phys,
-   vq->avail_size, start_addr, size);
+r = vhost_verify_ring_part_mapping(
+vq->avail, vq->avail_phys, vq->avail_size,
+reg_hva, reg_gpa, reg_size);
 if (r) {
 break;
 }
 
 j++;
-r = vhost_verify_ring_part_mapping(dev, vq->used, vq->used_phys,
-   vq->used_size, start_addr, size);
+r = vhost_verify_ring_part_mapping(
+vq->used, vq->used_phys, vq->used_size,
+reg_hva, reg_gpa, reg_size);
 if (r) {
 break;
 }
@@ -633,8 +638,6 @@ static void vhost_commit(MemoryListener *listener)
 {
 struct vhost_dev *dev = container_of(listener, struct vhost_dev,
  memory_listener);
-hwaddr start_addr = 0;
-ram_addr_t size = 0;
 uint64_t log_size;
 int r;
 
@@ -649,11 +652,16 @@ static void vhost_commit(MemoryListener *listener)
 }
 
 if (dev->started) {
-start_addr = dev->mem_changed_start_addr;
-size =

[Qemu-devel] [PATCH v3 0/8] Rework vhost memory region updates

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Hi,
  This patch set, which came out of a discussion with Igor, reworks the way
the vhost code handles changes in physical address space layout.

Its intention is to simplify a lot of the update code,
and to make it easier for the postcopy+shared code to
do the hugepage alignments that are needed.

Instead of updating and trying to merge sections of address
space on each add/remove callback, we wait until the commit phase
and go through and rebuild a list by walking the Flatview of
memory and end up producing an ordered list.
We compare the list to the old list to trigger updates.

v3
  Fix misordered comparison (that is removed later in the series)
  A few spelling fixes
  Now bisect builds, make checks and passes a test with hotplug
  memory on real vhost.

Dave


Dr. David Alan Gilbert (8):
  memory: address_space_iterate
  vhost: Move log_dirty check
  vhost: Simplify ring verification checks
  vhost: New memory update functions
  vhost: update_mem_cb implementation
  vhost: Compare and copy updated region data into device state
  vhost: Remove old vhost_set_memory etc
  vhost: Move mem_sections maintenance into commit/update routines

 hw/virtio/trace-events|   7 +
 hw/virtio/vhost.c | 498 --
 include/exec/memory.h |  23 +++
 include/hw/virtio/vhost.h |   3 -
 memory.c  |  22 ++
 5 files changed, 226 insertions(+), 327 deletions(-)

-- 
2.14.3

[Qemu-devel] [PATCH v3 1/8] memory: address_space_iterate

2017-12-11 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Iterate through an address space calling a function for each
section.  The iteration is done in order.

Signed-off-by: Dr. David Alan Gilbert 
---
 include/exec/memory.h | 23 +++
 memory.c  | 22 ++
 2 files changed, 45 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5ed4042f87..f5a9df642e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1987,6 +1987,29 @@ address_space_write_cached(MemoryRegionCache *cache, hwaddr addr,
 address_space_write(cache->as, cache->xlat + addr, MEMTXATTRS_UNSPECIFIED, buf, len);
 }
 
+/**
+ * ASIterateCallback: Function type called by address_space_iterate
+ *
+ * Return 0 on success or a negative error code.
+ *
+ * @mrs: Memory region section for this range
+ * @opaque: The opaque value passed in to the iterator.
+ */
+typedef int (*ASIterateCallback)(MemoryRegionSection *mrs, void *opaque);
+
+/**
+ * address_space_iterate: Call the function for each address range in the
+ *AddressSpace, in sorted order.
+ *
+ * Return 0 on success or a negative error code.
+ *
+ * @as: Address space to iterate over
+ * @cb: Function to call.  If the function returns non-zero the iteration will
+ * stop.
+ * @opaque: Value to pass to the function
+ */
+int
+address_space_iterate(AddressSpace *as, ASIterateCallback cb, void *opaque);
 #endif
 
 #endif
diff --git a/memory.c b/memory.c
index e26e5a3b1d..f45137f25e 100644
--- a/memory.c
+++ b/memory.c
@@ -2810,6 +2810,28 @@ void address_space_destroy(AddressSpace *as)
 call_rcu(as, do_address_space_destroy, rcu);
 }
 
+int address_space_iterate(AddressSpace *as, ASIterateCallback cb,
+  void *opaque)
+{
+int res = 0;
+FlatView *fv = address_space_to_flatview(as);
+FlatRange *range;
+
+flatview_ref(fv);
+
+FOR_EACH_FLAT_RANGE(range, fv) {
+MemoryRegionSection mrs = section_from_flat_range(range, fv);
+res = cb(&mrs, opaque);
+if (res) {
+break;
+}
+}
+
+flatview_unref(fv);
+
+return res;
+}
+
 static const char *memory_region_type(MemoryRegion *mr)
 {
 if (memory_region_is_ram_device(mr)) {
-- 
2.14.3
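
The iteration contract described in the patch above (call a callback for each section in order, stop at the first non-zero return and hand that value back to the caller) can be illustrated outside of QEMU. The following is only a minimal stand-alone sketch, not the QEMU implementation: `Range`, `iterate_ranges`, `SumState` and `stop_at_limit` are hypothetical names, and a plain sorted array stands in for the FlatView.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a MemoryRegionSection: one address range. */
typedef struct Range {
    uint64_t start;
    uint64_t size;
} Range;

/* Same shape as ASIterateCallback: 0 continues, non-zero stops. */
typedef int (*RangeCallback)(const Range *r, void *opaque);

/*
 * Walk the (already sorted) array in order.  Returns 0 if every callback
 * returned 0, otherwise the first non-zero callback result.
 */
int iterate_ranges(const Range *ranges, size_t n,
                   RangeCallback cb, void *opaque)
{
    int res = 0;

    for (size_t i = 0; i < n; i++) {
        res = cb(&ranges[i], opaque);
        if (res) {
            break;
        }
    }
    return res;
}

/* Example callback state: running total, limit, and number of calls. */
typedef struct SumState {
    uint64_t total;
    uint64_t limit;
    unsigned calls;
} SumState;

/* Example callback: sum the sizes, fail with -1 once the limit is exceeded. */
int stop_at_limit(const Range *r, void *opaque)
{
    SumState *st = opaque;

    st->calls++;
    st->total += r->size;
    return st->total > st->limit ? -1 : 0;
}
```

A caller passes its own state through `opaque`, exactly as the doc comment in the patch describes; the early `break` is what makes a negative error code from the callback visible to the caller.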

Re: [Qemu-devel] [PATCH for-2.11?] target/arm: Generate UNDEF for 32-bit Thumb2 insns

2017-12-11 Thread Emilio G. Cota

On Mon, Dec 11, 2017 at 17:32:48 +, Peter Maydell wrote:
> Thanks. I think I have come down on the side of putting this into
> 2.11, so rolling an rc5 today, and delaying the final release
> a day to Wednesday.

Glad to see it's in -rc5 -- thanks for fixing this so quickly!

Again, apologies for not having caught this earlier ;-(

Emilio

Re: [Qemu-devel] [PATCH v4 4/4] ivshmem: Disable irqfd on device reset

2017-12-11 Thread Markus Armbruster

Ladi Prosek  writes:

> The effects of ivshmem_enable_irqfd() were not undone on device reset.
>
> This manifested as:
> ivshmem_add_kvm_msi_virq: Assertion `!s->msi_vectors[vector].pdev' failed.
>
> when irqfd was enabled before reset and then enabled again after reset, making
> ivshmem_enable_irqfd() run for the second time.
>
> To reproduce, run:
>
>   ivshmem-server
>
> and QEMU with:
>
>   -device ivshmem-doorbell,chardev=iv
>   -chardev socket,path=/tmp/ivshmem_socket,id=iv
>
> then install the Windows driver, at the time of writing available at:
>
> https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem
>
> and crash-reboot the guest by inducing a BSOD.
>
> Signed-off-by: Ladi Prosek 

Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [PATCH for-2.12 0/4] qmp dirty bitmap API

2017-12-11 Thread John Snow



On 12/11/2017 06:15 AM, Kevin Wolf wrote:
> Am 09.12.2017 um 01:57 hat John Snow geschrieben:
>> Here's an idea of what this API might look like without revealing
>> explicit merge/split primitives.
>>
>> A new bitmap property that lets us set retention:
>>
>> :: block-dirty-bitmap-set-retention bitmap=foo slices=10
>>
>> Or something similar, where the default property for all bitmaps is
>> zero -- the current behavior: no copies retained.
>>
>> By setting it to a non-zero positive integer, the incremental backup
>> mode will automatically save a disabled copy when possible.
> 
> -EMAGIC
> 
> Operations that create or delete user-visible objects should be
> explicit, not automatic. You're trying to implement management layer
> functionality in qemu here, but incomplete enough that the artifacts of
> it are still visible externally. (A complete solution within qemu
> wouldn't expose low-level concepts such as bitmaps on an external
> interface, but you would expose something like checkpoints.)
> 
> Usually it's not a good idea to have a design where qemu implements
> enough to restrict management tools to whatever use case we had in mind,
> but not enough to make the management tool's life substantially easier
> (by not having to care about some low-level concepts).
> 
>> "What happens if we exceed our retention?"
>>
>> (A) We push the last one out automatically, or
>> (B) We fail the operation immediately.
>>
>> A is more convenient, but potentially unsafe if the management tool or
>> user wasn't aware that was going to happen.
>> B is more annoying, but definitely more safe as it means we cannot lose
>> a bitmap accidentally.
> 
> Both mean that the management layer has not only to deal with the
> deletion of bitmaps as it wants to have them, but also to keep the
> retention counter somewhere and predict what qemu is going to do to the
> bitmaps and whether any corrective action needs to be taken.
> 
> This is making things more complex rather than simpler.
> 
>> I would argue for B with perhaps a force-cycle=true|false that defaults
>> to false to let management tools say "Yes, go ahead, remove the old one"
>> with additionally some return to let us know it happened:
>>
>> {"return": {
>>   "dropped-slices": [ {"bitmap0": 0}, ...]
>> }}
>>
>> This would introduce some concept of bitmap slices into the mix as ID'd
>> children of a bitmap. I would propose that these slices are numbered and
>> monotonically increasing. "bitmap0" as an object starts with no slices,
>> but every incremental backup creates slice 0, slice 1, slice 2, and so
>> on. Even after we start deleting some, they stay ordered. These numbers
>> then stand in for points in time.
>>
>> The counter can (must?) be reset and all slices forgotten when
>> performing a full backup while providing a bitmap argument.
>>
>> "How can a user make use of the slices once they're made?"
>>
>> Let's consider something like mode=partial in contrast to
>> mode=incremental, and an example where we have 6 prior slices:
>> 0,1,2,3,4,5, (and, unnamed, the 'active' slice.)
>>
>> mode=partial bitmap=foo slice=4
>>
>> This would create a backup from slice 4 to the current time α. This
>> includes all clusters from 4, 5, and the active bitmap.
>>
>> I don't think it is meaningful to define any end point that isn't the
>> current time, so I've omitted that as a possibility.
> 
> John, what are you doing here? This adds option after option, and even
> an additional slice object, only complicating an easy thing more and more.
> I'm not sure if that was your intention, but I feel I'm starting to
> understand better how Linus's rants come about.
> 
> Let me summarise what this means for the management layer:
> 
> * The management layer has to manage bitmaps. They have direct control
>   over creation and deletion of bitmaps. So far so good.
> 
> * It also has to manage slices in those bitmap objects; and these
>   slices are what contain the actual bitmaps. In order to identify a
>   bitmap in qemu, you need:
> 
> a) the node name
> b) the bitmap ID, and
> c) the slice number
> 
>   The slice number is assigned by qemu and libvirt has to wait until
>   qemu tells it about the slice number of a newly created slice. If
>   libvirt doesn't receive the reply to the command that started the
>   block job, it needs to be able to query this information from qemu,
>   e.g. in query-block-jobs.
> 
> * Slices are automatically created when you start a backup job with a
>   bitmap. It doesn't matter whether you even intend to do an incremental
>   backup against this point in time. qemu knows better.
> 
> * In order to delete a slice that you don't need any more, you have to
>   create more slices (by doing more backups), but you don't get to
>   decide which one is dropped. qemu helpfully just drops the oldest one.
>   It doesn't matter if you want to keep an older one so you can do an
>   incremental backup for a longer timespan. Don't worry about your
>   backup

[Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc5 is now available

2017-12-11 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
sixth release candidate for the QEMU 2.11 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-2.11.0-rc5.tar.xz
  http://download.qemu-project.org/qemu-2.11.0-rc5.tar.xz.sig

A note from the maintainer:

  Unfortunately we had a late-breaking bug in QEMU's Arm CPU
  emulation, so we've had to roll a 5th release candidate.
  This has just one bugfix in it compared to rc4, and we
  plan to make the final 2.11 release on Wednesday.

You can help improve the quality of the QEMU 2.11 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well as documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/2.11

Please add entries to the ChangeLog for the 2.11 release below:

  http://wiki.qemu.org/ChangeLog/2.11

Changes since rc4:

6afd0c1998: Update version for v2.11.0-rc5 release (Peter Maydell)
7472e2efb0: target/arm: Generate UNDEF for 32-bit Thumb2 insns (Peter Maydell)

Re: [Qemu-devel] [PATCH for 2.11 0/2] QEMU crashes with CD device without media

2017-12-11 Thread John Snow

On 12/11/2017 05:24 AM, Denis V. Lunev wrote:
> On 11/28/2017 03:10 PM, Denis V. Lunev wrote:
>> There are 2 cases I have spotted so far:
>> 1) IDE ATAPI read processing. Actually this was reported from field
>> 2) QEMU IO hmp command (found during evaluation of (1))
>>
>> SCSI code checks during access that blk_is_available(). These patches add
>> same checks on different code paths.
>>
>> Pls decide whether these patches should go through sub-system trees or via
>> block tree.
>>
>> Signed-off-by: Denis V. Lunev 
>> CC: "Dr. David Alan Gilbert" 
>> CC: John Snow 
>> CC: Kevin Wolf 
>> CC: Stefan Hajnoczi 
>>
> any decision on this?
> 

Hi Den, I was investigating it and had a long reply typed out but I
didn't want to bore you to tears with my badly written novel. I'll send
it shortly.

I couldn't reproduce the problem and I can't see how the behavior you
are seeing is possible, so I am concerned about the root cause of what's
wrong.

I will accept the hotfix before 2.12-rc0 or 2.11.1, but I want to try
and see what's going wrong in the meantime.

Re: [Qemu-devel] [PATCH v3 3/3] msi: Handle remappable format interrupt request

2017-12-11 Thread Anthony PERARD

On Fri, Nov 17, 2017 at 02:24:25PM +0800, Chao Gao wrote:
> According to VT-d spec Interrupt Remapping and Interrupt Posting ->
> Interrupt Remapping -> Interrupt Request Formats On Intel 64
> Platforms, fields of MSI data register have changed. This patch
> avoids wrongly regarding a remappable format interrupt request as
> an interrupt binded with a pirq.
> 
> Signed-off-by: Chao Gao 
> Signed-off-by: Lan Tianyu 
> ---
> v3:
>  - clarify the interrupt format bit is Intel-specific, then it is
>  improper to define MSI_ADDR_IF_MASK in a common header.
> ---
>  hw/i386/xen/xen-hvm.c | 10 +-
>  hw/pci/msi.c  |  5 +++--
>  hw/pci/msix.c |  4 +++-
>  hw/xen/xen_pt_msi.c   |  2 +-
>  include/hw/xen/xen.h  |  2 +-
>  stubs/xen-hvm.c   |  2 +-
>  6 files changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index 8028bed..52dc8af 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -145,8 +145,16 @@ void xen_piix_pci_write_config_client(uint32_t address, 
> uint32_t val, int len)
>  }
>  }
>  
> -int xen_is_pirq_msi(uint32_t msi_data)
> +int xen_is_pirq_msi(uint32_t msi_addr_lo, uint32_t msi_data)
>  {
> +/* If the MSI address is configured in remapping format, the MSI will not
> + * be remapped into a pirq. This 'if' test excludes Intel-specific
> + * remappable msi.
> + */
> +#define MSI_ADDR_IF_MASK 0x0010

I don't think that is the right place for a define; defines also exist
outside of the context of the function.
That define would be better at the top of this file, I think. (There is
probably a better place in the common headers, but I'm not sure where.)

Thanks,

-- 
Anthony PERARD
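
The point about scope can be made concrete: the C preprocessor knows nothing about function bodies, so a `#define` placed inside a function stays visible for the rest of the translation unit. Below is a small sketch of the layout suggested in the review, with the mask at file scope. The mask value is copied as it appears in the quoted hunk; `msi_addr_is_remappable` is a hypothetical helper name used only for illustration and is not part of the patch.

```c
#include <stdint.h>

/*
 * File-scope definition, as suggested in the review.  The value is the one
 * shown in the quoted hunk (interrupt-format bit of the low address dword).
 */
#define MSI_ADDR_IF_MASK 0x0010

/*
 * Hypothetical helper: returns 1 if the low MSI address dword has the
 * interrupt-format bit set (remappable format), 0 otherwise.
 */
int msi_addr_is_remappable(uint32_t msi_addr_lo)
{
    return (msi_addr_lo & MSI_ADDR_IF_MASK) != 0;
}
```

Keeping the macro next to the other file-level definitions makes it obvious that every function below it can see it, which is the case either way.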

Re: [Qemu-devel] [PATCH v1 for-2-12 09/15] s390x/tcg: implement TEST PENDING INTERRUPTION

2017-12-11 Thread Cornelia Huck

On Mon, 11 Dec 2017 14:47:34 +0100
David Hildenbrand  wrote:

> Use s390_cpu_virt_mem_write() so we can actually revert what we did
> (re-inject the dequeued IO interrupt).
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/helper.h  |  1 +
>  target/s390x/insn-data.def |  1 +
>  target/s390x/misc_helper.c | 53 
> ++
>  target/s390x/translate.c   |  8 +++
>  4 files changed, 63 insertions(+)
> 

> +uint32_t HELPER(tpi)(CPUS390XState *env, uint64_t addr)
> +{
> +const uintptr_t ra = GETPC();
> +S390CPU *cpu = s390_env_get_cpu(env);
> +QEMUS390FLICState *flic = QEMU_S390_FLIC(s390_get_flic());
> +QEMUS390FlicIO *io = NULL;
> +LowCore *lowcore;
> +
> +if (addr & 0x3) {
> +s390_program_interrupt(env, PGM_SPECIFICATION, 4, ra);
> +}
> +
> +qemu_mutex_lock_iothread();
> +io = qemu_s390_flic_dequeue_io(flic, env->cregs[6]);
> +if (!io) {
> +qemu_mutex_unlock_iothread();
> +return 0;
> +}
> +
> +if (addr) {
> +struct {
> +uint16_t id;
> +uint16_t nr;
> +uint32_t parm;
> +} tmp = {
> +.id = cpu_to_be16(io->id),
> +.nr = cpu_to_be16(io->nr),
> +.parm = cpu_to_be32(io->parm),
> +};

That's a two-word interruption code; can you call this something better
than 'tmp'?

> +
> +if (s390_cpu_virt_mem_write(cpu, addr, 0, &tmp, sizeof(tmp))) {
> +/* writing failed, reinject and properly clean up */
> +s390_io_interrupt(io->id, io->nr, io->parm, io->word);
> +qemu_mutex_unlock_iothread();
> +g_free(io);
> +s390_cpu_virt_mem_handle_exc(cpu, ra);
> +return 0;
> +}
> +} else {
> +/* no protection applies */
> +lowcore = cpu_map_lowcore(env);
> +lowcore->subchannel_id = cpu_to_be16(io->id);
> +lowcore->subchannel_nr = cpu_to_be16(io->nr);
> +lowcore->io_int_parm = cpu_to_be32(io->parm);
> +lowcore->io_int_word = cpu_to_be32(io->word);
> +cpu_unmap_lowcore(lowcore);
> +}
> +
> +g_free(io);
> +qemu_mutex_unlock_iothread();
> +return 1;
> +}
> +
>  void HELPER(tsch)(CPUS390XState *env, uint64_t r1, uint64_t inst)
>  {
>  S390CPU *cpu = s390_env_get_cpu(env);

Re: [Qemu-devel] [PATCH v3 2/3] xen/pt: Pass the whole msi addr/data to Xen

2017-12-11 Thread Anthony PERARD

On Fri, Nov 17, 2017 at 02:24:24PM +0800, Chao Gao wrote:
> Previously, some fields (reserved or unalterable) were filtered by
> QEMU. These fields are useless for the legacy interrupt format.
> However, these fields may be meaningful (for Intel platforms)
> for the remapping interrupt format. It is better to pass the whole
> msi addr/data to Xen without any filtering.
> 
> The main reason why we want this is that QEMU doesn't have the knowledge
> to decide the interrupt format after we introduce vIOMMU inside Xen.
> Pass the whole msi message down and let the arch-specific vIOMMU
> decide the interrupt format.
> 
> Signed-off-by: Chao Gao 
> Signed-off-by: Lan Tianyu 
> ---
> v3:
>  - new
> ---
>  hw/xen/xen_pt_msi.c | 47 ---
>  1 file changed, 12 insertions(+), 35 deletions(-)
> 
> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> index 6d1e3bd..f7d6e76 100644
> --- a/hw/xen/xen_pt_msi.c
> +++ b/hw/xen/xen_pt_msi.c
> @@ -47,25 +47,6 @@ static inline uint32_t msi_ext_dest_id(uint32_t addr_hi)
>  return addr_hi & 0xff00;
>  }
>  
> -static uint32_t msi_gflags(uint32_t data, uint64_t addr)
> -{
> -uint32_t result = 0;
> -int rh, dm, dest_id, deliv_mode, trig_mode;
> -
> -rh = (addr >> MSI_ADDR_REDIRECTION_SHIFT) & 0x1;
> -dm = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
> -dest_id = msi_dest_id(addr);
> -deliv_mode = (data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x7;
> -trig_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> -
> -result = dest_id | (rh << XEN_PT_GFLAGS_SHIFT_RH)
> -| (dm << XEN_PT_GFLAGS_SHIFT_DM)
> -| (deliv_mode << XEN_PT_GFLAGSSHIFT_DELIV_MODE)
> -| (trig_mode << XEN_PT_GFLAGSSHIFT_TRG_MODE);
> -
> -return result;
> -}
> -
>  static inline uint64_t msi_addr64(XenPTMSI *msi)
>  {
>  return (uint64_t)msi->addr_hi << 32 | msi->addr_lo;
> @@ -160,23 +141,20 @@ static int msi_msix_update(XenPCIPassthroughState *s,
> bool masked)
>  {
>  PCIDevice *d = &s->dev;
> -uint8_t gvec = msi_vector(data);
> -uint32_t gflags = msi_gflags(data, addr);
> +uint32_t gflags = masked ? 0 : (1u << XEN_PT_GFLAGSSHIFT_UNMASKED);
>  int rc = 0;
>  uint64_t table_addr = 0;
>  
> -XEN_PT_LOG(d, "Updating MSI%s with pirq %d gvec %#x gflags %#x"
> -   " (entry: %#x)\n",
> -   is_msix ? "-X" : "", pirq, gvec, gflags, msix_entry);
> +XEN_PT_LOG(d, "Updating MSI%s with pirq %d gvec %#x addr %"PRIx64
> +   " data %#x gflags %#x (entry: %#x)\n",
> +   is_msix ? "-X" : "", pirq, addr, data, gflags, msix_entry);
>  
>  if (is_msix) {
>  table_addr = s->msix->mmio_base_addr;
>  }
>  
> -gflags |= masked ? 0 : (1u << XEN_PT_GFLAGSSHIFT_UNMASKED);
> -
> -rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec,
> -  pirq, gflags, table_addr);
> +rc = xc_domain_update_msi_irq(xen_xc, xen_domid, pirq, addr,
> +  data, gflags, table_addr);

Are you trying to modify an existing API? That is not going to work. We
want to be able to build QEMU against older versions of Xen, and it
should work as well.

>  
>  if (rc) {
>  XEN_PT_ERR(d, "Updating of MSI%s failed. (err: %d)\n",
> @@ -199,8 +177,6 @@ static int msi_msix_disable(XenPCIPassthroughState *s,
>  bool is_binded)
>  {
>  PCIDevice *d = &s->dev;
> -uint8_t gvec = msi_vector(data);
> -uint32_t gflags = msi_gflags(data, addr);
>  int rc = 0;
>  
>  if (pirq == XEN_PT_UNASSIGNED_PIRQ) {
> @@ -208,12 +184,13 @@ static int msi_msix_disable(XenPCIPassthroughState *s,
>  }
>  
>  if (is_binded) {
> -XEN_PT_LOG(d, "Unbind MSI%s with pirq %d, gvec %#x\n",
> -   is_msix ? "-X" : "", pirq, gvec);
> -rc = xc_domain_unbind_msi_irq(xen_xc, xen_domid, gvec, pirq, gflags);
> +XEN_PT_LOG(d, "Unbind MSI%s with pirq %d, addr %"PRIx64", data 
> %#x\n",
> +   is_msix ? "-X" : "", pirq, addr, data);
> +rc = xc_domain_unbind_msi_irq(xen_xc, xen_domid, pirq, addr, data);

Same here: this may build against older versions of Xen, but I don't think
an older libxc (like the one from Xen 4.10) is going to work correctly with
these new arguments.

>  if (rc) {
> -XEN_PT_ERR(d, "Unbinding of MSI%s failed. (err: %d, pirq: %d, 
> gvec: %#x)\n",
> -   is_msix ? "-X" : "", errno, pirq, gvec);
> +XEN_PT_ERR(d, "Unbinding of MSI%s failed. (err: %d, pirq: %d, "
> +   "addr: %"PRIx64", data: %#x)\n",
> +   is_msix ? "-X" : "", errno, pirq, addr, data);
>  return rc;
>  }
>  }

Thanks,

-- 
Anthony PERARD
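
One common way to address this kind of concern is a compatibility wrapper that selects the call shape at build time. The sketch below is a generic illustration under stated assumptions, not QEMU or libxc code: `SKETCH_XEN_INTERFACE_VERSION`, the threshold 41100, `CallLog`, `old_update_msi`, `new_update_msi` and `compat_update_msi` are all hypothetical names, and the two stand-in functions only record how they were called. Deriving the vector as the low 8 bits of the MSI data is also an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical: a real build would get this from the build system. */
#ifndef SKETCH_XEN_INTERFACE_VERSION
#define SKETCH_XEN_INTERFACE_VERSION 41000
#endif

/* Records which stand-in was called and with what. */
typedef struct CallLog {
    int used_new_api;
    uint32_t gvec;
    uint64_t addr;
    uint32_t data;
} CallLog;

/* Stand-in for the older call shape: vector and flags only. */
int old_update_msi(CallLog *log, uint32_t gvec, uint32_t pirq, uint32_t gflags)
{
    (void)pirq;
    (void)gflags;
    log->used_new_api = 0;
    log->gvec = gvec;
    return 0;
}

/* Stand-in for the newer call shape: whole address and data. */
int new_update_msi(CallLog *log, uint32_t pirq, uint64_t addr,
                   uint32_t data, uint32_t gflags)
{
    (void)pirq;
    (void)gflags;
    log->used_new_api = 1;
    log->addr = addr;
    log->data = data;
    return 0;
}

/* Callers use one signature; the wrapper adapts to the library version. */
int compat_update_msi(CallLog *log, uint32_t pirq, uint64_t addr,
                      uint32_t data, uint32_t gflags)
{
#if SKETCH_XEN_INTERFACE_VERSION >= 41100
    return new_update_msi(log, pirq, addr, data, gflags);
#else
    /* Older library: derive the vector locally (assumed: low 8 bits). */
    (void)addr;
    return old_update_msi(log, data & 0xff, pirq, gflags);
#endif
}
```

With a wrapper of this kind the device code is written once against the new shape, and builds against an older library keep the previous behaviour instead of passing arguments the library would misinterpret.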

Re: [Qemu-devel] [PATCH 08/17] iotests: Skip 103 for refcount_bits=1

2017-12-11 Thread John Snow



On 12/11/2017 12:17 PM, Max Reitz wrote:
> On 2017-12-09 02:36, John Snow wrote:
>>
>>
>> On 11/30/2017 08:23 AM, Max Reitz wrote:
>>> On 2017-11-30 04:18, Fam Zheng wrote:
>>>> On Thu, 11/23 03:08, Max Reitz wrote:
>>>>> Signed-off-by: Max Reitz
>>>>> ---
>>>>>  tests/qemu-iotests/103 | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/tests/qemu-iotests/103 b/tests/qemu-iotests/103
>>>>> index ecbd8ebd71..d0cfab8844 100755
>>>>> --- a/tests/qemu-iotests/103
>>>>> +++ b/tests/qemu-iotests/103
>>>>> @@ -40,6 +40,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>>>>  _supported_fmt qcow2
>>>>>  _supported_proto file nfs
>>>>>  _supported_os Linux
>>>>> +# Internal snapshots are (currently) impossible with refcount_bits=1
>>>>> +_unsupported_imgopts 'refcount_bits=1[^0-9]'
>>>>
>>>> What is the "[^0-9]" part for?
>>>
>>> It's so you can specify refcount_bits=16, but not
>>> refcount_bits=1,compat=0.10 or just refcount_bits=1.
>>>
>>> Max
>>>
>>
>> Worth a comment?
> 
> There is a comment above it that says that refcount_bits=1 is the
> disallowed option. :-)
> 
> I could add a "(refcount_bits=16 is OK, though)" if that would have been
> enough for you (or any proposal of yours).
> 
> Max
> 

Not worth a re-spin.

The double negative of "unsupported" and "not 0-9" takes a hot second to
parse. Mentioning that you are looking to prohibit 1,[foo] specifically
helps.
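
The behaviour of the pattern discussed above can be checked with POSIX regular expressions. This is a small sketch, not the iotests code: `imgopts_rejected` is a hypothetical helper, and it assumes the option string is given a trailing delimiter (a space here) before matching, since `[^0-9]` needs one more character after the `1` to match at all.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `imgopts` matches `pattern` (the option set would be
 * rejected), 0 if it does not, and -1 if the pattern fails to compile.
 */
int imgopts_rejected(const char *pattern, const char *imgopts)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_NOSUB) != 0) {
        return -1;
    }
    rc = regexec(&re, imgopts, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the pattern `refcount_bits=1[^0-9]`, the string `"refcount_bits=16 "` is not matched because the character after the `1` is a digit, while `"refcount_bits=1,compat=0.10 "` and `"refcount_bits=1 "` are matched. A bare `"refcount_bits=1"` with nothing after it is not matched, which is why the trailing delimiter matters.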

Re: [Qemu-devel] [PATCH for-2.11?] target/arm: Generate UNDEF for 32-bit Thumb2 insns

2017-12-11 Thread Peter Maydell

On 11 December 2017 at 17:00, Richard Henderson  wrote:
> On 12/11/2017 07:42 AM, Peter Maydell wrote:
>> The refactoring of commit 296e5a0a6c3935 has a nasty bug:
>> it accidentally dropped the generation of code to raise
>> the UNDEF exception when disas_thumb2_insn() returns nonzero.
>> This means that 32-bit Thumb2 instruction patterns that
>> ought to UNDEF just act like nops instead. This is likely
>> to break any number of things, including the kernel's "disable
>> the FPU and use the UNDEF exception to identify when to turn
>> it back on again" trick.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>> This is the smallest possible fix that will correct the
>> bug, for possible inclusion in 2.11; for 2.12 we should
>> fix the asymmetry where disas_thumb() generates its own
>> exception-raising code but disas_thumb2() wants the caller
>> to do it. (This asymmetry is why we didn't notice the
>> problem in code review.)
>>
>> I'm not sure whether this should go into 2.11 or not --
>> this time last week it would have been an easy "yes".
>
> Reviewed-by: Richard Henderson 

Thanks. I think I have come down on the side of putting this into
2.11, so rolling an rc5 today, and delaying the final release
a day to Wednesday.

thanks
-- PMM

Re: [Qemu-devel] [PATCH-2.12 v2 2/3] xilinx_spips: Set all of the reset values

2017-12-11 Thread Alistair Francis

On Wed, Dec 6, 2017 at 3:39 PM, francisco iglesias
 wrote:
> Hi Alistair,
>
> On 6 December 2017 at 23:22, Alistair Francis 
> wrote:
>>
>> Following the ZynqMP register spec let's ensure that all reset values
>> are set.
>>
>> Signed-off-by: Alistair Francis 
>> ---
>> V2:
>>  - Don't bother double setting registers
>>
>>  hw/ssi/xilinx_spips.c | 35 ++-
>>  include/hw/ssi/xilinx_spips.h |  2 +-
>>  2 files changed, 31 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
>> index 899db814ee..b8182cfd74 100644
>> --- a/hw/ssi/xilinx_spips.c
>> +++ b/hw/ssi/xilinx_spips.c
>> @@ -66,6 +66,7 @@
>>
>>  /* interrupt mechanism */
>>  #define R_INTR_STATUS   (0x04 / 4)
>> +#define R_INTR_STATUS_RESET (0x104)
>>  #define R_INTR_EN   (0x08 / 4)
>>  #define R_INTR_DIS  (0x0C / 4)
>>  #define R_INTR_MASK (0x10 / 4)
>> @@ -102,6 +103,9 @@
>>  #define R_SLAVE_IDLE_COUNT  (0x24 / 4)
>>  #define R_TX_THRES  (0x28 / 4)
>>  #define R_RX_THRES  (0x2C / 4)
>> +#define R_GPIO  (0x30 / 4)
>> +#define R_LPBK_DLY_ADJ  (0x38 / 4)
>> +#define R_LPBK_DLY_ADJ_RESET (0x33)
>>  #define R_TXD1  (0x80 / 4)
>>  #define R_TXD2  (0x84 / 4)
>>  #define R_TXD3  (0x88 / 4)
>> @@ -140,8 +144,12 @@
>>  #define R_GQSPI_IER (0x108 / 4)
>>  #define R_GQSPI_IDR (0x10c / 4)
>>  #define R_GQSPI_IMR (0x110 / 4)
>> +#define R_GQSPI_IMR_RESET   (0xfbe)
>>  #define R_GQSPI_TX_THRESH   (0x128 / 4)
>>  #define R_GQSPI_RX_THRESH   (0x12c / 4)
>> +#define R_GQSPI_GPIO_THRESH (0x130 / 4)
>
>
> According to doc (mentioned in patch 0/3) the address above, 0x130, is
> "GQSPI GPIO for Write Protect". Should we rename the define to R_GQSPI_GPIO?
> (Based on doc and that the other WP is named R_GPIO).

Hmmm... I auto generated these names, so somewhere internally we call
it GQSPI_GPIO_THRESH, but apparently not in the documentation.

All the other auto generated code (headers for standalone
applications) will have a similar auto generated name, so I'm tempted
to keep it as this. Otherwise the register is technically just called
GQSPI_GPIO, according to the documentation. That doesn't seem to clash
with anything else.

I think changing it to GQSPI_GPIO makes the most sense then. That way
it matches the documentation and is still searchably close to the auto
generated string.

Good catch!

Alistair

>
> Best regards,
> Francisco Iglesias
>
>>
>> +#define R_GQSPI_LPBK_DLY_ADJ (0x138 / 4)
>> +#define R_GQSPI_LPBK_DLY_ADJ_RESET (0x33)
>>  #define R_GQSPI_CNFG(0x100 / 4)
>>  FIELD(GQSPI_CNFG, MODE_EN, 30, 2)
>>  FIELD(GQSPI_CNFG, GEN_FIFO_START_MODE, 29, 1)
>> @@ -177,8 +185,16 @@
>>  FIELD(GQSPI_GF_SNAPSHOT, EXPONENT, 9, 1)
>>  FIELD(GQSPI_GF_SNAPSHOT, DATA_XFER, 8, 1)
>>  FIELD(GQSPI_GF_SNAPSHOT, IMMEDIATE_DATA, 0, 8)
>> -#define R_GQSPI_MOD_ID(0x168 / 4)
>> -#define R_GQSPI_MOD_ID_VALUE  0x010A
>> +#define R_GQSPI_MOD_ID(0x1fc / 4)
>> +#define R_GQSPI_MOD_ID_RESET  (0x10a)
>> +
>> +#define R_QSPIDMA_DST_CTRL (0x80c / 4)
>> +#define R_QSPIDMA_DST_CTRL_RESET   (0x803ffa00)
>> +#define R_QSPIDMA_DST_I_MASK   (0x820 / 4)
>> +#define R_QSPIDMA_DST_I_MASK_RESET (0xfe)
>> +#define R_QSPIDMA_DST_CTRL2(0x824 / 4)
>> +#define R_QSPIDMA_DST_CTRL2_RESET  (0x081bfff8)
>> +
>>  /* size of TXRX FIFOs */
>>  #define RXFF_A  (128)
>>  #define TXFF_A  (128)
>> @@ -351,11 +367,20 @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
>>  fifo8_reset(&s->rx_fifo_g);
>>  fifo8_reset(&s->rx_fifo_g);
>>  fifo32_reset(&s->fifo_g);
>> +s->regs[R_INTR_STATUS] = R_INTR_STATUS_RESET;
>> +s->regs[R_GPIO] = 1;
>> +s->regs[R_LPBK_DLY_ADJ] = R_LPBK_DLY_ADJ_RESET;
>> +s->regs[R_GQSPI_GFIFO_THRESH] = 0x10;
>> +s->regs[R_MOD_ID] = 0x01090101;
>> +s->regs[R_GQSPI_IMR] = R_GQSPI_IMR_RESET;
>>  s->regs[R_GQSPI_TX_THRESH] = 1;
>>  s->regs[R_GQSPI_RX_THRESH] = 1;
>> -s->regs[R_GQSPI_GFIFO_THRESH] = 1;
>> -s->regs[R_GQSPI_IMR] = GQSPI_IXR_MASK;
>> -s->regs[R_MOD_ID] = 0x01090101;
>> +s->regs[R_GQSPI_GPIO_THRESH] = 1;
>> +s->regs[R_GQSPI_LPBK_DLY_ADJ] = R_GQSPI_LPBK_DLY_ADJ_RESET;
>> +s->regs[R_GQSPI_MOD_ID] = R_GQSPI_MOD_ID_RESET;
>> +s->regs[R_QSPIDMA_DST_CTRL] = R_QSPIDMA_DST_CTRL_RESET;
>> +s->regs[R_QSPIDMA_DST_I_MASK] = R_QSPIDMA_DST_I_MASK_RESET;
>> +s->regs[R_QSPIDMA_DST_CTRL2] = R_QSPIDMA_DST_CTRL2_RESET;
>>  s->man_start_com_g = false;
>>  s->gqspi_irqline = 0;
>>  xlnx_zynqmp_qspips_update_ixr(s);
>> diff --git a/include/hw/ssi/xilinx_spips.h b/include/hw/ssi/xilinx_spips.h
>> index 75fc94ce5d..d398a4e81c 100644
>> --- a/include/hw/ssi/xilinx_spips.h
>> +++ b/include/hw/ssi/xilinx_spips.h
>> @@ -32,7 +32,7 @@
>>  typedef struct XilinxSPIPS

Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup

2017-12-11 Thread John Snow



On 12/11/2017 12:05 PM, Max Reitz wrote:
> On 2017-12-11 17:47, John Snow wrote:
>> On 12/11/2017 11:31 AM, Max Reitz wrote:
>>> On 2017-12-08 18:09, John Snow wrote:
>>>> On 12/08/2017 09:30 AM, Max Reitz wrote:
>>>>> On 2017-12-05 01:48, John Snow wrote:
>>>>>>
>>>>>> I would say that a bitmap attached to a BlockBackend should behave in
>>>>>> the way you say: writes to any children should change the bitmap here.
>>>>>>
>>>>>> bitmaps attached to nodes shouldn't worry about such things.
>>>>>
>>>>> Do we have bitmaps attached to BlockBackends?  I sure hope not.
>>>>>
>>>>> We should not have any interface that requires the use of BlockBackends
>>>>> by now.  If we do, that's something that has to be fixed.
>>>>>
>>>>>
>>>>
>>>> I'm not sure what the right paradigm is anymore, then.
>>>>
>>>> A node is just a node, but something has to represent the "drive" as far
>>>> as the device model sees it. I thought that *was* the BlockBackend, but
>>>> is it not?
>>>
>>> Yes, and on the other side the BB represents the device model for the
>>> block layer.  But the thing is that the user should be blissfully
>>> unaware...  Or do you want to make bitmaps attachable to guest devices
>>> (through the QOM path or ID) instead?
>>>
>>
>> OK, sure -- the user can specify a device model to attach it to instead
>> of a node. They don't have to be aware of the BB itself.
>>
>> The implementation though, I imagine it associates with that BB.
> 
> But that would be a whole new implementation...
> 

Yeah.

>>> (The block layer would then internally translate that to a BB.  But
>>> that's a bad internal interface because the bitmap is still attached to
>>> a BDS, and it's a bad external interface because currently you can
>>> attach bitmaps to nodes and only to nodes...)
>>
>> What if the type of bitmap we want to track trans-node changes was not
>> attached to a BDS? That'd be one way to obviously discriminate between
>> "This tracks tree-wide changes" and "This tracks node-local changes."
> 
> A new type of bitmap? :-/
> 

"type" may be too strong of a word, but... all the ones we use currently
are node-local.

>> Implementation wise I don't have any actual thought as to how this could
>> possibly be efficient. Maybe a bitmap reference at each BDS that is a
>> child of that particular BB?
>>
>> On attach, the BDS gets a set-only reference to that bitmap.
>> On detach, we remove the reference.
>>
>> Then, any writes anywhere in the tree will coagulate in one place. It
>> may or may not be particularly true or correct, because a write down the
>> tree doesn't necessarily mean the visible data has changed at the top
>> layer, but I don't think we have anything institutionally set in place
>> to determine if we are changing visible data up-graph with a write
>> down-graph.
> 
> Hmmm...  The first thing to clarify is whether we want two types of
> bitmaps.  I don't think there is much use to node-local bitmaps, all
> bitmaps should track every dirtying of their associated node (wherever
> it comes from).
> 

I don't disagree, but if that's the case then attaching bitmaps to
non-root nodes should be prohibited and we ought to shore up the
semantic idea that bitmaps must collect writes from their children.

I'm not sure that's any better inherently than expanding the idea to
include obvious differences between node-local and tree-wide bitmaps.
Currently, we just confuse the two concepts and do a poor job of either.

> However, if that is too much of a performance penalty...  Then we
> probably do have to distinguish between the two so that users only add
> tree-wide bitmaps when they need them.
> 
> OTOH, I guess that in the common case it's not a performance penalty at
> all, if done right.  Usually, a node you attach a bitmap to will not
> have child nodes that are written to by other nodes.  So in the common
> case your tree-wide bitmap is just a plain local bitmap and thus just as
> efficient.
> 
> And if some child node is indeed written to by some other node...  I
> think you always want a tree-wide bitmap anyway.
> 
> So I think all bitmaps should be tree-wide and the fact that they
> currently are not is a bug.

I could agree with this viewpoint, but that means we need to disallow
bitmaps to be attached to any non-root BDS, and then implement an
ability to give set-only references to children.

That's a bit of an extreme tactic that might prevent us from ever using
node-local bitmaps with anything resembling a sane API in the future.
Maybe that's OK, but it is a commitment.

I'm not sure there are any good reasons to have node-local bitmaps, but
the idea was one I more or less inherited when I started working in this
area because specifying specific BDS nodes with "device=" was going out
of vogue, but what exactly a BB was wasn't really defined yet, so I got
stuck with a QMP interface named "node=" and an implementation that has
an affinity for nodes.
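
The "set-only reference" idea from this thread can be sketched in a few lines. This is purely illustrative and makes several assumptions that are not in the discussion: `Bitmap`, `Node`, `node_add_ref`, `node_drop_ref` and `node_write` are hypothetical names, a bitmap is a single 64-bit word with one bit per cluster, and each node holds at most `MAX_REFS` references.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REFS 4

/* One bit per cluster; only clusters 0..63 are representable here. */
typedef struct Bitmap {
    uint64_t dirty;
} Bitmap;

/* A node only holds bitmaps it may set bits in; it never clears them. */
typedef struct Node {
    Bitmap *set_only[MAX_REFS];
    size_t nrefs;
} Node;

/* On attach: hand the node a set-only reference.  Returns -1 if full. */
int node_add_ref(Node *n, Bitmap *bm)
{
    if (n->nrefs == MAX_REFS) {
        return -1;
    }
    n->set_only[n->nrefs++] = bm;
    return 0;
}

/* On detach: remove the reference again (order is not preserved). */
void node_drop_ref(Node *n, Bitmap *bm)
{
    for (size_t i = 0; i < n->nrefs; i++) {
        if (n->set_only[i] == bm) {
            n->set_only[i] = n->set_only[--n->nrefs];
            return;
        }
    }
}

/* A write to a node dirties the cluster in every bitmap it references. */
void node_write(Node *n, unsigned cluster)
{
    if (cluster >= 64) {
        return;
    }
    for (size_t i = 0; i < n->nrefs; i++) {
        n->set_only[i]->dirty |= UINT64_C(1) << cluster;
    }
}
```

In this model a tree-wide bitmap is one that has been handed to the root and to each child, so writes anywhere below collect in one place. As noted in the thread, that says nothing about whether a write further down actually changes the data visible at the top.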

Re: [Qemu-devel] [PATCH v1 for-2-12 04/15] s390x/flic: simplify flic initialization

2017-12-11 Thread Cornelia Huck

On Mon, 11 Dec 2017 14:47:29 +0100
David Hildenbrand  wrote:

> This makes it clearer, which device is used for which accelerator.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  hw/intc/s390_flic.c  |  9 +++--
>  hw/intc/s390_flic_kvm.c  | 12 
>  include/hw/s390x/s390_flic.h |  9 -
>  3 files changed, 7 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
> index 6eaf178d79..a78bdf1d90 100644
> --- a/hw/intc/s390_flic.c
> +++ b/hw/intc/s390_flic.c
> @@ -40,11 +40,16 @@ void s390_flic_init(void)
>  {
>  DeviceState *dev;
>  
> -dev = s390_flic_kvm_create();
> -if (!dev) {
> +if (kvm_enabled()) {
> +dev = qdev_create(NULL, TYPE_KVM_S390_FLIC);
> +object_property_add_child(qdev_get_machine(), TYPE_KVM_S390_FLIC,
> +  OBJECT(dev), NULL);
> +} else if (tcg_enabled()) {
>  dev = qdev_create(NULL, TYPE_QEMU_S390_FLIC);
>  object_property_add_child(qdev_get_machine(), TYPE_QEMU_S390_FLIC,
>OBJECT(dev), NULL);

Can you use TYPE_S390_FLIC_COMMON for attaching the flic to the machine?

> +} else {
> +g_assert_not_reached();

Checking for tcg_enabled() explicitly does not seem the common pattern,
although it is fine with me (I doubt we'll support other accelerators
for s390x in the foreseeable future).

>  }
>  qdev_init_nofail(dev);
>  }

Do we want to switch to the same pattern for the storage attribute
device as well?

Change looks fine to me.

Re: [Qemu-devel] [PATCH 08/17] iotests: Skip 103 for refcount_bits=1

2017-12-11 Thread Max Reitz

On 2017-12-09 02:36, John Snow wrote:
> 
> 
> On 11/30/2017 08:23 AM, Max Reitz wrote:
>> On 2017-11-30 04:18, Fam Zheng wrote:
>>> On Thu, 11/23 03:08, Max Reitz wrote:
>>>> Signed-off-by: Max Reitz
>>>> ---
>>>>  tests/qemu-iotests/103 | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/tests/qemu-iotests/103 b/tests/qemu-iotests/103
>>>> index ecbd8ebd71..d0cfab8844 100755
>>>> --- a/tests/qemu-iotests/103
>>>> +++ b/tests/qemu-iotests/103
>>>> @@ -40,6 +40,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>>>  _supported_fmt qcow2
>>>>  _supported_proto file nfs
>>>>  _supported_os Linux
>>>> +# Internal snapshots are (currently) impossible with refcount_bits=1
>>>> +_unsupported_imgopts 'refcount_bits=1[^0-9]'
>>>
>>> What is the "[^0-9]" part for?
>>
>> It's so you can specify refcount_bits=16, but not
>> refcount_bits=1,compat=0.10 or just refcount_bits=1.
>>
>> Max
>>
> 
> Worth a comment?

There is a comment above it that says that refcount_bits=1 is the
disallowed option. :-)

I could add a "(refcount_bits=16 is OK, though)" if that would have been
enough for you (or any proposal of yours).

Max




Re: [Qemu-devel] [Qemu-trivial] [PATCH] lsi_scsi: add support for PPR Extended Message

2017-12-11 Thread George Kennedy


Thank you Paolo,

"Signed-off-by: George Kennedy"

George

On 12/11/2017 11:55 AM, Paolo Bonzini wrote:

On 11/12/2017 17:45, George Kennedy wrote:

The LSI 53c895a code does not handle the PPR Extended Message. Add
support to handle PPR Extended Message like SDTR and WDTR are handled.
That is, to skip past the message bytes and ignore the message.

---
  hw/scsi/lsi53c895a.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 595c260..1e02a89 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -961,6 +961,10 @@ static void lsi_do_msgout(LSIState *s)
  DPRINTF("WDTR (ignored)\n");
  lsi_skip_msgbytes(s, 1);
  break;
+    case 4:
+    DPRINTF("PPR (ignored)\n");
+    lsi_skip_msgbytes(s, 5);
+    break;
  default:
  goto bad;
  }

Hi George,

for a patch to QEMU to be accepted, you need to confirm the origin of
your patch (according to the "Developer Certificate of Origin", see
https://developercertificate.org/).

In order to do this, it's enough to reply to this message with
"Signed-off-by: George Kennedy " in the reply.

Thanks,

Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup

2017-12-11 Thread Max Reitz

On 2017-12-11 17:47, John Snow wrote:
> 
> 
> On 12/11/2017 11:31 AM, Max Reitz wrote:
>> On 2017-12-08 18:09, John Snow wrote:
>>>
>>>
>>> On 12/08/2017 09:30 AM, Max Reitz wrote:
>>>> On 2017-12-05 01:48, John Snow wrote:
>>>>>
>>>>>
>>>>> On 12/04/2017 05:21 PM, Max Reitz wrote:
>>>>>> On 2017-12-04 23:15, John Snow wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/01/2017 02:41 PM, Max Reitz wrote:
>>>>>>>> ((By the way, I don't suppose that's how it should work...  But I don't
>>>>>>>> suppose that we want propagation of dirtying towards the BDS roots, do
>>>>>>>> we? :-/))
>>>>>>>
>>>>>>> I have never really satisfactorily explained to myself what bitmaps on
>>>>>>> intermediate nodes truly represent or mean.
>>>>>>>
>>>>>>> The simple case is "This layer itself serviced a write request."
>>>>>>>
>>>>>>> If that information is not necessarily meaningful, I'm not sure that's a
>>>>>>> problem except in configuration.
>>>>>>>
>>>>>>>
>>>>>>> ...Now, if you wanted to talk about bitmaps that associate with a
>>>>>>> Backend instead of a Node...
>>>>>>
>>>>>> But it's not about bitmaps on intermediate nodes, quite the opposite.
>>>>>> It's about bitmaps on roots but write requests happening on intermediate
>>>>>> nodes.
>>>>>>
>>>>>
>>>>> Oh, I see what you're saying. It magically doesn't really change my
>>>>> opinion, by coincidence!
>>>>>
>>>>>> Say you have a node I and two filter nodes A and B using it (and they
>>>>>> are OK with shared writers).  There is a dirty bitmap on A.
>>>>>>
>>>>>> Now when a write request goes through B, I will obviously have changed,
>>>>>> and because A and B are filters, so will A.  But the dirty bitmap on A
>>>>>> will still be clean.
>>>>>>
>>>>>> My example was that when you run a mirror over A, you won't see dirtying
>>>>>> from B.  So you can't e.g. add a throttle driver between a mirror job
>>>>>> and the node you want to mirror, because the dirty bitmap on the
>>>>>> throttle driver will not be affected by accesses to the actual node.
>>>>>>
>>>>>> Max
>>>>>>
>>>>>
>>>>> Well, in this case I would say that a root BDS is not really any
>>>>> different from an intermediate one and can't really know what's going on
>>>>> in the world outside.
>>>>>
>>>>> At least, I think that's how we model it right now -- we pretend that we
>>>>> can record the activity of an entire drive graph by putting the bitmap
>>>>> on the root-most node we can get a hold of and assuming that all writes
>>>>> are going to go through us.
>>>>
>>>> Well, yeah, I know we do.  But I consider this counter-intuitive and if
>>>> something is counter-intuitive it's often a bug.
>>>>
>>>>> Clearly this is increasingly false the more we modularise the block graph.
>>>>>
>>>>>
>>>>> *uhm*
>>>>>
>>>>>
>>>>> I would say that a bitmap attached to a BlockBackend should behave in
>>>>> the way you say: writes to any children should change the bitmap here.
>>>>>
>>>>> bitmaps attached to nodes shouldn't worry about such things.
>>>>
>>>> Do we have bitmaps attached to BlockBackends?  I sure hope not.
>>>>
>>>> We should not have any interface that requires the use of BlockBackends
>>>> by now.  If we do, that's something that has to be fixed.
>>>>
>>>> Max
>>>>
>>>
>>> I'm not sure what the right paradigm is anymore, then.
>>>
>>> A node is just a node, but something has to represent the "drive" as far
>>> as the device model sees it. I thought that *was* the BlockBackend, but
>>> is it not?
>>
>> Yes, and on the other side the BB represents the device model for the
>> block layer.  But the thing is that the user should be blissfully
>> unaware...  Or do you want to make bitmaps attachable to guest devices
>> (through the QOM path or ID) instead?
>>
> 
> OK, sure -- the user can specify a device model to attach it to instead
> of a node. They don't have to be aware of the BB itself.
> 
> The implementation though, I imagine it associates with that BB.

But that would be a whole new implementation...

>> (The block layer would then internally translate that to a BB.  But
>> that's a bad internal interface because the bitmap is still attached to
>> a BDS, and it's a bad external interface because currently you can
>> attach bitmaps to nodes and only to nodes...)
> 
> What if the type of bitmap we want to track trans-node changes was not
> attached to a BDS? That'd be one way to obviously discriminate between
> "This tracks tree-wide changes" and "This tracks node-local changes."

A new type of bitmap? :-/

> Implementation wise I don't have any actual thought as to how this could
> possibly be efficient. Maybe a bitmap reference at each BDS that is a
> child of that particular BB?
> 
> On attach, the BDS gets a set-only reference to that bitmap.
> On detach, we remove the reference.
> 
> Then, any writes anywhere in the tree will coagulate in one place. It
> may or may not be particularly true or correct, because a write down the
> tree doesn't necessarily mean the visible data has changed at the top
> layer, but I don't

Re: [Qemu-devel] [PATCH for-2.11?] target/arm: Generate UNDEF for 32-bit Thumb2 insns

2017-12-11 Thread Richard Henderson

On 12/11/2017 07:42 AM, Peter Maydell wrote:
> The refactoring of commit 296e5a0a6c3935 has a nasty bug:
> it accidentally dropped the generation of code to raise
> the UNDEF exception when disas_thumb2_insn() returns nonzero.
> This means that 32-bit Thumb2 instruction patterns that
> ought to UNDEF just act like nops instead. This is likely
> to break any number of things, including the kernel's "disable
> the FPU and use the UNDEF exception to identify when to turn
> it back on again" trick.
> 
> Signed-off-by: Peter Maydell 
> ---
> This is the smallest possible fix that will correct the
> bug, for possible inclusion in 2.11; for 2.12 we should
> fix the asymmetry where disas_thumb() generates its own
> exception-raising code but disas_thumb2() wants the caller
> to do it. (This asymmetry is why we didn't notice the
> problem in code review.)
> 
> I'm not sure whether this should go into 2.11 or not --
> this time last week it would have been an easy "yes".

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v5 0/4] xenfb: Enablement for Windows PV HID frontend

2017-12-11 Thread Anthony PERARD

On Fri, Nov 03, 2017 at 11:56:27AM +, Owen Smith wrote:
> Improve the input device model in xenfb, by updating the
> Qemu input handlers and adding a feature to allow for
> raw (unscaled) absolute coordinates to be represented.
> 
> Changes:
>   * use keycodedb to generate qcode to linux input mapping
>   * move rescaling to the mouse_event handler
>   * add activate for raw_pointer devices
> 
> Owen Smith (3):
>   ui: generate qcode to linux mappings
>   xenfb: Use Input Handlers directly
>   xenfb: Add [feature|request]-raw-pointer
>   xenfb: activate input handlers for raw pointer devices

The patch series looks good to me:
Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD

Re: [Qemu-devel] [Qemu-trivial] [PATCH] lsi_scsi: add support for PPR Extended Message

2017-12-11 Thread Paolo Bonzini

On 11/12/2017 17:45, George Kennedy wrote:
> The LSI 53c895a code does not handle the PPR Extended Message. Add
> support to handle PPR Extended Message like SDTR and WDTR are handled.
> That is, to skip past the message bytes and ignore the message.
> 
> ---
>  hw/scsi/lsi53c895a.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
> index 595c260..1e02a89 100644
> --- a/hw/scsi/lsi53c895a.c
> +++ b/hw/scsi/lsi53c895a.c
> @@ -961,6 +961,10 @@ static void lsi_do_msgout(LSIState *s)
>  DPRINTF("WDTR (ignored)\n");
>  lsi_skip_msgbytes(s, 1);
>  break;
> +    case 4:
> +    DPRINTF("PPR (ignored)\n");
> +    lsi_skip_msgbytes(s, 5);
> +    break;
>  default:
>  goto bad;
>  }

Hi George,

for a patch to QEMU to be accepted, you need to confirm the origin of
your patch (according to the "Developer Certificate of Origin", see
https://developercertificate.org/).

In order to do this, it's enough to reply to this message with
"Signed-off-by: George Kennedy " in the reply.

Thanks,

Paolo

Re: [Qemu-devel] [PATCH] configure: Fix curses probe for older ncurses

2017-12-11 Thread Peter Maydell

On 26 November 2017 at 21:13, Brad Smith  wrote:
> Fix the curses probe with older ncurses (e.g. 5.7, as used by OpenBSD).
>
> ncurses 5.7 requires _XOPEN_SOURCE_EXTENDED to be defined for WACS_* 
> constants.
>
> Signed-off-by: Brad Smith 
>
>
> diff --git a/configure b/configure
> index 0c6e7572db..9715b9c2cc 100755
> --- a/configure
> +++ b/configure
> @@ -3186,7 +3186,7 @@ EOF
>IFS=:
>for curses_inc in $curses_inc_list; do
>  # Make sure we get the wide character prototypes
> -curses_inc="-DNCURSES_WIDECHAR $curses_inc"
> +curses_inc="-DNCURSES_WIDECHAR -D_XOPEN_SOURCE_EXTENDED $curses_inc"
>  IFS=:
>  for curses_lib in $curses_lib_list; do
>unset IFS

Having thought about this a bit more, I think I'm definitely not
happy with defining _XOPEN_SOURCE_EXTENDED by default for every
host OS. I think we should either:
 (a) define it only for OpenBSD in the per-host case statement
in configure, with a note that we're doing it to work around the
supplied ncurses version being ancient
 (b) just say that if you want this optional QEMU feature you need
a version of ncurses that was released this decade

To be honest I'd favour (b): there are limits to how much
we need to support adventures in retrocomputing.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v1] cpu-exec: fix missed CPU kick during interrupt injection

2017-12-11 Thread David Hildenbrand


> atomic_mb_set can be a little faster on x86, so:
> 
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index dfba5ebd29..4452cd9856 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -528,12 +528,10 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>  
>  /* Clear the interrupt flag now since we're processing
>   * cpu->interrupt_request and cpu->exit_request.
> + * Ensure zeroing happens before reading cpu->exit_request or
> + * cpu->interrupt_request (see also smp_wmb in cpu_exit())
>   */
> -atomic_set(&cpu->icount_decr.u16.high, 0);
> -/* Ensure zeroing happens before reading cpu->exit_request or
> - * cpu->interrupt_request. (also see cpu_exit())
> - */
> -smp_mb();
> +atomic_mb_set(&cpu->icount_decr.u16.high, 0);
>  
>  if (unlikely(atomic_read(&cpu->interrupt_request))) {
>  int interrupt_request;
> 

Looks good to me! Thanks!


-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH v1] cpus: make pause_all_cpus() play with SMP on single threaded TCG

2017-12-11 Thread Paolo Bonzini

On 11/12/2017 17:44, David Hildenbrand wrote:
>> -void cpu_stop_current(void)
>> -{
>> -if (current_cpu) {
>> -qemu_cpu_stop(current_cpu, true);
>> -}
>> -}
> Btw. this does not compile as this is used also in vl.c
> 

Doh, then I'm applying your patch untouched.

Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup

2017-12-11 Thread John Snow

On 12/11/2017 11:31 AM, Max Reitz wrote:
> On 2017-12-08 18:09, John Snow wrote:
>>
>>
>> On 12/08/2017 09:30 AM, Max Reitz wrote:
>>> On 2017-12-05 01:48, John Snow wrote:

>>>> On 12/04/2017 05:21 PM, Max Reitz wrote:
>>>>> On 2017-12-04 23:15, John Snow wrote:
>>>>>>
>>>>>>
>>>>>> On 12/01/2017 02:41 PM, Max Reitz wrote:
>>>>>>> ((By the way, I don't suppose that's how it should work...  But I don't
>>>>>>> suppose that we want propagation of dirtying towards the BDS roots, do
>>>>>>> we? :-/))
>>>>>>
>>>>>> I have never really satisfactorily explained to myself what bitmaps on
>>>>>> intermediate nodes truly represent or mean.
>>>>>>
>>>>>> The simple case is "This layer itself serviced a write request."
>>>>>>
>>>>>> If that information is not necessarily meaningful, I'm not sure that's a
>>>>>> problem except in configuration.
>>>>>>
>>>>>>
>>>>>> ...Now, if you wanted to talk about bitmaps that associate with a
>>>>>> Backend instead of a Node...
>>>>>
>>>>> But it's not about bitmaps on intermediate nodes, quite the opposite.
>>>>> It's about bitmaps on roots but write requests happening on intermediate
>>>>> nodes.
>>>>>
>>>>
>>>> Oh, I see what you're saying. It magically doesn't really change my
>>>> opinion, by coincidence!
>>>>
>>>>> Say you have a node I and two filter nodes A and B using it (and they
>>>>> are OK with shared writers).  There is a dirty bitmap on A.
>>>>>
>>>>> Now when a write request goes through B, I will obviously have changed,
>>>>> and because A and B are filters, so will A.  But the dirty bitmap on A
>>>>> will still be clean.
>>>>>
>>>>> My example was that when you run a mirror over A, you won't see dirtying
>>>>> from B.  So you can't e.g. add a throttle driver between a mirror job
>>>>> and the node you want to mirror, because the dirty bitmap on the
>>>>> throttle driver will not be affected by accesses to the actual node.
>>>>>
>>>>> Max
>>>>>
>>>>
>>>> Well, in this case I would say that a root BDS is not really any
>>>> different from an intermediate one and can't really know what's going on
>>>> in the world outside.
>>>>
>>>> At least, I think that's how we model it right now -- we pretend that we
>>>> can record the activity of an entire drive graph by putting the bitmap
>>>> on the root-most node we can get a hold of and assuming that all writes
>>>> are going to go through us.
>>>
>>> Well, yeah, I know we do.  But I consider this counter-intuitive and if
>>> something is counter-intuitive it's often a bug.
>>>
>>>> Clearly this is increasingly false the more we modularise the block graph.
>>>>
>>>> *uhm*
>>>>
>>>> I would say that a bitmap attached to a BlockBackend should behave in
>>>> the way you say: writes to any children should change the bitmap here.
>>>>
>>>> bitmaps attached to nodes shouldn't worry about such things.
>>>
>>> Do we have bitmaps attached to BlockBackends?  I sure hope not.
>>>
>>> We should not have any interface that requires the use of BlockBackends
>>> by now.  If we do, that's something that has to be fixed.
>>>
>>> Max
>>>
>>
>> I'm not sure what the right paradigm is anymore, then.
>>
>> A node is just a node, but something has to represent the "drive" as far
>> as the device model sees it. I thought that *was* the BlockBackend, but
>> is it not?
> 
> Yes, and on the other side the BB represents the device model for the
> block layer.  But the thing is that the user should be blissfully
> unaware...  Or do you want to make bitmaps attachable to guest devices
> (through the QOM path or ID) instead?
> 

OK, sure -- the user can specify a device model to attach it to instead
of a node. They don't have to be aware of the BB itself.

The implementation though, I imagine it associates with that BB.

> (The block layer would then internally translate that to a BB.  But
> that's a bad internal interface because the bitmap is still attached to
> a BDS, and it's a bad external interface because currently you can
> attach bitmaps to nodes and only to nodes...)

What if the type of bitmap we want to track trans-node changes was not
attached to a BDS? That'd be one way to obviously discriminate between
"This tracks tree-wide changes" and "This tracks node-local changes."

Implementation-wise, I don't have any concrete idea of how this could
be made efficient. Maybe a bitmap reference at each BDS that is a
child of that particular BB?

On attach, the BDS gets a set-only reference to that bitmap.
On detach, we remove the reference.

Then, any writes anywhere in the tree will coagulate in one place. It
may or may not be particularly true or correct, because a write down the
tree doesn't necessarily mean the visible data has changed at the top
layer, but I don't think we have anything institutionally set in place
to determine if we are changing visible data up-graph with a write
down-graph.

[Qemu-devel] [Qemu-trivial] [PATCH] lsi_scsi: add support for PPR Extended Message

2017-12-11 Thread George Kennedy

The LSI 53c895a code does not handle the PPR Extended Message. Add 
support to handle PPR Extended Message like SDTR and WDTR are handled. 
That is, to skip past the message bytes and ignore the message.


---
 hw/scsi/lsi53c895a.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 595c260..1e02a89 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -961,6 +961,10 @@ static void lsi_do_msgout(LSIState *s)
 DPRINTF("WDTR (ignored)\n");
 lsi_skip_msgbytes(s, 1);
 break;
+case 4:
+DPRINTF("PPR (ignored)\n");
+lsi_skip_msgbytes(s, 5);
+break;
 default:
 goto bad;
 }
--
1.8.3.1

Re: [Qemu-devel] [PATCH v1] cpus: make pause_all_cpus() play with SMP on single threaded TCG

2017-12-11 Thread David Hildenbrand


> -void cpu_stop_current(void)
> -{
> -if (current_cpu) {
> -qemu_cpu_stop(current_cpu, true);
> -}
> -}

Btw. this does not compile as this is used also in vl.c

> -
>  int vm_stop(RunState state)
>  {
>  if (qemu_in_vcpu_thread()) {
> @@ -1818,7 +1809,8 @@ int vm_stop(RunState state)
>   * FIXME: should not return to device code in case
>   * vm_stop() has been requested.
>   */
> -cpu_stop_current();
> +qemu_cpu_stop(current_cpu);
> +cpu_exit(current_cpu);
>  return 0;
>  }
>  
> 


-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH v1] cpus: make pause_all_cpus() play with SMP on single threaded TCG

2017-12-11 Thread David Hildenbrand


>  int vm_stop(RunState state)
>  {
>  if (qemu_in_vcpu_thread()) {
> @@ -1818,7 +1809,8 @@ int vm_stop(RunState state)
>   * FIXME: should not return to device code in case
>   * vm_stop() has been requested.
>   */
> -cpu_stop_current();
> +qemu_cpu_stop(current_cpu);
> +cpu_exit(current_cpu);

We're doing the cpu_exit() now after the broadcast, is this ok?

Also we drop the check for current_cpu, I assume this is also ok.

>  return 0;
>  }
> 

-- 

Thanks,

David / dhildenb
