Re: [PATCH v2] vl: Add support to set properties when using JSON syntax for -device via -set option

2022-01-11 Thread MkfsSion
ping
https://lore.kernel.org/qemu-devel/20211224072511.63894-1-mkfss...@mkfssion.com



Re: [PULL 00/13] Net patches

2022-01-11 Thread Jason Wang
On Wed, Jan 12, 2022 at 3:10 PM Roman Bolshakov  wrote:
>
> On Wed, Jan 12, 2022 at 01:39:28PM +0800, Jason Wang wrote:
> >
> > On 2022/1/12 6:02 AM, Vladislav Yaroshchuk wrote:
> > >
> > >
> > > Tue, Jan 11, 2022, 5:10 AM Jason Wang :
> > >
> > > On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell
> > >  wrote:
> > > >
> > > > On Mon, 10 Jan 2022 at 03:40, Jason Wang 
> > > wrote:
> > > > >
> > > > > The following changes since commit
> > > df722e33d5da26ea8604500ca8f509245a0ea524:
> > > > >
> > > > >   Merge tag 'bsd-user-arm-pull-request' of
> > > gitlab.com:bsdimp/qemu into staging (2022-01-08 09:37:59 -0800)
> > > > >
> > > > > are available in the git repository at:
> > > > >
> > > > > https://github.com/jasowang/qemu.git tags/net-pull-request
> > > > >
> > > > > for you to fetch changes up to
> > > 5136cc6d3b8b74f4fa572f0874656947a401330e:
> > > > >
> > > > >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
> > > > >
> > > > > 
> > > > >
> > > > > 
> > > >
> > > > Fails to build on OSX Catalina:
> > > >
> > > > ../../net/vmnet-common.m:165:10: error: use of undeclared identifier
> > > > 'VMNET_SHARING_SERVICE_BUSY'
> > > > case VMNET_SHARING_SERVICE_BUSY:
> > > >  ^
> > > >
> > > > This constant only got added in macOS 11.0. I guess that technically
> > > > our supported-platforms policy only requires us to support 11
> > > (Big Sur)
> > > > and 12 (Monterey) at this point, but it would be nice to still
> > > be able
> > > > to build on Catalina (10.15).
> > >
> > > Yes, it was only supported by the vmnet framework starting from
> > > Catalyst according to
> > > https://developer.apple.com/documentation/vmnet?language=objc.
> > >
> > >
> > > Yes, there are some symbols from macOS >= 11.0 new backend
> > > uses, not only this one, ex. vmnet_enable_isolation_key:
> > > https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key
> > >
> > > >
> > > > (Personally I would like Catalina still to work at least for a
> > > little
> > > > while, because my x86 Mac is old enough that it is not supported by
> > > > Big Sur. I'll have to dump it once Apple stops doing security
> > > support
> > > > for Catalina, but they haven't done that quite yet.)
> > >
> > >
> > > Sure, broken builds on old macOSes are bad. For this case I think
> > > it's enough to disable vmnet for macOS < 11.0 with a probe while
> > > configure build step. Especially given that Apple supports ~three
> > > latest macOS versions, support for Catalina is expected to end
> > > in 2022, when QEMU releases 7.0.
> >
> >
> > That should be fine.
> >
>
> I agree with Peter on this,
>
> There's a lot of hardware running with Catalina. I think it's useful to
> support it a little longer.

Right, and Vladislav has disabled vmnet on the old versions.

Thanks

>
> Regards,
> Roman
>
> >
> > >
> > > If this workaround is not suitable and it's required to support vmnet
> > > in Catalina 10.15 with a subset of available features, it can be done.
> > > But I'll be ready to handle this in approximately two-three weeks only.
> > >
> > > Sure, Vladislav please fix this and send a new version.
> > >
> > >
> > > Quick fix as described above is available in v10:
> > > https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/
> >
> >
> > Have you got chance to test that for macOS < 11.0?
> >
> > Thanks
> >
> >
> > > Thanks
> > >
> > > >
> > > > -- PMM
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Best Regards,
> > >
> > > Vladislav Yaroshchuk
> >
> >
>




Re: [PATCH v10 0/7] Add vmnet.framework based network backend

2022-01-11 Thread Roman Bolshakov
On Wed, Jan 12, 2022 at 12:14:15AM +0300, Vladislav Yaroshchuk wrote:
> macOS provides networking API for VMs called 'vmnet.framework':
> https://developer.apple.com/documentation/vmnet
> 
> We can provide its support as the new QEMU network backends which
> represent three different vmnet.framework interface usage modes:
> 
>   * `vmnet-shared`:
> allows the guest to communicate with other guests in shared mode and
> also with external network (Internet) via NAT. Has (macOS-provided)
> DHCP server; subnet mask and IP range can be configured;
> 
>   * `vmnet-host`:
> allows the guest to communicate with other guests in host mode.
> By default DHCP is enabled, as in `vmnet-shared`, but providing a
> network unique id (uuid) isolates `vmnet-host` interfaces
> from each other and also disables DHCP.
> 
>   * `vmnet-bridged`:
> bridges the guest with a physical network interface.
> 
> These backends cannot work on macOS Catalina 10.15 because we use
> vmnet.framework API provided only with macOS 11 and newer. This does
> not seem to be a problem, because QEMU guarantees to work on the two most
> recent versions of macOS, which are now Big Sur (11) and Monterey (12).
> 
> Also, we have one inconvenient restriction: only a privileged user can
> create vmnet.framework interfaces:
> `$ sudo qemu-system-x86_64 -nic vmnet-shared`
> 
> An attempt to create a `vmnet-*` netdev as an unprivileged user fails with
> vmnet's 'general failure'.
> 
> This happens because vmnet.framework requires `com.apple.vm.networking`
> entitlement which is: "restricted to developers of virtualization software.
> To request this entitlement, contact your Apple representative." as Apple
> documentation says:
> https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_vm_networking
> 
> One more note: some quite useful 'vmnet.framework' features are still
> not supported, such as creating port forwarding rules, specifying an IPv6
> NAT prefix, and so on.
> 
> Nevertheless, the new backends work fine and were tested with
> `qemu-system-x86_64` on a macOS Big Sur 11.5.2 host with these NIC models:
>   * e1000-82545em
>   * virtio-net-pci
>   * vmxnet3
> 
> The guests were:
>   * macOS 10.15.7
>   * Ubuntu Bionic (server cloudimg)
> 
> 
> This series partially reuses patches by Phillip Tennen:
> https://patchew.org/QEMU/20210218134947.1860-1-phillip.en...@gmail.com/
> So I included his Signed-off-by line in one of the commit messages and
> also here.
> 
> v1 -> v2:
>  Since v1 minor typos were fixed, patches rebased onto latest master,
>  redundant changes removed (small commits squashed)
> v2 -> v3:
>  - QAPI style fixes
>  - Typos fixes in comments
>  - `#include`'s updated to be in sync with recent master
> v3 -> v4:
>  - Support vmnet interfaces isolation feature
>  - Support vmnet-host network uuid setting feature
>  - Refactored sources a bit
> v4 -> v5:
>  - Missed 6.2 boat, now 7.0 candidate
>  - Fix qapi netdev descriptions and styles
>    (@subnetmask -> @subnet-mask)
>  - Support vmnet-shared IPv6 prefix setting feature
> v5 -> v6:
>  - provide detailed commit messages for commits of
>    many changes
>  - rename properties @dhcpstart and @dhcpend to
>    @start-address and @end-address
>  - improve qapi documentation about isolation
>    features (@isolated, @net-uuid)
> v6 -> v7:
>  - update MAINTAINERS list
> v7 -> v8:
>  - QAPI code style fixes
> v8 -> v9:
>  - Fix building on Linux: add missing qapi
>    `'if': 'CONFIG_VMNET'` statement to Netdev union
> v9 -> v10:
>  - Disable vmnet feature for macOS < 11.0: add
>    vmnet.framework API probe into meson.build.
>    This fixes QEMU building on macOS < 11.0:
>    https://patchew.org/QEMU/20220110034000.20221-1-jasow...@redhat.com/
> 

Hi Vladislav,

Which symbols are missing on Catalina besides VMNET_SHARING_SERVICE_BUSY?

It'd be great to get the feature working there.

Thanks,
Roman

> Vladislav Yaroshchuk (7):
>   net/vmnet: add vmnet dependency and customizable option
>   net/vmnet: add vmnet backends to qapi/net
>   net/vmnet: implement shared mode (vmnet-shared)
>   net/vmnet: implement host mode (vmnet-host)
>   net/vmnet: implement bridged mode (vmnet-bridged)
>   net/vmnet: update qemu-options.hx
>   net/vmnet: update MAINTAINERS list
> 
>  MAINTAINERS   |   5 +
>  meson.build   |  16 +-
>  meson_options.txt |   2 +
>  net/clients.h |  11 ++
>  net/meson.build   |   7 +
>  net/net.c |  10 ++
>  net/vmnet-bridged.m   | 111 
>  net/vmnet-common.m| 330 ++
>  net/vmnet-host.c  | 105 +++
>  net/vmnet-shared.c|  92 ++
>  net/vmnet_int.h   |  48 +
>  qapi/net.json | 132 +-
>  qemu-options.hx   |  25 +++
>  scripts/meson-buildoptions.sh |   3 +
>  14 files changed, 894 insertions(+), 3 deletions(-)
>  create mode 

Re: [PULL 00/13] Net patches

2022-01-11 Thread Jason Wang
On Wed, Jan 12, 2022 at 2:19 PM Vladislav Yaroshchuk
 wrote:
>
>
>
> Wed, Jan 12, 2022, 8:39 AM Jason Wang :
>>
>>
>> On 2022/1/12 6:02 AM, Vladislav Yaroshchuk wrote:
>> >
>> >
>> > Tue, Jan 11, 2022, 5:10 AM Jason Wang :
>> >
>> > On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell
>> >  wrote:
>> > >
>> > > On Mon, 10 Jan 2022 at 03:40, Jason Wang 
>> > wrote:
>> > > >
>> > > > The following changes since commit
>> > df722e33d5da26ea8604500ca8f509245a0ea524:
>> > > >
>> > > >   Merge tag 'bsd-user-arm-pull-request' of
>> > gitlab.com:bsdimp/qemu into staging (2022-01-08 09:37:59 -0800)
>> > > >
>> > > > are available in the git repository at:
>> > > >
>> > > > https://github.com/jasowang/qemu.git tags/net-pull-request
>> > > >
>> > > > for you to fetch changes up to
>> > 5136cc6d3b8b74f4fa572f0874656947a401330e:
>> > > >
>> > > >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
>> > > >
>> > > > 
>> > > >
>> > > > 
>> > >
>> > > Fails to build on OSX Catalina:
>> > >
>> > > ../../net/vmnet-common.m:165:10: error: use of undeclared identifier
>> > > 'VMNET_SHARING_SERVICE_BUSY'
>> > > case VMNET_SHARING_SERVICE_BUSY:
>> > >  ^
>> > >
>> > > This constant only got added in macOS 11.0. I guess that technically
>> > > our supported-platforms policy only requires us to support 11
>> > (Big Sur)
>> > > and 12 (Monterey) at this point, but it would be nice to still
>> > be able
>> > > to build on Catalina (10.15).
>> >
>> > Yes, it was only supported by the vmnet framework starting from
>> > Catalyst according to
>> > https://developer.apple.com/documentation/vmnet?language=objc.
>> >
>> >
>> > Yes, there are some symbols from macOS >= 11.0 new backend
>> > uses, not only this one, ex. vmnet_enable_isolation_key:
>> > https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key
>> >
>> > >
>> > > (Personally I would like Catalina still to work at least for a
>> > little
>> > > while, because my x86 Mac is old enough that it is not supported by
>> > > Big Sur. I'll have to dump it once Apple stops doing security
>> > support
>> > > for Catalina, but they haven't done that quite yet.)
>> >
>> >
>> > Sure, broken builds on old macOSes are bad. For this case I think
>> > it's enough to disable vmnet for macOS < 11.0 with a probe while
>> > configure build step. Especially given that Apple supports ~three
>> > latest macOS versions, support for Catalina is expected to end
>> > in 2022, when QEMU releases 7.0.
>>
>>
>> That should be fine.
>>
>>
>> >
>> > If this workaround is not suitable and it's required to support vmnet
>> > in Catalina 10.15 with a subset of available features, it can be done.
>> > But I'll be ready to handle this in approximately two-three weeks only.
>> >
>> > Sure, Vladislav please fix this and send a new version.
>> >
>> >
>> > Quick fix as described above is available in v10:
>> > https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/
>>
>>
>> Have you got chance to test that for macOS < 11.0?
>
>
> Yes, tested on Catalina 10.15. Works as expected.

Cool.

Thanks

>
>> Thanks
>>
>>
>> > Thanks
>> >
>> > >
>> > > -- PMM
>> > >
>> >
>> >
>> >
>> >
>> > --
>> > Best Regards,
>> >
>> > Vladislav Yaroshchuk
>>
>>
>>




Re: [PATCH] clock-vmstate: Add missing END_OF_LIST

2022-01-11 Thread Luc Michel
On 10:19 Tue 11 Jan , Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Add the missing VMSTATE_END_OF_LIST to vmstate_muldiv
> 
> Fixes: 99abcbc7600 ("clock: Provide builtin multiplier/divider")
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Luc Michel 
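
For context, a field list like this one is walked until a terminator entry is reached, so a missing terminator lets the walk run past the end of the array. A minimal sketch of that pattern follows. FieldDesc, FIELD_END_OF_LIST() and total_field_size() are hypothetical, simplified stand-ins written for this illustration; they are not QEMU's VMStateField definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified field descriptor (not QEMU's VMStateField). */
typedef struct FieldDesc {
    const char *name;   /* NULL marks the end of the list */
    size_t size;        /* size of the field in bytes */
} FieldDesc;

/* Hypothetical terminator, playing the role of VMSTATE_END_OF_LIST(). */
#define FIELD_END_OF_LIST() { NULL, 0 }

static const FieldDesc muldiv_fields[] = {
    { "multiplier", sizeof(uint32_t) },
    { "divider",    sizeof(uint32_t) },
    FIELD_END_OF_LIST()  /* without this, the loop below has no in-bounds stop */
};

/* Walk the list up to the terminator and add up the field sizes. */
static size_t total_field_size(const FieldDesc *fields)
{
    size_t total = 0;

    assert(fields != NULL);
    for (const FieldDesc *f = fields; f->name != NULL; f++) {
        total += f->size;
    }
    return total;
}
```

The walker only ever sees a pointer to the first entry, so the terminator is the only thing that tells it where the array ends.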

> ---
>  hw/core/clock-vmstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/core/clock-vmstate.c b/hw/core/clock-vmstate.c
> index 9d9174ffbd..7eccb6d4ea 100644
> --- a/hw/core/clock-vmstate.c
> +++ b/hw/core/clock-vmstate.c
> @@ -44,6 +44,7 @@ const VMStateDescription vmstate_muldiv = {
>  .fields = (VMStateField[]) {
>  VMSTATE_UINT32(multiplier, Clock),
>  VMSTATE_UINT32(divider, Clock),
> +VMSTATE_END_OF_LIST()
>  },
>  };
>  
> -- 
> 2.34.1
> 

-- 



Re: [PULL 00/13] Net patches

2022-01-11 Thread Roman Bolshakov
On Wed, Jan 12, 2022 at 01:39:28PM +0800, Jason Wang wrote:
> 
> On 2022/1/12 6:02 AM, Vladislav Yaroshchuk wrote:
> > 
> > 
> > Tue, Jan 11, 2022, 5:10 AM Jason Wang :
> > 
> > On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell
> >  wrote:
> > >
> > > On Mon, 10 Jan 2022 at 03:40, Jason Wang 
> > wrote:
> > > >
> > > > The following changes since commit
> > df722e33d5da26ea8604500ca8f509245a0ea524:
> > > >
> > > >   Merge tag 'bsd-user-arm-pull-request' of
> > gitlab.com:bsdimp/qemu into staging (2022-01-08 09:37:59 -0800)
> > > >
> > > > are available in the git repository at:
> > > >
> > > > https://github.com/jasowang/qemu.git tags/net-pull-request
> > > >
> > > > for you to fetch changes up to
> > 5136cc6d3b8b74f4fa572f0874656947a401330e:
> > > >
> > > >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
> > > >
> > > > 
> > > >
> > > > 
> > >
> > > Fails to build on OSX Catalina:
> > >
> > > ../../net/vmnet-common.m:165:10: error: use of undeclared identifier
> > > 'VMNET_SHARING_SERVICE_BUSY'
> > >     case VMNET_SHARING_SERVICE_BUSY:
> > >          ^
> > >
> > > This constant only got added in macOS 11.0. I guess that technically
> > > our supported-platforms policy only requires us to support 11
> > (Big Sur)
> > > and 12 (Monterey) at this point, but it would be nice to still
> > be able
> > > to build on Catalina (10.15).
> > 
> > Yes, it was only supported by the vmnet framework starting from
> > Catalyst according to
> > https://developer.apple.com/documentation/vmnet?language=objc.
> > 
> > 
> > Yes, there are some symbols from macOS >= 11.0 new backend
> > uses, not only this one, ex. vmnet_enable_isolation_key:
> > https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key
> > 
> > >
> > > (Personally I would like Catalina still to work at least for a
> > little
> > > while, because my x86 Mac is old enough that it is not supported by
> > > Big Sur. I'll have to dump it once Apple stops doing security
> > support
> > > for Catalina, but they haven't done that quite yet.)
> > 
> > 
> > Sure, broken builds on old macOSes are bad. For this case I think
> > it's enough to disable vmnet for macOS < 11.0 with a probe while
> > configure build step. Especially given that Apple supports ~three
> > latest macOS versions, support for Catalina is expected to end
> > in 2022, when QEMU releases 7.0.
> 
> 
> That should be fine.
> 

I agree with Peter on this,

There's a lot of hardware running with Catalina. I think it's useful to
support it a little longer.

Regards,
Roman

> 
> > 
> > If this workaround is not suitable and it's required to support vmnet
> > in Catalina 10.15 with a subset of available features, it can be done.
> > But I'll be ready to handle this in approximately two-three weeks only.
> > 
> > Sure, Vladislav please fix this and send a new version.
> > 
> > 
> > Quick fix as described above is available in v10:
> > https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/
> 
> 
> Have you got chance to test that for macOS < 11.0?
> 
> Thanks
> 
> 
> > Thanks
> > 
> > >
> > > -- PMM
> > >
> > 
> > 
> > 
> > 
> > -- 
> > Best Regards,
> > 
> > Vladislav Yaroshchuk
> 
> 



Re: [RFC PATCH v3 5/7] audio/coreaudio: Remove a deprecation warning on macOS 12

2022-01-11 Thread Roman Bolshakov
On Mon, Jan 10, 2022 at 02:09:59PM +0100, Philippe Mathieu-Daudé wrote:
> When building on macOS 12 we get:
> 
>   audio/coreaudio.c:50:5: error: 'kAudioObjectPropertyElementMaster' is 
> deprecated: first deprecated in macOS 12.0 [-Werror,-Wdeprecated-declarations]
>   kAudioObjectPropertyElementMaster
>   ^
>   kAudioObjectPropertyElementMain
>   
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks/CoreAudio.framework/Headers/AudioHardwareBase.h:208:5:
>  note: 'kAudioObjectPropertyElementMaster' has been explicitly marked 
> deprecated here
>   kAudioObjectPropertyElementMaster 
> API_DEPRECATED_WITH_REPLACEMENT("kAudioObjectPropertyElementMain", 
> macos(10.0, 12.0), ios(2.0, 15.0), watchos(1.0, 8.0), tvos(9.0, 15.0)) = 
> kAudioObjectPropertyElementMain
>   ^
> 
> Replace by kAudioObjectPropertyElementMain, redefining it to
> kAudioObjectPropertyElementMaster if not available, using
> Clang __is_identifier() feature (coreaudio is restricted to
> macOS).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Checkpatch:
> 
>  WARNING: architecture specific defines should be avoided
>  #10: FILE: audio/coreaudio.c:47:
>  +#if !__is_identifier(kAudioObjectPropertyElementMain) /* macOS >= 12.0 */
> 
> Should we define __is_identifier() to 0 for GCC on macOS?
> ---
>  audio/coreaudio.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/audio/coreaudio.c b/audio/coreaudio.c
> index d8a21d3e507..73cbfd479ac 100644
> --- a/audio/coreaudio.c
> +++ b/audio/coreaudio.c
> @@ -44,10 +44,14 @@ typedef struct coreaudioVoiceOut {
>  bool enabled;
>  } coreaudioVoiceOut;
>  
> +#if !__is_identifier(kAudioObjectPropertyElementMain) /* macOS >= 12.0 */
> +#define kAudioObjectPropertyElementMain kAudioObjectPropertyElementMaster
> +#endif

Christian and Akihiko are right: you need to replace it with macOS version
wrappers:

diff --git a/audio/coreaudio.c b/audio/coreaudio.c
index 73cbfd479a..7367a2ffd4 100644
--- a/audio/coreaudio.c
+++ b/audio/coreaudio.c
@@ -44,7 +44,8 @@ typedef struct coreaudioVoiceOut {
 bool enabled;
 } coreaudioVoiceOut;

-#if !__is_identifier(kAudioObjectPropertyElementMain) /* macOS >= 12.0 */
+#if !defined(MAC_OS_VERSION_12_0) || \
+(MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_VERSION_12_0)
 #define kAudioObjectPropertyElementMain kAudioObjectPropertyElementMaster
 #endif


And in patch 6 you'd do likewise:

diff --git a/block/file-posix.c b/block/file-posix.c
index 1d0512026c..c0038629a1 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3325,7 +3325,8 @@ BlockDriver bdrv_file = {
 static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
 CFIndex maxPathSize, int flags);

-#if !__is_identifier(IOMainPort) /* macOS >= 12.0 */
+#if !defined(MAC_OS_VERSION_12_0) || \
+(MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_VERSION_12_0)
 #define IOMainPort IOMasterPort
 #endif

This way the build would also work on older macOS.
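
To illustrate why the `!defined(MAC_OS_VERSION_12_0)` half of the condition matters: inside `#if`, an undefined macro evaluates to 0, so on an older SDK the bare comparison would read `MAC_OS_X_VERSION_MAX_ALLOWED < 0` and the fallback define would be skipped. A minimal sketch follows. SDK_LACKS_MACOS_12_NAMES and pick_name() are hypothetical names used only for this example, and the sketch assumes AvailabilityMacros.h provides the version macros on macOS.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __APPLE__
#include <AvailabilityMacros.h>  /* assumed source of the version macros */
#endif

/*
 * Hypothetical helper macro: 1 when the SDK predates macOS 12 (or is not
 * a macOS SDK at all), 0 when the macOS 12 names are available.
 */
#if !defined(MAC_OS_VERSION_12_0) || \
    (MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_VERSION_12_0)
#define SDK_LACKS_MACOS_12_NAMES 1
#else
#define SDK_LACKS_MACOS_12_NAMES 0
#endif

/* Return the spelling that the build would end up using. */
static const char *pick_name(const char *new_name, const char *old_name)
{
    assert(new_name != NULL && old_name != NULL);
    return SDK_LACKS_MACOS_12_NAMES ? old_name : new_name;
}
```

Unlike `__is_identifier()`, this check relies only on preprocessor macros, so it does not depend on a Clang-specific feature.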


Two more issues are left:

1. The linker is passed corrupted paths to the clang directory (happens on all
macOS versions).

Monterey:

[732/737] Linking target qemu-system-mips-unsigned
ld: warning: directory not found for option 
'-Lns/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/13.0.0'
[733/737] Linking target qemu-system-mips64-unsigned
ld: warning: directory not found for option 
'-Lns/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/13.0.0'
[737/737] Generating qemu-system-mips64 with a custom command

Catalina:

ld: warning: directory not found for option 
'-Lveloper/CommandLineTools/usr/lib/clang/11.0.0'
[102/105] Linking target qemu-system-or1k-unsigned
ld: warning: directory not found for option 
'-Lveloper/CommandLineTools/usr/lib/clang/11.0.0'
[104/105] Linking target qemu-system-ppc-unsigned
ld: warning: directory not found for option 
'-Lveloper/CommandLineTools/usr/lib/clang/11.0.0'
[105/105] Generating qemu-system-ppc with a custom command

2. QEMU tests show a FENV_ACCESS warning on Monterey:


[409/771] Compiling C object 
tests/fp/libtestfloat.a.p/berkeley-testfloat-3_source_test_az_f128_rx.c.o
../tests/fp/berkeley-testfloat-3/source/test_az_f128_rx.c:49:14: warning: 
'#pragma FENV_ACCESS' is not supported on this target - ignored 
[-Wignored-pragmas]
#pragma STDC FENV_ACCESS ON
 ^
1 warning generated.
[410/771] Compiling C object 
tests/fp/libtestfloat.a.p/berkeley-testfloat-3_source_test_abcz_f128.c.o
../tests/fp/berkeley-testfloat-3/source/test_abcz_f128.c:48:14: warning: 
'#pragma FENV_ACCESS' is not supported on this target - ignored 
[-Wignored-pragmas]
#pragma STDC FENV_ACCESS ON
 ^
1 warning generated.

Regards,
Roman

> +
>  static const AudioObjectPropertyAddress voice_addr = {
>  kAudioHardwarePropertyDefaultOutputDevice,
>  kAudioObjectPropertyScopeGlobal,
> -

Re: [PULL 00/13] Net patches

2022-01-11 Thread Vladislav Yaroshchuk
Wed, Jan 12, 2022, 8:39 AM Jason Wang :

>
> On 2022/1/12 6:02 AM, Vladislav Yaroshchuk wrote:
> >
> >
> > Tue, Jan 11, 2022, 5:10 AM Jason Wang :
> >
> > On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell
> >  wrote:
> > >
> > > On Mon, 10 Jan 2022 at 03:40, Jason Wang 
> > wrote:
> > > >
> > > > The following changes since commit
> > df722e33d5da26ea8604500ca8f509245a0ea524:
> > > >
> > > >   Merge tag 'bsd-user-arm-pull-request' of
> > gitlab.com:bsdimp/qemu into staging (2022-01-08 09:37:59 -0800)
> > > >
> > > > are available in the git repository at:
> > > >
> > > > https://github.com/jasowang/qemu.git tags/net-pull-request
> > > >
> > > > for you to fetch changes up to
> > 5136cc6d3b8b74f4fa572f0874656947a401330e:
> > > >
> > > >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
> > > >
> > > > 
> > > >
> > > > 
> > >
> > > Fails to build on OSX Catalina:
> > >
> > > ../../net/vmnet-common.m:165:10: error: use of undeclared
> identifier
> > > 'VMNET_SHARING_SERVICE_BUSY'
> > > case VMNET_SHARING_SERVICE_BUSY:
> > >  ^
> > >
> > > This constant only got added in macOS 11.0. I guess that
> technically
> > > our supported-platforms policy only requires us to support 11
> > (Big Sur)
> > > and 12 (Monterey) at this point, but it would be nice to still
> > be able
> > > to build on Catalina (10.15).
> >
> > Yes, it was only supported by the vmnet framework starting from
> > Catalyst according to
> > https://developer.apple.com/documentation/vmnet?language=objc.
> >
> >
> > Yes, there are some symbols from macOS >= 11.0 new backend
> > uses, not only this one, ex. vmnet_enable_isolation_key:
> >
> https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key
> >
> > >
> > > (Personally I would like Catalina still to work at least for a
> > little
> > > while, because my x86 Mac is old enough that it is not supported by
> > > Big Sur. I'll have to dump it once Apple stops doing security
> > support
> > > for Catalina, but they haven't done that quite yet.)
> >
> >
> > Sure, broken builds on old macOSes are bad. For this case I think
> > it's enough to disable vmnet for macOS < 11.0 with a probe while
> > configure build step. Especially given that Apple supports ~three
> > latest macOS versions, support for Catalina is expected to end
> > in 2022, when QEMU releases 7.0.
>
>
> That should be fine.
>
>
> >
> > If this workaround is not suitable and it's required to support vmnet
> > in Catalina 10.15 with a subset of available features, it can be done.
> > But I'll be ready to handle this in approximately two-three weeks only.
> >
> > Sure, Vladislav please fix this and send a new version.
> >
> >
> > Quick fix as described above is available in v10:
> >
> https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/
>
>
> Have you got chance to test that for macOS < 11.0?
>

Yes, tested on Catalina 10.15. Works as expected.

Thanks
>
>
> > Thanks
> >
> > >
> > > -- PMM
> > >
> >
> >
> >
> >
> > --
> > Best Regards,
> >
> > Vladislav Yaroshchuk


>
>


Re: [PULL 00/13] Net patches

2022-01-11 Thread Jason Wang



On 2022/1/12 6:02 AM, Vladislav Yaroshchuk wrote:



Tue, Jan 11, 2022, 5:10 AM Jason Wang :

On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell
 wrote:
>
> On Mon, 10 Jan 2022 at 03:40, Jason Wang 
wrote:
> >
> > The following changes since commit
df722e33d5da26ea8604500ca8f509245a0ea524:
> >
> >   Merge tag 'bsd-user-arm-pull-request' of
gitlab.com:bsdimp/qemu into staging (2022-01-08 09:37:59 -0800)
> >
> > are available in the git repository at:
> >
> > https://github.com/jasowang/qemu.git tags/net-pull-request
> >
> > for you to fetch changes up to
5136cc6d3b8b74f4fa572f0874656947a401330e:
> >
> >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
> >
> > 
> >
> > 
>
> Fails to build on OSX Catalina:
>
> ../../net/vmnet-common.m:165:10: error: use of undeclared identifier
> 'VMNET_SHARING_SERVICE_BUSY'
>     case VMNET_SHARING_SERVICE_BUSY:
>          ^
>
> This constant only got added in macOS 11.0. I guess that technically
> our supported-platforms policy only requires us to support 11
(Big Sur)
> and 12 (Monterey) at this point, but it would be nice to still
be able
> to build on Catalina (10.15).

Yes, it was only supported by the vmnet framework starting from
Catalyst according to
https://developer.apple.com/documentation/vmnet?language=objc.


Yes, there are some symbols from macOS >= 11.0 new backend
uses, not only this one, ex. vmnet_enable_isolation_key:
https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key

>
> (Personally I would like Catalina still to work at least for a
little
> while, because my x86 Mac is old enough that it is not supported by
> Big Sur. I'll have to dump it once Apple stops doing security
support
> for Catalina, but they haven't done that quite yet.)


Sure, broken builds on old macOSes are bad. For this case I think
it's enough to disable vmnet for macOS < 11.0 with a probe while
configure build step. Especially given that Apple supports ~three
latest macOS versions, support for Catalina is expected to end
in 2022, when QEMU releases 7.0.



That should be fine.




If this workaround is not suitable and it's required to support vmnet
in Catalina 10.15 with a subset of available features, it can be done.
But I'll be ready to handle this in approximately two-three weeks only.

Sure, Vladislav please fix this and send a new version.


Quick fix as described above is available in v10:
https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/



Have you got chance to test that for macOS < 11.0?

Thanks



Thanks

>
> -- PMM
>




--
Best Regards,

Vladislav Yaroshchuk





[PATCH v2 4/5] target/s390x: Fix shifting 32-bit values for more than 31 bits

2022-01-11 Thread Ilya Leoshkevich
According to PoP, both 32- and 64-bit shifts use the lowest 6 address
bits. The current code special-cases 32-bit shifts to use only 5 bits,
which is not correct. For example, shifting by 32 bits currently
preserves the initial value; however, it's supposed to zero it out
instead.

Fix by merging sh32 and sh64 and adapting cc_calc_sla_32() to shift
values greater than 31.
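
As a rough illustration of the cc_calc_sla_32() change, the sketch below computes the SLA condition code for a 64-bit operand and reuses it for 32-bit operands by moving them into the upper half, as the patch does. The names sla_cc_64() and sla_cc_32() are hypothetical, and the body of sla_cc_64() is an assumption written for this example rather than a copy of the QEMU helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Condition code for a 64-bit SHIFT LEFT SINGLE:
 * 0 = zero, 1 = negative, 2 = positive, 3 = overflow.
 */
static uint32_t sla_cc_64(uint64_t src, int shift)
{
    const uint64_t sign = 1ULL << 63;
    uint64_t mask, match, result;

    assert(shift >= 0 && shift <= 63);

    /* The sign bit plus the 'shift' numeric bits that get shifted out. */
    mask = ~0ULL << (63 - shift);
    /* No overflow only if all of those bits equal the sign bit. */
    match = (src & sign) ? mask : 0;
    if ((src & mask) != match) {
        return 3;
    }

    /* Shift the numeric part and keep the original sign bit. */
    result = ((src << shift) & ~sign) | (src & sign);
    if (result == 0) {
        return 0;
    }
    return (result & sign) ? 1 : 2;
}

/* 32-bit variant: place the operand in the upper half and reuse the above. */
static uint32_t sla_cc_32(uint32_t src, int shift)
{
    return sla_cc_64((uint64_t)src << 32, shift);
}
```

In this sketch sla_cc_32(1, 32) yields 3 (overflow), because the only set bit is shifted out of the numeric part, and no C shift ever uses an out-of-range count.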

Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/cc_helper.c   | 32 +-
 target/s390x/tcg/insn-data.def | 36 +-
 target/s390x/tcg/translate.c   | 31 ++---
 3 files changed, 33 insertions(+), 66 deletions(-)

diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index b6acffa3e8..3cfbfadf48 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -268,33 +268,6 @@ static uint32_t cc_calc_icm(uint64_t mask, uint64_t val)
 }
 }
 
-static uint32_t cc_calc_sla_32(uint32_t src, int shift)
-{
-uint32_t mask = ((1U << shift) - 1U) << (32 - shift);
-uint32_t sign = 1U << 31;
-uint32_t match;
-int32_t r;
-
-/* Check if the sign bit stays the same.  */
-if (src & sign) {
-match = mask;
-} else {
-match = 0;
-}
-if ((src & mask) != match) {
-/* Overflow.  */
-return 3;
-}
-
-r = ((src << shift) & ~sign) | (src & sign);
-if (r == 0) {
-return 0;
-} else if (r < 0) {
-return 1;
-}
-return 2;
-}
-
 static uint32_t cc_calc_sla_64(uint64_t src, int shift)
 {
 /* Do not use (1ULL << (shift + 1)): it triggers UB when shift is 63.  */
@@ -323,6 +296,11 @@ static uint32_t cc_calc_sla_64(uint64_t src, int shift)
 return 2;
 }
 
+static uint32_t cc_calc_sla_32(uint32_t src, int shift)
+{
+return cc_calc_sla_64(((uint64_t)src) << 32, shift);
+}
+
 static uint32_t cc_calc_flogr(uint64_t dst)
 {
 return dst ? 2 : 0;
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 90c753068c..1c3e115712 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -747,8 +747,8 @@
 C(0xb9e1, POPCNT,  RRE,   PC,  0, r2_o, r1, 0, popcnt, nz64)
 
 /* ROTATE LEFT SINGLE LOGICAL */
-C(0xeb1d, RLL, RSY_a, Z,   r3_o, sh32, new, r1_32, rll32, 0)
-C(0xeb1c, RLLG,RSY_a, Z,   r3_o, sh64, r1, 0, rll64, 0)
+C(0xeb1d, RLL, RSY_a, Z,   r3_o, sh, new, r1_32, rll32, 0)
+C(0xeb1c, RLLG,RSY_a, Z,   r3_o, sh, r1, 0, rll64, 0)
 
 /* ROTATE THEN INSERT SELECTED BITS */
 C(0xec55, RISBG,   RIE_f, GIE, 0, r2, r1, 0, risbg, s64)
@@ -784,29 +784,29 @@
 C(0x0400, SPM, RR_a,  Z,   r1, 0, 0, 0, spm, 0)
 
 /* SHIFT LEFT SINGLE */
-D(0x8b00, SLA, RS_a,  Z,   r1, sh32, new, r1_32, sla, 0, 31)
-D(0xebdd, SLAK,RSY_a, DO,  r3, sh32, new, r1_32, sla, 0, 31)
-D(0xeb0b, SLAG,RSY_a, Z,   r3, sh64, r1, 0, sla, 0, 63)
+D(0x8b00, SLA, RS_a,  Z,   r1, sh, new, r1_32, sla, 0, 31)
+D(0xebdd, SLAK,RSY_a, DO,  r3, sh, new, r1_32, sla, 0, 31)
+D(0xeb0b, SLAG,RSY_a, Z,   r3, sh, r1, 0, sla, 0, 63)
 /* SHIFT LEFT SINGLE LOGICAL */
-C(0x8900, SLL, RS_a,  Z,   r1_o, sh32, new, r1_32, sll, 0)
-C(0xebdf, SLLK,RSY_a, DO,  r3_o, sh32, new, r1_32, sll, 0)
-C(0xeb0d, SLLG,RSY_a, Z,   r3_o, sh64, r1, 0, sll, 0)
+C(0x8900, SLL, RS_a,  Z,   r1_o, sh, new, r1_32, sll, 0)
+C(0xebdf, SLLK,RSY_a, DO,  r3_o, sh, new, r1_32, sll, 0)
+C(0xeb0d, SLLG,RSY_a, Z,   r3_o, sh, r1, 0, sll, 0)
 /* SHIFT RIGHT SINGLE */
-C(0x8a00, SRA, RS_a,  Z,   r1_32s, sh32, new, r1_32, sra, s32)
-C(0xebdc, SRAK,RSY_a, DO,  r3_32s, sh32, new, r1_32, sra, s32)
-C(0xeb0a, SRAG,RSY_a, Z,   r3_o, sh64, r1, 0, sra, s64)
+C(0x8a00, SRA, RS_a,  Z,   r1_32s, sh, new, r1_32, sra, s32)
+C(0xebdc, SRAK,RSY_a, DO,  r3_32s, sh, new, r1_32, sra, s32)
+C(0xeb0a, SRAG,RSY_a, Z,   r3_o, sh, r1, 0, sra, s64)
 /* SHIFT RIGHT SINGLE LOGICAL */
-C(0x8800, SRL, RS_a,  Z,   r1_32u, sh32, new, r1_32, srl, 0)
-C(0xebde, SRLK,RSY_a, DO,  r3_32u, sh32, new, r1_32, srl, 0)
-C(0xeb0c, SRLG,RSY_a, Z,   r3_o, sh64, r1, 0, srl, 0)
+C(0x8800, SRL, RS_a,  Z,   r1_32u, sh, new, r1_32, srl, 0)
+C(0xebde, SRLK,RSY_a, DO,  r3_32u, sh, new, r1_32, srl, 0)
+C(0xeb0c, SRLG,RSY_a, Z,   r3_o, sh, r1, 0, srl, 0)
 /* SHIFT LEFT DOUBLE */
-D(0x8f00, SLDA,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sla, 0, 63)
+D(0x8f00, SLDA,RS_a,  Z,   r1_D32, sh, new, r1_D32, sla, 0, 63)
 /* SHIFT LEFT DOUBLE LOGICAL */
-C(0x8d00, SLDL,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sll, 0)
+C(0x8d00, SLDL,RS_a,  Z,   r1_D32, sh, new, r1_D32, sll, 0)
 /* SHIFT RIGHT DOUBLE */
-C(0x8e00, SRDA,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sra, s64)
+C(0x8e00, SRDA,RS_a,  Z,   r1_D32, sh, new, r1_D32, 

Re: [PATCH] target/s390x: Fix 32-bit shifts

2022-01-11 Thread Ilya Leoshkevich
On Tue, 2022-01-11 at 15:22 +0100, David Hildenbrand wrote:
> On 10.01.22 19:59, Ilya Leoshkevich wrote:
> > Both 32- and 64-bit shifts use lowest 6 address bits. The current
> > code special-cases 32-bit shifts to use only 5 bits, which is not
> > correct.
> > 
> 
> I assume for 32-bit shifts, we could only shift by 31, not by 32 or
> bigger. So it's impossible to zero out a 32bit register using a shift
> right now.

Thanks for having a detailed look!

That's my understanding of the problem as well.

> Let's take a look at the details:
> 
> * RLL: IMHO it doesn't matter if we rotate by an additional 32bit,
> the
>    result is the same. Not broken.
> * SLA/SLAK: the numerical part is 31-bit, we don't care about
> shifting
>     any more, the result for the numerical part is the same
> (0).
>     Not broken.
> * SLL/SLLK: Is broken because we cannot shift by > 31 and create
>     a 0 result. Broken.
> * SRA/SRAK: Same as SLA/SLAK. Not broken.
> * SRL/SRLK: Same as SLL/SLLK, broken.
> * SLDA/SLDL ... should not be broken because they are 64 bit shifts.
> 
> So AFAIKS, only SLL/SLLK and SRL/SRLK need fixing to be able to
> shift by > 31.

I think others (except rotation) are affected too, because they cannot
distinguish between shifting by 0 and 32 bits.
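
To illustrate the 0-versus-32 point, here is a minimal sketch in plain C (not QEMU's TCG code; the helper names shift_count_5bit, shift_count_6bit and sll32 are hypothetical). With only the low 5 address bits, a requested shift of 32 collapses to 0; with 6 bits it is preserved, so a 32-bit logical shift can actually produce 0.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour for 32-bit shifts: only the low 5 address bits are used. */
static unsigned shift_count_5bit(uint64_t addr)
{
    return (unsigned)(addr & 31);
}

/* Behaviour described in the patch: the low 6 address bits are used. */
static unsigned shift_count_6bit(uint64_t addr)
{
    return (unsigned)(addr & 63);
}

/*
 * 32-bit logical left shift for counts 0..63. In C, shifting a 32-bit
 * value by 32 or more is undefined, so that case is handled explicitly.
 */
static uint32_t sll32(uint32_t value, unsigned count)
{
    assert(count < 64);
    return count >= 32 ? 0 : value << count;
}
```

With a 5-bit count, a shift by 32 leaves the value unchanged; with a 6-bit count it clears the register.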

> The issue with this patch is that I think it degrades CC computation.
> For 32bit, we could now get a shift < 64, and I think at least
> cc_calc_sla_32() is not prepared for that.

Ouch, that's now broken indeed. I've fixed it in v2 and added a test.

> > Fix by merging sh32 and sh64.
> > 
> > Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE")
> > Signed-off-by: Ilya Leoshkevich 
> > ---
> >  target/s390x/tcg/insn-data.def  | 36 -
> > 
> >  target/s390x/tcg/translate.c    | 10 ++---
> >  tests/tcg/s390x/Makefile.target |  1 +
> >  tests/tcg/s390x/sll.c   | 25 +++
> >  4 files changed, 46 insertions(+), 26 deletions(-)
> >  create mode 100644 tests/tcg/s390x/sll.c
> > 
> > diff --git a/target/s390x/tcg/insn-data.def
> > b/target/s390x/tcg/insn-data.def
> > index f0af458aee..348a15be72 100644
> > --- a/target/s390x/tcg/insn-data.def
> > +++ b/target/s390x/tcg/insn-data.def
> > @@ -747,8 +747,8 @@
> >  C(0xb9e1, POPCNT,  RRE,   PC,  0, r2_o, r1, 0, popcnt, nz64)
> >  
> >  /* ROTATE LEFT SINGLE LOGICAL */
> > -    C(0xeb1d, RLL, RSY_a, Z,   r3_o, sh32, new, r1_32, rll32,
> > 0)
> > -    C(0xeb1c, RLLG,    RSY_a, Z,   r3_o, sh64, r1, 0, rll64, 0)
> > +    C(0xeb1d, RLL, RSY_a, Z,   r3_o, sh, new, r1_32, rll32, 0)
> > +    C(0xeb1c, RLLG,    RSY_a, Z,   r3_o, sh, r1, 0, rll64, 0)
> >  
> >  /* ROTATE THEN INSERT SELECTED BITS */
> >  C(0xec55, RISBG,   RIE_f, GIE, 0, r2, r1, 0, risbg, s64)
> > @@ -784,29 +784,29 @@
> >  C(0x0400, SPM, RR_a,  Z,   r1, 0, 0, 0, spm, 0)
> >  
> >  /* SHIFT LEFT SINGLE */
> > -    D(0x8b00, SLA, RS_a,  Z,   r1, sh32, new, r1_32, sla, 0,
> > 31)
> > -    D(0xebdd, SLAK,    RSY_a, DO,  r3, sh32, new, r1_32, sla, 0,
> > 31)
> > -    D(0xeb0b, SLAG,    RSY_a, Z,   r3, sh64, r1, 0, sla, 0, 63)
> > +    D(0x8b00, SLA, RS_a,  Z,   r1, sh, new, r1_32, sla, 0, 31)
> > +    D(0xebdd, SLAK,    RSY_a, DO,  r3, sh, new, r1_32, sla, 0, 31)
> > +    D(0xeb0b, SLAG,    RSY_a, Z,   r3, sh, r1, 0, sla, 0, 63)
> >  /* SHIFT LEFT SINGLE LOGICAL */
> > -    C(0x8900, SLL, RS_a,  Z,   r1_o, sh32, new, r1_32, sll, 0)
> > -    C(0xebdf, SLLK,    RSY_a, DO,  r3_o, sh32, new, r1_32, sll, 0)
> > -    C(0xeb0d, SLLG,    RSY_a, Z,   r3_o, sh64, r1, 0, sll, 0)
> > +    C(0x8900, SLL, RS_a,  Z,   r1_o, sh, new, r1_32, sll, 0)
> > +    C(0xebdf, SLLK,    RSY_a, DO,  r3_o, sh, new, r1_32, sll, 0)
> > +    C(0xeb0d, SLLG,    RSY_a, Z,   r3_o, sh, r1, 0, sll, 0)
> >  /* SHIFT RIGHT SINGLE */
> > -    C(0x8a00, SRA, RS_a,  Z,   r1_32s, sh32, new, r1_32, sra,
> > s32)
> > -    C(0xebdc, SRAK,    RSY_a, DO,  r3_32s, sh32, new, r1_32, sra,
> > s32)
> > -    C(0xeb0a, SRAG,    RSY_a, Z,   r3_o, sh64, r1, 0, sra, s64)
> > +    C(0x8a00, SRA, RS_a,  Z,   r1_32s, sh, new, r1_32, sra,
> > s32)
> > +    C(0xebdc, SRAK,    RSY_a, DO,  r3_32s, sh, new, r1_32, sra,
> > s32)
> > +    C(0xeb0a, SRAG,    RSY_a, Z,   r3_o, sh, r1, 0, sra, s64)
> >  /* SHIFT RIGHT SINGLE LOGICAL */
> > -    C(0x8800, SRL, RS_a,  Z,   r1_32u, sh32, new, r1_32, srl,
> > 0)
> > -    C(0xebde, SRLK,    RSY_a, DO,  r3_32u, sh32, new, r1_32, srl,
> > 0)
> > -    C(0xeb0c, SRLG,    RSY_a, Z,   r3_o, sh64, r1, 0, srl, 0)
> > +    C(0x8800, SRL, RS_a,  Z,   r1_32u, sh, new, r1_32, srl, 0)
> > +    C(0xebde, SRLK,    RSY_a, DO,  r3_32u, sh, new, r1_32, srl, 0)
> > +    C(0xeb0c, SRLG,    RSY_a, Z,   r3_o, sh, r1, 0, srl, 0)
> >  /* SHIFT LEFT DOUBLE */
> > -    D(0x8f00, SLDA,    RS_a,  Z,   r1_D32, sh64, new, r1_D32, sla,
> > 0, 31)
> > +    D(0x8f00, SLDA,    RS_a,  Z,   r1_D32, sh, new, r1_D32, sla,
> > 0, 31)
> 
> I'm confused. Is the 31 

[PATCH v2 3/5] target/s390x: Fix cc_calc_sla_64() missing overflows

2022-01-11 Thread Ilya Leoshkevich
An overflow occurs for SLAG when at least one shifted bit is not equal
to the sign bit. Therefore, we need to check that `shift + 1` bits are
neither all 0s nor all 1s. The current code checks only `shift` bits,
missing some overflows.
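
The check described above can be sketched in standalone C as follows. This is only an illustration under the assumption that shift is in the range 0..63; sla64_mask and sla64_overflows are hypothetical names, not functions from cc_helper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the top (shift + 1) bits of a 64-bit value. */
static uint64_t sla64_mask(int shift)
{
    assert(shift >= 0 && shift <= 63);
    /* (shift + 1) low one-bits, built without ever shifting by 64. */
    uint64_t low_ones = (((1ULL << shift) - 1) << 1) + 1;
    return low_ones << (64 - (shift + 1));
}

/*
 * Overflow happens when the sign bit and the bits shifted out are not
 * all equal, i.e. the masked bits are neither all 0s nor all 1s.
 */
static bool sla64_overflows(uint64_t src, int shift)
{
    uint64_t mask = sla64_mask(shift);
    uint64_t top = src & mask;
    return top != 0 && top != mask;
}
```

For example, 0x4000000000000000 shifted left by 1 moves a 1 bit into the position next to a 0 sign bit. A mask of only `shift` bits would cover just the sign bit and miss this case.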

Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/cc_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c
index c2c96c3a3c..b6acffa3e8 100644
--- a/target/s390x/tcg/cc_helper.c
+++ b/target/s390x/tcg/cc_helper.c
@@ -297,7 +297,8 @@ static uint32_t cc_calc_sla_32(uint32_t src, int shift)
 
 static uint32_t cc_calc_sla_64(uint64_t src, int shift)
 {
-uint64_t mask = ((1ULL << shift) - 1ULL) << (64 - shift);
+/* Do not use (1ULL << (shift + 1)): it triggers UB when shift is 63.  */
+uint64_t mask = ((((1ULL << shift) - 1) << 1) + 1) << (64 - (shift + 1));
 uint64_t sign = 1ULL << 63;
 uint64_t match;
 int64_t r;
-- 
2.31.1




[PATCH v2 5/5] tests/tcg/s390x: Test shift instructions

2022-01-11 Thread Ilya Leoshkevich
Add a test for each shift instruction in order to prevent
regressions.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |   1 +
 tests/tcg/s390x/shift.c | 258 
 2 files changed, 259 insertions(+)
 create mode 100644 tests/tcg/s390x/shift.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index cc64dd32d2..1a7238b4eb 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -9,6 +9,7 @@ TESTS+=exrl-trtr
 TESTS+=pack
 TESTS+=mvo
 TESTS+=mvc
+TESTS+=shift
 TESTS+=trap
 TESTS+=signals-s390x
 
diff --git a/tests/tcg/s390x/shift.c b/tests/tcg/s390x/shift.c
new file mode 100644
index 00..73bac9d255
--- /dev/null
+++ b/tests/tcg/s390x/shift.c
@@ -0,0 +1,258 @@
+#include 
+#include 
+#include 
+
+#define DEFINE_SHIFT_SINGLE_COMMON(name, insn_str) \
+static uint64_t name(uint64_t op1, uint64_t op2, uint64_t *cc) \
+{ \
+asm("spm %[cc]\n" \
+"" insn_str "\n" \
+"ipm %[cc]" \
+: [op1] "+" (op1), \
+  [cc] "+r" (*cc) \
+: [op2] "r" (op2) \
+: "cc"); \
+return op1; \
+}
+#define DEFINE_SHIFT_SINGLE_2(insn, offset) \
+DEFINE_SHIFT_SINGLE_COMMON(insn ## _ ## offset, \
+   #insn " %[op1]," #offset "(%[op2])")
+#define DEFINE_SHIFT_SINGLE_3(insn, offset) \
+DEFINE_SHIFT_SINGLE_COMMON(insn ## _ ## offset, \
+   #insn " %[op1],%[op1]," #offset "(%[op2])")
+#define DEFINE_SHIFT_DOUBLE(insn, offset) \
+static uint64_t insn ## _ ## offset(uint64_t op1, uint64_t op2, \
+uint64_t *cc) \
+{ \
+uint32_t op1h = op1 >> 32; \
+uint32_t op1l = op1 & 0xffffffff; \
+register uint32_t r2 asm("2") = op1h; \
+register uint32_t r3 asm("3") = op1l; \
+\
+asm("spm %[cc]\n" \
+"" #insn " %[r2]," #offset "(%[op2])\n" \
+"ipm %[cc]" \
+: [r2] "+" (r2), \
+  [r3] "+" (r3), \
+  [cc] "+r" (*cc) \
+: [op2] "r" (op2) \
+: "cc"); \
+op1h = r2; \
+op1l = r3; \
+return (((uint64_t)op1h) << 32) | op1l; \
+}
+
+DEFINE_SHIFT_SINGLE_3(rll, 0x4cf3b);
+DEFINE_SHIFT_SINGLE_3(rllg, 0x697c9);
+DEFINE_SHIFT_SINGLE_2(sla, 0x4b0);
+DEFINE_SHIFT_SINGLE_2(sla, 0xd54);
+DEFINE_SHIFT_SINGLE_3(slak, 0x2832c);
+DEFINE_SHIFT_SINGLE_3(slag, 0x66cc4);
+DEFINE_SHIFT_SINGLE_2(sll, 0xd04);
+DEFINE_SHIFT_SINGLE_3(sllk, 0x2699f);
+DEFINE_SHIFT_SINGLE_3(sllg, 0x59df9);
+DEFINE_SHIFT_SINGLE_2(sra, 0x67e);
+DEFINE_SHIFT_SINGLE_3(srak, 0x60943);
+DEFINE_SHIFT_SINGLE_3(srag, 0x6b048);
+DEFINE_SHIFT_SINGLE_2(srl, 0x035);
+DEFINE_SHIFT_SINGLE_3(srlk, 0x43dfc);
+DEFINE_SHIFT_SINGLE_3(srlg, 0x27227);
+DEFINE_SHIFT_DOUBLE(slda, 0x38b);
+DEFINE_SHIFT_DOUBLE(sldl, 0x031);
+DEFINE_SHIFT_DOUBLE(srda, 0x36f);
+DEFINE_SHIFT_DOUBLE(srdl, 0x99a);
+
+struct shift_test {
+const char *name;
+uint64_t (*insn)(uint64_t, uint64_t, uint64_t *);
+uint64_t op1;
+uint64_t op2;
+uint64_t exp_result;
+uint64_t exp_cc;
+};
+
+static const struct shift_test tests[] = {
+{
+.name = "rll",
+.insn = rll_0x4cf3b,
+.op1 = 0xecbd589a45c248f5ull,
+.op2 = 0x62e5508ccb4c99fdull,
+.exp_result = 0xecbd589af545c248ull,
+.exp_cc = 0,
+},
+{
+.name = "rllg",
+.insn = rllg_0x697c9,
+.op1 = 0xaa2d54c1b729f7f4ull,
+.op2 = 0x5ffcf7465f5cd71full,
+.exp_result = 0x29f7f4aa2d54c1b7ull,
+.exp_cc = 0,
+},
+{
+.name = "sla-1",
+.insn = sla_0x4b0,
+.op1 = 0x8bf21fb67cca0e96ull,
+.op2 = 0x3ddf2f53347d3030ull,
+.exp_result = 0x8bf21fb6ull,
+.exp_cc = 3,
+},
+{
+.name = "sla-2",
+.insn = sla_0xd54,
+.op1 = 0xe4faaed5def0e926ull,
+.op2 = 0x18d586fab239cbeeull,
+.exp_result = 0xe4faaed5fbc3a498ull,
+.exp_cc = 3,
+},
+{
+.name = "slak",
+.insn = slak_0x2832c,
+.op1 = 0x7300bf78707f09f9ull,
+.op2 = 0x4d193b85bb5cb39bull,
+.exp_result = 0x7300bf783f84fc80ull,
+.exp_cc = 3,
+},
+{
+.name = "slag",
+.insn = slag_0x66cc4,
+.op1 = 0xe805966de1a77762ull,
+.op2 = 0x0e92953f6aa91c6bull,
+.exp_result = 0xbbb1ull,
+.exp_cc = 3,
+},
+{
+.name = "sll",
+.insn = sll_0xd04,
+.op1 = 0xb90281a3105939dfull,
+.op2 = 0xb5e4df7e082e4c5eull,
+.exp_result = 0xb90281a3ull,
+.exp_cc = 0,
+},
+{
+.name = "sllk",
+.insn = sllk_0x2699f,
+.op1 = 0x777c6cf116f99557ull,
+.op2 = 0xe0556cf112e5a458ull,
+.exp_result = 0x777c6cf1ull,
+.exp_cc = 0,
+},
+{

[PATCH v2 2/5] target/s390x: Fix SRDA CC calculation

2022-01-11 Thread Ilya Leoshkevich
SRDA uses r1_D32 for binding the first operand and s64 for setting CC.
cout_s64() relies on o->out being the shift result; however,
wout_r1_D32() clobbers it.

Fix by using a temporary.
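
The effect of the clobber can be shown with a plain C analogy (a sketch only; struct reg_pair, write_pair_clobbering, write_pair_preserving and cc_s64 are hypothetical stand-ins, not the TCG code). The condition-code convention assumed here is 0 for zero, 1 for negative, 2 for positive.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the even/odd 32-bit register pair. */
struct reg_pair {
    uint32_t even; /* high half */
    uint32_t odd;  /* low half */
};

/* Old pattern: *out is shifted in place and no longer holds the result. */
static void write_pair_clobbering(struct reg_pair *regs, uint64_t *out)
{
    assert(regs && out);
    regs->odd = (uint32_t)*out;
    *out >>= 32;
    regs->even = (uint32_t)*out;
}

/* Fixed pattern: the high half goes through a temporary. */
static void write_pair_preserving(struct reg_pair *regs, const uint64_t *out)
{
    assert(regs && out);
    uint64_t t = *out >> 32;
    regs->odd = (uint32_t)*out;
    regs->even = (uint32_t)t;
}

/* Signed 64-bit condition code: 0 if zero, 1 if negative, 2 if positive. */
static int cc_s64(uint64_t result)
{
    int64_t v = (int64_t)result;
    return v == 0 ? 0 : (v < 0 ? 1 : 2);
}
```

Both variants write the same register halves, but only the second leaves the 64-bit result intact for a later condition-code computation.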

Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/translate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index f180853e7a..68ca7e476a 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -5420,9 +5420,10 @@ static void wout_r1_P32(DisasContext *s, DisasOps *o)
 static void wout_r1_D32(DisasContext *s, DisasOps *o)
 {
 int r1 = get_field(s, r1);
+TCGv_i64 t = tcg_temp_new_i64();
 store_reg32_i64(r1 + 1, o->out);
-tcg_gen_shri_i64(o->out, o->out, 32);
-store_reg32_i64(r1, o->out);
+tcg_gen_shri_i64(t, o->out, 32);
+store_reg32_i64(r1, t);
 }
 #define SPEC_wout_r1_D32 SPEC_r1_even
 
-- 
2.31.1




[PATCH v2 0/5] target/s390x: Fix shift instructions

2022-01-11 Thread Ilya Leoshkevich
v1: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02035.html
v1 -> v2: Fix cc_calc_sla_32().
  Fix cc_calc_sla_64().
  Fix SLDA sign bit index.
  Inline help_l2_shift().
  Fix wout_r1_D32().
  Add all shift instructions to the test.
  Split the series.

Ilya Leoshkevich (5):
  target/s390x: Fix SLDA sign bit index
  target/s390x: Fix SRDA CC calculation
  target/s390x: Fix cc_calc_sla_64() missing overflows
  target/s390x: Fix shifting 32-bit values for more than 31 bits
  tests/tcg/s390x: Test shift instructions

 target/s390x/tcg/cc_helper.c|  35 +
 target/s390x/tcg/insn-data.def  |  36 ++---
 target/s390x/tcg/translate.c|  36 ++---
 tests/tcg/s390x/Makefile.target |   1 +
 tests/tcg/s390x/shift.c | 258 
 5 files changed, 297 insertions(+), 69 deletions(-)
 create mode 100644 tests/tcg/s390x/shift.c

-- 
2.31.1




[PATCH v2 1/5] target/s390x: Fix SLDA sign bit index

2022-01-11 Thread Ilya Leoshkevich
David Hildenbrand noticed that the sign bit index for SLDA is wrong: since
SLDA operates on 64-bit values, it should be 63, not 31.
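
As a small illustration of why the index matters (a sketch; sign_bit_set is a hypothetical helper, not part of the patch), reading the sign at index 31 of a 64-bit operand looks at the wrong bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return whether the bit at sign_index (0 = least significant) is set. */
static bool sign_bit_set(uint64_t value, int sign_index)
{
    assert(sign_index >= 0 && sign_index <= 63);
    return ((value >> sign_index) & 1) != 0;
}
```

For a negative 64-bit value such as 0x8000000000000000, index 63 reports the sign correctly while index 31 does not.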

Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/insn-data.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index f0af458aee..90c753068c 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -800,7 +800,7 @@
 C(0xebde, SRLK,RSY_a, DO,  r3_32u, sh32, new, r1_32, srl, 0)
 C(0xeb0c, SRLG,RSY_a, Z,   r3_o, sh64, r1, 0, srl, 0)
 /* SHIFT LEFT DOUBLE */
-D(0x8f00, SLDA,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sla, 0, 31)
+D(0x8f00, SLDA,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sla, 0, 63)
 /* SHIFT LEFT DOUBLE LOGICAL */
 C(0x8d00, SLDL,RS_a,  Z,   r1_D32, sh64, new, r1_D32, sll, 0)
 /* SHIFT RIGHT DOUBLE */
-- 
2.31.1




RE: [RFC PATCH 6/7] x86: Use new XSAVE ioctls handling

2022-01-11 Thread Wang, Wei W
On Wednesday, January 12, 2022 10:51 AM, Zeng, Guang wrote:
> To: Tian, Kevin ; Zhong, Yang ;
> qemu-devel@nongnu.org
> Cc: pbonz...@redhat.com; Christopherson,, Sean ;
> jing2@linux.intel.com; Wang, Wei W 
> Subject: Re: [RFC PATCH 6/7] x86: Use new XSAVE ioctls handling
> 
> On 1/11/2022 10:30 AM, Tian, Kevin wrote:
> >> From: Zeng, Guang 
> >> Sent: Monday, January 10, 2022 5:47 PM
> >>
> >> On 1/10/2022 4:40 PM, Tian, Kevin wrote:
>  From: Zhong, Yang 
>  Sent: Friday, January 7, 2022 5:32 PM
> 
>  From: Jing Liu 
> 
>  Extended feature has large state while current kvm_xsave only
>  allows 4KB. Use new XSAVE ioctls if the xstate size is larger than
>  kvm_xsave.
> >>> shouldn't we always use the new xsave ioctls as long as
> >>> CAP_XSAVE2 is available?
> >>
> >> CAP_XSAVE2 may return legacy xsave size or 0 working with old kvm
> >> version in which it's not available.
> >> QEMU uses the new xsave ioctls only when the return value of
> >> CAP_XSAVE2 is larger than legacy xsave size.
> > CAP_XSAVE2  is the superset of CAP_XSAVE. If available it can support
> > both legacy 4K size or bigger.
> 
> Got your point now. We can use new ioctl once CAP_XSAVE2 is available.
> Following your suggestion, I'd like to change the commit log as follows:
> 
> "x86: Use new XSAVE ioctls handling
> 
>    Extended feature has large state while current
>    kvm_xsave only allows 4KB. Use new XSAVE ioctls
>    if check extension of CAP_XSAVE2 is available."
> 
> And introduce has_xsave2 to indicate whether CAP_XSAVE2 is available, with the
> following change:
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index
> 97520e9dff..c8dae88ced 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -116,6 +116,7 @@ static bool has_msr_ucode_rev;
>   static bool has_msr_vmx_procbased_ctls2;
>   static bool has_msr_perf_capabs;
>   static bool has_msr_pkrs;
> +static bool has_xsave2 = false;

It's zero-initialized, so I don't think the "= false" assignment is needed.
It's probably better to use "int" (like has_xsave). I've also improved it a bit:

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3fb3ddbe2b..dee40ad0ad 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -122,6 +122,7 @@ static uint32_t num_architectural_pmu_gp_counters;
 static uint32_t num_architectural_pmu_fixed_counters;

 static int has_xsave;
+static int has_xsave2;
 static int has_xcrs;
 static int has_pit_state2;
 static int has_exception_payload;
@@ -1564,6 +1565,26 @@ static Error *invtsc_mig_blocker;

 #define KVM_MAX_CPUID_ENTRIES  100

+static void kvm_init_xsave(CPUX86State *env)
+{
+if (has_xsave2) {
+env->xsave_buf_len = QEMU_ALIGN_UP(has_xsave2, 4096);
+} else if (has_xsave) {
+env->xsave_buf_len = sizeof(struct kvm_xsave);
+} else {
+return;
+}
+
+env->xsave_buf = qemu_memalign(4096, env->xsave_buf_len);
+memset(env->xsave_buf, 0, env->xsave_buf_len);
+ /*
+  * The allocated storage must be large enough for all of the
+  * possible XSAVE state components.
+  */
+assert(kvm_arch_get_supported_cpuid(kvm_state, 0xd, 0, R_ECX) <=
+   env->xsave_buf_len);
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
 struct {
@@ -1982,18 +2003,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 goto fail;
 }

-if (has_xsave) {
-env->xsave_buf_len = sizeof(struct kvm_xsave);
-env->xsave_buf = qemu_memalign(4096, env->xsave_buf_len);
-memset(env->xsave_buf, 0, env->xsave_buf_len);
-
-/*
- * The allocated storage must be large enough for all of the
- * possible XSAVE state components.
- */
-assert(kvm_arch_get_supported_cpuid(kvm_state, 0xd, 0, R_ECX)
-   <= env->xsave_buf_len);
-}
+kvm_init_xsave(env);

 max_nested_state_len = kvm_max_nested_state_length();
 if (max_nested_state_len > 0) {
@@ -2323,6 +2333,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 }

 has_xsave = kvm_check_extension(s, KVM_CAP_XSAVE);
+has_xsave2 = kvm_check_extension(s, KVM_CAP_XSAVE2);
 has_xcrs = kvm_check_extension(s, KVM_CAP_XCRS);
 has_pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2);

@@ -3241,13 +3252,14 @@ static int kvm_get_xsave(X86CPU *cpu)
 {
 CPUX86State *env = &cpu->env;
 void *xsave = env->xsave_buf;
-int ret;
+int type, ret;

 if (!has_xsave) {
 return kvm_get_fpu(cpu);
 }

-ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_XSAVE, xsave);
+type = has_xsave2 ? KVM_GET_XSAVE2 : KVM_GET_XSAVE;
+ret = kvm_vcpu_ioctl(CPU(cpu), type, xsave);
 if (ret < 0) {
 return ret;
 }
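
The buffer-size selection in kvm_init_xsave() above can be sketched standalone as follows. This is only an illustration: align_up and xsave_buf_len are hypothetical names, and LEGACY_XSAVE_SIZE assumes the legacy struct kvm_xsave region is 4096 bytes (1024 32-bit words), as the quoted header suggests.

```c
#include <assert.h>
#include <stddef.h>

#define LEGACY_XSAVE_SIZE 4096u /* assumed size of the legacy kvm_xsave region */
#define XSAVE_ALIGN       4096u

/* Round n up to a multiple of align, which must be a power of two. */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/*
 * xsave2_size: value reported for KVM_CAP_XSAVE2 (0 if unsupported).
 * has_xsave:   nonzero if KVM_CAP_XSAVE is available.
 * Returns 0 when no XSAVE buffer is needed.
 */
static size_t xsave_buf_len(int xsave2_size, int has_xsave)
{
    if (xsave2_size > 0) {
        return align_up((size_t)xsave2_size, XSAVE_ALIGN);
    }
    if (has_xsave) {
        return LEGACY_XSAVE_SIZE;
    }
    return 0;
}
```

Keeping has_xsave2 as an int lets the same variable carry both "is the capability present" and "how large is the state".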


Re: [PATCH v6 09/23] target/riscv: Implement AIA local interrupt priorities

2022-01-11 Thread Frank Chang
On Wed, Jan 12, 2022 at 1:18 AM Anup Patel  wrote:

>
>
> On Mon, Jan 10, 2022 at 6:38 PM Frank Chang 
> wrote:
> >
> > On Thu, Dec 30, 2021 at 8:38 PM, Anup Patel wrote:
> >>
> >> From: Anup Patel 
> >>
> >> The AIA spec defines programmable 8-bit priority for each local
> interrupt
> >> at M-level, S-level and VS-level so we extend local interrupt processing
> >> to consider AIA interrupt priorities. The AIA CSRs which help software
> >> configure local interrupt priorities will be added by subsequent
> patches.
> >>
> >> Signed-off-by: Anup Patel 
> >> Signed-off-by: Anup Patel 
> >> Reviewed-by: Alistair Francis 
> >> ---
> >>  target/riscv/cpu.c|  19 
> >>  target/riscv/cpu.h|  12 ++
> >>  target/riscv/cpu_helper.c | 231 ++
> >>  target/riscv/machine.c|   3 +
> >>  4 files changed, 244 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 9f1a4d1088..9ad26035e1 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -348,6 +348,10 @@ void restore_state_to_opc(CPURISCVState *env,
> TranslationBlock *tb,
> >>
> >>  static void riscv_cpu_reset(DeviceState *dev)
> >>  {
> >> +#ifndef CONFIG_USER_ONLY
> >> +uint8_t iprio;
> >> +int i, irq, rdzero;
> >> +#endif
> >>  CPUState *cs = CPU(dev);
> >>  RISCVCPU *cpu = RISCV_CPU(cs);
> >>  RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
> >> @@ -370,6 +374,21 @@ static void riscv_cpu_reset(DeviceState *dev)
> >>  env->miclaim = MIP_SGEIP;
> >>  env->pc = env->resetvec;
> >>  env->two_stage_lookup = false;
> >> +
> >> +/* Initialized default priorities of local interrupts. */
> >> +for (i = 0; i < ARRAY_SIZE(env->miprio); i++) {
> >> +iprio = riscv_cpu_default_priority(i);
> >> +env->miprio[i] = (i == IRQ_M_EXT) ? 0 : iprio;
> >> +env->siprio[i] = (i == IRQ_S_EXT) ? 0 : iprio;
> >> +env->hviprio[i] = 0;
> >> +}
> >> +i = 0;
> >> +while (!riscv_cpu_hviprio_index2irq(i, &irq, &rdzero)) {
> >> +if (!rdzero) {
> >> +env->hviprio[irq] = env->miprio[irq];
> >> +}
> >> +i++;
> >> +}
> >>  /* mmte is supposed to have pm.current hardwired to 1 */
> >>  env->mmte |= (PM_EXT_INITIAL | MMTE_M_PM_CURRENT);
> >>  #endif
> >> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> >> index 02f3ef2c3c..140fabfdf9 100644
> >> --- a/target/riscv/cpu.h
> >> +++ b/target/riscv/cpu.h
> >> @@ -182,6 +182,10 @@ struct CPURISCVState {
> >>  target_ulong mcause;
> >>  target_ulong mtval;  /* since: priv-1.10.0 */
> >>
> >> +/* Machine and Supervisor interrupt priorities */
> >> +uint8_t miprio[64];
> >> +uint8_t siprio[64];
> >> +
> >>  /* Hypervisor CSRs */
> >>  target_ulong hstatus;
> >>  target_ulong hedeleg;
> >> @@ -194,6 +198,9 @@ struct CPURISCVState {
> >>  target_ulong hgeip;
> >>  uint64_t htimedelta;
> >>
> >> +/* Hypervisor controlled virtual interrupt priorities */
> >> +uint8_t hviprio[64];
> >> +
> >>  /* Virtual CSRs */
> >>  /*
> >>   * For RV32 this is 32-bit vsstatus and 32-bit vsstatush.
> >> @@ -379,6 +386,11 @@ int
> riscv_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
> >> int cpuid, void *opaque);
> >>  int riscv_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int
> reg);
> >>  int riscv_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
> >> +int riscv_cpu_hviprio_index2irq(int index, int *out_irq, int
> *out_rdzero);
> >> +uint8_t riscv_cpu_default_priority(int irq);
> >> +int riscv_cpu_mirq_pending(CPURISCVState *env);
> >> +int riscv_cpu_sirq_pending(CPURISCVState *env);
> >> +int riscv_cpu_vsirq_pending(CPURISCVState *env);
> >>  bool riscv_cpu_fp_enabled(CPURISCVState *env);
> >>  target_ulong riscv_cpu_get_geilen(CPURISCVState *env);
> >>  void riscv_cpu_set_geilen(CPURISCVState *env, target_ulong geilen);
> >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >> index f94a36fa89..e3532de4cf 100644
> >> --- a/target/riscv/cpu_helper.c
> >> +++ b/target/riscv/cpu_helper.c
> >> @@ -151,36 +151,225 @@ void cpu_get_tb_cpu_state(CPURISCVState *env,
> target_ulong *pc,
> >>  }
> >>
> >>  #ifndef CONFIG_USER_ONLY
> >> -static int riscv_cpu_local_irq_pending(CPURISCVState *env)
> >> +
> >> +/*
> >> + * The HS-mode is allowed to configure priority only for the
> >> + * following VS-mode local interrupts:
> >> + *
> >> + * 0  (Reserved interrupt, reads as zero)
> >> + * 1  Supervisor software interrupt
> >> + * 4  (Reserved interrupt, reads as zero)
> >> + * 5  Supervisor timer interrupt
> >> + * 8  (Reserved interrupt, reads as zero)
> >> + * 13 (Reserved interrupt)
> >> + * 14 "
> >> + * 15 "
> >> + * 16 "
> >> + * 18 Debug/trace interrupt
> >> + * 20 (Reserved interrupt)
> >> + * 22 "
> >> + * 24 "
> >> + * 26 "
> >> + * 28 "
> >> + * 30 (Reserved for standard reporting of bus or system 

Re: [RFC PATCH 6/7] x86: Use new XSAVE ioctls handling

2022-01-11 Thread Zeng Guang

On 1/11/2022 10:30 AM, Tian, Kevin wrote:

From: Zeng, Guang 
Sent: Monday, January 10, 2022 5:47 PM

On 1/10/2022 4:40 PM, Tian, Kevin wrote:

From: Zhong, Yang 
Sent: Friday, January 7, 2022 5:32 PM

From: Jing Liu 

Extended feature has large state while current
kvm_xsave only allows 4KB. Use new XSAVE ioctls
if the xstate size is larger than kvm_xsave.

shouldn't we always use the new xsave ioctls as long as
CAP_XSAVE2 is available?


CAP_XSAVE2 may return legacy xsave size or 0 working with old kvm
version in which it's not available.
QEMU uses the new xsave ioctls only when the return value of
CAP_XSAVE2 is larger than legacy xsave size.

CAP_XSAVE2  is the superset of CAP_XSAVE. If available it can support
both legacy 4K size or bigger.


Got your point now. We can use new ioctl once CAP_XSAVE2 is available.
Following your suggestion, I'd like to change the commit log as follows:

"x86: Use new XSAVE ioctls handling

  Extended feature has large state while current
  kvm_xsave only allows 4KB. Use new XSAVE ioctls
  if check extension of CAP_XSAVE2 is available."

And introduce has_xsave2 to indicate whether CAP_XSAVE2 is available,
with the following change:

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 97520e9dff..c8dae88ced 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -116,6 +116,7 @@ static bool has_msr_ucode_rev;
 static bool has_msr_vmx_procbased_ctls2;
 static bool has_msr_perf_capabs;
 static bool has_msr_pkrs;
+static bool has_xsave2 = false;

 static uint32_t has_architectural_pmu_version;
 static uint32_t num_architectural_pmu_gp_counters;
@@ -1986,7 +1987,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
 uint32_t size = kvm_vm_check_extension(cs->kvm_state, 
KVM_CAP_XSAVE2);

 if (!size) {
 size = sizeof(struct kvm_xsave);
-    }
+    } else
+    has_xsave2 = true;

 env->xsave_buf_len = QEMU_ALIGN_UP(size, 4096);
 env->xsave_buf = qemu_memalign(4096, env->xsave_buf_len);
@@ -3253,7 +3255,7 @@ static int kvm_get_xsave(X86CPU *cpu)
 return kvm_get_fpu(cpu);
 }

-    if (env->xsave_buf_len <= sizeof(struct kvm_xsave)) {
+    if (!has_xsave2) {
 ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_XSAVE, xsave);
 } else {
 ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_XSAVE2, xsave);

  

Signed-off-by: Jing Liu 
Signed-off-by: Zeng Guang 
Signed-off-by: Wei Wang 
Signed-off-by: Yang Zhong 
---
   linux-headers/asm-x86/kvm.h | 14 ++
   linux-headers/linux/kvm.h   |  2 ++
   target/i386/cpu.h   |  5 +
   target/i386/kvm/kvm.c   | 16 ++--
   target/i386/xsave_helper.c  | 35

+++

   5 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 5a776a08f7..32f2a921e8 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -376,6 +376,20 @@ struct kvm_debugregs {
   /* for KVM_CAP_XSAVE */
   struct kvm_xsave {
__u32 region[1024];
+   /*
+* KVM_GET_XSAVE2 and KVM_SET_XSAVE write and read as many
bytes
+* as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+* respectively, when invoked on the vm file descriptor.
+*
+* The size value returned by
KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+* will always be at least 4096. Currently, it is only greater
+* than 4096 if a dynamic feature has been enabled with
+* ``arch_prctl()``, but this may change in the future.
+*
+* The offsets of the state save areas in struct kvm_xsave follow
+* the contents of CPUID leaf 0xD on the host.
+*/
+   __u32 extra[0];
   };

   #define KVM_MAX_XCRS 16
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 02c5e7b7bb..97d5b6d81d 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1130,6 +1130,7 @@ struct kvm_ppc_resize_hpt {
   #define KVM_CAP_BINARY_STATS_FD 203
   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
   #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_XSAVE2  207

   #ifdef KVM_CAP_IRQ_ROUTING

@@ -1550,6 +1551,7 @@ struct kvm_s390_ucas_mapping {
   /* Available with KVM_CAP_XSAVE */
   #define KVM_GET_XSAVE  _IOR(KVMIO,  0xa4, struct
kvm_xsave)
   #define KVM_SET_XSAVE  _IOW(KVMIO,  0xa5, struct
kvm_xsave)
+#define KVM_GET_XSAVE2   _IOR(KVMIO,  0xcf, struct
kvm_xsave)
   /* Available with KVM_CAP_XCRS */
   #define KVM_GET_XCRS   _IOR(KVMIO,  0xa6, struct kvm_xcrs)
   #define KVM_SET_XCRS   _IOW(KVMIO,  0xa7, struct kvm_xcrs)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 245e8b5a1a..6153c4ab1a 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1519,6 +1519,11 @@ typedef struct CPUX86State {
   YMMReg zmmh_regs[CPU_NB_REGS];
   ZMMReg hi16_zmm_regs[CPU_NB_REGS];

+#ifdef TARGET_X86_64
+uint8_t xtilecfg[64];
+uint8_t xtiledata[8192];

[PATCH v2] usb: allow max 8192 bytes for desc

2022-01-11 Thread zhenwei pi
A USB video class device usually uses a larger descriptor structure, so
use a larger buffer to avoid failure. (dev-video.c is ready.)

This code path is rarely exercised:
1. During guest startup, when the guest probes the device.
2. When 'lsusb' (or a similar command) is run in the guest.
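
A rough sketch of why the buffer size matters, in plain C with malloc() rather than glib's g_autofree (fill_desc and try_desc are hypothetical helpers; DESC_MAX_LEN mirrors the USB_DESC_MAX_LEN value from this patch):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DESC_MAX_LEN 8192 /* same value as USB_DESC_MAX_LEN in the patch */

/* Copy a descriptor into buf; return its length, or -1 if it does not fit. */
static int fill_desc(uint8_t *buf, size_t buflen,
                     const uint8_t *desc, size_t desc_len)
{
    assert(buf && desc);
    if (desc_len > buflen) {
        return -1;
    }
    memcpy(buf, desc, desc_len);
    return (int)desc_len;
}

/* Try to build a zero-filled descriptor of desc_len bytes in a heap buffer. */
static int try_desc(size_t buflen, size_t desc_len)
{
    uint8_t *buf = malloc(buflen);
    uint8_t *desc = calloc(desc_len ? desc_len : 1, 1);
    int ret = -1;

    if (buf && desc) {
        ret = fill_desc(buf, buflen, desc, desc_len);
    }
    free(desc);
    free(buf);
    return ret;
}
```

A 300-byte descriptor fails with the old 256-byte buffer and succeeds with the 8192-byte one. Allocating on the heap avoids placing 8 KiB on the stack for a rarely used path.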

Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: zhenwei pi 
---
 hw/usb/desc.c | 15 ---
 hw/usb/desc.h |  1 +
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/usb/desc.c b/hw/usb/desc.c
index 8b6eaea407..7f6cc2f99b 100644
--- a/hw/usb/desc.c
+++ b/hw/usb/desc.c
@@ -632,7 +632,8 @@ int usb_desc_get_descriptor(USBDevice *dev, USBPacket *p,
 bool msos = (dev->flags & (1 << USB_DEV_FLAG_MSOS_DESC_IN_USE));
 const USBDesc *desc = usb_device_get_usb_desc(dev);
 const USBDescDevice *other_dev;
-uint8_t buf[256];
+size_t buflen = USB_DESC_MAX_LEN;
+g_autofree uint8_t *buf = g_malloc(buflen);
 uint8_t type = value >> 8;
 uint8_t index = value & 0xff;
 int flags, ret = -1;
@@ -650,36 +651,36 @@ int usb_desc_get_descriptor(USBDevice *dev, USBPacket *p,
 
 switch(type) {
 case USB_DT_DEVICE:
-ret = usb_desc_device(&desc->id, dev->device, msos, buf, sizeof(buf));
+ret = usb_desc_device(&desc->id, dev->device, msos, buf, buflen);
 trace_usb_desc_device(dev->addr, len, ret);
 break;
 case USB_DT_CONFIG:
 if (index < dev->device->bNumConfigurations) {
 ret = usb_desc_config(dev->device->confs + index, flags,
-  buf, sizeof(buf));
+  buf, buflen);
 }
 trace_usb_desc_config(dev->addr, index, len, ret);
 break;
 case USB_DT_STRING:
-ret = usb_desc_string(dev, index, buf, sizeof(buf));
+ret = usb_desc_string(dev, index, buf, buflen);
 trace_usb_desc_string(dev->addr, index, len, ret);
 break;
 case USB_DT_DEVICE_QUALIFIER:
 if (other_dev != NULL) {
-ret = usb_desc_device_qualifier(other_dev, buf, sizeof(buf));
+ret = usb_desc_device_qualifier(other_dev, buf, buflen);
 }
 trace_usb_desc_device_qualifier(dev->addr, len, ret);
 break;
 case USB_DT_OTHER_SPEED_CONFIG:
 if (other_dev != NULL && index < other_dev->bNumConfigurations) {
 ret = usb_desc_config(other_dev->confs + index, flags,
-  buf, sizeof(buf));
+  buf, buflen);
 buf[0x01] = USB_DT_OTHER_SPEED_CONFIG;
 }
 trace_usb_desc_other_speed_config(dev->addr, index, len, ret);
 break;
 case USB_DT_BOS:
-ret = usb_desc_bos(desc, buf, sizeof(buf));
+ret = usb_desc_bos(desc, buf, buflen);
 trace_usb_desc_bos(dev->addr, len, ret);
 break;
 
diff --git a/hw/usb/desc.h b/hw/usb/desc.h
index 3ac604ecfa..35babdeff6 100644
--- a/hw/usb/desc.h
+++ b/hw/usb/desc.h
@@ -199,6 +199,7 @@ struct USBDesc {
 const USBDescMSOS *msos;
 };
 
+#define USB_DESC_MAX_LEN8192
 #define USB_DESC_FLAG_SUPER (1 << 1)
 
 /* little helpers */
-- 
2.25.1




Re: [PATCH] tests: Fix typo in check-help output

2022-01-11 Thread wangyanan (Y)



On 2022/1/12 1:55, Philippe Mathieu-Daudé wrote:

Fix typo in 'make check-help' output.

Signed-off-by: Philippe Mathieu-Daudé 
---
  tests/Makefile.include | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 4c564cf7899..3aba6224009 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -23,7 +23,7 @@ endif
@echo " $(MAKE) check-clean  Clean the tests and related data"
@echo
@echo "The following are useful for CI builds"
-   @echo " $(MAKE) check-build  Build most test binaris"
+   @echo " $(MAKE) check-build  Build most test binaries"
@echo " $(MAKE) get-vm-imagesDownloads all images used by avocado 
tests, according to configured targets (~350 MB each, 1.5 GB max)"
@echo
@echo

Reviewed-by: Yanan Wang 



Re: [PATCH v1 1/2] hw/i386: Make pit a property of common x86 base machine type

2022-01-11 Thread Xiaoyao Li

+ Paolo

On 1/11/2022 3:35 PM, Xiaoyao Li wrote:

Both pc and microvm define their own pit property. Make it a property
of the common x86 base machine type instead.
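
The check that both machine types end up sharing can be sketched like this. The enum is a local stand-in that reuses QEMU's OnOffAuto value names, and X86MachineSketch and pit_wanted are hypothetical; the real state lives in X86MachineState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for QEMU's OnOffAuto enum. */
typedef enum OnOffAuto {
    ON_OFF_AUTO_AUTO,
    ON_OFF_AUTO_ON,
    ON_OFF_AUTO_OFF,
} OnOffAuto;

/* Hypothetical stand-in for the common x86 base machine state. */
struct X86MachineSketch {
    OnOffAuto pit;
};

/* The PIT is created for both "on" and "auto", and skipped only for "off". */
static bool pit_wanted(const struct X86MachineSketch *ms)
{
    assert(ms != NULL);
    return ms->pit == ON_OFF_AUTO_ON || ms->pit == ON_OFF_AUTO_AUTO;
}
```

Moving from pc's boolean pit_enabled to a tri-state keeps the default ("auto") behaving like "enabled" while giving pc and microvm one shared field.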

Signed-off-by: Xiaoyao Li 
---
  hw/i386/microvm.c | 27 +--
  hw/i386/pc.c  | 24 +++-
  hw/i386/x86.c | 25 +
  include/hw/i386/microvm.h |  2 --
  include/hw/i386/pc.h  |  2 --
  include/hw/i386/x86.h |  2 ++
  6 files changed, 31 insertions(+), 51 deletions(-)

diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 4b3b1dd262f1..89b555a2f584 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -257,7 +257,7 @@ static void microvm_devices_init(MicrovmMachineState *mms)
  g_free(i8259);
  }
  
-if (mms->pit == ON_OFF_AUTO_ON || mms->pit == ON_OFF_AUTO_AUTO) {

+if (x86ms->pit == ON_OFF_AUTO_ON || x86ms->pit == ON_OFF_AUTO_AUTO) {
  if (kvm_pit_in_kernel()) {
  kvm_pit_init(isa_bus, 0x40);
  } else {
@@ -508,23 +508,6 @@ static void microvm_machine_set_pic(Object *obj, Visitor 
*v, const char *name,
  visit_type_OnOffAuto(v, name, &mms->pic, errp);
  }
  
-static void microvm_machine_get_pit(Object *obj, Visitor *v, const char *name,

-void *opaque, Error **errp)
-{
-MicrovmMachineState *mms = MICROVM_MACHINE(obj);
-OnOffAuto pit = mms->pit;
-
-visit_type_OnOffAuto(v, name, &pit, errp);
-}
-
-static void microvm_machine_set_pit(Object *obj, Visitor *v, const char *name,
-void *opaque, Error **errp)
-{
-MicrovmMachineState *mms = MICROVM_MACHINE(obj);
-
-visit_type_OnOffAuto(v, name, &mms->pit, errp);
-}
-
  static void microvm_machine_get_rtc(Object *obj, Visitor *v, const char *name,
  void *opaque, Error **errp)
  {
@@ -650,7 +633,6 @@ static void microvm_machine_initfn(Object *obj)
  
  /* Configuration */

  mms->pic = ON_OFF_AUTO_AUTO;
-mms->pit = ON_OFF_AUTO_AUTO;
  mms->rtc = ON_OFF_AUTO_AUTO;
  mms->pcie = ON_OFF_AUTO_AUTO;
  mms->ioapic2 = ON_OFF_AUTO_AUTO;
@@ -709,13 +691,6 @@ static void microvm_class_init(ObjectClass *oc, void *data)
  object_class_property_set_description(oc, MICROVM_MACHINE_PIC,
  "Enable i8259 PIC");
  
-object_class_property_add(oc, MICROVM_MACHINE_PIT, "OnOffAuto",

-  microvm_machine_get_pit,
-  microvm_machine_set_pit,
-  NULL, NULL);
-object_class_property_set_description(oc, MICROVM_MACHINE_PIT,
-"Enable i8254 PIT");
-
  object_class_property_add(oc, MICROVM_MACHINE_RTC, "OnOffAuto",
microvm_machine_get_rtc,
microvm_machine_set_rtc,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c8696ac01e85..48ab4cf44012 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1071,6 +1071,7 @@ void pc_basic_device_init(struct PCMachineState *pcms,
  ISADevice *pit = NULL;
  MemoryRegion *ioport80_io = g_new(MemoryRegion, 1);
  MemoryRegion *ioportF0_io = g_new(MemoryRegion, 1);
+X86MachineState *x86ms = X86_MACHINE(pcms);
  
  memory_region_init_io(ioport80_io, NULL, &ioport80_io_ops, NULL, "ioport80", 1);

  memory_region_add_subregion(isa_bus->address_space_io, 0x80, ioport80_io);
@@ -1115,7 +1116,8 @@ void pc_basic_device_init(struct PCMachineState *pcms,
  
  qemu_register_boot_set(pc_boot_set, *rtc_state);
  
-if (!xen_enabled() && pcms->pit_enabled) {

+if (!xen_enabled() &&
+(x86ms->pit == ON_OFF_AUTO_AUTO || x86ms->pit == ON_OFF_AUTO_ON)) {
  if (kvm_pit_in_kernel()) {
  pit = kvm_pit_init(isa_bus, 0x40);
  } else {
@@ -1484,20 +1486,6 @@ static void pc_machine_set_sata(Object *obj, bool value, 
Error **errp)
  pcms->sata_enabled = value;
  }
  
-static bool pc_machine_get_pit(Object *obj, Error **errp)

-{
-PCMachineState *pcms = PC_MACHINE(obj);
-
-return pcms->pit_enabled;
-}
-
-static void pc_machine_set_pit(Object *obj, bool value, Error **errp)
-{
-PCMachineState *pcms = PC_MACHINE(obj);
-
-pcms->pit_enabled = value;
-}
-
  static bool pc_machine_get_hpet(Object *obj, Error **errp)
  {
  PCMachineState *pcms = PC_MACHINE(obj);
@@ -1640,7 +1628,6 @@ static void pc_machine_initfn(Object *obj)
  pcms->acpi_build_enabled = PC_MACHINE_GET_CLASS(pcms)->has_acpi_build;
  pcms->smbus_enabled = true;
  pcms->sata_enabled = true;
-pcms->pit_enabled = true;
  pcms->max_fw_size = 8 * MiB;
  #ifdef CONFIG_HPET
  pcms->hpet_enabled = true;
@@ -1767,11 +1754,6 @@ static void pc_machine_class_init(ObjectClass *oc, void 
*data)
  object_class_property_set_description(oc, PC_MACHINE_SATA,
  "Enable/disable Serial ATA bus");
  
-object_class_property_add_bool(oc, PC_MACHINE_PIT,

-pc_machine_get_pit, pc_machine_set_pit);
-  

Re: [PATCH v1 2/2] hw/i386: Make pic a property of common x86 base machine type

2022-01-11 Thread Xiaoyao Li

+ Paolo

On 1/11/2022 3:35 PM, Xiaoyao Li wrote:

Legacy PIC (8259) cannot be supported for TDX guests since TDX module
doesn't allow direct interrupt injection.  Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.

Make PIC the property of common x86 machine type. Hence all x86
machines, including microvm, can disable it.
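
The on/off/auto test that this series repeats for the PIT and the PIC can be read as "create the device unless it was explicitly turned off". A minimal sketch of that check follows, assuming a local stand-in enum; the `Sketch*` names are hypothetical and are not QEMU's own type or helpers.

```c
#include <stdbool.h>

/* Local stand-in for a tri-state machine property; names are hypothetical. */
typedef enum {
    SKETCH_AUTO,   /* user did not choose, machine decides */
    SKETCH_ON,     /* user asked for the device */
    SKETCH_OFF,    /* user asked for no device */
} SketchOnOffAuto;

/* The device is wanted unless the property was explicitly set to off. */
static bool sketch_device_wanted(SketchOnOffAuto v)
{
    return v == SKETCH_ON || v == SKETCH_AUTO;
}
```

Keeping the test in one helper would avoid repeating the two-way comparison at every call site, though the patches themselves spell it out inline.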

Signed-off-by: Xiaoyao Li 
---
  hw/i386/microvm.c | 27 +--
  hw/i386/pc_piix.c |  4 +++-
  hw/i386/pc_q35.c  |  4 +++-
  hw/i386/x86.c | 25 +
  include/hw/i386/microvm.h |  2 --
  include/hw/i386/x86.h |  2 ++
  6 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 89b555a2f584..754f1d0593e5 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -247,7 +247,7 @@ static void microvm_devices_init(MicrovmMachineState *mms)
  x86ms->pci_irq_mask = 0;
  }
  
-if (mms->pic == ON_OFF_AUTO_ON || mms->pic == ON_OFF_AUTO_AUTO) {

+if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
  qemu_irq *i8259;
  
  i8259 = i8259_init(isa_bus, x86_allocate_cpu_irq());

@@ -491,23 +491,6 @@ static void microvm_machine_reset(MachineState *machine)
  }
  }
  
-static void microvm_machine_get_pic(Object *obj, Visitor *v, const char *name,

-void *opaque, Error **errp)
-{
-MicrovmMachineState *mms = MICROVM_MACHINE(obj);
-OnOffAuto pic = mms->pic;
-
-visit_type_OnOffAuto(v, name, &pic, errp);
-}
-
-static void microvm_machine_set_pic(Object *obj, Visitor *v, const char *name,
-void *opaque, Error **errp)
-{
-MicrovmMachineState *mms = MICROVM_MACHINE(obj);
-
-visit_type_OnOffAuto(v, name, &mms->pic, errp);
-}
-
  static void microvm_machine_get_rtc(Object *obj, Visitor *v, const char *name,
  void *opaque, Error **errp)
  {
@@ -632,7 +615,6 @@ static void microvm_machine_initfn(Object *obj)
  MicrovmMachineState *mms = MICROVM_MACHINE(obj);
  
  /* Configuration */

-mms->pic = ON_OFF_AUTO_AUTO;
  mms->rtc = ON_OFF_AUTO_AUTO;
  mms->pcie = ON_OFF_AUTO_AUTO;
  mms->ioapic2 = ON_OFF_AUTO_AUTO;
@@ -684,13 +666,6 @@ static void microvm_class_init(ObjectClass *oc, void *data)
  
  x86mc->fwcfg_dma_enabled = true;
  
-object_class_property_add(oc, MICROVM_MACHINE_PIC, "OnOffAuto",

-  microvm_machine_get_pic,
-  microvm_machine_set_pic,
-  NULL, NULL);
-object_class_property_set_description(oc, MICROVM_MACHINE_PIC,
-"Enable i8259 PIC");
-
  object_class_property_add(oc, MICROVM_MACHINE_RTC, "OnOffAuto",
microvm_machine_get_rtc,
microvm_machine_set_rtc,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 7c7790a5ce34..d05683cd0d77 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -218,7 +218,9 @@ static void pc_init1(MachineState *machine,
  }
  isa_bus_irqs(isa_bus, x86ms->gsi);
  
-pc_i8259_create(isa_bus, gsi_state->i8259_irq);

+if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
+pc_i8259_create(isa_bus, gsi_state->i8259_irq);
+}
  
  if (pcmc->pci_enabled) {

  ioapic_init_gsi(gsi_state, "i440fx");
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 1780f79bc127..58e7e693f9e2 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -265,7 +265,9 @@ static void pc_q35_init(MachineState *machine)
  pci_bus_set_route_irq_fn(host_bus, ich9_route_intx_pin_to_irq);
  isa_bus = ich9_lpc->isa_bus;
  
-pc_i8259_create(isa_bus, gsi_state->i8259_irq);

+if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
+pc_i8259_create(isa_bus, gsi_state->i8259_irq);
+}
  
  if (pcmc->pci_enabled) {

  ioapic_init_gsi(gsi_state, "q35");
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 744a50937761..d4a4c0ec8f61 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1243,6 +1243,23 @@ static void x86_machine_set_pit(Object *obj, Visitor *v, const char *name,
  visit_type_OnOffAuto(v, name, &x86ms->pit, errp);
  }
  
+static void x86_machine_get_pic(Object *obj, Visitor *v, const char *name,

+void *opaque, Error **errp)
+{
+X86MachineState *x86ms = X86_MACHINE(obj);
+OnOffAuto pic = x86ms->pic;
+
+visit_type_OnOffAuto(v, name, &pic, errp);
+}
+
+static void x86_machine_set_pic(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+X86MachineState *x86ms = X86_MACHINE(obj);
+
+visit_type_OnOffAuto(v, name, &x86ms->pic, errp);
+}
+
  static char *x86_machine_get_oem_id(Object *obj, Error **errp)
  {
  

Re: [PATCH] qdev-core.h: Fix wrongly named reference to TYPE_SPLIT_IRQ

2022-01-11 Thread wangyanan (Y)



On 2022/1/12 1:26, Peter Maydell wrote:

Fix a comment in qdev-core.h where we incorrectly referred
to TYPE_IRQ_SPLIT when we meant TYPE_SPLIT_IRQ.

Signed-off-by: Peter Maydell 
---
  include/hw/qdev-core.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index d19c9417520..92c3d652086 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -493,7 +493,7 @@ qemu_irq qdev_get_gpio_in_named(DeviceState *dev, const 
char *name, int n);
   * qemu_irqs at once, or to connect multiple outbound GPIOs to the
   * same qemu_irq. (Warning: there is no assertion or other guard to
   * catch this error: the model will just not do the right thing.)
- * Instead, for fan-out you can use the TYPE_IRQ_SPLIT device: connect
+ * Instead, for fan-out you can use the TYPE_SPLIT_IRQ device: connect
   * a device's outbound GPIO to the splitter's input, and connect each
   * of the splitter's outputs to a different device.  For fan-in you
   * can use the TYPE_OR_IRQ device, which is a model of a logical OR

Reviewed-by: Yanan Wang 
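
The fan-out arrangement described in that comment (one outbound line feeding a splitter, whose outputs each drive a different device) can be pictured with a toy model. This is only a sketch under stated assumptions: the `Toy*` types and functions are hypothetical and do not use QEMU's qemu_irq or qdev APIs.

```c
#include <stddef.h>

#define TOY_MAX_OUTPUTS 4

/* A sink is a callback plus its private state. */
typedef void (*toy_irq_handler)(void *opaque, int level);

typedef struct {
    toy_irq_handler handler;
    void *opaque;
} ToyIrq;

/* One input, several outputs. */
typedef struct {
    ToyIrq out[TOY_MAX_OUTPUTS];
    size_t num_out;
} ToySplitter;

/* Connect one more output; returns 0 on success, -1 if the splitter is full. */
static int toy_splitter_connect(ToySplitter *s, toy_irq_handler h, void *opaque)
{
    if (s->num_out >= TOY_MAX_OUTPUTS) {
        return -1;
    }
    s->out[s->num_out].handler = h;
    s->out[s->num_out].opaque = opaque;
    s->num_out++;
    return 0;
}

/* Input side: forward the new level to every connected output. */
static void toy_splitter_set(ToySplitter *s, int level)
{
    for (size_t i = 0; i < s->num_out; i++) {
        s->out[i].handler(s->out[i].opaque, level);
    }
}

/* Example sink: records the last level it saw. */
static void toy_record_level(void *opaque, int level)
{
    *(int *)opaque = level;
}
```

The source device only ever drives the splitter's single input, which is the point of the comment: the source never needs to know how many sinks are attached.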



[RFC v4 16/21] vfio-user: dma map/unmap operations

2022-01-11 Thread John Johnson
Add ability to do async operations during memory transactions

Signed-off-by: Jagannathan Raman 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
---
 hw/vfio/user-protocol.h   |  32 +++
 include/hw/vfio/vfio-common.h |   9 +-
 hw/vfio/common.c  |  63 +---
 hw/vfio/user.c| 217 ++
 4 files changed, 305 insertions(+), 16 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 4852882..ad63f21 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -94,6 +94,31 @@ typedef struct {
 
 
 /*
+ * VFIO_USER_DMA_MAP
+ * imported from struct vfio_iommu_type1_dma_map
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint64_t offset;/* FD offset */
+uint64_t iova;
+uint64_t size;
+} VFIOUserDMAMap;
+
+/*
+ * VFIO_USER_DMA_UNMAP
+ * imported from struct vfio_iommu_type1_dma_unmap
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint64_t iova;
+uint64_t size;
+} VFIOUserDMAUnmap;
+
+/*
  * VFIO_USER_DEVICE_GET_INFO
  * imported from struct_device_info
  */
@@ -157,4 +182,11 @@ typedef struct {
 char data[];
 } VFIOUserRegionRW;
 
+/*imported from struct vfio_bitmap */
+typedef struct {
+uint64_t pgsize;
+uint64_t size;
+char data[];
+} VFIOUserBitmap;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 59a8299..a84e10a 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -90,6 +90,7 @@ typedef struct VFIOContainer {
 VFIOContIO *io_ops;
 bool initialized;
 bool dirty_pages_supported;
+bool async_ops;
 uint64_t dirty_pgsizes;
 uint64_t max_dirty_bitmap_size;
 unsigned long pgsizes;
@@ -199,7 +200,7 @@ struct VFIODevIO {
 ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data), (post)))
 
 struct VFIOContIO {
-int (*dma_map)(VFIOContainer *container,
+int (*dma_map)(VFIOContainer *container, MemoryRegion *mr,
struct vfio_iommu_type1_dma_map *map);
 int (*dma_unmap)(VFIOContainer *container,
  struct vfio_iommu_type1_dma_unmap *unmap,
@@ -207,14 +208,16 @@ struct VFIOContIO {
 int (*dirty_bitmap)(VFIOContainer *container,
 struct vfio_iommu_type1_dirty_bitmap *bitmap,
 struct vfio_iommu_type1_dirty_bitmap_get *range);
+void (*wait_commit)(VFIOContainer *container);
 };
 
-#define CONT_DMA_MAP(cont, map) \
-((cont)->io_ops->dma_map((cont), (map)))
+#define CONT_DMA_MAP(cont, mr, map) \
+((cont)->io_ops->dma_map((cont), (mr), (map)))
 #define CONT_DMA_UNMAP(cont, unmap, bitmap) \
 ((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap)))
 #define CONT_DIRTY_BITMAP(cont, bitmap, range) \
 ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
+#define CONT_WAIT_COMMIT(cont) ((cont)->io_ops->wait_commit(cont))
 
 extern VFIODevIO vfio_dev_io_ioctl;
 extern VFIOContIO vfio_cont_io_ioctl;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9a67934..ca51baa 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -480,7 +480,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
 return CONT_DMA_UNMAP(container, &unmap, NULL);
 }
 
-static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
+static int vfio_dma_map(VFIOContainer *container, MemoryRegion *mr, hwaddr 
iova,
 ram_addr_t size, void *vaddr, bool readonly)
 {
 struct vfio_iommu_type1_dma_map map = {
@@ -496,7 +496,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr 
iova,
 map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
 }
 
-ret = CONT_DMA_MAP(container, &map);
+ret = CONT_DMA_MAP(container, mr, &map);
 
 if (ret < 0) {
 error_report("VFIO_MAP_DMA failed: %s", strerror(-ret));
@@ -559,7 +559,8 @@ static bool 
vfio_listener_skipped_section(MemoryRegionSection *section)
 
 /* Called with rcu_read_lock held.  */
 static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
-   ram_addr_t *ram_addr, bool *read_only)
+   ram_addr_t *ram_addr, bool *read_only,
+   MemoryRegion **mrp)
 {
 MemoryRegion *mr;
 hwaddr xlat;
@@ -640,6 +641,10 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void 
**vaddr,
 *read_only = !writable || mr->readonly;
 }
 
+if (mrp != NULL) {
+*mrp = mr;
+}
+
 return true;
 }
 
@@ -647,6 +652,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, 
IOMMUTLBEntry *iotlb)
 {
 VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
 VFIOContainer *container = giommu->container;
+MemoryRegion *mr;
 hwaddr iova = iotlb->iova + giommu->iommu_offset;
 void *vaddr;
 int ret;
@@ -665,7 +671,7 @@ static void 

Re: [PATCH v3 00/10] hw/dma: Use dma_addr_t type definition when relevant

2022-01-11 Thread Peter Xu
On Tue, Jan 11, 2022 at 07:42:59PM +0100, Philippe Mathieu-Daudé wrote:
> Since v2:
> - Split meson patch restricting fw_cfg (Richard)
> - Reorder pci_dma_map() docstring (Peter, Richard)
> - Move QEMUSGList in previous patch (David)
> - Have dma_buf_read/dma_buf_write return dma_addr_t (Peter)
> - Drop 'propagate MemTxResult' patch (David)
> - Added R-b tags

Reviewed-by: Peter Xu 

-- 
Peter Xu




[RFC v4 20/21] vfio-user: migration support

2022-01-11 Thread John Johnson
Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h | 18 +
 hw/vfio/migration.c | 30 +--
 hw/vfio/pci.c   |  7 +++
 hw/vfio/user.c  | 54 +
 4 files changed, 93 insertions(+), 16 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 8932311..abe7002 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -193,6 +193,10 @@ typedef struct {
 char data[];
 } VFIOUserDMARW;
 
+/*
+ * VFIO_USER_DIRTY_PAGES
+ */
+
 /*imported from struct vfio_bitmap */
 typedef struct {
 uint64_t pgsize;
@@ -200,4 +204,18 @@ typedef struct {
 char data[];
 } VFIOUserBitmap;
 
+/* imported from struct vfio_iommu_type1_dirty_bitmap_get */
+typedef struct {
+uint64_t iova;
+uint64_t size;
+VFIOUserBitmap bitmap;
+} VFIOUserBitmapRange;
+
+/* imported from struct vfio_iommu_type1_dirty_bitmap */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+} VFIOUserDirtyPages;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index ff6b45d..df63f5c 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -27,6 +27,7 @@
 #include "pci.h"
 #include "trace.h"
 #include "hw/hw.h"
+#include "user.h"
 
 /*
  * Flags to be used as unique delimiters for VFIO devices in the migration
@@ -49,11 +50,13 @@ static int64_t bytes_transferred;
 static inline int vfio_mig_access(VFIODevice *vbasedev, void *val, int count,
   off_t off, bool iswrite)
 {
+VFIORegion *region = &vbasedev->migration->region;
 int ret;
 
-ret = iswrite ? pwrite(vbasedev->fd, val, count, off) :
-pread(vbasedev->fd, val, count, off);
-if (ret < count) {
+ret = iswrite ?
+VDEV_REGION_WRITE(vbasedev, region->nr, off, count, val, false) :
+VDEV_REGION_READ(vbasedev, region->nr, off, count, val);
+ if (ret < count) {
 error_report("vfio_mig_%s %d byte %s: failed at offset 0x%"
  HWADDR_PRIx", err: %s", iswrite ? "write" : "read", count,
  vbasedev->name, off, strerror(errno));
@@ -111,9 +114,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, 
uint32_t mask,
 uint32_t value)
 {
 VFIOMigration *migration = vbasedev->migration;
-VFIORegion *region = &migration->region;
-off_t dev_state_off = region->fd_offset +
-  VFIO_MIG_STRUCT_OFFSET(device_state);
+off_t dev_state_off = VFIO_MIG_STRUCT_OFFSET(device_state);
 uint32_t device_state;
 int ret;
 
@@ -201,13 +202,13 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice 
*vbasedev, uint64_t *size)
 int ret;
 
 ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-  region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+VFIO_MIG_STRUCT_OFFSET(data_offset));
 if (ret < 0) {
 return ret;
 }
 
 ret = vfio_mig_read(vbasedev, &data_size, sizeof(data_size),
-region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+VFIO_MIG_STRUCT_OFFSET(data_size));
 if (ret < 0) {
 return ret;
 }
@@ -233,8 +234,7 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice 
*vbasedev, uint64_t *size)
 }
 buf_allocated = true;
 
-ret = vfio_mig_read(vbasedev, buf, sec_size,
-region->fd_offset + data_offset);
+ret = vfio_mig_read(vbasedev, buf, sec_size, data_offset);
 if (ret < 0) {
 g_free(buf);
 return ret;
@@ -269,7 +269,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice 
*vbasedev,
 
 do {
 ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-  region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+VFIO_MIG_STRUCT_OFFSET(data_offset));
 if (ret < 0) {
 return ret;
 }
@@ -309,8 +309,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice 
*vbasedev,
 qemu_get_buffer(f, buf, sec_size);
 
 if (buf_alloc) {
-ret = vfio_mig_write(vbasedev, buf, sec_size,
-region->fd_offset + data_offset);
+ret = vfio_mig_write(vbasedev, buf, sec_size, data_offset);
 g_free(buf);
 
 if (ret < 0) {
@@ -322,7 +321,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice 
*vbasedev,
 }
 
 ret = vfio_mig_write(vbasedev, &report_size, sizeof(report_size),
-region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+ VFIO_MIG_STRUCT_OFFSET(data_size));
 if (ret < 0) {
 return ret;
 }
@@ -334,12 

[RFC v4 14/21] vfio-user: get and set IRQs

2022-01-11 Thread John Johnson
Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h |  25 +
 hw/vfio/pci.c   |   9 +++-
 hw/vfio/user.c  | 131 
 3 files changed, 163 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index b1ea55f..4852882 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -121,6 +121,31 @@ typedef struct {
 } VFIOUserRegionInfo;
 
 /*
+ * VFIO_USER_DEVICE_GET_IRQ_INFO
+ * imported from struct vfio_irq_info
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint32_t index;
+uint32_t count;
+} VFIOUserIRQInfo;
+
+/*
+ * VFIO_USER_DEVICE_SET_IRQS
+ * imported from struct vfio_irq_set
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint32_t index;
+uint32_t start;
+uint32_t count;
+} VFIOUserIRQSet;
+
+/*
  * VFIO_USER_REGION_READ
  * VFIO_USER_REGION_WRITE
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5c519ee..e918f8d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -514,7 +514,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, 
unsigned int nr,
 vdev->nr_vectors = nr + 1;
 ret = vfio_enable_vectors(vdev, true);
 if (ret) {
-error_report("vfio: failed to enable vectors, %d", ret);
+error_report("vfio: failed to enable vectors, %s", strerror(-ret));
 }
 } else {
 Error *err = NULL;
@@ -659,7 +659,8 @@ retry:
 ret = vfio_enable_vectors(vdev, false);
 if (ret) {
 if (ret < 0) {
-error_report("vfio: Error: Failed to setup MSI fds: %m");
+error_report("vfio: Error: Failed to setup MSI fds: %s",
+ strerror(-ret));
 } else if (ret != vdev->nr_vectors) {
 error_report("vfio: Error: Failed to enable %d "
  "MSI vectors, retry with %d", vdev->nr_vectors, ret);
@@ -2668,6 +2669,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, 
Error **errp)
 irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
 
 ret = VDEV_GET_IRQ_INFO(vbasedev, &irq_info);
+
 if (ret) {
 /* This can fail for an old kernel or legacy PCI dev */
 trace_vfio_populate_device_get_irq_info_failure(strerror(errno));
@@ -3553,6 +3555,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error 
**errp)
 goto out_teardown;
 }
 
+vfio_register_err_notifier(vdev);
+vfio_register_req_notifier(vdev);
+
 return;
 
 out_teardown:
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 09132a0..99425ef 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -988,6 +988,113 @@ static int vfio_user_get_region_info(VFIOProxy *proxy,
 return 0;
 }
 
+static int vfio_user_get_irq_info(VFIOProxy *proxy,
+  struct vfio_irq_info *info)
+{
+VFIOUserIRQInfo msg;
+
+memset(&msg, 0, sizeof(msg));
+vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_IRQ_INFO,
+  sizeof(msg), 0);
+msg.argsz = info->argsz;
+msg.index = info->index;
+
+vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false);
+if (msg.hdr.flags & VFIO_USER_ERROR) {
+return -msg.hdr.error_reply;
+}
+
+memcpy(info, &msg.argsz, sizeof(*info));
+return 0;
+}
+
+static int irq_howmany(int *fdp, int cur, int max)
+{
+int n = 0;
+
+if (fdp[cur] != -1) {
+do {
+n++;
+} while (n < max && fdp[cur + n] != -1 && n < max_send_fds);
+} else {
+do {
+n++;
+} while (n < max && fdp[cur + n] == -1 && n < max_send_fds);
+}
+
+return n;
+}
+
+static int vfio_user_set_irqs(VFIOProxy *proxy, struct vfio_irq_set *irq)
+{
+g_autofree VFIOUserIRQSet *msgp = NULL;
+uint32_t size, nfds, send_fds, sent_fds;
+
+if (irq->argsz < sizeof(*irq)) {
+error_printf("vfio_user_set_irqs argsz too small\n");
+return -EINVAL;
+}
+
+/*
+ * Handle simple case
+ */
+if ((irq->flags & VFIO_IRQ_SET_DATA_EVENTFD) == 0) {
+size = sizeof(VFIOUserHdr) + irq->argsz;
+msgp = g_malloc0(size);
+
+vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, size, 0);
+msgp->argsz = irq->argsz;
+msgp->flags = irq->flags;
+msgp->index = irq->index;
+msgp->start = irq->start;
+msgp->count = irq->count;
+
+vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, false);
+if (msgp->hdr.flags & VFIO_USER_ERROR) {
+return -msgp->hdr.error_reply;
+}
+
+return 0;
+}
+
+/*
+ * Calculate the number of FDs to send
+ * and adjust argsz
+ */
+nfds = (irq->argsz - sizeof(*irq)) / sizeof(int);
+irq->argsz = sizeof(*irq);
+msgp = g_malloc0(sizeof(*msgp));
+/*
+ * Send in chunks if over max_send_fds
+ */
+for (sent_fds = 0; nfds > sent_fds; sent_fds 
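
The chunking that irq_howmany() performs (count a run of consecutive entries that are either all valid descriptors or all -1, and stop at a per-message limit) can be restated as a small standalone function. This is a simplified sketch, not the patch's code: `sketch_run_length`, `total` and `cap` are hypothetical names, with `cap` standing in for max_send_fds and `total` for the length of the whole array.

```c
/*
 * Count how many consecutive entries, starting at fds[cur], are of the
 * same kind as fds[cur] (valid descriptor, or the -1 placeholder).
 * The run stops at the end of the array or once it reaches cap entries.
 * Assumes cur < total and cap >= 1.
 */
static int sketch_run_length(const int *fds, int cur, int total, int cap)
{
    int want_valid = fds[cur] != -1;
    int n = 1;

    while (cur + n < total && n < cap &&
           (fds[cur + n] != -1) == want_valid) {
        n++;
    }
    return n;
}
```

A caller would advance `cur` by the returned count after sending each chunk, so every message carries either only real descriptors or only placeholders.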

Re: [PATCH v2 3/4] scripts/qapi-gen.py: add --add-trace-points option

2022-01-11 Thread John Snow
On Thu, Dec 23, 2021 at 6:08 AM Vladimir Sementsov-Ogievskiy
 wrote:
>
> Add an option to generate trace points. We should generate both trace
> points and trace-events files for further trace point code generation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Philippe Mathieu-Daudé 
> ---
>  scripts/qapi/gen.py  | 13 ++---
>  scripts/qapi/main.py | 10 +++---
>  2 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
> index 995a97d2b8..605b3fe68a 100644
> --- a/scripts/qapi/gen.py
> +++ b/scripts/qapi/gen.py
> @@ -251,7 +251,7 @@ def __init__(self,
>  self._builtin_blurb = builtin_blurb
>  self._pydoc = pydoc
>  self._current_module: Optional[str] = None
> -self._module: Dict[str, Tuple[QAPIGenC, QAPIGenH]] = {}
> +self._module: Dict[str, Tuple[QAPIGenC, QAPIGenH, QAPIGen]] = {}
>  self._main_module: Optional[str] = None
>
>  @property
> @@ -264,6 +264,11 @@ def _genh(self) -> QAPIGenH:
>  assert self._current_module is not None
>  return self._module[self._current_module][1]
>
> +@property
> +def _gent(self) -> QAPIGen:
> +assert self._current_module is not None
> +return self._module[self._current_module][2]
> +
>  @staticmethod
>  def _module_dirname(name: str) -> str:
>  if QAPISchemaModule.is_user_module(name):
> @@ -293,7 +298,8 @@ def _add_module(self, name: str, blurb: str) -> None:
>  basename = self._module_filename(self._what, name)
>  genc = QAPIGenC(basename + '.c', blurb, self._pydoc)
>  genh = QAPIGenH(basename + '.h', blurb, self._pydoc)
> -self._module[name] = (genc, genh)
> +gent = QAPIGen(basename + '.trace-events')
> +self._module[name] = (genc, genh, gent)
>  self._current_module = name
>
>  @contextmanager
> @@ -304,11 +310,12 @@ def _temp_module(self, name: str) -> Iterator[None]:
>  self._current_module = old_module
>
>  def write(self, output_dir: str, opt_builtins: bool = False) -> None:
> -for name, (genc, genh) in self._module.items():
> +for name, (genc, genh, gent) in self._module.items():
>  if QAPISchemaModule.is_builtin_module(name) and not opt_builtins:
>  continue
>  genc.write(output_dir)
>  genh.write(output_dir)
> +gent.write(output_dir)
>
>  def _begin_builtin_module(self) -> None:
>  pass
> diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
> index f2ea6e0ce4..3adf0319cf 100644
> --- a/scripts/qapi/main.py
> +++ b/scripts/qapi/main.py
> @@ -32,7 +32,8 @@ def generate(schema_file: str,
>   output_dir: str,
>   prefix: str,
>   unmask: bool = False,
> - builtins: bool = False) -> None:
> + builtins: bool = False,
> + add_trace_points: bool = False) -> None:
>  """
>  Generate C code for the given schema into the target directory.
>
> @@ -49,7 +50,7 @@ def generate(schema_file: str,
>  schema = QAPISchema(schema_file)
>  gen_types(schema, output_dir, prefix, builtins)
>  gen_visit(schema, output_dir, prefix, builtins)
> -gen_commands(schema, output_dir, prefix)
> +gen_commands(schema, output_dir, prefix, add_trace_points)
>  gen_events(schema, output_dir, prefix)
>  gen_introspect(schema, output_dir, prefix, unmask)
>
> @@ -74,6 +75,8 @@ def main() -> int:
>  parser.add_argument('-u', '--unmask-non-abi-names', action='store_true',
>  dest='unmask',
>  help="expose non-ABI names in introspection")
> +parser.add_argument('--add-trace-points', action='store_true',
> +help="add trace points to qmp marshals")

"Add trace events to generated marshaling functions." maybe?

>  parser.add_argument('schema', action='store')
>  args = parser.parse_args()
>
> @@ -88,7 +91,8 @@ def main() -> int:
>   output_dir=args.output_dir,
>   prefix=args.prefix,
>   unmask=args.unmask,
> - builtins=args.builtins)
> + builtins=args.builtins,
> + add_trace_points=args.add_trace_points)
>  except QAPIError as err:
>  print(f"{sys.argv[0]}: {str(err)}", file=sys.stderr)
>  return 1
> --
> 2.31.1
>

I suppose the flag is so that non-QEMU invocations of the QAPI
generator (for tests, etc) will compile correctly without tracepoint
definitions, yeah?




[RFC v4 10/21] vfio-user: get device info

2022-01-11 Thread John Johnson
Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h | 14 ++
 hw/vfio/user.h  |  2 ++
 hw/vfio/pci.c   | 26 ++
 hw/vfio/user.c  | 44 
 4 files changed, 86 insertions(+)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index a0889f6..4ad8f45 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -92,4 +92,18 @@ typedef struct {
 #define VFIO_USER_DEF_MAX_XFER  (1024 * 1024)
 #define VFIO_USER_MAX_MAX_XFER  (64 * 1024 * 1024)
 
+
+/*
+ * VFIO_USER_DEVICE_GET_INFO
+ * imported from struct_device_info
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint32_t num_regions;
+uint32_t num_irqs;
+uint32_t cap_offset;
+} VFIOUserDeviceInfo;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 7ef3c95..19edd84 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -83,4 +83,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
void *reqarg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
+extern VFIODevIO vfio_dev_io_sock;
+
 #endif /* VFIO_USER_H */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 3080bd4..6f85853 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3415,6 +3415,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 VFIODevice *vbasedev = &vdev->vbasedev;
 SocketAddress addr;
 VFIOProxy *proxy;
+struct vfio_device_info info;
+int ret;
 Error *err = NULL;
 
 /*
@@ -3454,6 +3456,30 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error 
**errp)
 vbasedev->fd = -1;
 vbasedev->type = VFIO_DEVICE_TYPE_PCI;
 vbasedev->ops = &vfio_user_pci_ops;
+vbasedev->io_ops = &vfio_dev_io_sock;
+
+ret = VDEV_GET_INFO(vbasedev, &info);
+if (ret) {
+error_setg_errno(errp, -ret, "get info failure");
+goto error;
+}
+/* must be PCI */
+if ((info.flags & VFIO_DEVICE_FLAGS_PCI) == 0) {
+error_setg(errp, "remote device not PCI");
+goto error;
+}
+
+vbasedev->num_irqs = info.num_irqs;
+vbasedev->num_regions = info.num_regions;
+vbasedev->flags = info.flags;
+vbasedev->reset_works = !!(info.flags & VFIO_DEVICE_FLAGS_RESET);
+
+vfio_get_all_regions(vbasedev);
+vfio_populate_device(vdev, &err);
+if (err) {
+error_propagate(errp, err);
+goto error;
+}
 
 return;
 
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index fd1e0a8..671c4f1 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -907,3 +907,47 @@ int vfio_user_validate_version(VFIODevice *vbasedev, Error 
**errp)
 
 return 0;
 }
+
+static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info)
+{
+VFIOUserDeviceInfo msg;
+
+memset(&msg, 0, sizeof(msg));
+vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_INFO, sizeof(msg), 0);
+msg.argsz = sizeof(struct vfio_device_info);
+
+vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false);
+if (msg.hdr.flags & VFIO_USER_ERROR) {
+return -msg.hdr.error_reply;
+}
+
+memcpy(info, &msg.argsz, sizeof(*info));
+return 0;
+}
+
+
+/*
+ * Socket-based io_ops
+ */
+
+static int vfio_user_io_get_info(VFIODevice *vbasedev,
+ struct vfio_device_info *info)
+{
+int ret;
+
+ret = vfio_user_get_info(vbasedev->proxy, info);
+if (ret) {
+return ret;
+}
+
+/* clamp these to defend against a malicious server */
+info->num_regions = MIN(info->num_regions, 100);
+info->num_irqs = MIN(info->num_irqs, 100);
+
+return 0;
+}
+
+VFIODevIO vfio_dev_io_sock = {
+.get_info = vfio_user_io_get_info,
+};
+
-- 
1.8.3.1
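
The comment in vfio_user_io_get_info() says the region and IRQ counts are clamped to defend against a malicious server. Bounding an untrusted count from above means taking the smaller of the reported value and the limit. A minimal sketch of that idea follows; `sketch_clamp_count` and `SKETCH_MAX_COUNT` are hypothetical names, and the limit of 100 is simply the constant that appears in the patch.

```c
#include <stdint.h>

#define SKETCH_MAX_COUNT 100u   /* hypothetical upper bound */

/* Return the untrusted count, but never more than the upper bound. */
static uint32_t sketch_clamp_count(uint32_t untrusted)
{
    return untrusted < SKETCH_MAX_COUNT ? untrusted : SKETCH_MAX_COUNT;
}
```

Values at or below the limit pass through unchanged, so a well-behaved server is unaffected, while an oversized value cannot drive later loops or allocations past the bound.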




[RFC v4 15/21] vfio-user: proxy container connect/disconnect

2022-01-11 Thread John Johnson
Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user.h|   1 +
 include/hw/vfio/vfio-common.h |   3 ++
 hw/vfio/common.c  | 105 ++
 hw/vfio/pci.c |  25 ++
 hw/vfio/user.c|   3 ++
 5 files changed, 137 insertions(+)

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index f2098f2..8d03e7c 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -85,5 +85,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
 extern VFIODevIO vfio_dev_io_sock;
+extern VFIOContIO vfio_cont_io_sock;
 
 #endif /* VFIO_USER_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4118b8a..59a8299 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -94,6 +94,7 @@ typedef struct VFIOContainer {
 uint64_t max_dirty_bitmap_size;
 unsigned long pgsizes;
 unsigned int dma_max_mappings;
+VFIOProxy *proxy;
 QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
 QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
 QLIST_HEAD(, VFIOGroup) group_list;
@@ -278,6 +279,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, 
Error **errp);
 void vfio_put_group(VFIOGroup *group);
 int vfio_get_device(VFIOGroup *group, const char *name,
 VFIODevice *vbasedev, Error **errp);
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as);
+void vfio_disconnect_proxy(VFIOGroup *group);
 
 extern const MemoryRegionOps vfio_region_ops;
 typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 83cc5ec..9a67934 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include <sys/ioctl.h>
 #ifdef CONFIG_KVM
 #include <linux/kvm.h>
@@ -2209,6 +2210,62 @@ put_space_exit:
 return ret;
 }
 
+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as)
+{
+VFIOAddressSpace *space;
+VFIOContainer *container;
+
+if (QLIST_EMPTY(&vfio_group_list)) {
+qemu_register_reset(vfio_reset_handler, NULL);
+}
+
+QLIST_INSERT_HEAD(&vfio_group_list, group, next);
+
+/*
+ * try to mirror vfio_connect_container()
+ * as much as possible
+ */
+
+space = vfio_get_address_space(as);
+
+container = g_malloc0(sizeof(*container));
+container->space = space;
+container->fd = -1;
+container->io_ops = &vfio_cont_io_sock;
+QLIST_INIT(&container->giommu_list);
+QLIST_INIT(&container->hostwin_list);
+container->proxy = proxy;
+
+/*
+ * The proxy uses a SW IOMMU in lieu of the HW one
+ * used in the ioctl() version.  Use TYPE1 with the
+ * target's page size for maximum compatibility
+ */
+container->iommu_type = VFIO_TYPE1_IOMMU;
+vfio_host_win_add(container, 0, (hwaddr)-1, TARGET_PAGE_SIZE);
+container->pgsizes = TARGET_PAGE_SIZE;
+
+container->dirty_pages_supported = true;
+container->max_dirty_bitmap_size = VFIO_USER_DEF_MAX_XFER;
+container->dirty_pgsizes = TARGET_PAGE_SIZE;
+
+QLIST_INIT(&container->group_list);
+QLIST_INSERT_HEAD(&space->containers, container, next);
+
+group->container = container;
+QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+
+container->listener = vfio_memory_listener;
+memory_listener_register(&container->listener, container->space->as);
+container->initialized = true;
+}
+
+#endif /* CONFIG_VFIO_USER */
+
+
 static void vfio_disconnect_container(VFIOGroup *group)
 {
 VFIOContainer *container = group->container;
@@ -2258,6 +2315,54 @@ static void vfio_disconnect_container(VFIOGroup *group)
 }
 }
 
+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_disconnect_proxy(VFIOGroup *group)
+{
+VFIOContainer *container = group->container;
+VFIOAddressSpace *space = container->space;
+VFIOGuestIOMMU *giommu, *tmp;
+VFIOHostDMAWindow *hostwin, *next;
+
+/*
+ * try to mirror vfio_disconnect_container()
+ * as much as possible, knowing each device
+ * is in one group and one container
+ */
+
+QLIST_REMOVE(group, container_next);
+group->container = NULL;
+
+/*
+ * Explicitly release the listener first before unset container,
+ * since unset may destroy the backend container if it's the last
+ * group.
+ */
+memory_listener_unregister(&container->listener);
+
+QLIST_REMOVE(container, next);
+
+QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) {
+memory_region_unregister_iommu_notifier(
+MEMORY_REGION(giommu->iommu), &giommu->n);
+QLIST_REMOVE(giommu, giommu_next);
+g_free(giommu);
+}
+
+QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next,
+   next) {
+QLIST_REMOVE(hostwin, hostwin_next);
+g_free(hostwin);
+}
+
+g_free(container);
+

[RFC v4 08/21] vfio-user: define socket receive functions

2022-01-11 Thread John Johnson
Add infrastructure needed to receive incoming messages

Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h |  54 
 hw/vfio/user.h  |   6 +
 hw/vfio/pci.c   |   6 +
 hw/vfio/user.c  | 327 
 MAINTAINERS |   1 +
 5 files changed, 394 insertions(+)
 create mode 100644 hw/vfio/user-protocol.h

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
new file mode 100644
index 0000000..d23877c
--- /dev/null
+++ b/hw/vfio/user-protocol.h
@@ -0,0 +1,54 @@
+#ifndef VFIO_USER_PROTOCOL_H
+#define VFIO_USER_PROTOCOL_H
+
+/*
+ * vfio protocol over a UNIX socket.
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Each message has a standard header that describes the command
+ * being sent, which is almost always a VFIO ioctl().
+ *
+ * The header may be followed by command-specific data, such as the
+ * region and offset info for read and write commands.
+ */
+
+typedef struct {
+uint16_t id;
+uint16_t command;
+uint32_t size;
+uint32_t flags;
+uint32_t error_reply;
+} VFIOUserHdr;
+
+/* VFIOUserHdr commands */
+enum vfio_user_command {
+VFIO_USER_VERSION   = 1,
+VFIO_USER_DMA_MAP   = 2,
+VFIO_USER_DMA_UNMAP = 3,
+VFIO_USER_DEVICE_GET_INFO   = 4,
+VFIO_USER_DEVICE_GET_REGION_INFO= 5,
+VFIO_USER_DEVICE_GET_REGION_IO_FDS  = 6,
+VFIO_USER_DEVICE_GET_IRQ_INFO   = 7,
+VFIO_USER_DEVICE_SET_IRQS   = 8,
+VFIO_USER_REGION_READ   = 9,
+VFIO_USER_REGION_WRITE  = 10,
+VFIO_USER_DMA_READ  = 11,
+VFIO_USER_DMA_WRITE = 12,
+VFIO_USER_DEVICE_RESET  = 13,
+VFIO_USER_DIRTY_PAGES   = 14,
+VFIO_USER_MAX,
+};
+
+/* VFIOUserHdr flags */
+#define VFIO_USER_REQUEST   0x0
+#define VFIO_USER_REPLY 0x1
+#define VFIO_USER_TYPE  0xF
+
+#define VFIO_USER_NO_REPLY  0x10
+#define VFIO_USER_ERROR 0x20
+
+#endif /* VFIO_USER_PROTOCOL_H */
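
For readers following along, here is a minimal sketch of how the header and
flag bits defined above might be used. It repeats the struct and flag
definitions locally so that it compiles on its own; example_init_request()
and example_reply_status() are hypothetical helpers written for this
illustration and are not functions from the patch series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the header layout from the patch, so the sketch stands alone. */
typedef struct {
    uint16_t id;
    uint16_t command;
    uint32_t size;
    uint32_t flags;
    uint32_t error_reply;
} VFIOUserHdr;

#define VFIO_USER_REQUEST       0x0
#define VFIO_USER_REPLY         0x1
#define VFIO_USER_TYPE          0xF
#define VFIO_USER_NO_REPLY      0x10
#define VFIO_USER_ERROR         0x20

#define VFIO_USER_DEVICE_RESET  13

/* Hypothetical helper: fill in a request header for the given command. */
static void example_init_request(VFIOUserHdr *hdr, uint16_t id,
                                 uint16_t command, uint32_t size)
{
    assert(hdr != NULL && size >= sizeof(*hdr));
    memset(hdr, 0, sizeof(*hdr));
    hdr->id = id;
    hdr->command = command;
    hdr->size = size;               /* total size, header included */
    hdr->flags = VFIO_USER_REQUEST;
}

/*
 * Hypothetical helper: returns -1 if the header is not a reply, 0 for a
 * successful reply, or the error code carried by an error reply.
 */
static int example_reply_status(const VFIOUserHdr *hdr)
{
    assert(hdr != NULL);
    if ((hdr->flags & VFIO_USER_TYPE) != VFIO_USER_REPLY) {
        return -1;
    }
    if (hdr->flags & VFIO_USER_ERROR) {
        return (int)hdr->error_reply;
    }
    return 0;
}
```

The low four bits (VFIO_USER_TYPE) select request versus reply, which is why
the sketch masks before comparing instead of testing a single bit.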
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index da92862..72eefa7 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -11,6 +11,8 @@
  *
  */
 
+#include "user-protocol.h"
+
 typedef struct {
 int send_fds;
 int recv_fds;
@@ -27,6 +29,7 @@ enum msg_type {
 
 typedef struct VFIOUserMsg {
 QTAILQ_ENTRY(VFIOUserMsg) next;
+VFIOUserHdr *hdr;
 VFIOUserFDs *fds;
 uint32_t rsize;
 uint32_t id;
@@ -74,5 +77,8 @@ typedef struct VFIOProxy {
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
+void vfio_user_set_handler(VFIODevice *vbasedev,
+   void (*handler)(void *opaque, VFIOUserMsg *msg),
+   void *reqarg);
 
 #endif /* VFIO_USER_H */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9fd7c07..0de915d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3386,6 +3386,11 @@ type_init(register_vfio_pci_dev_type)
  * vfio-user routines.
  */
 
+static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg)
+{
+
+}
+
 /*
  * Emulated devices don't use host hot reset
  */
@@ -3432,6 +3437,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 return;
 }
 vbasedev->proxy = proxy;
+vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev);
 
 vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
 vbasedev->dev = DEVICE(vdev);
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index c843f90..e1dfd5d 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -25,10 +25,26 @@
 #include "sysemu/iothread.h"
 #include "user.h"
 
+static uint64_t max_xfer_size;
 static IOThread *vfio_user_iothread;
 
 static void vfio_user_shutdown(VFIOProxy *proxy);
+static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr,
+ VFIOUserFDs *fds);
+static VFIOUserFDs *vfio_user_getfds(int numfds);
+static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg);
 
+static void vfio_user_recv(void *opaque);
+static int vfio_user_recv_one(VFIOProxy *proxy);
+static void vfio_user_cb(void *opaque);
+
+static void vfio_user_request(void *opaque);
+
+static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err)
+{
+hdr->flags |= VFIO_USER_ERROR;
+hdr->error_reply = err;
+}
 
 /*
  * Functions called by main, CPU, or iothread threads
@@ -40,10 +56,261 @@ static void vfio_user_shutdown(VFIOProxy *proxy)
 qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL);
 }
 
+static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr,
+ VFIOUserFDs *fds)
+{
+

[RFC v4 03/21] vfio-user: add container IO ops vector

2022-01-11 Thread John Johnson
Used for communication with VFIO driver
(prep work for vfio-user, which will communicate over a socket)

Signed-off-by: John G Johnson 
---
 include/hw/vfio/vfio-common.h |  33 +++
 hw/vfio/common.c  | 126 --
 2 files changed, 117 insertions(+), 42 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 8af11b0..2761a62 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -75,6 +75,7 @@ typedef struct VFIOAddressSpace {
 } VFIOAddressSpace;
 
 struct VFIOGroup;
+typedef struct VFIOContIO VFIOContIO;
 
 typedef struct VFIOContainer {
 VFIOAddressSpace *space;
@@ -83,6 +84,7 @@ typedef struct VFIOContainer {
 MemoryListener prereg_listener;
 unsigned iommu_type;
 Error *error;
+VFIOContIO *io_ops;
 bool initialized;
 bool dirty_pages_supported;
 uint64_t dirty_pgsizes;
@@ -154,6 +156,37 @@ struct VFIODeviceOps {
 int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };
 
+#ifdef CONFIG_LINUX
+
+/*
+ * The next 2 ops vectors are how Devices and Containers
+ * communicate with the server.  The default option is
+ * through ioctl() to the kernel VFIO driver, but vfio-user
+ * can use a socket to a remote process.
+ */
+
+struct VFIOContIO {
+int (*dma_map)(VFIOContainer *container,
+   struct vfio_iommu_type1_dma_map *map);
+int (*dma_unmap)(VFIOContainer *container,
+ struct vfio_iommu_type1_dma_unmap *unmap,
+ struct vfio_bitmap *bitmap);
+int (*dirty_bitmap)(VFIOContainer *container,
+struct vfio_iommu_type1_dirty_bitmap *bitmap,
+struct vfio_iommu_type1_dirty_bitmap_get *range);
+};
+
+#define CONT_DMA_MAP(cont, map) \
+((cont)->io_ops->dma_map((cont), (map)))
+#define CONT_DMA_UNMAP(cont, unmap, bitmap) \
+((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap)))
+#define CONT_DIRTY_BITMAP(cont, bitmap, range) \
+((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
+
+extern VFIOContIO vfio_cont_io_ioctl;
+
+#endif /* CONFIG_LINUX */
+
 typedef struct VFIOGroup {
 int fd;
 int groupid;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e..dbf23c0 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -431,12 +431,12 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
 goto unmap_exit;
 }
 
-ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
+ret = CONT_DMA_UNMAP(container, unmap, bitmap);
 if (!ret) {
 cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
 iotlb->translated_addr, pages);
 } else {
-error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
+error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %s", strerror(-ret));
 }
 
 g_free(bitmap->data);
@@ -464,30 +464,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
 return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
 }
 
-while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
-/*
- * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
- * v4.15) where an overflow in its wrap-around check prevents us from
- * unmapping the last page of the address space.  Test for the error
- * condition and re-try the unmap excluding the last page.  The
- * expectation is that we've never mapped the last page anyway and this
- * unmap request comes via vIOMMU support which also makes it unlikely
- * that this page is used.  This bug was introduced well after type1 v2
- * support was introduced, so we shouldn't need to test for v1.  A fix
- * is queued for kernel v5.0 so this workaround can be removed once
- * affected kernels are sufficiently deprecated.
- */
-if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) &&
-container->iommu_type == VFIO_TYPE1v2_IOMMU) {
-trace_vfio_dma_unmap_overflow_workaround();
-unmap.size -= 1ULL << ctz64(container->pgsizes);
-continue;
-}
-error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
-return -errno;
-}
-
-return 0;
+return CONT_DMA_UNMAP(container, &unmap, NULL);
 }
 
 static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
@@ -500,24 +477,18 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
 .iova = iova,
 .size = size,
 };
+int ret;
 
 if (!readonly) {
 map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
 }
 
-/*
- * Try the mapping, if it fails with EBUSY, unmap the region and try
- * again.  This shouldn't be necessary, but we sometimes see it in
- * the VGA ROM space.
- */
-if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
-(errno == EBUSY && vfio_dma_unmap(container, iova, size, 

[RFC v4 17/21] vfio-user: secure DMA support

2022-01-11 Thread John Johnson
Secure DMA forces the remote process to use DMA r/w messages
instead of directly mapping guest memory.

Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/pci.h  | 1 +
 hw/vfio/user.h | 1 +
 hw/vfio/pci.c  | 4 
 hw/vfio/user.c | 2 +-
 4 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 643ff75..156fee2 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -193,6 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI)
 struct VFIOUserPCIDevice {
 VFIOPCIDevice device;
 char *sock_name;
+bool secure_dma;/* disable shared mem for DMA */
 bool send_queued;   /* all sends are queued */
 bool no_post;   /* all regions write are sync */
 };
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 8d03e7c..997f748 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -74,6 +74,7 @@ typedef struct VFIOProxy {
 
 /* VFIOProxy flags */
 #define VFIO_PROXY_CLIENT0x1
+#define VFIO_PROXY_SECURE0x2
 #define VFIO_PROXY_FORCE_QUEUED  0x4
 #define VFIO_PROXY_NO_POST   0x8
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 1fc79ef..b86acd1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3483,6 +3483,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 vbasedev->proxy = proxy;
 vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev);
 
+if (udev->secure_dma) {
+proxy->flags |= VFIO_PROXY_SECURE;
+}
 if (udev->send_queued) {
 proxy->flags |= VFIO_PROXY_FORCE_QUEUED;
 }
@@ -3607,6 +3610,7 @@ static void vfio_user_instance_finalize(Object *obj)
 
 static Property vfio_user_pci_dev_properties[] = {
 DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
+DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false),
 DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false),
 DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 5c27a5e..fb0165d 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1441,7 +1441,7 @@ static int vfio_user_io_dma_map(VFIOContainer *container, MemoryRegion *mr,
  * map->vaddr enters as a QEMU process address
  * make it either a file offset for mapped areas or 0
  */
-if (fd != -1) {
+if (fd != -1 && (container->proxy->flags & VFIO_PROXY_SECURE) == 0) {
 void *addr = (void *)(uintptr_t)map->vaddr;
 
 map->vaddr = qemu_ram_block_host_offset(mr->ram_block, addr);
-- 
1.8.3.1
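
As a rough illustration of the condition changed in the last hunk: the region
is shared with the server only when it is backed by a file descriptor and the
proxy is not in secure mode. The sketch below is stand-alone;
ExampleMapRequest and example_prepare_map() are hypothetical, and the
assumption that no fd is passed in the secure case comes from the commit
message, not from code shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define VFIO_PROXY_SECURE 0x2   /* same value as in the patch */

/* Hypothetical summary of what a DMA map request would carry. */
typedef struct {
    uint64_t vaddr;     /* file offset when memory is shared, else 0 */
    int fd;             /* fd passed to the server, or -1 for none */
} ExampleMapRequest;

/*
 * Decide whether guest memory is shared with the server.  With
 * VFIO_PROXY_SECURE set, nothing is shared, so the server has to fall
 * back to DMA read/write messages.
 */
static ExampleMapRequest example_prepare_map(int region_fd,
                                             uint64_t file_offset,
                                             int proxy_flags)
{
    ExampleMapRequest req = { .vaddr = 0, .fd = -1 };

    assert(region_fd >= -1);
    if (region_fd != -1 && (proxy_flags & VFIO_PROXY_SECURE) == 0) {
        req.vaddr = file_offset;
        req.fd = region_fd;
    }
    return req;
}
```

With the property from the patch, this mode would be enabled per device via
secure-dma=on on the device's command-line options.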




[RFC v4 19/21] vfio-user: pci reset

2022-01-11 Thread John Johnson
Message to tell the server to reset the device.

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user.h |  1 +
 hw/vfio/pci.c  | 15 +++
 hw/vfio/user.c | 12 
 3 files changed, 28 insertions(+)

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index e6c1091..7504681 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -88,6 +88,7 @@ void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size);
 void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error);
 void vfio_user_putfds(VFIOUserMsg *msg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
+void vfio_user_reset(VFIOProxy *proxy);
 
 extern VFIODevIO vfio_dev_io_sock;
 extern VFIOContIO vfio_cont_io_sock;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7479dc4..d47b98e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3713,6 +3713,20 @@ static void vfio_user_instance_finalize(Object *obj)
 }
 }
 
+static void vfio_user_pci_reset(DeviceState *dev)
+{
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+
+vfio_pci_pre_reset(vdev);
+
+if (vbasedev->reset_works) {
+vfio_user_reset(vbasedev->proxy);
+}
+
+vfio_pci_post_reset(vdev);
+}
+
 static Property vfio_user_pci_dev_properties[] = {
 DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
 DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false),
@@ -3726,6 +3740,7 @@ static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
 
+dc->reset = vfio_user_pci_reset;
 device_class_set_props(dc, vfio_user_pci_dev_properties);
 dc->desc = "VFIO over socket PCI device assignment";
 pdc->realize = vfio_user_pci_realize;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index e377b0f..33d8f06 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1398,6 +1398,18 @@ static int vfio_user_region_write(VFIOProxy *proxy, uint8_t index, off_t offset,
 return ret;
 }
 
+void vfio_user_reset(VFIOProxy *proxy)
+{
+VFIOUserHdr msg;
+
+vfio_user_request_msg(&msg, VFIO_USER_DEVICE_RESET, sizeof(msg), 0);
+
+vfio_user_send_wait(proxy, &msg, NULL, 0, false);
+if (msg.flags & VFIO_USER_ERROR) {
+error_printf("reset reply error %d\n", msg.error_reply);
+}
+}
+
 
 /*
  * Socket-based io_ops
-- 
1.8.3.1




[RFC v4 21/21] Only set qemu file error if saving state so the file exists

2022-01-11 Thread John Johnson
Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/migration.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index df63f5c..e72241d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -742,7 +742,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
  */
 error_report("%s: Failed to set device state 0x%x", vbasedev->name,
  (migration->device_state & mask) | value);
-qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+if (value != 0) {
+qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+}
 }
 vbasedev->migration->vm_running = running;
 trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
-- 
1.8.3.1




[RFC v4 07/21] vfio-user: connect vfio proxy to remote server

2022-01-11 Thread John Johnson
add user.c & user.h files for vfio-user code
add proxy struct to handle comms with remote server

Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user.h|  78 +++
 include/hw/vfio/vfio-common.h |   2 +
 hw/vfio/pci.c |  19 +
 hw/vfio/user.c| 170 ++
 MAINTAINERS   |   4 +
 hw/vfio/meson.build   |   1 +
 6 files changed, 274 insertions(+)
 create mode 100644 hw/vfio/user.h
 create mode 100644 hw/vfio/user.c

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
new file mode 100644
index 0000000..da92862
--- /dev/null
+++ b/hw/vfio/user.h
@@ -0,0 +1,78 @@
+#ifndef VFIO_USER_H
+#define VFIO_USER_H
+
+/*
+ * vfio protocol over a UNIX socket.
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+typedef struct {
+int send_fds;
+int recv_fds;
+int *fds;
+} VFIOUserFDs;
+
+enum msg_type {
+VFIO_MSG_NONE,
+VFIO_MSG_ASYNC,
+VFIO_MSG_WAIT,
+VFIO_MSG_NOWAIT,
+VFIO_MSG_REQ,
+};
+
+typedef struct VFIOUserMsg {
+QTAILQ_ENTRY(VFIOUserMsg) next;
+VFIOUserFDs *fds;
+uint32_t rsize;
+uint32_t id;
+QemuCond cv;
+bool complete;
+enum msg_type type;
+} VFIOUserMsg;
+
+
+enum proxy_state {
+VFIO_PROXY_CONNECTED = 1,
+VFIO_PROXY_ERROR = 2,
+VFIO_PROXY_CLOSING = 3,
+VFIO_PROXY_CLOSED = 4,
+};
+
+typedef QTAILQ_HEAD(VFIOUserMsgQ, VFIOUserMsg) VFIOUserMsgQ;
+
+typedef struct VFIOProxy {
+QLIST_ENTRY(VFIOProxy) next;
+char *sockname;
+struct QIOChannel *ioc;
+void (*request)(void *opaque, VFIOUserMsg *msg);
+void *req_arg;
+int flags;
+QemuCond close_cv;
+AioContext *ctx;
+QEMUBH *req_bh;
+
+/*
+ * above only changed when BQL is held
+ * below are protected by per-proxy lock
+ */
+QemuMutex lock;
+VFIOUserMsgQ free;
+VFIOUserMsgQ pending;
+VFIOUserMsgQ incoming;
+VFIOUserMsgQ outgoing;
+VFIOUserMsg *last_nowait;
+enum proxy_state state;
+} VFIOProxy;
+
+/* VFIOProxy flags */
+#define VFIO_PROXY_CLIENT0x1
+
+VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
+void vfio_user_disconnect(VFIOProxy *proxy);
+
+#endif /* VFIO_USER_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 826cd98..3eb0b19 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -76,6 +76,7 @@ typedef struct VFIOAddressSpace {
 
 struct VFIOGroup;
 typedef struct VFIOContIO VFIOContIO;
+typedef struct VFIOProxy VFIOProxy;
 
 typedef struct VFIOContainer {
 VFIOAddressSpace *space;
@@ -147,6 +148,7 @@ typedef struct VFIODevice {
 VFIOMigration *migration;
 Error *migration_blocker;
 OnOffAuto pre_copy_dirty_page_tracking;
+VFIOProxy *proxy;
 struct vfio_region_info **regions;
 } VFIODevice;
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 6abe474..9fd7c07 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -43,6 +43,7 @@
 #include "qapi/error.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
+#include "hw/vfio/user.h"
 
 /* convenience macros for PCI config space */
 #define VDEV_CONFIG_READ(vbasedev, off, size, data) \
@@ -3407,6 +3408,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev);
 VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 VFIODevice *vbasedev = &vdev->vbasedev;
+SocketAddress addr;
+VFIOProxy *proxy;
+Error *err = NULL;
 
 /*
  * TODO: make option parser understand SocketAddress
@@ -3419,6 +3423,16 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 return;
 }
 
+memset(&addr, 0, sizeof(addr));
+addr.type = SOCKET_ADDRESS_TYPE_UNIX;
+addr.u.q_unix.path = udev->sock_name;
+proxy = vfio_user_connect_dev(&addr, &err);
+if (!proxy) {
+error_setg(errp, "Remote proxy not found");
+return;
+}
+vbasedev->proxy = proxy;
+
 vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
 vbasedev->dev = DEVICE(vdev);
 vbasedev->fd = -1;
@@ -3430,8 +3444,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
 static void vfio_user_instance_finalize(Object *obj)
 {
 VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
+VFIODevice *vbasedev = &vdev->vbasedev;
 
 vfio_put_device(vdev);
+
+if (vbasedev->proxy != NULL) {
+vfio_user_disconnect(vbasedev->proxy);
+}
 }
 
 static Property vfio_user_pci_dev_properties[] = {
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
new file mode 100644
index 0000000..c843f90
--- /dev/null
+++ b/hw/vfio/user.c
@@ -0,0 +1,170 @@
+/*
+ * vfio protocol over a UNIX socket.
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This 

[RFC v4 18/21] vfio-user: dma read/write operations

2022-01-11 Thread John Johnson
Messages from server to client that perform device DMA.

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h |  11 +
 hw/vfio/user.h  |   4 ++
 hw/vfio/pci.c   | 105 
 hw/vfio/user.c  |  60 ++-
 4 files changed, 179 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index ad63f21..8932311 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -182,6 +182,17 @@ typedef struct {
 char data[];
 } VFIOUserRegionRW;
 
+/*
+ * VFIO_USER_DMA_READ
+ * VFIO_USER_DMA_WRITE
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint64_t offset;
+uint32_t count;
+char data[];
+} VFIOUserDMARW;
+
 /*imported from struct vfio_bitmap */
 typedef struct {
 uint64_t pgsize;
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 997f748..e6c1091 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -80,9 +80,13 @@ typedef struct VFIOProxy {
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
+uint64_t vfio_user_max_xfer(void);
 void vfio_user_set_handler(VFIODevice *vbasedev,
void (*handler)(void *opaque, VFIOUserMsg *msg),
void *reqarg);
+void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size);
+void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error);
+void vfio_user_putfds(VFIOUserMsg *msg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
 extern VFIODevIO vfio_dev_io_sock;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b86acd1..7479dc4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3427,11 +3427,116 @@ type_init(register_vfio_pci_dev_type)
  * vfio-user routines.
  */
 
+static void vfio_user_dma_read(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+PCIDevice *pdev = &vdev->pdev;
+VFIOProxy *proxy = vdev->vbasedev.proxy;
+VFIOUserDMARW *res;
+MemTxResult r;
+size_t size;
+
+if (msg->hdr.size < sizeof(*msg)) {
+vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+return;
+}
+if (msg->count > vfio_user_max_xfer()) {
+vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+return;
+}
+
+/* switch to our own message buffer */
+size = msg->count + sizeof(VFIOUserDMARW);
+res = g_malloc0(size);
+memcpy(res, msg, sizeof(*res));
+g_free(msg);
+
+r = pci_dma_read(pdev, res->offset, &res->data, res->count);
+
+switch (r) {
+case MEMTX_OK:
+if (res->hdr.flags & VFIO_USER_NO_REPLY) {
+g_free(res);
+return;
+}
+vfio_user_send_reply(proxy, &res->hdr, size);
+break;
+case MEMTX_ERROR:
+vfio_user_send_error(proxy, &res->hdr, EFAULT);
+break;
+case MEMTX_DECODE_ERROR:
+vfio_user_send_error(proxy, &res->hdr, ENODEV);
+break;
+}
+}
+
+static void vfio_user_dma_write(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+PCIDevice *pdev = &vdev->pdev;
+VFIOProxy *proxy = vdev->vbasedev.proxy;
+MemTxResult r;
+
+if (msg->hdr.size < sizeof(*msg)) {
+vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+return;
+}
+/* make sure transfer count isn't larger than the message data */
+if (msg->count > msg->hdr.size - sizeof(*msg)) {
+vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+return;
+}
+
+r = pci_dma_write(pdev, msg->offset, &msg->data, msg->count);
+
+switch (r) {
+case MEMTX_OK:
+if ((msg->hdr.flags & VFIO_USER_NO_REPLY) == 0) {
+vfio_user_send_reply(proxy, &msg->hdr, sizeof(msg->hdr));
+} else {
+g_free(msg);
+}
+break;
+case MEMTX_ERROR:
+vfio_user_send_error(proxy, &msg->hdr, EFAULT);
+break;
+case MEMTX_DECODE_ERROR:
+vfio_user_send_error(proxy, &msg->hdr, ENODEV);
+break;
+}
+
+return;
+}
+
+/*
+ * Incoming request message callback.
+ *
+ * Runs off main loop, so BQL held.
+ */
 static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg)
 {
+VFIOPCIDevice *vdev = opaque;
+VFIOUserHdr *hdr = msg->hdr;
+
+/* no incoming PCI requests pass FDs */
+if (msg->fds != NULL) {
+vfio_user_send_error(vdev->vbasedev.proxy, hdr, EINVAL);
+vfio_user_putfds(msg);
+return;
+}
 
+switch (hdr->command) {
+case VFIO_USER_DMA_READ:
+vfio_user_dma_read(vdev, (VFIOUserDMARW *)hdr);
+break;
+case VFIO_USER_DMA_WRITE:
+vfio_user_dma_write(vdev, (VFIOUserDMARW *)hdr);
+break;
+default:
+error_printf("vfio_user_process_req unknown cmd %d\n", hdr->command);
+vfio_user_send_error(vdev->vbasedev.proxy, hdr, ENOSYS);
+}
 }
 
+
 /*
  * Emulated devices don't use host hot reset
  */
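
The two size checks at the top of vfio_user_dma_write() are worth a closer
look, since their order matters: the subtraction in the second check is
unsigned and is only safe because the first check has already rejected
messages shorter than the fixed part. Below is a stand-alone sketch of just
that validation. The struct definitions are copied from the patches;
example_check_dma_write() is a hypothetical helper, not a function from the
series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t id;
    uint16_t command;
    uint32_t size;
    uint32_t flags;
    uint32_t error_reply;
} VFIOUserHdr;

typedef struct {
    VFIOUserHdr hdr;
    uint64_t offset;
    uint32_t count;
    char data[];
} VFIOUserDMARW;

/*
 * Hypothetical helper mirroring the checks in the patch.  Returns 0 if
 * the message is acceptable, otherwise the errno value that would be
 * sent back in the error reply.
 */
static int example_check_dma_write(const VFIOUserDMARW *msg)
{
    assert(msg != NULL);

    /* The message must at least contain the fixed-size part. */
    if (msg->hdr.size < sizeof(*msg)) {
        return EINVAL;
    }
    /* Cannot underflow: hdr.size >= sizeof(*msg) was checked above. */
    if (msg->count > msg->hdr.size - sizeof(*msg)) {
        return E2BIG;
    }
    return 0;
}
```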
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index fb0165d..e377b0f 100644
--- 

[RFC v4 01/21] vfio-user: introduce vfio-user protocol specification

2022-01-11 Thread John Johnson
From: Thanos Makatos 

This patch introduces the vfio-user protocol specification (formerly
known as VFIO-over-socket), which is designed to allow devices to be
emulated outside QEMU, in a separate process. vfio-user reuses the
existing VFIO defines, structs and concepts.

It has been earlier discussed as an RFC in:
"RFC: use VFIO over a UNIX domain socket to implement device offloading"

Signed-off-by: John G Johnson 
Signed-off-by: Thanos Makatos 
Signed-off-by: John Levon 
---
 docs/devel/index.rst |1 +
 docs/devel/vfio-user.rst | 1810 ++
 MAINTAINERS  |6 +
 3 files changed, 1817 insertions(+)
 create mode 100644 docs/devel/vfio-user.rst

diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index afd9375..23d2c30 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -48,3 +48,4 @@ modifying QEMU's source code.
trivial-patches
submitting-a-patch
submitting-a-pull-request
+   vfio-user
diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst
new file mode 100644
index 0000000..97a7506
--- /dev/null
+++ b/docs/devel/vfio-user.rst
@@ -0,0 +1,1810 @@
+.. include:: 
+********************************
+vfio-user Protocol Specification
+********************************
+
+--------------
+Version_ 0.9.1
+--------------
+
+.. contents:: Table of Contents
+
+Introduction
+============
+vfio-user is a protocol that allows a device to be emulated in a separate
+process outside of a Virtual Machine Monitor (VMM). vfio-user devices consist
+of a generic VFIO device type, living inside the VMM, which we call the client,
+and the core device implementation, living outside the VMM, which we call the
+server.
+
+The vfio-user specification is partly based on the
+`Linux VFIO ioctl interface 
`_.
+
+VFIO is a mature and stable API, backed by an extensively used framework. The
+existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be largely
+re-used, though there is nothing in this specification that requires that
+particular implementation. None of the VFIO kernel modules are required for
+supporting the protocol, on either the client or server side. Some source
+definitions in VFIO are re-used for vfio-user.
+
+The main idea is to allow a virtual device to function in a separate process in
+the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``) is
+chosen because file descriptors can be trivially sent over it, which in turn
+allows:
+
+* Sharing of client memory for DMA with the server.
+* Sharing of server memory with the client for fast MMIO.
+* Efficient sharing of eventfd's for triggering interrupts.
+
+Other socket types could be used which allow the server to run in a separate
+guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretically
+the underlying transport does not necessarily have to be a socket, however we do
+not examine such alternatives. In this protocol version we focus on using a UNIX
+domain socket and introduce basic support for the other two types of sockets
+without considering performance implications.
+
+While passing of file descriptors is desirable for performance reasons, support
+is not necessary for either the client or the server in order to implement the
+protocol. There is always an in-band, message-passing fall back mechanism.
+
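
For readers unfamiliar with file descriptor passing over AF_UNIX, which the
introduction above relies on, here is a minimal stand-alone sketch using the
standard POSIX sendmsg()/recvmsg() calls with SCM_RIGHTS ancillary data. The
example_send_fd() and example_recv_fd() helpers are hypothetical and send one
dummy payload byte with a single descriptor; they do not reflect the
vfio-user message framing defined later in the specification.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one payload byte plus one file descriptor as ancillary data. */
static int example_send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;       /* forces correct alignment */
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one payload byte and return the descriptor sent with it, or -1. */
static int example_recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    assert(sock >= 0);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file as the one sent, which
is what makes shared memory regions and eventfds usable across the two
processes.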
+Overview
+========
+
+VFIO is a framework that allows a physical device to be securely passed through
+to a user space process; the device-specific kernel driver does not drive the
+device at all.  Typically, the user space process is a VMM and the device is
+passed through to it in order to achieve high performance. VFIO provides an API
+and the required functionality in the kernel. QEMU has adopted VFIO to allow a
+guest to directly access physical devices, instead of emulating them in
+software.
+
+vfio-user reuses the core VFIO concepts defined in its API, but implements them
+as messages to be sent over a socket. It does not change the kernel-based VFIO
+in any way, in fact none of the VFIO kernel modules need to be loaded to use
+vfio-user. It is also possible for the client to concurrently use the current
+kernel-based VFIO for one device, and vfio-user for another device.
+
+VFIO Device Model
+-----------------
+
+A device under VFIO presents a standard interface to the user process. Many of
+the VFIO operations in the existing interface use the ``ioctl()`` system call, and
+references to the existing interface are called the ``ioctl()`` implementation in
+this document.
+
+The following sections describe the set of messages that implement the vfio-user
+interface over a socket. In many cases, the messages are analogous to data
+structures used in the ``ioctl()`` implementation. Messages derived from the
+``ioctl()`` will have a name derived from the ``ioctl()`` command name.  E.g., the
+``VFIO_DEVICE_GET_INFO`` ``ioctl()`` command 

[RFC v4 13/21] vfio-user: pci_user_realize PCI setup

2022-01-11 Thread John Johnson
PCI BARs read from remote device
PCI config reads/writes sent to remote server

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/pci.c | 275 --
 1 file changed, 172 insertions(+), 103 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a4fd5e2..5c519ee 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2830,6 +2830,132 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
 vdev->req_enabled = false;
 }
 
+static void vfio_pci_config_setup(VFIOPCIDevice *vdev, Error **errp)
+{
+PCIDevice *pdev = &vdev->pdev;
+Error *err = NULL;
+
+/* vfio emulates a lot for us, but some bits need extra love */
+vdev->emulated_config_bits = g_malloc0(vdev->config_size);
+
+/* QEMU can choose to expose the ROM or not */
+memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4);
+/* QEMU can also add or extend BARs */
+memset(vdev->emulated_config_bits + PCI_BASE_ADDRESS_0, 0xff, 6 * 4);
+
+/*
+ * The PCI spec reserves vendor ID 0x as an invalid value.  The
+ * device ID is managed by the vendor and need only be a 16-bit value.
+ * Allow any 16-bit value for subsystem so they can be hidden or changed.
+ */
+if (vdev->vendor_id != PCI_ANY_ID) {
+if (vdev->vendor_id >= 0xffff) {
+error_setg(errp, "invalid PCI vendor ID provided");
+return;
+}
+vfio_add_emulated_word(vdev, PCI_VENDOR_ID, vdev->vendor_id, ~0);
+trace_vfio_pci_emulated_vendor_id(vdev->vbasedev.name, vdev->vendor_id);
+} else {
+vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID);
+}
+
+if (vdev->device_id != PCI_ANY_ID) {
+if (vdev->device_id > 0xffff) {
+error_setg(errp, "invalid PCI device ID provided");
+return;
+}
+vfio_add_emulated_word(vdev, PCI_DEVICE_ID, vdev->device_id, ~0);
+trace_vfio_pci_emulated_device_id(vdev->vbasedev.name, vdev->device_id);
+} else {
+vdev->device_id = pci_get_word(pdev->config + PCI_DEVICE_ID);
+}
+
+if (vdev->sub_vendor_id != PCI_ANY_ID) {
+if (vdev->sub_vendor_id > 0xffff) {
+error_setg(errp, "invalid PCI subsystem vendor ID provided");
+return;
+}
+vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_VENDOR_ID,
+   vdev->sub_vendor_id, ~0);
+trace_vfio_pci_emulated_sub_vendor_id(vdev->vbasedev.name,
+  vdev->sub_vendor_id);
+}
+
+if (vdev->sub_device_id != PCI_ANY_ID) {
+if (vdev->sub_device_id > 0xffff) {
+error_setg(errp, "invalid PCI subsystem device ID provided");
+return;
+}
+vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_ID, vdev->sub_device_id, ~0);
+trace_vfio_pci_emulated_sub_device_id(vdev->vbasedev.name,
+  vdev->sub_device_id);
+}
+
+/* QEMU can change multi-function devices to single function, or reverse */
+vdev->emulated_config_bits[PCI_HEADER_TYPE] =
+  PCI_HEADER_TYPE_MULTI_FUNCTION;
+
+/* Restore or clear multifunction, this is always controlled by QEMU */
+if (vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
+vdev->pdev.config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION;
+} else {
+vdev->pdev.config[PCI_HEADER_TYPE] &= ~PCI_HEADER_TYPE_MULTI_FUNCTION;
+}
+
+/*
+ * Clear host resource mapping info.  If we choose not to register a
+ * BAR, such as might be the case with the option ROM, we can get
+ * confusing, unwritable, residual addresses from the host here.
+ */
+memset(&vdev->pdev.config[PCI_BASE_ADDRESS_0], 0, 24);
+memset(&vdev->pdev.config[PCI_ROM_ADDRESS], 0, 4);
+
+vfio_pci_size_rom(vdev);
+
+vfio_bars_prepare(vdev);
+
+vfio_msix_early_setup(vdev, &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+
+vfio_bars_register(vdev);
+}
+
+static int vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp)
+{
+PCIDevice *pdev = &vdev->pdev;
+int ret;
+
+/* QEMU emulates all of MSI & MSIX */
+if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
+memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
+   MSIX_CAP_LENGTH);
+}
+
+if (pdev->cap_present & QEMU_PCI_CAP_MSI) {
+memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff,
+   vdev->msi_cap_size);
+}
+
+if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
+vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+  vfio_intx_mmap_enable, vdev);
+pci_device_set_intx_routing_notifier(&vdev->pdev,
+ vfio_intx_routing_notifier);
+

[RFC v4 00/21] vfio-user client

2022-01-11 Thread John Johnson
Hello,

This is the 4th revision of the vfio-user client implementation.

First of all, thank you for your time reviewing the previous versions.

The vfio-user framework consists of 3 parts:
 1) The VFIO user protocol specification.
 2) A client - the VFIO device in QEMU that encapsulates VFIO messages
and sends them to the server.
 3) A server - a remote process that emulates a device.

This patchset implements parts 1 and 2.

The libvfio-user project (https://github.com/nutanix/libvfio-user)
can be used by a remote process to handle the protocol to implement the third part.
We have also sent a patch series that implements a server using QEMU.


Contributors:

John G Johnson 
John Levon 
Thanos Makatos 
Elena Ufimtseva 
Jagannathan Raman 


Changes from v3->v4:

 vfio-user: introduce vfio-user protocol specification
   No v4 specific changes

 vfio-user: add VFIO base abstract class
   Put all properties except those specific to the ioctl() implementation in the base class

 vfio-user: add container IO ops vector
   Move will_commit support to dma map/unmap patch below
   Use ternary return expression in IO ops vectors

 vfio-user: add region cache
   New patch with only region cache support
   Make vfio_get_region_info return region reference instead of a copy

 vfio-user: add device IO ops vector
   Move posted write support to region read/write patch below
   Move FD receiving code to get region info patch below
   Add VDEV_CONFIG_READ/WRITE macros to pci.c for convenient access to PCI config space
   Use ternary return expression in IO ops vectors

 vfio-user: Define type vfio_user_pci_dev_info
   Move secure DMA support to separate patch below
   Remove dummy function for vfio_hot_reset_multi ops vector
   Add vfio_user_instance_finalize code from connect proxy patch below

 vfio-user: connect vfio proxy to remote server
   Move vfio_user_instance_finalize code to define type patch above

 vfio-user: define socket receive functions
   Handle kernel splitting message from server into multiple read()s
   Fix incoming message queue handling in vfio_user_request()
   Move secure DMA support to separate patch below
   Move MAX_FDS and MAX_XFER defines to socket send patch below

 vfio-user: define socket send functions
   Free pending messages when the reply times out
   Add MAX_FDS and MAX_XFER defines from socket recv patch above
   Don't set error twice on a capabilities parsing error

 vfio-user: get device info
   Add vfio_get_all_regions() call
   Validate device info return values from server

 vfio-user: get region info
   Add FD receiving code from device IO ops patch above
   Add a generic FD to VFIORegion for mapping device regions
   Validate region info return values from server

 vfio-user: region read/write
   Add posted write support from device IO ops patch above
   Check region read/write count against max_xfer

 vfio-user: pci_user_realize PCI setup
   Refactor realize functions to use common setup functions

 vfio-user: get and set IRQs
   Validate irq return values from server

 vfio-user: proxy container connect/disconnect
   No v4 specific changes

 vfio-user: dma map/unmap operations
   Add will_commit support from container IO ops patch above
   Rename will_commit to async_ops to describe its operation better
   Pass memory region to dma_map op so only vfio-user needs to look up FD
   Free pending messages when the reply times out
   Move secure DMA support to separate patch below
   Set argsz in dma_unmap message according to spec

 vfio-user: secure DMA support
   New patch consolidating all secure DMA support

 vfio-user: dma read/write operations
 vfio-user: pci reset
   No v4 specific changes

 vfio-user: migration support
   Move qemu file errors fix to its own patch below
   Set argsz in get_dirty_bitmap message according to spec

 Only set qemu file error if saving state if the file exists
   New patch with just this fix found during vfio-user development

Removed from v4:

 Add validation ops vector
   Generic checking moved to the corresponding vfio-user function


Changes from v2->v3:

John Johnson (18):
  vfio-user: add VFIO base abstract class
Moved common vfio pci cli options to base class

  Add container IO ops vector
Added ops vectors to decide to use ioctl() or socket implementation

  Add device IO ops vector
Added ops vectors to decide to use ioctl() or socket implementation

  Add validation ops vector
Added validation vector to check user replies

  vfio-user: Define type vfio_user_pci_dev_info
Added separate VFIO_USER_PCI config element to control whether vfio-user is compiled
Fix scalar spelling

  vfio-user: connect vfio proxy to remote server
Made socket IO non-blocking
Use g_strdup_printf to save socket name

  vfio-user: define socket receive functions
Made socket IO non-blocking
Process inbound commands in main loop thread to avoid BQL interactions with recv
Added comment describing inbound command 

[RFC v4 11/21] vfio-user: get region info

2022-01-11 Thread John Johnson
Add per-region FD to support mmap() of remote device regions
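
As a rough illustration of the FD selection this patch adds to vfio_region_setup(), the sketch below picks the descriptor that a region's mmap() would use. The struct and function names are hypothetical stand-ins, not the QEMU types; the rule itself (use the per-region FD array when it exists, otherwise the device FD) is taken from the diff.

```c
#include <stddef.h>

/* Simplified stand-in for the fields of VFIODevice used here. */
struct sketch_dev {
    int fd;        /* device fd used by the ioctl() implementation */
    int *regfds;   /* per-region fds; NULL when there is no per-region array */
};

/* Return the fd that mmap() of region 'index' should use. */
static int sketch_region_fd(const struct sketch_dev *dev, unsigned int index)
{
    return dev->regfds != NULL ? dev->regfds[index] : dev->fd;
}
```

A value of -1 in the per-region array would simply mean that no FD is available for that region, so it could not be mapped directly.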

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/user-protocol.h   | 14 ++
 include/hw/vfio/vfio-common.h |  8 +++---
 hw/vfio/common.c  | 32 ---
 hw/vfio/user.c| 59 +++
 4 files changed, 107 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 4ad8f45..caa523a 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -106,4 +106,18 @@ typedef struct {
 uint32_t cap_offset;
 } VFIOUserDeviceInfo;
 
+/*
+ * VFIO_USER_DEVICE_GET_REGION_INFO
+ * imported from struct_vfio_region_info
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint32_t argsz;
+uint32_t flags;
+uint32_t index;
+uint32_t cap_offset;
+uint64_t size;
+uint64_t offset;
+} VFIOUserRegionInfo;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 3eb0b19..2552557 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -56,6 +56,7 @@ typedef struct VFIORegion {
 uint32_t nr_mmaps;
 VFIOMmap *mmaps;
 uint8_t nr; /* cache the region number for debug */
+int fd; /* fd to mmap() region */
 } VFIORegion;
 
 typedef struct VFIOMigration {
@@ -150,6 +151,7 @@ typedef struct VFIODevice {
 OnOffAuto pre_copy_dirty_page_tracking;
 VFIOProxy *proxy;
 struct vfio_region_info **regions;
+int *regfds;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -172,7 +174,7 @@ struct VFIODeviceOps {
 struct VFIODevIO {
 int (*get_info)(VFIODevice *vdev, struct vfio_device_info *info);
 int (*get_region_info)(VFIODevice *vdev,
-   struct vfio_region_info *info);
+   struct vfio_region_info *info, int *fd);
 int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq);
 int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs);
 int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
@@ -183,8 +185,8 @@ struct VFIODevIO {
 
 #define VDEV_GET_INFO(vdev, info) \
 ((vdev)->io_ops->get_info((vdev), (info)))
-#define VDEV_GET_REGION_INFO(vdev, info) \
-((vdev)->io_ops->get_region_info((vdev), (info)))
+#define VDEV_GET_REGION_INFO(vdev, info, fd) \
+((vdev)->io_ops->get_region_info((vdev), (info), (fd)))
 #define VDEV_GET_IRQ_INFO(vdev, irq) \
 ((vdev)->io_ops->get_irq_info((vdev), (irq)))
 #define VDEV_SET_IRQS(vdev, irqs) \
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f07023c..a50bf4b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "migration/migration.h"
+#include "hw/vfio/user.h"
 
 VFIOGroupList vfio_group_list =
 QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -1554,6 +1555,11 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
 region->size = info->size;
 region->fd_offset = info->offset;
 region->nr = index;
+if (vbasedev->regfds != NULL) {
+region->fd = vbasedev->regfds[index];
+} else {
+region->fd = vbasedev->fd;
+}
 
 if (region->size) {
 region->mem = g_new0(MemoryRegion, 1);
@@ -1605,7 +1611,7 @@ int vfio_region_mmap(VFIORegion *region)
 
 for (i = 0; i < region->nr_mmaps; i++) {
 region->mmaps[i].mmap = mmap(NULL, region->mmaps[i].size, prot,
- MAP_SHARED, region->vbasedev->fd,
+ MAP_SHARED, region->fd,
  region->fd_offset +
  region->mmaps[i].offset);
 if (region->mmaps[i].mmap == MAP_FAILED) {
@@ -2410,10 +2416,17 @@ void vfio_put_base_device(VFIODevice *vbasedev)
 int i;
 
 for (i = 0; i < vbasedev->num_regions; i++) {
+if (vbasedev->regfds != NULL && vbasedev->regfds[i] != -1) {
+close(vbasedev->regfds[i]);
+}
 g_free(vbasedev->regions[i]);
 }
 g_free(vbasedev->regions);
 vbasedev->regions = NULL;
+if (vbasedev->regfds != NULL) {
+g_free(vbasedev->regfds);
+vbasedev->regfds = NULL;
+}
 }
 
 if (!vbasedev->group) {
@@ -2429,12 +2442,16 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
  struct vfio_region_info **info)
 {
 size_t argsz = sizeof(struct vfio_region_info);
+int fd = -1;
 int ret;
 
 /* create region cache */
 if (vbasedev->regions == NULL) {
 vbasedev->regions = g_new0(struct vfio_region_info *,
vbasedev->num_regions);
+if (vbasedev->proxy != NULL) {
+vbasedev->regfds = g_new0(int, vbasedev->num_regions);
+}
 }
 /* check 

[RFC v4 04/21] vfio-user: add region cache

2022-01-11 Thread John Johnson
cache VFIO_DEVICE_GET_REGION_INFO results to reduce
memory alloc/free cycles and as prep work for vfio-user
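
The caching scheme can be sketched outside QEMU as follows. This is an illustrative model only: the sketch_ types, the fake backend query, and the bounds check are assumptions added for the example, while the lazy creation of the cache, the lookup before querying, and the cleanup loop mirror the diff below.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct vfio_region_info. */
struct sketch_region_info {
    uint32_t index;
    uint64_t size;
};

struct sketch_device {
    unsigned int num_regions;
    struct sketch_region_info **regions;  /* lazily created cache */
    unsigned int queries;                 /* counts backend round trips */
};

/* Hypothetical backend query; a real one would ask the kernel or a server. */
static int sketch_query(struct sketch_device *dev,
                        struct sketch_region_info *info)
{
    dev->queries++;
    info->size = 4096u * (info->index + 1);
    return 0;
}

/* Return cached info if present, otherwise query once and remember it. */
static int sketch_get_region_info(struct sketch_device *dev,
                                  unsigned int index,
                                  struct sketch_region_info **info)
{
    struct sketch_region_info *fresh;

    if (index >= dev->num_regions) {
        return -1;                        /* bounds check added for the sketch */
    }
    if (dev->regions == NULL) {           /* create region cache */
        dev->regions = calloc(dev->num_regions, sizeof(*dev->regions));
        if (dev->regions == NULL) {
            return -1;
        }
    }
    if (dev->regions[index] != NULL) {    /* check cache */
        *info = dev->regions[index];
        return 0;
    }
    fresh = calloc(1, sizeof(*fresh));
    if (fresh == NULL) {
        return -1;
    }
    fresh->index = index;
    if (sketch_query(dev, fresh) != 0) {
        free(fresh);
        return -1;
    }
    dev->regions[index] = fresh;
    *info = fresh;
    return 0;
}

/* Callers no longer free 'info'; the cache is released in one place. */
static void sketch_put_device(struct sketch_device *dev)
{
    unsigned int i;

    if (dev->regions == NULL) {
        return;
    }
    for (i = 0; i < dev->num_regions; i++) {
        free(dev->regions[i]);
    }
    free(dev->regions);
    dev->regions = NULL;
}
```

Because the cache owns the returned pointer, callers must not free it, which is why the patch drops the g_free(info) calls in the users.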

Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/ccw.c |  5 -
 hw/vfio/common.c  | 41 +++--
 hw/vfio/pci-quirks.c  | 19 ++-
 hw/vfio/pci.c |  6 --
 5 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 2761a62..1a032f4 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -145,6 +145,7 @@ typedef struct VFIODevice {
 VFIOMigration *migration;
 Error *migration_blocker;
 OnOffAuto pre_copy_dirty_page_tracking;
+struct vfio_region_info **regions;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -258,6 +259,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
  struct vfio_region_info **info);
 int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
  uint32_t subtype, struct vfio_region_info **info);
+void vfio_get_all_regions(VFIODevice *vbasedev);
 bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type);
 struct vfio_info_cap_header *
 vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id);
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 0354737..06b588c 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -517,7 +517,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
 
 vcdev->io_region_offset = info->offset;
 vcdev->io_region = g_malloc0(info->size);
-g_free(info);
 
 /* check for the optional async command region */
 ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
@@ -530,7 +529,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
 }
 vcdev->async_cmd_region_offset = info->offset;
 vcdev->async_cmd_region = g_malloc0(info->size);
-g_free(info);
 }
 
 ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
@@ -543,7 +541,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
 }
 vcdev->schib_region_offset = info->offset;
 vcdev->schib_region = g_malloc(info->size);
-g_free(info);
 }
 
 ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
@@ -557,7 +554,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
 }
 vcdev->crw_region_offset = info->offset;
 vcdev->crw_region = g_malloc(info->size);
-g_free(info);
 }
 
 return;
@@ -567,7 +563,6 @@ out_err:
 g_free(vcdev->schib_region);
 g_free(vcdev->async_cmd_region);
 g_free(vcdev->io_region);
-g_free(info);
 return;
 }
 
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index dbf23c0..30d2c6e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1568,8 +1568,6 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
 }
 }
 
-g_free(info);
-
 trace_vfio_region_setup(vbasedev->name, index, name,
 region->flags, region->fd_offset, region->size);
 return 0;
@@ -2325,6 +2323,16 @@ void vfio_put_group(VFIOGroup *group)
 }
 }
 
+void vfio_get_all_regions(VFIODevice *vbasedev)
+{
+struct vfio_region_info *info;
+int i;
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+vfio_get_region_info(vbasedev, i, &info);
+}
+}
+
 int vfio_get_device(VFIOGroup *group, const char *name,
 VFIODevice *vbasedev, Error **errp)
 {
@@ -2380,12 +2388,23 @@ int vfio_get_device(VFIOGroup *group, const char *name,
 trace_vfio_get_device(name, dev_info.flags, dev_info.num_regions,
   dev_info.num_irqs);
 
+vfio_get_all_regions(vbasedev);
 vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
 return 0;
 }
 
 void vfio_put_base_device(VFIODevice *vbasedev)
 {
+if (vbasedev->regions != NULL) {
+int i;
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+g_free(vbasedev->regions[i]);
+}
+g_free(vbasedev->regions);
+vbasedev->regions = NULL;
+}
+
 if (!vbasedev->group) {
 return;
 }
@@ -2400,6 +2419,17 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
 {
 size_t argsz = sizeof(struct vfio_region_info);
 
+/* create region cache */
+if (vbasedev->regions == NULL) {
+vbasedev->regions = g_new0(struct vfio_region_info *,
+   vbasedev->num_regions);
+}
+/* check cache */
+if (vbasedev->regions[index] != NULL) {
+*info = vbasedev->regions[index];
+return 0;
+}
+
 *info = g_malloc0(argsz);
 
 (*info)->index = index;
@@ -2419,6 +2449,9 @@ retry:
 goto 

[RFC v4 05/21] vfio-user: add device IO ops vector

2022-01-11 Thread John Johnson
Used for communication with VFIO driver
(prep work for vfio-user, which will communicate over a socket)
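
The ops-vector idea can be shown with a small standalone model: callers go through a macro that dispatches on the device's io_ops pointer, so the same call site works for either backend. Everything named sketch_ or SKETCH_ is hypothetical, and the two backends only fill a buffer instead of doing real I/O.

```c
#include <stdint.h>
#include <string.h>

struct sketch_dev;

/* Per-device IO ops vector, modelled loosely on VFIODevIO. */
struct sketch_dev_io {
    int (*region_read)(struct sketch_dev *dev, uint8_t nr,
                       uint32_t size, void *data);
};

struct sketch_dev {
    const struct sketch_dev_io *io_ops;
    const char *last_backend;   /* records which backend ran, for illustration */
};

/* Call sites use the macro and never name a backend directly. */
#define SKETCH_REGION_READ(dev, nr, size, data) \
    ((dev)->io_ops->region_read((dev), (nr), (size), (data)))

/* Stand-in for the ioctl()/pread() implementation. */
static int sketch_ioctl_read(struct sketch_dev *dev, uint8_t nr,
                             uint32_t size, void *data)
{
    (void)nr;
    dev->last_backend = "ioctl";
    memset(data, 0x00, size);
    return (int)size;
}

/* Stand-in for a socket implementation that would message a server. */
static int sketch_sock_read(struct sketch_dev *dev, uint8_t nr,
                            uint32_t size, void *data)
{
    (void)nr;
    dev->last_backend = "socket";
    memset(data, 0xff, size);
    return (int)size;
}

static const struct sketch_dev_io sketch_io_ioctl = {
    .region_read = sketch_ioctl_read,
};

static const struct sketch_dev_io sketch_io_sock = {
    .region_read = sketch_sock_read,
};
```

Switching a device between backends is then a matter of assigning a different vector to io_ops at setup time.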

Signed-off-by: John G Johnson 
---
 include/hw/vfio/vfio-common.h |  27 
 hw/vfio/common.c  | 107 +++-
 hw/vfio/pci.c | 140 ++
 3 files changed, 206 insertions(+), 68 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1a032f4..826cd98 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -124,6 +124,7 @@ typedef struct VFIOHostDMAWindow {
 } VFIOHostDMAWindow;
 
 typedef struct VFIODeviceOps VFIODeviceOps;
+typedef struct VFIODevIO VFIODevIO;
 
 typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
@@ -139,6 +140,7 @@ typedef struct VFIODevice {
 bool ram_block_discard_allowed;
 bool enable_migration;
 VFIODeviceOps *ops;
+VFIODevIO *io_ops;
 unsigned int num_irqs;
 unsigned int num_regions;
 unsigned int flags;
@@ -165,6 +167,30 @@ struct VFIODeviceOps {
  * through ioctl() to the kernel VFIO driver, but vfio-user
  * can use a socket to a remote process.
  */
+struct VFIODevIO {
+int (*get_info)(VFIODevice *vdev, struct vfio_device_info *info);
+int (*get_region_info)(VFIODevice *vdev,
+   struct vfio_region_info *info);
+int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq);
+int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs);
+int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+   void *data);
+int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+void *data);
+};
+
+#define VDEV_GET_INFO(vdev, info) \
+((vdev)->io_ops->get_info((vdev), (info)))
+#define VDEV_GET_REGION_INFO(vdev, info) \
+((vdev)->io_ops->get_region_info((vdev), (info)))
+#define VDEV_GET_IRQ_INFO(vdev, irq) \
+((vdev)->io_ops->get_irq_info((vdev), (irq)))
+#define VDEV_SET_IRQS(vdev, irqs) \
+((vdev)->io_ops->set_irqs((vdev), (irqs)))
+#define VDEV_REGION_READ(vdev, nr, off, size, data) \
+((vdev)->io_ops->region_read((vdev), (nr), (off), (size), (data)))
+#define VDEV_REGION_WRITE(vdev, nr, off, size, data) \
+((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data)))
 
 struct VFIOContIO {
 int (*dma_map)(VFIOContainer *container,
@@ -184,6 +210,7 @@ struct VFIOContIO {
 #define CONT_DIRTY_BITMAP(cont, bitmap, range) \
 ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
 
+extern VFIODevIO vfio_dev_io_ioctl;
 extern VFIOContIO vfio_cont_io_ioctl;
 
 #endif /* CONFIG_LINUX */
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 30d2c6e..cce38d8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -70,7 +70,7 @@ void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
 .count = 0,
 };
 
-ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
@@ -83,7 +83,7 @@ void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
 .count = 1,
 };
 
-ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
@@ -96,7 +96,7 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
 .count = 1,
 };
 
-ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 static inline const char *action_to_str(int action)
@@ -177,9 +177,7 @@ int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
 pfd = (int32_t *)&irq_set->data;
 *pfd = fd;
 
-if (ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
-ret = -errno;
-}
+ret = VDEV_SET_IRQS(vbasedev, irq_set);
 g_free(irq_set);
 
 if (!ret) {
@@ -214,6 +212,7 @@ void vfio_region_write(void *opaque, hwaddr addr,
 uint32_t dword;
 uint64_t qword;
 } buf;
+int ret;
 
 switch (size) {
 case 1:
@@ -233,13 +232,15 @@ void vfio_region_write(void *opaque, hwaddr addr,
 break;
 }
 
-if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf);
+if (ret != size) {
+const char *err = ret < 0 ? strerror(-ret) : "short write";
+
 error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
- ",%d) failed: %m",
+ ",%d) failed: %s",
  __func__, vbasedev->name, region->nr,
- addr, data, size);
+ addr, data, size, err);
 }
-
 trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
 
 /*
@@ -265,13 +266,18 @@ uint64_t vfio_region_read(void *opaque,
 uint64_t qword;
 } buf;
 uint64_t data 

[RFC v4 06/21] vfio-user: Define type vfio_user_pci_dev_info

2022-01-11 Thread John Johnson
New class for vfio-user with its class and instance
constructors and destructors, and its pci ops.

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/pci.h|  8 +
 hw/vfio/common.c |  5 
 hw/vfio/pci.c| 90 
 hw/vfio/Kconfig  | 10 +++
 4 files changed, 113 insertions(+)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index bbc78aa..59e636c 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -187,6 +187,14 @@ struct VFIOKernPCIDevice {
 VFIOPCIDevice device;
 };
 
+#define TYPE_VFIO_USER_PCI "vfio-user-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI)
+
+struct VFIOUserPCIDevice {
+VFIOPCIDevice device;
+char *sock_name;
+};
+
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
 static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device)
 {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index cce38d8..f07023c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1742,6 +1742,11 @@ void vfio_reset_handler(void *opaque)
 QLIST_FOREACH(group, &vfio_group_list, next) {
 QLIST_FOREACH(vbasedev, &group->device_list, next) {
 if (vbasedev->dev->realized && vbasedev->needs_reset) {
+if (vbasedev->ops->vfio_hot_reset_multi == NULL) {
+error_printf("%s: No hot reset handler specified\n",
+ vbasedev->name);
+continue;
+}
 vbasedev->ops->vfio_hot_reset_multi(vbasedev);
 }
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 63a42ae..6abe474 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include <linux/vfio.h>
 #include <sys/ioctl.h>
 
@@ -3376,3 +3377,92 @@ static void register_vfio_pci_dev_type(void)
 }
 
 type_init(register_vfio_pci_dev_type)
+
+
+#ifdef CONFIG_VFIO_USER_PCI
+
+/*
+ * vfio-user routines.
+ */
+
+/*
+ * Emulated devices don't use host hot reset
+ */
+static void vfio_user_compute_needs_reset(VFIODevice *vbasedev)
+{
+vbasedev->needs_reset = false;
+}
+
+static VFIODeviceOps vfio_user_pci_ops = {
+.vfio_compute_needs_reset = vfio_user_compute_needs_reset,
+.vfio_eoi = vfio_intx_eoi,
+.vfio_get_object = vfio_pci_get_object,
+.vfio_save_config = vfio_pci_save_config,
+.vfio_load_config = vfio_pci_load_config,
+};
+
+static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
+{
+ERRP_GUARD();
+VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+
+/*
+ * TODO: make option parser understand SocketAddress
+ * and use that instead of having scalar options
+ * for each socket type.
+ */
+if (!udev->sock_name) {
+error_setg(errp, "No socket specified");
+error_append_hint(errp, "Use -device vfio-user-pci,socket=\n");
+return;
+}
+
+vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
+vbasedev->dev = DEVICE(vdev);
+vbasedev->fd = -1;
+vbasedev->type = VFIO_DEVICE_TYPE_PCI;
+vbasedev->ops = &vfio_user_pci_ops;
+
+}
+
+static void vfio_user_instance_finalize(Object *obj)
+{
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
+
+vfio_put_device(vdev);
+}
+
+static Property vfio_user_pci_dev_properties[] = {
+DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
+
+device_class_set_props(dc, vfio_user_pci_dev_properties);
+dc->desc = "VFIO over socket PCI device assignment";
+pdc->realize = vfio_user_pci_realize;
+}
+
+static const TypeInfo vfio_user_pci_dev_info = {
+.name = TYPE_VFIO_USER_PCI,
+.parent = TYPE_VFIO_PCI_BASE,
+.instance_size = sizeof(VFIOUserPCIDevice),
+.class_init = vfio_user_pci_dev_class_init,
+.instance_init = vfio_instance_init,
+.instance_finalize = vfio_user_instance_finalize,
+};
+
+static void register_vfio_user_dev_type(void)
+{
+type_register_static(&vfio_user_pci_dev_info);
+}
+
+type_init(register_vfio_user_dev_type)
+
+#endif /* CONFIG_VFIO_USER_PCI */
diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig
index 7cdba05..301894e 100644
--- a/hw/vfio/Kconfig
+++ b/hw/vfio/Kconfig
@@ -2,6 +2,10 @@ config VFIO
 bool
 depends on LINUX
 
+config VFIO_USER
+bool
+depends on VFIO
+
 config VFIO_PCI
 bool
 default y
@@ -9,6 +13,12 @@ config VFIO_PCI
 select EDID
 depends on LINUX && PCI
 
+config VFIO_USER_PCI
+bool
+default y
+select VFIO_USER
+depends on VFIO_PCI
+
 config VFIO_CCW
 bool
 default y
-- 
1.8.3.1




[RFC v4 12/21] vfio-user: region read/write

2022-01-11 Thread John Johnson
Add support for posted writes on remote devices
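
The decision of whether a given write may be posted (sent without waiting for a reply) can be sketched as below. The mmap rule comes from the diff: a region the guest can access directly must not use posted writes because of the read-after-write hazard. How the no_post option is combined with the per-region flag is an assumption made for this example, and the sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the VFIORegion fields involved. */
struct sketch_region {
    uint32_t nr_mmaps;  /* sub-ranges mapped directly into the guest */
    bool post_wr;       /* region allows posted writes */
};

/* Decide whether a write to 'r' may be sent without waiting for a reply. */
static bool sketch_can_post(const struct sketch_region *r, bool no_post)
{
    if (no_post) {
        return false;   /* assumed global override: all writes are sync */
    }
    if (r->nr_mmaps) {
        return false;   /* read-after-write hazard with direct guest access */
    }
    return r->post_wr;
}
```

Note that in the patch the per-region flag defaults to false in vfio_region_setup(), so posting has to be enabled explicitly for a region.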

Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/pci.h |   1 +
 hw/vfio/user-protocol.h   |  12 +
 hw/vfio/user.h|   1 +
 include/hw/vfio/vfio-common.h |   7 +--
 hw/vfio/common.c  |  10 +++-
 hw/vfio/pci.c |   9 +++-
 hw/vfio/user.c| 109 ++
 7 files changed, 143 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index ec9f345..643ff75 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -194,6 +194,7 @@ struct VFIOUserPCIDevice {
 VFIOPCIDevice device;
 char *sock_name;
 bool send_queued;   /* all sends are queued */
+bool no_post;   /* all region writes are sync */
 };
 
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index caa523a..b1ea55f 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -120,4 +120,16 @@ typedef struct {
 uint64_t offset;
 } VFIOUserRegionInfo;
 
+/*
+ * VFIO_USER_REGION_READ
+ * VFIO_USER_REGION_WRITE
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint64_t offset;
+uint32_t region;
+uint32_t count;
+char data[];
+} VFIOUserRegionRW;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 19edd84..f2098f2 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -75,6 +75,7 @@ typedef struct VFIOProxy {
 /* VFIOProxy flags */
 #define VFIO_PROXY_CLIENT0x1
 #define VFIO_PROXY_FORCE_QUEUED  0x4
+#define VFIO_PROXY_NO_POST   0x8
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 2552557..4118b8a 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -57,6 +57,7 @@ typedef struct VFIORegion {
 VFIOMmap *mmaps;
 uint8_t nr; /* cache the region number for debug */
 int fd; /* fd to mmap() region */
+bool post_wr; /* writes can be posted */
 } VFIORegion;
 
 typedef struct VFIOMigration {
@@ -180,7 +181,7 @@ struct VFIODevIO {
 int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
void *data);
 int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
-void *data);
+void *data, bool post);
 };
 
 #define VDEV_GET_INFO(vdev, info) \
@@ -193,8 +194,8 @@ struct VFIODevIO {
 ((vdev)->io_ops->set_irqs((vdev), (irqs)))
 #define VDEV_REGION_READ(vdev, nr, off, size, data) \
 ((vdev)->io_ops->region_read((vdev), (nr), (off), (size), (data)))
-#define VDEV_REGION_WRITE(vdev, nr, off, size, data) \
-((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data)))
+#define VDEV_REGION_WRITE(vdev, nr, off, size, data, post) \
+((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data), (post)))
 
 struct VFIOContIO {
 int (*dma_map)(VFIOContainer *container,
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a50bf4b..83cc5ec 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -213,6 +213,7 @@ void vfio_region_write(void *opaque, hwaddr addr,
 uint32_t dword;
 uint64_t qword;
 } buf;
+bool post = region->post_wr;
 int ret;
 
 switch (size) {
@@ -233,7 +234,11 @@ void vfio_region_write(void *opaque, hwaddr addr,
 break;
 }
 
-ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf);
+/* read-after-write hazard if guest can directly access region */
+if (region->nr_mmaps) {
+post = false;
+}
+ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf, post);
 if (ret != size) {
 const char *err = ret < 0 ? strerror(-ret) : "short write";
 
@@ -1555,6 +1560,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
 region->size = info->size;
 region->fd_offset = info->offset;
 region->nr = index;
+region->post_wr = false;
 if (vbasedev->regfds != NULL) {
 region->fd = vbasedev->regfds[index];
 } else {
@@ -2689,7 +2695,7 @@ static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off,
 }
 
 static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off,
-uint32_t size, void *data)
+uint32_t size, void *data, bool post)
 {
 struct vfio_region_info *info = vbasedev->regions[index];
 int ret;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 6f85853..a4fd5e2 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -51,7 +51,7 @@
  (size), (data))
 #define VDEV_CONFIG_WRITE(vbasedev, off, size, data) \
 VDEV_REGION_WRITE((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, (off), 

[RFC v4 09/21] vfio-user: define socket send functions

2022-01-11 Thread John Johnson
Also negotiate protocol version with remote server
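
One plausible way to apply the default and maximum values defined in this patch during negotiation is sketched below. This is an assumption about the policy, not the patch's actual parsing code: a missing capability (modelled as 0) falls back to the default, and an offered value above the ceiling is clamped. The numeric constants are the ones from user-protocol.h; the sketch_ names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the VFIO_USER_* defines in this patch. */
#define SKETCH_DEF_MAX_FDS   8
#define SKETCH_MAX_MAX_FDS   16
#define SKETCH_DEF_MAX_XFER  (1024 * 1024)
#define SKETCH_MAX_MAX_XFER  (64 * 1024 * 1024)

struct sketch_caps {
    uint64_t max_fds;
    uint64_t max_xfer;
};

/* Assumption: 0 means the server's reply did not include the capability. */
static uint64_t sketch_pick(uint64_t offered, uint64_t def, uint64_t max)
{
    if (offered == 0) {
        return def;
    }
    return offered > max ? max : offered;
}

/* Combine the server's offer with the client's own limits. */
static struct sketch_caps sketch_negotiate(uint64_t srv_fds, uint64_t srv_xfer)
{
    struct sketch_caps caps;

    caps.max_fds = sketch_pick(srv_fds, SKETCH_DEF_MAX_FDS,
                               SKETCH_MAX_MAX_FDS);
    caps.max_xfer = sketch_pick(srv_xfer, SKETCH_DEF_MAX_XFER,
                                SKETCH_MAX_MAX_XFER);
    return caps;
}
```

An implementation could equally reject out-of-range values instead of clamping them; the sketch only shows how the two tiers of constants relate.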

Signed-off-by: Jagannathan Raman 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: John G Johnson 
---
 hw/vfio/pci.h   |   1 +
 hw/vfio/user-protocol.h |  41 +
 hw/vfio/user.h  |   2 +
 hw/vfio/pci.c   |  16 ++
 hw/vfio/user.c  | 414 +++-
 5 files changed, 473 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 59e636c..ec9f345 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -193,6 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI)
 struct VFIOUserPCIDevice {
 VFIOPCIDevice device;
 char *sock_name;
+bool send_queued;   /* all sends are queued */
 };
 
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index d23877c..a0889f6 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -51,4 +51,45 @@ enum vfio_user_command {
 #define VFIO_USER_NO_REPLY  0x10
 #define VFIO_USER_ERROR 0x20
 
+
+/*
+ * VFIO_USER_VERSION
+ */
+typedef struct {
+VFIOUserHdr hdr;
+uint16_t major;
+uint16_t minor;
+char capabilities[];
+} VFIOUserVersion;
+
+#define VFIO_USER_MAJOR_VER 0
+#define VFIO_USER_MINOR_VER 0
+
+#define VFIO_USER_CAP   "capabilities"
+
+/* "capabilities" members */
+#define VFIO_USER_CAP_MAX_FDS   "max_msg_fds"
+#define VFIO_USER_CAP_MAX_XFER  "max_data_xfer_size"
+#define VFIO_USER_CAP_MIGR  "migration"
+
+/* "migration" member */
+#define VFIO_USER_CAP_PGSIZE"pgsize"
+
+/*
+ * Max FDs mainly comes into play when a device supports multiple interrupts
+ * where each one uses an eventfd to inject it into the guest.
+ * It is clamped by the number of FDs the qio channel supports in a
+ * single message.
+ */
+#define VFIO_USER_DEF_MAX_FDS   8
+#define VFIO_USER_MAX_MAX_FDS   16
+
+/*
+ * Max transfer limits the amount of data in region and DMA messages.
+ * Region R/W will be very small (limited by how much a single instruction
+ * can process) so just use a reasonable limit here.
+ */
+#define VFIO_USER_DEF_MAX_XFER  (1024 * 1024)
+#define VFIO_USER_MAX_MAX_XFER  (64 * 1024 * 1024)
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 72eefa7..7ef3c95 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -74,11 +74,13 @@ typedef struct VFIOProxy {
 
 /* VFIOProxy flags */
 #define VFIO_PROXY_CLIENT0x1
+#define VFIO_PROXY_FORCE_QUEUED  0x4
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
 void vfio_user_set_handler(VFIODevice *vbasedev,
void (*handler)(void *opaque, VFIOUserMsg *msg),
void *reqarg);
+int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
 #endif /* VFIO_USER_H */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0de915d..3080bd4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3439,12 +3439,27 @@ static void vfio_user_pci_realize(PCIDevice *pdev, 
Error **errp)
 vbasedev->proxy = proxy;
 vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev);
 
+if (udev->send_queued) {
+proxy->flags |= VFIO_PROXY_FORCE_QUEUED;
+}
+
+vfio_user_validate_version(vbasedev, &err);
+if (err != NULL) {
+error_propagate(errp, err);
+goto error;
+}
+
 vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
 vbasedev->dev = DEVICE(vdev);
 vbasedev->fd = -1;
 vbasedev->type = VFIO_DEVICE_TYPE_PCI;
 vbasedev->ops = &vfio_user_pci_ops;
 
+return;
+
+error:
+vfio_user_disconnect(proxy);
+error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);
 }
 
 static void vfio_user_instance_finalize(Object *obj)
@@ -3461,6 +3476,7 @@ static void vfio_user_instance_finalize(Object *obj)
 
 static Property vfio_user_pci_dev_properties[] = {
 DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
+DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index e1dfd5d..fd1e0a8 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -23,12 +23,20 @@
 #include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "sysemu/iothread.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+#include "qapi/qmp/qnull.h"
+#include "qapi/qmp/qstring.h"
+#include "qapi/qmp/qnum.h"
 #include "user.h"
 
-static uint64_t max_xfer_size;
+static uint64_t max_xfer_size = VFIO_USER_DEF_MAX_XFER;
+static uint64_t max_send_fds = VFIO_USER_DEF_MAX_FDS;
+static int wait_time = 1000;   /* wait 1 sec for replies */
 static IOThread *vfio_user_iothread;
 
 static void vfio_user_shutdown(VFIOProxy *proxy);
+static int vfio_user_send_qio(VFIOProxy *proxy, VFIOUserMsg *msg);
 static VFIOUserMsg *vfio_user_getmsg(VFIOProxy 

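The key names in the user-protocol.h hunk suggest that the version message carries a nested JSON object: "capabilities" holds "max_msg_fds", "max_data_xfer_size" and "migration", and "migration" holds "pgsize". A minimal sketch of that layout follows, assuming this nesting is what the server expects. build_caps_json() and cap_in_range() are hypothetical helpers that do not appear in the patch, the pgsize value used in the usage note is only an example, and how the real code treats out-of-range values is not visible in this excerpt.

```c
#include <stdint.h>
#include <stdio.h>

/* Key names and limits copied from the user-protocol.h hunk. */
#define VFIO_USER_CAP           "capabilities"
#define VFIO_USER_CAP_MAX_FDS   "max_msg_fds"
#define VFIO_USER_CAP_MAX_XFER  "max_data_xfer_size"
#define VFIO_USER_CAP_MIGR      "migration"
#define VFIO_USER_CAP_PGSIZE    "pgsize"
#define VFIO_USER_DEF_MAX_FDS   8
#define VFIO_USER_MAX_MAX_FDS   16
#define VFIO_USER_DEF_MAX_XFER  (1024 * 1024)
#define VFIO_USER_MAX_MAX_XFER  (64 * 1024 * 1024)

/*
 * Hypothetical helper (not in the patch): format the capabilities
 * object into buf. Returns the snprintf() result.
 */
static int build_caps_json(char *buf, size_t len, uint64_t max_fds,
                           uint64_t max_xfer, uint64_t pgsize)
{
    return snprintf(buf, len,
                    "{\"%s\":{\"%s\":%llu,\"%s\":%llu,\"%s\":{\"%s\":%llu}}}",
                    VFIO_USER_CAP,
                    VFIO_USER_CAP_MAX_FDS, (unsigned long long)max_fds,
                    VFIO_USER_CAP_MAX_XFER, (unsigned long long)max_xfer,
                    VFIO_USER_CAP_MIGR,
                    VFIO_USER_CAP_PGSIZE, (unsigned long long)pgsize);
}

/*
 * Hypothetical check (an assumption, not the patch's logic): is a
 * peer-supplied value non-zero and within the compiled-in maximum?
 */
static int cap_in_range(uint64_t value, uint64_t max)
{
    return value > 0 && value <= max;
}
```

Calling build_caps_json() with VFIO_USER_DEF_MAX_FDS, VFIO_USER_DEF_MAX_XFER and an example pgsize of 4096 produces a single-line object whose keys match the macros above.
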
[RFC v4 02/21] vfio-user: add VFIO base abstract class

2022-01-11 Thread John Johnson
Add an abstract base class both the kernel driver
and user socket implementations can use to share code.

Signed-off-by: John G Johnson 
Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman 
---
 hw/vfio/pci.h |  16 +++--
 hw/vfio/pci.c | 106 +++---
 2 files changed, 78 insertions(+), 44 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 6477751..bbc78aa 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -114,8 +114,13 @@ typedef struct VFIOMSIXInfo {
 unsigned long *pending;
 } VFIOMSIXInfo;
 
-#define TYPE_VFIO_PCI "vfio-pci"
-OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI)
+/*
+ * TYPE_VFIO_PCI_BASE is an abstract type used to share code
+ * between VFIO implementations that use a kernel driver
+ * with those that use user sockets.
+ */
+#define TYPE_VFIO_PCI_BASE "vfio-pci-base"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI_BASE)
 
 struct VFIOPCIDevice {
 PCIDevice pdev;
@@ -175,6 +180,13 @@ struct VFIOPCIDevice {
 Notifier irqchip_change_notifier;
 };
 
+#define TYPE_VFIO_PCI "vfio-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOKernPCIDevice, VFIO_PCI)
+
+struct VFIOKernPCIDevice {
+VFIOPCIDevice device;
+};
+
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
 static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device)
 {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7b45353..d00a162 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -231,7 +231,7 @@ static void vfio_intx_update(VFIOPCIDevice *vdev, 
PCIINTxRoute *route)
 
 static void vfio_intx_routing_notifier(PCIDevice *pdev)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 PCIINTxRoute route;
 
 if (vdev->interrupt != VFIO_INT_INTx) {
@@ -457,7 +457,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, 
MSIMessage msg,
 static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
MSIMessage *msg, IOHandler *handler)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 VFIOMSIVector *vector;
 int ret;
 
@@ -542,7 +542,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev,
 
 static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 VFIOMSIVector *vector = &vdev->msi_vectors[nr];
 
 trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
@@ -1063,7 +1063,7 @@ static const MemoryRegionOps vfio_vga_ops = {
  */
 static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 VFIORegion *region = &vdev->bars[bar].region;
 MemoryRegion *mmap_mr, *region_mr, *base_mr;
 PCIIORegion *r;
@@ -1109,7 +1109,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice 
*pdev, int bar)
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;
 
 memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -1142,7 +1142,7 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t 
addr, int len)
 void vfio_pci_write_config(PCIDevice *pdev,
uint32_t addr, uint32_t val, int len)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 uint32_t val_le = cpu_to_le32(val);
 
 trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);
@@ -2799,7 +2799,7 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice 
*vdev)
 
 static void vfio_realize(PCIDevice *pdev, Error **errp)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 VFIODevice *vbasedev_iter;
 VFIOGroup *group;
 char *tmp, *subsys, group_path[PATH_MAX], *group_name;
@@ -3122,7 +3122,7 @@ error:
 
 static void vfio_instance_finalize(Object *obj)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(obj);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
 VFIOGroup *group = vdev->vbasedev.group;
 
 vfio_display_finalize(vdev);
@@ -3142,7 +3142,7 @@ static void vfio_instance_finalize(Object *obj)
 
 static void vfio_exitfn(PCIDevice *pdev)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 
 vfio_unregister_req_notifier(vdev);
 vfio_unregister_err_notifier(vdev);
@@ -3161,7 +3161,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 
 static void vfio_pci_reset(DeviceState *dev)
 {
-VFIOPCIDevice *vdev = VFIO_PCI(dev);
+VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
 
 trace_vfio_pci_reset(vdev->vbasedev.name);
 
@@ -3201,7 +3201,7 @@ post_reset:
 static void vfio_instance_init(Object *obj)
 {
 PCIDevice *pci_dev = PCI_DEVICE(obj);
-

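The split into an abstract base type plus a kernel-driver subtype relies on the usual C idiom of embedding the base struct as the first member of the derived struct. A minimal standalone sketch of that idiom follows. The type and function names here are placeholders chosen for illustration, not QEMU identifiers, and the sketch deliberately omits the run-time type checking that the QOM cast macros such as VFIO_PCI_BASE() perform.

```c
#include <stddef.h>

/* Stands in for VFIOPCIDevice: state shared by all implementations. */
typedef struct BasePCIDevice {
    const char *name;
    int interrupt;
} BasePCIDevice;

/* Stands in for VFIOKernPCIDevice: the base must be the first member. */
typedef struct KernPCIDevice {
    BasePCIDevice device;
} KernPCIDevice;

/* Stands in for VFIOUserPCIDevice: base first, then private fields. */
typedef struct UserPCIDevice {
    BasePCIDevice device;
    const char *sock_name;
} UserPCIDevice;

/* Shared code only ever sees the base type. */
static const char *base_name(const BasePCIDevice *dev)
{
    return dev->name;
}

/*
 * Unchecked upcast. It is valid only because the base struct sits at
 * offset 0 of every derived struct.
 */
static BasePCIDevice *to_base(void *derived)
{
    return (BasePCIDevice *)derived;
}
```

This is why the common functions in pci.c can switch from VFIO_PCI() to VFIO_PCI_BASE() without otherwise changing: they only touch fields of the embedded base struct.
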
Re: [PATCH v2 2/4] scripts/qapi/commands: gen_commands(): add add_trace_points argument

2022-01-11 Thread John Snow
On Tue, Jan 11, 2022 at 6:53 PM John Snow  wrote:
>
> On Thu, Dec 23, 2021 at 6:08 AM Vladimir Sementsov-Ogievskiy
>  wrote:
> >
> > Add possibility to generate trace points for each qmp command.
> >
> > We should generate both trace points and trace-events file, for further
> > trace point code generation.
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > ---
> >  scripts/qapi/commands.py | 84 ++--
> >  1 file changed, 73 insertions(+), 11 deletions(-)
> >
> > diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
> > index 21001bbd6b..9691c11f96 100644
> > --- a/scripts/qapi/commands.py
> > +++ b/scripts/qapi/commands.py
> > @@ -53,7 +53,8 @@ def gen_command_decl(name: str,
> >  def gen_call(name: str,
> >   arg_type: Optional[QAPISchemaObjectType],
> >   boxed: bool,
> > - ret_type: Optional[QAPISchemaType]) -> str:
> > + ret_type: Optional[QAPISchemaType],
> > + add_trace_points: bool) -> str:
> >  ret = ''
> >
> >  argstr = ''
> > @@ -71,21 +72,65 @@ def gen_call(name: str,
> >  if ret_type:
> >  lhs = 'retval = '
> >
> > -ret = mcgen('''
> > +qmp_name = f'qmp_{c_name(name)}'
> > +upper = qmp_name.upper()
> > +
> > +if add_trace_points:
> > +ret += mcgen('''
> > +
> > +if (trace_event_get_state_backends(TRACE_%(upper)s)) {
> > +g_autoptr(GString) req_json = qobject_to_json(QOBJECT(args));
> > +trace_%(qmp_name)s("", req_json->str);
> > +}
> > +''',
> > + upper=upper, qmp_name=qmp_name)
> > +
> > +ret += mcgen('''
> >
> >  %(lhs)sqmp_%(c_name)s(%(args)s);
> > -error_propagate(errp, err);
> >  ''',
> >  c_name=c_name(name), args=argstr, lhs=lhs)
> > -if ret_type:
> > -ret += mcgen('''
> > +
> > +ret += mcgen('''
> >  if (err) {
> > +''')
> > +
> > +if add_trace_points:
> > +ret += mcgen('''
> > +trace_%(qmp_name)s("FAIL: ", error_get_pretty(err));
> > +''',
> > + qmp_name=qmp_name)
> > +
> > +ret += mcgen('''
> > +error_propagate(errp, err);
> >  goto out;
> >  }
> > +''')
> > +
> > +if ret_type:
> > +ret += mcgen('''
> >
> >  qmp_marshal_output_%(c_name)s(retval, ret, errp);
> >  ''',
> >   c_name=ret_type.c_name())
> > +
> > +if add_trace_points:
> > +if ret_type:
> > +ret += mcgen('''
> > +
> > +if (trace_event_get_state_backends(TRACE_%(upper)s)) {
> > +g_autoptr(GString) ret_json = qobject_to_json(*ret);
> > +trace_%(qmp_name)s("RET:", ret_json->str);
> > +}
> > +''',
> > + upper=upper, qmp_name=qmp_name)
> > +else:
> > +ret += mcgen('''
> > +
> > +trace_%(qmp_name)s("SUCCESS", "");
> > +''',
> > + qmp_name=qmp_name)
> > +
> >  return ret
> >
> >
> > @@ -122,10 +167,14 @@ def gen_marshal_decl(name: str) -> str:
> >   proto=build_marshal_proto(name))
> >
> >
> > +def gen_trace(name: str) -> str:
> > > +return f'qmp_{c_name(name)}(const char *tag, const char *json) "%s%s"\n'
> > +
> >  def gen_marshal(name: str,
> >  arg_type: Optional[QAPISchemaObjectType],
> >  boxed: bool,
> > -ret_type: Optional[QAPISchemaType]) -> str:
> > +ret_type: Optional[QAPISchemaType],
> > +add_trace_points: bool) -> str:
> >  have_args = boxed or (arg_type and not arg_type.is_empty())
> >  if have_args:
> >  assert arg_type is not None
> > @@ -180,7 +229,7 @@ def gen_marshal(name: str,
> >  }
> >  ''')
> >
> > -ret += gen_call(name, arg_type, boxed, ret_type)
> > +ret += gen_call(name, arg_type, boxed, ret_type, add_trace_points)
> >
> >  ret += mcgen('''
> >
> > @@ -238,11 +287,12 @@ def gen_register_command(name: str,
> >
> >
> >  class QAPISchemaGenCommandVisitor(QAPISchemaModularCVisitor):
> > -def __init__(self, prefix: str):
> > +def __init__(self, prefix: str, add_trace_points: bool):
> >  super().__init__(
> >  prefix, 'qapi-commands',
> >  ' * Schema-defined QAPI/QMP commands', None, __doc__)
> >  self._visited_ret_types: Dict[QAPIGenC, Set[QAPISchemaType]] = {}
> > +self.add_trace_points = add_trace_points
> >
> >  def _begin_user_module(self, name: str) -> None:
> >  self._visited_ret_types[self._genc] = set()
> > @@ -261,6 +311,15 @@ def _begin_user_module(self, name: str) -> None:
> >
> >  ''',
> >   commands=commands, visit=visit))
> > +
> > +if self.add_trace_points and c_name(commands) != 'qapi_commands':
> > +self._genc.add(mcgen('''
> > +#include "trace/trace-qapi.h"
> > +#include "qapi/qmp/qjson.h"
> > +#include "trace/trace-%(nm)s_trace_events.h"
> > +''',
> > +  

[PATCH 3/3] hw/arm: kudo add max31790 behind bus 1 switch at 75

2022-01-11 Thread Titus Rwantare
From: Patrick Venture 

Signed-off-by: Patrick Venture 
Reviewed-by: Titus Rwantare 
---
 hw/arm/npcm7xx_boards.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
index 7d0f3148be..6fea2e161f 100644
--- a/hw/arm/npcm7xx_boards.c
+++ b/hw/arm/npcm7xx_boards.c
@@ -342,6 +342,7 @@ static void kudo_bmc_i2c_init(NPCM7xxState *soc)
 i2c_mux = i2c_slave_create_simple(npcm7xx_i2c_get_bus(soc, 13),
   TYPE_PCA9548, 0x77);
 
+i2c_slave_create_simple(pca954x_i2c_get_bus(i2c_mux, 2), "max31790", 0x2c);
 /* tmp105 is compatible with the lm75 */
 i2c_slave_create_simple(pca954x_i2c_get_bus(i2c_mux, 2), "tmp105", 0x48);
 i2c_slave_create_simple(pca954x_i2c_get_bus(i2c_mux, 3), "tmp105", 0x49);
-- 
2.34.1.575.g55b058a8bb-goog




[PATCH 2/3] tests/qtest: add tests for MAX31790 fan controller

2022-01-11 Thread Titus Rwantare
Signed-off-by: Titus Rwantare 
Reviewed-by: Hao Wu 
---
 tests/qtest/max31790_fan_ctrl-test.c | 171 +++
 tests/qtest/meson.build  |   1 +
 2 files changed, 172 insertions(+)
 create mode 100644 tests/qtest/max31790_fan_ctrl-test.c

diff --git a/tests/qtest/max31790_fan_ctrl-test.c 
b/tests/qtest/max31790_fan_ctrl-test.c
new file mode 100644
index 00..b0b703d018
--- /dev/null
+++ b/tests/qtest/max31790_fan_ctrl-test.c
@@ -0,0 +1,171 @@
+/*
+ * QTests for MAX31790 Fan controller
+ *
+ * Independently control 6 fans, up to 12 tachometer inputs,
+ * controlled through i2c
+ *
+ * Copyright 2021 Google LLC
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "hw/sensor/max31790_fan_ctrl.h"
+#include "libqtest-single.h"
+#include "libqos/qgraph.h"
+#include "libqos/i2c.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qnum.h"
+#include "qemu/bitops.h"
+
+#define TEST_ID "max31790-test"
+#define TEST_ADDR   (0x37)
+#define TEST_MAX_RPM0x4000
+
+static uint16_t qmp_max31790_get(const char *id, const char *property)
+{
+QDict *response;
+uint64_t ret;
+
+response = qmp("{ 'execute': 'qom-get', 'arguments': { 'path': %s, "
+   "'property': %s } }", id, property);
+g_assert(qdict_haskey(response, "return"));
+ret = qnum_get_uint(qobject_to(QNum, qdict_get(response, "return")));
+qobject_unref(response);
+return ret;
+}
+
+static void qmp_max31790_set(const char *id,
+const char *property,
+uint16_t value)
+{
+QDict *response;
+
+response = qmp("{ 'execute': 'qom-set', 'arguments': { 'path': %s, "
+   "'property': %s, 'value': %u } }", id, property, value);
+g_assert(qdict_haskey(response, "return"));
+qobject_unref(response);
+}
+
+static uint32_t max31790_tach_count2rpm(uint16_t tach, uint8_t sr)
+{
+if (tach) {
+return (sr * MAX31790_CLK_FREQ * 60) / (MAX31790_PULSES_PER_REV * tach);
+} else {
+return 0;
+}
+}
+
+/* R/W Tach - 6 fans */
+static void test_defaults(void *obj, void *data, QGuestAllocator *alloc)
+{
+QI2CDevice *i2cdev = (QI2CDevice *)obj;
+uint8_t i2c_value;
+
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_GLOBAL_CONFIG);
+g_assert_cmphex(i2c_value, ==, MAX31790_GLOBAL_CONFIG_DEFAULT);
+
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_PWM_FREQ);
+g_assert_cmphex(i2c_value, ==, MAX31790_PWM_FREQ_DEFAULT);
+
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_FAN_DYNAMICS(i));
+g_assert_cmphex(i2c_value, ==, MAX31790_FAN_DYNAMICS_DEFAULT);
+}
+
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_FAILED_FAN_OPTS_SEQ_STRT);
+g_assert_cmphex(i2c_value, ==, MAX31790_FAILED_FAN_OPTS_SEQ_STRT_DEFAULT);
+}
+
+static void test_pwm(void *obj, void *data, QGuestAllocator *alloc)
+{
+QI2CDevice *i2cdev = (QI2CDevice *)obj;
+char *path;
+int err;
+uint16_t i2c_value, value, rpm;
+
+
+/* init fans to different pwm duty cycles */
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+path = g_strdup_printf("max_rpm[%d]", i);
+qmp_max31790_set(TEST_ID, path, TEST_MAX_RPM); /* ~16k RPM */
+g_free(path);
+i2c_set8(i2cdev, MAX31790_REG_FAN_CONFIG(i), 0); /* enable PWM mode */
+path = g_strdup_printf("pwm[%d]", i);
+qmp_max31790_set(TEST_ID, path, i * 0x40);
+g_free(path);
+}
+
+/* read and compare qmp with i2c 9-bit pwm */
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+path = g_strdup_printf("pwm[%d]", i);
+value = qmp_max31790_get(TEST_ID, path);
+g_free(path);
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_PWMOUT_MSB(i)) << 8;
+i2c_value |= i2c_get8(i2cdev, MAX31790_REG_PWMOUT_LSB(i));
+i2c_value >>= MAX31790_PWM_SHAMT;
+g_assert_cmphex(value, ==, i2c_value);
+}
+
+/* expect tach to match pwm scaled to max_rpm */
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+i2c_value = i2c_get8(i2cdev, MAX31790_REG_TACH_COUNT_MSB(i)) << 8;
+i2c_value |= i2c_get8(i2cdev, MAX31790_REG_TACH_COUNT_LSB(i));
+i2c_value >>= 5;
+value = max31790_tach_count2rpm(i2c_value, MAX31790_SR_DEFAULT);
+rpm = (TEST_MAX_RPM * i * 0x40) / 0x1FF; /* max_rpm x pwm_duty_cycle */
+err = value - rpm;
+g_assert_cmpuint(abs(err), <, 163); /* ~1% of max_rpm */
+}
+}
+
+static void test_rpm(void *obj, void *data, QGuestAllocator *alloc)
+{
+QI2CDevice *i2cdev = (QI2CDevice *)obj;
+char *path;
+int err;
+uint16_t i2c_value, value, rpm;
+
+/* init fans to different speeds */
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+i2c_set8(i2cdev, MAX31790_REG_FAN_CONFIG(i),
+ MAX31790_FAN_CFG_RPM_MODE);
+path = g_strdup_printf("target_rpm[%d]", i);
+

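The PWM test expects each fan's RPM to equal max_rpm scaled by its 9-bit duty cycle, and accepts an error below 163, which is roughly 1% of 0x4000. The arithmetic can be reproduced in isolation as sketched below. expected_rpm() and within_tolerance() are illustrative helpers, not functions from the test file; the constants are the ones visible in the test.

```c
#include <stdint.h>
#include <stdlib.h>

#define TEST_MAX_RPM    0x4000  /* value used by the test, about 16k RPM */
#define PWM_FULL_SCALE  0x1FF   /* largest 9-bit duty cycle */

/* Fan i is programmed with duty cycle i * 0x40, as in test_pwm(). */
static uint16_t expected_rpm(int fan)
{
    return (TEST_MAX_RPM * fan * 0x40) / PWM_FULL_SCALE;
}

/* Same acceptance rule as the test: |measured - expected| < 163. */
static int within_tolerance(uint16_t measured, uint16_t expected)
{
    int err = measured - expected;
    return abs(err) < 163;
}
```

With these numbers fan 1 is expected at 2052 RPM and fan 5 at 10260 RPM; the division truncates, which is one reason the test compares against a tolerance rather than for equality.
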
[PATCH 1/3] hw/sensor: add MAX31790 fan controller

2022-01-11 Thread Titus Rwantare
Signed-off-by: Titus Rwantare 
Reviewed-by: Hao Wu 
---
 MAINTAINERS   |   8 +-
 hw/arm/Kconfig|   1 +
 hw/sensor/Kconfig |   4 +
 hw/sensor/max31790_fan_ctrl.c | 454 ++
 hw/sensor/meson.build |   1 +
 include/hw/sensor/max31790_fan_ctrl.h |  93 ++
 6 files changed, 560 insertions(+), 1 deletion(-)
 create mode 100644 hw/sensor/max31790_fan_ctrl.c
 create mode 100644 include/hw/sensor/max31790_fan_ctrl.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c98a61caee..0791b6be42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2304,6 +2304,12 @@ F: hw/timer/mips_gictimer.c
 F: include/hw/intc/mips_gic.h
 F: include/hw/timer/mips_gictimer.h
 
+MAX31790 Fan controller
+M: Titus Rwantare 
+S: Maintained
+F: hw/sensor/max31790_fan_ctrl.c
+F: include/hw/sensor/max31790_fan_ctrl.h
+
 Subsystems
 --
 Overall Audio backends
@@ -2798,7 +2804,7 @@ R: Paolo Bonzini 
 R: Bandan Das 
 R: Stefan Hajnoczi 
 R: Thomas Huth 
-R: Darren Kenny  
+R: Darren Kenny 
 R: Qiuhao Li 
 S: Maintained
 F: tests/qtest/fuzz/
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index e652590943..00bfbaf1c4 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -393,6 +393,7 @@ config NPCM7XX
 select SMBUS
 select AT24C  # EEPROM
 select MAX34451
+select MAX31790
 select PL310  # cache controller
 select PMBUS
 select SERIAL
diff --git a/hw/sensor/Kconfig b/hw/sensor/Kconfig
index 9c8a049b06..54d269d642 100644
--- a/hw/sensor/Kconfig
+++ b/hw/sensor/Kconfig
@@ -21,3 +21,7 @@ config ADM1272
 config MAX34451
 bool
 depends on I2C
+
+config MAX31790
+bool
+depends on I2C
diff --git a/hw/sensor/max31790_fan_ctrl.c b/hw/sensor/max31790_fan_ctrl.c
new file mode 100644
index 00..b5334c1130
--- /dev/null
+++ b/hw/sensor/max31790_fan_ctrl.c
@@ -0,0 +1,454 @@
+/*
+ * MAX31790 Fan controller
+ *
+ * Independently control 6 fans, up to 12 tachometer inputs,
+ * controlled through i2c
+ *
+ * This device model has read/write support for:
+ * - 9-bit pwm through i2c and qom/qmp
+ * - 11-bit tach_count through i2c
+ * - RPM through qom/qmp
+ *
+ * Copyright 2021 Google LLC
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sensor/max31790_fan_ctrl.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+#include "qapi/visitor.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+
+static uint16_t max31790_get_sr(uint8_t fan_dynamics)
+{
+uint16_t sr = 1 << ((fan_dynamics >> 5) & 0b111);
+return sr > 16 ? 32 : sr;
+}
+
+static void max31790_place_bits(uint16_t *dest, uint16_t byte, uint8_t offset)
+{
+uint16_t val = *dest;
+val &= ~(0x00FF << offset);
+val |= byte << offset;
+*dest = val;
+}
+
+/*
+ * calculating fan speed
+ *  f_TOSC/4 is the clock, 8192Hz
+ *  NP = tachometer pulses per revolution (usually 2)
+ *  SR = number of periods(pulses) over which the clock ticks are counted
+ *  TACH Count = SR x 8192 x 60 / (NP x RPM)
+ *  RPM = SR x 8192 x 60 / (NP x TACH count)
+ *
+ *  RPM mode - desired tach count is written to TACH Target Count
+ *  PWM mode - desired duty cycle is written to PWMOUT Target Duty reg
+ */
+static void max31790_calculate_tach_count(MAX31790State *ms, uint8_t id)
+{
+uint32_t rpm;
+uint32_t sr = max31790_get_sr(ms->fan_dynamics[id]);
+ms->pwm_duty_cycle[id] = ms->pwmout[id] >> 7;
+rpm = (ms->max_rpm[id] * ms->pwm_duty_cycle[id]) / 0x1FF;
+
+if (rpm) {
+ms->tach_count[id] = (sr * MAX31790_CLK_FREQ * 60) /
+ (MAX31790_PULSES_PER_REV * rpm);
+} else {
+ms->tach_count[id] = 0;
+}
+
+}
+
+static void max31790_update_tach_count(MAX31790State *ms)
+{
+for (int i = 0; i < MAX31790_NUM_FANS; i++) {
+if (ms->fan_config[i] &
+(MAX31790_FAN_CFG_RPM_MODE | MAX31790_FAN_CFG_TACH_INPUT_EN)) {
+ms->tach_count[i] = ms->target_count[i] >> 5;
+} else { /* PWM mode */
+max31790_calculate_tach_count(ms, i);
+}
+}
+}
+
+/* consecutive reads can increment the address up to 0xFF then wrap to 0 */
+/* slave to master */
+static uint8_t max31790_recv(I2CSlave *i2c)
+{
+MAX31790State *ms = MAX31790(i2c);
+uint8_t data, index, rem;
+
+max31790_update_tach_count(ms);
+
+if (ms->cmd_is_new) {
+ms->cmd_is_new = false;
+} else {
+ms->command++;
+}
+
+switch (ms->command) {
+case MAX31790_REG_GLOBAL_CONFIG:
+data = ms->global_config;
+break;
+
+case MAX31790_REG_PWM_FREQ:
+data = ms->pwm_freq;
+break;
+
+case MAX31790_REG_FAN_CONFIG(0) ...
+ MAX31790_REG_FAN_CONFIG(MAX31790_NUM_FANS - 1):
+data = ms->fan_config[ms->command - MAX31790_REG_FAN_CONFIG(0)];
+break;
+
+case MAX31790_REG_FAN_DYNAMICS(0) ...
+ 

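The fan-speed formulas in the header comment can be checked with a small standalone sketch. The constants below are assumptions taken from that comment (an 8192 Hz clock and 2 pulses per revolution); the real MAX31790_CLK_FREQ and MAX31790_PULSES_PER_REV values live in max31790_fan_ctrl.h, which is not shown here. The function names are illustrative, and get_sr() mirrors max31790_get_sr() from the patch.

```c
#include <stdint.h>

#define CLK_FREQ        8192  /* f_TOSC/4 in Hz, per the comment (assumed) */
#define PULSES_PER_REV  2     /* NP, "usually 2" per the comment (assumed) */

/*
 * SR is encoded in bits 7:5 of the fan dynamics register:
 * codes 0..4 give 1, 2, 4, 8, 16 and every higher code gives 32.
 */
static uint16_t get_sr(uint8_t fan_dynamics)
{
    uint16_t sr = 1 << ((fan_dynamics >> 5) & 0x7);
    return sr > 16 ? 32 : sr;
}

/* TACH count = SR x 8192 x 60 / (NP x RPM); 0 RPM maps to count 0. */
static uint32_t rpm_to_tach(uint32_t rpm, uint32_t sr)
{
    return rpm ? (sr * CLK_FREQ * 60) / (PULSES_PER_REV * rpm) : 0;
}

/* RPM = SR x 8192 x 60 / (NP x TACH count); count 0 maps to 0 RPM. */
static uint32_t tach_to_rpm(uint32_t tach, uint32_t sr)
{
    return tach ? (sr * CLK_FREQ * 60) / (PULSES_PER_REV * tach) : 0;
}
```

As a worked case with SR = 4: 2048 RPM gives 4 x 8192 x 60 / (2 x 2048) = 480 counts, and 480 counts converts back to 2048 RPM. Because both conversions use integer division, round trips are exact only when the operands divide evenly.
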
RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

2022-01-11 Thread Michael Morrell
Richard,

It's been 6 months so I thought I'd check in again.   Do you have an estimate 
of when this will go in?

   Michael

-Original Message-
From: Michael Morrell 
Sent: Wednesday, July 14, 2021 10:50 AM
To: 'Richard Henderson' ; qemu-devel@nongnu.org
Subject: RE: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

OK, thanks for the update.  I also appreciate you looking at the NaN issue.

   Michael

-Original Message-
From: Richard Henderson  
Sent: Wednesday, July 14, 2021 9:57 AM
To: Michael Morrell ; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/11] softfloat: Introduce float_flag_inorm_denormal

On 7/14/21 9:44 AM, Michael Morrell wrote:
> Just curious.  What's the expected timeline to get these denormal patches in 
> the tree?

Next development cycle, at minimum.  I need to fix the problems vs NaNs that 
you identified.


r~


Re: [PATCH v2 2/4] scripts/qapi/commands: gen_commands(): add add_trace_points argument

2022-01-11 Thread John Snow
On Thu, Dec 23, 2021 at 6:08 AM Vladimir Sementsov-Ogievskiy
 wrote:
>
> Add possibility to generate trace points for each qmp command.
>
> We should generate both trace points and trace-events file, for further
> trace point code generation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  scripts/qapi/commands.py | 84 ++--
>  1 file changed, 73 insertions(+), 11 deletions(-)
>
> diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
> index 21001bbd6b..9691c11f96 100644
> --- a/scripts/qapi/commands.py
> +++ b/scripts/qapi/commands.py
> @@ -53,7 +53,8 @@ def gen_command_decl(name: str,
>  def gen_call(name: str,
>   arg_type: Optional[QAPISchemaObjectType],
>   boxed: bool,
> - ret_type: Optional[QAPISchemaType]) -> str:
> + ret_type: Optional[QAPISchemaType],
> + add_trace_points: bool) -> str:
>  ret = ''
>
>  argstr = ''
> @@ -71,21 +72,65 @@ def gen_call(name: str,
>  if ret_type:
>  lhs = 'retval = '
>
> -ret = mcgen('''
> +qmp_name = f'qmp_{c_name(name)}'
> +upper = qmp_name.upper()
> +
> +if add_trace_points:
> +ret += mcgen('''
> +
> +if (trace_event_get_state_backends(TRACE_%(upper)s)) {
> +g_autoptr(GString) req_json = qobject_to_json(QOBJECT(args));
> +trace_%(qmp_name)s("", req_json->str);
> +}
> +''',
> + upper=upper, qmp_name=qmp_name)
> +
> +ret += mcgen('''
>
>  %(lhs)sqmp_%(c_name)s(%(args)s);
> -error_propagate(errp, err);
>  ''',
>  c_name=c_name(name), args=argstr, lhs=lhs)
> -if ret_type:
> -ret += mcgen('''
> +
> +ret += mcgen('''
>  if (err) {
> +''')
> +
> +if add_trace_points:
> +ret += mcgen('''
> +trace_%(qmp_name)s("FAIL: ", error_get_pretty(err));
> +''',
> + qmp_name=qmp_name)
> +
> +ret += mcgen('''
> +error_propagate(errp, err);
>  goto out;
>  }
> +''')
> +
> +if ret_type:
> +ret += mcgen('''
>
>  qmp_marshal_output_%(c_name)s(retval, ret, errp);
>  ''',
>   c_name=ret_type.c_name())
> +
> +if add_trace_points:
> +if ret_type:
> +ret += mcgen('''
> +
> +if (trace_event_get_state_backends(TRACE_%(upper)s)) {
> +g_autoptr(GString) ret_json = qobject_to_json(*ret);
> +trace_%(qmp_name)s("RET:", ret_json->str);
> +}
> +''',
> + upper=upper, qmp_name=qmp_name)
> +else:
> +ret += mcgen('''
> +
> +trace_%(qmp_name)s("SUCCESS", "");
> +''',
> + qmp_name=qmp_name)
> +
>  return ret
>
>
> @@ -122,10 +167,14 @@ def gen_marshal_decl(name: str) -> str:
>   proto=build_marshal_proto(name))
>
>
> +def gen_trace(name: str) -> str:
> +return f'qmp_{c_name(name)}(const char *tag, const char *json) "%s%s"\n'
> +
>  def gen_marshal(name: str,
>  arg_type: Optional[QAPISchemaObjectType],
>  boxed: bool,
> -ret_type: Optional[QAPISchemaType]) -> str:
> +ret_type: Optional[QAPISchemaType],
> +add_trace_points: bool) -> str:
>  have_args = boxed or (arg_type and not arg_type.is_empty())
>  if have_args:
>  assert arg_type is not None
> @@ -180,7 +229,7 @@ def gen_marshal(name: str,
>  }
>  ''')
>
> -ret += gen_call(name, arg_type, boxed, ret_type)
> +ret += gen_call(name, arg_type, boxed, ret_type, add_trace_points)
>
>  ret += mcgen('''
>
> @@ -238,11 +287,12 @@ def gen_register_command(name: str,
>
>
>  class QAPISchemaGenCommandVisitor(QAPISchemaModularCVisitor):
> -def __init__(self, prefix: str):
> +def __init__(self, prefix: str, add_trace_points: bool):
>  super().__init__(
>  prefix, 'qapi-commands',
>  ' * Schema-defined QAPI/QMP commands', None, __doc__)
>  self._visited_ret_types: Dict[QAPIGenC, Set[QAPISchemaType]] = {}
> +self.add_trace_points = add_trace_points
>
>  def _begin_user_module(self, name: str) -> None:
>  self._visited_ret_types[self._genc] = set()
> @@ -261,6 +311,15 @@ def _begin_user_module(self, name: str) -> None:
>
>  ''',
>   commands=commands, visit=visit))
> +
> +if self.add_trace_points and c_name(commands) != 'qapi_commands':
> +self._genc.add(mcgen('''
> +#include "trace/trace-qapi.h"
> +#include "qapi/qmp/qjson.h"
> +#include "trace/trace-%(nm)s_trace_events.h"
> +''',
> + nm=c_name(commands)))
> +
>  self._genh.add(mcgen('''
>  #include "%(types)s.h"
>
> @@ -322,7 +381,9 @@ def visit_command(self,
>  with ifcontext(ifcond, self._genh, self._genc):
>  self._genh.add(gen_command_decl(name, arg_type, boxed, ret_type))
>  self._genh.add(gen_marshal_decl(name))
> - 

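For reference, the line that gen_trace() emits into a trace-events file has a fixed shape: the event name is "qmp_" plus the C-ified command name, followed by two string arguments and a "%s%s" format. The sketch below reproduces that shape in C. simple_c_name() is a simplified stand-in for QAPI's c_name() that only maps '-' and '.' to '_' (the real function handles more cases), and format_trace_event() is a hypothetical name used only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for c_name(): map '-' and '.' to '_'. */
static void simple_c_name(const char *name, char *out, size_t len)
{
    size_t i;

    if (len == 0) {
        return;
    }
    for (i = 0; name[i] != '\0' && i + 1 < len; i++) {
        out[i] = (name[i] == '-' || name[i] == '.') ? '_' : name[i];
    }
    out[i] = '\0';
}

/* Build one trace-events line for a QMP command name. */
static int format_trace_event(const char *cmd, char *out, size_t len)
{
    char cname[128];

    simple_c_name(cmd, cname, sizeof(cname));
    /* %%s%%s prints a literal "%s%s", the format of the trace event. */
    return snprintf(out, len,
                    "qmp_%s(const char *tag, const char *json) \"%%s%%s\"\n",
                    cname);
}
```

For the command name block-job-cancel this yields an event called qmp_block_job_cancel taking a tag and a JSON string.
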
Re: [PATCH v2 1/4] jobs: drop qmp_ trace points

2022-01-11 Thread John Snow
On Mon, Jan 10, 2022 at 11:06 AM Stefan Hajnoczi  wrote:
>
> On Thu, Dec 23, 2021 at 12:07:53PM +0100, Vladimir Sementsov-Ogievskiy wrote:
> > diff --git a/block/trace-events b/block/trace-events
> > index 549090d453..5be3e3913b 100644
> > --- a/block/trace-events
> > +++ b/block/trace-events
> > @@ -49,15 +49,6 @@ block_copy_read_fail(void *bcs, int64_t start, int ret) 
> > "bcs %p start %"PRId64"
> >  block_copy_write_fail(void *bcs, int64_t start, int ret) "bcs %p start 
> > %"PRId64" ret %d"
> >  block_copy_write_zeroes_fail(void *bcs, int64_t start, int ret) "bcs %p 
> > start %"PRId64" ret %d"
> >
> > -# ../blockdev.c
> > -qmp_block_job_cancel(void *job) "job %p"
> > -qmp_block_job_pause(void *job) "job %p"
> > -qmp_block_job_resume(void *job) "job %p"
> > -qmp_block_job_complete(void *job) "job %p"
> > -qmp_block_job_finalize(void *job) "job %p"
> > -qmp_block_job_dismiss(void *job) "job %p"
> > -qmp_block_stream(void *bs) "bs %p"
> > -
> >  # file-win32.c
> >  file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int 
> > type) "acb %p opaque %p offset %"PRId64" count %d type %d"
> >
> > diff --git a/trace-events b/trace-events
> > index a637a61eba..1265f1e0cc 100644
> > --- a/trace-events
> > +++ b/trace-events
> > @@ -79,14 +79,6 @@ job_state_transition(void *job,  int ret, const char 
> > *legal, const char *s0, con
> >  job_apply_verb(void *job, const char *state, const char *verb, const char 
> > *legal) "job %p in state %s; applying verb %s (%s)"
> >  job_completed(void *job, int ret) "job %p ret %d"
> >
> > -# job-qmp.c
> > -qmp_job_cancel(void *job) "job %p"
> > -qmp_job_pause(void *job) "job %p"
> > -qmp_job_resume(void *job) "job %p"
> > -qmp_job_complete(void *job) "job %p"
> > -qmp_job_finalize(void *job) "job %p"
> > -qmp_job_dismiss(void *job) "job %p"
>
> The job pointer argument will be lost. That's not ideal but probably
> worth getting trace events for all QMP commands.
>
> Stefan

We could move the six job-related tracepoints into the implementation
routines instead; i.e. job_user_cancel, job_user_pause, etc. This
would cover all 12 QMP interface tracepoints, and that'd let us keep
the "implementation" trace points.

--js




Re: test_isa_retry_flush() in ide-test.c

2022-01-11 Thread John Snow
On Fri, Jan 7, 2022, 12:27 PM Paolo Bonzini  wrote:

> On 1/7/22 17:01, Thomas Huth wrote:
> >   Hi John!
> >
> > I just noticed that test_isa_retry_flush() is not doing anything useful
> > anymore: It likely was supposed to run the test_retry_flush() function
> > with the "isapc" machine type, but actually test_retry_flush() ignores
> > the machine option parameter completely and always uses PCI accessor
> > functions nowadays (since commit 9c268f8ae84ae186).
> > Question is: Is it worth the effort to try to restore the original
> > intended behavior for the ISA test here, or shall we rather simply
> > remove it instead to save some testing cycles?
>
> The right way to fix it would be to use qgraph.  Second best option is
> to nuke it, because the conversion to qgraph would give the test back
> for free without writing more code.
>
> Paolo
>
>
Uh, nuke it. I think maybe this never worked correctly ...?

I'm looking at baca2b9e3a94be1690fc4a842a97b64a4c8f892c and it doesn't look
like I ever routed the const char *machine to go anywhere ... ? Maybe it
was a mis-merge or just a thinko, but I think you're safe to just destroy
it...

--js


Re: [PATCH] linux-user: rt_sigprocmask, check read perms first

2022-01-11 Thread Patrick Venture
On Tue, Jan 11, 2022 at 12:50 PM Laurent Vivier  wrote:

> Hi Patrick,
>
> On 11/01/2022 at 21:14, Patrick Venture wrote:
> >
> >
> > On Sat, Jan 8, 2022 at 10:16 AM Laurent Vivier wrote:
> >
> > On 06/01/2022 at 23:00, Patrick Venture wrote:
> >  > From: Shu-Chun Weng <s...@google.com>
> >  >
> >  > Linux kernel does it this way (checks read permission before validating `how`)
> >  > and the latest version of ABSL's `AddressIsReadable()` depends on this
> >  > behavior.
> >  >
> >  > c.f.
> >  > https://github.com/torvalds/linux/blob/9539ba4308ad5bdca6cb41c7b73cbb9f796dcdd7/kernel/signal.c#L3147
> >  > Reviewed-by: Patrick Venture <vent...@google.com>
> >  > Signed-off-by: Shu-Chun Weng <s...@google.com>
> >  > ---
> >  >   linux-user/syscall.c | 10 +-
> >  >   1 file changed, 5 insertions(+), 5 deletions(-)
> >  >
> >  > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >  > index ce9d64896c..3070d31f34 100644
> >  > --- a/linux-user/syscall.c
> >  > +++ b/linux-user/syscall.c
> >  > @@ -9491,6 +9491,11 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
> >  >   }
> >  >
> >  >   if (arg2) {
> >  > +if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
> >  > +return -TARGET_EFAULT;
> >  > +target_to_host_sigset(&set, p);
> >  > +unlock_user(p, arg2, 0);
> >  > +set_ptr = &set;
> >  >   switch(how) {
> >  >   case TARGET_SIG_BLOCK:
> >  >   how = SIG_BLOCK;
> >  > @@ -9504,11 +9509,6 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
> >  >   default:
> >  >   return -TARGET_EINVAL;
> >  >   }
> >  > -if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
> >  > -return -TARGET_EFAULT;
> >  > -target_to_host_sigset(&set, p);
> >  > -unlock_user(p, arg2, 0);
> >  > -set_ptr = &set;
> >  >   } else {
> >  >   how = 0;
> >  >   set_ptr = NULL;
> >
> > I know it's only a code move, but generally we also update the style to
> > pass scripts/checkpatch.pl successfully.
> >
> >
> > That is a reasonable request; however, can I just send a follow-on patch?
> > I didn't write this one and I honestly don't know much about it, but I
> > don't mind doing the cleanup.
> >
> >
> > Could you also update TARGET_NR_sigprocmask in the same way, as it seems
> > the kernel behaves like this in that case too?
> >
> >
> > I can take a look.  I would then prefer to also put the style fixup
> > in a preceding patch. I don't recall seeing whether qemu supports
> > clang-format.
> >
>
> There is no problem. You can keep this patch unmodified, and add patches
> to fix the problems.
>
> I only ask to have all the patches in one series.
>

Will take a swing at this for v2.


>
> Thanks,
> Laurent
>
>
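For readers following the thread, the ordering the patch aims for can be sketched as a small standalone model. This is only an illustration, not QEMU or kernel code: `model_rt_sigprocmask_checks`, the `MODEL_SIG_*` constants and the `set_readable` flag (standing in for the result of `lock_user(VERIFY_READ, ...)`) are hypothetical names introduced here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SIG_BLOCK/SIG_UNBLOCK/SIG_SETMASK values. */
enum { MODEL_SIG_BLOCK = 0, MODEL_SIG_UNBLOCK = 1, MODEL_SIG_SETMASK = 2 };

/*
 * Model of the argument checks only. Returns 0 on success or a negative
 * errno. `set` is the guest pointer (NULL means "no new mask") and
 * `set_readable` stands in for whether that memory could be read.
 */
int model_rt_sigprocmask_checks(int how, const void *set, bool set_readable)
{
    if (set != NULL) {
        /* Readability of the new set is checked first. */
        if (!set_readable) {
            return -EFAULT;
        }
        /* Only afterwards is `how` validated. */
        switch (how) {
        case MODEL_SIG_BLOCK:
        case MODEL_SIG_UNBLOCK:
        case MODEL_SIG_SETMASK:
            break;
        default:
            return -EINVAL;
        }
    }
    /* Without a new set, `how` is not inspected at all. */
    return 0;
}
```

With this ordering, a caller that passes an unreadable pointer together with an invalid `how` sees EFAULT rather than EINVAL, which appears to be the behavior the commit message says `AddressIsReadable()` relies on.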


Re: [PATCH] tests/qtest/hd-geo-test: Check for the lsi53c895a controller before using it

2022-01-11 Thread Philippe Mathieu-Daudé

On 22/12/21 16:36, Thomas Huth wrote:

The lsi53c895a SCSI controller might have been disabled in the target
binary, so let's check for its availability first before using it.

Signed-off-by: Thomas Huth 
---
  tests/qtest/hd-geo-test.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 0/4] tests/qtest: Check for devices and machines before adding tests

2022-01-11 Thread Philippe Mathieu-Daudé

On 20/12/21 09:10, Thomas Huth wrote:

Devices might not always be compiled into the QEMU target binaries.
We already have the libqos framework that is good for handling such
situations, but some of the qtests are not a really good fit for the
libqos framework. This patch series adds a new function to check
whether a device is available in the target binary or not, so that
tests can be run or skipped accordingly (also adding some additional
checks for the availability of machines in the target binaries).


What happens if a device or machine is inadvertently removed from the
build? We won't notice it directly anymore, right?



Re: [PATCH 3/4] tests/qtest/cdrom-test: Check whether devices are available before using them

2022-01-11 Thread Philippe Mathieu-Daudé

On 20/12/21 09:10, Thomas Huth wrote:

Downstream users might want to disable legacy devices in their binaries,
so we should not blindly assume that they are available. Add some proper
checks before using them.

Signed-off-by: Thomas Huth 
---
  tests/qtest/cdrom-test.c | 60 ++--
  1 file changed, 39 insertions(+), 21 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 2/4] tests/qtest: Improve endianness-test to work with missing machines and devices

2022-01-11 Thread Philippe Mathieu-Daudé

On 20/12/21 09:10, Thomas Huth wrote:

The users might have built QEMU with fewer machines or without the
i82378 superio device. Add some checks to the endianness-test so that
it is able to deal with such stripped-down QEMU versions, too.

Signed-off-by: Thomas Huth 
---
  tests/qtest/endianness-test.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)


The i82378 should work regardless of the guest endianness...



Re: [PATCH 1/4] tests/qtest: Add a function that checks whether a device is available

2022-01-11 Thread Philippe Mathieu-Daudé

On 20/12/21 09:10, Thomas Huth wrote:

Devices might not always be compiled into the QEMU target binaries.
We already have the libqos framework that is good for handling such
situations, but some of the qtests are not a really good fit for the
libqos framework. Let's add a qtest_has_device() function for such
tests instead.

Signed-off-by: Thomas Huth 
---
  tests/qtest/libqos/libqtest.h |  8 +++
  tests/qtest/libqtest.c| 44 +++
  2 files changed, 52 insertions(+)


Reviewed-by: Philippe Mathieu-Daudé 
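As a rough illustration of the "check before registering" pattern this patch enables, here is a standalone sketch. It does not use libqtest: `model_has_device` is a hypothetical stand-in for `qtest_has_device()`, and the fixed device list replaces the query a real test would make against the QEMU binary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device list; a real test would ask the QEMU binary instead. */
static const char *const built_in_devices[] = { "virtio-net-pci", "e1000" };

/* Stand-in for qtest_has_device(): true if `name` is in the list above. */
bool model_has_device(const char *name)
{
    size_t n = sizeof(built_in_devices) / sizeof(built_in_devices[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(built_in_devices[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Count how many of the wanted tests would be registered rather than skipped. */
size_t model_count_runnable(const char *const *wanted, size_t count)
{
    size_t runnable = 0;

    for (size_t i = 0; i < count; i++) {
        if (model_has_device(wanted[i])) {
            runnable++;
        }
    }
    return runnable;
}
```

A test that needs a missing device is simply never registered in this model, so it is neither run nor reported as a failure.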



Re: [PULL 00/13] Net patches

2022-01-11 Thread Vladislav Yaroshchuk
Tue, 11 Jan 2022, 5:10 AM Jason Wang :

> On Tue, Jan 11, 2022 at 12:49 AM Peter Maydell 
> wrote:
> >
> > On Mon, 10 Jan 2022 at 03:40, Jason Wang  wrote:
> > >
> > > The following changes since commit
> df722e33d5da26ea8604500ca8f509245a0ea524:
> > >
> > >   Merge tag 'bsd-user-arm-pull-request' of gitlab.com:bsdimp/qemu
> into staging (2022-01-08 09:37:59 -0800)
> > >
> > > are available in the git repository at:
> > >
> > >   https://github.com/jasowang/qemu.git tags/net-pull-request
> > >
> > > for you to fetch changes up to
> 5136cc6d3b8b74f4fa572f0874656947a401330e:
> > >
> > >   net/vmnet: update MAINTAINERS list (2022-01-10 11:30:55 +0800)
> > >
> > > 
> > >
> > > 
> >
> > Fails to build on OSX Catalina:
> >
> > ../../net/vmnet-common.m:165:10: error: use of undeclared identifier
> > 'VMNET_SHARING_SERVICE_BUSY'
> > case VMNET_SHARING_SERVICE_BUSY:
> >  ^
> >
> > This constant only got added in macOS 11.0. I guess that technically
> > our supported-platforms policy only requires us to support 11 (Big Sur)
> > and 12 (Monterey) at this point, but it would be nice to still be able
> > to build on Catalina (10.15).
>
> Yes, it was only supported by the vmnet framework starting from
> Catalyst according to
> https://developer.apple.com/documentation/vmnet?language=objc.
>
>
Yes, the new backend uses several symbols that are only available in
macOS >= 11.0, not only this one; e.g. vmnet_enable_isolation_key:
https://developer.apple.com/documentation/vmnet/vmnet_enable_isolation_key

>
> > (Personally I would like Catalina still to work at least for a little
> > while, because my x86 Mac is old enough that it is not supported by
> > Big Sur. I'll have to dump it once Apple stops doing security support
> > for Catalina, but they haven't done that quite yet.)
>
>
Sure, broken builds on old macOS versions are bad. For this case I think
it's enough to disable vmnet for macOS < 11.0 with a probe in the
configure build step. Especially given that Apple supports roughly the
three latest macOS versions, support for Catalina is expected to end
in 2022, around when QEMU 7.0 is released.

If this workaround is not suitable and vmnet must be supported on
Catalina 10.15 with a subset of the available features, it can be done.
But I'll only be ready to handle this in approximately two to three weeks.


> Sure, Vladislav please fix this and send a new version.
>
>
Quick fix as described above is available in v10:
https://patchew.org/QEMU/20220111211422.21789-1-yaroshchuk2...@gmail.com/

> Thanks
>
> >
> > -- PMM
> >
>
>
>

-- 
Best Regards,

Vladislav Yaroshchuk


[PATCH v10 6/7] net/vmnet: update qemu-options.hx

2022-01-11 Thread Vladislav Yaroshchuk
Signed-off-by: Vladislav Yaroshchuk 
---
 qemu-options.hx | 25 +
 1 file changed, 25 insertions(+)

diff --git a/qemu-options.hx b/qemu-options.hx
index ec90505d84..81dd34f550 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2732,6 +2732,25 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
 #ifdef __linux__
 "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
 "configure a vhost-vdpa network,Establish a vhost-vdpa 
netdev\n"
+#endif
+#ifdef CONFIG_VMNET
+"-netdev vmnet-host,id=str[,isolated=on|off][,net-uuid=uuid]\n"
+" [,start-address=addr,end-address=addr,subnet-mask=mask]\n"
+"configure a vmnet network backend in host mode with ID 
'str',\n"
+"isolate this interface from others with 'isolated',\n"
+"configure the address range and choose a subnet mask,\n"
+"specify network UUID 'uuid' to disable DHCP and interact 
with\n"
+"vmnet-host interfaces within this isolated network\n"
+"-netdev vmnet-shared,id=str[,isolated=on|off][,nat66-prefix=addr]\n"
+" [,start-address=addr,end-address=addr,subnet-mask=mask]\n"
+"configure a vmnet network backend in shared mode with ID 
'str',\n"
+"configure the address range and choose a subnet mask,\n"
+"set IPv6 ULA prefix (of length 64) to use for internal 
network,\n"
+"isolate this interface from others with 'isolated'\n"
+"-netdev vmnet-bridged,id=str,ifname=name[,isolated=on|off]\n"
+"configure a vmnet network backend in bridged mode with ID 
'str',\n"
+"use 'ifname=name' to select a physical network interface 
to be bridged,\n"
+"isolate this interface from others with 'isolated'\n"
 #endif
 "-netdev hubport,id=str,hubid=n[,netdev=nd]\n"
 "configure a hub port on the hub with ID 'n'\n", 
QEMU_ARCH_ALL)
@@ -2751,6 +2770,9 @@ DEF("nic", HAS_ARG, QEMU_OPTION_nic,
 #endif
 #ifdef CONFIG_POSIX
 "vhost-user|"
+#endif
+#ifdef CONFIG_VMNET
+"vmnet-host|vmnet-shared|vmnet-bridged|"
 #endif
 "socket][,option][,...][mac=macaddr]\n"
 "initialize an on-board / default host NIC (using MAC 
address\n"
@@ -2773,6 +2795,9 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 #endif
 #ifdef CONFIG_NETMAP
 "netmap|"
+#endif
+#ifdef CONFIG_VMNET
+"vmnet-host|vmnet-shared|vmnet-bridged|"
 #endif
 "socket][,option][,option][,...]\n"
 "old way to initialize a host network interface\n"
-- 
2.23.0




[PATCH v10 7/7] net/vmnet: update MAINTAINERS list

2022-01-11 Thread Vladislav Yaroshchuk
Signed-off-by: Vladislav Yaroshchuk 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c98a61caee..638d129305 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2641,6 +2641,11 @@ W: http://info.iet.unipi.it/~luigi/netmap/
 S: Maintained
 F: net/netmap.c
 
+Apple vmnet network backends
+M: Vladislav Yaroshchuk 
+S: Maintained
+F: net/vmnet*
+
 Host Memory Backends
 M: David Hildenbrand 
 M: Igor Mammedov 
-- 
2.23.0




[PATCH v10 5/7] net/vmnet: implement bridged mode (vmnet-bridged)

2022-01-11 Thread Vladislav Yaroshchuk
Signed-off-by: Vladislav Yaroshchuk 
---
 net/vmnet-bridged.m | 98 ++---
 1 file changed, 92 insertions(+), 6 deletions(-)

diff --git a/net/vmnet-bridged.m b/net/vmnet-bridged.m
index 4e42a90391..3c9da9dc8b 100644
--- a/net/vmnet-bridged.m
+++ b/net/vmnet-bridged.m
@@ -10,16 +10,102 @@
 
 #include "qemu/osdep.h"
 #include "qapi/qapi-types-net.h"
-#include "vmnet_int.h"
-#include "clients.h"
-#include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "clients.h"
+#include "vmnet_int.h"
 
 #include <vmnet/vmnet.h>
 
+typedef struct VmnetBridgedState {
+  VmnetCommonState cs;
+} VmnetBridgedState;
+
+static bool validate_ifname(const char *ifname)
+{
+xpc_object_t shared_if_list = vmnet_copy_shared_interface_list();
+__block bool match = false;
+
+xpc_array_apply(
+shared_if_list,
+^bool(size_t index, xpc_object_t value) {
+  if (strcmp(xpc_string_get_string_ptr(value), ifname) == 0) {
+  match = true;
+  return false;
+  }
+  return true;
+});
+
+return match;
+}
+
+static const char *get_valid_ifnames(void)
+{
+xpc_object_t shared_if_list = vmnet_copy_shared_interface_list();
+__block char *if_list = NULL;
+
+xpc_array_apply(
+shared_if_list,
+^bool(size_t index, xpc_object_t value) {
+  if_list = g_strconcat(xpc_string_get_string_ptr(value),
+" ",
+if_list,
+NULL);
+  return true;
+});
+
+if (if_list) {
+return if_list;
+}
+return "[no interfaces]";
+}
+
+static xpc_object_t create_if_desc(const Netdev *netdev, Error **errp)
+{
+const NetdevVmnetBridgedOptions *options = &(netdev->u.vmnet_bridged);
+xpc_object_t if_desc = xpc_dictionary_create(NULL, NULL, 0);
+
+xpc_dictionary_set_uint64(
+if_desc,
+vmnet_operation_mode_key,
+VMNET_BRIDGED_MODE
+);
+
+xpc_dictionary_set_bool(
+if_desc,
+vmnet_enable_isolation_key,
+options->isolated
+);
+
+if (validate_ifname(options->ifname)) {
+xpc_dictionary_set_string(if_desc,
+  vmnet_shared_interface_name_key,
+  options->ifname);
+} else {
+return NULL;
+}
+return if_desc;
+}
+
+static NetClientInfo net_vmnet_bridged_info = {
+.type = NET_CLIENT_DRIVER_VMNET_BRIDGED,
+.size = sizeof(VmnetBridgedState),
+.receive = vmnet_receive_common,
+.cleanup = vmnet_cleanup_common,
+};
+
 int net_init_vmnet_bridged(const Netdev *netdev, const char *name,
NetClientState *peer, Error **errp)
 {
-  error_setg(errp, "vmnet-bridged is not implemented yet");
-  return -1;
-}
+NetClientState *nc = qemu_new_net_client(&net_vmnet_bridged_info,
+ peer, "vmnet-bridged", name);
+xpc_object_t if_desc = create_if_desc(netdev, errp);
+
+if (!if_desc) {
+error_setg(errp,
+   "unsupported ifname, should be one of: %s",
+   get_valid_ifnames());
+return -1;
+}
+
+return vmnet_if_create(nc, if_desc, errp, NULL);
+}
\ No newline at end of file
-- 
2.23.0




[PATCH v10 3/7] net/vmnet: implement shared mode (vmnet-shared)

2022-01-11 Thread Vladislav Yaroshchuk
Interaction with vmnet.framework in different modes
differs only at the configuration stage, so we can create
common `send`, `receive`, etc. procedures and reuse them.

vmnet.framework supports iov, but writing more than
one iov into a vmnet interface fails with
'VMNET_INVALID_ARGUMENT'. Collecting the provided iovs into
one and passing it to vmnet works fine. That's the
reason why receive_iov() is left unimplemented. It still
works with good enough performance with only .receive()
implemented.

Also, there is no way to unsubscribe from receiving vmnet
packets other than registering and unregistering the event
callback, or simply dropping packets by ignoring them
when the related flag is set. Here we use
the second way.

Signed-off-by: Phillip Tennen 
Signed-off-by: Vladislav Yaroshchuk 
---
 net/vmnet-common.m | 310 +
 net/vmnet-shared.c |  75 ++-
 net/vmnet_int.h|  23 
 3 files changed, 404 insertions(+), 4 deletions(-)

diff --git a/net/vmnet-common.m b/net/vmnet-common.m
index 532d152840..6d474af4be 100644
--- a/net/vmnet-common.m
+++ b/net/vmnet-common.m
@@ -10,6 +10,8 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "qemu/log.h"
 #include "qapi/qapi-types-net.h"
 #include "vmnet_int.h"
 #include "clients.h"
@@ -17,4 +19,312 @@
 #include "qapi/error.h"
 
 #include <vmnet/vmnet.h>
+#include <dispatch/dispatch.h>
 
+#ifdef DEBUG
+#define D(x) x
+#define D_LOG(...) qemu_log(__VA_ARGS__)
+#else
+#define D(x) do { } while (0)
+#define D_LOG(...) do { } while (0)
+#endif
+
+typedef struct vmpktdesc vmpktdesc_t;
+typedef struct iovec iovec_t;
+
+static void vmnet_set_send_enabled(VmnetCommonState *s, bool enable)
+{
+s->send_enabled = enable;
+}
+
+
+static void vmnet_send_completed(NetClientState *nc, ssize_t len)
+{
+VmnetCommonState *s = DO_UPCAST(VmnetCommonState, nc, nc);
+vmnet_set_send_enabled(s, true);
+}
+
+
+static void vmnet_send(NetClientState *nc,
+   interface_event_t event_id,
+   xpc_object_t event)
+{
+assert(event_id == VMNET_INTERFACE_PACKETS_AVAILABLE);
+
+VmnetCommonState *s;
+uint64_t packets_available;
+
+struct iovec *iov;
+struct vmpktdesc *packets;
+int pkt_cnt;
+int i;
+
+vmnet_return_t if_status;
+ssize_t size;
+
+s = DO_UPCAST(VmnetCommonState, nc, nc);
+
+packets_available = xpc_dictionary_get_uint64(
+event,
+vmnet_estimated_packets_available_key
+);
+
+pkt_cnt = (packets_available < VMNET_PACKETS_LIMIT) ?
+  packets_available :
+  VMNET_PACKETS_LIMIT;
+
+
+iov = s->iov_buf;
+packets = s->packets_buf;
+
+for (i = 0; i < pkt_cnt; ++i) {
+packets[i].vm_pkt_size = s->max_packet_size;
+packets[i].vm_pkt_iovcnt = 1;
+packets[i].vm_flags = 0;
+}
+
+if_status = vmnet_read(s->vmnet_if, packets, &pkt_cnt);
+if (if_status != VMNET_SUCCESS) {
+error_printf("vmnet: read failed: %s\n",
+ vmnet_status_map_str(if_status));
+}
+qemu_mutex_lock_iothread();
+for (i = 0; i < pkt_cnt; ++i) {
+size = qemu_send_packet_async(nc,
+  iov[i].iov_base,
+  packets[i].vm_pkt_size,
+  vmnet_send_completed);
+if (size == 0) {
+vmnet_set_send_enabled(s, false);
+} else if (size < 0) {
+break;
+}
+}
+qemu_mutex_unlock_iothread();
+
+}
+
+
+static void vmnet_register_event_callback(VmnetCommonState *s)
+{
+dispatch_queue_t avail_pkt_q = dispatch_queue_create(
+"org.qemu.vmnet.if_queue",
+DISPATCH_QUEUE_SERIAL
+);
+
+vmnet_interface_set_event_callback(
+s->vmnet_if,
+VMNET_INTERFACE_PACKETS_AVAILABLE,
+avail_pkt_q,
+^(interface_event_t event_id, xpc_object_t event) {
+  if (s->send_enabled) {
+  vmnet_send(&s->nc, event_id, event);
+  }
+});
+}
+
+
+static void vmnet_bufs_init(VmnetCommonState *s)
+{
+int i;
+struct vmpktdesc *packets;
+struct iovec *iov;
+
+packets = s->packets_buf;
+iov = s->iov_buf;
+
+for (i = 0; i < VMNET_PACKETS_LIMIT; ++i) {
+iov[i].iov_len = s->max_packet_size;
+iov[i].iov_base = g_malloc0(iov[i].iov_len);
+packets[i].vm_pkt_iov = iov + i;
+}
+}
+
+
+const char *vmnet_status_map_str(vmnet_return_t status)
+{
+switch (status) {
+case VMNET_SUCCESS:
+return "success";
+case VMNET_FAILURE:
+return "general failure";
+case VMNET_MEM_FAILURE:
+return "memory allocation failure";
+case VMNET_INVALID_ARGUMENT:
+return "invalid argument specified";
+case VMNET_SETUP_INCOMPLETE:
+return "interface setup is not complete";
+case VMNET_INVALID_ACCESS:
+return "invalid access, permission denied";
+case VMNET_PACKET_TOO_BIG:
+

[PATCH v10 4/7] net/vmnet: implement host mode (vmnet-host)

2022-01-11 Thread Vladislav Yaroshchuk
Signed-off-by: Vladislav Yaroshchuk 
---
 net/vmnet-host.c | 93 
 1 file changed, 87 insertions(+), 6 deletions(-)

diff --git a/net/vmnet-host.c b/net/vmnet-host.c
index 4a5ef99dc7..9c2e760ed1 100644
--- a/net/vmnet-host.c
+++ b/net/vmnet-host.c
@@ -9,16 +9,97 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/uuid.h"
 #include "qapi/qapi-types-net.h"
-#include "vmnet_int.h"
-#include "clients.h"
-#include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "clients.h"
+#include "vmnet_int.h"
 
 #include <vmnet/vmnet.h>
 
+typedef struct VmnetHostState {
+  VmnetCommonState cs;
+  QemuUUID network_uuid;
+} VmnetHostState;
+
+static xpc_object_t create_if_desc(const Netdev *netdev,
+   NetClientState *nc,
+   Error **errp)
+{
+const NetdevVmnetHostOptions *options = &(netdev->u.vmnet_host);
+VmnetCommonState *cs = DO_UPCAST(VmnetCommonState, nc, nc);
+VmnetHostState *hs = DO_UPCAST(VmnetHostState, cs, cs);
+
+xpc_object_t if_desc = xpc_dictionary_create(NULL, NULL, 0);
+
+xpc_dictionary_set_uint64(
+if_desc,
+vmnet_operation_mode_key,
+VMNET_HOST_MODE
+);
+
+xpc_dictionary_set_bool(
+if_desc,
+vmnet_enable_isolation_key,
+options->isolated
+);
+
+if (options->has_net_uuid) {
+if (qemu_uuid_parse(options->net_uuid, &hs->network_uuid) < 0) {
+error_setg(errp, "Invalid UUID provided in 'net-uuid'");
+}
+
+xpc_dictionary_set_uuid(
+if_desc,
+vmnet_network_identifier_key,
+hs->network_uuid.data
+);
+}
+
+if (options->has_start_address ||
+options->has_end_address ||
+options->has_subnet_mask) {
+
+if (options->has_start_address &&
+options->has_end_address &&
+options->has_subnet_mask) {
+
+xpc_dictionary_set_string(if_desc,
+  vmnet_start_address_key,
+  options->start_address);
+xpc_dictionary_set_string(if_desc,
+  vmnet_end_address_key,
+  options->end_address);
+xpc_dictionary_set_string(if_desc,
+  vmnet_subnet_mask_key,
+  options->subnet_mask);
+} else {
+error_setg(
+errp,
+"'start-address', 'end-address', 'subnet_mask' "
+"should be provided together"
+);
+}
+}
+
+return if_desc;
+}
+
+static NetClientInfo net_vmnet_host_info = {
+.type = NET_CLIENT_DRIVER_VMNET_HOST,
+.size = sizeof(VmnetHostState),
+.receive = vmnet_receive_common,
+.cleanup = vmnet_cleanup_common,
+};
+
 int net_init_vmnet_host(const Netdev *netdev, const char *name,
-NetClientState *peer, Error **errp) {
-  error_setg(errp, "vmnet-host is not implemented yet");
-  return -1;
+NetClientState *peer, Error **errp)
+{
+NetClientState *nc;
+xpc_object_t if_desc;
+
+nc = qemu_new_net_client(&net_vmnet_host_info,
+ peer, "vmnet-host", name);
+if_desc = create_if_desc(netdev, nc, errp);
+return vmnet_if_create(nc, if_desc, errp, NULL);
 }
-- 
2.23.0




[PATCH v10 1/7] net/vmnet: add vmnet dependency and customizable option

2022-01-11 Thread Vladislav Yaroshchuk
The vmnet.framework dependency is added with a 'vmnet' option
to enable or disable it. The default value is 'auto'.

The vmnet features to be used are available since macOS 11.0;
a corresponding probe is added to meson.build.

Signed-off-by: Vladislav Yaroshchuk 
---
 meson.build   | 16 +++-
 meson_options.txt |  2 ++
 scripts/meson-buildoptions.sh |  3 +++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c1b1db1e28..a6751ec946 100644
--- a/meson.build
+++ b/meson.build
@@ -496,6 +496,18 @@ if cocoa.found() and get_option('gtk').enabled()
   error('Cocoa and GTK+ cannot be enabled at the same time')
 endif
 
+vmnet = dependency('appleframeworks', modules: 'vmnet', required: 
get_option('vmnet'))
+if vmnet.found() and not cc.has_header_symbol('vmnet/vmnet.h',
+  'VMNET_SHARING_SERVICE_BUSY',
+  dependencies: vmnet)
+  vmnet = not_found
+  if get_option('vmnet').enabled()
+error('vmnet.framework API is outdated')
+  else
+warning('vmnet.framework API is outdated, disabling')
+  endif
+endif
+
 seccomp = not_found
 if not get_option('seccomp').auto() or have_system or have_tools
   seccomp = dependency('libseccomp', version: '>=2.3.0',
@@ -1492,6 +1504,7 @@ config_host_data.set('CONFIG_SECCOMP', seccomp.found())
 config_host_data.set('CONFIG_SNAPPY', snappy.found())
 config_host_data.set('CONFIG_USB_LIBUSB', libusb.found())
 config_host_data.set('CONFIG_VDE', vde.found())
+config_host_data.set('CONFIG_VMNET', vmnet.found())
 config_host_data.set('CONFIG_VHOST_USER_BLK_SERVER', 
have_vhost_user_blk_server)
 config_host_data.set('CONFIG_VNC', vnc.found())
 config_host_data.set('CONFIG_VNC_JPEG', jpeg.found())
@@ -3406,7 +3419,8 @@ summary(summary_info, bool_yn: true, section: 'Crypto')
 # Libraries
 summary_info = {}
 if targetos == 'darwin'
-  summary_info += {'Cocoa support':   cocoa}
+  summary_info += {'Cocoa support':   cocoa}
+  summary_info += {'vmnet.framework support': vmnet}
 endif
 summary_info += {'SDL support':   sdl}
 summary_info += {'SDL image support': sdl_image}
diff --git a/meson_options.txt b/meson_options.txt
index 921967eddb..701e1381f9 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -151,6 +151,8 @@ option('netmap', type : 'feature', value : 'auto',
description: 'netmap network backend support')
 option('vde', type : 'feature', value : 'auto',
description: 'vde network backend support')
+option('vmnet', type : 'feature', value : 'auto',
+   description: 'vmnet.framework network backend support')
 option('virglrenderer', type : 'feature', value : 'auto',
description: 'virgl rendering support')
 option('vnc', type : 'feature', value : 'auto',
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 50bd7bed4d..cdcece4b05 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -84,6 +84,7 @@ meson_options_help() {
   printf "%s\n" '  u2f U2F emulation support'
   printf "%s\n" '  usb-redir   libusbredir support'
   printf "%s\n" '  vde vde network backend support'
+  printf "%s\n" '  vmnet   vmnet.framework network backend support'
   printf "%s\n" '  vhost-user-blk-server'
   printf "%s\n" '  build vhost-user-blk server'
   printf "%s\n" '  virglrenderer   virgl rendering support'
@@ -248,6 +249,8 @@ _meson_option_parse() {
 --disable-usb-redir) printf "%s" -Dusb_redir=disabled ;;
 --enable-vde) printf "%s" -Dvde=enabled ;;
 --disable-vde) printf "%s" -Dvde=disabled ;;
+--enable-vmnet) printf "%s" -Dvmnet=enabled ;;
+--disable-vmnet) printf "%s" -Dvmnet=disabled ;;
 --enable-vhost-user-blk-server) printf "%s" 
-Dvhost_user_blk_server=enabled ;;
 --disable-vhost-user-blk-server) printf "%s" 
-Dvhost_user_blk_server=disabled ;;
 --enable-virglrenderer) printf "%s" -Dvirglrenderer=enabled ;;
-- 
2.23.0




[PATCH v10 2/7] net/vmnet: add vmnet backends to qapi/net

2022-01-11 Thread Vladislav Yaroshchuk
Create separate netdevs for each vmnet operating mode:
- vmnet-host
- vmnet-shared
- vmnet-bridged

Signed-off-by: Vladislav Yaroshchuk 
---
 net/clients.h   |  11 
 net/meson.build |   7 +++
 net/net.c   |  10 
 net/vmnet-bridged.m |  25 +
 net/vmnet-common.m  |  20 +++
 net/vmnet-host.c|  24 
 net/vmnet-shared.c  |  25 +
 net/vmnet_int.h |  25 +
 qapi/net.json   | 132 +++-
 9 files changed, 277 insertions(+), 2 deletions(-)
 create mode 100644 net/vmnet-bridged.m
 create mode 100644 net/vmnet-common.m
 create mode 100644 net/vmnet-host.c
 create mode 100644 net/vmnet-shared.c
 create mode 100644 net/vmnet_int.h

diff --git a/net/clients.h b/net/clients.h
index 92f9b59aed..c9157789f2 100644
--- a/net/clients.h
+++ b/net/clients.h
@@ -63,4 +63,15 @@ int net_init_vhost_user(const Netdev *netdev, const char 
*name,
 
 int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
 NetClientState *peer, Error **errp);
+#ifdef CONFIG_VMNET
+int net_init_vmnet_host(const Netdev *netdev, const char *name,
+  NetClientState *peer, Error **errp);
+
+int net_init_vmnet_shared(const Netdev *netdev, const char *name,
+  NetClientState *peer, Error **errp);
+
+int net_init_vmnet_bridged(const Netdev *netdev, const char *name,
+  NetClientState *peer, Error **errp);
+#endif /* CONFIG_VMNET */
+
 #endif /* QEMU_NET_CLIENTS_H */
diff --git a/net/meson.build b/net/meson.build
index 847bc2ac85..00a88c4951 100644
--- a/net/meson.build
+++ b/net/meson.build
@@ -42,4 +42,11 @@ softmmu_ss.add(when: 'CONFIG_POSIX', if_true: 
files(tap_posix))
 softmmu_ss.add(when: 'CONFIG_WIN32', if_true: files('tap-win32.c'))
 softmmu_ss.add(when: 'CONFIG_VHOST_NET_VDPA', if_true: files('vhost-vdpa.c'))
 
+vmnet_files = files(
+  'vmnet-common.m',
+  'vmnet-bridged.m',
+  'vmnet-host.c',
+  'vmnet-shared.c'
+)
+softmmu_ss.add(when: vmnet, if_true: vmnet_files)
 subdir('can')
diff --git a/net/net.c b/net/net.c
index f0d14dbfc1..1dbb64b935 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1021,6 +1021,11 @@ static int (* const 
net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
 #ifdef CONFIG_L2TPV3
 [NET_CLIENT_DRIVER_L2TPV3]= net_init_l2tpv3,
 #endif
+#ifdef CONFIG_VMNET
+[NET_CLIENT_DRIVER_VMNET_HOST] = net_init_vmnet_host,
+[NET_CLIENT_DRIVER_VMNET_SHARED] = net_init_vmnet_shared,
+[NET_CLIENT_DRIVER_VMNET_BRIDGED] = net_init_vmnet_bridged,
+#endif /* CONFIG_VMNET */
 };
 
 
@@ -1106,6 +,11 @@ void show_netdevs(void)
 #endif
 #ifdef CONFIG_VHOST_VDPA
 "vhost-vdpa",
+#endif
+#ifdef CONFIG_VMNET
+"vmnet-host",
+"vmnet-shared",
+"vmnet-bridged",
 #endif
 };
 
diff --git a/net/vmnet-bridged.m b/net/vmnet-bridged.m
new file mode 100644
index 00..4e42a90391
--- /dev/null
+++ b/net/vmnet-bridged.m
@@ -0,0 +1,25 @@
+/*
+ * vmnet-bridged.m
+ *
+ * Copyright(c) 2021 Vladislav Yaroshchuk 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/qapi-types-net.h"
+#include "vmnet_int.h"
+#include "clients.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+
+#include <vmnet/vmnet.h>
+
+int net_init_vmnet_bridged(const Netdev *netdev, const char *name,
+   NetClientState *peer, Error **errp)
+{
+  error_setg(errp, "vmnet-bridged is not implemented yet");
+  return -1;
+}
diff --git a/net/vmnet-common.m b/net/vmnet-common.m
new file mode 100644
index 00..532d152840
--- /dev/null
+++ b/net/vmnet-common.m
@@ -0,0 +1,20 @@
+/*
+ * vmnet-common.m - network client wrapper for Apple vmnet.framework
+ *
+ * Copyright(c) 2021 Vladislav Yaroshchuk 
+ * Copyright(c) 2021 Phillip Tennen 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/qapi-types-net.h"
+#include "vmnet_int.h"
+#include "clients.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+
+#include <vmnet/vmnet.h>
+
diff --git a/net/vmnet-host.c b/net/vmnet-host.c
new file mode 100644
index 00..4a5ef99dc7
--- /dev/null
+++ b/net/vmnet-host.c
@@ -0,0 +1,24 @@
+/*
+ * vmnet-host.c
+ *
+ * Copyright(c) 2021 Vladislav Yaroshchuk 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/qapi-types-net.h"
+#include "vmnet_int.h"
+#include "clients.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+
+#include <vmnet/vmnet.h>
+
+int net_init_vmnet_host(const Netdev *netdev, const char *name,
+NetClientState *peer, Error **errp) {
+  error_setg(errp, "vmnet-host is not implemented yet");
+  return -1;

[PATCH v10 0/7] Add vmnet.framework based network backend

2022-01-11 Thread Vladislav Yaroshchuk
macOS provides a networking API for VMs called 'vmnet.framework':
https://developer.apple.com/documentation/vmnet

We can support it with new QEMU network backends which
represent three different vmnet.framework interface usage modes:

  * `vmnet-shared`:
allows the guest to communicate with other guests in shared mode and
also with external network (Internet) via NAT. Has (macOS-provided)
DHCP server; subnet mask and IP range can be configured;

  * `vmnet-host`:
allows the guest to communicate with other guests in host mode.
By default DHCP is enabled, as in `vmnet-shared`, but providing a
unique network id (uuid) isolates `vmnet-host` interfaces
from each other and also disables DHCP.

  * `vmnet-bridged`:
bridges the guest with a physical network interface.

These backends cannot work on macOS Catalina 10.15 because we use
vmnet.framework API provided only with macOS 11 and newer. This does not
seem to be a problem, because QEMU guarantees to work on the two most
recent versions of macOS, which are now Big Sur (11) and Monterey (12).

Also, we have one inconvenient restriction: vmnet.framework interfaces
can be created only by a privileged user:
`$ sudo qemu-system-x86_64 -nic vmnet-shared`

An attempt to create a `vmnet-*` netdev as an unprivileged user fails with
vmnet's 'general failure'.

This happens because vmnet.framework requires `com.apple.vm.networking`
entitlement which is: "restricted to developers of virtualization software.
To request this entitlement, contact your Apple representative." as Apple
documentation says:
https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_vm_networking

One more note: some quite useful 'vmnet.framework' features are still
not supported, such as creating port forwarding rules, specifying the
IPv6 NAT prefix and so on.

Nevertheless, the new backends work fine and were tested with
`qemu-system-x86_64` on a macOS Big Sur 11.5.2 host with the following
NIC models:
  * e1000-82545em
  * virtio-net-pci
  * vmxnet3

The guests were:
  * macOS 10.15.7
  * Ubuntu Bionic (server cloudimg)


This series partially reuses patches by Phillip Tennen:
https://patchew.org/QEMU/20210218134947.1860-1-phillip.en...@gmail.com/
So I included his Signed-off-by line in one of the commit messages and
also here.

v1 -> v2:
 Since v1 minor typos were fixed, patches rebased onto latest master,
 redundant changes removed (small commits squashed)
v2 -> v3:
 - QAPI style fixes
 - Typos fixes in comments
 - `#include`'s updated to be in sync with recent master
v3 -> v4:
 - Support vmnet interfaces isolation feature
 - Support vmnet-host network uuid setting feature
 - Refactored sources a bit
v4 -> v5:
 - Missed 6.2 boat, now 7.0 candidate
 - Fix qapi netdev descriptions and styles
   (@subnetmask -> @subnet-mask)
 - Support vmnet-shared IPv6 prefix setting feature
v5 -> v6
 - provide detailed commit messages for commits of
   many changes
 - rename properties @dhcpstart and @dhcpend to
   @start-address and @end-address
 - improve qapi documentation about isolation
   features (@isolated, @net-uuid)
v6 -> v7:
 - update MAINTAINERS list
v7 -> v8
 - QAPI code style fixes
v8 -> v9
 - Fix building on Linux: add missing qapi
   `'if': 'CONFIG_VMNET'` statement to Netdev union
v9 -> v10
 - Disable vmnet feature for macOS < 11.0: add
   vmnet.framework API probe into meson.build.
   This fixes QEMU building on macOS < 11.0:
   https://patchew.org/QEMU/20220110034000.20221-1-jasow...@redhat.com/

Vladislav Yaroshchuk (7):
  net/vmnet: add vmnet dependency and customizable option
  net/vmnet: add vmnet backends to qapi/net
  net/vmnet: implement shared mode (vmnet-shared)
  net/vmnet: implement host mode (vmnet-host)
  net/vmnet: implement bridged mode (vmnet-bridged)
  net/vmnet: update qemu-options.hx
  net/vmnet: update MAINTAINERS list

 MAINTAINERS   |   5 +
 meson.build   |  16 +-
 meson_options.txt |   2 +
 net/clients.h |  11 ++
 net/meson.build   |   7 +
 net/net.c |  10 ++
 net/vmnet-bridged.m   | 111 
 net/vmnet-common.m| 330 ++
 net/vmnet-host.c  | 105 +++
 net/vmnet-shared.c|  92 ++
 net/vmnet_int.h   |  48 +
 qapi/net.json | 132 +-
 qemu-options.hx   |  25 +++
 scripts/meson-buildoptions.sh |   3 +
 14 files changed, 894 insertions(+), 3 deletions(-)
 create mode 100644 net/vmnet-bridged.m
 create mode 100644 net/vmnet-common.m
 create mode 100644 net/vmnet-host.c
 create mode 100644 net/vmnet-shared.c
 create mode 100644 net/vmnet_int.h

-- 
2.23.0




Re: [PATCH] linux-user: rt_sigprocmask, check read perms first

2022-01-11 Thread Laurent Vivier

Hi Patrick,

Le 11/01/2022 à 21:14, Patrick Venture a écrit :



On Sat, Jan 8, 2022 at 10:16 AM Laurent Vivier <laur...@vivier.eu> wrote:

Le 06/01/2022 à 23:00, Patrick Venture a écrit :
 > From: Shu-Chun Weng <s...@google.com>
 >
 > Linux kernel does it this way (checks read permission before validating `how`)
 > and the latest version of ABSL's `AddressIsReadable()` depends on this
 > behavior.
 >
 > c.f. https://github.com/torvalds/linux/blob/9539ba4308ad5bdca6cb41c7b73cbb9f796dcdd7/kernel/signal.c#L3147
 > Reviewed-by: Patrick Venture <vent...@google.com>
 > Signed-off-by: Shu-Chun Weng <s...@google.com>
 > ---
 >   linux-user/syscall.c | 10 +-
 >   1 file changed, 5 insertions(+), 5 deletions(-)
 >
 > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
 > index ce9d64896c..3070d31f34 100644
 > --- a/linux-user/syscall.c
 > +++ b/linux-user/syscall.c
 > @@ -9491,6 +9491,11 @@ static abi_long do_syscall1(void *cpu_env, int 
num, abi_long arg1,
 >               }
 >
 >               if (arg2) {
 > +                if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
 > +                    return -TARGET_EFAULT;
 > +                target_to_host_sigset(&set, p);
 > +                unlock_user(p, arg2, 0);
 > +                set_ptr = &set;
 >                   switch(how) {
 >                   case TARGET_SIG_BLOCK:
 >                       how = SIG_BLOCK;
 > @@ -9504,11 +9509,6 @@ static abi_long do_syscall1(void *cpu_env, int 
num, abi_long arg1,
 >                   default:
 >                       return -TARGET_EINVAL;
 >                   }
 > -                if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
 > -                    return -TARGET_EFAULT;
 > -                target_to_host_sigset(&set, p);
 > -                unlock_user(p, arg2, 0);
 > -                set_ptr = &set;
 >               } else {
 >                   how = 0;
 >                   set_ptr = NULL;

I know it's only a code move, but generally we also update the style to pass
scripts/checkpatch.pl successfully.


That is a reasonable request; however, can I just send a follow-on patch?  I didn't write this one
and I honestly don't know much about it, but I don't mind doing the cleanup.



Could you also update TARGET_NR_sigprocmask in the same way, as it seems the
kernel behaves like this too in this case?


I can take a look.  I would prefer then to also prefetch the style fixup in a preceding patch. I 
don't recall seeing whether qemu supports clang-format.




There is no problem. You can keep this patch unmodified, and add patches to fix 
the problems.

I only ask to have all the patches in one series.

Thanks,
Laurent
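
The ordering discussed in this thread (verify that the new mask is readable before validating `how`) can be sketched in isolation. This is only an illustration under stated assumptions: `sketch_sigprocmask_check`, the `SKETCH_SIG_*` values and the `set_readable` flag are hypothetical stand-ins, not QEMU or kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the guest-side "how" values. */
enum { SKETCH_SIG_BLOCK = 0, SKETCH_SIG_UNBLOCK = 1, SKETCH_SIG_SETMASK = 2 };

/*
 * Returns 0 on success or a negative errno value.
 * set == NULL models "no new mask supplied"; set_readable models
 * whether the guest pointer passes the read-permission check.
 */
static int sketch_sigprocmask_check(int how, const void *set, bool set_readable)
{
    if (set == NULL) {
        return 0;           /* nothing to read, "how" is not validated */
    }
    if (!set_readable) {
        return -EFAULT;     /* read-permission check comes first */
    }
    switch (how) {
    case SKETCH_SIG_BLOCK:
    case SKETCH_SIG_UNBLOCK:
    case SKETCH_SIG_SETMASK:
        return 0;
    default:
        return -EINVAL;     /* only reached when the mask was readable */
    }
}
```

With this ordering, a caller that passes an unreadable pointer together with a bogus `how` gets EFAULT rather than EINVAL, which is the behavior the commit message says `AddressIsReadable()` relies on.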




Re: [PATCH] nbd/server.c: Remove unused field

2022-01-11 Thread Philippe Mathieu-Daudé
On 1/11/22 20:43, Nir Soffer wrote:
> NBDRequestData struct has unused QSIMPLEQ_ENTRY field. It seems that
> this field exists since the first git commit and was never used.
> 
> Signed-off-by: Nir Soffer 
> ---
>  nbd/server.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH] tests/qtest/hd-geo-test: Check for the lsi53c895a controller before using it

2022-01-11 Thread John Snow
On Wed, Dec 22, 2021 at 10:36 AM Thomas Huth  wrote:
>
> The lsi53c895a SCSI controller might have been disabled in the target
> binary, so let's check for its availability first before using it.
>
> Signed-off-by: Thomas Huth 
> ---
>  tests/qtest/hd-geo-test.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tests/qtest/hd-geo-test.c b/tests/qtest/hd-geo-test.c
> index 113126ae06..771eaa741b 100644
> --- a/tests/qtest/hd-geo-test.c
> +++ b/tests/qtest/hd-geo-test.c
> @@ -960,9 +960,11 @@ int main(int argc, char **argv)
>  qtest_add_func("hd-geo/ide/device/user/chst", test_ide_device_user_chst);
>  if (have_qemu_img()) {
>  qtest_add_func("hd-geo/override/ide", test_override_ide);
> -qtest_add_func("hd-geo/override/scsi", test_override_scsi);
> -qtest_add_func("hd-geo/override/scsi_2_controllers",
> -   test_override_scsi_2_controllers);
> +if (qtest_has_device("lsi53c895a")) {
> +qtest_add_func("hd-geo/override/scsi", test_override_scsi);
> +qtest_add_func("hd-geo/override/scsi_2_controllers",
> +   test_override_scsi_2_controllers);
> +}
>  qtest_add_func("hd-geo/override/virtio_blk", 
> test_override_virtio_blk);
>  qtest_add_func("hd-geo/override/zero_chs", test_override_zero_chs);
>  qtest_add_func("hd-geo/override/scsi_hot_unplug",
> --
> 2.27.0
>

Acked-by: John Snow 
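
The pattern in this patch (probe for a device, then register only the tests that need it) can be sketched as follows. The device list and the `sketch_*` helpers are hypothetical; the real test uses `qtest_has_device()` and `qtest_add_func()` as shown in the diff above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of devices compiled into the binary under test. */
static const char *const sketch_built_in[] = { "virtio-blk", "ide-hd" };

/* Stand-in for a qtest_has_device()-style probe. */
static bool sketch_has_device(const char *name)
{
    size_t n = sizeof(sketch_built_in) / sizeof(sketch_built_in[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(sketch_built_in[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Registers tests and returns how many were registered. */
static int sketch_register_tests(void)
{
    int registered = 0;

    registered++;                   /* a test with no optional device */
    if (sketch_has_device("lsi53c895a")) {
        registered += 2;            /* the two SCSI override tests */
    }
    return registered;
}
```

Skipping registration, rather than failing inside the test, keeps the suite green on builds where the controller was configured out.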




[PATCH 0/1] ppc/pnv: use stack->pci_regs[] in pnv_pec_stk_pci_xscom_write()

2022-01-11 Thread Daniel Henrique Barboza
Hi,

This is something that caught my eye when I was looking into the
instances where we need stack properties versus phb4 properties.

I tested this fix and it doesn't seem to impact the boot process
whatsoever. Tracing pnv_pec_stk_pci_xscom_write() shows that the writes
are being done at early boot and then nothing else. There might be a
future bug that we're fixing beforehand with this patch as well.

At the very least, the code now makes more sense, in my estimation.

Daniel Henrique Barboza (1):
  ppc/pnv: use stack->pci_regs[] in pnv_pec_stk_pci_xscom_write()

 hw/pci-host/pnv_phb4.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

-- 
2.33.1




[PATCH 1/1] ppc/pnv: use stack->pci_regs[] in pnv_pec_stk_pci_xscom_write()

2022-01-11 Thread Daniel Henrique Barboza
pnv_pec_stk_pci_xscom_write() is the write callback of
pnv_pec_stk_pci_xscom_ops. It writes values into regs in the
stack->nest_regs[] array. The pnv_pec_stk_pci_xscom_read() read callback,
on the other hand, returns values from stack->pci_regs[]. In fact, at
this moment, the only use of stack->pci_regs[] is in
pnv_pec_stk_pci_xscom_read(). No code writes anything to
stack->pci_regs[], which is suspicious.

Considering that stack->nest_regs[] is widely used by the nested
MemoryOps pnv_pec_stk_nest_xscom_ops, in both read and write callbacks,
the conclusion is that we're writing the wrong array in
pnv_pec_stk_pci_xscom_write(). This function should write stack->pci_regs[]
instead.

Signed-off-by: Daniel Henrique Barboza 
---
 hw/pci-host/pnv_phb4.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index be29174f13..a7b638831e 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1086,39 +1086,39 @@ static void pnv_pec_stk_pci_xscom_write(void *opaque, 
hwaddr addr,
 
 switch (reg) {
 case PEC_PCI_STK_PCI_FIR:
-stack->nest_regs[reg] = val;
+stack->pci_regs[reg] = val;
 break;
 case PEC_PCI_STK_PCI_FIR_CLR:
-stack->nest_regs[PEC_PCI_STK_PCI_FIR] &= val;
+stack->pci_regs[PEC_PCI_STK_PCI_FIR] &= val;
 break;
 case PEC_PCI_STK_PCI_FIR_SET:
-stack->nest_regs[PEC_PCI_STK_PCI_FIR] |= val;
+stack->pci_regs[PEC_PCI_STK_PCI_FIR] |= val;
 break;
 case PEC_PCI_STK_PCI_FIR_MSK:
-stack->nest_regs[reg] = val;
+stack->pci_regs[reg] = val;
 break;
 case PEC_PCI_STK_PCI_FIR_MSKC:
-stack->nest_regs[PEC_PCI_STK_PCI_FIR_MSK] &= val;
+stack->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] &= val;
 break;
 case PEC_PCI_STK_PCI_FIR_MSKS:
-stack->nest_regs[PEC_PCI_STK_PCI_FIR_MSK] |= val;
+stack->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] |= val;
 break;
 case PEC_PCI_STK_PCI_FIR_ACT0:
 case PEC_PCI_STK_PCI_FIR_ACT1:
-stack->nest_regs[reg] = val;
+stack->pci_regs[reg] = val;
 break;
 case PEC_PCI_STK_PCI_FIR_WOF:
-stack->nest_regs[reg] = 0;
+stack->pci_regs[reg] = 0;
 break;
 case PEC_PCI_STK_ETU_RESET:
-stack->nest_regs[reg] = val & 0x8000ull;
+stack->pci_regs[reg] = val & 0x8000ull;
 /* TODO: Implement reset */
 break;
 case PEC_PCI_STK_PBAIB_ERR_REPORT:
 break;
 case PEC_PCI_STK_PBAIB_TX_CMD_CRED:
 case PEC_PCI_STK_PBAIB_TX_DAT_CRED:
-stack->nest_regs[reg] = val;
+stack->pci_regs[reg] = val;
 break;
 default:
 qemu_log_mask(LOG_UNIMP, "phb4_pec_stk: pci_xscom_write 0x%"HWADDR_PRIx
-- 
2.33.1




Re: [PATCH v4 05/23] migration: simplify do_compress_ram_page

2022-01-11 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
> The goto is not needed at all.

Another dupe,


Reviewed-by: Dr. David Alan Gilbert 

> Signed-off-by: Juan Quintela 
> ---
>  migration/ram.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index fa49d22e69..422c6bce28 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1341,12 +1341,11 @@ static bool do_compress_ram_page(QEMUFile *f, 
> z_stream *stream, RAMBlock *block,
>  {
>  RAMState *rs = ram_state;
>  uint8_t *p = block->host + offset;
> -bool zero_page = false;
>  int ret;
>  
>  if (save_zero_page_to_file(rs, f, block, offset)) {
> -zero_page = true;
> -goto exit;
> +ram_release_page(block->idstr, offset);
> +return true;
>  }
>  
>  save_page_header(rs, f, block, offset | RAM_SAVE_FLAG_COMPRESS_PAGE);
> @@ -1361,12 +1360,8 @@ static bool do_compress_ram_page(QEMUFile *f, z_stream 
> *stream, RAMBlock *block,
>  if (ret < 0) {
>  qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
>  error_report("compressed data failed!");
> -return false;
>  }
> -
> -exit:
> -ram_release_page(block->idstr, offset);
> -return zero_page;
> +return false;
>  }
>  
>  static void
> -- 
> 2.34.1
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v4 04/23] migration: Remove masking for compression

2022-01-11 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
> Remove the mask in the call to ram_release_pages().  Nothing else does
> it, and if the offset has those bits set, we have a lot of trouble.
> 
> Signed-off-by: Juan Quintela 

Yeh same as in the other branch

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 881fe4974e..fa49d22e69 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1340,7 +1340,7 @@ static bool do_compress_ram_page(QEMUFile *f, z_stream 
> *stream, RAMBlock *block,
>   ram_addr_t offset, uint8_t *source_buf)
>  {
>  RAMState *rs = ram_state;
> -uint8_t *p = block->host + (offset & TARGET_PAGE_MASK);
> +uint8_t *p = block->host + offset;
>  bool zero_page = false;
>  int ret;
>  
> @@ -1365,7 +1365,7 @@ static bool do_compress_ram_page(QEMUFile *f, z_stream 
> *stream, RAMBlock *block,
>  }
>  
>  exit:
> -ram_release_page(block->idstr, offset & TARGET_PAGE_MASK);
> +ram_release_page(block->idstr, offset);
>  return zero_page;
>  }
>  
> -- 
> 2.34.1
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
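
Why dropping the mask is harmless for page-aligned offsets can be shown with a small sketch. The 4 KiB page size and the `SKETCH_*` names are assumptions made for illustration; QEMU's `TARGET_PAGE_MASK` depends on the target.

```c
#include <stdint.h>

/* Assumed 4 KiB pages, for illustration only. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_SIZE ((uint64_t)1 << SKETCH_PAGE_BITS)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Clears the bits below the page size, as "offset & PAGE_MASK" does. */
static uint64_t sketch_mask_offset(uint64_t offset)
{
    return offset & SKETCH_PAGE_MASK;
}
```

For an aligned offset the mask is a no-op; for an unaligned one it silently rounds down, which would hide a caller bug rather than fix it.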




Re: [PATCH v4 07/11] target/riscv: Support mcycle/minstret write operation

2022-01-11 Thread Atish Patra
On Sun, Jan 9, 2022 at 11:51 PM Bin Meng  wrote:
>
> On Fri, Jan 7, 2022 at 10:14 AM Atish Patra  wrote:
> >
> > From: Atish Patra 
> >
> > mcycle/minstret are actually WARL registers and can be written with any
> > given value. With the SBI PMU extension, they will be used to store an
> > initial value provided by the supervisor OS. QEMU also needs to prohibit
> > the counter increment if mcountinhibit is set.
> >
> > Support mcycle/minstret through generic counter infrastructure.
> >
> > Signed-off-by: Atish Patra 
> > Signed-off-by: Atish Patra 
> > ---
> >  target/riscv/cpu.h   |  24 +--
> >  target/riscv/csr.c   | 144 ++-
> >  target/riscv/machine.c   |  26 ++-
> >  target/riscv/meson.build |   1 +
> >  target/riscv/pmu.c   |  32 +
> >  target/riscv/pmu.h   |  28 
> >  6 files changed, 200 insertions(+), 55 deletions(-)
> >  create mode 100644 target/riscv/pmu.c
> >  create mode 100644 target/riscv/pmu.h
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index 39edc948d703..5fe9c51b38c7 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -101,7 +101,7 @@ typedef struct CPURISCVState CPURISCVState;
> >  #endif
> >
> >  #define RV_VLEN_MAX 1024
> > -#define RV_MAX_MHPMEVENTS 29
> > +#define RV_MAX_MHPMEVENTS 32
> >  #define RV_MAX_MHPMCOUNTERS 32
> >
> >  FIELD(VTYPE, VLMUL, 0, 3)
> > @@ -112,6 +112,19 @@ FIELD(VTYPE, VEDIV, 8, 2)
> >  FIELD(VTYPE, RESERVED, 10, sizeof(target_ulong) * 8 - 11)
> >  FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
> >
> > +typedef struct PMUCTRState PMUCTRState;
>
> This 'typedef' can be merged into the definition below
>

Sure.

>
> > +struct PMUCTRState {
> > +/* Current value of a counter */
> > +target_ulong mhpmcounter_val;
> > +/* Current value of a counter in RV32*/
> > +target_ulong mhpmcounterh_val;
> > +/* Snapshot values of counter */
> > +target_ulong mhpmcounter_prev;
> > +/* Snapshort value of a counter in RV32 */
> > +target_ulong mhpmcounterh_prev;
> > +bool started;
> > +};
> > +
> >  struct CPURISCVState {
> >  target_ulong gpr[32];
> >  uint64_t fpr[32]; /* assume both F and D extensions */
> > @@ -226,13 +239,10 @@ struct CPURISCVState {
> >
> >  target_ulong mcountinhibit;
> >
> > -/* PMU counter configured values */
> > -target_ulong mhpmcounter_val[RV_MAX_MHPMCOUNTERS];
> > -
> > -/* for RV32 */
> > -target_ulong mhpmcounterh_val[RV_MAX_MHPMCOUNTERS];
> > +/* PMU counter state */
> > +PMUCTRState pmu_ctrs[RV_MAX_MHPMCOUNTERS];
> >
> > -/* PMU event selector configured values */
> > +/* PMU event selector configured values. First three are unused*/
> >  target_ulong mhpmevent_val[RV_MAX_MHPMEVENTS];
> >
> >  target_ulong sscratch;
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 58a9550bd898..d4449ada557c 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -20,6 +20,7 @@
> >  #include "qemu/osdep.h"
> >  #include "qemu/log.h"
> >  #include "cpu.h"
> > +#include "pmu.h"
> >  #include "qemu/main-loop.h"
> >  #include "exec/exec-all.h"
> >
> > @@ -461,41 +462,33 @@ static int write_vcsr(CPURISCVState *env, int csrno, 
> > target_ulong val)
> >  }
> >
> >  /* User Timers and Counters */
> > -static RISCVException read_instret(CPURISCVState *env, int csrno,
> > -   target_ulong *val)
> > +static target_ulong get_icount_ticks(bool brv32)
>
> I would use 'rv32' instead of 'brv32'
>

ok.

> >  {
> > +int64_t val;
> > +target_ulong result;
> > +
> >  #if !defined(CONFIG_USER_ONLY)
> >  if (icount_enabled()) {
> > -*val = icount_get();
> > +val = icount_get();
> >  } else {
> > -*val = cpu_get_host_ticks();
> > +val = cpu_get_host_ticks();
> >  }
> >  #else
> > -*val = cpu_get_host_ticks();
> > +val = cpu_get_host_ticks();
> >  #endif
> >
> > -return RISCV_EXCP_NONE;
> > -}
> > -
> > -static RISCVException read_instreth(CPURISCVState *env, int csrno,
> > -target_ulong *val)
> > -{
> > -#if !defined(CONFIG_USER_ONLY)
> > -if (icount_enabled()) {
> > -*val = icount_get() >> 32;
> > +if (brv32) {
> > +result = val >> 32;
> >  } else {
> > -*val = cpu_get_host_ticks() >> 32;
> > +result = val;
> >  }
> > -#else
> > -*val = cpu_get_host_ticks() >> 32;
> > -#endif
> >
> > -return RISCV_EXCP_NONE;
> > +return result;
> >  }
> >
> >  static int read_mhpmevent(CPURISCVState *env, int csrno, target_ulong *val)
> >  {
> > -int evt_index = csrno - CSR_MHPMEVENT3;
> > +int evt_index = csrno - CSR_MCOUNTINHIBIT;
> >
> >  *val = env->mhpmevent_val[evt_index];
> >
> > @@ -504,7 +497,7 @@ static int read_mhpmevent(CPURISCVState *env, int 
> > csrno, target_ulong *val)
> >
> >  static int write_mhpmevent(CPURISCVState *env, int csrno, target_ulong 
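
The half-selection done by the new `get_icount_ticks()` helper in the quoted diff can be sketched for the 32-bit case. `sketch_counter_half` is a hypothetical name, and the sketch assumes a 64-bit tick count exposed through two 32-bit CSRs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical RV32-style view of a 64-bit counter: the "h" CSR exposes
 * the upper 32 bits, the plain CSR exposes the lower 32 bits.
 */
static uint32_t sketch_counter_half(uint64_t ticks, bool upper)
{
    return upper ? (uint32_t)(ticks >> 32) : (uint32_t)ticks;
}
```

Folding both reads into one helper with a flag is what lets the patch delete the separate `read_instreth()` body.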

Re: [PATCH 3/4] tests/qtest/cdrom-test: Check whether devices are available before using them

2022-01-11 Thread John Snow
On Mon, Dec 20, 2021 at 3:11 AM Thomas Huth  wrote:

> Downstream users might want to disable legacy devices in their binaries,
> so we should not blindly assume that they are available. Add some proper
> checks before using them.
>
> Signed-off-by: Thomas Huth 
> ---
>  tests/qtest/cdrom-test.c | 60 ++--
>  1 file changed, 39 insertions(+), 21 deletions(-)
>
> diff --git a/tests/qtest/cdrom-test.c b/tests/qtest/cdrom-test.c
> index c1fcac5c45..cfca24fa94 100644
> --- a/tests/qtest/cdrom-test.c
> +++ b/tests/qtest/cdrom-test.c
> @@ -142,21 +142,36 @@ static void add_x86_tests(void)
>  qtest_add_data_func("cdrom/boot/isapc", "-M isapc "
>  "-drive if=ide,media=cdrom,file=",
> test_cdboot);
>  }
> -qtest_add_data_func("cdrom/boot/am53c974",
> -"-device am53c974 -device scsi-cd,drive=cd1 "
> -"-drive if=none,id=cd1,format=raw,file=",
> test_cdboot);
> -qtest_add_data_func("cdrom/boot/dc390",
> -"-device dc390 -device scsi-cd,drive=cd1 "
> -"-blockdev file,node-name=cd1,filename=",
> test_cdboot);
> -qtest_add_data_func("cdrom/boot/lsi53c895a",
> -"-device lsi53c895a -device scsi-cd,drive=cd1 "
> -"-blockdev file,node-name=cd1,filename=",
> test_cdboot);
> -qtest_add_data_func("cdrom/boot/megasas", "-M q35 "
> -"-device megasas -device scsi-cd,drive=cd1 "
> -"-blockdev file,node-name=cd1,filename=",
> test_cdboot);
> -qtest_add_data_func("cdrom/boot/megasas-gen2", "-M q35 "
> -"-device megasas-gen2 -device scsi-cd,drive=cd1 "
> -"-blockdev file,node-name=cd1,filename=",
> test_cdboot);
> +if (qtest_has_device("am53c974")) {
> +qtest_add_data_func("cdrom/boot/am53c974",
> +"-device am53c974 -device scsi-cd,drive=cd1 "
> +"-drive if=none,id=cd1,format=raw,file=",
> +test_cdboot);
> +}
> +if (qtest_has_device("dc390")) {
> +qtest_add_data_func("cdrom/boot/dc390",
> +"-device dc390 -device scsi-cd,drive=cd1 "
> +"-blockdev file,node-name=cd1,filename=",
> +test_cdboot);
> +}
> +if (qtest_has_device("lsi53c895a")) {
> +qtest_add_data_func("cdrom/boot/lsi53c895a",
> +"-device lsi53c895a -device scsi-cd,drive=cd1
> "
> +"-blockdev file,node-name=cd1,filename=",
> +test_cdboot);
> +}
> +if (qtest_has_device("megasas")) {
> +qtest_add_data_func("cdrom/boot/megasas", "-M q35 "
> +"-device megasas -device scsi-cd,drive=cd1 "
> +"-blockdev file,node-name=cd1,filename=",
> +test_cdboot);
> +}
> +if (qtest_has_device("megasas-gen2")) {
> +qtest_add_data_func("cdrom/boot/megasas-gen2", "-M q35 "
> +"-device megasas-gen2 -device
> scsi-cd,drive=cd1 "
> +"-blockdev file,node-name=cd1,filename=",
> +test_cdboot);
> +}
>  }
>
>  static void add_s390x_tests(void)
> @@ -171,12 +186,15 @@ static void add_s390x_tests(void)
>  "-drive
> driver=null-co,read-zeroes=on,if=none,id=d1 "
>  "-device virtio-blk,drive=d2,bootindex=1 "
>  "-drive if=none,id=d2,media=cdrom,file=",
> test_cdboot);
> -qtest_add_data_func("cdrom/boot/without-bootindex",
> -"-device virtio-scsi -device virtio-serial "
> -"-device x-terminal3270 -device
> virtio-blk,drive=d1 "
> -"-drive
> driver=null-co,read-zeroes=on,if=none,id=d1 "
> -"-device virtio-blk,drive=d2 "
> -"-drive if=none,id=d2,media=cdrom,file=",
> test_cdboot);
> +if (qtest_has_device("x-terminal3270")) {
> +qtest_add_data_func("cdrom/boot/without-bootindex",
> +"-device virtio-scsi -device virtio-serial "
> +"-device x-terminal3270 -device
> virtio-blk,drive=d1 "
> +"-drive
> driver=null-co,read-zeroes=on,if=none,id=d1 "
> +"-device virtio-blk,drive=d2 "
> +"-drive if=none,id=d2,media=cdrom,file=",
> +test_cdboot);
> +}
>  }
>
>  int main(int argc, char **argv)
> --
> 2.27.0
>
>
Acked-by: John Snow 

These are really more your tests than mine :)

--js


[PULL 30/30] linux-user: Implement capability prctls

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

This is PR_CAPBSET_READ, PR_CAPBSET_DROP and the "legacy"
PR_CAP_AMBIENT, PR_GET_SECUREBITS, PR_SET_SECUREBITS.

All of these arguments are integer values only, and do not
require mapping of values between host and guest.

Signed-off-by: Richard Henderson 
Reviewed-by: Laurent Vivier 
Message-Id: <20220106225738.103012-5-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/syscall.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f9ae6328b53b..5950222a77e0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6504,10 +6504,15 @@ static abi_long do_prctl(CPUArchState *env, abi_long 
option, abi_long arg2,
 case PR_SET_UNALIGN:
 return do_prctl_set_unalign(env, arg2);
 
+case PR_CAP_AMBIENT:
+case PR_CAPBSET_READ:
+case PR_CAPBSET_DROP:
 case PR_GET_DUMPABLE:
 case PR_SET_DUMPABLE:
 case PR_GET_KEEPCAPS:
 case PR_SET_KEEPCAPS:
+case PR_GET_SECUREBITS:
+case PR_SET_SECUREBITS:
 case PR_GET_TIMING:
 case PR_SET_TIMING:
 case PR_GET_TIMERSLACK:
-- 
2.33.1




Re: [PATCH] linux-user: rt_sigprocmask, check read perms first

2022-01-11 Thread Patrick Venture
On Sat, Jan 8, 2022 at 10:16 AM Laurent Vivier  wrote:

> Le 06/01/2022 à 23:00, Patrick Venture a écrit :
> > From: Shu-Chun Weng 
> >
> > Linux kernel does it this way (checks read permission before validating `how`)
> > and the latest version of ABSL's `AddressIsReadable()` depends on this
> > behavior.
> >
> > c.f. https://github.com/torvalds/linux/blob/9539ba4308ad5bdca6cb41c7b73cbb9f796dcdd7/kernel/signal.c#L3147
> > Reviewed-by: Patrick Venture 
> > Signed-off-by: Shu-Chun Weng 
> > ---
> >   linux-user/syscall.c | 10 +-
> >   1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> > index ce9d64896c..3070d31f34 100644
> > --- a/linux-user/syscall.c
> > +++ b/linux-user/syscall.c
> > @@ -9491,6 +9491,11 @@ static abi_long do_syscall1(void *cpu_env, int
> num, abi_long arg1,
> >   }
> >
> >   if (arg2) {
> > +if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
> > +return -TARGET_EFAULT;
> > +target_to_host_sigset(&set, p);
> > +unlock_user(p, arg2, 0);
> > +set_ptr = &set;
> >   switch(how) {
> >   case TARGET_SIG_BLOCK:
> >   how = SIG_BLOCK;
> > @@ -9504,11 +9509,6 @@ static abi_long do_syscall1(void *cpu_env, int
> num, abi_long arg1,
> >   default:
> >   return -TARGET_EINVAL;
> >   }
> > -if (!(p = lock_user(VERIFY_READ, arg2, sizeof(target_sigset_t), 1)))
> > -return -TARGET_EFAULT;
> > -target_to_host_sigset(&set, p);
> > -unlock_user(p, arg2, 0);
> > -set_ptr = &set;
> >   } else {
> >   how = 0;
> >   set_ptr = NULL;
>
> I know it's only code move but generally we also update the style to pass
> scripts/checkpatch.pl
> successfully.
>

That is a reasonable request; however, can I just send a follow-on patch?
I didn't write this one and I honestly don't know much about it, but I
don't mind doing the cleanup.


>
> Could you also update TARGET_NR_sigprocmask in the same way, as it seems
> the kernel behaves like this too in this case?
>

I can take a look.  I would prefer then to also prefetch the style fixup in
a preceding patch. I don't recall seeing whether qemu supports clang-format.


>
> Thanks,
> Laurent
>

Patrick


[PULL 25/30] linux-user/arm: Move target_oabi_flock64 out of target_structs.h

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Place it next to copy_from/to_user_oabi_flock64, the only users,
inside the existing target-specific ifdef.  This leaves only
generic ipc structs in target_structs.h.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Laurent Vivier 
Message-Id: <20220107042600.149852-2-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/arm/target_structs.h | 8 
 linux-user/syscall.c| 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/linux-user/arm/target_structs.h b/linux-user/arm/target_structs.h
index 339b070bf1a5..25bf8dd3a5c9 100644
--- a/linux-user/arm/target_structs.h
+++ b/linux-user/arm/target_structs.h
@@ -48,12 +48,4 @@ struct target_shmid_ds {
 abi_ulong __unused4;
 abi_ulong __unused5;
 };
-
-struct target_oabi_flock64 {
-abi_short l_type;
-abi_short l_whence;
-abi_llong l_start;
-abi_llong l_len;
-abi_int   l_pid;
-} QEMU_PACKED;
 #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ce9d64896cb8..ca6e0b8fb0a1 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6927,6 +6927,14 @@ typedef abi_long from_flock64_fn(struct flock64 *fl, 
abi_ulong target_addr);
 typedef abi_long to_flock64_fn(abi_ulong target_addr, const struct flock64 
*fl);
 
 #if defined(TARGET_ARM) && TARGET_ABI_BITS == 32
+struct target_oabi_flock64 {
+abi_short l_type;
+abi_short l_whence;
+abi_llong l_start;
+abi_llong l_len;
+abi_int   l_pid;
+} QEMU_PACKED;
+
 static inline abi_long copy_from_user_oabi_flock64(struct flock64 *fl,
abi_ulong target_flock_addr)
 {
-- 
2.33.1




[PULL 16/30] target/mips: Extract break code into env->error_code

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Simplify cpu_loop by doing all of the decode in translate.

This fixes a bug in that cpu_loop was not handling the
different layout of the R6 version of break16.  It also fixes
a bug in that cpu_loop extracted the wrong bits for the
mips16e break16 instruction.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-17-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/mips/cpu_loop.c| 73 +++
 target/mips/tcg/micromips_translate.c.inc |  6 +-
 target/mips/tcg/mips16e_translate.c.inc   |  2 +-
 target/mips/tcg/translate.c   | 12 +++-
 target/mips/tcg/translate.h   |  1 +
 5 files changed, 25 insertions(+), 69 deletions(-)

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 1286fbc2e0d3..9a6ab2dd986a 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -65,6 +65,7 @@ void cpu_loop(CPUMIPSState *env)
 {
 CPUState *cs = env_cpu(env);
 int trapnr, si_code;
+unsigned int code;
 abi_long ret;
 # ifdef TARGET_ABI_MIPSO32
 unsigned int syscall_num;
@@ -185,71 +186,15 @@ done_syscall:
  * handling code in arch/mips/kernel/traps.c.
  */
 case EXCP_BREAK:
-{
-abi_ulong trap_instr;
-unsigned int code;
-
-/*
- * FIXME: It would be better to decode the trap number
- * during translate, and store it in error_code while
- * raising the exception.  We should not be re-reading
- * the opcode here.
- */
-
-if (env->hflags & MIPS_HFLAG_M16) {
-if (env->insn_flags & ASE_MICROMIPS) {
-/* microMIPS mode */
-ret = get_user_u16(trap_instr, env->active_tc.PC);
-if (ret != 0) {
-goto error;
-}
-
-if ((trap_instr >> 10) == 0x11) {
-/* 16-bit instruction */
-code = trap_instr & 0xf;
-} else {
-/* 32-bit instruction */
-abi_ulong instr_lo;
-
-ret = get_user_u16(instr_lo,
-   env->active_tc.PC + 2);
-if (ret != 0) {
-goto error;
-}
-trap_instr = (trap_instr << 16) | instr_lo;
-code = ((trap_instr >> 6) & ((1 << 20) - 1));
-/* Unfortunately, microMIPS also suffers from
-   the old assembler bug...  */
-if (code >= (1 << 10)) {
-code >>= 10;
-}
-}
-} else {
-/* MIPS16e mode */
-ret = get_user_u16(trap_instr, env->active_tc.PC);
-if (ret != 0) {
-goto error;
-}
-code = (trap_instr >> 6) & 0x3f;
-}
-} else {
-ret = get_user_u32(trap_instr, env->active_tc.PC);
-if (ret != 0) {
-goto error;
-}
-
-/* As described in the original Linux kernel code, the
- * below checks on 'code' are to work around an old
- * assembly bug.
- */
-code = ((trap_instr >> 6) & ((1 << 20) - 1));
-if (code >= (1 << 10)) {
-code >>= 10;
-}
-}
-
-do_tr_or_bp(env, code, false);
+/*
+ * As described in the original Linux kernel code, the below
+ * checks on 'code' are to work around an old assembly bug.
+ */
+code = env->error_code;
+if (code >= (1 << 10)) {
+code >>= 10;
 }
+do_tr_or_bp(env, code, false);
 break;
 case EXCP_TRAP:
 {
diff --git a/target/mips/tcg/micromips_translate.c.inc 
b/target/mips/tcg/micromips_translate.c.inc
index 0760941431e1..9013f8403739 100644
--- a/target/mips/tcg/micromips_translate.c.inc
+++ b/target/mips/tcg/micromips_translate.c.inc
@@ -822,7 +822,7 @@ static void gen_pool16c_insn(DisasContext *ctx)
 gen_HILO(ctx, OPC_MFLO, 0, uMIPS_RS5(ctx->opcode));
 break;
 case BREAK16:
-generate_exception_end(ctx, EXCP_BREAK);
+generate_exception_break(ctx, extract32(ctx->opcode, 0, 4));
 break;
 case SDBBP16:
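
The two steps visible in this patch (the translator extracts the break code from the opcode, and cpu_loop then only normalizes it) can be sketched as follows. `sketch_extract32` is a stand-in written for this example and `sketch_normalize_break_code` is a hypothetical name; the 4-bit field at bit 0 mirrors the BREAK16 call shown in the diff.

```c
#include <stdint.h>

/*
 * Stand-in bitfield helper: returns `length` bits of `value` starting at
 * bit `start`. Requires 0 < length <= 32 and start + length <= 32.
 */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/*
 * Work around the old assembler quirk mentioned in the diff: a code that
 * does not fit in 10 bits is taken to be stored in the upper 10 bits.
 */
static uint32_t sketch_normalize_break_code(uint32_t code)
{
    if (code >= (1u << 10)) {
        code >>= 10;
    }
    return code;
}
```

Decoding once at translation time means the user-mode loop no longer has to re-read the instruction from guest memory to find the code.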
 

[PULL 26/30] linux-user: Move target_struct.h generic definitions to generic/

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Most targets share the same generic ipc structure definitions.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Laurent Vivier 
Message-Id: <20220107042600.149852-3-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/aarch64/target_structs.h| 59 +-
 linux-user/arm/target_structs.h| 52 +--
 linux-user/cris/target_structs.h   | 59 +-
 linux-user/generic/target_structs.h| 58 +
 linux-user/hexagon/target_structs.h| 55 +---
 linux-user/i386/target_structs.h   | 59 +-
 linux-user/m68k/target_structs.h   | 59 +-
 linux-user/microblaze/target_structs.h | 59 +-
 linux-user/nios2/target_structs.h  | 59 +-
 linux-user/openrisc/target_structs.h   | 59 +-
 linux-user/riscv/target_structs.h  | 47 +---
 linux-user/sh4/target_structs.h| 59 +-
 linux-user/x86_64/target_structs.h | 36 +---
 13 files changed, 70 insertions(+), 650 deletions(-)
 create mode 100644 linux-user/generic/target_structs.h

diff --git a/linux-user/aarch64/target_structs.h 
b/linux-user/aarch64/target_structs.h
index 7c748344cabc..3a06f373c35a 100644
--- a/linux-user/aarch64/target_structs.h
+++ b/linux-user/aarch64/target_structs.h
@@ -1,58 +1 @@
-/*
- * ARM AArch64 specific structures for linux-user
- *
- * Copyright (c) 2013 Fabrice Bellard
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see .
- */
-#ifndef AARCH64_TARGET_STRUCTS_H
-#define AARCH64_TARGET_STRUCTS_H
-
-struct target_ipc_perm {
-abi_int __key;  /* Key.  */
-abi_uint uid;   /* Owner's user ID.  */
-abi_uint gid;   /* Owner's group ID.  */
-abi_uint cuid;  /* Creator's user ID.  */
-abi_uint cgid;  /* Creator's group ID.  */
-abi_ushort mode;/* Read/write permission.  */
-abi_ushort __pad1;
-abi_ushort __seq;   /* Sequence number.  */
-abi_ushort __pad2;
-abi_ulong __unused1;
-abi_ulong __unused2;
-};
-
-struct target_shmid_ds {
-struct target_ipc_perm shm_perm;/* operation permission struct */
-abi_long shm_segsz; /* size of segment in bytes */
-abi_ulong shm_atime;/* time of last shmat() */
-#if TARGET_ABI_BITS == 32
-abi_ulong __unused1;
-#endif
-abi_ulong shm_dtime;/* time of last shmdt() */
-#if TARGET_ABI_BITS == 32
-abi_ulong __unused2;
-#endif
-abi_ulong shm_ctime;/* time of last change by shmctl() */
-#if TARGET_ABI_BITS == 32
-abi_ulong __unused3;
-#endif
-abi_int shm_cpid;   /* pid of creator */
-abi_int shm_lpid;   /* pid of last shmop */
-abi_ulong shm_nattch;   /* number of current attaches */
-abi_ulong __unused4;
-abi_ulong __unused5;
-};
-
-#endif
+#include "../generic/target_structs.h"
diff --git a/linux-user/arm/target_structs.h b/linux-user/arm/target_structs.h
index 25bf8dd3a5c9..3a06f373c35a 100644
--- a/linux-user/arm/target_structs.h
+++ b/linux-user/arm/target_structs.h
@@ -1,51 +1 @@
-/*
- * ARM specific structures for linux-user
- *
- * Copyright (c) 2013 Fabrice Bellard
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see <http://www.gnu.org/licenses/>.
- */
-#ifndef ARM_TARGET_STRUCTS_H
-#define ARM_TARGET_STRUCTS_H
-
-struct target_ipc_perm {
-abi_int __key;  /* Key.  */
-  

[PULL 29/30] linux-user: Implement PR_SET_PDEATHSIG

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Laurent Vivier 
Message-Id: <20220106225738.103012-4-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/syscall.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b17cfe31c8b4..f9ae6328b53b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6450,6 +6450,9 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 }
 return ret;
 }
+case PR_SET_PDEATHSIG:
+return get_errno(prctl(PR_SET_PDEATHSIG, target_to_host_signal(arg2),
+   arg3, arg4, arg5));
 case PR_GET_NAME:
 {
 void *name = lock_user(VERIFY_WRITE, arg2, 16, 1);
-- 
2.33.1




[PULL 28/30] linux-user: Map signal number in PR_GET_PDEATHSIG

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Convert the host signal number to guest signal number
before returning the value to the guest.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20220106225738.103012-3-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/syscall.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index eff107b8bcfd..b17cfe31c8b4 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6444,7 +6444,8 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 int deathsig;
 ret = get_errno(prctl(PR_GET_PDEATHSIG, &deathsig,
   arg3, arg4, arg5));
-if (!is_error(ret) && put_user_s32(deathsig, arg2)) {
+if (!is_error(ret) &&
+put_user_s32(host_to_target_signal(deathsig), arg2)) {
 return -TARGET_EFAULT;
 }
 return ret;
-- 
2.33.1
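
For readers following the conversion described in this patch, here is a
minimal standalone sketch of the idea. It is not QEMU code: the demo_*
names and the swapped signal numbers (10 and 16) are assumptions made up
for illustration, standing in for host_to_target_signal() and the real
per-target tables.

```c
#include <assert.h>

/* Hypothetical numbering: host and guest disagree on signals 10 and 16. */
enum { DEMO_NSIG = 32 };

/* Translate a host signal number into the guest's numbering. */
int demo_host_to_target_signal(int sig)
{
    assert(sig >= 0 && sig < DEMO_NSIG);
    switch (sig) {
    case 10:
        return 16;
    case 16:
        return 10;
    default:
        return sig; /* same number on both sides in this sketch */
    }
}

/* The mapping here is a plain swap, so the inverse is the same function. */
int demo_target_to_host_signal(int sig)
{
    return demo_host_to_target_signal(sig);
}

/*
 * Shape of the PR_GET_PDEATHSIG path: the host kernel reports a host
 * signal number, which is converted before it is stored for the guest.
 */
void demo_store_pdeathsig(int host_sig, int *guest_slot)
{
    *guest_slot = demo_host_to_target_signal(host_sig);
}
```

Without the conversion, a guest that asked for signal 16 would read back
10 in this made-up numbering, which is the mismatch the patch addresses.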




[PULL 27/30] linux-user: Do not special-case NULL for PR_GET_PDEATHSIG

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

The kernel does not special-case arg2 != NULL, so
neither should we.

Signed-off-by: Richard Henderson 
Reviewed-by: Laurent Vivier 
Message-Id: <20220106225738.103012-2-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ca6e0b8fb0a1..eff107b8bcfd 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6444,7 +6444,7 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 int deathsig;
 ret = get_errno(prctl(PR_GET_PDEATHSIG, &deathsig,
   arg3, arg4, arg5));
-if (!is_error(ret) && arg2 && put_user_s32(deathsig, arg2)) {
+if (!is_error(ret) && put_user_s32(deathsig, arg2)) {
 return -TARGET_EFAULT;
 }
 return ret;
-- 
2.33.1




[PULL 14/30] linux-user/mips: Improve do_break

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Rename to do_tr_or_bp, as per the kernel function.
Add a 'trap' argument, akin to the kernel's si_code, but clearer.
The return value is always 0, so change the return value to void.
Use force_sig and force_sig_fault.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-15-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/mips/cpu_loop.c | 46 +-
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 32f9fc1c1c7c..4fa24cc07452 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -40,29 +40,25 @@ enum {
 BRK_DIVZERO = 7
 };
 
-static int do_break(CPUMIPSState *env, target_siginfo_t *info,
-unsigned int code)
+static void do_tr_or_bp(CPUMIPSState *env, unsigned int code, bool trap)
 {
-int ret = -1;
+target_ulong pc = env->active_tc.PC;
 
 switch (code) {
 case BRK_OVERFLOW:
+force_sig_fault(TARGET_SIGFPE, TARGET_FPE_INTOVF, pc);
+break;
 case BRK_DIVZERO:
-info->si_signo = TARGET_SIGFPE;
-info->si_errno = 0;
-info->si_code = (code == BRK_OVERFLOW) ? FPE_INTOVF : FPE_INTDIV;
-queue_signal(env, info->si_signo, QEMU_SI_FAULT, &*info);
-ret = 0;
+force_sig_fault(TARGET_SIGFPE, TARGET_FPE_INTDIV, pc);
 break;
 default:
-info->si_signo = TARGET_SIGTRAP;
-info->si_errno = 0;
-queue_signal(env, info->si_signo, QEMU_SI_FAULT, &*info);
-ret = 0;
+if (trap) {
+force_sig(TARGET_SIGTRAP);
+} else {
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, pc);
+}
 break;
 }
-
-return ret;
 }
 
 void cpu_loop(CPUMIPSState *env)
@@ -205,6 +201,13 @@ done_syscall:
 abi_ulong trap_instr;
 unsigned int code;
 
+/*
+ * FIXME: It would be better to decode the trap number
+ * during translate, and store it in error_code while
+ * raising the exception.  We should not be re-reading
+ * the opcode here.
+ */
+
 if (env->hflags & MIPS_HFLAG_M16) {
 if (env->insn_flags & ASE_MICROMIPS) {
 /* microMIPS mode */
@@ -257,9 +260,7 @@ done_syscall:
 }
 }
 
-if (do_break(env, &info, code) != 0) {
-goto error;
-}
+do_tr_or_bp(env, code, false);
 }
 break;
 case EXCP_TRAP:
@@ -267,6 +268,13 @@ done_syscall:
 abi_ulong trap_instr;
 unsigned int code = 0;
 
+/*
+ * FIXME: It would be better to decode the trap number
+ * during translate, and store it in error_code while
+ * raising the exception.  We should not be re-reading
+ * the opcode here.
+ */
+
 if (env->hflags & MIPS_HFLAG_M16) {
 /* microMIPS mode */
 abi_ulong instr[2];
@@ -293,9 +301,7 @@ done_syscall:
 }
 }
 
-if (do_break(env, &info, code) != 0) {
-goto error;
-}
+do_tr_or_bp(env, code, true);
 }
 break;
 case EXCP_ATOMIC:
-- 
2.33.1
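
As a reading aid, the decision made by the reworked helper can be modelled
in isolation. This is only a sketch under stated assumptions: the demo_*
names are hypothetical, the numeric constants are illustrative rather than
taken from any target header, and the real helper raises the signal instead
of returning a description of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; real targets define their own constants. */
enum { DEMO_BRK_OVERFLOW = 6, DEMO_BRK_DIVZERO = 7 };
enum { DEMO_SIGTRAP = 5, DEMO_SIGFPE = 8 };
enum {
    DEMO_FPE_INTDIV = 1,
    DEMO_FPE_INTOVF = 2,
    DEMO_TRAP_BRKPT = 1,
    DEMO_SI_KERNEL = 0x80
};

struct demo_sig {
    int signo;
    int code;
    bool has_addr; /* true when the faulting PC is reported */
};

/* Pick signal and si_code from the break/trap code, as in the patch. */
struct demo_sig demo_tr_or_bp(unsigned int code, bool trap)
{
    struct demo_sig s = { DEMO_SIGTRAP, DEMO_TRAP_BRKPT, true };

    switch (code) {
    case DEMO_BRK_OVERFLOW:
        s.signo = DEMO_SIGFPE;
        s.code = DEMO_FPE_INTOVF;
        break;
    case DEMO_BRK_DIVZERO:
        s.signo = DEMO_SIGFPE;
        s.code = DEMO_FPE_INTDIV;
        break;
    default:
        if (trap) {
            /* force_sig() path: kernel-originated, no fault address */
            s.code = DEMO_SI_KERNEL;
            s.has_addr = false;
        }
        break;
    }
    assert(s.signo == DEMO_SIGTRAP || s.signo == DEMO_SIGFPE);
    return s;
}
```

The 'trap' flag only matters in the default case, which matches the diff:
overflow and divide-by-zero codes produce SIGFPE for both break and trap
instructions.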




[PULL 22/30] linux-user/sh4: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-23-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/sh4/cpu_loop.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/linux-user/sh4/cpu_loop.c b/linux-user/sh4/cpu_loop.c
index 3290f6445c5f..1bd313cb19a2 100644
--- a/linux-user/sh4/cpu_loop.c
+++ b/linux-user/sh4/cpu_loop.c
@@ -28,7 +28,6 @@ void cpu_loop(CPUSH4State *env)
 {
 CPUState *cs = env_cpu(env);
 int trapnr, ret;
-target_siginfo_t info;
 
 while (1) {
 bool arch_interrupt = true;
@@ -60,10 +59,7 @@ void cpu_loop(CPUSH4State *env)
 /* just indicate that signals should be handled asap */
 break;
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
-- 
2.33.1
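
The conversion pattern used throughout this series can be sketched outside
of QEMU. The code below is an assumption-laden illustration, not the
definition of force_sig_fault(): demo_siginfo, demo_queue_signal() and
demo_force_sig_fault() are hypothetical stand-ins showing how one helper
can fill in every field, including the fault address that several of the
old call sites left unset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for a target siginfo structure. */
typedef struct {
    int si_signo;
    int si_errno;
    int si_code;
    uint64_t si_addr;
} demo_siginfo;

demo_siginfo demo_last_queued;
int demo_queue_depth;

/* Stand-in for the queueing step; it only records what was queued. */
void demo_queue_signal(const demo_siginfo *info)
{
    demo_last_queued = *info;
    demo_queue_depth++;
}

/* One call replaces the four or five assignments at each call site. */
void demo_force_sig_fault(int sig, int code, uint64_t addr)
{
    demo_siginfo info = {
        .si_signo = sig,
        .si_errno = 0,
        .si_code = code,
        .si_addr = addr,
    };

    assert(sig > 0);
    demo_queue_signal(&info);
}
```

Because the address is a mandatory argument, a caller cannot forget it,
which is how the "missing PC for SIGTRAP" gets filled in by these patches.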




[PULL 23/30] linux-user/sparc: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-24-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/sparc/cpu_loop.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 8765ab60205c..baf3d9ae011f 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -155,7 +155,6 @@ void cpu_loop (CPUSPARCState *env)
 CPUState *cs = env_cpu(env);
 int trapnr;
 abi_long ret;
-target_siginfo_t info;
 
 while (1) {
 cpu_exec_start(cs);
@@ -241,19 +240,10 @@ void cpu_loop (CPUSPARCState *env)
 /* just indicate that signals should be handled asap */
 break;
 case TT_ILL_INSN:
-{
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code = TARGET_ILL_ILLOPC;
-info._sifields._sigfault._addr = env->pc;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-}
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, env->pc);
 break;
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
-- 
2.33.1




[PULL 19/30] linux-user/ppc: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal.  Fill in the missing PC for SIGTRAP.
The fault address for POWERPC_EXCP_ISI is nip exactly, not nip - 4.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-20-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/ppc/cpu_loop.c | 136 --
 1 file changed, 28 insertions(+), 108 deletions(-)

diff --git a/linux-user/ppc/cpu_loop.c b/linux-user/ppc/cpu_loop.c
index 30c82f23540a..46e6ffd6d300 100644
--- a/linux-user/ppc/cpu_loop.c
+++ b/linux-user/ppc/cpu_loop.c
@@ -76,8 +76,7 @@ int ppc_dcr_write (ppc_dcr_t *dcr_env, int dcrn, uint32_t val)
 void cpu_loop(CPUPPCState *env)
 {
 CPUState *cs = env_cpu(env);
-target_siginfo_t info;
-int trapnr;
+int trapnr, si_signo, si_code;
 target_ulong ret;
 
 for(;;) {
@@ -102,61 +101,10 @@ void cpu_loop(CPUPPCState *env)
   "Aborting\n");
 break;
 case POWERPC_EXCP_DSI:  /* Data storage exception*/
-/* XXX: check this. Seems bugged */
-switch (env->error_code & 0xFF00) {
-case 0x4000:
-case 0x4200:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_MAPERR;
-break;
-case 0x0400:
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code = TARGET_ILL_ILLADR;
-break;
-case 0x0800:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_ACCERR;
-break;
-default:
-/* Let's send a regular segfault... */
-EXCP_DUMP(env, "Invalid segfault errno (%02x)\n",
-  env->error_code);
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_MAPERR;
-break;
-}
-info._sifields._sigfault._addr = env->spr[SPR_DAR];
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
 case POWERPC_EXCP_ISI:  /* Instruction storage exception */
-/* XXX: check this */
-switch (env->error_code & 0xFF00) {
-case 0x4000:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_MAPERR;
-break;
-case 0x1000:
-case 0x0800:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_ACCERR;
-break;
-default:
-/* Let's send a regular segfault... */
-EXCP_DUMP(env, "Invalid segfault errno (%02x)\n",
-  env->error_code);
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_MAPERR;
-break;
-}
-info._sifields._sigfault._addr = env->nip - 4;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+/* FIXME: handle maperr in ppc_cpu_record_sigsegv. */
+force_sig_fault(TARGET_SIGSEGV, TARGET_SEGV_MAPERR,
+env->spr[SPR_DAR]);
 break;
 case POWERPC_EXCP_EXTERNAL: /* External input*/
 cpu_abort(cs, "External interrupt while in user mode. "
@@ -167,24 +115,23 @@ void cpu_loop(CPUPPCState *env)
 /* XXX: check this */
 switch (env->error_code & ~0xF) {
 case POWERPC_EXCP_FP:
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
+si_signo = TARGET_SIGFPE;
 switch (env->error_code & 0xF) {
 case POWERPC_EXCP_FP_OX:
-info.si_code = TARGET_FPE_FLTOVF;
+si_code = TARGET_FPE_FLTOVF;
 break;
 case POWERPC_EXCP_FP_UX:
-info.si_code = TARGET_FPE_FLTUND;
+si_code = TARGET_FPE_FLTUND;
 break;
 case POWERPC_EXCP_FP_ZX:
 case POWERPC_EXCP_FP_VXZDZ:
-info.si_code = TARGET_FPE_FLTDIV;
+si_code = TARGET_FPE_FLTDIV;
 break;
 case POWERPC_EXCP_FP_XX:
-info.si_code = TARGET_FPE_FLTRES;
+si_code = TARGET_FPE_FLTRES;
 break;
 case POWERPC_EXCP_FP_VXSOFT:
-info.si_code = TARGET_FPE_FLTINV;
+

[PULL 08/30] linux-user/hppa: Set FPE_CONDTRAP for COND

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

This si_code was changed in 75abf64287cab, for linux 4.17.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-9-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/hppa/cpu_loop.c | 2 ++
 linux-user/syscall_defs.h  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
index a65e1571a0cf..a576d1a249fd 100644
--- a/linux-user/hppa/cpu_loop.c
+++ b/linux-user/hppa/cpu_loop.c
@@ -156,6 +156,8 @@ void cpu_loop(CPUHPPAState *env)
 force_sig_fault(TARGET_SIGFPE, TARGET_FPE_INTOVF, env->iaoq_f);
 break;
 case EXCP_COND:
+force_sig_fault(TARGET_SIGFPE, TARGET_FPE_CONDTRAP, env->iaoq_f);
+break;
 case EXCP_ASSIST:
 force_sig_fault(TARGET_SIGFPE, 0, env->iaoq_f);
 break;
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 510a8c1ab585..f23f0a2178f8 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -688,6 +688,7 @@ typedef struct target_siginfo {
 #define TARGET_FPE_FLTINV  (7)  /* floating point invalid operation */
 #define TARGET_FPE_FLTSUB  (8)  /* subscript out of range */
 #define TARGET_FPE_FLTUNK  (14) /* undiagnosed fp exception */
+#define TARGET_FPE_CONDTRAP  (15) /* trap on condition */
 
 /*
  * SIGSEGV si_codes
-- 
2.33.1




[PULL 15/30] linux-user/mips: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP
and SIGFPE; use force_sig (SI_KERNEL) for EXCP_DSPDIS.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-16-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/mips/cpu_loop.c | 38 +-
 1 file changed, 13 insertions(+), 25 deletions(-)

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 4fa24cc07452..1286fbc2e0d3 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -64,8 +64,7 @@ static void do_tr_or_bp(CPUMIPSState *env, unsigned int code, bool trap)
 void cpu_loop(CPUMIPSState *env)
 {
 CPUState *cs = env_cpu(env);
-target_siginfo_t info;
-int trapnr;
+int trapnr, si_code;
 abi_long ret;
 # ifdef TARGET_ABI_MIPSO32
 unsigned int syscall_num;
@@ -156,43 +155,32 @@ done_syscall:
 break;
 case EXCP_CpU:
 case EXCP_RI:
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+case EXCP_DSPDIS:
+force_sig(TARGET_SIGILL);
 break;
 case EXCP_INTERRUPT:
 /* just indicate that signals should be handled asap */
 break;
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
-case EXCP_DSPDIS:
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code = TARGET_ILL_ILLOPC;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT,
+env->active_tc.PC);
 break;
 case EXCP_FPE:
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
-info.si_code = TARGET_FPE_FLTUNK;
+si_code = TARGET_FPE_FLTUNK;
 if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID) {
-info.si_code = TARGET_FPE_FLTINV;
+si_code = TARGET_FPE_FLTINV;
 } else if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_DIV0) {
-info.si_code = TARGET_FPE_FLTDIV;
+si_code = TARGET_FPE_FLTDIV;
 } else if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_OVERFLOW) {
-info.si_code = TARGET_FPE_FLTOVF;
+si_code = TARGET_FPE_FLTOVF;
 } else if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_UNDERFLOW) {
-info.si_code = TARGET_FPE_FLTUND;
+si_code = TARGET_FPE_FLTUND;
 } else if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INEXACT) {
-info.si_code = TARGET_FPE_FLTRES;
+si_code = TARGET_FPE_FLTRES;
 }
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGFPE, si_code, env->active_tc.PC);
 break;
+
 /* The code below was inspired by the MIPS Linux kernel trap
  * handling code in arch/mips/kernel/traps.c.
  */
-- 
2.33.1




[PULL 24/30] linux-user/xtensa: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-25-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/xtensa/cpu_loop.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/linux-user/xtensa/cpu_loop.c b/linux-user/xtensa/cpu_loop.c
index 6bc6d6dee6c4..d51ce053926d 100644
--- a/linux-user/xtensa/cpu_loop.c
+++ b/linux-user/xtensa/cpu_loop.c
@@ -126,7 +126,6 @@ static void xtensa_underflow12(CPUXtensaState *env)
 void cpu_loop(CPUXtensaState *env)
 {
 CPUState *cs = env_cpu(env);
-target_siginfo_t info;
 abi_ulong ret;
 int trapnr;
 
@@ -163,14 +162,12 @@ void cpu_loop(CPUXtensaState *env)
 case EXC_USER:
 switch (env->sregs[EXCCAUSE]) {
 case ILLEGAL_INSTRUCTION_CAUSE:
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC,
+env->sregs[EPC1]);
+break;
 case PRIVILEGED_CAUSE:
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code =
-env->sregs[EXCCAUSE] == ILLEGAL_INSTRUCTION_CAUSE ?
-TARGET_ILL_ILLOPC : TARGET_ILL_PRVOPC;
-info._sifields._sigfault._addr = env->sregs[EPC1];
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC,
+env->sregs[EPC1]);
 break;
 
 case SYSCALL_CAUSE:
@@ -219,11 +216,8 @@ void cpu_loop(CPUXtensaState *env)
 break;
 
 case INTEGER_DIVIDE_BY_ZERO_CAUSE:
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
-info.si_code = TARGET_FPE_INTDIV;
-info._sifields._sigfault._addr = env->sregs[EPC1];
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGFPE, TARGET_FPE_INTDIV,
+env->sregs[EPC1]);
 break;
 
 default:
@@ -232,10 +226,8 @@ void cpu_loop(CPUXtensaState *env)
 }
 break;
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT,
+env->sregs[EPC1]);
 break;
 case EXC_DEBUG:
 default:
-- 
2.33.1




[PULL 18/30] linux-user/openrisc: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP.

Reviewed-by: Stafford Horne 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-19-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/openrisc/cpu_loop.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/linux-user/openrisc/cpu_loop.c b/linux-user/openrisc/cpu_loop.c
index 592901a68b73..7683bea0649e 100644
--- a/linux-user/openrisc/cpu_loop.c
+++ b/linux-user/openrisc/cpu_loop.c
@@ -29,7 +29,6 @@ void cpu_loop(CPUOpenRISCState *env)
 CPUState *cs = env_cpu(env);
 int trapnr;
 abi_long ret;
-target_siginfo_t info;
 
 for (;;) {
 cpu_exec_start(cs);
@@ -55,27 +54,16 @@ void cpu_loop(CPUOpenRISCState *env)
 }
 break;
 case EXCP_ALIGN:
-info.si_signo = TARGET_SIGBUS;
-info.si_errno = 0;
-info.si_code = TARGET_BUS_ADRALN;
-info._sifields._sigfault._addr = env->pc;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGBUS, TARGET_BUS_ADRALN, env->eear);
 break;
 case EXCP_ILLEGAL:
-info.si_signo = TARGET_SIGILL;
-info.si_errno = 0;
-info.si_code = TARGET_ILL_ILLOPC;
-info._sifields._sigfault._addr = env->pc;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, env->pc);
 break;
 case EXCP_INTERRUPT:
 /* We processed the pending cpu work above.  */
 break;
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
-- 
2.33.1




[PULL 20/30] linux-user/riscv: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal.  Fix missing PC from EXCP_DEBUG by
merging the case with EXCP_BREAKPOINT.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-21-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/riscv/cpu_loop.c | 31 +--
 1 file changed, 5 insertions(+), 26 deletions(-)

diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
index 0cd8985cb854..26d446f32379 100644
--- a/linux-user/riscv/cpu_loop.c
+++ b/linux-user/riscv/cpu_loop.c
@@ -30,8 +30,7 @@
 void cpu_loop(CPURISCVState *env)
 {
 CPUState *cs = env_cpu(env);
-int trapnr, signum, sigcode;
-target_ulong sigaddr;
+int trapnr;
 target_ulong ret;
 
 for (;;) {
@@ -40,10 +39,6 @@ void cpu_loop(CPURISCVState *env)
 cpu_exec_end(cs);
 process_queued_cpu_work(cs);
 
-signum = 0;
-sigcode = 0;
-sigaddr = 0;
-
 switch (trapnr) {
 case EXCP_INTERRUPT:
 /* just indicate that signals should be handled asap */
@@ -79,39 +74,23 @@ void cpu_loop(CPURISCVState *env)
 }
 break;
 case RISCV_EXCP_ILLEGAL_INST:
-signum = TARGET_SIGILL;
-sigcode = TARGET_ILL_ILLOPC;
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPC, env->pc);
 break;
 case RISCV_EXCP_BREAKPOINT:
-signum = TARGET_SIGTRAP;
-sigcode = TARGET_TRAP_BRKPT;
-sigaddr = env->pc;
+case EXCP_DEBUG:
+gdbstep:
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 case RISCV_EXCP_SEMIHOST:
 env->gpr[xA0] = do_common_semihosting(cs);
 env->pc += 4;
 break;
-case EXCP_DEBUG:
-gdbstep:
-signum = TARGET_SIGTRAP;
-sigcode = TARGET_TRAP_BRKPT;
-break;
 default:
 EXCP_DUMP(env, "\nqemu: unhandled CPU exception %#x - aborting\n",
  trapnr);
 exit(EXIT_FAILURE);
 }
 
-if (signum) {
-target_siginfo_t info = {
-.si_signo = signum,
-.si_errno = 0,
-.si_code = sigcode,
-._sifields._sigfault._addr = sigaddr
-};
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-}
-
 process_pending_signals(env);
 }
 }
-- 
2.33.1




[PULL 09/30] linux-user/i386: Split out maybe_handle_vm86_trap

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Reduce the number of ifdefs within cpu_loop().

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-10-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/i386/cpu_loop.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index 9aaae93e2f5c..ac0f4e32 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -198,6 +198,17 @@ static void emulate_vsyscall(CPUX86State *env)
 }
 #endif
 
+static bool maybe_handle_vm86_trap(CPUX86State *env, int trapnr)
+{
+#ifndef TARGET_X86_64
+if (env->eflags & VM_MASK) {
+handle_vm86_trap(env, trapnr);
+return true;
+}
+#endif
+return false;
+}
+
 void cpu_loop(CPUX86State *env)
 {
 CPUState *cs = env_cpu(env);
@@ -259,12 +270,9 @@ void cpu_loop(CPUX86State *env)
 break;
 case EXCP0D_GPF:
 /* XXX: potential problem if ABI32 */
-#ifndef TARGET_X86_64
-if (env->eflags & VM_MASK) {
-handle_vm86_fault(env);
+if (maybe_handle_vm86_trap(env, trapnr)) {
 break;
 }
-#endif
 gen_signal(env, TARGET_SIGSEGV, TARGET_SI_KERNEL, 0);
 break;
 case EXCP0E_PAGE:
@@ -274,22 +282,16 @@ void cpu_loop(CPUX86State *env)
env->cr[2]);
 break;
 case EXCP00_DIVZ:
-#ifndef TARGET_X86_64
-if (env->eflags & VM_MASK) {
-handle_vm86_trap(env, trapnr);
+if (maybe_handle_vm86_trap(env, trapnr)) {
 break;
 }
-#endif
 gen_signal(env, TARGET_SIGFPE, TARGET_FPE_INTDIV, env->eip);
 break;
 case EXCP01_DB:
 case EXCP03_INT3:
-#ifndef TARGET_X86_64
-if (env->eflags & VM_MASK) {
-handle_vm86_trap(env, trapnr);
+if (maybe_handle_vm86_trap(env, trapnr)) {
 break;
 }
-#endif
 if (trapnr == EXCP01_DB) {
 gen_signal(env, TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->eip);
 } else {
@@ -298,12 +300,9 @@ void cpu_loop(CPUX86State *env)
 break;
 case EXCP04_INTO:
 case EXCP05_BOUND:
-#ifndef TARGET_X86_64
-if (env->eflags & VM_MASK) {
-handle_vm86_trap(env, trapnr);
+if (maybe_handle_vm86_trap(env, trapnr)) {
 break;
 }
-#endif
 gen_signal(env, TARGET_SIGSEGV, TARGET_SI_KERNEL, 0);
 break;
 case EXCP06_ILLOP:
-- 
2.33.1
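
The refactoring in this patch follows a common pattern: move the
preprocessor conditional into one small helper so that call sites stay
unconditional. A standalone sketch, assuming hypothetical names
(DEMO_TARGET_X86_64, demo_handle_vm86_trap and the counter exist only
for this illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VM_MASK 0x00020000u /* EFLAGS.VM, bit 17 */

int demo_vm86_traps_handled;

/* Stand-in for the real vm86 handler; it only counts invocations. */
static void demo_handle_vm86_trap(int trapnr)
{
    (void)trapnr;
    demo_vm86_traps_handled++;
}

/*
 * The only place that knows about the build-time distinction.
 * Returns true when the trap was consumed by the vm86 path.
 */
bool demo_maybe_handle_vm86_trap(uint32_t eflags, int trapnr)
{
    assert(trapnr >= 0);
#ifndef DEMO_TARGET_X86_64
    if (eflags & DEMO_VM_MASK) {
        demo_handle_vm86_trap(trapnr);
        return true;
    }
#else
    (void)eflags;
#endif
    return false;
}
```

Each case in the main loop then reduces to "if the helper returned true,
break", with no #ifndef/#endif pair around it. On a 64-bit build the
helper always returns false and the compiler can drop the branch.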




[PULL 21/30] linux-user/s390x: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-22-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/s390x/cpu_loop.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/linux-user/s390x/cpu_loop.c b/linux-user/s390x/cpu_loop.c
index ad0c3cd2635d..7901dfe6f518 100644
--- a/linux-user/s390x/cpu_loop.c
+++ b/linux-user/s390x/cpu_loop.c
@@ -58,7 +58,6 @@ void cpu_loop(CPUS390XState *env)
 {
 CPUState *cs = env_cpu(env);
 int trapnr, n, sig;
-target_siginfo_t info;
 target_ulong addr;
 abi_long ret;
 
@@ -158,11 +157,7 @@ void cpu_loop(CPUS390XState *env)
  */
 env->psw.addr += env->int_pgm_ilen;
 do_signal:
-info.si_signo = sig;
-info.si_errno = 0;
-info.si_code = n;
-info._sifields._sigfault._addr = addr;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(sig, n, addr);
 break;
 
 case EXCP_ATOMIC:
-- 
2.33.1




[PULL 12/30] linux-user/microblaze: Use force_sig_fault

2022-01-11 Thread Laurent Vivier
From: Richard Henderson 

Use the new function instead of setting up a target_siginfo_t
and calling queue_signal. Fill in the missing PC for SIGTRAP.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Message-Id: <20220107213243.212806-13-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 
---
 linux-user/microblaze/cpu_loop.c | 61 +---
 1 file changed, 25 insertions(+), 36 deletions(-)

diff --git a/linux-user/microblaze/cpu_loop.c b/linux-user/microblaze/cpu_loop.c
index ff1fb26c8baf..08620d4e6899 100644
--- a/linux-user/microblaze/cpu_loop.c
+++ b/linux-user/microblaze/cpu_loop.c
@@ -27,9 +27,8 @@
 void cpu_loop(CPUMBState *env)
 {
 CPUState *cs = env_cpu(env);
-int trapnr, ret;
-target_siginfo_t info;
-
+int trapnr, ret, si_code;
+
 while (1) {
 cpu_exec_start(cs);
 trapnr = cpu_exec(cs);
@@ -38,8 +37,8 @@ void cpu_loop(CPUMBState *env)
 
 switch (trapnr) {
 case EXCP_INTERRUPT:
-  /* just indicate that signals should be handled asap */
-  break;
+/* just indicate that signals should be handled asap */
+break;
 case EXCP_SYSCALL:
 /* Return address is 4 bytes after the call.  */
 env->regs[14] += 4;
@@ -67,6 +66,7 @@ void cpu_loop(CPUMBState *env)
  */
 env->regs[14] = env->pc;
 break;
+
 case EXCP_HW_EXCP:
 env->regs[17] = env->pc + 4;
 if (env->iflags & D_FLAG) {
@@ -74,42 +74,31 @@ void cpu_loop(CPUMBState *env)
 env->pc -= 4;
 /* FIXME: if branch was immed, replay the imm as well.  */
 }
-
 env->iflags &= ~(IMM_FLAG | D_FLAG);
-
 switch (env->esr & 31) {
-case ESR_EC_DIVZERO:
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
-info.si_code = TARGET_FPE_FLTDIV;
-info._sifields._sigfault._addr = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
-case ESR_EC_FPU:
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
-if (env->fsr & FSR_IO) {
-info.si_code = TARGET_FPE_FLTINV;
-}
-if (env->fsr & FSR_DZ) {
-info.si_code = TARGET_FPE_FLTDIV;
-}
-info._sifields._sigfault._addr = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
-default:
-fprintf(stderr, "Unhandled hw-exception: 0x%x\n",
-env->esr & ESR_EC_MASK);
-cpu_dump_state(cs, stderr, 0);
-exit(EXIT_FAILURE);
-break;
+case ESR_EC_DIVZERO:
+si_code = TARGET_FPE_FLTDIV;
+break;
+case ESR_EC_FPU:
+si_code = 0;
+if (env->fsr & FSR_IO) {
+si_code = TARGET_FPE_FLTINV;
+}
+if (env->fsr & FSR_DZ) {
+si_code = TARGET_FPE_FLTDIV;
+}
+break;
+default:
+fprintf(stderr, "Unhandled hw-exception: 0x%x\n",
+env->esr & ESR_EC_MASK);
+cpu_dump_state(cs, stderr, 0);
+exit(EXIT_FAILURE);
 }
+force_sig_fault(TARGET_SIGFPE, si_code, env->pc);
 break;
+
 case EXCP_DEBUG:
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-info.si_code = TARGET_TRAP_BRKPT;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
-- 
2.33.1



