Re: [PATCH 0/5] disas/nanomips: Format string fixes

2022-11-01 Thread Richard Henderson

On 11/1/22 22:44, Philippe Mathieu-Daudé wrote:

Fix invalid string formats reported by Stefan:
https://lore.kernel.org/qemu-devel/78553699-00c1-ad69-1d58-02f75a1f4...@weilnetz.de/

Philippe Mathieu-Daudé (5):
   disas/nanomips: Fix invalid PRId64 format calling img_format()
   disas/nanomips: Fix invalid PRIx64 format calling img_format()
   disas/nanomips: Use G_GNUC_PRINTF to avoid invalid string formats
   disas/nanomips: Remove headers already included by "qemu/osdep.h"
   MAINTAINERS: Inherit from nanoMIPS

  MAINTAINERS  |  8 +---
  disas/nanomips.c | 44 +++-
  2 files changed, 24 insertions(+), 28 deletions(-)



Reviewed-by: Richard Henderson 

r~



Re: [PATCH 1/2] target/loongarch: Add exception subcode

2022-11-01 Thread Richard Henderson

On 11/1/22 18:32, Song Gao wrote:

We need subcodes to distinguish the same excode cs->exception_indexs,
such as EXCCODE_ADEF/EXCCODE_ADEM.

Signed-off-by: Song Gao 
---
  target/loongarch/cpu.c |  7 +++--
  target/loongarch/cpu.h | 58 ++
  2 files changed, 36 insertions(+), 29 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 2/2] target/loongarch: Fix raise_mmu_exception() set wrong exception_index

2022-11-01 Thread Richard Henderson

On 11/1/22 18:32, Song Gao wrote:

When the address is invalid address, We should set exception_index
according to MMUAccessType, and EXCCODE_ADEF need't update badinstr.
Otherwise, The system enters an infinite loop. e.g:
run test.c on system mode
test.c:
 #include

 void (*func)(int *);

 int main()
 {
 int i = 8;
 void *ptr = (void *)0x4000;
 func = ptr;
 func();
 return 0;
 }

Signed-off-by: Song Gao
---
  target/loongarch/cpu.c| 1 +
  target/loongarch/tlb_helper.c | 5 +++--
  2 files changed, 4 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH] target/arm: Two fixes for secure ptw

2022-11-01 Thread Richard Henderson
Reversed the sense of non-secure in get_phys_addr_lpae,
and failed to initialize attrs.secure for ARMMMUIdx_Phys_S.

Fixes: 48da29e4 ("target/arm: Add ptw_idx to S1Translate")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1293
Signed-off-by: Richard Henderson 
---
 target/arm/ptw.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 58a7bbda50..df3573f150 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -1357,7 +1357,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
S1Translate *ptw,
 descaddr |= (address >> (stride * (4 - level))) & indexmask;
 descaddr &= ~7ULL;
 nstable = extract32(tableattrs, 4, 1);
-if (!nstable) {
+if (nstable) {
 /*
  * Stage2_S -> Stage2 or Phys_S -> Phys_NS
  * Assert that the non-secure idx are even, and relative order.
@@ -2671,6 +2671,13 @@ static bool get_phys_addr_with_struct(CPUARMState *env, 
S1Translate *ptw,
 bool is_secure = ptw->in_secure;
 ARMMMUIdx s1_mmu_idx;
 
+/*
+ * The page table entries may downgrade secure to non-secure, but
+ * cannot upgrade an non-secure translation regime's attributes
+ * to secure.
+ */
+result->f.attrs.secure = is_secure;
+
 switch (mmu_idx) {
 case ARMMMUIdx_Phys_S:
 case ARMMMUIdx_Phys_NS:
@@ -2712,12 +2719,6 @@ static bool get_phys_addr_with_struct(CPUARMState *env, 
S1Translate *ptw,
 break;
 }
 
-/*
- * The page table entries may downgrade secure to non-secure, but
- * cannot upgrade an non-secure translation regime's attributes
- * to secure.
- */
-result->f.attrs.secure = is_secure;
 result->f.attrs.user = regime_is_user(env, mmu_idx);
 
 /*
-- 
2.34.1




Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"

2022-11-01 Thread Jason Wang
On Tue, Nov 1, 2022 at 12:19 AM  wrote:
>
> From: Hyman Huang(黄勇) 
>
> For netdev device that can offload virtio-net dataplane to slave,
> such as vhost-net, vhost-user and vhost-vdpa, exporting it's
> capability information and acked features would be more friendly for
> developers. These infomation can be analyzed and compare to slave
> capability provided by, eg dpdk or other slaves directly, helping to
> draw conclusions about if vm network interface works normally, if
> it vm can be migrated to another feature-compatible destination or
> whatever else.
>
> For developers who devote to offload virtio-net dataplane to DPU
> and make efforts to migrate vm lively from software-based source
> host to DPU-offload destination host smoothly, virtio-net feature
> compatibility is an serious issue, exporting the key capability
> and acked_features of netdev could also help to debug greatly.
>
> So we export out the key capabilities of netdev, which may affect
> the final negotiated virtio-net features, meanwhile, backed-up
> acked_features also exported, which is used to initialize or
> restore features negotiated between qemu and vhost slave when
> starting vhost_dev device.
>
> Signed-off-by: Hyman Huang(黄勇) 
> ---
>  net/net.c | 44 +++
>  qapi/net.json | 66 
> +++
>  2 files changed, 110 insertions(+)
>
> diff --git a/net/net.c b/net/net.c
> index 2db160e..5d11674 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -53,6 +53,7 @@
>  #include "sysemu/runstate.h"
>  #include "net/colo-compare.h"
>  #include "net/filter.h"
> +#include "net/vhost-user.h"
>  #include "qapi/string-output-visitor.h"
>
>  /* Net bridge is currently not supported for W32. */
> @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp)
>  }
>  }
>
> +static NetDevInfo *query_netdev(NetClientState *nc)
> +{
> +NetDevInfo *info = NULL;
> +
> +if (!nc || !nc->is_netdev) {
> +return NULL;
> +}
> +
> +info = g_malloc0(sizeof(*info));
> +info->name = g_strdup(nc->name);
> +info->type = nc->info->type;
> +info->ufo = nc->info->has_ufo;
> +info->vnet_hdr = nc->info->has_vnet_hdr;
> +info->vnet_hdr_len = nc->info->has_vnet_hdr_len;

So all the fields are virtio specific, I wonder if it's better to
rename the command as query-vhost or query-virtio?

Thanks

> +
> +if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> +info->has_acked_features = true;
> +info->acked_features = vhost_user_get_acked_features(nc);
> +}
> +
> +return info;
> +}
> +
> +NetDevInfoList *qmp_query_netdev(Error **errp)
> +{
> +NetClientState *nc;
> +NetDevInfo *info = NULL;
> +NetDevInfoList *head = NULL, **tail = 
> +
> +QTAILQ_FOREACH(nc, _clients, next) {
> +if (nc->info->type == NET_CLIENT_DRIVER_NIC) {
> +continue;
> +}
> +
> +info = query_netdev(nc);
> +if (info) {
> +QAPI_LIST_APPEND(tail, info);
> +}
> +}
> +
> +return head;
> +}
> +
>  static void netfilter_print_info(Monitor *mon, NetFilterState *nf)
>  {
>  char *str;
> diff --git a/qapi/net.json b/qapi/net.json
> index dd088c0..76a6513 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -631,6 +631,72 @@
> 'if': 'CONFIG_VMNET' } } }
>
>  ##
> +# @NetDevInfo:
> +#
> +# NetDev information.  This structure describes a NetDev information, 
> including
> +# capabilities and negotiated features.
> +#
> +# @name: The NetDev name.
> +#
> +# @type: Type of NetDev.
> +#
> +# @ufo: True if NetDev has ufo capability.
> +#
> +# @vnet-hdr: True if NetDev has vnet_hdr.
> +#
> +# @vnet-hdr-len: True if given length can be assigned to NetDev.
> +#
> +# @acked-features: Negotiated features with vhost slave device if device 
> support
> +#  dataplane offload.
> +#
> +# Since:  7.1
> +##
> +{'struct': 'NetDevInfo',
> + 'data': {
> +'name': 'str',
> +'type': 'NetClientDriver',
> +'ufo':'bool',
> +'vnet-hdr':'bool',
> +'vnet-hdr-len':'bool',
> +'*acked-features': 'uint64' } }
> +
> +##
> +# @query-netdev:
> +#
> +# Get a list of NetDevInfo for all virtual netdev peer devices.
> +#
> +# Returns: a list of @NetDevInfo describing each virtual netdev peer device.
> +#
> +# Since: 7.1
> +#
> +# Example:
> +#
> +# -> { "execute": "query-netdev" }
> +# <- {
> +#   "return":[
> +#  {
> +# "name":"hostnet0",
> +# "type":"vhost-user",
> +# "ufo":true,
> +# "vnet-hdr":true,
> +# "vnet-hdr-len":true,
> +# "acked-features":"5111807907",
> +#  },
> +#  {
> +# "name":"hostnet1",
> +# "type":"vhost-user",
> +# "ufo":true,
> +# "vnet-hdr":true,
> +# "vnet-hdr-len":true,
> +# "acked-features":"5111807907",
> +#  }
> +#   ]
> +#   

Re: [PATCH v3] target/arm: honor HCR_E2H and HCR_TGE in ats_write64()

2022-11-01 Thread Richard Henderson

On 11/1/22 17:42, Ake Koomsin wrote:

We need to check HCR_E2H and HCR_TGE to select the right MMU index for
the correct translation regime.

To check for EL2&0 translation regime:
- For S1E0*, S1E1* and S12E* ops, check both HCR_E2H and HCR_TGE
- For S1E2* ops, check only HCR_E2H

Signed-off-by: Ake Koomsin
---

v3:
- Avoid recomputing arm_hcr_el2_eff() as recommended by Richard H.
- Use ':?' for more compact code as recommended by Richard H.

v2:
- Rebase with the latest upstream
- It turns out that we need to check both HCR_E2H and HCR_TGE for
   S1E0*, S1E1* and S12E* address translation as well according to the
   Architecture Manual.
-https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg06084.html

v1:
https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg02627.html

  target/arm/helper.c | 15 +--
  1 file changed, 9 insertions(+), 6 deletions(-)


Reviewed-by: Richard Henderson 

r~



RE: [PATCH 09/16] hw/9pfs: Disable unsupported flags and features for Windows

2022-11-01 Thread Shi, Guohuai



> -Original Message-
> From: Christian Schoenebeck 
> Sent: Wednesday, November 2, 2022 02:59
> To: qemu-devel@nongnu.org
> Cc: Greg Kurz ; Meng, Bin ; Shi,
> Guohuai 
> Subject: Re: [PATCH 09/16] hw/9pfs: Disable unsupported flags and features
> for Windows
> 
> [Please note: This e-mail is from an EXTERNAL e-mail address]
> 
> On Tuesday, November 1, 2022 4:34:54 PM CET Shi, Guohuai wrote:
> >
> > > -Original Message-
> > > From: Christian Schoenebeck 
> > > Sent: Tuesday, November 1, 2022 23:04
> > > To: qemu-devel@nongnu.org
> > > Cc: Shi, Guohuai ; Greg Kurz
> > > ; Meng, Bin 
> > > Subject: Re: [PATCH 09/16] hw/9pfs: Disable unsupported flags and
> > > features for Windows
> > >
> > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > >
> > > On Monday, October 24, 2022 6:57:52 AM CET Bin Meng wrote:
> > > > From: Guohuai Shi 
> > > >
> > > > Some flags and features are not supported on Windows, like mknod,
> > > > readlink, file mode, etc. Update the codes for Windows.
> > > >
> > > > Signed-off-by: Guohuai Shi 
> > > > Signed-off-by: Bin Meng 
> > > > ---
> > > >
> > > >  hw/9pfs/9p-util.h |  6 +++-
> > > >  hw/9pfs/9p.c  | 90 ++-
> > > >  2 files changed, 86 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > > 82b2d0c3e4..3d154e9103 100644
> > > > --- a/hw/9pfs/9p-util.h
> > > > +++ b/hw/9pfs/9p-util.h
> > > > @@ -53,8 +53,10 @@ static inline uint64_t makedev_dotl(uint32_t
> > > > dev_major,
> > > uint32_t dev_minor)
> > > >   */
> > > >  static inline uint64_t host_dev_to_dotl_dev(dev_t dev)  { -#ifdef
> > > > CONFIG_LINUX
> > > > +#if defined(CONFIG_LINUX)
> > > >  return dev;
> > > > +#elif defined(CONFIG_WIN32)
> > > > +return 0;
> > >
> > > Really?
> >
> > Check MS this document:
> > https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fsta
> > t-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=msvc-170
> > st_rdev: If a device, fd; otherwise 0.
> > st_dev: If a device, fd; otherwise 0.
> >
> > So for any file open, it should be 0.
> 
> Yeah, but that function translates a corresponding device ID for *Linux*
> guest side. And the intention is to avoid e.g. file ID collisions on guest
> side.
> Because for a Linux guest, the two-tuple (device number, inode number) makes
> a system-wide unique file ID.
> 
> If you just return zero here, that might be OK if only one 9p directory is
> exported to guest, but say you have "C:\foo\" exported and "D:\bar\" exported
> and mounted via 9p to guest, then guest would assume every file with the same
> inode number on those two to be the same files. But they are not. They are on
> two different drives actually.
> 

Got it.
Windows does not provide any numerical type ID for device, 
I think the solution could be using driver letter ASC code "C:", "D:", etc.

> >
> > >
> > > >  #else
> > > >  return makedev_dotl(major(dev), minor(dev));  #endif @@
> > > > -260,7
> > > > +262,9 @@ static inline struct dirent *qemu_dirent_dup(struct
> > > > +dirent
> > > > *dent)  #if defined CONFIG_DARWIN && defined
> > > > CONFIG_PTHREAD_FCHDIR_NP int pthread_fchdir_np(int fd)
> > > > __attribute__((weak_import));  #endif
> > > > +#ifndef CONFIG_WIN32
> > > >  int qemu_mknodat(P9_FILE_ID dirfd, const char *filename, mode_t mode,
> > > >   dev_t dev);
> > > > +#endif
> > > >
> > > >  #endif
> > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index
> > > > 6c4af86240..771aab34ac
> > > > 100644
> > > > --- a/hw/9pfs/9p.c
> > > > +++ b/hw/9pfs/9p.c
> > > > @@ -39,6 +39,11 @@
> > > >  #include "qemu/xxhash.h"
> > > >  #include 
> > > >
> > > > +#ifdef CONFIG_WIN32
> > > > +#define UTIME_NOW   ((1l << 30) - 1l)
> > > > +#define UTIME_OMIT  ((1l << 30) - 2l) #endif
> > > > +
> > > >  int open_fd_hw;
> > > >  int total_open_fd;
> > > >  static int open_fd_rc;
> > > > @@ -132,13 +137,17 @@ static int dotl_to_open_flags(int flags)
> > > >  DotlOpenflagMap dotl_oflag_map[] = {
> > > >  { P9_DOTL_CREATE, O_CREAT },
> > > >  { P9_DOTL_EXCL, O_EXCL },
> > > > +#ifndef CONFIG_WIN32
> > > >  { P9_DOTL_NOCTTY , O_NOCTTY },
> > > > +#endif
> > > >  { P9_DOTL_TRUNC, O_TRUNC },
> > > >  { P9_DOTL_APPEND, O_APPEND },
> > > > +#ifndef CONFIG_WIN32
> > > >  { P9_DOTL_NONBLOCK, O_NONBLOCK } ,
> > > >  { P9_DOTL_DSYNC, O_DSYNC },
> > > >  { P9_DOTL_FASYNC, FASYNC }, -#ifndef CONFIG_DARWIN
> > > > +#endif
> > > > +#ifdef CONFIG_LINUX
> > >
> > > Better
> > >
> > >#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32)
> > >
> >
> > It is OK.
> 
> You got my point, hopefully:
> 
> > > Otherwise it might automatically opt-out other future platforms
> > > unintentionally.
> > >
> > > >  { P9_DOTL_NOATIME, O_NOATIME },
> > > >  /*
> > > >   *  On Darwin, we could map to F_NOCACHE, which is @@
> > > > -151,8
> > > > +160,10 @@ static 

Re: [PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support

2022-11-01 Thread Eric Biggers
On Tue, Nov 01, 2022 at 03:00:31PM -0400, Stefan Hajnoczi wrote:
>  /* Let's try to use the logical blocksize for the alignment. */
> -if (probe_logical_blocksize(fd, >bl.request_alignment) < 0) {
> -bs->bl.request_alignment = 0;
> +if (!bs->bl.request_alignment) {
> +if (probe_logical_blocksize(fd, >bl.request_alignment) < 0) {
> +bs->bl.request_alignment = 0;
> +}
>  }
>  
>  #ifdef __linux__
> -/*
> - * The XFS ioctl definitions are shipped in extra packages that might
> - * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl
> - * here, we simply use our own definition instead:
> - */
> -struct xfs_dioattr {
> -uint32_t d_mem;
> -uint32_t d_miniosz;
> -uint32_t d_maxiosz;
> -} da;
> -if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), ) >= 0) {
> -bs->bl.request_alignment = da.d_miniosz;
> -/* The kernel returns wrong information for d_mem */
> -/* s->buf_align = da.d_mem; */
> +if (!bs->bl.request_alignment) {

This patch changes the fallback code to make the request_alignment value from
probe_logical_blocksize() override the value from XFS_IOC_DIOINFO.  Is that
intentional?

> +if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), ) >= 0) {
> +bs->bl.request_alignment = da.d_miniosz;
> +/* The kernel returns wrong information for d_mem */
> +/* s->buf_align = da.d_mem; */

Has this bug been reported to the XFS developers (linux-...@vger.kernel.org)?

- Eric



RE: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs

2022-11-01 Thread Shi, Guohuai



> -Original Message-
> From: Christian Schoenebeck 
> Sent: Wednesday, November 2, 2022 02:22
> To: qemu-devel@nongnu.org
> Cc: Greg Kurz ; Meng, Bin ; Shi,
> Guohuai 
> Subject: Re: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities
> functions for 9pfs
> 
> [Please note: This e-mail is from an EXTERNAL e-mail address]
> 
> On Tuesday, November 1, 2022 4:20:53 PM CET Shi, Guohuai wrote:
> >
> [...]
> > > > > Windows POSIX API and MinGW library do not provide the NO_FOLLOW
> > > > > flag, and do not allow opening a directory by POSIX open(). This
> > > > > causes all
> > > > > xxx_at() functions cannot work directly. However, we can provide
> > > > > Windows handle based functions to emulate xxx_at() functions (e.g.:
> > > > > openat_win32, utimensat_win32, etc.).
> > > > >
> > > > > Windows does not support extended attributes. 9pfs for Windows
> > > > > uses NTFS ADS (Alternate Data Streams) to emulate extended
> attributes.
> > > > >
> > > > > Windows does not provide POSIX compatible readlink(), and
> > > > > symbolic link feature in 9pfs will be disabled on Windows.
> > > >
> > > > Wouldn't it be more user friendly if the relevant error locations
> > > > would use something like error_report_once() and suggesting to
> > > > enable
> > > > mapped(-xattr) to make 9p symlinks on guest working if desired by the
> user?
> > > >
> > > > Probably this error case would need to wrapped into a dedicated
> > > > function, otherwise I guess error_report_once() would fire several
> > > > times by different callers.
> > > >
> > >
> > > Windows (MinGW) does not only support symlink, but also does not
> > > have symlink definitions.
> > > Windows does not support symlink flags S_IFLNK.
> > >
> > > So even I add symlink support by mapped-xattr, the MinGW library
> > > does not have symlink flags and get a build error.
> > > And this flags is defined by Windows header files.
> > > The impact of adding a new flags to an pre-defined structure (struct
> > > stat) is unknown.
> > >
> > > So I think it is not a good idea to do that.
> >
> > Because Windows does not support symlink, so error_report_once() and report
> it to user will be OK.
> > But mapped-xattr could not work.
> 
> Showing an error makes sense for "passthrough" security model, but not for
> the "mapped" one.
> 
> Just to avoid misapprehensions: are you aware that there is already a system-
> agnostic implementation for symlinks if "mapped" is used?
> 
> When mapped security model is enabled, then creating symlinks on guest will
> simply create a corresponding *regular* file on host and the content of that
> regular file on host is the pointing path as raw string. Additionally the
> symlink flag is added to "virtfs.mode" xattr to mark that regular file as a
> symlink, a virtual one that is. So this does not require any support for
> symlinks by either the underlying host file system, nor by host OS.
> 
> Likewise interpreting and walking those virtual symlinks in "mapped" mode is
> also implemented in the local fs driver already.

Yes, symlink can be supported by "mapped" mode.
I mean that MinGW does not provide symlink mode flags "S_IFLNK" and some other 
related functions and defines.
You can check with "9p.c": S_ISLNK, S_ISSOCK and S_ISFIFO are not valid on 
Windows (MinGW) host.
So even I enabled symlink supported by "mapped" mode on local-agent code, 
"9p.c" can not be built.

So I disabled symlink totally, because MinGW interface does not support it.

To resolve this issue, MinGW should add the missing defines at first.

> 
> Best regards,
> Christian Schoenebeck
> 


Thanks
Guohuai



Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned

2022-11-01 Thread Eric Biggers
On Tue, Nov 01, 2022 at 07:27:16PM -0700, Eric Biggers wrote:
> On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote:
> > Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls.
> 
> Citation needed.  For direct I/O to block devices, the kernel's block layer
> checks the alignment before the I/O is actually submitted to the underlying
> block device.  See
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306
> 
> > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290
> 
> That "bug" seems to be based on a misunderstanding of the kernel source code,
> and not any actual testing.
> 
> I just tested it, and the error code is EINVAL.
> 

I think I see what's happening.  The kernel code was broken just a few months
ago, in v6.0 by the commit "block: relax direct io memory alignment"
(https://git.kernel.org/linus/b1a000d3b8ec582d).  Now the block layer lets DIO
through when the user buffer is only aligned to the device's dma_alignment.  But
a dm-crypt device has a dma_alignment of 512 even when the crypto sector size
(and thus also the logical block size) is 4096.  So there is now a case where
misaligned DIO can reach dm-crypt, when that shouldn't be possible.

It also means that STATX_DIOALIGN will give the wrong value for
stx_dio_mem_align in the above case, 512 instead of 4096.  This is because
STATX_DIOALIGN for block devices relies on the dma_alignment.

I'll raise this on the linux-block and dm-devel mailing lists.  It would be nice
if people reported kernel bugs instead of silently working around them...

- Eric



Re: [PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned

2022-11-01 Thread Eric Biggers
On Tue, Nov 01, 2022 at 03:00:30PM -0400, Stefan Hajnoczi wrote:
> Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls.

Citation needed.  For direct I/O to block devices, the kernel's block layer
checks the alignment before the I/O is actually submitted to the underlying
block device.  See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/fops.c?h=v6.1-rc3#n306

> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290

That "bug" seems to be based on a misunderstanding of the kernel source code,
and not any actual testing.

I just tested it, and the error code is EINVAL.

- Eric



Re: Re: [PATCH v2 1/4] virtio-crypto: Support asynchronous mode

2022-11-01 Thread Lei He

On 2022/11/2 03:51, Michael S. Tsirkin wrote:

On Tue, Nov 01, 2022 at 06:37:26AM -0400, Michael S. Tsirkin wrote:

On Sat, Oct 08, 2022 at 04:50:27PM +0800, Lei He wrote:

virtio-crypto: Modify the current interface of virtio-crypto
device to support asynchronous mode.

Signed-off-by: lei he 
---
  backends/cryptodev-builtin.c|  69 ++---
  backends/cryptodev-vhost-user.c |  51 +--
  backends/cryptodev.c|  44 +++---
  hw/virtio/virtio-crypto.c   | 324 ++--
  include/sysemu/cryptodev.h  |  60 +---
  5 files changed, 336 insertions(+), 212 deletions(-)

diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 125cbad1d3..cda6ca3b71 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -355,42 +355,62 @@ static int cryptodev_builtin_create_akcipher_session(
  return index;
  }
  
-static int64_t cryptodev_builtin_create_session(

+static int cryptodev_builtin_create_session(
 CryptoDevBackend *backend,
 CryptoDevBackendSessionInfo *sess_info,
-   uint32_t queue_index, Error **errp)
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque)
  {
  CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
  CryptoDevBackendSymSessionInfo *sym_sess_info;
  CryptoDevBackendAsymSessionInfo *asym_sess_info;
+int ret, status;
+Error *local_error = NULL;
  
  switch (sess_info->op_code) {

  case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
  sym_sess_info = _info->u.sym_sess_info;
-return cryptodev_builtin_create_cipher_session(
-   builtin, sym_sess_info, errp);
+ret = cryptodev_builtin_create_cipher_session(
+builtin, sym_sess_info, _error);
+break;
  
  case VIRTIO_CRYPTO_AKCIPHER_CREATE_SESSION:

  asym_sess_info = _info->u.asym_sess_info;
-return cryptodev_builtin_create_akcipher_session(
-   builtin, asym_sess_info, errp);
+ret = cryptodev_builtin_create_akcipher_session(
+   builtin, asym_sess_info, _error);
+break;
  
  case VIRTIO_CRYPTO_HASH_CREATE_SESSION:

  case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
  default:
-error_setg(errp, "Unsupported opcode :%" PRIu32 "",
+error_setg(_error, "Unsupported opcode :%" PRIu32 "",
 sess_info->op_code);
-return -1;
+return -VIRTIO_CRYPTO_NOTSUPP;
  }
  
-return -1;

+if (local_error) {
+error_report_err(local_error);
+}
+if (ret < 0) {
+status = -VIRTIO_CRYPTO_ERR;
+} else {
+sess_info->session_id = ret;
+status = VIRTIO_CRYPTO_OK;
+}
+if (cb) {
+cb(opaque, status);
+}
+return 0;
  }
  
  static int cryptodev_builtin_close_session(

 CryptoDevBackend *backend,
 uint64_t session_id,
-   uint32_t queue_index, Error **errp)
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque)
  {
  CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
@@ -407,6 +427,9 @@ static int cryptodev_builtin_close_session(
  
  g_free(session);

  builtin->sessions[session_id] = NULL;
+if (cb) {
+cb(opaque, VIRTIO_CRYPTO_OK);
+}
  return 0;
  }
  
@@ -506,7 +529,9 @@ static int cryptodev_builtin_asym_operation(

  static int cryptodev_builtin_operation(
   CryptoDevBackend *backend,
   CryptoDevBackendOpInfo *op_info,
- uint32_t queue_index, Error **errp)
+ uint32_t queue_index,
+ CryptoDevCompletionFunc cb,
+ void *opaque)
  {
  CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
@@ -514,11 +539,12 @@ static int cryptodev_builtin_operation(
  CryptoDevBackendSymOpInfo *sym_op_info;
  CryptoDevBackendAsymOpInfo *asym_op_info;
  enum CryptoDevBackendAlgType algtype = op_info->algtype;
-int ret = -VIRTIO_CRYPTO_ERR;
+int status = -VIRTIO_CRYPTO_ERR;
+Error *local_error = NULL;
  
  if (op_info->session_id >= MAX_NUM_SESSIONS ||

builtin->sessions[op_info->session_id] == NULL) {
-error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
+error_setg(_error, "Cannot find a valid session id: %" PRIu64 "",
 op_info->session_id);
  return -VIRTIO_CRYPTO_INVSESS;
  }
@@ -526,14 +552,21 @@ static int cryptodev_builtin_operation(
  sess = builtin->sessions[op_info->session_id];
  if (algtype == CRYPTODEV_BACKEND_ALG_SYM) {
  sym_op_info = op_info->u.sym_op_info;
-ret = cryptodev_builtin_sym_operation(sess, sym_op_info, errp);
+status = 

Re: [PATCH 0/2] i386/cpuid: Minor fixes for CPUID leaf 1f setup

2022-11-01 Thread Xiaoyao Li

On 7/12/2022 10:12 AM, Xiaoyao Li wrote:

The issue that fixed by Patch 1 looks fatal though it doesn't appear on
KVM because KVM always searches with assending order and hit with the
correct cpuid leaf 0.

Patch 2 removes the wrong constraint on CPUID leaf 1f


Kindly ping.


Xiaoyao Li (2):
   i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F
   i386/cpuid: Remove subleaf constraint on CPUID leaf 1F

  target/i386/kvm/kvm.c | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)






Re: [PATCH 4/5] target/riscv: No need to re-start QEMU timer when timecmp == UINT64_MAX

2022-11-01 Thread Alistair Francis
On Mon, Oct 31, 2022 at 1:49 PM Anup Patel  wrote:
>
> On Mon, Oct 31, 2022 at 6:25 AM Alistair Francis  wrote:
> >
> > On Fri, Oct 28, 2022 at 2:53 AM Anup Patel  wrote:
> > >
> > > The time CSR will wrap-around immediately after reaching UINT64_MAX
> > > so we don't need to re-start QEMU timer when timecmp == UINT64_MAX
> > > in riscv_timer_write_timecmp().
> >
> > I'm not clear what this is fixing?
> >
> > If the guest sets a timer for UINT64_MAX shouldn't that still trigger
> > an event at some point?
>
> Here's what Sstc says about timer interrupt using Sstc:
> "A supervisor timer interrupt becomes pending - as reflected in the
> STIP bit in the mip and sip registers - whenever time contains a
> value greater than or equal to stimecmp, treating the values as
> unsigned integers. Writes to stimecmp are guaranteed to be
> reflected in STIP eventually, but not necessarily immediately.
> The interrupt remains posted until stimecmp becomes greater
> than time - typically as a result of writing stimecmp."
>
> When timecmp = UINT64_MAX, the time CSR will eventually reach
> timecmp value but on next timer tick the time CSR will wrap-around
> and become zero which is less than UINT64_MAX. Now, the timer
> interrupt behaves like a level triggered interrupt so it will become 1
> when time = timecmp = UINT64_MAX and next timer tick it will
> become 0 again because time = 0 < timecmp = UINT64_MAX.

Ah, I didn't realise this. Can you add this to the code comment and
maybe add this description to the commit message. Otherwise:

Reviewed-by: Alistair Francis 

Alistair

>
> This time CSR wrap-around comparison with timecmp is natural
> to implement in HW but not straight forward in QEMU hence this
> patch.
>
> Software can potentially use timecmp = UINT64_MAX as a way
> to clear the timer interrupt and keep timer disabled instead of
> enabling/disabling sie.STIP. This timecmp = UINT64_MAX helps:
> 1) Linux RISC-V timer driver keep timer interrupt enable/disable
> state in-sync with Linux interrupt subsystem.
> 2) Reduce number of traps taken when emulating Sstc for the
> "Nested Guest" (i.e. Guest running under some "Guest Hypervisor"
> which in-turn runs under a "Host Hypervisor").
>
> In fact, the SBI set_timer() call also defines similar mechanism to
> disable timer: "If the supervisor wishes to clear the timer interrupt
> without scheduling the next timer event, it can either request a timer
> interrupt infinitely far into the future (i.e., (uint64_t)-1), ...".
>
> Regards,
> Anup
>
> >
> > Alistair
> >
> > >
> > > Signed-off-by: Anup Patel 
> > > ---
> > >  target/riscv/time_helper.c | 8 
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/target/riscv/time_helper.c b/target/riscv/time_helper.c
> > > index 4fb2a471a9..1ee9f94813 100644
> > > --- a/target/riscv/time_helper.c
> > > +++ b/target/riscv/time_helper.c
> > > @@ -72,6 +72,14 @@ void riscv_timer_write_timecmp(RISCVCPU *cpu, 
> > > QEMUTimer *timer,
> > >  riscv_cpu_update_mip(cpu, timer_irq, BOOL_TO_MASK(0));
> > >  }
> > >
> > > +/*
> > > + * Don't re-start the QEMU timer when timecmp == UINT64_MAX because
> > > + * time CSR will wrap-around immediately after reaching UINT64_MAX.
> > > + */
> > > +if (timecmp == UINT64_MAX) {
> > > +return;
> > > +}
> > > +
> > >  /* otherwise, set up the future timer interrupt */
> > >  diff = timecmp - rtc_r;
> > >  /* back to ns (note args switched in muldiv64) */
> > > --
> > > 2.34.1
> > >
> > >



Re: [PULL 59/62] hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power of two

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 23:23, Stefan Hajnoczi wrote:

There is a report that this commit breaks an existing OVMF setup:
https://gitlab.com/qemu-project/qemu/-/issues/1290#note_1156507334

I'm not familiar with pflash. Please find a way to avoid a regression
in QEMU 7.2 here.


Long-standing problem with pflash and underlying images... i.e:
https://lore.kernel.org/qemu-devel/20190308062455.29755-1-arm...@redhat.com/

Let's revert for 7.2. Daniel, I can prepare a patch explaining.



Re: [PATCH v5 1/6] hw/sd/sdhci: MMIO region is implemented in 32-bit accesses

2022-11-01 Thread Bernhard Beschow
Am 31. Oktober 2022 11:53:57 UTC schrieb "Philippe Mathieu-Daudé" 
:
>Signed-off-by: Philippe Mathieu-Daudé 

Indeed there is `offset & ~0x3` in both sdhci_{read,write}, so:
Reviewed-by: Bernhard Beschow 

>---
> hw/sd/sdhci.c | 4 
> 1 file changed, 4 insertions(+)
>
>diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
>index 0e5e988927..f9c5b58e6d 100644
>--- a/hw/sd/sdhci.c
>+++ b/hw/sd/sdhci.c
>@@ -1332,6 +1332,10 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
>unsigned size)
> static const MemoryRegionOps sdhci_mmio_ops = {
> .read = sdhci_read,
> .write = sdhci_write,
>+.impl = {
>+.min_access_size = 4,
>+.max_access_size = 4,
>+},
> .valid = {
> .min_access_size = 1,
> .max_access_size = 4,




[PATCH v6 2/3] hw/sd/sdhci: Support big endian SD host controller interfaces

2022-11-01 Thread Philippe Mathieu-Daudé
Some SDHCI IP can be synthetized in various endianness:
https://github.com/u-boot/u-boot/blob/v2021.04/doc/README.fsl-esdhc

 - CONFIG_SYS_FSL_ESDHC_BE

   ESDHC IP is in big-endian mode. Accessing ESDHC registers can be
   determined by ESDHC IP's endian mode or processor's endian mode.

Our current implementation is little-endian. In order to support
big endianness:

- Rename current MemoryRegionOps as sdhci_mmio_le_ops ('le')
- Add an 'endianness' property to SDHCIState (default little endian)
- Set the 'io_ops' field in realize() after checking the property
- Add the sdhci_mmio_be_ops (big-endian) MemoryRegionOps.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sdhci-internal.h |  1 +
 hw/sd/sdhci.c  | 32 +---
 include/hw/sd/sdhci.h  |  1 +
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
index 964570f8e8..5f3765f12d 100644
--- a/hw/sd/sdhci-internal.h
+++ b/hw/sd/sdhci-internal.h
@@ -308,6 +308,7 @@ extern const VMStateDescription sdhci_vmstate;
 #define SDHC_CAPAB_REG_DEFAULT 0x057834b4
 
 #define DEFINE_SDHCI_COMMON_PROPERTIES(_state) \
+DEFINE_PROP_UINT8("endianness", _state, endianness, DEVICE_LITTLE_ENDIAN), 
\
 DEFINE_PROP_UINT8("sd-spec-version", _state, sd_spec_version, 2), \
 DEFINE_PROP_UINT8("uhs", _state, uhs_mode, UHS_NOT_SUPPORTED), \
 DEFINE_PROP_UINT8("vendor", _state, vendor, SDHCI_VENDOR_NONE), \
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 22c758ad91..289baa879e 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -1329,7 +1329,7 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
unsigned size)
value >> shift, value >> shift);
 }
 
-static const MemoryRegionOps sdhci_mmio_ops = {
+static const MemoryRegionOps sdhci_mmio_le_ops = {
 .read = sdhci_read,
 .write = sdhci_write,
 .impl = {
@@ -1344,6 +1344,21 @@ static const MemoryRegionOps sdhci_mmio_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static const MemoryRegionOps sdhci_mmio_be_ops = {
+.read = sdhci_read,
+.write = sdhci_write,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+.valid = {
+.min_access_size = 1,
+.max_access_size = 4,
+.unaligned = false
+},
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
 static void sdhci_init_readonly_registers(SDHCIState *s, Error **errp)
 {
 ERRP_GUARD();
@@ -1371,8 +1386,6 @@ void sdhci_initfn(SDHCIState *s)
 
 s->insert_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
sdhci_raise_insertion_irq, s);
 s->transfer_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sdhci_data_transfer, 
s);
-
-s->io_ops = _mmio_ops;
 }
 
 void sdhci_uninitfn(SDHCIState *s)
@@ -1388,10 +1401,23 @@ void sdhci_common_realize(SDHCIState *s, Error **errp)
 {
 ERRP_GUARD();
 
+switch (s->endianness) {
+case DEVICE_LITTLE_ENDIAN:
+s->io_ops = _mmio_le_ops;
+break;
+case DEVICE_BIG_ENDIAN:
+s->io_ops = _mmio_be_ops;
+break;
+default:
+error_setg(errp, "Incorrect endianness");
+return;
+}
+
 sdhci_init_readonly_registers(s, errp);
 if (*errp) {
 return;
 }
+
 s->buf_maxsz = sdhci_get_fifolen(s);
 s->fifo_buffer = g_malloc0(s->buf_maxsz);
 
diff --git a/include/hw/sd/sdhci.h b/include/hw/sd/sdhci.h
index 01a64c5442..a989fca3b2 100644
--- a/include/hw/sd/sdhci.h
+++ b/include/hw/sd/sdhci.h
@@ -96,6 +96,7 @@ struct SDHCIState {
 /* Configurable properties */
 bool pending_insert_quirk; /* Quirk for Raspberry Pi card insert int */
 uint32_t quirks;
+uint8_t endianness;
 uint8_t sd_spec_version;
 uint8_t uhs_mode;
 uint8_t vendor;/* For vendor specific functionality */
-- 
2.38.1




[PATCH v6 0/3] ppc/e500: Add support for eSDHC

2022-11-01 Thread Philippe Mathieu-Daudé
This is a respin of Bernhard's v4 with Freescale eSDHC implemented
as an 'UNIMP' region. See v4 cover here:
https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/

Since v5:
- Rebased (ppc-next merged)
- Properly handle big-endian

Since v4:
- Do not rename ESDHC_* definitions to USDHC_*
- Do not modify SDHCIState structure

Supersedes: <20221031115402.91912-1-phi...@linaro.org>

Philippe Mathieu-Daudé (3):
  hw/sd/sdhci: MMIO region is implemented in 32-bit accesses
  hw/sd/sdhci: Support big endian SD host controller interfaces
  hw/ppc/e500: Add Freescale eSDHC to e500plat

 docs/system/ppc/ppce500.rst | 13 ++
 hw/ppc/Kconfig  |  2 ++
 hw/ppc/e500.c   | 48 -
 hw/ppc/e500.h   |  1 +
 hw/ppc/e500plat.c   |  1 +
 hw/sd/sdhci-internal.h  |  1 +
 hw/sd/sdhci.c   | 36 +---
 include/hw/sd/sdhci.h   |  1 +
 8 files changed, 99 insertions(+), 4 deletions(-)

-- 
2.38.1




[PATCH v6 3/3] hw/ppc/e500: Add Freescale eSDHC to e500plat

2022-11-01 Thread Philippe Mathieu-Daudé
Adds missing functionality to e500plat machine which increases the
chance of given "real" firmware images to access SD cards.

Signed-off-by: Bernhard Beschow 
Message-Id: <20221018210146.193159-8-shen...@gmail.com>
[PMD: Simplify using create_unimplemented_device("esdhc")]
Signed-off-by: Philippe Mathieu-Daudé 
---
 docs/system/ppc/ppce500.rst | 13 ++
 hw/ppc/Kconfig  |  2 ++
 hw/ppc/e500.c   | 48 -
 hw/ppc/e500.h   |  1 +
 hw/ppc/e500plat.c   |  1 +
 5 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/docs/system/ppc/ppce500.rst b/docs/system/ppc/ppce500.rst
index fa40e57d18..c9fe0915dc 100644
--- a/docs/system/ppc/ppce500.rst
+++ b/docs/system/ppc/ppce500.rst
@@ -19,6 +19,7 @@ The ``ppce500`` machine supports the following devices:
 * Power-off functionality via one GPIO pin
 * 1 Freescale MPC8xxx PCI host controller
 * VirtIO devices via PCI bus
+* 1 Freescale Enhanced Secure Digital Host controller (eSDHC)
 * 1 Freescale Enhanced Triple Speed Ethernet controller (eTSEC)
 
 Hardware configuration information
@@ -180,3 +181,15 @@ as follows:
   -kernel vmlinux \
   -drive if=pflash,file=/path/to/rootfs.ext2,format=raw \
   -append "rootwait root=/dev/mtdblock0"
+
+Alternatively, the root file system can also reside on an emulated SD card
+whose size must again be a power of two:
+
+.. code-block:: bash
+
+  $ qemu-system-ppc64 -M ppce500 -cpu e500mc -smp 4 -m 2G \
+  -display none -serial stdio \
+  -kernel vmlinux \
+  -device sd-card,drive=mydrive \
+  -drive id=mydrive,if=none,file=/path/to/rootfs.ext2,format=raw \
+  -append "rootwait root=/dev/mmcblk0"
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index b8d2522f45..72a311edcb 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -128,10 +128,12 @@ config E500
 select PFLASH_CFI01
 select PLATFORM_BUS
 select PPCE500_PCI
+select SDHCI
 select SERIAL
 select MPC_I2C
 select FDT_PPC
 select DS1338
+select UNIMP
 
 config E500PLAT
 bool
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 2fe496677c..2bef2f01cb 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -48,6 +48,8 @@
 #include "hw/net/fsl_etsec/etsec.h"
 #include "hw/i2c/i2c.h"
 #include "hw/irq.h"
+#include "hw/sd/sdhci.h"
+#include "hw/misc/unimp.h"
 
 #define EPAPR_MAGIC(0x45504150)
 #define DTC_LOAD_PAD   0x180
@@ -66,11 +68,14 @@
 #define MPC8544_SERIAL1_REGS_OFFSET 0x4600ULL
 #define MPC8544_PCI_REGS_OFFSET0x8000ULL
 #define MPC8544_PCI_REGS_SIZE  0x1000ULL
+#define MPC85XX_ESDHC_REGS_OFFSET  0x2e000ULL
+#define MPC85XX_ESDHC_REGS_SIZE0x1000ULL
 #define MPC8544_UTIL_OFFSET0xeULL
 #define MPC8XXX_GPIO_OFFSET0x000FF000ULL
 #define MPC8544_I2C_REGS_OFFSET0x3000ULL
 #define MPC8XXX_GPIO_IRQ   47
 #define MPC8544_I2C_IRQ43
+#define MPC85XX_ESDHC_IRQ  72
 #define RTC_REGS_OFFSET0x68
 
 #define PLATFORM_CLK_FREQ_HZ   (400 * 1000 * 1000)
@@ -203,6 +208,22 @@ static void dt_i2c_create(void *fdt, const char *soc, 
const char *mpic,
 g_free(i2c);
 }
 
+static void dt_sdhc_create(void *fdt, const char *parent, const char *mpic)
+{
+hwaddr mmio = MPC85XX_ESDHC_REGS_OFFSET;
+hwaddr size = MPC85XX_ESDHC_REGS_SIZE;
+int irq = MPC85XX_ESDHC_IRQ;
+g_autofree char *name = NULL;
+
+name = g_strdup_printf("%s/sdhc@%" PRIx64, parent, mmio);
+qemu_fdt_add_subnode(fdt, name);
+qemu_fdt_setprop(fdt, name, "sdhci,auto-cmd12", NULL, 0);
+qemu_fdt_setprop_phandle(fdt, name, "interrupt-parent", mpic);
+qemu_fdt_setprop_cells(fdt, name, "bus-width", 4);
+qemu_fdt_setprop_cells(fdt, name, "interrupts", irq, 0x2);
+qemu_fdt_setprop_cells(fdt, name, "reg", mmio, size);
+qemu_fdt_setprop_string(fdt, name, "compatible", "fsl,esdhc");
+}
 
 typedef struct PlatformDevtreeData {
 void *fdt;
@@ -553,6 +574,10 @@ static int ppce500_load_device_tree(PPCE500MachineState 
*pms,
 
 dt_rtc_create(fdt, "i2c", "rtc");
 
+/* sdhc */
+if (pmc->has_esdhc) {
+dt_sdhc_create(fdt, soc, mpic);
+}
 
 gutil = g_strdup_printf("%s/global-utilities@%llx", soc,
 MPC8544_UTIL_OFFSET);
@@ -982,7 +1007,8 @@ void ppce500_init(MachineState *machine)
0, qdev_get_gpio_in(mpicdev, 42), 399193,
serial_hd(1), DEVICE_BIG_ENDIAN);
 }
-/* I2C */
+
+/* I2C */
 dev = qdev_new("mpc-i2c");
 s = SYS_BUS_DEVICE(dev);
 sysbus_realize_and_unref(s, _fatal);
@@ -992,6 +1018,26 @@ void ppce500_init(MachineState *machine)
 i2c = (I2CBus *)qdev_get_child_bus(dev, "i2c");
 i2c_slave_create_simple(i2c, "ds1338", RTC_REGS_OFFSET);
 
+/* eSDHC */
+if (pmc->has_esdhc) {
+create_unimplemented_device("esdhc",
+pmc->ccsrbar_base + 
MPC85XX_ESDHC_REGS_OFFSET,
+   

[PATCH v6 1/3] hw/sd/sdhci: MMIO region is implemented in 32-bit accesses

2022-11-01 Thread Philippe Mathieu-Daudé
Tested-by: Bernhard Beschow 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sdhci.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 306070c872..22c758ad91 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -1332,6 +1332,10 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
unsigned size)
 static const MemoryRegionOps sdhci_mmio_ops = {
 .read = sdhci_read,
 .write = sdhci_write,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 .valid = {
 .min_access_size = 1,
 .max_access_size = 4,
-- 
2.38.1




Re: [PULL 59/62] hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power of two

2022-11-01 Thread Stefan Hajnoczi
There is a report that this commit breaks an existing OVMF setup:
https://gitlab.com/qemu-project/qemu/-/issues/1290#note_1156507334

I'm not familiar with pflash. Please find a way to avoid a regression
in QEMU 7.2 here.

Thank you!

Stefan



[PATCH] tests/unit/test-io-channel-command: Silence GCC error "maybe-uninitialized"

2022-11-01 Thread Bernhard Beschow
GCC issues a false positive warning, resulting in build failure with -Werror:

  In file included from /usr/lib/glib-2.0/include/glibconfig.h:9,
   from /usr/include/glib-2.0/glib/gtypes.h:34,
   from /usr/include/glib-2.0/glib/galloca.h:34,
   from /usr/include/glib-2.0/glib.h:32,
   from ../src/include/glib-compat.h:32,
   from ../src/include/qemu/osdep.h:144,
   from ../src/tests/unit/test-io-channel-command.c:21:
  /usr/include/glib-2.0/glib/gmacros.h: In function 
‘test_io_channel_command_fifo’:
  /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘dstargv’ may be used 
uninitialized [-Werror=maybe-uninitialized]
   1333 |   static G_GNUC_UNUSED inline void _GLIB_AUTO_FUNC_NAME(TypeName) 
(TypeName *_ptr) { if (*_ptr != none) (func) (*_ptr); } \
|   
  ^
  ../src/tests/unit/test-io-channel-command.c:39:19: note: ‘dstargv’ was 
declared here
 39 | g_auto(GStrv) dstargv;
|   ^~~
  /usr/include/glib-2.0/glib/gmacros.h:1333:105: error: ‘srcargv’ may be used 
uninitialized [-Werror=maybe-uninitialized]
   1333 |   static G_GNUC_UNUSED inline void _GLIB_AUTO_FUNC_NAME(TypeName) 
(TypeName *_ptr) { if (*_ptr != none) (func) (*_ptr); } \
|   
  ^
  ../src/tests/unit/test-io-channel-command.c:38:19: note: ‘srcargv’ was 
declared here
 38 | g_auto(GStrv) srcargv;
|   ^~~
  cc1: all warnings being treated as errors

GCC version:

  $ gcc --version
  gcc (GCC) 12.2.0

Fixes: 68406d10859385c88da73d0106254a7f47e6652e ('tests/unit: cleanups for 
test-io-channel-command')
Signed-off-by: Bernhard Beschow 
---
 tests/unit/test-io-channel-command.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-io-channel-command.c 
b/tests/unit/test-io-channel-command.c
index 43e29c8cfb..ba0717d3c3 100644
--- a/tests/unit/test-io-channel-command.c
+++ b/tests/unit/test-io-channel-command.c
@@ -35,8 +35,8 @@ static void test_io_channel_command_fifo(bool async)
 g_autofree gchar *fifo = g_strdup_printf("%s/%s", tmpdir, TEST_FIFO);
 g_autoptr(GString) srcargs = g_string_new(socat);
 g_autoptr(GString) dstargs = g_string_new(socat);
-g_auto(GStrv) srcargv;
-g_auto(GStrv) dstargv;
+g_auto(GStrv) srcargv = NULL;
+g_auto(GStrv) dstargv = NULL;
 QIOChannel *src, *dst;
 QIOChannelTest *test;
 
-- 
2.38.1




Re: [PATCH qemu.git 11/11] hw/timer/imx_epit: rework CR write handling

2022-11-01 Thread Axel Heider




 Original Message 

From: Philippe Mathieu-Daudé [mailto:phi...@linaro.org]


- simplify code, improve comments
- fix https://gitlab.com/qemu-project/qemu/-/issues/1263


This doesn't match GitLab issues closing pattern:
https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern


I will change this to use "fix #1263" then. But the issue git closed already
from an earlier patch that got merged a few days ago.

Axel



Re: [PATCH qemu.git 04/11] hw/timer/imx_epit: remove explicit fields cnt and freq

2022-11-01 Thread Axel Heider




 Original Message 

From: Philippe Mathieu-Daudé [mailto:phi...@linaro.org]> diff --git 
a/hw/timer/imx_epit.c b/hw/timer/imx_epit.c>
index a79f58c963..37b04a1b53 100644
--- a/hw/timer/imx_epit.c
+++ b/hw/timer/imx_epit.c
@@ -77,23 +77,25 @@ static void imx_epit_update_int(IMXEPITState *s)
   * Must be called from within a ptimer_transaction_begin/commit block
   * for both s->timer_cmp and s->timer_reload.
   */
-static void imx_epit_set_freq(IMXEPITState *s)
+static uint32_t imx_epit_set_freq(IMXEPITState *s)

Maybe rename as imx_epit_get_freq() or simply imx_epit_freq(),



There will be an update of the whole patchset, so I will change
the name to imx_epit_freq(). That makes the name and signature in
sync. Note that the next commit in this patchset does more
refactoring anyway, so this is just a intermediate change that
is not really visible.


Axel






Re: [PULL 0/3] Block patches

2022-11-01 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL v3 for 7.2 00/31] testing and plugin updates

2022-11-01 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PATCH] linux-user: implement execveat

2022-11-01 Thread Laurent Vivier

Le 31/10/2022 à 09:40, Drew DeVault a écrit :

References: https://gitlab.com/qemu-project/qemu/-/issues/1007
Signed-off-by: Drew DeVault 
---
  linux-user/syscall.c | 50 ++--
  1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f55cdebee5..795f7ce4cd 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -634,6 +634,10 @@ safe_syscall4(pid_t, wait4, pid_t, pid, int *, status, 
int, options, \
  safe_syscall5(int, waitid, idtype_t, idtype, id_t, id, siginfo_t *, infop, \
int, options, struct rusage *, rusage)
  safe_syscall3(int, execve, const char *, filename, char **, argv, char **, 
envp)
+#if defined(TARGET_NR_execveat)
+safe_syscall5(int, execveat, int, dirfd, const char *, filename,
+char **, argv, char **, envp, int, flags)
+#endif
  #if defined(TARGET_NR_select) || defined(TARGET_NR__newselect) || \
  defined(TARGET_NR_pselect6) || defined(TARGET_NR_pselect6_time64)
  safe_syscall6(int, pselect6, int, nfds, fd_set *, readfds, fd_set *, 
writefds, \
@@ -8748,19 +8752,45 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
  ret = get_errno(unlinkat(arg1, p, arg3));
  unlock_user(p, arg2, 0);
  return ret;
+#endif
+#if defined(TARGET_NR_execveat)
+case TARGET_NR_execveat:
  #endif
  case TARGET_NR_execve:
  {
  char **argp, **envp;
-int argc, envc;
+int argc, envc, dirfd, flags;
  abi_ulong gp;
  abi_ulong guest_argp;
  abi_ulong guest_envp;
  abi_ulong addr;
+abi_long path;
  char **q;
  
  argc = 0;

-guest_argp = arg2;
+
+switch (num) {
+case TARGET_NR_execve:
+path = arg1;
+guest_argp = arg2;
+guest_envp = arg3;
+dirfd = AT_FDCWD;
+flags = 0;
+break;
+#if defined(TARGET_NR_execveat)
+case TARGET_NR_execveat:
+dirfd = arg1;
+path = arg2;
+guest_argp = arg3;
+guest_envp = arg4;
+flags = arg5;
+break;
+#endif
+default:
+// squelch uninitialized variable warnings
+abort();
+}
+
  for (gp = guest_argp; gp; gp += sizeof(abi_ulong)) {
  if (get_user_ual(addr, gp))
  return -TARGET_EFAULT;
@@ -8769,7 +8799,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
  argc++;
  }
  envc = 0;
-guest_envp = arg3;
  for (gp = guest_envp; gp; gp += sizeof(abi_ulong)) {
  if (get_user_ual(addr, gp))
  return -TARGET_EFAULT;
@@ -8803,7 +8832,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
  }
  *q = NULL;
  
-if (!(p = lock_user_string(arg1)))

+if (!(p = lock_user_string(path)))
  goto execve_efault;
  /* Although execve() is not an interruptible syscall it is
   * a special case where we must use the safe_syscall wrapper:
@@ -8815,8 +8844,17 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
   * before the execve completes and makes it the other
   * program's problem.
   */
-ret = get_errno(safe_execve(p, argp, envp));
-unlock_user(p, arg1, 0);
+switch (num) {
+case TARGET_NR_execve:
+ret = get_errno(safe_execve(p, argp, envp));
+break;
+#if defined(TARGET_NR_execveat)
+case TARGET_NR_execveat:
+ret = get_errno(safe_execveat(dirfd, p, argp, envp, flags));
+break;
+#endif
+}
+unlock_user(p, path, 0);
  
  goto execve_end;
  



I think it would be clearer to write a common function and to call it from execve and execveat, like 
it's in the kernel:


execve:

  do_execveat(AT_FDCWD, filename, argv, envp, 0);

execeveat:

  do_execveat(fd, filename, argv, envp, flags);

Thanks,
Laurent



Re: [PATCH v2 1/4] virtio-crypto: Support asynchronous mode

2022-11-01 Thread Michael S. Tsirkin
On Tue, Nov 01, 2022 at 06:37:26AM -0400, Michael S. Tsirkin wrote:
> On Sat, Oct 08, 2022 at 04:50:27PM +0800, Lei He wrote:
> > virtio-crypto: Modify the current interface of virtio-crypto
> > device to support asynchronous mode.
> > 
> > Signed-off-by: lei he 
> > ---
> >  backends/cryptodev-builtin.c|  69 ++---
> >  backends/cryptodev-vhost-user.c |  51 +--
> >  backends/cryptodev.c|  44 +++---
> >  hw/virtio/virtio-crypto.c   | 324 
> > ++--
> >  include/sysemu/cryptodev.h  |  60 +---
> >  5 files changed, 336 insertions(+), 212 deletions(-)
> > 
> > diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
> > index 125cbad1d3..cda6ca3b71 100644
> > --- a/backends/cryptodev-builtin.c
> > +++ b/backends/cryptodev-builtin.c
> > @@ -355,42 +355,62 @@ static int cryptodev_builtin_create_akcipher_session(
> >  return index;
> >  }
> >  
> > -static int64_t cryptodev_builtin_create_session(
> > +static int cryptodev_builtin_create_session(
> > CryptoDevBackend *backend,
> > CryptoDevBackendSessionInfo *sess_info,
> > -   uint32_t queue_index, Error **errp)
> > +   uint32_t queue_index,
> > +   CryptoDevCompletionFunc cb,
> > +   void *opaque)
> >  {
> >  CryptoDevBackendBuiltin *builtin =
> >CRYPTODEV_BACKEND_BUILTIN(backend);
> >  CryptoDevBackendSymSessionInfo *sym_sess_info;
> >  CryptoDevBackendAsymSessionInfo *asym_sess_info;
> > +int ret, status;
> > +Error *local_error = NULL;
> >  
> >  switch (sess_info->op_code) {
> >  case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
> >  sym_sess_info = _info->u.sym_sess_info;
> > -return cryptodev_builtin_create_cipher_session(
> > -   builtin, sym_sess_info, errp);
> > +ret = cryptodev_builtin_create_cipher_session(
> > +builtin, sym_sess_info, _error);
> > +break;
> >  
> >  case VIRTIO_CRYPTO_AKCIPHER_CREATE_SESSION:
> >  asym_sess_info = _info->u.asym_sess_info;
> > -return cryptodev_builtin_create_akcipher_session(
> > -   builtin, asym_sess_info, errp);
> > +ret = cryptodev_builtin_create_akcipher_session(
> > +   builtin, asym_sess_info, _error);
> > +break;
> >  
> >  case VIRTIO_CRYPTO_HASH_CREATE_SESSION:
> >  case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
> >  default:
> > -error_setg(errp, "Unsupported opcode :%" PRIu32 "",
> > +error_setg(_error, "Unsupported opcode :%" PRIu32 "",
> > sess_info->op_code);
> > -return -1;
> > +return -VIRTIO_CRYPTO_NOTSUPP;
> >  }
> >  
> > -return -1;
> > +if (local_error) {
> > +error_report_err(local_error);
> > +}
> > +if (ret < 0) {
> > +status = -VIRTIO_CRYPTO_ERR;
> > +} else {
> > +sess_info->session_id = ret;
> > +status = VIRTIO_CRYPTO_OK;
> > +}
> > +if (cb) {
> > +cb(opaque, status);
> > +}
> > +return 0;
> >  }
> >  
> >  static int cryptodev_builtin_close_session(
> > CryptoDevBackend *backend,
> > uint64_t session_id,
> > -   uint32_t queue_index, Error **errp)
> > +   uint32_t queue_index,
> > +   CryptoDevCompletionFunc cb,
> > +   void *opaque)
> >  {
> >  CryptoDevBackendBuiltin *builtin =
> >CRYPTODEV_BACKEND_BUILTIN(backend);
> > @@ -407,6 +427,9 @@ static int cryptodev_builtin_close_session(
> >  
> >  g_free(session);
> >  builtin->sessions[session_id] = NULL;
> > +if (cb) {
> > +cb(opaque, VIRTIO_CRYPTO_OK);
> > +}
> >  return 0;
> >  }
> >  
> > @@ -506,7 +529,9 @@ static int cryptodev_builtin_asym_operation(
> >  static int cryptodev_builtin_operation(
> >   CryptoDevBackend *backend,
> >   CryptoDevBackendOpInfo *op_info,
> > - uint32_t queue_index, Error **errp)
> > + uint32_t queue_index,
> > + CryptoDevCompletionFunc cb,
> > + void *opaque)
> >  {
> >  CryptoDevBackendBuiltin *builtin =
> >CRYPTODEV_BACKEND_BUILTIN(backend);
> > @@ -514,11 +539,12 @@ static int cryptodev_builtin_operation(
> >  CryptoDevBackendSymOpInfo *sym_op_info;
> >  CryptoDevBackendAsymOpInfo *asym_op_info;
> >  enum CryptoDevBackendAlgType algtype = op_info->algtype;
> > -int ret = -VIRTIO_CRYPTO_ERR;
> > +int status = -VIRTIO_CRYPTO_ERR;
> > +Error *local_error = NULL;
> >  
> >  if (op_info->session_id >= MAX_NUM_SESSIONS ||
> >builtin->sessions[op_info->session_id] == NULL) {
> > -error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
> > +error_setg(_error, "Cannot find a valid session id: %" 
> > PRIu64 "",
> >

Announcing QEMU 7.2 soft freeze

2022-11-01 Thread Stefan Hajnoczi
QEMU has entered soft feature freeze for the upcoming 7.2 release.
https://wiki.qemu.org/Planning/SoftFeatureFreeze

v7.2.0-rc0 is scheduled for tagging on November 8th.

If you encounter a bug that needs to be fixed before the release, please
create a GitLab Issue with the "7.2" milestone:
https://gitlab.com/qemu-project/qemu/-/milestones/7

The release planning page is here:
https://wiki.qemu.org/Planning/7.2

Contributors: Any features that haven't been merged into a maintainer's
tree yet are not expected to be in QEMU 7.2. Please help by focussing on
bug fixes during the feature freeze period. It is likely that code
review will slow down as maintainers focus on the release.

Now is a good time for everyone to add merged features to the changelog:
https://wiki.qemu.org/ChangeLog/7.2

Thanks for your help!

Stefan


signature.asc
Description: PGP signature


Re: [PATCH v5 0/6] ppc/e500: Add support for two types of flash, cleanup

2022-11-01 Thread B



Am 1. November 2022 17:43:46 UTC schrieb "Philippe Mathieu-Daudé" 
:
>On 1/11/22 17:58, Philippe Mathieu-Daudé wrote:
>> On 1/11/22 17:01, Bernhard Beschow wrote:
>>> Am 1. November 2022 10:41:51 UTC schrieb Bernhard Beschow 
>>> :
 On Mon, Oct 31, 2022 at 12:54 PM Philippe Mathieu-Daudé 
 wrote:
 
> This is a respin of Bernhard's v4 with Freescale eSDHC implemented
> as an 'UNIMP' region. See v4 cover here:
> 
> https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/
>
>>> Hi Phil,
>>> 
>>> Is there a chance to get this in for 7.2?
>> 
>> Well 1/ can you review patch #1 and 2/ we need to figure out what to do with 
>> patch #2 :) Can you point me to the CCSR datasheet?
>
>Maybe I found it, I'm looking at the "MPC8544E PowerQUICC III Integrated Host 
>Processor Family Reference Manual, Rev. 1".

This document looks similar to mine: 
https://www.nxp.com/docs/en/reference-manual/MPC8569ERM.pdf

Best regards,
Bernhard



Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to create restricted user memory

2022-11-01 Thread Michael Roth
On Tue, Nov 01, 2022 at 10:19:44AM -0500, Michael Roth wrote:
> On Tue, Nov 01, 2022 at 07:37:29PM +0800, Chao Peng wrote:
> > On Mon, Oct 31, 2022 at 12:47:38PM -0500, Michael Roth wrote:
> > > On Tue, Oct 25, 2022 at 11:13:37PM +0800, Chao Peng wrote:
> > 
> > > 
> > >   3) Potentially useful for hugetlbfs support:
> > > 
> > >  One issue with hugetlbfs is that we don't support splitting the
> > >  hugepage in such cases, which was a big obstacle prior to UPM. Now
> > >  however, we may have the option of doing "lazy" invalidations where
> > >  fallocate(PUNCH_HOLE, ...) won't free a shmem-allocate page unless
> > >  all the subpages within the 2M range are either hole-punched, or the
> > >  guest is shut down, so in that way we never have to split it. Sean
> > >  was pondering something similar in another thread:
> > > 
> > >
> > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-mm%2FYyGLXXkFCmxBfu5U%40google.com%2Fdata=05%7C01%7CMichael.Roth%40amd.com%7C28ba5dbb51844f910dec08dabc1c99e6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638029128345507924%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=bxcRfuJIgo1Z1G8HQ800HscE6y7RXRQwvWSkfc5M8Bs%3Dreserved=0
> > > 
> > >  Issuing invalidations with folio-granularity ties in fairly well
> > >  with this sort of approach if we end up going that route.
> > 
> > There is semantics difference between the current one and the proposed
> > one: The invalidation range is exactly what userspace passed down to the
> > kernel (being fallocated) while the proposed one will be subset of that
> > (if userspace-provided addr/size is not aligned to power of two), I'm
> > not quite confident this difference has no side effect.
> 
> In theory userspace should not be allocating/hole-punching restricted
> pages for GPA ranges that are already mapped as private in the xarray,
> and KVM could potentially fail such requests (though it does currently).
> 
> But if we somehow enforced that, then we could rely on
> KVM_MEMORY_ENCRYPT_REG_REGION to handle all the MMU invalidation stuff,
> which would free up the restricted fd invalidation callbacks to be used
> purely to handle doing things like RMP/directmap fixups prior to returning
> restricted pages back to the host. So that was sort of my thinking why the
> new semantics would still cover all the necessary cases.

Sorry, this explanation is if we rely on userspace to fallocate() on 2MB
boundaries, and ignore any non-aligned requests in the kernel. But
that's not how I actually ended up implementing things, so I'm not sure
why answered that way...

In my implementation we actually do issue invalidations for fallocate()
even for non-2M-aligned GPA/offset ranges. For instance (assuming
restricted FD offset 0 corresponds to GPA 0), an fallocate() on GPA
range 0x1000-0x402000 would result in the following invalidations being
issued if everything was backed by a 2MB page:

  invalidate GPA: 0x001000-0x20, Page: pfn_to_page(I), order:9
  invalidate GPA: 0x20-0x40, Page: pfn_to_page(J), order:9
  invalidate GPA: 0x40-0x402000, Page: pfn_to_page(K), order:9

So you still cover the same range, but the arch/platform callbacks can
then, as a best effort, do things like restore 2M directmap if they see
that the backing page is 2MB+ and the GPA range covers the entire range.
If the GPA doesn't covers the whole range, or the backing page is
order:0, then in that case we are still forced to leave the directmap
split.

But with that in place we can then improve on that by allowing for the
use of hugetlbfs.

We'd still be somewhat reliant on userspace to issue fallocate()'s on
2M-aligned boundaries to some degree (guest teardown invalidations
could be issued as 2M-aligned, which would be the bulk of the pages
in most cases, but for discarding pages after private->shared
conversion we could still get fragmentation). This could maybe be
addressed by keeping track of those partial/non-2M-aligned fallocate()
requests and then issuing them as a batched 2M invalidation once all
the subpages have been fallocate(HOLE_PUNCH)'d. We'd need to enforce
that fallocate(PUNCH_HOLE) is preceeded by
KVM_MEMORY_ENCRYPT_UNREG_REGION to make sure MMU invalidations happen
though.

Not sure on these potential follow-ups, but they all at least seem
compatible with the proposed invalidation scheme.

-Mike

> 
> -Mike
> 
> > 
> > > 
> > > I need to rework things for v9, and we'll probably want to use struct
> > > folio instead of struct page now, but as a proof-of-concept of sorts this
> > > is what I'd added on top of v8 of your patchset to implement 1) and 2):
> > > 
> > >   
> > > 

[PATCH 1/2] file-posix: fix Linux alignment probing when EIO is returned

2022-11-01 Thread Stefan Hajnoczi
Linux dm-crypt returns errno EIO from unaligned O_DIRECT pread(2) calls.
Alignment probing fails on dm-crypt devices because the code expects
EINVAL.

Treating any errno as an "unaligned" indicator would be easy, but breaks
commit 22d182e82b4b ("block/raw-posix: fix launching with failed
disks"). Offline disks return EIO for correctly aligned requests and
EINVAL for unaligned requests.

It's possible to make both dm-crypt and offline disks work: look for the
transition from EINVAL to EIO instead of for a single EINVAL value.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1290
Fixes: 22d182e82b4b ("block/raw-posix: fix launching with failed disks")
Signed-off-by: Stefan Hajnoczi 
---
 block/file-posix.c | 42 +++---
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index b9647c5ffc..b9d62f52fe 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -355,31 +355,6 @@ static bool raw_needs_alignment(BlockDriverState *bs)
 return s->force_alignment;
 }
 
-/* Check if read is allowed with given memory buffer and length.
- *
- * This function is used to check O_DIRECT memory buffer and request alignment.
- */
-static bool raw_is_io_aligned(int fd, void *buf, size_t len)
-{
-ssize_t ret = pread(fd, buf, len, 0);
-
-if (ret >= 0) {
-return true;
-}
-
-#ifdef __linux__
-/* The Linux kernel returns EINVAL for misaligned O_DIRECT reads.  Ignore
- * other errors (e.g. real I/O error), which could happen on a failed
- * drive, since we only care about probing alignment.
- */
-if (errno != EINVAL) {
-return true;
-}
-#endif
-
-return false;
-}
-
 static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
@@ -426,34 +401,47 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  * try to detect buf_align, which cannot be detected in some cases (e.g.
  * Gluster). If buf_align cannot be detected, we fallback to the value of
  * request_alignment.
+ *
+ * The probing loop keeps track of the last errno so that the alignment of
+ * offline disks can be probed. On Linux pread(2) returns with errno EINVAL
+ * for most file descriptors when O_DIRECT alignment constraints are unmet.
+ * Offline disks fail correctly aligned pread(2) with EIO. Therefore it's
+ * possible to detect alignment on offline disks by observing when the
+ * errno changes from EINVAL to something else.
  */
 
 if (!bs->bl.request_alignment) {
+int last_errno = 0;
 int i;
 size_t align;
 buf = qemu_memalign(max_align, max_align);
 for (i = 0; i < ARRAY_SIZE(alignments); i++) {
 align = alignments[i];
-if (raw_is_io_aligned(fd, buf, align)) {
+if (pread(fd, buf, align, 0) >= 0 ||
+(errno != EINVAL && last_errno == EINVAL)) {
 /* Fallback to safe value. */
 bs->bl.request_alignment = (align != 1) ? align : max_align;
 break;
 }
+last_errno = errno;
 }
 qemu_vfree(buf);
 }
 
 if (!s->buf_align) {
+int last_errno = 0;
 int i;
 size_t align;
 buf = qemu_memalign(max_align, 2 * max_align);
 for (i = 0; i < ARRAY_SIZE(alignments); i++) {
 align = alignments[i];
-if (raw_is_io_aligned(fd, buf + align, max_align)) {
+if (pread(fd, buf + align, max_align, 0) >= 0 ||
+(errno != EINVAL && last_errno == EINVAL)) {
 /* Fallback to request_alignment. */
 s->buf_align = (align != 1) ? align : bs->bl.request_alignment;
 break;
 }
+last_errno = errno;
 }
 qemu_vfree(buf);
 }
-- 
2.38.1




[PATCH 2/2] file-posix: add statx(STATX_DIOALIGN) support

2022-11-01 Thread Stefan Hajnoczi
Linux commit 825cf206ed51 ("statx: add direct I/O alignment
information") added an interface to fetch O_DIRECT alignment values for
block devices and file systems.

Prefer STATX_DIOALIGN to older interfaces and probing, but keep them as
fallbacks in case STATX_DIOALIGN cannot provide the information.

Testing shows the status of STATX_DIOALIGN support in Linux 6.1-rc3
appears to be:
- btrfs: no
- ext4: yes
- XFS: yes
- NVMe block devices: yes
- dm-crypt: yes

Cc: Eric Biggers 
Signed-off-by: Stefan Hajnoczi 
---
 block/file-posix.c | 54 +++---
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index b9d62f52fe..00d8191880 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -372,28 +372,48 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 
 bs->bl.request_alignment = 0;
 s->buf_align = 0;
+
+#if defined(__linux__) && defined(STATX_DIOALIGN)
+struct statx stx;
+
+/*
+ * Linux 6.1 introduced an interface for both block devices and file
+ * systems. The system call returns with the STATX_DIOALIGN bit cleared
+ * when the information is unavailable.
+ */
+if (statx(fd, "", AT_EMPTY_PATH, STATX_DIOALIGN, ) == 0 &&
+(stx.stx_mask & STATX_DIOALIGN)) {
+bs->bl.request_alignment = stx.stx_dio_offset_align;
+s->buf_align = stx.stx_dio_mem_align;
+}
+#endif /* defined(__linux__) && defined(STATX_DIOALIGN) */
+
 /* Let's try to use the logical blocksize for the alignment. */
-if (probe_logical_blocksize(fd, >bl.request_alignment) < 0) {
-bs->bl.request_alignment = 0;
+if (!bs->bl.request_alignment) {
+if (probe_logical_blocksize(fd, >bl.request_alignment) < 0) {
+bs->bl.request_alignment = 0;
+}
 }
 
 #ifdef __linux__
-/*
- * The XFS ioctl definitions are shipped in extra packages that might
- * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl
- * here, we simply use our own definition instead:
- */
-struct xfs_dioattr {
-uint32_t d_mem;
-uint32_t d_miniosz;
-uint32_t d_maxiosz;
-} da;
-if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), ) >= 0) {
-bs->bl.request_alignment = da.d_miniosz;
-/* The kernel returns wrong information for d_mem */
-/* s->buf_align = da.d_mem; */
+if (!bs->bl.request_alignment) {
+/*
+ * The XFS ioctl definitions are shipped in extra packages that might
+ * not always be available. Since we just need the XFS_IOC_DIOINFO 
ioctl
+ * here, we simply use our own definition instead:
+ */
+struct xfs_dioattr {
+uint32_t d_mem;
+uint32_t d_miniosz;
+uint32_t d_maxiosz;
+} da;
+if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), ) >= 0) {
+bs->bl.request_alignment = da.d_miniosz;
+/* The kernel returns wrong information for d_mem */
+/* s->buf_align = da.d_mem; */
+}
 }
-#endif
+#endif /* __linux__ */
 
 /*
  * If we could not get the sizes so far, we can only guess them. First try
-- 
2.38.1




[PATCH 0/2] file-posix: alignment probing improvements

2022-11-01 Thread Stefan Hajnoczi
These patches fix alignment probing with dm-crypt and add support for the new
Linux statx(STATX_DIOALIGN) interface.

Stefan Hajnoczi (2):
  file-posix: fix Linux alignment probing when EIO is returned
  file-posix: add statx(STATX_DIOALIGN) support

 block/file-posix.c | 96 +-
 1 file changed, 52 insertions(+), 44 deletions(-)

-- 
2.38.1




Re: [v2] hw: misc: edu: fix 2 off-by-one errors

2022-11-01 Thread Christopher Friedt
Hi Jiri, Peter,

Are you able to review the more recent version of this change?

Look for the subject "[PATCH v4 x/3] hw: misc: edu: ..."

I believe I addressed all concerns.

Cheers,

C


On Mon, Oct 17, 2022 at 12:36 PM Christopher Friedt
 wrote:
>
> On Mon, Oct 17, 2022 at 2:23 AM Jiri Slaby  wrote:
> > On 15. 10. 22, 23:10, Chris Friedt wrote:
> > > From: Christopher Friedt 
> > This should be split into two patches. This way, it's hard to review.
>
> I can do that :-)
>
> Thanks for the review



Re: [PATCH 09/16] hw/9pfs: Disable unsupported flags and features for Windows

2022-11-01 Thread Christian Schoenebeck
On Tuesday, November 1, 2022 4:34:54 PM CET Shi, Guohuai wrote:
> 
> > -Original Message-
> > From: Christian Schoenebeck 
> > Sent: Tuesday, November 1, 2022 23:04
> > To: qemu-devel@nongnu.org
> > Cc: Shi, Guohuai ; Greg Kurz ;
> > Meng, Bin 
> > Subject: Re: [PATCH 09/16] hw/9pfs: Disable unsupported flags and features
> > for Windows
> > 
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > 
> > On Monday, October 24, 2022 6:57:52 AM CET Bin Meng wrote:
> > > From: Guohuai Shi 
> > >
> > > Some flags and features are not supported on Windows, like mknod,
> > > readlink, file mode, etc. Update the codes for Windows.
> > >
> > > Signed-off-by: Guohuai Shi 
> > > Signed-off-by: Bin Meng 
> > > ---
> > >
> > >  hw/9pfs/9p-util.h |  6 +++-
> > >  hw/9pfs/9p.c  | 90 ++-
> > >  2 files changed, 86 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > 82b2d0c3e4..3d154e9103 100644
> > > --- a/hw/9pfs/9p-util.h
> > > +++ b/hw/9pfs/9p-util.h
> > > @@ -53,8 +53,10 @@ static inline uint64_t makedev_dotl(uint32_t dev_major,
> > uint32_t dev_minor)
> > >   */
> > >  static inline uint64_t host_dev_to_dotl_dev(dev_t dev)  { -#ifdef
> > > CONFIG_LINUX
> > > +#if defined(CONFIG_LINUX)
> > >  return dev;
> > > +#elif defined(CONFIG_WIN32)
> > > +return 0;
> > 
> > Really?
> 
> Check MS this document: 
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=msvc-170
> st_rdev: If a device, fd; otherwise 0.
> st_dev: If a device, fd; otherwise 0.
> 
> So for any file open, it should be 0.

Yeah, but that function translates a corresponding device ID for *Linux* guest
side. And the intention is to avoid e.g. file ID collisions on guest side.
Because for a Linux guest, the two-tuple (device number, inode number) makes a
system-wide unique file ID.

If you just return zero here, that might be OK if only one 9p directory is
exported to guest, but say you have "C:\foo\" exported and "D:\bar\" exported
and mounted via 9p to guest, then guest would assume every file with the same
inode number on those two to be the same files. But they are not. They are on
two different drives actually.

> 
> > 
> > >  #else
> > >  return makedev_dotl(major(dev), minor(dev));  #endif @@ -260,7
> > > +262,9 @@ static inline struct dirent *qemu_dirent_dup(struct dirent
> > > *dent)  #if defined CONFIG_DARWIN && defined CONFIG_PTHREAD_FCHDIR_NP
> > > int pthread_fchdir_np(int fd) __attribute__((weak_import));  #endif
> > > +#ifndef CONFIG_WIN32
> > >  int qemu_mknodat(P9_FILE_ID dirfd, const char *filename, mode_t mode,
> > >   dev_t dev);
> > > +#endif
> > >
> > >  #endif
> > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 6c4af86240..771aab34ac
> > > 100644
> > > --- a/hw/9pfs/9p.c
> > > +++ b/hw/9pfs/9p.c
> > > @@ -39,6 +39,11 @@
> > >  #include "qemu/xxhash.h"
> > >  #include 
> > >
> > > +#ifdef CONFIG_WIN32
> > > +#define UTIME_NOW   ((1l << 30) - 1l)
> > > +#define UTIME_OMIT  ((1l << 30) - 2l) #endif
> > > +
> > >  int open_fd_hw;
> > >  int total_open_fd;
> > >  static int open_fd_rc;
> > > @@ -132,13 +137,17 @@ static int dotl_to_open_flags(int flags)
> > >  DotlOpenflagMap dotl_oflag_map[] = {
> > >  { P9_DOTL_CREATE, O_CREAT },
> > >  { P9_DOTL_EXCL, O_EXCL },
> > > +#ifndef CONFIG_WIN32
> > >  { P9_DOTL_NOCTTY , O_NOCTTY },
> > > +#endif
> > >  { P9_DOTL_TRUNC, O_TRUNC },
> > >  { P9_DOTL_APPEND, O_APPEND },
> > > +#ifndef CONFIG_WIN32
> > >  { P9_DOTL_NONBLOCK, O_NONBLOCK } ,
> > >  { P9_DOTL_DSYNC, O_DSYNC },
> > >  { P9_DOTL_FASYNC, FASYNC },
> > > -#ifndef CONFIG_DARWIN
> > > +#endif
> > > +#ifdef CONFIG_LINUX
> > 
> > Better
> > 
> >#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32)
> > 
> 
> It is OK.

You got my point, hopefully:

> > Otherwise it might automatically opt-out other future platforms
> > unintentionally.
> > 
> > >  { P9_DOTL_NOATIME, O_NOATIME },
> > >  /*
> > >   *  On Darwin, we could map to F_NOCACHE, which is @@ -151,8
> > > +160,10 @@ static int dotl_to_open_flags(int flags)  #endif
> > >  { P9_DOTL_LARGEFILE, O_LARGEFILE },
> > >  { P9_DOTL_DIRECTORY, O_DIRECTORY },
> > > +#ifndef CONFIG_WIN32
> > >  { P9_DOTL_NOFOLLOW, O_NOFOLLOW },
> > >  { P9_DOTL_SYNC, O_SYNC },
> > > +#endif
> > >  };
> > >
> > >  for (i = 0; i < ARRAY_SIZE(dotl_oflag_map); i++) { @@ -179,8
> > > +190,11 @@ static int get_dotl_openflags(V9fsState *s, int oflags)
> > >   * Filter the client open flags
> > >   */
> > >  flags = dotl_to_open_flags(oflags);
> > > -flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT);
> > > -#ifndef CONFIG_DARWIN
> > > +flags &= ~(O_CREAT);
> > > +#ifndef CONFIG_WIN32
> > > +flags &= ~(O_NOCTTY | O_ASYNC);
> > > +#endif
> > > +#ifdef 

Re: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs

2022-11-01 Thread Christian Schoenebeck
On Tuesday, November 1, 2022 4:20:53 PM CET Shi, Guohuai wrote:
> 
[...]
> > > > Windows POSIX API and MinGW library do not provide the NO_FOLLOW
> > > > flag, and do not allow opening a directory by POSIX open(). This
> > > > causes all
> > > > xxx_at() functions cannot work directly. However, we can provide
> > > > Windows handle based functions to emulate xxx_at() functions (e.g.:
> > > > openat_win32, utimensat_win32, etc.).
> > > >
> > > > Windows does not support extended attributes. 9pfs for Windows uses
> > > > NTFS ADS (Alternate Data Streams) to emulate extended attributes.
> > > >
> > > > Windows does not provide POSIX compatible readlink(), and symbolic
> > > > link feature in 9pfs will be disabled on Windows.
> > >
> > > Wouldn't it be more user friendly if the relevant error locations
> > > would use something like error_report_once() and suggesting to enable
> > > mapped(-xattr) to make 9p symlinks on guest working if desired by the 
> > > user?
> > >
> > > Probably this error case would need to wrapped into a dedicated
> > > function, otherwise I guess error_report_once() would fire several
> > > times by different callers.
> > >
> > 
> > Windows (MinGW) does not only support symlink, but also does not have 
> > symlink
> > definitions.
> > Windows does not support symlink flags S_IFLNK.
> > 
> > So even I add symlink support by mapped-xattr, the MinGW library does not
> > have symlink flags and get a build error.
> > And this flags is defined by Windows header files.
> > The impact of adding a new flags to an pre-defined structure (struct stat) 
> > is
> > unknown.
> > 
> > So I think it is not a good idea to do that.
> 
> Because Windows does not support symlink, so error_report_once() and report 
> it to user will be OK.
> But mapped-xattr could not work.

Showing an error makes sense for "passthrough" security model, but not for the
"mapped" one.

Just to avoid misapprehensions: are you aware that there is already a system-
agnostic implementation for symlinks if "mapped" is used?

When mapped security model is enabled, then creating symlinks on guest will
simply create a corresponding *regular* file on host and the content of that
regular file on host is the pointing path as raw string. Additionally the
symlink flag is added to "virtfs.mode" xattr to mark that regular file as a
symlink, a virtual one that is. So this does not require any support for
symlinks by either the underlying host file system, nor by host OS.

Likewise interpreting and walking those virtual symlinks in "mapped" mode is
also implemented in the local fs driver already.

Best regards,
Christian Schoenebeck





[PATCH 2/2] tests: acpi: Fixup for tables in Arm HMAT series

2022-11-01 Thread Hesham Almatary via
Updates to issues with earlier patches in the pull request:
tests: virt: Update expected *.acpihmatvirt tables

Signed-off-by: Hesham Almatary 
---
 tests/data/acpi/virt/APIC.acpihmatvirt | Bin 396 -> 412 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/data/acpi/virt/APIC.acpihmatvirt 
b/tests/data/acpi/virt/APIC.acpihmatvirt
index 
873e12f67c3838351a3ab4b7a43ee76e7730849a..68200204c6f8f2706c9896dbbccc5ecbec130d26
 100644
GIT binary patch
delta 90
zcmeBSp2N)L7~ttVhmnDSrF$ZmHAesg1WYz$v=)W3!4x-82B?aG5hTM0V!;4+z{GoA
RY#<(Yz+^2(ugL|BG5{#E3(f!l

delta 46
wcmbQk+{4V}7~tvL!^ptEyk;VoHKWf!mh`pNZ4J

[PATCH 1/2] tests: q35: acpi: Fixup for tables in noinitiator test.

2022-11-01 Thread Hesham Almatary via
Updates to issues with earlier patches in the pull request:
tests: acpi: q35: update expected blobs *.hmat-noinitiators expected HMAT

Signed-off-by: Hesham Almatary 
---
 tests/data/acpi/q35/DSDT.acpihmat-noinitiator | Bin 8553 -> 8691 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/data/acpi/q35/DSDT.acpihmat-noinitiator 
b/tests/data/acpi/q35/DSDT.acpihmat-noinitiator
index 
c767d11cb1d088f613c49e55a7139cccababf66c..8efa1c5ded52b8a9dfbb6945a3c75cdc6ef9b277
 100644
GIT binary patch
delta 459
zcmaFq^x2uqCD#RZ$>5`k;y-qcGNp@2Dk+K@-A>;o1Dc^Ajr(fkWj#o
zxIk*sD~dnn)%T
zBtevM1b8~TF^KTQ18K=mJCDQvSg%Syh@0>V4kCWkWi35&$L2e@*?dw9C=Iywh<
z8W
z&!9jit^_Wg=thId4~6vW*#d$B9b@!B`amoY*9p#b0&$(;TxSs11P}{$J4M(6oc%%=1nfcHoh&17
F1px9@hSC54

delta 280
zcmezD{L+cbCDCpbBraYu>^TYyWjFW&+m*2x)y!dwCu%uhZIHm`V<^02dSfupBOL{4tP
z0_DkB3=1SDXL2zu5S~;!Ie}qv0?|1r^%-9Gb7#4~z5S^UG#a)_8AMER7z!C50
z62!yIAisI1get43KTEuOfGbD5ho=j#qjP|#fq{V;!{nWcR-5n29%bb6V~GLzRKSg8
Q@)~(tb`iD!XTJ~z0E?YeUH||9

-- 
2.25.1




Re: [PATCH v5 0/6] ppc/e500: Add support for two types of flash, cleanup

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 18:43, Philippe Mathieu-Daudé wrote:

On 1/11/22 17:58, Philippe Mathieu-Daudé wrote:

On 1/11/22 17:01, Bernhard Beschow wrote:
Am 1. November 2022 10:41:51 UTC schrieb Bernhard Beschow 
:
On Mon, Oct 31, 2022 at 12:54 PM Philippe Mathieu-Daudé 


wrote:


This is a respin of Bernhard's v4 with Freescale eSDHC implemented
as an 'UNIMP' region. See v4 cover here:

https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/



Hi Phil,

Is there a chance to get this in for 7.2?


Well 1/ can you review patch #1 and 2/ we need to figure out what to 
do with patch #2 :) Can you point me to the CCSR datasheet?


Maybe I found it, I'm looking at the "MPC8544E PowerQUICC III Integrated 
Host Processor Family Reference Manual, Rev. 1".


On "Table 2-11. CCSR Block Base Address Map" I see the 0x2_7000–0x3_0FFF 
region marked as 'Reserved'. How does the eSDHC end mapped there?




Re: [PATCH v5 0/6] ppc/e500: Add support for two types of flash, cleanup

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 17:58, Philippe Mathieu-Daudé wrote:

On 1/11/22 17:01, Bernhard Beschow wrote:
Am 1. November 2022 10:41:51 UTC schrieb Bernhard Beschow 
:
On Mon, Oct 31, 2022 at 12:54 PM Philippe Mathieu-Daudé 


wrote:


This is a respin of Bernhard's v4 with Freescale eSDHC implemented
as an 'UNIMP' region. See v4 cover here:

https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/



Hi Phil,

Is there a chance to get this in for 7.2?


Well 1/ can you review patch #1 and 2/ we need to figure out what to do 
with patch #2 :) Can you point me to the CCSR datasheet?


Maybe I found it, I'm looking at the "MPC8544E PowerQUICC III Integrated 
Host Processor Family Reference Manual, Rev. 1".




Re: [PATCH] migration: check magic value for deciding the mapping of channels

2022-11-01 Thread Daniel P . Berrangé
On Tue, Nov 01, 2022 at 09:10:14PM +0530, manish.mishra wrote:
> 
> On 01/11/22 8:21 pm, Daniel P. Berrangé wrote:
> > On Tue, Nov 01, 2022 at 02:30:29PM +, manish.mishra wrote:
> > > Current logic assumes that channel connections on the destination side are
> > > always established in the same order as the source and the first one will
> > > always be the default channel followed by the multifid or post-copy
> > > preemption channel. This may not be always true, as even if a channel has 
> > > a
> > > connection established on the source side it can be in the pending state 
> > > on
> > > the destination side and a newer connection can be established first.
> > > Basically causing out of order mapping of channels on the destination 
> > > side.
> > > Currently, all channels except post-copy preempt send a magic number, this
> > > patch uses that magic number to decide the type of channel. This logic is
> > > applicable only for precopy(multifd) live migration, as mentioned, the
> > > post-copy preempt channel does not send any magic number. Also, this patch
> > > uses MSG_PEEK to check the magic number of channels so that current
> > > data/control stream management remains un-effected.
> > > 
> > > Signed-off-by: manish.mishra 
> > > ---
> > >   include/io/channel.h | 25 +
> > >   io/channel-socket.c  | 27 +++
> > >   io/channel.c | 39 +++
> > >   migration/migration.c| 33 +
> > >   migration/multifd.c  | 12 
> > >   migration/multifd.h  |  2 +-
> > >   migration/postcopy-ram.c |  5 +
> > >   migration/postcopy-ram.h |  2 +-
> > >   8 files changed, 119 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/include/io/channel.h b/include/io/channel.h
> > > index c680ee7480..74177aeeea 100644
> > > --- a/include/io/channel.h
> > > +++ b/include/io/channel.h
> > > @@ -115,6 +115,10 @@ struct QIOChannelClass {
> > >   int **fds,
> > >   size_t *nfds,
> > >   Error **errp);
> > > +ssize_t (*io_read_peek)(QIOChannel *ioc,
> > > +void *buf,
> > > +size_t nbytes,
> > > +Error **errp);
> > >   int (*io_close)(QIOChannel *ioc,
> > >   Error **errp);
> > >   GSource * (*io_create_watch)(QIOChannel *ioc,
> > > @@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
> > > size_t buflen,
> > > Error **errp);
> > > +/**
> > > + * qio_channel_read_peek_all:
> > > + * @ioc: the channel object
> > > + * @buf: the memory region to read in data
> > > + * @nbytes: the number of bytes to read
> > > + * @errp: pointer to a NULL-initialized error object
> > > + *
> > > + * Read given @nbytes data from peek of channel into
> > > + * memory region @buf.
> > > + *
> > > + * The function will be blocked until read size is
> > > + * equal to requested size.
> > > + *
> > > + * Returns: 1 if all bytes were read, 0 if end-of-file
> > > + *  occurs without data, or -1 on error
> > > + */
> > > +int qio_channel_read_peek_all(QIOChannel *ioc,
> > > +  void* buf,
> > > +  size_t nbytes,
> > > +  Error **errp);
> > > +
> > >   /**
> > >* qio_channel_set_blocking:
> > >* @ioc: the channel object
> > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > index b76dca9cc1..b99f5dfda6 100644
> > > --- a/io/channel-socket.c
> > > +++ b/io/channel-socket.c
> > > @@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel 
> > > *ioc,
> > >   }
> > >   #endif /* WIN32 */
> > > +static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,
> > > +void *buf,
> > > +size_t nbytes,
> > > +Error **errp)
> > > +{
> > > +QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> > > +ssize_t bytes = 0;
> > > +
> > > +retry:
> > > +bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
> > > +
> > > +if (bytes < 0) {
> > > +if (errno == EINTR) {
> > > +goto retry;
> > > +}
> > > +if (errno == EAGAIN) {
> > > +return QIO_CHANNEL_ERR_BLOCK;
> > > +}
> > > +
> > > +error_setg_errno(errp, errno,
> > > + "Unable to read from peek of socket");
> > > +return -1;
> > > +}
> > > +
> > > +return bytes;
> > > +}
> > >   #ifdef QEMU_MSG_ZEROCOPY
> > >   static int qio_channel_socket_flush(QIOChannel *ioc,
> > > @@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
> > > *klass,
> > >   ioc_klass->io_writev = qio_channel_socket_writev;
> > >   ioc_klass->io_readv = 

[PATCH v9 08/17] msi: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of msi_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 hw/pci/msi.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index 058d1d1ef1..5283a08b5a 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -194,7 +194,6 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 unsigned int vectors_order;
 uint16_t flags;
 uint8_t cap_size;
-int config_offset;
 
 if (!msi_nonbroken) {
 error_setg(errp, "MSI is not supported by interrupt controller");
@@ -221,13 +220,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 }
 
 cap_size = msi_cap_sizeof(flags);
-config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset,
-cap_size, errp);
-if (config_offset < 0) {
-return config_offset;
-}
-
-dev->msi_cap = config_offset;
+dev->msi_cap = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size);
 dev->cap_present |= QEMU_PCI_CAP_MSI;
 
 pci_set_word(dev->config + msi_flags_off(dev), flags);
-- 
2.38.1




[PATCH v8 01/17] hw/vfio/pci: Ensure MSI and MSI-X do not overlap

2022-11-01 Thread Akihiko Odaki
pci_add_capability() checks whether capabilities overlap, and notifies
its caller so that it can properly handle the case. However, in the
most cases, the capabilities actually never overlap, and the interface
incurred extra error handling code, which is often incorrect or
suboptimal. For such cases, pci_add_capability() can simply abort the
execution if the capabilities actually overlap since it should be a
programming error.

This change handles the other cases: hw/vfio/pci depends on the check to
decide MSI and MSI-X capabilities overlap with another. As they are
quite an exceptional and hw/vfio/pci knows much about PCI capabilities,
adding code specific to the cases to hw/vfio/pci still results in less
code than having error handling code everywhere in total.

Signed-off-by: Akihiko Odaki 
---
 hw/pci/pci.c | 34 ++
 hw/vfio/pci.c| 15 ++-
 include/hw/pci/pci.h |  3 +++
 3 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2f450f6a72..b53649d1fd 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2512,6 +2512,25 @@ static void pci_del_option_rom(PCIDevice *pdev)
 pdev->has_rom = false;
 }
 
+bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
+  uint8_t offset, uint8_t size, Error **errp)
+{
+int i;
+
+for (i = offset; i < offset + size; i++) {
+if (pdev->used[i]) {
+error_setg(errp,
+   "%s:%02x:%02x.%x PCI capability %x at offset %x 
overlaps existing capability %x at offset %x",
+   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
+   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+   cap_id, offset, pci_find_capability_at_offset(pdev, i), 
i);
+return true;
+}
+}
+
+return false;
+}
+
 /*
  * On success, pci_add_capability() returns a positive value
  * that the offset of the pci capability.
@@ -2523,7 +2542,6 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
Error **errp)
 {
 uint8_t *config;
-int i, overlapping_cap;
 
 if (!offset) {
 offset = pci_find_space(pdev, size);
@@ -2534,17 +2552,9 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
  * depends on this check to verify that the device is not broken.
  * Should never trigger for emulated devices, but it's helpful
  * for debugging these. */
-for (i = offset; i < offset + size; i++) {
-overlapping_cap = pci_find_capability_at_offset(pdev, i);
-if (overlapping_cap) {
-error_setg(errp, "%s:%02x:%02x.%x "
-   "Attempt to add PCI capability %x at offset "
-   "%x overlaps existing capability %x at offset %x",
-   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
-   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
-   cap_id, offset, overlapping_cap, i);
-return -EINVAL;
-}
+pci_check_capability_overlap(pdev, cap_id, offset, size, errp);
+if (errp) {
+return -EINVAL;
 }
 }
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 939dcc3d4a..0ca6b5ff4b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1298,6 +1298,14 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, 
Error **errp)
 
 trace_vfio_msi_setup(vdev->vbasedev.name, pos);
 
+vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0);
+
+ret = pci_check_capability_overlap(>pdev, PCI_CAP_ID_MSI,
+   pos, vdev->msi_cap_size, errp);
+if (!ret) {
+return -EINVAL;
+}
+
 ret = msi_init(>pdev, pos, entries, msi_64bit, msi_maskbit, );
 if (ret < 0) {
 if (ret == -ENOTSUP) {
@@ -1306,7 +1314,6 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, 
Error **errp)
 error_propagate_prepend(errp, err, "msi_init failed: ");
 return ret;
 }
-vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0);
 
 return 0;
 }
@@ -1575,6 +1582,12 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, 
Error **errp)
 int ret;
 Error *err = NULL;
 
+ret = pci_check_capability_overlap(>pdev, PCI_CAP_ID_MSIX,
+   pos, MSIX_CAP_LENGTH, errp);
+if (!ret) {
+return -EINVAL;
+}
+
 vdev->msix->pending = g_new0(unsigned long,
  BITS_TO_LONGS(vdev->msix->entries));
 ret = msix_init(>pdev, vdev->msix->entries,
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b54b6ef88f..77b264c17e 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -390,6 +390,9 @@ void pci_register_vga(PCIDevice *pci_dev, MemoryRegion *mem,
 void pci_unregister_vga(PCIDevice *pci_dev);
 

[PATCH v8 15/17] hw/vfio/pci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
The code generating errors in pci_add_capability has a comment which
says:
> Verify that capabilities don't overlap.  Note: device assignment
> depends on this check to verify that the device is not broken.
> Should never trigger for emulated devices, but it's helpful for
> debugging these.

Indeed vfio has some code that passes capability offsets and sizes from
a physical device, but it explicitly pays attention so that the
capabilities never overlap. Therefore, in pci_add_capability(), we can
always assert that capabilities never overlap, and that is what happens
when omitting errp.

Signed-off-by: Akihiko Odaki 
---
 hw/vfio/pci-quirks.c | 15 +++
 hw/vfio/pci.c| 14 +-
 2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index f0147a050a..e94fd273ea 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1530,7 +1530,7 @@ const PropertyInfo qdev_prop_nv_gpudirect_clique = {
 static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
 {
 PCIDevice *pdev = >pdev;
-int ret, pos = 0xC8;
+int pos = 0xC8;
 
 if (vdev->nv_gpudirect_clique == 0xFF) {
 return 0;
@@ -1547,11 +1547,7 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice 
*vdev, Error **errp)
 return -EINVAL;
 }
 
-ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
-if (ret < 0) {
-error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
-return ret;
-}
+pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8);
 
 memset(vdev->emulated_config_bits + pos, 0xFF, 8);
 pos += PCI_CAP_FLAGS;
@@ -1718,12 +1714,7 @@ static int vfio_add_vmd_shadow_cap(VFIOPCIDevice *vdev, 
Error **errp)
 return -EFAULT;
 }
 
-ret = pci_add_capability(>pdev, PCI_CAP_ID_VNDR, pos,
- VMD_SHADOW_CAP_LEN, errp);
-if (ret < 0) {
-error_prepend(errp, "Failed to add VMD MEMBAR Shadow cap: ");
-return ret;
-}
+pci_add_capability(>pdev, PCI_CAP_ID_VNDR, pos, VMD_SHADOW_CAP_LEN);
 
 memset(vdev->emulated_config_bits + pos, 0xFF, VMD_SHADOW_CAP_LEN);
 pos += PCI_CAP_FLAGS;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0ca6b5ff4b..458729eae3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1839,7 +1839,7 @@ static void vfio_add_emulated_long(VFIOPCIDevice *vdev, 
int pos,
 vfio_set_long_bits(vdev->emulated_config_bits + pos, mask, mask);
 }
 
-static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
+static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, uint8_t pos, uint8_t size,
Error **errp)
 {
 uint16_t flags;
@@ -1956,11 +1956,7 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int 
pos, uint8_t size,
1, PCI_EXP_FLAGS_VERS);
 }
 
-pos = pci_add_capability(>pdev, PCI_CAP_ID_EXP, pos, size,
- errp);
-if (pos < 0) {
-return pos;
-}
+pos = pci_add_capability(>pdev, PCI_CAP_ID_EXP, pos, size);
 
 vdev->pdev.exp.exp_cap = pos;
 
@@ -2058,14 +2054,14 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, 
uint8_t pos, Error **errp)
 case PCI_CAP_ID_PM:
 vfio_check_pm_reset(vdev, pos);
 vdev->pm_cap = pos;
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 case PCI_CAP_ID_AF:
 vfio_check_af_flr(vdev, pos);
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 default:
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 }
 
-- 
2.38.1




[PATCH v8 17/17] pci: Remove legacy errp from pci_add_capability

2022-11-01 Thread Akihiko Odaki
Signed-off-by: Akihiko Odaki 
---
 hw/pci/pci.c | 20 +---
 include/hw/pci/pci.h | 12 ++--
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index cce57f572c..41de7643af 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2532,14 +2532,11 @@ bool pci_check_capability_overlap(PCIDevice *pdev, 
uint8_t cap_id,
 }
 
 /*
- * On success, pci_add_capability_legacy() returns a positive value
- * that the offset of the pci capability.
- * On failure, it sets an error and returns a negative error
- * code.
+ * pci_add_capability() returns a positive value that the offset of the pci
+ * capability.
  */
-int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
-  uint8_t offset, uint8_t size,
-  Error **errp)
+uint8_t pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
+   uint8_t offset, uint8_t size)
 {
 uint8_t *config;
 
@@ -2548,14 +2545,7 @@ int pci_add_capability_legacy(PCIDevice *pdev, uint8_t 
cap_id,
 /* out of PCI config space is programming error */
 assert(offset);
 } else {
-/* Verify that capabilities don't overlap.  Note: device assignment
- * depends on this check to verify that the device is not broken.
- * Should never trigger for emulated devices, but it's helpful
- * for debugging these. */
-pci_check_capability_overlap(pdev, cap_id, offset, size, errp);
-if (errp) {
-return -EINVAL;
-}
+pci_check_capability_overlap(pdev, cap_id, offset, size, _abort);
 }
 
 config = pdev->config + offset;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 50c00ece3e..1baa7628b2 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -2,7 +2,6 @@
 #define QEMU_PCI_H
 
 #include "exec/memory.h"
-#include "qapi/error.h"
 #include "sysemu/dma.h"
 
 /* PCI includes legacy ISA access.  */
@@ -394,15 +393,8 @@ pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int 
region_num);
 bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
   uint8_t offset, uint8_t size, Error **errp);
 
-int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
-  uint8_t offset, uint8_t size,
-  Error **errp);
-
-#define PCI_ADD_CAPABILITY_VA(pdev, cap_id, offset, size, errp, ...) \
-pci_add_capability_legacy(pdev, cap_id, offset, size, errp)
-
-#define pci_add_capability(...) \
-PCI_ADD_CAPABILITY_VA(__VA_ARGS__, _abort)
+uint8_t pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
+   uint8_t offset, uint8_t size);
 
 void pci_del_capability(PCIDevice *pci_dev, uint8_t cap_id, uint8_t cap_size);
 
-- 
2.38.1




Re: [PULL v2 00/13] tcg-next patch queue

2022-11-01 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


[PATCH v9 10/17] pcie: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of a PCIe function which calls
pci_add_capability() in turn is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
Acked-by: Jonathan Cameron  (for CXL parts)
---
 docs/pcie_sriov.txt|  4 +--
 include/hw/pci/pcie.h  | 11 +++
 hw/display/bochs-display.c |  4 +--
 hw/net/e1000e.c|  4 +--
 hw/pci-bridge/cxl_downstream.c |  9 ++
 hw/pci-bridge/cxl_upstream.c   |  8 ++---
 hw/pci-bridge/pcie_pci_bridge.c|  6 +---
 hw/pci-bridge/pcie_root_port.c |  9 +-
 hw/pci-bridge/xio3130_downstream.c |  7 +---
 hw/pci-bridge/xio3130_upstream.c   |  7 +---
 hw/pci-host/designware.c   |  3 +-
 hw/pci-host/xilinx-pcie.c  |  4 +--
 hw/pci/pcie.c  | 52 --
 hw/usb/hcd-xhci-pci.c  |  3 +-
 hw/virtio/virtio-pci.c |  3 +-
 15 files changed, 35 insertions(+), 99 deletions(-)

diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
index 11158dbf88..728a73ba7b 100644
--- a/docs/pcie_sriov.txt
+++ b/docs/pcie_sriov.txt
@@ -49,7 +49,7 @@ setting up a BAR for a VF.
pci_your_pf_dev_realize( ... )
{
   ...
-  int ret = pcie_endpoint_cap_init(d, 0x70);
+  pcie_endpoint_cap_init(d, 0x70);
   ...
   pcie_ari_init(d, 0x100, 1);
   ...
@@ -79,7 +79,7 @@ setting up a BAR for a VF.
pci_your_vf_dev_realize( ... )
{
   ...
-  int ret = pcie_endpoint_cap_init(d, 0x60);
+  pcie_endpoint_cap_init(d, 0x60);
   ...
   pcie_ari_init(d, 0x100, 1);
   ...
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 798a262a0a..7a35851ae8 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -92,13 +92,12 @@ struct PCIExpressDevice {
 #define COMPAT_PROP_PCP "power_controller_present"
 
 /* PCI express capability helper functions */
-int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type,
-  uint8_t port, Error **errp);
-int pcie_cap_v1_init(PCIDevice *dev, uint8_t offset,
- uint8_t type, uint8_t port);
-int pcie_endpoint_cap_init(PCIDevice *dev, uint8_t offset);
+void pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type, uint8_t port);
+void pcie_cap_v1_init(PCIDevice *dev, uint8_t offset,
+  uint8_t type, uint8_t port);
+void pcie_endpoint_cap_init(PCIDevice *dev, uint8_t offset);
 void pcie_cap_exit(PCIDevice *dev);
-int pcie_endpoint_cap_v1_init(PCIDevice *dev, uint8_t offset);
+void pcie_endpoint_cap_v1_init(PCIDevice *dev, uint8_t offset);
 void pcie_cap_v1_exit(PCIDevice *dev);
 uint8_t pcie_cap_get_type(const PCIDevice *dev);
 void pcie_cap_flags_set_vector(PCIDevice *dev, uint8_t vector);
diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
index 8ed734b195..111cabcfb3 100644
--- a/hw/display/bochs-display.c
+++ b/hw/display/bochs-display.c
@@ -265,7 +265,6 @@ static void bochs_display_realize(PCIDevice *dev, Error 
**errp)
 {
 BochsDisplayState *s = BOCHS_DISPLAY(dev);
 Object *obj = OBJECT(dev);
-int ret;
 
 if (s->vgamem < 4 * MiB) {
 error_setg(errp, "bochs-display: video memory too small");
@@ -302,8 +301,7 @@ static void bochs_display_realize(PCIDevice *dev, Error 
**errp)
 }
 
 if (pci_bus_is_express(pci_get_bus(dev))) {
-ret = pcie_endpoint_cap_init(dev, 0x80);
-assert(ret > 0);
+pcie_endpoint_cap_init(dev, 0x80);
 } else {
 dev->cap_present &= ~QEMU_PCI_CAP_EXPRESS;
 }
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index e433b8f9a5..aea4305c43 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -462,9 +462,7 @@ static void e1000e_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 
 e1000e_init_msix(s);
 
-if (pcie_endpoint_cap_v1_init(pci_dev, e1000e_pcie_offset) < 0) {
-hw_error("Failed to initialize PCIe capability");
-}
+pcie_endpoint_cap_v1_init(pci_dev, e1000e_pcie_offset);
 
 ret = msi_init(PCI_DEVICE(s), 0xD0, 1, true, false, NULL);
 if (ret) {
diff --git a/hw/pci-bridge/cxl_downstream.c b/hw/pci-bridge/cxl_downstream.c
index a361e519d0..1980dd9c6c 100644
--- a/hw/pci-bridge/cxl_downstream.c
+++ b/hw/pci-bridge/cxl_downstream.c
@@ -155,12 +155,8 @@ static void cxl_dsp_realize(PCIDevice *d, Error **errp)
 goto err_bridge;
 }
 
-rc = pcie_cap_init(d, CXL_DOWNSTREAM_PORT_EXP_OFFSET,
-   PCI_EXP_TYPE_DOWNSTREAM, p->port,
-   errp);
-if (rc < 0) {
-goto err_msi;
-}
+pcie_cap_init(d, CXL_DOWNSTREAM_PORT_EXP_OFFSET,
+  PCI_EXP_TYPE_DOWNSTREAM, p->port);
 
 pcie_cap_flr_init(d);
 pcie_cap_deverr_init(d);
@@ -195,7 +191,6 @@ static void cxl_dsp_realize(PCIDevice *d, Error **errp)
 pcie_chassis_del_slot(s);
  err_pcie_cap:
 pcie_cap_exit(d);
- err_msi:
 msi_uninit(d);
  err_bridge:
 

Re: [PATCH] migration: check magic value for deciding the mapping of channels

2022-11-01 Thread manish.mishra



On 01/11/22 9:15 pm, Daniel P. Berrangé wrote:

On Tue, Nov 01, 2022 at 09:10:14PM +0530, manish.mishra wrote:

On 01/11/22 8:21 pm, Daniel P. Berrangé wrote:

On Tue, Nov 01, 2022 at 02:30:29PM +, manish.mishra wrote:

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the default channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, this patch
uses MSG_PEEK to check the magic number of channels so that current
data/control stream management remains un-effected.

Signed-off-by: manish.mishra 
---
   include/io/channel.h | 25 +
   io/channel-socket.c  | 27 +++
   io/channel.c | 39 +++
   migration/migration.c| 33 +
   migration/multifd.c  | 12 
   migration/multifd.h  |  2 +-
   migration/postcopy-ram.c |  5 +
   migration/postcopy-ram.h |  2 +-
   8 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
   int **fds,
   size_t *nfds,
   Error **errp);
+ssize_t (*io_read_peek)(QIOChannel *ioc,
+void *buf,
+size_t nbytes,
+Error **errp);
   int (*io_close)(QIOChannel *ioc,
   Error **errp);
   GSource * (*io_create_watch)(QIOChannel *ioc,
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
 size_t buflen,
 Error **errp);
+/**
+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ *  occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+  void* buf,
+  size_t nbytes,
+  Error **errp);
+
   /**
* qio_channel_set_blocking:
* @ioc: the channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..b99f5dfda6 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
   }
   #endif /* WIN32 */
+static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,
+void *buf,
+size_t nbytes,
+Error **errp)
+{
+QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+ssize_t bytes = 0;
+
+retry:
+bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
+
+if (bytes < 0) {
+if (errno == EINTR) {
+goto retry;
+}
+if (errno == EAGAIN) {
+return QIO_CHANNEL_ERR_BLOCK;
+}
+
+error_setg_errno(errp, errno,
+ "Unable to read from peek of socket");
+return -1;
+}
+
+return bytes;
+}
   #ifdef QEMU_MSG_ZEROCOPY
   static int qio_channel_socket_flush(QIOChannel *ioc,
@@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
*klass,
   ioc_klass->io_writev = qio_channel_socket_writev;
   ioc_klass->io_readv = qio_channel_socket_readv;
+ioc_klass->io_read_peek = qio_channel_socket_read_peek;
   ioc_klass->io_set_blocking = qio_channel_socket_set_blocking;
   ioc_klass->io_close = qio_channel_socket_close;
   ioc_klass->io_shutdown = qio_channel_socket_shutdown;
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..a2d9b96f3f 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -346,6 +346,45 @@ int qio_channel_write_all(QIOChannel *ioc,
   return qio_channel_writev_all(ioc, , 1, errp);
   }
+int qio_channel_read_peek_all(QIOChannel *ioc,
+  void* buf,
+  

[PATCH v9 00/17] pci: Abort if pci_add_capability fails

2022-11-01 Thread Akihiko Odaki
pci_add_capability appears most PCI devices. Its error handling required
lots of code, and led to inconsistent behaviors such as:
- passing error_abort
- passing error_fatal
- asserting the returned value
- propagating the error to the caller
- skipping the rest of the function
- just ignoring

The code generating errors in pci_add_capability had a comment which
says:
> Verify that capabilities don't overlap.  Note: device assignment
> depends on this check to verify that the device is not broken.
> Should never trigger for emulated devices, but it's helpful for
> debugging these.

Indeed vfio has some code that passes capability offsets and sizes from
a physical device, but it explicitly pays attention so that the
capabilities never overlap and the only exception are MSI and MSI-X
capabilities. Therefore, we can add code specific to the case, and
always assert that capabilities never overlap in the other cases,
resolving these inconsistencies.

v9:
- Return correct value with pci_check_capability_overlap()
  (Philippe Mathieu-Daudé)
- Use scripts/git.orderfile (Philippe Mathieu-Daudé)
- Simplify error checking in pci_add_capability() (Philippe Mathieu-Daudé)
- Improve comments (Philippe Mathieu-Daudé)

v8:
- Return boolean with pci_check_capability_overlap() (Philippe Mathieu-Daudé)

v7:
- Perform checks in vfio_msi_setup() and vfio_msix_setup() (Alex Williamson)

v6:
- Error in case of MSI/MSI-X capability overlap (Alex Williamson)

v5:
- Fix capability ID specification in vfio_msi_early_setup (Alex Williamson)
- Use range_covers_byte() (Alex Williamson)
- warn_report() in case of MSI/MSI-X capability overlap (Alex Williamson)

v4:
- Fix typos in messages (Markus Armbruster)
- hw/vfio/pci: Ensure MSI and MSI-X do not overlap (Alex Williamson)

v3:
- Correct patch split between virtio-pci and pci (Markus Armbruster)
- Add messages for individual patches (Markus Armbruster)
- Acked-by: Jonathan Cameron 

Akihiko Odaki (17):
  hw/vfio/pci: Ensure MSI and MSI-X do not overlap
  pci: Allow to omit errp for pci_add_capability
  hw/i386/amd_iommu: Omit errp for pci_add_capability
  ahci: Omit errp for pci_add_capability
  e1000e: Omit errp for pci_add_capability
  eepro100: Omit errp for pci_add_capability
  hw/nvme: Omit errp for pci_add_capability
  msi: Omit errp for pci_add_capability
  hw/pci/pci_bridge: Omit errp for pci_add_capability
  pcie: Omit errp for pci_add_capability
  pci/shpc: Omit errp for pci_add_capability
  msix: Omit errp for pci_add_capability
  pci/slotid: Omit errp for pci_add_capability
  hw/pci-bridge/pcie_pci_bridge: Omit errp for pci_add_capability
  hw/vfio/pci: Omit errp for pci_add_capability
  virtio-pci: Omit errp for pci_add_capability
  pci: Remove legacy errp from pci_add_capability

 docs/pcie_sriov.txt|  4 +--
 include/hw/pci/pci.h   | 12 +--
 include/hw/pci/pci_bridge.h|  5 ++-
 include/hw/pci/pcie.h  | 11 +++
 include/hw/pci/shpc.h  |  3 +-
 include/hw/virtio/virtio-pci.h |  2 +-
 hw/display/bochs-display.c |  4 +--
 hw/i386/amd_iommu.c| 21 +++-
 hw/ide/ich.c   |  8 ++---
 hw/net/e1000e.c| 22 +++--
 hw/net/eepro100.c  |  7 +---
 hw/nvme/ctrl.c | 14 ++--
 hw/pci-bridge/cxl_downstream.c |  9 ++
 hw/pci-bridge/cxl_upstream.c   |  8 ++---
 hw/pci-bridge/i82801b11.c  | 14 ++--
 hw/pci-bridge/pci_bridge_dev.c |  2 +-
 hw/pci-bridge/pcie_pci_bridge.c| 19 +++
 hw/pci-bridge/pcie_root_port.c | 16 ++---
 hw/pci-bridge/xio3130_downstream.c | 15 ++---
 hw/pci-bridge/xio3130_upstream.c   | 15 ++---
 hw/pci-host/designware.c   |  3 +-
 hw/pci-host/xilinx-pcie.c  |  4 +--
 hw/pci/msi.c   |  9 +-
 hw/pci/msix.c  |  8 ++---
 hw/pci/pci.c   | 47 +--
 hw/pci/pci_bridge.c| 21 
 hw/pci/pcie.c  | 52 --
 hw/pci/shpc.c  | 23 -
 hw/pci/slotid_cap.c|  8 ++---
 hw/usb/hcd-xhci-pci.c  |  3 +-
 hw/vfio/pci-quirks.c   | 15 ++---
 hw/vfio/pci.c  | 29 +++--
 hw/virtio/virtio-pci.c | 12 ++-
 33 files changed, 136 insertions(+), 309 deletions(-)

-- 
2.38.1




[PATCH v9 17/17] pci: Remove legacy errp from pci_add_capability

2022-11-01 Thread Akihiko Odaki
Signed-off-by: Akihiko Odaki 
---
 include/hw/pci/pci.h | 12 ++--
 hw/pci/pci.c | 18 --
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 555ac03010..da414dc728 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -2,7 +2,6 @@
 #define QEMU_PCI_H
 
 #include "exec/memory.h"
-#include "qapi/error.h"
 #include "sysemu/dma.h"
 
 /* PCI includes legacy ISA access.  */
@@ -398,15 +397,8 @@ pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int 
region_num);
 bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
   uint8_t offset, uint8_t size, Error **errp);
 
-int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
-  uint8_t offset, uint8_t size,
-  Error **errp);
-
-#define PCI_ADD_CAPABILITY_VA(pdev, cap_id, offset, size, errp, ...) \
-pci_add_capability_legacy(pdev, cap_id, offset, size, errp)
-
-#define pci_add_capability(...) \
-PCI_ADD_CAPABILITY_VA(__VA_ARGS__, _abort)
+uint8_t pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
+   uint8_t offset, uint8_t size);
 
 void pci_del_capability(PCIDevice *pci_dev, uint8_t cap_id, uint8_t cap_size);
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 5f77ca581a..41ec69ea7c 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2532,14 +2532,10 @@ bool pci_check_capability_overlap(PCIDevice *pdev, 
uint8_t cap_id,
 }
 
 /*
- * On success, pci_add_capability_legacy() returns a positive value
- * that the offset of the pci capability.
- * On failure, it sets an error and returns a negative error
- * code.
+ * Return: offset of the pci capability.
  */
-int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
-  uint8_t offset, uint8_t size,
-  Error **errp)
+uint8_t pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
+   uint8_t offset, uint8_t size)
 {
 uint8_t *config;
 
@@ -2548,13 +2544,7 @@ int pci_add_capability_legacy(PCIDevice *pdev, uint8_t 
cap_id,
 /* out of PCI config space is programming error */
 assert(offset);
 } else {
-/* Verify that capabilities don't overlap.  Note: device assignment
- * depends on this check to verify that the device is not broken.
- * Should never trigger for emulated devices, but it's helpful
- * for debugging these. */
-if (!pci_check_capability_overlap(pdev, cap_id, offset, size, errp)) {
-return -EINVAL;
-}
+pci_check_capability_overlap(pdev, cap_id, offset, size, _abort);
 }
 
 config = pdev->config + offset;
-- 
2.38.1




Re: [PATCH] target/i386/cpu: disable PERFCORE for AMD when cpu.pmu is off

2022-11-01 Thread Liang Yan

Hey Vitaly,

On 10/31/22 6:07 AM, Vitaly Kuznetsov wrote:

Liang Yan  writes:


With cpu.pmu=off, perfctr_core could still be seen in an AMD guest cpuid.
By further digging, I found cpu.perfctr_core did the trick. However,
considering the 'enable_pmu' in KVM could work on both Intel and AMD,
we may add AMD PMU control under 'enabe_pmu' in QEMU too.

This change will overide the property 'perfctr_ctr' and change the AMD PMU
to off by default.

Signed-off-by: Liang Yan 
---
  target/i386/cpu.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 22b681ca37..edf5413c90 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5706,6 +5706,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
  *ecx |= 1 << 1;/* CmpLegacy bit */
  }
  }
+
+if (!cpu->enable_pmu) {
+*ecx &= ~CPUID_EXT3_PERFCORE;
+}
  break;
  case 0x8002:
  case 0x8003:

I may be missing something but my first impression is that this will
make CPUID_EXT3_PERFCORE bit disappear when a !enable_pmu VM is migrated
from an old QEMU (pre-patch) to a new one. If so, then additional
precautions should be taking against that (e.g. tying the change to
CPU/machine model versions, for example).

Thanks for the reply, it is a quite good point. I was struggled with it 
a little bit earlier because cpu.pmu has such operation for Intel CPU. 
After further talk with AMD people, I noticed that AMD PMU is more than 
perfctr_core, it has some legacy counters in use. I will dig a little 
further and send a v2 with extra cpu counters and live migration 
compatibility.



Regards,

Liang







Re: [PATCH] hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios

2022-11-01 Thread Jonathan Cameron via
On Wed, 26 Oct 2022 16:59:13 -0400
Gregory Price  wrote:

> Early-boot e820 records will be inserted by the bios/efi/early boot
> software and be reported to the kernel via insert_resource.  Later, when
> CXL drivers iterate through the regions again, they will insert another
> resource and make the RESERVED memory area a child.
> 
> This RESERVED memory area causes the memory region to become unusable,
> and as a result attempting to create memory regions with
> 
> `cxl create-region ...`
> 
> Will fail due to the RESERVED area intersecting with the CXL window.
> 
> During boot the following traceback is observed:
> 
> 0x81101650 in insert_resource_expand_to_fit ()
> 0x83d964c5 in e820__reserve_resources_late ()
> 0x83e03210 in pcibios_resource_survey ()
> 0x83e04f4a in pcibios_init ()
> 
> Which produces a call to reserve the CFMWS area:
> 
> (gdb) p *new
> $54 = {start = 0x29000, end = 0x2cfff, name = "Reserved",
>flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0,
>child = 0x0}
> 
> Later the Kernel parses ACPI tables and reserves the exact same area as
> the CXL Fixed Memory Window:
> 
> 0x811016a4 in insert_resource_conflict ()
>   insert_resource ()
> 0x81a81389 in cxl_parse_cfmws ()
> 0x818c4a81 in call_handler ()
>   acpi_parse_entries_array ()
> 
> (gdb) p/x *new
> $59 = {start = 0x29000, end = 0x2cfff, name = "CXL Window 0",
>flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0,
>child = 0x0}
> 
> This produces the following output in /proc/iomem:
> 
> 59000-68fff : CXL Window 0
>   59000-68fff : Reserved
> 
> This reserved area causes `get_free_mem_region()` to fail due to a check
> against `__region_intersects()`.  Due to this reserved area, the
> intersect check will only ever return REGION_INTERSECTS, which causes
> `cxl create-region` to always fail.
> 
> Signed-off-by: Gregory Price 

My understanding of e820 is limited, but from discussions with Intel folk
I believe this fix to be correct - we should never have had e820 regions
for CXL Fixed Memory Windows. As such

Acked-by: Jonathan Cameron 

> ---
>  hw/i386/pc.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 768982ae9a..203c90fedb 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1062,7 +1062,6 @@ void pc_memory_init(PCMachineState *pcms,
>  hwaddr cxl_size = MiB;
>  
>  cxl_base = pc_get_cxl_range_start(pcms);
> -e820_add_entry(cxl_base, cxl_size, E820_RESERVED);
>  memory_region_init(mr, OBJECT(machine), "cxl_host_reg", cxl_size);
>  memory_region_add_subregion(system_memory, cxl_base, mr);
>  cxl_resv_end = cxl_base + cxl_size;
> @@ -1078,7 +1077,6 @@ void pc_memory_init(PCMachineState *pcms,
>  memory_region_init_io(>mr, OBJECT(machine), _ops, 
> fw,
>"cxl-fixed-memory-region", fw->size);
>  memory_region_add_subregion(system_memory, fw->base, 
> >mr);
> -e820_add_entry(fw->base, fw->size, E820_RESERVED);
>  cxl_fmw_base += fw->size;
>  cxl_resv_end = cxl_fmw_base;
>  }




Re: [PULL 21/86] bios-tables-test: add test for number of cores > 255

2022-11-01 Thread Igor Mammedov
On Tue, 1 Nov 2022 13:52:21 +
Jonathan Cameron  wrote:

> On Mon, 31 Oct 2022 08:51:44 -0400
> "Michael S. Tsirkin"  wrote:
> 
> > From: Julia Suvorova 
> > 
> > The new test is run with a large number of cpus and checks if the
> > core_count field in smbios_cpu_test (structure type 4) is correct.
> > 
> > Choose q35 as it allows to run with -smp > 255.  
> 
> Getting a failure on this on i386.
> 
> qemu-system-i386: current -smp configuration requires kernel irqchip and 
> X2APIC API support.

just posted a fixup, see my other reply to Stefan in this pull req.

> 
> Note that was on bisection of this pull request applied to current mainline
> (also in the CI report for the HMAT set - though there is another issue 
> there.)
> 
> My guess is fix is don't run it unless 64 bit?
> 
> Jonathan
> 
> 
> > 
> > Signed-off-by: Julia Suvorova 
> > Message-Id: <20220731162141.178443-5-jus...@redhat.com>
> > Message-Id: <2022101731.101412-5-jus...@redhat.com>
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > Reviewed-by: Igor Mammedov 
> > ---
> >  tests/qtest/bios-tables-test.c | 58 ++
> >  1 file changed, 45 insertions(+), 13 deletions(-)
> > 
> > diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> > index d4fbe6791d..e402b57d46 100644
> > --- a/tests/qtest/bios-tables-test.c
> > +++ b/tests/qtest/bios-tables-test.c
> > @@ -92,6 +92,8 @@ typedef struct {
> >  SmbiosEntryPoint smbios_ep_table;
> >  uint16_t smbios_cpu_max_speed;
> >  uint16_t smbios_cpu_curr_speed;
> > +uint8_t smbios_core_count;
> > +uint16_t smbios_core_count2;
> >  uint8_t *required_struct_types;
> >  int required_struct_types_len;
> >  QTestState *qts;
> > @@ -631,29 +633,42 @@ static inline bool smbios_single_instance(uint8_t 
> > type)
> >  }
> >  }
> >  
> > -static bool smbios_cpu_test(test_data *data, uint32_t addr)
> > +static void smbios_cpu_test(test_data *data, uint32_t addr,
> > +SmbiosEntryPointType ep_type)
> >  {
> > -uint16_t expect_speed[2];
> > -uint16_t real;
> > +uint8_t core_count, expected_core_count = data->smbios_core_count;
> > +uint16_t speed, expected_speed[2];
> > +uint16_t core_count2, expected_core_count2 = data->smbios_core_count2;
> >  int offset[2];
> >  int i;
> >  
> >  /* Check CPU speed for backward compatibility */
> >  offset[0] = offsetof(struct smbios_type_4, max_speed);
> >  offset[1] = offsetof(struct smbios_type_4, current_speed);
> > -expect_speed[0] = data->smbios_cpu_max_speed ? : 2000;
> > -expect_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
> > +expected_speed[0] = data->smbios_cpu_max_speed ? : 2000;
> > +expected_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
> >  
> >  for (i = 0; i < 2; i++) {
> > -real = qtest_readw(data->qts, addr + offset[i]);
> > -if (real != expect_speed[i]) {
> > -fprintf(stderr, "Unexpected SMBIOS CPU speed: real %u expect 
> > %u\n",
> > -real, expect_speed[i]);
> > -return false;
> > -}
> > +speed = qtest_readw(data->qts, addr + offset[i]);
> > +g_assert_cmpuint(speed, ==, expected_speed[i]);
> >  }
> >  
> > -return true;
> > +core_count = qtest_readb(data->qts,
> > +addr + offsetof(struct smbios_type_4, core_count));
> > +
> > +if (expected_core_count) {
> > +g_assert_cmpuint(core_count, ==, expected_core_count);
> > +}
> > +
> > +if (ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
> > +core_count2 = qtest_readw(data->qts,
> > +  addr + offsetof(struct smbios_type_4, 
> > core_count2));
> > +
> > +/* Core Count has reached its limit, checking Core Count 2 */
> > +if (expected_core_count == 0xFF && expected_core_count2) {
> > +g_assert_cmpuint(core_count2, ==, expected_core_count2);
> > +}
> > +}
> >  }
> >  
> >  static void test_smbios_structs(test_data *data, SmbiosEntryPointType 
> > ep_type)
> > @@ -686,7 +701,7 @@ static void test_smbios_structs(test_data *data, 
> > SmbiosEntryPointType ep_type)
> >  set_bit(type, struct_bitmap);
> >  
> >  if (type == 4) {
> > -g_assert(smbios_cpu_test(data, addr));
> > +smbios_cpu_test(data, addr, ep_type);
> >  }
> >  
> >  /* seek to end of unformatted string area of this struct ("\0\0") 
> > */
> > @@ -908,6 +923,21 @@ static void test_acpi_q35_tcg(void)
> >  free_test_data();
> >  }
> >  
> > +static void test_acpi_q35_tcg_core_count2(void)
> > +{
> > +test_data data = {
> > +.machine = MACHINE_Q35,
> > +.variant = ".core-count2",
> > +.required_struct_types = base_required_struct_types,
> > +.required_struct_types_len = 
> > ARRAY_SIZE(base_required_struct_types),
> > +.smbios_core_count = 0xFF,
> > 

[PATCH v9 07/17] hw/nvme: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/nvme/ctrl.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 87aeba0564..ff4e2beea6 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -7325,17 +7325,9 @@ static void nvme_init_sriov(NvmeCtrl *n, PCIDevice 
*pci_dev, uint16_t offset)
   PCI_BASE_ADDRESS_MEM_TYPE_64, bar_size);
 }
 
-static int nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t offset)
+static void nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t offset)
 {
-Error *err = NULL;
-int ret;
-
-ret = pci_add_capability(pci_dev, PCI_CAP_ID_PM, offset,
- PCI_PM_SIZEOF, );
-if (err) {
-error_report_err(err);
-return ret;
-}
+pci_add_capability(pci_dev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
 
 pci_set_word(pci_dev->config + offset + PCI_PM_PMC,
  PCI_PM_CAP_VER_1_2);
@@ -7343,8 +7335,6 @@ static int nvme_add_pm_capability(PCIDevice *pci_dev, 
uint8_t offset)
  PCI_PM_CTRL_NO_SOFT_RESET);
 pci_set_word(pci_dev->wmask + offset + PCI_PM_CTRL,
  PCI_PM_CTRL_STATE_MASK);
-
-return 0;
 }
 
 static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
-- 
2.38.1




Re: [PATCH] migration: check magic value for deciding the mapping of channels

2022-11-01 Thread Daniel P . Berrangé
On Tue, Nov 01, 2022 at 02:30:29PM +, manish.mishra wrote:
> Current logic assumes that channel connections on the destination side are
> always established in the same order as the source and the first one will
> always be the default channel followed by the multifid or post-copy
> preemption channel. This may not be always true, as even if a channel has a
> connection established on the source side it can be in the pending state on
> the destination side and a newer connection can be established first.
> Basically causing out of order mapping of channels on the destination side.
> Currently, all channels except post-copy preempt send a magic number, this
> patch uses that magic number to decide the type of channel. This logic is
> applicable only for precopy(multifd) live migration, as mentioned, the
> post-copy preempt channel does not send any magic number. Also, this patch
> uses MSG_PEEK to check the magic number of channels so that current
> data/control stream management remains un-effected.
> 
> Signed-off-by: manish.mishra 
> ---
>  include/io/channel.h | 25 +
>  io/channel-socket.c  | 27 +++
>  io/channel.c | 39 +++
>  migration/migration.c| 33 +
>  migration/multifd.c  | 12 
>  migration/multifd.h  |  2 +-
>  migration/postcopy-ram.c |  5 +
>  migration/postcopy-ram.h |  2 +-
>  8 files changed, 119 insertions(+), 26 deletions(-)
> 
> diff --git a/include/io/channel.h b/include/io/channel.h
> index c680ee7480..74177aeeea 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -115,6 +115,10 @@ struct QIOChannelClass {
>  int **fds,
>  size_t *nfds,
>  Error **errp);
> +ssize_t (*io_read_peek)(QIOChannel *ioc,
> +void *buf,
> +size_t nbytes,
> +Error **errp);
>  int (*io_close)(QIOChannel *ioc,
>  Error **errp);
>  GSource * (*io_create_watch)(QIOChannel *ioc,
> @@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
>size_t buflen,
>Error **errp);
>  
> +/**
> + * qio_channel_read_peek_all:
> + * @ioc: the channel object
> + * @buf: the memory region to read in data
> + * @nbytes: the number of bytes to read
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Read given @nbytes data from peek of channel into
> + * memory region @buf.
> + *
> + * The function will be blocked until read size is
> + * equal to requested size.
> + *
> + * Returns: 1 if all bytes were read, 0 if end-of-file
> + *  occurs without data, or -1 on error
> + */
> +int qio_channel_read_peek_all(QIOChannel *ioc,
> +  void* buf,
> +  size_t nbytes,
> +  Error **errp);
> +
>  /**
>   * qio_channel_set_blocking:
>   * @ioc: the channel object
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index b76dca9cc1..b99f5dfda6 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
>  }
>  #endif /* WIN32 */
>  
> +static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,
> +void *buf,
> +size_t nbytes,
> +Error **errp)
> +{
> +QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> +ssize_t bytes = 0;
> +
> +retry:
> +bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
> +
> +if (bytes < 0) {
> +if (errno == EINTR) {
> +goto retry;
> +}
> +if (errno == EAGAIN) {
> +return QIO_CHANNEL_ERR_BLOCK;
> +}
> +
> +error_setg_errno(errp, errno,
> + "Unable to read from peek of socket");
> +return -1;
> +}
> +
> +return bytes;
> +}
>  
>  #ifdef QEMU_MSG_ZEROCOPY
>  static int qio_channel_socket_flush(QIOChannel *ioc,
> @@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
> *klass,
>  
>  ioc_klass->io_writev = qio_channel_socket_writev;
>  ioc_klass->io_readv = qio_channel_socket_readv;
> +ioc_klass->io_read_peek = qio_channel_socket_read_peek;
>  ioc_klass->io_set_blocking = qio_channel_socket_set_blocking;
>  ioc_klass->io_close = qio_channel_socket_close;
>  ioc_klass->io_shutdown = qio_channel_socket_shutdown;
> diff --git a/io/channel.c b/io/channel.c
> index 0640941ac5..a2d9b96f3f 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -346,6 +346,45 @@ int qio_channel_write_all(QIOChannel *ioc,
>  return qio_channel_writev_all(ioc, , 1, errp);
>  }
>  
> +int qio_channel_read_peek_all(QIOChannel *ioc,
> +  

Re: [PATCH v7 01/17] hw/vfio/pci: Ensure MSI and MSI-X do not overlap

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 01:58, Akihiko Odaki wrote:

pci_add_capability() checks whether capabilities overlap, and notifies
its caller so that it can properly handle the case. However, in the
most cases, the capabilities actually never overlap, and the interface
incurred extra error handling code, which is often incorrect or
suboptimal. For such cases, pci_add_capability() can simply abort the
execution if the capabilities actually overlap since it should be a
programming error.

This change handles the other cases: hw/vfio/pci depends on the check to
decide MSI and MSI-X capabilities overlap with another. As they are
quite an exceptional and hw/vfio/pci knows much about PCI capabilities,
adding code specific to the cases to hw/vfio/pci still results in less
code than having error handling code everywhere in total.

Signed-off-by: Akihiko Odaki 
---
  hw/pci/pci.c | 32 
  hw/vfio/pci.c| 15 ++-
  include/hw/pci/pci.h |  3 +++
  3 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2f450f6a72..33f5406706 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2512,6 +2512,23 @@ static void pci_del_option_rom(PCIDevice *pdev)
  pdev->has_rom = false;
  }
  
+void pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,

+  uint8_t offset, uint8_t size, Error **errp)
+{
+int i;
+
+for (i = offset; i < offset + size; i++) {
+if (pdev->used[i]) {
+error_setg(errp,
+   "%s:%02x:%02x.%x PCI capability %x at offset %x overlaps 
existing capability %x at offset %x",
+   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
+   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+   cap_id, offset, pci_find_capability_at_offset(pdev, i), 
i);
+return;


Per the Error API ("qapi/error.h" or 
https://gitlab.com/qemu-project/qemu/-/commit/e3fe3988d7851cac) this 
function should return a boolean (true on success and false on error).



+}
+}
+}




Re: [PATCH] virtio: add VIRTQUEUE_ERROR QAPI event

2022-11-01 Thread Vladimir Sementsov-Ogievskiy

ping

On 9/19/22 22:48, Vladimir Sementsov-Ogievskiy wrote:

For now we only log the vhost device error, when virtqueue is actually
stopped. Let's add a QAPI event, which makes possible:

  - collect statistics of such errors
  - make immediate actions: take coredums or do some other debugging

The event could be reused for some other virtqueue problems (not only
for vhost devices) in future. For this it gets a generic name and
structure.

Signed-off-by: Vladimir Sementsov-Ogievskiy


--
Best regards,
Vladimir




[PATCH v9 09/17] hw/pci/pci_bridge: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of pci_bridge_ssvid_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 include/hw/pci/pci_bridge.h|  5 ++---
 hw/pci-bridge/i82801b11.c  | 14 ++
 hw/pci-bridge/pcie_root_port.c |  7 +--
 hw/pci-bridge/xio3130_downstream.c |  8 ++--
 hw/pci-bridge/xio3130_upstream.c   |  8 ++--
 hw/pci/pci_bridge.c| 21 ++---
 6 files changed, 15 insertions(+), 48 deletions(-)

diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index ba4bafac7c..e499482972 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -101,9 +101,8 @@ typedef struct PXBDev PXBDev;
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
  TYPE_PXB_CXL_DEVICE)
 
-int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
-  uint16_t svid, uint16_t ssid,
-  Error **errp);
+void pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
+   uint16_t svid, uint16_t ssid);
 
 PCIDevice *pci_bridge_get_device(PCIBus *bus);
 PCIBus *pci_bridge_get_sec_bus(PCIBridge *br);
diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index f28181e210..f45dcdbacc 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -61,21 +61,11 @@ typedef struct I82801b11Bridge {
 
 static void i82801b11_bridge_realize(PCIDevice *d, Error **errp)
 {
-int rc;
-
 pci_bridge_initfn(d, TYPE_PCI_BUS);
 
-rc = pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
-   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_bridge;
-}
+pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
+  I82801ba_SSVID_SVID, I82801ba_SSVID_SSID);
 pci_config_set_prog_interface(d->config, PCI_CLASS_BRIDGE_PCI_INF_SUB);
-return;
-
-err_bridge:
-pci_bridge_exitfn(d);
 }
 
 static const VMStateDescription i82801b11_bridge_dev_vmstate = {
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index 460e48269d..a9d8c2adb4 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -74,12 +74,7 @@ static void rp_realize(PCIDevice *d, Error **errp)
 }
 pcie_port_init_reg(d);
 
-rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
-   rpc->ssid, errp);
-if (rc < 0) {
-error_append_hint(errp, "Can't init SSV ID, error %d\n", rc);
-goto err_bridge;
-}
+pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id, rpc->ssid);
 
 if (rpc->interrupts_init) {
 rc = rpc->interrupts_init(d, errp);
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index 05e2b06c0c..eea3d3a2df 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -81,12 +81,8 @@ static void xio3130_downstream_realize(PCIDevice *d, Error 
**errp)
 goto err_bridge;
 }
 
-rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
-   XIO3130_SSVID_SVID, XIO3130_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_msi;
-}
+pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+  XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
 
 rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM,
p->port, errp);
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 5ff46ef050..d954906d79 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -71,12 +71,8 @@ static void xio3130_upstream_realize(PCIDevice *d, Error 
**errp)
 goto err_bridge;
 }
 
-rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
-   XIO3130_SSVID_SVID, XIO3130_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_msi;
-}
+pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+  XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
 
 rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM,
p->port, errp);
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index da34c8ebcd..30032fed64 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -42,21 +42,15 @@
 #define PCI_SSVID_SVID  4
 #define PCI_SSVID_SSID  6
 
-int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
-  uint16_t svid, uint16_t ssid,
-  Error **errp)
+void pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
+   uint16_t svid, uint16_t ssid)
 {
-int pos;
+uint8_t pos;
 
-pos = 

[PATCH v9 02/17] pci: Allow to omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
pci_add_capability appears most PCI devices. Its error handling required
lots of code, and led to inconsistent behaviors such as:
- passing error_abort
- passing error_fatal
- asserting the returned value
- propagating the error to the caller
- skipping the rest of the function
- just ignoring

The code generating errors in pci_add_capability had a comment which
says:
> Verify that capabilities don't overlap.  Note: device assignment
> depends on this check to verify that the device is not broken.
> Should never trigger for emulated devices, but it's helpful for
> debugging these.

Indeed vfio has some code that passes capability offsets and sizes from
a physical device, but it explicitly pays attention so that the
capabilities never overlap. Therefore, we can always assert that
capabilities never overlap when pci_add_capability is called, resolving
these inconsistencies.

Such an implementation of pci_add_capability will not have errp
parameter. However, there are so many callers of pci_add_capability
that it does not make sense to amend all of them at once to match
with the new signature. Instead, this change will allow callers of
pci_add_capability to omit errp as the first step.

Signed-off-by: Akihiko Odaki 
---
 include/hw/pci/pci.h | 13 ++---
 hw/pci/pci.c |  8 
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index f4e6612440..555ac03010 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -2,6 +2,7 @@
 #define QEMU_PCI_H
 
 #include "exec/memory.h"
+#include "qapi/error.h"
 #include "sysemu/dma.h"
 
 /* PCI includes legacy ISA access.  */
@@ -397,9 +398,15 @@ pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int 
region_num);
 bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
   uint8_t offset, uint8_t size, Error **errp);
 
-int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
-   uint8_t offset, uint8_t size,
-   Error **errp);
+int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
+  uint8_t offset, uint8_t size,
+  Error **errp);
+
+#define PCI_ADD_CAPABILITY_VA(pdev, cap_id, offset, size, errp, ...) \
+pci_add_capability_legacy(pdev, cap_id, offset, size, errp)
+
+#define pci_add_capability(...) \
+PCI_ADD_CAPABILITY_VA(__VA_ARGS__, _abort)
 
 void pci_del_capability(PCIDevice *pci_dev, uint8_t cap_id, uint8_t cap_size);
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 5531e30385..5f77ca581a 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2532,14 +2532,14 @@ bool pci_check_capability_overlap(PCIDevice *pdev, 
uint8_t cap_id,
 }
 
 /*
- * On success, pci_add_capability() returns a positive value
+ * On success, pci_add_capability_legacy() returns a positive value
  * that the offset of the pci capability.
  * On failure, it sets an error and returns a negative error
  * code.
  */
-int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
-   uint8_t offset, uint8_t size,
-   Error **errp)
+int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
+  uint8_t offset, uint8_t size,
+  Error **errp)
 {
 uint8_t *config;
 
-- 
2.38.1




Re: [PATCH v4 2/3] util/main-loop: Avoid adding the same HANDLE twice

2022-11-01 Thread Philippe Mathieu-Daudé

On 19/10/22 12:20, Bin Meng wrote:

From: Bin Meng 

Fix the logic in qemu_add_wait_object() to avoid adding the same
HANDLE twice, as the behavior is undefined when passing an array
that contains same HANDLEs to WaitForMultipleObjects() API.

Signed-off-by: Bin Meng 
---

(no changes since v3)

Changes in v3:
- new patch: avoid adding the same HANDLE twice

  include/qemu/main-loop.h |  2 ++
  util/main-loop.c | 10 ++
  2 files changed, 12 insertions(+)


Reviewed-by: Philippe Mathieu-Daudé 




[PATCH v8 04/17] ahci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/ide/ich.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 1007a51fcb..3b478b01f8 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -106,7 +106,7 @@ static void pci_ich9_ahci_init(Object *obj)
 static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp)
 {
 struct AHCIPCIState *d;
-int sata_cap_offset;
+uint8_t sata_cap_offset;
 uint8_t *sata_cap;
 d = ICH9_AHCI(dev);
 int ret;
@@ -130,11 +130,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
**errp)
  >ahci.mem);
 
 sata_cap_offset = pci_add_capability(dev, PCI_CAP_ID_SATA,
-  ICH9_SATA_CAP_OFFSET, SATA_CAP_SIZE,
-  errp);
-if (sata_cap_offset < 0) {
-return;
-}
+  ICH9_SATA_CAP_OFFSET, SATA_CAP_SIZE);
 
 sata_cap = dev->config + sata_cap_offset;
 pci_set_word(sata_cap + SATA_CAP_REV, 0x10);
-- 
2.38.1




[PATCH v9 11/17] pci/shpc: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of shpc_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 include/hw/pci/shpc.h   |  3 +--
 hw/pci-bridge/pci_bridge_dev.c  |  2 +-
 hw/pci-bridge/pcie_pci_bridge.c |  2 +-
 hw/pci/shpc.c   | 23 ++-
 4 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/include/hw/pci/shpc.h b/include/hw/pci/shpc.h
index d5683b7399..18ab16ec9f 100644
--- a/include/hw/pci/shpc.h
+++ b/include/hw/pci/shpc.h
@@ -38,8 +38,7 @@ struct SHPCDevice {
 
 void shpc_reset(PCIDevice *d);
 int shpc_bar_size(PCIDevice *dev);
-int shpc_init(PCIDevice *dev, PCIBus *sec_bus, MemoryRegion *bar,
-  unsigned off, Error **errp);
+int shpc_init(PCIDevice *dev, PCIBus *sec_bus, MemoryRegion *bar, unsigned 
off);
 void shpc_cleanup(PCIDevice *dev, MemoryRegion *bar);
 void shpc_free(PCIDevice *dev);
 void shpc_cap_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int len);
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 657a06ddbe..4b6d1876eb 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -66,7 +66,7 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error 
**errp)
 dev->config[PCI_INTERRUPT_PIN] = 0x1;
 memory_region_init(_dev->bar, OBJECT(dev), "shpc-bar",
shpc_bar_size(dev));
-err = shpc_init(dev, >sec_bus, _dev->bar, 0, errp);
+err = shpc_init(dev, >sec_bus, _dev->bar, 0);
 if (err) {
 goto shpc_error;
 }
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index df5dfdd139..99778e3e24 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -42,7 +42,7 @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error 
**errp)
 d->config[PCI_INTERRUPT_PIN] = 0x1;
 memory_region_init(_br->shpc_bar, OBJECT(d), "shpc-bar",
shpc_bar_size(d));
-rc = shpc_init(d, >sec_bus, _br->shpc_bar, 0, errp);
+rc = shpc_init(d, >sec_bus, _br->shpc_bar, 0);
 if (rc) {
 goto error;
 }
diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
index e71f3a7483..5b3228c793 100644
--- a/hw/pci/shpc.c
+++ b/hw/pci/shpc.c
@@ -440,16 +440,11 @@ static void shpc_cap_update_dword(PCIDevice *d)
 }
 
 /* Add SHPC capability to the config space for the device. */
-static int shpc_cap_add_config(PCIDevice *d, Error **errp)
+static void shpc_cap_add_config(PCIDevice *d)
 {
 uint8_t *config;
-int config_offset;
-config_offset = pci_add_capability(d, PCI_CAP_ID_SHPC,
-   0, SHPC_CAP_LENGTH,
-   errp);
-if (config_offset < 0) {
-return config_offset;
-}
+uint8_t config_offset;
+config_offset = pci_add_capability(d, PCI_CAP_ID_SHPC, 0, SHPC_CAP_LENGTH);
 config = d->config + config_offset;
 
 pci_set_byte(config + SHPC_CAP_DWORD_SELECT, 0);
@@ -459,7 +454,6 @@ static int shpc_cap_add_config(PCIDevice *d, Error **errp)
 /* Make dword select and data writable. */
 pci_set_byte(d->wmask + config_offset + SHPC_CAP_DWORD_SELECT, 0xff);
 pci_set_long(d->wmask + config_offset + SHPC_CAP_DWORD_DATA, 0x);
-return 0;
 }
 
 static uint64_t shpc_mmio_read(void *opaque, hwaddr addr,
@@ -584,18 +578,13 @@ void shpc_device_unplug_request_cb(HotplugHandler 
*hotplug_dev,
 }
 
 /* Initialize the SHPC structure in bridge's BAR. */
-int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar,
-  unsigned offset, Error **errp)
+int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar, unsigned 
offset)
 {
-int i, ret;
+int i;
 int nslots = SHPC_MAX_SLOTS; /* TODO: qdev property? */
 SHPCDevice *shpc = d->shpc = g_malloc0(sizeof(*d->shpc));
 shpc->sec_bus = sec_bus;
-ret = shpc_cap_add_config(d, errp);
-if (ret) {
-g_free(d->shpc);
-return ret;
-}
+shpc_cap_add_config(d);
 if (nslots < SHPC_MIN_SLOTS) {
 return 0;
 }
-- 
2.38.1




Re: [PATCH] migration: check magic value for deciding the mapping of channels

2022-11-01 Thread manish.mishra



On 01/11/22 8:21 pm, Daniel P. Berrangé wrote:

On Tue, Nov 01, 2022 at 02:30:29PM +, manish.mishra wrote:

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the default channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, this patch
uses MSG_PEEK to check the magic number of channels so that current
data/control stream management remains un-effected.

Signed-off-by: manish.mishra 
---
  include/io/channel.h | 25 +
  io/channel-socket.c  | 27 +++
  io/channel.c | 39 +++
  migration/migration.c| 33 +
  migration/multifd.c  | 12 
  migration/multifd.h  |  2 +-
  migration/postcopy-ram.c |  5 +
  migration/postcopy-ram.h |  2 +-
  8 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
  int **fds,
  size_t *nfds,
  Error **errp);
+ssize_t (*io_read_peek)(QIOChannel *ioc,
+void *buf,
+size_t nbytes,
+Error **errp);
  int (*io_close)(QIOChannel *ioc,
  Error **errp);
  GSource * (*io_create_watch)(QIOChannel *ioc,
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
size_t buflen,
Error **errp);
  
+/**

+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ *  occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+  void* buf,
+  size_t nbytes,
+  Error **errp);
+
  /**
   * qio_channel_set_blocking:
   * @ioc: the channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..b99f5dfda6 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
  }
  #endif /* WIN32 */
  
+static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,

+void *buf,
+size_t nbytes,
+Error **errp)
+{
+QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+ssize_t bytes = 0;
+
+retry:
+bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
+
+if (bytes < 0) {
+if (errno == EINTR) {
+goto retry;
+}
+if (errno == EAGAIN) {
+return QIO_CHANNEL_ERR_BLOCK;
+}
+
+error_setg_errno(errp, errno,
+ "Unable to read from peek of socket");
+return -1;
+}
+
+return bytes;
+}
  
  #ifdef QEMU_MSG_ZEROCOPY

  static int qio_channel_socket_flush(QIOChannel *ioc,
@@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
*klass,
  
  ioc_klass->io_writev = qio_channel_socket_writev;

  ioc_klass->io_readv = qio_channel_socket_readv;
+ioc_klass->io_read_peek = qio_channel_socket_read_peek;
  ioc_klass->io_set_blocking = qio_channel_socket_set_blocking;
  ioc_klass->io_close = qio_channel_socket_close;
  ioc_klass->io_shutdown = qio_channel_socket_shutdown;
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..a2d9b96f3f 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -346,6 +346,45 @@ int qio_channel_write_all(QIOChannel *ioc,
  return qio_channel_writev_all(ioc, , 1, errp);
  }
  
+int qio_channel_read_peek_all(QIOChannel *ioc,

+  void* buf,
+  size_t nbytes,
+  Error **errp)
+{
+   QIOChannelClass *klass = 

[PATCH v9 15/17] hw/vfio/pci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
The code generating errors in pci_add_capability has a comment which
says:
> Verify that capabilities don't overlap.  Note: device assignment
> depends on this check to verify that the device is not broken.
> Should never trigger for emulated devices, but it's helpful for
> debugging these.

Indeed vfio has some code that passes capability offsets and sizes from
a physical device, but it explicitly pays attention so that the
capabilities never overlap. Therefore, in pci_add_capability(), we can
always assert that capabilities never overlap, and that is what happens
when omitting errp.

Signed-off-by: Akihiko Odaki 
---
 hw/vfio/pci-quirks.c | 15 +++
 hw/vfio/pci.c| 14 +-
 2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index f0147a050a..e94fd273ea 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1530,7 +1530,7 @@ const PropertyInfo qdev_prop_nv_gpudirect_clique = {
 static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
 {
 PCIDevice *pdev = >pdev;
-int ret, pos = 0xC8;
+int pos = 0xC8;
 
 if (vdev->nv_gpudirect_clique == 0xFF) {
 return 0;
@@ -1547,11 +1547,7 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice 
*vdev, Error **errp)
 return -EINVAL;
 }
 
-ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
-if (ret < 0) {
-error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
-return ret;
-}
+pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8);
 
 memset(vdev->emulated_config_bits + pos, 0xFF, 8);
 pos += PCI_CAP_FLAGS;
@@ -1718,12 +1714,7 @@ static int vfio_add_vmd_shadow_cap(VFIOPCIDevice *vdev, 
Error **errp)
 return -EFAULT;
 }
 
-ret = pci_add_capability(>pdev, PCI_CAP_ID_VNDR, pos,
- VMD_SHADOW_CAP_LEN, errp);
-if (ret < 0) {
-error_prepend(errp, "Failed to add VMD MEMBAR Shadow cap: ");
-return ret;
-}
+pci_add_capability(>pdev, PCI_CAP_ID_VNDR, pos, VMD_SHADOW_CAP_LEN);
 
 memset(vdev->emulated_config_bits + pos, 0xFF, VMD_SHADOW_CAP_LEN);
 pos += PCI_CAP_FLAGS;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0ca6b5ff4b..458729eae3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1839,7 +1839,7 @@ static void vfio_add_emulated_long(VFIOPCIDevice *vdev, 
int pos,
 vfio_set_long_bits(vdev->emulated_config_bits + pos, mask, mask);
 }
 
-static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
+static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, uint8_t pos, uint8_t size,
Error **errp)
 {
 uint16_t flags;
@@ -1956,11 +1956,7 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int 
pos, uint8_t size,
1, PCI_EXP_FLAGS_VERS);
 }
 
-pos = pci_add_capability(>pdev, PCI_CAP_ID_EXP, pos, size,
- errp);
-if (pos < 0) {
-return pos;
-}
+pos = pci_add_capability(>pdev, PCI_CAP_ID_EXP, pos, size);
 
 vdev->pdev.exp.exp_cap = pos;
 
@@ -2058,14 +2054,14 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, 
uint8_t pos, Error **errp)
 case PCI_CAP_ID_PM:
 vfio_check_pm_reset(vdev, pos);
 vdev->pm_cap = pos;
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 case PCI_CAP_ID_AF:
 vfio_check_af_flr(vdev, pos);
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 default:
-ret = pci_add_capability(pdev, cap_id, pos, size, errp);
+pci_add_capability(pdev, cap_id, pos, size);
 break;
 }
 
-- 
2.38.1




Re: [PATCH 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32

2022-11-01 Thread Christian Schoenebeck
On Monday, October 24, 2022 6:57:58 AM CET Bin Meng wrote:
> From: Guohuai Shi 
> 
> Windows does not provide the getuid() API. Let's create a local
> one and return a fixed value 0 as the uid for testing.
> 
> Signed-off-by: Guohuai Shi 
> Signed-off-by: Xuzhou Cheng 
> Signed-off-by: Bin Meng 
> ---
> 
>  tests/qtest/virtio-9p-test.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tests/qtest/virtio-9p-test.c b/tests/qtest/virtio-9p-test.c
> index 25305a4cf7..e81e3e3709 100644
> --- a/tests/qtest/virtio-9p-test.c
> +++ b/tests/qtest/virtio-9p-test.c
> @@ -34,6 +34,13 @@ static uint32_t genfid(void)
>  return fid_generator++;
>  }
>  
> +#ifdef CONFIG_WIN32
> +static uint32_t getuid(void)
> +{
> +return 0;
> +}
> +#endif
> +

Due to recent 9p tests restructuring changes, same would be needed for new
tests/qtest/libqos/virtio-9p-client.c source file, as it's also calling 
getuid().

>  /**
>   * Splits the @a in string by @a delim into individual (non empty) strings
>   * and outputs them to @a out. The output array @a out is NULL terminated.
> 






Re: [PULL 00/14] qemu-macppc queue 20221031

2022-11-01 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


[PATCH v8 12/17] msix: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of msix_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 hw/pci/msix.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 1e381a9813..28af83403b 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -311,7 +311,7 @@ int msix_init(struct PCIDevice *dev, unsigned short 
nentries,
   uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos,
   Error **errp)
 {
-int cap;
+uint8_t cap;
 unsigned table_size, pba_size;
 uint8_t *config;
 
@@ -340,11 +340,7 @@ int msix_init(struct PCIDevice *dev, unsigned short 
nentries,
 return -EINVAL;
 }
 
-cap = pci_add_capability(dev, PCI_CAP_ID_MSIX,
-  cap_pos, MSIX_CAP_LENGTH, errp);
-if (cap < 0) {
-return cap;
-}
+cap = pci_add_capability(dev, PCI_CAP_ID_MSIX, cap_pos, MSIX_CAP_LENGTH);
 
 dev->msix_cap = cap;
 dev->cap_present |= QEMU_PCI_CAP_MSIX;
-- 
2.38.1




UI layer threading and locking strategy; memory_region_snapshot_and_clear_dirty() races

2022-11-01 Thread Peter Maydell
Hi; I'm trying to find out what the UI layer's threading and
locking strategy is, at least as far as it applies to display
device models.

Specifically:
 * is the device's GraphicHwOps::gfx_update method always called
   from one specific thread, or might it be called from any thread?
 * is that method called with any locks guaranteed held? (eg the
   iothread lock)
 * is the caller of the gfx_update method OK if an implementation
   of the method drops the iothread lock temporarily while it is
   executing? (my guess would be "no")
 * for a gfx_update_async = true device, what are the requirements
   on calling graphic_hw_update_done()? Does the caller need to hold
   any particular lock? Does the call need to be done from any
   particular thread?

The background to this is that I'm looking again at the race
condition involving the memory_region_snapshot_and_clear_dirty()
function, as described here:
 
https://lore.kernel.org/qemu-devel/CAFEAcA9odnPo2LPip295Uztri7JfoVnQbkJ=wn+k8dqneb_...@mail.gmail.com/T/#u

Having worked through what is going on, as far as I can see:
 (1) in order to be sure that we have the right data to match
 the snapshotted dirty bitmap state, we must wait for all TCG
 vCPUs to leave their current TB
 (2) a vCPU might block waiting for the iothread lock mid-TB
 (3) therefore we cannot wait for the TCG vCPUs without dropping
 the iothread lock one way or another
 (4) but none of the callers expect that and various things break

My tentative idea for a fix is a bit of an upheaval:
 * have the display devices set gfx_update_async = true
 * instead of doing everything synchronously in their gfx_update
   method, they do the initial setup and call an 'async' version
   of memory_region_snapshot_and_clear_dirty()
 * that async version of the function will do what it does today,
   but without trying to wait for TCG vCPUs
 * instead the caller arranges (via call_rcu(), probably) a
   callback that will happen once all the TCG CPUs have finished
   executing their current TB
 * that callback does the actual copy-from-guest-ram-to-display
   and then calls graphic_hw_update_done()

This seems like an awful pain in the neck but I couldn't see
anything better :-(

Paolo: what (if any) guarantee does call_rcu() make about
which thread the callback function gets executed on, and what
locks are/are not held when it's called?

(I haven't looked at the migration code's use of
memory_global_after_dirty_log_sync() but I suspect it's
similarly broken.)

thanks
-- PMM



[PATCH v8 00/17] pci: Abort if pci_add_capability fails

2022-11-01 Thread Akihiko Odaki
pci_add_capability appears most PCI devices. Its error handling required
lots of code, and led to inconsistent behaviors such as:
- passing error_abort
- passing error_fatal
- asserting the returned value
- propagating the error to the caller
- skipping the rest of the function
- just ignoring

The code generating errors in pci_add_capability had a comment which
says:
> Verify that capabilities don't overlap.  Note: device assignment
> depends on this check to verify that the device is not broken.
> Should never trigger for emulated devices, but it's helpful for
> debugging these.

Indeed vfio has some code that passes capability offsets and sizes from
a physical device, but it explicitly pays attention so that the
capabilities never overlap and the only exception are MSI and MSI-X
capabilities. Therefore, we can add code specific to the case, and
always assert that capabilities never overlap in the other cases,
resolving these inconsistencies.

v8:
- Return boolean with pci_check_capability_overlap() (Philippe Mathieu-Daudé)

v7:
- Perform checks in vfio_msi_setup() and vfio_msix_setup() (Alex Williamson)

v6:
- Error in case of MSI/MSI-X capability overlap (Alex Williamson)

v5:
- Fix capability ID specification in vfio_msi_early_setup (Alex Williamson)
- Use range_covers_byte() (Alex Williamson)
- warn_report() in case of MSI/MSI-X capability overlap (Alex Williamson)

v4:
- Fix typos in messages (Markus Armbruster)
- hw/vfio/pci: Ensure MSI and MSI-X do not overlap (Alex Williamson)

v3:
- Correct patch split between virtio-pci and pci (Markus Armbruster)
- Add messages for individual patches (Markus Armbruster)
- Acked-by: Jonathan Cameron 

Akihiko Odaki (17):
  hw/vfio/pci: Ensure MSI and MSI-X do not overlap
  pci: Allow to omit errp for pci_add_capability
  hw/i386/amd_iommu: Omit errp for pci_add_capability
  ahci: Omit errp for pci_add_capability
  e1000e: Omit errp for pci_add_capability
  eepro100: Omit errp for pci_add_capability
  hw/nvme: Omit errp for pci_add_capability
  msi: Omit errp for pci_add_capability
  hw/pci/pci_bridge: Omit errp for pci_add_capability
  pcie: Omit errp for pci_add_capability
  pci/shpc: Omit errp for pci_add_capability
  msix: Omit errp for pci_add_capability
  pci/slotid: Omit errp for pci_add_capability
  hw/pci-bridge/pcie_pci_bridge: Omit errp for pci_add_capability
  hw/vfio/pci: Omit errp for pci_add_capability
  virtio-pci: Omit errp for pci_add_capability
  pci: Remove legacy errp from pci_add_capability

 docs/pcie_sriov.txt|  4 +--
 hw/display/bochs-display.c |  4 +--
 hw/i386/amd_iommu.c| 21 +++-
 hw/ide/ich.c   |  8 ++---
 hw/net/e1000e.c| 22 +++--
 hw/net/eepro100.c  |  7 +---
 hw/nvme/ctrl.c | 14 ++--
 hw/pci-bridge/cxl_downstream.c |  9 ++
 hw/pci-bridge/cxl_upstream.c   |  8 ++---
 hw/pci-bridge/i82801b11.c  | 14 ++--
 hw/pci-bridge/pci_bridge_dev.c |  2 +-
 hw/pci-bridge/pcie_pci_bridge.c| 19 +++
 hw/pci-bridge/pcie_root_port.c | 16 ++---
 hw/pci-bridge/xio3130_downstream.c | 15 ++---
 hw/pci-bridge/xio3130_upstream.c   | 15 ++---
 hw/pci-host/designware.c   |  3 +-
 hw/pci-host/xilinx-pcie.c  |  4 +--
 hw/pci/msi.c   |  9 +-
 hw/pci/msix.c  |  8 ++---
 hw/pci/pci.c   | 48 +--
 hw/pci/pci_bridge.c| 21 
 hw/pci/pcie.c  | 52 --
 hw/pci/shpc.c  | 23 -
 hw/pci/slotid_cap.c|  8 ++---
 hw/usb/hcd-xhci-pci.c  |  3 +-
 hw/vfio/pci-quirks.c   | 15 ++---
 hw/vfio/pci.c  | 29 +++--
 hw/virtio/virtio-pci.c | 12 ++-
 include/hw/pci/pci.h   |  8 +++--
 include/hw/pci/pci_bridge.h|  5 ++-
 include/hw/pci/pcie.h  | 11 +++
 include/hw/pci/shpc.h  |  3 +-
 include/hw/virtio/virtio-pci.h |  2 +-
 33 files changed, 133 insertions(+), 309 deletions(-)

-- 
2.38.1




[PATCH v8 16/17] virtio-pci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/virtio/virtio-pci.c | 9 ++---
 include/hw/virtio/virtio-pci.h | 2 +-
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c37bdc77ea..b393ff01be 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1154,8 +1154,7 @@ static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
 PCIDevice *dev = >pci_dev;
 int offset;
 
-offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, 0,
-cap->cap_len, _abort);
+offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, 0, cap->cap_len);
 
 assert(cap->cap_len >= sizeof *cap);
 memcpy(dev->config + offset + PCI_CAP_FLAGS, >cap_len,
@@ -1864,11 +1863,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 
 pcie_endpoint_cap_init(pci_dev, 0);
 
-pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0,
- PCI_PM_SIZEOF, errp);
-if (pos < 0) {
-return;
-}
+pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
 
 pci_dev->exp.pm_cap = pos;
 
diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index 2446dcd9ae..9f3736723c 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -141,7 +141,7 @@ struct VirtIOPCIProxy {
 uint32_t msix_bar_idx;
 uint32_t modern_io_bar_idx;
 uint32_t modern_mem_bar_idx;
-int config_cap;
+uint8_t config_cap;
 uint32_t flags;
 bool disable_modern;
 bool ignore_backend_features;
-- 
2.38.1




Re: HMAT patches failure (was Re: [PULL 00/86] pci,pc,virtio: features, tests, fixes, cleanups)

2022-11-01 Thread Jonathan Cameron via
On Tue, 1 Nov 2022 06:32:05 -0400
"Michael S. Tsirkin"  wrote:

> On Mon, Oct 31, 2022 at 04:06:03PM -0400, Stefan Hajnoczi wrote:
> > Here is another CI failure:
> > 
> > qemu-system-i386: current -smp configuration requires kernel irqchip
> > and X2APIC API support.
> > Broken pipe
> > ../tests/qtest/libqtest.c:179: kill_qemu() tried to terminate QEMU
> > process but encountered exit status 1 (expected 0)
> > TAP parsing error: Too few tests run (expected 49, got 22)

Got a bit thrown by this which is unrelated to the HMAT series.  Given I 
bisected it...

   bios-tables-test: add test for number of cores > 255
seems to be issue.  I'll take a look into why shortly.


> > (test program exited with status code -6)
> > ――
> > 6/202 qemu:qtest+qtest-i386 / qtest-i386/test-hmp OK 7.46s 9 subtests passed
> > ▶ 7/202 ERROR:../tests/qtest/bios-tables-test.c:533:test_acpi_asl:
> > assertion failed: (all_tables_match) ERROR
> > 7/202 qemu:qtest+qtest-aarch64 / qtest-aarch64/bios-tables-test ERROR
> > 108.34s killed by signal 6 SIGABRT  
> > >>> G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> > >>>  QTEST_QEMU_BINARY=./qemu-system-aarch64 MALLOC_PERTURB_=89 
> > >>> /builds/qemu-project/qemu/build/tests/qtest/bios-tables-test --tap -k  
> > ― ✀ 
> > ―
> > stderr:
> > acpi-test: Warning! APIC binary file mismatch. Actual
> > [aml:/tmp/aml-UKB6U1], Expected
> > [aml:tests/data/acpi/virt/APIC.acpihmatvirt].
> > See source file tests/qtest/bios-tables-test.c for instructions on how
> > to update expected files.
> > to see ASL diff between mismatched files install IASL, rebuild QEMU
> > from scratch and re-run tests with V=1 environment variable set**
> > ERROR:../tests/qtest/bios-tables-test.c:533:test_acpi_asl: assertion
> > failed: (all_tables_match)

Ah. I'd failed to notice you said to drop first patch. Now replicating.
Looks like the tables introduced for HMAT need updating to take into account
changes made earlier in your pull request (version numbers etc)


Jonathan

> > (test program exited with status code -6)
> > 
> > https://gitlab.com/qemu-project/qemu/-/jobs/3253817453  
> 
> 
> Hesham Jonathan pls take a look, if you post a fixup today
> or early tomorrow I can squash it
> and then this patchset can still be included in the release.
> 
> Thanks!
> 




RE: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs

2022-11-01 Thread Shi, Guohuai



> -Original Message-
> From: Shi, Guohuai
> Sent: Tuesday, November 1, 2022 23:13
> To: Christian Schoenebeck ; qemu-devel@nongnu.org
> Cc: Greg Kurz ; Meng, Bin 
> Subject: RE: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities
> functions for 9pfs
> 
> 
> 
> > -Original Message-
> > From: Christian Schoenebeck 
> > Sent: Tuesday, November 1, 2022 22:28
> > To: qemu-devel@nongnu.org
> > Cc: Shi, Guohuai ; Greg Kurz
> > ; Meng, Bin 
> > Subject: Re: [PATCH 07/16] hw/9pfs: Implement Windows specific
> > utilities functions for 9pfs
> >
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> >
> > On Monday, October 24, 2022 6:57:50 AM CET Bin Meng wrote:
> > > From: Guohuai Shi 
> > >
> > > Windows POSIX API and MinGW library do not provide the NO_FOLLOW
> > > flag, and do not allow opening a directory by POSIX open(). This
> > > causes all
> > > xxx_at() functions cannot work directly. However, we can provide
> > > Windows handle based functions to emulate xxx_at() functions (e.g.:
> > > openat_win32, utimensat_win32, etc.).
> > >
> > > Windows does not support extended attributes. 9pfs for Windows uses
> > > NTFS ADS (Alternate Data Streams) to emulate extended attributes.
> > >
> > > Windows does not provide POSIX compatible readlink(), and symbolic
> > > link feature in 9pfs will be disabled on Windows.
> >
> > Wouldn't it be more user friendly if the relevant error locations
> > would use something like error_report_once() and suggesting to enable
> > mapped(-xattr) to make 9p symlinks on guest working if desired by the user?
> >
> > Probably this error case would need to wrapped into a dedicated
> > function, otherwise I guess error_report_once() would fire several
> > times by different callers.
> >
> 
> Windows (MinGW) does not only support symlink, but also does not have symlink
> definitions.
> Windows does not support symlink flags S_IFLNK.
> 
> So even I add symlink support by mapped-xattr, the MinGW library does not
> have symlink flags and get a build error.
> And this flags is defined by Windows header files.
> The impact of adding a new flags to an pre-defined structure (struct stat) is
> unknown.
> 
> So I think it is not a good idea to do that.

Because Windows does not support symlink, so error_report_once() and report it 
to user will be OK.
But mapped-xattr could not work.

> 
> > > Signed-off-by: Guohuai Shi 
> > > Signed-off-by: Bin Meng 
> > > ---
> > >
> > >  hw/9pfs/9p-local.h  |   7 +
> > >  hw/9pfs/9p-util.h   |  40 +-
> > >  hw/9pfs/9p-local.c  |   4 -
> > >  hw/9pfs/9p-util-win32.c | 885
> > > 
> > >  4 files changed, 931 insertions(+), 5 deletions(-)  create mode
> > > 100644 hw/9pfs/9p-util-win32.c
> > >
> > > diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h index
> > > c8404063e5..02fd894ba3 100644
> > > --- a/hw/9pfs/9p-local.h
> > > +++ b/hw/9pfs/9p-local.h
> > > @@ -15,6 +15,13 @@
> > >
> > >  #include "9p-file-id.h"
> > >
> > > +typedef struct {
> > > +P9_FILE_ID mountfd;
> > > +#ifdef CONFIG_WIN32
> > > +char *root_path;
> > > +#endif
> > > +} LocalData;
> > > +
> > >  P9_FILE_ID local_open_nofollow(FsContext *fs_ctx, const char *path,
> > > int
> > flags,
> > > mode_t mode);  P9_FILE_ID
> > > local_opendir_nofollow(FsContext *fs_ctx, const char *path); diff
> > > --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index
> > > 1e7dc76345..82b2d0c3e4 100644
> > > --- a/hw/9pfs/9p-util.h
> > > +++ b/hw/9pfs/9p-util.h
> > > @@ -90,26 +90,61 @@ static inline int errno_to_dotl(int err) {
> > >  return err;
> > >  }
> > >
> > > -#ifdef CONFIG_DARWIN
> > > +#if defined(CONFIG_DARWIN)
> > >  #define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0)
> > > +#elif defined(CONFIG_WIN32)
> > > +#define qemu_fgetxattr fgetxattr_win32
> > >  #else
> > >  #define qemu_fgetxattr fgetxattr
> > >  #endif
> > >
> > > +#ifdef CONFIG_WIN32
> > > +#define qemu_openat openat_win32
> > > +#define qemu_fstatatfstatat_win32
> > > +#define qemu_mkdiratmkdirat_win32
> > > +#define qemu_renameat   renameat_win32
> > > +#define qemu_utimensat  utimensat_win32
> > > +#define qemu_unlinkat   unlinkat_win32
> > > +#else
> > >  #define qemu_openat openat
> > >  #define qemu_fstatatfstatat
> > >  #define qemu_mkdiratmkdirat
> > >  #define qemu_renameat   renameat
> > >  #define qemu_utimensat  utimensat
> > >  #define qemu_unlinkat   unlinkat
> > > +#endif
> > > +
> > > +#ifdef CONFIG_WIN32
> > > +char *get_full_path_win32(P9_FILE_ID fd, const char *name); ssize_t
> > > +fgetxattr_win32(int fd, const char *name, void *value, size_t
> > > +size); P9_FILE_ID openat_win32(P9_FILE_ID dirfd, const char *pathname,
> int flags,
> > > +mode_t mode); int fstatat_win32(P9_FILE_ID
> > > +dirfd, const char *pathname,
> > > +  struct stat *statbuf, int flags); int
> > > +mkdirat_win32(P9_FILE_ID dirfd, const char *pathname, 

[PATCH v8 08/17] msi: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of msi_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 hw/pci/msi.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index 058d1d1ef1..5283a08b5a 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -194,7 +194,6 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 unsigned int vectors_order;
 uint16_t flags;
 uint8_t cap_size;
-int config_offset;
 
 if (!msi_nonbroken) {
 error_setg(errp, "MSI is not supported by interrupt controller");
@@ -221,13 +220,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 }
 
 cap_size = msi_cap_sizeof(flags);
-config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset,
-cap_size, errp);
-if (config_offset < 0) {
-return config_offset;
-}
-
-dev->msi_cap = config_offset;
+dev->msi_cap = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size);
 dev->cap_present |= QEMU_PCI_CAP_MSI;
 
 pci_set_word(dev->config + msi_flags_off(dev), flags);
-- 
2.38.1




Re: [PULL 08/30] target/arm: Add ptw_idx to S1Translate

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 11:10, Philippe Mathieu-Daudé wrote:

On 1/11/22 00:14, Philippe Mathieu-Daudé wrote:

On 25/10/22 18:39, Peter Maydell wrote:

From: Richard Henderson 

Hoist the computation of the mmu_idx for the ptw up to
get_phys_addr_with_struct and get_phys_addr_twostage.
This removes the duplicate check for stage2 disabled
from the middle of the walk, performing it only once.

Signed-off-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 
Message-id: 20221024051851.3074715-3-richard.hender...@linaro.org
Signed-off-by: Peter Maydell 
---
  target/arm/ptw.c | 71 
  1 file changed, 54 insertions(+), 17 deletions(-)


Since this commit I can not boot Trusted Firmware on the SBSA-ref 
machine.


Do we need to set in_ptw_idx in get_phys_addr_with_secure()?


I opened https://gitlab.com/qemu-project/qemu/-/issues/1293 to track.



Re: [PATCH 07/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs

2022-11-01 Thread Christian Schoenebeck
On Monday, October 24, 2022 6:57:50 AM CET Bin Meng wrote:
> From: Guohuai Shi 
> 
> Windows POSIX API and MinGW library do not provide the NO_FOLLOW
> flag, and do not allow opening a directory by POSIX open(). This
> causes all xxx_at() functions cannot work directly. However, we
> can provide Windows handle based functions to emulate xxx_at()
> functions (e.g.: openat_win32, utimensat_win32, etc.).
> 
> Windows does not support extended attributes. 9pfs for Windows uses
> NTFS ADS (Alternate Data Streams) to emulate extended attributes.
> 
> Windows does not provide POSIX compatible readlink(), and symbolic
> link feature in 9pfs will be disabled on Windows.

Wouldn't it be more user friendly if the relevant error locations would use
something like error_report_once() and suggesting to enable mapped(-xattr) to
make 9p symlinks on guest working if desired by the user?

Probably this error case would need to wrapped into a dedicated function,
otherwise I guess error_report_once() would fire several times by different
callers.

> Signed-off-by: Guohuai Shi 
> Signed-off-by: Bin Meng 
> ---
> 
>  hw/9pfs/9p-local.h  |   7 +
>  hw/9pfs/9p-util.h   |  40 +-
>  hw/9pfs/9p-local.c  |   4 -
>  hw/9pfs/9p-util-win32.c | 885 
>  4 files changed, 931 insertions(+), 5 deletions(-)
>  create mode 100644 hw/9pfs/9p-util-win32.c
> 
> diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h
> index c8404063e5..02fd894ba3 100644
> --- a/hw/9pfs/9p-local.h
> +++ b/hw/9pfs/9p-local.h
> @@ -15,6 +15,13 @@
>  
>  #include "9p-file-id.h"
>  
> +typedef struct {
> +P9_FILE_ID mountfd;
> +#ifdef CONFIG_WIN32
> +char *root_path;
> +#endif
> +} LocalData;
> +
>  P9_FILE_ID local_open_nofollow(FsContext *fs_ctx, const char *path, int 
> flags,
> mode_t mode);
>  P9_FILE_ID local_opendir_nofollow(FsContext *fs_ctx, const char *path);
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index 1e7dc76345..82b2d0c3e4 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -90,26 +90,61 @@ static inline int errno_to_dotl(int err) {
>  return err;
>  }
>  
> -#ifdef CONFIG_DARWIN
> +#if defined(CONFIG_DARWIN)
>  #define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0)
> +#elif defined(CONFIG_WIN32)
> +#define qemu_fgetxattr fgetxattr_win32
>  #else
>  #define qemu_fgetxattr fgetxattr
>  #endif
>  
> +#ifdef CONFIG_WIN32
> +#define qemu_openat openat_win32
> +#define qemu_fstatatfstatat_win32
> +#define qemu_mkdiratmkdirat_win32
> +#define qemu_renameat   renameat_win32
> +#define qemu_utimensat  utimensat_win32
> +#define qemu_unlinkat   unlinkat_win32
> +#else
>  #define qemu_openat openat
>  #define qemu_fstatatfstatat
>  #define qemu_mkdiratmkdirat
>  #define qemu_renameat   renameat
>  #define qemu_utimensat  utimensat
>  #define qemu_unlinkat   unlinkat
> +#endif
> +
> +#ifdef CONFIG_WIN32
> +char *get_full_path_win32(P9_FILE_ID fd, const char *name);
> +ssize_t fgetxattr_win32(int fd, const char *name, void *value, size_t size);
> +P9_FILE_ID openat_win32(P9_FILE_ID dirfd, const char *pathname, int flags,
> +mode_t mode);
> +int fstatat_win32(P9_FILE_ID dirfd, const char *pathname,
> +  struct stat *statbuf, int flags);
> +int mkdirat_win32(P9_FILE_ID dirfd, const char *pathname, mode_t mode);
> +int renameat_win32(P9_FILE_ID olddirfd, const char *oldpath,
> +   P9_FILE_ID newdirfd, const char *newpath);
> +int utimensat_win32(P9_FILE_ID dirfd, const char *pathname,
> +const struct timespec times[2], int flags);
> +int unlinkat_win32(P9_FILE_ID dirfd, const char *pathname, int flags);
> +int statfs_win32(const char *root_path, struct statfs *stbuf);
> +P9_FILE_ID openat_dir(P9_FILE_ID dirfd, const char *name);
> +P9_FILE_ID openat_file(P9_FILE_ID dirfd, const char *name, int flags,
> +   mode_t mode);
> +#endif
>  
>  static inline void close_preserve_errno(P9_FILE_ID fd)
>  {
>  int serrno = errno;
> +#ifndef CONFIG_WIN32
>  close(fd);
> +#else
> +CloseHandle(fd);
> +#endif
>  errno = serrno;
>  }
>  
> +#ifndef CONFIG_WIN32
>  static inline P9_FILE_ID openat_dir(P9_FILE_ID dirfd, const char *name)
>  {
>  return qemu_openat(dirfd, name,
> @@ -157,6 +192,7 @@ again:
>  errno = serrno;
>  return fd;
>  }
> +#endif
>  
>  ssize_t fgetxattrat_nofollow(P9_FILE_ID dirfd, const char *path,
>   const char *name, void *value, size_t size);
> @@ -167,6 +203,7 @@ ssize_t flistxattrat_nofollow(P9_FILE_ID dirfd, const 
> char *filename,
>  ssize_t fremovexattrat_nofollow(P9_FILE_ID dirfd, const char *filename,
>  const char *name);
>  
> +#ifndef CONFIG_WIN32
>  /*
>   * Darwin has d_seekoff, which appears to function similarly to d_off.
>   * However, it does not appear to be supported on all file systems,
> @@ -181,6 +218,7 @@ static 

[PATCH v9 14/17] hw/pci-bridge/pcie_pci_bridge: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate heare because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/pci-bridge/pcie_pci_bridge.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index 99778e3e24..1b839465e7 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -35,7 +35,7 @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error 
**errp)
 {
 PCIBridge *br = PCI_BRIDGE(d);
 PCIEPCIBridge *pcie_br = PCIE_PCI_BRIDGE_DEV(d);
-int rc, pos;
+int rc;
 
 pci_bridge_initfn(d, TYPE_PCI_BUS);
 
@@ -49,12 +49,8 @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error 
**errp)
 
 pcie_cap_init(d, 0, PCI_EXP_TYPE_PCI_BRIDGE, 0);
 
-pos = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF, errp);
-if (pos < 0) {
-goto pm_error;
-}
-d->exp.pm_cap = pos;
-pci_set_word(d->config + pos + PCI_PM_PMC, 0x3);
+d->exp.pm_cap = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
+pci_set_word(d->config + d->exp.pm_cap + PCI_PM_PMC, 0x3);
 
 pcie_cap_arifwd_init(d);
 pcie_cap_deverr_init(d);
@@ -85,7 +81,6 @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error 
**errp)
 msi_error:
 pcie_aer_exit(d);
 aer_error:
-pm_error:
 pcie_cap_exit(d);
 shpc_cleanup(d, _br->shpc_bar);
 error:
-- 
2.38.1




[PATCH v8 03/17] hw/i386/amd_iommu: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/i386/amd_iommu.c | 21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 725f69095b..8a88cbea0a 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1539,7 +1539,6 @@ static void amdvi_sysbus_reset(DeviceState *dev)
 
 static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
 {
-int ret = 0;
 AMDVIState *s = AMD_IOMMU_DEVICE(dev);
 MachineState *ms = MACHINE(qdev_get_machine());
 PCMachineState *pcms = PC_MACHINE(ms);
@@ -1553,23 +1552,11 @@ static void amdvi_sysbus_realize(DeviceState *dev, 
Error **errp)
 if (!qdev_realize(DEVICE(>pci), >qbus, errp)) {
 return;
 }
-ret = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
- AMDVI_CAPAB_SIZE, errp);
-if (ret < 0) {
-return;
-}
-s->capab_offset = ret;
+s->capab_offset = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
+ AMDVI_CAPAB_SIZE);
 
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0,
- AMDVI_CAPAB_REG_SIZE, errp);
-if (ret < 0) {
-return;
-}
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0,
- AMDVI_CAPAB_REG_SIZE, errp);
-if (ret < 0) {
-return;
-}
+pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0, AMDVI_CAPAB_REG_SIZE);
+pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0, AMDVI_CAPAB_REG_SIZE);
 
 /* Pseudo address space under root PCI bus. */
 x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
-- 
2.38.1




[PATCH v9 01/17] hw/vfio/pci: Ensure MSI and MSI-X do not overlap

2022-11-01 Thread Akihiko Odaki
pci_add_capability() checks whether capabilities overlap, and notifies
its caller so that it can properly handle the case. However, in the
most cases, the capabilities actually never overlap, and the interface
incurred extra error handling code, which is often incorrect or
suboptimal. For such cases, pci_add_capability() can simply abort the
execution if the capabilities actually overlap since it should be a
programming error.

This change handles the other cases: hw/vfio/pci depends on the check to
decide MSI and MSI-X capabilities overlap with another. As they are
quite an exceptional and hw/vfio/pci knows much about PCI capabilities,
adding code specific to the cases to hw/vfio/pci still results in less
code than having error handling code everywhere in total.

Signed-off-by: Akihiko Odaki 
---
 include/hw/pci/pci.h |  7 +++
 hw/pci/pci.c | 33 +
 hw/vfio/pci.c| 15 ++-
 3 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b54b6ef88f..f4e6612440 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -390,6 +390,13 @@ void pci_register_vga(PCIDevice *pci_dev, MemoryRegion 
*mem,
 void pci_unregister_vga(PCIDevice *pci_dev);
 pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num);
 
+/*
+ * If there is no overlap, pci_check_capability_overlap() returns true.
+ * Otherise, it sets an error and returns false.
+ */
+bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
+  uint8_t offset, uint8_t size, Error **errp);
+
 int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
uint8_t offset, uint8_t size,
Error **errp);
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2f450f6a72..5531e30385 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2512,6 +2512,25 @@ static void pci_del_option_rom(PCIDevice *pdev)
 pdev->has_rom = false;
 }
 
+bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
+  uint8_t offset, uint8_t size, Error **errp)
+{
+int i;
+
+for (i = offset; i < offset + size; i++) {
+if (pdev->used[i]) {
+error_setg(errp,
+   "%s:%02x:%02x.%x PCI capability %x at offset %x 
overlaps existing capability %x at offset %x",
+   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
+   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+   cap_id, offset, pci_find_capability_at_offset(pdev, i), 
i);
+return false;
+}
+}
+
+return true;
+}
+
 /*
  * On success, pci_add_capability() returns a positive value
  * that the offset of the pci capability.
@@ -2523,7 +2542,6 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
Error **errp)
 {
 uint8_t *config;
-int i, overlapping_cap;
 
 if (!offset) {
 offset = pci_find_space(pdev, size);
@@ -2534,17 +2552,8 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
  * depends on this check to verify that the device is not broken.
  * Should never trigger for emulated devices, but it's helpful
  * for debugging these. */
-for (i = offset; i < offset + size; i++) {
-overlapping_cap = pci_find_capability_at_offset(pdev, i);
-if (overlapping_cap) {
-error_setg(errp, "%s:%02x:%02x.%x "
-   "Attempt to add PCI capability %x at offset "
-   "%x overlaps existing capability %x at offset %x",
-   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
-   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
-   cap_id, offset, overlapping_cap, i);
-return -EINVAL;
-}
+if (!pci_check_capability_overlap(pdev, cap_id, offset, size, errp)) {
+return -EINVAL;
 }
 }
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 939dcc3d4a..0ca6b5ff4b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1298,6 +1298,14 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, 
Error **errp)
 
 trace_vfio_msi_setup(vdev->vbasedev.name, pos);
 
+vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0);
+
+ret = pci_check_capability_overlap(>pdev, PCI_CAP_ID_MSI,
+   pos, vdev->msi_cap_size, errp);
+if (!ret) {
+return -EINVAL;
+}
+
 ret = msi_init(>pdev, pos, entries, msi_64bit, msi_maskbit, );
 if (ret < 0) {
 if (ret == -ENOTSUP) {
@@ -1306,7 +1314,6 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, 
Error **errp)
 error_propagate_prepend(errp, err, "msi_init failed: ");
 return ret;
 }
-vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0);
 
 return 0;
 }
@@ 

Re: [PATCH 09/16] hw/9pfs: Disable unsupported flags and features for Windows

2022-11-01 Thread Christian Schoenebeck
On Monday, October 24, 2022 6:57:52 AM CET Bin Meng wrote:
> From: Guohuai Shi 
> 
> Some flags and features are not supported on Windows, like mknod,
> readlink, file mode, etc. Update the codes for Windows.
> 
> Signed-off-by: Guohuai Shi 
> Signed-off-by: Bin Meng 
> ---
> 
>  hw/9pfs/9p-util.h |  6 +++-
>  hw/9pfs/9p.c  | 90 ++-
>  2 files changed, 86 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index 82b2d0c3e4..3d154e9103 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -53,8 +53,10 @@ static inline uint64_t makedev_dotl(uint32_t dev_major, 
> uint32_t dev_minor)
>   */
>  static inline uint64_t host_dev_to_dotl_dev(dev_t dev)
>  {
> -#ifdef CONFIG_LINUX
> +#if defined(CONFIG_LINUX)
>  return dev;
> +#elif defined(CONFIG_WIN32)
> +return 0;

Really?

>  #else
>  return makedev_dotl(major(dev), minor(dev));
>  #endif
> @@ -260,7 +262,9 @@ static inline struct dirent *qemu_dirent_dup(struct 
> dirent *dent)
>  #if defined CONFIG_DARWIN && defined CONFIG_PTHREAD_FCHDIR_NP
>  int pthread_fchdir_np(int fd) __attribute__((weak_import));
>  #endif
> +#ifndef CONFIG_WIN32
>  int qemu_mknodat(P9_FILE_ID dirfd, const char *filename, mode_t mode,
>   dev_t dev);
> +#endif
>  
>  #endif
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 6c4af86240..771aab34ac 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -39,6 +39,11 @@
>  #include "qemu/xxhash.h"
>  #include 
>  
> +#ifdef CONFIG_WIN32
> +#define UTIME_NOW   ((1l << 30) - 1l)
> +#define UTIME_OMIT  ((1l << 30) - 2l)
> +#endif
> +
>  int open_fd_hw;
>  int total_open_fd;
>  static int open_fd_rc;
> @@ -132,13 +137,17 @@ static int dotl_to_open_flags(int flags)
>  DotlOpenflagMap dotl_oflag_map[] = {
>  { P9_DOTL_CREATE, O_CREAT },
>  { P9_DOTL_EXCL, O_EXCL },
> +#ifndef CONFIG_WIN32
>  { P9_DOTL_NOCTTY , O_NOCTTY },
> +#endif
>  { P9_DOTL_TRUNC, O_TRUNC },
>  { P9_DOTL_APPEND, O_APPEND },
> +#ifndef CONFIG_WIN32
>  { P9_DOTL_NONBLOCK, O_NONBLOCK } ,
>  { P9_DOTL_DSYNC, O_DSYNC },
>  { P9_DOTL_FASYNC, FASYNC },
> -#ifndef CONFIG_DARWIN
> +#endif
> +#ifdef CONFIG_LINUX

Better

   #if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32)

Otherwise it might automatically opt-out other future platforms
unintentionally.

>  { P9_DOTL_NOATIME, O_NOATIME },
>  /*
>   *  On Darwin, we could map to F_NOCACHE, which is
> @@ -151,8 +160,10 @@ static int dotl_to_open_flags(int flags)
>  #endif
>  { P9_DOTL_LARGEFILE, O_LARGEFILE },
>  { P9_DOTL_DIRECTORY, O_DIRECTORY },
> +#ifndef CONFIG_WIN32
>  { P9_DOTL_NOFOLLOW, O_NOFOLLOW },
>  { P9_DOTL_SYNC, O_SYNC },
> +#endif
>  };
>  
>  for (i = 0; i < ARRAY_SIZE(dotl_oflag_map); i++) {
> @@ -179,8 +190,11 @@ static int get_dotl_openflags(V9fsState *s, int oflags)
>   * Filter the client open flags
>   */
>  flags = dotl_to_open_flags(oflags);
> -flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT);
> -#ifndef CONFIG_DARWIN
> +flags &= ~(O_CREAT);
> +#ifndef CONFIG_WIN32
> +flags &= ~(O_NOCTTY | O_ASYNC);
> +#endif
> +#ifdef CONFIG_LINUX

Same as above: better explicitly opt-out than the other way around.

>  /*
>   * Ignore direct disk access hint until the server supports it.
>   */
> @@ -986,9 +1000,11 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat 
> *stbuf, V9fsQID *qidp)
>  if (S_ISDIR(stbuf->st_mode)) {
>  qidp->type |= P9_QID_TYPE_DIR;
>  }
> +#ifndef CONFIG_WIN32
>  if (S_ISLNK(stbuf->st_mode)) {
>  qidp->type |= P9_QID_TYPE_SYMLINK;
>  }
> +#endif
>  
>  return 0;
>  }
> @@ -1097,6 +1113,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString 
> *extension)
>  ret |= S_IFDIR;
>  }
>  
> +#ifndef CONFIG_WIN32
>  if (mode & P9_STAT_MODE_SYMLINK) {
>  ret |= S_IFLNK;
>  }
> @@ -1106,6 +1123,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString 
> *extension)
>  if (mode & P9_STAT_MODE_NAMED_PIPE) {
>  ret |= S_IFIFO;
>  }
> +#endif
>  if (mode & P9_STAT_MODE_DEVICE) {
>  if (extension->size && extension->data[0] == 'c') {
>  ret |= S_IFCHR;
> @@ -1118,6 +1136,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString 
> *extension)
>  ret |= S_IFREG;
>  }
>  
> +#ifndef CONFIG_WIN32
>  if (mode & P9_STAT_MODE_SETUID) {
>  ret |= S_ISUID;
>  }
> @@ -1127,6 +1146,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString 
> *extension)
>  if (mode & P9_STAT_MODE_SETVTX) {
>  ret |= S_ISVTX;
>  }
> +#endif
>  
>  return ret;
>  }
> @@ -1182,6 +1202,7 @@ static uint32_t stat_to_v9mode(const struct stat *stbuf)
>  mode |= P9_STAT_MODE_DIR;
>  }
>  
> +#ifndef CONFIG_WIN32
>  if (S_ISLNK(stbuf->st_mode)) {
>  mode |= P9_STAT_MODE_SYMLINK;
>   

[PATCH v9 03/17] hw/i386/amd_iommu: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/i386/amd_iommu.c | 21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 725f69095b..8a88cbea0a 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1539,7 +1539,6 @@ static void amdvi_sysbus_reset(DeviceState *dev)
 
 static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
 {
-int ret = 0;
 AMDVIState *s = AMD_IOMMU_DEVICE(dev);
 MachineState *ms = MACHINE(qdev_get_machine());
 PCMachineState *pcms = PC_MACHINE(ms);
@@ -1553,23 +1552,11 @@ static void amdvi_sysbus_realize(DeviceState *dev, 
Error **errp)
 if (!qdev_realize(DEVICE(>pci), >qbus, errp)) {
 return;
 }
-ret = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
- AMDVI_CAPAB_SIZE, errp);
-if (ret < 0) {
-return;
-}
-s->capab_offset = ret;
+s->capab_offset = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
+ AMDVI_CAPAB_SIZE);
 
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0,
- AMDVI_CAPAB_REG_SIZE, errp);
-if (ret < 0) {
-return;
-}
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0,
- AMDVI_CAPAB_REG_SIZE, errp);
-if (ret < 0) {
-return;
-}
+pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0, AMDVI_CAPAB_REG_SIZE);
+pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0, AMDVI_CAPAB_REG_SIZE);
 
 /* Pseudo address space under root PCI bus. */
 x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
-- 
2.38.1




Re: [PATCH v5 0/6] ppc/e500: Add support for two types of flash, cleanup

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 17:01, Bernhard Beschow wrote:

Am 1. November 2022 10:41:51 UTC schrieb Bernhard Beschow :

On Mon, Oct 31, 2022 at 12:54 PM Philippe Mathieu-Daudé 
wrote:


This is a respin of Bernhard's v4 with Freescale eSDHC implemented
as an 'UNIMP' region. See v4 cover here:

https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/

Only tested with the ppce500 machine (no further regression testing).

Since v4:
- Do not rename ESDHC_* definitions to USDHC_*
- Do not modify SDHCIState structure



Works beautifully, both for the buildroot load and for my proprietary load.
So:
Tested-by: Bernhard Beschow



Bernhard Beschow (4):
   hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power
 of two
   docs/system/ppc/ppce500: Use qemu-system-ppc64 across the board(s)
   hw/ppc/e500: Implement pflash handling
   hw/ppc/e500: Add Freescale eSDHC to e500plat

Philippe Mathieu-Daudé (2):
   hw/sd/sdhci: MMIO region is implemented in 32-bit accesses
   hw/sd/sdhci: Map host controller interface in host endianess


Hi Phil,

Is there a chance to get this in for 7.2?


Well 1/ can you review patch #1 and 2/ we need to figure out what to do 
with patch #2 :) Can you point me to the CCSR datasheet?




[PATCH v9 05/17] e1000e: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index ac96f7665a..e433b8f9a5 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -377,17 +377,10 @@ e1000e_gen_dsn(uint8_t *mac)
(uint64_t)(mac[0])  << 56;
 }
 
-static int
+static void
 e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
-Error *local_err = NULL;
-int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
- PCI_PM_SIZEOF, _err);
-
-if (local_err) {
-error_report_err(local_err);
-return ret;
-}
+pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
 
 pci_set_word(pdev->config + offset + PCI_PM_PMC,
  PCI_PM_CAP_VER_1_1 |
@@ -400,8 +393,6 @@ e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, 
uint16_t pmc)
 
 pci_set_word(pdev->w1cmask + offset + PCI_PM_CTRL,
  PCI_PM_CTRL_PME_STATUS);
-
-return ret;
 }
 
 static void e1000e_write_config(PCIDevice *pci_dev, uint32_t address,
@@ -480,10 +471,7 @@ static void e1000e_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 trace_e1000e_msi_init_fail(ret);
 }
 
-if (e1000e_add_pm_capability(pci_dev, e1000e_pmrb_offset,
-  PCI_PM_CAP_DSI) < 0) {
-hw_error("Failed to initialize PM capability");
-}
+e1000e_add_pm_capability(pci_dev, e1000e_pmrb_offset, PCI_PM_CAP_DSI);
 
 if (pcie_aer_init(pci_dev, PCI_ERR_VER, e1000e_aer_offset,
   PCI_ERR_SIZEOF, NULL) < 0) {
-- 
2.38.1




Re: [PATCH v4 1/3] util/main-loop: Fix maximum number of wait objects for win32

2022-11-01 Thread Bin Meng
On Tue, Nov 1, 2022 at 8:03 PM Daniel P. Berrangé  wrote:
>
> On Tue, Nov 01, 2022 at 09:14:55AM +0800, Bin Meng wrote:
> > Hi Daniel,
> >
> > On Wed, Oct 26, 2022 at 12:41 AM Bin Meng  wrote:
> > >
> > > On Wed, Oct 19, 2022 at 6:20 PM Bin Meng  wrote:
> > > >
> > > > From: Bin Meng 
> > > >
> > > > The maximum number of wait objects for win32 should be
> > > > MAXIMUM_WAIT_OBJECTS, not MAXIMUM_WAIT_OBJECTS + 1.
> > > >
> > > > Signed-off-by: Bin Meng 
> > > > ---
> > > >
> > > > Changes in v4:
> > > > - make the out of bounds access protection explicit
> > > >
> > > > Changes in v3:
> > > > - move the check of adding the same HANDLE twice to a separete patch
> > > >
> > > > Changes in v2:
> > > > - fix the logic in qemu_add_wait_object() to avoid adding
> > > >   the same HANDLE twice
> > > >
> > > >  util/main-loop.c | 10 +-
> > > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > >
> > >
> > > Ping?
> >
> > Would you queue this series? Thanks!
>
> The main loop is not my area as maintainer - it would normally be
> Paolo IIRC.
>

Thanks, but Paolo has been silent since day 1 ...

Regards,
Bin



Re: [PATCH 0/4] hw: make TCO watchdog actually work by default for Q35

2022-11-01 Thread Igor Mammedov
On Mon, 31 Oct 2022 11:48:58 -0400
"Michael S. Tsirkin"  wrote:

> On Mon, Oct 31, 2022 at 01:50:24PM +, Daniel P. Berrangé wrote:
> > On Mon, Oct 31, 2022 at 01:19:30PM +, Daniel P. Berrangé wrote:  
> > > The TCO watchdog is unconditionally integrated into the Q35 machine
> > > type by default, but at the same time is unconditionally disabled
> > > from firing by a host config option that overrides guest OS attempts
> > > to enable it. People have to know to set a magic -global to make
> > > it non-broken  
> > 
> > Incidentally I found that originally the TCO watchdog was not
> > unconditionally enabled. Its exposure to the guest could be
> > turned on/off using
> > 
> >   -global ICH9-LPC.enable_tco=bool
> > 
> > This was implemented for machine type compat, but it also gave
> > apps a way to disable the watchdog functionality. Unfortunately
> > that ability was discarded in this series:
> > 
> >   
> > https://lore.kernel.org/all/1453564933-29638-1-git-send-email-ehabk...@redhat.com/
> > 
> > but the 'enable_tco' property still exists in QOM, but silently
> > ignored.
> > 
> > Seems we should either fix the impl of 'enable_tco', or remove the
> > QOM property entirely, so we don't pretend it can be toggled anymore.
> > 
> > With regards,
> > Daniel  
> 
> i am inclined to say you are right and the fix is to fix the impl.

Is there need for users to disable whatchdog at all?
It was always present since then and no one complained, 
so perhaps we should ditch property instead fixing it
to keep it simple.

> 
> > -- 
> > |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange 
> > :|
> > |: https://libvirt.org -o-https://fstop138.berrange.com 
> > :|
> > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange 
> > :|  
> 




[PATCH v9 04/17] ahci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/ide/ich.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 1007a51fcb..3b478b01f8 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -106,7 +106,7 @@ static void pci_ich9_ahci_init(Object *obj)
 static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp)
 {
 struct AHCIPCIState *d;
-int sata_cap_offset;
+uint8_t sata_cap_offset;
 uint8_t *sata_cap;
 d = ICH9_AHCI(dev);
 int ret;
@@ -130,11 +130,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
**errp)
  >ahci.mem);
 
 sata_cap_offset = pci_add_capability(dev, PCI_CAP_ID_SATA,
-  ICH9_SATA_CAP_OFFSET, SATA_CAP_SIZE,
-  errp);
-if (sata_cap_offset < 0) {
-return;
-}
+  ICH9_SATA_CAP_OFFSET, SATA_CAP_SIZE);
 
 sata_cap = dev->config + sata_cap_offset;
 pci_set_word(sata_cap + SATA_CAP_REV, 0x10);
-- 
2.38.1




Re: [PATCH v8 17/17] pci: Remove legacy errp from pci_add_capability

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 14:57, Akihiko Odaki wrote:

Signed-off-by: Akihiko Odaki 
---
  hw/pci/pci.c | 20 +---
  include/hw/pci/pci.h | 12 ++--
  2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index cce57f572c..41de7643af 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2532,14 +2532,11 @@ bool pci_check_capability_overlap(PCIDevice *pdev, 
uint8_t cap_id,
  }
  
  /*

- * On success, pci_add_capability_legacy() returns a positive value
- * that the offset of the pci capability.
- * On failure, it sets an error and returns a negative error
- * code.
+ * pci_add_capability() returns a positive value that the offset of the pci
+ * capability.


Simpler:

"Return: offset of the PCI capability."


   */
-int pci_add_capability_legacy(PCIDevice *pdev, uint8_t cap_id,
-  uint8_t offset, uint8_t size,
-  Error **errp)
+uint8_t pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
+   uint8_t offset, uint8_t size)
  {
  uint8_t *config;
  





[PATCH v8 05/17] e1000e: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index ac96f7665a..e433b8f9a5 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -377,17 +377,10 @@ e1000e_gen_dsn(uint8_t *mac)
(uint64_t)(mac[0])  << 56;
 }
 
-static int
+static void
 e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
-Error *local_err = NULL;
-int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
- PCI_PM_SIZEOF, _err);
-
-if (local_err) {
-error_report_err(local_err);
-return ret;
-}
+pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
 
 pci_set_word(pdev->config + offset + PCI_PM_PMC,
  PCI_PM_CAP_VER_1_1 |
@@ -400,8 +393,6 @@ e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, 
uint16_t pmc)
 
 pci_set_word(pdev->w1cmask + offset + PCI_PM_CTRL,
  PCI_PM_CTRL_PME_STATUS);
-
-return ret;
 }
 
 static void e1000e_write_config(PCIDevice *pci_dev, uint32_t address,
@@ -480,10 +471,7 @@ static void e1000e_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 trace_e1000e_msi_init_fail(ret);
 }
 
-if (e1000e_add_pm_capability(pci_dev, e1000e_pmrb_offset,
-  PCI_PM_CAP_DSI) < 0) {
-hw_error("Failed to initialize PM capability");
-}
+e1000e_add_pm_capability(pci_dev, e1000e_pmrb_offset, PCI_PM_CAP_DSI);
 
 if (pcie_aer_init(pci_dev, PCI_ERR_VER, e1000e_aer_offset,
   PCI_ERR_SIZEOF, NULL) < 0) {
-- 
2.38.1




Re: [PATCH v5 0/6] ppc/e500: Add support for two types of flash, cleanup

2022-11-01 Thread Bernhard Beschow
Am 1. November 2022 10:41:51 UTC schrieb Bernhard Beschow :
>On Mon, Oct 31, 2022 at 12:54 PM Philippe Mathieu-Daudé 
>wrote:
>
>> This is a respin of Bernhard's v4 with Freescale eSDHC implemented
>> as an 'UNIMP' region. See v4 cover here:
>>
>> https://lore.kernel.org/qemu-devel/20221018210146.193159-1-shen...@gmail.com/
>>
>> Only tested with the ppce500 machine (no further regression testing).
>>
>> Since v4:
>> - Do not rename ESDHC_* definitions to USDHC_*
>> - Do not modify SDHCIState structure
>>
>
>Works beautifully, both for the buildroot load and for my proprietary load.
>So:
>Tested-by: Bernhard Beschow
>
>>
>> Bernhard Beschow (4):
>>   hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power
>> of two
>>   docs/system/ppc/ppce500: Use qemu-system-ppc64 across the board(s)
>>   hw/ppc/e500: Implement pflash handling
>>   hw/ppc/e500: Add Freescale eSDHC to e500plat
>>
>> Philippe Mathieu-Daudé (2):
>>   hw/sd/sdhci: MMIO region is implemented in 32-bit accesses
>>   hw/sd/sdhci: Map host controller interface in host endianess

Hi Phil,

Is there a chance to get this in for 7.2?

Best regards,
Bernhard
>>
>>  docs/system/ppc/ppce500.rst |  38 +--
>>  hw/block/pflash_cfi01.c |   8 ++-
>>  hw/block/pflash_cfi02.c |   5 ++
>>  hw/ppc/Kconfig  |   3 +
>>  hw/ppc/e500.c   | 127 +++-
>>  hw/ppc/e500.h   |   1 +
>>  hw/ppc/e500plat.c   |   1 +
>>  hw/sd/sdhci.c   |   6 +-
>>  8 files changed, 180 insertions(+), 9 deletions(-)
>>
>> --
>> 2.37.3
>>
>>




Re: [PATCH v8 01/17] hw/vfio/pci: Ensure MSI and MSI-X do not overlap

2022-11-01 Thread Philippe Mathieu-Daudé

On 1/11/22 14:57, Akihiko Odaki wrote:

pci_add_capability() checks whether capabilities overlap, and notifies
its caller so that it can properly handle the case. However, in the
most cases, the capabilities actually never overlap, and the interface
incurred extra error handling code, which is often incorrect or
suboptimal. For such cases, pci_add_capability() can simply abort the
execution if the capabilities actually overlap since it should be a
programming error.

This change handles the other cases: hw/vfio/pci depends on the check to
decide MSI and MSI-X capabilities overlap with another. As they are
quite an exceptional and hw/vfio/pci knows much about PCI capabilities,
adding code specific to the cases to hw/vfio/pci still results in less
code than having error handling code everywhere in total.

Signed-off-by: Akihiko Odaki 
---
  hw/pci/pci.c | 34 ++
  hw/vfio/pci.c| 15 ++-
  include/hw/pci/pci.h |  3 +++
  3 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2f450f6a72..b53649d1fd 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2512,6 +2512,25 @@ static void pci_del_option_rom(PCIDevice *pdev)
  pdev->has_rom = false;
  }
  
+bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,

+  uint8_t offset, uint8_t size, Error **errp)
+{
+int i;
+
+for (i = offset; i < offset + size; i++) {
+if (pdev->used[i]) {
+error_setg(errp,
+   "%s:%02x:%02x.%x PCI capability %x at offset %x overlaps 
existing capability %x at offset %x",
+   pci_root_bus_path(pdev), pci_dev_bus_num(pdev),
+   PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+   cap_id, offset, pci_find_capability_at_offset(pdev, i), 
i);
+return true;
+}
+}
+
+return false;
+}


I apologize for jumping at v8 :/

Per the Error API, function taking an Error** as last argument should 
return TRUE on success; or FALSE on error and setting the *errp argument.


Your function return 'true' on error. The confusion might come from its
name 'pci_check_capability_overlap'.

> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index b54b6ef88f..77b264c17e 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -390,6 +390,9 @@ void pci_register_vga(PCIDevice *pci_dev, 
MemoryRegion *mem,

>   void pci_unregister_vga(PCIDevice *pci_dev);
>   pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num);
>

Please document function prototype of public APIs.

> +bool pci_check_capability_overlap(PCIDevice *pdev, uint8_t cap_id,
> +  uint8_t offset, uint8_t size, 
Error **errp);

> +
>   int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
>  uint8_t offset, uint8_t size,
>  Error **errp);

Also, consider configuring scripts/git.orderfile :)

Regards,

Phil.



[PATCH v8 13/17] pci/slotid: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of slotid_cap_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 hw/pci/slotid_cap.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/pci/slotid_cap.c b/hw/pci/slotid_cap.c
index 36d021b4a6..5da8c82133 100644
--- a/hw/pci/slotid_cap.c
+++ b/hw/pci/slotid_cap.c
@@ -12,7 +12,7 @@ int slotid_cap_init(PCIDevice *d, int nslots,
 unsigned offset,
 Error **errp)
 {
-int cap;
+uint8_t cap;
 
 if (!chassis) {
 error_setg(errp, "Bridge chassis not specified. Each bridge is 
required"
@@ -24,11 +24,7 @@ int slotid_cap_init(PCIDevice *d, int nslots,
 return -EINVAL;
 }
 
-cap = pci_add_capability(d, PCI_CAP_ID_SLOTID, offset,
- SLOTID_CAP_LENGTH, errp);
-if (cap < 0) {
-return cap;
-}
+cap = pci_add_capability(d, PCI_CAP_ID_SLOTID, offset, SLOTID_CAP_LENGTH);
 /* We make each chassis unique, this way each bridge is First in Chassis */
 d->config[cap + PCI_SID_ESR] = PCI_SID_ESR_FIC |
 (nslots << SLOTID_NSLOTS_SHIFT);
-- 
2.38.1




Re: [PATCH] migration: check magic value for deciding the mapping of channels

2022-11-01 Thread manish.mishra

Sorry for late patch on this. I mentioned i will send it last week itself, but 
later reliased it was festival week in India, so was mostly holidays.

Thanks

Manish Mishra

On 01/11/22 8:00 pm, manish.mishra wrote:

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the default channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, this patch
uses MSG_PEEK to check the magic number of channels so that current
data/control stream management remains un-effected.

Signed-off-by: manish.mishra
---
  include/io/channel.h | 25 +
  io/channel-socket.c  | 27 +++
  io/channel.c | 39 +++
  migration/migration.c| 33 +
  migration/multifd.c  | 12 
  migration/multifd.h  |  2 +-
  migration/postcopy-ram.c |  5 +
  migration/postcopy-ram.h |  2 +-
  8 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
  int **fds,
  size_t *nfds,
  Error **errp);
+ssize_t (*io_read_peek)(QIOChannel *ioc,
+void *buf,
+size_t nbytes,
+Error **errp);
  int (*io_close)(QIOChannel *ioc,
  Error **errp);
  GSource * (*io_create_watch)(QIOChannel *ioc,
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
size_t buflen,
Error **errp);
  
+/**

+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ *  occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+  void* buf,
+  size_t nbytes,
+  Error **errp);
+
  /**
   * qio_channel_set_blocking:
   * @ioc: the channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..b99f5dfda6 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
  }
  #endif /* WIN32 */
  
+static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,

+void *buf,
+size_t nbytes,
+Error **errp)
+{
+QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+ssize_t bytes = 0;
+
+retry:
+bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
+
+if (bytes < 0) {
+if (errno == EINTR) {
+goto retry;
+}
+if (errno == EAGAIN) {
+return QIO_CHANNEL_ERR_BLOCK;
+}
+
+error_setg_errno(errp, errno,
+ "Unable to read from peek of socket");
+return -1;
+}
+
+return bytes;
+}
  
  #ifdef QEMU_MSG_ZEROCOPY

  static int qio_channel_socket_flush(QIOChannel *ioc,
@@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
*klass,
  
  ioc_klass->io_writev = qio_channel_socket_writev;

  ioc_klass->io_readv = qio_channel_socket_readv;
+ioc_klass->io_read_peek = qio_channel_socket_read_peek;
  ioc_klass->io_set_blocking = qio_channel_socket_set_blocking;
  ioc_klass->io_close = qio_channel_socket_close;
  ioc_klass->io_shutdown = qio_channel_socket_shutdown;
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..a2d9b96f3f 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -346,6 +346,45 @@ int qio_channel_write_all(QIOChannel *ioc,
  return qio_channel_writev_all(ioc, , 1, errp);
  }
  
+int qio_channel_read_peek_all(QIOChannel *ioc,

+  void* buf,
+   

[PATCH v8 09/17] hw/pci/pci_bridge: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. A caller of pci_bridge_ssvid_init(), which calls
pci_add_capability() in turn, is expected to ensure that will not
happen.

Signed-off-by: Akihiko Odaki 
---
 hw/pci-bridge/i82801b11.c  | 14 ++
 hw/pci-bridge/pcie_root_port.c |  7 +--
 hw/pci-bridge/xio3130_downstream.c |  8 ++--
 hw/pci-bridge/xio3130_upstream.c   |  8 ++--
 hw/pci/pci_bridge.c| 21 ++---
 include/hw/pci/pci_bridge.h|  5 ++---
 6 files changed, 15 insertions(+), 48 deletions(-)

diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index f28181e210..f45dcdbacc 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -61,21 +61,11 @@ typedef struct I82801b11Bridge {
 
 static void i82801b11_bridge_realize(PCIDevice *d, Error **errp)
 {
-int rc;
-
 pci_bridge_initfn(d, TYPE_PCI_BUS);
 
-rc = pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
-   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_bridge;
-}
+pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
+  I82801ba_SSVID_SVID, I82801ba_SSVID_SSID);
 pci_config_set_prog_interface(d->config, PCI_CLASS_BRIDGE_PCI_INF_SUB);
-return;
-
-err_bridge:
-pci_bridge_exitfn(d);
 }
 
 static const VMStateDescription i82801b11_bridge_dev_vmstate = {
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index 460e48269d..a9d8c2adb4 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -74,12 +74,7 @@ static void rp_realize(PCIDevice *d, Error **errp)
 }
 pcie_port_init_reg(d);
 
-rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
-   rpc->ssid, errp);
-if (rc < 0) {
-error_append_hint(errp, "Can't init SSV ID, error %d\n", rc);
-goto err_bridge;
-}
+pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id, rpc->ssid);
 
 if (rpc->interrupts_init) {
 rc = rpc->interrupts_init(d, errp);
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index 05e2b06c0c..eea3d3a2df 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -81,12 +81,8 @@ static void xio3130_downstream_realize(PCIDevice *d, Error 
**errp)
 goto err_bridge;
 }
 
-rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
-   XIO3130_SSVID_SVID, XIO3130_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_msi;
-}
+pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+  XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
 
 rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM,
p->port, errp);
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 5ff46ef050..d954906d79 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -71,12 +71,8 @@ static void xio3130_upstream_realize(PCIDevice *d, Error 
**errp)
 goto err_bridge;
 }
 
-rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
-   XIO3130_SSVID_SVID, XIO3130_SSVID_SSID,
-   errp);
-if (rc < 0) {
-goto err_msi;
-}
+pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+  XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
 
 rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM,
p->port, errp);
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index da34c8ebcd..30032fed64 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -42,21 +42,15 @@
 #define PCI_SSVID_SVID  4
 #define PCI_SSVID_SSID  6
 
-int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
-  uint16_t svid, uint16_t ssid,
-  Error **errp)
+void pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
+   uint16_t svid, uint16_t ssid)
 {
-int pos;
+uint8_t pos;
 
-pos = pci_add_capability(dev, PCI_CAP_ID_SSVID, offset,
- PCI_SSVID_SIZEOF, errp);
-if (pos < 0) {
-return pos;
-}
+pos = pci_add_capability(dev, PCI_CAP_ID_SSVID, offset, PCI_SSVID_SIZEOF);
 
 pci_set_word(dev->config + pos + PCI_SSVID_SVID, svid);
 pci_set_word(dev->config + pos + PCI_SSVID_SSID, ssid);
-return pos;
 }
 
 /* Accessor function to get parent bridge device from pci bus. */
@@ -455,11 +449,8 @@ int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int 
cap_offset,
 .mem_pref_64 = cpu_to_le64(res_reserve.mem_pref_64)
 };
 
-int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,
- 

Re: [PATCH 11/16] hw/9pfs: Add Linux error number definition

2022-11-01 Thread Christian Schoenebeck
On Monday, October 24, 2022 6:57:54 AM CET Bin Meng wrote:
> From: Guohuai Shi 
> 
> When using 9p2000.L protocol, the errno should use the Linux errno.
> Currently magic numbers with comments are used. Replace these with
> macros for future expansion.
> 
> Signed-off-by: Guohuai Shi 
> Signed-off-by: Bin Meng 
> ---
> 
>  hw/9pfs/9p-linux-errno.h | 151 +++
>  hw/9pfs/9p-util.h|  38 ++
>  2 files changed, 176 insertions(+), 13 deletions(-)
>  create mode 100644 hw/9pfs/9p-linux-errno.h
> 
> diff --git a/hw/9pfs/9p-linux-errno.h b/hw/9pfs/9p-linux-errno.h
> new file mode 100644
> index 00..56c37fa293
> --- /dev/null
> +++ b/hw/9pfs/9p-linux-errno.h
> @@ -0,0 +1,151 @@
> +/*
> + * 9p Linux errno translation definition
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include 
> +
> +#ifndef QEMU_9P_LINUX_ERRNO_H
> +#define QEMU_9P_LINUX_ERRNO_H
> +
> +/*
> + * This file contains the Linux errno definitions to translate errnos set by
> + * the 9P server (running on non-Linux hosts) to a corresponding errno value.
> + *
> + * This list should be periodically reviewed and updated; particularly for
> + * errnos that might be set as a result of a file system operation.
> + */

Yeah, that's my main concern here. I wonder if there is isn't a better
maintainable solution at least for the list of Linux errors, so that we don't
have to manually update the L_ macros below.

> +
> +#define L_EPERM 1   /* Operation not permitted */
> +#define L_ENOENT2   /* No such file or directory */
> +#define L_ESRCH 3   /* No such process */
> +#define L_EINTR 4   /* Interrupted system call */
> +#define L_EIO   5   /* I/O error */
> +#define L_ENXIO 6   /* No such device or address */
> +#define L_E2BIG 7   /* Argument list too long */
> +#define L_ENOEXEC   8   /* Exec format error */
> +#define L_EBADF 9   /* Bad file number */
> +#define L_ECHILD10  /* No child processes */
> +#define L_EAGAIN11  /* Try again */
> +#define L_ENOMEM12  /* Out of memory */
> +#define L_EACCES13  /* Permission denied */
> +#define L_EFAULT14  /* Bad address */
> +#define L_ENOTBLK   15  /* Block device required */
> +#define L_EBUSY 16  /* Device or resource busy */
> +#define L_EEXIST17  /* File exists */
> +#define L_EXDEV 18  /* Cross-device link */
> +#define L_ENODEV19  /* No such device */
> +#define L_ENOTDIR   20  /* Not a directory */
> +#define L_EISDIR21  /* Is a directory */
> +#define L_EINVAL22  /* Invalid argument */
> +#define L_ENFILE23  /* File table overflow */
> +#define L_EMFILE24  /* Too many open files */
> +#define L_ENOTTY25  /* Not a typewriter */
> +#define L_ETXTBSY   26  /* Text file busy */
> +#define L_EFBIG 27  /* File too large */
> +#define L_ENOSPC28  /* No space left on device */
> +#define L_ESPIPE29  /* Illegal seek */
> +#define L_EROFS 30  /* Read-only file system */
> +#define L_EMLINK31  /* Too many links */
> +#define L_EPIPE 32  /* Broken pipe */
> +#define L_EDOM  33  /* Math argument out of domain of func */
> +#define L_ERANGE34  /* Math result not representable */
> +#define L_EDEADLK   35  /* Resource deadlock would occur */
> +#define L_ENAMETOOLONG  36  /* File name too long */
> +#define L_ENOLCK37  /* No record locks available */
> +#define L_ENOSYS38  /* Function not implemented */
> +#define L_ENOTEMPTY 39  /* Directory not empty */
> +#define L_ELOOP 40  /* Too many symbolic links encountered */
> +#define L_ENOMSG42  /* No message of desired type */
> +#define L_EIDRM 43  /* Identifier removed */
> +#define L_ECHRNG44  /* Channel number out of range */
> +#define L_EL2NSYNC  45  /* Level 2 not synchronized */
> +#define L_EL3HLT46  /* Level 3 halted */
> +#define L_EL3RST47  /* Level 3 reset */
> +#define L_ELNRNG48  /* Link number out of range */
> +#define L_EUNATCH   49  /* Protocol driver not attached */
> +#define L_ENOCSI50  /* No CSI structure available */
> +#define L_EL2HLT51  /* Level 2 halted */
> +#define L_EBADE 52  /* Invalid exchange */
> +#define L_EBADR 53  /* Invalid request descriptor */
> +#define L_EXFULL54  /* Exchange full */
> +#define L_ENOANO55  /* No anode */
> +#define L_EBADRQC   56  /* Invalid request code */
> +#define L_EBADSLT   57  /* Invalid slot */
> +#define L_EBFONT58 

[PATCH v8 06/17] eepro100: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/net/eepro100.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 679f52f80f..bf2ecdded9 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -549,12 +549,7 @@ static void e100_pci_reset(EEPRO100State *s, Error **errp)
 if (info->power_management) {
 /* Power Management Capabilities */
 int cfg_offset = 0xdc;
-int r = pci_add_capability(>dev, PCI_CAP_ID_PM,
-   cfg_offset, PCI_PM_SIZEOF,
-   errp);
-if (r < 0) {
-return;
-}
+pci_add_capability(>dev, PCI_CAP_ID_PM, cfg_offset, PCI_PM_SIZEOF);
 
 pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
 #if 0 /* TODO: replace dummy code for power management emulation. */
-- 
2.38.1




Re: [PULL 21/86] bios-tables-test: add test for number of cores > 255

2022-11-01 Thread Ani Sinha
On Tue, Nov 1, 2022 at 19:22 Jonathan Cameron 
wrote:

> On Mon, 31 Oct 2022 08:51:44 -0400
> "Michael S. Tsirkin"  wrote:
>
> > From: Julia Suvorova 
> >
> > The new test is run with a large number of cpus and checks if the
> > core_count field in smbios_cpu_test (structure type 4) is correct.
> >
> > Choose q35 as it allows to run with -smp > 255.
>
> Getting a failure on this on i386.
>
> qemu-system-i386: current -smp configuration requires kernel irqchip and
> X2APIC API support.
>
> Note that was on bisection of this pull request applied to current mainline
> (also in the CI report for the HMAT set - though there is another issue
> there.)


Can you point me to the CI report?


>
> My guess is fix is don't run it unless 64 bit?
>
> Jonathan
>
>
> >
> > Signed-off-by: Julia Suvorova 
> > Message-Id: <20220731162141.178443-5-jus...@redhat.com>
> > Message-Id: <2022101731.101412-5-jus...@redhat.com>
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > Reviewed-by: Igor Mammedov 
> > ---
> >  tests/qtest/bios-tables-test.c | 58 ++
> >  1 file changed, 45 insertions(+), 13 deletions(-)
> >
> > diff --git a/tests/qtest/bios-tables-test.c
> b/tests/qtest/bios-tables-test.c
> > index d4fbe6791d..e402b57d46 100644
> > --- a/tests/qtest/bios-tables-test.c
> > +++ b/tests/qtest/bios-tables-test.c
> > @@ -92,6 +92,8 @@ typedef struct {
> >  SmbiosEntryPoint smbios_ep_table;
> >  uint16_t smbios_cpu_max_speed;
> >  uint16_t smbios_cpu_curr_speed;
> > +uint8_t smbios_core_count;
> > +uint16_t smbios_core_count2;
> >  uint8_t *required_struct_types;
> >  int required_struct_types_len;
> >  QTestState *qts;
> > @@ -631,29 +633,42 @@ static inline bool smbios_single_instance(uint8_t
> type)
> >  }
> >  }
> >
> > -static bool smbios_cpu_test(test_data *data, uint32_t addr)
> > +static void smbios_cpu_test(test_data *data, uint32_t addr,
> > +SmbiosEntryPointType ep_type)
> >  {
> > -uint16_t expect_speed[2];
> > -uint16_t real;
> > +uint8_t core_count, expected_core_count = data->smbios_core_count;
> > +uint16_t speed, expected_speed[2];
> > +uint16_t core_count2, expected_core_count2 =
> data->smbios_core_count2;
> >  int offset[2];
> >  int i;
> >
> >  /* Check CPU speed for backward compatibility */
> >  offset[0] = offsetof(struct smbios_type_4, max_speed);
> >  offset[1] = offsetof(struct smbios_type_4, current_speed);
> > -expect_speed[0] = data->smbios_cpu_max_speed ? : 2000;
> > -expect_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
> > +expected_speed[0] = data->smbios_cpu_max_speed ? : 2000;
> > +expected_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
> >
> >  for (i = 0; i < 2; i++) {
> > -real = qtest_readw(data->qts, addr + offset[i]);
> > -if (real != expect_speed[i]) {
> > -fprintf(stderr, "Unexpected SMBIOS CPU speed: real %u
> expect %u\n",
> > -real, expect_speed[i]);
> > -return false;
> > -}
> > +speed = qtest_readw(data->qts, addr + offset[i]);
> > +g_assert_cmpuint(speed, ==, expected_speed[i]);
> >  }
> >
> > -return true;
> > +core_count = qtest_readb(data->qts,
> > +addr + offsetof(struct smbios_type_4, core_count));
> > +
> > +if (expected_core_count) {
> > +g_assert_cmpuint(core_count, ==, expected_core_count);
> > +}
> > +
> > +if (ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
> > +core_count2 = qtest_readw(data->qts,
> > +  addr + offsetof(struct smbios_type_4,
> core_count2));
> > +
> > +/* Core Count has reached its limit, checking Core Count 2 */
> > +if (expected_core_count == 0xFF && expected_core_count2) {
> > +g_assert_cmpuint(core_count2, ==, expected_core_count2);
> > +}
> > +}
> >  }
> >
> >  static void test_smbios_structs(test_data *data, SmbiosEntryPointType
> ep_type)
> > @@ -686,7 +701,7 @@ static void test_smbios_structs(test_data *data,
> SmbiosEntryPointType ep_type)
> >  set_bit(type, struct_bitmap);
> >
> >  if (type == 4) {
> > -g_assert(smbios_cpu_test(data, addr));
> > +smbios_cpu_test(data, addr, ep_type);
> >  }
> >
> >  /* seek to end of unformatted string area of this struct
> ("\0\0") */
> > @@ -908,6 +923,21 @@ static void test_acpi_q35_tcg(void)
> >  free_test_data();
> >  }
> >
> > +static void test_acpi_q35_tcg_core_count2(void)
> > +{
> > +test_data data = {
> > +.machine = MACHINE_Q35,
> > +.variant = ".core-count2",
> > +.required_struct_types = base_required_struct_types,
> > +.required_struct_types_len =
> ARRAY_SIZE(base_required_struct_types),
> > +.smbios_core_count = 0xFF,
> > +.smbios_core_count2 = 275,
> > +};
> > +
> > +

[PULL v3 for 7.2 00/31] testing and plugin updates

2022-11-01 Thread Alex Bennée
The following changes since commit 5107fd3effb1cfec3b96d9e819f1605048640e31:

  net/vhost-vdpa.c: Fix clang compilation failure (2022-10-31 13:01:31 -0400)

are available in the Git repository at:

  https://github.com/stsquad/qemu.git tags/pull-testing-for-7.2-011122-3

for you to fetch changes up to 339bf0c071eff5e6ff1d9ddb3ad5cd02e4cd9ca3:

  tests/vm: use -o IdentitiesOnly=yes for ssh (2022-10-31 20:37:59 +)


testing and plugin updates for 7.2:

  - cleanup win32/64 docker files
  - update test-mingw test
  - add flex/bison to debian-all-test
  - handle --enable-static/--disable-pie in config
  - extend timeouts on x86_64 avocado tests
  - add flex/bison to debian-hexagon-cross
  - use regular semihosting for nios2 check-tcg
  - fix obscure linker error to nios2 softmmu tests
  - various windows portability fixes for tests
  - clean-up of MAINTAINERS
  - use -machine none when appropriate in avocado
  - make raspi2_initrd test detect shutdown
  - disable sh4 rd2 tests on gitlab
  - re-enable threadcount/linux-test for sh4
  - clean-up s390x handling of "ex" instruction
  - better handle new CPUs in execlog plugin
  - pass CONFIG_DEBUG_TCG to plugin builds
  - try and avoid races in test-io-channel-command
  - speed up ssh key checking for tests/vm


Alex Bennée (21):
  tests/lcitool: Rename non-Debian specific helper
  tests/docker: update fedora-win[32|64]-cross with lcitool
  tests/lcitool: Refresh to latest libvirt-ci module
  tests/docker: update test-mingw to run single build
  configure: don't enable cross compilers unless in target_list
  configure: fix the --enable-static --disable-pie case
  tests/avocado: extend the timeout for x86_64 tcg tests
  tests/tcg: use regular semihosting for nios2-softmmu
  MAINTAINERS: add entries for the key build bits
  MAINTAINERS: add features_to_c.sh to gdbstub files
  MAINTAINERS: fix-up for check-tcg Makefile changes
  tests/avocado: set -machine none for userfwd and vnc tests
  tests/avocado: disable sh4 rd2 tests on Gitlab
  tests/tcg: re-enable linux-test for sh4
  tests/tcg: re-enable threadcount for sh4
  target/s390x: don't use ld_code2 to probe next pc
  target/s390x: don't probe next pc for EXecuted insns
  target/s390x: fake instruction loading when handling 'ex'
  contrib/plugins: enable debug on CONFIG_DEBUG_TCG
  contrib/plugins: protect execlog's last_exec expansion
  tests/unit: cleanups for test-io-channel-command

Anton Johansson (2):
  tests/docker: Add flex/bison to `debian-all-test`
  tests/docker: Add flex/bison to `debian-hexagon-cross`

Bin Meng (4):
  semihosting/arm-compat-semi: Avoid using hardcoded /tmp
  tcg: Avoid using hardcoded /tmp
  block/vvfat: Unify the mkdir() call
  hw/usb: dev-mtp: Use g_mkdir()

Ilya Leoshkevich (1):
  tests/vm: use -o IdentitiesOnly=yes for ssh

Paolo Bonzini (1):
  tests/tcg: include CONFIG_PLUGIN in config-host.mak

Peter Maydell (1):
  tests/avocado: raspi2_initrd: Wait for guest shutdown message before 
stopping

Richard Henderson (1):
  tests/tcg/nios2: Tweak 10m50-ghrd.ld

 configure  |  17 ++-
 include/exec/translator.h  |  17 +++
 block/vvfat.c  |   9 +-
 contrib/plugins/execlog.c  |  38 --
 hw/usb/dev-mtp.c   |   4 +-
 semihosting/arm-compat-semi.c  |   3 +-
 target/s390x/tcg/translate.c   |  14 ++-
 tcg/tcg.c  |   3 +-
 tests/unit/test-io-channel-command.c   |  45 ---
 MAINTAINERS|  29 -
 contrib/plugins/Makefile   |   1 +
 tests/avocado/boot_linux.py|   1 +
 tests/avocado/boot_linux_console.py|   7 +-
 tests/avocado/info_usernet.py  |   3 +
 tests/avocado/vnc.py   |   1 +
 .../dockerfiles/debian-all-test-cross.docker   |   2 +
 .../docker/dockerfiles/debian-hexagon-cross.docker |   2 +-
 tests/docker/dockerfiles/fedora-win32-cross.docker | 139 +++--
 tests/docker/dockerfiles/fedora-win64-cross.docker | 138 ++--
 tests/docker/test-mingw|  16 +--
 tests/lcitool/libvirt-ci   |   2 +-
 tests/lcitool/refresh  |  48 ---
 tests/tcg/nios2/10m50-ghrd.ld  |  14 ++-
 tests/tcg/nios2/Makefile.softmmu-target|   3 +-
 tests/tcg/sh4/Makefile.target  |  12 --
 tests/vm/basevm.py |   3 +-
 26 files changed, 396 insertions(+), 175 deletions(-)

-- 

[PATCH v9 16/17] virtio-pci: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 include/hw/virtio/virtio-pci.h | 2 +-
 hw/virtio/virtio-pci.c | 9 ++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index 2446dcd9ae..9f3736723c 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -141,7 +141,7 @@ struct VirtIOPCIProxy {
 uint32_t msix_bar_idx;
 uint32_t modern_io_bar_idx;
 uint32_t modern_mem_bar_idx;
-int config_cap;
+uint8_t config_cap;
 uint32_t flags;
 bool disable_modern;
 bool ignore_backend_features;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c37bdc77ea..b393ff01be 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1154,8 +1154,7 @@ static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
 PCIDevice *dev = >pci_dev;
 int offset;
 
-offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, 0,
-cap->cap_len, _abort);
+offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, 0, cap->cap_len);
 
 assert(cap->cap_len >= sizeof *cap);
 memcpy(dev->config + offset + PCI_CAP_FLAGS, >cap_len,
@@ -1864,11 +1863,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 
 pcie_endpoint_cap_init(pci_dev, 0);
 
-pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0,
- PCI_PM_SIZEOF, errp);
-if (pos < 0) {
-return;
-}
+pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
 
 pci_dev->exp.pm_cap = pos;
 
-- 
2.38.1




[PATCH v9 06/17] eepro100: Omit errp for pci_add_capability

2022-11-01 Thread Akihiko Odaki
Omitting errp for pci_add_capability() causes it to abort if
capabilities overlap. This behavior is appropriate here because all of
the capabilities set in this device are defined in the program and
their overlap should not happen unless there is a programming error.

Signed-off-by: Akihiko Odaki 
---
 hw/net/eepro100.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 679f52f80f..bf2ecdded9 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -549,12 +549,7 @@ static void e100_pci_reset(EEPRO100State *s, Error **errp)
 if (info->power_management) {
 /* Power Management Capabilities */
 int cfg_offset = 0xdc;
-int r = pci_add_capability(>dev, PCI_CAP_ID_PM,
-   cfg_offset, PCI_PM_SIZEOF,
-   errp);
-if (r < 0) {
-return;
-}
+pci_add_capability(>dev, PCI_CAP_ID_PM, cfg_offset, PCI_PM_SIZEOF);
 
 pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
 #if 0 /* TODO: replace dummy code for power management emulation. */
-- 
2.38.1




Re: [PATCH v4 1/3] util/main-loop: Fix maximum number of wait objects for win32

2022-11-01 Thread Philippe Mathieu-Daudé

On 19/10/22 12:20, Bin Meng wrote:

From: Bin Meng 

The maximum number of wait objects for win32 should be
MAXIMUM_WAIT_OBJECTS, not MAXIMUM_WAIT_OBJECTS + 1.

Signed-off-by: Bin Meng 
---

Changes in v4:
- make the out of bounds access protection explicit

Changes in v3:
- move the check of adding the same HANDLE twice to a separete patch

Changes in v2:
- fix the logic in qemu_add_wait_object() to avoid adding
   the same HANDLE twice

  util/main-loop.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/util/main-loop.c b/util/main-loop.c
index f00a25451b..de38876064 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -363,10 +363,10 @@ void qemu_del_polling_cb(PollingFunc *func, void *opaque)
  /* Wait objects support */
  typedef struct WaitObjects {
  int num;
-int revents[MAXIMUM_WAIT_OBJECTS + 1];
-HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
-WaitObjectFunc *func[MAXIMUM_WAIT_OBJECTS + 1];
-void *opaque[MAXIMUM_WAIT_OBJECTS + 1];
+int revents[MAXIMUM_WAIT_OBJECTS];
+HANDLE events[MAXIMUM_WAIT_OBJECTS];
+WaitObjectFunc *func[MAXIMUM_WAIT_OBJECTS];
+void *opaque[MAXIMUM_WAIT_OBJECTS];
  } WaitObjects;
  
  static WaitObjects wait_objects = {0};

@@ -395,7 +395,7 @@ void qemu_del_wait_object(HANDLE handle, WaitObjectFunc 
*func, void *opaque)
  if (w->events[i] == handle) {
  found = 1;
  }
-if (found) {
+if (found && i < (MAXIMUM_WAIT_OBJECTS - 1)) {


Matter of style, I find this form easier to review (same logic than
what follows):

   if (found && i + 1 < MAXIMUM_WAIT_OBJECTS) {


  w->events[i] = w->events[i + 1];
  w->func[i] = w->func[i + 1];
  w->opaque[i] = w->opaque[i + 1];


Reviewed-by: Philippe Mathieu-Daudé 




  1   2   >