date:20230801

Re: [PATCH v6 6/8] linux-user: Show heap address in /proc/pid/maps

2023-08-01 Thread Philippe Mathieu-Daudé


Hi Helge,

On 2/8/23 01:27, Helge Deller wrote:

Show the memory location of the heap in the /proc/pid/maps file inside
the guest. Store the heap address in ts->heap_base, which requires to
make that variable accessible for all guest architectures, not just
architectures for semihosted binaries (arm, m68k, riscv).

Note that /proc/pid/maps in the guest needs to show target-aligned
addresses. This is fixed in this patch, so now the heap and stack
address for architectures like sparc64 and alpha now show up in that
output as well.

Show 32- and 64-bit pointers with 8 digits and leading zeros (%08x/%08lx).
For 64-bit we could use %16lx, but we mimic the Linux kernel, which shows
even 64-bit addresses with %08lx.


You are describing 3 changes, do you mind splitting in 3 patches?
Otherwise LGTM.


Example:

user@machine:/# uname -a
Linux paq 5.15.88+ #47 SMP Sun Jan 15 12:53:11 CET 2023 aarch64 GNU/Linux

user@machine:/# cat /proc/self/maps
Linux p100 6.4.4-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jul 19 16:32:49 UTC 
2023 aarch64 GNU/Linux
55-559000 r-xp  fd:00 570430 
/usr/bin/cat
559000-550001f000 ---p  00:00 0
550001f000-550002 r--p f000 fd:00 570430 
/usr/bin/cat
550002-5500021000 rw-p 0001 fd:00 570430 
/usr/bin/cat
5500021000-5500042000 rw-p  00:00 0  [heap]
70-701000 ---p  00:00 0
701000-7000801000 rw-p  00:00 0  [stack]
7000801000-7000827000 r-xp  fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000827000-700083f000 ---p  00:00 0
700083f000-7000841000 r--p 0002e000 fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000841000-7000843000 rw-p 0003 fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000843000-7000844000 r-xp  00:00 0
7000844000-7000846000 rw-p  00:00 0
700085-70009d7000 r-xp  fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009d7000-70009ed000 ---p 00187000 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009ed000-70009f r--p 0018d000 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009f-70009f2000 rw-p 0019 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6

Signed-off-by: Helge Deller 
---
  include/exec/cpu_ldst.h |  4 ++--
  linux-user/main.c   |  2 ++
  linux-user/qemu.h   |  4 ++--
  linux-user/syscall.c| 13 +
  4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index 645476f0e5..f1e6f31e88 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -72,10 +72,10 @@
   */
  #if TARGET_VIRT_ADDR_SPACE_BITS <= 32
  typedef uint32_t abi_ptr;
-#define TARGET_ABI_FMT_ptr "%x"
+#define TARGET_ABI_FMT_ptr "%08x"
  #else
  typedef uint64_t abi_ptr;
-#define TARGET_ABI_FMT_ptr "%"PRIx64
+#define TARGET_ABI_FMT_ptr "%08"PRIx64
  #endif


This is patch #1,



  #ifndef TARGET_TAGGED_ADDRESSES
diff --git a/linux-user/main.c b/linux-user/main.c
index dba67ffa36..fa6e47510f 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -946,6 +946,7 @@ int main(int argc, char **argv, char **envp)
  }
  }

+info->brk = TARGET_PAGE_ALIGN(info->brk);


Patch #3,


  target_set_brk(info->brk);
  syscall_init();
  signal_init();
@@ -955,6 +956,7 @@ int main(int argc, char **argv, char **envp)
 the real value of GUEST_BASE into account.  */
  tcg_prologue_init(tcg_ctx);

+ts->heap_base = info->brk;


Patch #3,


  target_cpu_copy_regs(env, regs);

  if (gdbstub) {
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 802794db63..7a6adac637 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -121,11 +121,11 @@ typedef struct TaskState {
  #ifdef TARGET_M68K
  abi_ulong tp_value;
  #endif
-#if defined(TARGET_ARM) || defined(TARGET_M68K) || defined(TARGET_RISCV)
+


Patch #3,


  /* Extra fields for semihosted binaries.  */
  abi_ulong heap_base;
  abi_ulong heap_limit;
-#endif
+
  abi_ulong stack_base;
  int used; /* non zero if used */
  struct image_info *info;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 475260b7ce..dc8266c073 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8078,8 +8078,9 @@ static int open_self_maps_1(CPUArchState *cpu_env, int 
fd, bool smaps)
  MapInfo *e = (MapInfo *) s->data;

  if (h2g_valid(e->start)) {
-unsigned long min = e->start;
-unsigned long max = e->end;
+/* show page granularity of guest in /proc/pid/maps */
+unsigned long min = TARGET_PAGE_ALIGN(e->start);
+unsigned long max =

Re: [PATCH v3 3/3] cpu, softmmu/vl.c: Change parsing of -cpu argument to allow -cpu cpu,help to print options for the CPU type similar to how the '-device' option works.

2023-08-01 Thread Markus Armbruster

Dinah B  writes:

> Thanks, I will fix this. I somehow didn't catch that you had replied to the
> old one.

Happens :)

Re: [PATCH v2 4/4] virtio-net: Add support for USO features

2023-08-01 Thread Akihiko Odaki


On 2023/08/01 7:31, Yuri Benditovich wrote:

USO features of virtio-net device depend on kernel ability
to support them, for backward compatibility by default the
features are disabled on 8.0 and earlier.

Signed-off-by: Yuri Benditovich 
Signed-off-by: Andrew Melnychecnko 
---
  hw/core/machine.c   |  4 
  hw/net/virtio-net.c | 31 +--
  2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index f0d35c6401..a725e76738 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -38,10 +38,14 @@
  #include "exec/confidential-guest-support.h"
  #include "hw/virtio/virtio.h"
  #include "hw/virtio/virtio-pci.h"
+#include "hw/virtio/virtio-net.h"
  
  GlobalProperty hw_compat_8_0[] = {

  { "migration", "multifd-flush-after-each-section", "on"},
  { TYPE_PCI_DEVICE, "x-pcie-ari-nextfn-1", "on" },
+{ TYPE_VIRTIO_NET, "host_uso", "off"},
+{ TYPE_VIRTIO_NET, "guest_uso4", "off"},
+{ TYPE_VIRTIO_NET, "guest_uso6", "off"},


Nitpick: Add a whitespace before closing brackets '}'.


  };
  const size_t hw_compat_8_0_len = G_N_ELEMENTS(hw_compat_8_0);
  
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c

index d2311e7d6e..bd0ead94fe 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -659,6 +659,15 @@ static int peer_has_ufo(VirtIONet *n)
  return n->has_ufo;
  }
  
+static int peer_has_uso(VirtIONet *n)

+{
+if (!peer_has_vnet_hdr(n)) {
+return 0;
+}
+
+return qemu_has_uso(qemu_get_queue(n->nic)->peer);
+}
+
  static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs,
 int version_1, int hash_report)
  {
@@ -796,6 +805,10 @@ static uint64_t virtio_net_get_features(VirtIODevice 
*vdev, uint64_t features,
  virtio_clear_feature(, VIRTIO_NET_F_GUEST_TSO6);
  virtio_clear_feature(, VIRTIO_NET_F_GUEST_ECN);
  
+virtio_clear_feature(, VIRTIO_NET_F_HOST_USO);

+virtio_clear_feature(, VIRTIO_NET_F_GUEST_USO4);
+virtio_clear_feature(, VIRTIO_NET_F_GUEST_USO6);
+
  virtio_clear_feature(, VIRTIO_NET_F_HASH_REPORT);
  }
  
@@ -804,6 +817,12 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,

  virtio_clear_feature(, VIRTIO_NET_F_HOST_UFO);
  }
  
+if (!peer_has_uso(n)) {

+virtio_clear_feature(, VIRTIO_NET_F_HOST_USO);
+virtio_clear_feature(, VIRTIO_NET_F_GUEST_USO4);
+virtio_clear_feature(, VIRTIO_NET_F_GUEST_USO6);
+}
+
  if (!get_vhost_net(nc->peer)) {
  return features;
  }
@@ -864,14 +883,16 @@ static void virtio_net_apply_guest_offloads(VirtIONet *n)
  !!(n->curr_guest_offloads & (1ULL << VIRTIO_NET_F_GUEST_USO6)));
  }
  
-static uint64_t virtio_net_guest_offloads_by_features(uint32_t features)

+static uint64_t virtio_net_guest_offloads_by_features(uint64_t features)
  {
  static const uint64_t guest_offloads_mask =
  (1ULL << VIRTIO_NET_F_GUEST_CSUM) |
  (1ULL << VIRTIO_NET_F_GUEST_TSO4) |
  (1ULL << VIRTIO_NET_F_GUEST_TSO6) |
  (1ULL << VIRTIO_NET_F_GUEST_ECN)  |
-(1ULL << VIRTIO_NET_F_GUEST_UFO);
+(1ULL << VIRTIO_NET_F_GUEST_UFO)  |
+(1ULL << VIRTIO_NET_F_GUEST_USO4) |
+(1ULL << VIRTIO_NET_F_GUEST_USO6);
  
  return guest_offloads_mask & features;

  }
@@ -3924,6 +3945,12 @@ static Property virtio_net_properties[] = {
  DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed, SPEED_UNKNOWN),
  DEFINE_PROP_STRING("duplex", VirtIONet, net_conf.duplex_str),
  DEFINE_PROP_BOOL("failover", VirtIONet, failover, false),
+DEFINE_PROP_BIT64("guest_uso4", VirtIONet, host_features,
+  VIRTIO_NET_F_GUEST_USO4, true),
+DEFINE_PROP_BIT64("guest_uso6", VirtIONet, host_features,
+  VIRTIO_NET_F_GUEST_USO6, true),
+DEFINE_PROP_BIT64("host_uso", VirtIONet, host_features,
+  VIRTIO_NET_F_HOST_USO, true),
  DEFINE_PROP_END_OF_LIST(),
  };

Re: [PATCH] vdpa: set old virtio status at cvq isolation probing end

2023-08-01 Thread Jason Wang

On Tue, Aug 1, 2023 at 11:28 AM Jason Wang  wrote:
>
> On Mon, Jul 31, 2023 at 5:41 PM Eugenio Perez Martin
>  wrote:
> >
> > On Mon, Jul 31, 2023 at 10:42 AM Jason Wang  wrote:
> > >
> > > On Mon, Jul 31, 2023 at 4:05 PM Eugenio Perez Martin
> > >  wrote:
> > > >
> > > > On Mon, Jul 31, 2023 at 8:36 AM Jason Wang  wrote:
> > > > >
> > > > > On Wed, Jul 26, 2023 at 2:27 PM Eugenio Perez Martin
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, Jul 26, 2023 at 4:07 AM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, Jul 26, 2023 at 2:21 AM Eugenio Pérez 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > The device already has a virtio status set by vhost_vdpa_init 
> > > > > > > > by the
> > > > > > > > time vhost_vdpa_probe_cvq_isolation is called. vhost_vdpa_init 
> > > > > > > > set
> > > > > > > > S_ACKNOWLEDGE and S_DRIVER, so it is invalid to just reset it.
> > > > > > > >
> > > > > > > > It is invalid to start the device after it, but all devices 
> > > > > > > > seems to be
> > > > > > > > fine with it.  Fixing qemu so it follows virtio start procedure.
> > > > > > > >
> > > > > > > > Fixes: 152128d64697 ("vdpa: move CVQ isolation check to 
> > > > > > > > net_init_vhost_vdpa")
> > > > > > > > Reported-by: Dragos Tatulea 
> > > > > > > > Signed-off-by: Eugenio Pérez 
> > > > > > > > ---
> > > > > > > >  net/vhost-vdpa.c | 2 ++
> > > > > > > >  1 file changed, 2 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > > > > index 9795306742..d7e2b714b4 100644
> > > > > > > > --- a/net/vhost-vdpa.c
> > > > > > > > +++ b/net/vhost-vdpa.c
> > > > > > > > @@ -1333,6 +1333,8 @@ static int 
> > > > > > > > vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > > > > > >  out:
> > > > > > > >  status = 0;
> > > > > > > >  ioctl(device_fd, VHOST_VDPA_SET_STATUS, );
> > > > > > > > +status = VIRTIO_CONFIG_S_ACKNOWLEDGE | 
> > > > > > > > VIRTIO_CONFIG_S_DRIVER;
> > > > > > > > +ioctl(device_fd, VHOST_VDPA_SET_STATUS, );
> > > > > > >
> > > > > > > So if we fail after FEATURES_OK, this basically clears that bit. 
> > > > > > > Spec
> > > > > > > doesn't say it can or not, I wonder if a reset is better?
> > > > > > >
> > > > > >
> > > > > > I don't follow this, the reset is just above the added code, isn't 
> > > > > > it?
> > > > >
> > > > > I meant for error path:
> > > > >
> > > > > E.g:
> > > > > uint8_t status = VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > > >  VIRTIO_CONFIG_S_DRIVER |
> > > > >  VIRTIO_CONFIG_S_FEATURES_OK;
> > > > > ...
> > > > > r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, );
> > > > > 
> > > > > if (cvq_group != -ENOTSUP) {
> > > > > r = cvq_group;
> > > > > goto out;
> > > > > }
> > > > >
> > > > > out:
> > > > > status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
> > > > > ioctl(device_fd, VHOST_VDPA_SET_STATUS, );
> > > > >
> > > > > We're basically clearing FEATURES_OK?
> > > > >
> > > >
> > > > Yes, it is the state that previous functions (vhost_vdpa_init) set. We
> > > > need to leave it that way, either if the backend supports cvq
> > > > isolation or not, or in the case of an error. Not doing that way makes
> > > > vhost_dev_start (and vhost_vdpa_set_features) set the features before
> > > > setting VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER.
> > > > Otherwise, the guest can (and do) access to config space before
> > > > _S_ACKNOWLEDGE | _S_DRIVER.
> > >
> > > I'm not sure if it is supported by the spec or not (I meant clearing
> > > the FEATURES_OK). Or maybe we need a reset here?
> > >
> >
> > Sorry, I'm still missing it :). The reset just above in all fail
> > paths. They go to "out" label, and the first ioctl reset the device,
> > the second set the VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > VIRTIO_CONFIG_S_DRIVER.
>
> Just to make sure we are at the same page:
>
> On error we basically do:
>
> set_status(VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER |
> VIRTIO_CONFIG_S_FEATURES_OK);
> ...
> set_status(VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER)
>
> So it means the device allows the driver to clear FEATURES_OK. But
> spec is unclear whether or not this is supported. So I'm not sure it
> is supported by all devices.

Ok, I think I miss the set_status(0), so this patch should be fine.

Acked-by: Jason Wang 

Thanks

>
> Thanks
>
> >
> > > Thanks
> > >
> > > >
> > > >
> > > > > >
> > > > > > > Btw, spec requires a read of status after setting FEATURES_OK, 
> > > > > > > this
> > > > > > > seems to be missed in the current code.
> > > > > > >
> > > > > >
> > > > > > I'm ok with that, but this patch does not touch that part.
> > > > > >
> > > > > > To fix this properly we should:
> > > > > > - Expose vhost_vdpa_set_dev_features_fd as we did in previous 
> > > > > > versions
> > > > > > of the series that added vhost_vdpa_probe_cvq_isolation [1].
> > > > > > -

Re: [PATCH v6 0/8] linux-user: brk fixes

2023-08-01 Thread Joel Stanley

On Tue, 1 Aug 2023 at 23:28, Helge Deller  wrote:
>
> This patch series is a fix-up for some current problems
> regarding heap memory / brk handling in qemu which happens
> on some 32-bit platforms, e.g. problems loading static
> binaries.
>
> This series includes the 5 patches from Akihiko Odaki
> with some additional fixes and cleanups by me.

This has the same segfault as the branch that I previously tested,
when running on a ppc64le host..

As a reminder, the ppc64le machine (normally, and does in this case)
uses a 64K page size. I think this is a detail that is missing from
your chroot testing.


>
> Akihiko Odaki (5):
>   linux-user: Unset MAP_FIXED_NOREPLACE for host
>   linux-user: Do not call get_errno() in do_brk()
>   linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
>   linux-user: Do nothing if too small brk is specified
>   linux-user: Do not align brk with host page size
>
> Helge Deller (3):
>   linux-user: Show heap address in /proc/pid/maps
>   linux-user: Optimize memory layout for static and dynamic executables
>   linux-user: Load pie executables at upper memory
>
>  include/exec/cpu_ldst.h |  4 +--
>  linux-user/elfload.c| 59 ++
>  linux-user/loader.h | 12 +++
>  linux-user/main.c   |  2 ++
>  linux-user/mmap.c   | 35 ++
>  linux-user/qemu.h   |  4 +--
>  linux-user/syscall.c| 80 -
>  7 files changed, 79 insertions(+), 117 deletions(-)
>
> --
> 2.41.0
>

Re: [PATCH QEMU 1/2] qapi: Reformat and craft the migration doc comments

2023-08-01 Thread Yong Huang

On Tue, Aug 1, 2023 at 8:33 PM Markus Armbruster  wrote:

> Yong Huang  writes:
>
> > On Fri, Jul 28, 2023 at 3:49 PM Markus Armbruster 
> wrote:
> >
> >> ~hyman  writes:
> >>
> >> > From: Hyman Huang(黄勇) 
> >> >
> >> > Reformat migration doc comments to conform to current conventions
> >> > as commit a937b6aa739 (qapi: Reformat doc comments to conform to
> >> > current conventions).
> >> >
> >> > Also, craft the dirty-limit capability comment.
> >>
> >> Split into two patches?
> >>
> > Ok.
> >
> >>
> >> > Signed-off-by: Hyman Huang(黄勇) 
> >> > ---
> >> >  qapi/migration.json | 66
> +
> >> >  1 file changed, 31 insertions(+), 35 deletions(-)
> >> >
> >> > diff --git a/qapi/migration.json b/qapi/migration.json
> >> > index 6b49593d2f..5d5649c885 100644
> >> > --- a/qapi/migration.json
> >> > +++ b/qapi/migration.json
> >> > @@ -258,17 +258,17 @@
> >> >  # blocked.  Present and non-empty when migration is blocked.
> >> >  # (since 6.0)
> >> >  #
> >> > -# @dirty-limit-throttle-time-per-round: Maximum throttle time
> (inmicroseconds) of virtual
> >> > -#   CPUs each dirty ring
> fullround, which shows how
> >> > -#   MigrationCapability
> dirty-limitaffects the guest
> >> > -#   during live migration.
> (since8.1)
> >> > -#
> >> > -# @dirty-limit-ring-full-time: Estimated average dirty ring full
> time(in microseconds)
> >> > -#  each dirty ring full round, note
> thatthe value equals
> >> > -#  dirty ring memory size divided
> byaverage dirty page rate
> >> > -#  of virtual CPU, which can be used
> toobserve the average
> >> > -#  memory load of virtual CPU
> indirectly.Note that zero
> >> > -#  means guest doesn't dirty memory
> (since8.1)
> >> > +# @dirty-limit-throttle-time-per-round: Maximum throttle time
> >> > +# (in microseconds) of virtual CPUs each dirty ring full round,
> >> > +# which shows how MigrationCapability dirty-limit affects the
> >>
> >> Perhaps "for each ... round"?
> >>
> >> Remind me, what's a "dirty ring full round"?
> >>
> > Every time the x86 PML buffer is filled with gpa, hardware throws an
> > exception and
> > guest exits to kvm, then to qemu. Qemu will handle the exception with
> > reaping the
> > dirty ring and get the dirty page info, then enter the kvm, empty the PML
> > buffer and
> > enter guests again, i call this "dirty ring full round", but it seems not
> > straightforward,
> > please help me describe that,  thanks.
>
> "x86 PML" is page modification logging, right?
>
Yes.

The dirty ring full round may be actually imprecise indeed. I'll try to
refactor the
comment in the next version, hoping to do better.

>
> >> > +# guest during live migration.  (Since 8.1)
> >> > +#
> >> > +# @dirty-limit-ring-full-time: Estimated average dirty ring full
> >> > +# time (in microseconds) each dirty ring full round. The value
> >>
> >> Likewise.
> >>
> >> > +# equals dirty ring memory size divided by average dirty page
> >>
> >> "the dirty ring memory size divided by the average ..."
> >>
> >> > +# rate of the virtual CPU, which can be used to observe the
> >> > +# average memory load of the virtual CPU indirectly. Note that
> >> > +# zero means guest doesn't dirty memory.  (Since 8.1)
> >>
> >> Two spaces between sentences for consistency.
> >>
> >> >  #
> >> >  # Since: 0.14
> >> >  ##
> >> > @@ -519,15 +519,11 @@
> >> >  # are present.  'return-path' capability must be enabled to use
> >> >  # it.  (since 8.1)
> >> >  #
> >> > -# @dirty-limit: If enabled, migration will use the dirty-limit algo
> to
> >> > -#   throttle down guest instead of auto-converge algo.
> >> > -#   Throttle algo only works when vCPU's dirtyrate
> greater
> >> > -#   than 'vcpu-dirty-limit', read processes in guest os
> >> > -#   aren't penalized any more, so this algo can improve
> >> > -#   performance of vCPU during live migration. This is an
> >> > -#   optional performance feature and should not affect
> the
> >> > -#   correctness of the existing auto-converge algo.
> >> > -#   (since 8.1)
> >> > +# @dirty-limit: If enabled, migration will throttle vCPUs as needed
> to
> >> > +# keep their dirty page rate within @vcpu-dirty-limit.  This can
> >> > +# improve responsiveness of large guests during live migration,
> >> > +# and can result in more stable read performance.  Requires KVM
> >> > +# with accelerator property "dirty-ring-size" set.  (Since 8.1)
> >> >  #
> >> >  # Features:
> >> >  #
> >> > @@ -822,17 +818,17 @@
> >> >  # Nodes are mapped to their block device name if there is one,
> and
> >> >  # to their node name otherwise.  (Since 5.2)
>

Re: [PATCH 0/3] hw/ufs: fix compilation warnings

2023-08-01 Thread Jeuk Kim


On 8/2/2023 6:03 AM, Philippe Mathieu-Daudé wrote:

Hi Mike,

On 28/7/23 01:34, Mike Maslenkin wrote:

This patchset contains a trivial compilation fixes for UFS support
applied to block-next tree.


Since the series isn't merged, it would be clearer to send
a v9 of "hw/ufs: Add Universal Flash Storage (UFS) support"
with the fixes squashed in (there is still time).

Regards,

Phil.



Hi Phil,
Thanks for your comment.
If Mike is okay, I'll send v9 of "hw/ufs: Add Universal Flash Storage 
UFS) support" with the fixes.


To Mike,
Is it okay with you if I make a patch v9, incorporating your fixes?

Re: [PATCH QEMU v3 1/3] qapi: Reformat the dirty-limit migration doc comments

2023-08-01 Thread Yong Huang

On Tue, Aug 1, 2023 at 8:34 PM Markus Armbruster  wrote:

> ~hyman  writes:
>
> > From: Hyman Huang(黄勇) 
> >
> > Reformat the dirty-limit migration doc comments to conform
> > to current conventions as commit a937b6aa739 (qapi: Reformat
> > doc comments to conform to current conventions).
> >
> > Signed-off-by: Markus Armbruster 
>
> Unexpected S-o-b.  Accident?
>
Yes, I'll fix that

>
> > Signed-off-by: Hyman Huang(黄勇) 
> > ---
> >  qapi/migration.json | 69 ++---
> >  1 file changed, 34 insertions(+), 35 deletions(-)
> >
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 6b49593d2f..a74ade4d72 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -258,17 +258,17 @@
> >  # blocked.  Present and non-empty when migration is blocked.
> >  # (since 6.0)
> >  #
> > -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in
> microseconds) of virtual
> > -#   CPUs each dirty ring full
> round, which shows how
> > -#   MigrationCapability dirty-limit
> affects the guest
> > -#   during live migration. (since
> 8.1)
> > -#
> > -# @dirty-limit-ring-full-time: Estimated average dirty ring full time
> (in microseconds)
> > -#  each dirty ring full round, note that
> the value equals
> > -#  dirty ring memory size divided by
> average dirty page rate
> > -#  of virtual CPU, which can be used to
> observe the average
> > -#  memory load of virtual CPU indirectly.
> Note that zero
> > -#  means guest doesn't dirty memory (since
> 8.1)
> > +# @dirty-limit-throttle-time-per-round: Maximum throttle time
> > +# (in microseconds) of virtual CPUs each dirty ring full round,
> > +# which shows how MigrationCapability dirty-limit affects the
> > +# guest during live migration.  (Since 8.1)
> > +#
> > +# @dirty-limit-ring-full-time: Estimated average dirty ring full
> > +# time (in microseconds) for each dirty ring full round. The
>
> Two spaces between sentences for consistency.
>
> > +# value equals the dirty ring memory size divided by the average
> > +# dirty page rate of the virtual CPU, which can be used to
> > +# observe the average memory load of the virtual CPU indirectly.
> > +# Note that zero means guest doesn't dirty memory.  (Since 8.1)
> >  #
> >  # Since: 0.14
> >  ##
> > @@ -519,15 +519,14 @@
> >  # are present.  'return-path' capability must be enabled to use
> >  # it.  (since 8.1)
> >  #
> > -# @dirty-limit: If enabled, migration will use the dirty-limit algo to
> > -#   throttle down guest instead of auto-converge algo.
> > -#   Throttle algo only works when vCPU's dirtyrate greater
> > -#   than 'vcpu-dirty-limit', read processes in guest os
> > -#   aren't penalized any more, so this algo can improve
> > -#   performance of vCPU during live migration. This is an
> > -#   optional performance feature and should not affect the
> > -#   correctness of the existing auto-converge algo.
> > -#   (since 8.1)
> > +# @dirty-limit: If enabled, migration will use the dirty-limit
> > +# algorithim to throttle down guest instead of auto-converge
> > +# algorithim. Throttle algorithim only works when vCPU's dirtyrate
> > +# greater than 'vcpu-dirty-limit', read processes in guest os
> > +# aren't penalized any more, so this algorithim can improve
> > +# performance of vCPU during live migration. This is an optional
> > +# performance feature and should not affect the correctness of the
> > +# existing auto-converge algorithim.  (Since 8.1)
> >  #
> >  # Features:
> >  #
> > @@ -822,17 +821,17 @@
> >  # Nodes are mapped to their block device name if there is one, and
> >  # to their node name otherwise.  (Since 5.2)
> >  #
> > -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> limit during
> > -# live migration. Should be in the range 1
> to 1000ms,
> > -# defaults to 1000ms. (Since 8.1)
> > +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> > +# limit during live migration. Should be in the range 1 to 1000ms.
> > +# Defaults to 1000ms.  (Since 8.1)
>
> Likewise.
>
> >  #
> >  # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
> > -#Defaults to 1. (Since 8.1)
> > +# Defaults to 1.  (Since 8.1)
> >  #
> >  # Features:
> >  #
> >  # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
> > -#are experimental.
> > +# are experimental.
> >  #
> >  # Since: 2.4
> >  ##
> > @@ -988,17 +987,17 @@
> >  # Nodes are mapped to their block device

Re: [PATCH v6 1/8] linux-user: Unset MAP_FIXED_NOREPLACE for host

2023-08-01 Thread Richard Henderson


On 8/1/23 16:27, Helge Deller wrote:

From: Akihiko Odaki 

Passing MAP_FIXED_NOREPLACE to host will fail if the virtual
address space is reserved with mmap. Replace it with MAP_FIXED.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
  linux-user/mmap.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a5dfb56545..2f26cbaf5d 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -610,6 +610,8 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
target_prot,
  goto fail;
  }

+flags = (flags & ~MAP_FIXED_NOREPLACE) | MAP_FIXED;



Again, this must be restricted to reserved_va == 0 or 64-bit guests will fail.


r~


+
  /*
   * worst case: we cannot map the file because the offset is not
   * aligned, so we read it
--
2.41.0

[PATCH v6 7/8] linux-user: Optimize memory layout for static and dynamic executables

2023-08-01 Thread Helge Deller

Reorganize the guest memory layout to get as much memory as possible for
heap for the guest application.

This patch optimizes the memory layout by loading pie executables
into lower memory and shared libs into higher memory (at
TASK_UNMAPPED_BASE). This leaves a bigger memory area usable for heap
space which will be located directly after the executable.
Up to now, pie executable and shared libs were loaded directly behind
each other in the area at TASK_UNMAPPED_BASE, which leaves very little
space for heap.

I tested this patchset with chroots of alpha, arm, armel, arm64, hppa, m68k,
mips64el, mipsel, powerpc, ppc64, ppc64el, s390x, sh4 and sparc64 on a x86-64
host, and with a static armhf binary (which fails to run without this patch).

This patch temporarily breaks the Thread Sanitizer (TSan) application
which expects specific boundary definitions for memory mappings on
different platforms [1], see commit aab613fb9597 ("linux-user: Update
TASK_UNMAPPED_BASE for aarch64") for aarch64. The follow-up patch fixes it
again.

[1] 
https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/tsan/rtl/tsan_platform.h

Signed-off-by: Helge Deller 
---
 linux-user/elfload.c | 55 +---
 linux-user/mmap.c|  8 ---
 2 files changed, 21 insertions(+), 42 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 2aee2298ec..47a118e430 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3023,6 +3023,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
 int i, retval, prot_exec;
 Error *err = NULL;
+bool is_main_executable;

 /* First of all, some simple consistency checks */
 if (!elf_check_ident(ehdr)) {
@@ -3106,28 +3107,8 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 }
 }

-if (pinterp_name != NULL) {
-/*
- * This is the main executable.
- *
- * Reserve extra space for brk.
- * We hold on to this space while placing the interpreter
- * and the stack, lest they be placed immediately after
- * the data segment and block allocation from the brk.
- *
- * 16MB is chosen as "large enough" without being so large as
- * to allow the result to not fit with a 32-bit guest on a
- * 32-bit host. However some 64 bit guests (e.g. s390x)
- * attempt to place their heap further ahead and currently
- * nothing stops them smashing into QEMUs address space.
- */
-#if TARGET_LONG_BITS == 64
-info->reserve_brk = 32 * MiB;
-#else
-info->reserve_brk = 16 * MiB;
-#endif
-hiaddr += info->reserve_brk;
-
+is_main_executable = (pinterp_name != NULL);
+if (is_main_executable) {
 if (ehdr->e_type == ET_EXEC) {
 /*
  * Make sure that the low address does not conflict with
@@ -3136,7 +3117,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 probe_guest_base(image_name, loaddr, hiaddr);
 } else {
 /*
- * The binary is dynamic, but we still need to
+ * The binary is dynamic (pie-executabe), but we still need to
  * select guest_base.  In this case we pass a size.
  */
 probe_guest_base(image_name, 0, hiaddr - loaddr);
@@ -3159,7 +3140,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  */
 load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
 MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
-(ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
+(is_main_executable ? MAP_FIXED : 0),
 -1, 0);
 if (load_addr == -1) {
 goto exit_mmap;
@@ -3194,7 +3175,8 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 info->end_code = 0;
 info->start_data = -1;
 info->end_data = 0;
-info->brk = 0;
+/* possible start for brk is behind all sections of this ELF file. */
+info->brk = TARGET_PAGE_ALIGN(hiaddr);
 info->elf_flags = ehdr->e_flags;

 prot_exec = PROT_EXEC;
@@ -3288,9 +3270,6 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 info->end_data = vaddr_ef;
 }
 }
-if (vaddr_em > info->brk) {
-info->brk = vaddr_em;
-}
 #ifdef TARGET_MIPS
 } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
 Mips_elf_abiflags_v0 abiflags;
@@ -3618,6 +3597,15 @@ int load_elf_binary(struct linux_binprm *bprm, struct 
image_info *info)

 if (elf_interpreter) {
 load_elf_interp(elf_interpreter, _info, bprm->buf);
+/*
+* Use brk address of interpreter if it was loaded above the
+* executable and leaves less than 16 MB for heap.
+

[PATCH v6 5/8] linux-user: Do not align brk with host page size

2023-08-01 Thread Helge Deller

From: Akihiko Odaki 

do_brk() minimizes calls into target_mmap() by aligning the address
with host page size, which is potentially larger than the target page
size. However, the current implementation of this optimization has two
bugs:

- The start of brk is rounded up with the host page size while brk
  advertises an address aligned with the target page size as the
  beginning of brk. This makes the beginning of brk unmapped.
- Content clearing after mapping is flawed. The size to clear is
  specified as HOST_PAGE_ALIGN(brk_page) - brk_page, but brk_page is
  aligned with the host page size so it is always zero.

This optimization actually has no practical benefit. It makes difference
when brk() is called multiple times with values in a range of the host
page size. However, sophisticated memory allocators try to avoid to
make such frequent brk() calls. For example, glibc 2.37 calls brk() to
shrink the heap only when there is a room more than 128 KiB. It is
rare to have a page size larger than 128 KiB if it happens.

Let's remove the optimization to fix the bugs and make the code simpler.

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1616
Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
 linux-user/elfload.c |  4 ++--
 linux-user/syscall.c | 54 ++--
 2 files changed, 14 insertions(+), 44 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..2aee2298ec 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3678,8 +3678,8 @@ int load_elf_binary(struct linux_binprm *bprm, struct 
image_info *info)
  * to mmap pages in this space.
  */
 if (info->reserve_brk) {
-abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
-abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
+abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
+abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk + info->reserve_brk);
 target_munmap(start_brk, end_brk - start_brk);
 }

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ebdc8c144c..475260b7ce 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -802,81 +802,51 @@ static inline int host_to_target_sock_type(int host_type)
 }

 static abi_ulong target_brk, initial_target_brk;
-static abi_ulong brk_page;

 void target_set_brk(abi_ulong new_brk)
 {
 target_brk = TARGET_PAGE_ALIGN(new_brk);
 initial_target_brk = target_brk;
-brk_page = HOST_PAGE_ALIGN(target_brk);
 }

 /* do_brk() must return target values and target errnos. */
 abi_long do_brk(abi_ulong brk_val)
 {
 abi_long mapped_addr;
-abi_ulong new_alloc_size;
-abi_ulong new_brk, new_host_brk_page;
+abi_ulong new_brk;
+abi_ulong old_brk;

 /* brk pointers are always untagged */

-/* return old brk value if brk_val unchanged */
-if (brk_val == target_brk) {
-return target_brk;
-}
-
 /* do not allow to shrink below initial brk value */
 if (brk_val < initial_target_brk) {
 return target_brk;
 }

 new_brk = TARGET_PAGE_ALIGN(brk_val);
-new_host_brk_page = HOST_PAGE_ALIGN(brk_val);
+old_brk = TARGET_PAGE_ALIGN(target_brk);

-/* brk_val and old target_brk might be on the same page */
-if (new_brk == TARGET_PAGE_ALIGN(target_brk)) {
-/* empty remaining bytes in (possibly larger) host page */
-memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
+/* new and old target_brk might be on the same page */
+if (new_brk == old_brk) {
 target_brk = brk_val;
 return target_brk;
 }

 /* Release heap if necesary */
-if (new_brk < target_brk) {
-/* empty remaining bytes in (possibly larger) host page */
-memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
-
-/* free unused host pages and set new brk_page */
-target_munmap(new_host_brk_page, brk_page - new_host_brk_page);
-brk_page = new_host_brk_page;
+if (new_brk < old_brk) {
+target_munmap(new_brk, old_brk - new_brk);

 target_brk = brk_val;
 return target_brk;
 }

-if (new_host_brk_page > brk_page) {
-new_alloc_size = new_host_brk_page - brk_page;
-mapped_addr = target_mmap(brk_page, new_alloc_size,
-  PROT_READ | PROT_WRITE,
-  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
-  0, 0);
-} else {
-new_alloc_size = 0;
-mapped_addr = brk_page;
-}
-
-if (mapped_addr == brk_page) {
-/* Heap contents are initialized to zero, as for anonymous
- * mapped pages.  Technically the new pages are already
- * initialized to zero since they *are* anonymous mapped
- * pages, however we have to take care with the contents that
-

[PATCH v6 4/8] linux-user: Do nothing if too small brk is specified

2023-08-01 Thread Helge Deller

From: Akihiko Odaki 

Linux 6.4.7 does nothing when a value smaller than the initial brk is
specified.

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ac429a185a..ebdc8c144c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -820,14 +820,14 @@ abi_long do_brk(abi_ulong brk_val)

 /* brk pointers are always untagged */

-/* return old brk value if brk_val unchanged or zero */
-if (!brk_val || brk_val == target_brk) {
+/* return old brk value if brk_val unchanged */
+if (brk_val == target_brk) {
 return target_brk;
 }

 /* do not allow to shrink below initial brk value */
 if (brk_val < initial_target_brk) {
-brk_val = initial_target_brk;
+return target_brk;
 }

 new_brk = TARGET_PAGE_ALIGN(brk_val);
--
2.41.0

[PATCH v6 6/8] linux-user: Show heap address in /proc/pid/maps

2023-08-01 Thread Helge Deller

Show the memory location of the heap in the /proc/pid/maps file inside
the guest. Store the heap address in ts->heap_base, which requires to
make that variable accessible for all guest architectures, not just
architectures for semihosted binaries (arm, m68k, riscv).

Note that /proc/pid/maps in the guest needs to show target-aligned
addresses. This is fixed in this patch, so now the heap and stack
address for architectures like sparc64 and alpha now show up in that
output as well.

Show 32- and 64-bit pointers with 8 digits and leading zeros (%08x/%08lx).
For 64-bit we could use %16lx, but we mimic the Linux kernel, which shows
even 64-bit addresses with %08lx.

Example:

user@machine:/# uname -a
Linux paq 5.15.88+ #47 SMP Sun Jan 15 12:53:11 CET 2023 aarch64 GNU/Linux

user@machine:/# cat /proc/self/maps
Linux p100 6.4.4-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jul 19 16:32:49 UTC 
2023 aarch64 GNU/Linux
55-559000 r-xp  fd:00 570430 
/usr/bin/cat
559000-550001f000 ---p  00:00 0
550001f000-550002 r--p f000 fd:00 570430 
/usr/bin/cat
550002-5500021000 rw-p 0001 fd:00 570430 
/usr/bin/cat
5500021000-5500042000 rw-p  00:00 0  [heap]
70-701000 ---p  00:00 0
701000-7000801000 rw-p  00:00 0  [stack]
7000801000-7000827000 r-xp  fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000827000-700083f000 ---p  00:00 0
700083f000-7000841000 r--p 0002e000 fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000841000-7000843000 rw-p 0003 fd:00 571555 
/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
7000843000-7000844000 r-xp  00:00 0
7000844000-7000846000 rw-p  00:00 0
700085-70009d7000 r-xp  fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009d7000-70009ed000 ---p 00187000 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009ed000-70009f r--p 0018d000 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6
70009f-70009f2000 rw-p 0019 fd:00 571558 
/usr/lib/aarch64-linux-gnu/libc.so.6

Signed-off-by: Helge Deller 
---
 include/exec/cpu_ldst.h |  4 ++--
 linux-user/main.c   |  2 ++
 linux-user/qemu.h   |  4 ++--
 linux-user/syscall.c| 13 +
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index 645476f0e5..f1e6f31e88 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -72,10 +72,10 @@
  */
 #if TARGET_VIRT_ADDR_SPACE_BITS <= 32
 typedef uint32_t abi_ptr;
-#define TARGET_ABI_FMT_ptr "%x"
+#define TARGET_ABI_FMT_ptr "%08x"
 #else
 typedef uint64_t abi_ptr;
-#define TARGET_ABI_FMT_ptr "%"PRIx64
+#define TARGET_ABI_FMT_ptr "%08"PRIx64
 #endif

 #ifndef TARGET_TAGGED_ADDRESSES
diff --git a/linux-user/main.c b/linux-user/main.c
index dba67ffa36..fa6e47510f 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -946,6 +946,7 @@ int main(int argc, char **argv, char **envp)
 }
 }

+info->brk = TARGET_PAGE_ALIGN(info->brk);
 target_set_brk(info->brk);
 syscall_init();
 signal_init();
@@ -955,6 +956,7 @@ int main(int argc, char **argv, char **envp)
the real value of GUEST_BASE into account.  */
 tcg_prologue_init(tcg_ctx);

+ts->heap_base = info->brk;
 target_cpu_copy_regs(env, regs);

 if (gdbstub) {
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 802794db63..7a6adac637 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -121,11 +121,11 @@ typedef struct TaskState {
 #ifdef TARGET_M68K
 abi_ulong tp_value;
 #endif
-#if defined(TARGET_ARM) || defined(TARGET_M68K) || defined(TARGET_RISCV)
+
 /* Extra fields for semihosted binaries.  */
 abi_ulong heap_base;
 abi_ulong heap_limit;
-#endif
+
 abi_ulong stack_base;
 int used; /* non zero if used */
 struct image_info *info;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 475260b7ce..dc8266c073 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8078,8 +8078,9 @@ static int open_self_maps_1(CPUArchState *cpu_env, int 
fd, bool smaps)
 MapInfo *e = (MapInfo *) s->data;

 if (h2g_valid(e->start)) {
-unsigned long min = e->start;
-unsigned long max = e->end;
+/* show page granularity of guest in /proc/pid/maps */
+unsigned long min = TARGET_PAGE_ALIGN(e->start);
+unsigned long max = TARGET_PAGE_ALIGN(e->end);
 int flags = page_get_flags(h2g(min));
 const char *path;

@@ -8090,14 +8091,18 @@ static int open_self_maps_1(CPUArchState *cpu_env, int 
fd, bool smaps)
 continue;

[PATCH v6 8/8] linux-user: Load pie executables at upper memory

2023-08-01 Thread Helge Deller

Fix the elf loader to calculate a valid TASK_UNMAPPED_BASE address for all
32-bit architectures, based on the GUEST_ADDR_MAX constant.

Additionally modify the elf loader to load dynamic pie executables at
around:
~ 0x55  for 64-bit guest binaries on 64-bit host,
- 0x0030for 32-bit guest binaries on 64-bit host, and
- 0xfor 32-bit guest binaries on 32-bit host.

With this patch the Thread Sanitizer (TSan) application will work again,
as in commit aab613fb9597 ("linux-user: Update TASK_UNMAPPED_BASE for
aarch64").

Signed-off-by: Helge Deller 
---
 linux-user/elfload.c |  6 --
 linux-user/loader.h  | 12 
 linux-user/mmap.c| 35 ++-
 3 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 47a118e430..8f5a79b537 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3021,6 +3021,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 struct elfhdr *ehdr = (struct elfhdr *)bprm_buf;
 struct elf_phdr *phdr;
 abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
+unsigned long load_offset = 0;
 int i, retval, prot_exec;
 Error *err = NULL;
 bool is_main_executable;
@@ -3121,6 +3122,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  * select guest_base.  In this case we pass a size.
  */
 probe_guest_base(image_name, 0, hiaddr - loaddr);
+load_offset = TASK_UNMAPPED_BASE_PIE;
 }
 }

@@ -3138,7 +3140,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  * In both cases, we will overwrite pages in this range with mappings
  * from the executable.
  */
-load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
+load_addr = target_mmap(loaddr + load_offset, (size_t)hiaddr - loaddr + 1, 
PROT_NONE,
 MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
 (is_main_executable ? MAP_FIXED : 0),
 -1, 0);
@@ -3176,7 +3178,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 info->start_data = -1;
 info->end_data = 0;
 /* possible start for brk is behind all sections of this ELF file. */
-info->brk = TARGET_PAGE_ALIGN(hiaddr);
+info->brk = TARGET_PAGE_ALIGN(load_offset + hiaddr);
 info->elf_flags = ehdr->e_flags;

 prot_exec = PROT_EXEC;
diff --git a/linux-user/loader.h b/linux-user/loader.h
index 59cbeacf24..3bbfc108eb 100644
--- a/linux-user/loader.h
+++ b/linux-user/loader.h
@@ -18,6 +18,18 @@
 #ifndef LINUX_USER_LOADER_H
 #define LINUX_USER_LOADER_H

+/* where to map binaries? */
+#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
+# define TASK_UNMAPPED_BASE_PIE 0x55
+# define TASK_UNMAPPED_BASE0x70
+#elif HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 32
+# define TASK_UNMAPPED_BASE_PIE0x0030
+# define TASK_UNMAPPED_BASE(GUEST_ADDR_MAX - 0x2000 + 1)
+#else /* HOST_LONG_BITS == 32 && TARGET_ABI_BITS == 32 */
+# define TASK_UNMAPPED_BASE_PIE0x
+# define TASK_UNMAPPED_BASE0x4000
+#endif
+
 /*
  * Read a good amount of data initially, to hopefully get all the
  * program headers loaded.
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index c624feead0..3441198e21 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -23,6 +23,7 @@
 #include "user-internals.h"
 #include "user-mmap.h"
 #include "target_mman.h"
+#include "loader.h"

 static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
 static __thread int mmap_lock_count;
@@ -295,23 +296,6 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong 
start, abi_ulong last,
 return true;
 }

-#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
-#ifdef TARGET_AARCH64
-# define TASK_UNMAPPED_BASE  0x55
-#else
-# define TASK_UNMAPPED_BASE  0x40
-#endif
-#elif HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 32
-#ifdef TARGET_HPPA
-# define TASK_UNMAPPED_BASE  0xfa00
-#else
-# define TASK_UNMAPPED_BASE  0xe000
-#endif
-#else /* HOST_LONG_BITS == 32 && TARGET_ABI_BITS == 32 */
-# define TASK_UNMAPPED_BASE  0x4000
-#endif
-abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
-
 unsigned long last_brk;

 /*
@@ -344,6 +328,23 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size, 
abi_ulong align)
 abi_ulong addr;
 int wrapped, repeat;

+static abi_ulong mmap_next_start;
+
+/* initialize mmap_next_start if necessary */
+if (!mmap_next_start) {
+mmap_next_start = TASK_UNMAPPED_BASE;
+
+/* do sanity checks on guest memory layout */
+if (mmap_next_start >= GUEST_ADDR_MAX) {
+mmap_next_start = GUEST_ADDR_MAX - 0x10 + 1;
+}
+
+if (TASK_UNMAPPED_BASE_PIE >= mmap_next_start) {
+fprintf(stderr, "Memory too small for PIE executables.\n");
+exit(EXIT_FAILURE);

[PATCH v6 3/8] linux-user: Use MAP_FIXED_NOREPLACE for do_brk()

2023-08-01 Thread Helge Deller

From: Akihiko Odaki 

MAP_FIXED_NOREPLACE can ensure the mapped address is fixed without
concerning that the new mapping overwrites something else.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
 linux-user/syscall.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b9d2ec02f9..ac429a185a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -854,17 +854,12 @@ abi_long do_brk(abi_ulong brk_val)
 return target_brk;
 }

-/* We need to allocate more memory after the brk... Note that
- * we don't use MAP_FIXED because that will map over the top of
- * any existing mapping (like the one with the host libc or qemu
- * itself); instead we treat "mapped but at wrong address" as
- * a failure and unmap again.
- */
 if (new_host_brk_page > brk_page) {
 new_alloc_size = new_host_brk_page - brk_page;
 mapped_addr = target_mmap(brk_page, new_alloc_size,
-  PROT_READ|PROT_WRITE,
-  MAP_ANON|MAP_PRIVATE, 0, 0);
+  PROT_READ | PROT_WRITE,
+  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
+  0, 0);
 } else {
 new_alloc_size = 0;
 mapped_addr = brk_page;
@@ -883,12 +878,6 @@ abi_long do_brk(abi_ulong brk_val)
 target_brk = brk_val;
 brk_page = new_host_brk_page;
 return target_brk;
-} else if (mapped_addr != -1) {
-/* Mapped but at wrong address, meaning there wasn't actually
- * enough space for this brk.
- */
-target_munmap(mapped_addr, new_alloc_size);
-mapped_addr = -1;
 }

 #if defined(TARGET_ALPHA)
--
2.41.0

[PATCH v6 2/8] linux-user: Do not call get_errno() in do_brk()

2023-08-01 Thread Helge Deller

From: Akihiko Odaki 

Later the returned value is compared with -1, and negated errno is not
expected.

Fixes: 00faf08c95 ("linux-user: Don't use MAP_FIXED in do_brk()")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..b9d2ec02f9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -862,9 +862,9 @@ abi_long do_brk(abi_ulong brk_val)
  */
 if (new_host_brk_page > brk_page) {
 new_alloc_size = new_host_brk_page - brk_page;
-mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
-PROT_READ|PROT_WRITE,
-MAP_ANON|MAP_PRIVATE, 0, 0));
+mapped_addr = target_mmap(brk_page, new_alloc_size,
+  PROT_READ|PROT_WRITE,
+  MAP_ANON|MAP_PRIVATE, 0, 0);
 } else {
 new_alloc_size = 0;
 mapped_addr = brk_page;
--
2.41.0

[PATCH v6 1/8] linux-user: Unset MAP_FIXED_NOREPLACE for host

2023-08-01 Thread Helge Deller

From: Akihiko Odaki 

Passing MAP_FIXED_NOREPLACE to host will fail if the virtual
address space is reserved with mmap. Replace it with MAP_FIXED.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Helge Deller 
Signed-off-by: Helge Deller 
---
 linux-user/mmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a5dfb56545..2f26cbaf5d 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -610,6 +610,8 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
target_prot,
 goto fail;
 }

+flags = (flags & ~MAP_FIXED_NOREPLACE) | MAP_FIXED;
+
 /*
  * worst case: we cannot map the file because the offset is not
  * aligned, so we read it
--
2.41.0

[PATCH v6 0/8] linux-user: brk fixes

2023-08-01 Thread Helge Deller

This patch series is a fix-up for some current problems
regarding heap memory / brk handling in qemu which happens
on some 32-bit platforms, e.g. problems loading static
binaries.

This series includes the 5 patches from Akihiko Odaki
with some additional fixes and cleanups by me.

Akihiko Odaki (5):
  linux-user: Unset MAP_FIXED_NOREPLACE for host
  linux-user: Do not call get_errno() in do_brk()
  linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
  linux-user: Do nothing if too small brk is specified
  linux-user: Do not align brk with host page size

Helge Deller (3):
  linux-user: Show heap address in /proc/pid/maps
  linux-user: Optimize memory layout for static and dynamic executables
  linux-user: Load pie executables at upper memory

 include/exec/cpu_ldst.h |  4 +--
 linux-user/elfload.c| 59 ++
 linux-user/loader.h | 12 +++
 linux-user/main.c   |  2 ++
 linux-user/mmap.c   | 35 ++
 linux-user/qemu.h   |  4 +--
 linux-user/syscall.c| 80 -
 7 files changed, 79 insertions(+), 117 deletions(-)

--
2.41.0

Re: [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c

2023-08-01 Thread Moger, Babu

Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> In fact, this unit tests APIC ID other than CPUID.

This is not clear.

The tests in test-x86-topo.c actually test the APIC ID combinations.
Rename to test-x86-topo.c to make its name more in line with its actual
content.

> Rename to test-x86-topo.c to make its name more in line with its
> actual content.
> 
> Signed-off-by: Zhao Liu 
> Tested-by: Yongwei Ma 
> Reviewed-by: Philippe Mathieu-Daudé  Acked-by: Michael S. Tsirkin 
> ---
> Changes since v1:
>  * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
> ---
>  MAINTAINERS  | 2 +-
>  tests/unit/meson.build   | 4 ++--
>  tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 12e59b6b27de..51ba3d593e90 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1719,7 +1719,7 @@ F: include/hw/southbridge/ich9.h
>  F: include/hw/southbridge/piix.h
>  F: hw/isa/apm.c
>  F: include/hw/isa/apm.h
> -F: tests/unit/test-x86-cpuid.c
> +F: tests/unit/test-x86-topo.c
>  F: tests/qtest/test-x86-cpuid-compat.c
>  
>  PC Chipset
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index 93977cc32d2b..39b5d0007c69 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -21,8 +21,8 @@ tests = {
>'test-opts-visitor': [testqapi],
>'test-visitor-serialization': [testqapi],
>'test-bitmap': [],
> -  # all code tested by test-x86-cpuid is inside topology.h
> -  'test-x86-cpuid': [],
> +  # all code tested by test-x86-topo is inside topology.h
> +  'test-x86-topo': [],
>'test-cutils': [],
>'test-div128': [],
>'test-shift128': [],
> diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
> similarity index 99%
> rename from tests/unit/test-x86-cpuid.c
> rename to tests/unit/test-x86-topo.c
> index bfabc0403a1a..2b104f86d7c2 100644
> --- a/tests/unit/test-x86-cpuid.c
> +++ b/tests/unit/test-x86-topo.c
> @@ -1,5 +1,5 @@
>  /*
> - *  Test code for x86 CPUID and Topology functions
> + *  Test code for x86 APIC ID and Topology functions
>   *
>   *  Copyright (c) 2012 Red Hat Inc.
>   *

-- 
Thanks
Babu Moger

Re: [PATCH v3 00/17] Support smp.clusters for x86

2023-08-01 Thread Moger, Babu

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.

/s/modlue/module
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
> "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>x-l2-cache-topo).
> 
> 
> ## New property:

Re: [PATCH v3 01/17] i386: Fix comment style in topology.h

2023-08-01 Thread Moger, Babu

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> For function comments in this file, keep the comment style consistent
> with other places.

s/with other places./with other files in the directory./

> 
> Signed-off-by: Zhao Liu 
> Reviewed-by: Philippe Mathieu-Daudé  Reviewed-by: Yanan Wang 
> Acked-by: Michael S. Tsirkin 
> ---
>  include/hw/i386/topology.h | 33 +
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..5a19679f618b 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>  #ifndef HW_I386_TOPOLOGY_H
>  #define HW_I386_TOPOLOGY_H
>  
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>   * documented at the following document:
>   *   Intel® 64 Architecture Processor Topology Enumeration
>   *   
> http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>  
>  #include "qemu/bitops.h"
>  
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>   */
>  typedef uint32_t apic_id_t;
>  
> @@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
>  unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>  static unsigned apicid_bitwidth_for_count(unsigned count)
>  {
>  g_assert(count >= 1);
> @@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>  return count ? 32 - clz32(count) : 0;
>  }
>  
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>  static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_bitwidth_for_count(topo_info->threads_per_core);
>  }
>  
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_bitwidth_for_count(topo_info->cores_per_die);
> @@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo 
> *topo_info)
>  return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>  }
>  
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_smt_width(topo_info);
> @@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo 
> *topo_info)
>  return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>  }
>  
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>  static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>   *
>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>   */
> @@ -120,7 +118,8 @@ static inline apic_id_t 
> x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
> topo_ids->smt_id;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on (contiguous) CPU index
>   */
>  static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo 
> *topo_info,
>  topo_ids->smt_id = cpu_index % nr_threads;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on APIC ID
>   */
>  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
> apicid,
>  topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>   *
>   * 'cpu_index' is a sequential, contiguous ID for the CPU.
>   */

-- 
Thanks
Babu Moger

[PATCH v2 2/3] linux-user: Emulate /proc/cpuinfo on aarch64 and arm

2023-08-01 Thread Helge Deller

Add emulation for /proc/cpuinfo for arm architecture.
The output below mimics output as seen on debian porterboxes.

aarch64 output example:

processor   : 0
model name  : ARMv8 Processor rev 0 (v8l)
BogoMIPS: 100.00
Features: fp asimd aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid 
asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm ilrcpc 
flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 
frint svei8mm svef32mm svef64mm svebf16 i8mm bf16 rng bti mte sme sme_i16i64 
sme_f64f64 sme_i8i32 sme_f16f32 sme_b16f32 sme_f32f32 sme_fa64
CPU implementer : 00
CPU architecture: 8
CPU variant : 0
CPU part: 0x51
CPU revision: 0

arm output example:

processor   : 0
model name  : ARMv7 Processor rev 0 (armv7l)
BogoMIPS: 50.00
Features: swp half thumb fast_mult vfp edsp neon vfpv3 tls vfpv4 idiva 
idivt vfpd32 lpae aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x1
CPU part: 0xd07
CPU revision: 0

Hardware: arm,cortex-a57
Revision: 
Serial  : 

Signed-off-by: Helge Deller 

v3:
- show variant, part, revision and implementor based on midr
  value (suggested by Richard Henderson)
v2:
- show features of CPU which is actually being emulated by qemu
  (suggested by Peter Maydell)
---
 linux-user/elfload.c | 130 +--
 linux-user/loader.h  |   6 +-
 linux-user/syscall.c |  67 +-
 3 files changed, 196 insertions(+), 7 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..99804e477d 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -466,7 +466,7 @@ static bool init_guest_commpage(void)
 #define ELF_HWCAP get_elf_hwcap()
 #define ELF_HWCAP2 get_elf_hwcap2()

-static uint32_t get_elf_hwcap(void)
+uint32_t get_elf_hwcap(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -508,7 +508,7 @@ static uint32_t get_elf_hwcap(void)
 return hwcaps;
 }

-static uint32_t get_elf_hwcap2(void)
+uint32_t get_elf_hwcap2(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -521,6 +521,49 @@ static uint32_t get_elf_hwcap2(void)
 return hwcaps;
 }

+const char *elf_hwcap_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP_ARM_SWP  )] = "swp",
+[__builtin_ctz(ARM_HWCAP_ARM_HALF )] = "half",
+[__builtin_ctz(ARM_HWCAP_ARM_THUMB)] = "thumb",
+[__builtin_ctz(ARM_HWCAP_ARM_26BIT)] = "26bit",
+[__builtin_ctz(ARM_HWCAP_ARM_FAST_MULT)] = "fast_mult",
+[__builtin_ctz(ARM_HWCAP_ARM_FPA  )] = "fpa",
+[__builtin_ctz(ARM_HWCAP_ARM_VFP  )] = "vfp",
+[__builtin_ctz(ARM_HWCAP_ARM_EDSP )] = "edsp",
+[__builtin_ctz(ARM_HWCAP_ARM_JAVA )] = "java",
+[__builtin_ctz(ARM_HWCAP_ARM_IWMMXT   )] = "iwmmxt",
+[__builtin_ctz(ARM_HWCAP_ARM_CRUNCH   )] = "crunch",
+[__builtin_ctz(ARM_HWCAP_ARM_THUMBEE  )] = "thumbee",
+[__builtin_ctz(ARM_HWCAP_ARM_NEON )] = "neon",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv3)] = "vfpv3",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv3D16 )] = "vfpv3d16",
+[__builtin_ctz(ARM_HWCAP_ARM_TLS  )] = "tls",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv4)] = "vfpv4",
+[__builtin_ctz(ARM_HWCAP_ARM_IDIVA)] = "idiva",
+[__builtin_ctz(ARM_HWCAP_ARM_IDIVT)] = "idivt",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPD32   )] = "vfpd32",
+[__builtin_ctz(ARM_HWCAP_ARM_LPAE )] = "lpae",
+[__builtin_ctz(ARM_HWCAP_ARM_EVTSTRM  )] = "evtstrm",
+};
+
+return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
+}
+
+const char *elf_hwcap2_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP2_ARM_AES  )] = "aes",
+[__builtin_ctz(ARM_HWCAP2_ARM_PMULL)] = "pmull",
+[__builtin_ctz(ARM_HWCAP2_ARM_SHA1 )] = "sha1",
+[__builtin_ctz(ARM_HWCAP2_ARM_SHA2 )] = "sha2",
+[__builtin_ctz(ARM_HWCAP2_ARM_CRC32)] = "crc32",
+};
+
+return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
+}
+
 #undef GET_FEATURE
 #undef GET_FEATURE_ID

@@ -668,7 +711,7 @@ enum {
 #define GET_FEATURE_ID(feat, hwcap) \
 do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)

-static uint32_t get_elf_hwcap(void)
+uint32_t get_elf_hwcap(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -706,7 +749,7 @@ static uint32_t get_elf_hwcap(void)
 return hwcaps;
 }

-static uint32_t get_elf_hwcap2(void)
+uint32_t get_elf_hwcap2(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -741,6 +784,85 @@ static uint32_t get_elf_hwcap2(void)
 return hwcaps;
 }

+const char *elf_hwcap_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP_A64_FP  )] = "fp",
+[__builtin_ctz(ARM_HWCAP_A64_ASIMD   )] = "asimd",
+

[PATCH v2 3/3] linux-user: Emulate /proc/cpuinfo for Alpha

2023-08-01 Thread Helge Deller

Add emulation for /proc/cpuinfo for the alpha architecture.

alpha output example:

(alpha-chroot)root@p100:/# cat /proc/cpuinfo
cpu : Alpha
cpu model   : ev67
cpu variation   : 7
cpu revision: 0
cpu serial number   : JA
system type : QEMU
system variation: 8.0.91
system revision : 0
system serial number: AY
cycle frequency [Hz]: 25000
timer frequency [Hz]: 250.00
page size [bytes]   : 8192
phys. address bits  : 44
max. addr. space #  : 255
BogoMIPS: 2500.00
platform string : AlphaServer QEMU virtual machine
cpus detected   : 8
cpus active : 8
cpu active mask : 00ff
L1 Icache   : n/a
L1 Dcache   : n/a
L2 cache: n/a
L3 cache: n/a

Signed-off-by: Helge Deller 
Cc: Michael Cree 
---
 linux-user/syscall.c | 50 ++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 51ce81cefb..548eaea3a0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8324,7 +8324,7 @@ void target_exception_dump(CPUArchState *env, const char 
*fmt, int code)
 #if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN || \
 defined(TARGET_SPARC) || defined(TARGET_M68K) || defined(TARGET_HPPA) || \
 defined(TARGET_RISCV) || defined(TARGET_S390X) || defined(TARGET_ARM) || \
-defined(TARGET_AARCH64)
+defined(TARGET_AARCH64) || defined(TARGET_ALPHA)
 static int is_proc(const char *filename, const char *entry)
 {
 return strcmp(filename, entry) == 0;
@@ -8376,6 +8376,51 @@ static int open_net_route(CPUArchState *cpu_env, int fd)
 }
 #endif

+#if defined(TARGET_ALPHA)
+static int open_cpuinfo(CPUArchState *cpu_env, int fd)
+{
+int num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
+char model[32];
+char *p;
+AlphaCPU *cpu = env_archcpu(cpu_env);
+CPUAlphaState *env = >env;
+
+g_strlcpy(model, object_class_get_name(
+OBJECT_CLASS(CPU_GET_CLASS(env_cpu(cpu_env,
+  sizeof(model));
+p = strchr(model, '-');
+if (p) {
+*p = '\0';
+}
+
+dprintf(fd, "cpu\t\t\t: Alpha\n");
+dprintf(fd, "cpu model\t\t: %s\n", model);
+// object_class_get_name(OBJECT_CLASS(CPU_GET_CLASS(env_cpu(cpu_env);
+dprintf(fd, "cpu variation\t\t: %d\n", env->implver + 5);
+dprintf(fd, "cpu revision\t\t: 0\n");
+dprintf(fd, "cpu serial number\t: JA\n");
+dprintf(fd, "system type\t\t: QEMU\n");
+dprintf(fd, "system variation\t: %s\n", QEMU_VERSION);
+dprintf(fd, "system revision\t\t: 0\n");
+dprintf(fd, "system serial number\t: AY\n");
+dprintf(fd, "cycle frequency [Hz]\t: 25000\n");
+dprintf(fd, "timer frequency [Hz]\t: 250.00\n");
+dprintf(fd, "page size [bytes]\t: %d\n", TARGET_PAGE_SIZE);
+dprintf(fd, "phys. address bits\t: %d\n", TARGET_PHYS_ADDR_SPACE_BITS);
+dprintf(fd, "max. addr. space #\t: 255\n");
+dprintf(fd, "BogoMIPS\t\t: 2500.00\n");
+dprintf(fd, "platform string\t\t: AlphaServer QEMU virtual machine\n");
+dprintf(fd, "cpus detected\t\t: %d\n", num_cpus);
+dprintf(fd, "cpus active\t\t: %d\n", num_cpus);
+dprintf(fd, "cpu active mask\t\t: %016llx\n", (1ULL << num_cpus) - 1);
+dprintf(fd, "L1 Icache\t\t: n/a\n");
+dprintf(fd, "L1 Dcache\t\t: n/a\n");
+dprintf(fd, "L2 cache\t\t: n/a\n");
+dprintf(fd, "L3 cache\t\t: n/a\n");
+return 0;
+}
+#endif
+
 #if defined(TARGET_SPARC)
 static int open_cpuinfo(CPUArchState *cpu_env, int fd)
 {
@@ -8624,7 +8669,8 @@ int do_guest_openat(CPUArchState *cpu_env, int dirfd, 
const char *fname,
 #endif
 #if defined(TARGET_SPARC) || defined(TARGET_HPPA) || \
 defined(TARGET_RISCV) || defined(TARGET_S390X) || \
-defined(TARGET_ARM)   || defined(TARGET_AARCH64)
+defined(TARGET_ARM)   || defined(TARGET_AARCH64) || \
+defined(TARGET_ALPHA)
 { "/proc/cpuinfo", open_cpuinfo, is_proc },
 #endif
 #if defined(TARGET_M68K)
--
2.41.0

[PATCH v2 1/3] linux-user: Fix openat() emulation to correctly detect accesses to /proc

2023-08-01 Thread Helge Deller

In qemu we catch accesses to files like /proc/cpuinfo or /proc/net/route
and return to the guest contents which would be visible on a real system
(instead what the host would show).

This patch fixes a bug, where for example the accesses
cat /proccpuinfo
or
cd /proc && cat cpuinfo
will not be recognized by qemu and where qemu will wrongly show
the contents of the host's /proc/cpuinfo file.

Signed-off-by: Helge Deller 

--
v3:
- use g_autofree on returned value from realpath

v2:
- use g_autofree instead of pathname on stack
  Daniel P. Berrangé requested to not put buffers on stack.
  Using g_autofree keeps code much cleaner than using
  extended semantics of realpath(), unless I can use g_autofree
  on malloced area from realpath().
---
 linux-user/syscall.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..1ec7d27e37 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8539,9 +8539,12 @@ static int open_hardware(CPUArchState *cpu_env, int fd)
 }
 #endif

-int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *pathname,
+
+int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *fname,
 int flags, mode_t mode, bool safe)
 {
+g_autofree char *proc_name = NULL;
+const char *pathname;
 struct fake_open {
 const char *filename;
 int (*fill)(CPUArchState *cpu_env, int fd);
@@ -8567,6 +8570,14 @@ int do_guest_openat(CPUArchState *cpu_env, int dirfd, 
const char *pathname,
 { NULL, NULL, NULL }
 };

+/* if this is a file from /proc/ filesystem, expand full name */
+proc_name = realpath(fname, NULL);
+if (proc_name && strncmp(proc_name, "/proc/", 6) == 0) {
+pathname = proc_name;
+} else {
+pathname = fname;
+}
+
 if (is_proc_myself(pathname, "exe")) {
 if (safe) {
 return safe_openat(dirfd, exec_path, flags, mode);
--
2.41.0

[PATCH v2 0/3] linux-user: /proc/cpuinfo fix and content emulation for arm

2023-08-01 Thread Helge Deller

- One fix for correctly detecting /proc/cpuinfo access
- A new /proc/cpuinfo output for arm/arm64.
- A new /proc/cpuinfo output for Alpha

Helge Deller (3):
  linux-user: Fix openat() emulation to correctly detect accesses to
/proc
  linux-user: Emulate /proc/cpuinfo on aarch64 and arm
  linux-user: Emulate /proc/cpuinfo for Alpha

 linux-user/elfload.c | 130 +--
 linux-user/loader.h  |   6 +-
 linux-user/syscall.c | 126 -
 3 files changed, 254 insertions(+), 8 deletions(-)

--
2.41.0

[PULL 02/10] i386/xen: consistent locking around Xen singleshot timers

2023-08-01 Thread Philippe Mathieu-Daudé

From: David Woodhouse 

Coverity points out (CID 1507534, 1507968) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock and sometimes not.

This isn't always an issue. There are two modes for the timers; if the
kernel supports the EVTCHN_SEND capability then it handles all the timer
hypercalls and delivery internally, and all we use the field for is to
get/set the timer as part of the vCPU state via an ioctl(). If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.

But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
Message-Id: <20230801175747.145906-3-dw...@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/i386/kvm/xen-emu.c | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index d7c7eb8d9c..a8146115f0 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -43,6 +43,7 @@
 
 static void xen_vcpu_singleshot_timer_event(void *opaque);
 static void xen_vcpu_periodic_timer_event(void *opaque);
+static int vcpuop_stop_singleshot_timer(CPUState *cs);
 
 #ifdef TARGET_X86_64
 #define hypercall_compat32(longmode) (!(longmode))
@@ -466,6 +467,7 @@ void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, 
int type)
 }
 }
 
+/* Must always be called with xen_timers_lock held */
 static int kvm_xen_set_vcpu_timer(CPUState *cs)
 {
 X86CPU *cpu = X86_CPU(cs);
@@ -483,6 +485,7 @@ static int kvm_xen_set_vcpu_timer(CPUState *cs)
 
 static void do_set_vcpu_timer_virq(CPUState *cs, run_on_cpu_data data)
 {
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
 kvm_xen_set_vcpu_timer(cs);
 }
 
@@ -545,7 +548,6 @@ static void do_vcpu_soft_reset(CPUState *cs, 
run_on_cpu_data data)
 env->xen_vcpu_time_info_gpa = INVALID_GPA;
 env->xen_vcpu_runstate_gpa = INVALID_GPA;
 env->xen_vcpu_callback_vector = 0;
-env->xen_singleshot_timer_ns = 0;
 memset(env->xen_virq, 0, sizeof(env->xen_virq));
 
 set_vcpu_info(cs, INVALID_GPA);
@@ -555,8 +557,13 @@ static void do_vcpu_soft_reset(CPUState *cs, 
run_on_cpu_data data)
   INVALID_GPA);
 if (kvm_xen_has_cap(EVTCHN_SEND)) {
 kvm_xen_set_vcpu_callback_vector(cs);
+
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
+env->xen_singleshot_timer_ns = 0;
 kvm_xen_set_vcpu_timer(cs);
-}
+} else {
+vcpuop_stop_singleshot_timer(cs);
+};
 
 }
 
@@ -1059,6 +1066,10 @@ static int vcpuop_stop_periodic_timer(CPUState *target)
 return 0;
 }
 
+/*
+ * Userspace handling of timer, for older kernels.
+ * Must always be called with xen_timers_lock held.
+ */
 static int do_set_singleshot_timer(CPUState *cs, uint64_t timeout_abs,
bool future, bool linux_wa)
 {
@@ -1086,12 +1097,8 @@ static int do_set_singleshot_timer(CPUState *cs, 
uint64_t timeout_abs,
 timeout_abs = now + delta;
 }
 
-qemu_mutex_lock(>xen_timers_lock);
-
 timer_mod_ns(env->xen_singleshot_timer, qemu_now + delta);
 env->xen_singleshot_timer_ns = now + delta;
-
-qemu_mutex_unlock(>xen_timers_lock);
 return 0;
 }
 
@@ -1115,6 +1122,7 @@ static int vcpuop_set_singleshot_timer(CPUState *cs, 
uint64_t arg)
 return -EFAULT;
 }
 
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
 return do_set_singleshot_timer(cs, sst.timeout_abs_ns,
!!(sst.flags & VCPU_SSHOTTMR_future),
false);
@@ -1141,6 +1149,7 @@ static bool kvm_xen_hcall_set_timer_op(struct 
kvm_xen_exit *exit, X86CPU *cpu,
 if (unlikely(timeout == 0)) {
 err = vcpuop_stop_singleshot_timer(CPU(cpu));
 } else {
+QEMU_LOCK_GUARD(_CPU(cpu)->env.xen_timers_lock);
 err = do_set_singleshot_timer(CPU(cpu), timeout, false, true);
 }
 exit->u.hcall.result = err;
@@ -1826,6 +1835,7 @@ int kvm_put_xen_state(CPUState *cs)
  * If the kernel has EVTCHN_SEND support then it handles timers too,
  * so the timer will be restored by kvm_xen_set_vcpu_timer() below.
  */
+QEMU_LOCK_GUARD(>xen_timers_lock);
 if (env->xen_singleshot_timer_ns) {
 ret = do_set_singleshot_timer(cs, env->xen_singleshot_timer_ns,
 false, false);
@@ -1844,10 +1854,8 @@ int kvm_put_xen_state(CPUState *cs)
 }
 
 if (env->xen_virq[VIRQ_TIMER]) {
-ret = kvm_xen_set_vcpu_timer(cs);
-if (ret < 0) {
-return ret;
-}
+do_set_vcpu_timer_virq(cs,
+   RUN_ON_CPU_HOST_INT(env->xen_virq[VIRQ_TIMER]));
 }
 return 0;
 }
@@ -1896,6 +1904,15 @@ int kvm_get_xen_state(CPUState

[PULL 10/10] target/m68k: Fix semihost lseek offset computation

2023-08-01 Thread Philippe Mathieu-Daudé

From: Peter Maydell 

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Cc: qemu-sta...@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230801154519.3505531-1-peter.mayd...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/m68k/m68k-semi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 88ad9ba814..239f6e44e9 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -166,7 +166,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
 GET_ARG64(2);
 GET_ARG64(3);
 semihost_sys_lseek(cs, m68k_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
 break;
 
 case HOSTED_RENAME:
-- 
2.38.1

[PULL 04/10] ui/dbus: fix win32 compilation when !opengl

2023-08-01 Thread Philippe Mathieu-Daudé

From: Marc-Andre Lureau 

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1782

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230725112540.53284-1-marcandre.lur...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 ui/dbus-listener.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c
index 68ff343799..02fc6ae239 100644
--- a/ui/dbus-listener.c
+++ b/ui/dbus-listener.c
@@ -338,6 +338,7 @@ static bool dbus_scanout_map(DBusDisplayListener *ddl)
 return true;
 }
 
+#ifdef CONFIG_OPENGL
 static bool
 dbus_scanout_share_d3d_texture(
 DBusDisplayListener *ddl,
@@ -399,7 +400,8 @@ dbus_scanout_share_d3d_texture(
 
 return true;
 }
-#endif
+#endif /* CONFIG_OPENGL */
+#endif /* WIN32 */
 
 #ifdef CONFIG_OPENGL
 static void dbus_scanout_texture(DisplayChangeListener *dcl,
-- 
2.38.1

[PULL 01/10] hw/xen: fix off-by-one in xen_evtchn_set_gsi()

2023-08-01 Thread Philippe Mathieu-Daudé

From: David Woodhouse 

Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.

Also fix up an assert() that has the same problem, that Coverity didn't see.

Fixes: 4f81baa33ed6 ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230801175747.145906-2-dw...@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/kvm/xen_evtchn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 3d810dbd59..0e9c108614 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1587,7 +1587,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int 
gsi)
  found:
 pirq_inuse_word(s, pirq) |= pirq_inuse_bit(pirq);
 if (gsi >= 0) {
-assert(gsi <= IOAPIC_NUM_PINS);
+assert(gsi < IOAPIC_NUM_PINS);
 s->gsi_pirq[gsi] = pirq;
 }
 s->pirq[pirq].gsi = gsi;
@@ -1601,7 +1601,7 @@ bool xen_evtchn_set_gsi(int gsi, int level)
 
 assert(qemu_mutex_iothread_locked());
 
-if (!s || gsi < 0 || gsi > IOAPIC_NUM_PINS) {
+if (!s || gsi < 0 || gsi >= IOAPIC_NUM_PINS) {
 return false;
 }
 
-- 
2.38.1

[PULL 07/10] tests/migration: Add -fno-stack-protector

2023-08-01 Thread Philippe Mathieu-Daudé

From: Akihiko Odaki 

A build of GCC 13.2 will have stack protector enabled by default if it
was configured with --enable-default-ssp option. For such a compiler,
it is necessary to explicitly disable stack protector when linking
without standard libraries.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Juan Quintela 
Reviewed-by: Thomas Huth 
Message-Id: <20230731091042.139159-2-akihiko.od...@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/migration/s390x/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/migration/s390x/Makefile b/tests/migration/s390x/Makefile
index 6393c3e5b9..6671de2efc 100644
--- a/tests/migration/s390x/Makefile
+++ b/tests/migration/s390x/Makefile
@@ -6,8 +6,8 @@ all: a-b-bios.h
 fwdir=../../../pc-bios/s390-ccw
 
 CFLAGS+=-ffreestanding -fno-delete-null-pointer-checks -fPIE -Os \
-   -msoft-float -march=z900 -fno-asynchronous-unwind-tables -Wl,-pie \
-   -Wl,--build-id=none -nostdlib
+   -msoft-float -march=z900 -fno-asynchronous-unwind-tables \
+   -fno-stack-protector -Wl,-pie -Wl,--build-id=none -nostdlib
 
 a-b-bios.h: s390x.elf
echo "$$__note" > header.tmp
-- 
2.38.1

[PULL 08/10] target/nios2: Pass semihosting arg to exit

2023-08-01 Thread Philippe Mathieu-Daudé

From: Keith Packard 

Instead of using R_ARG0 (the semihost function number), use R_ARG1
(the provided exit status).

Signed-off-by: Keith Packard 
Reviewed-by: Peter Maydell 
Message-Id: <20230801152245.332749-1-kei...@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/nios2/nios2-semi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/nios2/nios2-semi.c b/target/nios2/nios2-semi.c
index 3738774976..f3b7aee4f1 100644
--- a/target/nios2/nios2-semi.c
+++ b/target/nios2/nios2-semi.c
@@ -133,8 +133,8 @@ void do_nios2_semihosting(CPUNios2State *env)
 args = env->regs[R_ARG1];
 switch (nr) {
 case HOSTED_EXIT:
-gdb_exit(env->regs[R_ARG0]);
-exit(env->regs[R_ARG0]);
+gdb_exit(env->regs[R_ARG1]);
+exit(env->regs[R_ARG1]);
 
 case HOSTED_OPEN:
 GET_ARG(0);
-- 
2.38.1

[PULL 05/10] ui/dbus: fix clang compilation issue

2023-08-01 Thread Philippe Mathieu-Daudé

From: Marc-André Lureau 

../ui/dbus-listener.c:236:9: error: expected expression
Error *err = NULL;

See:
https://gitlab.com/qemu-project/qemu/-/issues/1782#note_1488517427

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Thomas Huth 
Message-Id: <20230726151221.515761-1-marcandre.lur...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 ui/dbus-listener.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c
index 02fc6ae239..30917271ab 100644
--- a/ui/dbus-listener.c
+++ b/ui/dbus-listener.c
@@ -232,7 +232,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,
 egl_fb_read_rect(ddl->ds, >fb, x, y, w, h);
 dbus_gfx_update(dcl, x, y, w, h);
 break;
-case SHARE_KIND_D3DTEX:
+case SHARE_KIND_D3DTEX: {
 Error *err = NULL;
 assert(ddl->d3d_texture);
 
@@ -249,6 +249,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,
 dbus_update_gl_cb,
 g_object_ref(ddl));
 break;
+}
 default:
 g_warn_if_reached();
 }
-- 
2.38.1

[PULL 09/10] target/nios2: Fix semihost lseek offset computation

2023-08-01 Thread Philippe Mathieu-Daudé

From: Keith Packard 

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Signed-off-by: Keith Packard 
Reviewed-by: Philippe Mathieu-Daudé 
Fixes: d1e23cbaa403b2d ("target/nios2: Use semihosting/syscalls.h")
Reviewed-by: Peter Maydell 
Message-Id: <20230731235245.295513-1-kei...@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 target/nios2/nios2-semi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/nios2/nios2-semi.c b/target/nios2/nios2-semi.c
index f3b7aee4f1..9d0241c758 100644
--- a/target/nios2/nios2-semi.c
+++ b/target/nios2/nios2-semi.c
@@ -169,7 +169,7 @@ void do_nios2_semihosting(CPUNios2State *env)
 GET_ARG64(2);
 GET_ARG64(3);
 semihost_sys_lseek(cs, nios2_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
 break;
 
 case HOSTED_RENAME:
-- 
2.38.1

[PULL 03/10] hw/xen: prevent guest from binding loopback event channel to itself

2023-08-01 Thread Philippe Mathieu-Daudé

From: David Woodhouse 

Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
Message-Id: <20230801175747.145906-4-dw...@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/kvm/xen_evtchn.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 0e9c108614..a731738411 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1408,8 +1408,15 @@ int xen_evtchn_bind_interdomain_op(struct 
evtchn_bind_interdomain *interdomain)
 XenEvtchnPort *rp = >port_table[interdomain->remote_port];
 XenEvtchnPort *lp = >port_table[interdomain->local_port];
 
-if (rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
-/* It's a match! */
+/*
+ * The 'remote' port for loopback must be an unbound port allocated for
+ * communication with the local domain (as indicated by rp->type_val
+ * being zero, not PORT_INFO_TYPEVAL_REMOTE_QEMU), and must *not* be
+ * the port that was just allocated for the local end.
+ */
+if (interdomain->local_port != interdomain->remote_port &&
+rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
+
 rp->type = EVTCHNSTAT_interdomain;
 rp->type_val = interdomain->local_port;
 
-- 
2.38.1

[PULL 06/10] misc: Fix some typos in documentation and comments

2023-08-01 Thread Philippe Mathieu-Daudé

From: Stefan Weil 

Signed-off-by: Stefan Weil 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230730180329.851576-1...@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé 
---
 docs/about/deprecated.rst| 2 +-
 docs/devel/qom.rst   | 2 +-
 docs/system/devices/nvme.rst | 2 +-
 include/exec/memory.h| 2 +-
 hw/core/loader.c | 4 ++--
 ui/vnc-enc-tight.c   | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 1c35f55666..92a2bafd2b 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -369,7 +369,7 @@ mapping permissions et al by using its 'mapped' security 
model option.
 Nowadays it would make sense to reimplement the ``proxy`` backend by using
 QEMU's ``vhost`` feature, which would eliminate the high latency costs under
 which the 9p ``proxy`` backend currently suffers. However as of to date nobody
-has indicated plans for such kind of reimplemention unfortunately.
+has indicated plans for such kind of reimplementation unfortunately.
 
 
 Block device options
diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
index 0b506426d7..9918fac7f2 100644
--- a/docs/devel/qom.rst
+++ b/docs/devel/qom.rst
@@ -30,7 +30,7 @@ user configuration.
 Creating a QOM class
 
 
-A simple minimal device implementation may look something like bellow:
+A simple minimal device implementation may look something like below:
 
 .. code-block:: c
:caption: Creating a minimal type
diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst
index a8bb8d729c..2a3af268f7 100644
--- a/docs/system/devices/nvme.rst
+++ b/docs/system/devices/nvme.rst
@@ -232,7 +232,7 @@ parameters:
   Set the number of Reclaim Groups.
 
 ``fdp.nruh`` (default: ``0``)
-  Set the number of Reclaim Unit Handles. This is a mandatory paramater and
+  Set the number of Reclaim Unit Handles. This is a mandatory parameter and
   must be non-zero.
 
 ``fdp.runs`` (default: ``96M``)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 7f5c11a0cc..68284428f8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -942,7 +942,7 @@ struct MemoryListener {
  *
  * @listener: The #MemoryListener.
  * @last_stage: The last stage to synchronize the log during migration.
- * The caller should gurantee that the synchronization with true for
+ * The caller should guarantee that the synchronization with true for
  * @last_stage is triggered for once after all VCPUs have been stopped.
  */
 void (*log_sync_global)(MemoryListener *listener, bool last_stage);
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 8b7fd9e9e5..4dd5a71fb7 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -863,7 +863,7 @@ ssize_t load_image_gzipped(const char *filename, hwaddr 
addr, uint64_t max_sz)
 
 /*
  * The Linux header magic number for a EFI PE/COFF
- * image targetting an unspecified architecture.
+ * image targeting an unspecified architecture.
  */
 #define EFI_PE_LINUX_MAGIC"\xcd\x23\x82\x81"
 
@@ -1492,7 +1492,7 @@ RomGap rom_find_largest_gap_between(hwaddr base, size_t 
size)
 if (rom->mr || rom->fw_file) {
 continue;
 }
-/* ignore anything finishing bellow base */
+/* ignore anything finishing below base */
 if (rom->addr + rom->romsize <= base) {
 continue;
 }
diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
index 09200d71b8..ee853dcfcb 100644
--- a/ui/vnc-enc-tight.c
+++ b/ui/vnc-enc-tight.c
@@ -77,7 +77,7 @@ static int tight_send_framebuffer_update(VncState *vs, int x, 
int y,
 
 #ifdef CONFIG_VNC_JPEG
 static const struct {
-double jpeg_freq_min;   /* Don't send JPEG if the freq is bellow */
+double jpeg_freq_min;   /* Don't send JPEG if the freq is below */
 double jpeg_freq_threshold; /* Always send JPEG if the freq is above */
 int jpeg_idx;   /* Allow indexed JPEG */
 int jpeg_full;  /* Allow full color JPEG */
-- 
2.38.1

[PULL 00/10] Misc fixes for 2023-08-01

2023-08-01 Thread Philippe Mathieu-Daudé

The following changes since commit 802341823f1720511dd5cf53ae40285f7978c61b:

  Merge tag 'pull-tcg-20230731' of https://gitlab.com/rth7680/qemu into staging 
(2023-07-31 14:02:51 -0700)

are available in the Git repository at:

  https://github.com/philmd/qemu.git tags/misc-fixes-20230801

for you to fetch changes up to 8caaae7319a5f7ca449900c0e6bfcaed78fa3ae2:

  target/m68k: Fix semihost lseek offset computation (2023-08-01 23:52:23 +0200)


Misc patches queue

xen: Fix issues reported by fuzzer / Coverity
misc: Fix some typos in documentation and comments
ui/dbus: Build fixes for Clang/win32/!opengl
linux-user: Semihosting fixes on m68k/nios2
tests/migration: Disable stack protector when linking without stdlib



Akihiko Odaki (1):
  tests/migration: Add -fno-stack-protector

David Woodhouse (3):
  hw/xen: fix off-by-one in xen_evtchn_set_gsi()
  i386/xen: consistent locking around Xen singleshot timers
  hw/xen: prevent guest from binding loopback event channel to itself

Keith Packard (2):
  target/nios2: Pass semihosting arg to exit
  target/nios2: Fix semihost lseek offset computation

Marc-Andre Lureau (1):
  ui/dbus: fix win32 compilation when !opengl

Marc-André Lureau (1):
  ui/dbus: fix clang compilation issue

Peter Maydell (1):
  target/m68k: Fix semihost lseek offset computation

Stefan Weil (1):
  misc: Fix some typos in documentation and comments

 docs/about/deprecated.rst  |  2 +-
 docs/devel/qom.rst |  2 +-
 docs/system/devices/nvme.rst   |  2 +-
 include/exec/memory.h  |  2 +-
 hw/core/loader.c   |  4 ++--
 hw/i386/kvm/xen_evtchn.c   | 15 ++
 target/i386/kvm/xen-emu.c  | 37 +-
 target/m68k/m68k-semi.c|  2 +-
 target/nios2/nios2-semi.c  |  6 +++---
 ui/dbus-listener.c |  7 +--
 ui/vnc-enc-tight.c |  2 +-
 tests/migration/s390x/Makefile |  4 ++--
 12 files changed, 56 insertions(+), 29 deletions(-)

-- 
2.38.1

Re: [PATCH 2/3] tests/migration: Add -fno-stack-protector

2023-08-01 Thread Philippe Mathieu-Daudé


On 31/7/23 08:58, Akihiko Odaki wrote:

A build of GCC 13.2 will have stack protector enabled by default if it
was configured with --enable-default-ssp option. For such a compiler,
it is necessary to explicitly disable stack protector when linking
without standard libraries.

Signed-off-by: Akihiko Odaki 
---
  tests/migration/s390x/Makefile | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Queuing this single patch via misc-fixes.

Re: [PATCH] ui/dbus: fix win32 compilation when !opengl

2023-08-01 Thread Philippe Mathieu-Daudé


On 25/7/23 13:25, marcandre.lur...@redhat.com wrote:

From: Marc-Andre Lureau 

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1782

Signed-off-by: Marc-André Lureau 
---
  ui/dbus-listener.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)


Thanks, queued via misc-fixes.

Re: [PATCH] target/nios2: Fix semihost lseek offset computation

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 01:52, Keith Packard via wrote:

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Signed-off-by: Keith Packard 
---
  target/nios2/nios2-semi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Thanks, queued via misc-fixes.

Re: [PATCH for-8.1] target/m68k: Fix semihost lseek offset computation

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 17:45, Peter Maydell wrote:

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Cc: qemu-sta...@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
---
Same fix for m68k as Keith Packard just sent for nios2
---
  target/m68k/m68k-semi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Thanks, queued via misc-fixes.

Re: [PATCH for-8.1] Misc Xen-on-KVM fixes

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 19:57, David Woodhouse wrote:

A few minor fixes for the Xen emulation support. One was just merged, but
there are three outstanding.

David Woodhouse (3):
   hw/xen: fix off-by-one in xen_evtchn_set_gsi()
   i386/xen: consistent locking around Xen singleshot timers
   hw/xen: prevent guest from binding loopback event channel to itself

  hw/i386/kvm/xen_evtchn.c  | 15 +++
  target/i386/kvm/xen-emu.c | 36 ++--
  2 files changed, 37 insertions(+), 14 deletions(-)


Thanks, since the series is reviewed, I'm queuing via misc-fixes.

Re: [PATCH] target/nios2: Pass semihosting arg to exit

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 17:22, Keith Packard via wrote:

Instead of using R_ARG0 (the semihost function number), use R_ARG1
(the provided exit status).

Signed-off-by: Keith Packard 
---
  target/nios2/nios2-semi.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Thanks, queued via misc-fixes.

Re: [PATCH] Fix some typos in documentation and comments

2023-08-01 Thread Philippe Mathieu-Daudé


On 30/7/23 20:03, Stefan Weil wrote:

Signed-off-by: Stefan Weil 
---

This patch was triggered by a spelling check for the generated
QEMU documentation using codespell. It does not try to fix all
typos which still exist in the QEMU code, but has a focus on
those required to fix the documentation. Nevertheless some code
comments with the same typos were fixed, too.

I think the patch is trivial, so maybe it can still be included
in the upcoming release, but that's not strictly necessary.

Stefan

  docs/about/deprecated.rst| 2 +-
  docs/devel/qom.rst   | 2 +-
  docs/system/devices/nvme.rst | 2 +-
  hw/core/loader.c | 4 ++--
  include/exec/memory.h| 2 +-
  ui/vnc-enc-tight.c   | 2 +-
  6 files changed, 7 insertions(+), 7 deletions(-)


Thanks, queued via misc-fixes.

Re: [PATCH] ui/dbus: fix clang compilation issue

2023-08-01 Thread Philippe Mathieu-Daudé


On 26/7/23 17:12, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

../ui/dbus-listener.c:236:9: error: expected expression
 Error *err = NULL;

See:
https://gitlab.com/qemu-project/qemu/-/issues/1782#note_1488517427

Signed-off-by: Marc-André Lureau 
---
  ui/dbus-listener.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Thanks, queued via misc-fixes.

[PATCH v1] Allowing setting and overriding parameters in smb.conf

2023-08-01 Thread Henrik Carlqvist

>From c480f787981308067a059213a1a7ce9c70ab668e Mon Sep 17 00:00:00 2001
From: Henrik Carlqvist 
Date: Tue, 1 Aug 2023 23:00:15 +0200
Subject: [PATCH] Allowing setting and overriding parameters in smb.conf

Signed-off-by: Henrik Carlqvist 
---

It would be nice to be able to change settings in smb.conf from the qemu 
command line. A kludge to edit the qemu smb.conf of a running smbd process 
is described at https://wiki.archlinux.org/title/QEMU , but IMHO my patch
provides a cleaner solution where parameters can be initially set to the
preferred values.

Best regards Henrik

 net/slirp.c | 44 ++--
 qapi/net.json   |  3 +++
 qemu-options.hx | 15 ---
 3 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index c33b3e02e7..f860ea48f6 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -106,7 +106,8 @@ static int slirp_guestfwd(SlirpState *s, const char 
*config_str, Error **errp);
 
 #if defined(CONFIG_SMBD_COMMAND)
 static int slirp_smb(SlirpState *s, const char *exported_dir,
- struct in_addr vserver_addr, Error **errp);
+ struct in_addr vserver_addr, const char *smbparams,
+ Error **errp);
 static void slirp_smb_cleanup(SlirpState *s);
 #else
 static inline void slirp_smb_cleanup(SlirpState *s) { }
@@ -424,6 +425,7 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
   const char *bootfile, const char *vdhcp_start,
   const char *vnameserver, const char *vnameserver6,
   const char *smb_export, const char *vsmbserver,
+  const char *smbparams,
   const char **dnssearch, const char *vdomainname,
   const char *tftp_server_name,
   Error **errp)
@@ -678,7 +680,7 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
 }
 #if defined(CONFIG_SMBD_COMMAND)
 if (smb_export) {
-if (slirp_smb(s, smb_export, smbsrv, errp) < 0) {
+if (slirp_smb(s, smb_export, smbsrv, smbparams, errp) < 0) {
 goto error;
 }
 }
@@ -891,7 +893,8 @@ static void slirp_smb_cleanup(SlirpState *s)
 }
 
 static int slirp_smb(SlirpState* s, const char *exported_dir,
- struct in_addr vserver_addr, Error **errp)
+ struct in_addr vserver_addr, const char *smbparams,
+ Error **errp)
 {
 char *smb_conf;
 char *smb_cmdline;
@@ -950,10 +953,11 @@ static int slirp_smb(SlirpState* s, const char 
*exported_dir,
 "printing = bsd\n"
 "disable spoolss = yes\n"
 "usershare max shares = 0\n"
-"[qemu]\n"
-"path=%s\n"
 "read only=no\n"
 "guest ok=yes\n"
+   "%s"
+"[qemu]\n"
+"path=%s\n"
 "force user=%s\n",
 s->smb_dir,
 s->smb_dir,
@@ -963,6 +967,7 @@ static int slirp_smb(SlirpState* s, const char 
*exported_dir,
 s->smb_dir,
 s->smb_dir,
 s->smb_dir,
+smbparams,
 exported_dir,
 passwd->pw_name
 );
@@ -1143,6 +1148,29 @@ static const char **slirp_dnssearch(const StringList 
*dnsname)
 return ret;
 }
 
+static char *slirp_smbparams(const StringList *smbparam)
+{
+const StringList *c = smbparam;
+size_t i = 1; /* for string terminating 0 */
+char *ret;
+
+while (c) {
+i += strlen(c->value->str);
+i++; /* for \n */
+c = c->next;
+}
+ret = g_malloc(i * sizeof(*ret));
+ret[0]=0; /* Start with empty string */
+
+c = smbparam;
+while (c) {
+pstrcat(ret, i * sizeof(*ret), c->value->str);
+pstrcat(ret, i * sizeof(*ret), "\n");
+c = c->next;
+}
+return ret;
+}
+
 int net_init_slirp(const Netdev *netdev, const char *name,
NetClientState *peer, Error **errp)
 {
@@ -1151,6 +1179,7 @@ int net_init_slirp(const Netdev *netdev, const char *name,
 int ret;
 const NetdevUserOptions *user;
 const char **dnssearch;
+char *smbparams;
 bool ipv4 = true, ipv6 = true;
 
 assert(netdev->type == NET_CLIENT_DRIVER_USER);
@@ -1170,6 +1199,7 @@ int net_init_slirp(const Netdev *netdev, const char *name,
NULL;
 
 dnssearch = slirp_dnssearch(user->dnssearch);
+smbparams = slirp_smbparams(user->smbparam);
 
 /* all optional fields are initialized to "all bits zero" */
 
@@ -1182,7 +1212,8 @@ int net_init_slirp(const Netdev *netdev, const char *name,
  user->ipv6_host, user->hostname, user->tftp,
  user->bootfile, user->dhcpstart,
  user->dns, user->ipv6_dns, user->smb,
- user->smbserver, dnssearch, user->domainname,
+

Re: [PATCH 1/2] linux-user: Fix openat() emulation to correctly detect accesses to /proc

2023-08-01 Thread Daniel P . Berrangé

On Tue, Aug 01, 2023 at 09:10:34PM +0200, Helge Deller wrote:
> In qemu we catch accesses to files like /proc/cpuinfo or /proc/net/route
> and return to the guest contents which would be visible on a real system
> (instead what the host would show).
> 
> This patch fixes a bug, where for example the accesses
> cat /proccpuinfo
> or
> cd /proc && cat cpuinfo
> will not be recognized by qemu and where qemu will wrongly show
> the contents of the host's /proc/cpuinfo file.
> 
> Signed-off-by: Helge Deller 
> 
> --
> v2:
> - use g_autofree instead of pathname on stack
>   Daniel P. Berrangé requested to not put buffers on stack.
>   Using g_autofree keeps code much cleaner than using
>   extended semantics of realpath(), unless I can use g_autofree
>   on malloced area from realpath().

g_autofree is backed by free(), so it is fine to use that
with the realpath() allocated buffer.

> ---
>  linux-user/syscall.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 95727a816a..a089463969 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8539,9 +8539,12 @@ static int open_hardware(CPUArchState *cpu_env, int fd)
>  }
>  #endif
> 
> -int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *pathname,
> +
> +int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *fname,
>  int flags, mode_t mode, bool safe)
>  {
> +g_autofree char *proc_name = g_new(char, PATH_MAX);
> +const char *pathname;
>  struct fake_open {
>  const char *filename;
>  int (*fill)(CPUArchState *cpu_env, int fd);
> @@ -8567,6 +8570,13 @@ int do_guest_openat(CPUArchState *cpu_env, int dirfd, 
> const char *pathname,
>  { NULL, NULL, NULL }
>  };
> 
> +/* if this is a file from /proc/ filesystem, expand full name */
> +if (realpath(fname, proc_name) && strncmp(proc_name, "/proc/", 6) == 0) {
> +pathname = proc_name;
> +} else {
> +pathname = fname;
> +}
> +
>  if (is_proc_myself(pathname, "exe")) {
>  if (safe) {
>  return safe_openat(dirfd, exec_path, flags, mode);
> --
> 2.41.0
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 3/3] accel/tcg: Do not issue misaligned i/o

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 20:42, Richard Henderson wrote:

In the single-page case we were issuing misaligned i/o to
the memory subsystem, which does not handle it properly.
Split such accesses via do_{ld,st}_mmio_*.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1800
Signed-off-by: Richard Henderson 
---
  accel/tcg/cputlb.c | 118 +++--
  1 file changed, 72 insertions(+), 46 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 2/3] accel/tcg: Issue wider aligned i/o in do_{ld,st}_mmio_*

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 20:42, Richard Henderson wrote:

If the address and size are aligned, send larger chunks
to the memory subsystem.  This will be required to make
more use of these helpers.

Signed-off-by: Richard Henderson 
---
  accel/tcg/cputlb.c | 76 +-
  1 file changed, 69 insertions(+), 7 deletions(-)


Super nice!

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH] gdbstub: use 0 ("any process") on packets with no PID

2023-08-01 Thread Ilya Leoshkevich

On Tue, 2023-08-01 at 12:37 -0300, Matheus Tavares Bernardino wrote:
> Previously, qemu-user would always report PID 1 to GDB. This was
> changed
> at dc14a7a6e9 (gdbstub: Report the actual qemu-user pid, 2023-06-30),
> but read_thread_id() still considers GDB packets with "no PID" as
> "PID
> 1", which is not the qemu-user PID. Fix that by parsing "no PID" as
> "0",
> which the GDB Remote Protocol defines as "any process".
> 
> Note that this should have no effect for system emulation as, in this
> case, gdb_create_default_process() will assign PID 1 for the first
> process and that is what the gdbstub uses for GDB requests with no
> PID,
> or PID 0.
> 
> This issue was found with hexagon-lldb, which sends a "Hq" packet
> with
> only the thread-id, but no process-id, leading to the invalid usage
> of
> "PID 1" by qemu-hexagon and a subsequent "E22" reply.

Did you mean "Hg"?

> Signed-off-by: Matheus Tavares Bernardino 
> ---
>  gdbstub/gdbstub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

The change looks good to me.
Thanks for looking into this and sorry for the breakage.

Acked-by: Ilya Leoshkevich

Re: [PATCH 1/3] accel/tcg: Adjust parameters and locking with do_{ld, st}_mmio_*

2023-08-01 Thread Philippe Mathieu-Daudé


Hi Richard,

On 1/8/23 20:42, Richard Henderson wrote:

Replace MMULookupPageData* with CPUTLBEntryFull, addr, size.
Move QEMU_IOTHREAD_LOCK_GUARD to the caller.

This simplifies the usage from do_ld16_beN and do_st16_leN, where
we weren't locking the entire operation, and required hoop jumping
for passing addr and size.

Signed-off-by: Richard Henderson 
---
  accel/tcg/cputlb.c | 65 +++---
  1 file changed, 32 insertions(+), 33 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ba44501a7c..d28606b93e 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2066,24 +2066,21 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr 
addr, MemOpIdx oi,
  /**
   * do_ld_mmio_beN:
   * @env: cpu context
- * @p: translation parameters
+ * @full: page parameters
   * @ret_be: accumulated data
+ * @addr: virtual address
+ * @size: number of bytes
   * @mmu_idx: virtual address context
   * @ra: return address into tcg generated code, or 0
   *
- * Load @p->size bytes from @p->addr, which is memory-mapped i/o.
+ * Load @size bytes from @addr, which is memory-mapped i/o.
   * The bytes are concatenated in big-endian order with @ret_be.


Do you mind adding:

 * Called with iothread lock held.

here and in do_st_mmio_leN()?

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé 


   */

Re: [PATCH 0/3] hw/ufs: fix compilation warnings

2023-08-01 Thread Philippe Mathieu-Daudé


Hi Mike,

On 28/7/23 01:34, Mike Maslenkin wrote:

This patchset contains a trivial compilation fixes for UFS support
applied to block-next tree.


Since the series isn't merged, it would be clearer to send
a v9 of "hw/ufs: Add Universal Flash Storage (UFS) support"
with the fixes squashed in (there is still time).

Regards,

Phil.

Re: [PATCH v3 3/3] cpu, softmmu/vl.c: Change parsing of -cpu argument to allow -cpu cpu, help to print options for the CPU type similar to how the '-device' option works.

2023-08-01 Thread Dinah B

Thanks, I will fix this. I somehow didn't catch that you had replied to the
old one.

-Dinah

On Tue, Aug 1, 2023 at 10:10 AM Markus Armbruster  wrote:

> Dinah Baum  writes:
>
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1480
> > Signed-off-by: Dinah Baum 
> >
> > Signed-off-by: Dinah Baum 
>
> Looks basically the same as v2, which means my review still applies.
>
> Message-ID: <878rdbfww1@pond.sub.org>
> https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg06699.html
>
> If you need further assistance, just ask.
>
>

Re: [PATCH] gdbstub: Fix client Ctrl-C handling

2023-08-01 Thread Philippe Mathieu-Daudé


On 31/7/23 15:59, Peter Maydell wrote:

On Mon, 31 Jul 2023 at 07:59, Joel Stanley  wrote:


On Sun, 30 Jul 2023 at 09:43, Nicholas Piggin  wrote:


On Wed Jul 26, 2023 at 4:35 PM AEST, Joel Stanley wrote:

On Wed, 12 Jul 2023 at 02:12, Nicholas Piggin  wrote:




I was taking a look at -rc1 and it looks like this hasn't made it in.
Is it something we want to propose including?

As a user of qemu I'd vote for it to go in.


I think it should, gdb is hardly usable without it.


I think I hit this issue when debugging u-boot in the aspeed arm
machines. Your patch fixed things:

Tested-by: Joel Stanley 

Alex, Philippe, can we get this queued for 8.1?


I'm doing a pullreq today anyway, so I can grab it.


Thank you Peter!

Re: [PATCH v2 06/11] tpm_crb: move ACPI table building to device interface

2023-08-01 Thread Stefan Berger





On 7/31/23 23:02, Joelle van Dyne wrote:

On Mon, Jul 17, 2023 at 6:42 AM Igor Mammedov  wrote:


On Fri, 14 Jul 2023 13:21:33 -0400
Stefan Berger  wrote:


On 7/14/23 03:09, Joelle van Dyne wrote:

This logic is similar to TPM TIS ISA device. Since TPM CRB can only
support TPM 2.0 backends, we check for this in realize.

Signed-off-by: Joelle van Dyne 


This patch changes the order of in which the ACPI table elements are created 
but doesn't matter and also doesn't seem to upset ACPI test cases from what I 
saw:


it seems we do have tests for TIS only (which I added when I was refactoring it 
to TYPE_ACPI_DEV_AML_IF)
perhaps add a test for CRB before this patch a follow process described in 
bios-tables-test.c
for updating expected blob

I read the file and looked at the commits for TIS tests but I'm not
sure I understand how it works. At what point do I specify that the
CRB device should be created for the test?


For me it would be a bit of trial an error as well. So here's my best guess:

Did you look at b193e5f9cccb322b0febd5a2aba486? You basically have to find out 
which files
are going to change due to extending the tests, so doing something like in that 
patch comes
after you found out which files are changing and iirc the tests are going to 
complain about
those files. So I would try to first add CRB tests similar to the following to 
tests/qtest/bios-tables-test.c.
in one patch, then run the test cases and they will tell you which files 
changed, and then
add a patch similar to b193e5f9cccb322b0febd5a2aba486 before the test-enabling 
patch.

if (tpm_model_is_available("-machine q35", "tpm-tis")) {
qtest_add_func("acpi/q35/tpm2-tis", test_acpi_q35_tcg_tpm2_tis);
qtest_add_func("acpi/q35/tpm12-tis",
   test_acpi_q35_tcg_tpm12_tis);
}

I would try to something like the above to aarch64 here:

} else if (strcmp(arch, "aarch64") == 0) {
if (has_tcg && qtest_has_device("virtio-blk-pci")) {
qtest_add_func("acpi/virt", test_acpi_virt_tcg);
qtest_add_func("acpi/virt/acpihmatvirt",
test_acpi_virt_tcg_acpi_hmat);


Then you run the tests again then it should create those files with the ACPI 
data and you copy them
to their destination (like in ca745d2277496464b54fd832c15c45d0227325bb) and 
remove the changes from
tests/qtest/bios-tables-test-allowed-diff.h and that becomes your 3rd patch. 
Once you run the tests
again with the 3rd patch there should be no more complaints about ACPI related 
changes.

Since CRB ACPI tests are not enabled right now you can add these patches 
somewhere in the middle of
the series or also at the end.


I hope this helps.

   Stefan

Re: [PATCH 2/2] linux-user: Emulate /proc/cpuinfo on aarch64 and arm

2023-08-01 Thread Richard Henderson


On 8/1/23 12:10, Helge Deller wrote:

+dprintf(fd, "CPU implementer\t: 0x%d\n", is64 ? 50 : 56);
+dprintf(fd, "CPU architecture: %d\n",is64 ? 8 : 7);
+dprintf(fd, "CPU variant\t: 0x%d\n", is64 ? 0 : 2);
+dprintf(fd, "CPU part\t: 0x%d\n",is64 ? 0 : 584);
+dprintf(fd, "CPU revision\t: %d\n\n",is64 ? 1 : 2);


These all come from cpu->midr.  See linux/arch/arm64/kernel/cpuinfo.c:232.


r~

[PATCH 2/2] linux-user: Emulate /proc/cpuinfo on aarch64 and arm

2023-08-01 Thread Helge Deller

Add emulation for /proc/cpuinfo for arm architecture.
The output below mimics output as seen on debian porterboxes.

aarch64 output example:

processor   : 0
BogoMIPS: 100.00
Features: fp asimd aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid 
asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm ilrcpc 
flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 
frint svei8mm svef32mm svef64mm svebf16 i8mm bf16 rng bti mte sme sme_i16i64 
sme_f64f64 sme_i8i32 sme_f16f32 sme_b16f32 sme_f32f32 sme_fa64
CPU implementer : 0x50
CPU architecture: 8
CPU variant : 0x0
CPU part: 0x0
CPU revision: 1

arm output example:

processor   : 0
model name  : ARMv7 Processor rev 2 (v7l)
BogoMIPS: 50.00
Features: swp half thumb fast_mult vfp edsp neon vfpv3 tls vfpv4 idiva 
idivt vfpd32 lpae aes pmull sha1 sha2 crc32
CPU implementer : 0x56
CPU architecture: 7
CPU variant : 0x2
CPU part: 0x584
CPU revision: 2

Hardware: Marvell Armada 370/XP (Device Tree)
Revision: 
Serial  : 

Signed-off-by: Helge Deller 

v2:
- show features of CPU which is actually being emulated by qemu
  (suggested by Peter Maydell)
---
 linux-user/elfload.c | 130 +--
 linux-user/loader.h  |   6 +-
 linux-user/syscall.c |  58 ++-
 3 files changed, 187 insertions(+), 7 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..99804e477d 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -466,7 +466,7 @@ static bool init_guest_commpage(void)
 #define ELF_HWCAP get_elf_hwcap()
 #define ELF_HWCAP2 get_elf_hwcap2()

-static uint32_t get_elf_hwcap(void)
+uint32_t get_elf_hwcap(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -508,7 +508,7 @@ static uint32_t get_elf_hwcap(void)
 return hwcaps;
 }

-static uint32_t get_elf_hwcap2(void)
+uint32_t get_elf_hwcap2(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -521,6 +521,49 @@ static uint32_t get_elf_hwcap2(void)
 return hwcaps;
 }

+const char *elf_hwcap_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP_ARM_SWP  )] = "swp",
+[__builtin_ctz(ARM_HWCAP_ARM_HALF )] = "half",
+[__builtin_ctz(ARM_HWCAP_ARM_THUMB)] = "thumb",
+[__builtin_ctz(ARM_HWCAP_ARM_26BIT)] = "26bit",
+[__builtin_ctz(ARM_HWCAP_ARM_FAST_MULT)] = "fast_mult",
+[__builtin_ctz(ARM_HWCAP_ARM_FPA  )] = "fpa",
+[__builtin_ctz(ARM_HWCAP_ARM_VFP  )] = "vfp",
+[__builtin_ctz(ARM_HWCAP_ARM_EDSP )] = "edsp",
+[__builtin_ctz(ARM_HWCAP_ARM_JAVA )] = "java",
+[__builtin_ctz(ARM_HWCAP_ARM_IWMMXT   )] = "iwmmxt",
+[__builtin_ctz(ARM_HWCAP_ARM_CRUNCH   )] = "crunch",
+[__builtin_ctz(ARM_HWCAP_ARM_THUMBEE  )] = "thumbee",
+[__builtin_ctz(ARM_HWCAP_ARM_NEON )] = "neon",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv3)] = "vfpv3",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv3D16 )] = "vfpv3d16",
+[__builtin_ctz(ARM_HWCAP_ARM_TLS  )] = "tls",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPv4)] = "vfpv4",
+[__builtin_ctz(ARM_HWCAP_ARM_IDIVA)] = "idiva",
+[__builtin_ctz(ARM_HWCAP_ARM_IDIVT)] = "idivt",
+[__builtin_ctz(ARM_HWCAP_ARM_VFPD32   )] = "vfpd32",
+[__builtin_ctz(ARM_HWCAP_ARM_LPAE )] = "lpae",
+[__builtin_ctz(ARM_HWCAP_ARM_EVTSTRM  )] = "evtstrm",
+};
+
+return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
+}
+
+const char *elf_hwcap2_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP2_ARM_AES  )] = "aes",
+[__builtin_ctz(ARM_HWCAP2_ARM_PMULL)] = "pmull",
+[__builtin_ctz(ARM_HWCAP2_ARM_SHA1 )] = "sha1",
+[__builtin_ctz(ARM_HWCAP2_ARM_SHA2 )] = "sha2",
+[__builtin_ctz(ARM_HWCAP2_ARM_CRC32)] = "crc32",
+};
+
+return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
+}
+
 #undef GET_FEATURE
 #undef GET_FEATURE_ID

@@ -668,7 +711,7 @@ enum {
 #define GET_FEATURE_ID(feat, hwcap) \
 do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)

-static uint32_t get_elf_hwcap(void)
+uint32_t get_elf_hwcap(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -706,7 +749,7 @@ static uint32_t get_elf_hwcap(void)
 return hwcaps;
 }

-static uint32_t get_elf_hwcap2(void)
+uint32_t get_elf_hwcap2(void)
 {
 ARMCPU *cpu = ARM_CPU(thread_cpu);
 uint32_t hwcaps = 0;
@@ -741,6 +784,85 @@ static uint32_t get_elf_hwcap2(void)
 return hwcaps;
 }

+const char *elf_hwcap_str(uint32_t bit)
+{
+static const char *hwcap_str[] = {
+[__builtin_ctz(ARM_HWCAP_A64_FP  )] = "fp",
+[__builtin_ctz(ARM_HWCAP_A64_ASIMD   )] = "asimd",
+[__builtin_ctz(ARM_HWCAP_A64_EVTSTRM )] = "evtstrm",
+[__builtin_ctz(ARM_HWCAP_A64_AES )] = "aes",
+

[PATCH 0/2] linux-user: /proc/cpuinfo fix and content emulation for arm

2023-08-01 Thread Helge Deller

One fix for correctly detecting /proc/cpuinfo access
and new /proc/cpuinfo output for arm/arm64

Helge Deller (2):
  linux-user: Fix openat() emulation to correctly detect accesses to
/proc
  linux-user: Emulate /proc/cpuinfo on aarch64 and arm

 linux-user/elfload.c | 130 +--
 linux-user/loader.h  |   6 +-
 linux-user/syscall.c |  70 ++-
 3 files changed, 198 insertions(+), 8 deletions(-)

--
2.41.0

[PATCH 1/2] linux-user: Fix openat() emulation to correctly detect accesses to /proc

2023-08-01 Thread Helge Deller

In qemu we catch accesses to files like /proc/cpuinfo or /proc/net/route
and return to the guest contents which would be visible on a real system
(instead what the host would show).

This patch fixes a bug, where for example the accesses
cat /proccpuinfo
or
cd /proc && cat cpuinfo
will not be recognized by qemu and where qemu will wrongly show
the contents of the host's /proc/cpuinfo file.

Signed-off-by: Helge Deller 

--
v2:
- use g_autofree instead of pathname on stack
  Daniel P. Berrangé requested to not put buffers on stack.
  Using g_autofree keeps code much cleaner than using
  extended semantics of realpath(), unless I can use g_autofree
  on malloced area from realpath().
---
 linux-user/syscall.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..a089463969 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8539,9 +8539,12 @@ static int open_hardware(CPUArchState *cpu_env, int fd)
 }
 #endif

-int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *pathname,
+
+int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *fname,
 int flags, mode_t mode, bool safe)
 {
+g_autofree char *proc_name = g_new(char, PATH_MAX);
+const char *pathname;
 struct fake_open {
 const char *filename;
 int (*fill)(CPUArchState *cpu_env, int fd);
@@ -8567,6 +8570,13 @@ int do_guest_openat(CPUArchState *cpu_env, int dirfd, 
const char *pathname,
 { NULL, NULL, NULL }
 };

+/* if this is a file from /proc/ filesystem, expand full name */
+if (realpath(fname, proc_name) && strncmp(proc_name, "/proc/", 6) == 0) {
+pathname = proc_name;
+} else {
+pathname = fname;
+}
+
 if (is_proc_myself(pathname, "exe")) {
 if (safe) {
 return safe_openat(dirfd, exec_path, flags, mode);
--
2.41.0

Re: [PATCH 6/8] configure: support passthrough of -Dxxx args to meson

2023-08-01 Thread Daniel P . Berrangé

On Tue, Aug 01, 2023 at 08:42:05PM +0200, Thomas Huth wrote:
> On 01/08/2023 15.04, Daniel P. Berrangé wrote:
> > This can be useful for setting some meson global options, such as the
> > optimization level or debug state.xs
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   configure | 5 +
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/configure b/configure
> > index 26ec5e4f54..9fe3718b77 100755
> > --- a/configure
> > +++ b/configure
> > @@ -757,6 +757,9 @@ for opt do
> > # everything else has the same name in configure and meson
> > --*) meson_option_parse "$opt" "$optarg"
> > ;;
> > +  # Pass through -D options to meson
> > +  -D*) meson_options="$meson_options $opt"
> > +  ;;
> > esac
> >   done
> > @@ -887,6 +890,8 @@ cat << EOF
> > pie Position Independent Executables
> > debug-tcg   TCG debugging (default is disabled)
> > +  -Dmesonoptname=val  passthrough option to meson unmodified
> 
> I'd rather place that earlier in the help text, above the
> "meson_options_help" line in the configure script, next to the other
> --option=something lines.

Sure I've no preference. I was expecting Paolo to tell me to put it
somewhere else anyway :-)


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 7/8] gitlab: disable optimization and debug symbols in msys build

2023-08-01 Thread Daniel P . Berrangé

On Tue, Aug 01, 2023 at 08:44:02PM +0200, Thomas Huth wrote:
> On 01/08/2023 15.04, Daniel P. Berrangé wrote:
> > Building at -O2, adds 33% to the build time, over -O2. IOW a build that
> > takes 45 minutes at -O0, takes 60 minutes at -O2. Turning off debug
> > symbols drops it further, down to 38 minutes.
> > 
> > IOW, a "-O2 -g" build is 58% slower than a "-O0" build on msys in the
> > gitlab CI windows shared runners.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   .gitlab-ci.d/windows.yml | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
> > index 34109a80f2..552e3b751d 100644
> > --- a/.gitlab-ci.d/windows.yml
> > +++ b/.gitlab-ci.d/windows.yml
> > @@ -113,7 +113,7 @@ msys2-64bit:
> >   # commit 9f8e6cad65a6 ("gitlab-ci: Speed up the msys2-64bit job by 
> > using --without-default-devices"
> >   # changed to compile QEMU with the --without-default-devices switch
> >   # for the msys2 64-bit job, due to the build could not complete within
> > -CONFIGURE_ARGS:  --target-list=x86_64-softmmu --without-default-devices
> > +CONFIGURE_ARGS:  --target-list=x86_64-softmmu 
> > --without-default-devices -Ddebug=false -Doptimization=0
> >   # qTests don't run successfully with "--without-default-devices",
> >   # so let's exclude the qtests from CI for now.
> >   TEST_ARGS: --no-suite qtest
> > @@ -123,5 +123,5 @@ msys2-32bit:
> > variables:
> >   MINGW_TARGET: mingw-w64-i686
> >   MSYSTEM: MINGW32
> > -CONFIGURE_ARGS:  --target-list=ppc64-softmmu
> > +CONFIGURE_ARGS:  --target-list=ppc64-softmmu -Ddebug=false 
> > -Doptimization=0
> >   TEST_ARGS: --no-suite qtest
> 
> This is IMHO a very good idea! But I think for now it's enough if you only
> change the 64-bit, isn't it?

My thought was if we do it for 32-bit too, we can enable some more targets
to get a more comprehensive build.


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 0/8] gitlab: speed up msys windows jobs with GCC

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.03, Daniel P. Berrangé wrote:

This is an alternative and/or complementary to Thomas' proposal
to use CLang with msys:

   https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg05402.html

First of all, the current msys installer we're using is over 12
months out of date. Thus after running the install, pacman then
replaces most of what we've just installed with new downloaded
content. Using the most update installer cuts 3+1/2 minutes off
the msys install time - 7 minutes becomes 3+1/2.

Secondly, QEMU defaults to compiling with -O2 and this is more
computationally expensive for GCC. Switching to -O0 drops the
build time from 60 minutes down to 45 minutes.

Thirdly, including debug symbols also has an overhead, and turning
that off reduces time still further down to 38 minutes.

IOW, between all three changes, we can cut approx 25-26 minutes
off the job execution time, bringing it nicely within the job
timeout.

The actually phase of installing the mingw deps still accounts
for about 10 minutes and has not been optimized.

Possibly the same trick of -O0 and skipping -g would also help
the clang alternative Thomas' proposed. If so, that could be
enough to let us enable more features / targets during the
msys build.


I really like the idea! And I guess my idea with Clang needs some more work 
'til it is acceptable, so let's go with your idea for now to fix the timeout 
problem in the CI ... we can still optimize later with Clang in case we 
found a good solution for that ms_struct problem...


 Thomas

Re: [PATCH 8/8] gitlab: disable FF_SCRIPT_SECTIONS on msys jobs

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.04, Daniel P. Berrangé wrote:

The FF_SCRIPT_SECTIONS=1 variable should ordinarily cause output from
each line of the job script to be presented in a collapsible section
with execution time listed.

While it works on Linux shared runners, when used with Windows runners
with PowerShell, this option does not create any sections, and actually
causes echo'ing of commands to be disabled, making it even worse to
debug the jobs.

Signed-off-by: Daniel P. Berrangé 
---
  .gitlab-ci.d/windows.yml | 4 
  1 file changed, 4 insertions(+)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index 552e3b751d..cd7622a761 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -12,6 +12,10 @@
needs: []
stage: build
timeout: 80m
+  variables:
+# This feature doesn't (currently) work with PowerShell, it stops
+# the echo'ing of commands being run and doesn't show any timing
+FF_SCRIPT_SECTIONS: 0
artifacts:
  name: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
  expire_in: 7 days


Acked-by: Thomas Huth

Re: [PATCH 7/8] gitlab: disable optimization and debug symbols in msys build

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.04, Daniel P. Berrangé wrote:

Building at -O2, adds 33% to the build time, over -O2. IOW a build that
takes 45 minutes at -O0, takes 60 minutes at -O2. Turning off debug
symbols drops it further, down to 38 minutes.

IOW, a "-O2 -g" build is 58% slower than a "-O0" build on msys in the
gitlab CI windows shared runners.

Signed-off-by: Daniel P. Berrangé 
---
  .gitlab-ci.d/windows.yml | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index 34109a80f2..552e3b751d 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -113,7 +113,7 @@ msys2-64bit:
  # commit 9f8e6cad65a6 ("gitlab-ci: Speed up the msys2-64bit job by using 
--without-default-devices"
  # changed to compile QEMU with the --without-default-devices switch
  # for the msys2 64-bit job, due to the build could not complete within
-CONFIGURE_ARGS:  --target-list=x86_64-softmmu --without-default-devices
+CONFIGURE_ARGS:  --target-list=x86_64-softmmu --without-default-devices 
-Ddebug=false -Doptimization=0
  # qTests don't run successfully with "--without-default-devices",
  # so let's exclude the qtests from CI for now.
  TEST_ARGS: --no-suite qtest
@@ -123,5 +123,5 @@ msys2-32bit:
variables:
  MINGW_TARGET: mingw-w64-i686
  MSYSTEM: MINGW32
-CONFIGURE_ARGS:  --target-list=ppc64-softmmu
+CONFIGURE_ARGS:  --target-list=ppc64-softmmu -Ddebug=false -Doptimization=0
  TEST_ARGS: --no-suite qtest


This is IMHO a very good idea! But I think for now it's enough if you only 
change the 64-bit, isn't it?


 Thomas

[PATCH for-8.1 0/3] accel/tcg: Do not issue misaligned i/o

2023-08-01 Thread Richard Henderson

Fixing #1800, and possibly more.

r~

Richard Henderson (3):
  accel/tcg: Adjust parameters and locking with do_{ld,st}_mmio_*
  accel/tcg: Issue wider aligned i/o in do_{ld,st}_mmio_*
  accel/tcg: Do not issue misaligned i/o

 accel/tcg/cputlb.c | 251 ++---
 1 file changed, 169 insertions(+), 82 deletions(-)

-- 
2.34.1

[PATCH 2/3] accel/tcg: Issue wider aligned i/o in do_{ld,st}_mmio_*

2023-08-01 Thread Richard Henderson

If the address and size are aligned, send larger chunks
to the memory subsystem.  This will be required to make
more use of these helpers.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 76 +-
 1 file changed, 69 insertions(+), 7 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d28606b93e..c3e1fdbf37 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2080,10 +2080,40 @@ static uint64_t do_ld_mmio_beN(CPUArchState *env, 
CPUTLBEntryFull *full,
uint64_t ret_be, vaddr addr, int size,
int mmu_idx, MMUAccessType type, uintptr_t ra)
 {
-for (int i = 0; i < size; i++) {
-uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
-ret_be = (ret_be << 8) | x;
-}
+uint64_t t;
+
+tcg_debug_assert(size > 0 && size <= 8);
+do {
+/* Read aligned pieces up to 8 bytes. */
+switch ((size | (int)addr) & 7) {
+case 1:
+case 3:
+case 5:
+case 7:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_UB);
+ret_be = (ret_be << 8) | t;
+size -= 1;
+addr += 1;
+break;
+case 2:
+case 6:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUW);
+ret_be = (ret_be << 16) | t;
+size -= 2;
+addr += 2;
+break;
+case 4:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUL);
+ret_be = (ret_be << 32) | t;
+size -= 4;
+addr += 4;
+break;
+case 0:
+return io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUQ);
+default:
+qemu_build_not_reached();
+}
+} while (size);
 return ret_be;
 }
 
@@ -2678,9 +2708,41 @@ static uint64_t do_st_mmio_leN(CPUArchState *env, 
CPUTLBEntryFull *full,
uint64_t val_le, vaddr addr, int size,
int mmu_idx, uintptr_t ra)
 {
-for (int i = 0; i < size; i++, val_le >>= 8) {
-io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
-}
+tcg_debug_assert(size > 0 && size <= 8);
+
+do {
+/* Store aligned pieces up to 8 bytes. */
+switch ((size | (int)addr) & 7) {
+case 1:
+case 3:
+case 5:
+case 7:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_UB);
+val_le >>= 8;
+size -= 1;
+addr += 1;
+break;
+case 2:
+case 6:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUW);
+val_le >>= 16;
+size -= 2;
+addr += 2;
+break;
+case 4:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUL);
+val_le >>= 32;
+size -= 4;
+addr += 4;
+break;
+case 0:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUQ);
+return 0;
+default:
+qemu_build_not_reached();
+}
+} while (size);
+
 return val_le;
 }
 
-- 
2.34.1

[PATCH 1/3] accel/tcg: Adjust parameters and locking with do_{ld, st}_mmio_*

2023-08-01 Thread Richard Henderson

Replace MMULookupPageData* with CPUTLBEntryFull, addr, size.
Move QEMU_IOTHREAD_LOCK_GUARD to the caller.

This simplifies the usage from do_ld16_beN and do_st16_leN, where
we weren't locking the entire operation, and required hoop jumping
for passing addr and size.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 65 +++---
 1 file changed, 32 insertions(+), 33 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ba44501a7c..d28606b93e 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2066,24 +2066,21 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr 
addr, MemOpIdx oi,
 /**
  * do_ld_mmio_beN:
  * @env: cpu context
- * @p: translation parameters
+ * @full: page parameters
  * @ret_be: accumulated data
+ * @addr: virtual address
+ * @size: number of bytes
  * @mmu_idx: virtual address context
  * @ra: return address into tcg generated code, or 0
  *
- * Load @p->size bytes from @p->addr, which is memory-mapped i/o.
+ * Load @size bytes from @addr, which is memory-mapped i/o.
  * The bytes are concatenated in big-endian order with @ret_be.
  */
-static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p,
-   uint64_t ret_be, int mmu_idx,
-   MMUAccessType type, uintptr_t ra)
+static uint64_t do_ld_mmio_beN(CPUArchState *env, CPUTLBEntryFull *full,
+   uint64_t ret_be, vaddr addr, int size,
+   int mmu_idx, MMUAccessType type, uintptr_t ra)
 {
-CPUTLBEntryFull *full = p->full;
-vaddr addr = p->addr;
-int i, size = p->size;
-
-QEMU_IOTHREAD_LOCK_GUARD();
-for (i = 0; i < size; i++) {
+for (int i = 0; i < size; i++) {
 uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
 ret_be = (ret_be << 8) | x;
 }
@@ -2232,7 +2229,9 @@ static uint64_t do_ld_beN(CPUArchState *env, 
MMULookupPageData *p,
 unsigned tmp, half_size;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+return do_ld_mmio_beN(env, p->full, ret_be, p->addr, p->size,
+  mmu_idx, type, ra);
 }
 
 /*
@@ -2281,11 +2280,11 @@ static Int128 do_ld16_beN(CPUArchState *env, 
MMULookupPageData *p,
 MemOp atom;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-p->size = size - 8;
-a = do_ld_mmio_beN(env, p, a, mmu_idx, MMU_DATA_LOAD, ra);
-p->addr += p->size;
-p->size = 8;
-b = do_ld_mmio_beN(env, p, 0, mmu_idx, MMU_DATA_LOAD, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+a = do_ld_mmio_beN(env, p->full, a, p->addr, size - 8,
+   mmu_idx, MMU_DATA_LOAD, ra);
+b = do_ld_mmio_beN(env, p->full, 0, p->addr + 8, 8,
+   mmu_idx, MMU_DATA_LOAD, ra);
 return int128_make128(b, a);
 }
 
@@ -2664,24 +2663,22 @@ Int128 cpu_ld16_mmu(CPUArchState *env, abi_ptr addr,
 /**
  * do_st_mmio_leN:
  * @env: cpu context
- * @p: translation parameters
+ * @full: page parameters
  * @val_le: data to store
+ * @addr: virtual address
+ * @size: number of bytes
  * @mmu_idx: virtual address context
  * @ra: return address into tcg generated code, or 0
  *
- * Store @p->size bytes at @p->addr, which is memory-mapped i/o.
+ * Store @size bytes at @addr, which is memory-mapped i/o.
  * The bytes to store are extracted in little-endian order from @val_le;
  * return the bytes of @val_le beyond @p->size that have not been stored.
  */
-static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p,
-   uint64_t val_le, int mmu_idx, uintptr_t ra)
+static uint64_t do_st_mmio_leN(CPUArchState *env, CPUTLBEntryFull *full,
+   uint64_t val_le, vaddr addr, int size,
+   int mmu_idx, uintptr_t ra)
 {
-CPUTLBEntryFull *full = p->full;
-vaddr addr = p->addr;
-int i, size = p->size;
-
-QEMU_IOTHREAD_LOCK_GUARD();
-for (i = 0; i < size; i++, val_le >>= 8) {
+for (int i = 0; i < size; i++, val_le >>= 8) {
 io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
 }
 return val_le;
@@ -2698,7 +2695,9 @@ static uint64_t do_st_leN(CPUArchState *env, 
MMULookupPageData *p,
 unsigned tmp, half_size;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return do_st_mmio_leN(env, p, val_le, mmu_idx, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+return do_st_mmio_leN(env, p->full, val_le, p->addr,
+  p->size, mmu_idx, ra);
 } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
 return val_le >> (p->size * 8);
 }
@@ -2751,11 +2750,11 @@ static uint64_t do_st16_leN(CPUArchState *env, 
MMULookupPageData *p,
 MemOp atom;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-p->size = 8;
-

Re: [PATCH 6/8] configure: support passthrough of -Dxxx args to meson

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.04, Daniel P. Berrangé wrote:

This can be useful for setting some meson global options, such as the
optimization level or debug state.xs

Signed-off-by: Daniel P. Berrangé 
---
  configure | 5 +
  1 file changed, 5 insertions(+)

diff --git a/configure b/configure
index 26ec5e4f54..9fe3718b77 100755
--- a/configure
+++ b/configure
@@ -757,6 +757,9 @@ for opt do
# everything else has the same name in configure and meson
--*) meson_option_parse "$opt" "$optarg"
;;
+  # Pass through -D options to meson
+  -D*) meson_options="$meson_options $opt"
+  ;;
esac
  done
  
@@ -887,6 +890,8 @@ cat << EOF

pie Position Independent Executables
debug-tcg   TCG debugging (default is disabled)
  
+  -Dmesonoptname=val  passthrough option to meson unmodified


I'd rather place that earlier in the help text, above the 
"meson_options_help" line in the configure script, next to the other 
--option=something lines.


 Thomas


  NOTE: The object files are built at the place where configure is launched
  EOF
  exit 0

[PATCH 3/3] accel/tcg: Do not issue misaligned i/o

2023-08-01 Thread Richard Henderson

In the single-page case we were issuing misaligned i/o to
the memory subsystem, which does not handle it properly.
Split such accesses via do_{ld,st}_mmio_*.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1800
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 118 +++--
 1 file changed, 72 insertions(+), 46 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c3e1fdbf37..05d272f839 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2369,16 +2369,20 @@ static uint8_t do_ld_1(CPUArchState *env, 
MMULookupPageData *p, int mmu_idx,
 static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
 MMUAccessType type, MemOp memop, uintptr_t ra)
 {
-uint64_t ret;
+uint16_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian, then swap if necessary. */
-ret = load_atom_2(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap16(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 2, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap16(ret);
+}
+} else {
+/* Perform the load host endian, then swap if necessary. */
+ret = load_atom_2(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap16(ret);
+}
 }
 return ret;
 }
@@ -2389,13 +2393,17 @@ static uint32_t do_ld_4(CPUArchState *env, 
MMULookupPageData *p, int mmu_idx,
 uint32_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian. */
-ret = load_atom_4(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap32(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 4, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap32(ret);
+}
+} else {
+/* Perform the load host endian. */
+ret = load_atom_4(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap32(ret);
+}
 }
 return ret;
 }
@@ -2406,13 +2414,17 @@ static uint64_t do_ld_8(CPUArchState *env, 
MMULookupPageData *p, int mmu_idx,
 uint64_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian. */
-ret = load_atom_8(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap64(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 8, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap64(ret);
+}
+} else {
+/* Perform the load host endian. */
+ret = load_atom_8(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap64(ret);
+}
 }
 return ret;
 }
@@ -2560,20 +2572,22 @@ static Int128 do_ld16_mmu(CPUArchState *env, vaddr addr,
 cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD, );
 if (likely(!crosspage)) {
-/* Perform the load host endian. */
 if (unlikely(l.page[0].flags & TLB_MMIO)) {
 QEMU_IOTHREAD_LOCK_GUARD();
-a = io_readx(env, l.page[0].full, l.mmu_idx, addr,
- ra, MMU_DATA_LOAD, MO_64);
-b = io_readx(env, l.page[0].full, l.mmu_idx, addr + 8,
- ra, MMU_DATA_LOAD, MO_64);
-ret = int128_make128(HOST_BIG_ENDIAN ? b : a,
- HOST_BIG_ENDIAN ? a : b);
+a = do_ld_mmio_beN(env, l.page[0].full, 0, addr, 8,
+   l.mmu_idx, MMU_DATA_LOAD, ra);
+b = do_ld_mmio_beN(env, l.page[0].full, 0, addr + 8, 8,
+   l.mmu_idx, MMU_DATA_LOAD, ra);
+ret = int128_make128(b, a);
+if ((l.memop & MO_BSWAP) == MO_LE) {
+ret = bswap128(ret);
+}
 } else {
+/* Perform the load host endian. */
 ret = load_atom_16(env, ra, l.page[0].haddr, l.memop);
-}
-if (l.memop & MO_BSWAP) {
-ret = bswap128(ret);
+if (l.memop & MO_BSWAP) {
+ret = bswap128(ret);
+}
 }
 return ret;
 }
@@ -2872,7 +2886,11 @@ static void do_st_2(CPUArchState *env, MMULookupPageData 
*p, uint16_t val,
 int mmu_idx, MemOp memop, uintptr_t ra)
 {
 if (unlikely(p->flags & TLB_MMIO)) {
-io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
+if ((memop & MO_BSWAP) != MO_LE) {
+

Re: [PATCH] gdbstub: Fix client Ctrl-C handling

2023-08-01 Thread Matheus Tavares Bernardino

Hi, Nick.

> Nicholas Piggin  wrote:
>
> On Tue Jul 11, 2023 at 9:03 PM AEST, Matheus Tavares Bernardino wrote:
> > > Nicholas Piggin  wrote:
> > >
> > > diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
> > > index 6911b73c07..ce8b42eb15 100644
> > > --- a/gdbstub/gdbstub.c
> > > +++ b/gdbstub/gdbstub.c
> > > @@ -2051,8 +2051,17 @@ void gdb_read_byte(uint8_t ch)
> > >  return;
> > >  }
> > >  if (runstate_is_running()) {
> > > -/* when the CPU is running, we cannot do anything except stop
> > > -   it when receiving a char */
> > > +/*
> > > + * When the CPU is running, we cannot do anything except stop
> > > + * it when receiving a char. This is expected on a Ctrl-C in the
> > > + * gdb client. Because we are in all-stop mode, gdb sends a
> > > + * 0x03 byte which is not a usual packet, so we handle it 
> > > specially
> > > + * here, but it does expect a stop reply.
> > > + */
> > > +if (ch != 0x03) {
> > > +warn_report("gdbstub: client sent packet while target 
> > > running\n");
> > > +}
> > > +gdbserver_state.allow_stop_reply = true;
> > >  vm_stop(RUN_STATE_PAUSED);
> > >  } else
> > >  #endif
> >
> > Makes sense to me, but shouldn't we send the stop-reply packet only for
> > Ctrl+C/0x03?
> 
> Good question.
> 
> I think if we get a character here that's not a 3, we're already in
> trouble, and we eat it so even worse. Since we only send a stop packet
> back when the vm stops, then if we don't send one now we might never
> send it. At least if we send one then the client might have some chance
> to get back to a sane state.

I just noticed now (as I was integrating the latest upstream patches
with our downstream qemu-system-hexagon) that this causes the
gdbstub-untimely-packet tcg test to fail.

My first thought was that, if 0x3 is the only valid case where we will
read a char when the cpu is running, perhaps not issuing the stop-reply
isn't that bad as GDB would ignore it anyways. E.g. from a `set debug
remote 1` output:

  Sending packet: $qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;
   fork-events+;vfork-events+;exec-events+;vContSupported+;
   QThreadEvents+;no-resumed+;
   xmlRegisters=i386#6a...
  Packet instead of Ack, ignoring it

So, perhaps, we could do:

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index f123b40ce7..8af066301a 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -2055,8 +2055,9 @@ void gdb_read_byte(uint8_t ch)
  */
 if (ch != 0x03) {
 warn_report("gdbstub: client sent packet while target running\n");
+} else {
+gdbserver_state.allow_stop_reply = true;
 }
-gdbserver_state.allow_stop_reply = true;
 vm_stop(RUN_STATE_PAUSED);
 } else
 #endif
-- >8 --

Alternatively, since GDB ignores the packet anyways, should we just let
this be and refactor/remove the test?

Re: [PATCH 2/8] gitlab: print timestamps during windows msys jobs

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.03, Daniel P. Berrangé wrote:

It is hard to get visibility into where time is consumed in our Windows
msys jobs. Adding a few log console messages with the timestamp will
aid in our debugging.

Signed-off-by: Daniel P. Berrangé 
---
  .gitlab-ci.d/windows.yml | 5 +
  1 file changed, 5 insertions(+)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index f086540e40..831b080d12 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -19,6 +19,7 @@
  reports:
junit: "build/meson-logs/testlog.junit.xml"
before_script:
+  - Write-Output "Acquiring msys2.exe installer at $(Get-Date -Format u)"
- If ( !(Test-Path -Path msys64\var\cache ) ) {
mkdir msys64\var\cache
  }
@@ -27,6 +28,7 @@

"https://github.com/msys2/msys2-installer/releases/download/2022-06-03/msys2-base-x86_64-20220603.sfx.exe;
-outfile "msys64\var\cache\msys2.exe"
  }
+  - Write-Output "Invoking msys2.exe installer at $(Get-Date -Format u)"
- msys64\var\cache\msys2.exe -y
- ((Get-Content -path .\msys64\etc\\post-install\\07-pacman-key.post -Raw)
-replace '--refresh-keys', '--version') |
@@ -36,6 +38,7 @@
- .\msys64\usr\bin\bash -lc 'pacman --noconfirm -Syuu'  # Normal update
- taskkill /F /FI "MODULES eq msys-2.0.dll"
script:
+  - Write-Output "Installing mingw packages at $(Get-Date -Format u)"
- .\msys64\usr\bin\bash -lc "pacman -Sy --noconfirm --needed
bison diffutils flex
git grep make sed
@@ -66,6 +69,7 @@
$MINGW_TARGET-spice
$MINGW_TARGET-usbredir
$MINGW_TARGET-zstd "
+  - Write-Output "Running build at $(Get-Date -Format u)"
- $env:CHERE_INVOKING = 'yes'  # Preserve the current working directory
- $env:MSYS = 'winsymlinks:native' # Enable native Windows symlink
- mkdir build
@@ -73,6 +77,7 @@
- ..\msys64\usr\bin\bash -lc "../configure --enable-fdt=system 
$CONFIGURE_ARGS"
- ..\msys64\usr\bin\bash -lc "make"
- ..\msys64\usr\bin\bash -lc "make check MTESTARGS='$TEST_ARGS' || { cat 
meson-logs/testlog.txt; exit 1; } ;"
+  - Write-Output "Finished build at $(Get-Date -Format u)"
  
  msys2-64bit:

extends: .shared_msys2_builder


Reviewed-by: Thomas Huth

Re: [PATCH 1/8] gitlab: remove duplication between msys jobs

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.03, Daniel P. Berrangé wrote:

Although they share a common parent, the two msys jobs still have
massive duplication in their script definitions that can easily be
collapsed.

Signed-off-by: Daniel P. Berrangé 
---
  .gitlab-ci.d/windows.yml | 132 +++
  1 file changed, 49 insertions(+), 83 deletions(-)


As I discovered in the recent days, this is also quite helpful in case we 
ever want to switch to Clang, since we need to change the prefix of the 
packages there. And it's still easy to have distinct packages with some few 
lines of codes changes, as I tried it out in my Clang patch. So this patch 
now sounds fine to me:


Reviewed-by: Thomas Huth

Re: [RFC PATCH 5/6] include/qemu/compiler: Fix problem with gcc_struct and Clang

2023-08-01 Thread Thomas Huth


On 01/08/2023 15.49, Daniel P. Berrangé wrote:
...

If the value of the msys2 jobs is that they let us run the test suite,
can we limit the usage of msys2 to just that part, and chain it upto
the fedora mingw cross compile jobs ie.

   win64-fedora-cross-container
 |
 +-> cross-win64-system
   |
  +-> check-win64-msys

In the "cross-win64-system" job we would have to publish the entire QEMU
'build' directory as an artifact, to pass it over to the msys job.  If
we also published /usr/x86_64-w64-mingw32/ as an artifact, then we would
not need to install any mingw packages under msys. The basic msys installer
can be run (which takes a couple of minutes), and then then we just dump
the Fedora artifacts of /usr/x86_64-w64-mingw32/ into the right place
and run the test suite.


It's a nice idea at a first glance - but at a second glance I doubt that 
this is easy to realize. You need the configured meson environment for 
running the tests, and you cannot easily pass that from the Fedora container 
to MSYS2. So you'd either need to re-run meson setup in the MSYS2 job and 
then trick it into believing that the binaries have already been built, or 
you have to run the test binaries "manually" without meson... both might be 
possible, but it sounds rather fragile to me.


 Thomas

Re: [PATCH 0/8] gitlab: speed up msys windows jobs with GCC

2023-08-01 Thread Thomas Huth


On 01/08/2023 16.35, Daniel P. Berrangé wrote:

On Tue, Aug 01, 2023 at 03:53:22PM +0200, Markus Armbruster wrote:

Daniel P. Berrangé  writes:


This is an alternative and/or complementary to Thomas' proposal
to use CLang with msys:

   https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg05402.html

First of all, the current msys installer we're using is over 12
months out of date. Thus after running the install, pacman then
replaces most of what we've just installed with new downloaded
content. Using the most update installer cuts 3+1/2 minutes off
the msys install time - 7 minutes becomes 3+1/2.

Secondly, QEMU defaults to compiling with -O2 and this is more
computationally expensive for GCC. Switching to -O0 drops the
build time from 60 minutes down to 45 minutes.


 From the fine manual[*]: "The effectiveness of some warnings depends on
optimizations also being enabled.  For example '-Wsuggest-final-types'
is more effective with link-time optimization and some instances of
other warnings may not be issued at all unless optimization is enabled.
While optimization in general improves the efficacy of control and data
flow sensitive warnings, in some cases it may also cause false
positives."  Do we care?


In general, yes, we do care.

In this specific case though, we're battling to figure out the lesser
of multiple evils.


I agree. Additionally, we also test compiling for Windows with the MinGW 
cross compiler suite in a Fedora container, and we still use the default 
optimization there, so we should have that covered.


 Thomas

Re: [PATCH for-8.1] target/m68k: Fix semihost lseek offset computation

2023-08-01 Thread Philippe Mathieu-Daudé


On 1/8/23 17:45, Peter Maydell wrote:

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Cc: qemu-sta...@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
---
Same fix for m68k as Keith Packard just sent for nios2
---
  target/m68k/m68k-semi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 88ad9ba8144..239f6e44e90 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -166,7 +166,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
  GET_ARG64(2);
  GET_ARG64(3);
  semihost_sys_lseek(cs, m68k_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
  break;
  
  case HOSTED_RENAME:


Thanks for writing the fix!

Reviewed-by: Philippe Mathieu-Daudé

[PATCH for-8.1 1/3] hw/xen: fix off-by-one in xen_evtchn_set_gsi()

2023-08-01 Thread David Woodhouse

From: David Woodhouse 

Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.

Also fix up an assert() that has the same problem, that Coverity didn't see.

Fixes: 4f81baa33ed6 ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/i386/kvm/xen_evtchn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 3d810dbd59..0e9c108614 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1587,7 +1587,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int 
gsi)
  found:
 pirq_inuse_word(s, pirq) |= pirq_inuse_bit(pirq);
 if (gsi >= 0) {
-assert(gsi <= IOAPIC_NUM_PINS);
+assert(gsi < IOAPIC_NUM_PINS);
 s->gsi_pirq[gsi] = pirq;
 }
 s->pirq[pirq].gsi = gsi;
@@ -1601,7 +1601,7 @@ bool xen_evtchn_set_gsi(int gsi, int level)
 
 assert(qemu_mutex_iothread_locked());
 
-if (!s || gsi < 0 || gsi > IOAPIC_NUM_PINS) {
+if (!s || gsi < 0 || gsi >= IOAPIC_NUM_PINS) {
 return false;
 }
 
-- 
2.40.1

[PATCH for-8.1 3/3] hw/xen: prevent guest from binding loopback event channel to itself

2023-08-01 Thread David Woodhouse

From: David Woodhouse 

Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
---
 hw/i386/kvm/xen_evtchn.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 0e9c108614..a731738411 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1408,8 +1408,15 @@ int xen_evtchn_bind_interdomain_op(struct 
evtchn_bind_interdomain *interdomain)
 XenEvtchnPort *rp = >port_table[interdomain->remote_port];
 XenEvtchnPort *lp = >port_table[interdomain->local_port];
 
-if (rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
-/* It's a match! */
+/*
+ * The 'remote' port for loopback must be an unbound port allocated for
+ * communication with the local domain (as indicated by rp->type_val
+ * being zero, not PORT_INFO_TYPEVAL_REMOTE_QEMU), and must *not* be
+ * the port that was just allocated for the local end.
+ */
+if (interdomain->local_port != interdomain->remote_port &&
+rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
+
 rp->type = EVTCHNSTAT_interdomain;
 rp->type_val = interdomain->local_port;
 
-- 
2.40.1

[PATCH for-8.1 2/3] i386/xen: consistent locking around Xen singleshot timers

2023-08-01 Thread David Woodhouse

From: David Woodhouse 

Coverity points out (CID 1507534, 1507968) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock and sometimes not.

This isn't always an issue. There are two modes for the timers; if the
kernel supports the EVTCHN_SEND capability then it handles all the timer
hypercalls and delivery internally, and all we use the field for is to
get/set the timer as part of the vCPU state via an ioctl(). If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.

But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
---
 target/i386/kvm/xen-emu.c | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index d7c7eb8d9c..9946ff0905 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -43,6 +43,7 @@
 
 static void xen_vcpu_singleshot_timer_event(void *opaque);
 static void xen_vcpu_periodic_timer_event(void *opaque);
+static int vcpuop_stop_singleshot_timer(CPUState *cs);
 
 #ifdef TARGET_X86_64
 #define hypercall_compat32(longmode) (!(longmode))
@@ -466,6 +467,7 @@ void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, 
int type)
 }
 }
 
+/* Must always be called with xen_timers_lock held */
 static int kvm_xen_set_vcpu_timer(CPUState *cs)
 {
 X86CPU *cpu = X86_CPU(cs);
@@ -483,6 +485,7 @@ static int kvm_xen_set_vcpu_timer(CPUState *cs)
 
 static void do_set_vcpu_timer_virq(CPUState *cs, run_on_cpu_data data)
 {
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
 kvm_xen_set_vcpu_timer(cs);
 }
 
@@ -545,7 +548,6 @@ static void do_vcpu_soft_reset(CPUState *cs, 
run_on_cpu_data data)
 env->xen_vcpu_time_info_gpa = INVALID_GPA;
 env->xen_vcpu_runstate_gpa = INVALID_GPA;
 env->xen_vcpu_callback_vector = 0;
-env->xen_singleshot_timer_ns = 0;
 memset(env->xen_virq, 0, sizeof(env->xen_virq));
 
 set_vcpu_info(cs, INVALID_GPA);
@@ -555,8 +557,13 @@ static void do_vcpu_soft_reset(CPUState *cs, 
run_on_cpu_data data)
   INVALID_GPA);
 if (kvm_xen_has_cap(EVTCHN_SEND)) {
 kvm_xen_set_vcpu_callback_vector(cs);
+
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
+env->xen_singleshot_timer_ns = 0;
 kvm_xen_set_vcpu_timer(cs);
-}
+} else {
+vcpuop_stop_singleshot_timer(cs);
+};
 
 }
 
@@ -1059,6 +1066,10 @@ static int vcpuop_stop_periodic_timer(CPUState *target)
 return 0;
 }
 
+/*
+ * Userspace handling of timer, for older kernels.
+ * Must always be called with xen_timers_lock held.
+ */
 static int do_set_singleshot_timer(CPUState *cs, uint64_t timeout_abs,
bool future, bool linux_wa)
 {
@@ -1086,12 +1097,8 @@ static int do_set_singleshot_timer(CPUState *cs, 
uint64_t timeout_abs,
 timeout_abs = now + delta;
 }
 
-qemu_mutex_lock(>xen_timers_lock);
-
 timer_mod_ns(env->xen_singleshot_timer, qemu_now + delta);
 env->xen_singleshot_timer_ns = now + delta;
-
-qemu_mutex_unlock(>xen_timers_lock);
 return 0;
 }
 
@@ -1115,6 +1122,7 @@ static int vcpuop_set_singleshot_timer(CPUState *cs, 
uint64_t arg)
 return -EFAULT;
 }
 
+QEMU_LOCK_GUARD(_CPU(cs)->env.xen_timers_lock);
 return do_set_singleshot_timer(cs, sst.timeout_abs_ns,
!!(sst.flags & VCPU_SSHOTTMR_future),
false);
@@ -1141,6 +1149,7 @@ static bool kvm_xen_hcall_set_timer_op(struct 
kvm_xen_exit *exit, X86CPU *cpu,
 if (unlikely(timeout == 0)) {
 err = vcpuop_stop_singleshot_timer(CPU(cpu));
 } else {
+QEMU_LOCK_GUARD(_CPU(cpu)->env.xen_timers_lock);
 err = do_set_singleshot_timer(CPU(cpu), timeout, false, true);
 }
 exit->u.hcall.result = err;
@@ -1826,6 +1835,7 @@ int kvm_put_xen_state(CPUState *cs)
  * If the kernel has EVTCHN_SEND support then it handles timers too,
  * so the timer will be restored by kvm_xen_set_vcpu_timer() below.
  */
+QEMU_LOCK_GUARD(>xen_timers_lock);
 if (env->xen_singleshot_timer_ns) {
 ret = do_set_singleshot_timer(cs, env->xen_singleshot_timer_ns,
 false, false);
@@ -1844,10 +1854,7 @@ int kvm_put_xen_state(CPUState *cs)
 }
 
 if (env->xen_virq[VIRQ_TIMER]) {
-ret = kvm_xen_set_vcpu_timer(cs);
-if (ret < 0) {
-return ret;
-}
+do_set_vcpu_timer_virq(cs, 
RUN_ON_CPU_HOST_INT(env->xen_virq[VIRQ_TIMER]));
 }
 return 0;
 }
@@ -1896,6 +1903,15 @@ int kvm_get_xen_state(CPUState *cs)
 if (ret < 0) {
 return ret;
 }
+
+/*
+ * This locking is fairly pointless, and is

[PATCH for-8.1] Misc Xen-on-KVM fixes

2023-08-01 Thread David Woodhouse

A few minor fixes for the Xen emulation support. One was just merged, but
there are three outstanding.

David Woodhouse (3):
  hw/xen: fix off-by-one in xen_evtchn_set_gsi()
  i386/xen: consistent locking around Xen singleshot timers
  hw/xen: prevent guest from binding loopback event channel to itself

 hw/i386/kvm/xen_evtchn.c  | 15 +++
 target/i386/kvm/xen-emu.c | 36 ++--
 2 files changed, 37 insertions(+), 14 deletions(-)

Re: [PULL 0/5] Misc fixes, for thread-pool, xen, and xen-emulate

2023-08-01 Thread Richard Henderson


On 8/1/23 02:40, Anthony PERARD via wrote:

The following changes since commit 802341823f1720511dd5cf53ae40285f7978c61b:

   Merge tag 'pull-tcg-20230731' ofhttps://gitlab.com/rth7680/qemu  into 
staging (2023-07-31 14:02:51 -0700)

are available in the Git repository at:

   https://xenbits.xen.org/git-http/people/aperard/qemu-dm.git  
tags/pull-xen-20230801

for you to fetch changes up to 856ca10f9ce1fcffeab18546b36a64f79017c905:

   xen-platform: do full PCI reset during unplug of IDE devices (2023-08-01 
10:22:33 +0100)


Misc fixes, for thread-pool, xen, and xen-emulate

* fix an access to `request_cond` QemuCond in thread-pool
* fix issue with PCI devices when unplugging IDE devices in Xen guest
* several fixes for issues pointed out by Coverity


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~

[PULL 0/1 for 8.1] TLS crash fix

2023-08-01 Thread Daniel P . Berrangé

The following changes since commit 802341823f1720511dd5cf53ae40285f7978c61b:

  Merge tag 'pull-tcg-20230731' of https://gitlab.com/rth7680/qemu into staging 
(2023-07-31 14:02:51 -0700)

are available in the Git repository at:

  https://gitlab.com/berrange/qemu tags/io-tls-hs-crash-pull-request

for you to fetch changes up to 10be627d2b5ec2d6b3dce045144aa739eef678b4:

  io: remove io watch if TLS channel is closed during handshake (2023-08-01 
18:45:27 +0100)


Fix crash during early close of TLS channel



Daniel P. Berrangé (1):
  io: remove io watch if TLS channel is closed during handshake

 include/io/channel-tls.h |  1 +
 io/channel-tls.c | 18 --
 2 files changed, 13 insertions(+), 6 deletions(-)

-- 
2.41.0

[PULL 1/1] io: remove io watch if TLS channel is closed during handshake

2023-08-01 Thread Daniel P . Berrangé

The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.

CVE-2023-3354
Reported-by: jiangyegen 
Signed-off-by: Daniel P. Berrangé 
---
 include/io/channel-tls.h |  1 +
 io/channel-tls.c | 18 --
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/io/channel-tls.h b/include/io/channel-tls.h
index 5672479e9e..26c67f17e2 100644
--- a/include/io/channel-tls.h
+++ b/include/io/channel-tls.h
@@ -48,6 +48,7 @@ struct QIOChannelTLS {
 QIOChannel *master;
 QCryptoTLSSession *session;
 QIOChannelShutdown shutdown;
+guint hs_ioc_tag;
 };
 
 /**
diff --git a/io/channel-tls.c b/io/channel-tls.c
index 9805dd0a3f..847d5297c3 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -198,12 +198,13 @@ static void qio_channel_tls_handshake_task(QIOChannelTLS 
*ioc,
 }
 
 trace_qio_channel_tls_handshake_pending(ioc, status);
-qio_channel_add_watch_full(ioc->master,
-   condition,
-   qio_channel_tls_handshake_io,
-   data,
-   NULL,
-   context);
+ioc->hs_ioc_tag =
+qio_channel_add_watch_full(ioc->master,
+   condition,
+   qio_channel_tls_handshake_io,
+   data,
+   NULL,
+   context);
 }
 }
 
@@ -218,6 +219,7 @@ static gboolean qio_channel_tls_handshake_io(QIOChannel 
*ioc,
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(
 qio_task_get_source(task));
 
+tioc->hs_ioc_tag = 0;
 g_free(data);
 qio_channel_tls_handshake_task(tioc, task, context);
 
@@ -378,6 +380,10 @@ static int qio_channel_tls_close(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);
 
+if (tioc->hs_ioc_tag) {
+g_clear_handle_id(>hs_ioc_tag, g_source_remove);
+}
+
 return qio_channel_close(tioc->master, errp);
 }
 
-- 
2.41.0

Re: [PATCH v2] io: remove io watch if TLS channel is closed during handshake

2023-08-01 Thread Michael Tokarev


12.07.2023 19:55, Daniel P. Berrangé wrote:

The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.

CVE-2023-3354
Reported-by: jiangyegen 
Signed-off-by: Daniel P. Berrangé 


Can we have this in 8.1 please?

What's needed to get this one into 8.1?

/mjt

Re: [RFC PATCH 08/19] HostMem: Add private property to indicate to use kvm gmem

2023-08-01 Thread David Hildenbrand


On 31.07.23 18:21, Xiaoyao Li wrote:

From: Isaku Yamahata 

Signed-off-by: Isaku Yamahata 
Signed-off-by: Xiaoyao Li 
---
  backends/hostmem.c   | 18 ++
  include/sysemu/hostmem.h |  2 +-
  qapi/qom.json|  4 
  3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 747e7838c031..dbdbb0aafd45 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -461,6 +461,20 @@ static void host_memory_backend_set_reserve(Object *o, 
bool value, Error **errp)
  }
  backend->reserve = value;
  }
+
+static bool host_memory_backend_get_private(Object *o, Error **errp)
+{
+HostMemoryBackend *backend = MEMORY_BACKEND(o);
+
+return backend->private;
+}
+
+static void host_memory_backend_set_private(Object *o, bool value, Error 
**errp)
+{
+HostMemoryBackend *backend = MEMORY_BACKEND(o);
+
+backend->private = value;
+}
  #endif /* CONFIG_LINUX */
  
  static bool

@@ -541,6 +555,10 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
  host_memory_backend_get_reserve, host_memory_backend_set_reserve);
  object_class_property_set_description(oc, "reserve",
  "Reserve swap space (or huge pages) if applicable");
+object_class_property_add_bool(oc, "private",
+host_memory_backend_get_private, host_memory_backend_set_private);
+object_class_property_set_description(oc, "private",
+"Use KVM gmem private memory");
  #endif /* CONFIG_LINUX */
  /*
   * Do not delete/rename option. This option must be considered stable
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 39326f1d4f9c..d88970395618 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -65,7 +65,7 @@ struct HostMemoryBackend {
  /* protected */
  uint64_t size;
  bool merge, dump, use_canonical_path;
-bool prealloc, is_mapped, share, reserve;
+bool prealloc, is_mapped, share, reserve, private;
  uint32_t prealloc_threads;
  ThreadContext *prealloc_context;
  DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
diff --git a/qapi/qom.json b/qapi/qom.json
index 7f92ea43e8e1..e0b2044e3d20 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -605,6 +605,9 @@
  # @reserve: if true, reserve swap space (or huge pages) if applicable
  # (default: true) (since 6.1)
  #
+# @private: if true, use KVM gmem private memory
+#   (default: false) (since 8.1)
+#


But that's not what any of this does.

This patch only adds a property and doesn't even explain what it intends 
to achieve with that.


How will it be used from a user? What will it affect internally? What 
will it modify in regards of the memory backend?


That all should go into the surprisingly empty patch description.

--
Cheers,

David / dhildenb

Re: [RFC PATCH 05/19] kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot

2023-08-01 Thread Claudio Fontana

On 7/31/23 18:21, Xiaoyao Li wrote:
> From: Chao Peng 
> 
> Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
> 
> With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that
> backen'ed both by hva-based shared memory and gmem fd based private
> memory.
> 
> Signed-off-by: Chao Peng 
> Codeveloped-by: Xiaoyao Li 
> Signed-off-by: Xiaoyao Li 
> ---
>  accel/kvm/kvm-all.c  | 57 +---
>  accel/kvm/trace-events   |  2 +-
>  include/sysemu/kvm_int.h |  2 ++
>  3 files changed, 51 insertions(+), 10 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index d8eee405de24..7b1818334ba7 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -288,35 +288,68 @@ int kvm_physical_memory_addr_from_host(KVMState *s, 
> void *ram,
>  static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, 
> bool new)
>  {
>  KVMState *s = kvm_state;
> -struct kvm_userspace_memory_region mem;
> +struct kvm_userspace_memory_region2 mem;
> +static int cap_user_memory2 = -1;
>  int ret;
>  
> +if (cap_user_memory2 == -1) {
> +cap_user_memory2 = kvm_check_extension(s, KVM_CAP_USER_MEMORY2);
> +}
> +
> +if (!cap_user_memory2 && slot->fd >= 0) {
> +error_report("%s, KVM doesn't support gmem!", __func__);
> +exit(1);
> +}

We handle this special error case here,
while the existing callers of kvm_set_user_memory_region handle the other error 
cases in different places.

Not that the rest of kvm-all does an excellent job at error handling, but maybe 
we can avoid compounding on the issue.

> +
>  mem.slot = slot->slot | (kml->as_id << 16);
>  mem.guest_phys_addr = slot->start_addr;
>  mem.userspace_addr = (unsigned long)slot->ram;
>  mem.flags = slot->flags;
> +mem.gmem_fd = slot->fd;
> +mem.gmem_offset = slot->ofs;
>  
> -if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & 
> KVM_MEM_READONLY) {
> +if (slot->memory_size && !new && (slot->flags ^ slot->old_flags) & 
> KVM_MEM_READONLY) {

Why the change if mem.flags == slot->flags ?

>  /* Set the slot size to 0 before setting the slot to the desired
>   * value. This is needed based on KVM commit 75d61fbc. */
>  mem.memory_size = 0;
> -ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, );
> +
> +if (cap_user_memory2) {
> +ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, );
> +} else {
> +ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, );
> + }
>  if (ret < 0) {
>  goto err;
>  }
>  }
>  mem.memory_size = slot->memory_size;
> -ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, );
> +if (cap_user_memory2) {
> +ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION2, );
> +} else {
> +ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, );
> +}
>  slot->old_flags = mem.flags;
>  err:
>  trace_kvm_set_user_memory(mem.slot >> 16, (uint16_t)mem.slot, mem.flags,
>mem.guest_phys_addr, mem.memory_size,
> -  mem.userspace_addr, ret);
> +  mem.userspace_addr, mem.gmem_fd,
> +   mem.gmem_offset, ret);
>  if (ret < 0) {
> -error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
> - " start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
> - __func__, mem.slot, slot->start_addr,
> - (uint64_t)mem.memory_size, strerror(errno));
> +if (cap_user_memory2) {
> +error_report("%s: KVM_SET_USER_MEMORY_REGION2 failed, 
> slot=%d,"
> +" start=0x%" PRIx64 ", size=0x%" PRIx64 ","
> +" flags=0x%" PRIx32 ","
> +" gmem_fd=%" PRId32 ", gmem_offset=0x%" PRIx64 ": 
> %s",
> +__func__, mem.slot, slot->start_addr,
> +(uint64_t)mem.memory_size, mem.flags,
> +mem.gmem_fd, (uint64_t)mem.gmem_offset,
> +strerror(errno));
> +} else {
> +error_report("%s: KVM_SET_USER_MEMORY_REGION failed, 
> slot=%d,"
> +" start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
> +__func__, mem.slot, slot->start_addr,
> +(uint64_t)mem.memory_size, strerror(errno));
> +}
>  }
>  return ret;
>  }
> @@ -472,6 +505,9 @@ static int kvm_mem_flags(MemoryRegion *mr)
>  if (readonly && kvm_readonly_mem_allowed) {
>  flags |= KVM_MEM_READONLY;
>  }
> +if (memory_region_can_be_private(mr)) {
> +flags |= KVM_MEM_PRIVATE;
> +}
>  return flags;
>  }
>  
> @@ -1402,6 +1438,9 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
>  mem->ram_start_offset = ram_start_offset;
>  mem->ram = ram;
>

Re: [PATCH] migration/calc-dirty-rate: millisecond precision period

2023-08-01 Thread Peter Xu

On Thu, Jun 29, 2023 at 11:59:03AM +0300, Andrei Gudkov wrote:
> Introduces alternative argument calc-time-ms, which is the
> the same as calc-time but accepts millisecond value.
> Millisecond precision allows to make predictions whether
> migration will succeed or not. To do this, calculate dirty
> rate with calc-time-ms set to max allowed downtime, convert
> measured rate into volume of dirtied memory, and divide by
> network throughput. If the value is lower than max allowed
> downtime, then migration will converge.
> 
> Measurement results for single thread randomly writing to
> a 24GiB region:
> +--++
> | calc-time-ms | dirty-rate (MiB/s) |
> +--++
> |  100 |   1880 |
> |  200 |   1340 |
> |  300 |   1120 |
> |  400 |   1030 |
> |  500 |868 |
> |  750 |720 |
> | 1000 |636 |
> | 1500 |498 |
> | 2000 |423 |
> +--++
> 
> Signed-off-by: Andrei Gudkov 

Andrei, do you plan to enhance the commit message and data in a repost?  I
assume you may want to have your data points updated after the discussion,
and it won't need to be in a rush as it will only land 8.2.

The patch itself looks fine to me:

Acked-by: Peter Xu 

Thanks,

-- 
Peter Xu

Re: [PATCH v5 01/11] hw: arm: Add bananapi M2-Ultra and allwinner-r40 support

2023-08-01 Thread Guenter Roeck


On 8/1/23 09:01, Peter Maydell wrote:

On Sat, 24 Jun 2023 at 16:02, Guenter Roeck  wrote:


On 6/24/23 07:23, Guenter Roeck wrote:

On 6/24/23 03:40, Peter Maydell wrote:

On Fri, 23 Jun 2023 at 20:33, Guenter Roeck  wrote:


On 6/23/23 10:44, Peter Maydell wrote:

On Sat, 17 Jun 2023 at 17:29, Guenter Roeck  wrote:

Main problem is that the SD card gets instantiated randomly to
mmc0, mmc1, or mmc2, making it all but impossible to specify a
root file system device. The non-instantiated cards are always
reported as non-removable, including mmc0. Example:

mmc0: Failed to initialize a non-removable card


Do you mean that QEMU randomly connects the SD card to
a different MMC controller each time, or that Linux is
randomly assigning mmc0 to a different MMC controller each
time ?



Good question. Given the workaround (fix ?) I suggested is
in the devicetree file, I would assume it is the latter. I suspect
that Linux assigns drive names based on hardware detection order,
and that this is not deterministic for some reason. It is odd
because I have never experienced that with any other emulation.


Yeah, I don't really understand why it would be non-deterministic.
But it does make it sound like the right thing is for the
device tree file to explicitly say which MMC controller is
which -- presumably you might get unlucky with the timing
on real hardware too.



Agreed, only someone with real hardware would have to confirm
that this is the case.



Actually, the reason is quite simple. In the Linux kernel:

static struct platform_driver sunxi_mmc_driver = {
  .driver = {
  .name   = "sunxi-mmc",
  .probe_type = PROBE_PREFER_ASYNCHRONOUS,
^
  .of_match_table = sunxi_mmc_of_match,
  .pm = _mmc_pm_ops,
  },
  .probe  = sunxi_mmc_probe,
  .remove = sunxi_mmc_remove,
};

All mmc devices instantiate at the same time, thus the
device name association is random. If I drop the probe_type
assignment, it becomes deterministic.

On top of that, Linux does know which drives are removable
from the devicetree file. However, since probe order is
random, the assignment of the one removable drive to device
names is random. Sometimes mmc0 shows up as removable,
sometimes it is mmc1 or mmc2.

So my conclusion is that qemu isn't doing anything wrong,
it is all happening in the Linux kernel.


Hi Guenter -- do you know if this "random MMC controller"
issue has been fixed in Linux ? If so, we might be able
to update our test case image to avoid the slightly ugly
"root=b300" workaround at some point.



No, it has not been fixed, or at least there is nothing in linux-next.
I don't have real hardware, so I am not in a position to submit, much
less test, a patch on it. Someone had mentioned that real hardware would
handle the problem in initramfs. That seems wrong to me, but it is what
it is.

I changed my own test to use the "root=b300" hack. That seems highly kludgy,
but at least it works.

Guenter

Re: [Qemu-devel] [PATCH] acpi: Add emulated sleep button

2023-08-01 Thread Annie.li


Hi Igor,

On 7/14/2023 10:04 AM, Igor Mammedov wrote:

On Fri, 7 Jul 2023 13:43:36 -0400
"Annie.li"  wrote:


Hi Igor,

Revisiting this thread and have more questions, please clarify, thank you!

On 9/20/2021 3:53 AM, Igor Mammedov wrote:

On Fri, 6 Aug 2021 16:18:09 -0400
"Annie.li"  wrote:
  

Hello Igor,

This is an old patch, but it does what we need.
I am getting a little bit lost about not implementing fixed hardware
sleep button, can you please clarify? thank you!

On 7/20/2017 10:59 AM, Igor Mammedov wrote:

On Thu, 20 Jul 2017 11:31:26 +0200
Stefan Fritsch  wrote:
 

From: Stefan Fritsch 

Add an ACPI sleep button and QMP/HMP commands to trigger it.  A sleep
button is a so called "fixed hardware feature", which makes it more
suitable for putting the system to sleep than a laptop lid, for example.

The sleep button is disabled by default (Bit 5 in the FACP flags
register set and no button "device" present in SSDT/DSDT). Clearing said
bit enables it as a fixed feature device.

per spec sleep button is used for both putting system into
sleep and for waking it up.

Reusing system_wakeup 'button' to behave as per spec would
make this patch significantly smaller.

Current 'system_wakeup' sets the WAK_STS bit and PWRBTN_STS to wake up
the system, the system_wakeup 'button' is the power button. So(Correct me
if I am wrong) reusing the system_wakeup 'button' means reusing the power
button for sleep. See the following code of setting WAK_STS and PWRBTN_STS
for 'system_wakeup',
      case QEMU_WAKEUP_REASON_OTHER:
      /* ACPI_BITMASK_WAKE_STATUS should be set on resume.
     Pretend that resume was caused by power button */
      ar->pm1.evt.sts |=
      (ACPI_BITMASK_WAKE_STATUS | ACPI_BITMASK_POWER_BUTTON_STATUS);

(that's quite ancient code (0bacd1300d) and I couldn't find a reason why
power button was chosen.
  
https://urldefense.com/v3/__https://www.mail-archive.com/kvm@vger.kernel.org/msg05742.html__;!!ACWV5N9M2RV99hQ!PQp9_UyWYc4gTuTwNUiyPgE0Xwinlsi8F-J6zWOA8KRLUh0EIv68g01XQQrFzKipeZbe-vhHfpGZBb0$
that was tested with WinXP so would be wise to check if SLEEP button
works there. (though I'm not sure if we still care for XP being runnable on 
QEMU)
)


don't have WinXP at hand to check for now...
Microsoft has ended the WinXP support since Apr 8, 2014.
If someone is still running WinXP, not sure if they care about
the sleep button.


it doesn't have to be ACPI_BITMASK_POWER_BUTTON_STATUS,
you can convert it to sleep button by using ACPI_BITMASK_SLEEP_BUTTON_STATUS.
However you'll have to enable sleep button (which is disabled currently)
in FADT (ACPI_FADT_F_SLP_BUTTON), for guest to see the button.

Agree
Per ACPI spec, SLPBTN_STS -

"This optional bit is set when the sleep button is pressed. In the system
working state, while SLPBTN_EN and SLPBTN_STS are both set, an
interrupt event is raised. In the sleep or soft-off states a wake event is
generated when the sleeping button is pressed and the SLPBTN_EN bit
is set."

Using ACPI_BITMASK_SLEEP_BUTTON_STATUS means qemu ends up with
supporting the sleep button.  With this, implementing the fixed
hardware sleep button(what this patch does) is one option. The interesting
topic is whether it should be implemented as General Control sleep button
or fixed hardware button.




      break;

Per the ACPI spec, the power button can be used in single button model, i.e.
it can be used to either shut down the system or put the system into sleep.
However, this depends on the software policy and user's setting of the
system.
Sleep/shutdown can not be done through the power button in the same
scenario.
If the system has configured pressing the power button to put the system
into sleep,
system_sleep will put the system into sleep state through the power
button, as well
as system_powerdown. Pressing the power button will not shut down the
system.
In this case, system_powerdown has its own issue, but this is different
story and
let's just focus on things related to system_sleep here.

regardless of what button is used OSPM is free to enable or disable it
using FOO_EN bits.


Nod
The system_powerdown issue I mentioned above is -
If the guest is configured to go into sleep when the power button is
pressed, system_powerdown will trigger the guest to go into sleep state.
However, system_wakeup won't be able to wake up the guest in such case.
Looks the current qemu doesn't handle this scenario properly.

About reusing "system_wakeup", does it mean the following?

1. when guest is in sleep state, "system_wakeup" wakes up the guest
2. when guest is running, "system_wakeup" puts the guest into sleep

yes,  it could be something like this

  

"system_wakeup" sets WAK_STS and then system transitions to the
working state. Correspondingly, I suppose both SLPBTN_STS and
SLPBTN_EN need to be set for sleeping, and this is what fixed
hardware sleep button requires?

yep
 

I have combined the sleep and wakeup together, share the
code between. But

Re: [PATCH] migration/calc-dirty-rate: millisecond precision period

2023-08-01 Thread Peter Xu

On Tue, Aug 01, 2023 at 05:55:29PM +0300, gudkov.and...@huawei.com wrote:
> Hmmm, such underestimation looks strange to me. I am willing to test
> page-sampling and see whether its quality can be improved. Do you have
> any specific suggestions on the application to use as a workload?

I could have had a wrong impression here, sorry.

I played again with the page sampling approach, and that's actually pretty
decent..

I had that impression probably based on the fact that by default we chose 2
pages out of 1000-ish (consider workloads having 100-ish memory updates
where no sample page falls into it), and I do remember in some cases of my
test setups quite some time ago, it shows totally wrong numbers. But maybe
I had a wrong test, or wrong memory.

Now thinking about it, for random/seq on not so small memory ranges, that
seems to all work.  For very small ones spread over it goes to the random
case.

> 
> If it turns out that page-sampling is not an option, then performance
> impact of the dirty-bitmap must be improved somehow. Maybe it makes
> sense to split memory into 4GiB chunks and measure dirty page rate
> independently for each of the chunks (without enabling page
> protections for memory outside of the currently processed chunk).
> But the downsides are that 1) total measurement time will increase
> proportionally by number of chunks 2) dirty page rate will be
> overestimated.
> 
> But actually I am still hoping on page sampling. Since my goal is to
> roughly predict what can be migrated and what cannot be, I would prefer
> to keep predictor as lite as possible, even at the cost of
> (overestimation) error.

Yes I also hope that works as you said.

Thanks,

-- 
Peter Xu

Re: [RFC PATCH 04/19] memory: Introduce memory_region_can_be_private()

2023-08-01 Thread Claudio Fontana

On 8/1/23 18:48, Claudio Fontana wrote:
> On 7/31/23 18:21, Xiaoyao Li wrote:
>> Signed-off-by: Xiaoyao Li 
>> ---
>>  include/exec/memory.h | 9 +
>>  softmmu/memory.c  | 5 +
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 61e31c7b9874..e119d3ce1a1d 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -1679,6 +1679,15 @@ static inline bool memory_region_is_romd(MemoryRegion 
>> *mr)
>>   */
>>  bool memory_region_is_protected(MemoryRegion *mr);
>>  
>> +/**
>> + * memory_region_can_be_private: check whether a memory region can be 
>> private
> 
> The name of the function is not particularly informative,
> 
>> + *
>> + * Returns %true if a memory region's ram_block has valid gmem fd assigned.
> 
> but in your comment you describe more accurately what it does, why not make 
> it the function name?
> 
> bool memory_region_has_valid_gmem_fd()


btw can a memory region have an invalid gmem_fd ?

If an invalid gmem_fd is just used to mark whether gmem_fd is present or not,

we could make it just:

bool memory_region_has_gmem_fd()


Thanks,

C

> 
>> + *
>> + * @mr: the memory region being queried
>> + */
>> +bool memory_region_can_be_private(MemoryRegion *mr);
> 
> 
> bool memory_region_has_valid_gmem_fd()
> 
> 
> Thanks,
> 
> C
> 
>> +
>>  /**
>>   * memory_region_get_iommu: check whether a memory region is an iommu
>>   *
>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>> index 4f8f8c0a02e6..336c76ede660 100644
>> --- a/softmmu/memory.c
>> +++ b/softmmu/memory.c
>> @@ -1855,6 +1855,11 @@ bool memory_region_is_protected(MemoryRegion *mr)
>>  return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
>>  }
>>  
>> +bool memory_region_can_be_private(MemoryRegion *mr)
>> +{
>> +return mr->ram_block && mr->ram_block->gmem_fd >= 0;
>> +}
>> +
>>  uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
>>  {
>>  uint8_t mask = mr->dirty_log_mask;
>

Re: [RFC PATCH 04/19] memory: Introduce memory_region_can_be_private()

2023-08-01 Thread Claudio Fontana

On 7/31/23 18:21, Xiaoyao Li wrote:
> Signed-off-by: Xiaoyao Li 
> ---
>  include/exec/memory.h | 9 +
>  softmmu/memory.c  | 5 +
>  2 files changed, 14 insertions(+)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 61e31c7b9874..e119d3ce1a1d 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -1679,6 +1679,15 @@ static inline bool memory_region_is_romd(MemoryRegion 
> *mr)
>   */
>  bool memory_region_is_protected(MemoryRegion *mr);
>  
> +/**
> + * memory_region_can_be_private: check whether a memory region can be private

The name of the function is not particularly informative,

> + *
> + * Returns %true if a memory region's ram_block has valid gmem fd assigned.

but in your comment you describe more accurately what it does, why not make it 
the function name?

bool memory_region_has_valid_gmem_fd()

> + *
> + * @mr: the memory region being queried
> + */
> +bool memory_region_can_be_private(MemoryRegion *mr);


bool memory_region_has_valid_gmem_fd()


Thanks,

C

> +
>  /**
>   * memory_region_get_iommu: check whether a memory region is an iommu
>   *
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 4f8f8c0a02e6..336c76ede660 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -1855,6 +1855,11 @@ bool memory_region_is_protected(MemoryRegion *mr)
>  return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
>  }
>  
> +bool memory_region_can_be_private(MemoryRegion *mr)
> +{
> +return mr->ram_block && mr->ram_block->gmem_fd >= 0;
> +}
> +
>  uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
>  {
>  uint8_t mask = mr->dirty_log_mask;

Re: [RFC PATCH 03/19] RAMBlock: Support KVM gmemory

2023-08-01 Thread David Hildenbrand


On 31.07.23 18:21, Xiaoyao Li wrote:

From: Chao Peng 

Add KVM gmem support to RAMBlock so we can have both normal
hva based memory and gmem fd based memory in one RAMBlock.

The gmem part is represented by the gmem_fd.

Signed-off-by: Chao Peng 
Signed-off-by: Xiaoyao Li 


So, *someone* creates a RAMBlock in QEMU.

Who'll squeeze a gmem_fd in there? When? How?

Shouldn't we create the RAM memory region / RAMBlock already with the 
gmem_fd, and set that when initializing these things?


--
Cheers,

David / dhildenb

Re: [RFC PATCH 04/19] memory: Introduce memory_region_can_be_private()

2023-08-01 Thread Sean Christopherson

On Mon, Jul 31, 2023, Peter Xu wrote:
> On Mon, Jul 31, 2023 at 05:36:37PM -0400, Michael S. Tsirkin wrote:
> > On Mon, Jul 31, 2023 at 02:34:22PM -0700, Sean Christopherson wrote:
> > > On Mon, Jul 31, 2023, Peter Xu wrote:
> > > > On Mon, Jul 31, 2023 at 12:21:46PM -0400, Xiaoyao Li wrote:
> > > > > +bool memory_region_can_be_private(MemoryRegion *mr)
> > > > > +{
> > > > > +return mr->ram_block && mr->ram_block->gmem_fd >= 0;
> > > > > +}
> > > > 
> > > > This is not really MAP_PRIVATE, am I right?  If so, is there still 
> > > > chance
> > > > we rename it (it seems to be also in the kernel proposal all across..)?
> > > 
> > > Yes and yes.
> > > 
> > > > I worry it can be very confusing in the future against MAP_PRIVATE /
> > > > MAP_SHARED otherwise.
> > > 
> > > Heh, it's already quite confusing at times.  I'm definitely open to 
> > > naming that
> > > doesn't collide with MAP_{PRIVATE,SHARED}, especially if someone can come 
> > > with a
> > > naming scheme that includes a succinct way to describe memory that is 
> > > shared
> > > between two or more VMs, but is accessible to _only_ those VMs.
> > 
> > Standard solution is a technology specific prefix.
> > protect_shared, encrypt_shared etc.
> 
> Agreed, a prefix could definitely help (if nothing better comes at last..).
> If e.g. "encrypted" too long to be applied everywhere in var names and
> functions, maybe it can also be "enc_{private|shared}".

FWIW, I would stay away from "encrypted", there is no requirement that the 
memory
actually be encrypted.

Re: [PATCH] linux-user: Fix openat() emulation to correctly detect accesses to /proc

2023-08-01 Thread Daniel P . Berrangé

On Tue, Aug 01, 2023 at 04:58:57PM +0200, Helge Deller wrote:
> In qemu we catch accesses to files like /proc/cpuinfo or /proc/net/route
> and return to the guest contents which would be visible on a real system
> (instead what the host would show).
> 
> This patch fixes a bug, where for example the accesses
> cat /proccpuinfo
> or
> cd /proc && cat cpuinfo
> will not be recognized by qemu and where qemu will wrongly show
> the contents of the host's /proc/cpuinfo file.
> 
> Signed-off-by: Helge Deller 
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 917c388073..bb864c2bb3 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8531,9 +8531,11 @@ static int open_cpuinfo(CPUArchState *cpu_env, int fd)
>  }
>  #endif
> 
> -int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *pathname,
> +int do_guest_openat(CPUArchState *cpu_env, int dirfd, const char *fname,
>  int flags, mode_t mode, bool safe)
>  {
> +char proc_name[PATH_MAX];

No PATH_MAX buffers declared on the stack please.

> +const char *pathname;
>  struct fake_open {
>  const char *filename;
>  int (*fill)(CPUArchState *cpu_env, int fd);
> @@ -8560,6 +8562,13 @@ int do_guest_openat(CPUArchState *cpu_env, int dirfd, 
> const char *pathname,
>  { NULL, NULL, NULL }
>  };
> 
> +/* if this is a file from /proc/ filesystem, expand full name */
> +if (realpath(fname, proc_name) && strncmp(proc_name, "/proc/", 6) == 0) {

QEMU relies on the extended semantics of realpath() where passing NULL
for 'proc_name' causes it to allocate a buffer of the correct size and
return the new buffer. Using that avoids PATH_MAX variables.

> +pathname = proc_name;
> +} else {
> +pathname = fname;
> +}

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[PATCH 1/2] block/blkio: close the fd when blkio_connect() fails

2023-08-01 Thread Stefano Garzarella

libblkio drivers take ownership of `fd` only after a successful
blkio_connect(), so if it fails, we are still the owners.

Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for 
virtio-blk")
Suggested-by: Hanna Czenczek 
Signed-off-by: Stefano Garzarella 
---
 block/blkio.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/block/blkio.c b/block/blkio.c
index 8e7ce42c79..2d53a865e7 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -739,6 +739,7 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, 
QDict *options,
  * directly setting `path`.
  */
 if (fd_supported && ret == -EINVAL) {
+fd_supported = false;
 qemu_close(fd);
 
 /*
@@ -763,6 +764,14 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, 
QDict *options,
 }
 
 if (ret < 0) {
+if (fd_supported) {
+/*
+ * libblkio drivers take ownership of `fd` only after a successful
+ * blkio_connect(), so if it fails, we are still the owners.
+ */
+qemu_close(fd);
+}
+
 error_setg_errno(errp, -ret, "blkio_connect failed: %s",
  blkio_get_error_msg());
 return ret;
-- 
2.41.0

[PATCH 0/2] block/blkio: fix fd leak and add more comments for the fd passing

2023-08-01 Thread Stefano Garzarella

Hanna discovered an fd leak in the error path, and a few comments to
improve in the code.

Stefano Garzarella (2):
  block/blkio: close the fd when blkio_connect() fails
  block/blkio: add more comments on the fd passing handling

 block/blkio.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

-- 
2.41.0

[PATCH 2/2] block/blkio: add more comments on the fd passing handling

2023-08-01 Thread Stefano Garzarella

As Hanna pointed out, it is not clear in the code why qemu_open()
can fail, and why blkio_set_int("fd") is not enough to discover
the `fd` property support.

Let's fix them by adding more details in the code comments.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefano Garzarella 
---
 block/blkio.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/block/blkio.c b/block/blkio.c
index 2d53a865e7..848b8189d0 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -713,6 +713,12 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, 
QDict *options,
  */
 fd = qemu_open(path, O_RDWR, NULL);
 if (fd < 0) {
+/*
+ * qemu_open() can fail if the user specifies a path that is not
+ * a file or device, for example in the case of Unix Domain Socket
+ * for the virtio-blk-vhost-user driver. In such cases let's have
+ * libblkio open the path directly.
+ */
 fd_supported = false;
 } else {
 ret = blkio_set_int(s->blkio, "fd", fd);
@@ -734,9 +740,12 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, 
QDict *options,
 
 ret = blkio_connect(s->blkio);
 /*
- * If the libblkio driver doesn't support the `fd` property, 
blkio_connect()
- * will fail with -EINVAL. So let's try calling blkio_connect() again by
- * directly setting `path`.
+ * Before https://gitlab.com/libblkio/libblkio/-/merge_requests/208
+ * (libblkio <= v1.3.0), setting the `fd` property is not enough to check
+ * whether the driver supports the `fd` property or not. In that case,
+ * blkio_connect() will fail with -EINVAL.
+ * So let's try calling blkio_connect() again by directly setting `path`
+ * to cover this scenario.
  */
 if (fd_supported && ret == -EINVAL) {
 fd_supported = false;
-- 
2.41.0

Re: [PATCH v5 01/11] hw: arm: Add bananapi M2-Ultra and allwinner-r40 support

2023-08-01 Thread Peter Maydell

On Sat, 24 Jun 2023 at 16:02, Guenter Roeck  wrote:
>
> On 6/24/23 07:23, Guenter Roeck wrote:
> > On 6/24/23 03:40, Peter Maydell wrote:
> >> On Fri, 23 Jun 2023 at 20:33, Guenter Roeck  wrote:
> >>>
> >>> On 6/23/23 10:44, Peter Maydell wrote:
>  On Sat, 17 Jun 2023 at 17:29, Guenter Roeck  wrote:
> > Main problem is that the SD card gets instantiated randomly to
> > mmc0, mmc1, or mmc2, making it all but impossible to specify a
> > root file system device. The non-instantiated cards are always
> > reported as non-removable, including mmc0. Example:
> >
> > mmc0: Failed to initialize a non-removable card
> 
>  Do you mean that QEMU randomly connects the SD card to
>  a different MMC controller each time, or that Linux is
>  randomly assigning mmc0 to a different MMC controller each
>  time ?
> 
> >>>
> >>> Good question. Given the workaround (fix ?) I suggested is
> >>> in the devicetree file, I would assume it is the latter. I suspect
> >>> that Linux assigns drive names based on hardware detection order,
> >>> and that this is not deterministic for some reason. It is odd
> >>> because I have never experienced that with any other emulation.
> >>
> >> Yeah, I don't really understand why it would be non-deterministic.
> >> But it does make it sound like the right thing is for the
> >> device tree file to explicitly say which MMC controller is
> >> which -- presumably you might get unlucky with the timing
> >> on real hardware too.
> >>
> >
> > Agreed, only someone with real hardware would have to confirm
> > that this is the case.
> >
>
> Actually, the reason is quite simple. In the Linux kernel:
>
> static struct platform_driver sunxi_mmc_driver = {
>  .driver = {
>  .name   = "sunxi-mmc",
>  .probe_type = PROBE_PREFER_ASYNCHRONOUS,
>^
>  .of_match_table = sunxi_mmc_of_match,
>  .pm = _mmc_pm_ops,
>  },
>  .probe  = sunxi_mmc_probe,
>  .remove = sunxi_mmc_remove,
> };
>
> All mmc devices instantiate at the same time, thus the
> device name association is random. If I drop the probe_type
> assignment, it becomes deterministic.
>
> On top of that, Linux does know which drives are removable
> from the devicetree file. However, since probe order is
> random, the assignment of the one removable drive to device
> names is random. Sometimes mmc0 shows up as removable,
> sometimes it is mmc1 or mmc2.
>
> So my conclusion is that qemu isn't doing anything wrong,
> it is all happening in the Linux kernel.

Hi Guenter -- do you know if this "random MMC controller"
issue has been fixed in Linux ? If so, we might be able
to update our test case image to avoid the slightly ugly
"root=b300" workaround at some point.

thanks
-- PMM

[PATCH] target/m68k: Add URL to semihosting spec

2023-08-01 Thread Peter Maydell

The spec for m68k semihosting is documented in the libgloss
sources. Add a comment with the URL for it, as we already
have for nios2 semihosting.

Signed-off-by: Peter Maydell 
---
 target/m68k/m68k-semi.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 239f6e44e90..80cd8d70dbb 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -15,6 +15,10 @@
  *
  *  You should have received a copy of the GNU General Public License
  *  along with this program; if not, see .
+ *
+ *  The semihosting protocol implemented here is described in the
+ *  libgloss sources:
+ *  
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=libgloss/m68k/m68k-semi.txt;hb=HEAD
  */
 
 #include "qemu/osdep.h"
-- 
2.34.1

[PATCH for-8.1] target/m68k: Fix semihost lseek offset computation

2023-08-01 Thread Peter Maydell

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Cc: qemu-sta...@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
---
Same fix for m68k as Keith Packard just sent for nios2
---
 target/m68k/m68k-semi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 88ad9ba8144..239f6e44e90 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -166,7 +166,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
 GET_ARG64(2);
 GET_ARG64(3);
 semihost_sys_lseek(cs, m68k_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
 break;
 
 case HOSTED_RENAME:
-- 
2.34.1

Re: qemu-img cache modes with Linux cgroup v1

2023-08-01 Thread Stefan Hajnoczi

Hi Daniel,
I agree with your points.

Stefan


signature.asc
Description: PGP signature

1 2 >

1 - 100 of 187 matches

Mail list logo