date:20240205

Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump

2024-02-05 Thread Zhang Wen

> So how the problem manifests itself?  What the actual error message is?
> You mentioned some segfault iirc, care to share some details?
I'm doing a rebuild, and will attach the error message and coredump
file here in one or two days.

-- 
Key fingerprint:
419F 72F3 F3A9 36EE 1B72  B00B C1C3 4BD4 FDA3 362F

Re: [PATCH 1/5] virtio-blk: enforce iothread-vq-mapping validation

2024-02-05 Thread Manos Pitsidianakis


On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi  wrote:

Hanna Czenczek  noticed that the safety of
`vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is
not obvious.

The code is structured in validate() + apply() steps so input validation
is there, but it happens way earlier and there is nothing that
guarantees apply() can only be called with validated inputs.

This patch moves the validate() call inside the apply() function so
validation is guaranteed. I also added the bounds checking assertion
that Hanna suggested.

Signed-off-by: Stefan Hajnoczi 
---
hw/block/virtio-blk.c | 192 +++---
1 file changed, 107 insertions(+), 85 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 227d83569f..e8b37fd5f4 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1485,6 +1485,72 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
return 0;
}

+static void virtio_resize_cb(void *opaque)
+{
+VirtIODevice *vdev = opaque;
+
+assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+virtio_notify_config(vdev);
+}
+
+static void virtio_blk_resize(void *opaque)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+
+/*
+ * virtio_notify_config() needs to acquire the BQL,
+ * so it can't be called from an iothread. Instead, schedule
+ * it to be run in the main context BH.
+ */
+aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev);
+}
+
+static void virtio_blk_ioeventfd_detach(VirtIOBlock *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+for (uint16_t i = 0; i < s->conf.num_queues; i++) {
+VirtQueue *vq = virtio_get_queue(vdev, i);
+virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]);
+}
+}
+
+static void virtio_blk_ioeventfd_attach(VirtIOBlock *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+for (uint16_t i = 0; i < s->conf.num_queues; i++) {
+VirtQueue *vq = virtio_get_queue(vdev, i);
+virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]);
+}
+}
+
+/* Suspend virtqueue ioeventfd processing during drain */
+static void virtio_blk_drained_begin(void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+if (s->ioeventfd_started) {
+virtio_blk_ioeventfd_detach(s);
+}
+}
+
+/* Resume virtqueue ioeventfd processing after drain */
+static void virtio_blk_drained_end(void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+if (s->ioeventfd_started) {
+virtio_blk_ioeventfd_attach(s);
+}
+}
+
+static const BlockDevOps virtio_block_ops = {
+.resize_cb = virtio_blk_resize,
+.drained_begin = virtio_blk_drained_begin,
+.drained_end   = virtio_blk_drained_end,
+};
+
static bool
validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list,
uint16_t num_queues, Error **errp)
@@ -1547,81 +1613,33 @@ 
validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list,
return true;
}

-static void virtio_resize_cb(void *opaque)
-{
-VirtIODevice *vdev = opaque;
-
-assert(qemu_get_current_aio_context() == qemu_get_aio_context());
-virtio_notify_config(vdev);
-}
-
-static void virtio_blk_resize(void *opaque)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
-
-/*
- * virtio_notify_config() needs to acquire the BQL,
- * so it can't be called from an iothread. Instead, schedule
- * it to be run in the main context BH.
- */
-aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev);
-}
-
-static void virtio_blk_ioeventfd_detach(VirtIOBlock *s)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-
-for (uint16_t i = 0; i < s->conf.num_queues; i++) {
-VirtQueue *vq = virtio_get_queue(vdev, i);
-virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]);
-}
-}
-
-static void virtio_blk_ioeventfd_attach(VirtIOBlock *s)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-
-for (uint16_t i = 0; i < s->conf.num_queues; i++) {
-VirtQueue *vq = virtio_get_queue(vdev, i);
-virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]);
-}
-}
-
-/* Suspend virtqueue ioeventfd processing during drain */
-static void virtio_blk_drained_begin(void *opaque)
-{
-VirtIOBlock *s = opaque;
-
-if (s->ioeventfd_started) {
-virtio_blk_ioeventfd_detach(s);
-}
-}
-
-/* Resume virtqueue ioeventfd processing after drain */
-static void virtio_blk_drained_end(void *opaque)
-{
-VirtIOBlock *s = opaque;
-
-if (s->ioeventfd_started) {
-virtio_blk_ioeventfd_attach(s);
-}
-}
-
-static const BlockDevOps virtio_block_ops = {
-.resize_cb = virtio_blk_resize,
-.drained_begin = virtio_blk_drained_begin,
-.drained_end   = virtio_blk_drained_end,
-};
-
-/* Generate vq:AioContext mappings from a validated iothread-vq-mapping list */
-static void
-apply_vq_mapping(IOThreadVirtQueueMappingList *iothread_vq_mapping_list,
-

Re: [PATCH 5/5] monitor: use aio_co_reschedule_self()

2024-02-05 Thread Manos Pitsidianakis


On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi  wrote:

The aio_co_reschedule_self() API is designed to avoid the race
condition between scheduling the coroutine in another AioContext and
yielding.

The QMP dispatch code uses the open-coded version that appears
susceptible to the race condition at first glance:

 aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
 qemu_coroutine_yield();

The code is actually safe because the iohandler and qemu_aio_context
AioContext run under the Big QEMU Lock. Nevertheless, set a good example
and use aio_co_reschedule_self() so it's obvious that there is no race.

Suggested-by: Hanna Reitz 
Signed-off-by: Stefan Hajnoczi 
---
qapi/qmp-dispatch.c | 7 ++-
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 176b549473..f3488afeef 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -212,8 +212,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
 * executing the command handler so that it can make progress if it
 * involves an AIO_WAIT_WHILE().
 */
-aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
-qemu_coroutine_yield();
+aio_co_reschedule_self(qemu_get_aio_context());
}

monitor_set_cur(qemu_coroutine_self(), cur_mon);
@@ -227,9 +226,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
 * Move back to iohandler_ctx so that nested event loops for
 * qemu_aio_context don't start new monitor commands.
 */
-aio_co_schedule(iohandler_get_aio_context(),
-qemu_coroutine_self());
-qemu_coroutine_yield();
+aio_co_reschedule_self(iohandler_get_aio_context());
}
} else {
   /*
--
2.43.0




Reviewed-by: Manos Pitsidianakis

Re: [PATCH 2/5] virtio-blk: clarify that there is at least 1 virtqueue

2024-02-05 Thread Manos Pitsidianakis


On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi  wrote:

It is not possible to instantiate a virtio-blk device with 0 virtqueues.
The following check is located in ->realize():

 if (!conf->num_queues) {
 error_setg(errp, "num-queues property must be larger than 0");
 return;
 }

Later on we access s->vq_aio_context[0] under the assumption that there
is as least one virtqueue. Hanna Czenczek  noted that
it would help to show that the array index is already valid.

Add an assertion to document that s->vq_aio_context[0] is always
safe...and catch future code changes that break this assumption.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
hw/block/virtio-blk.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e8b37fd5f4..a0735a9bca 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1825,6 +1825,7 @@ static int virtio_blk_start_ioeventfd(VirtIODevice *vdev)
 * Try to change the AioContext so that block jobs and other operations can
 * co-locate their activity in the same AioContext. If it fails, nevermind.
 */
+assert(nvqs > 0); /* enforced during ->realize() */
r = blk_set_aio_context(s->conf.conf.blk, s->vq_aio_context[0],
_err);
if (r < 0) {
--
2.43.0



Reviewed-by: Manos Pitsidianakis

[PULL v3 3/3] oslib-posix: initialize backend memory objects in parallel

2024-02-05 Thread David Hildenbrand

From: Mark Kanda 

QEMU initializes preallocated backend memory as the objects are parsed from
the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in series.

Allow the initialization to occur in parallel (asynchronously). In order to
ensure optimal thread placement, asynchronous initialization requires prealloc
context threads to be in use.

Signed-off-by: Mark Kanda 
Message-ID: <20240131165327.3154970-2-mark.ka...@oracle.com>
Tested-by: Mario Casquero 
Signed-off-by: David Hildenbrand 
---
 backends/hostmem.c |   7 ++-
 hw/virtio/virtio-mem.c |   4 +-
 include/hw/qdev-core.h |   5 ++
 include/qemu/osdep.h   |  18 +-
 system/vl.c|   9 +++
 util/oslib-posix.c | 131 +++--
 util/oslib-win32.c |   8 ++-
 7 files changed, 145 insertions(+), 37 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 987f6f591e..81a72ce40b 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -20,6 +20,7 @@
 #include "qom/object_interfaces.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/madvise.h"
+#include "hw/qdev-core.h"
 
 #ifdef CONFIG_NUMA
 #include 
@@ -237,7 +238,7 @@ static void host_memory_backend_set_prealloc(Object *obj, 
bool value,
 uint64_t sz = memory_region_size(>mr);
 
 if (!qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
-   backend->prealloc_context, errp)) {
+   backend->prealloc_context, false, errp)) {
 return;
 }
 backend->prealloc = true;
@@ -323,6 +324,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
 void *ptr;
 uint64_t sz;
+bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED);
 
 if (!bc->alloc) {
 return;
@@ -402,7 +404,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 if (backend->prealloc && 
!qemu_prealloc_mem(memory_region_get_fd(>mr),
 ptr, sz,
 backend->prealloc_threads,
-backend->prealloc_context, 
errp)) {
+backend->prealloc_context,
+async, errp)) {
 return;
 }
 }
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 99ab989852..ffd119ebac 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -605,7 +605,7 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, 
uint64_t start_gpa,
 int fd = memory_region_get_fd(>memdev->mr);
 Error *local_err = NULL;
 
-if (!qemu_prealloc_mem(fd, area, size, 1, NULL, _err)) {
+if (!qemu_prealloc_mem(fd, area, size, 1, NULL, false, _err)) {
 static bool warned;
 
 /*
@@ -1248,7 +1248,7 @@ static int virtio_mem_prealloc_range_cb(VirtIOMEM *vmem, 
void *arg,
 int fd = memory_region_get_fd(>memdev->mr);
 Error *local_err = NULL;
 
-if (!qemu_prealloc_mem(fd, area, size, 1, NULL, _err)) {
+if (!qemu_prealloc_mem(fd, area, size, 1, NULL, false, _err)) {
 error_report_err(local_err);
 return -ENOMEM;
 }
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index d47536eadb..9228e96c87 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -1083,6 +1083,11 @@ typedef enum MachineInitPhase {
  */
 PHASE_ACCEL_CREATED,
 
+/*
+ * Late backend objects have been created and initialized.
+ */
+PHASE_LATE_BACKENDS_CREATED,
+
 /*
  * machine_class->init has been called, thus creating any embedded
  * devices and validating machine properties.  Devices created at
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index c9692cc314..7d359dabc4 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -680,6 +680,8 @@ typedef struct ThreadContext ThreadContext;
  * @area: start address of the are to preallocate
  * @sz: the size of the area to preallocate
  * @max_threads: maximum number of threads to use
+ * @tc: prealloc context threads pointer, NULL if not in use
+ * @async: request asynchronous preallocation, requires @tc
  * @errp: returns an error if this function fails
  *
  * Preallocate memory (populate/prefault page tables writable) for the virtual
@@ -687,10 +689,24 @@ typedef struct ThreadContext ThreadContext;
  * each page in the area was faulted in writable at least once, for example,
  * after allocating file blocks for mapped files.
  *
+ * When setting @async, allocation might be performed asynchronously.
+ * qemu_finish_async_prealloc_mem() must be called to finish any asynchronous
+ * preallocation.
+ *
  * Return: true on success, else false setting @errp with error.
  */
 bool qemu_prealloc_mem(int

[PULL v3 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment

2024-02-05 Thread David Hildenbrand

Let's implement the get_min_alignment() callback for memory devices, and
copy for the device memory region the alignment of the host memory
region. This mimics what virtio-mem does, and allows for re-introducing
proper alignment checks for the memory region size (where we don't care
about additional device requirements) in memory device core.

Message-ID: <20240117135554.787344-2-da...@redhat.com>
Reviewed-by: Maciej S. Szmigiero 
Signed-off-by: David Hildenbrand 
---
 hw/hyperv/hv-balloon.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c
index 0238365712..ade283335a 100644
--- a/hw/hyperv/hv-balloon.c
+++ b/hw/hyperv/hv-balloon.c
@@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon)
 balloon->mr = g_new0(MemoryRegion, 1);
 memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON,
memory_region_size(hostmem_mr));
-
-/*
- * The VM can indicate an alignment up to 32 GiB. Memory device core can
- * usually only handle/guarantee 1 GiB alignment. The user will have to
- * specify a larger maxmem eventually.
- *
- * The memory device core will warn the user in case maxmem might have to 
be
- * increased and will fail plugging the device if there is not sufficient
- * space after alignment.
- *
- * TODO: we could do the alignment ourselves in a slightly bigger region.
- * But this feels better, although the warning might be annoying. Maybe
- * we can optimize that in the future (e.g., with such a device on the
- * cmdline place/size the device memory region differently.
- */
-balloon->mr->align = MAX(32 * GiB, 
memory_region_get_alignment(hostmem_mr));
+balloon->mr->align = memory_region_get_alignment(hostmem_mr);
 }
 
 static void hv_balloon_free_mr(HvBalloon *balloon)
@@ -1654,6 +1639,25 @@ static MemoryRegion 
*hv_balloon_md_get_memory_region(MemoryDeviceState *md,
 return balloon->mr;
 }
 
+static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md)
+{
+/*
+ * The VM can indicate an alignment up to 32 GiB. Memory device core can
+ * usually only handle/guarantee 1 GiB alignment. The user will have to
+ * specify a larger maxmem eventually.
+ *
+ * The memory device core will warn the user in case maxmem might have to 
be
+ * increased and will fail plugging the device if there is not sufficient
+ * space after alignment.
+ *
+ * TODO: we could do the alignment ourselves in a slightly bigger region.
+ * But this feels better, although the warning might be annoying. Maybe
+ * we can optimize that in the future (e.g., with such a device on the
+ * cmdline place/size the device memory region differently.
+ */
+return 32 * GiB;
+}
+
 static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md,
MemoryDeviceInfo *info)
 {
@@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, 
void *data)
 mdc->get_memory_region = hv_balloon_md_get_memory_region;
 mdc->decide_memslots = hv_balloon_decide_memslots;
 mdc->get_memslots = hv_balloon_get_memslots;
+mdc->get_min_alignment = hv_balloon_md_get_min_alignment;
 mdc->fill_device_info = hv_balloon_md_fill_device_info;
 }
-- 
2.43.0

Re: [PATCH 3/5] virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb()

2024-02-05 Thread Manos Pitsidianakis


On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi  wrote:

Hanna Czenczek  noted that the array index in
virtio_blk_dma_restart_cb() is not bounds-checked:

 g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues);
 ...
 while (rq) {
 VirtIOBlockReq *next = rq->next;
 uint16_t idx = virtio_get_queue_index(rq->vq);

 rq->next = vq_rq[idx];
^^

The code is correct because both rq->vq and vq_rq[] depend on
num_queues, but this is indirect and not 100% obvious. Add an assertion.


This sentence could be useful as an inline comment too.



Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
hw/block/virtio-blk.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a0735a9bca..f3193f4b75 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1209,6 +1209,7 @@ static void virtio_blk_dma_restart_cb(void *opaque, bool 
running,
VirtIOBlockReq *next = rq->next;
uint16_t idx = virtio_get_queue_index(rq->vq);

+assert(idx < num_queues);
rq->next = vq_rq[idx];
vq_rq[idx] = rq;
rq = next;
--
2.43.0



Reviewed-by: Manos Pitsidianakis

[PULL v3 2/3] memory-device: reintroduce memory region size check

2024-02-05 Thread David Hildenbrand

We used to check that the memory region size is multiples of the overall
requested address alignment for the device memory address.

We removed that check, because there are cases (i.e., hv-balloon) where
devices unconditionally request an address alignment that has a very large
alignment (i.e., 32 GiB), but the actual memory device size might not be
multiples of that alignment.

However, this change:

(a) allows for some practically impossible DIMM sizes, like "1GB+1 byte".
(b) allows for DIMMs that partially cover hugetlb pages, previously
reported in [1].

Both scenarios don't make any sense: we might even waste memory.

So let's reintroduce that check, but only check that the
memory region size is multiples of the memory region alignment (i.e.,
page size, huge page size), but not any additional memory device
requirements communicated using md->get_min_alignment().

The following examples now fail again as expected:

(a) 1M with 2M THP
 qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
 -object memory-backend-ram,id=mem1,size=1M \
 -device pc-dimm,id=dimm1,memdev=mem1
 -> backend memory size must be multiple of 0x20

(b) 1G+1byte

 qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
   -object memory-backend-ram,id=mem1,size=1073741825B \
   -device pc-dimm,id=dimm1,memdev=mem1
 -> backend memory size must be multiple of 0x20

(c) Unliagned hugetlb size (2M)

 qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
   -object 
memory-backend-file,id=mem1,mem-path=/dev/hugepages/tmp,size=511M \
   -device pc-dimm,id=dimm1,memdev=mem1
 backend memory size must be multiple of 0x20

(d) Unliagned hugetlb size (1G)

 qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \
-object 
memory-backend-file,id=mem1,mem-path=/dev/hugepages1G/tmp,size=2047M \
-device pc-dimm,id=dimm1,memdev=mem1
 -> backend memory size must be multiple of 0x4000

Note that this fix depends on a hv-balloon change to communicate its
additional alignment requirements using get_min_alignment() instead of
through the memory region.

[1] 
https://lkml.kernel.org/r/f77d641d500324525ac036fe1827b3070de75fc1.1701088320.git.mpriv...@redhat.com

Message-ID: <20240117135554.787344-3-da...@redhat.com>
Reported-by: Zhenyu Zhang 
Reported-by: Michal Privoznik 
Fixes: eb1b7c4bd413 ("memory-device: Drop size alignment check")
Tested-by: Zhenyu Zhang 
Tested-by: Mario Casquero 
Reviewed-by: Maciej S. Szmigiero 
Signed-off-by: David Hildenbrand 
---
 hw/mem/memory-device.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index a1b1af26bc..e098585cda 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -374,6 +374,20 @@ void memory_device_pre_plug(MemoryDeviceState *md, 
MachineState *ms,
 goto out;
 }
 
+/*
+ * We always want the memory region size to be multiples of the memory
+ * region alignment: for example, DIMMs with 1G+1byte size don't make
+ * any sense. Note that we don't check that the size is multiples
+ * of any additional alignment requirements the memory device might
+ * have when it comes to the address in physical address space.
+ */
+if (!QEMU_IS_ALIGNED(memory_region_size(mr),
+ memory_region_get_alignment(mr))) {
+error_setg(errp, "backend memory size must be multiple of 0x%"
+   PRIx64, memory_region_get_alignment(mr));
+return;
+}
+
 if (legacy_align) {
 align = *legacy_align;
 } else {
-- 
2.43.0

[PULL v3 0/3] Host Memory Backends and Memory devices queue 2024-02-06

2024-02-05 Thread David Hildenbrand

The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

  Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

  https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v3

for you to fetch changes up to 04accf43df83aa10f06f7dbda3ecf0db97f0c5a6:

  oslib-posix: initialize backend memory objects in parallel (2024-02-06 
08:15:22 +0100)


Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Reintroduce memory region size checks for memory devices; the removal
  lead to some undesired side effects
- Preallocate memory of memory backends in selected configurations
  asynchronously (so we preallocate concurrently), to speed up QEMU
  startup time.


David Hildenbrand (2):
  hv-balloon: use get_min_alignment() to express 32 GiB alignment
  memory-device: reintroduce memory region size check

Mark Kanda (1):
  oslib-posix: initialize backend memory objects in parallel

 backends/hostmem.c |   7 ++-
 hw/hyperv/hv-balloon.c |  37 +++-
 hw/mem/memory-device.c |  14 +
 hw/virtio/virtio-mem.c |   4 +-
 include/hw/qdev-core.h |   5 ++
 include/qemu/osdep.h   |  18 +-
 system/vl.c|   9 +++
 util/oslib-posix.c | 131 +++--
 util/oslib-win32.c |   8 ++-
 9 files changed, 180 insertions(+), 53 deletions(-)

-- 
2.43.0

Re: [PULL v2 0/3] Host Memory Backends and Memory devices queue 2024-02-06

2024-02-05 Thread David Hildenbrand


On 06.02.24 08:14, David Hildenbrand wrote:

The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

   Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

   https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v2

for you to fetch changes up to 8eb38ab662e74d618d473a9a52d71c644926c3c0:

   oslib-posix: initialize backend memory objects in parallel (2024-02-06 
08:09:55 +0100)


Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Reintroduce memory region size checks for memory devices; the removal
   lead to some undesired side effects
- Preallocate memory of memory backends in selected configurations
   asynchronously (so we preallocate concurrently), to speed up QEMU
   startup time.


... and yet another instance of the same mail address is wrong. Gah.

Maybe v3 will do ...

--
Cheers,

David / dhildenb

Re: [PATCH 4/5] virtio-blk: declare VirtIOBlock::rq with a type

2024-02-05 Thread Manos Pitsidianakis


On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi  wrote:

The VirtIOBlock::rq field has had the type void * since its introduction
in commit 869a5c6df19a ("Stop VM on error in virtio-blk. (Gleb
Natapov)").

Perhaps this was done to avoid the forward declaration of
VirtIOBlockReq.

Hanna Czenczek  pointed out the missing type. Specify
the actual type because there is no need to use void * here.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
include/hw/virtio/virtio-blk.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 833a9a344f..5c14110c4b 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -55,7 +55,7 @@ struct VirtIOBlock {
VirtIODevice parent_obj;
BlockBackend *blk;
QemuMutex rq_lock;
-void *rq; /* protected by rq_lock */
+struct VirtIOBlockReq *rq; /* protected by rq_lock */
VirtIOBlkConf conf;
unsigned short sector_mask;
bool original_wce;
--
2.43.0



Reviewed-by: Manos Pitsidianakis

[PULL v2 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment

2024-02-05 Thread David Hildenbrand

Let's implement the get_min_alignment() callback for memory devices, and
copy for the device memory region the alignment of the host memory
region. This mimics what virtio-mem does, and allows for re-introducing
proper alignment checks for the memory region size (where we don't care
about additional device requirements) in memory device core.

Message-ID: <20240117135554.787344-2-da...@redhat.com>
Reviewed-by: Maciej S. Szmigiero 
Signed-off-by: David Hildenbrand 
---
 hw/hyperv/hv-balloon.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c
index 0238365712..ade283335a 100644
--- a/hw/hyperv/hv-balloon.c
+++ b/hw/hyperv/hv-balloon.c
@@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon)
 balloon->mr = g_new0(MemoryRegion, 1);
 memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON,
memory_region_size(hostmem_mr));
-
-/*
- * The VM can indicate an alignment up to 32 GiB. Memory device core can
- * usually only handle/guarantee 1 GiB alignment. The user will have to
- * specify a larger maxmem eventually.
- *
- * The memory device core will warn the user in case maxmem might have to 
be
- * increased and will fail plugging the device if there is not sufficient
- * space after alignment.
- *
- * TODO: we could do the alignment ourselves in a slightly bigger region.
- * But this feels better, although the warning might be annoying. Maybe
- * we can optimize that in the future (e.g., with such a device on the
- * cmdline place/size the device memory region differently.
- */
-balloon->mr->align = MAX(32 * GiB, 
memory_region_get_alignment(hostmem_mr));
+balloon->mr->align = memory_region_get_alignment(hostmem_mr);
 }
 
 static void hv_balloon_free_mr(HvBalloon *balloon)
@@ -1654,6 +1639,25 @@ static MemoryRegion 
*hv_balloon_md_get_memory_region(MemoryDeviceState *md,
 return balloon->mr;
 }
 
+static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md)
+{
+/*
+ * The VM can indicate an alignment up to 32 GiB. Memory device core can
+ * usually only handle/guarantee 1 GiB alignment. The user will have to
+ * specify a larger maxmem eventually.
+ *
+ * The memory device core will warn the user in case maxmem might have to 
be
+ * increased and will fail plugging the device if there is not sufficient
+ * space after alignment.
+ *
+ * TODO: we could do the alignment ourselves in a slightly bigger region.
+ * But this feels better, although the warning might be annoying. Maybe
+ * we can optimize that in the future (e.g., with such a device on the
+ * cmdline place/size the device memory region differently.
+ */
+return 32 * GiB;
+}
+
 static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md,
MemoryDeviceInfo *info)
 {
@@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, 
void *data)
 mdc->get_memory_region = hv_balloon_md_get_memory_region;
 mdc->decide_memslots = hv_balloon_decide_memslots;
 mdc->get_memslots = hv_balloon_get_memslots;
+mdc->get_min_alignment = hv_balloon_md_get_min_alignment;
 mdc->fill_device_info = hv_balloon_md_fill_device_info;
 }
-- 
2.43.0

[PULL v2 0/3] Host Memory Backends and Memory devices queue 2024-02-06

2024-02-05 Thread David Hildenbrand

The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

  Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

  https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v2

for you to fetch changes up to 8eb38ab662e74d618d473a9a52d71c644926c3c0:

  oslib-posix: initialize backend memory objects in parallel (2024-02-06 
08:09:55 +0100)


Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Reintroduce memory region size checks for memory devices; the removal
  lead to some undesired side effects
- Preallocate memory of memory backends in selected configurations
  asynchronously (so we preallocate concurrently), to speed up QEMU
  startup time.


David Hildenbrand (2):
  hv-balloon: use get_min_alignment() to express 32 GiB alignment
  memory-device: reintroduce memory region size check

Mark Kanda (1):
  oslib-posix: initialize backend memory objects in parallel

 backends/hostmem.c |   7 ++-
 hw/hyperv/hv-balloon.c |  37 +++-
 hw/mem/memory-device.c |  14 +
 hw/virtio/virtio-mem.c |   4 +-
 include/hw/qdev-core.h |   5 ++
 include/qemu/osdep.h   |  18 +-
 system/vl.c|   9 +++
 util/oslib-posix.c | 131 +++--
 util/oslib-win32.c |   8 ++-
 9 files changed, 180 insertions(+), 53 deletions(-)

-- 
2.43.0

Re: [PULL 0/3] Host Memory Backends and Memory devices queue 2024-02-6

2024-02-05 Thread David Hildenbrand


On 06.02.24 08:02, David Hildenbrand wrote:

The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

   Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

   https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06

for you to fetch changes up to a32c31979a852a07d590effa1586f6ab4fa4d784:

   oslib-posix: initialize backend memory objects in parallel (2024-02-04 
17:51:13 +0100)


Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Reintroduce memory region size checks for memory devices; the removal
   lead to some undesired side effects
- Preallocate memory of memory backends in selected configurations
   asynchronously (so we preallocate concurrently), to speed up QEMU
   startup time.


A mail address in the second patch is wrong. Will send a v2.

--
Cheers,

David / dhildenb

Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump

2024-02-05 Thread Michael Tokarev


06.02.2024 09:35, Zhang Wen:
...

I'm so sorry for the misspelling in my initial post and the confusion
caused by that.


That's no problem, things happen.


I'm building my own system following the Linux From Scratch package, which is at
https://www.linuxfromscratch.org.
I accidently built libxkbcommon package but not xkeyboard-config
package, and then
while building qemu I saw this problem. it's a little difficult to
find the root cause from
this error message, so I sent this patch to seek for help from upstream.


Okay, so it is mostly a self-built/self-installed linux.  That works too.

So how the problem manifests itself?  What the actual error message is?
You mentioned some segfault iirc, care to share some details?

It's not a problem at all to pick this change up, but I'd love to know
more details about this first :)

Thanks,

/mjt

[PULL 0/3] Host Memory Backends and Memory devices queue 2024-02-6

2024-02-05 Thread David Hildenbrand

The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

  Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

  https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06

for you to fetch changes up to a32c31979a852a07d590effa1586f6ab4fa4d784:

  oslib-posix: initialize backend memory objects in parallel (2024-02-04 
17:51:13 +0100)


Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Reintroduce memory region size checks for memory devices; the removal
  lead to some undesired side effects
- Preallocate memory of memory backends in selected configurations
  asynchronously (so we preallocate concurrently), to speed up QEMU
  startup time.


David Hildenbrand (2):
  hv-balloon: use get_min_alignment() to express 32 GiB alignment
  memory-device: reintroduce memory region size check

Mark Kanda (1):
  oslib-posix: initialize backend memory objects in parallel

 backends/hostmem.c |   7 ++-
 hw/hyperv/hv-balloon.c |  37 +++-
 hw/mem/memory-device.c |  14 +
 hw/virtio/virtio-mem.c |   4 +-
 include/hw/qdev-core.h |   5 ++
 include/qemu/osdep.h   |  18 +-
 system/vl.c|   9 +++
 util/oslib-posix.c | 131 +++--
 util/oslib-win32.c |   8 ++-
 9 files changed, 180 insertions(+), 53 deletions(-)

-- 
2.43.0

[PULL 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment

2024-02-05 Thread David Hildenbrand

Let's implement the get_min_alignment() callback for memory devices, and
copy for the device memory region the alignment of the host memory
region. This mimics what virtio-mem does, and allows for re-introducing
proper alignment checks for the memory region size (where we don't care
about additional device requirements) in memory device core.

Message-ID: <20240117135554.787344-2-da...@redhat.com>
Reviewed-by: Maciej S. Szmigiero 
Signed-off-by: David Hildenbrand 
---
 hw/hyperv/hv-balloon.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c
index 0238365712..ade283335a 100644
--- a/hw/hyperv/hv-balloon.c
+++ b/hw/hyperv/hv-balloon.c
@@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon)
 balloon->mr = g_new0(MemoryRegion, 1);
 memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON,
memory_region_size(hostmem_mr));
-
-/*
- * The VM can indicate an alignment up to 32 GiB. Memory device core can
- * usually only handle/guarantee 1 GiB alignment. The user will have to
- * specify a larger maxmem eventually.
- *
- * The memory device core will warn the user in case maxmem might have to 
be
- * increased and will fail plugging the device if there is not sufficient
- * space after alignment.
- *
- * TODO: we could do the alignment ourselves in a slightly bigger region.
- * But this feels better, although the warning might be annoying. Maybe
- * we can optimize that in the future (e.g., with such a device on the
- * cmdline place/size the device memory region differently.
- */
-balloon->mr->align = MAX(32 * GiB, 
memory_region_get_alignment(hostmem_mr));
+balloon->mr->align = memory_region_get_alignment(hostmem_mr);
 }
 
 static void hv_balloon_free_mr(HvBalloon *balloon)
@@ -1654,6 +1639,25 @@ static MemoryRegion 
*hv_balloon_md_get_memory_region(MemoryDeviceState *md,
 return balloon->mr;
 }
 
+static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md)
+{
+/*
+ * The VM can indicate an alignment up to 32 GiB. Memory device core can
+ * usually only handle/guarantee 1 GiB alignment. The user will have to
+ * specify a larger maxmem eventually.
+ *
+ * The memory device core will warn the user in case maxmem might have to 
be
+ * increased and will fail plugging the device if there is not sufficient
+ * space after alignment.
+ *
+ * TODO: we could do the alignment ourselves in a slightly bigger region.
+ * But this feels better, although the warning might be annoying. Maybe
+ * we can optimize that in the future (e.g., with such a device on the
+ * cmdline place/size the device memory region differently.
+ */
+return 32 * GiB;
+}
+
 static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md,
MemoryDeviceInfo *info)
 {
@@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, 
void *data)
 mdc->get_memory_region = hv_balloon_md_get_memory_region;
 mdc->decide_memslots = hv_balloon_decide_memslots;
 mdc->get_memslots = hv_balloon_get_memslots;
+mdc->get_min_alignment = hv_balloon_md_get_min_alignment;
 mdc->fill_device_info = hv_balloon_md_fill_device_info;
 }
-- 
2.43.0

Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump

2024-02-05 Thread Zhang Wen

Peter Maydell  于2024年2月1日周四 20:57写道：
>
> On Thu, 1 Feb 2024 at 12:50, Michael Tokarev  wrote:
> >
> > 01.02.2024 15:11, Michael Tokarev wrote:
> > > 31.01.2024 11:13, Zhang Wen:
> > >> With this patch, qemu requires keyboard-config when libxkbcommon is 
> > >> found on the system. So if the qemu is configured when libxkbcommon is 
> > >> installed
> > >> but not keyboard-config, the configure stage will produce an error 
> > >> message, thus avoid coredump in the build stage.
> > >
> > > I'm not sure what you're talking about.  What *is* keyboard-config anyway?
> > >
> > > On a debian system there's no such thing.  There's keyboard-configuration
> > > package but it has nothing to do with that.  It looks like if we apply
> > > such patch, it will be impossible to build qemu on debian.
> >
> > Aha, I found it.  On debian it is /usr/share/pkgconfig/keyboard-config.pc,
> > which is a part of xkb-data package.  And libxkbcommon Depends on xkb-data.
> > It looks like the distribution here is wrong, there should be no 
> > libxkbcommon
> > without xkb-data which includes keyboard-config.
>
> Are we talking about "keyboard-config" or "xkeyboard-config" here?
> The commit message says "keyboard-config" but the patch itself
> says "xkeyboard-config".
>
> Zhang: it would be helpful if you could tell us which distro
> you are building on where you see this problem.
>
> thanks
> -- PMM

I'm so sorry for the misspelling in my initial post and the confusion
caused by that.
I'm building my own system following the Linux From Scratch package, which is at
https://www.linuxfromscratch.org.
I accidently built libxkbcommon package but not xkeyboard-config
package, and then
while building qemu I saw this problem. it's a little difficult to
find the root cause from
this error message, so I sent this patch to seek for help from upstream.

-- 
Key fingerprint:
419F 72F3 F3A9 36EE 1B72  B00B C1C3 4BD4 FDA3 362F

[PATCH 1/3] tests/migration-test: Stick with gicv3 in aarch64 test

2024-02-05 Thread peterx

From: Peter Xu 

Recently we introduced cross-binary migration test.  It's always wanted
that migration-test uses stable guest ABI for both QEMU binaries in this
case, so that both QEMU binaries will be compatible on the migration
stream with the cmdline specified.

Switch to a static gic version "3" rather than using version "max", so that
GIC should be stable now across any future QEMU binaries for migration-test.

Here the version can actually be anything as long as the ABI is stable.  We
choose "3" because it's the majority of what we already use in QEMU while
still new enough: "git grep gic-version=3" shows 6 hit, while version 4 has
no direct user yet besides "max".

Note that even with this change, aarch64 won't be able to work yet with
migration cross binary test, but then the only missing piece will be the
stable CPU model.

Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 7675519cfa..8a5bb1752e 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -819,7 +819,7 @@ static int test_migrate_start(QTestState **from, QTestState 
**to,
 } else if (strcmp(arch, "aarch64") == 0) {
 memory_size = "150M";
 machine_alias = "virt";
-machine_opts = "gic-version=max";
+machine_opts = "gic-version=3";
 arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath);
 start_address = ARM_TEST_MEM_START;
 end_address = ARM_TEST_MEM_END;
-- 
2.43.0

[PATCH 2/3] ci: Remove tag dependency for build-previous-qemu

2024-02-05 Thread peterx

From: Peter Xu 

The new build-previous-qemu job relies on QEMU release tag being present,
while that may not be always true for personal git repositories since by
default tag is not pushed.  The job can fail on those CI kicks, as reported
by Peter Maydell.

Fix it by fetching the tags remotely from the official repository, as
suggested by Dan.

[1] https://lore.kernel.org/r/zcc9sckj7vvqe...@redhat.com

Reported-by: Peter Maydell 
Suggested-by: Daniel P. Berrangé 
Signed-off-by: Peter Xu 
---
 .gitlab-ci.d/buildtest.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 79bbc8585b..df48c9d31d 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -189,6 +189,8 @@ build-previous-qemu:
 TARGETS: x86_64-softmmu aarch64-softmmu
   before_script:
 - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
+- git remote add upstream https://gitlab.com/qemu-project/qemu
+- git fetch upstream $QEMU_PRRV_VERSION
 - git checkout $QEMU_PREV_VERSION
   after_script:
 - mv build build-previous
-- 
2.43.0

[PATCH 3/3] ci: Update comment for migration-compat-aarch64

2024-02-05 Thread peterx

From: Peter Xu 

It turns out that we may not be able to enable this test even for the
upcoming v9.0.  Document what we're still missing.

Signed-off-by: Peter Xu 
---
 .gitlab-ci.d/buildtest.yml | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index df48c9d31d..f01d6d0cdf 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -219,9 +219,10 @@ build-previous-qemu:
 - QTEST_QEMU_BINARY_DST=./qemu-system-${TARGET}
   QTEST_QEMU_BINARY=../build/qemu-system-${TARGET} 
./tests/qtest/migration-test
 
-# This job is disabled until we release 9.0. The existing
-# migration-test in 8.2 is broken on aarch64. The fix was already
-# commited, but it will only take effect once 9.0 is out.
+# This job needs to be disabled until we can have an aarch64 CPU model that
+# will both (1) support both KVM and TCG, and (2) provide a stable ABI.
+# Currently only "-cpu max" can provide (1), however it doesn't guarantee
+# (2).  Mark this test skipped until later.
 migration-compat-aarch64:
   extends: .migration-compat-common
   variables:
-- 
2.43.0

[PATCH 0/3] ci: Fixes on the recent cross-binary test case

2024-02-05 Thread peterx

From: Peter Xu 

Hi,

This small patchset updates the recent cross-binary test for migration on
a few things.

Patch 1 modifies the aarch64 test GIC version to 3 rather than "max",
paving way for enabling it, even if the CPU model is not yet ready.

Patch 2 removes the tag dependency of the new build-previous-qemu job, so
that in personal CI pipelines the job won't fail if the tag is missing, as
reported by Peter Maydell, and solution suggested by Dan.

Patch 3 updates the comment for aarch64 on the test to state the fact, and
what is missing.  Then we don't target it support for v9.0, but only until
we have a stable CPU model for aarch64 (if ever possible to support both
tcg and kvm).

Comments welcomed, thanks.

Peter Xu (3):
  tests/migration-test: Stick with gicv3 in aarch64 test
  ci: Remove tag dependency for build-previous-qemu
  ci: Update comment for migration-compat-aarch64

 tests/qtest/migration-test.c | 2 +-
 .gitlab-ci.d/buildtest.yml   | 9 ++---
 2 files changed, 7 insertions(+), 4 deletions(-)

-- 
2.43.0

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Markus Armbruster

Daniel P. Berrangé  writes:

> On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote:
>> C99 mixed declarations support interleaving of local variable
>> declarations and code.
>> 
>> The coding style "generally" forbids C99 mixed declarations with some
>> exceptions to the rule. This rule is not checked by checkpatch.pl and
>> naturally there are violations in the source tree.
>> 
>> While contemplating adding another exception, I came to the conclusion
>> that the best location for declarations depends on context. Let the
>> programmer declare variables where it is best for legibility. Don't try
>> to define all possible scenarios/exceptions.
>
> IIRC, we had a discussion on this topic sometime last year, but can't
> remember what the $SUBJECT was, so I'll just repeat what I said then.

From: Juan Quintela 
Subject: [PATCH] Change the default for Mixed declarations.
Date: Tue, 14 Feb 2023 17:07:38 +0100
Message-Id: <20230214160738.88614-1-quint...@redhat.com>
https://lore.kernel.org/qemu-devel/20230214160738.88614-1-quint...@redhat.com/

> Combining C99 mixed declarations with 'goto' is dangerous.
>
> Consider this program:
>
> $ cat jump.c
> #include 
>
> int main(int argc, char**argv) {
>
>   if (getenv("SKIP"))
> goto target;
>
>   char *foo = malloc(30);
>
>  target:
>   free(foo);
> }
>
> $ gcc -Wall -Wuninitialized -o jump jump.c
>
> $ SKIP=1 ./jump 
> free(): invalid pointer
> Aborted (core dumped)
>
>
>  -> The programmer thinks they have initialized 'foo'
>  -> GCC thinks the programmer has initialized 'foo'
>  -> Yet 'foo' is not guaranteed to be initialized at 'target:'
>
> Given that QEMU makes heavy use of 'goto', allowing C99 mixed
> declarations exposes us to significant danger.
>
> Full disclosure, GCC fails to diagnmose this mistake, even
> with a decl at start of 'main', but at least the mistake is
> now more visible to the programmer.
>
> Fortunately with -fanalyzer GCC can diagnose this:
>
> $ gcc -fanalyzer -Wall -o jump jump.c
> jump.c: In function ‘main’:
> jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] 
> [-Wanalyzer-use-of-uninitialized-value]
>12 |   free(foo);
>   |   ^
>   ‘main’: events 1-5
> |
> |6 |   if (getenv("SKIP"))
> |  |  ~  
> |  |  |
> |  |  (3) following ‘true’ branch...
> |7 | goto target;
> |  | 
> |  | |
> |  | (4) ...to here
> |8 | 
> |9 |  char *foo = malloc(30);
> |  |^~~
> |  ||
> |  |(1) region created on stack here
> |  |(2) capacity: 8 bytes
> |..
> |   12 |   free(foo);
> |  |   ~
> |  |   |
> |  |   (5) use of uninitialized value ‘foo’ here
>
>
> ...but -fanalyzer isn't something we have enabled by default, it
> is opt-in. I'm also not sure how comprehensive the flow control
> analysis of -fanalyzer is ?  Can we be sure it'll catch these
> mistakes in large complex functions with many code paths ?
>
> Even if the compiler does reliably warn, I think the code pattern
> remains misleading to contributors, as the flow control flaw is
> very non-obvious.

Yup.  Strong dislike.

> Rather than accept the status quo and remove the coding guideline,
> I think we should strengthen the guidelines, such that it is
> explicitly forbidden in any method that uses 'goto'. Personally
> I'd go all the way to -Werror=declaration-after-statement, as

I support this.

> while C99 mixed decl is appealing,

Not to me.

I much prefer declarations and statements to be visually distinct.
Putting declarations first and separating from statements them with a
blank line accomplishes that.  Less necessary in languages where
declarations are syntactically obvious.

>it isn't exactly a game
> changer in improving code maintainability.


[...]

Re: [PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races

2024-02-05 Thread Peter Xu

On Mon, Feb 05, 2024 at 04:49:23PM -0300, Fabiano Rosas wrote:
> Based-on: 20240202102857.110210-1-pet...@redhat.com
> [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
> https://lore.kernel.org/r/20240202102857.110210-1-pet...@redhat.com
> 
> Hi,
> 
> In this v2 I made sure NO channel is created after the semaphores are
> posted. Feel free to call me out if that's not the case.

Queued into -staging.  Plan to send a pull only before I'll be out (Feb
9-19), so comments are still welcomed.  Thanks.

-- 
Peter Xu

Re: [PATCH v2 6/6] migration/multifd: Add a synchronization point for channel creation

2024-02-05 Thread Peter Xu

On Mon, Feb 05, 2024 at 04:49:29PM -0300, Fabiano Rosas wrote:
> It is possible that one of the multifd channels fails to be created at
> multifd_new_send_channel_async() while the rest of the channel
> creation tasks are still in flight.
> 
> This could lead to multifd_save_cleanup() executing the
> qemu_thread_join() loop too early and not waiting for the threads
> which haven't been created yet, leading to the freeing of resources
> that the newly created threads will try to access and crash.
> 
> Add a synchronization point after which there will be no attempts at
> thread creation and therefore calling multifd_save_cleanup() past that
> point will ensure it properly waits for the threads.
> 
> A note about performance: Prior to this patch, if a channel took too
> long to be established, other channels could finish connecting first
> and already start taking load. Now we're bounded by the
> slowest-connecting channel.
> 
> Reported-by: Avihai Horon 
> Signed-off-by: Fabiano Rosas 

[...]

> @@ -934,7 +936,6 @@ static void multifd_new_send_channel_async(QIOTask *task, 
> gpointer opaque)
>  MultiFDSendParams *p = opaque;
>  QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
>  Error *local_err = NULL;
> -

This line removal should belong to the previous patch.  I can touch that
up.

>  bool ret;
>  
>  trace_multifd_new_send_channel_async(p->id);

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v2 5/6] migration/multifd: Unify multifd and TLS connection paths

2024-02-05 Thread Peter Xu

On Mon, Feb 05, 2024 at 04:49:28PM -0300, Fabiano Rosas wrote:
> During multifd channel creation (multifd_send_new_channel_async) when
> TLS is enabled, the multifd_channel_connect function is called twice,
> once to create the TLS handshake thread and another time after the
> asynchrounous TLS handshake has finished.
> 
> This creates a slightly confusing call stack where
> multifd_channel_connect() is called more times than the number of
> channels. It also splits error handling between the two callers of
> multifd_channel_connect() causing some code duplication. Lastly, it
> gets in the way of having a single point to determine whether all
> channel creation tasks have been initiated.
> 
> Refactor the code to move the reentrancy one level up at the
> multifd_new_send_channel_async() level, de-duplicating the error
> handling and allowing for the next patch to introduce a
> synchronization point common to all the multifd channel creation,
> regardless of TLS.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu

RE: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base

2024-02-05 Thread Jamin Lin

> -Original Message- 
> The uart definitions on the AST2700 are different :
> 
> 
> https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm
> 64/boot/dts/aspeed/aspeed-g7.dtsi
> 
>   serial0 = 
>   serial1 = 
>   serial2 = 
>   serial3 = 
>   serial4 = 
>   serial5 = 
>   serial6 = 
>   serial7 = 
>   serial8 = 
>  ...
> 
> I think the names in the DT (and consequently in the QEMU models) follow the
> IP names in the datasheet.
> 
> I don't think we care in QEMU, so I would be inclined to change the indexing 
> of
> the device names in QEMU and start at 0, which would introduce a
> discrepancy for the AST2400, AST2600, AST2600 SoC.
> 
> Let's see what the other maintainers have to say.
> 
> Thanks,
> 
> C.
Hi Cedric, 

Did you mean to change the naming of uart device to 0 base for all ASPEED SOCs?
If yes, it seems we need to do the following changes.
1. add ASPEED_DEV_UART0 in aspeed_soc.h
2. Re-defined uart memory map for ast2600, ast10x0, ast2500 and ast2400(uart0 
-> ASPEED_DEV_UART0)
Take ast2600 for example:
static const hwaddr aspeed_soc_ast2600_memmap[] = {
[ASPEED_DEV_UART1] = 0x1E783000, ---> [ASPEED_DEV_UART0]
[ASPEED_DEV_UART2] = 0x1E78D000, ---> [ASPEED_DEV_UART1]
[ASPEED_DEV_UART3] = 0x1E78E000,
[ASPEED_DEV_UART4] = 0x1E78F000,
[ASPEED_DEV_UART5] = 0x1E784000,
[ASPEED_DEV_UART6] = 0x1E79,
[ASPEED_DEV_UART7] = 0x1E790100,
[ASPEED_DEV_UART8] = 0x1E790200,
[ASPEED_DEV_UART9] = 0x1E790300,
[ASPEED_DEV_UART10]= 0x1E790400,
[ASPEED_DEV_UART11]= 0x1E790500,
[ASPEED_DEV_UART12]= 0x1E790600,
[ASPEED_DEV_UART13]= 0x1E790700, ---> [ASPEED_DEV_UART12]
}; 
If no, could you please descript it more detail? So, I can change it and 
re-send this patch series.

By the way, I will send a new patch series to support AST2700 in two weeks.
We encountered GIC issues. It seems that QEMU support GIC v3 but SPI did not 
support, yet.
https://github.com/qemu/qemu/blob/master/hw/intc/arm_gicv3_dist.c#L383
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm64/boot/dts/aspeed/aspeed-g7.dtsi#L229
It think that we can discuss it in a new AST2700 patch series.
Thanks-Jamin

[PULL v2 38/39] tcg/s390x: Support TCG_COND_TST{EQ,NE}

2024-02-05 Thread Richard Henderson

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |   2 +-
 tcg/s390x/tcg-target.c.inc | 139 +
 2 files changed, 97 insertions(+), 44 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 53bed8c8d2..ae448c3a3a 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -138,7 +138,7 @@ extern uint64_t s390_facilities[3];
 
 #define TCG_TARGET_HAS_qemu_ldst_i128 1
 
-#define TCG_TARGET_HAS_tst0
+#define TCG_TARGET_HAS_tst1
 
 #define TCG_TARGET_HAS_v64HAVE_FACILITY(VECTOR)
 #define TCG_TARGET_HAS_v128   HAVE_FACILITY(VECTOR)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 7f97080f52..ad587325fc 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -112,6 +112,9 @@ typedef enum S390Opcode {
 RI_OILH = 0xa50a,
 RI_OILL = 0xa50b,
 RI_TMLL = 0xa701,
+RI_TMLH = 0xa700,
+RI_TMHL = 0xa703,
+RI_TMHH = 0xa702,
 
 RIEb_CGRJ= 0xec64,
 RIEb_CLGRJ   = 0xec65,
@@ -404,10 +407,15 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
 #define S390_CC_NEVER   0
 #define S390_CC_ALWAYS  15
 
+#define S390_TM_EQ  8  /* CC == 0 */
+#define S390_TM_NE  7  /* CC in {1,2,3} */
+
 /* Condition codes that result from a COMPARE and COMPARE LOGICAL.  */
-static const uint8_t tcg_cond_to_s390_cond[] = {
+static const uint8_t tcg_cond_to_s390_cond[16] = {
 [TCG_COND_EQ]  = S390_CC_EQ,
 [TCG_COND_NE]  = S390_CC_NE,
+[TCG_COND_TSTEQ] = S390_CC_EQ,
+[TCG_COND_TSTNE] = S390_CC_NE,
 [TCG_COND_LT]  = S390_CC_LT,
 [TCG_COND_LE]  = S390_CC_LE,
 [TCG_COND_GT]  = S390_CC_GT,
@@ -421,9 +429,11 @@ static const uint8_t tcg_cond_to_s390_cond[] = {
 /* Condition codes that result from a LOAD AND TEST.  Here, we have no
unsigned instruction variation, however since the test is vs zero we
can re-map the outcomes appropriately.  */
-static const uint8_t tcg_cond_to_ltr_cond[] = {
+static const uint8_t tcg_cond_to_ltr_cond[16] = {
 [TCG_COND_EQ]  = S390_CC_EQ,
 [TCG_COND_NE]  = S390_CC_NE,
+[TCG_COND_TSTEQ] = S390_CC_ALWAYS,
+[TCG_COND_TSTNE] = S390_CC_NEVER,
 [TCG_COND_LT]  = S390_CC_LT,
 [TCG_COND_LE]  = S390_CC_LE,
 [TCG_COND_GT]  = S390_CC_GT,
@@ -542,10 +552,13 @@ static bool risbg_mask(uint64_t c)
 static bool tcg_target_const_match(int64_t val, int ct,
TCGType type, TCGCond cond, int vece)
 {
+uint64_t uval = val;
+
 if (ct & TCG_CT_CONST) {
 return true;
 }
 if (type == TCG_TYPE_I32) {
+uval = (uint32_t)val;
 val = (int32_t)val;
 }
 
@@ -567,6 +580,15 @@ static bool tcg_target_const_match(int64_t val, int ct,
 case TCG_COND_GTU:
 ct |= TCG_CT_CONST_U32;  /* CLGFI */
 break;
+case TCG_COND_TSTNE:
+case TCG_COND_TSTEQ:
+if (is_const_p16(uval) >= 0) {
+return true;  /* TMxx */
+}
+if (risbg_mask(uval)) {
+return true;  /* RISBG */
+}
+break;
 default:
 g_assert_not_reached();
 }
@@ -588,10 +610,6 @@ static bool tcg_target_const_match(int64_t val, int ct,
 if (ct & TCG_CT_CONST_INV) {
 val = ~val;
 }
-/*
- * Note that is_const_p16 is a subset of is_const_p32,
- * so we don't need both constraints.
- */
 if ((ct & TCG_CT_CONST_P32) && is_const_p32(val) >= 0) {
 return true;
 }
@@ -868,6 +886,9 @@ static const S390Opcode oi_insns[4] = {
 static const S390Opcode lif_insns[2] = {
 RIL_LLILF, RIL_LLIHF,
 };
+static const S390Opcode tm_insns[4] = {
+RI_TMLL, RI_TMLH, RI_TMHL, RI_TMHH
+};
 
 /* load a register with an immediate value */
 static void tcg_out_movi(TCGContext *s, TCGType type,
@@ -1228,6 +1249,36 @@ static int tgen_cmp2(TCGContext *s, TCGType type, 
TCGCond c, TCGReg r1,
 TCGCond inv_c = tcg_invert_cond(c);
 S390Opcode op;
 
+if (is_tst_cond(c)) {
+tcg_debug_assert(!need_carry);
+
+if (!c2const) {
+if (type == TCG_TYPE_I32) {
+tcg_out_insn(s, RRFa, NRK, TCG_REG_R0, r1, c2);
+} else {
+tcg_out_insn(s, RRFa, NGRK, TCG_REG_R0, r1, c2);
+}
+goto exit;
+}
+
+if (type == TCG_TYPE_I32) {
+c2 = (uint32_t)c2;
+}
+
+int i = is_const_p16(c2);
+if (i >= 0) {
+tcg_out_insn_RI(s, tm_insns[i], r1, c2 >> (i * 16));
+*inv_cc = c == TCG_COND_TSTEQ ? S390_TM_NE : S390_TM_EQ;
+return *inv_cc ^ 15;
+}
+
+if (risbg_mask(c2)) {
+tgen_andi_risbg(s, TCG_REG_R0, r1, c2);
+goto exit;
+}
+g_assert_not_reached();
+}
+
 if (c2const) {
 if (c2 == 0) {
 if (!(is_unsigned

[PULL v2 00/39] tcg patch queue

2024-02-05 Thread Richard Henderson

v2: Fix rebase error in patch 38 (tcg/s390x: Support TCG_COND_TST{EQ,NE}).


r~


The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440:

  Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into 
staging (2024-02-03 13:31:58 +)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20240205-2

for you to fetch changes up to 23c5692abc3917151dee36c00d751cf5bc46ef19:

  tcg/tci: Support TCG_COND_TST{EQ,NE} (2024-02-05 22:45:41 +)


tcg: Introduce TCG_COND_TST{EQ,NE}
target/alpha: Use TCG_COND_TST{EQ,NE}
target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond
target/sparc: Use TCG_COND_TSTEQ in gen_op_mulscc
target/s390x: Use TCG_COND_TSTNE for CC_OP_{TM,ICM}
target/s390x: Improve general case of disas_jcc


Paolo Bonzini (1):
  tcg/i386: Use TEST r,r to test 8/16/32 bits

Philippe Mathieu-Daudé (1):
  tcg/aarch64: Massage tcg_out_brcond()

Richard Henderson (37):
  tcg: Introduce TCG_COND_TST{EQ,NE}
  tcg: Introduce TCG_TARGET_HAS_tst
  tcg/optimize: Split out arg_is_const_val
  tcg/optimize: Split out do_constant_folding_cond1
  tcg/optimize: Do swap_commutative2 in do_constant_folding_cond2
  tcg/optimize: Handle TCG_COND_TST{EQ,NE}
  tcg/optimize: Lower TCG_COND_TST{EQ,NE} if unsupported
  target/alpha: Pass immediate value to gen_bcond_internal()
  target/alpha: Use TCG_COND_TST{EQ,NE} for BLB{C,S}
  target/alpha: Use TCG_COND_TST{EQ,NE} for CMOVLB{C,S}
  target/alpha: Use TCG_COND_TSTNE for gen_fold_mzero
  target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond
  target/sparc: Use TCG_COND_TSTEQ in gen_op_mulscc
  target/s390x: Use TCG_COND_TSTNE for CC_OP_{TM,ICM}
  target/s390x: Improve general case of disas_jcc
  tcg: Add TCGConst argument to tcg_target_const_match
  tcg/aarch64: Support TCG_COND_TST{EQ,NE}
  tcg/aarch64: Generate TBZ, TBNZ
  tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX
  tcg/arm: Split out tcg_out_cmp()
  tcg/arm: Support TCG_COND_TST{EQ,NE}
  tcg/i386: Pass x86 condition codes to tcg_out_cmov
  tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp
  tcg/i386: Support TCG_COND_TST{EQ,NE}
  tcg/i386: Improve TSTNE/TESTEQ vs powers of two
  tcg/sparc64: Hoist read of tcg_cond_to_rcond
  tcg/sparc64: Pass TCGCond to tcg_out_cmp
  tcg/sparc64: Support TCG_COND_TST{EQ,NE}
  tcg/ppc: Sink tcg_to_bc usage into tcg_out_bc
  tcg/ppc: Use cr0 in tcg_to_bc and tcg_to_isel
  tcg/ppc: Tidy up tcg_target_const_match
  tcg/ppc: Add TCG_CT_CONST_CMP
  tcg/ppc: Support TCG_COND_TST{EQ,NE}
  tcg/s390x: Split constraint A into J+U
  tcg/s390x: Add TCG_CT_CONST_CMP
  tcg/s390x: Support TCG_COND_TST{EQ,NE}
  tcg/tci: Support TCG_COND_TST{EQ,NE}

 docs/devel/tcg-ops.rst   |   2 +
 include/tcg/tcg-cond.h   |  74 --
 tcg/aarch64/tcg-target-con-set.h |   5 +-
 tcg/aarch64/tcg-target-con-str.h |   1 +
 tcg/aarch64/tcg-target.h |   2 +
 tcg/arm/tcg-target.h |   2 +
 tcg/i386/tcg-target-con-set.h|   6 +-
 tcg/i386/tcg-target-con-str.h|   1 +
 tcg/i386/tcg-target.h|   2 +
 tcg/loongarch64/tcg-target.h |   2 +
 tcg/mips/tcg-target.h|   2 +
 tcg/ppc/tcg-target-con-set.h |   5 +-
 tcg/ppc/tcg-target-con-str.h |   1 +
 tcg/ppc/tcg-target.h |   2 +
 tcg/riscv/tcg-target.h   |   2 +
 tcg/s390x/tcg-target-con-set.h   |   8 +-
 tcg/s390x/tcg-target-con-str.h   |   3 +-
 tcg/s390x/tcg-target.h   |   2 +
 tcg/sparc64/tcg-target.h |   2 +
 tcg/tcg-internal.h   |   2 +
 tcg/tci/tcg-target.h |   2 +
 target/alpha/translate.c |  94 
 target/m68k/translate.c  |  74 +++---
 target/s390x/tcg/translate.c | 100 +++--
 target/sparc/translate.c |   4 +-
 tcg/optimize.c   | 474 ++-
 tcg/tcg.c|  40 +++-
 tcg/tci.c|  14 ++
 tcg/aarch64/tcg-target.c.inc | 166 +++---
 tcg/arm/tcg-target.c.inc |  62 +++--
 tcg/i386/tcg-target.c.inc| 201 -
 tcg/loongarch64/tcg-target.c.inc |   3 +-
 tcg/mips/tcg-target.c.inc|   3 +-
 tcg/ppc/tcg-target.c.inc | 294 ++--
 tcg/riscv/tcg-target.c.inc   |   3 +-
 tcg/s390x/tcg-target.c.inc   | 246 +---
 tcg/sparc64/tcg-target.c.inc |  65 --
 tcg/tci/tcg-target.c.inc |   3 +-
 38 files changed, 1379 insertions(+), 595 deletions(-)

RE: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base

2024-02-05 Thread Jamin Lin

> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Monday, February 5, 2024 9:16 PM
> To: Cédric Le Goater ; Jamin Lin ;
> Peter Maydell ; Andrew Jeffery
> ; Joel Stanley ; open
> list:ASPEED BMCs ; open list:All patches CC here
> 
> Cc: Troy Lee 
> Subject: Re: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base
> 
> On 5/2/24 11:46, Cédric Le Goater wrote:
> > Hello Jamin,
> >
> > On 2/5/24 10:14, Jamin Lin wrote:
> >> According to the design of ASPEED SOCS, the uart controller is 1 base
> >> for ast10x0, ast2600, ast2500 and ast2400.
> >>
> >> However, the uart controller is 0 base for ast2700.
> >> To support uart controller both 0 and 1 base, adds uasrt_bases
> >> parameter in AspeedSoCClass and set the default uart controller 1
> >> base for ast10x0, astt2600, ast2500 and ast2400.
> >
> > The board definition can set 'amc->uart_default' to choose a different
> > default serial port for the console, or use the "bmc-console" machine
> > option . Isn't it enough ? May be I am misunderstanding the need.
> >
> > To clarify,
> >
> > ASPEED_DEV_UART1 is in the first serial port on the boards.
> >
> > I think we chose to start the indexing at 1 because the Aspeed QEMU
> > modeling began first with the UART model (console) and for simplicity,
> > we copied the definitions of the device tree from Linux :
> >
> >      serial0 = 
> >      serial1 = 
> >      serial2 = 
> >      serial3 = 
> >      serial4 = 
> >      serial5 = 
> >
> > We replicated this indexing starting at 1 to nearly all device models :
> >
> >      ASPEED_DEV_UART1 - 13
> >      ASPEED_DEV_SPI1 -2
> >      ASPEED_DEV_EHCI1 -2
> >      ASPEED_DEV_TIMER1 - 8
> >      ASPEED_DEV_ETH1 -4
> >      ASPEED_DEV_MII1 - 4
> >      ASPEED_DEV_JTAG0 - 1  <--- !!
> >      ASPEED_DEV_FSI1 - 2
> >
> > I don't know what would be ASPEED_DEV_UART0 in this context.
> >
> > May be you could send a simplified AST2700 SoC model with definitions
> > of a minimum address space and IRQ space ?
> 
> Looking at TF-A definitions,
> https://github.com/ARM-software/arm-trusted-firmware/commit/85f199b774
> 47
> 
> #define UART_BASE U(0x14c33000)
> #define UART12_BASE   (UART_BASE + 0xb00)
> #define CONSOLE_UART_BASE UART12_BASE
> 
> As Cédric described, we have TF-A UART12_BASE -> QEMU
> ASPEED_DEV_UART13.
As Cédric described, the UART definitions on the AST2700 are different :
   
   
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm64/boot/dts/aspeed/aspeed-g7.dtsi

serial0 = 
serial1 = 
serial2 = 
serial3 = 
serial4 = 
serial5 = 
serial6 = 
serial7 = 
serial8 = 

According to the current design of ASPEED QEMU UART model, it used "1" base 
device name and
Follow the IP names in the datasheet
Take ast2600 for example:
static const hwaddr aspeed_soc_ast2600_memmap[] = {
[ASPEED_DEV_UART1] = 0x1E783000,
}
AST2600 datashee description:
Base Address of UART1 = 0x1E78 3000
Base Address of UART2 = 0x1E78 D000
Base Address of UART3 = 0x1E78 E000
Base Address of UART4 = 0x1E78 F000
Base Address of UART5 = 0x1E78 4000
Base Address of UART6 = 0x1E79 

However, device name of uart controller had been changed to "0" base for 
ast2700.
If we want to control uart0(ASPEED_DEV_UART1), we should set the memory map as 
following,

static const hwaddr aspeed_soc_ast2700_memmap[] = {
[ASPEED_DEV_UART1] =  0X14C33000,
[ASPEED_DEV_UART2] =  0X14C33100,
[ASPEED_DEV_UART3] =  0X14C33200,
[ASPEED_DEV_UART4] =  0X14C33300,
[ASPEED_DEV_UART5] =  0X12C1A000,
}

AST2700 datasheet description:
AST2700 integrate 13 sets of UART.
Base Address of UART0 = 0x14C3_3000
Base Address of UART1 = 0x14C3_3100
Base Address of UART2 = 0x14C3_3200
Base Address of UART3 = 0x14C3_3300
Base Address of UART4 = 0x12C1_A000
Base Address of UART5 = 0x14C3_3400
Base Address of UART6 = 0x14C3_3500
Base Address of UART7 = 0x14C3_3600
Base Address of UART8 = 0x14C3_3700
Base Address of UART9 = 0x14C3_3800
Base Address of UART10 = 0x14C3_3900
Base Address of UART11 = 0x14C3_3A00
Base Address of UART12 = 0x14C3_3B00

As you said, uart12 mapped ASPEED_DEV_UART13.
The device naming will confuse users because the device name in qemu mismatch 
with ast2700 datasheet.

That way why we want to add ASPEED_DEV_UART0 and set the memory map of AST2700 
as following.
static const hwaddr aspeed_soc_ast2700_memmap[] = {
[ASPEED_DEV_UART0] =  0X14C33000,
[ASPEED_DEV_UART1] =  0X14C33100,
[ASPEED_DEV_UART2] =  0X14C33200,
[ASPEED_DEV_UART3] =  0X14C33300,
[ASPEED_DEV_UART4] =  0X12C1A000,
[ASPEED_DEV_UART5] =  0X14C33400,
[ASPEED_DEV_UART6] =  0X14C33500,
[ASPEED_DEV_UART7] =  0X14C33600,
[ASPEED_DEV_UART8] =  0X14C33700,
[ASPEED_DEV_UART9] =  0X14C33800,
[ASPEED_DEV_UART10]=  0X14C33900,
[ASPEED_DEV_UART11]=  0X14C33A00,
[ASPEED_DEV_UART12]=

[PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa

2024-02-05 Thread Richard Henderson

The field is encoded as [0-3], which is convenient for
indexing our array of function pointers, but the true
value is [1-4].  Adjust before calling do_mem_zpa.

Add an assert, and move the comment re passing ZT to
the helper back next to the relevant code.

Cc: qemu-sta...@nongnu.org
Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-sve.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 296e7d1ce2..f50a426c98 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4445,11 +4445,7 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, 
TCGv_i64 addr,
 TCGv_ptr t_pg;
 int desc = 0;
 
-/*
- * For e.g. LD4, there are not enough arguments to pass all 4
- * registers as pointers, so encode the regno into the data field.
- * For consistency, do this even for LD1.
- */
+assert(mte_n >= 1 && mte_n <= 4);
 if (s->mte_active[0]) {
 int msz = dtype_msz(dtype);
 
@@ -4463,6 +4459,11 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, 
TCGv_i64 addr,
 addr = clean_data_tbi(s, addr);
 }
 
+/*
+ * For e.g. LD4, there are not enough arguments to pass all 4
+ * registers as pointers, so encode the regno into the data field.
+ * For consistency, do this even for LD1.
+ */
 desc = simd_desc(vsz, vsz, zt | desc);
 t_pg = tcg_temp_new_ptr();
 
@@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
  * accessible via the instruction encoding.
  */
 assert(fn != NULL);
-do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
+do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
 }
 
 static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
-- 
2.34.1

[PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3

2024-02-05 Thread Richard Henderson

When MTE3 is supported, the kernel maps
  PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC
to
  MTE_CTRL_TCF_ASYMM
and from there to
  SCTLR_EL1.TCF0 = 3

There is no error reported for setting ASYNC | SYNC when MTE3 is not
supported; the kernel simply selects the ASYNC behavior of TCF0=2.

Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_prctl.h | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/linux-user/aarch64/target_prctl.h 
b/linux-user/aarch64/target_prctl.h
index 5067e7d731..49bd16aa95 100644
--- a/linux-user/aarch64/target_prctl.h
+++ b/linux-user/aarch64/target_prctl.h
@@ -173,21 +173,22 @@ static abi_long 
do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
 env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
 
 if (cpu_isar_feature(aa64_mte, cpu)) {
-switch (arg2 & PR_MTE_TCF_MASK) {
-case PR_MTE_TCF_NONE:
-case PR_MTE_TCF_SYNC:
-case PR_MTE_TCF_ASYNC:
-break;
-default:
-return -EINVAL;
-}
-
 /*
  * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
- * Note that the syscall values are consistent with hw.
+ * Note that SYNC | ASYNC -> ASYMM with FEAT_MTE3,
+ * otherwise mte_update_sctlr_user chooses ASYNC.
  */
-env->cp15.sctlr_el[1] =
-deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
+unsigned tcf = 0;
+if (arg2 & PR_MTE_TCF_ASYNC) {
+if ((arg2 & PR_MTE_TCF_SYNC) && cpu_isar_feature(aa64_mte3, cpu)) {
+tcf = 3;
+} else {
+tcf = 2;
+}
+} else if (arg2 & PR_MTE_TCF_SYNC) {
+tcf = 1;
+}
+env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
 
 /*
  * Write PR_MTE_TAG to GCR_EL1[Exclude].
-- 
2.34.1

[PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks

2024-02-05 Thread Richard Henderson

The TBI and TCMA bits are located within mtedesc, not desc.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/sme_helper.c |  8 
 target/arm/tcg/sve_helper.c | 12 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 1ee2690ceb..904bfdac43 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -573,8 +573,8 @@ void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg,
 desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
 /* Perform gross MTE suppression early. */
-if (!tbi_check(desc, bit55) ||
-tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+if (!tbi_check(mtedesc, bit55) ||
+tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
 mtedesc = 0;
 }
 
@@ -750,8 +750,8 @@ void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, 
target_ulong addr,
 desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
 /* Perform gross MTE suppression early. */
-if (!tbi_check(desc, bit55) ||
-tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+if (!tbi_check(mtedesc, bit55) ||
+tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
 mtedesc = 0;
 }
 
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index bce4295d28..6853f58c19 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5800,8 +5800,8 @@ void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, 
target_ulong addr,
 desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
 /* Perform gross MTE suppression early. */
-if (!tbi_check(desc, bit55) ||
-tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+if (!tbi_check(mtedesc, bit55) ||
+tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
 mtedesc = 0;
 }
 
@@ -6156,8 +6156,8 @@ void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, 
target_ulong addr,
 desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
 /* Perform gross MTE suppression early. */
-if (!tbi_check(desc, bit55) ||
-tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+if (!tbi_check(mtedesc, bit55) ||
+tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
 mtedesc = 0;
 }
 
@@ -6410,8 +6410,8 @@ void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, 
target_ulong addr,
 desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
 /* Perform gross MTE suppression early. */
-if (!tbi_check(desc, bit55) ||
-tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+if (!tbi_check(mtedesc, bit55) ||
+tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
 mtedesc = 0;
 }
 
-- 
2.34.1

[PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1

2024-02-05 Thread Richard Henderson

When we added SVE_MTEDESC_SHIFT, we effectively limited the
maximum size of MTEDESC.  Adjust SIZEM1 to consume the remaining
bits (32 - 10 - 5 - 12 == 5).  Assert that the data to be stored
fits within the field (expecting 8 * 4 - 1 == 31, exact fit).

Signed-off-by: Richard Henderson 
---
 target/arm/internals.h | 2 +-
 target/arm/tcg/translate-sve.c | 7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index fc337fe40e..50bff44549 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI,   4, 2)
 FIELD(MTEDESC, TCMA,  6, 2)
 FIELD(MTEDESC, WRITE, 8, 1)
 FIELD(MTEDESC, ALIGN, 9, 3)
-FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12)  /* size - 1 */
+FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12)  /* size - 
1 */
 
 bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
 uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t 
ra);
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index f50a426c98..2628ac2840 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, 
TCGv_i64 addr,
 {
 unsigned vsz = vec_full_reg_size(s);
 TCGv_ptr t_pg;
+uint32_t sizem1;
 int desc = 0;
 
 assert(mte_n >= 1 && mte_n <= 4);
+sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
 if (s->mte_active[0]) {
-int msz = dtype_msz(dtype);
-
 desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
 desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
 desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
 desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
-desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
+desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
 desc <<= SVE_MTEDESC_SHIFT;
 } else {
 addr = clean_data_tbi(s, addr);
-- 
2.34.1

[PATCH v2 4/6] target/arm: Split out make_svemte_desc

2024-02-05 Thread Richard Henderson

Share code that creates mtedesc and embeds within simd_desc.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-a64.h |  2 ++
 target/arm/tcg/translate-sme.c | 15 +++
 target/arm/tcg/translate-sve.c | 47 ++
 3 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
index 96ba39b37e..7b811b8ac5 100644
--- a/target/arm/tcg/translate-a64.h
+++ b/target/arm/tcg/translate-a64.h
@@ -28,6 +28,8 @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int 
immn,
 bool sve_access_check(DisasContext *s);
 bool sme_enabled_check(DisasContext *s);
 bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+  uint32_t msz, bool is_write, uint32_t data);
 
 /* This function corresponds to CheckStreamingSVEEnabled. */
 static inline bool sme_sm_enabled_check(DisasContext *s)
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 8f0dfc884e..46c7fce8b4 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -206,7 +206,7 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
 
 TCGv_ptr t_za, t_pg;
 TCGv_i64 addr;
-int svl, desc = 0;
+uint32_t desc;
 bool be = s->be_data == MO_BE;
 bool mte = s->mte_active[0];
 
@@ -224,18 +224,11 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
 tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
 tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
 
-if (mte) {
-desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
-desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
-desc <<= SVE_MTEDESC_SHIFT;
-} else {
+if (!mte) {
 addr = clean_data_tbi(s, addr);
 }
-svl = streaming_vec_reg_size(s);
-desc = simd_desc(svl, svl, desc);
+
+desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0);
 
 fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr,
   tcg_constant_i32(desc));
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 2628ac2840..8868aae5ac 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4437,18 +4437,18 @@ static const uint8_t dtype_esz[16] = {
 3, 2, 1, 3
 };
 
-static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
-   int dtype, uint32_t mte_n, bool is_write,
-   gen_helper_gvec_mem *fn)
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+  uint32_t msz, bool is_write, uint32_t data)
 {
-unsigned vsz = vec_full_reg_size(s);
-TCGv_ptr t_pg;
 uint32_t sizem1;
-int desc = 0;
+uint32_t desc = 0;
 
-assert(mte_n >= 1 && mte_n <= 4);
-sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+/* Assert all of the data fits, with or without MTE enabled. */
+assert(nregs >= 1 && nregs <= 4);
+sizem1 = (nregs << msz) - 1;
 assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
+assert(data < 1u << SVE_MTEDESC_SHIFT);
+
 if (s->mte_active[0]) {
 desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
 desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
@@ -4456,7 +4456,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, 
TCGv_i64 addr,
 desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
 desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
 desc <<= SVE_MTEDESC_SHIFT;
-} else {
+}
+return simd_desc(vsz, vsz, desc | data);
+}
+
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
+   int dtype, uint32_t nregs, bool is_write,
+   gen_helper_gvec_mem *fn)
+{
+TCGv_ptr t_pg;
+uint32_t desc;
+
+if (!s->mte_active[0]) {
 addr = clean_data_tbi(s, addr);
 }
 
@@ -4465,7 +4476,8 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, 
TCGv_i64 addr,
  * registers as pointers, so encode the regno into the data field.
  * For consistency, do this even for LD1.
  */
-desc = simd_desc(vsz, vsz, zt | desc);
+desc = make_svemte_desc(s, vec_full_reg_size(s), nregs,
+dtype_msz(dtype), is_write, zt);
 t_pg = tcg_temp_new_ptr();
 
 tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
@@ -5225,25 +5237,16 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, 
int zm,
int scale, TCGv_i64 scalar, int msz, bool is_write,
gen_helper_gvec_mem_scatter *fn)
 {
-unsigned vsz = vec_full_reg_size(s);
 TCGv_ptr t_zm =

Re: [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups

2024-02-05 Thread Peter Xu

queued.

-- 
Peter Xu

[PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro

2024-02-05 Thread Richard Henderson

These functions "use the standard load helpers", but
fail to clean_data_tbi or populate mtedesc.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-sve.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 8868aae5ac..fb3bcb47f6 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4861,8 +4861,13 @@ static void do_ldrq(DisasContext *s, int zt, int pg, 
TCGv_i64 addr, int dtype)
 unsigned vsz = vec_full_reg_size(s);
 TCGv_ptr t_pg;
 int poff;
+uint32_t desc;
 
 /* Load the first quadword using the normal predicated load helpers.  */
+if (!s->mte_active[0]) {
+addr = clean_data_tbi(s, addr);
+}
+
 poff = pred_full_reg_offset(s, pg);
 if (vsz > 16) {
 /*
@@ -4886,7 +4891,8 @@ static void do_ldrq(DisasContext *s, int zt, int pg, 
TCGv_i64 addr, int dtype)
 
 gen_helper_gvec_mem *fn
 = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt)));
+desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt);
+fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
 /* Replicate that first quadword.  */
 if (vsz > 16) {
@@ -4929,6 +4935,7 @@ static void do_ldro(DisasContext *s, int zt, int pg, 
TCGv_i64 addr, int dtype)
 unsigned vsz_r32;
 TCGv_ptr t_pg;
 int poff, doff;
+uint32_t desc;
 
 if (vsz < 32) {
 /*
@@ -4941,6 +4948,9 @@ static void do_ldro(DisasContext *s, int zt, int pg, 
TCGv_i64 addr, int dtype)
 }
 
 /* Load the first octaword using the normal predicated load helpers.  */
+if (!s->mte_active[0]) {
+addr = clean_data_tbi(s, addr);
+}
 
 poff = pred_full_reg_offset(s, pg);
 if (vsz > 32) {
@@ -4965,7 +4975,8 @@ static void do_ldro(DisasContext *s, int zt, int pg, 
TCGv_i64 addr, int dtype)
 
 gen_helper_gvec_mem *fn
 = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt)));
+desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt);
+fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
 /*
  * Replicate that first octaword.
-- 
2.34.1

[PATCH v2 0/6] target/arm: assorted mte fixes

2024-02-05 Thread Richard Henderson

The first patch is unchanged from

Supercedes: <20240131003557.176486-1-richard.hender...@linaro.org>

while the remaining patches replace

Supercedes: <20240205023948.25476-1-richard.hender...@linaro.org>

While digging through Gustavo's test case, wondering why it
should be failing at all, I finally noticed that we weren't
overflowing MTEDESC.SIZEM1, but underflowing (-1).  Oops.

But I did find a few other points by inspection where we
weren't properly handling or supplying MTEDESC.


r~


Richard Henderson (6):
  linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  target/arm: Fix nregs computation in do_ld_zpa
  target/arm: Adjust and validate mtedesc sizem1
  target/arm: Split out make_svemte_desc
  target/arm: Handle mte in do_ldrq, do_ldro
  target/arm: Fix SVE/SME gross MTE suppression checks

 linux-user/aarch64/target_prctl.h | 25 +-
 target/arm/internals.h|  2 +-
 target/arm/tcg/translate-a64.h|  2 +
 target/arm/tcg/sme_helper.c   |  8 ++--
 target/arm/tcg/sve_helper.c   | 12 ++---
 target/arm/tcg/translate-sme.c| 15 ++
 target/arm/tcg/translate-sve.c| 80 ++-
 7 files changed, 78 insertions(+), 66 deletions(-)

-- 
2.34.1

Re: [PATCH] vdpa-dev: Fix initialisation order to restore VDUSE compatibility

2024-02-05 Thread Jason Wang

On Mon, Feb 5, 2024 at 6:51 PM Stefano Garzarella  wrote:
>
> On Fri, Feb 02, 2024 at 02:25:21PM +0100, Kevin Wolf wrote:
> >VDUSE requires that virtqueues are first enabled before the DRIVER_OK
> >status flag is set; with the current API of the kernel module, it is
> >impossible to enable the opposite order in our block export code because
> >userspace is not notified when a virtqueue is enabled.

Did this mean virtio-blk will enable a virtqueue after DRIVER_OK? Sepc
is not clear about this and that's why we introduce
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.

>
> Yeah, IMHO the VDUSE protocol is missing a VDUSE_SET_VQ_READY message,

I think you meant when VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is
negotiated. If this is truth, it seems a little more complicated, for
example the get_backend_features needs to be forward to the userspace?
This seems suboptimal to implement this in the spec first and then we
can leverage the features. Or we can have another parameter for the
ioctl that creates the vduse device.

> I'll start another thread about that, but in the meantime I agree that
> we should fix QEMU since we need to work properly with old kernels as
> well.
>
> >
> >This requirement also mathces the normal initialisation order as done by
> >the generic vhost code in QEMU. However, commit 6c482547 accidentally
> >changed the order for vdpa-dev and broke access to VDUSE devices with
> >this.
> >
> >This changes vdpa-dev to use the normal order again and use the standard
> >vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
> >used with vdpa-dev again after this fix.
>
> I like this approach and the patch LGTM, but I'm a bit worried about
> this function in hw/net/vhost_net.c:
>
>  int vhost_set_vring_enable(NetClientState *nc, int enable)
>  {
>  VHostNetState *net = get_vhost_net(nc);
>  const VhostOps *vhost_ops = net->dev.vhost_ops;
>
>  nc->vring_enable = enable;
>
>  if (vhost_ops && vhost_ops->vhost_set_vring_enable) {
>  return vhost_ops->vhost_set_vring_enable(>dev, enable);
>  }
>
>  return 0;
>  }
>
> @Eugenio, @Jason, should we change some things there if vhost-vdpa
> implements the vhost_set_vring_enable callback?

Eugenio may know more, I remember we need to enable cvq first for
shadow virtqueue to restore some states.

>
> Do you remember why we didn't implement it from the beginning?

It seems the vrings parameter is introduced after vhost-vdpa is implemented.

Thanks

>
> Thanks,
> Stefano
>
> >
> >Cc: qemu-sta...@nongnu.org
> >Fixes: 6c4825476a4351530bcac17abab72295b75ffe98
> >Signed-off-by: Kevin Wolf 
> >---
> > hw/virtio/vdpa-dev.c   |  5 +
> > hw/virtio/vhost-vdpa.c | 17 +
> > 2 files changed, 18 insertions(+), 4 deletions(-)
> >
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index eb9ecea83b..13e87f06f6 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -253,14 +253,11 @@ static int vhost_vdpa_device_start(VirtIODevice *vdev, 
> >Error **errp)
> >
> > s->dev.acked_features = vdev->guest_features;
> >
> >-ret = vhost_dev_start(>dev, vdev, false);
> >+ret = vhost_dev_start(>dev, vdev, true);
> > if (ret < 0) {
> > error_setg_errno(errp, -ret, "Error starting vhost");
> > goto err_guest_notifiers;
> > }
> >-for (i = 0; i < s->dev.nvqs; ++i) {
> >-vhost_vdpa_set_vring_ready(>vdpa, i);
> >-}
> > s->started = true;
> >
> > /*
> >diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >index 3a43beb312..c4574d56c5 100644
> >--- a/hw/virtio/vhost-vdpa.c
> >+++ b/hw/virtio/vhost-vdpa.c
> >@@ -904,6 +904,22 @@ int vhost_vdpa_set_vring_ready(struct vhost_vdpa *v, 
> >unsigned idx)
> > return r;
> > }
> >
> >+static int vhost_vdpa_set_vring_enable(struct vhost_dev *dev, int enable)
> >+{
> >+struct vhost_vdpa *v = dev->opaque;
> >+unsigned int i;
> >+int ret;
> >+
> >+for (i = 0; i < dev->nvqs; ++i) {
> >+ret = vhost_vdpa_set_vring_ready(v, i);
> >+if (ret < 0) {
> >+return ret;
> >+}
> >+}
> >+
> >+return 0;
> >+}
> >+
> > static int vhost_vdpa_set_config_call(struct vhost_dev *dev,
> >int fd)
> > {
> >@@ -1524,6 +1540,7 @@ const VhostOps vdpa_ops = {
> > .vhost_set_features = vhost_vdpa_set_features,
> > .vhost_reset_device = vhost_vdpa_reset_device,
> > .vhost_get_vq_index = vhost_vdpa_get_vq_index,
> >+.vhost_set_vring_enable = vhost_vdpa_set_vring_enable,
> > .vhost_get_config  = vhost_vdpa_get_config,
> > .vhost_set_config = vhost_vdpa_set_config,
> > .vhost_requires_shm_log = NULL,
> >--
> >2.43.0
> >
>

Re: [PATCH 0/2] migration: Fix return-path thread exit

2024-02-05 Thread Peter Xu

On Mon, Feb 05, 2024 at 10:32:33AM +, Daniel P. Berrangé wrote:
> On Fri, Feb 02, 2024 at 05:53:39PM +0800, Peter Xu wrote:
> > On Thu, Feb 01, 2024 at 07:48:51PM +0100, Cédric Le Goater wrote:
> > > Hello,
> > 
> > Hi, Cédric,
> > 
> > Thanks for the patches.
> > 
> > > 
> > > Today, close_return_path_on_source() can perform a shutdown to exit
> > > the return-path thread if an error occured. However, migrate_fd_cleanup()
> > > does cleanups too early and the shutdown in close_return_path_on_source()
> > > fails, leaving the source and destination waiting for an event to occur.
> > > 
> > > This little series tries to fix that. Comments welcome !  
> > 
> > One thing I do agree is that relying on qemu_file_get_error(to_dst_file) in
> > close_return_path_on_source() is weird: IMHO we have better way to detect
> > "whether the migration has error" now, which is migrate_has_error().
> > 
> > For this specific issue, I think one long standing issue that might be
> > relevant is we have two QEMUFile (from_dst_file, to_dst_file) that share
> > the same QIOChannel now.  Logically the two QEMUFile should be able to be
> > managed separately, say, close() of to_dst_file shouldn't affect the other.
> > 
> > However I don't think it's the case now, as qemu_fclose(to_dst_file) will
> > do qio_channel_close() already, which means there will be a side effect to
> > the other QEMUFile that its backing IOC is already closed.
> > 
> > Is this the issue we're facing?  IOW, the close() of to_dst_file will not
> > properly kick the other thread who is blocked at reading from_dst_file,
> > while the shutdown() will kick it out?
> > 
> > If so, not sure whether we can somehow relay the real qio_channel_close()
> > to until the last user releases it? IOW, conditionally close() the channel
> > in qio_channel_finalize(), if the channel is still open?  Would that make
> > sense?
> 
> IMHO the problem described above is a result of the design mistake of
> having 2 separate QEMUFile instances for what is ultimately the same
> channel. This was a convenient approach to take originally, but it has
> likely outlived its purpose.
> 
> In the ideal world IMHO, QEMUFile would not exist at all, and we would
> have a QIOChannelCached that adds the read/write buffering above the
> base QIOChannel.

We have that in the TODO wiki page for a long time, I'll update it
slightly.

https://wiki.qemu.org/ToDo/LiveMigration#Rewrite_QEMUFile_for_migration

But yeah that might be too big a hammer to solve this specific issue.
AFAIU Fabiano is looking into that direction, but I assume it should still
be a long term thing.

> 
> That's doable, but bigger than a quick fix. A natural stepping stone
> to get there though is to move from 2 QEMUFile objs down to 1 QEMUFile,
> which might be more practical as a quick fix.

Agree. However would this still be quite some change?

We still have a lot of references on the four qemufiles (to/from_dst_file,
to/from_src_file), at least that'll need a replacement; I didn't yet
further check whether all places can be done with a direct replacement of
such change, some tweaks may be needed here and there, but shouldn't be
major.

Meanwhile IIUC it'll also need a major rework on QEMUFile, allowing it to
be bi-directional?  We may need to duplicate the cache layer, IIUC, one for
each direction IOs.

-- 
Peter Xu

Re: [PATCH] tests/cdrom-test: Add cdrom test for LoongArch virt machine

2024-02-05 Thread maobibo


Hi Thomas,

On 2024/2/5 下午3:47, Thomas Huth wrote:

On 05/02/2024 03.13, Bibo Mao wrote:

The cdrom test skips to execute on LoongArch system with command
"make check"


Are you sure the test is marked with "skip"? ... it should at least test 
with the "none" machine...?


I check again, cdrom test case passes to run with "none" machine. And 
the root cause is that xorriso rpm package is missing so it is skipped.


Regards
Bibo Mao


this patch enables cdrom test for LoongArch virt
machine platform.

With this patch, cdrom test passes to run on LoongArch virt
machine type.

Signed-off-by: Bibo Mao 
---
  tests/qtest/cdrom-test.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/tests/qtest/cdrom-test.c b/tests/qtest/cdrom-test.c
index 0945383789..c8b97d8d9a 100644
--- a/tests/qtest/cdrom-test.c
+++ b/tests/qtest/cdrom-test.c
@@ -271,6 +271,9 @@ int main(int argc, char **argv)
  const char *virtmachine[] = { "virt", NULL };
  add_cdrom_param_tests(virtmachine);
  }
+    } else if (g_str_equal(arch, "loongarch64")) {
+    const char *virtmachine[] = { "virt", NULL };
+    add_cdrom_param_tests(virtmachine);
  } else {
  const char *nonemachine[] = { "none", NULL };
  add_cdrom_param_tests(nonemachine);


Anyway, using the virt machine is certainly better than the "none" 
machine, so:

Acked-by: Thomas Huth

Re: [PATCH] tests/cdrom-test: Add cdrom test for LoongArch virt machine

2024-02-05 Thread maobibo


Hi Philippe,

On 2024/2/5 下午8:58, Philippe Mathieu-Daudé wrote:

Hi Bibo,

On 5/2/24 03:13, Bibo Mao wrote:

The cdrom test skips to execute on LoongArch system with command
"make check", this patch enables cdrom test for LoongArch virt
machine platform.

With this patch, cdrom test passes to run on LoongArch virt
machine type.

Signed-off-by: Bibo Mao 
---
  tests/qtest/cdrom-test.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/tests/qtest/cdrom-test.c b/tests/qtest/cdrom-test.c
index 0945383789..c8b97d8d9a 100644
--- a/tests/qtest/cdrom-test.c
+++ b/tests/qtest/cdrom-test.c
@@ -271,6 +271,9 @@ int main(int argc, char **argv)
  const char *virtmachine[] = { "virt", NULL };
  add_cdrom_param_tests(virtmachine);
  }
+    } else if (g_str_equal(arch, "loongarch64")) {
+    const char *virtmachine[] = { "virt", NULL };
+    add_cdrom_param_tests(virtmachine);


What is the default device used, virtio-blk-pci?


yes, it is. For virt machine type, the default type for block device is
virtio interface, and it is defined at function loongarch_class_init().
   mc->block_default_type = IF_VIRTIO

Regards
Bibo Mao

RE: [PATCH v0 2/2] aspeed: fix hardcode boot address 0

2024-02-05 Thread Jamin Lin

> -Original Message-
> From: Cédric Le Goater 
> Sent: Monday, February 5, 2024 9:34 PM
> To: Jamin Lin ; Peter Maydell
> ; Andrew Jeffery ;
> Joel Stanley ; open list:ASPEED BMCs
> ; open list:All patches CC here
> 
> Cc: Troy Lee 
> Subject: Re: [PATCH v0 2/2] aspeed: fix hardcode boot address 0
> 
> On 2/5/24 10:14, Jamin Lin wrote:
> > In the previous design of QEMU model for ASPEED SOCs, it set the boot
> > address at 0 which was the hardcode setting for ast10x0, ast2600,
> > ast2500 and ast2400.
> >
> > According to the design of ast2700, it has bootmcu which is used for
> > executing SPL and initialize DRAM, then, CPUs(cortex-a35) execute
> > u-boot, kernel and rofs. QEMU will only support CPU(coretax-a35) parts
> > and the boot address is "0x4" for ast2700.
> 
> On the previous SoC, the ASPEED_DEV_SPI_BOOT region is an alias, at 0x0, to
> the FMC CE0 region, mapped at 0x2000.
> 
> Is 0x4 (or 0x4000 ?) the address for FMC CE0 region on the
> ast2700 ? or an alias ?
> 
It is "0x4 "(64bits address). CPU is armv8 cortex-a35 which is 64 bits 
CPU. 
The dram base address is "0x4 ".
The SPL base address is "0x1 ".
FMC_CS0 region mapped at "0x1 " address.

> What is the cortex-a35 reset address ?
> 
> It would help to also introduce a basic skeleton of the ast2700 SoC.
> 
AST2700
Primary Service Processor:
Embedded quad-core ARM Cortex A35 64-bit RISC CPU
Maximum running frequency: 1.6GHZ
Support: MMU, FPU, NEON, trust-zone, GIC-500 controller and so on.

BootMCU:
Ibex-riscv 32bits riscv.

Boot flow
ROM Code -> BootMCU(SPL) -> CPU Cortex A35(U-boot-> kernel -> rofs)

> Anyhow, this change makes sense. Could you please respin and also remove
> ASPEED_SOC_SPI_BOOT_ADDR. ?
> 
Okay, will remove it.

> Thanks,
> 
> C.
> 
> > Therefore, fixed hardcode boot address 0.
> >
> > Signed-off-by: Troy Lee 
> > Signed-off-by: Jamin Lin 
> > ---
> >   hw/arm/aspeed.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index
> > 218b81298e..82a92e8142 100644
> > --- a/hw/arm/aspeed.c
> > +++ b/hw/arm/aspeed.c
> > @@ -289,12 +289,14 @@ static void
> aspeed_install_boot_rom(AspeedMachineState *bmc, BlockBackend *blk,
> >   uint64_t rom_size)
> >   {
> >   AspeedSoCState *soc = bmc->soc;
> > +AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(soc);
> >
> >   memory_region_init_rom(>boot_rom, NULL,
> "aspeed.boot_rom", rom_size,
> >  _abort);
> >   memory_region_add_subregion_overlap(>spi_boot_container,
> 0,
> >   >boot_rom, 1);
> > -write_boot_rom(blk, ASPEED_SOC_SPI_BOOT_ADDR, rom_size,
> _abort);
> > +write_boot_rom(blk, sc->memmap[ASPEED_DEV_SPI_BOOT],
> > +   rom_size, _abort);
> >   }
> >
> >   void aspeed_board_init_flashes(AspeedSMCState *s, const char
> > *flashtype,

[PATCH V2] loongarch: Change the UEFI loading mode to loongarch

2024-02-05 Thread Xianglai Li

The UEFI loading mode in loongarch is very different
from that in other architectures:loongarch's UEFI code
is in rom, while other architectures' UEFI code is in flash.

loongarch UEFI can be loaded as follows:
-machine virt,pflash=pflash0-format
-bios ./QEMU_EFI.fd

Other architectures load UEFI using the following methods:
-machine virt,pflash0=pflash0-format,pflash1=pflash1-format

loongarch's UEFI loading method makes qemu and libvirt incompatible
when using NVRAM, and the cost of loongarch's current loading method
far outweighs the benefits, so we decided to use the same UEFI loading
scheme as other architectures.

Cc: Andrea Bolognani 
Cc: maob...@loongson.cn
Cc: Philippe Mathieu-Daudé 
Cc: Song Gao 
Cc: zhaotian...@loongson.cn
Signed-off-by: Xianglai Li 
---
 hw/loongarch/acpi-build.c   |  29 +--
 hw/loongarch/virt.c | 101 ++--
 include/hw/loongarch/virt.h |  10 ++--
 3 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index a1c4198741..6c75f216ea 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -314,16 +314,39 @@ static void build_pci_device_aml(Aml *scope, 
LoongArchMachineState *lams)
 static void build_flash_aml(Aml *scope, LoongArchMachineState *lams)
 {
 Aml *dev, *crs;
+MemoryRegion *flash_mem;
 
-hwaddr flash_base = VIRT_FLASH_BASE;
-hwaddr flash_size = VIRT_FLASH_SIZE;
+hwaddr flash0_base;
+hwaddr flash0_size;
+
+hwaddr flash1_base;
+hwaddr flash1_size;
+
+flash_mem = pflash_cfi01_get_memory(lams->flash[0]);
+flash0_base = flash_mem->addr;
+flash0_size = flash_mem->size;
+
+flash_mem = pflash_cfi01_get_memory(lams->flash[1]);
+flash1_base = flash_mem->addr;
+flash1_size = flash_mem->size;
 
 dev = aml_device("FLS0");
 aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015")));
 aml_append(dev, aml_name_decl("_UID", aml_int(0)));
 
 crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(flash_base, flash_size, 
AML_READ_WRITE));
+aml_append(crs, aml_memory32_fixed(flash0_base, flash0_size,
+   AML_READ_WRITE));
+aml_append(dev, aml_name_decl("_CRS", crs));
+aml_append(scope, dev);
+
+dev = aml_device("FLS1");
+aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015")));
+aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+
+crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(flash1_base, flash1_size,
+   AML_READ_WRITE));
 aml_append(dev, aml_name_decl("_CRS", crs));
 aml_append(scope, dev);
 }
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 0ad7d8c887..a7b9199e70 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -54,7 +54,9 @@ struct loaderparams {
 const char *initrd_filename;
 };
 
-static void virt_flash_create(LoongArchMachineState *lams)
+static PFlashCFI01 *virt_flash_create1(LoongArchMachineState *lams,
+   const char *name,
+   const char *alias_prop_name)
 {
 DeviceState *dev = qdev_new(TYPE_PFLASH_CFI01);
 
@@ -66,45 +68,78 @@ static void virt_flash_create(LoongArchMachineState *lams)
 qdev_prop_set_uint16(dev, "id1", 0x18);
 qdev_prop_set_uint16(dev, "id2", 0x00);
 qdev_prop_set_uint16(dev, "id3", 0x00);
-qdev_prop_set_string(dev, "name", "virt.flash");
-object_property_add_child(OBJECT(lams), "virt.flash", OBJECT(dev));
-object_property_add_alias(OBJECT(lams), "pflash",
+qdev_prop_set_string(dev, "name", name);
+object_property_add_child(OBJECT(lams), name, OBJECT(dev));
+object_property_add_alias(OBJECT(lams), alias_prop_name,
   OBJECT(dev), "drive");
+return PFLASH_CFI01(dev);
+}
 
-lams->flash = PFLASH_CFI01(dev);
+static void virt_flash_create(LoongArchMachineState *lams)
+{
+lams->flash[0] = virt_flash_create1(lams, "virt.flash0", "pflash0");
+lams->flash[1] = virt_flash_create1(lams, "virt.flash1", "pflash1");
 }
 
-static void virt_flash_map(LoongArchMachineState *lams,
-   MemoryRegion *sysmem)
+static void virt_flash_map1(PFlashCFI01 *flash,
+hwaddr base, hwaddr size,
+MemoryRegion *sysmem)
 {
-PFlashCFI01 *flash = lams->flash;
 DeviceState *dev = DEVICE(flash);
-hwaddr base = VIRT_FLASH_BASE;
-hwaddr size = VIRT_FLASH_SIZE;
+BlockBackend *blk;
+hwaddr real_size = size;
+
+blk = pflash_cfi01_get_blk(flash);
+if (blk) {
+real_size = blk_getlength(blk);
+assert(real_size && real_size <= size);
+}
 
-assert(QEMU_IS_ALIGNED(size, VIRT_FLASH_SECTOR_SIZE));
-assert(size / VIRT_FLASH_SECTOR_SIZE <= UINT32_MAX);
+assert(QEMU_IS_ALIGNED(real_size, VIRT_FLASH_SECTOR_SIZE));
+assert(real_size /

RE: [PATCH v0 2/2] aspeed: fix hardcode boot address 0

2024-02-05 Thread Jamin Lin

> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Monday, February 5, 2024 9:20 PM
> To: Jamin Lin ; Cédric Le Goater ;
> Peter Maydell ; Andrew Jeffery
> ; Joel Stanley ; open
> list:ASPEED BMCs ; open list:All patches CC here
> 
> Cc: Troy Lee 
> Subject: Re: [PATCH v0 2/2] aspeed: fix hardcode boot address 0
> 
> Hi Jamin,
> 
> On 5/2/24 10:14, Jamin Lin via wrote:
> > In the previous design of QEMU model for ASPEED SOCs, it set the boot
> > address at 0 which was the hardcode setting for ast10x0, ast2600,
> > ast2500 and ast2400.
> >
> > According to the design of ast2700, it has bootmcu which is used for
> > executing SPL and initialize DRAM,
> 
> Out of curiosity, what architecture is this MCU?
MCU is riscv-ibex and its architecture is riscv-32.

> 
> > then, CPUs(cortex-a35)
> > execute u-boot, kernel and rofs. QEMU will only support
> > CPU(coretax-a35) parts and the boot address is "0x4" for ast2700.
> 
> OK, but I don't get how you get from here ...
> 
Our design make MCU execute SPL and copy u-boot image from SPI to DRAM at 
address 0x4 at SPL boot stage.
However, QEMU will only support to emulate CPU sides (coretex-a35) for ast2700, 
that was why we want to change the boot address at 0x4
And use the following start command by QEMU.

./qemu-system-aarch64 -M ast2750-evb -nographic -m 8G \
 -device loader,addr=0x4,file=${IMGDIR}/u-boot-nodtb.bin,force-raw=on \
 -device loader,addr=$((0x4 + 
${UBOOT_SIZE})),file=${IMGDIR}/u-boot.dtb,force-raw=on \
 ---
 ---

By the way, I will send a new patch series to support ast2700 in two weeks and 
We set memory map for ast2700 as following.

static const hwaddr aspeed_soc_ast2700_memmap[] = {
[ASPEED_DEV_SPI_BOOT]  =  0x4,
[ASPEED_DEV_SRAM]  =  0x1000,

Jamin
> > Therefore, fixed hardcode boot address 0.
> 
> ... to here.
> 
> > Signed-off-by: Troy Lee 
> > Signed-off-by: Jamin Lin 
> > ---
> >   hw/arm/aspeed.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index
> > 218b81298e..82a92e8142 100644
> > --- a/hw/arm/aspeed.c
> > +++ b/hw/arm/aspeed.c
> > @@ -289,12 +289,14 @@ static void
> aspeed_install_boot_rom(AspeedMachineState *bmc, BlockBackend *blk,
> >   uint64_t rom_size)
> >   {
> >   AspeedSoCState *soc = bmc->soc;
> > +AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(soc);
> >
> >   memory_region_init_rom(>boot_rom, NULL,
> "aspeed.boot_rom", rom_size,
> >  _abort);
> >   memory_region_add_subregion_overlap(>spi_boot_container,
> 0,
> >   >boot_rom, 1);
> > -write_boot_rom(blk, ASPEED_SOC_SPI_BOOT_ADDR, rom_size,
> _abort);
> > +write_boot_rom(blk, sc->memmap[ASPEED_DEV_SPI_BOOT],
> > +   rom_size, _abort);
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> >   }
> >
> >   void aspeed_board_init_flashes(AspeedSMCState *s, const char
> > *flashtype,

Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init

2024-02-05 Thread Si-Wei Liu


Hi Eugenio,

I thought this new code looks good to me and the original issue I saw 
with x-svq=on should be gone. However, after rebase my tree on top of 
this, there's a new failure I found around setting up guest mappings at 
early boot, please see attached the specific QEMU config and 
corresponding event traces. Haven't checked into the detail yet, 
thinking you would need to be aware of ahead.


Regards,
-Siwei

On 2/1/2024 10:09 AM, Eugenio Pérez wrote:

As we are moving to keep the mapping through all the vdpa device life
instead of resetting it at VirtIO reset, we need to move all its
dependencies to the initialization too.  In particular devices with
x-svq=on need a valid iova_tree from the beginning.

Simplify the code also consolidating the two creation points: the first
data vq in case of SVQ active and CVQ start in case only CVQ uses it.

Suggested-by: Si-Wei Liu 
Signed-off-by: Eugenio Pérez 
---
  include/hw/virtio/vhost-vdpa.h | 16 ++-
  net/vhost-vdpa.c   | 36 +++---
  2 files changed, 18 insertions(+), 34 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 03ed2f2be3..ad754eb803 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -37,7 +37,21 @@ typedef struct vhost_vdpa_shared {
  struct vhost_vdpa_iova_range iova_range;
  QLIST_HEAD(, vdpa_iommu) iommu_list;
  
-/* IOVA mapping used by the Shadow Virtqueue */

+/*
+ * IOVA mapping used by the Shadow Virtqueue
+ *
+ * It is shared among all ASID for simplicity, whether CVQ shares ASID with
+ * guest or not:
+ * - Memory listener need access to guest's memory addresses allocated in
+ *   the IOVA tree.
+ * - There should be plenty of IOVA address space for both ASID not to
+ *   worry about collisions between them.  Guest's translations are still
+ *   validated with virtio virtqueue_pop so there is no risk for the guest
+ *   to access memory that it shouldn't.
+ *
+ * To allocate a iova tree per ASID is doable but it complicates the code
+ * and it is not worth it for the moment.
+ */
  VhostIOVATree *iova_tree;
  
  /* Copy of backend features */

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index cc589dd148..57edcf34d0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -232,6 +232,7 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
  return;
  }
  qemu_close(s->vhost_vdpa.shared->device_fd);
+g_clear_pointer(>vhost_vdpa.shared->iova_tree, vhost_iova_tree_delete);
  g_free(s->vhost_vdpa.shared);
  }
  
@@ -329,16 +330,8 @@ static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
  
  static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)

  {
-struct vhost_vdpa *v = >vhost_vdpa;
-
  migration_add_notifier(>migration_state,
 vdpa_net_migration_state_notifier);
-
-/* iova_tree may be initialized by vhost_vdpa_net_load_setup */
-if (v->shadow_vqs_enabled && !v->shared->iova_tree) {
-v->shared->iova_tree = vhost_iova_tree_new(v->shared->iova_range.first,
-   v->shared->iova_range.last);
-}
  }
  
  static int vhost_vdpa_net_data_start(NetClientState *nc)

@@ -383,19 +376,12 @@ static int vhost_vdpa_net_data_load(NetClientState *nc)
  static void vhost_vdpa_net_client_stop(NetClientState *nc)
  {
  VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
-struct vhost_dev *dev;
  
  assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
  
  if (s->vhost_vdpa.index == 0) {

  migration_remove_notifier(>migration_state);
  }
-
-dev = s->vhost_vdpa.dev;
-if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
-g_clear_pointer(>vhost_vdpa.shared->iova_tree,
-vhost_iova_tree_delete);
-}
  }
  
  static NetClientInfo net_vhost_vdpa_info = {

@@ -557,24 +543,6 @@ out:
  return 0;
  }
  
-/*

- * If other vhost_vdpa already have an iova_tree, reuse it for simplicity,
- * whether CVQ shares ASID with guest or not, because:
- * - Memory listener need access to guest's memory addresses allocated in
- *   the IOVA tree.
- * - There should be plenty of IOVA address space for both ASID not to
- *   worry about collisions between them.  Guest's translations are still
- *   validated with virtio virtqueue_pop so there is no risk for the guest
- *   to access memory that it shouldn't.
- *
- * To allocate a iova tree per ASID is doable but it complicates the code
- * and it is not worth it for the moment.
- */
-if (!v->shared->iova_tree) {
-v->shared->iova_tree = vhost_iova_tree_new(v->shared->iova_range.first,
-   v->shared->iova_range.last);
-}
-
  r =

[PATCH v2 1/3] Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it

2024-02-05 Thread Taylor Simpson

Rather than reading P0 from the env, pass it explicitly

Signed-off-by: Taylor Simpson 
---
 target/hexagon/macros.h  |  2 +-
 target/hexagon/hex_common.py | 10 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 1376d6ccc1..9c700ce8ef 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -358,7 +358,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int 
shift)
 #endif
 #define fREAD_PC() (PC)
 
-#define fREAD_P0() (env->pred[0])
+#define fREAD_P0() (P0)
 
 #define fCHECK_PCALIGN(A)
 
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 195620c7ec..2dbd0ea737 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -197,6 +197,10 @@ def get_tagimms():
 return dict(zip(tags, list(map(compute_tag_immediates, tags
 
 
+def need_p0(tag):
+return "A_IMPLICIT_READS_P0" in attribdict[tag]
+
+
 def need_slot(tag):
 if (
 "A_CVI_SCATTER" not in attribdict[tag]
@@ -1118,6 +1122,12 @@ def helper_args(tag, regs, imms):
 "tcg_constant_tl(ctx->next_PC)",
 "target_ulong next_PC"
 ))
+if need_p0(tag):
+args.append(HelperArg(
+"i32",
+"hex_pred[0]",
+"uint32_t P0"
+))
 if need_slot(tag):
 args.append(HelperArg(
 "i32",
-- 
2.34.1

[PATCH v2 2/3] Hexagon (target/hexagon) Pass SP explicitly to helpers that need it

2024-02-05 Thread Taylor Simpson

Rather than reading SP from the env, pass it explicitly

Signed-off-by: Taylor Simpson 
---
 target/hexagon/macros.h  |  2 +-
 target/hexagon/attribs_def.h.inc |  1 +
 target/hexagon/hex_common.py | 11 +++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 9c700ce8ef..4b8665a9da 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -343,7 +343,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int 
shift)
 
 #define fREAD_LR() (env->gpr[HEX_REG_LR])
 
-#define fREAD_SP() (env->gpr[HEX_REG_SP])
+#define fREAD_SP() (SP)
 #define fREAD_LC0 (env->gpr[HEX_REG_LC0])
 #define fREAD_LC1 (env->gpr[HEX_REG_LC1])
 #define fREAD_SA0 (env->gpr[HEX_REG_SA0])
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 87942d46f4..8949ee09cf 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -117,6 +117,7 @@ DEF_ATTRIB(IMPLICIT_READS_P1, "Reads the P1 register", "", 
"")
 DEF_ATTRIB(IMPLICIT_READS_P2, "Reads the P2 register", "", "")
 DEF_ATTRIB(IMPLICIT_READS_P3, "Reads the P3 register", "", "")
 DEF_ATTRIB(IMPLICIT_WRITES_USR, "May write USR", "", "")
+DEF_ATTRIB(IMPLICIT_READS_SP, "Reads the SP register", "", "")
 DEF_ATTRIB(COMMUTES, "The operation is communitive", "", "")
 DEF_ATTRIB(DEALLOCRET, "dealloc_return", "", "")
 DEF_ATTRIB(DEALLOCFRAME, "deallocframe", "", "")
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 2dbd0ea737..eb28aa98fe 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -101,6 +101,7 @@ def calculate_attribs():
 add_qemu_macro_attrib('fLSBNEW1', 'A_IMPLICIT_READS_P1')
 add_qemu_macro_attrib('fLSBNEW1NOT', 'A_IMPLICIT_READS_P1')
 add_qemu_macro_attrib('fREAD_P3', 'A_IMPLICIT_READS_P3')
+add_qemu_macro_attrib('fREAD_SP', 'A_IMPLICIT_READS_SP')
 
 # Recurse down macros, find attributes from sub-macros
 macroValues = list(macros.values())
@@ -201,6 +202,10 @@ def need_p0(tag):
 return "A_IMPLICIT_READS_P0" in attribdict[tag]
 
 
+def need_sp(tag):
+return "A_IMPLICIT_READS_SP" in attribdict[tag]
+
+
 def need_slot(tag):
 if (
 "A_CVI_SCATTER" not in attribdict[tag]
@@ -1128,6 +1133,12 @@ def helper_args(tag, regs, imms):
 "hex_pred[0]",
 "uint32_t P0"
 ))
+if need_sp(tag):
+args.append(HelperArg(
+"i32",
+"hex_gpr[HEX_REG_SP]",
+"uint32_t SP"
+))
 if need_slot(tag):
 args.append(HelperArg(
 "i32",
-- 
2.34.1

[PATCH v2 3/3] Hexagon (target/hexagon) Only pass env to generated helper when needed

2024-02-05 Thread Taylor Simpson

Currently, we pass env to every generated helper.  When the semantics of
the instruction only depend on the arguments, this is unnecessary and
adds extra overhead to the helper call.

We add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get
the ptr to env.

The A2_nop and SA1_setin1 instructions end up with no arguments.  This
results in a "old-style function definition" error from the compiler, so
we write overrides for them.

With this change, the number of helpers with env argument is
idef-parser enabled:329 total, 23 with env
idef-parser disabled:   1543 total, 550 with env

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h|  3 +++
 target/hexagon/gen_helper_protos.py | 10 +-
 target/hexagon/hex_common.py| 23 ++-
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 1c4391b415..e9ac2e3fe0 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1369,3 +1369,6 @@
 gen_helper_raise_exception(tcg_env, excp); \
 } while (0)
 #endif
+
+#define fGEN_TCG_A2_nop(SHORTCODE) do { } while (0)
+#define fGEN_TCG_SA1_setin1(SHORTCODE) tcg_gen_movi_tl(RdV, -1)
diff --git a/target/hexagon/gen_helper_protos.py 
b/target/hexagon/gen_helper_protos.py
index c82b0f54e4..6e98429752 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -40,7 +40,15 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
 declared.append(arg.proto_arg)
 
 arguments = ", ".join(declared)
-f.write(f"DEF_HELPER_{len(declared) - 1}({tag}, {arguments})\n")
+
+## Add the TCG_CALL_NO_RWG_SE flag to helpers that don't take the env
+## argument and aren't HVX instructions.  Since HVX instructions take
+## pointers to their arguments, they will have side effects.
+if hex_common.need_env(tag) or hex_common.is_hvx_insn(tag):
+f.write(f"DEF_HELPER_{len(declared) - 1}({tag}, {arguments})\n")
+else:
+f.write(f"DEF_HELPER_FLAGS_{len(declared) - 1}({tag}, "
+f"TCG_CALL_NO_RWG_SE, {arguments})\n")
 
 
 def main():
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index eb28aa98fe..ee08195aba 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -206,6 +206,18 @@ def need_sp(tag):
 return "A_IMPLICIT_READS_SP" in attribdict[tag]
 
 
+def is_hvx_insn(tag):
+return "A_CVI" in attribdict[tag]
+
+
+def need_env(tag):
+return ("A_STORE" in attribdict[tag] or
+"A_LOAD" in attribdict[tag] or
+"A_CVI_GATHER" in attribdict[tag] or
+"A_CVI_SCATTER" in attribdict[tag] or
+"A_IMPLICIT_WRITES_USR" in attribdict[tag])
+
+
 def need_slot(tag):
 if (
 "A_CVI_SCATTER" not in attribdict[tag]
@@ -1069,11 +1081,12 @@ def helper_args(tag, regs, imms):
 args = []
 
 ## First argument is the CPU state
-args.append(HelperArg(
-"env",
-"tcg_env",
-"CPUHexagonState *env"
-))
+if need_env(tag):
+args.append(HelperArg(
+"env",
+"tcg_env",
+"CPUHexagonState *env"
+))
 
 ## For predicated instructions, we pass in the destination register
 if is_predicated(tag):
-- 
2.34.1

[PATCH v2 0/3] Hexagon (target/hexagon) Only pass env to generated helper when needed

2024-02-05 Thread Taylor Simpson

Currently, we pass env to every generated helper.  When the semantics of
the instruction only depend on the arguments, this is unnecessary and
adds extra overhead to the helper call.

 Changes in v2 
- Separate patches to pass P0 and SP explicitly to helpers that need it
- Add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that
  don't get ptr to env

Taylor Simpson (3):
  Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it
  Hexagon (target/hexagon) Pass SP explicitly to helpers that need it
  Hexagon (target/hexagon) Only pass env to generated helper when needed

 target/hexagon/gen_tcg.h|  3 ++
 target/hexagon/macros.h |  4 +--
 target/hexagon/attribs_def.h.inc|  1 +
 target/hexagon/gen_helper_protos.py | 10 ++-
 target/hexagon/hex_common.py| 44 +
 5 files changed, 54 insertions(+), 8 deletions(-)

-- 
2.34.1

[PATCH v3 0/4] make vm-build-freebsd fixes

2024-02-05 Thread Ilya Leoshkevich

v2: https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00890.html
v2 -> v3: Structure the meson check similar to have_asan_fiber;
  Reduce the context size a little (Philippe).

v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05155.html
v1 -> v2: Link with libinotify instead of disabling the inotify
  support (Daniel).
  Use a bit more context lines in order to prevent the
  incorrect application of the test patch.

Hi,

I needed to verify that my qemu-user changes didn't break BSD, and
Daniel Berrange suggested vm-build-freebsd on IRC. I had several
problems with it, which this series resolves.

Best regards,
Ilya

Ilya Leoshkevich (4):
  tests/vm: Set UseDNS=no in the sshd configuration
  tests/vm/freebsd: Reload the sshd configuration
  test-util-filemonitor: Adapt to the FreeBSD inotify rename semantics
  meson: Link with libinotify on FreeBSD

 meson.build| 23 +++
 tests/unit/test-util-filemonitor.c |  8 
 tests/vm/basevm.py |  2 ++
 tests/vm/freebsd   |  1 +
 util/meson.build   |  6 +-
 5 files changed, 35 insertions(+), 5 deletions(-)

-- 
2.43.0

[PATCH v3 2/4] tests/vm/freebsd: Reload the sshd configuration

2024-02-05 Thread Ilya Leoshkevich

After console_sshd_config(), the SSH server needs to be nudged to pick
up the new configs. The scripts for the other BSD flavors already do
this with a reboot, but a simple reload is sufficient.

Reviewed-by: Thomas Huth 
Signed-off-by: Ilya Leoshkevich 
---
 tests/vm/freebsd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/vm/freebsd b/tests/vm/freebsd
index b581bd17fb7..1247f40a385 100755
--- a/tests/vm/freebsd
+++ b/tests/vm/freebsd
@@ -91,40 +91,41 @@ class FreeBSDVM(basevm.BaseVM):
 self.console_wait_send("Use an empty password", "\n")
 self.console_wait_send("Use a random password", "\n")
 self.console_wait("Enter password:")
 self.console_send("%s\n" % self._config["guest_pass"])
 self.console_wait("Enter password again:")
 self.console_send("%s\n" % self._config["guest_pass"])
 self.console_wait_send("Lock out",  "\n")
 self.console_wait_send("OK","yes\n")
 self.console_wait_send("Add another user",  "no\n")
 self.console_wait_send("~ #", "exit\n")
 
 # setup qemu user
 prompt = "$"
 self.console_ssh_init(prompt, self._config["guest_user"], 
self._config["guest_pass"])
 self.console_wait_send(prompt, "exit\n")
 
 # setup root user
 prompt = "root@freebsd:~ #"
 self.console_ssh_init(prompt, "root", self._config["root_pass"])
 self.console_sshd_config(prompt)
+self.console_wait_send(prompt, "service sshd reload\n")
 
 # setup virtio-blk #1 (tarfile)
 self.console_wait(prompt)
 self.console_send("echo 'chmod 666 /dev/vtbd1' >> /etc/rc.local\n")
 
 pkgs = self.get_qemu_packages_from_lcitool_json()
 self.print_step("Installing packages")
 self.ssh_root_check("pkg install -y %s\n" % " ".join(pkgs))
 
 # shutdown
 self.ssh_root(self.poweroff)
 self.wait()
 
 if os.path.exists(img):
 os.remove(img)
 os.rename(img_tmp, img)
 self.print_step("All done")
 
 if __name__ == "__main__":
 sys.exit(basevm.main(FreeBSDVM, config=FREEBSD_CONFIG))
-- 
2.43.0

[PATCH v3 4/4] meson: Link with libinotify on FreeBSD

2024-02-05 Thread Ilya Leoshkevich

make vm-build-freebsd fails with:

ld: error: undefined symbol: inotify_init1
>>> referenced by filemonitor-inotify.c:183 
(../src/util/filemonitor-inotify.c:183)
>>>   util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in 
archive libqemuutil.a

On FreeBSD the inotify functions are defined in libinotify.so. Add it
to the dependencies.

Signed-off-by: Ilya Leoshkevich 
---
 meson.build  | 23 +++
 util/meson.build |  6 +-
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index b5d6dc94a83..e5d6f2d057e 100644
--- a/meson.build
+++ b/meson.build
@@ -2367,60 +2367,72 @@ if rbd.found()
 endif
 if rdma.found()
   config_host_data.set('HAVE_IBV_ADVISE_MR',
cc.has_function('ibv_advise_mr',
dependencies: rdma,
prefix: '#include 
'))
 endif
 
 have_asan_fiber = false
 if get_option('sanitizers') and \
not cc.has_function('__sanitizer_start_switch_fiber',
  args: '-fsanitize=address',
  prefix: '#include ')
   warning('Missing ASAN due to missing fiber annotation interface')
   warning('Without code annotation, the report may be inferior.')
 else
   have_asan_fiber = true
 endif
 config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber)
 
+have_inotify_init = cc.has_header_symbol('sys/inotify.h', 'inotify_init')
+have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 'inotify_init1')
+inotify = not_found
+if (have_inotify_init or have_inotify_init1) and host_os == 'freebsd'
+  # libinotify-kqueue
+  inotify = cc.find_library('inotify')
+  if have_inotify_init
+have_inotify_init = inotify.found()
+  endif
+  if have_inotify_init1
+have_inotify_init1 = inotify.found()
+  endif
+endif
+config_host_data.set('CONFIG_INOTIFY', have_inotify_init)
+config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1)
+
 # has_header_symbol
 config_host_data.set('CONFIG_BLKZONED',
  cc.has_header_symbol('linux/blkzoned.h', 'BLKOPENZONE'))
 config_host_data.set('CONFIG_EPOLL_CREATE1',
  cc.has_header_symbol('sys/epoll.h', 'epoll_create1'))
 config_host_data.set('CONFIG_FALLOCATE_PUNCH_HOLE',
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_PUNCH_HOLE') and
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_KEEP_SIZE'))
 config_host_data.set('CONFIG_FALLOCATE_ZERO_RANGE',
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_ZERO_RANGE'))
 config_host_data.set('CONFIG_FIEMAP',
  cc.has_header('linux/fiemap.h') and
  cc.has_header_symbol('linux/fs.h', 'FS_IOC_FIEMAP'))
 config_host_data.set('CONFIG_GETRANDOM',
  cc.has_function('getrandom') and
  cc.has_header_symbol('sys/random.h', 'GRND_NONBLOCK'))
-config_host_data.set('CONFIG_INOTIFY',
- cc.has_header_symbol('sys/inotify.h', 'inotify_init'))
-config_host_data.set('CONFIG_INOTIFY1',
- cc.has_header_symbol('sys/inotify.h', 'inotify_init1'))
 config_host_data.set('CONFIG_PRCTL_PR_SET_TIMERSLACK',
  cc.has_header_symbol('sys/prctl.h', 'PR_SET_TIMERSLACK'))
 config_host_data.set('CONFIG_RTNETLINK',
  cc.has_header_symbol('linux/rtnetlink.h', 
'IFLA_PROTO_DOWN'))
 config_host_data.set('CONFIG_SYSMACROS',
  cc.has_header_symbol('sys/sysmacros.h', 'makedev'))
 config_host_data.set('HAVE_OPTRESET',
  cc.has_header_symbol('getopt.h', 'optreset'))
 config_host_data.set('HAVE_IPPROTO_MPTCP',
  cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP'))
 
 # has_member
 config_host_data.set('HAVE_SIGEV_NOTIFY_THREAD_ID',
  cc.has_member('struct sigevent', 'sigev_notify_thread_id',
prefix: '#include '))
 config_host_data.set('HAVE_STRUCT_STAT_ST_ATIM',
  cc.has_member('struct stat', 'st_atim',
prefix: '#include '))
 config_host_data.set('HAVE_BLK_ZONE_REP_CAPACITY',
  cc.has_member('struct blk_zone', 'capacity',
@@ -4390,40 +4402,43 @@ if host_os == 'windows'
 endif
 summary_info += {'seccomp support':   seccomp}
 summary_info += {'GlusterFS support': glusterfs}
 summary_info += {'hv-balloon support': hv_balloon}
 summary_info += {'TPM support':   have_tpm}
 summary_info += {'libssh support':libssh}
 summary_info += {'lzo support':   lzo}
 summary_info += {'snappy support':snappy}
 summary_info += {'bzip2 support': libbzip2}
 summary_info += {'lzfse support': liblzfse}
 summary_info += {'zstd support':  zstd}
 summary_info += {'NUMA host support': numa}
 summary_info += {'capstone':  capstone}
 summary_info += {'libpmem support':   libpmem}
 summary_info

[PATCH v3 1/4] tests/vm: Set UseDNS=no in the sshd configuration

2024-02-05 Thread Ilya Leoshkevich

make vm-build-freebsd sometimes fails with "Connection timed out during
banner exchange". The client strace shows:

13:59:30 write(3, "SSH-2.0-OpenSSH_9.3\r\n", 21) = 21
13:59:30 getpid()   = 252655
13:59:30 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "S", 1)= 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "S", 1)= 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "H", 1)= 1

There is a 2s delay during connection, and ConnectTimeout is set to 1.
Raising it makes the issue go away, but we can do better. The server
truss shows:

888: 27.811414714 socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 5 (0x5)
888: 27.811765030 connect(5,{ AF_INET 10.0.2.3:53 },16) = 0 (0x0)
888: 27.812166941 sendto(5,"\^Z/\^A\0\0\^A\0\0\0\0\0\0\^A2"...,39,0,NULL,0) 
= 39 (0x27)
888: 29.363970743 poll({ 5/POLLRDNORM },1,5000) = 1 (0x1)

So the delay is due to a DNS query. Disable DNS queries in the server
config.

Reviewed-by: Thomas Huth 
Signed-off-by: Ilya Leoshkevich 
---
 tests/vm/basevm.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index 61725b83254..c0d62c08031 100644
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -406,40 +406,42 @@ def console_send(self, command):
 vm.console_socket.send(char.encode("utf-8"))
 time.sleep(0.01)
 
 def console_wait_send(self, wait, command):
 self.console_wait(wait)
 self.console_send(command)
 
 def console_ssh_init(self, prompt, user, pw):
 sshkey_cmd = "echo '%s' > .ssh/authorized_keys\n" \
  % self._config['ssh_pub_key'].rstrip()
 self.console_wait_send("login:","%s\n" % user)
 self.console_wait_send("Password:", "%s\n" % pw)
 self.console_wait_send(prompt,  "mkdir .ssh\n")
 self.console_wait_send(prompt,  sshkey_cmd)
 self.console_wait_send(prompt,  "chmod 755 .ssh\n")
 self.console_wait_send(prompt,  "chmod 644 .ssh/authorized_keys\n")
 
 def console_sshd_config(self, prompt):
 self.console_wait(prompt)
 self.console_send("echo 'PermitRootLogin yes' >> 
/etc/ssh/sshd_config\n")
+self.console_wait(prompt)
+self.console_send("echo 'UseDNS no' >> /etc/ssh/sshd_config\n")
 for var in self.envvars:
 self.console_wait(prompt)
 self.console_send("echo 'AcceptEnv %s' >> /etc/ssh/sshd_config\n" 
% var)
 
 def print_step(self, text):
 sys.stderr.write("### %s ...\n" % text)
 
 def wait_ssh(self, wait_root=False, seconds=300, cmd="exit 0"):
 # Allow more time for VM to boot under TCG.
 if not kvm_available(self.arch):
 seconds *= self.tcg_timeout_multiplier
 starttime = datetime.datetime.now()
 endtime = starttime + datetime.timedelta(seconds=seconds)
 cmd_success = False
 while datetime.datetime.now() < endtime:
 if wait_root and self.ssh_root(cmd) == 0:
 cmd_success = True
 break
 elif self.ssh(cmd) == 0:
 cmd_success = True
-- 
2.43.0

[PATCH v3 3/4] test-util-filemonitor: Adapt to the FreeBSD inotify rename semantics

2024-02-05 Thread Ilya Leoshkevich

Unlike on Linux, on FreeBSD renaming a file when the destination
already exists results in an IN_DELETE event for that existing file:

$ FILEMONITOR_DEBUG=1 build/tests/unit/test-util-filemonitor
Rename /tmp/test-util-filemonitor-K13LI2/fish/one.txt -> 
/tmp/test-util-filemonitor-K13LI2/two.txt
Event id=2 event=2 file=one.txt
Queue event id 2 event 2 file one.txt
Queue event id 1 event 2 file two.txt
Queue event id 10002 event 2 file two.txt
Queue event id 1 event 0 file two.txt
Queue event id 10002 event 0 file two.txt
Event id=1 event=0 file=two.txt
Expected event 0 but got 2

This difference in behavior is not expected to break the real users, so
teach the test to accept it.

Suggested-by: Daniel P. Berrange 
Signed-off-by: Ilya Leoshkevich 
---
 tests/unit/test-util-filemonitor.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-util-filemonitor.c 
b/tests/unit/test-util-filemonitor.c
index a22de275955..02e67fc96ac 100644
--- a/tests/unit/test-util-filemonitor.c
+++ b/tests/unit/test-util-filemonitor.c
@@ -343,40 +343,48 @@ test_file_monitor_events(void)
   .filesrc = "fish/", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH,
   .filesrc = "fish/one.txt", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_CREATE,
   .filesrc = "fish/one.txt", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 
 
 { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH,
   .filesrc = "fish/one.txt", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_RENAME,
   .filesrc = "fish/one.txt", .filedst = "two.txt", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
+#ifdef __FreeBSD__
+{ .type = QFILE_MONITOR_TEST_OP_EVENT,
+  .filesrc = "two.txt", .watchid = ,
+  .eventid = QFILE_MONITOR_EVENT_DELETED },
+{ .type = QFILE_MONITOR_TEST_OP_EVENT,
+  .filesrc = "two.txt", .watchid = ,
+  .eventid = QFILE_MONITOR_EVENT_DELETED },
+#endif
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 
 
 { .type = QFILE_MONITOR_TEST_OP_RMDIR,
   .filesrc = "fish", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_IGNORED,
   .swapnext = true },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "fish", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
 { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH,
   .filesrc = "fish", .watchid =  },
 
-- 
2.43.0

Re: [PATCH v3 0/2] riscv: add rv32i,rv32e and rv64e CPUs

2024-02-05 Thread Alistair Francis

On Mon, Jan 22, 2024 at 10:34 PM Daniel Henrique Barboza
 wrote:
>
> Hi,
>
> This v3 has the same patches from v2 rebased with a newer
> riscv-to-apply.next branch (@ 096b6b07298).
>
> No other changes made. All patches acked.
>
> v2 link: 
> https://lore.kernel.org/qemu-riscv/20240108161903.353648-1-dbarb...@ventanamicro.com/
>
> Daniel Henrique Barboza (2):
>   target/riscv/cpu.c: add riscv_bare_cpu_init()
>   target/riscv: add rv32i, rv32e and rv64e CPUs

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  target/riscv/cpu-qom.h |  3 ++
>  target/riscv/cpu.c | 64 --
>  2 files changed, 52 insertions(+), 15 deletions(-)
>
> --
> 2.43.0
>
>

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Alex Bennée

Stefan Hajnoczi  writes:

> C99 mixed declarations support interleaving of local variable
> declarations and code.
>
> The coding style "generally" forbids C99 mixed declarations with some
> exceptions to the rule. This rule is not checked by checkpatch.pl and
> naturally there are violations in the source tree.
>
> While contemplating adding another exception, I came to the conclusion
> that the best location for declarations depends on context. Let the
> programmer declare variables where it is best for legibility. Don't try
> to define all possible scenarios/exceptions.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  docs/devel/style.rst | 20 
>  1 file changed, 20 deletions(-)

While I may be convinced there may be some cases where mixed
declarations could make things simpler/clearer I don't support removing
all the guidance and leaving it as a free for all as long as the
compiler is happy.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v3 2/3] hw/virtio: cleanup shared resources

2024-02-05 Thread Alex Bennée

Albert Esteve  writes:

> Ensure that we cleanup all virtio shared
> resources when the vhost devices is cleaned
> up (after a hot unplug, or a crash).
>
> To do so, we add a new function to the virtio_dmabuf
> API called `virtio_dmabuf_vhost_cleanup`, which
> loop through the table and removes all
> resources owned by the vhost device parameter.
>
> Also, add a test to verify that the new
> function in the API behaves as expected.
>
> Signed-off-by: Albert Esteve 
> Acked-by: Stefan Hajnoczi 
> ---
>  hw/display/virtio-dmabuf.c| 22 +
>  hw/virtio/vhost.c |  3 +++
>  include/hw/virtio/virtio-dmabuf.h | 10 ++
>  tests/unit/test-virtio-dmabuf.c   | 33 +++
>  4 files changed, 68 insertions(+)
>
> diff --git a/hw/display/virtio-dmabuf.c b/hw/display/virtio-dmabuf.c
> index 3dba4577ca..6688809777 100644
> --- a/hw/display/virtio-dmabuf.c
> +++ b/hw/display/virtio-dmabuf.c
> @@ -136,6 +136,28 @@ SharedObjectType virtio_object_type(const QemuUUID *uuid)
>  return vso->type;
>  }
>  
> +static bool virtio_dmabuf_resource_is_owned(gpointer key,
> +gpointer value,
> +gpointer dev)
> +{
> +VirtioSharedObject *vso;
> +
> +vso = (VirtioSharedObject *) value;
> +return vso->type == TYPE_VHOST_DEV && vso->value == dev;

It's a bit surprising to see vso->value being an anonymous gpointer
rather than the proper type and a bit confusing between value and
vso->value.

> +}
> +
> +int virtio_dmabuf_vhost_cleanup(struct vhost_dev *dev)
> +{
> +int num_removed;
> +
> +g_mutex_lock();
> +num_removed = g_hash_table_foreach_remove(
> +resource_uuids, (GHRFunc) virtio_dmabuf_resource_is_owned, dev);
> +g_mutex_unlock();

I'll note if we used a QemuMutex for the lock we could:

  - use WITH_QEMU_LOCK_GUARD() { }
  - enable QSP porfiling for the lock

> +
> +return num_removed;
> +}
> +
>  void virtio_free_resources(void)
>  {
>  g_mutex_lock();
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2c9ac79468..c5622eac14 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -16,6 +16,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "hw/virtio/vhost.h"
> +#include "hw/virtio/virtio-dmabuf.h"
>  #include "qemu/atomic.h"
>  #include "qemu/range.h"
>  #include "qemu/error-report.h"
> @@ -1599,6 +1600,8 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
>  migrate_del_blocker(>migration_blocker);
>  g_free(hdev->mem);
>  g_free(hdev->mem_sections);
> +/* free virtio shared objects */
> +virtio_dmabuf_vhost_cleanup(hdev);
>  if (hdev->vhost_ops) {
>  hdev->vhost_ops->vhost_backend_cleanup(hdev);
>  }
> diff --git a/include/hw/virtio/virtio-dmabuf.h 
> b/include/hw/virtio/virtio-dmabuf.h
> index 627c3b6db7..73f70fb482 100644
> --- a/include/hw/virtio/virtio-dmabuf.h
> +++ b/include/hw/virtio/virtio-dmabuf.h
> @@ -91,6 +91,16 @@ struct vhost_dev *virtio_lookup_vhost_device(const 
> QemuUUID *uuid);
>   */
>  SharedObjectType virtio_object_type(const QemuUUID *uuid);
>  
> +/**
> + * virtio_dmabuf_vhost_cleanup() - Destroys all entries of the shared
> + * resources lookup table that are owned by the vhost backend
> + * @dev: the pointer to the vhost device that owns the entries. Data is owned
> + *   by the called of the function.
> + * 
> + * Return: the number of resource entries removed.
> + */
> +int virtio_dmabuf_vhost_cleanup(struct vhost_dev *dev);
> +
>  /**
>   * virtio_free_resources() - Destroys all keys and values of the shared
>   * resources lookup table, and frees them
> diff --git a/tests/unit/test-virtio-dmabuf.c b/tests/unit/test-virtio-dmabuf.c
> index a45ec52f42..1c8123c2d2 100644
> --- a/tests/unit/test-virtio-dmabuf.c
> +++ b/tests/unit/test-virtio-dmabuf.c
> @@ -103,6 +103,38 @@ static void test_add_invalid_resource(void)
>  }
>  }
>  
> +static void test_cleanup_res(void)
> +{
> +QemuUUID uuids[20], uuid_alt;
> +struct vhost_dev *dev = g_new0(struct vhost_dev, 1);
> +struct vhost_dev *dev_alt = g_new0(struct vhost_dev, 1);
> +int i, num_removed;
> +
> +for (i = 0; i < ARRAY_SIZE(uuids); ++i) {
> +qemu_uuid_generate([i]);
> +virtio_add_vhost_device([i], dev);
> +/* vhost device is found */
> +g_assert(virtio_lookup_vhost_device([i]) != NULL);
> +}
> +qemu_uuid_generate(_alt);
> +virtio_add_vhost_device(_alt, dev_alt);
> +/* vhost device is found */
> +g_assert(virtio_lookup_vhost_device(_alt) != NULL);
> +/* cleanup all dev resources */
> +num_removed = virtio_dmabuf_vhost_cleanup(dev);
> +g_assert_cmpint(num_removed, ==, ARRAY_SIZE(uuids));
> +for (i = 0; i < ARRAY_SIZE(uuids); ++i) {
> +/* None of the dev resources is found after free'd */
> +g_assert_cmpint(virtio_lookup_dmabuf([i]), ==, -1);
> +}
> +/* uuid_alt

[PATCH] hw/i386/multiboot: Make bootloader_name static

2024-02-05 Thread Jens Nyberg

Global variable is only used in a single file and should therefor be set to
static in order to avoid name collisions.

Signed-off-by: Jens Nyberg 
---
 hw/i386/multiboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index 3332712ab3..6bf6398d01 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -102,7 +102,7 @@ typedef struct {
 int mb_mods_count;
 } MultibootState;
 
-const char *bootloader_name = "qemu";
+static const char *bootloader_name = "qemu";
 
 static uint32_t mb_add_cmdline(MultibootState *s, const char *cmdline)
 {
-- 
2.43.0

[PATCH v5 0/4] target/s390x: Emulate CVDG and CVB*

2024-02-05 Thread Ilya Leoshkevich

v4: https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00434.html
v4 -> v5: Remove a redundant CVBG overflow check;
  Write the comment explaining the remaining CVBG overflow
  check;
  Add R-bs to the tests (Thomas).

v3: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06664.html
v3 -> v4: Implement CVB error handling (David/Thomas).

v2: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05048.html
v2 -> v3: Resurrect an old CVB* patch (Thomas).
  Add Richard's R-b.

v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02865.html
v1 -> v2: Fix !CONFIG_INT128 builds (Richard).

Hi,

Ido reported that we are missing the CVDG emulation (which is very
similar to the existing CVD emulation). This series adds it along with
a test.

Best regards,
Ilya

Ilya Leoshkevich (4):
  target/s390x: Emulate CVDG
  target/s390x: Emulate CVB, CVBY and CVBG
  tests/tcg/s390x: Test CONVERT TO DECIMAL
  tests/tcg/s390x: Test CONVERT TO BINARY

 target/s390x/helper.h|   3 +
 target/s390x/tcg/insn-data.h.inc |   5 ++
 target/s390x/tcg/int_helper.c|  97 +
 target/s390x/tcg/translate.c |  24 
 tests/tcg/s390x/Makefile.target  |   2 +
 tests/tcg/s390x/cvb.c| 102 +++
 tests/tcg/s390x/cvd.c|  63 +++
 7 files changed, 296 insertions(+)
 create mode 100644 tests/tcg/s390x/cvb.c
 create mode 100644 tests/tcg/s390x/cvd.c

-- 
2.43.0

[PATCH v5 4/4] tests/tcg/s390x: Test CONVERT TO BINARY

2024-02-05 Thread Ilya Leoshkevich

Check the CVB's, CVBY's, and CVBG's corner cases.

Co-developed-by: Pavel Zbitskiy 
Reviewed-by: Thomas Huth 
Tested-by: Thomas Huth 
Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |   1 +
 tests/tcg/s390x/cvb.c   | 102 
 2 files changed, 103 insertions(+)
 create mode 100644 tests/tcg/s390x/cvb.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 04e4bddd83d..e2aba2ec274 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -46,6 +46,7 @@ TESTS+=laalg
 TESTS+=add-logical-with-carry
 TESTS+=lae
 TESTS+=cvd
+TESTS+=cvb
 
 cdsg: CFLAGS+=-pthread
 cdsg: LDFLAGS+=-pthread
diff --git a/tests/tcg/s390x/cvb.c b/tests/tcg/s390x/cvb.c
new file mode 100644
index 000..e1735f6b81c
--- /dev/null
+++ b/tests/tcg/s390x/cvb.c
@@ -0,0 +1,102 @@
+/*
+ * Test the CONVERT TO BINARY instruction.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int signum;
+
+static void signal_handler(int n)
+{
+signum = n;
+}
+
+#define FAIL 0x1234567887654321
+#define OK32(x) (0x12345678 | (uint32_t)(x))
+
+static int64_t cvb(uint64_t x)
+{
+int64_t ret = FAIL;
+
+signum = -1;
+asm("cvb %[ret],%[x]" : [ret] "+r" (ret) : [x] "R" (x));
+
+return ret;
+}
+
+static int64_t cvby(uint64_t x)
+{
+int64_t ret = FAIL;
+
+signum = -1;
+asm("cvby %[ret],%[x]" : [ret] "+r" (ret) : [x] "T" (x));
+
+return ret;
+}
+
+static int64_t cvbg(__uint128_t x)
+{
+int64_t ret = FAIL;
+
+signum = -1;
+asm("cvbg %[ret],%[x]" : [ret] "+r" (ret) : [x] "T" (x));
+
+return ret;
+}
+
+int main(void)
+{
+__uint128_t m = (((__uint128_t)0x9223372036854775) << 16) | 0x8070;
+struct sigaction act;
+int err;
+
+memset(, 0, sizeof(act));
+act.sa_handler = signal_handler;
+err = sigaction(SIGFPE, , NULL);
+assert(err == 0);
+err = sigaction(SIGILL, , NULL);
+assert(err == 0);
+
+assert(cvb(0xc) == OK32(0) && signum == -1);
+assert(cvb(0x1c) == OK32(1) && signum == -1);
+assert(cvb(0x25594c) == OK32(25594) && signum == -1);
+assert(cvb(0x1d) == OK32(-1) && signum == -1);
+assert(cvb(0x2147483647c) == OK32(0x7fff) && signum == -1);
+assert(cvb(0x2147483648d) == OK32(-0x8000) && signum == -1);
+assert(cvb(0x7) == FAIL && signum == SIGILL);
+assert(cvb(0x2147483648c) == OK32(0x8000) && signum == SIGFPE);
+assert(cvb(0x30c) == OK32(0xb2d05e00) && signum == SIGFPE);
+assert(cvb(0x2147483649d) == OK32(0x7fff) && signum == SIGFPE);
+assert(cvb(0x30d) == OK32(0x4d2fa200) && signum == SIGFPE);
+
+assert(cvby(0xc) == OK32(0));
+assert(cvby(0x1c) == OK32(1));
+assert(cvby(0x25594c) == OK32(25594));
+assert(cvby(0x1d) == OK32(-1));
+assert(cvby(0x2147483647c) == OK32(0x7fff));
+assert(cvby(0x2147483648d) == OK32(-0x8000));
+assert(cvby(0x7) == FAIL && signum == SIGILL);
+assert(cvby(0x2147483648c) == OK32(0x8000) && signum == SIGFPE);
+assert(cvby(0x30c) == OK32(0xb2d05e00) && signum == SIGFPE);
+assert(cvby(0x2147483649d) == OK32(0x7fff) && signum == SIGFPE);
+assert(cvby(0x30d) == OK32(0x4d2fa200) && signum == SIGFPE);
+
+assert(cvbg(0xc) == 0);
+assert(cvbg(0x1c) == 1);
+assert(cvbg(0x25594c) == 25594);
+assert(cvbg(0x1d) == -1);
+assert(cvbg(m + 0xc) == 0x7fff);
+assert(cvbg(m + 0x1d) == -0x8000);
+assert(cvbg(0x7) == FAIL && signum == SIGILL);
+assert(cvbg(m + 0x1c) == FAIL && signum == SIGFPE);
+assert(cvbg(m + 0x2d) == FAIL && signum == SIGFPE);
+assert(cvbg(((__uint128_t)1 << 80) + 0xc) == FAIL && signum == SIGFPE);
+assert(cvbg(((__uint128_t)1 << 80) + 0xd) == FAIL && signum == SIGFPE);
+
+return EXIT_SUCCESS;
+}
-- 
2.43.0

[PATCH v5 3/4] tests/tcg/s390x: Test CONVERT TO DECIMAL

2024-02-05 Thread Ilya Leoshkevich

Check the CVD's, CVDY's, and CVDG's corner cases.

Reviewed-by: Thomas Huth 
Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |  1 +
 tests/tcg/s390x/cvd.c   | 63 +
 2 files changed, 64 insertions(+)
 create mode 100644 tests/tcg/s390x/cvd.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 30994dcf9c2..04e4bddd83d 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -45,6 +45,7 @@ TESTS+=clc
 TESTS+=laalg
 TESTS+=add-logical-with-carry
 TESTS+=lae
+TESTS+=cvd
 
 cdsg: CFLAGS+=-pthread
 cdsg: LDFLAGS+=-pthread
diff --git a/tests/tcg/s390x/cvd.c b/tests/tcg/s390x/cvd.c
new file mode 100644
index 000..d776688985e
--- /dev/null
+++ b/tests/tcg/s390x/cvd.c
@@ -0,0 +1,63 @@
+/*
+ * Test the CONVERT TO DECIMAL instruction.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include 
+#include 
+#include 
+
+static uint64_t cvd(int32_t x)
+{
+uint64_t ret;
+
+asm("cvd %[x],%[ret]" : [ret] "=R" (ret) : [x] "r" (x));
+
+return ret;
+}
+
+static uint64_t cvdy(int32_t x)
+{
+uint64_t ret;
+
+asm("cvdy %[x],%[ret]" : [ret] "=T" (ret) : [x] "r" (x));
+
+return ret;
+}
+
+static __uint128_t cvdg(int64_t x)
+{
+__uint128_t ret;
+
+asm("cvdg %[x],%[ret]" : [ret] "=T" (ret) : [x] "r" (x));
+
+return ret;
+}
+
+int main(void)
+{
+__uint128_t m = (((__uint128_t)0x9223372036854775) << 16) | 0x8070;
+
+assert(cvd(0) == 0xc);
+assert(cvd(1) == 0x1c);
+assert(cvd(25594) == 0x25594c);
+assert(cvd(-1) == 0x1d);
+assert(cvd(0x7fff) == 0x2147483647c);
+assert(cvd(-0x8000) == 0x2147483648d);
+
+assert(cvdy(0) == 0xc);
+assert(cvdy(1) == 0x1c);
+assert(cvdy(25594) == 0x25594c);
+assert(cvdy(-1) == 0x1d);
+assert(cvdy(0x7fff) == 0x2147483647c);
+assert(cvdy(-0x8000) == 0x2147483648d);
+
+assert(cvdg(0) == 0xc);
+assert(cvdg(1) == 0x1c);
+assert(cvdg(25594) == 0x25594c);
+assert(cvdg(-1) == 0x1d);
+assert(cvdg(0x7fff) == (m + 0xc));
+assert(cvdg(-0x8000) == (m + 0x1d));
+
+return EXIT_SUCCESS;
+}
-- 
2.43.0

[PATCH v5 1/4] target/s390x: Emulate CVDG

2024-02-05 Thread Ilya Leoshkevich

CVDG is the same as CVD, except that it converts 64 bits into 128,
rather than 32 into 64. Create a new helper, which uses Int128
wrappers.

Reported-by: Ido Plat 
Reviewed-by: Richard Henderson 
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/helper.h|  1 +
 target/s390x/tcg/insn-data.h.inc |  1 +
 target/s390x/tcg/int_helper.c| 21 +
 target/s390x/tcg/translate.c |  8 
 4 files changed, 31 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 05102578fc9..332a9a9c632 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -89,6 +89,7 @@ DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
 DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
 DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
 DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
+DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64)
 DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
 DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32)
 DEF_HELPER_FLAGS_4(pku, TCG_CALL_NO_WG, void, env, i64, i64, i32)
diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc
index 2f07f39d9cb..388dcb8dbbc 100644
--- a/target/s390x/tcg/insn-data.h.inc
+++ b/target/s390x/tcg/insn-data.h.inc
@@ -296,6 +296,7 @@
 /* CONVERT TO DECIMAL */
 C(0x4e00, CVD, RX_a,  Z,   r1_o, a2, 0, 0, cvd, 0)
 C(0xe326, CVDY,RXY_a, LD,  r1_o, a2, 0, 0, cvd, 0)
+C(0xe32e, CVDG,RXY_a, Z,   r1_o, a2, 0, 0, cvdg, 0)
 /* CONVERT TO FIXED */
 F(0xb398, CFEBR,   RRF_e, Z,   0, e2, new, r1_32, cfeb, 0, IF_BFP)
 F(0xb399, CFDBR,   RRF_e, Z,   0, f2, new, r1_32, cfdb, 0, IF_BFP)
diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
index eb8e6dd1b57..121e3006a65 100644
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -118,6 +118,27 @@ uint64_t HELPER(cvd)(int32_t reg)
 return dec;
 }
 
+Int128 HELPER(cvdg)(int64_t reg)
+{
+/* positive 0 */
+Int128 dec = int128_make64(0x0c);
+Int128 bin = int128_makes64(reg);
+Int128 base = int128_make64(10);
+int shift;
+
+if (!int128_nonneg(bin)) {
+bin = int128_neg(bin);
+dec = int128_make64(0x0d);
+}
+
+for (shift = 4; (shift < 128) && int128_nz(bin); shift += 4) {
+dec = int128_or(dec, int128_lshift(int128_remu(bin, base), shift));
+bin = int128_divu(bin, base);
+}
+
+return dec;
+}
+
 uint64_t HELPER(popcnt)(uint64_t val)
 {
 /* Note that we don't fold past bytes. */
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index a5fd9cccaa5..c2fdc920a50 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2233,6 +2233,14 @@ static DisasJumpType op_cvd(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_cvdg(DisasContext *s, DisasOps *o)
+{
+TCGv_i128 t = tcg_temp_new_i128();
+gen_helper_cvdg(t, o->in1);
+tcg_gen_qemu_st_i128(t, o->in2, get_mem_index(s), MO_TE | MO_128);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_ct(DisasContext *s, DisasOps *o)
 {
 int m3 = get_field(s, m3);
-- 
2.43.0

[PATCH v5 2/4] target/s390x: Emulate CVB, CVBY and CVBG

2024-02-05 Thread Ilya Leoshkevich

Convert to Binary - counterparts of the already implemented Convert
to Decimal (CVD*) instructions.
Example from the Principles of Operation: 25594C becomes 63FA.

Co-developed-by: Pavel Zbitskiy 
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/helper.h|  2 +
 target/s390x/tcg/insn-data.h.inc |  4 ++
 target/s390x/tcg/int_helper.c| 76 
 target/s390x/tcg/translate.c | 16 +++
 4 files changed, 98 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 332a9a9c632..cc1c20e9e3f 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -88,6 +88,8 @@ DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, 
i64)
 DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
 DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
 DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
+DEF_HELPER_3(cvb, void, env, i32, i64)
+DEF_HELPER_FLAGS_2(cvbg, TCG_CALL_NO_WG, i64, env, i128)
 DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
 DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64)
 DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc
index 388dcb8dbbc..e7d61cdec28 100644
--- a/target/s390x/tcg/insn-data.h.inc
+++ b/target/s390x/tcg/insn-data.h.inc
@@ -293,6 +293,10 @@
 D(0xec73, CLFIT,   RIE_a, GIE, r1_32u, i2_16u, 0, 0, ct, 0, 1)
 D(0xec71, CLGIT,   RIE_a, GIE, r1_o, i2_16u, 0, 0, ct, 0, 1)
 
+/* CONVERT TO BINARY */
+C(0x4f00, CVB, RX_a,  Z,   la2, 0, 0, 0, cvb, 0)
+C(0xe306, CVBY,RXY_a, LD,  la2, 0, 0, 0, cvb, 0)
+C(0xe30e, CVBG,RXY_a, Z,   la2, 0, r1, 0, cvbg, 0)
 /* CONVERT TO DECIMAL */
 C(0x4e00, CVD, RX_a,  Z,   r1_o, a2, 0, 0, cvd, 0)
 C(0xe326, CVDY,RXY_a, LD,  r1_o, a2, 0, 0, cvd, 0)
diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
index 121e3006a65..2af970f2c8b 100644
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -25,6 +25,7 @@
 #include "exec/exec-all.h"
 #include "qemu/host-utils.h"
 #include "exec/helper-proto.h"
+#include "exec/cpu_ldst.h"
 
 /* #define DEBUG_HELPER */
 #ifdef DEBUG_HELPER
@@ -98,6 +99,81 @@ Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, 
uint64_t al, uint64_t b)
 tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
 }
 
+void HELPER(cvb)(CPUS390XState *env, uint32_t r1, uint64_t dec)
+{
+int64_t pow10 = 1, bin = 0;
+int digit, sign;
+
+sign = dec & 0xf;
+if (sign < 0xa) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec >>= 4;
+
+while (dec) {
+digit = dec & 0xf;
+if (digit > 0x9) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec >>= 4;
+bin += digit * pow10;
+pow10 *= 10;
+}
+
+if (sign == 0xb || sign == 0xd) {
+bin = -bin;
+}
+
+/* R1 is updated even on fixed-point-divide exception. */
+env->regs[r1] = (env->regs[r1] & 0xULL) | (uint32_t)bin;
+if (bin != (int32_t)bin) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+}
+
+uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec)
+{
+uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)};
+int64_t bin = 0, pow10, tmp;
+int digit, i, sign;
+
+sign = dec64[0] & 0xf;
+if (sign < 0xa) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec64[0] >>= 4;
+pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1;
+
+for (i = 1; i < 20; i++) {
+digit = dec64[i >> 4] & 0xf;
+if (digit > 0x9) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec64[i >> 4] >>= 4;
+/*
+ * Prepend the next digit and check for overflow. The multiplication
+ * cannot overflow, since, conveniently, the int64_t limits are
+ * approximately +-9.2E+18. If bin is zero, the addition cannot
+ * overflow. Otherwise bin is known to have the same sign as the rhs
+ * addend, in which case overflow happens if and only if the result
+ * has a different sign.
+ */
+tmp = bin + pow10 * digit;
+if (bin && ((tmp ^ bin) < 0)) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+bin = tmp;
+pow10 *= 10;
+}
+
+g_assert(!dec64[0]);
+if (dec64[1]) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+
+return bin;
+}
+
 uint64_t HELPER(cvd)(int32_t reg)
 {
 /* positive 0 */
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index c2fdc920a50..325b25959d3 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2223,6 +2223,22 @@ static DisasJumpType op_csp(DisasContext *s, DisasOps *o)
 }
 #endif
 
+static DisasJumpType op_cvb(DisasContext *s, DisasOps *o)
+{
+TCGv_i64 t = tcg_temp_new_i64();
+

[PATCH v2 3/6] migration/multifd: Move multifd_send_setup error handling in to the function

2024-02-05 Thread Fabiano Rosas

Hide the error handling inside multifd_send_setup to make it cleaner
for the next patch to move the function around.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c |  6 +-
 migration/multifd.c   | 24 +---
 migration/multifd.h   |  2 +-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ba99772e76..2942f8cf42 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3623,11 +3623,7 @@ void migrate_fd_connect(MigrationState *s, Error 
*error_in)
 return;
 }
 
-if (multifd_send_setup(_err) != 0) {
-migrate_set_error(s, local_err);
-error_report_err(local_err);
-migrate_set_state(>state, MIGRATION_STATUS_SETUP,
-  MIGRATION_STATUS_FAILED);
+if (!multifd_send_setup()) {
 migrate_fd_cleanup(s);
 return;
 }
diff --git a/migration/multifd.c b/migration/multifd.c
index 515d88e04b..cc10be2c3f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -985,14 +985,16 @@ static void multifd_new_send_channel_create(gpointer 
opaque)
 socket_send_channel_create(multifd_new_send_channel_async, opaque);
 }
 
-int multifd_send_setup(Error **errp)
+bool multifd_send_setup(void)
 {
-int thread_count;
+MigrationState *s = migrate_get_current();
+Error *local_err = NULL;
+int thread_count, ret = 0;
 uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
 uint8_t i;
 
 if (!migrate_multifd()) {
-return 0;
+return true;
 }
 
 thread_count = migrate_multifd_channels();
@@ -1026,14 +1028,22 @@ int multifd_send_setup(Error **errp)
 
 for (i = 0; i < thread_count; i++) {
 MultiFDSendParams *p = _send_state->params[i];
-int ret;
 
-ret = multifd_send_state->ops->send_setup(p, errp);
+ret = multifd_send_state->ops->send_setup(p, _err);
 if (ret) {
-return ret;
+break;
 }
 }
-return 0;
+
+if (ret) {
+migrate_set_error(s, local_err);
+error_report_err(local_err);
+migrate_set_state(>state, MIGRATION_STATUS_SETUP,
+  MIGRATION_STATUS_FAILED);
+return false;
+}
+
+return true;
 }
 
 struct {
diff --git a/migration/multifd.h b/migration/multifd.h
index 7881980ee6..8a1cad0996 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -13,7 +13,7 @@
 #ifndef QEMU_MIGRATION_MULTIFD_H
 #define QEMU_MIGRATION_MULTIFD_H
 
-int multifd_send_setup(Error **errp);
+bool multifd_send_setup(void);
 void multifd_send_shutdown(void);
 int multifd_recv_setup(Error **errp);
 void multifd_recv_cleanup(void);
-- 
2.35.3

[PATCH v2 2/6] migration/multifd: Remove p->running

2024-02-05 Thread Fabiano Rosas

We currently only need p->running to avoid calling qemu_thread_join()
on a non existent thread if the thread has never been created.

However, there are at least two bugs in this logic:

1) On the sending side, p->running is set too early and
qemu_thread_create() can be skipped due to an error during TLS
handshake, leaving the flag set and leading to a crash when
multifd_send_cleanup() calls qemu_thread_join().

2) During exit, the multifd thread clears the flag while holding the
channel lock. The counterpart at multifd_send_cleanup() reads the flag
outside of the lock and might free the mutex while the multifd thread
still has it locked.

Fix the first issue by setting the flag right before creating the
thread. Rename it from p->running to p->thread_created to clarify its
usage.

Fix the second issue by not clearing the flag at the multifd thread
exit. We don't have any use for that.

Note that these bugs are straight-forward logic issues and not race
conditions. There is still a gap for races to affect this code due to
multifd_send_cleanup() being allowed to run concurrently with the
thread creation loop. This issue is solved in the next patches.

Cc: qemu-stable 
Fixes: 29647140157a ("migration/tls: add support for multifd tls-handshake")
Reported-by: Avihai Horon 
Reported-by: 
Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/multifd.c | 27 ---
 migration/multifd.h |  7 ++-
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 8195c1daf3..515d88e04b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -634,7 +634,7 @@ static void multifd_send_terminate_threads(void)
 qemu_thread_join(>tls_thread);
 }
 
-if (p->running) {
+if (p->thread_created) {
 qemu_thread_join(>thread);
 }
 }
@@ -862,7 +862,6 @@ out:
 error_free(local_err);
 }
 
-p->running = false;
 rcu_unregister_thread();
 migration_threads_remove(thread);
 trace_multifd_send_thread_end(p->id, p->packets_sent, 
p->total_normal_pages);
@@ -953,6 +952,8 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
 migration_ioc_register_yank(ioc);
 p->registered_yank = true;
 p->c = ioc;
+
+p->thread_created = true;
 qemu_thread_create(>thread, p->name, multifd_send_thread, p,
QEMU_THREAD_JOINABLE);
 return true;
@@ -967,7 +968,6 @@ static void multifd_new_send_channel_async(QIOTask *task, 
gpointer opaque)
 trace_multifd_new_send_channel_async(p->id);
 if (!qio_task_propagate_error(task, _err)) {
 qio_channel_set_delay(ioc, false);
-p->running = true;
 if (multifd_channel_connect(p, ioc, _err)) {
 return;
 }
@@ -1128,15 +1128,15 @@ void multifd_recv_cleanup(void)
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDRecvParams *p = _recv_state->params[i];
 
-if (p->running) {
-/*
- * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
- * however try to wakeup it without harm in cleanup phase.
- */
-qemu_sem_post(>sem_sync);
-}
+/*
+ * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
+ * however try to wakeup it without harm in cleanup phase.
+ */
+qemu_sem_post(>sem_sync);
 
-qemu_thread_join(>thread);
+if (p->thread_created) {
+qemu_thread_join(>thread);
+}
 }
 for (i = 0; i < migrate_multifd_channels(); i++) {
 multifd_recv_cleanup_channel(_recv_state->params[i]);
@@ -1222,9 +1222,6 @@ static void *multifd_recv_thread(void *opaque)
 multifd_recv_terminate_threads(local_err);
 error_free(local_err);
 }
-qemu_mutex_lock(>mutex);
-p->running = false;
-qemu_mutex_unlock(>mutex);
 
 rcu_unregister_thread();
 trace_multifd_recv_thread_end(p->id, p->packets_recved, 
p->total_normal_pages);
@@ -1330,7 +1327,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error 
**errp)
 p->c = ioc;
 object_ref(OBJECT(ioc));
 
-p->running = true;
+p->thread_created = true;
 qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
QEMU_THREAD_JOINABLE);
 qatomic_inc(_recv_state->count);
diff --git a/migration/multifd.h b/migration/multifd.h
index 720c9d50db..7881980ee6 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -73,6 +73,7 @@ typedef struct {
 char *name;
 /* channel thread id */
 QemuThread thread;
+bool thread_created;
 QemuThread tls_thread;
 bool tls_thread_created;
 /* communication channel */
@@ -93,8 +94,6 @@ typedef struct {
 /* syncs main thread and channels */
 QemuSemaphore sem_sync;
 
-/* is this channel thread running */
-bool running;
 /* multifd flags for each packet */
 uint32_t

[PATCH v2 6/6] migration/multifd: Add a synchronization point for channel creation

2024-02-05 Thread Fabiano Rosas

It is possible that one of the multifd channels fails to be created at
multifd_new_send_channel_async() while the rest of the channel
creation tasks are still in flight.

This could lead to multifd_save_cleanup() executing the
qemu_thread_join() loop too early and not waiting for the threads
which haven't been created yet, leading to the freeing of resources
that the newly created threads will try to access and crash.

Add a synchronization point after which there will be no attempts at
thread creation and therefore calling multifd_save_cleanup() past that
point will ensure it properly waits for the threads.

A note about performance: Prior to this patch, if a channel took too
long to be established, other channels could finish connecting first
and already start taking load. Now we're bounded by the
slowest-connecting channel.

Reported-by: Avihai Horon 
Signed-off-by: Fabiano Rosas 
---
 migration/multifd.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 89d39fa67c..a2b73c9946 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -62,6 +62,11 @@ struct {
  * Make it easy for now.
  */
 uintptr_t packet_num;
+/*
+ * Synchronization point past which no more channels will be
+ * created.
+ */
+QemuSemaphore channels_created;
 /* send channels ready */
 QemuSemaphore channels_ready;
 /*
@@ -622,10 +627,6 @@ static void multifd_send_terminate_threads(void)
 
 /*
  * Finally recycle all the threads.
- *
- * TODO: p->running is still buggy, e.g. we can reach here without the
- * corresponding multifd_new_send_channel_async() get invoked yet,
- * then a new thread can even be created after this function returns.
  */
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDSendParams *p = _send_state->params[i];
@@ -670,6 +671,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams 
*p, Error **errp)
 
 static void multifd_send_cleanup_state(void)
 {
+qemu_sem_destroy(_send_state->channels_created);
 qemu_sem_destroy(_send_state->channels_ready);
 g_free(multifd_send_state->params);
 multifd_send_state->params = NULL;
@@ -934,7 +936,6 @@ static void multifd_new_send_channel_async(QIOTask *task, 
gpointer opaque)
 MultiFDSendParams *p = opaque;
 QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
 Error *local_err = NULL;
-
 bool ret;
 
 trace_multifd_new_send_channel_async(p->id);
@@ -951,18 +952,26 @@ static void multifd_new_send_channel_async(QIOTask *task, 
gpointer opaque)
 
 if (migrate_channel_requires_tls_upgrade(ioc)) {
 ret = multifd_tls_channel_connect(p, ioc, _err);
+if (ret) {
+return;
+}
 } else {
 ret = multifd_channel_connect(p, ioc, _err);
 }
 
+out:
+/*
+ * Here we're not interested whether creation succeeded, only that
+ * it happened at all.
+ */
+qemu_sem_post(_send_state->channels_created);
+
 if (ret) {
 return;
 }
 
-out:
 trace_multifd_new_send_channel_async_error(p->id, local_err);
 multifd_send_set_error(local_err);
-multifd_send_kick_main(p);
 object_unref(OBJECT(ioc));
 error_free(local_err);
 }
@@ -988,6 +997,7 @@ bool multifd_send_setup(void)
 multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
 multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
 multifd_send_state->pages = multifd_pages_init(page_count);
+qemu_sem_init(_send_state->channels_created, 0);
 qemu_sem_init(_send_state->channels_ready, 0);
 qatomic_set(_send_state->exiting, 0);
 multifd_send_state->ops = multifd_ops[migrate_multifd_compression()];
@@ -1013,6 +1023,15 @@ bool multifd_send_setup(void)
 multifd_new_send_channel_create(p);
 }
 
+/*
+ * Wait until channel creation has started for all channels. The
+ * creation can still fail, but no more channels will be created
+ * past this point.
+ */
+for (i = 0; i < thread_count; i++) {
+qemu_sem_wait(_send_state->channels_created);
+}
+
 for (i = 0; i < thread_count; i++) {
 MultiFDSendParams *p = _send_state->params[i];
 
-- 
2.35.3

[PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races

2024-02-05 Thread Fabiano Rosas

Based-on: 20240202102857.110210-1-pet...@redhat.com
[PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
https://lore.kernel.org/r/20240202102857.110210-1-pet...@redhat.com

Hi,

In this v2 I made sure NO channel is created after the semaphores are
posted. Feel free to call me out if that's not the case.

Not much changes, except that now both TLS and non-TLS go through the
same code, so there's a centralized place to do error handling and
releasing the semaphore.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1165206107
based on Peter's code: https://gitlab.com/farosas/qemu/-/pipelines/1165303276

v1:
https://lore.kernel.org/r/20240202191128.1901-1-faro...@suse.de

This contains 2 patches from my previous series addressing the
p->running misuse and the TLS thread leak and 3 new patches to fix the
cleanup-while-creating-threads race.

For the p->running I'm keeping the idea from the other series to
remove p->running and use a more narrow p->thread_created flag. This
flag is used only inform whether the thread has been created so we can
join it.

For the cleanup race I have moved some code around and added a
semaphore to make multifd_save_setup() only return once all channel
creation tasks have started.

The idea is that after multifd_save_setup() returns, no new creations
are in flight and the p->thread_created flags will never change again,
so they're enough to cause the cleanup code to wait for the threads to
join.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1162798843

@Peter: I can rebase this on top of your series once we decide about
it.

Fabiano Rosas (6):
  migration/multifd: Join the TLS thread
  migration/multifd: Remove p->running
  migration/multifd: Move multifd_send_setup error handling in to the
function
  migration/multifd: Move multifd_send_setup into migration thread
  migration/multifd: Unify multifd and TLS connection paths
  migration/multifd: Add a synchronization point for channel creation

 migration/migration.c |  14 ++--
 migration/multifd.c   | 157 +++---
 migration/multifd.h   |  11 ++-
 3 files changed, 98 insertions(+), 84 deletions(-)

-- 
2.35.3

[PATCH v2 1/6] migration/multifd: Join the TLS thread

2024-02-05 Thread Fabiano Rosas

We're currently leaking the resources of the TLS thread by not joining
it and also overwriting the p->thread pointer altogether.

Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to 
blocking handshake")
Cc: qemu-stable 
Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/multifd.c | 8 +++-
 migration/multifd.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index ef13e2e781..8195c1daf3 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -630,6 +630,10 @@ static void multifd_send_terminate_threads(void)
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDSendParams *p = _send_state->params[i];
 
+if (p->tls_thread_created) {
+qemu_thread_join(>tls_thread);
+}
+
 if (p->running) {
 qemu_thread_join(>thread);
 }
@@ -921,7 +925,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams 
*p,
 trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
 qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
 p->c = QIO_CHANNEL(tioc);
-qemu_thread_create(>thread, "multifd-tls-handshake-worker",
+
+p->tls_thread_created = true;
+qemu_thread_create(>tls_thread, "multifd-tls-handshake-worker",
multifd_tls_handshake_thread, p,
QEMU_THREAD_JOINABLE);
 return true;
diff --git a/migration/multifd.h b/migration/multifd.h
index 78a2317263..720c9d50db 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -73,6 +73,8 @@ typedef struct {
 char *name;
 /* channel thread id */
 QemuThread thread;
+QemuThread tls_thread;
+bool tls_thread_created;
 /* communication channel */
 QIOChannel *c;
 /* is the yank function registered */
-- 
2.35.3

[PATCH v2 4/6] migration/multifd: Move multifd_send_setup into migration thread

2024-02-05 Thread Fabiano Rosas

We currently have an unfavorable situation around multifd channels
creation and the migration thread execution.

We create the multifd channels with qio_channel_socket_connect_async
-> qio_task_run_in_thread, but only connect them at the
multifd_new_send_channel_async callback, called from
qio_task_complete, which is registered as a glib event.

So at multifd_send_setup() we create the channels, but they will only
be actually usable after the whole multifd_send_setup() calling stack
returns back to the main loop. Which means that the migration thread
is already up and running without any possibility for the multifd
channels to be ready on time.

We currently rely on the channels-ready semaphore blocking
multifd_send_sync_main() until channels start to come up and release
it. However there have been bugs recently found when a channel's
creation fails and multifd_send_cleanup() is allowed to run while
other channels are still being created.

Let's start to organize this situation by moving the
multifd_send_setup() call into the migration thread. That way we
unblock the main-loop to dispatch the completion callbacks and
actually have a chance of getting the multifd channels ready for when
the migration thread needs them.

The next patches will deal with the synchronization aspects.

Note that this takes multifd_send_setup() out of the BQL.

Reviewed-by: Peter Xu 
Signed-off-by: Fabiano Rosas 
---
 migration/migration.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 2942f8cf42..0675e12c64 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3315,6 +3315,10 @@ static void *migration_thread(void *opaque)
 object_ref(OBJECT(s));
 update_iteration_initial_status(s);
 
+if (!multifd_send_setup()) {
+goto out;
+}
+
 bql_lock();
 qemu_savevm_state_header(s->to_dst_file);
 bql_unlock();
@@ -3386,6 +3390,7 @@ static void *migration_thread(void *opaque)
 urgent = migration_rate_limit();
 }
 
+out:
 trace_migration_thread_after_loop();
 migration_iteration_finish(s);
 object_unref(OBJECT(s));
@@ -3623,11 +3628,6 @@ void migrate_fd_connect(MigrationState *s, Error 
*error_in)
 return;
 }
 
-if (!multifd_send_setup()) {
-migrate_fd_cleanup(s);
-return;
-}
-
 if (migrate_background_snapshot()) {
 qemu_thread_create(>thread, "bg_snapshot",
 bg_migration_thread, s, QEMU_THREAD_JOINABLE);
-- 
2.35.3

[PATCH v2 5/6] migration/multifd: Unify multifd and TLS connection paths

2024-02-05 Thread Fabiano Rosas

During multifd channel creation (multifd_send_new_channel_async) when
TLS is enabled, the multifd_channel_connect function is called twice,
once to create the TLS handshake thread and another time after the
asynchrounous TLS handshake has finished.

This creates a slightly confusing call stack where
multifd_channel_connect() is called more times than the number of
channels. It also splits error handling between the two callers of
multifd_channel_connect() causing some code duplication. Lastly, it
gets in the way of having a single point to determine whether all
channel creation tasks have been initiated.

Refactor the code to move the reentrancy one level up at the
multifd_new_send_channel_async() level, de-duplicating the error
handling and allowing for the next patch to introduce a
synchronization point common to all the multifd channel creation,
regardless of TLS.

Signed-off-by: Fabiano Rosas 
---
 migration/multifd.c | 73 +++--
 1 file changed, 30 insertions(+), 43 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index cc10be2c3f..89d39fa67c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -869,30 +869,7 @@ out:
 return NULL;
 }
 
-static bool multifd_channel_connect(MultiFDSendParams *p,
-QIOChannel *ioc,
-Error **errp);
-
-static void multifd_tls_outgoing_handshake(QIOTask *task,
-   gpointer opaque)
-{
-MultiFDSendParams *p = opaque;
-QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
-Error *err = NULL;
-
-if (!qio_task_propagate_error(task, )) {
-trace_multifd_tls_outgoing_handshake_complete(ioc);
-if (multifd_channel_connect(p, ioc, )) {
-return;
-}
-}
-
-trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
-
-multifd_send_set_error(err);
-multifd_send_kick_main(p);
-error_free(err);
-}
+static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque);
 
 static void *multifd_tls_handshake_thread(void *opaque)
 {
@@ -900,7 +877,7 @@ static void *multifd_tls_handshake_thread(void *opaque)
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c);
 
 qio_channel_tls_handshake(tioc,
-  multifd_tls_outgoing_handshake,
+  multifd_new_send_channel_async,
   p,
   NULL,
   NULL);
@@ -936,19 +913,6 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
 QIOChannel *ioc,
 Error **errp)
 {
-trace_multifd_set_outgoing_channel(
-ioc, object_get_typename(OBJECT(ioc)),
-migrate_get_current()->hostname);
-
-if (migrate_channel_requires_tls_upgrade(ioc)) {
-/*
- * tls_channel_connect will call back to this
- * function after the TLS handshake,
- * so we mustn't call multifd_send_thread until then
- */
-return multifd_tls_channel_connect(p, ioc, errp);
-}
-
 migration_ioc_register_yank(ioc);
 p->registered_yank = true;
 p->c = ioc;
@@ -959,20 +923,43 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
 return true;
 }
 
+/*
+ * When TLS is enabled this function is called once to establish the
+ * TLS connection and a second time after the TLS handshake to create
+ * the multifd channel. Without TLS it goes straight into the channel
+ * creation.
+ */
 static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 {
 MultiFDSendParams *p = opaque;
 QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
 Error *local_err = NULL;
 
+bool ret;
+
 trace_multifd_new_send_channel_async(p->id);
-if (!qio_task_propagate_error(task, _err)) {
-qio_channel_set_delay(ioc, false);
-if (multifd_channel_connect(p, ioc, _err)) {
-return;
-}
+
+if (qio_task_propagate_error(task, _err)) {
+ret = false;
+goto out;
+}
+
+qio_channel_set_delay(ioc, false);
+
+trace_multifd_set_outgoing_channel(ioc, object_get_typename(OBJECT(ioc)),
+   migrate_get_current()->hostname);
+
+if (migrate_channel_requires_tls_upgrade(ioc)) {
+ret = multifd_tls_channel_connect(p, ioc, _err);
+} else {
+ret = multifd_channel_connect(p, ioc, _err);
+}
+
+if (ret) {
+return;
 }
 
+out:
 trace_multifd_new_send_channel_async_error(p->id, local_err);
 multifd_send_set_error(local_err);
 multifd_send_kick_main(p);
-- 
2.35.3

[PATCH] hw/net/tulip: add chip status register values

2024-02-05 Thread Sven Schnelle

Netbsd isn't able to detect a link on the emulated tulip card. That's
because netbsd reads the Chip Status Register of the Phy (address
0x14). The default phy data in the qemu tulip driver is all zero,
which means no link is established and autonegotation isn't complete.

Therefore set the register to 0x3b40, which means:

Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link
speed.

Also clear the mask because this register is read only.

Signed-off-by: Sven Schnelle 
---
 hw/net/tulip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 6d4fb06dad..1f2ef20977 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -421,7 +421,7 @@ static uint16_t tulip_mdi_default[] = {
 /* MDI Registers 8 - 15 */
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 /* MDI Registers 16 - 31 */
-0x0003, 0x, 0x0001, 0x, 0x, 0x, 0x, 0x,
+0x0003, 0x, 0x0001, 0x, 0x3b40, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 };
 
@@ -429,7 +429,7 @@ static uint16_t tulip_mdi_default[] = {
 static const uint16_t tulip_mdi_mask[] = {
 0x, 0x, 0x, 0x, 0xc01f, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
-0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
+0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 };
 
-- 
2.43.0

Re: [PATCH v4 5/5] target/riscv: Implement privilege mode filtering for cycle/instret

2024-02-05 Thread Atish Kumar Patra

On Tue, Jan 23, 2024 at 4:15 PM Atish Kumar Patra 
wrote:

> On Sun, Jan 21, 2024 at 9:04 PM Alistair Francis 
> wrote:
> >
> > On Tue, Jan 9, 2024 at 10:29 AM Atish Patra  wrote:
> > >
> > > Privilege mode filtering can also be emulated for cycle/instret by
> > > tracking host_ticks/icount during each privilege mode switch. This
> > > patch implements that for both cycle/instret and mhpmcounters. The
> > > first one requires Smcntrpmf while the other one requires Sscofpmf
> > > to be enabled.
> > >
> > > The cycle/instret are still computed using host ticks when icount
> > > is not enabled. Otherwise, they are computed using raw icount which
> > > is more accurate in icount mode.
> > >
> > > Reviewed-by: Daniel Henrique Barboza 
> > > Signed-off-by: Atish Patra 
> > > ---
> > >  target/riscv/cpu.h| 11 +
> > >  target/riscv/cpu_helper.c |  9 +++-
> > >  target/riscv/csr.c| 95 ++-
> > >  target/riscv/pmu.c| 43 ++
> > >  target/riscv/pmu.h|  2 +
> > >  5 files changed, 136 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > > index 34617c4c4bab..40d10726155b 100644
> > > --- a/target/riscv/cpu.h
> > > +++ b/target/riscv/cpu.h
> > > @@ -136,6 +136,15 @@ typedef struct PMUCTRState {
> > >  target_ulong irq_overflow_left;
> > >  } PMUCTRState;
> > >
> > > +typedef struct PMUFixedCtrState {
> > > +/* Track cycle and icount for each privilege mode */
> > > +uint64_t counter[4];
> > > +uint64_t counter_prev[4];
> >
> > Are these two used?
> >
>
> Yes. That's where it tracks the current/previous value cycle/instret.
> riscv_pmu_icount_update_priv/riscv_pmu_cycle_update_priv
>
> The priv mode based filtering is enabled in
> riscv_pmu_ctr_get_fixed_counters_val
> using "counter" afterwards.
>
> Did I misunderstand your question ?
>
>
ping ?


> > Alistair
> >
> > > +/* Track cycle and icount for each privilege mode when V = 1*/
> > > +uint64_t counter_virt[2];
> > > +uint64_t counter_virt_prev[2];
> > > +} PMUFixedCtrState;
> > > +
> > >  struct CPUArchState {
> > >  target_ulong gpr[32];
> > >  target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
> > > @@ -334,6 +343,8 @@ struct CPUArchState {
> > >  /* PMU event selector configured values for RV32 */
> > >  target_ulong mhpmeventh_val[RV_MAX_MHPMEVENTS];
> > >
> > > +PMUFixedCtrState pmu_fixed_ctrs[2];
> > > +
> > >  target_ulong sscratch;
> > >  target_ulong mscratch;
> > >
> > > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > index e7e23b34f455..3dddb1b433e8 100644
> > > --- a/target/riscv/cpu_helper.c
> > > +++ b/target/riscv/cpu_helper.c
> > > @@ -715,8 +715,13 @@ void riscv_cpu_set_mode(CPURISCVState *env,
> target_ulong newpriv)
> > >  {
> > >  g_assert(newpriv <= PRV_M && newpriv != PRV_RESERVED);
> > >
> > > -if (icount_enabled() && newpriv != env->priv) {
> > > -riscv_itrigger_update_priv(env);
> > > +if (newpriv != env->priv) {
> > > +if (icount_enabled()) {
> > > +riscv_itrigger_update_priv(env);
> > > +riscv_pmu_icount_update_priv(env, newpriv);
> > > +} else {
> > > +riscv_pmu_cycle_update_priv(env, newpriv);
> > > +}
> > >  }
> > >  /* tlb_flush is unnecessary as mode is contained in mmu_idx */
> > >  env->priv = newpriv;
> > > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > > index 3bd4aa22374f..307d052021c5 100644
> > > --- a/target/riscv/csr.c
> > > +++ b/target/riscv/csr.c
> > > @@ -782,32 +782,16 @@ static int write_vcsr(CPURISCVState *env, int
> csrno, target_ulong val)
> > >  return RISCV_EXCP_NONE;
> > >  }
> > >
> > > +#if defined(CONFIG_USER_ONLY)
> > >  /* User Timers and Counters */
> > >  static target_ulong get_ticks(bool shift)
> > >  {
> > > -int64_t val;
> > > -target_ulong result;
> > > -
> > > -#if !defined(CONFIG_USER_ONLY)
> > > -if (icount_enabled()) {
> > > -val = icount_get();
> > > -} else {
> > > -val = cpu_get_host_ticks();
> > > -}
> > > -#else
> > > -val = cpu_get_host_ticks();
> > > -#endif
> > > -
> > > -if (shift) {
> > > -result = val >> 32;
> > > -} else {
> > > -result = val;
> > > -}
> > > +int64_t val = cpu_get_host_ticks();
> > > +target_ulong result = shift ? val >> 32 : val;
> > >
> > >  return result;
> > >  }
> > >
> > > -#if defined(CONFIG_USER_ONLY)
> > >  static RISCVException read_time(CPURISCVState *env, int csrno,
> > >  target_ulong *val)
> > >  {
> > > @@ -932,6 +916,70 @@ static int write_mhpmeventh(CPURISCVState *env,
> int csrno, target_ulong val)
> > >  return RISCV_EXCP_NONE;
> > >  }
> > >
> > > +static target_ulong
> riscv_pmu_ctr_get_fixed_counters_val(CPURISCVState *env,
> > > +

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Daniel P . Berrangé

On Mon, Feb 05, 2024 at 02:06:39PM -0500, Stefan Hajnoczi wrote:
> On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu  wrote:
> >
> >
> > Daniel P. Berrangé  writes:
> >
> > > $ gcc -Wall -Wuninitialized -o jump jump.c
> >
> > Note that many GCC warnings don't trigger if you don't enable
> > optimizations. In the case you exhibit, adding -O is enough to get
> > a sensible warning:
> >
> > $ gcc -Wall -O -o jump jump.c
> > jump.c: In function ‘main’:
> > jump.c:11:3: warning: ‘foo’ may be used uninitialized
> > [-Wmaybe-uninitialized]
> >11 |   free(foo);
> >   |   ^
> > jump.c:8:9: note: ‘foo’ was declared here
> > 8 |   char *foo = malloc(30);
> >   | ^~~
> 
> llvm also prints a warning:
> 
>   jump.c:5:7: warning: variable 'foo' is used uninitialized whenever
> 'if' condition is true [-Wsometimes-uninitialized]
> 
> I confirmed that QEMU's current compiler flags enable these warnings
> so both gcc and llvm detect the issue that Daniel pointed out in QEMU
> code.
> 
> Daniel: Does this address your concern about compiler warnings?

While it is good that its included when optimization is enabled, IME
GCC is not entirely reliable at detecting unintialized variables as
it has limits on extent of its flow control analysis.

Also consider the maint impact if a method includes C99 decls, and
a contributor later needs to add a 'goto' error cleanup path. They
might be forced to make a bunch of unrelated refactoring changes to
eliminate use of the C99 decls. This makes me not a fan of having a
situation where we tell contributors they can use C99 mixed decls,
except when they can't use them.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Stefan Hajnoczi

On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu  wrote:
>
>
> Daniel P. Berrangé  writes:
>
> > $ gcc -Wall -Wuninitialized -o jump jump.c
>
> Note that many GCC warnings don't trigger if you don't enable
> optimizations. In the case you exhibit, adding -O is enough to get
> a sensible warning:
>
> $ gcc -Wall -O -o jump jump.c
> jump.c: In function ‘main’:
> jump.c:11:3: warning: ‘foo’ may be used uninitialized
> [-Wmaybe-uninitialized]
>11 |   free(foo);
>   |   ^
> jump.c:8:9: note: ‘foo’ was declared here
> 8 |   char *foo = malloc(30);
>   | ^~~

llvm also prints a warning:

  jump.c:5:7: warning: variable 'foo' is used uninitialized whenever
'if' condition is true [-Wsometimes-uninitialized]

I confirmed that QEMU's current compiler flags enable these warnings
so both gcc and llvm detect the issue that Daniel pointed out in QEMU
code.

Daniel: Does this address your concern about compiler warnings?


Stefan

Re: Re: [PATCH v2 4/4] meson: Link with libinotify on FreeBSD

2024-02-05 Thread Ilya Leoshkevich

On Mon, Feb 05, 2024 at 07:36:32PM +0100, Philippe Mathieu-Daudé wrote:
> Hi Ilya,
> 
> On 5/2/24 19:11, Ilya Leoshkevich wrote:
> > make vm-build-freebsd fails with:
> > 
> >  ld: error: undefined symbol: inotify_init1
> >  >>> referenced by filemonitor-inotify.c:183 
> > (../src/util/filemonitor-inotify.c:183)
> >  >>>   util_filemonitor-inotify.c.o:(qemu_file_monitor_new) 
> > in archive libqemuutil.a
> > 
> > On FreeBSD inotify functions are defined in libinotify.so. Add it to
> > the dependencies.
> > 
> > Signed-off-by: Ilya Leoshkevich 
> > ---
> >   meson.build  | 12 +++-
> >   util/meson.build |  6 +-
> >   2 files changed, 16 insertions(+), 2 deletions(-)
> 
> (for some reason your git-diff context is very verbose,
>  making review somehow annoying).

This is because of patch 3. It's essentially the snippet that Daniel
posted, except that patch -p1 applied it at a wrong location! So I
figured I'll send this series with a larger context, but I couldn't
find how to apply this setting to just 1 patch in git send-email.

> > +# libinotify-kqueue
> > +inotify = not_found
> > +if host_os == 'freebsd'
> > +  inotify = cc.find_library('inotify')
> > +endif
> > +
> >   #
> >   # config-host.h #
> >   #
> 
> 
> > @@ -2376,61 +2382,62 @@ have_asan_fiber = false
> >   if get_option('sanitizers') and \
> >  not cc.has_function('__sanitizer_start_switch_fiber',
> >args: '-fsanitize=address',
> >prefix: '#include ')
> > warning('Missing ASAN due to missing fiber annotation interface')
> > warning('Without code annotation, the report may be inferior.')
> >   else
> > have_asan_fiber = true
> >   endif
> >   config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber)
> >   # has_header_symbol
> 
> 
> >   config_host_data.set('CONFIG_INOTIFY',
> >cc.has_header_symbol('sys/inotify.h', 
> > 'inotify_init'))
> >   config_host_data.set('CONFIG_INOTIFY1',
> > - cc.has_header_symbol('sys/inotify.h', 
> > 'inotify_init1'))
> > + cc.has_header_symbol('sys/inotify.h', 
> > 'inotify_init1') and
> > + (host_os != 'freebsd' or inotify.found()))
> 
> Maybe we could use the same pattern as 'have_asan_fiber':
> 
>  have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 'inotify_init1')
>  if have_inotify_init1 and host_os == 'freebsd'
>have_inotify_init1 = cc.find_library('inotify')
>  endif
>  config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1)

I agree, this looks nicer. I will send a v3.

> I wonder why we don't need the similar library check for the
> inotify_init symbol.

Sounds reasonable, it's just that currently the respective config value
is used only in linux-user, but for completeness it won't hurt.

> 
> Regards,
> 
> Phil.

Re: [PATCH v2 2/4] tests/vm/freebsd: Reload the sshd configuration

2024-02-05 Thread Thomas Huth


On 05/02/2024 19.11, Ilya Leoshkevich wrote:

After console_sshd_config(), the SSH server needs to be nudged to pick
up the new configs. The scripts for the other BSD flavors already do
this with a reboot, but a simple reload is sufficient.

Signed-off-by: Ilya Leoshkevich 
---
  tests/vm/freebsd | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Thomas Huth

Re: Re: [PATCH v4 2/4] target/s390x: Emulate CVB, CVBY and CVBG

2024-02-05 Thread Ilya Leoshkevich

On Mon, Feb 05, 2024 at 06:04:27PM +0100, Thomas Huth wrote:
> On 02/02/2024 15.11, Ilya Leoshkevich wrote:
> > Convert to Binary - counterparts of the already implemented Convert
> > to Decimal (CVD*) instructions.
> > Example from the Principles of Operation: 25594C becomes 63FA.
> > 
> > Co-developed-by: Pavel Zbitskiy 
> > Signed-off-by: Ilya Leoshkevich 
> > ---
> >   target/s390x/helper.h|  2 +
> >   target/s390x/tcg/insn-data.h.inc |  4 ++
> >   target/s390x/tcg/int_helper.c| 72 
> >   target/s390x/tcg/translate.c | 16 +++
> >   4 files changed, 94 insertions(+)

[...]

> > +uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec)
> > +{
> > +uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)};
> > +int64_t bin = 0, pow10, tmp;
> > +int digit, i, sign;
> > +
> > +sign = dec64[0] & 0xf;
> > +if (sign < 0xa) {
> > +tcg_s390_data_exception(env, 0, GETPC());
> > +}
> > +dec64[0] >>= 4;
> > +pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1;
> > +
> > +for (i = 1; i < 20; i++) {
> > +digit = dec64[i >> 4] & 0xf;
> > +if (digit > 0x9) {
> > +tcg_s390_data_exception(env, 0, GETPC());
> > +}
> > +dec64[i >> 4] >>= 4;
> > +tmp = pow10 * digit;
> > +if (digit && ((tmp ^ pow10) < 0)) {
> 
> That tmp ^ pow10 caused some frowning for me first, but it's just a check
> whether the sign changed, right? ... a comment in front of the line might be
> helpful.

Good point about writing a comment, I tried to elaborate why checking
the sign is justified, and realized that it's actually redundant.
The int64_t bounds are roughly +-9.2E+18, and the last pow10 value in
this loop is +-1E+18. The multiplication cannot overflow.

> > +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
> > +}
> > +tmp = bin + tmp;
> > +if (bin && ((tmp ^ bin) < 0)) {

The addition, however, can, e.g., bin=9E+17 and tmp=9E+18.
So I'll send a v5 without the first check and with a comment.

[...]

Re: [PATCH v2 4/4] meson: Link with libinotify on FreeBSD

2024-02-05 Thread Philippe Mathieu-Daudé


Hi Ilya,

On 5/2/24 19:11, Ilya Leoshkevich wrote:

make vm-build-freebsd fails with:

 ld: error: undefined symbol: inotify_init1
 >>> referenced by filemonitor-inotify.c:183 
(../src/util/filemonitor-inotify.c:183)
 >>>   util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in 
archive libqemuutil.a

On FreeBSD inotify functions are defined in libinotify.so. Add it to
the dependencies.

Signed-off-by: Ilya Leoshkevich 
---
  meson.build  | 12 +++-
  util/meson.build |  6 +-
  2 files changed, 16 insertions(+), 2 deletions(-)


(for some reason your git-diff context is very verbose,
 making review somehow annoying).


+# libinotify-kqueue
+inotify = not_found
+if host_os == 'freebsd'
+  inotify = cc.find_library('inotify')
+endif
+
  #
  # config-host.h #
  #
  




@@ -2376,61 +2382,62 @@ have_asan_fiber = false
  if get_option('sanitizers') and \
 not cc.has_function('__sanitizer_start_switch_fiber',
   args: '-fsanitize=address',
   prefix: '#include ')
warning('Missing ASAN due to missing fiber annotation interface')
warning('Without code annotation, the report may be inferior.')
  else
have_asan_fiber = true
  endif
  config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber)
  
  # has_header_symbol




  config_host_data.set('CONFIG_INOTIFY',
   cc.has_header_symbol('sys/inotify.h', 'inotify_init'))
  config_host_data.set('CONFIG_INOTIFY1',
- cc.has_header_symbol('sys/inotify.h', 'inotify_init1'))
+ cc.has_header_symbol('sys/inotify.h', 'inotify_init1') and
+ (host_os != 'freebsd' or inotify.found()))


Maybe we could use the same pattern as 'have_asan_fiber':

 have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 
'inotify_init1')

 if have_inotify_init1 and host_os == 'freebsd'
   have_inotify_init1 = cc.find_library('inotify')
 endif
 config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1)

I wonder why we don't need the similar library check for the
inotify_init symbol.

Regards,

Phil.

[PATCH v2 3/4] tests/test-util-filemonitor: Adapt to FreeBSD inotify rename semantics

2024-02-05 Thread Ilya Leoshkevich

Unlike on Linux, on FreeBSD renaming a file when the destination
already exists results in IN_DELETE event for that existing file:

$ FILEMONITOR_DEBUG=1 build/tests/unit/test-util-filemonitor
Rename /tmp/test-util-filemonitor-K13LI2/fish/one.txt -> 
/tmp/test-util-filemonitor-K13LI2/two.txt
Event id=2 event=2 file=one.txt
Queue event id 2 event 2 file one.txt
Queue event id 1 event 2 file two.txt
Queue event id 10002 event 2 file two.txt
Queue event id 1 event 0 file two.txt
Queue event id 10002 event 0 file two.txt
Event id=1 event=0 file=two.txt
Expected event 0 but got 2

This difference in behavior is not expected to break the real users, so
teach the test to accept it.

Suggested-by: Daniel P. Berrange 
Signed-off-by: Ilya Leoshkevich 
---
 tests/unit/test-util-filemonitor.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-util-filemonitor.c 
b/tests/unit/test-util-filemonitor.c
index a22de275955..02e67fc96ac 100644
--- a/tests/unit/test-util-filemonitor.c
+++ b/tests/unit/test-util-filemonitor.c
@@ -333,60 +333,68 @@ test_file_monitor_events(void)
 
 
 { .type = QFILE_MONITOR_TEST_OP_MKDIR,
   .filesrc = "fish", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "fish", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 
 
 { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH,
   .filesrc = "fish/", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH,
   .filesrc = "fish/one.txt", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_CREATE,
   .filesrc = "fish/one.txt", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 
 
 { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH,
   .filesrc = "fish/one.txt", .watchid =  },
 { .type = QFILE_MONITOR_TEST_OP_RENAME,
   .filesrc = "fish/one.txt", .filedst = "two.txt", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "one.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
+#ifdef __FreeBSD__
+{ .type = QFILE_MONITOR_TEST_OP_EVENT,
+  .filesrc = "two.txt", .watchid = ,
+  .eventid = QFILE_MONITOR_EVENT_DELETED },
+{ .type = QFILE_MONITOR_TEST_OP_EVENT,
+  .filesrc = "two.txt", .watchid = ,
+  .eventid = QFILE_MONITOR_EVENT_DELETED },
+#endif
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_CREATED },
 
 
 { .type = QFILE_MONITOR_TEST_OP_RMDIR,
   .filesrc = "fish", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_IGNORED,
   .swapnext = true },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "fish", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
 { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH,
   .filesrc = "fish", .watchid =  },
 
 
 { .type = QFILE_MONITOR_TEST_OP_UNLINK,
   .filesrc = "two.txt", },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
 { .type = QFILE_MONITOR_TEST_OP_EVENT,
   .filesrc = "two.txt", .watchid = ,
   .eventid = QFILE_MONITOR_EVENT_DELETED },
 
-- 
2.43.0

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Samuel Tardieu




Daniel P. Berrangé  writes:


$ gcc -Wall -Wuninitialized -o jump jump.c


Note that many GCC warnings don't trigger if you don't enable 
optimizations. In the case you exhibit, adding -O is enough to get 
a sensible warning:


$ gcc -Wall -O -o jump jump.c
jump.c: In function ‘main’:
jump.c:11:3: warning: ‘foo’ may be used uninitialized 
[-Wmaybe-uninitialized]

  11 |   free(foo);
 |   ^
jump.c:8:9: note: ‘foo’ was declared here
   8 |   char *foo = malloc(30);
 | ^~~

Best.

 Sam
--
Samuel Tardieu

[PATCH v2 1/4] tests/vm: Set UseDNS=no in the sshd configuration

2024-02-05 Thread Ilya Leoshkevich

make vm-build-freebsd sometimes fails with "Connection timed out during
banner exchange". The client strace shows:

13:59:30 write(3, "SSH-2.0-OpenSSH_9.3\r\n", 21) = 21
13:59:30 getpid()   = 252655
13:59:30 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "S", 1)= 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "S", 1)= 1
13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, 
revents=POLLIN}])
13:59:32 read(3, "H", 1)= 1

There is a 2s delay during connection, and ConnectTimeout is set to 1.
Raising it makes the issue go away, but we can do better. The server
truss shows:

888: 27.811414714 socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 5 (0x5)
888: 27.811765030 connect(5,{ AF_INET 10.0.2.3:53 },16) = 0 (0x0)
888: 27.812166941 sendto(5,"\^Z/\^A\0\0\^A\0\0\0\0\0\0\^A2"...,39,0,NULL,0) 
= 39 (0x27)
888: 29.363970743 poll({ 5/POLLRDNORM },1,5000) = 1 (0x1)

So the delay is due to a DNS query. Disable DNS queries in the server
config.

Reviewed-by: Thomas Huth 
Signed-off-by: Ilya Leoshkevich 
---
 tests/vm/basevm.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index 61725b83254..c0d62c08031 100644
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -396,60 +396,62 @@ def console_consume(self):
 self.console_log(output)
 vm.console_socket.setblocking(1)
 
 def console_send(self, command):
 vm = self._guest
 if self.debug:
 logline = re.sub("\n", "", command)
 logline = re.sub("[\x00-\x1f]", ".", logline)
 sys.stderr.write("con send: %s\n" % logline)
 for char in list(command):
 vm.console_socket.send(char.encode("utf-8"))
 time.sleep(0.01)
 
 def console_wait_send(self, wait, command):
 self.console_wait(wait)
 self.console_send(command)
 
 def console_ssh_init(self, prompt, user, pw):
 sshkey_cmd = "echo '%s' > .ssh/authorized_keys\n" \
  % self._config['ssh_pub_key'].rstrip()
 self.console_wait_send("login:","%s\n" % user)
 self.console_wait_send("Password:", "%s\n" % pw)
 self.console_wait_send(prompt,  "mkdir .ssh\n")
 self.console_wait_send(prompt,  sshkey_cmd)
 self.console_wait_send(prompt,  "chmod 755 .ssh\n")
 self.console_wait_send(prompt,  "chmod 644 .ssh/authorized_keys\n")
 
 def console_sshd_config(self, prompt):
 self.console_wait(prompt)
 self.console_send("echo 'PermitRootLogin yes' >> 
/etc/ssh/sshd_config\n")
+self.console_wait(prompt)
+self.console_send("echo 'UseDNS no' >> /etc/ssh/sshd_config\n")
 for var in self.envvars:
 self.console_wait(prompt)
 self.console_send("echo 'AcceptEnv %s' >> /etc/ssh/sshd_config\n" 
% var)
 
 def print_step(self, text):
 sys.stderr.write("### %s ...\n" % text)
 
 def wait_ssh(self, wait_root=False, seconds=300, cmd="exit 0"):
 # Allow more time for VM to boot under TCG.
 if not kvm_available(self.arch):
 seconds *= self.tcg_timeout_multiplier
 starttime = datetime.datetime.now()
 endtime = starttime + datetime.timedelta(seconds=seconds)
 cmd_success = False
 while datetime.datetime.now() < endtime:
 if wait_root and self.ssh_root(cmd) == 0:
 cmd_success = True
 break
 elif self.ssh(cmd) == 0:
 cmd_success = True
 break
 seconds = (endtime - datetime.datetime.now()).total_seconds()
 logging.debug("%ds before timeout", seconds)
 time.sleep(1)
 if not cmd_success:
 raise Exception("Timeout while waiting for guest ssh")
 
 def shutdown(self):
 self._guest.shutdown(timeout=self._shutdown_timeout)
 
-- 
2.43.0

[PATCH v2 2/4] tests/vm/freebsd: Reload the sshd configuration

2024-02-05 Thread Ilya Leoshkevich

After console_sshd_config(), the SSH server needs to be nudged to pick
up the new configs. The scripts for the other BSD flavors already do
this with a reboot, but a simple reload is sufficient.

Signed-off-by: Ilya Leoshkevich 
---
 tests/vm/freebsd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/vm/freebsd b/tests/vm/freebsd
index b581bd17fb7..1247f40a385 100755
--- a/tests/vm/freebsd
+++ b/tests/vm/freebsd
@@ -81,50 +81,51 @@ class FreeBSDVM(basevm.BaseVM):
 self.console_wait("Full name")
 self.console_send("%s\n" % self._config["guest_user"])
 self.console_wait_send("Uid",   "\n")
 self.console_wait_send("Login group",   "\n")
 self.console_wait_send("Login group",   "\n")
 self.console_wait_send("Login class",   "\n")
 self.console_wait_send("Shell", "\n")
 self.console_wait_send("Home directory","\n")
 self.console_wait_send("Home directory perm",   "\n")
 self.console_wait_send("Use password",  "\n")
 self.console_wait_send("Use an empty password", "\n")
 self.console_wait_send("Use a random password", "\n")
 self.console_wait("Enter password:")
 self.console_send("%s\n" % self._config["guest_pass"])
 self.console_wait("Enter password again:")
 self.console_send("%s\n" % self._config["guest_pass"])
 self.console_wait_send("Lock out",  "\n")
 self.console_wait_send("OK","yes\n")
 self.console_wait_send("Add another user",  "no\n")
 self.console_wait_send("~ #", "exit\n")
 
 # setup qemu user
 prompt = "$"
 self.console_ssh_init(prompt, self._config["guest_user"], 
self._config["guest_pass"])
 self.console_wait_send(prompt, "exit\n")
 
 # setup root user
 prompt = "root@freebsd:~ #"
 self.console_ssh_init(prompt, "root", self._config["root_pass"])
 self.console_sshd_config(prompt)
+self.console_wait_send(prompt, "service sshd reload\n")
 
 # setup virtio-blk #1 (tarfile)
 self.console_wait(prompt)
 self.console_send("echo 'chmod 666 /dev/vtbd1' >> /etc/rc.local\n")
 
 pkgs = self.get_qemu_packages_from_lcitool_json()
 self.print_step("Installing packages")
 self.ssh_root_check("pkg install -y %s\n" % " ".join(pkgs))
 
 # shutdown
 self.ssh_root(self.poweroff)
 self.wait()
 
 if os.path.exists(img):
 os.remove(img)
 os.rename(img_tmp, img)
 self.print_step("All done")
 
 if __name__ == "__main__":
 sys.exit(basevm.main(FreeBSDVM, config=FREEBSD_CONFIG))
-- 
2.43.0

[PATCH v2 4/4] meson: Link with libinotify on FreeBSD

2024-02-05 Thread Ilya Leoshkevich

make vm-build-freebsd fails with:

ld: error: undefined symbol: inotify_init1
>>> referenced by filemonitor-inotify.c:183 
(../src/util/filemonitor-inotify.c:183)
>>>   util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in 
archive libqemuutil.a

On FreeBSD inotify functions are defined in libinotify.so. Add it to
the dependencies.

Signed-off-by: Ilya Leoshkevich 
---
 meson.build  | 12 +++-
 util/meson.build |  6 +-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index b5d6dc94a83..819cdedebe2 100644
--- a/meson.build
+++ b/meson.build
@@ -1982,60 +1982,66 @@ if libbpf.found() and not cc.links('''
#include 
int main(void)
{
  bpf_object__destroy_skeleton(NULL);
  return 0;
}''', dependencies: libbpf)
   libbpf = not_found
   if get_option('bpf').enabled()
 error('libbpf skeleton test failed')
   else
 warning('libbpf skeleton test failed, disabling')
   endif
 endif
 
 # libxdp
 libxdp = not_found
 if not get_option('af_xdp').auto() or have_system
 libxdp = dependency('libxdp', required: get_option('af_xdp'),
 version: '>=1.4.0', method: 'pkg-config')
 endif
 
 # libdw
 libdw = not_found
 if not get_option('libdw').auto() or \
 (not get_option('prefer_static') and (have_system or have_user))
 libdw = dependency('libdw',
method: 'pkg-config',
required: get_option('libdw'))
 endif
 
+# libinotify-kqueue
+inotify = not_found
+if host_os == 'freebsd'
+  inotify = cc.find_library('inotify')
+endif
+
 #
 # config-host.h #
 #
 
 config_host_data = configuration_data()
 
 audio_drivers_selected = []
 if have_system
   audio_drivers_available = {
 'alsa': alsa.found(),
 'coreaudio': coreaudio.found(),
 'dsound': dsound.found(),
 'jack': jack.found(),
 'oss': oss.found(),
 'pa': pulse.found(),
 'pipewire': pipewire.found(),
 'sdl': sdl.found(),
 'sndio': sndio.found(),
   }
   foreach k, v: audio_drivers_available
 config_host_data.set('CONFIG_AUDIO_' + k.to_upper(), v)
   endforeach
 
   # Default to native drivers first, OSS second, SDL third
   audio_drivers_priority = \
 [ 'pa', 'coreaudio', 'dsound', 'sndio', 'oss' ] + \
 (host_os == 'linux' ? [] : [ 'sdl' ])
   audio_drivers_default = []
   foreach k: audio_drivers_priority
 if audio_drivers_available[k]
@@ -2376,61 +2382,62 @@ have_asan_fiber = false
 if get_option('sanitizers') and \
not cc.has_function('__sanitizer_start_switch_fiber',
  args: '-fsanitize=address',
  prefix: '#include ')
   warning('Missing ASAN due to missing fiber annotation interface')
   warning('Without code annotation, the report may be inferior.')
 else
   have_asan_fiber = true
 endif
 config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber)
 
 # has_header_symbol
 config_host_data.set('CONFIG_BLKZONED',
  cc.has_header_symbol('linux/blkzoned.h', 'BLKOPENZONE'))
 config_host_data.set('CONFIG_EPOLL_CREATE1',
  cc.has_header_symbol('sys/epoll.h', 'epoll_create1'))
 config_host_data.set('CONFIG_FALLOCATE_PUNCH_HOLE',
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_PUNCH_HOLE') and
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_KEEP_SIZE'))
 config_host_data.set('CONFIG_FALLOCATE_ZERO_RANGE',
  cc.has_header_symbol('linux/falloc.h', 
'FALLOC_FL_ZERO_RANGE'))
 config_host_data.set('CONFIG_FIEMAP',
  cc.has_header('linux/fiemap.h') and
  cc.has_header_symbol('linux/fs.h', 'FS_IOC_FIEMAP'))
 config_host_data.set('CONFIG_GETRANDOM',
  cc.has_function('getrandom') and
  cc.has_header_symbol('sys/random.h', 'GRND_NONBLOCK'))
 config_host_data.set('CONFIG_INOTIFY',
  cc.has_header_symbol('sys/inotify.h', 'inotify_init'))
 config_host_data.set('CONFIG_INOTIFY1',
- cc.has_header_symbol('sys/inotify.h', 'inotify_init1'))
+ cc.has_header_symbol('sys/inotify.h', 'inotify_init1') and
+ (host_os != 'freebsd' or inotify.found()))
 config_host_data.set('CONFIG_PRCTL_PR_SET_TIMERSLACK',
  cc.has_header_symbol('sys/prctl.h', 'PR_SET_TIMERSLACK'))
 config_host_data.set('CONFIG_RTNETLINK',
  cc.has_header_symbol('linux/rtnetlink.h', 
'IFLA_PROTO_DOWN'))
 config_host_data.set('CONFIG_SYSMACROS',
  cc.has_header_symbol('sys/sysmacros.h', 'makedev'))
 config_host_data.set('HAVE_OPTRESET',
  cc.has_header_symbol('getopt.h', 'optreset'))
 config_host_data.set('HAVE_IPPROTO_MPTCP',
  cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP'))
 
 # has_member

[PATCH v2 0/4] make vm-build-freebsd fixes

2024-02-05 Thread Ilya Leoshkevich

v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05155.html
v1 -> v2: Link with libinotify instead of disabling the inotify
  support (Daniel).
  Use a bit more context lines in order to prevent the
  incorrect application of the test patch.

Hi,

I needed to verify that my qemu-user changes didn't break BSD, and
Daniel Berrange suggested vm-build-freebsd on IRC. I had several
problems with it, which this series resolves.

Best regards,
Ilya

Ilya Leoshkevich (4):
  tests/vm: Set UseDNS=no in the sshd configuration
  tests/vm/freebsd: Reload the sshd configuration
  tests/test-util-filemonitor: Adapt to FreeBSD inotify rename semantics
  meson: Link with libinotify on FreeBSD

 meson.build| 12 +++-
 tests/unit/test-util-filemonitor.c |  8 
 tests/vm/basevm.py |  2 ++
 tests/vm/freebsd   |  1 +
 util/meson.build   |  6 +-
 5 files changed, 27 insertions(+), 2 deletions(-)

-- 
2.43.0

Re: [PATCH v2 23/23] migration/multifd: Optimize sender side to be lockless

2024-02-05 Thread Fabiano Rosas

Peter Xu  writes:

> On Mon, Feb 05, 2024 at 11:10:34AM -0300, Fabiano Rosas wrote:
>> > (maybe I can repost this single patch in-place to avoid another round of
>> >  mail bombs..)
>> 
>> Sure.
>
> I've got the final version attached here.  Feel free to have a look, thanks.
>
> 
> From 6ba337320430feae4ce9d3d906ea19f68430642d Mon Sep 17 00:00:00 2001
> From: Peter Xu 
> Date: Fri, 2 Feb 2024 18:28:57 +0800
> Subject: [PATCH] migration/multifd: Optimize sender side to be lockless
>
> When reviewing my attempt to refactor send_prepare(), Fabiano suggested we
> try out with dropping the mutex in multifd code [1].
>
> I thought about that before but I never tried to change the code.  Now
> maybe it's time to give it a stab.  This only optimizes the sender side.
>
> The trick here is multifd has a clear provider/consumer model, that the
> migration main thread publishes requests (either pending_job/pending_sync),
> while the multifd sender threads are consumers.  Here we don't have a lot
> of complicated data sharing, and the jobs can logically be submitted
> lockless.
>
> Arm the code with atomic weapons.  Two things worth mentioning:
>
>   - For multifd_send_pages(): we can use qatomic_load_acquire() when trying
>   to find a free channel, but that's expensive if we attach one ACQUIRE per
>   channel.  Instead, keep the qatomic_read() on reading the pending_job
>   flag as we do already, meanwhile use one smp_mb_acquire() after the loop
>   to guarantee the memory ordering.
>
>   - For pending_sync: it doesn't have any extra data required since now
>   p->flags are never touched, it should be safe to not use memory barrier.
>   That's different from pending_job.
>
> Provide rich comments for all the lockless operations to state how they are
> paired.  With that, we can remove the mutex.
>
> [1] https://lore.kernel.org/r/87o7d1jlu5@suse.de
>
> Suggested-by: Fabiano Rosas 
> Link: https://lore.kernel.org/r/20240202102857.110210-24-pet...@redhat.com
> Signed-off-by: Peter Xu 
> ---
>  migration/multifd.h |  2 --
>  migration/multifd.c | 51 +++--
>  2 files changed, 26 insertions(+), 27 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 98876ff94a..78a2317263 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -91,8 +91,6 @@ typedef struct {
>  /* syncs main thread and channels */
>  QemuSemaphore sem_sync;
>  
> -/* this mutex protects the following parameters */
> -QemuMutex mutex;
>  /* is this channel thread running */
>  bool running;
>  /* multifd flags for each packet */
> diff --git a/migration/multifd.c b/migration/multifd.c
> index b317d57d61..fbdb129088 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -501,19 +501,19 @@ static bool multifd_send_pages(void)
>  }
>  }
>  
> -qemu_mutex_lock(>mutex);
> -assert(!p->pages->num);
> -assert(!p->pages->block);
>  /*
> - * Double check on pending_job==false with the lock.  In the future if
> - * we can have >1 requester thread, we can replace this with a "goto
> - * retry", but that is for later.
> + * Make sure we read p->pending_job before all the rest.  Pairs with
> + * qatomic_store_release() in multifd_send_thread().
>   */
> -assert(qatomic_read(>pending_job) == false);
> -qatomic_set(>pending_job, true);
> +smp_mb_acquire();
> +assert(!p->pages->num);
>  multifd_send_state->pages = p->pages;
>  p->pages = pages;
> -qemu_mutex_unlock(>mutex);
> +/*
> + * Making sure p->pages is setup before marking pending_job=true. Pairs
> + * with the qatomic_load_acquire() in multifd_send_thread().
> + */
> +qatomic_store_release(>pending_job, true);
>  qemu_sem_post(>sem);
>  
>  return true;
> @@ -648,7 +648,6 @@ static bool 
> multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
>  }
>  multifd_send_channel_destroy(p->c);
>  p->c = NULL;
> -qemu_mutex_destroy(>mutex);
>  qemu_sem_destroy(>sem);
>  qemu_sem_destroy(>sem_sync);
>  g_free(p->name);
> @@ -742,14 +741,12 @@ int multifd_send_sync_main(void)
>  
>  trace_multifd_send_sync_main_signal(p->id);
>  
> -qemu_mutex_lock(>mutex);
>  /*
>   * We should be the only user so far, so not possible to be set by
>   * others concurrently.
>   */
>  assert(qatomic_read(>pending_sync) == false);
>  qatomic_set(>pending_sync, true);
> -qemu_mutex_unlock(>mutex);
>  qemu_sem_post(>sem);
>  }
>  for (i = 0; i < migrate_multifd_channels(); i++) {
> @@ -796,9 +793,12 @@ static void *multifd_send_thread(void *opaque)
>  if (multifd_send_should_exit()) {
>  break;
>  }
> -qemu_mutex_lock(>mutex);
>  
> -if (qatomic_read(>pending_job)) {
> +/*
> + * Read pending_job flag before p->pages.  Pairs with

Re: [GIT PULL 5/8] util: Add write-only "node-affinity" property for ThreadContext

2024-02-05 Thread David Hildenbrand


On 05.02.24 17:13, Claudio Fontana wrote:

Hello David,



Hi,


It would seem to me that a lot of the calling code like qemu_prealloc_mem for 
example
should be sysemu-only, not used for tools, or user mode either right?

And the thread_context.c itself should also be sysemu-only, correct?


Yes, both should currently only get used for sysemu only. Memory 
backends and virtio-mem are sysemu-only.


--
Cheers,

David / dhildenb

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Peter Maydell

On Mon, 5 Feb 2024 at 17:41, Daniel P. Berrangé  wrote:
> Rather than accept the status quo and remove the coding guideline,
> I think we should strengthen the guidelines, such that it is
> explicitly forbidden in any method that uses 'goto'. Personally
> I'd go all the way to -Werror=declaration-after-statement, as
> while C99 mixed decl is appealing, it isn't exactly a game
> changer in improving code maintainability.

I definitely think we should either (a) say we're OK with
mixed declarations or (b) enforce it via the warning option.
(gcc -Wdeclaration-after-statement does not appear to warn
for the "declaration inside for()" syntax, so we could
enable it if we wanted to fix the existing style lapses.)

Personally I prefer declaration-at-top-of-block, but
more as an aesthetic choice than anything else.

thanks
-- PMM

Re: [PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Daniel P . Berrangé

On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote:
> C99 mixed declarations support interleaving of local variable
> declarations and code.
> 
> The coding style "generally" forbids C99 mixed declarations with some
> exceptions to the rule. This rule is not checked by checkpatch.pl and
> naturally there are violations in the source tree.
> 
> While contemplating adding another exception, I came to the conclusion
> that the best location for declarations depends on context. Let the
> programmer declare variables where it is best for legibility. Don't try
> to define all possible scenarios/exceptions.

IIRC, we had a discussion on this topic sometime last year, but can't
remember what the $SUBJECT was, so I'll just repeat what I said then.

Combining C99 mixed declarations with 'goto' is dangerous.

Consider this program:

$ cat jump.c
#include 

int main(int argc, char**argv) {

  if (getenv("SKIP"))
goto target;

  char *foo = malloc(30);

 target:
  free(foo);
}

$ gcc -Wall -Wuninitialized -o jump jump.c

$ SKIP=1 ./jump 
free(): invalid pointer
Aborted (core dumped)


 -> The programmer thinks they have initialized 'foo'
 -> GCC thinks the programmer has initialized 'foo'
 -> Yet 'foo' is not guaranteed to be initialized at 'target:'

Given that QEMU makes heavy use of 'goto', allowing C99 mixed
declarations exposes us to significant danger.

Full disclosure, GCC fails to diagnmose this mistake, even
with a decl at start of 'main', but at least the mistake is
now more visible to the programmer.

Fortunately with -fanalyzer GCC can diagnose this:

$ gcc -fanalyzer -Wall -o jump jump.c
jump.c: In function ‘main’:
jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] 
[-Wanalyzer-use-of-uninitialized-value]
   12 |   free(foo);
  |   ^
  ‘main’: events 1-5
|
|6 |   if (getenv("SKIP"))
|  |  ~  
|  |  |
|  |  (3) following ‘true’ branch...
|7 | goto target;
|  | 
|  | |
|  | (4) ...to here
|8 | 
|9 |  char *foo = malloc(30);
|  |^~~
|  ||
|  |(1) region created on stack here
|  |(2) capacity: 8 bytes
|..
|   12 |   free(foo);
|  |   ~
|  |   |
|  |   (5) use of uninitialized value ‘foo’ here


...but -fanalyzer isn't something we have enabled by default, it
is opt-in. I'm also not sure how comprehensive the flow control
analysis of -fanalyzer is ?  Can we be sure it'll catch these
mistakes in large complex functions with many code paths ?

Even if the compiler does reliably warn, I think the code pattern
remains misleading to contributors, as the flow control flaw is
very non-obvious.

Rather than accept the status quo and remove the coding guideline,
I think we should strengthen the guidelines, such that it is
explicitly forbidden in any method that uses 'goto'. Personally
I'd go all the way to -Werror=declaration-after-statement, as
while C99 mixed decl is appealing, it isn't exactly a game
changer in improving code maintainability.

> Signed-off-by: Stefan Hajnoczi 
> ---
>  docs/devel/style.rst | 20 
>  1 file changed, 20 deletions(-)
> 
> diff --git a/docs/devel/style.rst b/docs/devel/style.rst
> index 2f68b50079..80c4e4df52 100644
> --- a/docs/devel/style.rst
> +++ b/docs/devel/style.rst
> @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) 
> bracing style reduces
>  ambiguity and avoids needless churn when lines are added or removed.
>  Furthermore, it is the QEMU coding style.
>  
> -Declarations
> -
> -
> -Mixed declarations (interleaving statements and declarations within
> -blocks) are generally not allowed; declarations should be at the beginning
> -of blocks. To avoid accidental re-use it is permissible to declare
> -loop variables inside for loops:
> -
> -.. code-block:: c
> -
> -for (int i = 0; i < ARRAY_SIZE(thing); i++) {
> -/* do something loopy */
> -}
> -
> -Every now and then, an exception is made for declarations inside a
> -#ifdef or #ifndef block: if the code looks nicer, such declarations can
> -be placed at the top of the block even if there are statements above.
> -On the other hand, however, it's often best to move that #ifdef/#ifndef
> -block to a separate function altogether.
> -
>  Conditional statements
>  ==
>  
> -- 
> 2.43.0
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[PATCH 0/5] virtio-blk: iothread-vq-mapping cleanups

2024-02-05 Thread Stefan Hajnoczi

Hanna reviewed the iothread-vq-mapping patches after they were applied to
qemu.git. This series consists of code cleanups that Hanna identified.

There are no functional changes or bug fixes that need to be backported to the
stable tree here, but it may make sense to backport them in the future to avoid
conflicts.

Stefan Hajnoczi (5):
  virtio-blk: enforce iothread-vq-mapping validation
  virtio-blk: clarify that there is at least 1 virtqueue
  virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb()
  virtio-blk: declare VirtIOBlock::rq with a type
  monitor: use aio_co_reschedule_self()

 include/hw/virtio/virtio-blk.h |   2 +-
 hw/block/virtio-blk.c  | 194 ++---
 qapi/qmp-dispatch.c|   7 +-
 3 files changed, 112 insertions(+), 91 deletions(-)

-- 
2.43.0

[PATCH 2/5] virtio-blk: clarify that there is at least 1 virtqueue

2024-02-05 Thread Stefan Hajnoczi

It is not possible to instantiate a virtio-blk device with 0 virtqueues.
The following check is located in ->realize():

  if (!conf->num_queues) {
  error_setg(errp, "num-queues property must be larger than 0");
  return;
  }

Later on we access s->vq_aio_context[0] under the assumption that there
is as least one virtqueue. Hanna Czenczek  noted that
it would help to show that the array index is already valid.

Add an assertion to document that s->vq_aio_context[0] is always
safe...and catch future code changes that break this assumption.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e8b37fd5f4..a0735a9bca 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1825,6 +1825,7 @@ static int virtio_blk_start_ioeventfd(VirtIODevice *vdev)
  * Try to change the AioContext so that block jobs and other operations can
  * co-locate their activity in the same AioContext. If it fails, nevermind.
  */
+assert(nvqs > 0); /* enforced during ->realize() */
 r = blk_set_aio_context(s->conf.conf.blk, s->vq_aio_context[0],
 _err);
 if (r < 0) {
-- 
2.43.0

[PATCH 3/5] virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb()

2024-02-05 Thread Stefan Hajnoczi

Hanna Czenczek  noted that the array index in
virtio_blk_dma_restart_cb() is not bounds-checked:

  g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues);
  ...
  while (rq) {
  VirtIOBlockReq *next = rq->next;
  uint16_t idx = virtio_get_queue_index(rq->vq);

  rq->next = vq_rq[idx];
 ^^

The code is correct because both rq->vq and vq_rq[] depend on
num_queues, but this is indirect and not 100% obvious. Add an assertion.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a0735a9bca..f3193f4b75 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1209,6 +1209,7 @@ static void virtio_blk_dma_restart_cb(void *opaque, bool 
running,
 VirtIOBlockReq *next = rq->next;
 uint16_t idx = virtio_get_queue_index(rq->vq);
 
+assert(idx < num_queues);
 rq->next = vq_rq[idx];
 vq_rq[idx] = rq;
 rq = next;
-- 
2.43.0

[PATCH 5/5] monitor: use aio_co_reschedule_self()

2024-02-05 Thread Stefan Hajnoczi

The aio_co_reschedule_self() API is designed to avoid the race
condition between scheduling the coroutine in another AioContext and
yielding.

The QMP dispatch code uses the open-coded version that appears
susceptible to the race condition at first glance:

  aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
  qemu_coroutine_yield();

The code is actually safe because the iohandler and qemu_aio_context
AioContext run under the Big QEMU Lock. Nevertheless, set a good example
and use aio_co_reschedule_self() so it's obvious that there is no race.

Suggested-by: Hanna Reitz 
Signed-off-by: Stefan Hajnoczi 
---
 qapi/qmp-dispatch.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 176b549473..f3488afeef 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -212,8 +212,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
  * executing the command handler so that it can make progress if it
  * involves an AIO_WAIT_WHILE().
  */
-aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
-qemu_coroutine_yield();
+aio_co_reschedule_self(qemu_get_aio_context());
 }
 
 monitor_set_cur(qemu_coroutine_self(), cur_mon);
@@ -227,9 +226,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
  * Move back to iohandler_ctx so that nested event loops for
  * qemu_aio_context don't start new monitor commands.
  */
-aio_co_schedule(iohandler_get_aio_context(),
-qemu_coroutine_self());
-qemu_coroutine_yield();
+aio_co_reschedule_self(iohandler_get_aio_context());
 }
 } else {
/*
-- 
2.43.0

[PATCH 1/5] virtio-blk: enforce iothread-vq-mapping validation

2024-02-05 Thread Stefan Hajnoczi

Hanna Czenczek  noticed that the safety of
`vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is
not obvious.

The code is structured in validate() + apply() steps so input validation
is there, but it happens way earlier and there is nothing that
guarantees apply() can only be called with validated inputs.

This patch moves the validate() call inside the apply() function so
validation is guaranteed. I also added the bounds checking assertion
that Hanna suggested.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c | 192 +++---
 1 file changed, 107 insertions(+), 85 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 227d83569f..e8b37fd5f4 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1485,6 +1485,72 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 return 0;
 }
 
+static void virtio_resize_cb(void *opaque)
+{
+VirtIODevice *vdev = opaque;
+
+assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+virtio_notify_config(vdev);
+}
+
+static void virtio_blk_resize(void *opaque)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+
+/*
+ * virtio_notify_config() needs to acquire the BQL,
+ * so it can't be called from an iothread. Instead, schedule
+ * it to be run in the main context BH.
+ */
+aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev);
+}
+
+static void virtio_blk_ioeventfd_detach(VirtIOBlock *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+for (uint16_t i = 0; i < s->conf.num_queues; i++) {
+VirtQueue *vq = virtio_get_queue(vdev, i);
+virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]);
+}
+}
+
+static void virtio_blk_ioeventfd_attach(VirtIOBlock *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+for (uint16_t i = 0; i < s->conf.num_queues; i++) {
+VirtQueue *vq = virtio_get_queue(vdev, i);
+virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]);
+}
+}
+
+/* Suspend virtqueue ioeventfd processing during drain */
+static void virtio_blk_drained_begin(void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+if (s->ioeventfd_started) {
+virtio_blk_ioeventfd_detach(s);
+}
+}
+
+/* Resume virtqueue ioeventfd processing after drain */
+static void virtio_blk_drained_end(void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+if (s->ioeventfd_started) {
+virtio_blk_ioeventfd_attach(s);
+}
+}
+
+static const BlockDevOps virtio_block_ops = {
+.resize_cb = virtio_blk_resize,
+.drained_begin = virtio_blk_drained_begin,
+.drained_end   = virtio_blk_drained_end,
+};
+
 static bool
 validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list,
 uint16_t num_queues, Error **errp)
@@ -1547,81 +1613,33 @@ 
validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list,
 return true;
 }
 
-static void virtio_resize_cb(void *opaque)
-{
-VirtIODevice *vdev = opaque;
-
-assert(qemu_get_current_aio_context() == qemu_get_aio_context());
-virtio_notify_config(vdev);
-}
-
-static void virtio_blk_resize(void *opaque)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
-
-/*
- * virtio_notify_config() needs to acquire the BQL,
- * so it can't be called from an iothread. Instead, schedule
- * it to be run in the main context BH.
- */
-aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev);
-}
-
-static void virtio_blk_ioeventfd_detach(VirtIOBlock *s)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-
-for (uint16_t i = 0; i < s->conf.num_queues; i++) {
-VirtQueue *vq = virtio_get_queue(vdev, i);
-virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]);
-}
-}
-
-static void virtio_blk_ioeventfd_attach(VirtIOBlock *s)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-
-for (uint16_t i = 0; i < s->conf.num_queues; i++) {
-VirtQueue *vq = virtio_get_queue(vdev, i);
-virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]);
-}
-}
-
-/* Suspend virtqueue ioeventfd processing during drain */
-static void virtio_blk_drained_begin(void *opaque)
-{
-VirtIOBlock *s = opaque;
-
-if (s->ioeventfd_started) {
-virtio_blk_ioeventfd_detach(s);
-}
-}
-
-/* Resume virtqueue ioeventfd processing after drain */
-static void virtio_blk_drained_end(void *opaque)
-{
-VirtIOBlock *s = opaque;
-
-if (s->ioeventfd_started) {
-virtio_blk_ioeventfd_attach(s);
-}
-}
-
-static const BlockDevOps virtio_block_ops = {
-.resize_cb = virtio_blk_resize,
-.drained_begin = virtio_blk_drained_begin,
-.drained_end   = virtio_blk_drained_end,
-};
-
-/* Generate vq:AioContext mappings from a validated iothread-vq-mapping list */
-static void
-apply_vq_mapping(IOThreadVirtQueueMappingList *iothread_vq_mapping_list,
- AioContext **vq_aio_context, uint16_t

[PATCH 4/5] virtio-blk: declare VirtIOBlock::rq with a type

2024-02-05 Thread Stefan Hajnoczi

The VirtIOBlock::rq field has had the type void * since its introduction
in commit 869a5c6df19a ("Stop VM on error in virtio-blk. (Gleb
Natapov)").

Perhaps this was done to avoid the forward declaration of
VirtIOBlockReq.

Hanna Czenczek  pointed out the missing type. Specify
the actual type because there is no need to use void * here.

Suggested-by: Hanna Czenczek 
Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio-blk.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 833a9a344f..5c14110c4b 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -55,7 +55,7 @@ struct VirtIOBlock {
 VirtIODevice parent_obj;
 BlockBackend *blk;
 QemuMutex rq_lock;
-void *rq; /* protected by rq_lock */
+struct VirtIOBlockReq *rq; /* protected by rq_lock */
 VirtIOBlkConf conf;
 unsigned short sector_mask;
 bool original_wce;
-- 
2.43.0

[PATCH] docs/style: allow C99 mixed declarations

2024-02-05 Thread Stefan Hajnoczi

C99 mixed declarations support interleaving of local variable
declarations and code.

The coding style "generally" forbids C99 mixed declarations with some
exceptions to the rule. This rule is not checked by checkpatch.pl and
naturally there are violations in the source tree.

While contemplating adding another exception, I came to the conclusion
that the best location for declarations depends on context. Let the
programmer declare variables where it is best for legibility. Don't try
to define all possible scenarios/exceptions.

Signed-off-by: Stefan Hajnoczi 
---
 docs/devel/style.rst | 20 
 1 file changed, 20 deletions(-)

diff --git a/docs/devel/style.rst b/docs/devel/style.rst
index 2f68b50079..80c4e4df52 100644
--- a/docs/devel/style.rst
+++ b/docs/devel/style.rst
@@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) bracing 
style reduces
 ambiguity and avoids needless churn when lines are added or removed.
 Furthermore, it is the QEMU coding style.
 
-Declarations
-
-
-Mixed declarations (interleaving statements and declarations within
-blocks) are generally not allowed; declarations should be at the beginning
-of blocks. To avoid accidental re-use it is permissible to declare
-loop variables inside for loops:
-
-.. code-block:: c
-
-for (int i = 0; i < ARRAY_SIZE(thing); i++) {
-/* do something loopy */
-}
-
-Every now and then, an exception is made for declarations inside a
-#ifdef or #ifndef block: if the code looks nicer, such declarations can
-be placed at the top of the block even if there are statements above.
-On the other hand, however, it's often best to move that #ifdef/#ifndef
-block to a separate function altogether.
-
 Conditional statements
 ==
 
-- 
2.43.0

Re: [PATCH v4 2/4] target/s390x: Emulate CVB, CVBY and CVBG

2024-02-05 Thread Thomas Huth


On 02/02/2024 15.11, Ilya Leoshkevich wrote:

Convert to Binary - counterparts of the already implemented Convert
to Decimal (CVD*) instructions.
Example from the Principles of Operation: 25594C becomes 63FA.

Co-developed-by: Pavel Zbitskiy 
Signed-off-by: Ilya Leoshkevich 
---
  target/s390x/helper.h|  2 +
  target/s390x/tcg/insn-data.h.inc |  4 ++
  target/s390x/tcg/int_helper.c| 72 
  target/s390x/tcg/translate.c | 16 +++
  4 files changed, 94 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 332a9a9c632..cc1c20e9e3f 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -88,6 +88,8 @@ DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, 
i64)
  DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
  DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
  DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
+DEF_HELPER_3(cvb, void, env, i32, i64)
+DEF_HELPER_FLAGS_2(cvbg, TCG_CALL_NO_WG, i64, env, i128)
  DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
  DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64)
  DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc
index 388dcb8dbbc..e7d61cdec28 100644
--- a/target/s390x/tcg/insn-data.h.inc
+++ b/target/s390x/tcg/insn-data.h.inc
@@ -293,6 +293,10 @@
  D(0xec73, CLFIT,   RIE_a, GIE, r1_32u, i2_16u, 0, 0, ct, 0, 1)
  D(0xec71, CLGIT,   RIE_a, GIE, r1_o, i2_16u, 0, 0, ct, 0, 1)
  
+/* CONVERT TO BINARY */

+C(0x4f00, CVB, RX_a,  Z,   la2, 0, 0, 0, cvb, 0)
+C(0xe306, CVBY,RXY_a, LD,  la2, 0, 0, 0, cvb, 0)
+C(0xe30e, CVBG,RXY_a, Z,   la2, 0, r1, 0, cvbg, 0)
  /* CONVERT TO DECIMAL */
  C(0x4e00, CVD, RX_a,  Z,   r1_o, a2, 0, 0, cvd, 0)
  C(0xe326, CVDY,RXY_a, LD,  r1_o, a2, 0, 0, cvd, 0)
diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
index 121e3006a65..17974375e98 100644
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -25,6 +25,7 @@
  #include "exec/exec-all.h"
  #include "qemu/host-utils.h"
  #include "exec/helper-proto.h"
+#include "exec/cpu_ldst.h"
  
  /* #define DEBUG_HELPER */

  #ifdef DEBUG_HELPER
@@ -98,6 +99,77 @@ Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, 
uint64_t al, uint64_t b)
  tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
  }
  
+void HELPER(cvb)(CPUS390XState *env, uint32_t r1, uint64_t dec)

+{
+int64_t pow10 = 1, bin = 0;
+int digit, sign;
+
+sign = dec & 0xf;
+if (sign < 0xa) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec >>= 4;
+
+while (dec) {
+digit = dec & 0xf;
+if (digit > 0x9) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec >>= 4;
+bin += digit * pow10;
+pow10 *= 10;
+}
+
+if (sign == 0xb || sign == 0xd) {
+bin = -bin;
+}
+
+/* R1 is updated even on fixed-point-divide exception. */
+env->regs[r1] = (env->regs[r1] & 0xULL) | (uint32_t)bin;
+if (bin != (int32_t)bin) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+}
+
+uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec)
+{
+uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)};
+int64_t bin = 0, pow10, tmp;
+int digit, i, sign;
+
+sign = dec64[0] & 0xf;
+if (sign < 0xa) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec64[0] >>= 4;
+pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1;
+
+for (i = 1; i < 20; i++) {
+digit = dec64[i >> 4] & 0xf;
+if (digit > 0x9) {
+tcg_s390_data_exception(env, 0, GETPC());
+}
+dec64[i >> 4] >>= 4;
+tmp = pow10 * digit;
+if (digit && ((tmp ^ pow10) < 0)) {


That tmp ^ pow10 caused some frowning for me first, but it's just a check 
whether the sign changed, right? ... a comment in front of the line might be 
helpful.



+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+tmp = bin + tmp;
+if (bin && ((tmp ^ bin) < 0)) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+bin = tmp;
+pow10 *= 10;
+}
+
+g_assert(!dec64[0]);
+if (dec64[1]) {
+tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+}
+
+return bin;
+}


Patch looks sane to me now, but I'd appreciate if Richard and/or David could 
have a look at this, too!


 Thomas

[PATCH v9 4/5] qmp: Added new command to retrieve eBPF blob.

2024-02-05 Thread Andrew Melnychenko

Now, the binary objects may be retrieved by id.
It would require for future qmp commands that may require specific
eBPF blob.

Added command "request-ebpf". This command returns
eBPF program encoded base64. The program taken from the
skeleton and essentially is an ELF object that can be
loaded in the future with libbpf.

The reason to use the command to provide the eBPF object
instead of a separate artifact was to avoid issues related
to finding the eBPF itself. eBPF object is an ELF binary
that contains the eBPF program and eBPF map description(BTF).
Overall, eBPF object should contain the program and enough
metadata to create/load eBPF with libbpf. As the eBPF
maps/program should correspond to QEMU, the eBPF can't
be used from different QEMU build.

The first solution was a helper that comes with QEMU
and loads appropriate eBPF objects. And the issue is
to find a proper helper if the system has several
different QEMUs installed and/or built from the source,
which helpers may not be compatible.

Another issue is QEMU updating while there is a running
QEMU instance. With an updated helper, it may not be
possible to hotplug virtio-net device to the already
running QEMU. Overall, requesting the eBPF object from
QEMU itself solves possible failures with acceptable effort.

Links:
[PATCH 3/5] qmp: Added the helper stamp check.
https://lore.kernel.org/all/20230219162100.174318-4-and...@daynix.com/

Signed-off-by: Andrew Melnychenko 
---
 ebpf/ebpf.c   | 69 +++
 ebpf/ebpf.h   | 29 ++
 ebpf/ebpf_rss.c   |  6 
 ebpf/meson.build  |  2 +-
 qapi/ebpf.json| 66 +
 qapi/meson.build  |  1 +
 qapi/qapi-schema.json |  1 +
 7 files changed, 173 insertions(+), 1 deletion(-)
 create mode 100644 ebpf/ebpf.c
 create mode 100644 ebpf/ebpf.h
 create mode 100644 qapi/ebpf.json

diff --git a/ebpf/ebpf.c b/ebpf/ebpf.c
new file mode 100644
index 00..2d73beb479
--- /dev/null
+++ b/ebpf/ebpf.c
@@ -0,0 +1,69 @@
+/*
+ * QEMU eBPF binary declaration routine.
+ *
+ * Developed by Daynix Computing LTD (http://www.daynix.com)
+ *
+ * Authors:
+ *  Andrew Melnychenko 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/queue.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-ebpf.h"
+#include "ebpf/ebpf.h"
+
+typedef struct ElfBinaryDataEntry {
+int id;
+const void *data;
+size_t datalen;
+
+QSLIST_ENTRY(ElfBinaryDataEntry) node;
+} ElfBinaryDataEntry;
+
+static QSLIST_HEAD(, ElfBinaryDataEntry) ebpf_elf_obj_list =
+QSLIST_HEAD_INITIALIZER();
+
+void ebpf_register_binary_data(int id, const void *data, size_t datalen)
+{
+struct ElfBinaryDataEntry *dataentry = NULL;
+
+dataentry = g_new0(struct ElfBinaryDataEntry, 1);
+dataentry->data = data;
+dataentry->datalen = datalen;
+dataentry->id = id;
+
+QSLIST_INSERT_HEAD(_elf_obj_list, dataentry, node);
+}
+
+const void *ebpf_find_binary_by_id(int id, size_t *sz, Error **errp)
+{
+struct ElfBinaryDataEntry *it = NULL;
+QSLIST_FOREACH(it, _elf_obj_list, node) {
+if (id == it->id) {
+*sz = it->datalen;
+return it->data;
+}
+}
+
+error_setg(errp, "can't find eBPF object with id: %d", id);
+
+return NULL;
+}
+
+EbpfObject *qmp_request_ebpf(EbpfProgramID id, Error **errp)
+{
+EbpfObject *ret = NULL;
+size_t size = 0;
+const void *data = ebpf_find_binary_by_id(id, , errp);
+if (!data) {
+return NULL;
+}
+
+ret = g_new0(EbpfObject, 1);
+ret->object = g_base64_encode(data, size);
+
+return ret;
+}
diff --git a/ebpf/ebpf.h b/ebpf/ebpf.h
new file mode 100644
index 00..378d4e9c70
--- /dev/null
+++ b/ebpf/ebpf.h
@@ -0,0 +1,29 @@
+/*
+ * QEMU eBPF binary declaration routine.
+ *
+ * Developed by Daynix Computing LTD (http://www.daynix.com)
+ *
+ * Authors:
+ *  Andrew Melnychenko 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef EBPF_H
+#define EBPF_H
+
+
+void ebpf_register_binary_data(int id, const void *data,
+   size_t datalen);
+const void *ebpf_find_binary_by_id(int id, size_t *sz,
+   struct Error **errp);
+
+#define ebpf_binary_init(id, fn)   \
+static void __attribute__((constructor)) ebpf_binary_init_ ## fn(void) \
+{  \
+size_t datalen = 0;\
+const void *data = fn();   \
+ebpf_register_binary_data(id, data, datalen);  \
+}
+
+#endif /* EBPF_H */
diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index 150aa40813..c8a68da2c5 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -13,6 +13,8 @@
 
 #include

[PATCH v9 2/5] ebpf: Added eBPF initialization by fds.

2024-02-05 Thread Andrew Melnychenko

It allows using file descriptors of eBPF provided
outside of QEMU.
QEMU may be run without capabilities for eBPF and run
RSS program provided by management tool(g.e. libvirt).

Signed-off-by: Andrew Melnychenko 
---
 ebpf/ebpf_rss-stub.c |  6 ++
 ebpf/ebpf_rss.c  | 27 +++
 ebpf/ebpf_rss.h  |  5 +
 3 files changed, 38 insertions(+)

diff --git a/ebpf/ebpf_rss-stub.c b/ebpf/ebpf_rss-stub.c
index e71e229190..8d7fae2ad9 100644
--- a/ebpf/ebpf_rss-stub.c
+++ b/ebpf/ebpf_rss-stub.c
@@ -28,6 +28,12 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx)
 return false;
 }
 
+bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd,
+   int config_fd, int toeplitz_fd, int table_fd)
+{
+return false;
+}
+
 bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config,
   uint16_t *indirections_table, uint8_t *toeplitz_key)
 {
diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index f774d9636b..150aa40813 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -146,6 +146,33 @@ error:
 return false;
 }
 
+bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd,
+   int config_fd, int toeplitz_fd, int table_fd)
+{
+if (ebpf_rss_is_loaded(ctx)) {
+return false;
+}
+
+if (program_fd < 0 || config_fd < 0 || toeplitz_fd < 0 || table_fd < 0) {
+return false;
+}
+
+ctx->program_fd = program_fd;
+ctx->map_configuration = config_fd;
+ctx->map_toeplitz_key = toeplitz_fd;
+ctx->map_indirections_table = table_fd;
+
+if (!ebpf_rss_mmap(ctx)) {
+ctx->program_fd = -1;
+ctx->map_configuration = -1;
+ctx->map_toeplitz_key = -1;
+ctx->map_indirections_table = -1;
+return false;
+}
+
+return true;
+}
+
 static bool ebpf_rss_set_config(struct EBPFRSSContext *ctx,
 struct EBPFRSSConfig *config)
 {
diff --git a/ebpf/ebpf_rss.h b/ebpf/ebpf_rss.h
index ab08a7266d..239242b0d2 100644
--- a/ebpf/ebpf_rss.h
+++ b/ebpf/ebpf_rss.h
@@ -14,6 +14,8 @@
 #ifndef QEMU_EBPF_RSS_H
 #define QEMU_EBPF_RSS_H
 
+#define EBPF_RSS_MAX_FDS 4
+
 struct EBPFRSSContext {
 void *obj;
 int program_fd;
@@ -41,6 +43,9 @@ bool ebpf_rss_is_loaded(struct EBPFRSSContext *ctx);
 
 bool ebpf_rss_load(struct EBPFRSSContext *ctx);
 
+bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd,
+   int config_fd, int toeplitz_fd, int table_fd);
+
 bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config,
   uint16_t *indirections_table, uint8_t *toeplitz_key);
 
-- 
2.43.0

[PATCH v9 0/5] eBPF RSS through QMP support.

2024-02-05 Thread Andrew Melnychenko

This series of patches provides the ability to retrieve eBPF program
through qmp, so management application may load bpf blob with proper 
capabilities.
Now, virtio-net devices can accept eBPF programs and maps through properties
as external file descriptors. Access to the eBPF map is direct through mmap()
call, so it should not require additional capabilities to bpf* calls.
eBPF file descriptors can be passed to QEMU from parent process or by unix
socket with sendfd() qmp command.

Changes since v8:
 * rebased and refactored QMP interface
 * license SPDX id only for new files

Changes since v7:
 * rebased and refactored
 * used SPDX license identifier
 * used DEFINE_PROP_ARRAY() for virtio-net "ebpf-rss-fds" property

Changes since v6:
 * added comments to ebpf.json
 * added libbpf version requirements to meson script with BPF_F_MMAPABLE check

Changes since v5:
 * refactored ebpf.json

Changes since v4:
 * refactored commit hunks
 * added explicit BPF_F_MMAPABLE declaration

Changes since v3:
 * fixed issue with the build if bpf disabled
 * rebased to the last master
 * refactored according to review

Changes since v2:
 * moved/refactored QMP command
 * refactored virtio-net

Changes since v1:
 * refactored virtio-net
 * moved hunks for ebpf mmap()
 * added qmp enum for eBPF id.

Andrew Melnychenko (5):
  ebpf: Added eBPF map update through mmap.
  ebpf: Added eBPF initialization by fds.
  virtio-net: Added property to load eBPF RSS with fds.
  qmp: Added new command to retrieve eBPF blob.
  ebpf: Updated eBPF program and skeleton.

 ebpf/ebpf.c|   69 ++
 ebpf/ebpf.h|   29 +
 ebpf/ebpf_rss-stub.c   |6 +
 ebpf/ebpf_rss.c|  150 +++-
 ebpf/ebpf_rss.h|   10 +
 ebpf/meson.build   |2 +-
 ebpf/rss.bpf.skeleton.h| 1343 
 hw/net/virtio-net.c|   54 +-
 include/hw/virtio/virtio-net.h |2 +
 meson.build|   10 +-
 qapi/ebpf.json |   66 ++
 qapi/meson.build   |1 +
 qapi/qapi-schema.json  |1 +
 tools/ebpf/rss.bpf.c   |7 +-
 14 files changed, 1047 insertions(+), 703 deletions(-)
 create mode 100644 ebpf/ebpf.c
 create mode 100644 ebpf/ebpf.h
 create mode 100644 qapi/ebpf.json

-- 
2.43.0

1 2 3 >

1 - 100 of 203 matches

Mail list logo