Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump
> So how the problem manifests itself? What the actual error message is? > You mentioned some segfault iirc, care to share some details? I'm doing a rebuild, and will attach the error message and coredump file here in one or two days. -- Key fingerprint: 419F 72F3 F3A9 36EE 1B72 B00B C1C3 4BD4 FDA3 362F
Re: [PATCH 1/5] virtio-blk: enforce iothread-vq-mapping validation
On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi wrote: Hanna Czenczek noticed that the safety of `vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is not obvious. The code is structured in validate() + apply() steps so input validation is there, but it happens way earlier and there is nothing that guarantees apply() can only be called with validated inputs. This patch moves the validate() call inside the apply() function so validation is guaranteed. I also added the bounds checking assertion that Hanna suggested. Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 192 +++--- 1 file changed, 107 insertions(+), 85 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 227d83569f..e8b37fd5f4 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1485,6 +1485,72 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f, return 0; } +static void virtio_resize_cb(void *opaque) +{ +VirtIODevice *vdev = opaque; + +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); +virtio_notify_config(vdev); +} + +static void virtio_blk_resize(void *opaque) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(opaque); + +/* + * virtio_notify_config() needs to acquire the BQL, + * so it can't be called from an iothread. Instead, schedule + * it to be run in the main context BH. + */ +aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev); +} + +static void virtio_blk_ioeventfd_detach(VirtIOBlock *s) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(s); + +for (uint16_t i = 0; i < s->conf.num_queues; i++) { +VirtQueue *vq = virtio_get_queue(vdev, i); +virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]); +} +} + +static void virtio_blk_ioeventfd_attach(VirtIOBlock *s) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(s); + +for (uint16_t i = 0; i < s->conf.num_queues; i++) { +VirtQueue *vq = virtio_get_queue(vdev, i); +virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]); +} +} + +/* Suspend virtqueue ioeventfd processing during drain */ +static void virtio_blk_drained_begin(void *opaque) +{ +VirtIOBlock *s = opaque; + +if (s->ioeventfd_started) { +virtio_blk_ioeventfd_detach(s); +} +} + +/* Resume virtqueue ioeventfd processing after drain */ +static void virtio_blk_drained_end(void *opaque) +{ +VirtIOBlock *s = opaque; + +if (s->ioeventfd_started) { +virtio_blk_ioeventfd_attach(s); +} +} + +static const BlockDevOps virtio_block_ops = { +.resize_cb = virtio_blk_resize, +.drained_begin = virtio_blk_drained_begin, +.drained_end = virtio_blk_drained_end, +}; + static bool validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, uint16_t num_queues, Error **errp) @@ -1547,81 +1613,33 @@ validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, return true; } -static void virtio_resize_cb(void *opaque) -{ -VirtIODevice *vdev = opaque; - -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); -virtio_notify_config(vdev); -} - -static void virtio_blk_resize(void *opaque) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(opaque); - -/* - * virtio_notify_config() needs to acquire the BQL, - * so it can't be called from an iothread. Instead, schedule - * it to be run in the main context BH. - */ -aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev); -} - -static void virtio_blk_ioeventfd_detach(VirtIOBlock *s) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(s); - -for (uint16_t i = 0; i < s->conf.num_queues; i++) { -VirtQueue *vq = virtio_get_queue(vdev, i); -virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]); -} -} - -static void virtio_blk_ioeventfd_attach(VirtIOBlock *s) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(s); - -for (uint16_t i = 0; i < s->conf.num_queues; i++) { -VirtQueue *vq = virtio_get_queue(vdev, i); -virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]); -} -} - -/* Suspend virtqueue ioeventfd processing during drain */ -static void virtio_blk_drained_begin(void *opaque) -{ -VirtIOBlock *s = opaque; - -if (s->ioeventfd_started) { -virtio_blk_ioeventfd_detach(s); -} -} - -/* Resume virtqueue ioeventfd processing after drain */ -static void virtio_blk_drained_end(void *opaque) -{ -VirtIOBlock *s = opaque; - -if (s->ioeventfd_started) { -virtio_blk_ioeventfd_attach(s); -} -} - -static const BlockDevOps virtio_block_ops = { -.resize_cb = virtio_blk_resize, -.drained_begin = virtio_blk_drained_begin, -.drained_end = virtio_blk_drained_end, -}; - -/* Generate vq:AioContext mappings from a validated iothread-vq-mapping list */ -static void -apply_vq_mapping(IOThreadVirtQueueMappingList *iothread_vq_mapping_list, -
Re: [PATCH 5/5] monitor: use aio_co_reschedule_self()
On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi wrote: The aio_co_reschedule_self() API is designed to avoid the race condition between scheduling the coroutine in another AioContext and yielding. The QMP dispatch code uses the open-coded version that appears susceptible to the race condition at first glance: aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self()); qemu_coroutine_yield(); The code is actually safe because the iohandler and qemu_aio_context AioContext run under the Big QEMU Lock. Nevertheless, set a good example and use aio_co_reschedule_self() so it's obvious that there is no race. Suggested-by: Hanna Reitz Signed-off-by: Stefan Hajnoczi --- qapi/qmp-dispatch.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c index 176b549473..f3488afeef 100644 --- a/qapi/qmp-dispatch.c +++ b/qapi/qmp-dispatch.c @@ -212,8 +212,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ * executing the command handler so that it can make progress if it * involves an AIO_WAIT_WHILE(). */ -aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self()); -qemu_coroutine_yield(); +aio_co_reschedule_self(qemu_get_aio_context()); } monitor_set_cur(qemu_coroutine_self(), cur_mon); @@ -227,9 +226,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ * Move back to iohandler_ctx so that nested event loops for * qemu_aio_context don't start new monitor commands. */ -aio_co_schedule(iohandler_get_aio_context(), -qemu_coroutine_self()); -qemu_coroutine_yield(); +aio_co_reschedule_self(iohandler_get_aio_context()); } } else { /* -- 2.43.0 Reviewed-by: Manos Pitsidianakis
Re: [PATCH 2/5] virtio-blk: clarify that there is at least 1 virtqueue
On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi wrote: It is not possible to instantiate a virtio-blk device with 0 virtqueues. The following check is located in ->realize(): if (!conf->num_queues) { error_setg(errp, "num-queues property must be larger than 0"); return; } Later on we access s->vq_aio_context[0] under the assumption that there is as least one virtqueue. Hanna Czenczek noted that it would help to show that the array index is already valid. Add an assertion to document that s->vq_aio_context[0] is always safe...and catch future code changes that break this assumption. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index e8b37fd5f4..a0735a9bca 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1825,6 +1825,7 @@ static int virtio_blk_start_ioeventfd(VirtIODevice *vdev) * Try to change the AioContext so that block jobs and other operations can * co-locate their activity in the same AioContext. If it fails, nevermind. */ +assert(nvqs > 0); /* enforced during ->realize() */ r = blk_set_aio_context(s->conf.conf.blk, s->vq_aio_context[0], _err); if (r < 0) { -- 2.43.0 Reviewed-by: Manos Pitsidianakis
[PULL v3 3/3] oslib-posix: initialize backend memory objects in parallel
From: Mark Kanda QEMU initializes preallocated backend memory as the objects are parsed from the command line. This is not optimal in some cases (e.g. memory spanning multiple NUMA nodes) because the memory objects are initialized in series. Allow the initialization to occur in parallel (asynchronously). In order to ensure optimal thread placement, asynchronous initialization requires prealloc context threads to be in use. Signed-off-by: Mark Kanda Message-ID: <20240131165327.3154970-2-mark.ka...@oracle.com> Tested-by: Mario Casquero Signed-off-by: David Hildenbrand --- backends/hostmem.c | 7 ++- hw/virtio/virtio-mem.c | 4 +- include/hw/qdev-core.h | 5 ++ include/qemu/osdep.h | 18 +- system/vl.c| 9 +++ util/oslib-posix.c | 131 +++-- util/oslib-win32.c | 8 ++- 7 files changed, 145 insertions(+), 37 deletions(-) diff --git a/backends/hostmem.c b/backends/hostmem.c index 987f6f591e..81a72ce40b 100644 --- a/backends/hostmem.c +++ b/backends/hostmem.c @@ -20,6 +20,7 @@ #include "qom/object_interfaces.h" #include "qemu/mmap-alloc.h" #include "qemu/madvise.h" +#include "hw/qdev-core.h" #ifdef CONFIG_NUMA #include @@ -237,7 +238,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value, uint64_t sz = memory_region_size(>mr); if (!qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads, - backend->prealloc_context, errp)) { + backend->prealloc_context, false, errp)) { return; } backend->prealloc = true; @@ -323,6 +324,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp) HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc); void *ptr; uint64_t sz; +bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED); if (!bc->alloc) { return; @@ -402,7 +404,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp) if (backend->prealloc && !qemu_prealloc_mem(memory_region_get_fd(>mr), ptr, sz, backend->prealloc_threads, -backend->prealloc_context, errp)) { +backend->prealloc_context, +async, errp)) { return; } } diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 99ab989852..ffd119ebac 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -605,7 +605,7 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa, int fd = memory_region_get_fd(>memdev->mr); Error *local_err = NULL; -if (!qemu_prealloc_mem(fd, area, size, 1, NULL, _err)) { +if (!qemu_prealloc_mem(fd, area, size, 1, NULL, false, _err)) { static bool warned; /* @@ -1248,7 +1248,7 @@ static int virtio_mem_prealloc_range_cb(VirtIOMEM *vmem, void *arg, int fd = memory_region_get_fd(>memdev->mr); Error *local_err = NULL; -if (!qemu_prealloc_mem(fd, area, size, 1, NULL, _err)) { +if (!qemu_prealloc_mem(fd, area, size, 1, NULL, false, _err)) { error_report_err(local_err); return -ENOMEM; } diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index d47536eadb..9228e96c87 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -1083,6 +1083,11 @@ typedef enum MachineInitPhase { */ PHASE_ACCEL_CREATED, +/* + * Late backend objects have been created and initialized. + */ +PHASE_LATE_BACKENDS_CREATED, + /* * machine_class->init has been called, thus creating any embedded * devices and validating machine properties. Devices created at diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index c9692cc314..7d359dabc4 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -680,6 +680,8 @@ typedef struct ThreadContext ThreadContext; * @area: start address of the are to preallocate * @sz: the size of the area to preallocate * @max_threads: maximum number of threads to use + * @tc: prealloc context threads pointer, NULL if not in use + * @async: request asynchronous preallocation, requires @tc * @errp: returns an error if this function fails * * Preallocate memory (populate/prefault page tables writable) for the virtual @@ -687,10 +689,24 @@ typedef struct ThreadContext ThreadContext; * each page in the area was faulted in writable at least once, for example, * after allocating file blocks for mapped files. * + * When setting @async, allocation might be performed asynchronously. + * qemu_finish_async_prealloc_mem() must be called to finish any asynchronous + * preallocation. + * * Return: true on success, else false setting @errp with error. */ bool qemu_prealloc_mem(int
[PULL v3 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment
Let's implement the get_min_alignment() callback for memory devices, and copy for the device memory region the alignment of the host memory region. This mimics what virtio-mem does, and allows for re-introducing proper alignment checks for the memory region size (where we don't care about additional device requirements) in memory device core. Message-ID: <20240117135554.787344-2-da...@redhat.com> Reviewed-by: Maciej S. Szmigiero Signed-off-by: David Hildenbrand --- hw/hyperv/hv-balloon.c | 37 + 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c index 0238365712..ade283335a 100644 --- a/hw/hyperv/hv-balloon.c +++ b/hw/hyperv/hv-balloon.c @@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon) balloon->mr = g_new0(MemoryRegion, 1); memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON, memory_region_size(hostmem_mr)); - -/* - * The VM can indicate an alignment up to 32 GiB. Memory device core can - * usually only handle/guarantee 1 GiB alignment. The user will have to - * specify a larger maxmem eventually. - * - * The memory device core will warn the user in case maxmem might have to be - * increased and will fail plugging the device if there is not sufficient - * space after alignment. - * - * TODO: we could do the alignment ourselves in a slightly bigger region. - * But this feels better, although the warning might be annoying. Maybe - * we can optimize that in the future (e.g., with such a device on the - * cmdline place/size the device memory region differently. - */ -balloon->mr->align = MAX(32 * GiB, memory_region_get_alignment(hostmem_mr)); +balloon->mr->align = memory_region_get_alignment(hostmem_mr); } static void hv_balloon_free_mr(HvBalloon *balloon) @@ -1654,6 +1639,25 @@ static MemoryRegion *hv_balloon_md_get_memory_region(MemoryDeviceState *md, return balloon->mr; } +static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md) +{ +/* + * The VM can indicate an alignment up to 32 GiB. Memory device core can + * usually only handle/guarantee 1 GiB alignment. The user will have to + * specify a larger maxmem eventually. + * + * The memory device core will warn the user in case maxmem might have to be + * increased and will fail plugging the device if there is not sufficient + * space after alignment. + * + * TODO: we could do the alignment ourselves in a slightly bigger region. + * But this feels better, although the warning might be annoying. Maybe + * we can optimize that in the future (e.g., with such a device on the + * cmdline place/size the device memory region differently. + */ +return 32 * GiB; +} + static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md, MemoryDeviceInfo *info) { @@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, void *data) mdc->get_memory_region = hv_balloon_md_get_memory_region; mdc->decide_memslots = hv_balloon_decide_memslots; mdc->get_memslots = hv_balloon_get_memslots; +mdc->get_min_alignment = hv_balloon_md_get_min_alignment; mdc->fill_device_info = hv_balloon_md_fill_device_info; } -- 2.43.0
Re: [PATCH 3/5] virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb()
On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi wrote: Hanna Czenczek noted that the array index in virtio_blk_dma_restart_cb() is not bounds-checked: g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues); ... while (rq) { VirtIOBlockReq *next = rq->next; uint16_t idx = virtio_get_queue_index(rq->vq); rq->next = vq_rq[idx]; ^^ The code is correct because both rq->vq and vq_rq[] depend on num_queues, but this is indirect and not 100% obvious. Add an assertion. This sentence could be useful as an inline comment too. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index a0735a9bca..f3193f4b75 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1209,6 +1209,7 @@ static void virtio_blk_dma_restart_cb(void *opaque, bool running, VirtIOBlockReq *next = rq->next; uint16_t idx = virtio_get_queue_index(rq->vq); +assert(idx < num_queues); rq->next = vq_rq[idx]; vq_rq[idx] = rq; rq = next; -- 2.43.0 Reviewed-by: Manos Pitsidianakis
[PULL v3 2/3] memory-device: reintroduce memory region size check
We used to check that the memory region size is multiples of the overall requested address alignment for the device memory address. We removed that check, because there are cases (i.e., hv-balloon) where devices unconditionally request an address alignment that has a very large alignment (i.e., 32 GiB), but the actual memory device size might not be multiples of that alignment. However, this change: (a) allows for some practically impossible DIMM sizes, like "1GB+1 byte". (b) allows for DIMMs that partially cover hugetlb pages, previously reported in [1]. Both scenarios don't make any sense: we might even waste memory. So let's reintroduce that check, but only check that the memory region size is multiples of the memory region alignment (i.e., page size, huge page size), but not any additional memory device requirements communicated using md->get_min_alignment(). The following examples now fail again as expected: (a) 1M with 2M THP qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-ram,id=mem1,size=1M \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x20 (b) 1G+1byte qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-ram,id=mem1,size=1073741825B \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x20 (c) Unliagned hugetlb size (2M) qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-file,id=mem1,mem-path=/dev/hugepages/tmp,size=511M \ -device pc-dimm,id=dimm1,memdev=mem1 backend memory size must be multiple of 0x20 (d) Unliagned hugetlb size (1G) qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-file,id=mem1,mem-path=/dev/hugepages1G/tmp,size=2047M \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x4000 Note that this fix depends on a hv-balloon change to communicate its additional alignment requirements using get_min_alignment() instead of through the memory region. [1] https://lkml.kernel.org/r/f77d641d500324525ac036fe1827b3070de75fc1.1701088320.git.mpriv...@redhat.com Message-ID: <20240117135554.787344-3-da...@redhat.com> Reported-by: Zhenyu Zhang Reported-by: Michal Privoznik Fixes: eb1b7c4bd413 ("memory-device: Drop size alignment check") Tested-by: Zhenyu Zhang Tested-by: Mario Casquero Reviewed-by: Maciej S. Szmigiero Signed-off-by: David Hildenbrand --- hw/mem/memory-device.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c index a1b1af26bc..e098585cda 100644 --- a/hw/mem/memory-device.c +++ b/hw/mem/memory-device.c @@ -374,6 +374,20 @@ void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms, goto out; } +/* + * We always want the memory region size to be multiples of the memory + * region alignment: for example, DIMMs with 1G+1byte size don't make + * any sense. Note that we don't check that the size is multiples + * of any additional alignment requirements the memory device might + * have when it comes to the address in physical address space. + */ +if (!QEMU_IS_ALIGNED(memory_region_size(mr), + memory_region_get_alignment(mr))) { +error_setg(errp, "backend memory size must be multiple of 0x%" + PRIx64, memory_region_get_alignment(mr)); +return; +} + if (legacy_align) { align = *legacy_align; } else { -- 2.43.0
[PULL v3 0/3] Host Memory Backends and Memory devices queue 2024-02-06
The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v3 for you to fetch changes up to 04accf43df83aa10f06f7dbda3ecf0db97f0c5a6: oslib-posix: initialize backend memory objects in parallel (2024-02-06 08:15:22 +0100) Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Reintroduce memory region size checks for memory devices; the removal lead to some undesired side effects - Preallocate memory of memory backends in selected configurations asynchronously (so we preallocate concurrently), to speed up QEMU startup time. David Hildenbrand (2): hv-balloon: use get_min_alignment() to express 32 GiB alignment memory-device: reintroduce memory region size check Mark Kanda (1): oslib-posix: initialize backend memory objects in parallel backends/hostmem.c | 7 ++- hw/hyperv/hv-balloon.c | 37 +++- hw/mem/memory-device.c | 14 + hw/virtio/virtio-mem.c | 4 +- include/hw/qdev-core.h | 5 ++ include/qemu/osdep.h | 18 +- system/vl.c| 9 +++ util/oslib-posix.c | 131 +++-- util/oslib-win32.c | 8 ++- 9 files changed, 180 insertions(+), 53 deletions(-) -- 2.43.0
Re: [PULL v2 0/3] Host Memory Backends and Memory devices queue 2024-02-06
On 06.02.24 08:14, David Hildenbrand wrote: The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v2 for you to fetch changes up to 8eb38ab662e74d618d473a9a52d71c644926c3c0: oslib-posix: initialize backend memory objects in parallel (2024-02-06 08:09:55 +0100) Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Reintroduce memory region size checks for memory devices; the removal lead to some undesired side effects - Preallocate memory of memory backends in selected configurations asynchronously (so we preallocate concurrently), to speed up QEMU startup time. ... and yet another instance of the same mail address is wrong. Gah. Maybe v3 will do ... -- Cheers, David / dhildenb
Re: [PATCH 4/5] virtio-blk: declare VirtIOBlock::rq with a type
On Mon, 05 Feb 2024 19:26, Stefan Hajnoczi wrote: The VirtIOBlock::rq field has had the type void * since its introduction in commit 869a5c6df19a ("Stop VM on error in virtio-blk. (Gleb Natapov)"). Perhaps this was done to avoid the forward declaration of VirtIOBlockReq. Hanna Czenczek pointed out the missing type. Specify the actual type because there is no need to use void * here. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-blk.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index 833a9a344f..5c14110c4b 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -55,7 +55,7 @@ struct VirtIOBlock { VirtIODevice parent_obj; BlockBackend *blk; QemuMutex rq_lock; -void *rq; /* protected by rq_lock */ +struct VirtIOBlockReq *rq; /* protected by rq_lock */ VirtIOBlkConf conf; unsigned short sector_mask; bool original_wce; -- 2.43.0 Reviewed-by: Manos Pitsidianakis
[PULL v2 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment
Let's implement the get_min_alignment() callback for memory devices, and copy for the device memory region the alignment of the host memory region. This mimics what virtio-mem does, and allows for re-introducing proper alignment checks for the memory region size (where we don't care about additional device requirements) in memory device core. Message-ID: <20240117135554.787344-2-da...@redhat.com> Reviewed-by: Maciej S. Szmigiero Signed-off-by: David Hildenbrand --- hw/hyperv/hv-balloon.c | 37 + 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c index 0238365712..ade283335a 100644 --- a/hw/hyperv/hv-balloon.c +++ b/hw/hyperv/hv-balloon.c @@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon) balloon->mr = g_new0(MemoryRegion, 1); memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON, memory_region_size(hostmem_mr)); - -/* - * The VM can indicate an alignment up to 32 GiB. Memory device core can - * usually only handle/guarantee 1 GiB alignment. The user will have to - * specify a larger maxmem eventually. - * - * The memory device core will warn the user in case maxmem might have to be - * increased and will fail plugging the device if there is not sufficient - * space after alignment. - * - * TODO: we could do the alignment ourselves in a slightly bigger region. - * But this feels better, although the warning might be annoying. Maybe - * we can optimize that in the future (e.g., with such a device on the - * cmdline place/size the device memory region differently. - */ -balloon->mr->align = MAX(32 * GiB, memory_region_get_alignment(hostmem_mr)); +balloon->mr->align = memory_region_get_alignment(hostmem_mr); } static void hv_balloon_free_mr(HvBalloon *balloon) @@ -1654,6 +1639,25 @@ static MemoryRegion *hv_balloon_md_get_memory_region(MemoryDeviceState *md, return balloon->mr; } +static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md) +{ +/* + * The VM can indicate an alignment up to 32 GiB. Memory device core can + * usually only handle/guarantee 1 GiB alignment. The user will have to + * specify a larger maxmem eventually. + * + * The memory device core will warn the user in case maxmem might have to be + * increased and will fail plugging the device if there is not sufficient + * space after alignment. + * + * TODO: we could do the alignment ourselves in a slightly bigger region. + * But this feels better, although the warning might be annoying. Maybe + * we can optimize that in the future (e.g., with such a device on the + * cmdline place/size the device memory region differently. + */ +return 32 * GiB; +} + static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md, MemoryDeviceInfo *info) { @@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, void *data) mdc->get_memory_region = hv_balloon_md_get_memory_region; mdc->decide_memslots = hv_balloon_decide_memslots; mdc->get_memslots = hv_balloon_get_memslots; +mdc->get_min_alignment = hv_balloon_md_get_min_alignment; mdc->fill_device_info = hv_balloon_md_fill_device_info; } -- 2.43.0
[PULL v2 0/3] Host Memory Backends and Memory devices queue 2024-02-06
The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06-v2 for you to fetch changes up to 8eb38ab662e74d618d473a9a52d71c644926c3c0: oslib-posix: initialize backend memory objects in parallel (2024-02-06 08:09:55 +0100) Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Reintroduce memory region size checks for memory devices; the removal lead to some undesired side effects - Preallocate memory of memory backends in selected configurations asynchronously (so we preallocate concurrently), to speed up QEMU startup time. David Hildenbrand (2): hv-balloon: use get_min_alignment() to express 32 GiB alignment memory-device: reintroduce memory region size check Mark Kanda (1): oslib-posix: initialize backend memory objects in parallel backends/hostmem.c | 7 ++- hw/hyperv/hv-balloon.c | 37 +++- hw/mem/memory-device.c | 14 + hw/virtio/virtio-mem.c | 4 +- include/hw/qdev-core.h | 5 ++ include/qemu/osdep.h | 18 +- system/vl.c| 9 +++ util/oslib-posix.c | 131 +++-- util/oslib-win32.c | 8 ++- 9 files changed, 180 insertions(+), 53 deletions(-) -- 2.43.0
Re: [PULL 0/3] Host Memory Backends and Memory devices queue 2024-02-6
On 06.02.24 08:02, David Hildenbrand wrote: The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06 for you to fetch changes up to a32c31979a852a07d590effa1586f6ab4fa4d784: oslib-posix: initialize backend memory objects in parallel (2024-02-04 17:51:13 +0100) Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Reintroduce memory region size checks for memory devices; the removal lead to some undesired side effects - Preallocate memory of memory backends in selected configurations asynchronously (so we preallocate concurrently), to speed up QEMU startup time. A mail address in the second patch is wrong. Will send a v2. -- Cheers, David / dhildenb
Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump
06.02.2024 09:35, Zhang Wen: ... I'm so sorry for the misspelling in my initial post and the confusion caused by that. That's no problem, things happen. I'm building my own system following the Linux From Scratch package, which is at https://www.linuxfromscratch.org. I accidently built libxkbcommon package but not xkeyboard-config package, and then while building qemu I saw this problem. it's a little difficult to find the root cause from this error message, so I sent this patch to seek for help from upstream. Okay, so it is mostly a self-built/self-installed linux. That works too. So how the problem manifests itself? What the actual error message is? You mentioned some segfault iirc, care to share some details? It's not a problem at all to pick this change up, but I'd love to know more details about this first :) Thanks, /mjt
[PULL 0/3] Host Memory Backends and Memory devices queue 2024-02-6
The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://github.com/davidhildenbrand/qemu.git tags/mem-2024-02-06 for you to fetch changes up to a32c31979a852a07d590effa1586f6ab4fa4d784: oslib-posix: initialize backend memory objects in parallel (2024-02-04 17:51:13 +0100) Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Reintroduce memory region size checks for memory devices; the removal lead to some undesired side effects - Preallocate memory of memory backends in selected configurations asynchronously (so we preallocate concurrently), to speed up QEMU startup time. David Hildenbrand (2): hv-balloon: use get_min_alignment() to express 32 GiB alignment memory-device: reintroduce memory region size check Mark Kanda (1): oslib-posix: initialize backend memory objects in parallel backends/hostmem.c | 7 ++- hw/hyperv/hv-balloon.c | 37 +++- hw/mem/memory-device.c | 14 + hw/virtio/virtio-mem.c | 4 +- include/hw/qdev-core.h | 5 ++ include/qemu/osdep.h | 18 +- system/vl.c| 9 +++ util/oslib-posix.c | 131 +++-- util/oslib-win32.c | 8 ++- 9 files changed, 180 insertions(+), 53 deletions(-) -- 2.43.0
[PULL 1/3] hv-balloon: use get_min_alignment() to express 32 GiB alignment
Let's implement the get_min_alignment() callback for memory devices, and copy for the device memory region the alignment of the host memory region. This mimics what virtio-mem does, and allows for re-introducing proper alignment checks for the memory region size (where we don't care about additional device requirements) in memory device core. Message-ID: <20240117135554.787344-2-da...@redhat.com> Reviewed-by: Maciej S. Szmigiero Signed-off-by: David Hildenbrand --- hw/hyperv/hv-balloon.c | 37 + 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/hw/hyperv/hv-balloon.c b/hw/hyperv/hv-balloon.c index 0238365712..ade283335a 100644 --- a/hw/hyperv/hv-balloon.c +++ b/hw/hyperv/hv-balloon.c @@ -1477,22 +1477,7 @@ static void hv_balloon_ensure_mr(HvBalloon *balloon) balloon->mr = g_new0(MemoryRegion, 1); memory_region_init(balloon->mr, OBJECT(balloon), TYPE_HV_BALLOON, memory_region_size(hostmem_mr)); - -/* - * The VM can indicate an alignment up to 32 GiB. Memory device core can - * usually only handle/guarantee 1 GiB alignment. The user will have to - * specify a larger maxmem eventually. - * - * The memory device core will warn the user in case maxmem might have to be - * increased and will fail plugging the device if there is not sufficient - * space after alignment. - * - * TODO: we could do the alignment ourselves in a slightly bigger region. - * But this feels better, although the warning might be annoying. Maybe - * we can optimize that in the future (e.g., with such a device on the - * cmdline place/size the device memory region differently. - */ -balloon->mr->align = MAX(32 * GiB, memory_region_get_alignment(hostmem_mr)); +balloon->mr->align = memory_region_get_alignment(hostmem_mr); } static void hv_balloon_free_mr(HvBalloon *balloon) @@ -1654,6 +1639,25 @@ static MemoryRegion *hv_balloon_md_get_memory_region(MemoryDeviceState *md, return balloon->mr; } +static uint64_t hv_balloon_md_get_min_alignment(const MemoryDeviceState *md) +{ +/* + * The VM can indicate an alignment up to 32 GiB. Memory device core can + * usually only handle/guarantee 1 GiB alignment. The user will have to + * specify a larger maxmem eventually. + * + * The memory device core will warn the user in case maxmem might have to be + * increased and will fail plugging the device if there is not sufficient + * space after alignment. + * + * TODO: we could do the alignment ourselves in a slightly bigger region. + * But this feels better, although the warning might be annoying. Maybe + * we can optimize that in the future (e.g., with such a device on the + * cmdline place/size the device memory region differently. + */ +return 32 * GiB; +} + static void hv_balloon_md_fill_device_info(const MemoryDeviceState *md, MemoryDeviceInfo *info) { @@ -1766,5 +1770,6 @@ static void hv_balloon_class_init(ObjectClass *klass, void *data) mdc->get_memory_region = hv_balloon_md_get_memory_region; mdc->decide_memslots = hv_balloon_decide_memslots; mdc->get_memslots = hv_balloon_get_memslots; +mdc->get_min_alignment = hv_balloon_md_get_min_alignment; mdc->fill_device_info = hv_balloon_md_fill_device_info; } -- 2.43.0
Re: building qemu on a system with libxkbcommon installed but not xkeyboard-config produces an core dump
Peter Maydell 于2024年2月1日周四 20:57写道: > > On Thu, 1 Feb 2024 at 12:50, Michael Tokarev wrote: > > > > 01.02.2024 15:11, Michael Tokarev wrote: > > > 31.01.2024 11:13, Zhang Wen: > > >> With this patch, qemu requires keyboard-config when libxkbcommon is > > >> found on the system. So if the qemu is configured when libxkbcommon is > > >> installed > > >> but not keyboard-config, the configure stage will produce an error > > >> message, thus avoid coredump in the build stage. > > > > > > I'm not sure what you're talking about. What *is* keyboard-config anyway? > > > > > > On a debian system there's no such thing. There's keyboard-configuration > > > package but it has nothing to do with that. It looks like if we apply > > > such patch, it will be impossible to build qemu on debian. > > > > Aha, I found it. On debian it is /usr/share/pkgconfig/keyboard-config.pc, > > which is a part of xkb-data package. And libxkbcommon Depends on xkb-data. > > It looks like the distribution here is wrong, there should be no > > libxkbcommon > > without xkb-data which includes keyboard-config. > > Are we talking about "keyboard-config" or "xkeyboard-config" here? > The commit message says "keyboard-config" but the patch itself > says "xkeyboard-config". > > Zhang: it would be helpful if you could tell us which distro > you are building on where you see this problem. > > thanks > -- PMM I'm so sorry for the misspelling in my initial post and the confusion caused by that. I'm building my own system following the Linux From Scratch package, which is at https://www.linuxfromscratch.org. I accidently built libxkbcommon package but not xkeyboard-config package, and then while building qemu I saw this problem. it's a little difficult to find the root cause from this error message, so I sent this patch to seek for help from upstream. -- Key fingerprint: 419F 72F3 F3A9 36EE 1B72 B00B C1C3 4BD4 FDA3 362F
[PATCH 1/3] tests/migration-test: Stick with gicv3 in aarch64 test
From: Peter Xu Recently we introduced cross-binary migration test. It's always wanted that migration-test uses stable guest ABI for both QEMU binaries in this case, so that both QEMU binaries will be compatible on the migration stream with the cmdline specified. Switch to a static gic version "3" rather than using version "max", so that GIC should be stable now across any future QEMU binaries for migration-test. Here the version can actually be anything as long as the ABI is stable. We choose "3" because it's the majority of what we already use in QEMU while still new enough: "git grep gic-version=3" shows 6 hit, while version 4 has no direct user yet besides "max". Note that even with this change, aarch64 won't be able to work yet with migration cross binary test, but then the only missing piece will be the stable CPU model. Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 7675519cfa..8a5bb1752e 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -819,7 +819,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, } else if (strcmp(arch, "aarch64") == 0) { memory_size = "150M"; machine_alias = "virt"; -machine_opts = "gic-version=max"; +machine_opts = "gic-version=3"; arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath); start_address = ARM_TEST_MEM_START; end_address = ARM_TEST_MEM_END; -- 2.43.0
[PATCH 2/3] ci: Remove tag dependency for build-previous-qemu
From: Peter Xu The new build-previous-qemu job relies on QEMU release tag being present, while that may not be always true for personal git repositories since by default tag is not pushed. The job can fail on those CI kicks, as reported by Peter Maydell. Fix it by fetching the tags remotely from the official repository, as suggested by Dan. [1] https://lore.kernel.org/r/zcc9sckj7vvqe...@redhat.com Reported-by: Peter Maydell Suggested-by: Daniel P. Berrangé Signed-off-by: Peter Xu --- .gitlab-ci.d/buildtest.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 79bbc8585b..df48c9d31d 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -189,6 +189,8 @@ build-previous-qemu: TARGETS: x86_64-softmmu aarch64-softmmu before_script: - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)" +- git remote add upstream https://gitlab.com/qemu-project/qemu +- git fetch upstream $QEMU_PRRV_VERSION - git checkout $QEMU_PREV_VERSION after_script: - mv build build-previous -- 2.43.0
[PATCH 3/3] ci: Update comment for migration-compat-aarch64
From: Peter Xu It turns out that we may not be able to enable this test even for the upcoming v9.0. Document what we're still missing. Signed-off-by: Peter Xu --- .gitlab-ci.d/buildtest.yml | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index df48c9d31d..f01d6d0cdf 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -219,9 +219,10 @@ build-previous-qemu: - QTEST_QEMU_BINARY_DST=./qemu-system-${TARGET} QTEST_QEMU_BINARY=../build/qemu-system-${TARGET} ./tests/qtest/migration-test -# This job is disabled until we release 9.0. The existing -# migration-test in 8.2 is broken on aarch64. The fix was already -# commited, but it will only take effect once 9.0 is out. +# This job needs to be disabled until we can have an aarch64 CPU model that +# will both (1) support both KVM and TCG, and (2) provide a stable ABI. +# Currently only "-cpu max" can provide (1), however it doesn't guarantee +# (2). Mark this test skipped until later. migration-compat-aarch64: extends: .migration-compat-common variables: -- 2.43.0
[PATCH 0/3] ci: Fixes on the recent cross-binary test case
From: Peter Xu Hi, This small patchset updates the recent cross-binary test for migration on a few things. Patch 1 modifies the aarch64 test GIC version to 3 rather than "max", paving way for enabling it, even if the CPU model is not yet ready. Patch 2 removes the tag dependency of the new build-previous-qemu job, so that in personal CI pipelines the job won't fail if the tag is missing, as reported by Peter Maydell, and solution suggested by Dan. Patch 3 updates the comment for aarch64 on the test to state the fact, and what is missing. Then we don't target it support for v9.0, but only until we have a stable CPU model for aarch64 (if ever possible to support both tcg and kvm). Comments welcomed, thanks. Peter Xu (3): tests/migration-test: Stick with gicv3 in aarch64 test ci: Remove tag dependency for build-previous-qemu ci: Update comment for migration-compat-aarch64 tests/qtest/migration-test.c | 2 +- .gitlab-ci.d/buildtest.yml | 9 ++--- 2 files changed, 7 insertions(+), 4 deletions(-) -- 2.43.0
Re: [PATCH] docs/style: allow C99 mixed declarations
Daniel P. Berrangé writes: > On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: >> C99 mixed declarations support interleaving of local variable >> declarations and code. >> >> The coding style "generally" forbids C99 mixed declarations with some >> exceptions to the rule. This rule is not checked by checkpatch.pl and >> naturally there are violations in the source tree. >> >> While contemplating adding another exception, I came to the conclusion >> that the best location for declarations depends on context. Let the >> programmer declare variables where it is best for legibility. Don't try >> to define all possible scenarios/exceptions. > > IIRC, we had a discussion on this topic sometime last year, but can't > remember what the $SUBJECT was, so I'll just repeat what I said then. From: Juan Quintela Subject: [PATCH] Change the default for Mixed declarations. Date: Tue, 14 Feb 2023 17:07:38 +0100 Message-Id: <20230214160738.88614-1-quint...@redhat.com> https://lore.kernel.org/qemu-devel/20230214160738.88614-1-quint...@redhat.com/ > Combining C99 mixed declarations with 'goto' is dangerous. > > Consider this program: > > $ cat jump.c > #include > > int main(int argc, char**argv) { > > if (getenv("SKIP")) > goto target; > > char *foo = malloc(30); > > target: > free(foo); > } > > $ gcc -Wall -Wuninitialized -o jump jump.c > > $ SKIP=1 ./jump > free(): invalid pointer > Aborted (core dumped) > > > -> The programmer thinks they have initialized 'foo' > -> GCC thinks the programmer has initialized 'foo' > -> Yet 'foo' is not guaranteed to be initialized at 'target:' > > Given that QEMU makes heavy use of 'goto', allowing C99 mixed > declarations exposes us to significant danger. > > Full disclosure, GCC fails to diagnmose this mistake, even > with a decl at start of 'main', but at least the mistake is > now more visible to the programmer. > > Fortunately with -fanalyzer GCC can diagnose this: > > $ gcc -fanalyzer -Wall -o jump jump.c > jump.c: In function ‘main’: > jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] > [-Wanalyzer-use-of-uninitialized-value] >12 | free(foo); > | ^ > ‘main’: events 1-5 > | > |6 | if (getenv("SKIP")) > | | ~ > | | | > | | (3) following ‘true’ branch... > |7 | goto target; > | | > | | | > | | (4) ...to here > |8 | > |9 | char *foo = malloc(30); > | |^~~ > | || > | |(1) region created on stack here > | |(2) capacity: 8 bytes > |.. > | 12 | free(foo); > | | ~ > | | | > | | (5) use of uninitialized value ‘foo’ here > > > ...but -fanalyzer isn't something we have enabled by default, it > is opt-in. I'm also not sure how comprehensive the flow control > analysis of -fanalyzer is ? Can we be sure it'll catch these > mistakes in large complex functions with many code paths ? > > Even if the compiler does reliably warn, I think the code pattern > remains misleading to contributors, as the flow control flaw is > very non-obvious. Yup. Strong dislike. > Rather than accept the status quo and remove the coding guideline, > I think we should strengthen the guidelines, such that it is > explicitly forbidden in any method that uses 'goto'. Personally > I'd go all the way to -Werror=declaration-after-statement, as I support this. > while C99 mixed decl is appealing, Not to me. I much prefer declarations and statements to be visually distinct. Putting declarations first and separating from statements them with a blank line accomplishes that. Less necessary in languages where declarations are syntactically obvious. >it isn't exactly a game > changer in improving code maintainability. [...]
Re: [PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races
On Mon, Feb 05, 2024 at 04:49:23PM -0300, Fabiano Rosas wrote: > Based-on: 20240202102857.110210-1-pet...@redhat.com > [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups > https://lore.kernel.org/r/20240202102857.110210-1-pet...@redhat.com > > Hi, > > In this v2 I made sure NO channel is created after the semaphores are > posted. Feel free to call me out if that's not the case. Queued into -staging. Plan to send a pull only before I'll be out (Feb 9-19), so comments are still welcomed. Thanks. -- Peter Xu
Re: [PATCH v2 6/6] migration/multifd: Add a synchronization point for channel creation
On Mon, Feb 05, 2024 at 04:49:29PM -0300, Fabiano Rosas wrote: > It is possible that one of the multifd channels fails to be created at > multifd_new_send_channel_async() while the rest of the channel > creation tasks are still in flight. > > This could lead to multifd_save_cleanup() executing the > qemu_thread_join() loop too early and not waiting for the threads > which haven't been created yet, leading to the freeing of resources > that the newly created threads will try to access and crash. > > Add a synchronization point after which there will be no attempts at > thread creation and therefore calling multifd_save_cleanup() past that > point will ensure it properly waits for the threads. > > A note about performance: Prior to this patch, if a channel took too > long to be established, other channels could finish connecting first > and already start taking load. Now we're bounded by the > slowest-connecting channel. > > Reported-by: Avihai Horon > Signed-off-by: Fabiano Rosas [...] > @@ -934,7 +936,6 @@ static void multifd_new_send_channel_async(QIOTask *task, > gpointer opaque) > MultiFDSendParams *p = opaque; > QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); > Error *local_err = NULL; > - This line removal should belong to the previous patch. I can touch that up. > bool ret; > > trace_multifd_new_send_channel_async(p->id); Reviewed-by: Peter Xu -- Peter Xu
Re: [PATCH v2 5/6] migration/multifd: Unify multifd and TLS connection paths
On Mon, Feb 05, 2024 at 04:49:28PM -0300, Fabiano Rosas wrote: > During multifd channel creation (multifd_send_new_channel_async) when > TLS is enabled, the multifd_channel_connect function is called twice, > once to create the TLS handshake thread and another time after the > asynchrounous TLS handshake has finished. > > This creates a slightly confusing call stack where > multifd_channel_connect() is called more times than the number of > channels. It also splits error handling between the two callers of > multifd_channel_connect() causing some code duplication. Lastly, it > gets in the way of having a single point to determine whether all > channel creation tasks have been initiated. > > Refactor the code to move the reentrancy one level up at the > multifd_new_send_channel_async() level, de-duplicating the error > handling and allowing for the next patch to introduce a > synchronization point common to all the multifd channel creation, > regardless of TLS. > > Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu -- Peter Xu
RE: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base
> -Original Message- > The uart definitions on the AST2700 are different : > > > https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm > 64/boot/dts/aspeed/aspeed-g7.dtsi > > serial0 = > serial1 = > serial2 = > serial3 = > serial4 = > serial5 = > serial6 = > serial7 = > serial8 = > ... > > I think the names in the DT (and consequently in the QEMU models) follow the > IP names in the datasheet. > > I don't think we care in QEMU, so I would be inclined to change the indexing > of > the device names in QEMU and start at 0, which would introduce a > discrepancy for the AST2400, AST2600, AST2600 SoC. > > Let's see what the other maintainers have to say. > > Thanks, > > C. Hi Cedric, Did you mean to change the naming of uart device to 0 base for all ASPEED SOCs? If yes, it seems we need to do the following changes. 1. add ASPEED_DEV_UART0 in aspeed_soc.h 2. Re-defined uart memory map for ast2600, ast10x0, ast2500 and ast2400(uart0 -> ASPEED_DEV_UART0) Take ast2600 for example: static const hwaddr aspeed_soc_ast2600_memmap[] = { [ASPEED_DEV_UART1] = 0x1E783000, ---> [ASPEED_DEV_UART0] [ASPEED_DEV_UART2] = 0x1E78D000, ---> [ASPEED_DEV_UART1] [ASPEED_DEV_UART3] = 0x1E78E000, [ASPEED_DEV_UART4] = 0x1E78F000, [ASPEED_DEV_UART5] = 0x1E784000, [ASPEED_DEV_UART6] = 0x1E79, [ASPEED_DEV_UART7] = 0x1E790100, [ASPEED_DEV_UART8] = 0x1E790200, [ASPEED_DEV_UART9] = 0x1E790300, [ASPEED_DEV_UART10]= 0x1E790400, [ASPEED_DEV_UART11]= 0x1E790500, [ASPEED_DEV_UART12]= 0x1E790600, [ASPEED_DEV_UART13]= 0x1E790700, ---> [ASPEED_DEV_UART12] }; If no, could you please descript it more detail? So, I can change it and re-send this patch series. By the way, I will send a new patch series to support AST2700 in two weeks. We encountered GIC issues. It seems that QEMU support GIC v3 but SPI did not support, yet. https://github.com/qemu/qemu/blob/master/hw/intc/arm_gicv3_dist.c#L383 https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm64/boot/dts/aspeed/aspeed-g7.dtsi#L229 It think that we can discuss it in a new AST2700 patch series. Thanks-Jamin
[PULL v2 38/39] tcg/s390x: Support TCG_COND_TST{EQ,NE}
Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target.h | 2 +- tcg/s390x/tcg-target.c.inc | 139 + 2 files changed, 97 insertions(+), 44 deletions(-) diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h index 53bed8c8d2..ae448c3a3a 100644 --- a/tcg/s390x/tcg-target.h +++ b/tcg/s390x/tcg-target.h @@ -138,7 +138,7 @@ extern uint64_t s390_facilities[3]; #define TCG_TARGET_HAS_qemu_ldst_i128 1 -#define TCG_TARGET_HAS_tst0 +#define TCG_TARGET_HAS_tst1 #define TCG_TARGET_HAS_v64HAVE_FACILITY(VECTOR) #define TCG_TARGET_HAS_v128 HAVE_FACILITY(VECTOR) diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index 7f97080f52..ad587325fc 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -112,6 +112,9 @@ typedef enum S390Opcode { RI_OILH = 0xa50a, RI_OILL = 0xa50b, RI_TMLL = 0xa701, +RI_TMLH = 0xa700, +RI_TMHL = 0xa703, +RI_TMHH = 0xa702, RIEb_CGRJ= 0xec64, RIEb_CLGRJ = 0xec65, @@ -404,10 +407,15 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) #define S390_CC_NEVER 0 #define S390_CC_ALWAYS 15 +#define S390_TM_EQ 8 /* CC == 0 */ +#define S390_TM_NE 7 /* CC in {1,2,3} */ + /* Condition codes that result from a COMPARE and COMPARE LOGICAL. */ -static const uint8_t tcg_cond_to_s390_cond[] = { +static const uint8_t tcg_cond_to_s390_cond[16] = { [TCG_COND_EQ] = S390_CC_EQ, [TCG_COND_NE] = S390_CC_NE, +[TCG_COND_TSTEQ] = S390_CC_EQ, +[TCG_COND_TSTNE] = S390_CC_NE, [TCG_COND_LT] = S390_CC_LT, [TCG_COND_LE] = S390_CC_LE, [TCG_COND_GT] = S390_CC_GT, @@ -421,9 +429,11 @@ static const uint8_t tcg_cond_to_s390_cond[] = { /* Condition codes that result from a LOAD AND TEST. Here, we have no unsigned instruction variation, however since the test is vs zero we can re-map the outcomes appropriately. */ -static const uint8_t tcg_cond_to_ltr_cond[] = { +static const uint8_t tcg_cond_to_ltr_cond[16] = { [TCG_COND_EQ] = S390_CC_EQ, [TCG_COND_NE] = S390_CC_NE, +[TCG_COND_TSTEQ] = S390_CC_ALWAYS, +[TCG_COND_TSTNE] = S390_CC_NEVER, [TCG_COND_LT] = S390_CC_LT, [TCG_COND_LE] = S390_CC_LE, [TCG_COND_GT] = S390_CC_GT, @@ -542,10 +552,13 @@ static bool risbg_mask(uint64_t c) static bool tcg_target_const_match(int64_t val, int ct, TCGType type, TCGCond cond, int vece) { +uint64_t uval = val; + if (ct & TCG_CT_CONST) { return true; } if (type == TCG_TYPE_I32) { +uval = (uint32_t)val; val = (int32_t)val; } @@ -567,6 +580,15 @@ static bool tcg_target_const_match(int64_t val, int ct, case TCG_COND_GTU: ct |= TCG_CT_CONST_U32; /* CLGFI */ break; +case TCG_COND_TSTNE: +case TCG_COND_TSTEQ: +if (is_const_p16(uval) >= 0) { +return true; /* TMxx */ +} +if (risbg_mask(uval)) { +return true; /* RISBG */ +} +break; default: g_assert_not_reached(); } @@ -588,10 +610,6 @@ static bool tcg_target_const_match(int64_t val, int ct, if (ct & TCG_CT_CONST_INV) { val = ~val; } -/* - * Note that is_const_p16 is a subset of is_const_p32, - * so we don't need both constraints. - */ if ((ct & TCG_CT_CONST_P32) && is_const_p32(val) >= 0) { return true; } @@ -868,6 +886,9 @@ static const S390Opcode oi_insns[4] = { static const S390Opcode lif_insns[2] = { RIL_LLILF, RIL_LLIHF, }; +static const S390Opcode tm_insns[4] = { +RI_TMLL, RI_TMLH, RI_TMHL, RI_TMHH +}; /* load a register with an immediate value */ static void tcg_out_movi(TCGContext *s, TCGType type, @@ -1228,6 +1249,36 @@ static int tgen_cmp2(TCGContext *s, TCGType type, TCGCond c, TCGReg r1, TCGCond inv_c = tcg_invert_cond(c); S390Opcode op; +if (is_tst_cond(c)) { +tcg_debug_assert(!need_carry); + +if (!c2const) { +if (type == TCG_TYPE_I32) { +tcg_out_insn(s, RRFa, NRK, TCG_REG_R0, r1, c2); +} else { +tcg_out_insn(s, RRFa, NGRK, TCG_REG_R0, r1, c2); +} +goto exit; +} + +if (type == TCG_TYPE_I32) { +c2 = (uint32_t)c2; +} + +int i = is_const_p16(c2); +if (i >= 0) { +tcg_out_insn_RI(s, tm_insns[i], r1, c2 >> (i * 16)); +*inv_cc = c == TCG_COND_TSTEQ ? S390_TM_NE : S390_TM_EQ; +return *inv_cc ^ 15; +} + +if (risbg_mask(c2)) { +tgen_andi_risbg(s, TCG_REG_R0, r1, c2); +goto exit; +} +g_assert_not_reached(); +} + if (c2const) { if (c2 == 0) { if (!(is_unsigned
[PULL v2 00/39] tcg patch queue
v2: Fix rebase error in patch 38 (tcg/s390x: Support TCG_COND_TST{EQ,NE}). r~ The following changes since commit 39a6e4f87e7b75a45b08d6dc8b8b7c2954c87440: Merge tag 'pull-qapi-2024-02-03' of https://repo.or.cz/qemu/armbru into staging (2024-02-03 13:31:58 +) are available in the Git repository at: https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20240205-2 for you to fetch changes up to 23c5692abc3917151dee36c00d751cf5bc46ef19: tcg/tci: Support TCG_COND_TST{EQ,NE} (2024-02-05 22:45:41 +) tcg: Introduce TCG_COND_TST{EQ,NE} target/alpha: Use TCG_COND_TST{EQ,NE} target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond target/sparc: Use TCG_COND_TSTEQ in gen_op_mulscc target/s390x: Use TCG_COND_TSTNE for CC_OP_{TM,ICM} target/s390x: Improve general case of disas_jcc Paolo Bonzini (1): tcg/i386: Use TEST r,r to test 8/16/32 bits Philippe Mathieu-Daudé (1): tcg/aarch64: Massage tcg_out_brcond() Richard Henderson (37): tcg: Introduce TCG_COND_TST{EQ,NE} tcg: Introduce TCG_TARGET_HAS_tst tcg/optimize: Split out arg_is_const_val tcg/optimize: Split out do_constant_folding_cond1 tcg/optimize: Do swap_commutative2 in do_constant_folding_cond2 tcg/optimize: Handle TCG_COND_TST{EQ,NE} tcg/optimize: Lower TCG_COND_TST{EQ,NE} if unsupported target/alpha: Pass immediate value to gen_bcond_internal() target/alpha: Use TCG_COND_TST{EQ,NE} for BLB{C,S} target/alpha: Use TCG_COND_TST{EQ,NE} for CMOVLB{C,S} target/alpha: Use TCG_COND_TSTNE for gen_fold_mzero target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond target/sparc: Use TCG_COND_TSTEQ in gen_op_mulscc target/s390x: Use TCG_COND_TSTNE for CC_OP_{TM,ICM} target/s390x: Improve general case of disas_jcc tcg: Add TCGConst argument to tcg_target_const_match tcg/aarch64: Support TCG_COND_TST{EQ,NE} tcg/aarch64: Generate TBZ, TBNZ tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX tcg/arm: Split out tcg_out_cmp() tcg/arm: Support TCG_COND_TST{EQ,NE} tcg/i386: Pass x86 condition codes to tcg_out_cmov tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp tcg/i386: Support TCG_COND_TST{EQ,NE} tcg/i386: Improve TSTNE/TESTEQ vs powers of two tcg/sparc64: Hoist read of tcg_cond_to_rcond tcg/sparc64: Pass TCGCond to tcg_out_cmp tcg/sparc64: Support TCG_COND_TST{EQ,NE} tcg/ppc: Sink tcg_to_bc usage into tcg_out_bc tcg/ppc: Use cr0 in tcg_to_bc and tcg_to_isel tcg/ppc: Tidy up tcg_target_const_match tcg/ppc: Add TCG_CT_CONST_CMP tcg/ppc: Support TCG_COND_TST{EQ,NE} tcg/s390x: Split constraint A into J+U tcg/s390x: Add TCG_CT_CONST_CMP tcg/s390x: Support TCG_COND_TST{EQ,NE} tcg/tci: Support TCG_COND_TST{EQ,NE} docs/devel/tcg-ops.rst | 2 + include/tcg/tcg-cond.h | 74 -- tcg/aarch64/tcg-target-con-set.h | 5 +- tcg/aarch64/tcg-target-con-str.h | 1 + tcg/aarch64/tcg-target.h | 2 + tcg/arm/tcg-target.h | 2 + tcg/i386/tcg-target-con-set.h| 6 +- tcg/i386/tcg-target-con-str.h| 1 + tcg/i386/tcg-target.h| 2 + tcg/loongarch64/tcg-target.h | 2 + tcg/mips/tcg-target.h| 2 + tcg/ppc/tcg-target-con-set.h | 5 +- tcg/ppc/tcg-target-con-str.h | 1 + tcg/ppc/tcg-target.h | 2 + tcg/riscv/tcg-target.h | 2 + tcg/s390x/tcg-target-con-set.h | 8 +- tcg/s390x/tcg-target-con-str.h | 3 +- tcg/s390x/tcg-target.h | 2 + tcg/sparc64/tcg-target.h | 2 + tcg/tcg-internal.h | 2 + tcg/tci/tcg-target.h | 2 + target/alpha/translate.c | 94 target/m68k/translate.c | 74 +++--- target/s390x/tcg/translate.c | 100 +++-- target/sparc/translate.c | 4 +- tcg/optimize.c | 474 ++- tcg/tcg.c| 40 +++- tcg/tci.c| 14 ++ tcg/aarch64/tcg-target.c.inc | 166 +++--- tcg/arm/tcg-target.c.inc | 62 +++-- tcg/i386/tcg-target.c.inc| 201 - tcg/loongarch64/tcg-target.c.inc | 3 +- tcg/mips/tcg-target.c.inc| 3 +- tcg/ppc/tcg-target.c.inc | 294 ++-- tcg/riscv/tcg-target.c.inc | 3 +- tcg/s390x/tcg-target.c.inc | 246 +--- tcg/sparc64/tcg-target.c.inc | 65 -- tcg/tci/tcg-target.c.inc | 3 +- 38 files changed, 1379 insertions(+), 595 deletions(-)
RE: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base
> -Original Message- > From: Philippe Mathieu-Daudé > Sent: Monday, February 5, 2024 9:16 PM > To: Cédric Le Goater ; Jamin Lin ; > Peter Maydell ; Andrew Jeffery > ; Joel Stanley ; open > list:ASPEED BMCs ; open list:All patches CC here > > Cc: Troy Lee > Subject: Re: [PATCH v0 1/2] aspeed: support uart controller both 0 and 1 base > > On 5/2/24 11:46, Cédric Le Goater wrote: > > Hello Jamin, > > > > On 2/5/24 10:14, Jamin Lin wrote: > >> According to the design of ASPEED SOCS, the uart controller is 1 base > >> for ast10x0, ast2600, ast2500 and ast2400. > >> > >> However, the uart controller is 0 base for ast2700. > >> To support uart controller both 0 and 1 base, adds uasrt_bases > >> parameter in AspeedSoCClass and set the default uart controller 1 > >> base for ast10x0, astt2600, ast2500 and ast2400. > > > > The board definition can set 'amc->uart_default' to choose a different > > default serial port for the console, or use the "bmc-console" machine > > option . Isn't it enough ? May be I am misunderstanding the need. > > > > To clarify, > > > > ASPEED_DEV_UART1 is in the first serial port on the boards. > > > > I think we chose to start the indexing at 1 because the Aspeed QEMU > > modeling began first with the UART model (console) and for simplicity, > > we copied the definitions of the device tree from Linux : > > > > serial0 = > > serial1 = > > serial2 = > > serial3 = > > serial4 = > > serial5 = > > > > We replicated this indexing starting at 1 to nearly all device models : > > > > ASPEED_DEV_UART1 - 13 > > ASPEED_DEV_SPI1 -2 > > ASPEED_DEV_EHCI1 -2 > > ASPEED_DEV_TIMER1 - 8 > > ASPEED_DEV_ETH1 -4 > > ASPEED_DEV_MII1 - 4 > > ASPEED_DEV_JTAG0 - 1 <--- !! > > ASPEED_DEV_FSI1 - 2 > > > > I don't know what would be ASPEED_DEV_UART0 in this context. > > > > May be you could send a simplified AST2700 SoC model with definitions > > of a minimum address space and IRQ space ? > > Looking at TF-A definitions, > https://github.com/ARM-software/arm-trusted-firmware/commit/85f199b774 > 47 > > #define UART_BASE U(0x14c33000) > #define UART12_BASE (UART_BASE + 0xb00) > #define CONSOLE_UART_BASE UART12_BASE > > As Cédric described, we have TF-A UART12_BASE -> QEMU > ASPEED_DEV_UART13. As Cédric described, the UART definitions on the AST2700 are different : https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v6.6/arch/arm64/boot/dts/aspeed/aspeed-g7.dtsi serial0 = serial1 = serial2 = serial3 = serial4 = serial5 = serial6 = serial7 = serial8 = According to the current design of ASPEED QEMU UART model, it used "1" base device name and Follow the IP names in the datasheet Take ast2600 for example: static const hwaddr aspeed_soc_ast2600_memmap[] = { [ASPEED_DEV_UART1] = 0x1E783000, } AST2600 datashee description: Base Address of UART1 = 0x1E78 3000 Base Address of UART2 = 0x1E78 D000 Base Address of UART3 = 0x1E78 E000 Base Address of UART4 = 0x1E78 F000 Base Address of UART5 = 0x1E78 4000 Base Address of UART6 = 0x1E79 However, device name of uart controller had been changed to "0" base for ast2700. If we want to control uart0(ASPEED_DEV_UART1), we should set the memory map as following, static const hwaddr aspeed_soc_ast2700_memmap[] = { [ASPEED_DEV_UART1] = 0X14C33000, [ASPEED_DEV_UART2] = 0X14C33100, [ASPEED_DEV_UART3] = 0X14C33200, [ASPEED_DEV_UART4] = 0X14C33300, [ASPEED_DEV_UART5] = 0X12C1A000, } AST2700 datasheet description: AST2700 integrate 13 sets of UART. Base Address of UART0 = 0x14C3_3000 Base Address of UART1 = 0x14C3_3100 Base Address of UART2 = 0x14C3_3200 Base Address of UART3 = 0x14C3_3300 Base Address of UART4 = 0x12C1_A000 Base Address of UART5 = 0x14C3_3400 Base Address of UART6 = 0x14C3_3500 Base Address of UART7 = 0x14C3_3600 Base Address of UART8 = 0x14C3_3700 Base Address of UART9 = 0x14C3_3800 Base Address of UART10 = 0x14C3_3900 Base Address of UART11 = 0x14C3_3A00 Base Address of UART12 = 0x14C3_3B00 As you said, uart12 mapped ASPEED_DEV_UART13. The device naming will confuse users because the device name in qemu mismatch with ast2700 datasheet. That way why we want to add ASPEED_DEV_UART0 and set the memory map of AST2700 as following. static const hwaddr aspeed_soc_ast2700_memmap[] = { [ASPEED_DEV_UART0] = 0X14C33000, [ASPEED_DEV_UART1] = 0X14C33100, [ASPEED_DEV_UART2] = 0X14C33200, [ASPEED_DEV_UART3] = 0X14C33300, [ASPEED_DEV_UART4] = 0X12C1A000, [ASPEED_DEV_UART5] = 0X14C33400, [ASPEED_DEV_UART6] = 0X14C33500, [ASPEED_DEV_UART7] = 0X14C33600, [ASPEED_DEV_UART8] = 0X14C33700, [ASPEED_DEV_UART9] = 0X14C33800, [ASPEED_DEV_UART10]= 0X14C33900, [ASPEED_DEV_UART11]= 0X14C33A00, [ASPEED_DEV_UART12]=
[PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa
The field is encoded as [0-3], which is convenient for indexing our array of function pointers, but the true value is [1-4]. Adjust before calling do_mem_zpa. Add an assert, and move the comment re passing ZT to the helper back next to the relevant code. Cc: qemu-sta...@nongnu.org Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads") Signed-off-by: Richard Henderson --- target/arm/tcg/translate-sve.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index 296e7d1ce2..f50a426c98 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -4445,11 +4445,7 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, TCGv_ptr t_pg; int desc = 0; -/* - * For e.g. LD4, there are not enough arguments to pass all 4 - * registers as pointers, so encode the regno into the data field. - * For consistency, do this even for LD1. - */ +assert(mte_n >= 1 && mte_n <= 4); if (s->mte_active[0]) { int msz = dtype_msz(dtype); @@ -4463,6 +4459,11 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, addr = clean_data_tbi(s, addr); } +/* + * For e.g. LD4, there are not enough arguments to pass all 4 + * registers as pointers, so encode the regno into the data field. + * For consistency, do this even for LD1. + */ desc = simd_desc(vsz, vsz, zt | desc); t_pg = tcg_temp_new_ptr(); @@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg, * accessible via the instruction encoding. */ assert(fn != NULL); -do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn); +do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn); } static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a) -- 2.34.1
[PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
When MTE3 is supported, the kernel maps PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC to MTE_CTRL_TCF_ASYMM and from there to SCTLR_EL1.TCF0 = 3 There is no error reported for setting ASYNC | SYNC when MTE3 is not supported; the kernel simply selects the ASYNC behavior of TCF0=2. Signed-off-by: Richard Henderson --- linux-user/aarch64/target_prctl.h | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h index 5067e7d731..49bd16aa95 100644 --- a/linux-user/aarch64/target_prctl.h +++ b/linux-user/aarch64/target_prctl.h @@ -173,21 +173,22 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2) env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE; if (cpu_isar_feature(aa64_mte, cpu)) { -switch (arg2 & PR_MTE_TCF_MASK) { -case PR_MTE_TCF_NONE: -case PR_MTE_TCF_SYNC: -case PR_MTE_TCF_ASYNC: -break; -default: -return -EINVAL; -} - /* * Write PR_MTE_TCF to SCTLR_EL1[TCF0]. - * Note that the syscall values are consistent with hw. + * Note that SYNC | ASYNC -> ASYMM with FEAT_MTE3, + * otherwise mte_update_sctlr_user chooses ASYNC. */ -env->cp15.sctlr_el[1] = -deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT); +unsigned tcf = 0; +if (arg2 & PR_MTE_TCF_ASYNC) { +if ((arg2 & PR_MTE_TCF_SYNC) && cpu_isar_feature(aa64_mte3, cpu)) { +tcf = 3; +} else { +tcf = 2; +} +} else if (arg2 & PR_MTE_TCF_SYNC) { +tcf = 1; +} +env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf); /* * Write PR_MTE_TAG to GCR_EL1[Exclude]. -- 2.34.1
[PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks
The TBI and TCMA bits are located within mtedesc, not desc. Signed-off-by: Richard Henderson --- target/arm/tcg/sme_helper.c | 8 target/arm/tcg/sve_helper.c | 12 ++-- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c index 1ee2690ceb..904bfdac43 100644 --- a/target/arm/tcg/sme_helper.c +++ b/target/arm/tcg/sme_helper.c @@ -573,8 +573,8 @@ void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg, desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT); /* Perform gross MTE suppression early. */ -if (!tbi_check(desc, bit55) || -tcma_check(desc, bit55, allocation_tag_from_addr(addr))) { +if (!tbi_check(mtedesc, bit55) || +tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) { mtedesc = 0; } @@ -750,8 +750,8 @@ void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, target_ulong addr, desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT); /* Perform gross MTE suppression early. */ -if (!tbi_check(desc, bit55) || -tcma_check(desc, bit55, allocation_tag_from_addr(addr))) { +if (!tbi_check(mtedesc, bit55) || +tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) { mtedesc = 0; } diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c index bce4295d28..6853f58c19 100644 --- a/target/arm/tcg/sve_helper.c +++ b/target/arm/tcg/sve_helper.c @@ -5800,8 +5800,8 @@ void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr, desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT); /* Perform gross MTE suppression early. */ -if (!tbi_check(desc, bit55) || -tcma_check(desc, bit55, allocation_tag_from_addr(addr))) { +if (!tbi_check(mtedesc, bit55) || +tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) { mtedesc = 0; } @@ -6156,8 +6156,8 @@ void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, target_ulong addr, desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT); /* Perform gross MTE suppression early. */ -if (!tbi_check(desc, bit55) || -tcma_check(desc, bit55, allocation_tag_from_addr(addr))) { +if (!tbi_check(mtedesc, bit55) || +tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) { mtedesc = 0; } @@ -6410,8 +6410,8 @@ void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr, desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT); /* Perform gross MTE suppression early. */ -if (!tbi_check(desc, bit55) || -tcma_check(desc, bit55, allocation_tag_from_addr(addr))) { +if (!tbi_check(mtedesc, bit55) || +tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) { mtedesc = 0; } -- 2.34.1
[PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1
When we added SVE_MTEDESC_SHIFT, we effectively limited the maximum size of MTEDESC. Adjust SIZEM1 to consume the remaining bits (32 - 10 - 5 - 12 == 5). Assert that the data to be stored fits within the field (expecting 8 * 4 - 1 == 31, exact fit). Signed-off-by: Richard Henderson --- target/arm/internals.h | 2 +- target/arm/tcg/translate-sve.c | 7 --- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/target/arm/internals.h b/target/arm/internals.h index fc337fe40e..50bff44549 100644 --- a/target/arm/internals.h +++ b/target/arm/internals.h @@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI, 4, 2) FIELD(MTEDESC, TCMA, 6, 2) FIELD(MTEDESC, WRITE, 8, 1) FIELD(MTEDESC, ALIGN, 9, 3) -FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12) /* size - 1 */ +FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12) /* size - 1 */ bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr); uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra); diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index f50a426c98..2628ac2840 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, { unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; +uint32_t sizem1; int desc = 0; assert(mte_n >= 1 && mte_n <= 4); +sizem1 = (mte_n << dtype_msz(dtype)) - 1; +assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT); if (s->mte_active[0]) { -int msz = dtype_msz(dtype); - desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s)); desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid); desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma); desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write); -desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1); +desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1); desc <<= SVE_MTEDESC_SHIFT; } else { addr = clean_data_tbi(s, addr); -- 2.34.1
[PATCH v2 4/6] target/arm: Split out make_svemte_desc
Share code that creates mtedesc and embeds within simd_desc. Signed-off-by: Richard Henderson --- target/arm/tcg/translate-a64.h | 2 ++ target/arm/tcg/translate-sme.c | 15 +++ target/arm/tcg/translate-sve.c | 47 ++ 3 files changed, 31 insertions(+), 33 deletions(-) diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h index 96ba39b37e..7b811b8ac5 100644 --- a/target/arm/tcg/translate-a64.h +++ b/target/arm/tcg/translate-a64.h @@ -28,6 +28,8 @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn, bool sve_access_check(DisasContext *s); bool sme_enabled_check(DisasContext *s); bool sme_enabled_check_with_svcr(DisasContext *s, unsigned); +uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs, + uint32_t msz, bool is_write, uint32_t data); /* This function corresponds to CheckStreamingSVEEnabled. */ static inline bool sme_sm_enabled_check(DisasContext *s) diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c index 8f0dfc884e..46c7fce8b4 100644 --- a/target/arm/tcg/translate-sme.c +++ b/target/arm/tcg/translate-sme.c @@ -206,7 +206,7 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a) TCGv_ptr t_za, t_pg; TCGv_i64 addr; -int svl, desc = 0; +uint32_t desc; bool be = s->be_data == MO_BE; bool mte = s->mte_active[0]; @@ -224,18 +224,11 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a) tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz); tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); -if (mte) { -desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s)); -desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid); -desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma); -desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st); -desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1); -desc <<= SVE_MTEDESC_SHIFT; -} else { +if (!mte) { addr = clean_data_tbi(s, addr); } -svl = streaming_vec_reg_size(s); -desc = simd_desc(svl, svl, desc); + +desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0); fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr, tcg_constant_i32(desc)); diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index 2628ac2840..8868aae5ac 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -4437,18 +4437,18 @@ static const uint8_t dtype_esz[16] = { 3, 2, 1, 3 }; -static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, - int dtype, uint32_t mte_n, bool is_write, - gen_helper_gvec_mem *fn) +uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs, + uint32_t msz, bool is_write, uint32_t data) { -unsigned vsz = vec_full_reg_size(s); -TCGv_ptr t_pg; uint32_t sizem1; -int desc = 0; +uint32_t desc = 0; -assert(mte_n >= 1 && mte_n <= 4); -sizem1 = (mte_n << dtype_msz(dtype)) - 1; +/* Assert all of the data fits, with or without MTE enabled. */ +assert(nregs >= 1 && nregs <= 4); +sizem1 = (nregs << msz) - 1; assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT); +assert(data < 1u << SVE_MTEDESC_SHIFT); + if (s->mte_active[0]) { desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s)); desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid); @@ -4456,7 +4456,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write); desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1); desc <<= SVE_MTEDESC_SHIFT; -} else { +} +return simd_desc(vsz, vsz, desc | data); +} + +static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + int dtype, uint32_t nregs, bool is_write, + gen_helper_gvec_mem *fn) +{ +TCGv_ptr t_pg; +uint32_t desc; + +if (!s->mte_active[0]) { addr = clean_data_tbi(s, addr); } @@ -4465,7 +4476,8 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, * registers as pointers, so encode the regno into the data field. * For consistency, do this even for LD1. */ -desc = simd_desc(vsz, vsz, zt | desc); +desc = make_svemte_desc(s, vec_full_reg_size(s), nregs, +dtype_msz(dtype), is_write, zt); t_pg = tcg_temp_new_ptr(); tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg)); @@ -5225,25 +5237,16 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, TCGv_i64 scalar, int msz, bool is_write, gen_helper_gvec_mem_scatter *fn) { -unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_zm =
Re: [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
queued. -- Peter Xu
[PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro
These functions "use the standard load helpers", but fail to clean_data_tbi or populate mtedesc. Signed-off-by: Richard Henderson --- target/arm/tcg/translate-sve.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index 8868aae5ac..fb3bcb47f6 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -4861,8 +4861,13 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; int poff; +uint32_t desc; /* Load the first quadword using the normal predicated load helpers. */ +if (!s->mte_active[0]) { +addr = clean_data_tbi(s, addr); +} + poff = pred_full_reg_offset(s, pg); if (vsz > 16) { /* @@ -4886,7 +4891,8 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) gen_helper_gvec_mem *fn = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; -fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt))); +desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt); +fn(tcg_env, t_pg, addr, tcg_constant_i32(desc)); /* Replicate that first quadword. */ if (vsz > 16) { @@ -4929,6 +4935,7 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) unsigned vsz_r32; TCGv_ptr t_pg; int poff, doff; +uint32_t desc; if (vsz < 32) { /* @@ -4941,6 +4948,9 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) } /* Load the first octaword using the normal predicated load helpers. */ +if (!s->mte_active[0]) { +addr = clean_data_tbi(s, addr); +} poff = pred_full_reg_offset(s, pg); if (vsz > 32) { @@ -4965,7 +4975,8 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) gen_helper_gvec_mem *fn = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; -fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt))); +desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt); +fn(tcg_env, t_pg, addr, tcg_constant_i32(desc)); /* * Replicate that first octaword. -- 2.34.1
[PATCH v2 0/6] target/arm: assorted mte fixes
The first patch is unchanged from Supercedes: <20240131003557.176486-1-richard.hender...@linaro.org> while the remaining patches replace Supercedes: <20240205023948.25476-1-richard.hender...@linaro.org> While digging through Gustavo's test case, wondering why it should be failing at all, I finally noticed that we weren't overflowing MTEDESC.SIZEM1, but underflowing (-1). Oops. But I did find a few other points by inspection where we weren't properly handling or supplying MTEDESC. r~ Richard Henderson (6): linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 target/arm: Fix nregs computation in do_ld_zpa target/arm: Adjust and validate mtedesc sizem1 target/arm: Split out make_svemte_desc target/arm: Handle mte in do_ldrq, do_ldro target/arm: Fix SVE/SME gross MTE suppression checks linux-user/aarch64/target_prctl.h | 25 +- target/arm/internals.h| 2 +- target/arm/tcg/translate-a64.h| 2 + target/arm/tcg/sme_helper.c | 8 ++-- target/arm/tcg/sve_helper.c | 12 ++--- target/arm/tcg/translate-sme.c| 15 ++ target/arm/tcg/translate-sve.c| 80 ++- 7 files changed, 78 insertions(+), 66 deletions(-) -- 2.34.1
Re: [PATCH] vdpa-dev: Fix initialisation order to restore VDUSE compatibility
On Mon, Feb 5, 2024 at 6:51 PM Stefano Garzarella wrote: > > On Fri, Feb 02, 2024 at 02:25:21PM +0100, Kevin Wolf wrote: > >VDUSE requires that virtqueues are first enabled before the DRIVER_OK > >status flag is set; with the current API of the kernel module, it is > >impossible to enable the opposite order in our block export code because > >userspace is not notified when a virtqueue is enabled. Did this mean virtio-blk will enable a virtqueue after DRIVER_OK? Sepc is not clear about this and that's why we introduce VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK. > > Yeah, IMHO the VDUSE protocol is missing a VDUSE_SET_VQ_READY message, I think you meant when VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is negotiated. If this is truth, it seems a little more complicated, for example the get_backend_features needs to be forward to the userspace? This seems suboptimal to implement this in the spec first and then we can leverage the features. Or we can have another parameter for the ioctl that creates the vduse device. > I'll start another thread about that, but in the meantime I agree that > we should fix QEMU since we need to work properly with old kernels as > well. > > > > >This requirement also mathces the normal initialisation order as done by > >the generic vhost code in QEMU. However, commit 6c482547 accidentally > >changed the order for vdpa-dev and broke access to VDUSE devices with > >this. > > > >This changes vdpa-dev to use the normal order again and use the standard > >vhost callback .vhost_set_vring_enable for this. VDUSE devices can be > >used with vdpa-dev again after this fix. > > I like this approach and the patch LGTM, but I'm a bit worried about > this function in hw/net/vhost_net.c: > > int vhost_set_vring_enable(NetClientState *nc, int enable) > { > VHostNetState *net = get_vhost_net(nc); > const VhostOps *vhost_ops = net->dev.vhost_ops; > > nc->vring_enable = enable; > > if (vhost_ops && vhost_ops->vhost_set_vring_enable) { > return vhost_ops->vhost_set_vring_enable(>dev, enable); > } > > return 0; > } > > @Eugenio, @Jason, should we change some things there if vhost-vdpa > implements the vhost_set_vring_enable callback? Eugenio may know more, I remember we need to enable cvq first for shadow virtqueue to restore some states. > > Do you remember why we didn't implement it from the beginning? It seems the vrings parameter is introduced after vhost-vdpa is implemented. Thanks > > Thanks, > Stefano > > > > >Cc: qemu-sta...@nongnu.org > >Fixes: 6c4825476a4351530bcac17abab72295b75ffe98 > >Signed-off-by: Kevin Wolf > >--- > > hw/virtio/vdpa-dev.c | 5 + > > hw/virtio/vhost-vdpa.c | 17 + > > 2 files changed, 18 insertions(+), 4 deletions(-) > > > >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c > >index eb9ecea83b..13e87f06f6 100644 > >--- a/hw/virtio/vdpa-dev.c > >+++ b/hw/virtio/vdpa-dev.c > >@@ -253,14 +253,11 @@ static int vhost_vdpa_device_start(VirtIODevice *vdev, > >Error **errp) > > > > s->dev.acked_features = vdev->guest_features; > > > >-ret = vhost_dev_start(>dev, vdev, false); > >+ret = vhost_dev_start(>dev, vdev, true); > > if (ret < 0) { > > error_setg_errno(errp, -ret, "Error starting vhost"); > > goto err_guest_notifiers; > > } > >-for (i = 0; i < s->dev.nvqs; ++i) { > >-vhost_vdpa_set_vring_ready(>vdpa, i); > >-} > > s->started = true; > > > > /* > >diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > >index 3a43beb312..c4574d56c5 100644 > >--- a/hw/virtio/vhost-vdpa.c > >+++ b/hw/virtio/vhost-vdpa.c > >@@ -904,6 +904,22 @@ int vhost_vdpa_set_vring_ready(struct vhost_vdpa *v, > >unsigned idx) > > return r; > > } > > > >+static int vhost_vdpa_set_vring_enable(struct vhost_dev *dev, int enable) > >+{ > >+struct vhost_vdpa *v = dev->opaque; > >+unsigned int i; > >+int ret; > >+ > >+for (i = 0; i < dev->nvqs; ++i) { > >+ret = vhost_vdpa_set_vring_ready(v, i); > >+if (ret < 0) { > >+return ret; > >+} > >+} > >+ > >+return 0; > >+} > >+ > > static int vhost_vdpa_set_config_call(struct vhost_dev *dev, > >int fd) > > { > >@@ -1524,6 +1540,7 @@ const VhostOps vdpa_ops = { > > .vhost_set_features = vhost_vdpa_set_features, > > .vhost_reset_device = vhost_vdpa_reset_device, > > .vhost_get_vq_index = vhost_vdpa_get_vq_index, > >+.vhost_set_vring_enable = vhost_vdpa_set_vring_enable, > > .vhost_get_config = vhost_vdpa_get_config, > > .vhost_set_config = vhost_vdpa_set_config, > > .vhost_requires_shm_log = NULL, > >-- > >2.43.0 > > >
Re: [PATCH 0/2] migration: Fix return-path thread exit
On Mon, Feb 05, 2024 at 10:32:33AM +, Daniel P. Berrangé wrote: > On Fri, Feb 02, 2024 at 05:53:39PM +0800, Peter Xu wrote: > > On Thu, Feb 01, 2024 at 07:48:51PM +0100, Cédric Le Goater wrote: > > > Hello, > > > > Hi, Cédric, > > > > Thanks for the patches. > > > > > > > > Today, close_return_path_on_source() can perform a shutdown to exit > > > the return-path thread if an error occured. However, migrate_fd_cleanup() > > > does cleanups too early and the shutdown in close_return_path_on_source() > > > fails, leaving the source and destination waiting for an event to occur. > > > > > > This little series tries to fix that. Comments welcome ! > > > > One thing I do agree is that relying on qemu_file_get_error(to_dst_file) in > > close_return_path_on_source() is weird: IMHO we have better way to detect > > "whether the migration has error" now, which is migrate_has_error(). > > > > For this specific issue, I think one long standing issue that might be > > relevant is we have two QEMUFile (from_dst_file, to_dst_file) that share > > the same QIOChannel now. Logically the two QEMUFile should be able to be > > managed separately, say, close() of to_dst_file shouldn't affect the other. > > > > However I don't think it's the case now, as qemu_fclose(to_dst_file) will > > do qio_channel_close() already, which means there will be a side effect to > > the other QEMUFile that its backing IOC is already closed. > > > > Is this the issue we're facing? IOW, the close() of to_dst_file will not > > properly kick the other thread who is blocked at reading from_dst_file, > > while the shutdown() will kick it out? > > > > If so, not sure whether we can somehow relay the real qio_channel_close() > > to until the last user releases it? IOW, conditionally close() the channel > > in qio_channel_finalize(), if the channel is still open? Would that make > > sense? > > IMHO the problem described above is a result of the design mistake of > having 2 separate QEMUFile instances for what is ultimately the same > channel. This was a convenient approach to take originally, but it has > likely outlived its purpose. > > In the ideal world IMHO, QEMUFile would not exist at all, and we would > have a QIOChannelCached that adds the read/write buffering above the > base QIOChannel. We have that in the TODO wiki page for a long time, I'll update it slightly. https://wiki.qemu.org/ToDo/LiveMigration#Rewrite_QEMUFile_for_migration But yeah that might be too big a hammer to solve this specific issue. AFAIU Fabiano is looking into that direction, but I assume it should still be a long term thing. > > That's doable, but bigger than a quick fix. A natural stepping stone > to get there though is to move from 2 QEMUFile objs down to 1 QEMUFile, > which might be more practical as a quick fix. Agree. However would this still be quite some change? We still have a lot of references on the four qemufiles (to/from_dst_file, to/from_src_file), at least that'll need a replacement; I didn't yet further check whether all places can be done with a direct replacement of such change, some tweaks may be needed here and there, but shouldn't be major. Meanwhile IIUC it'll also need a major rework on QEMUFile, allowing it to be bi-directional? We may need to duplicate the cache layer, IIUC, one for each direction IOs. -- Peter Xu
Re: [PATCH] tests/cdrom-test: Add cdrom test for LoongArch virt machine
Hi Thomas, On 2024/2/5 下午3:47, Thomas Huth wrote: On 05/02/2024 03.13, Bibo Mao wrote: The cdrom test skips to execute on LoongArch system with command "make check" Are you sure the test is marked with "skip"? ... it should at least test with the "none" machine...? I check again, cdrom test case passes to run with "none" machine. And the root cause is that xorriso rpm package is missing so it is skipped. Regards Bibo Mao this patch enables cdrom test for LoongArch virt machine platform. With this patch, cdrom test passes to run on LoongArch virt machine type. Signed-off-by: Bibo Mao --- tests/qtest/cdrom-test.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/qtest/cdrom-test.c b/tests/qtest/cdrom-test.c index 0945383789..c8b97d8d9a 100644 --- a/tests/qtest/cdrom-test.c +++ b/tests/qtest/cdrom-test.c @@ -271,6 +271,9 @@ int main(int argc, char **argv) const char *virtmachine[] = { "virt", NULL }; add_cdrom_param_tests(virtmachine); } + } else if (g_str_equal(arch, "loongarch64")) { + const char *virtmachine[] = { "virt", NULL }; + add_cdrom_param_tests(virtmachine); } else { const char *nonemachine[] = { "none", NULL }; add_cdrom_param_tests(nonemachine); Anyway, using the virt machine is certainly better than the "none" machine, so: Acked-by: Thomas Huth
Re: [PATCH] tests/cdrom-test: Add cdrom test for LoongArch virt machine
Hi Philippe, On 2024/2/5 下午8:58, Philippe Mathieu-Daudé wrote: Hi Bibo, On 5/2/24 03:13, Bibo Mao wrote: The cdrom test skips to execute on LoongArch system with command "make check", this patch enables cdrom test for LoongArch virt machine platform. With this patch, cdrom test passes to run on LoongArch virt machine type. Signed-off-by: Bibo Mao --- tests/qtest/cdrom-test.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/qtest/cdrom-test.c b/tests/qtest/cdrom-test.c index 0945383789..c8b97d8d9a 100644 --- a/tests/qtest/cdrom-test.c +++ b/tests/qtest/cdrom-test.c @@ -271,6 +271,9 @@ int main(int argc, char **argv) const char *virtmachine[] = { "virt", NULL }; add_cdrom_param_tests(virtmachine); } + } else if (g_str_equal(arch, "loongarch64")) { + const char *virtmachine[] = { "virt", NULL }; + add_cdrom_param_tests(virtmachine); What is the default device used, virtio-blk-pci? yes, it is. For virt machine type, the default type for block device is virtio interface, and it is defined at function loongarch_class_init(). mc->block_default_type = IF_VIRTIO Regards Bibo Mao
RE: [PATCH v0 2/2] aspeed: fix hardcode boot address 0
> -Original Message- > From: Cédric Le Goater > Sent: Monday, February 5, 2024 9:34 PM > To: Jamin Lin ; Peter Maydell > ; Andrew Jeffery ; > Joel Stanley ; open list:ASPEED BMCs > ; open list:All patches CC here > > Cc: Troy Lee > Subject: Re: [PATCH v0 2/2] aspeed: fix hardcode boot address 0 > > On 2/5/24 10:14, Jamin Lin wrote: > > In the previous design of QEMU model for ASPEED SOCs, it set the boot > > address at 0 which was the hardcode setting for ast10x0, ast2600, > > ast2500 and ast2400. > > > > According to the design of ast2700, it has bootmcu which is used for > > executing SPL and initialize DRAM, then, CPUs(cortex-a35) execute > > u-boot, kernel and rofs. QEMU will only support CPU(coretax-a35) parts > > and the boot address is "0x4" for ast2700. > > On the previous SoC, the ASPEED_DEV_SPI_BOOT region is an alias, at 0x0, to > the FMC CE0 region, mapped at 0x2000. > > Is 0x4 (or 0x4000 ?) the address for FMC CE0 region on the > ast2700 ? or an alias ? > It is "0x4 "(64bits address). CPU is armv8 cortex-a35 which is 64 bits CPU. The dram base address is "0x4 ". The SPL base address is "0x1 ". FMC_CS0 region mapped at "0x1 " address. > What is the cortex-a35 reset address ? > > It would help to also introduce a basic skeleton of the ast2700 SoC. > AST2700 Primary Service Processor: Embedded quad-core ARM Cortex A35 64-bit RISC CPU Maximum running frequency: 1.6GHZ Support: MMU, FPU, NEON, trust-zone, GIC-500 controller and so on. BootMCU: Ibex-riscv 32bits riscv. Boot flow ROM Code -> BootMCU(SPL) -> CPU Cortex A35(U-boot-> kernel -> rofs) > Anyhow, this change makes sense. Could you please respin and also remove > ASPEED_SOC_SPI_BOOT_ADDR. ? > Okay, will remove it. > Thanks, > > C. > > > Therefore, fixed hardcode boot address 0. > > > > Signed-off-by: Troy Lee > > Signed-off-by: Jamin Lin > > --- > > hw/arm/aspeed.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index > > 218b81298e..82a92e8142 100644 > > --- a/hw/arm/aspeed.c > > +++ b/hw/arm/aspeed.c > > @@ -289,12 +289,14 @@ static void > aspeed_install_boot_rom(AspeedMachineState *bmc, BlockBackend *blk, > > uint64_t rom_size) > > { > > AspeedSoCState *soc = bmc->soc; > > +AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(soc); > > > > memory_region_init_rom(>boot_rom, NULL, > "aspeed.boot_rom", rom_size, > > _abort); > > memory_region_add_subregion_overlap(>spi_boot_container, > 0, > > >boot_rom, 1); > > -write_boot_rom(blk, ASPEED_SOC_SPI_BOOT_ADDR, rom_size, > _abort); > > +write_boot_rom(blk, sc->memmap[ASPEED_DEV_SPI_BOOT], > > + rom_size, _abort); > > } > > > > void aspeed_board_init_flashes(AspeedSMCState *s, const char > > *flashtype,
[PATCH V2] loongarch: Change the UEFI loading mode to loongarch
The UEFI loading mode in loongarch is very different from that in other architectures:loongarch's UEFI code is in rom, while other architectures' UEFI code is in flash. loongarch UEFI can be loaded as follows: -machine virt,pflash=pflash0-format -bios ./QEMU_EFI.fd Other architectures load UEFI using the following methods: -machine virt,pflash0=pflash0-format,pflash1=pflash1-format loongarch's UEFI loading method makes qemu and libvirt incompatible when using NVRAM, and the cost of loongarch's current loading method far outweighs the benefits, so we decided to use the same UEFI loading scheme as other architectures. Cc: Andrea Bolognani Cc: maob...@loongson.cn Cc: Philippe Mathieu-Daudé Cc: Song Gao Cc: zhaotian...@loongson.cn Signed-off-by: Xianglai Li --- hw/loongarch/acpi-build.c | 29 +-- hw/loongarch/virt.c | 101 ++-- include/hw/loongarch/virt.h | 10 ++-- 3 files changed, 107 insertions(+), 33 deletions(-) diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c index a1c4198741..6c75f216ea 100644 --- a/hw/loongarch/acpi-build.c +++ b/hw/loongarch/acpi-build.c @@ -314,16 +314,39 @@ static void build_pci_device_aml(Aml *scope, LoongArchMachineState *lams) static void build_flash_aml(Aml *scope, LoongArchMachineState *lams) { Aml *dev, *crs; +MemoryRegion *flash_mem; -hwaddr flash_base = VIRT_FLASH_BASE; -hwaddr flash_size = VIRT_FLASH_SIZE; +hwaddr flash0_base; +hwaddr flash0_size; + +hwaddr flash1_base; +hwaddr flash1_size; + +flash_mem = pflash_cfi01_get_memory(lams->flash[0]); +flash0_base = flash_mem->addr; +flash0_size = flash_mem->size; + +flash_mem = pflash_cfi01_get_memory(lams->flash[1]); +flash1_base = flash_mem->addr; +flash1_size = flash_mem->size; dev = aml_device("FLS0"); aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015"))); aml_append(dev, aml_name_decl("_UID", aml_int(0))); crs = aml_resource_template(); -aml_append(crs, aml_memory32_fixed(flash_base, flash_size, AML_READ_WRITE)); +aml_append(crs, aml_memory32_fixed(flash0_base, flash0_size, + AML_READ_WRITE)); +aml_append(dev, aml_name_decl("_CRS", crs)); +aml_append(scope, dev); + +dev = aml_device("FLS1"); +aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015"))); +aml_append(dev, aml_name_decl("_UID", aml_int(1))); + +crs = aml_resource_template(); +aml_append(crs, aml_memory32_fixed(flash1_base, flash1_size, + AML_READ_WRITE)); aml_append(dev, aml_name_decl("_CRS", crs)); aml_append(scope, dev); } diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 0ad7d8c887..a7b9199e70 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -54,7 +54,9 @@ struct loaderparams { const char *initrd_filename; }; -static void virt_flash_create(LoongArchMachineState *lams) +static PFlashCFI01 *virt_flash_create1(LoongArchMachineState *lams, + const char *name, + const char *alias_prop_name) { DeviceState *dev = qdev_new(TYPE_PFLASH_CFI01); @@ -66,45 +68,78 @@ static void virt_flash_create(LoongArchMachineState *lams) qdev_prop_set_uint16(dev, "id1", 0x18); qdev_prop_set_uint16(dev, "id2", 0x00); qdev_prop_set_uint16(dev, "id3", 0x00); -qdev_prop_set_string(dev, "name", "virt.flash"); -object_property_add_child(OBJECT(lams), "virt.flash", OBJECT(dev)); -object_property_add_alias(OBJECT(lams), "pflash", +qdev_prop_set_string(dev, "name", name); +object_property_add_child(OBJECT(lams), name, OBJECT(dev)); +object_property_add_alias(OBJECT(lams), alias_prop_name, OBJECT(dev), "drive"); +return PFLASH_CFI01(dev); +} -lams->flash = PFLASH_CFI01(dev); +static void virt_flash_create(LoongArchMachineState *lams) +{ +lams->flash[0] = virt_flash_create1(lams, "virt.flash0", "pflash0"); +lams->flash[1] = virt_flash_create1(lams, "virt.flash1", "pflash1"); } -static void virt_flash_map(LoongArchMachineState *lams, - MemoryRegion *sysmem) +static void virt_flash_map1(PFlashCFI01 *flash, +hwaddr base, hwaddr size, +MemoryRegion *sysmem) { -PFlashCFI01 *flash = lams->flash; DeviceState *dev = DEVICE(flash); -hwaddr base = VIRT_FLASH_BASE; -hwaddr size = VIRT_FLASH_SIZE; +BlockBackend *blk; +hwaddr real_size = size; + +blk = pflash_cfi01_get_blk(flash); +if (blk) { +real_size = blk_getlength(blk); +assert(real_size && real_size <= size); +} -assert(QEMU_IS_ALIGNED(size, VIRT_FLASH_SECTOR_SIZE)); -assert(size / VIRT_FLASH_SECTOR_SIZE <= UINT32_MAX); +assert(QEMU_IS_ALIGNED(real_size, VIRT_FLASH_SECTOR_SIZE)); +assert(real_size /
RE: [PATCH v0 2/2] aspeed: fix hardcode boot address 0
> -Original Message- > From: Philippe Mathieu-Daudé > Sent: Monday, February 5, 2024 9:20 PM > To: Jamin Lin ; Cédric Le Goater ; > Peter Maydell ; Andrew Jeffery > ; Joel Stanley ; open > list:ASPEED BMCs ; open list:All patches CC here > > Cc: Troy Lee > Subject: Re: [PATCH v0 2/2] aspeed: fix hardcode boot address 0 > > Hi Jamin, > > On 5/2/24 10:14, Jamin Lin via wrote: > > In the previous design of QEMU model for ASPEED SOCs, it set the boot > > address at 0 which was the hardcode setting for ast10x0, ast2600, > > ast2500 and ast2400. > > > > According to the design of ast2700, it has bootmcu which is used for > > executing SPL and initialize DRAM, > > Out of curiosity, what architecture is this MCU? MCU is riscv-ibex and its architecture is riscv-32. > > > then, CPUs(cortex-a35) > > execute u-boot, kernel and rofs. QEMU will only support > > CPU(coretax-a35) parts and the boot address is "0x4" for ast2700. > > OK, but I don't get how you get from here ... > Our design make MCU execute SPL and copy u-boot image from SPI to DRAM at address 0x4 at SPL boot stage. However, QEMU will only support to emulate CPU sides (coretex-a35) for ast2700, that was why we want to change the boot address at 0x4 And use the following start command by QEMU. ./qemu-system-aarch64 -M ast2750-evb -nographic -m 8G \ -device loader,addr=0x4,file=${IMGDIR}/u-boot-nodtb.bin,force-raw=on \ -device loader,addr=$((0x4 + ${UBOOT_SIZE})),file=${IMGDIR}/u-boot.dtb,force-raw=on \ --- --- By the way, I will send a new patch series to support ast2700 in two weeks and We set memory map for ast2700 as following. static const hwaddr aspeed_soc_ast2700_memmap[] = { [ASPEED_DEV_SPI_BOOT] = 0x4, [ASPEED_DEV_SRAM] = 0x1000, Jamin > > Therefore, fixed hardcode boot address 0. > > ... to here. > > > Signed-off-by: Troy Lee > > Signed-off-by: Jamin Lin > > --- > > hw/arm/aspeed.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index > > 218b81298e..82a92e8142 100644 > > --- a/hw/arm/aspeed.c > > +++ b/hw/arm/aspeed.c > > @@ -289,12 +289,14 @@ static void > aspeed_install_boot_rom(AspeedMachineState *bmc, BlockBackend *blk, > > uint64_t rom_size) > > { > > AspeedSoCState *soc = bmc->soc; > > +AspeedSoCClass *sc = ASPEED_SOC_GET_CLASS(soc); > > > > memory_region_init_rom(>boot_rom, NULL, > "aspeed.boot_rom", rom_size, > > _abort); > > memory_region_add_subregion_overlap(>spi_boot_container, > 0, > > >boot_rom, 1); > > -write_boot_rom(blk, ASPEED_SOC_SPI_BOOT_ADDR, rom_size, > _abort); > > +write_boot_rom(blk, sc->memmap[ASPEED_DEV_SPI_BOOT], > > + rom_size, _abort); > > Reviewed-by: Philippe Mathieu-Daudé > > > } > > > > void aspeed_board_init_flashes(AspeedSMCState *s, const char > > *flashtype,
Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
Hi Eugenio, I thought this new code looks good to me and the original issue I saw with x-svq=on should be gone. However, after rebase my tree on top of this, there's a new failure I found around setting up guest mappings at early boot, please see attached the specific QEMU config and corresponding event traces. Haven't checked into the detail yet, thinking you would need to be aware of ahead. Regards, -Siwei On 2/1/2024 10:09 AM, Eugenio Pérez wrote: As we are moving to keep the mapping through all the vdpa device life instead of resetting it at VirtIO reset, we need to move all its dependencies to the initialization too. In particular devices with x-svq=on need a valid iova_tree from the beginning. Simplify the code also consolidating the two creation points: the first data vq in case of SVQ active and CVQ start in case only CVQ uses it. Suggested-by: Si-Wei Liu Signed-off-by: Eugenio Pérez --- include/hw/virtio/vhost-vdpa.h | 16 ++- net/vhost-vdpa.c | 36 +++--- 2 files changed, 18 insertions(+), 34 deletions(-) diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h index 03ed2f2be3..ad754eb803 100644 --- a/include/hw/virtio/vhost-vdpa.h +++ b/include/hw/virtio/vhost-vdpa.h @@ -37,7 +37,21 @@ typedef struct vhost_vdpa_shared { struct vhost_vdpa_iova_range iova_range; QLIST_HEAD(, vdpa_iommu) iommu_list; -/* IOVA mapping used by the Shadow Virtqueue */ +/* + * IOVA mapping used by the Shadow Virtqueue + * + * It is shared among all ASID for simplicity, whether CVQ shares ASID with + * guest or not: + * - Memory listener need access to guest's memory addresses allocated in + * the IOVA tree. + * - There should be plenty of IOVA address space for both ASID not to + * worry about collisions between them. Guest's translations are still + * validated with virtio virtqueue_pop so there is no risk for the guest + * to access memory that it shouldn't. + * + * To allocate a iova tree per ASID is doable but it complicates the code + * and it is not worth it for the moment. + */ VhostIOVATree *iova_tree; /* Copy of backend features */ diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index cc589dd148..57edcf34d0 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -232,6 +232,7 @@ static void vhost_vdpa_cleanup(NetClientState *nc) return; } qemu_close(s->vhost_vdpa.shared->device_fd); +g_clear_pointer(>vhost_vdpa.shared->iova_tree, vhost_iova_tree_delete); g_free(s->vhost_vdpa.shared); } @@ -329,16 +330,8 @@ static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data) static void vhost_vdpa_net_data_start_first(VhostVDPAState *s) { -struct vhost_vdpa *v = >vhost_vdpa; - migration_add_notifier(>migration_state, vdpa_net_migration_state_notifier); - -/* iova_tree may be initialized by vhost_vdpa_net_load_setup */ -if (v->shadow_vqs_enabled && !v->shared->iova_tree) { -v->shared->iova_tree = vhost_iova_tree_new(v->shared->iova_range.first, - v->shared->iova_range.last); -} } static int vhost_vdpa_net_data_start(NetClientState *nc) @@ -383,19 +376,12 @@ static int vhost_vdpa_net_data_load(NetClientState *nc) static void vhost_vdpa_net_client_stop(NetClientState *nc) { VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc); -struct vhost_dev *dev; assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA); if (s->vhost_vdpa.index == 0) { migration_remove_notifier(>migration_state); } - -dev = s->vhost_vdpa.dev; -if (dev->vq_index + dev->nvqs == dev->vq_index_end) { -g_clear_pointer(>vhost_vdpa.shared->iova_tree, -vhost_iova_tree_delete); -} } static NetClientInfo net_vhost_vdpa_info = { @@ -557,24 +543,6 @@ out: return 0; } -/* - * If other vhost_vdpa already have an iova_tree, reuse it for simplicity, - * whether CVQ shares ASID with guest or not, because: - * - Memory listener need access to guest's memory addresses allocated in - * the IOVA tree. - * - There should be plenty of IOVA address space for both ASID not to - * worry about collisions between them. Guest's translations are still - * validated with virtio virtqueue_pop so there is no risk for the guest - * to access memory that it shouldn't. - * - * To allocate a iova tree per ASID is doable but it complicates the code - * and it is not worth it for the moment. - */ -if (!v->shared->iova_tree) { -v->shared->iova_tree = vhost_iova_tree_new(v->shared->iova_range.first, - v->shared->iova_range.last); -} - r =
[PATCH v2 1/3] Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it
Rather than reading P0 from the env, pass it explicitly Signed-off-by: Taylor Simpson --- target/hexagon/macros.h | 2 +- target/hexagon/hex_common.py | 10 ++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h index 1376d6ccc1..9c700ce8ef 100644 --- a/target/hexagon/macros.h +++ b/target/hexagon/macros.h @@ -358,7 +358,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift) #endif #define fREAD_PC() (PC) -#define fREAD_P0() (env->pred[0]) +#define fREAD_P0() (P0) #define fCHECK_PCALIGN(A) diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py index 195620c7ec..2dbd0ea737 100755 --- a/target/hexagon/hex_common.py +++ b/target/hexagon/hex_common.py @@ -197,6 +197,10 @@ def get_tagimms(): return dict(zip(tags, list(map(compute_tag_immediates, tags +def need_p0(tag): +return "A_IMPLICIT_READS_P0" in attribdict[tag] + + def need_slot(tag): if ( "A_CVI_SCATTER" not in attribdict[tag] @@ -1118,6 +1122,12 @@ def helper_args(tag, regs, imms): "tcg_constant_tl(ctx->next_PC)", "target_ulong next_PC" )) +if need_p0(tag): +args.append(HelperArg( +"i32", +"hex_pred[0]", +"uint32_t P0" +)) if need_slot(tag): args.append(HelperArg( "i32", -- 2.34.1
[PATCH v2 2/3] Hexagon (target/hexagon) Pass SP explicitly to helpers that need it
Rather than reading SP from the env, pass it explicitly Signed-off-by: Taylor Simpson --- target/hexagon/macros.h | 2 +- target/hexagon/attribs_def.h.inc | 1 + target/hexagon/hex_common.py | 11 +++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h index 9c700ce8ef..4b8665a9da 100644 --- a/target/hexagon/macros.h +++ b/target/hexagon/macros.h @@ -343,7 +343,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift) #define fREAD_LR() (env->gpr[HEX_REG_LR]) -#define fREAD_SP() (env->gpr[HEX_REG_SP]) +#define fREAD_SP() (SP) #define fREAD_LC0 (env->gpr[HEX_REG_LC0]) #define fREAD_LC1 (env->gpr[HEX_REG_LC1]) #define fREAD_SA0 (env->gpr[HEX_REG_SA0]) diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc index 87942d46f4..8949ee09cf 100644 --- a/target/hexagon/attribs_def.h.inc +++ b/target/hexagon/attribs_def.h.inc @@ -117,6 +117,7 @@ DEF_ATTRIB(IMPLICIT_READS_P1, "Reads the P1 register", "", "") DEF_ATTRIB(IMPLICIT_READS_P2, "Reads the P2 register", "", "") DEF_ATTRIB(IMPLICIT_READS_P3, "Reads the P3 register", "", "") DEF_ATTRIB(IMPLICIT_WRITES_USR, "May write USR", "", "") +DEF_ATTRIB(IMPLICIT_READS_SP, "Reads the SP register", "", "") DEF_ATTRIB(COMMUTES, "The operation is communitive", "", "") DEF_ATTRIB(DEALLOCRET, "dealloc_return", "", "") DEF_ATTRIB(DEALLOCFRAME, "deallocframe", "", "") diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py index 2dbd0ea737..eb28aa98fe 100755 --- a/target/hexagon/hex_common.py +++ b/target/hexagon/hex_common.py @@ -101,6 +101,7 @@ def calculate_attribs(): add_qemu_macro_attrib('fLSBNEW1', 'A_IMPLICIT_READS_P1') add_qemu_macro_attrib('fLSBNEW1NOT', 'A_IMPLICIT_READS_P1') add_qemu_macro_attrib('fREAD_P3', 'A_IMPLICIT_READS_P3') +add_qemu_macro_attrib('fREAD_SP', 'A_IMPLICIT_READS_SP') # Recurse down macros, find attributes from sub-macros macroValues = list(macros.values()) @@ -201,6 +202,10 @@ def need_p0(tag): return "A_IMPLICIT_READS_P0" in attribdict[tag] +def need_sp(tag): +return "A_IMPLICIT_READS_SP" in attribdict[tag] + + def need_slot(tag): if ( "A_CVI_SCATTER" not in attribdict[tag] @@ -1128,6 +1133,12 @@ def helper_args(tag, regs, imms): "hex_pred[0]", "uint32_t P0" )) +if need_sp(tag): +args.append(HelperArg( +"i32", +"hex_gpr[HEX_REG_SP]", +"uint32_t SP" +)) if need_slot(tag): args.append(HelperArg( "i32", -- 2.34.1
[PATCH v2 3/3] Hexagon (target/hexagon) Only pass env to generated helper when needed
Currently, we pass env to every generated helper. When the semantics of the instruction only depend on the arguments, this is unnecessary and adds extra overhead to the helper call. We add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get the ptr to env. The A2_nop and SA1_setin1 instructions end up with no arguments. This results in a "old-style function definition" error from the compiler, so we write overrides for them. With this change, the number of helpers with env argument is idef-parser enabled:329 total, 23 with env idef-parser disabled: 1543 total, 550 with env Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg.h| 3 +++ target/hexagon/gen_helper_protos.py | 10 +- target/hexagon/hex_common.py| 23 ++- 3 files changed, 30 insertions(+), 6 deletions(-) diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h index 1c4391b415..e9ac2e3fe0 100644 --- a/target/hexagon/gen_tcg.h +++ b/target/hexagon/gen_tcg.h @@ -1369,3 +1369,6 @@ gen_helper_raise_exception(tcg_env, excp); \ } while (0) #endif + +#define fGEN_TCG_A2_nop(SHORTCODE) do { } while (0) +#define fGEN_TCG_SA1_setin1(SHORTCODE) tcg_gen_movi_tl(RdV, -1) diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py index c82b0f54e4..6e98429752 100755 --- a/target/hexagon/gen_helper_protos.py +++ b/target/hexagon/gen_helper_protos.py @@ -40,7 +40,15 @@ def gen_helper_prototype(f, tag, tagregs, tagimms): declared.append(arg.proto_arg) arguments = ", ".join(declared) -f.write(f"DEF_HELPER_{len(declared) - 1}({tag}, {arguments})\n") + +## Add the TCG_CALL_NO_RWG_SE flag to helpers that don't take the env +## argument and aren't HVX instructions. Since HVX instructions take +## pointers to their arguments, they will have side effects. +if hex_common.need_env(tag) or hex_common.is_hvx_insn(tag): +f.write(f"DEF_HELPER_{len(declared) - 1}({tag}, {arguments})\n") +else: +f.write(f"DEF_HELPER_FLAGS_{len(declared) - 1}({tag}, " +f"TCG_CALL_NO_RWG_SE, {arguments})\n") def main(): diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py index eb28aa98fe..ee08195aba 100755 --- a/target/hexagon/hex_common.py +++ b/target/hexagon/hex_common.py @@ -206,6 +206,18 @@ def need_sp(tag): return "A_IMPLICIT_READS_SP" in attribdict[tag] +def is_hvx_insn(tag): +return "A_CVI" in attribdict[tag] + + +def need_env(tag): +return ("A_STORE" in attribdict[tag] or +"A_LOAD" in attribdict[tag] or +"A_CVI_GATHER" in attribdict[tag] or +"A_CVI_SCATTER" in attribdict[tag] or +"A_IMPLICIT_WRITES_USR" in attribdict[tag]) + + def need_slot(tag): if ( "A_CVI_SCATTER" not in attribdict[tag] @@ -1069,11 +1081,12 @@ def helper_args(tag, regs, imms): args = [] ## First argument is the CPU state -args.append(HelperArg( -"env", -"tcg_env", -"CPUHexagonState *env" -)) +if need_env(tag): +args.append(HelperArg( +"env", +"tcg_env", +"CPUHexagonState *env" +)) ## For predicated instructions, we pass in the destination register if is_predicated(tag): -- 2.34.1
[PATCH v2 0/3] Hexagon (target/hexagon) Only pass env to generated helper when needed
Currently, we pass env to every generated helper. When the semantics of the instruction only depend on the arguments, this is unnecessary and adds extra overhead to the helper call. Changes in v2 - Separate patches to pass P0 and SP explicitly to helpers that need it - Add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get ptr to env Taylor Simpson (3): Hexagon (target/hexagon) Pass P0 explicitly to helpers that need it Hexagon (target/hexagon) Pass SP explicitly to helpers that need it Hexagon (target/hexagon) Only pass env to generated helper when needed target/hexagon/gen_tcg.h| 3 ++ target/hexagon/macros.h | 4 +-- target/hexagon/attribs_def.h.inc| 1 + target/hexagon/gen_helper_protos.py | 10 ++- target/hexagon/hex_common.py| 44 + 5 files changed, 54 insertions(+), 8 deletions(-) -- 2.34.1
[PATCH v3 0/4] make vm-build-freebsd fixes
v2: https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00890.html v2 -> v3: Structure the meson check similar to have_asan_fiber; Reduce the context size a little (Philippe). v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05155.html v1 -> v2: Link with libinotify instead of disabling the inotify support (Daniel). Use a bit more context lines in order to prevent the incorrect application of the test patch. Hi, I needed to verify that my qemu-user changes didn't break BSD, and Daniel Berrange suggested vm-build-freebsd on IRC. I had several problems with it, which this series resolves. Best regards, Ilya Ilya Leoshkevich (4): tests/vm: Set UseDNS=no in the sshd configuration tests/vm/freebsd: Reload the sshd configuration test-util-filemonitor: Adapt to the FreeBSD inotify rename semantics meson: Link with libinotify on FreeBSD meson.build| 23 +++ tests/unit/test-util-filemonitor.c | 8 tests/vm/basevm.py | 2 ++ tests/vm/freebsd | 1 + util/meson.build | 6 +- 5 files changed, 35 insertions(+), 5 deletions(-) -- 2.43.0
[PATCH v3 2/4] tests/vm/freebsd: Reload the sshd configuration
After console_sshd_config(), the SSH server needs to be nudged to pick up the new configs. The scripts for the other BSD flavors already do this with a reboot, but a simple reload is sufficient. Reviewed-by: Thomas Huth Signed-off-by: Ilya Leoshkevich --- tests/vm/freebsd | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/vm/freebsd b/tests/vm/freebsd index b581bd17fb7..1247f40a385 100755 --- a/tests/vm/freebsd +++ b/tests/vm/freebsd @@ -91,40 +91,41 @@ class FreeBSDVM(basevm.BaseVM): self.console_wait_send("Use an empty password", "\n") self.console_wait_send("Use a random password", "\n") self.console_wait("Enter password:") self.console_send("%s\n" % self._config["guest_pass"]) self.console_wait("Enter password again:") self.console_send("%s\n" % self._config["guest_pass"]) self.console_wait_send("Lock out", "\n") self.console_wait_send("OK","yes\n") self.console_wait_send("Add another user", "no\n") self.console_wait_send("~ #", "exit\n") # setup qemu user prompt = "$" self.console_ssh_init(prompt, self._config["guest_user"], self._config["guest_pass"]) self.console_wait_send(prompt, "exit\n") # setup root user prompt = "root@freebsd:~ #" self.console_ssh_init(prompt, "root", self._config["root_pass"]) self.console_sshd_config(prompt) +self.console_wait_send(prompt, "service sshd reload\n") # setup virtio-blk #1 (tarfile) self.console_wait(prompt) self.console_send("echo 'chmod 666 /dev/vtbd1' >> /etc/rc.local\n") pkgs = self.get_qemu_packages_from_lcitool_json() self.print_step("Installing packages") self.ssh_root_check("pkg install -y %s\n" % " ".join(pkgs)) # shutdown self.ssh_root(self.poweroff) self.wait() if os.path.exists(img): os.remove(img) os.rename(img_tmp, img) self.print_step("All done") if __name__ == "__main__": sys.exit(basevm.main(FreeBSDVM, config=FREEBSD_CONFIG)) -- 2.43.0
[PATCH v3 4/4] meson: Link with libinotify on FreeBSD
make vm-build-freebsd fails with: ld: error: undefined symbol: inotify_init1 >>> referenced by filemonitor-inotify.c:183 (../src/util/filemonitor-inotify.c:183) >>> util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in archive libqemuutil.a On FreeBSD the inotify functions are defined in libinotify.so. Add it to the dependencies. Signed-off-by: Ilya Leoshkevich --- meson.build | 23 +++ util/meson.build | 6 +- 2 files changed, 24 insertions(+), 5 deletions(-) diff --git a/meson.build b/meson.build index b5d6dc94a83..e5d6f2d057e 100644 --- a/meson.build +++ b/meson.build @@ -2367,60 +2367,72 @@ if rbd.found() endif if rdma.found() config_host_data.set('HAVE_IBV_ADVISE_MR', cc.has_function('ibv_advise_mr', dependencies: rdma, prefix: '#include ')) endif have_asan_fiber = false if get_option('sanitizers') and \ not cc.has_function('__sanitizer_start_switch_fiber', args: '-fsanitize=address', prefix: '#include ') warning('Missing ASAN due to missing fiber annotation interface') warning('Without code annotation, the report may be inferior.') else have_asan_fiber = true endif config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber) +have_inotify_init = cc.has_header_symbol('sys/inotify.h', 'inotify_init') +have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 'inotify_init1') +inotify = not_found +if (have_inotify_init or have_inotify_init1) and host_os == 'freebsd' + # libinotify-kqueue + inotify = cc.find_library('inotify') + if have_inotify_init +have_inotify_init = inotify.found() + endif + if have_inotify_init1 +have_inotify_init1 = inotify.found() + endif +endif +config_host_data.set('CONFIG_INOTIFY', have_inotify_init) +config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1) + # has_header_symbol config_host_data.set('CONFIG_BLKZONED', cc.has_header_symbol('linux/blkzoned.h', 'BLKOPENZONE')) config_host_data.set('CONFIG_EPOLL_CREATE1', cc.has_header_symbol('sys/epoll.h', 'epoll_create1')) config_host_data.set('CONFIG_FALLOCATE_PUNCH_HOLE', cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_PUNCH_HOLE') and cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_KEEP_SIZE')) config_host_data.set('CONFIG_FALLOCATE_ZERO_RANGE', cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_ZERO_RANGE')) config_host_data.set('CONFIG_FIEMAP', cc.has_header('linux/fiemap.h') and cc.has_header_symbol('linux/fs.h', 'FS_IOC_FIEMAP')) config_host_data.set('CONFIG_GETRANDOM', cc.has_function('getrandom') and cc.has_header_symbol('sys/random.h', 'GRND_NONBLOCK')) -config_host_data.set('CONFIG_INOTIFY', - cc.has_header_symbol('sys/inotify.h', 'inotify_init')) -config_host_data.set('CONFIG_INOTIFY1', - cc.has_header_symbol('sys/inotify.h', 'inotify_init1')) config_host_data.set('CONFIG_PRCTL_PR_SET_TIMERSLACK', cc.has_header_symbol('sys/prctl.h', 'PR_SET_TIMERSLACK')) config_host_data.set('CONFIG_RTNETLINK', cc.has_header_symbol('linux/rtnetlink.h', 'IFLA_PROTO_DOWN')) config_host_data.set('CONFIG_SYSMACROS', cc.has_header_symbol('sys/sysmacros.h', 'makedev')) config_host_data.set('HAVE_OPTRESET', cc.has_header_symbol('getopt.h', 'optreset')) config_host_data.set('HAVE_IPPROTO_MPTCP', cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP')) # has_member config_host_data.set('HAVE_SIGEV_NOTIFY_THREAD_ID', cc.has_member('struct sigevent', 'sigev_notify_thread_id', prefix: '#include ')) config_host_data.set('HAVE_STRUCT_STAT_ST_ATIM', cc.has_member('struct stat', 'st_atim', prefix: '#include ')) config_host_data.set('HAVE_BLK_ZONE_REP_CAPACITY', cc.has_member('struct blk_zone', 'capacity', @@ -4390,40 +4402,43 @@ if host_os == 'windows' endif summary_info += {'seccomp support': seccomp} summary_info += {'GlusterFS support': glusterfs} summary_info += {'hv-balloon support': hv_balloon} summary_info += {'TPM support': have_tpm} summary_info += {'libssh support':libssh} summary_info += {'lzo support': lzo} summary_info += {'snappy support':snappy} summary_info += {'bzip2 support': libbzip2} summary_info += {'lzfse support': liblzfse} summary_info += {'zstd support': zstd} summary_info += {'NUMA host support': numa} summary_info += {'capstone': capstone} summary_info += {'libpmem support': libpmem} summary_info
[PATCH v3 1/4] tests/vm: Set UseDNS=no in the sshd configuration
make vm-build-freebsd sometimes fails with "Connection timed out during banner exchange". The client strace shows: 13:59:30 write(3, "SSH-2.0-OpenSSH_9.3\r\n", 21) = 21 13:59:30 getpid() = 252655 13:59:30 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "S", 1)= 1 13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "S", 1)= 1 13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "H", 1)= 1 There is a 2s delay during connection, and ConnectTimeout is set to 1. Raising it makes the issue go away, but we can do better. The server truss shows: 888: 27.811414714 socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 5 (0x5) 888: 27.811765030 connect(5,{ AF_INET 10.0.2.3:53 },16) = 0 (0x0) 888: 27.812166941 sendto(5,"\^Z/\^A\0\0\^A\0\0\0\0\0\0\^A2"...,39,0,NULL,0) = 39 (0x27) 888: 29.363970743 poll({ 5/POLLRDNORM },1,5000) = 1 (0x1) So the delay is due to a DNS query. Disable DNS queries in the server config. Reviewed-by: Thomas Huth Signed-off-by: Ilya Leoshkevich --- tests/vm/basevm.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index 61725b83254..c0d62c08031 100644 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -406,40 +406,42 @@ def console_send(self, command): vm.console_socket.send(char.encode("utf-8")) time.sleep(0.01) def console_wait_send(self, wait, command): self.console_wait(wait) self.console_send(command) def console_ssh_init(self, prompt, user, pw): sshkey_cmd = "echo '%s' > .ssh/authorized_keys\n" \ % self._config['ssh_pub_key'].rstrip() self.console_wait_send("login:","%s\n" % user) self.console_wait_send("Password:", "%s\n" % pw) self.console_wait_send(prompt, "mkdir .ssh\n") self.console_wait_send(prompt, sshkey_cmd) self.console_wait_send(prompt, "chmod 755 .ssh\n") self.console_wait_send(prompt, "chmod 644 .ssh/authorized_keys\n") def console_sshd_config(self, prompt): self.console_wait(prompt) self.console_send("echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config\n") +self.console_wait(prompt) +self.console_send("echo 'UseDNS no' >> /etc/ssh/sshd_config\n") for var in self.envvars: self.console_wait(prompt) self.console_send("echo 'AcceptEnv %s' >> /etc/ssh/sshd_config\n" % var) def print_step(self, text): sys.stderr.write("### %s ...\n" % text) def wait_ssh(self, wait_root=False, seconds=300, cmd="exit 0"): # Allow more time for VM to boot under TCG. if not kvm_available(self.arch): seconds *= self.tcg_timeout_multiplier starttime = datetime.datetime.now() endtime = starttime + datetime.timedelta(seconds=seconds) cmd_success = False while datetime.datetime.now() < endtime: if wait_root and self.ssh_root(cmd) == 0: cmd_success = True break elif self.ssh(cmd) == 0: cmd_success = True -- 2.43.0
[PATCH v3 3/4] test-util-filemonitor: Adapt to the FreeBSD inotify rename semantics
Unlike on Linux, on FreeBSD renaming a file when the destination already exists results in an IN_DELETE event for that existing file: $ FILEMONITOR_DEBUG=1 build/tests/unit/test-util-filemonitor Rename /tmp/test-util-filemonitor-K13LI2/fish/one.txt -> /tmp/test-util-filemonitor-K13LI2/two.txt Event id=2 event=2 file=one.txt Queue event id 2 event 2 file one.txt Queue event id 1 event 2 file two.txt Queue event id 10002 event 2 file two.txt Queue event id 1 event 0 file two.txt Queue event id 10002 event 0 file two.txt Event id=1 event=0 file=two.txt Expected event 0 but got 2 This difference in behavior is not expected to break the real users, so teach the test to accept it. Suggested-by: Daniel P. Berrange Signed-off-by: Ilya Leoshkevich --- tests/unit/test-util-filemonitor.c | 8 1 file changed, 8 insertions(+) diff --git a/tests/unit/test-util-filemonitor.c b/tests/unit/test-util-filemonitor.c index a22de275955..02e67fc96ac 100644 --- a/tests/unit/test-util-filemonitor.c +++ b/tests/unit/test-util-filemonitor.c @@ -343,40 +343,48 @@ test_file_monitor_events(void) .filesrc = "fish/", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH, .filesrc = "fish/one.txt", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_CREATE, .filesrc = "fish/one.txt", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH, .filesrc = "fish/one.txt", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_RENAME, .filesrc = "fish/one.txt", .filedst = "two.txt", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, +#ifdef __FreeBSD__ +{ .type = QFILE_MONITOR_TEST_OP_EVENT, + .filesrc = "two.txt", .watchid = , + .eventid = QFILE_MONITOR_EVENT_DELETED }, +{ .type = QFILE_MONITOR_TEST_OP_EVENT, + .filesrc = "two.txt", .watchid = , + .eventid = QFILE_MONITOR_EVENT_DELETED }, +#endif { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_RMDIR, .filesrc = "fish", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "", .watchid = , .eventid = QFILE_MONITOR_EVENT_IGNORED, .swapnext = true }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "fish", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH, .filesrc = "fish", .watchid = }, -- 2.43.0
Re: [PATCH v3 0/2] riscv: add rv32i,rv32e and rv64e CPUs
On Mon, Jan 22, 2024 at 10:34 PM Daniel Henrique Barboza wrote: > > Hi, > > This v3 has the same patches from v2 rebased with a newer > riscv-to-apply.next branch (@ 096b6b07298). > > No other changes made. All patches acked. > > v2 link: > https://lore.kernel.org/qemu-riscv/20240108161903.353648-1-dbarb...@ventanamicro.com/ > > Daniel Henrique Barboza (2): > target/riscv/cpu.c: add riscv_bare_cpu_init() > target/riscv: add rv32i, rv32e and rv64e CPUs Thanks! Applied to riscv-to-apply.next Alistair > > target/riscv/cpu-qom.h | 3 ++ > target/riscv/cpu.c | 64 -- > 2 files changed, 52 insertions(+), 15 deletions(-) > > -- > 2.43.0 > >
Re: [PATCH] docs/style: allow C99 mixed declarations
Stefan Hajnoczi writes: > C99 mixed declarations support interleaving of local variable > declarations and code. > > The coding style "generally" forbids C99 mixed declarations with some > exceptions to the rule. This rule is not checked by checkpatch.pl and > naturally there are violations in the source tree. > > While contemplating adding another exception, I came to the conclusion > that the best location for declarations depends on context. Let the > programmer declare variables where it is best for legibility. Don't try > to define all possible scenarios/exceptions. > > Signed-off-by: Stefan Hajnoczi > --- > docs/devel/style.rst | 20 > 1 file changed, 20 deletions(-) While I may be convinced there may be some cases where mixed declarations could make things simpler/clearer I don't support removing all the guidance and leaving it as a free for all as long as the compiler is happy. -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH v3 2/3] hw/virtio: cleanup shared resources
Albert Esteve writes: > Ensure that we cleanup all virtio shared > resources when the vhost devices is cleaned > up (after a hot unplug, or a crash). > > To do so, we add a new function to the virtio_dmabuf > API called `virtio_dmabuf_vhost_cleanup`, which > loop through the table and removes all > resources owned by the vhost device parameter. > > Also, add a test to verify that the new > function in the API behaves as expected. > > Signed-off-by: Albert Esteve > Acked-by: Stefan Hajnoczi > --- > hw/display/virtio-dmabuf.c| 22 + > hw/virtio/vhost.c | 3 +++ > include/hw/virtio/virtio-dmabuf.h | 10 ++ > tests/unit/test-virtio-dmabuf.c | 33 +++ > 4 files changed, 68 insertions(+) > > diff --git a/hw/display/virtio-dmabuf.c b/hw/display/virtio-dmabuf.c > index 3dba4577ca..6688809777 100644 > --- a/hw/display/virtio-dmabuf.c > +++ b/hw/display/virtio-dmabuf.c > @@ -136,6 +136,28 @@ SharedObjectType virtio_object_type(const QemuUUID *uuid) > return vso->type; > } > > +static bool virtio_dmabuf_resource_is_owned(gpointer key, > +gpointer value, > +gpointer dev) > +{ > +VirtioSharedObject *vso; > + > +vso = (VirtioSharedObject *) value; > +return vso->type == TYPE_VHOST_DEV && vso->value == dev; It's a bit surprising to see vso->value being an anonymous gpointer rather than the proper type and a bit confusing between value and vso->value. > +} > + > +int virtio_dmabuf_vhost_cleanup(struct vhost_dev *dev) > +{ > +int num_removed; > + > +g_mutex_lock(); > +num_removed = g_hash_table_foreach_remove( > +resource_uuids, (GHRFunc) virtio_dmabuf_resource_is_owned, dev); > +g_mutex_unlock(); I'll note if we used a QemuMutex for the lock we could: - use WITH_QEMU_LOCK_GUARD() { } - enable QSP porfiling for the lock > + > +return num_removed; > +} > + > void virtio_free_resources(void) > { > g_mutex_lock(); > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 2c9ac79468..c5622eac14 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -16,6 +16,7 @@ > #include "qemu/osdep.h" > #include "qapi/error.h" > #include "hw/virtio/vhost.h" > +#include "hw/virtio/virtio-dmabuf.h" > #include "qemu/atomic.h" > #include "qemu/range.h" > #include "qemu/error-report.h" > @@ -1599,6 +1600,8 @@ void vhost_dev_cleanup(struct vhost_dev *hdev) > migrate_del_blocker(>migration_blocker); > g_free(hdev->mem); > g_free(hdev->mem_sections); > +/* free virtio shared objects */ > +virtio_dmabuf_vhost_cleanup(hdev); > if (hdev->vhost_ops) { > hdev->vhost_ops->vhost_backend_cleanup(hdev); > } > diff --git a/include/hw/virtio/virtio-dmabuf.h > b/include/hw/virtio/virtio-dmabuf.h > index 627c3b6db7..73f70fb482 100644 > --- a/include/hw/virtio/virtio-dmabuf.h > +++ b/include/hw/virtio/virtio-dmabuf.h > @@ -91,6 +91,16 @@ struct vhost_dev *virtio_lookup_vhost_device(const > QemuUUID *uuid); > */ > SharedObjectType virtio_object_type(const QemuUUID *uuid); > > +/** > + * virtio_dmabuf_vhost_cleanup() - Destroys all entries of the shared > + * resources lookup table that are owned by the vhost backend > + * @dev: the pointer to the vhost device that owns the entries. Data is owned > + * by the called of the function. > + * > + * Return: the number of resource entries removed. > + */ > +int virtio_dmabuf_vhost_cleanup(struct vhost_dev *dev); > + > /** > * virtio_free_resources() - Destroys all keys and values of the shared > * resources lookup table, and frees them > diff --git a/tests/unit/test-virtio-dmabuf.c b/tests/unit/test-virtio-dmabuf.c > index a45ec52f42..1c8123c2d2 100644 > --- a/tests/unit/test-virtio-dmabuf.c > +++ b/tests/unit/test-virtio-dmabuf.c > @@ -103,6 +103,38 @@ static void test_add_invalid_resource(void) > } > } > > +static void test_cleanup_res(void) > +{ > +QemuUUID uuids[20], uuid_alt; > +struct vhost_dev *dev = g_new0(struct vhost_dev, 1); > +struct vhost_dev *dev_alt = g_new0(struct vhost_dev, 1); > +int i, num_removed; > + > +for (i = 0; i < ARRAY_SIZE(uuids); ++i) { > +qemu_uuid_generate([i]); > +virtio_add_vhost_device([i], dev); > +/* vhost device is found */ > +g_assert(virtio_lookup_vhost_device([i]) != NULL); > +} > +qemu_uuid_generate(_alt); > +virtio_add_vhost_device(_alt, dev_alt); > +/* vhost device is found */ > +g_assert(virtio_lookup_vhost_device(_alt) != NULL); > +/* cleanup all dev resources */ > +num_removed = virtio_dmabuf_vhost_cleanup(dev); > +g_assert_cmpint(num_removed, ==, ARRAY_SIZE(uuids)); > +for (i = 0; i < ARRAY_SIZE(uuids); ++i) { > +/* None of the dev resources is found after free'd */ > +g_assert_cmpint(virtio_lookup_dmabuf([i]), ==, -1); > +} > +/* uuid_alt
[PATCH] hw/i386/multiboot: Make bootloader_name static
Global variable is only used in a single file and should therefor be set to static in order to avoid name collisions. Signed-off-by: Jens Nyberg --- hw/i386/multiboot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c index 3332712ab3..6bf6398d01 100644 --- a/hw/i386/multiboot.c +++ b/hw/i386/multiboot.c @@ -102,7 +102,7 @@ typedef struct { int mb_mods_count; } MultibootState; -const char *bootloader_name = "qemu"; +static const char *bootloader_name = "qemu"; static uint32_t mb_add_cmdline(MultibootState *s, const char *cmdline) { -- 2.43.0
[PATCH v5 0/4] target/s390x: Emulate CVDG and CVB*
v4: https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00434.html v4 -> v5: Remove a redundant CVBG overflow check; Write the comment explaining the remaining CVBG overflow check; Add R-bs to the tests (Thomas). v3: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06664.html v3 -> v4: Implement CVB error handling (David/Thomas). v2: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05048.html v2 -> v3: Resurrect an old CVB* patch (Thomas). Add Richard's R-b. v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02865.html v1 -> v2: Fix !CONFIG_INT128 builds (Richard). Hi, Ido reported that we are missing the CVDG emulation (which is very similar to the existing CVD emulation). This series adds it along with a test. Best regards, Ilya Ilya Leoshkevich (4): target/s390x: Emulate CVDG target/s390x: Emulate CVB, CVBY and CVBG tests/tcg/s390x: Test CONVERT TO DECIMAL tests/tcg/s390x: Test CONVERT TO BINARY target/s390x/helper.h| 3 + target/s390x/tcg/insn-data.h.inc | 5 ++ target/s390x/tcg/int_helper.c| 97 + target/s390x/tcg/translate.c | 24 tests/tcg/s390x/Makefile.target | 2 + tests/tcg/s390x/cvb.c| 102 +++ tests/tcg/s390x/cvd.c| 63 +++ 7 files changed, 296 insertions(+) create mode 100644 tests/tcg/s390x/cvb.c create mode 100644 tests/tcg/s390x/cvd.c -- 2.43.0
[PATCH v5 4/4] tests/tcg/s390x: Test CONVERT TO BINARY
Check the CVB's, CVBY's, and CVBG's corner cases. Co-developed-by: Pavel Zbitskiy Reviewed-by: Thomas Huth Tested-by: Thomas Huth Signed-off-by: Ilya Leoshkevich --- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/cvb.c | 102 2 files changed, 103 insertions(+) create mode 100644 tests/tcg/s390x/cvb.c diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index 04e4bddd83d..e2aba2ec274 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -46,6 +46,7 @@ TESTS+=laalg TESTS+=add-logical-with-carry TESTS+=lae TESTS+=cvd +TESTS+=cvb cdsg: CFLAGS+=-pthread cdsg: LDFLAGS+=-pthread diff --git a/tests/tcg/s390x/cvb.c b/tests/tcg/s390x/cvb.c new file mode 100644 index 000..e1735f6b81c --- /dev/null +++ b/tests/tcg/s390x/cvb.c @@ -0,0 +1,102 @@ +/* + * Test the CONVERT TO BINARY instruction. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ +#include +#include +#include +#include +#include + +static int signum; + +static void signal_handler(int n) +{ +signum = n; +} + +#define FAIL 0x1234567887654321 +#define OK32(x) (0x12345678 | (uint32_t)(x)) + +static int64_t cvb(uint64_t x) +{ +int64_t ret = FAIL; + +signum = -1; +asm("cvb %[ret],%[x]" : [ret] "+r" (ret) : [x] "R" (x)); + +return ret; +} + +static int64_t cvby(uint64_t x) +{ +int64_t ret = FAIL; + +signum = -1; +asm("cvby %[ret],%[x]" : [ret] "+r" (ret) : [x] "T" (x)); + +return ret; +} + +static int64_t cvbg(__uint128_t x) +{ +int64_t ret = FAIL; + +signum = -1; +asm("cvbg %[ret],%[x]" : [ret] "+r" (ret) : [x] "T" (x)); + +return ret; +} + +int main(void) +{ +__uint128_t m = (((__uint128_t)0x9223372036854775) << 16) | 0x8070; +struct sigaction act; +int err; + +memset(, 0, sizeof(act)); +act.sa_handler = signal_handler; +err = sigaction(SIGFPE, , NULL); +assert(err == 0); +err = sigaction(SIGILL, , NULL); +assert(err == 0); + +assert(cvb(0xc) == OK32(0) && signum == -1); +assert(cvb(0x1c) == OK32(1) && signum == -1); +assert(cvb(0x25594c) == OK32(25594) && signum == -1); +assert(cvb(0x1d) == OK32(-1) && signum == -1); +assert(cvb(0x2147483647c) == OK32(0x7fff) && signum == -1); +assert(cvb(0x2147483648d) == OK32(-0x8000) && signum == -1); +assert(cvb(0x7) == FAIL && signum == SIGILL); +assert(cvb(0x2147483648c) == OK32(0x8000) && signum == SIGFPE); +assert(cvb(0x30c) == OK32(0xb2d05e00) && signum == SIGFPE); +assert(cvb(0x2147483649d) == OK32(0x7fff) && signum == SIGFPE); +assert(cvb(0x30d) == OK32(0x4d2fa200) && signum == SIGFPE); + +assert(cvby(0xc) == OK32(0)); +assert(cvby(0x1c) == OK32(1)); +assert(cvby(0x25594c) == OK32(25594)); +assert(cvby(0x1d) == OK32(-1)); +assert(cvby(0x2147483647c) == OK32(0x7fff)); +assert(cvby(0x2147483648d) == OK32(-0x8000)); +assert(cvby(0x7) == FAIL && signum == SIGILL); +assert(cvby(0x2147483648c) == OK32(0x8000) && signum == SIGFPE); +assert(cvby(0x30c) == OK32(0xb2d05e00) && signum == SIGFPE); +assert(cvby(0x2147483649d) == OK32(0x7fff) && signum == SIGFPE); +assert(cvby(0x30d) == OK32(0x4d2fa200) && signum == SIGFPE); + +assert(cvbg(0xc) == 0); +assert(cvbg(0x1c) == 1); +assert(cvbg(0x25594c) == 25594); +assert(cvbg(0x1d) == -1); +assert(cvbg(m + 0xc) == 0x7fff); +assert(cvbg(m + 0x1d) == -0x8000); +assert(cvbg(0x7) == FAIL && signum == SIGILL); +assert(cvbg(m + 0x1c) == FAIL && signum == SIGFPE); +assert(cvbg(m + 0x2d) == FAIL && signum == SIGFPE); +assert(cvbg(((__uint128_t)1 << 80) + 0xc) == FAIL && signum == SIGFPE); +assert(cvbg(((__uint128_t)1 << 80) + 0xd) == FAIL && signum == SIGFPE); + +return EXIT_SUCCESS; +} -- 2.43.0
[PATCH v5 3/4] tests/tcg/s390x: Test CONVERT TO DECIMAL
Check the CVD's, CVDY's, and CVDG's corner cases. Reviewed-by: Thomas Huth Signed-off-by: Ilya Leoshkevich --- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/cvd.c | 63 + 2 files changed, 64 insertions(+) create mode 100644 tests/tcg/s390x/cvd.c diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index 30994dcf9c2..04e4bddd83d 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -45,6 +45,7 @@ TESTS+=clc TESTS+=laalg TESTS+=add-logical-with-carry TESTS+=lae +TESTS+=cvd cdsg: CFLAGS+=-pthread cdsg: LDFLAGS+=-pthread diff --git a/tests/tcg/s390x/cvd.c b/tests/tcg/s390x/cvd.c new file mode 100644 index 000..d776688985e --- /dev/null +++ b/tests/tcg/s390x/cvd.c @@ -0,0 +1,63 @@ +/* + * Test the CONVERT TO DECIMAL instruction. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ +#include +#include +#include + +static uint64_t cvd(int32_t x) +{ +uint64_t ret; + +asm("cvd %[x],%[ret]" : [ret] "=R" (ret) : [x] "r" (x)); + +return ret; +} + +static uint64_t cvdy(int32_t x) +{ +uint64_t ret; + +asm("cvdy %[x],%[ret]" : [ret] "=T" (ret) : [x] "r" (x)); + +return ret; +} + +static __uint128_t cvdg(int64_t x) +{ +__uint128_t ret; + +asm("cvdg %[x],%[ret]" : [ret] "=T" (ret) : [x] "r" (x)); + +return ret; +} + +int main(void) +{ +__uint128_t m = (((__uint128_t)0x9223372036854775) << 16) | 0x8070; + +assert(cvd(0) == 0xc); +assert(cvd(1) == 0x1c); +assert(cvd(25594) == 0x25594c); +assert(cvd(-1) == 0x1d); +assert(cvd(0x7fff) == 0x2147483647c); +assert(cvd(-0x8000) == 0x2147483648d); + +assert(cvdy(0) == 0xc); +assert(cvdy(1) == 0x1c); +assert(cvdy(25594) == 0x25594c); +assert(cvdy(-1) == 0x1d); +assert(cvdy(0x7fff) == 0x2147483647c); +assert(cvdy(-0x8000) == 0x2147483648d); + +assert(cvdg(0) == 0xc); +assert(cvdg(1) == 0x1c); +assert(cvdg(25594) == 0x25594c); +assert(cvdg(-1) == 0x1d); +assert(cvdg(0x7fff) == (m + 0xc)); +assert(cvdg(-0x8000) == (m + 0x1d)); + +return EXIT_SUCCESS; +} -- 2.43.0
[PATCH v5 1/4] target/s390x: Emulate CVDG
CVDG is the same as CVD, except that it converts 64 bits into 128, rather than 32 into 64. Create a new helper, which uses Int128 wrappers. Reported-by: Ido Plat Reviewed-by: Richard Henderson Signed-off-by: Ilya Leoshkevich --- target/s390x/helper.h| 1 + target/s390x/tcg/insn-data.h.inc | 1 + target/s390x/tcg/int_helper.c| 21 + target/s390x/tcg/translate.c | 8 4 files changed, 31 insertions(+) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 05102578fc9..332a9a9c632 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -89,6 +89,7 @@ DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128) DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32) +DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64) DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64) DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32) DEF_HELPER_FLAGS_4(pku, TCG_CALL_NO_WG, void, env, i64, i64, i32) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 2f07f39d9cb..388dcb8dbbc 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -296,6 +296,7 @@ /* CONVERT TO DECIMAL */ C(0x4e00, CVD, RX_a, Z, r1_o, a2, 0, 0, cvd, 0) C(0xe326, CVDY,RXY_a, LD, r1_o, a2, 0, 0, cvd, 0) +C(0xe32e, CVDG,RXY_a, Z, r1_o, a2, 0, 0, cvdg, 0) /* CONVERT TO FIXED */ F(0xb398, CFEBR, RRF_e, Z, 0, e2, new, r1_32, cfeb, 0, IF_BFP) F(0xb399, CFDBR, RRF_e, Z, 0, f2, new, r1_32, cfdb, 0, IF_BFP) diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c index eb8e6dd1b57..121e3006a65 100644 --- a/target/s390x/tcg/int_helper.c +++ b/target/s390x/tcg/int_helper.c @@ -118,6 +118,27 @@ uint64_t HELPER(cvd)(int32_t reg) return dec; } +Int128 HELPER(cvdg)(int64_t reg) +{ +/* positive 0 */ +Int128 dec = int128_make64(0x0c); +Int128 bin = int128_makes64(reg); +Int128 base = int128_make64(10); +int shift; + +if (!int128_nonneg(bin)) { +bin = int128_neg(bin); +dec = int128_make64(0x0d); +} + +for (shift = 4; (shift < 128) && int128_nz(bin); shift += 4) { +dec = int128_or(dec, int128_lshift(int128_remu(bin, base), shift)); +bin = int128_divu(bin, base); +} + +return dec; +} + uint64_t HELPER(popcnt)(uint64_t val) { /* Note that we don't fold past bytes. */ diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index a5fd9cccaa5..c2fdc920a50 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2233,6 +2233,14 @@ static DisasJumpType op_cvd(DisasContext *s, DisasOps *o) return DISAS_NEXT; } +static DisasJumpType op_cvdg(DisasContext *s, DisasOps *o) +{ +TCGv_i128 t = tcg_temp_new_i128(); +gen_helper_cvdg(t, o->in1); +tcg_gen_qemu_st_i128(t, o->in2, get_mem_index(s), MO_TE | MO_128); +return DISAS_NEXT; +} + static DisasJumpType op_ct(DisasContext *s, DisasOps *o) { int m3 = get_field(s, m3); -- 2.43.0
[PATCH v5 2/4] target/s390x: Emulate CVB, CVBY and CVBG
Convert to Binary - counterparts of the already implemented Convert to Decimal (CVD*) instructions. Example from the Principles of Operation: 25594C becomes 63FA. Co-developed-by: Pavel Zbitskiy Signed-off-by: Ilya Leoshkevich --- target/s390x/helper.h| 2 + target/s390x/tcg/insn-data.h.inc | 4 ++ target/s390x/tcg/int_helper.c| 76 target/s390x/tcg/translate.c | 16 +++ 4 files changed, 98 insertions(+) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 332a9a9c632..cc1c20e9e3f 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -88,6 +88,8 @@ DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64) DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128) +DEF_HELPER_3(cvb, void, env, i32, i64) +DEF_HELPER_FLAGS_2(cvbg, TCG_CALL_NO_WG, i64, env, i128) DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32) DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64) DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 388dcb8dbbc..e7d61cdec28 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -293,6 +293,10 @@ D(0xec73, CLFIT, RIE_a, GIE, r1_32u, i2_16u, 0, 0, ct, 0, 1) D(0xec71, CLGIT, RIE_a, GIE, r1_o, i2_16u, 0, 0, ct, 0, 1) +/* CONVERT TO BINARY */ +C(0x4f00, CVB, RX_a, Z, la2, 0, 0, 0, cvb, 0) +C(0xe306, CVBY,RXY_a, LD, la2, 0, 0, 0, cvb, 0) +C(0xe30e, CVBG,RXY_a, Z, la2, 0, r1, 0, cvbg, 0) /* CONVERT TO DECIMAL */ C(0x4e00, CVD, RX_a, Z, r1_o, a2, 0, 0, cvd, 0) C(0xe326, CVDY,RXY_a, LD, r1_o, a2, 0, 0, cvd, 0) diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c index 121e3006a65..2af970f2c8b 100644 --- a/target/s390x/tcg/int_helper.c +++ b/target/s390x/tcg/int_helper.c @@ -25,6 +25,7 @@ #include "exec/exec-all.h" #include "qemu/host-utils.h" #include "exec/helper-proto.h" +#include "exec/cpu_ldst.h" /* #define DEBUG_HELPER */ #ifdef DEBUG_HELPER @@ -98,6 +99,81 @@ Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t b) tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } +void HELPER(cvb)(CPUS390XState *env, uint32_t r1, uint64_t dec) +{ +int64_t pow10 = 1, bin = 0; +int digit, sign; + +sign = dec & 0xf; +if (sign < 0xa) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec >>= 4; + +while (dec) { +digit = dec & 0xf; +if (digit > 0x9) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec >>= 4; +bin += digit * pow10; +pow10 *= 10; +} + +if (sign == 0xb || sign == 0xd) { +bin = -bin; +} + +/* R1 is updated even on fixed-point-divide exception. */ +env->regs[r1] = (env->regs[r1] & 0xULL) | (uint32_t)bin; +if (bin != (int32_t)bin) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} +} + +uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec) +{ +uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)}; +int64_t bin = 0, pow10, tmp; +int digit, i, sign; + +sign = dec64[0] & 0xf; +if (sign < 0xa) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec64[0] >>= 4; +pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1; + +for (i = 1; i < 20; i++) { +digit = dec64[i >> 4] & 0xf; +if (digit > 0x9) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec64[i >> 4] >>= 4; +/* + * Prepend the next digit and check for overflow. The multiplication + * cannot overflow, since, conveniently, the int64_t limits are + * approximately +-9.2E+18. If bin is zero, the addition cannot + * overflow. Otherwise bin is known to have the same sign as the rhs + * addend, in which case overflow happens if and only if the result + * has a different sign. + */ +tmp = bin + pow10 * digit; +if (bin && ((tmp ^ bin) < 0)) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} +bin = tmp; +pow10 *= 10; +} + +g_assert(!dec64[0]); +if (dec64[1]) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} + +return bin; +} + uint64_t HELPER(cvd)(int32_t reg) { /* positive 0 */ diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index c2fdc920a50..325b25959d3 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2223,6 +2223,22 @@ static DisasJumpType op_csp(DisasContext *s, DisasOps *o) } #endif +static DisasJumpType op_cvb(DisasContext *s, DisasOps *o) +{ +TCGv_i64 t = tcg_temp_new_i64(); +
[PATCH v2 3/6] migration/multifd: Move multifd_send_setup error handling in to the function
Hide the error handling inside multifd_send_setup to make it cleaner for the next patch to move the function around. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 6 +- migration/multifd.c | 24 +--- migration/multifd.h | 2 +- 3 files changed, 19 insertions(+), 13 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index ba99772e76..2942f8cf42 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3623,11 +3623,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } -if (multifd_send_setup(_err) != 0) { -migrate_set_error(s, local_err); -error_report_err(local_err); -migrate_set_state(>state, MIGRATION_STATUS_SETUP, - MIGRATION_STATUS_FAILED); +if (!multifd_send_setup()) { migrate_fd_cleanup(s); return; } diff --git a/migration/multifd.c b/migration/multifd.c index 515d88e04b..cc10be2c3f 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -985,14 +985,16 @@ static void multifd_new_send_channel_create(gpointer opaque) socket_send_channel_create(multifd_new_send_channel_async, opaque); } -int multifd_send_setup(Error **errp) +bool multifd_send_setup(void) { -int thread_count; +MigrationState *s = migrate_get_current(); +Error *local_err = NULL; +int thread_count, ret = 0; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; if (!migrate_multifd()) { -return 0; +return true; } thread_count = migrate_multifd_channels(); @@ -1026,14 +1028,22 @@ int multifd_send_setup(Error **errp) for (i = 0; i < thread_count; i++) { MultiFDSendParams *p = _send_state->params[i]; -int ret; -ret = multifd_send_state->ops->send_setup(p, errp); +ret = multifd_send_state->ops->send_setup(p, _err); if (ret) { -return ret; +break; } } -return 0; + +if (ret) { +migrate_set_error(s, local_err); +error_report_err(local_err); +migrate_set_state(>state, MIGRATION_STATUS_SETUP, + MIGRATION_STATUS_FAILED); +return false; +} + +return true; } struct { diff --git a/migration/multifd.h b/migration/multifd.h index 7881980ee6..8a1cad0996 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -13,7 +13,7 @@ #ifndef QEMU_MIGRATION_MULTIFD_H #define QEMU_MIGRATION_MULTIFD_H -int multifd_send_setup(Error **errp); +bool multifd_send_setup(void); void multifd_send_shutdown(void); int multifd_recv_setup(Error **errp); void multifd_recv_cleanup(void); -- 2.35.3
[PATCH v2 2/6] migration/multifd: Remove p->running
We currently only need p->running to avoid calling qemu_thread_join() on a non existent thread if the thread has never been created. However, there are at least two bugs in this logic: 1) On the sending side, p->running is set too early and qemu_thread_create() can be skipped due to an error during TLS handshake, leaving the flag set and leading to a crash when multifd_send_cleanup() calls qemu_thread_join(). 2) During exit, the multifd thread clears the flag while holding the channel lock. The counterpart at multifd_send_cleanup() reads the flag outside of the lock and might free the mutex while the multifd thread still has it locked. Fix the first issue by setting the flag right before creating the thread. Rename it from p->running to p->thread_created to clarify its usage. Fix the second issue by not clearing the flag at the multifd thread exit. We don't have any use for that. Note that these bugs are straight-forward logic issues and not race conditions. There is still a gap for races to affect this code due to multifd_send_cleanup() being allowed to run concurrently with the thread creation loop. This issue is solved in the next patches. Cc: qemu-stable Fixes: 29647140157a ("migration/tls: add support for multifd tls-handshake") Reported-by: Avihai Horon Reported-by: Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/multifd.c | 27 --- migration/multifd.h | 7 ++- 2 files changed, 14 insertions(+), 20 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 8195c1daf3..515d88e04b 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -634,7 +634,7 @@ static void multifd_send_terminate_threads(void) qemu_thread_join(>tls_thread); } -if (p->running) { +if (p->thread_created) { qemu_thread_join(>thread); } } @@ -862,7 +862,6 @@ out: error_free(local_err); } -p->running = false; rcu_unregister_thread(); migration_threads_remove(thread); trace_multifd_send_thread_end(p->id, p->packets_sent, p->total_normal_pages); @@ -953,6 +952,8 @@ static bool multifd_channel_connect(MultiFDSendParams *p, migration_ioc_register_yank(ioc); p->registered_yank = true; p->c = ioc; + +p->thread_created = true; qemu_thread_create(>thread, p->name, multifd_send_thread, p, QEMU_THREAD_JOINABLE); return true; @@ -967,7 +968,6 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) trace_multifd_new_send_channel_async(p->id); if (!qio_task_propagate_error(task, _err)) { qio_channel_set_delay(ioc, false); -p->running = true; if (multifd_channel_connect(p, ioc, _err)) { return; } @@ -1128,15 +1128,15 @@ void multifd_recv_cleanup(void) for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = _recv_state->params[i]; -if (p->running) { -/* - * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code, - * however try to wakeup it without harm in cleanup phase. - */ -qemu_sem_post(>sem_sync); -} +/* + * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code, + * however try to wakeup it without harm in cleanup phase. + */ +qemu_sem_post(>sem_sync); -qemu_thread_join(>thread); +if (p->thread_created) { +qemu_thread_join(>thread); +} } for (i = 0; i < migrate_multifd_channels(); i++) { multifd_recv_cleanup_channel(_recv_state->params[i]); @@ -1222,9 +1222,6 @@ static void *multifd_recv_thread(void *opaque) multifd_recv_terminate_threads(local_err); error_free(local_err); } -qemu_mutex_lock(>mutex); -p->running = false; -qemu_mutex_unlock(>mutex); rcu_unregister_thread(); trace_multifd_recv_thread_end(p->id, p->packets_recved, p->total_normal_pages); @@ -1330,7 +1327,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp) p->c = ioc; object_ref(OBJECT(ioc)); -p->running = true; +p->thread_created = true; qemu_thread_create(>thread, p->name, multifd_recv_thread, p, QEMU_THREAD_JOINABLE); qatomic_inc(_recv_state->count); diff --git a/migration/multifd.h b/migration/multifd.h index 720c9d50db..7881980ee6 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -73,6 +73,7 @@ typedef struct { char *name; /* channel thread id */ QemuThread thread; +bool thread_created; QemuThread tls_thread; bool tls_thread_created; /* communication channel */ @@ -93,8 +94,6 @@ typedef struct { /* syncs main thread and channels */ QemuSemaphore sem_sync; -/* is this channel thread running */ -bool running; /* multifd flags for each packet */ uint32_t
[PATCH v2 6/6] migration/multifd: Add a synchronization point for channel creation
It is possible that one of the multifd channels fails to be created at multifd_new_send_channel_async() while the rest of the channel creation tasks are still in flight. This could lead to multifd_save_cleanup() executing the qemu_thread_join() loop too early and not waiting for the threads which haven't been created yet, leading to the freeing of resources that the newly created threads will try to access and crash. Add a synchronization point after which there will be no attempts at thread creation and therefore calling multifd_save_cleanup() past that point will ensure it properly waits for the threads. A note about performance: Prior to this patch, if a channel took too long to be established, other channels could finish connecting first and already start taking load. Now we're bounded by the slowest-connecting channel. Reported-by: Avihai Horon Signed-off-by: Fabiano Rosas --- migration/multifd.c | 33 ++--- 1 file changed, 26 insertions(+), 7 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 89d39fa67c..a2b73c9946 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -62,6 +62,11 @@ struct { * Make it easy for now. */ uintptr_t packet_num; +/* + * Synchronization point past which no more channels will be + * created. + */ +QemuSemaphore channels_created; /* send channels ready */ QemuSemaphore channels_ready; /* @@ -622,10 +627,6 @@ static void multifd_send_terminate_threads(void) /* * Finally recycle all the threads. - * - * TODO: p->running is still buggy, e.g. we can reach here without the - * corresponding multifd_new_send_channel_async() get invoked yet, - * then a new thread can even be created after this function returns. */ for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = _send_state->params[i]; @@ -670,6 +671,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) static void multifd_send_cleanup_state(void) { +qemu_sem_destroy(_send_state->channels_created); qemu_sem_destroy(_send_state->channels_ready); g_free(multifd_send_state->params); multifd_send_state->params = NULL; @@ -934,7 +936,6 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) MultiFDSendParams *p = opaque; QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); Error *local_err = NULL; - bool ret; trace_multifd_new_send_channel_async(p->id); @@ -951,18 +952,26 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) if (migrate_channel_requires_tls_upgrade(ioc)) { ret = multifd_tls_channel_connect(p, ioc, _err); +if (ret) { +return; +} } else { ret = multifd_channel_connect(p, ioc, _err); } +out: +/* + * Here we're not interested whether creation succeeded, only that + * it happened at all. + */ +qemu_sem_post(_send_state->channels_created); + if (ret) { return; } -out: trace_multifd_new_send_channel_async_error(p->id, local_err); multifd_send_set_error(local_err); -multifd_send_kick_main(p); object_unref(OBJECT(ioc)); error_free(local_err); } @@ -988,6 +997,7 @@ bool multifd_send_setup(void) multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); multifd_send_state->pages = multifd_pages_init(page_count); +qemu_sem_init(_send_state->channels_created, 0); qemu_sem_init(_send_state->channels_ready, 0); qatomic_set(_send_state->exiting, 0); multifd_send_state->ops = multifd_ops[migrate_multifd_compression()]; @@ -1013,6 +1023,15 @@ bool multifd_send_setup(void) multifd_new_send_channel_create(p); } +/* + * Wait until channel creation has started for all channels. The + * creation can still fail, but no more channels will be created + * past this point. + */ +for (i = 0; i < thread_count; i++) { +qemu_sem_wait(_send_state->channels_created); +} + for (i = 0; i < thread_count; i++) { MultiFDSendParams *p = _send_state->params[i]; -- 2.35.3
[PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races
Based-on: 20240202102857.110210-1-pet...@redhat.com [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups https://lore.kernel.org/r/20240202102857.110210-1-pet...@redhat.com Hi, In this v2 I made sure NO channel is created after the semaphores are posted. Feel free to call me out if that's not the case. Not much changes, except that now both TLS and non-TLS go through the same code, so there's a centralized place to do error handling and releasing the semaphore. CI run: https://gitlab.com/farosas/qemu/-/pipelines/1165206107 based on Peter's code: https://gitlab.com/farosas/qemu/-/pipelines/1165303276 v1: https://lore.kernel.org/r/20240202191128.1901-1-faro...@suse.de This contains 2 patches from my previous series addressing the p->running misuse and the TLS thread leak and 3 new patches to fix the cleanup-while-creating-threads race. For the p->running I'm keeping the idea from the other series to remove p->running and use a more narrow p->thread_created flag. This flag is used only inform whether the thread has been created so we can join it. For the cleanup race I have moved some code around and added a semaphore to make multifd_save_setup() only return once all channel creation tasks have started. The idea is that after multifd_save_setup() returns, no new creations are in flight and the p->thread_created flags will never change again, so they're enough to cause the cleanup code to wait for the threads to join. CI run: https://gitlab.com/farosas/qemu/-/pipelines/1162798843 @Peter: I can rebase this on top of your series once we decide about it. Fabiano Rosas (6): migration/multifd: Join the TLS thread migration/multifd: Remove p->running migration/multifd: Move multifd_send_setup error handling in to the function migration/multifd: Move multifd_send_setup into migration thread migration/multifd: Unify multifd and TLS connection paths migration/multifd: Add a synchronization point for channel creation migration/migration.c | 14 ++-- migration/multifd.c | 157 +++--- migration/multifd.h | 11 ++- 3 files changed, 98 insertions(+), 84 deletions(-) -- 2.35.3
[PATCH v2 1/6] migration/multifd: Join the TLS thread
We're currently leaking the resources of the TLS thread by not joining it and also overwriting the p->thread pointer altogether. Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") Cc: qemu-stable Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/multifd.c | 8 +++- migration/multifd.h | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/migration/multifd.c b/migration/multifd.c index ef13e2e781..8195c1daf3 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -630,6 +630,10 @@ static void multifd_send_terminate_threads(void) for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = _send_state->params[i]; +if (p->tls_thread_created) { +qemu_thread_join(>tls_thread); +} + if (p->running) { qemu_thread_join(>thread); } @@ -921,7 +925,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p, trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname); qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing"); p->c = QIO_CHANNEL(tioc); -qemu_thread_create(>thread, "multifd-tls-handshake-worker", + +p->tls_thread_created = true; +qemu_thread_create(>tls_thread, "multifd-tls-handshake-worker", multifd_tls_handshake_thread, p, QEMU_THREAD_JOINABLE); return true; diff --git a/migration/multifd.h b/migration/multifd.h index 78a2317263..720c9d50db 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -73,6 +73,8 @@ typedef struct { char *name; /* channel thread id */ QemuThread thread; +QemuThread tls_thread; +bool tls_thread_created; /* communication channel */ QIOChannel *c; /* is the yank function registered */ -- 2.35.3
[PATCH v2 4/6] migration/multifd: Move multifd_send_setup into migration thread
We currently have an unfavorable situation around multifd channels creation and the migration thread execution. We create the multifd channels with qio_channel_socket_connect_async -> qio_task_run_in_thread, but only connect them at the multifd_new_send_channel_async callback, called from qio_task_complete, which is registered as a glib event. So at multifd_send_setup() we create the channels, but they will only be actually usable after the whole multifd_send_setup() calling stack returns back to the main loop. Which means that the migration thread is already up and running without any possibility for the multifd channels to be ready on time. We currently rely on the channels-ready semaphore blocking multifd_send_sync_main() until channels start to come up and release it. However there have been bugs recently found when a channel's creation fails and multifd_send_cleanup() is allowed to run while other channels are still being created. Let's start to organize this situation by moving the multifd_send_setup() call into the migration thread. That way we unblock the main-loop to dispatch the completion callbacks and actually have a chance of getting the multifd channels ready for when the migration thread needs them. The next patches will deal with the synchronization aspects. Note that this takes multifd_send_setup() out of the BQL. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 2942f8cf42..0675e12c64 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3315,6 +3315,10 @@ static void *migration_thread(void *opaque) object_ref(OBJECT(s)); update_iteration_initial_status(s); +if (!multifd_send_setup()) { +goto out; +} + bql_lock(); qemu_savevm_state_header(s->to_dst_file); bql_unlock(); @@ -3386,6 +3390,7 @@ static void *migration_thread(void *opaque) urgent = migration_rate_limit(); } +out: trace_migration_thread_after_loop(); migration_iteration_finish(s); object_unref(OBJECT(s)); @@ -3623,11 +3628,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } -if (!multifd_send_setup()) { -migrate_fd_cleanup(s); -return; -} - if (migrate_background_snapshot()) { qemu_thread_create(>thread, "bg_snapshot", bg_migration_thread, s, QEMU_THREAD_JOINABLE); -- 2.35.3
[PATCH v2 5/6] migration/multifd: Unify multifd and TLS connection paths
During multifd channel creation (multifd_send_new_channel_async) when TLS is enabled, the multifd_channel_connect function is called twice, once to create the TLS handshake thread and another time after the asynchrounous TLS handshake has finished. This creates a slightly confusing call stack where multifd_channel_connect() is called more times than the number of channels. It also splits error handling between the two callers of multifd_channel_connect() causing some code duplication. Lastly, it gets in the way of having a single point to determine whether all channel creation tasks have been initiated. Refactor the code to move the reentrancy one level up at the multifd_new_send_channel_async() level, de-duplicating the error handling and allowing for the next patch to introduce a synchronization point common to all the multifd channel creation, regardless of TLS. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 73 +++-- 1 file changed, 30 insertions(+), 43 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index cc10be2c3f..89d39fa67c 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -869,30 +869,7 @@ out: return NULL; } -static bool multifd_channel_connect(MultiFDSendParams *p, -QIOChannel *ioc, -Error **errp); - -static void multifd_tls_outgoing_handshake(QIOTask *task, - gpointer opaque) -{ -MultiFDSendParams *p = opaque; -QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); -Error *err = NULL; - -if (!qio_task_propagate_error(task, )) { -trace_multifd_tls_outgoing_handshake_complete(ioc); -if (multifd_channel_connect(p, ioc, )) { -return; -} -} - -trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err)); - -multifd_send_set_error(err); -multifd_send_kick_main(p); -error_free(err); -} +static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque); static void *multifd_tls_handshake_thread(void *opaque) { @@ -900,7 +877,7 @@ static void *multifd_tls_handshake_thread(void *opaque) QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c); qio_channel_tls_handshake(tioc, - multifd_tls_outgoing_handshake, + multifd_new_send_channel_async, p, NULL, NULL); @@ -936,19 +913,6 @@ static bool multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc, Error **errp) { -trace_multifd_set_outgoing_channel( -ioc, object_get_typename(OBJECT(ioc)), -migrate_get_current()->hostname); - -if (migrate_channel_requires_tls_upgrade(ioc)) { -/* - * tls_channel_connect will call back to this - * function after the TLS handshake, - * so we mustn't call multifd_send_thread until then - */ -return multifd_tls_channel_connect(p, ioc, errp); -} - migration_ioc_register_yank(ioc); p->registered_yank = true; p->c = ioc; @@ -959,20 +923,43 @@ static bool multifd_channel_connect(MultiFDSendParams *p, return true; } +/* + * When TLS is enabled this function is called once to establish the + * TLS connection and a second time after the TLS handshake to create + * the multifd channel. Without TLS it goes straight into the channel + * creation. + */ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) { MultiFDSendParams *p = opaque; QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); Error *local_err = NULL; +bool ret; + trace_multifd_new_send_channel_async(p->id); -if (!qio_task_propagate_error(task, _err)) { -qio_channel_set_delay(ioc, false); -if (multifd_channel_connect(p, ioc, _err)) { -return; -} + +if (qio_task_propagate_error(task, _err)) { +ret = false; +goto out; +} + +qio_channel_set_delay(ioc, false); + +trace_multifd_set_outgoing_channel(ioc, object_get_typename(OBJECT(ioc)), + migrate_get_current()->hostname); + +if (migrate_channel_requires_tls_upgrade(ioc)) { +ret = multifd_tls_channel_connect(p, ioc, _err); +} else { +ret = multifd_channel_connect(p, ioc, _err); +} + +if (ret) { +return; } +out: trace_multifd_new_send_channel_async_error(p->id, local_err); multifd_send_set_error(local_err); multifd_send_kick_main(p); -- 2.35.3
[PATCH] hw/net/tulip: add chip status register values
Netbsd isn't able to detect a link on the emulated tulip card. That's because netbsd reads the Chip Status Register of the Phy (address 0x14). The default phy data in the qemu tulip driver is all zero, which means no link is established and autonegotation isn't complete. Therefore set the register to 0x3b40, which means: Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link speed. Also clear the mask because this register is read only. Signed-off-by: Sven Schnelle --- hw/net/tulip.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/tulip.c b/hw/net/tulip.c index 6d4fb06dad..1f2ef20977 100644 --- a/hw/net/tulip.c +++ b/hw/net/tulip.c @@ -421,7 +421,7 @@ static uint16_t tulip_mdi_default[] = { /* MDI Registers 8 - 15 */ 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, /* MDI Registers 16 - 31 */ -0x0003, 0x, 0x0001, 0x, 0x, 0x, 0x, 0x, +0x0003, 0x, 0x0001, 0x, 0x3b40, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, }; @@ -429,7 +429,7 @@ static uint16_t tulip_mdi_default[] = { static const uint16_t tulip_mdi_mask[] = { 0x, 0x, 0x, 0x, 0xc01f, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, -0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x, +0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, }; -- 2.43.0
Re: [PATCH v4 5/5] target/riscv: Implement privilege mode filtering for cycle/instret
On Tue, Jan 23, 2024 at 4:15 PM Atish Kumar Patra wrote: > On Sun, Jan 21, 2024 at 9:04 PM Alistair Francis > wrote: > > > > On Tue, Jan 9, 2024 at 10:29 AM Atish Patra wrote: > > > > > > Privilege mode filtering can also be emulated for cycle/instret by > > > tracking host_ticks/icount during each privilege mode switch. This > > > patch implements that for both cycle/instret and mhpmcounters. The > > > first one requires Smcntrpmf while the other one requires Sscofpmf > > > to be enabled. > > > > > > The cycle/instret are still computed using host ticks when icount > > > is not enabled. Otherwise, they are computed using raw icount which > > > is more accurate in icount mode. > > > > > > Reviewed-by: Daniel Henrique Barboza > > > Signed-off-by: Atish Patra > > > --- > > > target/riscv/cpu.h| 11 + > > > target/riscv/cpu_helper.c | 9 +++- > > > target/riscv/csr.c| 95 ++- > > > target/riscv/pmu.c| 43 ++ > > > target/riscv/pmu.h| 2 + > > > 5 files changed, 136 insertions(+), 24 deletions(-) > > > > > > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h > > > index 34617c4c4bab..40d10726155b 100644 > > > --- a/target/riscv/cpu.h > > > +++ b/target/riscv/cpu.h > > > @@ -136,6 +136,15 @@ typedef struct PMUCTRState { > > > target_ulong irq_overflow_left; > > > } PMUCTRState; > > > > > > +typedef struct PMUFixedCtrState { > > > +/* Track cycle and icount for each privilege mode */ > > > +uint64_t counter[4]; > > > +uint64_t counter_prev[4]; > > > > Are these two used? > > > > Yes. That's where it tracks the current/previous value cycle/instret. > riscv_pmu_icount_update_priv/riscv_pmu_cycle_update_priv > > The priv mode based filtering is enabled in > riscv_pmu_ctr_get_fixed_counters_val > using "counter" afterwards. > > Did I misunderstand your question ? > > ping ? > > Alistair > > > > > +/* Track cycle and icount for each privilege mode when V = 1*/ > > > +uint64_t counter_virt[2]; > > > +uint64_t counter_virt_prev[2]; > > > +} PMUFixedCtrState; > > > + > > > struct CPUArchState { > > > target_ulong gpr[32]; > > > target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */ > > > @@ -334,6 +343,8 @@ struct CPUArchState { > > > /* PMU event selector configured values for RV32 */ > > > target_ulong mhpmeventh_val[RV_MAX_MHPMEVENTS]; > > > > > > +PMUFixedCtrState pmu_fixed_ctrs[2]; > > > + > > > target_ulong sscratch; > > > target_ulong mscratch; > > > > > > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > > > index e7e23b34f455..3dddb1b433e8 100644 > > > --- a/target/riscv/cpu_helper.c > > > +++ b/target/riscv/cpu_helper.c > > > @@ -715,8 +715,13 @@ void riscv_cpu_set_mode(CPURISCVState *env, > target_ulong newpriv) > > > { > > > g_assert(newpriv <= PRV_M && newpriv != PRV_RESERVED); > > > > > > -if (icount_enabled() && newpriv != env->priv) { > > > -riscv_itrigger_update_priv(env); > > > +if (newpriv != env->priv) { > > > +if (icount_enabled()) { > > > +riscv_itrigger_update_priv(env); > > > +riscv_pmu_icount_update_priv(env, newpriv); > > > +} else { > > > +riscv_pmu_cycle_update_priv(env, newpriv); > > > +} > > > } > > > /* tlb_flush is unnecessary as mode is contained in mmu_idx */ > > > env->priv = newpriv; > > > diff --git a/target/riscv/csr.c b/target/riscv/csr.c > > > index 3bd4aa22374f..307d052021c5 100644 > > > --- a/target/riscv/csr.c > > > +++ b/target/riscv/csr.c > > > @@ -782,32 +782,16 @@ static int write_vcsr(CPURISCVState *env, int > csrno, target_ulong val) > > > return RISCV_EXCP_NONE; > > > } > > > > > > +#if defined(CONFIG_USER_ONLY) > > > /* User Timers and Counters */ > > > static target_ulong get_ticks(bool shift) > > > { > > > -int64_t val; > > > -target_ulong result; > > > - > > > -#if !defined(CONFIG_USER_ONLY) > > > -if (icount_enabled()) { > > > -val = icount_get(); > > > -} else { > > > -val = cpu_get_host_ticks(); > > > -} > > > -#else > > > -val = cpu_get_host_ticks(); > > > -#endif > > > - > > > -if (shift) { > > > -result = val >> 32; > > > -} else { > > > -result = val; > > > -} > > > +int64_t val = cpu_get_host_ticks(); > > > +target_ulong result = shift ? val >> 32 : val; > > > > > > return result; > > > } > > > > > > -#if defined(CONFIG_USER_ONLY) > > > static RISCVException read_time(CPURISCVState *env, int csrno, > > > target_ulong *val) > > > { > > > @@ -932,6 +916,70 @@ static int write_mhpmeventh(CPURISCVState *env, > int csrno, target_ulong val) > > > return RISCV_EXCP_NONE; > > > } > > > > > > +static target_ulong > riscv_pmu_ctr_get_fixed_counters_val(CPURISCVState *env, > > > +
Re: [PATCH] docs/style: allow C99 mixed declarations
On Mon, Feb 05, 2024 at 02:06:39PM -0500, Stefan Hajnoczi wrote: > On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu wrote: > > > > > > Daniel P. Berrangé writes: > > > > > $ gcc -Wall -Wuninitialized -o jump jump.c > > > > Note that many GCC warnings don't trigger if you don't enable > > optimizations. In the case you exhibit, adding -O is enough to get > > a sensible warning: > > > > $ gcc -Wall -O -o jump jump.c > > jump.c: In function ‘main’: > > jump.c:11:3: warning: ‘foo’ may be used uninitialized > > [-Wmaybe-uninitialized] > >11 | free(foo); > > | ^ > > jump.c:8:9: note: ‘foo’ was declared here > > 8 | char *foo = malloc(30); > > | ^~~ > > llvm also prints a warning: > > jump.c:5:7: warning: variable 'foo' is used uninitialized whenever > 'if' condition is true [-Wsometimes-uninitialized] > > I confirmed that QEMU's current compiler flags enable these warnings > so both gcc and llvm detect the issue that Daniel pointed out in QEMU > code. > > Daniel: Does this address your concern about compiler warnings? While it is good that its included when optimization is enabled, IME GCC is not entirely reliable at detecting unintialized variables as it has limits on extent of its flow control analysis. Also consider the maint impact if a method includes C99 decls, and a contributor later needs to add a 'goto' error cleanup path. They might be forced to make a bunch of unrelated refactoring changes to eliminate use of the C99 decls. This makes me not a fan of having a situation where we tell contributors they can use C99 mixed decls, except when they can't use them. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] docs/style: allow C99 mixed declarations
On Mon, 5 Feb 2024 at 13:16, Samuel Tardieu wrote: > > > Daniel P. Berrangé writes: > > > $ gcc -Wall -Wuninitialized -o jump jump.c > > Note that many GCC warnings don't trigger if you don't enable > optimizations. In the case you exhibit, adding -O is enough to get > a sensible warning: > > $ gcc -Wall -O -o jump jump.c > jump.c: In function ‘main’: > jump.c:11:3: warning: ‘foo’ may be used uninitialized > [-Wmaybe-uninitialized] >11 | free(foo); > | ^ > jump.c:8:9: note: ‘foo’ was declared here > 8 | char *foo = malloc(30); > | ^~~ llvm also prints a warning: jump.c:5:7: warning: variable 'foo' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] I confirmed that QEMU's current compiler flags enable these warnings so both gcc and llvm detect the issue that Daniel pointed out in QEMU code. Daniel: Does this address your concern about compiler warnings? Stefan
Re: Re: [PATCH v2 4/4] meson: Link with libinotify on FreeBSD
On Mon, Feb 05, 2024 at 07:36:32PM +0100, Philippe Mathieu-Daudé wrote: > Hi Ilya, > > On 5/2/24 19:11, Ilya Leoshkevich wrote: > > make vm-build-freebsd fails with: > > > > ld: error: undefined symbol: inotify_init1 > > >>> referenced by filemonitor-inotify.c:183 > > (../src/util/filemonitor-inotify.c:183) > > >>> util_filemonitor-inotify.c.o:(qemu_file_monitor_new) > > in archive libqemuutil.a > > > > On FreeBSD inotify functions are defined in libinotify.so. Add it to > > the dependencies. > > > > Signed-off-by: Ilya Leoshkevich > > --- > > meson.build | 12 +++- > > util/meson.build | 6 +- > > 2 files changed, 16 insertions(+), 2 deletions(-) > > (for some reason your git-diff context is very verbose, > making review somehow annoying). This is because of patch 3. It's essentially the snippet that Daniel posted, except that patch -p1 applied it at a wrong location! So I figured I'll send this series with a larger context, but I couldn't find how to apply this setting to just 1 patch in git send-email. > > +# libinotify-kqueue > > +inotify = not_found > > +if host_os == 'freebsd' > > + inotify = cc.find_library('inotify') > > +endif > > + > > # > > # config-host.h # > > # > > > > @@ -2376,61 +2382,62 @@ have_asan_fiber = false > > if get_option('sanitizers') and \ > > not cc.has_function('__sanitizer_start_switch_fiber', > >args: '-fsanitize=address', > >prefix: '#include ') > > warning('Missing ASAN due to missing fiber annotation interface') > > warning('Without code annotation, the report may be inferior.') > > else > > have_asan_fiber = true > > endif > > config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber) > > # has_header_symbol > > > > config_host_data.set('CONFIG_INOTIFY', > >cc.has_header_symbol('sys/inotify.h', > > 'inotify_init')) > > config_host_data.set('CONFIG_INOTIFY1', > > - cc.has_header_symbol('sys/inotify.h', > > 'inotify_init1')) > > + cc.has_header_symbol('sys/inotify.h', > > 'inotify_init1') and > > + (host_os != 'freebsd' or inotify.found())) > > Maybe we could use the same pattern as 'have_asan_fiber': > > have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 'inotify_init1') > if have_inotify_init1 and host_os == 'freebsd' >have_inotify_init1 = cc.find_library('inotify') > endif > config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1) I agree, this looks nicer. I will send a v3. > I wonder why we don't need the similar library check for the > inotify_init symbol. Sounds reasonable, it's just that currently the respective config value is used only in linux-user, but for completeness it won't hurt. > > Regards, > > Phil.
Re: [PATCH v2 2/4] tests/vm/freebsd: Reload the sshd configuration
On 05/02/2024 19.11, Ilya Leoshkevich wrote: After console_sshd_config(), the SSH server needs to be nudged to pick up the new configs. The scripts for the other BSD flavors already do this with a reboot, but a simple reload is sufficient. Signed-off-by: Ilya Leoshkevich --- tests/vm/freebsd | 1 + 1 file changed, 1 insertion(+) Reviewed-by: Thomas Huth
Re: Re: [PATCH v4 2/4] target/s390x: Emulate CVB, CVBY and CVBG
On Mon, Feb 05, 2024 at 06:04:27PM +0100, Thomas Huth wrote: > On 02/02/2024 15.11, Ilya Leoshkevich wrote: > > Convert to Binary - counterparts of the already implemented Convert > > to Decimal (CVD*) instructions. > > Example from the Principles of Operation: 25594C becomes 63FA. > > > > Co-developed-by: Pavel Zbitskiy > > Signed-off-by: Ilya Leoshkevich > > --- > > target/s390x/helper.h| 2 + > > target/s390x/tcg/insn-data.h.inc | 4 ++ > > target/s390x/tcg/int_helper.c| 72 > > target/s390x/tcg/translate.c | 16 +++ > > 4 files changed, 94 insertions(+) [...] > > +uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec) > > +{ > > +uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)}; > > +int64_t bin = 0, pow10, tmp; > > +int digit, i, sign; > > + > > +sign = dec64[0] & 0xf; > > +if (sign < 0xa) { > > +tcg_s390_data_exception(env, 0, GETPC()); > > +} > > +dec64[0] >>= 4; > > +pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1; > > + > > +for (i = 1; i < 20; i++) { > > +digit = dec64[i >> 4] & 0xf; > > +if (digit > 0x9) { > > +tcg_s390_data_exception(env, 0, GETPC()); > > +} > > +dec64[i >> 4] >>= 4; > > +tmp = pow10 * digit; > > +if (digit && ((tmp ^ pow10) < 0)) { > > That tmp ^ pow10 caused some frowning for me first, but it's just a check > whether the sign changed, right? ... a comment in front of the line might be > helpful. Good point about writing a comment, I tried to elaborate why checking the sign is justified, and realized that it's actually redundant. The int64_t bounds are roughly +-9.2E+18, and the last pow10 value in this loop is +-1E+18. The multiplication cannot overflow. > > +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); > > +} > > +tmp = bin + tmp; > > +if (bin && ((tmp ^ bin) < 0)) { The addition, however, can, e.g., bin=9E+17 and tmp=9E+18. So I'll send a v5 without the first check and with a comment. [...]
Re: [PATCH v2 4/4] meson: Link with libinotify on FreeBSD
Hi Ilya, On 5/2/24 19:11, Ilya Leoshkevich wrote: make vm-build-freebsd fails with: ld: error: undefined symbol: inotify_init1 >>> referenced by filemonitor-inotify.c:183 (../src/util/filemonitor-inotify.c:183) >>> util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in archive libqemuutil.a On FreeBSD inotify functions are defined in libinotify.so. Add it to the dependencies. Signed-off-by: Ilya Leoshkevich --- meson.build | 12 +++- util/meson.build | 6 +- 2 files changed, 16 insertions(+), 2 deletions(-) (for some reason your git-diff context is very verbose, making review somehow annoying). +# libinotify-kqueue +inotify = not_found +if host_os == 'freebsd' + inotify = cc.find_library('inotify') +endif + # # config-host.h # # @@ -2376,61 +2382,62 @@ have_asan_fiber = false if get_option('sanitizers') and \ not cc.has_function('__sanitizer_start_switch_fiber', args: '-fsanitize=address', prefix: '#include ') warning('Missing ASAN due to missing fiber annotation interface') warning('Without code annotation, the report may be inferior.') else have_asan_fiber = true endif config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber) # has_header_symbol config_host_data.set('CONFIG_INOTIFY', cc.has_header_symbol('sys/inotify.h', 'inotify_init')) config_host_data.set('CONFIG_INOTIFY1', - cc.has_header_symbol('sys/inotify.h', 'inotify_init1')) + cc.has_header_symbol('sys/inotify.h', 'inotify_init1') and + (host_os != 'freebsd' or inotify.found())) Maybe we could use the same pattern as 'have_asan_fiber': have_inotify_init1 = cc.has_header_symbol('sys/inotify.h', 'inotify_init1') if have_inotify_init1 and host_os == 'freebsd' have_inotify_init1 = cc.find_library('inotify') endif config_host_data.set('CONFIG_INOTIFY1', have_inotify_init1) I wonder why we don't need the similar library check for the inotify_init symbol. Regards, Phil.
[PATCH v2 3/4] tests/test-util-filemonitor: Adapt to FreeBSD inotify rename semantics
Unlike on Linux, on FreeBSD renaming a file when the destination already exists results in IN_DELETE event for that existing file: $ FILEMONITOR_DEBUG=1 build/tests/unit/test-util-filemonitor Rename /tmp/test-util-filemonitor-K13LI2/fish/one.txt -> /tmp/test-util-filemonitor-K13LI2/two.txt Event id=2 event=2 file=one.txt Queue event id 2 event 2 file one.txt Queue event id 1 event 2 file two.txt Queue event id 10002 event 2 file two.txt Queue event id 1 event 0 file two.txt Queue event id 10002 event 0 file two.txt Event id=1 event=0 file=two.txt Expected event 0 but got 2 This difference in behavior is not expected to break the real users, so teach the test to accept it. Suggested-by: Daniel P. Berrange Signed-off-by: Ilya Leoshkevich --- tests/unit/test-util-filemonitor.c | 8 1 file changed, 8 insertions(+) diff --git a/tests/unit/test-util-filemonitor.c b/tests/unit/test-util-filemonitor.c index a22de275955..02e67fc96ac 100644 --- a/tests/unit/test-util-filemonitor.c +++ b/tests/unit/test-util-filemonitor.c @@ -333,60 +333,68 @@ test_file_monitor_events(void) { .type = QFILE_MONITOR_TEST_OP_MKDIR, .filesrc = "fish", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "fish", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH, .filesrc = "fish/", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_ADD_WATCH, .filesrc = "fish/one.txt", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_CREATE, .filesrc = "fish/one.txt", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH, .filesrc = "fish/one.txt", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_RENAME, .filesrc = "fish/one.txt", .filedst = "two.txt", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "one.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, +#ifdef __FreeBSD__ +{ .type = QFILE_MONITOR_TEST_OP_EVENT, + .filesrc = "two.txt", .watchid = , + .eventid = QFILE_MONITOR_EVENT_DELETED }, +{ .type = QFILE_MONITOR_TEST_OP_EVENT, + .filesrc = "two.txt", .watchid = , + .eventid = QFILE_MONITOR_EVENT_DELETED }, +#endif { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_CREATED }, { .type = QFILE_MONITOR_TEST_OP_RMDIR, .filesrc = "fish", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "", .watchid = , .eventid = QFILE_MONITOR_EVENT_IGNORED, .swapnext = true }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "fish", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, { .type = QFILE_MONITOR_TEST_OP_DEL_WATCH, .filesrc = "fish", .watchid = }, { .type = QFILE_MONITOR_TEST_OP_UNLINK, .filesrc = "two.txt", }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, { .type = QFILE_MONITOR_TEST_OP_EVENT, .filesrc = "two.txt", .watchid = , .eventid = QFILE_MONITOR_EVENT_DELETED }, -- 2.43.0
Re: [PATCH] docs/style: allow C99 mixed declarations
Daniel P. Berrangé writes: $ gcc -Wall -Wuninitialized -o jump jump.c Note that many GCC warnings don't trigger if you don't enable optimizations. In the case you exhibit, adding -O is enough to get a sensible warning: $ gcc -Wall -O -o jump jump.c jump.c: In function ‘main’: jump.c:11:3: warning: ‘foo’ may be used uninitialized [-Wmaybe-uninitialized] 11 | free(foo); | ^ jump.c:8:9: note: ‘foo’ was declared here 8 | char *foo = malloc(30); | ^~~ Best. Sam -- Samuel Tardieu
[PATCH v2 1/4] tests/vm: Set UseDNS=no in the sshd configuration
make vm-build-freebsd sometimes fails with "Connection timed out during banner exchange". The client strace shows: 13:59:30 write(3, "SSH-2.0-OpenSSH_9.3\r\n", 21) = 21 13:59:30 getpid() = 252655 13:59:30 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "S", 1)= 1 13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "S", 1)= 1 13:59:32 poll([{fd=3, events=POLLIN}], 1, 3625) = 1 ([{fd=3, revents=POLLIN}]) 13:59:32 read(3, "H", 1)= 1 There is a 2s delay during connection, and ConnectTimeout is set to 1. Raising it makes the issue go away, but we can do better. The server truss shows: 888: 27.811414714 socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 5 (0x5) 888: 27.811765030 connect(5,{ AF_INET 10.0.2.3:53 },16) = 0 (0x0) 888: 27.812166941 sendto(5,"\^Z/\^A\0\0\^A\0\0\0\0\0\0\^A2"...,39,0,NULL,0) = 39 (0x27) 888: 29.363970743 poll({ 5/POLLRDNORM },1,5000) = 1 (0x1) So the delay is due to a DNS query. Disable DNS queries in the server config. Reviewed-by: Thomas Huth Signed-off-by: Ilya Leoshkevich --- tests/vm/basevm.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index 61725b83254..c0d62c08031 100644 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -396,60 +396,62 @@ def console_consume(self): self.console_log(output) vm.console_socket.setblocking(1) def console_send(self, command): vm = self._guest if self.debug: logline = re.sub("\n", "", command) logline = re.sub("[\x00-\x1f]", ".", logline) sys.stderr.write("con send: %s\n" % logline) for char in list(command): vm.console_socket.send(char.encode("utf-8")) time.sleep(0.01) def console_wait_send(self, wait, command): self.console_wait(wait) self.console_send(command) def console_ssh_init(self, prompt, user, pw): sshkey_cmd = "echo '%s' > .ssh/authorized_keys\n" \ % self._config['ssh_pub_key'].rstrip() self.console_wait_send("login:","%s\n" % user) self.console_wait_send("Password:", "%s\n" % pw) self.console_wait_send(prompt, "mkdir .ssh\n") self.console_wait_send(prompt, sshkey_cmd) self.console_wait_send(prompt, "chmod 755 .ssh\n") self.console_wait_send(prompt, "chmod 644 .ssh/authorized_keys\n") def console_sshd_config(self, prompt): self.console_wait(prompt) self.console_send("echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config\n") +self.console_wait(prompt) +self.console_send("echo 'UseDNS no' >> /etc/ssh/sshd_config\n") for var in self.envvars: self.console_wait(prompt) self.console_send("echo 'AcceptEnv %s' >> /etc/ssh/sshd_config\n" % var) def print_step(self, text): sys.stderr.write("### %s ...\n" % text) def wait_ssh(self, wait_root=False, seconds=300, cmd="exit 0"): # Allow more time for VM to boot under TCG. if not kvm_available(self.arch): seconds *= self.tcg_timeout_multiplier starttime = datetime.datetime.now() endtime = starttime + datetime.timedelta(seconds=seconds) cmd_success = False while datetime.datetime.now() < endtime: if wait_root and self.ssh_root(cmd) == 0: cmd_success = True break elif self.ssh(cmd) == 0: cmd_success = True break seconds = (endtime - datetime.datetime.now()).total_seconds() logging.debug("%ds before timeout", seconds) time.sleep(1) if not cmd_success: raise Exception("Timeout while waiting for guest ssh") def shutdown(self): self._guest.shutdown(timeout=self._shutdown_timeout) -- 2.43.0
[PATCH v2 2/4] tests/vm/freebsd: Reload the sshd configuration
After console_sshd_config(), the SSH server needs to be nudged to pick up the new configs. The scripts for the other BSD flavors already do this with a reboot, but a simple reload is sufficient. Signed-off-by: Ilya Leoshkevich --- tests/vm/freebsd | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/vm/freebsd b/tests/vm/freebsd index b581bd17fb7..1247f40a385 100755 --- a/tests/vm/freebsd +++ b/tests/vm/freebsd @@ -81,50 +81,51 @@ class FreeBSDVM(basevm.BaseVM): self.console_wait("Full name") self.console_send("%s\n" % self._config["guest_user"]) self.console_wait_send("Uid", "\n") self.console_wait_send("Login group", "\n") self.console_wait_send("Login group", "\n") self.console_wait_send("Login class", "\n") self.console_wait_send("Shell", "\n") self.console_wait_send("Home directory","\n") self.console_wait_send("Home directory perm", "\n") self.console_wait_send("Use password", "\n") self.console_wait_send("Use an empty password", "\n") self.console_wait_send("Use a random password", "\n") self.console_wait("Enter password:") self.console_send("%s\n" % self._config["guest_pass"]) self.console_wait("Enter password again:") self.console_send("%s\n" % self._config["guest_pass"]) self.console_wait_send("Lock out", "\n") self.console_wait_send("OK","yes\n") self.console_wait_send("Add another user", "no\n") self.console_wait_send("~ #", "exit\n") # setup qemu user prompt = "$" self.console_ssh_init(prompt, self._config["guest_user"], self._config["guest_pass"]) self.console_wait_send(prompt, "exit\n") # setup root user prompt = "root@freebsd:~ #" self.console_ssh_init(prompt, "root", self._config["root_pass"]) self.console_sshd_config(prompt) +self.console_wait_send(prompt, "service sshd reload\n") # setup virtio-blk #1 (tarfile) self.console_wait(prompt) self.console_send("echo 'chmod 666 /dev/vtbd1' >> /etc/rc.local\n") pkgs = self.get_qemu_packages_from_lcitool_json() self.print_step("Installing packages") self.ssh_root_check("pkg install -y %s\n" % " ".join(pkgs)) # shutdown self.ssh_root(self.poweroff) self.wait() if os.path.exists(img): os.remove(img) os.rename(img_tmp, img) self.print_step("All done") if __name__ == "__main__": sys.exit(basevm.main(FreeBSDVM, config=FREEBSD_CONFIG)) -- 2.43.0
[PATCH v2 4/4] meson: Link with libinotify on FreeBSD
make vm-build-freebsd fails with: ld: error: undefined symbol: inotify_init1 >>> referenced by filemonitor-inotify.c:183 (../src/util/filemonitor-inotify.c:183) >>> util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in archive libqemuutil.a On FreeBSD inotify functions are defined in libinotify.so. Add it to the dependencies. Signed-off-by: Ilya Leoshkevich --- meson.build | 12 +++- util/meson.build | 6 +- 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/meson.build b/meson.build index b5d6dc94a83..819cdedebe2 100644 --- a/meson.build +++ b/meson.build @@ -1982,60 +1982,66 @@ if libbpf.found() and not cc.links(''' #include int main(void) { bpf_object__destroy_skeleton(NULL); return 0; }''', dependencies: libbpf) libbpf = not_found if get_option('bpf').enabled() error('libbpf skeleton test failed') else warning('libbpf skeleton test failed, disabling') endif endif # libxdp libxdp = not_found if not get_option('af_xdp').auto() or have_system libxdp = dependency('libxdp', required: get_option('af_xdp'), version: '>=1.4.0', method: 'pkg-config') endif # libdw libdw = not_found if not get_option('libdw').auto() or \ (not get_option('prefer_static') and (have_system or have_user)) libdw = dependency('libdw', method: 'pkg-config', required: get_option('libdw')) endif +# libinotify-kqueue +inotify = not_found +if host_os == 'freebsd' + inotify = cc.find_library('inotify') +endif + # # config-host.h # # config_host_data = configuration_data() audio_drivers_selected = [] if have_system audio_drivers_available = { 'alsa': alsa.found(), 'coreaudio': coreaudio.found(), 'dsound': dsound.found(), 'jack': jack.found(), 'oss': oss.found(), 'pa': pulse.found(), 'pipewire': pipewire.found(), 'sdl': sdl.found(), 'sndio': sndio.found(), } foreach k, v: audio_drivers_available config_host_data.set('CONFIG_AUDIO_' + k.to_upper(), v) endforeach # Default to native drivers first, OSS second, SDL third audio_drivers_priority = \ [ 'pa', 'coreaudio', 'dsound', 'sndio', 'oss' ] + \ (host_os == 'linux' ? [] : [ 'sdl' ]) audio_drivers_default = [] foreach k: audio_drivers_priority if audio_drivers_available[k] @@ -2376,61 +2382,62 @@ have_asan_fiber = false if get_option('sanitizers') and \ not cc.has_function('__sanitizer_start_switch_fiber', args: '-fsanitize=address', prefix: '#include ') warning('Missing ASAN due to missing fiber annotation interface') warning('Without code annotation, the report may be inferior.') else have_asan_fiber = true endif config_host_data.set('CONFIG_ASAN_IFACE_FIBER', have_asan_fiber) # has_header_symbol config_host_data.set('CONFIG_BLKZONED', cc.has_header_symbol('linux/blkzoned.h', 'BLKOPENZONE')) config_host_data.set('CONFIG_EPOLL_CREATE1', cc.has_header_symbol('sys/epoll.h', 'epoll_create1')) config_host_data.set('CONFIG_FALLOCATE_PUNCH_HOLE', cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_PUNCH_HOLE') and cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_KEEP_SIZE')) config_host_data.set('CONFIG_FALLOCATE_ZERO_RANGE', cc.has_header_symbol('linux/falloc.h', 'FALLOC_FL_ZERO_RANGE')) config_host_data.set('CONFIG_FIEMAP', cc.has_header('linux/fiemap.h') and cc.has_header_symbol('linux/fs.h', 'FS_IOC_FIEMAP')) config_host_data.set('CONFIG_GETRANDOM', cc.has_function('getrandom') and cc.has_header_symbol('sys/random.h', 'GRND_NONBLOCK')) config_host_data.set('CONFIG_INOTIFY', cc.has_header_symbol('sys/inotify.h', 'inotify_init')) config_host_data.set('CONFIG_INOTIFY1', - cc.has_header_symbol('sys/inotify.h', 'inotify_init1')) + cc.has_header_symbol('sys/inotify.h', 'inotify_init1') and + (host_os != 'freebsd' or inotify.found())) config_host_data.set('CONFIG_PRCTL_PR_SET_TIMERSLACK', cc.has_header_symbol('sys/prctl.h', 'PR_SET_TIMERSLACK')) config_host_data.set('CONFIG_RTNETLINK', cc.has_header_symbol('linux/rtnetlink.h', 'IFLA_PROTO_DOWN')) config_host_data.set('CONFIG_SYSMACROS', cc.has_header_symbol('sys/sysmacros.h', 'makedev')) config_host_data.set('HAVE_OPTRESET', cc.has_header_symbol('getopt.h', 'optreset')) config_host_data.set('HAVE_IPPROTO_MPTCP', cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP')) # has_member
[PATCH v2 0/4] make vm-build-freebsd fixes
v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg05155.html v1 -> v2: Link with libinotify instead of disabling the inotify support (Daniel). Use a bit more context lines in order to prevent the incorrect application of the test patch. Hi, I needed to verify that my qemu-user changes didn't break BSD, and Daniel Berrange suggested vm-build-freebsd on IRC. I had several problems with it, which this series resolves. Best regards, Ilya Ilya Leoshkevich (4): tests/vm: Set UseDNS=no in the sshd configuration tests/vm/freebsd: Reload the sshd configuration tests/test-util-filemonitor: Adapt to FreeBSD inotify rename semantics meson: Link with libinotify on FreeBSD meson.build| 12 +++- tests/unit/test-util-filemonitor.c | 8 tests/vm/basevm.py | 2 ++ tests/vm/freebsd | 1 + util/meson.build | 6 +- 5 files changed, 27 insertions(+), 2 deletions(-) -- 2.43.0
Re: [PATCH v2 23/23] migration/multifd: Optimize sender side to be lockless
Peter Xu writes: > On Mon, Feb 05, 2024 at 11:10:34AM -0300, Fabiano Rosas wrote: >> > (maybe I can repost this single patch in-place to avoid another round of >> > mail bombs..) >> >> Sure. > > I've got the final version attached here. Feel free to have a look, thanks. > > > From 6ba337320430feae4ce9d3d906ea19f68430642d Mon Sep 17 00:00:00 2001 > From: Peter Xu > Date: Fri, 2 Feb 2024 18:28:57 +0800 > Subject: [PATCH] migration/multifd: Optimize sender side to be lockless > > When reviewing my attempt to refactor send_prepare(), Fabiano suggested we > try out with dropping the mutex in multifd code [1]. > > I thought about that before but I never tried to change the code. Now > maybe it's time to give it a stab. This only optimizes the sender side. > > The trick here is multifd has a clear provider/consumer model, that the > migration main thread publishes requests (either pending_job/pending_sync), > while the multifd sender threads are consumers. Here we don't have a lot > of complicated data sharing, and the jobs can logically be submitted > lockless. > > Arm the code with atomic weapons. Two things worth mentioning: > > - For multifd_send_pages(): we can use qatomic_load_acquire() when trying > to find a free channel, but that's expensive if we attach one ACQUIRE per > channel. Instead, keep the qatomic_read() on reading the pending_job > flag as we do already, meanwhile use one smp_mb_acquire() after the loop > to guarantee the memory ordering. > > - For pending_sync: it doesn't have any extra data required since now > p->flags are never touched, it should be safe to not use memory barrier. > That's different from pending_job. > > Provide rich comments for all the lockless operations to state how they are > paired. With that, we can remove the mutex. > > [1] https://lore.kernel.org/r/87o7d1jlu5@suse.de > > Suggested-by: Fabiano Rosas > Link: https://lore.kernel.org/r/20240202102857.110210-24-pet...@redhat.com > Signed-off-by: Peter Xu > --- > migration/multifd.h | 2 -- > migration/multifd.c | 51 +++-- > 2 files changed, 26 insertions(+), 27 deletions(-) > > diff --git a/migration/multifd.h b/migration/multifd.h > index 98876ff94a..78a2317263 100644 > --- a/migration/multifd.h > +++ b/migration/multifd.h > @@ -91,8 +91,6 @@ typedef struct { > /* syncs main thread and channels */ > QemuSemaphore sem_sync; > > -/* this mutex protects the following parameters */ > -QemuMutex mutex; > /* is this channel thread running */ > bool running; > /* multifd flags for each packet */ > diff --git a/migration/multifd.c b/migration/multifd.c > index b317d57d61..fbdb129088 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -501,19 +501,19 @@ static bool multifd_send_pages(void) > } > } > > -qemu_mutex_lock(>mutex); > -assert(!p->pages->num); > -assert(!p->pages->block); > /* > - * Double check on pending_job==false with the lock. In the future if > - * we can have >1 requester thread, we can replace this with a "goto > - * retry", but that is for later. > + * Make sure we read p->pending_job before all the rest. Pairs with > + * qatomic_store_release() in multifd_send_thread(). > */ > -assert(qatomic_read(>pending_job) == false); > -qatomic_set(>pending_job, true); > +smp_mb_acquire(); > +assert(!p->pages->num); > multifd_send_state->pages = p->pages; > p->pages = pages; > -qemu_mutex_unlock(>mutex); > +/* > + * Making sure p->pages is setup before marking pending_job=true. Pairs > + * with the qatomic_load_acquire() in multifd_send_thread(). > + */ > +qatomic_store_release(>pending_job, true); > qemu_sem_post(>sem); > > return true; > @@ -648,7 +648,6 @@ static bool > multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) > } > multifd_send_channel_destroy(p->c); > p->c = NULL; > -qemu_mutex_destroy(>mutex); > qemu_sem_destroy(>sem); > qemu_sem_destroy(>sem_sync); > g_free(p->name); > @@ -742,14 +741,12 @@ int multifd_send_sync_main(void) > > trace_multifd_send_sync_main_signal(p->id); > > -qemu_mutex_lock(>mutex); > /* > * We should be the only user so far, so not possible to be set by > * others concurrently. > */ > assert(qatomic_read(>pending_sync) == false); > qatomic_set(>pending_sync, true); > -qemu_mutex_unlock(>mutex); > qemu_sem_post(>sem); > } > for (i = 0; i < migrate_multifd_channels(); i++) { > @@ -796,9 +793,12 @@ static void *multifd_send_thread(void *opaque) > if (multifd_send_should_exit()) { > break; > } > -qemu_mutex_lock(>mutex); > > -if (qatomic_read(>pending_job)) { > +/* > + * Read pending_job flag before p->pages. Pairs with
Re: [GIT PULL 5/8] util: Add write-only "node-affinity" property for ThreadContext
On 05.02.24 17:13, Claudio Fontana wrote: Hello David, Hi, It would seem to me that a lot of the calling code like qemu_prealloc_mem for example should be sysemu-only, not used for tools, or user mode either right? And the thread_context.c itself should also be sysemu-only, correct? Yes, both should currently only get used for sysemu only. Memory backends and virtio-mem are sysemu-only. -- Cheers, David / dhildenb
Re: [PATCH] docs/style: allow C99 mixed declarations
On Mon, 5 Feb 2024 at 17:41, Daniel P. Berrangé wrote: > Rather than accept the status quo and remove the coding guideline, > I think we should strengthen the guidelines, such that it is > explicitly forbidden in any method that uses 'goto'. Personally > I'd go all the way to -Werror=declaration-after-statement, as > while C99 mixed decl is appealing, it isn't exactly a game > changer in improving code maintainability. I definitely think we should either (a) say we're OK with mixed declarations or (b) enforce it via the warning option. (gcc -Wdeclaration-after-statement does not appear to warn for the "declaration inside for()" syntax, so we could enable it if we wanted to fix the existing style lapses.) Personally I prefer declaration-at-top-of-block, but more as an aesthetic choice than anything else. thanks -- PMM
Re: [PATCH] docs/style: allow C99 mixed declarations
On Mon, Feb 05, 2024 at 12:18:19PM -0500, Stefan Hajnoczi wrote: > C99 mixed declarations support interleaving of local variable > declarations and code. > > The coding style "generally" forbids C99 mixed declarations with some > exceptions to the rule. This rule is not checked by checkpatch.pl and > naturally there are violations in the source tree. > > While contemplating adding another exception, I came to the conclusion > that the best location for declarations depends on context. Let the > programmer declare variables where it is best for legibility. Don't try > to define all possible scenarios/exceptions. IIRC, we had a discussion on this topic sometime last year, but can't remember what the $SUBJECT was, so I'll just repeat what I said then. Combining C99 mixed declarations with 'goto' is dangerous. Consider this program: $ cat jump.c #include int main(int argc, char**argv) { if (getenv("SKIP")) goto target; char *foo = malloc(30); target: free(foo); } $ gcc -Wall -Wuninitialized -o jump jump.c $ SKIP=1 ./jump free(): invalid pointer Aborted (core dumped) -> The programmer thinks they have initialized 'foo' -> GCC thinks the programmer has initialized 'foo' -> Yet 'foo' is not guaranteed to be initialized at 'target:' Given that QEMU makes heavy use of 'goto', allowing C99 mixed declarations exposes us to significant danger. Full disclosure, GCC fails to diagnmose this mistake, even with a decl at start of 'main', but at least the mistake is now more visible to the programmer. Fortunately with -fanalyzer GCC can diagnose this: $ gcc -fanalyzer -Wall -o jump jump.c jump.c: In function ‘main’: jump.c:12:3: warning: use of uninitialized value ‘foo’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value] 12 | free(foo); | ^ ‘main’: events 1-5 | |6 | if (getenv("SKIP")) | | ~ | | | | | (3) following ‘true’ branch... |7 | goto target; | | | | | | | (4) ...to here |8 | |9 | char *foo = malloc(30); | |^~~ | || | |(1) region created on stack here | |(2) capacity: 8 bytes |.. | 12 | free(foo); | | ~ | | | | | (5) use of uninitialized value ‘foo’ here ...but -fanalyzer isn't something we have enabled by default, it is opt-in. I'm also not sure how comprehensive the flow control analysis of -fanalyzer is ? Can we be sure it'll catch these mistakes in large complex functions with many code paths ? Even if the compiler does reliably warn, I think the code pattern remains misleading to contributors, as the flow control flaw is very non-obvious. Rather than accept the status quo and remove the coding guideline, I think we should strengthen the guidelines, such that it is explicitly forbidden in any method that uses 'goto'. Personally I'd go all the way to -Werror=declaration-after-statement, as while C99 mixed decl is appealing, it isn't exactly a game changer in improving code maintainability. > Signed-off-by: Stefan Hajnoczi > --- > docs/devel/style.rst | 20 > 1 file changed, 20 deletions(-) > > diff --git a/docs/devel/style.rst b/docs/devel/style.rst > index 2f68b50079..80c4e4df52 100644 > --- a/docs/devel/style.rst > +++ b/docs/devel/style.rst > @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) > bracing style reduces > ambiguity and avoids needless churn when lines are added or removed. > Furthermore, it is the QEMU coding style. > > -Declarations > - > - > -Mixed declarations (interleaving statements and declarations within > -blocks) are generally not allowed; declarations should be at the beginning > -of blocks. To avoid accidental re-use it is permissible to declare > -loop variables inside for loops: > - > -.. code-block:: c > - > -for (int i = 0; i < ARRAY_SIZE(thing); i++) { > -/* do something loopy */ > -} > - > -Every now and then, an exception is made for declarations inside a > -#ifdef or #ifndef block: if the code looks nicer, such declarations can > -be placed at the top of the block even if there are statements above. > -On the other hand, however, it's often best to move that #ifdef/#ifndef > -block to a separate function altogether. > - > Conditional statements > == > > -- > 2.43.0 > With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[PATCH 0/5] virtio-blk: iothread-vq-mapping cleanups
Hanna reviewed the iothread-vq-mapping patches after they were applied to qemu.git. This series consists of code cleanups that Hanna identified. There are no functional changes or bug fixes that need to be backported to the stable tree here, but it may make sense to backport them in the future to avoid conflicts. Stefan Hajnoczi (5): virtio-blk: enforce iothread-vq-mapping validation virtio-blk: clarify that there is at least 1 virtqueue virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb() virtio-blk: declare VirtIOBlock::rq with a type monitor: use aio_co_reschedule_self() include/hw/virtio/virtio-blk.h | 2 +- hw/block/virtio-blk.c | 194 ++--- qapi/qmp-dispatch.c| 7 +- 3 files changed, 112 insertions(+), 91 deletions(-) -- 2.43.0
[PATCH 2/5] virtio-blk: clarify that there is at least 1 virtqueue
It is not possible to instantiate a virtio-blk device with 0 virtqueues. The following check is located in ->realize(): if (!conf->num_queues) { error_setg(errp, "num-queues property must be larger than 0"); return; } Later on we access s->vq_aio_context[0] under the assumption that there is as least one virtqueue. Hanna Czenczek noted that it would help to show that the array index is already valid. Add an assertion to document that s->vq_aio_context[0] is always safe...and catch future code changes that break this assumption. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index e8b37fd5f4..a0735a9bca 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1825,6 +1825,7 @@ static int virtio_blk_start_ioeventfd(VirtIODevice *vdev) * Try to change the AioContext so that block jobs and other operations can * co-locate their activity in the same AioContext. If it fails, nevermind. */ +assert(nvqs > 0); /* enforced during ->realize() */ r = blk_set_aio_context(s->conf.conf.blk, s->vq_aio_context[0], _err); if (r < 0) { -- 2.43.0
[PATCH 3/5] virtio-blk: add vq_rq[] bounds check in virtio_blk_dma_restart_cb()
Hanna Czenczek noted that the array index in virtio_blk_dma_restart_cb() is not bounds-checked: g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues); ... while (rq) { VirtIOBlockReq *next = rq->next; uint16_t idx = virtio_get_queue_index(rq->vq); rq->next = vq_rq[idx]; ^^ The code is correct because both rq->vq and vq_rq[] depend on num_queues, but this is indirect and not 100% obvious. Add an assertion. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index a0735a9bca..f3193f4b75 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1209,6 +1209,7 @@ static void virtio_blk_dma_restart_cb(void *opaque, bool running, VirtIOBlockReq *next = rq->next; uint16_t idx = virtio_get_queue_index(rq->vq); +assert(idx < num_queues); rq->next = vq_rq[idx]; vq_rq[idx] = rq; rq = next; -- 2.43.0
[PATCH 5/5] monitor: use aio_co_reschedule_self()
The aio_co_reschedule_self() API is designed to avoid the race condition between scheduling the coroutine in another AioContext and yielding. The QMP dispatch code uses the open-coded version that appears susceptible to the race condition at first glance: aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self()); qemu_coroutine_yield(); The code is actually safe because the iohandler and qemu_aio_context AioContext run under the Big QEMU Lock. Nevertheless, set a good example and use aio_co_reschedule_self() so it's obvious that there is no race. Suggested-by: Hanna Reitz Signed-off-by: Stefan Hajnoczi --- qapi/qmp-dispatch.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c index 176b549473..f3488afeef 100644 --- a/qapi/qmp-dispatch.c +++ b/qapi/qmp-dispatch.c @@ -212,8 +212,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ * executing the command handler so that it can make progress if it * involves an AIO_WAIT_WHILE(). */ -aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self()); -qemu_coroutine_yield(); +aio_co_reschedule_self(qemu_get_aio_context()); } monitor_set_cur(qemu_coroutine_self(), cur_mon); @@ -227,9 +226,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ * Move back to iohandler_ctx so that nested event loops for * qemu_aio_context don't start new monitor commands. */ -aio_co_schedule(iohandler_get_aio_context(), -qemu_coroutine_self()); -qemu_coroutine_yield(); +aio_co_reschedule_self(iohandler_get_aio_context()); } } else { /* -- 2.43.0
[PATCH 1/5] virtio-blk: enforce iothread-vq-mapping validation
Hanna Czenczek noticed that the safety of `vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is not obvious. The code is structured in validate() + apply() steps so input validation is there, but it happens way earlier and there is nothing that guarantees apply() can only be called with validated inputs. This patch moves the validate() call inside the apply() function so validation is guaranteed. I also added the bounds checking assertion that Hanna suggested. Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 192 +++--- 1 file changed, 107 insertions(+), 85 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 227d83569f..e8b37fd5f4 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1485,6 +1485,72 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f, return 0; } +static void virtio_resize_cb(void *opaque) +{ +VirtIODevice *vdev = opaque; + +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); +virtio_notify_config(vdev); +} + +static void virtio_blk_resize(void *opaque) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(opaque); + +/* + * virtio_notify_config() needs to acquire the BQL, + * so it can't be called from an iothread. Instead, schedule + * it to be run in the main context BH. + */ +aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev); +} + +static void virtio_blk_ioeventfd_detach(VirtIOBlock *s) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(s); + +for (uint16_t i = 0; i < s->conf.num_queues; i++) { +VirtQueue *vq = virtio_get_queue(vdev, i); +virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]); +} +} + +static void virtio_blk_ioeventfd_attach(VirtIOBlock *s) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(s); + +for (uint16_t i = 0; i < s->conf.num_queues; i++) { +VirtQueue *vq = virtio_get_queue(vdev, i); +virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]); +} +} + +/* Suspend virtqueue ioeventfd processing during drain */ +static void virtio_blk_drained_begin(void *opaque) +{ +VirtIOBlock *s = opaque; + +if (s->ioeventfd_started) { +virtio_blk_ioeventfd_detach(s); +} +} + +/* Resume virtqueue ioeventfd processing after drain */ +static void virtio_blk_drained_end(void *opaque) +{ +VirtIOBlock *s = opaque; + +if (s->ioeventfd_started) { +virtio_blk_ioeventfd_attach(s); +} +} + +static const BlockDevOps virtio_block_ops = { +.resize_cb = virtio_blk_resize, +.drained_begin = virtio_blk_drained_begin, +.drained_end = virtio_blk_drained_end, +}; + static bool validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, uint16_t num_queues, Error **errp) @@ -1547,81 +1613,33 @@ validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, return true; } -static void virtio_resize_cb(void *opaque) -{ -VirtIODevice *vdev = opaque; - -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); -virtio_notify_config(vdev); -} - -static void virtio_blk_resize(void *opaque) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(opaque); - -/* - * virtio_notify_config() needs to acquire the BQL, - * so it can't be called from an iothread. Instead, schedule - * it to be run in the main context BH. - */ -aio_bh_schedule_oneshot(qemu_get_aio_context(), virtio_resize_cb, vdev); -} - -static void virtio_blk_ioeventfd_detach(VirtIOBlock *s) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(s); - -for (uint16_t i = 0; i < s->conf.num_queues; i++) { -VirtQueue *vq = virtio_get_queue(vdev, i); -virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]); -} -} - -static void virtio_blk_ioeventfd_attach(VirtIOBlock *s) -{ -VirtIODevice *vdev = VIRTIO_DEVICE(s); - -for (uint16_t i = 0; i < s->conf.num_queues; i++) { -VirtQueue *vq = virtio_get_queue(vdev, i); -virtio_queue_aio_attach_host_notifier(vq, s->vq_aio_context[i]); -} -} - -/* Suspend virtqueue ioeventfd processing during drain */ -static void virtio_blk_drained_begin(void *opaque) -{ -VirtIOBlock *s = opaque; - -if (s->ioeventfd_started) { -virtio_blk_ioeventfd_detach(s); -} -} - -/* Resume virtqueue ioeventfd processing after drain */ -static void virtio_blk_drained_end(void *opaque) -{ -VirtIOBlock *s = opaque; - -if (s->ioeventfd_started) { -virtio_blk_ioeventfd_attach(s); -} -} - -static const BlockDevOps virtio_block_ops = { -.resize_cb = virtio_blk_resize, -.drained_begin = virtio_blk_drained_begin, -.drained_end = virtio_blk_drained_end, -}; - -/* Generate vq:AioContext mappings from a validated iothread-vq-mapping list */ -static void -apply_vq_mapping(IOThreadVirtQueueMappingList *iothread_vq_mapping_list, - AioContext **vq_aio_context, uint16_t
[PATCH 4/5] virtio-blk: declare VirtIOBlock::rq with a type
The VirtIOBlock::rq field has had the type void * since its introduction in commit 869a5c6df19a ("Stop VM on error in virtio-blk. (Gleb Natapov)"). Perhaps this was done to avoid the forward declaration of VirtIOBlockReq. Hanna Czenczek pointed out the missing type. Specify the actual type because there is no need to use void * here. Suggested-by: Hanna Czenczek Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-blk.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index 833a9a344f..5c14110c4b 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -55,7 +55,7 @@ struct VirtIOBlock { VirtIODevice parent_obj; BlockBackend *blk; QemuMutex rq_lock; -void *rq; /* protected by rq_lock */ +struct VirtIOBlockReq *rq; /* protected by rq_lock */ VirtIOBlkConf conf; unsigned short sector_mask; bool original_wce; -- 2.43.0
[PATCH] docs/style: allow C99 mixed declarations
C99 mixed declarations support interleaving of local variable declarations and code. The coding style "generally" forbids C99 mixed declarations with some exceptions to the rule. This rule is not checked by checkpatch.pl and naturally there are violations in the source tree. While contemplating adding another exception, I came to the conclusion that the best location for declarations depends on context. Let the programmer declare variables where it is best for legibility. Don't try to define all possible scenarios/exceptions. Signed-off-by: Stefan Hajnoczi --- docs/devel/style.rst | 20 1 file changed, 20 deletions(-) diff --git a/docs/devel/style.rst b/docs/devel/style.rst index 2f68b50079..80c4e4df52 100644 --- a/docs/devel/style.rst +++ b/docs/devel/style.rst @@ -199,26 +199,6 @@ Rationale: a consistent (except for functions...) bracing style reduces ambiguity and avoids needless churn when lines are added or removed. Furthermore, it is the QEMU coding style. -Declarations - - -Mixed declarations (interleaving statements and declarations within -blocks) are generally not allowed; declarations should be at the beginning -of blocks. To avoid accidental re-use it is permissible to declare -loop variables inside for loops: - -.. code-block:: c - -for (int i = 0; i < ARRAY_SIZE(thing); i++) { -/* do something loopy */ -} - -Every now and then, an exception is made for declarations inside a -#ifdef or #ifndef block: if the code looks nicer, such declarations can -be placed at the top of the block even if there are statements above. -On the other hand, however, it's often best to move that #ifdef/#ifndef -block to a separate function altogether. - Conditional statements == -- 2.43.0
Re: [PATCH v4 2/4] target/s390x: Emulate CVB, CVBY and CVBG
On 02/02/2024 15.11, Ilya Leoshkevich wrote: Convert to Binary - counterparts of the already implemented Convert to Decimal (CVD*) instructions. Example from the Principles of Operation: 25594C becomes 63FA. Co-developed-by: Pavel Zbitskiy Signed-off-by: Ilya Leoshkevich --- target/s390x/helper.h| 2 + target/s390x/tcg/insn-data.h.inc | 4 ++ target/s390x/tcg/int_helper.c| 72 target/s390x/tcg/translate.c | 16 +++ 4 files changed, 94 insertions(+) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 332a9a9c632..cc1c20e9e3f 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -88,6 +88,8 @@ DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64) DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128) +DEF_HELPER_3(cvb, void, env, i32, i64) +DEF_HELPER_FLAGS_2(cvbg, TCG_CALL_NO_WG, i64, env, i128) DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32) DEF_HELPER_FLAGS_1(cvdg, TCG_CALL_NO_RWG_SE, i128, s64) DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 388dcb8dbbc..e7d61cdec28 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -293,6 +293,10 @@ D(0xec73, CLFIT, RIE_a, GIE, r1_32u, i2_16u, 0, 0, ct, 0, 1) D(0xec71, CLGIT, RIE_a, GIE, r1_o, i2_16u, 0, 0, ct, 0, 1) +/* CONVERT TO BINARY */ +C(0x4f00, CVB, RX_a, Z, la2, 0, 0, 0, cvb, 0) +C(0xe306, CVBY,RXY_a, LD, la2, 0, 0, 0, cvb, 0) +C(0xe30e, CVBG,RXY_a, Z, la2, 0, r1, 0, cvbg, 0) /* CONVERT TO DECIMAL */ C(0x4e00, CVD, RX_a, Z, r1_o, a2, 0, 0, cvd, 0) C(0xe326, CVDY,RXY_a, LD, r1_o, a2, 0, 0, cvd, 0) diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c index 121e3006a65..17974375e98 100644 --- a/target/s390x/tcg/int_helper.c +++ b/target/s390x/tcg/int_helper.c @@ -25,6 +25,7 @@ #include "exec/exec-all.h" #include "qemu/host-utils.h" #include "exec/helper-proto.h" +#include "exec/cpu_ldst.h" /* #define DEBUG_HELPER */ #ifdef DEBUG_HELPER @@ -98,6 +99,77 @@ Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t b) tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } +void HELPER(cvb)(CPUS390XState *env, uint32_t r1, uint64_t dec) +{ +int64_t pow10 = 1, bin = 0; +int digit, sign; + +sign = dec & 0xf; +if (sign < 0xa) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec >>= 4; + +while (dec) { +digit = dec & 0xf; +if (digit > 0x9) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec >>= 4; +bin += digit * pow10; +pow10 *= 10; +} + +if (sign == 0xb || sign == 0xd) { +bin = -bin; +} + +/* R1 is updated even on fixed-point-divide exception. */ +env->regs[r1] = (env->regs[r1] & 0xULL) | (uint32_t)bin; +if (bin != (int32_t)bin) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} +} + +uint64_t HELPER(cvbg)(CPUS390XState *env, Int128 dec) +{ +uint64_t dec64[] = {int128_getlo(dec), int128_gethi(dec)}; +int64_t bin = 0, pow10, tmp; +int digit, i, sign; + +sign = dec64[0] & 0xf; +if (sign < 0xa) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec64[0] >>= 4; +pow10 = (sign == 0xb || sign == 0xd) ? -1 : 1; + +for (i = 1; i < 20; i++) { +digit = dec64[i >> 4] & 0xf; +if (digit > 0x9) { +tcg_s390_data_exception(env, 0, GETPC()); +} +dec64[i >> 4] >>= 4; +tmp = pow10 * digit; +if (digit && ((tmp ^ pow10) < 0)) { That tmp ^ pow10 caused some frowning for me first, but it's just a check whether the sign changed, right? ... a comment in front of the line might be helpful. +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} +tmp = bin + tmp; +if (bin && ((tmp ^ bin) < 0)) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} +bin = tmp; +pow10 *= 10; +} + +g_assert(!dec64[0]); +if (dec64[1]) { +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +} + +return bin; +} Patch looks sane to me now, but I'd appreciate if Richard and/or David could have a look at this, too! Thomas
[PATCH v9 4/5] qmp: Added new command to retrieve eBPF blob.
Now, the binary objects may be retrieved by id. It would require for future qmp commands that may require specific eBPF blob. Added command "request-ebpf". This command returns eBPF program encoded base64. The program taken from the skeleton and essentially is an ELF object that can be loaded in the future with libbpf. The reason to use the command to provide the eBPF object instead of a separate artifact was to avoid issues related to finding the eBPF itself. eBPF object is an ELF binary that contains the eBPF program and eBPF map description(BTF). Overall, eBPF object should contain the program and enough metadata to create/load eBPF with libbpf. As the eBPF maps/program should correspond to QEMU, the eBPF can't be used from different QEMU build. The first solution was a helper that comes with QEMU and loads appropriate eBPF objects. And the issue is to find a proper helper if the system has several different QEMUs installed and/or built from the source, which helpers may not be compatible. Another issue is QEMU updating while there is a running QEMU instance. With an updated helper, it may not be possible to hotplug virtio-net device to the already running QEMU. Overall, requesting the eBPF object from QEMU itself solves possible failures with acceptable effort. Links: [PATCH 3/5] qmp: Added the helper stamp check. https://lore.kernel.org/all/20230219162100.174318-4-and...@daynix.com/ Signed-off-by: Andrew Melnychenko --- ebpf/ebpf.c | 69 +++ ebpf/ebpf.h | 29 ++ ebpf/ebpf_rss.c | 6 ebpf/meson.build | 2 +- qapi/ebpf.json| 66 + qapi/meson.build | 1 + qapi/qapi-schema.json | 1 + 7 files changed, 173 insertions(+), 1 deletion(-) create mode 100644 ebpf/ebpf.c create mode 100644 ebpf/ebpf.h create mode 100644 qapi/ebpf.json diff --git a/ebpf/ebpf.c b/ebpf/ebpf.c new file mode 100644 index 00..2d73beb479 --- /dev/null +++ b/ebpf/ebpf.c @@ -0,0 +1,69 @@ +/* + * QEMU eBPF binary declaration routine. + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Andrew Melnychenko + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qemu/queue.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-ebpf.h" +#include "ebpf/ebpf.h" + +typedef struct ElfBinaryDataEntry { +int id; +const void *data; +size_t datalen; + +QSLIST_ENTRY(ElfBinaryDataEntry) node; +} ElfBinaryDataEntry; + +static QSLIST_HEAD(, ElfBinaryDataEntry) ebpf_elf_obj_list = +QSLIST_HEAD_INITIALIZER(); + +void ebpf_register_binary_data(int id, const void *data, size_t datalen) +{ +struct ElfBinaryDataEntry *dataentry = NULL; + +dataentry = g_new0(struct ElfBinaryDataEntry, 1); +dataentry->data = data; +dataentry->datalen = datalen; +dataentry->id = id; + +QSLIST_INSERT_HEAD(_elf_obj_list, dataentry, node); +} + +const void *ebpf_find_binary_by_id(int id, size_t *sz, Error **errp) +{ +struct ElfBinaryDataEntry *it = NULL; +QSLIST_FOREACH(it, _elf_obj_list, node) { +if (id == it->id) { +*sz = it->datalen; +return it->data; +} +} + +error_setg(errp, "can't find eBPF object with id: %d", id); + +return NULL; +} + +EbpfObject *qmp_request_ebpf(EbpfProgramID id, Error **errp) +{ +EbpfObject *ret = NULL; +size_t size = 0; +const void *data = ebpf_find_binary_by_id(id, , errp); +if (!data) { +return NULL; +} + +ret = g_new0(EbpfObject, 1); +ret->object = g_base64_encode(data, size); + +return ret; +} diff --git a/ebpf/ebpf.h b/ebpf/ebpf.h new file mode 100644 index 00..378d4e9c70 --- /dev/null +++ b/ebpf/ebpf.h @@ -0,0 +1,29 @@ +/* + * QEMU eBPF binary declaration routine. + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Andrew Melnychenko + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef EBPF_H +#define EBPF_H + + +void ebpf_register_binary_data(int id, const void *data, + size_t datalen); +const void *ebpf_find_binary_by_id(int id, size_t *sz, + struct Error **errp); + +#define ebpf_binary_init(id, fn) \ +static void __attribute__((constructor)) ebpf_binary_init_ ## fn(void) \ +{ \ +size_t datalen = 0;\ +const void *data = fn(); \ +ebpf_register_binary_data(id, data, datalen); \ +} + +#endif /* EBPF_H */ diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index 150aa40813..c8a68da2c5 100644 --- a/ebpf/ebpf_rss.c +++ b/ebpf/ebpf_rss.c @@ -13,6 +13,8 @@ #include
[PATCH v9 2/5] ebpf: Added eBPF initialization by fds.
It allows using file descriptors of eBPF provided outside of QEMU. QEMU may be run without capabilities for eBPF and run RSS program provided by management tool(g.e. libvirt). Signed-off-by: Andrew Melnychenko --- ebpf/ebpf_rss-stub.c | 6 ++ ebpf/ebpf_rss.c | 27 +++ ebpf/ebpf_rss.h | 5 + 3 files changed, 38 insertions(+) diff --git a/ebpf/ebpf_rss-stub.c b/ebpf/ebpf_rss-stub.c index e71e229190..8d7fae2ad9 100644 --- a/ebpf/ebpf_rss-stub.c +++ b/ebpf/ebpf_rss-stub.c @@ -28,6 +28,12 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx) return false; } +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd) +{ +return false; +} + bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config, uint16_t *indirections_table, uint8_t *toeplitz_key) { diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index f774d9636b..150aa40813 100644 --- a/ebpf/ebpf_rss.c +++ b/ebpf/ebpf_rss.c @@ -146,6 +146,33 @@ error: return false; } +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd) +{ +if (ebpf_rss_is_loaded(ctx)) { +return false; +} + +if (program_fd < 0 || config_fd < 0 || toeplitz_fd < 0 || table_fd < 0) { +return false; +} + +ctx->program_fd = program_fd; +ctx->map_configuration = config_fd; +ctx->map_toeplitz_key = toeplitz_fd; +ctx->map_indirections_table = table_fd; + +if (!ebpf_rss_mmap(ctx)) { +ctx->program_fd = -1; +ctx->map_configuration = -1; +ctx->map_toeplitz_key = -1; +ctx->map_indirections_table = -1; +return false; +} + +return true; +} + static bool ebpf_rss_set_config(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config) { diff --git a/ebpf/ebpf_rss.h b/ebpf/ebpf_rss.h index ab08a7266d..239242b0d2 100644 --- a/ebpf/ebpf_rss.h +++ b/ebpf/ebpf_rss.h @@ -14,6 +14,8 @@ #ifndef QEMU_EBPF_RSS_H #define QEMU_EBPF_RSS_H +#define EBPF_RSS_MAX_FDS 4 + struct EBPFRSSContext { void *obj; int program_fd; @@ -41,6 +43,9 @@ bool ebpf_rss_is_loaded(struct EBPFRSSContext *ctx); bool ebpf_rss_load(struct EBPFRSSContext *ctx); +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd); + bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config, uint16_t *indirections_table, uint8_t *toeplitz_key); -- 2.43.0
[PATCH v9 0/5] eBPF RSS through QMP support.
This series of patches provides the ability to retrieve eBPF program through qmp, so management application may load bpf blob with proper capabilities. Now, virtio-net devices can accept eBPF programs and maps through properties as external file descriptors. Access to the eBPF map is direct through mmap() call, so it should not require additional capabilities to bpf* calls. eBPF file descriptors can be passed to QEMU from parent process or by unix socket with sendfd() qmp command. Changes since v8: * rebased and refactored QMP interface * license SPDX id only for new files Changes since v7: * rebased and refactored * used SPDX license identifier * used DEFINE_PROP_ARRAY() for virtio-net "ebpf-rss-fds" property Changes since v6: * added comments to ebpf.json * added libbpf version requirements to meson script with BPF_F_MMAPABLE check Changes since v5: * refactored ebpf.json Changes since v4: * refactored commit hunks * added explicit BPF_F_MMAPABLE declaration Changes since v3: * fixed issue with the build if bpf disabled * rebased to the last master * refactored according to review Changes since v2: * moved/refactored QMP command * refactored virtio-net Changes since v1: * refactored virtio-net * moved hunks for ebpf mmap() * added qmp enum for eBPF id. Andrew Melnychenko (5): ebpf: Added eBPF map update through mmap. ebpf: Added eBPF initialization by fds. virtio-net: Added property to load eBPF RSS with fds. qmp: Added new command to retrieve eBPF blob. ebpf: Updated eBPF program and skeleton. ebpf/ebpf.c| 69 ++ ebpf/ebpf.h| 29 + ebpf/ebpf_rss-stub.c |6 + ebpf/ebpf_rss.c| 150 +++- ebpf/ebpf_rss.h| 10 + ebpf/meson.build |2 +- ebpf/rss.bpf.skeleton.h| 1343 hw/net/virtio-net.c| 54 +- include/hw/virtio/virtio-net.h |2 + meson.build| 10 +- qapi/ebpf.json | 66 ++ qapi/meson.build |1 + qapi/qapi-schema.json |1 + tools/ebpf/rss.bpf.c |7 +- 14 files changed, 1047 insertions(+), 703 deletions(-) create mode 100644 ebpf/ebpf.c create mode 100644 ebpf/ebpf.h create mode 100644 qapi/ebpf.json -- 2.43.0