Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang wrote: > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin wrote: > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" > > > wrote: > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" > > > > > wrote: > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" > > > > > > > wrote: > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" > > > > > > > > > wrote: > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > Check whether it is per-queue reset state in > > > > > > > > > > > virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx > > > > > > > > > > > resources. At this > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not > > > > > > > > > > > try to send > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the > > > > > > > > > > > current > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > Do you in fact mean it's called from > > > > > > > > > > flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > k->queue_reset > > > > > > > > > virtio_net_queue_reset > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > . > > > > > > > > > (callback) > > > > > > > > > virtio_net_tx_complete > > > > > > > > > virtio_net_flush_tx > > > > > > > > > <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information > > > > > > > > > through the stack. I > > > > > > > > > originally thought it was a general situation, so I wanted to > > > > > > > > > put it in > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in > > > > > > > > > VirtIONetQueue. > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > We want something much more local, ideally on stack even ... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > Reported-by: Alexander Bulekov > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > > > > > --- > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t > > > > > > > > > > > virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > int queue_index = > > > > > > > > > > > vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > -if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > +if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > +virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not > > > > > > > > virtio_net_tx_complete? > > > > > > > > > > > > > > vir
[PULL 1/5] migration: Fix migration crash when target psize larger than host
From: Peter Xu Commit d9e474ea56 overlooked the case where the target psize is even larger than the host psize. One example is Alpha has 8K page size and migration will start to crash the source QEMU when running Alpha migration on x86. Fix it by detecting that case and set host start/end just to cover the single page to be migrated. This will slightly optimize the common case where host psize equals to guest psize so we don't even need to do the roundups, but that's trivial. Cc: qemu-sta...@nongnu.org Reported-by: Thomas Huth Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1456 Fixes: d9e474ea56 ("migration: Teach PSS about host page") Signed-off-by: Peter Xu Reviewed-by: Thomas Huth Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 334309f1c6..68a45338e3 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2319,8 +2319,25 @@ static void pss_host_page_prepare(PageSearchStatus *pss) size_t guest_pfns = qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS; pss->host_page_sending = true; -pss->host_page_start = ROUND_DOWN(pss->page, guest_pfns); -pss->host_page_end = ROUND_UP(pss->page + 1, guest_pfns); +if (guest_pfns <= 1) { +/* + * This covers both when guest psize == host psize, or when guest + * has larger psize than the host (guest_pfns==0). + * + * For the latter, we always send one whole guest page per + * iteration of the host page (example: an Alpha VM on x86 host + * will have guest psize 8K while host psize 4K). + */ +pss->host_page_start = pss->page; +pss->host_page_end = pss->page + 1; +} else { +/* + * The host page spans over multiple guest pages, we send them + * within the same host page iteration. + */ +pss->host_page_start = ROUND_DOWN(pss->page, guest_pfns); +pss->host_page_end = ROUND_UP(pss->page + 1, guest_pfns); +} } /* -- 2.39.1
[PULL 2/5] migration: No save_live_pending() method uses the QEMUFile parameter
So remove it everywhere. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- include/migration/register.h | 2 +- migration/savevm.h | 2 +- hw/s390x/s390-stattrib.c | 3 +-- hw/vfio/migration.c| 2 +- migration/block-dirty-bitmap.c | 2 +- migration/block.c | 2 +- migration/migration.c | 2 +- migration/ram.c| 2 +- migration/savevm.c | 4 ++-- 9 files changed, 10 insertions(+), 11 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index c1dcff0f90..6ca71367af 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -46,7 +46,7 @@ typedef struct SaveVMHandlers { /* This runs outside the iothread lock! */ int (*save_setup)(QEMUFile *f, void *opaque); -void (*save_live_pending)(QEMUFile *f, void *opaque, +void (*save_live_pending)(void *opaque, uint64_t threshold_size, uint64_t *res_precopy_only, uint64_t *res_compatible, diff --git a/migration/savevm.h b/migration/savevm.h index 6461342cb4..524cf12f25 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -40,7 +40,7 @@ void qemu_savevm_state_cleanup(void); void qemu_savevm_state_complete_postcopy(QEMUFile *f); int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only, bool inactivate_disks); -void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size, +void qemu_savevm_state_pending(uint64_t max_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only); diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c index 9eda1c3b2a..b7f12ec4e2 100644 --- a/hw/s390x/s390-stattrib.c +++ b/hw/s390x/s390-stattrib.c @@ -182,8 +182,7 @@ static int cmma_save_setup(QEMUFile *f, void *opaque) return 0; } -static void cmma_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, - uint64_t *res_precopy_only, +static void cmma_save_pending(uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only) { diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index c74453e0b5..b2125c7607 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -456,7 +456,7 @@ static void vfio_save_cleanup(void *opaque) trace_vfio_save_cleanup(vbasedev->name); } -static void vfio_save_pending(QEMUFile *f, void *opaque, +static void vfio_save_pending(void *opaque, uint64_t threshold_size, uint64_t *res_precopy_only, uint64_t *res_compatible, diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 15127d489a..c27ef9b033 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -762,7 +762,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque) return 0; } -static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque, +static void dirty_bitmap_save_pending(void *opaque, uint64_t max_size, uint64_t *res_precopy_only, uint64_t *res_compatible, diff --git a/migration/block.c b/migration/block.c index 5da15a62de..47852b8d58 100644 --- a/migration/block.c +++ b/migration/block.c @@ -863,7 +863,7 @@ static int block_save_complete(QEMUFile *f, void *opaque) return 0; } -static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, +static void block_save_pending(void *opaque, uint64_t max_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only) diff --git a/migration/migration.c b/migration/migration.c index 52b5d39244..76524cc56e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3751,7 +3751,7 @@ static MigIterateState migration_iteration_run(MigrationState *s) uint64_t pending_size, pend_pre, pend_compat, pend_post; bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE; -qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre, +qemu_savevm_state_pending(s->threshold_size, &pend_pre, &pend_compat, &pend_post); pending_size = pend_pre + pend_compat + pend_post; diff --git a/migration/ram.c b/migration/ram.c index 68a45338e3..389739f162 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3409,7 +3409,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque) return 0; } -static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, +static void ra
[PULL 3/5] migration: Split save_live_pending() into state_pending_*
We split the function into to: - state_pending_estimate: We estimate the remaining state size without stopping the machine. - state pending_exact: We calculate the exact amount of remaining state. The only "device" that implements different functions for _estimate() and _exact() is ram. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- docs/devel/migration.rst | 18 --- docs/devel/vfio-migration.rst | 4 ++-- include/migration/register.h | 19 +-- migration/savevm.h | 12 ++ hw/s390x/s390-stattrib.c | 10 hw/vfio/migration.c| 21 + migration/block-dirty-bitmap.c | 15 ++-- migration/block.c | 13 ++- migration/migration.c | 20 +++- migration/ram.c| 35 migration/savevm.c | 42 +++--- hw/vfio/trace-events | 2 +- migration/trace-events | 7 +++--- 13 files changed, 143 insertions(+), 75 deletions(-) diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst index 3e9656d8e0..6f65c23b47 100644 --- a/docs/devel/migration.rst +++ b/docs/devel/migration.rst @@ -482,15 +482,17 @@ An iterative device must provide: - A ``load_setup`` function that initialises the data structures on the destination. - - A ``save_live_pending`` function that is called repeatedly and must -indicate how much more data the iterative data must save. The core -migration code will use this to determine when to pause the CPUs -and complete the migration. + - A ``state_pending_exact`` function that indicates how much more +data we must save. The core migration code will use this to +determine when to pause the CPUs and complete the migration. - - A ``save_live_iterate`` function (called after ``save_live_pending`` -when there is significant data still to be sent). It should send -a chunk of data until the point that stream bandwidth limits tell it -to stop. Each call generates one section. + - A ``state_pending_estimate`` function that indicates how much more +data we must save. When the estimated amount is smaller than the +threshold, we call ``state_pending_exact``. + + - A ``save_live_iterate`` function should send a chunk of data until +the point that stream bandwidth limits tell it to stop. Each call +generates one section. - A ``save_live_complete_precopy`` function that must transmit the last section for the device containing any remaining data. diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst index 9ff6163c88..673057c90d 100644 --- a/docs/devel/vfio-migration.rst +++ b/docs/devel/vfio-migration.rst @@ -28,7 +28,7 @@ VFIO implements the device hooks for the iterative approach as follows: * A ``load_setup`` function that sets up the migration region on the destination and sets _RESUMING flag in the VFIO device state. -* A ``save_live_pending`` function that reads pending_bytes from the vendor +* A ``state_pending_exact`` function that reads pending_bytes from the vendor driver, which indicates the amount of data that the vendor driver has yet to save for the VFIO device. @@ -114,7 +114,7 @@ Live migration save path (RUNNING, _SETUP, _RUNNING|_SAVING) | (RUNNING, _ACTIVE, _RUNNING|_SAVING) - If device is active, get pending_bytes by .save_live_pending() + If device is active, get pending_bytes by .state_pending_exact() If total pending_bytes >= threshold_size, call .save_live_iterate() Data of VFIO device for pre-copy phase is copied Iterate till total pending bytes converge and are less than threshold diff --git a/include/migration/register.h b/include/migration/register.h index 6ca71367af..15cf32994d 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -46,11 +46,6 @@ typedef struct SaveVMHandlers { /* This runs outside the iothread lock! */ int (*save_setup)(QEMUFile *f, void *opaque); -void (*save_live_pending)(void *opaque, - uint64_t threshold_size, - uint64_t *res_precopy_only, - uint64_t *res_compatible, - uint64_t *res_postcopy_only); /* Note for save_live_pending: * - res_precopy_only is for data which must be migrated in precopy phase * or in stopped state, in other words - before target vm start @@ -61,8 +56,18 @@ typedef struct SaveVMHandlers { * Sum of res_postcopy_only, res_compatible and res_postcopy_only is the * whole amount of pending data. */ - - +/* This estimates the remaining data to transfer */ +void (*state_pending_estimate)(void *opaque, +
[PULL 0/5] Next patches
The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a: Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2023-01-24 09:45:33 +) are available in the Git repository at: https://gitlab.com/juan.quintela/qemu.git tags/next-pull-request for you to fetch changes up to 07181bd32c39007855e25eb5675e9210e9ccd8da: migration: simplify migration_iteration_run() (2023-01-30 08:55:04 +0100) Migration PULL request Hi - Peter Xu fix for crash on hosts with guest page bigger than host - The parts that are reviewed of my vfio series Please, apply. Juan Quintela (4): migration: No save_live_pending() method uses the QEMUFile parameter migration: Split save_live_pending() into state_pending_* migration: Remove unused threshold_size parameter migration: simplify migration_iteration_run() Peter Xu (1): migration: Fix migration crash when target psize larger than host docs/devel/migration.rst | 18 ++- docs/devel/vfio-migration.rst | 4 +-- include/migration/register.h | 17 ++- migration/savevm.h | 10 +++--- hw/s390x/s390-stattrib.c | 11 --- hw/vfio/migration.c| 20 ++-- migration/block-dirty-bitmap.c | 14 - migration/block.c | 13 migration/migration.c | 42 ++--- migration/ram.c| 56 +++--- migration/savevm.c | 39 ++- hw/vfio/trace-events | 2 +- migration/trace-events | 7 +++-- 13 files changed, 163 insertions(+), 90 deletions(-) -- 2.39.1
[PULL 4/5] migration: Remove unused threshold_size parameter
Until previous commit, save_live_pending() was used for ram. Now with the split into state_pending_estimate() and state_pending_exact() it is not needed anymore, so remove them. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- include/migration/register.h | 2 -- migration/savevm.h | 6 ++ hw/vfio/migration.c| 1 - migration/block-dirty-bitmap.c | 1 - migration/block.c | 2 +- migration/migration.c | 10 -- migration/ram.c| 4 ++-- migration/savevm.c | 11 --- migration/trace-events | 4 ++-- 9 files changed, 15 insertions(+), 26 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index 15cf32994d..b91a0cdbf8 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -58,13 +58,11 @@ typedef struct SaveVMHandlers { */ /* This estimates the remaining data to transfer */ void (*state_pending_estimate)(void *opaque, - uint64_t threshold_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only); /* This calculate the exact remaining data to transfer */ void (*state_pending_exact)(void *opaque, -uint64_t threshold_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only); diff --git a/migration/savevm.h b/migration/savevm.h index 5d2cff4411..b1901e68d5 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -40,12 +40,10 @@ void qemu_savevm_state_cleanup(void); void qemu_savevm_state_complete_postcopy(QEMUFile *f); int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only, bool inactivate_disks); -void qemu_savevm_state_pending_exact(uint64_t threshold_size, - uint64_t *res_precopy_only, +void qemu_savevm_state_pending_exact(uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only); -void qemu_savevm_state_pending_estimate(uint64_t thershold_size, -uint64_t *res_precopy_only, +void qemu_savevm_state_pending_estimate(uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only); void qemu_savevm_send_ping(QEMUFile *f, uint32_t value); diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index c49ca466d4..b3318f0f20 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -457,7 +457,6 @@ static void vfio_save_cleanup(void *opaque) } static void vfio_state_pending(void *opaque, - uint64_t threshold_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only) diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 6fac9fb34f..5a621419d3 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -763,7 +763,6 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque) } static void dirty_bitmap_state_pending(void *opaque, - uint64_t max_size, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only) diff --git a/migration/block.c b/migration/block.c index 544e74e9c5..29f69025af 100644 --- a/migration/block.c +++ b/migration/block.c @@ -863,7 +863,7 @@ static int block_save_complete(QEMUFile *f, void *opaque) return 0; } -static void block_state_pending(void *opaque, uint64_t max_size, +static void block_state_pending(void *opaque, uint64_t *res_precopy_only, uint64_t *res_compatible, uint64_t *res_postcopy_only) diff --git a/migration/migration.c b/migration/migration.c index e7b4b94348..594a42f085 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3751,18 +3751,16 @@ static MigIterateState migration_iteration_run(MigrationState *s) uint64_t pend_pre, pend_compat, pend_post; bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE; -qemu_savevm_state_pending_estimate(s->threshold_size, &pend_pre, - &pend_compat, &pend_post); +qemu_savevm_state_pending_estimate(&pend_pre, &pend_compat, &pend_post); uint64_t pending_size = pend_pre + pend_compat
[PULL 5/5] migration: simplify migration_iteration_run()
Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/migration.c | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 594a42f085..644c61e91d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3764,23 +3764,23 @@ static MigIterateState migration_iteration_run(MigrationState *s) pend_pre, pend_compat, pend_post); } -if (pending_size && pending_size >= s->threshold_size) { -/* Still a significant amount to transfer */ -if (!in_postcopy && pend_pre <= s->threshold_size && -qatomic_read(&s->start_postcopy)) { -if (postcopy_start(s)) { -error_report("%s: postcopy failed to start", __func__); -} -return MIG_ITERATE_SKIP; -} -/* Just another iteration step */ -qemu_savevm_state_iterate(s->to_dst_file, in_postcopy); -} else { +if (pending_size < s->threshold_size) { trace_migration_thread_low_pending(pending_size); migration_completion(s); return MIG_ITERATE_BREAK; } +/* Still a significant amount to transfer */ +if (!in_postcopy && pend_pre <= s->threshold_size && +qatomic_read(&s->start_postcopy)) { +if (postcopy_start(s)) { +error_report("%s: postcopy failed to start", __func__); +} +return MIG_ITERATE_SKIP; +} + +/* Just another iteration step */ +qemu_savevm_state_iterate(s->to_dst_file, in_postcopy); return MIG_ITERATE_RESUME; } -- 2.39.1
[PATCH v3] migration: Remove res_compatible parameter
It was only used for RAM, and in that case, it means that this amount of data was sent for memory. Just delete the field in all callers. [Rest of the pages on the vfio series are already on my previous pull request] Signed-off-by: Juan Quintela --- include/migration/register.h | 27 --- migration/savevm.h | 10 -- hw/s390x/s390-stattrib.c | 8 +++- hw/vfio/migration.c| 10 -- migration/block-dirty-bitmap.c | 7 +++ migration/block.c | 8 +++- migration/migration.c | 18 -- migration/ram.c| 20 migration/savevm.c | 28 ++-- hw/vfio/trace-events | 2 +- migration/trace-events | 4 ++-- 11 files changed, 58 insertions(+), 84 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index b91a0cdbf8..5ef6b903ed 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -47,25 +47,22 @@ typedef struct SaveVMHandlers { /* This runs outside the iothread lock! */ int (*save_setup)(QEMUFile *f, void *opaque); /* Note for save_live_pending: - * - res_precopy_only is for data which must be migrated in precopy phase - * or in stopped state, in other words - before target vm start - * - res_compatible is for data which may be migrated in any phase - * - res_postcopy_only is for data which must be migrated in postcopy phase - * or in stopped state, in other words - after source vm stop + * - res_precopy is for data which must be migrated in precopy + * phase or in stopped state, in other words - before target + * vm start + * - res_postcopy is for data which must be migrated in postcopy + * phase or in stopped state, in other words - after source vm + * stop * - * Sum of res_postcopy_only, res_compatible and res_postcopy_only is the - * whole amount of pending data. + * Sum of res_precopy and res_postcopy is the whole amount of + * pending data. */ /* This estimates the remaining data to transfer */ -void (*state_pending_estimate)(void *opaque, - uint64_t *res_precopy_only, - uint64_t *res_compatible, - uint64_t *res_postcopy_only); +void (*state_pending_estimate)(void *opaque, uint64_t *res_precopy, + uint64_t *res_postcopy); /* This calculate the exact remaining data to transfer */ -void (*state_pending_exact)(void *opaque, -uint64_t *res_precopy_only, -uint64_t *res_compatible, -uint64_t *res_postcopy_only); +void (*state_pending_exact)(void *opaque, uint64_t *res_precopy, +uint64_t *res_postcopy); LoadStateHandler *load_state; int (*load_setup)(QEMUFile *f, void *opaque); int (*load_cleanup)(void *opaque); diff --git a/migration/savevm.h b/migration/savevm.h index b1901e68d5..3f51588341 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -40,12 +40,10 @@ void qemu_savevm_state_cleanup(void); void qemu_savevm_state_complete_postcopy(QEMUFile *f); int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only, bool inactivate_disks); -void qemu_savevm_state_pending_exact(uint64_t *res_precopy_only, - uint64_t *res_compatible, - uint64_t *res_postcopy_only); -void qemu_savevm_state_pending_estimate(uint64_t *res_precopy_only, -uint64_t *res_compatible, -uint64_t *res_postcopy_only); +void qemu_savevm_state_pending_exact(uint64_t *res_precopy, + uint64_t *res_postcopy); +void qemu_savevm_state_pending_estimate(uint64_t *res_precopy, +uint64_t *res_postcopy); void qemu_savevm_send_ping(QEMUFile *f, uint32_t value); void qemu_savevm_send_open_return_path(QEMUFile *f); int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len); diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c index f1ee2c65bb..b5f8f6d1b9 100644 --- a/hw/s390x/s390-stattrib.c +++ b/hw/s390x/s390-stattrib.c @@ -182,17 +182,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque) return 0; } -static void cmma_state_pending(void * opaque, - uint64_t *res_precopy_only, - uint64_t *res_compatible, - uint64_t *res_postcopy_only) +static void cmma_state_pending(void * opaque, uint64_t *res_precopy, + uint64_t *res_postcopy) { S390StAttri
[PATCH v2 01/11] migration: Update atomic stats out of the mutex
Signed-off-by: Juan Quintela --- migration/multifd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 000ca4d4ec..20a81cd7f2 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -432,8 +432,8 @@ static int multifd_send_pages(QEMUFile *f) transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len; qemu_file_acct_rate_limit(f, transferred); ram_counters.multifd_bytes += transferred; +qemu_mutex_unlock(&p->mutex); stat64_add(&ram_atomic_counters.transferred, transferred); -qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); return 1; @@ -624,8 +624,8 @@ int multifd_send_sync_main(QEMUFile *f) p->pending_job++; qemu_file_acct_rate_limit(f, p->packet_len); ram_counters.multifd_bytes += p->packet_len; +qemu_mutex_unlock(&p->mutex); stat64_add(&ram_atomic_counters.transferred, p->packet_len); -qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c) < 0)) { -- 2.39.1
[PATCH v2 04/11] multifd: Count the number of bytes sent correctly
Current code asumes that all pages are whole. That is not true for example for compression already. Fix it for creating a new field ->sent_bytes that includes it. All ram_counters are used only from the migration thread, so we have two options: - put a mutex and fill everything when we sent it (not only ram_counters, also qemu_file->xfer_bytes). - Create a local variable that implements how much has been sent through each channel. And when we push another packet, we "add" the previous stats. I choose two due to less changes overall. On the previous code we increase transferred and then we sent. Current code goes the other way around. It sents the data, and after the fact, it updates the counters. Notice that each channel can have a maximum of half a megabyte of data without counting, so it is not very important. Signed-off-by: Juan Quintela --- migration/multifd.h | 2 ++ migration/multifd.c | 6 -- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index e2802a9ce2..36f899c56f 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -102,6 +102,8 @@ typedef struct { uint32_t flags; /* global number of generated multifd packets */ uint64_t packet_num; +/* How many bytes have we sent on the last packet */ +uint64_t sent_bytes; /* thread has work to do */ int pending_job; /* array of pages to sent. diff --git a/migration/multifd.c b/migration/multifd.c index 61cafe4c76..cd26b2fda9 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -394,7 +394,6 @@ static int multifd_send_pages(QEMUFile *f) static int next_channel; MultiFDSendParams *p = NULL; /* make happy gcc */ MultiFDPages_t *pages = multifd_send_state->pages; -uint64_t transferred; if (qatomic_read(&multifd_send_state->exiting)) { return -1; @@ -429,7 +428,8 @@ static int multifd_send_pages(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; multifd_send_state->pages = p->pages; p->pages = pages; -transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len; +uint64_t transferred = p->sent_bytes; +p->sent_bytes = 0; qemu_file_acct_rate_limit(f, transferred); qemu_mutex_unlock(&p->mutex); stat64_add(&ram_atomic_counters.multifd_bytes, transferred); @@ -719,6 +719,8 @@ static void *multifd_send_thread(void *opaque) } qemu_mutex_lock(&p->mutex); +p->sent_bytes += p->packet_len; +p->sent_bytes += p->next_packet_size; p->pending_job--; qemu_mutex_unlock(&p->mutex); -- 2.39.1
[PATCH v2 03/11] multifd: We already account for this packet on the multifd thread
Signed-off-by: Juan Quintela --- migration/multifd.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 49fa76e5e1..61cafe4c76 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -622,10 +622,7 @@ int multifd_send_sync_main(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; p->flags |= MULTIFD_FLAG_SYNC; p->pending_job++; -qemu_file_acct_rate_limit(f, p->packet_len); qemu_mutex_unlock(&p->mutex); -stat64_add(&ram_atomic_counters.multifd_bytes, p->packet_len); -stat64_add(&ram_atomic_counters.transferred, p->packet_len); qemu_sem_post(&p->sem); if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c) < 0)) { -- 2.39.1
[PATCH v2 06/11] multifd: Make flags field thread local
Use of flags with respect to locking was incensistant. For the sending side: - it was set to 0 with mutex held on the multifd channel. - MULTIFD_FLAG_SYNC was set with mutex held on the migration thread. - Everything else was done without the mutex held on the multifd channel. On the reception side, it is not used on the migration thread, only on the multifd channels threads. So we move it to the multifd channels thread only variables, and we introduce a new bool sync_needed on the send side to pass that information. Signed-off-by: Juan Quintela --- migration/multifd.h | 10 ++ migration/multifd.c | 23 +-- 2 files changed, 19 insertions(+), 14 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index 36f899c56f..a67cefc0a2 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -98,12 +98,12 @@ typedef struct { bool running; /* should this thread finish */ bool quit; -/* multifd flags for each packet */ -uint32_t flags; /* global number of generated multifd packets */ uint64_t packet_num; /* How many bytes have we sent on the last packet */ uint64_t sent_bytes; +/* Do we need to do an iteration sync */ +bool sync_needed; /* thread has work to do */ int pending_job; /* array of pages to sent. @@ -117,6 +117,8 @@ typedef struct { /* pointer to the packet */ MultiFDPacket_t *packet; +/* multifd flags for each packet */ +uint32_t flags; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets sent through this channel */ @@ -163,8 +165,6 @@ typedef struct { bool running; /* should this thread finish */ bool quit; -/* multifd flags for each packet */ -uint32_t flags; /* global number of generated multifd packets */ uint64_t packet_num; @@ -172,6 +172,8 @@ typedef struct { /* pointer to the packet */ MultiFDPacket_t *packet; +/* multifd flags for each packet */ +uint32_t flags; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets sent through this channel */ diff --git a/migration/multifd.c b/migration/multifd.c index cd26b2fda9..77196a55b4 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -620,7 +620,7 @@ int multifd_send_sync_main(QEMUFile *f) } p->packet_num = multifd_send_state->packet_num++; -p->flags |= MULTIFD_FLAG_SYNC; +p->sync_needed = true; p->pending_job++; qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); @@ -667,7 +667,11 @@ static void *multifd_send_thread(void *opaque) if (p->pending_job) { uint64_t packet_num = p->packet_num; -uint32_t flags = p->flags; +p->flags = 0; +if (p->sync_needed) { +p->flags |= MULTIFD_FLAG_SYNC; +p->sync_needed = false; +} p->normal_num = 0; if (use_zero_copy_send) { @@ -689,14 +693,13 @@ static void *multifd_send_thread(void *opaque) } } multifd_send_fill_packet(p); -p->flags = 0; p->num_packets++; p->total_normal_pages += p->normal_num; p->pages->num = 0; p->pages->block = NULL; qemu_mutex_unlock(&p->mutex); -trace_multifd_send(p->id, packet_num, p->normal_num, flags, +trace_multifd_send(p->id, packet_num, p->normal_num, p->flags, p->next_packet_size); if (use_zero_copy_send) { @@ -724,7 +727,7 @@ static void *multifd_send_thread(void *opaque) p->pending_job--; qemu_mutex_unlock(&p->mutex); -if (flags & MULTIFD_FLAG_SYNC) { +if (p->flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&p->sem_sync); } qemu_sem_post(&multifd_send_state->channels_ready); @@ -1099,7 +1102,7 @@ static void *multifd_recv_thread(void *opaque) rcu_register_thread(); while (true) { -uint32_t flags; +bool sync_needed = false; if (p->quit) { break; @@ -1121,11 +1124,11 @@ static void *multifd_recv_thread(void *opaque) break; } -flags = p->flags; +trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->flags, + p->next_packet_size); +sync_needed = p->flags & MULTIFD_FLAG_SYNC; /* recv methods don't know how to handle the SYNC flag */ p->flags &= ~MULTIFD_FLAG_SYNC; -trace_multifd_recv(p->id, p->packet_num, p->normal_num, flags, - p->next_packet_size); p->num_packets++; p->total_normal_pages += p->normal_num; qemu_mutex_unlock(&p->mutex); @@ -1137,7 +1140,7 @@ static void *multifd_recv_thread(void *opaque) }
[PATCH v2 08/11] multifd: Add capability to enable/disable zero_page
We have to enable it by default until we introduce the new code. Signed-off-by: Juan Quintela --- Change it to a capability. As capabilities are off by default, have to change MULTIFD_ZERO_PAGE to MAIN_ZERO_PAGE, so it is false for default, and true for older versions. --- qapi/migration.json | 8 +++- migration/migration.h | 1 + hw/core/machine.c | 1 + migration/migration.c | 13 - 4 files changed, 21 insertions(+), 2 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 88ecf86ac8..ac5bc071a9 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -472,12 +472,18 @@ # Requires that QEMU be permitted to use locked memory # for guest RAM pages. # (since 7.1) +# # @postcopy-preempt: If enabled, the migration process will allow postcopy #requests to preempt precopy stream, so postcopy requests #will be handled faster. This is a performance feature and #should not affect the correctness of postcopy migration. #(since 7.1) # +# @main-zero-page: If enabled, the detection of zero pages will be +# done on the main thread. Otherwise it is done on +# the multifd threads. +# (since 8.0) +# # Features: # @unstable: Members @x-colo and @x-ignore-shared are experimental. # @@ -492,7 +498,7 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt'] } + 'zero-copy-send', 'postcopy-preempt', 'main-zero-page'] } ## # @MigrationCapabilityStatus: diff --git a/migration/migration.h b/migration/migration.h index ae4ffd3454..c38a0baf10 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -408,6 +408,7 @@ int migrate_multifd_channels(void); MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +bool migrate_use_main_zero_page(void); #ifdef CONFIG_LINUX bool migrate_use_zero_copy_send(void); diff --git a/hw/core/machine.c b/hw/core/machine.c index 616f3a207c..97149e2de3 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -52,6 +52,7 @@ const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1); GlobalProperty hw_compat_7_0[] = { { "arm-gicv3-common", "force-8-bit-prio", "on" }, { "nvme-ns", "eui64-default", "on"}, +{ "migration", "main-zero-page", "true" }, }; const size_t hw_compat_7_0_len = G_N_ELEMENTS(hw_compat_7_0); diff --git a/migration/migration.c b/migration/migration.c index d261c7d16d..ab86c6601d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -164,7 +164,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot, MIGRATION_CAPABILITY_XBZRLE, MIGRATION_CAPABILITY_X_COLO, MIGRATION_CAPABILITY_VALIDATE_UUID, -MIGRATION_CAPABILITY_ZERO_COPY_SEND); +MIGRATION_CAPABILITY_ZERO_COPY_SEND, +MIGRATION_CAPABILITY_MAIN_ZERO_PAGE); /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add @@ -2603,6 +2604,14 @@ bool migrate_use_multifd(void) return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD]; } +bool migrate_use_main_zero_page(void) +{ +MigrationState *s = migrate_get_current(); + +/* We will enable this when we add the right code. */ +return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; +} + bool migrate_pause_before_switchover(void) { MigrationState *s; @@ -4425,6 +4434,8 @@ static Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-zero-copy-send", MIGRATION_CAPABILITY_ZERO_COPY_SEND), #endif +DEFINE_PROP_MIG_CAP("main-zero-page", +MIGRATION_CAPABILITY_MAIN_ZERO_PAGE), DEFINE_PROP_END_OF_LIST(), }; -- 2.39.1
[PATCH v2 05/11] migration: Make ram_save_target_page() a pointer
We are going to create a new function for multifd latest in the series. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 885d7dbf23..04e1093cdc 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -450,6 +450,13 @@ void dirty_sync_missed_zero_copy(void) ram_counters.dirty_sync_missed_zero_copy++; } +struct MigrationOps { +int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss); +}; +typedef struct MigrationOps MigrationOps; + +MigrationOps *migration_ops; + CompressionStats compression_counters; struct CompressParam { @@ -2265,14 +2272,14 @@ static bool save_compress_page(RAMState *rs, PageSearchStatus *pss, } /** - * ram_save_target_page: save one target page + * ram_save_target_page_legacy: save one target page * * Returns the number of pages written * * @rs: current RAM state * @pss: data about the page we want to send */ -static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) +static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) { RAMBlock *block = pss->block; ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; @@ -2398,7 +2405,7 @@ static int ram_save_host_page_urgent(PageSearchStatus *pss) if (page_dirty) { /* Be strict to return code; it must be 1, or what else? */ -if (ram_save_target_page(rs, pss) != 1) { +if (migration_ops->ram_save_target_page(rs, pss) != 1) { error_report_once("%s: ram_save_target_page failed", __func__); ret = -1; goto out; @@ -2467,7 +2474,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss) if (preempt_active) { qemu_mutex_unlock(&rs->bitmap_mutex); } -tmppages = ram_save_target_page(rs, pss); +tmppages = migration_ops->ram_save_target_page(rs, pss); if (tmppages >= 0) { pages += tmppages; /* @@ -2662,6 +2669,8 @@ static void ram_save_cleanup(void *opaque) xbzrle_cleanup(); compress_threads_save_cleanup(); ram_state_cleanup(rsp); +g_free(migration_ops); +migration_ops = NULL; } static void ram_state_reset(RAMState *rs) @@ -3215,6 +3224,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque) ram_control_before_iterate(f, RAM_CONTROL_SETUP); ram_control_after_iterate(f, RAM_CONTROL_SETUP); +migration_ops = g_malloc0(sizeof(MigrationOps)); +migration_ops->ram_save_target_page = ram_save_target_page_legacy; ret = multifd_send_sync_main(f); if (ret < 0) { return ret; -- 2.39.1
[PATCH v2 02/11] migration: Make multifd_bytes atomic
In the spirit of: commit 394d323bc3451e4d07f13341cb8817fac8dfbadd Author: Peter Xu Date: Tue Oct 11 17:55:51 2022 -0400 migration: Use atomic ops properly for page accountings Signed-off-by: Juan Quintela --- migration/ram.h | 1 + migration/migration.c | 4 ++-- migration/multifd.c | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/ram.h b/migration/ram.h index 81cbb0947c..3c1de6aedf 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -50,6 +50,7 @@ typedef struct { Stat64 duplicate; Stat64 normal; Stat64 postcopy_bytes; +Stat64 multifd_bytes; } MigrationAtomicStats; extern MigrationAtomicStats ram_atomic_counters; diff --git a/migration/migration.c b/migration/migration.c index 644c61e91d..d261c7d16d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1058,7 +1058,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) ram_counters.dirty_sync_missed_zero_copy; info->ram->postcopy_requests = ram_counters.postcopy_requests; info->ram->page_size = page_size; -info->ram->multifd_bytes = ram_counters.multifd_bytes; +info->ram->multifd_bytes = stat64_get(&ram_atomic_counters.multifd_bytes); info->ram->pages_per_second = s->pages_per_second; info->ram->precopy_bytes = ram_counters.precopy_bytes; info->ram->downtime_bytes = ram_counters.downtime_bytes; @@ -3659,7 +3659,7 @@ static MigThrError migration_detect_error(MigrationState *s) static uint64_t migration_total_bytes(MigrationState *s) { return qemu_file_total_transferred(s->to_dst_file) + -ram_counters.multifd_bytes; +stat64_get(&ram_atomic_counters.multifd_bytes); } static void migration_calculate_complete(MigrationState *s) diff --git a/migration/multifd.c b/migration/multifd.c index 20a81cd7f2..49fa76e5e1 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -431,8 +431,8 @@ static int multifd_send_pages(QEMUFile *f) p->pages = pages; transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len; qemu_file_acct_rate_limit(f, transferred); -ram_counters.multifd_bytes += transferred; qemu_mutex_unlock(&p->mutex); +stat64_add(&ram_atomic_counters.multifd_bytes, transferred); stat64_add(&ram_atomic_counters.transferred, transferred); qemu_sem_post(&p->sem); @@ -623,8 +623,8 @@ int multifd_send_sync_main(QEMUFile *f) p->flags |= MULTIFD_FLAG_SYNC; p->pending_job++; qemu_file_acct_rate_limit(f, p->packet_len); -ram_counters.multifd_bytes += p->packet_len; qemu_mutex_unlock(&p->mutex); +stat64_add(&ram_atomic_counters.multifd_bytes, p->packet_len); stat64_add(&ram_atomic_counters.transferred, p->packet_len); qemu_sem_post(&p->sem); -- 2.39.1
[PATCH v2 09/11] multifd: Support for zero pages transmission
This patch adds counters and similar. Logic will be added on the following patch. Signed-off-by: Juan Quintela --- Added counters for duplicated/non duplicated pages. Removed reviewed by from David. Add total_zero_pages --- migration/multifd.h| 17 - migration/multifd.c| 36 +--- migration/ram.c| 2 -- migration/trace-events | 8 4 files changed, 49 insertions(+), 14 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index cd389d18d2..a1b852200d 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -47,7 +47,10 @@ typedef struct { /* size of the next packet that contains pages */ uint32_t next_packet_size; uint64_t packet_num; -uint64_t unused[4];/* Reserved for future use */ +/* zero pages */ +uint32_t zero_pages; +uint32_t unused32[1];/* Reserved for future use */ +uint64_t unused64[3];/* Reserved for future use */ char ramblock[256]; uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; @@ -127,6 +130,8 @@ typedef struct { uint64_t num_packets; /* non zero pages sent through this channel */ uint64_t total_normal_pages; +/* zero pages sent through this channel */ +uint64_t total_zero_pages; /* buffers to send */ struct iovec *iov; /* number of iovs used */ @@ -135,6 +140,10 @@ typedef struct { ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; +/* Pages that are zero */ +ram_addr_t *zero; +/* num of zero pages */ +uint32_t zero_num; /* used for compression methods */ void *data; } MultiFDSendParams; @@ -184,12 +193,18 @@ typedef struct { uint8_t *host; /* non zero pages recv through this channel */ uint64_t total_normal_pages; +/* zero pages recv through this channel */ +uint64_t total_zero_pages; /* buffers to recv */ struct iovec *iov; /* Pages that are not zero */ ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; +/* Pages that are zero */ +ram_addr_t *zero; +/* num of zero pages */ +uint32_t zero_num; /* used for de-compression methods */ void *data; } MultiFDRecvParams; diff --git a/migration/multifd.c b/migration/multifd.c index 7ebaca6e55..30cc206190 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -263,6 +263,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->normal_pages = cpu_to_be32(p->normal_num); packet->next_packet_size = cpu_to_be32(p->next_packet_size); packet->packet_num = cpu_to_be64(p->packet_num); +packet->zero_pages = cpu_to_be32(p->zero_num); if (p->pages->block) { strncpy(packet->ramblock, p->pages->block->idstr, 256); @@ -323,7 +324,15 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); -if (p->normal_num == 0) { +p->zero_num = be32_to_cpu(packet->zero_pages); +if (p->zero_num > packet->pages_alloc - p->normal_num) { +error_setg(errp, "multifd: received packet " + "with %u zero pages and expected maximum pages are %u", + p->zero_num, packet->pages_alloc - p->normal_num) ; +return -1; +} + +if (p->normal_num == 0 && p->zero_num == 0) { return 0; } @@ -428,6 +437,8 @@ static int multifd_send_pages(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; multifd_send_state->pages = p->pages; p->pages = pages; +stat64_add(&ram_atomic_counters.normal, p->normal_num); +stat64_add(&ram_atomic_counters.duplicate, p->zero_num); uint64_t transferred = p->sent_bytes; p->sent_bytes = 0; qemu_file_acct_rate_limit(f, transferred); @@ -546,6 +557,8 @@ void multifd_save_cleanup(void) p->iov = NULL; g_free(p->normal); p->normal = NULL; +g_free(p->zero); +p->zero = NULL; multifd_send_state->ops->send_cleanup(p, &local_err); if (local_err) { migrate_set_error(migrate_get_current(), local_err); @@ -675,6 +688,7 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_unlock(&p->mutex); p->normal_num = 0; +p->zero_num = 0; if (use_zero_copy_send) { p->iovs_num = 0; @@ -695,8 +709,8 @@ static void *multifd_send_thread(void *opaque) } multifd_send_fill_packet(p); -trace_multifd_send(p->id, packet_num, p->normal_num, p->flags, - p->next_packet_size); +trace_multifd_send(p->id, packet_num, p->normal_num, p->zero_num, + p->flags, p->next_packet_size); if (use_zero_copy_send) { /* Send header first, without zer
[PATCH v2 00/11] Multifd zero page support
Based on top of my next branch. - Rebased on top of latest upstream - Redo a lot of the packet accounting still not completely perfect, but much better than what is upstream Still working continuing on that. Please review. [v2] - rebased on top of latest upstream - lots of minor fixes - start support for atomic counters * we need to move ram_limit_used/max to migration.c * that means fixing rdma.c * and test-vmstate. So I am donig that right now. Juan Quintela (11): migration: Update atomic stats out of the mutex migration: Make multifd_bytes atomic multifd: We already account for this packet on the multifd thread multifd: Count the number of bytes sent correctly migration: Make ram_save_target_page() a pointer multifd: Make flags field thread local multifd: Prepare to send a packet without the mutex held multifd: Add capability to enable/disable zero_page multifd: Support for zero pages transmission multifd: Zero pages transmission So we use multifd to transmit zero pages. qapi/migration.json| 8 ++- migration/migration.h | 1 + migration/multifd.h| 36 ++-- migration/ram.h| 1 + hw/core/machine.c | 1 + migration/migration.c | 16 +- migration/multifd.c| 123 +++-- migration/ram.c| 51 +++-- migration/trace-events | 8 +-- 9 files changed, 197 insertions(+), 48 deletions(-) -- 2.39.1
[PATCH v2 07/11] multifd: Prepare to send a packet without the mutex held
We do the send_prepare() and the fill of the head packet without the mutex held. It will help a lot for compression and later in the series for zero pages. Notice that we can use p->pages without holding p->mutex because p->pending_job == 1. Signed-off-by: Juan Quintela --- migration/multifd.h | 2 ++ migration/multifd.c | 12 ++-- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index a67cefc0a2..cd389d18d2 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -109,7 +109,9 @@ typedef struct { /* array of pages to sent. * The owner of 'pages' depends of 'pending_job' value: * pending_job == 0 -> migration_thread can use it. + * No need for mutex lock. * pending_job != 0 -> multifd_channel can use it. + * No need for mutex lock. */ MultiFDPages_t *pages; diff --git a/migration/multifd.c b/migration/multifd.c index 77196a55b4..7ebaca6e55 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -672,6 +672,8 @@ static void *multifd_send_thread(void *opaque) p->flags |= MULTIFD_FLAG_SYNC; p->sync_needed = false; } +qemu_mutex_unlock(&p->mutex); + p->normal_num = 0; if (use_zero_copy_send) { @@ -688,16 +690,10 @@ static void *multifd_send_thread(void *opaque) if (p->normal_num) { ret = multifd_send_state->ops->send_prepare(p, &local_err); if (ret != 0) { -qemu_mutex_unlock(&p->mutex); break; } } multifd_send_fill_packet(p); -p->num_packets++; -p->total_normal_pages += p->normal_num; -p->pages->num = 0; -p->pages->block = NULL; -qemu_mutex_unlock(&p->mutex); trace_multifd_send(p->id, packet_num, p->normal_num, p->flags, p->next_packet_size); @@ -722,6 +718,10 @@ static void *multifd_send_thread(void *opaque) } qemu_mutex_lock(&p->mutex); +p->num_packets++; +p->total_normal_pages += p->normal_num; +p->pages->num = 0; +p->pages->block = NULL; p->sent_bytes += p->packet_len; p->sent_bytes += p->next_packet_size; p->pending_job--; -- 2.39.1
[PATCH v2 11/11] So we use multifd to transmit zero pages.
Signed-off-by: Juan Quintela --- - Check zero_page property before using new code (Dave) --- migration/migration.c | 3 +-- migration/ram.c | 32 +++- 2 files changed, 32 insertions(+), 3 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index ab86c6601d..6293dee983 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2608,8 +2608,7 @@ bool migrate_use_main_zero_page(void) { MigrationState *s = migrate_get_current(); -/* We will enable this when we add the right code. */ -return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; +return s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; } bool migrate_pause_before_switchover(void) diff --git a/migration/ram.c b/migration/ram.c index c70cc25169..d375f2ccde 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2421,6 +2421,31 @@ out: return ret; } +/** + * ram_save_target_page_multifd: save one target page + * + * Returns the number of pages written + * + * @rs: current RAM state + * @pss: data about the page we want to send + */ +static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) +{ +RAMBlock *block = pss->block; +ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; + +if (!migration_in_postcopy()) { +return ram_save_multifd_page(pss->pss_channel, block, offset); +} + +int res = save_zero_page(pss, block, offset); +if (res > 0) { +return res; +} + +return ram_save_page(rs, pss); +} + /** * ram_save_host_page: save a whole host page * @@ -3223,7 +3248,12 @@ static int ram_save_setup(QEMUFile *f, void *opaque) ram_control_after_iterate(f, RAM_CONTROL_SETUP); migration_ops = g_malloc0(sizeof(MigrationOps)); -migration_ops->ram_save_target_page = ram_save_target_page_legacy; +if (migrate_use_multifd() && !migrate_use_main_zero_page()) { +migration_ops->ram_save_target_page = ram_save_target_page_multifd; +} else { +migration_ops->ram_save_target_page = ram_save_target_page_legacy; +} + ret = multifd_send_sync_main(f); if (ret < 0) { return ret; -- 2.39.1
[PATCH v2 10/11] multifd: Zero pages transmission
This implements the zero page dection and handling. Signed-off-by: Juan Quintela --- Add comment for offset (dave) Use local variables for offset/block to have shorter lines --- migration/multifd.h | 5 + migration/multifd.c | 45 +++-- 2 files changed, 48 insertions(+), 2 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index a1b852200d..5931de6f86 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -52,6 +52,11 @@ typedef struct { uint32_t unused32[1];/* Reserved for future use */ uint64_t unused64[3];/* Reserved for future use */ char ramblock[256]; +/* + * This array contains the pointers to: + * - normal pages (initial normal_pages entries) + * - zero pages (following zero_pages entries) + */ uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; diff --git a/migration/multifd.c b/migration/multifd.c index 30cc206190..d3f82dad8a 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -11,6 +11,7 @@ */ #include "qemu/osdep.h" +#include "qemu/cutils.h" #include "qemu/rcu.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" @@ -275,6 +276,12 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->offset[i] = cpu_to_be64(temp); } +for (i = 0; i < p->zero_num; i++) { +/* there are architectures where ram_addr_t is 32 bit */ +uint64_t temp = p->zero[i]; + +packet->offset[p->normal_num + i] = cpu_to_be64(temp); +} } static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) @@ -358,6 +365,18 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->normal[i] = offset; } +for (i = 0; i < p->zero_num; i++) { +uint64_t offset = be64_to_cpu(packet->offset[p->normal_num + i]); + +if (offset > (block->used_length - p->page_size)) { +error_setg(errp, "multifd: offset too long %" PRIu64 + " (max " RAM_ADDR_FMT ")", + offset, block->used_length); +return -1; +} +p->zero[i] = offset; +} + return 0; } @@ -657,6 +676,12 @@ static void *multifd_send_thread(void *opaque) { MultiFDSendParams *p = opaque; Error *local_err = NULL; +/* + * older qemu don't understand zero page on multifd channel. To + * have capabilities "false" by default, we need to name it this + * way. + */ +bool use_multifd_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_use_zero_copy_send(); @@ -679,6 +704,7 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); if (p->pending_job) { +RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; p->flags = 0; if (p->sync_needed) { @@ -697,8 +723,16 @@ static void *multifd_send_thread(void *opaque) } for (int i = 0; i < p->pages->num; i++) { -p->normal[p->normal_num] = p->pages->offset[i]; -p->normal_num++; +uint64_t offset = p->pages->offset[i]; +if (use_multifd_zero_page && +buffer_is_zero(rb->host + offset, p->page_size)) { +p->zero[p->zero_num] = offset; +p->zero_num++; +ram_release_page(rb->idstr, offset); +} else { +p->normal[p->normal_num] = offset; +p->normal_num++; +} } if (p->normal_num) { @@ -1160,6 +1194,13 @@ static void *multifd_recv_thread(void *opaque) } } +for (int i = 0; i < p->zero_num; i++) { +void *page = p->host + p->zero[i]; +if (!buffer_is_zero(page, p->page_size)) { +memset(page, 0, p->page_size); +} +} + if (sync_needed) { qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); -- 2.39.1
Re: Crash in RTC
ping On Wed, Aug 24, 2022 at 5:37 PM Konstantin Kostiuk wrote: > Hi Michael and Paolo, > > I write to you as maintainers of mc146818rtc.c. I am working on bug > https://bugzilla.redhat.com/show_bug.cgi?id=2054781 > and reproduced it on the current master branch. > > I added some print at line 202 (before assert(lost_clock >= 0), > https://gitlab.com/qemu-project/qemu/-/blob/master/hw/rtc/mc146818rtc.c#L202) > and got the following values: > > next_periodic_clock, old_period, last_periodic_clock, cur_clock, > lost_clock, current_time > 54439076429968, 32, 54439076429936, 54439076430178, 242, > 1661348768010822000 > 54439076430224, 512, 54439076429712, 54439076430188, 476, > 166134876807000 > 54439076430224, 32, 54439076430192, 54439076429884, -308, > 1661348768001838000 > > The current_time value in the last print is lower than in the previous one. > So, the error occurs because time has gone backward. > > I think this is a possible situation during time synchronization. > My question is what should we do in this case? > > Best Regards, > Konstantin Kostiuk. >
Re: [PATCH] linux-user: move target_flat.h to target subdirs
Le 29/01/2023 à 01:46, Mike Frysinger a écrit : This makes target_flat.h behave like every other target_xxx.h header. It also makes it actually work -- while the current header says adding a header to the target subdir overrides the common one, it doesn't. This is for two reasons: * meson.build adds -Ilinux-user before -Ilinux-user/$arch * the compiler search path for "target_flat.h" looks in the same dir as the source file before searching -I paths. This can be seen with the xtensa port -- the subdir settings aren't used which breaks stack setup. Move it to the generic/ subdir and add include stubs like every other target_xxx.h header is handled. Signed-off-by: Mike Frysinger --- linux-user/aarch64/target_flat.h | 1 + linux-user/arm/target_flat.h | 1 + linux-user/{ => generic}/target_flat.h | 0 linux-user/m68k/target_flat.h | 1 + linux-user/microblaze/target_flat.h| 1 + linux-user/sh4/target_flat.h | 1 + 6 files changed, 5 insertions(+) create mode 100644 linux-user/aarch64/target_flat.h create mode 100644 linux-user/arm/target_flat.h rename linux-user/{ => generic}/target_flat.h (100%) create mode 100644 linux-user/m68k/target_flat.h create mode 100644 linux-user/microblaze/target_flat.h create mode 100644 linux-user/sh4/target_flat.h diff --git a/linux-user/aarch64/target_flat.h b/linux-user/aarch64/target_flat.h new file mode 100644 index ..bc83224cea12 --- /dev/null +++ b/linux-user/aarch64/target_flat.h @@ -0,0 +1 @@ +#include "../generic/target_flat.h" diff --git a/linux-user/arm/target_flat.h b/linux-user/arm/target_flat.h new file mode 100644 index ..bc83224cea12 --- /dev/null +++ b/linux-user/arm/target_flat.h @@ -0,0 +1 @@ +#include "../generic/target_flat.h" diff --git a/linux-user/target_flat.h b/linux-user/generic/target_flat.h similarity index 100% rename from linux-user/target_flat.h rename to linux-user/generic/target_flat.h diff --git a/linux-user/m68k/target_flat.h b/linux-user/m68k/target_flat.h new file mode 100644 index ..bc83224cea12 --- /dev/null +++ b/linux-user/m68k/target_flat.h @@ -0,0 +1 @@ +#include "../generic/target_flat.h" diff --git a/linux-user/microblaze/target_flat.h b/linux-user/microblaze/target_flat.h new file mode 100644 index ..bc83224cea12 --- /dev/null +++ b/linux-user/microblaze/target_flat.h @@ -0,0 +1 @@ +#include "../generic/target_flat.h" diff --git a/linux-user/sh4/target_flat.h b/linux-user/sh4/target_flat.h new file mode 100644 index ..bc83224cea12 --- /dev/null +++ b/linux-user/sh4/target_flat.h @@ -0,0 +1 @@ +#include "../generic/target_flat.h" Applied to my linux-user-for-8.0 branch. Thanks, Laurent
Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo wrote: > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang wrote: > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin wrote: > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" > > > > wrote: > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" > > > > > > wrote: > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" > > > > > > > > wrote: > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" > > > > > > > > > > wrote: > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > Check whether it is per-queue reset state in > > > > > > > > > > > > virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx > > > > > > > > > > > > resources. At this > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should > > > > > > > > > > > > not try to send > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the > > > > > > > > > > > > current > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > Do you in fact mean it's called from > > > > > > > > > > > flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > k->queue_reset > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > . > > > > > > > > > > (callback) > > > > > > > > > > virtio_net_tx_complete > > > > > > > > > > > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop > > > > > > > > > > it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information > > > > > > > > > > through the stack. I > > > > > > > > > > originally thought it was a general situation, so I wanted > > > > > > > > > > to put it in > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in > > > > > > > > > > VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > > We want something much more local, ideally on stack even > > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > Fixes: > > > > > > > > > > > > https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > Reported-by: Alexander Bulekov > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > > > > > > --- > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t > > > > > > > > > > > > virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > int queue_index = > > > > > > > > > > > > vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > -if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > > +if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > > +virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > > >
Re: [PATCH v9 01/58] include: import Xen public headers to include/standard-headers/
On 28/01/2023 09.10, David Woodhouse wrote: From: Joao Martins There are already some partial headers in include/hw/xen/interface/ which will be removed once we migrate users to the new location. To start with, define __XEN_TOOLS__ in hw/xen/xen.h to ensure that any internal definitions needed by Xen toolstack libraries are present regardless of the order in which the headers are included. A reckoning will come later, once we make the PV backends work in emulation and untangle the headers for Xen-native vs. generic parts. Signed-off-by: Joao Martins [dwmw2: Update to Xen public headers from 4.16.2 release, add some in io/, define __XEN_TOOLS__ in hw/xen/xen.h] Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant --- include/hw/xen/xen.h | 16 +- include/standard-headers/xen/arch-x86/cpuid.h | 118 ++ .../xen/arch-x86/xen-x86_32.h | 194 +++ .../xen/arch-x86/xen-x86_64.h | 241 include/standard-headers/xen/arch-x86/xen.h | 398 ++ include/standard-headers/xen/event_channel.h | 388 ++ include/standard-headers/xen/features.h | 143 +++ include/standard-headers/xen/grant_table.h| 686 ++ include/standard-headers/xen/hvm/hvm_op.h | 395 ++ include/standard-headers/xen/hvm/params.h | 318 + include/standard-headers/xen/io/blkif.h | 722 +++ include/standard-headers/xen/io/console.h | 56 + include/standard-headers/xen/io/fbif.h| 176 +++ include/standard-headers/xen/io/kbdif.h | 576 + include/standard-headers/xen/io/netif.h | 1102 + include/standard-headers/xen/io/protocols.h | 42 + include/standard-headers/xen/io/ring.h| 495 include/standard-headers/xen/io/usbif.h | 425 +++ include/standard-headers/xen/io/xenbus.h | 80 ++ include/standard-headers/xen/io/xs_wire.h | 153 +++ include/standard-headers/xen/memory.h | 754 +++ include/standard-headers/xen/physdev.h| 383 ++ include/standard-headers/xen/sched.h | 202 +++ include/standard-headers/xen/trace.h | 341 + include/standard-headers/xen/vcpu.h | 248 include/standard-headers/xen/version.h| 113 ++ include/standard-headers/xen/xen-compat.h | 46 + include/standard-headers/xen/xen.h| 1049 The files in include/standard-headers are created automatically by the scripts/update-linux-headers.sh script, so I was a little bit surprised that you don't provide an update to that script first ... if you copy new headers there manually, that might cause troubles later for the next person who runs the update-linux-headers.sh script. So I suggest to either adapt the script for your needs, or put the xen headers in a different location. Thomas
Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension
On 2023/1/30 13:43, Richard Henderson wrote: On 1/29/23 16:03, LIU Zhiwei wrote: Thanks. It's a bug. We should load all memory addresses to local TCG temps first. Do you think we should probe all the memory addresses for the store pair instructions? If so, can we avoid the use of a helper function? Depends on what the hardware does. Even with a trap in the middle the stores are restartable, since no register state changes. I refer to the specification of LDP and STP on AARCH64. The specification allows "any access performed before the exception was taken is repeated". In detailed, "If, according to these rules, an instruction is executed as a sequence of accesses, exceptions, including interrupts, can be taken during that sequence, regardless of the memory type being accessed. If any of these exceptions are returned from using their preferred return address, the instruction that generated the sequence of accesses is re-executed, and so any access performed before the exception was taken is repeated. See also Taking an interrupt during a multi-access load or store on page D1-4664." However I see the implementation of LDP and STP on QEMU are in different ways. LDP will only load the first register when it ensures no trap in the second access. So I have two questions here. 1) One for the QEMU implementation about LDP. Can we implement the LDP as two directly loads to cpu registers instead of local TCG temps? 2) One for the comment. Why register state changes cause non-restartable? Do you mean if the first register changes, it may influence the calculation of address after the trap? "Even with a trap in the middle the stores are restartable, since no register state changes." But if you'd like no changes verifying both stores, for this case you can pack the pair into a larger data type: TCGv_i64 for pair of 32-bit, and TCGv_i128 for pair of 64-bit. Patches for TCGv_i128 [1] are just finishing review; patches to describe atomicity of the larger operation are also on list [2]. Anyway, the idea is that you issue one TCG memory operation, the entire operation is validated, and then the stores happen. The main reason is that assembler can do this check. Is it necessary to check this in QEMU? Yes. Conciser what happens when the insn is encoded with .long. Does the hardware trap an illegal instruction? Is the behavior simply unspecified? The manual could be improved to specify, akin to the Arm terms: UNDEFINED, CONSTRAINED UNPREDICTABLE, IMPLEMENTATION DEFINED, etc. Thanks, I will fix the manual. Best Regards, Zhiwei r~ [1] https://patchew.org/QEMU/20230126043824.54819-1-richard.hender...@linaro.org/ [2] https://patchew.org/QEMU/20221118094754.242910-1-richard.hender...@linaro.org/
Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
On Mon, Jan 30, 2023 at 1:50 PM Michael S. Tsirkin wrote: > > On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote: > > On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo > > wrote: > > > > > > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang > > > wrote: > > > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo > > > > wrote: > > > > > > > > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang > > > > > wrote: > > > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo > > > > > > wrote: > > > > > > > > > > > > > > Check whether it is per-queue reset state in > > > > > > > virtio_net_flush_tx(). > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At > > > > > > > this > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to > > > > > > > send > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > per-queue reset state. > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > Reported-by: Alexander Bulekov > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > --- > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t > > > > > > > virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > VirtQueueElement *elem; > > > > > > > int32_t num_packets = 0; > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > -if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > +if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > +virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > We have other places that check DRIVER_OK do we need to check queue > > > > > > reset as well? > > > > > > > > > > I checked it again. I still think that the location of other checking > > > > > DRIVER_OK > > > > > does not need to check the queue reset. > > > > > > > > For example, if we don't disable can_receive() when the queue is > > > > reset, it means rx may go for virtio_net_receive_rcu(). It means the > > > > Qemu is still trying to process the traffic from the network backend > > > > like tap which may waste cpu cycles. > > > > > > > > I think the correct way is to return false when the queue is reset in > > > > can_receive(), then the backend poll will be disabled (e.g TAP). When > > > > the queue is enabled again, qemu_flush_queued_packets() will wake up > > > > the backend polling. > > > > > > > > Having had time to check other places but it would be better to > > > > mention why it doesn't need a check in the changelog. > > > > > > > > > static bool virtio_net_can_receive(NetClientState *nc) > > > { > > > VirtIONet *n = qemu_get_nic_opaque(nc); > > > VirtIODevice *vdev = VIRTIO_DEVICE(n); > > > VirtIONetQueue *q = virtio_net_get_subqueue(nc); > > > > > > if (!vdev->vm_running) { > > > return false; > > > } > > > > > > if (nc->queue_index >= n->curr_queue_pairs) { > > > return false; > > > } > > > > > > if (!virtio_queue_ready(q->rx_vq) || > > > !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > return false; > > > } > > > > > > return true; > > > } > > > > > > int virtio_queue_ready(VirtQueue *vq) > > > { > > > return vq->vring.avail != 0; > > > } > > > > > > > > > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i) > > > { > > > vdev->vq[i].vring.desc = 0; > > > vdev->vq[i].vring.avail = 0; > > > vdev->vq[i].vring.used = 0; > > > vdev->vq[i].last_avail_idx = 0; > > > vdev->vq[i].shadow_avail_idx = 0; > > > vdev->vq[i].used_idx = 0; > > > vdev->vq[i].last_avail_wrap_counter = true; > > > vdev->vq[i].shadow_avail_wrap_counter = true; > > > vdev->vq[i].used_wrap_counter = true; > > > virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR); > > > vdev->vq[i].signalled_used = 0; > > > vdev->vq[i].signalled_used_valid = false; > > > vdev->vq[i].notification = true; > > > vdev->vq[i].vring.num = vdev->vq[i].vring.num_default; > > > vdev->vq[i].inuse = 0; > > > virtio_virtqueue_reset_region_cache(&vdev->vq[i]); > > > } > > > > > > In the implementation of Per-Queue Reset, for RX, we stop RX by setting > > > vdev->vq[i].vring.avail to 0. > > > > Ok, but this is kind of fragile (especially when vIOMMU is enabled). > > I'd add an explicit check for reset there. > > It's not great in that spec says avail 0 is actually legal. > But I don't really want to see more and more checks. > If we are doing cleanups, the right way is probably a new "live" flag > that transports can set correctly from the combinati
Re: [PATCH v2] linux-user: Fix SO_ERROR return code of getsockopt()
Le 27/01/2023 à 21:25, Helge Deller a écrit : Add translation for the host error return code of: getsockopt(19, SOL_SOCKET, SO_ERROR, [ECONNREFUSED], [4]) = 0 This fixes the testsuite of the cockpit debian package with a hppa-linux guest on a x86-64 host. Signed-off-by: Helge Deller --- v2: Fix indenting to make checkscript.sh happy diff --git a/linux-user/syscall.c b/linux-user/syscall.c index dac0cfe6c4..06e8612675 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -2809,8 +2809,13 @@ get_timeout: ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv)); if (ret < 0) return ret; -if (optname == SO_TYPE) { +switch (optname) { +case SO_TYPE: val = host_to_target_sock_type(val); +break; +case SO_ERROR: +val = host_to_target_errno(val); +break; } if (len > lv) len = lv; Reviewed-by: Laurent Vivier
Re: [PATCH v4] linux-user: Fix /proc/cpuinfo output for hppa
Le 27/01/2023 à 21:10, Helge Deller a écrit : The hppa architectures provides an own output for the emulated /proc/cpuinfo file. Some userspace applications count (even if that's not the recommended way) the number of lines which start with "processor:" and assume that this number then reflects the number of online CPUs. Since those 3 architectures don't provide any such line, applications may assume "0" CPUs. One such issue can be seen in debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024653 Avoid such issues by adding a "processor:" line for each of the online CPUs. Signed-off-by: Helge Deller Reviewed-by: Philippe Mathieu-Daudé --- v4: - Drop sparc changes v3: - add trailing newline at end of /proc/cpuinfo file (needed by procps) v2: - drop m68k part (based on feedback from Laurent Vivier ) - change commit message diff --git a/linux-user/syscall.c b/linux-user/syscall.c index afb24fd0b9..5fa2ae6c8a 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8318,11 +8318,17 @@ static int open_cpuinfo(CPUArchState *cpu_env, int fd) #if defined(TARGET_HPPA) static int open_cpuinfo(CPUArchState *cpu_env, int fd) { -dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); -dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); -dprintf(fd, "capabilities\t: os32\n"); -dprintf(fd, "model\t\t: 9000/778/B160L\n"); -dprintf(fd, "model name\t: Merlin L2 160 QEMU (9000/778/B160L)\n"); +int i, num_cpus; + +num_cpus = sysconf(_SC_NPROCESSORS_ONLN); +for (i = 0; i < num_cpus; i++) { +dprintf(fd, "processor\t: %d\n", i); +dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); +dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); +dprintf(fd, "capabilities\t: os32\n"); +dprintf(fd, "model\t\t: 9000/778/B160L - " +"Merlin L2 160 QEMU (9000/778/B160L)\n\n"); +} return 0; } #endif Reviewed-by: Laurent Vivier
Re: [PATCH v2] linux-user: Improve strace output of personality() and sysinfo()
Le 27/01/2023 à 21:18, Helge Deller a écrit : Make the strace look nicer for those two syscalls. Signed-off-by: Helge Deller -- v2: use TARGET_ABI_FMT_lx instead of %p in personality output as suggested by Philippe Mathieu-Daudé and Laurent Vivier diff --git a/linux-user/strace.list b/linux-user/strace.list index f9254725a1..703c0f1608 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -1043,7 +1043,8 @@ { TARGET_NR_perfctr, "perfctr" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_personality -{ TARGET_NR_personality, "personality" , NULL, NULL, NULL }, +{ TARGET_NR_personality, "personality" , "%s(0x"TARGET_ABI_FMT_lx")", NULL, + print_syscall_ret_addr }, #endif #ifdef TARGET_NR_pipe { TARGET_NR_pipe, "pipe" , NULL, NULL, NULL }, @@ -1502,7 +1503,7 @@ { TARGET_NR_sysfs, "sysfs" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_sysinfo -{ TARGET_NR_sysinfo, "sysinfo" , NULL, NULL, NULL }, +{ TARGET_NR_sysinfo, "sysinfo" , "%s(%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sys_kexec_load { TARGET_NR_sys_kexec_load, "sys_kexec_load" , NULL, NULL, NULL }, Reviewed-by: Laurent Vivier
Re: [PATCH v2] linux-user: Fix SO_ERROR return code of getsockopt()
Le 27/01/2023 à 21:25, Helge Deller a écrit : Add translation for the host error return code of: getsockopt(19, SOL_SOCKET, SO_ERROR, [ECONNREFUSED], [4]) = 0 This fixes the testsuite of the cockpit debian package with a hppa-linux guest on a x86-64 host. Signed-off-by: Helge Deller --- v2: Fix indenting to make checkscript.sh happy diff --git a/linux-user/syscall.c b/linux-user/syscall.c index dac0cfe6c4..06e8612675 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -2809,8 +2809,13 @@ get_timeout: ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv)); if (ret < 0) return ret; -if (optname == SO_TYPE) { +switch (optname) { +case SO_TYPE: val = host_to_target_sock_type(val); +break; +case SO_ERROR: +val = host_to_target_errno(val); +break; } if (len > lv) len = lv; Applied to my linux-user-for-8.0 branch. Thanks, Laurent
Re: [PATCH v4] linux-user: Fix /proc/cpuinfo output for hppa
Le 27/01/2023 à 21:10, Helge Deller a écrit : The hppa architectures provides an own output for the emulated /proc/cpuinfo file. Some userspace applications count (even if that's not the recommended way) the number of lines which start with "processor:" and assume that this number then reflects the number of online CPUs. Since those 3 architectures don't provide any such line, applications may assume "0" CPUs. One such issue can be seen in debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024653 Avoid such issues by adding a "processor:" line for each of the online CPUs. Signed-off-by: Helge Deller Reviewed-by: Philippe Mathieu-Daudé --- v4: - Drop sparc changes v3: - add trailing newline at end of /proc/cpuinfo file (needed by procps) v2: - drop m68k part (based on feedback from Laurent Vivier ) - change commit message diff --git a/linux-user/syscall.c b/linux-user/syscall.c index afb24fd0b9..5fa2ae6c8a 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8318,11 +8318,17 @@ static int open_cpuinfo(CPUArchState *cpu_env, int fd) #if defined(TARGET_HPPA) static int open_cpuinfo(CPUArchState *cpu_env, int fd) { -dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); -dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); -dprintf(fd, "capabilities\t: os32\n"); -dprintf(fd, "model\t\t: 9000/778/B160L\n"); -dprintf(fd, "model name\t: Merlin L2 160 QEMU (9000/778/B160L)\n"); +int i, num_cpus; + +num_cpus = sysconf(_SC_NPROCESSORS_ONLN); +for (i = 0; i < num_cpus; i++) { +dprintf(fd, "processor\t: %d\n", i); +dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); +dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); +dprintf(fd, "capabilities\t: os32\n"); +dprintf(fd, "model\t\t: 9000/778/B160L - " +"Merlin L2 160 QEMU (9000/778/B160L)\n\n"); +} return 0; } #endif Applied to my linux-user-for-8.0 branch. Thanks, Laurent
Re: [PATCH v2] linux-user: Improve strace output of personality() and sysinfo()
Le 27/01/2023 à 21:18, Helge Deller a écrit : Make the strace look nicer for those two syscalls. Signed-off-by: Helge Deller -- v2: use TARGET_ABI_FMT_lx instead of %p in personality output as suggested by Philippe Mathieu-Daudé and Laurent Vivier diff --git a/linux-user/strace.list b/linux-user/strace.list index f9254725a1..703c0f1608 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -1043,7 +1043,8 @@ { TARGET_NR_perfctr, "perfctr" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_personality -{ TARGET_NR_personality, "personality" , NULL, NULL, NULL }, +{ TARGET_NR_personality, "personality" , "%s(0x"TARGET_ABI_FMT_lx")", NULL, + print_syscall_ret_addr }, #endif #ifdef TARGET_NR_pipe { TARGET_NR_pipe, "pipe" , NULL, NULL, NULL }, @@ -1502,7 +1503,7 @@ { TARGET_NR_sysfs, "sysfs" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_sysinfo -{ TARGET_NR_sysinfo, "sysinfo" , NULL, NULL, NULL }, +{ TARGET_NR_sysinfo, "sysinfo" , "%s(%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sys_kexec_load { TARGET_NR_sys_kexec_load, "sys_kexec_load" , NULL, NULL, NULL }, Applied to my linux-user-for-8.0 branch. Thanks, Laurent
Re: [PATCH v3 09/14] RISC-V: Adding T-Head MemIdx extension
On 2023/1/25 5:21, Richard Henderson wrote: On 1/24/23 09:59, Christoph Muellner wrote: +/* XTheadMemIdx */ + +/* + * Load with memop from indexed address and add (imm5 << imm2) to rs1. + * If !preinc, then the load address is rs1. + * If preinc, then the load address is rs1 + (imm5) << imm2). + */ +static bool gen_load_inc(DisasContext *ctx, arg_th_meminc *a, MemOp memop, + bool preinc) +{ + TCGv rd = dest_gpr(ctx, a->rd); + TCGv addr = get_address(ctx, a->rs1, preinc ? a->imm5 << a->imm2 : 0); + + tcg_gen_qemu_ld_tl(rd, addr, ctx->mem_idx, memop); + addr = get_address(ctx, a->rs1, !preinc ? a->imm5 << a->imm2 : 0); First, you're leaking the previous 'addr' temporary. Although all temps allocated by temp_new() will be freed after the instruction translation automatically, we can improve current implementation. Second, get_address may make modifications to 'addr' which you don't want to write back. Good catch. Third, you are not checking for rd != rs1. Yes. I think what you want is int imm = a->imm5 << a->imm2; TCGv addr = get_address(ctx, a->rs1, preinc ? imm : 0); TCGv rd = dest_gpr(ctx, a->rd); TCGv rs1 = get_gpr(ctx, a->rs1, EXT_NONE); tcg_gen_qemu_ld_tl(rd, addr, ctx->mem_idx, memop); tcg_gen_addi_tl(rs1, rs1, imm); gen_set_gpr(ctx, a->rd, rd); gen_set_gpr(ctx, a->rs1, rs1); Yes, we should write back the 'addr' without modification. Best Regards, Zhiwei r~
Re: [PATCH v3 12/14] RISC-V: Add initial support for T-Head C906
On 2023/1/25 5:26, Richard Henderson wrote: On 1/24/23 09:59, Christoph Muellner wrote: +++ b/target/riscv/cpu.h @@ -27,6 +27,7 @@ #include "qom/object.h" #include "qemu/int128.h" #include "cpu_bits.h" +#include "cpu_vendorid.h" I don't see that this ID is required for all users of riscv/cpu.h. This include should be limited to cpu.c. OK. We can move it to cpu.c Best Regards, Zhiwei r~
Re: [PATCH v2] linux-user: Improve strace output of pread64() and pwrite64()
Le 27/01/2023 à 21:58, Helge Deller a écrit : Make the strace look nicer for those two syscalls. Signed-off-by: Helge Deller --- v2: Use regpairs_aligned() and target_offset64(), noticed by Laurent Vivier diff --git a/linux-user/strace.c b/linux-user/strace.c index 82dc1a1e20..379536f5c9 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -3824,6 +3824,25 @@ print_rlimit64(abi_ulong rlim_addr, int last) } } +#if defined(TARGET_NR_pread64) || defined(TARGET_NR_pwrite64) +static void +print_preadwrite64(CPUArchState *cpu_env, const struct syscallname *name, + abi_long arg0, abi_long arg1, abi_long arg2, + abi_long arg3, abi_long arg4, abi_long arg5) +{ +if (regpairs_aligned(cpu_env, TARGET_NR_pread64)) { +arg3 = arg4; +arg4 = arg5; +} +print_syscall_prologue(name); +print_raw_param("%d", arg0, 0); +print_pointer(arg1, 0); +print_raw_param("%d", arg2, 0); +qemu_log("%lld", (long long)target_offset64(arg3, arg4)); better to use: print_raw_param("%" PRIu64, target_offset64(arg3, arg4), 1); Thanks, Laurent
Re: [PATCH 2/2] tcg: use QTree instead of GTree
On Sun, Jan 29, 2023 at 05:38:08PM -0500, Emilio Cota wrote: > On Wed, Jan 25, 2023 at 15:58:25 +, Daniel P. Berrangé wrote: > > On Wed, Jan 11, 2023 at 12:34:29PM +, Daniel P. Berrangé wrote: > > > On Tue, Jan 10, 2023 at 10:55:36PM -0500, Emilio Cota wrote: > > > > qemu-user can hang in a multi-threaded fork. One common > > > > reason is that when creating a TB, between fork and exec > > > > we manipulate a GTree whose memory allocator (GSlice) is > > > > not fork-safe. > > > > > > BTW, I just checked latest glib status > > > > > > https://gitlab.gnome.org/GNOME/glib/-/issues/1079 > > > > > > it appears they're pretty close to deciding to delete the > > > GSlice impl and always use system malloc. > > > > They have now merged the code to delete the GSlice custom allocator. > > > > So glib >= 2.76.0 should not exhibit a hang > > > > > So if we do take this patch series it'll hopefully be a time > > > limited thing to carry. > > > > So the question is whether the issue is critical enough that we want > > to carry a workaround for a while, vs telling users to upgrade to > > newer glib (once 2.76 actually gets released) > > That is great news! > > Since this is a correctness issue, I think we should ship with qtree > and use it when configuring with glib <2.76.0. For later glib versions > we would just use gtree, e.g. via typedef + inline functions. > > Once the minimum glib required by the configure script is >= 2.76.0, > then we'd remove qtree. > > If that sounds like a good plan, I can send a v2. I'm fine with it, but be good to have an opinion here from the TCG subsystem maintainers, CC'ing them With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH qemu v3] x86: don't let decompressed kernel image clobber setup_data
On Sat, Jan 28, 2023 at 06:15:03AM -0500, Michael S. Tsirkin wrote: > On Mon, Jan 23, 2023 at 06:37:21AM -0600, Jason A. Donenfeld wrote: > > On Mon, Jan 23, 2023 at 6:12 AM Michael S. Tsirkin wrote: > > > > > > On Sun, Jan 22, 2023 at 08:21:30PM -0800, Eric Biggers wrote: > > > > Hi Michael, > > > > > > > > On Tue, Jan 10, 2023 at 12:50:42PM -0500, Michael S. Tsirkin wrote: > > > > > On Tue, Jan 10, 2023 at 04:34:49PM +0100, Jason A. Donenfeld wrote: > > > > > > Hi Michael, > > > > > > > > > > > > Could you queue up this patch and mark it as a fix for 7.2.1? It is > > > > > > a > > > > > > straight-up bug fix for a 7.2 regression that's now affected several > > > > > > users. > > > > > > > > > > OK. In the future pls cc me if you want me to merge a patch. Thanks! > > > > > > > > > > > - It has two Tested-by tags on the thread. > > > > > > - hpa, the maintainer of the kernel side of this, confirmed on one > > > > > > of > > > > > > the various tributary threads that this approach is a correct one. > > > > > > - It doesn't introduce any new functionality. > > > > > > > > > > > > For your convenience, you can grab this out of lore here: > > > > > > > > > > > > > > > > > > https://lore.kernel.org/lkml/20221230220725.618763-1-ja...@zx2c4.com/ > > > > > > > > > > > > Or if you want to yolo it: > > > > > > > > > > > > curl > > > > > > https://lore.kernel.org/lkml/20221230220725.618763-1-ja...@zx2c4.com/raw > > > > > > | git am -s > > > > > > > > > > > > It's now sat silent on the mailing list for a while. So let's > > > > > > please get > > > > > > this committed and backported so that the bug reports stop coming > > > > > > in. > > > > > > > > > > > > > > This patch still isn't on QEMU's master branch. What happened to it? > > > > > > > > - Eric > > > > > > Indeed though I remember picking it up. Tagged again now. Thanks! > > > > Thanks. What branch is this in? I didn't see it on: > > https://gitlab.com/mstredhat/qemu/-/branches/active > > https://github.com/mstsirkin/qemu/branches > > I don't use github really. And it was not pushed to gitlab as I was > figuring out issues with other patches before starting CI as CI minutes > are limited. QEMU CI config as of a few months ago, pushing to gitlab will *not* start CI pipelines. You need to explicitly use opt-in to it when pushing by using 'git push -o ci.variable=QEMU_CI=1' to create a pipeline with all jobs manual or QEMU_CI=2 to create a pipeline and immediately run all jobs. Alternatively use the web UI to start the pipeline, again setting QEMU_CI=1|2 So don't let CI minutes concerns dissuade from pushing to gitlab merely to publish the code. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 08/11] multifd: Add capability to enable/disable zero_page
Juan Quintela writes: > We have to enable it by default until we introduce the new code. > > Signed-off-by: Juan Quintela The subject doesn't quite match the patch to the QAPI schema. It claims "capability to enable/disable zero_page", but ... > --- > > Change it to a capability. As capabilities are off by default, have > to change MULTIFD_ZERO_PAGE to MAIN_ZERO_PAGE, so it is false for > default, and true for older versions. > --- > qapi/migration.json | 8 +++- > migration/migration.h | 1 + > hw/core/machine.c | 1 + > migration/migration.c | 13 - > 4 files changed, 21 insertions(+), 2 deletions(-) > > diff --git a/qapi/migration.json b/qapi/migration.json > index 88ecf86ac8..ac5bc071a9 100644 > --- a/qapi/migration.json > +++ b/qapi/migration.json > @@ -472,12 +472,18 @@ > # Requires that QEMU be permitted to use locked memory > # for guest RAM pages. > # (since 7.1) > +# > # @postcopy-preempt: If enabled, the migration process will allow postcopy > #requests to preempt precopy stream, so postcopy requests > #will be handled faster. This is a performance feature > and > #should not affect the correctness of postcopy migration. > #(since 7.1) > # > +# @main-zero-page: If enabled, the detection of zero pages will be > +# done on the main thread. Otherwise it is done on > +# the multifd threads. ... here, we add a capability to shift certain work to another thread. No "enable/disable" as far as I can tell. Which one is right? What's the default? Not this patch's fault, but needs fixing: we neglect to document the default for several other parameters. Wordsmithing nitpick: suggest "done by the thread" or maybe "done in the thread". @main-zero-page suggests this is about a special zero page. Perhaps I can think of a clearer name, but first I need to be sure what the thing is about. > +# (since 8.0) > +# > # Features: > # @unstable: Members @x-colo and @x-ignore-shared are experimental. > # > @@ -492,7 +498,7 @@ > 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', > { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, > 'validate-uuid', 'background-snapshot', > - 'zero-copy-send', 'postcopy-preempt'] } > + 'zero-copy-send', 'postcopy-preempt', 'main-zero-page'] } > > ## > # @MigrationCapabilityStatus:
Re: [PATCH 0/3] Misc sm501 clean ups
On 1/28/23 23:43, BALATON Zoltan wrote: On Mon, 23 Jan 2023, Philippe Mathieu-Daudé wrote: On 21/1/23 21:35, BALATON Zoltan wrote: Some small trivial clean ups I've found while looking at this file. BALATON Zoltan (3): hw/display/sm501: Remove parenthesis around consant macro definitions hw/display/sm501: Remove unneeded casts from void pointer hw/display/sm501: Code style fix hw/display/sm501.c | 419 +++-- 1 file changed, 210 insertions(+), 209 deletions(-) Reviewed-by: Philippe Mathieu-Daudé Ping? Who will merge this series? Should Daniel take it via PPC or Gerd for display? I only care that it gets in one way or another and not lost between maintainers. I'm planning a PR at the end of this week. I can take these in. Thanks, Daniel Regards, BALATON Zoltan
[PATCH v4 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
From: Guohuai Shi This commit implements Windows specific xxxdir() APIs for safety directory access. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 6 + hw/9pfs/9p-util-win32.c | 296 2 files changed, 302 insertions(+) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 0f159fb4ce..c1c251fbd1 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -141,6 +141,12 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags); int statfs_win32(const char *root_path, struct statfs *stbuf); int openat_dir(int dirfd, const char *name); int openat_file(int dirfd, const char *name, int flags, mode_t mode); +DIR *opendir_win32(const char *full_file_name); +int closedir_win32(DIR *pDir); +struct dirent *readdir_win32(DIR *pDir); +void rewinddir_win32(DIR *pDir); +void seekdir_win32(DIR *pDir, long pos); +long telldir_win32(DIR *pDir); #endif static inline void close_preserve_errno(int fd) diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index a99d579a06..5503199300 100644 --- a/hw/9pfs/9p-util-win32.c +++ b/hw/9pfs/9p-util-win32.c @@ -37,6 +37,13 @@ *Windows does not support opendir, the directory fd is created by *CreateFile and convert to fd by _open_osfhandle(). Keep the fd open will *lock and protect the directory (can not be modified or replaced) + * + * 5. Windows and MinGW does not provide safety directory accessing functions. + *readdir(), seekdir() and telldir() may get or set wrong value because + *directory entry data is not protected. + * + *This file re-write POSIX directory accessing functions and cache all + *directory entries during opening. */ #include "qemu/osdep.h" @@ -51,6 +58,27 @@ #define V9FS_MAGIC 0x53465039 /* string "9PFS" */ +/* + * MinGW and Windows does not provide safety way to seek directory while other + * thread is modifying same directory. + * + * The two structures are used to cache all directory entries when opening it. + * Cached entries are always returned for read or seek. + */ +struct dir_win32_entry { +QSLIST_ENTRY(dir_win32_entry) node; +struct _finddata_t dd_data; +}; + +struct dir_win32 { +struct dirent dd_dir; +uint32_t offset; +uint32_t total_entries; +QSLIST_HEAD(, dir_win32_entry) head; +struct dir_win32_entry *current; +char dd_name[1]; +}; + /* * win32_error_to_posix - convert Win32 error to POSIX error number * @@ -977,3 +1005,271 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev) errno = ENOTSUP; return -1; } + +/* + * opendir_win32 - open a directory + * + * This function opens a directory and caches all directory entries. + */ +DIR *opendir_win32(const char *full_file_name) +{ +HANDLE hDir = INVALID_HANDLE_VALUE; +DWORD attribute; +intptr_t dd_handle = -1; +struct _finddata_t dd_data; + +struct dir_win32 *stream = NULL; +struct dir_win32_entry *dir_entry; +struct dir_win32_entry *prev; +struct dir_win32_entry *next; + +int err = 0; +int find_status; +uint32_t index; + +/* open directory to prevent it being removed */ + +hDir = CreateFile(full_file_name, GENERIC_READ, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + +if (hDir == INVALID_HANDLE_VALUE) { +err = win32_error_to_posix(GetLastError()); +goto out; +} + +attribute = GetFileAttributes(full_file_name); + +/* symlink is not allow */ +if (attribute == INVALID_FILE_ATTRIBUTES +|| (attribute & FILE_ATTRIBUTE_REPARSE_POINT) != 0) { +err = EACCES; +goto out; +} + +/* check if it is a directory */ +if ((attribute & FILE_ATTRIBUTE_DIRECTORY) == 0) { +err = ENOTDIR; +goto out; +} + +/* + * findfirst() need suffix format name like "\dir1\dir2\*", allocate more + * buffer to store suffix. + */ +stream = g_malloc0(sizeof(struct dir_win32) + strlen(full_file_name) + 3); +QSLIST_INIT(&stream->head); + +strcpy(stream->dd_name, full_file_name); +strcat(stream->dd_name, "\\*"); + +dd_handle = _findfirst(stream->dd_name, &dd_data); + +if (dd_handle == -1) { +err = errno; +goto out; +} + +index = 0; + +/* read all entries to link list */ +do { +dir_entry = g_malloc0(sizeof(struct dir_win32_entry)); +memcpy(&dir_entry->dd_data, &dd_data, sizeof(dd_data)); +if (index == 0) { +QSLIST_INSERT_HEAD(&stream->head, dir_entry, node); +} else { +QSLIST_INSERT_AFTER(prev, dir_entry, node); +} + +prev = dir_entry; +find_status = _findnext(dd_handle, &dd_data); + +index++; +} while (find_status == 0); + +if (errno == ENOENT) { +/* No more matching files could
[PATCH v4 03/16] hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
From: Guohuai Shi xxxdir() APIs are not safe on Windows host. For future extension to Windows, let's replace the direct call to xxxdir() APIs with a wrapper. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 14 ++ hw/9pfs/9p-local.c | 12 ++-- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 90420a7578..0f159fb4ce 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -103,6 +103,13 @@ static inline int errno_to_dotl(int err) { #define qemu_renameat renameat_win32 #define qemu_utimensat utimensat_win32 #define qemu_unlinkat unlinkat_win32 + +#define qemu_opendiropendir_win32 +#define qemu_closedir closedir_win32 +#define qemu_readdirreaddir_win32 +#define qeme_rewinddir rewinddir_win32 +#define qemu_seekdirseekdir_win32 +#define qemu_telldirtelldir_win32 #else #define qemu_openat openat #define qemu_fstatatfstatat @@ -110,6 +117,13 @@ static inline int errno_to_dotl(int err) { #define qemu_renameat renameat #define qemu_utimensat utimensat #define qemu_unlinkat unlinkat + +#define qemu_opendiropendir +#define qemu_closedir closedir +#define qemu_readdirreaddir +#define qeme_rewinddir rewinddir +#define qemu_seekdirseekdir +#define qemu_telldirtelldir #endif #ifdef CONFIG_WIN32 diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c index b6102c9e5a..4385f18da2 100644 --- a/hw/9pfs/9p-local.c +++ b/hw/9pfs/9p-local.c @@ -495,7 +495,7 @@ static int local_close(FsContext *ctx, V9fsFidOpenState *fs) static int local_closedir(FsContext *ctx, V9fsFidOpenState *fs) { -return closedir(fs->dir.stream); +return qemu_closedir(fs->dir.stream); } static int local_open(FsContext *ctx, V9fsPath *fs_path, @@ -533,12 +533,12 @@ static int local_opendir(FsContext *ctx, static void local_rewinddir(FsContext *ctx, V9fsFidOpenState *fs) { -rewinddir(fs->dir.stream); +qeme_rewinddir(fs->dir.stream); } static off_t local_telldir(FsContext *ctx, V9fsFidOpenState *fs) { -return telldir(fs->dir.stream); +return qemu_telldir(fs->dir.stream); } static bool local_is_mapped_file_metadata(FsContext *fs_ctx, const char *name) @@ -552,13 +552,13 @@ static struct dirent *local_readdir(FsContext *ctx, V9fsFidOpenState *fs) struct dirent *entry; again: -entry = readdir(fs->dir.stream); +entry = qemu_readdir(fs->dir.stream); if (!entry) { return NULL; } #ifdef CONFIG_DARWIN int off; -off = telldir(fs->dir.stream); +off = qemu_telldir(fs->dir.stream); /* If telldir fails, fail the entire readdir call */ if (off < 0) { return NULL; @@ -581,7 +581,7 @@ again: static void local_seekdir(FsContext *ctx, V9fsFidOpenState *fs, off_t off) { -seekdir(fs->dir.stream, off); +qemu_seekdir(fs->dir.stream, off); } static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs, -- 2.25.1
[PATCH v4 00/16] hw/9pfs: Add 9pfs support for Windows
At present there is no Windows support for 9p file system. This series adds initial Windows support for 9p file system. 'local' file system backend driver is supported on Windows, including open, read, write, close, rename, remove, etc. All security models are supported. The mapped (mapped-xattr) security model is implemented using NTFS Alternate Data Stream (ADS) so the 9p export path shall be on an NTFS partition. 'synth' driver is adapted for Windows too so that we can now run qtests on Windows for 9p related regression testing. Example command line to test: "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device virtio-9p-pci,fsdev=p9,mount_tag=p9fs" Base-commit: 13356edb87506c148b163b8c7eb0695647d00c2a Changes in v4: - Fixed 9pfs mounted as read-only issue on Windows host, adding a win32_error_to_posix() to translate Windows native API error to POSIX one. - Fixed errors of handling symbolic links - Added forward declaration to avoid using 'void *' - Implemented Windows specific xxxdir() APIs for safe directory access Bin Meng (2): hw/9pfs: Update helper qemu_stat_rdev() hw/9pfs: Add a helper qemu_stat_blksize() Guohuai Shi (14): hw/9pfs: Add missing definitions for Windows hw/9pfs: Implement Windows specific utilities functions for 9pfs hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper hw/9pfs: Implement Windows specific xxxdir() APIs hw/9pfs: Update the local fs driver to support Windows hw/9pfs: Support getting current directory offset for Windows hw/9pfs: Disable unsupported flags and features for Windows hw/9pfs: Update v9fs_set_fd_limit() for Windows hw/9pfs: Add Linux error number definition hw/9pfs: Translate Windows errno to Linux value fsdev: Disable proxy fs driver on Windows hw/9pfs: Update synth fs driver for Windows tests/qtest: virtio-9p-test: Adapt the case for win32 meson.build: Turn on virtfs for Windows meson.build | 10 +- fsdev/file-op-9p.h| 33 + hw/9pfs/9p-linux-errno.h | 151 +++ hw/9pfs/9p-local.h|8 + hw/9pfs/9p-util.h | 139 ++- hw/9pfs/9p.h | 43 + tests/qtest/libqos/virtio-9p-client.h |7 + fsdev/qemu-fsdev.c|2 + hw/9pfs/9p-local.c| 269 - hw/9pfs/9p-synth.c|5 +- hw/9pfs/9p-util-win32.c | 1305 + hw/9pfs/9p.c | 90 +- hw/9pfs/codir.c |2 +- fsdev/meson.build |1 + hw/9pfs/meson.build |8 +- 15 files changed, 2008 insertions(+), 65 deletions(-) create mode 100644 hw/9pfs/9p-linux-errno.h create mode 100644 hw/9pfs/9p-util-win32.c -- 2.25.1
[PATCH v4 02/16] hw/9pfs: Implement Windows specific utilities functions for 9pfs
From: Guohuai Shi Windows POSIX API and MinGW library do not provide the NO_FOLLOW flag, and do not allow opening a directory by POSIX open(). This causes all xxx_at() functions cannot work directly. However, we can provide Windows handle based functions to emulate xxx_at() functions (e.g.: openat_win32, utimensat_win32, etc.). NTFS ADS (Alternate Data Streams) is used to emulate 9pfs extended attributes on Windows. Symbolic link is only supported when security model is "mapped-xattr" or "mapped-file". Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-local.h | 7 + hw/9pfs/9p-util.h | 32 +- hw/9pfs/9p-local.c | 4 - hw/9pfs/9p-util-win32.c | 979 4 files changed, 1017 insertions(+), 5 deletions(-) create mode 100644 hw/9pfs/9p-util-win32.c diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h index 32c72749d9..77e7f57f89 100644 --- a/hw/9pfs/9p-local.h +++ b/hw/9pfs/9p-local.h @@ -13,6 +13,13 @@ #ifndef QEMU_9P_LOCAL_H #define QEMU_9P_LOCAL_H +typedef struct { +int mountfd; +#ifdef CONFIG_WIN32 +char *root_path; +#endif +} LocalData; + int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags, mode_t mode); int local_opendir_nofollow(FsContext *fs_ctx, const char *path); diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index c314cf381d..90420a7578 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -88,18 +88,46 @@ static inline int errno_to_dotl(int err) { return err; } -#ifdef CONFIG_DARWIN +#if defined(CONFIG_DARWIN) #define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0) +#elif defined(CONFIG_WIN32) +#define qemu_fgetxattr fgetxattr_win32 #else #define qemu_fgetxattr fgetxattr #endif +#ifdef CONFIG_WIN32 +#define qemu_openat openat_win32 +#define qemu_fstatatfstatat_win32 +#define qemu_mkdiratmkdirat_win32 +#define qemu_renameat renameat_win32 +#define qemu_utimensat utimensat_win32 +#define qemu_unlinkat unlinkat_win32 +#else #define qemu_openat openat #define qemu_fstatatfstatat #define qemu_mkdiratmkdirat #define qemu_renameat renameat #define qemu_utimensat utimensat #define qemu_unlinkat unlinkat +#endif + +#ifdef CONFIG_WIN32 +char *get_full_path_win32(HANDLE hDir, const char *name); +ssize_t fgetxattr_win32(int fd, const char *name, void *value, size_t size); +int openat_win32(int dirfd, const char *pathname, int flags, mode_t mode); +int fstatat_win32(int dirfd, const char *pathname, + struct stat *statbuf, int flags); +int mkdirat_win32(int dirfd, const char *pathname, mode_t mode); +int renameat_win32(int olddirfd, const char *oldpath, + int newdirfd, const char *newpath); +int utimensat_win32(int dirfd, const char *pathname, +const struct timespec times[2], int flags); +int unlinkat_win32(int dirfd, const char *pathname, int flags); +int statfs_win32(const char *root_path, struct statfs *stbuf); +int openat_dir(int dirfd, const char *name); +int openat_file(int dirfd, const char *name, int flags, mode_t mode); +#endif static inline void close_preserve_errno(int fd) { @@ -108,6 +136,7 @@ static inline void close_preserve_errno(int fd) errno = serrno; } +#ifndef CONFIG_WIN32 static inline int openat_dir(int dirfd, const char *name) { return qemu_openat(dirfd, name, @@ -154,6 +183,7 @@ again: errno = serrno; return fd; } +#endif ssize_t fgetxattrat_nofollow(int dirfd, const char *path, const char *name, void *value, size_t size); diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c index 9d07620235..b6102c9e5a 100644 --- a/hw/9pfs/9p-local.c +++ b/hw/9pfs/9p-local.c @@ -53,10 +53,6 @@ #define BTRFS_SUPER_MAGIC 0x9123683E #endif -typedef struct { -int mountfd; -} LocalData; - int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags, mode_t mode) { diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c new file mode 100644 index 00..a99d579a06 --- /dev/null +++ b/hw/9pfs/9p-util-win32.c @@ -0,0 +1,979 @@ +/* + * 9p utilities (Windows Implementation) + * + * Copyright (c) 2022 Wind River Systems, Inc. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +/* + * This file contains Windows only functions for 9pfs. + * + * For 9pfs Windows host, the following features are different from Linux host: + * + * 1. Windows POSIX API does not provide the NO_FOLLOW flag, that means MinGW + *cannot detect if a path is a symbolic link or not. Also Windows do not + *provide POSIX compatible readlink(). Supporting symbolic link in 9pfs on + *Windows may cause security issues, so symbolic link support is disabled + *completely for security model "none" or "passthrough". + * + * 2. Windows file system does not support extended attr
[PATCH v4 05/16] hw/9pfs: Update the local fs driver to support Windows
From: Guohuai Shi Update the 9p 'local' file system driver to support Windows, including open, read, write, close, rename, remove, etc. All security models are supported. The mapped (mapped-xattr) security model is implemented using NTFS Alternate Data Stream (ADS) so the 9p export path shall be on an NTFS partition. Symbolic link and hard link are not supported when security model is "passthrough" or "none", because Windows NTFS does not fully support them with POSIX compatibility. Symbolic link is enabled when security model is "mapped-file" or "mapped-xattr". inode remap is always enabled because Windows file system does not provide a compatible inode number. mknod() is not supported because Windows does not support it. chown() and chmod() are not supported when 9pfs is configured with security mode to 'none' or 'passthrough' because Windows host does not support such type request. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-local.h | 1 + hw/9pfs/9p-local.c | 253 +++-- 2 files changed, 246 insertions(+), 8 deletions(-) diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h index 77e7f57f89..5905923881 100644 --- a/hw/9pfs/9p-local.h +++ b/hw/9pfs/9p-local.h @@ -17,6 +17,7 @@ typedef struct { int mountfd; #ifdef CONFIG_WIN32 char *root_path; +DWORD block_size; #endif } LocalData; diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c index 4385f18da2..d308a88759 100644 --- a/hw/9pfs/9p-local.c +++ b/hw/9pfs/9p-local.c @@ -21,11 +21,13 @@ #include "9p-xattr.h" #include "9p-util.h" #include "fsdev/qemu-fsdev.h" /* local_ops */ +#ifndef CONFIG_WIN32 #include #include #include #include #include +#endif #include "qemu/xattr.h" #include "qapi/error.h" #include "qemu/cutils.h" @@ -38,7 +40,9 @@ #include #endif #endif +#ifndef CONFIG_WIN32 #include +#endif #ifndef XFS_SUPER_MAGIC #define XFS_SUPER_MAGIC 0x58465342 @@ -90,10 +94,12 @@ int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags, return fd; } +#ifndef CONFIG_WIN32 int local_opendir_nofollow(FsContext *fs_ctx, const char *path) { return local_open_nofollow(fs_ctx, path, O_DIRECTORY | O_RDONLY, 0); } +#endif static void renameat_preserve_errno(int odirfd, const char *opath, int ndirfd, const char *npath) @@ -236,7 +242,7 @@ static int local_set_mapped_file_attrat(int dirfd, const char *name, int ret; char buf[ATTR_MAX]; int uid = -1, gid = -1, mode = -1, rdev = -1; -int map_dirfd = -1, map_fd; +int map_dirfd = -1; bool is_root = !strcmp(name, "."); if (is_root) { @@ -300,10 +306,12 @@ update_map_file: return -1; } -map_fd = fileno(fp); +#ifndef CONFIG_WIN32 +int map_fd = fileno(fp); assert(map_fd != -1); ret = fchmod(map_fd, 0600); assert(ret == 0); +#endif if (credp->fc_uid != -1) { uid = credp->fc_uid; @@ -335,6 +343,7 @@ update_map_file: return 0; } +#ifndef CONFIG_WIN32 static int fchmodat_nofollow(int dirfd, const char *name, mode_t mode) { struct stat stbuf; @@ -396,6 +405,7 @@ static int fchmodat_nofollow(int dirfd, const char *name, mode_t mode) close_preserve_errno(fd); return ret; } +#endif static int local_set_xattrat(int dirfd, const char *path, FsCred *credp) { @@ -436,6 +446,7 @@ static int local_set_xattrat(int dirfd, const char *path, FsCred *credp) return 0; } +#ifndef CONFIG_WIN32 static int local_set_cred_passthrough(FsContext *fs_ctx, int dirfd, const char *name, FsCred *credp) { @@ -452,6 +463,7 @@ static int local_set_cred_passthrough(FsContext *fs_ctx, int dirfd, return fchmodat_nofollow(dirfd, name, credp->fc_mode & 0); } +#endif static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path, char *buf, size_t bufsz) @@ -470,6 +482,12 @@ static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path, close_preserve_errno(fd); } else if ((fs_ctx->export_flags & V9FS_SM_PASSTHROUGH) || (fs_ctx->export_flags & V9FS_SM_NONE)) { +#ifdef CONFIG_WIN32 +errno = ENOTSUP; +error_report_once("readlink is not available on Windows host when" + "security_model is \"none\" or \"passthrough\""); +tsize = -1; +#else char *dirpath = g_path_get_dirname(fs_path->data); char *name = g_path_get_basename(fs_path->data); int dirfd; @@ -484,6 +502,7 @@ static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path, out: g_free(name); g_free(dirpath); +#endif } return tsize; } @@ -522,9 +541,31 @@ static int local_opendir(FsContext *ctx, return -1; } +#ifdef CONFIG_WIN32 +char *full_file_name; + +HANDLE hDir = (HANDLE)_get_osfhandle(dirfd); + +full_file_name = get_full_pat
[PATCH v4 08/16] hw/9pfs: Add a helper qemu_stat_blksize()
As Windows host does not have stat->st_blksize field, we use the one we calculated in init_win32_root_directory(). Add a helper qemu_stat_blksize() and use it to avoid direct access to stat->st_blksize. Co-developed-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 13 + hw/9pfs/9p-util-win32.c | 7 +++ hw/9pfs/9p.c| 13 - 3 files changed, 32 insertions(+), 1 deletion(-) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 1fb54d0b97..ea8c116059 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -156,6 +156,7 @@ void seekdir_win32(DIR *pDir, long pos); long telldir_win32(DIR *pDir); off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs); uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx); +uint64_t qemu_stat_blksize_win32(struct FsContext *fs_ctx); #endif static inline void close_preserve_errno(int fd) @@ -285,6 +286,18 @@ static inline uint64_t qemu_stat_rdev(const struct stat *stbuf, #endif } +static inline uint64_t qemu_stat_blksize(const struct stat *stbuf, + struct FsContext *fs_ctx) +{ +#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN) +return stbuf->st_blksize; +#elif defined(CONFIG_WIN32) +return qemu_stat_blksize_win32(fs_ctx); +#else +#error Missing qemu_stat_blksize() implementation for this host system +#endif +} + /* * As long as mknodat is not available on macOS, this workaround * using pthread_fchdir_np is needed. qemu_mknodat is defined in diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index 5f6d43b62c..5ece1db7aa 100644 --- a/hw/9pfs/9p-util-win32.c +++ b/hw/9pfs/9p-util-win32.c @@ -1296,3 +1296,10 @@ uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx) return rdev; } + +uint64_t qemu_stat_blksize_win32(struct FsContext *fs_ctx) +{ +LocalData *data = fs_ctx->private; + +return data ? (uint64_t)data->block_size : 0; +} diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 36916fe581..def85a57fa 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -1335,12 +1335,14 @@ static int32_t blksize_to_iounit(const V9fsPDU *pdu, int32_t blksize) static int32_t stat_to_iounit(const V9fsPDU *pdu, const struct stat *stbuf) { -return blksize_to_iounit(pdu, stbuf->st_blksize); +return blksize_to_iounit(pdu, qemu_stat_blksize(stbuf, &pdu->s->ctx)); } static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf, V9fsStatDotl *v9lstat) { +dev_t rdev = qemu_stat_rdev(stbuf, &pdu->s->ctx); + memset(v9lstat, 0, sizeof(*v9lstat)); v9lstat->st_mode = stbuf->st_mode; @@ -1350,7 +1352,16 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf, v9lstat->st_rdev = host_dev_to_dotl_dev(rdev); v9lstat->st_size = stbuf->st_size; v9lstat->st_blksize = stat_to_iounit(pdu, stbuf); +#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN) v9lstat->st_blocks = stbuf->st_blocks; +#elif defined(CONFIG_WIN32) +if (v9lstat->st_blksize == 0) { +v9lstat->st_blocks = 0; +} else { +v9lstat->st_blocks = ROUND_UP(v9lstat->st_size / v9lstat->st_blksize, + v9lstat->st_blksize); +} +#endif v9lstat->st_atime_sec = stbuf->st_atime; v9lstat->st_mtime_sec = stbuf->st_mtime; v9lstat->st_ctime_sec = stbuf->st_ctime; -- 2.25.1
[PATCH v4 09/16] hw/9pfs: Disable unsupported flags and features for Windows
From: Guohuai Shi Some flags and features are not supported on Windows, like mknod, readlink, file mode, etc. Update the codes for Windows. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p.c | 45 ++--- 1 file changed, 38 insertions(+), 7 deletions(-) diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index def85a57fa..2497a06f43 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -39,6 +39,11 @@ #include "qemu/xxhash.h" #include +#ifdef CONFIG_WIN32 +#define UTIME_NOW ((1l << 30) - 1l) +#define UTIME_OMIT ((1l << 30) - 2l) +#endif + int open_fd_hw; int total_open_fd; static int open_fd_rc; @@ -132,13 +137,17 @@ static int dotl_to_open_flags(int flags) DotlOpenflagMap dotl_oflag_map[] = { { P9_DOTL_CREATE, O_CREAT }, { P9_DOTL_EXCL, O_EXCL }, +#ifndef CONFIG_WIN32 { P9_DOTL_NOCTTY , O_NOCTTY }, +#endif { P9_DOTL_TRUNC, O_TRUNC }, { P9_DOTL_APPEND, O_APPEND }, +#ifndef CONFIG_WIN32 { P9_DOTL_NONBLOCK, O_NONBLOCK } , { P9_DOTL_DSYNC, O_DSYNC }, { P9_DOTL_FASYNC, FASYNC }, -#ifndef CONFIG_DARWIN +#endif +#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32) { P9_DOTL_NOATIME, O_NOATIME }, /* * On Darwin, we could map to F_NOCACHE, which is @@ -151,8 +160,10 @@ static int dotl_to_open_flags(int flags) #endif { P9_DOTL_LARGEFILE, O_LARGEFILE }, { P9_DOTL_DIRECTORY, O_DIRECTORY }, +#ifndef CONFIG_WIN32 { P9_DOTL_NOFOLLOW, O_NOFOLLOW }, { P9_DOTL_SYNC, O_SYNC }, +#endif }; for (i = 0; i < ARRAY_SIZE(dotl_oflag_map); i++) { @@ -179,8 +190,11 @@ static int get_dotl_openflags(V9fsState *s, int oflags) * Filter the client open flags */ flags = dotl_to_open_flags(oflags); -flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT); -#ifndef CONFIG_DARWIN +flags &= ~(O_CREAT); +#ifndef CONFIG_WIN32 +flags &= ~(O_NOCTTY | O_ASYNC); +#endif +#if !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32) /* * Ignore direct disk access hint until the server supports it. */ @@ -1117,12 +1131,14 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString *extension) if (mode & P9_STAT_MODE_SYMLINK) { ret |= S_IFLNK; } +#ifndef CONFIG_WIN32 if (mode & P9_STAT_MODE_SOCKET) { ret |= S_IFSOCK; } if (mode & P9_STAT_MODE_NAMED_PIPE) { ret |= S_IFIFO; } +#endif if (mode & P9_STAT_MODE_DEVICE) { if (extension->size && extension->data[0] == 'c') { ret |= S_IFCHR; @@ -1203,6 +1219,7 @@ static uint32_t stat_to_v9mode(const struct stat *stbuf) mode |= P9_STAT_MODE_SYMLINK; } +#ifndef CONFIG_WIN32 if (S_ISSOCK(stbuf->st_mode)) { mode |= P9_STAT_MODE_SOCKET; } @@ -1210,6 +1227,7 @@ static uint32_t stat_to_v9mode(const struct stat *stbuf) if (S_ISFIFO(stbuf->st_mode)) { mode |= P9_STAT_MODE_NAMED_PIPE; } +#endif if (S_ISBLK(stbuf->st_mode) || S_ISCHR(stbuf->st_mode)) { mode |= P9_STAT_MODE_DEVICE; @@ -1369,7 +1387,8 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf, v9lstat->st_atime_nsec = stbuf->st_atimespec.tv_nsec; v9lstat->st_mtime_nsec = stbuf->st_mtimespec.tv_nsec; v9lstat->st_ctime_nsec = stbuf->st_ctimespec.tv_nsec; -#else +#endif +#ifdef CONFIG_LINUX v9lstat->st_atime_nsec = stbuf->st_atim.tv_nsec; v9lstat->st_mtime_nsec = stbuf->st_mtim.tv_nsec; v9lstat->st_ctime_nsec = stbuf->st_ctim.tv_nsec; @@ -2492,6 +2511,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, struct dirent *dent; struct stat *st; struct V9fsDirEnt *entries = NULL; +unsigned char d_type = 0; /* * inode remapping requires the device id, which in turn might be @@ -2553,10 +2573,13 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, v9fs_string_init(&name); v9fs_string_sprintf(&name, "%s", dent->d_name); +#ifndef CONFIG_WIN32 +d_type = dent->d_type; +#endif /* 11 = 7 + 4 (7 = start offset, 4 = space for storing count) */ len = pdu_marshal(pdu, 11 + count, "Qqbs", &qid, off, - dent->d_type, &name); + d_type, &name); v9fs_string_free(&name); @@ -2912,8 +2935,12 @@ static void coroutine_fn v9fs_create(void *opaque) v9fs_path_copy(&fidp->path, &path); v9fs_path_unlock(s); } else if (perm & P9_STAT_MODE_SOCKET) { +#ifndef CONFIG_WIN32 err = v9fs_co_mknod(pdu, fidp, &name, fidp->uid, -1, 0, S_IFSOCK | (perm & 0777), &stbuf); +#else +err = -ENOTSUP; +#endif if (err < 0) { goto out; } @@ -3983,7 +4010,7 @@ out_nofid: #if defined(CONFIG_LINUX) /* Currently, only Linux has XATTR_SIZE_MAX */ #define P9_XATTR_SIZE_MAX XATT
[PATCH v4 01/16] hw/9pfs: Add missing definitions for Windows
From: Guohuai Shi Some definitions currently used by the 9pfs codes are only available on POSIX platforms. Let's add our own ones in preparation to adding 9pfs support for Windows. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- fsdev/file-op-9p.h | 33 + hw/9pfs/9p.h | 43 +++ 2 files changed, 76 insertions(+) diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h index 4997677460..7d9a736b66 100644 --- a/fsdev/file-op-9p.h +++ b/fsdev/file-op-9p.h @@ -27,6 +27,39 @@ # include #endif +#ifdef CONFIG_WIN32 + +/* POSIX structure not defined in Windows */ + +typedef uint32_t uid_t; +typedef uint32_t gid_t; + +/* from http://man7.org/linux/man-pages/man2/statfs.2.html */ +typedef uint32_t __fsword_t; +typedef uint32_t fsblkcnt_t; +typedef uint32_t fsfilcnt_t; + +/* from linux/include/uapi/asm-generic/posix_types.h */ +typedef struct { +long __val[2]; +} fsid_t; + +struct statfs { +__fsword_t f_type; +__fsword_t f_bsize; +fsblkcnt_t f_blocks; +fsblkcnt_t f_bfree; +fsblkcnt_t f_bavail; +fsfilcnt_t f_files; +fsfilcnt_t f_ffree; +fsid_t f_fsid; +__fsword_t f_namelen; +__fsword_t f_frsize; +__fsword_t f_flags; +}; + +#endif /* CONFIG_WIN32 */ + #define SM_LOCAL_MODE_BITS0600 #define SM_LOCAL_DIR_MODE_BITS0700 diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h index 2fce4140d1..ada9f14ebc 100644 --- a/hw/9pfs/9p.h +++ b/hw/9pfs/9p.h @@ -3,13 +3,56 @@ #include #include +#ifndef CONFIG_WIN32 #include +#endif #include "fsdev/file-op-9p.h" #include "fsdev/9p-iov-marshal.h" #include "qemu/thread.h" #include "qemu/coroutine.h" #include "qemu/qht.h" +#ifdef CONFIG_WIN32 + +/* Windows does not provide such a macro, typically it is 255 */ +#define NAME_MAX255 + +/* macros required for build, values do not matter */ +#define AT_SYMLINK_NOFOLLOW 0x100 /* Do not follow symbolic links */ +#define AT_REMOVEDIR0x200 /* Remove directory instead of file */ +#define O_DIRECTORY 0200 + +#define makedev(major, minor) \ +((dev_t)major) & 0xfff) << 8) | ((minor) & 0xff))) +#define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff)) +#define minor(dev) ((unsigned int)(((dev) & 0xff))) + +/* + * Currenlty Windows/MinGW does not provide the following flag macros, + * so define them here for 9p codes. + * + * Once Windows/MinGW provides them, remove the defines to prevent conflicts. + */ + +#ifndef S_IFLNK +#define S_IFLNK 0xA000 +#define S_ISLNK(mode) ((mode & S_IFMT) == S_IFLNK) +#endif /* S_IFLNK */ + +#ifndef S_ISUID +#define S_ISUID 0x0800 +#endif + +#ifndef S_ISGID +#define S_ISGID 0x0400 +#endif + +#ifndef S_ISVTX +#define S_ISVTX 0x0200 +#endif + +#endif /* CONFIG_WIN32 */ + enum { P9_TLERROR = 6, P9_RLERROR, -- 2.25.1
[PATCH v4 12/16] hw/9pfs: Translate Windows errno to Linux value
From: Guohuai Shi Some of Windows error numbers have different value from Linux ones. For example, ENOTEMPTY is defined to 39 in Linux, but is defined to 41 in Windows. So deleting a directory from a Linux guest on top of QEMU from a Windows host complains: # rmdir tmp rmdir: 'tmp': Unknown error 41 This commit provides error number translation from Windows to Linux. It can make Linux guest OS happy with the error number when running on top of QEMU from a Windows host. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 778352b8ec..824ac81ad3 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -72,9 +72,9 @@ static inline int errno_to_dotl(int err) { #if defined(CONFIG_LINUX) /* nothing to translate (Linux -> Linux) */ -#elif defined(CONFIG_DARWIN) +#elif defined(CONFIG_DARWIN) || defined(CONFIG_WIN32) /* - * translation mandatory for macOS hosts + * translation mandatory for different hosts * * FIXME: Only most important errnos translated here yet, this should be * extended to as many errnos being translated as possible in future. @@ -83,9 +83,17 @@ static inline int errno_to_dotl(int err) case ENAMETOOLONG: return L_ENAMETOOLONG; case ENOTEMPTY: return L_ENOTEMPTY; case ELOOP: return L_ELOOP; +#ifdef CONFIG_DARWIN case ENOATTR: return L_ENODATA; case ENOTSUPreturn L_EOPNOTSUPP; case EOPNOTSUPP:return L_EOPNOTSUPP; +#endif +#ifdef CONFIG_WIN32 +case EDEADLK: return L_EDEADLK; +case ENOLCK:return L_ENOLCK; +case ENOSYS:return L_ENOSYS; +case EILSEQ:return L_EILSEQ; +#endif } #else #error Missing errno translation to Linux for this host system -- 2.25.1
[PATCH v4 13/16] fsdev: Disable proxy fs driver on Windows
From: Guohuai Shi We don't plan to support 'proxy' file system driver for 9pfs on Windows. Disable it for Windows build. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- fsdev/qemu-fsdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c index 3da64e9f72..58e0710fbb 100644 --- a/fsdev/qemu-fsdev.c +++ b/fsdev/qemu-fsdev.c @@ -89,6 +89,7 @@ static FsDriverTable FsDrivers[] = { NULL }, }, +#ifndef CONFIG_WIN32 { .name = "proxy", .ops = &proxy_ops, @@ -100,6 +101,7 @@ static FsDriverTable FsDrivers[] = { NULL }, }, +#endif }; static int validate_opt(void *opaque, const char *name, const char *value, -- 2.25.1
[PATCH v4 07/16] hw/9pfs: Update helper qemu_stat_rdev()
As Windows host does not have stat->st_rdev field, we use the first 3 characters of the root path to build a device id. Co-developed-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 22 +++--- hw/9pfs/9p-util-win32.c | 18 ++ hw/9pfs/9p.c| 5 +++-- 3 files changed, 40 insertions(+), 5 deletions(-) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 91f70a4c38..1fb54d0b97 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -22,8 +22,9 @@ /* forward declaration */ union V9fsFidOpenState; struct V9fsState; +struct FsContext; -#if !defined(CONFIG_LINUX) +#ifdef CONFIG_DARWIN /* * Generates a Linux device number (a.k.a. dev_t) for given device major @@ -55,10 +56,12 @@ static inline uint64_t makedev_dotl(uint32_t dev_major, uint32_t dev_minor) */ static inline uint64_t host_dev_to_dotl_dev(dev_t dev) { -#ifdef CONFIG_LINUX +#if defined(CONFIG_LINUX) || defined(CONFIG_WIN32) return dev; -#else +#elif defined(CONFIG_DARWIN) return makedev_dotl(major(dev), minor(dev)); +#else +#error Missing host_dev_to_dotl_dev() implementation for this host system #endif } @@ -152,6 +155,7 @@ void rewinddir_win32(DIR *pDir); void seekdir_win32(DIR *pDir, long pos); long telldir_win32(DIR *pDir); off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs); +uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx); #endif static inline void close_preserve_errno(int fd) @@ -269,6 +273,18 @@ static inline struct dirent *qemu_dirent_dup(struct dirent *dent) return g_memdup(dent, sz); } +static inline uint64_t qemu_stat_rdev(const struct stat *stbuf, + struct FsContext *fs_ctx) +{ +#if defined(CONFIG_LINUX) || defined(CONFIG_DARWIN) +return stbuf->st_rdev; +#elif defined(CONFIG_WIN32) +return qemu_stat_rdev_win32(fs_ctx); +#else +#error Missing qemu_stat_rdev() implementation for this host system +#endif +} + /* * As long as mknodat is not available on macOS, this workaround * using pthread_fchdir_np is needed. qemu_mknodat is defined in diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index 050c177d0c..5f6d43b62c 100644 --- a/hw/9pfs/9p-util-win32.c +++ b/hw/9pfs/9p-util-win32.c @@ -1278,3 +1278,21 @@ off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs) { return s->ops->telldir(&s->ctx, fs); } + +uint64_t qemu_stat_rdev_win32(struct FsContext *fs_ctx) +{ +uint64_t rdev = 0; +LocalData *data = fs_ctx->private; + +/* + * As Windows host does not have stat->st_rdev field, we use the first + * 3 characters of the root path to build a device id. + * + * (Windows root path always starts from a driver letter like "C:\") + */ +if (data) { +memcpy(&rdev, data->root_path, 3); +} + +return rdev; +} diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index be247eeb30..36916fe581 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -1266,7 +1266,8 @@ static int coroutine_fn stat_to_v9stat(V9fsPDU *pdu, V9fsPath *path, } else if (v9stat->mode & P9_STAT_MODE_DEVICE) { v9fs_string_sprintf(&v9stat->extension, "%c %u %u", S_ISCHR(stbuf->st_mode) ? 'c' : 'b', -major(stbuf->st_rdev), minor(stbuf->st_rdev)); +major(qemu_stat_rdev(stbuf, &pdu->s->ctx)), +minor(qemu_stat_rdev(stbuf, &pdu->s->ctx))); } else if (S_ISDIR(stbuf->st_mode) || S_ISREG(stbuf->st_mode)) { v9fs_string_sprintf(&v9stat->extension, "%s %lu", "HARDLINKCOUNT", (unsigned long)stbuf->st_nlink); @@ -1346,7 +1347,7 @@ static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf, v9lstat->st_nlink = stbuf->st_nlink; v9lstat->st_uid = stbuf->st_uid; v9lstat->st_gid = stbuf->st_gid; -v9lstat->st_rdev = host_dev_to_dotl_dev(stbuf->st_rdev); +v9lstat->st_rdev = host_dev_to_dotl_dev(rdev); v9lstat->st_size = stbuf->st_size; v9lstat->st_blksize = stat_to_iounit(pdu, stbuf); v9lstat->st_blocks = stbuf->st_blocks; -- 2.25.1
[PATCH v4 16/16] meson.build: Turn on virtfs for Windows
From: Guohuai Shi Enable virtfs configuration option for Windows host. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- meson.build | 10 +- fsdev/meson.build | 1 + hw/9pfs/meson.build | 8 +--- 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/meson.build b/meson.build index 6d3b665629..8123136fdf 100644 --- a/meson.build +++ b/meson.build @@ -1751,16 +1751,16 @@ dbus_display = get_option('dbus_display') \ .allowed() have_virtfs = get_option('virtfs') \ -.require(targetos == 'linux' or targetos == 'darwin', - error_message: 'virtio-9p (virtfs) requires Linux or macOS') \ -.require(targetos == 'linux' or cc.has_function('pthread_fchdir_np'), +.require(targetos == 'linux' or targetos == 'darwin' or targetos == 'windows', + error_message: 'virtio-9p (virtfs) requires Linux or macOS or Windows') \ +.require(targetos == 'linux' or targetos == 'windows' or cc.has_function('pthread_fchdir_np'), error_message: 'virtio-9p (virtfs) on macOS requires the presence of pthread_fchdir_np') \ -.require(targetos == 'darwin' or (libattr.found() and libcap_ng.found()), +.require(targetos == 'darwin' or targetos == 'windows' or (libattr.found() and libcap_ng.found()), error_message: 'virtio-9p (virtfs) on Linux requires libcap-ng-devel and libattr-devel') \ .disable_auto_if(not have_tools and not have_system) \ .allowed() -have_virtfs_proxy_helper = targetos != 'darwin' and have_virtfs and have_tools +have_virtfs_proxy_helper = targetos != 'darwin' and targetos != 'windows' and have_virtfs and have_tools if get_option('block_drv_ro_whitelist') == '' config_host_data.set('CONFIG_BDRV_RO_WHITELIST', '') diff --git a/fsdev/meson.build b/fsdev/meson.build index b632b66348..2aad081aef 100644 --- a/fsdev/meson.build +++ b/fsdev/meson.build @@ -8,6 +8,7 @@ fsdev_ss.add(when: ['CONFIG_FSDEV_9P'], if_true: files( ), if_false: files('qemu-fsdev-dummy.c')) softmmu_ss.add_all(when: 'CONFIG_LINUX', if_true: fsdev_ss) softmmu_ss.add_all(when: 'CONFIG_DARWIN', if_true: fsdev_ss) +softmmu_ss.add_all(when: 'CONFIG_WIN32', if_true: fsdev_ss) if have_virtfs_proxy_helper executable('virtfs-proxy-helper', diff --git a/hw/9pfs/meson.build b/hw/9pfs/meson.build index 12443b6ad5..aaa50e71f7 100644 --- a/hw/9pfs/meson.build +++ b/hw/9pfs/meson.build @@ -2,7 +2,6 @@ fs_ss = ss.source_set() fs_ss.add(files( '9p-local.c', '9p-posix-acl.c', - '9p-proxy.c', '9p-synth.c', '9p-xattr-user.c', '9p-xattr.c', @@ -13,8 +12,11 @@ fs_ss.add(files( 'coth.c', 'coxattr.c', )) -fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-util-linux.c')) -fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-util-darwin.c')) +fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-proxy.c', + '9p-util-linux.c')) +fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-proxy.c', +'9p-util-darwin.c')) +fs_ss.add(when: 'CONFIG_WIN32', if_true: files('9p-util-win32.c')) fs_ss.add(when: 'CONFIG_XEN', if_true: files('xen-9p-backend.c')) softmmu_ss.add_all(when: 'CONFIG_FSDEV_9P', if_true: fs_ss) -- 2.25.1
[PATCH v4 15/16] tests/qtest: virtio-9p-test: Adapt the case for win32
From: Guohuai Shi Windows does not provide the getuid() API. Let's create a local one and return a fixed value 0 as the uid for testing. Co-developed-by: Xuzhou Cheng Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng Reviewed-by: Thomas Huth --- tests/qtest/libqos/virtio-9p-client.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/tests/qtest/libqos/virtio-9p-client.h b/tests/qtest/libqos/virtio-9p-client.h index 78228eb97d..a5c0107580 100644 --- a/tests/qtest/libqos/virtio-9p-client.h +++ b/tests/qtest/libqos/virtio-9p-client.h @@ -491,4 +491,11 @@ void v9fs_rlink(P9Req *req); TunlinkatRes v9fs_tunlinkat(TunlinkatOpt); void v9fs_runlinkat(P9Req *req); +#ifdef CONFIG_WIN32 +static inline uint32_t getuid(void) +{ +return 0; +} +#endif + #endif -- 2.25.1
[PATCH v4 11/16] hw/9pfs: Add Linux error number definition
From: Guohuai Shi When using 9p2000.L protocol, the errno should use the Linux errno. Currently magic numbers with comments are used. Replace these with macros for future expansion. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-linux-errno.h | 151 +++ hw/9pfs/9p-util.h| 24 +++ 2 files changed, 162 insertions(+), 13 deletions(-) create mode 100644 hw/9pfs/9p-linux-errno.h diff --git a/hw/9pfs/9p-linux-errno.h b/hw/9pfs/9p-linux-errno.h new file mode 100644 index 00..56c37fa293 --- /dev/null +++ b/hw/9pfs/9p-linux-errno.h @@ -0,0 +1,151 @@ +/* + * 9p Linux errno translation definition + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include + +#ifndef QEMU_9P_LINUX_ERRNO_H +#define QEMU_9P_LINUX_ERRNO_H + +/* + * This file contains the Linux errno definitions to translate errnos set by + * the 9P server (running on non-Linux hosts) to a corresponding errno value. + * + * This list should be periodically reviewed and updated; particularly for + * errnos that might be set as a result of a file system operation. + */ + +#define L_EPERM 1 /* Operation not permitted */ +#define L_ENOENT2 /* No such file or directory */ +#define L_ESRCH 3 /* No such process */ +#define L_EINTR 4 /* Interrupted system call */ +#define L_EIO 5 /* I/O error */ +#define L_ENXIO 6 /* No such device or address */ +#define L_E2BIG 7 /* Argument list too long */ +#define L_ENOEXEC 8 /* Exec format error */ +#define L_EBADF 9 /* Bad file number */ +#define L_ECHILD10 /* No child processes */ +#define L_EAGAIN11 /* Try again */ +#define L_ENOMEM12 /* Out of memory */ +#define L_EACCES13 /* Permission denied */ +#define L_EFAULT14 /* Bad address */ +#define L_ENOTBLK 15 /* Block device required */ +#define L_EBUSY 16 /* Device or resource busy */ +#define L_EEXIST17 /* File exists */ +#define L_EXDEV 18 /* Cross-device link */ +#define L_ENODEV19 /* No such device */ +#define L_ENOTDIR 20 /* Not a directory */ +#define L_EISDIR21 /* Is a directory */ +#define L_EINVAL22 /* Invalid argument */ +#define L_ENFILE23 /* File table overflow */ +#define L_EMFILE24 /* Too many open files */ +#define L_ENOTTY25 /* Not a typewriter */ +#define L_ETXTBSY 26 /* Text file busy */ +#define L_EFBIG 27 /* File too large */ +#define L_ENOSPC28 /* No space left on device */ +#define L_ESPIPE29 /* Illegal seek */ +#define L_EROFS 30 /* Read-only file system */ +#define L_EMLINK31 /* Too many links */ +#define L_EPIPE 32 /* Broken pipe */ +#define L_EDOM 33 /* Math argument out of domain of func */ +#define L_ERANGE34 /* Math result not representable */ +#define L_EDEADLK 35 /* Resource deadlock would occur */ +#define L_ENAMETOOLONG 36 /* File name too long */ +#define L_ENOLCK37 /* No record locks available */ +#define L_ENOSYS38 /* Function not implemented */ +#define L_ENOTEMPTY 39 /* Directory not empty */ +#define L_ELOOP 40 /* Too many symbolic links encountered */ +#define L_ENOMSG42 /* No message of desired type */ +#define L_EIDRM 43 /* Identifier removed */ +#define L_ECHRNG44 /* Channel number out of range */ +#define L_EL2NSYNC 45 /* Level 2 not synchronized */ +#define L_EL3HLT46 /* Level 3 halted */ +#define L_EL3RST47 /* Level 3 reset */ +#define L_ELNRNG48 /* Link number out of range */ +#define L_EUNATCH 49 /* Protocol driver not attached */ +#define L_ENOCSI50 /* No CSI structure available */ +#define L_EL2HLT51 /* Level 2 halted */ +#define L_EBADE 52 /* Invalid exchange */ +#define L_EBADR 53 /* Invalid request descriptor */ +#define L_EXFULL54 /* Exchange full */ +#define L_ENOANO55 /* No anode */ +#define L_EBADRQC 56 /* Invalid request code */ +#define L_EBADSLT 57 /* Invalid slot */ +#define L_EBFONT58 /* Bad font file format */ +#define L_ENOSTR59 /* Device not a stream */ +#define L_ENODATA 61 /* No data available */ +#define L_ETIME 62 /* Timer expired */ +#define L_ENOSR 63 /* Out of streams resources */ +#define L_ENONET64 /* Machine is not on the network */ +#define L_ENOPKG65 /* Package not installed */ +#define L_EREMOTE 66 /* Object is remote */ +#defin
[PATCH v4 06/16] hw/9pfs: Support getting current directory offset for Windows
From: Guohuai Shi On Windows 'struct dirent' does not have current directory offset. Update qemu_dirent_off() to support Windows. While we are here, add a build time check to error out if a new host does not implement this helper. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p-util.h | 16 +--- hw/9pfs/9p-util-win32.c | 5 + hw/9pfs/9p.c| 4 ++-- hw/9pfs/codir.c | 2 +- 4 files changed, 21 insertions(+), 6 deletions(-) diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index c1c251fbd1..91f70a4c38 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -19,6 +19,10 @@ #define O_PATH_9P_UTIL 0 #endif +/* forward declaration */ +union V9fsFidOpenState; +struct V9fsState; + #if !defined(CONFIG_LINUX) /* @@ -147,6 +151,7 @@ struct dirent *readdir_win32(DIR *pDir); void rewinddir_win32(DIR *pDir); void seekdir_win32(DIR *pDir, long pos); long telldir_win32(DIR *pDir); +off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs); #endif static inline void close_preserve_errno(int fd) @@ -220,12 +225,17 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char *filename, * so ensure it is manually injected earlier and call here when * needed. */ -static inline off_t qemu_dirent_off(struct dirent *dent) +static inline off_t qemu_dirent_off(struct dirent *dent, struct V9fsState *s, +union V9fsFidOpenState *fs) { -#ifdef CONFIG_DARWIN +#if defined(CONFIG_DARWIN) return dent->d_seekoff; -#else +#elif defined(CONFIG_LINUX) return dent->d_off; +#elif defined(CONFIG_WIN32) +return qemu_dirent_off_win32(s, fs); +#else +#error Missing qemu_dirent_off() implementation for this host system #endif } diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index 5503199300..050c177d0c 100644 --- a/hw/9pfs/9p-util-win32.c +++ b/hw/9pfs/9p-util-win32.c @@ -1273,3 +1273,8 @@ long telldir_win32(DIR *pDir) return (long)stream->offset; } + +off_t qemu_dirent_off_win32(struct V9fsState *s, union V9fsFidOpenState *fs) +{ +return s->ops->telldir(&s->ctx, fs); +} diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 072cf67956..be247eeb30 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -2336,7 +2336,7 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu, count += len; v9fs_stat_free(&v9stat); v9fs_path_free(&path); -saved_dir_pos = qemu_dirent_off(dent); +saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); } v9fs_readdir_unlock(&fidp->fs.dir); @@ -2537,7 +2537,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, qid.version = 0; } -off = qemu_dirent_off(dent); +off = qemu_dirent_off(dent, pdu->s, &fidp->fs); v9fs_string_init(&name); v9fs_string_sprintf(&name, "%s", dent->d_name); diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index 7ba63be489..6d96e2d72b 100644 --- a/hw/9pfs/codir.c +++ b/hw/9pfs/codir.c @@ -167,7 +167,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, } size += len; -saved_dir_pos = qemu_dirent_off(dent); +saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); } /* restore (last) saved position */ -- 2.25.1
[PATCH v4 14/16] hw/9pfs: Update synth fs driver for Windows
From: Guohuai Shi Adapt synth fs driver for Windows in preparation to running qtest 9p testing on Windows. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng Reviewed-by: Philippe Mathieu-Daudé --- hw/9pfs/9p-synth.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c index f62c40b639..b1a362a689 100644 --- a/hw/9pfs/9p-synth.c +++ b/hw/9pfs/9p-synth.c @@ -146,8 +146,10 @@ static void synth_fill_statbuf(V9fsSynthNode *node, struct stat *stbuf) stbuf->st_gid = 0; stbuf->st_rdev = 0; stbuf->st_size = 0; +#ifndef CONFIG_WIN32 stbuf->st_blksize = 0; stbuf->st_blocks = 0; +#endif stbuf->st_atime = 0; stbuf->st_mtime = 0; stbuf->st_ctime = 0; @@ -230,7 +232,8 @@ static void synth_direntry(V9fsSynthNode *node, entry->d_ino = node->attr->inode; #ifdef CONFIG_DARWIN entry->d_seekoff = off + 1; -#else +#endif +#ifdef CONFIG_LINUX entry->d_off = off + 1; #endif } -- 2.25.1
[PATCH v4 10/16] hw/9pfs: Update v9fs_set_fd_limit() for Windows
From: Guohuai Shi Use _getmaxstdio() to set the fd limit on Windows. Signed-off-by: Guohuai Shi Signed-off-by: Bin Meng --- hw/9pfs/9p.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 2497a06f43..b55d0bc400 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -4396,11 +4396,28 @@ void v9fs_reset(V9fsState *s) static void __attribute__((__constructor__)) v9fs_set_fd_limit(void) { +int rlim_cur; +int ret; + +#ifndef CONFIG_WIN32 struct rlimit rlim; -if (getrlimit(RLIMIT_NOFILE, &rlim) < 0) { +ret = getrlimit(RLIMIT_NOFILE, &rlim); +rlim_cur = rlim.rlim_cur; +#else +/* + * On Windows host, _getmaxstdio() actually returns the number of max + * open files at the stdio level. It *may* be smaller than the number + * of open files by open() or CreateFile(). + */ +ret = _getmaxstdio(); +rlim_cur = ret; +#endif + +if (ret < 0) { error_report("Failed to get the resource limit"); exit(1); } -open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur / 3); -open_fd_rc = rlim.rlim_cur / 2; + +open_fd_hw = rlim_cur - MIN(400, rlim_cur / 3); +open_fd_rc = rlim_cur / 2; } -- 2.25.1
unable to use "-net user" argument after building from master branch
hello, I'm unable to use the "-net user" argument with the compiled "qemu-system-x86_64" binary. I get an error: "qemu-system-x86_64: -net user: network backend 'user' is not compiled into this binary" I don't know what I'm missing when I used the configure script with the following arguments: ../configure --enable-linux-user --enable-user --enable-curses --enable-vhost-net --enable-vhost-user --enable-png what am I doing wrong? what other arguments do I need to add to allow the "-net user" option to work?
Re: unable to use "-net user" argument after building from master branch
On Mon, Jan 30, 2023 at 6:01 PM Neal Elliott wrote: > > hello, > I'm unable to use the "-net user" argument with the compiled > "qemu-system-x86_64" binary. I get an error: "qemu-system-x86_64: -net user: > network backend 'user' is not compiled into this binary" > > I don't know what I'm missing when I used the configure script with the > following arguments: > > ../configure --enable-linux-user --enable-user --enable-curses > --enable-vhost-net --enable-vhost-user --enable-png > > what am I doing wrong? what other arguments do I need to add to allow the > "-net user" option to work? > Please install libslirp dev package for your host distribution. Regards, Bin
[PATCH] target/riscv: set tval for triggered watchpoints
From: Sergey Matyukevich According to priviledged spec, if [sm]tval is written with a nonzero value when a breakpoint exception occurs, then [sm]tval will contain the faulting virtual address. Set tval to hit address when breakpoint exception is triggered by hardware watchpoint. Signed-off-by: Sergey Matyukevich --- target/riscv/cpu_helper.c | 3 +++ target/riscv/debug.c | 1 + 2 files changed, 4 insertions(+) diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index 9a28816521..d3be8c0511 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -1641,6 +1641,9 @@ void riscv_cpu_do_interrupt(CPUState *cs) case RISCV_EXCP_VIRT_INSTRUCTION_FAULT: tval = env->bins; break; +case RISCV_EXCP_BREAKPOINT: +tval = env->badaddr; +break; default: break; } diff --git a/target/riscv/debug.c b/target/riscv/debug.c index bf4840a6a3..48ef3c59ea 100644 --- a/target/riscv/debug.c +++ b/target/riscv/debug.c @@ -761,6 +761,7 @@ void riscv_cpu_debug_excp_handler(CPUState *cs) if (cs->watchpoint_hit) { if (cs->watchpoint_hit->flags & BP_CPU) { +env->badaddr = cs->watchpoint_hit->hitaddr; cs->watchpoint_hit = NULL; do_trigger_action(env, DBG_ACTION_BP); } -- 2.39.0
RE: [PATCH 3/9] igb: implement VFRE and VFTE registers
> -Original Message- > From: Akihiko Odaki > Sent: Sunday, 29 January 2023 10:16 > To: Sriram Yagnaraman > Cc: qemu-devel@nongnu.org; Jason Wang ; Dmitry > Fleytman ; Michael S . Tsirkin > ; Marcel Apfelbaum > Subject: Re: [PATCH 3/9] igb: implement VFRE and VFTE registers > > On 2023/01/28 22:46, Sriram Yagnaraman wrote: > > Also add checks for RXDCTL/TXDCTL queue enable bits > > > > Signed-off-by: Sriram Yagnaraman > > --- > > hw/net/igb_core.c | 42 +++--- > > hw/net/igb_regs.h | 3 ++- > > 2 files changed, 33 insertions(+), 12 deletions(-) > > > > diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c index > > 9bd53cc25f..6bca5459b9 100644 > > --- a/hw/net/igb_core.c > > +++ b/hw/net/igb_core.c > > @@ -778,6 +778,19 @@ igb_txdesc_writeback(IGBCore *core, dma_addr_t > base, > > return igb_tx_wb_eic(core, txi->idx); > > } > > > > +static inline bool > > +igb_tx_enabled(IGBCore *core, const E1000E_RingInfo *txi) { > > +bool vmdq = core->mac[MRQC] & 1; > > +uint16_t qn = txi->idx; > > +uint16_t vfn = (qn > IGB_MAX_VF_FUNCTIONS) ? > > + (qn - IGB_MAX_VF_FUNCTIONS) : qn; > > + > > +return (core->mac[TCTL] & E1000_TCTL_EN) && > > +(vmdq ? (core->mac[VFTE] & BIT(vfn)) : true) && > > Instead, do: (!vmdq || core->mac[VFTE] & BIT(vfn)) > > > +(core->mac[TXDCTL0 + (qn * 16)] & E1000_TXDCTL_QUEUE_ENABLE); > > +} > > + > > static void > > igb_start_xmit(IGBCore *core, const IGB_TxRing *txr) > > { > > @@ -787,8 +800,7 @@ igb_start_xmit(IGBCore *core, const IGB_TxRing > *txr) > > const E1000E_RingInfo *txi = txr->i; > > uint32_t eic = 0; > > > > -/* TODO: check if the queue itself is enabled too. */ > > -if (!(core->mac[TCTL] & E1000_TCTL_EN)) { > > +if (!igb_tx_enabled(core, txi)) { > > trace_e1000e_tx_disabled(); > > return; > > } > > @@ -1003,6 +1015,7 @@ static uint16_t igb_receive_assign(IGBCore *core, > const struct eth_header *ehdr, > > queues = BIT(def_pl >> E1000_VT_CTL_DEFAULT_POOL_SHIFT); > > } > > > > +queues &= core->mac[VFRE]; > > igb_rss_parse_packet(core, core->rx_pkt, external_tx != NULL, > rss_info); > > if (rss_info->queue & 1) { > > queues <<= 8; > > @@ -1486,7 +1499,7 @@ igb_receive_internal(IGBCore *core, const struct > iovec *iov, int iovcnt, > > static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4); > > > > uint16_t queues = 0; > > -uint32_t n; > > +uint32_t n = 0; > > uint8_t min_buf[ETH_ZLEN]; > > struct iovec min_iov; > > struct eth_header *ehdr; > > @@ -1566,26 +1579,22 @@ igb_receive_internal(IGBCore *core, const > struct iovec *iov, int iovcnt, > > } > > > > igb_rx_ring_init(core, &rxr, i); > > - > > -trace_e1000e_rx_rss_dispatched_to_queue(rxr.i->idx); > > - > > if (!igb_has_rxbufs(core, rxr.i, total_size)) { > > retval = 0; > > } > > This stops sending packet when a disabled queue has no space. Yes, that is true, but I have refactored this part a bit in patchset 9 that fixes this. > > > } > > > > if (retval) { > > -n = E1000_ICR_RXT0; > > - > > igb_rx_fix_l4_csum(core, core->rx_pkt); > > > > for (i = 0; i < IGB_NUM_QUEUES; i++) { > > -if (!(queues & BIT(i))) { > > +if (!(queues & BIT(i)) || > > +!(core->mac[E1000_RXDCTL(i) >> 2] & > > + E1000_RXDCTL_QUEUE_ENABLE)) { > > continue; > > } > > > > igb_rx_ring_init(core, &rxr, i); > > - > > +trace_e1000e_rx_rss_dispatched_to_queue(rxr.i->idx); > > igb_write_packet_to_guest(core, core->rx_pkt, &rxr, > > &rss_info); > > > > /* Check if receive descriptor minimum threshold hit */ > > @@ -1594,6 +1603,9 @@ igb_receive_internal(IGBCore *core, const struct > iovec *iov, int iovcnt, > > } > > > > core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx); > > + > > +/* same as RXDW (rx descriptor written back)*/ > > +n = E1000_ICR_RXT0; > > } > > > > trace_e1000e_rx_written_to_guest(n); > > @@ -1981,9 +1993,16 @@ static void igb_set_vfmailbox(IGBCore *core, > > int index, uint32_t val) > > > > static void igb_vf_reset(IGBCore *core, uint16_t vfn) > > { > > +uint16_t qn0 = vfn; > > +uint16_t qn1 = vfn + IGB_MAX_VF_FUNCTIONS; > > + > > /* disable Rx and Tx for the VF*/ > > -core->mac[VFTE] &= ~BIT(vfn); > > +core->mac[RXDCTL0 + (qn0 * 16)] &= ~E1000_RXDCTL_QUEUE_ENABLE; > > +core->mac[RXDCTL0 + (qn1 * 16)] &= ~E1000_RXDCTL_QUEUE_ENABLE; > > +core->mac[TXDCTL0 + (qn0 * 16)] &= ~E1000_TXDCTL_QUEUE_ENABLE; > > +core->mac[TXDCTL0 + (qn1 * 16)] &= ~E1000_TXDCTL_QUEUE_ENABLE; > > core->mac[VFRE] &= ~BIT(vfn); > > +core->mac[VFTE] &= ~BIT(vfn); >
Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang wrote: > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo wrote: > > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang wrote: > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin > > > wrote: > > > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" > > > > > wrote: > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" > > > > > > > wrote: > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" > > > > > > > > > wrote: > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" > > > > > > > > > > > wrote: > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Check whether it is per-queue reset state in > > > > > > > > > > > > > virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx > > > > > > > > > > > > > resources. At this > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should > > > > > > > > > > > > > not try to send > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check > > > > > > > > > > > > > the current > > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > > Do you in fact mean it's called from > > > > > > > > > > > > flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > > k->queue_reset > > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > > > > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > > . > > > > > > > > > > > (callback) > > > > > > > > > > > virtio_net_tx_complete > > > > > > > > > > > > > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need > > > > > > > > > > > stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass > > > > > > > > > > > information through the stack. I > > > > > > > > > > > originally thought it was a general situation, so I > > > > > > > > > > > wanted to put it in > > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in > > > > > > > > > > > VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so > > > > > > > > > > > > virtio_net_flush_tx > > > > > > > > > > > > knows we are in the process of reset would be a bad > > > > > > > > > > > > idea. > > > > > > > > > > > > We want something much more local, ideally on stack > > > > > > > > > > > > even ... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > > Fixes: > > > > > > > > > > > > > https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > > Reported-by: Alexander Bulekov > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > > > > > > > --- > > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t > > > > > > > > > > > > > virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > > int queue_index = > > > > > > > > > > > > > vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > > -if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) > > > > > > > > > > > > > { > > > > > > > > > > > > >
[PATCH 5/6] gitlab-ci.d/buildtest: Merge the two gprof-gcov jobs
There is only one job depending on the build-gprof-gcov job, so there is no real need for keeping this separate. It likely only has been split since the complete runtime is more than 60 minutes, but that can be better handled with a proper timeout setting instead. By merging the two jobs, we safe some precious CI minutes for starting a new container for the second job each time. Signed-off-by: Thomas Huth --- .gitlab-ci.d/buildtest.yml | 15 ++- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 86f9c37a07..91c7467a66 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -468,26 +468,15 @@ tsan-build: MAKE_CHECK_ARGS: bench V=1 # gprof/gcov are GCC features -build-gprof-gcov: +gprof-gcov: extends: .native_build_job_template needs: job: amd64-ubuntu2004-container + timeout: 80m variables: IMAGE: ubuntu2004 CONFIGURE_ARGS: --enable-gprof --enable-gcov TARGETS: aarch64-softmmu ppc64-softmmu s390x-softmmu x86_64-softmmu - artifacts: -expire_in: 1 days -paths: - - build - -check-gprof-gcov: - extends: .native_test_job_template - needs: -- job: build-gprof-gcov - artifacts: true - variables: -IMAGE: ubuntu2004 MAKE_CHECK_ARGS: check after_script: - cd build -- 2.31.1
[PATCH 4/6] gitlab-ci.d/buildtest: Merge the --without-default-* jobs
Let's safe some CI minutes by merging these two jobs. We can now also drop "--disable-capstone" since the capstone submodule has been removed a while ago. We should rather tes --disable-fdt now to test a compilation without the "dtc" submodule (for this we have to drop i386-softmmu from the target list unfortunately). Additionally, the qtests with s390x and sh4 are not read for "--without-default-devices" yet, so we can only test mips64 and avr here now. Signed-off-by: Thomas Huth --- .gitlab-ci.d/buildtest.yml | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 7b55dfc434..86f9c37a07 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -557,29 +557,22 @@ build-coroutine-sigaltstack: MAKE_CHECK_ARGS: check-unit # Check our reduced build configurations -build-without-default-devices: +build-without-defaults: extends: .native_build_job_template needs: job: amd64-centos8-container variables: IMAGE: centos8 -CONFIGURE_ARGS: --without-default-devices --disable-user - -build-without-default-features: - extends: .native_build_job_template - needs: -job: amd64-fedora-container - variables: -IMAGE: fedora CONFIGURE_ARGS: + --without-default-devices --without-default-features - --disable-capstone + --disable-fdt --disable-pie --disable-qom-cast-debug --disable-strip -TARGETS: avr-softmmu i386-softmmu mips64-softmmu s390x-softmmu sh4-softmmu +TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user -MAKE_CHECK_ARGS: check-unit check-qtest SPEED=slow +MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 build-libvhost-user: extends: .base_job_template -- 2.31.1
[PATCH 3/6] tests/qtest/display-vga-test: Add proper checks if a device is available
display-vga-test currently tries to guess the usable VGA devices according to the target architecture that is used for the test. This of course does not work if QEMU has been built with the "--without-default-devices" configure switch. To fix this, use the qtest_has_device() function for the decision instead. This way we can also consolidate most of the test functions into one single function (that takes a parameter with the device name now), except for the multihead test that tries to instantiate two devices and thus is a little bit different. Signed-off-by: Thomas Huth --- tests/qtest/display-vga-test.c | 65 +- 1 file changed, 25 insertions(+), 40 deletions(-) diff --git a/tests/qtest/display-vga-test.c b/tests/qtest/display-vga-test.c index ace3bb28e0..1a26a66061 100644 --- a/tests/qtest/display-vga-test.c +++ b/tests/qtest/display-vga-test.c @@ -8,61 +8,46 @@ */ #include "qemu/osdep.h" -#include "libqtest-single.h" - -static void pci_cirrus(void) -{ -qtest_start("-vga none -device cirrus-vga"); -qtest_end(); -} - -static void pci_stdvga(void) -{ -qtest_start("-vga none -device VGA"); -qtest_end(); -} - -static void pci_secondary(void) -{ -qtest_start("-vga none -device secondary-vga"); -qtest_end(); -} +#include "libqtest.h" static void pci_multihead(void) { -qtest_start("-vga none -device VGA -device secondary-vga"); -qtest_end(); -} +QTestState *qts; -static void pci_virtio_gpu(void) -{ -qtest_start("-vga none -device virtio-gpu-pci"); -qtest_end(); +qts = qtest_init("-vga none -device VGA -device secondary-vga"); +qtest_quit(qts); } -static void pci_virtio_vga(void) +static void test_vga(gconstpointer data) { -qtest_start("-vga none -device virtio-vga"); -qtest_end(); +QTestState *qts; + +qts = qtest_initf("-vga none -device %s", (const char *)data); +qtest_quit(qts); } int main(int argc, char **argv) { -const char *arch = qtest_get_arch(); +const char *devices[] = { +"cirrus-vga", +"VGA", +"secondary-vga", +"virtio-gpu-pci", +"virtio-vga" +}; g_test_init(&argc, &argv, NULL); -if (strcmp(arch, "alpha") == 0 || strcmp(arch, "i386") == 0 || -strcmp(arch, "mips") == 0 || strcmp(arch, "x86_64") == 0) { -qtest_add_func("/display/pci/cirrus", pci_cirrus); +for (int i = 0; i < ARRAY_SIZE(devices); i++) { +if (qtest_has_device(devices[i])) { +char *testpath = g_strdup_printf("/display/pci/%s", devices[i]); +qtest_add_data_func(testpath, devices[i], test_vga); +g_free(testpath); +} } -qtest_add_func("/display/pci/stdvga", pci_stdvga); -qtest_add_func("/display/pci/secondary", pci_secondary); -qtest_add_func("/display/pci/multihead", pci_multihead); -qtest_add_func("/display/pci/virtio-gpu", pci_virtio_gpu); -if (g_str_equal(arch, "i386") || g_str_equal(arch, "x86_64") || -g_str_equal(arch, "hppa") || g_str_equal(arch, "ppc64")) { -qtest_add_func("/display/pci/virtio-vga", pci_virtio_vga); + +if (qtest_has_device("secondary-vga")) { +qtest_add_func("/display/pci/multihead", pci_multihead); } return g_test_run(); -- 2.31.1
[PATCH 0/3] Vhost-user: replace _SLAVE_ with _BACKEND_
This series continues the work done to get rid of harmful language in the Vhost-user specification. While the spec texts were changed to replace slave with backend, the protocol features and messages names hadn't been changed. This series renames remaining occurences in the spec and make use of the new names in both libvhost-user and the Vhost-user frontend code. Maxime Coquelin (3): docs: vhost-user: replace _SLAVE_ with _BACKEND_ libvhost-user: Adopt new backend naming vhost-user: Adopt new backend naming docs/interop/vhost-user.rst | 40 +++ hw/virtio/vhost-user.c| 30 - hw/virtio/virtio-qmp.c| 12 +++ subprojects/libvhost-user/libvhost-user.c | 20 ++-- subprojects/libvhost-user/libvhost-user.h | 20 ++-- 5 files changed, 61 insertions(+), 61 deletions(-) -- 2.39.1
[PATCH 1/6] gitlab-ci.d/buildtest: Remove ppc-softmmu from the clang-system job
We are also compile-testing ppc64-softmmu with clang in the "tsan-build" job, and ppc64-softmmu covers pretty much the same code as ppc-softmmu, so we should not lose much test coverage here by removing ppc-softmmu from the "clang-system" job. Signed-off-by: Thomas Huth --- .gitlab-ci.d/buildtest.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index f09a898c3e..406608e5fc 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -316,8 +316,7 @@ clang-system: IMAGE: fedora CONFIGURE_ARGS: --cc=clang --cxx=clang++ --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined -TARGETS: alpha-softmmu arm-softmmu m68k-softmmu mips64-softmmu - ppc-softmmu s390x-softmmu +TARGETS: alpha-softmmu arm-softmmu m68k-softmmu mips64-softmmu s390x-softmmu MAKE_CHECK_ARGS: check-qtest check-tcg clang-user: -- 2.31.1
[PATCH 2/3] libvhost-user: Adopt new backend naming
In order to get rid of harmful language, the Vhost-user specification changed features and requests naming from _SLAVE_ to _BACKEND_. This patch adopts the new naming convention. Signed-off-by: Maxime Coquelin --- subprojects/libvhost-user/libvhost-user.c | 20 ++-- subprojects/libvhost-user/libvhost-user.h | 20 ++-- 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index fc69783d2b..f661af7c85 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -140,7 +140,7 @@ vu_request_to_string(unsigned int req) REQ(VHOST_USER_SET_VRING_ENABLE), REQ(VHOST_USER_SEND_RARP), REQ(VHOST_USER_NET_SET_MTU), -REQ(VHOST_USER_SET_SLAVE_REQ_FD), +REQ(VHOST_USER_SET_BACKEND_REQ_FD), REQ(VHOST_USER_IOTLB_MSG), REQ(VHOST_USER_SET_VRING_ENDIAN), REQ(VHOST_USER_GET_CONFIG), @@ -1365,7 +1365,7 @@ bool vu_set_queue_host_notifier(VuDev *dev, VuVirtq *vq, int fd, int qidx = vq - dev->vq; int fd_num = 0; VhostUserMsg vmsg = { -.request = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG, +.request = VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG, .flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY_MASK, .size = sizeof(vmsg.payload.area), .payload.area = { @@ -1383,7 +1383,7 @@ bool vu_set_queue_host_notifier(VuDev *dev, VuVirtq *vq, int fd, vmsg.fd_num = fd_num; -if (!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD)) { +if (!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD)) { return false; } @@ -1461,9 +1461,9 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg) */ uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_MQ | 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | -1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | +1ULL << VHOST_USER_PROTOCOL_F_BACKEND_REQ | 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | -1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | +1ULL << VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD | 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | 1ULL << VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS; @@ -1494,7 +1494,7 @@ vu_set_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg) if (vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) && -(!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_REQ) || +(!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ) || !vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK))) { /* * The use case for using messages for kick/call is simulation, to make @@ -1507,7 +1507,7 @@ vu_set_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg) * that actually enables the simulation case. */ vu_panic(dev, - "F_IN_BAND_NOTIFICATIONS requires F_SLAVE_REQ && F_REPLY_ACK"); + "F_IN_BAND_NOTIFICATIONS requires F_BACKEND_REQ && F_REPLY_ACK"); return false; } @@ -1910,7 +1910,7 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg) return vu_get_queue_num_exec(dev, vmsg); case VHOST_USER_SET_VRING_ENABLE: return vu_set_vring_enable_exec(dev, vmsg); -case VHOST_USER_SET_SLAVE_REQ_FD: +case VHOST_USER_SET_BACKEND_REQ_FD: return vu_set_slave_req_fd(dev, vmsg); case VHOST_USER_GET_CONFIG: return vu_get_config(dev, vmsg); @@ -2416,9 +2416,9 @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync) if (vq->call_fd < 0 && vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) && -vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_REQ)) { +vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ)) { VhostUserMsg vmsg = { -.request = VHOST_USER_SLAVE_VRING_CALL, +.request = VHOST_USER_BACKEND_VRING_CALL, .flags = VHOST_USER_VERSION, .size = sizeof(vmsg.payload.state), .payload.state = { diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h index 8cda9b8f57..8c5a2719e3 100644 --- a/subprojects/libvhost-user/libvhost-user.h +++ b/subprojects/libvhost-user/libvhost-user.h @@ -54,12 +54,12 @@ enum VhostUserProtocolFeature { VHOST_USER_PROTOCOL_F_RARP = 2, VHOST_USER_PROTOCOL_F_REPLY_ACK = 3, VHOST_USER_PROTOCOL_F_NET_MTU = 4, -VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5, +VHOST_USER_PROTOCOL_F_BACKEND_REQ = 5, VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6, VHOST_USER_PR
[PATCH 1/3] docs: vhost-user: replace _SLAVE_ with _BACKEND_
Backend's message and protocol features names were still using "_SLAVE_" naming. For consistency with the new naming convention and to get rid of the remaining harmful language, replace it with _BACKEND_. Signed-off-by: Maxime Coquelin --- docs/interop/vhost-user.rst | 40 ++--- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst index 3f18ab424e..8a5924ea75 100644 --- a/docs/interop/vhost-user.rst +++ b/docs/interop/vhost-user.rst @@ -315,7 +315,7 @@ in the ancillary data: * ``VHOST_USER_SET_VRING_KICK`` * ``VHOST_USER_SET_VRING_CALL`` * ``VHOST_USER_SET_VRING_ERR`` -* ``VHOST_USER_SET_SLAVE_REQ_FD`` +* ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``) * ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``) If *front-end* is unable to send the full message or receives a wrong @@ -516,7 +516,7 @@ expected to reply with a zero payload, non-zero otherwise. The back-end relies on the back-end communication channel (see :ref:`Back-end communication ` section below) to send IOTLB miss -and access failure events, by sending ``VHOST_USER_SLAVE_IOTLB_MSG`` +and access failure events, by sending ``VHOST_USER_BACKEND_IOTLB_MSG`` requests to the front-end with a ``struct vhost_iotlb_msg`` as payload. For miss events, the iotlb payload has to be filled with the miss message type (1), the I/O virtual address and the permissions @@ -540,15 +540,15 @@ Back-end communication -- An optional communication channel is provided if the back-end declares -``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` protocol feature, to allow the +``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` protocol feature, to allow the back-end to make requests to the front-end. -The fd is provided via ``VHOST_USER_SET_SLAVE_REQ_FD`` ancillary data. +The fd is provided via ``VHOST_USER_SET_BACKEND_REQ_FD`` ancillary data. -A back-end may then send ``VHOST_USER_SLAVE_*`` messages to the front-end +A back-end may then send ``VHOST_USER_BACKEND_*`` messages to the front-end using this fd communication channel. -If ``VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD`` protocol feature is +If ``VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD`` protocol feature is negotiated, back-end can send file descriptors (at most 8 descriptors in each message) to front-end via ancillary data using this fd communication channel. @@ -835,7 +835,7 @@ Note that due to the fact that too many messages on the sockets can cause the sending application(s) to block, it is not advised to use this feature unless absolutely necessary. It is also considered an error to negotiate this feature without also negotiating -``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` and ``VHOST_USER_PROTOCOL_F_REPLY_ACK``, +``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` and ``VHOST_USER_PROTOCOL_F_REPLY_ACK``, the former is necessary for getting a message channel from the back-end to the front-end, while the latter needs to be used with the in-band notification messages to block until they are processed, both to avoid @@ -855,12 +855,12 @@ Protocol features #define VHOST_USER_PROTOCOL_F_RARP 2 #define VHOST_USER_PROTOCOL_F_REPLY_ACK 3 #define VHOST_USER_PROTOCOL_F_MTU 4 - #define VHOST_USER_PROTOCOL_F_SLAVE_REQ 5 + #define VHOST_USER_PROTOCOL_F_BACKEND_REQ 5 #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN 6 #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION7 #define VHOST_USER_PROTOCOL_F_PAGEFAULT 8 #define VHOST_USER_PROTOCOL_F_CONFIG9 - #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD10 + #define VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD 10 #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER11 #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12 #define VHOST_USER_PROTOCOL_F_RESET_DEVICE 13 @@ -1059,8 +1059,8 @@ Front-end message types in the ancillary data. This signals that polling will be used instead of waiting for the call. Note that if the protocol features ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` and - ``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` have been negotiated this message - isn't necessary as the ``VHOST_USER_SLAVE_VRING_CALL`` message can be + ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` have been negotiated this message + isn't necessary as the ``VHOST_USER_BACKEND_VRING_CALL`` message can be used, it may however still be used to set an event file descriptor or to enable polling. @@ -1077,8 +1077,8 @@ Front-end message types invalid FD flag. This flag is set when there is no file descriptor in the ancillary data. Note that if the protocol features ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` and - ``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` have been negotiated this message - isn't necessary as the ``VHOST_USER_SLAVE_VRING_ERR`` message can be + ``VHOST_USER_PROTOCOL_F_BAC
[PATCH 0/6] Shorten the runtime of some gitlab-CI shared runner jobs
We're currently facing the problem that the gitlab-CI jobs for the shared runners take too much of the limited CI minutes on gitlab.com. Here are now some patches that optimize some of the jobs a little bit to take less runtime. We slightly lose some test coverage by some of these changes (e.g. by dropping ppc-softmmu from a Clang-based test and only continue testing ppc64-softmmu with Clang in another job), but that should still be much better than running out of CI minutes after 3/4 of a month. Thomas Huth (6): gitlab-ci.d/buildtest: Remove ppc-softmmu from the clang-system job gitlab-ci.d/buildtest: Remove aarch64-softmmu from the build-system-ubuntu job tests/qtest/display-vga-test: Add proper checks if a device is available gitlab-ci.d/buildtest: Merge the --without-default-* jobs gitlab-ci.d/buildtest: Merge the two gprof-gcov jobs gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job tests/qtest/display-vga-test.c | 65 +- .gitlab-ci.d/buildtest.yml | 51 ++ .gitlab-ci.d/crossbuilds.yml | 2 +- 3 files changed, 37 insertions(+), 81 deletions(-) -- 2.31.1
[PATCH 2/6] gitlab-ci.d/buildtest: Remove aarch64-softmmu from the build-system-ubuntu job
aarch64-softmmu is also checked on the same version of Ubuntu in the gprov-gcov job, so it is redundant to check it again in the normal ubuntu job. Signed-off-by: Thomas Huth --- .gitlab-ci.d/buildtest.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 406608e5fc..7b55dfc434 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -42,7 +42,7 @@ build-system-ubuntu: variables: IMAGE: ubuntu2004 CONFIGURE_ARGS: --enable-docs --enable-fdt=system --enable-capstone -TARGETS: aarch64-softmmu alpha-softmmu cris-softmmu hppa-softmmu +TARGETS: alpha-softmmu cris-softmmu hppa-softmmu microblazeel-softmmu mips64el-softmmu MAKE_CHECK_ARGS: check-build artifacts: -- 2.31.1
[PATCH 3/3] vhost-user: Adopt new backend naming
In order to get rid of harmful language, the Vhost-user specification changed features and requests naming from _SLAVE_ to _BACKEND_. This patch adopts the new naming convention. Signed-off-by: Maxime Coquelin --- hw/virtio/vhost-user.c | 30 +++--- hw/virtio/virtio-qmp.c | 12 ++-- 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d9ce0501b2..9b623ecf4a 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -40,7 +40,7 @@ #define VHOST_MEMORY_BASELINE_NREGIONS8 #define VHOST_USER_F_PROTOCOL_FEATURES 30 -#define VHOST_USER_SLAVE_MAX_FDS 8 +#define VHOST_USER_BACKEND_MAX_FDS 8 /* * Set maximum number of RAM slots supported to @@ -71,12 +71,12 @@ enum VhostUserProtocolFeature { VHOST_USER_PROTOCOL_F_RARP = 2, VHOST_USER_PROTOCOL_F_REPLY_ACK = 3, VHOST_USER_PROTOCOL_F_NET_MTU = 4, -VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5, +VHOST_USER_PROTOCOL_F_BACKEND_REQ = 5, VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6, VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7, VHOST_USER_PROTOCOL_F_PAGEFAULT = 8, VHOST_USER_PROTOCOL_F_CONFIG = 9, -VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10, +VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD = 10, VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11, VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12, VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13, @@ -110,7 +110,7 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_ENABLE = 18, VHOST_USER_SEND_RARP = 19, VHOST_USER_NET_SET_MTU = 20, -VHOST_USER_SET_SLAVE_REQ_FD = 21, +VHOST_USER_SET_BACKEND_REQ_FD = 21, VHOST_USER_IOTLB_MSG = 22, VHOST_USER_SET_VRING_ENDIAN = 23, VHOST_USER_GET_CONFIG = 24, @@ -134,11 +134,11 @@ typedef enum VhostUserRequest { } VhostUserRequest; typedef enum VhostUserSlaveRequest { -VHOST_USER_SLAVE_NONE = 0, -VHOST_USER_SLAVE_IOTLB_MSG = 1, -VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2, -VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3, -VHOST_USER_SLAVE_MAX +VHOST_USER_BACKEND_NONE = 0, +VHOST_USER_BACKEND_IOTLB_MSG = 1, +VHOST_USER_BACKEND_CONFIG_CHANGE_MSG = 2, +VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG = 3, +VHOST_USER_BACKEND_MAX } VhostUserSlaveRequest; typedef struct VhostUserMemoryRegion { @@ -1722,13 +1722,13 @@ static gboolean slave_read(QIOChannel *ioc, GIOCondition condition, } switch (hdr.request) { -case VHOST_USER_SLAVE_IOTLB_MSG: +case VHOST_USER_BACKEND_IOTLB_MSG: ret = vhost_backend_handle_iotlb_msg(dev, &payload.iotlb); break; -case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG : +case VHOST_USER_BACKEND_CONFIG_CHANGE_MSG: ret = vhost_user_slave_handle_config_change(dev); break; -case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG: +case VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG: ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area, fd ? fd[0] : -1); break; @@ -1780,7 +1780,7 @@ fdcleanup: static int vhost_setup_slave_channel(struct vhost_dev *dev) { VhostUserMsg msg = { -.hdr.request = VHOST_USER_SET_SLAVE_REQ_FD, +.hdr.request = VHOST_USER_SET_BACKEND_REQ_FD, .hdr.flags = VHOST_USER_VERSION, }; struct vhost_user *u = dev->opaque; @@ -1791,7 +1791,7 @@ static int vhost_setup_slave_channel(struct vhost_dev *dev) QIOChannel *ioc; if (!virtio_has_feature(dev->protocol_features, -VHOST_USER_PROTOCOL_F_SLAVE_REQ)) { +VHOST_USER_PROTOCOL_F_BACKEND_REQ)) { return 0; } @@ -2147,7 +2147,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque, if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) && !(virtio_has_feature(dev->protocol_features, -VHOST_USER_PROTOCOL_F_SLAVE_REQ) && +VHOST_USER_PROTOCOL_F_BACKEND_REQ) && virtio_has_feature(dev->protocol_features, VHOST_USER_PROTOCOL_F_REPLY_ACK))) { error_setg(errp, "IOMMU support requires reply-ack and " diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c index e4d4bece2d..b70148aba9 100644 --- a/hw/virtio/virtio-qmp.c +++ b/hw/virtio/virtio-qmp.c @@ -42,12 +42,12 @@ enum VhostUserProtocolFeature { VHOST_USER_PROTOCOL_F_RARP = 2, VHOST_USER_PROTOCOL_F_REPLY_ACK = 3, VHOST_USER_PROTOCOL_F_NET_MTU = 4, -VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5, +VHOST_USER_PROTOCOL_F_BACKEND_REQ = 5, VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6, VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7, VHOST_USER_PROTOCOL_F_PAGEFAULT = 8, VHOST_USER_PROTOCOL_F_CONFIG = 9, -VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10, +VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD = 10, VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
[PATCH 6/6] gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job
We can get rid of the build-coroutine-sigaltstack job by moving the configure flags that should be tested here to other jobs: Move --with-coroutine=sigaltstack to the build-without-defaults job and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. Signed-off-by: Thomas Huth --- .gitlab-ci.d/buildtest.yml | 14 ++ .gitlab-ci.d/crossbuilds.yml | 2 +- 2 files changed, 3 insertions(+), 13 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 91c7467a66..1438797a1c 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -533,19 +533,8 @@ build-tci: - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow - make check-tcg -# Alternate coroutines implementations are only really of interest to KVM users -# However we can't test against KVM on Gitlab-CI so we can only run unit tests -build-coroutine-sigaltstack: - extends: .native_build_job_template - needs: -job: amd64-ubuntu2004-container - variables: -IMAGE: ubuntu2004 -CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg ---enable-trace-backends=ftrace -MAKE_CHECK_ARGS: check-unit - # Check our reduced build configurations +# (and an alternative coroutine implementation) build-without-defaults: extends: .native_build_job_template needs: @@ -559,6 +548,7 @@ build-without-defaults: --disable-pie --disable-qom-cast-debug --disable-strip + --with-coroutine=sigaltstack TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml index 8dbbb8f881..027d2088da 100644 --- a/.gitlab-ci.d/crossbuilds.yml +++ b/.gitlab-ci.d/crossbuilds.yml @@ -159,7 +159,7 @@ cross-s390x-kvm-only: job: s390x-debian-cross-container variables: IMAGE: debian-s390x-cross -EXTRA_CONFIGURE_OPTS: --disable-tcg +EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace cross-mips64el-kvm-only: extends: .cross_accel_build_job -- 2.31.1
Re: [PATCH v2 03/12] block/vmdk: Change extent info type
Am 28.01.2023 um 00:06 hat Eric Blake geschrieben: > On Thu, Jan 19, 2023 at 04:20:16PM +0100, Kevin Wolf wrote: > > Am 20.06.2022 um 18:26 hat Hanna Reitz geschrieben: > > > VMDK's implementation of .bdrv_get_specific_info() returns information > > > about its extent files, ostensibly in the form of ImageInfo objects. > > > However, it does not get this information through > > > bdrv_query_image_info(), but fills only a select few fields with custom > > > information that does not always match the fields' purposes. > > > > > > For example, @format, which is supposed to be a block driver name, is > > > filled with the extent type, e.g. SPARSE or FLAT. > > > > > > In ImageInfo, @compressed shows whether the data that can be seen in the > > > image is stored in compressed form or not. For example, a compressed > > > qcow2 image will store compressed data in its data file, but when > > > accessing the qcow2 node, you will see normal data. This is not how > > > VMDK uses the @compressed field for its extent files: Instead, it > > > signifies whether accessing the extent file will yield compressed data > > > (which the VMDK driver then (de-)compresses). > > > > Actually, VMDK was the only user of the field in ImageInfo. qcow2 > > doesn't set the field at all because it would have to parse all L2 > > tables in order to set it. > > > > So I suppose @compressed can now be removed from ImageInfo? > > I think you are okay for VMDK (the new struct has the same field > names, but better meanings, and the on-the-wire representation for > someone querying a known-VMDK image doesn't change). For non-VMDK, > any code that was querying @compressed will break, but arguably no one > was doing that since it would have always been false. If we want to > be super-conservative, we deprecate the field now and only remove it > from ImageInfo in a later release, but I'd rather trust Markus on this > decision. It is an optional field and VMDK is the only driver that ever provided it, and only in the context of extent information. In this case, the information on the wire stays the same after this patch. So I don't think there is any visible difference for a client, apart from schema introspection? > On a side note - would it be worth adding a bit to the qcow2 header > (one of the compatible_feature bits seems best) which we set when > writing a compressed cluster, so that on newer images, it is an O(1) > probe of whether the image contains (or at least has contained in the > past) a compressed cluster? Or is that going to add needless overhead > for something we really don't need to know all that often? I think the only use for it would be displaying it in 'qemu-img info'. Would you combine two bits then? One for "compressed bit is valid" and one for "contains compressed clusters"? Because with one bit you can only distinguish "compressed" from "don't know", but there wouldn't be a way to say for certain that an image is uncompressed. Kevin
Re: [PATCH 6/6] gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job
On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: > We can get rid of the build-coroutine-sigaltstack job by moving > the configure flags that should be tested here to other jobs: > Move --with-coroutine=sigaltstack to the build-without-defaults job > and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. The biggest user of coroutines is the block layer. So we probably ought to have coroutines aligned with a job that triggers the 'make check-block' for iotests. IIUC, the without-defaults job won't do that. How about, arbitrarily, using either the 'check-system-debian' or 'check-system-ubuntu' job. Those distros are closely related, so getting sigaltstack vs ucontext coverage between them is a good win, and they both trigger the block jobs IIUC. Incidentally sigaltstack is also covered by our Cirrus CI job for macOS. > Signed-off-by: Thomas Huth > --- > .gitlab-ci.d/buildtest.yml | 14 ++ > .gitlab-ci.d/crossbuilds.yml | 2 +- > 2 files changed, 3 insertions(+), 13 deletions(-) > > diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml > index 91c7467a66..1438797a1c 100644 > --- a/.gitlab-ci.d/buildtest.yml > +++ b/.gitlab-ci.d/buildtest.yml > @@ -533,19 +533,8 @@ build-tci: > - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow > - make check-tcg > > -# Alternate coroutines implementations are only really of interest to KVM > users > -# However we can't test against KVM on Gitlab-CI so we can only run unit > tests > -build-coroutine-sigaltstack: > - extends: .native_build_job_template > - needs: > -job: amd64-ubuntu2004-container > - variables: > -IMAGE: ubuntu2004 > -CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg > ---enable-trace-backends=ftrace > -MAKE_CHECK_ARGS: check-unit > - > # Check our reduced build configurations > +# (and an alternative coroutine implementation) > build-without-defaults: >extends: .native_build_job_template >needs: > @@ -559,6 +548,7 @@ build-without-defaults: >--disable-pie >--disable-qom-cast-debug >--disable-strip > + --with-coroutine=sigaltstack > TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu >sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user > MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 > diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml > index 8dbbb8f881..027d2088da 100644 > --- a/.gitlab-ci.d/crossbuilds.yml > +++ b/.gitlab-ci.d/crossbuilds.yml > @@ -159,7 +159,7 @@ cross-s390x-kvm-only: > job: s390x-debian-cross-container >variables: > IMAGE: debian-s390x-cross > -EXTRA_CONFIGURE_OPTS: --disable-tcg > +EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace > > cross-mips64el-kvm-only: >extends: .cross_accel_build_job > -- > 2.31.1 > > With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 0/3] Vhost-user: replace _SLAVE_ with _BACKEND_
On Mon, Jan 30, 2023 at 11:45:45AM +0100, Maxime Coquelin wrote: > This series continues the work done to get rid of harmful > language in the Vhost-user specification. I prefer a positive "switch to a more inclusive terminology". To consider if you keep doing this work. > While the spec texts were changed to replace slave with > backend, the protocol features and messages names hadn't > been changed. > > This series renames remaining occurences in the spec and > make use of the new names in both libvhost-user and the > Vhost-user frontend code. > > Maxime Coquelin (3): > docs: vhost-user: replace _SLAVE_ with _BACKEND_ > libvhost-user: Adopt new backend naming > vhost-user: Adopt new backend naming > > docs/interop/vhost-user.rst | 40 +++ > hw/virtio/vhost-user.c| 30 - > hw/virtio/virtio-qmp.c| 12 +++ > subprojects/libvhost-user/libvhost-user.c | 20 ++-- > subprojects/libvhost-user/libvhost-user.h | 20 ++-- > 5 files changed, 61 insertions(+), 61 deletions(-) > > -- > 2.39.1
Re: [PATCH 1/3] docs: vhost-user: replace _SLAVE_ with _BACKEND_
On Mon, Jan 30, 2023 at 11:45:46AM +0100, Maxime Coquelin wrote: > Backend's message and protocol features names were still > using "_SLAVE_" naming. For consistency with the new naming > convention and to get rid of the remaining harmful > language, replace it with _BACKEND_. > > Signed-off-by: Maxime Coquelin let's drop "to get rid of the remaining harmful language" as don't get rid of it. consistency is sufficient motivation. > --- > docs/interop/vhost-user.rst | 40 ++--- > 1 file changed, 20 insertions(+), 20 deletions(-) > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst > index 3f18ab424e..8a5924ea75 100644 > --- a/docs/interop/vhost-user.rst > +++ b/docs/interop/vhost-user.rst > @@ -315,7 +315,7 @@ in the ancillary data: > * ``VHOST_USER_SET_VRING_KICK`` > * ``VHOST_USER_SET_VRING_CALL`` > * ``VHOST_USER_SET_VRING_ERR`` > -* ``VHOST_USER_SET_SLAVE_REQ_FD`` > +* ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name > ``VHOST_USER_SET_SLAVE_REQ_FD``) > * ``VHOST_USER_SET_INFLIGHT_FD`` (if > ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``) > > If *front-end* is unable to send the full message or receives a wrong > @@ -516,7 +516,7 @@ expected to reply with a zero payload, non-zero otherwise. > > The back-end relies on the back-end communication channel (see :ref:`Back-end > communication ` section below) to send IOTLB miss > -and access failure events, by sending ``VHOST_USER_SLAVE_IOTLB_MSG`` > +and access failure events, by sending ``VHOST_USER_BACKEND_IOTLB_MSG`` > requests to the front-end with a ``struct vhost_iotlb_msg`` as > payload. For miss events, the iotlb payload has to be filled with the > miss message type (1), the I/O virtual address and the permissions > @@ -540,15 +540,15 @@ Back-end communication > -- > > An optional communication channel is provided if the back-end declares > -``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` protocol feature, to allow the > +``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` protocol feature, to allow the > back-end to make requests to the front-end. > > -The fd is provided via ``VHOST_USER_SET_SLAVE_REQ_FD`` ancillary data. > +The fd is provided via ``VHOST_USER_SET_BACKEND_REQ_FD`` ancillary data. > > -A back-end may then send ``VHOST_USER_SLAVE_*`` messages to the front-end > +A back-end may then send ``VHOST_USER_BACKEND_*`` messages to the front-end > using this fd communication channel. > > -If ``VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD`` protocol feature is > +If ``VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD`` protocol feature is > negotiated, back-end can send file descriptors (at most 8 descriptors in > each message) to front-end via ancillary data using this fd communication > channel. > @@ -835,7 +835,7 @@ Note that due to the fact that too many messages on the > sockets can > cause the sending application(s) to block, it is not advised to use > this feature unless absolutely necessary. It is also considered an > error to negotiate this feature without also negotiating > -``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` and ``VHOST_USER_PROTOCOL_F_REPLY_ACK``, > +``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` and > ``VHOST_USER_PROTOCOL_F_REPLY_ACK``, > the former is necessary for getting a message channel from the back-end > to the front-end, while the latter needs to be used with the in-band > notification messages to block until they are processed, both to avoid > @@ -855,12 +855,12 @@ Protocol features >#define VHOST_USER_PROTOCOL_F_RARP 2 >#define VHOST_USER_PROTOCOL_F_REPLY_ACK 3 >#define VHOST_USER_PROTOCOL_F_MTU 4 > - #define VHOST_USER_PROTOCOL_F_SLAVE_REQ 5 > + #define VHOST_USER_PROTOCOL_F_BACKEND_REQ 5 >#define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN 6 >#define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION7 >#define VHOST_USER_PROTOCOL_F_PAGEFAULT 8 >#define VHOST_USER_PROTOCOL_F_CONFIG9 > - #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD10 > + #define VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD 10 >#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER11 >#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12 >#define VHOST_USER_PROTOCOL_F_RESET_DEVICE 13 > @@ -1059,8 +1059,8 @@ Front-end message types >in the ancillary data. This signals that polling will be used >instead of waiting for the call. Note that if the protocol features >``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` and > - ``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` have been negotiated this message > - isn't necessary as the ``VHOST_USER_SLAVE_VRING_CALL`` message can be > + ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` have been negotiated this message > + isn't necessary as the ``VHOST_USER_BACKEND_VRING_CALL`` message can be >used, it may however still be used to set an event file descriptor >or to enable polling. > > @@ -1077,8 +1077,8 @@ Front-end
Re: [PATCH 2/3] libvhost-user: Adopt new backend naming
On Mon, Jan 30, 2023 at 11:45:47AM +0100, Maxime Coquelin wrote: > In order to get rid of harmful language, the Vhost-user > specification changed features and requests naming from > _SLAVE_ to _BACKEND_. let's drop "to get rid of the remaining harmful language" as don't get rid of it. consistency is sufficient motivation. > This patch adopts the new naming convention. > > Signed-off-by: Maxime Coquelin > --- > subprojects/libvhost-user/libvhost-user.c | 20 ++-- > subprojects/libvhost-user/libvhost-user.h | 20 ++-- > 2 files changed, 20 insertions(+), 20 deletions(-) > > diff --git a/subprojects/libvhost-user/libvhost-user.c > b/subprojects/libvhost-user/libvhost-user.c > index fc69783d2b..f661af7c85 100644 > --- a/subprojects/libvhost-user/libvhost-user.c > +++ b/subprojects/libvhost-user/libvhost-user.c > @@ -140,7 +140,7 @@ vu_request_to_string(unsigned int req) > REQ(VHOST_USER_SET_VRING_ENABLE), > REQ(VHOST_USER_SEND_RARP), > REQ(VHOST_USER_NET_SET_MTU), > -REQ(VHOST_USER_SET_SLAVE_REQ_FD), > +REQ(VHOST_USER_SET_BACKEND_REQ_FD), > REQ(VHOST_USER_IOTLB_MSG), > REQ(VHOST_USER_SET_VRING_ENDIAN), > REQ(VHOST_USER_GET_CONFIG), > @@ -1365,7 +1365,7 @@ bool vu_set_queue_host_notifier(VuDev *dev, VuVirtq > *vq, int fd, > int qidx = vq - dev->vq; > int fd_num = 0; > VhostUserMsg vmsg = { > -.request = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG, > +.request = VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG, > .flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY_MASK, > .size = sizeof(vmsg.payload.area), > .payload.area = { > @@ -1383,7 +1383,7 @@ bool vu_set_queue_host_notifier(VuDev *dev, VuVirtq > *vq, int fd, > > vmsg.fd_num = fd_num; > > -if (!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD)) { > +if (!vu_has_protocol_feature(dev, > VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD)) { > return false; > } > > @@ -1461,9 +1461,9 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg > *vmsg) > */ > uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_MQ | > 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | > -1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | > +1ULL << VHOST_USER_PROTOCOL_F_BACKEND_REQ | > 1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | > -1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | > +1ULL << VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD | > 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | > 1ULL << VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS; > > @@ -1494,7 +1494,7 @@ vu_set_protocol_features_exec(VuDev *dev, VhostUserMsg > *vmsg) > > if (vu_has_protocol_feature(dev, > VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) > && > -(!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_REQ) || > +(!vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ) || > !vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK))) { > /* > * The use case for using messages for kick/call is simulation, to > make > @@ -1507,7 +1507,7 @@ vu_set_protocol_features_exec(VuDev *dev, VhostUserMsg > *vmsg) > * that actually enables the simulation case. > */ > vu_panic(dev, > - "F_IN_BAND_NOTIFICATIONS requires F_SLAVE_REQ && > F_REPLY_ACK"); > + "F_IN_BAND_NOTIFICATIONS requires F_BACKEND_REQ && > F_REPLY_ACK"); > return false; > } > > @@ -1910,7 +1910,7 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg) > return vu_get_queue_num_exec(dev, vmsg); > case VHOST_USER_SET_VRING_ENABLE: > return vu_set_vring_enable_exec(dev, vmsg); > -case VHOST_USER_SET_SLAVE_REQ_FD: > +case VHOST_USER_SET_BACKEND_REQ_FD: > return vu_set_slave_req_fd(dev, vmsg); > case VHOST_USER_GET_CONFIG: > return vu_get_config(dev, vmsg); > @@ -2416,9 +2416,9 @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, > bool sync) > if (vq->call_fd < 0 && > vu_has_protocol_feature(dev, > VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) > && > -vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_SLAVE_REQ)) { > +vu_has_protocol_feature(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ)) { > VhostUserMsg vmsg = { > -.request = VHOST_USER_SLAVE_VRING_CALL, > +.request = VHOST_USER_BACKEND_VRING_CALL, > .flags = VHOST_USER_VERSION, > .size = sizeof(vmsg.payload.state), > .payload.state = { > diff --git a/subprojects/libvhost-user/libvhost-user.h > b/subprojects/libvhost-user/libvhost-user.h > index 8cda9b8f57..8c5a2719e3 100644 > --
Re: [PATCH 3/3] vhost-user: Adopt new backend naming
On Mon, Jan 30, 2023 at 11:45:48AM +0100, Maxime Coquelin wrote: > In order to get rid of harmful language, the Vhost-user > specification changed features and requests naming from > _SLAVE_ to _BACKEND_. "features and requests naming" -> "feature and request naming" (a reduced relative drops plurals generally), or "naming of features and requests" let's drop "to get rid of the remaining harmful language" as don't get rid of it. consistency is sufficient motivation. > This patch adopts the new naming convention. > > Signed-off-by: Maxime Coquelin > --- > hw/virtio/vhost-user.c | 30 +++--- > hw/virtio/virtio-qmp.c | 12 ++-- > 2 files changed, 21 insertions(+), 21 deletions(-) > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > index d9ce0501b2..9b623ecf4a 100644 > --- a/hw/virtio/vhost-user.c > +++ b/hw/virtio/vhost-user.c > @@ -40,7 +40,7 @@ > > #define VHOST_MEMORY_BASELINE_NREGIONS8 > #define VHOST_USER_F_PROTOCOL_FEATURES 30 > -#define VHOST_USER_SLAVE_MAX_FDS 8 > +#define VHOST_USER_BACKEND_MAX_FDS 8 > > /* > * Set maximum number of RAM slots supported to > @@ -71,12 +71,12 @@ enum VhostUserProtocolFeature { > VHOST_USER_PROTOCOL_F_RARP = 2, > VHOST_USER_PROTOCOL_F_REPLY_ACK = 3, > VHOST_USER_PROTOCOL_F_NET_MTU = 4, > -VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5, > +VHOST_USER_PROTOCOL_F_BACKEND_REQ = 5, > VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6, > VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7, > VHOST_USER_PROTOCOL_F_PAGEFAULT = 8, > VHOST_USER_PROTOCOL_F_CONFIG = 9, > -VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10, > +VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD = 10, > VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11, > VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12, > VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13, > @@ -110,7 +110,7 @@ typedef enum VhostUserRequest { > VHOST_USER_SET_VRING_ENABLE = 18, > VHOST_USER_SEND_RARP = 19, > VHOST_USER_NET_SET_MTU = 20, > -VHOST_USER_SET_SLAVE_REQ_FD = 21, > +VHOST_USER_SET_BACKEND_REQ_FD = 21, > VHOST_USER_IOTLB_MSG = 22, > VHOST_USER_SET_VRING_ENDIAN = 23, > VHOST_USER_GET_CONFIG = 24, > @@ -134,11 +134,11 @@ typedef enum VhostUserRequest { > } VhostUserRequest; > > typedef enum VhostUserSlaveRequest { > -VHOST_USER_SLAVE_NONE = 0, > -VHOST_USER_SLAVE_IOTLB_MSG = 1, > -VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2, > -VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3, > -VHOST_USER_SLAVE_MAX > +VHOST_USER_BACKEND_NONE = 0, > +VHOST_USER_BACKEND_IOTLB_MSG = 1, > +VHOST_USER_BACKEND_CONFIG_CHANGE_MSG = 2, > +VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG = 3, > +VHOST_USER_BACKEND_MAX > } VhostUserSlaveRequest; > > typedef struct VhostUserMemoryRegion { > @@ -1722,13 +1722,13 @@ static gboolean slave_read(QIOChannel *ioc, > GIOCondition condition, > } > > switch (hdr.request) { > -case VHOST_USER_SLAVE_IOTLB_MSG: > +case VHOST_USER_BACKEND_IOTLB_MSG: > ret = vhost_backend_handle_iotlb_msg(dev, &payload.iotlb); > break; > -case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG : > +case VHOST_USER_BACKEND_CONFIG_CHANGE_MSG: > ret = vhost_user_slave_handle_config_change(dev); > break; > -case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG: > +case VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG: > ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area, >fd ? fd[0] : -1); > break; > @@ -1780,7 +1780,7 @@ fdcleanup: > static int vhost_setup_slave_channel(struct vhost_dev *dev) > { > VhostUserMsg msg = { > -.hdr.request = VHOST_USER_SET_SLAVE_REQ_FD, > +.hdr.request = VHOST_USER_SET_BACKEND_REQ_FD, > .hdr.flags = VHOST_USER_VERSION, > }; > struct vhost_user *u = dev->opaque; > @@ -1791,7 +1791,7 @@ static int vhost_setup_slave_channel(struct vhost_dev > *dev) > QIOChannel *ioc; > > if (!virtio_has_feature(dev->protocol_features, > -VHOST_USER_PROTOCOL_F_SLAVE_REQ)) { > +VHOST_USER_PROTOCOL_F_BACKEND_REQ)) { > return 0; > } > > @@ -2147,7 +2147,7 @@ static int vhost_user_backend_init(struct vhost_dev > *dev, void *opaque, > > if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) && > !(virtio_has_feature(dev->protocol_features, > -VHOST_USER_PROTOCOL_F_SLAVE_REQ) && > +VHOST_USER_PROTOCOL_F_BACKEND_REQ) && > virtio_has_feature(dev->protocol_features, > VHOST_USER_PROTOCOL_F_REPLY_ACK))) { > error_setg(errp, "IOMMU support requires reply-ack and " > diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c > index e4d4bece2d..b70148aba9 100644 > --- a/hw/virtio/virtio-qmp.c > +++ b/hw/virtio/virtio-qmp.c > @@
[PATCH 01/11] ui/dbus: unregister clipboard on connection close
From: Marc-André Lureau Fixes unregistration with p2p connections, since they don't have an associated name owner. Signed-off-by: Marc-André Lureau --- ui/dbus-clipboard.c | 18 +++--- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/ui/dbus-clipboard.c b/ui/dbus-clipboard.c index 5843d26cd2..d78f9db18f 100644 --- a/ui/dbus-clipboard.c +++ b/ui/dbus-clipboard.c @@ -203,15 +203,6 @@ dbus_clipboard_unregister_proxy(DBusDisplay *dpy) g_clear_object(&dpy->clipboard_proxy); } -static void -dbus_on_clipboard_proxy_name_owner_changed( -DBusDisplay *dpy, -GObject *object, -GParamSpec *pspec) -{ -dbus_clipboard_unregister_proxy(dpy); -} - static gboolean dbus_clipboard_register( DBusDisplay *dpy, @@ -219,6 +210,7 @@ dbus_clipboard_register( { g_autoptr(GError) err = NULL; const char *name = NULL; +GDBusConnection *connection = g_dbus_method_invocation_get_connection(invocation); if (dpy->clipboard_proxy) { g_dbus_method_invocation_return_error( @@ -231,7 +223,7 @@ dbus_clipboard_register( dpy->clipboard_proxy = qemu_dbus_display1_clipboard_proxy_new_sync( -g_dbus_method_invocation_get_connection(invocation), +connection, G_DBUS_PROXY_FLAGS_DO_NOT_AUTO_START, g_dbus_method_invocation_get_sender(invocation), "/org/qemu/Display1/Clipboard", @@ -251,7 +243,11 @@ dbus_clipboard_register( g_object_connect(dpy->clipboard_proxy, "swapped-signal::notify::g-name-owner", - dbus_on_clipboard_proxy_name_owner_changed, dpy, + dbus_clipboard_unregister_proxy, dpy, + NULL); +g_object_connect(connection, + "swapped-signal::closed", + dbus_clipboard_unregister_proxy, dpy, NULL); qemu_clipboard_reset_serial(); -- 2.39.1
[PATCH 02/11] audio/dbus: there are no sender for p2p mode
From: Marc-André Lureau Signed-off-by: Marc-André Lureau --- audio/audio_int.h | 2 +- audio/dbusaudio.c | 6 -- ui/dbus.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/audio/audio_int.h b/audio/audio_int.h index e87ce014a0..38864bfbbd 100644 --- a/audio/audio_int.h +++ b/audio/audio_int.h @@ -145,7 +145,7 @@ struct audio_driver { void *(*init) (Audiodev *); void (*fini) (void *); #ifdef CONFIG_GIO -void (*set_dbus_server) (AudioState *s, GDBusObjectManagerServer *manager); +void (*set_dbus_server) (AudioState *s, GDBusObjectManagerServer *manager, bool p2p); #endif struct audio_pcm_ops *pcm_ops; int can_be_default; diff --git a/audio/dbusaudio.c b/audio/dbusaudio.c index 722df0355e..9032dda49c 100644 --- a/audio/dbusaudio.c +++ b/audio/dbusaudio.c @@ -43,6 +43,7 @@ typedef struct DBusAudio { GDBusObjectManagerServer *server; +bool p2p; GDBusObjectSkeleton *audio; QemuDBusDisplay1Audio *iface; GHashTable *out_listeners; @@ -448,7 +449,7 @@ dbus_audio_register_listener(AudioState *s, bool out) { DBusAudio *da = s->drv_opaque; -const char *sender = g_dbus_method_invocation_get_sender(invocation); +const char *sender = da->p2p ? "p2p" : g_dbus_method_invocation_get_sender(invocation); g_autoptr(GDBusConnection) listener_conn = NULL; g_autoptr(GError) err = NULL; g_autoptr(GSocket) socket = NULL; @@ -591,7 +592,7 @@ dbus_audio_register_in_listener(AudioState *s, } static void -dbus_audio_set_server(AudioState *s, GDBusObjectManagerServer *server) +dbus_audio_set_server(AudioState *s, GDBusObjectManagerServer *server, bool p2p) { DBusAudio *da = s->drv_opaque; @@ -599,6 +600,7 @@ dbus_audio_set_server(AudioState *s, GDBusObjectManagerServer *server) g_assert(!da->server); da->server = g_object_ref(server); +da->p2p = p2p; da->audio = g_dbus_object_skeleton_new(DBUS_DISPLAY1_AUDIO_PATH); da->iface = qemu_dbus_display1_audio_skeleton_new(); diff --git a/ui/dbus.c b/ui/dbus.c index 32d88dc94a..0a9e2f4611 100644 --- a/ui/dbus.c +++ b/ui/dbus.c @@ -219,7 +219,7 @@ dbus_display_complete(UserCreatable *uc, Error **errp) dd->audiodev); return; } -audio_state->drv->set_dbus_server(audio_state, dd->server); +audio_state->drv->set_dbus_server(audio_state, dd->server, dd->p2p); } consoles = g_array_new(FALSE, FALSE, sizeof(guint32)); -- 2.39.1
[PATCH 05/11] meson: ensure dbus-display generated code is built before other units
From: Marc-André Lureau It's simply by luck that dbus-display header is built first before the other units using it. With sourceset, I can't find an easier way out than declaring an extra dependency for dbus-display1 generate code. Signed-off-by: Marc-André Lureau --- ui/meson.build | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ui/meson.build b/ui/meson.build index 612ea2325b..0b2d0d21d1 100644 --- a/ui/meson.build +++ b/ui/meson.build @@ -83,7 +83,9 @@ if dbus_display '--interface-prefix', 'org.qemu.', '--c-namespace', 'QemuDBus', '--generate-c-code', '@BASENAME@']) - dbus_ss.add(when: [gio, pixman, opengl, gbm], + dbus_display1_lib = static_library('dbus-display1', dbus_display1, dependencies: gio) + dbus_display1_dep = declare_dependency(link_with: dbus_display1_lib, include_directories: include_directories('.')) + dbus_ss.add(when: [gio, pixman, opengl, gbm, dbus_display1_dep], if_true: [files( 'dbus-chardev.c', 'dbus-clipboard.c', @@ -91,7 +93,7 @@ if dbus_display 'dbus-error.c', 'dbus-listener.c', 'dbus.c', - ), dbus_display1]) + )]) ui_modules += {'dbus' : dbus_ss} endif -- 2.39.1
[PATCH 00/11] ui: dbus & misc fixes
From: Marc-André Lureau Hi, Here is a collection of fixes for UI related-stuff. Mostly it's improving/fixing -display dbus. Thanks for the reviews! Marc-André Lureau (11): ui/dbus: unregister clipboard on connection close audio/dbus: there are no sender for p2p mode ui/dbus: set mouse is-absolute during console creation ui/dbus: update the display when switching surface meson: ensure dbus-display generated code is built before other units ui: rename cursor_{put->unref} ui: rename cursor_{get->ref}, return it ui: keep current cursor with QemuConsole ui: set cursor upon listener registration ui: set cursor position upon listener registration ui/sdl: get the GL context from the window audio/audio_int.h | 2 +- include/ui/console.h| 5 +++-- ui/vnc.h| 1 - audio/dbusaudio.c | 6 -- hw/display/qxl-render.c | 4 ++-- hw/display/qxl.c| 2 +- hw/display/vmware_vga.c | 4 ++-- ui/console.c| 18 ++ ui/cursor.c | 5 +++-- ui/dbus-clipboard.c | 18 +++--- ui/dbus-console.c | 13 ++--- ui/dbus-listener.c | 7 --- ui/dbus.c | 2 +- ui/sdl2.c | 2 +- ui/spice-display.c | 8 ui/vnc.c| 8 ++-- ui/meson.build | 6 -- 17 files changed, 67 insertions(+), 44 deletions(-) -- 2.39.1
[PATCH 03/11] ui/dbus: set mouse is-absolute during console creation
From: Marc-André Lureau Signed-off-by: Marc-André Lureau --- ui/dbus-console.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/ui/dbus-console.c b/ui/dbus-console.c index 898a4ac8a5..241797b7e9 100644 --- a/ui/dbus-console.c +++ b/ui/dbus-console.c @@ -410,15 +410,21 @@ dbus_mouse_release(DBusDisplayConsole *ddc, return DBUS_METHOD_INVOCATION_HANDLED; } +static void +dbus_mouse_update_is_absolute(DBusDisplayConsole *ddc) +{ +g_object_set(ddc->iface_mouse, + "is-absolute", qemu_input_is_absolute(), + NULL); +} + static void dbus_mouse_mode_change(Notifier *notify, void *data) { DBusDisplayConsole *ddc = container_of(notify, DBusDisplayConsole, mouse_mode_notifier); -g_object_set(ddc->iface_mouse, - "is-absolute", qemu_input_is_absolute(), - NULL); +dbus_mouse_update_is_absolute(ddc); } int dbus_display_console_get_index(DBusDisplayConsole *ddc) @@ -491,6 +497,7 @@ dbus_display_console_new(DBusDisplay *display, QemuConsole *con) register_displaychangelistener(&ddc->dcl); ddc->mouse_mode_notifier.notify = dbus_mouse_mode_change; qemu_add_mouse_mode_change_notifier(&ddc->mouse_mode_notifier); +dbus_mouse_update_is_absolute(ddc); return ddc; } -- 2.39.1
[PATCH 04/11] ui/dbus: update the display when switching surface
From: Marc-André Lureau This seems to be the expected behaviour. Signed-off-by: Marc-André Lureau --- ui/dbus-listener.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c index f9fc8eda51..620d9450cc 100644 --- a/ui/dbus-listener.c +++ b/ui/dbus-listener.c @@ -321,6 +321,8 @@ static void dbus_gfx_switch(DisplayChangeListener *dcl, /* why not call disable instead? */ return; } + +dbus_gfx_update(dcl, 0, 0, surface_width(ddl->ds), surface_height(ddl->ds)); } static void dbus_mouse_set(DisplayChangeListener *dcl, -- 2.39.1
[PATCH 06/11] ui: rename cursor_{put->unref}
From: Marc-André Lureau The naming is more conventional in QEMU. Signed-off-by: Marc-André Lureau --- include/ui/console.h| 2 +- hw/display/qxl-render.c | 4 ++-- hw/display/qxl.c| 2 +- hw/display/vmware_vga.c | 4 ++-- ui/cursor.c | 2 +- ui/dbus-listener.c | 2 +- ui/spice-display.c | 4 ++-- ui/vnc.c| 2 +- 8 files changed, 11 insertions(+), 11 deletions(-) diff --git a/include/ui/console.h b/include/ui/console.h index 8e6cf782a1..ec28402a4f 100644 --- a/include/ui/console.h +++ b/include/ui/console.h @@ -153,7 +153,7 @@ typedef struct QEMUCursor { QEMUCursor *cursor_alloc(int width, int height); void cursor_get(QEMUCursor *c); -void cursor_put(QEMUCursor *c); +void cursor_unref(QEMUCursor *c); QEMUCursor *cursor_builtin_hidden(void); QEMUCursor *cursor_builtin_left_ptr(void); void cursor_print_ascii_art(QEMUCursor *c, const char *prefix); diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c index fcfd40c3ac..ec99ec887a 100644 --- a/hw/display/qxl-render.c +++ b/hw/display/qxl-render.c @@ -290,7 +290,7 @@ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor, return c; fail: -cursor_put(c); +cursor_unref(c); return NULL; } @@ -336,7 +336,7 @@ int qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt *ext) } qemu_mutex_lock(&qxl->ssd.lock); if (qxl->ssd.cursor) { -cursor_put(qxl->ssd.cursor); +cursor_unref(qxl->ssd.cursor); } qxl->ssd.cursor = c; qxl->ssd.mouse_x = cmd->u.set.position.x; diff --git a/hw/display/qxl.c b/hw/display/qxl.c index ec712d3ca2..80ce1e9a93 100644 --- a/hw/display/qxl.c +++ b/hw/display/qxl.c @@ -299,7 +299,7 @@ void qxl_spice_reset_cursor(PCIQXLDevice *qxl) qxl->guest_cursor = 0; qemu_mutex_unlock(&qxl->track_lock); if (qxl->ssd.cursor) { -cursor_put(qxl->ssd.cursor); +cursor_unref(qxl->ssd.cursor); } qxl->ssd.cursor = cursor_builtin_hidden(); } diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c index 59ae7f74b8..09591fbd39 100644 --- a/hw/display/vmware_vga.c +++ b/hw/display/vmware_vga.c @@ -550,12 +550,12 @@ static inline void vmsvga_cursor_define(struct vmsvga_state_s *s, default: fprintf(stderr, "%s: unhandled bpp %d, using fallback cursor\n", __func__, c->bpp); -cursor_put(qc); +cursor_unref(qc); qc = cursor_builtin_left_ptr(); } dpy_cursor_define(s->vga.con, qc); -cursor_put(qc); +cursor_unref(qc); } #endif diff --git a/ui/cursor.c b/ui/cursor.c index 835f0802f9..31b09bf058 100644 --- a/ui/cursor.c +++ b/ui/cursor.c @@ -111,7 +111,7 @@ void cursor_get(QEMUCursor *c) c->refcount++; } -void cursor_put(QEMUCursor *c) +void cursor_unref(QEMUCursor *c) { if (c == NULL) return; diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c index 620d9450cc..cc1054caff 100644 --- a/ui/dbus-listener.c +++ b/ui/dbus-listener.c @@ -346,7 +346,7 @@ static void dbus_cursor_define(DisplayChangeListener *dcl, c->data, c->width * c->height * 4, TRUE, -(GDestroyNotify)cursor_put, +(GDestroyNotify)cursor_unref, c); qemu_dbus_display1_listener_call_cursor_define( diff --git a/ui/spice-display.c b/ui/spice-display.c index 0616a6982f..161868735b 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -463,7 +463,7 @@ void qemu_spice_cursor_refresh_bh(void *opaque) qemu_mutex_unlock(&ssd->lock); dpy_cursor_define(ssd->dcl.con, c); qemu_mutex_lock(&ssd->lock); -cursor_put(c); +cursor_unref(c); } if (ssd->mouse_x != -1 && ssd->mouse_y != -1) { @@ -765,7 +765,7 @@ static void display_mouse_define(DisplayChangeListener *dcl, qemu_mutex_lock(&ssd->lock); cursor_get(c); -cursor_put(ssd->cursor); +cursor_unref(ssd->cursor); ssd->cursor = c; ssd->hot_x = c->hot_x; ssd->hot_y = c->hot_y; diff --git a/ui/vnc.c b/ui/vnc.c index d9eacad759..0bdcc3dfce 100644 --- a/ui/vnc.c +++ b/ui/vnc.c @@ -1029,7 +1029,7 @@ static void vnc_dpy_cursor_define(DisplayChangeListener *dcl, VncDisplay *vd = container_of(dcl, VncDisplay, dcl); VncState *vs; -cursor_put(vd->cursor); +cursor_unref(vd->cursor); g_free(vd->cursor_mask); vd->cursor = c; -- 2.39.1
[PATCH 08/11] ui: keep current cursor with QemuConsole
From: Marc-André Lureau Keeping the current cursor around is useful, not only for VNC, but for other displays. Let's move it down, see the following patches for other usages. Signed-off-by: Marc-André Lureau --- include/ui/console.h | 1 + ui/vnc.h | 1 - ui/console.c | 8 ui/vnc.c | 7 ++- 4 files changed, 11 insertions(+), 6 deletions(-) diff --git a/include/ui/console.h b/include/ui/console.h index 3afb5c73a8..e87284da22 100644 --- a/include/ui/console.h +++ b/include/ui/console.h @@ -460,6 +460,7 @@ QemuConsole *qemu_console_lookup_by_device(DeviceState *dev, uint32_t head); QemuConsole *qemu_console_lookup_by_device_name(const char *device_id, uint32_t head, Error **errp); QemuConsole *qemu_console_lookup_unused(void); +QEMUCursor *qemu_console_get_cursor(QemuConsole *con); bool qemu_console_is_visible(QemuConsole *con); bool qemu_console_is_graphic(QemuConsole *con); bool qemu_console_is_fixedsize(QemuConsole *con); diff --git a/ui/vnc.h b/ui/vnc.h index a60fb13115..757fa83044 100644 --- a/ui/vnc.h +++ b/ui/vnc.h @@ -159,7 +159,6 @@ struct VncDisplay QKbdState *kbd; QemuMutex mutex; -QEMUCursor *cursor; int cursor_msize; uint8_t *cursor_mask; diff --git a/ui/console.c b/ui/console.c index ab43561fe1..4e88941165 100644 --- a/ui/console.c +++ b/ui/console.c @@ -93,6 +93,7 @@ struct QemuConsole { uint32_t head; QemuUIInfo ui_info; QEMUTimer *ui_timer; +QEMUCursor *cursor; const GraphicHwOps *hw_ops; void *hw; @@ -1922,6 +1923,8 @@ void dpy_cursor_define(QemuConsole *con, QEMUCursor *cursor) DisplayState *s = con->ds; DisplayChangeListener *dcl; +cursor_unref(con->cursor); +con->cursor = cursor_ref(cursor); if (!qemu_console_is_visible(con)) { return; } @@ -2287,6 +2290,11 @@ QemuConsole *qemu_console_lookup_unused(void) return NULL; } +QEMUCursor *qemu_console_get_cursor(QemuConsole *con) +{ +return con->cursor; +} + bool qemu_console_is_visible(QemuConsole *con) { return (con == active_console) || (con->dcls > 0); diff --git a/ui/vnc.c b/ui/vnc.c index 8aec5d751e..bbd8b6baae 100644 --- a/ui/vnc.c +++ b/ui/vnc.c @@ -988,10 +988,10 @@ static void vnc_mouse_set(DisplayChangeListener *dcl, static int vnc_cursor_define(VncState *vs) { -QEMUCursor *c = vs->vd->cursor; +QEMUCursor *c = qemu_console_get_cursor(vs->vd->dcl.con); int isize; -if (!vs->vd->cursor) { +if (!c) { return -1; } @@ -1029,10 +1029,7 @@ static void vnc_dpy_cursor_define(DisplayChangeListener *dcl, VncDisplay *vd = container_of(dcl, VncDisplay, dcl); VncState *vs; -cursor_unref(vd->cursor); g_free(vd->cursor_mask); - -vd->cursor = cursor_ref(vd->cursor); vd->cursor_msize = cursor_get_mono_bpl(c) * c->height; vd->cursor_mask = g_malloc0(vd->cursor_msize); cursor_get_mono_mask(c, 0, vd->cursor_mask); -- 2.39.1
[PATCH 07/11] ui: rename cursor_{get->ref}, return it
From: Marc-André Lureau The naming is more conventional in QEMU code, and allows to simplify some code. Signed-off-by: Marc-André Lureau --- include/ui/console.h | 2 +- ui/cursor.c | 3 ++- ui/dbus-listener.c | 3 +-- ui/spice-display.c | 4 ++-- ui/vnc.c | 3 +-- 5 files changed, 7 insertions(+), 8 deletions(-) diff --git a/include/ui/console.h b/include/ui/console.h index ec28402a4f..3afb5c73a8 100644 --- a/include/ui/console.h +++ b/include/ui/console.h @@ -152,7 +152,7 @@ typedef struct QEMUCursor { } QEMUCursor; QEMUCursor *cursor_alloc(int width, int height); -void cursor_get(QEMUCursor *c); +QEMUCursor *cursor_ref(QEMUCursor *c); void cursor_unref(QEMUCursor *c); QEMUCursor *cursor_builtin_hidden(void); QEMUCursor *cursor_builtin_left_ptr(void); diff --git a/ui/cursor.c b/ui/cursor.c index 31b09bf058..6fe67990e2 100644 --- a/ui/cursor.c +++ b/ui/cursor.c @@ -106,9 +106,10 @@ QEMUCursor *cursor_alloc(int width, int height) return c; } -void cursor_get(QEMUCursor *c) +QEMUCursor *cursor_ref(QEMUCursor *c) { c->refcount++; +return c; } void cursor_unref(QEMUCursor *c) diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c index cc1054caff..f195d8df57 100644 --- a/ui/dbus-listener.c +++ b/ui/dbus-listener.c @@ -340,14 +340,13 @@ static void dbus_cursor_define(DisplayChangeListener *dcl, DBusDisplayListener *ddl = container_of(dcl, DBusDisplayListener, dcl); GVariant *v_data = NULL; -cursor_get(c); v_data = g_variant_new_from_data( G_VARIANT_TYPE("ay"), c->data, c->width * c->height * 4, TRUE, (GDestroyNotify)cursor_unref, -c); +cursor_ref(c)); qemu_dbus_display1_listener_call_cursor_define( ddl->proxy, diff --git a/ui/spice-display.c b/ui/spice-display.c index 161868735b..918cadecf1 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -459,7 +459,7 @@ void qemu_spice_cursor_refresh_bh(void *opaque) if (ssd->cursor) { QEMUCursor *c = ssd->cursor; assert(ssd->dcl.con); -cursor_get(c); +cursor_ref(c); qemu_mutex_unlock(&ssd->lock); dpy_cursor_define(ssd->dcl.con, c); qemu_mutex_lock(&ssd->lock); @@ -764,7 +764,7 @@ static void display_mouse_define(DisplayChangeListener *dcl, SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl); qemu_mutex_lock(&ssd->lock); -cursor_get(c); +cursor_ref(c); cursor_unref(ssd->cursor); ssd->cursor = c; ssd->hot_x = c->hot_x; diff --git a/ui/vnc.c b/ui/vnc.c index 0bdcc3dfce..8aec5d751e 100644 --- a/ui/vnc.c +++ b/ui/vnc.c @@ -1032,8 +1032,7 @@ static void vnc_dpy_cursor_define(DisplayChangeListener *dcl, cursor_unref(vd->cursor); g_free(vd->cursor_mask); -vd->cursor = c; -cursor_get(vd->cursor); +vd->cursor = cursor_ref(vd->cursor); vd->cursor_msize = cursor_get_mono_bpl(c) * c->height; vd->cursor_mask = g_malloc0(vd->cursor_msize); cursor_get_mono_mask(c, 0, vd->cursor_mask); -- 2.39.1
[PATCH 09/11] ui: set cursor upon listener registration
From: Marc-André Lureau Signed-off-by: Marc-André Lureau --- ui/console.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/ui/console.c b/ui/console.c index 4e88941165..3fc8bf2fbc 100644 --- a/ui/console.c +++ b/ui/console.c @@ -1661,6 +1661,9 @@ void register_displaychangelistener(DisplayChangeListener *dcl) con = active_console; } displaychangelistener_display_console(dcl, con, dcl->con ? &error_fatal : NULL); +if (con->cursor && dcl->ops->dpy_cursor_define) { +dcl->ops->dpy_cursor_define(dcl, con->cursor); +} text_console_update_cursor(NULL); } -- 2.39.1
[PATCH 10/11] ui: set cursor position upon listener registration
From: Marc-André Lureau Signed-off-by: Marc-André Lureau --- ui/console.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/ui/console.c b/ui/console.c index 3fc8bf2fbc..6f87158c2d 100644 --- a/ui/console.c +++ b/ui/console.c @@ -94,6 +94,7 @@ struct QemuConsole { QemuUIInfo ui_info; QEMUTimer *ui_timer; QEMUCursor *cursor; +int cursor_x, cursor_y, cursor_on; const GraphicHwOps *hw_ops; void *hw; @@ -1664,6 +1665,9 @@ void register_displaychangelistener(DisplayChangeListener *dcl) if (con->cursor && dcl->ops->dpy_cursor_define) { dcl->ops->dpy_cursor_define(dcl, con->cursor); } +if (dcl->ops->dpy_mouse_set) { +dcl->ops->dpy_mouse_set(dcl, con->cursor_x, con->cursor_y, con->cursor_on); +} text_console_update_cursor(NULL); } @@ -1908,6 +1912,9 @@ void dpy_mouse_set(QemuConsole *con, int x, int y, int on) DisplayState *s = con->ds; DisplayChangeListener *dcl; +con->cursor_x = x; +con->cursor_y = y; +con->cursor_on = on; if (!qemu_console_is_visible(con)) { return; } -- 2.39.1
[PATCH 11/11] ui/sdl: get the GL context from the window
From: Marc-André Lureau There is no guarantee to have a current GL context here. The current code seems to rely on the renderer using a GL backend, and to set a current GL context. But this is not always the case, for example if the renderer backend is DirectX. This change is enough to fix using virgl with sdl2 on win32, on my setup. Signed-off-by: Marc-André Lureau --- ui/sdl2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ui/sdl2.c b/ui/sdl2.c index 8cb77416af..f259e4c4d1 100644 --- a/ui/sdl2.c +++ b/ui/sdl2.c @@ -101,7 +101,7 @@ void sdl2_window_create(struct sdl2_console *scon) flags); scon->real_renderer = SDL_CreateRenderer(scon->real_window, -1, 0); if (scon->opengl) { -scon->winctx = SDL_GL_GetCurrentContext(); +scon->winctx = SDL_GL_CreateContext(scon->real_window); } sdl_update_caption(scon); } -- 2.39.1
RE: [PATCH v1 1/1] virtio-block: switch to blk_get_max_hw_transfer
> -Original Message- > From: Kevin Wolf > Sent: Friday, 13 January 2023 13:45 > To: Ilya Dryomov > Cc: Or Ozeri ; qemu-devel@nongnu.org; > dupad...@redhat.com; to.my.troc...@gmail.com; qemu- > bl...@nongnu.org; Danny Harnik ; Stefan Hajnoczi > ; Paolo Bonzini > Subject: [EXTERNAL] Re: [PATCH v1 1/1] virtio-block: switch to > blk_get_max_hw_transfer > > > I'm not sure I understand. This is not a passthrough device (unlike scsi- > generic), so why should we consider the hardware limits rather than the > kernel/other backend limits for read/write requests? I don't understand much about it as well :) But anyway, this bug was tested on 6.1.0, and Ilya suggested that I will test it on newer versions. After doing that, I found out that the bug is not reproduceable in 6.1.1. The commit that fixed things in 6.1.1 is this: block: introduce max_hw_iov for use in scsi-generic https://lists.gnu.org/archive/html/qemu-block/2021-09/msg00807.html I guess we can just discard this thread. Thanks, Or
[PATCH v6 1/3] block/rbd: Remove redundant stack variable passphrase_len
From: Or Ozeri Signed-off-by: Or Ozeri --- block/rbd.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index 3aa6aae0e0..cfec0f2862 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -385,7 +385,6 @@ static int qemu_rbd_encryption_format(rbd_image_t image, { int r = 0; g_autofree char *passphrase = NULL; -size_t passphrase_len; rbd_encryption_format_t format; rbd_encryption_options_t opts; rbd_encryption_luks1_format_options_t luks_opts; @@ -407,12 +406,12 @@ static int qemu_rbd_encryption_format(rbd_image_t image, opts_size = sizeof(luks_opts); r = qemu_rbd_convert_luks_create_options( qapi_RbdEncryptionCreateOptionsLUKS_base(&encrypt->u.luks), -&luks_opts.alg, &passphrase, &passphrase_len, errp); +&luks_opts.alg, &passphrase, &luks_opts.passphrase_size, +errp); if (r < 0) { return r; } luks_opts.passphrase = passphrase; -luks_opts.passphrase_size = passphrase_len; break; } case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS2: { @@ -423,12 +422,12 @@ static int qemu_rbd_encryption_format(rbd_image_t image, r = qemu_rbd_convert_luks_create_options( qapi_RbdEncryptionCreateOptionsLUKS2_base( &encrypt->u.luks2), -&luks2_opts.alg, &passphrase, &passphrase_len, errp); +&luks2_opts.alg, &passphrase, &luks2_opts.passphrase_size, +errp); if (r < 0) { return r; } luks2_opts.passphrase = passphrase; -luks2_opts.passphrase_size = passphrase_len; break; } default: { @@ -467,7 +466,6 @@ static int qemu_rbd_encryption_load(rbd_image_t image, { int r = 0; g_autofree char *passphrase = NULL; -size_t passphrase_len; rbd_encryption_luks1_format_options_t luks_opts; rbd_encryption_luks2_format_options_t luks2_opts; rbd_encryption_format_t format; @@ -482,12 +480,11 @@ static int qemu_rbd_encryption_load(rbd_image_t image, opts_size = sizeof(luks_opts); r = qemu_rbd_convert_luks_options( qapi_RbdEncryptionOptionsLUKS_base(&encrypt->u.luks), -&passphrase, &passphrase_len, errp); +&passphrase, &luks_opts.passphrase_size, errp); if (r < 0) { return r; } luks_opts.passphrase = passphrase; -luks_opts.passphrase_size = passphrase_len; break; } case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS2: { @@ -497,12 +494,11 @@ static int qemu_rbd_encryption_load(rbd_image_t image, opts_size = sizeof(luks2_opts); r = qemu_rbd_convert_luks_options( qapi_RbdEncryptionOptionsLUKS2_base(&encrypt->u.luks2), -&passphrase, &passphrase_len, errp); +&passphrase, &luks2_opts.passphrase_size, errp); if (r < 0) { return r; } luks2_opts.passphrase = passphrase; -luks2_opts.passphrase_size = passphrase_len; break; } default: { -- 2.25.1
[RFC PATCH] docs/about/deprecated: Deprecate 32-bit host systems
Testing 32-bit host OS support takes a lot of precious time during the QEMU contiguous integration tests, and considering that many OS vendors stopped shipping 32-bit variants of their OS distributions and most hardware from the past >10 years is capable of 64-bit, keeping the 32-bit support alive is an inadequate burden for the QEMU project. Let's mark the 32-bit support as deprecated so we can drop it after a while - this will help us to cut down our limited CI minutes in the gitlab CI, for example. Signed-off-by: Thomas Huth --- docs/about/deprecated.rst | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 9f1bbc495d..ce6463e72b 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -181,9 +181,20 @@ As Debian 10 ("Buster") moved into LTS the big endian 32 bit version of MIPS moved out of support making it hard to maintain our cross-compilation CI tests of the architecture. As we no longer have CI coverage support may bitrot away before the deprecation process -completes. The little endian variants of MIPS (both 32 and 64 bit) are +completes. The little endian variants of MIPS are still a supported host architecture. +32-bit host operating systems (since 8.0) +' + +Testing 32-bit host OS support takes a lot of precious time during the QEMU +contiguous integration tests, and considering that many OS vendors stopped +shipping 32-bit variants of their OS distributions and most hardware from +the past >10 years is capable of 64-bit, keeping the 32-bit support alive +is an inadequate burden for the QEMU project. Thus QEMU will soon drop the +support for 32-bit host systems. + + QEMU API (QAPI) events -- -- 2.31.1
[PATCH v6 2/3] block/rbd: Add luks-any encryption opening option
From: Or Ozeri Ceph RBD encryption API required specifying the encryption format for loading encryption. The supported formats were LUKS (v1) and LUKS2. Starting from Reef release, RBD also supports loading with "luks-any" format, which works for both versions of LUKS. This commit extends the qemu rbd driver API to enable qemu users to use this luks-any wildcard format. Signed-off-by: Or Ozeri --- block/rbd.c | 19 +++ qapi/block-core.json | 20 ++-- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index cfec0f2862..b929378871 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -468,6 +468,9 @@ static int qemu_rbd_encryption_load(rbd_image_t image, g_autofree char *passphrase = NULL; rbd_encryption_luks1_format_options_t luks_opts; rbd_encryption_luks2_format_options_t luks2_opts; +#ifdef LIBRBD_SUPPORTS_ENCRYPTION_LOAD2 +rbd_encryption_luks_format_options_t luks_any_opts; +#endif rbd_encryption_format_t format; rbd_encryption_options_t opts; size_t opts_size; @@ -501,6 +504,22 @@ static int qemu_rbd_encryption_load(rbd_image_t image, luks2_opts.passphrase = passphrase; break; } +#ifdef LIBRBD_SUPPORTS_ENCRYPTION_LOAD2 +case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS_ANY: { +memset(&luks_any_opts, 0, sizeof(luks_any_opts)); +format = RBD_ENCRYPTION_FORMAT_LUKS; +opts = &luks_any_opts; +opts_size = sizeof(luks_any_opts); +r = qemu_rbd_convert_luks_options( + qapi_RbdEncryptionOptionsLUKSAny_base(&encrypt->u.luks_any), +&passphrase, &luks_any_opts.passphrase_size, errp); +if (r < 0) { +return r; +} +luks_any_opts.passphrase = passphrase; +break; +} +#endif default: { r = -ENOTSUP; error_setg_errno( diff --git a/qapi/block-core.json b/qapi/block-core.json index 95ac4fa634..e59fb5d453 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -3827,10 +3827,16 @@ ## # @RbdImageEncryptionFormat: # +# luks +# +# luks2 +# +# luks-any: Used for opening either luks or luks2. (Since 8.0) +# # Since: 6.1 ## { 'enum': 'RbdImageEncryptionFormat', - 'data': [ 'luks', 'luks2' ] } + 'data': [ 'luks', 'luks2', 'luks-any' ] } ## # @RbdEncryptionOptionsLUKSBase: @@ -3872,6 +3878,15 @@ 'base': 'RbdEncryptionOptionsLUKSBase', 'data': { } } +## +# @RbdEncryptionOptionsLUKSAny: +# +# Since: 8.0 +## +{ 'struct': 'RbdEncryptionOptionsLUKSAny', + 'base': 'RbdEncryptionOptionsLUKSBase', + 'data': { } } + ## # @RbdEncryptionCreateOptionsLUKS: # @@ -3899,7 +3914,8 @@ 'base': { 'format': 'RbdImageEncryptionFormat' }, 'discriminator': 'format', 'data': { 'luks': 'RbdEncryptionOptionsLUKS', -'luks2': 'RbdEncryptionOptionsLUKS2' } } +'luks2': 'RbdEncryptionOptionsLUKS2', +'luks-any': 'RbdEncryptionOptionsLUKSAny'} } ## # @RbdEncryptionCreateOptions: -- 2.25.1
Re: [RFC PATCH] docs/about/deprecated: Deprecate 32-bit host systems
On Mon, 30 Jan 2023 at 11:44, Thomas Huth wrote: > > Testing 32-bit host OS support takes a lot of precious time during the QEMU > contiguous integration tests, and considering that many OS vendors stopped > shipping 32-bit variants of their OS distributions and most hardware from > the past >10 years is capable of 64-bit True for x86, not necessarily true for other architectures. Are you proposing to deprecate x86 32-bit, or all 32-bit? I'm not entirely sure about whether we're yet at a point where I'd want to deprecate-and-drop 32-bit arm host support. thanks -- PMM
Re: [PATCH 0/3] Vhost-user: replace _SLAVE_ with _BACKEND_
On 1/30/23 12:08, Michael S. Tsirkin wrote: On Mon, Jan 30, 2023 at 11:45:45AM +0100, Maxime Coquelin wrote: This series continues the work done to get rid of harmful language in the Vhost-user specification. I prefer a positive "switch to a more inclusive terminology". To consider if you keep doing this work. Right, it is indeed better. I will post a new revision using positive wording. Thanks for the review, Maxime While the spec texts were changed to replace slave with backend, the protocol features and messages names hadn't been changed. This series renames remaining occurences in the spec and make use of the new names in both libvhost-user and the Vhost-user frontend code. Maxime Coquelin (3): docs: vhost-user: replace _SLAVE_ with _BACKEND_ libvhost-user: Adopt new backend naming vhost-user: Adopt new backend naming docs/interop/vhost-user.rst | 40 +++ hw/virtio/vhost-user.c| 30 - hw/virtio/virtio-qmp.c| 12 +++ subprojects/libvhost-user/libvhost-user.c | 20 ++-- subprojects/libvhost-user/libvhost-user.h | 20 ++-- 5 files changed, 61 insertions(+), 61 deletions(-) -- 2.39.1
Re: [PATCH] hw/usb/core: fix inconsistent ep and pid (UBS_TOKEN_SETUP)
Hi all, I'm sure this patch will prevent the assertion failure due to the inconsistent ep and pid (UBS_TOKEN_SETUP) ( https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg07179.html). For UHCI (https://gitlab.com/qemu-project/qemu/-/issues/119) and OHCI ( https://gitlab.com/qemu-project/qemu/-/issues/303), this patch may be right. For EHCI, I found another way to trigger this assertion even with my patch because ehci_get_pid() returns 0 if qtd->token.QTD_TOKEN_PID is not valid[0]. In this case, the patch cannot capture it because pid is zero[2]. This case is specific to EHCI as far as I know. It seems we want to drop the operation if ehci_get_pid() returns 0. ```static int ehci_get_pid(EHCIqtd *qtd) { switch (get_field(qtd->token, QTD_TOKEN_PID)) { case 0: return USB_TOKEN_OUT; case 1: return USB_TOKEN_IN; case 2: return USB_TOKEN_SETUP; default: fprintf(stderr, "bad token\n"); // -> [0] return 0; } } static int ehci_execute(EHCIPacket *p, const char *action) { p->pid = ehci_get_pid(&p->qtd); // > [1] p->queue->last_pid = p->pid; endp = get_field(p->queue->qh.epchar, QH_EPCHAR_EP); ep = usb_ep_get(p->queue->dev, p->pid/*=0*/, endp); // ---> [2] ``` A qtest sequence is like ``` writel 0x1011b000 0x10124000 writel 0x10124004 0x358cbd80 writel 0x10124018 0x9e4bba36 writel 0x10124014 0x10139000 writel 0xfebd5020 0x1c4a5135 writel 0x10139008 0x3d5c4b84 clock_step 0xb17b0 writel 0xfebd5064 0x5f919911 clock_step 0xa9229 writel 0xfebd5064 0x5431e207 writel 0xfebd5038 0x1b2034b5 writel 0x1b2034a0 0x1010 writel 0x1010 0x10109000 writel 0x10109000 0x1011b000 clock_step 0xa9229 ``` Best, Qiang
[PATCH v6 3/3] block/rbd: Add support for layered encryption
From: Or Ozeri Starting from ceph Reef, RBD has built-in support for layered encryption, where each ancestor image (in a cloned image setting) can be possibly encrypted using a unique passphrase. A new function, rbd_encryption_load2, was added to librbd API. This new function supports an array of passphrases (via "spec" structs). This commit extends the qemu rbd driver API to use this new librbd API, in order to support this new layered encryption feature. Signed-off-by: Or Ozeri --- block/rbd.c | 153 ++- qapi/block-core.json | 11 +++- 2 files changed, 162 insertions(+), 2 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index b929378871..3a8060b88b 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -71,6 +71,16 @@ static const char rbd_luks2_header_verification[ 'L', 'U', 'K', 'S', 0xBA, 0xBE, 0, 2 }; +static const char rbd_layered_luks_header_verification[ +RBD_ENCRYPTION_LUKS_HEADER_VERIFICATION_LEN] = { +'R', 'B', 'D', 'L', 0xBA, 0xBE, 0, 1 +}; + +static const char rbd_layered_luks2_header_verification[ +RBD_ENCRYPTION_LUKS_HEADER_VERIFICATION_LEN] = { +'R', 'B', 'D', 'L', 0xBA, 0xBE, 0, 2 +}; + typedef enum { RBD_AIO_READ, RBD_AIO_WRITE, @@ -537,6 +547,128 @@ static int qemu_rbd_encryption_load(rbd_image_t image, return 0; } + +#ifdef LIBRBD_SUPPORTS_ENCRYPTION_LOAD2 +static int qemu_rbd_encryption_load2(rbd_image_t image, + RbdEncryptionOptions *encrypt, + Error **errp) +{ +int r = 0; +int encrypt_count = 1; +int i; +RbdEncryptionOptions *curr_encrypt; +rbd_encryption_spec_t *specs; +rbd_encryption_luks1_format_options_t *luks_opts; +rbd_encryption_luks2_format_options_t *luks2_opts; +rbd_encryption_luks_format_options_t *luks_any_opts; + +/* count encryption options */ +for (curr_encrypt = encrypt->parent; curr_encrypt; + curr_encrypt = curr_encrypt->parent) { +++encrypt_count; +} + +specs = g_new0(rbd_encryption_spec_t, encrypt_count); + +curr_encrypt = encrypt; +for (i = 0; i < encrypt_count; ++i) { +switch (curr_encrypt->format) { +case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS: { +specs[i].format = RBD_ENCRYPTION_FORMAT_LUKS1; + +luks_opts = g_new0(rbd_encryption_luks1_format_options_t, 1); +specs[i].opts = luks_opts; +specs[i].opts_size = sizeof(*luks_opts); + +r = qemu_rbd_convert_luks_options( +qapi_RbdEncryptionOptionsLUKS_base( +&curr_encrypt->u.luks), +(char **)&luks_opts->passphrase, +&luks_opts->passphrase_size, +errp); +break; +} +case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS2: { +specs[i].format = RBD_ENCRYPTION_FORMAT_LUKS2; + +luks2_opts = g_new0(rbd_encryption_luks2_format_options_t, 1); +specs[i].opts = luks2_opts; +specs[i].opts_size = sizeof(*luks2_opts); + +r = qemu_rbd_convert_luks_options( +qapi_RbdEncryptionOptionsLUKS2_base( +&curr_encrypt->u.luks2), +(char **)&luks2_opts->passphrase, +&luks2_opts->passphrase_size, +errp); +break; +} +case RBD_IMAGE_ENCRYPTION_FORMAT_LUKS_ANY: { +specs[i].format = RBD_ENCRYPTION_FORMAT_LUKS; + +luks_any_opts = g_new0(rbd_encryption_luks_format_options_t, 1); +specs[i].opts = luks_any_opts; +specs[i].opts_size = sizeof(*luks_any_opts); + +r = qemu_rbd_convert_luks_options( +qapi_RbdEncryptionOptionsLUKSAny_base( +&curr_encrypt->u.luks_any), +(char **)&luks_any_opts->passphrase, +&luks_any_opts->passphrase_size, +errp); +break; +} +default: { +r = -ENOTSUP; +error_setg_errno( +errp, -r, "unknown image encryption format: %u", +curr_encrypt->format); +} +} + +if (r < 0) { +goto exit; +} + +curr_encrypt = curr_encrypt->parent; +} + +r = rbd_encryption_load2(image, specs, encrypt_count); +if (r < 0) { +error_setg_errno(errp, -r, "layered encryption load fail"); +goto exit; +} + +exit: +for (i = 0; i < encrypt_count; ++i) { +if (!specs[i].opts) { +break; +} + +switch (specs[i].format) { +case RBD_ENCRYPTION_FORMAT_LUKS1: { +luks_opts =
Re: [RFC PATCH] docs/about/deprecated: Deprecate 32-bit host systems
On Mon, Jan 30, 2023 at 11:47:02AM +, Peter Maydell wrote: > On Mon, 30 Jan 2023 at 11:44, Thomas Huth wrote: > > > > Testing 32-bit host OS support takes a lot of precious time during the QEMU > > contiguous integration tests, and considering that many OS vendors stopped > > shipping 32-bit variants of their OS distributions and most hardware from > > the past >10 years is capable of 64-bit > > True for x86, not necessarily true for other architectures. > Are you proposing to deprecate x86 32-bit, or all 32-bit? > I'm not entirely sure about whether we're yet at a point where > I'd want to deprecate-and-drop 32-bit arm host support. Do we have a feeling on which aspects of 32-bit cause us the support burden ? The boring stuff like compiler errors from mismatched integer sizes is mostly quick & easy to detect simply through a cross compile. I vaguely recall someone mentioned problems with atomic ops in the past, or was it 128-bit ints, caused implications for the codebase ? With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 6/6] gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job
On 30/01/2023 11.58, Daniel P. Berrangé wrote: On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: We can get rid of the build-coroutine-sigaltstack job by moving the configure flags that should be tested here to other jobs: Move --with-coroutine=sigaltstack to the build-without-defaults job and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. The biggest user of coroutines is the block layer. So we probably ought to have coroutines aligned with a job that triggers the 'make check-block' for iotests. IIUC, the without-defaults job won't do that. How about, arbitrarily, using either the 'check-system-debian' or 'check-system-ubuntu' job. Those distros are closely related, so getting sigaltstack vs ucontext coverage between them is a good win, and they both trigger the block jobs IIUC. Ok ... but let's put qemu-bl...@nongnu.org on CC: first, to see what the block layer folks think about this. Incidentally sigaltstack is also covered by our Cirrus CI job for macOS. Oh, nice, so we already have some "check-block" test coverage there! Thomas --- .gitlab-ci.d/buildtest.yml | 14 ++ .gitlab-ci.d/crossbuilds.yml | 2 +- 2 files changed, 3 insertions(+), 13 deletions(-) diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml index 91c7467a66..1438797a1c 100644 --- a/.gitlab-ci.d/buildtest.yml +++ b/.gitlab-ci.d/buildtest.yml @@ -533,19 +533,8 @@ build-tci: - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow - make check-tcg -# Alternate coroutines implementations are only really of interest to KVM users -# However we can't test against KVM on Gitlab-CI so we can only run unit tests -build-coroutine-sigaltstack: - extends: .native_build_job_template - needs: -job: amd64-ubuntu2004-container - variables: -IMAGE: ubuntu2004 -CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg ---enable-trace-backends=ftrace -MAKE_CHECK_ARGS: check-unit - # Check our reduced build configurations +# (and an alternative coroutine implementation) build-without-defaults: extends: .native_build_job_template needs: @@ -559,6 +548,7 @@ build-without-defaults: --disable-pie --disable-qom-cast-debug --disable-strip + --with-coroutine=sigaltstack TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml index 8dbbb8f881..027d2088da 100644 --- a/.gitlab-ci.d/crossbuilds.yml +++ b/.gitlab-ci.d/crossbuilds.yml @@ -159,7 +159,7 @@ cross-s390x-kvm-only: job: s390x-debian-cross-container variables: IMAGE: debian-s390x-cross -EXTRA_CONFIGURE_OPTS: --disable-tcg +EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace cross-mips64el-kvm-only: extends: .cross_accel_build_job -- 2.31.1 With regards, Daniel