Introduce: Storage stability testing and DATA consistency verifying tools and system

2023-10-06 Thread 张友加
Dear All,




I hope you are all well. I would like to introduce a new set of tools I have developed, 
named "LBA tools", which includes hd_write_verify & hd_write_verify_dump.




github: https://github.com/zhangyoujia/hd_write_verify




pdf:  https://github.com/zhangyoujia/hd_write_verify/DISK stability 
testing and DATA consistency verifying tools and system.pdf




ppt:  https://github.com/zhangyoujia/hd_write_verify/存储稳定性测试与数据一致性校验工具和系统.pptx




bin:  https://github.com/zhangyoujia/hd_write_verify/bin




iso:  https://github.com/zhangyoujia/hd_write_verify/iso




Data is a vital asset for many businesses, making storage stability and data 
consistency the most fundamental requirements in storage technology scenarios.




The purpose of storage stability testing is to ensure that storage devices or 
systems can operate normally and remain stable over time, while also handling 
various abnormal situations such as sudden power outages and network failures. 
This testing typically includes stress testing, load testing, fault tolerance 
testing, and other evaluations to assess the performance and reliability of the 
storage system.




Data consistency checking is designed to ensure that the data stored in the 
system is accurate and consistent. This means that whenever data changes occur, 
all replicas should be updated simultaneously to avoid data inconsistency. Data 
consistency checking typically involves aspects such as data integrity, 
accuracy, consistency, and reliability.
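The core idea behind this kind of verification can be sketched in a few lines of C. Note that the pattern layout, the constant, and the function names below are illustrative only; they are not hd_write_verify's actual on-disk format. The technique is the same, though: stamp each sector with a pattern derived from its own LBA and a per-run seed, so a later read-back can detect lost, stale, or misdirected writes by recomputing the expected pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/*
 * Fill one sector with 64-bit words mixing the LBA, a per-run seed, and
 * the word index.  Any two (lba, seed) pairs produce different contents,
 * so a sector written for LBA X but landing at LBA Y fails verification.
 */
static void fill_pattern(uint8_t *buf, uint64_t lba, uint64_t seed)
{
    for (size_t i = 0; i < SECTOR_SIZE / sizeof(uint64_t); i++) {
        uint64_t word = lba ^ seed ^ (i * 0x9E3779B97F4A7C15ULL);
        memcpy(buf + i * sizeof(uint64_t), &word, sizeof(word));
    }
}

/* Return 0 if the sector matches the expected pattern, -1 otherwise. */
static int verify_pattern(const uint8_t *buf, uint64_t lba, uint64_t seed)
{
    for (size_t i = 0; i < SECTOR_SIZE / sizeof(uint64_t); i++) {
        uint64_t expect = lba ^ seed ^ (i * 0x9E3779B97F4A7C15ULL);
        uint64_t actual;
        memcpy(&actual, buf + i * sizeof(uint64_t), sizeof(actual));
        if (actual != expect) {
            return -1;
        }
    }
    return 0;
}
```

Changing the seed on every write pass also catches stale data: a sector that still verifies against the previous seed was silently never rewritten.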




LBA tools are very useful for testing storage stability and verifying data 
consistency; they are much better than the verification functions of FIO & vdbench.




I believe that LBA tools will have a positive impact on the community and help 
users handle storage data more effectively. Your feedback and suggestions are 
greatly appreciated, and I hope you can try using LBA tools and share your 
experiences and recommendations.




Best regards




Re: [PATCH v2 22/53] migration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:19, Markus Armbruster wrote:
> qemu_rdma_data_init() neglects to set an Error when it fails because
> @host_port is null.  Fortunately, no caller passes null, so this is
> merely a latent bug.  Drop the flawed code handling null argument.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Li Zhijian 


> ---
>   migration/rdma.c | 29 +
>   1 file changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 1a0ad44411..1ae2f87906 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2747,25 +2747,22 @@ static RDMAContext *qemu_rdma_data_init(const char 
> *host_port, Error **errp)
>   RDMAContext *rdma = NULL;
>   InetSocketAddress *addr;
>   
> -if (host_port) {
> -rdma = g_new0(RDMAContext, 1);
> -rdma->current_index = -1;
> -rdma->current_chunk = -1;
> +rdma = g_new0(RDMAContext, 1);
> +rdma->current_index = -1;
> +rdma->current_chunk = -1;
>   
> -addr = g_new(InetSocketAddress, 1);
> -if (!inet_parse(addr, host_port, NULL)) {
> -rdma->port = atoi(addr->port);
> -rdma->host = g_strdup(addr->host);
> -rdma->host_port = g_strdup(host_port);
> -} else {
> -ERROR(errp, "bad RDMA migration address '%s'", host_port);
> -g_free(rdma);
> -rdma = NULL;
> -}
> -
> -qapi_free_InetSocketAddress(addr);
> +addr = g_new(InetSocketAddress, 1);
> +if (!inet_parse(addr, host_port, NULL)) {
> +rdma->port = atoi(addr->port);
> +rdma->host = g_strdup(addr->host);
> +rdma->host_port = g_strdup(host_port);
> +} else {
> +ERROR(errp, "bad RDMA migration address '%s'", host_port);
> +g_free(rdma);
> +rdma = NULL;
>   }
>   
> +qapi_free_InetSocketAddress(addr);
>   return rdma;
>   }
>   

Re: [PATCH v2 16/53] migration/rdma: Fix or document problematic uses of errno

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:19, Markus Armbruster wrote:
> We use errno after calling Libibverbs functions that are not
> documented to set errno (manual page does not mention errno), or where
> the documentation is unclear ("returns [...] the value of errno on
> failure").  While this could be read as "sets errno and returns it",
> a glance at the source code[*] kills that hope:
> 
>  static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr 
> *wr,
>  struct ibv_send_wr **bad_wr)
>  {
>  return qp->context->ops.post_send(qp, wr, bad_wr);
>  }
> 
> The callback can be
> 
>  static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
>struct ibv_send_wr **bad)
>  {
>  /* This version of driver supports RAW QP only.
>   * Posting WR is done directly in the application.
>   */
>  return EOPNOTSUPP;
>  }
> 
> Neither of them touches errno.
> 
> One of these errno uses is easy to fix, so do that now.  Several more
> will go away later in the series; add temporary FIXME commments.
> Three will remain; add TODO comments.  TODO, not FIXME, because the
> bug might be in Libibverbs documentation.
> 
> [*] https://github.com/linux-rdma/rdma-core.git
>  commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf
> 
> Signed-off-by: Markus Armbruster 


Reviewed-by: Li Zhijian 


> ---
>   migration/rdma.c | 45 +++--
>   1 file changed, 39 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 28097ce604..bba8c99fa9 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -853,6 +853,12 @@ static int qemu_rdma_broken_ipv6_kernel(struct 
> ibv_context *verbs, Error **errp)
>   
>   for (x = 0; x < num_devices; x++) {
>   verbs = ibv_open_device(dev_list[x]);
> +/*
> + * ibv_open_device() is not documented to set errno.  If
> + * it does, it's somebody else's doc bug.  If it doesn't,
> + * the use of errno below is wrong.
> + * TODO Find out whether ibv_open_device() sets errno.
> + */
>   if (!verbs) {
>   if (errno == EPERM) {
>   continue;
> @@ -1162,11 +1168,7 @@ static void qemu_rdma_advise_prefetch_mr(struct ibv_pd 
> *pd, uint64_t addr,
>   ret = ibv_advise_mr(pd, advice,
>   IBV_ADVISE_MR_FLAG_FLUSH, _list, 1);
>   /* ignore the error */
> -if (ret) {
> -trace_qemu_rdma_advise_mr(name, len, addr, strerror(errno));
> -} else {
> -trace_qemu_rdma_advise_mr(name, len, addr, "successed");
> -}
> +trace_qemu_rdma_advise_mr(name, len, addr, strerror(ret));
>   #endif
>   }
>   
> @@ -1183,7 +1185,12 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext 
> *rdma)
>   local->block[i].local_host_addr,
>   local->block[i].length, access
>   );
> -
> +/*
> + * ibv_reg_mr() is not documented to set errno.  If it does,
> + * it's somebody else's doc bug.  If it doesn't, the use of
> + * errno below is wrong.
> + * TODO Find out whether ibv_reg_mr() sets errno.
> + */
>   if (!local->block[i].mr &&
>   errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
>   access |= IBV_ACCESS_ON_DEMAND;
> @@ -1291,6 +1298,12 @@ static int qemu_rdma_register_and_get_keys(RDMAContext 
> *rdma,
>   trace_qemu_rdma_register_and_get_keys(len, chunk_start);
>   
>   block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
> +/*
> + * ibv_reg_mr() is not documented to set errno.  If it does,
> + * it's somebody else's doc bug.  If it doesn't, the use of
> + * errno below is wrong.
> + * TODO Find out whether ibv_reg_mr() sets errno.
> + */
>   if (!block->pmr[chunk] &&
>   errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
>   access |= IBV_ACCESS_ON_DEMAND;
> @@ -1408,6 +1421,11 @@ static int qemu_rdma_unregister_waiting(RDMAContext 
> *rdma)
>   block->remote_keys[chunk] = 0;
>   
>   if (ret != 0) {
> +/*
> + * FIXME perror() is problematic, bcause ibv_dereg_mr() is
> + * not documented to set errno.  Will go away later in
> + * this series.
> + */
>   perror("unregistration chunk failed");
>   return -ret;
>   }
> @@ -1658,6 +1676,11 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma,
>   
>   ret = ibv_get_cq_event(ch, , _ctx);
>   if (ret) {
> +/*
> + * FIXME perror() is problematic, because ibv_reg_mr() is
> + * not documented to set errno.  Will go away later in
> + * this 

Re: [PATCH v2 53/53] migration/rdma: Replace flawed device detail dump by tracing

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:20, Markus Armbruster wrote:
> qemu_rdma_dump_id() dumps RDMA device details to stdout.
> 
> rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
> and qemu_rdma_resolve_host() to show source device details.
> rdma_start_incoming_migration() arranges its call via
> rdma_accept_incoming_migration() and qemu_rdma_accept() to show
> destination device details.
> 
> Two issues:
> 
> 1. rdma_start_outgoing_migration() can run in HMP context.  The
> information should arguably go the monitor, not stdout.
> 
> 2. ibv_query_port() failure is reported as error.  Its callers remain
> unaware of this failure (qemu_rdma_dump_id() can't fail), so
> reporting this to the user as an error is problematic.
> 
> Fixable, but the device detail dump is noise, except when
> troubleshooting.  Tracing is a better fit.  Similar function
> qemu_rdma_dump_id() was converted to tracing in commit
> 733252deb8b (Tracify migration/rdma.c).
> 
> Convert qemu_rdma_dump_id(), too.
> 
> While there, touch up qemu_rdma_dump_gid()'s outdated comment.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Li Zhijian 


> ---
>   migration/rdma.c   | 23 ---
>   migration/trace-events |  2 ++
>   2 files changed, 10 insertions(+), 15 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index dba0802fca..07aef9a071 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -734,38 +734,31 @@ static void rdma_delete_block(RDMAContext *rdma, 
> RDMALocalBlock *block)
>   }
>   
>   /*
> - * Put in the log file which RDMA device was opened and the details
> - * associated with that device.
> + * Trace RDMA device open, with device details.
>*/
>   static void qemu_rdma_dump_id(const char *who, struct ibv_context *verbs)
>   {
>   struct ibv_port_attr port;
>   
>   if (ibv_query_port(verbs, 1, )) {
> -error_report("Failed to query port information");
> +trace_qemu_rdma_dump_id_failed(who);
>   return;
>   }
>   
> -printf("%s RDMA Device opened: kernel name %s "
> -   "uverbs device name %s, "
> -   "infiniband_verbs class device path %s, "
> -   "infiniband class device path %s, "
> -   "transport: (%d) %s\n",
> -who,
> +trace_qemu_rdma_dump_id(who,
>   verbs->device->name,
>   verbs->device->dev_name,
>   verbs->device->dev_path,
>   verbs->device->ibdev_path,
>   port.link_layer,
> -(port.link_layer == IBV_LINK_LAYER_INFINIBAND) ? 
> "Infiniband" :
> - ((port.link_layer == IBV_LINK_LAYER_ETHERNET)
> -? "Ethernet" : "Unknown"));
> +port.link_layer == IBV_LINK_LAYER_INFINIBAND ? "Infiniband"
> +: port.link_layer == IBV_LINK_LAYER_ETHERNET ? "Ethernet"
> +: "Unknown");
>   }
>   
>   /*
> - * Put in the log file the RDMA gid addressing information,
> - * useful for folks who have trouble understanding the
> - * RDMA device hierarchy in the kernel.
> + * Trace RDMA gid addressing information.
> + * Useful for understanding the RDMA device hierarchy in the kernel.
>*/
>   static void qemu_rdma_dump_gid(const char *who, struct rdma_cm_id *id)
>   {
> diff --git a/migration/trace-events b/migration/trace-events
> index d733107ec6..4ce16ae866 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -213,6 +213,8 @@ qemu_rdma_close(void) ""
>   qemu_rdma_connect_pin_all_requested(void) ""
>   qemu_rdma_connect_pin_all_outcome(bool pin) "%d"
>   qemu_rdma_dest_init_trying(const char *host, const char *ip) "%s => %s"
> +qemu_rdma_dump_id_failed(const char *who) "%s RDMA Device opened, but can't 
> query port information"
> +qemu_rdma_dump_id(const char *who, const char *name, const char *dev_name, 
> const char *dev_path, const char *ibdev_path, int transport, const char 
> *transport_name) "%s RDMA Device opened: kernel name %s uverbs device name 
> %s, infiniband_verbs class device path %s, infiniband class device path %s, 
> transport: (%d) %s"
>   qemu_rdma_dump_gid(const char *who, const char *src, const char *dst) "%s 
> Source GID: %s, Dest GID: %s"
>   qemu_rdma_exchange_get_response_start(const char *desc) "CONTROL: %s 
> receiving..."
>   qemu_rdma_exchange_get_response_none(const char *desc, int type) "Surprise: 
> got %s (%d)"

Re: [PATCH v2 52/53] migration/rdma: Use error_report() & friends instead of stderr

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:20, Markus Armbruster wrote:
> error_report() obeys -msg, reports the current error location if any,
> and reports to the current monitor if any.  Reporting to stderr
> directly with fprintf() or perror() is wrong, because it loses all
> this.
> 
> Fix the offenders.  Bonus: resolves a FIXME about problematic use of
> errno.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Li Zhijian 


> ---
>   migration/rdma.c | 44 +---
>   1 file changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 54b59d12b1..dba0802fca 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -877,12 +877,12 @@ static int qemu_rdma_broken_ipv6_kernel(struct 
> ibv_context *verbs, Error **errp)
>   
>   if (roce_found) {
>   if (ib_found) {
> -fprintf(stderr, "WARN: migrations may fail:"
> -" IPv6 over RoCE / iWARP in linux"
> -" is broken. But since you appear to have a"
> -" mixed RoCE / IB environment, be sure to 
> only"
> -" migrate over the IB fabric until the 
> kernel "
> -" fixes the bug.\n");
> +warn_report("WARN: migrations may fail:"
> +" IPv6 over RoCE / iWARP in linux"
> +" is broken. But since you appear to have a"
> +" mixed RoCE / IB environment, be sure to only"
> +" migrate over the IB fabric until the kernel "
> +" fixes the bug.");
>   } else {
>   error_setg(errp, "RDMA ERROR: "
>  "You only have RoCE / iWARP devices in your 
> systems"
> @@ -1418,12 +1418,8 @@ static int qemu_rdma_unregister_waiting(RDMAContext 
> *rdma)
>   block->remote_keys[chunk] = 0;
>   
>   if (ret != 0) {
> -/*
> - * FIXME perror() is problematic, bcause ibv_dereg_mr() is
> - * not documented to set errno.  Will go away later in
> - * this series.
> - */
> -perror("unregistration chunk failed");
> +error_report("unregistration chunk failed: %s",
> + strerror(ret));
>   return -1;
>   }
>   rdma->total_registrations--;
> @@ -3767,7 +3763,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f)
>   block->pmr[reg->key.chunk] = NULL;
>   
>   if (ret != 0) {
> -perror("rdma unregistration chunk failed");
> +error_report("rdma unregistration chunk failed: %s",
> + strerror(errno));
>   goto err;
>   }
>   
> @@ -3956,10 +3953,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f,
>*/
>   
>   if (local->nb_blocks != nb_dest_blocks) {
> -fprintf(stderr, "ram blocks mismatch (Number of blocks %d vs %d) 
> "
> -"Your QEMU command line parameters are probably "
> -"not identical on both the source and destination.",
> -local->nb_blocks, nb_dest_blocks);
> +error_report("ram blocks mismatch (Number of blocks %d vs %d)",
> + local->nb_blocks, nb_dest_blocks);
> +error_printf("Your QEMU command line parameters are probably "
> + "not identical on both the source and 
> destination.");
>   rdma->errored = true;
>   return -1;
>   }
> @@ -3972,10 +3969,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f,
>   
>   /* We require that the blocks are in the same order */
>   if (rdma->dest_blocks[i].length != local->block[i].length) {
> -fprintf(stderr, "Block %s/%d has a different length %" PRIu64
> -"vs %" PRIu64, local->block[i].block_name, i,
> -local->block[i].length,
> -rdma->dest_blocks[i].length);
> +error_report("Block %s/%d has a different length %" PRIu64
> + "vs %" PRIu64,
> + local->block[i].block_name, i,
> + local->block[i].length,
> + rdma->dest_blocks[i].length);
>   rdma->errored = true;
>   return -1;
>   }
> @@ -4091,7 +4089,7 @@ static void rdma_accept_incoming_migration(void *opaque)
>   ret = qemu_rdma_accept(rdma);
>   
>   if (ret < 0) {
> -fprintf(stderr, "RDMA ERROR: Migration initialization failed\n");
> +error_report("RDMA ERROR: Migration initialization failed");
>   return;
>   }
>   
> @@ -4103,7 

Re: [PATCH v2 51/53] migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:20, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
> 
> qemu_rdma_source_init(), qemu_rdma_connect(),
> rdma_start_incoming_migration(), and rdma_start_outgoing_migration()
> violate this principle: they call error_report() via
> qemu_rdma_cleanup().
> 
> Moreover, qemu_rdma_cleanup() can't fail.  It is called on error
> paths, and QIOChannel close and finalization.  Are the conditions it
> reports really errors?  I doubt it.
> 
> Downgrade qemu_rdma_cleanup()'s errors to warnings.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Li Zhijian 


> ---
>   migration/rdma.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 4e4d818460..54b59d12b1 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2358,9 +2358,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
>  .type = RDMA_CONTROL_ERROR,
>  .repeat = 1,
>};
> -error_report("Early error. Sending error.");
> +warn_report("Early error. Sending error.");
>   if (qemu_rdma_post_send_control(rdma, NULL, , ) < 0) {
> -error_report_err(err);
> +warn_report_err(err);
>   }
>   }
>   

Re: [PATCH 39/52] migration/rdma: Convert qemu_rdma_write_one() to Error

2023-10-06 Thread Zhijian Li (Fujitsu)
+rdma-core


Is the global variable *errno* reliable when the documentation only states
"returns 0 on success, or the value of errno on failure (which indicates the
failure reason)"?

Some read this as "assigns the error code to errno and returns it", and I used
to think the same way, but ibv_post_send() doesn't always follow this rule;
see ibv_post_send() -> mana_post_send().

QEMU actually uses errno after calling libibverbs APIs, so we hope the man
page can be made clearer, the way POSIX's is:

RETURN VALUE
Upon successful completion fopen(), fdopen() and freopen() return a 
FILE pointer.  Otherwise, NULL is returned and errno is set to indicate the 
error
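The hazard reduces to a small self-contained example. mock_post_send() below is made up for this sketch (it is not a libibverbs function); it mirrors mana_post_send() by reporting failure only through its return value and never touching errno:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Toy model of the ambiguity: a verbs-style call that "returns the value
 * of errno on failure" in the sense of returning an error code, without
 * ever assigning to the errno variable.
 */
static int mock_post_send(void)
{
    return EOPNOTSUPP;   /* error code returned; errno left untouched */
}

/* Safe pattern: diagnose from the returned code, never from errno. */
static const char *post_send_error(int ret)
{
    return strerror(ret);
}
```

A caller that inspects errno after such a call reads whatever stale value the last unrelated syscall left behind, which is exactly the latent bug this series documents.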

Thanks
Zhijian


On 27/09/2023 19:46, Markus Armbruster wrote:
> migration/rdma.c uses errno directly or via perror() after the following
> functions:
> 
> * poll()
> 
>POSIX specifies errno is set on error.  Good.
> 
> * rdma_get_cm_event(), rdma_connect(), rdma_get_cm_event()
> 
>Manual page promises "if an error occurs, errno will be set".  Good.
> 
> * ibv_open_device()
> 
>Manual page does not mention errno.  Using it seems ill-advised.
> 
>qemu_rdma_broken_ipv6_kernel() recovers from EPERM by trying the next
>device.  Wrong if ibv_open_device() doesn't actually set errno.
> 
>What is to be done here?
> 
> * ibv_reg_mr()
> 
>Manual page does not mention errno.  Using it seems ill-advised.
> 
>qemu_rdma_reg_whole_ram_blocks() and qemu_rdma_register_and_get_keys()
>recover from errno = ENOTSUP by retrying with modified @access
>argument.  Wrong if ibv_reg_mr() doesn't actually set errno.
> 
>What is to be done here?
> 
> * ibv_get_cq_event()
> 
>Manual page does not mention errno.  Using it seems ill-advised.
> 
>qemu_rdma_block_for_wrid() calls perror().  Removed in PATCH 48.  Good
>enough.
> 
> * ibv_post_send()
> 
>Manual page has the function return "the value of errno on failure".
>Sounds like it sets errno to the value it returns.  However, the
>rdma-core repository defines it as
> 
>  static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr 
> *wr,
>  struct ibv_send_wr **bad_wr)
>  {
>  return qp->context->ops.post_send(qp, wr, bad_wr);
>  }
> 
>and at least one of the methods fails without setting errno:
> 
>  static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
>struct ibv_send_wr **bad)
>  {
>  /* This version of driver supports RAW QP only.
>   * Posting WR is done directly in the application.
>   */
>  return EOPNOTSUPP;
>  }
> 
>qemu_rdma_write_one() calls perror().  PATCH 39 (this one) replaces it
>by error_setg(), not error_setg_errno().  Seems prudent, but should be
>called out in the commit message.
> 
> * ibv_advise_mr()
> 
>Manual page has the function return "the value of errno on failure".
>Sounds like it sets errno to the value it returns, but my findings for
>ibv_post_send() make me doubt it.
> 
>qemu_rdma_advise_prefetch_mr() traces strerror(errno).  Could be
>misleading.  Drop that part?
> 
> * ibv_dereg_mr()
> 
>Manual page has the function return "the value of errno on failure".
>Sounds like it sets errno to the value it returns, but my findings for
>ibv_post_send() make me doubt it.
> 
>qemu_rdma_unregister_waiting() calls perror().  Removed in PATCH 51.
>Good enough.
> 
> * qemu_get_cm_event_timeout()
> 
>Can fail without setting errno.
> 
>qemu_rdma_connect() calls perror().  Removed in PATCH 45.  Good
>enough.
> 
> Thoughts?
> 
> 
> [...]
> 
> [*] https://github.com/linux-rdma/rdma-core.git
>  commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf
> 

Re: [PATCH v2 07/53] migration/rdma: Clean up two more harmless signed vs. unsigned issues

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:19, Markus Armbruster wrote:
> qemu_rdma_exchange_get_response() compares int parameter @expecting
> with uint32_t head->type.  Actual arguments are non-negative
> enumeration constants, RDMAControlHeader uint32_t member type, or
> qemu_rdma_exchange_recv() int parameter expecting.  Actual arguments
> for the latter are non-negative enumeration constants.  Change both
> parameters to uint32_t.
> 
> In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and
> counts from 0 up to @niov, which is size_t.  Change @i to size_t.
> 
> While there, make qio_channel_rdma_readv() and
> qio_channel_rdma_writev() more consistent: change the former's @done
> to ssize_t, and delete the latter's useless initialization of @len.
> 
> Signed-off-by: Markus Armbruster
> Reviewed-by: Fabiano Rosas

Reviewed-by: Li Zhijian 

Re: [Virtio-fs] (no subject)

2023-10-06 Thread Yajun Wu



On 10/6/2023 6:34 PM, Michael S. Tsirkin wrote:



On Fri, Oct 06, 2023 at 11:47:55AM +0200, Hanna Czenczek wrote:

On 06.10.23 11:26, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 11:15:55AM +0200, Hanna Czenczek wrote:

On 06.10.23 10:45, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 09:48:14AM +0200, Hanna Czenczek wrote:

On 05.10.23 19:15, Michael S. Tsirkin wrote:

On Thu, Oct 05, 2023 at 01:08:52PM -0400, Stefan Hajnoczi wrote:

On Wed, Oct 04, 2023 at 02:58:57PM +0200, Hanna Czenczek wrote:

There is no clearly defined purpose for the virtio status byte in
vhost-user: For resetting, we already have RESET_DEVICE; and for virtio
feature negotiation, we have [GS]ET_FEATURES.  With the REPLY_ACK
protocol extension, it is possible for SET_FEATURES to return errors
(SET_PROTOCOL_FEATURES may be called before SET_FEATURES).

As for implementations, SET_STATUS is not widely implemented.  dpdk does
implement it, but only uses it to signal feature negotiation failure.
While it does log reset requests (SET_STATUS 0) as such, it effectively
ignores them, in contrast to RESET_OWNER (which is deprecated, and today
means the same thing as RESET_DEVICE).

While qemu superficially has support for [GS]ET_STATUS, it does not
forward the guest-set status byte, but instead just makes it up
internally, and actually completely ignores what the back-end returns,
only using it as the template for a subsequent SET_STATUS to add single
bits to it.  Notably, after setting FEATURES_OK, it never reads it back
to see whether the flag is still set, which is the only way in which
dpdk uses the status byte.

As-is, no front-end or back-end can rely on the other side handling this
field in a useful manner, and it also provides no practical use over
other mechanisms the vhost-user protocol has, which are more clearly
defined.  Deprecate it.

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
 docs/interop/vhost-user.rst | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

Reviewed-by: Stefan Hajnoczi 

SET_STATUS is the only way to signal failure to acknowledge FEATURES_OK.
The fact current backends never check errors does not mean they never
will. So no, not applying this.

Can this not be done with REPLY_ACK?  I.e., with the following message
order:

1. GET_FEATURES to find out whether VHOST_USER_F_PROTOCOL_FEATURES is
present
2. GET_PROTOCOL_FEATURES to hopefully get VHOST_USER_PROTOCOL_F_REPLY_ACK
3. SET_PROTOCOL_FEATURES to set VHOST_USER_PROTOCOL_F_REPLY_ACK
4. SET_FEATURES with need_reply

If not, the problem is that qemu has sent SET_STATUS 0 for a while when the
vCPUs are stopped, which generally seems to request a device reset.  If we
don’t state at least that SET_STATUS 0 is to be ignored, back-ends that will
implement SET_STATUS later may break with at least these qemu versions.  But
documenting that a particular use of the status byte is to be ignored would
be really strange.

Hanna

Hmm I guess. Though just following virtio spec seems cleaner to me...
vhost-user reconfigures the state fully on start.

Not the internal device state, though.  virtiofsd has internal state, and
other devices like vhost-gpu back-ends would probably, too.

Stefan has recently sent a series
(https://lists.nongnu.org/archive/html/qemu-devel/2023-10/msg00709.html) to
put the reset (RESET_DEVICE) into virtio_reset() (when we really need a
reset).

I really don’t like our current approach with the status byte. Following the
virtio specification to me would mean that the guest directly controls this
byte, which it does not.  qemu makes up values as it deems appropriate, and
this includes sending a SET_STATUS 0 when the guest is just paused, i.e.
when the guest really doesn’t want a device reset.

That means that qemu does not treat this as a virtio device field (because
that would mean exposing it to the guest driver), but instead treats it as
part of the vhost(-user) protocol.  It doesn’t feel right to me that we use
a virtio-defined feature for communication on the vhost level, i.e. between
front-end and back-end, and not between guest driver and device.  I think
all vhost-level protocol features should be fully defined in the vhost-user
specification, which REPLY_ACK is.

Hmm that makes sense. Maybe we should have done what stefan's patch
is doing.

Do look at the original commit that introduced it to understand why
it was added.

I don’t understand why this was added to the stop/cont code, though.  If it
is time consuming to make these changes, why are they done every time the VM
is paused
and resumed?  It makes sense that this would be done for the initial
configuration (where a reset also wouldn’t hurt), but here it seems wrong.

(To be clear, a reset in the stop/cont code is wrong, because it breaks
stateful devices.)

Also, note the newer commits 6f8be29ec17 and c3716f260bf.  The reset as
originally introduced was 

Re: [PATCH v2 06/53] migration/rdma: Fix unwanted integer truncation

2023-10-06 Thread Zhijian Li (Fujitsu)


On 28/09/2023 21:19, Markus Armbruster wrote:
> qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill()
> to an int variable before it adds it to @done / subtracts it from
> @want, both size_t.  Truncation when qemu_rdma_fill() copies more than
> INT_MAX bytes.  Seems vanishingly unlikely, but needs fixing all the
> same.
> 
> Fixes: 6ddd2d76ca6f (migration: convert RDMA to use QIOChannel interface)
> Signed-off-by: Markus Armbruster

Reviewed-by: Li Zhijian 

[PATCH v4] hw/cxl: Add QTG _DSM support for ACPI0017 device

2023-10-06 Thread Dave Jiang
Add simple _DSM call support for the ACPI0017 device that returns fake QTG
ID values of 0 and 1 in all cases. This is for testing _DSM plumbing from the OS.

Following edited for readability

Device (CXLM)
{
Name (_HID, "ACPI0017")  // _HID: Hardware ID
...
Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
{
If ((Arg0 == ToUUID ("f365f9a6-a7de-4071-a66a-b40c0b4f8e52")))
{
If ((Arg2 == Zero))
{
Return (Buffer (One) { 0x01 })
}

If ((Arg2 == One))
{
Return (Package (0x02)
{
One,
Package (0x02)
{
Zero,
One
}
})
}
}
}

Signed-off-by: Dave Jiang 
Signed-off-by: Jonathan Cameron 

--
v4: Change to package of ints rather than buffers. Also moved to 2 QTG IDs
to improve kernel side testing. Tested on x86 qemu guest against kernel
QTG ID _DSM parsing code to be upstreamed.

v3: Fix output assignment to be BE host friendly. Fix typo in comment.
According to the CXL spec, the DSM output should be 1 WORD to indicate
the max supported QTG ID and a package of 0 or more WORDs for the QTG IDs.
In this dummy implementation, the first WORD is 1 to indicate a max
supported QTG ID of 1, and the second WORD, in a package, indicates a QTG
ID of 0.

v2: Minor edit to drop reference to switches in patch description.
Message-Id: <20230904161847.18468-3-jonathan.came...@huawei.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/cxl.c |   69 +
 hw/i386/acpi-build.c  |1 +
 include/hw/acpi/cxl.h |1 +
 3 files changed, 71 insertions(+)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 92b46bc9323b..9cd7905ea25a 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -30,6 +30,75 @@
 #include "qapi/error.h"
 #include "qemu/uuid.h"
 
+void build_cxl_dsm_method(Aml *dev)
+{
+Aml *method, *ifctx, *ifctx2;
+
+method = aml_method("_DSM", 4, AML_SERIALIZED);
+{
+Aml *function, *uuid;
+
+uuid = aml_arg(0);
+function = aml_arg(2);
+/* CXL spec v3.0 9.17.3.1 _DSM Function for Retrieving QTG ID */
+ifctx = aml_if(aml_equal(
+uuid, aml_touuid("F365F9A6-A7DE-4071-A66A-B40C0B4F8E52")));
+
+/* Function 0, standard DSM query function */
+ifctx2 = aml_if(aml_equal(function, aml_int(0)));
+{
+uint8_t byte_list[1] = { 0x01 }; /* function 1 only */
+
+aml_append(ifctx2,
+   aml_return(aml_buffer(sizeof(byte_list), byte_list)));
+}
+aml_append(ifctx, ifctx2);
+
+/*
+ * Function 1
+ * Creating a package with static values. The max supported QTG ID will
+ * be 1 and recommended QTG IDs are 0 and then 1.
+ * The values here are statically created to simplify emulation. Values
+ * from a real BIOS would be determined by the performance of all the
+ * present CXL memory and then assigned.
+ */
+ifctx2 = aml_if(aml_equal(function, aml_int(1)));
+{
+Aml *pak, *pak1;
+
+/*
+ * Return: A package containing two elements - a WORD that returns
+ * the maximum throttling group that the platform supports, and a
+ * package containing the QTG ID(s) that the platform recommends.
+ * Package {
+ * Max Supported QTG ID
+ * Package {QTG Recommendations}
+ * }
+ *
+ * While the SPEC specified WORD that hints at the value being
+ * 16bit, the ACPI dump of BIOS DSDT table showed that the values
+ * are integers with no specific size specification. aml_int() will
+ * be used for the values.
+ */
+pak1 = aml_package(2);
+/* Set QTG ID of 0 */
+aml_append(pak1, aml_int(0));
+/* Set QTG ID of 1 */
+aml_append(pak1, aml_int(1));
+
+pak = aml_package(2);
+/* Set Max QTG 1 */
+aml_append(pak, aml_int(1));
+aml_append(pak, pak1);
+
+aml_append(ifctx2, aml_return(pak));
+}
+aml_append(ifctx, ifctx2);
+}
+aml_append(method, ifctx);
+aml_append(dev, method);
+}
+
 static void cedt_build_chbs(GArray *table_data, PXBCXLDev *cxl)
 {
 PXBDev *pxb = PXB_DEV(cxl);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 95199c89008a..692af40b1a75 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1422,6 +1422,7 @@ static void build_acpi0017(Aml *table)
 method = aml_method("_STA", 0, AML_NOTSERIALIZED);
 aml_append(method, aml_return(aml_int(0x01)));
 aml_append(dev, method);
+build_cxl_dsm_method(dev);
 

Re: [PATCH 2/3] hw/pci-host: Add emulation of Mai Logic Articia S

2023-10-06 Thread Volker Rümelin
Am 06.10.23 um 00:13 schrieb BALATON Zoltan:
> The Articia S is a generic chipset supporting several different CPUs
> that were used on some PPC boards. This is a minimal emulation of the
> parts needed for emulating the AmigaOne board.
>
> Signed-off-by: BALATON Zoltan 
> ---
>  hw/pci-host/Kconfig   |   5 +
>  hw/pci-host/articia.c | 266 ++
>  hw/pci-host/meson.build   |   2 +
>  include/hw/pci-host/articia.h |  17 +++
>  4 files changed, 290 insertions(+)
>  create mode 100644 hw/pci-host/articia.c
>  create mode 100644 include/hw/pci-host/articia.h
> diff --git a/hw/pci-host/articia.c b/hw/pci-host/articia.c
> new file mode 100644
> index 00..80558e1c47
> --- /dev/null
> +++ b/hw/pci-host/articia.c
> @@ -0,0 +1,266 @@
> +/*
> + * Mai Logic Articia S emulation
> + *
> + * Copyright (c) 2023 BALATON Zoltan
> + *
> + * This work is licensed under the GNU GPL license version 2 or later.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "qapi/error.h"
> +#include "hw/pci/pci_device.h"
> +#include "hw/pci/pci_host.h"
> +#include "hw/irq.h"
> +#include "hw/i2c/bitbang_i2c.h"
> +#include "hw/intc/i8259.h"
> +#include "hw/pci-host/articia.h"
> +
> +OBJECT_DECLARE_SIMPLE_TYPE(ArticiaState, ARTICIA)
> +
> +OBJECT_DECLARE_SIMPLE_TYPE(ArticiaHostState, ARTICIA_PCI_HOST)
> +struct ArticiaHostState {
> +PCIDevice parent_obj;
> +
> +ArticiaState *as;
> +};
> +
> +/* TYPE_ARTICIA */
> +
> +struct ArticiaState {
> +PCIHostState parent_obj;
> +
> +qemu_irq irq[PCI_NUM_PINS];
> +MemoryRegion io;
> +MemoryRegion mem;
> +MemoryRegion reg;
> +
> +bitbang_i2c_interface smbus;
> +uint32_t gpio; /* bits 0-7 in, 8-15 out, 16-23 direction (0 in, 1 out) */
> +hwaddr gpio_base;
> +MemoryRegion gpio_reg;
> +};
> +
> +static uint64_t articia_gpio_read(void *opaque, hwaddr addr, unsigned int 
> size)
> +{
> +ArticiaState *s = opaque;
> +
> +return (s->gpio >> (addr * 8)) & 0xff;
> +}
> +
> +static void articia_gpio_write(void *opaque, hwaddr addr, uint64_t val,
> +   unsigned int size)
> +{
> +ArticiaState *s = opaque;
> +uint32_t sh = addr * 8;
> +
> +if (addr == 0) {
> +/* in bits read only? */
> +return;
> +}
> +
> +if ((s->gpio & (0xff << sh)) != (val & 0xff) << sh) {
> +s->gpio &= ~(0xff << sh | 0xff);
> +s->gpio |= (val & 0xff) << sh;
> +s->gpio |= bitbang_i2c_set(&s->smbus, BITBANG_I2C_SDA,
> +   s->gpio & BIT(16) ?
> +   !!(s->gpio & BIT(8)) : 1);
> +if ((s->gpio & BIT(17))) {
> +s->gpio &= ~BIT(0);
> +s->gpio |= bitbang_i2c_set(&s->smbus, BITBANG_I2C_SCL,
> +   !!(s->gpio & BIT(9)));
> +}
> +}
> +}
> +
> +static const MemoryRegionOps articia_gpio_ops = {
> +.read = articia_gpio_read,
> +.write = articia_gpio_write,
> +.valid.min_access_size = 1,
> +.valid.max_access_size = 1,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static uint64_t articia_reg_read(void *opaque, hwaddr addr, unsigned int 
> size)
> +{
> +ArticiaState *s = opaque;
> +uint64_t ret = UINT_MAX;
> +
> +switch (addr) {
> +case 0xc00cf8:
> +ret = pci_host_conf_le_ops.read(PCI_HOST_BRIDGE(s), 0, size);
> +break;
> +case 0xe00cfc ... 0xe00cff:
> +ret = pci_host_data_le_ops.read(PCI_HOST_BRIDGE(s), addr - 0xe00cfc, 
> size);
> +break;
> +case 0xf0:
> +ret = pic_read_irq(isa_pic);
> +break;
> +default:
> +qemu_log_mask(LOG_UNIMP, "%s: Unimplemented register read 0x%"
> +  HWADDR_PRIx " %d\n", __func__, addr, size);
> +break;
> +}
> +return ret;
> +}
> +
> +static void articia_reg_write(void *opaque, hwaddr addr, uint64_t val,
> +  unsigned int size)
> +{
> +ArticiaState *s = opaque;
> +
> +switch (addr) {
> +case 0xc00cf8:
> +pci_host_conf_le_ops.write(PCI_HOST_BRIDGE(s), 0, val, size);
> +break;
> +case 0xe00cfc ... 0xe00cff:
> +pci_host_data_le_ops.write(PCI_HOST_BRIDGE(s), addr, val, size);
> +break;
> +default:
> +qemu_log_mask(LOG_UNIMP, "%s: Unimplemented register write 0x%"
> +  HWADDR_PRIx " %d <- %"PRIx64"\n", __func__, addr, 
> size, val);
> +break;
> +}
> +}
> +
> +static const MemoryRegionOps articia_reg_ops = {
> +.read = articia_reg_read,
> +.write = articia_reg_write,
> +.valid.min_access_size = 1,
> +.valid.max_access_size = 4,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static void articia_pcihost_set_irq(void *opaque, int n, int level)
> +{
> +ArticiaState *s = opaque;
> +qemu_set_irq(s->irq[n], level);
> +}
> +
> +static void articia_realize(DeviceState *dev, Error **errp)
> +{
> +   

Re: [PATCH 4/4] Python: Enable python3.12 support

2023-10-06 Thread John Snow
On Fri, Oct 6, 2023 at 4:40 PM Vladimir Sementsov-Ogievskiy
 wrote:
>
> On 06.10.23 22:52, John Snow wrote:
> > Python 3.12 has released, so update the test infrastructure to test
> > against this version. Update the configure script to look for it when an
> > explicit Python interpreter isn't chosen.
> >
> > Signed-off-by: John Snow 
> > ---
> >   configure  | 3 ++-
> >   python/setup.cfg   | 3 ++-
> >   tests/docker/dockerfiles/python.docker | 6 +-
> >   3 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/configure b/configure
> > index e9a921ffb0..b480a3d6ae 100755
> > --- a/configure
> > +++ b/configure
> > @@ -561,7 +561,8 @@ first_python=
> >   if test -z "${PYTHON}"; then
> >   # A bare 'python' is traditionally python 2.x, but some distros
> >   # have it as python 3.x, so check in both places.
> > -for binary in python3 python python3.11 python3.10 python3.9 
> > python3.8; do
> > +for binary in python3 python python3.12 python3.11 \
> > +  python3.10 python3.9 python3.8; do
> >   if has "$binary"; then
> >   python=$(command -v "$binary")
> >   if check_py_version "$python"; then
> > diff --git a/python/setup.cfg b/python/setup.cfg
> > index 8c67dce457..48668609d3 100644
> > --- a/python/setup.cfg
> > +++ b/python/setup.cfg
> > @@ -18,6 +18,7 @@ classifiers =
> >   Programming Language :: Python :: 3.9
> >   Programming Language :: Python :: 3.10
> >   Programming Language :: Python :: 3.11
> > +Programming Language :: Python :: 3.12
> >   Typing :: Typed
> >
> >   [options]
> > @@ -182,7 +183,7 @@ multi_line_output=3
> >   # of python available on your system to run this test.
> >
> >   [tox:tox]
> > -envlist = py38, py39, py310, py311
> > +envlist = py38, py39, py310, py311, py312
> >   skip_missing_interpreters = true
> >
> >   [testenv]
> > diff --git a/tests/docker/dockerfiles/python.docker 
> > b/tests/docker/dockerfiles/python.docker
> > index 383ccbdc3a..a3c1321190 100644
> > --- a/tests/docker/dockerfiles/python.docker
> > +++ b/tests/docker/dockerfiles/python.docker
> > @@ -11,7 +11,11 @@ ENV PACKAGES \
> >   python3-pip \
> >   python3-tox \
> >   python3-virtualenv \
> > -python3.10
> > +python3.10 \
> > +python3.11 \
> > +python3.12 \
> > +python3.8 \
> > +python3.9
>
> Hmm, interesting, how did it work before? Only 3.10 was tested?

I was relying on dependencies to pull in other versions -- tox usually
pulls in all but the very latest version. I made it explicit instead.
I can explain this in the commit.

>
> >
> >   RUN dnf install -y $PACKAGES
> >   RUN rpm -q $PACKAGES | sort > /packages.txt
>
> weak, I'm unsure about how this all works, I just see that 3.12 is added like 
> others in all hunks except python.docker, but I think adding several python 
> versions to docker should be safe anyway:
> Reviewed-by: Vladimir Sementsov-Ogievskiy 

Thanks -- and I see your patches for iotests. I'll try to look into
them shortly.

>
> --
> Best regards,
> Vladimir
>




Re: [Virtio-fs] (no subject)

2023-10-06 Thread Alex Bennée


Hanna Czenczek  writes:

> On 06.10.23 17:17, Alex Bennée wrote:
>> Hanna Czenczek  writes:
>>
>>> On 06.10.23 12:34, Michael S. Tsirkin wrote:
 On Fri, Oct 06, 2023 at 11:47:55AM +0200, Hanna Czenczek wrote:
> On 06.10.23 11:26, Michael S. Tsirkin wrote:
>> On Fri, Oct 06, 2023 at 11:15:55AM +0200, Hanna Czenczek wrote:
>>> On 06.10.23 10:45, Michael S. Tsirkin wrote:
 On Fri, Oct 06, 2023 at 09:48:14AM +0200, Hanna Czenczek wrote:
> On 05.10.23 19:15, Michael S. Tsirkin wrote:
>> On Thu, Oct 05, 2023 at 01:08:52PM -0400, Stefan Hajnoczi wrote:
>>> On Wed, Oct 04, 2023 at 02:58:57PM +0200, Hanna Czenczek wrote:
>> 
> What I’m saying is, 923b8921d21 introduced SET_STATUS calls that broke all
> devices that would implement them as per virtio spec, and even today it’s
> broken for stateful devices.  The mentioned performance issue is likely
> real, but we can’t address it by making up SET_STATUS calls that are 
> wrong.
>
> I concede that I didn’t think about DRIVER_OK.  Personally, I would do all
> final configuration that would happen upon a DRIVER_OK once the first 
> vring
> is started (i.e. receives a kick).  That has the added benefit of being
> asynchronous because it doesn’t block any vhost-user messages (which are
> synchronous, and thus block downtime).
>
> Hanna
 For better or worse kick is per ring. It's out of spec to start rings
 that were not kicked but I guess you could do configuration ...
 Seems somewhat asymmetrical though.
>>> I meant to take the first ring being started as the signal to do the
>>> global configuration, i.e. not do this once per vring, but once
>>> globally.
>>>
 Let's wait until next week, hopefully Yajun Wu will answer.
>>> I mean, personally I don’t really care about the whole SET_STATUS
>>> thing.  It’s clear that it’s broken for stateful devices.  The fact
>>> that it took until 6f8be29ec17d to fix it for just any device that
>>> would implement it according to spec to me is a strong indication that
>>> nobody does implement it according to spec, and is currently only used
>>> to signal to some specific back-end that all rings have been set up
>>> and should be configured in a single block.
>> I'm certainly using [GS]ET_STATUS for the proposed F_TRANSPORT
>> extensions where everything is off-loaded to the vhost-user backend.
>
> How do these back-ends work with the fact that qemu uses SET_STATUS
> incorrectly when not offloading?  Do you plan on fixing that?

Mainly having a common base implementation which does it right and
having very lightweight derivations for legacy stubs using it. The
aim is to eliminate the need for QEMU stubs entirely by fully specifying
the device from the vhost-user API. 

> (I.e. that we send SET_STATUS 0 when the VM is paused, potentially
> resetting state that is not recoverable, and that we set DRIVER and
> DRIVER_OK simultaneously.)

This is QEMU simulating a SET_STATUS rather than the guest triggering
it?

>
> Hanna


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH 4/4] Python: Enable python3.12 support

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 23:39, Vladimir Sementsov-Ogievskiy wrote:

On 06.10.23 22:52, John Snow wrote:

Python 3.12 has released, so update the test infrastructure to test
against this version. Update the configure script to look for it when an
explicit Python interpreter isn't chosen.

Signed-off-by: John Snow 
---
  configure  | 3 ++-
  python/setup.cfg   | 3 ++-
  tests/docker/dockerfiles/python.docker | 6 +-
  3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index e9a921ffb0..b480a3d6ae 100755
--- a/configure
+++ b/configure
@@ -561,7 +561,8 @@ first_python=
  if test -z "${PYTHON}"; then
  # A bare 'python' is traditionally python 2.x, but some distros
  # have it as python 3.x, so check in both places.
-    for binary in python3 python python3.11 python3.10 python3.9 python3.8; do
+    for binary in python3 python python3.12 python3.11 \
+  python3.10 python3.9 python3.8; do
  if has "$binary"; then
  python=$(command -v "$binary")
  if check_py_version "$python"; then
diff --git a/python/setup.cfg b/python/setup.cfg
index 8c67dce457..48668609d3 100644
--- a/python/setup.cfg
+++ b/python/setup.cfg
@@ -18,6 +18,7 @@ classifiers =
  Programming Language :: Python :: 3.9
  Programming Language :: Python :: 3.10
  Programming Language :: Python :: 3.11
+    Programming Language :: Python :: 3.12
  Typing :: Typed
  [options]
@@ -182,7 +183,7 @@ multi_line_output=3
  # of python available on your system to run this test.
  [tox:tox]
-envlist = py38, py39, py310, py311
+envlist = py38, py39, py310, py311, py312
  skip_missing_interpreters = true
  [testenv]
diff --git a/tests/docker/dockerfiles/python.docker 
b/tests/docker/dockerfiles/python.docker
index 383ccbdc3a..a3c1321190 100644
--- a/tests/docker/dockerfiles/python.docker
+++ b/tests/docker/dockerfiles/python.docker
@@ -11,7 +11,11 @@ ENV PACKAGES \
  python3-pip \
  python3-tox \
  python3-virtualenv \
-    python3.10
+    python3.10 \
+    python3.11 \
+    python3.12 \
+    python3.8 \
+    python3.9


Hmm, interesting, how did it work before? Only 3.10 was tested?


  RUN dnf install -y $PACKAGES
  RUN rpm -q $PACKAGES | sort > /packages.txt


weak, I'm unsure about how this all works, I just see that 3.12 is added like 
others in all hunks except python.docker, but I think adding several python 
versions to docker should be safe anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 



I meant, r-b is weak, not the patch :)

--
Best regards,
Vladimir




[PATCH] hw/display: fix memleak from virtio_add_resource

2023-10-06 Thread Matheus Tavares Bernardino
When the given uuid is already present in the hash table,
virtio_add_resource() does not add the passed VirtioSharedObject. In
this case, free it in the callers to avoid leaking memory. This fixes
the following `make check` error, when built with --enable-sanitizers:

  4/166 qemu:unit / test-virtio-dmabuf   ERROR 1.51s   exit status 1

  ==7716==ERROR: LeakSanitizer: detected memory leaks
  Direct leak of 320 byte(s) in 20 object(s) allocated from:
  #0 0x7f6fc16e3808 in __interceptor_malloc 
../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
  #1 0x7f6fc1503e98 in g_malloc 
(/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98)
  #2 0x564d63cafb6b in test_add_invalid_resource 
../tests/unit/test-virtio-dmabuf.c:100
  #3 0x7f6fc152659d  (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a59d)
  SUMMARY: AddressSanitizer: 320 byte(s) leaked in 20 allocation(s).

The changes in virtio_add_resource() itself are not strictly necessary
for the memleak fix, but they make it more obvious that, on an error
return, the passed object is not added to the hash table.

Signed-off-by: Matheus Tavares Bernardino 
---
 hw/display/virtio-dmabuf.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-dmabuf.c b/hw/display/virtio-dmabuf.c
index 4a8e430f3d..3dba4577ca 100644
--- a/hw/display/virtio-dmabuf.c
+++ b/hw/display/virtio-dmabuf.c
@@ -29,7 +29,7 @@ static int uuid_equal_func(const void *lhv, const void *rhv)
 
 static bool virtio_add_resource(QemuUUID *uuid, VirtioSharedObject *value)
 {
-bool result = false;
+bool result = true;
 
g_mutex_lock(&lock);
 if (resource_uuids == NULL) {
@@ -39,7 +39,9 @@ static bool virtio_add_resource(QemuUUID *uuid, 
VirtioSharedObject *value)
g_free);
 }
 if (g_hash_table_lookup(resource_uuids, uuid) == NULL) {
-result = g_hash_table_insert(resource_uuids, uuid, value);
+g_hash_table_insert(resource_uuids, uuid, value);
+} else {
+result = false;
 }
g_mutex_unlock(&lock);
 
@@ -57,6 +59,9 @@ bool virtio_add_dmabuf(QemuUUID *uuid, int udmabuf_fd)
 vso->type = TYPE_DMABUF;
 vso->value = GINT_TO_POINTER(udmabuf_fd);
 result = virtio_add_resource(uuid, vso);
+if (!result) {
+g_free(vso);
+}
 
 return result;
 }
@@ -72,6 +77,9 @@ bool virtio_add_vhost_device(QemuUUID *uuid, struct vhost_dev 
*dev)
 vso->type = TYPE_VHOST_DEV;
 vso->value = dev;
 result = virtio_add_resource(uuid, vso);
+if (!result) {
+g_free(vso);
+}
 
 return result;
 }
-- 
2.37.2




Re: [PATCH 4/4] Python: Enable python3.12 support

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 22:52, John Snow wrote:

Python 3.12 has released, so update the test infrastructure to test
against this version. Update the configure script to look for it when an
explicit Python interpreter isn't chosen.

Signed-off-by: John Snow 
---
  configure  | 3 ++-
  python/setup.cfg   | 3 ++-
  tests/docker/dockerfiles/python.docker | 6 +-
  3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index e9a921ffb0..b480a3d6ae 100755
--- a/configure
+++ b/configure
@@ -561,7 +561,8 @@ first_python=
  if test -z "${PYTHON}"; then
  # A bare 'python' is traditionally python 2.x, but some distros
  # have it as python 3.x, so check in both places.
-for binary in python3 python python3.11 python3.10 python3.9 python3.8; do
+for binary in python3 python python3.12 python3.11 \
+  python3.10 python3.9 python3.8; do
  if has "$binary"; then
  python=$(command -v "$binary")
  if check_py_version "$python"; then
diff --git a/python/setup.cfg b/python/setup.cfg
index 8c67dce457..48668609d3 100644
--- a/python/setup.cfg
+++ b/python/setup.cfg
@@ -18,6 +18,7 @@ classifiers =
  Programming Language :: Python :: 3.9
  Programming Language :: Python :: 3.10
  Programming Language :: Python :: 3.11
+Programming Language :: Python :: 3.12
  Typing :: Typed
  
  [options]

@@ -182,7 +183,7 @@ multi_line_output=3
  # of python available on your system to run this test.
  
  [tox:tox]

-envlist = py38, py39, py310, py311
+envlist = py38, py39, py310, py311, py312
  skip_missing_interpreters = true
  
  [testenv]

diff --git a/tests/docker/dockerfiles/python.docker 
b/tests/docker/dockerfiles/python.docker
index 383ccbdc3a..a3c1321190 100644
--- a/tests/docker/dockerfiles/python.docker
+++ b/tests/docker/dockerfiles/python.docker
@@ -11,7 +11,11 @@ ENV PACKAGES \
  python3-pip \
  python3-tox \
  python3-virtualenv \
-python3.10
+python3.10 \
+python3.11 \
+python3.12 \
+python3.8 \
+python3.9


Hmm, interesting, how did it work before? Only 3.10 was tested?

  
  RUN dnf install -y $PACKAGES

  RUN rpm -q $PACKAGES | sort > /packages.txt


weak, I'm unsure about how this all works, I just see that 3.12 is added like 
others in all hunks except python.docker, but I think adding several python 
versions to docker should be safe anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir




Re: [PATCH 3/4] configure: fix error message to say Python 3.8

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 22:52, John Snow wrote:

Signed-off-by: John Snow 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


--
Best regards,
Vladimir




Re: [PATCH 2/4] python/qmp: remove Server.wait_closed() call for Python 3.12

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 22:52, John Snow wrote:

This patch is a backport from
https://gitlab.com/qemu-project/python-qemu-qmp/-/commit/e03a3334b6a477beb09b293708632f2c06fe9f61

According to Guido in https://github.com/python/cpython/issues/104344 ,
this call was never meant to wait for the server to shut down - that is
handled synchronously - but instead, this waits for all connections to
close. Or, it would have, if it wasn't broken since it was introduced.

3.12 fixes the bug, which now causes a hang in our code. The fix is just
to remove the wait.

Signed-off-by: John Snow 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


--
Best regards,
Vladimir




[PATCH] hw/ppc: Add nest1 chiplet control scoms

2023-10-06 Thread Chalapathi V
-Create the nest1 chiplet model and add nest1 chiplet control scoms.
-Implementation of the chiplet control scoms is put in pnv_pervasive.c
 as the control scoms are common to all chiplets.

Signed-off-by: Chalapathi V 
---
 hw/ppc/meson.build|   2 +
 hw/ppc/pnv.c  |  11 +++
 hw/ppc/pnv_nest1_chiplet.c| 141 +
 hw/ppc/pnv_pervasive.c| 146 ++
 include/hw/ppc/pnv_chip.h |   2 +
 include/hw/ppc/pnv_nest_chiplet.h |  27 ++
 include/hw/ppc/pnv_pervasive.h|  30 ++
 include/hw/ppc/pnv_xscom.h|   3 +
 8 files changed, 362 insertions(+)
 create mode 100644 hw/ppc/pnv_nest1_chiplet.c
 create mode 100644 hw/ppc/pnv_pervasive.c
 create mode 100644 include/hw/ppc/pnv_nest_chiplet.h
 create mode 100644 include/hw/ppc/pnv_pervasive.h

diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
index 7c2c52434a..541d69cf94 100644
--- a/hw/ppc/meson.build
+++ b/hw/ppc/meson.build
@@ -50,6 +50,8 @@ ppc_ss.add(when: 'CONFIG_POWERNV', if_true: files(
   'pnv_bmc.c',
   'pnv_homer.c',
   'pnv_pnor.c',
+  'pnv_nest1_chiplet.c',
+  'pnv_pervasive.c',
 ))
 # PowerPC 4xx boards
 ppc_ss.add(when: 'CONFIG_PPC405', if_true: files(
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index eb54f93986..0e1c944753 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1660,6 +1660,8 @@ static void pnv_chip_power10_instance_init(Object *obj)
object_initialize_child(obj, "occ",  &chip->occ, TYPE_PNV10_OCC);
object_initialize_child(obj, "sbe",  &chip->sbe, TYPE_PNV10_SBE);
object_initialize_child(obj, "homer", &chip->homer, TYPE_PNV10_HOMER);
+object_initialize_child(obj, "nest1_chiplet", &chip->nest1_chiplet,
+TYPE_PNV_NEST1_CHIPLET);
 
 chip->num_pecs = pcc->num_pecs;
 
@@ -1829,6 +1831,15 @@ static void pnv_chip_power10_realize(DeviceState *dev, 
Error **errp)
 memory_region_add_subregion(get_system_memory(), PNV10_HOMER_BASE(chip),
&chip->homer.regs);
 
+/* nest1 chiplet control regs */
+object_property_set_link(OBJECT(&chip->nest1_chiplet), "chip",
+ OBJECT(chip), &error_abort);
+if (!qdev_realize(DEVICE(&chip->nest1_chiplet), NULL, errp)) {
+return;
+}
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_NEST1_CTRL_CHIPLET_BASE,
+   &chip->nest1_chiplet.xscom_ctrl_regs);
+
 /* PHBs */
pnv_chip_power10_phb_realize(chip, &local_err);
 if (local_err) {
diff --git a/hw/ppc/pnv_nest1_chiplet.c b/hw/ppc/pnv_nest1_chiplet.c
new file mode 100644
index 00..c679428213
--- /dev/null
+++ b/hw/ppc/pnv_nest1_chiplet.c
@@ -0,0 +1,141 @@
+/*
+ * QEMU PowerPC nest1 chiplet model
+ *
+ * Copyright (c) 2023, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/qdev-properties.h"
+
+#include "hw/ppc/pnv.h"
+#include "hw/ppc/pnv_xscom.h"
+#include "hw/ppc/pnv_nest_chiplet.h"
+#include "hw/ppc/pnv_pervasive.h"
+#include "hw/ppc/fdt.h"
+
+#include 
+
+/* This chiplet contains nest1 chiplet control unit. More to come later */
+
+static uint64_t pnv_nest1_chiplet_xscom_read(void *opaque, hwaddr addr,
+ unsigned size)
+{
+PnvNest1Chiplet *nest1_chiplet = PNV_NEST1CHIPLET(opaque);
+int reg = addr >> 3;
+uint64_t val = 0;
+
+switch (reg) {
+case 0x000 ... 0x3FF:
+val = pnv_chiplet_ctrl_read(&nest1_chiplet->ctrl_regs, reg, size);
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: Invalid xscom read at 0x%" PRIx32 "\n",
+  __func__, reg);
+}
+
+return val;
+}
+
+static void pnv_nest1_chiplet_xscom_write(void *opaque, hwaddr addr,
+  uint64_t val, unsigned size)
+{
+PnvNest1Chiplet *nest1_chiplet = PNV_NEST1CHIPLET(opaque);
+int reg = addr >> 3;
+
+switch (reg) {
+case 0x000 ... 0x3FF:
+pnv_chiplet_ctrl_write(&nest1_chiplet->ctrl_regs, reg, val, size);
+break;
+
+default:
+qemu_log_mask(LOG_UNIMP, "%s: Invalid xscom write at 0x%" PRIx32 "\n",
+  __func__, reg);
+return;
+}
+return;
+}
+
+static const MemoryRegionOps pnv_nest1_chiplet_xscom_ops = {
+.read = pnv_nest1_chiplet_xscom_read,
+.write = pnv_nest1_chiplet_xscom_write,
+.valid.min_access_size = 8,
+.valid.max_access_size = 8,
+.impl.min_access_size = 8,
+.impl.max_access_size = 8,
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
+static void pnv_nest1_chiplet_realize(DeviceState *dev, Error **errp)
+{
+PnvNest1Chiplet *nest1_chiplet = PNV_NEST1CHIPLET(dev);
+
+assert(nest1_chiplet->chip);
+
+/* NMMU xscom region */
+pnv_xscom_region_init(&nest1_chiplet->xscom_ctrl_regs,
+  OBJECT(nest1_chiplet), &pnv_nest1_chiplet_xscom_ops,
+  nest1_chiplet, 

Re: [PATCH 1/4] Python/iotests: Add type hint for nbd module

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 22:52, John Snow wrote:

The test bails gracefully if this module isn't installed, but linters
need a little help understanding that. It's enough to just declare the
type in this case.

(Fixes pylint complaining about use of an uninitialized variable because
it isn't wise enough to understand the notrun call is noreturn.)

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir




[PATCH 2/4] qapi: introduce device-sync-config

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Add a command to sync the config from the vhost-user backend to the device.
It may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed, did not
trigger an interrupt to the guest, or is simply not available (not supported
by the vhost-user server).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/block/vhost-user-blk.c | 27 ---
 hw/virtio/virtio-pci.c|  9 +
 include/hw/qdev-core.h|  3 +++
 qapi/qdev.json| 14 ++
 softmmu/qdev-monitor.c| 23 +++
 5 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 1ee05b46ee..05ce7b5684 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -90,27 +90,39 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, 
const uint8_t *config)
 s->blkcfg.wce = blkcfg->wce;
 }
 
+static int vhost_user_blk_sync_config(DeviceState *dev, Error **errp)
+{
+int ret;
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+   vdev->config_len, errp);
+if (ret < 0) {
+return ret;
+}
+
+memcpy(vdev->config, &s->blkcfg, vdev->config_len);
+virtio_notify_config(vdev);
+
+return 0;
+}
+
 static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
 {
 int ret;
-VirtIODevice *vdev = dev->vdev;
-VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
 Error *local_err = NULL;
 
 if (!dev->started) {
 return 0;
 }
 
-ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg,
-   vdev->config_len, &local_err);
+ret = vhost_user_blk_sync_config(DEVICE(dev->vdev), &local_err);
 if (ret < 0) {
 error_report_err(local_err);
 return ret;
 }
 
-memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len);
-virtio_notify_config(dev->vdev);
-
 return 0;
 }
 
@@ -580,6 +592,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, 
void *data)
 
 device_class_set_props(dc, vhost_user_blk_properties);
dc->vmsd = &vmstate_vhost_user_blk;
+dc->sync_config = vhost_user_blk_sync_config;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 vdc->realize = vhost_user_blk_device_realize;
 vdc->unrealize = vhost_user_blk_device_unrealize;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index edbc0daa18..dd4620462b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2315,6 +2315,14 @@ static void virtio_pci_dc_realize(DeviceState *qdev, 
Error **errp)
 vpciklass->parent_dc_realize(qdev, errp);
 }
 
+static int virtio_pci_sync_config(DeviceState *dev, Error **errp)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(dev);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+return qdev_sync_config(DEVICE(vdev), errp);
+}
+
 static void virtio_pci_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -2331,6 +2339,7 @@ static void virtio_pci_class_init(ObjectClass *klass, 
void *data)
 device_class_set_parent_realize(dc, virtio_pci_dc_realize,
&vpciklass->parent_dc_realize);
 rc->phases.hold = virtio_pci_bus_reset_hold;
+dc->sync_config = virtio_pci_sync_config;
 }
 
 static const TypeInfo virtio_pci_info = {
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index a27ef2eb24..6fa5bac86e 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -95,6 +95,7 @@ typedef void (*DeviceUnrealize)(DeviceState *dev);
 typedef void (*DeviceReset)(DeviceState *dev);
 typedef void (*BusRealize)(BusState *bus, Error **errp);
 typedef void (*BusUnrealize)(BusState *bus);
+typedef int (*DeviceSyncConfig)(DeviceState *dev, Error **errp);
 
 /**
  * struct DeviceClass - The base class for all devices.
@@ -162,6 +163,7 @@ struct DeviceClass {
 DeviceReset reset;
 DeviceRealize realize;
 DeviceUnrealize unrealize;
+DeviceSyncConfig sync_config;
 
 /**
  * @vmsd: device state serialisation description for
@@ -555,6 +557,7 @@ bool qdev_hotplug_allowed(DeviceState *dev, Error **errp);
  */
 HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev);
 void qdev_unplug(DeviceState *dev, Error **errp);
+int qdev_sync_config(DeviceState *dev, Error **errp);
 void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
   DeviceState *dev, Error **errp);
 void qdev_machine_creation_done(void);
diff --git a/qapi/qdev.json b/qapi/qdev.json
index fa80694735..2468f8bddf 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -315,3 +315,17 @@
 # Since: 8.2
 ##
 { 'event': 'X_DEVICE_ON', 'data': 'DeviceAndPath' }
+
+##
+# @x-device-sync-config:
+#
+# Sync config from backend to the guest.
+#
+# @id: the device's ID or QOM path
+#
+# Returns: Nothing on success
+#  If @id is not a valid device, DeviceNotFound
+#
+# Since: 8.2
+##
+{ 'command': 

[PATCH 0/4] vhost-user-blk: live resize additional APIs

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
In the vhost-user protocol we have VHOST_USER_BACKEND_CONFIG_CHANGE_MSG,
which the backend may send to notify QEMU that it should re-read the
config and notify the guest.

Still, that's not always convenient: the backend may not support this
message. Also, having a QMP command to force a config sync is more reliable
than waiting for a notification from an external program. It also may be
helpful for debug/restore: if we have changed the disk size but the guest
doesn't see that, it's good to have a separate QMP command to trigger a
resync of the config.

So, the series proposes two experimental APIs:

1. x-device-sync-config command, to trigger config synchronization

2. X_CONFIG_READ event, which notifies the management tool that the guest
has read the updated config. Of course, that can't guarantee that the
guest correctly handled the updated config, but it's still better than
nothing: the guest certainly won't show the new disk size until it has
read the updated config. So, a management tool may wait for this event
before reporting success to the user.
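A management-tool loop using these two APIs might look like the following sketch. The `send`/`recv` transport callables and the device id are hypothetical illustrations, not part of this series:

```python
import json

def sync_config_and_wait(send, recv, dev_id):
    """Force a config resync, then wait for the guest to read it.

    `send` and `recv` are assumed callables carrying QMP JSON lines
    over an already-negotiated monitor connection.
    """
    send(json.dumps({"execute": "x-device-sync-config",
                     "arguments": {"id": dev_id}}))
    reply = json.loads(recv())
    if "error" in reply:
        raise RuntimeError(reply["error"]["desc"])
    # Only report success once the guest has actually read the new config:
    while True:
        msg = json.loads(recv())
        if msg.get("event") == "X_CONFIG_READ":
            return msg["data"]
```

As the cover letter notes, X_CONFIG_READ only proves the guest read the config, not that it handled it correctly.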


The series is based on "[PATCH v8 0/4] pci hotplug tracking": it doesn't
strictly depend on it, but modifies the same files, so this is just to
avoid extra conflicts.
Based-on: <20231005092926.56231-1-vsement...@yandex-team.ru>

Vladimir Sementsov-Ogievskiy (4):
  vhost-user-blk: simplify and fix vhost_user_blk_handle_config_change
  qapi: introduce device-sync-config
  qapi: device-sync-config: check runstate
  qapi: introduce CONFIG_READ event

 hw/block/vhost-user-blk.c | 32 ++-
 hw/virtio/virtio-pci.c| 18 +
 include/hw/qdev-core.h|  3 +++
 include/monitor/qdev.h|  1 +
 include/sysemu/runstate.h |  1 +
 monitor/monitor.c |  1 +
 qapi/qdev.json| 36 ++
 softmmu/qdev-monitor.c| 53 +++
 softmmu/runstate.c|  5 
 9 files changed, 138 insertions(+), 12 deletions(-)

-- 
2.34.1




[PATCH 1/4] vhost-user-blk: simplify and fix vhost_user_blk_handle_config_change

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Let's not care about what exactly was changed and just update the whole
config, for the following reasons:

1. config->geometry should be updated together with capacity, so we fix
   a bug.

2. Vhost-user protocol doesn't say anything about config change
   limitation. Silent ignore of changes doesn't seem to be correct.

3. vhost-user-vsock reads the whole config

4. on realize we don't do any checks on the retrieved config, so there is
   no reason to care here

Also, let's notify guest unconditionally:

1. So does vhost-user-vsock

2. We are going to reuse the functionality in new cases when we do want
   to notify the guest unconditionally. So, no reason to create extra
   branches in the logic.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/block/vhost-user-blk.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index eecf3f7a81..1ee05b46ee 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -93,7 +93,6 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
 static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
 {
 int ret;
-struct virtio_blk_config blkcfg;
 VirtIODevice *vdev = dev->vdev;
 VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
 Error *local_err = NULL;
@@ -102,19 +101,15 @@ static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
 return 0;
 }
 
-ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
+ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg,
                            vdev->config_len, &local_err);
 if (ret < 0) {
 error_report_err(local_err);
 return ret;
 }
 
-/* valid for resize only */
-if (blkcfg.capacity != s->blkcfg.capacity) {
-s->blkcfg.capacity = blkcfg.capacity;
-memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len);
-virtio_notify_config(dev->vdev);
-}
+memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len);
+virtio_notify_config(dev->vdev);
 
 return 0;
 }
-- 
2.34.1




[PATCH 3/4] qapi: device-sync-config: check runstate

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
The command result is racy if we allow it during migration. Let's allow
the sync only in the RUNNING state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/sysemu/runstate.h |  1 +
 softmmu/qdev-monitor.c| 27 ++-
 softmmu/runstate.c|  5 +
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 08afb97695..1fc14c8122 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -5,6 +5,7 @@
 #include "qemu/notify.h"
 
 bool runstate_check(RunState state);
+const char *current_run_state_str(void);
 void runstate_set(RunState new_state);
 RunState runstate_get(void);
 bool runstate_is_running(void);
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index b6da24389f..b485375049 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -23,6 +23,7 @@
 #include "monitor/monitor.h"
 #include "monitor/qdev.h"
 #include "sysemu/arch_init.h"
+#include "sysemu/runstate.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-qdev.h"
 #include "qapi/qapi-events-qdev.h"
@@ -1002,7 +1003,31 @@ int qdev_sync_config(DeviceState *dev, Error **errp)
 
 void qmp_x_device_sync_config(const char *id, Error **errp)
 {
-DeviceState *dev = find_device_state(id, errp);
+MigrationState *s = migrate_get_current();
+DeviceState *dev;
+
+/*
+ * During migration there is a race between syncing config and migrating it,
+ * so let's just not allow it.
+ *
+ * Moreover, let's not rely on setting up interrupts in paused state, which
+ * may be a part of the migration process.
+ */
+
+if (migration_is_running(s->state)) {
+error_setg(errp, "Config synchronization is not allowed "
+   "during migration.");
+return;
+}
+
+if (!runstate_is_running()) {
+error_setg(errp, "Config synchronization allowed only in '%s' state, "
+   "current state is '%s'", RunState_str(RUN_STATE_RUNNING),
+   current_run_state_str());
+return;
+}
+
+dev = find_device_state(id, errp);
 if (!dev) {
 return;
 }
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 1652ed0439..3a8211474e 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -181,6 +181,11 @@ bool runstate_check(RunState state)
 return current_run_state == state;
 }
 
+const char *current_run_state_str(void)
+{
+return RunState_str(current_run_state);
+}
+
 static void runstate_init(void)
 {
 const RunStateTransition *p;
-- 
2.34.1




[PATCH 4/4] qapi: introduce CONFIG_READ event

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Send a new event when the guest reads the virtio-pci config after a
virtio_notify_config() call.

That's useful to check that the guest has fetched the modified config,
for example after resizing the disk backend.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/virtio/virtio-pci.c |  9 +
 include/monitor/qdev.h |  1 +
 monitor/monitor.c  |  1 +
 qapi/qdev.json | 22 ++
 softmmu/qdev-monitor.c |  5 +
 5 files changed, 38 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index dd4620462b..f24f8ff03d 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -23,6 +23,7 @@
 #include "hw/boards.h"
 #include "hw/virtio/virtio.h"
 #include "migration/qemu-file-types.h"
+#include "monitor/qdev.h"
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/qdev-properties.h"
@@ -541,6 +542,10 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,
 }
 addr -= config;
 
+if (vdev->generation > 0) {
+qdev_config_read_event(DEVICE(proxy));
+}
+
 switch (size) {
 case 1:
 val = virtio_config_readb(vdev, addr);
@@ -1728,6 +1733,10 @@ static uint64_t virtio_pci_device_read(void *opaque, hwaddr addr,
 return UINT64_MAX;
 }
 
+if (vdev->generation > 0) {
+qdev_config_read_event(DEVICE(proxy));
+}
+
 switch (size) {
 case 1:
 val = virtio_config_modern_readb(vdev, addr);
diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 949a3672cb..f0b0eab07e 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -39,6 +39,7 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
 const char *qdev_set_id(DeviceState *dev, char *id, Error **errp);
 
 void qdev_hotplug_device_on_event(DeviceState *dev);
+void qdev_config_read_event(DeviceState *dev);
 
 DeviceAndPath *qdev_new_device_and_path(DeviceState *dev);
 
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 941f87815a..f8aa91b190 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -315,6 +315,7 @@ static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
 [QAPI_EVENT_QUORUM_FAILURE]= { 1000 * SCALE_MS },
 [QAPI_EVENT_VSERPORT_CHANGE]   = { 1000 * SCALE_MS },
 [QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE] = { 1000 * SCALE_MS },
+[QAPI_EVENT_X_CONFIG_READ]   = { 300 * SCALE_MS },
 };
 
 /*
diff --git a/qapi/qdev.json b/qapi/qdev.json
index 2468f8bddf..37a8785b81 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -329,3 +329,25 @@
 # Since: 8.2
 ##
 { 'command': 'x-device-sync-config', 'data': {'id': 'str'} }
+
+##
+# @X_CONFIG_READ:
+#
+# Emitted whenever guest reads virtio device config after config change.
+#
+# @device: device name
+#
+# @path: device path
+#
+# Since: 8.2
+#
+# Example:
+#
+# <- { "event": "X_CONFIG_READ",
+#  "data": { "device": "virtio-net-pci-0",
+#"path": "/machine/peripheral/virtio-net-pci-0" },
+#  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+#
+##
+{ 'event': 'X_CONFIG_READ',
+  'data': { '*device': 'str', 'path': 'str' } }
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index b485375049..d0f022e925 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -1252,3 +1252,8 @@ void qdev_hotplug_device_on_event(DeviceState *dev)
 dev->device_on_event_sent = true;
 qapi_event_send_x_device_on(dev->id, dev->canonical_path);
 }
+
+void qdev_config_read_event(DeviceState *dev)
+{
+qapi_event_send_x_config_read(dev->id, dev->canonical_path);
+}
-- 
2.34.1




[PATCH 0/4] Python: Enable python3.12 support

2023-10-06 Thread John Snow
A few mostly trivial fixes, one backport from the qemu.qmp repo, and
enabling the Python tests to run against Python 3.12.

John Snow (4):
  Python/iotests: Add type hint for nbd module
  python/qmp: remove Server.wait_closed() call for Python 3.12
  configure: fix error message to say Python 3.8
  Python: Enable python3.12 support

 configure  | 5 +++--
 python/qemu/qmp/protocol.py| 1 -
 python/setup.cfg   | 3 ++-
 tests/docker/dockerfiles/python.docker | 6 +-
 tests/qemu-iotests/tests/nbd-multiconn | 4 +++-
 5 files changed, 13 insertions(+), 6 deletions(-)

-- 
2.41.0





[PATCH 1/4] Python/iotests: Add type hint for nbd module

2023-10-06 Thread John Snow
The test bails gracefully if this module isn't installed, but linters
need a little help understanding that. It's enough to just declare the
type in this case.

(Fixes pylint complaining about use of an uninitialized variable because
it isn't wise enough to understand the notrun call is noreturn.)
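The resulting pattern generalizes to any conditionally imported module; a standalone sketch (the fallback branch here is purely illustrative — the real iotest calls notrun() and exits instead):

```python
from types import ModuleType

# Annotate the name up front so linters know its type, even though the
# assignment only happens inside the guarded import below.
nbd: ModuleType

try:
    import nbd  # the optional dependency
except ImportError:
    import types
    # Illustrative stand-in only; the iotest would bail out here.
    nbd = types.ModuleType("nbd")
```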

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/tests/nbd-multiconn | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/tests/nbd-multiconn b/tests/qemu-iotests/tests/nbd-multiconn
index 478a1eaba2..7e686a786e 100755
--- a/tests/qemu-iotests/tests/nbd-multiconn
+++ b/tests/qemu-iotests/tests/nbd-multiconn
@@ -20,6 +20,8 @@
 
 import os
 from contextlib import contextmanager
+from types import ModuleType
+
 import iotests
 from iotests import qemu_img_create, qemu_io
 
@@ -28,7 +30,7 @@ disk = os.path.join(iotests.test_dir, 'disk')
 size = '4M'
 nbd_sock = os.path.join(iotests.sock_dir, 'nbd_sock')
 nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock
-
+nbd: ModuleType
 
 @contextmanager
 def open_nbd(export_name):
-- 
2.41.0




[PATCH 4/4] Python: Enable python3.12 support

2023-10-06 Thread John Snow
Python 3.12 has been released, so update the test infrastructure to test
against this version. Update the configure script to look for it when an
explicit Python interpreter isn't chosen.

Signed-off-by: John Snow 
---
 configure  | 3 ++-
 python/setup.cfg   | 3 ++-
 tests/docker/dockerfiles/python.docker | 6 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index e9a921ffb0..b480a3d6ae 100755
--- a/configure
+++ b/configure
@@ -561,7 +561,8 @@ first_python=
 if test -z "${PYTHON}"; then
 # A bare 'python' is traditionally python 2.x, but some distros
 # have it as python 3.x, so check in both places.
-for binary in python3 python python3.11 python3.10 python3.9 python3.8; do
+for binary in python3 python python3.12 python3.11 \
+  python3.10 python3.9 python3.8; do
 if has "$binary"; then
 python=$(command -v "$binary")
 if check_py_version "$python"; then
diff --git a/python/setup.cfg b/python/setup.cfg
index 8c67dce457..48668609d3 100644
--- a/python/setup.cfg
+++ b/python/setup.cfg
@@ -18,6 +18,7 @@ classifiers =
 Programming Language :: Python :: 3.9
 Programming Language :: Python :: 3.10
 Programming Language :: Python :: 3.11
+Programming Language :: Python :: 3.12
 Typing :: Typed
 
 [options]
@@ -182,7 +183,7 @@ multi_line_output=3
 # of python available on your system to run this test.
 
 [tox:tox]
-envlist = py38, py39, py310, py311
+envlist = py38, py39, py310, py311, py312
 skip_missing_interpreters = true
 
 [testenv]
diff --git a/tests/docker/dockerfiles/python.docker b/tests/docker/dockerfiles/python.docker
index 383ccbdc3a..a3c1321190 100644
--- a/tests/docker/dockerfiles/python.docker
+++ b/tests/docker/dockerfiles/python.docker
@@ -11,7 +11,11 @@ ENV PACKAGES \
 python3-pip \
 python3-tox \
 python3-virtualenv \
-python3.10
+python3.10 \
+python3.11 \
+python3.12 \
+python3.8 \
+python3.9
 
 RUN dnf install -y $PACKAGES
 RUN rpm -q $PACKAGES | sort > /packages.txt
-- 
2.41.0




[PATCH 2/4] python/qmp: remove Server.wait_closed() call for Python 3.12

2023-10-06 Thread John Snow
This patch is a backport from
https://gitlab.com/qemu-project/python-qemu-qmp/-/commit/e03a3334b6a477beb09b293708632f2c06fe9f61

According to Guido in https://github.com/python/cpython/issues/104344 ,
this call was never meant to wait for the server to shut down - that is
handled synchronously - but instead waits for all connections to close.
Or rather, it would have, had it not been broken ever since it was
introduced.

Python 3.12 fixes the bug, and the now-working wait causes a hang in our
code. The fix is simply to remove the wait.
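The distinction can be demonstrated with a plain asyncio server (a standalone sketch, unrelated to the QMP code): close() stops the listener synchronously, so nothing further needs to be awaited when you don't want to drain client connections:

```python
import asyncio

async def stop_listening():
    async def handler(reader, writer):
        writer.close()

    server = await asyncio.start_server(handler, host="127.0.0.1", port=0)
    server.close()  # stop accepting connections; takes effect synchronously
    # Deliberately no `await server.wait_closed()`: on Python >= 3.12 that
    # call really waits for all client connections to finish.
    return server.is_serving()

print(asyncio.run(stop_listening()))
```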

Signed-off-by: John Snow 
---
 python/qemu/qmp/protocol.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/python/qemu/qmp/protocol.py b/python/qemu/qmp/protocol.py
index 753182131f..a4ffdfad51 100644
--- a/python/qemu/qmp/protocol.py
+++ b/python/qemu/qmp/protocol.py
@@ -495,7 +495,6 @@ async def _stop_server(self) -> None:
 try:
 self.logger.debug("Stopping server.")
 self._server.close()
-await self._server.wait_closed()
 self.logger.debug("Server stopped.")
 finally:
 self._server = None
-- 
2.41.0




[PATCH 3/4] configure: fix error message to say Python 3.8

2023-10-06 Thread John Snow
Signed-off-by: John Snow 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index e08127045d..e9a921ffb0 100755
--- a/configure
+++ b/configure
@@ -944,7 +944,7 @@ then
 # If first_python is set, there was a binary somewhere even though
 # it was not suitable.  Use it for the error message.
 if test -n "$first_python"; then
-error_exit "Cannot use '$first_python', Python >= 3.7 is required." \
+error_exit "Cannot use '$first_python', Python >= 3.8 is required." \
 "Use --python=/path/to/python to specify a supported Python."
 else
 error_exit "Python not found. Use --python=/path/to/python"
-- 
2.41.0




Re: [PATCH 15/19] parallels: Remove unnecessary data_end field

2023-10-06 Thread Mike Maslenkin
On Mon, Oct 2, 2023 at 12:01 PM Alexander Ivanov
 wrote:
>
> Since we have used bitmap, field data_end in BDRVParallelsState is
> redundant and can be removed.
>
> Add parallels_data_end() helper and remove data_end handling.
>
> Signed-off-by: Alexander Ivanov 
> ---
>  block/parallels.c | 33 +
>  block/parallels.h |  1 -
>  2 files changed, 13 insertions(+), 21 deletions(-)
>
> diff --git a/block/parallels.c b/block/parallels.c
> index 48ea5b3f03..80a7171b84 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -265,6 +265,13 @@ static void parallels_free_used_bitmap(BlockDriverState 
> *bs)
>  g_free(s->used_bmap);
>  }
>
> +static int64_t parallels_data_end(BDRVParallelsState *s)
> +{
> +int64_t data_end = s->data_start * BDRV_SECTOR_SIZE;
> +data_end += s->used_bmap_size * s->cluster_size;
> +return data_end;
> +}
> +
>  int64_t parallels_allocate_host_clusters(BlockDriverState *bs,
>   int64_t *clusters)
>  {
> @@ -275,7 +282,7 @@ int64_t parallels_allocate_host_clusters(BlockDriverState 
> *bs,
>
>  first_free = find_first_zero_bit(s->used_bmap, s->used_bmap_size);
>  if (first_free == s->used_bmap_size) {
> -host_off = s->data_end * BDRV_SECTOR_SIZE;
> +host_off = parallels_data_end(s);
>  prealloc_clusters = *clusters + s->prealloc_size / s->tracks;
>  bytes = prealloc_clusters * s->cluster_size;
>
> @@ -297,9 +304,6 @@ int64_t parallels_allocate_host_clusters(BlockDriverState 
> *bs,
>  s->used_bmap = bitmap_zero_extend(s->used_bmap, s->used_bmap_size,
>new_usedsize);
>  s->used_bmap_size = new_usedsize;
> -if (host_off + bytes > s->data_end * BDRV_SECTOR_SIZE) {
> -s->data_end = (host_off + bytes) / BDRV_SECTOR_SIZE;
> -}
>  } else {
>  next_used = find_next_bit(s->used_bmap, s->used_bmap_size, 
> first_free);
>
> @@ -315,8 +319,7 @@ int64_t parallels_allocate_host_clusters(BlockDriverState 
> *bs,
>   * branch. In the other case we are likely re-using hole. Preallocate
>   * the space if required by the prealloc_mode.
>   */
> -if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE &&
> -host_off < s->data_end * BDRV_SECTOR_SIZE) {
> +if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
>  ret = bdrv_pwrite_zeroes(bs->file, host_off, bytes, 0);
>  if (ret < 0) {
>  return ret;
> @@ -757,13 +760,7 @@ parallels_check_outside_image(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  }
>  }
>
> -if (high_off == 0) {
> -res->image_end_offset = s->data_end << BDRV_SECTOR_BITS;
> -} else {
> -res->image_end_offset = high_off + s->cluster_size;
> -s->data_end = res->image_end_offset >> BDRV_SECTOR_BITS;
> -}
> -
> +res->image_end_offset = parallels_data_end(s);
>  return 0;
>  }
>
> @@ -806,7 +803,6 @@ parallels_check_leak(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  res->check_errors++;
>  return ret;
>  }
> -s->data_end = res->image_end_offset >> BDRV_SECTOR_BITS;
>
>  parallels_free_used_bitmap(bs);
>  ret = parallels_fill_used_bitmap(bs);
> @@ -1361,8 +1357,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  }
>
>  s->data_start = data_start;
> -s->data_end = s->data_start;
> -if (s->data_end < (s->header_size >> BDRV_SECTOR_BITS)) {
> +if (s->data_start < (s->header_size >> BDRV_SECTOR_BITS)) {
>  /*
>   * There is not enough unused space to fit to block align between BAT
>   * and actual data. We can't avoid read-modify-write...
> @@ -1403,11 +1398,10 @@ static int parallels_open(BlockDriverState *bs, QDict 
> *options, int flags,
>
>  for (i = 0; i < s->bat_size; i++) {
>  sector = bat2sect(s, i);
> -if (sector + s->tracks > s->data_end) {
> -s->data_end = sector + s->tracks;
> +if (sector + s->tracks > file_nb_sectors) {
> +need_check = true;
>  }
>  }
> -need_check = need_check || s->data_end > file_nb_sectors;
>
>  ret = parallels_fill_used_bitmap(bs);
>  if (ret == -ENOMEM) {
> @@ -1461,7 +1455,6 @@ static int 
> parallels_truncate_unused_clusters(BlockDriverState *bs)
>  end_off = (end_off + 1) * s->cluster_size;
>  }
>  end_off += s->data_start * BDRV_SECTOR_SIZE;
> -s->data_end = end_off / BDRV_SECTOR_SIZE;
>  return bdrv_truncate(bs->file, end_off, true, PREALLOC_MODE_OFF, 0, 
> NULL);
>  }
>
> diff --git a/block/parallels.h b/block/parallels.h
> index 18b4f8068e..a6a048d890 100644
> --- a/block/parallels.h
> +++ b/block/parallels.h
> @@ -79,7 +79,6 @@ typedef struct BDRVParallelsState {
>  unsigned int bat_size;
>
>  int64_t  data_start;
> -int64_t  data_end;

Re: [v3] Help wanted for enabling -Wshadow=local

2023-10-06 Thread Warner Losh
On Fri, Oct 6, 2023, 11:55 AM Thomas Huth  wrote:

> On 06/10/2023 18.18, Thomas Huth wrote:
> > On 06/10/2023 16.45, Markus Armbruster wrote:
> >> Local variables shadowing other local variables or parameters make the
> >> code needlessly hard to understand.  Bugs love to hide in such code.
> >> Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
> >> on polling error".
> >>
> >> Enabling -Wshadow would prevent bugs like this one.  But we have to
> >> clean up all the offenders first.
> >>
> >> Quite a few people responded to my calls for help.  Thank you so much!
> >>
> >> I'm collecting patches in my git repo at
> >> https://repo.or.cz/qemu/armbru.git in branch shadow-next.  All but the
> >> last two are in a pending pull request.
> >>
> >> My test build is down to seven files with warnings.  "[PATCH v2 0/3]
> >> hexagon: GETPC() and shadowing fixes" takes care of four, but it needs a
> >> rebase.
> >>
> >> Remaining three:
> >>
> >>  In file included from ../hw/display/virtio-gpu-virgl.c:19:
> >>  ../hw/display/virtio-gpu-virgl.c: In function
> ‘virgl_cmd_submit_3d’:
> >>  /work/armbru/qemu/include/hw/virtio/virtio-gpu.h:228:16: warning:
> >> declaration of ‘s’ shadows a previous local [-Wshadow=compatible-local]
> >>228 | size_t
> >> s;   \
> >>|^
> >>  ../hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of
> macro
> >> ‘VIRTIO_GPU_FILL_CMD’
> >>215 | VIRTIO_GPU_FILL_CMD(cs);
> >>| ^~~
> >>  ../hw/display/virtio-gpu-virgl.c:213:12: note: shadowed
> declaration
> >> is here
> >>213 | size_t s;
> >>|^
> >>
> >>  In file included from ../contrib/vhost-user-gpu/virgl.h:18,
> >>   from ../contrib/vhost-user-gpu/virgl.c:17:
> >>  ../contrib/vhost-user-gpu/virgl.c: In function
> ‘virgl_cmd_submit_3d’:
> >>  ../contrib/vhost-user-gpu/vugpu.h:167:16: warning: declaration of
> ‘s’
> >> shadows a previous local [-Wshadow=compatible-local]
> >>167 | size_t
> >> s;   \
> >>|^
> >>  ../contrib/vhost-user-gpu/virgl.c:203:5: note: in expansion of
> macro
> >> ‘VUGPU_FILL_CMD’
> >>203 | VUGPU_FILL_CMD(cs);
> >>| ^~
> >>  ../contrib/vhost-user-gpu/virgl.c:201:12: note: shadowed
> declaration
> >> is here
> >>201 | size_t s;
> >>|^
> >>
> >>  ../contrib/vhost-user-gpu/vhost-user-gpu.c: In function
> >> ‘vg_resource_flush’:
> >>  ../contrib/vhost-user-gpu/vhost-user-gpu.c:837:29: warning:
> >> declaration of ‘i’ shadows a previous local [-Wshadow=local]
> >>837 | pixman_image_t *i =
> >>| ^
> >>  ../contrib/vhost-user-gpu/vhost-user-gpu.c:757:9: note: shadowed
> >> declaration is here
> >>757 | int i;
> >>| ^
> >>
> >> Gerd, Marc-André, or anybody else?
> >>
> >> More warnings may lurk in code my test build doesn't compile.  Need a
> >> full CI build with -Wshadow=local to find them.  Anybody care to kick
> >> one off?
> >
> > I ran a build here (with -Werror enabled, so that it's easier to see
> where
> > it breaks):
> >
> >   https://gitlab.com/thuth/qemu/-/pipelines/1028023489
> >
> > ... but I didn't see any additional spots in the logs beside the ones
> that
> > you already listed.
>
> After adding two more patches to fix the above warnings, things look
> pretty
> good:
>
>   https://gitlab.com/thuth/qemu/-/pipelines/1028413030
>
> There are just some warnings left in the BSD code, as Warner already
> mentioned in his reply to v2 of your mail:
>
>   https://gitlab.com/thuth/qemu/-/jobs/5241420713


I think I have fixes for these. I need to merge what just landed into the
bsd-user fork, rebase, test, then apply them to the qemu master branch,
retest, and send them off...

My illness has hung on longer than I thought so I'm still behind...

Warner


>   Thomas
>
>


Re: [v3] Help wanted for enabling -Wshadow=local

2023-10-06 Thread Markus Armbruster
Markus Armbruster  writes:

> Local variables shadowing other local variables or parameters make the
> code needlessly hard to understand.  Bugs love to hide in such code.
> Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
> on polling error".
>
> Enabling -Wshadow would prevent bugs like this one.  But we have to
> clean up all the offenders first.
>
> Quite a few people responded to my calls for help.  Thank you so much!
>
> I'm collecting patches in my git repo at
> https://repo.or.cz/qemu/armbru.git in branch shadow-next.  All but the
> last two are in a pending pull request.
>
> My test build is down to seven files with warnings.  "[PATCH v2 0/3]
> hexagon: GETPC() and shadowing fixes" takes care of four, but it needs a
> rebase.
>
> Remaining three:
>
> In file included from ../hw/display/virtio-gpu-virgl.c:19:
> ../hw/display/virtio-gpu-virgl.c: In function ‘virgl_cmd_submit_3d’:
> /work/armbru/qemu/include/hw/virtio/virtio-gpu.h:228:16: warning: 
> declaration of ‘s’ shadows a previous local [-Wshadow=compatible-local]
>   228 | size_t s; 
>   \
>   |^
> ../hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro 
> ‘VIRTIO_GPU_FILL_CMD’
>   215 | VIRTIO_GPU_FILL_CMD(cs);
>   | ^~~
> ../hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration is 
> here
>   213 | size_t s;
>   |^
>
> In file included from ../contrib/vhost-user-gpu/virgl.h:18,
>  from ../contrib/vhost-user-gpu/virgl.c:17:
> ../contrib/vhost-user-gpu/virgl.c: In function ‘virgl_cmd_submit_3d’:
> ../contrib/vhost-user-gpu/vugpu.h:167:16: warning: declaration of ‘s’ 
> shadows a previous local [-Wshadow=compatible-local]
>   167 | size_t s;   \
>   |^
> ../contrib/vhost-user-gpu/virgl.c:203:5: note: in expansion of macro 
> ‘VUGPU_FILL_CMD’
>   203 | VUGPU_FILL_CMD(cs);
>   | ^~
> ../contrib/vhost-user-gpu/virgl.c:201:12: note: shadowed declaration is 
> here
>   201 | size_t s;
>   |^
>
> ../contrib/vhost-user-gpu/vhost-user-gpu.c: In function 
> ‘vg_resource_flush’:
> ../contrib/vhost-user-gpu/vhost-user-gpu.c:837:29: warning: declaration 
> of ‘i’ shadows a previous local [-Wshadow=local]
>   837 | pixman_image_t *i =
>   | ^
> ../contrib/vhost-user-gpu/vhost-user-gpu.c:757:9: note: shadowed 
> declaration is here
>   757 | int i;
>   | ^
>
> Gerd, Marc-André, or anybody else?

Thomas posted patches:

[PATCH] hw/virtio/virtio-gpu: Fix compiler warning when compiling with 
-Wshadow
[PATCH] contrib/vhost-user-gpu: Fix compiler warning when compiling with 
-Wshadow

> More warnings may lurk in code my test build doesn't compile.  Need a
> full CI build with -Wshadow=local to find them.  Anybody care to kick
> one off?

Thomas did; see his reply downthread.

Thank you, Thomas!




[Stable-8.1.2 53/57] vdpa net: zero vhost_vdpa iova_tree pointer at cleanup

2023-10-06 Thread Michael Tokarev
From: Eugenio Pérez 

Not zeroing it causes a SIGSEGV if the live migration is cancelled, at
net device restart.

This is caused by CVQ trying to reuse the iova_tree that is present in
the first vhost_vdpa device at the end of vhost_vdpa_net_cvq_start.
As a consequence, it tries to access an iova_tree that has already been
freed.

Fixes: 00ef422e9fbf ("vdpa net: move iova tree creation from init to start")
Reported-by: Yanhui Ma 
Signed-off-by: Eugenio Pérez 
Message-Id: <20230913123408.2819185-1-epere...@redhat.com>
Acked-by: Jason Wang 
Tested-by: Lei Yang 
Reviewed-by: Si-Wei Liu 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 0a7a164bc37b4ecbf74466e1e5243d72a768ad06)
Signed-off-by: Michael Tokarev 

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 9795306742..977faeb44b 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -385,6 +385,8 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
 dev = s->vhost_vdpa.dev;
 if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
 g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+} else {
+s->vhost_vdpa.iova_tree = NULL;
 }
 }
 
-- 
2.39.2




[Stable-8.1.2 56/57] vdpa net: follow VirtIO initialization properly at cvq isolation probing

2023-10-06 Thread Michael Tokarev
From: Eugenio Pérez 

This patch solves a few issues.  The most obvious is that the features
were set before the ACKNOWLEDGE | DRIVER status bits.  Current vdpa
devices are permissive about this, but it is better to follow the
standard.

Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Signed-off-by: Eugenio Pérez 
Message-Id: <20230915170836.3078172-4-epere...@redhat.com>
Tested-by: Lei Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 845ec38ae1578dd2d42ff15c9979f1bf44b23418)
Signed-off-by: Michael Tokarev 

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index cda6099ceb..07b616af51 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1272,8 +1272,7 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
 uint64_t backend_features;
 int64_t cvq_group;
 uint8_t status = VIRTIO_CONFIG_S_ACKNOWLEDGE |
- VIRTIO_CONFIG_S_DRIVER |
- VIRTIO_CONFIG_S_FEATURES_OK;
+ VIRTIO_CONFIG_S_DRIVER;
 int r;
 
 ERRP_GUARD();
@@ -1288,15 +1287,22 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
 return 0;
 }
 
+r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, &status);
+if (unlikely(r)) {
+error_setg_errno(errp, -r, "Cannot set device status");
+goto out;
+}
+
 r = ioctl(device_fd, VHOST_SET_FEATURES, &features);
 if (unlikely(r)) {
-error_setg_errno(errp, errno, "Cannot set features");
+error_setg_errno(errp, -r, "Cannot set features");
 goto out;
 }
 
+status |= VIRTIO_CONFIG_S_FEATURES_OK;
 r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, &status);
 if (unlikely(r)) {
-error_setg_errno(errp, -r, "Cannot set status");
+error_setg_errno(errp, -r, "Cannot set device status");
 goto out;
 }
 
-- 
2.39.2




[Stable-8.1.2 51/57] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected

2023-10-06 Thread Michael Tokarev
From: Thomas Huth 

When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.

One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.

This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.

To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking
anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.

With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).

Reported-by: Marc Hartmayer 
Signed-off-by: Thomas Huth 
Reviewed-by: Daniel P. Berrangé 
Message-Id: <20230816210743.1319018-1-th...@redhat.com>
(cherry picked from commit 4f7689f0817a717d18cc8aca298990760f27a89b)
Signed-off-by: Michael Tokarev 

diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index 4e5deac18a..cc2f7617fe 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
 static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
 {
 PtyChardev *s = PTY_CHARDEV(chr);
+GPollFD pfd;
+int rc;
 
-if (!s->connected) {
-return len;
+if (s->connected) {
+return io_channel_send(s->ioc, buf, len);
 }
-return io_channel_send(s->ioc, buf, len);
+
+/*
+ * The other side might already be re-connected, but the timer might
+ * not have fired yet. So let's check here whether we can write again:
+ */
+pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
+pfd.events = G_IO_OUT;
+pfd.revents = 0;
+rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
+g_assert(rc >= 0);
+if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
+io_channel_send(s->ioc, buf, len);
+}
+
+return len;
 }
 
 static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)
-- 
2.39.2




[Stable-8.1.2 54/57] vdpa net: fix error message setting virtio status

2023-10-06 Thread Michael Tokarev
From: Eugenio Pérez 

It incorrectly prints "error setting features", probably because of a
copy-paste mistake.

Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Reported-by: Peter Maydell 
Signed-off-by: Eugenio Pérez 
Message-Id: <20230915170836.3078172-2-epere...@redhat.com>
Tested-by: Lei Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
(cherry picked from commit cbc9ae87b5f6f81c52a249e0b64100d5011fca53)
Signed-off-by: Michael Tokarev 

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 977faeb44b..1c79e33170 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1295,7 +1295,7 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, 
uint64_t features,
 
 r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, &status);
 if (unlikely(r)) {
-error_setg_errno(errp, -r, "Cannot set device features");
+error_setg_errno(errp, -r, "Cannot set status");
 goto out;
 }
 
-- 
2.39.2




[Stable-8.1.2 46/57] subprojects/berkeley-testfloat-3: Update to fix a problem with compiler warnings

2023-10-06 Thread Michael Tokarev
From: Thomas Huth 

Update the berkeley-testfloat-3 wrap to include a patch provided by
Olaf Hering. This fixes a problem with "control reaches end of non-void
function [-Werror=return-type]" compiler warnings/errors that are now
enabled by default in certain versions of GCC.

Reported-by: Olaf Hering 
Message-Id: <20230816091522.1292029-1-th...@redhat.com>
Signed-off-by: Thomas Huth 
(cherry picked from commit c01196bdddc280ae3710912e98e78f3103155eaf)
Signed-off-by: Michael Tokarev 

diff --git a/subprojects/berkeley-testfloat-3.wrap 
b/subprojects/berkeley-testfloat-3.wrap
index 6ad80a37b2..c86dc078a8 100644
--- a/subprojects/berkeley-testfloat-3.wrap
+++ b/subprojects/berkeley-testfloat-3.wrap
@@ -1,5 +1,5 @@
 [wrap-git]
 url = https://gitlab.com/qemu-project/berkeley-testfloat-3
-revision = 40619cbb3bf32872df8c53cc457039229428a263
+revision = e7af9751d9f9fd3b47911f51a5cfd08af256a9ab
 patch_directory = berkeley-testfloat-3
 depth = 1
-- 
2.39.2




[Stable-8.1.2 50/57] hw/display/ramfb: plug slight guest-triggerable leak on mode setting

2023-10-06 Thread Michael Tokarev
From: Laszlo Ersek 

The fw_cfg DMA write callback in ramfb prepares a new display surface in
QEMU; this new surface is put to use ("swapped in") upon the next display
update. At that time, the old surface (if any) is released.

If the guest triggers the fw_cfg DMA write callback at least twice between
two adjacent display updates, then the second callback (and further such
callbacks) will leak the previously prepared (but not yet swapped in)
display surface.

The issue can be shown by:

(1) starting QEMU with "-trace displaysurface_free", and

(2) running the following program in the guest UEFI shell:

> #include <Library/ShellCEntryLib.h>            // ShellAppMain()
> #include <Library/UefiBootServicesTableLib.h>  // gBS
> #include <Protocol/GraphicsOutput.h>           // EFI_GRAPHICS_OUTPUT_PROTOCOL
>
> INTN
> EFIAPI
> ShellAppMain (
>   IN UINTN   Argc,
>   IN CHAR16  **Argv
>   )
> {
>   EFI_STATUSStatus;
>   VOID  *Interface;
>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>   UINT32Mode;
>
>   Status = gBS->LocateProtocol (
>   &gEfiGraphicsOutputProtocolGuid,
>   NULL,
>   &Interface
>   );
>   if (EFI_ERROR (Status)) {
> return 1;
>   }
>
>   Gop = Interface;
>
>   Mode = 1;
>   for ( ; ;) {
> Status = Gop->SetMode (Gop, Mode);
> if (EFI_ERROR (Status)) {
>   break;
> }
>
> Mode = 1 - Mode;
>   }
>
>   return 1;
> }

The symptom is then that:

- only one trace message appears periodically,

- the time between adjacent messages keeps increasing -- implying that
  some list structure (containing the leaked resources) keeps growing,

- the "surface" pointer is ever different.

> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100

We figured this wasn't a CVE-worthy problem, as only small amounts of
memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
only allocates administrative structures), plus libvirt restricts QEMU
memory footprint anyway, thus the guest can only DoS itself.

Plug the leak, by releasing the last prepared (not yet swapped in) display
surface, if any, in the fw_cfg DMA write callback.

Regarding the "reproducer", with the fix in place, the log is flooded with
trace messages (one per fw_cfg write), *and* the trace message alternates
between just two "surface" pointer values (i.e., nothing is leaked, the
allocator flip-flops between two objects in effect).

This issue appears to date back to the introduction of ramfb (995b30179bdc,
"hw/display: add ramfb, a simple boot framebuffer living in guest ram",
2018-06-18).

Cc: Gerd Hoffmann  (maintainer:ramfb)
Cc: qemu-sta...@nongnu.org
Fixes: 995b30179bdc
Signed-off-by: Laszlo Ersek 
Acked-by: Laszlo Ersek 
Reviewed-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
Message-ID: <20230919131955.27223-1-ler...@redhat.com>
(cherry picked from commit e0288a778473ebd35eac6cc1924faca7d477d241)
Signed-off-by: Michael Tokarev 

diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
index 79b9754a58..c2b002d534 100644
--- a/hw/display/ramfb.c
+++ b/hw/display/ramfb.c
@@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, 
size_t len)
 
 s->width = width;
 s->height = height;
+qemu_free_displaysurface(s->ds);
 s->ds = surface;
 }
 
-- 
2.39.2
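The one-line fix follows a general "release the previously prepared object before storing a new one" pattern, so back-to-back preparations cannot leak. A minimal sketch with made-up names (not the ramfb code itself):

```c
#include <stdlib.h>

struct state {
    void *pending;   /* prepared but not yet swapped in */
};

/* Store a newly prepared object; free any previously prepared one that
 * was never consumed.  free(NULL) is a no-op, just as
 * qemu_free_displaysurface() tolerates a NULL surface. */
static void state_set_pending(struct state *s, void *obj)
{
    free(s->pending);
    s->pending = obj;
}
```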




[Stable-8.1.2 57/57] amd_iommu: Fix APIC address check

2023-10-06 Thread Michael Tokarev
From: Akihiko Odaki 

An MSI from the I/O APIC may not be exactly equal to APIC_DEFAULT_ADDRESS. In
fact, Windows 17763.3650 configures the I/O APIC to set the dest_mode bit.
Cover the whole range assigned to the APIC.

Fixes: 577c470f43 ("x86_iommu/amd: Prepare for interrupt remap support")
Signed-off-by: Akihiko Odaki 
Message-Id: <20230921114612.40671-1-akihiko.od...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 0114c4513095598cdf1cd8d7dacdfff757628121)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 9c77304438..9b7c6e2921 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1246,13 +1246,8 @@ static int amdvi_int_remap_msi(AMDVIState *iommu,
 return -AMDVI_IR_ERR;
 }
 
-if (origin->address & AMDVI_MSI_ADDR_HI_MASK) {
-trace_amdvi_err("MSI address high 32 bits non-zero when "
-"Interrupt Remapping enabled.");
-return -AMDVI_IR_ERR;
-}
-
-if ((origin->address & AMDVI_MSI_ADDR_LO_MASK) != APIC_DEFAULT_ADDRESS) {
+if (origin->address < AMDVI_INT_ADDR_FIRST ||
+origin->address + sizeof(origin->data) > AMDVI_INT_ADDR_LAST + 1) {
 trace_amdvi_err("MSI is not from IOAPIC.");
 return -AMDVI_IR_ERR;
 }
diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 6da893ee57..c5065a3e27 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -210,8 +210,6 @@
 #define AMDVI_INT_ADDR_FIRST    0xfee00000ULL
 #define AMDVI_INT_ADDR_LAST     0xfeefffffULL
 #define AMDVI_INT_ADDR_SIZE     (AMDVI_INT_ADDR_LAST - AMDVI_INT_ADDR_FIRST + 1)
-#define AMDVI_MSI_ADDR_HI_MASK  (0xffffffff00000000ULL)
-#define AMDVI_MSI_ADDR_LO_MASK  (0xffffffffULL)
 
 /* SB IOAPIC is always on this device in AMD systems */
 #define AMDVI_IOAPIC_SB_DEVID   PCI_BUILD_BDF(0, PCI_DEVFN(0x14, 0))
-- 
2.39.2
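The corrected condition tests a whole range (the standard x86 0xFEExxxxx interrupt address window) instead of one fixed address, and accounts for the access length. The shape of the comparison can be sketched as follows (helper name is made up):

```c
#include <stdint.h>

#define INT_ADDR_FIRST 0xfee00000ULL   /* start of the APIC MSI window */
#define INT_ADDR_LAST  0xfeefffffULL   /* last byte of the window */

/* An access of 'len' bytes at 'addr' targets the APIC window only if it
 * starts at or after FIRST and ends at or before LAST (inclusive);
 * "+ 1" converts the inclusive LAST into an exclusive end bound. */
static int in_apic_window(uint64_t addr, uint64_t len)
{
    return addr >= INT_ADDR_FIRST && addr + len <= INT_ADDR_LAST + 1;
}
```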




[Stable-8.1.2 49/57] win32: avoid discarding the exception handler

2023-10-06 Thread Michael Tokarev
From: Marc-André Lureau 

In all likelihood, the compiler with LTO doesn't see the function being
used from the assembly macro __try1. Help it by marking the function as
being used.

Resolves:
https://gitlab.com/qemu-project/qemu/-/issues/1904

Fixes: commit d89f30b4df ("win32: wrap socket close() with an exception 
handler")

Signed-off-by: Marc-André Lureau 
Reviewed-by: Thomas Huth 
(cherry picked from commit 75b773d84c89220463a14a6883d2b2a8e49e5b68)
Signed-off-by: Michael Tokarev 
(mjt: trivial context fixup in include/qemu/compiler.h)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index a309f90c76..5c7f63f351 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -197,4 +197,10 @@
 #define BUILTIN_SUBCLL_BROKEN
 #endif
 
+#if __has_attribute(used)
+# define QEMU_USED __attribute__((used))
+#else
+# define QEMU_USED
+#endif
+
 #endif /* COMPILER_H */
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index 19a0ea7fbe..55b0189dc3 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -479,7 +479,7 @@ int qemu_bind_wrap(int sockfd, const struct sockaddr *addr,
 return ret;
 }
 
-EXCEPTION_DISPOSITION
+QEMU_USED EXCEPTION_DISPOSITION
 win32_close_exception_handler(struct _EXCEPTION_RECORD *exception_record,
   void *registration, struct _CONTEXT *context,
   void *dispatcher)
-- 
2.39.2
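The `used` attribute tells the compiler and linker that a symbol is referenced even when no reference is visible in C code, e.g. only from inline assembly, so LTO must not discard it. The guard pattern from the patch can be sketched standalone (macro name changed to avoid implying it is the QEMU header):

```c
/* Older preprocessors may lack __has_attribute entirely; make the
 * probe itself safe before using it, then mirror the QEMU_USED guard. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(used)
# define MY_USED __attribute__((used))
#else
# define MY_USED
#endif

/* Kept in the object file even if no C code appears to call it. */
MY_USED static int keepalive(void)
{
    return 42;
}
```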




[Stable-8.1.2 52/57] linux-user/hppa: Fix struct target_sigcontext layout

2023-10-06 Thread Michael Tokarev
From: Richard Henderson 

Use abi_ullong not uint64_t so that the alignment of the field
and therefore the layout of the struct is correct.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
(cherry picked from commit 33bc4fa78b06fc4e5fe22e5576811a97707e0cc6)
Signed-off-by: Michael Tokarev 

diff --git a/linux-user/hppa/signal.c b/linux-user/hppa/signal.c
index bda6e54655..ec5f5412d1 100644
--- a/linux-user/hppa/signal.c
+++ b/linux-user/hppa/signal.c
@@ -25,7 +25,7 @@
 struct target_sigcontext {
 abi_ulong sc_flags;
 abi_ulong sc_gr[32];
-uint64_t sc_fr[32];
+abi_ullong sc_fr[32];
 abi_ulong sc_iasq[2];
 abi_ulong sc_iaoq[2];
 abi_ulong sc_sar;
-- 
2.39.2
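The reason the type matters: a host uint64_t is typically 8-byte aligned, so a preceding 32-bit field forces 4 bytes of padding, while the guest ABI type may only be 4-byte aligned and leave no padding. A standalone sketch, using an aligned(4) typedef to stand in for a 32-bit target's abi_ullong (GCC allows a typedef's aligned attribute to lower alignment):

```c
#include <stddef.h>
#include <stdint.h>

/* Host-natural 64-bit value: usually 8-byte aligned. */
struct host_layout {
    uint32_t flags;
    uint64_t fr;        /* compiler inserts 4 bytes of padding before this */
};

/* 64-bit value with only 4-byte alignment, as a 32-bit guest ABI sees it. */
typedef uint64_t u64_a4 __attribute__((aligned(4)));

struct guest_layout {
    uint32_t flags;
    u64_a4 fr;          /* no padding: field starts right after flags */
};
```

With the wrong type, every field after the padding lands at the wrong guest offset, which is exactly how the hppa sigcontext layout broke.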




[Stable-8.1.2 48/57] target/i386: fix memory operand size for CVTPS2PD

2023-10-06 Thread Michael Tokarev
From: Paolo Bonzini 

CVTPS2PD only loads a half-register for memory, unlike the other
operations under 0x0F 0x5A.  "Unpack" the group into separate
emission functions instead of using gen_unary_fp_sse.

Signed-off-by: Paolo Bonzini 
(cherry picked from commit abd41884c530aa025ada253bf1a5bd0c2b808219)
Signed-off-by: Michael Tokarev 

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 43c39aad2a..0db19cda3b 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -805,10 +805,20 @@ static void decode_sse_unary(DisasContext *s, CPUX86State 
*env, X86OpEntry *entr
 case 0x51: entry->gen = gen_VSQRT; break;
 case 0x52: entry->gen = gen_VRSQRT; break;
 case 0x53: entry->gen = gen_VRCP; break;
-case 0x5A: entry->gen = gen_VCVTfp2fp; break;
 }
 }
 
+static void decode_0F5A(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry opcodes_0F5A[4] = {
+X86_OP_ENTRY2(VCVTPS2PD,  V,x,   W,xh, vex2),  /* VCVTPS2PD */
+X86_OP_ENTRY2(VCVTPD2PS,  V,x,   W,x,  vex2),  /* VCVTPD2PS */
+X86_OP_ENTRY3(VCVTSS2SD,  V,x,  H,x, W,x,  vex2_rep3), /* VCVTSS2SD */
+X86_OP_ENTRY3(VCVTSD2SS,  V,x,  H,x, W,x,  vex2_rep3), /* VCVTSD2SS */
+};
+*entry = *decode_by_prefix(s, opcodes_0F5A);
+}
+
 static void decode_0F5B(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
 {
 static const X86OpEntry opcodes_0F5B[4] = {
@@ -891,7 +901,7 @@ static const X86OpEntry opcodes_0F[256] = {
 
 [0x58] = X86_OP_ENTRY3(VADD,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x59] = X86_OP_ENTRY3(VMUL,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
-[0x5a] = X86_OP_GROUP3(sse_unary,  V,x, H,x, W,x, vex2_rep3 
p_00_66_f3_f2), /* CVTPS2PD */
+[0x5a] = X86_OP_GROUP0(0F5A),
 [0x5b] = X86_OP_GROUP0(0F5B),
 [0x5c] = X86_OP_ENTRY3(VSUB,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x5d] = X86_OP_ENTRY3(VMIN,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 4fe8dec427..45a3e55cbf 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1914,12 +1914,22 @@ static void gen_VCOMI(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *decode)
 set_cc_op(s, CC_OP_EFLAGS);
 }
 
-static void gen_VCVTfp2fp(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+static void gen_VCVTPD2PS(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
-gen_unary_fp_sse(s, env, decode,
- gen_helper_cvtpd2ps_xmm, gen_helper_cvtps2pd_xmm,
- gen_helper_cvtpd2ps_ymm, gen_helper_cvtps2pd_ymm,
- gen_helper_cvtsd2ss, gen_helper_cvtss2sd);
+if (s->vex_l) {
+gen_helper_cvtpd2ps_ymm(cpu_env, OP_PTR0, OP_PTR2);
+} else {
+gen_helper_cvtpd2ps_xmm(cpu_env, OP_PTR0, OP_PTR2);
+}
+}
+
+static void gen_VCVTPS2PD(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+if (s->vex_l) {
+gen_helper_cvtps2pd_ymm(cpu_env, OP_PTR0, OP_PTR2);
+} else {
+gen_helper_cvtps2pd_xmm(cpu_env, OP_PTR0, OP_PTR2);
+}
 }
 
 static void gen_VCVTPS2PH(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
@@ -1936,6 +1946,16 @@ static void gen_VCVTPS2PH(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *dec
 }
 }
 
+static void gen_VCVTSD2SS(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_helper_cvtsd2ss(cpu_env, OP_PTR0, OP_PTR1, OP_PTR2);
+}
+
+static void gen_VCVTSS2SD(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_helper_cvtss2sd(cpu_env, OP_PTR0, OP_PTR1, OP_PTR2);
+}
+
 static void gen_VCVTSI2Sx(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
 int vec_len = vector_len(s, decode);
-- 
2.39.2
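decode_by_prefix() selects one of four table entries according to the mandatory SSE prefix (none/66/F3/F2), which is what lets the 0x0F 0x5A group be "unpacked" into per-prefix handlers. The dispatch idea, reduced to a runnable toy (handlers just return their names; none of this is the real decoder API):

```c
#include <string.h>

/* Mandatory-prefix slots, in the order used by opcodes_0F5A above. */
enum prefix { P_NONE = 0, P_66 = 1, P_F3 = 2, P_F2 = 3 };

typedef const char *(*gen_fn)(void);

static const char *gen_cvtps2pd(void) { return "VCVTPS2PD"; }
static const char *gen_cvtpd2ps(void) { return "VCVTPD2PS"; }
static const char *gen_cvtss2sd(void) { return "VCVTSS2SD"; }
static const char *gen_cvtsd2ss(void) { return "VCVTSD2SS"; }

/* One handler per prefix, mirroring the four-entry opcodes_0F5A table. */
static const gen_fn opcodes_5a[4] = {
    gen_cvtps2pd, gen_cvtpd2ps, gen_cvtss2sd, gen_cvtsd2ss,
};

static const char *decode_5a(enum prefix p)
{
    return opcodes_5a[p]();
}
```

Splitting the group this way lets each handler pick its own operand size, which is the point of the fix: CVTPS2PD loads only a half register from memory.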




[Stable-8.1.2 47/57] target/i386: generalize operand size "ph" for use in CVTPS2PD

2023-10-06 Thread Michael Tokarev
From: Paolo Bonzini 

CVTPS2PD only loads a half-register for memory, like CVTPH2PS.  It can
reuse the "ph" packed half-precision size to load a half-register,
but rename it to "xh" because it is now a variation of "x" (it is not
used only for half-precision values).

Signed-off-by: Paolo Bonzini 
(cherry picked from commit a48b26978a090fe1f3f3e54319902d4ab56a6b3a)
Signed-off-by: Michael Tokarev 

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 8f93a239dd..43c39aad2a 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -337,7 +337,7 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
 [0x07] = X86_OP_ENTRY3(PHSUBSW,   V,x,  H,x,   W,x,  vex4 cpuid(SSSE3) mmx 
avx2_256 p_00_66),
 
 [0x10] = X86_OP_ENTRY2(PBLENDVB,  V,x, W,x,  vex4 cpuid(SSE41) 
avx2_256 p_66),
-[0x13] = X86_OP_ENTRY2(VCVTPH2PS, V,x, W,ph, vex11 cpuid(F16C) 
p_66),
+[0x13] = X86_OP_ENTRY2(VCVTPH2PS, V,x, W,xh, vex11 cpuid(F16C) 
p_66),
 [0x14] = X86_OP_ENTRY2(BLENDVPS,  V,x, W,x,  vex4 cpuid(SSE41) 
p_66),
 [0x15] = X86_OP_ENTRY2(BLENDVPD,  V,x, W,x,  vex4 cpuid(SSE41) 
p_66),
 /* Listed incorrectly as type 4 */
@@ -565,7 +565,7 @@ static const X86OpEntry opcodes_0F3A[256] = {
 [0x15] = X86_OP_ENTRY3(PEXTRW, E,w,  V,dq, I,b,  vex5 cpuid(SSE41) 
zext0 p_66),
 [0x16] = X86_OP_ENTRY3(PEXTR,  E,y,  V,dq, I,b,  vex5 cpuid(SSE41) 
p_66),
 [0x17] = X86_OP_ENTRY3(VEXTRACTPS, E,d,  V,dq, I,b,  vex5 cpuid(SSE41) 
p_66),
-[0x1d] = X86_OP_ENTRY3(VCVTPS2PH,  W,ph, V,x,  I,b,  vex11 cpuid(F16C) 
p_66),
+[0x1d] = X86_OP_ENTRY3(VCVTPS2PH,  W,xh, V,x,  I,b,  vex11 cpuid(F16C) 
p_66),
 
 [0x20] = X86_OP_ENTRY4(PINSRB, V,dq, H,dq, E,b,  vex5 cpuid(SSE41) 
zext2 p_66),
 [0x21] = X86_OP_GROUP0(VINSERTPS),
@@ -1104,7 +1104,7 @@ static bool decode_op_size(DisasContext *s, X86OpEntry 
*e, X86OpSize size, MemOp
 *ot = s->vex_l ? MO_256 : MO_128;
 return true;
 
-case X86_SIZE_ph: /* SSE/AVX packed half precision */
+case X86_SIZE_xh: /* SSE/AVX packed half register */
 *ot = s->vex_l ? MO_128 : MO_64;
 return true;
 
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index cb6b8bcf67..a542ec1681 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -92,7 +92,7 @@ typedef enum X86OpSize {
 /* Custom */
 X86_SIZE_d64,
 X86_SIZE_f64,
-X86_SIZE_ph, /* SSE/AVX packed half precision */
+X86_SIZE_xh, /* SSE/AVX packed half register */
 } X86OpSize;
 
 typedef enum X86CPUIDFeature {
-- 
2.39.2




[Stable-8.1.2 55/57] vdpa net: stop probing if cannot set features

2023-10-06 Thread Michael Tokarev
From: Eugenio Pérez 

Otherwise it continues the CVQ isolation probing.

Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Reported-by: Peter Maydell 
Signed-off-by: Eugenio Pérez 
Message-Id: <20230915170836.3078172-3-epere...@redhat.com>
Tested-by: Lei Yang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
(cherry picked from commit f1085882d028e5a1b227443cd6e96bbb63d66f43)
Signed-off-by: Michael Tokarev 

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 1c79e33170..cda6099ceb 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -1291,6 +1291,7 @@ static int vhost_vdpa_probe_cvq_isolation(int device_fd, 
uint64_t features,
 r = ioctl(device_fd, VHOST_SET_FEATURES, &features);
 if (unlikely(r)) {
 error_setg_errno(errp, errno, "Cannot set features");
+goto out;
 }
 
 r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, );
-- 
2.39.2
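The one-line fix restores the usual "goto out" error discipline: every failing step must jump to the shared cleanup path instead of falling through to the next step. A sketch of the pattern (names are made up; the step flags stand in for ioctl results):

```c
/* Returns 0 on success, -1 on failure; 'cleanups_run' counts how many
 * times the common teardown at 'out' executes. */
static int probe(int step1_ok, int step2_ok, int *cleanups_run)
{
    int r = -1;

    if (!step1_ok) {
        goto out;          /* without this goto, probing would continue */
    }
    if (!step2_ok) {
        goto out;
    }
    r = 0;
out:
    (*cleanups_run)++;     /* common teardown runs exactly once */
    return r;
}
```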




[Stable-8.1.2 00/57] Patch Round-up for stable 8.1.2, freeze on 2023-10-14

2023-10-06 Thread Michael Tokarev
The following patches are queued for QEMU stable v8.1.2:

  https://gitlab.com/qemu-project/qemu/-/commits/staging-8.1

Patch freeze is 2023-10-14, and the release is planned for 2023-10-16:

  https://wiki.qemu.org/Planning/8.1

Please respond here or CC qemu-sta...@nongnu.org on any additional patches
you think should (or shouldn't) be included in the release.

The changes which are staging for inclusion, with the original commit hash
from master branch, are given below the bottom line.

This release is supposed to finally fix some long-standing issues in the 8.1.x
series, by including commit 0d58c660689f "softmmu: Use async_run_on_cpu in
tcg_commit" and the follow-up series fixing issues it uncovered in other
areas, among other fixes.

Thanks!

/mjt

--
01* 7798f5c576d8 Nicholas Piggin:
   hw/ppc: Introduce functions for conversion between timebase and 
   nanoseconds
02* 47de6c4c2870 Nicholas Piggin:
   host-utils: Add muldiv64_round_up
03* eab0888418ab Nicholas Piggin:
   hw/ppc: Round up the decrementer interval when converting to ns
04* 8e0a5ac87800 Nicholas Piggin:
   hw/ppc: Avoid decrementer rounding errors
05* c8fbc6b9f2f3 Nicholas Piggin:
   target/ppc: Sign-extend large decrementer to 64-bits
06* febb71d543a8 Nicholas Piggin:
   hw/ppc: Always store the decrementer value
07* 30d0647bcfa9 Nicholas Piggin:
   hw/ppc: Reset timebase facilities on machine reset
08* ea62f8a5172c Nicholas Piggin:
   hw/ppc: Read time only once to perform decrementer write
09* 2529497cb6b2 Mikulas Patocka:
   linux-user/hppa: clear the PSW 'N' bit when delivering signals
10* 5b1270ef1477 Mikulas Patocka:
   linux-user/hppa: lock both words of function descriptor
11* 7b165fa16402 Li Zhijian:
   hw/cxl: Fix CFMW config memory leak
12* de5bbfc602ef Dmitry Frolov:
   hw/cxl: Fix out of bound array access
13* 56d1a022a77e Hanna Czenczek:
   file-posix: Clear bs->bl.zoned on error
14* 4b5d80f3d020 Hanna Czenczek:
   file-posix: Check bs->bl.zoned for zone info
15* deab5c9a4ed7 Hanna Czenczek:
   file-posix: Fix zone update in I/O error path
16* d31b50a15dd2 Hanna Czenczek:
   file-posix: Simplify raw_co_prw's 'out' zone code
17* 380448464dd8 Hanna Czenczek:
   tests/file-io-error: New test
18* c78edb563942 Anton Johansson:
   include/exec: Widen tlb_hit/tlb_hit_page()
19* 32b214384e1e Fabian Vogt:
   hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel
20* 903dbefc2b69 Peter Maydell:
   target/arm: Don't skip MTE checks for LDRT/STRT at EL0
21* c64023b0ba67 Thomas Huth:
   meson.build: Make keyutils independent from keyring
22* 0e5903436de7 Nicholas Piggin:
   accel/tcg: mttcg remove false-negative halted assertion
23* 7cfcc79b0ab8 Thomas Huth:
   hw/scsi/scsi-disk: Disallow block sizes smaller than 512 [CVE-2023-42467]
24* 0cb9c5880e6b Paolo Bonzini:
   ui/vnc: fix debug output for invalid audio message
25* 477b301000d6 Paolo Bonzini:
   ui/vnc: fix handling of VNC_FEATURE_XVP
26* cf02f29e1e38 Peter Xu:
   migration: Fix race that dest preempt thread close too early
27* 28a8347281e2 Fabiano Rosas:
   migration: Fix possible race when setting rp_state.error
28* 639decf52979 Fabiano Rosas:
   migration: Fix possible races when shutting down the return path
29* 7478fb0df914 Fabiano Rosas:
   migration: Fix possible race when shutting down to_dst_file
30* b3b101157d46 Fabiano Rosas:
   migration: Remove redundant cleanup of postcopy_qemufile_src
31* d50f5dc075cb Fabiano Rosas:
   migration: Consolidate return path closing code
32* ef796ee93b31 Fabiano Rosas:
   migration: Replace the return path retry logic
33* 36e9aab3c569 Fabiano Rosas:
   migration: Move return path cleanup to main migration thread
34* 0d58c660689f Richard Henderson:
   softmmu: Use async_run_on_cpu in tcg_commit
35* f47a90dacca8 Richard Henderson:
   accel/tcg: Avoid load of icount_decr if unused
36* 5d97e9463810 Richard Henderson:
   accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
37* 0ca41ccf1c55 Richard Henderson:
   accel/tcg: Track current value of can_do_io in the TB
38* a2f99d484c54 Richard Henderson:
   accel/tcg: Improve setting of can_do_io at start of TB
39* 200c1f904f46 Richard Henderson:
   accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
40* 18a536f1f8d6 Richard Henderson:
   accel/tcg: Always require can_do_io
41* 23fa6f56b33f Bastian Koppelmann:
   target/tricore: Fix RCPW/RRPW_INSERT insns for width = 0
42* 35ed01ba5448 Fabiano Rosas:
   optionrom: Remove build-id section
43* b86dc5cb0b41 Mark Cave-Ayland:
   esp: use correct type for esp_dma_enable() in sysbus_esp_gpio_demux()
44* 77668e4b9bca Mark Cave-Ayland:
   esp: restrict non-DMA transfer length to that of available data
45* be2b619a1734 Mark Cave-Ayland:
   scsi-disk: ensure that FORMAT UNIT commands are terminated
46 c01196bdddc2 Thomas Huth:
   subprojects/berkeley-testfloat-3: Update to fix a problem with compiler 
   warnings
47 a48b26978a09 Paolo Bonzini:
   target/i386: generalize operand size "ph" for use in CVTPS2PD
48 abd41884c530 Paolo 

Re: [PATCH v2 03/21] preallocate: Don't poll during permission updates

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 11:56, Kevin Wolf wrote:

Am 05.10.2023 um 21:55 hat Vladimir Sementsov-Ogievskiy geschrieben:

On 11.09.23 12:46, Kevin Wolf wrote:

When the permission related BlockDriver callbacks are called, we are in
the middle of an operation traversing the block graph. Polling in such a
place is a very bad idea because the graph could change in unexpected
ways. In the future, callers will also hold the graph lock, which is
likely to turn polling into a deadlock.

So we need to get rid of calls to functions like bdrv_getlength() or
bdrv_truncate() there as these functions poll internally. They are
currently used so that when no parent has write/resize permissions on
the image any more, the preallocate filter drops the extra preallocated
area in the image file and gives up write/resize permissions itself.

In order to achieve this without polling in .bdrv_check_perm, don't
immediately truncate the image, but only schedule a BH to do so. The
filter keeps the write/resize permissions a bit longer now until the BH
has executed.

There is one case in which delaying doesn't work: Reopening the image
read-only. In this case, bs->file will likely be reopened read-only,
too, so keeping write permissions a bit longer on it doesn't work. But
we can already cover this case in preallocate_reopen_prepare() and not
rely on the permission updates for it.


Hmm, now I found one more "future" case.

I now try to rebase my "[PATCH v7 0/7] blockdev-replace"
https://patchew.org/QEMU/20230421114102.884457-1-vsement...@yandex-team.ru/

And it breaks after this commit.

By coincidence, the blockdev-replace series uses exactly the "preallocate"
filter to test insertion/removal of filters. And removal is broken now.

Removing is done as follows:

1. We have filter inserted: disk0 -file-> filter -file-> file0

2. blockdev-replace replaces the file child of disk0, so we should get the picture*: 
disk0 -file-> file0 <-file- filter

3. blockdev-del filter


But step [2] fails: now the preallocate filter doesn't drop permissions
during the operation (it postpones this for a while), so the picture* is
impossible. The permission check fails.

Hmmm... Any idea how blockdev-replace and preallocate filter should
work :) ? Maybe, doing truncation in .drain_begin() will help? Will
try


Hm... What preallocate tries to do is really tricky...

Of course, the error is correct: this is an invalid configuration if
preallocate can still resize the image. So it would have to truncate the
file earlier, but the first time that preallocate knows of the change is
already too late to run requests.

Truncating on drain_begin feels more like a hack, but as long as it does
the job... Of course, this will have the preallocation truncated away on
events that have nothing to do with removing the filter. It's not
necessarily a disaster because preallocation is only an optimisation,
but it doesn't feel great.


Hmm, yes, that's not good.



Maybe let's take a step back: Which scenario is the preallocate driver
meant for and why do we even need to truncate the image file after
removing the filter? I suppose the filter doesn't make sense with raw
images because these are fixed size anyway, and pretty much any other
image format should be able to tolerate a permanently rounded up file
size. As long as you don't write to the preallocated area, it shouldn't
take space either on any sane filesystem.

Hmm, actually both VHD and VMDK can have footers, better avoid it with
those... But if truncating the image file on close is critical, what do
you do on crashes? Maybe preallocate should just not be considered
compatible with these formats?



Originally the preallocate filter was made to be used with qcow2 on some 
proprietary storage, where:

1. Allocating one big chunk works a lot faster than allocating several smaller 
chunks
2. Holes are not free and/or file length is not free, so we really want to 
truncate the file back on close

Den, correct me if I'm wrong.

The good thing is that in this scenario we don't need to remove the filter at 
runtime, so there is no problem.


Now I think that the generic solution is just to add a new handler 
.bdrv_pre_replace, so blockdev-replace may work as follows:

drain_begin

call .bdrv_pre_replace for all affected nodes

do the replace

drain_end

And the preallocate filter would do the truncation in this .bdrv_pre_replace 
handler and set a flag meaning that we have nothing to truncate (the flag is 
automatically cleared in the .drained_end handler). Then, during the permission 
update, if we see the nothing-to-truncate flag, we can drop permissions 
immediately.

But this difficulty may be postponed, and I can just document that the 
preallocate filter doesn't support removal at runtime (and modify the test to 
use another filter, or just not remove the preallocate filter).
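The proposed scheme can be modeled in a few lines: truncate in a pre-replace hook, remember that nothing is left to truncate, let the next permission update drop permissions immediately, and invalidate the flag when draining ends. All names below are hypothetical; this is a toy model of the proposal, not an existing QEMU API:

```c
#include <stdbool.h>

struct prealloc {
    long file_len;              /* current file length */
    long data_len;              /* length actually containing data */
    bool nothing_to_truncate;   /* valid only while drained */
    bool have_resize_perm;
};

/* .bdrv_pre_replace: drop the preallocated tail while it is still safe
 * to issue a truncate, and remember that we did. */
static void pre_replace(struct prealloc *s)
{
    s->file_len = s->data_len;        /* "truncate" */
    s->nothing_to_truncate = true;
}

/* Permission update: with the flag set there is no pending truncation,
 * so the resize permission can be given up right away. */
static void update_perm(struct prealloc *s)
{
    if (s->nothing_to_truncate) {
        s->have_resize_perm = false;
    }
}

/* .drained_end: requests may grow the file again, so forget the flag. */
static void drained_end(struct prealloc *s)
{
    s->nothing_to_truncate = false;
}
```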

--
Best regards,
Vladimir




Re: [PATCH v4 01/15] hw/pci: Add a pci_setup_iommu_ops() helper

2023-10-06 Thread Joao Martins
On 06/10/2023 18:09, Cédric Le Goater wrote:
>>> Getting acks from everyone will be difficult since some PHBs are orphans.
>>
>> [...] This is what gets me a bit hesitant
> 
> orphans shouldn't be an issue, nor the PPC emulated machines. We will see
> what other maintainers have to say.

How about this as a compromise: have a separate patch at the end of the
series that converts every other PHB? This way the rest can iterate while we
await maintainers' feedback, without potentially blocking everything else.

Also, one other patch I'll add to this series at the end is this one:

https://lore.kernel.org/qemu-devel/20230908120521.50903-1-joao.m.mart...@oracle.com/

This way the vIOMMU series is a complete thing for old and new guests, as
opposed to just new.

Joao



Re: [v3] Help wanted for enabling -Wshadow=local

2023-10-06 Thread Thomas Huth

On 06/10/2023 18.18, Thomas Huth wrote:

On 06/10/2023 16.45, Markus Armbruster wrote:

Local variables shadowing other local variables or parameters make the
code needlessly hard to understand.  Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".

Enabling -Wshadow would prevent bugs like this one.  But we have to
clean up all the offenders first.

Quite a few people responded to my calls for help.  Thank you so much!

I'm collecting patches in my git repo at
https://repo.or.cz/qemu/armbru.git in branch shadow-next.  All but the
last two are in a pending pull request.

My test build is down to seven files with warnings.  "[PATCH v2 0/3]
hexagon: GETPC() and shadowing fixes" takes care of four, but it needs a
rebase.

Remaining three:

 In file included from ../hw/display/virtio-gpu-virgl.c:19:
 ../hw/display/virtio-gpu-virgl.c: In function ‘virgl_cmd_submit_3d’:
 /work/armbru/qemu/include/hw/virtio/virtio-gpu.h:228:16: warning: 
declaration of ‘s’ shadows a previous local [-Wshadow=compatible-local]
   228 | size_t 
s;   \

   |    ^
 ../hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro 
‘VIRTIO_GPU_FILL_CMD’

   215 | VIRTIO_GPU_FILL_CMD(cs);
   | ^~~
 ../hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration 
is here

   213 | size_t s;
   |    ^

 In file included from ../contrib/vhost-user-gpu/virgl.h:18,
  from ../contrib/vhost-user-gpu/virgl.c:17:
 ../contrib/vhost-user-gpu/virgl.c: In function ‘virgl_cmd_submit_3d’:
 ../contrib/vhost-user-gpu/vugpu.h:167:16: warning: declaration of ‘s’ 
shadows a previous local [-Wshadow=compatible-local]
   167 | size_t 
s;   \

   |    ^
 ../contrib/vhost-user-gpu/virgl.c:203:5: note: in expansion of macro 
‘VUGPU_FILL_CMD’

   203 | VUGPU_FILL_CMD(cs);
   | ^~
 ../contrib/vhost-user-gpu/virgl.c:201:12: note: shadowed declaration 
is here

   201 | size_t s;
   |    ^

 ../contrib/vhost-user-gpu/vhost-user-gpu.c: In function 
‘vg_resource_flush’:
 ../contrib/vhost-user-gpu/vhost-user-gpu.c:837:29: warning: 
declaration of ‘i’ shadows a previous local [-Wshadow=local]

   837 | pixman_image_t *i =
   | ^
 ../contrib/vhost-user-gpu/vhost-user-gpu.c:757:9: note: shadowed 
declaration is here

   757 | int i;
   | ^

Gerd, Marc-André, or anybody else?

More warnings may lurk in code my test build doesn't compile.  Need a
full CI build with -Wshadow=local to find them.  Anybody care to kick
one off?


I ran a build here (with -Werror enabled, so that it's easier to see where 
it breaks):


  https://gitlab.com/thuth/qemu/-/pipelines/1028023489

... but I didn't see any additional spots in the logs beside the ones that 
you already listed.


After adding two more patches to fix the above warnings, things look pretty 
good:


 https://gitlab.com/thuth/qemu/-/pipelines/1028413030

There are just some warnings left in the BSD code, as Warner already 
mentioned in his reply to v2 of your mail:


 https://gitlab.com/thuth/qemu/-/jobs/5241420713

 Thomas




Re: [PATCH v3] hw/cxl: Add QTG _DSM support for ACPI0017 device

2023-10-06 Thread Dan Williams
Jonathan Cameron wrote:
[..]
> > > > 
> > > > what does "a WORD" mean is unclear - do you match what hardware does
> > > > when you use aml_buffer? pls mention this in commit log, and
> > > > show actual hardware dump for comparison.  
> > > The CXL spec says WORD without much qualification. It's a 16bit value 
> > > AFAICT. I'll add additional comment. Currently I do not have access to 
> > > actual hardware unfortunately. I'm constructing this purely based on spec 
> > > description.  
> > 
> 
> WORD does seem to be clearly defined in the ACPI spec as uint16
> and as this is describing a DSDT blob I think we can be safe that
> it means that.  Also lines up with the fixed sizes in CEDT.

I am not sure it means that; the ACPI specification indicates that packages
can hold "integers", and integers can be any size up to 64 bits.

> > It's not clear buffer is actually word though.
> > 
> > Jonathan do you have hardware access?
> 
> No.  +CC linux-cxl to see if anyone else has hardware + BIOS with
> QTG implemented...  There will be lots of implementations soon so I'd make
> no guarantee they will all interpret this the same.
> 
> Aim here is Linux kernel enablement support, and unfortunately that almost
> always means we are ahead of easy availability of hardware. If it exists
> its probably prototypes in a lab, in which case no guarantees on the
> BIOS tables presented...

From a pre-production system the ASL is putting the result of SizeOf
directly into the first element in the return package:

Local1 = SizeOf (CXQI)
Local0 [0x00] = Local1

...where CXQI appears to be a fixed table of QTG ids for the platform, and
SizeOf() returns an integer type with no restriction that it be a 16-bit
value.

So I think the specification is misleading by specifying WORD when ACPI
"Package" objects convey "Integers" where the size of the integer can be a
u8 to a u64.

> > Also, possible to get clarification from the spec committee?
> 
> I'm unclear what we are clarifying.  As I read it current implementation
> is indeed wrong and I failed to notice this earlier :(
> 
> Ultimately data encoding (ACPI 6.5 section 20.2.3 Data Objects Encoding)
> should I think be
> 
> 0x0B 0x00 0x00
> WordPrefix then data : note if you try a 0x0001 and feed
> it to iasl it will squash it into a byte instead and indeed if you
> force the binary to the above it will decode it as 0x but recompile
> that and you will be back to just
> 0x00 (as bytes don't need a prefix..)
> 
> Currently it would be.
> 0x11 0x05 0x0a 0x02 0x00 0x01
> BufferOp 
> 
> Btw I built a minimal DSDT file to test this and iasl isn't happy with
> the fact the _DSM doesn't return anything at all if ARG2 isn't 1 or 2.
> Whilst that's impdef territory as not covered by the CXL spec, we should
> return 'something' ;)
> 
> Anyhow, to do this as per the CXL spec we need an aml_word()
> that just implements the word case from aml_int()

If I understand correctly, aml_int() is sufficient since this is not a
Field() where access size matters.
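To make the byte sequences discussed above concrete, here is a rough Python model of how an AML integer picks its encoding per ACPI 6.5 section 20.2.3 — a sketch with an invented function name, not QEMU's aml_int():

```python
def aml_int_bytes(value: int) -> bytes:
    """Sketch of AML integer encoding: smaller values get shorter forms."""
    if value == 0x00:
        return b"\x00"                                # ZeroOp, no prefix
    if value == 0x01:
        return b"\x01"                                # OneOp, no prefix
    if value <= 0xFF:
        return b"\x0a" + value.to_bytes(1, "little")  # BytePrefix
    if value <= 0xFFFF:
        return b"\x0b" + value.to_bytes(2, "little")  # WordPrefix
    if value <= 0xFFFFFFFF:
        return b"\x0c" + value.to_bytes(4, "little")  # DWordPrefix
    return b"\x0e" + value.to_bytes(8, "little")      # QWordPrefix

# A value of 0 encodes as a bare 0x00 with no prefix, which is why a
# forced word encoding gets "squashed" back down on an iasl recompile.
print(aml_int_bytes(0).hex())       # 00
print(aml_int_bytes(0x1234).hex())  # 0b3412
```

This value-dependent squashing is the crux of the thread: a WORD encoding forced into the binary does not survive a decompile/recompile round trip through iasl.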

> Chances are that it never matters if we get an encoding that is
> only single byte (because the value is small) but who knows what
> other AML parsers might do.

I expect the reason WORD is used in the specification is because of the
size of the QTG ID field in the CFMWS. ACPI could support returning more
than USHRT_MAX in an Integer field in a Package, but those IDs above
USHRT_MAX could not be represented in CFMWS.

[..]
> > but again it is not clear at all what does spec mean.
> > an integer up to 0xf? a buffer as you did? just two bytes?
> > 
> > could be any of these.
> 
> The best we have in the way of description is the multiple QTG example
> where it's
> Package() {2, 1} combined with it being made up of WORDs
> 
> whereas in general that will get squashed to a pair of bytes...
> So I'm thinking WORDs is max size rather than only size but
> given ambiguity we should encode them as words anyway.

My assertion is that for the Package format the size of the integer does
not matter because the Package.Integer type can convey up to 64-bit values.
For being compatible with the *usage* of that max id, values that do not
fit into 16-bits are out of spec, but nothing prevents the Package from
using any size integer, afaics.



Re: [PATCH v7 14/15] scripts: add python_qmp_updater.py

2023-10-06 Thread Vladimir Sementsov-Ogievskiy

On 06.10.23 20:01, Eric Blake wrote:

On Fri, Oct 06, 2023 at 06:41:24PM +0300, Vladimir Sementsov-Ogievskiy wrote:

A script, to update the pattern

 result = self.vm.qmp(...)
 self.assert_qmp(result, 'return', {})

(and some similar ones) into

 self.vm.cmd(...)

Used in the next commit
 "python: use vm.cmd() instead of vm.qmp() where appropriate"

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  scripts/python_qmp_updater.py | 136 ++
  1 file changed, 136 insertions(+)
  create mode 100755 scripts/python_qmp_updater.py

diff --git a/scripts/python_qmp_updater.py b/scripts/python_qmp_updater.py
new file mode 100755
index 00..494a169812
--- /dev/null
+++ b/scripts/python_qmp_updater.py
@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+#
+# Intended usage:
+#
+# git grep -l '\.qmp(' | xargs ./scripts/python_qmp_updater.py
+#
+
+import re
+import sys
+from typing import Optional
+
+start_reg = re.compile(r'^(?P<padding> *)(?P<res>\w+) = (?P<vm>.*).qmp\(',
+   flags=re.MULTILINE)
+
+success_reg_templ = re.sub('\n *', '', r"""
+(\n*{padding}(?P<comment>\#.*$))?
+\n*{padding}
+(
+self.assert_qmp\({res},\ 'return',\ {{}}\)
+|
+assert\ {res}\['return'\]\ ==\ {{}}
+|
+assert\ {res}\ ==\ {{'return':\ {{
+|
+self.assertEqual\({res}\['return'\],\ {{}}\)
+)""")


We may find other patterns, but this is a nice way to capture the most
common ones and a simple place to update if we find another one.

I did a quick grep for 'assert.*return' and noticed things like:

tests/qemu-iotests/056:self.assert_qmp(res, 'return', {})


this one is

res = self.vm.qmp(cmd, **kwargs)
if error:
    self.assert_qmp(res, 'error/desc', error)
    return False

so here we can't just use cmd()


tests/qemu-iotests/056:self.assert_qmp(res, 'return', [])


yes, that's a result of query- command, caller wants the exact result.
Actually that's a check for "no block-jobs".



This script only simplifies the {} form, not the []; but that makes
sense: when we are testing a command known to return an array rather
than nothing, we still want to check if the array is empty, and not
just that the command didn't crash.  We are only simplifying the
commands that check for nothing in particular returned, on the grounds
that not crashing was probably good enough, and explicitly checking
that nothing extra was returned is not worth the effort.
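The rewrite the script performs can be illustrated with a heavily simplified, self-contained sketch (the regex and helper below are illustrative, not the script's exact code):

```python
import re

# Match "res = self.vm.qmp(...)" immediately followed by
# "self.assert_qmp(res, 'return', {})" and collapse the pair
# into a single "self.vm.cmd(...)" call.
pattern = re.compile(
    r"^(?P<pad> *)(?P<res>\w+) = (?P<vm>self\.vm)\.qmp\((?P<args>[^)]*)\)\n"
    r"(?P=pad)self\.assert_qmp\((?P=res), 'return', \{\}\)",
    flags=re.MULTILINE)

def simplify(text: str) -> str:
    return pattern.sub(
        lambda m: f"{m.group('pad')}{m.group('vm')}.cmd({m.group('args')})",
        text)

before = (
    "        result = self.vm.qmp('quit')\n"
    "        self.assert_qmp(result, 'return', {})"
)
print(simplify(before))  # "        self.vm.cmd('quit')"
```

Anything that does not match the pattern pair — such as the `'error/desc'` check from test 056 above — passes through unchanged, which is why those call sites need manual attention.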



[..]





Reviewed-by: Eric Blake 



Thanks

--
Best regards,
Vladimir




[PATCH] contrib/vhost-user-gpu: Fix compiler warning when compiling with -Wshadow

2023-10-06 Thread Thomas Huth
Rename some variables to avoid compiler warnings when compiling
with -Wshadow=local.

Signed-off-by: Thomas Huth 
---
 contrib/vhost-user-gpu/vugpu.h  | 8 
 contrib/vhost-user-gpu/vhost-user-gpu.c | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/contrib/vhost-user-gpu/vugpu.h b/contrib/vhost-user-gpu/vugpu.h
index 509b679f03..5cede45134 100644
--- a/contrib/vhost-user-gpu/vugpu.h
+++ b/contrib/vhost-user-gpu/vugpu.h
@@ -164,12 +164,12 @@ struct virtio_gpu_ctrl_command {
 };
 
 #define VUGPU_FILL_CMD(out) do {\
-size_t s;   \
-s = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num, 0,  \
+size_t s_;  \
+s_ = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num, 0, \
               &out, sizeof(out));  \
-if (s != sizeof(out)) { \
+if (s_ != sizeof(out)) {\
 g_critical("%s: command size incorrect %zu vs %zu", \
-   __func__, s, sizeof(out));   \
+   __func__, s_, sizeof(out));  \
 return; \
 }   \
 } while (0)
diff --git a/contrib/vhost-user-gpu/vhost-user-gpu.c 
b/contrib/vhost-user-gpu/vhost-user-gpu.c
index aa304475a0..bb41758e34 100644
--- a/contrib/vhost-user-gpu/vhost-user-gpu.c
+++ b/contrib/vhost-user-gpu/vhost-user-gpu.c
@@ -834,7 +834,7 @@ vg_resource_flush(VuGpu *g,
 .width = width,
 .height = height,
 };
-pixman_image_t *i =
+pixman_image_t *img =
 pixman_image_create_bits(pixman_image_get_format(res->image),
  msg->payload.update.width,
  msg->payload.update.height,
@@ -842,11 +842,11 @@ vg_resource_flush(VuGpu *g,
   payload.update.data),
  width * bpp);
 pixman_image_composite(PIXMAN_OP_SRC,
-   res->image, NULL, i,
+   res->image, NULL, img,
extents->x1, extents->y1,
0, 0, 0, 0,
width, height);
-pixman_image_unref(i);
+pixman_image_unref(img);
 vg_send_msg(g, msg, -1);
 g_free(msg);
 }
-- 
2.41.0




Re: [PULL 00/21] vfio queue

2023-10-06 Thread Eric Auger
Hi Cédric,

On 10/6/23 19:08, Cédric Le Goater wrote:
> Hello Eric,
> 
> On 10/6/23 14:25, Eric Auger wrote:
>> Hi Cédric,
>>
>> On 10/6/23 13:46, Eric Auger wrote:
>>> Hi Cédric,
>>>
>>> On 10/6/23 13:42, Eric Auger wrote:
>>>> Hi Cédric,
>>>>
>>>> On 10/6/23 12:33, Cédric Le Goater wrote:
>>>>> On 10/6/23 08:19, Cédric Le Goater wrote:
>>>>>> The following changes since commit
>>>>>> 2f3913f4b2ad74baeb5a6f1d36efbd9ecdf1057d:
>>>>>>
>>>>>>     Merge tag 'for_upstream' of
>>>>>> https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
>>>>>> (2023-10-05 09:01:01 -0400)
>>>>>>
>>>>>> are available in the Git repository at:
>>>>>>
>>>>>>     https://github.com/legoater/qemu/ tags/pull-vfio-20231006
>>>>>>
>>>>>> for you to fetch changes up to
>>>>>> 6e86aaef9ac57066aa923211a164df95b7b3cdf7:
>>>>>>
>>>>>>     vfio/common: Move legacy VFIO backend code into separate
>>>>>> container.c (2023-10-05 22:04:52 +0200)
>>>>>>
>>>>>> 
>>>>>> vfio queue:
>>>>>>
>>>>>> * Fix for VFIO display when using Intel vGPUs
>>>>>> * Support for dynamic MSI-X
>>>>>> * Preliminary work for IOMMUFD support
>>>>>
>>>>> Stefan,
>>>>>
>>>>> I just did some tests on z with passthough devices (PCI and AP) and
>>>>> the series is not bisectable. QEMU crashes at patch  :
>>>>>
>>>>>    "vfio/pci: Introduce vfio_[attach/detach]_device".
>>>>>
>>>>> Also, with everything applied, the guest fails to start with :
>>>>>
>>>>>   vfio: IRQ 0 not available (number of irqs 0)
>>>>>
>>>>> So, please hold on and sorry for the noise. I will start digging
>>>>> on my side.
>>>> I just tested with the head on vfio/pci: Introduce
>>>> vfio_[attach/detach]_device, with PCIe assignment on ARM and I fail to
>>>> reproduce the crash.
>>>>
>>>> Do you try hotplug or something simpler?
>>>
>>> also works for me with hotplug/hotunplug. Please let me know if I can
>>> help.
>>
>> I think this is related to the error handling.
>>
>> if you hotplug a vfio-device and if this encounters an error,
>> vfio_realize fails and you end at the 'error' label where the name of
>> the device is freed: g_free(vbasedev->name);
>>
>> However I see that vfio_finalize is called (Zhenzhong warned me !!),
>> which calls vfio_pci_put_device,
>> which calls g_free(vdev->vbasedev.name) again.
>> Please try adding
>> vdev->vbasedev.name = NULL after freeing the name in vfio_realize's error:
>> label, to see if it fixes the crash.
>>
>> Then wrt irq stuff, I would be tempted to say it sounds unrelated to the
>> iommufd prereq series but well.
>>
>> Please let me know how you want me to fix that mess, sorry.
> 
> So, the issue was a bit complex to dig because it only crashed
> with a s390 guest under libvirt. This is my all-in-one combo VM
> which has 2 PCI, 2 AP, 1 CCW passthrough devices and the issue
> is in AP I think.
> 
> vfio_ap_realize lacks :
> 
>   @@ -188,6 +188,7 @@ static void vfio_ap_realize(DeviceState
>    error_report_err(err);
>    }
>      +    return;
Hum indeed! Thanks for fixing that.
>    error:
>    error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name);
>    g_free(vbasedev->name);
> 
> 
> No error was to be returned, and since everything was fine, the error
> path generated a lot of mess indeed !
> 
> If you are ok with the above, I will squash the change in the
> related patch and send a v2.
Unfortunately this is not sufficient. There is another regression
(crash) on a double free of vbasedev->name as I reported before. I was
able to hit it on a failing hotplug.  How do you want me to send the
fix? I can resend the whole series of just fixes on the related patches.

Thanks

Eric

> Thanks,
> 
> C.
> 
> 
> 
> 
>>
>> Eric
>>
>>
>>>
>>> Eric
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>>

Re: [PATCH v2 1/2] migration: Fix rdma migration failed

2023-10-06 Thread Peter Xu
On Fri, Oct 06, 2023 at 11:52:10AM -0400, Peter Xu wrote:
> On Tue, Oct 03, 2023 at 08:57:07PM +0200, Juan Quintela wrote:
> > commit c638f66121ce30063fbf68c3eab4d7429cf2b209
> > Author: Juan Quintela 
> > Date:   Tue Oct 3 20:53:38 2023 +0200
> > 
> > migration: Non multifd migration don't care about multifd flushes
> > 
> > RDMA was having trouble because
> > migrate_multifd_flush_after_each_section() can only be true or false,
> > but we don't want to send any flush when we are not in multifd
> > migration.
> > 
> > CC: Fabiano Rosas 
> > Reported-by: Li Zhijian 
> > Signed-off-by: Juan Quintela 
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index e4bfd39f08..716cef6425 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1387,7 +1387,8 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> >  pss->page = 0;
> >  pss->block = QLIST_NEXT_RCU(pss->block, next);
> >  if (!pss->block) {
> > -if (!migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() &&
> > +!migrate_multifd_flush_after_each_section()) {
> >  QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
> >  int ret = multifd_send_sync_main(f);
> >  if (ret < 0) {
> > @@ -3064,7 +3065,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >  return ret;
> >  }
> >  
> > -if (!migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> >  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >  }
> >  
> > @@ -3176,7 +3177,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >  out:
> >  if (ret >= 0
> >  && migration_is_setup_or_active(migrate_get_current()->state)) {
> > -if (migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
> >  ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
> >  if (ret < 0) {
> >  return ret;
> > @@ -3253,7 +3254,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >  return ret;
> >  }
> >  
> > -if (!migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> >  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >  }
> >  qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> > @@ -3760,7 +3761,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> >  break;
> >  case RAM_SAVE_FLAG_EOS:
> >  /* normal exit */
> > -if (migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
> >  multifd_recv_sync_main();
> >  }
> >  break;
> > @@ -4038,7 +4039,8 @@ static int ram_load_precopy(QEMUFile *f)
> >  break;
> >  case RAM_SAVE_FLAG_EOS:
> >  /* normal exit */
> > -if (migrate_multifd_flush_after_each_section()) {
> > +if (migrate_multifd() &&
> > +migrate_multifd_flush_after_each_section()) {
> >  multifd_recv_sync_main();
> >  }
> >  break;
> 
> Reviewed-by: Peter Xu 
> 
> Did you forget to send this out formally?  Even if f1de309792d6656e landed
> (which, IMHO, shouldn't..), but IIUC rdma is still broken..

Two more things to mention..

$ git tag --contains 294e5a4034e81b

It tells me v8.1 is also affected.. so we may want to copy stable too for
8.1, for whichever patch we want to merge (either yours or Zhijian's)..

Meanwhile, it also breaks migration as long as user specifies the new
behavior.. for example: v8.1->v8.0 will break with this:

$ (echo "migrate exec:cat>out"; echo "quit") | ./qemu-v8.1.1 -M pc-q35-8.0 -global migration.multifd-flush-after-each-section=false -monitor stdio
QEMU 8.1.1 monitor - type 'help' for more information
VNC server running on ::1:5900
(qemu) migrate exec:cat>out
(qemu) quit

$ ./qemu-v8.0.5 -M pc-q35-8.0 -incoming "exec:cat

Re: [PATCH v4 01/15] hw/pci: Add a pci_setup_iommu_ops() helper

2023-10-06 Thread Cédric Le Goater

Hello Joao,


I think you should first convert all PHBs to PCIIOMMUOps to avoid all the
tests as below and adapt pci_setup_iommu_ops() with the new parameter.



OK, that's Yi's original patch:

https://lore.kernel.org/all/20210302203827.437645-5-yi.l@intel.com/

The reason I went with this one is that 1) it might take eons to get every
single IOMMU maintainer's ack; and 2) it would allow each IOMMU to move at its
own speed, especially as I can't test most of the other ones. Essentially an
iterative, rather than invasive, change? Does that make sense?


I think it is ok to make global changes to replace a function by a struct
of ops. This is not major (unless the extra indirection has a major perf
impact on some platforms).


It should be a mechanical change. As the pci_setup_iommu_ops() should be
functionally equivalent to pci_setup_iommu() [...]


Thanks for going back to the previous proposal.


Getting acks from everyone will be difficult since some PHBs are orphans.


[...] This is what gets me a bit hesitant


orphans shouldn't be an issue, nor the PPC emulated machines. We will see
what other maintainers have to say.

Thanks,

C.
 





Re: [PULL 00/21] vfio queue

2023-10-06 Thread Cédric Le Goater

Hello Eric,

On 10/6/23 14:25, Eric Auger wrote:

Hi Cédric,

On 10/6/23 13:46, Eric Auger wrote:

Hi Cédric,

On 10/6/23 13:42, Eric Auger wrote:

Hi Cédric,

On 10/6/23 12:33, Cédric Le Goater wrote:

On 10/6/23 08:19, Cédric Le Goater wrote:

The following changes since commit
2f3913f4b2ad74baeb5a6f1d36efbd9ecdf1057d:

    Merge tag 'for_upstream' of
https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
(2023-10-05 09:01:01 -0400)

are available in the Git repository at:

    https://github.com/legoater/qemu/ tags/pull-vfio-20231006

for you to fetch changes up to 6e86aaef9ac57066aa923211a164df95b7b3cdf7:

    vfio/common: Move legacy VFIO backend code into separate
container.c (2023-10-05 22:04:52 +0200)


vfio queue:

* Fix for VFIO display when using Intel vGPUs
* Support for dynamic MSI-X
* Preliminary work for IOMMUFD support


Stefan,

I just did some tests on z with passthough devices (PCI and AP) and
the series is not bisectable. QEMU crashes at patch  :

   "vfio/pci: Introduce vfio_[attach/detach]_device".

Also, with everything applied, the guest fails to start with :

  vfio: IRQ 0 not available (number of irqs 0)

So, please hold on and sorry for the noise. I will start digging
on my side.

I just tested with the head on vfio/pci: Introduce
vfio_[attach/detach]_device, with PCIe assignment on ARM and I fail to
reproduce the crash.

Do you try hotplug or something simpler?


also works for me with hotplug/hotunplug. Please let me know if I can help.


I think this is related to the error handling.

if you hotplug a vfio-device and if this encounters an error,
vfio_realize fails and you end at the 'error' label where the name of
the device is freed: g_free(vbasedev->name);

However I see that vfio_finalize is called (Zhenzhong warned me !!),
which calls vfio_pci_put_device,
which calls g_free(vdev->vbasedev.name) again.
Please try adding
vdev->vbasedev.name = NULL after freeing the name in vfio_realize's error:
label, to see if it fixes the crash.

Then wrt irq stuff, I would be tempted to say it sounds unrelated to the
iommufd prereq series but well.

Please let me know how you want me to fix that mess, sorry.


So, the issue was a bit complex to dig because it only crashed
with a s390 guest under libvirt. This is my all-in-one combo VM
which has 2 PCI, 2 AP, 1 CCW passthrough devices and the issue
is in AP I think.

vfio_ap_realize lacks :

  @@ -188,6 +188,7 @@ static void vfio_ap_realize(DeviceState
   error_report_err(err);
   }
   
  +return;

   error:
   error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name);
   g_free(vbasedev->name);


No error was to be returned, and since everything was fine, the error
path generated a lot of mess indeed !

If you are ok with the above, I will squash the change in the
related patch and send a v2.

Thanks,

C.






Eric




Eric


Thanks

Eric




Thanks,

C.



Alex Williamson (1):
    vfio/display: Fix missing update to set backing fields

Eric Auger (7):
    scripts/update-linux-headers: Add iommufd.h
    vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
    vfio/common: Introduce vfio_container_add|del_section_window()
    vfio/pci: Introduce vfio_[attach/detach]_device
    vfio/platform: Use vfio_[attach/detach]_device
    vfio/ap: Use vfio_[attach/detach]_device
    vfio/ccw: Use vfio_[attach/detach]_device

Jing Liu (4):
    vfio/pci: detect the support of dynamic MSI-X allocation
    vfio/pci: enable vector on dynamic MSI-X allocation
    vfio/pci: use an invalid fd to enable MSI-X
    vfio/pci: enable MSI-X in interrupt restoring on dynamic
allocation

Yi Liu (2):
    vfio/common: Move IOMMU agnostic helpers to a separate file
    vfio/common: Move legacy VFIO backend code into separate
container.c

Zhenzhong Duan (7):
    vfio/pci: rename vfio_put_device to vfio_pci_put_device
    linux-headers: Add iommufd.h
    vfio/common: Extract out vfio_kvm_device_[add/del]_fd
    vfio/common: Move VFIO reset handler registration to a group
agnostic function
    vfio/common: Introduce a per container device list
    vfio/common: Store the parent container in VFIODevice
    vfio/common: Introduce a global VFIODevice list

   hw/vfio/pci.h   |    1 +
   include/hw/vfio/vfio-common.h   |   60 +-
   linux-headers/linux/iommufd.h   |  444 +
   hw/vfio/ap.c    |   69 +-
   hw/vfio/ccw.c   |  122 +--
   hw/vfio/common.c    | 1885
+++
   hw/vfio/container.c | 1161 
   hw/vfio/display.c   |    2 +
   hw/vfio/helpers.c   |  612 +
   hw/vfio/pci.c   |  194 ++--
   hw/vfio/platform.c  |   43 

Re: [PATCH v7 15/15] python: use vm.cmd() instead of vm.qmp() where appropriate

2023-10-06 Thread Eric Blake
On Fri, Oct 06, 2023 at 06:41:25PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> In many cases we just want an effect of qmp command and want to raise
> on failure.  Use vm.cmd() method which does exactly this.
> 
> The commit is generated by command
> 
>   git grep -l '\.qmp(' | xargs ./scripts/python_qmp_updater.py
> 
> And then, fix self.assertRaises to expect ExecuteError exception in
> tests/qemu-iotests/124
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/avocado/vnc.py  |  16 +-
>  tests/qemu-iotests/030| 168 +++---
>  tests/qemu-iotests/040| 172 +++
>  tests/qemu-iotests/041| 483 --
>  tests/qemu-iotests/045|  15 +-
>  tests/qemu-iotests/055|  62 +--
>  tests/qemu-iotests/056|  77 ++-
>  tests/qemu-iotests/093|  42 +-
>  tests/qemu-iotests/118| 225 
>  tests/qemu-iotests/124| 102 ++--

Given you called out a post-modification after the script, I checked
that out specifically (everything else does indeed match your script's
actions)


> +++ b/tests/qemu-iotests/124
> @@ -24,6 +24,7 @@
>  import os
>  import iotests
>  from iotests import try_remove
> +from qemu.qmp.qmp_client import ExecuteError
>  

>  
> @@ -504,9 +500,8 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
>  target1 = self.prepare_backup(dr1bm0)
>  
>  # Re-run the exact same transaction.
> -result = self.vm.qmp('transaction', actions=transaction,
> - properties={'completion-mode':'grouped'})
> -self.assert_qmp(result, 'return', {})
> +self.vm.cmd('transaction', actions=transaction,
> +properties={'completion-mode':'grouped'})
>  
>  # Both should complete successfully this time.
>  self.assertTrue(self.wait_qmp_backup(drive0['id']))
> @@ -567,7 +562,7 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
>  The granularity must always be a power of 2.
>  '''
>  self.assert_no_active_block_jobs()
> -self.assertRaises(AssertionError, self.add_bitmap,
> +self.assertRaises(ExecuteError, self.add_bitmap,
>'bitmap0', self.drives[0],
>granularity=64000)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH v7 14/15] scripts: add python_qmp_updater.py

2023-10-06 Thread Eric Blake
On Fri, Oct 06, 2023 at 06:41:24PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> A script, to update the pattern
> 
> result = self.vm.qmp(...)
> self.assert_qmp(result, 'return', {})
> 
> (and some similar ones) into
> 
> self.vm.cmd(...)
> 
> Used in the next commit
> "python: use vm.cmd() instead of vm.qmp() where appropriate"
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  scripts/python_qmp_updater.py | 136 ++
>  1 file changed, 136 insertions(+)
>  create mode 100755 scripts/python_qmp_updater.py
> 
> diff --git a/scripts/python_qmp_updater.py b/scripts/python_qmp_updater.py
> new file mode 100755
> index 00..494a169812
> --- /dev/null
> +++ b/scripts/python_qmp_updater.py
> @@ -0,0 +1,136 @@
> +#!/usr/bin/env python3
> +#
> +# Intended usage:
> +#
> +# git grep -l '\.qmp(' | xargs ./scripts/python_qmp_updater.py
> +#
> +
> +import re
> +import sys
> +from typing import Optional
> +
> +start_reg = re.compile(r'^(?P<padding> *)(?P<res>\w+) = (?P<vm>.*).qmp\(',
> +   flags=re.MULTILINE)
> +
> +success_reg_templ = re.sub('\n *', '', r"""
> +(\n*{padding}(?P<comment>\#.*$))?
> +\n*{padding}
> +(
> +self.assert_qmp\({res},\ 'return',\ {{}}\)
> +|
> +assert\ {res}\['return'\]\ ==\ {{}}
> +|
> +assert\ {res}\ ==\ {{'return':\ {{
> +|
> +self.assertEqual\({res}\['return'\],\ {{}}\)
> +)""")

We may find other patterns, but this is a nice way to capture the most
common ones and a simple place to update if we find another one.

I did a quick grep for 'assert.*return' and noticed things like:

tests/qemu-iotests/056:self.assert_qmp(res, 'return', {})
tests/qemu-iotests/056:self.assert_qmp(res, 'return', [])

This script only simplifies the {} form, not the []; but that makes
sense: when we are testing a command known to return an array rather
than nothing, we still want to check if the array is empty, and not
just that the command didn't crash.  We are only simplifying the
commands that check for nothing in particular returned, on the grounds
that not crashing was probably good enough, and explicitly checking
that nothing extra was returned is not worth the effort.

> +
> +some_check_templ = re.sub('\n *', '', r"""
> +(\n*{padding}(?P<comment>\#.*$))?
> +\s*self.assert_qmp\({res},""")
> +
> +
> +def tmatch(template: str, text: str,
> +   padding: str, res: str) -> Optional[re.Match[str]]:
> +return re.match(template.format(padding=padding, res=res), text,
> +flags=re.MULTILINE)
> +
> +
> +def find_closing_brace(text: str, start: int) -> int:
> +"""
> +Having '(' at text[start] search for pairing ')' and return its index.
> +"""
> +assert text[start] == '('
> +
> +height = 1
> +
> +for i in range(start + 1, len(text)):
> +if text[i] == '(':
> +height += 1
> +elif text[i] == ')':
> +height -= 1
> +if height == 0:
> +return i

I might have referred to this as 'nest' or 'depth', as I tend to think
of nesting depth rather than nesting height; but it's not a
show-stopper to use your naming.
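For reference, the depth-counter approach can be exercised standalone — a self-contained copy of the function above (with the counter renamed `height` to `depth`, per the naming suggestion) and an illustrative input:

```python
def find_closing_brace(text: str, start: int) -> int:
    """Having '(' at text[start], search for the pairing ')' and return its index."""
    assert text[start] == '('
    depth = 1  # nesting depth ('height' in the patch)
    for i in range(start + 1, len(text)):
        if text[i] == '(':
            depth += 1
        elif text[i] == ')':
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")

call = "self.vm.qmp('transaction', actions=(a, b))"
open_idx = call.index('(')
close_idx = find_closing_brace(call, open_idx)
print(call[open_idx:close_idx + 1])  # ('transaction', actions=(a, b))
```

The depth counter is what lets the updater capture a full argument list even when it contains nested parentheses.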

> +
> +raise ValueError
> +
> +
> +def update(text: str) -> str:
> +result = ''
> +
> +while True:
> +m = start_reg.search(text)
> +if m is None:
> +result += text
> +break
> +
> +result += text[:m.start()]
> +
> +args_ind = m.end()
> +args_end = find_closing_brace(text, args_ind - 1)
> +
> +all_args = text[args_ind:args_end].split(',', 1)
> +
> +name = all_args[0]
> +args = None if len(all_args) == 1 else all_args[1]
> +
> +unchanged_call = text[m.start():args_end+1]
> +text = text[args_end+1:]
> +
> +padding, res, vm = m.group('padding', 'res', 'vm')
> +
> +m = tmatch(success_reg_templ, text, padding, res)
> +
> +if m is None:
> +result += unchanged_call
> +
> +if ('query-' not in name and
> +'x-debug-block-dirty-bitmap-sha256' not in name and
> +not tmatch(some_check_templ, text, padding, res)):
> +print(unchanged_call + text[:200] + '...\n\n')
> +
> +continue

Feels a bit hacky - but if it does the job, I'm not too worried.

> +
> +if m.group('comment'):
> +result += f'{padding}{m.group("comment")}\n'
> +
> +result += f'{padding}{vm}.cmd({name}'
> +
> +if args:
> +result += ','
> +
> +if '\n' in args:
> +m_args = re.search('(?P<pad> *).*$', args)
> +assert m_args is not None
> +
> +cur_padding = len(m_args.group('pad'))
> +expected = len(f'{padding}{res} = {vm}.qmp(')
> +drop = len(f'{res} = ')
> +if cur_padding == expected - 1:
> +# tolerate this bad style
> +

[PATCH] hw/virtio/virtio-gpu: Fix compiler warning when compiling with -Wshadow

2023-10-06 Thread Thomas Huth
Avoid using trivial variable names in macros, otherwise we get
the following compiler warning when compiling with -Wshadow=local:

In file included from ../../qemu/hw/display/virtio-gpu-virgl.c:19:
../../home/thuth/devel/qemu/hw/display/virtio-gpu-virgl.c:
 In function ‘virgl_cmd_submit_3d’:
../../qemu/include/hw/virtio/virtio-gpu.h:228:16: error: declaration of ‘s’
 shadows a previous local [-Werror=shadow=compatible-local]
  228 | size_t s;
  |^
../../qemu/hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro
 ‘VIRTIO_GPU_FILL_CMD’
  215 | VIRTIO_GPU_FILL_CMD(cs);
  | ^~~
../../qemu/hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration
 is here
  213 | size_t s;
  |^
cc1: all warnings being treated as errors

Signed-off-by: Thomas Huth 
---
 include/hw/virtio/virtio-gpu.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 390c4642b8..8b7e3faf01 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -225,13 +225,13 @@ struct VhostUserGPU {
 };
 
 #define VIRTIO_GPU_FILL_CMD(out) do {   \
-size_t s;   \
-s = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num, 0,  \
+size_t s_;  \
+s_ = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num, 0, \
               &out, sizeof(out));  \
-if (s != sizeof(out)) { \
+if (s_ != sizeof(out)) {\
 qemu_log_mask(LOG_GUEST_ERROR,  \
   "%s: command size incorrect %zu vs %zu\n",\
-  __func__, s, sizeof(out));\
+  __func__, s_, sizeof(out));   \
 return; \
 }   \
 } while (0)
-- 
2.41.0




Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H

2023-10-06 Thread Zhao Liu
Hi Michael,

On Tue, Oct 03, 2023 at 08:57:27AM -0400, Michael S. Tsirkin wrote:
> Date: Tue, 3 Oct 2023 08:57:27 -0400
> From: "Michael S. Tsirkin" 
> Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
>  topo in CPUID.04H
> 
> On Fri, Sep 15, 2023 at 03:53:25PM +0800, Zhao Liu wrote:
> > Hi Philippe,
> > 
> > On Thu, Sep 14, 2023 at 09:41:30AM +0200, Philippe Mathieu-Daudé wrote:
> > > Date: Thu, 14 Sep 2023 09:41:30 +0200
> > > From: Philippe Mathieu-Daudé 
> > > Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
> > >  topo in CPUID.04H
> > > 
> > > On 14/9/23 09:21, Zhao Liu wrote:
> > > > From: Zhao Liu 
> > > > 
> > > > The property x-l2-cache-topo will be used to change the L2 cache
> > > > topology in CPUID.04H.
> > > > 
> > > > Now it allows user to set the L2 cache is shared in core level or
> > > > cluster level.
> > > > 
> > > > If the user passes "-cpu x-l2-cache-topo=[core|cluster]" then the older L2
> > > > cache topology will be overridden by the new topology setting.
> > > > 
> > > > Here we expose to user "cluster" instead of "module", to be consistent
> > > > with "cluster-id" naming.
> > > > 
> > > > Since CPUID.04H is used by intel CPUs, this property is available on
> > > > intel CPUs as for now.
> > > > 
> > > > When necessary, it can be extended to CPUID.801DH for AMD CPUs.
> > > > 
> > > > (Tested the cache topology in CPUID[0x04] leaf with "x-l2-cache-topo=[
> > > > core|cluster]", and tested the live migration between the QEMUs w/ &
> > > > w/o this patch series.)
> > > > 
> > > > Signed-off-by: Zhao Liu 
> > > > Tested-by: Yongwei Ma 
> > > > ---
> > > > Changes since v3:
> > > >   * Add description about test for live migration compatibility. (Babu)
> > > > 
> > > > Changes since v1:
> > > >   * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
> > > > renaming changes.
> > > > ---
> > > >   target/i386/cpu.c | 34 +-
> > > >   target/i386/cpu.h |  2 ++
> > > >   2 files changed, 35 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > > @@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
> > > >false),
> > > >   DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, 
> > > > intel_pt_auto_level,
> > > >true),
> > > > +DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > > 
> > > We use the 'x-' prefix for unstable features, is it the case here?
> > 
> > I thought that if we can have a more general CLI way to define cache
> > topology in the future, then this option can be removed.
> > 
> > I'm not sure if this option could be treated as unstable, what do you
> > think?
> > 
> > 
> > Thanks,
> > Zhao
> 
> Then, please work on this new generic thing.
> What we don't want is people relying on unstable options.
> 

Okay, I'll remove this option in the next refresh.

BTW, about the generic cache topology, what about porting this option to
smp? Just like:

-smp cpus=4,sockets=2,cores=2,threads=1, \
 l3-cache=socket,l2-cache=core,l1-i-cache=core,l1-d-cache=core

From the previous discussion [1] with Jonathan, it seems this format
could also meet the requirement for ARM.

If you like this, I'll move forward in this direction. ;-)

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg03997.html

Thanks,
Zhao
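
For reference, the effect of the chosen sharing level on the leaf encoding can be sketched as follows. This is an illustrative model only (the function name and the power-of-two rounding convention are assumptions based on the SDM's description of CPUID.04H EAX[25:14]), not QEMU's actual code:

```python
def cpuid04_max_sharing_ids(threads_per_core: int,
                            cores_per_cluster: int,
                            level: str) -> int:
    """Sketch of CPUID.04H EAX[25:14]: maximum number of addressable
    logical-processor IDs sharing this cache, rounded up to a power of
    two, minus one.  `level` selects where the L2 cache is shared."""
    if level == "core":
        sharing = threads_per_core
    elif level == "cluster":
        sharing = threads_per_core * cores_per_cluster
    else:
        raise ValueError(f"unknown level: {level}")
    width = 1
    while width < sharing:          # round up to a power of two
        width *= 2
    return width - 1

# 2 threads per core, 2 cores per cluster:
print(cpuid04_max_sharing_ids(2, 2, "core"))     # → 1 (L2 shared per core)
print(cpuid04_max_sharing_ids(2, 2, "cluster"))  # → 3 (L2 shared per cluster)
```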




Re: [v3] Help wanted for enabling -Wshadow=local

2023-10-06 Thread Thomas Huth

On 06/10/2023 16.45, Markus Armbruster wrote:

Local variables shadowing other local variables or parameters make the
code needlessly hard to understand.  Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".

Enabling -Wshadow would prevent bugs like this one.  But we have to
clean up all the offenders first.

Quite a few people responded to my calls for help.  Thank you so much!

I'm collecting patches in my git repo at
https://repo.or.cz/qemu/armbru.git in branch shadow-next.  All but the
last two are in a pending pull request.

My test build is down to seven files with warnings.  "[PATCH v2 0/3]
hexagon: GETPC() and shadowing fixes" takes care of four, but it needs a
rebase.

Remaining three:

 In file included from ../hw/display/virtio-gpu-virgl.c:19:
 ../hw/display/virtio-gpu-virgl.c: In function ‘virgl_cmd_submit_3d’:
 /work/armbru/qemu/include/hw/virtio/virtio-gpu.h:228:16: warning: 
declaration of ‘s’ shadows a previous local [-Wshadow=compatible-local]
   228 | size_t s;  
 \
   |^
 ../hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro 
‘VIRTIO_GPU_FILL_CMD’
   215 | VIRTIO_GPU_FILL_CMD(cs);
   | ^~~
 ../hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration is here
   213 | size_t s;
   |^

 In file included from ../contrib/vhost-user-gpu/virgl.h:18,
  from ../contrib/vhost-user-gpu/virgl.c:17:
 ../contrib/vhost-user-gpu/virgl.c: In function ‘virgl_cmd_submit_3d’:
 ../contrib/vhost-user-gpu/vugpu.h:167:16: warning: declaration of ‘s’ 
shadows a previous local [-Wshadow=compatible-local]
   167 | size_t s;   \
   |^
 ../contrib/vhost-user-gpu/virgl.c:203:5: note: in expansion of macro 
‘VUGPU_FILL_CMD’
   203 | VUGPU_FILL_CMD(cs);
   | ^~
 ../contrib/vhost-user-gpu/virgl.c:201:12: note: shadowed declaration is 
here
   201 | size_t s;
   |^

 ../contrib/vhost-user-gpu/vhost-user-gpu.c: In function 
‘vg_resource_flush’:
 ../contrib/vhost-user-gpu/vhost-user-gpu.c:837:29: warning: declaration of 
‘i’ shadows a previous local [-Wshadow=local]
   837 | pixman_image_t *i =
   | ^
 ../contrib/vhost-user-gpu/vhost-user-gpu.c:757:9: note: shadowed 
declaration is here
   757 | int i;
   | ^

Gerd, Marc-André, or anybody else?

More warnings may lurk in code my test build doesn't compile.  Need a
full CI build with -Wshadow=local to find them.  Anybody care to kick
one off?


I ran a build here (with -Werror enabled, so that it's easier to see where 
it breaks):


 https://gitlab.com/thuth/qemu/-/pipelines/1028023489

... but I didn't see any additional spots in the logs beside the ones that 
you already listed.


 Thomas
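
Conceptually, -Wshadow=local fires when a declaration in an inner scope reuses a name that is still live in an enclosing local scope — exactly what happens when the macro's `size_t s;` expands inside a function that already declared `s`. A toy model of that rule (nothing like GCC's real implementation; the event representation is made up for illustration):

```python
def find_shadowed(events):
    """events: a sequence of ("enter",) / ("exit",) scope events and
    ("decl", name) declarations in source order.  Returns the names
    that shadow a declaration in an enclosing local scope."""
    scopes = [set()]        # stack of per-scope declared names
    warnings = []
    for ev in events:
        if ev[0] == "enter":
            scopes.append(set())
        elif ev[0] == "exit":
            scopes.pop()
        else:
            _, name = ev
            if any(name in outer for outer in scopes[:-1]):
                warnings.append(name)   # would be a -Wshadow=local hit
            scopes[-1].add(name)
    return warnings

# Mirrors the virgl_cmd_submit_3d case quoted above:
events = [
    ("enter",),         # function body
    ("decl", "s"),      # size_t s;  (the outer local)
    ("enter",),         # do { ... } while (0) from VIRTIO_GPU_FILL_CMD
    ("decl", "s"),      # the macro's size_t s;  -> shadows the outer one
    ("exit",),
    ("exit",),
]
print(find_shadowed(events))   # → ['s']
```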





Re: [PATCH v11 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-10-06 Thread Het Gala


On 10/4/2023 7:33 PM, Fabiano Rosas wrote:

Het Gala  writes:


On 04/10/23 7:03 pm, Fabiano Rosas wrote:

Het Gala  writes:


This is the v11 patchset of the modified 'migrate' and 'migrate-incoming' QAPI
design for upstream review.

Update: Daniel has reviewed all patches and is okay with them. Markus has also
  given his Acked-by for the patches related to the QAPI syntax change.
Fabiano, Juan and other migration maintainers, let me know if there are still
improvements to be made in this patch series.

Link to previous upstream community patchset links:
v1: https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04339.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02106.html
v3: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02473.html
v4: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03064.html
v5: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04845.html
v6: https://lists.gnu.org/archive/html/qemu-devel/2023-06/msg01251.html
v7: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02027.html
v8: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02770.html
v9: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg04216.html
v10: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg05022.html

v10 -> v11 changelog:
---
- Resolved make check errors: it has been almost two months since the v10
version of this patchset series went out, and migration workflow changes
in the meantime caused the make check failures.

Sorry, there must be a misunderstanding here. This series still has
problems. Just look at patch 6 that adds the "channel-type" parameter and
patch 10 that uses "channeltype" in the test (without hyphen). This
cannot work.

Ack. I will change that.

There's also several instances of g_autoptr being used incorrectly. I
could comment on every patch individually, but this series cannot have
passed make check.

How can I run these make checks myself? I am not sure where these
failures arise from. It would be helpful if you could point out
where g_autoptr is used incorrectly.

I mean just the project's make check command:

cd build/
../configure
make -j$(nproc)
make -j$(nproc) check
Okay, now I get it. I was not aware that 'make -j check' runs the QEMU
tests. I will make sure the series passes make check.

Please resend this with the issues fixed and drop the Reviewed-bys from
the affected patches.

How do I identify which patches are affected here?

I'll comment in each patch individually.
Thank you for commenting on the patches. I will make all the changes
in the coming version.

We'll also have to add compatibility with 

RE: [PATCH v2 3/3] target/hexagon: avoid shadowing globals

2023-10-06 Thread ltaylorsimpson



> -Original Message-
> From: Brian Cain 
> Sent: Thursday, October 5, 2023 4:22 PM
> To: qemu-devel@nongnu.org
> Cc: bc...@quicinc.com; arm...@redhat.com; richard.hender...@linaro.org;
> phi...@linaro.org; peter.mayd...@linaro.org; quic_mathb...@quicinc.com;
> stefa...@redhat.com; a...@rev.ng; a...@rev.ng;
> quic_mlie...@quicinc.com; ltaylorsimp...@gmail.com
> Subject: [PATCH v2 3/3] target/hexagon: avoid shadowing globals
> 
> The typedef `vaddr` is shadowed by `vaddr` identifiers, so we rename the
> identifiers to avoid shadowing the type name.
> 
> The global `cpu_env` is shadowed by local `cpu_env` arguments, so we
> rename the function arguments to avoid shadowing the global.
> 
> Signed-off-by: Brian Cain 
> ---
>  target/hexagon/genptr.c | 56 -
>  target/hexagon/genptr.h | 18 
>  target/hexagon/mmvec/system_ext_mmvec.c |  4 +-
> target/hexagon/mmvec/system_ext_mmvec.h |  2 +-
>  target/hexagon/op_helper.c  |  4 +-
>  5 files changed, 42 insertions(+), 42 deletions(-)
> 
> diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index
> 217bc7bb5a..11377ac92b 100644
> --- a/target/hexagon/genptr.c
> +++ b/target/hexagon/genptr.c
> @@ -334,28 +334,28 @@ void gen_set_byte_i64(int N, TCGv_i64 result, TCGv
> src)
>  tcg_gen_deposit_i64(result, result, src64, N * 8, 8);  }
> 
> -static inline void gen_load_locked4u(TCGv dest, TCGv vaddr, int
> mem_index)
> +static inline void gen_load_locked4u(TCGv dest, TCGv v_addr, int
> +mem_index)

I'd recommend moving both the type and the arg name to the new line, also 
indent the new line.
static inline void gen_load_locked4u(TCGv dest, TCGv v_addr,
  int mem_index)


> 
> -static inline void gen_load_locked8u(TCGv_i64 dest, TCGv vaddr, int
> mem_index)
> +static inline void gen_load_locked8u(TCGv_i64 dest, TCGv v_addr, int
> +mem_index)

Ditto

>  static inline void gen_store_conditional4(DisasContext *ctx,
> -  TCGv pred, TCGv vaddr, TCGv src)
> +  TCGv pred, TCGv v_addr, TCGv
> + src)

Ditto

>  zero = tcg_constant_tl(0);
> @@ -374,13 +374,13 @@ static inline void
> gen_store_conditional4(DisasContext *ctx,  }
> 
>  static inline void gen_store_conditional8(DisasContext *ctx,
> -  TCGv pred, TCGv vaddr, TCGv_i64 
> src)
> +  TCGv pred, TCGv v_addr,
> + TCGv_i64 src)

Indent

> -void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int
> slot)
> +void mem_gather_store(CPUHexagonState *env, target_ulong v_addr, int
> +slot)

Ditto

> -void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int
> slot);
> +void mem_gather_store(CPUHexagonState *env, target_ulong v_addr, int
> +slot);

Ditto


Otherwise,
Reviewed-by: Taylor Simpson 





Re: [PATCH v3 0/4] block: clean up coroutine versions of bdrv_{is_allocated,block_status}*

2023-10-06 Thread Kevin Wolf
Am 04.09.2023 um 12:03 hat Paolo Bonzini geschrieben:
> Provide coroutine versions of bdrv_is_allocated* and bdrv_block_status*,
> since the underlying BlockDriver API is coroutine-based, and use
> automatically-generated wrappers for the "mixed" versions.
> 
> v2->v3: cleaned up formatting

Thanks, applied to the block branch.

Kevin




Re: [PATCH v2 1/2] migration: Fix rdma migration failed

2023-10-06 Thread Peter Xu
On Tue, Oct 03, 2023 at 08:57:07PM +0200, Juan Quintela wrote:
> commit c638f66121ce30063fbf68c3eab4d7429cf2b209
> Author: Juan Quintela 
> Date:   Tue Oct 3 20:53:38 2023 +0200
> 
> migration: Non multifd migration don't care about multifd flushes
> 
> RDMA was having trouble because
> migrate_multifd_flush_after_each_section() can only be true or false,
> but we don't want to send any flush when we are not in multifd
> migration.
> 
> CC: Fabiano Rosas 
> Reported-by: Li Zhijian 
> Signed-off-by: Juan Quintela 
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index e4bfd39f08..716cef6425 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1387,7 +1387,8 @@ static int find_dirty_block(RAMState *rs, 
> PageSearchStatus *pss)
>  pss->page = 0;
>  pss->block = QLIST_NEXT_RCU(pss->block, next);
>  if (!pss->block) {
> -if (!migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() &&
> +!migrate_multifd_flush_after_each_section()) {
>  QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
>  int ret = multifd_send_sync_main(f);
>  if (ret < 0) {
> @@ -3064,7 +3065,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  return ret;
>  }
>  
> -if (!migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>  }
>  
> @@ -3176,7 +3177,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  out:
>  if (ret >= 0
>  && migration_is_setup_or_active(migrate_get_current()->state)) {
> -if (migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() && migrate_multifd_flush_after_each_section()) 
> {
>  ret = 
> multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
>  if (ret < 0) {
>  return ret;
> @@ -3253,7 +3254,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  return ret;
>  }
>  
> -if (!migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>  }
>  qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> @@ -3760,7 +3761,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
>  break;
>  case RAM_SAVE_FLAG_EOS:
>  /* normal exit */
> -if (migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() && 
> migrate_multifd_flush_after_each_section()) {
>  multifd_recv_sync_main();
>  }
>  break;
> @@ -4038,7 +4039,8 @@ static int ram_load_precopy(QEMUFile *f)
>  break;
>  case RAM_SAVE_FLAG_EOS:
>  /* normal exit */
> -if (migrate_multifd_flush_after_each_section()) {
> +if (migrate_multifd() &&
> +migrate_multifd_flush_after_each_section()) {
>  multifd_recv_sync_main();
>  }
>  break;

Reviewed-by: Peter Xu 

Did you forget to send this out formally?  Even if f1de309792d6656e landed
(which, IMHO, it shouldn't have), IIUC rdma is still broken..

Thanks,

-- 
Peter Xu
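
The gist of the fix discussed above is a stronger guard: the multifd flush paths must fire only when multifd is actually enabled. As a truth-table sketch (the function names below are illustrative, not QEMU's):

```python
def sends_flush_at_block_end(multifd: bool, flush_each_section: bool) -> bool:
    """Guard at find_dirty_block()/ram_save_setup()/ram_save_complete():
    before the fix it was just `not flush_each_section`, which wrongly
    fired for non-multifd (e.g. RDMA) migrations."""
    return multifd and not flush_each_section

def sends_flush_per_section(multifd: bool, flush_each_section: bool) -> bool:
    """Guard at ram_save_iterate() and the RAM_SAVE_FLAG_EOS handling."""
    return multifd and flush_each_section

# An RDMA migration has multifd disabled: with the fix, neither path fires.
print(sends_flush_at_block_end(False, False))  # → False
print(sends_flush_per_section(False, False))   # → False
```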




Re: [Virtio-fs] (no subject)

2023-10-06 Thread Hanna Czenczek

On 06.10.23 17:17, Alex Bennée wrote:

Hanna Czenczek  writes:


On 06.10.23 12:34, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 11:47:55AM +0200, Hanna Czenczek wrote:

On 06.10.23 11:26, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 11:15:55AM +0200, Hanna Czenczek wrote:

On 06.10.23 10:45, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 09:48:14AM +0200, Hanna Czenczek wrote:

On 05.10.23 19:15, Michael S. Tsirkin wrote:

On Thu, Oct 05, 2023 at 01:08:52PM -0400, Stefan Hajnoczi wrote:

On Wed, Oct 04, 2023 at 02:58:57PM +0200, Hanna Czenczek wrote:



What I’m saying is, 923b8921d21 introduced SET_STATUS calls that broke all
devices that would implement them as per virtio spec, and even today it’s
broken for stateful devices.  The mentioned performance issue is likely
real, but we can’t address it by making up SET_STATUS calls that are wrong.

I concede that I didn’t think about DRIVER_OK.  Personally, I would do all
final configuration that would happen upon a DRIVER_OK once the first vring
is started (i.e. receives a kick).  That has the added benefit of being
asynchronous because it doesn’t block any vhost-user messages (which are
synchronous, and thus block downtime).

Hanna

For better or worse kick is per ring. It's out of spec to start rings
that were not kicked but I guess you could do configuration ...
Seems somewhat asymmetrical though.

I meant to take the first ring being started as the signal to do the
global configuration, i.e. not do this once per vring, but once
globally.


Let's wait until next week, hopefully Yajun Wu will answer.

I mean, personally I don’t really care about the whole SET_STATUS
thing.  It’s clear that it’s broken for stateful devices.  The fact
that it took until 6f8be29ec17d to fix it for just any device that
would implement it according to spec to me is a strong indication that
nobody does implement it according to spec, and is currently only used
to signal to some specific back-end that all rings have been set up
and should be configured in a single block.

I'm certainly using [GS]ET_STATUS for the proposed F_TRANSPORT
extensions where everything is off-loaded to the vhost-user backend.


How do these back-ends work with the fact that qemu uses SET_STATUS 
incorrectly when not offloading?  Do you plan on fixing that?


(I.e. that we send SET_STATUS 0 when the VM is paused, potentially 
resetting state that is not recoverable, and that we set DRIVER and 
DRIVER_OK simultaneously.)


Hanna
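
For context, these are the device status bits under discussion (values from the virtio specification); the sequences below are an illustrative sketch of the ordering dispute, not code from QEMU:

```python
VIRTIO_CONFIG_S_ACKNOWLEDGE = 1
VIRTIO_CONFIG_S_DRIVER      = 2
VIRTIO_CONFIG_S_DRIVER_OK   = 4
VIRTIO_CONFIG_S_FEATURES_OK = 8

def spec_status_sequence():
    """Per-spec driver initialization: each SET_STATUS write adds one
    bit, and FEATURES_OK must be accepted before DRIVER_OK is set."""
    status, writes = 0, []
    for bit in (VIRTIO_CONFIG_S_ACKNOWLEDGE, VIRTIO_CONFIG_S_DRIVER,
                VIRTIO_CONFIG_S_FEATURES_OK, VIRTIO_CONFIG_S_DRIVER_OK):
        status |= bit
        writes.append(status)
    return writes

# The complaint above: DRIVER and DRIVER_OK arriving in a single write,
# plus a SET_STATUS 0 (device reset) sent when the VM is merely paused.
collapsed_write = (VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER |
                   VIRTIO_CONFIG_S_FEATURES_OK | VIRTIO_CONFIG_S_DRIVER_OK)
print(spec_status_sequence())   # → [1, 3, 11, 15]
print(collapsed_write)          # → 15
```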




[PATCH v7 13/15] tests/vm/basevm.py: use cmd() instead of qmp()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
We don't expect a failure here, and we need the 'result' object; cmd() is
better in this case.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/vm/basevm.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index a97e23b0ce..8aef4cff96 100644
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -312,8 +312,8 @@ def boot(self, img, extra_args=[]):
 self._guest = guest
 # Init console so we can start consuming the chars.
 self.console_init()
-usernet_info = guest.qmp("human-monitor-command",
- command_line="info usernet").get("return")
+usernet_info = guest.cmd("human-monitor-command",
+ command_line="info usernet")
 self.ssh_port = get_info_usernet_hostfwd_port(usernet_info)
 if not self.ssh_port:
 raise Exception("Cannot find ssh port from 'info usernet':\n%s" % \
-- 
2.34.1




[PATCH v7 11/15] iotests: drop some extra ** in qmp() call

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
The qmp() method supports passing a dict (when no '_' to '-' substitution
of keys is needed). So, drop some extra '**' operators.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/040|  4 +-
 tests/qemu-iotests/041| 14 +++---
 tests/qemu-iotests/129|  2 +-
 tests/qemu-iotests/147|  2 +-
 tests/qemu-iotests/155|  2 +-
 tests/qemu-iotests/264| 12 ++---
 tests/qemu-iotests/295|  5 +-
 tests/qemu-iotests/296| 15 +++---
 tests/qemu-iotests/tests/migrate-bitmaps-test |  4 +-
 .../tests/mirror-ready-cancel-error   | 50 +--
 10 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index e61e7f2433..4b8bf09a5d 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -774,7 +774,7 @@ class TestCommitWithFilters(iotests.QMPTestCase):
 result = self.vm.qmp('object-add', qom_type='throttle-group', id='tg')
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('blockdev-add', **{
+result = self.vm.qmp('blockdev-add', {
 'node-name': 'top-filter',
 'driver': 'throttle',
 'throttle-group': 'tg',
@@ -935,7 +935,7 @@ class TestCommitWithOverriddenBacking(iotests.QMPTestCase):
 self.vm.launch()
 
 # Use base_b instead of base_a as the backing of top
-result = self.vm.qmp('blockdev-add', **{
+result = self.vm.qmp('blockdev-add', {
 'node-name': 'top',
 'driver': iotests.imgfmt,
 'file': {
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 550e4dc391..3aef42aec8 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -236,7 +236,7 @@ class TestSingleBlockdev(TestSingleDrive):
 args = {'driver': iotests.imgfmt,
 'node-name': self.qmp_target,
 'file': { 'filename': target_img, 'driver': 'file' } }
-result = self.vm.qmp("blockdev-add", **args)
+result = self.vm.qmp("blockdev-add", args)
 self.assert_qmp(result, 'return', {})
 
 def test_mirror_to_self(self):
@@ -963,7 +963,7 @@ class TestRepairQuorum(iotests.QMPTestCase):
 #assemble the quorum block device from the individual files
 args = { "driver": "quorum", "node-name": "quorum0",
  "vote-threshold": 2, "children": [ "img0", "img1", "img2" ] }
-result = self.vm.qmp("blockdev-add", **args)
+result = self.vm.qmp("blockdev-add", args)
 self.assert_qmp(result, 'return', {})
 
 
@@ -1278,7 +1278,7 @@ class TestReplaces(iotests.QMPTestCase):
 """
 Check that we can replace filter nodes.
 """
-result = self.vm.qmp('blockdev-add', **{
+result = self.vm.qmp('blockdev-add', {
  'driver': 'copy-on-read',
  'node-name': 'filter0',
  'file': {
@@ -1319,7 +1319,7 @@ class TestFilters(iotests.QMPTestCase):
 self.vm = iotests.VM().add_device('virtio-scsi,id=vio-scsi')
 self.vm.launch()
 
-result = self.vm.qmp('blockdev-add', **{
+result = self.vm.qmp('blockdev-add', {
 'node-name': 'target',
 'driver': iotests.imgfmt,
 'file': {
@@ -1355,7 +1355,7 @@ class TestFilters(iotests.QMPTestCase):
 os.remove(backing_img)
 
 def test_cor(self):
-result = self.vm.qmp('blockdev-add', **{
+result = self.vm.qmp('blockdev-add', {
 'node-name': 'filter',
 'driver': 'copy-on-read',
 'file': self.filterless_chain
@@ -1384,7 +1384,7 @@ class TestFilters(iotests.QMPTestCase):
 assert target_map[1]['depth'] == 0
 
 def test_implicit_mirror_filter(self):
-result = self.vm.qmp('blockdev-add', **self.filterless_chain)
+result = self.vm.qmp('blockdev-add', self.filterless_chain)
 self.assert_qmp(result, 'return', {})
 
 # We need this so we can query from above the mirror node
@@ -1418,7 +1418,7 @@ class TestFilters(iotests.QMPTestCase):
 def test_explicit_mirror_filter(self):
 # Same test as above, but this time we give the mirror filter
 # a node-name so it will not be invisible
-result = self.vm.qmp('blockdev-add', **self.filterless_chain)
+result = self.vm.qmp('blockdev-add', self.filterless_chain)
 self.assert_qmp(result, 'return', {})
 
 # We need this so we can query from above the mirror node
diff --git a/tests/qemu-iotests/129 b/tests/qemu-iotests/129

[PATCH v7 09/15] iotests: refactor some common qmp result checks into generic pattern

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
To simplify further conversion.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/040 | 3 ++-
 tests/qemu-iotests/147 | 3 ++-
 tests/qemu-iotests/155 | 4 ++--
 tests/qemu-iotests/218 | 4 ++--
 tests/qemu-iotests/296 | 3 ++-
 5 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index 5601a4873c..e61e7f2433 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -62,9 +62,10 @@ class ImageCommitTestCase(iotests.QMPTestCase):
 self.assert_no_active_block_jobs()
 if node_names:
 result = self.vm.qmp('block-commit', device='drive0', 
top_node=top, base_node=base)
+self.assert_qmp(result, 'return', {})
 else:
 result = self.vm.qmp('block-commit', device='drive0', top=top, 
base=base)
-self.assert_qmp(result, 'return', {})
+self.assert_qmp(result, 'return', {})
 self.wait_for_complete(need_ready)
 
 def run_default_commit_test(self):
diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
index 47dfa62e6b..770b73e2f4 100755
--- a/tests/qemu-iotests/147
+++ b/tests/qemu-iotests/147
@@ -159,10 +159,11 @@ class BuiltinNBD(NBDBlockdevAddBase):
 
 if export_name is None:
 result = self.server.qmp('nbd-server-add', device='nbd-export')
+self.assert_qmp(result, 'return', {})
 else:
 result = self.server.qmp('nbd-server-add', device='nbd-export',
  name=export_name)
-self.assert_qmp(result, 'return', {})
+self.assert_qmp(result, 'return', {})
 
 if export_name2 is not None:
 result = self.server.qmp('nbd-server-add', device='nbd-export',
diff --git a/tests/qemu-iotests/155 b/tests/qemu-iotests/155
index eadda52615..d3e1b7401e 100755
--- a/tests/qemu-iotests/155
+++ b/tests/qemu-iotests/155
@@ -181,6 +181,7 @@ class MirrorBaseClass(BaseClass):
 result = self.vm.qmp(self.cmd, job_id='mirror-job', 
device='source',
  sync=sync, target='target',
  auto_finalize=False)
+self.assert_qmp(result, 'return', {})
 else:
 if self.existing:
 mode = 'existing'
@@ -190,8 +191,7 @@ class MirrorBaseClass(BaseClass):
  sync=sync, target=target_img,
  format=iotests.imgfmt, mode=mode,
  node_name='target', auto_finalize=False)
-
-self.assert_qmp(result, 'return', {})
+self.assert_qmp(result, 'return', {})
 
 self.vm.run_job('mirror-job', auto_finalize=False,
 pre_finalize=self.openBacking, auto_dismiss=True)
diff --git a/tests/qemu-iotests/218 b/tests/qemu-iotests/218
index 6320c4cb56..5e74c55b6a 100755
--- a/tests/qemu-iotests/218
+++ b/tests/qemu-iotests/218
@@ -61,14 +61,14 @@ def start_mirror(vm, speed=None, buf_size=None):
  sync='full',
  speed=speed,
  buf_size=buf_size)
+assert ret['return'] == {}
 else:
 ret = vm.qmp('blockdev-mirror',
  job_id='mirror',
  device='source',
  target='target',
  sync='full')
-
-assert ret['return'] == {}
+assert ret['return'] == {}
 
 
 log('')
diff --git a/tests/qemu-iotests/296 b/tests/qemu-iotests/296
index 0d21b740a7..19a674c5ae 100755
--- a/tests/qemu-iotests/296
+++ b/tests/qemu-iotests/296
@@ -133,9 +133,10 @@ class EncryptionSetupTestCase(iotests.QMPTestCase):
 
 if reOpen:
 result = vm.qmp(command, options=[opts])
+self.assert_qmp(result, 'return', {})
 else:
 result = vm.qmp(command, **opts)
-self.assert_qmp(result, 'return', {})
+self.assert_qmp(result, 'return', {})
 
 
 ###
-- 
2.34.1




[PATCH v7 08/15] iotests: add some missed checks of qmp result

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/041| 1 +
 tests/qemu-iotests/151| 1 +
 tests/qemu-iotests/152| 2 ++
 tests/qemu-iotests/tests/migrate-bitmaps-test | 2 ++
 4 files changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 8429958bf0..4d7a829b65 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -1087,6 +1087,7 @@ class TestRepairQuorum(iotests.QMPTestCase):
 result = self.vm.qmp('blockdev-snapshot-sync', node_name='img1',
  snapshot_file=quorum_snapshot_file,
  snapshot_node_name="snap1");
+self.assert_qmp(result, 'return', {})
 
 result = self.vm.qmp('drive-mirror', job_id='job0', device='quorum0',
  sync='full', node_name='repair0', replaces="img1",
diff --git a/tests/qemu-iotests/151 b/tests/qemu-iotests/151
index b4d1bc2553..668d0c1e9c 100755
--- a/tests/qemu-iotests/151
+++ b/tests/qemu-iotests/151
@@ -159,6 +159,7 @@ class TestActiveMirror(iotests.QMPTestCase):
  sync='full',
  copy_mode='write-blocking',
  speed=1)
+self.assert_qmp(result, 'return', {})
 
 self.vm.hmp_qemu_io('source', 'break write_aio A')
 self.vm.hmp_qemu_io('source', 'aio_write 0 1M')  # 1
diff --git a/tests/qemu-iotests/152 b/tests/qemu-iotests/152
index 4e179c340f..b73a0d08a2 100755
--- a/tests/qemu-iotests/152
+++ b/tests/qemu-iotests/152
@@ -43,6 +43,7 @@ class TestUnaligned(iotests.QMPTestCase):
 def test_unaligned(self):
 result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
  granularity=65536, target=target_img)
+self.assert_qmp(result, 'return', {})
 self.complete_and_wait()
 self.vm.shutdown()
 self.assertEqual(iotests.image_size(test_img), 
iotests.image_size(target_img),
@@ -51,6 +52,7 @@ class TestUnaligned(iotests.QMPTestCase):
 def test_unaligned_with_update(self):
 result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
  granularity=65536, target=target_img)
+self.assert_qmp(result, 'return', {})
 self.wait_ready()
 self.vm.hmp_qemu_io('drive0', 'write 0 512')
 self.complete_and_wait(wait_ready=False)
diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-test
index 59f3357580..8668caae1e 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-test
@@ -101,6 +101,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 sha256 = get_bitmap_hash(self.vm_a)
 
 result = self.vm_a.qmp('migrate', uri=mig_cmd)
+self.assert_qmp(result, 'return', {})
 while True:
 event = self.vm_a.event_wait('MIGRATION')
 if event['data']['status'] == 'completed':
@@ -176,6 +177,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('migrate', uri=mig_cmd)
+self.assert_qmp(result, 'return', {})
 while True:
 event = self.vm_a.event_wait('MIGRATION')
 if event['data']['status'] == 'completed':
-- 
2.34.1
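
The checks added above all use the same helper. A simplified sketch of what iotests' assert_qmp() verifies (the real helper additionally supports dotted sub-paths into the response):

```python
def assert_qmp(result: dict, key: str, expected) -> None:
    """Minimal model: the QMP response must contain `key` (typically
    'return') and its value must equal `expected` (typically {})."""
    assert key in result, f"missing {key!r} in {result!r}"
    assert result[key] == expected, f"{result[key]!r} != {expected!r}"

assert_qmp({"return": {}}, "return", {})   # a success response passes
```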




[PATCH v7 06/15] python/machine.py: upgrade vm.cmd() method

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
The method is not popular in iotests; we prefer to use vm.qmp() and then
check success by hand. But that's not optimal. To simplify the move to
vm.cmd(), let's support the same interface improvements as in vm.qmp().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 python/qemu/machine/machine.py | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index c4e80544bd..352e15b074 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -698,13 +698,23 @@ def qmp(self, cmd: str,
 return ret
 
 def cmd(self, cmd: str,
-conv_keys: bool = True,
+args_dict: Optional[Dict[str, object]] = None,
+conv_keys: Optional[bool] = None,
 **args: Any) -> QMPReturnValue:
 """
 Invoke a QMP command.
 On success return the response dict.
 On failure raise an exception.
 """
+if args_dict is not None:
+assert not args
+assert conv_keys is None
+args = args_dict
+conv_keys = False
+
+if conv_keys is None:
+conv_keys = True
+
 qmp_args = self._qmp_args(conv_keys, args)
 ret = self._qmp.cmd(cmd, **qmp_args)
 if cmd == 'quit':
-- 
2.34.1
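
The argument handling added above can be summarized in a standalone sketch. It assumes _qmp_args() performs the underscore-to-dash key conversion exactly as it does for qmp(); the function name normalize_cmd_args is made up for illustration:

```python
from typing import Any, Dict, Optional

def normalize_cmd_args(args_dict: Optional[Dict[str, object]] = None,
                       conv_keys: Optional[bool] = None,
                       **args: Any) -> Dict[str, object]:
    """Either a positional dict is given (keys passed through verbatim,
    conv_keys must stay None), or keyword arguments are given and their
    underscores are converted to dashes unless conv_keys=False."""
    if args_dict is not None:
        assert not args
        assert conv_keys is None
        return dict(args_dict)
    if conv_keys is None:
        conv_keys = True
    if conv_keys:
        return {k.replace('_', '-'): v for k, v in args.items()}
    return dict(args)

print(normalize_cmd_args({'node-name': 'target'}))  # → {'node-name': 'target'}
print(normalize_cmd_args(node_name='target'))       # → {'node-name': 'target'}
```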




[PATCH v7 03/15] scripts/cpu-x86-uarch-abi.py: use .command() instead of .cmd()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Here we don't expect a failure; in case of one, we would crash trying to
access ['return']. It is better to use .command(), which clearly raises
on failure.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 scripts/cpu-x86-uarch-abi.py | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/scripts/cpu-x86-uarch-abi.py b/scripts/cpu-x86-uarch-abi.py
index 82ff07582f..893afd1b35 100644
--- a/scripts/cpu-x86-uarch-abi.py
+++ b/scripts/cpu-x86-uarch-abi.py
@@ -69,7 +69,7 @@
 shell = QEMUMonitorProtocol(sock)
 shell.connect()
 
-models = shell.cmd("query-cpu-definitions")
+models = shell.command("query-cpu-definitions")
 
 # These QMP props don't correspond to CPUID fatures
 # so ignore them
@@ -85,7 +85,7 @@
 
 names = []
 
-for model in models["return"]:
+for model in models:
 if "alias-of" in model:
 continue
 names.append(model["name"])
@@ -93,12 +93,12 @@
 models = {}
 
 for name in sorted(names):
-cpu = shell.cmd("query-cpu-model-expansion",
- { "type": "static",
-   "model": { "name": name }})
+cpu = shell.command("query-cpu-model-expansion",
+{ "type": "static",
+  "model": { "name": name }})
 
 got = {}
-for (feature, present) in cpu["return"]["model"]["props"].items():
+for (feature, present) in cpu["model"]["props"].items():
 if present and feature not in skip:
 got[feature] = True
 
-- 
2.34.1
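The failure mode described in the commit message is easy to reproduce with
plain dicts — a QMP reply is either a {'return': ...} or an {'error': ...}
envelope, so indexing ['return'] on an error reply dies with a bare KeyError.
A rough sketch (reply contents are illustrative, and unwrap() only
approximates what .command() does):

```python
ok_reply = {'return': [{'name': 'qemu64'}]}
err_reply = {'error': {'class': 'GenericError', 'desc': 'not supported'}}

def unwrap(reply):
    """Roughly what .command() adds on top of the raw envelope: raise a
    descriptive error instead of letting ['return'] fail with KeyError."""
    if 'error' in reply:
        raise RuntimeError(reply['error']['desc'])
    return reply['return']

print(unwrap(ok_reply)[0]['name'])   # qemu64

# The old code indexed ['return'] unconditionally; on failure that hides
# the actual QMP error message behind an unhelpful KeyError:
try:
    err_reply['return']
except KeyError:
    print('crashed with KeyError')

try:
    unwrap(err_reply)
except RuntimeError as e:
    print('clear failure:', e)       # clear failure: not supported
```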




[PATCH v7 05/15] python/qemu: rename command() to cmd()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Use a shorter name. We are going to move iotests from qmp() to
command() where possible. But command() is longer than qmp() and
doesn't look better. Let's rename it.

You can simply grep for '\.command(' and for 'def command(' to check
that everything is updated (command() in tests/docker/docker.py is
unrelated).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
[vsementsov: also update three occurrences in
   tests/avocado/machine_aspeed.py and keep r-b]
---
 docs/devel/testing.rst|  10 +-
 python/qemu/machine/machine.py|   8 +-
 python/qemu/qmp/legacy.py |   2 +-
 python/qemu/qmp/qmp_shell.py  |   2 +-
 python/qemu/utils/qemu_ga_client.py   |   2 +-
 python/qemu/utils/qom.py  |   8 +-
 python/qemu/utils/qom_common.py   |   2 +-
 python/qemu/utils/qom_fuse.py |   6 +-
 scripts/cpu-x86-uarch-abi.py  |   8 +-
 scripts/device-crash-test |   8 +-
 scripts/render_block_graph.py |   8 +-
 tests/avocado/avocado_qemu/__init__.py|   4 +-
 tests/avocado/cpu_queries.py  |   5 +-
 tests/avocado/hotplug_cpu.py  |  10 +-
 tests/avocado/info_usernet.py |   4 +-
 tests/avocado/machine_arm_integratorcp.py |   6 +-
 tests/avocado/machine_aspeed.py   |  12 +-
 tests/avocado/machine_m68k_nextcube.py|   4 +-
 tests/avocado/machine_mips_malta.py   |   6 +-
 tests/avocado/machine_s390_ccw_virtio.py  |  28 ++--
 tests/avocado/migration.py|  10 +-
 tests/avocado/pc_cpu_hotplug_props.py |   2 +-
 tests/avocado/version.py  |   4 +-
 tests/avocado/virtio_check_params.py  |   6 +-
 tests/avocado/virtio_version.py   |   5 +-
 tests/avocado/x86_cpu_model_versions.py   |  13 +-
 tests/migration/guestperf/engine.py   | 150 +++---
 tests/qemu-iotests/256|  34 ++---
 tests/qemu-iotests/257|  36 +++---
 29 files changed, 204 insertions(+), 199 deletions(-)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 5d1fc0aa95..21525e9aae 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -1014,8 +1014,8 @@ class.  Here's a simple usage example:
   """
   def test_qmp_human_info_version(self):
   self.vm.launch()
-  res = self.vm.command('human-monitor-command',
-command_line='info version')
+  res = self.vm.cmd('human-monitor-command',
+command_line='info version')
   self.assertRegexpMatches(res, r'^(\d+\.\d+\.\d)')
 
 To execute your test, run:
@@ -1065,15 +1065,15 @@ and hypothetical example follows:
   first_machine.launch()
   second_machine.launch()
 
-  first_res = first_machine.command(
+  first_res = first_machine.cmd(
   'human-monitor-command',
   command_line='info version')
 
-  second_res = second_machine.command(
+  second_res = second_machine.cmd(
   'human-monitor-command',
   command_line='info version')
 
-  third_res = self.get_vm(name='third_machine').command(
+  third_res = self.get_vm(name='third_machine').cmd(
   'human-monitor-command',
   command_line='info version')
 
diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index dd1a79cb37..c4e80544bd 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -697,16 +697,16 @@ def qmp(self, cmd: str,
 self._quit_issued = True
 return ret
 
-def command(self, cmd: str,
-conv_keys: bool = True,
-**args: Any) -> QMPReturnValue:
+def cmd(self, cmd: str,
+conv_keys: bool = True,
+**args: Any) -> QMPReturnValue:
 """
 Invoke a QMP command.
 On success return the response dict.
 On failure raise an exception.
 """
 qmp_args = self._qmp_args(conv_keys, args)
-ret = self._qmp.command(cmd, **qmp_args)
+ret = self._qmp.cmd(cmd, **qmp_args)
 if cmd == 'quit':
 self._quit_issued = True
 return ret
diff --git a/python/qemu/qmp/legacy.py b/python/qemu/qmp/legacy.py
index e5fa1ce9c4..22a2b5616e 100644
--- a/python/qemu/qmp/legacy.py
+++ b/python/qemu/qmp/legacy.py
@@ -207,7 +207,7 @@ def cmd_raw(self, name: str,
 qmp_cmd['arguments'] = args
 return self.cmd_obj(qmp_cmd)
 
-def command(self, cmd: str, **kwds: object) -> QMPReturnValue:
+def cmd(self, cmd: str, **kwds: object) -> QMPReturnValue:
 """
 Build and send a QMP command to the monitor, report errors if any
 """
diff --git a/python/qemu/qmp/qmp_shell.py b/python/qemu/qmp/qmp_shell.py
index 988d79c01b..98e684e9e8 100644
--- 

[PATCH v7 14/15] scripts: add python_qmp_updater.py

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
A script to update the pattern

result = self.vm.qmp(...)
self.assert_qmp(result, 'return', {})

(and some similar ones) into

self.vm.cmd(...)

Used in the next commit
"python: use vm.cmd() instead of vm.qmp() where appropriate"

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 scripts/python_qmp_updater.py | 136 ++
 1 file changed, 136 insertions(+)
 create mode 100755 scripts/python_qmp_updater.py

diff --git a/scripts/python_qmp_updater.py b/scripts/python_qmp_updater.py
new file mode 100755
index 00..494a169812
--- /dev/null
+++ b/scripts/python_qmp_updater.py
@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+#
+# Intended usage:
+#
+# git grep -l '\.qmp(' | xargs ./scripts/python_qmp_updater.py
+#
+
+import re
+import sys
+from typing import Optional
+
+start_reg = re.compile(r'^(?P<padding> *)(?P<res>\w+) = (?P<vm>.*).qmp\(',
+   flags=re.MULTILINE)
+
+success_reg_templ = re.sub('\n *', '', r"""
+(\n*{padding}(?P<comment>\#.*$))?
+\n*{padding}
+(
+self.assert_qmp\({res},\ 'return',\ {{}}\)
+|
+assert\ {res}\['return'\]\ ==\ {{}}
+|
+assert\ {res}\ ==\ {{'return':\ {{
+|
+self.assertEqual\({res}\['return'\],\ {{}}\)
+)""")
+
+some_check_templ = re.sub('\n *', '', r"""
+(\n*{padding}(?P<comment>\#.*$))?
+\s*self.assert_qmp\({res},""")
+
+
+def tmatch(template: str, text: str,
+   padding: str, res: str) -> Optional[re.Match[str]]:
+return re.match(template.format(padding=padding, res=res), text,
+flags=re.MULTILINE)
+
+
+def find_closing_brace(text: str, start: int) -> int:
+"""
+Having '(' at text[start] search for pairing ')' and return its index.
+"""
+assert text[start] == '('
+
+height = 1
+
+for i in range(start + 1, len(text)):
+if text[i] == '(':
+height += 1
+elif text[i] == ')':
+height -= 1
+if height == 0:
+return i
+
+raise ValueError
+
+
+def update(text: str) -> str:
+result = ''
+
+while True:
+m = start_reg.search(text)
+if m is None:
+result += text
+break
+
+result += text[:m.start()]
+
+args_ind = m.end()
+args_end = find_closing_brace(text, args_ind - 1)
+
+all_args = text[args_ind:args_end].split(',', 1)
+
+name = all_args[0]
+args = None if len(all_args) == 1 else all_args[1]
+
+unchanged_call = text[m.start():args_end+1]
+text = text[args_end+1:]
+
+padding, res, vm = m.group('padding', 'res', 'vm')
+
+m = tmatch(success_reg_templ, text, padding, res)
+
+if m is None:
+result += unchanged_call
+
+if ('query-' not in name and
+'x-debug-block-dirty-bitmap-sha256' not in name and
+not tmatch(some_check_templ, text, padding, res)):
+print(unchanged_call + text[:200] + '...\n\n')
+
+continue
+
+if m.group('comment'):
+result += f'{padding}{m.group("comment")}\n'
+
+result += f'{padding}{vm}.cmd({name}'
+
+if args:
+result += ','
+
+if '\n' in args:
+m_args = re.search('(?P<pad> *).*$', args)
+assert m_args is not None
+
+cur_padding = len(m_args.group('pad'))
+expected = len(f'{padding}{res} = {vm}.qmp(')
+drop = len(f'{res} = ')
+if cur_padding == expected - 1:
+# tolerate this bad style
+drop -= 1
+elif cur_padding < expected - 1:
+# assume nothing to do
+drop = 0
+
+if drop:
+args = re.sub('\n' + ' ' * drop, '\n', args)
+
+result += args
+
+result += ')'
+
+text = text[m.end():]
+
+return result
+
+
+for fname in sys.argv[1:]:
+print(fname)
+with open(fname) as f:
+t = f.read()
+
+t = update(t)
+
+with open(fname, 'w') as f:
+f.write(t)
-- 
2.34.1
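The brace matching at the heart of the script can be exercised standalone —
this re-implements find_closing_brace() from the patch above, with a made-up
example call showing that nested parentheses in the argument list are paired
correctly:

```python
def find_closing_brace(text: str, start: int) -> int:
    """Given '(' at text[start], return the index of its matching ')'."""
    assert text[start] == '('
    height = 1
    for i in range(start + 1, len(text)):
        if text[i] == '(':
            height += 1
        elif text[i] == ')':
            height -= 1
            if height == 0:
                return i
    raise ValueError('unbalanced parentheses')

# Nested calls inside the argument list are skipped over, so the slice
# below covers the whole argument list of the outer .qmp() call:
call = "result = self.vm.qmp('block-commit', device='drive0', top=top_path(0))"
start = call.index('(')
end = find_closing_brace(call, start)
print(call[start + 1:end])
```

This is why the updater can rewrite multi-line qmp() calls without a full
Python parser: it only needs to find where the argument list ends.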




[PATCH v7 12/15] iotests.py: pause_job(): drop return value

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
The returned value is unused, which is easy to check with the command

 git grep -B 3 '\.pause_job('

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6cc50f0b50..467faca43c 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1338,8 +1338,7 @@ def pause_job(self, job_id='job0', wait=True):
 result = self.vm.qmp('block-job-pause', device=job_id)
 self.assert_qmp(result, 'return', {})
 if wait:
-return self.pause_wait(job_id)
-return result
+self.pause_wait(job_id)
 
 def case_skip(self, reason):
 '''Skip this test case'''
-- 
2.34.1




[PATCH v7 02/15] qmp_shell.py: _fill_completion() use .command() instead of .cmd()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
We just want to ignore failures, so we don't need the low-level .cmd(). This
also helps with the further renaming of .command() to .cmd().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 python/qemu/qmp/qmp_shell.py | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/python/qemu/qmp/qmp_shell.py b/python/qemu/qmp/qmp_shell.py
index 619ab42ced..988d79c01b 100644
--- a/python/qemu/qmp/qmp_shell.py
+++ b/python/qemu/qmp/qmp_shell.py
@@ -91,14 +91,21 @@
 import sys
 from typing import (
 IO,
+Dict,
 Iterator,
 List,
 NoReturn,
 Optional,
 Sequence,
+cast,
 )
 
-from qemu.qmp import ConnectError, QMPError, SocketAddrT
+from qemu.qmp import (
+ConnectError,
+ExecuteError,
+QMPError,
+SocketAddrT,
+)
 from qemu.qmp.legacy import (
 QEMUMonitorProtocol,
 QMPBadPortError,
@@ -194,11 +201,12 @@ def close(self) -> None:
 super().close()
 
 def _fill_completion(self) -> None:
-cmds = self.cmd('query-commands')
-if 'error' in cmds:
-return
-for cmd in cmds['return']:
-self._completer.append(cmd['name'])
+try:
+cmds = cast(List[Dict[str, str]], self.command('query-commands'))
+for cmd in cmds:
+self._completer.append(cmd['name'])
+except ExecuteError:
+pass
 
 def _completer_setup(self) -> None:
 self._completer = QMPCompleter()
-- 
2.34.1




[PATCH v7 07/15] iotests: QemuStorageDaemon: add cmd() method like in QEMUMachine.

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Add a similar method for consistency.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 8ffd9fb660..6cc50f0b50 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -38,7 +38,7 @@
 from contextlib import contextmanager
 
 from qemu.machine import qtest
-from qemu.qmp.legacy import QMPMessage, QEMUMonitorProtocol
+from qemu.qmp.legacy import QMPMessage, QMPReturnValue, QEMUMonitorProtocol
 from qemu.utils import VerboseProcessError
 
 # Use this logger for logging messages directly from the iotests module
@@ -466,6 +466,11 @@ def get_qmp(self) -> QEMUMonitorProtocol:
 assert self._qmp is not None
 return self._qmp
 
+def cmd(self, cmd: str, args: Optional[Dict[str, object]] = None) \
+-> QMPReturnValue:
+assert self._qmp is not None
+return self._qmp.cmd(cmd, **(args or {}))
+
 def stop(self, kill_signal=15):
 self._p.send_signal(kill_signal)
 self._p.wait()
-- 
2.34.1




[PATCH v7 10/15] iotests: drop some extra semicolons

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/041 | 2 +-
 tests/qemu-iotests/196 | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 4d7a829b65..550e4dc391 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -1086,7 +1086,7 @@ class TestRepairQuorum(iotests.QMPTestCase):
 def test_after_a_quorum_snapshot(self):
 result = self.vm.qmp('blockdev-snapshot-sync', node_name='img1',
  snapshot_file=quorum_snapshot_file,
- snapshot_node_name="snap1");
+ snapshot_node_name="snap1")
 self.assert_qmp(result, 'return', {})
 
 result = self.vm.qmp('drive-mirror', job_id='job0', device='quorum0',
diff --git a/tests/qemu-iotests/196 b/tests/qemu-iotests/196
index 76509a5ad1..27c1629be3 100755
--- a/tests/qemu-iotests/196
+++ b/tests/qemu-iotests/196
@@ -46,7 +46,7 @@ class TestInvalidateAutoclear(iotests.QMPTestCase):
 
 def test_migration(self):
 result = self.vm_a.qmp('migrate', uri='exec:cat>' + migfile)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
 
 with open(disk, 'r+b') as f:
-- 
2.34.1




[PATCH v7 00/15] iotests: use vm.cmd()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Hi all!

Let's get rid of pattern

result = self.vm.qmp(...)
self.assert_qmp(result, 'return', {})

And switch to just

self.vm.cmd(...)

v7: add r-bs and small wording/grammar fixes by Eric
  05: handle missed tests/avocado/machine_aspeed.py, keep r-bs
  10: patch renamed: s/occasional/extra/
  14: new
  15: rebuilt with the script: some hunks are added, old are unchanged
  (look comparison with previous version in patchew or by
   git check-rebase)

Vladimir Sementsov-Ogievskiy (15):
  python/qemu/qmp/legacy: cmd(): drop cmd_id unused argument
  qmp_shell.py: _fill_completion() use .command() instead of .cmd()
  scripts/cpu-x86-uarch-abi.py: use .command() instead of .cmd()
  python: rename QEMUMonitorProtocol.cmd() to cmd_raw()
  python/qemu: rename command() to cmd()
  python/machine.py: upgrade vm.cmd() method
  iotests: QemuStorageDaemon: add cmd() method like in QEMUMachine.
  iotests: add some missed checks of qmp result
  iotests: refactor some common qmp result checks into generic pattern
  iotests: drop some extra semicolons
  iotests: drop some extra ** in qmp() call
  iotests.py: pause_job(): drop return value
  tests/vm/basevm.py: use cmd() instead of qmp()
  scripts: add python_qmp_updater.py
  python: use vm.cmd() instead of vm.qmp() where appropriate

 docs/devel/testing.rst|  10 +-
 python/qemu/machine/machine.py|  20 +-
 python/qemu/qmp/legacy.py |  10 +-
 python/qemu/qmp/qmp_shell.py  |  20 +-
 python/qemu/utils/qemu_ga_client.py   |   2 +-
 python/qemu/utils/qom.py  |   8 +-
 python/qemu/utils/qom_common.py   |   2 +-
 python/qemu/utils/qom_fuse.py |   6 +-
 scripts/cpu-x86-uarch-abi.py  |   8 +-
 scripts/device-crash-test |   8 +-
 scripts/python_qmp_updater.py | 136 +
 scripts/render_block_graph.py |   8 +-
 tests/avocado/avocado_qemu/__init__.py|   4 +-
 tests/avocado/cpu_queries.py  |   5 +-
 tests/avocado/hotplug_cpu.py  |  10 +-
 tests/avocado/info_usernet.py |   4 +-
 tests/avocado/machine_arm_integratorcp.py |   6 +-
 tests/avocado/machine_aspeed.py   |  12 +-
 tests/avocado/machine_m68k_nextcube.py|   4 +-
 tests/avocado/machine_mips_malta.py   |   6 +-
 tests/avocado/machine_s390_ccw_virtio.py  |  28 +-
 tests/avocado/migration.py|  10 +-
 tests/avocado/pc_cpu_hotplug_props.py |   2 +-
 tests/avocado/version.py  |   4 +-
 tests/avocado/virtio_check_params.py  |   6 +-
 tests/avocado/virtio_version.py   |   5 +-
 tests/avocado/vnc.py  |  16 +-
 tests/avocado/x86_cpu_model_versions.py   |  13 +-
 tests/migration/guestperf/engine.py   | 150 +++---
 tests/qemu-iotests/030| 168 +++---
 tests/qemu-iotests/040| 171 +++
 tests/qemu-iotests/041| 482 --
 tests/qemu-iotests/045|  15 +-
 tests/qemu-iotests/055|  62 +--
 tests/qemu-iotests/056|  77 ++-
 tests/qemu-iotests/093|  42 +-
 tests/qemu-iotests/118| 225 
 tests/qemu-iotests/124| 102 ++--
 tests/qemu-iotests/129|  14 +-
 tests/qemu-iotests/132|   5 +-
 tests/qemu-iotests/139|  45 +-
 tests/qemu-iotests/147|  30 +-
 tests/qemu-iotests/151| 103 ++--
 tests/qemu-iotests/152|   8 +-
 tests/qemu-iotests/155|  55 +-
 tests/qemu-iotests/165|   8 +-
 tests/qemu-iotests/196|   3 +-
 tests/qemu-iotests/205|   6 +-
 tests/qemu-iotests/218| 105 ++--
 tests/qemu-iotests/245| 245 -
 tests/qemu-iotests/256|  34 +-
 tests/qemu-iotests/257|  36 +-
 tests/qemu-iotests/264|  31 +-
 tests/qemu-iotests/281|  21 +-
 tests/qemu-iotests/295|  16 +-
 tests/qemu-iotests/296|  21 +-
 tests/qemu-iotests/298|  13 +-
 tests/qemu-iotests/300|  54 +-
 tests/qemu-iotests/iotests.py |  21 +-
 .../tests/backing-file-invalidation   |  11 +-
 tests/qemu-iotests/tests/copy-before-write|  15 +-
 .../tests/export-incoming-iothread|   6 +-
 .../qemu-iotests/tests/graph-changes-while-io |  18 +-
 tests/qemu-iotests/tests/image-fleecing   |   3 +-
 .../tests/migrate-bitmaps-postcopy-test   |  31 

[PATCH v7 01/15] python/qemu/qmp/legacy: cmd(): drop cmd_id unused argument

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
The argument is unused; let's drop it for now, as we are going to
refactor the interface and don't want to refactor unused things.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 python/qemu/qmp/legacy.py | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/python/qemu/qmp/legacy.py b/python/qemu/qmp/legacy.py
index e1e9383978..fe115e301c 100644
--- a/python/qemu/qmp/legacy.py
+++ b/python/qemu/qmp/legacy.py
@@ -195,20 +195,16 @@ def cmd_obj(self, qmp_cmd: QMPMessage) -> QMPMessage:
 )
 
 def cmd(self, name: str,
-args: Optional[Dict[str, object]] = None,
-cmd_id: Optional[object] = None) -> QMPMessage:
+args: Optional[Dict[str, object]] = None) -> QMPMessage:
 """
 Build a QMP command and send it to the QMP Monitor.
 
 :param name: command name (string)
 :param args: command arguments (dict)
-:param cmd_id: command id (dict, list, string or int)
 """
 qmp_cmd: QMPMessage = {'execute': name}
 if args:
 qmp_cmd['arguments'] = args
-if cmd_id:
-qmp_cmd['id'] = cmd_id
 return self.cmd_obj(qmp_cmd)
 
 def command(self, cmd: str, **kwds: object) -> QMPReturnValue:
-- 
2.34.1




[PATCH v7 04/15] python: rename QEMUMonitorProtocol.cmd() to cmd_raw()

2023-10-06 Thread Vladimir Sementsov-Ogievskiy
Having cmd() and command() methods in one class doesn't look good.
Rename cmd() to cmd_raw(), to show its meaning better.

We also want to rename command() to cmd() in the future, so this commit is
a necessary step.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 python/qemu/machine/machine.py | 2 +-
 python/qemu/qmp/legacy.py  | 4 ++--
 tests/qemu-iotests/iotests.py  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 35d5a672db..dd1a79cb37 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -692,7 +692,7 @@ def qmp(self, cmd: str,
 conv_keys = True
 
 qmp_args = self._qmp_args(conv_keys, args)
-ret = self._qmp.cmd(cmd, args=qmp_args)
+ret = self._qmp.cmd_raw(cmd, args=qmp_args)
 if cmd == 'quit' and 'error' not in ret and 'return' in ret:
 self._quit_issued = True
 return ret
diff --git a/python/qemu/qmp/legacy.py b/python/qemu/qmp/legacy.py
index fe115e301c..e5fa1ce9c4 100644
--- a/python/qemu/qmp/legacy.py
+++ b/python/qemu/qmp/legacy.py
@@ -194,8 +194,8 @@ def cmd_obj(self, qmp_cmd: QMPMessage) -> QMPMessage:
 )
 )
 
-def cmd(self, name: str,
-args: Optional[Dict[str, object]] = None) -> QMPMessage:
+def cmd_raw(self, name: str,
+args: Optional[Dict[str, object]] = None) -> QMPMessage:
 """
 Build a QMP command and send it to the QMP Monitor.
 
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ef66fbd62b..8ffd9fb660 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -460,7 +460,7 @@ def __init__(self, *args: str, instance_id: str = 'a', qmp: bool = False):
 def qmp(self, cmd: str, args: Optional[Dict[str, object]] = None) \
 -> QMPMessage:
 assert self._qmp is not None
-return self._qmp.cmd(cmd, args)
+return self._qmp.cmd_raw(cmd, args)
 
 def get_qmp(self) -> QEMUMonitorProtocol:
 assert self._qmp is not None
-- 
2.34.1




Re: [Virtio-fs] (no subject)

2023-10-06 Thread Alex Bennée


Hanna Czenczek  writes:

> On 06.10.23 12:34, Michael S. Tsirkin wrote:
>> On Fri, Oct 06, 2023 at 11:47:55AM +0200, Hanna Czenczek wrote:
>>> On 06.10.23 11:26, Michael S. Tsirkin wrote:
 On Fri, Oct 06, 2023 at 11:15:55AM +0200, Hanna Czenczek wrote:
> On 06.10.23 10:45, Michael S. Tsirkin wrote:
>> On Fri, Oct 06, 2023 at 09:48:14AM +0200, Hanna Czenczek wrote:
>>> On 05.10.23 19:15, Michael S. Tsirkin wrote:
 On Thu, Oct 05, 2023 at 01:08:52PM -0400, Stefan Hajnoczi wrote:
> On Wed, Oct 04, 2023 at 02:58:57PM +0200, Hanna Czenczek wrote:

>>> What I’m saying is, 923b8921d21 introduced SET_STATUS calls that broke all
>>> devices that would implement them as per virtio spec, and even today it’s
>>> broken for stateful devices.  The mentioned performance issue is likely
>>> real, but we can’t address it by making up SET_STATUS calls that are wrong.
>>>
>>> I concede that I didn’t think about DRIVER_OK.  Personally, I would do all
>>> final configuration that would happen upon a DRIVER_OK once the first vring
>>> is started (i.e. receives a kick).  That has the added benefit of being
>>> asynchronous because it doesn’t block any vhost-user messages (which are
>>> synchronous, and thus block downtime).
>>>
>>> Hanna
>>
>> For better or worse kick is per ring. It's out of spec to start rings
>> that were not kicked but I guess you could do configuration ...
>> Seems somewhat asymmetrical though.
>
> I meant to take the first ring being started as the signal to do the
> global configuration, i.e. not do this once per vring, but once
> globally.
>
>> Let's wait until next week, hopefully Yajun Wu will answer.
>
> I mean, personally I don’t really care about the whole SET_STATUS
> thing.  It’s clear that it’s broken for stateful devices.  The fact
> that it took until 6f8be29ec17d to fix it for just any device that
> would implement it according to spec to me is a strong indication that
> nobody does implement it according to spec, and is currently only used
> to signal to some specific back-end that all rings have been set up
> and should be configured in a single block.

I'm certainly using [GS]ET_STATUS for the proposed F_TRANSPORT
extensions where everything is off-loaded to the vhost-user backend.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



RE: [PATCH v2 2/3] target/hexagon: fix some occurrences of -Wshadow=local

2023-10-06 Thread ltaylorsimpson



> -Original Message-
> From: Brian Cain 
> Sent: Thursday, October 5, 2023 4:22 PM
> To: qemu-devel@nongnu.org
> Cc: bc...@quicinc.com; arm...@redhat.com; richard.hender...@linaro.org;
> phi...@linaro.org; peter.mayd...@linaro.org; quic_mathb...@quicinc.com;
> stefa...@redhat.com; a...@rev.ng; a...@rev.ng;
> quic_mlie...@quicinc.com; ltaylorsimp...@gmail.com
> Subject: [PATCH v2 2/3] target/hexagon: fix some occurrences of -
> Wshadow=local
> 
> Of the changes in this commit, the changes in
> `HELPER(commit_hvx_stores)()` are less obvious.  They are required because
> of some macro invocations like SCATTER_OP_WRITE_TO_MEM().
> 
> e.g.:
> 
> In file included from ../target/hexagon/op_helper.c:31:
> ../target/hexagon/mmvec/macros.h:205:18: error: declaration of ‘i’
> shadows a previous local [-Werror=shadow=compatible-local]
>   205 | for (int i = 0; i < sizeof(MMVector); i += sizeof(TYPE)) 
> { \
>   |  ^
> ../target/hexagon/op_helper.c:157:17: note: in expansion of macro
> ‘SCATTER_OP_WRITE_TO_MEM’
>   157 | SCATTER_OP_WRITE_TO_MEM(uint16_t);
>   | ^~~
> ../target/hexagon/op_helper.c:135:9: note: shadowed declaration is here
>   135 | int i;
>   | ^
> In file included from ../target/hexagon/op_helper.c:31:
> ../target/hexagon/mmvec/macros.h:204:19: error: declaration of ‘ra’
> shadows a previous local [-Werror=shadow=compatible-local]
>   204 | uintptr_t ra = GETPC(); \
>   |   ^~
> ../target/hexagon/op_helper.c:160:17: note: in expansion of macro
> ‘SCATTER_OP_WRITE_TO_MEM’
>   160 | SCATTER_OP_WRITE_TO_MEM(uint32_t);
>   | ^~~
> ../target/hexagon/op_helper.c:134:15: note: shadowed declaration is here
>   134 | uintptr_t ra = GETPC();
>   |   ^~
> 
> Reviewed-by: Matheus Tavares Bernardino 
> Signed-off-by: Brian Cain 
> ---
>  target/hexagon/imported/alu.idef | 6 +++---
>  target/hexagon/mmvec/macros.h| 6 +++---
>  target/hexagon/op_helper.c   | 9 +++--
>  target/hexagon/translate.c   | 9 -
>  4 files changed, 13 insertions(+), 17 deletions(-)

Reviewed-by: Taylor Simpson 





[v3] Help wanted for enabling -Wshadow=local

2023-10-06 Thread Markus Armbruster
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand.  Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".

Enabling -Wshadow would prevent bugs like this one.  But we have to
clean up all the offenders first.

Quite a few people responded to my calls for help.  Thank you so much!

I'm collecting patches in my git repo at
https://repo.or.cz/qemu/armbru.git in branch shadow-next.  All but the
last two are in a pending pull request.

My test build is down to seven files with warnings.  "[PATCH v2 0/3]
hexagon: GETPC() and shadowing fixes" takes care of four, but it needs a
rebase.

Remaining three:

In file included from ../hw/display/virtio-gpu-virgl.c:19:
../hw/display/virtio-gpu-virgl.c: In function ‘virgl_cmd_submit_3d’:
/work/armbru/qemu/include/hw/virtio/virtio-gpu.h:228:16: warning: 
declaration of ‘s’ shadows a previous local [-Wshadow=compatible-local]
  228 | size_t s;   
\
  |^
../hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro 
‘VIRTIO_GPU_FILL_CMD’
  215 | VIRTIO_GPU_FILL_CMD(cs);
  | ^~~
../hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration is here
  213 | size_t s;
  |^

In file included from ../contrib/vhost-user-gpu/virgl.h:18,
 from ../contrib/vhost-user-gpu/virgl.c:17:
../contrib/vhost-user-gpu/virgl.c: In function ‘virgl_cmd_submit_3d’:
../contrib/vhost-user-gpu/vugpu.h:167:16: warning: declaration of ‘s’ 
shadows a previous local [-Wshadow=compatible-local]
  167 | size_t s;   \
  |^
../contrib/vhost-user-gpu/virgl.c:203:5: note: in expansion of macro 
‘VUGPU_FILL_CMD’
  203 | VUGPU_FILL_CMD(cs);
  | ^~
../contrib/vhost-user-gpu/virgl.c:201:12: note: shadowed declaration is here
  201 | size_t s;
  |^

../contrib/vhost-user-gpu/vhost-user-gpu.c: In function ‘vg_resource_flush’:
../contrib/vhost-user-gpu/vhost-user-gpu.c:837:29: warning: declaration of 
‘i’ shadows a previous local [-Wshadow=local]
  837 | pixman_image_t *i =
  | ^
../contrib/vhost-user-gpu/vhost-user-gpu.c:757:9: note: shadowed 
declaration is here
  757 | int i;
  | ^

Gerd, Marc-André, or anybody else?

More warnings may lurk in code my test build doesn't compile.  Need a
full CI build with -Wshadow=local to find them.  Anybody care to kick
one off?




Re: [PATCH 4/5] migration: Provide QMP access to downtime stats

2023-10-06 Thread Peter Xu
On Fri, Oct 06, 2023 at 12:37:15PM +0100, Joao Martins wrote:
> I added the statistics mainly for observability (e.g. you would grep in the
> libvirt logs for a non developer and they can understand how downtime is
> explained). I wasn't specifically thinking about management app using this, 
> just
> broad access to the metrics.
> 
> One can get the same level of observability with a BPF/dtrace/systemtap 
> script,
> albeit in a less obvious way.

Makes sense.

> 
> With respect to motivation: I am doing migration with VFs and sometimes
> vhost-net, and the downtime/switchover is the only thing that is either
> non-determinisc or not captured in the migration math. There are some things
> that aren't accounted (e.g. vhost with enough queues will give you high
> downtimes),

Will this be something relevant to loading of the queues?  There used to be
work on greatly reducing downtime, especially for virtio scenarios with
multiple queues (and IIRC even a single queue benefits from that); it wasn't
merged, probably for lack of review:

https://lore.kernel.org/r/20230317081904.24389-1-xuchuangxc...@bytedance.com

Though personally I think that's a direction worth continuing to explore;
maybe a slight enhancement to that series will work for us.

> and algorithimally not really possible to account for as one needs
> to account every possible instruction when we quiesce the guest (or at least
> that's my understanding).
> 
> Just having these metrics, help the developer *and* user see why such downtime
> is high, and maybe open up window for fixes/bug-reports or where to improve.
> 
> Furthermore, hopefully these tracepoints or stats could be a starting point 
> for
> developers to understand how much downtime is spent in a particular device in
> Qemu(as a follow-up to this series),

Yes, I was actually expecting that when I read the cover letter. :) This also
makes sense.  One thing worth mentioning is that the real downtime measured
can, IMHO, differ between src/dst, because "pre_save" and "post_load" may not
really do similar things.  IIUC it can happen that some device sends fast but
loads slowly.  I'm not sure whether the reversed case exists.  Maybe we want
to capture that on both sides in some metrics?

> or allow to implement bounds check limits in switchover limits in way
> that doesn't violate downtime-limit SLAs (I have a small set of patches
> for this).

I assume that decision will always be synchronized between src/dst in some
way, or guaranteed to be the same. But I can wait to read the series first.

Thanks,

-- 
Peter Xu




Re: [Virtio-fs] [PATCH v4 2/8] vhost-user.rst: Improve [GS]ET_VRING_BASE doc

2023-10-06 Thread Hanna Czenczek

On 06.10.23 15:55, Hanna Czenczek wrote:

On 06.10.23 10:49, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:

On 05.10.23 19:38, Stefan Hajnoczi wrote:

On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:


[...]


   ``VHOST_USER_GET_VRING_BASE``
 :id: 11
 :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
 :request payload: vring state description
-  :reply payload: vring state description
+  :reply payload: vring descriptor index/indices
+
+  Stops the vring and returns the current descriptor index or 
indices:

+
+    * For a split virtqueue, returns only the 16-bit next descriptor
+  index in the *Available Ring*.  The index in the *Used Ring* is
+  controlled by the guest driver and can be read from the vring
I find "is controlled by the guest driver" confusing. The device 
writes
the Used Ring index. The driver only reads it. The device is the 
active

party here.
Er, good point.  That breaks the whole reasoning.  Then I don’t 
understand
why we do get/set the available ring index and not the used ring 
index.  Do

you know why?

It's simple. used ring index in memory is controlled by the device and
reflects device state.


Exactly, it’s device state, that’s why I thought the front-end needs 
to ensure its read and restored around the reset we currently have in 
vhost_dev_stop()/start().



device can just read it back to restore.


I find it strange that the device is supposed to read its own state 
from memory.



available ring index in memory is controlled by driver and does
not reflect device state.


Why can’t the device read the available index from memory?  That value 
is put into memory by the driver precisely so the device can read it 
from there.


Ah, wait, is the idea that the device may have an internal available 
index counter that reflects what descriptor it has already fetched? I.e. 
this index will lag behind the one in memory, and the difference is new 
descriptors that the device still needs to read? If that internal 
counter is the index that’s get/set here, then yes, that makes a lot of 
sense.


Hanna




Re: [Virtio-fs] [PATCH v4 2/8] vhost-user.rst: Improve [GS]ET_VRING_BASE doc

2023-10-06 Thread Hanna Czenczek

On 06.10.23 10:49, Michael S. Tsirkin wrote:

On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:

On 05.10.23 19:38, Stefan Hajnoczi wrote:

On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:

GET_VRING_BASE does not mention that it stops the respective ring.  Fix
that.

Furthermore, it is not fully clear what the "base offset" these
commands' documentation refers to is; an offset could be many things.
Be more precise and verbose about it, especially given that these
commands use different payload structures depending on whether the vring
is split or packed.

Signed-off-by: Hanna Czenczek 
---
   docs/interop/vhost-user.rst | 66 ++---
   1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 2f68e67a1a..50f5acebe5 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -108,6 +108,37 @@ A vring state description
   :num: a 32-bit number
+A vring descriptor index for split virtqueues
+^
+
++-+-+
+| vring index | index in avail ring |
++-+-+
+
+:vring index: 32-bit index of the respective virtqueue
+
+:index in avail ring: 32-bit value, of which currently only the lower 16
+  bits are used:
+
+  - Bits 0–15: Next descriptor index in the *Available Ring*

I think we need to say more to make this implementable just by reading
the spec:

Index of the next *Available Ring* descriptor that the back-end will
process. This is a free-running index that is not wrapped by the ring
size.

Sure, thanks.


Feel free to rephrase.


+  - Bits 16–31: Reserved (set to zero)
+
+Vring descriptor indices for packed virtqueues
+^^
+
++-++
+| vring index | descriptor indices |
++-++
+
+:vring index: 32-bit index of the respective virtqueue
+
+:descriptor indices: 32-bit value:
+
+  - Bits 0–14: Index in the *Available Ring*

Same here.


+  - Bit 15: Driver (Available) Ring Wrap Counter
+  - Bits 16–30: Index in the *Used Ring*

Same here.


+  - Bit 31: Device (Used) Ring Wrap Counter
+
   A vring address description
   ^^^
@@ -1031,18 +1062,45 @@ Front-end message types
   ``VHOST_USER_SET_VRING_BASE``
 :id: 10
 :equivalent ioctl: ``VHOST_SET_VRING_BASE``
-  :request payload: vring state description
+  :request payload: vring descriptor index/indices
 :reply payload: N/A
-  Sets the base offset in the available vring.
+  Sets the next index to use for descriptors in this vring:
+
+  * For a split virtqueue, sets only the next descriptor index in the
+*Available Ring*.  The device is supposed to read the next index in
+the *Used Ring* from the respective vring structure in guest memory.
+
+  * For a packed virtqueue, both indices are supplied, as they are not
+explicitly available in memory.
+
+  Consequently, the payload type is specific to the type of virt queue
+  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
+  indices for packed virtqueues*).
   ``VHOST_USER_GET_VRING_BASE``
 :id: 11
 :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
 :request payload: vring state description
-  :reply payload: vring state description
+  :reply payload: vring descriptor index/indices
+
+  Stops the vring and returns the current descriptor index or indices:
+
+* For a split virtqueue, returns only the 16-bit next descriptor
+  index in the *Available Ring*.  The index in the *Used Ring* is
+  controlled by the guest driver and can be read from the vring

I find "is controlled by the guest driver" confusing. The device writes
the Used Ring index. The driver only reads it. The device is the active
party here.

Er, good point.  That breaks the whole reasoning.  Then I don’t understand
why we do get/set the available ring index and not the used ring index.  Do
you know why?

It's simple. used ring index in memory is controlled by the device and
reflects device state.


Exactly, it’s device state, that’s why I thought the front-end needs
to ensure it’s read and restored around the reset we currently have in
vhost_dev_stop()/start().



device can just read it back to restore.


I find it strange that the device is supposed to read its own state from 
memory.



available ring index in memory is controlled by driver and does
not reflect device state.


Why can’t the device read the available index from memory?  That value 
is put into memory by the driver precisely so the device can read it 
from there.


Hanna




[PATCH v2 07/10] target/riscv/tcg: add MISA user options hash

2023-10-06 Thread Daniel Henrique Barboza
We already track user choice for multi-letter extensions because we
needed to honor user choice when enabling/disabling extensions during
realize(). We refrained from adding the same mechanism for MISA
extensions since we didn't need it.

Profile support requires checking user choice for MISA extensions, so
let's add the corresponding hash now. It works like the existing
multi-letter hash (multi_ext_user_opts) but tracks MISA bit options in
the cpu_set_misa_ext_cfg() callback.

Note that we can't re-use the same hash from multi-letter extensions
because that hash uses cpu->cfg offsets as keys, while for MISA
extensions we're using MISA bits as keys.

After adding the user hash in cpu_set_misa_ext_cfg(), setting default
values with object_property_set_bool() in add_misa_properties() will end
up marking the user choice hash with them. Set the default value
manually to avoid it.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 8fb77e9e35..58de4428a9 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -34,6 +34,7 @@
 
 /* Hash that stores user set extensions */
 static GHashTable *multi_ext_user_opts;
+static GHashTable *misa_ext_user_opts;
 
 static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
 {
@@ -689,6 +690,10 @@ static void cpu_set_misa_ext_cfg(Object *obj, Visitor *v, 
const char *name,
 return;
 }
 
+g_hash_table_insert(misa_ext_user_opts,
+GUINT_TO_POINTER(misa_bit),
+(gpointer)value);
+
 prev_val = env->misa_ext & misa_bit;
 
 if (value == prev_val) {
@@ -752,6 +757,7 @@ static const RISCVCPUMisaExtConfig misa_ext_cfgs[] = {
  */
 static void riscv_cpu_add_misa_properties(Object *cpu_obj)
 {
+CPURISCVState *env = &RISCV_CPU(cpu_obj)->env;
 bool use_def_vals = riscv_cpu_is_generic(cpu_obj);
 int i;
 
@@ -772,7 +778,13 @@ static void riscv_cpu_add_misa_properties(Object *cpu_obj)
 NULL, (void *)misa_cfg);
 object_property_set_description(cpu_obj, name, desc);
 if (use_def_vals) {
-object_property_set_bool(cpu_obj, name, misa_cfg->enabled, NULL);
+if (misa_cfg->enabled) {
+env->misa_ext |= bit;
+env->misa_ext_mask |= bit;
+} else {
+env->misa_ext &= ~bit;
+env->misa_ext_mask &= ~bit;
+}
 }
 }
 }
@@ -967,6 +979,7 @@ static void tcg_cpu_instance_init(CPUState *cs)
 RISCVCPU *cpu = RISCV_CPU(cs);
 Object *obj = OBJECT(cpu);
 
+misa_ext_user_opts = g_hash_table_new(NULL, g_direct_equal);
 multi_ext_user_opts = g_hash_table_new(NULL, g_direct_equal);
 riscv_cpu_add_user_properties(obj);
 
-- 
2.41.0




[PATCH v2 04/10] target/riscv/kvm: add 'rva22u64' flag as unavailable

2023-10-06 Thread Daniel Henrique Barboza
KVM does not have the means to support enabling the rva22u64 profile.
The main reasons are:

- we're missing support for some mandatory rva22u64 extensions in the
  KVM module;

- we can't make promises about enabling a profile since it all depends
  on host support in the end.

We'll revisit this decision in the future if needed. For now mark the
'rva22u64' profile as unavailable when running a KVM CPU:

$ qemu-system-riscv64 -machine virt,accel=kvm -cpu rv64,rva22u64=true
qemu-system-riscv64: can't apply global rv64-riscv-cpu.rva22u64=true:
'rva22u64' is not available with KVM

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/kvm/kvm-cpu.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index c6615cb807..5f563b83df 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -358,7 +358,7 @@ static void cpu_set_cfg_unavailable(Object *obj, Visitor *v,
 }
 
 if (value) {
-error_setg(errp, "extension %s is not available with KVM",
+error_setg(errp, "'%s' is not available with KVM",
propname);
 }
 }
@@ -438,6 +438,11 @@ static void kvm_riscv_add_cpu_user_properties(Object 
*cpu_obj)
 riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_extensions);
 riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_vendor_exts);
 riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_experimental_exts);
+
+   /* We don't have the needed KVM support for profiles */
+for (i = 0; riscv_profiles[i] != NULL; i++) {
+riscv_cpu_add_kvm_unavail_prop(cpu_obj, riscv_profiles[i]->name);
+}
 }
 
 static int kvm_riscv_get_regs_core(CPUState *cs)
-- 
2.41.0




[PATCH v2 09/10] target/riscv/tcg: handle MISA bits on profile commit

2023-10-06 Thread Daniel Henrique Barboza
The profile support is handling multi-letter extensions only. Let's add
support for MISA bits as well.

We'll go through every known MISA bit. If the user set the bit, whether
to 'true' or 'false', ignore it. If the profile doesn't
declare the bit as mandatory, ignore it. Otherwise, set or clear the bit
in env->misa_ext and env->misa_ext_mask depending on whether the profile
was set to 'true' or 'false'.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index b1e778913c..d7540274f4 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -42,6 +42,12 @@ static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
  GUINT_TO_POINTER(ext_offset));
 }
 
+static bool cpu_misa_ext_is_user_set(uint32_t misa_bit)
+{
+return g_hash_table_contains(misa_ext_user_opts,
+ GUINT_TO_POINTER(misa_bit));
+}
+
 static void riscv_cpu_write_misa_bit(RISCVCPU *cpu, uint32_t bit,
  bool enabled)
 {
@@ -283,6 +289,20 @@ static void riscv_cpu_commit_profile(RISCVCPU *cpu, 
RISCVCPUProfile *profile)
 {
 int i;
 
+for (i = 0; misa_bits[i] != 0; i++) {
+uint32_t bit = misa_bits[i];
+
+if (cpu_misa_ext_is_user_set(bit) || !(profile->misa_ext & bit)) {
+continue;
+}
+
+g_hash_table_insert(misa_ext_user_opts,
+GUINT_TO_POINTER(bit),
+(gpointer)profile->enabled);
+
+riscv_cpu_write_misa_bit(cpu, bit, profile->enabled);
+}
+
 for (i = 0;; i++) {
 int ext_offset = profile->ext_offsets[i];
 
-- 
2.41.0




[PATCH v2 02/10] target/riscv/cpu.c: add zihpm extension flag

2023-10-06 Thread Daniel Henrique Barboza
zihpm is the Hardware Performance Counters extension described in
chapter 12 of the unprivileged spec. It describes support for 29
unprivileged performance counters, hpmcounter3-hpmcounter31.

As with zicntr, QEMU already implemented zihpm before it was even a
discrete extension. zihpm is also part of the RVA22 profile, so add it
to QEMU to complement the future profile implementation.

Default it to 'true' since it was always present in the code. Change the
realize() time validation to disable it in case 'icsr' isn't present and
if there are no hardware counters (cpu->cfg.pmu_num is zero).

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 4 +++-
 target/riscv/cpu_cfg.h | 1 +
 target/riscv/tcg/tcg-cpu.c | 4 
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8783a415b1..b3befccf89 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -84,6 +84,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zifencei, PRIV_VERSION_1_10_0, ext_ifencei),
 ISA_EXT_DATA_ENTRY(zihintntl, PRIV_VERSION_1_10_0, ext_zihintntl),
 ISA_EXT_DATA_ENTRY(zihintpause, PRIV_VERSION_1_10_0, ext_zihintpause),
+ISA_EXT_DATA_ENTRY(zihpm, PRIV_VERSION_1_12_0, ext_ihpm),
 ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul),
 ISA_EXT_DATA_ENTRY(zawrs, PRIV_VERSION_1_12_0, ext_zawrs),
 ISA_EXT_DATA_ENTRY(zfa, PRIV_VERSION_1_12_0, ext_zfa),
@@ -1267,10 +1268,11 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 MULTI_EXT_CFG_BOOL("svpbmt", ext_svpbmt, false),
 
 /*
- * Always default true - we'll disable it during
+ * Always default true - we'll disable them during
  * realize() if needed.
  */
 MULTI_EXT_CFG_BOOL("zicntr", ext_icntr, true),
+MULTI_EXT_CFG_BOOL("zihpm", ext_ihpm, true),
 
 MULTI_EXT_CFG_BOOL("zba", ext_zba, true),
 MULTI_EXT_CFG_BOOL("zbb", ext_zbb, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 671b8c7cb8..cf228546da 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -66,6 +66,7 @@ struct RISCVCPUConfig {
 bool ext_icsr;
 bool ext_icbom;
 bool ext_icboz;
+bool ext_ihpm;
 bool ext_zicond;
 bool ext_zihintntl;
 bool ext_zihintpause;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index df187bc143..731192bafc 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -546,6 +546,10 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
 cpu->cfg.ext_icntr = false;
 }
 
+if (cpu->cfg.ext_ihpm && (!cpu->cfg.ext_icsr || cpu->cfg.pmu_num == 0)) {
+cpu->cfg.ext_ihpm = false;
+}
+
 /*
  * Disable isa extensions based on priv spec after we
  * validated and set everything we need.
-- 
2.41.0




[PATCH v2 05/10] target/riscv/tcg: add user flag for profile support

2023-10-06 Thread Daniel Henrique Barboza
The TCG emulation implements all the extensions described in the
RVA22U64 profile, both mandatory and optional. The mandatory extensions
will be enabled via the profile flag. We'll leave the optional
extensions to be enabled by hand.

Given that this is the first profile we're implementing in TCG we'll
need some ground work first:

- all profiles declared in riscv_profiles[] will be exposed to users.
  TCG is the main accelerator we're considering when adding profile
  support in QEMU, so for now it's safe to assume that all profiles in
  riscv_profiles[] will be relevant to TCG;

- the set() callback for the profile user property will set the
  'user_set' flag for each profile that users enable/disable in the
  command line;

- we'll not support user profile settings for vendor CPUs. The flags
  will still be exposed but users won't be able to change them. The idea
  is that vendor CPUs in the future can enable profiles internally in
  their cpu_init() functions, showing to the external world that the CPU
  supports a certain profile. But users won't be able to enable/disable
  it.

For now we'll just expose the user flags for all profiles. Next patch
will introduce the 'commit profile' logic.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 46 ++
 1 file changed, 46 insertions(+)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 731192bafc..a8ea869e6e 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -740,6 +740,50 @@ static void riscv_cpu_add_misa_properties(Object *cpu_obj)
 }
 }
 
+static void cpu_set_profile(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+RISCVCPUProfile *profile = opaque;
+bool value;
+
+if (object_dynamic_cast(obj, TYPE_RISCV_DYNAMIC_CPU) == NULL) {
+error_setg(errp, "Profile %s only available for generic CPUs",
+   profile->name);
+return;
+}
+
+if (!visit_type_bool(v, name, &value, errp)) {
+return;
+}
+
+profile->user_set = true;
+profile->enabled = value;
+}
+
+static void cpu_get_profile(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+RISCVCPUProfile *profile = opaque;
+bool value = profile->enabled;
+
+visit_type_bool(v, name, &value, errp);
+}
+
+static void riscv_cpu_add_profiles(Object *cpu_obj)
+{
+for (int i = 0;; i++) {
+const RISCVCPUProfile *profile = riscv_profiles[i];
+
+if (!profile) {
+break;
+}
+
+object_property_add(cpu_obj, profile->name, "bool",
+cpu_get_profile, cpu_set_profile,
+NULL, (void *)profile);
+}
+}
+
 static void cpu_set_multi_ext_cfg(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -834,6 +878,8 @@ static void riscv_cpu_add_user_properties(Object *obj)
 riscv_cpu_add_multiext_prop_array(obj, riscv_cpu_vendor_exts);
 riscv_cpu_add_multiext_prop_array(obj, riscv_cpu_experimental_exts);
 
+riscv_cpu_add_profiles(obj);
+
 for (Property *prop = riscv_cpu_options; prop && prop->name; prop++) {
 qdev_property_add_static(DEVICE(obj), prop);
 }
-- 
2.41.0




[PATCH v2 08/10] target/riscv/tcg: add riscv_cpu_write_misa_bit()

2023-10-06 Thread Daniel Henrique Barboza
We have two instances of the pattern of setting/clearing a MISA bit in
env->misa_ext and env->misa_ext_mask. And the next patch will
end up adding one more.

Create a helper to avoid code repetition.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 44 --
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 58de4428a9..b1e778913c 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -42,6 +42,20 @@ static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
  GUINT_TO_POINTER(ext_offset));
 }
 
+static void riscv_cpu_write_misa_bit(RISCVCPU *cpu, uint32_t bit,
+ bool enabled)
+{
+CPURISCVState *env = &cpu->env;
+
+if (enabled) {
+env->misa_ext |= bit;
+env->misa_ext_mask |= bit;
+} else {
+env->misa_ext &= ~bit;
+env->misa_ext_mask &= ~bit;
+}
+}
+
 static void riscv_cpu_synchronize_from_tb(CPUState *cs,
   const TranslationBlock *tb)
 {
@@ -700,20 +714,14 @@ static void cpu_set_misa_ext_cfg(Object *obj, Visitor *v, 
const char *name,
 return;
 }
 
-if (value) {
-if (!generic_cpu) {
-g_autofree char *cpuname = riscv_cpu_get_name(cpu);
-error_setg(errp, "'%s' CPU does not allow enabling extensions",
-   cpuname);
-return;
-}
-
-env->misa_ext |= misa_bit;
-env->misa_ext_mask |= misa_bit;
-} else {
-env->misa_ext &= ~misa_bit;
-env->misa_ext_mask &= ~misa_bit;
+if (value && !generic_cpu) {
+g_autofree char *cpuname = riscv_cpu_get_name(cpu);
+error_setg(errp, "'%s' CPU does not allow enabling extensions",
+   cpuname);
+return;
 }
+
+riscv_cpu_write_misa_bit(cpu, misa_bit, value);
 }
 
 static void cpu_get_misa_ext_cfg(Object *obj, Visitor *v, const char *name,
@@ -757,7 +765,6 @@ static const RISCVCPUMisaExtConfig misa_ext_cfgs[] = {
  */
 static void riscv_cpu_add_misa_properties(Object *cpu_obj)
 {
-CPURISCVState *env = &RISCV_CPU(cpu_obj)->env;
 bool use_def_vals = riscv_cpu_is_generic(cpu_obj);
 int i;
 
@@ -778,13 +785,8 @@ static void riscv_cpu_add_misa_properties(Object *cpu_obj)
 NULL, (void *)misa_cfg);
 object_property_set_description(cpu_obj, name, desc);
 if (use_def_vals) {
-if (misa_cfg->enabled) {
-env->misa_ext |= bit;
-env->misa_ext_mask |= bit;
-} else {
-env->misa_ext &= ~bit;
-env->misa_ext_mask &= ~bit;
-}
+riscv_cpu_write_misa_bit(RISCV_CPU(cpu_obj), bit,
+ misa_cfg->enabled);
 }
 }
 }
-- 
2.41.0




[PATCH v2 00/10] riscv: RVA22U64 profile support

2023-10-06 Thread Daniel Henrique Barboza
Hi,

Several design changes were made in this version after the reviews and
feedback in the v1 [1]. The high-level summary is:

- we'll no longer allow users to set profile flags for vendor CPUs. If
  we're to adhere to the current policy of not allowing users to enable
  extensions for vendor CPUs, the profile support would become a
  glorified way of checking if the vendor CPU happens to support a
  specific profile. If a future vendor CPU supports a profile the CPU
  can declare it manually in its cpu_init() function, the flag will
  still be set, but users can't change it;

- disabling a profile will now disable all the mandatory extensions from
  the CPU;

- the profile logic was moved to realize() time in a step we're calling
  'commit profile'. This allows us to enable/disable profile extensions
  after considering user input in other individual extensions. The
  result is that we don't care about the order in which the profile flag
  was set in comparison with other extensions in the command line, i.e.
  the following lines are equal:

  -cpu rv64,zicbom=false,rva22u64=true,Zifencei=false

  -cpu rv64,rva22u64=true,zicbom=false,Zifencei=false

  and they mean 'enable the rva22u64 profile while keeping zicbom and
  Zifencei disabled'.


Other minor changes were needed as result of these design changes. E.g.
we're now having to track MISA extensions set by users (patch 7),
something that we were doing only for multi-letter extensions.

Changes from v1:
- patch 6 from v1 ("target/riscv/kvm: add 'rva22u64' flag as unavailable"):
- moved up to patch 4
- patch 5 from v1("target/riscv/tcg-cpu.c: enable profile support for vendor 
CPUs"):
- dropped
- patch 6 (new):
  - add riscv_cpu_commit_profile()
- patch 7 (new):
  - add user choice hash for MISA extensions
- patch 9 (new):
- handle MISA bits user choice when committing profiles
- patch 8 and 10 (new):
  - helpers to avoid code repetition
- v1 link: 
https://lore.kernel.org/qemu-riscv/20230926194951.183767-1-dbarb...@ventanamicro.com/


Daniel Henrique Barboza (10):
  target/riscv/cpu.c: add zicntr extension flag
  target/riscv/cpu.c: add zihpm extension flag
  target/riscv: add rva22u64 profile definition
  target/riscv/kvm: add 'rva22u64' flag as unavailable
  target/riscv/tcg: add user flag for profile support
  target/riscv/tcg: commit profiles during realize()
  target/riscv/tcg: add MISA user options hash
  target/riscv/tcg: add riscv_cpu_write_misa_bit()
  target/riscv/tcg: handle MISA bits on profile commit
  target/riscv/tcg: add hash table insert helpers

 target/riscv/cpu.c |  29 +++
 target/riscv/cpu.h |  12 +++
 target/riscv/cpu_cfg.h |   2 +
 target/riscv/kvm/kvm-cpu.c |   7 +-
 target/riscv/tcg/tcg-cpu.c | 165 +
 5 files changed, 197 insertions(+), 18 deletions(-)

-- 
2.41.0




[PATCH v2 03/10] target/riscv: add rva22u64 profile definition

2023-10-06 Thread Daniel Henrique Barboza
The rva22U64 profile, described in:

https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva22-profiles

Contains a set of CPU extensions aimed at 64-bit userspace
applications. Allowing this set to be enabled via a single user flag
makes it convenient to turn on a predictable set of features for the
CPU, giving users more predictability when running/testing their
workloads.

QEMU implements all possible extensions of this profile. The exception
is Zicbop (Cache-Block Prefetch Operations), which is not available since
QEMU RISC-V does not implement a cache model. For this same reason all
the so called 'synthetic extensions' described in the profile that are
cache related are ignored (Za64rs, Zic64b, Ziccif, Ziccrse, Ziccamoa,
Zicclsm).

An abstraction called RISCVCPUProfile is created to store the profile.
'ext_offsets' contains mandatory extensions that QEMU supports. Same
thing with the 'misa_ext' mask. Optional extensions must be enabled
manually in the command line if desired.

The design here is to use the common target/riscv/cpu.c file to store
the profile declaration and export it to the accelerator files. Each
accelerator is then responsible to expose it (or not) to users and how
to enable the extensions.

Next patches will implement the profile for TCG and KVM.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 20 
 target/riscv/cpu.h | 12 
 2 files changed, 32 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b3befccf89..a439ff57a4 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1376,6 +1376,26 @@ Property riscv_cpu_options[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+/* Optional extensions left out: RVV, zfh, zkn, zks */
+static RISCVCPUProfile RVA22U64 = {
+.name = "rva22u64",
+.misa_ext = RVM | RVA | RVF | RVD | RVC,
+.ext_offsets = {
+CPU_CFG_OFFSET(ext_icsr), CPU_CFG_OFFSET(ext_zihintpause),
+CPU_CFG_OFFSET(ext_zba), CPU_CFG_OFFSET(ext_zbb),
+CPU_CFG_OFFSET(ext_zbs), CPU_CFG_OFFSET(ext_zfhmin),
+CPU_CFG_OFFSET(ext_zkt), CPU_CFG_OFFSET(ext_icntr),
+CPU_CFG_OFFSET(ext_ihpm), CPU_CFG_OFFSET(ext_icbom),
+CPU_CFG_OFFSET(ext_icboz),
+
+RISCV_PROFILE_EXT_LIST_END
+}
+};
+
+RISCVCPUProfile *riscv_profiles[] = {
&RVA22U64, NULL,
+};
+
 static Property riscv_cpu_properties[] = {
 DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 3f11e69223..216bbbe7cd 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -66,6 +66,18 @@ const char *riscv_get_misa_ext_description(uint32_t bit);
 
 #define CPU_CFG_OFFSET(_prop) offsetof(struct RISCVCPUConfig, _prop)
 
+typedef struct riscv_cpu_profile {
+const char *name;
+uint32_t misa_ext;
+bool enabled;
+bool user_set;
+const int32_t ext_offsets[];
+} RISCVCPUProfile;
+
+#define RISCV_PROFILE_EXT_LIST_END -1
+
+extern RISCVCPUProfile *riscv_profiles[];
+
 /* Privileged specification version */
 enum {
 PRIV_VERSION_1_10_0 = 0,
-- 
2.41.0




[PATCH v2 06/10] target/riscv/tcg: commit profiles during realize()

2023-10-06 Thread Daniel Henrique Barboza
To 'commit' a profile means enabling/disabling all its mandatory
extensions after taking into account individual user choice w.r.t MISA
and multi-letter extensions. We'll handle multi-letter extensions now;
MISA extensions need additional steps that we'll take care of later.

riscv_cpu_manage_profiles() will iterate over all profiles available
in QEMU and call riscv_cpu_commit_profile() for any profile that the
user set, either to 'true' or 'false'.

Setting a profile to 'true' means 'enable all mandatory extensions of
this profile'. Setting it to 'false' means disabling all its mandatory
extensions. Since we're doing it during realize() time we already have
all user choices for individual extensions sorted out, and they'll take
precedence. This will make us independent of left-to-right ordering in
the QEMU command line, i.e. the following QEMU command lines:

-cpu rv64,zicbom=false,rva22u64=true,Zifencei=false

-cpu rv64,zicbom=false,Zifencei=false,rva22u64=true

-cpu rv64,rva22u64=true,zicbom=false,Zifencei=false

They mean the same thing: "enable all mandatory extensions of the
rva22u64 profile while keeping zicbom and Zifencei disabled".

Enabling extensions via the profile is also considered a user choice, so
all extensions enabled this way will be added to the multi_ext_user_opts hash.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index a8ea869e6e..8fb77e9e35 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -264,6 +264,41 @@ static void riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU 
*cpu)
 }
 }
 
+static void riscv_cpu_commit_profile(RISCVCPU *cpu, RISCVCPUProfile *profile)
+{
+int i;
+
+for (i = 0;; i++) {
+int ext_offset = profile->ext_offsets[i];
+
+if (ext_offset == RISCV_PROFILE_EXT_LIST_END) {
+break;
+}
+
+if (cpu_cfg_ext_is_user_set(ext_offset)) {
+continue;
+}
+
+g_hash_table_insert(multi_ext_user_opts,
+GUINT_TO_POINTER(ext_offset),
+(gpointer)profile->enabled);
+isa_ext_update_enabled(cpu, ext_offset, profile->enabled);
+}
+}
+
+static void riscv_cpu_manage_profiles(RISCVCPU *cpu)
+{
+for (int i = 0; riscv_profiles[i] != NULL; i++) {
+RISCVCPUProfile *profile = riscv_profiles[i];
+
+if (!profile->user_set) {
+continue;
+}
+
+riscv_cpu_commit_profile(cpu, profile);
+}
+}
+
 /*
  * Check consistency between chosen extensions while setting
  * cpu->cfg accordingly.
@@ -273,6 +308,8 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error 
**errp)
CPURISCVState *env = &cpu->env;
 Error *local_err = NULL;
 
+riscv_cpu_manage_profiles(cpu);
+
 /* Do some ISA extension error checking */
 if (riscv_has_ext(env, RVG) &&
 !(riscv_has_ext(env, RVI) && riscv_has_ext(env, RVM) &&
-- 
2.41.0




[PATCH v2 10/10] target/riscv/tcg: add hash table insert helpers

2023-10-06 Thread Daniel Henrique Barboza
Recent patches added several instances of the g_hash_table_insert()
pattern. Add two helpers, one for each user hash, to make the code cleaner.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/tcg/tcg-cpu.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index d7540274f4..d4ad1c09b3 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -42,12 +42,24 @@ static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
  GUINT_TO_POINTER(ext_offset));
 }
 
+static void cpu_cfg_ext_add_user_opt(uint32_t ext_offset, bool value)
+{
+g_hash_table_insert(multi_ext_user_opts, GUINT_TO_POINTER(ext_offset),
+(gpointer)value);
+}
+
 static bool cpu_misa_ext_is_user_set(uint32_t misa_bit)
 {
 return g_hash_table_contains(misa_ext_user_opts,
  GUINT_TO_POINTER(misa_bit));
 }
 
+static void cpu_misa_ext_add_user_opt(uint32_t bit, bool value)
+{
+g_hash_table_insert(misa_ext_user_opts, GUINT_TO_POINTER(bit),
+(gpointer)value);
+}
+
 static void riscv_cpu_write_misa_bit(RISCVCPU *cpu, uint32_t bit,
  bool enabled)
 {
@@ -296,9 +308,7 @@ static void riscv_cpu_commit_profile(RISCVCPU *cpu, 
RISCVCPUProfile *profile)
 continue;
 }
 
-g_hash_table_insert(misa_ext_user_opts,
-GUINT_TO_POINTER(bit),
-(gpointer)profile->enabled);
+cpu_misa_ext_add_user_opt(bit, profile->enabled);
 
 riscv_cpu_write_misa_bit(cpu, bit, profile->enabled);
 }
@@ -314,9 +324,8 @@ static void riscv_cpu_commit_profile(RISCVCPU *cpu, 
RISCVCPUProfile *profile)
 continue;
 }
 
-g_hash_table_insert(multi_ext_user_opts,
-GUINT_TO_POINTER(ext_offset),
-(gpointer)profile->enabled);
+cpu_cfg_ext_add_user_opt(ext_offset, profile->enabled);
+
 isa_ext_update_enabled(cpu, ext_offset, profile->enabled);
 }
 }
@@ -724,9 +733,7 @@ static void cpu_set_misa_ext_cfg(Object *obj, Visitor *v, 
const char *name,
 return;
 }
 
-g_hash_table_insert(misa_ext_user_opts,
-GUINT_TO_POINTER(misa_bit),
-(gpointer)value);
+cpu_misa_ext_add_user_opt(misa_bit, value);
 
 prev_val = env->misa_ext & misa_bit;
 
@@ -867,9 +874,7 @@ static void cpu_set_multi_ext_cfg(Object *obj, Visitor *v, 
const char *name,
 return;
 }
 
-g_hash_table_insert(multi_ext_user_opts,
-GUINT_TO_POINTER(multi_ext_cfg->offset),
-(gpointer)value);
+cpu_cfg_ext_add_user_opt(multi_ext_cfg->offset, value);
 
 prev_val = isa_ext_is_enabled(cpu, multi_ext_cfg->offset);
 
-- 
2.41.0




[PATCH v2 01/10] target/riscv/cpu.c: add zicntr extension flag

2023-10-06 Thread Daniel Henrique Barboza
zicntr is the Base Counters and Timers extension described in chapter 12
of the unprivileged spec. It describes support for RDCYCLE, RDTIME and
RDINSTRET.

QEMU already implemented it well before it was a discrete extension.
zicntr is part of the RVA22 profile, so let's add it to QEMU to make
the flag set complete for the future profile implementation.

Given that it represents an already existing feature, default it to
'true'. Change the realize() time validation to disable it in case its
dependency (icsr) isn't present.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 7 +++
 target/riscv/cpu_cfg.h | 1 +
 target/riscv/tcg/tcg-cpu.c | 4 
 3 files changed, 12 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 521bb88538..8783a415b1 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -79,6 +79,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zicbom, PRIV_VERSION_1_12_0, ext_icbom),
 ISA_EXT_DATA_ENTRY(zicboz, PRIV_VERSION_1_12_0, ext_icboz),
 ISA_EXT_DATA_ENTRY(zicond, PRIV_VERSION_1_12_0, ext_zicond),
+ISA_EXT_DATA_ENTRY(zicntr, PRIV_VERSION_1_12_0, ext_icntr),
 ISA_EXT_DATA_ENTRY(zicsr, PRIV_VERSION_1_10_0, ext_icsr),
 ISA_EXT_DATA_ENTRY(zifencei, PRIV_VERSION_1_10_0, ext_ifencei),
 ISA_EXT_DATA_ENTRY(zihintntl, PRIV_VERSION_1_10_0, ext_zihintntl),
@@ -1265,6 +1266,12 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 MULTI_EXT_CFG_BOOL("svnapot", ext_svnapot, false),
 MULTI_EXT_CFG_BOOL("svpbmt", ext_svpbmt, false),
 
+/*
+ * Always default true - we'll disable it during
+ * realize() if needed.
+ */
+MULTI_EXT_CFG_BOOL("zicntr", ext_icntr, true),
+
 MULTI_EXT_CFG_BOOL("zba", ext_zba, true),
 MULTI_EXT_CFG_BOOL("zbb", ext_zbb, true),
 MULTI_EXT_CFG_BOOL("zbc", ext_zbc, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 0e6a0f245c..671b8c7cb8 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -62,6 +62,7 @@ struct RISCVCPUConfig {
 bool ext_zksh;
 bool ext_zkt;
 bool ext_ifencei;
+bool ext_icntr;
 bool ext_icsr;
 bool ext_icbom;
 bool ext_icboz;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 08b806dc07..df187bc143 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -542,6 +542,10 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
 cpu_cfg_ext_auto_update(cpu, CPU_CFG_OFFSET(ext_zksh), true);
 }
 
+if (cpu->cfg.ext_icntr && !cpu->cfg.ext_icsr) {
+cpu->cfg.ext_icntr = false;
+}
+
 /*
  * Disable isa extensions based on priv spec after we
  * validated and set everything we need.
-- 
2.41.0



