Re: [PATCH v2 3/3] docs/about: Add the missing release record in the subject

2021-08-22 Thread Thomas Huth

On 23/08/2021 05.00, Yanan Wang wrote:

Commit 29e0447551
(docs/about/removed-features: Document removed CLI options from QEMU v3.1)
has recorded some CLI options as replaced/removed from QEMU v3.1, but one
of the subjects has missed the release record. Let's fix it.

Reported-by: Cornelia Huck 
Signed-off-by: Yanan Wang 
---
  docs/about/removed-features.rst | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index 1c926a8bc1..8feeead449 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -140,8 +140,8 @@ Use ``-rtc driftfix=slew`` instead.
  
  Replaced by ``-rtc base=date``.
  
-``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...``

-'
+``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...`` 
(removed in 3.1)
+''


Reviewed-by: Thomas Huth 




Re: [PATCH v2 2/3] docs/about: Unify the subject format

2021-08-22 Thread Thomas Huth

On 23/08/2021 05.00, Yanan Wang wrote:

There is a mixture of "since/removed in X.Y" vs "since/removed in X.Y.Z"
in the subjects in deprecated.rst/removed-features.rst. It will be better
to use an unified format. It seems unlikely that we will ever deprecate
something in a stable release, and even more unlikely that we'll remove
something in one, so the short versions look like the thing we want to
standardize on.

So here we unify the subject format in deprecated.rst to "since X.Y", and
unify the subject format in removed-features.rst to "removed in X.Y".

Signed-off-by: Yanan Wang 
Reviewed-by: Cornelia Huck 
---
  docs/about/deprecated.rst   | 56 -
  docs/about/removed-features.rst | 28 -
  2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 6d438f1c8d..8d4fd384a5 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -107,8 +107,8 @@ the process listing. This is replaced by the new 
``password-secret``
  option which lets the password be securely provided on the command
  line using a ``secret`` object instance.
  
-``opened`` property of ``rng-*`` objects (since 6.0.0)

-''
+``opened`` property of ``rng-*`` objects (since 6.0)
+
  
  The only effect of specifying ``opened=on`` in the command line or QMP

  ``object-add`` is that the device is opened immediately, possibly before all
@@ -116,8 +116,8 @@ other options have been processed.  This will either have 
no effect (if
  ``opened`` was the last option) or cause errors.  The property is therefore
  useless and should not be specified.
  
-``loaded`` property of ``secret`` and ``secret_keyring`` objects (since 6.0.0)

-''
+``loaded`` property of ``secret`` and ``secret_keyring`` objects (since 6.0)
+
  
  The only effect of specifying ``loaded=on`` in the command line or QMP

  ``object-add`` is that the secret is loaded immediately, possibly before all
@@ -142,33 +142,33 @@ should be used instead.
  QEMU Machine Protocol (QMP) commands
  
  
-``blockdev-open-tray``, ``blockdev-close-tray`` argument ``device`` (since 2.8.0)

-'
+``blockdev-open-tray``, ``blockdev-close-tray`` argument ``device`` (since 2.8)
+'''
  
  Use argument ``id`` instead.
  
-``eject`` argument ``device`` (since 2.8.0)

-'''
+``eject`` argument ``device`` (since 2.8)
+'
  
  Use argument ``id`` instead.
  
-``blockdev-change-medium`` argument ``device`` (since 2.8.0)

-
+``blockdev-change-medium`` argument ``device`` (since 2.8)
+''
  
  Use argument ``id`` instead.
  
-``block_set_io_throttle`` argument ``device`` (since 2.8.0)

-'''
+``block_set_io_throttle`` argument ``device`` (since 2.8)
+'
  
  Use argument ``id`` instead.
  
-``blockdev-add`` empty string argument ``backing`` (since 2.10.0)

-'
+``blockdev-add`` empty string argument ``backing`` (since 2.10)
+'''
  
  Use argument value ``null`` instead.
  
-``block-commit`` arguments ``base`` and ``top`` (since 3.1.0)

-'
+``block-commit`` arguments ``base`` and ``top`` (since 3.1)
+'''
  
  Use arguments ``base-node`` and ``top-node`` instead.
  
@@ -191,8 +191,8 @@ from Linux upstream kernel, declare it deprecated.

  System emulator CPUS
  
  
-``Icelake-Client`` CPU Model (since 5.2.0)

-''
+``Icelake-Client`` CPU Model (since 5.2)
+
  
  ``Icelake-Client`` CPU Models are deprecated. Use ``Icelake-Server`` CPU

  Models instead.
@@ -245,8 +245,8 @@ Device options
  Emulated device options
  '''
  
-``-device virtio-blk,scsi=on|off`` (since 5.0.0)

-
+``-device virtio-blk,scsi=on|off`` (since 5.0)
+^^
  
  The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature.  VIRTIO 1.0

  and later do not support it because the virtio-scsi device was introduced for
@@ -258,14 +258,14 @@ 

Re: [PATCH v2 1/3] docs/about: Remove the duplicated doc

2021-08-22 Thread Thomas Huth

On 23/08/2021 05.00, Yanan Wang wrote:

There are two places describing the same thing about deprecation
of invalid topologies of -smp CLI, so remove the duplicated one.

Signed-off-by: Yanan Wang 
Reviewed-by: Cornelia Huck 
---
  docs/about/removed-features.rst | 13 -
  1 file changed, 13 deletions(-)

diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index cbfa1a8e31..6a9c5bb484 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -647,19 +647,6 @@ as ignored. Currently, users are responsible for making 
sure the backing storage
  specified with ``-mem-path`` can actually provide the guest RAM configured 
with
  ``-m`` and QEMU fails to start up if RAM allocation is unsuccessful.
  
-``-smp`` (invalid topologies) (removed 5.2)

-'''
-
-CPU topology properties should describe whole machine topology including
-possible CPUs.
-
-However, historically it was possible to start QEMU with an incorrect topology
-where *n* <= *sockets* * *cores* * *threads* < *maxcpus*,
-which could lead to an incorrect topology enumeration by the guest.
-Support for invalid topologies is removed, the user must ensure
-topologies described with -smp include all possible cpus, i.e.
-*sockets* * *cores* * *threads* = *maxcpus*.
-
  ``-machine enforce-config-section=on|off`` (removed 5.2)
  
  



Reviewed-by: Thomas Huth 




[PATCH] hw/arm/smmuv3: Simplify range invalidation

2021-08-22 Thread Liu, Renwei
Simplify range invalidation which can avoid to iterate over all
iotlb entries multi-times. For instance invalidations patterns like
"invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
to iterate over all iotlb entries once with new implementation.

Signed-off-by: Renwei Liu 
---
 hw/arm/smmu-common.c   |  6 +++---
 hw/arm/smmu-internal.h |  2 +-
 hw/arm/smmuv3.c| 22 --
 3 files changed, 8 insertions(+), 22 deletions(-)
 mode change 100644 => 100755 hw/arm/smmu-common.c
 mode change 100644 => 100755 hw/arm/smmu-internal.h
 mode change 100644 => 100755 hw/arm/smmuv3.c

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
old mode 100644
new mode 100755
index 0459850a93..ccb085f83c
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -142,8 +142,8 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, 
gpointer value,
 if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
 return false;
 }
-return ((info->iova & ~entry->addr_mask) == entry->iova) ||
-   ((entry->iova & ~info->mask) == info->iova);
+return (entry->iova >= info->iova) &&
+   ((entry->iova + entry->addr_mask) < (info->iova + info->range));
 }
 
 inline void
@@ -167,7 +167,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
 
 SMMUIOTLBPageInvInfo info = {
 .asid = asid, .iova = iova,
-.mask = (num_pages * 1 << granule) - 1};
+.range = num_pages * 1 << granule};
 
 g_hash_table_foreach_remove(s->iotlb,
 smmu_hash_remove_by_asid_iova,
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
old mode 100644
new mode 100755
index 2d75b31953..f0e3a777af
--- a/hw/arm/smmu-internal.h
+++ b/hw/arm/smmu-internal.h
@@ -101,7 +101,7 @@ uint64_t iova_level_offset(uint64_t iova, int inputsize,
 typedef struct SMMUIOTLBPageInvInfo {
 int asid;
 uint64_t iova;
-uint64_t mask;
+uint64_t range;
 } SMMUIOTLBPageInvInfo;
 
 typedef struct SMMUSIDRange {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
old mode 100644
new mode 100755
index 01b60bee49..0b009107d1
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -857,7 +857,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int 
asid, dma_addr_t iova,
 
 static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 {
-dma_addr_t end, addr = CMD_ADDR(cmd);
+dma_addr_t addr = CMD_ADDR(cmd);
 uint8_t type = CMD_TYPE(cmd);
 uint16_t vmid = CMD_VMID(cmd);
 uint8_t scale = CMD_SCALE(cmd);
@@ -866,7 +866,6 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 bool leaf = CMD_LEAF(cmd);
 uint8_t tg = CMD_TG(cmd);
 uint64_t num_pages;
-uint8_t granule;
 int asid = -1;
 
 if (type == SMMU_CMD_TLBI_NH_VA) {
@@ -880,23 +879,10 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 return;
 }
 
-/* RIL in use */
-
 num_pages = (num + 1) * BIT_ULL(scale);
-granule = tg * 2 + 10;
-
-/* Split invalidations into ^2 range invalidations */
-end = addr + (num_pages << granule) - 1;
-
-while (addr != end + 1) {
-uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
-
-num_pages = (mask + 1) >> granule;
-trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, 
leaf);
-smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
-smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
-addr += mask + 1;
-}
+trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
+smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
+smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
 }
 
 static gboolean
-- 
2.32.0




Re: [PATCH v4 15/21] target/riscv: Reorg csr instructions

2021-08-22 Thread Bin Meng
On Sat, Aug 21, 2021 at 1:43 AM Richard Henderson
 wrote:
>
> Introduce csrr and csrw helpers, for read-only and write-only insns.
>
> Note that we do not properly implement this in riscv_csrrw, in that
> we cannot distinguish true read-only (rs1 == 0) from any other zero
> write_mask another source register -- this should still raise an
> exception for read-only registers.
>
> Only issue gen_io_start for CF_USE_ICOUNT.
> Use ctx->zero for csrrc.
> Use get_gpr and dest_gpr.
>
> Reviewed-by: Bin Meng 
> Signed-off-by: Richard Henderson 
> ---
>  target/riscv/helper.h   |   6 +-
>  target/riscv/op_helper.c|  18 +--
>  target/riscv/insn_trans/trans_rvi.c.inc | 172 +---
>  3 files changed, 131 insertions(+), 65 deletions(-)
>

When testing Linux kernel boot, there was a segment fault in the
helper_csrw() path where ret_value pointer is now NULL, and some CSR
write op does not test ret_value.

Regards,
Bin



Re: [PATCH 2/3] hw/char: cadence_uart: Disable transmit when input clock is disabled

2021-08-22 Thread Bin Meng
On Mon, Aug 23, 2021 at 12:43 PM Alistair Francis  wrote:
>
> On Mon, Aug 23, 2021 at 12:11 PM Bin Meng  wrote:
> >
> > At present when input clock is disabled, any character transmitted
> > to tx fifo can still show on the serial line, which is wrong.
> >
> > Fixes: b636db306e06 ("hw/char/cadence_uart: add clock support")
> > Signed-off-by: Bin Meng 
> > ---
> >
> >  hw/char/cadence_uart.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
> > index b4b5e8a3ee..154be34992 100644
> > --- a/hw/char/cadence_uart.c
> > +++ b/hw/char/cadence_uart.c
> > @@ -327,6 +327,11 @@ static gboolean cadence_uart_xmit(void *do_not_use, 
> > GIOCondition cond,
> >  static void uart_write_tx_fifo(CadenceUARTState *s, const uint8_t *buf,
> > int size)
> >  {
> > +/* ignore characters when unclocked or in reset */
> > +if (!clock_is_enabled(s->refclk) || device_is_in_reset(DEVICE(s))) {
> > +return;
> > +}
>
> Should we log a guest error here?
>

Not sure. Based on my past experience of many hardware, if the input
clock is disabled, accessing the whole register block might cause a
bus fault. But I believe such bus fault is not modeled in QEMU.

This change just mirrors the same check on the Rx side.

Regards,
Bin



Re: [PATCH 2/3] hw/char: cadence_uart: Disable transmit when input clock is disabled

2021-08-22 Thread Alistair Francis
On Mon, Aug 23, 2021 at 12:11 PM Bin Meng  wrote:
>
> At present when input clock is disabled, any character transmitted
> to tx fifo can still show on the serial line, which is wrong.
>
> Fixes: b636db306e06 ("hw/char/cadence_uart: add clock support")
> Signed-off-by: Bin Meng 
> ---
>
>  hw/char/cadence_uart.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
> index b4b5e8a3ee..154be34992 100644
> --- a/hw/char/cadence_uart.c
> +++ b/hw/char/cadence_uart.c
> @@ -327,6 +327,11 @@ static gboolean cadence_uart_xmit(void *do_not_use, 
> GIOCondition cond,
>  static void uart_write_tx_fifo(CadenceUARTState *s, const uint8_t *buf,
> int size)
>  {
> +/* ignore characters when unclocked or in reset */
> +if (!clock_is_enabled(s->refclk) || device_is_in_reset(DEVICE(s))) {
> +return;
> +}

Should we log a guest error here?

Alistair

> +
>  if ((s->r[R_CR] & UART_CR_TX_DIS) || !(s->r[R_CR] & UART_CR_TX_EN)) {
>  return;
>  }
> --
> 2.25.1
>
>



Re: [PATCH 1/3] hw/misc: zynq_slcr: Correctly compute output clocks in the reset exit phase

2021-08-22 Thread Alistair Francis
On Mon, Aug 23, 2021 at 12:09 PM Bin Meng  wrote:
>
> As of today, when booting upstream U-Boot for Xilinx Zynq, the UART
> does not receive anything. Debugging shows that the UART input clock
> frequency is zero which prevents the UART from receiving anything as
> per the logic in uart_receive().
>
> From zynq_slcr_reset_exit() comment, it intends to compute output
> clocks according to ps_clk and registers. zynq_slcr_compute_clocks()
> is called to accomplish the task, inside which device_is_in_reset()
> is called to actually make the attempt in vain.
>
> Rework reset_hold() and reset_exit() so that in the reset exit phase,
> the logic can really compute output clocks in reset_exit().
>
> With this change, upstream U-Boot boots properly again with:
>
> $ qemu-system-arm -M xilinx-zynq-a9 -m 1G -display none -serial null -serial 
> stdio \
> -device loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
>
> Fixes: 38867cb7ec90 ("hw/misc/zynq_slcr: add clock generation for uarts")
> Signed-off-by: Bin Meng 

Acked-by: Alistair Francis 

Alistair

> ---
>
>  hw/misc/zynq_slcr.c | 31 ++-
>  1 file changed, 18 insertions(+), 13 deletions(-)
>
> diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
> index 5086e6b7ed..8b70285961 100644
> --- a/hw/misc/zynq_slcr.c
> +++ b/hw/misc/zynq_slcr.c
> @@ -269,6 +269,21 @@ static uint64_t zynq_slcr_compute_clock(const uint64_t 
> periods[],
>  zynq_slcr_compute_clock((plls), (state)->regs[reg], \
>  reg ## _ ## enable_field ## _SHIFT)
>
> +static void zynq_slcr_compute_clocks_internal(ZynqSLCRState *s, uint64_t 
> ps_clk)
> +{
> +uint64_t io_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_IO_PLL_CTRL]);
> +uint64_t arm_pll = zynq_slcr_compute_pll(ps_clk, 
> s->regs[R_ARM_PLL_CTRL]);
> +uint64_t ddr_pll = zynq_slcr_compute_pll(ps_clk, 
> s->regs[R_DDR_PLL_CTRL]);
> +
> +uint64_t uart_mux[4] = {io_pll, io_pll, arm_pll, ddr_pll};
> +
> +/* compute uartX reference clocks */
> +clock_set(s->uart0_ref_clk,
> +  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT0));
> +clock_set(s->uart1_ref_clk,
> +  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT1));
> +}
> +
>  /**
>   * Compute and set the ouputs clocks periods.
>   * But do not propagate them further. Connected clocks
> @@ -283,17 +298,7 @@ static void zynq_slcr_compute_clocks(ZynqSLCRState *s)
>  ps_clk = 0;
>  }
>
> -uint64_t io_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_IO_PLL_CTRL]);
> -uint64_t arm_pll = zynq_slcr_compute_pll(ps_clk, 
> s->regs[R_ARM_PLL_CTRL]);
> -uint64_t ddr_pll = zynq_slcr_compute_pll(ps_clk, 
> s->regs[R_DDR_PLL_CTRL]);
> -
> -uint64_t uart_mux[4] = {io_pll, io_pll, arm_pll, ddr_pll};
> -
> -/* compute uartX reference clocks */
> -clock_set(s->uart0_ref_clk,
> -  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT0));
> -clock_set(s->uart1_ref_clk,
> -  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT1));
> +zynq_slcr_compute_clocks_internal(s, ps_clk);
>  }
>
>  /**
> @@ -416,7 +421,7 @@ static void zynq_slcr_reset_hold(Object *obj)
>  ZynqSLCRState *s = ZYNQ_SLCR(obj);
>
>  /* will disable all output clocks */
> -zynq_slcr_compute_clocks(s);
> +zynq_slcr_compute_clocks_internal(s, 0);
>  zynq_slcr_propagate_clocks(s);
>  }
>
> @@ -425,7 +430,7 @@ static void zynq_slcr_reset_exit(Object *obj)
>  ZynqSLCRState *s = ZYNQ_SLCR(obj);
>
>  /* will compute output clocks according to ps_clk and registers */
> -zynq_slcr_compute_clocks(s);
> +zynq_slcr_compute_clocks_internal(s, clock_get(s->ps_clk));
>  zynq_slcr_propagate_clocks(s);
>  }
>
> --
> 2.25.1
>
>



Re: [PATCH v4 03/21] target/riscv: Clean up division helpers

2021-08-22 Thread Bin Meng
On Sat, Aug 21, 2021 at 1:46 AM Richard Henderson
 wrote:
>
> Utilize the condition in the movcond more; this allows some of
> the setcond that were feeding into movcond to be removed.
> Do not write into source1 and source2.  Re-name "condN" to "tempN"
> and use the temporaries for more than holding conditions.
>
> Signed-off-by: Richard Henderson 
> ---
>  target/riscv/translate.c | 160 ---
>  1 file changed, 84 insertions(+), 76 deletions(-)
>

Reviewed-by: Bin Meng 
Tested-by: Bin Meng 



Re: [PATCH v2 1/2] migration: allow multifd for socket protocol only

2021-08-22 Thread lizhij...@fujitsu.com
kindly ping


On 31/07/2021 22:05, Li Zhijian wrote:
> multifd with unsupported protocol will cause a segment fault.
> (gdb) bt
>   #0  0x563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at 
> ../util/qemu-sockets.c:1190
>   #1  0x563b4a797a03 in qio_channel_socket_connect_sync 
> (ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at 
> ../io/channel-socket.c:145
>   #2  0x563b4a797abf in qio_channel_socket_connect_worker 
> (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168
>   #3  0x563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at 
> ../io/task.c:124
>   #4  0x563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at 
> ../util/qemu-thread-posix.c:541
>   #5  0x7f7fe9b5b3f9 in ?? ()
>   #6  0x in ?? ()
>
> It's enough to check migrate_multifd_is_allowed() in multifd cleanup() and
> multifd setup() though there are so many other places using 
> migrate_use_multifd().
>
> Signed-off-by: Li Zhijian 
> ---
>   migration/migration.c |  4 
>   migration/multifd.c   | 24 ++--
>   migration/multifd.h   |  2 ++
>   3 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 2d306582ebf..212314541f1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -456,10 +456,12 @@ static void qemu_start_incoming_migration(const char 
> *uri, Error **errp)
>   {
>   const char *p = NULL;
>   
> +migrate_protocol_allow_multifd(false); /* reset it anyway */
>   qapi_event_send_migration(MIGRATION_STATUS_SETUP);
>   if (strstart(uri, "tcp:", ) ||
>   strstart(uri, "unix:", NULL) ||
>   strstart(uri, "vsock:", NULL)) {
> +migrate_protocol_allow_multifd(true);
>   socket_start_incoming_migration(p ? p : uri, errp);
>   #ifdef CONFIG_RDMA
>   } else if (strstart(uri, "rdma:", )) {
> @@ -2289,9 +2291,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
> blk,
>   }
>   }
>   
> +migrate_protocol_allow_multifd(false);
>   if (strstart(uri, "tcp:", ) ||
>   strstart(uri, "unix:", NULL) ||
>   strstart(uri, "vsock:", NULL)) {
> +migrate_protocol_allow_multifd(true);
>   socket_start_outgoing_migration(s, p ? p : uri, _err);
>   #ifdef CONFIG_RDMA
>   } else if (strstart(uri, "rdma:", )) {
> diff --git a/migration/multifd.c b/migration/multifd.c
> index ab41590e714..4a4d16d3888 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -531,7 +531,7 @@ void multifd_save_cleanup(void)
>   {
>   int i;
>   
> -if (!migrate_use_multifd()) {
> +if (!migrate_use_multifd() || !migrate_multifd_is_allowed()) {
>   return;
>   }
>   multifd_send_terminate_threads(NULL);
> @@ -864,6 +864,17 @@ cleanup:
>   multifd_new_send_channel_cleanup(p, sioc, local_err);
>   }
>   
> +static bool migrate_allow_multifd;
> +void migrate_protocol_allow_multifd(bool allow)
> +{
> +migrate_allow_multifd = allow;
> +}
> +
> +bool migrate_multifd_is_allowed(void)
> +{
> +return migrate_allow_multifd;
> +}
> +
>   int multifd_save_setup(Error **errp)
>   {
>   int thread_count;
> @@ -874,6 +885,11 @@ int multifd_save_setup(Error **errp)
>   if (!migrate_use_multifd()) {
>   return 0;
>   }
> +if (!migrate_multifd_is_allowed()) {
> +error_setg(errp, "multifd is not supported by current protocol");
> +return -1;
> +}
> +
>   s = migrate_get_current();
>   thread_count = migrate_multifd_channels();
>   multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> @@ -967,7 +983,7 @@ int multifd_load_cleanup(Error **errp)
>   {
>   int i;
>   
> -if (!migrate_use_multifd()) {
> +if (!migrate_use_multifd() || !migrate_multifd_is_allowed()) {
>   return 0;
>   }
>   multifd_recv_terminate_threads(NULL);
> @@ -1123,6 +1139,10 @@ int multifd_load_setup(Error **errp)
>   if (!migrate_use_multifd()) {
>   return 0;
>   }
> +if (!migrate_multifd_is_allowed()) {
> +error_setg(errp, "multifd is not supported by current protocol");
> +return -1;
> +}
>   thread_count = migrate_multifd_channels();
>   multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
>   multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 8d6751f5ed8..f62a1becd0b 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -13,6 +13,8 @@
>   #ifndef QEMU_MIGRATION_MULTIFD_H
>   #define QEMU_MIGRATION_MULTIFD_H
>   
> +bool migrate_multifd_is_allowed(void);
> +void migrate_protocol_allow_multifd(bool allow);
>   int multifd_save_setup(Error **errp);
>   void multifd_save_cleanup(void);
>   int multifd_load_setup(Error **errp);


[PATCH v2 1/2] migration/rdma: Try to register On-Demand Paging memory region

2021-08-22 Thread Li Zhijian
Previously, for the fsdax mem-backend-file, it will register failed with
Operation not supported. In this case, we can try to register it with
On-Demand Paging[1] like what rpma_mr_reg() does on rpma[2].

[1]: 
https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
[2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3

CC: Marcel Apfelbaum 
Signed-off-by: Li Zhijian 

---
V2: add ODP sanity check and remove goto
---
 migration/rdma.c   | 73 ++
 migration/trace-events |  1 +
 2 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 5c2d113aa94..eb80431aae2 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1117,19 +1117,47 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
 return 0;
 }
 
+/* Check whether On-Demand Paging is supported by RDAM device */
+static bool rdma_support_odp(struct ibv_context *dev)
+{
+struct ibv_device_attr_ex attr = {0};
+int ret = ibv_query_device_ex(dev, NULL, );
+if (ret) {
+return false;
+}
+
+if (attr.odp_caps.general_caps & IBV_ODP_SUPPORT) {
+return true;
+}
+
+return false;
+}
+
 static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
 {
 int i;
 RDMALocalBlocks *local = >local_ram_blocks;
 
 for (i = 0; i < local->nb_blocks; i++) {
+int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE;
+
 local->block[i].mr =
 ibv_reg_mr(rdma->pd,
 local->block[i].local_host_addr,
-local->block[i].length,
-IBV_ACCESS_LOCAL_WRITE |
-IBV_ACCESS_REMOTE_WRITE
+local->block[i].length, access
 );
+
+if (!local->block[i].mr &&
+errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
+access |= IBV_ACCESS_ON_DEMAND;
+/* register ODP mr */
+local->block[i].mr =
+ibv_reg_mr(rdma->pd,
+   local->block[i].local_host_addr,
+   local->block[i].length, access);
+trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
+}
+
 if (!local->block[i].mr) {
 perror("Failed to register local dest ram block!");
 break;
@@ -1215,28 +1243,33 @@ static int qemu_rdma_register_and_get_keys(RDMAContext 
*rdma,
  */
 if (!block->pmr[chunk]) {
 uint64_t len = chunk_end - chunk_start;
+int access = rkey ? IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE :
+ 0;
 
 trace_qemu_rdma_register_and_get_keys(len, chunk_start);
 
-block->pmr[chunk] = ibv_reg_mr(rdma->pd,
-chunk_start, len,
-(rkey ? (IBV_ACCESS_LOCAL_WRITE |
-IBV_ACCESS_REMOTE_WRITE) : 0));
-
-if (!block->pmr[chunk]) {
-perror("Failed to register chunk!");
-fprintf(stderr, "Chunk details: block: %d chunk index %d"
-" start %" PRIuPTR " end %" PRIuPTR
-" host %" PRIuPTR
-" local %" PRIuPTR " registrations: %d\n",
-block->index, chunk, (uintptr_t)chunk_start,
-(uintptr_t)chunk_end, host_addr,
-(uintptr_t)block->local_host_addr,
-rdma->total_registrations);
-return -1;
+block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
+if (!block->pmr[chunk] &&
+errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
+access |= IBV_ACCESS_ON_DEMAND;
+/* register ODP mr */
+block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
+trace_qemu_rdma_register_odp_mr(block->block_name);
 }
-rdma->total_registrations++;
 }
+if (!block->pmr[chunk]) {
+perror("Failed to register chunk!");
+fprintf(stderr, "Chunk details: block: %d chunk index %d"
+" start %" PRIuPTR " end %" PRIuPTR
+" host %" PRIuPTR
+" local %" PRIuPTR " registrations: %d\n",
+block->index, chunk, (uintptr_t)chunk_start,
+(uintptr_t)chunk_end, host_addr,
+(uintptr_t)block->local_host_addr,
+rdma->total_registrations);
+return -1;
+}
+rdma->total_registrations++;
 
 if (lkey) {
 *lkey = block->pmr[chunk]->lkey;
diff --git a/migration/trace-events b/migration/trace-events
index a1c0f034ab8..5f6aa580def 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -212,6 +212,7 @@ qemu_rdma_poll_write(const char *compstr, int64_t comp, int 
left, uint64_t block
 qemu_rdma_poll_other(const char 

[PATCH v2 2/2] migration/rdma: advise prefetch write for ODP region

2021-08-22 Thread Li Zhijian
The responder mr registering with ODP will sent RNR NAK back to
the requester in the face of the page fault.
-
ibv_poll_cq wc.status=13 RNR retry counter exceeded!
ibv_poll_cq wrid=WRITE RDMA!
-
ibv_advise_mr(3) helps to make pages present before the actual IO is
conducted so that the responder does page fault as little as possible.

Signed-off-by: Li Zhijian 
Reviewed-by: Marcel Apfelbaum 

---
V2: use IBV_ADVISE_MR_FLAG_FLUSH instead of IB_UVERBS_ADVISE_MR_FLAG_FLUSH
and add Reviewed-by tag. # Marcel
---
 migration/rdma.c   | 40 
 migration/trace-events |  1 +
 2 files changed, 41 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index eb80431aae2..6c2cc3f617c 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1133,6 +1133,30 @@ static bool rdma_support_odp(struct ibv_context *dev)
 return false;
 }
 
+/*
+ * ibv_advise_mr to avoid RNR NAK error as far as possible.
+ * The responder mr registering with ODP will sent RNR NAK back to
+ * the requester in the face of the page fault.
+ */
+static void qemu_rdma_advise_prefetch_mr(struct ibv_pd *pd, uint64_t addr,
+ uint32_t len,  uint32_t lkey,
+ const char *name, bool wr)
+{
+int ret;
+int advice = wr ? IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE :
+ IBV_ADVISE_MR_ADVICE_PREFETCH;
+struct ibv_sge sg_list = {.lkey = lkey, .addr = addr, .length = len};
+
+ret = ibv_advise_mr(pd, advice,
+IBV_ADVISE_MR_FLAG_FLUSH, _list, 1);
+/* ignore the error */
+if (ret) {
+trace_qemu_rdma_advise_mr(name, len, addr, strerror(errno));
+} else {
+trace_qemu_rdma_advise_mr(name, len, addr, "successed");
+}
+}
+
 static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
 {
 int i;
@@ -1156,6 +1180,15 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext 
*rdma)
local->block[i].local_host_addr,
local->block[i].length, access);
 trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
+
+if (local->block[i].mr) {
+qemu_rdma_advise_prefetch_mr(rdma->pd,
+(uintptr_t)local->block[i].local_host_addr,
+local->block[i].length,
+local->block[i].mr->lkey,
+local->block[i].block_name,
+true);
+}
 }
 
 if (!local->block[i].mr) {
@@ -1255,6 +1288,13 @@ static int qemu_rdma_register_and_get_keys(RDMAContext 
*rdma,
 /* register ODP mr */
 block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
 trace_qemu_rdma_register_odp_mr(block->block_name);
+
+if (block->pmr[chunk]) {
+qemu_rdma_advise_prefetch_mr(rdma->pd, (uintptr_t)chunk_start,
+len, block->pmr[chunk]->lkey,
+block->block_name, rkey);
+
+}
 }
 }
 if (!block->pmr[chunk]) {
diff --git a/migration/trace-events b/migration/trace-events
index 5f6aa580def..a8ae163707c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -213,6 +213,7 @@ qemu_rdma_poll_other(const char *compstr, int64_t comp, int 
left) "other complet
 qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
 qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" 
PRIu64 " bytes @ %p"
 qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand Paging 
memory region: %s"
+qemu_rdma_advise_mr(const char *name, uint32_t len, uint64_t addr, const char 
*res) "Try to advise block %s prefetch at %" PRIu32 "@0x%" PRIx64 ": %s"
 qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t 
offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
 qemu_rdma_registration_handle_finished(void) ""
 qemu_rdma_registration_handle_ram_blocks(void) ""
-- 
2.31.1






[PATCH v2 0/2] enable fsdax rdma migration

2021-08-22 Thread Li Zhijian
Previous qemu are facing 2 problems when migrating a fsdax memory backend with
RDMA protocol.
(1) ibv_reg_mr failed with Operation not supported
(2) requester(source) side could receive RNR NAK.

For the (1), we can try to register memory region with ODP feature which
has already been implemented in some modern HCA hardware/drivers.
For the (2), IB provides advise API to prefetch pages in specific memory
region. It can help driver reduce the page fault on responder(destination)
side during RDMA_WRITE.

CC: marcel.apfelb...@gmail.com

Li Zhijian (2):
  migration/rdma: Try to register On-Demand Paging memory region
  migration/rdma: advise prefetch write for ODP region

 migration/rdma.c   | 117 +
 migration/trace-events |   2 +
 2 files changed, 98 insertions(+), 21 deletions(-)

-- 
2.31.1






Re: [PATCH v4 02/21] tests/tcg/riscv64: Add test for division

2021-08-22 Thread Bin Meng
On Sat, Aug 21, 2021 at 1:43 AM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 
> ---
>  tests/tcg/riscv64/test-div.c  | 58 +++
>  tests/tcg/riscv64/Makefile.target |  5 +++
>  2 files changed, 63 insertions(+)
>  create mode 100644 tests/tcg/riscv64/test-div.c
>  create mode 100644 tests/tcg/riscv64/Makefile.target
>

Reviewed-by: Bin Meng 
Tested-by: Bin Meng 



[PATCH v2 1/3] docs/about: Remove the duplicated doc

2021-08-22 Thread Yanan Wang
There are two places describing the same thing about deprecation
of invalid topologies of -smp CLI, so remove the duplicated one.

Signed-off-by: Yanan Wang 
Reviewed-by: Cornelia Huck 
---
 docs/about/removed-features.rst | 13 -
 1 file changed, 13 deletions(-)

diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index cbfa1a8e31..6a9c5bb484 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -647,19 +647,6 @@ as ignored. Currently, users are responsible for making 
sure the backing storage
 specified with ``-mem-path`` can actually provide the guest RAM configured with
 ``-m`` and QEMU fails to start up if RAM allocation is unsuccessful.
 
-``-smp`` (invalid topologies) (removed 5.2)
-'''
-
-CPU topology properties should describe whole machine topology including
-possible CPUs.
-
-However, historically it was possible to start QEMU with an incorrect topology
-where *n* <= *sockets* * *cores* * *threads* < *maxcpus*,
-which could lead to an incorrect topology enumeration by the guest.
-Support for invalid topologies is removed, the user must ensure
-topologies described with -smp include all possible cpus, i.e.
-*sockets* * *cores* * *threads* = *maxcpus*.
-
 ``-machine enforce-config-section=on|off`` (removed 5.2)
 
 
-- 
2.19.1




[PATCH v2 2/3] docs/about: Unify the subject format

2021-08-22 Thread Yanan Wang
There is a mixture of "since/removed in X.Y" vs "since/removed in X.Y.Z"
in the subjects in deprecated.rst/removed-features.rst. It will be better
to use an unified format. It seems unlikely that we will ever deprecate
something in a stable release, and even more unlikely that we'll remove
something in one, so the short versions look like the thing we want to
standardize on.

So here we unify the subject format in deprecated.rst to "since X.Y", and
unify the subject format in removed-features.rst to "removed in X.Y".

Signed-off-by: Yanan Wang 
Reviewed-by: Cornelia Huck 
---
 docs/about/deprecated.rst   | 56 -
 docs/about/removed-features.rst | 28 -
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 6d438f1c8d..8d4fd384a5 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -107,8 +107,8 @@ the process listing. This is replaced by the new 
``password-secret``
 option which lets the password be securely provided on the command
 line using a ``secret`` object instance.
 
-``opened`` property of ``rng-*`` objects (since 6.0.0)
-''
+``opened`` property of ``rng-*`` objects (since 6.0)
+
 
 The only effect of specifying ``opened=on`` in the command line or QMP
 ``object-add`` is that the device is opened immediately, possibly before all
@@ -116,8 +116,8 @@ other options have been processed.  This will either have 
no effect (if
 ``opened`` was the last option) or cause errors.  The property is therefore
 useless and should not be specified.
 
-``loaded`` property of ``secret`` and ``secret_keyring`` objects (since 6.0.0)
-''
+``loaded`` property of ``secret`` and ``secret_keyring`` objects (since 6.0)
+
 
 The only effect of specifying ``loaded=on`` in the command line or QMP
 ``object-add`` is that the secret is loaded immediately, possibly before all
@@ -142,33 +142,33 @@ should be used instead.
 QEMU Machine Protocol (QMP) commands
 
 
-``blockdev-open-tray``, ``blockdev-close-tray`` argument ``device`` (since 
2.8.0)
-'
+``blockdev-open-tray``, ``blockdev-close-tray`` argument ``device`` (since 2.8)
+'''
 
 Use argument ``id`` instead.
 
-``eject`` argument ``device`` (since 2.8.0)
-'''
+``eject`` argument ``device`` (since 2.8)
+'
 
 Use argument ``id`` instead.
 
-``blockdev-change-medium`` argument ``device`` (since 2.8.0)
-
+``blockdev-change-medium`` argument ``device`` (since 2.8)
+''
 
 Use argument ``id`` instead.
 
-``block_set_io_throttle`` argument ``device`` (since 2.8.0)
-'''
+``block_set_io_throttle`` argument ``device`` (since 2.8)
+'
 
 Use argument ``id`` instead.
 
-``blockdev-add`` empty string argument ``backing`` (since 2.10.0)
-'
+``blockdev-add`` empty string argument ``backing`` (since 2.10)
+'''
 
 Use argument value ``null`` instead.
 
-``block-commit`` arguments ``base`` and ``top`` (since 3.1.0)
-'
+``block-commit`` arguments ``base`` and ``top`` (since 3.1)
+'''
 
 Use arguments ``base-node`` and ``top-node`` instead.
 
@@ -191,8 +191,8 @@ from Linux upstream kernel, declare it deprecated.
 System emulator CPUS
 
 
-``Icelake-Client`` CPU Model (since 5.2.0)
-''
+``Icelake-Client`` CPU Model (since 5.2)
+
 
 ``Icelake-Client`` CPU Models are deprecated. Use ``Icelake-Server`` CPU
 Models instead.
@@ -245,8 +245,8 @@ Device options
 Emulated device options
 '''
 
-``-device virtio-blk,scsi=on|off`` (since 5.0.0)
-
+``-device virtio-blk,scsi=on|off`` (since 5.0)
+^^
 
 The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature.  VIRTIO 1.0
 and later do not support it because the virtio-scsi device was introduced for
@@ -258,14 +258,14 @@ alias.
 Block device options
 
 
-``"backing": ""`` (since 2.12.0)

[PATCH v2 3/3] docs/about: Add the missing release record in the subject

2021-08-22 Thread Yanan Wang
Commit 29e0447551
(docs/about/removed-features: Document removed CLI options from QEMU v3.1)
has recorded some CLI options as replaced/removed from QEMU v3.1, but one
of the subjects has missed the release record. Let's fix it.

Reported-by: Cornelia Huck 
Signed-off-by: Yanan Wang 
---
 docs/about/removed-features.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index 1c926a8bc1..8feeead449 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -140,8 +140,8 @@ Use ``-rtc driftfix=slew`` instead.
 
 Replaced by ``-rtc base=date``.
 
-``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...``
-'
+``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...`` 
(removed in 3.1)
+''
 
 The "tls-creds" option should be used instead to point to a "tls-creds-x509"
 object created using "-object".
-- 
2.19.1




[PATCH v2 0/3] docs/about: some documentation clean-up/fix

2021-08-22 Thread Yanan Wang
This series makes some clean-up and fix for docs/about, including removing
the duplicated section, unifying the subject format, and adding the missed
release record in the subject.

v1->v2:
- update the commit message, combined with Cornelia's comment
- add the missing release record in the subject (patch #3)
- v1: 
https://lore.kernel.org/qemu-devel/20210820015628.173532-1-wangyana...@huawei.com/

Yanan Wang (3):
  docs/about: Remove the duplicated doc
  docs/about: Unify the subject format
  docs/about: Add the missing release record in the subject

 docs/about/deprecated.rst   | 56 -
 docs/about/removed-features.rst | 45 ++
 2 files changed, 44 insertions(+), 57 deletions(-)

-- 
2.19.1




[PATCH 2/3] hw/char: cadence_uart: Disable transmit when input clock is disabled

2021-08-22 Thread Bin Meng
At present when input clock is disabled, any character transmitted
to tx fifo can still show on the serial line, which is wrong.

Fixes: b636db306e06 ("hw/char/cadence_uart: add clock support")
Signed-off-by: Bin Meng 
---

 hw/char/cadence_uart.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
index b4b5e8a3ee..154be34992 100644
--- a/hw/char/cadence_uart.c
+++ b/hw/char/cadence_uart.c
@@ -327,6 +327,11 @@ static gboolean cadence_uart_xmit(void *do_not_use, 
GIOCondition cond,
 static void uart_write_tx_fifo(CadenceUARTState *s, const uint8_t *buf,
int size)
 {
+/* ignore characters when unclocked or in reset */
+if (!clock_is_enabled(s->refclk) || device_is_in_reset(DEVICE(s))) {
+return;
+}
+
 if ((s->r[R_CR] & UART_CR_TX_DIS) || !(s->r[R_CR] & UART_CR_TX_EN)) {
 return;
 }
-- 
2.25.1




[PATCH 3/3] hw/char: cadence_uart: Move clock/reset check to uart_can_receive()

2021-08-22 Thread Bin Meng
Currently the clock/reset check is done in uart_receive(), but we
can move the check to uart_can_receive() which is earlier.

Signed-off-by: Bin Meng 

---

 hw/char/cadence_uart.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
index 154be34992..7326445385 100644
--- a/hw/char/cadence_uart.c
+++ b/hw/char/cadence_uart.c
@@ -235,6 +235,12 @@ static void uart_parameters_setup(CadenceUARTState *s)
 static int uart_can_receive(void *opaque)
 {
 CadenceUARTState *s = opaque;
+
+/* ignore characters when unclocked or in reset */
+if (!clock_is_enabled(s->refclk) || device_is_in_reset(DEVICE(s))) {
+return 0;
+}
+
 int ret = MAX(CADENCE_UART_RX_FIFO_SIZE, CADENCE_UART_TX_FIFO_SIZE);
 uint32_t ch_mode = s->r[R_MR] & UART_MR_CHMODE;
 
@@ -358,11 +364,6 @@ static void uart_receive(void *opaque, const uint8_t *buf, 
int size)
 CadenceUARTState *s = opaque;
 uint32_t ch_mode = s->r[R_MR] & UART_MR_CHMODE;
 
-/* ignore characters when unclocked or in reset */
-if (!clock_is_enabled(s->refclk) || device_is_in_reset(DEVICE(s))) {
-return;
-}
-
 if (ch_mode == NORMAL_MODE || ch_mode == ECHO_MODE) {
 uart_write_rx_fifo(opaque, buf, size);
 }
-- 
2.25.1




[PATCH 1/3] hw/misc: zynq_slcr: Correctly compute output clocks in the reset exit phase

2021-08-22 Thread Bin Meng
As of today, when booting upstream U-Boot for Xilinx Zynq, the UART
does not receive anything. Debugging shows that the UART input clock
frequency is zero which prevents the UART from receiving anything as
per the logic in uart_receive().

>From zynq_slcr_reset_exit() comment, it intends to compute output
clocks according to ps_clk and registers. zynq_slcr_compute_clocks()
is called to accomplish the task, inside which device_is_in_reset()
is called to actually make the attempt in vain.

Rework reset_hold() and reset_exit() so that in the reset exit phase,
the logic can really compute output clocks in reset_exit().

With this change, upstream U-Boot boots properly again with:

$ qemu-system-arm -M xilinx-zynq-a9 -m 1G -display none -serial null -serial 
stdio \
-device loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0

Fixes: 38867cb7ec90 ("hw/misc/zynq_slcr: add clock generation for uarts")
Signed-off-by: Bin Meng 
---

 hw/misc/zynq_slcr.c | 31 ++-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
index 5086e6b7ed..8b70285961 100644
--- a/hw/misc/zynq_slcr.c
+++ b/hw/misc/zynq_slcr.c
@@ -269,6 +269,21 @@ static uint64_t zynq_slcr_compute_clock(const uint64_t 
periods[],
 zynq_slcr_compute_clock((plls), (state)->regs[reg], \
 reg ## _ ## enable_field ## _SHIFT)
 
+static void zynq_slcr_compute_clocks_internal(ZynqSLCRState *s, uint64_t 
ps_clk)
+{
+uint64_t io_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_IO_PLL_CTRL]);
+uint64_t arm_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_ARM_PLL_CTRL]);
+uint64_t ddr_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_DDR_PLL_CTRL]);
+
+uint64_t uart_mux[4] = {io_pll, io_pll, arm_pll, ddr_pll};
+
+/* compute uartX reference clocks */
+clock_set(s->uart0_ref_clk,
+  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT0));
+clock_set(s->uart1_ref_clk,
+  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT1));
+}
+
 /**
  * Compute and set the ouputs clocks periods.
  * But do not propagate them further. Connected clocks
@@ -283,17 +298,7 @@ static void zynq_slcr_compute_clocks(ZynqSLCRState *s)
 ps_clk = 0;
 }
 
-uint64_t io_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_IO_PLL_CTRL]);
-uint64_t arm_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_ARM_PLL_CTRL]);
-uint64_t ddr_pll = zynq_slcr_compute_pll(ps_clk, s->regs[R_DDR_PLL_CTRL]);
-
-uint64_t uart_mux[4] = {io_pll, io_pll, arm_pll, ddr_pll};
-
-/* compute uartX reference clocks */
-clock_set(s->uart0_ref_clk,
-  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT0));
-clock_set(s->uart1_ref_clk,
-  ZYNQ_COMPUTE_CLK(s, uart_mux, R_UART_CLK_CTRL, CLKACT1));
+zynq_slcr_compute_clocks_internal(s, ps_clk);
 }
 
 /**
@@ -416,7 +421,7 @@ static void zynq_slcr_reset_hold(Object *obj)
 ZynqSLCRState *s = ZYNQ_SLCR(obj);
 
 /* will disable all output clocks */
-zynq_slcr_compute_clocks(s);
+zynq_slcr_compute_clocks_internal(s, 0);
 zynq_slcr_propagate_clocks(s);
 }
 
@@ -425,7 +430,7 @@ static void zynq_slcr_reset_exit(Object *obj)
 ZynqSLCRState *s = ZYNQ_SLCR(obj);
 
 /* will compute output clocks according to ps_clk and registers */
-zynq_slcr_compute_clocks(s);
+zynq_slcr_compute_clocks_internal(s, clock_get(s->ps_clk));
 zynq_slcr_propagate_clocks(s);
 }
 
-- 
2.25.1




[PATCH 0/3] hw/arm: xilinx_zynq: Fix upstream U-Boot boot failure

2021-08-22 Thread Bin Meng
As of today, when booting upstream U-Boot for Xilinx Zynq, the UART
does not receive anything. Debugging shows that the UART input clock
frequency is zero which prevents the UART from receiving anything as.
per the logic in uart_receive().

Note the U-Boot can still output data to the UART tx fifo, which should
not happen, as the design seems to prevent the data transmission when
clock is not enabled but somehow it only applies to the Rx side.

For anyone who is interested to give a try, here is the U-Boot defconfig:
$ make xilinx_zynq_virt_defconfig

and QEMU commands to test U-Boot:
$ qemu-system-arm -M xilinx-zynq-a9 -m 1G -display none -serial null -serial 
stdio \
-device loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0

Note U-Boot used to boot properly in QEMU 4.2.0 which is the QEMU
version used in current U-Boot's CI testing. The UART clock changes
were introduced by the following 3 commits:

38867cb7ec90 ("hw/misc/zynq_slcr: add clock generation for uarts")
b636db306e06 ("hw/char/cadence_uart: add clock support")
5b49a34c6800 ("hw/arm/xilinx_zynq: connect uart clocks to slcr")


Bin Meng (3):
  hw/misc: zynq_slcr: Correctly compute output clocks in the reset exit
phase
  hw/char: cadence_uart: Disable transmit when input clock is disabled
  hw/char: cadence_uart: Move clock/reset check to uart_can_receive()

 hw/char/cadence_uart.c | 16 +++-
 hw/misc/zynq_slcr.c| 31 ++-
 2 files changed, 29 insertions(+), 18 deletions(-)

-- 
2.25.1




Re: [PATCH 1/2] migration/rdma: Try to register On-Demand Paging memory region

2021-08-22 Thread lizhij...@fujitsu.com


On 22/08/2021 16:53, Marcel Apfelbaum wrote:
> Hi
>
> On Sat, Jul 31, 2021 at 5:00 PM Li Zhijian  wrote:
>> Previously, for the fsdax mem-backend-file, it will register failed with
>> Operation not supported. In this case, we can try to register it with
>> On-Demand Paging[1] like what rpma_mr_reg() does on rpma[2].
>>
>> [1]: 
>> https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
>> [2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3
>> Signed-off-by: Li Zhijian 
>> ---
>>   migration/rdma.c   | 27 ++-
>>   migration/trace-events |  1 +
>>   2 files changed, 19 insertions(+), 9 deletions(-)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 5c2d113aa94..8784b5f22a6 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -1123,15 +1123,21 @@ static int 
>> qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
>>   RDMALocalBlocks *local = >local_ram_blocks;
>>
>>   for (i = 0; i < local->nb_blocks; i++) {
>> +int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE;
>> +
>> +on_demand:
>>   local->block[i].mr =
>>   ibv_reg_mr(rdma->pd,
>>   local->block[i].local_host_addr,
>> -local->block[i].length,
>> -IBV_ACCESS_LOCAL_WRITE |
>> -IBV_ACCESS_REMOTE_WRITE
>> +local->block[i].length, access
>>   );
>>   if (!local->block[i].mr) {
>> -perror("Failed to register local dest ram block!");
>> +if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
>> +access |= IBV_ACCESS_ON_DEMAND;
>> +trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
>> +goto on_demand;
> Wouldn't it be better to check first if the device supports ODP ?
> Something like:
>  ret = ibv_exp_query_device(context, );
>  if (dattr.exp_device_cap_flags & IBV_EXP_DEVICE_ODP)...

Good idea !



>
> Also, I  am not (personally) too fond of the "on_demand" label usage here,
> however I will let the maintainer/others decide.
Indeed, how just repeating the ibv_reg_mr() instead of a 'go to'

Thanks
Zhijian



>
> Thanks,
> Marcel
>
>> +}
>> +perror("Failed to register local dest ram block!\n");
>>   break;
>>   }
>>   rdma->total_registrations++;
>> @@ -1215,15 +1221,18 @@ static int 
>> qemu_rdma_register_and_get_keys(RDMAContext *rdma,
>>*/
>>   if (!block->pmr[chunk]) {
>>   uint64_t len = chunk_end - chunk_start;
>> +int access = rkey ? IBV_ACCESS_LOCAL_WRITE | 
>> IBV_ACCESS_REMOTE_WRITE : 0;
>>
>>   trace_qemu_rdma_register_and_get_keys(len, chunk_start);
>>
>> -block->pmr[chunk] = ibv_reg_mr(rdma->pd,
>> -chunk_start, len,
>> -(rkey ? (IBV_ACCESS_LOCAL_WRITE |
>> -IBV_ACCESS_REMOTE_WRITE) : 0));
>> -
>> +on_demand:
>> +block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
>>   if (!block->pmr[chunk]) {
>> +if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
>> +access |= IBV_ACCESS_ON_DEMAND;
>> +trace_qemu_rdma_register_odp_mr(block->block_name);
>> +goto on_demand;
>> +}
>>   perror("Failed to register chunk!");
>>   fprintf(stderr, "Chunk details: block: %d chunk index %d"
>>   " start %" PRIuPTR " end %" PRIuPTR
>> diff --git a/migration/trace-events b/migration/trace-events
>> index a1c0f034ab8..5f6aa580def 100644
>> --- a/migration/trace-events
>> +++ b/migration/trace-events
>> @@ -212,6 +212,7 @@ qemu_rdma_poll_write(const char *compstr, int64_t comp, 
>> int left, uint64_t block
>>   qemu_rdma_poll_other(const char *compstr, int64_t comp, int left) "other 
>> completion %s (%" PRId64 ") received left %d"
>>   qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
>>   qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" 
>> PRIu64 " bytes @ %p"
>> +qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand 
>> Paging memory region: %s"
>>   qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t 
>> offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
>>   qemu_rdma_registration_handle_finished(void) ""
>>   qemu_rdma_registration_handle_ram_blocks(void) ""
>> --
>> 2.31.1
>>
>>
>>
>>
>


Re: [PATCH 2/2] docs/about: Unify the subject format

2021-08-22 Thread wangyanan (Y)



On 2021/8/20 18:18, Cornelia Huck wrote:

On Fri, Aug 20 2021, Yanan Wang  wrote:


Unify the subject format in deprecated.rst to "since X.Y".
Unify the subject format in removed-features.rst to "removed in X.Y".

It seems unlikely that we will ever deprecate something in a stable
release, and even more unlikely that we'll remove something in one, so
the short versions look like the thing we want to standardize on.


Signed-off-by: Yanan Wang 
---
  docs/about/deprecated.rst   | 56 -
  docs/about/removed-features.rst | 28 -
  2 files changed, 42 insertions(+), 42 deletions(-)

Unrelated to your patch, line 143 in removed-features.rst reads

``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...``

and is missing the release it was removed in (presumably 3.1?)

Yes, this part of section was introduced by commit 29e0447551
(docs/about/removed-features: Document removed CLI options from QEMU v3.1),
so I think a record of "removed in 3.1" is needed. I rechecked both 
deprecated.rst
and removed-features.rst, and this is the only place where we are 
missing a release.

I can fix this in a third patch.

Anyway,

Reviewed-by: Cornelia Huck 

.

Thanks,
Yanan




[PATCH v3] hw/i386/acpi-build: Get NUMA information from struct NumaState

2021-08-22 Thread Jingqi Liu
Since commits aa57020774b ("numa: move numa global variable
nb_numa_nodes into MachineState") and 7e721e7b10e ("numa: move
numa global variable numa_info into MachineState"), we can get
NUMA information completely from MachineState::numa_state.

Remove PCMachineState::numa_nodes and PCMachineState::node_mem,
since they are just copied from MachineState::numa_state.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Jingqi Liu 
---
 hw/i386/acpi-build.c | 12 +++-
 hw/i386/pc.c |  9 -
 include/hw/i386/pc.h |  4 
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 17836149fe..e3c9ad011e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1902,6 +1902,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
 X86MachineState *x86ms = X86_MACHINE(machine);
 const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
 PCMachineState *pcms = PC_MACHINE(machine);
+int nb_numa_nodes = machine->numa_state->num_nodes;
+NodeInfo *numa_info = machine->numa_state->nodes;
 ram_addr_t hotplugabble_address_space_size =
 object_property_get_int(OBJECT(pcms), PC_MACHINE_DEVMEM_REGION_SIZE,
 NULL);
@@ -1945,9 +1947,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
 next_base = 0;
 numa_start = table_data->len;
 
-for (i = 1; i < pcms->numa_nodes + 1; ++i) {
+for (i = 1; i < nb_numa_nodes + 1; ++i) {
 mem_base = next_base;
-mem_len = pcms->node_mem[i - 1];
+mem_len = numa_info[i - 1].node_mem;
 next_base = mem_base + mem_len;
 
 /* Cut out the 640K hole */
@@ -1995,7 +1997,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
 }
 
 slots = (table_data->len - numa_start) / sizeof *numamem;
-for (; slots < pcms->numa_nodes + 2; slots++) {
+for (; slots < nb_numa_nodes + 2; slots++) {
 numamem = acpi_data_push(table_data, sizeof *numamem);
 build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS);
 }
@@ -2011,7 +2013,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
 if (hotplugabble_address_space_size) {
 numamem = acpi_data_push(table_data, sizeof *numamem);
 build_srat_memory(numamem, machine->device_memory->base,
-  hotplugabble_address_space_size, pcms->numa_nodes - 
1,
+  hotplugabble_address_space_size, nb_numa_nodes - 1,
   MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
 }
 
@@ -2513,7 +2515,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 }
 }
 #endif
-if (pcms->numa_nodes) {
+if (machine->numa_state->num_nodes) {
 acpi_add_table(table_offsets, tables_blob);
 build_srat(tables_blob, tables->linker, machine);
 if (machine->numa_state->have_numa_distance) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c2b9d62a35..adbc348488 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -800,18 +800,9 @@ void pc_machine_done(Notifier *notifier, void *data)
 
 void pc_guest_info_init(PCMachineState *pcms)
 {
-int i;
-MachineState *ms = MACHINE(pcms);
 X86MachineState *x86ms = X86_MACHINE(pcms);
 
 x86ms->apic_xrupt_override = true;
-pcms->numa_nodes = ms->numa_state->num_nodes;
-pcms->node_mem = g_malloc0(pcms->numa_nodes *
-sizeof *pcms->node_mem);
-for (i = 0; i < ms->numa_state->num_nodes; i++) {
-pcms->node_mem[i] = ms->numa_state->nodes[i].node_mem;
-}
-
 pcms->machine_done.notify = pc_machine_done;
 qemu_add_machine_init_done_notifier(>machine_done);
 }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 88dffe7517..31b334e0a4 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -47,10 +47,6 @@ typedef struct PCMachineState {
 bool default_bus_bypass_iommu;
 uint64_t max_fw_size;
 
-/* NUMA information: */
-uint64_t numa_nodes;
-uint64_t *node_mem;
-
 /* ACPI Memory hotplug IO base address */
 hwaddr memhp_io_base;
 } PCMachineState;
-- 
2.21.3




Re: [PATCH 2/2] migration/rdma: advise prefetch write for ODP region

2021-08-22 Thread lizhij...@fujitsu.com
Hi Marcel


On 22/08/2021 16:39, Marcel Apfelbaum wrote:
> Hi,
>
> On Sat, Jul 31, 2021 at 5:03 PM Li Zhijian  wrote:
>> The responder mr registering with ODP will sent RNR NAK back to
>> the requester in the face of the page fault.
>> -
>> ibv_poll_cq wc.status=13 RNR retry counter exceeded!
>> ibv_poll_cq wrid=WRITE RDMA!
>> -
>> ibv_advise_mr(3) helps to make pages present before the actual IO is
>> conducted so that the responder does page fault as little as possible.
>>
>> Signed-off-by: Li Zhijian 
>> ---
>>   migration/rdma.c   | 40 
>>   migration/trace-events |  1 +
>>   2 files changed, 41 insertions(+)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 8784b5f22a6..a2ad00d665f 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -1117,6 +1117,30 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
>>   return 0;
>>   }
>>
>> +/*
>> + * ibv_advise_mr to avoid RNR NAK error as far as possible.
>> + * The responder mr registering with ODP will sent RNR NAK back to
>> + * the requester in the face of the page fault.
>> + */
>> +static void qemu_rdma_advise_prefetch_write_mr(struct ibv_pd *pd, uint64_t 
>> addr,
>> +   uint32_t len,  uint32_t lkey,
>> +   const char *name, bool wr)
>> +{
>> +int ret;
>> +int advice = wr ? IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE :
>> + IBV_ADVISE_MR_ADVICE_PREFETCH;
>> +struct ibv_sge sg_list = {.lkey = lkey, .addr = addr, .length = len};
>> +
>> +ret = ibv_advise_mr(pd, advice,
>> +IB_UVERBS_ADVISE_MR_FLAG_FLUSH, _list, 1);
>> +/* ignore the error */
> Following 
> https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/man/ibv_advise_mr.3.md
> it looks like it is a best-effort optimization,
> I don't see any down-sides to it.
> However it seems like it is recommended to use
> IBV_ADVISE_MR_FLAG_FLUSH in order to
> increase the optimization chances.
Good catch,  i will update it soon.


Thanks

>
> Anyway
>
> Reviewed-by: Marcel Apfelbaum 
>
> Thanks,
> Marcel
>
>


Re: [PATCH V2] block/rbd: implement bdrv_co_block_status

2021-08-22 Thread Ilya Dryomov
On Tue, Aug 10, 2021 at 3:41 PM Peter Lieven  wrote:
>
> the qemu rbd driver currently lacks support for bdrv_co_block_status.
> This results mainly in incorrect progress during block operations (e.g.
> qemu-img convert with an rbd image as source).
>
> This patch utilizes the rbd_diff_iterate2 call from librbd to detect
> allocated and unallocated (all zero areas).
>
> To avoid querying the ceph OSDs for the answer this is only done if
> the image has the fast-diff features which depends on the object-map

Hi Peter,

Nit: "has the fast-diff feature which depends on the object-map and
exclusive-lock features"

> and exclusive-lock. In this case it is guaranteed that the information
> is present in memory in the librbd client and thus very fast.
>
> If fast-diff is not available all areas are reported to be allocated
> which is the current behaviour if bdrv_co_block_status is not implemented.
>
> Signed-off-by: Peter Lieven 
> ---
> V1->V2:
> - add commit comment [Stefano]
> - use failed_post_open [Stefano]
> - remove redundant assert [Stefano]
> - add macro+comment for the magic -9000 value [Stefano]
> - always set *file if its non NULL [Stefano]
>
>  block/rbd.c | 125 
>  1 file changed, 125 insertions(+)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index dcf82b15b8..8692e76f40 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -88,6 +88,7 @@ typedef struct BDRVRBDState {
>  char *namespace;
>  uint64_t image_size;
>  uint64_t object_size;
> +uint64_t features;
>  } BDRVRBDState;
>
>  typedef struct RBDTask {
> @@ -983,6 +984,13 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  s->image_size = info.size;
>  s->object_size = info.obj_size;
>
> +r = rbd_get_features(s->image, >features);
> +if (r < 0) {
> +error_setg_errno(errp, -r, "error getting image features from %s",
> + s->image_name);
> +goto failed_post_open;
> +}

The object-map and fast-diff features can be enabled/disabled while the
image is open so this should probably go to qemu_rbd_co_block_status().

> +
>  /* If we are using an rbd snapshot, we must be r/o, otherwise
>   * leave as-is */
>  if (s->snap != NULL) {
> @@ -1259,6 +1267,122 @@ static ImageInfoSpecific 
> *qemu_rbd_get_specific_info(BlockDriverState *bs,
>  return spec_info;
>  }
>
> +typedef struct rbd_diff_req {
> +uint64_t offs;
> +uint64_t bytes;
> +int exists;
> +} rbd_diff_req;
> +
> +/*
> + * rbd_diff_iterate2 allows to interrupt the exection by returning a negative
> + * value in the callback routine. Choose a value that does not conflict with
> + * an existing exitcode and return it if we want to prematurely stop the
> + * execution because we detected a change in the allocation status.
> + */
> +#define QEMU_RBD_EXIT_DIFF_ITERATE2 -9000
> +
> +static int qemu_rbd_co_block_status_cb(uint64_t offs, size_t len,
> +   int exists, void *opaque)
> +{
> +struct rbd_diff_req *req = opaque;
> +
> +assert(req->offs + req->bytes <= offs);
> +
> +if (req->exists && offs > req->offs + req->bytes) {
> +/*
> + * we started in an allocated area and jumped over an unallocated 
> area,
> + * req->bytes contains the length of the allocated area before the
> + * unallocated area. stop further processing.
> + */
> +return QEMU_RBD_EXIT_DIFF_ITERATE2;
> +}
> +if (req->exists && !exists) {
> +/*
> + * we started in an allocated area and reached a hole. req->bytes
> + * contains the length of the allocated area before the hole.
> + * stop further processing.
> + */
> +return QEMU_RBD_EXIT_DIFF_ITERATE2;
> +}
> +if (!req->exists && exists && offs > req->offs) {
> +/*
> + * we started in an unallocated area and hit the first allocated
> + * block. req->bytes must be set to the length of the unallocated 
> area
> + * before the allocated area. stop further processing.
> + */
> +req->bytes = offs - req->offs;
> +return QEMU_RBD_EXIT_DIFF_ITERATE2;
> +}
> +
> +/*
> + * assert that we catched all cases above and allocation state has not

catched -> caught

> + * changed during callbacks.
> + */
> +assert(exists == req->exists || !req->bytes);
> +req->exists = exists;
> +
> +/*
> + * assert that we either return an unallocated block or have got 
> callbacks
> + * for all allocated blocks present.
> + */
> +assert(!req->exists || offs == req->offs + req->bytes);
> +req->bytes = offs + len - req->offs;
> +
> +return 0;
> +}
> +
> +static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
> + bool want_zero, int64_t 
> offset,
> + int64_t 

Re: [PATCH v3 63/66] tcg/tci: Support raising sigbus for user-only

2021-08-22 Thread Richard Henderson

On 8/22/21 5:32 AM, Peter Maydell wrote:

On Sun, 22 Aug 2021 at 08:59, Richard Henderson
 wrote:


On 8/20/21 3:14 AM, Peter Maydell wrote:

@@ -377,11 +381,11 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
   static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
   MemOpIdx oi, const void *tb_ptr)
   {
-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);
   uintptr_t ra = (uintptr_t)tb_ptr;


Don't you need this bit in tci_qemu_st() as well ?


Which bit isn't present in st as well?
There's missing hunks in your reply, but afaics they're the same.


https://patchew.org/QEMU/20210818191920.390759-1-richard.hender...@linaro.org/20210818191920.390759-64-richard.hender...@linaro.org/

I had the function name wrong, but only the tci_qemu_st() change
has this bit:

-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);


Ah yes, thanks.

r~




Re: Testing a microcontroller emulation by loading the binary on incomplete Flash emulation

2021-08-22 Thread Peter Maydell
On Sun, 22 Aug 2021 at 15:37, Gautam Bhat  wrote:
>
> Hi,
>
> I am to implement a very simple microcontroller for my understanding
> of Qemu development. This microcontroller runs its code from
> programmable flash which is bit-, byte- and word addressable. To do
> some initial tests of my nascent microcontroller implementation I
> would like to load a very simple program binary. Is there a way to
> load this binary and start execution without emulating Flash
> controller and memory?

Just create a plain old RAM memory region, and then load the
guest binary into it with the 'generic loader' (which can
take an ELF file or a raw binary).

-- PMM



Re: [For 6.1 PATCH] hw/arm: xilinx_zynq: Disconnect the UART clocks temporarily

2021-08-22 Thread Bin Meng
On Sun, Aug 22, 2021 at 2:14 AM Peter Maydell  wrote:
>
> On Sat, 21 Aug 2021 at 16:45, Bin Meng  wrote:
> >
> > As of today, when booting upstream U-Boot for Xilinx Zynq, the UART
> > does not receive anything. Initial debugging shows that the UART clock
> > frequency is 0 somehow which prevents the UART from receiving anything.
> > Note the U-Boot can still output data to the UART tx fifo, which should
> > not happen, as the design seems to prevent the data transmission when
> > clock is not enabled but somehow it only applies to the Rx side.
> >
> > For anyone who is interested to give a try, here is the U-Boot defconfig:
> > $ make xilinx_zynq_virt_defconfig
> >
> > and QEMU commands to test U-Boot:
> > $ qemu-system-arm -M xilinx-zynq-a9 -m 1G -display none -serial null 
> > -serial stdio \
> > -device loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> >
> > Note U-Boot used to boot properly in QEMU 4.2.0 which is the QEMU
> > version used in current U-Boot's CI testing. The UART clock changes
> > were introduced by the following 3 commits:
> >
> > 38867cb7ec90 ("hw/misc/zynq_slcr: add clock generation for uarts")
> > b636db306e06 ("hw/char/cadence_uart: add clock support")
> > 5b49a34c6800 ("hw/arm/xilinx_zynq: connect uart clocks to slcr")
> >
> > Looks like we don't have enough time to figure out a proper fix before
> > 6.1.0 release date, let's disconnect the UART clocks temporarily.
>
> This is too late for 6.1 regardless, I'm afraid.

That's too bad :(

I figured out a proper fix, and will send it out soon.

Regards,
Bin



vmgenid no maintainers

2021-08-22 Thread Ani Sinha
Hi :

All of you have contributed to vmgenid device at some point.
Unfortunately, this code currently has no maintainers. I had looked into
this feature as a part of another project for a previous company, hence
noticed it. Will any of you would want to be the maintainer of this
codebase in Qemu? I do not know all the parts of this feature and the
spec well enough to be confident to be a maintainer but I can be a
reviewer if you want.

Thanks
ani




Re: [PATCH 0/3] target/arm: Reduced-IPA space and highmem=off fixes

2021-08-22 Thread Marc Zyngier

On 2021-08-22 15:44, Marc Zyngier wrote:

With the availability of a fruity range of arm64 systems, it becomes
obvious that QEMU doesn't deal very well with limited IPA ranges when
used as a front-end for KVM.

This short series aims at making usable on such systems:
- the first patch makes the creation of a scratch VM IPA-limit aware
- the second one actually removes the highmem devices from the
computed IPA range when highmem=off
- the last one addresses an imprecision in the documentation for the
highmem option

This has been tested on an M1-based Mac-mini running Linux v5.14-rc6.


I realise I haven't been very clear in my description of the above.
With this series, using 'highmem=off' results in a usable VM, while
sticking to the default 'highmem=on' still generates an error.

M.
--
Jazz is not dead. It just smells funny...



[PATCH 0/3] target/arm: Reduced-IPA space and highmem=off fixes

2021-08-22 Thread Marc Zyngier
With the availability of a fruity range of arm64 systems, it becomes
obvious that QEMU doesn't deal very well with limited IPA ranges when
used as a front-end for KVM.

This short series aims at making usable on such systems:
- the first patch makes the creation of a scratch VM IPA-limit aware
- the second one actually removes the highmem devices from the
computed IPA range when highmem=off
- the last one addresses an imprecision in the documentation for the
highmem option

This has been tested on an M1-based Mac-mini running Linux v5.14-rc6.

Marc Zyngier (3):
  hw/arm/virt: KVM: Probe for KVM_CAP_ARM_VM_IPA_SIZE when creating
scratch VM
  hw/arm/virt: Honor highmem setting when computing highest_gpa
  docs/system/arm/virt: Fix documentation for the 'highmem' option

 docs/system/arm/virt.rst |  6 +++---
 hw/arm/virt.c| 10 +++---
 target/arm/kvm.c |  7 ++-
 3 files changed, 16 insertions(+), 7 deletions(-)

-- 
2.30.2




[PATCH 1/3] hw/arm/virt: KVM: Probe for KVM_CAP_ARM_VM_IPA_SIZE when creating scratch VM

2021-08-22 Thread Marc Zyngier
Although we probe for the IPA limits imposed by KVM (and the hardware)
when computing the memory map, we still use the old style '0' when
creating a scratch VM in kvm_arm_create_scratch_host_vcpu().

On systems that are severely IPA challenged (such as the Apple M1),
this results in a failure as KVM cannot use the default 40bit that
'0' represents.

Instead, probe for the extension and use the reported IPA limit
if available.

Cc: Andrew Jones 
Cc: Eric Auger 
Cc: Peter Maydell 
Signed-off-by: Marc Zyngier 
---
 target/arm/kvm.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index d8381ba224..cc3371a99b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -70,12 +70,17 @@ bool kvm_arm_create_scratch_host_vcpu(const uint32_t 
*cpus_to_try,
   struct kvm_vcpu_init *init)
 {
 int ret = 0, kvmfd = -1, vmfd = -1, cpufd = -1;
+int max_vm_pa_size;
 
 kvmfd = qemu_open_old("/dev/kvm", O_RDWR);
 if (kvmfd < 0) {
 goto err;
 }
-vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
+max_vm_pa_size = ioctl(kvmfd, KVM_CHECK_EXTENSION, 
KVM_CAP_ARM_VM_IPA_SIZE);
+if (max_vm_pa_size < 0) {
+max_vm_pa_size = 0;
+}
+vmfd = ioctl(kvmfd, KVM_CREATE_VM, max_vm_pa_size);
 if (vmfd < 0) {
 goto err;
 }
-- 
2.30.2




[PATCH 3/3] docs/system/arm/virt: Fix documentation for the 'highmem' option

2021-08-22 Thread Marc Zyngier
The documentation for the 'highmem' option indicates that it controls
the placement of both devices and RAM. The actual behaviour of QEMU
seems to be that RAM is allowed to go beyond the 4GiB limit, and
that only devices are constraint by this option.

Align the documentation with the actual behaviour.

Cc: Andrew Jones 
Cc: Eric Auger 
Cc: Peter Maydell 
Signed-off-by: Marc Zyngier 
---
 docs/system/arm/virt.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index 59acf0eeaf..e206e7565d 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -86,9 +86,9 @@ mte
   Arm Memory Tagging Extensions. The default is ``off``.
 
 highmem
-  Set ``on``/``off`` to enable/disable placing devices and RAM in physical
-  address space above 32 bits. The default is ``on`` for machine types
-  later than ``virt-2.12``.
+  Set ``on``/``off`` to enable/disable placing devices in physical address
+  space above 32 bits. RAM in excess of 3GiB will always be placed above
+  32 bits. The default is ``on`` for machine types later than ``virt-2.12``.
 
 gic-version
   Specify the version of the Generic Interrupt Controller (GIC) to provide.
-- 
2.30.2




[PATCH 2/3] hw/arm/virt: Honor highmem setting when computing highest_gpa

2021-08-22 Thread Marc Zyngier
Even when the VM is configured with highmem=off, the highest_gpa
field includes devices that are above the 4GiB limit, which is
what highmem=off is supposed to enforce. This leads to failures
in virt_kvm_type() on systems that have a crippled IPA range,
as the reported IPA space is larger than what it should be.

Instead, honor the user-specified limit to only use the devices
at the lowest end of the spectrum.

Note that this doesn't affect memory, which is still allowed to
go beyond 4GiB with highmem=on configurations.

Cc: Andrew Jones 
Cc: Eric Auger 
Cc: Peter Maydell 
Signed-off-by: Marc Zyngier 
---
 hw/arm/virt.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 81eda46b0b..bc189e30b8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1598,7 +1598,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState 
*vms, int idx)
 static void virt_set_memmap(VirtMachineState *vms)
 {
 MachineState *ms = MACHINE(vms);
-hwaddr base, device_memory_base, device_memory_size;
+hwaddr base, device_memory_base, device_memory_size, ceiling;
 int i;
 
 vms->memmap = extended_memmap;
@@ -1625,7 +1625,7 @@ static void virt_set_memmap(VirtMachineState *vms)
 device_memory_size = ms->maxram_size - ms->ram_size + ms->ram_slots * GiB;
 
 /* Base address of the high IO region */
-base = device_memory_base + ROUND_UP(device_memory_size, GiB);
+ceiling = base = device_memory_base + ROUND_UP(device_memory_size, GiB);
 if (base < device_memory_base) {
 error_report("maxmem/slots too huge");
 exit(EXIT_FAILURE);
@@ -1642,7 +1642,11 @@ static void virt_set_memmap(VirtMachineState *vms)
 vms->memmap[i].size = size;
 base += size;
 }
-vms->highest_gpa = base - 1;
+if (vms->highmem) {
+   /* If we have highmem, move the IPA limit to the top */
+   ceiling = base;
+}
+vms->highest_gpa = ceiling - 1;
 if (device_memory_size > 0) {
 ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
 ms->device_memory->base = device_memory_base;
-- 
2.30.2




Testing a microcontroller emulation by loading the binary on incomplete Flash emulation

2021-08-22 Thread Gautam Bhat
Hi,

I am to implement a very simple microcontroller for my understanding
of Qemu development. This microcontroller runs its code from
programmable flash which is bit-, byte- and word addressable. To do
some initial tests of my nascent microcontroller implementation I
would like to load a very simple program binary. Is there a way to
load this binary and start execution without emulating Flash
controller and memory?

Thanks.



[PATCH] virtio-vga, stubs: Fix qemu failing on virio-vga with blobs on when qemu compiled with modules enabled

2021-08-22 Thread Max
>From f972dab518c6581a83f397c35b3d09b2ef6de674 Mon Sep 17 00:00:00 2001
From: max 
Date: Sun, 22 Aug 2021 14:02:48 +0300
Subject: [PATCH] virtio-vga, stubs: Fix qemu failing on virio-vga with blobs on 
when qemu compiled with modules enabled

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/553
Signed-off-by: max 
---
 stubs/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/stubs/meson.build b/stubs/meson.build
index d3fa864..beb737f 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -53,7 +53,7 @@ if have_system
   stub_ss.add(files('semihost.c'))
   stub_ss.add(files('usb-dev-stub.c'))
   stub_ss.add(files('xen-hw-stub.c'))
-  stub_ss.add(files('virtio-gpu-udmabuf.c'))
+  stub_ss.add(when: 'CONFIG_LINUX', if_false: files('virtio-gpu-udmabuf.c'))
 else
   stub_ss.add(files('qdev.c'))
 endif
-- 
2.32.0






[PATCH v2 2/5] qemu/qarray.h: check scalar type in QARRAY_CREATE()

2021-08-22 Thread Christian Schoenebeck
Make sure at compile time that the scalar type of the array
requested to be created via QARRAY_CREATE() matches the scalar
type of the passed auto reference variable (unique pointer).

Suggested-by: Richard Henderson 
Signed-off-by: Christian Schoenebeck 
---
 include/qemu/qarray.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/qemu/qarray.h b/include/qemu/qarray.h
index 9885e5e9ed..643c8877c5 100644
--- a/include/qemu/qarray.h
+++ b/include/qemu/qarray.h
@@ -27,6 +27,8 @@
 #ifndef QEMU_QARRAY_H
 #define QEMU_QARRAY_H
 
+#include "qemu/compiler.h"
+
 /**
  * QArray provides a mechanism to access arrays in common C-style (e.g. by
  * square bracket [] operator) in conjunction with reference variables that
@@ -145,6 +147,10 @@
  * @param len - amount of array elements to be allocated immediately
  */
 #define QARRAY_CREATE(scalar_type, auto_var, len) \
+QEMU_BUILD_BUG_MSG( \
+!__builtin_types_compatible_p(scalar_type, typeof(*auto_var)), \
+"QArray scalar type mismatch" \
+); \
 qarray_create_##scalar_type((_var), len)
 
 #endif /* QEMU_QARRAY_H */
-- 
2.20.1




[PATCH v2 4/5] 9pfs: make V9fsPath usable via QArray API

2021-08-22 Thread Christian Schoenebeck
Signed-off-by: Christian Schoenebeck 
---
 fsdev/file-op-9p.h | 2 ++
 hw/9pfs/9p.c   | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 42f677cf38..7630f0e538 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include "qemu-fsdev-throttle.h"
+#include "qemu/qarray.h"
 
 #define SM_LOCAL_MODE_BITS0600
 #define SM_LOCAL_DIR_MODE_BITS0700
@@ -105,6 +106,7 @@ struct V9fsPath {
 uint16_t size;
 char *data;
 };
+DECLARE_QARRAY_TYPE(V9fsPath);
 
 typedef union V9fsFidOpenState V9fsFidOpenState;
 
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index c857b31321..b59572fa79 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -50,6 +50,8 @@ enum {
 Oappend = 0x80,
 };
 
+DEFINE_QARRAY_TYPE(V9fsPath, v9fs_path_free);
+
 static ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
 {
 ssize_t ret;
-- 
2.20.1




[PATCH v2 5/5] 9pfs: use QArray in v9fs_walk()

2021-08-22 Thread Christian Schoenebeck
Signed-off-by: Christian Schoenebeck 
---
 hw/9pfs/9p.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index b59572fa79..91062ee4d6 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1707,13 +1707,14 @@ static void coroutine_fn v9fs_walk(void *opaque)
 int name_idx;
 g_autofree V9fsQID *qids = NULL;
 int i, err = 0;
-V9fsPath dpath, path, *pathes = NULL;
+V9fsPath dpath, path;
+QArrayRef(V9fsPath) pathes = NULL;
 uint16_t nwnames;
 struct stat stbuf, fidst;
 g_autofree struct stat *stbufs = NULL;
 size_t offset = 7;
 int32_t fid, newfid;
-V9fsString *wnames = NULL;
+QArrayRef(V9fsString) wnames = NULL;
 V9fsFidState *fidp;
 V9fsFidState *newfidp = NULL;
 V9fsPDU *pdu = opaque;
@@ -1734,10 +1735,10 @@ static void coroutine_fn v9fs_walk(void *opaque)
 goto out_nofid;
 }
 if (nwnames) {
-wnames = g_new0(V9fsString, nwnames);
+QARRAY_CREATE(V9fsString, wnames, nwnames);
 qids   = g_new0(V9fsQID, nwnames);
 stbufs = g_new0(struct stat, nwnames);
-pathes = g_new0(V9fsPath, nwnames);
+QARRAY_CREATE(V9fsPath, pathes, nwnames);
 for (i = 0; i < nwnames; i++) {
 err = pdu_unmarshal(pdu, offset, "s", [i]);
 if (err < 0) {
@@ -1869,14 +1870,6 @@ out:
 v9fs_path_free();
 out_nofid:
 pdu_complete(pdu, err);
-if (nwnames && nwnames <= P9_MAXWELEM) {
-for (name_idx = 0; name_idx < nwnames; name_idx++) {
-v9fs_string_free([name_idx]);
-v9fs_path_free([name_idx]);
-}
-g_free(wnames);
-g_free(pathes);
-}
 }
 
 static int32_t coroutine_fn get_iounit(V9fsPDU *pdu, V9fsPath *path)
-- 
2.20.1




[PATCH v2 3/5] 9pfs: make V9fsString usable via QArray API

2021-08-22 Thread Christian Schoenebeck
Signed-off-by: Christian Schoenebeck 
---
 fsdev/9p-marshal.c | 2 ++
 fsdev/9p-marshal.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/fsdev/9p-marshal.c b/fsdev/9p-marshal.c
index a01bba6908..fbfc2a62cd 100644
--- a/fsdev/9p-marshal.c
+++ b/fsdev/9p-marshal.c
@@ -18,6 +18,8 @@
 
 #include "9p-marshal.h"
 
+DEFINE_QARRAY_TYPE(V9fsString, v9fs_string_free);
+
 void v9fs_string_free(V9fsString *str)
 {
 g_free(str->data);
diff --git a/fsdev/9p-marshal.h b/fsdev/9p-marshal.h
index ceaf2f521e..7229e4e617 100644
--- a/fsdev/9p-marshal.h
+++ b/fsdev/9p-marshal.h
@@ -1,10 +1,13 @@
 #ifndef QEMU_9P_MARSHAL_H
 #define QEMU_9P_MARSHAL_H
 
+#include "qemu/qarray.h"
+
 typedef struct V9fsString {
 uint16_t size;
 char *data;
 } V9fsString;
+DECLARE_QARRAY_TYPE(V9fsString);
 
 typedef struct V9fsQID {
 uint8_t type;
-- 
2.20.1




[PATCH v2 1/5] qemu/qarray.h: introduce QArray

2021-08-22 Thread Christian Schoenebeck
Implements deep auto free of arrays while retaining common C-style
squared bracket access.

Signed-off-by: Christian Schoenebeck 
---
 include/qemu/qarray.h | 150 ++
 1 file changed, 150 insertions(+)
 create mode 100644 include/qemu/qarray.h

diff --git a/include/qemu/qarray.h b/include/qemu/qarray.h
new file mode 100644
index 00..9885e5e9ed
--- /dev/null
+++ b/include/qemu/qarray.h
@@ -0,0 +1,150 @@
+/*
+ * QArray - deep auto free C-array
+ *
+ * Copyright (c) 2021 Crudebyte
+ *
+ * Authors:
+ *   Christian Schoenebeck 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#ifndef QEMU_QARRAY_H
+#define QEMU_QARRAY_H
+
+/**
+ * QArray provides a mechanism to access arrays in common C-style (e.g. by
+ * square bracket [] operator) in conjunction with reference variables that
+ * perform deep auto free of the array when leaving the scope of the auto
+ * reference variable. That means not only is the array itself automatically
+ * freed, but also memory dynamically allocated by the individual array
+ * elements.
+ *
+ * Example:
+ *
+ * Consider the following user struct @c Foo which shall be used as scalar
+ * (element) type of an array:
+ * @code
+ * typedef struct Foo {
+ * int i;
+ * char *s;
+ * } Foo;
+ * @endcode
+ * and assume it has the following function to free memory allocated by @c Foo
+ * instances:
+ * @code
+ * void free_foo(Foo *foo) {
+ * free(foo->s);
+ * }
+ * @endcode
+ * Add the following to a shared header file:
+ * @code
+ * DECLARE_QARRAY_TYPE(Foo);
+ * @endcode
+ * and the following to a C unit file:
+ * @code
+ * DEFINE_QARRAY_TYPE(Foo, free_foo);
+ * @endcode
+ * Finally the array may then be used like this:
+ * @code
+ * void doSomething(int n) {
+ * QArrayRef(Foo) foos = NULL;
+ * QARRAY_CREATE(Foo, foos, n);
+ * for (size_t i = 0; i < n; ++i) {
+ * foos[i].i = i;
+ * foos[i].s = calloc(4096, 1);
+ * snprintf(foos[i].s, 4096, "foo %d", i);
+ * }
+ * }
+ * @endcode
+ */
+
+/**
+ * Declares an array type for the passed @a scalar_type.
+ *
+ * This is typically used from a shared header file.
+ *
+ * @param scalar_type - type of the individual array elements
+ */
+#define DECLARE_QARRAY_TYPE(scalar_type) \
+typedef struct QArray##scalar_type { \
+size_t len; \
+scalar_type first[]; \
+} QArray##scalar_type; \
+\
+void qarray_create_##scalar_type(scalar_type **auto_var, size_t len); \
+void qarray_auto_free_##scalar_type(scalar_type **auto_var); \
+
+/**
+ * Defines an array type for the passed @a scalar_type and appropriate
+ * @a scalar_cleanup_func.
+ *
+ * This is typically used from a C unit file.
+ *
+ * @param scalar_type - type of the individual array elements
+ * @param scalar_cleanup_func - appropriate function to free memory dynamically
+ *  allocated by individual array elements before
+ */
+#define DEFINE_QARRAY_TYPE(scalar_type, scalar_cleanup_func) \
+void qarray_create_##scalar_type(scalar_type **auto_var, size_t len) \
+{ \
+qarray_auto_free_##scalar_type(auto_var); \
+QArray##scalar_type *arr = g_malloc0(sizeof(QArray##scalar_type) + \
+len * sizeof(scalar_type)); \
+arr->len = len; \
+*auto_var = >first[0]; \
+} \
+\
+void qarray_auto_free_##scalar_type(scalar_type **auto_var) \
+{ \
+scalar_type *first = (*auto_var); \
+if (!first) { \
+return; \
+} \
+QArray##scalar_type *arr = (QArray##scalar_type *) ( \
+((char *)first) - offsetof(QArray##scalar_type, first) \
+); \
+for (size_t i = 0; i < arr->len; ++i) { \
+scalar_cleanup_func(>first[i]); \
+} \
+g_free(arr); \
+} \
+
+/**
+ * Used to declare a reference variable (unique pointer) for an array. After
+ * leaving the scope of the reference variable, 

[PATCH v2 0/5] introduce QArray

2021-08-22 Thread Christian Schoenebeck
Patches 1 and 2 introduce include/qemu/qarray.h which implements a deep auto
free mechanism for arrays. Unlike GArray it does not require special macros,
function calls or member dereferencing to access the individual array
elements. So existing C-style array code can be retained with only very
little changes.

In this initial version QArray only supports the concept of unique pointers,
i.e. it does not support reference counting. The array (and all dynamically
allocated memory of individual array elements) is auto freed once execution
leaves the scope of the reference variable (unique pointer) associated with
the array.

Patches 3..5 are provided (e.g. as example) for 9p being the first user of
this new QArray API. These particular patches 3..5 are rebased on my
current 9p queue: https://github.com/cschoenebeck/qemu/commits/9p.next
which are basically just the following two queued patches:

https://github.com/cschoenebeck/qemu/commit/7772715d43908235940f5b7dec68d0458b1ccdf4
https://github.com/cschoenebeck/qemu/commit/838b55e392ea7d52e714fdba1db777f658aee2cc

v1 -> v2:

* Minor API comment changes [patch 1].

* Perform strong type check by using __builtin_types_compatible_p()
  instead of a weak check using sizeof() [patch 2].

Christian Schoenebeck (5):
  qemu/qarray.h: introduce QArray
  qemu/qarray.h: check scalar type in QARRAY_CREATE()
  9pfs: make V9fsString usable via QArray API
  9pfs: make V9fsPath usable via QArray API
  9pfs: use QArray in v9fs_walk()

 fsdev/9p-marshal.c|   2 +
 fsdev/9p-marshal.h|   3 +
 fsdev/file-op-9p.h|   2 +
 hw/9pfs/9p.c  |  19 ++---
 include/qemu/qarray.h | 156 ++
 5 files changed, 170 insertions(+), 12 deletions(-)
 create mode 100644 include/qemu/qarray.h

-- 
2.20.1




Re: [qemu-web PATCH] Add a blog post about FUSE block exports

2021-08-22 Thread Thomas Huth

On 20/08/2021 09.56, Hanna Reitz wrote:

On 19.08.21 18:23, Stefan Hajnoczi wrote:

On Thu, Aug 19, 2021 at 12:25:01PM +0200, Hanna Reitz wrote:

This post explains when FUSE block exports are useful, how they work,
and that it is fun to export an image file on its own path so it looks
like your image file (in whatever format it was) is a raw image now.

[...]

This is a long blog post. One idea is to show a quickstart
qemu-storage-daemon FUSE export command-line in the beginning before
explaining all the details. That way people who just want to see what
this is about can get an idea without learning all the background first.


Sounds good, will do.  Thanks!


Oops, sorry, I just already pushed your patch to the repository since I did 
not notice this conversation in time. Next time, please make sure to put 
myself (and Paolo) on CC: when sending updates for qemu-web, otherwise I 
might not notice the updates :-/


 Thomas




Re: [PATCH 1/5] qemu/qarray.h: introduce QArray

2021-08-22 Thread Christian Schoenebeck
On Samstag, 21. August 2021 22:18:18 CEST Christian Schoenebeck wrote:
> Implements deep auto free of arrays while retaining common C-style
> squared bracket access.
> 
> Signed-off-by: Christian Schoenebeck 
> ---

As I'm going to send a v2 anyway, I'll also just do some minor API comment 
changes in this patch, specifically ...

>  include/qemu/qarray.h | 148 ++
>  1 file changed, 148 insertions(+)
>  create mode 100644 include/qemu/qarray.h
> 
> diff --git a/include/qemu/qarray.h b/include/qemu/qarray.h
> new file mode 100644
> index 00..230a556e81
> --- /dev/null
> +++ b/include/qemu/qarray.h
> @@ -0,0 +1,148 @@
> +/*
> + * QArray - deep auto free C-array
> + *
> + * Copyright (c) 2021 Crudebyte
> + *
> + * Authors:
> + *   Christian Schoenebeck 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> copy + * of this software and associated documentation files (the
> "Software"), to deal + * in the Software without restriction, including
> without limitation the rights + * to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell + * copies of the Software, and to
> permit persons to whom the Software is + * furnished to do so, subject to
> the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included
> in + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
> CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE
> SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE.
> + */
> +#ifndef QEMU_QARRAY_H
> +#define QEMU_QARRAY_H
> +
> +/**
> + * QArray provides a mechanism to access arrays in common C-style (e.g. by
> + * square bracket [] operator) in conjunction with reference variables that
> + * perform deep auto free of the array when leaving the scope of the auto
> + * reference variable. That means not only is the array itself
> automatically + * freed, but also memory dynamically allocated by the
> individual array + * elements.
> + *
> + * Example:
> + *
> + * Consider the following user struct @c Foo which shall be used as scalar
> + * (element) type of an array:
> + * @code
> + * typedef struct Foo {
> + * int i;
> + * char *s;
> + * } Foo;
> + * @endcode
> + * and assume it has the following function to free memory allocated by @c
> Foo + * instances:
> + * @code
> + * void free_foo(Foo *foo) {
> + * free(foo->s);
> + * }
> + * @endcode
> + * Add the following to a shared header file:
> + * @code
> + * DECLARE_QARRAY_TYPE(Foo);
> + * @endcode
> + * and the following to a C unit file:
> + * @code
> + * DEFINE_QARRAY_TYPE(Foo, free_foo);
> + * @endcode
> + * Finally the array may then be used like this:
> + * @code
> + * void doSomething(int n) {
> + * QArrayRef(Foo) foos = NULL;
> + * QARRAY_CREATE(Foo, foos, n);
> + * for (size_t i = 0; i < n; ++i) {
> + * foos[i].i = i;
> + * foos[i].s = calloc(4096, 1);
> + * snprintf(foos[i].s, 4096, "foo %d", i);
> + * }
> + * }
> + * @endcode
> + */
> +
> +/**
> + * Declares an array for the passed @a scalar_type.

"Declares an array *type* ...".

To not confuse it with declaring an array instance/variable.

> + *
> + * This is typically used from a shared header file.
> + *
> + * @param scalar_type - type of the individual array elements
> + */
> +#define DECLARE_QARRAY_TYPE(scalar_type) \
> +typedef struct QArray##scalar_type { \
> +size_t len; \
> +scalar_type first[]; \
> +} QArray##scalar_type; \
> +\
> +void qarray_create_##scalar_type(scalar_type **auto_var, size_t len); \
> +void qarray_auto_free_##scalar_type(scalar_type **auto_var); \ +
> +/**
> + * Defines an array for the passed @a scalar_type and appropriate
> + * @a scalar_cleanup_func.

Likewise "Defines an array *type* ...".

Plus missing:

"This is typically used from a C unit file."

> + *
> + * @param scalar_type - type of the individual array elements
> + * @param scalar_cleanup_func - appropriate function to free memory
> dynamically + *  allocated by individual array
> elements before + */
> +#define DEFINE_QARRAY_TYPE(scalar_type, scalar_cleanup_func) \
> +void qarray_create_##scalar_type(scalar_type **auto_var, size_t len) \
> +{ \
> +qarray_auto_free_##scalar_type(auto_var); \
> +QArray##scalar_type *arr = g_malloc0(sizeof(QArray##scalar_type) +
> \ +len * sizeof(scalar_type)); \
> +arr->len = len; \
> +*auto_var = >first[0]; \
> +} \
> +\
> +void 

Re: [PATCH v3 63/66] tcg/tci: Support raising sigbus for user-only

2021-08-22 Thread Peter Maydell
On Sun, 22 Aug 2021 at 08:59, Richard Henderson
 wrote:
>
> On 8/20/21 3:14 AM, Peter Maydell wrote:
> >> @@ -377,11 +381,11 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
> >> target_ulong taddr,
> >>   static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t 
> >> val,
> >>   MemOpIdx oi, const void *tb_ptr)
> >>   {
> >> -MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
> >> +MemOp mop = get_memop(oi);
> >>   uintptr_t ra = (uintptr_t)tb_ptr;
> >
> > Don't you need this bit in tci_qemu_st() as well ?
>
> Which bit isn't present in st as well?
> There's missing hunks in your reply, but afaics they're the same.

https://patchew.org/QEMU/20210818191920.390759-1-richard.hender...@linaro.org/20210818191920.390759-64-richard.hender...@linaro.org/

I had the function name wrong, but only the tci_qemu_st() change
has this bit:

-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);

-- PMM



Re: [PATCH 2/5] qemu/qarray.h: weak scalar type check in QARRAY_CREATE()

2021-08-22 Thread Christian Schoenebeck
On Sonntag, 22. August 2021 06:11:58 CEST Richard Henderson wrote:
> On 8/21/21 1:30 PM, Christian Schoenebeck wrote:
> > Unfortunately something like
> > 
> >_Static_assert(typeof(a) == typeof(b), "type mismatch");
> > 
> > is currently not suported by C. So for the time being at least
> > check that the size of the scalar types match at compile time.
> 
> Did you try
> _Static_assert(__builtin_types_compatible_p(X, Y), "type mismatch");
> 
> 
> r~

Ah, you are right. I was trying it, but now as you pointed me at it again, I 
realized I was just missing something. The specific use case here is like:

struct Foo {
  ...
} Foo;

Foo *var;

_Static_assert(__builtin_types_compatible_p(Foo, typeof(*var)),
   "type mismatch");

So I was missing the typeof() keyword to deduce the scalar type from the 
passed variable.

I'll send a v2.

Thanks!

Best regards,
Christian Schoenebeck





Re: [PATCH 1/2] migration/rdma: Try to register On-Demand Paging memory region

2021-08-22 Thread Marcel Apfelbaum
Hi

On Sat, Jul 31, 2021 at 5:00 PM Li Zhijian  wrote:
>
> Previously, for the fsdax mem-backend-file, it will register failed with
> Operation not supported. In this case, we can try to register it with
> On-Demand Paging[1] like what rpma_mr_reg() does on rpma[2].
>
> [1]: 
> https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
> [2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3
> Signed-off-by: Li Zhijian 
> ---
>  migration/rdma.c   | 27 ++-
>  migration/trace-events |  1 +
>  2 files changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 5c2d113aa94..8784b5f22a6 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1123,15 +1123,21 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext 
> *rdma)
>  RDMALocalBlocks *local = >local_ram_blocks;
>
>  for (i = 0; i < local->nb_blocks; i++) {
> +int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE;
> +
> +on_demand:
>  local->block[i].mr =
>  ibv_reg_mr(rdma->pd,
>  local->block[i].local_host_addr,
> -local->block[i].length,
> -IBV_ACCESS_LOCAL_WRITE |
> -IBV_ACCESS_REMOTE_WRITE
> +local->block[i].length, access
>  );
>  if (!local->block[i].mr) {
> -perror("Failed to register local dest ram block!");
> +if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
> +access |= IBV_ACCESS_ON_DEMAND;
> +trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
> +goto on_demand;

Wouldn't it be better to check first if the device supports ODP ?
Something like:
ret = ibv_exp_query_device(context, );
if (dattr.exp_device_cap_flags & IBV_EXP_DEVICE_ODP)...

Also, I  am not (personally) too fond of the "on_demand" label usage here,
however I will let the maintainer/others decide.

Thanks,
Marcel

> +}
> +perror("Failed to register local dest ram block!\n");
>  break;
>  }
>  rdma->total_registrations++;
> @@ -1215,15 +1221,18 @@ static int 
> qemu_rdma_register_and_get_keys(RDMAContext *rdma,
>   */
>  if (!block->pmr[chunk]) {
>  uint64_t len = chunk_end - chunk_start;
> +int access = rkey ? IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE 
> : 0;
>
>  trace_qemu_rdma_register_and_get_keys(len, chunk_start);
>
> -block->pmr[chunk] = ibv_reg_mr(rdma->pd,
> -chunk_start, len,
> -(rkey ? (IBV_ACCESS_LOCAL_WRITE |
> -IBV_ACCESS_REMOTE_WRITE) : 0));
> -
> +on_demand:
> +block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
>  if (!block->pmr[chunk]) {
> +if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
> +access |= IBV_ACCESS_ON_DEMAND;
> +trace_qemu_rdma_register_odp_mr(block->block_name);
> +goto on_demand;
> +}
>  perror("Failed to register chunk!");
>  fprintf(stderr, "Chunk details: block: %d chunk index %d"
>  " start %" PRIuPTR " end %" PRIuPTR
> diff --git a/migration/trace-events b/migration/trace-events
> index a1c0f034ab8..5f6aa580def 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -212,6 +212,7 @@ qemu_rdma_poll_write(const char *compstr, int64_t comp, 
> int left, uint64_t block
>  qemu_rdma_poll_other(const char *compstr, int64_t comp, int left) "other 
> completion %s (%" PRId64 ") received left %d"
>  qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
>  qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" 
> PRIu64 " bytes @ %p"
> +qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand 
> Paging memory region: %s"
>  qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t 
> offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
>  qemu_rdma_registration_handle_finished(void) ""
>  qemu_rdma_registration_handle_ram_blocks(void) ""
> --
> 2.31.1
>
>
>
>



Re: [PATCH 2/2] migration/rdma: advise prefetch write for ODP region

2021-08-22 Thread Marcel Apfelbaum
Hi,

On Sat, Jul 31, 2021 at 5:03 PM Li Zhijian  wrote:
>
> The responder mr registering with ODP will sent RNR NAK back to
> the requester in the face of the page fault.
> -
> ibv_poll_cq wc.status=13 RNR retry counter exceeded!
> ibv_poll_cq wrid=WRITE RDMA!
> -
> ibv_advise_mr(3) helps to make pages present before the actual IO is
> conducted so that the responder does page fault as little as possible.
>
> Signed-off-by: Li Zhijian 
> ---
>  migration/rdma.c   | 40 
>  migration/trace-events |  1 +
>  2 files changed, 41 insertions(+)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 8784b5f22a6..a2ad00d665f 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1117,6 +1117,30 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
>  return 0;
>  }
>
> +/*
> + * ibv_advise_mr to avoid RNR NAK error as far as possible.
> + * The responder mr registering with ODP will sent RNR NAK back to
> + * the requester in the face of the page fault.
> + */
> +static void qemu_rdma_advise_prefetch_write_mr(struct ibv_pd *pd, uint64_t 
> addr,
> +   uint32_t len,  uint32_t lkey,
> +   const char *name, bool wr)
> +{
> +int ret;
> +int advice = wr ? IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE :
> + IBV_ADVISE_MR_ADVICE_PREFETCH;
> +struct ibv_sge sg_list = {.lkey = lkey, .addr = addr, .length = len};
> +
> +ret = ibv_advise_mr(pd, advice,
> +IB_UVERBS_ADVISE_MR_FLAG_FLUSH, _list, 1);
> +/* ignore the error */
> +if (ret) {
> +trace_qemu_rdma_advise_mr(name, len, addr, strerror(errno));
> +} else {
> +trace_qemu_rdma_advise_mr(name, len, addr, "successed");
> +}
> +}
> +
>  static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
>  {
>  int i;
> @@ -1140,6 +1164,17 @@ on_demand:
>  perror("Failed to register local dest ram block!\n");
>  break;
>  }
> +
> +if (access & IBV_ACCESS_ON_DEMAND) {
> +qemu_rdma_advise_prefetch_write_mr(rdma->pd,
> +   (uintptr_t)
> +   
> local->block[i].local_host_addr,
> +   local->block[i].length,
> +   local->block[i].mr->lkey,
> +   local->block[i].block_name,
> +   true);
> +}
> +
>  rdma->total_registrations++;
>  }
>
> @@ -1244,6 +1279,11 @@ on_demand:
>  rdma->total_registrations);
>  return -1;
>  }
> +if (access & IBV_ACCESS_ON_DEMAND) {
> +qemu_rdma_advise_prefetch_write_mr(rdma->pd, 
> (uintptr_t)chunk_start,
> +   len, block->pmr[chunk]->lkey,
> +   block->block_name, rkey);
> +}
>  rdma->total_registrations++;
>  }
>
> diff --git a/migration/trace-events b/migration/trace-events
> index 5f6aa580def..901c1d54c12 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -213,6 +213,7 @@ qemu_rdma_poll_other(const char *compstr, int64_t comp, 
> int left) "other complet
>  qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
>  qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" 
> PRIu64 " bytes @ %p"
>  qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand 
> Paging memory region: %s"
> +qemu_rdma_advise_mr(const char *name, uint32_t len, uint64_t addr, const 
> char *res) "Try to advise block %s prefetch write at %" PRIu32 "@0x%" PRIx64 
> ": %s"
>  qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t 
> offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
>  qemu_rdma_registration_handle_finished(void) ""
>  qemu_rdma_registration_handle_ram_blocks(void) ""
> --
> 2.31.1
>

Following 
https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/man/ibv_advise_mr.3.md
it looks like it is a best-effort optimization,
I don't see any down-sides to it.
However it seems like it is recommended to use
IBV_ADVISE_MR_FLAG_FLUSH in order to
increase the optimization chances.

Anyway

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [PATCH v3 63/66] tcg/tci: Support raising sigbus for user-only

2021-08-22 Thread Richard Henderson

On 8/20/21 3:14 AM, Peter Maydell wrote:

@@ -296,7 +296,7 @@ static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong 
taddr,
  uintptr_t ra = (uintptr_t)tb_ptr;

  #ifdef CONFIG_SOFTMMU
-switch (mop) {
+switch (mop & (MO_BSWAP | MO_SSIZE)) {
  case MO_UB:
  return helper_ret_ldub_mmu(env, taddr, oi, ra);
  case MO_SB:
@@ -326,10 +326,14 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
  }
  #else
  void *haddr = g2h(env_cpu(env), taddr);
+unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
  uint64_t ret;

  set_helper_retaddr(ra);
-switch (mop) {
+if (taddr & a_mask) {
+helper_unaligned_ld(env, taddr);
+}
+switch (mop & (MO_BSWAP | MO_SSIZE)) {
  case MO_UB:
  ret = ldub_p(haddr);
  break;
@@ -377,11 +381,11 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
  static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
  MemOpIdx oi, const void *tb_ptr)
  {
-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);
  uintptr_t ra = (uintptr_t)tb_ptr;


Don't you need this bit in tci_qemu_st() as well ?


Which bit isn't present in st as well?
There's missing hunks in your reply, but afaics they're the same.

r~



Re: [PATCH v3 59/66] accel/tcg: Handle SIGBUS in handle_cpu_signal

2021-08-22 Thread Richard Henderson

On 8/20/21 2:34 AM, Peter Maydell wrote:

On Wed, 18 Aug 2021 at 21:13, Richard Henderson
 wrote:


We've been registering host SIGBUS, but then treating it
exactly like SIGSEGV.

Handle BUS_ADRALN via cpu_unaligned_access, but allow other
SIGBUS si_codes to continue into the host-to-guest signal
coversion code in host_signal_handler.  Unwind the guest
state so that we report the correct guest PC for the fault.


You can't rely on alignment faults being marked by BUS_ADRALN:
eg MIPS doesn't give you that si_code. How much does that matter
for our use of it here ?


Aside from whatever tcg/foo relies upon, it doesn't matter.

But: silly mips. I was going to rely on it there -- oh well, I'll undo that section of the 
follow-on tcg/mips patch set.  And it's got a different syscall for disabling the 
unaligned fixup.



r~



Re: [PATCH v4 3/7] python/aqmp-tui: Add AQMP TUI draft

2021-08-22 Thread John Snow
On Thu, Aug 19, 2021 at 1:39 PM G S Niteesh Babu 
wrote:

> Added a draft of AQMP TUI.
>
> Implements the follwing basic features:
> 1) Command transmission/reception.
> 2) Shows events asynchronously.
> 3) Shows server status in the bottom status bar.
>
> Also added type annotations and necessary pylint,
> mypy configurations
>
> Signed-off-by: G S Niteesh Babu 
> ---
>  python/qemu/aqmp/aqmp_tui.py | 566 +++
>  python/setup.cfg |  15 +-
>  2 files changed, 579 insertions(+), 2 deletions(-)
>  create mode 100644 python/qemu/aqmp/aqmp_tui.py
>
> diff --git a/python/qemu/aqmp/aqmp_tui.py b/python/qemu/aqmp/aqmp_tui.py
> new file mode 100644
> index 00..12c9c4162a
> --- /dev/null
> +++ b/python/qemu/aqmp/aqmp_tui.py
> @@ -0,0 +1,566 @@
> +# Copyright (c) 2021
> +#
> +# Authors:
> +#  Niteesh Babu G S 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +"""
> +AQMP TUI
> +
> +AQMP TUI is an asynchronous interface built on top the of the AQMP
> library.
> +It is the successor of QMP-shell and is bought-in as a replacement for it.
> +
> +Example Usage: aqmp-tui 
> +Full Usage: aqmp-tui --help
> +"""
> +
> +import argparse
> +import asyncio
> +import logging
> +from logging import Handler, LogRecord
> +import signal
> +from typing import (
> +List,
> +Optional,
> +Tuple,
> +Type,
> +Union,
> +cast,
> +)
> +
> +import urwid
> +import urwid_readline
> +
> +from ..qmp import QEMUMonitorProtocol, QMPBadPortError
> +from .message import DeserializationError, Message, UnexpectedTypeError
> +from .protocol import ConnectError
> +from .qmp_client import ExecInterruptedError, QMPClient
> +from .util import create_task, pretty_traceback
> +
> +
> +# The name of the signal that is used to update the history list
> +UPDATE_MSG: str = 'UPDATE_MSG'
> +
> +
> +def format_json(msg: str) -> str:
> +"""
> +Formats given multi-line JSON message into a single-line message.
> +Converting into single line is more asthetically pleasing when looking
> +along with error messages.
> +
> +Eg:
> +Input:
> +  [ 1,
> +true,
> +3 ]
> +The above input is not a valid QMP message and produces the following
> error
> +"QMP message is not a JSON object."
> +When displaying this in TUI in multiline mode we get
> +
> +[ 1,
> +  true,
> +  3 ]: QMP message is not a JSON object.
> +
> +whereas in singleline mode we get the following
> +
> +[1, true, 3]: QMP message is not a JSON object.
> +
> +The single line mode is more asthetically pleasing.
> +
> +:param msg:
> +The message to formatted into single line.
> +
> +:return: Formatted singleline message.
> +
> +NOTE: We cannot use the JSON module here because it is only capable of
> +format valid JSON messages. But here the goal is to also format
> invalid
> +JSON messages.
> +"""
> +msg = msg.replace('\n', '')
> +words = msg.split(' ')
> +words = [word for word in words if word != '']
> +return ' '.join(words)
> +
> +
> +def has_tui_handler(logger: logging.Logger,
> +handler_type: Type[Handler]) -> bool:
> +"""
> +The Logger class has no interface to check if a certain type of
> handler is
> +installed or not. So we provide an interface to do so.
> +
> +:param logger:
> +Logger object
> +:param handler_type:
> +The type of the handler to be checked.
> +
> +:return: returns True if handler of type `handler_type` is installed
> else
> + False.
> +"""
> +handlers = logger.handlers
> +for handler in handlers:
> +if isinstance(handler, handler_type):
> +return True
> +return False
> +
> +
> +class App(QMPClient):
> +"""
> +Implements the AQMP TUI.
> +
> +Initializes the widgets and starts the urwid event loop.
> +"""
> +def __init__(self, address: Union[str, Tuple[str, int]]) -> None:
> +"""
> +Initializes the TUI.
> +
> +:param address:
> +Address of the server to connect to.
> +"""
> +urwid.register_signal(type(self), UPDATE_MSG)
> +self.window = Window(self)
> +self.address = address
> +self.aloop: Optional[asyncio.AbstractEventLoop] = None
> +super().__init__()
> +
> +def add_to_history(self, msg: str, level: Optional[str] = None) ->
> None:
> +"""
> +Appends the msg to the history list.
> +
> +:param msg:
> +The raw message to be appended in string type.
> +"""
> +urwid.emit_signal(self, UPDATE_MSG, msg, level)
> +
> +def _cb_outbound(self, msg: Message) -> Message:
> +"""
> +Callback: outbound message hook.
> +
> +Appends the outgoing messages to the history box.
> +
> +:param msg: raw