[PATCH] target/rx: set PSW.I when executing wait instruction

2022-04-16 Thread Tomoaki Kawada
This patch fixes the implementation of the wait instruction to
implicitly update PSW.I as required by the ISA specification.

Signed-off-by: Tomoaki Kawada 
---
 target/rx/op_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/rx/op_helper.c b/target/rx/op_helper.c
index 11f952d340..81645adde3 100644
--- a/target/rx/op_helper.c
+++ b/target/rx/op_helper.c
@@ -448,6 +448,7 @@ void QEMU_NORETURN helper_wait(CPURXState *env)
 
 cs->halted = 1;
 env->in_sleep = 1;
+env->psw_i = 1;
 raise_exception(env, EXCP_HLT, 0);
 }
 
-- 
2.35.1
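
For context: the RX core accepts maskable interrupts only while PSW.I is
set, so a CPU halted by wait without PSW.I = 1 could never be woken by an
interrupt again. A minimal sketch of that gating check (simplified names,
not the actual QEMU hook, which is more involved):

    /* Sketch: delivery of maskable interrupts is gated on PSW.I. */
    static bool rx_cpu_irq_enabled(CPURXState *env)
    {
        return env->psw_i != 0;
    }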




Re: [PATCH v3] hw/misc: applesmc: use host osk as default on macs

2022-04-16 Thread Vladislav Yaroshchuk
Hi Pedro Torres,

Please note these threads:
https://patchew.org/QEMU/20211022161448.81579-1-yaroshchuk2...@gmail.com/
https://patchew.org/QEMU/20220113152836.60398-1-yaroshchuk2...@gmail.com/

There was a discussion about how to preserve
backward compatibility of the emulated AppleSMC
behaviour. The discussion has stopped, but
if you intend to see this feature working
within QEMU, it'd be good to cc all the
participants of the previous threads :)


Thanks,

Vladislav Yaroshchuk 




Re: [PATCH v2 for-7.1 0/9] nbd: actually make s->state thread-safe

2022-04-16 Thread Lukas Straub
On Thu, 14 Apr 2022 19:57:47 +0200
Paolo Bonzini  wrote:

> The main point of this series is patch 7, which removes the dubious and
> probably wrong use of atomics in block/nbd.c.  This in turn is enabled
> mostly by the cleanups in patches 3-5.  Together, they introduce a
> QemuMutex that synchronizes the NBD client coroutines, the reconnect_delay
> timer and nbd_cancel_in_flight() as well.
> 
> The fixes happen to remove an incorrect use of qemu_co_queue_restart_all
> and qemu_co_enter_next on the s->free_sema CoQueue, which was not guarded
> by s->send_mutex.
> 
> The rest is bugfixes, simplifying the code a bit, and extra documentation.
> 
> v1->v2:
> - cleaner patch 1
> - fix grammar in patch 4
> - split out patch 6
> 
> Paolo Bonzini (9):
>   nbd: safeguard against waking up invalid coroutine
>   nbd: mark more coroutine_fns
>   nbd: remove peppering of nbd_client_connected
>   nbd: keep send_mutex/free_sema handling outside
> nbd_co_do_establish_connection
>   nbd: use a QemuMutex to synchronize yanking, reconnection and
> coroutines
>   nbd: code motion and function renaming
>   nbd: move s->state under requests_lock
>   nbd: take receive_mutex when reading requests[].receiving
>   nbd: document what is protected by the CoMutexes
> 
>  block/coroutines.h |   4 +-
>  block/nbd.c| 298 +++--
>  2 files changed, 154 insertions(+), 148 deletions(-)
> 

For the whole series:

Reviewed-by: Lukas Straub 

Regards,
Lukas

-- 




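For reference, the shape of the conversion the series arrives at: s->state
is read under a plain QemuMutex instead of load-acquire atomics. A sketch
mirroring patch 7 further below (not the full patch):

    static bool nbd_client_will_reconnect(BDRVNBDState *s)
    {
        QEMU_LOCK_GUARD(&s->requests_lock);
        return s->state == NBD_CLIENT_CONNECTING_WAIT;
    }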

Re: [RFC PATCH 09/18] hw/arm/xlnx-zynqmp: convert cpu clusters to arm_cpus

2022-04-16 Thread Philippe Mathieu-Daudé

On 30/3/22 14:56, Damien Hedde wrote:

The qom-paths of the cpus are changed:
+ "apu-cluster/apu-cpu[n]" to "apu/cpu[n]"
+ "rpu-cluster/rpu-cpu[n]" to "rpu/cpu[n]"

Signed-off-by: Damien Hedde 
---
  include/hw/arm/xlnx-zynqmp.h |   8 +--
  hw/arm/xlnx-zynqmp.c | 121 +--
  2 files changed, 48 insertions(+), 81 deletions(-)



  static void xlnx_zynqmp_create_bbram(XlnxZynqMPState *s, qemu_irq *gic)
@@ -296,7 +282,8 @@ static void xlnx_zynqmp_create_apu_ctrl(XlnxZynqMPState *s, qemu_irq *gic)
  g_autofree gchar *name = g_strdup_printf("cpu%d", i);
  
         object_property_set_link(OBJECT(&s->apu_ctrl), name,
-                                 OBJECT(&s->apu_cpu[i]), &error_abort);
+                                 OBJECT(arm_cpus_get_cpu(&s->apu, i)),
+                                 &error_abort);
  }


A possible further improvement: XlnxZynqMPAPUCtrl could now take a single
link to the ArmCpusGroup instead of an array of ARMCPU links, as sketched
below.
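
A sketch of that suggestion (the "cpus" link-property name is an
assumption, not something from the series):

    /* Hypothetical: one link to the whole CPU group instead of one
     * link per ARMCPU. */
    object_property_set_link(OBJECT(&s->apu_ctrl), "cpus",
                             OBJECT(&s->apu), &error_abort);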



Re: [RFC PATCH 08/18] hw/arm/arm_cpus: add arm_cpus device

2022-04-16 Thread Philippe Mathieu-Daudé

On 30/3/22 14:56, Damien Hedde wrote:

This object can be used to create a group of homogeneous
arm cpus.

Signed-off-by: Damien Hedde 
---
  include/hw/arm/arm_cpus.h | 45 
  hw/arm/arm_cpus.c | 63 +++
  hw/arm/meson.build|  1 +
  3 files changed, 109 insertions(+)
  create mode 100644 include/hw/arm/arm_cpus.h
  create mode 100644 hw/arm/arm_cpus.c



+/**
+ * ArmCpusState:
+ * @reset_hivecs: used to initialize the cpu's reset-hivecs
+ * @has_el3: used to initialize the cpu's has_el3
+ * @has_el2: used to initialize the cpu's has_el2
+ * @reset_cbar: used to initialize the cpu's reset-cbar
+ */
+struct ArmCpusState {
+CpusState parent_obj;
+
+bool reset_hivecs;
+bool has_el3;
+bool has_el2;
+uint64_t reset_cbar;
+};
+
+/*
+ * arm_cpus_get_cpu:
+ * Helper to get an ArmCpu from the container.
+ */
+static inline ARMCPU *arm_cpus_get_cpu(ArmCpusState *s, unsigned i)


Maybe we can avoid the 'State' suffix for the CPU group type (while still
using the 'Class' suffix for classes) and name this:


  ARMCPU *arm_cpus_group_get_cpu(ArmCpusGroup *s, unsigned index);
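
A minimal sketch of that accessor under the proposed naming (assuming the
container type is renamed to ArmCpusGroup; CPUS() is the cast macro
declared by OBJECT_DECLARE_TYPE in the cpus header):

    static inline ARMCPU *arm_cpus_group_get_cpu(ArmCpusGroup *s,
                                                 unsigned index)
    {
        return ARM_CPU(CPUS(s)->cpus[index]);
    }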



Re: [RFC PATCH 02/18] hw/cpu/cpus: introduce _cpus_ device

2022-04-16 Thread Philippe Mathieu-Daudé

On 30/3/22 14:56, Damien Hedde wrote:

This object will be a generalization of _cpu-cluster_ and
is meant to allow creating cpus of the same type.

The main goal is that this object, contrary to _cpu-cluster_,
can be used to dynamically create cpus: it does not rely on
external code to populate the object with cpus.

Allowing the user to create a cpu cluster and each _cpu_
separately would be hard for the following reasons:
+ cpu reset needs to be handled
+ instantiation and realization of the cpu-cluster and the cpus
   are interleaved
+ a cpu cluster must contain only identical cpus, and it seems
   difficult to check that at runtime.
Therefore we add a new type solving all these constraints.

_cpu-cluster_ will be updated to inherit from this class
in the following commits.

Signed-off-by: Damien Hedde 
---
  include/hw/cpu/cpus.h |  71 +++
  hw/cpu/cpus.c | 127 ++
  hw/cpu/meson.build|   2 +-
  3 files changed, 199 insertions(+), 1 deletion(-)
  create mode 100644 include/hw/cpu/cpus.h
  create mode 100644 hw/cpu/cpus.c

diff --git a/include/hw/cpu/cpus.h b/include/hw/cpu/cpus.h
new file mode 100644
index 00..c65f568ef8
--- /dev/null
+++ b/include/hw/cpu/cpus.h
@@ -0,0 +1,71 @@
+/*
+ * QEMU CPUs type
+ *
+ * Copyright (c) 2022 GreenSocs
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_CPU_CPUS_H
+#define HW_CPU_CPUS_H
+
+#include "qemu/typedefs.h"
+#include "hw/qdev-core.h"
+#include "qom/object.h"
+
+/*
+ * This object represent several CPUs which are all identical.


Typo "represents".


+ *
+ * If CPUs are not identical (for example, Cortex-A53 and Cortex-A57 CPUs in an
+ * Arm big.LITTLE system) they should be in different groups. If the CPUs do
+ * not have the same view of memory (for example the main CPU and a management
+ * controller processor) they should be in different groups.


This description calls for a clearer CpusGroupState name instead
of CpusState (which confuses me with CPUState). Alternatively
CpusArrayState.


+ *
+ * This is an abstract class; subclasses are supposed to be created on a
+ * per-architecture basis to handle the specifics of the cpu architecture.
+ * Subclasses are meant to be user-creatable (for cold-plug).
+ */
+
+#define TYPE_CPUS "cpus"
+OBJECT_DECLARE_TYPE(CpusState, CpusClass, CPUS)
+
+/**
+ * CpusState:
+ * @cpu_type: The type of cpu.
+ * @topology.cpus: The number of cpus in this group.
+ *  Explicitly put this field into a topology structure in
+ *  order to eventually update this smoothly with a full
+ *  CpuTopology structure in the future.
+ * @cpus: Array of pointers to cpu objects.
+ */
+struct CpusState {
+/*< private >*/
+DeviceState parent_obj;
+
+/*< public >*/
+char *cpu_type;
+struct {
+uint16_t cpus;
+} topology;
+CPUState **cpus;
+};




Re: [PATCH v2 for-7.1 9/9] nbd: document what is protected by the CoMutexes

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

Signed-off-by: Paolo Bonzini 
---
  block/nbd.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/nbd.c b/block/nbd.c
index 31c684772e..d0d94b40bd 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -81,12 +81,18 @@ typedef struct BDRVNBDState {
  NBDClientRequest requests[MAX_NBD_REQUESTS];
  QEMUTimer *reconnect_delay_timer;
  
+/* Protects sending data on the socket.  */

  CoMutex send_mutex;
+
+/*
+ * Protects receiving reply headers from the socket, as well as the
+ * fields reply and requests[].receiving


I think it's worth noting that s->reply is used without the mutex after
nbd_receive_replies() succeeds, until s->reply.handle is set to 0 in
nbd_co_receive_one_chunk().

Should "s->reply.handle = 0" be done under the mutex as well? And maybe in
the same critical section as nbd_recv_coroutines_wake()?
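
A sketch of what that could look like (nbd_recv_coroutines_wake_locked()
is hypothetical: after patch 8 the wake helper takes receive_mutex itself,
so a _locked variant would be needed to avoid taking the lock twice):

    WITH_QEMU_LOCK_GUARD(&s->receive_mutex) {
        s->reply.handle = 0;
        nbd_recv_coroutines_wake_locked(s);  /* hypothetical variant */
    }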


+ */
  CoMutex receive_mutex;
+NBDReply reply;
  
  QEMUTimer *open_timer;
  
-NBDReply reply;

  BlockDriverState *bs;
  
  /* Connection parameters */



--
Best regards,
Vladimir



Re: [PATCH v2 for-7.1 8/9] nbd: take receive_mutex when reading requests[].receiving

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

requests[].receiving is set by nbd_receive_replies() under the receive_mutex;
read it under the same mutex as well.  Waking up receivers on errors happens
after each reply finishes processing, in nbd_co_receive_one_chunk().
If there is no currently-active reply, there are two cases:

* either there is no active request at all, in which case no
element of request[] can have .receiving = true

* or nbd_receive_replies() must be running and owns receive_mutex;
in that case it will get back to nbd_co_receive_one_chunk() because
the socket has been shut down, and all waiting coroutines will wake up
in turn.



It seems that removing "nbd_recv_coroutines_wake(s, true);" from
nbd_channel_error_locked() could be a separate preparatory patch. It's a
separate thing: there is no sense in waking all receiving coroutines on
error if they will wake each other in a chain anyway.


Signed-off-by: Paolo Bonzini 



Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  block/nbd.c | 15 +++
  1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index b5aac2548c..31c684772e 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -131,6 +131,7 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
  s->x_dirty_bitmap = NULL;
  }
  
+/* Called with s->receive_mutex taken.  */

  static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
  {
  if (req->receiving) {
@@ -142,12 +143,13 @@ static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
  return false;
  }
  
-static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s, bool all)

+static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s)
  {
  int i;
  
+    QEMU_LOCK_GUARD(&s->receive_mutex);

     for (i = 0; i < MAX_NBD_REQUESTS; i++) {
-        if (nbd_recv_coroutine_wake_one(&s->requests[i]) && !all) {
+        if (nbd_recv_coroutine_wake_one(&s->requests[i])) {
  return;
  }
  }
@@ -168,8 +170,6 @@ static void coroutine_fn nbd_channel_error_locked(BDRVNBDState *s, int ret)
  } else {
  s->state = NBD_CLIENT_QUIT;
  }
-
-nbd_recv_coroutines_wake(s, true);
  }
  
  static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)

@@ -432,11 +432,10 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
  
  qemu_coroutine_yield();

     /*
-     * We may be woken for 3 reasons:
+     * We may be woken for 2 reasons:
      * 1. From this function, executing in parallel coroutine, when our
      *    handle is received.
-     * 2. From nbd_channel_error(), when connection is lost.
-     * 3. From nbd_co_receive_one_chunk(), when previous request is
+     * 2. From nbd_co_receive_one_chunk(), when previous request is
      *    finished and s->reply.handle set to 0.
      * Anyway, it's OK to lock the mutex and go to the next iteration.
      */
@@ -928,7 +927,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
  }
  s->reply.handle = 0;
  
-nbd_recv_coroutines_wake(s, false);

+nbd_recv_coroutines_wake(s);
  
  return ret;

  }



--
Best regards,
Vladimir



[PATCH] hw/nvme: allow to pass a memory backend object for the CMB

2022-04-16 Thread Wertenbroek Rick
Add an optional cmbdev= option that takes a QEMU memory backend
object to be used for the CMB (Controller Memory Buffer).
This option takes precedence over cmb_size_mb= if both are used.
(The size will be deduced from the memory backend object.)

Signed-off-by: Rick Wertenbroek 
---
hw/nvme/ctrl.c | 65 ++
hw/nvme/nvme.h |  9 +++
2 files changed, 55 insertions(+), 19 deletions(-)
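
A concrete invocation, in the style of the examples in the code comment
below (the backend id and size are illustrative):

    -object memory-backend-ram,id=cmb0,size=64M \
    -device nvme,serial=deadbeef,cmbdev=cmb0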

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 03760ddeae..9bcc7d6db0 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -29,6 +29,7 @@
 *  -device nvme-subsys,id=,nqn=
 *  -device nvme,serial=,id=, \
 *  cmb_size_mb=, \
+ *  [cmbdev=,] \
 *  [pmrdev=,] \
 *  max_ioqpairs=, \
 *  aerl=,aer_max_queued=, \
@@ -44,6 +45,11 @@
 * offset 0 in BAR2 and supports only WDS, RDS and SQS for now. By default, the
 * device will use the "v1.4 CMB scheme" - use the `legacy-cmb` parameter to
 * always enable the CMBLOC and CMBSZ registers (v1.3 behavior).
+ * Enabling CMB emulation can also be achieved by pointing to a memory
+ * backend. For example:
+ * -object memory-backend-ram,id=,size= \
+ * -device nvme,...,cmbdev=
+ * cmbdev= will take precedence over cmb_size_mb= when both are provided.
 *
 * Enabling pmr emulation can be achieved by pointing to memory-backend-file.
 * For example:
@@ -341,16 +347,26 @@ static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
return false;
}

-    lo = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
-    hi = lo + int128_get64(n->cmb.mem.size);
+    if (n->cmb.dev) {
+        lo = n->params.legacy_cmb ?
+            host_memory_backend_get_memory(n->cmb.dev)->addr : n->cmb.cba;
+        hi = lo +
+            int128_get64(host_memory_backend_get_memory(n->cmb.dev)->size);
+    } else {
+        lo = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
+        hi = lo + int128_get64(n->cmb.mem.size);
+    }

return addr >= lo && addr < hi;
}

static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
{
-    hwaddr base = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
-    return &n->cmb.buf[addr - base];
+    if (n->cmb.dev) {
+        hwaddr base = n->params.legacy_cmb ?
+            host_memory_backend_get_memory(n->cmb.dev)->addr : n->cmb.cba;
+        return memory_region_get_ram_ptr(&n->cmb.dev->mr) + (addr - base);
+    } else {
+        hwaddr base = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
+        return &n->cmb.buf[addr - base];
+    }
}

static bool nvme_addr_is_pmr(NvmeCtrl *n, hwaddr addr)
@@ -6584,16 +6600,33 @@ static void nvme_init_state(NvmeCtrl *n)

static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
{
-uint64_t cmb_size = n->params.cmb_size_mb * MiB;
+uint64_t cmb_size;
 uint64_t cap = ldq_le_p(&n->bar.cap);

-    n->cmb.buf = g_malloc0(cmb_size);
-    memory_region_init_io(&n->cmb.mem, OBJECT(n), &nvme_cmb_ops, n,
-                          "nvme-cmb", cmb_size);
-    pci_register_bar(pci_dev, NVME_CMB_BIR,
-                     PCI_BASE_ADDRESS_SPACE_MEMORY |
-                     PCI_BASE_ADDRESS_MEM_TYPE_64 |
-                     PCI_BASE_ADDRESS_MEM_PREFETCH, &n->cmb.mem);
+if (n->cmb.dev) {
+if (n->params.cmb_size_mb) {
+            warn_report("Option cmb_size_mb is ignored when a cmbdev is provided");
+}
+n->params.cmb_size_mb = n->cmb.dev->size / MiB;
+cmb_size = n->cmb.dev->size;
+
+MemoryRegion *mr = host_memory_backend_get_memory(n->cmb.dev);
+assert(mr);
+
+pci_register_bar(pci_dev, NVME_CMB_BIR,
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_TYPE_64 |
+ PCI_BASE_ADDRESS_MEM_PREFETCH, mr);
+} else {
+cmb_size = n->params.cmb_size_mb * MiB;
+n->cmb.buf = g_malloc0(cmb_size);
+        memory_region_init_io(&n->cmb.mem, OBJECT(n), &nvme_cmb_ops, n,
+                              "nvme-cmb", cmb_size);
+        pci_register_bar(pci_dev, NVME_CMB_BIR,
+                         PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64 |
+                         PCI_BASE_ADDRESS_MEM_PREFETCH, &n->cmb.mem);
+}

NVME_CAP_SET_CMBS(cap, 1);
 stq_le_p(&n->bar.cap, cap);
@@ -6678,7 +6711,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
}
}

-if (n->params.cmb_size_mb) {
+if (n->params.cmb_size_mb || n->cmb.dev) {
nvme_init_cmb(n, pci_dev);
}

@@ -6793,7 +6826,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
NVME_CAP_SET_CSS(cap, NVME_CAP_CSS_CSI_SUPP);
NVME_CAP_SET_CSS(cap, NVME_CAP_CSS_ADMIN_ONLY);
NVME_CAP_SET_MPSMAX(cap, 4);
-NVME_CAP_SET_CMBS(cap, n->params.cmb_size_mb ? 1 : 0);
+NVME_CAP_SET_CMBS(cap, (n->params.cmb_size_mb || n->cmb.dev) ? 1 : 0);
NVME_CAP_SET_PMRS(cap, n->pmr.dev ? 1 : 0);
 stq_le_p(&n->bar.cap, cap);

@@ -6893,7 +6926,7 @@ static void 

Re: [PATCH v2 for-7.1 7/9] nbd: move s->state under requests_lock

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

Remove the confusing, and most likely wrong, atomics.  The only function
that used to be somewhat in a hot path was nbd_client_connected(),
but it is not anymore after the previous patches.

The same logic is used both to check if a request had to be reissued
and also in nbd_reconnecting_attempt().  The former cases are outside
requests_lock, while nbd_reconnecting_attempt() does have the lock,
therefore the two have been separated in the previous commit.
nbd_client_will_reconnect() can simply take s->requests_lock, while
nbd_reconnecting_attempt() can inline the access now that no
complicated atomics are involved.

Signed-off-by: Paolo Bonzini 



Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  block/nbd.c | 78 -
  1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 37d466e435..b5aac2548c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,7 +35,6 @@
  #include "qemu/option.h"
  #include "qemu/cutils.h"
  #include "qemu/main-loop.h"
-#include "qemu/atomic.h"
  
  #include "qapi/qapi-visit-sockets.h"

  #include "qapi/qmp/qstring.h"
@@ -72,10 +71,11 @@ typedef struct BDRVNBDState {
  NBDExportInfo info;
  
  /*

- * Protects free_sema, in_flight, requests[].coroutine,
+ * Protects state, free_sema, in_flight, requests[].coroutine,
   * reconnect_delay_timer.
   */
  QemuMutex requests_lock;
+NBDClientState state;
  CoQueue free_sema;
  int in_flight;
  NBDClientRequest requests[MAX_NBD_REQUESTS];
@@ -83,7 +83,6 @@ typedef struct BDRVNBDState {
  
  CoMutex send_mutex;

  CoMutex receive_mutex;
-NBDClientState state;
  
  QEMUTimer *open_timer;
  
@@ -132,11 +131,6 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)

  s->x_dirty_bitmap = NULL;
  }
  
-static bool nbd_client_connected(BDRVNBDState *s)

-{
-    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED;
-}
-
  static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
  {
  if (req->receiving) {
@@ -159,14 +153,15 @@ static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s, bool all)
  }
  }
  
-static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)

+/* Called with s->requests_lock held.  */
+static void coroutine_fn nbd_channel_error_locked(BDRVNBDState *s, int ret)
  {
-if (nbd_client_connected(s)) {
+if (s->state == NBD_CLIENT_CONNECTED) {
  qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
  }
  if (ret == -EIO) {
-if (nbd_client_connected(s)) {
+if (s->state == NBD_CLIENT_CONNECTED) {
  s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
  NBD_CLIENT_CONNECTING_NOWAIT;
  }
@@ -177,6 +172,12 @@ static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)
  nbd_recv_coroutines_wake(s, true);
  }
  
+static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)

+{
+    QEMU_LOCK_GUARD(&s->requests_lock);
+nbd_channel_error_locked(s, ret);
+}
+
  static void reconnect_delay_timer_del(BDRVNBDState *s)
  {
  if (s->reconnect_delay_timer) {
@@ -189,20 +190,18 @@ static void reconnect_delay_timer_cb(void *opaque)
  {
  BDRVNBDState *s = opaque;
  
-    if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) {

-s->state = NBD_CLIENT_CONNECTING_NOWAIT;
-nbd_co_establish_connection_cancel(s->conn);
-}
-
  reconnect_delay_timer_del(s);
+    WITH_QEMU_LOCK_GUARD(&s->requests_lock) {
+if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+return;
+}
+s->state = NBD_CLIENT_CONNECTING_NOWAIT;
+}
+nbd_co_establish_connection_cancel(s->conn);
  }
  
  static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)

  {
-    if (qatomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTING_WAIT) {
-return;
-}
-
  assert(!s->reconnect_delay_timer);
  s->reconnect_delay_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
   QEMU_CLOCK_REALTIME,
@@ -225,7 +224,9 @@ static void nbd_teardown_connection(BlockDriverState *bs)
  s->ioc = NULL;
  }
  
-s->state = NBD_CLIENT_QUIT;

+    WITH_QEMU_LOCK_GUARD(&s->requests_lock) {
+s->state = NBD_CLIENT_QUIT;
+}
  }
  
  static void open_timer_del(BDRVNBDState *s)

@@ -254,15 +255,15 @@ static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
  timer_mod(s->open_timer, expire_time_ns);
  }
  
-static bool nbd_client_connecting_wait(BDRVNBDState *s)

-{
-    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
-}
-
  static bool nbd_client_will_reconnect(BDRVNBDState *s)
  {
+    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
+/*
+ * Called only 

Re: [PATCH v2 for-7.1 6/9] nbd: code motion and function renaming

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

Prepare for the next patch, so that the diff is less confusing.

nbd_client_connecting is moved closer to the definition point.


Hmm, you mean closer to its usage point?
The original idea was to keep the simple state-reading helper definitions
together :)



nbd_client_connecting_wait() is kept only for the reconnection
logic; when it is used to check if a request has to be reissued,
use the renamed function nbd_client_will_reconnect().  In the
next patch, the two cases will have different locking requirements.

Signed-off-by: Paolo Bonzini 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  block/nbd.c | 24 ++--
  1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index a2414566d1..37d466e435 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -254,18 +254,15 @@ static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
  timer_mod(s->open_timer, expire_time_ns);
  }
  
-static bool nbd_client_connecting(BDRVNBDState *s)

-{
-    NBDClientState state = qatomic_load_acquire(&s->state);
-return state == NBD_CLIENT_CONNECTING_WAIT ||
-state == NBD_CLIENT_CONNECTING_NOWAIT;
-}
-
  static bool nbd_client_connecting_wait(BDRVNBDState *s)
  {
     return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
  }
  
+static bool nbd_client_will_reconnect(BDRVNBDState *s)

+{
+    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
+}
  /*
   * Update @bs with information learned during a completed negotiation process.
   * Return failure if the server's advertised options are incompatible with the
@@ -355,6 +352,13 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
  return 0;
  }
  
+static bool nbd_client_connecting(BDRVNBDState *s)

+{
+    NBDClientState state = qatomic_load_acquire(&s->state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
+}
+
  /* Called with s->requests_lock taken.  */
  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
  {
@@ -1190,7 +1194,7 @@ static int coroutine_fn nbd_co_request(BlockDriverState *bs, NBDRequest *request
  error_free(local_err);
  local_err = NULL;
  }
-} while (ret < 0 && nbd_client_connecting_wait(s));
+} while (ret < 0 && nbd_client_will_reconnect(s));
  
  return ret ? ret : request_ret;

  }
@@ -1249,7 +1253,7 @@ static int coroutine_fn nbd_client_co_preadv(BlockDriverState *bs, int64_t offse
  error_free(local_err);
  local_err = NULL;
  }
-} while (ret < 0 && nbd_client_connecting_wait(s));
+} while (ret < 0 && nbd_client_will_reconnect(s));
  
  return ret ? ret : request_ret;

  }
@@ -1407,7 +1411,7 @@ static int coroutine_fn nbd_client_co_block_status(
  error_free(local_err);
  local_err = NULL;
  }
-} while (ret < 0 && nbd_client_connecting_wait(s));
+} while (ret < 0 && nbd_client_will_reconnect(s));
  
  if (ret < 0 || request_ret < 0) {

  return ret ? ret : request_ret;



--
Best regards,
Vladimir



Re: [PATCH v2 for-7.1 5/9] nbd: use a QemuMutex to synchronize yanking, reconnection and coroutines

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

The condition for waiting on the s->free_sema queue depends on
both s->in_flight and s->state.  The latter is currently using
atomics, but this is quite dubious and probably wrong.

Because s->state is written in the main thread too, for example by
the yank callback, it cannot be protected by a CoMutex.  Introduce a
separate lock that can be used by nbd_co_send_request(); later on this
lock will also be used for s->state.  There will not be any contention
on the lock unless there is a yank or reconnect, so this is not
performance sensitive.

Signed-off-by: Paolo Bonzini 
---
  block/nbd.c | 46 +++---
  1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 62dd338ef3..a2414566d1 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -71,17 +71,22 @@ typedef struct BDRVNBDState {
  QIOChannel *ioc; /* The current I/O channel */
  NBDExportInfo info;
  
-CoMutex send_mutex;

+/*
+ * Protects free_sema, in_flight, requests[].coroutine,
+ * reconnect_delay_timer.
+ */


requests[].coroutine is read without the mutex in nbd_receive_replies().

reconnect_delay_timer_del() is called without the mutex in
nbd_cancel_in_flight() and in reconnect_delay_timer_cb().

Is that OK? Worth improving the comment?


+QemuMutex requests_lock;
  CoQueue free_sema;
-
-CoMutex receive_mutex;
  int in_flight;
+NBDClientRequest requests[MAX_NBD_REQUESTS];
+QEMUTimer *reconnect_delay_timer;
+
+CoMutex send_mutex;
+CoMutex receive_mutex;
  NBDClientState state;
  
-QEMUTimer *reconnect_delay_timer;

  QEMUTimer *open_timer;
  
-NBDClientRequest requests[MAX_NBD_REQUESTS];

  NBDReply reply;
  BlockDriverState *bs;
  
@@ -350,7 +355,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,

  return 0;
  }
  
-/* called under s->send_mutex */

+/* Called with s->requests_lock taken.  */
  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
  {
  bool blocking = nbd_client_connecting_wait(s);
@@ -382,9 +387,9 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
-    qemu_co_mutex_unlock(&s->send_mutex);
+    qemu_mutex_unlock(&s->requests_lock);
     nbd_co_do_establish_connection(s->bs, blocking, NULL);
-    qemu_co_mutex_lock(&s->send_mutex);
+    qemu_mutex_lock(&s->requests_lock);
  
  /*

   * The reconnect attempt is done (maybe successfully, maybe not), so
@@ -466,11 +471,10 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,
  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
  int rc, i = -1;
  
-    qemu_co_mutex_lock(&s->send_mutex);
-
+    qemu_mutex_lock(&s->requests_lock);
     while (s->in_flight == MAX_NBD_REQUESTS ||
            (!nbd_client_connected(s) && s->in_flight > 0)) {
-        qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
+        qemu_co_queue_wait(&s->free_sema, &s->requests_lock);
  }
  
  s->in_flight++;

@@ -491,13 +495,13 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,


Prior to this patch, if request sending failed we would not send further
requests. After the patch, we can send more requests after the failure,
once send_mutex is unlocked.

Maybe that's not bad..

What's wrong with keeping the send_mutex critical section as is and just
locking requests_lock additionally inside the send_mutex critical section?



  }
  }
  
-g_assert(qemu_in_coroutine());

  assert(i < MAX_NBD_REQUESTS);
-
  s->requests[i].coroutine = qemu_coroutine_self();
  s->requests[i].offset = request->from;
  s->requests[i].receiving = false;
+    qemu_mutex_unlock(&s->requests_lock);

+    qemu_co_mutex_lock(&s->send_mutex);

  request->handle = INDEX_TO_HANDLE(s, i);
  
  assert(s->ioc);

@@ -517,17 +521,19 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,
  } else {
  rc = nbd_send_request(s->ioc, request);
  }
+    qemu_co_mutex_unlock(&s->send_mutex);
  
-err:

  if (rc < 0) {
+        qemu_mutex_lock(&s->requests_lock);
+err:
  nbd_channel_error(s, rc);
  if (i != -1) {
  s->requests[i].coroutine = NULL;
  }
  s->in_flight--;
         qemu_co_queue_next(&s->free_sema);
+        qemu_mutex_unlock(&s->requests_lock);
     }
-    qemu_co_mutex_unlock(&s->send_mutex);
  return rc;
  }
  
@@ -1017,12 +1023,11 @@ static bool nbd_reply_chunk_iter_receive(BDRVNBDState *s,

  return true;
  
  break_loop:

+    qemu_mutex_lock(&s->requests_lock);
     s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
-
-    qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
-    qemu_co_mutex_unlock(&s->send_mutex);
+    qemu_mutex_unlock(&s->requests_lock);
  
  return false;

  }
@@ -1855,8 +1860,9 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
  BDRVNBDState *s = 

Re: [PATCH v2 for-7.1 4/9] nbd: keep send_mutex/free_sema handling outside nbd_co_do_establish_connection

2022-04-16 Thread Vladimir Sementsov-Ogievskiy

14.04.2022 20:57, Paolo Bonzini wrote:

Elevate s->in_flight early so that other incoming requests will wait
on the CoQueue in nbd_co_send_request; restart them after getting back
from nbd_reconnect_attempt.  This could be after the reconnect timer or
nbd_cancel_in_flight have cancelled the attempt, so there is no
need anymore to cancel the requests there.

nbd_co_send_request now handles both stopping and restarting pending
requests after a successful connection, and there is no need to
hold send_mutex in nbd_co_do_establish_connection.  The current setup
is confusing because nbd_co_do_establish_connection is called both with
send_mutex taken and without it.  Before the patch it uses free_sema which
(at least in theory...) is protected by send_mutex, after the patch it
does not anymore.

Signed-off-by: Paolo Bonzini 


Seems good to me. Still, again, it's hard to tell whether it can lead to
some new deadlocks or not. It seems not; at least I don't see a problem.
We will see.

Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  block/coroutines.h |  4 +--
  block/nbd.c| 66 ++
  2 files changed, 33 insertions(+), 37 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index b293e943c8..8f6e438ef3 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -59,7 +59,7 @@ int coroutine_fn bdrv_co_writev_vmstate(BlockDriverState *bs,
  QEMUIOVector *qiov, int64_t pos);
  
  int coroutine_fn

-nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp);
+nbd_co_do_establish_connection(BlockDriverState *bs, bool blocking, Error **errp);
  
  
  int coroutine_fn

@@ -109,7 +109,7 @@ bdrv_common_block_status_above(BlockDriverState *bs,
 BlockDriverState **file,
 int *depth);
  int generated_co_wrapper
-nbd_do_establish_connection(BlockDriverState *bs, Error **errp);
+nbd_do_establish_connection(BlockDriverState *bs, bool blocking, Error **errp);
  
  int generated_co_wrapper

  blk_do_preadv(BlockBackend *blk, int64_t offset, int64_t bytes,
diff --git a/block/nbd.c b/block/nbd.c
index a30603ce87..62dd338ef3 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -187,9 +187,6 @@ static void reconnect_delay_timer_cb(void *opaque)
     if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) {
         s->state = NBD_CLIENT_CONNECTING_NOWAIT;
         nbd_co_establish_connection_cancel(s->conn);
-        while (qemu_co_enter_next(&s->free_sema, NULL)) {
-/* Resume all queued requests */
-}
  }
  
  reconnect_delay_timer_del(s);


It seems removing the timer is not needed here either. We do
nbd_co_establish_connection_cancel(), and the timer will be removed in
nbd_reconnect_attempt() anyway.

Moreover, we don't need to keep the timer in the state at all: it should
become a local thing for nbd_reconnect_attempt(), a kind of
nbd_co_do_establish_connection() call with a timeout. That could be
refactored later, the same way as in patches 4-5 of my "[PATCH v4 0/7]
copy-before-write: on-cbw-error and cbw-timeout" series.


@@ -310,11 +307,10 @@ static int nbd_handle_updated_info(BlockDriverState *bs, Error **errp)
  }
  
  int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,

-Error **errp)
+bool blocking, Error **errp)
  {
  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
  int ret;
-bool blocking = nbd_client_connecting_wait(s);
  IO_CODE();
  
  assert(!s->ioc);

@@ -350,7 +346,6 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
  
  /* successfully connected */

  s->state = NBD_CLIENT_CONNECTED;
-    qemu_co_queue_restart_all(&s->free_sema);
  
  return 0;

  }
@@ -358,25 +353,25 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
  /* called under s->send_mutex */
  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
  {
-assert(nbd_client_connecting(s));
-assert(s->in_flight == 0);
-
-if (nbd_client_connecting_wait(s) && s->reconnect_delay &&
-!s->reconnect_delay_timer)
-{
-/*
- * It's first reconnect attempt after switching to
- * NBD_CLIENT_CONNECTING_WAIT
- */
-reconnect_delay_timer_init(s,
-qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
-s->reconnect_delay * NANOSECONDS_PER_SECOND);
-}
+bool blocking = nbd_client_connecting_wait(s);
  
  /*

   * Now we are sure that nobody is accessing the channel, and no one will
   * try until we set the state to CONNECTED.
   */
+assert(nbd_client_connecting(s));
+assert(s->in_flight == 1);
+
+if (blocking && !s->reconnect_delay_timer) {
+/*
+ * It's the first reconnect attempt after switching to
+ * NBD_CLIENT_CONNECTING_WAIT
+ */
+

[PATCH v2] hw/vfio/common: Fix a small boundary issue of a trace

2022-04-16 Thread chenxiang via
From: Xiang Chen 

Most places in the vfio trace code use [offset, offset + size - 1] to
indicate a range of length size (such as trace_vfio_region_region_mmap()),
except trace_vfio_region_sparse_mmap_entry(). So change it for
trace_vfio_region_sparse_mmap_entry() as well; but if size is zero the
trace would be weird with an underflow, so move the trace and emit it
only if size is not zero.

Signed-off-by: Xiang Chen 
---
 hw/vfio/common.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e3f5..345ea7bd8a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1544,11 +1544,10 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
 region->mmaps = g_new0(VFIOMmap, sparse->nr_areas);
 
 for (i = 0, j = 0; i < sparse->nr_areas; i++) {
-trace_vfio_region_sparse_mmap_entry(i, sparse->areas[i].offset,
-sparse->areas[i].offset +
-sparse->areas[i].size);
-
 if (sparse->areas[i].size) {
+trace_vfio_region_sparse_mmap_entry(i, sparse->areas[i].offset,
+sparse->areas[i].offset +
+sparse->areas[i].size - 1);
 region->mmaps[j].offset = sparse->areas[i].offset;
 region->mmaps[j].size = sparse->areas[i].size;
 j++;
-- 
2.33.0
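
The underflow in question, illustrated (hwaddr is an unsigned 64-bit type,
so offset + size - 1 wraps; the values here are made up):

    hwaddr offset = 0x1000, size = 0;
    hwaddr end = offset + size - 1;  /* 0xfff: an inverted range, and
                                      * 0xffffffffffffffff when offset
                                      * is 0 as well */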




[PATCH] softmmu/memory: Skip translation size instead of fixed granularity if translate() successfully

2022-04-16 Thread chenxiang via
From: Xiang Chen 

Currently memory_region_iommu_replay() does a full page table walk with a
fixed granularity (the minimum page size) whether translate() succeeds or
not. If translate() succeeds, we can instead skip by the translation size
(iotlb.addr_mask + 1) rather than the fixed granularity.

Signed-off-by: Xiang Chen 
---
 softmmu/memory.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index bfa5d5178c..ccfa19cf71 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1924,7 +1924,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 {
 MemoryRegion *mr = MEMORY_REGION(iommu_mr);
 IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
-hwaddr addr, granularity;
+hwaddr addr, granularity, def_granu;
 IOMMUTLBEntry iotlb;
 
 /* If the IOMMU has its own replay callback, override */
@@ -1933,12 +1933,15 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 return;
 }
 
-granularity = memory_region_iommu_get_min_page_size(iommu_mr);
+def_granu = memory_region_iommu_get_min_page_size(iommu_mr);
 
 for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
 iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
 if (iotlb.perm != IOMMU_NONE) {
             n->notify(n, &iotlb);
+granularity = iotlb.addr_mask + 1;
+} else {
+granularity = def_granu;
 }
 
 /* if (2^64 - MR size) < granularity, it's possible to get an
-- 
2.33.0
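
A worked example of the saving (numbers illustrative): with a 4 KiB
minimum page size and a successfully translated 2 MiB block, the old loop
calls translate() 512 times across the block, while after the patch one
success advances addr by the whole translation size:

    /* iotlb.addr_mask for a 2 MiB block is 0x1fffff, so: */
    granularity = iotlb.addr_mask + 1;   /* 0x200000, one 2 MiB step */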




[PATCH v2] hw/arm/smmuv3: Pass the actual perm to returned IOMMUTLBEntry in smmuv3_translate()

2022-04-16 Thread chenxiang via
From: Xiang Chen 

It always calls the IOMMU MR translate() callback with flag=IOMMU_NONE in
memory_region_iommu_replay(). Currently, smmuv3_translate() returns an
IOMMUTLBEntry with perm set to IOMMU_NONE even if the translation succeeds,
whereas it is expected to return the actual permission set in the table
entry.
So pass the actual perm from the table entry to the returned IOMMUTLBEntry.

Signed-off-by: Xiang Chen 
Reviewed-by: Eric Auger 
---
 hw/arm/smmuv3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 674623aabe..707eb430c2 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -760,7 +760,7 @@ epilogue:
 qemu_mutex_unlock(&s->mutex);
 switch (status) {
 case SMMU_TRANS_SUCCESS:
-entry.perm = flag;
+entry.perm = cached_entry->entry.perm;
 entry.translated_addr = cached_entry->entry.translated_addr +
 (addr & cached_entry->entry.addr_mask);
 entry.addr_mask = cached_entry->entry.addr_mask;
-- 
2.33.0




Re: [PATCH] hw/arm/smmuv3: Pass the real perm to returned IOMMUTLBEntry in smmuv3_translate()

2022-04-16 Thread chenxiang (M)

Hi Eric,


On 2022/4/15 0:02, Eric Auger wrote:

Hi Chenxiang,

On 4/7/22 9:57 AM, chenxiang via wrote:

From: Xiang Chen 

In function memory_region_iommu_replay(), it decides whether to notify()
or not according to the perm of the returned IOMMUTLBEntry. But for smmuv3,
the returned perm is always IOMMU_NONE even if the translation succeeds.

I think you should make it clear in the commit message that
memory_region_iommu_replay() always calls the IOMMU MR translate()
callback with flag=IOMMU_NONE and thus, currently, translate() returns
an IOMMUTLBEntry with perm set to IOMMU_NONE if the translation
succeeds, whereas it is expected to return the actual permission set in
the table entry.


Thank you for your comments.
I will change the commit message in next version.





Pass the real perm to returned IOMMUTLBEntry to avoid the issue.

Signed-off-by: Xiang Chen 
---
  hw/arm/smmuv3.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 674623aabe..707eb430c2 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -760,7 +760,7 @@ epilogue:
   qemu_mutex_unlock(&s->mutex);
  switch (status) {
  case SMMU_TRANS_SUCCESS:
-entry.perm = flag;
+entry.perm = cached_entry->entry.perm;

With that clarification
Reviewed-by: Eric Auger 


Ok, thanks



the translate() doc in ./include/exec/memory.h states
"
If IOMMU_NONE is passed then the IOMMU must do the
  * full page table walk and report the permissions in the returned
  * IOMMUTLBEntry. (Note that this implies that an IOMMU may not
  * return different mappings for reads and writes.)
"


Thanks

Eric

  entry.translated_addr = cached_entry->entry.translated_addr +
  (addr & cached_entry->entry.addr_mask);
  entry.addr_mask = cached_entry->entry.addr_mask;
