[PATCH] target/rx: set PSW.I when executing wait instruction
This patch fixes the implementation of the wait instruction to implicitly
update PSW.I as required by the ISA specification.

Signed-off-by: Tomoaki Kawada
---
 target/rx/op_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/rx/op_helper.c b/target/rx/op_helper.c
index 11f952d340..81645adde3 100644
--- a/target/rx/op_helper.c
+++ b/target/rx/op_helper.c
@@ -448,6 +448,7 @@ void QEMU_NORETURN helper_wait(CPURXState *env)
     cs->halted = 1;
     env->in_sleep = 1;
+    env->psw_i = 1;
     raise_exception(env, EXCP_HLT, 0);
 }
--
2.35.1
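The behavioural change is small enough to model in standalone C. This is a toy sketch whose struct and field names only mirror QEMU's CPURXState/CPUState fields; it is not the emulator code itself:

```c
#include <assert.h>

/* Minimal stand-in for the fields touched by helper_wait();
 * not the real QEMU structures. */
struct toy_rx_cpu {
    int halted;   /* cs->halted */
    int in_sleep; /* env->in_sleep */
    int psw_i;    /* PSW.I: the interrupt enable bit */
};

/* Model of helper_wait() after the patch: entering the wait state
 * must also set PSW.I, so that a later interrupt can wake the core. */
static void toy_helper_wait(struct toy_rx_cpu *cpu)
{
    cpu->halted = 1;
    cpu->in_sleep = 1;
    cpu->psw_i = 1; /* the one line added by the patch */
}
```

Before the fix, the model's `psw_i` would stay 0 after the wait, which is the divergence from the RX ISA that the patch corrects.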
Re: [PATCH v3] hw/misc: applesmc: use host osk as default on macs
Hi Pedro Torres,

Please note these threads:
https://patchew.org/QEMU/20211022161448.81579-1-yaroshchuk2...@gmail.com/
https://patchew.org/QEMU/20220113152836.60398-1-yaroshchuk2...@gmail.com/

There was a discussion there about how to preserve backward compatibility of the emulated AppleSMC behaviour. The discussion has stalled, but if you intend to see this feature working within QEMU, it would be good to cc all the participants of the previous threads :)

Thanks,
Vladislav Yaroshchuk
Re: [PATCH v2 for-7.1 0/9] nbd: actually make s->state thread-safe
On Thu, 14 Apr 2022 19:57:47 +0200 Paolo Bonzini wrote:
> The main point of this series is patch 7, which removes the dubious and
> probably wrong use of atomics in block/nbd.c. This in turn is enabled
> mostly by the cleanups in patches 3-5. Together, they introduce a
> QemuMutex that synchronizes the NBD client coroutines, the reconnect_delay
> timer and nbd_cancel_in_flight() as well.
>
> The fixes happen to remove an incorrect use of qemu_co_queue_restart_all
> and qemu_co_enter_next on the s->free_sema CoQueue, which was not guarded
> by s->send_mutex.
>
> The rest is bugfixes, simplifying the code a bit, and extra documentation.
>
> v1->v2:
> - cleaner patch 1
> - fix grammar in patch 4
> - split out patch 6
>
> Paolo Bonzini (9):
>   nbd: safeguard against waking up invalid coroutine
>   nbd: mark more coroutine_fns
>   nbd: remove peppering of nbd_client_connected
>   nbd: keep send_mutex/free_sema handling outside
>     nbd_co_do_establish_connection
>   nbd: use a QemuMutex to synchronize yanking, reconnection and
>     coroutines
>   nbd: code motion and function renaming
>   nbd: move s->state under requests_lock
>   nbd: take receive_mutex when reading requests[].receiving
>   nbd: document what is protected by the CoMutexes
>
>  block/coroutines.h |   4 +-
>  block/nbd.c        | 298 +++--
>  2 files changed, 154 insertions(+), 148 deletions(-)

For the whole series:

Reviewed-by: Lukas Straub

Regards,
Lukas
Re: [RFC PATCH 09/18] hw/arm/xlnx-zynqmp: convert cpu clusters to arm_cpus
On 30/3/22 14:56, Damien Hedde wrote:
> qom-paths of cpus are changed:
> + "apu-cluster/apu-cpu[n]" to "apu/cpu[n]"
> + "rpu-cluster/rpu-cpu[n]" to "rpu/cpu[n]"
>
> Signed-off-by: Damien Hedde
> ---
>  include/hw/arm/xlnx-zynqmp.h |   8 +--
>  hw/arm/xlnx-zynqmp.c         | 121 +--
>  2 files changed, 48 insertions(+), 81 deletions(-)
>
>  static void xlnx_zynqmp_create_bbram(XlnxZynqMPState *s, qemu_irq *gic)
> @@ -296,7 +282,8 @@ static void xlnx_zynqmp_create_apu_ctrl(XlnxZynqMPState *s, qemu_irq *gic)
>          g_autofree gchar *name = g_strdup_printf("cpu%d", i);
>
>          object_property_set_link(OBJECT(&s->apu_ctrl), name,
> -                                 OBJECT(&s->apu_cpu[i]), &error_abort);
> +                                 OBJECT(arm_cpus_get_cpu(&s->apu, i)),
> +                                 &error_abort);
>      }

Possible further improvement: XlnxZynqMPAPUCtrl can now take a link to one ArmCpusGroup instead of an array of ARMCPU.
Re: [RFC PATCH 08/18] hw/arm/arm_cpus: add arm_cpus device
On 30/3/22 14:56, Damien Hedde wrote:
> This object can be used to create a group of homogeneous arm cpus.
>
> Signed-off-by: Damien Hedde
> ---
>  include/hw/arm/arm_cpus.h | 45
>  hw/arm/arm_cpus.c         | 63 +++
>  hw/arm/meson.build        |  1 +
>  3 files changed, 109 insertions(+)
>  create mode 100644 include/hw/arm/arm_cpus.h
>  create mode 100644 hw/arm/arm_cpus.c
>
> +/**
> + * ArmCpusState:
> + * @reset_hivecs: use to initialize cpu's reset-hivecs
> + * @has_el3: use to initialize cpu's has_el3
> + * @has_el2: use to initialize cpu's has_el2
> + * @reset_cbar: use to initialize cpu's reset-cbar
> + */
> +struct ArmCpusState {
> +    CpusState parent_obj;
> +
> +    bool reset_hivecs;
> +    bool has_el3;
> +    bool has_el2;
> +    uint64_t reset_cbar;
> +};
> +
> +/*
> + * arm_cpus_get_cpu:
> + * Helper to get an ArmCpu from the container.
> + */
> +static inline ARMCPU *arm_cpus_get_cpu(ArmCpusState *s, unsigned i)

Maybe we can avoid using the 'State' suffix for CPU (while still using Class suffixes) and name this:

  ARMCPU *arm_cpus_group_get_cpu(ArmCpusGroup *s, unsigned index);
Re: [RFC PATCH 02/18] hw/cpu/cpus: introduce _cpus_ device
On 30/3/22 14:56, Damien Hedde wrote:
> This object will be a _cpu-cluster_ generalization and is meant to
> allow create cpus of the same type. The main goal is that this object,
> on contrary to _cpu-cluster-_, can be used to dynamically create cpus:
> it does not rely on external code to populate the object with cpus.
>
> Allowing the user to create a cpu cluster and each _cpu_ separately
> would be hard because of the following reasons:
> + cpu reset need to be handled
> + instantiation and realize of cpu-cluster and the cpus are interleaved
> + cpu cluster must contains only identical cpus and it seems difficult
>   to check that at runtime.
>
> Therefore we add a new type solving all this constraints. _cpu-cluster_
> will be updated to inherit from this class in following commits.
>
> Signed-off-by: Damien Hedde
> ---
>  include/hw/cpu/cpus.h |  71 +++
>  hw/cpu/cpus.c         | 127 ++
>  hw/cpu/meson.build    |   2 +-
>  3 files changed, 199 insertions(+), 1 deletion(-)
>  create mode 100644 include/hw/cpu/cpus.h
>  create mode 100644 hw/cpu/cpus.c
>
> diff --git a/include/hw/cpu/cpus.h b/include/hw/cpu/cpus.h
> new file mode 100644
> index 00..c65f568ef8
> --- /dev/null
> +++ b/include/hw/cpu/cpus.h
> @@ -0,0 +1,71 @@
> +/*
> + * QEMU CPUs type
> + *
> + * Copyright (c) 2022 GreenSocs
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef HW_CPU_CPUS_H
> +#define HW_CPU_CPUS_H
> +
> +#include "qemu/typedefs.h"
> +#include "hw/qdev-core.h"
> +#include "qom/object.h"
> +
> +/*
> + * This object represent several CPUs which are all identical.

Typo "represents".

> + *
> + * If CPUs are not identical (for example, Cortex-A53 and Cortex-A57
> + * CPUs in an Arm big.LITTLE system) they should be in different groups.
> + * If the CPUs do not have the same view of memory (for example the main
> + * CPU and a management controller processor) they should be in
> + * different groups.

This description calls for a clearer CpusGroupState name instead of CpusState (which confuses me with CPUState). Alternatively CpusArrayState.

> + *
> + * This is an abstract class, subclasses are supposed to be created on
> + * per-architecture basis to handle the specifics of the cpu
> + * architecture. Subclasses are meant to be user-creatable (for
> + * cold-plug).
> + */
> +
> +#define TYPE_CPUS "cpus"
> +OBJECT_DECLARE_TYPE(CpusState, CpusClass, CPUS)
> +
> +/**
> + * CpusState:
> + * @cpu_type: The type of cpu.
> + * @topology.cpus: The number of cpus in this group.
> + *                 Explicity put this field into a topology structure in
> + *                 order to eventually update this smoothly with a full
> + *                 CpuTopology structure in the future.
> + * @cpus: Array of pointer to cpu objects.
> + */
> +struct CpusState {
> +    /*< private >*/
> +    DeviceState parent_obj;
> +
> +    /*< public >*/
> +    char *cpu_type;
> +    struct {
> +        uint16_t cpus;
> +    } topology;
> +    CPUState **cpus;
> +};
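The container shape under discussion (one type string, one count, one array of CPU pointers, plus a bounds-checked accessor like arm_cpus_get_cpu()) can be sketched in standalone C. All `Toy*` names are invented and the QOM machinery is deliberately omitted:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a CPU object; illustration only. */
typedef struct ToyCpu {
    const char *type;
} ToyCpu;

/* A group of homogeneous CPUs, mirroring the CpusState fields in the
 * patch: one cpu_type, a count, and an array of pointers. */
typedef struct ToyCpusGroup {
    const char *cpu_type;
    unsigned ncpus;
    ToyCpu **cpus;
} ToyCpusGroup;

/* Create the whole group at once, so every member is guaranteed to
 * share the same type (the invariant the patch wants to enforce). */
static ToyCpusGroup *toy_cpus_group_new(const char *type, unsigned n)
{
    ToyCpusGroup *g = calloc(1, sizeof(*g));
    g->cpu_type = type;
    g->ncpus = n;
    g->cpus = calloc(n, sizeof(*g->cpus));
    for (unsigned i = 0; i < n; i++) {
        g->cpus[i] = calloc(1, sizeof(ToyCpu));
        g->cpus[i]->type = type; /* all members share one type */
    }
    return g;
}

/* Counterpart of arm_cpus_get_cpu(): bounds-checked element access. */
static ToyCpu *toy_cpus_group_get_cpu(ToyCpusGroup *g, unsigned i)
{
    assert(i < g->ncpus);
    return g->cpus[i];
}
```

Because construction happens in one place, the "cluster must contain only identical cpus" check from the commit message becomes structural rather than something validated at runtime.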
Re: [PATCH v2 for-7.1 9/9] nbd: document what is protected by the CoMutexes
14.04.2022 20:57, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini
> ---
>  block/nbd.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 31c684772e..d0d94b40bd 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -81,12 +81,18 @@ typedef struct BDRVNBDState {
>      NBDClientRequest requests[MAX_NBD_REQUESTS];
>      QEMUTimer *reconnect_delay_timer;
>
> +    /* Protects sending data on the socket. */
>      CoMutex send_mutex;
> +
> +    /*
> +     * Protects receiving reply headers from the socket, as well as the
> +     * fields reply and requests[].receiving

I think it's worth noting that s->reply is used without the mutex after nbd_receive_replies() succeeds and until s->reply.handle is set to 0 in nbd_co_receive_one_chunk().

Should "s->reply.handle = 0" be done under the mutex as well? And maybe in the same critical section as nbd_recv_coroutines_wake()?

> +     */
>      CoMutex receive_mutex;
> +    NBDReply reply;
>
>      QEMUTimer *open_timer;
>
> -    NBDReply reply;
>      BlockDriverState *bs;
>
>      /* Connection parameters */

--
Best regards,
Vladimir
Re: [PATCH v2 for-7.1 8/9] nbd: take receive_mutex when reading requests[].receiving
14.04.2022 20:57, Paolo Bonzini wrote:
> requests[].receiving is set by nbd_receive_replies() under the
> receive_mutex; read it under the same mutex as well.
>
> Waking up receivers on errors happens after each reply finishes
> processing, in nbd_co_receive_one_chunk(). If there is no
> currently-active reply, there are two cases:
>
> * either there is no active request at all, in which case no element
>   of request[] can have .receiving = true
>
> * or nbd_receive_replies() must be running and owns receive_mutex; in
>   that case it will get back to nbd_co_receive_one_chunk() because the
>   socket has been shutdown, and all waiting coroutines will wake up in
>   turn.

Seems that removing "nbd_recv_coroutines_wake(s, true);" from nbd_channel_error_locked() could be a separate preparatory patch. It's a separate thing: there is no sense in waking all receiving coroutines on error if they will wake each other in a chain anyway.

> Signed-off-by: Paolo Bonzini

Reviewed-by: Vladimir Sementsov-Ogievskiy

> ---
>  block/nbd.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index b5aac2548c..31c684772e 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -131,6 +131,7 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
>      s->x_dirty_bitmap = NULL;
>  }
>
> +/* Called with s->receive_mutex taken. */
>  static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
>  {
>      if (req->receiving) {
> @@ -142,12 +143,13 @@ static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
>      return false;
>  }
>
> -static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s, bool all)
> +static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s)
>  {
>      int i;
>
> +    QEMU_LOCK_GUARD(&s->receive_mutex);
>      for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> -        if (nbd_recv_coroutine_wake_one(&s->requests[i]) && !all) {
> +        if (nbd_recv_coroutine_wake_one(&s->requests[i])) {
>              return;
>          }
>      }
> @@ -168,8 +170,6 @@ static void coroutine_fn nbd_channel_error_locked(BDRVNBDState *s, int ret)
>      } else {
>          s->state = NBD_CLIENT_QUIT;
>      }
> -
> -    nbd_recv_coroutines_wake(s, true);
>  }
>
>  static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)
> @@ -432,11 +432,10 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
>              qemu_coroutine_yield();
>              /*
> -             * We may be woken for 3 reasons:
> +             * We may be woken for 2 reasons:
>               * 1. From this function, executing in parallel coroutine, when our
>               *    handle is received.
> -             * 2. From nbd_channel_error(), when connection is lost.
> -             * 3. From nbd_co_receive_one_chunk(), when previous request is
> +             * 2. From nbd_co_receive_one_chunk(), when previous request is
>               *    finished and s->reply.handle set to 0.
>               * Anyway, it's OK to lock the mutex and go to the next iteration.
>               */
> @@ -928,7 +927,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
>      }
>      s->reply.handle = 0;
>
> -    nbd_recv_coroutines_wake(s, false);
> +    nbd_recv_coroutines_wake(s);
>
>      return ret;
>  }

--
Best regards,
Vladimir
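The wake-one-then-chain policy this patch settles on can be modelled without coroutines. This is a hedged standalone sketch with invented `toy_*` names, not the QEMU code:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_REQUESTS 4

struct toy_req {
    bool receiving; /* waiting for its reply header */
    int woken;      /* how many times it was woken */
};

/* Model of nbd_recv_coroutine_wake_one(): wake a request only if it
 * is actually waiting, and report whether a wake happened. */
static bool toy_wake_one(struct toy_req *req)
{
    if (req->receiving) {
        req->receiving = false;
        req->woken++;
        return true;
    }
    return false;
}

/* Model of nbd_recv_coroutines_wake() after the patch: wake at most
 * one waiter per call. The woken side is responsible for waking the
 * next waiter in turn, so a wake-all variant is unnecessary, which is
 * why the `bool all` parameter could be dropped. */
static void toy_wake(struct toy_req *reqs)
{
    for (int i = 0; i < TOY_MAX_REQUESTS; i++) {
        if (toy_wake_one(&reqs[i])) {
            return;
        }
    }
}
```

Repeated calls then wake the remaining waiters one at a time, mirroring the chain described in the commit message.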
[PATCH] hw/nvme: allow to pass a memory backend object for the CMB
Adds the optional cmbdev= option that takes a QEMU memory backend object
to be used for the CMB (Controller Memory Buffer). This option takes
precedence over cmb_size_mb= if both are used (the size will be deduced
from the memory backend option).

Signed-off-by: Rick Wertenbroek
---
 hw/nvme/ctrl.c | 65 ++
 hw/nvme/nvme.h |  9 +++
 2 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 03760ddeae..9bcc7d6db0 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -29,6 +29,7 @@
  *      -device nvme-subsys,id=,nqn=
  *      -device nvme,serial=,id=, \
  *              cmb_size_mb=, \
+ *              [cmbdev=,] \
  *              [pmrdev=,] \
  *              max_ioqpairs=, \
  *              aerl=,aer_max_queued=, \
@@ -44,6 +45,11 @@
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now. By default, the
  * device will use the "v1.4 CMB scheme" - use the `legacy-cmb` parameter to
  * always enable the CMBLOC and CMBSZ registers (v1.3 behavior).
+ * Enabling cmb emulation can also be achieved by pointing to a memory-backend
+ * For example:
+ * -object memory-backend-ram,id=,size= \
+ * -device nvme,...,cmbdev=
+ * cmbdev= will take precedence over cmb_size_mb= when both provided.
  *
  * Enabling pmr emulation can be achieved by pointing to memory-backend-file.
  * For example:
@@ -341,16 +347,26 @@ static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
         return false;
     }

-    lo = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
-    hi = lo + int128_get64(n->cmb.mem.size);
+    if (n->cmb.dev) {
+        lo = n->params.legacy_cmb ? host_memory_backend_get_memory(n->cmb.dev)->addr : n->cmb.cba;
+        hi = lo + int128_get64(host_memory_backend_get_memory(n->cmb.dev)->size);
+    } else {
+        lo = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
+        hi = lo + int128_get64(n->cmb.mem.size);
+    }

     return addr >= lo && addr < hi;
 }

 static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
 {
-    hwaddr base = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
-    return &n->cmb.buf[addr - base];
+    if (n->cmb.dev) {
+        hwaddr base = n->params.legacy_cmb ? host_memory_backend_get_memory(n->cmb.dev)->addr : n->cmb.cba;
+        return memory_region_get_ram_ptr(&n->cmb.dev->mr) + (addr - base);
+    } else {
+        hwaddr base = n->params.legacy_cmb ? n->cmb.mem.addr : n->cmb.cba;
+        return &n->cmb.buf[addr - base];
+    }
 }

 static bool nvme_addr_is_pmr(NvmeCtrl *n, hwaddr addr)
@@ -6584,16 +6600,33 @@ static void nvme_init_state(NvmeCtrl *n)

 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
 {
-    uint64_t cmb_size = n->params.cmb_size_mb * MiB;
+    uint64_t cmb_size;
     uint64_t cap = ldq_le_p(&n->bar.cap);

-    n->cmb.buf = g_malloc0(cmb_size);
-    memory_region_init_io(&n->cmb.mem, OBJECT(n), &nvme_cmb_ops, n,
-                          "nvme-cmb", cmb_size);
-    pci_register_bar(pci_dev, NVME_CMB_BIR,
-                     PCI_BASE_ADDRESS_SPACE_MEMORY |
-                     PCI_BASE_ADDRESS_MEM_TYPE_64 |
-                     PCI_BASE_ADDRESS_MEM_PREFETCH, &n->cmb.mem);
+    if (n->cmb.dev) {
+        if (n->params.cmb_size_mb) {
+            warn_report("Option cmb_size_mb is ignored when a cmbdev is provided");
+        }
+        n->params.cmb_size_mb = n->cmb.dev->size / MiB;
+        cmb_size = n->cmb.dev->size;
+
+        MemoryRegion *mr = host_memory_backend_get_memory(n->cmb.dev);
+        assert(mr);
+
+        pci_register_bar(pci_dev, NVME_CMB_BIR,
+                         PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64 |
+                         PCI_BASE_ADDRESS_MEM_PREFETCH, mr);
+    } else {
+        cmb_size = n->params.cmb_size_mb * MiB;
+        n->cmb.buf = g_malloc0(cmb_size);
+        memory_region_init_io(&n->cmb.mem, OBJECT(n), &nvme_cmb_ops, n,
+                              "nvme-cmb", cmb_size);
+        pci_register_bar(pci_dev, NVME_CMB_BIR,
+                         PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64 |
+                         PCI_BASE_ADDRESS_MEM_PREFETCH, &n->cmb.mem);
+    }

     NVME_CAP_SET_CMBS(cap, 1);
     stq_le_p(&n->bar.cap, cap);
@@ -6678,7 +6711,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
         }
     }

-    if (n->params.cmb_size_mb) {
+    if (n->params.cmb_size_mb || n->cmb.dev) {
         nvme_init_cmb(n, pci_dev);
     }

@@ -6793,7 +6826,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_CSS(cap, NVME_CAP_CSS_CSI_SUPP);
     NVME_CAP_SET_CSS(cap, NVME_CAP_CSS_ADMIN_ONLY);
     NVME_CAP_SET_MPSMAX(cap, 4);
-    NVME_CAP_SET_CMBS(cap, n->params.cmb_size_mb ? 1 : 0);
+    NVME_CAP_SET_CMBS(cap, (n->params.cmb_size_mb || n->cmb.dev) ? 1 : 0);
     NVME_CAP_SET_PMRS(cap, n->pmr.dev ? 1 : 0);
     stq_le_p(&n->bar.cap, cap);

@@ -6893,7 +6926,7 @@ static void
Re: [PATCH v2 for-7.1 7/9] nbd: move s->state under requests_lock
14.04.2022 20:57, Paolo Bonzini wrote:
> Remove the confusing, and most likely wrong, atomics. The only function
> that used to be somewhat in a hot path was nbd_client_connected(), but
> it is not anymore after the previous patches.
>
> The same logic is used both to check if a request had to be reissued
> and also in nbd_reconnecting_attempt(). The former cases are outside
> requests_lock, while nbd_reconnecting_attempt() does have the lock,
> therefore the two have been separated in the previous commit.
> nbd_client_will_reconnect() can simply take s->requests_lock, while
> nbd_reconnecting_attempt() can inline the access now that no
> complicated atomics are involved.
>
> Signed-off-by: Paolo Bonzini

Reviewed-by: Vladimir Sementsov-Ogievskiy

> ---
>  block/nbd.c | 78 -
>  1 file changed, 41 insertions(+), 37 deletions(-)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 37d466e435..b5aac2548c 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -35,7 +35,6 @@
>  #include "qemu/option.h"
>  #include "qemu/cutils.h"
>  #include "qemu/main-loop.h"
> -#include "qemu/atomic.h"
>  #include "qapi/qapi-visit-sockets.h"
>  #include "qapi/qmp/qstring.h"
> @@ -72,10 +71,11 @@ typedef struct BDRVNBDState {
>      NBDExportInfo info;
>
>      /*
> -     * Protects free_sema, in_flight, requests[].coroutine,
> +     * Protects state, free_sema, in_flight, requests[].coroutine,
>       * reconnect_delay_timer.
>       */
>      QemuMutex requests_lock;
> +    NBDClientState state;
>      CoQueue free_sema;
>      int in_flight;
>      NBDClientRequest requests[MAX_NBD_REQUESTS];
> @@ -83,7 +83,6 @@ typedef struct BDRVNBDState {
>      CoMutex send_mutex;
>      CoMutex receive_mutex;
> -    NBDClientState state;
>
>      QEMUTimer *open_timer;
> @@ -132,11 +131,6 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
>      s->x_dirty_bitmap = NULL;
>  }
>
> -static bool nbd_client_connected(BDRVNBDState *s)
> -{
> -    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED;
> -}
> -
>  static bool coroutine_fn nbd_recv_coroutine_wake_one(NBDClientRequest *req)
>  {
>      if (req->receiving) {
> @@ -159,14 +153,15 @@ static void coroutine_fn nbd_recv_coroutines_wake(BDRVNBDState *s, bool all)
>      }
>  }
>
> -static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)
> +/* Called with s->requests_lock held. */
> +static void coroutine_fn nbd_channel_error_locked(BDRVNBDState *s, int ret)
>  {
> -    if (nbd_client_connected(s)) {
> +    if (s->state == NBD_CLIENT_CONNECTED) {
>          qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
>      }

Reviewed-by: Vladimir Sementsov-Ogievskiy

>      if (ret == -EIO) {
> -        if (nbd_client_connected(s)) {
> +        if (s->state == NBD_CLIENT_CONNECTED) {
>              s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
>                                              NBD_CLIENT_CONNECTING_NOWAIT;
>          }
> @@ -177,6 +172,12 @@ static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)
>      nbd_recv_coroutines_wake(s, true);
>  }
>
> +static void coroutine_fn nbd_channel_error(BDRVNBDState *s, int ret)
> +{
> +    QEMU_LOCK_GUARD(&s->requests_lock);
> +    nbd_channel_error_locked(s, ret);
> +}
> +
>  static void reconnect_delay_timer_del(BDRVNBDState *s)
>  {
>      if (s->reconnect_delay_timer) {
> @@ -189,20 +190,18 @@ static void reconnect_delay_timer_cb(void *opaque)
>  {
>      BDRVNBDState *s = opaque;
>
> -    if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) {
> -        s->state = NBD_CLIENT_CONNECTING_NOWAIT;
> -        nbd_co_establish_connection_cancel(s->conn);
> -    }
> -
>      reconnect_delay_timer_del(s);
> +    WITH_QEMU_LOCK_GUARD(&s->requests_lock) {
> +        if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
> +            return;
> +        }
> +        s->state = NBD_CLIENT_CONNECTING_NOWAIT;
> +    }
> +    nbd_co_establish_connection_cancel(s->conn);
>  }
>
>  static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
>  {
> -    if (qatomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTING_WAIT) {
> -        return;
> -    }
> -
>      assert(!s->reconnect_delay_timer);
>      s->reconnect_delay_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
>                                               QEMU_CLOCK_REALTIME,
> @@ -225,7 +224,9 @@ static void nbd_teardown_connection(BlockDriverState *bs)
>          s->ioc = NULL;
>      }
>
> -    s->state = NBD_CLIENT_QUIT;
> +    WITH_QEMU_LOCK_GUARD(&s->requests_lock) {
> +        s->state = NBD_CLIENT_QUIT;
> +    }
>  }
>
>  static void open_timer_del(BDRVNBDState *s)
> @@ -254,15 +255,15 @@ static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
>      timer_mod(s->open_timer, expire_time_ns);
>  }
>
> -static bool nbd_client_connecting_wait(BDRVNBDState *s)
> -{
> -    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
> -}
> -
>  static bool nbd_client_will_reconnect(BDRVNBDState *s)
>  {
> -    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
> +    /*
> +     * Called only
Re: [PATCH v2 for-7.1 6/9] nbd: code motion and function renaming
14.04.2022 20:57, Paolo Bonzini wrote:
> Prepare for the next patch, so that the diff is less confusing.
>
> nbd_client_connecting is moved closer to the definition point.

Hmm, closer to the usage point, you mean? The original idea was to keep the simple state-reading helpers' definitions together :)

> nbd_client_connecting_wait() is kept only for the reconnection logic;
> when it is used to check if a request has to be reissued, use the
> renamed function nbd_client_will_reconnect(). In the next patch, the
> two cases will have different locking requirements.
>
> Signed-off-by: Paolo Bonzini

Reviewed-by: Vladimir Sementsov-Ogievskiy

> ---
>  block/nbd.c | 24 ++--
>  1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index a2414566d1..37d466e435 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -254,18 +254,15 @@ static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
>      timer_mod(s->open_timer, expire_time_ns);
>  }
>
> -static bool nbd_client_connecting(BDRVNBDState *s)
> -{
> -    NBDClientState state = qatomic_load_acquire(&s->state);
> -    return state == NBD_CLIENT_CONNECTING_WAIT ||
> -        state == NBD_CLIENT_CONNECTING_NOWAIT;
> -}
> -
>  static bool nbd_client_connecting_wait(BDRVNBDState *s)
>  {
>      return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
>  }
>
> +static bool nbd_client_will_reconnect(BDRVNBDState *s)
> +{
> +    return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
> +}
> +
>  /*
>   * Update @bs with information learned during a completed negotiation process.
>   * Return failure if the server's advertised options are incompatible with the
> @@ -355,6 +352,13 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>      return 0;
>  }
>
> +static bool nbd_client_connecting(BDRVNBDState *s)
> +{
> +    NBDClientState state = qatomic_load_acquire(&s->state);
> +    return state == NBD_CLIENT_CONNECTING_WAIT ||
> +        state == NBD_CLIENT_CONNECTING_NOWAIT;
> +}
> +
>  /* Called with s->requests_lock taken. */
>  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>  {
> @@ -1190,7 +1194,7 @@ static int coroutine_fn nbd_co_request(BlockDriverState *bs, NBDRequest *request
>              error_free(local_err);
>              local_err = NULL;
>          }
> -    } while (ret < 0 && nbd_client_connecting_wait(s));
> +    } while (ret < 0 && nbd_client_will_reconnect(s));
>
>      return ret ? ret : request_ret;
>  }
> @@ -1249,7 +1253,7 @@ static int coroutine_fn nbd_client_co_preadv(BlockDriverState *bs, int64_t offse
>              error_free(local_err);
>              local_err = NULL;
>          }
> -    } while (ret < 0 && nbd_client_connecting_wait(s));
> +    } while (ret < 0 && nbd_client_will_reconnect(s));
>
>      return ret ? ret : request_ret;
>  }
> @@ -1407,7 +1411,7 @@ static int coroutine_fn nbd_client_co_block_status(
>              error_free(local_err);
>              local_err = NULL;
>          }
> -    } while (ret < 0 && nbd_client_connecting_wait(s));
> +    } while (ret < 0 && nbd_client_will_reconnect(s));
>
>      if (ret < 0 || request_ret < 0) {
>          return ret ? ret : request_ret;

--
Best regards,
Vladimir
Re: [PATCH v2 for-7.1 5/9] nbd: use a QemuMutex to synchronize yanking, reconnection and coroutines
14.04.2022 20:57, Paolo Bonzini wrote:
> The condition for waiting on the s->free_sema queue depends on both
> s->in_flight and s->state. The latter is currently using atomics, but
> this is quite dubious and probably wrong.
>
> Because s->state is written in the main thread too, for example by the
> yank callback, it cannot be protected by a CoMutex. Introduce a
> separate lock that can be used by nbd_co_send_request(); later on this
> lock will also be used for s->state. There will not be any contention
> on the lock unless there is a yank or reconnect, so this is not
> performance sensitive.
>
> Signed-off-by: Paolo Bonzini
> ---
>  block/nbd.c | 46 +++---
>  1 file changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 62dd338ef3..a2414566d1 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -71,17 +71,22 @@ typedef struct BDRVNBDState {
>      QIOChannel *ioc; /* The current I/O channel */
>      NBDExportInfo info;
>
> -    CoMutex send_mutex;
> +    /*
> +     * Protects free_sema, in_flight, requests[].coroutine,
> +     * reconnect_delay_timer.
> +     */

requests[].coroutine is read without the mutex in nbd_receive_replies(), and reconnect_delay_timer_del() is called without the mutex in nbd_cancel_in_flight() and in reconnect_delay_timer_cb(). Is that OK? Worth improving the comment?

> +    QemuMutex requests_lock;
>      CoQueue free_sema;
> -
> -    CoMutex receive_mutex;
>      int in_flight;
> +    NBDClientRequest requests[MAX_NBD_REQUESTS];
> +    QEMUTimer *reconnect_delay_timer;
> +
> +    CoMutex send_mutex;
> +    CoMutex receive_mutex;
>      NBDClientState state;
>
> -    QEMUTimer *reconnect_delay_timer;
>      QEMUTimer *open_timer;
>
> -    NBDClientRequest requests[MAX_NBD_REQUESTS];
>      NBDReply reply;
>      BlockDriverState *bs;
>
> @@ -350,7 +355,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>      return 0;
>  }
>
> -/* called under s->send_mutex */
> +/* Called with s->requests_lock taken. */
>  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>  {
>      bool blocking = nbd_client_connecting_wait(s);
> @@ -382,9 +387,9 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>          s->ioc = NULL;
>      }
>
> -    qemu_co_mutex_unlock(&s->send_mutex);
> +    qemu_mutex_unlock(&s->requests_lock);
>      nbd_co_do_establish_connection(s->bs, blocking, NULL);
> -    qemu_co_mutex_lock(&s->send_mutex);
> +    qemu_mutex_lock(&s->requests_lock);
>
>      /*
>       * The reconnect attempt is done (maybe successfully, maybe not), so
> @@ -466,11 +471,10 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,
>      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>      int rc, i = -1;
>
> -    qemu_co_mutex_lock(&s->send_mutex);
> -
> +    qemu_mutex_lock(&s->requests_lock);
>      while (s->in_flight == MAX_NBD_REQUESTS ||
>             (!nbd_client_connected(s) && s->in_flight > 0)) {
> -        qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
> +        qemu_co_queue_wait(&s->free_sema, &s->requests_lock);
>      }
>
>      s->in_flight++;
> @@ -491,13 +495,13 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,

Prior to this patch, if request sending failed we would not send further requests. After the patch, we can send more requests after a failure, once send_mutex is unlocked. Maybe that's not bad. But what's wrong with keeping the send_mutex critical section as it is and just taking requests_lock additionally inside the send_mutex critical section?

>          }
>      }
>
> -    g_assert(qemu_in_coroutine());
>      assert(i < MAX_NBD_REQUESTS);
> -
>      s->requests[i].coroutine = qemu_coroutine_self();
>      s->requests[i].offset = request->from;
>      s->requests[i].receiving = false;
> +    qemu_mutex_unlock(&s->requests_lock);
>
> +    qemu_co_mutex_lock(&s->send_mutex);
>      request->handle = INDEX_TO_HANDLE(s, i);
>
>      assert(s->ioc);
> @@ -517,17 +521,19 @@ static int coroutine_fn nbd_co_send_request(BlockDriverState *bs,
>          } else {
>              rc = nbd_send_request(s->ioc, request);
>          }
> +    qemu_co_mutex_unlock(&s->send_mutex);
>
> -err:
>      if (rc < 0) {
> +        qemu_mutex_lock(&s->requests_lock);
> +err:
>          nbd_channel_error(s, rc);
>          if (i != -1) {
>              s->requests[i].coroutine = NULL;
>          }
>          s->in_flight--;
>          qemu_co_queue_next(&s->free_sema);
> +        qemu_mutex_unlock(&s->requests_lock);
>      }
> -    qemu_co_mutex_unlock(&s->send_mutex);
>      return rc;
>  }
> @@ -1017,12 +1023,11 @@ static bool nbd_reply_chunk_iter_receive(BDRVNBDState *s,
>      return true;
>
>  break_loop:
> +    qemu_mutex_lock(&s->requests_lock);
>      s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
> -
> -    qemu_co_mutex_lock(&s->send_mutex);
>      s->in_flight--;
>      qemu_co_queue_next(&s->free_sema);
> -    qemu_co_mutex_unlock(&s->send_mutex);
> +    qemu_mutex_unlock(&s->requests_lock);
>
>      return false;
>  }
> @@ -1855,8 +1860,9 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
>      BDRVNBDState *s =
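The two-phase locking this patch moves toward (claim a request slot under requests_lock, then send while holding only send_mutex) can be illustrated with a standalone toy model. All `toy_*` names are invented, and the "locks" are plain flags with assertions standing in for QemuMutex/CoMutex, so the never-held-together invariant is actually checked:

```c
#include <assert.h>

#define TOY_MAX_REQUESTS 4

/* Toy two-lock client modelling the split in the patch:
 * requests_locked guards slot bookkeeping, send_locked guards the
 * (simulated) socket. They are never nested, which is what rules out
 * lock-ordering deadlocks between the two. */
struct toy_client {
    int requests_locked; /* stand-in for QemuMutex requests_lock */
    int send_locked;     /* stand-in for CoMutex send_mutex */
    int in_flight;
    int sent;
};

static void toy_lock(int *lock, const int *other)
{
    assert(!*lock && !*other); /* never hold both locks at once */
    *lock = 1;
}

static void toy_unlock(int *lock)
{
    assert(*lock);
    *lock = 0;
}

/* Model of nbd_co_send_request() after the patch: claim a slot under
 * requests_lock, drop it, then send under send_mutex only. */
static int toy_send_request(struct toy_client *c)
{
    toy_lock(&c->requests_locked, &c->send_locked);
    if (c->in_flight == TOY_MAX_REQUESTS) {
        toy_unlock(&c->requests_locked);
        return -1; /* the real code waits on free_sema here */
    }
    c->in_flight++;
    toy_unlock(&c->requests_locked);

    toy_lock(&c->send_locked, &c->requests_locked);
    c->sent++; /* simulated socket write */
    toy_unlock(&c->send_locked);
    return 0;
}

/* Completion path releases the slot under requests_lock again. */
static void toy_request_done(struct toy_client *c)
{
    toy_lock(&c->requests_locked, &c->send_locked);
    c->in_flight--;
    toy_unlock(&c->requests_locked);
}
```

Note the trade-off Vladimir raises above: because the slot is released before `send_locked` is taken, another request may be sent between a failure and its cleanup, which the old single-mutex scheme prevented.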
Re: [PATCH v2 for-7.1 4/9] nbd: keep send_mutex/free_sema handling outside nbd_co_do_establish_connection
14.04.2022 20:57, Paolo Bonzini wrote:
> Elevate s->in_flight early so that other incoming requests will wait
> on the CoQueue in nbd_co_send_request; restart them after getting back
> from nbd_reconnect_attempt. This could be after the reconnect timer or
> nbd_cancel_in_flight have cancelled the attempt, so there is no need
> anymore to cancel the requests there.
>
> nbd_co_send_request now handles both stopping and restarting pending
> requests after a successful connection, and there is no need to hold
> send_mutex in nbd_co_do_establish_connection. The current setup is
> confusing because nbd_co_do_establish_connection is called both with
> send_mutex taken and without it. Before the patch it uses free_sema
> which (at least in theory...) is protected by send_mutex, after the
> patch it does not anymore.
>
> Signed-off-by: Paolo Bonzini

Seems good to me. Still, again, it's hard to be sure whether it can lead to some new deadlock or not. Seems not; at least I don't see a problem. We will see.

Reviewed-by: Vladimir Sementsov-Ogievskiy

> ---
>  block/coroutines.h |  4 +--
>  block/nbd.c        | 66 ++
>  2 files changed, 33 insertions(+), 37 deletions(-)
>
> diff --git a/block/coroutines.h b/block/coroutines.h
> index b293e943c8..8f6e438ef3 100644
> --- a/block/coroutines.h
> +++ b/block/coroutines.h
> @@ -59,7 +59,7 @@ int coroutine_fn
>  bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
>
>  int coroutine_fn
> -nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp);
> +nbd_co_do_establish_connection(BlockDriverState *bs, bool blocking, Error **errp);
>
>  int coroutine_fn
> @@ -109,7 +109,7 @@ bdrv_common_block_status_above(BlockDriverState *bs,
>                                 BlockDriverState **file, int *depth);
>
>  int generated_co_wrapper
> -nbd_do_establish_connection(BlockDriverState *bs, Error **errp);
> +nbd_do_establish_connection(BlockDriverState *bs, bool blocking, Error **errp);
>
>  int generated_co_wrapper
>  blk_do_preadv(BlockBackend *blk, int64_t offset, int64_t bytes,
>
> diff --git a/block/nbd.c b/block/nbd.c
> index a30603ce87..62dd338ef3 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -187,9 +187,6 @@ static void reconnect_delay_timer_cb(void *opaque)
>      if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) {
>          s->state = NBD_CLIENT_CONNECTING_NOWAIT;
>          nbd_co_establish_connection_cancel(s->conn);
> -        while (qemu_co_enter_next(&s->free_sema, NULL)) {
> -            /* Resume all queued requests */
> -        }
>      }
>
>      reconnect_delay_timer_del(s);

It seems removing the timer is not needed here either: we do nbd_co_establish_connection_cancel(), and the timer will be removed in nbd_reconnect_attempt() anyway. Moreover, we don't need to keep the timer in the state at all: it could become a local thing for nbd_reconnect_attempt(), a kind of nbd_co_do_establish_connection() call with a timeout. That could be refactored later, using the same approach as in patches 4-5 of my "[PATCH v4 0/7] copy-before-write: on-cbw-error and cbw-timeout" series.

> @@ -310,11 +307,10 @@ static int nbd_handle_updated_info(BlockDriverState *bs, Error **errp)
>  }
>
>  int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
> -                                                Error **errp)
> +                                                bool blocking, Error **errp)
>  {
>      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>      int ret;
> -    bool blocking = nbd_client_connecting_wait(s);
>      IO_CODE();
>
>      assert(!s->ioc);
> @@ -350,7 +346,6 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>
>      /* successfully connected */
>      s->state = NBD_CLIENT_CONNECTED;
> -    qemu_co_queue_restart_all(&s->free_sema);
>
>      return 0;
>  }
> @@ -358,25 +353,25 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>  /* called under s->send_mutex */
>  static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>  {
> -    assert(nbd_client_connecting(s));
> -    assert(s->in_flight == 0);
> -
> -    if (nbd_client_connecting_wait(s) && s->reconnect_delay &&
> -        !s->reconnect_delay_timer)
> -    {
> -        /*
> -         * It's first reconnect attempt after switching to
> -         * NBD_CLIENT_CONNECTING_WAIT
> -         */
> -        reconnect_delay_timer_init(s,
> -            qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
> -            s->reconnect_delay * NANOSECONDS_PER_SECOND);
> -    }
> +    bool blocking = nbd_client_connecting_wait(s);
>
>      /*
>       * Now we are sure that nobody is accessing the channel, and no one will
>       * try until we set the state to CONNECTED.
>       */
> +    assert(nbd_client_connecting(s));
> +    assert(s->in_flight == 1);
> +
> +    if (blocking && !s->reconnect_delay_timer) {
> +        /*
> +         * It's the first reconnect attempt after switching to
> +         * NBD_CLIENT_CONNECTING_WAIT
> +         */
> +
[PATCH v2] hw/vfio/common: Fix a small boundary issue of a trace
From: Xiang Chen

Most places in the vfio trace code (such as trace_vfio_region_region_mmap())
use [offset, offset + size - 1] to indicate a range of length size, except
trace_vfio_region_sparse_mmap_entry(). So change
trace_vfio_region_sparse_mmap_entry() to match; but if size is zero the
trace would underflow and print a bogus range, so move the trace and emit it
only when size is non-zero.

Signed-off-by: Xiang Chen
---
 hw/vfio/common.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e3f5..345ea7bd8a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1544,11 +1544,10 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
     region->mmaps = g_new0(VFIOMmap, sparse->nr_areas);
 
     for (i = 0, j = 0; i < sparse->nr_areas; i++) {
-        trace_vfio_region_sparse_mmap_entry(i, sparse->areas[i].offset,
-                                            sparse->areas[i].offset +
-                                            sparse->areas[i].size);
-
         if (sparse->areas[i].size) {
+            trace_vfio_region_sparse_mmap_entry(i, sparse->areas[i].offset,
+                                                sparse->areas[i].offset +
+                                                sparse->areas[i].size - 1);
             region->mmaps[j].offset = sparse->areas[i].offset;
             region->mmaps[j].size = sparse->areas[i].size;
             j++;
-- 
2.33.0
[PATCH] softmmu/memory: Skip translation size instead of fixed granularity if translate() succeeds
From: Xiang Chen

Currently memory_region_iommu_replay() does a full page table walk with a
fixed granularity (the minimum page size) whether translate() succeeds or
not. Actually, if translate() succeeds, we can skip ahead by the
translation size (iotlb.addr_mask + 1) instead of the fixed granularity.

Signed-off-by: Xiang Chen
---
 softmmu/memory.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index bfa5d5178c..ccfa19cf71 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1924,7 +1924,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 {
     MemoryRegion *mr = MEMORY_REGION(iommu_mr);
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
-    hwaddr addr, granularity;
+    hwaddr addr, granularity, def_granu;
     IOMMUTLBEntry iotlb;
 
     /* If the IOMMU has its own replay callback, override */
@@ -1933,12 +1933,15 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
         return;
     }
 
-    granularity = memory_region_iommu_get_min_page_size(iommu_mr);
+    def_granu = memory_region_iommu_get_min_page_size(iommu_mr);
 
     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
         iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
         if (iotlb.perm != IOMMU_NONE) {
             n->notify(n, &iotlb);
+            granularity = iotlb.addr_mask + 1;
+        } else {
+            granularity = def_granu;
         }
 
         /* if (2^64 - MR size) < granularity, it's possible to get an
-- 
2.33.0
[PATCH v2] hw/arm/smmuv3: Pass the actual perm to returned IOMMUTLBEntry in smmuv3_translate()
From: Xiang Chen

memory_region_iommu_replay() always calls the IOMMU MR translate() callback
with flag=IOMMU_NONE. Currently, smmuv3_translate() returns an
IOMMUTLBEntry with perm set to IOMMU_NONE even if the translation succeeds,
whereas it is expected to return the actual permission set in the table
entry. So pass the actual perm from the table entry to the returned
IOMMUTLBEntry.

Signed-off-by: Xiang Chen
Reviewed-by: Eric Auger
---
 hw/arm/smmuv3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 674623aabe..707eb430c2 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -760,7 +760,7 @@ epilogue:
     qemu_mutex_unlock(&s->mutex);
     switch (status) {
     case SMMU_TRANS_SUCCESS:
-        entry.perm = flag;
+        entry.perm = cached_entry->entry.perm;
         entry.translated_addr = cached_entry->entry.translated_addr +
                                 (addr & cached_entry->entry.addr_mask);
         entry.addr_mask = cached_entry->entry.addr_mask;
-- 
2.33.0
Re: [PATCH] hw/arm/smmuv3: Pass the real perm to returned IOMMUTLBEntry in smmuv3_translate()
Hi Eric,

On 2022/4/15 0:02, Eric Auger wrote:
> Hi Chenxiang,
>
> On 4/7/22 9:57 AM, chenxiang via wrote:
>> From: Xiang Chen
>>
>> In function memory_region_iommu_replay(), it decides whether to
>> notify() according to the perm of the returned IOMMUTLBEntry. But for
>> smmuv3, the returned perm is always IOMMU_NONE even if the translation
>> succeeds.
>
> I think you should specify in the commit message that
> memory_region_iommu_replay() always calls the IOMMU MR translate()
> callback with flag=IOMMU_NONE and thus, currently, translate() returns
> an IOMMUTLBEntry with perm set to IOMMU_NONE if the translation
> succeeds, whereas it is expected to return the actual permission set in
> the table entry.

Thank you for your comments. I will change the commit message in the next
version.

>> Pass the real perm to the returned IOMMUTLBEntry to avoid the issue.
>>
>> Signed-off-by: Xiang Chen
>> ---
>>  hw/arm/smmuv3.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 674623aabe..707eb430c2 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -760,7 +760,7 @@ epilogue:
>>      qemu_mutex_unlock(&s->mutex);
>>      switch (status) {
>>      case SMMU_TRANS_SUCCESS:
>> -        entry.perm = flag;
>> +        entry.perm = cached_entry->entry.perm;
>
> With that clarification
> Reviewed-by: Eric Auger

Ok, thanks.

> the translate() doc in ./include/exec/memory.h states
> " If IOMMU_NONE is passed then the IOMMU must do the
>  * full page table walk and report the permissions in the returned
>  * IOMMUTLBEntry. (Note that this implies that an IOMMU may not
>  * return different mappings for reads and writes.)
> "
>
> Thanks
> Eric
>
>>         entry.translated_addr = cached_entry->entry.translated_addr +
>>                                 (addr & cached_entry->entry.addr_mask);
>>         entry.addr_mask = cached_entry->entry.addr_mask;