Re: [PATCH 6/7] hw/isa: Assert isa_register_portio_list() gets non-NULL ISA device
On 14/2/23 19:49, Richard Henderson wrote:
> On 2/14/23 00:18, Philippe Mathieu-Daudé wrote:
> > __attribute__((nonnull)) void a1(void *ptr)
> > {
> >     // cannot use assert(ptr) because compiler warning
> > }
>
> I briefly glossed over that... I realize we'd probably want to add
> -fno-delete-null-pointer-checks if we make too much ... here. The
> compiler warning should go away with the right flag.

Doh, got it now!
Re: [PATCH 0/8] aspeed: I2C fixes, -drive removal (first step)
Cédric Le Goater writes:
> Hello,
>
> This series starts with a first set of patches fixing I2C slave mode
> in the Aspeed I2C controller, a test device and its associated test in
> avocado.
>
> Follow some cleanups which allow the use of block devices instead of
> drives. So that, instead of specifying :
>
>   -drive file=./flash-ast2600-evb,format=raw,if=mtd
>   -drive file=./ast2600-evb.pnor,format=raw,if=mtd
>   ...
>
> and guessing from the order which bus the device is attached to, we
> can use :
>
>   -blockdev node-name=fmc0,driver=file,filename=./bmc.img
>   -device mx66u51235f,bus=ssi.0,drive=fmc0
>   -blockdev node-name=fmc1,driver=file,filename=./bmc-alt.img
>   -device mx66u51235f,bus=ssi.0,drive=fmc1
>   -blockdev node-name=pnor,driver=file,filename=./pnor
>   -device mx66l1g45g,bus=ssi.1,drive=pnor
>   ...
>
> It is not perfect, the CS index still depends on the order, but it is
> now possible to run a machine without -drive ...,if=mtd.

Lovely! Does this cover all uses of IF_MTD, or only some?

> This lacks the final patch enabling the '-nodefaults' option by not
> creating the default devices if specified on the command line. It
> needs some more evaluation of the possible undesired effects.

Are you thinking of something similar to the default CD-ROM, i.e. use
default_list to have -device suppress a certain kind of default devices,
and also have -nodefaults suppress them all?
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, Feb 14, 2023, 18:26 Kevin Wolf wrote:
> On 14.02.2023 at 15:03, Paolo Bonzini wrote:
> > In the case of Python the issue is not the interpreter per se, though
> > there are a couple of new features in Python 3.7 that are quite nice
> > (for example improved data classes[1] or context variables[2]). The
> > main problem as far as I understood (and have seen in my experience)
> > is linting tools. New versions fix bugs that caused false positives,
> > but also become more strict at the same time. The newer versions at
> > the same time are very quick at dropping support for old versions of
> > Python; while older versions sometimes throw deprecation warnings on
> > new versions of Python. This makes it very hard to support a single
> > version of, say, mypy that works on all versions from RHEL8 and SLE15
> > to Fedora 38 and Ubuntu 23.04.
>
> Why do we have to support a single version of mypy? What is wrong with
> running an old mypy version with old Python version, and a newer mypy
> with newer Python versions?
>
> Sure, they will complain about different things, but it doesn't feel
> that different from supporting multiple C compilers in various
> versions.

It's more like having to support only C++03 on RHEL 8 and only C++20 in
Fedora 37, without even being able to use a preprocessor. For example,
old versions might not understand some type annotations and will fail
mypy altogether, therefore even with newer versions you can't annotate
the whole source and have to fall back to non-strict mode.

Paolo

> Kevin
Re: [PATCH 0/4] qemu-img: Fix exit code for errors closing the image
On 13.01.23 14:29, Kevin Wolf wrote:
> Another thing that could be tried is making failure in .bdrv_close
> less likely by doing things earlier. At least ENOSPC could probably be
> avoided if dirty bitmaps clusters were allocated during the write
> request that first sets a bit in them (I know too little about the
> details how bitmaps are implemented in qcow2, though, maybe Vladimir
> can help here).

That's possible but not trivial :)

Qcow2 does nothing with dirty bitmaps during normal operation. Only on
close, it finds all persistent bitmaps and stores them somehow, mostly
allocating new clusters on the fly.

So the simplest way looks like:

- add a generic handler .bitmap_changed in BlockDriver, to handle bitmap
  change in qcow2 (that's not only write, but may be a bitmap_merge
  operation)

- in the new handler, allocate some clusters to produce a pool for dirty
  bitmap saving (we will need clusters for bitmap data and metadata
  (bitmap table, bitmap directory))

- in block/qcow2-bitmap.c, switch qcow2_alloc_cluster() to a wrapper
  that first tries to get a cluster from the pool and, if it's empty,
  falls back to qcow2_alloc_cluster()

Note also that this will increase fragmentation.

Or, maybe more effective would be to preallocate clusters on bitmap
creation (and therefore on image resize).

More difficult would be to rework the whole code to bind allocated
clusters to each persistent dirty bitmap.

-- 
Best regards,
Vladimir
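The pool idea sketched in the steps above can be modeled in a few lines. This is an illustrative toy, not qcow2 code: the allocator callable stands in for qcow2_alloc_cluster(), and the handler name mirrors the hypothetical .bitmap_changed hook proposed in the mail.

```python
class ClusterPool:
    """Toy sketch of the proposed scheme: a bitmap-change handler reserves
    clusters into a pool up front (while ENOSPC can still be reported to
    the guest write), and the allocation wrapper used at close time drains
    the pool before falling back to fresh allocation."""

    def __init__(self, allocator):
        self.allocator = allocator  # callable returning a new cluster offset
        self.pool = []

    def on_bitmap_changed(self, clusters_needed):
        # reserve enough clusters ahead of time, during normal operation
        while len(self.pool) < clusters_needed:
            self.pool.append(self.allocator())

    def alloc_cluster(self):
        # wrapper: prefer the pre-reserved pool, else allocate fresh
        if self.pool:
            return self.pool.pop()
        return self.allocator()
```

With a counting allocator, two clusters reserved by the handler are handed out first, and only the third allocation hits the real allocator, which is the point where ENOSPC would otherwise surface at close.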
Re: [PATCH 6/7] hw/isa: Assert isa_register_portio_list() gets non-NULL ISA device
On 2/14/23 00:18, Philippe Mathieu-Daudé wrote:
> __attribute__((nonnull)) void a1(void *ptr)
> {
>     // cannot use assert(ptr) because compiler warning
> }

I briefly glossed over that... I realize we'd probably want to add
-fno-delete-null-pointer-checks if we make too much ... here. The
compiler warning should go away with the right flag.

r~
Re: [PATCH v2 0/7] Python: Drop support for Python 3.6
On Thu, Feb 9, 2023 at 7:31 PM John Snow wrote:
>
> Howdy, this series increases our minimum python version to 3.7.
>
> CI: https://gitlab.com/jsnow/qemu/-/pipelines/771780626
> (All green!)
> GL: https://gitlab.com/jsnow/qemu/-/commits/python-require-37
>
> Patches 1 and 2 are loose pre-requisites; I'd like to merge them into
> qemu.git within the week whether or not we take this series. I'd
> appreciate an "ACK" on those specifically. They're just riding along
> here because they make this series a bit nicer.
>
> Patches 3-6 are the hard pre-requisites, and 7 does the dirty work.
>
> The motivation for this series is that Python 3.6 was EOL at the end of
> 2021; upstream tools are beginning to drop support for it, including
> setuptools, pylint, mypy, etc. As time goes by, it becomes more
> difficult to support and test against the full range of Python versions
> that QEMU supports. The closer we get to Python 3.12, the harder it will
> be to cover that full spread of versions.
>
> The qemu.qmp library and the avocado testing framework both have
> motivations for dropping 3.6 support, but are committed to not doing so
> until QEMU drops support.
>
> So, I'd like to talk about doing it.
>
> V2:
> - Added R-Bs to patch 1
> - Updated commit message for patch 7 with explicit version info
> - Added DO-NOT-MERGE to patch 5's title
> - Tested tests/vm/freebsd, netbsd, and openbsd in addition to full CI
>
> RFC:
> - Patch 5 is just a proof-of-concept; we need to update lcitool instead.
> - Cleber, I need to update your ansible scripts. How do I test them?
>
> Thanks!
> --js
>
> John Snow (7):
>   python: support pylint 2.16
>   python: drop pipenv

Hi, I've staged these first two patches to my Python branch.

(Kevin, Hanna; is that acceptable? I touch some iotests to do some
trivial linting whack-a-mole.)
--js

>   configure: Look for auxiliary Python installations
>   configure: Add nice hint to Python failure message
>   DO-NOT-MERGE: testing: Add Python >= 3.7 to Centos, OpenSuSE
>   CI: Stop building docs on centos8
>   Python: Drop support for Python 3.6
>
>  docs/conf.py                                  |   4 +-
>  python/README.rst                             |   3 -
>  configure                                     |  40 +-
>  .gitlab-ci.d/buildtest.yml                    |   2 +-
>  .gitlab-ci.d/static_checks.yml                |   4 +-
>  python/.gitignore                             |   4 +-
>  python/Makefile                               |  57 ++-
>  python/Pipfile                                |  13 -
>  python/Pipfile.lock                           | 347 --
>  python/qemu/qmp/protocol.py                   |   2 +-
>  python/qemu/qmp/qmp_client.py                 |   2 +-
>  python/qemu/utils/qemu_ga_client.py           |   6 +-
>  python/setup.cfg                              |  11 +-
>  python/tests/minreqs.txt                      |  45 +++
>  scripts/qapi/mypy.ini                         |   2 +-
>  tests/docker/dockerfiles/centos8.docker       |   1 +
>  tests/docker/dockerfiles/opensuse-leap.docker |   1 +
>  tests/docker/dockerfiles/python.docker        |   1 -
>  tests/qemu-iotests/iotests.py                 |   4 +-
>  .../tests/migrate-bitmaps-postcopy-test       |   2 +-
>  20 files changed, 135 insertions(+), 416 deletions(-)
>  delete mode 100644 python/Pipfile
>  delete mode 100644 python/Pipfile.lock
>  create mode 100644 python/tests/minreqs.txt
>
> --
> 2.39.0
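The "Look for auxiliary Python installations" step in the series can be sketched as a version probe. This is an illustrative toy, not the actual configure code: the `probe` callable (which would really run each binary and ask for its version) and the candidate names are assumptions.

```python
def pick_python(candidates, probe):
    """Try interpreter names in order and pick the first one whose probed
    version tuple is >= (3, 7); return None so the caller can print the
    'nice hint' failure message when nothing qualifies."""
    for exe in candidates:
        version = probe(exe)          # None if the binary doesn't exist
        if version is not None and version >= (3, 7):
            return exe
    return None
```

The point of probing a list (e.g. python3, python3.11, ..., python3.7) is that a distro's default python3 may be too old even when a suitable auxiliary interpreter is installed alongside it.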
Re: [PATCH 3/3] migration: Remove _only suffix for res_postcopy/precopy
On 14.02.23 21:22, Juan Quintela wrote: Vladimir Sementsov-Ogievskiy wrote: On 08.02.23 16:57, Juan Quintela wrote: Once that res_compatible is removed, they don't make sense anymore. Signed-off-by: Juan Quintela --- include/migration/register.h | 18 -- migration/savevm.h | 8 hw/s390x/s390-stattrib.c | 7 +++ hw/vfio/migration.c| 10 -- migration/block-dirty-bitmap.c | 6 +++--- migration/block.c | 7 +++ migration/ram.c| 18 -- migration/savevm.c | 24 ++-- 8 files changed, 43 insertions(+), 55 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index a958a92a0f..4a4a6d7174 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -47,22 +47,20 @@ typedef struct SaveVMHandlers { /* This runs outside the iothread lock! */ int (*save_setup)(QEMUFile *f, void *opaque); /* Note for save_live_pending: - * - res_precopy_only is for data which must be migrated in precopy phase + * - res_precopy is for data which must be migrated in precopy phase * or in stopped state, in other words - before target vm start - * - res_postcopy_only is for data which must be migrated in postcopy phase + * - res_postcopy is for data which must be migrated in postcopy phase * or in stopped state, in other words - after source vm stop That's now wrong. "postcopy" is everything except "precopy", as it includes "compat". Really, for RAM, it can be copied in precopy too, and it is copied in precopy until user run command migrate-start-postcopy. (In contrast: block-dirty-bitmap cannot migrate in precopy at all, it migrate only in stopped state or in postcopy). 
> > So, finally:
> >
> > "precopy"
> >
> > definition:
> > - must be migrated in precopy or in stopped state
> > - in other words: must be migrated before target start
> > - in other words: can't be migrated in postcopy
> > - in other words: can't be migrated after target start
> >
> > "postcopy"
> >
> > definition:
> > - can migrate in postcopy
> > - in other words: can migrate after target start
> > some properties:
> > - probably can be migrated in precopy (like RAM), or, may be not
> >   (like block-dirty-bitmap)
> > - of course, can be migrated in stopped state
>
> What about (with latest naming)
>
> must_precopy:
> * must be migrated in precopy or in stopped state
> * i.e. must be migrated before target start
>
> can_postcopy:
> * can migrate in postcopy or in stopped state
> * i.e. can migrate after target start
> * some can also be migrated during precopy (RAM)
> * some must be migrated after source stops (block-dirty-bitmap)

Sounds very good, I'm for!

> > To be absolutely clear, we may rename them to "not_postcopyable" and
> > "postcopyable".
>
> I hate negatives when naming variables.

Agree.

> What about: must_precopy and can_postcopy?

Sounds good!)

-- 
Best regards,
Vladimir
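The agreed-on split can be captured as a toy decision rule. This is a simplified model, not the actual migration.c logic (which also checks whether postcopy is enabled, downtime limits, etc.); the phase names are invented for illustration, only must_precopy/can_postcopy come from the thread.

```python
def next_phase(must_precopy, can_postcopy, threshold):
    """Toy model: must_precopy counts data that has to move before the
    target starts; can_postcopy counts data that may follow it."""
    if must_precopy + can_postcopy <= threshold:
        return "complete"            # everything fits in the final downtime
    if must_precopy <= threshold:
        return "switch-to-postcopy"  # the target may start; the rest follows
    return "continue-precopy"        # too much pre-target data remains
```

RAM contributes to can_postcopy but may also be copied during precopy; block-dirty-bitmap data is can_postcopy only, which is exactly why it never forces "continue-precopy" on its own.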
[PATCH] MAINTAINERS: drop Vladimir from parallels block driver
I have to admit this is out of my scope now. Still feel free to Cc me
directly if my help is needed :)

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 MAINTAINERS | 2 --
 1 file changed, 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 96e25f62ac..12bc96f52a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3580,13 +3580,11 @@ F: block/dmg.c
 parallels
 M: Stefan Hajnoczi
 M: Denis V. Lunev
-M: Vladimir Sementsov-Ogievskiy
 L: qemu-block@nongnu.org
 S: Supported
 F: block/parallels.c
 F: block/parallels-ext.c
 F: docs/interop/parallels.txt
-T: git https://gitlab.com/vsementsov/qemu.git block

 qed
 M: Stefan Hajnoczi
--
2.34.1
Re: [PATCH 3/3] migration: Remove _only suffix for res_postcopy/precopy
Vladimir Sementsov-Ogievskiy wrote: > On 08.02.23 16:57, Juan Quintela wrote: >> Once that res_compatible is removed, they don't make sense anymore. >> Signed-off-by: Juan Quintela >> --- >> include/migration/register.h | 18 -- >> migration/savevm.h | 8 >> hw/s390x/s390-stattrib.c | 7 +++ >> hw/vfio/migration.c| 10 -- >> migration/block-dirty-bitmap.c | 6 +++--- >> migration/block.c | 7 +++ >> migration/ram.c| 18 -- >> migration/savevm.c | 24 ++-- >> 8 files changed, 43 insertions(+), 55 deletions(-) >> diff --git a/include/migration/register.h >> b/include/migration/register.h >> index a958a92a0f..4a4a6d7174 100644 >> --- a/include/migration/register.h >> +++ b/include/migration/register.h >> @@ -47,22 +47,20 @@ typedef struct SaveVMHandlers { >> /* This runs outside the iothread lock! */ >> int (*save_setup)(QEMUFile *f, void *opaque); >> /* Note for save_live_pending: >> - * - res_precopy_only is for data which must be migrated in precopy >> phase >> + * - res_precopy is for data which must be migrated in precopy phase >>* or in stopped state, in other words - before target vm start >> - * - res_postcopy_only is for data which must be migrated in postcopy >> phase >> + * - res_postcopy is for data which must be migrated in postcopy phase >>* or in stopped state, in other words - after source vm stop > > > That's now wrong. "postcopy" is everything except "precopy", as it > includes "compat". Really, for RAM, it can be copied in precopy too, > and it is copied in precopy until user run command > migrate-start-postcopy. (In contrast: block-dirty-bitmap cannot > migrate in precopy at all, it migrate only in stopped state or in > postcopy). 
> So, finally:
>
> "precopy"
>
> definition:
> - must be migrated in precopy or in stopped state
> - in other words: must be migrated before target start
> - in other words: can't be migrated in postcopy
> - in other words: can't be migrated after target start
>
> "postcopy"
>
> definition:
> - can migrate in postcopy
> - in other words: can migrate after target start
> some properties:
> - probably can be migrated in precopy (like RAM), or, may be not (like
>   block-dirty-bitmap)
> - of course, can be migrated in stopped state

What about (with latest naming)

must_precopy:
* must be migrated in precopy or in stopped state
* i.e. must be migrated before target start

can_postcopy:
* can migrate in postcopy or in stopped state
* i.e. can migrate after target start
* some can also be migrated during precopy (RAM)
* some must be migrated after source stops (block-dirty-bitmap)

> To be absolutely clear, we may rename them to "not_postcopyable" and
> "postcopyable".

I hate negatives when naming variables.

What about: must_precopy and can_postcopy?

Later, Juan.
Re: Lost partition tables on ide-hd + ahci drive
On Thu, Feb 2, 2023 at 7:08 AM Fiona Ebner wrote:
>
> Hi,
> over the years we've got 1-2 dozen reports[0] about suddenly
> missing/corrupted MBR/partition tables. The issue seems to be very rare
> and there was no success in trying to reproduce it yet. I'm asking here
> in the hope that somebody has seen something similar.
>
> The only commonality seems to be the use of an ide-hd drive with ahci bus.
>
> It does seem to happen with both Linux and Windows guests (one of the
> reports even mentions FreeBSD) and backing storages for the VMs include
> ZFS, RBD, LVM-Thin as well as file-based storages.
>
> Relevant part of an example configuration:
>
> > -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
> > -drive 'file=/dev/zvol/myzpool/vm-168-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
> > -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0' \
>
> The first reports are from before io_uring was used and there are also
> reports with writeback cache mode and discard=on,detect-zeroes=unmap.
>
> Some reports say that the issue occurred under high IO load.
>
> Many reports suspect backups causing the issue. Our backup mechanism
> uses backup_job_create() for each drive and runs the jobs sequentially.
> It uses a custom block driver as the backup target which just forwards
> the writes to the actual target which can be a file or our backup server.
> (If you really want to see the details, apply the patches in [1] and see
> pve-backup.c and block/backup-dump.c).
>
> Of course, the backup job will read sector 0 of the source disk, but I
> really can't see where a stray write would happen, why the issue would
> trigger so rarely or why seemingly only ide-hd+ahci would be affected.
>
> So again, just asking if somebody has seen something similar or has a
> hunch of what the cause might be.
Hi Fiona; I'm sorry to say that I haven't worked on the block devices
(or backup) for a little while now, so I am not immediately sure what
might be causing this problem. In general, I advise against using AHCI
in production as better performance (and dev support) can be achieved
through virtio. Still, I am not sure why the combination of AHCI with
backup_job_create() would be corrupting the early sectors of the disk.

Do you have any analysis of how much data gets corrupted? Is it the
first sector only, or the first few? Has anyone taken a peek at the
backing storage to see if there are any interesting patterns that can
be observed? (Zeroes, garbage, old data?) Have any errors or warnings
been observed in either the guest or the host that might offer some
clues? Is there any commonality in the storage format being used? Is it
qcow2? Is it network-backed?

Apologies for the "tier 1" questions.

> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=2874
> [1]: https://git.proxmox.com/?p=pve-qemu.git;a=tree;f=debian/patches;hb=HEAD
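The "which sectors changed, and to what" questions can be answered mechanically once a known-good copy (e.g. a backup) and the corrupted image are at hand. A minimal sketch, assuming both images are readable as raw bytes; the classification labels are ad hoc:

```python
SECTOR = 512

def classify_corruption(expected: bytes, actual: bytes):
    """Compare two raw images sector by sector; report each differing
    sector as 'zeroed' (all NUL bytes) or 'overwritten' (garbage or
    old data), which is a first hint at the kind of stray write."""
    diffs = []
    for off in range(0, min(len(expected), len(actual)), SECTOR):
        exp = expected[off:off + SECTOR]
        act = actual[off:off + SECTOR]
        if exp == act:
            continue
        kind = "zeroed" if act == b"\x00" * len(act) else "overwritten"
        diffs.append((off // SECTOR, kind))
    return diffs
```

A run of "zeroed" sectors at LBA 0 would point toward a discard/write-zeroes path, whereas "overwritten" sectors containing recognizable old data would suggest a misdirected write.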
Re: [PATCH 2/3] migration: Remove unused res_compatible
Vladimir Sementsov-Ogievskiy wrote:
> On 08.02.23 16:57, Juan Quintela wrote:
>> {
>> -    uint64_t pend_pre, pend_compat, pend_post;
>> +    uint64_t pend_pre, pend_post;
>>      bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
>> -    qemu_savevm_state_pending_estimate(&pend_pre, &pend_compat, &pend_post);
>> -    uint64_t pending_size = pend_pre + pend_compat + pend_post;
>> +    qemu_savevm_state_pending_estimate(&pend_pre, &pend_post);
>> +    uint64_t pending_size = pend_pre + pend_post;
>
> Mixed declarations are "generally not allowed" by devel/style.rst.
> Preexisting, but we may fix it now.

They are used left and right. But you are right. Instead of changing my
code, I have sent a proposal to change devel/style.rst. Discuss it
there O:-)

> Anyway:
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy

Thanks.
Re: [PATCH v10 02/12] parallels: Fix high_off calculation in parallels_co_check()
On 03.02.23 12:18, Alexander Ivanov wrote:
> Don't let high_off be more than the file size even if we don't fix the
> image.
>
> Signed-off-by: Alexander Ivanov
> Reviewed-by: Denis V. Lunev

Reviewed-by: Vladimir Sementsov-Ogievskiy

-- 
Best regards,
Vladimir
Re: [PATCH v10 01/12] parallels: Out of image offset in BAT leads to image inflation
On 03.02.23 12:18, Alexander Ivanov wrote:
> data_end field in BDRVParallelsState is set to the biggest offset
> present in BAT. If this offset is outside of the image, any further
> write will create the cluster at this offset and/or the image will be
> truncated to this offset on close. This is definitely not correct.
>
> Raise an error in parallels_open() if data_end points outside the
> image and it is not a check (let the check repair the image). Set
> data_end to the end of the cluster with the last correct offset.
>
> Signed-off-by: Alexander Ivanov
> Reviewed-by: Denis V. Lunev
> ---
>  block/parallels.c | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/block/parallels.c b/block/parallels.c
> index bbea2f2221..4af68adc61 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -732,6 +732,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
>      BDRVParallelsState *s = bs->opaque;
>      ParallelsHeader ph;
>      int ret, size, i;
> +    int64_t file_size;
>      QemuOpts *opts = NULL;
>      Error *local_err = NULL;
>      char *buf;
> @@ -741,6 +742,12 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
>          return ret;
>      }
>
> +    file_size = bdrv_getlength(bs->file->bs);
> +    if (file_size < 0) {
> +        return -EINVAL;
> +    }
> +    file_size >>= BDRV_SECTOR_BITS;

If the file size is somehow not aligned to BDRV_SECTOR_SIZE, that's not
correct; DIV_ROUND_UP would be better.

> +
>      ret = bdrv_pread(bs->file, 0, sizeof(ph), &ph, 0);
>      if (ret < 0) {
>          goto fail;
>      }
> @@ -805,6 +812,16 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
>      for (i = 0; i < s->bat_size; i++) {
>          int64_t off = bat2sect(s, i);
> +        if (off >= file_size) {
> +            if (flags & BDRV_O_CHECK) {
> +                continue;
> +            }
> +            error_setg(errp, "parallels: Offset %" PRIi64 " in BAT[%d] entry "
> +                       "is larger than file size (%" PRIi64 ")",
> +                       off, i, file_size);

Offsets in sectors rather than in bytes may be a bit misleading.

> +            ret = -EINVAL;
> +            goto fail;
> +        }
>          if (off >= s->data_end) {
>              s->data_end = off + s->tracks;
>          }

-- 
Best regards,
Vladimir
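The DIV_ROUND_UP remark boils down to ceil versus floor division. A Python sketch, with a 512-byte constant standing in for BDRV_SECTOR_SIZE: the patch's right-shift rounds down, so an unaligned file loses its trailing partial sector, and a BAT entry pointing into that last sector would be rejected as out of image even though its data is (partially) there.

```python
SECTOR_SIZE = 512  # stand-in for BDRV_SECTOR_SIZE

def sectors_truncated(file_size):
    # what `file_size >>= BDRV_SECTOR_BITS` does: rounds DOWN
    return file_size // SECTOR_SIZE

def sectors_round_up(file_size):
    # DIV_ROUND_UP equivalent: a trailing partial sector still counts
    return (file_size + SECTOR_SIZE - 1) // SECTOR_SIZE
```

For a 1025-byte file, the truncating version reports 2 sectors while the round-up version reports 3, which is the difference the review is pointing at.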
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On 14.02.2023 at 15:03, Paolo Bonzini wrote:
> In the case of Python the issue is not the interpreter per se, though
> there are a couple of new features in Python 3.7 that are quite nice
> (for example improved data classes[1] or context variables[2]). The
> main problem as far as I understood (and have seen in my experience)
> is linting tools. New versions fix bugs that caused false positives,
> but also become more strict at the same time. The newer versions at
> the same time are very quick at dropping support for old versions of
> Python; while older versions sometimes throw deprecation warnings on
> new versions of Python. This makes it very hard to support a single
> version of, say, mypy that works on all versions from RHEL8 and SLE15
> to Fedora 38 and Ubuntu 23.04.

Why do we have to support a single version of mypy? What is wrong with
running an old mypy version with old Python version, and a newer mypy
with newer Python versions?

Sure, they will complain about different things, but it doesn't feel
that different from supporting multiple C compilers in various versions.

Kevin
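For reference, the two 3.7 features the quoted text points at look like this (illustrative only; the Job/request_id names are invented). Both dataclasses and contextvars entered the stdlib in Python 3.7, which is exactly why they are unusable while 3.6 must be supported.

```python
from contextvars import ContextVar      # stdlib since Python 3.7
from dataclasses import dataclass, field  # stdlib since Python 3.7

@dataclass
class Job:
    # dataclasses generate __init__/__repr__/__eq__ automatically
    name: str
    args: list = field(default_factory=list)

request_id: ContextVar = ContextVar("request_id", default="-")

def log_line(msg: str) -> str:
    # context variables carry per-context state (safe across asyncio tasks)
    return f"[{request_id.get()}] {msg}"
```

Two Job instances with equal fields compare equal without any hand-written dunder methods, and request_id can be set independently in concurrent asyncio tasks without the bleed-through a module global would have.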
[PATCH 2/8] hw/i2c: only schedule pending master when bus is idle
From: Klaus Jensen

It is not given that the current master will release the bus after a
transfer ends. Only schedule a pending master if the bus is idle.

Fixes: 37fa5ca42623 ("hw/i2c: support multiple masters")
Signed-off-by: Klaus Jensen
Acked-by: Corey Minyard
Message-Id: <20221116084312.35808-2-...@irrelevant.dk>
Signed-off-by: Cédric Le Goater
---
 include/hw/i2c/i2c.h |  2 ++
 hw/i2c/aspeed_i2c.c  |  2 ++
 hw/i2c/core.c        | 37 ++++++++++++++++++++++++---------------
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/include/hw/i2c/i2c.h b/include/hw/i2c/i2c.h
index 9b9581d230..2a3abacd1b 100644
--- a/include/hw/i2c/i2c.h
+++ b/include/hw/i2c/i2c.h
@@ -141,6 +141,8 @@ int i2c_start_send(I2CBus *bus, uint8_t address);
  */
 int i2c_start_send_async(I2CBus *bus, uint8_t address);
 
+void i2c_schedule_pending_master(I2CBus *bus);
+
 void i2c_end_transfer(I2CBus *bus);
 void i2c_nack(I2CBus *bus);
 void i2c_ack(I2CBus *bus);
diff --git a/hw/i2c/aspeed_i2c.c b/hw/i2c/aspeed_i2c.c
index c166fd20fa..1f071a3811 100644
--- a/hw/i2c/aspeed_i2c.c
+++ b/hw/i2c/aspeed_i2c.c
@@ -550,6 +550,8 @@ static void aspeed_i2c_bus_handle_cmd(AspeedI2CBus *bus, uint64_t value)
         }
         SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, M_STOP_CMD, 0);
         aspeed_i2c_set_state(bus, I2CD_IDLE);
+
+        i2c_schedule_pending_master(bus->bus);
     }
 
     if (aspeed_i2c_bus_pkt_mode_en(bus)) {
diff --git a/hw/i2c/core.c b/hw/i2c/core.c
index d4ba8146bf..bed594fe59 100644
--- a/hw/i2c/core.c
+++ b/hw/i2c/core.c
@@ -185,22 +185,39 @@ int i2c_start_transfer(I2CBus *bus, uint8_t address, bool is_recv)
 
 void i2c_bus_master(I2CBus *bus, QEMUBH *bh)
 {
-    if (i2c_bus_busy(bus)) {
-        I2CPendingMaster *node = g_new(struct I2CPendingMaster, 1);
-        node->bh = bh;
+    I2CPendingMaster *node = g_new(struct I2CPendingMaster, 1);
+    node->bh = bh;
+
+    QSIMPLEQ_INSERT_TAIL(&bus->pending_masters, node, entry);
+}
+
+void i2c_schedule_pending_master(I2CBus *bus)
+{
+    I2CPendingMaster *node;
 
-        QSIMPLEQ_INSERT_TAIL(&bus->pending_masters, node, entry);
+    if (i2c_bus_busy(bus)) {
+        /* someone is already controlling the bus; wait for it to release it */
+        return;
+    }
+
+    if (QSIMPLEQ_EMPTY(&bus->pending_masters)) {
         return;
     }
 
-    bus->bh = bh;
+    node = QSIMPLEQ_FIRST(&bus->pending_masters);
+    bus->bh = node->bh;
+
+    QSIMPLEQ_REMOVE_HEAD(&bus->pending_masters, entry);
+    g_free(node);
+
     qemu_bh_schedule(bus->bh);
 }
 
 void i2c_bus_release(I2CBus *bus)
 {
     bus->bh = NULL;
+
+    i2c_schedule_pending_master(bus);
 }
 
 int i2c_start_recv(I2CBus *bus, uint8_t address)
@@ -234,16 +251,6 @@ void i2c_end_transfer(I2CBus *bus)
         g_free(node);
     }
     bus->broadcast = false;
-
-    if (!QSIMPLEQ_EMPTY(&bus->pending_masters)) {
-        I2CPendingMaster *node = QSIMPLEQ_FIRST(&bus->pending_masters);
-        bus->bh = node->bh;
-
-        QSIMPLEQ_REMOVE_HEAD(&bus->pending_masters, entry);
-        g_free(node);
-
-        qemu_bh_schedule(bus->bh);
-    }
 }
 
 int i2c_send(I2CBus *bus, uint8_t data)
-- 
2.39.1
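The rule the patch introduces can be modeled outside QEMU. This toy class (not the QEMU API; names and the string "owners" are invented) captures the change: pending masters are queued unconditionally, and the next one is handed the bus only when it is actually idle, i.e. on release rather than at end of transfer.

```python
from collections import deque

class ToyI2CBus:
    """Toy model of multi-master arbitration after the fix: a master may
    hold the bus past the end of a transfer, so waiters are scheduled
    from release(), not from the end-of-transfer path."""

    def __init__(self):
        self.owner = None
        self.pending = deque()

    def bus_master(self, name):
        # queue the would-be master, then try to schedule it
        self.pending.append(name)
        self.schedule_pending_master()

    def schedule_pending_master(self):
        if self.owner is not None:
            return  # someone still controls the bus; wait for release
        if self.pending:
            self.owner = self.pending.popleft()

    def release(self):
        self.owner = None
        self.schedule_pending_master()  # the fix: reschedule here
```

With the pre-fix behavior (scheduling at end of transfer), B would have grabbed the bus while A still held it; here B only becomes owner once A releases.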
[PATCH 7/8] aspeed: Introduce a spi_boot region under the SoC
The default boot of the Aspeed SoCs is address 0x0. For this reason, the FMC flash device contents are remapped by HW on the first 256MB of the address space. In QEMU, this is currently done in the machine init with the setup of a region alias. Move this code to the SoC and introduce an extra container to prepare ground for the boot ROM region which will overlap the FMC flash remapping. Signed-off-by: Cédric Le Goater --- include/hw/arm/aspeed_soc.h | 3 +++ hw/arm/aspeed.c | 13 + hw/arm/aspeed_ast2600.c | 13 + hw/arm/aspeed_soc.c | 14 ++ hw/arm/fby35.c | 8 +--- 5 files changed, 32 insertions(+), 19 deletions(-) diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h index bd1e03e78a..db55505d14 100644 --- a/include/hw/arm/aspeed_soc.h +++ b/include/hw/arm/aspeed_soc.h @@ -58,6 +58,8 @@ struct AspeedSoCState { MemoryRegion *dram_mr; MemoryRegion dram_container; MemoryRegion sram; +MemoryRegion spi_boot_container; +MemoryRegion spi_boot; AspeedVICState vic; AspeedRtcState rtc; AspeedTimerCtrlState timerctrl; @@ -120,6 +122,7 @@ struct AspeedSoCClass { enum { +ASPEED_DEV_SPI_BOOT, ASPEED_DEV_IOMEM, ASPEED_DEV_UART1, ASPEED_DEV_UART2, diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index 21184f3ad4..998dc57969 100644 --- a/hw/arm/aspeed.c +++ b/hw/arm/aspeed.c @@ -384,18 +384,7 @@ static void aspeed_machine_init(MachineState *machine) MemoryRegion *boot_rom = g_new(MemoryRegion, 1); uint64_t size = memory_region_size(>mmio); -/* - * create a ROM region using the default mapping window size of - * the flash module. The window size is 64MB for the AST2400 - * SoC and 128MB for the AST2500 SoC, which is twice as big as - * needed by the flash modules of the Aspeed machines. 
- */ -if (ASPEED_MACHINE(machine)->mmio_exec) { -memory_region_init_alias(boot_rom, NULL, "aspeed.boot_rom", - >mmio, 0, size); -memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR, -boot_rom); -} else { +if (!ASPEED_MACHINE(machine)->mmio_exec) { memory_region_init_rom(boot_rom, NULL, "aspeed.boot_rom", size, _abort); memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR, diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c index bb2769df04..20f0b772d6 100644 --- a/hw/arm/aspeed_ast2600.c +++ b/hw/arm/aspeed_ast2600.c @@ -21,6 +21,7 @@ #define ASPEED_SOC_DPMCU_SIZE 0x0004 static const hwaddr aspeed_soc_ast2600_memmap[] = { +[ASPEED_DEV_SPI_BOOT] = 0x0, [ASPEED_DEV_SRAM] = 0x1000, [ASPEED_DEV_DPMCU] = 0x1800, /* 0x1600 0x17FF : AHB BUS do LPC Bus bridge */ @@ -282,6 +283,12 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) qemu_irq irq; g_autofree char *sram_name = NULL; +/* Default boot region (SPI memory or ROMs) */ +memory_region_init(>spi_boot_container, OBJECT(s), + "aspeed.spi_boot_container", 0x1000); +memory_region_add_subregion(s->memory, sc->memmap[ASPEED_DEV_SPI_BOOT], +>spi_boot_container); + /* IO space */ aspeed_mmio_map_unimplemented(s, SYS_BUS_DEVICE(>iomem), "aspeed.io", sc->memmap[ASPEED_DEV_IOMEM], @@ -431,6 +438,12 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) sysbus_connect_irq(SYS_BUS_DEVICE(>fmc), 0, aspeed_soc_get_irq(s, ASPEED_DEV_FMC)); +/* Set up an alias on the FMC CE0 region (boot default) */ +MemoryRegion *fmc0_mmio = >fmc.flashes[0].mmio; +memory_region_init_alias(>spi_boot, OBJECT(s), "aspeed.spi_boot", + fmc0_mmio, 0, memory_region_size(fmc0_mmio)); +memory_region_add_subregion(>spi_boot_container, 0x0, >spi_boot); + /* SPI */ for (i = 0; i < sc->spis_num; i++) { object_property_set_link(OBJECT(>spi[i]), "dram", diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c index e884d6badc..3507ea5818 100644 --- a/hw/arm/aspeed_soc.c +++ b/hw/arm/aspeed_soc.c 
@@ -25,6 +25,7 @@ #define ASPEED_SOC_IOMEM_SIZE 0x0020 static const hwaddr aspeed_soc_ast2400_memmap[] = { +[ASPEED_DEV_SPI_BOOT] = 0x0, [ASPEED_DEV_IOMEM] = 0x1E60, [ASPEED_DEV_FMC]= 0x1E62, [ASPEED_DEV_SPI1] = 0x1E63, @@ -59,6 +60,7 @@ static const hwaddr aspeed_soc_ast2400_memmap[] = { }; static const hwaddr aspeed_soc_ast2500_memmap[] = { +[ASPEED_DEV_SPI_BOOT] = 0x0, [ASPEED_DEV_IOMEM] = 0x1E60, [ASPEED_DEV_FMC]= 0x1E62, [ASPEED_DEV_SPI1] = 0x1E63, @@ -245,6 +247,12 @@ static void
[PATCH 8/8] aspeed: Add a boot_rom overlap region in the SoC spi_boot container
To avoid the SPI transactions fetching instructions from the FMC CE0 flash device and speed up boot, a ROM can be created if a drive is available. Reverse a bit the logic to allow a machine to boot without a drive, using a block device instead : -blockdev node-name=fmc0,driver=file,filename=/path/to/flash.img \ -device mx66u51235f,bus=ssi.0,drive=fmc0 Signed-off-by: Cédric Le Goater --- hw/arm/aspeed.c | 45 +++-- 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index 998dc57969..13e719bae7 100644 --- a/hw/arm/aspeed.c +++ b/hw/arm/aspeed.c @@ -243,10 +243,9 @@ static void aspeed_reset_secondary(ARMCPU *cpu, #define FIRMWARE_ADDR 0x0 -static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, size_t rom_size, +static void write_boot_rom(BlockBackend *blk, hwaddr addr, size_t rom_size, Error **errp) { -BlockBackend *blk = blk_by_legacy_dinfo(dinfo); g_autofree void *storage = NULL; int64_t size; @@ -272,6 +271,22 @@ static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, size_t rom_size, rom_add_blob_fixed("aspeed.boot_rom", storage, rom_size, addr); } +/* + * Create a ROM and copy the flash contents at the expected address + * (0x0). Boots faster than execute-in-place. 
+ */ +static void aspeed_install_boot_rom(AspeedSoCState *soc, BlockBackend *blk, +uint64_t rom_size) +{ +MemoryRegion *boot_rom = g_new(MemoryRegion, 1); + +memory_region_init_rom(boot_rom, NULL, "aspeed.boot_rom", rom_size, + _abort); +memory_region_add_subregion_overlap(>spi_boot_container, 0, +boot_rom, 1); +write_boot_rom(blk, FIRMWARE_ADDR, rom_size, _abort); +} + void aspeed_board_init_flashes(AspeedSMCState *s, const char *flashtype, unsigned int count, int unit0) { @@ -328,7 +343,6 @@ static void aspeed_machine_init(MachineState *machine) AspeedMachineState *bmc = ASPEED_MACHINE(machine); AspeedMachineClass *amc = ASPEED_MACHINE_GET_CLASS(machine); AspeedSoCClass *sc; -DriveInfo *drive0 = drive_get(IF_MTD, 0, 0); int i; NICInfo *nd = _table[0]; @@ -378,21 +392,6 @@ static void aspeed_machine_init(MachineState *machine) bmc->spi_model ? bmc->spi_model : amc->spi_model, 1, amc->num_cs); -/* Install first FMC flash content as a boot rom. */ -if (drive0) { -AspeedSMCFlash *fl = >soc.fmc.flashes[0]; -MemoryRegion *boot_rom = g_new(MemoryRegion, 1); -uint64_t size = memory_region_size(>mmio); - -if (!ASPEED_MACHINE(machine)->mmio_exec) { -memory_region_init_rom(boot_rom, NULL, "aspeed.boot_rom", - size, _abort); -memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR, -boot_rom); -write_boot_rom(drive0, FIRMWARE_ADDR, size, _abort); -} -} - if (machine->kernel_filename && sc->num_cpus > 1) { /* With no u-boot we must set up a boot stub for the secondary CPU */ MemoryRegion *smpboot = g_new(MemoryRegion, 1); @@ -423,6 +422,16 @@ static void aspeed_machine_init(MachineState *machine) drive_get(IF_SD, 0, bmc->soc.sdhci.num_slots)); } +if (!bmc->mmio_exec) { +DriveInfo *mtd0 = drive_get(IF_MTD, 0, 0); + +if (mtd0) { +uint64_t rom_size = memory_region_size(>soc.spi_boot); +aspeed_install_boot_rom(>soc, blk_by_legacy_dinfo(mtd0), +rom_size); +} +} + arm_load_kernel(ARM_CPU(first_cpu), machine, _board_binfo); } -- 2.39.1
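The memory_region_add_subregion_overlap(..., 1) call in this patch relies on subregion priorities: within the spi_boot container, the priority-1 boot ROM shadows the priority-0 FMC CE0 alias. A toy resolver (not the QEMU memory API; region tuples and sizes here are illustrative) shows the dispatch rule:

```python
def resolve(subregions, addr):
    """Toy MemoryRegion-container dispatch: among subregions covering
    addr, the one with the highest priority wins."""
    hit = None
    for start, size, priority, name in subregions:
        if start <= addr < start + size:
            if hit is None or priority > hit[0]:
                hit = (priority, name)
    return hit[1] if hit else None
```

So when the ROM is installed, CPU fetches at 0x0 hit "aspeed.boot_rom"; when no drive is given and the ROM is absent, the same addresses fall through to the "aspeed.spi_boot" flash alias, which is exactly the boot-without-drive behavior described above.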
[PATCH 3/8] hw/misc: add a toy i2c echo device
From: Klaus Jensen Add an example I2C device to demonstrate how a slave may master the bus and send data asynchronously to another slave. The device will echo whatever it is sent to the device identified by the first byte received. Signed-off-by: Klaus Jensen [ clg: - Changed to build to use CONFIG_ASPEED_SOC since only supported on such SoCs - folded in these fixes : https://lore.kernel.org/qemu-devel/Y3yMKAhOkYGtnkOp@cormorant.local/ ] Message-Id: <20220601210831.67259-7-...@irrelevant.dk> Signed-off-by: Cédric Le Goater --- hw/misc/i2c-echo.c | 156 hw/misc/meson.build | 2 + 2 files changed, 158 insertions(+) create mode 100644 hw/misc/i2c-echo.c diff --git a/hw/misc/i2c-echo.c b/hw/misc/i2c-echo.c new file mode 100644 index 00..5705ab5d73 --- /dev/null +++ b/hw/misc/i2c-echo.c @@ -0,0 +1,156 @@ +#include "qemu/osdep.h" +#include "qemu/timer.h" +#include "qemu/main-loop.h" +#include "block/aio.h" +#include "hw/i2c/i2c.h" + +#define TYPE_I2C_ECHO "i2c-echo" +OBJECT_DECLARE_SIMPLE_TYPE(I2CEchoState, I2C_ECHO) + +enum i2c_echo_state { +I2C_ECHO_STATE_IDLE, +I2C_ECHO_STATE_START_SEND, +I2C_ECHO_STATE_ACK, +}; + +typedef struct I2CEchoState { +I2CSlave parent_obj; + +I2CBus *bus; + +enum i2c_echo_state state; +QEMUBH *bh; + +unsigned int pos; +uint8_t data[3]; +} I2CEchoState; + +static void i2c_echo_bh(void *opaque) +{ +I2CEchoState *state = opaque; + +switch (state->state) { +case I2C_ECHO_STATE_IDLE: +return; + +case I2C_ECHO_STATE_START_SEND: +if (i2c_start_send_async(state->bus, state->data[0])) { +goto release_bus; +} + +state->pos++; +state->state = I2C_ECHO_STATE_ACK; +return; + +case I2C_ECHO_STATE_ACK: +if (state->pos > 2) { +break; +} + +if (i2c_send_async(state->bus, state->data[state->pos++])) { +break; +} + +return; +} + + +i2c_end_transfer(state->bus); +release_bus: +i2c_bus_release(state->bus); + +state->state = I2C_ECHO_STATE_IDLE; +} + +static int i2c_echo_event(I2CSlave *s, enum i2c_event event) +{ +I2CEchoState *state = I2C_ECHO(s); + +switch 
(event) {
+    case I2C_START_RECV:
+        state->pos = 0;
+
+        break;
+
+    case I2C_START_SEND:
+        state->pos = 0;
+
+        break;
+
+    case I2C_FINISH:
+        state->pos = 0;
+        state->state = I2C_ECHO_STATE_START_SEND;
+        i2c_bus_master(state->bus, state->bh);
+
+        break;
+
+    case I2C_NACK:
+        break;
+
+    default:
+        return -1;
+    }
+
+    return 0;
+}
+
+static uint8_t i2c_echo_recv(I2CSlave *s)
+{
+    I2CEchoState *state = I2C_ECHO(s);
+
+    if (state->pos > 2) {
+        return 0xff;
+    }
+
+    return state->data[state->pos++];
+}
+
+static int i2c_echo_send(I2CSlave *s, uint8_t data)
+{
+    I2CEchoState *state = I2C_ECHO(s);
+
+    if (state->pos > 2) {
+        return -1;
+    }
+
+    state->data[state->pos++] = data;
+
+    return 0;
+}
+
+static void i2c_echo_realize(DeviceState *dev, Error **errp)
+{
+    I2CEchoState *state = I2C_ECHO(dev);
+    BusState *bus = qdev_get_parent_bus(dev);
+
+    state->bus = I2C_BUS(bus);
+    state->bh = qemu_bh_new(i2c_echo_bh, state);
+
+    return;
+}
+
+static void i2c_echo_class_init(ObjectClass *oc, void *data)
+{
+    I2CSlaveClass *sc = I2C_SLAVE_CLASS(oc);
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = i2c_echo_realize;
+
+    sc->event = i2c_echo_event;
+    sc->recv = i2c_echo_recv;
+    sc->send = i2c_echo_send;
+}
+
+static const TypeInfo i2c_echo = {
+    .name = TYPE_I2C_ECHO,
+    .parent = TYPE_I2C_SLAVE,
+    .instance_size = sizeof(I2CEchoState),
+    .class_init = i2c_echo_class_init,
+};
+
+static void register_types(void)
+{
+    type_register_static(&i2c_echo);
+}
+
+type_init(register_types);
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 448e14b531..3eb1bda710 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -129,6 +129,8 @@ softmmu_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_rng.c'))
 
 softmmu_ss.add(when: 'CONFIG_GRLIB', if_true: files('grlib_ahb_apb_pnp.c'))
 
+softmmu_ss.add(when: 'CONFIG_ASPEED_SOC', if_true: files('i2c-echo.c'))
+
 specific_ss.add(when: 'CONFIG_AVR_POWER', if_true: files('avr_power.c'))
 specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
-- 
2.39.1
[PATCH 1/8] m25p80: Improve error when the backend file size does not match the device
Currently, when a block backend is attached to an m25p80 device and the associated file size does not match the flash model, QEMU complains with the error message "failed to read the initial flash content". This is confusing for the user. Use blk_check_size_and_read_all() instead of blk_pread() to improve the reported error.

Reviewed-by: Peter Maydell
Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Peter Delevoryas
Reviewed-by: Alistair Francis
Message-Id: <20221115151000.2080833-1-...@kaod.org>
Signed-off-by: Cédric Le Goater
---
breakage with commit a4b15a8b9e ("pflash: Only read non-zero parts of backend image") when using -snapshot.

 hw/block/m25p80.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 802d2eb021..dc5ffbc4ff 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -24,6 +24,7 @@
 #include "qemu/osdep.h"
 #include "qemu/units.h"
 #include "sysemu/block-backend.h"
+#include "hw/block/block.h"
 #include "hw/qdev-properties.h"
 #include "hw/qdev-properties-system.h"
 #include "hw/ssi/ssi.h"
@@ -1615,8 +1616,7 @@ static void m25p80_realize(SSIPeripheral *ss, Error **errp)
         trace_m25p80_binding(s);
         s->storage = blk_blockalign(s->blk, s->size);
 
-        if (blk_pread(s->blk, 0, s->size, s->storage, 0) < 0) {
-            error_setg(errp, "failed to read the initial flash content");
+        if (!blk_check_size_and_read_all(s->blk, s->storage, s->size, errp)) {
             return;
         }
     } else {
-- 
2.39.1
[PATCH 6/8] aspeed/smc: Wire CS lines at reset
It has become difficult to define on the command line the flash devices of the Aspeed machines and their file backends. Currently, a set of default flash devices is created at machine init and drives are associated with the FMC and SPI controller devices in sequence:

  -drive file,format=raw,if=mtd
  -drive file,format=raw,if=mtd
  ...

The CS lines are wired in the same creation loop. On real systems, these flash devices are sometimes soldered to the board, but the models can differ, or a socket is provided to replace the flash device. So it is legitimate not to consider them as always available by default. Some machine options were provided to specify different models, but this has its limits, and the best approach would be to allow the use of block devices, such as:

  -blockdev node-name=fmc0,driver=file,filename=./flash.img \
  -device mx66u51235f,bus=ssi.0,drive=fmc0 \

The first step in that direction is to wire the CS lines of all available devices on a bus at reset time. Let's do that and check the maximum number of devices supported by the bus while at it. The bus parent can now be explicitly defined, but the device order still depends on the command line definitions.
Signed-off-by: Cédric Le Goater
---
 hw/arm/aspeed.c | 4
 hw/ssi/aspeed_smc.c | 24
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 7c28546d7f..21184f3ad4 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -283,7 +283,6 @@ void aspeed_board_init_flashes(AspeedSMCState *s, const char *flashtype,
     for (i = 0; i < count; ++i) {
         DriveInfo *dinfo = drive_get(IF_MTD, 0, unit0 + i);
-        qemu_irq cs_line;
         DeviceState *dev;
 
         dev = qdev_new(flashtype);
@@ -291,9 +290,6 @@ void aspeed_board_init_flashes(AspeedSMCState *s, const char *flashtype,
             qdev_prop_set_drive(dev, "drive", blk_by_legacy_dinfo(dinfo));
         }
         qdev_realize_and_unref(dev, BUS(s->spi), &error_fatal);
-
-        cs_line = qdev_get_gpio_in_named(dev, SSI_GPIO_CS, 0);
-        qdev_connect_gpio_out_named(DEVICE(s), "cs", i, cs_line);
     }
 }
 
diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 7281169322..412cf125d9 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -680,6 +680,28 @@ static void aspeed_smc_flash_update_ctrl(AspeedSMCFlash *fl, uint32_t value)
     aspeed_smc_flash_do_select(fl, unselect);
 }
 
+/*
+ * TODO: assumption is made on the order of creation of devices, the
+ * ones on the command line or the default devices created at machine
+ * init.
+ */
+static void aspeed_smc_wire_cs_lines(AspeedSMCState *s, int cs_max)
+{
+    BusState *b = BUS(s->spi);
+    BusChild *kid;
+
+    QTAILQ_FOREACH(kid, &b->children, sibling) {
+        qemu_irq cs_line = qdev_get_gpio_in_named(kid->child, SSI_GPIO_CS, 0);
+        if (kid->index < cs_max) {
+            qdev_connect_gpio_out_named(DEVICE(s), "cs", kid->index, cs_line);
+        } else {
+            warn_report("Too many devices for SSI bus %s",
+                        object_class_get_name(object_get_class(OBJECT(s))));
+            return;
+        }
+    }
+}
+
 static void aspeed_smc_reset(DeviceState *d)
 {
     AspeedSMCState *s = ASPEED_SMC(d);
@@ -692,6 +714,8 @@ static void aspeed_smc_reset(DeviceState *d)
         memset(s->regs, 0, sizeof s->regs);
     }
 
+    aspeed_smc_wire_cs_lines(s, asc->cs_num_max);
+
     /* Unselect all peripherals */
     for (i = 0; i < asc->cs_num_max; ++i) {
         s->regs[s->r_ctrl0 + i] |= CTRL_CE_STOP_ACTIVE;
-- 
2.39.1
[PATCH 5/8] aspeed/smc: Replace SysBus IRQs with GPIO lines
It's cleaner and removes the curious '+ 1' required to skip the DMA IRQ line of the controller.

Signed-off-by: Cédric Le Goater
---
 hw/arm/aspeed.c | 2 +-
 hw/ssi/aspeed_smc.c | 5 +
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 27dda58338..7c28546d7f 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -293,7 +293,7 @@ void aspeed_board_init_flashes(AspeedSMCState *s, const char *flashtype,
         qdev_realize_and_unref(dev, BUS(s->spi), &error_fatal);
 
         cs_line = qdev_get_gpio_in_named(dev, SSI_GPIO_CS, 0);
-        sysbus_connect_irq(SYS_BUS_DEVICE(s), i + 1, cs_line);
+        qdev_connect_gpio_out_named(DEVICE(s), "cs", i, cs_line);
     }
 }
 
diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 22df4be528..7281169322 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -1134,10 +1134,7 @@ static void aspeed_smc_realize(DeviceState *dev, Error **errp)
 
     /* Setup cs_lines for peripherals */
     s->cs_lines = g_new0(qemu_irq, asc->cs_num_max);
-
-    for (i = 0; i < asc->cs_num_max; ++i) {
-        sysbus_init_irq(sbd, &s->cs_lines[i]);
-    }
+    qdev_init_gpio_out_named(DEVICE(s), s->cs_lines, "cs", asc->cs_num_max);
 
     /* The memory region for the controller registers */
     memory_region_init_io(&s->mmio, OBJECT(s), &aspeed_smc_ops, s,
-- 
2.39.1
[PATCH 4/8] tests/avocado/machine_aspeed.py: Add I2C slave tests
Test extracted from : https://lists.nongnu.org/archive/html/qemu-devel/2022-06/msg00183.html Signed-off-by: Cédric Le Goater --- tests/avocado/machine_aspeed.py | 10 ++ 1 file changed, 10 insertions(+) diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py index ddf05b3617..d2c57ccb7e 100644 --- a/tests/avocado/machine_aspeed.py +++ b/tests/avocado/machine_aspeed.py @@ -199,6 +199,8 @@ def test_arm_ast2600_evb_buildroot(self): 'tmp105,bus=aspeed.i2c.bus.3,address=0x4d,id=tmp-test'); self.vm.add_args('-device', 'ds1338,bus=aspeed.i2c.bus.3,address=0x32'); +self.vm.add_args('-device', + 'i2c-echo,bus=aspeed.i2c.bus.3,address=0x42'); self.do_test_arm_aspeed_buildroot_start(image_path, '0xf00') exec_command_and_wait_for_pattern(self, @@ -217,6 +219,14 @@ def test_arm_ast2600_evb_buildroot(self): year = time.strftime("%Y") exec_command_and_wait_for_pattern(self, 'hwclock -f /dev/rtc1', year); +exec_command_and_wait_for_pattern(self, + 'echo slave-24c02 0x1064 > /sys/bus/i2c/devices/i2c-3/new_device', + 'i2c i2c-3: new_device: Instantiated device slave-24c02 at 0x64'); +exec_command(self, 'i2cset -y 3 0x42 0x64 0x00 0xaa i'); +time.sleep(0.1) +exec_command_and_wait_for_pattern(self, + 'hexdump /sys/bus/i2c/devices/3-1064/slave-eeprom', + '000 ffaa '); self.do_test_arm_aspeed_buildroot_poweroff() -- 2.39.1
[PATCH 0/8] aspeed: I2C fixes, -drive removal (first step)
Hello, This series starts with a first set of patches fixing I2C slave mode in the Aspeed I2C controller, a test device and its associated test in avocado. Follow some cleanups which allow the use of block devices instead of drives. So that, instead of specifying : -drive file=./flash-ast2600-evb,format=raw,if=mtd -drive file=./ast2600-evb.pnor,format=raw,if=mtd ... and guessing from the order which bus the device is attached to, we can use : -blockdev node-name=fmc0,driver=file,filename=./bmc.img -device mx66u51235f,bus=ssi.0,drive=fmc0 -blockdev node-name=fmc1,driver=file,filename=./bmc-alt.img -device mx66u51235f,bus=ssi.0,drive=fmc1 -blockdev node-name=pnor,driver=file,filename=./pnor -device mx66l1g45g,bus=ssi.1,drive=pnor ... It is not perfect, the CS index still depends on the order, but it is now possible to run a machine without -drive ...,if=mtd. This lacks the final patch enabling the '-nodefaults' option by not creating the default devices if specified on the command line. It needs some more evaluation of the possible undesired effects. Thanks, C. Cédric Le Goater (6): m25p80: Improve error when the backend file size does not match the device tests/avocado/machine_aspeed.py: Add I2C slave tests aspeed/smc: Replace SysBus IRQs with GPIO lines aspeed/smc: Wire CS lines at reset aspeed: Introduce a spi_boot region under the SoC aspeed: Add a boot_rom overlap region in the SoC spi_boot container Klaus Jensen (2): hw/i2c: only schedule pending master when bus is idle hw/misc: add a toy i2c echo device include/hw/arm/aspeed_soc.h | 3 + include/hw/i2c/i2c.h| 2 + hw/arm/aspeed.c | 60 ++-- hw/arm/aspeed_ast2600.c | 13 +++ hw/arm/aspeed_soc.c | 14 +++ hw/arm/fby35.c | 8 +- hw/block/m25p80.c | 4 +- hw/i2c/aspeed_i2c.c | 2 + hw/i2c/core.c | 37 +--- hw/misc/i2c-echo.c | 156 hw/ssi/aspeed_smc.c | 29 +- hw/misc/meson.build | 2 + tests/avocado/machine_aspeed.py | 10 ++ 13 files changed, 279 insertions(+), 61 deletions(-) create mode 100644 hw/misc/i2c-echo.c -- 2.39.1
[PATCH v2] block: temporarily hold the new AioContext of bs_top in bdrv_append()
bdrv_append() is called with bs_top AioContext held, but bdrv_attach_child_noperm() could change the AioContext of bs_top. bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()"). bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock is taken, so let's temporarily hold the new AioContext to prevent QEMU from failing in BDRV_POLL_WHILE when it tries to release the wrong AioContext.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209
Reported-by: Aihua Liang
Signed-off-by: Stefano Garzarella
---
v2:
- released the right lock in the error path [Kevin]
- held the new lock until the end of the function [Kevin]
v1: https://lore.kernel.org/qemu-devel/20230214105156.316586-1-sgarz...@redhat.com/
---
 block.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/block.c b/block.c
index aa9062f2c1..036fce19e0 100644
--- a/block.c
+++ b/block.c
@@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
  * child.
  *
  * This function does not create any image files.
+ *
+ * The caller must hold the AioContext lock for @bs_top.
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                 Error **errp)
@@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
     int ret;
     BdrvChild *child;
     Transaction *tran = tran_new();
+    AioContext *old_context, *new_context = NULL;
 
     GLOBAL_STATE_CODE();
 
     assert(!bs_new->backing);
 
+    old_context = bdrv_get_aio_context(bs_top);
+
     child = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
                                      &child_of_bds, bdrv_backing_role(bs_new),
                                      tran, errp);
@@ -5286,6 +5291,19 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
         goto out;
     }
 
+    /*
+     * bdrv_attach_child_noperm could change the AioContext of bs_top.
+     * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's temporarily
+     * hold the new AioContext, since bdrv_drained_begin calls BDRV_POLL_WHILE
+     * that assumes the new lock is taken.
+     */
+    new_context = bdrv_get_aio_context(bs_top);
+
+    if (old_context != new_context) {
+        aio_context_release(old_context);
+        aio_context_acquire(new_context);
+    }
+
     ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp);
     if (ret < 0) {
         goto out;
     }
@@ -5297,6 +5315,11 @@ out:
     bdrv_refresh_limits(bs_top, NULL, NULL);
 
+    if (new_context && old_context != new_context) {
+        aio_context_release(new_context);
+        aio_context_acquire(old_context);
+    }
+
     return ret;
 }
-- 
2.39.1
Re: [PATCH] hbitmap: fix hbitmap_status() return value for first dirty bit case
On 02.02.23 21:15, Andrey Zhadchenko via wrote: The last return statement should return true, as we already evaluated that start == next_dirty Also, fix hbitmap_status() description in header Cc: qemu-sta...@nongnu.org Fixes: a6426475a75 ("block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()") Ohh :/ Signed-off-by: Andrey Zhadchenko Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/qemu/hbitmap.h | 2 +- util/hbitmap.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h index af4e4ab746..8136e33674 100644 --- a/include/qemu/hbitmap.h +++ b/include/qemu/hbitmap.h @@ -330,7 +330,7 @@ bool hbitmap_next_dirty_area(const HBitmap *hb, int64_t start, int64_t end, int64_t *dirty_start, int64_t *dirty_count); /* - * bdrv_dirty_bitmap_status: + * hbitmap_status: * @hb: The HBitmap to operate on * @start: The bit to start from * @count: Number of bits to proceed diff --git a/util/hbitmap.c b/util/hbitmap.c index 297db35fb1..6d6e1b595d 100644 --- a/util/hbitmap.c +++ b/util/hbitmap.c @@ -331,7 +331,7 @@ bool hbitmap_status(const HBitmap *hb, int64_t start, int64_t count, assert(next_zero > start); *pnum = next_zero - start; -return false; +return true; } bool hbitmap_empty(const HBitmap *hb) -- Best regards, Vladimir
Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode
On 14.02.23 17:29, Fiona Ebner wrote:

[..]

[0]: Is there a good way to peek the iterator without doing something like the following (we do know the offset from last time in mirror_iteration(), so that is not an issue)?

offset_from_last_time = bdrv_dirty_iter_next(s->dbi);
...other stuff...
peek = bdrv_dirty_iter_next(s->dbi);
/* Get back to the previous state. */
bdrv_set_dirty_iter(s->dbi, offset_from_last_time);
check = bdrv_dirty_iter_next(s->dbi);
assert(check == offset_from_last_time); // just to be sure

I think that this all should be refactored to use bdrv_dirty_bitmap_next_dirty_area() and keep the "current_offset" instead of "dbi" in MirrorBlockJob. This way further changes will be simpler.

--
Best regards,
Vladimir
Re: MBR plus emulated FAT
On Sat, Jan 28, 2023 at 08:48:13PM +0100, Csepp wrote:
>
> Eric Blake writes:
>
> > On Fri, Jan 27, 2023 at 01:00:05AM +0100, Csepp wrote:
> >> Hi!
> >>
> >> Would it be possible to store the metadata for emulated FAT partitions
> >> backed by host directories? It would make installing Windows 98 much
> >> more seamless, since it would be able to set the boot flag during
> >> install. I have a 9front install that uses no block devices and gets its
> >> root file system via a simple 9P server; I've found that extremely
> >> useful, since it lets me back up or modify files directly from the host.
> >>
> >> I don't have that much free time, but if it wouldn't be too difficult to
> >> implement this and someone helped, I could try to do it myself. But
> >> honestly I would be super thankful if someone else implemented it
> >> instead.
> >
> > I wonder if you can already accomplish this using nbdkit-floppy-plugin
> > (expose a directory as a FAT file system over NBD), then connect qemu
> > as an NBD client, rather than having to implement this all in qemu
> > proper.
>
> Didn't know about that program, thanks! Unfortunately:
>
> The plugin does not support writes.
> https://manpages.debian.org/bullseye/nbdkit/nbdkit-floppy-plugin.1.en.html

(Late reply - Eric just told me about this thread)

You can place the COW filter on top to make this plugin writable. However, that doesn't persist the changes back to the disk (VVFAT is kind of crazy).

$ nbdkit --filter=cow floppy /directory

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many powerful
monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode
On 02.02.23 16:27, Fiona Ebner wrote:

On 02.02.23 at 12:34, Kevin Wolf wrote:

On 02.02.2023 at 11:19, Fiona Ebner wrote:

On 31.01.23 at 19:18, Denis V. Lunev wrote:

Frankly speaking, I would say that this switch could be considered NOT a QEMU job, and we should just send a notification (event) for the completion of each iteration; the management software should take the decision to switch from async mode to the sync one.

My first thought was very similar. We should provide a building block that just switches between the two modes and then the management tool can decide what the right policy is.

Adding a new event when the first iteration is done (I'm not sure if there is much value in having it for later iterations) makes sense to me if someone wants to use it. If we add it, let's not forget that events can be lost and clients must be able to query the same information, so we'd have to add it to query-jobs, too - which in turn requires adding a job-type-specific struct to JobInfo first.

Well, Denis said 2 iterations might be better. But I'm fine with initially adding an event just for the first iteration; further ones can still be added later. Returning the number of completed iterations as part of the mirror-specific job info would anticipate that.

Once we have this generic infrastructure with low-level building blocks, I wouldn't necessarily be opposed to having an option built on top where QEMU automatically does what we consider most useful for most users. auto-finalize/dismiss already do something similar.

Unfortunately, our management software is a bit limited in that regard currently, and making listening for events available in the necessary place would take a bit of work. Having the switch command would nearly be enough for us (we'd just switch after READY). But we'd also need that, when the switch happens after READY, all remaining asynchronous operations are finished by the command.
Otherwise, the original issue with inactivating block drives while mirror still has work remains. Do those semantics for the switch sound acceptable?

Completing the remaining asynchronous operations can take a while, so I don't think it's something to be done in a synchronous QMP command. Do we need an event that tells you that the switch has completed?

Sure, makes sense. Since you said that having an event implies that there will be a possibility to query for the same information, yes ;)

What Denis suggested in the other mail also sounds good to me:

On 02.02.23 at 12:09, Denis V. Lunev wrote:

On 2/2/23 11:19, Fiona Ebner wrote:

Unfortunately, our management software is a bit limited in that regard currently, and making listening for events available in the necessary place would take a bit of work. Having the switch command would nearly be enough for us (we'd just switch after READY). But we'd also need that, when the switch happens after READY, all remaining asynchronous operations are finished by the command.

That could be a matter of the other event, I believe. We switch mode and reset the state. A new READY event will be sent once the bitmap is cleared.

That seems fair. That would avoid adding a new kind of event.

But having to switch the mirror job to sync mode just to avoid doing I/O on an inactive device sounds wrong to me. It doesn't fix the root cause of that problem, but just papers over it.

If you say the root cause is "the job not being completed before switchover", then yes. But if the root cause is "switchover happening while the drive is not actively synced", then a way to switch modes can fix the root cause :)

Why does your management tool not complete the mirror job before it does the migration switchover that inactivates images?
I did talk with my team leader about the possibility, but we decided not to go for it, because it requires doing the migration in two steps with pause-before-switchover and has the potential to increase guest downtime quite a bit. So I went for this approach instead.

Interesting point. Maybe we need a way to automatically complete all the jobs before switchover? There seems to be no reason to break the jobs if the user didn't cancel them (and of course no reason to allow a code path leading to an assertion).

--
Best regards,
Vladimir
Re: [PATCH v3 04/14] hw/char/serial-pci-multi: Factor multi_serial_class_initfn() out
On Mon, Feb 13, 2023 at 7:46 PM Philippe Mathieu-Daudé wrote: > Extract code common to multi_2x_serial_pci_class_initfn() and > multi_4x_serial_pci_class_initfn() to multi_serial_class_initfn(). > > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/char/serial-pci-multi.c | 22 +- > 1 file changed, 13 insertions(+), 9 deletions(-) > > diff --git a/hw/char/serial-pci-multi.c b/hw/char/serial-pci-multi.c > index e56c0bc841..704be5c294 100644 > --- a/hw/char/serial-pci-multi.c > +++ b/hw/char/serial-pci-multi.c > @@ -155,14 +155,14 @@ static Property multi_4x_serial_pci_properties[] = { > DEFINE_PROP_END_OF_LIST(), > }; > > -static void multi_2x_serial_pci_class_initfn(ObjectClass *klass, void > *data) > +static void multi_serial_class_initfn(ObjectClass *klass, void *data) > { > DeviceClass *dc = DEVICE_CLASS(klass); > PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); > + > pc->realize = multi_serial_pci_realize; > pc->exit = multi_serial_pci_exit; > pc->vendor_id = PCI_VENDOR_ID_REDHAT; > -pc->device_id = PCI_DEVICE_ID_REDHAT_SERIAL2; > pc->revision = 1; > pc->class_id = PCI_CLASS_COMMUNICATION_SERIAL; > dc->vmsd = _pci_multi_serial; > @@ -170,19 +170,22 @@ static void > multi_2x_serial_pci_class_initfn(ObjectClass *klass, void *data) > set_bit(DEVICE_CATEGORY_INPUT, dc->categories); > } > > +static void multi_2x_serial_pci_class_initfn(ObjectClass *klass, void > *data) > +{ > +DeviceClass *dc = DEVICE_CLASS(klass); > +PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); > + > +pc->device_id = PCI_DEVICE_ID_REDHAT_SERIAL2; > +device_class_set_props(dc, multi_2x_serial_pci_properties); > +} > + > static void multi_4x_serial_pci_class_initfn(ObjectClass *klass, void > *data) > { > DeviceClass *dc = DEVICE_CLASS(klass); > PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); > -pc->realize = multi_serial_pci_realize; > -pc->exit = multi_serial_pci_exit; > -pc->vendor_id = PCI_VENDOR_ID_REDHAT; > + > pc->device_id = PCI_DEVICE_ID_REDHAT_SERIAL4; > -pc->revision = 1; > -pc->class_id = 
PCI_CLASS_COMMUNICATION_SERIAL; > -dc->vmsd = _pci_multi_serial; > device_class_set_props(dc, multi_4x_serial_pci_properties); > -set_bit(DEVICE_CATEGORY_INPUT, dc->categories); > } > > static void multi_serial_init(Object *o) > @@ -202,6 +205,7 @@ static const TypeInfo multi_serial_pci_types[] = { > .parent = TYPE_PCI_DEVICE, > .instance_size = sizeof(PCIMultiSerialState), > .instance_init = multi_serial_init, > +.class_init = multi_serial_class_initfn, > .abstract = true, > .interfaces = (InterfaceInfo[]) { > { INTERFACE_CONVENTIONAL_PCI_DEVICE }, > -- > 2.38.1 > > > This patch hits an assert for me: qemu-system-x86_64: ../src/qom/object.c:1279: object_class_property_add: Assertion `!object_class_property_find(klass, name)' failed. with the following backtrace: Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted. __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 44return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; (gdb) bt #0 __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x76c91953 in __pthread_kill_internal (signo=6, threadid=) at pthread_kill.c:78 #2 0x76c42ea8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #3 0x76c2c53d in __GI_abort () at abort.c:79 #4 0x76c2c45c in __assert_fail_base (fmt=0x76da5d68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x560b9998 "!object_class_property_find(klass, name)", file=0x560b94be "../src/qom/object.c", line=1279, function=) at assert.c:92 #5 0x76c3b9f6 in __assert_fail (assertion=assertion@entry=0x560b9998 "!object_class_property_find(klass, name)", file=file@entry=0x560b94be "../src/qom/object.c", line=line@entry=1279, function=function@entry=0x560b9d00 <__PRETTY_FUNCTION__.21> "object_class_property_add") at assert.c:101 #6 0x55dbb690 in object_class_property_add (klass=klass@entry=0x569af820, name=name@entry=0x55fbe04b "chardev1", type=0x561230dd 
"str", get=0x55db2ad0 , set=0x55db36d0 , release=0x559f8f90 , opaque=0x563cc900 ) at ../src/qom/object.c:1279 #7 0x55db3e6d in qdev_class_add_property (prop=0x563cc900 , name=0x55fbe04b "chardev1", klass=0x569af820) at ../src/hw/core/qdev-properties.c:889 #8 device_class_set_props (dc=0x569af820, props=) at ../src/hw/core/qdev-properties.c:955 #9 0x55dba590 in type_initialize (ti=0x567f4840) at ../src/qom/object.c:1094 #10 object_class_foreach_tramp (key=, value=0x567f4840, opaque=0x7fffe260) at ../src/qom/object.c:1081 #11 0x770bcda8 in g_hash_table_foreach
Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode
On 02.02.23 18:23, Kevin Wolf wrote:

On 02.02.2023 at 14:35, Denis V. Lunev wrote:

On 2/2/23 14:27, Fiona Ebner wrote:

On 02.02.23 at 12:34, Kevin Wolf wrote:

On 02.02.2023 at 11:19, Fiona Ebner wrote:

On 31.01.23 at 19:18, Denis V. Lunev wrote:

Frankly speaking, I would say that this switch could be considered NOT a QEMU job, and we should just send a notification (event) for the completion of each iteration; the management software should take the decision to switch from async mode to the sync one.

My first thought was very similar. We should provide a building block that just switches between the two modes and then the management tool can decide what the right policy is. Adding a new event when the first iteration is done (I'm not sure if there is much value in having it for later iterations) makes sense to me if someone wants to use it. If we add it, let's not forget that events can be lost and clients must be able to query the same information, so we'd have to add it to query-jobs, too - which in turn requires adding a job-type-specific struct to JobInfo first.

Well, Denis said 2 iterations might be better. But I'm fine with initially adding an event just for the first iteration; further ones can still be added later. Returning the number of completed iterations as part of the mirror-specific job info would anticipate that.

Maybe it would be better to have an event on each iteration, plus make the iteration count available over a block status query.

In the ready phase, each iteration can be very short. Basically, if the guest writes to one block and then the mirror catches up, that's a whole iteration. So if the guest is doing I/O at a moderate rate so that the host can keep up with it, you might end up with one QMP event per I/O request.

I think that after the first iteration the only physical parameters are data_sent and remaining_dirty. The number of additional iterations doesn't matter - we just loop through dirty bits.
I'm not even sure that first-iteration completion has a physical meaning. Probably "the whole disk touched", so we can make a more reasonable prediction about further speed and convergence.

--
Best regards,
Vladimir
Re: [PATCH 1/3] migration: In case of postcopy, the memory ends in res_postcopy_only
Vladimir Sementsov-Ogievskiy wrote:
> On 09.02.23 21:10, Juan Quintela wrote:
>> Vladimir Sementsov-Ogievskiy wrote:
>>> On 08.02.23 16:57, Juan Quintela wrote:
>>>> So remove the last assignment of res_compatible.
>
> I think that the order of the logic and documentation changes since introducing _estimate is a bit confusing.
>
> But I agree now that we are safe to unite the old compat and the old postcopy_only into one variable, as we want only:
>
> 1. the total sum, to probably go to migration_completion()
> 2. pend_pre, to probably go to postcopy_start()
>
> So, the patch is OK, and it seems it changes absolutely nothing in the logic. Thanks for the explanations!
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy

Thanks.

You are welcome.
Re: [PATCH 3/3] migration: Remove _only suffix for res_postcopy/precopy
On 08.02.23 16:57, Juan Quintela wrote: Once that res_compatible is removed, they don't make sense anymore. Signed-off-by: Juan Quintela --- include/migration/register.h | 18 -- migration/savevm.h | 8 hw/s390x/s390-stattrib.c | 7 +++ hw/vfio/migration.c| 10 -- migration/block-dirty-bitmap.c | 6 +++--- migration/block.c | 7 +++ migration/ram.c| 18 -- migration/savevm.c | 24 ++-- 8 files changed, 43 insertions(+), 55 deletions(-) diff --git a/include/migration/register.h b/include/migration/register.h index a958a92a0f..4a4a6d7174 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -47,22 +47,20 @@ typedef struct SaveVMHandlers { /* This runs outside the iothread lock! */ int (*save_setup)(QEMUFile *f, void *opaque); /* Note for save_live_pending: - * - res_precopy_only is for data which must be migrated in precopy phase + * - res_precopy is for data which must be migrated in precopy phase * or in stopped state, in other words - before target vm start - * - res_postcopy_only is for data which must be migrated in postcopy phase + * - res_postcopy is for data which must be migrated in postcopy phase * or in stopped state, in other words - after source vm stop That's now wrong. "postcopy" is everything except "precopy", as it includes "compat". Really, for RAM, it can be copied in precopy too, and it is copied in precopy until user run command migrate-start-postcopy. (In contrast: block-dirty-bitmap cannot migrate in precopy at all, it migrate only in stopped state or in postcopy). 
So, finally:

"precopy" definition:
 - must be migrated in precopy or in stopped state
 - in other words: must be migrated before target start
 - in other words: can't be migrated in postcopy
 - in other words: can't be migrated after target start

"postcopy" definition:
 - can migrate in postcopy
 - in other words: can migrate after target start

some properties:
 - probably can be migrated in precopy (like RAM), or, may be not (like
   block-dirty-bitmap)
 - of course, can be migrated in stopped state

To be absolutely clear, we may rename them to "not_postcopyable" and
"postcopyable".

--
Best regards,
Vladimir
Re: [PATCH] block: temporarily hold the new AioContext of bs_top in bdrv_append()
On Tue, Feb 14, 2023 at 3:06 PM Kevin Wolf wrote: > > Am 14.02.2023 um 13:22 hat Stefano Garzarella geschrieben: > > On Tue, Feb 14, 2023 at 12:56 PM Kevin Wolf wrote: > > > > > > Am 14.02.2023 um 11:51 hat Stefano Garzarella geschrieben: > > > > bdrv_append() is called with bs_top AioContext held, but > > > > bdrv_attach_child_noperm() could change the AioContext of bs_top. > > > > > > > > bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from > > > > commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()"). > > > > bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock > > > > is taken, so let's temporarily hold the new AioContext to prevent QEMU > > > > from failing in BDRV_POLL_WHILE when it tries to release the wrong > > > > AioContext. > > > > > > > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209 > > > > Reported-by: Aihua Liang > > > > Signed-off-by: Stefano Garzarella > > > > --- > > > > I'm not sure whether to use the following Fixes tag. That commit added > > > > the > > > > calls to bdrv_drained_begin() in bdrv_replace_node_noperm(), but maybe > > > > the > > > > problem was pre-existing. > > > > > > > > Fixes: 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()") > > > > > > > > Note: a local reproducer is attached in the BZ, it is based on the > > > > Aihua Liang > > > > report and it hits the issue with a 20% ratio. > > > > --- > > > > block.c | 23 +++ > > > > 1 file changed, 23 insertions(+) > > > > > > > > diff --git a/block.c b/block.c > > > > index aa9062f2c1..0e2bc11e0b 100644 > > > > --- a/block.c > > > > +++ b/block.c > > > > @@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error > > > > **errp) > > > > * child. > > > > * > > > > * This function does not create any image files. > > > > + * > > > > + * The caller must hold the AioContext lock for @bs_top. 
> > > > */ > > > > int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top, > > > > Error **errp) > > > > @@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, > > > > BlockDriverState *bs_top, > > > > int ret; > > > > BdrvChild *child; > > > > Transaction *tran = tran_new(); > > > > +AioContext *old_context, *new_context; > > > > > > > > GLOBAL_STATE_CODE(); > > > > > > > > assert(!bs_new->backing); > > > > > > > > +old_context = bdrv_get_aio_context(bs_top); > > > > + > > > > child = bdrv_attach_child_noperm(bs_new, bs_top, "backing", > > > > _of_bds, > > > > bdrv_backing_role(bs_new), > > > > tran, errp); > > > > @@ -5286,11 +5291,29 @@ int bdrv_append(BlockDriverState *bs_new, > > > > BlockDriverState *bs_top, > > > > goto out; > > > > } > > > > > > > > +/* > > > > + * bdrv_attach_child_noperm could change the AioContext of bs_top. > > > > + * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's > > > > temporarily > > > > + * hold the new AioContext, since bdrv_drained_begin calls > > > > BDRV_POLL_WHILE > > > > + * that assumes the new lock is taken. > > > > + */ > > > > +new_context = bdrv_get_aio_context(bs_top); > > > > + > > > > +if (old_context != new_context) { > > > > +aio_context_release(old_context); > > > > +aio_context_acquire(new_context); > > > > +} > > > > + > > > > ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp); > > > > if (ret < 0) { > > > > goto out; > > > > > > If we take the error path, we return with new_context locked instead of > > > old_context now. > > > > Grr, I'm blind... > > > > > > > > > } > > > > > > > > +if (old_context != new_context) { > > > > +aio_context_release(new_context); > > > > +aio_context_acquire(old_context); > > > > +} > > > > + > > > > ret = bdrv_refresh_perms(bs_new, tran, errp); > > > > out: > > > > tran_finalize(tran, ret); > > > > > > Strictly speaking, don't we need to hold the lock across > > > tran_finalize(), too? 
It completes the bdrv_replace_node_noperm() call > > > you covered above. > > > > Right! > > > > > > > > Maybe bdrv_refresh_perms() and bdrv_refresh_limits(), too, in fact. We > > > never clearly defined which functions need the lock and which don't, so > > > hard to tell. > > > > Okay, so to be on the safe side, I'll switch them back just before return. > > > > > It's really time to get rid of it. > > > > How could one disagree? :-) > > > > What about the Fixes tag? Should I include it? > > I'm not sure. Before the patch, bdrv_replace_child_noperm() had a drain > which could have caused the same kind of problems. I see. > But we're now > draining two BDSes, maybe that is the relevant difference. I guess we've > always had a bug, but that commit made it more likely to actually > trigger? Yes, my same doubt. I also guess it was
Re: [PATCH 2/3] migration: Remove unused res_compatible
On 08.02.23 16:57, Juan Quintela wrote:
> {
> -    uint64_t pend_pre, pend_compat, pend_post;
> +    uint64_t pend_pre, pend_post;
>      bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
> -    qemu_savevm_state_pending_estimate(&pend_pre, &pend_compat, &pend_post);
> -    uint64_t pending_size = pend_pre + pend_compat + pend_post;
> +    qemu_savevm_state_pending_estimate(&pend_pre, &pend_post);
> +    uint64_t pending_size = pend_pre + pend_post;

Mixed declarations are "generally not allowed" by devel/style.rst.
Preexisting, but we may fix it now.

Anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy

--
Best regards,
Vladimir
Re: [PATCH 1/3] migration: In case of postcopy, the memory ends in res_postcopy_only
On 09.02.23 21:10, Juan Quintela wrote: Vladimir Sementsov-Ogievskiy wrote: On 08.02.23 16:57, Juan Quintela wrote: So remove last assignation of res_compatible. I hoped for some description when asked to split it out :) Signed-off-by: Juan Quintela --- migration/ram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index b966e148c2..85ccbf88ad 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3474,7 +3474,7 @@ static void ram_state_pending_exact(void *opaque, if (migrate_postcopy_ram()) { /* We can do postcopy, and all the data is postcopiable */ -*res_compatible += remaining_size; +*res_postcopy_only += remaining_size; Actually, these "remaining_size" bytes are still compatible, i.e. we can migrate these pending bytes in pre-copy, and we actually do it, until user call migrate-start-postcopy, yes? But we exploit the fact that, this change don't affect any logic, just name becomes wrong.. Yes? Or I don't follow:/ My definition of the fields is: how are we going to transfer that bytes. if they are on res_precopy_only, we transfer them with precopy, if they are on res_postocpy_only, we transfer them with postcopy. So, the rest of RAM, if we are in postcopy, we sent it with postcopy, and if we are in precopy, we sent them with precopy. See the whole code. This is the _estimate function. uint64_t remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE; if (migrate_postcopy_ram()) { /* We can do postcopy, and all the data is postcopiable */ *res_postcopy_only += remaining_size; } else { *res_precopy_only += remaining_size; } After the change, _exact does exactly the same. The caller (migration_iteration_run()) does this (I remove traces and things that don't matter for this). This is before the change. Remember: in precopy, we add res_compat to pend_pre, and in postcopy to pend_post. uint64_t pending_size = pend_pre + pend_compat + pend_post; ### pending_size is the sum of the three, so it doesn't matter. 
    if (pend_pre + pend_compat <= s->threshold_size) {

### In precopy, we add pend_compat to pend_pre, so we are ok.
### In postcopy, we add the data to pend_postcopy, but that is right,
### because to calculate the downtime, we only care about what we have
### to transfer with precopy, in particular, we aren't going to send
### more ram, so it is ok that it is in pend_post.

        qemu_savevm_state_pending_exact(&pend_pre, &pend_compat, &pend_post);
        pending_size = pend_pre + pend_compat + pend_post;
    }

    if (!pending_size || pending_size < s->threshold_size) {
        migration_completion(s);
        return MIG_ITERATE_BREAK;
    }

    /* Still a significant amount to transfer */
    if (!in_postcopy && pend_pre <= s->threshold_size &&
        qatomic_read(&s->start_postcopy)) {

### This is what I mean. See how we only use pend_pre to decide if we
### are entering postcopy.

        if (postcopy_start(s)) {
            error_report("%s: postcopy failed to start", __func__);
        }
        return MIG_ITERATE_SKIP;
    }

So the only "behaviour" change that we can say we are having is that we are a little bit more aggressive on calling qemu_savevm_state_pending_exact(). Actually, this one was already "changed", as _estimate never returns compat other than zero. So, the patch really changes nothing, but I will argue that the new behaviour is the right one.

What do you think?

I think, that the order of logic and documentation changing since introducing _estimate is a bit confused.

But I agree now, that we are safe to unite old compat and old postcopy_only into one variable, as we want only

1. the total sum, to probably go to migration_completion()
2. pend_pre to probably go to postcopy_start()

So, patch is OK, and seems it changes absolutely nothing in logic. Thanks for explanations!

Reviewed-by: Vladimir Sementsov-Ogievskiy

--
Best regards,
Vladimir
Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode
Am 02.02.23 um 12:34 schrieb Kevin Wolf: > Am 02.02.2023 um 11:19 hat Fiona Ebner geschrieben: >> Am 31.01.23 um 19:18 schrieb Denis V. Lunev: >>> Frankly speaking I would say that this switch could be considered >>> NOT QEMU job and we should just send a notification (event) for the >>> completion of the each iteration and management software should >>> take a decision to switch from async mode to the sync one. > > My first thought was very similar. We should provide a building block > that just switches between the two modes and then the management tool > can decide what the right policy is. > > Adding a new event when the first iteration is done (I'm not sure if > there is much value in having it for later iterations) makes sense to > me if someone wants to use it. If we add it, let's not forget that > events can be lost and clients must be able to query the same > information, so we'd have to add it to query-jobs, too - which in turn > requires adding a job type specific struct to JobInfo first. When exactly should an iteration loop be considered finished? An idea would be to detect the last call to mirror_perform() in mirror_iteration(), mark the corresponding operation with a new flag, and trigger the event once mirror_iteration_done() is called with that operation. To implement it, I'd peek (below[0] should make it clear what I mean by "peek") the dirty iterator in the beginning of mirror_iteration() after computing nb_chunks. If peeking returns -1, we are in the final batch. Then in the loop where mirror_perform() is called, we need to figure out when the last call for that batch is. But the loop abort condition (seemingly?) depends on the result of mirror_perform(), so that might get a bit involved. I didn't think about it in detail yet, because I first wanted to ask if this is the approach to go for. An alternative would be to have an event when the iteration loop was restarted rather than when the iteration loop is finished, i.e. 
triggering the event in mirror_iteration() when the dirty iterator is reset. This is simpler, but it does not trigger if there are no writes to the source at all and otherwise it (most likely) triggers while there still are pending operations from the current iteration loop.

What do you think?

[0]: Is there a good way to peek the iterator without doing something like the following (we do know the offset from last time in mirror_iteration(), so that is not an issue)?

> offset_from_last_time = bdrv_dirty_iter_next(s->dbi);
> ...other stuff...
> peek = bdrv_dirty_iter_next(s->dbi);
> /* Get back to the previous state. */
> bdrv_set_dirty_iter(s->dbi, offset_from_last_time);
> check = bdrv_dirty_iter_next(s->dbi);
> assert(check == offset_from_last_time); // just to be sure

Best Regards,
Fiona
[PATCH] block: temporarily hold the new AioContext of bs_top in bdrv_append()
bdrv_append() is called with bs_top AioContext held, but
bdrv_attach_child_noperm() could change the AioContext of bs_top.

bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from
commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()").
bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock
is taken, so let's temporarily hold the new AioContext to prevent QEMU
from failing in BDRV_POLL_WHILE when it tries to release the wrong
AioContext.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209
Reported-by: Aihua Liang
Signed-off-by: Stefano Garzarella
---
I'm not sure whether to use the following Fixes tag. That commit added the
calls to bdrv_drained_begin() in bdrv_replace_node_noperm(), but maybe the
problem was pre-existing.

Fixes: 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()")

Note: a local reproducer is attached in the BZ, it is based on the Aihua Liang
report and it hits the issue with a 20% ratio.
---
 block.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/block.c b/block.c
index aa9062f2c1..0e2bc11e0b 100644
--- a/block.c
+++ b/block.c
@@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
  * child.
  *
  * This function does not create any image files.
+ *
+ * The caller must hold the AioContext lock for @bs_top.
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
                 Error **errp)
@@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
     int ret;
     BdrvChild *child;
     Transaction *tran = tran_new();
+    AioContext *old_context, *new_context;

     GLOBAL_STATE_CODE();

     assert(!bs_new->backing);

+    old_context = bdrv_get_aio_context(bs_top);
+
     child = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
                                      &child_of_bds, bdrv_backing_role(bs_new),
                                      tran, errp);
@@ -5286,11 +5291,29 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
         goto out;
     }

+    /*
+     * bdrv_attach_child_noperm could change the AioContext of bs_top.
+     * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's temporarily
+     * hold the new AioContext, since bdrv_drained_begin calls BDRV_POLL_WHILE
+     * that assumes the new lock is taken.
+     */
+    new_context = bdrv_get_aio_context(bs_top);
+
+    if (old_context != new_context) {
+        aio_context_release(old_context);
+        aio_context_acquire(new_context);
+    }
+
     ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp);
     if (ret < 0) {
         goto out;
     }

+    if (old_context != new_context) {
+        aio_context_release(new_context);
+        aio_context_acquire(old_context);
+    }
+
     ret = bdrv_refresh_perms(bs_new, tran, errp);
 out:
     tran_finalize(tran, ret);
--
2.39.1
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, Feb 14, 2023 at 03:03:54PM +0100, Paolo Bonzini wrote: > On Tue, Feb 14, 2023 at 12:49 PM Daniel P. Berrangé > wrote: > > [quote] > > The motivation for this series is that Python 3.6 was EOL at the end of > > 2021; upstream tools are beginning to drop support for it, including > > setuptools, pylint, mypy, etc. As time goes by, it becomes more > > difficult to support and test against the full range of Python versions > > that QEMU supports. The closer we get to Python 3.12, the harder it will > > be to cover that full spread of versions. > > [/quote] > > > > this is all about new/eol versions of software upstream, and I don't > > think that's a justification. QEMU explicitly aims to use distro provided > > versions and upstream EOL status is not relevant in that context. Even > > if using "pip" to install it is possible to limit yourself to upstream > > releases which still support 3.6. > > > > There is the separate issue of Meson dropping python 3.6 which motivates > > Paolo's series. Again though, we don't have to increase our minimum meson > > version, because meson is working today. It is our choice to to increase > > it to use latest available meson features. At some point we can decide > > what we have is good enough and we don't have to keep chasing the latest > > features. Maybe we're not there yet, but we should think about when that > > would be. > > In the case of Meson, the main advantage is moving _all_ of the > emulator configury out of the configure script. This requires > add_global_dependencies which was added in 0.63. So in that case it > is indeed mostly about shiny new features and it's not absolutely > necessary. > > In the case of Python the issue is not the interpreter per se, though > there are a couple new feature in Python 3.7 that are quite nice (for > example improved data classes[1] or context variables[2]). The main > problem as far as I understood (and have seen in my experience) is > linting tools. 
New versions fix bugs that caused false positives, but > also become more strict at the same time. The newer versions at the > same time are very quick at dropping support for old versions of > Python; while older versions sometimes throw deprecation warnings on > new versions of Python. This makes it very hard to support a single > version of, say, mypy that works on all versions from RHEL8 and SLE15 > to Fedora 38 and Ubuntu 23.04. > > [1] https://peps.python.org/pep-0557/ > [2] https://peps.python.org/pep-0567/ > > In fact this issue is the reason why RHEL9 does not package any of > these tools and does not run them as part of building RPMs even though > in principle it would be a good idea; it's too much of a pain to have > a single version that works across all the packages in the > distribution. > > Regarding your other suggestion: > > > * For non-native library/applications dependancies we aim > > to support only the most recent distro version. Users > > of older distros may need to dynamically fetch newer > > deps. > > I think this is a good idea, but one issue with "only supporting the > most recent distro version" is SUSE. While the most recent version of > SLE is about 5 years old, there is no next version in sight---SUSE > instead is working on their "Adaptable Linux Platform", but it's still > in the prototype stage[3]. So alternatively we could put a 4 or 5 year > cutoff after which you need to fetch newer deps. Considering the > delays between freeze and release of distros like RHEL or SLE, in > practice we would probably keep Python versions supported for 6-7 > years. Yeah, that kind of problem with very old SUSE would push towards simply excluding the LTS distros, or excluding them if they're older than N years, and expect users of such old distros to download newer python modules, etc. 
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: [PATCH] block: temporarily hold the new AioContext of bs_top in bdrv_append()
On Tue, Feb 14, 2023 at 12:56 PM Kevin Wolf wrote: > > Am 14.02.2023 um 11:51 hat Stefano Garzarella geschrieben: > > bdrv_append() is called with bs_top AioContext held, but > > bdrv_attach_child_noperm() could change the AioContext of bs_top. > > > > bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from > > commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()"). > > bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock > > is taken, so let's temporarily hold the new AioContext to prevent QEMU > > from failing in BDRV_POLL_WHILE when it tries to release the wrong > > AioContext. > > > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209 > > Reported-by: Aihua Liang > > Signed-off-by: Stefano Garzarella > > --- > > I'm not sure whether to use the following Fixes tag. That commit added the > > calls to bdrv_drained_begin() in bdrv_replace_node_noperm(), but maybe the > > problem was pre-existing. > > > > Fixes: 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()") > > > > Note: a local reproducer is attached in the BZ, it is based on the Aihua > > Liang > > report and it hits the issue with a 20% ratio. > > --- > > block.c | 23 +++ > > 1 file changed, 23 insertions(+) > > > > diff --git a/block.c b/block.c > > index aa9062f2c1..0e2bc11e0b 100644 > > --- a/block.c > > +++ b/block.c > > @@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error > > **errp) > > * child. > > * > > * This function does not create any image files. > > + * > > + * The caller must hold the AioContext lock for @bs_top. 
> > */ > > int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top, > > Error **errp) > > @@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, > > BlockDriverState *bs_top, > > int ret; > > BdrvChild *child; > > Transaction *tran = tran_new(); > > +AioContext *old_context, *new_context; > > > > GLOBAL_STATE_CODE(); > > > > assert(!bs_new->backing); > > > > +old_context = bdrv_get_aio_context(bs_top); > > + > > child = bdrv_attach_child_noperm(bs_new, bs_top, "backing", > > _of_bds, > > bdrv_backing_role(bs_new), > > tran, errp); > > @@ -5286,11 +5291,29 @@ int bdrv_append(BlockDriverState *bs_new, > > BlockDriverState *bs_top, > > goto out; > > } > > > > +/* > > + * bdrv_attach_child_noperm could change the AioContext of bs_top. > > + * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's > > temporarily > > + * hold the new AioContext, since bdrv_drained_begin calls > > BDRV_POLL_WHILE > > + * that assumes the new lock is taken. > > + */ > > +new_context = bdrv_get_aio_context(bs_top); > > + > > +if (old_context != new_context) { > > +aio_context_release(old_context); > > +aio_context_acquire(new_context); > > +} > > + > > ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp); > > if (ret < 0) { > > goto out; > > If we take the error path, we return with new_context locked instead of > old_context now. Grr, I'm blind... > > > } > > > > +if (old_context != new_context) { > > +aio_context_release(new_context); > > +aio_context_acquire(old_context); > > +} > > + > > ret = bdrv_refresh_perms(bs_new, tran, errp); > > out: > > tran_finalize(tran, ret); > > Strictly speaking, don't we need to hold the lock across > tran_finalize(), too? It completes the bdrv_replace_node_noperm() call > you covered above. Right! > > Maybe bdrv_refresh_perms() and bdrv_refresh_limits(), too, in fact. We > never clearly defined which functions need the lock and which don't, so > hard to tell. 
Okay, so to be on the safe side, I'll switch them back just before return. > It's really time to get rid of it. How could one disagree? :-) What about the Fixes tag? Should I include it? Thanks, Stefano
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
Thomas Huth writes: > On 14/02/2023 08.40, Markus Armbruster wrote: >> Daniel P. Berrangé writes: >> [...] >> >>> We don't have to drop python 3.6. It is a choice because >>> of a desire to be able to use some shiny new python >>> features without caring about back compat. >> I read this on Friday, and decided to let it sit until after the >> weekend. Well, it's now Tuesday, and to be frank, it's still as >> offensively flippant as it was on Friday. It shows either ignorance of >> or cavalier disregard for the sheer amount of work some of us have had >> to put into keeping old versions of Python viable. > > I'm a complete python ignorant, too, so I'm a little bit surprised of > the amount of pain that these scripts are causing. > > No matter of that fact, I think Peter still has a point that we have a > real conflict here with our current support policy. So this either > means that Python was the wrong choice for our needs (since it is > moving too fast and causing too much friction), or we should really > rethink our support policy. > > I guess we're too deep into the Python rabbit hole already, and I'm > not aware of any other good solutions (back to Perl scripts? No, > thanks!), so it's likely quite impossible to tune that knob. > > Thus we should maybe really start talking about our support policy > now. I think the main problem is likely the sentence "Support for the > previous major version will be dropped 2 years after the new major > version is released". Maybe we should shorten that time frame to 1 > year. I think this should be a fair approach. Generally I recommend avoiding installing a new LTS until at least the first tick release has ironed out the obvious bugs. A year seems like a fair grace period to update to the next LTS. Those that like sitting on old maintained LTS releases are less likely to be tracking the bleeding edge of development anyway and likely are happy with their distro provided packages. 
BTW my next testing/next finally updates the last few Ubuntu 20.04 to 22.04 systems which also allows removing a few tsan and clang hacks in the process. Progress might not be a straight line but it averages out in that approximate direction ;-) > The 2 years caused some confusions in the past already, since > e.g. Debian only supports the previous major release for only one more > year, and macOS also releases a major version each year ... so IMHO we > could shorten the time frame for the previous major release to 1 year > instead. People then could still continue building QEMU on CentOS 8, > but they have to be aware that they might install other software like > Sphinx manually if they want to continue using QEMU with docs there. > What do you think? Works for me at least. -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH 3/4] hw: Use qdev_get_parent_bus() in qdev_get_own_fw_dev_path_from_handler()
On 14/2/23 00:29, Richard Henderson wrote:
> On 2/12/23 12:47, Philippe Mathieu-Daudé wrote:
>> -static char *qdev_get_fw_dev_path_from_handler(BusState *bus, DeviceState *dev)
>> +static char *qdev_get_fw_dev_path_from_handler(DeviceState *dev)
>>  {
>>      Object *obj = OBJECT(dev);
>> +    BusState *bus = qdev_get_parent_bus(dev);
>>      char *d = NULL;
>>      while (!d && obj->parent) {
>
> This is a separate change from...
>
>> -char *qdev_get_own_fw_dev_path_from_handler(BusState *bus, DeviceState *dev)
>> +char *qdev_get_own_fw_dev_path_from_handler(DeviceState *dev)
>>  {
>>      Object *obj = OBJECT(dev);
>> -    return fw_path_provider_try_get_dev_path(obj, bus, dev);
>> +    return fw_path_provider_try_get_dev_path(obj, qdev_get_parent_bus(dev), dev);
>
> ... this, which is what $SUBJECT says.
>
>> @@ -67,7 +68,7 @@ static int qdev_get_fw_dev_path_helper(DeviceState *dev, char *p, int size)
>>      if (dev && dev->parent_bus) {
>>          char *d;
>>          l = qdev_get_fw_dev_path_helper(dev->parent_bus->parent, p, size);
>> -        d = qdev_get_fw_dev_path_from_handler(dev->parent_bus, dev);
>> +        d = qdev_get_fw_dev_path_from_handler(dev);
>
> We've already accessed parent_bus just above
>
>>          if (!d) {
>>              d = bus_get_fw_dev_path(dev->parent_bus, dev);
>
> ... and just below. So, what's the cleanup?

qdev_get_own_fw_dev_path_from_handler() being a public API, I wanted to
clean it to avoid a funny case when it is called with bus !=
qdev_get_parent_bus(dev).

Maybe I merged 2 patches in one, I'll revisit. Or I can just add
assert(bus == qdev_get_parent_bus(dev)) to prove the API is convoluted.

I'll reword before respin.
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, 14 Feb 2023 at 07:40, Markus Armbruster wrote:
> I read this on Friday, and decided to let it sit until after the
> weekend. Well, it's now Tuesday, and to be frank, it's still as
> offensively flippant as it was on Friday. It shows either ignorance of
> or cavalier disregard for the sheer amount of work some of us have had
> to put into keeping old versions of Python viable.

From my point of view it is definitely ignorance. I simply didn't expect
that a serious major scripting language would be making it this hard to
write code that is backwards compatible with older but not ancient
versions.

thanks
-- PMM
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
Daniel P. Berrangé writes: [...] > We don't have to drop python 3.6. It is a choice because > of a desire to be able to use some shiny new python > features without caring about back compat. I read this on Friday, and decided to let it sit until after the weekend. Well, it's now Tuesday, and to be frank, it's still as offensively flippant as it was on Friday. It shows either ignorance of or cavalier disregard for the sheer amount of work some of us have had to put into keeping old versions of Python viable. The latter would be quite unlike you, so it must be the former. John has sunk *man-months* into keeping old versions of Python viable. I've put in a lot less myself, but still enough to be painfully aware of it. I figure Cleber and Beraldo are somewhere in between. Insinuating John's proposal is motivated by "a desire to be able to use some shiny new python features without caring about back compat" disrespects all this work. We should have a sober discussion on which versions of Python to work with, and the tradeoffs involved. But before I engage in that, I insist on resetting the frame: no, this is not about shiny, new Python features. It is about stopping the bleeding. It is about reducing what feels more and more like bullshit work to me, so we can actually accomplish stuff that matters. And let's give the people who have been doing the actual work the benefit of the doubt. [...]
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, Feb 14, 2023 at 09:35:44AM +0100, Thomas Huth wrote: > On 14/02/2023 08.40, Markus Armbruster wrote: > > Daniel P. Berrangé writes: > > > > [...] > > > > > We don't have to drop python 3.6. It is a choice because > > > of a desire to be able to use some shiny new python > > > features without caring about back compat. > > > > I read this on Friday, and decided to let it sit until after the > > weekend. Well, it's now Tuesday, and to be frank, it's still as > > offensively flippant as it was on Friday. It shows either ignorance of > > or cavalier disregard for the sheer amount of work some of us have had > > to put into keeping old versions of Python viable. > > I'm a complete python ignorant, too, so I'm a little bit surprised of the > amount of pain that these scripts are causing. > > No matter of that fact, I think Peter still has a point that we have a real > conflict here with our current support policy. So this either means that > Python was the wrong choice for our needs (since it is moving too fast and > causing too much friction), or we should really rethink our support policy. > > I guess we're too deep into the Python rabbit hole already, and I'm not > aware of any other good solutions (back to Perl scripts? No, thanks!), so > it's likely quite impossible to tune that knob. I still believe python is a probably the best thing for what we're using it for. Certainly would not suggest shell or perl, and using a compiled language would add its own complications for cross compilation. > Thus we should maybe really start talking about our support policy now. I > think the main problem is likely the sentence "Support for the previous > major version will be dropped 2 years after the new major version is > released". Maybe we should shorten that time frame to 1 year. The 2 years > caused some confusions in the past already, since e.g. 
> Debian only supports the previous major release for only one more year,
> and macOS also releases a major version each year ... so IMHO we could
> shorten the time frame for the previous major release to 1 year instead.
> People then could still continue building QEMU on CentOS 8, but they have
> to be aware that they might install other software like Sphinx manually
> if they want to continue using QEMU with docs there. What do you think?

I think perhaps the problem is not in the length of time defined by our support policy, but rather that we're facing a rather different reality to the one we've historically been used to, where distros are no longer critical dependencies and our support policy does not reflect that.

For any C/C++ application, wanting to target the versions shipped in a distro has been pretty much normal practice. C has not ever come with a standard package manager toolset; the distros service that role. The distros also aren't generally a fan of shipping multiple versions of C libs in parallel.

Pretty much every non-C library though is different. They all have their own package manager service / tools (perl has cpan, python has PyPI/pip, ruby has gems). With the latest compiled languages like Go/Rust, this has gone one step further and is natively integrated into the compiler toolchain as standard. IOW, for everything except C, it has become increasingly normal practice to ignore the distro and dynamically download all the deps your application needs into a self-contained local environment.

Now, the distros aren't especially a fan of this new world, since they still prefer to unbundle all these deps, but I think that approach is increasingly difficult for them to achieve because the majority of upstreams don't care for the distro versions. Thus what we're experiencing is a clash between the traditional way that C applications/libraries deal with their deps, vs the way pretty much every other language deals with their deps in the modern world.
It has come up now because we're making much more use of python now than we did in the past. Our support policy is written from the POV of the C world, and merely reducing the length of time we support a distro does not address the different world view of Python. Should we instead try to be more explicit about the different needs of the non-C dependencies? We could for example say

* For native library/application dependencies we aim to support the two most recent distro versions, for 2 years overlap

* For non-native library/application dependencies we aim to support only the most recent distro version. Users of older distros may need to dynamically fetch newer deps.

The python 3.8 runtime would be considered a native dep, so fall under the 2 distro versions rule. This is fine with CentOS 8, since it provides newer python runtime versions. The python libraries, or tools written in python (meson), would fall under the second rule, and so only need to target one distro version. This would be compatible with CentOS 8, as the users would be expected to download extra
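The "self-contained local environment" approach can be sketched with nothing but the Python standard library (the path here is made up for illustration; in practice pinned deps would then be pip-installed into the environment instead of relying on distro packages):

```python
import os
import tempfile
import venv

# Create an isolated environment prefix with the stdlib venv module.
# Installing project-pinned dependencies into it (omitted here) keeps the
# application independent of whatever versions the distro ships.
def make_env(path):
    venv.EnvBuilder(with_pip=False).create(path)
    bindir = "Scripts" if os.name == "nt" else "bin"
    return os.path.join(path, bindir)

env_bin = make_env(os.path.join(tempfile.mkdtemp(), "env"))
```

This is the per-language equivalent of what pip/cpan/gems do, and what Go/Rust integrate into the toolchain itself.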
Re: [PATCH] block: temporarily hold the new AioContext of bs_top in bdrv_append()
Am 14.02.2023 um 13:22 hat Stefano Garzarella geschrieben: > On Tue, Feb 14, 2023 at 12:56 PM Kevin Wolf wrote: > > > > Am 14.02.2023 um 11:51 hat Stefano Garzarella geschrieben: > > > bdrv_append() is called with bs_top AioContext held, but > > > bdrv_attach_child_noperm() could change the AioContext of bs_top. > > > > > > bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from > > > commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()"). > > > bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock > > > is taken, so let's temporarily hold the new AioContext to prevent QEMU > > > from failing in BDRV_POLL_WHILE when it tries to release the wrong > > > AioContext. > > > > > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209 > > > Reported-by: Aihua Liang > > > Signed-off-by: Stefano Garzarella > > > --- > > > I'm not sure whether to use the following Fixes tag. That commit added the > > > calls to bdrv_drained_begin() in bdrv_replace_node_noperm(), but maybe the > > > problem was pre-existing. > > > > > > Fixes: 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()") > > > > > > Note: a local reproducer is attached in the BZ, it is based on the Aihua > > > Liang > > > report and it hits the issue with a 20% ratio. > > > --- > > > block.c | 23 +++ > > > 1 file changed, 23 insertions(+) > > > > > > diff --git a/block.c b/block.c > > > index aa9062f2c1..0e2bc11e0b 100644 > > > --- a/block.c > > > +++ b/block.c > > > @@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error > > > **errp) > > > * child. > > > * > > > * This function does not create any image files. > > > + * > > > + * The caller must hold the AioContext lock for @bs_top. 
> > > */ > > > int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top, > > > Error **errp) > > > @@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, > > > BlockDriverState *bs_top, > > > int ret; > > > BdrvChild *child; > > > Transaction *tran = tran_new(); > > > +AioContext *old_context, *new_context; > > > > > > GLOBAL_STATE_CODE(); > > > > > > assert(!bs_new->backing); > > > > > > +old_context = bdrv_get_aio_context(bs_top); > > > + > > > child = bdrv_attach_child_noperm(bs_new, bs_top, "backing", > > > _of_bds, > > > bdrv_backing_role(bs_new), > > > tran, errp); > > > @@ -5286,11 +5291,29 @@ int bdrv_append(BlockDriverState *bs_new, > > > BlockDriverState *bs_top, > > > goto out; > > > } > > > > > > +/* > > > + * bdrv_attach_child_noperm could change the AioContext of bs_top. > > > + * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's > > > temporarily > > > + * hold the new AioContext, since bdrv_drained_begin calls > > > BDRV_POLL_WHILE > > > + * that assumes the new lock is taken. > > > + */ > > > +new_context = bdrv_get_aio_context(bs_top); > > > + > > > +if (old_context != new_context) { > > > +aio_context_release(old_context); > > > +aio_context_acquire(new_context); > > > +} > > > + > > > ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp); > > > if (ret < 0) { > > > goto out; > > > > If we take the error path, we return with new_context locked instead of > > old_context now. > > Grr, I'm blind... > > > > > > } > > > > > > +if (old_context != new_context) { > > > +aio_context_release(new_context); > > > +aio_context_acquire(old_context); > > > +} > > > + > > > ret = bdrv_refresh_perms(bs_new, tran, errp); > > > out: > > > tran_finalize(tran, ret); > > > > Strictly speaking, don't we need to hold the lock across > > tran_finalize(), too? It completes the bdrv_replace_node_noperm() call > > you covered above. > > Right! > > > > > Maybe bdrv_refresh_perms() and bdrv_refresh_limits(), too, in fact. 
We > > never clearly defined which functions need the lock and which don't, so > > hard to tell. > > Okay, so to be on the safe side, I'll switch them back just before return. > > > It's really time to get rid of it. > > How could one disagree? :-) > > What about the Fixes tag? Should I include it? I'm not sure. Before the patch, bdrv_replace_child_noperm() had a drain which could have caused the same kind of problems. But we're now draining two BDSes, maybe that is the relevant difference. I guess we've always had a bug, but that commit made it more likely to actually trigger? Kevin
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, Feb 14, 2023 at 12:49 PM Daniel P. Berrangé wrote:
> [quote]
> The motivation for this series is that Python 3.6 was EOL at the end of
> 2021; upstream tools are beginning to drop support for it, including
> setuptools, pylint, mypy, etc. As time goes by, it becomes more
> difficult to support and test against the full range of Python versions
> that QEMU supports. The closer we get to Python 3.12, the harder it will
> be to cover that full spread of versions.
> [/quote]
>
> this is all about new/eol versions of software upstream, and I don't
> think that's a justification. QEMU explicitly aims to use distro provided
> versions and upstream EOL status is not relevant in that context. Even
> if using "pip" to install, it is possible to limit yourself to upstream
> releases which still support 3.6.
>
> There is the separate issue of Meson dropping python 3.6 which motivates
> Paolo's series. Again though, we don't have to increase our minimum meson
> version, because meson is working today. It is our choice to increase it
> to use the latest available meson features. At some point we can decide
> what we have is good enough and we don't have to keep chasing the latest
> features. Maybe we're not there yet, but we should think about when that
> would be.

In the case of Meson, the main advantage is moving _all_ of the emulator configury out of the configure script. This requires add_global_dependencies, which was added in 0.63. So in that case it is indeed mostly about shiny new features and it's not absolutely necessary.

In the case of Python the issue is not the interpreter per se, though there are a couple of new features in Python 3.7 that are quite nice (for example improved data classes[1] or context variables[2]). The main problem as far as I understood (and have seen in my experience) is linting tools. New versions fix bugs that caused false positives, but also become more strict at the same time.
The newer versions at the same time are very quick at dropping support for old versions of Python, while older versions sometimes throw deprecation warnings on new versions of Python. This makes it very hard to support a single version of, say, mypy that works on all versions from RHEL8 and SLE15 to Fedora 38 and Ubuntu 23.04.

[1] https://peps.python.org/pep-0557/
[2] https://peps.python.org/pep-0567/

In fact this issue is the reason why RHEL9 does not package any of these tools and does not run them as part of building RPMs even though in principle it would be a good idea; it's too much of a pain to have a single version that works across all the packages in the distribution.

Regarding your other suggestion:

> * For non-native library/application dependencies we aim
> to support only the most recent distro version. Users
> of older distros may need to dynamically fetch newer
> deps.

I think this is a good idea, but one issue with "only supporting the most recent distro version" is SUSE. While the most recent version of SLE is about 5 years old, there is no next version in sight---SUSE instead is working on their "Adaptable Linux Platform", but it's still in the prototype stage[3].

So alternatively we could put a 4 or 5 year cutoff after which you need to fetch newer deps. Considering the delays between freeze and release of distros like RHEL or SLE, in practice we would probably keep Python versions supported for 6-7 years. A 4 year cutoff in practice means that we would be able to drop Python 3.6 support for QEMU 7.1 (RHEL8 becomes 4 years old next May, while SLE is already over the threshold). In practice this means waiting until next April for John/Markus/myself, which I think is fair.

Note that at least for now Meson need not follow this rule; it is distributed with QEMU because configure must execute it before Make sets up the Python venv.
This may change in the future of course, depending on the direction that the configure script takes with respect to setting up the venv, but I wouldn't hold my breath. However, the minimum required version of Meson of course must respect the oldest supported version of Python. Paolo [3] https://www.suse.com/c/the-first-prototype-of-adaptable-linux-platform-is-live/
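For readers who don't follow Python closely, a minimal sketch of the two 3.7 additions mentioned above, PEP 557 data classes and PEP 567 context variables (illustrative names only, not QEMU code):

```python
import contextvars
from dataclasses import dataclass, field

# PEP 557: the decorator generates __init__, __repr__ and __eq__
# from the class annotations.
@dataclass
class Job:
    name: str
    retries: int = 0
    tags: list = field(default_factory=list)

# PEP 567: a context variable carries state per execution context
# (e.g. across concurrent asyncio tasks), unlike a plain global.
request_id = contextvars.ContextVar("request_id", default="-")

def describe(job):
    return f"[{request_id.get()}] {job.name}"

token = request_id.set("req-42")
line = describe(Job("build"))
request_id.reset(token)
```

On 3.6, the data class would have to be written out by hand (or pulled in as a backport) and the context variable replaced with ad-hoc state passing, which is the sort of gap the linting/tooling churn sits on top of.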
Re: [PATCH 6/7] hw/isa: Assert isa_register_portio_list() gets non-NULL ISA device
On 8/2/23 20:47, Richard Henderson wrote:
> On 2/7/23 14:07, Philippe Mathieu-Daudé wrote:
>> The previous commit removed the single call to isa_register_portio_list()
>> with dev=NULL. To be sure we won't reintroduce such weird (ab)use, add
>> an assertion.
>>
>> Signed-off-by: Philippe Mathieu-Daudé
>
> Reviewed-by: Richard Henderson
>
> I wonder how much use of __attribute__((nonnull)) we should be making.

__attribute__((nonnull)) is compile-time, but seems weaker than the good old runtime assert():

  void a0(void *ptr) { assert(ptr); }

  __attribute__((nonnull)) void a1(void *ptr)
  {
      /* cannot use assert(ptr) because of a compiler warning */
  }

  void b0(void *x) { a0(NULL); /* runtime assertion */ }
  void b1(void *x) { a1(NULL); /* compile error */ }

  void c0(void *x) { a0(x); }
  void c1(void *x) { a1(x); }

  void d0(void *x) { c0(NULL); /* runtime assertion */ }
  void d1(void *x) { c1(NULL); /* no compile error, no assertion! */ }

I realize we'd probably want to add -fno-delete-null-pointer-checks if we make too much use of that.
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
Am 14.02.2023 um 08:40 hat Markus Armbruster geschrieben: > I read this on Friday, and decided to let it sit until after the > weekend. Well, it's now Tuesday, and to be frank, it's still as > offensively flippant as it was on Friday. It shows either ignorance of > or cavalier disregard for the sheer amount of work some of us have had > to put into keeping old versions of Python viable. > > The latter would be quite unlike you, so it must be the former. Honest question, Markus, because I haven't been following as much what is happening in Python recently: What are the biggest pain points in this context? Has Python started removing features from new versions more aggressively so that we have to update the code so it can run on newer versions, and still keep compatibility paths for older versions that don't have the replacement yet? Kevin
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On Tue, Feb 14, 2023 at 08:40:20AM +0100, Markus Armbruster wrote: > Daniel P. Berrangé writes: > > [...] > > > We don't have to drop python 3.6. It is a choice because > > of a desire to be able to use some shiny new python > > features without caring about back compat. > > I read this on Friday, and decided to let it sit until after the > weekend. Well, it's now Tuesday, and to be frank, it's still as > offensively flippant as it was on Friday. It shows either ignorance of > or cavalier disregard for the sheer amount of work some of us have had > to put into keeping old versions of Python viable. > > The latter would be quite unlike you, so it must be the former. I'm sorry, I don't mean it to be offensive. I'm genuinely not seeing from the descriptions in the series what the functional benefits are from dropping 3.6. > John has sunk *man-months* into keeping old versions of Python viable. > I've put in a lot less myself, but still enough to be painfully aware of > it. I figure Cleber and Beraldo are somewhere in between > > Insinuating John's proposal is motivated by "a desire to be able to use > some shiny new python features without caring about back compat" > disrespects all this work. I'm writing my comments based on what is described in the cover letter as the motivations for the change: [quote] The motivation for this series is that Python 3.6 was EOL at the end of 2021; upstream tools are beginning to drop support for it, including setuptools, pylint, mypy, etc. As time goes by, it becomes more difficult to support and test against the full range of Python versions that QEMU supports. The closer we get to Python 3.12, the harder it will be to cover that full spread of versions. [/quote] this is all about new/eol versions of software upstream, and I don't think that's a justification. QEMU explicitly aims to use distro provided versions and upstream EOL status is not relevant in that context. 
Even if using "pip" to install, it is possible to limit yourself to upstream releases which still support 3.6.

There is the separate issue of Meson dropping python 3.6 which motivates Paolo's series. Again though, we don't have to increase our minimum meson version, because meson is working today. It is our choice to increase it to use the latest available meson features. At some point we can decide what we have is good enough and we don't have to keep chasing the latest features. Maybe we're not there yet, but we should think about when that would be.

[quote]
The qemu.qmp library and the avocado testing framework both have motivations for dropping 3.6 support, but are committed to not doing so until QEMU drops support.
[/quote]

I suspect that this is more of a driver for the drop of 3.6, but I don't see any details. IOW the overall justification comes across as wanting to use new features, and follow upstream EOL, without any real detail of what we're going to gain from a functional POV.

> We should have a sober discussion on which versions of Python to work
> with, and the tradeoffs involved. But before I engage in that, I insist
> on resetting the frame: no, this is not about shiny, new Python
> features. It is about stopping the bleeding. It is about reducing what
> feels more and more like bullshit work to me, so we can actually
> accomplish stuff that matters.

Every applications developer chooses an arbitrary cut off point for minimum software versions, depending on their particular needs. With our support policy we tried to express a reasonable tradeoff between keeping back compat, and being able to adopt new features. Obviously that tradeoff is not currently looking acceptable on the python side, but it is not clear why that is. Can someone simply explain what we will see as the benefit from dropping 3.6 / adopting 3.7 as the baseline?
With regards,
Daniel
--
|: https://berrange.com          -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org           -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o- https://www.instagram.com/dberrange :|
Re: [PATCH v2 6/7] CI: Stop building docs on centos8
On 14/02/2023 08.40, Markus Armbruster wrote:
> Daniel P. Berrangé writes:
> [...]
>> We don't have to drop python 3.6. It is a choice because of a desire
>> to be able to use some shiny new python features without caring about
>> back compat.
>
> I read this on Friday, and decided to let it sit until after the
> weekend. Well, it's now Tuesday, and to be frank, it's still as
> offensively flippant as it was on Friday. It shows either ignorance of
> or cavalier disregard for the sheer amount of work some of us have had
> to put into keeping old versions of Python viable.

I'm a complete python ignorant, too, so I'm a little bit surprised of the amount of pain that these scripts are causing.

No matter of that fact, I think Peter still has a point that we have a real conflict here with our current support policy. So this either means that Python was the wrong choice for our needs (since it is moving too fast and causing too much friction), or we should really rethink our support policy.

I guess we're too deep into the Python rabbit hole already, and I'm not aware of any other good solutions (back to Perl scripts? No, thanks!), so it's likely quite impossible to tune that knob.

Thus we should maybe really start talking about our support policy now. I think the main problem is likely the sentence "Support for the previous major version will be dropped 2 years after the new major version is released". Maybe we should shorten that time frame to 1 year. The 2 years caused some confusion in the past already, since e.g. Debian only supports the previous major release for only one more year, and macOS also releases a major version each year ... so IMHO we could shorten the time frame for the previous major release to 1 year instead. People then could still continue building QEMU on CentOS 8, but they have to be aware that they might install other software like Sphinx manually if they want to continue using QEMU with docs there. What do you think?

Thomas
Re: [PATCH] block: temporarily hold the new AioContext of bs_top in bdrv_append()
Am 14.02.2023 um 11:51 hat Stefano Garzarella geschrieben: > bdrv_append() is called with bs_top AioContext held, but > bdrv_attach_child_noperm() could change the AioContext of bs_top. > > bdrv_replace_node_noperm() calls bdrv_drained_begin() starting from > commit 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()"). > bdrv_drained_begin() can call BDRV_POLL_WHILE that assumes the new lock > is taken, so let's temporarily hold the new AioContext to prevent QEMU > from failing in BDRV_POLL_WHILE when it tries to release the wrong > AioContext. > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2168209 > Reported-by: Aihua Liang > Signed-off-by: Stefano Garzarella > --- > I'm not sure whether to use the following Fixes tag. That commit added the > calls to bdrv_drained_begin() in bdrv_replace_node_noperm(), but maybe the > problem was pre-existing. > > Fixes: 2398747128 ("block: Don't poll in bdrv_replace_child_noperm()") > > Note: a local reproducer is attached in the BZ, it is based on the Aihua Liang > report and it hits the issue with a 20% ratio. > --- > block.c | 23 +++ > 1 file changed, 23 insertions(+) > > diff --git a/block.c b/block.c > index aa9062f2c1..0e2bc11e0b 100644 > --- a/block.c > +++ b/block.c > @@ -5266,6 +5266,8 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp) > * child. > * > * This function does not create any image files. > + * > + * The caller must hold the AioContext lock for @bs_top. 
> */ > int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top, > Error **errp) > @@ -5273,11 +5275,14 @@ int bdrv_append(BlockDriverState *bs_new, > BlockDriverState *bs_top, > int ret; > BdrvChild *child; > Transaction *tran = tran_new(); > +AioContext *old_context, *new_context; > > GLOBAL_STATE_CODE(); > > assert(!bs_new->backing); > > +old_context = bdrv_get_aio_context(bs_top); > + > child = bdrv_attach_child_noperm(bs_new, bs_top, "backing", > _of_bds, > bdrv_backing_role(bs_new), > tran, errp); > @@ -5286,11 +5291,29 @@ int bdrv_append(BlockDriverState *bs_new, > BlockDriverState *bs_top, > goto out; > } > > +/* > + * bdrv_attach_child_noperm could change the AioContext of bs_top. > + * bdrv_replace_node_noperm calls bdrv_drained_begin, so let's > temporarily > + * hold the new AioContext, since bdrv_drained_begin calls > BDRV_POLL_WHILE > + * that assumes the new lock is taken. > + */ > +new_context = bdrv_get_aio_context(bs_top); > + > +if (old_context != new_context) { > +aio_context_release(old_context); > +aio_context_acquire(new_context); > +} > + > ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp); > if (ret < 0) { > goto out; If we take the error path, we return with new_context locked instead of old_context now. > } > > +if (old_context != new_context) { > +aio_context_release(new_context); > +aio_context_acquire(old_context); > +} > + > ret = bdrv_refresh_perms(bs_new, tran, errp); > out: > tran_finalize(tran, ret); Strictly speaking, don't we need to hold the lock across tran_finalize(), too? It completes the bdrv_replace_node_noperm() call you covered above. Maybe bdrv_refresh_perms() and bdrv_refresh_limits(), too, in fact. We never clearly defined which functions need the lock and which don't, so hard to tell. It's really time to get rid of it. Kevin