Re: [PATCH RFC 2/2] hw: Replace drive_get_next() by drive_get()
Philippe Mathieu-Daudé writes: > On 11/15/21 16:57, Markus Armbruster wrote: >> Philippe Mathieu-Daudé writes: >>> On 11/15/21 13:55, Markus Armbruster wrote: drive_get_next() is basically a bad idea. It returns the "next" block backend of a certain interface type. "Next" means bus=0,unit=N, where subsequent calls count N up from zero, per interface type. This lets you define unit numbers implicitly by execution order. If the order changes, or new calls appear "in the middle", unit numbers change. ABI break. Hard to spot in review. Explicit is better than implicit: use drive_get() directly. Signed-off-by: Markus Armbruster --- > @@ -435,11 +438,13 @@ static void aspeed_machine_init(MachineState *machine) } for (i = 0; i < bmc->soc.sdhci.num_slots; i++) { -sdhci_attach_drive(>soc.sdhci.slots[i], drive_get_next(IF_SD)); +sdhci_attach_drive(>soc.sdhci.slots[i], + drive_get(IF_SD, 0, i)); >>> >>> If we put SD on bus #0, ... >>> } if (bmc->soc.emmc.num_slots) { -sdhci_attach_drive(>soc.emmc.slots[0], drive_get_next(IF_SD)); +sdhci_attach_drive(>soc.emmc.slots[0], + drive_get(IF_SD, 0, bmc->soc.sdhci.num_slots)); >>> >>> ... we'd want to put eMMC on bus #1 >> >> Using separate buses for different kinds of devices would be neater, but >> it also would be an incompatible change. This patch keeps existing >> bus/unit numbers working. drive_get_next() can only use bus 0. >> >>> but I see having eMMC cards on a >>> IF_SD bus as a bug, since these cards are soldered on the board. >> >> IF_SD is not a bus, it's an "block interface type", which is really just >> a user interface thing. > > Why are we discriminating by "block interface type" then? > > What is the difference between "block interfaces"? I see a block drive > as a generic unit, usable on multiple hardware devices. > > I never really understood how this "block interface type" helps > developers and users. I thought BlockInterfaceType and DriveInfo > were legacy / deprecated APIs we want to get rid of; and we would > come up with a replacement API using BlockDeviceInfo or providing > a BlockFrontend state of the art object. > Anyway, I suppose the explanation is buried in the git history > before the last 8 years. I need to keep reading. In the beginning (v0.4.2), there was -hda and -hdb, and life was simple. Then there was -hdc, -hdd, -cdrom (v0.5.1), -fda, -fdb (v0.6.0), -mtdblock, -sd, -pflash (v0.9.1). All these options do two things: they create a block backend, and they request the board to create a certain block frontend for it, similar to other options of this vintage, like -serial, -parallel, and -net. Boards generally ignore requests they don't understand, but that's just sloppiness. For each set of related options, there was a global variable holding the requests: bs_table[] for -hda, -hdb, -hdc, -hdd, -cdrom; fd_table[] -fda, -fdb; mtd_bdrv for -mtd; sd_drv for -ds; pflash_table[] for -pflash. The options replaced prior ones, except for -pflash, which appended to its table. bs_table[]'s index had a peculiar meaning: it's bus * MAX_IDE_DEVS + unit. This ensures that -hda (index 0) goes on IDE bus 0 as unit 0; -hdb on bus 0, unit 1; -hdc on 1, 0; -hdc on 1, 1. Life was now complicated enough for a generalization (v0.9.1), so there was -drive (v0.9.1). All the variables holding requests were fused into drives_table[]. Table elements are identified by (type, bus, unit), where type is an enum whose members correspond to the old global variables: IF_IDE for bs_table[], IF_FLOPPY for fd_table[], and so forth. So: -hda becomes type = IF_IDE, bus = 0, unit = 0 -hdb becomes type = IF_IDE, bus = 0, unit = 1 ... -sd becomes type = IF_SD, bus = 0, unit = 0 1st -pflash becomes type = IF_PFLASH, bus = 0, unit = 0 2nd -pflash becomes type = IF_PFLASH, bus = 0, unit = 1 ... Other mappings from old to new global variables would have been possible. I figure this one was chosen because it comes with a reasonable user interface. Identifying block devices by (interface type, bus, unit) is certainly nicer than by index in bs_table[]. Since bus and/or unit make sense only with some interface types, they are optional. Things calmed down for a couple of years, until -device appeared (v0.12). Now we needed a way to define just a backend, without requesting a frontend from the board. Instead of inventing a new option, this became IF_NONE, with meaningless bus and unit. Over the next years, the block layer outgrew -drive's limited capabilities to define frontends. -blockdev appeard (v1.7.0) and matured over several releases. I don't remember exactly when it became stable, relegating -drive if=none to legacy status. What's *not*
Re: [PATCH] dump-guest-memory: Use BQL to protect dump finalize process
Hi On Tue, Nov 16, 2021 at 7:22 AM Peter Xu wrote: > > When finalizing the dump-guest-memory with detached mode, we'll first set dump > status to either FAIL or COMPLETE before doing the cleanup, however right > after > the dump status change it's possible that another dump-guest-memory qmp > command > is sent so both the main thread and dump thread (which is during cleanup) > could > be accessing dump state in paralell. That's racy. parallel > > Fix it by protecting the finalizing phase of dump-guest-memory using BQL as > well. To do that, we expand the BQL from dump_cleanup() into dump_process(), > so we will take the BQL longer than before. The critical section must cover > the status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no > race any more. > > We can also just introduce a specific mutex to serialize the dump process, but > BQL should be enough for now so far, not to mention vm_start() in dump_cleanup > will need BQL anyway, so maybe easier to just use the same mutex. > > Reported-by: Laszlo Ersek > Signed-off-by: Peter Xu Reviewed-by: Marc-André Lureau > --- > dump/dump.c | 24 ++-- > 1 file changed, 18 insertions(+), 6 deletions(-) > > diff --git a/dump/dump.c b/dump/dump.c > index 662d0a62cd..196b7b8ab9 100644 > --- a/dump/dump.c > +++ b/dump/dump.c > @@ -96,13 +96,7 @@ static int dump_cleanup(DumpState *s) > g_free(s->guest_note); > s->guest_note = NULL; > if (s->resume) { > -if (s->detached) { > -qemu_mutex_lock_iothread(); > -} > vm_start(); > -if (s->detached) { > -qemu_mutex_unlock_iothread(); > -} > } > migrate_del_blocker(dump_migration_blocker); > > @@ -1873,6 +1867,11 @@ static void dump_process(DumpState *s, Error **errp) > Error *local_err = NULL; > DumpQueryResult *result = NULL; > > +/* > + * When running with detached mode, these operations are not run with > BQL. > + * It's still safe, because it's protected by setting s->state to ACTIVE, > + * so dump_in_progress() check will block yet another dump-guest-memory. > + */ > if (s->has_format && s->format == DUMP_GUEST_MEMORY_FORMAT_WIN_DMP) { > #ifdef TARGET_X86_64 > create_win_dump(s, _err); > @@ -1883,6 +1882,15 @@ static void dump_process(DumpState *s, Error **errp) > create_vmcore(s, _err); > } > > +/* > + * Serialize the finalizing of dump process using BQL to make sure no > + * concurrent access to DumpState is allowed. BQL is also required for > + * dump_cleanup as vm_start() needs it. > + */ > +if (s->detached) { > +qemu_mutex_lock_iothread(); > +} > + > /* make sure status is written after written_size updates */ > smp_wmb(); > qatomic_set(>status, > @@ -1898,6 +1906,10 @@ static void dump_process(DumpState *s, Error **errp) > > error_propagate(errp, local_err); > dump_cleanup(s); > + > +if (s->detached) { > +qemu_mutex_unlock_iothread(); > +} > } > > static void *dump_thread(void *data) > -- > 2.32.0 >
Re: [PATCH v2 1/3] icount: preserve cflags when custom tb is about to execute
On 11.11.2021 15:20, Alex Bennée wrote: Pavel Dovgalyuk writes: When debugging with the watchpoints, qemu may need to create TB with single instruction. This is achieved by setting cpu->cflags_next_tb. But when this block is about to execute, it may be interrupted by another thread. In this case cflags will be lost and next executed TB will not be the special one. This patch checks TB exit reason and restores cflags_next_tb to allow finding the interrupted block. Signed-off-by: Pavel Dovgalyuk --- accel/tcg/cpu-exec.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 2d14d02f6c..df12452b8f 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, * cpu_handle_interrupt. cpu_handle_interrupt will also * clear cpu->icount_decr.u16.high. */ +if (cpu->cflags_next_tb == -1 +&& (!use_icount || !(tb->cflags & CF_USE_ICOUNT) Why check use_icount here? The cflags should always have CF_USE_ICOUNT set when icount is enabled. Lets not over complicate the inverted || tests we have here. Not really. Sometimes we use non-icount blocks in icount mode. But AFAIR they are used only for triggering the exeptions, but not for real execution. +|| cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) { Is u16.low ever set when icount isn't enabled? This condition is checked for icount mode only. u16.low is not used without icount. +/* + * icount is disabled or there are enough instructions + * in the budget, do not retranslate this block with + * different parameters. + */ +cpu->cflags_next_tb = tb->cflags; +} return; }
[PATCH] migration: fix dump-vmstate with modules
To work correctly -dump-vmstate and vmstate-static-checker.py need to dump all the supported vmstates. But as some devices can be modules, they are not loaded at startup and not dumped. Fix that by loading all available modules before dumping the machine vmstate. Fixes: 7ab6e7fcce97 ("qdev: device module support") Cc: kra...@redhat.com Signed-off-by: Laurent Vivier --- softmmu/vl.c | 1 + 1 file changed, 1 insertion(+) diff --git a/softmmu/vl.c b/softmmu/vl.c index 1159a64bce4e..620a1f1367e2 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -3766,6 +3766,7 @@ void qemu_init(int argc, char **argv, char **envp) if (vmstate_dump_file) { /* dump and exit */ +module_load_qom_all(); dump_vmstate_json_to_file(vmstate_dump_file); exit(0); } -- 2.33.1
Re: [PATCH v5 0/6] Add vmnet.framework based network backend
On Mon, Nov 15, 2021 at 6:45 PM Vladislav Yaroshchuk wrote: > > > > пн, 15 нояб. 2021 г. в 07:53, Jason Wang : >> >> On Fri, Nov 12, 2021 at 5:14 PM Vladislav Yaroshchuk >> wrote: >> > >> > macOS provides networking API for VMs called 'vmnet.framework': >> > https://developer.apple.com/documentation/vmnet >> > >> > We can provide its support as the new QEMU network backends which >> > represent three different vmnet.framework interface usage modes: >> > >> > * `vmnet-shared`: >> > allows the guest to communicate with other guests in shared mode and >> > also with external network (Internet) via NAT. Has (macOS-provided) >> > DHCP server; subnet mask and IP range can be configured; >> > >> > * `vmnet-host`: >> > allows the guest to communicate with other guests in host mode. >> > By default has enabled DHCP as `vmnet-shared`, but providing >> > network unique id (uuid) can make `vmnet-host` interfaces isolated >> > from each other and also disables DHCP. >> > >> > * `vmnet-bridged`: >> > bridges the guest with a physical network interface. >> > >> > This backends cannot work on macOS Catalina 10.15 cause we use >> > vmnet.framework API provided only with macOS 11 and newer. Seems >> > that it is not a problem, because QEMU guarantees to work on two most >> > recent versions of macOS which now are Big Sur (11) and Monterey (12). >> > >> > Also, we have one inconvenient restriction: vmnet.framework interfaces >> > can create only privileged user: >> > `$ sudo qemu-system-x86_64 -nic vmnet-shared` >> > >> > Attempt of `vmnet-*` netdev creation being unprivileged user fails with >> > vmnet's 'general failure'. >> > >> > This happens because vmnet.framework requires `com.apple.vm.networking` >> > entitlement which is: "restricted to developers of virtualization software. >> > To request this entitlement, contact your Apple representative." as Apple >> > documentation says: >> > https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_vm_networking >> >> Do you know how multipass work? Looks like it uses vmnet without privileges. >> > > I've just checked this, and they still need root privileges. They have a > `multipassd` daemon which is launched as root by launchd by default. > > ``` > bash-5.1$ ps axo ruser,ruid,comm | grep multipass > root 0 /Library/Application > Support/com.canonical.multipass/bin/multipassd > root 0 /Library/Application > Support/com.canonical.multipass/bin/hyperkit > ``` > > That's the reason why it's required to 'enter a password' while multipass > installation: > it creates launchd plist (kinda launch rule) and places it to > /Library/LaunchDaemons/, > which can be done only with root privileges. > > ``` > bash-5.1$ ll /Library/LaunchDaemons | grep multipass > -rw-r--r-- 1 root wheel 1.1K 15 Nov 12:47 com.canonical.multipassd.plist > ``` > > And after this launchd launches this multipassd daemon as root every boot. > > So an unprivileged user can launch a multipass VM instance, but actually the > `hyperkit` process which interacts with vmnet is gonna be launched by > multipassd > running as root. I wonder how it passes the vmnet object to qemu? Nothing obvious from the qemu command line that multipass launched: -nic vmnet-macos,mode=shared,model=virtio-net-pci,mac=52:54:00:52:e8:e4 (But I haven't had time to check their qemu codes). > > tl;dr sadly, we can't interact with vmnet.framework without having our binary > correctly > signed and being an unprivileged user. Root privileges or special signature > with > entitlement is required. This is something similar to what happens in other OS. E.g for the tap backend, it can't be created without privileges. So qemu allows to: 1) the TAP to be created by privileged program like libvirt, and its fd could be passed to qemu via SCM_RIGHTS 2) run a set-uid helper to create and config TAP This is something we need to consider now or in the future probably. Thanks > > >> Thanks >> > > Thank you for your review, I will check it this week and reply as soon as > possible. > >> >> > >> > One more note: we still have quite useful but not supported >> > 'vmnet.framework' features as creating port forwarding rules, IPv6 >> > NAT prefix specifying and so on. >> > >> > Nevertheless, new backends work fine and tested within `qemu-system-x86-64` >> > on macOS Bir Sur 11.5.2 host with such nic models: >> > * e1000-82545em >> > * virtio-net-pci >> > * vmxnet3 >> > >> > The guests were: >> > * macOS 10.15.7 >> > * Ubuntu Bionic (server cloudimg) >> > >> > >> > This series partially reuses patches by Phillip Tennen: >> > https://patchew.org/QEMU/20210218134947.1860-1-phillip.en...@gmail.com/ >> > So I included his signed-off line into one of the commit messages and >> > also here. >> > >> > v1 -> v2: >> > Since v1 minor typos were fixed, patches rebased onto latest master, >> > redundant changes removed (small commits squashed) >> >
Re: [PATCH v2 1/6] pci: implement power state
Hi, > > new_addr = pci_bar_address(d, i, r->type, r->size); > > +if (!d->has_power) { > > +new_addr = PCI_BAR_UNMAPPED; > > +} > > > > /* This bar isn't changed */ > > if (new_addr == r->addr) > > I am a bit confused about why this is necessary. > When power is removed device is reset, does not > this disable device memory automatically? Hmm. While it clearly doesn't hurt it might not be needed indeed. The reset-on-poweroff should make sure both bars and dma are off, and due to config space access being blocked the guest shouldn't be able to turn them on while device power is off. So, yes, in theory we should be able to drop this. Assuming nothing in qemu ever touches the bar mappings (host access to config space is not blocked). I'll have a look. take care, Gerd
Re: [PATCH for 6.2 v4] nbd/server: Add --selinux-label option
On 15/11/2021 21.29, Eric Blake wrote: From: "Richard W.M. Jones" Under SELinux, Unix domain sockets have two labels. One is on the disk and can be set with commands such as chcon(1). There is a different label stored in memory (called the process label). This can only be set by the process creating the socket. When using SELinux + SVirt and wanting qemu to be able to connect to a qemu-nbd instance, you must set both labels correctly first. For qemu-nbd the options to set the second label are awkward. You can create the socket in a wrapper program and then exec into qemu-nbd. Or you could try something with LD_PRELOAD. This commit adds the ability to set the label straightforwardly on the command line, via the new --selinux-label flag. (The name of the flag is the same as the equivalent nbdkit option.) ... @@ -3430,6 +3437,7 @@ summary_info += {'libdaxctl support': libdaxctl} summary_info += {'libudev': libudev} # Dummy dependency, keep .found() summary_info += {'FUSE lseek':fuse_lseek.found()} +summary_info += {'selinux': selinux.found()} It's nicer if you do it like this (i.e. without the .found()): summary_info += {'selinux': selinux} ... then meson prints out the version of the library, too. Apart from that, patch looks fine to me: Reviewed-by: Thomas Huth
Re: [PATCH] gitlab-ci: Test compilation on Windows with MSYS2
Hi On Mon, Nov 15, 2021 at 6:31 PM Philippe Mathieu-Daudé wrote: > On 11/15/21 15:06, Thomas Huth wrote: > > Gitlab also provides runners with Windows, we can use them to > > test compilation with MSYS2, in both, 64-bit and 32-bit. > > > > However, it takes quite a long time to set up the VM, so to > > stay in the 1h time frame, we can only compile and check one > > target here. > I wonder why gitlab does not offer the docker executor. On the freedesktop gitlab instance, they have windows docker executor, which speeds up the build time. Maybe we could also have our own Windows runner for qemu? > > > Signed-off-by: Thomas Huth > > --- > > "make check" recently broke on MSYS2, and nobody noticed since > apparently > > hardly anybody looks at the cirrus-CI output ... so here's another try > > to get some more test coverage in this area in the gitlab-CI instead. > > Patch needs to be applied after the "tests/unit/test-smp-parse: > > Make an unified name for the tested machine" patch to get "make check" > > fixed first. > > > > RFC -> v1: > > - Use cache to speed up installation a little bit > > - Add a 32-bit builder, too > > > > .gitlab-ci.d/qemu-project.yml | 1 + > > .gitlab-ci.d/windows.yml | 98 +++ > > 2 files changed, 99 insertions(+) > > create mode 100644 .gitlab-ci.d/windows.yml > > > > diff --git a/.gitlab-ci.d/qemu-project.yml > b/.gitlab-ci.d/qemu-project.yml > > index b3d79bc429..871262fe0e 100644 > > --- a/.gitlab-ci.d/qemu-project.yml > > +++ b/.gitlab-ci.d/qemu-project.yml > > @@ -11,3 +11,4 @@ include: > >- local: '/.gitlab-ci.d/static_checks.yml' > >- local: '/.gitlab-ci.d/custom-runners.yml' > >- local: '/.gitlab-ci.d/cirrus.yml' > > + - local: '/.gitlab-ci.d/windows.yml' > > diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml > > new file mode 100644 > > index 00..309f7e7fb8 > > --- /dev/null > > +++ b/.gitlab-ci.d/windows.yml > > @@ -0,0 +1,98 @@ > > +.shared_msys2_builder: > > + tags: > > + - shared-windows > > + - windows > > + - windows-1809 > > + cache: > > +key: "${CI_JOB_NAME}-cache" > > It would be nice to cache the shared 'before_script' part, > but it doesn't seems trivial; meanwhile this patch works and > is KISS, so: > > Reviewed-by: Philippe Mathieu-Daudé > > > +paths: > > + - ${CI_PROJECT_DIR}/msys64/var/cache > > + needs: [] > > + stage: build > > + timeout: 70m > > + before_script: > > + - If ( !(Test-Path -Path msys64\var\cache ) ) { > > + mkdir msys64\var\cache > > +} > > + - If ( !(Test-Path -Path msys64\var\cache\msys2.exe ) ) { > > + Invoke-WebRequest > > + " > https://github.com/msys2/msys2-installer/releases/download/2021-07-25/msys2-base-x86_64-20210725.sfx.exe > " > > + -outfile "msys64\var\cache\msys2.exe" > > +} > > + - msys64\var\cache\msys2.exe -y > > + - ((Get-Content -path .\msys64\etc\\post-install\\07-pacman-key.post > -Raw) > > + -replace '--refresh-keys', '--version') | > > + Set-Content -Path ${CI_PROJECT_DIR}\msys64\etc\\post-install\\ > 07-pacman-key.post > > + - .\msys64\usr\bin\bash -lc "sed -i 's/^CheckSpace/#CheckSpace/g' > /etc/pacman.conf" > > + - .\msys64\usr\bin\bash -lc 'pacman --noconfirm -Syuu' # Core update > > + - .\msys64\usr\bin\bash -lc 'pacman --noconfirm -Syuu' # Normal > update > > + - taskkill /F /FI "MODULES eq msys-2.0.dll" > > + > > +msys2-64bit: > > + extends: .shared_msys2_builder > > + script: > > + - .\msys64\usr\bin\bash -lc "pacman -Sy --noconfirm --needed > > + diffutils git grep make sed > > + mingw-w64-x86_64-capstone > [...] > > -- Marc-André Lureau
Re: [PATCH] qmp: Stabilize preconfig
El lun., 15 nov. 2021 16:40, Markus Armbruster escribió: > > Why do you care? For another example, you can use "reboot" or > > "systemctl isolate reboot.target" and they end up doing the same thing. > > > > As long as qemu_init invokes qmp_machine_set, qmp_accel_set, > > qmp_device_add, qmp_plugin_add, qmp_cont, etc. to do its job, the > > difference between qemu-system-* and qemu-qmp-* is a couple thousands > > lines of boring code that all but disappears once the VM is up and > > running. IOW, with the right design (e.g. shortcut options for QOM > > properties good; dozens of global variables bad), there's absolutely no > > issue with some people using qemu-system-* and some using qemu-qmp-*. > > I think maintaining two binaries forever is madness. I want the old one > to wither away. This seems a bit dogmatic to me. The difference between the two binaries would be literally a single file, which basically disappears before anything interesting happens. Making the new binary capable of serving all use cases should not be > hard, just work (see my design sketch). I expect the result to serve > *better* than the mess we have now. > Most of the mess is in the implementation. Not all, granted. But overall softmmu/vl.c's ugliness is mostly due to the layers of backwards compatibility, and that wouldn't go away very soon. My point is that we still have quite a few commands without > 'allow-preconfig' mostly because we are afraid of allowing them in > preconfig state, not because of true phase dependencies. > >>> > >>> I think there's very few of them, if any (outside the block layer for > >>> which patches exist), and those are due to distraction more than fear. > >> > >> qapi/*.json has 216 commands, of which 26 carry 'allow-preconfig'. > > > > Well, > > https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg01597.html > > alone would more than double that. :) > > > > Places like machine.json, machine-target.json, migration.json, > > replay.json have a lot of commands that are, obviously, almost entirely > > not suitable for preconfig. I don't think there are many commands left, > > I'd guess maybe 30 (meaning that ~60% are done). > > My point is that "very few" is not literally true, and I think you just > confirmed it ;) Ok, let me rephrase that as "most of the missing ones are block-layer relates". ;) Paolo
Re: [PATCH] q35: flip acpi-pci-hotplug-with-bridge-support default back to off
On Thu, 11 Nov 2021, Gerd Hoffmann wrote: > Switch qemu 6.2 back to 6.0 behavior (aka native pcie hotplug) because > acpi hotplug for pcie ports caused all kinds of regressions and a fix > for those is not in sight. > > Add compat property for 6.1 to keep it enabled there. Use a separate > compat property list so we can apply it to 6.1 only. I think we are not going this route anymore. ACPI will continue to be the default for 6.2 and future and we will continue to fix the issues on the ACPI hotplug side. I see Michael sent PRs for some of the fixes from Igor and Julia. > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/641 > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2006409 > Signed-off-by: Gerd Hoffmann > --- > hw/acpi/ich9.c | 2 +- > hw/i386/pc.c | 1 - > hw/i386/pc_q35.c | 14 +- > 3 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c > index 1ee2ba2c508c..6e7d4c9eb54a 100644 > --- a/hw/acpi/ich9.c > +++ b/hw/acpi/ich9.c > @@ -427,7 +427,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs > *pm) > pm->disable_s3 = 0; > pm->disable_s4 = 0; > pm->s4_val = 2; > -pm->use_acpi_hotplug_bridge = true; > +pm->use_acpi_hotplug_bridge = false; > > object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE, > >pm_io_base, OBJ_PROP_FLAG_READ); > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index 2592a821486f..4fed82dafcf0 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -106,7 +106,6 @@ GlobalProperty pc_compat_6_0[] = { > { "qemu64" "-" TYPE_X86_CPU, "model", "6" }, > { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" }, > { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" }, > -{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }, > }; > const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0); > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c > index 797e09500b15..735dd3cff4ed 100644 > --- a/hw/i386/pc_q35.c > +++ b/hw/i386/pc_q35.c > @@ -375,8 +375,20 @@ static void pc_q35_6_1_machine_options(MachineClass *m) > m->smp_props.prefer_sockets = true; > } > > +/* 6.1 only compat property (not applied to 6.0 + older) */ > +static GlobalProperty pc_compat_6_1_only[] = { > +{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "on" }, > +}; > +static const size_t pc_compat_6_1_only_len = > G_N_ELEMENTS(pc_compat_6_1_only); > + > +static void pc_q35_6_1_only_machine_options(MachineClass *m) > +{ > +pc_q35_6_1_machine_options(m); > +compat_props_add(m->compat_props, pc_compat_6_1_only, > pc_compat_6_1_only_len); > +} > + > DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL, > - pc_q35_6_1_machine_options); > + pc_q35_6_1_only_machine_options); > > static void pc_q35_6_0_machine_options(MachineClass *m) > { > -- > 2.33.1 > >
[PATCH] dump-guest-memory: Use BQL to protect dump finalize process
When finalizing the dump-guest-memory with detached mode, we'll first set dump status to either FAIL or COMPLETE before doing the cleanup, however right after the dump status change it's possible that another dump-guest-memory qmp command is sent so both the main thread and dump thread (which is during cleanup) could be accessing dump state in paralell. That's racy. Fix it by protecting the finalizing phase of dump-guest-memory using BQL as well. To do that, we expand the BQL from dump_cleanup() into dump_process(), so we will take the BQL longer than before. The critical section must cover the status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no race any more. We can also just introduce a specific mutex to serialize the dump process, but BQL should be enough for now so far, not to mention vm_start() in dump_cleanup will need BQL anyway, so maybe easier to just use the same mutex. Reported-by: Laszlo Ersek Signed-off-by: Peter Xu --- dump/dump.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/dump/dump.c b/dump/dump.c index 662d0a62cd..196b7b8ab9 100644 --- a/dump/dump.c +++ b/dump/dump.c @@ -96,13 +96,7 @@ static int dump_cleanup(DumpState *s) g_free(s->guest_note); s->guest_note = NULL; if (s->resume) { -if (s->detached) { -qemu_mutex_lock_iothread(); -} vm_start(); -if (s->detached) { -qemu_mutex_unlock_iothread(); -} } migrate_del_blocker(dump_migration_blocker); @@ -1873,6 +1867,11 @@ static void dump_process(DumpState *s, Error **errp) Error *local_err = NULL; DumpQueryResult *result = NULL; +/* + * When running with detached mode, these operations are not run with BQL. + * It's still safe, because it's protected by setting s->state to ACTIVE, + * so dump_in_progress() check will block yet another dump-guest-memory. + */ if (s->has_format && s->format == DUMP_GUEST_MEMORY_FORMAT_WIN_DMP) { #ifdef TARGET_X86_64 create_win_dump(s, _err); @@ -1883,6 +1882,15 @@ static void dump_process(DumpState *s, Error **errp) create_vmcore(s, _err); } +/* + * Serialize the finalizing of dump process using BQL to make sure no + * concurrent access to DumpState is allowed. BQL is also required for + * dump_cleanup as vm_start() needs it. + */ +if (s->detached) { +qemu_mutex_lock_iothread(); +} + /* make sure status is written after written_size updates */ smp_wmb(); qatomic_set(>status, @@ -1898,6 +1906,10 @@ static void dump_process(DumpState *s, Error **errp) error_propagate(errp, local_err); dump_cleanup(s); + +if (s->detached) { +qemu_mutex_unlock_iothread(); +} } static void *dump_thread(void *data) -- 2.32.0
Re: [PATCH v4 07/20] target/riscv: Adjust csr write mask with XLEN
On Fri, Nov 12, 2021 at 1:58 AM LIU Zhiwei wrote: > > Write mask is representing the bits we care about. > > Signed-off-by: LIU Zhiwei > Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Alistair > --- > target/riscv/insn_trans/trans_rvi.c.inc | 4 ++-- > target/riscv/op_helper.c| 3 ++- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/target/riscv/insn_trans/trans_rvi.c.inc > b/target/riscv/insn_trans/trans_rvi.c.inc > index e51dbc41c5..40c81421f2 100644 > --- a/target/riscv/insn_trans/trans_rvi.c.inc > +++ b/target/riscv/insn_trans/trans_rvi.c.inc > @@ -486,7 +486,7 @@ static bool trans_csrrw(DisasContext *ctx, arg_csrrw *a) > return do_csrw(ctx, a->csr, src); > } > > -TCGv mask = tcg_constant_tl(-1); > +TCGv mask = tcg_constant_tl(get_xl(ctx) == MXL_RV32 ? UINT32_MAX : -1); > return do_csrrw(ctx, a->rd, a->csr, src, mask); > } > > @@ -537,7 +537,7 @@ static bool trans_csrrwi(DisasContext *ctx, arg_csrrwi *a) > return do_csrw(ctx, a->csr, src); > } > > -TCGv mask = tcg_constant_tl(-1); > +TCGv mask = tcg_constant_tl(get_xl(ctx) == MXL_RV32 ? UINT32_MAX : -1); > return do_csrrw(ctx, a->rd, a->csr, src, mask); > } > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > index 095d39671b..561e156bec 100644 > --- a/target/riscv/op_helper.c > +++ b/target/riscv/op_helper.c > @@ -50,7 +50,8 @@ target_ulong helper_csrr(CPURISCVState *env, int csr) > > void helper_csrw(CPURISCVState *env, int csr, target_ulong src) > { > -RISCVException ret = riscv_csrrw(env, csr, NULL, src, -1); > +target_ulong mask = cpu_get_xl(env) == MXL_RV32 ? UINT32_MAX : -1; > +RISCVException ret = riscv_csrrw(env, csr, NULL, src, mask); > > if (ret != RISCV_EXCP_NONE) { > riscv_raise_exception(env, ret, GETPC()); > -- > 2.25.1 > >
Re: [PATCH v4 06/20] target/riscv: Relax debug check for pm write
On Fri, Nov 12, 2021 at 2:12 AM LIU Zhiwei wrote: > > Signed-off-by: LIU Zhiwei > Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Alistair > --- > target/riscv/csr.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/target/riscv/csr.c b/target/riscv/csr.c > index 9f41954894..74c0b788fd 100644 > --- a/target/riscv/csr.c > +++ b/target/riscv/csr.c > @@ -1445,6 +1445,9 @@ static bool check_pm_current_disabled(CPURISCVState > *env, int csrno) > int csr_priv = get_field(csrno, 0x300); > int pm_current; > > +if (env->debugger) { > +return false; > +} > /* > * If priv lvls differ that means we're accessing csr from higher priv > lvl, > * so allow the access > -- > 2.25.1 > >
Re: [PATCH v4 05/20] target/riscv: Use gdb xml according to max mxlen
On Fri, Nov 12, 2021 at 1:54 AM LIU Zhiwei wrote: > > Signed-off-by: LIU Zhiwei > Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Alistair > --- > target/riscv/gdbstub.c | 71 +++--- > 1 file changed, 52 insertions(+), 19 deletions(-) > > diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c > index 23429179e2..8d0f9139d7 100644 > --- a/target/riscv/gdbstub.c > +++ b/target/riscv/gdbstub.c > @@ -24,11 +24,23 @@ int riscv_cpu_gdb_read_register(CPUState *cs, GByteArray > *mem_buf, int n) > { > RISCVCPU *cpu = RISCV_CPU(cs); > CPURISCVState *env = >env; > +target_ulong tmp; > > if (n < 32) { > -return gdb_get_regl(mem_buf, env->gpr[n]); > +tmp = env->gpr[n]; > } else if (n == 32) { > -return gdb_get_regl(mem_buf, env->pc); > +tmp = env->pc; > +} else { > +return 0; > +} > + > +switch (env->misa_mxl_max) { > +case MXL_RV32: > +return gdb_get_reg32(mem_buf, tmp); > +case MXL_RV64: > +return gdb_get_reg64(mem_buf, tmp); > +default: > +g_assert_not_reached(); > } > return 0; > } > @@ -37,18 +49,32 @@ int riscv_cpu_gdb_write_register(CPUState *cs, uint8_t > *mem_buf, int n) > { > RISCVCPU *cpu = RISCV_CPU(cs); > CPURISCVState *env = >env; > - > -if (n == 0) { > -/* discard writes to x0 */ > -return sizeof(target_ulong); > -} else if (n < 32) { > -env->gpr[n] = ldtul_p(mem_buf); > -return sizeof(target_ulong); > +int length = 0; > +target_ulong tmp; > + > +switch (env->misa_mxl_max) { > +case MXL_RV32: > +tmp = (int32_t)ldl_p(mem_buf); > +length = 4; > +break; > +case MXL_RV64: > +if (cpu_get_xl(env) < MXL_RV64) { > +tmp = (int32_t)ldq_p(mem_buf); > +} else { > +tmp = ldq_p(mem_buf); > +} > +length = 8; > +break; > +default: > +g_assert_not_reached(); > +} > +if (n > 0 && n < 32) { > +env->gpr[n] = tmp; > } else if (n == 32) { > -env->pc = ldtul_p(mem_buf); > -return sizeof(target_ulong); > +env->pc = tmp; > } > -return 0; > + > +return length; > } > > static int riscv_gdb_get_fpu(CPURISCVState *env, GByteArray *buf, int n) > @@ -198,13 +224,20 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState > *cs) > gdb_register_coprocessor(cs, riscv_gdb_get_fpu, riscv_gdb_set_fpu, > 36, "riscv-32bit-fpu.xml", 0); > } > -#if defined(TARGET_RISCV32) > -gdb_register_coprocessor(cs, riscv_gdb_get_virtual, > riscv_gdb_set_virtual, > - 1, "riscv-32bit-virtual.xml", 0); > -#elif defined(TARGET_RISCV64) > -gdb_register_coprocessor(cs, riscv_gdb_get_virtual, > riscv_gdb_set_virtual, > - 1, "riscv-64bit-virtual.xml", 0); > -#endif > +switch (env->misa_mxl_max) { > +case MXL_RV32: > +gdb_register_coprocessor(cs, riscv_gdb_get_virtual, > + riscv_gdb_set_virtual, > + 1, "riscv-32bit-virtual.xml", 0); > +break; > +case MXL_RV64: > +gdb_register_coprocessor(cs, riscv_gdb_get_virtual, > + riscv_gdb_set_virtual, > + 1, "riscv-64bit-virtual.xml", 0); > +break; > +default: > +g_assert_not_reached(); > +} > > gdb_register_coprocessor(cs, riscv_gdb_get_csr, riscv_gdb_set_csr, > riscv_gen_dynamic_csr_xml(cs, cs->gdb_num_regs), > -- > 2.25.1 > >
RE: [PATCH 2/2] virtio-blk: support BLKSECDISCARD
> The Linux block layer shares the DISCARD queue limits with SECDISCARD. > That's different from BLKZEROOUT (QEMU's WRITE_ZEROES). This is a Linux > implementation detail but I guess virtio-blk can share the DISCARD limits with > SECDISCARD too... > > > @@ -622,6 +628,7 @@ static int virtio_blk_handle_request(VirtIOBlockReq > *req, MultiReqBuffer *mrb) > > unsigned out_num = req->elem.out_num; > > VirtIOBlock *s = req->dev; > > VirtIODevice *vdev = VIRTIO_DEVICE(s); > > +bool is_secdiscard = false; > > > > if (req->elem.out_num < 1 || req->elem.in_num < 1) { > > virtio_error(vdev, "virtio-blk missing headers"); @@ -722,6 > > +729,9 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, > MultiReqBuffer *mrb) > > * VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch > > statement, > > * so we must mask it for these requests, then we will check if it is > > set. > > */ > > +case VIRTIO_BLK_T_SECDISCARD & ~VIRTIO_BLK_T_OUT: > > +is_secdiscard = true; > > +__attribute__((fallthrough)); > > The DISCARD case doesn't use __attribute__((fallthrough)) so this is > inconsistent. > QEMU doesn't use __attribute__((fallthrough)) so I suggest dropping this. Sure, will try to drop the fallthrough case. > > > diff --git a/include/standard-headers/linux/virtio_blk.h > > b/include/standard-headers/linux/virtio_blk.h > > index 2dcc90826a..c55a07840c 100644 > > --- a/include/standard-headers/linux/virtio_blk.h > > +++ b/include/standard-headers/linux/virtio_blk.h > > @@ -40,6 +40,7 @@ > > #define VIRTIO_BLK_F_MQ12 /* support more than one vq > */ > > #define VIRTIO_BLK_F_DISCARD 13 /* DISCARD is supported */ > > #define VIRTIO_BLK_F_WRITE_ZEROES 14 /* WRITE ZEROES is > supported */ > > +#define VIRTIO_BLK_F_SECDISCARD15 /* WRITE ZEROES is supported > */ > > The comment is copy-pasted from WRITE_ZEROES. Will fix it.
RE: [PATCH 1/2] block:hdev: support BLKSECDISCARD
> > Add a new option "secdiscard" for block drive. Parse it and record it > > in bs->open_flags as bit(BDRV_O_SECDISCARD). > > > > Add a new BDRV_REQ_SECDISCARD bit for secure discard request from > > virtual device. > > > > For host_device backend: implement by ioctl(BLKSECDISCARD) on real > > host block device. > > For other backend, no implementation. > > > > E.g.: > > qemu-system-x86_64 \ > > ... \ > > -drive > file=/dev/mmcblk0p2,if=none,format=raw,discard=on,secdiscard=on,id=sd0 \ > > -device virtio-blk-pci,drive=sd0,id=sd0_vblk \ > > ... > > I'm curious why there is explicit control over this feature (-drive > secdiscard=on|off). For example, is there a reason why users would want to > disable secdiscard on a drive that supports it? I guess there is no way to > emulate > it correctly so secdiscard=on on a drive that doesn't support it produces an > error? Yes. AFAIK, there is no way to emulate a "secure" discard. But it seems also hard to detect whether a host drive support secdiscard unless it really perform a real secdiscard. So I choose to add an option for user to enable it for virtual block. There is an assumption that user knows whether host support secdiscard. Best Regard Yadong
RE: [PATCH 1/2] block:hdev: support BLKSECDISCARD
> > Notably absent: qapi/block-core.json. Without changing this, the option can't > be > available in -blockdev, which is the primary option to configure block device > backends. > > This patch seems to contain multiple logical changes that should be split into > separate patches: > > * Adding a flags parameter to .bdrv_co_pdiscard > > * Support for the new feature in the core block layer (should be done > with -blockdev) > > * Convenience magic for -drive (BDRV_O_SECDISCARD). It's not clear that > this should be done at all because the option is really specific to > one single block driver (file-posix). I think in your patch, all > other block drivers silently ignore the option, which is not what we > want. Sorry for the absent for -blockdev. Will try add that. Also I will try to split the patches. And for the BDRV_O_SECDISCARD, it is specific for file-posix.c(host_device). Maybe I need to add the option only for file-posix.c. > > > diff --git a/block.c b/block.c > > index 580cb77a70..4f05e96d12 100644 > > --- a/block.c > > +++ b/block.c > > @@ -1128,6 +1128,32 @@ int bdrv_parse_discard_flags(const char *mode, > int *flags) > > return 0; > > } > > > > +/** > > + * Set open flags for a given secdiscard mode > > + * > > + * Return 0 on success, -1 if the secdiscard mode was invalid. > > + */ > > +int bdrv_parse_secdiscard_flags(const char *mode, int *flags, Error > > +**errp) { > > +*flags &= ~BDRV_O_SECDISCARD; > > + > > +if (!strcmp(mode, "off")) { > > +/* do nothing */ > > +} else if (!strcmp(mode, "on")) { > > +if (!(*flags & BDRV_O_UNMAP)) { > > +error_setg(errp, "cannot enable secdiscard when discard is " > > + "disabled!"); > > +return -1; > > +} > > This check has nothing to do with parsing the option, it's validating its > value. > > You don't even need a new function to parse it, because there is already > qemu_opt_get_bool(). Duplicating it means only that you're inconsistent with > other boolean options, which alternatively accept "yes"/"no", "true"/"false", > "y/n". Sure. Will use qemu_opt_get_bool() instead. > > > + > > +*flags |= BDRV_O_SECDISCARD; > > +} else { > > +return -1; > > +} > > + > > +return 0; > > +} > > + > > /** > > * Set open flags for a given cache mode > > * > > @@ -1695,6 +1721,11 @@ QemuOptsList bdrv_runtime_opts = { > > .type = QEMU_OPT_STRING, > > .help = "discard operation (ignore/off, unmap/on)", > > }, > > +{ > > +.name = BDRV_OPT_SECDISCARD, > > +.type = QEMU_OPT_STRING, > > +.help = "secure discard operation (off, on)", > > +}, > > { > > .name = BDRV_OPT_FORCE_SHARE, > > .type = QEMU_OPT_BOOL, > > @@ -1735,6 +1766,7 @@ static int bdrv_open_common(BlockDriverState *bs, > BlockBackend *file, > > const char *driver_name = NULL; > > const char *node_name = NULL; > > const char *discard; > > +const char *secdiscard; > > QemuOpts *opts; > > BlockDriver *drv; > > Error *local_err = NULL; > > @@ -1829,6 +1861,16 @@ static int bdrv_open_common(BlockDriverState > *bs, BlockBackend *file, > > } > > } > > > > + > > +secdiscard = qemu_opt_get(opts, BDRV_OPT_SECDISCARD); > > +if (secdiscard != NULL) { > > +if (bdrv_parse_secdiscard_flags(secdiscard, >open_flags, > > +errp) != 0) { > > +ret = -EINVAL; > > +goto fail_opts; > > +} > > +} > > + > > bs->detect_zeroes = > > bdrv_parse_detect_zeroes(opts, bs->open_flags, _err); > > if (local_err) { > > @@ -3685,6 +3727,10 @@ static BlockDriverState *bdrv_open_inherit(const > char *filename, > > , options, flags, options); > > } > > > > +if (g_strcmp0(qdict_get_try_str(options, BDRV_OPT_SECDISCARD), "on")) { > > +flags |= BDRV_O_SECDISCARD; > > Indentation is off. Will fix it. > > > --- a/block/file-posix.c > > +++ b/block/file-posix.c > > @@ -160,6 +160,7 @@ typedef struct BDRVRawState { > > bool is_xfs:1; > > #endif > > bool has_discard:1; > > +bool has_secdiscard:1; > > bool has_write_zeroes:1; > > bool discard_zeroes:1; > > bool use_linux_aio:1; > > has_secdiscard is only set to false in raw_open_common() and never changed or > used. Will remove it. > > > @@ -727,6 +728,7 @@ static int raw_open_common(BlockDriverState *bs, > > QDict *options, #endif /* !defined(CONFIG_LINUX_IO_URING) */ > > > > s->has_discard = true; > > +s->has_secdiscard = false; > > s->has_write_zeroes = true; > > if ((bs->open_flags & BDRV_O_NOCACHE) != 0 && !dio_byte_aligned(s->fd)) > { > > s->needs_alignment = true; > > @@ -765,6 +767,7 @@ static int raw_open_common(BlockDriverState *bs, > QDict *options, > >
RE: [PATCH 2/2] virtio-blk: support BLKSECDISCARD
> On Mon, Nov 15, 2021 at 12:52:00PM +0800, yadong...@intel.com wrote: > > From: Yadong Qi > > > > Add new virtio feature: VIRTIO_BLK_F_SECDISCARD. > > Add new virtio command: VIRTIO_BLK_T_SECDISCARD. > > > > This feature is disabled by default, it will check the backend > > bs->open_flags & BDRV_O_SECDISCARD, enable it if BDRV_O_SECDISCARD > > is supported. > > > > Signed-off-by: Yadong Qi > > --- > > hw/block/virtio-blk.c | 26 + > > include/standard-headers/linux/virtio_blk.h | 4 > > > Any changes to standard headers need to go to linux and virtio TC. Sure. I will draft proposal to virtio-comment for review.
RE: [PATCH 2/2] virtio-blk: support BLKSECDISCARD
> >Add new virtio feature: VIRTIO_BLK_F_SECDISCARD. > >Add new virtio command: VIRTIO_BLK_T_SECDISCARD. > > Has a proposal been sent out yet to virtio-comment mailing list for discussing > these specification changes? > Not yet. I will draft a proposal to virtio-comment if no big concern of this patch >From maintainer. > > > >diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index > >dbc4c5a3cd..7bc3484521 100644 > >--- a/hw/block/virtio-blk.c > >+++ b/hw/block/virtio-blk.c > >@@ -536,7 +536,8 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock > >*dev, } > > > > static uint8_t virtio_blk_handle_discard_write_zeroes(VirtIOBlockReq *req, > >-struct virtio_blk_discard_write_zeroes *dwz_hdr, bool is_write_zeroes) > >+struct virtio_blk_discard_write_zeroes *dwz_hdr, bool is_write_zeroes, > >+bool is_secdiscard) > > Since the function now handles 3 commands, I'm thinking if it's better to pass > the command directly and then make a switch instead of using 2 booleans. > Make sense. > > { > > VirtIOBlock *s = req->dev; > > VirtIODevice *vdev = VIRTIO_DEVICE(s); @@ -577,8 +578,8 @@ static > >uint8_t virtio_blk_handle_discard_write_zeroes(VirtIOBlockReq *req, > > goto err; > > } > > > >+int blk_aio_flags = 0; > > Maybe better to move it to the beginning of the function. Sure. > > > > >-blk_aio_pdiscard(s->blk, sector << BDRV_SECTOR_BITS, bytes, 0, > >+if (is_secdiscard) { > >+blk_aio_flags |= BDRV_REQ_SECDISCARD; > >+} > >+ > >+blk_aio_pdiscard(s->blk, sector << BDRV_SECTOR_BITS, bytes, > >+ blk_aio_flags, > > virtio_blk_discard_write_zeroes_complete, req); > > } > > > >@@ -622,6 +628,7 @@ static int virtio_blk_handle_request(VirtIOBlockReq > *req, MultiReqBuffer *mrb) > > unsigned out_num = req->elem.out_num; > > VirtIOBlock *s = req->dev; > > VirtIODevice *vdev = VIRTIO_DEVICE(s); > >+bool is_secdiscard = false; > > > > if (req->elem.out_num < 1 || req->elem.in_num < 1) { > > virtio_error(vdev, "virtio-blk missing headers"); @@ -722,6 > >+729,9 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, > MultiReqBuffer *mrb) > > * VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch > > statement, > > * so we must mask it for these requests, then we will check if it is > > set. > > */ > >+case VIRTIO_BLK_T_SECDISCARD & ~VIRTIO_BLK_T_OUT: > >+is_secdiscard = true; > >+__attribute__((fallthrough)); > > We can use QEMU_FALLTHROUGH here. Sure. > > > > > err_status = virtio_blk_handle_discard_write_zeroes(req, _hdr, > >- > >is_write_zeroes); > >+is_write_zeroes, > >+ > >+ is_secdiscard); > > > >+if (blk_get_flags(conf->conf.blk) & BDRV_O_SECDISCARD) > >+virtio_add_feature(>host_features, > >VIRTIO_BLK_F_SECDISCARD); > >+else > >+virtio_clear_feature(>host_features, > >+ VIRTIO_BLK_F_SECDISCARD); > >+ > > IIUC here we set or not the feature if BDRV_O_SECDISCARD is set. > > Should we keep it disabled if "secdiscard" is false? (e.g. to avoid migration > problems) Yes, BDRV_O_SECDISCARD=(secdiscard=="on") ? 1 : 0; > > Otherwise what is the purpose of the "secdiscard" property? I cannot find a good method to detect whether host device support BLKSECDISCARD. So I add this "secdiscard" property to explicitly enable this feature. Best Regard Yadong > > Thanks, > Stefano
Re: [PATCH v9 68/76] target/riscv: gdb: support vector registers for rv64 & rv32
On Fri, Oct 29, 2021 at 8:24 PM wrote: > > From: Hsiangkai Wang > > Signed-off-by: Hsiangkai Wang > Signed-off-by: Greentime Hu > Signed-off-by: Frank Chang Acked-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 2 + > target/riscv/cpu.h | 1 + > target/riscv/gdbstub.c | 184 + > 3 files changed, 187 insertions(+) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index c760ea08621..860f356bd99 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -675,6 +675,8 @@ static const char *riscv_gdb_get_dynamic_xml(CPUState > *cs, const char *xmlname) > > if (strcmp(xmlname, "riscv-csr.xml") == 0) { > return cpu->dyn_csr_xml; > +} else if (strcmp(xmlname, "riscv-vector.xml") == 0) { > +return cpu->dyn_vreg_xml; > } > > return NULL; > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h > index 5d93ccdfa71..dc10f27093b 100644 > --- a/target/riscv/cpu.h > +++ b/target/riscv/cpu.h > @@ -291,6 +291,7 @@ struct RISCVCPU { > CPURISCVState env; > > char *dyn_csr_xml; > +char *dyn_vreg_xml; > > /* Configuration Settings */ > struct { > diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c > index 23429179e2e..881ab333924 100644 > --- a/target/riscv/gdbstub.c > +++ b/target/riscv/gdbstub.c > @@ -20,6 +20,32 @@ > #include "exec/gdbstub.h" > #include "cpu.h" > > +struct TypeSize { > +const char *gdb_type; > +const char *id; > +int size; > +const char suffix; > +}; > + > +static const struct TypeSize vec_lanes[] = { > +/* quads */ > +{ "uint128", "quads", 128, 'q' }, > +/* 64 bit */ > +{ "uint64", "longs", 64, 'l' }, > +/* 32 bit */ > +{ "uint32", "words", 32, 'w' }, > +/* 16 bit */ > +{ "uint16", "shorts", 16, 's' }, > +/* > + * TODO: currently there is no reliable way of telling > + * if the remote gdb actually understands ieee_half so > + * we don't expose it in the target description for now. > + * { "ieee_half", 16, 'h', 'f' }, > + */ > +/* bytes */ > +{ "uint8", "bytes", 8, 'b' }, > +}; > + > int riscv_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n) > { > RISCVCPU *cpu = RISCV_CPU(cs); > @@ -101,6 +127,96 @@ static int riscv_gdb_set_fpu(CPURISCVState *env, uint8_t > *mem_buf, int n) > return 0; > } > > +/* > + * Convert register index number passed by GDB to the correspond > + * vector CSR number. Vector CSRs are defined after vector registers > + * in dynamic generated riscv-vector.xml, thus the starting register index > + * of vector CSRs is 32. > + * Return 0 if register index number is out of range. > + */ > +static int riscv_gdb_vector_csrno(int num_regs) > +{ > +/* > + * The order of vector CSRs in the switch case > + * should match with the order defined in csr_ops[]. > + */ > +switch (num_regs) { > +case 32: > +return CSR_VSTART; > +case 33: > +return CSR_VXSAT; > +case 34: > +return CSR_VXRM; > +case 35: > +return CSR_VCSR; > +case 36: > +return CSR_VL; > +case 37: > +return CSR_VTYPE; > +case 38: > +return CSR_VLENB; > +default: > +/* Unknown register. */ > +return 0; > +} > +} > + > +static int riscv_gdb_get_vector(CPURISCVState *env, GByteArray *buf, int n) > +{ > +uint16_t vlenb = env_archcpu(env)->cfg.vlen >> 3; > +if (n < 32) { > +int i; > +int cnt = 0; > +for (i = 0; i < vlenb; i += 8) { > +cnt += gdb_get_reg64(buf, > + env->vreg[(n * vlenb + i) / 8]); > +} > +return cnt; > +} > + > +int csrno = riscv_gdb_vector_csrno(n); > + > +if (!csrno) { > +return 0; > +} > + > +target_ulong val = 0; > +int result = riscv_csrrw_debug(env, csrno, , 0, 0); > + > +if (result == 0) { > +return gdb_get_regl(buf, val); > +} > + > +return 0; > +} > + > +static int riscv_gdb_set_vector(CPURISCVState *env, uint8_t *mem_buf, int n) > +{ > +uint16_t vlenb = env_archcpu(env)->cfg.vlen >> 3; > +if (n < 32) { > +int i; > +for (i = 0; i < vlenb; i += 8) { > +env->vreg[(n * vlenb + i) / 8] = ldq_p(mem_buf + i); > +} > +return vlenb; > +} > + > +int csrno = riscv_gdb_vector_csrno(n); > + > +if (!csrno) { > +return 0; > +} > + > +target_ulong val = ldtul_p(mem_buf); > +int result = riscv_csrrw_debug(env, csrno, NULL, val, -1); > + > +if (result == 0) { > +return sizeof(target_ulong); > +} > + > +return 0; > +} > + > static int riscv_gdb_get_csr(CPURISCVState *env, GByteArray *buf, int n) > { > if (n < CSR_TABLE_SIZE) { > @@ -187,6 +303,68 @@ static int riscv_gen_dynamic_csr_xml(CPUState *cs, int > base_reg) > return CSR_TABLE_SIZE; > } > > +static int
Re: [PATCH v4 04/20] target/riscv: Extend pc for runtime pc write
On Fri, Nov 12, 2021 at 2:01 AM LIU Zhiwei wrote: > > In some cases, we must restore the guest PC to the address of the start of > the TB, such as when the instruction counter hits zero. So extend pc register > according to current xlen for these cases. > > Signed-off-by: LIU Zhiwei > Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c| 22 +++--- > target/riscv/cpu.h| 2 ++ > target/riscv/cpu_helper.c | 2 +- > 3 files changed, 22 insertions(+), 4 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index f812998123..0d2d175fa2 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -319,7 +319,12 @@ static void riscv_cpu_set_pc(CPUState *cs, vaddr value) > { > RISCVCPU *cpu = RISCV_CPU(cs); > CPURISCVState *env = >env; > -env->pc = value; > + > +if (cpu_get_xl(env) == MXL_RV32) { > +env->pc = (int32_t)value; > +} else { > +env->pc = value; > +} > } > > static void riscv_cpu_synchronize_from_tb(CPUState *cs, > @@ -327,7 +332,13 @@ static void riscv_cpu_synchronize_from_tb(CPUState *cs, > { > RISCVCPU *cpu = RISCV_CPU(cs); > CPURISCVState *env = >env; > -env->pc = tb->pc; > +RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL); > + > +if (xl == MXL_RV32) { > +env->pc = (int32_t)tb->pc; > +} else { > +env->pc = tb->pc; > +} > } > > static bool riscv_cpu_has_work(CPUState *cs) > @@ -348,7 +359,12 @@ static bool riscv_cpu_has_work(CPUState *cs) > void restore_state_to_opc(CPURISCVState *env, TranslationBlock *tb, >target_ulong *data) > { > -env->pc = data[0]; > +RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL); > +if (xl == MXL_RV32) { > +env->pc = (int32_t)data[0]; > +} else { > +env->pc = data[0]; > +} > } > > static void riscv_cpu_reset(DeviceState *dev) > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h > index 0760c0af93..8befff0166 100644 > --- a/target/riscv/cpu.h > +++ b/target/riscv/cpu.h > @@ -420,6 +420,8 @@ static inline RISCVMXL riscv_cpu_mxl(CPURISCVState *env) > } > #endif > > +RISCVMXL cpu_get_xl(CPURISCVState *env); > + > /* > * A simplification for VLMAX > * = (1 << LMUL) * VLEN / (8 * (1 << SEW)) > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > index 4c048cc266..79aba9c880 100644 > --- a/target/riscv/cpu_helper.c > +++ b/target/riscv/cpu_helper.c > @@ -35,7 +35,7 @@ int riscv_cpu_mmu_index(CPURISCVState *env, bool ifetch) > #endif > } > > -static RISCVMXL cpu_get_xl(CPURISCVState *env) > +RISCVMXL cpu_get_xl(CPURISCVState *env) > { > #if defined(TARGET_RISCV32) > return MXL_RV32; > -- > 2.25.1 > >
Re: [PATCH v9 74/76] target/riscv: rvv-1.0: add vector unit-stride mask load/store insns
On Fri, Oct 29, 2021 at 8:38 PM wrote: > > From: Frank Chang > > Signed-off-by: Frank Chang Acked-by: Alistair Francis Alistair > --- > target/riscv/helper.h | 2 ++ > target/riscv/insn32.decode | 4 +++ > target/riscv/insn_trans/trans_rvv.c.inc | 40 + > target/riscv/vector_helper.c| 21 + > 4 files changed, 67 insertions(+) > > diff --git a/target/riscv/helper.h b/target/riscv/helper.h > index ab283d12b79..6e58343af35 100644 > --- a/target/riscv/helper.h > +++ b/target/riscv/helper.h > @@ -129,6 +129,8 @@ DEF_HELPER_5(vse8_v_mask, void, ptr, ptr, tl, env, i32) > DEF_HELPER_5(vse16_v_mask, void, ptr, ptr, tl, env, i32) > DEF_HELPER_5(vse32_v_mask, void, ptr, ptr, tl, env, i32) > DEF_HELPER_5(vse64_v_mask, void, ptr, ptr, tl, env, i32) > +DEF_HELPER_5(vlm_v, void, ptr, ptr, tl, env, i32) > +DEF_HELPER_5(vsm_v, void, ptr, ptr, tl, env, i32) > DEF_HELPER_6(vlse8_v, void, ptr, ptr, tl, tl, env, i32) > DEF_HELPER_6(vlse16_v, void, ptr, ptr, tl, tl, env, i32) > DEF_HELPER_6(vlse32_v, void, ptr, ptr, tl, tl, env, i32) > diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode > index 3b6524bad91..1a4a2871464 100644 > --- a/target/riscv/insn32.decode > +++ b/target/riscv/insn32.decode > @@ -305,6 +305,10 @@ vse16_v... 000 . 0 . 101 . 0100111 > @r2_nfvm > vse32_v... 000 . 0 . 110 . 0100111 @r2_nfvm > vse64_v... 000 . 0 . 111 . 0100111 @r2_nfvm > > +# Vector unit-stride mask load/store insns. > +vlm_v 000 000 1 01011 . 000 . 111 @r2 > +vsm_v 000 000 1 01011 . 000 . 0100111 @r2 > + > # Vector strided insns. > vlse8_v ... 010 . . . 000 . 111 @r_nfvm > vlse16_v... 010 . . . 101 . 111 @r_nfvm > diff --git a/target/riscv/insn_trans/trans_rvv.c.inc > b/target/riscv/insn_trans/trans_rvv.c.inc > index e540b5d33c2..97b1dc10265 100644 > --- a/target/riscv/insn_trans/trans_rvv.c.inc > +++ b/target/riscv/insn_trans/trans_rvv.c.inc > @@ -697,6 +697,46 @@ GEN_VEXT_TRANS(vse16_v, MO_16, r2nfvm, st_us_op, > st_us_check) > GEN_VEXT_TRANS(vse32_v, MO_32, r2nfvm, st_us_op, st_us_check) > GEN_VEXT_TRANS(vse64_v, MO_64, r2nfvm, st_us_op, st_us_check) > > +/* > + *** unit stride mask load and store > + */ > +static bool ld_us_mask_op(DisasContext *s, arg_vlm_v *a, uint8_t eew) > +{ > +uint32_t data = 0; > +gen_helper_ldst_us *fn = gen_helper_vlm_v; > + > +/* EMUL = 1, NFIELDS = 1 */ > +data = FIELD_DP32(data, VDATA, LMUL, 0); > +data = FIELD_DP32(data, VDATA, NF, 1); > +return ldst_us_trans(a->rd, a->rs1, data, fn, s, false); > +} > + > +static bool ld_us_mask_check(DisasContext *s, arg_vlm_v *a, uint8_t eew) > +{ > +/* EMUL = 1, NFIELDS = 1 */ > +return require_rvv(s) && vext_check_isa_ill(s); > +} > + > +static bool st_us_mask_op(DisasContext *s, arg_vsm_v *a, uint8_t eew) > +{ > +uint32_t data = 0; > +gen_helper_ldst_us *fn = gen_helper_vsm_v; > + > +/* EMUL = 1, NFIELDS = 1 */ > +data = FIELD_DP32(data, VDATA, LMUL, 0); > +data = FIELD_DP32(data, VDATA, NF, 1); > +return ldst_us_trans(a->rd, a->rs1, data, fn, s, true); > +} > + > +static bool st_us_mask_check(DisasContext *s, arg_vsm_v *a, uint8_t eew) > +{ > +/* EMUL = 1, NFIELDS = 1 */ > +return require_rvv(s) && vext_check_isa_ill(s); > +} > + > +GEN_VEXT_TRANS(vlm_v, MO_8, vlm_v, ld_us_mask_op, ld_us_mask_check) > +GEN_VEXT_TRANS(vsm_v, MO_8, vsm_v, st_us_mask_op, st_us_mask_check) > + > /* > *** stride load and store > */ > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index 83373ca6fc6..4c1a1310e63 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -345,6 +345,27 @@ GEN_VEXT_ST_US(vse16_v, int16_t, ste_h) > GEN_VEXT_ST_US(vse32_v, int32_t, ste_w) > GEN_VEXT_ST_US(vse64_v, int64_t, ste_d) > > +/* > + *** unit stride mask load and store, EEW = 1 > + */ > +void HELPER(vlm_v)(void *vd, void *v0, target_ulong base, > +CPURISCVState *env, uint32_t desc) > +{ > +/* evl = ceil(vl/8) */ > +uint8_t evl = (env->vl + 7) >> 3; > +vext_ldst_us(vd, base, env, desc, lde_b, > + 0, evl, GETPC(), MMU_DATA_LOAD); > +} > + > +void HELPER(vsm_v)(void *vd, void *v0, target_ulong base, > +CPURISCVState *env, uint32_t desc) > +{ > +/* evl = ceil(vl/8) */ > +uint8_t evl = (env->vl + 7) >> 3; > +vext_ldst_us(vd, base, env, desc, ste_b, > + 0, evl, GETPC(), MMU_DATA_STORE); > +} > + > /* > *** index: access vector element from indexed memory > */ > -- > 2.25.1 > >
Re: [PATCH v9 73/76] target/riscv: rvv-1.0: add evl parameter to vext_ldst_us()
On Fri, Oct 29, 2021 at 8:11 PM wrote: > > From: Frank Chang > > Add supports of Vector unit-stride mask load/store instructions > (vlm.v, vsm.v), which has: > evl (effective vector length) = ceil(env->vl / 8). > > The new instructions operate the same as unmasked byte loads and stores. > Add evl parameter to reuse vext_ldst_us(). > > Signed-off-by: Frank Chang Reviewed-by: Alistair Francis Alistair > --- > target/riscv/vector_helper.c | 36 ++-- > 1 file changed, 18 insertions(+), 18 deletions(-) > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index 946dca53ffd..83373ca6fc6 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -279,15 +279,15 @@ GEN_VEXT_ST_STRIDE(vsse64_v, int64_t, ste_d) > /* unmasked unit-stride load and store operation*/ > static void > vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, > - vext_ldst_elem_fn *ldst_elem, > - uint32_t esz, uintptr_t ra, MMUAccessType access_type) > + vext_ldst_elem_fn *ldst_elem, uint32_t esz, uint32_t evl, > + uintptr_t ra, MMUAccessType access_type) > { > uint32_t i, k; > uint32_t nf = vext_nf(desc); > uint32_t max_elems = vext_max_elems(desc, esz); > > /* load bytes from guest memory */ > -for (i = env->vstart; i < env->vl; i++, env->vstart++) { > +for (i = env->vstart; i < evl; i++, env->vstart++) { > k = 0; > while (k < nf) { > target_ulong addr = base + ((i * nf + k) << esz); > @@ -316,7 +316,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong base, > \ >CPURISCVState *env, uint32_t desc)\ > { \ > vext_ldst_us(vd, base, env, desc, LOAD_FN, \ > - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_LOAD); \ > + ctzl(sizeof(ETYPE)), env->vl, GETPC(), MMU_DATA_LOAD); \ > } > > GEN_VEXT_LD_US(vle8_v, int8_t, lde_b) > @@ -324,20 +324,20 @@ GEN_VEXT_LD_US(vle16_v, int16_t, lde_h) > GEN_VEXT_LD_US(vle32_v, int32_t, lde_w) > GEN_VEXT_LD_US(vle64_v, int64_t, lde_d) > > -#define GEN_VEXT_ST_US(NAME, ETYPE, STORE_FN) \ > -void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \ > - CPURISCVState *env, uint32_t desc) \ > -{ \ > -uint32_t stride = vext_nf(desc) << ctzl(sizeof(ETYPE)); \ > -vext_ldst_stride(vd, v0, base, stride, env, desc, false, STORE_FN, \ > - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_STORE); \ > -} \ > -\ > -void HELPER(NAME)(void *vd, void *v0, target_ulong base,\ > - CPURISCVState *env, uint32_t desc)\ > -{ \ > -vext_ldst_us(vd, base, env, desc, STORE_FN, \ > - ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_STORE); \ > +#define GEN_VEXT_ST_US(NAME, ETYPE, STORE_FN)\ > +void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \ > + CPURISCVState *env, uint32_t desc) \ > +{\ > +uint32_t stride = vext_nf(desc) << ctzl(sizeof(ETYPE)); \ > +vext_ldst_stride(vd, v0, base, stride, env, desc, false, STORE_FN, \ > + ctzl(sizeof(ETYPE)), GETPC(), MMU_DATA_STORE); \ > +}\ > + \ > +void HELPER(NAME)(void *vd, void *v0, target_ulong base, \ > + CPURISCVState *env, uint32_t desc) \ > +{\ > +vext_ldst_us(vd, base, env, desc, STORE_FN, \ > + ctzl(sizeof(ETYPE)), env->vl, GETPC(), MMU_DATA_STORE); \ > } > > GEN_VEXT_ST_US(vse8_v, int8_t, ste_b) > -- > 2.25.1 > >
Re: [PATCH v9 67/76] target/riscv: rvv-1.0: trigger illegal instruction exception if frm is not valid
On Fri, Oct 29, 2021 at 8:14 PM wrote: > > From: Frank Chang > > If the frm field contains an invalid rounding mode (101-111), > attempting to execute any vector floating-point instruction, even > those that do not depend on the rounding mode, will raise an illegal > instruction exception. > > Call gen_set_rm() with DYN rounding mode to check and trigger illegal > instruction exception if frm field contains invalid value at run-time > for vector floating-point instructions. > > Signed-off-by: Frank Chang Acked-by: Alistair Francis Alistair > --- > target/riscv/insn_trans/trans_rvv.c.inc | 22 ++ > 1 file changed, 22 insertions(+) > > diff --git a/target/riscv/insn_trans/trans_rvv.c.inc > b/target/riscv/insn_trans/trans_rvv.c.inc > index 7589c8ce32a..53c8573f117 100644 > --- a/target/riscv/insn_trans/trans_rvv.c.inc > +++ b/target/riscv/insn_trans/trans_rvv.c.inc > @@ -2374,6 +2374,10 @@ static bool do_opfv(DisasContext *s, arg_rmr *a, > int rm) > { > if (checkfn(s, a)) { > +if (rm != RISCV_FRM_DYN) { > +gen_set_rm(s, RISCV_FRM_DYN); > +} > + > uint32_t data = 0; > TCGLabel *over = gen_new_label(); > gen_set_rm(s, rm); > @@ -2459,6 +2463,8 @@ static bool trans_vfmv_v_f(DisasContext *s, > arg_vfmv_v_f *a) > require_rvf(s) && > vext_check_isa_ill(s) && > require_align(a->rd, s->lmul)) { > +gen_set_rm(s, RISCV_FRM_DYN); > + > TCGv_i64 t1; > > if (s->vl_eq_vlmax) { > @@ -2540,6 +2546,10 @@ static bool opfv_widen_check(DisasContext *s, arg_rmr > *a) > static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ > { \ > if (opfv_widen_check(s, a)) { \ > +if (FRM != RISCV_FRM_DYN) {\ > +gen_set_rm(s, RISCV_FRM_DYN); \ > +} \ > + \ > uint32_t data = 0; \ > static gen_helper_gvec_3_ptr * const fns[2] = {\ > gen_helper_##HELPER##_h, \ > @@ -2627,6 +2637,10 @@ static bool opfv_narrow_check(DisasContext *s, arg_rmr > *a) > static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ > { \ > if (opfv_narrow_check(s, a)) { \ > +if (FRM != RISCV_FRM_DYN) {\ > +gen_set_rm(s, RISCV_FRM_DYN); \ > +} \ > + \ > uint32_t data = 0; \ > static gen_helper_gvec_3_ptr * const fns[2] = {\ > gen_helper_##HELPER##_h, \ > @@ -2668,6 +2682,10 @@ static bool opxfv_narrow_check(DisasContext *s, > arg_rmr *a) > static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ > { \ > if (opxfv_narrow_check(s, a)) {\ > +if (FRM != RISCV_FRM_DYN) {\ > +gen_set_rm(s, RISCV_FRM_DYN); \ > +} \ > + \ > uint32_t data = 0; \ > static gen_helper_gvec_3_ptr * const fns[3] = {\ > gen_helper_##HELPER##_b, \ > @@ -3138,6 +3156,8 @@ static bool trans_vfmv_f_s(DisasContext *s, > arg_vfmv_f_s *a) > if (require_rvv(s) && > require_rvf(s) && > vext_check_isa_ill(s)) { > +gen_set_rm(s, RISCV_FRM_DYN); > + > unsigned int ofs = (8 << s->sew); > unsigned int len = 64 - ofs; > TCGv_i64 t_nan; > @@ -3162,6 +3182,8 @@ static bool trans_vfmv_s_f(DisasContext *s, > arg_vfmv_s_f *a) > if (require_rvv(s) && > require_rvf(s) && > vext_check_isa_ill(s)) { > +gen_set_rm(s, RISCV_FRM_DYN); > + > /* The instructions ignore LMUL and vector register group. */ > TCGv_i64 t1; > TCGLabel *over = gen_new_label(); > -- > 2.25.1 > >
Re: [PATCH v9 66/76] target/riscv: rvv-1.0: implement vstart CSR
Alistair On Fri, Oct 29, 2021 at 7:55 PM wrote: > > From: Frank Chang > > * Update and check vstart value for vector instructions. > * Add whole register move instruction helper functions as we have to > call helper function for case where vstart is not zero. > * Remove probe_pages() calls in vector load/store instructions > (except fault-only-first loads) to raise the memory access exception > at the exact processed vector element. > > Signed-off-by: Frank Chang Reviewed-by: Alistair Francis Alistair > --- > target/riscv/csr.c | 6 +- > target/riscv/helper.h | 5 + > target/riscv/insn_trans/trans_rvv.c.inc | 75 ++--- > target/riscv/translate.c| 6 +- > target/riscv/vector_helper.c| 210 +++- > 5 files changed, 199 insertions(+), 103 deletions(-) > > diff --git a/target/riscv/csr.c b/target/riscv/csr.c > index 3dfbc177381..146447eac5d 100644 > --- a/target/riscv/csr.c > +++ b/target/riscv/csr.c > @@ -343,7 +343,11 @@ static RISCVException write_vstart(CPURISCVState *env, > int csrno, > #if !defined(CONFIG_USER_ONLY) > env->mstatus |= MSTATUS_VS; > #endif > -env->vstart = val; > +/* > + * The vstart CSR is defined to have only enough writable bits > + * to hold the largest element index, i.e. lg2(VLEN) bits. > + */ > +env->vstart = val & ~(~0ULL << ctzl(env_archcpu(env)->cfg.vlen)); > return RISCV_EXCP_NONE; > } > > diff --git a/target/riscv/helper.h b/target/riscv/helper.h > index 1a0d817f0f5..a717a87a0e0 100644 > --- a/target/riscv/helper.h > +++ b/target/riscv/helper.h > @@ -1073,6 +1073,11 @@ DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, > env, i32) > DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32) > DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32) > > +DEF_HELPER_4(vmv1r_v, void, ptr, ptr, env, i32) > +DEF_HELPER_4(vmv2r_v, void, ptr, ptr, env, i32) > +DEF_HELPER_4(vmv4r_v, void, ptr, ptr, env, i32) > +DEF_HELPER_4(vmv8r_v, void, ptr, ptr, env, i32) > + > DEF_HELPER_5(vzext_vf2_h, void, ptr, ptr, ptr, env, i32) > DEF_HELPER_5(vzext_vf2_w, void, ptr, ptr, ptr, env, i32) > DEF_HELPER_5(vzext_vf2_d, void, ptr, ptr, ptr, env, i32) > diff --git a/target/riscv/insn_trans/trans_rvv.c.inc > b/target/riscv/insn_trans/trans_rvv.c.inc > index be3f9f13275..7589c8ce32a 100644 > --- a/target/riscv/insn_trans/trans_rvv.c.inc > +++ b/target/riscv/insn_trans/trans_rvv.c.inc > @@ -490,7 +490,7 @@ static bool vext_check_sds(DisasContext *s, int vd, int > vs1, int vs2, int vm) > */ > static bool vext_check_reduction(DisasContext *s, int vs2) > { > -return require_align(vs2, s->lmul); > +return require_align(vs2, s->lmul) && (s->vstart == 0); > } > > /* > @@ -2786,7 +2786,8 @@ GEN_MM_TRANS(vmxnor_mm) > static bool trans_vcpop_m(DisasContext *s, arg_rmr *a) > { > if (require_rvv(s) && > -vext_check_isa_ill(s)) { > +vext_check_isa_ill(s) && > +s->vstart == 0) { > TCGv_ptr src2, mask; > TCGv dst; > TCGv_i32 desc; > @@ -2817,7 +2818,8 @@ static bool trans_vcpop_m(DisasContext *s, arg_rmr *a) > static bool trans_vfirst_m(DisasContext *s, arg_rmr *a) > { > if (require_rvv(s) && > -vext_check_isa_ill(s)) { > +vext_check_isa_ill(s) && > +s->vstart == 0) { > TCGv_ptr src2, mask; > TCGv dst; > TCGv_i32 desc; > @@ -2852,7 +2854,8 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) >\ > if (require_rvv(s) && \ > vext_check_isa_ill(s) && \ > require_vm(a->vm, a->rd) &&\ > -(a->rd != a->rs2)) { \ > +(a->rd != a->rs2) && \ > +(s->vstart == 0)) {\ > uint32_t data = 0; \ > gen_helper_gvec_3_ptr *fn = gen_helper_##NAME; \ > TCGLabel *over = gen_new_label(); \ > @@ -2888,7 +2891,8 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m > *a) > vext_check_isa_ill(s) && > !is_overlapped(a->rd, 1 << MAX(s->lmul, 0), a->rs2, 1) && > require_vm(a->vm, a->rd) && > -require_align(a->rd, s->lmul)) { > +require_align(a->rd, s->lmul) && > +(s->vstart == 0)) { > uint32_t data = 0; > TCGLabel *over = gen_new_label(); > tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); > @@ -3109,6 +3113,7 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x > *a) > TCGLabel *over = gen_new_label(); > > tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); > +tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); > > t1 =
Re: [PATCH for 6.2] nbd/server: Silence clang sanitizer warning
On 11/15/21 23:39, Eric Blake wrote: > clang's sanitizer is picky: memset(NULL, x, 0) is technically > undefined behavior, even though no sane implementation of memset() > deferences the NULL. Caught by the nbd-qemu-allocation iotest. > > The alternative to checking before each memset is to instead force an > allocation of 1 element instead of g_new0(type, 0)'s behavior of > returning NULL for a 0-length array. > > Reported-by: Peter Maydell > Fixes: 3b1f244c59 (nbd: Allow export of multiple bitmaps for one device) > Signed-off-by: Eric Blake > --- > nbd/server.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
[PULL 2/3] tests/unit/test-smp-parse: QOM'ify smp_machine_class_init()
smp_machine_class_init() is the actual TypeInfo::class_init(). Declare it as such in smp_machine_info, and avoid to call it manually in each test. Move smp_machine_info definition just before we register the type to avoid a forward declaration. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé Message-Id: <2025145900.2531865-3-phi...@redhat.com> --- tests/unit/test-smp-parse.c | 34 ++ 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index 47774c1ee2a..3fff2af4e27 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -75,14 +75,6 @@ typedef struct SMPTestData { const char *expect_error; } SMPTestData; -/* Type info of the tested machine */ -static const TypeInfo smp_machine_info = { -.name = TYPE_MACHINE, -.parent = TYPE_OBJECT, -.class_size = sizeof(MachineClass), -.instance_size = sizeof(MachineState), -}; - /* * List all the possible valid sub-collections of the generic 5 * topology parameters (i.e. cpus/maxcpus/sockets/cores/threads), @@ -480,9 +472,10 @@ static void unsupported_params_init(MachineClass *mc, SMPTestData *data) } } -/* Reset the related machine properties before each sub-test */ -static void smp_machine_class_init(MachineClass *mc) +static void machine_base_class_init(ObjectClass *oc, void *data) { +MachineClass *mc = MACHINE_CLASS(oc); + mc->min_cpus = MIN_CPUS; mc->max_cpus = MAX_CPUS; @@ -498,8 +491,6 @@ static void test_generic(void) SMPTestData *data = &(SMPTestData){{ }}; int i; -smp_machine_class_init(mc); - for (i = 0; i < ARRAY_SIZE(data_generic_valid); i++) { *data = data_generic_valid[i]; unsupported_params_init(mc, data); @@ -539,7 +530,6 @@ static void test_with_dies(void) unsigned int num_dies = 2; int i; -smp_machine_class_init(mc); /* Force the SMP compat properties */ mc->smp_props.dies_supported = true; @@ -586,12 +576,24 @@ static void test_with_dies(void) object_unref(obj); } +/* Type info of the tested machine */ +static const TypeInfo smp_machine_types[] = { +{ +.name = TYPE_MACHINE, +.parent = TYPE_OBJECT, +.class_init = machine_base_class_init, +.class_size = sizeof(MachineClass), +.instance_size = sizeof(MachineState), +} +}; + +DEFINE_TYPES(smp_machine_types) + int main(int argc, char *argv[]) { -g_test_init(, , NULL); - module_call_init(MODULE_INIT_QOM); -type_register_static(_machine_info); + +g_test_init(, , NULL); g_test_add_func("/test-smp-parse/generic", test_generic); g_test_add_func("/test-smp-parse/with_dies", test_with_dies); -- 2.31.1
Re: [PATCH] sev: allow capabilities to check for SEV-ES support
On Mon, Nov 15, 2021 at 02:38:04PM -0500, Tyler Fanelli wrote: > Probe for SEV-ES and SEV-SNP capabilities to distinguish between Rome, > Naples, and Milan processors. Use the CPUID function to probe if a > processor is capable of running SEV-ES or SEV-SNP, rather than if it > actually is running SEV-ES or SEV-SNP. > > Signed-off-by: Tyler Fanelli > --- > qapi/misc-target.json | 11 +-- > target/i386/sev.c | 6 -- > 2 files changed, 13 insertions(+), 4 deletions(-) > > diff --git a/qapi/misc-target.json b/qapi/misc-target.json > index 5aa2b95b7d..c3e9bce12b 100644 > --- a/qapi/misc-target.json > +++ b/qapi/misc-target.json > @@ -182,13 +182,19 @@ > # @reduced-phys-bits: Number of physical Address bit reduction when SEV is > # enabled > # > +# @es: SEV-ES capability of the machine. > +# > +# @snp: SEV-SNP capability of the machine. > +# Missing '(since 7.0)' tags on the new members. > # Since: 2.12 > ## > { 'struct': 'SevCapability', >'data': { 'pdh': 'str', > 'cert-chain': 'str', > 'cbitpos': 'int', > -'reduced-phys-bits': 'int'}, > +'reduced-phys-bits': 'int', > +'es': 'bool', > +'snp': 'bool'}, >'if': 'TARGET_I386' } > > ## > @@ -205,7 +211,8 @@ > # > # -> { "execute": "query-sev-capabilities" } > # <- { "return": { "pdh": "8CCDD8DDD", "cert-chain": "888CCCDDDEE", > -# "cbitpos": 47, "reduced-phys-bits": 5}} > +# "cbitpos": 47, "reduced-phys-bits": 5 > +# "es": false, "snp": false}} Invalid JSON, as you missed the comma needed after 5. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 00/11] tests/unit: Fix test-smp-parse
On 11/15/21 15:58, Philippe Mathieu-Daudé wrote: > Missing review: patches #4 to #8 (new) > > Yet another approach to fix test-smp-parse. v2 from Yanan Wang: > https://lore.kernel.org/qemu-devel/2021024429.10568-1-wangyana...@huawei.com/ > > Here we use the QOM class_init() to avoid having to deal with > object_unref() and deinit(). > > Since v3: > - Restore 'dies_supported' field in test_with_dies (Yanan) > - Add R-b tags > - QOM-ify the TYPE_MACHINE classes > > Philippe Mathieu-Daudé (11): > tests/unit/test-smp-parse: Restore MachineClass fields after modifying > tests/unit/test-smp-parse: QOM'ify smp_machine_class_init() > tests/unit/test-smp-parse: Explicit MachineClass name Patches 1-3 queued.
[PULL 3/3] tests/unit/test-smp-parse: Explicit MachineClass name
If the MachineClass::name pointer is not explicitly set, it is NULL. Per the C standard, passing a NULL pointer to printf "%s" format is undefined. Some implementations display it as 'NULL', other as 'null'. Since we are comparing the formatted output, we need a stable value. The easiest is to explicit a machine name string. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé Message-Id: <2025145900.2531865-4-phi...@redhat.com> --- tests/unit/test-smp-parse.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index 3fff2af4e27..b02450e25a3 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -23,6 +23,8 @@ #define MIN_CPUS 1 /* set the min CPUs supported by the machine as 1 */ #define MAX_CPUS 512 /* set the max CPUs supported by the machine as 512 */ +#define SMP_MACHINE_NAME "TEST-SMP" + /* * Used to define the generic 3-level CPU topology hierarchy * -sockets/cores/threads @@ -307,13 +309,13 @@ static struct SMPTestData data_generic_invalid[] = { * should tweak the supported min CPUs to 2 for testing */ .config = SMP_CONFIG_GENERIC(T, 1, F, 0, F, 0, F, 0, F, 0), .expect_error = "Invalid SMP CPUs 1. The min CPUs supported " -"by machine '(null)' is 2", +"by machine '" SMP_MACHINE_NAME "' is 2", }, { /* config: -smp 512 * should tweak the supported max CPUs to 511 for testing */ .config = SMP_CONFIG_GENERIC(T, 512, F, 0, F, 0, F, 0, F, 0), .expect_error = "Invalid SMP CPUs 512. The max CPUs supported " -"by machine '(null)' is 511", +"by machine '" SMP_MACHINE_NAME "' is 511", }, }; @@ -481,6 +483,8 @@ static void machine_base_class_init(ObjectClass *oc, void *data) mc->smp_props.prefer_sockets = true; mc->smp_props.dies_supported = false; + +mc->name = g_strdup(SMP_MACHINE_NAME); } static void test_generic(void) -- 2.31.1
[PULL 1/3] tests/unit/test-smp-parse: Restore MachineClass fields after modifying
There is a single MachineClass object, registered with type_register_static(_machine_info). Since the same object is used multiple times (an MachineState object is instantiated in both test_generic and test_with_dies), we should restore its internal state after modifying for the test purpose. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé Message-Id: <2025145900.2531865-2-phi...@redhat.com> --- tests/unit/test-smp-parse.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index cbe0c990494..47774c1ee2a 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -512,7 +512,7 @@ static void test_generic(void) smp_parse_test(ms, data, true); } -/* Reset the supported min CPUs and max CPUs */ +/* Force invalid min CPUs and max CPUs */ mc->min_cpus = 2; mc->max_cpus = 511; @@ -523,6 +523,10 @@ static void test_generic(void) smp_parse_test(ms, data, false); } +/* Reset the supported min CPUs and max CPUs */ +mc->min_cpus = MIN_CPUS; +mc->max_cpus = MAX_CPUS; + object_unref(obj); } @@ -536,6 +540,7 @@ static void test_with_dies(void) int i; smp_machine_class_init(mc); +/* Force the SMP compat properties */ mc->smp_props.dies_supported = true; for (i = 0; i < ARRAY_SIZE(data_generic_valid); i++) { @@ -575,6 +580,9 @@ static void test_with_dies(void) smp_parse_test(ms, data, false); } +/* Restore the SMP compat properties */ +mc->smp_props.dies_supported = false; + object_unref(obj); } -- 2.31.1
[PULL 0/3] Machine-next patches for 2021-11-15
The following changes since commit 42f6c9179be4401974dd3a75ee72defd16b5092d: Merge tag 'pull-ppc-2022' of https://github.com/legoater/qemu into staging (2021-11-12 12:28:25 +0100) are available in the Git repository at: https://github.com/philmd/qemu.git tags/machine-core-2025 for you to fetch changes up to 7b6d1bc9629f3dd45647ec3418e0606a9248: tests/unit/test-smp-parse: Explicit MachineClass name (2021-11-15 21:49:16 +0100) Machine core patches - Rework SMP parsing unit test to work on WinGW: https://github.com/qemu/qemu/runs/4078386652 This fixes: Test smp_parse failed! Expected error report: Invalid SMP CPUs 1. The min CPUs supported by machine '(null)' is 2 Output error report: Invalid SMP CPUs 1. The min CPUs supported by machine '(NULL)' is 2 Philippe Mathieu-Daudé (3): tests/unit/test-smp-parse: Restore MachineClass fields after modifying tests/unit/test-smp-parse: QOM'ify smp_machine_class_init() tests/unit/test-smp-parse: Explicit MachineClass name tests/unit/test-smp-parse.c | 52 +++-- 1 file changed, 33 insertions(+), 19 deletions(-) -- 2.31.1
Re: [PULL 00/20] pci,pc,virtio: bugfixes
On 11/15/21 17:37, Michael S. Tsirkin wrote: > The following changes since commit 0a70bcf18caf7a61d480f8448723c15209d128ef: > > Update version for v6.2.0-rc0 release (2021-11-09 18:22:57 +0100) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream > > for you to fetch changes up to 18416c62e36a79823a9e28f6b2260aa13c25e1d9: > > pcie: expire pending delete (2021-11-15 11:10:11 -0500) > > > pci,pc,virtio: bugfixes > > pci power management fixes > acpi hotplug fixes > misc other fixes > > Signed-off-by: Michael S. Tsirkin > > > Eugenio Pérez (4): > vhost: Rename last_index to vq_index_end > vhost: Fix last vq queue index of devices with no cvq > vdpa: Replace qemu_open_old by qemu_open at > vdpa: Check for existence of opts.vhostdev > > Gerd Hoffmann (6): > pci: implement power state > pcie: implement slot power control for pcie root ports > pcie: add power indicator blink check > pcie: factor out pcie_cap_slot_unplug() > pcie: fast unplug when slot power is off > pcie: expire pending delete > > Igor Mammedov (2): > pcie: rename 'native-hotplug' to 'x-native-hotplug' > tests: bios-tables-test update expected blobs > > Jason Wang (2): > virtio: use virtio accessor to access packed descriptor flags > virtio: use virtio accessor to access packed event > > Julia Suvorova (3): > hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type > bios-tables-test: Allow changes in DSDT ACPI tables > hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC > > Philippe Mathieu-Daudé (1): > hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines > > Stefan Hajnoczi (1): > softmmu/qdev-monitor: fix use-after-free in qdev_set_id() > > Stefano Garzarella (1): > net/vhost-vdpa: fix memory leak in vhost_vdpa_get_max_queue_pairs() > > include/hw/acpi/ich9.h| 1 + > include/hw/pci/pci.h | 2 + > include/hw/qdev-core.h| 1 + > include/hw/virtio/vhost.h | 4 +- > hw/acpi/ich9.c| 18 > hw/i386/acpi-build.c | 12 -- > hw/i386/pc.c | 2 + > hw/i386/pc_q35.c | 9 +++- > hw/mem/pc-dimm.c | 23 ++ > hw/net/vhost_net.c| 12 +++--- > hw/pci/pci.c | 25 ++- > hw/pci/pci_host.c | 6 ++- > hw/pci/pcie.c | 79 > -- > hw/pci/pcie_port.c| 2 +- > hw/virtio/vhost-vdpa.c| 2 +- > hw/virtio/virtio.c| 24 --- > net/vhost-vdpa.c | 8 +++- > softmmu/qdev-monitor.c| 6 ++- Cc'ing Alex for this apparently unrelated test failure: make[2]: Entering directory 'build/tests/tcg/arm-linux-user' timeout --foreground 60 tests/guest-debug/run-test.py --gdb /usr/bin/gdb-multiarch --qemu build/qemu-arm --qargs "" --bin testthread --test tests/tcg/multiarch/gdbstub/test-thread-breakpoint.py > run-gdbstub-thread-breakpoint.out make[2]: *** [tests/tcg/multiarch/Makefile.target:71: run-gdbstub-thread-breakpoint] Error 1 make[1]: *** [tests/tcg/Makefile.qemu:102: run-guest-tests] Error 2 make: *** [tests/Makefile.include:63: run-tcg-tests-arm-linux-user] Error 2 https://gitlab.com/qemu-project/qemu/-/jobs/1785024040#L5986
[PATCH for 6.2] nbd/server: Silence clang sanitizer warning
clang's sanitizer is picky: memset(NULL, x, 0) is technically undefined behavior, even though no sane implementation of memset() deferences the NULL. Caught by the nbd-qemu-allocation iotest. The alternative to checking before each memset is to instead force an allocation of 1 element instead of g_new0(type, 0)'s behavior of returning NULL for a 0-length array. Reported-by: Peter Maydell Fixes: 3b1f244c59 (nbd: Allow export of multiple bitmaps for one device) Signed-off-by: Eric Blake --- nbd/server.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 6d03e8a4b436..d9164ee6d0da 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2016-2020 Red Hat, Inc. + * Copyright (C) 2016-2021 Red Hat, Inc. * Copyright (C) 2005 Anthony Liguori * * Network Block Device Server Side @@ -879,7 +879,9 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta, if (!*query) { if (client->opt == NBD_OPT_LIST_META_CONTEXT) { meta->allocation_depth = meta->exp->allocation_depth; -memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); +if (meta->exp->nr_export_bitmaps) { +memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); +} } trace_nbd_negotiate_meta_query_parse("empty"); return true; @@ -894,7 +896,8 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta, if (nbd_strshift(, "dirty-bitmap:")) { trace_nbd_negotiate_meta_query_parse("dirty-bitmap:"); if (!*query) { -if (client->opt == NBD_OPT_LIST_META_CONTEXT) { +if (client->opt == NBD_OPT_LIST_META_CONTEXT && +meta->exp->nr_export_bitmaps) { memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); } trace_nbd_negotiate_meta_query_parse("empty"); @@ -1024,7 +1027,9 @@ static int nbd_negotiate_meta_queries(NBDClient *client, /* enable all known contexts */ meta->base_allocation = true; meta->allocation_depth = meta->exp->allocation_depth; -memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); +if (meta->exp->nr_export_bitmaps) { +memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); +} } else { for (i = 0; i < nb_queries; ++i) { ret = nbd_negotiate_meta_query(client, meta, errp); -- 2.33.1
[PATCH-for-6.2? 2/2] hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector
The TYPE_ARM_GICV3 device is an emulated one. When using KVM, it is recommended to use the TYPE_KVM_ARM_GICV3 device (which uses in-kernel support). When using --with-devices-FOO, it is possible to build a binary with a specific set of devices. When this binary is restricted to KVM accelerator, the TYPE_ARM_GICV3 device is irrelevant, and it is desirable to remove it from the binary. Therefore introduce the CONFIG_ARM_GIC_TCG Kconfig selector which select the files required to have the TYPE_ARM_GICV3 device, but also allowing to de-select this device. Signed-off-by: Philippe Mathieu-Daudé --- hw/intc/arm_gicv3.c | 2 +- hw/intc/Kconfig | 5 + hw/intc/meson.build | 10 ++ 3 files changed, 12 insertions(+), 5 deletions(-) diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c index c6282984b1e..76950c6f724 100644 --- a/hw/intc/arm_gicv3.c +++ b/hw/intc/arm_gicv3.c @@ -1,5 +1,5 @@ /* - * ARM Generic Interrupt Controller v3 + * ARM Generic Interrupt Controller v3 (emulation) * * Copyright (c) 2015 Huawei. * Copyright (c) 2016 Linaro Limited diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index 78aed93c454..010ded7eae0 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -25,6 +25,11 @@ config APIC select MSI_NONBROKEN select I8259 +config ARM_GIC_TCG +bool +default y +depends on ARM_GIC && TCG + config ARM_GIC_KVM bool default y diff --git a/hw/intc/meson.build b/hw/intc/meson.build index 11352806db2..70080bc161c 100644 --- a/hw/intc/meson.build +++ b/hw/intc/meson.build @@ -3,12 +3,14 @@ 'arm_gic.c', 'arm_gic_common.c', 'arm_gicv2m.c', - 'arm_gicv3.c', 'arm_gicv3_common.c', - 'arm_gicv3_dist.c', 'arm_gicv3_its_common.c', - 'arm_gicv3_redist.c', +)) +softmmu_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files( + 'arm_gicv3.c', + 'arm_gicv3_dist.c', 'arm_gicv3_its.c', + 'arm_gicv3_redist.c', )) softmmu_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_pic.c')) softmmu_ss.add(when: 'CONFIG_HEATHROW_PIC', if_true: files('heathrow_pic.c')) @@ -26,7 +28,7 @@ specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c')) specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c')) specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c')) -specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c')) +specific_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files('arm_gicv3_cpuif.c')) specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c')) specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c')) specific_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m_nvic.c')) -- 2.31.1
[PATCH-for-6.2? 0/2] hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector
The GICv3 ITS support has been introduced uring the 6.2 development window (commits 18f6290a6a9..17fb5e36aab). This device is for emulation. When building virtualization-only binary, it might be desirable to not include this device. Introduce the CONFIG_ARM_GIC_TCG Kconfig selector to allow downstream distributions to deselect this device. Based-on: pull-target-arm-2025-1 Philippe Mathieu-Daudé (2): hw/intc/arm_gicv3: Extract gicv3_set_gicv3state from arm_gicv3_cpuif.c hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector hw/intc/arm_gicv3.c | 2 +- hw/intc/arm_gicv3_cpuif.c| 10 +- hw/intc/arm_gicv3_cpuif_common.c | 22 ++ hw/intc/Kconfig | 5 + hw/intc/meson.build | 11 +++ 5 files changed, 36 insertions(+), 14 deletions(-) create mode 100644 hw/intc/arm_gicv3_cpuif_common.c -- 2.31.1
[PATCH-for-6.2? 1/2] hw/intc/arm_gicv3: Extract gicv3_set_gicv3state from arm_gicv3_cpuif.c
gicv3_set_gicv3state() is used by arm_gicv3_common.c in arm_gicv3_common_realize(). Since we want to restrict arm_gicv3_cpuif.c to TCG, extract gicv3_set_gicv3state() to a new file. Add this file to the meson 'specific' source set, since it needs access to "cpu.h". Signed-off-by: Philippe Mathieu-Daudé --- hw/intc/arm_gicv3_cpuif.c| 10 +- hw/intc/arm_gicv3_cpuif_common.c | 22 ++ hw/intc/meson.build | 1 + 3 files changed, 24 insertions(+), 9 deletions(-) create mode 100644 hw/intc/arm_gicv3_cpuif_common.c diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c index 3fe5de8ad7d..5e539e3b5ef 100644 --- a/hw/intc/arm_gicv3_cpuif.c +++ b/hw/intc/arm_gicv3_cpuif.c @@ -1,5 +1,5 @@ /* - * ARM Generic Interrupt Controller v3 + * ARM Generic Interrupt Controller v3 (emulation) * * Copyright (c) 2016 Linaro Limited * Written by Peter Maydell @@ -21,14 +21,6 @@ #include "hw/irq.h" #include "cpu.h" -void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s) -{ -ARMCPU *arm_cpu = ARM_CPU(cpu); -CPUARMState *env = _cpu->env; - -env->gicv3state = (void *)s; -}; - static GICv3CPUState *icc_cs_from_env(CPUARMState *env) { return env->gicv3state; diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c new file mode 100644 index 000..ff1239f65db --- /dev/null +++ b/hw/intc/arm_gicv3_cpuif_common.c @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * ARM Generic Interrupt Controller v3 + * + * Copyright (c) 2016 Linaro Limited + * Written by Peter Maydell + * + * This code is licensed under the GPL, version 2 or (at your option) + * any later version. + */ + +#include "qemu/osdep.h" +#include "gicv3_internal.h" +#include "cpu.h" + +void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s) +{ +ARMCPU *arm_cpu = ARM_CPU(cpu); +CPUARMState *env = _cpu->env; + +env->gicv3state = (void *)s; +}; diff --git a/hw/intc/meson.build b/hw/intc/meson.build index c89d2ca180e..11352806db2 100644 --- a/hw/intc/meson.build +++ b/hw/intc/meson.build @@ -25,6 +25,7 @@ specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c')) specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c')) +specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c')) specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c')) specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c')) specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c')) -- 2.31.1
Re: "make check" fails in a clang sanitizer build on "nbd-qemu-allocation" iotest
On Mon, Nov 15, 2021 at 05:11:54PM +, Peter Maydell wrote: > Hi; running a 'make check' on a clang sanitizer build one of > the iotests falls over due to a NULL pointer being passed to > memset(): > > > TEST iotest-qcow2: nbd-qemu-allocation [fail] > +../../nbd/server.c:1027:16: runtime error: null pointer passed as > argument 1, which is declared to never be null The code in question: if (client->opt == NBD_OPT_LIST_META_CONTEXT && !nb_queries) { /* enable all known contexts */ meta->base_allocation = true; meta->allocation_depth = meta->exp->allocation_depth; memset(meta->bitmaps, 1, meta->exp->nr_export_bitmaps); I suspect what is happening is that meta->bitmaps is NULL when meta->exp->nr_export_bitmaps is 0. It's annoying that clang's sanitizer whines even for a 0-length memset, but a strict reading of POSIX says that we really are in the technically undefined behavior when passing NULL (even with 0 length), so such whiny behavior is permitted. So I'll post a patch. > > Does this look familiar ? First I've heard of it; thanks for alerting me. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH RFC 2/2] hw: Replace drive_get_next() by drive_get()
On 11/15/21 16:57, Markus Armbruster wrote: > Philippe Mathieu-Daudé writes: >> On 11/15/21 13:55, Markus Armbruster wrote: >>> drive_get_next() is basically a bad idea. It returns the "next" block >>> backend of a certain interface type. "Next" means bus=0,unit=N, where >>> subsequent calls count N up from zero, per interface type. >>> >>> This lets you define unit numbers implicitly by execution order. If the >>> order changes, or new calls appear "in the middle", unit numbers change. >>> ABI break. Hard to spot in review. >>> >>> Explicit is better than implicit: use drive_get() directly. >>> >>> Signed-off-by: Markus Armbruster >>> --- >>> @@ -435,11 +438,13 @@ static void aspeed_machine_init(MachineState *machine) >>> } >>> >>> for (i = 0; i < bmc->soc.sdhci.num_slots; i++) { >>> -sdhci_attach_drive(>soc.sdhci.slots[i], >>> drive_get_next(IF_SD)); >>> +sdhci_attach_drive(>soc.sdhci.slots[i], >>> + drive_get(IF_SD, 0, i)); >> >> If we put SD on bus #0, ... >> >>> } >>> >>> if (bmc->soc.emmc.num_slots) { >>> -sdhci_attach_drive(>soc.emmc.slots[0], drive_get_next(IF_SD)); >>> +sdhci_attach_drive(>soc.emmc.slots[0], >>> + drive_get(IF_SD, 0, bmc->soc.sdhci.num_slots)); >> >> ... we'd want to put eMMC on bus #1 > > Using separate buses for different kinds of devices would be neater, but > it also would be an incompatible change. This patch keeps existing > bus/unit numbers working. drive_get_next() can only use bus 0. > >> but I see having eMMC cards on a >> IF_SD bus as a bug, since these cards are soldered on the board. > > IF_SD is not a bus, it's an "block interface type", which is really just > a user interface thing. Why are we discriminating by "block interface type" then? What is the difference between "block interfaces"? I see a block drive as a generic unit, usable on multiple hardware devices. I never really understood how this "block interface type" helps developers and users. I thought BlockInterfaceType and DriveInfo were legacy / deprecated APIs we want to get rid of; and we would come up with a replacement API using BlockDeviceInfo or providing a BlockFrontend state of the art object. Anyway, I suppose the explanation is buried in the git history before the last 8 years. I need to keep reading.
Re: [PULL 00/13] Block layer patches
Hi Kevin, On 11/15/21 15:53, Kevin Wolf wrote: > The following changes since commit 42f6c9179be4401974dd3a75ee72defd16b5092d: > > Merge tag 'pull-ppc-2022' of https://github.com/legoater/qemu into > staging (2021-11-12 12:28:25 +0100) > > are available in the Git repository at: > > git://repo.or.cz/qemu/kevin.git tags/for-upstream > > for you to fetch changes up to 7461272c5f6032436ef9032c091c0118539483e4: > > softmmu/qdev-monitor: fix use-after-free in qdev_set_id() (2021-11-15 > 15:49:46 +0100) > > > Block layer patches > > - Fixes to image streaming job and block layer reconfiguration to make > iotest 030 pass again > - docs: Deprecate incorrectly typed device_add arguments > - file-posix: Fix alignment after reopen changing O_DIRECT > > > Hanna Reitz (10): > stream: Traverse graph after modification > block: Manipulate children list in .attach/.detach > block: Unite remove_empty_child and child_free > block: Drop detached child from ignore list > block: Pass BdrvChild ** to replace_child_noperm > block: Restructure remove_file_or_backing_child() > transactions: Invoke clean() after everything else > block: Let replace_child_tran keep indirect pointer > block: Let replace_child_noperm free children > iotests/030: Unthrottle parallel jobs in reverse > > Kevin Wolf (2): > docs: Deprecate incorrectly typed device_add arguments > file-posix: Fix alignment after reopen changing O_DIRECT > > Stefan Hajnoczi (1): > softmmu/qdev-monitor: fix use-after-free in qdev_set_id() > > docs/about/deprecated.rst | 14 +++ > include/qemu/transactions.h | 3 + > block.c | 233 > +--- > block/file-posix.c | 20 +++- > block/stream.c | 7 +- > softmmu/qdev-monitor.c | 2 +- > util/transactions.c | 8 +- > tests/qemu-iotests/030 | 11 ++- > tests/qemu-iotests/142 | 22 + > tests/qemu-iotests/142.out | 15 +++ > 10 files changed, 269 insertions(+), 66 deletions(-) Looking at current /staging I noticed iotest#142 failed, build-tcg-disabled job: +++ 142.out.bad @@ -750,6 +750,7 @@ --- Alignment after changing O_DIRECT --- +qemu-io: Cannot get 'write' permission without 'resize': Image size is not a multiple of request alignment read 42/42 bytes at offset 42 42 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) read 42/42 bytes at offset 42 https://gitlab.com/qemu-project/qemu/-/jobs/1784955950#L2794
Re: [PULL 00/13] Block layer patches
On 11/15/21 3:53 PM, Kevin Wolf wrote: The following changes since commit 42f6c9179be4401974dd3a75ee72defd16b5092d: Merge tag 'pull-ppc-2022' of https://github.com/legoater/qemu into staging (2021-11-12 12:28:25 +0100) are available in the Git repository at: git://repo.or.cz/qemu/kevin.git tags/for-upstream for you to fetch changes up to 7461272c5f6032436ef9032c091c0118539483e4: softmmu/qdev-monitor: fix use-after-free in qdev_set_id() (2021-11-15 15:49:46 +0100) Block layer patches - Fixes to image streaming job and block layer reconfiguration to make iotest 030 pass again - docs: Deprecate incorrectly typed device_add arguments - file-posix: Fix alignment after reopen changing O_DIRECT Hanna Reitz (10): stream: Traverse graph after modification block: Manipulate children list in .attach/.detach block: Unite remove_empty_child and child_free block: Drop detached child from ignore list block: Pass BdrvChild ** to replace_child_noperm block: Restructure remove_file_or_backing_child() transactions: Invoke clean() after everything else block: Let replace_child_tran keep indirect pointer block: Let replace_child_noperm free children iotests/030: Unthrottle parallel jobs in reverse Kevin Wolf (2): docs: Deprecate incorrectly typed device_add arguments file-posix: Fix alignment after reopen changing O_DIRECT Stefan Hajnoczi (1): softmmu/qdev-monitor: fix use-after-free in qdev_set_id() docs/about/deprecated.rst | 14 +++ include/qemu/transactions.h | 3 + block.c | 233 +--- block/file-posix.c | 20 +++- block/stream.c | 7 +- softmmu/qdev-monitor.c | 2 +- util/transactions.c | 8 +- tests/qemu-iotests/030 | 11 ++- tests/qemu-iotests/142 | 22 + tests/qemu-iotests/142.out | 15 +++ 10 files changed, 269 insertions(+), 66 deletions(-) This is failing iotest 142 for build-tcg-disabled. I did retry, in case it was transitory. https://gitlab.com/qemu-project/qemu/-/jobs/1784955950 r~
[PATCH for 6.2 v4] nbd/server: Add --selinux-label option
From: "Richard W.M. Jones" Under SELinux, Unix domain sockets have two labels. One is on the disk and can be set with commands such as chcon(1). There is a different label stored in memory (called the process label). This can only be set by the process creating the socket. When using SELinux + SVirt and wanting qemu to be able to connect to a qemu-nbd instance, you must set both labels correctly first. For qemu-nbd the options to set the second label are awkward. You can create the socket in a wrapper program and then exec into qemu-nbd. Or you could try something with LD_PRELOAD. This commit adds the ability to set the label straightforwardly on the command line, via the new --selinux-label flag. (The name of the flag is the same as the equivalent nbdkit option.) A worked example showing how to use the new option can be found in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Signed-off-by: Richard W.M. Jones Reviewed-by: Daniel P. Berrangé Message-Id: <20210930084701.3899578-1-rjo...@redhat.com> [eblake: rebase to configure changes, reject --selinux-label if it is not compiled in or not used on a Unix socket] Signed-off-by: Eric Blake --- v3 was here: https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg07677.html since then: another rework of the configure logic (now that meson-buildoptions.sh exists), and a rework of the error reporting (for now, loudly complain on all unsupported attempts at labeling. We may later allow things that currently fail) Candidate for 6.2 in spite of soft freeze because of earlier attempted pull request here: https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg07081.html and my apologies for letting this slip a month without action. meson.build | 10 +++- qemu-nbd.c| 46 +++ meson_options.txt | 3 ++ scripts/meson-buildoptions.sh | 3 ++ tests/docker/dockerfiles/centos8.docker | 1 + .../dockerfiles/fedora-i386-cross.docker | 1 + tests/docker/dockerfiles/fedora.docker| 1 + tests/docker/dockerfiles/opensuse-leap.docker | 1 + tests/docker/dockerfiles/ubuntu1804.docker| 1 + tests/docker/dockerfiles/ubuntu2004.docker| 1 + 10 files changed, 67 insertions(+), 1 deletion(-) diff --git a/meson.build b/meson.build index 2ece4fe0889a..73b7deaa0ef3 100644 --- a/meson.build +++ b/meson.build @@ -1201,6 +1201,11 @@ keyutils = dependency('libkeyutils', required: false, has_gettid = cc.has_function('gettid') +# libselinux +selinux = dependency('libselinux', + required: get_option('selinux'), + method: 'pkg-config', kwargs: static_kwargs) + # Malloc tests malloc = [] @@ -1479,6 +1484,7 @@ config_host_data.set('CONFIG_SPICE_PROTOCOL', spice_protocol.found()) config_host_data.set('CONFIG_SPICE', spice.found()) config_host_data.set('CONFIG_X11', x11.found()) config_host_data.set('CONFIG_CFI', get_option('cfi')) +config_host_data.set('CONFIG_SELINUX', selinux.found()) config_host_data.set('QEMU_VERSION', '"@0@"'.format(meson.project_version())) config_host_data.set('QEMU_VERSION_MAJOR', meson.project_version().split('.')[0]) config_host_data.set('QEMU_VERSION_MINOR', meson.project_version().split('.')[1]) @@ -3054,7 +3060,8 @@ if have_tools qemu_io = executable('qemu-io', files('qemu-io.c'), dependencies: [block, qemuutil], install: true) qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'), - dependencies: [blockdev, qemuutil, gnutls], install: true) + dependencies: [blockdev, qemuutil, gnutls, selinux], + install: true) subdir('storage-daemon') subdir('contrib/rdmacm-mux') @@ -3430,6 +3437,7 @@ summary_info += {'libdaxctl support': libdaxctl} summary_info += {'libudev': libudev} # Dummy dependency, keep .found() summary_info += {'FUSE lseek':fuse_lseek.found()} +summary_info += {'selinux': selinux.found()} summary(summary_info, bool_yn: true, section: 'Dependencies') if not supported_cpus.contains(cpu) diff --git a/qemu-nbd.c b/qemu-nbd.c index 9d895ba24b1e..c6c20df68a4d 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -47,6 +47,10 @@ #include "trace/control.h" #include "qemu-version.h" +#ifdef CONFIG_SELINUX +#include +#endif + #ifdef __linux__ #define HAVE_NBD_DEVICE 1 #else @@ -64,6 +68,7 @@ #define QEMU_NBD_OPT_FORK 263 #define QEMU_NBD_OPT_TLSAUTHZ 264 #define QEMU_NBD_OPT_PID_FILE 265 +#define QEMU_NBD_OPT_SELINUX_LABEL 266 #define MBR_SIZE 512 @@ -116,6 +121,9 @@ static void usage(const char *name) " --forkfork off the server process and exit the parent\n" "once the server is running\n" " --pid-file=PATH store the server's process ID in the given file\n" +#ifdef
[PULL 4/4] hw/rtc/pl031: Send RTC_CHANGE QMP event
From: Eric Auger The PL031 currently is not able to report guest RTC change to the QMP monitor as opposed to mc146818 or spapr RTCs. This patch adds the call to qapi_event_send_rtc_change() when the Load Register is written. The value which is reported corresponds to the difference between the guest reference time and the reference time kept in softmmu/rtc.c. For instance adding 20s to the guest RTC value will report 20. Adding an extra 20s to the guest RTC value will report 20 + 20 = 40. The inclusion of qapi/qapi-types-misc-target.h in hw/rtl/pl031.c require to compile the PL031 with specific_ss.add() to avoid ./qapi/qapi-types-misc-target.h:18:13: error: attempt to use poisoned "TARGET_". Signed-off-by: Eric Auger Reviewed-by: Peter Maydell Message-id: 20210920122535.269988-1-eric.au...@redhat.com Signed-off-by: Peter Maydell --- hw/rtc/pl031.c | 10 +- hw/rtc/meson.build | 2 +- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c index 2bbb2062ac8..e7ced90b025 100644 --- a/hw/rtc/pl031.c +++ b/hw/rtc/pl031.c @@ -24,6 +24,7 @@ #include "qemu/log.h" #include "qemu/module.h" #include "trace.h" +#include "qapi/qapi-events-misc-target.h" #define RTC_DR 0x00/* Data read register */ #define RTC_MR 0x04/* Match register */ @@ -136,10 +137,17 @@ static void pl031_write(void * opaque, hwaddr offset, trace_pl031_write(offset, value); switch (offset) { -case RTC_LR: +case RTC_LR: { +struct tm tm; + s->tick_offset += value - pl031_get_count(s); + +qemu_get_timedate(, s->tick_offset); +qapi_event_send_rtc_change(qemu_timedate_diff()); + pl031_set_alarm(s); break; +} case RTC_MR: s->mr = value; pl031_set_alarm(s); diff --git a/hw/rtc/meson.build b/hw/rtc/meson.build index 7cecdee5ddb..8fd8d8f9a71 100644 --- a/hw/rtc/meson.build +++ b/hw/rtc/meson.build @@ -2,7 +2,7 @@ softmmu_ss.add(when: 'CONFIG_DS1338', if_true: files('ds1338.c')) softmmu_ss.add(when: 'CONFIG_M41T80', if_true: files('m41t80.c')) softmmu_ss.add(when: 'CONFIG_M48T59', if_true: files('m48t59.c')) -softmmu_ss.add(when: 'CONFIG_PL031', if_true: files('pl031.c')) +specific_ss.add(when: 'CONFIG_PL031', if_true: files('pl031.c')) softmmu_ss.add(when: 'CONFIG_TWL92230', if_true: files('twl92230.c')) softmmu_ss.add(when: ['CONFIG_ISA_BUS', 'CONFIG_M48T59'], if_true: files('m48t59-isa.c')) softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP', if_true: files('xlnx-zynqmp-rtc.c')) -- 2.25.1
[PULL 0/4] target-arm queue
Hi; some minor changes for 6.2, which I think can be classified as bug fixes and are OK for this point in the release cycle. (Wouldn't be the end of the world if they slipped to 7.0.) -- PMM The following changes since commit 42f6c9179be4401974dd3a75ee72defd16b5092d: Merge tag 'pull-ppc-2022' of https://github.com/legoater/qemu into staging (2021-11-12 12:28:25 +0100) are available in the Git repository at: https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-2025-1 for you to fetch changes up to 1adf528ec3bdf62ea3b580b7ad562534a3676ff5: hw/rtc/pl031: Send RTC_CHANGE QMP event (2021-11-15 18:53:00 +) target-arm queue: * Support multiple redistributor regions for TCG GICv3 * Send RTC_CHANGE QMP event from pl031 Eric Auger (1): hw/rtc/pl031: Send RTC_CHANGE QMP event Peter Maydell (3): hw/intc/arm_gicv3: Move checking of redist-region-count to arm_gicv3_common_realize hw/intc/arm_gicv3: Set GICR_TYPER.Last correctly when nb_redist_regions > 1 hw/intc/arm_gicv3: Support multiple redistributor regions include/hw/intc/arm_gicv3_common.h | 14 -- hw/intc/arm_gicv3.c| 12 +--- hw/intc/arm_gicv3_common.c | 56 -- hw/intc/arm_gicv3_kvm.c| 10 ++- hw/intc/arm_gicv3_redist.c | 40 +++ hw/rtc/pl031.c | 10 ++- hw/rtc/meson.build | 2 +- 7 files changed, 83 insertions(+), 61 deletions(-)
[PULL 2/4] hw/intc/arm_gicv3: Set GICR_TYPER.Last correctly when nb_redist_regions > 1
The 'Last' bit in the GICR_TYPER GICv3 redistributor register is supposed to be set to 1 if this is the last redistributor in a series of contiguous redistributor pages. Currently we set Last only for the redistributor for CPU (num_cpu - 1). This only works if there is a single redistributor region; if there are multiple redistributor regions then we need to set the Last bit for the last redistributor in each region. This doesn't cause any problems currently because only the KVM GICv3 supports multiple redistributor regions, and it ignores the value in GICv3State::gicr_typer. But we need to fix this before we can enable support for multiple regions in the emulated GICv3. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- hw/intc/arm_gicv3_common.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c index 8e47809398b..8de9205b386 100644 --- a/hw/intc/arm_gicv3_common.c +++ b/hw/intc/arm_gicv3_common.c @@ -297,7 +297,7 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) { GICv3State *s = ARM_GICV3_COMMON(dev); -int i, rdist_capacity; +int i, rdist_capacity, cpuidx; /* revision property is actually reserved and currently used only in order * to keep the interface compatible with GICv2 code, avoiding extra @@ -355,7 +355,6 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) for (i = 0; i < s->num_cpu; i++) { CPUState *cpu = qemu_get_cpu(i); uint64_t cpu_affid; -int last; s->cpu[i].cpu = cpu; s->cpu[i].gic = s; @@ -375,7 +374,6 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) * PLPIS == 0 (physical LPIs not supported) */ cpu_affid = object_property_get_uint(OBJECT(cpu), "mp-affinity", NULL); -last = (i == s->num_cpu - 1); /* The CPU mp-affinity property is in MPIDR register format; squash * the affinity bytes into 32 bits as the GICR_TYPER has them. @@ -384,13 +382,22 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) (cpu_affid & 0xFF); s->cpu[i].gicr_typer = (cpu_affid << 32) | (1 << 24) | -(i << 8) | -(last << 4); +(i << 8); if (s->lpi_enable) { s->cpu[i].gicr_typer |= GICR_TYPER_PLPIS; } } + +/* + * Now go through and set GICR_TYPER.Last for the final + * redistributor in each region. + */ +cpuidx = 0; +for (i = 0; i < s->nb_redist_regions; i++) { +cpuidx += s->redist_region_count[i]; +s->cpu[cpuidx - 1].gicr_typer |= GICR_TYPER_LAST; +} } static void arm_gicv3_finalize(Object *obj) -- 2.25.1
[PULL 1/4] hw/intc/arm_gicv3: Move checking of redist-region-count to arm_gicv3_common_realize
The GICv3 devices have an array property redist-region-count. Currently we check this for errors (bad values) in gicv3_init_irqs_and_mmio(), just before we use it. Move this error checking to the arm_gicv3_common_realize() function, where we sanity-check all of the other base-class properties. (This will always be before gicv3_init_irqs_and_mmio() is called, because that function is called in the subclass realize methods, after they have called the parent-class realize.) The motivation for this refactor is: * we would like to use the redist_region_count[] values in arm_gicv3_common_realize() in a subsequent patch, so we need to have already done the sanity-checking first * this removes the only use of the Error** argument to gicv3_init_irqs_and_mmio(), so we can remove some error-handling boilerplate Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- include/hw/intc/arm_gicv3_common.h | 2 +- hw/intc/arm_gicv3.c| 6 +- hw/intc/arm_gicv3_common.c | 26 +- hw/intc/arm_gicv3_kvm.c| 6 +- 4 files changed, 16 insertions(+), 24 deletions(-) diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h index aa4f0d67703..cb2b0d0ad45 100644 --- a/include/hw/intc/arm_gicv3_common.h +++ b/include/hw/intc/arm_gicv3_common.h @@ -306,6 +306,6 @@ struct ARMGICv3CommonClass { }; void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, - const MemoryRegionOps *ops, Error **errp); + const MemoryRegionOps *ops); #endif diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c index 3f24707838c..bcf54a5f0a5 100644 --- a/hw/intc/arm_gicv3.c +++ b/hw/intc/arm_gicv3.c @@ -393,11 +393,7 @@ static void arm_gic_realize(DeviceState *dev, Error **errp) return; } -gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops, _err); -if (local_err) { -error_propagate(errp, local_err); -return; -} +gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops); gicv3_init_cpuif(s); } diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c index 223db16feca..8e47809398b 100644 --- a/hw/intc/arm_gicv3_common.c +++ b/hw/intc/arm_gicv3_common.c @@ -250,22 +250,11 @@ static const VMStateDescription vmstate_gicv3 = { }; void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, - const MemoryRegionOps *ops, Error **errp) + const MemoryRegionOps *ops) { SysBusDevice *sbd = SYS_BUS_DEVICE(s); -int rdist_capacity = 0; int i; -for (i = 0; i < s->nb_redist_regions; i++) { -rdist_capacity += s->redist_region_count[i]; -} -if (rdist_capacity < s->num_cpu) { -error_setg(errp, "Capacity of the redist regions(%d) " - "is less than number of vcpus(%d)", - rdist_capacity, s->num_cpu); -return; -} - /* For the GIC, also expose incoming GPIO lines for PPIs for each CPU. * GPIO array layout is thus: * [0..N-1] spi @@ -308,7 +297,7 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) { GICv3State *s = ARM_GICV3_COMMON(dev); -int i; +int i, rdist_capacity; /* revision property is actually reserved and currently used only in order * to keep the interface compatible with GICv2 code, avoiding extra @@ -350,6 +339,17 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp) return; } +rdist_capacity = 0; +for (i = 0; i < s->nb_redist_regions; i++) { +rdist_capacity += s->redist_region_count[i]; +} +if (rdist_capacity < s->num_cpu) { +error_setg(errp, "Capacity of the redist regions(%d) " + "is less than number of vcpus(%d)", + rdist_capacity, s->num_cpu); +return; +} + s->cpu = g_new0(GICv3CPUState, s->num_cpu); for (i = 0; i < s->num_cpu; i++) { diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c index 5c09f00dec2..ab58c73306d 100644 --- a/hw/intc/arm_gicv3_kvm.c +++ b/hw/intc/arm_gicv3_kvm.c @@ -787,11 +787,7 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp) return; } -gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL, _err); -if (local_err) { -error_propagate(errp, local_err); -return; -} +gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL); for (i = 0; i < s->num_cpu; i++) { ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i)); -- 2.25.1
[PULL 3/4] hw/intc/arm_gicv3: Support multiple redistributor regions
Our GICv3 QOM interface includes an array property redist-region-count which allows board models to specify that the registributor registers are not in a single contiguous range, but split into multiple pieces. We implemented this for KVM, but currently the TCG GICv3 model insists that there is only one region. You can see the limit being hit with a setup like: qemu-system-aarch64 -machine virt,gic-version=3 -smp 124 Add support for split regions to the TCG GICv3. To do this we switch from allocating a simple array of MemoryRegions to an array of GICv3RedistRegion structs so that we can use the GICv3RedistRegion as the opaque pointer in the MemoryRegion read/write callbacks. Each GICv3RedistRegion contains the MemoryRegion, a backpointer allowing the read/write callback to get hold of the GICv3State, and an index which allows us to calculate which CPU's redistributor is being accessed. Note that arm_gicv3_kvm always passes in NULL as the ops argument to gicv3_init_irqs_and_mmio(), so the only MemoryRegion read/write callbacks we need to update to handle this new scheme are the gicv3_redist_read/write functions used by the emulated GICv3. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- include/hw/intc/arm_gicv3_common.h | 12 - hw/intc/arm_gicv3.c| 6 - hw/intc/arm_gicv3_common.c | 15 --- hw/intc/arm_gicv3_kvm.c| 4 +-- hw/intc/arm_gicv3_redist.c | 40 -- 5 files changed, 46 insertions(+), 31 deletions(-) diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h index cb2b0d0ad45..fc38e4b7dca 100644 --- a/include/hw/intc/arm_gicv3_common.h +++ b/include/hw/intc/arm_gicv3_common.h @@ -215,13 +215,23 @@ struct GICv3CPUState { bool seenbetter; }; +/* + * The redistributor pages might be split into more than one region + * on some machine types if there are many CPUs. + */ +typedef struct GICv3RedistRegion { +GICv3State *gic; +MemoryRegion iomem; +uint32_t cpuidx; /* index of first CPU this region covers */ +} GICv3RedistRegion; + struct GICv3State { /*< private >*/ SysBusDevice parent_obj; /*< public >*/ MemoryRegion iomem_dist; /* Distributor */ -MemoryRegion *iomem_redist; /* Redistributor Regions */ +GICv3RedistRegion *redist_regions; /* Redistributor Regions */ uint32_t *redist_region_count; /* redistributor count within each region */ uint32_t nb_redist_regions; /* number of redist regions */ diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c index bcf54a5f0a5..c6282984b1e 100644 --- a/hw/intc/arm_gicv3.c +++ b/hw/intc/arm_gicv3.c @@ -387,12 +387,6 @@ static void arm_gic_realize(DeviceState *dev, Error **errp) return; } -if (s->nb_redist_regions != 1) { -error_setg(errp, "VGICv3 redist region number(%d) not equal to 1", - s->nb_redist_regions); -return; -} - gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops); gicv3_init_cpuif(s); diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c index 8de9205b386..9884d2e39b9 100644 --- a/hw/intc/arm_gicv3_common.c +++ b/hw/intc/arm_gicv3_common.c @@ -254,6 +254,7 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, { SysBusDevice *sbd = SYS_BUS_DEVICE(s); int i; +int cpuidx; /* For the GIC, also expose incoming GPIO lines for PPIs for each CPU. * GPIO array layout is thus: @@ -282,14 +283,20 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, "gicv3_dist", 0x1); sysbus_init_mmio(sbd, >iomem_dist); -s->iomem_redist = g_new0(MemoryRegion, s->nb_redist_regions); +s->redist_regions = g_new0(GICv3RedistRegion, s->nb_redist_regions); +cpuidx = 0; for (i = 0; i < s->nb_redist_regions; i++) { char *name = g_strdup_printf("gicv3_redist_region[%d]", i); +GICv3RedistRegion *region = >redist_regions[i]; -memory_region_init_io(>iomem_redist[i], OBJECT(s), - ops ? [1] : NULL, s, name, +region->gic = s; +region->cpuidx = cpuidx; +cpuidx += s->redist_region_count[i]; + +memory_region_init_io(>iomem, OBJECT(s), + ops ? [1] : NULL, region, name, s->redist_region_count[i] * GICV3_REDIST_SIZE); -sysbus_init_mmio(sbd, >iomem_redist[i]); +sysbus_init_mmio(sbd, >iomem); g_free(name); } } diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c index ab58c73306d..5ec5ff9ef6e 100644 --- a/hw/intc/arm_gicv3_kvm.c +++ b/hw/intc/arm_gicv3_kvm.c @@ -825,7 +825,7 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp) KVM_VGIC_V3_ADDR_TYPE_DIST, s->dev_fd, 0); if (!multiple_redist_region_allowed) { -
[PATCH] sev: allow capabilities to check for SEV-ES support
Probe for SEV-ES and SEV-SNP capabilities to distinguish between Rome, Naples, and Milan processors. Use the CPUID function to probe if a processor is capable of running SEV-ES or SEV-SNP, rather than if it actually is running SEV-ES or SEV-SNP. Signed-off-by: Tyler Fanelli --- qapi/misc-target.json | 11 +-- target/i386/sev.c | 6 -- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/qapi/misc-target.json b/qapi/misc-target.json index 5aa2b95b7d..c3e9bce12b 100644 --- a/qapi/misc-target.json +++ b/qapi/misc-target.json @@ -182,13 +182,19 @@ # @reduced-phys-bits: Number of physical Address bit reduction when SEV is # enabled # +# @es: SEV-ES capability of the machine. +# +# @snp: SEV-SNP capability of the machine. +# # Since: 2.12 ## { 'struct': 'SevCapability', 'data': { 'pdh': 'str', 'cert-chain': 'str', 'cbitpos': 'int', -'reduced-phys-bits': 'int'}, +'reduced-phys-bits': 'int', +'es': 'bool', +'snp': 'bool'}, 'if': 'TARGET_I386' } ## @@ -205,7 +211,8 @@ # # -> { "execute": "query-sev-capabilities" } # <- { "return": { "pdh": "8CCDD8DDD", "cert-chain": "888CCCDDDEE", -# "cbitpos": 47, "reduced-phys-bits": 5}} +# "cbitpos": 47, "reduced-phys-bits": 5 +# "es": false, "snp": false}} # ## { 'command': 'query-sev-capabilities', 'returns': 'SevCapability', diff --git a/target/i386/sev.c b/target/i386/sev.c index eede07f11d..6d78dcd744 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -506,7 +506,7 @@ static SevCapability *sev_get_capabilities(Error **errp) guchar *pdh_data = NULL; guchar *cert_chain_data = NULL; size_t pdh_len = 0, cert_chain_len = 0; -uint32_t ebx; +uint32_t eax, ebx; int fd; if (!kvm_enabled()) { @@ -534,8 +534,10 @@ static SevCapability *sev_get_capabilities(Error **errp) cap->pdh = g_base64_encode(pdh_data, pdh_len); cap->cert_chain = g_base64_encode(cert_chain_data, cert_chain_len); -host_cpuid(0x801F, 0, NULL, , NULL, NULL); +host_cpuid(0x801F, 0, , , NULL, NULL); cap->cbitpos = ebx & 0x3f; +cap->es = (eax & 0x8) ? true : false; +cap->snp = (eax & 0x10) ? true : false; /* * When SEV feature is enabled, we loose one bit in guest physical -- 2.31.1
Re: [PATCH 0/2] hw/nvme: fix namespace attachment on controller hotplug
On Sep 24 09:27, Klaus Jensen wrote: > From: Klaus Jensen > > First patch from Hannes fixes the subsystem registration process such > that shared (but non-detached) namespaces are automatically attached to > hotplugged controllers. > > The second patch changes the default for 'shared' such that namespaces > are shared by default and will thus by default be attached to hotplugged > controllers. This adds a compat property for older machine versions and > updates the documentation to reflect this. > > Hannes Reinecke (1): > hw/nvme: reattach subsystem namespaces on hotplug > > Klaus Jensen (1): > hw/nvme: change nvme-ns 'shared' default > > docs/system/devices/nvme.rst | 24 ++-- > hw/core/machine.c| 4 +++- > hw/nvme/ns.c | 8 +--- > hw/nvme/subsys.c | 10 +- > 4 files changed, 27 insertions(+), 19 deletions(-) > Ping :) signature.asc Description: PGP signature
Re: [PATCH v2] hw/rtc/pl031: Send RTC_CHANGE QMP event
On Mon, 15 Nov 2021 at 12:49, Paolo Bonzini wrote: > > On 11/1/21 17:04, Peter Maydell wrote: > > On Thu, 23 Sept 2021 at 14:29, Peter Maydell > > wrote: > >> > >> On Mon, 20 Sept 2021 at 13:25, Eric Auger wrote: > >>> > >>> The PL031 currently is not able to report guest RTC change to the QMP > >>> monitor as opposed to mc146818 or spapr RTCs. This patch adds the call > >>> to qapi_event_send_rtc_change() when the Load Register is written. The > >>> value which is reported corresponds to the difference between the guest > >>> reference time and the reference time kept in softmmu/rtc.c. > >>> > >>> For instance adding 20s to the guest RTC value will report 20. Adding > >>> an extra 20s to the guest RTC value will report 20 + 20 = 40. > >>> > >>> The inclusion of qapi/qapi-types-misc-target.h in hw/rtl/pl031.c > >>> require to compile the PL031 with specific_ss.add() to avoid > >>> ./qapi/qapi-types-misc-target.h:18:13: error: attempt to use poisoned > >>> "TARGET_". > >>> > >>> Signed-off-by: Eric Auger > >>> Signed-off-by: Peter Maydell > >> > >> Thanks. This looks plausible to me (well, it would ;-)) but > >> I would appreciate review from Paolo or somebody else who > >> understands the rtc_change feature and handling. > > > > Ping? Review from somebody who understands rtc_change would > > still be nice... > > The change looks good to me (sorry I missed this v2). x86 also has some > logic in the migration post-load, that might end up sending the event. > However, that's best done separately after understanding and documenting > exactly what x86 is doing. Thanks; I've applied this to target-arm.next for 6.2. -- PMM
Re: [PULL 15/26] configure, meson: move pthread_setname_np checks to Meson
On Mon, 15 Nov 2021 at 17:16, Daniel P. Berrangé wrote: > > On Thu, Oct 14, 2021 at 06:29:27PM +0200, Paolo Bonzini wrote: > > This makes the pthreads check dead in configure, so remove it > > as well. > > This change appears broken > > On v6.1.0 > > $ grep SETNAME_NP build/config-host.h > #define CONFIG_PTHREAD_SETNAME_NP_W_TID 1 > > While on git master > > $ grep SETNAME_NP build/config-host.h > #define CONFIG_PTHREAD_SETNAME_NP_WO_TID > #define CONFIG_PTHREAD_SETNAME_NP_W_TID > > > it shoudn't be possible to have both these configs > satisfied as they're 2 completely different impls. > > In the meson-log.txt we can see both tests passing, but with > warnings > > > /home/berrange/src/virt/qemu/build/meson-private/tmpt191k3q1/testfile.c:9:5: > warning: implicit declaration of function 'pthread_setname_np'; did you mean > 'pthread_setcanceltype'? [-Wimplicit-function-declaration] > > So it isn't actually validating the function parameter > signature, and then link is succeeding because the linker > doesn't care about the incompatible function signatures. > > Peter meanwhile reports that he gets no pthread_Setname_np > at all, and his meson-build.log shows -Werror set. This > causes the missing fnuction signature to cause tests to > fail in both cases. More specifically, these tests get run with "-Werror=implicit-function-declaration" (and this is not because I've passed any kind of -Werror option in --extra-cflags; I have passed --enable-werror but that is supposed to be the default anyway...) -- PMM
Re: [PULL 15/26] configure, meson: move pthread_setname_np checks to Meson
On Thu, Oct 14, 2021 at 06:29:27PM +0200, Paolo Bonzini wrote: > This makes the pthreads check dead in configure, so remove it > as well. This change appears broken On v6.1.0 $ grep SETNAME_NP build/config-host.h #define CONFIG_PTHREAD_SETNAME_NP_W_TID 1 While on git master $ grep SETNAME_NP build/config-host.h #define CONFIG_PTHREAD_SETNAME_NP_WO_TID #define CONFIG_PTHREAD_SETNAME_NP_W_TID it shoudn't be possible to have both these configs satisfied as they're 2 completely different impls. In the meson-log.txt we can see both tests passing, but with warnings /home/berrange/src/virt/qemu/build/meson-private/tmpt191k3q1/testfile.c:9:5: warning: implicit declaration of function 'pthread_setname_np'; did you mean 'pthread_setcanceltype'? [-Wimplicit-function-declaration] So it isn't actually validating the function parameter signature, and then link is succeeding because the linker doesn't care about the incompatible function signatures. Peter meanwhile reports that he gets no pthread_Setname_np at all, and his meson-build.log shows -Werror set. This causes the missing fnuction signature to cause tests to fail in both cases. The original config.log shows _GNU_SOURCE being defined for these tests, which would get the function signture defined. > > Reviewed-by: Marc-André Lureau > Message-Id: <20211007130829.632254-9-pbonz...@redhat.com> > Reviewed-by: Thomas Huth > Signed-off-by: Paolo Bonzini > --- > configure| 78 > meson.build | 23 > util/qemu-thread-posix.c | 5 ++- > 3 files changed, 25 insertions(+), 81 deletions(-) > > diff --git a/configure b/configure > index e78f58978f..c7e95e59cc 100755 > --- a/configure > +++ b/configure > @@ -3148,71 +3148,6 @@ if test "$modules" = yes; then > fi > fi > > -## > -# pthread probe > -PTHREADLIBS_LIST="-pthread -lpthread -lpthreadGC2" > - > -pthread=no > -cat > $TMPC << EOF > -#include > -static void *f(void *p) { return NULL; } > -int main(void) { > - pthread_t thread; > - pthread_create(, 0, f, 0); > - return 0; > -} > -EOF > -if compile_prog "" "" ; then > - pthread=yes > -else > - for pthread_lib in $PTHREADLIBS_LIST; do > -if compile_prog "" "$pthread_lib" ; then > - pthread=yes > - break > -fi > - done > -fi > - > -if test "$mingw32" != yes && test "$pthread" = no; then > - error_exit "pthread check failed" \ > - "Make sure to have the pthread libs and headers installed." > -fi > - > -# check for pthread_setname_np with thread id > -pthread_setname_np_w_tid=no > -cat > $TMPC << EOF > -#include > - > -static void *f(void *p) { return NULL; } > -int main(void) > -{ > -pthread_t thread; > -pthread_create(, 0, f, 0); > -pthread_setname_np(thread, "QEMU"); > -return 0; > -} > -EOF > -if compile_prog "" "$pthread_lib" ; then > - pthread_setname_np_w_tid=yes > -fi > - > -# check for pthread_setname_np without thread id > -pthread_setname_np_wo_tid=no > -cat > $TMPC << EOF > -#include > - > -static void *f(void *p) { pthread_setname_np("QEMU"); return NULL; } > -int main(void) > -{ > -pthread_t thread; > -pthread_create(, 0, f, 0); > -return 0; > -} > -EOF > -if compile_prog "" "$pthread_lib" ; then > - pthread_setname_np_wo_tid=yes > -fi > - > ## > # libssh probe > if test "$libssh" != "no" ; then > @@ -4498,19 +4433,6 @@ if test "$debug_mutex" = "yes" ; then >echo "CONFIG_DEBUG_MUTEX=y" >> $config_host_mak > fi > > -# Hold two types of flag: > -# CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on > -# a thread we have a handle to > -# CONFIG_PTHREAD_SETNAME_NP_W_TID - A way of doing it on a particular > -# platform > -if test "$pthread_setname_np_w_tid" = "yes" ; then > - echo "CONFIG_THREAD_SETNAME_BYTHREAD=y" >> $config_host_mak > - echo "CONFIG_PTHREAD_SETNAME_NP_W_TID=y" >> $config_host_mak > -elif test "$pthread_setname_np_wo_tid" = "yes" ; then > - echo "CONFIG_THREAD_SETNAME_BYTHREAD=y" >> $config_host_mak > - echo "CONFIG_PTHREAD_SETNAME_NP_WO_TID=y" >> $config_host_mak > -fi > - > if test "$bochs" = "yes" ; then >echo "CONFIG_BOCHS=y" >> $config_host_mak > fi > diff --git a/meson.build b/meson.build > index e8e728bf72..26fc4e5792 100644 > --- a/meson.build > +++ b/meson.build > @@ -1584,6 +1584,29 @@ config_host_data.set('CONFIG_POSIX_MADVISE', > cc.links(gnu_source_prefix + ''' >#include >#include >int main(void) { return posix_madvise(NULL, 0, POSIX_MADV_DONTNEED); }''')) > + > +config_host_data.set('CONFIG_PTHREAD_SETNAME_NP_W_TID', cc.links(''' > + #include > + > + static void *f(void *p) { return NULL; } > + int main(void) > + { > +pthread_t thread; > +pthread_create(, 0, f, 0); > +pthread_setname_np(thread, "QEMU"); > +return 0; > + }''',
"make check" fails in a clang sanitizer build on "nbd-qemu-allocation" iotest
Hi; running a 'make check' on a clang sanitizer build one of the iotests falls over due to a NULL pointer being passed to memset(): TEST iotest-qcow2: nbd-qemu-allocation [fail] QEMU -- "/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/tests/qemu-iotests/../../qemu-system-aarch64" -nodefaults -display none -accel qtest -machine virt QEMU_IMG -- "/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/tests/qemu-iotests/../../qemu-img" QEMU_IO -- "/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/tests/qemu-iotests/../../qemu-io" --cache writeback --aio threads -f qcow2 QEMU_NBD -- "/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/tests/qemu-iotests/../../qemu-nbd" IMGFMT-- qcow2 IMGPROTO -- file PLATFORM -- Linux/x86_64 e104462 5.4.0-89-generic TEST_DIR -- /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/tests/qemu-iotests/scratch SOCK_DIR -- /tmp/tmp13ihi_hj GDB_OPTIONS -- VALGRIND_QEMU -- PRINT_QEMU_OUTPUT -- --- /home/petmay01/linaro/qemu-from-laptop/qemu/tests/qemu-iotests/tests/nbd-qemu-allocation.out +++ nbd-qemu-allocation.out.bad @@ -14,6 +14,8 @@ [{ "start": 0, "length": 1048576, "depth": 1, "present": true, "zero": false, "data": true, "offset": 327680}, { "start": 1048576, "length": 2097152, "depth": 0, "present": true, "zero": false, "data": true, "offset": 327680}, { "start": 3145728, "length": 1048576, "depth": 1, "present": false, "zero": true, "data": false}] +../../nbd/server.c:1027:16: runtime error: null pointer passed as argument 1, which is declared to never be null +/usr/include/string.h:61:62: note: nonnull attribute specified here exports available: 1 export: '' size: 4194304 TEST iotest-qcow2: qsd-jobs Not run: 172 186 192 220 287 Failures: nbd-qemu-allocation Failed 1 of 118 iotests Does this look familiar ? -- PMM
Re: [RFC PATCH v2 1/5] virtio: introduce virtio_force_modern()
On Mon, Nov 15 2021, Halil Pasic wrote: > On Fri, 12 Nov 2021 16:37:20 +0100 > Cornelia Huck wrote: > >> On Fri, Nov 12 2021, Halil Pasic wrote: >> >> > Legacy vs modern should be detected via transport specific means. We >> > can't wait till feature negotiation is done. Let us introduce >> > virtio_force_modern() as a means for the transport code to signal >> > that the device should operate in modern mode (because a modern driver >> > was detected). >> > >> > A new callback is added for the situations where the device needs >> > to do more than just setting the VIRTIO_F_VERSION_1 feature bit. For >> > example, when vhost is involved, we may need to propagate the features >> > to the vhost device. >> > >> > Signed-off-by: Halil Pasic >> > --- >> > >> > I'm still struggling with how to deal with vhost-user and co. The >> > problem is that I'm not very familiar with the life-cycle of, let us >> > say, a vhost_user device. >> > >> > Looks to me like the vhost part might be just an implementation detail, >> > and could even become a hot swappable thing. >> > >> > Another thing is, that vhost processes set_features differently. It >> > might or might not be a good idea to change this. >> > >> > Does anybody know why don't we propagate the features on features_set, >> > but under a set of different conditions, one of which is the vhost >> > device is started? >> > --- >> > hw/virtio/virtio.c | 13 + >> > include/hw/virtio/virtio.h | 2 ++ >> > 2 files changed, 15 insertions(+) >> > >> >> Did you see my feedback in >> https://lore.kernel.org/qemu-devel/87tugzc26y@redhat.com/? I think >> at least some of it still applies. >> > > Sure. My idea was to send out a v2 first which helps us think about the > bigger picture, and then answer that mail. Now I realize I should have > sent the response first, and then the v2 immediately afterwards. > >> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c >> > index 3a1f6c520c..26db1b31e6 100644 >> > --- a/hw/virtio/virtio.c >> > +++ b/hw/virtio/virtio.c >> > @@ -3281,6 +3281,19 @@ void virtio_init(VirtIODevice *vdev, const char >> > *name, >> > vdev->use_guest_notifier_mask = true; >> > } >> > >> > +void virtio_force_modern(VirtIODevice *vdev) >> >> I'd still prefer to call this virtio_indicate_modern: we don't really >> force anything; the driver has simply already decided that it will use >> the modern interface and we provide an early indication in the features >> so that code looking at them makes the right decisions. > > I tried to explain my dislike for virtio_indicate_modern in my response > to that email. In somewhat different words: IMHO indication is about an > external observer and has a symbolic dimension to it. Please see > https://www.oxfordlearnersdictionaries.com/definition/english/indicate > This function is about changing the behavior of the device. Its > post-condition is: the device acts compliant to virtio 1.0 or higher. My personal preference is "indicate", I don't like "force". I don't want a semantics discussion; I'll leave the decision to the virtio maintainers. > >> >> > +{ >> > +VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); >> > + >> > +virtio_add_feature(>guest_features, VIRTIO_F_VERSION_1); >> > +/* Let the device do it's normal thing. */ >> > +virtio_set_features(vdev, vdev->guest_features); >> >> I don't think this is substantially different from setting VERSION_1 >> only: At least the callers you introduce call this during reset, >> i.e. when guest_features is 0 anyway. > > I agree. Just wanted to be conservative, and preserve whatever is there. > > >> We still have the whole processing >> that is done after feature setting that may have effects different from >> what the ultimate feature setting will give us. > > Yes, this is an intermediate state. As I pointed out, intermediate states > are necessary. Why? We just want VERSION_1 so that the checks work, why do we need to fiddle with other settings? We only need to propagate it to e.g. vhost. > >> While I don't think >> calling set_features twice is forbidden, that sequence is likely quite >> untested, and I'm not sure we can exclude side effects. > > I can't disagree with that. But IMHO we can just say: such problems, if > any, are bugs that need to be fixed. Well, what about first fixing the endianness bugs, before we potentially open up a can of worms? > > I think not doing the whole song-and-dance is conceptually more > problematic because it is more likely to lead to inconsistent internal > state. For example check out: vhost acked_features <-> guest_features. What is wrong with verifying with one single feature?
Re: Guests wont start with 15 pcie-root-port devices
Will this fix make it into 6.2? On 11/12/2021 3:51 PM, Igor Mammedov wrote: On Fri, 12 Nov 2021 17:53:42 + Daniel P. Berrangé wrote: On Fri, Nov 12, 2021 at 12:35:07PM -0500, Brian Rak wrote: In 6.1, a guest with 15 empty pcie-root-port devices will not boot properly - it just hangs on "Guest has not initialized the display (yet).". As soon as I remove the last pcie-root-port, the guest begins starting up normally. Yes, QEMU 6.1 has a regression https://gitlab.com/qemu-project/qemu/-/issues/641 commit e2a6290aab578b2170c1f5909fa556385dc0d820 Author: Marcel Apfelbaum Date: Mon Aug 2 12:00:57 2021 +0300 hw/pcie-root-port: Fix hotplug for PCI devices requiring IO Although I can't say I really understand why that commit triggered it. It caused the firmware to always allocate I/O space for every port and there's limited total I/O space, so it runs out at 15 devices. alternatively instead of reverting to native PCIe hotplug as in the issue Daniel's mentioned, you can apply following fix https://patchew.org/QEMU/2022110857.3116853-1-imamm...@redhat.com/ Regards, Daniel
Re: [PULL 21/26] configure, meson: move more compiler checks to Meson
On Mon, 15 Nov 2021 at 16:36, Peter Maydell wrote: > > On Thu, 14 Oct 2021 at 17:49, Paolo Bonzini wrote: > > > > Reviewed-by: Marc-André Lureau > > Message-Id: <20211007130829.632254-15-pbonz...@redhat.com> > > Signed-off-by: Paolo Bonzini > > --- > > configure| 91 > > meson.build | 44 +++ > > util/meson.build | 4 ++- > > 3 files changed, 47 insertions(+), 92 deletions(-) > > > > diff --git a/meson.build b/meson.build > > index 6bf43e6d30..6b7487b725 100644 > > --- a/meson.build > > +++ b/meson.build > > @@ -1550,6 +1550,8 @@ config_host_data.set('CONFIG_INOTIFY', > > cc.has_header_symbol('sys/inotify.h', 'inotify_init')) > > config_host_data.set('CONFIG_INOTIFY1', > > cc.has_header_symbol('sys/inotify.h', > > 'inotify_init1')) > > +config_host_data.set('CONFIG_IOVEC', > > + cc.has_header_symbol('sys/uio.h', 'struct iovec')) > > config_host_data.set('CONFIG_MACHINE_BSWAP_H', > > cc.has_header_symbol('machine/bswap.h', 'bswap32', > >prefix: '''#include > > > > Hi -- I've just noticed that this change breaks compilation for me, > because this test incorrectly fails to set CONFIG_IOVEC on a system > where the header defines 'struct iovec'. That wasn't quite right. On further investigation, the problem is that the meson tests silently misbehave if you included "-Werror" in your --extra-cflags. This results in meson building all the tests with -Werror, but meson's generated code for tests can't handle that. (configure gets this right because it has code in that specifically checks "does this test case give the same result for -Werror and not".) -- PMM
[PULL 03/20] vhost: Rename last_index to vq_index_end
From: Eugenio Pérez The doc of this field pointed out that last_index is the last vq index. This is misleading, since it's actually one past the end of the vqs. Renaming and modifying comment. Signed-off-by: Eugenio Pérez Acked-by: Jason Wang Message-Id: <20211104085625.2054959-2-epere...@redhat.com> Reviewed-by: Juan Quintela Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- include/hw/virtio/vhost.h | 4 ++-- hw/net/vhost_net.c| 4 ++-- hw/virtio/vhost-vdpa.c| 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 3fa0b554ef..58a73e7b7a 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -74,8 +74,8 @@ struct vhost_dev { unsigned int nvqs; /* the first virtqueue which would be used by this vhost dev */ int vq_index; -/* the last vq index for the virtio device (not vhost) */ -int last_index; +/* one past the last vq index for the virtio device (not vhost) */ +int vq_index_end; /* if non-zero, minimum required value for max_queues */ int num_queues; uint64_t features; diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 0d888f29a6..29f2c4212f 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -232,10 +232,10 @@ fail: } static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index, - int last_index) + int vq_index_end) { net->dev.vq_index = vq_index; -net->dev.last_index = last_index; +net->dev.vq_index_end = vq_index_end; } static int vhost_net_start_one(struct vhost_net *net, diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 0d8051426c..bcaf00e09f 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -645,7 +645,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started) vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs); } -if (dev->vq_index + dev->nvqs != dev->last_index) { +if (dev->vq_index + dev->nvqs != dev->vq_index_end) { return 0; } -- MST
Re: [PULL 21/26] configure, meson: move more compiler checks to Meson
On Mon, 15 Nov 2021 at 16:36, Peter Maydell wrote: > > On Thu, 14 Oct 2021 at 17:49, Paolo Bonzini wrote: > > > > Reviewed-by: Marc-André Lureau > > Message-Id: <20211007130829.632254-15-pbonz...@redhat.com> > > Signed-off-by: Paolo Bonzini > > --- > > configure| 91 > > meson.build | 44 +++ > > util/meson.build | 4 ++- > > 3 files changed, 47 insertions(+), 92 deletions(-) > > > > diff --git a/meson.build b/meson.build > > index 6bf43e6d30..6b7487b725 100644 > > --- a/meson.build > > +++ b/meson.build > > @@ -1550,6 +1550,8 @@ config_host_data.set('CONFIG_INOTIFY', > > cc.has_header_symbol('sys/inotify.h', 'inotify_init')) > > config_host_data.set('CONFIG_INOTIFY1', > > cc.has_header_symbol('sys/inotify.h', > > 'inotify_init1')) > > +config_host_data.set('CONFIG_IOVEC', > > + cc.has_header_symbol('sys/uio.h', 'struct iovec')) > > config_host_data.set('CONFIG_MACHINE_BSWAP_H', > > cc.has_header_symbol('machine/bswap.h', 'bswap32', > >prefix: '''#include > > > > Hi -- I've just noticed that this change breaks compilation for me, > because this test incorrectly fails to set CONFIG_IOVEC on a system > where the header defines 'struct iovec'. This seems to be because > "struct iovec" isn't a valid thing to test with has_header_symbol, > because it provokes a compiler error from clang. https://github.com/mesonbuild/meson/issues/1975 says that for gcc it's actually going to be wrong the other way (always setting CONFIG_IOVEC whether the system header has the struct or not), because "struct wombat;" is syntactically OK as a *declaration*, not a use. Maybe we can work around this by testing for the presence of something else, eg IOV_MAX or the readv or writev functions ? -- PMM
[PULL 16/20] pcie: implement slot power control for pcie root ports
From: Gerd Hoffmann With this patch hot-plugged pci devices will only be visible to the guest if the guests hotplug driver has enabled slot power. This should fix the hot-plug race which one can hit when hot-plugging a pci device at boot, while the guest is in the middle of the pci bus scan. Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-3-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/pci/pcie.c | 28 1 file changed, 28 insertions(+) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 914a9bf3d1..13d11a57c7 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -366,6 +366,29 @@ static void hotplug_event_clear(PCIDevice *dev) } } +static void pcie_set_power_device(PCIBus *bus, PCIDevice *dev, void *opaque) +{ +bool *power = opaque; + +pci_set_power(dev, *power); +} + +static void pcie_cap_update_power(PCIDevice *hotplug_dev) +{ +uint8_t *exp_cap = hotplug_dev->config + hotplug_dev->exp.exp_cap; +PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(hotplug_dev)); +uint32_t sltcap = pci_get_long(exp_cap + PCI_EXP_SLTCAP); +uint16_t sltctl = pci_get_word(exp_cap + PCI_EXP_SLTCTL); +bool power = true; + +if (sltcap & PCI_EXP_SLTCAP_PCP) { +power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; +} + +pci_for_each_device(sec_bus, pci_bus_num(sec_bus), +pcie_set_power_device, ); +} + /* * A PCI Express Hot-Plug Event has occurred, so update slot status register * and notify OS of the event if necessary. @@ -434,6 +457,7 @@ void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_DLLLA); } +pcie_cap_update_power(hotplug_pdev); return; } @@ -451,6 +475,7 @@ void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, } pcie_cap_slot_event(hotplug_pdev, PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP); +pcie_cap_update_power(hotplug_pdev); } } @@ -625,6 +650,7 @@ void pcie_cap_slot_reset(PCIDevice *dev) PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_ABP); +pcie_cap_update_power(dev); hotplug_event_update_event_status(dev); } @@ -705,6 +731,7 @@ void pcie_cap_slot_write_config(PCIDevice *dev, pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, PCI_EXP_SLTSTA_PDC); } +pcie_cap_update_power(dev); hotplug_event_notify(dev); @@ -731,6 +758,7 @@ int pcie_cap_slot_post_load(void *opaque, int version_id) { PCIDevice *dev = opaque; hotplug_event_update_event_status(dev); +pcie_cap_update_power(dev); return 0; } -- MST
[PULL 20/20] pcie: expire pending delete
From: Gerd Hoffmann Add an expire time for pending delete, once the time is over allow pressing the attention button again. This makes pcie hotplug behave more like acpi hotplug, where one can try sending an 'device_del' monitor command again in case the guest didn't respond to the first attempt. Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-7-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- include/hw/qdev-core.h | 1 + hw/pci/pcie.c | 2 ++ softmmu/qdev-monitor.c | 4 +++- 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 72622bd337..20d3066595 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -181,6 +181,7 @@ struct DeviceState { char *canonical_path; bool realized; bool pending_deleted_event; +int64_t pending_deleted_expires_ms; QDict *opts; int hotplugged; bool allow_unplug_during_migration; diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index a930ac738a..c5ed266337 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -548,6 +548,8 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, } dev->pending_deleted_event = true; +dev->pending_deleted_expires_ms = +qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */ /* In case user cancel the operation of multi-function hot-add, * remove the function that is unexposed to guest individually, diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index 588a62b88d..5925f1ae5f 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -943,7 +943,9 @@ void qmp_device_del(const char *id, Error **errp) { DeviceState *dev = find_device_state(id, errp); if (dev != NULL) { -if (dev->pending_deleted_event) { +if (dev->pending_deleted_event && +(dev->pending_deleted_expires_ms == 0 || + dev->pending_deleted_expires_ms > qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL))) { error_setg(errp, "Device %s is already in the " "process of unplug", id); return; -- MST
[PULL 12/20] virtio: use virtio accessor to access packed event
From: Jason Wang We used to access packed descriptor event and off_wrap via address_space_{write|read}_cached(). When we hit the cache, memcpy() is used which is not atomic which may lead a wrong value to be read or wrote. This patch fixes this by switching to use virito_{stw|lduw}_phys_cached() to make sure the access is atomic. Fixes: 683f7665679c1 ("virtio: event suppression support for packed ring") Cc: qemu-sta...@nongnu.org Signed-off-by: Jason Wang Message-Id: <2021063854.29060-2-jasow...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/virtio/virtio.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 939bcbfeb9..ea7c079fb0 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -247,13 +247,10 @@ static void vring_packed_event_read(VirtIODevice *vdev, hwaddr off_off = offsetof(VRingPackedDescEvent, off_wrap); hwaddr off_flags = offsetof(VRingPackedDescEvent, flags); -address_space_read_cached(cache, off_flags, >flags, - sizeof(e->flags)); +e->flags = virtio_lduw_phys_cached(vdev, cache, off_flags); /* Make sure flags is seen before off_wrap */ smp_rmb(); -address_space_read_cached(cache, off_off, >off_wrap, - sizeof(e->off_wrap)); -virtio_tswap16s(vdev, >off_wrap); +e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); virtio_tswap16s(vdev, >flags); } @@ -263,8 +260,7 @@ static void vring_packed_off_wrap_write(VirtIODevice *vdev, { hwaddr off = offsetof(VRingPackedDescEvent, off_wrap); -virtio_tswap16s(vdev, _wrap); -address_space_write_cached(cache, off, _wrap, sizeof(off_wrap)); +virtio_stw_phys_cached(vdev, cache, off, off_wrap); address_space_cache_invalidate(cache, off, sizeof(off_wrap)); } @@ -273,8 +269,7 @@ static void vring_packed_flags_write(VirtIODevice *vdev, { hwaddr off = offsetof(VRingPackedDescEvent, flags); -virtio_tswap16s(vdev, ); -address_space_write_cached(cache, off, , sizeof(flags)); +virtio_stw_phys_cached(vdev, cache, off, flags); address_space_cache_invalidate(cache, off, sizeof(flags)); } -- MST
[PULL 19/20] pcie: fast unplug when slot power is off
From: Gerd Hoffmann In case the slot is powered off (and the power indicator turned off too) we can unplug right away, without round-trip to the guest. Also clear pending attention button press, there is nothing to care about any more. Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-6-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/pci/pcie.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 959bf074b2..a930ac738a 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -560,6 +560,16 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, return; } +if (((sltctl & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PWR_IND_OFF) && +((sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_OFF)) { +/* slot is powered off -> unplug without round-trip to the guest */ +pcie_cap_slot_do_unplug(hotplug_pdev); +hotplug_event_notify(hotplug_pdev); +pci_word_test_and_clear_mask(exp_cap + PCI_EXP_SLTSTA, + PCI_EXP_SLTSTA_ABP); +return; +} + pcie_cap_slot_push_attention_button(hotplug_pdev); } -- MST
[PULL 11/20] virtio: use virtio accessor to access packed descriptor flags
From: Jason Wang We used to access packed descriptor flags via address_space_{write|read}_cached(). When we hit the cache, memcpy() is used which is not an atomic operation which may lead a wrong value is read or wrote. So this patch switches to use virito_{stw|lduw}_phys_cached() to make sure the aceess is atomic. Fixes: 86044b24e865f ("virtio: basic packed virtqueue support") Cc: qemu-sta...@nongnu.org Signed-off-by: Jason Wang Message-Id: <2021063854.29060-1-jasow...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/virtio/virtio.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index cc69a9b881..939bcbfeb9 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -507,11 +507,9 @@ static void vring_packed_desc_read_flags(VirtIODevice *vdev, MemoryRegionCache *cache, int i) { -address_space_read_cached(cache, - i * sizeof(VRingPackedDesc) + - offsetof(VRingPackedDesc, flags), - flags, sizeof(*flags)); -virtio_tswap16s(vdev, flags); +hwaddr off = i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags); + +*flags = virtio_lduw_phys_cached(vdev, cache, off); } static void vring_packed_desc_read(VirtIODevice *vdev, @@ -564,8 +562,7 @@ static void vring_packed_desc_write_flags(VirtIODevice *vdev, { hwaddr off = i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags); -virtio_tswap16s(vdev, >flags); -address_space_write_cached(cache, off, >flags, sizeof(desc->flags)); +virtio_stw_phys_cached(vdev, cache, off, desc->flags); address_space_cache_invalidate(cache, off, sizeof(desc->flags)); } -- MST
[PULL 17/20] pcie: add power indicator blink check
From: Gerd Hoffmann Refuse to push the attention button in case the guest is busy with some hotplug operation (as indicated by the power indicator blinking). Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-4-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/pci/pcie.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 13d11a57c7..b92dbff118 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -506,6 +506,7 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, PCIDevice *hotplug_pdev = PCI_DEVICE(hotplug_dev); uint8_t *exp_cap = hotplug_pdev->config + hotplug_pdev->exp.exp_cap; uint32_t sltcap = pci_get_word(exp_cap + PCI_EXP_SLTCAP); +uint16_t sltctl = pci_get_word(exp_cap + PCI_EXP_SLTCTL); /* Check if hot-unplug is disabled on the slot */ if ((sltcap & PCI_EXP_SLTCAP_HPC) == 0) { @@ -521,6 +522,12 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, return; } +if ((sltctl & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PWR_IND_BLINK) { +error_setg(errp, "Hot-unplug failed: " + "guest is busy (power indicator blinking)"); +return; +} + dev->pending_deleted_event = true; /* In case user cancel the operation of multi-function hot-add, -- MST
[PULL 15/20] pci: implement power state
From: Gerd Hoffmann This allows to power off pci devices. In "off" state the devices will not be visible. No pci config space access, no pci bar access, no dma. Default state is "on", so this patch (alone) should not change behavior. Use case: Allows hotplug controllers implement slot power. Hotplug controllers doing so should set the inital power state for devices in the ->plug callback. Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-2-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- include/hw/pci/pci.h | 2 ++ hw/pci/pci.c | 25 +++-- hw/pci/pci_host.c| 6 -- 3 files changed, 29 insertions(+), 4 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 5c4016b995..e7cdf2d5ec 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -268,6 +268,7 @@ typedef struct PCIReqIDCache PCIReqIDCache; struct PCIDevice { DeviceState qdev; bool partially_hotplugged; +bool has_power; /* PCI config space */ uint8_t *config; @@ -908,5 +909,6 @@ extern const VMStateDescription vmstate_pci_device; } MSIMessage pci_get_msi_message(PCIDevice *dev, int vector); +void pci_set_power(PCIDevice *pci_dev, bool state); #endif diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 4a84e478ce..e5993c1ef5 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -1380,6 +1380,9 @@ static void pci_update_mappings(PCIDevice *d) continue; new_addr = pci_bar_address(d, i, r->type, r->size); +if (!d->has_power) { +new_addr = PCI_BAR_UNMAPPED; +} /* This bar isn't changed */ if (new_addr == r->addr) @@ -1464,8 +1467,8 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int if (range_covers_byte(addr, l, PCI_COMMAND)) { pci_update_irq_disabled(d, was_irq_disabled); memory_region_set_enabled(>bus_master_enable_region, - pci_get_word(d->config + PCI_COMMAND) -& PCI_COMMAND_MASTER); + (pci_get_word(d->config + PCI_COMMAND) + & PCI_COMMAND_MASTER) && d->has_power); } msi_write_config(d, addr, val_in, l); @@ -2182,6 +2185,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp) pci_qdev_unrealize(DEVICE(pci_dev)); return; } + +pci_set_power(pci_dev, true); } PCIDevice *pci_new_multifunction(int devfn, bool multifunction, @@ -2853,6 +2858,22 @@ MSIMessage pci_get_msi_message(PCIDevice *dev, int vector) return msg; } +void pci_set_power(PCIDevice *d, bool state) +{ +if (d->has_power == state) { +return; +} + +d->has_power = state; +pci_update_mappings(d); +memory_region_set_enabled(>bus_master_enable_region, + (pci_get_word(d->config + PCI_COMMAND) + & PCI_COMMAND_MASTER) && d->has_power); +if (!d->has_power) { +pci_device_reset(d); +} +} + static const TypeInfo pci_device_type_info = { .name = TYPE_PCI_DEVICE, .parent = TYPE_DEVICE, diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c index cf02f0d6a5..7beafd40a8 100644 --- a/hw/pci/pci_host.c +++ b/hw/pci/pci_host.c @@ -74,7 +74,8 @@ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr, /* non-zero functions are only exposed when function 0 is present, * allowing direct removal of unexposed functions. */ -if (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) { +if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) || +!pci_dev->has_power) { return; } @@ -97,7 +98,8 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr, /* non-zero functions are only exposed when function 0 is present, * allowing direct removal of unexposed functions. */ -if (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) { +if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) || +!pci_dev->has_power) { return ~0x0; } -- MST
[PULL 07/20] hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type
From: Julia Suvorova To solve issues [1-2] the Hot Plug Capable bit in PCIe Slots will be turned on, while the switch to ACPI Hot-plug will be done in the DSDT table. Introducing 'x-keep-native-hpc' property disables the HPC bit only in 6.1 and as a result keeps the forced 'reserve-io' on pcie-root-ports in 6.1 too. [1] https://gitlab.com/qemu-project/qemu/-/issues/641 [2] https://bugzilla.redhat.com/show_bug.cgi?id=2006409 Signed-off-by: Julia Suvorova Signed-off-by: Igor Mammedov Message-Id: <2022110857.3116853-3-imamm...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- include/hw/acpi/ich9.h | 1 + hw/acpi/ich9.c | 18 ++ hw/i386/pc.c | 2 ++ hw/i386/pc_q35.c | 7 ++- 4 files changed, 27 insertions(+), 1 deletion(-) diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h index f04f1791bd..7ca92843c6 100644 --- a/include/hw/acpi/ich9.h +++ b/include/hw/acpi/ich9.h @@ -56,6 +56,7 @@ typedef struct ICH9LPCPMRegs { AcpiCpuHotplug gpe_cpu; CPUHotplugState cpuhp_state; +bool keep_pci_slot_hpc; bool use_acpi_hotplug_bridge; AcpiPciHpState acpi_pci_hotplug; MemHotplugState acpi_memory_hotplug; diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 1ee2ba2c50..ebe08ed831 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -419,6 +419,20 @@ static void ich9_pm_set_acpi_pci_hotplug(Object *obj, bool value, Error **errp) s->pm.use_acpi_hotplug_bridge = value; } +static bool ich9_pm_get_keep_pci_slot_hpc(Object *obj, Error **errp) +{ +ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + +return s->pm.keep_pci_slot_hpc; +} + +static void ich9_pm_set_keep_pci_slot_hpc(Object *obj, bool value, Error **errp) +{ +ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + +s->pm.keep_pci_slot_hpc = value; +} + void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) { static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; @@ -428,6 +442,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) pm->disable_s4 = 0; pm->s4_val = 2; pm->use_acpi_hotplug_bridge = true; +pm->keep_pci_slot_hpc = true; object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE, >pm_io_base, OBJ_PROP_FLAG_READ); @@ -454,6 +469,9 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) object_property_add_bool(obj, ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, ich9_pm_get_acpi_pci_hotplug, ich9_pm_set_acpi_pci_hotplug); +object_property_add_bool(obj, "x-keep-pci-slot-hpc", + ich9_pm_get_keep_pci_slot_hpc, + ich9_pm_set_keep_pci_slot_hpc); } void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 2592a82148..a2ef40ecbc 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -98,6 +98,7 @@ GlobalProperty pc_compat_6_1[] = { { TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" }, { TYPE_X86_CPU, "hv-version-id-major", "0x0006" }, { TYPE_X86_CPU, "hv-version-id-minor", "0x0001" }, +{ "ICH9-LPC", "x-keep-pci-slot-hpc", "false" }, }; const size_t pc_compat_6_1_len = G_N_ELEMENTS(pc_compat_6_1); @@ -107,6 +108,7 @@ GlobalProperty pc_compat_6_0[] = { { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" }, { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" }, { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }, +{ "ICH9-LPC", "x-keep-pci-slot-hpc", "true" }, }; const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0); diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index fc34b905ee..e1e100316d 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -137,6 +137,7 @@ static void pc_q35_init(MachineState *machine) DriveInfo *hd[MAX_SATA_PORTS]; MachineClass *mc = MACHINE_GET_CLASS(machine); bool acpi_pcihp; +bool keep_pci_slot_hpc; /* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping @@ -242,7 +243,11 @@ static void pc_q35_init(MachineState *machine) ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, NULL); -if (acpi_pcihp) { +keep_pci_slot_hpc = object_property_get_bool(OBJECT(lpc), + "x-keep-pci-slot-hpc", + NULL); + +if (!keep_pci_slot_hpc && acpi_pcihp) { object_register_sugar_prop(TYPE_PCIE_SLOT, "x-native-hotplug", "false", true); } -- MST
[PULL 14/20] vdpa: Check for existence of opts.vhostdev
From: Eugenio Pérez Since net_init_vhost_vdpa is trying to open it. Not specifying it in the command line crash qemu. Fixes: 7327813d17 ("vhost-vdpa: open device fd in net_init_vhost_vdpa()") Signed-off-by: Eugenio Pérez Message-Id: <2022193431.2379298-3-epere...@redhat.com> Acked-by: Jason Wang Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- net/vhost-vdpa.c | 4 1 file changed, 4 insertions(+) diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 1a7250b980..2e3c22a8c7 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -260,6 +260,10 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name, assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA); opts = >u.vhost_vdpa; +if (!opts->vhostdev) { +error_setg(errp, "vdpa character device not specified with vhostdev"); +return -1; +} vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp); if (vdpa_device_fd == -1) { -- MST
[PULL 02/20] softmmu/qdev-monitor: fix use-after-free in qdev_set_id()
From: Stefan Hajnoczi Reported by Coverity (CID 1465222). Fixes: 4a1d937796de0fecd8b22d7dbebf87f38e8282fd ("softmmu/qdev-monitor: add error handling in qdev_set_id") Cc: Damien Hedde Cc: Kevin Wolf Cc: Michael S. Tsirkin Signed-off-by: Stefan Hajnoczi Message-Id: <20211102163342.31162-1-stefa...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin Reviewed-by: Kevin Wolf Reviewed-by: Marc-André Lureau Reviewed-by: Damien Hedde Reviewed-by: Markus Armbruster --- softmmu/qdev-monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index f8b3a4cd82..588a62b88d 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -593,8 +593,8 @@ const char *qdev_set_id(DeviceState *dev, char *id, Error **errp) if (prop) { dev->id = id; } else { -g_free(id); error_setg(errp, "Duplicate device ID '%s'", id); +g_free(id); return NULL; } } else { -- MST
[PULL 13/20] vdpa: Replace qemu_open_old by qemu_open at
From: Eugenio Pérez There is no reason to keep using the old one, since we neither use the variadics arguments nor open it with O_DIRECT. Also, net_client_init1, the caller of net_init_vhost_vdpa, wants all net_client_init_fun to use Error API, so it's a good step in that direction. Signed-off-by: Eugenio Pérez Message-Id: <2022193431.2379298-2-epere...@redhat.com> Acked-by: Jason Wang Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- net/vhost-vdpa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 373b706b90..1a7250b980 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -261,7 +261,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name, assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA); opts = >u.vhost_vdpa; -vdpa_device_fd = qemu_open_old(opts->vhostdev, O_RDWR); +vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp); if (vdpa_device_fd == -1) { return -errno; } -- MST
[PULL 09/20] hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC
From: Julia Suvorova There are two ways to enable ACPI PCI Hot-plug: * Disable the Hot-plug Capable bit on PCIe slots. This was the first approach which led to regression [1-2], as I/O space for a port is allocated only when it is hot-pluggable, which is determined by HPC bit. * Leave the HPC bit on and disable PCIe Native Hot-plug in _OSC method. This removes the (future) ability of hot-plugging switches with PCIe Native hotplug since ACPI PCI Hot-plug only works with cold-plugged bridges. If the user wants to explicitely use this feature, they can disable ACPI PCI Hot-plug with: --global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off Change the bit in _OSC method so that the OS selects ACPI PCI Hot-plug instead of PCIe Native. [1] https://gitlab.com/qemu-project/qemu/-/issues/641 [2] https://bugzilla.redhat.com/show_bug.cgi?id=2006409 Signed-off-by: Julia Suvorova Signed-off-by: Igor Mammedov Message-Id: <2022110857.3116853-5-imamm...@redhat.com> Reviewed-by: Ani Sinha Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/i386/acpi-build.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index a3ad6abd33..a99c6e4fe3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1337,7 +1337,7 @@ static void build_x86_acpi_pci_hotplug(Aml *table, uint64_t pcihp_addr) aml_append(table, scope); } -static Aml *build_q35_osc_method(void) +static Aml *build_q35_osc_method(bool enable_native_pcie_hotplug) { Aml *if_ctx; Aml *if_ctx2; @@ -1359,8 +1359,10 @@ static Aml *build_q35_osc_method(void) /* * Always allow native PME, AER (no dependencies) * Allow SHPC (PCI bridges can have SHPC controller) + * Disable PCIe Native Hot-plug if ACPI PCI Hot-plug is enabled. */ -aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1F), a_ctrl)); +aml_append(if_ctx, aml_and(a_ctrl, +aml_int(0x1E | (enable_native_pcie_hotplug ? 0x1 : 0x0)), a_ctrl)); if_ctx2 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1; /* Unknown revision */ @@ -1449,7 +1451,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03"))); aml_append(dev, aml_name_decl("_ADR", aml_int(0))); aml_append(dev, aml_name_decl("_UID", aml_int(pcmc->pci_root_uid))); -aml_append(dev, build_q35_osc_method()); +aml_append(dev, build_q35_osc_method(!pm->pcihp_bridge_en)); aml_append(sb_scope, dev); if (mcfg_valid) { aml_append(sb_scope, build_q35_dram_controller()); @@ -1565,7 +1567,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, if (pci_bus_is_express(bus)) { aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08"))); aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03"))); -aml_append(dev, build_q35_osc_method()); + +/* Expander bridges do not have ACPI PCI Hot-plug enabled */ +aml_append(dev, build_q35_osc_method(true)); } else { aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); } -- MST
[PULL 01/20] net/vhost-vdpa: fix memory leak in vhost_vdpa_get_max_queue_pairs()
From: Stefano Garzarella Use g_autofree to ensure that `config` is freed when vhost_vdpa_get_max_queue_pairs() returns. Reported-by: Coverity (CID 1465228: RESOURCE_LEAK) Fixes: 402378407d ("vhost-vdpa: multiqueue support") Signed-off-by: Stefano Garzarella Message-Id: <20211102155157.241034-1-sgarz...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé Acked-by: Jason Wang --- net/vhost-vdpa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 49ab322511..373b706b90 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -214,7 +214,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer, static int vhost_vdpa_get_max_queue_pairs(int fd, int *has_cvq, Error **errp) { unsigned long config_size = offsetof(struct vhost_vdpa_config, buf); -struct vhost_vdpa_config *config; +g_autofree struct vhost_vdpa_config *config = NULL; __virtio16 *max_queue_pairs; uint64_t features; int ret; -- MST
[PULL 18/20] pcie: factor out pcie_cap_slot_unplug()
From: Gerd Hoffmann No functional change. Signed-off-by: Gerd Hoffmann Message-Id: <2021130859.1171890-5-kra...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/pci/pcie.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index b92dbff118..959bf074b2 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -497,6 +497,25 @@ static void pcie_unplug_device(PCIBus *bus, PCIDevice *dev, void *opaque) object_unparent(OBJECT(dev)); } +static void pcie_cap_slot_do_unplug(PCIDevice *dev) +{ +PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev)); +uint8_t *exp_cap = dev->config + dev->exp.exp_cap; +uint32_t lnkcap = pci_get_long(exp_cap + PCI_EXP_LNKCAP); + +pci_for_each_device_under_bus(sec_bus, pcie_unplug_device, NULL); + +pci_word_test_and_clear_mask(exp_cap + PCI_EXP_SLTSTA, + PCI_EXP_SLTSTA_PDS); +if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA || +(lnkcap & PCI_EXP_LNKCAP_DLLLARC)) { +pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA, + PCI_EXP_LNKSTA_DLLLA); +} +pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, + PCI_EXP_SLTSTA_PDC); +} + void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { @@ -676,7 +695,6 @@ void pcie_cap_slot_write_config(PCIDevice *dev, uint32_t pos = dev->exp.exp_cap; uint8_t *exp_cap = dev->config + pos; uint16_t sltsta = pci_get_word(exp_cap + PCI_EXP_SLTSTA); -uint32_t lnkcap = pci_get_long(exp_cap + PCI_EXP_LNKCAP); if (ranges_overlap(addr, len, pos + PCI_EXP_SLTSTA, 2)) { /* @@ -726,17 +744,7 @@ void pcie_cap_slot_write_config(PCIDevice *dev, (val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF && (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) || (old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF)) { -PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev)); -pci_for_each_device_under_bus(sec_bus, pcie_unplug_device, NULL); -pci_word_test_and_clear_mask(exp_cap + PCI_EXP_SLTSTA, - PCI_EXP_SLTSTA_PDS); -if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA || -(lnkcap & PCI_EXP_LNKCAP_DLLLARC)) { -pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA, - PCI_EXP_LNKSTA_DLLLA); -} -pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, - PCI_EXP_SLTSTA_PDC); +pcie_cap_slot_do_unplug(dev); } pcie_cap_update_power(dev); -- MST
[PULL 10/20] tests: bios-tables-test update expected blobs
From: Igor Mammedov The changes are the result of 'hw/i386/acpi-build: Deny control on PCIe Native Hot-Plug in _OSC' which hides PCIE hotplug bit in host-bridge _OSC Method (_OSC, 4, NotSerialized) // _OSC: Operating System Capabilities { CreateDWordField (Arg3, Zero, CDW1) If ((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */)) { CreateDWordField (Arg3, 0x04, CDW2) CreateDWordField (Arg3, 0x08, CDW3) Local0 = CDW3 /* \_SB_.PCI0._OSC.CDW3 */ -Local0 &= 0x1F +Local0 &= 0x1E Signed-off-by: Igor Mammedov Message-Id: <2022110857.3116853-6-imamm...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- tests/qtest/bios-tables-test-allowed-diff.h | 16 tests/data/acpi/q35/DSDT| Bin 8289 -> 8289 bytes tests/data/acpi/q35/DSDT.acpihmat | Bin 9614 -> 9614 bytes tests/data/acpi/q35/DSDT.bridge | Bin 11003 -> 11003 bytes tests/data/acpi/q35/DSDT.cphp | Bin 8753 -> 8753 bytes tests/data/acpi/q35/DSDT.dimmpxm| Bin 9943 -> 9943 bytes tests/data/acpi/q35/DSDT.ipmibt | Bin 8364 -> 8364 bytes tests/data/acpi/q35/DSDT.ivrs | Bin 8306 -> 8306 bytes tests/data/acpi/q35/DSDT.memhp | Bin 9648 -> 9648 bytes tests/data/acpi/q35/DSDT.mmio64 | Bin 9419 -> 9419 bytes tests/data/acpi/q35/DSDT.multi-bridge | Bin 8583 -> 8583 bytes tests/data/acpi/q35/DSDT.nohpet | Bin 8147 -> 8147 bytes tests/data/acpi/q35/DSDT.numamem| Bin 8295 -> 8295 bytes tests/data/acpi/q35/DSDT.tis.tpm12 | Bin 8894 -> 8894 bytes tests/data/acpi/q35/DSDT.tis.tpm2 | Bin 8894 -> 8894 bytes tests/data/acpi/q35/DSDT.xapic | Bin 35652 -> 35652 bytes 16 files changed, 16 deletions(-) diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h index 48e5634d4b..dfb8523c8b 100644 --- a/tests/qtest/bios-tables-test-allowed-diff.h +++ b/tests/qtest/bios-tables-test-allowed-diff.h @@ -1,17 +1 @@ /* List of comma-separated changed AML files to ignore */ -"tests/data/acpi/q35/DSDT", -"tests/data/acpi/q35/DSDT.tis", -"tests/data/acpi/q35/DSDT.bridge", -"tests/data/acpi/q35/DSDT.mmio64", -"tests/data/acpi/q35/DSDT.ipmibt", -"tests/data/acpi/q35/DSDT.cphp", -"tests/data/acpi/q35/DSDT.memhp", -"tests/data/acpi/q35/DSDT.acpihmat", -"tests/data/acpi/q35/DSDT.numamem", -"tests/data/acpi/q35/DSDT.dimmpxm", -"tests/data/acpi/q35/DSDT.nohpet", -"tests/data/acpi/q35/DSDT.tis.tpm2", -"tests/data/acpi/q35/DSDT.tis.tpm12", -"tests/data/acpi/q35/DSDT.multi-bridge", -"tests/data/acpi/q35/DSDT.ivrs", -"tests/data/acpi/q35/DSDT.xapic", diff --git a/tests/data/acpi/q35/DSDT b/tests/data/acpi/q35/DSDT index 281fc82c03b2562d2e6b7caec0d817b034a47138..c1965f6051ef2af81dd8412abe169d87845bb033 100644 GIT binary patch delta 24 gcmaFp@X&$FCDET<0BnZ{w*UYD delta 24 gcmaFp@X&$FCDET<0BnK?w*UYD diff --git a/tests/data/acpi/q35/DSDT.acpihmat b/tests/data/acpi/q35/DSDT.acpihmat index 8c1e05a11a328ec1cc6f86e36e52c28f41f9744e..f24d4874bff8d327a165ed7c36de507aea114edd 100644 GIT binary patch delta 24 fcmeD4?(^ny33dtTQ)OUa+&+=(3ZvY{`|DKzU@Hhn delta 24 fcmeD4?(^ny33dtTQ)OUa+%}Qx3ZwkS`|DKzU?vDi diff --git a/tests/data/acpi/q35/DSDT.bridge b/tests/data/acpi/q35/DSDT.bridge index 6f1464b6c712d7f33cb4b891b7ce76fe228f44c9..424d51bd1cb39ea73501ef7d0044ee52cec5bdac 100644 GIT binary patch delta 24 gcmewz`a6`%CDF$oF>WH)6-K#@_k$DxTWtqt delta 24 fcmdn!veAXhCDF$oF?J%?6-N1u_k$DxTWAMo diff --git a/tests/data/acpi/q35/DSDT.dimmpxm b/tests/data/acpi/q35/DSDT.dimmpxm index fe5820d93d057ef09a001662369b15afbc5b87e2..76e451e829ec4c245315f7eed8731aa1be45a747 100644 GIT binary patch delta 24 gcmccad)=4ICDD~$3R?@yKo0BrFHn*aa+ diff --git a/tests/data/acpi/q35/DSDT.memhp b/tests/data/acpi/q35/DSDT.memhp index 9bc11518fc57687ca789dc70793b48b29a0d74ed..4e9cb3dc6896bb79ccac0fe342a404549f6610e8 100644 GIT binary patch delta 24 gcmdnsy}_HyCD7Sg;8$f{S^uTTRaEr delta 24 fcmZp7Zg=K#33dr-S7cyd?48JUg;9Rv{S^uTTQ>*m diff --git a/tests/data/acpi/q35/DSDT.nohpet b/tests/data/acpi/q35/DSDT.nohpet index e8202e6ddfbe96071f32f1ec05758f650569943e..83d1aa00ac5686df479673fb0d7830f946e25dea 100644 GIT binary patch delta 24 gcmca?f7zbPCDn+a delta 24 gcmaFv@Z5pRCDn+a diff --git a/tests/data/acpi/q35/DSDT.tis.tpm12 b/tests/data/acpi/q35/DSDT.tis.tpm12 index c96b5277a14ae98174408d690d6e0246bd932623..0ebdf6fbd77967f1ab5d5337b7b1fed314cfaca8 100644 GIT binary patch delta 24 gcmdnzy3du%CDyjp@iVCN7s?mk^h31_s6r6S=N1%5A)#+64f6_X delta 26 icmX>yjp@iVCN7s?mk^h31_s9U6S=N1%5S`%+64f6@(F(c -- MST
[PULL 05/20] hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines
From: Philippe Mathieu-Daudé When trying to use the pc-dimm device on a non-NUMA machine, we get: $ qemu-system-arm -M none -cpu max -S \ -object memory-backend-file,id=mem1,size=1M,mem-path=/tmp/1m \ -device pc-dimm,id=dimm1,memdev=mem1 Segmentation fault (core dumped) (gdb) bt #0 pc_dimm_realize (dev=0x56da3e90, errp=0x7fffcd10) at hw/mem/pc-dimm.c:184 #1 0x55fe1f8f in device_set_realized (obj=0x56da3e90, value=true, errp=0x7fffce18) at hw/core/qdev.c:531 #2 0x55feb4a9 in property_set_bool (obj=0x56da3e90, v=0x56e54420, name=0x563c3c41 "realized", opaque=0x56a704f0, errp=0x7fffce18) at qom/object.c:2257 To avoid that crash, restrict the pc-dimm NUMA check to machines supporting NUMA, and do not allow the use of 'node' property on non-NUMA machines. Suggested-by: Igor Mammedov Signed-off-by: Philippe Mathieu-Daudé Message-Id: <20211106145016.611332-1-f4...@amsat.org> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/mem/pc-dimm.c | 23 +++ 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index a3a2560301..48b913aba6 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -181,7 +181,21 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp) PCDIMMDevice *dimm = PC_DIMM(dev); PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); MachineState *ms = MACHINE(qdev_get_machine()); -int nb_numa_nodes = ms->numa_state->num_nodes; + +if (ms->numa_state) { +int nb_numa_nodes = ms->numa_state->num_nodes; + +if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) || +(!nb_numa_nodes && dimm->node)) { +error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %" + PRIu32 "' which exceeds the number of numa nodes: %d", + dimm->node, nb_numa_nodes ? nb_numa_nodes : 1); +return; +} +} else if (dimm->node > 0) { +error_setg(errp, "machine doesn't support NUMA"); +return; +} if (!dimm->hostmem) { error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set"); @@ -191,13 +205,6 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp) object_get_canonical_path_component(OBJECT(dimm->hostmem))); return; } -if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) || -(!nb_numa_nodes && dimm->node)) { -error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %" - PRIu32 "' which exceeds the number of numa nodes: %d", - dimm->node, nb_numa_nodes ? nb_numa_nodes : 1); -return; -} if (ddc->realize) { ddc->realize(dimm, errp); -- MST
[PULL 00/20] pci,pc,virtio: bugfixes
The following changes since commit 0a70bcf18caf7a61d480f8448723c15209d128ef: Update version for v6.2.0-rc0 release (2021-11-09 18:22:57 +0100) are available in the Git repository at: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream for you to fetch changes up to 18416c62e36a79823a9e28f6b2260aa13c25e1d9: pcie: expire pending delete (2021-11-15 11:10:11 -0500) pci,pc,virtio: bugfixes pci power management fixes acpi hotplug fixes misc other fixes Signed-off-by: Michael S. Tsirkin Eugenio Pérez (4): vhost: Rename last_index to vq_index_end vhost: Fix last vq queue index of devices with no cvq vdpa: Replace qemu_open_old by qemu_open at vdpa: Check for existence of opts.vhostdev Gerd Hoffmann (6): pci: implement power state pcie: implement slot power control for pcie root ports pcie: add power indicator blink check pcie: factor out pcie_cap_slot_unplug() pcie: fast unplug when slot power is off pcie: expire pending delete Igor Mammedov (2): pcie: rename 'native-hotplug' to 'x-native-hotplug' tests: bios-tables-test update expected blobs Jason Wang (2): virtio: use virtio accessor to access packed descriptor flags virtio: use virtio accessor to access packed event Julia Suvorova (3): hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type bios-tables-test: Allow changes in DSDT ACPI tables hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC Philippe Mathieu-Daudé (1): hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines Stefan Hajnoczi (1): softmmu/qdev-monitor: fix use-after-free in qdev_set_id() Stefano Garzarella (1): net/vhost-vdpa: fix memory leak in vhost_vdpa_get_max_queue_pairs() include/hw/acpi/ich9.h| 1 + include/hw/pci/pci.h | 2 + include/hw/qdev-core.h| 1 + include/hw/virtio/vhost.h | 4 +- hw/acpi/ich9.c| 18 hw/i386/acpi-build.c | 12 -- hw/i386/pc.c | 2 + hw/i386/pc_q35.c | 9 +++- hw/mem/pc-dimm.c | 23 ++ hw/net/vhost_net.c| 12 +++--- hw/pci/pci.c | 25 ++- hw/pci/pci_host.c | 6 ++- hw/pci/pcie.c | 79 -- hw/pci/pcie_port.c| 2 +- hw/virtio/vhost-vdpa.c| 2 +- hw/virtio/virtio.c| 24 --- net/vhost-vdpa.c | 8 +++- softmmu/qdev-monitor.c| 6 ++- tests/data/acpi/q35/DSDT | Bin 8289 -> 8289 bytes tests/data/acpi/q35/DSDT.acpihmat | Bin 9614 -> 9614 bytes tests/data/acpi/q35/DSDT.bridge | Bin 11003 -> 11003 bytes tests/data/acpi/q35/DSDT.cphp | Bin 8753 -> 8753 bytes tests/data/acpi/q35/DSDT.dimmpxm | Bin 9943 -> 9943 bytes tests/data/acpi/q35/DSDT.ipmibt | Bin 8364 -> 8364 bytes tests/data/acpi/q35/DSDT.ivrs | Bin 8306 -> 8306 bytes tests/data/acpi/q35/DSDT.memhp| Bin 9648 -> 9648 bytes tests/data/acpi/q35/DSDT.mmio64 | Bin 9419 -> 9419 bytes tests/data/acpi/q35/DSDT.multi-bridge | Bin 8583 -> 8583 bytes tests/data/acpi/q35/DSDT.nohpet | Bin 8147 -> 8147 bytes tests/data/acpi/q35/DSDT.numamem | Bin 8295 -> 8295 bytes tests/data/acpi/q35/DSDT.tis.tpm12| Bin 8894 -> 8894 bytes tests/data/acpi/q35/DSDT.tis.tpm2 | Bin 8894 -> 8894 bytes tests/data/acpi/q35/DSDT.xapic| Bin 35652 -> 35652 bytes 33 files changed, 176 insertions(+), 60 deletions(-)
[PULL 08/20] bios-tables-test: Allow changes in DSDT ACPI tables
From: Julia Suvorova Prepare for changing the _OSC method in q35 DSDT. Signed-off-by: Julia Suvorova Signed-off-by: Igor Mammedov Acked-by: Ani Sinha Message-Id: <2022110857.3116853-4-imamm...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- tests/qtest/bios-tables-test-allowed-diff.h | 16 1 file changed, 16 insertions(+) diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h index dfb8523c8b..48e5634d4b 100644 --- a/tests/qtest/bios-tables-test-allowed-diff.h +++ b/tests/qtest/bios-tables-test-allowed-diff.h @@ -1 +1,17 @@ /* List of comma-separated changed AML files to ignore */ +"tests/data/acpi/q35/DSDT", +"tests/data/acpi/q35/DSDT.tis", +"tests/data/acpi/q35/DSDT.bridge", +"tests/data/acpi/q35/DSDT.mmio64", +"tests/data/acpi/q35/DSDT.ipmibt", +"tests/data/acpi/q35/DSDT.cphp", +"tests/data/acpi/q35/DSDT.memhp", +"tests/data/acpi/q35/DSDT.acpihmat", +"tests/data/acpi/q35/DSDT.numamem", +"tests/data/acpi/q35/DSDT.dimmpxm", +"tests/data/acpi/q35/DSDT.nohpet", +"tests/data/acpi/q35/DSDT.tis.tpm2", +"tests/data/acpi/q35/DSDT.tis.tpm12", +"tests/data/acpi/q35/DSDT.multi-bridge", +"tests/data/acpi/q35/DSDT.ivrs", +"tests/data/acpi/q35/DSDT.xapic", -- MST
[PULL 06/20] pcie: rename 'native-hotplug' to 'x-native-hotplug'
From: Igor Mammedov Mark property as experimental/internal adding 'x-' prefix. Property was introduced in 6.1 and it should have provided ability to turn on native PCIE hotplug on port even when ACPI PCI hotplug is in use is user explicitly sets property on CLI. However that never worked since slot is wired to ACPI hotplug controller. Another non-intended usecase: disable native hotplug on slot when APCI based hotplug is disabled, which works but slot has 'hotplug' property for this taks. It should be relatively safe to rename it to experimental as no users should exist for it and given that the property is broken we don't really want to leave it around for much longer lest users start using it. Signed-off-by: Igor Mammedov Reviewed-by: Ani Sinha Message-Id: <2022110857.3116853-2-imamm...@redhat.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/i386/pc_q35.c | 2 +- hw/pci/pcie_port.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 797e09500b..fc34b905ee 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -243,7 +243,7 @@ static void pc_q35_init(MachineState *machine) NULL); if (acpi_pcihp) { -object_register_sugar_prop(TYPE_PCIE_SLOT, "native-hotplug", +object_register_sugar_prop(TYPE_PCIE_SLOT, "x-native-hotplug", "false", true); } diff --git a/hw/pci/pcie_port.c b/hw/pci/pcie_port.c index da850e8dde..e95c1e5519 100644 --- a/hw/pci/pcie_port.c +++ b/hw/pci/pcie_port.c @@ -148,7 +148,7 @@ static Property pcie_slot_props[] = { DEFINE_PROP_UINT8("chassis", PCIESlot, chassis, 0), DEFINE_PROP_UINT16("slot", PCIESlot, slot, 0), DEFINE_PROP_BOOL("hotplug", PCIESlot, hotplug, true), -DEFINE_PROP_BOOL("native-hotplug", PCIESlot, native_hotplug, true), +DEFINE_PROP_BOOL("x-native-hotplug", PCIESlot, native_hotplug, true), DEFINE_PROP_END_OF_LIST() }; -- MST
[PULL 04/20] vhost: Fix last vq queue index of devices with no cvq
From: Eugenio Pérez The -1 assumes that cvq device model is accounted in data_queue_pairs, if cvq does not exists, but it's actually the opposite: Devices with !cvq are ok but devices with cvq does not add the last queue to data_queue_pairs. This is not a problem to vhost-net, but it is to vhost-vdpa: * Devices with cvq gets initialized at last data vq device model, not at cvq one. * Devices with !cvq never gets initialized, since last_index is the first queue of the last device model. Because of that, the right change in last_index is to actually add the cvq, not to remove the missing one. This is not a problem to vhost-net, but it is to vhost-vdpa, which device model trust to reach the last index to finish starting the device. Also, as the previous commit, rename it to index_end. Tested with vp_vdpa with host's vhost=on and vhost=off, with ctrl_vq=on and ctrl_vq=off. Fixes: 049eb15b5fc9 ("vhost: record the last virtqueue index for the virtio device") Reviewed-by: Juan Quintela Signed-off-by: Eugenio Pérez Message-Id: <20211104085625.2054959-3-epere...@redhat.com> Acked-by: Jason Wang Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin --- hw/net/vhost_net.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 29f2c4212f..30379d2ca4 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -326,11 +326,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs, VirtIONet *n = VIRTIO_NET(dev); int nvhosts = data_queue_pairs + cvq; struct vhost_net *net; -int r, e, i, last_index = data_queue_pairs * 2; +int r, e, i, index_end = data_queue_pairs * 2; NetClientState *peer; -if (!cvq) { -last_index -= 1; +if (cvq) { +index_end += 1; } if (!k->set_guest_notifiers) { @@ -347,7 +347,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs, } net = get_vhost_net(peer); -vhost_net_set_vq_index(net, i * 2, last_index); +vhost_net_set_vq_index(net, i * 2, index_end); /* Suppress the masking guest notifiers on vhost user * because vhost user doesn't interrupt masking/unmasking -- MST
Re: [PULL 21/26] configure, meson: move more compiler checks to Meson
On Thu, 14 Oct 2021 at 17:49, Paolo Bonzini wrote: > > Reviewed-by: Marc-André Lureau > Message-Id: <20211007130829.632254-15-pbonz...@redhat.com> > Signed-off-by: Paolo Bonzini > --- > configure| 91 > meson.build | 44 +++ > util/meson.build | 4 ++- > 3 files changed, 47 insertions(+), 92 deletions(-) > diff --git a/meson.build b/meson.build > index 6bf43e6d30..6b7487b725 100644 > --- a/meson.build > +++ b/meson.build > @@ -1550,6 +1550,8 @@ config_host_data.set('CONFIG_INOTIFY', > cc.has_header_symbol('sys/inotify.h', 'inotify_init')) > config_host_data.set('CONFIG_INOTIFY1', > cc.has_header_symbol('sys/inotify.h', 'inotify_init1')) > +config_host_data.set('CONFIG_IOVEC', > + cc.has_header_symbol('sys/uio.h', 'struct iovec')) > config_host_data.set('CONFIG_MACHINE_BSWAP_H', > cc.has_header_symbol('machine/bswap.h', 'bswap32', >prefix: '''#include Hi -- I've just noticed that this change breaks compilation for me, because this test incorrectly fails to set CONFIG_IOVEC on a system where the header defines 'struct iovec'. This seems to be because "struct iovec" isn't a valid thing to test with has_header_symbol, because it provokes a compiler error from clang. The meson-log.txt shows: Running compile: Working directory: /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/meson-private/tmpfspzse_8 Command line: clang-7 -m64 -mcx16 /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/meson-private/tmpfspzse_8/testfile.c -o /mnt/nvme disk/linaro/qemu-from-laptop/qemu/build/arm-clang/meson-private/tmpfspzse_8/output.obj -c -fsanitize=undefined -fno-sanitize=shift-base -Werror -D_FILE_OFFSET_BITS=64 -O0 -Werror=implicit-function-declaration -Werror=unknown-warning-option -Werror=unused-command-line-argument -Werror=i gnored-optimization-argument -std=gnu11 Code: #include int main(void) { /* If it's not defined as a macro, try to use as a symbol */ #ifndef struct iovec struct iovec; #endif return 0; } Compiler stdout: Compiler stderr: /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/meson-private/tmpfspzse_8/testfile.c:5:28: error: extra tokens at end of #ifndef di rective [-Werror,-Wextra-tokens] #ifndef struct iovec ^ // 1 error generated. ...skipping... #ifndef struct iovec struct iovec; #endif return 0; } Compiler stdout: Compiler stderr: /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/arm-clang/meson-private/tmpfspzse_8/testfile.c:5:28: error: extra tokens at end of #ifndef r] #ifndef struct iovec ^ // 1 error generated. Header has symbol "struct iovec" : NO For comparison, with a gcc build the test works because gcc happens to merely warn rather than fail for the syntax issue: Running compile: Working directory: /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/meson-private/tmpidzebj6t Command line: ccache gcc -m64 -mcx16 /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/meson-private/tmpidzebj6t/testfile.c -o /mnt/nvmedis k/linaro/qemu-from-laptop/qemu/build/x86/meson-private/tmpidzebj6t/output.obj -c -D_FILE_OFFSET_BITS=64 -O0 -std=gnu11 Code: #include int main(void) { /* If it's not defined as a macro, try to use as a symbol */ int main(void) { return 0; } Compiler stdout: Compiler stderr: ...skipping... #ifndef struct iovec struct iovec; #endif return 0; } Compiler stdout: Compiler stderr: /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/meson-private/tmpidzebj6t/testfile.c: In function 'main': /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/meson-private/tmpidzebj6t/testfile.c:5:28: warning: extra tokens at end of #ifndef direcve 5 | #ifndef struct iovec |^ Header has symbol "struct iovec" : YES -- PMM
Re: [PATCH v4 00/25] block layer: split block APIs in global state and I/O
On Mon, Nov 15, 2021 at 05:03:28PM +0100, Hanna Reitz wrote: > On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote: > > Currently, block layer APIs like block-backend.h contain a mix of > > functions that are either running in the main loop and under the > > BQL, or are thread-safe functions and run in iothreads performing I/O. > > The functions running under BQL also take care of modifying the > > block graph, by using drain and/or aio_context_acquire/release. > > This makes it very confusing to understand where each function > > runs, and what assumptions it provided with regards to thread > > safety. > > > > We call the functions running under BQL "global state (GS) API", and > > distinguish them from the thread-safe "I/O API". > > > > The aim of this series is to split the relevant block headers in > > global state and I/O sub-headers. > > Despite leaving quite some comments, the series and the split seem > reasonable to me overall. (This is a pretty big series, after all, so those > “some comments” stack up against a majority of changes that seem OK to me. > :)) > > One thing I noticed while reviewing is that it’s really hard to verify that > no I/O function calls a GS function. What would be wonderful is some > function marker like coroutine_fn that marks GS functions (or I/O functions) > and that we could then verify the call paths. But AFAIU we’ve always wanted > precisely that for coroutine_fn and still don’t have it, so this seems like > extremely wishful thinking... :( Even if we don't programmatically verify these rules, it would be a major step forward if we simply adopted a standard naming convention for the APIs that was essentally self-describing. eg block_io_XXX for all I/O stuff and block_state_ for all global state ,and block_common if we have stuff applicable to both use cases. I wouldn't suggest doing it as part of this series, but I think it'd be worthwhile in general for the medium-long term, despite the code churn in updating all usage in the short term. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: does drive_get_next(IF_NONE) make sense?
On 15/11/2021 08.12, Alistair Francis wrote: On Mon, Nov 15, 2021 at 3:32 PM Markus Armbruster wrote: Peter Maydell writes: On Fri, 12 Nov 2021 at 13:34, Markus Armbruster wrote: Thomas Huth writes: On 03/11/2021 09.41, Markus Armbruster wrote: Peter Maydell writes: Does it make sense for a device/board to do drive_get_next(IF_NONE) ? Short answer: hell, no! ;) Would it make sense to add an "assert(type != IF_NONE)" to drive_get() to avoid such mistakes in the future? Worth a try. You need to fix the sifive_u_otp device first :-) And for that, we may want Hao Wu's "[PATCH v4 5/7] blockdev: Add a new IF type IF_OTHER" first. I can fixup sifive_u_otp, just let me know what the prefered solution is What kind of device is that OTP exactly? If it is some kind of non-serial flash device, maybe you could simply use IF_PFLASH instead? Thomas
Re: [PATCH v4 00/25] block layer: split block APIs in global state and I/O
On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote: Currently, block layer APIs like block-backend.h contain a mix of functions that are either running in the main loop and under the BQL, or are thread-safe functions and run in iothreads performing I/O. The functions running under BQL also take care of modifying the block graph, by using drain and/or aio_context_acquire/release. This makes it very confusing to understand where each function runs, and what assumptions it provided with regards to thread safety. We call the functions running under BQL "global state (GS) API", and distinguish them from the thread-safe "I/O API". The aim of this series is to split the relevant block headers in global state and I/O sub-headers. Despite leaving quite some comments, the series and the split seem reasonable to me overall. (This is a pretty big series, after all, so those “some comments” stack up against a majority of changes that seem OK to me. :)) One thing I noticed while reviewing is that it’s really hard to verify that no I/O function calls a GS function. What would be wonderful is some function marker like coroutine_fn that marks GS functions (or I/O functions) and that we could then verify the call paths. But AFAIU we’ve always wanted precisely that for coroutine_fn and still don’t have it, so this seems like extremely wishful thinking... :( I think most of the issues I found can be fixed (or are even irrelevant), the only thing that really worries me are the two places that are clearly I/O paths that call permission functions: Namely first block_crypto_amend_options_generic_luks() (part of the luks block driver’s .bdrv_co_amend implementation), which calls bdrv_child_refresh_perms(); and second fuse_do_truncate(), which calls blk_set_perm(). In the first case, we need this call so that we don’t permanently hog the WRITE permission for the luks file, which used to be a problem, I believe. We want to unshare the WRITE permission (and apparently also CONSISTENT_READ) during the key update, so we need some way to temporarily update the permissions. I only really see four solutions for this: (1) We somehow make the amend job run in the main context under the BQL and have it prevent all concurrent I/O access (seems bad) (2) We can make the permission functions part of the I/O path (seems wrong and probably impossible?) (3) We can drop the permissions update and permanently require the permissions that we need when updating keys (I think this might break existing use cases) (4) We can acquire the BQL around the permission update call and perhaps that works? I don’t know how (4) would work but it’s basically the only reasonable solution I can come up with. Would this be a way to call a BQL function from an I/O function? As for the second case, the same applies as above, with the differences that we have no jobs, so this code must always run in the block device’s AioContext (I think), which rules out (1); but (3) would become easier (i.e. require the RESIZE permission all the time), although that too might have an impact on existing users (don’t think so, though). In any case, if we could do (4), that would solve the problem here, too. And finally, another notable thing I noticed is that the way how create-related functions are handled is inconsistent. I believe they should all be GS functions; qmp_blockdev_create() seems to agree with me on this, but we currently seem to have some bugs there. It’s possible to invoke blockdev-create on a block device that’s in an I/O thread, and then qemu crashes. Oops. (The comment in qmp_blockdev_create() says that the block drivers’ implementations should prevent this, but apparently they don’t...?) In any case, that’s a pre-existing bug, of course, that doesn’t concern this series (other than that it suggests that “create” functions should be classified as GS). Hanna
Re: [PATCH RFC 0/2] Eliminate drive_get_next()
Peter Maydell writes: > On Mon, 15 Nov 2021 at 12:55, Markus Armbruster wrote: >> >> This is RFC because I'm unsure about the removal of >> >> /* Reason: init() method uses drive_get_next() */ >> dc->user_creatable = false; >> >> in PATCH 1. Both users appear to wire up some GPIO. If that's >> necessary for the thing to work, we should just replace the comment. > > Looking at the code, it sort of is and sort of isn't. The GPIO line > is the chip-select line. If you don't connect it then (because the > ssi-sd device configures cs_polarity to SSI_CS_LOW, requesting an > active-low chip-select) the device will always be selected. If > the machine created an SSI bus with no SSI device attached to it > then in theory the user could create an ssi-sd device and connect > it there and have it work. But in practice it's really unlikely: > machines create SSI buses with specific wired-in devices on them, > and the guest OS knows what it has to do to enable the chip select > for the device it wants to talk to (often some known GPIO pin on > a GPIO controller). > > So I would keep the user_creatable = false, with a reason of > "user should wire up GPIO chip-select line". But see below for I'll do it this way. > a pile of contrary precedent. > > (The chip-select GPIO is created in the parent class, incidentally.) > >> Aside: there may be devices that need manual wiring to work, yet don't >> have user_creatable unset. Bugs if you ask me. I don't have smart >> ideas on how to track them down. > > Me neither. I notice that the TYPE_M25P80 is also an SSI peripheral > with an active-low chipselect but that one doesn't set user_creatable > to false. TYPE_SSD0323 also is user-creatable and that one has an > active-high chipselect, which means the user can create a device but > it will then never be usable because it will ignore all transactions. > (More generally, looks like most subclasses of TYPE_SSI_PERIPHERAL > don't set user_creatable = false.) For sysbus devices, we clear user_creatable by default, because sysbus devices pretty much always[*] need wiring. Is this the case for SSI bus devices, too? [*] The most prominent exception is "dynamic sysbus", which I believe was a mistake.
Re: [PATCH RFC 2/2] hw: Replace drive_get_next() by drive_get()
Philippe Mathieu-Daudé writes: > On 11/15/21 13:55, Markus Armbruster wrote: >> drive_get_next() is basically a bad idea. It returns the "next" block >> backend of a certain interface type. "Next" means bus=0,unit=N, where >> subsequent calls count N up from zero, per interface type. >> >> This lets you define unit numbers implicitly by execution order. If the >> order changes, or new calls appear "in the middle", unit numbers change. >> ABI break. Hard to spot in review. >> >> Explicit is better than implicit: use drive_get() directly. >> >> Signed-off-by: Markus Armbruster >> --- >> include/sysemu/blockdev.h | 1 - >> blockdev.c | 10 -- >> hw/arm/aspeed.c | 21 + >> hw/arm/cubieboard.c | 2 +- >> hw/arm/imx25_pdk.c | 2 +- >> hw/arm/integratorcp.c | 2 +- >> hw/arm/mcimx6ul-evk.c | 2 +- >> hw/arm/mcimx7d-sabre.c | 2 +- >> hw/arm/msf2-som.c | 2 +- >> hw/arm/npcm7xx_boards.c | 6 +++--- >> hw/arm/orangepi.c | 2 +- >> hw/arm/raspi.c | 2 +- >> hw/arm/realview.c | 2 +- >> hw/arm/sabrelite.c | 2 +- >> hw/arm/versatilepb.c| 4 ++-- >> hw/arm/vexpress.c | 6 +++--- >> hw/arm/xilinx_zynq.c| 16 +--- >> hw/arm/xlnx-versal-virt.c | 3 ++- >> hw/arm/xlnx-zcu102.c| 6 +++--- >> hw/microblaze/petalogix_ml605_mmu.c | 2 +- >> hw/misc/sifive_u_otp.c | 2 +- >> hw/riscv/microchip_pfsoc.c | 2 +- >> hw/sparc64/niagara.c| 2 +- >> 23 files changed, 49 insertions(+), 52 deletions(-) > >> @@ -435,11 +438,13 @@ static void aspeed_machine_init(MachineState *machine) >> } >> >> for (i = 0; i < bmc->soc.sdhci.num_slots; i++) { >> -sdhci_attach_drive(>soc.sdhci.slots[i], drive_get_next(IF_SD)); >> +sdhci_attach_drive(>soc.sdhci.slots[i], >> + drive_get(IF_SD, 0, i)); > > If we put SD on bus #0, ... > >> } >> >> if (bmc->soc.emmc.num_slots) { >> -sdhci_attach_drive(>soc.emmc.slots[0], drive_get_next(IF_SD)); >> +sdhci_attach_drive(>soc.emmc.slots[0], >> + drive_get(IF_SD, 0, bmc->soc.sdhci.num_slots)); > > ... we'd want to put eMMC on bus #1 Using separate buses for different kinds of devices would be neater, but it also would be an incompatible change. This patch keeps existing bus/unit numbers working. drive_get_next() can only use bus 0. > but I see having eMMC cards on a > IF_SD bus as a bug, since these cards are soldered on the board. IF_SD is not a bus, it's an "block interface type", which is really just a user interface thing. >> --- a/hw/arm/vexpress.c >> +++ b/hw/arm/vexpress.c >> @@ -625,7 +625,7 @@ static void vexpress_common_init(MachineState *machine) >>qdev_get_gpio_in(sysctl, >> ARM_SYSCTL_GPIO_MMC_WPROT)); >> qdev_connect_gpio_out_named(dev, "card-inserted", 0, >>qdev_get_gpio_in(sysctl, >> ARM_SYSCTL_GPIO_MMC_CARDIN)); >> -dinfo = drive_get_next(IF_SD); >> +dinfo = drive_get(IF_SD, 0, 0); > > Can we have one interface refactor per patch (IF_SD, IF_PFLASH, IF_MTD...)? Peter asked for one patch per "board/SoC model". I'll do whatever helps reviewers. >> @@ -657,7 +657,7 @@ static void vexpress_common_init(MachineState *machine) >> >> sysbus_create_simple("pl111", map[VE_CLCD], pic[14]); >> >> -dinfo = drive_get_next(IF_PFLASH); >> +dinfo = drive_get(IF_PFLASH, 0, 0); > >> -static inline void zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq, >> - bool is_qspi) >> +static inline int zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq, >> +bool is_qspi, int unit0) >> { >> +int unit = unit0; >> DeviceState *dev; >> SysBusDevice *busdev; >> SSIBus *spi; >> @@ -156,7 +157,7 @@ static inline void zynq_init_spi_flashes(uint32_t >> base_addr, qemu_irq irq, >> spi = (SSIBus *)qdev_get_child_bus(dev, bus_name); >> >> for (j = 0; j < num_ss; ++j) { >> -DriveInfo *dinfo = drive_get_next(IF_MTD); >> +DriveInfo *dinfo = drive_get(IF_MTD, 0, unit++); > >> diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c >> index 3dc2b5e8ca..45eb19ab3b 100644 >> --- a/hw/arm/xlnx-zcu102.c >> +++ b/hw/arm/xlnx-zcu102.c >> @@ -190,7 +190,7 @@ static void xlnx_zcu102_init(MachineState *machine) >> BusState *spi_bus; >> DeviceState *flash_dev; >> qemu_irq cs_line; >> -DriveInfo *dinfo = drive_get_next(IF_MTD); >> +DriveInfo *dinfo = drive_get(IF_MTD, 0, i); > > If this is bus #0, ... > >>
Re: [PATCH] qmp: Stabilize preconfig
Paolo Bonzini writes: > On 11/13/21 08:52, Markus Armbruster wrote: >> I'm not asking what to do "if it hurts", or "if you want a cold-plugged >> device". I'm asking whether there's a reason for ever wanting hot plug >> instead of cold plug. Or in other words, what can hot plug possibly >> gain us over cold plug? >> >> As far as I know, the answer is "nothing but trouble". > > Yes, I agree. > >> If that's true, then what we should tell users is to stick to -device >> for initial configuration, and stay away from device_add. > > Yes, which is one issue with stabilizing -preconfig. It's not clear > what exactly it is solving. > > The boat for this has sailed. The only sane way to do this is a new > binary. "Ideally" still applies to any new binary. >>> >>> Well, "ideally" any new binary would only have a few command line >>> options, and ordering would be mostly irrelevant. For example I'd >>> expect a QMP binary to only have a few options, mostly for >>> debugging/development (-L, -trace) and for process-wide settings (such >>> as -name). >> >> This is where we disagree. For me, a new, alternative qemu-system-FOO binary >> should be able to replace the warty one we have. >> >> One important kind of user is management applications. Libvirt >> developers tell us that they'd like to configure as much as possible via >> QMP. Another kind of user dear to me is me^H^Hdevelopers. For ad hoc >> testing, having to configure via QMP is a pain we'd rathe do without. I >> don't want to remain stuck on the traditional binary, I want to do this >> with the new one. > > Why do you care? For another example, you can use "reboot" or > "systemctl isolate reboot.target" and they end up doing the same thing. > > As long as qemu_init invokes qmp_machine_set, qmp_accel_set, > qmp_device_add, qmp_plugin_add, qmp_cont, etc. to do its job, the > difference between qemu-system-* and qemu-qmp-* is a couple thousands > lines of boring code that all but disappears once the VM is up and > running. IOW, with the right design (e.g. shortcut options for QOM > properties good; dozens of global variables bad), there's absolutely no > issue with some people using qemu-system-* and some using qemu-qmp-*. I think maintaining two binaries forever is madness. I want the old one to wither away. Making the new binary capable of serving all use cases should not be hard, just work (see my design sketch). I expect the result to serve *better* than the mess we have now. >> Likewise, we'd fail QMP commands that are "out of phase". >> @allow-preconfig is a crutch that only exists because we're afraid (with >> reason) of hidden assumptions in QMP commands. > > At this point, it's not even like that anymore (except for block devices > because my patches haven't been applied). My point is that we still have quite a few commands without 'allow-preconfig' mostly because we are afraid of allowing them in preconfig state, not because of true phase dependencies. >>> >>> I think there's very few of them, if any (outside the block layer for >>> which patches exist), and those are due to distraction more than fear. >> >> qapi/*.json has 216 commands, of which 26 carry 'allow-preconfig'. > > Well, > https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg01597.html > alone would more than double that. :) > > Places like machine.json, machine-target.json, migration.json, > replay.json have a lot of commands that are, obviously, almost entirely > not suitable for preconfig. I don't think there are many commands left, > I'd guess maybe 30 (meaning that ~60% are done). My point is that "very few" is not literally true, and I think you just confirmed it ;)
Re: does drive_get_next(IF_NONE) make sense?
Peter Maydell writes: > On Mon, 15 Nov 2021 at 13:24, Kevin Wolf wrote: >> Same question as for Hao Wu's series: Wouldn't the proper solution be to >> add a drive property to the machine type? >> >> If you can't use -blockdev, it's not done right. > > Is there an example of "doing it right" for built-in-to-the-machine > devices? > > (My experience with the new-style options is that almost > always they're designed for x86 where the device they're attached > to is also created on the command line, and then handling of boards > where the device is builtin is either an afterthought or forgotten. > See also -netdev, where it took forever for built-in-ethernet to > be usable.) I'm afraid the situation for onboard block devices is far worse than it ever was for NICs. See my reply "On configuring onboard block devices with -blockdev" to Kevin's other message on the topic. To be fair, improving onboard device configuration is *hard*. Our general device configuration interface doesn't cover them, and we've so far failed finding a general solution. Without one, we're drowning in the sheer number of boards and onboard devices. Which is ever growing.
On configuring onboard block devices with -blockdev (was: [PATCH v4 6/7] hw/nvram: Update at24c EEPROM init function in NPCM7xx boards)
Kevin Wolf writes: > Am 03.11.2021 um 23:01 hat Hao Wu geschrieben: >> We made 3 changes to the at24c_eeprom_init function in >> npcm7xx_boards.c: >> >> 1. We allow the function to take a I2CBus* as parameter. This allows >>us to attach an EEPROM device behind an I2C mux which is not >>possible with the old method. >> >> 2. We make at24c EEPROMs are backed by drives so that we can >>specify the content of the EEPROMs. >> >> 3. Instead of using i2c address as unit number, This patch assigns >>unique unit numbers for each eeproms in each board. This avoids >>conflict in providing multiple eeprom contents with the same address. >>In the old method if we specify two drives with the same unit number, >>the following error will occur: `Device with id 'none85' exists`. >> >> Signed-off-by: Hao Wu >> --- >> hw/arm/npcm7xx_boards.c | 14 +- >> 1 file changed, 9 insertions(+), 5 deletions(-) >> >> diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c >> index dec7d16ae5..9121e081fa 100644 >> --- a/hw/arm/npcm7xx_boards.c >> +++ b/hw/arm/npcm7xx_boards.c >> @@ -126,13 +126,17 @@ static I2CBus *npcm7xx_i2c_get_bus(NPCM7xxState *soc, >> uint32_t num) >> return I2C_BUS(qdev_get_child_bus(DEVICE(>smbus[num]), "i2c-bus")); >> } >> >> -static void at24c_eeprom_init(NPCM7xxState *soc, int bus, uint8_t addr, >> - uint32_t rsize) >> +static void at24c_eeprom_init(I2CBus *i2c_bus, int bus, uint8_t addr, >> + uint32_t rsize, int unit_number) >> { >> -I2CBus *i2c_bus = npcm7xx_i2c_get_bus(soc, bus); >> I2CSlave *i2c_dev = i2c_slave_new("at24c-eeprom", addr); >> DeviceState *dev = DEVICE(i2c_dev); >> +DriveInfo *dinfo; >> >> +dinfo = drive_get(IF_OTHER, bus, unit_number); >> +if (dinfo) { >> +qdev_prop_set_drive(dev, "drive", blk_by_legacy_dinfo(dinfo)); >> +} > > I may be missing the history of this series, but why do we have to use a > legacy DriveInfo here instead of adding a drive property to the board to > make this configurable with -blockdev? > > Adding even more devices that can only be configured with -drive feels > like a step backwards to me. More like sideways. The big unfinished part of -blockdev is configuring onboard devices with it. I took a stab at one instance in commit ebc29e1bea "pc: Support firmware configuration with -blockdev", 2019-03-11. Quoting from the commit message: Mapping -drive if=none,... to -blockdev is a solved problem. With if=T other than if=none, -drive additionally configures a block device frontend. For non-onboard devices, that part maps to -device. Also a solved problem. For onboard devices such as PC flash memory, we have an unsolved problem. This is actually an instance of a wider problem: our general device configuration interface doesn't cover onboard devices. Instead, we have a zoo of ad hoc interfaces that are much more limited. One of them is -drive, which we'd rather deprecate, but can't until we have suitable replacements for all its uses. Sadly, I can't attack the wider problem today. So back to the narrow problem. My first idea was to reduce it to its solved buddy by using pluggable instead of onboard devices for the flash memory. Workable, but it requires some extra smarts in firmware descriptors and libvirt. Paolo had an idea that is simpler for libvirt: keep the devices onboard, and add machine properties for their block backends. The implementation is less than straightforward, I'm afraid. First, block backend properties are *qdev* properties. Machines can't have those, as they're not devices. I could duplicate these qdev properties as QOM properties, but I hate that. More seriously, the properties do not belong to the machine, they belong to the onboard flash devices. Adding them to the machine would then require bad magic to somehow transfer them to the flash devices. Fortunately, QOM provides the means to handle exactly this case: add alias properties to the machine that forward to the onboard devices' properties. Properties need to be created in .instance_init() methods. For PC machines, that's pc_machine_initfn(). To make alias properties work, we need to create the onboard flash devices there, too. Requires several bug fixes, in the previous commits. We also have to realize the devices. More on that below. The need to create onboard devices differently results in a non-trivial refactoring. The need for keeping -drive working complicates things further. The "several bug fixes" took me 26 preparatory patches, plus at least three more to fix regressions caused by one of them. I then did the same for ARM virt in commit e0561e60f1 "hw/arm/virt: Support firmware configuration with -blockdev", 2019-05-07. Only a few preparatory patches, but the patch
Re: [PATCH v4 03/25] assertions for block global state API
On 15.11.21 13:27, Emanuele Giuseppe Esposito wrote: @@ -586,12 +589,14 @@ void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent) void coroutine_fn bdrv_co_drain(BlockDriverState *bs) { assert(qemu_in_coroutine()); + assert(qemu_in_main_thread()); bdrv_drained_begin(bs); bdrv_drained_end(bs); } void bdrv_drain(BlockDriverState *bs) { + assert(qemu_in_main_thread()); bdrv_drained_begin(bs); bdrv_drained_end(bs); } Why are these GS functions when both bdrv_drained_begin() and bdrv_drained_end() are I/O functions? I can understand making the drain_all functions GS functions, but it seems weird to say it’s an I/O function when a single BDS is drained via bdrv_drained_begin() and bdrv_drained_end(), but not via bdrv_drain(), which just does both. (I can see that there are no I/O path callers, but I still find it strange.) The problem as you saw is on the path callers: bdrv_drain seems to be called only in main thread, while others or similar drains are also coming from I/O. I would say maybe it's a better idea to just put everything I/O, maybe (probably) there are cases not caught by the tests. The only ones in global-state will be: - bdrv_apply_subtree_drain and bdrv_unapply_subtree_drain (called only in .attach and .detach callbacks of BdrvChildClass, run under BQL). - bdrv_drain_all_end_quiesce (called only by bdrv_close thus under BQL). - bdrv_drain_all_begin, bdrv_drain_all_end and bdrv_drain_all (have already BQL assertions) In general, this is the underlying problem with these assertions: if a test doesn't test a specific code path, an unneeded assertion might pass undetected. I believe the I/O vs. GS classification should not only be done based on functional concerns, i.e. who calls this function (is it called by an I/O function?) and whether it can be thread-safe or not (does it call a BQL function?), but also needs to be done semantically. I believe there are some functions that we consider thread-safe but are also only called by BQL functions, for example apparently bdrv_drain(), bdrv_apply_subtree_drain(), and bdrv_unapply_subtree_drain(). Such functions can functionally be considered both GS and I/O functions, so the decision should be done semantically. I believe that it makes sense to say all drain-related functions for a single subtree are I/O functions. OTOH, functions operating on multiple trees in the block graph (i.e. all functions that have some “_all_” in their name) should naturally be considered GS functions, regardless of whether their implementation is thread-safe or not, because they are by nature global functions. That’s why I’m wondering why some drain functions are I/O and others are GS; or why this block-status function is I/O and this one is GS. It may make sense functionally, but semantically it’s strange. I understand it may be difficult for you to know which functions relate to each other and thus know these semantic relationships, but in most of the series the split seems very reasonable to me, semantically. If it didn’t, I said so. :) Hanna
[PATCH-for-7.0 v4 08/11] tests/unit/test-smp-parse: Add 'smp-without-dies-valid' machine type
Keep the common TYPE_MACHINE class initialization in machine_base_class_init(), make it abstract, and move the non-common code to a new class: "smp-without-dies-valid". Signed-off-by: Philippe Mathieu-Daudé --- tests/unit/test-smp-parse.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index dfe7f1313b0..90a249fe8c8 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -478,13 +478,19 @@ static void machine_base_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); +mc->smp_props.prefer_sockets = true; + +mc->name = g_strdup(SMP_MACHINE_NAME); +} + +static void machine_without_dies_valid_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); + mc->min_cpus = MIN_CPUS; mc->max_cpus = MAX_CPUS; -mc->smp_props.prefer_sockets = true; mc->smp_props.dies_supported = false; - -mc->name = g_strdup(SMP_MACHINE_NAME); } static void machine_without_dies_invalid_class_init(ObjectClass *oc, void *data) @@ -606,9 +612,14 @@ static const TypeInfo smp_machine_types[] = { { .name = TYPE_MACHINE, .parent = TYPE_OBJECT, +.abstract = true, .class_init = machine_base_class_init, .class_size = sizeof(MachineClass), .instance_size = sizeof(MachineState), +}, { +.name = MACHINE_TYPE_NAME("smp-without-dies-valid"), +.parent = TYPE_MACHINE, +.class_init = machine_without_dies_valid_class_init, }, { .name = MACHINE_TYPE_NAME("smp-without-dies-invalid"), .parent = TYPE_MACHINE, @@ -629,7 +640,7 @@ int main(int argc, char *argv[]) g_test_init(, , NULL); g_test_add_data_func("/test-smp-parse/generic/valid", - TYPE_MACHINE, + MACHINE_TYPE_NAME("smp-without-dies-valid"), test_generic_valid); g_test_add_data_func("/test-smp-parse/generic/invalid", MACHINE_TYPE_NAME("smp-without-dies-invalid"), -- 2.31.1
[PATCH-for-7.0 v4 07/11] tests/unit/test-smp-parse: Add 'smp-without-dies-invalid' machine type
Avoid modifying the MachineClass internals by adding the 'smp-without-dies-invalid' machine, which inherits from TYPE_MACHINE. Signed-off-by: Philippe Mathieu-Daudé --- tests/unit/test-smp-parse.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index ff61da06e3d..dfe7f1313b0 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -487,6 +487,17 @@ static void machine_base_class_init(ObjectClass *oc, void *data) mc->name = g_strdup(SMP_MACHINE_NAME); } +static void machine_without_dies_invalid_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); + +/* Force invalid min CPUs and max CPUs */ +mc->min_cpus = 2; +mc->max_cpus = 511; + +mc->smp_props.dies_supported = false; +} + static void machine_with_dies_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -530,11 +541,6 @@ static void test_generic_invalid(const void *opaque) SMPTestData *data = &(SMPTestData){}; int i; - -/* Force invalid min CPUs and max CPUs */ -mc->min_cpus = 2; -mc->max_cpus = 511; - for (i = 0; i < ARRAY_SIZE(data_generic_invalid); i++) { *data = data_generic_invalid[i]; unsupported_params_init(mc, data); @@ -542,10 +548,6 @@ static void test_generic_invalid(const void *opaque) smp_parse_test(ms, data, false); } -/* Reset the supported min CPUs and max CPUs */ -mc->min_cpus = MIN_CPUS; -mc->max_cpus = MAX_CPUS; - object_unref(obj); } @@ -607,6 +609,10 @@ static const TypeInfo smp_machine_types[] = { .class_init = machine_base_class_init, .class_size = sizeof(MachineClass), .instance_size = sizeof(MachineState), +}, { +.name = MACHINE_TYPE_NAME("smp-without-dies-invalid"), +.parent = TYPE_MACHINE, +.class_init = machine_without_dies_invalid_class_init, }, { .name = MACHINE_TYPE_NAME("smp-with-dies"), .parent = TYPE_MACHINE, @@ -626,7 +632,7 @@ int main(int argc, char *argv[]) TYPE_MACHINE, test_generic_valid); g_test_add_data_func("/test-smp-parse/generic/invalid", - TYPE_MACHINE, + MACHINE_TYPE_NAME("smp-without-dies-invalid"), test_generic_invalid); g_test_add_data_func("/test-smp-parse/with_dies", MACHINE_TYPE_NAME("smp-with-dies"), -- 2.31.1
[PATCH-for-6.2 v4 01/11] tests/unit/test-smp-parse: Restore MachineClass fields after modifying
There is a single MachineClass object, registered with type_register_static(_machine_info). Since the same object is used multiple times (an MachineState object is instantiated in both test_generic and test_with_dies), we should restore its internal state after modifying for the test purpose. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé --- tests/unit/test-smp-parse.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index cbe0c990494..47774c1ee2a 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -512,7 +512,7 @@ static void test_generic(void) smp_parse_test(ms, data, true); } -/* Reset the supported min CPUs and max CPUs */ +/* Force invalid min CPUs and max CPUs */ mc->min_cpus = 2; mc->max_cpus = 511; @@ -523,6 +523,10 @@ static void test_generic(void) smp_parse_test(ms, data, false); } +/* Reset the supported min CPUs and max CPUs */ +mc->min_cpus = MIN_CPUS; +mc->max_cpus = MAX_CPUS; + object_unref(obj); } @@ -536,6 +540,7 @@ static void test_with_dies(void) int i; smp_machine_class_init(mc); +/* Force the SMP compat properties */ mc->smp_props.dies_supported = true; for (i = 0; i < ARRAY_SIZE(data_generic_valid); i++) { @@ -575,6 +580,9 @@ static void test_with_dies(void) smp_parse_test(ms, data, false); } +/* Restore the SMP compat properties */ +mc->smp_props.dies_supported = false; + object_unref(obj); } -- 2.31.1
[PATCH v4 00/11] tests/unit: Fix test-smp-parse
Missing review: patches #4 to #8 (new) Yet another approach to fix test-smp-parse. v2 from Yanan Wang: https://lore.kernel.org/qemu-devel/2021024429.10568-1-wangyana...@huawei.com/ Here we use the QOM class_init() to avoid having to deal with object_unref() and deinit(). Since v3: - Restore 'dies_supported' field in test_with_dies (Yanan) - Add R-b tags - QOM-ify the TYPE_MACHINE classes Philippe Mathieu-Daudé (11): tests/unit/test-smp-parse: Restore MachineClass fields after modifying tests/unit/test-smp-parse: QOM'ify smp_machine_class_init() tests/unit/test-smp-parse: Explicit MachineClass name tests/unit/test-smp-parse: Pass machine type as argument to tests tests/unit/test-smp-parse: Split the 'generic' test in valid / invalid tests/unit/test-smp-parse: Add 'smp-with-dies' machine type tests/unit/test-smp-parse: Add 'smp-without-dies-invalid' machine type tests/unit/test-smp-parse: Add 'smp-without-dies-valid' machine type tests/unit/test-smp-parse: Simplify pointer to compound literal use tests/unit/test-smp-parse: Constify some pointer/struct hw/core: Rename smp_parse() -> machine_parse_smp_config() include/hw/boards.h | 3 +- hw/core/machine-smp.c | 6 +- hw/core/machine.c | 2 +- tests/unit/test-smp-parse.c | 221 +++- 4 files changed, 148 insertions(+), 84 deletions(-) -- 2.31.1
Re: does drive_get_next(IF_NONE) make sense?
Am 15.11.2021 um 14:31 hat Peter Maydell geschrieben: > On Mon, 15 Nov 2021 at 13:24, Kevin Wolf wrote: > > Same question as for Hao Wu's series: Wouldn't the proper solution be to > > add a drive property to the machine type? > > > > If you can't use -blockdev, it's not done right. > > Is there an example of "doing it right" for built-in-to-the-machine > devices? > > (My experience with the new-style options is that almost > always they're designed for x86 where the device they're attached > to is also created on the command line, and then handling of boards > where the device is builtin is either an afterthought or forgotten. > See also -netdev, where it took forever for built-in-ethernet to > be usable.) As long as we don't have a separate way to configure built-in devices without creating them, for boards where the device is built in, the properties for the device have to become machine properties. I seem to remember that Markus implemented this for some boards. Just doing without properties for these devices, either by just hard coding things instead of providing options to the user, or by bypassing the property system and accessing global state instead feels wrong. Maybe once we have .instance_config (see my QOM/QAPI integration RFC), forwarding these properties from -machine could actually become trivial, just one call to set the properties for each built-in device. For now, it's probably not as nice because each property needs to be forwarded individually instead of just providing access to all properties of the device. Kevin
Re: [PATCH v4 24/25] job.h: split function pointers in JobDriver
On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote: The job API will be handled separately in another serie. Signed-off-by: Emanuele Giuseppe Esposito Reviewed-by: Stefan Hajnoczi --- include/qemu/job.h | 16 1 file changed, 16 insertions(+) diff --git a/include/qemu/job.h b/include/qemu/job.h index 6e67b6977f..7e9e59f4b8 100644 --- a/include/qemu/job.h +++ b/include/qemu/job.h @@ -169,12 +169,21 @@ typedef struct Job { * Callbacks and other information about a Job driver. */ struct JobDriver { + +/* Fields initialized in struct definition and never changed. */ Like in patch 19, I’d prefer a slightly more verbose comment that I’d find more easily readable. + /** Derived Job struct size */ size_t instance_size; /** Enum describing the operation */ JobType job_type; +/* + * Functions run without regard to the BQL and may run in any s/and/that/? + * arbitrary thread. These functions do not need to be thread-safe + * because the caller ensures that are invoked from one thread at time. s/that/they/ (or “that they”) I believe .run() must be run in the job’s context, though. Not sure if that’s necessary to note, but it isn’t really an arbitrary thread, and block jobs certainly require this (because they run in the block device’s context). Or is that something that’s going to change with I/O threading? Hanna
[PULL 07/13] transactions: Invoke clean() after everything else
From: Hanna Reitz Invoke the transaction drivers' .clean() methods only after all .commit() or .abort() handlers are done. This makes it easier to have nested transactions where the top-level transactions pass objects to lower transactions that the latter can still use throughout their commit/abort phases, while the top-level transaction keeps a reference that is released in its .clean() method. (Before this commit, that is also possible, but the top-level transaction would need to take care to invoke tran_add() before the lower-level transaction does. This commit makes the ordering irrelevant, which is just a bit nicer.) Signed-off-by: Hanna Reitz Message-Id: <2021120829.81329-8-hre...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Kevin Wolf --- include/qemu/transactions.h | 3 +++ util/transactions.c | 8 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h index 92c5965235..2f2060acd9 100644 --- a/include/qemu/transactions.h +++ b/include/qemu/transactions.h @@ -31,6 +31,9 @@ * tran_create(), call your "prepare" functions on it, and finally call * tran_abort() or tran_commit() to finalize the transaction by corresponding * finalization actions in reverse order. + * + * The clean() functions registered by the drivers in a transaction are called + * last, after all abort() or commit() functions have been called. */ #ifndef QEMU_TRANSACTIONS_H diff --git a/util/transactions.c b/util/transactions.c index d0bc9a3e73..2dbdedce95 100644 --- a/util/transactions.c +++ b/util/transactions.c @@ -61,11 +61,13 @@ void tran_abort(Transaction *tran) { TransactionAction *act, *next; -QSLIST_FOREACH_SAFE(act, >actions, entry, next) { +QSLIST_FOREACH(act, >actions, entry) { if (act->drv->abort) { act->drv->abort(act->opaque); } +} +QSLIST_FOREACH_SAFE(act, >actions, entry, next) { if (act->drv->clean) { act->drv->clean(act->opaque); } @@ -80,11 +82,13 @@ void tran_commit(Transaction *tran) { TransactionAction *act, *next; -QSLIST_FOREACH_SAFE(act, >actions, entry, next) { +QSLIST_FOREACH(act, >actions, entry) { if (act->drv->commit) { act->drv->commit(act->opaque); } +} +QSLIST_FOREACH_SAFE(act, >actions, entry, next) { if (act->drv->clean) { act->drv->clean(act->opaque); } -- 2.31.1
Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k
On Montag, 15. November 2021 12:54:32 CET Stefan Hajnoczi wrote: > On Thu, Nov 11, 2021 at 06:54:03PM +0100, Christian Schoenebeck wrote: > > On Donnerstag, 11. November 2021 17:31:52 CET Stefan Hajnoczi wrote: > > > On Wed, Nov 10, 2021 at 04:53:33PM +0100, Christian Schoenebeck wrote: > > > > On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote: > > > > > On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote: > > > > > > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote: > > > > > > As you are apparently reluctant for changing the virtio specs, > > > > > > what > > > > > > about > > > > > > introducing those discussed virtio capabalities either as > > > > > > experimental > > > > > > ones > > > > > > without specs changes, or even just as 9p specific device > > > > > > capabilities > > > > > > for > > > > > > now. I mean those could be revoked on both sides at any time > > > > > > anyway. > > > > > > > > > > I would like to understand the root cause before making changes. > > > > > > > > > > "It's faster when I do X" is useful information but it doesn't > > > > > necessarily mean doing X is the solution. The "it's faster when I do > > > > > X > > > > > because Y" part is missing in my mind. Once there is evidence that > > > > > shows > > > > > Y then it will be clearer if X is a good solution, if there's a more > > > > > general solution, or if it was just a side-effect. > > > > > > > > I think I made it clear that the root cause of the observed > > > > performance > > > > gain with rising transmission size is latency (and also that > > > > performance > > > > is not the only reason for addressing this queue size issue). > > > > > > > > Each request roundtrip has a certain minimum latency, the virtio ring > > > > alone > > > > has its latency, plus latency of the controller portion of the file > > > > server > > > > (e.g. permissions, sandbox checks, file IDs) that is executed with > > > > *every* > > > > request, plus latency of dispatching the request handling between > > > > threads > > > > several times back and forth (also for each request). > > > > > > > > Therefore when you split large payloads (e.g. reading a large file) > > > > into > > > > smaller n amount of chunks, then that individual latency per request > > > > accumulates to n times the individual latency, eventually leading to > > > > degraded transmission speed as those requests are serialized. > > > > > > It's easy to increase the blocksize in benchmarks, but real applications > > > offer less control over the I/O pattern. If latency in the device > > > implementation (QEMU) is the root cause then reduce the latency to speed > > > up all applications, even those that cannot send huge requests. > > > > Which I did, still do, and also mentioned before, e.g.: > > > > 8d6cb100731c4d28535adbf2a3c2d1f29be3fef4 9pfs: reduce latency of Twalk > > 0c4356ba7dafc8ecb5877a42fc0d68d45ccf5951 9pfs: T_readdir latency > > optimization > > > > Reducing overall latency is a process that is ongoing and will still take > > a > > very long development time. Not because of me, but because of lack of > > reviewers. And even then, it does not make the effort to support higher > > transmission sizes obsolete. > > > > > One idea is request merging on the QEMU side. If the application sends > > > 10 sequential read or write requests, coalesce them together before the > > > main part of request processing begins in the device. Process a single > > > large request to spread the cost of the file server over the 10 > > > requests. (virtio-blk has request merging to help with the cost of lots > > > of small qcow2 I/O requests.) The cool thing about this is that the > > > guest does not need to change its I/O pattern to benefit from the > > > optimization. > > > > > > Stefan > > > > Ok, don't get me wrong: I appreciate that you are suggesting approaches > > that could improve things. But I could already hand you over a huge list > > of mine. The limiting factor here is not the lack of ideas of what could > > be improved, but rather the lack of people helping out actively on 9p > > side: > > https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06452.html > > > > The situation on kernel side is the same. I already have a huge list of > > what could & should be improved. But there is basically no reviewer for > > 9p patches on Linux kernel side either. > > > > The much I appreciate suggestions of what could be improved, I would > > appreciate much more if there was *anybody* actively assisting as well. In > > the time being I have to work the list down in small patch chunks, > > priority based. > I see request merging as an alternative to this patch series, not as an > additional idea. It is not an alternative. Like I said before, even if it would solve the sequential I/O performance issue (by not simply moving the problem somewhere else), which I doubt, your suggestion would still not solve the semantical
[PULL 09/13] block: Let replace_child_noperm free children
From: Hanna Reitz In most of the block layer, especially when traversing down from other BlockDriverStates, we assume that BdrvChild.bs can never be NULL. When it becomes NULL, it is expected that the corresponding BdrvChild pointer also becomes NULL and the BdrvChild object is freed. Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs pointer to NULL, it should also immediately set the corresponding BdrvChild pointer (like bs->file or bs->backing) to NULL. In that context, it also makes sense for this function to free the child. Sometimes we cannot do so, though, because it is called in a transactional context where the caller might still want to reinstate the child in the abort branch (and free it only on commit), so this behavior has to remain optional. In bdrv_replace_child_tran()'s abort handler, we now rely on the fact that the BdrvChild passed to bdrv_replace_child_tran() must have had a non-NULL .bs pointer initially. Make a note of that and assert it. Signed-off-by: Hanna Reitz Message-Id: <2021120829.81329-10-hre...@redhat.com> Signed-off-by: Kevin Wolf --- block.c | 102 +++- 1 file changed, 79 insertions(+), 23 deletions(-) diff --git a/block.c b/block.c index a40027161c..0ac5b163d2 100644 --- a/block.c +++ b/block.c @@ -87,8 +87,10 @@ static BlockDriverState *bdrv_open_inherit(const char *filename, static bool bdrv_recurse_has_child(BlockDriverState *bs, BlockDriverState *child); +static void bdrv_child_free(BdrvChild *child); static void bdrv_replace_child_noperm(BdrvChild **child, - BlockDriverState *new_bs); + BlockDriverState *new_bs, + bool free_empty_child); static void bdrv_remove_file_or_backing_child(BlockDriverState *bs, BdrvChild *child, Transaction *tran); @@ -2256,12 +2258,16 @@ typedef struct BdrvReplaceChildState { BdrvChild *child; BdrvChild **childp; BlockDriverState *old_bs; +bool free_empty_child; } BdrvReplaceChildState; static void bdrv_replace_child_commit(void *opaque) { BdrvReplaceChildState *s = opaque; +if (s->free_empty_child && !s->child->bs) { +bdrv_child_free(s->child); +} bdrv_unref(s->old_bs); } @@ -2278,22 +2284,26 @@ static void bdrv_replace_child_abort(void *opaque) * modify the BdrvChild * pointer we indirectly pass to it, i.e. it * will not modify s->child. From that perspective, it does not matter * whether we pass s->childp or >child. - * (TODO: Right now, bdrv_replace_child_noperm() never modifies that - * pointer anyway (though it will in the future), so at this point it - * absolutely does not matter whether we pass s->childp or >child.) * (2) If new_bs is not NULL, s->childp will be NULL. We then cannot use * it here. * (3) If new_bs is NULL, *s->childp will have been NULLed by * bdrv_replace_child_tran()'s bdrv_replace_child_noperm() call, and we * must not pass a NULL *s->childp here. - * (TODO: In its current state, bdrv_replace_child_noperm() will not - * have NULLed *s->childp, so this does not apply yet. It will in the - * future.) * * So whether new_bs was NULL or not, we cannot pass s->childp here; and in * any case, there is no reason to pass it anyway. */ -bdrv_replace_child_noperm(>child, s->old_bs); +bdrv_replace_child_noperm(>child, s->old_bs, true); +/* + * The child was pre-existing, so s->old_bs must be non-NULL, and + * s->child thus must not have been freed + */ +assert(s->child != NULL); +if (!new_bs) { +/* As described above, *s->childp was cleared, so restore it */ +assert(s->childp != NULL); +*s->childp = s->child; +} bdrv_unref(new_bs); } @@ -2310,30 +2320,44 @@ static TransactionActionDrv bdrv_replace_child_drv = { * * The function doesn't update permissions, caller is responsible for this. * + * (*childp)->bs must not be NULL. + * * Note that if new_bs == NULL, @childp is stored in a state object attached * to @tran, so that the old child can be reinstated in the abort handler. * Therefore, if @new_bs can be NULL, @childp must stay valid until the * transaction is committed or aborted. * - * (TODO: The reinstating does not happen yet, but it will once - * bdrv_replace_child_noperm() NULLs *childp when new_bs is NULL.) + * If @free_empty_child is true and @new_bs is NULL, the BdrvChild is + * freed (on commit). @free_empty_child should only be false if the + * caller will free the BDrvChild themselves (which may be important + * if this is in turn called in another transactional context). */ static void
[PATCH-for-7.0 v4 09/11] tests/unit/test-smp-parse: Simplify pointer to compound literal use
We can simply use a local variable (and pass its pointer) instead of a pointer to a compound literal. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé --- tests/unit/test-smp-parse.c | 66 ++--- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index 90a249fe8c8..2f3bcf198a5 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -520,19 +520,19 @@ static void test_generic_valid(const void *opaque) Object *obj = object_new(machine_type); MachineState *ms = MACHINE(obj); MachineClass *mc = MACHINE_GET_CLASS(obj); -SMPTestData *data = &(SMPTestData){{ }}; +SMPTestData data = {}; int i; for (i = 0; i < ARRAY_SIZE(data_generic_valid); i++) { -*data = data_generic_valid[i]; -unsupported_params_init(mc, data); +data = data_generic_valid[i]; +unsupported_params_init(mc, ); -smp_parse_test(ms, data, true); +smp_parse_test(ms, , true); /* Unsupported parameters can be provided with their values as 1 */ -data->config.has_dies = true; -data->config.dies = 1; -smp_parse_test(ms, data, true); +data.config.has_dies = true; +data.config.dies = 1; +smp_parse_test(ms, , true); } object_unref(obj); @@ -544,14 +544,14 @@ static void test_generic_invalid(const void *opaque) Object *obj = object_new(machine_type); MachineState *ms = MACHINE(obj); MachineClass *mc = MACHINE_GET_CLASS(obj); -SMPTestData *data = &(SMPTestData){}; +SMPTestData data = {}; int i; for (i = 0; i < ARRAY_SIZE(data_generic_invalid); i++) { -*data = data_generic_invalid[i]; -unsupported_params_init(mc, data); +data = data_generic_invalid[i]; +unsupported_params_init(mc, ); -smp_parse_test(ms, data, false); +smp_parse_test(ms, , false); } object_unref(obj); @@ -563,45 +563,45 @@ static void test_with_dies(const void *opaque) Object *obj = object_new(machine_type); MachineState *ms = MACHINE(obj); MachineClass *mc = MACHINE_GET_CLASS(obj); -SMPTestData *data = &(SMPTestData){{ }}; +SMPTestData data = {}; unsigned int num_dies = 2; int i; for (i = 0; i < ARRAY_SIZE(data_generic_valid); i++) { -*data = data_generic_valid[i]; -unsupported_params_init(mc, data); +data = data_generic_valid[i]; +unsupported_params_init(mc, ); /* when dies parameter is omitted, it will be set as 1 */ -data->expect_prefer_sockets.dies = 1; -data->expect_prefer_cores.dies = 1; +data.expect_prefer_sockets.dies = 1; +data.expect_prefer_cores.dies = 1; -smp_parse_test(ms, data, true); +smp_parse_test(ms, , true); /* when dies parameter is specified */ -data->config.has_dies = true; -data->config.dies = num_dies; -if (data->config.has_cpus) { -data->config.cpus *= num_dies; +data.config.has_dies = true; +data.config.dies = num_dies; +if (data.config.has_cpus) { +data.config.cpus *= num_dies; } -if (data->config.has_maxcpus) { -data->config.maxcpus *= num_dies; +if (data.config.has_maxcpus) { +data.config.maxcpus *= num_dies; } -data->expect_prefer_sockets.dies = num_dies; -data->expect_prefer_sockets.cpus *= num_dies; -data->expect_prefer_sockets.max_cpus *= num_dies; -data->expect_prefer_cores.dies = num_dies; -data->expect_prefer_cores.cpus *= num_dies; -data->expect_prefer_cores.max_cpus *= num_dies; +data.expect_prefer_sockets.dies = num_dies; +data.expect_prefer_sockets.cpus *= num_dies; +data.expect_prefer_sockets.max_cpus *= num_dies; +data.expect_prefer_cores.dies = num_dies; +data.expect_prefer_cores.cpus *= num_dies; +data.expect_prefer_cores.max_cpus *= num_dies; -smp_parse_test(ms, data, true); +smp_parse_test(ms, , true); } for (i = 0; i < ARRAY_SIZE(data_with_dies_invalid); i++) { -*data = data_with_dies_invalid[i]; -unsupported_params_init(mc, data); +data = data_with_dies_invalid[i]; +unsupported_params_init(mc, ); -smp_parse_test(ms, data, false); +smp_parse_test(ms, , false); } object_unref(obj); -- 2.31.1
Re: [PATCH] gitlab-ci: Split custom-runners.yml in one file per runner
On 11/15/21 15:33, Thomas Huth wrote: > On 15/11/2021 10.56, Philippe Mathieu-Daudé wrote: >> To ease maintenance, add the custom-runners/ directory and >> split custom-runners.yml in 3 files, all included by the >> current custom-runners.yml: >> - ubuntu-18.04-s390x.yml >> - ubuntu-20.04-aarch64.yml >> - centos-stream-8-x86_64.yml >> >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> Based-on: <2021160501.862396-1-cr...@redhat.com> >> https://gitlab.com/stsquad/qemu/-/tree/for-6.2/misc-fixes >> --- >> .gitlab-ci.d/custom-runners.yml | 268 +- >> .../custom-runners/centos-stream-8-x86_64.yml | 28 ++ >> .../custom-runners/ubuntu-18.04-s390x.yml | 118 >> .../custom-runners/ubuntu-20.04-aarch64.yml | 118 >> 4 files changed, 268 insertions(+), 264 deletions(-) >> create mode 100644 >> .gitlab-ci.d/custom-runners/centos-stream-8-x86_64.yml >> create mode 100644 .gitlab-ci.d/custom-runners/ubuntu-18.04-s390x.yml >> create mode 100644 .gitlab-ci.d/custom-runners/ubuntu-20.04-aarch64.yml > > Should there maybe be entries in MAINTAINERS for these new files, so > that it is clear who takes care of the external runners? Yes, this is the principal motivation behind this series. I expect each maintainer to send their patch.
[PATCH-for-7.0 v4 11/11] hw/core: Rename smp_parse() -> machine_parse_smp_config()
All methods related to MachineState are prefixed with "machine_". smp_parse() does not need to be an exception. Rename it and const'ify the SMPConfiguration argument, since it doesn't need to be modified. Reviewed-by: Andrew Jones Reviewed-by: Richard Henderson Reviewed-by: Yanan Wang Tested-by: Yanan Wang Signed-off-by: Philippe Mathieu-Daudé --- include/hw/boards.h | 3 ++- hw/core/machine-smp.c | 6 -- hw/core/machine.c | 2 +- tests/unit/test-smp-parse.c | 8 4 files changed, 11 insertions(+), 8 deletions(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index 9c1c1901046..7597cec4400 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -34,7 +34,8 @@ HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine); void machine_set_cpu_numa_node(MachineState *machine, const CpuInstanceProperties *props, Error **errp); -void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp); +void machine_parse_smp_config(MachineState *ms, + const SMPConfiguration *config, Error **errp); /** * machine_class_allow_dynamic_sysbus_dev: Add type to list of valid devices diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c index 116a0cbbfab..2cbfd574293 100644 --- a/hw/core/machine-smp.c +++ b/hw/core/machine-smp.c @@ -44,7 +44,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms) } /* - * smp_parse - Generic function used to parse the given SMP configuration + * machine_parse_smp_config: Generic function used to parse the given + * SMP configuration * * Any missing parameter in "cpus/maxcpus/sockets/cores/threads" will be * automatically computed based on the provided ones. @@ -63,7 +64,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms) * introduced topology members which are likely to be target specific should * be directly set as 1 if they are omitted (e.g. dies for PC since 4.1). */ -void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp) +void machine_parse_smp_config(MachineState *ms, + const SMPConfiguration *config, Error **errp) { MachineClass *mc = MACHINE_GET_CLASS(ms); unsigned cpus= config->has_cpus ? config->cpus : 0; diff --git a/hw/core/machine.c b/hw/core/machine.c index 26ec54e7261..a2d3c9969d9 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -760,7 +760,7 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name, return; } -smp_parse(ms, config, errp); +machine_parse_smp_config(ms, config, errp); } static void machine_class_init(ObjectClass *oc, void *data) diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c index 8f47a2e65f6..8e488e95145 100644 --- a/tests/unit/test-smp-parse.c +++ b/tests/unit/test-smp-parse.c @@ -337,7 +337,7 @@ static const struct SMPTestData data_with_dies_invalid[] = { }, }; -static char *smp_config_to_string(SMPConfiguration *config) +static char *smp_config_to_string(const SMPConfiguration *config) { return g_strdup_printf( "(SMPConfiguration) {\n" @@ -371,7 +371,7 @@ static char *cpu_topology_to_string(const CpuTopology *topo) topo->cores, topo->threads, topo->max_cpus); } -static void check_parse(MachineState *ms, SMPConfiguration *config, +static void check_parse(MachineState *ms, const SMPConfiguration *config, const CpuTopology *expect_topo, const char *expect_err, bool is_valid) { @@ -380,8 +380,8 @@ static void check_parse(MachineState *ms, SMPConfiguration *config, g_autofree char *output_topo_str = NULL; Error *err = NULL; -/* call the generic parser smp_parse() */ -smp_parse(ms, config, ); +/* call the generic parser */ +machine_parse_smp_config(ms, config, ); output_topo_str = cpu_topology_to_string(>smp); -- 2.31.1