Re: [Qemu-devel] [PATCH v2] target-arm: kvm: Differentiate registers based on write-back levels
On 16 July 2015 at 12:34, Christoffer Dall christoffer.d...@linaro.org wrote: Some registers like the CNTVCT register should only be written to the kernel as part of machine initialization or on vmload operations, but never during runtime, as this can potentially make time go backwards or create inconsistent time observations between VCPUs. Introduce a list of registers that should not be written back at runtime and check this list on syncing the register state to the KVM state. Thanks. I think this should go into QEMU 2.4, given that it fixes a bug with time misbehaving in guests. Are you happy that it's received enough testing? (I have given 32-bit KVM a spin but have no convenient 64-bit box to test with, and besides, I didn't notice the bug in the first place :-)) Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Changes since RFC: - Move cpreg_level to kvm_arm_cpreg_level and into kvm32.c and kvm64.c - Changed struct name and declare as static const I have a couple of minor comments on the comments below, and you forgot to update the stub version of write_list_to_kvmstate(). I can just fix these up as I put it into target-arm.next, though. Fixed up version at: https://git.linaro.org/people/peter.maydell/qemu-arm.git target-arm.next If interested parties could test that by end-of-Monday that would be nice (since rc2 is scheduled for Tuesday). dtc | 2 +- target-arm/kvm.c | 6 +- target-arm/kvm32.c | 30 +- target-arm/kvm64.c | 30 +- target-arm/kvm_arm.h | 12 +++- target-arm/machine.c | 2 +- 6 files changed, 76 insertions(+), 6 deletions(-) diff --git a/dtc b/dtc index 65cc4d2..bc895d6 16 --- a/dtc +++ b/dtc @@ -1 +1 @@ -Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf +Subproject commit bc895d6d09695d05ceb8b52486ffe861d6cfbdde Stray submodule change :-) diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index d7e7d68..6769815 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -153,6 +153,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx) } } +typedef struct CPRegStateLevel { +uint64_t regidx; +int level; +} CPRegStateLevel; + +/* All coprocessor registers not listed in the following table are assumed to + * be of the level KVM_PUT_RUNTIME_STATE, a register should be written less . If a register. + * often, you must add it to this table with a state of either + * KVM_PUT_RESET_STATE or KVM_PUT_FULL_STATE. + */ +static const CPRegStateLevel non_runtime_cpregs[] = { +{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE }, +}; diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index ac34f51..d59f41c 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -139,6 +139,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx) } } +typedef struct CPRegStateLevel { +uint64_t regidx; +int level; +} CPRegStateLevel; + +/* All system not listed in the following table are assumed to be of the level system registers + * KVM_PUT_RUNTIME_STATE, a register should be written less often, you must . If a register + * add it to this table with a state of either KVM_PUT_RESET_STATE or + * KVM_PUT_FULL_STATE. + */ +static const CPRegStateLevel non_runtime_cpregs[] = { +{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE }, +}; --- a/target-arm/kvm_arm.h +++ b/target-arm/kvm_arm.h @@ -83,7 +93,7 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx); * Note that we do not stop early on failure -- we will attempt * writing all registers in the list. */ -bool write_list_to_kvmstate(ARMCPU *cpu); +bool write_list_to_kvmstate(ARMCPU *cpu, int level); You forgot to update the stub function in target-arm/kvm-stub.c, so this breaks compilation on non-ARM hosts. thanks -- PMM
[Qemu-devel] [PULL 6/7] ipxe: don't override GITVERSION
We had build problems due to the git version checking in the ipxe build system in the past. Don't remember the details, but the problem seems to be gone now, so lets remove the workaround. Signed-off-by: Gerd Hoffmann kra...@redhat.com [ most likely ipxe commit 6153c09c41034250408f3596555fcaae715da46c: [build] Set GITVERSION only if there is a git repository ] Reviewed-by: Laszlo Ersek ler...@redhat.com --- roms/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/roms/Makefile b/roms/Makefile index 4971f6b..55fc45b 100644 --- a/roms/Makefile +++ b/roms/Makefile @@ -121,12 +121,12 @@ efi-rom-%: build-pxe-roms build-efi-roms -o ../pc-bios/efi-$*.rom build-pxe-roms: ipxe/src/config/local/general.h ipxe/src/config/local/branding.h - $(MAKE) -C ipxe/src GITVERSION= \ + $(MAKE) -C ipxe/src \ CROSS_COMPILE=$(x86_64_cross_prefix) \ $(patsubst %,bin/%.rom,$(pxerom_targets)) build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h ipxe/src/config/local/branding.h - $(MAKE) -C ipxe/src GITVERSION= \ + $(MAKE) -C ipxe/src \ CROSS_COMPILE=$(x86_64_cross_prefix) \ $(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \ $(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets)) -- 1.8.3.1
[Qemu-devel] [PULL 3/7] ipxe: rm local config in cleanup
ipxe build now generates empty local header files in case they are not present. Let's remove them on cleanup to make sure we store a fresh copy on the next build. Signed-off-by: Gerd Hoffmann kra...@redhat.com Acked-by: Laszlo Ersek ler...@redhat.com --- roms/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/roms/Makefile b/roms/Makefile index 7b3f156..c0153c7 100644 --- a/roms/Makefile +++ b/roms/Makefile @@ -153,5 +153,6 @@ clean: $(MAKE) -C sgabios clean rm -f sgabios/.depend $(MAKE) -C ipxe/src veryclean + (cd ipxe; rm -f src/config/local/*.h) $(MAKE) -C SLOF clean rm -rf u-boot/build.e500 -- 1.8.3.1
[Qemu-devel] [PULL 4/7] ipxe: disable load file protocol
Activate the opt-out added by one not-upstream ipxe patch: [efi] make load file protocol optional Signed-off-by: Gerd Hoffmann kra...@redhat.com Reviewed-by: Laszlo Ersek ler...@redhat.com --- roms/config.ipxe.general.h | 1 + 1 file changed, 1 insertion(+) diff --git a/roms/config.ipxe.general.h b/roms/config.ipxe.general.h index 619ee4c..2df042a 100644 --- a/roms/config.ipxe.general.h +++ b/roms/config.ipxe.general.h @@ -2,3 +2,4 @@ #define BANNER_TIMEOUT 30 #undef ROM_BANNER_TIMEOUT #define ROM_BANNER_TIMEOUT 0 +#undef EFI_PROTO_LOAD_FILE -- 1.8.3.1
[Qemu-devel] [PULL for-2.4 0/7] update ipxe roms, fix efi support
Hi, This pull finally fixes the efi boot support. ipxe is updated to the latest master, two non-upstream commits needed to make efi work are added on top, and the build process is tweaked a bit. The ipxe changes are pushed to git://git.kraxel.org/ipxe (branch qemu, tag qemu-2.4). They should be mirrored to git://git.qemu.org/ipxe.git before merging this pull request. please pull, Gerd The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375: Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging (2015-07-16 10:40:23 +0100) are available in the git repository at: git://git.kraxel.org/qemu tags/pull-ipxe-20150717-1 for you to fetch changes up to 4f0c601b71e53a7d225a1913b784242400788991: ipxe: update binaries (2015-07-16 17:39:12 +0200) update ipxe roms, fix efi support Gerd Hoffmann (7): ipxe: update from 35c53797 to 24112d9 (upstream/master) ipxe: update to 87981bb (qemu) ipxe: rm local config in cleanup ipxe: disable load file protocol ipxe: add qemu branding ipxe: don't override GITVERSION ipxe: update binaries pc-bios/efi-e1000.rom | Bin 197120 - 192512 bytes pc-bios/efi-eepro100.rom| Bin 197632 - 192512 bytes pc-bios/efi-ne2k_pci.rom| Bin 195584 - 190976 bytes pc-bios/efi-pcnet.rom | Bin 195584 - 190976 bytes pc-bios/efi-rtl8139.rom | Bin 200192 - 194560 bytes pc-bios/efi-virtio.rom | Bin 194048 - 188928 bytes roms/Makefile | 9 + roms/config.ipxe.branding.h | 2 ++ roms/config.ipxe.general.h | 1 + roms/ipxe | 2 +- 10 files changed, 9 insertions(+), 5 deletions(-) create mode 100644 roms/config.ipxe.branding.h
[Qemu-devel] [PULL 5/7] ipxe: add qemu branding
Apply qemu-project.org branding, so the official builds can easily be identified in the banner. Signed-off-by: Gerd Hoffmann kra...@redhat.com Acked-by: Laszlo Ersek ler...@redhat.com --- roms/Makefile | 4 ++-- roms/config.ipxe.branding.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) create mode 100644 roms/config.ipxe.branding.h diff --git a/roms/Makefile b/roms/Makefile index c0153c7..4971f6b 100644 --- a/roms/Makefile +++ b/roms/Makefile @@ -120,12 +120,12 @@ efi-rom-%: build-pxe-roms build-efi-roms -ec ipxe/src/bin-x86_64-efi/$(VID)$(DID).efidrv \ -o ../pc-bios/efi-$*.rom -build-pxe-roms: ipxe/src/config/local/general.h +build-pxe-roms: ipxe/src/config/local/general.h ipxe/src/config/local/branding.h $(MAKE) -C ipxe/src GITVERSION= \ CROSS_COMPILE=$(x86_64_cross_prefix) \ $(patsubst %,bin/%.rom,$(pxerom_targets)) -build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h +build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h ipxe/src/config/local/branding.h $(MAKE) -C ipxe/src GITVERSION= \ CROSS_COMPILE=$(x86_64_cross_prefix) \ $(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \ diff --git a/roms/config.ipxe.branding.h b/roms/config.ipxe.branding.h new file mode 100644 index 000..324995e --- /dev/null +++ b/roms/config.ipxe.branding.h @@ -0,0 +1,2 @@ +#undef PRODUCT_NAME +#define PRODUCT_NAME iPXE for qemu-project.org -- 1.8.3.1
[Qemu-devel] [PULL 7/7] ipxe: update binaries
Signed-off-by: Gerd Hoffmann kra...@redhat.com Reviewed-by: Laszlo Ersek ler...@redhat.com --- pc-bios/efi-e1000.rom| Bin 197120 - 192512 bytes pc-bios/efi-eepro100.rom | Bin 197632 - 192512 bytes pc-bios/efi-ne2k_pci.rom | Bin 195584 - 190976 bytes pc-bios/efi-pcnet.rom| Bin 195584 - 190976 bytes pc-bios/efi-rtl8139.rom | Bin 200192 - 194560 bytes pc-bios/efi-virtio.rom | Bin 194048 - 188928 bytes 6 files changed, 0 insertions(+), 0 deletions(-) diff --git a/pc-bios/efi-e1000.rom b/pc-bios/efi-e1000.rom index 4e29d9d..203bbaf 100644 Binary files a/pc-bios/efi-e1000.rom and b/pc-bios/efi-e1000.rom differ diff --git a/pc-bios/efi-eepro100.rom b/pc-bios/efi-eepro100.rom index 2a92d6f..0df3135 100644 Binary files a/pc-bios/efi-eepro100.rom and b/pc-bios/efi-eepro100.rom differ diff --git a/pc-bios/efi-ne2k_pci.rom b/pc-bios/efi-ne2k_pci.rom index 6366017..91c0ebb 100644 Binary files a/pc-bios/efi-ne2k_pci.rom and b/pc-bios/efi-ne2k_pci.rom differ diff --git a/pc-bios/efi-pcnet.rom b/pc-bios/efi-pcnet.rom index a61f586..4d79e37 100644 Binary files a/pc-bios/efi-pcnet.rom and b/pc-bios/efi-pcnet.rom differ diff --git a/pc-bios/efi-rtl8139.rom b/pc-bios/efi-rtl8139.rom index c9c77ea..0d3287b 100644 Binary files a/pc-bios/efi-rtl8139.rom and b/pc-bios/efi-rtl8139.rom differ diff --git a/pc-bios/efi-virtio.rom b/pc-bios/efi-virtio.rom index eec2790..8f9a9b4 100644 Binary files a/pc-bios/efi-virtio.rom and b/pc-bios/efi-virtio.rom differ -- 1.8.3.1
[Qemu-devel] [PULL 2/7] ipxe: update to 87981bb (qemu)
Add two patches we've been struggling to get upstream. They are available from git://git.qemu.org/ipxe.git qemu git shortlog Gerd Hoffmann (1): [efi] make load file protocol optional Laszlo Ersek (1): efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec Signed-off-by: Gerd Hoffmann kra...@redhat.com Reviewed-by: Laszlo Ersek ler...@redhat.com --- roms/ipxe | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/roms/ipxe b/roms/ipxe index 24112d9..87981bb 16 --- a/roms/ipxe +++ b/roms/ipxe @@ -1 +1 @@ -Subproject commit 24112d91a0ab4c29794bc7750cbfc4166bc9026a +Subproject commit 87981bb66f5f496f124469fb6c13aa13ee00061e -- 1.8.3.1
[Qemu-devel] [PULL 1/7] ipxe: update from 35c53797 to 24112d9 (upstream/master)
git shortlog Alex Williamson (1): [dhcp] Extract timing parameters out to config/dhcp.h Bernd Wiebelt (1): [tg3] Add support for BCM57766 Christian Hesse (3): [intel] Add PCI device IDs for Intel I218-LM and I218-V [build] Add missing const qualifiers [ath9k] Remove confusing logic inversion in an ANI variable Ed Swierk (1): [intel] Update PCI device IDs for Intel 82599 and X540 10G NICs Laszlo Ersek (1): [virtio] Downgrade per-iobuf debug messages to DBGC2 Michael Brown (228): [device] Provide a driver-private data field for root devices [iobuf] Add iob_split() to split an I/O buffer into portions [rndis] Add generic RNDIS device abstraction [hyperv] Add support for Hyper-V hypervisor [hyperv] Add support for VMBus devices [hyperv] Add support for NetVSC paravirtual network devices [rndis] Send RNDIS_INITIALISE_MSG [rndis] Send RNDIS_HALT_MSG [hyperv] Tear down NetVSC RX buffer GPADL after closing VMBus device [rndis] Clear receive filter when closing the device [hyperv] Receive all VMBus messages in a poll [hyperv] Increase TX ring size [hyperv] Assume that VMBus xfer page ranges correspond to RNDIS messages [rndis] Ignore start-of-day RNDIS_INDICATE_STATUS_MSG with status 0x40020006 [hyperv] Tidy up debug output [hyperv] Require support for VMBus version 3.0 or newer [build] Include Hyper-V driver in the all-drivers build [pci] Allow drivers to specify a PCI class [romprefix] Ensure UNDI loader can be included by all ROM types [usb] Add basic support for USB devices [usb] Add basic support for USB hubs [usb] Add support for xHCI host controllers [ncm] Add support for CDC-NCM USB Ethernet devices [usb] Report xHCI host controller events [ncm] Use large multi-packet buffers by default [tftp] Explicitly abort connection whenever parent interface is closed [uri] Allow tftp_uri() to construct a URI with a custom port [pxe] Use tftp_uri() to construct PXE TFTP URIs [pxe] Maintain a queue for received PXE UDP packets [ncm] Reserve headroom in received packets [usb] Try multiple USB device configurations [usb] Handle CDC union functional descriptors [usb] Parse endpoint descriptor bInterval field [usb] Allow usb_stream() to enforce a terminating short packet [ecm] Add support for CDC-ECM USB Ethernet devices [xhci] Delay after (possibly) forcing port link state to RxDetect [build] Move branding information to config/branding.h [build] Use PRODUCT_SHORT_NAME for end-user visible strings [build] Allow product URI to be customised via config/branding.h [build] Allow error message URI to be customised via config/branding.h [build] Allow command help text URI to be customised via config/branding.h [build] Allow setting help text URI to be customised via config/branding.h [build] Allow product tag line to be customised via config/branding.h [rndis] Add rndis_rx_err() [usb] Handle port status changes received after failing to find a driver [efi] Disallow R_X86_64_32 relocations [build] Apply the -fno-PIE -nopie workaround only to i386 builds [usb] Provide generic framework for refilling receive endpoints [usb] Use generic refill framework for USB hub interrupt endpoints [ecm] Use generic refill framework for bulk IN and interrupt endpoints [ncm] Use generic refill framework for bulk IN and interrupt endpoints [libc] Remove unused string functions [libc] Rewrite string functions [test] Add self-tests for more string functions [test] Add constant-length memset() self-tests [libc] Reduce size of memset() [usb] Add generic USB network device framework [ecm] Use generic USB network device framework [ncm] Use generic USB network device framework [timer] Rewrite the 8254 Programmable Interval Timer support [xhci] Leak memory if controller fails to disable slot [xhci] Abort commands on timeout [test] Add IPv4 self-tests [legal] Add missing copyright header to net/ipv4.c [ipv4] Rewrite inet_aton() [libc] Rewrite strtoul() [hyperv] Check for required features [prefix] Use .bss16 as temporary stack space for calls to install_block [zbin] Use LZMA compression [zbin] Perform extra normalisation after completing decompression [prefix] Call decompressor in flat real mode when DEBUG=libprefix is enabled [zbin] Allow decompressor to generate debug output via BIOS console [zbin] Fix check for existence of most recent output byte [zbin] Remove now-unused unnrv2b.S decompressor [legal] Update GPLv2 licence text [legal] Include full licence text for all GPL2_OR_LATER files [mucurses] Add missing FILE_LICENCE declarations
[Qemu-devel] [RFC PATCH V3 0/3] Multithread TCG async_safe_work part.
From: KONRAD Frederic fred.kon...@greensocs.com This is the async_safe_work introduction bit of the Multithread TCG work. Rebased on current upstream (6169b60285fe1ff730d840a49527e721bfb30899). (Currently untested as I need to rebase MTTCG first.) It can be cloned here: http://git.greensocs.com/fkonrad/mttcg.git branch async_work_v3 The first patch introduces a mutex to protect the existing queued_work_* CPUState members against multiple (concurent) access. The second patch introduces a tcg_exec_flag which will be 1 when we are inside cpu_exec(), -1 when we must not enter the cpu execution and 0 when we are allowed to do so. This is required as safe work need to be sure that's all vCPU are outside cpu_exec(). The last patch introduces async_safe_work. It allows to add some work which will be done asynchronously but only when all vCPUs are outside cpu_exec(). The tcg thread will wait that no vCPUs have any pending safe work before reentering cpu-exec(). Changes V2 - V3: * Check atomically we are not in the executiong loop to fix the race condition which might happen. Changes V1 - V2: * Release the lock while running the callback for both async and safe work. KONRAD Frederic (3): cpus: protect queued_work_* with work_mutex. cpus: add tcg_exec_flag. cpus: introduce async_run_safe_work_on_cpu. cpu-exec.c| 10 cpus.c| 160 -- include/qom/cpu.h | 57 +++ qom/cpu.c | 20 +++ 4 files changed, 207 insertions(+), 40 deletions(-) -- 1.9.0
[Qemu-devel] [RFC PATCH V3 1/3] cpus: protect queued_work_* with work_mutex.
From: KONRAD Frederic fred.kon...@greensocs.com This protects queued_work_* used by async_run_on_cpu, run_on_cpu and flush_queued_work with a new lock (work_mutex) to prevent multiple (concurrent) access. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Changes V1 - V2: * Unlock the mutex while running the callback. --- cpus.c| 11 +++ include/qom/cpu.h | 3 +++ qom/cpu.c | 1 + 3 files changed, 15 insertions(+) diff --git a/cpus.c b/cpus.c index b00a423..eabd4b1 100644 --- a/cpus.c +++ b/cpus.c @@ -845,6 +845,8 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) wi.func = func; wi.data = data; wi.free = false; + +qemu_mutex_lock(cpu-work_mutex); if (cpu-queued_work_first == NULL) { cpu-queued_work_first = wi; } else { @@ -853,6 +855,7 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) cpu-queued_work_last = wi; wi.next = NULL; wi.done = false; +qemu_mutex_unlock(cpu-work_mutex); qemu_cpu_kick(cpu); while (!wi.done) { @@ -876,6 +879,8 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) wi-func = func; wi-data = data; wi-free = true; + +qemu_mutex_lock(cpu-work_mutex); if (cpu-queued_work_first == NULL) { cpu-queued_work_first = wi; } else { @@ -884,6 +889,7 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) cpu-queued_work_last = wi; wi-next = NULL; wi-done = false; +qemu_mutex_unlock(cpu-work_mutex); qemu_cpu_kick(cpu); } @@ -896,15 +902,20 @@ static void flush_queued_work(CPUState *cpu) return; } +qemu_mutex_lock(cpu-work_mutex); while ((wi = cpu-queued_work_first)) { cpu-queued_work_first = wi-next; +qemu_mutex_unlock(cpu-work_mutex); wi-func(wi-data); +qemu_mutex_lock(cpu-work_mutex); wi-done = true; if (wi-free) { g_free(wi); } } cpu-queued_work_last = NULL; +qemu_mutex_unlock(cpu-work_mutex); + qemu_cond_broadcast(qemu_work_cond); } diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 20aabc9..efa9624 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -242,6 +242,8 @@ struct kvm_run; * @mem_io_pc: Host Program Counter at which the memory was accessed. * @mem_io_vaddr: Target virtual address at which the memory was accessed. * @kvm_fd: vCPU file descriptor for KVM. + * @work_mutex: Lock to prevent multiple access to queued_work_*. + * @queued_work_first: First asynchronous work pending. * * State of one CPU core or thread. */ @@ -262,6 +264,7 @@ struct CPUState { uint32_t host_tid; bool running; struct QemuCond *halt_cond; +QemuMutex work_mutex; struct qemu_work_item *queued_work_first, *queued_work_last; bool thread_kicked; bool created; diff --git a/qom/cpu.c b/qom/cpu.c index eb9cfec..4e12598 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -316,6 +316,7 @@ static void cpu_common_initfn(Object *obj) cpu-gdb_num_regs = cpu-gdb_num_g_regs = cc-gdb_num_core_regs; QTAILQ_INIT(cpu-breakpoints); QTAILQ_INIT(cpu-watchpoints); +qemu_mutex_init(cpu-work_mutex); } static void cpu_common_finalize(Object *obj) -- 1.9.0
[Qemu-devel] [RFC PATCH V3 2/3] cpus: add tcg_exec_flag.
From: KONRAD Frederic fred.kon...@greensocs.com This flag indicates the state of the VCPU thread: * 0 if the VCPU is allowed to execute code. * 1 if the VCPU is currently executing code. * -1 if the VCPU is not allowed to execute code. This allows to atomically check and run safe work or check and continue the TCG execution. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Changes V2 - V3: * introduce a third state which allow or not the execution. * atomically check and set the flag when starting or blocking the code execution. Changes V1 - V2: * do both tcg_executing = 0 or 1 in cpu_exec(). --- cpu-exec.c| 5 + include/qom/cpu.h | 32 qom/cpu.c | 19 +++ 3 files changed, 56 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 75694f3..e1a 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -371,6 +371,10 @@ int cpu_exec(CPUState *cpu) cpu-halted = 0; } +if (!tcg_cpu_try_start_execution(cpu)) { +cpu-exit_request = 1; +return 0; +} current_cpu = cpu; /* As long as current_cpu is null, up to the assignment just above, @@ -583,5 +587,6 @@ int cpu_exec(CPUState *cpu) /* fail safe : never use current_cpu outside cpu_exec() */ current_cpu = NULL; +tcg_cpu_allow_execution(cpu); return ret; } diff --git a/include/qom/cpu.h b/include/qom/cpu.h index efa9624..de7487e 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -226,6 +226,7 @@ struct kvm_run; * @stopped: Indicates the CPU has been artificially stopped. * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this * CPU and return to its top level loop. + * @tcg_exec_flag: See tcg_cpu_flag_* function. * @singlestep_enabled: Flags for single-stepping. * @icount_extra: Instructions until next timer event. * @icount_decr: Number of cycles left, with interrupt flag in high bit. @@ -322,6 +323,8 @@ struct CPUState { (absolute value) offset as small as possible. This reduces code size, especially for hosts without large memory offsets. */ volatile sig_atomic_t tcg_exit_req; + +int tcg_exec_flag; }; QTAILQ_HEAD(CPUTailQ, CPUState); @@ -337,6 +340,35 @@ extern struct CPUTailQ cpus; DECLARE_TLS(CPUState *, current_cpu); #define current_cpu tls_var(current_cpu) + +/** + * tcg_cpu_try_block_execution + * @cpu: The CPU to block the execution + * + * Try to set the tcg_exec_flag to -1 saying the CPU can't execute code if the + * CPU is not executing code. + * Returns true if the cpu execution is blocked, false otherwise. + */ +bool tcg_cpu_try_block_execution(CPUState *cpu); + +/** + * tcg_cpu_allow_execution + * @cpu: The CPU to allow the execution. + * + * Just reset the state of tcg_exec_flag, and allow the execution of some code. + */ +void tcg_cpu_allow_execution(CPUState *cpu); + +/** + * tcg_cpu_try_start_execution + * @cpu: The CPU to start the execution. + * + * Just set the tcg_exec_flag to 1 saying the CPU is executing code if the CPU + * is allowed to run some code. + * Returns true if the cpu can execute, false otherwise. + */ +bool tcg_cpu_try_start_execution(CPUState *cpu); + /** * cpu_paging_enabled: * @cpu: The CPU whose state is to be inspected. diff --git a/qom/cpu.c b/qom/cpu.c index 4e12598..e32f90c 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -26,6 +26,23 @@ #include qemu/error-report.h #include sysemu/sysemu.h +bool tcg_cpu_try_block_execution(CPUState *cpu) +{ +return (atomic_cmpxchg(cpu-tcg_exec_flag, 0, -1) + || (cpu-tcg_exec_flag == -1)); +} + +void tcg_cpu_allow_execution(CPUState *cpu) +{ +cpu-tcg_exec_flag = 0; +} + +bool tcg_cpu_try_start_execution(CPUState *cpu) +{ +return (atomic_cmpxchg(cpu-tcg_exec_flag, 0, 1) + || (cpu-tcg_exec_flag == 1)); +} + bool cpu_exists(int64_t id) { CPUState *cpu; @@ -249,6 +266,8 @@ static void cpu_common_reset(CPUState *cpu) cpu-icount_decr.u32 = 0; cpu-can_do_io = 0; cpu-exception_index = -1; + +tcg_cpu_allow_execution(cpu); memset(cpu-tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof(void *)); } -- 1.9.0
[Qemu-devel] [RFC PATCH V3 3/3] cpus: introduce async_run_safe_work_on_cpu.
From: KONRAD Frederic fred.kon...@greensocs.com We already had async_run_on_cpu but we need all VCPUs outside their execution loop to execute some tb_flush/invalidate task: async_run_on_cpu_safe schedule a work on a VCPU but the work start when no more VCPUs are executing code. When a safe work is pending cpu_has_work returns true, so cpu_exec returns and the VCPUs can't enters execution loop. cpu_thread_is_idle returns false so at the moment where all VCPUs are stop || stopped the safe work queue can be flushed. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Changes V3 - V4: * Use tcg_cpu_try_block_execution. * Use a counter to know how many safe work are pending. Changes V2 - V3: * Unlock the mutex while executing the callback. Changes V1 - V2: * Move qemu_cpu_kick_thread to avoid prototype declaration. * Use the work_mutex lock to protect the queued_safe_work_* structures. --- cpu-exec.c| 5 ++ cpus.c| 149 +++--- include/qom/cpu.h | 24 - 3 files changed, 137 insertions(+), 41 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index e1a..97805cc 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -363,6 +363,11 @@ int cpu_exec(CPUState *cpu) /* This must be volatile so it is not trashed by longjmp() */ volatile bool have_tb_lock = false; +if (async_safe_work_pending()) { +cpu-exit_request = 1; +return 0; +} + if (cpu-halted) { if (!cpu_has_work(cpu)) { return EXCP_HALTED; diff --git a/cpus.c b/cpus.c index eabd4b1..2250296 100644 --- a/cpus.c +++ b/cpus.c @@ -69,6 +69,8 @@ static CPUState *next_cpu; int64_t max_delay; int64_t max_advance; +int safe_work_pending; /* Number of safe work pending for all VCPUs. */ + bool cpu_is_stopped(CPUState *cpu) { return cpu-stopped || !runstate_is_running(); @@ -76,7 +78,7 @@ bool cpu_is_stopped(CPUState *cpu) static bool cpu_thread_is_idle(CPUState *cpu) { -if (cpu-stop || cpu-queued_work_first) { +if (cpu-stop || cpu-queued_work_first || cpu-queued_safe_work_first) { return false; } if (cpu_is_stopped(cpu)) { @@ -833,6 +835,45 @@ void qemu_init_cpu_loop(void) qemu_thread_get_self(io_thread); } +static void qemu_cpu_kick_thread(CPUState *cpu) +{ +#ifndef _WIN32 +int err; + +err = pthread_kill(cpu-thread-thread, SIG_IPI); +if (err) { +fprintf(stderr, qemu:%s: %s, __func__, strerror(err)); +exit(1); +} +#else /* _WIN32 */ +if (!qemu_cpu_is_self(cpu)) { +CONTEXT tcgContext; + +if (SuspendThread(cpu-hThread) == (DWORD)-1) { +fprintf(stderr, qemu:%s: GetLastError:%lu\n, __func__, +GetLastError()); +exit(1); +} + +/* On multi-core systems, we are not sure that the thread is actually + * suspended until we can get the context. + */ +tcgContext.ContextFlags = CONTEXT_CONTROL; +while (GetThreadContext(cpu-hThread, tcgContext) != 0) { +continue; +} + +cpu_signal(0); + +if (ResumeThread(cpu-hThread) == (DWORD)-1) { +fprintf(stderr, qemu:%s: GetLastError:%lu\n, __func__, +GetLastError()); +exit(1); +} +} +#endif +} + void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) { struct qemu_work_item wi; @@ -894,6 +935,70 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) qemu_cpu_kick(cpu); } +void async_run_safe_work_on_cpu(CPUState *cpu, void (*func)(void *data), +void *data) +{ +struct qemu_work_item *wi; + +wi = g_malloc0(sizeof(struct qemu_work_item)); +wi-func = func; +wi-data = data; +wi-free = true; + +atomic_inc(safe_work_pending); +qemu_mutex_lock(cpu-work_mutex); +if (cpu-queued_safe_work_first == NULL) { +cpu-queued_safe_work_first = wi; +} else { +cpu-queued_safe_work_last-next = wi; +} +cpu-queued_safe_work_last = wi; +wi-next = NULL; +wi-done = false; +qemu_mutex_unlock(cpu-work_mutex); + +CPU_FOREACH(cpu) { +qemu_cpu_kick_thread(cpu); +} +} + +static void flush_queued_safe_work(CPUState *cpu) +{ +struct qemu_work_item *wi; +CPUState *other_cpu; + +if (cpu-queued_safe_work_first == NULL) { +return; +} + +CPU_FOREACH(other_cpu) { +if (!tcg_cpu_try_block_execution(other_cpu)) { +return; +} +} + +qemu_mutex_lock(cpu-work_mutex); +while ((wi = cpu-queued_safe_work_first)) { +cpu-queued_safe_work_first = wi-next; +qemu_mutex_unlock(cpu-work_mutex); +wi-func(wi-data); +qemu_mutex_lock(cpu-work_mutex); +wi-done = true; +if (wi-free) { +g_free(wi); +} +atomic_dec(safe_work_pending); +} +cpu-queued_safe_work_last
[Qemu-devel] [PULL 4/4] Revert xhci: set timer to retry xfers
This reverts commit 4e8cfbe1143d8384387595b500212d7a7f11aeae. We should not poll via timer, and with ccid being fixed to properly notify us about pending transfers we don't have to. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/usb/hcd-xhci.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c index 90a5fbf..c673bed 100644 --- a/hw/usb/hcd-xhci.c +++ b/hw/usb/hcd-xhci.c @@ -,8 +,6 @@ static void xhci_kick_ep(XHCIState *xhci, unsigned int slotid, if (xfer-running_retry) { DPRINTF(xhci: xfer nacked, stopping schedule\n); epctx-retry = xfer; -timer_mod(epctx-kick_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + - epctx-interval * 125000); break; } } -- 1.8.3.1
Re: [Qemu-devel] vhost-user memory regions.
Hi Naredula On Mon, Jul 13, 2015 at 10:35 AM, Naredula Janardhana Reddy naredula.j...@gmail.com wrote: In the QEMU 2.3.0 , The memory regions published to user space application is limited address space, not the complete memory regions of quest vm, due to this , the user level app cannot decode all the memory block addresses. This is needed for reading the packet from virtio ring. Is there any configuration in qemu to overcome the above or it is bug in virtio-user?. Test details: Qemu cmd line: /opt/qemu.2.3.0/bin/qemu-system-x86_64 -enable-kvm -gdb tcp::1336,server,nowait -monitor tcp::51008,server,nowait,nodelay -m 256M -mem-path /hugetlbfs -mem-prealloc -smp 2 -chardev backend=socket,id=ch1,path=/opt_src/port1 -netdev vhost-user,id=guest0,chardev=ch1 -device virtio-net-pci,mac=00:30:48:DB:5E:01,netdev=guest0,mq=on,vectors=8 -vnc :8 -serial telnet::50008,server,nowait -daemonize -append net_send_int_disable=1 ipaddr=192.168.122.3 gw=192.168.122.1 nic_intr_off=1 netbh_tightloop=0 page_cache=1 -kernel /opt_src/jiny/jiny_image.bin -drive if=virtio,id=hdr0,file=/opt_src/jiny/disk In the below message The below is message recevied at the user level application from vhost-user(qemu), the size is only 655k. but needed is 256M(size of memory or shared memory size of hugetlb). I have Temporarly hardcoded the size in the app by changing the size to 256M from 655K. so that network packet address can able to decode from the virtio ring. Cmd: VHOST_USER_SET_MEM_TABLE (0x5) flags: 0x1 regions: 2 region: gpa = 0x0 size = 655360 ua = 0x2ac0 region: gpa = 0x0 size = 786432 ua = 0xff4 I got a similar error when using vapp (https://github.com/virtualopensystems/vapp), it's missing mmap_offset field. You need this fix: https://github.com/virtualopensystems/vapp/pull/3/files -- Marc-André Lureau
[Qemu-devel] [FIX PATCH] pc-dimm: Fail pc-dimm realization for invalid nodes in non-NUMA configuration
pc_dimm_realize() validates the NUMA node to which memory hotplug is being performed only in case of NUMA configuration. Include a check to fail invalid nodes in case of non-NUMA configuration too. Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com --- hw/mem/pc-dimm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index bb04862..099025e 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -414,10 +414,11 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp) error_setg(errp, ' PC_DIMM_MEMDEV_PROP ' property is not set); return; } -if ((nb_numa_nodes 0) (dimm-node = nb_numa_nodes)) { +if (((nb_numa_nodes 0) (dimm-node = nb_numa_nodes)) || +(!nb_numa_nodes dimm-node)) { error_setg(errp, 'DIMM property PC_DIMM_NODE_PROP has value % PRIu32 ' which exceeds the number of numa nodes: %d, - dimm-node, nb_numa_nodes); + dimm-node, nb_numa_nodes ? nb_numa_nodes : 1); return; } } -- 2.1.0
[Qemu-devel] [RFC PATCH 1/2] spapr: add dumpdtb support
Signed-off-by: Andrew Jones drjo...@redhat.com --- hw/ppc/spapr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a6f19473cf278..c1cbf3387ae0c 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -30,6 +30,7 @@ #include hw/fw-path-provider.h #include elf.h #include net/net.h +#include sysemu/device_tree.h #include sysemu/block-backend.h #include sysemu/cpus.h #include sysemu/kvm.h @@ -822,6 +823,7 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr, exit(1); } +qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt)); cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt)); g_free(bootlist); -- 2.4.3
Re: [Qemu-devel] [RFC v3 06/13] target-i386: translate: implement qemu_ldlink and qemu_stcond ops
Alvise Rigo a.r...@virtualopensystems.com writes: Implement strex and ldrex instruction relying on TCG's qemu_ldlink and qemu_stcond. For the time being only 32bit configurations are supported. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- tcg/i386/tcg-target.c | 136 ++ 1 file changed, 114 insertions(+), 22 deletions(-) diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c index 0d7c99c..d8250a9 100644 --- a/tcg/i386/tcg-target.c +++ b/tcg/i386/tcg-target.c @@ -1141,6 +1141,17 @@ static void * const qemu_ld_helpers[16] = { [MO_BEQ] = helper_be_ldq_mmu, }; +/* LoadLink helpers, only unsigned. Use the macro below to access them. */ +static void * const qemu_ldex_helpers[16] = { +[MO_LEUL] = helper_le_ldlinkul_mmu, +}; + +#define LDEX_HELPER(mem_op) \ +({ \ +assert(mem_op MO_EXCL); \ +qemu_ldex_helpers[((int)mem_op - MO_EXCL)]; \ +}) + /* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, * uintxx_t val, int mmu_idx, uintptr_t ra) */ @@ -1154,6 +1165,17 @@ static void * const qemu_st_helpers[16] = { [MO_BEQ] = helper_be_stq_mmu, }; +/* StoreConditional helpers. Use the macro below to access them. */ +static void * const qemu_stex_helpers[16] = { +[MO_LEUL] = helper_le_stcondl_mmu, +}; + +#define STEX_HELPER(mem_op) \ +({ \ +assert(mem_op MO_EXCL); \ +qemu_stex_helpers[(int)mem_op - MO_EXCL]; \ +}) + Same comments as for target-arm. Do we need to be protecting backends with HAS_LDST_EXCL defines or some such macro hackery? What currently happens if you use the new TCG ops when the backend doesn't support them? Is supporting all backends a prerequisite for the series? /* Perform the TLB load and compare. Inputs: @@ -1249,6 +1271,7 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, * for a load or store, so that we can later generate the correct helper code */ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi, +TCGReg llsc_success, TCGReg datalo, TCGReg datahi, TCGReg addrlo, TCGReg addrhi, tcg_insn_unit *raddr, @@ -1257,6 +1280,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi, TCGLabelQemuLdst *label = new_ldst_label(s); label-is_ld = is_ld; +label-llsc_success = llsc_success; label-oi = oi; label-datalo_reg = datalo; label-datahi_reg = datahi; @@ -1311,7 +1335,11 @@ static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) (uintptr_t)l-raddr); } -tcg_out_call(s, qemu_ld_helpers[opc (MO_BSWAP | MO_SIZE)]); +if (opc MO_EXCL) { +tcg_out_call(s, LDEX_HELPER(opc)); +} else { +tcg_out_call(s, qemu_ld_helpers[opc ~MO_SIGN]); +} data_reg = l-datalo_reg; switch (opc MO_SSIZE) { @@ -1415,9 +1443,16 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) } } -/* Tail call to the helper, with the return address back inline. */ -tcg_out_push(s, retaddr); -tcg_out_jmp(s, qemu_st_helpers[opc (MO_BSWAP | MO_SIZE)]); +if (opc MO_EXCL) { +tcg_out_call(s, STEX_HELPER(opc)); +/* Save the output of the StoreConditional */ +tcg_out_mov(s, TCG_TYPE_I32, l-llsc_success, TCG_REG_EAX); +tcg_out_jmp(s, l-raddr); +} else { +/* Tail call to the helper, with the return address back inline. */ +tcg_out_push(s, retaddr); +tcg_out_jmp(s, qemu_st_helpers[opc]); +} } #elif defined(__x86_64__) defined(__linux__) # include asm/prctl.h @@ -1530,7 +1565,8 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and EAX. It will be useful once fixed registers globals are less common. */ -static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) +static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64, +bool isLoadLink) { TCGReg datalo, datahi, addrlo; TCGReg addrhi __attribute__((unused)); @@ -1553,14 +1589,34 @@ static void
[Qemu-devel] [PULL 3/4] usb-ccid: add missing wakeup calls
Properly notify the host adapter that we have data pending, so it doesn't has to poll us. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/usb/dev-smartcard-reader.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c index 904b568..8952eff 100644 --- a/hw/usb/dev-smartcard-reader.c +++ b/hw/usb/dev-smartcard-reader.c @@ -283,6 +283,7 @@ typedef struct CCIDBus { typedef struct USBCCIDState { USBDevice dev; USBEndpoint *intr; +USBEndpoint *bulk; CCIDBus bus; CCIDCardState *card; BulkIn bulk_in_pending[BULK_IN_PENDING_NUM]; /* circular */ @@ -769,6 +770,7 @@ static void ccid_write_slot_status(USBCCIDState *s, CCID_Header *recv) h-b.bError = s-bError; h-bClockStatus = CLOCK_STATUS_RUNNING; ccid_reset_error_status(s); +usb_wakeup(s-bulk, 0); } static void ccid_write_parameters(USBCCIDState *s, CCID_Header *recv) @@ -789,6 +791,7 @@ static void ccid_write_parameters(USBCCIDState *s, CCID_Header *recv) h-bProtocolNum = s-bProtocolNum; h-abProtocolDataStructure = s-abProtocolDataStructure; ccid_reset_error_status(s); +usb_wakeup(s-bulk, 0); } static void ccid_write_data_block(USBCCIDState *s, uint8_t slot, uint8_t seq, @@ -810,6 +813,7 @@ static void ccid_write_data_block(USBCCIDState *s, uint8_t slot, uint8_t seq, } memcpy(p-abData, data, len); ccid_reset_error_status(s); +usb_wakeup(s-bulk, 0); } static void ccid_report_error_failed(USBCCIDState *s, uint8_t error) @@ -1319,6 +1323,7 @@ static void ccid_realize(USBDevice *dev, Error **errp) NULL); qbus_set_hotplug_handler(BUS(s-bus), DEVICE(dev), error_abort); s-intr = usb_ep_get(dev, USB_TOKEN_IN, CCID_INT_IN_EP); +s-bulk = usb_ep_get(dev, USB_TOKEN_IN, CCID_BULK_IN_EP); s-card = NULL; s-migration_state = MIGRATION_NONE; s-migration_target_ip = 0; -- 1.8.3.1
[Qemu-devel] [PULL for-2.4 0/4] usb: fixes for 2.4 (ccid, xhci and usb-host)
Hi, Here are usb fixes for 2.4. please pull, Gerd The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375: Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging (2015-07-16 10:40:23 +0100) are available in the git repository at: git://git.kraxel.org/qemu tags/pull-usb-20150717-1 for you to fetch changes up to 92fdfa4bef9c92addcc009dd3e0131172b4fdc78: Revert xhci: set timer to retry xfers (2015-07-17 13:20:53 +0200) usb: fixes for 2.4 (ccid, xhci and usb-host) Gerd Hoffmann (3): usb-ccid: fix 61b4887b41b270bc837ead57bc502d904af023bb usb-ccid: add missing wakeup calls Revert xhci: set timer to retry xfers Lin Ma (1): Re-attach usb device to kernel while usb_host_open fails hw/usb/dev-smartcard-reader.c | 21 + hw/usb/hcd-xhci.c | 2 -- hw/usb/host-libusb.c | 3 +++ 3 files changed, 16 insertions(+), 10 deletions(-)
[Qemu-devel] [PULL 2/4] usb-ccid: fix 61b4887b41b270bc837ead57bc502d904af023bb
QOMification dropped the parent device lookup, fix it. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/usb/dev-smartcard-reader.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c index de534ba..904b568 100644 --- a/hw/usb/dev-smartcard-reader.c +++ b/hw/usb/dev-smartcard-reader.c @@ -1184,7 +1184,7 @@ void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t *apdu, uint32_t len) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); Answer *answer; @@ -1207,7 +1207,7 @@ void ccid_card_send_apdu_to_guest(CCIDCardState *card, void ccid_card_card_removed(CCIDCardState *card) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); ccid_on_slot_change(s, false); @@ -1218,7 +1218,7 @@ void ccid_card_card_removed(CCIDCardState *card) int ccid_card_ccid_attach(CCIDCardState *card) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); DPRINTF(s, 1, CCID Attach\n); @@ -1231,7 +1231,7 @@ int ccid_card_ccid_attach(CCIDCardState *card) void ccid_card_ccid_detach(CCIDCardState *card) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); DPRINTF(s, 1, CCID Detach\n); @@ -1244,7 +1244,7 @@ void ccid_card_ccid_detach(CCIDCardState *card) void ccid_card_card_error(CCIDCardState *card, uint64_t error) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); s-bmCommandStatus = COMMAND_STATUS_FAILED; @@ -1263,7 +1263,7 @@ void ccid_card_card_error(CCIDCardState *card, uint64_t error) void ccid_card_card_inserted(CCIDCardState *card) { DeviceState *qdev = DEVICE(card); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); s-bmCommandStatus = COMMAND_STATUS_NO_ERROR; @@ -1275,7 +1275,7 @@ static int ccid_card_exit(DeviceState *qdev) { int ret = 0; CCIDCardState *card = CCID_CARD(qdev); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); if (ccid_card_inserted(s)) { @@ -1289,7 +1289,7 @@ static int ccid_card_exit(DeviceState *qdev) static int ccid_card_init(DeviceState *qdev) { CCIDCardState *card = CCID_CARD(qdev); -USBDevice *dev = USB_DEVICE(qdev); +USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent); USBCCIDState *s = USB_CCID_DEV(dev); int ret = 0; -- 1.8.3.1
[Qemu-devel] [RFC PATCH 2/2] spapr: -kernel: allow linking with specified addr
I've started playing with adding ppc support to kvm-unit-tests, using spapr for the machine model. I wanted to link the unit test at 0x40 to match qemu's load address, making the unit test startup code simpler, but ended up with 0x80 instead, due to how translate_kernel_address works. The translation makes sense for how Linux kernels are linked (always at 0xc000 or 0xc000), but for the unit test case we need to avoid adding the offset. Signed-off-by: Andrew Jones drjo...@redhat.com --- Big RFC because I don't know if the always at 0xc... statement is 100% true for Linux, nor if this patch would break other stuff... hw/ppc/spapr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index c1cbf3387ae0c..4f1548f5168db 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -832,6 +832,9 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr, static uint64_t translate_kernel_address(void *opaque, uint64_t addr) { +if ((addr 0x0fff) == addr) { +return addr; +} return (addr 0x0fff) + KERNEL_LOAD_ADDR; } -- 2.4.3
[Qemu-devel] [RFC PATCH 0/2] spapr: changes for kvm-unit-tests
I've started playing with adding ppc support to kvm-unit-tests, using spapr for the machine model. Here are a couple patches to spapr I've made. Thanks, drew Andrew Jones (2): spapr: add dumpdtb support spapr: -kernel: allow linking with specified addr hw/ppc/spapr.c | 5 + 1 file changed, 5 insertions(+) -- 2.4.3
Re: [Qemu-devel] [PULL for-2.4 0/4] usb: fixes for 2.4 (ccid, xhci and usb-host)
On 17 July 2015 at 12:29, Gerd Hoffmann kra...@redhat.com wrote: Hi, Here are usb fixes for 2.4. please pull, Gerd The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375: Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging (2015-07-16 10:40:23 +0100) are available in the git repository at: git://git.kraxel.org/qemu tags/pull-usb-20150717-1 for you to fetch changes up to 92fdfa4bef9c92addcc009dd3e0131172b4fdc78: Revert xhci: set timer to retry xfers (2015-07-17 13:20:53 +0200) usb: fixes for 2.4 (ccid, xhci and usb-host) Applied, thanks. -- PMM
Re: [Qemu-devel] [RFC v3 05/13] target-arm: translate: implement qemu_ldlink and qemu_stcond ops
Alvise Rigo a.r...@virtualopensystems.com writes: Implement strex and ldrex instruction relying on TCG's qemu_ldlink and qemu_stcond. For the time being only the 32bit instructions are supported. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- target-arm/translate.c | 87 ++- tcg/arm/tcg-target.c | 121 + 2 files changed, 178 insertions(+), 30 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index 80302cd..0366c76 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -72,6 +72,8 @@ static TCGv_i64 cpu_exclusive_test; static TCGv_i32 cpu_exclusive_info; #endif +static TCGv_i32 cpu_ll_sc_context; + /* FIXME: These should be removed. */ static TCGv_i32 cpu_F0s, cpu_F1s; static TCGv_i64 cpu_F0d, cpu_F1d; @@ -103,6 +105,8 @@ void arm_translate_init(void) offsetof(CPUARMState, exclusive_addr), exclusive_addr); cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0, offsetof(CPUARMState, exclusive_val), exclusive_val); +cpu_ll_sc_context = tcg_global_mem_new_i32(TCG_AREG0, +offsetof(CPUARMState, ll_sc_context), ll_sc_context); #ifdef CONFIG_USER_ONLY cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0, offsetof(CPUARMState, exclusive_test), exclusive_test); @@ -961,6 +965,18 @@ DO_GEN_ST(8, MO_UB) DO_GEN_ST(16, MO_TEUW) DO_GEN_ST(32, MO_TEUL) +/* Load/Store exclusive generators (always unsigned) */ +static inline void gen_aa32_ldex32(TCGv_i32 val, TCGv_i32 addr, int index) +{ +tcg_gen_qemu_ldlink_i32(val, addr, index, MO_TEUL | MO_EXCL); +} + +static inline void gen_aa32_stex32(TCGv_i32 is_dirty, TCGv_i32 val, + TCGv_i32 addr, int index) +{ +tcg_gen_qemu_stcond_i32(is_dirty, val, addr, index, MO_TEUL | MO_EXCL); +} + static inline void gen_set_pc_im(DisasContext *s, target_ulong val) { tcg_gen_movi_i32(cpu_R[15], val); @@ -7427,6 +7443,26 @@ static void gen_load_exclusive(DisasContext *s, int rt, int rt2, store_reg(s, rt, tmp); } +static void gen_load_exclusive_multi(DisasContext *s, int rt, int rt2, + TCGv_i32 addr, int size) +{ +TCGv_i32 tmp = tcg_temp_new_i32(); + +switch (size) { +case 0: +case 1: +abort(); +case 2: +gen_aa32_ldex32(tmp, addr, get_mem_index(s)); +break; +case 3: +default: +abort(); +} + +store_reg(s, rt, tmp); +} + static void gen_clrex(DisasContext *s) { gen_helper_atomic_clear(cpu_env); @@ -7460,6 +7496,52 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, tcg_temp_free_i64(val); tcg_temp_free_i32(tmp_size); } + +static void gen_store_exclusive_multi(DisasContext *s, int rd, int rt, int rt2, + TCGv_i32 addr, int size) +{ +TCGv_i32 tmp; +TCGv_i32 is_dirty; +TCGLabel *done_label; +TCGLabel *fail_label; + +fail_label = gen_new_label(); +done_label = gen_new_label(); + +tmp = tcg_temp_new_i32(); +is_dirty = tcg_temp_new_i32(); + +/* Fail if we are not in LL/SC context. */ +tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ll_sc_context, 1, fail_label); + +tmp = load_reg(s, rt); +switch (size) { +case 0: +case 1: +abort(); +break; +case 2: +gen_aa32_stex32(is_dirty, tmp, addr, get_mem_index(s)); +break; +case 3: +default: +abort(); +} + +tcg_temp_free_i32(tmp); + +/* Check if the store conditional has to fail. */ +tcg_gen_brcondi_i32(TCG_COND_EQ, is_dirty, 1, fail_label); +tcg_temp_free_i32(is_dirty); + +tcg_temp_free_i32(tmp); + +tcg_gen_movi_i32(cpu_R[rd], 0); /* is_dirty = 0 */ +tcg_gen_br(done_label); +gen_set_label(fail_label); +tcg_gen_movi_i32(cpu_R[rd], 1); /* is_dirty = 1 */ +gen_set_label(done_label); +} #endif /* gen_srs: @@ -8308,7 +8390,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) } else if (insn (1 20)) { switch (op1) { case 0: /* ldrex */ -gen_load_exclusive(s, rd, 15, addr, 2); +gen_load_exclusive_multi(s, rd, 15, addr, 2); break; case 1: /* ldrexd */ gen_load_exclusive(s, rd, rd + 1, addr, 3); @@ -8326,7 +8408,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) rm = insn 0xf; switch (op1) {
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On 17/07/2015 15:28, Marc Zyngier wrote: Marc, does it ring any bell? Well, this is an example of a guest accessing non-memory using an instruction that we cannot safely emulate - not an IO accessor (load multiple, for example). In this case, we kill the guest (we could as well inject a fault). Yup, I later found this in the dmesg: [1875219.903969] kvm [14843]: load/store instruction decoding not implemented This vcpu seems to be accessing 0x918 (the mmio structure points there), but I can't immediately relate it to the content of the registers. 0x918 is the pl011, which makes some sense. Have you ever seen a corrupted register dump? The guest RAM goes from 0x4000 to 0xbfff, so SP is pointing outside memory. PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) If the register dump is not corrupted, execution went in the weeds... I don't have the guest anymore but it's just firmware so the memory contents are well reproducible: 0xbf671200: a97d6ffa ldmdbge sp!, {r1, r3, r4, r5, r6, r7, r8, r9, sl, fp, sp, lr}^ 0xbf671204: a97e77fc ldmdbge lr!, {r2, r3, r4, r5, r6, r7, r8, r9, sl, ip, sp, lr}^ 0xbf671208: f85f03fe undefined instruction 0xf85f03fe 0xbf67120c: 910803ff strdls r0, [r8, -pc] 0xbf671210: ad7007e0 ldclge 7, cr0, [r0, #-896]! 0xbf671214: ad710fe2 ldclge 15, cr0, [r1, #-904]! 0xbf671218: ad7217e4 ldclge 7, cr1, [r2, #-912]! 0xbf67121c: ad731fe6 ldclge 15, cr1, [r3, #-920]! 0xbf671220: ad7427e8 ldclge 7, cr2, [r4, #-928]! 0xbf671224: ad752fea ldclge 15, cr2, [r5, #-936]! What looks a bit weird is that all the registers are clamped to 32bit, but the PSTATE indicates it is running in 64bit (EL1h, which makes the PC being utterly wrong). The PC can be okay since UEFI code doesn't really use virtual addressing, but the other registers are weird indeed. What are you running there? Just firmware. It's a UEFI reboot loop (as soon as you get to the UEFI shell QEMU exits and the script starts a new one). The kernel is an old 3.19 one, I'll upgrade and retest. Paolo
[Qemu-devel] [PATCH 1/4] Revert vhost-user: add multi queue support
This reverts commit 830d70db692e374b5f4407f96a1ceefdcc97. The interface as merged isn't fully backwards-compatible with existing clients. Revert it. Let's redo this after 2.4, based on protocol extensions in follow-up patches. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qapi-schema.json | 6 +- hw/net/vhost_net.c| 3 +-- hw/virtio/vhost-user.c| 11 +-- net/vhost-user.c | 37 + docs/specs/vhost-user.txt | 5 - qemu-options.hx | 5 ++--- 6 files changed, 18 insertions(+), 49 deletions(-) diff --git a/qapi-schema.json b/qapi-schema.json index 106008c..88a3883 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -2463,16 +2463,12 @@ # # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false). # -# @queues: #optional number of queues to be created for multiqueue vhost-user -# (default: 1) (Since 2.4) -# # Since 2.1 ## { 'struct': 'NetdevVhostUserOptions', 'data': { 'chardev':'str', -'*vhostforce':'bool', -'*queues':'uint32' } } +'*vhostforce':'bool' } } ## # @NetClientOptions diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 9bd360b..5c1d11f 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -160,7 +160,6 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) net-dev.nvqs = 2; net-dev.vqs = net-vqs; -net-dev.vq_index = net-nc-queue_index; r = vhost_dev_init(net-dev, options-opaque, options-backend_type); @@ -287,7 +286,7 @@ static void vhost_net_stop_one(struct vhost_net *net, for (file.index = 0; file.index net-dev.nvqs; ++file.index) { const VhostOps *vhost_ops = net-dev.vhost_ops; int r = vhost_ops-vhost_call(net-dev, VHOST_RESET_OWNER, - file); + NULL); assert(r = 0); } } diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d6f2163..e7ab829 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -210,12 +210,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, break; case VHOST_SET_OWNER: -break; - case VHOST_RESET_OWNER: -memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; -msg.size = sizeof(m.state); break; case VHOST_SET_MEM_TABLE: @@ -258,20 +253,17 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, case VHOST_SET_VRING_NUM: case VHOST_SET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; msg.size = sizeof(m.state); break; case VHOST_GET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; msg.size = sizeof(m.state); need_reply = 1; break; case VHOST_SET_VRING_ADDR: memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr)); -msg.addr.index += dev-vq_index; msg.size = sizeof(m.addr); break; @@ -279,7 +271,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, case VHOST_SET_VRING_CALL: case VHOST_SET_VRING_ERR: file = arg; -msg.u64 = (file-index + dev-vq_index) VHOST_USER_VRING_IDX_MASK; +msg.u64 = file-index VHOST_USER_VRING_IDX_MASK; msg.size = sizeof(m.u64); if (ioeventfd_enabled() file-fd 0) { fds[fd_num++] = file-fd; @@ -321,7 +313,6 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, error_report(Received bad msg size.); return -1; } -msg.state.index -= dev-vq_index; memcpy(arg, msg.state, sizeof(struct vhost_vring_state)); break; default: diff --git a/net/vhost-user.c b/net/vhost-user.c index b51bc04..93dcecd 100644 --- a/net/vhost-user.c +++ b/net/vhost-user.c @@ -120,39 +120,35 @@ static void net_vhost_user_event(void *opaque, int event) case CHR_EVENT_OPENED: vhost_user_start(s); net_vhost_link_down(s, false); -error_report(chardev \%s\ went up, s-nc.info_str); +error_report(chardev \%s\ went up, s-chr-label); break; case CHR_EVENT_CLOSED: net_vhost_link_down(s, true); vhost_user_stop(s); -error_report(chardev \%s\ went down, s-nc.info_str); +error_report(chardev \%s\ went down, s-chr-label); break; } } static int net_vhost_user_init(NetClientState *peer, const char *device, - const char *name, CharDriverState *chr, - uint32_t queues) + const char
[Qemu-devel] [PATCH 0/4] vhost-user: protocol updates
This patchset sets the stage for extending the vhost user protocol, with full backwards compatibility. The approach is to negotiate feature bits queried and acknowledged during device setup. For now, there's no new functionality: two new messages are added that will allow negotiating new messages required for functionality such as MQ and migration. For now, I used the feature bit 30 to signal support for these new messages, and we now have 64 more bits to play. The patches can be found in my tree, branch vhost-user. Only patch 1 is intended for 2.4. Posting early so people working on extensions such as migration can review this - but please note the protocol is not set in stone yet. Michael S. Tsirkin (4): Revert vhost-user: add multi queue support vhost-user: refactor ioctl translation vhost-user: add protocol feature negotiation vhost-user: unit test for new messages qapi-schema.json | 6 +- include/hw/virtio/vhost.h | 1 + hw/net/vhost_net.c| 5 +- hw/virtio/vhost-user.c| 150 ++ net/vhost-user.c | 37 tests/vhost-user-test.c | 19 ++ docs/specs/vhost-user.txt | 40 +++-- qemu-options.hx | 5 +- 8 files changed, 174 insertions(+), 89 deletions(-) -- MST
[Qemu-devel] [PATCH 3/4] vhost-user: add protocol feature negotiation
Support a separate bitmask for vhost-user protocol features, and messages to get/set protocol features. Invoke them at init. No features are defined yet. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- include/hw/virtio/vhost.h | 1 + hw/net/vhost_net.c| 2 ++ hw/virtio/vhost-user.c| 52 +++ docs/specs/vhost-user.txt | 37 + 4 files changed, 92 insertions(+) diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index dd51050..6467c73 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -47,6 +47,7 @@ struct vhost_dev { unsigned long long features; unsigned long long acked_features; unsigned long long backend_features; +unsigned long long protocol_features; bool started; bool log_enabled; unsigned long long log_size; diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 5c1d11f..c864237 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) net-dev.backend_features = qemu_has_vnet_hdr(options-net_backend) ? 0 : (1ULL VHOST_NET_F_VIRTIO_NET_HDR); net-backend = r; +net-dev.protocol_features = 0; } else { net-dev.backend_features = 0; +net-dev.protocol_features = 0; net-backend = -1; } net-nc = options-net_backend; diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 5a83d00..c4428a1 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -25,6 +25,9 @@ #define VHOST_MEMORY_MAX_NREGIONS8 +#define VHOST_USER_F_PROTOCOL_FEATURES 30 +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL + typedef enum VhostUserRequest { VHOST_USER_NONE = 0, VHOST_USER_GET_FEATURES = 1, @@ -41,6 +44,8 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_KICK = 12, VHOST_USER_SET_VRING_CALL = 13, VHOST_USER_SET_VRING_ERR = 14, +VHOST_USER_GET_PROTOCOL_FEATURES = 15, +VHOST_USER_SET_PROTOCOL_FEATURES = 26, VHOST_USER_MAX } VhostUserRequest; @@ -332,10 +337,57 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, static int vhost_user_init(struct vhost_dev *dev, void *opaque) { +VhostUserMsg msg = { 0 }; +int err; + assert(dev-vhost_ops-backend_type == VHOST_BACKEND_TYPE_USER); dev-opaque = opaque; +msg.request = VHOST_USER_GET_FEATURES; +msg.flags = VHOST_USER_VERSION; +msg.size = 0; + +err = vhost_user_write(dev, msg, NULL, 0); +if (err 0) { +return err; +} + +err = vhost_user_read(dev, msg); +if (err 0) { +return err; +} + +if (__virtio_has_feature(msg.u64, VHOST_USER_F_PROTOCOL_FEATURES)) { +dev-backend_features |= 1ULL VHOST_USER_F_PROTOCOL_FEATURES; + +msg.request = VHOST_USER_GET_PROTOCOL_FEATURES; +msg.flags = VHOST_USER_VERSION; +msg.size = 0; + +err = vhost_user_write(dev, msg, NULL, 0); +if (err 0) { +return err; +} + +err = vhost_user_read(dev, msg); +if (err 0) { +return err; +} + +dev-protocol_features = msg.u64 VHOST_USER_PROTOCOL_FEATURE_MASK; + +msg.request = VHOST_USER_SET_PROTOCOL_FEATURES; +msg.flags = VHOST_USER_VERSION; +msg.u64 = dev-protocol_features; +msg.size = sizeof msg.u64; + +err = vhost_user_write(dev, msg, NULL, 0); +if (err 0) { +return err; +} +} + return 0; } diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 650bb18..0062baa 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -113,6 +113,7 @@ message replies. Most of the requests don't require replies. Here is a list of the ones that do: * VHOST_GET_FEATURES + * VHOST_GET_PROTOCOL_FEATURES * VHOST_GET_VRING_BASE There are several messages that the master sends with file descriptors passed @@ -127,6 +128,13 @@ in the ancillary data: If Master is unable to send the full message or receives a wrong reply it will close the connection. An optional reconnection mechanism can be implemented. +Any protocol extensions are gated by protocol feature bits, +which allows full backwards compatibility on both master +and slave. +As older slaves don't support negotiating protocol features, +a feature bit was dedicated for this purpose: +#define VHOST_USER_F_PROTOCOL_FEATURES 30 + Message types - @@ -138,6 +146,8 @@ Message types Slave payload: u64 Get from the underlying vhost implementation the features bitmask. + Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for + VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES. * VHOST_USER_SET_FEATURES @@ -146,6 +156,33 @@ Message types Master payload: u64
[Qemu-devel] [PATCH 2/4] vhost-user: refactor ioctl translation
translate at the outset, have rest of code use vhost-user features exclusively. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/virtio/vhost-user.c | 87 ++ 1 file changed, 46 insertions(+), 41 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index e7ab829..5a83d00 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -89,35 +89,40 @@ static bool ioeventfd_enabled(void) return kvm_enabled() kvm_eventfds_enabled(); } -static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = { --1, /* VHOST_USER_NONE */ -VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ -VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ -VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ -VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ -VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */ -VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */ -VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */ -VHOST_SET_VRING_NUM,/* VHOST_USER_SET_VRING_NUM */ -VHOST_SET_VRING_ADDR, /* VHOST_USER_SET_VRING_ADDR */ -VHOST_SET_VRING_BASE, /* VHOST_USER_SET_VRING_BASE */ -VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */ -VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */ -VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */ -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */ -}; - static VhostUserRequest vhost_user_request_translate(unsigned long int request) { -VhostUserRequest idx; - -for (idx = 0; idx VHOST_USER_MAX; idx++) { -if (ioctl_to_vhost_user_request[idx] == request) { -break; -} +switch (request) { +case VHOST_GET_FEATURES: +return VHOST_USER_GET_FEATURES; +case VHOST_SET_FEATURES: +return VHOST_USER_SET_FEATURES; +case VHOST_SET_OWNER: +return VHOST_USER_SET_OWNER; +case VHOST_RESET_OWNER: +return VHOST_USER_RESET_OWNER; +case VHOST_SET_MEM_TABLE: +return VHOST_USER_SET_MEM_TABLE; +case VHOST_SET_LOG_BASE: +return VHOST_USER_SET_LOG_BASE; +case VHOST_SET_LOG_FD: +return VHOST_USER_SET_LOG_FD; +case VHOST_SET_VRING_NUM: +return VHOST_USER_SET_VRING_NUM; +case VHOST_SET_VRING_ADDR: +return VHOST_USER_SET_VRING_ADDR; +case VHOST_SET_VRING_BASE: +return VHOST_USER_SET_VRING_BASE; +case VHOST_GET_VRING_BASE: +return VHOST_USER_GET_VRING_BASE; +case VHOST_SET_VRING_KICK: +return VHOST_USER_SET_VRING_KICK; +case VHOST_SET_VRING_CALL: +return VHOST_USER_SET_VRING_CALL; +case VHOST_SET_VRING_ERR: +return VHOST_USER_SET_VRING_ERR; +default: +return VHOST_USER_MAX; } - -return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx; } static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg) @@ -198,22 +203,22 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, msg.flags = VHOST_USER_VERSION; msg.size = 0; -switch (request) { -case VHOST_GET_FEATURES: +switch (msg_request) { +case VHOST_USER_GET_FEATURES: need_reply = 1; break; -case VHOST_SET_FEATURES: -case VHOST_SET_LOG_BASE: +case VHOST_USER_SET_FEATURES: +case VHOST_USER_SET_LOG_BASE: msg.u64 = *((__u64 *) arg); msg.size = sizeof(m.u64); break; -case VHOST_SET_OWNER: -case VHOST_RESET_OWNER: +case VHOST_USER_SET_OWNER: +case VHOST_USER_RESET_OWNER: break; -case VHOST_SET_MEM_TABLE: +case VHOST_USER_SET_MEM_TABLE: for (i = 0; i dev-mem-nregions; ++i) { struct vhost_memory_region *reg = dev-mem-regions + i; ram_addr_t ram_addr; @@ -246,30 +251,30 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, break; -case VHOST_SET_LOG_FD: +case VHOST_USER_SET_LOG_FD: fds[fd_num++] = *((int *) arg); break; -case VHOST_SET_VRING_NUM: -case VHOST_SET_VRING_BASE: +case VHOST_USER_SET_VRING_NUM: +case VHOST_USER_SET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); msg.size = sizeof(m.state); break; -case VHOST_GET_VRING_BASE: +case VHOST_USER_GET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); msg.size = sizeof(m.state); need_reply = 1; break; -case VHOST_SET_VRING_ADDR: +case VHOST_USER_SET_VRING_ADDR: memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr)); msg.size = sizeof(m.addr); break; -case VHOST_SET_VRING_KICK: -case VHOST_SET_VRING_CALL: -case VHOST_SET_VRING_ERR: +case VHOST_USER_SET_VRING_KICK: +case VHOST_USER_SET_VRING_CALL: +case VHOST_USER_SET_VRING_ERR: file = arg;
[Qemu-devel] [PATCH 4/4] vhost-user: unit test for new messages
Data is empty for now, but do make sure master sets the new feature bit flag. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- tests/vhost-user-test.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c index 75fedf0..228acb6 100644 --- a/tests/vhost-user-test.c +++ b/tests/vhost-user-test.c @@ -53,6 +53,8 @@ #define VHOST_MEMORY_MAX_NREGIONS8 +#define VHOST_USER_F_PROTOCOL_FEATURES 30 + typedef enum VhostUserRequest { VHOST_USER_NONE = 0, VHOST_USER_GET_FEATURES = 1, @@ -69,6 +71,8 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_KICK = 12, VHOST_USER_SET_VRING_CALL = 13, VHOST_USER_SET_VRING_ERR = 14, +VHOST_USER_GET_PROTOCOL_FEATURES = 15, +VHOST_USER_SET_PROTOCOL_FEATURES = 16, VHOST_USER_MAX } VhostUserRequest; @@ -293,11 +297,26 @@ static void chr_read(void *opaque, const uint8_t *buf, int size) /* send back features to qemu */ msg.flags |= VHOST_USER_REPLY_MASK; msg.size = sizeof(m.u64); +msg.u64 = 0x1ULL VHOST_USER_F_PROTOCOL_FEATURES; +p = (uint8_t *) msg; +qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); +break; + +case VHOST_USER_SET_FEATURES: + g_assert_cmpint(msg.u64 (0x1ULL VHOST_USER_F_PROTOCOL_FEATURES), + !=, 0ULL); +break; + +case VHOST_USER_GET_PROTOCOL_FEATURES: +/* send back features to qemu */ +msg.flags |= VHOST_USER_REPLY_MASK; +msg.size = sizeof(m.u64); msg.u64 = 0; p = (uint8_t *) msg; qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); break; + case VHOST_USER_GET_VRING_BASE: /* send back vring base to qemu */ msg.flags |= VHOST_USER_REPLY_MASK; -- MST
[Qemu-devel] [PATCH] Merge memory_region_init_reservation() into memory_region_init_io()
Just speficying ops = NULL in some cases can be more convenient than having two functions. See vGICv3 v5 patch set for at least one such case. GICv2 code can be refactored in a way similar to what is done in the mentioned patchset, killing some code duplication. Signed-off-by: Pavel Fedin p.fe...@samsung.com --- include/exec/memory.h | 14 +++--- memory.c | 10 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 1394715..3b5e44e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -436,6 +436,9 @@ void memory_region_init_alias(MemoryRegion *mr, * memory_region_init_rom_device: Initialize a ROM memory region. Writes are * handled via callbacks. * + * If NULL callbacks pointer is given, then I/O space is not supposed to be + * handled by QEMU itself. Any access via the memory API will cause an abort(). + * * @mr: the #MemoryRegion to be initialized. * @owner: the object that tracks the region's reference count * @ops: callbacks for write access handling. @@ -458,16 +461,21 @@ void memory_region_init_rom_device(MemoryRegion *mr, * A reservation region primariy serves debugging purposes. It claims I/O * space that is not supposed to be handled by QEMU itself. Any access via * the memory API will cause an abort(). + * This function is deprecated. Use memory_region_init_io() with NULL + * callbacks instead. * * @mr: the #MemoryRegion to be initialized * @owner: the object that tracks the region's reference count * @name: used for debugging; not visible to the user or ABI * @size: size of the region. */ -void memory_region_init_reservation(MemoryRegion *mr, -struct Object *owner, +static inline void memory_region_init_reservation(MemoryRegion *mr, +Object *owner, const char *name, -uint64_t size); +uint64_t size) +{ +memory_region_init_io(mr, owner, NULL, mr, name, size); +} /** * memory_region_init_iommu: Initialize a memory region that translates diff --git a/memory.c b/memory.c index 0acebb1..b843952 100644 --- a/memory.c +++ b/memory.c @@ -1187,7 +1187,7 @@ void memory_region_init_io(MemoryRegion *mr, uint64_t size) { memory_region_init(mr, owner, name, size); -mr-ops = ops; +mr-ops = ops ? ops : unassigned_mem_ops; mr-opaque = opaque; mr-terminates = true; } @@ -1307,14 +1307,6 @@ void memory_region_init_iommu(MemoryRegion *mr, notifier_list_init(mr-iommu_notify); } -void memory_region_init_reservation(MemoryRegion *mr, -Object *owner, -const char *name, -uint64_t size) -{ -memory_region_init_io(mr, owner, unassigned_mem_ops, mr, name, size); -} - static void memory_region_finalize(Object *obj) { MemoryRegion *mr = MEMORY_REGION(obj); -- 1.9.5.msysgit.0
Re: [Qemu-devel] [PATCH for-2.4 1/4] Revert vhost-user: add multi queue support
On 07/17/2015 08:09 AM, Michael S. Tsirkin wrote: This reverts commit 830d70db692e374b5f4407f96a1ceefdcc97. The interface as merged isn't fully backwards-compatible with existing clients. Revert it. Let's redo this after 2.4, based on protocol extensions in follow-up patches. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qapi-schema.json | 6 +- hw/net/vhost_net.c| 3 +-- hw/virtio/vhost-user.c| 11 +-- net/vhost-user.c | 37 + docs/specs/vhost-user.txt | 5 - qemu-options.hx | 5 ++--- 6 files changed, 18 insertions(+), 49 deletions(-) Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, 17 Jul 2015 15:04:27 +0100 Paolo Bonzini pbonz...@redhat.com wrote: On 17/07/2015 15:28, Marc Zyngier wrote: Marc, does it ring any bell? Well, this is an example of a guest accessing non-memory using an instruction that we cannot safely emulate - not an IO accessor (load multiple, for example). In this case, we kill the guest (we could as well inject a fault). Yup, I later found this in the dmesg: [1875219.903969] kvm [14843]: load/store instruction decoding not implemented This vcpu seems to be accessing 0x918 (the mmio structure points there), but I can't immediately relate it to the content of the registers. 0x918 is the pl011, which makes some sense. Have you ever seen a corrupted register dump? The guest RAM goes from 0x4000 to 0xbfff, so SP is pointing outside memory. I've never seen such a corruption - so far. PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) If the register dump is not corrupted, execution went in the weeds... I don't have the guest anymore but it's just firmware so the memory contents are well reproducible: 0xbf671200: a97d6ffa ldmdbgesp!, {r1, r3, r4, r5, r6, r7, r8, r9, sl, fp, sp, lr}^ 0xbf671204: a97e77fc ldmdbgelr!, {r2, r3, r4, r5, r6, r7, r8, r9, sl, ip, sp, lr}^ 0xbf671208: f85f03fe undefined instruction 0xf85f03fe 0xbf67120c: 910803ff strdls r0, [r8, -pc] 0xbf671210: ad7007e0 ldclge 7, cr0, [r0, #-896]! 0xbf671214: ad710fe2 ldclge 15, cr0, [r1, #-904]! 0xbf671218: ad7217e4 ldclge 7, cr1, [r2, #-912]! 0xbf67121c: ad731fe6 ldclge 15, cr1, [r3, #-920]! 0xbf671220: ad7427e8 ldclge 7, cr2, [r4, #-928]! 0xbf671224: ad752fea ldclge 15, cr2, [r5, #-936]! But that's all 32bit code, and your guest was running in 64bit. What does it look like as A64? What looks a bit weird is that all the registers are clamped to 32bit, but the PSTATE indicates it is running in 64bit (EL1h, which makes the PC being utterly wrong). The PC can be okay since UEFI code doesn't really use virtual addressing, but the other registers are weird indeed. It definitely looks like something tramped on your register file. KVM doesn't do that at all (we use the whole AArch64 register file irrespective of the execution state). Is your UEFI guest 32 or 64bit? M. -- Jazz is not dead. It just smells funny.
[Qemu-devel] [PATCH v2 2/2] blockdev: always compile in -drive aio= parsing
CONFIG_LINUX_AIO is an implementation detail of raw-posix.c. Don't mention CONFIG_LINUX_AIO in blockdev.c. Let raw-posix.c decide what to do with BDRV_O_NATIVE_AIO if CONFIG_LINUX_AIO is not defined. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index 62a4586..37b91c8 100644 --- a/blockdev.c +++ b/blockdev.c @@ -405,7 +405,6 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts, bdrv_flags |= BDRV_O_NO_FLUSH; } -#ifdef CONFIG_LINUX_AIO if ((buf = qemu_opt_get(opts, aio)) != NULL) { if (!strcmp(buf, native)) { bdrv_flags |= BDRV_O_NATIVE_AIO; @@ -416,7 +415,6 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts, goto early_err; } } -#endif if ((buf = qemu_opt_get(opts, format)) != NULL) { if (is_help_option(buf)) { -- 2.4.3
[Qemu-devel] [PATCH v2 0/2] block: warn about aio=native if libaio is unavailable
v2: * Banish CONFIG_LINUX_AIO from blockdev.c, that is raw-posix.c's business [Kevin] * Print the warning in the same way as the aio=native,cache.direct=off deprecation warning [Kevin] Open question: what about the Windows case? We now pass the FILE_FLAG_OVERLAPPED flag which we didn't do before for -drive aio=native. Stefan Hajnoczi (2): raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable blockdev: always compile in -drive aio= parsing block/raw-posix.c | 11 ++- blockdev.c| 2 -- 2 files changed, 10 insertions(+), 3 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH v2 1/2] raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable
raw-posix.c silently ignores BDRV_O_NATIVE_AIO if libaio is unavailable. It is confusing when aio=native performance is identical to aio=threads because the binary was accidentally built without libaio. Print a deprecation warning if -drive aio=native is used with a binary that does not support libaio. There are probably users using aio=native who would be inconvenienced if QEMU suddenly refused to start their guests. In the future this will become an error. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/raw-posix.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index 855febe..e09019c 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -519,7 +519,16 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, future QEMU versions.\n, bs-filename); } -#endif +#else +if (bdrv_flags BDRV_O_NATIVE_AIO) { +error_printf(WARNING: aio=native was specified for '%s', but + is not supported in this build. Falling back to + aio=threads.\n + This will become an error condition in + future QEMU versions.\n, + bs-filename); +} +#endif /* !defined(CONFIG_LINUX_AIO) */ s-has_discard = true; s-has_write_zeroes = true; -- 2.4.3
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On 07/17/15 15:28, Marc Zyngier wrote: On Fri, 17 Jul 2015 10:30:38 +0100 Paolo Bonzini pbonz...@redhat.com wrote: On 17/07/2015 06:44, Paolo Bonzini wrote: On 16/07/2015 21:05, Richard W.M. Jones wrote: Sorry to spoil things, but I'm still seeing this bug, although it is now a lot less frequent with your patch. I would estimate it happens more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs with qemu.git + the v2 patch series. It's the exact same hang in both cases. Is it possible that this patch doesn't completely close any race? Still, it is an improvement, so there is that. Would seem at first glance like a different bug. Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug more likely: now it reproduces in about 10 tries. Of course :) adding other kinds of tracing instead make it go away again (50 tries). Perhaps this: i/o thread vcpu thread worker thread - lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh-scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll ^^ hang In the exact shape above, it doesn't seem too likely to happen, but perhaps there's another simpler case. Still, the bug exists. The above is not really related to notify_me. Here the notification is not being optimized away! So I wonder if this one has been there forever. Fam suggested putting the event_notifier_test_and_clear before aio_bh_poll(), but it does not work. I'll look more close However, an unconditional event_notifier_test_and_clear is pretty expensive. On one hand, obviously correctness comes first. On the other hand, an expensive operation at the wrong place can mask the race very easily; I'll let the fix run for a while, but I'm not sure if a successful test really says anything useful. So it may not be useful, but still successful test is successful. :) The following patch, which also includes the delta between v2 and v3 of this series, survived 674 reboots before hitting a definitely unrelated problem: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) For the record, this is the kvm_run struct: $6 = {request_interrupt_window = 0 '\000', padding1 = \000\000\000\000\000\000, exit_reason = 0, ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = { hardware_exit_reason = 150994968}, fail_entry = {hardware_entry_failure_reason = 150994968}, ex = { exception = 150994968, error_code = 0}, io = {direction = 24 '\030', size = 0 '\000', port = 2304, count = 0, data_offset = 144}, debug = {arch = {No data fields}}, mmio = {phys_addr = 150994968, data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, hypercall = {nr = 150994968, args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, tpr_access = {rip = 150994968, is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 2304, ipb = 0}, s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144}, dcr = { dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = {suberror = 150994968, ndata = 0, data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 4, 0 repeats 29 times}}, papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = { subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word = 144, ipb = 0, dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 150994968, flags = 144}, s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc = 0 '\000', sel1 = 0 '\000', sel2 = 0}, padding = \030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004, '\000' repeats 238 times}, kvm_valid_regs = 0,
[Qemu-devel] STM32F407-Discovery led blinking video
as planned, the GNU ARM Eclipse QEMU branch is now fully functional for simple LED blinking applications. the board image is displayed in a SDL window and the LEDs are coloured rectangles shown over the image. a short demo video is available from: http://gnuarmeclipse.livius.net/mw/images/f/fa/Stm32f4-discovery.mov regards, Liviu
Re: [Qemu-devel] [PATCH v2] raw-posix.c: Make physical devices usable in QEMU under Mac OS X host
On Thu, Jul 16, 2015 at 04:46:07PM -0400, Programmingkid wrote: @@ -2014,7 +2015,9 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex ma if ( bsdPathAsCFString ) { size_t devPathLength; strcpy( bsdPath, _PATH_DEV ); -strcat( bsdPath, r ); +if (flags BDRV_O_NOCACHE) { +strcat(bsdPath, r); +} devPathLength = strlen( bsdPath ); if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) { kernResult = KERN_SUCCESS; Is this the fix that makes CD-ROM passthrough work for you? Does the guest boot successfully when you do: -drive if=ide,media=cdrom,cache=none,file=/dev/cdrom ? @@ -2027,7 +2030,67 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex ma return kernResult; } -#endif +/* Sets up a physical device for use in QEMU */ +static void setupDevice(const char *bsdPath) +{ + /* +* Mac OS X does not like allowing QEMU to use physical devices that are +* mounted. Attempts to do so result in 'Resource busy' errors. +*/ + +int fd; +fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE); + +/* if the device fails to open */ +if (fd 0) { +printf(Error: failed to open %s\n, bsdPath); +printf(If device %s is mounted on the desktop, unmount it +first before using it in QEMU.\n, bsdPath); +printf(\nCommand to unmount device: diskutil unmountDisk %s, bsdPath); +printf(\nCommand to mount device: diskutil mountDisk %s\n\n, bsdPath); +} This error message is printed regardless of the errno value. What is the specific errno value when open(2) fails because the device is mounted? Also, can you move this after the raw_open_common() call to avoid the setupCDROM()/setupDevice() changes made in this patch and duplicating the error message. It doesn't seem necessary to open the files ahead of raw_open_common() since the code continues in the error case anyway: ret = raw_open_common(bs, options, flags, 0, local_err); if (ret 0) { if (local_err) { error_propagate(errp, local_err); } #if defined(__APPLE__) defined(__MACH__) if (strstart(filename, /dev/) == 0 ret == -EBUSY) { /* or whatever */ error_report(If device %s is mounted on the desktop, unmount it first before using it in QEMU., bsdPath); error_report(Command to unmount device: diskutil unmountDisk %s, bsdPath); error_report(Command to mount device: diskutil mountDisk %s, bsdPath); } #endif /* defined(__APPLE__) defined(__MACH__) */ return ret; } That's a much smaller change. + +/* if the device opens */ +else { +qemu_close(fd); +} +} + +/* Sets up a real cdrom for use in QEMU */ +static void setupCDROM(char *bsdPath) +{ +int index, numOfTestPartitions = 2, fd; +char testPartition[MAXPATHLEN]; +bool partitionFound = false; + +/* look for a working partition */ +for (index = 0; index numOfTestPartitions; index++) { +strncpy(testPartition, bsdPath, MAXPATHLEN); The safe way to use strncpy() is: strncpy(testPartition, bsdPath, MAXPATHLEN - 1); testPartition[MAXPATHLEN - 1] = '\0'; but pstrcpy() is a easier to use correctly. Please use that instead. +snprintf(testPartition, MAXPATHLEN, %ss%d, testPartition, index); +fd = qemu_open(testPartition, O_RDONLY | O_BINARY | O_LARGEFILE); +if (fd 0) { Should be fd = 0 since fd = 0 is valid too, although unlikely. +partitionFound = true; +qemu_close(fd); +break; +} +} + +/* if a working partition on the device was not found */ +if (partitionFound == false) { +printf(Error: Failed to find a working partition on disc!\n); +printf(If your disc is mounted on the desktop, trying unmounting it +first before using it in QEMU.\n); +printf(\nCommand to unmount disc: +diskutil unmountDisk %s\n, bsdPath); +printf(Command to mount disc: + diskutil mountDisk %s\n\n, bsdPath); +} + +DPRINTF(Using %s as CDROM\n, testPartition); +strncpy(bsdPath, testPartition, MAXPATHLEN); Please use pstrcpy(). +} + +#endif /* defined(__APPLE__) defined(__MACH__) */ static int hdev_probe_device(const char *filename) { @@ -2119,30 +2182,28 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags, #if defined(__APPLE__) defined(__MACH__) const char *filename = qdict_get_str(options, filename); -if (strstart(filename, /dev/cdrom, NULL)) { -kern_return_t kernResult; -io_iterator_t mediaIterator; -char
Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
On Fri, Jul 17, 2015 at 05:13:37PM +1000, Alexey Kardashevskiy wrote: On 07/16/2015 03:11 PM, David Gibson wrote: On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote: This makes use of the new memory registering feature. The idea is to provide the userspace ability to notify the host kernel about pages which are going to be used for DMA. Having this information, the host kernel can pin them all once per user process, do locked pages accounting (once) and not spent time on doing that in real time with possible failures which cannot be handled nicely in some cases. This adds a guest RAM memory listener which notifies a VFIO container about memory which needs to be pinned/unpinned. VFIO MMIO regions (i.e. skip dump regions) are skipped. The feature is only enabled for SPAPR IOMMU v2. The host kernel changes are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does not call it when v2 is detected and enabled. This does not change the guest visible interface. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru I've looked at this in more depth now, and attempting to unify the pre-reg and mapping listeners like this can't work - they need to be listening on different address spaces: mapping actions need to be listening on the PCI address space, whereas the pre-reg needs to be listening on address_space_memory. For x86 - for now - those end up being the same thing, but on Power they're not. We do need to be clear about what differences are due to the presence of a guest IOMMU versus which are due to arch or underlying IOMMU type. For now Power has a guest IOMMU and x86 doesn't, but that could well change in future: we could well implement the guest side IOMMU for x86 in future (or x86 could invent a paravirt IOMMU interface). On the other side, BenH's experimental powernv machine type could introduce Power machines without a guest side IOMMU (or at least an optional guest side IOMMU). The quick and dirty approach here is: 1. Leave the main listener as is 2. Add a new pre-reg notifier to the spapr iommu specific code, which listens on address_space_memory, *not* the PCI space The more generally correct approach, which allows for more complex IOMMU arrangements and the possibility of new IOMMU types with pre-reg is: 1. Have the core implement both a mapping listener and a pre-reg listener (optionally enabled by a per-iommu-type flag). Basically the first one sees what *is* mapped, the second sees what *could* be mapped. 2. As now, the mapping listener listens on PCI address space, if RAM blocks are added, immediately map them into the host IOMMU, if guest IOMMU blocks appear register a notifier which will mirror guest IOMMU mappings to the host IOMMU (this is what we do now). 3. The pre-reg listener also listens on the PCI address space. RAM blocks added are pre-registered immediately. PCI address space listeners won't be notified about RAM blocks on sPAPR. Sure they will - if any RAM blocks were mapped directly into PCI address space, the listener would be notified. It's just that no RAM blocks are directly mapped into PCI space, only partially mapped in via IOMMU blocks. But the idea is this scheme could handle a platform that has both a bypass DMA window which maps directly onto a block of ram and an IOMMU controlled DMA window. Or one which could have either setup depending on circumstances (which is probably true of BenH's powernv machine type). But, if guest IOMMU blocks are added, instead of registering a guest-iommu notifier, guest-iommu notifier is the one called via memory_region_notify_iommu() from H_PUT_TCE? Instead implies dropping it, how this can work? Because the other listener - the mapping listener at (2) handles that part. The pre-reg listener doesn't. But as noted in by other mail this whole scheme doesn't work without a way to discover an IOMMU region's target AS in advance, which doesn't currently exist. we register another listener on the *target* AS of the guest IOMMU, same callbacks as this one. In practice that target AS will almost always resolve to address_space_memory, but this can at least in theory handle crazy guest setups with multiple layers of IOMMU. 4. Have to ensure that the pre-reg callbacks always happen before the mapping calls. For a system with an IOMMU backend which requires pre-registration, but doesn't have a guest IOMMU, we need to pre-reg, then host-iommu-map RAM blocks that appear in PCI address space. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgp019P3Hv12i.pgp Description: PGP
Re: [Qemu-devel] [PATCH v2 00/13] tcg/sparc v8plus code generation
On 2015-07-17 12:23, Aurelien Jarno wrote: On 2015-07-16 22:29, Richard Henderson wrote: On 07/15/2015 09:54 PM, Aurelien Jarno wrote: While I understand why we need the new trunc_shr_i32 opcode for MIPS64 (the 32-bit values must be kept sign-extended), I currently fail to see why it is needed for SPARC. As far as I recall, it improves code for extracting high parts of 64-bit quantities. Without this, we wind up with a 64-bit shift, requiring a 64-bit temp register, followed by the real truncate which can copy the data to a 32-bit destination register. Ok, I understand the use case now. So it's not for correctness, but rather to generate more optimized code. OTOH, it means that we always have to go through a 32-bit register first when truncating a 64-bit value. I mean we gain in the following case: shr_i64 t64, t64, i trunc_i64_i32 t32, t64 ... But we lose in the following case: trunc_i64_i32 t32, t64 neg t32, t32 ... Overall I guess the advantages far outweigh the disadvantages. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [RFC v3 08/13] exec.c: introduce a simple rendezvous support
Alvise Rigo a.r...@virtualopensystems.com writes: When a vCPU is about to set a memory page as exclusive, it needs to wait that all the running vCPUs finish to execute the current TB and to be aware of the exact moment when that happens. For this, add a simple rendezvous mechanism that will be used in softmmu_llsc_template.h to implement the ldlink operation. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- cpus.c| 5 + exec.c| 45 + include/qom/cpu.h | 16 3 files changed, 66 insertions(+) diff --git a/cpus.c b/cpus.c index aee445a..f4d938e 100644 --- a/cpus.c +++ b/cpus.c @@ -1423,6 +1423,11 @@ static int tcg_cpu_exec(CPUArchState *env) qemu_mutex_unlock_iothread(); ret = cpu_exec(env); cpu-tcg_executing = 0; + +if (unlikely(cpu-pending_rdv)) { +cpu_exit_do_rendezvous(cpu); +} + I'll ignore this stuff for now as I assume we can all use the async work patch of Fred's? qemu_mutex_lock_iothread(); #ifdef CONFIG_PROFILER tcg_time += profile_getclock() - ti; diff --git a/exec.c b/exec.c index 964e922..51958ed 100644 --- a/exec.c +++ b/exec.c @@ -746,6 +746,51 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask) } } +/* Rendezvous implementation. + * The corresponding definitions are in include/qom/cpu.h. */ +CpuExitRendezvous cpu_exit_rendezvous; +inline void cpu_exit_init_rendezvous(void) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +rdv-attendees = 0; +} + +inline void cpu_exit_rendezvous_add_attendee(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +if (!cpu-pending_rdv) { +cpu-pending_rdv = 1; +atomic_inc(rdv-attendees); +} +} + +void cpu_exit_do_rendezvous(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +atomic_dec(rdv-attendees); + +cpu-pending_rdv = 0; +} + +void cpu_exit_rendezvous_wait_others(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +while (rdv-attendees) { +g_usleep(TCG_RDV_POLLING_PERIOD); +} +} + +void cpu_exit_rendezvous_release(void) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +rdv-attendees = 0; +} + /* enable or disable single step mode. EXCP_DEBUG is returned by the CPU loop after each instruction */ void cpu_single_step(CPUState *cpu, int enabled) diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 8f3fe56..8d121b3 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -201,6 +201,20 @@ typedef struct CPUWatchpoint { QTAILQ_ENTRY(CPUWatchpoint) entry; } CPUWatchpoint; +/* Rendezvous support */ +#define TCG_RDV_POLLING_PERIOD 10 +typedef struct CpuExitRendezvous { +volatile int attendees; +QemuMutex lock; +} CpuExitRendezvous; + +extern CpuExitRendezvous cpu_exit_rendezvous; +void cpu_exit_init_rendezvous(void); +void cpu_exit_rendezvous_add_attendee(CPUState *cpu); +void cpu_exit_do_rendezvous(CPUState *cpu); +void cpu_exit_rendezvous_wait_others(CPUState *cpu); +void cpu_exit_rendezvous_release(void); + struct KVMState; struct kvm_run; @@ -291,6 +305,8 @@ struct CPUState { void *opaque; +volatile int pending_rdv; + I will however echo the hmmm on Fred's patch about the optimistic use of volatile here. As Peter mentioned it is a red flag and I would prefer explicit memory consistency behaviour used and documented. /* In order to avoid passing too many arguments to the MMIO helpers, * we store some rarely used information in the CPU context. */ -- Alex Bennée
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, 17 Jul 2015 14:39:55 +0100 Laszlo Ersek ler...@redhat.com wrote: On 07/17/15 15:28, Marc Zyngier wrote: On Fri, 17 Jul 2015 10:30:38 +0100 Paolo Bonzini pbonz...@redhat.com wrote: On 17/07/2015 06:44, Paolo Bonzini wrote: On 16/07/2015 21:05, Richard W.M. Jones wrote: Sorry to spoil things, but I'm still seeing this bug, although it is now a lot less frequent with your patch. I would estimate it happens more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs with qemu.git + the v2 patch series. It's the exact same hang in both cases. Is it possible that this patch doesn't completely close any race? Still, it is an improvement, so there is that. Would seem at first glance like a different bug. Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug more likely: now it reproduces in about 10 tries. Of course :) adding other kinds of tracing instead make it go away again (50 tries). Perhaps this: i/o thread vcpu thread worker thread - lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh-scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll ^^ hang In the exact shape above, it doesn't seem too likely to happen, but perhaps there's another simpler case. Still, the bug exists. The above is not really related to notify_me. Here the notification is not being optimized away! So I wonder if this one has been there forever. Fam suggested putting the event_notifier_test_and_clear before aio_bh_poll(), but it does not work. I'll look more close However, an unconditional event_notifier_test_and_clear is pretty expensive. On one hand, obviously correctness comes first. On the other hand, an expensive operation at the wrong place can mask the race very easily; I'll let the fix run for a while, but I'm not sure if a successful test really says anything useful. So it may not be useful, but still successful test is successful. :) The following patch, which also includes the delta between v2 and v3 of this series, survived 674 reboots before hitting a definitely unrelated problem: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) For the record, this is the kvm_run struct: $6 = {request_interrupt_window = 0 '\000', padding1 = \000\000\000\000\000\000, exit_reason = 0, ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = { hardware_exit_reason = 150994968}, fail_entry = {hardware_entry_failure_reason = 150994968}, ex = { exception = 150994968, error_code = 0}, io = {direction = 24 '\030', size = 0 '\000', port = 2304, count = 0, data_offset = 144}, debug = {arch = {No data fields}}, mmio = {phys_addr = 150994968, data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, hypercall = {nr = 150994968, args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, tpr_access = {rip = 150994968, is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 2304, ipb = 0}, s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144}, dcr = { dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = {suberror = 150994968, ndata = 0, data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 4, 0 repeats 29 times}}, papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = { subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word = 144, ipb = 0, dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 150994968, flags = 144}, s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000',
[Qemu-devel] [PULL for-2.4 1/1] virtio-rng: trigger timer only when guest requests for entropy
From: Pankaj Gupta pagu...@redhat.com This patch triggers timer only when guest requests for entropy. As soon as first request from guest for entropy comes we set the timer. Timer bumps up the quota value when it gets triggered. Signed-off-by: Pankaj Gupta pagu...@redhat.com Reviewed-by: Amit Shah amit.s...@redhat.com Message-Id: 1436962608-9961-2-git-send-email-pagu...@redhat.com [Re-worded patch subject, removed extra whitespace -- Amit] Signed-off-by: Amit Shah amit.s...@redhat.com --- hw/virtio/virtio-rng.c | 15 --- include/hw/virtio/virtio-rng.h | 1 + 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c index 740ed31..6e5f022 100644 --- a/hw/virtio/virtio-rng.c +++ b/hw/virtio/virtio-rng.c @@ -77,6 +77,12 @@ static void virtio_rng_process(VirtIORNG *vrng) return; } +if (vrng-activate_timer) { +timer_mod(vrng-rate_limit_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vrng-conf.period_ms); +vrng-activate_timer = false; +} + if (vrng-quota_remaining 0) { quota = 0; } else { @@ -138,8 +144,7 @@ static void check_rate_limit(void *opaque) vrng-quota_remaining = vrng-conf.max_bytes; virtio_rng_process(vrng); -timer_mod(vrng-rate_limit_timer, - qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vrng-conf.period_ms); +vrng-activate_timer = true; } static void virtio_rng_device_realize(DeviceState *dev, Error **errp) @@ -195,13 +200,9 @@ static void virtio_rng_device_realize(DeviceState *dev, Error **errp) vrng-vq = virtio_add_queue(vdev, 8, handle_input); vrng-quota_remaining = vrng-conf.max_bytes; - vrng-rate_limit_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, check_rate_limit, vrng); - -timer_mod(vrng-rate_limit_timer, - qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vrng-conf.period_ms); - +vrng-activate_timer = true; register_savevm(dev, virtio-rng, -1, 1, virtio_rng_save, virtio_rng_load, vrng); } diff --git a/include/hw/virtio/virtio-rng.h b/include/hw/virtio/virtio-rng.h index 0316488..3f07de7 100644 --- a/include/hw/virtio/virtio-rng.h +++ b/include/hw/virtio/virtio-rng.h @@ -44,6 +44,7 @@ typedef struct VirtIORNG { */ QEMUTimer *rate_limit_timer; int64_t quota_remaining; +bool activate_timer; } VirtIORNG; #endif -- 2.4.3
[Qemu-devel] [PULL for-2.4 0/1] virtio-rng: reduce wakeups
The following changes since commit 5b5e8cdd7da7a2214dd062afff5b866234aab228: Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20150717-1' into staging (2015-07-17 12:39:12 +0100) are available in the git repository at: https://git.kernel.org/pub/scm/virt/qemu/amit/virtio-rng.git tags/vrng-2.4 for you to fetch changes up to 621a20e08155179b1902c428361e80f41429f50d: virtio-rng: trigger timer only when guest requests for entropy (2015-07-17 19:05:16 +0530) Fire timer only when required. Brings down wakeups by a big number. Pankaj Gupta (1): virtio-rng: trigger timer only when guest requests for entropy hw/virtio/virtio-rng.c | 15 --- include/hw/virtio/virtio-rng.h | 1 + 2 files changed, 9 insertions(+), 7 deletions(-) -- 2.4.3
Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration
On 17/07/2015 15:35, Marc-André Lureau wrote: LOG_FD is implemented in the kernel drivers/vhost/vhost.c. It seems to be an eventfd-like mechanism to save on dirty bitmap scans. However, it's not well documented how to implement it in a correct way. and it's not used by qemu, so hard to say if it actually work well. In any case, live migration needs a new message type (like LOG_MMAP_FD) in the vhost-user protocol. Yes, perhaps with size and offset. I am looking at this. The offset should be 0... Do you know the size of the ram_addr_t space from VHOST_USER_SET_MEM_TABLE's user address and size fields? If the size isn't needed, you can reuse LOG_BASE, ignoring the content of the payload and adding the SCM_RIGHTS file descriptor. Paolo
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote: Still: there is nothing in the registers that remotely points to that area. X0 is the closest, but it'd take a big negative offset to get there. Is that a Linux kernel? or something else? You're sure it's not this one? https://bugzilla.redhat.com/show_bug.cgi?id=1194366 That was caused by ftrace screwing up guest memory, so it was effectively running random code. It is also fixed (by you in fact). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Re: [Qemu-devel] [RFC v3 08/13] exec.c: introduce a simple rendezvous support
On Fri, Jul 17, 2015 at 3:45 PM, Alex Bennée alex.ben...@linaro.org wrote: Alvise Rigo a.r...@virtualopensystems.com writes: When a vCPU is about to set a memory page as exclusive, it needs to wait that all the running vCPUs finish to execute the current TB and to be aware of the exact moment when that happens. For this, add a simple rendezvous mechanism that will be used in softmmu_llsc_template.h to implement the ldlink operation. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- cpus.c| 5 + exec.c| 45 + include/qom/cpu.h | 16 3 files changed, 66 insertions(+) diff --git a/cpus.c b/cpus.c index aee445a..f4d938e 100644 --- a/cpus.c +++ b/cpus.c @@ -1423,6 +1423,11 @@ static int tcg_cpu_exec(CPUArchState *env) qemu_mutex_unlock_iothread(); ret = cpu_exec(env); cpu-tcg_executing = 0; + +if (unlikely(cpu-pending_rdv)) { +cpu_exit_do_rendezvous(cpu); +} + I'll ignore this stuff for now as I assume we can all use the async work patch of Fred's? Yes, it will be more likely based on the plain async_run_on_cpu. qemu_mutex_lock_iothread(); #ifdef CONFIG_PROFILER tcg_time += profile_getclock() - ti; diff --git a/exec.c b/exec.c index 964e922..51958ed 100644 --- a/exec.c +++ b/exec.c @@ -746,6 +746,51 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask) } } +/* Rendezvous implementation. + * The corresponding definitions are in include/qom/cpu.h. */ +CpuExitRendezvous cpu_exit_rendezvous; +inline void cpu_exit_init_rendezvous(void) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +rdv-attendees = 0; +} + +inline void cpu_exit_rendezvous_add_attendee(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +if (!cpu-pending_rdv) { +cpu-pending_rdv = 1; +atomic_inc(rdv-attendees); +} +} + +void cpu_exit_do_rendezvous(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +atomic_dec(rdv-attendees); + +cpu-pending_rdv = 0; +} + +void cpu_exit_rendezvous_wait_others(CPUState *cpu) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +while (rdv-attendees) { +g_usleep(TCG_RDV_POLLING_PERIOD); +} +} + +void cpu_exit_rendezvous_release(void) +{ +CpuExitRendezvous *rdv = cpu_exit_rendezvous; + +rdv-attendees = 0; +} + /* enable or disable single step mode. EXCP_DEBUG is returned by the CPU loop after each instruction */ void cpu_single_step(CPUState *cpu, int enabled) diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 8f3fe56..8d121b3 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -201,6 +201,20 @@ typedef struct CPUWatchpoint { QTAILQ_ENTRY(CPUWatchpoint) entry; } CPUWatchpoint; +/* Rendezvous support */ +#define TCG_RDV_POLLING_PERIOD 10 +typedef struct CpuExitRendezvous { +volatile int attendees; +QemuMutex lock; +} CpuExitRendezvous; + +extern CpuExitRendezvous cpu_exit_rendezvous; +void cpu_exit_init_rendezvous(void); +void cpu_exit_rendezvous_add_attendee(CPUState *cpu); +void cpu_exit_do_rendezvous(CPUState *cpu); +void cpu_exit_rendezvous_wait_others(CPUState *cpu); +void cpu_exit_rendezvous_release(void); + struct KVMState; struct kvm_run; @@ -291,6 +305,8 @@ struct CPUState { void *opaque; +volatile int pending_rdv; + I will however echo the hmmm on Fred's patch about the optimistic use of volatile here. As Peter mentioned it is a red flag and I would prefer explicit memory consistency behaviour used and documented. In my local branch I'm now using atomic_set/atomic_read, basically what is defined in qemu/atomic.h. Is this enough? Regards, alvise /* In order to avoid passing too many arguments to the MMIO helpers, * we store some rarely used information in the CPU context. */ -- Alex Bennée
Re: [Qemu-devel] [PATCH 2/2 v3] virtio-rng: Serve pending request if any after timer bumps up quota.
On (Thu) 16 Jul 2015 [02:34:34], Pankaj Gupta wrote: Anyway we can look at that later, patch 1 is already a big improvement so I think we should just stick with that, and think about other things in a different series. Sure. I think we can apply patch 1 for 2.4, since it reduces the wakeups we cause. If you can post powertop numbers, that'll be great. I'll do a pull req in the meantime. Thanks, Amit
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On 07/17/15 15:48, Marc Zyngier wrote: On Fri, 17 Jul 2015 14:39:55 +0100 Laszlo Ersek ler...@redhat.com wrote: On 07/17/15 15:28, Marc Zyngier wrote: On Fri, 17 Jul 2015 10:30:38 +0100 Paolo Bonzini pbonz...@redhat.com wrote: On 17/07/2015 06:44, Paolo Bonzini wrote: On 16/07/2015 21:05, Richard W.M. Jones wrote: Sorry to spoil things, but I'm still seeing this bug, although it is now a lot less frequent with your patch. I would estimate it happens more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs with qemu.git + the v2 patch series. It's the exact same hang in both cases. Is it possible that this patch doesn't completely close any race? Still, it is an improvement, so there is that. Would seem at first glance like a different bug. Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug more likely: now it reproduces in about 10 tries. Of course :) adding other kinds of tracing instead make it go away again (50 tries). Perhaps this: i/o thread vcpu thread worker thread - lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh-scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll ^^ hang In the exact shape above, it doesn't seem too likely to happen, but perhaps there's another simpler case. Still, the bug exists. The above is not really related to notify_me. Here the notification is not being optimized away! So I wonder if this one has been there forever. Fam suggested putting the event_notifier_test_and_clear before aio_bh_poll(), but it does not work. I'll look more close However, an unconditional event_notifier_test_and_clear is pretty expensive. On one hand, obviously correctness comes first. On the other hand, an expensive operation at the wrong place can mask the race very easily; I'll let the fix run for a while, but I'm not sure if a successful test really says anything useful. So it may not be useful, but still successful test is successful. :) The following patch, which also includes the delta between v2 and v3 of this series, survived 674 reboots before hitting a definitely unrelated problem: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) For the record, this is the kvm_run struct: $6 = {request_interrupt_window = 0 '\000', padding1 = \000\000\000\000\000\000, exit_reason = 0, ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = { hardware_exit_reason = 150994968}, fail_entry = {hardware_entry_failure_reason = 150994968}, ex = { exception = 150994968, error_code = 0}, io = {direction = 24 '\030', size = 0 '\000', port = 2304, count = 0, data_offset = 144}, debug = {arch = {No data fields}}, mmio = {phys_addr = 150994968, data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, hypercall = {nr = 150994968, args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, tpr_access = {rip = 150994968, is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 2304, ipb = 0}, s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144}, dcr = { dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = {suberror = 150994968, ndata = 0, data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 4, 0 repeats 29 times}}, papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = { subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word = 144, ipb = 0, dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 150994968, flags = 144}, s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc = 0 '\000', sel1 = 0 '\000', sel2 = 0}, padding =
Re: [Qemu-devel] [PATCH] blockdev: warn about aio=native if libaio is unavailable
On Fri, Jul 17, 2015 at 12:56:15PM +0200, Kevin Wolf wrote: Am 17.07.2015 um 11:59 hat Stefan Hajnoczi geschrieben: QEMU silently ignores aio=native if libaio is unavailable. It is confusing when aio=native performance is identical to aio=threads because the binary was accidentally built without libaio. Use error_report() because failing would break backward compatibility. There are probably users using aio=native who would be inconvenienced if QEMU suddenly refused to start their guests. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com I hope not too many people are using aio=native without having libaio compiled in... Can we make it a message like for the case with aio=native,cache.direct=off, i.e. a deprecation warning that allows us to make this an error in a few releases? Yes, I'll move the warning to raw-posix.c so all callers benefit from it. pgpoOON4yc1s8.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, 17 Jul 2015 14:53:06 +0100 Richard W.M. Jones rjo...@redhat.com wrote: On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote: Still: there is nothing in the registers that remotely points to that area. X0 is the closest, but it'd take a big negative offset to get there. Is that a Linux kernel? or something else? You're sure it's not this one? https://bugzilla.redhat.com/show_bug.cgi?id=1194366 That was caused by ftrace screwing up guest memory, so it was effectively running random code. It is also fixed (by you in fact). Don't think so. The bug you quote was the guest kernel being buggy, and touching non-memory space. This new issue seems different - this is not a Linux kernel, by the look of it. M. -- Jazz is not dead. It just smells funny.
Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration
On 17/07/2015 12:34, Marc-André Lureau wrote: On Fri, Jul 17, 2015 at 4:25 AM, Paolo Bonzini pbonz...@redhat.com wrote: But LOG_BASE makes little sense across processes, and LOG_FD is unused in QEMU, isn't it? So this patch is not enough to add support of live migration. You are right, LOG_BASE doesn't make much sense, and LOG_FD isn't used, despite some code already there. Furthermore, the log is allocated with regular malloc, hardly shareable. LOG_FD is implemented in the kernel drivers/vhost/vhost.c. It seems to be an eventfd-like mechanism to save on dirty bitmap scans. However, it's not well documented how to implement it in a correct way. In any case, live migration needs a new message type (like LOG_MMAP_FD) in the vhost-user protocol. Paolo
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) Vaguely reminiscent of this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1194366 (See comment 7 in particular) diff --git a/aio-posix.c b/aio-posix.c index 268d14d..d2011d0 100644 --- a/aio-posix.c +++ b/aio-posix.c @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking) aio_context_acquire(ctx); } +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + +if (blocking) { +atomic_sub(ctx-notify_me, 2); +} + /* if we have any readable fds, dispatch event */ if (ret 0) { for (i = 0; i npfd; i++) { @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking) npfd = 0; ctx-walking_handlers--; -if (blocking) { -atomic_sub(ctx-notify_me, 2); -} - /* Run dispatch even if there were no readable fds to run timers */ if (aio_dispatch(ctx)) { progress = true; diff --git a/aio-win32.c b/aio-win32.c index 2bfd5f8..33809fd 100644 --- a/aio-win32.c +++ b/aio-win32.c @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking) if (timeout) { aio_context_acquire(ctx); } + +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + if (blocking) { assert(first); atomic_sub(ctx-notify_me, 2); diff --git a/async.c b/async.c index 9204907..120e183 100644 --- a/async.c +++ b/async.c @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source) AioContext *ctx = (AioContext *) source; QEMUBH *bh; +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + atomic_and(ctx-notify_me, ~1); for (bh = ctx-first_bh; bh; bh = bh-next) { if (!bh-deleted bh-scheduled) { @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque) aio_notify(opaque); } +static void event_notifier_dummy_cb(EventNotifier *e) +{ +} + AioContext *aio_context_new(Error **errp) { int ret; @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp) return NULL; } g_source_set_can_recurse(ctx-source, true); -aio_set_event_notifier(ctx, ctx-notifier, NULL); +aio_set_event_notifier(ctx, ctx-notifier, event_notifier_dummy_cb); ctx-thread_pool = NULL; qemu_mutex_init(ctx-bh_lock); rfifolock_init(ctx-lock, aio_rfifolock_cb, ctx); With this patch, I've got to 500 iterations without seeing the error. Still testing it however ... Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
[Qemu-devel] [PATCH v5 1/5] Implement GIC-500 base class
From: Shlomo Pongratz shlomo.pongr...@huawei.com This class is to be used by both software and KVM implementations of GICv3 Signed-off-by: Shlomo Pongratz shlomo.pongr...@huawei.com Signed-off-by: Pavel Fedin p.fe...@samsung.com --- hw/intc/Makefile.objs | 1 + hw/intc/arm_gicv3_common.c | 260 + hw/intc/gicv3_internal.h | 156 ++ include/hw/intc/arm_gicv3_common.h | 113 4 files changed, 530 insertions(+) create mode 100644 hw/intc/arm_gicv3_common.c create mode 100644 hw/intc/gicv3_internal.h create mode 100644 include/hw/intc/arm_gicv3_common.h diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs index 092d8a8..1317e5a 100644 --- a/hw/intc/Makefile.objs +++ b/hw/intc/Makefile.objs @@ -12,6 +12,7 @@ common-obj-$(CONFIG_IOAPIC) += ioapic_common.o common-obj-$(CONFIG_ARM_GIC) += arm_gic_common.o common-obj-$(CONFIG_ARM_GIC) += arm_gic.o common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o +common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o common-obj-$(CONFIG_OPENPIC) += openpic.o obj-$(CONFIG_APIC) += apic.o apic_common.o diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c new file mode 100644 index 000..ef6f22e --- /dev/null +++ b/hw/intc/arm_gicv3_common.c @@ -0,0 +1,260 @@ +/* + * ARM GICv3 support - common bits of emulated and KVM kernel model + * + * Copyright (c) 2012 Linaro Limited + * Copyright (c) 2015 Huawei. + * Written by Peter Maydell + * Extended to 64 cores by Shlomo Pongratz + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, see http://www.gnu.org/licenses/. + */ + +#include gicv3_internal.h + +static void gicv3_pre_save(void *opaque) +{ +GICv3State *s = (GICv3State *)opaque; +ARMGICv3CommonClass *c = ARM_GICV3_COMMON_GET_CLASS(s); + +if (c-pre_save) { +c-pre_save(s); +} +} + +static int gicv3_post_load(void *opaque, int version_id) +{ +GICv3State *s = (GICv3State *)opaque; +ARMGICv3CommonClass *c = ARM_GICV3_COMMON_GET_CLASS(s); + +if (c-post_load) { +c-post_load(s); +} +return 0; +} + +static const VMStateDescription vmstate_gicv3_irq_state = { +.name = arm_gicv3_irq_state, +.version_id = 1, +.minimum_version_id = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT64(pending, gicv3_irq_state), +VMSTATE_UINT64(active, gicv3_irq_state), +VMSTATE_UINT64(level, gicv3_irq_state), +VMSTATE_UINT64(group, gicv3_irq_state), +VMSTATE_BOOL(edge_trigger, gicv3_irq_state), +VMSTATE_END_OF_LIST() +} +}; + +static const VMStateDescription vmstate_gicv3_sgi_state = { +.name = arm_gicv3_sgi_state, +.version_id = 1, +.minimum_version_id = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT64_ARRAY(pending, gicv3_sgi_state, GICV3_NCPU), +VMSTATE_END_OF_LIST() +} +}; + +static const VMStateDescription vmstate_gicv3 = { +.name = arm_gicv3, +.version_id = 7, +.minimum_version_id = 7, +.pre_save = gicv3_pre_save, +.post_load = gicv3_post_load, +.fields = (VMStateField[]) { +VMSTATE_UINT32(ctlr, GICv3State), +VMSTATE_UINT32_ARRAY(cpu_ctlr, GICv3State, GICV3_NCPU), +VMSTATE_STRUCT_ARRAY(irq_state, GICv3State, GICV3_MAXIRQ, 1, + vmstate_gicv3_irq_state, gicv3_irq_state), +VMSTATE_UINT64_ARRAY(irq_target, GICv3State, GICV3_MAXIRQ), +VMSTATE_UINT8_2DARRAY(priority1, GICv3State, GIC_INTERNAL, + GICV3_NCPU), +VMSTATE_UINT8_ARRAY(priority2, GICv3State, +GICV3_MAXIRQ - GIC_INTERNAL), +VMSTATE_UINT16_2DARRAY(last_active, GICv3State, GICV3_MAXIRQ, + GICV3_NCPU), +VMSTATE_STRUCT_ARRAY(sgi_state, GICv3State, GIC_NR_SGIS, 1, + vmstate_gicv3_sgi_state, gicv3_sgi_state), +VMSTATE_UINT16_ARRAY(priority_mask, GICv3State, GICV3_NCPU), +VMSTATE_UINT16_ARRAY(running_irq, GICv3State, GICV3_NCPU), +VMSTATE_UINT16_ARRAY(running_priority, GICv3State, GICV3_NCPU), +VMSTATE_UINT16_ARRAY(current_pending, GICv3State, GICV3_NCPU), +VMSTATE_END_OF_LIST() +} +}; + +void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler, + const MemoryRegionOps *dist_ops, +
Re: [Qemu-devel] [RFC v3 06/13] target-i386: translate: implement qemu_ldlink and qemu_stcond ops
On Fri, Jul 17, 2015 at 2:56 PM, Alex Bennée alex.ben...@linaro.org wrote: Alvise Rigo a.r...@virtualopensystems.com writes: Implement strex and ldrex instruction relying on TCG's qemu_ldlink and qemu_stcond. For the time being only 32bit configurations are supported. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- tcg/i386/tcg-target.c | 136 ++ 1 file changed, 114 insertions(+), 22 deletions(-) diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c index 0d7c99c..d8250a9 100644 --- a/tcg/i386/tcg-target.c +++ b/tcg/i386/tcg-target.c @@ -1141,6 +1141,17 @@ static void * const qemu_ld_helpers[16] = { [MO_BEQ] = helper_be_ldq_mmu, }; +/* LoadLink helpers, only unsigned. Use the macro below to access them. */ +static void * const qemu_ldex_helpers[16] = { +[MO_LEUL] = helper_le_ldlinkul_mmu, +}; + +#define LDEX_HELPER(mem_op) \ +({ \ +assert(mem_op MO_EXCL); \ +qemu_ldex_helpers[((int)mem_op - MO_EXCL)]; \ +}) + /* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, * uintxx_t val, int mmu_idx, uintptr_t ra) */ @@ -1154,6 +1165,17 @@ static void * const qemu_st_helpers[16] = { [MO_BEQ] = helper_be_stq_mmu, }; +/* StoreConditional helpers. Use the macro below to access them. */ +static void * const qemu_stex_helpers[16] = { +[MO_LEUL] = helper_le_stcondl_mmu, +}; + +#define STEX_HELPER(mem_op) \ +({ \ +assert(mem_op MO_EXCL); \ +qemu_stex_helpers[(int)mem_op - MO_EXCL]; \ +}) + Same comments as for target-arm. Do we need to be protecting backends with HAS_LDST_EXCL defines or some such macro hackery? What currently happens if you use the new TCG ops when the backend doesn't support them? Is supporting all backends a prerequisite for the series? I think that the ideal approach would be to have all the backends implementing the slowpath for atomic instructions so that the HAS_LDST_EXCL macro will not be needed. Then a frontend can rely on the slowpath or not. So, ideally, it's a prerequisite. Regards, alvise /* Perform the TLB load and compare. Inputs: @@ -1249,6 +1271,7 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, * for a load or store, so that we can later generate the correct helper code */ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi, +TCGReg llsc_success, TCGReg datalo, TCGReg datahi, TCGReg addrlo, TCGReg addrhi, tcg_insn_unit *raddr, @@ -1257,6 +1280,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi, TCGLabelQemuLdst *label = new_ldst_label(s); label-is_ld = is_ld; +label-llsc_success = llsc_success; label-oi = oi; label-datalo_reg = datalo; label-datahi_reg = datahi; @@ -1311,7 +1335,11 @@ static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) (uintptr_t)l-raddr); } -tcg_out_call(s, qemu_ld_helpers[opc (MO_BSWAP | MO_SIZE)]); +if (opc MO_EXCL) { +tcg_out_call(s, LDEX_HELPER(opc)); +} else { +tcg_out_call(s, qemu_ld_helpers[opc ~MO_SIGN]); +} data_reg = l-datalo_reg; switch (opc MO_SSIZE) { @@ -1415,9 +1443,16 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) } } -/* Tail call to the helper, with the return address back inline. */ -tcg_out_push(s, retaddr); -tcg_out_jmp(s, qemu_st_helpers[opc (MO_BSWAP | MO_SIZE)]); +if (opc MO_EXCL) { +tcg_out_call(s, STEX_HELPER(opc)); +/* Save the output of the StoreConditional */ +tcg_out_mov(s, TCG_TYPE_I32, l-llsc_success, TCG_REG_EAX); +tcg_out_jmp(s, l-raddr); +} else { +/* Tail call to the helper, with the return address back inline. */ +tcg_out_push(s, retaddr); +tcg_out_jmp(s, qemu_st_helpers[opc]); +} } #elif defined(__x86_64__) defined(__linux__) # include asm/prctl.h @@ -1530,7 +1565,8 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and EAX. It will be useful once fixed registers globals are less common. */
[Qemu-devel] [PATCH v5 3/5] Introduce irqchip type specification for KVM
This patch introduces kernel_irqchip_type member in Machine class, which it passed to kvm_arch_irqchip_create. It allows machine models to specify correct GIC type during KVM capability verification. The variable is defined as int in order to be architecture-agnostic for potential future uses by other architectures. Signed-off-by: Pavel Fedin p.fe...@samsung.com --- hw/arm/vexpress.c| 1 + hw/arm/virt.c| 3 +++ include/hw/boards.h | 1 + include/sysemu/kvm.h | 3 ++- kvm-all.c| 2 +- stubs/kvm.c | 2 +- target-arm/kvm.c | 13 +++-- 7 files changed, 20 insertions(+), 5 deletions(-) diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c index da21788..0675a00 100644 --- a/hw/arm/vexpress.c +++ b/hw/arm/vexpress.c @@ -556,6 +556,7 @@ static void vexpress_common_init(MachineState *machine) const hwaddr *map = daughterboard-motherboard_map; int i; +machine-kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2; daughterboard-init(vms, machine-ram_size, machine-cpu_model, pic); /* diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 4846892..2e7d858 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -945,6 +945,9 @@ static void virt_instance_init(Object *obj) Set on/off to enable/disable the ARM Security Extensions (TrustZone), NULL); + +/* Default GIC type is v2 */ +vms-parent.kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2; } static void virt_class_init(ObjectClass *oc, void *data) diff --git a/include/hw/boards.h b/include/hw/boards.h index 2aec9cb..37eb767 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -127,6 +127,7 @@ struct MachineState { char *accel; bool kernel_irqchip_allowed; bool kernel_irqchip_required; +int kernel_irqchip_type; int kvm_shadow_mem; char *dtb; char *dumpdtb; diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 983e99e..8f4d485 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -434,6 +434,7 @@ void kvm_init_irq_routing(KVMState *s); /** * kvm_arch_irqchip_create: * @KVMState: The KVMState pointer + * @type: irqchip type, architecture-specific * * Allow architectures to create an in-kernel irq chip themselves. * @@ -441,7 +442,7 @@ void kvm_init_irq_routing(KVMState *s); *0: irq chip was not created * 0: irq chip was created */ -int kvm_arch_irqchip_create(KVMState *s); +int kvm_arch_irqchip_create(KVMState *s, int type); /** * kvm_set_one_reg - set a register value in KVM via KVM_SET_ONE_REG ioctl diff --git a/kvm-all.c b/kvm-all.c index 06e06f2..8df938d 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1395,7 +1395,7 @@ static void kvm_irqchip_create(MachineState *machine, KVMState *s) /* First probe and see if there's a arch-specific hook to create the * in-kernel irqchip for us */ -ret = kvm_arch_irqchip_create(s); +ret = kvm_arch_irqchip_create(s, machine-kernel_irqchip_type); if (ret == 0) { ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP); } diff --git a/stubs/kvm.c b/stubs/kvm.c index e7c60b6..a8505ff 100644 --- a/stubs/kvm.c +++ b/stubs/kvm.c @@ -1,7 +1,7 @@ #include qemu-common.h #include sysemu/kvm.h -int kvm_arch_irqchip_create(KVMState *s) +int kvm_arch_irqchip_create(KVMState *s, int type) { return 0; } diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 548bfd7..3992c1f 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -579,19 +579,28 @@ void kvm_arch_init_irq_routing(KVMState *s) { } -int kvm_arch_irqchip_create(KVMState *s) +int kvm_arch_irqchip_create(KVMState *s, int type) { int ret; +/* Failure here means forgotten initialization of + * machine-kernel_irqchip_type in model code + */ +assert(type != 0); + /* If we can create the VGIC using the newer device control API, we * let the device do this when it initializes itself, otherwise we * fall back to the old API */ -ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V2, true); +ret = kvm_create_device(s, type, true); if (ret == 0) { return 1; } +/* Fallback will create VGIC v2 */ +if (type != KVM_DEV_TYPE_ARM_VGIC_V2) { +return ret; +} return 0; } -- 1.9.5.msysgit.0
[Qemu-devel] [PATCH v5 2/5] Extract some reusable vGIC code
These functions are useful also for vGICv3 implementation. Make them accessible from within other modules. Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but they would require two extra parameters (s-dev_fd and s-num_cpu) as well as lots of typecasts of 's' to DeviceState * and back to GICState *. This makes the code very ugly so i decided to stop at this point. I tried also an approach with making a base class for all possible GICs, but it would contain only three variables (dev_fd, cpu_num and irq_num), and accessing them through the rest of the code would be again tedious (either ugly casts or qemu-style separate object pointer). So i disliked it too. Signed-off-by: Pavel Fedin p.fe...@samsung.com --- hw/intc/arm_gic_kvm.c | 84 --- hw/intc/vgic_common.h | 43 ++ 2 files changed, 82 insertions(+), 45 deletions(-) create mode 100644 hw/intc/vgic_common.h diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c index f56bff1..9b6d702 100644 --- a/hw/intc/arm_gic_kvm.c +++ b/hw/intc/arm_gic_kvm.c @@ -23,6 +23,7 @@ #include sysemu/kvm.h #include kvm_arm.h #include gic_internal.h +#include vgic_common.h //#define DEBUG_GIC_KVM @@ -52,7 +53,7 @@ typedef struct KVMARMGICClass { void (*parent_reset)(DeviceState *dev); } KVMARMGICClass; -static void kvm_arm_gic_set_irq(void *opaque, int irq, int level) +void kvm_arm_gic_set_irq(uint32_t num_irq, int irq, int level) { /* Meaning of the 'irq' parameter: * [0..N-1] : external interrupts @@ -63,10 +64,9 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int level) * has separate fields in the irq number for type, * CPU number and interrupt number. */ -GICState *s = (GICState *)opaque; int kvm_irq, irqtype, cpu; -if (irq (s-num_irq - GIC_INTERNAL)) { +if (irq (num_irq - GIC_INTERNAL)) { /* External interrupt. The kernel numbers these like the GIC * hardware, with external interrupt IDs starting after the * internal ones. @@ -77,7 +77,7 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int level) } else { /* Internal interrupt: decode into (cpu, interrupt id) */ irqtype = KVM_ARM_IRQ_TYPE_PPI; -irq -= (s-num_irq - GIC_INTERNAL); +irq -= (num_irq - GIC_INTERNAL); cpu = irq / GIC_INTERNAL; irq %= GIC_INTERNAL; } @@ -87,12 +87,19 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int level) kvm_set_irq(kvm_state, kvm_irq, !!level); } +static void kvm_arm_gicv2_set_irq(void *opaque, int irq, int level) +{ +GICState *s = (GICState *)opaque; + +kvm_arm_gic_set_irq(s-num_irq, irq, level); +} + static bool kvm_arm_gic_can_save_restore(GICState *s) { return s-dev_fd = 0; } -static bool kvm_gic_supports_attr(GICState *s, int group, int attrnum) +bool kvm_gic_supports_attr(int dev_fd, int group, int attrnum) { struct kvm_device_attr attr = { .group = group, @@ -100,14 +107,14 @@ static bool kvm_gic_supports_attr(GICState *s, int group, int attrnum) .flags = 0, }; -if (s-dev_fd == -1) { +if (dev_fd == -1) { return false; } -return kvm_device_ioctl(s-dev_fd, KVM_HAS_DEVICE_ATTR, attr) == 0; +return kvm_device_ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, attr) == 0; } -static void kvm_gic_access(GICState *s, int group, int offset, +void kvm_gic_access(int dev_fd, int group, int offset, int cpu, uint32_t *val, bool write) { struct kvm_device_attr attr; @@ -130,7 +137,7 @@ static void kvm_gic_access(GICState *s, int group, int offset, type = KVM_GET_DEVICE_ATTR; } -err = kvm_device_ioctl(s-dev_fd, type, attr); +err = kvm_device_ioctl(dev_fd, type, attr); if (err 0) { fprintf(stderr, KVM_{SET/GET}_DEVICE_ATTR failed: %s\n, strerror(-err)); @@ -138,20 +145,6 @@ static void kvm_gic_access(GICState *s, int group, int offset, } } -static void kvm_gicd_access(GICState *s, int offset, int cpu, -uint32_t *val, bool write) -{ -kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_DIST_REGS, - offset, cpu, val, write); -} - -static void kvm_gicc_access(GICState *s, int offset, int cpu, -uint32_t *val, bool write) -{ -kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_CPU_REGS, - offset, cpu, val, write); -} - #define for_each_irq_reg(_ctr, _max_irq, _field_width) \ for (_ctr = 0; _ctr ((_max_irq) / (32 / (_field_width))); _ctr++) @@ -291,7 +284,7 @@ static void kvm_dist_get(GICState *s, uint32_t offset, int width, irq = i * regsz; cpu = 0; while ((cpu s-num_cpu irq GIC_INTERNAL) || cpu == 0) { -kvm_gicd_access(s, offset, cpu, reg, false); +kvm_gicd_access(s-dev_fd, offset, cpu, reg,
Re: [Qemu-devel] [RFC v3 07/13] ram_addr.h: Make exclusive bitmap accessors atomic
Alvise Rigo a.r...@virtualopensystems.com writes: Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- include/exec/ram_addr.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 2766541..e51bd65 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -255,7 +255,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned long *dest, /* Exclusive bitmap accessors. */ static inline void cpu_physical_memory_set_excl_dirty(ram_addr_t addr) { -set_bit(addr TARGET_PAGE_BITS, +set_bit_atomic(addr TARGET_PAGE_BITS, ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]); } @@ -267,8 +267,8 @@ static inline int cpu_physical_memory_excl_is_dirty(ram_addr_t addr) static inline void cpu_physical_memory_clear_excl_dirty(ram_addr_t addr) { -clear_bit(addr TARGET_PAGE_BITS, - ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]); + bitmap_test_and_clear_atomic(ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE], + addr TARGET_PAGE_BITS, 1); Does this call for simply implementing a clear_bit_atomic() rather than the fancy bitmap_test_and_clear_atomic. Looking at atomic.h it seems the primitives you need are there. } #endif -- Alex Bennée
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On Fri, 17 Jul 2015 10:30:38 +0100 Paolo Bonzini pbonz...@redhat.com wrote: On 17/07/2015 06:44, Paolo Bonzini wrote: On 16/07/2015 21:05, Richard W.M. Jones wrote: Sorry to spoil things, but I'm still seeing this bug, although it is now a lot less frequent with your patch. I would estimate it happens more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs with qemu.git + the v2 patch series. It's the exact same hang in both cases. Is it possible that this patch doesn't completely close any race? Still, it is an improvement, so there is that. Would seem at first glance like a different bug. Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug more likely: now it reproduces in about 10 tries. Of course :) adding other kinds of tracing instead make it go away again (50 tries). Perhaps this: i/o thread vcpu thread worker thread - lock_iothread notify_me = 1 ... unlock_iothread lock_iothread notify_me = 3 ppoll notify_me = 1 bh-scheduled = 1 event_notifier_set event_notifier_test_and_clear ppoll ^^ hang In the exact shape above, it doesn't seem too likely to happen, but perhaps there's another simpler case. Still, the bug exists. The above is not really related to notify_me. Here the notification is not being optimized away! So I wonder if this one has been there forever. Fam suggested putting the event_notifier_test_and_clear before aio_bh_poll(), but it does not work. I'll look more close However, an unconditional event_notifier_test_and_clear is pretty expensive. On one hand, obviously correctness comes first. On the other hand, an expensive operation at the wrong place can mask the race very easily; I'll let the fix run for a while, but I'm not sure if a successful test really says anything useful. So it may not be useful, but still successful test is successful. :) The following patch, which also includes the delta between v2 and v3 of this series, survived 674 reboots before hitting a definitely unrelated problem: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) For the record, this is the kvm_run struct: $6 = {request_interrupt_window = 0 '\000', padding1 = \000\000\000\000\000\000, exit_reason = 0, ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = { hardware_exit_reason = 150994968}, fail_entry = {hardware_entry_failure_reason = 150994968}, ex = { exception = 150994968, error_code = 0}, io = {direction = 24 '\030', size = 0 '\000', port = 2304, count = 0, data_offset = 144}, debug = {arch = {No data fields}}, mmio = {phys_addr = 150994968, data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, hypercall = {nr = 150994968, args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, tpr_access = {rip = 150994968, is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 2304, ipb = 0}, s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144}, dcr = { dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = {suberror = 150994968, ndata = 0, data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 4, 0 repeats 29 times}}, papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = { subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word = 144, ipb = 0, dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 150994968, flags = 144}, s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc = 0 '\000', sel1 = 0 '\000', sel2 = 0}, padding = \030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004, '\000' repeats 238
Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration
Hi On Fri, Jul 17, 2015 at 2:57 PM, Paolo Bonzini pbonz...@redhat.com wrote: LOG_FD is implemented in the kernel drivers/vhost/vhost.c. It seems to be an eventfd-like mechanism to save on dirty bitmap scans. However, it's not well documented how to implement it in a correct way. and it's not used by qemu, so hard to say if it actually work well. In any case, live migration needs a new message type (like LOG_MMAP_FD) in the vhost-user protocol. Yes, perhaps with size and offset. I am looking at this. -- Marc-André Lureau
Re: [Qemu-devel] [virtio guest] vring_need_event() from virtqueue_kick_prepare()
On Fri, Jul 17, 2015 at 09:30:03AM +0300, Catalin Vasile wrote: Do you mean vhost_net - old kernel, qemu - latest, guest - latest? I mean whatever combination didn't work for your custom vhost module. If vhost_net is broken too there is probably a vhost/QEMU bug that should be fixed. Stefan pgpaYd2oKLC3H.pgp Description: PGP signature
Re: [Qemu-devel] [RFC v3 05/13] target-arm: translate: implement qemu_ldlink and qemu_stcond ops
On Fri, Jul 17, 2015 at 2:51 PM, Alex Bennée alex.ben...@linaro.org wrote: Alvise Rigo a.r...@virtualopensystems.com writes: Implement strex and ldrex instruction relying on TCG's qemu_ldlink and qemu_stcond. For the time being only the 32bit instructions are supported. Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- target-arm/translate.c | 87 ++- tcg/arm/tcg-target.c | 121 + 2 files changed, 178 insertions(+), 30 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index 80302cd..0366c76 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -72,6 +72,8 @@ static TCGv_i64 cpu_exclusive_test; static TCGv_i32 cpu_exclusive_info; #endif +static TCGv_i32 cpu_ll_sc_context; + /* FIXME: These should be removed. */ static TCGv_i32 cpu_F0s, cpu_F1s; static TCGv_i64 cpu_F0d, cpu_F1d; @@ -103,6 +105,8 @@ void arm_translate_init(void) offsetof(CPUARMState, exclusive_addr), exclusive_addr); cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0, offsetof(CPUARMState, exclusive_val), exclusive_val); +cpu_ll_sc_context = tcg_global_mem_new_i32(TCG_AREG0, +offsetof(CPUARMState, ll_sc_context), ll_sc_context); #ifdef CONFIG_USER_ONLY cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0, offsetof(CPUARMState, exclusive_test), exclusive_test); @@ -961,6 +965,18 @@ DO_GEN_ST(8, MO_UB) DO_GEN_ST(16, MO_TEUW) DO_GEN_ST(32, MO_TEUL) +/* Load/Store exclusive generators (always unsigned) */ +static inline void gen_aa32_ldex32(TCGv_i32 val, TCGv_i32 addr, int index) +{ +tcg_gen_qemu_ldlink_i32(val, addr, index, MO_TEUL | MO_EXCL); +} + +static inline void gen_aa32_stex32(TCGv_i32 is_dirty, TCGv_i32 val, + TCGv_i32 addr, int index) +{ +tcg_gen_qemu_stcond_i32(is_dirty, val, addr, index, MO_TEUL | MO_EXCL); +} + static inline void gen_set_pc_im(DisasContext *s, target_ulong val) { tcg_gen_movi_i32(cpu_R[15], val); @@ -7427,6 +7443,26 @@ static void gen_load_exclusive(DisasContext *s, int rt, int rt2, store_reg(s, rt, tmp); } +static void gen_load_exclusive_multi(DisasContext *s, int rt, int rt2, + TCGv_i32 addr, int size) +{ +TCGv_i32 tmp = tcg_temp_new_i32(); + +switch (size) { +case 0: +case 1: +abort(); +case 2: +gen_aa32_ldex32(tmp, addr, get_mem_index(s)); +break; +case 3: +default: +abort(); +} + +store_reg(s, rt, tmp); +} + static void gen_clrex(DisasContext *s) { gen_helper_atomic_clear(cpu_env); @@ -7460,6 +7496,52 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, tcg_temp_free_i64(val); tcg_temp_free_i32(tmp_size); } + +static void gen_store_exclusive_multi(DisasContext *s, int rd, int rt, int rt2, + TCGv_i32 addr, int size) +{ +TCGv_i32 tmp; +TCGv_i32 is_dirty; +TCGLabel *done_label; +TCGLabel *fail_label; + +fail_label = gen_new_label(); +done_label = gen_new_label(); + +tmp = tcg_temp_new_i32(); +is_dirty = tcg_temp_new_i32(); + +/* Fail if we are not in LL/SC context. */ +tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ll_sc_context, 1, fail_label); + +tmp = load_reg(s, rt); +switch (size) { +case 0: +case 1: +abort(); +break; +case 2: +gen_aa32_stex32(is_dirty, tmp, addr, get_mem_index(s)); +break; +case 3: +default: +abort(); +} + +tcg_temp_free_i32(tmp); + +/* Check if the store conditional has to fail. */ +tcg_gen_brcondi_i32(TCG_COND_EQ, is_dirty, 1, fail_label); +tcg_temp_free_i32(is_dirty); + +tcg_temp_free_i32(tmp); + +tcg_gen_movi_i32(cpu_R[rd], 0); /* is_dirty = 0 */ +tcg_gen_br(done_label); +gen_set_label(fail_label); +tcg_gen_movi_i32(cpu_R[rd], 1); /* is_dirty = 1 */ +gen_set_label(done_label); +} #endif /* gen_srs: @@ -8308,7 +8390,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) } else if (insn (1 20)) { switch (op1) { case 0: /* ldrex */ -gen_load_exclusive(s, rd, 15, addr, 2); +gen_load_exclusive_multi(s, rd, 15, addr, 2); break; case 1: /* ldrexd */ gen_load_exclusive(s, rd, rd + 1, addr, 3); @@ -8326,7 +8408,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) rm =
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me
On 17/07/2015 14:58, Richard W.M. Jones wrote: On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote: error: kvm run failed Function not implemented PC=bf671210 SP=c1f0 X00=0a003e70 X01= X02=bf680548 X03=0030 X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a X08=bf716858 X09=0090 X10= X11=0046 X12=a007e03a X13= X14= X15= X16=bf716df0 X17= X18= X19=bf6f5f18 X20= X21= X22= X23= X24= X25= X26= X27= X28= X29= X30= PSTATE=6305 (flags -ZC-) Vaguely reminiscent of this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1194366 (See comment 7 in particular) Must be it, I was using an old kernel. Thanks! Paolo diff --git a/aio-posix.c b/aio-posix.c index 268d14d..d2011d0 100644 --- a/aio-posix.c +++ b/aio-posix.c @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking) aio_context_acquire(ctx); } +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + +if (blocking) { +atomic_sub(ctx-notify_me, 2); +} + /* if we have any readable fds, dispatch event */ if (ret 0) { for (i = 0; i npfd; i++) { @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking) npfd = 0; ctx-walking_handlers--; -if (blocking) { -atomic_sub(ctx-notify_me, 2); -} - /* Run dispatch even if there were no readable fds to run timers */ if (aio_dispatch(ctx)) { progress = true; diff --git a/aio-win32.c b/aio-win32.c index 2bfd5f8..33809fd 100644 --- a/aio-win32.c +++ b/aio-win32.c @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking) if (timeout) { aio_context_acquire(ctx); } + +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + if (blocking) { assert(first); atomic_sub(ctx-notify_me, 2); diff --git a/async.c b/async.c index 9204907..120e183 100644 --- a/async.c +++ b/async.c @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source) AioContext *ctx = (AioContext *) source; QEMUBH *bh; +/* This should be optimized... */ +event_notifier_test_and_clear(ctx-notifier); + atomic_and(ctx-notify_me, ~1); for (bh = ctx-first_bh; bh; bh = bh-next) { if (!bh-deleted bh-scheduled) { @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque) aio_notify(opaque); } +static void event_notifier_dummy_cb(EventNotifier *e) +{ +} + AioContext *aio_context_new(Error **errp) { int ret; @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp) return NULL; } g_source_set_can_recurse(ctx-source, true); -aio_set_event_notifier(ctx, ctx-notifier, NULL); +aio_set_event_notifier(ctx, ctx-notifier, event_notifier_dummy_cb); ctx-thread_pool = NULL; qemu_mutex_init(ctx-bh_lock); rfifolock_init(ctx-lock, aio_rfifolock_cb, ctx); With this patch, I've got to 500 iterations without seeing the error. Still testing it however ... Rich.
[Qemu-devel] [PATCH v5 5/5] Add gicversion option to virt machine
Set kernel_irqchip_type according to value of the option and pass it around where necessary. Instantiate devices and fdt nodes according to the choice. mac_cpus for virt machine increased to 64. GICv2 compatibility check happens inside arm_gic_common_realize(). Signed-off-by: Pavel Fedin p.fe...@samsung.com --- hw/arm/virt.c | 137 ++ include/hw/arm/fdt.h | 2 +- include/hw/arm/virt.h | 6 ++- 3 files changed, 121 insertions(+), 24 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 2e7d858..afbeedf 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -48,6 +48,8 @@ #include hw/arm/sysbus-fdt.h #include hw/platform-bus.h #include hw/arm/fdt.h +#include linux/kvm.h /* For KVM_DEV_TYPE_ARM_VGIC_V{2|3} */ +#include qapi/visitor.h /* Number of external interrupt lines to configure the GIC with */ #define NUM_IRQS 256 @@ -106,7 +108,12 @@ static const MemMapEntry a15memmap[] = { /* GIC distributor and CPU interfaces sit inside the CPU peripheral space */ [VIRT_GIC_DIST] = { 0x0800, 0x0001 }, [VIRT_GIC_CPU] ={ 0x0801, 0x0001 }, -[VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, +[VIRT_GIC_V2M] ={ 0x0802, 0x0001 }, +/* On v3 VIRT_GIC_DIST_MBI and VIRT_ITS_CONTROL take place of + * VIRT_GIC_CPU and VIRT_GIC_V2M respectively + */ +[VIRT_ITS_TRANSLATION] ={ 0x0803, 0x0001 }, +[VIRT_LPI] ={ 0x0804, 0x0080 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, [VIRT_FW_CFG] = { 0x0902, 0x000a }, @@ -256,10 +263,13 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi) * they are edge-triggered. */ ARMCPU *armcpu; +uint32_t max; uint32_t irqflags = GIC_FDT_IRQ_FLAGS_EDGE_LO_HI; +/* Argument is 32 bit but 8 bits are reserved for flags */ +max = (vbi-smp_cpus = 24) ? 24 : vbi-smp_cpus; irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START, - GIC_FDT_IRQ_PPI_CPU_WIDTH, (1 vbi-smp_cpus) - 1); + GIC_FDT_IRQ_PPI_CPU_WIDTH, (1 max) - 1); qemu_fdt_add_subnode(vbi-fdt, /timer); @@ -283,6 +293,18 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi) { int cpu; +/* + * From Documentation/devicetree/bindings/arm/cpus.txt + * On ARM v8 64-bit systems value should be set to 2, + * that corresponds to the MPIDR_EL1 register size. + * If MPIDR_EL1[63:32] value is equal to 0 on all CPUs + * in the system, #address-cells can be set to 1, since + * MPIDR_EL1[63:32] bits are not used for CPUs + * identification. + * + * Now GIC500 doesn't support affinities 2 3 so currently + * #address-cells can stay 1 until future GIC + */ qemu_fdt_add_subnode(vbi-fdt, /cpus); qemu_fdt_setprop_cell(vbi-fdt, /cpus, #address-cells, 0x1); qemu_fdt_setprop_cell(vbi-fdt, /cpus, #size-cells, 0x0); @@ -319,25 +341,36 @@ static void fdt_add_v2m_gic_node(VirtBoardInfo *vbi) qemu_fdt_setprop_cell(vbi-fdt, /intc/v2m, phandle, vbi-v2m_phandle); } -static void fdt_add_gic_node(VirtBoardInfo *vbi) +static void fdt_add_gic_node(VirtBoardInfo *vbi, int type) { vbi-gic_phandle = qemu_fdt_alloc_phandle(vbi-fdt); qemu_fdt_setprop_cell(vbi-fdt, /, interrupt-parent, vbi-gic_phandle); qemu_fdt_add_subnode(vbi-fdt, /intc); -/* 'cortex-a15-gic' means 'GIC v2' */ -qemu_fdt_setprop_string(vbi-fdt, /intc, compatible, -arm,cortex-a15-gic); qemu_fdt_setprop_cell(vbi-fdt, /intc, #interrupt-cells, 3); qemu_fdt_setprop(vbi-fdt, /intc, interrupt-controller, NULL, 0); -qemu_fdt_setprop_sized_cells(vbi-fdt, /intc, reg, - 2, vbi-memmap[VIRT_GIC_DIST].base, - 2, vbi-memmap[VIRT_GIC_DIST].size, - 2, vbi-memmap[VIRT_GIC_CPU].base, - 2, vbi-memmap[VIRT_GIC_CPU].size); qemu_fdt_setprop_cell(vbi-fdt, /intc, #address-cells, 0x2); qemu_fdt_setprop_cell(vbi-fdt, /intc, #size-cells, 0x2); qemu_fdt_setprop(vbi-fdt, /intc, ranges, NULL, 0); +if (type == KVM_DEV_TYPE_ARM_VGIC_V3) { +qemu_fdt_setprop_string(vbi-fdt, /intc, compatible, +arm,gic-v3); +qemu_fdt_setprop_sized_cells(vbi-fdt, /intc, reg, + 2, vbi-memmap[VIRT_GIC_DIST].base, + 2, vbi-memmap[VIRT_GIC_DIST].size, + 2, vbi-memmap[VIRT_LPI].base, + 2, vbi-memmap[VIRT_LPI].size); +} else { +/* 'cortex-a15-gic' means 'GIC v2' */ +qemu_fdt_setprop_string(vbi-fdt, /intc, compatible, +
[Qemu-devel] [PATCH v5 0/5] [PATCH v4 0/5] vGICv3 support
This series introduces support for GICv3 by KVM. Software emulation is currently not supported. Differences from v4: - Do not reintroduce several constants shared with GICv2, reuse them instead. - Added gicv3_init_irqs_and_mmio() in base class, to be used by both software emulation and KVM code. Avoids code duplication. - Do not add NULL msi-parent phandle to PCI device in the FDT - Removed a couple of stale things from virt.c Differences from v3: - Fixed stupid build breakage in patch 0002 - Rebased on top of current master, patch 0003 adjusted according to kvm_irqchip_create() changes - Added assertion against uninitialized kernel_irqchip_type - Removed kernel_irqchip_type initialization from models which do not use KVM vGIC Differences from v2: - Removed some unrelated and unnecessary changes from virt machine, occasionally slipped in; some of them caused qemu to crash on ARM32. - Fixed build for ARM32; vGICv3 code requires definitions which are present only in ARM64 kernel Differences from v1: - Base class included, taken from the series by Shlomo Pongratz: http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg01512.html The code is refactored as little as possible in order to simplify further addition of software emulation: - Minor fixes in code style and comments, according to old reviews - Removed REV_V3 definition because it's currently not used, and it does not add any meaning to number 3. - Removed reserved regions for MBI and ITS (except for 'virt' machine memory map). These should go to separate classes when implemented. - Improved commit messages - vGIC patches restructured - Use 'gicversion' option instead of virt-v3 machine Pavel Fedin (4): Extract some reusable vGIC code Introduce irqchip type specification for KVM Initial implementation of vGICv3 Add gicversion option to virt machine Shlomo Pongratz (1): Implement GIC-500 base class hw/arm/vexpress.c | 1 + hw/arm/virt.c | 140 hw/intc/Makefile.objs | 4 + hw/intc/arm_gic_kvm.c | 84 ++-- hw/intc/arm_gicv3_common.c | 260 + hw/intc/arm_gicv3_kvm.c| 159 +++ hw/intc/gicv3_internal.h | 156 ++ hw/intc/vgic_common.h | 43 ++ include/hw/arm/fdt.h | 2 +- include/hw/arm/virt.h | 6 +- include/hw/boards.h| 1 + include/hw/intc/arm_gicv3_common.h | 113 include/sysemu/kvm.h | 3 +- kvm-all.c | 2 +- stubs/kvm.c| 2 +- target-arm/kvm.c | 13 +- 16 files changed, 915 insertions(+), 74 deletions(-) create mode 100644 hw/intc/arm_gicv3_common.c create mode 100644 hw/intc/arm_gicv3_kvm.c create mode 100644 hw/intc/gicv3_internal.h create mode 100644 hw/intc/vgic_common.h create mode 100644 include/hw/intc/arm_gicv3_common.h -- 1.9.5.msysgit.0
[Qemu-devel] [PATCH v5 4/5] Initial implementation of vGICv3
Get/put routines are missing, live migration is not possible. Signed-off-by: Pavel Fedin p.fe...@samsung.com --- hw/intc/Makefile.objs | 3 + hw/intc/arm_gicv3_kvm.c | 159 2 files changed, 162 insertions(+) create mode 100644 hw/intc/arm_gicv3_kvm.c diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs index 1317e5a..e2525a8 100644 --- a/hw/intc/Makefile.objs +++ b/hw/intc/Makefile.objs @@ -17,6 +17,9 @@ common-obj-$(CONFIG_OPENPIC) += openpic.o obj-$(CONFIG_APIC) += apic.o apic_common.o obj-$(CONFIG_ARM_GIC_KVM) += arm_gic_kvm.o +ifeq ($(ARCH), aarch64) # Only 64-bit KVM can use these +obj-$(CONFIG_ARM_GIC_KVM) += arm_gicv3_kvm.o +endif obj-$(CONFIG_STELLARIS) += armv7m_nvic.o obj-$(CONFIG_EXYNOS4) += exynos4210_gic.o exynos4210_combiner.o obj-$(CONFIG_GRLIB) += grlib_irqmp.o diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c new file mode 100644 index 000..e2724bc --- /dev/null +++ b/hw/intc/arm_gicv3_kvm.c @@ -0,0 +1,159 @@ +/* + * ARM Generic Interrupt Controller using KVM in-kernel support + * + * Copyright (c) 2015 Samsung Electronics Co., Ltd. + * Written by Pavel Fedin + * Based on vGICv2 code by Peter Maydell + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, see http://www.gnu.org/licenses/. + */ + +#include hw/sysbus.h +#include sysemu/kvm.h +#include kvm_arm.h +#include gicv3_internal.h +#include vgic_common.h + +#ifdef DEBUG_GICV3_KVM +static const int debug_gicv3_kvm = 1; +#else +static const int debug_gicv3_kvm = 0; +#endif + +#define DPRINTF(fmt, ...) do { \ +if (debug_gicv3_kvm) { \ +printf(kvm_gicv3: fmt , ## __VA_ARGS__); \ +} \ +} while (0) + +#define TYPE_KVM_ARM_GICV3 kvm-arm-gicv3 +#define KVM_ARM_GICV3(obj) \ + OBJECT_CHECK(GICv3State, (obj), TYPE_KVM_ARM_GICV3) +#define KVM_ARM_GICV3_CLASS(klass) \ + OBJECT_CLASS_CHECK(KVMARMGICv3Class, (klass), TYPE_KVM_ARM_GICV3) +#define KVM_ARM_GICV3_GET_CLASS(obj) \ + OBJECT_GET_CLASS(KVMARMGICv3Class, (obj), TYPE_KVM_ARM_GICV3) + +typedef struct KVMARMGICv3Class { +ARMGICv3CommonClass parent_class; +DeviceRealize parent_realize; +void (*parent_reset)(DeviceState *dev); +} KVMARMGICv3Class; + +static void kvm_arm_gicv3_set_irq(void *opaque, int irq, int level) +{ +GICv3State *s = (GICv3State *)opaque; + +kvm_arm_gic_set_irq(s-num_irq, irq, level); +} + +static void kvm_arm_gicv3_put(GICv3State *s) +{ +/* TODO */ +DPRINTF(Cannot put kernel gic state, no kernel interface\n); +} + +static void kvm_arm_gicv3_get(GICv3State *s) +{ +/* TODO */ +DPRINTF(Cannot get kernel gic state, no kernel interface\n); +} + +static void kvm_arm_gicv3_reset(DeviceState *dev) +{ +GICv3State *s = ARM_GICV3_COMMON(dev); +KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s); + +DPRINTF(Reset\n); + +kgc-parent_reset(dev); +kvm_arm_gicv3_put(s); +} + +static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp) +{ +GICv3State *s = KVM_ARM_GICV3(dev); +KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s); +Error *local_err = NULL; +int ret; + +DPRINTF(kvm_arm_gicv3_realize\n); + +kgc-parent_realize(dev, local_err); +if (local_err) { +error_propagate(errp, local_err); +return; +} + +gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL, NULL); + +/* Try to create the device via the device control API */ +s-dev_fd = -1; +ret = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V3, false); +if (ret = 0) { +s-dev_fd = ret; +} else if (ret != -ENODEV ret != -ENOTSUP) { +error_setg_errno(errp, -ret, error creating in-kernel VGIC); +return; +} + +if (kvm_gic_supports_attr(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS, 0)) { +uint32_t numirqs = s-num_irq; +DPRINTF(KVM_DEV_ARM_VGIC_GRP_NR_IRQS = %u\n, numirqs); +kvm_gic_access(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS, + 0, 0, numirqs, 1); +} + +/* Tell the kernel to complete VGIC initialization now */ +if (kvm_gic_supports_attr(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL, + KVM_DEV_ARM_VGIC_CTRL_INIT)) { + DPRINTF(KVM_DEV_ARM_VGIC_CTRL_INIT\n); +kvm_gic_access(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL, + KVM_DEV_ARM_VGIC_CTRL_INIT, 0, 0, 1); +} + +
Re: [Qemu-devel] [PATCH v1 4/6] xen: use errno instead of rc for xc_domain_add_to_physmap
On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote: In Xen 4.6 commit cd2f100f0f61b3f333d52d1737dd73f02daee592 libxc: Fix do_memory_op to return negative value on errors made the libxc API less odd-ball: On errors, return value is -1 and error code is in errno. On success the return value is either 0 or an positive value. Since we could be running with an old toolstack in which the Exx value is in rc or the newer, we add an wrapper around the xc_domain_add_to_physmap (called xen_xc_domain_add_to_physmap) which will always return the EXX. Suggested-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- include/hw/xen/xen_common.h | 22 ++ xen-hvm.c | 4 ++-- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h index 38f29fb..fc66c3c 100644 --- a/include/hw/xen/xen_common.h +++ b/include/hw/xen/xen_common.h @@ -407,4 +407,26 @@ static inline int xen_set_ioreq_server_state(XenXC xc, domid_t dom, #endif +#if CONFIG_XEN_CTRL_INTERFACE_VERSION 460 This patch is fine as is and you can add my Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com however there is no code to detect 460 at the moment, see configure:1844. We need one more test case in the configure script to detect 4.6. +static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid, + unsigned int space, + unsigned long idx, + xen_pfn_t gpfn) +{ +return xc_domain_add_to_physmap(xch, domid, space, idx, gpfn); +} +#else +static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid, + unsigned int space, + unsigned long idx, + xen_pfn_t gpfn) +{ +/* In Xen 4.6 rc is -1 and errno contains the error value. */ +int rc = xc_domain_add_to_physmap(xch, domid, space, idx, gpfn); +if (rc == -1) +return errno; +return rc; +} +#endif + #endif /* QEMU_HW_XEN_COMMON_H */ diff --git a/xen-hvm.c b/xen-hvm.c index 0408462..fae2d22 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -345,7 +345,7 @@ go_physmap: unsigned long idx = pfn + i; xen_pfn_t gpfn = start_gpfn + i; -rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn); +rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn); if (rc) { DPRINTF(add_to_physmap MFN %PRI_xen_pfn to PFN % PRI_xen_pfn failed: %d\n, idx, gpfn, rc); @@ -422,7 +422,7 @@ static int xen_remove_from_physmap(XenIOState *state, xen_pfn_t idx = start_addr + i; xen_pfn_t gpfn = phys_offset + i; -rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn); +rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn); if (rc) { fprintf(stderr, add_to_physmap MFN %PRI_xen_pfn to PFN % PRI_xen_pfn failed: %d\n, idx, gpfn, rc); -- 2.1.0
Re: [Qemu-devel] [PATCH v1 5/6] xen/pt/msi: Add the register value when printing logging and error messages
On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote: We would like to know what the MSI register value is to help in troubleshooting in the field. As such modify the logging logic to include such details in xen_pt_msgctrl_reg_write. Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com hw/xen/xen_pt_config_init.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c index dd37be3..462e1b9 100644 --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -1086,7 +1086,7 @@ static int xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s, /* setup MSI pirq for the first time */ if (!msi-initialized) { /* Init physical one */ -XEN_PT_LOG(s-dev, setup MSI\n); +XEN_PT_LOG(s-dev, setup MSI (register: %x).\n, *val); if (xen_pt_msi_setup(s)) { /* We do not broadcast the error to the framework code, so * that MSI errors are contained in MSI emulation code and @@ -1094,12 +1094,12 @@ static int xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s, * Guest MSI would be actually not working. */ *val = ~PCI_MSI_FLAGS_ENABLE; -XEN_PT_WARN(s-dev, Can not map MSI.\n); +XEN_PT_WARN(s-dev, Can not map MSI (register: %x)!\n, *val); return 0; } if (xen_pt_msi_update(s)) { *val = ~PCI_MSI_FLAGS_ENABLE; -XEN_PT_WARN(s-dev, Can not bind MSI\n); +XEN_PT_WARN(s-dev, Can not bind MSI (register: %x)!\n, *val); return 0; } msi-initialized = true; -- 2.1.0
Re: [Qemu-devel] [RFC v3 12/13] softmmu_llsc_template.h: move to multithreading
Alvise Rigo a.r...@virtualopensystems.com writes: Update the TCG LL/SC instructions to work in multi-threading. The basic idea remains untouched, but the whole mechanism is improved to make use of the callback support to query TLB flush requests and the rendezvous callback to synchronize all the currently running vCPUs. In essence, if a vCPU wants to LL to a page which is not already set as EXCL, it will arrange a rendezvous with all the vCPUs that are executing a TB and query a TLB flush for *all* the vCPUs. Doing so, we make sure that: - the running vCPUs do not touch the EXCL page while the requesting vCPU is setting the transaction to EXCL of the page - all the vCPUs will have the EXCL flag in the TLB entry for that specific page *before* entering the next TB Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- cputlb.c| 2 + include/exec/cpu-defs.h | 4 ++ softmmu_llsc_template.h | 97 - 3 files changed, 69 insertions(+), 34 deletions(-) diff --git a/cputlb.c b/cputlb.c index 66df41a..0566e0f 100644 --- a/cputlb.c +++ b/cputlb.c @@ -30,6 +30,8 @@ #include exec/ram_addr.h #include tcg/tcg.h +#include sysemu/cpus.h + void qemu_mutex_lock_iothread(void); void qemu_mutex_unlock_iothread(void); diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h index c73a75f..40742b3 100644 --- a/include/exec/cpu-defs.h +++ b/include/exec/cpu-defs.h @@ -169,5 +169,9 @@ typedef struct CPUIOTLBEntry { /* Used for atomic instruction translation. */ \ bool ll_sc_context; \ hwaddr excl_protected_hwaddr; \ +/* Used to carry the stcond result and also as a flag to flag a + * normal store access made by a stcond. */ \ +int excl_succeeded; \ + #endif diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h index 81e9d8e..4105e72 100644 --- a/softmmu_llsc_template.h +++ b/softmmu_llsc_template.h @@ -54,7 +54,21 @@ (TARGET_PAGE_MASK | TLB_INVALID_MASK)); \ }) \ -#define EXCLUSIVE_RESET_ADDR ULLONG_MAX +#define is_read_tlb_entry_set(env, page, index) \ +({ \ +(addr TARGET_PAGE_MASK) \ + == ((env-tlb_table[mmu_idx][index].addr_read) \ + (TARGET_PAGE_MASK | TLB_INVALID_MASK)); \ +}) \ + +/* Whenever a SC operation fails, we add a small delay to reduce the + * concurrency among the atomic instruction emulation code. Without this delay, + * in very congested situation where plain stores make all the pending LLs + * fail, the code could reach a stalling situation in which all the SCs happen + * to fail. + * TODO: make the delay dynamic according with the SC failing rate. + * */ +#define TCG_ATOMIC_INSN_EMUL_DELAY 100 I'd be tempted to split out this sort of chicanery into a separate patch. WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, uintptr_t retaddr) @@ -65,35 +79,58 @@ WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr, hwaddr hw_addr; unsigned mmu_idx = get_mmuidx(oi); -/* Use the proper load helper from cpu_ldst.h */ -ret = helper_ld_legacy(env, addr, mmu_idx, retaddr); - -/* The last legacy access ensures that the TLB and IOTLB entry for 'addr' - * have been created. */ index = (addr TARGET_PAGE_BITS) (CPU_TLB_SIZE - 1); +if (!is_read_tlb_entry_set(env, addr, index)) { +tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE, mmu_idx, retaddr); +} /* hw_addr = hwaddr of the page (i.e. section-mr-ram_addr + xlat) * plus the offset (i.e. addr ~TARGET_PAGE_MASK) */ hw_addr = (env-iotlb[mmu_idx][index].addr TARGET_PAGE_MASK) + addr; /* Set the exclusive-protected hwaddr. */ -env-excl_protected_hwaddr = hw_addr; -env-ll_sc_context = true; +qemu_mutex_lock(tcg_excl_access_lock); +if (cpu_physical_memory_excl_is_dirty(hw_addr) !exit_flush_request) { +exit_flush_request = 1; -/* No need to mask hw_addr with TARGET_PAGE_MASK since - * cpu_physical_memory_excl_is_dirty() will take care of that. */ -if (cpu_physical_memory_excl_is_dirty(hw_addr)) { -
Re: [Qemu-devel] [PATCH 2/2] tcg/i386: Reserve register for guest_base if a segment isn't available
On 17/07/2015 03:33, Peter Maydell wrote: On 16 July 2015 at 22:25, Richard Henderson r...@twiddle.net wrote: This saves 2 insns and 10 bytes from the implementation of each memory operation. Do we have an idea of which platforms/configs don't let us have a segment register for guest_base? Everything non-Linux I think, which for user-mode emulation means bsd-user. Paolo
Re: [Qemu-devel] [PATCH v1 6/6] xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.
On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote: If XEN_PT_LOGGING_ENABLED is enabled the XEN_PT_LOG macros start using the first argument. Which means if within the function there is only one user of the argument ('d') and XEN_PT_LOGGING_ENABLED is not set, we get compiler warnings. This is not the case now but with the xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config we will hit - so this sync up the function to the rest of them. Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com hw/xen/xen_pt_config_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c index 462e1b9..21d4938 100644 --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -1418,7 +1418,7 @@ static int xen_pt_msixctrl_reg_init(XenPCIPassthroughState *s, reg_field = pci_get_word(d-config + real_offset); if (reg_field PCI_MSIX_FLAGS_ENABLE) { -XEN_PT_LOG(d, MSIX already enabled, disabling it first\n); +XEN_PT_LOG(s-dev, MSIX already enabled, disabling it first\n); xen_host_pci_set_word(s-real_device, real_offset, reg_field ~PCI_MSIX_FLAGS_ENABLE); } -- 2.1.0
Re: [Qemu-devel] [RFC v3 12/13] softmmu_llsc_template.h: move to multithreading
On Fri, Jul 17, 2015 at 5:27 PM, Alex Bennée alex.ben...@linaro.org wrote: Alvise Rigo a.r...@virtualopensystems.com writes: Update the TCG LL/SC instructions to work in multi-threading. The basic idea remains untouched, but the whole mechanism is improved to make use of the callback support to query TLB flush requests and the rendezvous callback to synchronize all the currently running vCPUs. In essence, if a vCPU wants to LL to a page which is not already set as EXCL, it will arrange a rendezvous with all the vCPUs that are executing a TB and query a TLB flush for *all* the vCPUs. Doing so, we make sure that: - the running vCPUs do not touch the EXCL page while the requesting vCPU is setting the transaction to EXCL of the page - all the vCPUs will have the EXCL flag in the TLB entry for that specific page *before* entering the next TB Suggested-by: Jani Kokkonen jani.kokko...@huawei.com Suggested-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com --- cputlb.c| 2 + include/exec/cpu-defs.h | 4 ++ softmmu_llsc_template.h | 97 - 3 files changed, 69 insertions(+), 34 deletions(-) diff --git a/cputlb.c b/cputlb.c index 66df41a..0566e0f 100644 --- a/cputlb.c +++ b/cputlb.c @@ -30,6 +30,8 @@ #include exec/ram_addr.h #include tcg/tcg.h +#include sysemu/cpus.h + void qemu_mutex_lock_iothread(void); void qemu_mutex_unlock_iothread(void); diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h index c73a75f..40742b3 100644 --- a/include/exec/cpu-defs.h +++ b/include/exec/cpu-defs.h @@ -169,5 +169,9 @@ typedef struct CPUIOTLBEntry { /* Used for atomic instruction translation. */ \ bool ll_sc_context; \ hwaddr excl_protected_hwaddr; \ +/* Used to carry the stcond result and also as a flag to flag a + * normal store access made by a stcond. */ \ +int excl_succeeded; \ + #endif diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h index 81e9d8e..4105e72 100644 --- a/softmmu_llsc_template.h +++ b/softmmu_llsc_template.h @@ -54,7 +54,21 @@ (TARGET_PAGE_MASK | TLB_INVALID_MASK)); \ }) \ -#define EXCLUSIVE_RESET_ADDR ULLONG_MAX +#define is_read_tlb_entry_set(env, page, index) \ +({ \ +(addr TARGET_PAGE_MASK) \ + == ((env-tlb_table[mmu_idx][index].addr_read) \ + (TARGET_PAGE_MASK | TLB_INVALID_MASK)); \ +}) \ + +/* Whenever a SC operation fails, we add a small delay to reduce the + * concurrency among the atomic instruction emulation code. Without this delay, + * in very congested situation where plain stores make all the pending LLs + * fail, the code could reach a stalling situation in which all the SCs happen + * to fail. + * TODO: make the delay dynamic according with the SC failing rate. + * */ +#define TCG_ATOMIC_INSN_EMUL_DELAY 100 I'd be tempted to split out this sort of chicanery into a separate patch. OK, I think it's a good idea since it's not strictly required. Regards, alvise WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, uintptr_t retaddr) @@ -65,35 +79,58 @@ WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr, hwaddr hw_addr; unsigned mmu_idx = get_mmuidx(oi); -/* Use the proper load helper from cpu_ldst.h */ -ret = helper_ld_legacy(env, addr, mmu_idx, retaddr); - -/* The last legacy access ensures that the TLB and IOTLB entry for 'addr' - * have been created. */ index = (addr TARGET_PAGE_BITS) (CPU_TLB_SIZE - 1); +if (!is_read_tlb_entry_set(env, addr, index)) { +tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE, mmu_idx, retaddr); +} /* hw_addr = hwaddr of the page (i.e. section-mr-ram_addr + xlat) * plus the offset (i.e. addr ~TARGET_PAGE_MASK) */ hw_addr = (env-iotlb[mmu_idx][index].addr TARGET_PAGE_MASK) + addr; /* Set the exclusive-protected hwaddr. */ -env-excl_protected_hwaddr = hw_addr; -env-ll_sc_context = true; +qemu_mutex_lock(tcg_excl_access_lock); +if (cpu_physical_memory_excl_is_dirty(hw_addr) !exit_flush_request) { +exit_flush_request = 1; -/* No need to mask hw_addr with TARGET_PAGE_MASK since - *
Re: [Qemu-devel] [PATCH 2/2] tcg/i386: Reserve register for guest_base if a segment isn't available
On 07/17/2015 02:33 AM, Peter Maydell wrote: On 16 July 2015 at 22:25, Richard Henderson r...@twiddle.net wrote: This saves 2 insns and 10 bytes from the implementation of each memory operation. Do we have an idea of which platforms/configs don't let us have a segment register for guest_base? We've only bothered to fill in code for Linux so far. These days we should probably probe for fsgsbase insns and use those. Though I don't know how many of the various operating systems enable that... r~
Re: [Qemu-devel] [PATCH RFC 2/9] tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and returns a 32-bit value. Directly call tcg_gen_op3 with the correct types instead of calling tcg_gen_op3i_i32 and abusing the TCG types. Cc: Paolo Bonzinipbonz...@redhat.com Cc: Richard Hendersonr...@twiddle.net Signed-off-by: Aurelien Jarnoaurel...@aurel32.net --- tcg/tcg-op.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Richard Henderson r...@twiddle.net r~
Re: [Qemu-devel] [PATCH RFC 3/9] tcg: implement real ext_i32_i64 and extu_i32_i64 ops
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: Implement optional but real ext_i32_i64 and extu_i32_i64 ops. When implemented, these ensure that a 32-bit value is always converted to a 64-bit value and not propagated through the register allocator or the optimizer. Cc: Paolo Bonzinipbonz...@redhat.com Cc: Richard Hendersonr...@twiddle.net Signed-off-by: Aurelien Jarnoaurel...@aurel32.net --- tcg/aarch64/tcg-target.h | 6 +- tcg/i386/tcg-target.h| 7 ++- tcg/ia64/tcg-target.h| 6 +- tcg/ppc/tcg-target.h | 7 ++- tcg/s390/tcg-target.h| 6 +- tcg/sparc/tcg-target.h | 6 +- tcg/tcg-op.c | 6 ++ tcg/tcg-opc.h| 3 +++ tcg/tci/tcg-target.h | 7 ++- 9 files changed, 47 insertions(+), 7 deletions(-) Reviewed-by: Richard Henderson r...@twiddle.net r~
Re: [Qemu-devel] [PATCH RFC 1/9] tcg: rename trunc_shr_i32 into trunc_shr_i64_i32
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32, and the name in the README doesn't match the name offered to the frontends. Always use the long name to make it clear it is a size changing op. Cc: Paolo Bonzinipbonz...@redhat.com Cc: Richard Hendersonr...@twiddle.net Signed-off-by: Aurelien Jarnoaurel...@aurel32.net --- tcg/README | 2 +- tcg/aarch64/tcg-target.h | 2 +- tcg/i386/tcg-target.h| 2 +- tcg/ia64/tcg-target.h| 2 +- tcg/optimize.c | 6 +++--- tcg/ppc/tcg-target.h | 2 +- tcg/s390/tcg-target.h| 2 +- tcg/sparc/tcg-target.c | 4 ++-- tcg/sparc/tcg-target.h | 2 +- tcg/tcg-op.c | 4 ++-- tcg/tcg-opc.h| 4 ++-- tcg/tcg.h| 2 +- tcg/tci/tcg-target.h | 2 +- 13 files changed, 18 insertions(+), 18 deletions(-) Reviewed-by: Richard Henderson r...@twiddle.net r~
Re: [Qemu-devel] [PATCH RFC 7/9] tcg: replace ext/u_i32_i64 by a mov when not implemented
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: When ext_i32_i64 and extu_i32_i64 ops are not implemented, this means that the register is already properly zero/sign extended, so we can simply replace it by a mov. In practice it means at least one of the two ops should always be implemented on 64-bit targets. Cc: Paolo Bonzinipbonz...@redhat.com Cc: Richard Hendersonr...@twiddle.net Signed-off-by: Aurelien Jarnoaurel...@aurel32.net --- tcg/tcg-op.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) If we're going to do this (and of course pick a solution for all of the other backends), I think perhaps x86 should choose trunc + exts as the two that should be implemented, leaving extu the one that can be folded away. Something to experiment with... r~
Re: [Qemu-devel] [virtio guest] vring_need_event() from virtqueue_kick_prepare()
Do you mean vhost_net - old kernel, qemu - latest, guest - latest? On Thu, Jul 16, 2015 at 7:33 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jul 16, 2015 at 1:54 PM, Catalin Vasile catalinvasil...@gmail.com wrote: Both. The compiled kernel was common for both. Does vhost_net work with the old kernel + new QEMU combo? Stefan
Re: [Qemu-devel] [PULL for-2.4 0/4] input: fixes for 2.4
Hi, I'm afraid this doesn't build for Windows: In file included from /home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:13: /home/petmay01/linaro/qemu-for-merges/include/standard-headers/linux/input.h:890:1: error: SW_MAX redefined In file included from /usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/windows.h:55, from /home/petmay01/linaro/qemu-for-merges/include/sysemu/os-win32.h:29, from /home/petmay01/linaro/qemu-for-merges/include/qemu-common.h:48, from /home/petmay01/linaro/qemu-for-merges/include/qemu/iov.h:17, from /home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:7: /usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/winuser.h:729:1: error: this is the location of the previous definition Lovely. Looks like a bug in the mingw headers to me, especially as my machine cross-builds this just fine for both win32 and win64. We are in hard-freeze though, so no time to experiments here, I'll redo the pull with patch #3 dropped, lets sort this post-2.4. cheers, Gerd
[Qemu-devel] [PULL 1/3] virtio-input: fix segfault in virtio_input_hid_properties
From: Lin Ma l...@suse.com commit 5cce173 introduced virtio-input segfault, This patch fixes it. Signed-off-by: Lin Ma l...@suse.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/input/virtio-input-hid.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/input/virtio-input-hid.c b/hw/input/virtio-input-hid.c index 616a815..4d85dad 100644 --- a/hw/input/virtio-input-hid.c +++ b/hw/input/virtio-input-hid.c @@ -308,6 +308,7 @@ static void virtio_input_hid_handle_status(VirtIOInput *vinput, static Property virtio_input_hid_properties[] = { DEFINE_PROP_STRING(display, VirtIOInputHID, display), DEFINE_PROP_UINT32(head, VirtIOInputHID, head, 0), +DEFINE_PROP_END_OF_LIST(), }; static void virtio_input_hid_class_init(ObjectClass *klass, void *data) -- 1.8.3.1
Re: [Qemu-devel] [PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init()
On 07/16/2015 11:39 PM, Eduardo Habkost wrote: This fixes the following crash, introduced by commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6: $ gdb --args qemu-system-x86_64 -machine pc,mem-merge=off -object memory-backend-ram,id=ram-node0,size=1024 [...] Program received signal SIGABRT, Aborted. (gdb) bt #0 0x7253b8c7 in raise () at /lib64/libc.so.6 #1 0x7253d52a in abort () at /lib64/libc.so.6 #2 0x7253446d in __assert_fail_base () at /lib64/libc.so.6 #3 0x72534522 in () at /lib64/libc.so.6 #4 0x558bb80a in qemu_opt_get_bool_helper (opts=0x5621b650, name=name@entry=0x558ec922 mem-merge, defval=defval@entry=true, del=del@entry=false) at qemu/util/qemu-option.c:388 #5 0x558bbb5a in qemu_opt_get_bool (opts=optimized out, name=name@entry=0x558ec922 mem-merge, defval=defval@entry=true) at qemu/util/qemu-option.c:398 #6 0x55720a24 in host_memory_backend_init (obj=0x562ac970) at qemu/backends/hostmem.c:226 Instead of using qemu_opt_get_bool(), that didn't work with qemu_machine_opts for a long time, we can use the machine QOM properties directly. This was the intent. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- backends/hostmem.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/backends/hostmem.c b/backends/hostmem.c index 61c1ac0..38a32ed 100644 --- a/backends/hostmem.c +++ b/backends/hostmem.c @@ -10,6 +10,7 @@ * See the COPYING file in the top-level directory. */ #include sysemu/hostmem.h +#include hw/boards.h #include qapi/visitor.h #include qapi-types.h #include qapi-visit.h @@ -223,10 +224,10 @@ static void host_memory_backend_init(Object *obj) { HostMemoryBackend *backend = MEMORY_BACKEND(obj); -backend-merge = qemu_opt_get_bool(qemu_get_machine_opts(), - mem-merge, true); -backend-dump = qemu_opt_get_bool(qemu_get_machine_opts(), - dump-guest-core, true); +backend-merge = object_property_get_bool(OBJECT(current_machine), + mem-merge, error_abort); +backend-dump = object_property_get_bool(OBJECT(current_machine), +dump-guest-core, error_abort); backend-prealloc = mem_prealloc; object_property_add_bool(obj, merge, Thank you for fixing this, Reviewed-by: Marcel Apfelbaum mar...@redhat.com
Re: [Qemu-devel] memory hotplug fails with mem-merge option
On 07/16/2015 03:24 PM, Igor Mammedov wrote: On Thu, 16 Jul 2015 16:00:07 +0530 Bharata B Rao bhar...@linux.vnet.ibm.com wrote: On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote: Hi, When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding memory-backend-ram object fails like below. Same failure is seen with -machine pseries,mem-merge=on|off. (qemu) object_add memory-backend-ram,id=ram0,size=1G qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: Assertion `opt-desc opt-desc-type == QEMU_OPT_BOOL' failed. Same failure seen with -machine pseries,mem-merge=off when used together with -object memory-backend-file on QEMU commandline in which case boot time failure is observed. Regards, Bharata. looks like related to 0a7cf217 util/qemu-config: fix regression of qmp_query_command_line_options and 49d2e64 machine: remove qemu_machine_opts global list CCing Marcel in hope he knows how to fix it right away. Hi, Can you please check that Eduardo's patch fixes this? [PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init() https://www.mail-archive.com/qemu-devel@nongnu.org/msg310915.html Thanks, Marcel
Re: [Qemu-devel] net: Next steps to deprecate -net
On 07/17/2015 09:25 AM, Peter Maydell wrote: On 17 July 2015 at 07:53, Thomas Huth th...@redhat.com wrote: Ok, assuming that my Network traffic dumping for -netdev devices patch series is going to solve the dumping-for-netdev problem, how do we tackle the remaining problems that we have to solve before we can deprecate -net? Does anybody have a survey of the (onboard) NICs that can only be configured with -net but not with -device? Could they nowadays be changed to work with -device, too, or are there still major obstacles to solve first? The problem is that -device says create a new device and configure it like this. But onboard NICs are created by the board, so we want let the user say how to configure those devices, not create new ones... Ok, I see ... maybe it makes sense to simply keep -net nic to be able to configure the default/onboard NIC, and only to remove all the other -net options instead (-net user etc.). The disliked vlan/hub concept could then be removed, too, since -net nic can be used together with -netdev nowadays by using something like -net nic,netdev=xxx as far as I know. That would clean up most points of confusion, I think, and would not cause too much code churn for the onboard NICs. Does that sound feasible? Thomas signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] memory hotplug fails with mem-merge option
On Fri, Jul 17, 2015 at 10:26:16AM +0300, Marcel Apfelbaum wrote: On 07/16/2015 03:24 PM, Igor Mammedov wrote: On Thu, 16 Jul 2015 16:00:07 +0530 Bharata B Rao bhar...@linux.vnet.ibm.com wrote: On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote: Hi, When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding memory-backend-ram object fails like below. Same failure is seen with -machine pseries,mem-merge=on|off. (qemu) object_add memory-backend-ram,id=ram0,size=1G qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: Assertion `opt-desc opt-desc-type == QEMU_OPT_BOOL' failed. Same failure seen with -machine pseries,mem-merge=off when used together with -object memory-backend-file on QEMU commandline in which case boot time failure is observed. Regards, Bharata. looks like related to 0a7cf217 util/qemu-config: fix regression of qmp_query_command_line_options and 49d2e64 machine: remove qemu_machine_opts global list CCing Marcel in hope he knows how to fix it right away. Hi, Can you please check that Eduardo's patch fixes this? [PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init() https://www.mail-archive.com/qemu-devel@nongnu.org/msg310915.html Yes, this fixes the problem. Regards, Bharata.
Re: [Qemu-devel] [PATCH v2 3/3] AioContext: fix broken ctx-dispatching optimization
On Fri, Jul 17, 2015 at 06:17:47AM +0200, Paolo Bonzini wrote: On 16/07/2015 11:56, Paolo Bonzini wrote: @@ -286,13 +283,15 @@ bool aio_poll(AioContext *ctx, bool blocking) npfd = 0; ctx-walking_handlers--; +if (blocking) { +atomic_sub(ctx-notify_me, 2); +} + I kept this place for subtracting notify_me because it is the same place where aio_set_dispatching was called. However, it can be anticipated to /* if we have any readable fds, dispatch event */ if (ret 0) { for (i = 0; i npfd; i++) { i.e. right after poll. As poll has exited, it can't be blocking the thread anymore. Stefan, please let me send v3 on Monday. Sure. Stefan pgp5ys_kE4p0H.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH RFC 4/9] tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: They behave the same as ext32s_i64 and ext32u_i64 from the constant folding and zero propagation point of view, except that they can't be replaced by a mov, so we don't compute the affected value. Cc: Paolo Bonzinipbonz...@redhat.com Cc: Richard Hendersonr...@twiddle.net Signed-off-by: Aurelien Jarnoaurel...@aurel32.net --- tcg/optimize.c | 10 ++ 1 file changed, 10 insertions(+) Reviewed-by: Richard Henderson r...@twiddle.net r~
[Qemu-devel] [PATCH V3] virtio-net: unbreak any layout
Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 (virtio-net: byteswap virtio-net header) breaks any layout by requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by copying header to temporary buffer if swap is needed, and then use this buffer as part of out_sg. Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 (virtio-net: byteswap virtio-net header) Cc: qemu-sta...@nongnu.org Cc: c...@fr.ibm.com Signed-off-by: Jason Wang jasow...@redhat.com --- Changes from V2: - Drop the packet that will be truncated - Avoid scan the whole iov by just checking the return of iov_to_buf() Changes from V1: - avoid header copying if there's no need to do header swap - don't write the header back --- hw/net/virtio-net.c | 23 ++- include/hw/virtio/virtio-access.h | 9 + 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e3c2db3..9f7e91d 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) ssize_t ret, len; unsigned int out_num = elem.out_num; struct iovec *out_sg = elem.out_sg[0]; -struct iovec sg[VIRTQUEUE_MAX_SIZE]; +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE + 1]; +struct virtio_net_hdr_mrg_rxbuf mhdr; if (out_num 1) { error_report(virtio-net header not in first element); @@ -1150,13 +1151,25 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) } if (n-has_vnet_hdr) { -if (out_sg[0].iov_len n-guest_hdr_len) { +if (iov_to_buf(out_sg, out_num, 0, mhdr, n-guest_hdr_len) +n-guest_hdr_len) { error_report(virtio-net header incorrect); exit(1); } -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base); +if (virtio_needs_swap(vdev)) { +virtio_net_hdr_swap(vdev, (void *) mhdr); +sg2[0].iov_base = mhdr; +sg2[0].iov_len = n-guest_hdr_len; +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1, + out_sg, out_num, + n-guest_hdr_len, -1); +if (out_num == VIRTQUEUE_MAX_SIZE) { +goto drop; + } +out_num += 1; +out_sg = sg2; + } } - /* * If host wants to see the guest header as is, we can * pass it on unchanged. Otherwise, copy just the parts @@ -1186,7 +1199,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) } len += ret; - +drop: virtqueue_push(q-tx_vq, elem, 0); virtio_notify(vdev, q-tx_vq); diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h index cee5dd7..1ec1dfd 100644 --- a/include/hw/virtio/virtio-access.h +++ b/include/hw/virtio/virtio-access.h @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, const void *ptr) } } +static inline bool virtio_needs_swap(VirtIODevice *vdev) +{ +#ifdef HOST_WORDS_BIGENDIAN +return virtio_access_is_big_endian(vdev) ? false : true; +#else +return virtio_access_is_big_endian(vdev) ? true : false; +#endif +} + static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s) { #ifdef HOST_WORDS_BIGENDIAN -- 2.1.4
Re: [Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)
On Fri, Jul 17, 2015 at 08:53:08AM +0200, Thomas Huth wrote: On 05/26/2015 04:29 PM, Markus Armbruster wrote: Stefan Hajnoczi stefa...@redhat.com writes: [...] We thought the QEMU vlan concept would be dropped completely in the future, so it was never added to -netdev. No patches to do that have been posted over the years, so I think it was more of a conceptual goal than a concrete requirement. Well, patches to do that first need to replace the VLAN-only dump feature. To fully deprecate -net, we also have to replace -net nic for configuring onboard NICs. Prior discussion: http://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg03743.html We haven't really tried either. Ok, assuming that my Network traffic dumping for -netdev devices patch series is going to solve the dumping-for-netdev problem, how do we tackle the remaining problems that we have to solve before we can deprecate -net? Does anybody have a survey of the (onboard) NICs that can only be configured with -net but not with -device? Could they nowadays be changed to work with -device, too, or are there still major obstacles to solve first? Take a look at nd_table[] and nb_nics. That's the array of -net nic devices. The boards look into the array to grab NICs. The default NIC is set in net_init_clients() as a -net nic option. The weird thing about -net nic is that the device is not created in net/net.c by the -net nic command-line option. It just adds the information to the nd_table[] array. It's the board that has to instantiate nd_table[] entries. For example, pc_nic_init() adds devices for x86 guests. That's about all I know or have thought about so far. It would be nice to get rid of -net but it will take some work and is a QEMU 3.0 feature since it breaks backwards compatibility. pgp0U_M3wqYnm.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH RFC 8/9] tcg/optimize: do not simplify size changing moves
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: Now that we have real size changing ops, we don't need to marked high bits of the destination as garbage. The goal of the optimizer is to predict the value of the temps (and not of the registers) and do simplifications when possible. The problem there is therefore not the fact that those bits are not counted as garbage, but that a size changing op is replaced by a move. This patch is basically a revert of 24666baf, including the changes that have been made since then. Cc: Paolo Bonzini pbonz...@redhat.com Cc: Richard Henderson r...@twiddle.net Signed-off-by: Aurelien Jarno aurel...@aurel32.net What we're missing here is whether the omitted size changing op is extu or exts. Mask should be extended to match. Which means keeping most of this code. r~
Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
On 07/16/2015 03:11 PM, David Gibson wrote: On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote: This makes use of the new memory registering feature. The idea is to provide the userspace ability to notify the host kernel about pages which are going to be used for DMA. Having this information, the host kernel can pin them all once per user process, do locked pages accounting (once) and not spent time on doing that in real time with possible failures which cannot be handled nicely in some cases. This adds a guest RAM memory listener which notifies a VFIO container about memory which needs to be pinned/unpinned. VFIO MMIO regions (i.e. skip dump regions) are skipped. The feature is only enabled for SPAPR IOMMU v2. The host kernel changes are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does not call it when v2 is detected and enabled. This does not change the guest visible interface. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru I've looked at this in more depth now, and attempting to unify the pre-reg and mapping listeners like this can't work - they need to be listening on different address spaces: mapping actions need to be listening on the PCI address space, whereas the pre-reg needs to be listening on address_space_memory. For x86 - for now - those end up being the same thing, but on Power they're not. We do need to be clear about what differences are due to the presence of a guest IOMMU versus which are due to arch or underlying IOMMU type. For now Power has a guest IOMMU and x86 doesn't, but that could well change in future: we could well implement the guest side IOMMU for x86 in future (or x86 could invent a paravirt IOMMU interface). On the other side, BenH's experimental powernv machine type could introduce Power machines without a guest side IOMMU (or at least an optional guest side IOMMU). The quick and dirty approach here is: 1. Leave the main listener as is 2. Add a new pre-reg notifier to the spapr iommu specific code, which listens on address_space_memory, *not* the PCI space The more generally correct approach, which allows for more complex IOMMU arrangements and the possibility of new IOMMU types with pre-reg is: 1. Have the core implement both a mapping listener and a pre-reg listener (optionally enabled by a per-iommu-type flag). Basically the first one sees what *is* mapped, the second sees what *could* be mapped. 2. As now, the mapping listener listens on PCI address space, if RAM blocks are added, immediately map them into the host IOMMU, if guest IOMMU blocks appear register a notifier which will mirror guest IOMMU mappings to the host IOMMU (this is what we do now). 3. The pre-reg listener also listens on the PCI address space. RAM blocks added are pre-registered immediately. PCI address space listeners won't be notified about RAM blocks on sPAPR. But, if guest IOMMU blocks are added, instead of registering a guest-iommu notifier, guest-iommu notifier is the one called via memory_region_notify_iommu() from H_PUT_TCE? Instead implies dropping it, how this can work? we register another listener on the *target* AS of the guest IOMMU, same callbacks as this one. In practice that target AS will almost always resolve to address_space_memory, but this can at least in theory handle crazy guest setups with multiple layers of IOMMU. 4. Have to ensure that the pre-reg callbacks always happen before the mapping calls. For a system with an IOMMU backend which requires pre-registration, but doesn't have a guest IOMMU, we need to pre-reg, then host-iommu-map RAM blocks that appear in PCI address space. -- Alexey
[Qemu-devel] [PATCH V3 1/2] tests: introduce basic pci test for virtio-net
Signed-off-by: Jason Wang jasow...@redhat.com --- Changes from V2: - Remove duplicated code of gtest_start() Changes from V1: - replace the magic value 12 with a macro --- tests/Makefile | 2 +- tests/virtio-net-test.c | 185 ++-- 2 files changed, 179 insertions(+), 8 deletions(-) diff --git a/tests/Makefile b/tests/Makefile index 2c4b8dc..5805326 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -381,7 +381,7 @@ tests/wdt_ib700-test$(EXESUF): tests/wdt_ib700-test.o tests/tco-test$(EXESUF): tests/tco-test.o $(libqos-pc-obj-y) tests/virtio-balloon-test$(EXESUF): tests/virtio-balloon-test.o tests/virtio-blk-test$(EXESUF): tests/virtio-blk-test.o $(libqos-virtio-obj-y) -tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o $(libqos-pc-obj-y) +tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o $(libqos-pc-obj-y) $(libqos-virtio-obj-y) tests/virtio-rng-test$(EXESUF): tests/virtio-rng-test.o $(libqos-pc-obj-y) tests/virtio-scsi-test$(EXESUF): tests/virtio-scsi-test.o $(libqos-virtio-obj-y) tests/virtio-9p-test$(EXESUF): tests/virtio-9p-test.o diff --git a/tests/virtio-net-test.c b/tests/virtio-net-test.c index ea7478c..b37dd3e 100644 --- a/tests/virtio-net-test.c +++ b/tests/virtio-net-test.c @@ -10,20 +10,192 @@ #include glib.h #include string.h #include libqtest.h +#include qemu-common.h +#include qemu/sockets.h #include qemu/osdep.h -#include libqos/pci.h +#include qemu/iov.h +#include libqos/pci-pc.h +#include libqos/virtio.h +#include libqos/virtio-pci.h +#include libqos/malloc.h +#include libqos/malloc-pc.h +#include libqos/malloc-generic.h +#include qemu/bswap.h +#include hw/virtio/virtio-net.h #define PCI_SLOT_HP 0x06 +#define PCI_SLOT0x04 +#define PCI_FN 0x00 -/* Tests only initialization so far. TODO: Replace with functional tests */ -static void pci_nop(void) +#define QVIRTIO_NET_TIMEOUT_US (30 * 1000 * 1000) +#define VNET_HDR_SIZE sizeof(struct virtio_net_hdr_mrg_rxbuf) + +static void test_end(void) +{ +qtest_end(); +} + +#ifndef _WIN32 + +static QVirtioPCIDevice *virtio_net_pci_init(QPCIBus *bus, int slot) +{ +QVirtioPCIDevice *dev; + +dev = qvirtio_pci_device_find(bus, QVIRTIO_NET_DEVICE_ID); +g_assert(dev != NULL); +g_assert_cmphex(dev-vdev.device_type, ==, QVIRTIO_NET_DEVICE_ID); + +qvirtio_pci_device_enable(dev); +qvirtio_reset(qvirtio_pci, dev-vdev); +qvirtio_set_acknowledge(qvirtio_pci, dev-vdev); +qvirtio_set_driver(qvirtio_pci, dev-vdev); + +return dev; +} + +static QPCIBus *pci_test_start(int socket) +{ +char *cmdline; + +cmdline = g_strdup_printf(-netdev socket,fd=%d,id=hs0 -device + virtio-net-pci,netdev=hs0, socket); +qtest_start(cmdline); +g_free(cmdline); + +return qpci_init_pc(); +} + +static void driver_init(const QVirtioBus *bus, QVirtioDevice *dev) +{ +uint32_t features; + +features = qvirtio_get_features(bus, dev); +features = features ~(QVIRTIO_F_BAD_FEATURE | +QVIRTIO_F_RING_INDIRECT_DESC | +QVIRTIO_F_RING_EVENT_IDX); +qvirtio_set_features(bus, dev, features); + +qvirtio_set_driver_ok(bus, dev); +} + +static void rx_test(const QVirtioBus *bus, QVirtioDevice *dev, +QGuestAllocator *alloc, QVirtQueue *vq, +int socket) +{ +uint64_t req_addr; +uint32_t free_head; +char test[] = TEST; +char buffer[64]; +int len = htonl(sizeof(test)); +struct iovec iov[] = { +{ +.iov_base = len, +.iov_len = sizeof(len), +}, { +.iov_base = test, +.iov_len = sizeof(test), +}, +}; +int ret; + +req_addr = guest_alloc(alloc, 64); + +free_head = qvirtqueue_add(vq, req_addr, 64, true, false); +qvirtqueue_kick(bus, dev, vq, free_head); + +ret = iov_send(socket, iov, 2, 0, sizeof(len) + sizeof(test)); +g_assert_cmpint(ret, ==, sizeof(test) + sizeof(len)); + +qvirtio_wait_queue_isr(bus, dev, vq, QVIRTIO_NET_TIMEOUT_US); +memread(req_addr + VNET_HDR_SIZE, buffer, sizeof(test)); +g_assert_cmpstr(buffer, ==, TEST); + +guest_free(alloc, req_addr); +} + +static void tx_test(const QVirtioBus *bus, QVirtioDevice *dev, +QGuestAllocator *alloc, QVirtQueue *vq, +int socket) +{ +uint64_t req_addr; +uint32_t free_head; +uint32_t len; +char buffer[64]; +int ret; + +req_addr = guest_alloc(alloc, 64); +memwrite(req_addr + VNET_HDR_SIZE, TEST, 4); + +free_head = qvirtqueue_add(vq, req_addr, 64, false, false); +qvirtqueue_kick(bus, dev, vq, free_head); + +qvirtio_wait_queue_isr(bus, dev, vq, QVIRTIO_NET_TIMEOUT_US); +guest_free(alloc, req_addr); + +ret = qemu_recv(socket, len, sizeof(len), 0); +g_assert_cmpint(ret, ==, sizeof(len)); +len = ntohl(len); + +ret =
Re: [Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)
On 17 July 2015 at 07:53, Thomas Huth th...@redhat.com wrote: Ok, assuming that my Network traffic dumping for -netdev devices patch series is going to solve the dumping-for-netdev problem, how do we tackle the remaining problems that we have to solve before we can deprecate -net? Does anybody have a survey of the (onboard) NICs that can only be configured with -net but not with -device? Could they nowadays be changed to work with -device, too, or are there still major obstacles to solve first? The problem is that -device says create a new device and configure it like this. But onboard NICs are created by the board, so we want let the user say how to configure those devices, not create new ones... -- PMM
[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
Unfortunately I seem to be unable to get this bug triggered with the reproducer. It could be a detail of the guest setup I am missing. Since I do not have access to RHEL I used CentOS 6.3 in a 8core guest with 2 virtio disks. Host was 14.04. Left the script running for quite a bit but no crash happened. So it would be up to you to confirm that with a current 14.04 host you still can trigger the bug and with the patched version of qemu from http://people.canonical.com/~smb/lp1465935/ it would be gone. Thanks. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1465935 Title: kvm_irqchip_commit_routes: Assertion `ret == 0' failed Status in QEMU: New Status in qemu package in Ubuntu: Confirmed Status in qemu source package in Precise: New Status in qemu source package in Trusty: New Status in qemu source package in Utopic: New Status in qemu source package in Vivid: New Bug description: Several my QEMU instances crashed, and in the qemu log, I can see this assertion failure, qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38. Guest OS is RHEL 6.3. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions
Re: [Qemu-devel] [PATCH RFC 9/9] tcg: update README about size changing ops
On 07/15/2015 12:03 PM, Aurelien Jarno wrote: +These ops are all optional in that case they are implemented as mov. +This is to allow some optimizations if the target maintains registers +zero or sign extended. For example a MIPS64 CPU requires that all +32-bit values are stored sign-extended in the registers. This means +the trunc_shr_i64_i32 should sign-extend the value when moving it +from a 64-bit to a 32-bit register. It also means ext_i32_i64 can be +implemented as a simple mov as the value is already sign extended. We need better wording. Each one of the three are optional, and the other two must be implemented. I think we ought to have a check in tcg.c about this, in tcg_add_target_add_op_defs. r~
[Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)
On 05/26/2015 04:29 PM, Markus Armbruster wrote: Stefan Hajnoczi stefa...@redhat.com writes: [...] We thought the QEMU vlan concept would be dropped completely in the future, so it was never added to -netdev. No patches to do that have been posted over the years, so I think it was more of a conceptual goal than a concrete requirement. Well, patches to do that first need to replace the VLAN-only dump feature. To fully deprecate -net, we also have to replace -net nic for configuring onboard NICs. Prior discussion: http://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg03743.html We haven't really tried either. Ok, assuming that my Network traffic dumping for -netdev devices patch series is going to solve the dumping-for-netdev problem, how do we tackle the remaining problems that we have to solve before we can deprecate -net? Does anybody have a survey of the (onboard) NICs that can only be configured with -net but not with -device? Could they nowadays be changed to work with -device, too, or are there still major obstacles to solve first? Thomas signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PULL v2 for-2.4 0/3] input: fixes for 2.4
Hi, Dropped the patch to enable virtio-input for non-linux systems. Otherwise unmodified. please pull, Gerd The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375: Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging (2015-07-16 10:40:23 +0100) are available in the git repository at: git://git.kraxel.org/qemu tags/pull-input-20150717-1 for you to fetch changes up to 562f93754b95fd6dc65ad9a2aa15a90b2da7e8a4: hid: clarify hid_keyboard_process_keycode (2015-07-17 08:44:41 +0200) input: fixes for 2.4 Gerd Hoffmann (1): virtio-input: move sys/ioctl.h include Lin Ma (1): virtio-input: fix segfault in virtio_input_hid_properties Paolo Bonzini (1): hid: clarify hid_keyboard_process_keycode hw/input/hid.c | 32 hw/input/virtio-input-hid.c| 1 + hw/input/virtio-input-host.c | 1 + include/standard-headers/linux/input.h | 1 - scripts/update-linux-headers.sh| 1 + 5 files changed, 31 insertions(+), 5 deletions(-)
[Qemu-devel] [PULL 3/3] hid: clarify hid_keyboard_process_keycode
From: Paolo Bonzini pbonz...@redhat.com Coverity thinks the fallthroughs are smelly. They are correct, but everything else in this function is like wut?. Refer explicitly to bits 8 and 9 of hs-kbd.modifiers instead of shifting right first and using (1 7). Document what the scancode is when hid_code is 0xe0. And add plenty of comments. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/input/hid.c | 32 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/hw/input/hid.c b/hw/input/hid.c index 6841cb8..21ebd9e 100644 --- a/hw/input/hid.c +++ b/hw/input/hid.c @@ -239,7 +239,7 @@ static void hid_keyboard_event(DeviceState *dev, QemuConsole *src, static void hid_keyboard_process_keycode(HIDState *hs) { -uint8_t hid_code, key; +uint8_t hid_code, index, key; int i, keycode, slot; if (hs-n == 0) { @@ -249,7 +249,8 @@ static void hid_keyboard_process_keycode(HIDState *hs) keycode = hs-kbd.keycodes[slot]; key = keycode 0x7f; -hid_code = hid_usage_keys[key | ((hs-kbd.modifiers 1) (1 7))]; +index = key | ((hs-kbd.modifiers (1 8)) 1); +hid_code = hid_usage_keys[index]; hs-kbd.modifiers = ~(1 8); switch (hid_code) { @@ -257,18 +258,41 @@ static void hid_keyboard_process_keycode(HIDState *hs) return; case 0xe0: +assert(key == 0x1d); if (hs-kbd.modifiers (1 9)) { -hs-kbd.modifiers ^= 3 8; +/* The hid_codes for the 0xe1/0x1d scancode sequence are 0xe9/0xe0. + * Here we're processing the second hid_code. By dropping bit 9 + * and setting bit 8, the scancode after 0x1d will access the + * second half of the table. + */ +hs-kbd.modifiers ^= (1 8) | (1 9); return; } +/* fall through to process Ctrl_L */ case 0xe1 ... 0xe7: +/* Ctrl_L/Ctrl_R, Shift_L/Shift_R, Alt_L/Alt_R, Win_L/Win_R. + * Handle releases here, or fall through to process presses. + */ if (keycode (1 7)) { hs-kbd.modifiers = ~(1 (hid_code 0x0f)); return; } -case 0xe8 ... 0xef: +/* fall through */ +case 0xe8 ... 0xe9: +/* USB modifiers are just 1 byte long. Bits 8 and 9 of + * hs-kbd.modifiers implement a state machine that detects the + * 0xe0 and 0xe1/0x1d sequences. These bits do not follow the + * usual rules where bit 7 marks released keys; they are cleared + * elsewhere in the function as the state machine dictates. + */ hs-kbd.modifiers |= 1 (hid_code 0x0f); return; + +case 0xea ... 0xef: +abort(); + +default: +break; } if (keycode (1 7)) { -- 1.8.3.1
[Qemu-devel] [PULL 2/3] virtio-input: move sys/ioctl.h include
Drop from include/standard-headers/linux/input.h Add to hw/input/virtio-input-host.c instead. That allows to build virtio-input (except pass-through) on windows. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/input/virtio-input-host.c | 1 + include/standard-headers/linux/input.h | 1 - scripts/update-linux-headers.sh| 1 + 3 files changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/input/virtio-input-host.c b/hw/input/virtio-input-host.c index f7e3d84..8978f16 100644 --- a/hw/input/virtio-input-host.c +++ b/hw/input/virtio-input-host.c @@ -11,6 +11,7 @@ #include hw/virtio/virtio.h #include hw/virtio/virtio-input.h +#include sys/ioctl.h #include standard-headers/linux/input.h /* - */ diff --git a/include/standard-headers/linux/input.h b/include/standard-headers/linux/input.h index a459dd2..b003c67 100644 --- a/include/standard-headers/linux/input.h +++ b/include/standard-headers/linux/input.h @@ -10,7 +10,6 @@ #include sys/time.h -#include sys/ioctl.h #include sys/types.h #include standard-headers/linux/types.h diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh index 47378d9..f0e830c 100755 --- a/scripts/update-linux-headers.sh +++ b/scripts/update-linux-headers.sh @@ -56,6 +56,7 @@ cp_virtio() { -e 's/__bitwise__//' \ -e 's/__attribute__((packed))/QEMU_PACKED/' \ -e 's/__inline__/inline/' \ +-e '/sys\/ioctl.h/d' \ $f $to/$header; done fi -- 1.8.3.1