Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
On 5 October 2015 at 18:11, Markus Armbrusterwrote: > Sigh. I'll stare at the original backtrace some more. Here's one from a debug build (with some reasoning about the cause underneath it). As an aside, thanks to your valgrind example I figured out how to get the test makefile to run tests under gdb: QTEST_QEMU_BINARY="xterm -e gdb --tty $(tty) --args aarch64-softmmu/qemu-system-aarch64" QTEST_QEMU_IMG=qemu-img gtester -k --verbose -m=quick tests/device-introspect-test (which will start a new xterm with a gdb in it for debugging each test). #0 0x7fff9145e286 in __pthread_kill () #1 0x7fff912529f9 in pthread_kill () #2 0x7fff956db9b3 in abort () #3 0x00010103ec50 in g_assertion_message () #4 0x00010103ec95 in g_assertion_message_expr () #5 0x00010045bbb2 in object_initialize_with_type (data=0x102056520, size=344, type=0x0) at /Users/pm215/src/qemu-for-merges/qom/object.c:333 #6 0x00010045c112 in object_initialize (data=0x102056520, size=344, typename=0x1005a7a96 "virtio-tablet-device") at /Users/pm215/src/qemu-for-merges/qom/object.c:352 #7 0x0001000e7d74 in virtio_instance_init_common (proxy_obj=0x10204e400, data=0x102056520, vdev_size=344, vdev_name=0x1005a7a96 "virtio-tablet-device") at /Users/pm215/src/qemu-for-merges/hw/virtio/virtio.c:1468 #8 0x0001003efe47 in virtio_tablet_initfn (obj=0x10204e400) at /Users/pm215/src/qemu-for-merges/hw/virtio/virtio-pci.c:2133 #9 0x00010045c066 in object_init_with_type (obj=0x10204e400, ti=0x10154f420) at /Users/pm215/src/qemu-for-merges/qom/object.c:314 #10 0x00010045bcf2 in object_initialize_with_type (data=0x10204e400, size=33400, type=0x10154f420) at /Users/pm215/src/qemu-for-merges/qom/object.c:344 #11 0x00010045c2a9 in object_new_with_type (type=0x10154f420) at /Users/pm215/src/qemu-for-merges/qom/object.c:430 #12 0x00010045c2f2 in object_new (typename=0x101585cc0 "virtio-tablet-pci") at /Users/pm215/src/qemu-for-merges/qom/object.c:440 #13 0x00010021923f in qmp_device_list_properties (typename=0x101585cc0 "virtio-tablet-pci", errp=0x7fff5fbfd850) at /Users/pm215/src/qemu-for-merges/qmp.c:529 #14 0x00010020e054 in qmp_marshal_device_list_properties (args=0x102023200, ret=0x7fff5fbfd8c8, errp=0x7fff5fbfd8d0) at qmp-marshal.c:1693 #15 0x000100047fe1 in handle_qmp_command (parser=0x10157f388, tokens=0x1013069a0) at /Users/pm215/src/qemu-for-merges/monitor.c:3860 #16 0x00010052fb16 in json_message_process_token (lexer=0x10157f390, token=0x101585b40, type=JSON_OPERATOR, x=15754, y=0) at /Users/pm215/src/qemu-for-merges/qobject/json-streamer.c:87 #17 0x00010052f45a in json_lexer_feed_char (lexer=0x10157f390, ch=125 '}', flush=false) at /Users/pm215/src/qemu-for-merges/qobject/json-lexer.c:303 #18 0x00010052f2e1 in json_lexer_feed (lexer=0x10157f390, buffer=0x7fff5fbfdb00 "}6\035�q&", size=1) at /Users/pm215/src/qemu-for-merges/qobject/json-lexer.c:356 #19 0x00010052fbc7 in json_message_parser_feed (parser=0x10157f388, buffer=0x7fff5fbfdb00 "}6\035�q&", size=1) at /Users/pm215/src/qemu-for-merges/qobject/json-streamer.c:110 #20 0x000100047ce6 in monitor_qmp_read (opaque=0x10157f310, buf=0x7fff5fbfdb00 "}6\035�q&", size=1) at /Users/pm215/src/qemu-for-merges/monitor.c:3875 #21 0x0001001f06d7 in qemu_chr_be_write (s=0x10157e310, buf=0x7fff5fbfdb00 "}6\035�q&", len=1) at /Users/pm215/src/qemu-for-merges/qemu-char.c:305 #22 0x0001001f661a in tcp_chr_read (chan=0x10157e5a0, cond=G_IO_IN, opaque=0x10157e310) at /Users/pm215/src/qemu-for-merges/qemu-char.c:2873 #23 0x00010101f0bd in g_main_context_dispatch () #24 0x000100473aaa in glib_pollfds_poll () at /Users/pm215/src/qemu-for-merges/main-loop.c:211 #25 0x00010047371b in os_host_main_loop_wait (timeout=0) at /Users/pm215/src/qemu-for-merges/main-loop.c:256 #26 0x00010047358d in main_loop_wait (nonblocking=1) at /Users/pm215/src/qemu-for-merges/main-loop.c:504 #27 0x00010020715c in main_loop () at /Users/pm215/src/qemu-for-merges/vl.c:1880 #28 0x0001002015bc in qemu_main (argc=14, argv=0x7fff5fbff5c0, envp=0x7fff5fbff638) at /Users/pm215/src/qemu-for-merges/vl.c:4634 #29 0x000100435c73 in main (argc=14, argv=0x7fff5fbff5c0) at /Users/pm215/src/qemu-for-merges/ui/cocoa.m:1164 I think the problem here is that the "virtio-tablet-pci" device is in hw/virtio/virtio-pci.c, which gets built for all hosts, but it uses the device "virtio-tablet-device", which is defined in hw/input/virtio-input-hid.c, which is only built if CONFIG_LINUX is defined. So on non-Linux hosts we get a virtio-tablet-pci device which can't be created. The easy fix is to have some suitable ifdeffery in virtio-pci.c, similar to how we only register the virtio_9p_pci and virtio_scsi_pci types if they've been configured into this build. thanks -- PMM
Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU
On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote: > On 17 September 2015 at 17:32, Paolo Bonziniwrote: > > > > > > On 17/09/2015 18:16, Peter Maydell wrote: > >> On 17 September 2015 at 17:00, Paolo Bonzini wrote: > >>> > >>> > >>> On 17/09/2015 16:24, Peter Maydell wrote: > Can we revert this one, please? Checkpatch now warns about constructs > like > typedef struct MyDevice { > DeviceState parent; > > int reg0, reg1, reg2; > } MyDevice; > > > I think it varies depending on the maintainer. PPC, USB, SCSI, ACPI all > > use a separate typedef. I'll prepare a revert. > > Ping on that revert patch? I can't find it onlist... The x86 pull request I sent today triggers this error, BTW. I hope it won't make the pull request be automatically rejected. -- Eduardo
Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU
On 5 October 2015 at 19:11, Eduardo Habkostwrote: > On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote: >> On 17 September 2015 at 17:32, Paolo Bonzini wrote: >> > >> > >> > On 17/09/2015 18:16, Peter Maydell wrote: >> >> On 17 September 2015 at 17:00, Paolo Bonzini wrote: >> >>> >> >>> >> >>> On 17/09/2015 16:24, Peter Maydell wrote: >> Can we revert this one, please? Checkpatch now warns about constructs >> like >> typedef struct MyDevice { >> DeviceState parent; >> >> int reg0, reg1, reg2; >> } MyDevice; >> >> > I think it varies depending on the maintainer. PPC, USB, SCSI, ACPI all >> > use a separate typedef. I'll prepare a revert. >> >> Ping on that revert patch? I can't find it onlist... > > The x86 pull request I sent today triggers this error, BTW. I hope it > won't make the pull request be automatically rejected. Nope, because I don't run checkpatch on pull requests. -- PMM
Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect
> The first if handles the "default to N" case, the second handles "default to > Y", the (absent) else case handles "default to PF_UNSPEC". Can you please elaborate it. Also I am not understanding the reason for inverting the values of addr->has_ipv* in second if condition. I believe that the fix for the issue under discussion will be committed to qemu repo very soon, so I'll like to add one more thing which requires to be fixed along with it. In 'tcp_chr_accept' function of qemu-char.c, the data type of saddr should be sockaddr_in6 so that it works with both IPv6 and IPv4 on Windows (works for linux without it because of accept4 and works with this solution as well!). Regards, Umair Sair -Original Message- From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini Sent: Sunday, October 04, 2015 3:48 PM To: Peter Maydell; Sair, Umair Cc: QEMU Developers; qemu-disc...@nongnu.org Subject: Re: [Qemu-discuss] TCP options ipv4 and ipv6 have no effect On 03/10/2015 00:36, Peter Maydell wrote: > > I agree about the (!ipv4 || !ipv6) condition though. > The three states I listed above ought to correspond to "qemu_opt not > set", "qemu_opt set to false" and "qemu_opt set to true", The problem is that the underlying QemuOpts-based code treats "qemu_opt not set" and "qemu_opt set to false" the same way: ipv4 ipv6 Y Y PF_INET6 Y N PF_INET N Y PF_INET6 N N PF_UNSPEC We want: ipv4 ipv6 YY PF_INET6 YN PF_INET Y- PF_INET (ipv6 = N) NY PF_INET6 NN PF_UNSPEC N- PF_INET6(ipv6 = Y) -Y PF_INET6(ipv4 = N) -N PF_INET (ipv4 = Y) -- PF_UNSPEC I think this patch gets the desired semantics: diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c index 2add83a..fdcf3fa 100644 --- a/util/qemu-sockets.c +++ b/util/qemu-sockets.c @@ -586,12 +586,15 @@ fail: static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress *addr) { -bool ipv4 = addr->ipv4 || !addr->has_ipv4; -bool ipv6 = addr->ipv6 || !addr->has_ipv6; +bool ipv4 = addr->has_ipv4 && addr->ipv4; +bool ipv6 = addr->has_ipv6 && addr->ipv6; -if (!ipv4 || !ipv6) { +if (ipv4 || ipv6) { qemu_opt_set_bool(opts, "ipv4", ipv4, _abort); qemu_opt_set_bool(opts, "ipv6", ipv6, _abort); +} else if (addr->has_ipv4 || addr->has_ipv6) { +qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort); +qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort); } if (addr->has_to) { qemu_opt_set_number(opts, "to", addr->to, _abort); The first if handles the "default to N" case, the second handles "default to Y", the (absent) else case handles "default to PF_UNSPEC". Paolo
Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox
"Namsun Ch'o"writes: >> Drawback: complexity. If we decide to limit ourselves to the original >> threat model (rogue guest), and enter the sandbox only after setup, we >> can keep things simpler. > > We could do both without much complexity. This looks simple enough to me: > > rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chroot), 1, > SCMP_A0(SCMP_CMP_EQ, chroot_dir)); > if (rc < 0) > goto seccomp_return; > > rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chdir), 1, > SCMP_A0(SCMP_CMP_EQ, "/")); > if (rc < 0) > goto seccomp_return; > > The only time chroot_dir is ever used is in os-posix.c:139: > > if (chroot(chroot_dir) < 0) { I'm afraid this materially weakens the sandbox. chroot_dir is writable. We don't need to permit chroot(chroot_dir) if we enter the sandbox only after setup.
Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
Peter Maydellwrites: > On 5 October 2015 at 20:27, Paolo Bonzini wrote: >> >> >> On 05/10/2015 20:46, Peter Maydell wrote: >>> The easy fix is to have some suitable ifdeffery in virtio-pci.c, >>> similar to how we only register the virtio_9p_pci and virtio_scsi_pci >> >> (vhost_scsi_pci) >> >>> types if they've been configured into this build. >> >> Hmm, actually there's no reason to limit >> >> common-obj-$(CONFIG_VIRTIO) += virtio-input.o >> common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o >> >> to CONFIG_LINUX. Does it work to extract them out of the if (which is >> only correct for virtio-input-host.o)? > > Yes, though as you predicted we then fall over on virtio-input-host-pci. > If I bodge that one out in virtio-pci.c then the device-introspect-test > passes. Does this do the trick? If yes, I'll fill out the commit message and post it properly. >From e5ebf2f0680fcf145455db048b82cf8e5d9d6b9f Mon Sep 17 00:00:00 2001 From: Markus Armbruster Date: Tue, 6 Oct 2015 07:43:18 +0200 Subject: [PATCH] virtio-input: Fix device introspection on non-Linux hosts WIP --- hw/input/Makefile.objs | 2 +- hw/virtio/virtio-pci.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/input/Makefile.objs b/hw/input/Makefile.objs index 624ba7e..7715d72 100644 --- a/hw/input/Makefile.objs +++ b/hw/input/Makefile.objs @@ -8,9 +8,9 @@ common-obj-$(CONFIG_STELLARIS_INPUT) += stellaris_input.o common-obj-$(CONFIG_TSC2005) += tsc2005.o common-obj-$(CONFIG_VMMOUSE) += vmmouse.o -ifeq ($(CONFIG_LINUX),y) common-obj-$(CONFIG_VIRTIO) += virtio-input.o common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o +ifeq ($(CONFIG_LINUX),y) common-obj-$(CONFIG_VIRTIO) += virtio-input-host.o endif diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index eda8205..23f131b 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -2228,7 +2228,9 @@ static void virtio_pci_register_types(void) type_register_static(_keyboard_pci_info); type_register_static(_mouse_pci_info); type_register_static(_tablet_pci_info); +#ifdef CONFIG_LINUX type_register_static(_host_pci_info); +#endif type_register_static(_pci_bus_info); type_register_static(_pci_info); #ifdef CONFIG_VIRTFS -- 2.4.3
Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
On 5 October 2015 at 20:27, Paolo Bonziniwrote: > > > On 05/10/2015 20:46, Peter Maydell wrote: >> The easy fix is to have some suitable ifdeffery in virtio-pci.c, >> similar to how we only register the virtio_9p_pci and virtio_scsi_pci > > (vhost_scsi_pci) > >> types if they've been configured into this build. > > Hmm, actually there's no reason to limit > > common-obj-$(CONFIG_VIRTIO) += virtio-input.o > common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o > > to CONFIG_LINUX. Does it work to extract them out of the if (which is > only correct for virtio-input-host.o)? Yes, though as you predicted we then fall over on virtio-input-host-pci. If I bodge that one out in virtio-pci.c then the device-introspect-test passes. thanks -- PMM
Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg
On 5 October 2015 at 20:56, Davorin Mistawrote: > Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable > in ARMCPUState struct (os_lock_status). > > Linux reads from this register during its suspend/resume procedure. > > Signed-off-by: Davorin Mista Thanks for this patch. I'm afraid it still needs some changes; comments below. > --- > Changed in v2: > -switched from using dummy registers to an actual register implementation > -implemented write function for OSLAR_EL1 sysreg > -added state variable to ARMCPUState struct > > Signed-off-by: Davorin Mista > --- > target-arm/cpu.h| 3 +++ > target-arm/helper.c | 16 +++- > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/target-arm/cpu.h b/target-arm/cpu.h > index 5ea11a6..5aab654 100644 > --- a/target-arm/cpu.h > +++ b/target-arm/cpu.h > @@ -500,6 +500,9 @@ typedef struct CPUARMState { > uint32_t cregs[16]; > } iwmmxt; > > +/* OS Lock Status: accessed via OSLAR/OSLSR registers */ > +uint64_t os_lock_status; Can you call this "oslsr_el1" and put it inside the cp15 substruct with the other sysreg fields, please? > + > /* For mixed endian mode. */ > bool bswap_code; > > diff --git a/target-arm/helper.c b/target-arm/helper.c > index 9d62c4c..a6fad7a 100644 > --- a/target-arm/helper.c > +++ b/target-arm/helper.c > @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const > ARMCPRegInfo *ri, > putchar(value); > } > > +/* write to os_lock_status state variable */ > +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t > value) > +{ > +/* only bit 1 can be modified, register is always 0b10x0 */ > +raw_write(env, ri, 8 + (value & 2)); This is the write function for OSLAR, not OSLSR, so you should call it oslar_write. Your logic isn't implementing the behaviour the ARM ARM requires, which is: * for AArch64 accesses, copy bit 0 of the written value into bit 1 of oslsr_el1 * for AArch32 accesses, if the written value is 0xC5ACCE55 then write 1 into bit 1 of oslsr_el1, else write 0 That looks something like: int oslock; if (ri->state == ARM_CP_STATE_AA32) { oslock = (value == 0xC5ACCE55); } else { oslock = value & 1; } env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock); > +} > + > static const ARMCPRegInfo debug_cp_reginfo[] = { > /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped > * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1; > @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { > /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ > { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, >.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, > - .access = PL1_W, .type = ARM_CP_NOP }, > + .access = PL1_W, .resetvalue = 10, Write only registers don't need a reset value. You also need .type = ARM_CP_NO_RAW, because a raw access to this register doesn't make sense. > + .fieldoffset = offsetof(CPUARMState, os_lock_status), > + .writefn = oslsr_write }, > +/* We define a dummy OSLSR_EL1, because Linux reads from it. */ > +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, > + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, > + .access = PL1_R, > + .fieldoffset = offsetof(CPUARMState, os_lock_status) }, This is the reginfo that should have the reset value. > /* Dummy OSDLR_EL1: 32-bit Linux will read this */ > { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, >.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, > -- > 2.6.0 > thanks -- PMM
Re: [Qemu-devel] [PATCH 3/5] qga: guest exec functionality
On 10/2/2015 1:59 AM, Michael Roth wrote: +# +# Since: 2.5 +## +{ 'command': 'guest-exec', + 'data':{ 'path': 'str', '*arg': ['str'], '*env': ['str'], + '*inp-data': 'str', '*capture-output': 'bool' }, + 'returns': 'GuestExec' } Would it make sense to just add handle (pid) to GuestExecStatus, and have this return GuestExecStatus just the same as guest-exec-status does? I'm not sure either way personally, just thought I'd mention it. I do not think it's a good idea. GuestExecStatus contains mostly the data about the finished exec. Having GuestExec returns same struct may make wrong assumption that it can be synchronous - wait for exec to complete and return all data in a single call. Implementing synchronous GuestExec is not and easy job - either we occupy host-guest channel for all time until task finished, which is bad or we need to implement multiplexed messages for concurrent qga commands.
[Qemu-devel] [PULL 02/10] hw/vfio/platform: change interrupt/unmask fields into pointer
From: Eric Augerunmask EventNotifier might not be initialized in case of edge sensitive irq. Using EventNotifier pointers make life simpler to handle the edge-sensitive irqfd setup. Signed-off-by: Eric Auger Signed-off-by: Alex Williamson --- hw/vfio/platform.c | 35 --- include/hw/vfio/vfio-platform.h |4 ++-- 2 files changed, 22 insertions(+), 17 deletions(-) diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index d864342..cab1517 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -57,15 +57,20 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, sysbus_init_irq(sbdev, >qemuirq); /* Get an eventfd for trigger */ -ret = event_notifier_init(>interrupt, 0); +intp->interrupt = g_malloc0(sizeof(EventNotifier)); +ret = event_notifier_init(intp->interrupt, 0); if (ret) { +g_free(intp->interrupt); g_free(intp); error_report("vfio: Error: trigger event_notifier_init failed "); return NULL; } /* Get an eventfd for resample/unmask */ -ret = event_notifier_init(>unmask, 0); +intp->unmask = g_malloc0(sizeof(EventNotifier)); +ret = event_notifier_init(intp->unmask, 0); if (ret) { +g_free(intp->interrupt); +g_free(intp->unmask); g_free(intp); error_report("vfio: Error: resamplefd event_notifier_init failed"); return NULL; @@ -100,7 +105,7 @@ static int vfio_set_trigger_eventfd(VFIOINTp *intp, irq_set->start = 0; irq_set->count = 1; pfd = (int32_t *)_set->data; -*pfd = event_notifier_get_fd(>interrupt); +*pfd = event_notifier_get_fd(intp->interrupt); qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp); ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set); g_free(irq_set); @@ -182,7 +187,7 @@ static void vfio_intp_mmap_enable(void *opaque) static void vfio_intp_inject_pending_lockheld(VFIOINTp *intp) { trace_vfio_platform_intp_inject_pending_lockheld(intp->pin, - event_notifier_get_fd(>interrupt)); + event_notifier_get_fd(intp->interrupt)); intp->state = VFIO_IRQ_ACTIVE; @@ -224,18 +229,18 @@ static void vfio_intp_interrupt(VFIOINTp *intp) trace_vfio_intp_interrupt_set_pending(intp->pin); QSIMPLEQ_INSERT_TAIL(>pending_intp_queue, intp, pqnext); -ret = event_notifier_test_and_clear(>interrupt); +ret = event_notifier_test_and_clear(intp->interrupt); qemu_mutex_unlock(>intp_mutex); return; } trace_vfio_platform_intp_interrupt(intp->pin, - event_notifier_get_fd(>interrupt)); + event_notifier_get_fd(intp->interrupt)); -ret = event_notifier_test_and_clear(>interrupt); +ret = event_notifier_test_and_clear(intp->interrupt); if (!ret) { error_report("Error when clearing fd=%d (ret = %d)", - event_notifier_get_fd(>interrupt), ret); + event_notifier_get_fd(intp->interrupt), ret); } intp->state = VFIO_IRQ_ACTIVE; @@ -283,7 +288,7 @@ static void vfio_platform_eoi(VFIODevice *vbasedev) QLIST_FOREACH(intp, >intp_list, next) { if (intp->state == VFIO_IRQ_ACTIVE) { trace_vfio_platform_eoi(intp->pin, -event_notifier_get_fd(>interrupt)); +event_notifier_get_fd(intp->interrupt)); intp->state = VFIO_IRQ_INACTIVE; /* deassert the virtual IRQ */ @@ -360,7 +365,7 @@ static int vfio_set_resample_eventfd(VFIOINTp *intp) irq_set->start = 0; irq_set->count = 1; pfd = (int32_t *)_set->data; -*pfd = event_notifier_get_fd(>unmask); +*pfd = event_notifier_get_fd(intp->unmask); qemu_set_fd_handler(*pfd, NULL, NULL, NULL); ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set); g_free(irq_set); @@ -396,8 +401,8 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) } assert(intp); -if (kvm_irqchip_add_irqfd_notifier(kvm_state, >interrupt, - >unmask, irq) < 0) { +if (kvm_irqchip_add_irqfd_notifier(kvm_state, intp->interrupt, + intp->unmask, irq) < 0) { goto fail_irqfd; } @@ -411,11 +416,11 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) intp->kvm_accel = true; trace_vfio_platform_start_irqfd_injection(intp->pin, - event_notifier_get_fd(>interrupt), - event_notifier_get_fd(>unmask)); + event_notifier_get_fd(intp->interrupt), + event_notifier_get_fd(intp->unmask));
[Qemu-devel] [PULL 08/10] memory: Allow replay of IOMMU mapping notifications
From: David GibsonWhen we have guest visible IOMMUs, we allow notifiers to be registered which will be informed of all changes to IOMMU mappings. This is used by vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings. However, unlike with a memory region listener, an iommu notifier won't be told about any mappings which already exist in the (guest) IOMMU at the time it is registered. This can cause problems if hotplugging a VFIO device onto a guest bus which had existing guest IOMMU mappings, but didn't previously have an VFIO devices (and hence no host IOMMU mappings). This adds a memory_region_iommu_replay() function to handle this case. It replays any existing mappings in an IOMMU memory region to a specified notifier. Because the IOMMU memory region doesn't internally remember the granularity of the guest IOMMU it has a small hack where the caller must specify a granularity at which to replay mappings. If there are finer mappings in the guest IOMMU these will be reported in the iotlb structures passed to the notifier which it must handle (probably causing it to flag an error). This isn't new - the VFIO iommu notifier must already handle notifications about guest IOMMU mappings too short for it to represent in the host IOMMU. Signed-off-by: David Gibson Reviewed-by: Laurent Vivier Acked-by: Paolo Bonzini Signed-off-by: Alex Williamson --- include/exec/memory.h | 13 + memory.c | 20 2 files changed, 33 insertions(+) diff --git a/include/exec/memory.h b/include/exec/memory.h index 5baaf48..0f07159 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -583,6 +583,19 @@ void memory_region_notify_iommu(MemoryRegion *mr, void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n); /** + * memory_region_iommu_replay: replay existing IOMMU translations to + * a notifier + * + * @mr: the memory region to observe + * @n: the notifier to which to replay iommu mappings + * @granularity: Minimum page granularity to replay notifications for + * @is_write: Whether to treat the replay as a translate "write" + * through the iommu + */ +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, +hwaddr granularity, bool is_write); + +/** * memory_region_unregister_iommu_notifier: unregister a notifier for * changes to IOMMU translation entries. * diff --git a/memory.c b/memory.c index ef87363..1b03d22 100644 --- a/memory.c +++ b/memory.c @@ -1403,6 +1403,26 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n) notifier_list_add(>iommu_notify, n); } +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, +hwaddr granularity, bool is_write) +{ +hwaddr addr; +IOMMUTLBEntry iotlb; + +for (addr = 0; addr < memory_region_size(mr); addr += granularity) { +iotlb = mr->iommu_ops->translate(mr, addr, is_write); +if (iotlb.perm != IOMMU_NONE) { +n->notify(n, ); +} + +/* if (2^64 - MR size) < granularity, it's possible to get an + * infinite loop here. This should catch such a wraparound */ +if ((addr + granularity) < addr) { +break; +} +} +} + void memory_region_unregister_iommu_notifier(Notifier *n) { notifier_remove(n);
Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect
On 05/10/2015 20:03, Sair, Umair wrote: >> The first if handles the "default to N" case, the second handles "default to >> Y", the (absent) else case handles "default to PF_UNSPEC". > > Can you please elaborate it. Also I am not understanding the reason for > inverting the values of addr->has_ipv* in second if condition. If a value is absent, you can either default it to Y (then the value is !addr->has_xyz || addr->xyz) or default it to N (then the value is addr->has_xyz && addr->xyz). The desired logic is: - with one absent value and one "on", the absent value should be "off" - with one absent value and one "off", the absent value should be "on" - with two absent values, the absent values can be both left absent The patch works like this: >> static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress >> *addr) { >> -bool ipv4 = addr->ipv4 || !addr->has_ipv4; >> -bool ipv6 = addr->ipv6 || !addr->has_ipv6; >> +bool ipv4 = addr->has_ipv4 && addr->ipv4; >> +bool ipv6 = addr->has_ipv6 && addr->ipv6; >> >> -if (!ipv4 || !ipv6) { >> +if (ipv4 || ipv6) { >> qemu_opt_set_bool(opts, "ipv4", ipv4, _abort); >> qemu_opt_set_bool(opts, "ipv6", ipv6, _abort); This handles the case where the absent value should be "off": at least one value is present _and_ true. >> +} else if (addr->has_ipv4 || addr->has_ipv6) { >> +qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort); >> +qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort); This handles the case where the absent value should be "on". We know that whichever the present value is, it is false; therefore !addr->has_xyz || addr->xyz simplifies to just !addr->has_xyz. > I believe that the fix for the issue under discussion will be > committed to qemu repo very soon, Did you test the patch, and did it work for you? If so, it is customary to reply with a line like "Tested by: Sair, Umair". > so I'll like to add one more thing > which requires to be fixed along with it. In 'tcp_chr_accept' > function of qemu-char.c, the data type of saddr should be > sockaddr_in6 so that it works with both IPv6 and IPv4 on Windows > (works for linux without it because of accept4 and works with this > solution as well!). Can you send a patch for it? Thanks! Paolo
[Qemu-devel] [PULL 04/10] vfio: Remove unneeded union from VFIOContainer
From: David GibsonCurrently the VFIOContainer iommu_data field contains a union with different information for different host iommu types. However: * It only actually contains information for the x86-like "Type1" iommu * Because we have a common listener the Type1 fields are actually used on all IOMMU types, including the SPAPR TCE type as well In fact we now have a general structure for the listener which is unlikely to ever need per-iommu-type information, so this patch removes the union. In a similar way we can unify the setup of the vfio memory listener in vfio_connect_container() that is currently split across a switch on iommu type, but is effectively the same in both cases. The iommu_data.release pointer was only needed as a cleanup function which would handle potentially different data in the union. With the union gone, it too can be removed. Signed-off-by: David Gibson Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/common.c | 52 +++-- include/hw/vfio/vfio-common.h | 16 ++--- 2 files changed, 22 insertions(+), 46 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 0d341a3..1545f62 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -315,8 +315,7 @@ out: static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { -VFIOContainer *container = container_of(listener, VFIOContainer, -iommu_data.type1.listener); +VFIOContainer *container = container_of(listener, VFIOContainer, listener); hwaddr iova, end; Int128 llend; void *vaddr; @@ -406,9 +405,9 @@ static void vfio_listener_region_add(MemoryListener *listener, * can gracefully fail. Runtime, there's not much we can do other * than throw a hardware error. */ -if (!container->iommu_data.type1.initialized) { -if (!container->iommu_data.type1.error) { -container->iommu_data.type1.error = ret; +if (!container->initialized) { +if (!container->error) { +container->error = ret; } } else { hw_error("vfio: DMA mapping failed, unable to continue"); @@ -419,8 +418,7 @@ static void vfio_listener_region_add(MemoryListener *listener, static void vfio_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { -VFIOContainer *container = container_of(listener, VFIOContainer, -iommu_data.type1.listener); +VFIOContainer *container = container_of(listener, VFIOContainer, listener); hwaddr iova, end; int ret; @@ -485,7 +483,7 @@ static const MemoryListener vfio_memory_listener = { static void vfio_listener_release(VFIOContainer *container) { -memory_listener_unregister(>iommu_data.type1.listener); +memory_listener_unregister(>listener); } int vfio_mmap_region(Object *obj, VFIORegion *region, @@ -683,21 +681,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } - -container->iommu_data.type1.listener = vfio_memory_listener; -container->iommu_data.release = vfio_listener_release; - -memory_listener_register(>iommu_data.type1.listener, - container->space->as); - -if (container->iommu_data.type1.error) { -ret = container->iommu_data.type1.error; -error_report("vfio: memory listener initialization failed for container"); -goto listener_release_exit; -} - -container->iommu_data.type1.initialized = true; - } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, ); if (ret) { @@ -723,19 +706,24 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } - -container->iommu_data.type1.listener = vfio_memory_listener; -container->iommu_data.release = vfio_listener_release; - -memory_listener_register(>iommu_data.type1.listener, - container->space->as); - } else { error_report("vfio: No available IOMMU models"); ret = -EINVAL; goto free_container_exit; } +container->listener = vfio_memory_listener; + +memory_listener_register(>listener, container->space->as); + +if (container->error) { +ret = container->error; +error_report("vfio: memory listener initialization failed for container"); +goto listener_release_exit; +} + +
Re: [Qemu-devel] QEMU+Linux ARMv7A current state
On Mon, Oct 05, 2015 at 11:13:33AM -0400, John Snow wrote: > I'm looking into the cubieboard now. Is our emulation based on any > particular model? (1-4?) The first model, the one with Allwinner A10. > I'm trying to see if I can find anything that resembles a spec to see > what kind of registers this SoC has for its SATA controller. There is some documentation on the SoC here, but apparently nothing on SATA: http://dl.linux-sunxi.org/A10/ Beniamino
Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect
On 05/10/2015 20:03, Sair, Umair wrote: >> The first if handles the "default to N" case, the second handles "default to >> Y", the (absent) else case handles "default to PF_UNSPEC". > > Can you please elaborate it. Also I am not understanding the reason for > inverting the values of addr->has_ipv* in second if condition. If a value is absent, you can either default it to Y (then the value is !addr->has_xyz || addr->xyz) or default it to N (then the value is addr->has_xyz && addr->xyz). The desired logic is: - with one absent value and one "on", the absent value should be "off" - with one absent value and one "off", the absent value should be "on" - with two absent values, the absent values can be both left absent The patch works like this: >> static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress >> *addr) { >> -bool ipv4 = addr->ipv4 || !addr->has_ipv4; >> -bool ipv6 = addr->ipv6 || !addr->has_ipv6; >> +bool ipv4 = addr->has_ipv4 && addr->ipv4; >> +bool ipv6 = addr->has_ipv6 && addr->ipv6; >> >> -if (!ipv4 || !ipv6) { >> +if (ipv4 || ipv6) { >> qemu_opt_set_bool(opts, "ipv4", ipv4, _abort); >> qemu_opt_set_bool(opts, "ipv6", ipv6, _abort); This handles the case where the absent value should be "off": at least one value is present _and_ true. >> +} else if (addr->has_ipv4 || addr->has_ipv6) { >> +qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort); >> +qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort); This handles the case where the absent value should be "on". We know that whichever the present value is, it is false; therefore !addr->has_xyz || addr->xyz simplifies to just !addr->has_xyz. > I believe that the fix for the issue under discussion will be > committed to qemu repo very soon, Did you test the patch, and did it work for you? If so, it is customary to reply with a line like "Tested by: Sair, Umair". > so I'll like to add one more thing > which requires to be fixed along with it. In 'tcp_chr_accept' > function of qemu-char.c, the data type of saddr should be > sockaddr_in6 so that it works with both IPv6 and IPv4 on Windows > (works for linux without it because of accept4 and works with this > solution as well!). Can you send a patch for it? Thanks! Paolo
Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU
On 05/10/2015 20:47, Peter Maydell wrote: > On 5 October 2015 at 19:11, Eduardo Habkostwrote: >> On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote: >>> On 17 September 2015 at 17:32, Paolo Bonzini wrote: On 17/09/2015 18:16, Peter Maydell wrote: > On 17 September 2015 at 17:00, Paolo Bonzini wrote: >> >> >> On 17/09/2015 16:24, Peter Maydell wrote: >>> Can we revert this one, please? Checkpatch now warns about constructs >>> like >>> typedef struct MyDevice { >>> DeviceState parent; >>> >>> int reg0, reg1, reg2; >>> } MyDevice; >>> I think it varies depending on the maintainer. PPC, USB, SCSI, ACPI all use a separate typedef. I'll prepare a revert. >>> >>> Ping on that revert patch? I can't find it onlist... >> >> The x86 pull request I sent today triggers this error, BTW. I hope it >> won't make the pull request be automatically rejected. > > Nope, because I don't run checkpatch on pull requests. I've sent the patch today, by the way. Sorry for the delay. Paolo
Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
On 05/10/2015 20:46, Peter Maydell wrote: > The easy fix is to have some suitable ifdeffery in virtio-pci.c, > similar to how we only register the virtio_9p_pci and virtio_scsi_pci (vhost_scsi_pci) > types if they've been configured into this build. Hmm, actually there's no reason to limit common-obj-$(CONFIG_VIRTIO) += virtio-input.o common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o to CONFIG_LINUX. Does it work to extract them out of the if (which is only correct for virtio-input-host.o)? That said, the ifdeffery _is_ needed for TYPE_VIRTIO_INPUT_HOST_PCI. Paolo
[Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg
Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable in ARMCPUState struct (os_lock_status). Linux reads from this register during its suspend/resume procedure. Signed-off-by: Davorin Mista--- Changed in v2: -switched from using dummy registers to an actual register implementation -implemented write function for OSLAR_EL1 sysreg -added state variable to ARMCPUState struct Signed-off-by: Davorin Mista --- target-arm/cpu.h| 3 +++ target-arm/helper.c | 16 +++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 5ea11a6..5aab654 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -500,6 +500,9 @@ typedef struct CPUARMState { uint32_t cregs[16]; } iwmmxt; +/* OS Lock Status: accessed via OSLAR/OSLSR registers */ +uint64_t os_lock_status; + /* For mixed endian mode. */ bool bswap_code; diff --git a/target-arm/helper.c b/target-arm/helper.c index 9d62c4c..a6fad7a 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const ARMCPRegInfo *ri, putchar(value); } +/* write to os_lock_status state variable */ +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value) +{ +/* only bit 1 can be modified, register is always 0b10x0 */ +raw_write(env, ri, 8 + (value & 2)); +} + static const ARMCPRegInfo debug_cp_reginfo[] = { /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1; @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, - .access = PL1_W, .type = ARM_CP_NOP }, + .access = PL1_W, .resetvalue = 10, + .fieldoffset = offsetof(CPUARMState, os_lock_status), + .writefn = oslsr_write }, +/* We define a dummy OSLSR_EL1, because Linux reads from it. */ +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, + .access = PL1_R, + .fieldoffset = offsetof(CPUARMState, os_lock_status) }, /* Dummy OSDLR_EL1: 32-bit Linux will read this */ { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, -- 2.6.0
[Qemu-devel] [PATCH] Remove macros IO_READ_PROTO and IO_WRITE_PROTO
--- hw/audio/adlib.c | 9 ++--- hw/audio/es1370.c | 17 ++--- hw/audio/gus.c| 9 ++--- hw/audio/sb16.c | 15 +-- 4 files changed, 15 insertions(+), 35 deletions(-) diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c index 656eb37..af39920 100644 --- a/hw/audio/adlib.c +++ b/hw/audio/adlib.c @@ -57,11 +57,6 @@ void YMF262UpdateOneQEMU (int which, INT16 *dst, int length); #define SHIFT 1 #endif -#define IO_READ_PROTO(name) \ -uint32_t name (void *opaque, uint32_t nport) -#define IO_WRITE_PROTO(name) \ -void name (void *opaque, uint32_t nport, uint32_t val) - #define TYPE_ADLIB "adlib" #define ADLIB(obj) OBJECT_CHECK(AdlibState, (obj), TYPE_ADLIB) @@ -124,7 +119,7 @@ static void adlib_kill_timers (AdlibState *s) } } -static IO_WRITE_PROTO (adlib_write) +static void adlib_write (void *opaque, uint32_t nport, uint32_t val) { AdlibState *s = opaque; int a = nport & 3; @@ -141,7 +136,7 @@ static IO_WRITE_PROTO (adlib_write) #endif } -static IO_READ_PROTO (adlib_read) +static uint32_t adlib_read (void *opaque, uint32_t nport) { AdlibState *s = opaque; uint8_t data; diff --git a/hw/audio/es1370.c b/hw/audio/es1370.c index 8e7bcf5..dfb7c79 100644 --- a/hw/audio/es1370.c +++ b/hw/audio/es1370.c @@ -157,11 +157,6 @@ static const unsigned dac1_samplerate[] = { 5512, 11025, 22050, 44100 }; #define DAC2_CHANNEL 1 #define ADC_CHANNEL 2 -#define IO_READ_PROTO(n) \ -static uint32_t n (void *opaque, uint32_t addr) -#define IO_WRITE_PROTO(n) \ -static void n (void *opaque, uint32_t addr, uint32_t val) - static void es1370_dac1_callback (void *opaque, int free); static void es1370_dac2_callback (void *opaque, int free); static void es1370_adc_callback (void *opaque, int avail); @@ -474,7 +469,7 @@ static inline uint32_t es1370_fixup (ES1370State *s, uint32_t addr) return addr; } -IO_WRITE_PROTO (es1370_writeb) +static void es1370_writeb (void *opaque, uint32_t addr, uint32_t val) { ES1370State *s = opaque; uint32_t shift, mask; @@ -512,7 +507,7 @@ IO_WRITE_PROTO (es1370_writeb) } } -IO_WRITE_PROTO (es1370_writew) +static void es1370_writew (void *opaque, uint32_t addr, uint32_t val) { ES1370State *s = opaque; addr = es1370_fixup (s, addr); @@ -549,7 +544,7 @@ IO_WRITE_PROTO (es1370_writew) } } -IO_WRITE_PROTO (es1370_writel) +static void es1370_writel (void *opaque, uint32_t addr, uint32_t val) { ES1370State *s = opaque; struct chan *d = >chan[0]; @@ -615,7 +610,7 @@ IO_WRITE_PROTO (es1370_writel) } } -IO_READ_PROTO (es1370_readb) +static uint32_t es1370_readb (void *opaque, uint32_t addr) { ES1370State *s = opaque; uint32_t val; @@ -650,7 +645,7 @@ IO_READ_PROTO (es1370_readb) return val; } -IO_READ_PROTO (es1370_readw) +static uint32_t es1370_readw (void *opaque, uint32_t addr) { ES1370State *s = opaque; struct chan *d = >chan[0]; @@ -692,7 +687,7 @@ IO_READ_PROTO (es1370_readw) return val; } -IO_READ_PROTO (es1370_readl) +static uint32_t es1370_readl (void *opaque, uint32_t addr) { ES1370State *s = opaque; uint32_t val; diff --git a/hw/audio/gus.c b/hw/audio/gus.c index 86223a9..6ea1e09 100644 --- a/hw/audio/gus.c +++ b/hw/audio/gus.c @@ -41,11 +41,6 @@ #define GUS_ENDIANNESS 0 #endif -#define IO_READ_PROTO(name) \ -static uint32_t name (void *opaque, uint32_t nport) -#define IO_WRITE_PROTO(name) \ -static void name (void *opaque, uint32_t nport, uint32_t val) - #define TYPE_GUS "gus" #define GUS(obj) OBJECT_CHECK (GUSState, (obj), TYPE_GUS) @@ -64,14 +59,14 @@ typedef struct GUSState { qemu_irq pic; } GUSState; -IO_READ_PROTO (gus_readb) +static uint32_t gus_readb (void *opaque, uint32_t nport) { GUSState *s = opaque; return gus_read (>emu, nport, 1); } -IO_WRITE_PROTO (gus_writeb) +static void gus_writeb (void *opaque, uint32_t nport, uint32_t val) { GUSState *s = opaque; diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c index b052de5..381ef56 100644 --- a/hw/audio/sb16.c +++ b/hw/audio/sb16.c @@ -40,11 +40,6 @@ #define ldebug(...) #endif -#define IO_READ_PROTO(name) \ -uint32_t name (void *opaque, uint32_t nport) -#define IO_WRITE_PROTO(name)\ -void name (void *opaque, uint32_t nport, uint32_t val) - static const char e3[] = "COPYRIGHT (C) CREATIVE TECHNOLOGY LTD, 1992."; #define TYPE_SB16 "sb16" @@ -881,7 +876,7 @@ static void reset (SB16State *s) legacy_reset (s); } -static IO_WRITE_PROTO (dsp_write) +static void dsp_write (void *opaque, uint32_t nport, uint32_t val) { SB16State *s = opaque; int iport; @@ -959,7 +954,7 @@ static IO_WRITE_PROTO (dsp_write) } } -static IO_READ_PROTO (dsp_read) +static uint32_t dsp_read (void *opaque, uint32_t nport) { SB16State *s = opaque; int iport, retval, ack = 0; @@ -1058,14 +1053,14 @@ static void reset_mixer
[Qemu-devel] [Bug 1502884] Re: Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option!
What I request seems to be the same option as x11vnc's "-norepeat", http://linux.die.net/man/1/x11vnc -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1502884 Title: Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option! Status in QEMU: New Bug description: Hi, A big issue when using QEMU's VNC server (VNC KVM) is that, when there's a network lag, unintended keypresses go through to the QEMU guest VM. This is frequently "enter" keypresses, causing all kinds of unintended consequences in the VM. So basically it's extremely dangerous. This is because the VNC protocol's keyboard interaction is implemented in terms of key down - key up events, making the server's keyboard autorepeat kick in when it should not. For this reason, it would be great if QEMU's VNC server part would be enhanced with an option such that when a VNC protocol key down is received, then locally that is treated as one single keypress only (I don't know how that should be implemented but I guess either as an immediate key down - key up sequence locally, or key down + key up after say 0.05 seconds), instead of waiting for the key up event from the VNC client. Thanks! To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1502884/+subscriptions
[Qemu-devel] [Bug 1502884] [NEW] Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option!
Public bug reported: Hi, A big issue when using QEMU's VNC server (VNC KVM) is that, when there's a network lag, unintended keypresses go through to the QEMU guest VM. This is frequently "enter" keypresses, causing all kinds of unintended consequences in the VM. So basically it's extremely dangerous. This is because the VNC protocol's keyboard interaction is implemented in terms of key down - key up events, making the server's keyboard autorepeat kick in when it should not. For this reason, it would be great if QEMU's VNC server part would be enhanced with an option such that when a VNC protocol key down is received, then locally that is treated as one single keypress only (I don't know how that should be implemented but I guess either as an immediate key down - key up sequence locally, or key down + key up after say 0.05 seconds), instead of waiting for the key up event from the VNC client. Thanks! ** Affects: qemu Importance: Undecided Status: New ** Tags: extension keyboard kvm norepeat repeat server vnc -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1502884 Title: Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option! Status in QEMU: New Bug description: Hi, A big issue when using QEMU's VNC server (VNC KVM) is that, when there's a network lag, unintended keypresses go through to the QEMU guest VM. This is frequently "enter" keypresses, causing all kinds of unintended consequences in the VM. So basically it's extremely dangerous. This is because the VNC protocol's keyboard interaction is implemented in terms of key down - key up events, making the server's keyboard autorepeat kick in when it should not. For this reason, it would be great if QEMU's VNC server part would be enhanced with an option such that when a VNC protocol key down is received, then locally that is treated as one single keypress only (I don't know how that should be implemented but I guess either as an immediate key down - key up sequence locally, or key down + key up after say 0.05 seconds), instead of waiting for the key up event from the VNC client. Thanks! To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1502884/+subscriptions
[Qemu-devel] [PULL 01/10] hw/vfio/platform: irqfd setup sequence update
From: Eric AugerWith current implementation, eventfd VFIO signaling is first set up and then irqfd is setup, if supported and allowed. This start sequence causes several issues with IRQ forwarding setup which, if supported, is transparently attempted on irqfd setup: IRQ forwarding setup is likely to fail if the IRQ is detected as under injection into the guest (active at irqchip level or VFIO masked). This currently always happens because the current sequence explicitly VFIO-masks the IRQ before setting irqfd. Even if that masking were removed, we couldn't prevent the case where the IRQ is under injection into the guest. So the simpler solution is to remove this 2-step startup and directly attempt irqfd setup. This is what this patch does. Also in case the eventfd setup fails, there is no reason to go farther: let's abort. Signed-off-by: Eric Auger Signed-off-by: Alex Williamson --- hw/vfio/platform.c | 51 +-- 1 file changed, 29 insertions(+), 22 deletions(-) diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index a6726cd..d864342 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -310,18 +310,29 @@ static void vfio_platform_eoi(VFIODevice *vbasedev) /** * vfio_start_eventfd_injection - starts the virtual IRQ injection using * user-side handled eventfds - * @intp: the IRQ struct pointer + * @sbdev: the sysbus device handle + * @irq: the qemu irq handle */ -static int vfio_start_eventfd_injection(VFIOINTp *intp) +static void vfio_start_eventfd_injection(SysBusDevice *sbdev, qemu_irq irq) { int ret; +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev); +VFIOINTp *intp; + +QLIST_FOREACH(intp, >intp_list, next) { +if (intp->qemuirq == irq) { +break; +} +} +assert(intp); ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt); if (ret) { -error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m"); +error_report("vfio: failed to start eventfd signaling for IRQ %d: %m", + intp->pin); +abort(); } -return ret; } /* @@ -359,6 +370,15 @@ static int vfio_set_resample_eventfd(VFIOINTp *intp) return ret; } +/** + * vfio_start_irqfd_injection - starts the virtual IRQ injection using + * irqfd + * + * @sbdev: the sysbus device handle + * @irq: the qemu irq handle + * + * In case the irqfd setup fails, we fallback to userspace handled eventfd + */ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) { VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev); @@ -366,7 +386,7 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) if (!kvm_irqfds_enabled() || !kvm_resamplefds_enabled() || !vdev->irqfd_allowed) { -return; +goto fail_irqfd; } QLIST_FOREACH(intp, >intp_list, next) { @@ -376,13 +396,6 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) } assert(intp); -/* Get to a known interrupt state */ -qemu_set_fd_handler(event_notifier_get_fd(>interrupt), -NULL, NULL, vdev); - -vfio_mask_single_irqindex(>vbasedev, intp->pin); -qemu_set_irq(intp->qemuirq, 0); - if (kvm_irqchip_add_irqfd_notifier(kvm_state, >interrupt, >unmask, irq) < 0) { goto fail_irqfd; @@ -395,9 +408,6 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) goto fail_vfio; } -/* Let's resume injection with irqfd setup */ -vfio_unmask_single_irqindex(>vbasedev, intp->pin); - intp->kvm_accel = true; trace_vfio_platform_start_irqfd_injection(intp->pin, @@ -406,9 +416,11 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) return; fail_vfio: kvm_irqchip_remove_irqfd_notifier(kvm_state, >interrupt, irq); +error_report("vfio: failed to start eventfd signaling for IRQ %d: %m", + intp->pin); +abort(); fail_irqfd: -vfio_start_eventfd_injection(intp); -vfio_unmask_single_irqindex(>vbasedev, intp->pin); +vfio_start_eventfd_injection(sbdev, irq); return; } @@ -646,7 +658,6 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp) VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev); SysBusDevice *sbdev = SYS_BUS_DEVICE(dev); VFIODevice *vbasedev = >vbasedev; -VFIOINTp *intp; int i, ret; vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM; @@ -665,10 +676,6 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp) vfio_map_region(vdev, i); sysbus_init_mmio(sbdev, >regions[i]->mem); } - -QLIST_FOREACH(intp, >intp_list, next) { -vfio_start_eventfd_injection(intp); -} } static const VMStateDescription vfio_platform_vmstate = {
[Qemu-devel] [PULL 05/10] vfio: Generalize vfio_listener_region_add failure path
From: David GibsonIf a DMA mapping operation fails in vfio_listener_region_add() it checks to see if we've already completed initial setup of the container. If so it reports an error so the setup code can fail gracefully, otherwise throws a hw_error(). There are other potential failure cases in vfio_listener_region_add() which could benefit from the same logic, so move it to its own fail: block. Later patches can use this to extend other failure cases to fail as gracefully as possible under the circumstances. Signed-off-by: David Gibson Reviewed-by: Thomas Huth Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/common.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 1545f62..95a4850 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -399,19 +399,23 @@ static void vfio_listener_region_add(MemoryListener *listener, error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%m)", container, iova, end - iova, vaddr, ret); +goto fail; +} -/* - * On the initfn path, store the first error in the container so we - * can gracefully fail. Runtime, there's not much we can do other - * than throw a hardware error. - */ -if (!container->initialized) { -if (!container->error) { -container->error = ret; -} -} else { -hw_error("vfio: DMA mapping failed, unable to continue"); +return; + +fail: +/* + * On the initfn path, store the first error in the container so we + * can gracefully fail. Runtime, there's not much we can do other + * than throw a hardware error. + */ +if (!container->initialized) { +if (!container->error) { +container->error = ret; } +} else { +hw_error("vfio: DMA mapping failed, unable to continue"); } }
[Qemu-devel] [PULL 03/10] hw/vfio/platform: do not set resamplefd for edge-sensitive IRQS
From: Eric AugerIn irqfd mode, current code attempts to set a resamplefd whatever the type of the IRQ. For an edge-sensitive IRQ this attempt fails and as a consequence, the whole irqfd setup fails and we fall back to the slow mode. This patch bypasses the resamplefd setting for non level-sentive IRQs. Signed-off-by: Eric Auger Signed-off-by: Alex Williamson --- hw/vfio/platform.c | 42 +++--- trace-events |4 +++- 2 files changed, 30 insertions(+), 16 deletions(-) diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index cab1517..5c1156c 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -32,6 +32,11 @@ * Functions used whatever the injection method */ +static inline bool vfio_irq_is_automasked(VFIOINTp *intp) +{ +return intp->flags & VFIO_IRQ_INFO_AUTOMASKED; +} + /** * vfio_init_intp - allocate, initialize the IRQ struct pointer * and add it into the list of IRQs @@ -65,15 +70,17 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, error_report("vfio: Error: trigger event_notifier_init failed "); return NULL; } -/* Get an eventfd for resample/unmask */ -intp->unmask = g_malloc0(sizeof(EventNotifier)); -ret = event_notifier_init(intp->unmask, 0); -if (ret) { -g_free(intp->interrupt); -g_free(intp->unmask); -g_free(intp); -error_report("vfio: Error: resamplefd event_notifier_init failed"); -return NULL; +if (vfio_irq_is_automasked(intp)) { +/* Get an eventfd for resample/unmask */ +intp->unmask = g_malloc0(sizeof(EventNotifier)); +ret = event_notifier_init(intp->unmask, 0); +if (ret) { +g_free(intp->interrupt); +g_free(intp->unmask); +g_free(intp); +error_report("vfio: Error: resamplefd event_notifier_init failed"); +return NULL; +} } QLIST_INSERT_HEAD(>intp_list, intp, next); @@ -294,7 +301,7 @@ static void vfio_platform_eoi(VFIODevice *vbasedev) /* deassert the virtual IRQ */ qemu_set_irq(intp->qemuirq, 0); -if (intp->flags & VFIO_IRQ_INFO_AUTOMASKED) { +if (vfio_irq_is_automasked(intp)) { /* unmasks the physical level-sensitive IRQ */ vfio_unmask_single_irqindex(vbasedev, intp->pin); } @@ -409,15 +416,20 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq) if (vfio_set_trigger_eventfd(intp, NULL) < 0) { goto fail_vfio; } -if (vfio_set_resample_eventfd(intp) < 0) { -goto fail_vfio; +if (vfio_irq_is_automasked(intp)) { +if (vfio_set_resample_eventfd(intp) < 0) { +goto fail_vfio; +} +trace_vfio_platform_start_level_irqfd_injection(intp->pin, +event_notifier_get_fd(intp->interrupt), +event_notifier_get_fd(intp->unmask)); +} else { +trace_vfio_platform_start_edge_irqfd_injection(intp->pin, +event_notifier_get_fd(intp->interrupt)); } intp->kvm_accel = true; -trace_vfio_platform_start_irqfd_injection(intp->pin, - event_notifier_get_fd(intp->interrupt), - event_notifier_get_fd(intp->unmask)); return; fail_vfio: kvm_irqchip_remove_irqfd_notifier(kvm_state, intp->interrupt, irq); diff --git a/trace-events b/trace-events index 36db793..6790292 100644 --- a/trace-events +++ b/trace-events @@ -1624,7 +1624,9 @@ vfio_platform_intp_interrupt(int pin, int fd) "Inject IRQ #%d (fd = %d)" vfio_platform_intp_inject_pending_lockheld(int pin, int fd) "Inject pending IRQ #%d (fd = %d)" vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x" vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING" -vfio_platform_start_irqfd_injection(int index, int fd, int resamplefd) "IRQ index=%d, fd = %d, resamplefd = %d" +vfio_platform_start_level_irqfd_injection(int index, int fd, int resamplefd) "IRQ index=%d, fd = %d, resamplefd = %d" +vfio_platform_start_edge_irqfd_injection(int index, int fd) "IRQ index=%d, fd = %d" + #hw/acpi/memory_hotplug.c mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
Re: [Qemu-devel] [PATCH] Remove macros IO_READ_PROTO and IO_WRITE_PROTO
On 5 October 2015 at 20:06, nutanshinde1992wrote: > --- > hw/audio/adlib.c | 9 ++--- > hw/audio/es1370.c | 17 ++--- > hw/audio/gus.c| 9 ++--- > hw/audio/sb16.c | 15 +-- > 4 files changed, 15 insertions(+), 35 deletions(-) Hi; thanks for sending us this patch. Codewise the changes look good; there's just a minor coding style nit, and a process issue you need to deal with. The process issue is that your git commit message has to end with a 'Signed-off-by: Your Name ' line. (If you look in the git commit logs at other peoples' changes you'll see what I mean.) This is you certifying to us that you wrote the patch and are willing to submit it to us under the terms of QEMU's license. (For a full statement of what you're signing to say, look here: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches?id=f6f94e2ab1b33f0082ac22d71f66385a60d8157f#n297 -- we use the same idea as the Linux kernel for this.) > diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c > index 656eb37..af39920 100644 > --- a/hw/audio/adlib.c > +++ b/hw/audio/adlib.c > @@ -57,11 +57,6 @@ void YMF262UpdateOneQEMU (int which, INT16 *dst, int > length); > #define SHIFT 1 > #endif > > -#define IO_READ_PROTO(name) \ > -uint32_t name (void *opaque, uint32_t nport) > -#define IO_WRITE_PROTO(name) \ > -void name (void *opaque, uint32_t nport, uint32_t val) > - > #define TYPE_ADLIB "adlib" > #define ADLIB(obj) OBJECT_CHECK(AdlibState, (obj), TYPE_ADLIB) > > @@ -124,7 +119,7 @@ static void adlib_kill_timers (AdlibState *s) > } > } > > -static IO_WRITE_PROTO (adlib_write) > +static void adlib_write (void *opaque, uint32_t nport, uint32_t val) The style issue is that all these functions should not have a space between the function name and the '('. If you could remove the stray spaces, and then send us the fixed patch again with your signoff line in the commit message, that would be great. (Make sure the subject line says "[PATCH v2]" so we know it's the revised version.) thanks -- PMM
[Qemu-devel] [PULL 00/10] VFIO updates for 2015-10-05
The following changes since commit c0b520dfb8890294a9f8879f4759172900585995: Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2015-10-02 16:59:21 +0100) are available in the git repository at: git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20151005.0 for you to fetch changes up to 727299697dd31f0e1ccecc7eab1bf658e8ed3079: vfio: Expose a VFIO PCI device's group for EEH (2015-10-05 12:40:13 -0600) VFIO updates 2015-10-05 - Change platform device IRQ setup sequence for compatibility with upcoming IRQ forwarding (Eric Auger) - Extensions to support vfio-pci devices on spapr-pci-host-bridge (David Gibson) David Gibson (7): vfio: Remove unneeded union from VFIOContainer vfio: Generalize vfio_listener_region_add failure path vfio: Check guest IOVA ranges against host IOMMU capabilities vfio: Record host IOMMU's available IO page sizes memory: Allow replay of IOMMU mapping notifications vfio: Allow hotplug of containers onto existing guest IOMMU mappings vfio: Expose a VFIO PCI device's group for EEH Eric Auger (3): hw/vfio/platform: irqfd setup sequence update hw/vfio/platform: change interrupt/unmask fields into pointer hw/vfio/platform: do not set resamplefd for edge-sensitive IRQS hw/vfio/common.c| 140 hw/vfio/pci.c | 14 hw/vfio/platform.c | 116 - include/exec/memory.h | 13 include/hw/vfio/vfio-common.h | 23 +++ include/hw/vfio/vfio-pci.h | 11 include/hw/vfio/vfio-platform.h | 4 +- memory.c| 20 ++ trace-events| 4 +- 9 files changed, 229 insertions(+), 116 deletions(-) create mode 100644 include/hw/vfio/vfio-pci.h
[Qemu-devel] [PULL 07/10] vfio: Record host IOMMU's available IO page sizes
From: David GibsonDepending on the host IOMMU type we determine and record the available page sizes for IOMMU translation. We'll need this for other validation in future patches. Signed-off-by: David Gibson Reviewed-by: Thomas Huth Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/common.c | 13 + include/hw/vfio/vfio-common.h |1 + 2 files changed, 14 insertions(+) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 2faf492..f666de2 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -677,6 +677,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) || ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) { bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU); +struct vfio_iommu_type1_info info; ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, ); if (ret) { @@ -702,6 +703,15 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) */ container->min_iova = 0; container->max_iova = (hwaddr)-1; + +/* Assume just 4K IOVA page size */ +container->iova_pgsizes = 0x1000; +info.argsz = sizeof(info); +ret = ioctl(fd, VFIO_IOMMU_GET_INFO, ); +/* Ignore errors */ +if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) { +container->iova_pgsizes = info.iova_pgsizes; +} } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { struct vfio_iommu_spapr_tce_info info; @@ -744,6 +754,9 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) } container->min_iova = info.dma32_window_start; container->max_iova = container->min_iova + info.dma32_window_size - 1; + +/* Assume just 4K IOVA pages for now */ +container->iova_pgsizes = 0x1000; } else { error_report("vfio: No available IOMMU models"); ret = -EINVAL; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 27a14c0..f037f3c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -71,6 +71,7 @@ typedef struct VFIOContainer { * future */ hwaddr min_iova, max_iova; +uint64_t iova_pgsizes; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_ENTRY(VFIOContainer) next;
[Qemu-devel] [PULL 09/10] vfio: Allow hotplug of containers onto existing guest IOMMU mappings
From: David GibsonAt present the memory listener used by vfio to keep host IOMMU mappings in sync with the guest memory image assumes that if a guest IOMMU appears, then it has no existing mappings. This may not be true if a VFIO device is hotplugged onto a guest bus which didn't previously include a VFIO device, and which has existing guest IOMMU mappings. Therefore, use the memory_region_register_iommu_notifier_replay() function in order to fix this case, replaying existing guest IOMMU mappings, bringing the host IOMMU into sync with the guest IOMMU. Signed-off-by: David Gibson Signed-off-by: Alex Williamson --- hw/vfio/common.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index f666de2..6797208 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -312,6 +312,11 @@ out: rcu_read_unlock(); } +static hwaddr vfio_container_granularity(VFIOContainer *container) +{ +return (hwaddr)1 << ctz64(container->iova_pgsizes); +} + static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { @@ -369,26 +374,16 @@ static void vfio_listener_region_add(MemoryListener *listener, * would be the right place to wire that up (tell the KVM * device emulation the VFIO iommu handles to use). */ -/* - * This assumes that the guest IOMMU is empty of - * mappings at this point. - * - * One way of doing this is: - * 1. Avoid sharing IOMMUs between emulated devices or different - * IOMMU groups. - * 2. Implement VFIO_IOMMU_ENABLE in the host kernel to fail if - * there are some mappings in IOMMU. - * - * VFIO on SPAPR does that. Other IOMMU models may do that different, - * they must make sure there are no existing mappings or - * loop through existing mappings to map them into VFIO. - */ giommu = g_malloc0(sizeof(*giommu)); giommu->iommu = section->mr; giommu->container = container; giommu->n.notify = vfio_iommu_map_notify; QLIST_INSERT_HEAD(>giommu_list, giommu, giommu_next); + memory_region_register_iommu_notifier(giommu->iommu, >n); +memory_region_iommu_replay(giommu->iommu, >n, + vfio_container_granularity(container), + false); return; }
Re: [Qemu-devel] [PATCH v5 2/2] Use help sub-sections to create sub-help options
On Mon, Sep 14, 2015 at 08:01:20PM +0200, Laurent Vivier wrote: > As '-help' output is 400 lines long it is not easy > to find information, but generally we know from > which area we want the information. I agree - a bit more order is IMHO needed in the general -help - I'm not sure this is the place to start. E.g. it is not your fault that it lists e.g. a ton of random stuff as Standard options but this becomes more of a problem if you hide things behind sub-sections. For example, how do I change the amount of memory? Currently I do --help and search for memory. This also diverges from the way one queries other help (-device foo,help, -cpu help etc). How about we move detailed help into each option instead? This will mean only 113 lines which is a lot, but kind of reasonable. -machine [type=]name[,prop[=value][,...]] select emulated machine ('-machine help' for help) -cpu cpuselect CPU ('-cpu help' for list) -smp [cpus=]n[,prop[=value][,...]] configure SMP ('-smp help' for help) -numa node[,prop[=value][,...]] configure NUMA ('-numa help' for help) Of course it's also true that often the help says nothing useful at all. > As sections already exist in the help description, > add some options to only display the wanted section. > > '-help' now can take an optional parameter of the > form -help[=LIST], which is a comma separated list > of sections to display: > > all display all help sections > help display help options > standard display standard options What does standard options mean? > block display block options > usb display usb options > display display display options > machine display machine options > network display network options > character display character options Character device options? > url display url options what are these? > btdisplay bt options bluetooth options? > tpm display tpm options > kerneldisplay kernel options > expertdisplay expert options > objectdisplay object options what are these? > > '-help' without option displays only help options. > > Example: > > $ x86_64-softmmu/qemu-system-x86_64 -help=kernel,usb > QEMU emulator version 2.4.50, Copyright (c) 2003-2008 Fabrice Bellard > usage: qemu-system-x86_64 [options] [disk_image] > > 'disk_image' is a raw hard disk image for IDE hard disk 0 > > Linux/Multiboot boot specific: > -kernel bzImage use 'bzImage' as kernel image > -append cmdline use 'cmdline' as kernel command line > -initrd fileuse 'file' as initial ram disk > -dtbfileuse 'file' as device tree image > > USB options: > -usbenable the USB driver (will be the default soon) > -usbdevice name add the host or guest USB device 'name' > > Signed-off-by: Laurent VivierMaybe a good place to start is rewriting our man page. We can then distill that to a minimum that makes sense in -help. -- MST
Re: [Qemu-devel] [PATCH v8 04/54] Move configuration section writing
On (Tue) 29 Sep 2015 [09:37:28], Dr. David Alan Gilbert (git) wrote: > From: "Dr. David Alan Gilbert"> > The vmstate_configuration is currently written > in 'qemu_savevm_state_begin', move it to > 'qemu_savevm_state_header' since it's got a hard > requirement that it must be the 1st thing after > the header. > (In postcopy some 'command' sections get sent > early before the saving of the main sections > and hence before qemu_savevm_state_begin). > > Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Amit Shah The function name 'savevm_state_header()' isn't accurate anymore. Not serious for this series. Amit
Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
Peter Maydellwrites: > On 2 October 2015 at 18:20, Markus Armbruster wrote: >> QMP command device-list-properties regressed in 2.1: it can crash or >> leave dangling pointers behind. >> >> -device FOO,help regressed in 2.2: it no longer works for >> non-pluggable devices. I tried to fix that some time ago[*], but my >> fix failed review. This is my second, more comprehensive try. >> >> PATCH 1-3 fix one class of bugs involved in the regressions, PATCH 4-5 >> are libqtest preliminaries, PATCH 6 adds tests to demonstrate the >> remaining bugs, PATCH 7-9 fix them to a degree (see PATCH 8 for >> limitations), and PATCH 10 cleans up. > > This ordering breaks bisection of 'make check', as I found out when > I tried to figure out which of the patches in this pull was causing > an OSX test failure. Please can you reorder them so that 'make check' > works at all points in the series? My ordering may be bad (and I'll recheck it, of course), or it may temporarily expose a hidden bug. I better figure out what's going on here. >> The following changes since commit ff770b07f34d28b79013a83989bd6c85f8f16b2f: >> >> Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into >> staging (2015-10-02 11:01:18 +0100) >> >> are available in the git repository at: >> >> git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2015-10-02 >> >> for you to fetch changes up to e927162a6fa2fa6144de9d1d11cc9448a2143671: >> >> Revert "qdev: Use qdev_get_device_class() for -device ,help" >> (2015-10-02 16:45:53 +0200) >> >> >> Fix device introspection regressions >> >> > > 'make check' failure on OSX: > > /aarch64/device/introspect/list: OK > /aarch64/device/introspect/none: OK > /aarch64/device/introspect/abstract: OK > /aarch64/device/introspect/concrete: ** > ERROR:/Users/pm215/src/qemu-for-merges/qom/object.c:333:void > object_initialize_with_type(void *, size_t, TypeImpl *): assertion > failed: (type != NULL) > Broken pipe > FAIL > > I have no idea why this only failed on OSX... Can you re-run this with valgrind spliced in? I use something like $ QTEST_QEMU_BINARY="valgrind --vgdb-error=1 --log-file=vg.log qemu-system-aarch64" QTEST_QEMU_IMG=qemu-img MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k --verbose -m=quick tests/device-introspect-test > Backtrace: [Confusing due to inlining and other optimizations; snipped for now...]
Re: [Qemu-devel] [PATCH 1/3] Target-microblaze: Remove unnecessary variable
05.10.2015 08:18, Markus Armbruster wrote: > Michael Tokarevwrites: > >> 25.09.2015 11:37, Shraddha Barke wrote: >>> Compress lines and remove the variable . >> >> Applied to -trivial, removing this piece of commit message: >> >> --- >>> Change made using Coccinelle script [..snip..] >> --- > > Why? I like having the semantic patch in the commit message when > there's any chance we'll want do the same mechanical change again later. > > You could save space and include it by reference, though: "Same > Coccinelle semantic patch as is commit 74c373e". git commit messages aren't good documentation for various scripts like this, this info will be lost in the noize. If it might be better to keep such scripts in a separate file where it is easier to find, or in a wiki page on the site. The key point is where to find the info, git log is difficult for that, especially when you don't know what to search for or that such a script exists in there in the first place. On the other hand, when git log is cluttered by such a long messages for such small changes, it becomes more difficult to find info which you really look in git log -- namely, which changes were made that might have introduced this regression, things like that. So to me, the shorter and cleaner the commit message is, the better. BTW, I've no idea why this email has been Cc'd to kvm@vger :) Thanks, /mjt
Re: [Qemu-devel] [PATCH 21/36] misc: spelling
05.10.2015 08:09, Markus Armbruster пишет: > Michael Tokarevwrites: > >> 25.09.2015 19:08, Eric Blake wrote: >>> On 09/25/2015 08:03 AM, marcandre.lur...@redhat.com wrote: From: Marc-André Lureau [] >>> Reviewed-by: Eric Blake >> >> Note there's no S-o-b line in the original patch (whole series, >> looks like). Hopefully it is okay for such a really trivial >> patch :) >> >> Applied, thanks! > > It may be legally safe, but do we really want to engage in judging > whether patches are copyrightable or not? Besides, it sets a bad > example. Sometimes I question our own sanity. Even for a trivial spelling fix we require significantly more beaurocracy(sp) than the fix is worth, and want formal rules instead of using common sense. This is a common trend in the world, to formalize everything instead of thinking, the world is becoming "candy". This reminded me an old movie, "Demolition Man", -- the cops in the future reads instructions about what to do in each situation they happened to come. But oh well, no one want to take responsibility, that's okay ;) Sorry for somewhat non-technical answer, I'll revert this patch, waiting for more beaurocracy. > Marc-André, please repost your patches ready for -trivial with your > S-o-B, cc: qemu-trivial. Mind you, it was a large series, with wasn't intended for -trivial at all. That's more rules and more beaurocracy. And many other patches in that series didn't have s-o-b line too. Thanks, /mjt
Re: [Qemu-devel] [PATCH 1/3] Target-microblaze: Remove unnecessary variable
On 5 October 2015 at 08:18, Michael Tokarevwrote: > 05.10.2015 08:18, Markus Armbruster wrote: >> Why? I like having the semantic patch in the commit message when >> there's any chance we'll want do the same mechanical change again later. >> >> You could save space and include it by reference, though: "Same >> Coccinelle semantic patch as is commit 74c373e". > > git commit messages aren't good documentation for various scripts > like this, this info will be lost in the noize. If it might be > better to keep such scripts in a separate file where it is easier > to find, or in a wiki page on the site. The key point is where to > find the info, git log is difficult for that, especially when you > don't know what to search for or that such a script exists in > there in the first place. > > On the other hand, when git log is cluttered by such a long messages > for such small changes, it becomes more difficult to find info which > you really look in git log -- namely, which changes were made that > might have introduced this regression, things like that. I think it can be useful when you're looking at a commit to know that it was automatically created, especially if it's a big commit. It means that if you're looking for a bug in it you can concentrate on the script that created it rather than the possibly large set of changes it produced, or if you're trying to cherry-pick it into another branch you can just apply the script instead. In a commit with a change this small it's not very significant either way, though. thanks -- PMM
Re: [Qemu-devel] [PATCH v5 20/48] ivshmem: simplify a bit the code
On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau> > Use some more explicit variables to simplify the code. > > Signed-off-by: Marc-André Lureau Reviewed-by: Claudio Fontana > --- > hw/misc/ivshmem.c | 28 ++-- > 1 file changed, 14 insertions(+), 14 deletions(-) > > diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c > index 6ee4881..c054e52 100644 > --- a/hw/misc/ivshmem.c > +++ b/hw/misc/ivshmem.c > @@ -488,9 +488,10 @@ static void ivshmem_read(void *opaque, const uint8_t > *buf, int size) > { > IVShmemState *s = opaque; > int incoming_fd; > -int guest_max_eventfd; > +int new_eventfd; > long incoming_posn; > Error *err = NULL; > +Peer *peer; > > if (!fifo_update_and_get(s, buf, size, > _posn, sizeof(incoming_posn))) { > @@ -517,6 +518,8 @@ static void ivshmem_read(void *opaque, const uint8_t > *buf, int size) > } > } > > +peer = >peers[incoming_posn]; > + > if (incoming_fd == -1) { > /* if posn is positive and unseen before then this is our posn*/ > if (incoming_posn >= 0 && s->vm_id == -1) { > @@ -564,27 +567,24 @@ static void ivshmem_read(void *opaque, const uint8_t > *buf, int size) > return; > } > > -/* each guest has an array of eventfds, and we keep track of how many > - * guests for each VM */ > -guest_max_eventfd = s->peers[incoming_posn].nb_eventfds; > +/* each peer has an associated array of eventfds, and we keep > + * track of how many eventfds received so far */ > +/* get a new eventfd: */ > +new_eventfd = peer->nb_eventfds++; > > /* this is an eventfd for a particular guest VM */ > IVSHMEM_DPRINTF("eventfds[%ld][%d] = %d\n", incoming_posn, > -guest_max_eventfd, incoming_fd); > - > event_notifier_init_fd(>peers[incoming_posn].eventfds[guest_max_eventfd], > - incoming_fd); > - > -/* increment count for particular guest */ > -s->peers[incoming_posn].nb_eventfds++; > +new_eventfd, incoming_fd); > +event_notifier_init_fd(>eventfds[new_eventfd], incoming_fd); > > if (incoming_posn == s->vm_id) { > -s->eventfd_chr[guest_max_eventfd] = create_eventfd_chr_device(s, > - >peers[s->vm_id].eventfds[guest_max_eventfd], > - guest_max_eventfd); > +s->eventfd_chr[new_eventfd] = create_eventfd_chr_device(s, > + >peers[s->vm_id].eventfds[new_eventfd], > + new_eventfd); > } > > if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) { > -ivshmem_add_eventfd(s, incoming_posn, guest_max_eventfd); > +ivshmem_add_eventfd(s, incoming_posn, new_eventfd); > } > } > > -- Claudio Fontana Server Virtualization Architect Huawei Technologies Duesseldorf GmbH Riesstraße 25 - 80992 München
[Qemu-devel] implement smbios support for mach-virt: triggers usual QEMU makefile bug
Hi Peter, The commit "smbios: implement smbios support for mach-virt" seems to cause the usual problem in QEMU's makefiles to trigger: hw/arm/virt.c:892: undefined reference to `smbios_set_defaults' hw/arm/virt.c:895: undefined reference to `smbios_get_tables' This is IIRC the consequence of adding CONFIG_SMBIOS=y to default-configs/rm-softmmu.mak, which is not picked up by the build system until a clean of the working tree has been done, right? This is worked around by $ git clean -d -x -f followed by reconfigure/rebuild. Just wanted to mention this on the list in case someone is looking to fix this longstanding issue.. Ciao, thanks Claudio
Re: [Qemu-devel] [PATCH RFC 0/3] Checkpoint-assisted migration proposal
Hi Amit, On Tue, Sep 15, 2015 at 12:39 PM, Amit Shahwrote: > Could you please include a file in the docs/ directory that documents > how this works, so it's easier to comment on the general idea? sure, we will add this. > From 'checkpointing', I was afraid this was going to use some > checkpoint-restore framework, but instead it's a new checkpointing > method that you're adding to qemu. > > Can you describe when checkpoints are taken, and what is checkpointed? > How is it stored on the disk? Checkpoints are taken after a migration (at the source). If a checkpoint exists at the destination, the VM's state is reconstructed from the local checkpoint as well as updated pages from the source. This checkpoint-assisted migration can be faster, if network is the bottleneck, and saves network bandwidth. We can, in principle, reuse the existing checkpoint format of QEMU. The current implementation writes its own checkpoint because it was less effort on our side. We write the VM's main memory into a single file. > I'm sure the patches have all the details, but it's easier to check > the soundness of the idea if there's a high-level doc that explains > this, and then we can discuss the finer points over patches. We've recently published a paper about the general idea and expected benefits for a number of workloads ( http://se.inf.tu-dresden.de/pubs/papers/knauth2015vecycle.pdf ) > Overall, I think this approach can benefit some workloads, and since > it's not affecting a lot of common code, we could look at adding it. > > Also, apologies for not getting to this earlier. Kind regards, Thomas.
[Qemu-devel] [RFC v0 0/2] Enforce gaps between DIMMs
The suggested way to work around the virtio bug reported here http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html is to introduce gaps between DIMMs. Igor's patchset changes the pc-dimm auto-address assignment to introduce gaps and ues the same from pc memhp. This patchset does the same for sPAPR PowerPC. Before introducing the gap, ensure that memory hotplug region has enough room for alignment adjustment. We accommodate a max alignment of 256MB for each slot since sPAPR memory hotplug enforces an alignment requirement of 256MB on RAM size, maxmem and NUMA node mem sizes. This applies on David's spapr-next branch + Igor's patchset applied. This has been very lightly tested and intention is to get feedback on the correctness aspect of this. Bharata B Rao (2): spapr: Accommadate alignment gaps in hotplug memory region spapr: Force gaps between DIMM's GPA hw/ppc/spapr.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) -- 2.1.0
Re: [Qemu-devel] [PATCH RFC 0/3] Checkpoint-assisted migration proposal
On (Mon) 05 Oct 2015 [10:33:01], Thomas Knauth wrote: > Hi Amit, > > On Tue, Sep 15, 2015 at 12:39 PM, Amit Shahwrote: > > Could you please include a file in the docs/ directory that documents > > how this works, so it's easier to comment on the general idea? > > sure, we will add this. Thanks! > > From 'checkpointing', I was afraid this was going to use some > > checkpoint-restore framework, but instead it's a new checkpointing > > method that you're adding to qemu. > > > > Can you describe when checkpoints are taken, and what is checkpointed? > > How is it stored on the disk? > > Checkpoints are taken after a migration (at the source). If a > checkpoint exists at the destination, the VM's state is reconstructed > from the local checkpoint as well as updated pages from the source. > This checkpoint-assisted migration can be faster, if network is the > bottleneck, and saves network bandwidth. > > We can, in principle, reuse the existing checkpoint format of QEMU. > The current implementation writes its own checkpoint because it was > less effort on our side. We write the VM's main memory into a single > file. > > > I'm sure the patches have all the details, but it's easier to check > > the soundness of the idea if there's a high-level doc that explains > > this, and then we can discuss the finer points over patches. > > We've recently published a paper about the general idea and expected > benefits for a number of workloads ( > http://se.inf.tu-dresden.de/pubs/papers/knauth2015vecycle.pdf ) I'll give it a look, thanks. I'm interested in knowing what workloads benefit. There was one outstanding question from Dave about collisions: https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01614.html Can you please address that in your next submission? > > Overall, I think this approach can benefit some workloads, and since > > it's not affecting a lot of common code, we could look at adding it. > > > > Also, apologies for not getting to this earlier. > > Kind regards, > Thomas. Thanks, Amit
Re: [Qemu-devel] [PATCH v4] linux-user/syscall.c: malloc()/calloc() to g_malloc()/g_try_malloc()/g_new0()
On Mon, Oct 5, 2015 at 4:32 AM, Harmandeep Kaurwrote: > Convert malloc()/calloc() calls to g_malloc()/g_try_malloc()/g_new0() > in linux-user/syscall.c file > > Signed-off-by: Harmandeep Kaur > --- > v1->v2 convert the free() call in host_to_target_semarray() > to g_free() and calls g_try_malloc(count) instead of > g_try_malloc(sizeof(count)) > > v2->v3 used g_try_new() and friends to avoid overflow issues > > v3->v4 use g_free for unlock_iovec() and host_to_target_semarray(). > > linux-user/syscall.c | 24 > 1 file changed, 12 insertions(+), 12 deletions(-) Reviewed-by: Stefan Hajnoczi
Re: [Qemu-devel] [PATCH 1/3] block: prohibit migration during BlockJobs
Am 02.10.2015 um 20:17 hat John Snow geschrieben: > On 10/01/2015 02:03 PM, Paolo Bonzini wrote: > > On 01/10/2015 18:34, John Snow wrote: > >> Unless we can prove this to be safe for specific cases, > >> the default should be to prohibit migration during BlockJobs. > > > > Block jobs do not affect the current block, only other block device, > > hence they *are* safe for migration. > > > > Can you elaborate for me here? > > > What you want, I think, is the target not to be garbage when migration > > ends. Based on this you can block specific cases, namely mirror which > > you already do allow (patch 2) and backup except for sync='none'. > > > > Paolo > > It would be nice if the target wasn't garbage, yes :) > > I allow mirror in specific circumstances -- you can't start a mirror, > but if an existing mirror has hit the sync phase, that's OK. > > I can try to do a more exhaustive audit of what should and should not > work, but my thought was "guilty before proven innocent." I think I've come to the conclusion that qemu can't even know in all cases (particularly mirroring) because it depends on what the resulting image is used for. If we mirror in order to do a live storage migration, then obviously it needs to run during migration. In this case, your restriction "only in sync phase" seems to be right. If we mirror for some other reason, the mirroring job should probably continue running on the destination host. The management tool can set up the destination accordingly, but if it doesn't, the job is silently cancelled. The point is that qemu (even more on the source host) doesn't know which case is true, so rejecting the migration seems wrong. If we do commit or stream, the job is silently cancelled and must be restarted on the destination host. It's the same case as above. All (or most) of the restarting jobs on the destination doesn't come for free, we usually repeat some useless work that the source host had already done. That's not a reason to reject migrations, just cost for the user to consider before they start migration. So in the end, I'm not sure how much qemu can actually do here. Kevin
Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox
On Mon, Oct 05, 2015 at 07:20:58AM +0200, Markus Armbruster wrote: > "Namsun Ch'o"writes: > > >> If we intend seccomp to protect against flaws during QEMU setup, then > >> having > >> it earlier is neccessary. eg QEMU opening a corrupt qcow2 image which might > >> exploit QEMU before the guest CPUs start. > > > >> If the latter is the case, then we could start with a relaxed seccomp > >> sandbox which included the setuid/chroot features, and then switch to a > >> more restricted one which blocked them before main_loop() runs. > > > > That's not possible. Seccomp will not be enforced until seccomp_load(ctx) is > > called, after which no new changes to the filter can be made. > > That's a pity. > > As long as it's the case, we need to pick: either we protect against > rogue guests, or against rogue images. The original idea was the > former, and it still makes the most sense to me. Yep, protecting against rogue guests seems much more important. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
[Qemu-devel] [RFC v0 2/2] spapr: Force gaps between DIMM's GPA
Mapping DIMMs non contiguously allows to workaround virtio bug reported earlier: http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html In this case guest kernel doesn't allocate buffers that can cross DIMM boundary keeping each buffer local to a DIMM. Suggested-by: Michael S. TsirkinSigned-off-by: Bharata B Rao --- hw/ppc/spapr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 2ec509b..a8526e9 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2127,7 +2127,7 @@ static void spapr_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev, goto out; } -pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, false, _err); +pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, true, _err); if (local_err) { goto out; } -- 2.1.0
[Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region
Size hotplug memory region assuming a 256MB max alignment every slot. Signed-off-by: Bharata B Rao--- hw/ppc/spapr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index fc5e7d6..2ec509b 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine) spapr->hotplug_memory.base = ROUND_UP(machine->ram_size, SPAPR_HOTPLUG_MEM_ALIGN); + +/* size hotplug region assuming 256M max alignment per slot */ +hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots; memory_region_init(>hotplug_memory.mr, OBJECT(spapr), "hotplug-memory", hotplug_mem_size); memory_region_add_subregion(sysmem, spapr->hotplug_memory.base, -- 2.1.0
Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA
On Mon, Sep 28, 2015 at 11:13:42AM +0200, Igor Mammedov wrote: > On Mon, 28 Sep 2015 10:09:26 +0530 > Bharata B Raowrote: > > > On Sun, Sep 27, 2015 at 04:04:06PM +0200, Igor Mammedov wrote: > > > On Sun, 27 Sep 2015 16:11:02 +0300 > > > "Michael S. Tsirkin" wrote: > > > > > > > On Sun, Sep 27, 2015 at 03:06:24PM +0200, Igor Mammedov wrote: > > > > > On Sun, 27 Sep 2015 13:48:21 +0300 > > > > > "Michael S. Tsirkin" wrote: > > > > > > > > > > > On Fri, Sep 25, 2015 at 03:53:12PM +0200, Igor Mammedov wrote: > > > > > > > mapping DIMMs non contiguously allows to workaround > > > > > > > virtio bug reported earlier: > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html > > > > > > > in this case guest kernel doesn't allocate buffers > > > > > > > that can cross DIMM boundary keeping each buffer > > > > > > > local to a DIMM. > > > > > > > > > > > > > > Suggested-by: Michael S. Tsirkin > > > > > > > Signed-off-by: Igor Mammedov > > > > > > > --- > > > > > > > benefit of this workaround is that no guest side > > > > > > > changes are required. > > > > > > > > > > > > That's a hard requirement, I agree. > > > > > > > > > > > > > > > > > > > --- > > > > > > > hw/i386/pc.c | 4 +++- > > > > > > > hw/i386/pc_piix.c| 3 +++ > > > > > > > hw/i386/pc_q35.c | 3 +++ > > > > > > > include/hw/i386/pc.h | 2 ++ > > > > > > > 4 files changed, 11 insertions(+), 1 deletion(-) > > > > > > > > > > > > Aren't other architectures besides PC ever affected? > > > > > > Do they all allocate all of memory contigious in HVA space? > > > > > I'm not sure about other targets I've CCed interested parties. > > > > > > > > > > > > > > > > > Also - does the issue only affect hotplugged memory? > > > > > Potentially it affects -numa memdev=foo, but however I've > > > > > tried I wasn't able to reproduce. > > > > > We could do it as > > > > > separate workaround later if it would affect someone > > > > > and virtio is not fixed to handle split buffers by that time. > > > > > > > > > > > > > You can't reproduce a crash or you can't reproduce getting > > > > contigious GPA with fragmented HVA? > > > > If you can see fragmentation that's enough to assume guest crash > > > > can be triggered, even if it doesn't with Linux. > > > I'll check it. > > > > > > > > > > > > > > > > > > Can't the patch be local to pc-dimm (except maybe the > > > > > > backwards compatibility thing)? > > > > > I think decision about using gaps and its size > > > > > should be done by board and not generic pc-dimm. > > > > > > > > > > > > > Well virtio is generic and can be used by all boards. > > > Still pc-dimm.addr is not allocation is not part of pc-dimm > > > device. it's just helper functions that happen to live in > > > the same file source file. > > > > > > But more importantly every target might have it's own > > > notion how it partitions hotplug address space so making > > > the same gap global might break them. > > > > > > It's safer to enable gaps per target, I think ppc guys > > > will make their own patch on top of this to taking > > > in account their target specific and compat stuff. > > > > I have never seen this issue that you mention at > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html > > > > in PowerPC. I have not been able to reproduce the QEMU crash with the > > commandline suggested there. > > > > (# ./ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic > > -machine pseries -m 8G,slots=32,maxmem=32G -device > > virtio-blk-pci,drive=rootdisk -drive > > file=/home/bharata/F20-snap1,if=none,cache=none,id=rootdisk,format=qcow2 > > -monitor telnet:localhost:1235,server,nowait -vga none > > -bios /home/bharata/slof/slof.bin -smp 16,maxcpus=32 -netdev > > tap,id=foo,ifname=tap0,script=/home/bharata/qemu-ifup -device > > virtio-net-pci,id=n1,netdev=foo `for i in $(seq 0 15); do echo -n > > "-object memory-backend-ram,id=m$i,size=256M -device > > pc-dimm,id=dimm$i,memdev=m$i "; done` -snapshot) > > > > PowerPC sPAPR memory hotplug enforces memory alignment of 256MB > > for both boottime as well as hotplugged memory. > > > > So not sure if anything other than the default gap=0 which you have > > done in this patchset for PowerPC is necessary. > The bigger initial memory and dimm sizes the less likelihood of > triggering the bug. You don't see it mostly due to luck, but it doesn't > rule out possibility of it happening in production. > So please consider turning on gaps for ppc machine. > > Looking at how hotplug_mem_size is sized in hw/ppc/spapr.c it doesn't > look that just turning on gaps would work since it doesn't have a space > for alignment adjustment. > > try to plug dimm device in following order: > -m 8G,slots=2,maxmem=1256M \ > > -object memory-backend-ram,id=m1,size=256M -device pc-dimm,memdev=m1 \ > > -object >
Re: [Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region
On Mon, 5 Oct 2015 14:05:23 +0530 Bharata B Raowrote: > Size hotplug memory region assuming a 256MB max alignment every slot. > > Signed-off-by: Bharata B Rao > --- > hw/ppc/spapr.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index fc5e7d6..2ec509b 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine) > > spapr->hotplug_memory.base = ROUND_UP(machine->ram_size, >SPAPR_HOTPLUG_MEM_ALIGN); > + > +/* size hotplug region assuming 256M max alignment per slot */ > +hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots; Does target support hugepages backend? If it does then adjustment probably should be max supported hugepage alignment. > memory_region_init(>hotplug_memory.mr, OBJECT(spapr), > "hotplug-memory", hotplug_mem_size); > memory_region_add_subregion(sysmem, spapr->hotplug_memory.base,
[Qemu-devel] [PATCH 2/3] virtio-9p: add unrealize handler
If the user tries to hot unplug a virtio-9p device, it seems to succeed but in fact: - virtio-9p coroutines thread pool and async queue are leaked - QEMU crashes in virtio_vmstate_change() if the user tries to live migrate This patch brings hot unplug support to virtio-9p-device. It fixes both above issues. Signed-off-by: Greg Kurz--- hw/9pfs/virtio-9p-device.c | 12 1 file changed, 12 insertions(+) diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c index 93a407c45926..ed133c40493a 100644 --- a/hw/9pfs/virtio-9p-device.c +++ b/hw/9pfs/virtio-9p-device.c @@ -138,6 +138,17 @@ out: v9fs_path_free(); } +static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(dev); +V9fsState *s = VIRTIO_9P(dev); + +v9fs_release_worker_threads(); +g_free(s->ctx.fs_root); +g_free(s->tag); +virtio_cleanup(vdev); +} + /* virtio-9p device */ static Property virtio_9p_properties[] = { @@ -154,6 +165,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void *data) dc->props = virtio_9p_properties; set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); vdc->realize = virtio_9p_device_realize; +vdc->unrealize = virtio_9p_device_unrealize; vdc->get_features = virtio_9p_get_features; vdc->get_config = virtio_9p_get_config; }
[Qemu-devel] [PATCH 1/3] virtio-9p-coth: add release function and fix init error path
This is preliminary work to support hotplug/unplug of virtio-9p-device. Signed-off-by: Greg Kurz--- hw/9pfs/virtio-9p-coth.c | 13 + hw/9pfs/virtio-9p-coth.h |1 + 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/virtio-9p-coth.c index 8185c533c013..a4392586c9a2 100644 --- a/hw/9pfs/virtio-9p-coth.c +++ b/hw/9pfs/virtio-9p-coth.c @@ -66,10 +66,7 @@ int v9fs_init_worker_threads(void) } p->completed = g_async_queue_new(); if (!p->completed) { -/* - * We are going to terminate. - * So don't worry about cleanup - */ +g_thread_pool_free(p->pool, true, true); ret = -1; goto err_out; } @@ -80,3 +77,11 @@ err_out: pthread_sigmask(SIG_SETMASK, , NULL); return ret; } + +int v9fs_release_worker_threads(void) +{ +V9fsThPool *p = _pool; + +g_thread_pool_free(p->pool, TRUE, TRUE); +g_async_queue_unref(p->completed); +} diff --git a/hw/9pfs/virtio-9p-coth.h b/hw/9pfs/virtio-9p-coth.h index 4f51b250d1d4..6502e422cea8 100644 --- a/hw/9pfs/virtio-9p-coth.h +++ b/hw/9pfs/virtio-9p-coth.h @@ -56,6 +56,7 @@ typedef struct V9fsThPool { extern void co_run_in_worker_bh(void *); extern int v9fs_init_worker_threads(void); +extern int v9fs_release_worker_threads(void); extern int v9fs_co_readlink(V9fsPDU *, V9fsPath *, V9fsString *); extern int v9fs_co_readdir_r(V9fsPDU *, V9fsFidState *, struct dirent *, struct dirent **result);
[Qemu-devel] [PATCH 3/3] virtio-9p: add savem handlers
We don't support migration of mounted 9p shares. This is handled by a migration blocker. One would expect, however, to be able to migrate if the share is unmounted. Unfortunately virtio-9p-device does not register savevm handlers at all ! Migration succeeds and leaves the guest with a dangling device... This patch simply registers migration handlers for virtio-9p-device. Whether migration is possible or not still depends on the migration blocker. Signed-off-by: Greg Kurz--- hw/9pfs/virtio-9p-device.c | 12 1 file changed, 12 insertions(+) diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c index ed133c40493a..bd7f10a0a902 100644 --- a/hw/9pfs/virtio-9p-device.c +++ b/hw/9pfs/virtio-9p-device.c @@ -43,6 +43,16 @@ static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t *config) g_free(cfg); } +static void virtio_9p_save(QEMUFile *f, void *opaque) +{ +virtio_save(VIRTIO_DEVICE(opaque), f); +} + +static int virtio_9p_load(QEMUFile *f, void *opaque, int version_id) +{ +return virtio_load(VIRTIO_DEVICE(opaque), f, version_id); +} + static void virtio_9p_device_realize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -130,6 +140,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp) } v9fs_path_free(); +register_savevm(dev, "virtio-9p", -1, 1, virtio_9p_save, virtio_9p_load, s); return; out: g_free(s->ctx.fs_root); @@ -146,6 +157,7 @@ static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp) v9fs_release_worker_threads(); g_free(s->ctx.fs_root); g_free(s->tag); +unregister_savevm(dev, "virtio-9p", s); virtio_cleanup(vdev); }
[Qemu-devel] [PATCH 0/3] virtio-9p: hotplug and migration support
This series brings both hoplug and migration support to virtio-9p devices. Patch 1 is preliminary work. Patch 2 brings hotplug and fixes a QEMU crash. Patch 3 brings migration (of course it still depends on the 9p to be unmounted) --- Greg Kurz (3): virtio-9p-coth: add release function and fix init error path virtio-9p: add unrealize handler virtio-9p: add savem handlers hw/9pfs/virtio-9p-coth.c | 13 + hw/9pfs/virtio-9p-coth.h |1 + hw/9pfs/virtio-9p-device.c | 24 3 files changed, 34 insertions(+), 4 deletions(-)
Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable
On Fr, 2015-10-02 at 09:40 -0400, Kevin O'Connor wrote: > On Fri, Oct 02, 2015 at 10:09:17AM +0200, Gerd Hoffmann wrote: > > > - read four bytes from under the fw_cfg selector QEMU_CFG_KERNEL_SIZE > > > (0x0008), > > > - if it is zero,return -1 --> no kernel boot requested, > > > - if it is nonzero, return 0 --> which means "top priority". > > > > > > In other words, I agree with: > > > > > > > -option_rom[nb_option_roms].bootindex = 0; > > > > +option_rom[nb_option_roms].bootindex = 1; > > The bootindex in QEMU is not visible in the firmware, so if the rest > of patch 6 is dropped then the above should be dropped as well. > > > Hmm. That makes the boot order undefined for "qemu -kernel foo -device > > virtio-blk,drive=bar,bootindex=1" when using an old seabios. I don't > > think this is a good idea. > > Wouldn't that make the bootorder undefined everywhere? What does it > mean to use -kernel and specify a bootorder? Kernel gets priority 0, everything else what you specify (typically configs start with bootindex=1), so the kernel gets the highest priority. So unless the user uses F12 to enter the boot menu and picks something else the kernel is booted. cheers, Gerd
Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable
On Fr, 2015-10-02 at 09:38 -0400, Kevin O'Connor wrote: > On Fri, Oct 02, 2015 at 10:16:26AM +0200, Gerd Hoffmann wrote: > > Hi, > > > > > That's fine with me. Marc - I think qemu_vmlinux_setup() in SeaBIOS > > > with the following would work: > > > > > > void qemu_vmlinux_setup(void) > > > { > > > u32 kernel_size; > > > qemu_cfg_read_entry(_size, QEMU_CFG_KERNEL_SIZE, > > > sizeof(kernel_size)); > > > if (kernel_size) > > > boot_add_qemu_vmlinux("QEMU Kernel image", 0); > > > } > > > > It isn't that simple. We also have support for multiboot kernels (using > > multiboot.bin option rom). So when doing this you need to be prepared > > to find a multiboot kernel in fw_cfg. > > Is there some way to detect if it's a multiboot kernel? Yes. There is a header with magic + checksum in the first 8k, see hw/i386/multiboot.c (in qemu). Or check the option rom name in the bootorder file, it's multiboot instead of linuxboot. > If so, > seabios can just fall back to using multiboot.bin. Or add multiboot support to seabios. cheers, Gerd
[Qemu-devel] [PATCH] target-mips: Add SIGRIE instruction
Add SIGRIE (Signal Reserved Instruction Exception). The instruction allows to use the 16-bit code field for software use. This instruction is introduced by and required as of Release 6. Signed-off-by: Yongbok Kim--- target-mips/translate.c | 12 +++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/target-mips/translate.c b/target-mips/translate.c index 4a47c2d..0af807a 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -323,6 +323,7 @@ enum { OPC_TLTIU= (0x0B << 16) | OPC_REGIMM, OPC_TEQI = (0x0C << 16) | OPC_REGIMM, OPC_TNEI = (0x0E << 16) | OPC_REGIMM, +OPC_SIGRIE = (0x17 << 16) | OPC_REGIMM, OPC_SYNCI= (0x1F << 16) | OPC_REGIMM, OPC_DAHI = (0x06 << 16) | OPC_REGIMM, @@ -12017,7 +12018,8 @@ enum { LSA = 0x0f, ALIGN = 0x1f, EXT = 0x2c, -POOL32AXF = 0x3c +POOL32AXF = 0x3c, +SIGRIE = 0x3f }; /* POOL32AXF encoding of minor opcode field extension */ @@ -13636,6 +13638,10 @@ static void decode_micromips32_opc(CPUMIPSState *env, DisasContext *ctx) case BREAK32: generate_exception_end(ctx, EXCP_BREAK); break; +case SIGRIE: +check_insn(ctx, ISA_MIPS32R6); +generate_exception_err(ctx, EXCP_RI, extract32(ctx->opcode, 6, 16)); +break; default: pool32a_invalid: MIPS_INVAL("pool32a"); @@ -18958,6 +18964,10 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx) check_insn_opc_removed(ctx, ISA_MIPS32R6); gen_trap(ctx, op1, rs, -1, imm); break; +case OPC_SIGRIE: +check_insn(ctx, ISA_MIPS32R6); +generate_exception_err(ctx, EXCP_RI, extract32(ctx->opcode, 0, 16)); +break; case OPC_SYNCI: check_insn(ctx, ISA_MIPS32R2); /* Break the TB to be able to sync copied instructions -- 1.7.1
Re: [Qemu-devel] [PATCH v8 00/54] Postcopy implementation
* Bharata B Rao (bhar...@linux.vnet.ibm.com) wrote: > On Mon, Sep 28, 2015 at 05:51:39PM +0100, Dr. David Alan Gilbert (git) wrote: > > From: "Dr. David Alan Gilbert"> > > > This is the 8th cut of my version of postcopy. > > > > The userfaultfd linux kernel code is now in the upstream kernel > > tree, and so 4.3-rc3 can be used without modification. > > > > This qemu series can be found at: > > https://github.com/orbitfp7/qemu.git > > on the wp3-postcopy-v8 tag > > > > > > Testing status: > > * Tested heavily on x86 > > * Smoke tested on aarch64 (so it does work on different page sizes) > > * Power is unhappy for me (but gets further than the htab problem > > v7 used to have) (I get a kvm run failed) > > As I said earlier, postcopy migration works on Power, but memory hotplug > seems to have some problem. > > qemu-system-ppc64 ... -object memory-backend-ram,id=ram0,size=2G -device > pc-dimm,memdev=ram0 > > qemu/exec.c:1278: find_ram_offset: Assertion `size != 0' failed. > > Does this happen on x86 too ? Hmm, yes it does - I hadn't tried that. The problem is that I added a HOST_PAGE_ALIGN call during RAMBlock creation, and -object gets parsed pretty early on, before the internal host page masks have been setup. The patch below passes a smoke test; I'll look to clean it up. Thanks, Dave diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a3719b7..b4c4b6e 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -81,7 +81,6 @@ void cpu_gen_init(void); int cpu_gen_code(CPUArchState *env, struct TranslationBlock *tb, int *gen_code_size_ptr); bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc); -void page_size_init(void); void QEMU_NORETURN cpu_resume_from_signal(CPUState *cpu, void *puc); void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr); diff --git a/include/qemu-common.h b/include/qemu-common.h index 01d29dd..ae3530c 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -491,4 +491,6 @@ int parse_debug_env(const char *name, int max, int initial); const char *qemu_ether_ntoa(const MACAddr *mac); +void page_size_init(void); + #endif diff --git a/kvm-all.c b/kvm-all.c index de1924c..62b71fe 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1461,7 +1461,6 @@ static int kvm_init(MachineState *ms) * page size for the system though. */ assert(TARGET_PAGE_SIZE <= getpagesize()); -page_size_init(); s->sigmask_len = 8; diff --git a/vl.c b/vl.c index e211f6a..818075c 100644 --- a/vl.c +++ b/vl.c @@ -4249,6 +4249,7 @@ int main(int argc, char **argv, char **envp) exit(1); } +page_size_init(); socket_init(); if (qemu_opts_foreach(qemu_find_opts("object"), > > Regards, > Bharata. > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v5 31/48] contrib: add ivshmem client and server
Hi, On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote: > From: David Marchand> > When using ivshmem devices, notifications between guests can be sent as > interrupts using a ivshmem-server (typical use described in documentation). > The client is provided as a debug tool. > > Signed-off-by: Olivier Matz > Signed-off-by: David Marchand > [fix a valgrind warning, option and server_close() segvs, extra server > headers includes, getopt() return type] > Signed-off-by: Marc-André Lureau > --- > Makefile| 8 + > configure | 3 + > contrib/ivshmem-client/ivshmem-client.c | 433 > > contrib/ivshmem-client/ivshmem-client.h | 212 > contrib/ivshmem-client/main.c | 239 ++ > contrib/ivshmem-server/ivshmem-server.c | 422 +++ > contrib/ivshmem-server/ivshmem-server.h | 166 > contrib/ivshmem-server/main.c | 264 +++ > qemu-doc.texi | 10 +- > 9 files changed, 1754 insertions(+), 3 deletions(-) > create mode 100644 contrib/ivshmem-client/ivshmem-client.c > create mode 100644 contrib/ivshmem-client/ivshmem-client.h > create mode 100644 contrib/ivshmem-client/main.c > create mode 100644 contrib/ivshmem-server/ivshmem-server.c > create mode 100644 contrib/ivshmem-server/ivshmem-server.h > create mode 100644 contrib/ivshmem-server/main.c > > diff --git a/Makefile b/Makefile > index e370876..6bb4855 100644 > --- a/Makefile > +++ b/Makefile > @@ -321,6 +321,14 @@ msi: > @echo "MSI build not configured or dependency resolution failed > (reconfigure with --enable-guest-agent-msi option)" > endif > > +IVSHMEM_CLIENT_OBJS=$(addprefix $(SRC_PATH)/contrib/ivshmem-client/, > ivshmem-client.o main.o) > +ivshmem-client$(EXESUF): $(IVSHMEM_CLIENT_OBJS) > + $(call LINK, $^) > + > +IVSHMEM_SERVER_OBJS=$(addprefix $(SRC_PATH)/contrib/ivshmem-server/, > ivshmem-server.o main.o) > +ivshmem-server$(EXESUF): $(IVSHMEM_SERVER_OBJS) libqemuutil.a libqemustub.a > + $(call LINK, $^) > + I noticed only now, but this one puts the objects in the wrong place if BUILD_DIR != SRC_PATH, ie it puts the ivshmem-client.o and ivshmem-server.o in the source tree instead of the build tree. What about a submake in contrib/ ? (But can be done later, just would fit naturally to solve the problem above) Ciao Claudio > clean: > # avoid old build problems by removing potentially incorrect old files > rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h > gen-op-arm.h > diff --git a/configure b/configure > index f14454e..65da997 100755 > --- a/configure > +++ b/configure > @@ -4392,6 +4392,9 @@ if test "$want_tools" = "yes" ; then >if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" ] ; then > tools="qemu-nbd\$(EXESUF) $tools" >fi > + if [ "$kvm" = "yes" ] ; then > +tools="ivshmem-client\$(EXESUF) ivshmem-server\$(EXESUF) $tools" > + fi > fi > if test "$softmmu" = yes ; then >if test "$virtfs" != no ; then > diff --git a/contrib/ivshmem-client/ivshmem-client.c > b/contrib/ivshmem-client/ivshmem-client.c > new file mode 100644 > index 000..11c805c > --- /dev/null > +++ b/contrib/ivshmem-client/ivshmem-client.c > @@ -0,0 +1,433 @@ > +/* > + * Copyright 6WIND S.A., 2014 > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or > + * (at your option) any later version. See the COPYING file in the > + * top-level directory. > + */ > + > +#include > +#include > +#include > + > +#include "qemu-common.h" > +#include "qemu/queue.h" > + > +#include "ivshmem-client.h" > + > +/* log a message on stdout if verbose=1 */ > +#define IVSHMEM_CLIENT_DEBUG(client, fmt, ...) do { \ > +if ((client)->verbose) { \ > +printf(fmt, ## __VA_ARGS__); \ > +}\ > +} while (0) > + > +/* read message from the unix socket */ > +static int > +ivshmem_client_read_one_msg(IvshmemClient *client, long *index, int *fd) > +{ > +int ret; > +struct msghdr msg; > +struct iovec iov[1]; > +union { > +struct cmsghdr cmsg; > +char control[CMSG_SPACE(sizeof(int))]; > +} msg_control; > +struct cmsghdr *cmsg; > + > +iov[0].iov_base = index; > +iov[0].iov_len = sizeof(*index); > + > +memset(, 0, sizeof(msg)); > +msg.msg_iov = iov; > +msg.msg_iovlen = 1; > +msg.msg_control = _control; > +msg.msg_controllen = sizeof(msg_control); > + > +ret = recvmsg(client->sock_fd, , 0); > +if (ret < 0) { > +IVSHMEM_CLIENT_DEBUG(client, "cannot read message: %s\n", > + strerror(errno)); > +return -1; > +} > +if (ret == 0) { > +
Re: [Qemu-devel] [PATCH 3/3] kvm-all: notice KVM of vcpu's TSC rate after migration
On Wed, Sep 30, 2015 at 05:36:11PM -0300, Eduardo Habkost wrote: > On Wed, Sep 30, 2015 at 08:32:26AM +0800, Haozhong Zhang wrote: > > > [...] > > > > > Or maybe we shouldn't treat this as VM state, but as configuration, > > > > > and > > > > > let management configure the TSC frequency explicitly if the user > > > > > really > > > > > needs it to stay the same during migration. > > > > > > > > > > (CCing libvir-list to see if they have feedback) > > > > > > > > > > > > > Thanks for CC. I'll consider to add a command line option to control > > > > the configuration of guest TSC fequency. > > > > > > It already exists, -cpu has a "tsc-freq" option. > > > > > > > What I'm considering is to add a "-keep-tsc-freq" option to control > > whether the TSC frequency should be migrated if "tsc-freq" is not > > presented. Compared to "tsc-freq", "-keep-tsc-freq" can free users > > from figuring out the guest TSC frequency manually in the migration. > > If you do that, please make it a property of the CPU object, so it will > can be set as a "-cpu" option. > Yes, I'll do so. - Haozhong > -- > Eduardo > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 1/3] target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu
On Wed, Sep 30, 2015 at 09:07:08AM +0100, Dr. David Alan Gilbert wrote: > * Haozhong Zhang (haozhong.zh...@intel.com) wrote: > > On Tue, Sep 29, 2015 at 08:00:13PM +0100, Dr. David Alan Gilbert wrote: > > > * Haozhong Zhang (haozhong.zh...@intel.com) wrote: > > > > The newly added subsection 'vmstate_tsc_khz' in this patch results in > > > > vcpu's TSC rate being saved on the source machine and loaded on the > > > > target machine during the migration. > > > > > > > > Signed-off-by: Haozhong Zhang> > > > > > Hi, > > > I'd appreciate it if you could tie this to only do it on newer > > > machine types; that way it won't break back migration. > > > > > > > Does "back migration" mean migrating from QEMU w/ vmstate_tsc_khz > > subsection to QEMU w/o that subsection? > > Yes; like if we migrate from a newer qemu to an older qemu but with > the same machine type. > This patch does break the back migration. I'll fix this in the next version. - Haozhong > Dave > > > > > - Haozhong > > > > > Dave > > > > > > > --- > > > > target-i386/machine.c | 20 > > > > 1 file changed, 20 insertions(+) > > > > > > > > diff --git a/target-i386/machine.c b/target-i386/machine.c > > > > index 9fa0563..80108a3 100644 > > > > --- a/target-i386/machine.c > > > > +++ b/target-i386/machine.c > > > > @@ -752,6 +752,25 @@ static const VMStateDescription vmstate_xss = { > > > > } > > > > }; > > > > > > > > +static bool tsc_khz_needed(void *opaque) > > > > +{ > > > > +X86CPU *cpu = opaque; > > > > +CPUX86State *env = >env; > > > > + > > > > +return env->tsc_khz != 0; > > > > +} > > > > + > > > > +static const VMStateDescription vmstate_tsc_khz = { > > > > +.name = "cpu/tsc_khz", > > > > +.version_id = 1, > > > > +.minimum_version_id = 1, > > > > +.needed = tsc_khz_needed, > > > > +.fields = (VMStateField[]) { > > > > +VMSTATE_INT64(env.tsc_khz, X86CPU), > > > > +VMSTATE_END_OF_LIST() > > > > +} > > > > +}; > > > > + > > > > VMStateDescription vmstate_x86_cpu = { > > > > .name = "cpu", > > > > .version_id = 12, > > > > @@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = { > > > > _msr_hyperv_crash, > > > > _avx512, > > > > _xss, > > > > +_tsc_khz, > > > > NULL > > > > } > > > > }; > > > > -- > > > > 2.4.8 > > > > > > > > > > > -- > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg
Thanks Peter, I've made all changes as you suggested, but there is no property "ARM_CP_NO_RAW", there's also nothing similar to it defined in cpu.h, here's all the options: #define ARM_CP_SPECIAL 1 #define ARM_CP_CONST 2 #define ARM_CP_64BIT 4 #define ARM_CP_SUPPRESS_TB_END 8 #define ARM_CP_OVERRIDE 16 #define ARM_CP_NO_MIGRATE 32 #define ARM_CP_IO 64 Cheers, Davorin On 10/05/2015 01:27 PM, Peter Maydell wrote: On 5 October 2015 at 20:56, Davorin Mistawrote: Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable in ARMCPUState struct (os_lock_status). Linux reads from this register during its suspend/resume procedure. Signed-off-by: Davorin Mista Thanks for this patch. I'm afraid it still needs some changes; comments below. --- Changed in v2: -switched from using dummy registers to an actual register implementation -implemented write function for OSLAR_EL1 sysreg -added state variable to ARMCPUState struct Signed-off-by: Davorin Mista --- target-arm/cpu.h| 3 +++ target-arm/helper.c | 16 +++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 5ea11a6..5aab654 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -500,6 +500,9 @@ typedef struct CPUARMState { uint32_t cregs[16]; } iwmmxt; +/* OS Lock Status: accessed via OSLAR/OSLSR registers */ +uint64_t os_lock_status; Can you call this "oslsr_el1" and put it inside the cp15 substruct with the other sysreg fields, please? + /* For mixed endian mode. */ bool bswap_code; diff --git a/target-arm/helper.c b/target-arm/helper.c index 9d62c4c..a6fad7a 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const ARMCPRegInfo *ri, putchar(value); } +/* write to os_lock_status state variable */ +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value) +{ +/* only bit 1 can be modified, register is always 0b10x0 */ +raw_write(env, ri, 8 + (value & 2)); This is the write function for OSLAR, not OSLSR, so you should call it oslar_write. Your logic isn't implementing the behaviour the ARM ARM requires, which is: * for AArch64 accesses, copy bit 0 of the written value into bit 1 of oslsr_el1 * for AArch32 accesses, if the written value is 0xC5ACCE55 then write 1 into bit 1 of oslsr_el1, else write 0 That looks something like: int oslock; if (ri->state == ARM_CP_STATE_AA32) { oslock = (value == 0xC5ACCE55); } else { oslock = value & 1; } env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock); +} + static const ARMCPRegInfo debug_cp_reginfo[] = { /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1; @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, - .access = PL1_W, .type = ARM_CP_NOP }, + .access = PL1_W, .resetvalue = 10, Write only registers don't need a reset value. You also need .type = ARM_CP_NO_RAW, because a raw access to this register doesn't make sense. + .fieldoffset = offsetof(CPUARMState, os_lock_status), + .writefn = oslsr_write }, +/* We define a dummy OSLSR_EL1, because Linux reads from it. */ +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, + .access = PL1_R, + .fieldoffset = offsetof(CPUARMState, os_lock_status) }, This is the reginfo that should have the reset value. /* Dummy OSDLR_EL1: 32-bit Linux will read this */ { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, -- 2.6.0 thanks -- PMM
Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async
On 09/21/2015 08:25 AM, Peter Lieven wrote: > PIO read requests on the ATAPI interface used to be sync blk requests. > This has to siginificant drawbacks. First the main loop hangs util an > I/O request is completed and secondly if the I/O request does not > complete (e.g. due to an unresponsive storage) Qemu hangs completely. > > Signed-off-by: Peter Lieven> --- > hw/ide/atapi.c | 69 > -- > 1 file changed, 43 insertions(+), 26 deletions(-) > > diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c > index 747f466..9257e1c 100644 > --- a/hw/ide/atapi.c > +++ b/hw/ide/atapi.c > @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba) > memset(buf, 0, 288); > } > > -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int > sector_size) > +static void cd_read_sector_cb(void *opaque, int ret) > { > -int ret; > +IDEState *s = opaque; > > -switch(sector_size) { > -case 2048: > -block_acct_start(blk_get_stats(s->blk), >acct, > - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ); > -ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4); > -block_acct_done(blk_get_stats(s->blk), >acct); > -break; > -case 2352: > -block_acct_start(blk_get_stats(s->blk), >acct, > - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ); > -ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4); > -block_acct_done(blk_get_stats(s->blk), >acct); > -if (ret < 0) > -return ret; > -cd_data_to_raw(buf, lba); > -break; > -default: > -ret = -EIO; > -break; > +block_acct_done(blk_get_stats(s->blk), >acct); > + > +if (ret < 0) { > +ide_atapi_io_error(s, ret); > +return; > +} > + > +if (s->cd_sector_size == 2352) { > +cd_data_to_raw(s->io_buffer, s->lba); > } > -return ret; > + > +s->lba++; > +s->io_buffer_index = 0; > +s->status &= ~BUSY_STAT; > + > +ide_atapi_cmd_reply_end(s); > +} > + > +static int cd_read_sector(IDEState *s, int lba, void *buf, int sector_size) > +{ > +if (sector_size != 2048 && sector_size != 2352) { > +return -EINVAL; > +} > + > +s->iov.iov_base = buf; > +if (sector_size == 2352) { > +buf += 4; > +} > + > +s->iov.iov_len = 4 * BDRV_SECTOR_SIZE; > +qemu_iovec_init_external(>qiov, >iov, 1); > + > +if (blk_aio_readv(s->blk, (int64_t)lba << 2, >qiov, 4, > + cd_read_sector_cb, s) == NULL) { > +return -EIO; > +} > + > +block_acct_start(blk_get_stats(s->blk), >acct, > + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ); > +s->status |= BUSY_STAT; > +return 0; > } > We discussed this off-list a bit, but for upstream synchronization: Unfortunately, I believe making cd_read_sector here non-blocking makes ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to s->end_transfer_func() nonblocking, which functions like ide_data_readw are not prepared to cope with. My suggestion is to buffer an entire DRQ block of data at once (byte_count_limit) to avoid the problem. Thanks, --js > void ide_atapi_cmd_ok(IDEState *s) > @@ -190,10 +210,8 @@ void ide_atapi_cmd_reply_end(IDEState *s) > ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size); > if (ret < 0) { > ide_atapi_io_error(s, ret); > -return; > } > -s->lba++; > -s->io_buffer_index = 0; > +return; > } > if (s->elementary_transfer_size > 0) { > /* there are some data left to transmit in this elementary > @@ -275,7 +293,6 @@ static void ide_atapi_cmd_read_pio(IDEState *s, int lba, > int nb_sectors, > s->io_buffer_index = sector_size; > s->cd_sector_size = sector_size; > > -s->status = READY_STAT | SEEK_STAT; > ide_atapi_cmd_reply_end(s); > } > >
Re: [Qemu-devel] QEMU+Linux ARMv7A current state
On 10/05/2015 04:44 PM, Beniamino Galvani wrote: > On Mon, Oct 05, 2015 at 11:13:33AM -0400, John Snow wrote: >> I'm looking into the cubieboard now. Is our emulation based on any >> particular model? (1-4?) > > The first model, the one with Allwinner A10. > >> I'm trying to see if I can find anything that resembles a spec to see >> what kind of registers this SoC has for its SATA controller. > > There is some documentation on the SoC here, but apparently nothing on > SATA: > > http://dl.linux-sunxi.org/A10/ > > Beniamino > http://linux-sunxi.org/SATA looks relevant... http://dl.cubieforums.com/files/pdf/A10_development_board_user_manual--2011.9.23_English.pdf mentions the feature list: The board provides one SATA interface, which features: - Support SATA 1.5Gb/s, and SATA 3.0Gb/s - Compliant to SATA Spec. 2.6, and AHCI Revision 1.3 Specifications - Support industry-standard AMBA High-Performance Bus (AHB) and fully compliant to the AMBA Specification, Revision 2.0; Support 32-bit Little Endian - OOB signaling detection and generation - SATA 1.5Gb/s and SATA 3.0Gb/s speed negotiation when Tx OOB signaling is selected - Support device hot-plugging - Support power management features including automatic Partial to Slumber transition - Internal DMA Engine for Command and Data Transactions - Support hardware-assisted Native Command Queuing (NCQ) for up to 32-entries - Support external SATA (eSATA) I can't find anything else, though. I'll just have to wait from Peter to see what registers he needed to modify to get it working. Thanks, --js
Re: [Qemu-devel] [kvm-unit-tests PATCHv2] arm: Add PMU test
On 10/02/2015 10:48 AM, Christopher Covington wrote: > Add test the ARM Performance Monitors Unit (PMU). The informational > fields from the control register are printed, but not checked, and > the number of cycles it takes to run a known-instruction-count loop > is printed, but not checked. Once QEMU is fixed, we can at least > begin to check for IPC == 1 when using -icount. Thanks for submitting it. I think this is a good starting point to add PMU unit testing support for ARM. Some comments below. > > Signed-off-by: Christopher Covington> --- > arm/pmu.c | 89 > + > arm/unittests.cfg | 11 ++ > config/config-arm64.mak | 4 ++- > 3 files changed, 103 insertions(+), 1 deletion(-) > create mode 100644 arm/pmu.c > > diff --git a/arm/pmu.c b/arm/pmu.c > new file mode 100644 > index 000..f724c2c > --- /dev/null > +++ b/arm/pmu.c > @@ -0,0 +1,89 @@ > +/* > + * Test the ARM Performance Monitors Unit (PMU). > + * > + * Copyright 2015 The Linux Foundation. All rights reserved. > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU Lesser General Public License version 2.1 and > + * only version 2.1 as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public > License > + * for more details. > + */ > +#include "libcflat.h" > + > +struct pmu_data { > + union { > + uint32_t pmcr_el0; > + struct { > + unsigned int enable:1; > + unsigned int event_counter_reset:1; > + unsigned int cycle_counter_reset:1; > + unsigned int cycle_counter_clock_divider:1; > + unsigned int event_counter_export:1; > + unsigned int cycle_counter_disable_when_prohibited:1; > + unsigned int cycle_counter_long:1; > + unsigned int zeros:4; > + unsigned int num_counters:5; > + unsigned int identification_code:8; > + unsigned int implementor:8; Not saying it is a problem because "unsigned int" is 32-bit on 64bit machine. But to make it consistent with pmcr_el0, I would prefer "unsigned int" been replaced by "uint32_t". > + }; > + }; > +}; > + > +/* Execute a known number of guest instructions. Only odd instruction counts > + * greater than or equal to 3 are supported. The control register (PMCR) is > + * initialized with the provided value (allowing for example for the cycle > + * counter or eventer count to be reset if needed). After the known > instruction > + * count loop, zero is written to the PMCR to disable counting, allowing the > + * cycle counter or event counters to be read as needed at a later time. > + */ > +static void measure_instrs(int len, struct pmu_data pmcr) > +{ > + int i = (len - 1) / 2; > + > + if (len < 3 || ((len - 1) % 2)) > + abort(); > + > + asm volatile( > + "msr pmcr_el0, %[pmcr]\n" > + "1: subs %[i], %[i], #1\n" > + "b.gt 1b\n" > + "msr pmcr_el0, xzr" > + : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc"); > +} > + > +int main() > +{ > + struct pmu_data pmcr; > + const int samples = 10; > + > + asm volatile("mrs %0, pmcr_el0" : "=r" (pmcr)); > + > + printf("PMU implementor: %c\n", pmcr.implementor); > + printf("Identification code: 0x%x\n", pmcr.identification_code); > + printf("Event counters: %d\n", pmcr.num_counters); > + > + pmcr.cycle_counter_reset = 1; > + pmcr.enable = 1; > + > + printf("\ninstructions : cycles0 cycles1 ...\n"); > + > + for (int i = 3; i < 300; i += 32) { > + int avg, sum = 0; > + printf("%d :", i); > + for (int j = 0; j < samples; j++) { > + int val; > + measure_instrs(i, pmcr); > + asm volatile("mrs %0, pmccntr_el0" : "=r" (val)); > + sum += val; > + printf(" %d", val); > + } > + avg = sum / samples; > + printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n", sum, avg, i / > avg, avg / i); > + } I understand that, as stated in commit message, it currently doesn't check the correctness of PMU counter values. But it would be better if testing code can be abstracted into an independent function (e.g. instr_cycle_check) for report("Instruction Cycles", instr_cycle_check()). You can return TRUE in the checking code for now. > + > + return report_summary(); > +} > diff --git a/arm/unittests.cfg b/arm/unittests.cfg > index
Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg
On Mon, Oct 5, 2015 at 2:25 PM, Davorin Mistawrote: > Thanks Peter, I've made all changes as you suggested, but there is no > property "ARM_CP_NO_RAW", there's also nothing similar to it defined in > cpu.h, here's all the options: > > #define ARM_CP_SPECIAL 1 > #define ARM_CP_CONST 2 > #define ARM_CP_64BIT 4 > #define ARM_CP_SUPPRESS_TB_END 8 > #define ARM_CP_OVERRIDE 16 > #define ARM_CP_NO_MIGRATE 32 > #define ARM_CP_IO 64 Hello Davorin, There is an ARM_CP_NO_RAW option (https://github.com/qemu/qemu/blob/master/target-arm/cpu.h#L1154). It looks like you are applying this on the Xilinx tree (or another out of date tree) instead of mainline. Thanks, Alistair > > Cheers, > Davorin > > > On 10/05/2015 01:27 PM, Peter Maydell wrote: >> >> On 5 October 2015 at 20:56, Davorin Mista >> wrote: >>> >>> Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable >>> in ARMCPUState struct (os_lock_status). >>> >>> Linux reads from this register during its suspend/resume procedure. >>> >>> Signed-off-by: Davorin Mista >> >> >> Thanks for this patch. I'm afraid it still needs some changes; >> comments below. >> >>> --- >>> Changed in v2: >>> -switched from using dummy registers to an actual register implementation >>> -implemented write function for OSLAR_EL1 sysreg >>> -added state variable to ARMCPUState struct >>> >>> Signed-off-by: Davorin Mista >>> --- >>> target-arm/cpu.h| 3 +++ >>> target-arm/helper.c | 16 +++- >>> 2 files changed, 18 insertions(+), 1 deletion(-) >>> >>> diff --git a/target-arm/cpu.h b/target-arm/cpu.h >>> index 5ea11a6..5aab654 100644 >>> --- a/target-arm/cpu.h >>> +++ b/target-arm/cpu.h >>> @@ -500,6 +500,9 @@ typedef struct CPUARMState { >>> uint32_t cregs[16]; >>> } iwmmxt; >>> >>> +/* OS Lock Status: accessed via OSLAR/OSLSR registers */ >>> +uint64_t os_lock_status; >> >> >> Can you call this "oslsr_el1" and put it inside the cp15 substruct >> with the other sysreg fields, please? >> >>> + >>> /* For mixed endian mode. */ >>> bool bswap_code; >>> >>> diff --git a/target-arm/helper.c b/target-arm/helper.c >>> index 9d62c4c..a6fad7a 100644 >>> --- a/target-arm/helper.c >>> +++ b/target-arm/helper.c >>> @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const >>> ARMCPRegInfo *ri, >>> putchar(value); >>> } >>> >>> +/* write to os_lock_status state variable */ >>> +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, >>> uint64_t value) >>> +{ >>> +/* only bit 1 can be modified, register is always 0b10x0 */ >>> +raw_write(env, ri, 8 + (value & 2)); >> >> >> This is the write function for OSLAR, not OSLSR, so you should >> call it oslar_write. >> >> Your logic isn't implementing the behaviour the ARM ARM requires, >> which is: >> * for AArch64 accesses, copy bit 0 of the written value into >> bit 1 of oslsr_el1 >> * for AArch32 accesses, if the written value is 0xC5ACCE55 >> then write 1 into bit 1 of oslsr_el1, else write 0 >> >> That looks something like: >> >> int oslock; >> >> if (ri->state == ARM_CP_STATE_AA32) { >> oslock = (value == 0xC5ACCE55); >> } else { >> oslock = value & 1; >> } >> >> env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock); >> >> >>> +} >>> + >>> static const ARMCPRegInfo debug_cp_reginfo[] = { >>> /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory >>> mapped >>>* debug components. The AArch64 version of DBGDRAR is named >>> MDRAR_EL1; >>> @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { >>> /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ >>> { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, >>> .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, >>> - .access = PL1_W, .type = ARM_CP_NOP }, >>> + .access = PL1_W, .resetvalue = 10, >> >> >> Write only registers don't need a reset value. You also >> need .type = ARM_CP_NO_RAW, because a raw access to this register >> doesn't make sense. >> >>> + .fieldoffset = offsetof(CPUARMState, os_lock_status), >>> + .writefn = oslsr_write }, >>> +/* We define a dummy OSLSR_EL1, because Linux reads from it. */ >>> +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, >>> + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, >>> + .access = PL1_R, >>> + .fieldoffset = offsetof(CPUARMState, os_lock_status) }, >> >> >> This is the reginfo that should have the reset value. >> >>> /* Dummy OSDLR_EL1: 32-bit Linux will read this */ >>> { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, >>> .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, >>> -- >>> 2.6.0 >>> >> >> thanks >> -- PMM >> >
Re: [Qemu-devel] [PATCH v5 0/2] Improve -help
On 05/10/2015 23:08, Michael S. Tsirkin wrote: > On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote: >> no more comments, it should mean it's ok ? >> >> Can someone merge this or just say "stop this, I don't want that"... >> >> Laurent > > I'm afraid as a first step, we need to bring some order into -help. I agree with your comments, but the aim of this series is not to change the content, only the container. It will be easy to add sections once this done. > I also suspect that a good place to start doing that is the man page. Man page and help are generated from qemu-options.hx, so improving one is also improving the other, but I think I'm not the good guy to rewrite a man page... I'm a mechanic, not a writer :) Thanks, Laurent
Re: [Qemu-devel] [PATCH v5 0/2] Improve -help
On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote: > no more comments, it should mean it's ok ? > > Can someone merge this or just say "stop this, I don't want that"... > > Laurent I'm afraid as a first step, we need to bring some order into -help. I also suspect that a good place to start doing that is the man page. -- MST
[Qemu-devel] [PATCHv3] target-arm: Implement AArch64 OSLAR/OSLSR_EL1 sysregs
Added oslar_write function to OSLAR_EL1 sysreg, using a status variable in ARMCPUState.cp15 struct (oslsr_el1). This variable is also linked to the newly added read-only OSLSR_EL1 register. Linux reads from this register during its suspend/resume procedure. Signed-off-by: Davorin Mista--- Changed in v2: -switched from using dummy registers to an actual register implementation -implemented write function for OSLAR_EL1 sysreg -added state variable to ARMCPUState struct Changed in v3: -renamed variable to oslsr_el1 and moved to cp15 -renamed write frunction to oslar_write -support both 32bit and 64bit ARM in oslar_write -moved resetvalue to the corresponding read-only register -removed "dummy" comments above registers --- target-arm/cpu.h| 1 + target-arm/helper.c | 23 +-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 5ea11a6..f9f9bfb 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -399,6 +399,7 @@ typedef struct CPUARMState { uint64_t dbgwvr[16]; /* watchpoint value registers */ uint64_t dbgwcr[16]; /* watchpoint control registers */ uint64_t mdscr_el1; +uint64_t oslsr_el1; /* OS Lock Status */ /* If the counter is enabled, this stores the last time the counter * was reset. Otherwise it stores the counter value */ diff --git a/target-arm/helper.c b/target-arm/helper.c index 9d62c4c..ecc89c7 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -3147,6 +3147,20 @@ static void dcc_write(CPUARMState *env, const ARMCPRegInfo *ri, putchar(value); } +/* write to os_lock_status state variable */ +static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value) +{ +int oslock; + +if (ri->state == ARM_CP_STATE_AA32) { +oslock = (value == 0xC5ACCE55); +} else { +oslock = value & 1; +} + +env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock); +} + static const ARMCPRegInfo debug_cp_reginfo[] = { /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1; @@ -3176,10 +3190,15 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { .access = PL1_R, .fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1), .resetfn = arm_cp_reset_ignore }, -/* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, - .access = PL1_W, .type = ARM_CP_NOP }, + .access = PL1_W, + .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1), + .writefn = oslar_write }, +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, + .access = PL1_R, .resetvalue = 10, + .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) }, /* Dummy OSDLR_EL1: 32-bit Linux will read this */ { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, -- 2.6.0
Re: [Qemu-devel] [PATCH] target-arm: Avoid calling arm_el_is_aa64() function for unimplemented EL
On 2 October 2015 at 14:21, Sergey Sorokinwrote: > It is incorrect to call arm_el_is_aa64() function for unimplemented EL. > This patch fixes several attempts to do so. > > Signed-off-by: Sergey Sorokin > --- > target-arm/cpu.h| 8 +--- > target-arm/helper.c | 15 +-- > 2 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/target-arm/cpu.h b/target-arm/cpu.h > index cc1578c..df456a5 100644 > --- a/target-arm/cpu.h > +++ b/target-arm/cpu.h > @@ -1015,11 +1015,11 @@ static inline bool access_secure_reg(CPUARMState *env) > */ > #define A32_BANKED_CURRENT_REG_GET(_env, _regname)\ > A32_BANKED_REG_GET((_env), _regname,\ > - ((!arm_el_is_aa64((_env), 3) && arm_is_secure(_env > + (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3))) > > #define A32_BANKED_CURRENT_REG_SET(_env, _regname, _val) > \ > A32_BANKED_REG_SET((_env), _regname,\ > - ((!arm_el_is_aa64((_env), 3) && > arm_is_secure(_env))), \ > + (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3)), \ > (_val)) > > void arm_cpu_list(FILE *f, fprintf_function cpu_fprintf); > @@ -1586,7 +1586,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, > unsigned int excp_idx, > * interrupt. > */ > if ((target_el > cur_el) && (target_el != 1)) { > -if (arm_el_is_aa64(env, 3) || ((scr || hcr) && (!secure))) { > +/* ARM_FEATURE_AARCH64 enabled means the higher EL is AArch64. */ > +if (arm_feature(env, ARM_FEATURE_AARCH64) || > +((scr || hcr) && (!secure))) { > unmasked = 1; > } > } I know we discussed this one before, but having looked more carefully at it, I think the reason it's weird is that the original code is only correct if we're not implementing EL2. For instance (table D1-14 in the v8 ARMARM rev A.g) if we're in an all-AArch64 environment executing in Secure EL0 then the interrupt mask is supposed to have an effect if the interrupt is targeting EL2. But the current code will always set 'unmasked' to 1 if EL3 is 64 bit. So I think what the code ought to read is: if ((target_el > cur_el) && (target_el != 1)) { /* Exceptions targeting a higher EL may not be maskable */ if (arm_feature(env, ARM_FEATURE_AARCH64)) { /* 64-bit masking rules are simple: exceptions to EL3 * can't be masked, and exceptions to EL2 can only be * masked from Secure state. */ if (target_el == 3 || !secure) { unmasked = 1; } } else { /* The old 32-bit-only environment has a more complicated * masking setup. */ if ((scr || hcr) && !secure) { unmasked = 1; } } } Except that then for the AArch64 case we've just calculated scr and hcr and then not needed them. I think most of the code calculating them ought to move into the else clause here. I'll write a patch for this and post it tomorrow. In the meantime, we can just make the comment say /* ARM_FEATURE_AARCH64 enabled means the highest EL is AArch64. * This code currently assumes that EL2 is not implemented * (and so that highest EL will be 3 and the target_el also 3). */ > diff --git a/target-arm/helper.c b/target-arm/helper.c > index 8367997..1f11dbd 100644 > --- a/target-arm/helper.c > +++ b/target-arm/helper.c > @@ -5220,11 +5220,22 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, > uint32_t excp_idx, > uint32_t cur_el, bool secure) > { > CPUARMState *env = cs->env_ptr; > -int rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW); > +int rw; > int scr; > int hcr; > int target_el; > -int is64 = arm_el_is_aa64(env, 3); > +/* Is the higher EL AArch64? */ "highest". I pointed this one out before... > +int is64 = arm_feature(env, ARM_FEATURE_AARCH64); > + > +/* If the highest EL is in AArch64 state, and EL3 is not implemented, > + * we must behave as if EL3 is implemented and is in AArch64 state. > + * Therefore we need appropriate RW bit. > + */ I think we could put this comment into the else branch, and rephrase it a bit: +if (arm_feature(env, ARM_FEATURE_EL3)) { +rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW); +} else { +/* Either EL2 is the highest EL (and so the EL2 register width + * is given by is64); or there is no EL2 or EL3, in which case + * the value of 'rw' does not affect the table lookup anyway. + */ +rw = is64; +} > +if (arm_feature(env, ARM_FEATURE_EL3)) { > +rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW); > +} else { > +rw = is64; > +} What I can do with
Re: [Qemu-devel] [PATCHv3] target-arm: Implement AArch64 OSLAR/OSLSR_EL1 sysregs
On 5 October 2015 at 22:36, Davorin Mistawrote: > Added oslar_write function to OSLAR_EL1 sysreg, using a status variable > in ARMCPUState.cp15 struct (oslsr_el1). This variable is also linked > to the newly added read-only OSLSR_EL1 register. > > Linux reads from this register during its suspend/resume procedure. > > Signed-off-by: Davorin Mista > > --- > Changed in v2: > -switched from using dummy registers to an actual register implementation > -implemented write function for OSLAR_EL1 sysreg > -added state variable to ARMCPUState struct > > Changed in v3: > -renamed variable to oslsr_el1 and moved to cp15 > -renamed write frunction to oslar_write > -support both 32bit and 64bit ARM in oslar_write > -moved resetvalue to the corresponding read-only register > -removed "dummy" comments above registers > --- > target-arm/cpu.h| 1 + > target-arm/helper.c | 23 +-- > 2 files changed, 22 insertions(+), 2 deletions(-) > > diff --git a/target-arm/cpu.h b/target-arm/cpu.h > index 5ea11a6..f9f9bfb 100644 > --- a/target-arm/cpu.h > +++ b/target-arm/cpu.h > @@ -399,6 +399,7 @@ typedef struct CPUARMState { > uint64_t dbgwvr[16]; /* watchpoint value registers */ > uint64_t dbgwcr[16]; /* watchpoint control registers */ > uint64_t mdscr_el1; > +uint64_t oslsr_el1; /* OS Lock Status */ > /* If the counter is enabled, this stores the last time the counter > * was reset. Otherwise it stores the counter value > */ > diff --git a/target-arm/helper.c b/target-arm/helper.c > index 9d62c4c..ecc89c7 100644 > --- a/target-arm/helper.c > +++ b/target-arm/helper.c > @@ -3147,6 +3147,20 @@ static void dcc_write(CPUARMState *env, const > ARMCPRegInfo *ri, > putchar(value); > } > > +/* write to os_lock_status state variable */ > +static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t > value) > +{ > +int oslock; > + > +if (ri->state == ARM_CP_STATE_AA32) { > +oslock = (value == 0xC5ACCE55); > +} else { > +oslock = value & 1; > +} > + > +env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock); > +} > + > static const ARMCPRegInfo debug_cp_reginfo[] = { > /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped > * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1; > @@ -3176,10 +3190,15 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { >.access = PL1_R, >.fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1), >.resetfn = arm_cp_reset_ignore }, > -/* We define a dummy WI OSLAR_EL1, because Linux writes to it. */ > { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH, >.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4, > - .access = PL1_W, .type = ARM_CP_NOP }, > + .access = PL1_W, > + .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1), OSLAR doesn't want a fieldoffset. Still missing .type = ARM_CP_NO_RAW. (As Alistair says, this does exist in mainline -- please make sure you write and test your patches against current git master, not any other tree.) > + .writefn = oslar_write }, > +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH, > + .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4, > + .access = PL1_R, .resetvalue = 10, > + .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) }, This part is OK. > /* Dummy OSDLR_EL1: 32-bit Linux will read this */ > { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH, >.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4, > -- > 2.6.0 thanks -- PMM
[Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads
Return a static signature ("QEMU CFG") if the guest does a read to the DMA address io register. Signed-off-by: Kevin O'Connor--- Marc, if you decide to respin your fw_cfg series, I've updated the dma signature patch. This addresses the comments from Stefan, and I hope it addresses the comments from Laszlo. BTW, if you wanted to, it's possible to use deposit64 in fw_cfg_dma_mem_write() to support all possible (validly aligned) write sizes. Then fw_cfg_dma_mem_valid() shouldn't be needed. Something like: static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, uint64_t value, unsigned size) { FWCfgState *s = opaque; s->dma_addr = deposit64(s->dma_addr, (8 - addr - size)*8, size*8, value); if (addr + size >= 8) { fw_cfg_dma_transfer(s); } } --- docs/specs/fw_cfg.txt | 3 +++ hw/nvram/fw_cfg.c | 14 -- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 2d6b2da..cbdce7d 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". +If the DMA interface is available, then reading the DMA Address +Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format). + === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === A 32-bit little-endian unsigned int, this item is used to check for enabled diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 59933b3..cf5c5c4 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -53,6 +53,8 @@ #define FW_CFG_DMA_CTL_SKIP0x04 #define FW_CFG_DMA_CTL_SELECT 0x08 +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s) trace_fw_cfg_read(s, 0); } +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr, +unsigned size) +{ +// Return a signature value (and handle various read sizes) +return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size*8); +} + static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, uint64_t value, unsigned size) { @@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr, unsigned size, bool is_write) { -return is_write && ((size == 4 && (addr == 0 || addr == 4)) || -(size == 8 && addr == 0)); +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) || + (size == 8 && addr == 0)); } static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr, @@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = { }; static const MemoryRegionOps fw_cfg_dma_mem_ops = { +.read = fw_cfg_dma_mem_read, .write = fw_cfg_dma_mem_write, .endianness = DEVICE_BIG_ENDIAN, .valid.accepts = fw_cfg_dma_mem_valid, -- 2.4.3
Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
On 09/15/2015 12:24 PM, Eric Blake wrote: > On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote: >> Persistent dirty bitmaps will be saved into qcow2 files. It may be used >> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for >> other drives (there may be qcow2 file with zero disk size but with >> several dirty bitmaps for other drives). >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy>> --- >> docs/specs/qcow2.txt | 127 >> ++- >> 1 file changed, 126 insertions(+), 1 deletion(-) >> >> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt >> index 121dfc8..5fc0365 100644 >> --- a/docs/specs/qcow2.txt >> +++ b/docs/specs/qcow2.txt >> @@ -103,7 +103,13 @@ in the description of a field. >> write to an image with unknown auto-clear features if it >> clears the respective bits from this field first. >> >> -Bits 0-63: Reserved (set to 0) >> +Bit 0: Dirty bitmaps bit. If this bit is set then >> +there is a _consistent_ Dirty bitmaps >> extension >> +in the image. If it is not set, but there >> is a >> +Dirty bitmaps extension, its data should be >> +considered as inconsistent. > > Thanks for documenting this. I don't know that we use underscore for > _emphasis_ anywhere else in the file, but I don't have any better > suggestions. Should you also require that it is an error if this bit is > set but no Dirty bitmap extension header is present? > An error, but one that can be safely corrected by any fsck-style utility: clear the bit. >> + >> +Bits 1-63: Reserved (set to 0) >> >> 96 - 99: refcount_order >> Describes the width of a reference count block entry >> (width >> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the >> following: >> 0x - End of the header extension area >> 0xE2792ACA - Backing file format name >> 0x6803f857 - Feature name table >> +0x23852875 - Dirty bitmaps >> other - Unknown header extension, can be safely >> ignored >> >> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this: >> terminated if it has full length) >> >> >> +== Dirty bitmaps == >> + >> +Dirty bitmaps is an optional header extension. It provides an ability to >> store >> +dirty bitmaps in a qcow2 image. The fields are: > > Might not hurt to remind the reader about the auto-clear feature bit > mentioned earlier controlling whether this extension can be trusted as > consistent. > >> + >> + 0 - 3: nb_dirty_bitmaps >> + The number of dirty bitmaps contained in the image. Valid >> + values: 0 - 65535. >> + >> + 4 - 7: dirty_bitmap_directory_size >> + Size of the Dirty Bitmap Directory in bytes. Valid >> values: >> + 0 - 67108864 (= 1024 * nb_dirty_bitmaps). > > Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need > a redundant field? If not, then this wording needs help; from the rest > of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps". > Also, while Dirty Bitmap Directory entries are variable length (and > thus a variable maximum), they do have a minimum size (so the minimum > value for dirty_bitmap_directory_size must be larger than 0 unless > nb_dirty_bitmaps is 0, in which case why would we have this header > extension) > Agree. >> + >> + 8 - 15: dirty_bitmap_directory_offset >> + Offset into the image file at which the Dirty Bitmap >> + Directory starts. Must be aligned to a cluster boundary. >> + >> + >> == Host cluster management == >> >> qcow2 manages the allocation of host clusters by maintaining a reference >> count >> @@ -360,3 +385,103 @@ Snapshot table entry: >> >> variable: Padding to round up the snapshot table entry size to the >> next multiple of 8. >> + >> + >> +== Dirty bitmaps == >> + >> +The feature supports storing dirty bitmaps in a qcow2 image. >> + >> +=== Cluster mapping === >> + >> +Dirty bitmaps are stored using a ONE-level structure for the mapping of >> +bitmaps to host clusters. It is called Dirty Bitmap Table. > > s/ONE/one/ (I didn't see the reason for the emphasis) > Emphasis is likely because that's not how the cluster allocation mechanism works in qcow2 otherwise. We're essentially storing data straight into what would otherwise be the L1 table. It's worth clarifying, in my opinion. >> + >> +The Dirty Bitmap Table has a variable size (stored in
[Qemu-devel] QEMU Technical Talk: NVDIMM and persistent memory in QEMU
Marc Mari has volunteered to give the following online technical talk on Monday, 12 October at 14:00 UTC: "Marc Mari will present the new NVDIMM persistent memory device class and how they integrate into QEMU and SeaBIOS. The main concepts of the hardware specification are covered, as well as how NVDIMMs can be used by virtual machines. This talk is aimed at QEMU and SeaBIOS developers." Marc has been experimenting with Guangrong Xiao's NVDIMM patches and is working on SeaBIOS boot-from-NVDIMM support. To join the event: https://plus.google.com/events/cfssoojfogaafulssb1qeijn07k This is the first QEMU technical talk and we will be using Google+'s Hangouts On Air feature for a live presentation. Video will also be archived on YouTube for viewing at a later date. If you would like to speak on a technical topic, please contact me! I hope to host talks showcasing features of interest to QEMU users as well as technical topics for QEMU developers. Stefan
Re: [Qemu-devel] [RFC v0 0/2] Enforce gaps between DIMMs
On Mon, Oct 05, 2015 at 02:05:22PM +0530, Bharata B Rao wrote: > The suggested way to work around the virtio bug reported here > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html > > is to introduce gaps between DIMMs. Igor's patchset changes the pc-dimm > auto-address assignment to introduce gaps and ues the same from pc memhp. > This patchset does the same for sPAPR PowerPC. > > Before introducing the gap, ensure that memory hotplug region has enough > room for alignment adjustment. We accommodate a max alignment of 256MB for > each slot since sPAPR memory hotplug enforces an alignment requirement of > 256MB on RAM size, maxmem and NUMA node mem sizes. > > This applies on David's spapr-next branch + Igor's patchset applied. > > This has been very lightly tested and intention is to get feedback > on the correctness aspect of this. > > Bharata B Rao (2): > spapr: Accommadate alignment gaps in hotplug memory region > spapr: Force gaps between DIMM's GPA PC needs this, PPC needs this ... I don't see why would this not apply everywhere. Isn't it time we just converted everyone? Drop the gap flag, set it unconditionally for new machine types. Thoughts? > hw/ppc/spapr.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > -- > 2.1.0 >
[Qemu-devel] [PULL 06/10] vfio: Check guest IOVA ranges against host IOMMU capabilities
From: David GibsonThe current vfio core code assumes that the host IOMMU is capable of mapping any IOVA the guest wants to use to where we need. However, real IOMMUs generally only support translating a certain range of IOVAs (the "DMA window") not a full 64-bit address space. The common x86 IOMMUs support a wide enough range that guests are very unlikely to go beyond it in practice, however the IOMMU used on IBM Power machines - in the default configuration - supports only a much more limited IOVA range, usually 0..2GiB. If the guest attempts to set up an IOVA range that the host IOMMU can't map, qemu won't report an error until it actually attempts to map a bad IOVA. If guest RAM is being mapped directly into the IOMMU (i.e. no guest visible IOMMU) then this will show up very quickly. If there is a guest visible IOMMU, however, the problem might not show up until much later when the guest actually attempt to DMA with an IOVA the host can't handle. This patch adds a test so that we will detect earlier if the guest is attempting to use IOVA ranges that the host IOMMU won't be able to deal with. For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is incorrect, but no worse than what we have already. We can't do better for now because the Type1 kernel interface doesn't tell us what IOVA range the IOMMU actually supports. For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported IOVA range and validate guest IOVA ranges against it, and this patch does so. Signed-off-by: David Gibson Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/common.c | 40 +--- include/hw/vfio/vfio-common.h |6 ++ 2 files changed, 43 insertions(+), 3 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 95a4850..2faf492 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -343,14 +343,22 @@ static void vfio_listener_region_add(MemoryListener *listener, if (int128_ge(int128_make64(iova), llend)) { return; } +end = int128_get64(llend); + +if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) { +error_report("vfio: IOMMU container %p can't map guest IOVA region" + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, + container, iova, end - 1); +ret = -EFAULT; +goto fail; +} memory_region_ref(section->mr); if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; -trace_vfio_listener_region_add_iommu(iova, -int128_get64(int128_sub(llend, int128_one(; +trace_vfio_listener_region_add_iommu(iova, end - 1); /* * FIXME: We should do some checking to see if the * capabilities of the host VFIO IOMMU are adequate to model @@ -387,7 +395,6 @@ static void vfio_listener_region_add(MemoryListener *listener, /* Here we assume that memory_region_is_ram(section->mr)==true */ -end = int128_get64(llend); vaddr = memory_region_get_ram_ptr(section->mr) + section->offset_within_region + (iova - section->offset_within_address_space); @@ -685,7 +692,19 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } + +/* + * FIXME: This assumes that a Type1 IOMMU can map any 64-bit + * IOVA whatsoever. That's not actually true, but the current + * kernel interface doesn't tell us what it can map, and the + * existing Type1 IOMMUs generally support any IOVA we're + * going to actually try in practice. + */ +container->min_iova = 0; +container->max_iova = (hwaddr)-1; } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { +struct vfio_iommu_spapr_tce_info info; + ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, ); if (ret) { error_report("vfio: failed to set group container: %m"); @@ -710,6 +729,21 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } + +/* + * This only considers the host IOMMU's 32-bit window. At + * some point we need to add support for the optional 64-bit + * window and dynamic windows + */ +info.argsz = sizeof(info); +ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, ); +if (ret) { +error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m"); +ret = -errno; +goto free_container_exit; +} +container->min_iova = info.dma32_window_start; +container->max_iova = container->min_iova + info.dma32_window_size - 1; } else {
[Qemu-devel] [PULL 10/10] vfio: Expose a VFIO PCI device's group for EEH
From: David GibsonThe Enhanced Error Handling (EEH) interface in PAPR operates on units of a Partitionable Endpoint (PE). For VFIO devices, the PE boundaries the guest sees must match the PE (i.e. IOMMU group) boundaries on the host. To implement this it will need to discover from VFIO which group a given device belongs to. This exposes a new vfio_pci_device_group() function for this purpose. Signed-off-by: David Gibson Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/pci.c | 14 ++ include/hw/vfio/vfio-pci.h | 11 +++ 2 files changed, 25 insertions(+) create mode 100644 include/hw/vfio/vfio-pci.h diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index dcabb6d..49ae834 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -35,6 +35,8 @@ #include "pci.h" #include "trace.h" +#include "hw/vfio/vfio-pci.h" + #define MSIX_CAP_LENGTH 12 static void vfio_disable_interrupts(VFIOPCIDevice *vdev); @@ -2312,6 +2314,18 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev) vdev->req_enabled = false; } +VFIOGroup *vfio_pci_device_group(PCIDevice *pdev) +{ +VFIOPCIDevice *vdev; + +if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { +return NULL; +} + +vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev); +return vdev->vbasedev.group; +} + static int vfio_initfn(PCIDevice *pdev) { VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev); diff --git a/include/hw/vfio/vfio-pci.h b/include/hw/vfio/vfio-pci.h new file mode 100644 index 000..32105f7 --- /dev/null +++ b/include/hw/vfio/vfio-pci.h @@ -0,0 +1,11 @@ +#ifndef VFIO_PCI_H +#define VFIO_PCI_H + +#include "qemu/typedefs.h" + +/* We expose the concept of a VFIOGroup, though not its internals */ +typedef struct VFIOGroup VFIOGroup; + +extern VFIOGroup *vfio_pci_device_group(PCIDevice *pdev); + +#endif /* VFIO_PCI_H */
Re: [Qemu-devel] [PATCH v7 08/18] qapi: Test use of 'number' within alternates
On 09/29/2015 04:21 PM, Eric Blake wrote: > Add some testsuite exposure for use of a 'number' as part of > an alternate. The current state of the tree has a few bugs > exposed by this: our input parser depends on the ordering of > how the qapi schema declared the alternate, and the parser > does not accept integers for a 'number' in an alternate even > though it does for numbers outside of an alternate. > > Mixing 'int' and 'number' in the same alternate is unusual, > since both are supplied by json-numbers, but there does not > seem to be a technical reason to forbid it given that our > json lexer distinguishes between json-numbers that can be > represented as an int vs. those that cannot. > > Improve the existing test_visitor_in_alternate() to match the > style of the new test_visitor_in_alternate_number(), and to > ensure full coverage of all possible qtype parsing. > > Signed-off-by: Eric Blake> > --- > +static void test_visitor_in_alternate_number(TestInputVisitorData *data, > + const void *unused) > +{ > +Visitor *v; > +Error *err = NULL; > +AltStrBool *asb; > +AltStrNum *asn; > +AltNumStr *ans; > +AltStrInt *asi; > +AltIntNum *ain; > +AltNumInt *ani; > + > +/* Parsing an int */ > + > +v = visitor_input_test_init(data, "42"); > +visit_type_AltStrBool(v, , NULL, ); > +g_assert(err); > +qapi_free_AltStrBool(asb); > +visitor_input_teardown(data, NULL); This fails to reset err = NULL... > + > +/* FIXME: Order of alternate should not affect semantics; asn should > + * parse the same as ans */ > +v = visitor_input_test_init(data, "42"); > +visit_type_AltStrNum(v, , NULL, ); > +/* FIXME g_assert_cmpint(asn->kind, == ALT_STR_NUM_KIND_N); */ > +/* FIXME g_assert_cmpfloat(asn->n, ==, 42); */ > +g_assert(err); > +error_free(err); > +err = NULL; ...which means that this test is not reliable. Do you need a v8, or can you squash this in? diff --git a/tests/test-qmp-input-visitor.c b/tests/test-qmp-input-visitor.c index 1b5a369..6104ac6 100644 --- a/tests/test-qmp-input-visitor.c +++ b/tests/test-qmp-input-visitor.c @@ -395,6 +395,8 @@ static void test_visitor_in_alternate_number(TestInputVisitorData *data, v = visitor_input_test_init(data, "42"); visit_type_AltStrBool(v, , NULL, ); g_assert(err); +error_free(err); +err = NULL; qapi_free_AltStrBool(asb); v = visitor_input_test_init(data, "42"); -- 2.4.3 -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size()
On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > block.c | 5 + > include/block/block.h | 1 + > 2 files changed, 6 insertions(+) > > diff --git a/block.c b/block.c > index 6d14f5b..8c39d0a 100644 > --- a/block.c > +++ b/block.c > @@ -3632,6 +3632,11 @@ const char *bdrv_dirty_bitmap_name(const > BdrvDirtyBitmap *bitmap) > return bitmap->name; > } > > +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap) > +{ > +return bitmap->size; > +} > + > uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap, > uint64_t count) > { > diff --git a/include/block/block.h b/include/block/block.h > index fb7d410..8166640 100644 > --- a/include/block/block.h > +++ b/include/block/block.h > @@ -510,6 +510,7 @@ void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t > offset); > int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap); > > const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap); > +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap); > uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap, > uint64_t count); > void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap, > Reviewed-by: John Snow
Re: [Qemu-devel] [PATCH v6 08/16] qapi: Test use of 'number' within alternates
On 09/29/2015 07:38 AM, Markus Armbruster wrote: > Eric Blakewrites: > >> Add some testsuite exposure for use of a 'number' as part of >> an alternate. The current state of the tree has a few bugs >> exposed by this: our input parser depends on the ordering of >> how the qapi schema declared the alternate, and the parser >> does not accept integers for a 'number' in an alternate even >> though it does for numbers outside of an alternate. >> >> Mixing 'int' and 'number' in the same alternate is unusual, >> since both are supplied by json-numbers, but there does not >> seem to be a technical reason to forbid it given that our >> json lexer distinguishes between json-numbers that can be >> represented as an int vs. those that cannot. >> >> Improve the existing test_visitor_in_alternate() to match the >> style of the new test_visitor_in_alternate_number(), and to >> ensure full coverage of all possible qtype parsing. >> >> Signed-off-by: Eric Blake >> >> +++ b/tests/test-qmp-input-visitor.c >> @@ -371,12 +371,133 @@ static void >> test_visitor_in_alternate(TestInputVisitorData *data, >> UserDefAlternate *tmp; >> >> v = visitor_input_test_init(data, "42"); >> - >> -visit_type_UserDefAlternate(v, , NULL, ); >> -g_assert(err == NULL); >> +visit_type_UserDefAlternate(v, , NULL, _abort); >> g_assert_cmpint(tmp->kind, ==, USER_DEF_ALTERNATE_KIND_I); >> g_assert_cmpint(tmp->i, ==, 42); >> qapi_free_UserDefAlternate(tmp); >> +visitor_input_teardown(data, NULL); > > Ugly in this test: visitor_input_test_init() is to be paired with > visitor_input_teardown(), but each test's last visitor_input_teardown() > can be omitted, because we also pass visitor_input_teardown to > g_test_add(). Not your patch's fault. Let's ignore it for now. In fact, I have another patch coming down the pipeline that moves visitor_input_teardown() into visitor_input_test_init() (teardown is idempotent. And while working on it, I noticed that we could consider doing something like: init(data, ...); func(blah, >err); g_assert(data->err); teardown(data); to let teardown() take care of err, instead of having to manually error_free(err); err = NULL; on each reset (particularly nice if visitor_input_test_init() invokes teardown() itself, for automatic reset semantics). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v5 0/2] Improve -help
On Tue, Oct 06, 2015 at 12:04:40AM +0200, Laurent Vivier wrote: > > > On 05/10/2015 23:08, Michael S. Tsirkin wrote: > > On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote: > >> no more comments, it should mean it's ok ? > >> > >> Can someone merge this or just say "stop this, I don't want that"... > >> > >> Laurent > > > > I'm afraid as a first step, we need to bring some order into -help. > > I agree with your comments, but the aim of this series is not to change > the content, only the container. That's my point, with content as it is, changing the container makes things worse not better IMHO. > It will be easy to add sections once this done. > > > I also suspect that a good place to start doing that is the man page. > > Man page and help are generated from qemu-options.hx, so improving one > is also > improving the other, but I think I'm not the good guy to rewrite a man > page... I'm a mechanic, not a writer :) > > Thanks, > Laurent
Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox
> Drawback: complexity. If we decide to limit ourselves to the original > threat model (rogue guest), and enter the sandbox only after setup, we > can keep things simpler. We could do both without much complexity. This looks simple enough to me: rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chroot), 1, SCMP_A0(SCMP_CMP_EQ, chroot_dir)); if (rc < 0) goto seccomp_return; rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chdir), 1, SCMP_A0(SCMP_CMP_EQ, "/")); if (rc < 0) goto seccomp_return; The only time chroot_dir is ever used is in os-posix.c:139: if (chroot(chroot_dir) < 0) {
Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default
On Mon, Oct 05, 2015 at 11:39:39PM +0100, Peter Maydell wrote: > On 2 October 2015 at 14:45, Michael S. Tsirkinwrote: > > Most people don't run make check by default, so they skip vhost-user > > unit tests. Solve this by using tmpfs instead, unless hugetlbfs is > > specified (using an environment variable). > > > > Signed-off-by: Michael S. Tsirkin > > Reviewed-by: Marc-André Lureau > > Unfortunately I didn't notice before applying the pull, but this > is breaking 'make check' on AArch64 host for me: > > TEST: tests/vhost-user-test... (pid=20205) > Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl > qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: > vhost-net support is not compiled in > qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: > failed to init vhost_net for queue 0 > > Broken pipe > FAIL: tests/vhost-user-test > > Probably reproducible on x86 if you configure with --disable-vhost-net, > though I haven't tried that. > > Perhaps tests/vhost-user-test should be set up > in tests/Makefile using > check-qtest-i386-$(CONFIG_VHOST_USER) rather > than CONFIG_LINUX ? > > I'd appreciate a quick fix, because this machine is in my set > of systems I test all pullreqs on now... > > thanks > -- PMM ok first of all we need this: if I apply it, I see the same bug as you do. need to fix that too. Signed-off-by: Michael S. Tsirkin diff --git a/net/net.c b/net/net.c index 28a5597..1e6a082 100644 --- a/net/net.c +++ b/net/net.c @@ -902,9 +902,7 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])( [NET_CLIENT_OPTIONS_KIND_BRIDGE]= net_init_bridge, #endif [NET_CLIENT_OPTIONS_KIND_HUBPORT] = net_init_hubport, -#ifdef CONFIG_VHOST_NET_USED [NET_CLIENT_OPTIONS_KIND_VHOST_USER] = net_init_vhost_user, -#endif #ifdef CONFIG_L2TPV3 [NET_CLIENT_OPTIONS_KIND_L2TPV3]= net_init_l2tpv3, #endif
[Qemu-devel] [PATCH resent] linux-user: in poll(), if nfds is 0, pfd can be NULL
This problem appears with yum in Fedora 20 / PPC64 container. test case: #include #include int main(void) { int ret; ret = poll(NULL, 0, 1000); printf("%d\n", ret); } target test environment: Fedora 20 / PPC64 host test environment: Ubuntu 14.0.2 / x86_64 original test result: -1 13451 poll(0,0,1000,274886297496,26854,268566648) = -1 errno=14 (Bad address) patched test result: 0 13536 poll(0,0,1000,274886297496,26854,268566648) = 0 Signed-off-by: Laurent Vivier--- This patch has already been sent in April, this version is just rebased on master. https://patchwork.ozlabs.org/patch/460950/ linux-user/syscall.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 98b5766..9cdb2a2 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -7893,14 +7893,20 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, struct pollfd *pfd; unsigned int i; -target_pfd = lock_user(VERIFY_WRITE, arg1, sizeof(struct target_pollfd) * nfds, 1); -if (!target_pfd) -goto efault; +pfd = NULL; +target_pfd = NULL; +if (nfds) { +target_pfd = lock_user(VERIFY_WRITE, arg1, + sizeof(struct target_pollfd) * nfds, 1); +if (!target_pfd) { +goto efault; +} -pfd = alloca(sizeof(struct pollfd) * nfds); -for(i = 0; i < nfds; i++) { -pfd[i].fd = tswap32(target_pfd[i].fd); -pfd[i].events = tswap16(target_pfd[i].events); +pfd = alloca(sizeof(struct pollfd) * nfds); +for (i = 0; i < nfds; i++) { +pfd[i].fd = tswap32(target_pfd[i].fd); +pfd[i].events = tswap16(target_pfd[i].events); +} } # ifdef TARGET_NR_ppoll -- 2.4.3
Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default
On 2 October 2015 at 14:45, Michael S. Tsirkinwrote: > Most people don't run make check by default, so they skip vhost-user > unit tests. Solve this by using tmpfs instead, unless hugetlbfs is > specified (using an environment variable). > > Signed-off-by: Michael S. Tsirkin > Reviewed-by: Marc-André Lureau Unfortunately I didn't notice before applying the pull, but this is breaking 'make check' on AArch64 host for me: TEST: tests/vhost-user-test... (pid=20205) Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: vhost-net support is not compiled in qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: failed to init vhost_net for queue 0 Broken pipe FAIL: tests/vhost-user-test Probably reproducible on x86 if you configure with --disable-vhost-net, though I haven't tried that. Perhaps tests/vhost-user-test should be set up in tests/Makefile using check-qtest-i386-$(CONFIG_VHOST_USER) rather than CONFIG_LINUX ? I'd appreciate a quick fix, because this machine is in my set of systems I test all pullreqs on now... thanks -- PMM
Re: [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity()
On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > block.c | 2 +- > include/block/block.h | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/block.c b/block.c > index 4f7fc0d..6d14f5b 100644 > --- a/block.c > +++ b/block.c > @@ -3591,7 +3591,7 @@ uint32_t > bdrv_get_default_bitmap_granularity(BlockDriverState *bs) > return granularity; > } > > -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap) > +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap) > { > return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap); > } > diff --git a/include/block/block.h b/include/block/block.h > index edc1510..fb7d410 100644 > --- a/include/block/block.h > +++ b/include/block/block.h > @@ -495,7 +495,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap); > void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap); > BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs); > uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs); > -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap); > +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap); > bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap); > bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap); > DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap); > As with Eric's review, with a commit message added: Reviewed-by: John Snow
Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default
On Mon, Oct 05, 2015 at 11:39:39PM +0100, Peter Maydell wrote: > On 2 October 2015 at 14:45, Michael S. Tsirkinwrote: > > Most people don't run make check by default, so they skip vhost-user > > unit tests. Solve this by using tmpfs instead, unless hugetlbfs is > > specified (using an environment variable). > > > > Signed-off-by: Michael S. Tsirkin > > Reviewed-by: Marc-André Lureau > > Unfortunately I didn't notice before applying the pull, but this > is breaking 'make check' on AArch64 host for me: > > TEST: tests/vhost-user-test... (pid=20205) > Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl > qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: > vhost-net support is not compiled in > qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce: > failed to init vhost_net for queue 0 > > Broken pipe > FAIL: tests/vhost-user-test > > Probably reproducible on x86 if you configure with --disable-vhost-net, > though I haven't tried that. > > Perhaps tests/vhost-user-test should be set up > in tests/Makefile using > check-qtest-i386-$(CONFIG_VHOST_USER) rather > than CONFIG_LINUX ? > > I'd appreciate a quick fix, because this machine is in my set > of systems I test all pullreqs on now... > > thanks > -- PMM I think you are right, but just to be on the safe side, let's test both for now. If this helps you, pls feel free to apply. I will look at cleaning this up later. --> tests: vhost-user: disable unless CONFIG_VHOST_NET vhost-user depends on vhost-net. We should probably fix that. For now, let's disable the test otherwise. Signed-off-by: Michael S. Tsirkin --- diff --git a/tests/Makefile b/tests/Makefile index 4063639..e6474ba 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -188,7 +188,9 @@ gcov-files-i386-y += hw/usb/hcd-xhci.c check-qtest-i386-y += tests/pc-cpu-test$(EXESUF) check-qtest-i386-y += tests/q35-test$(EXESUF) gcov-files-i386-y += hw/pci-host/q35.c +ifeq ($(CONFIG_VHOST_NET),y) check-qtest-i386-$(CONFIG_LINUX) += tests/vhost-user-test$(EXESUF) +endif check-qtest-x86_64-y = $(check-qtest-i386-y) gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c gcov-files-x86_64-y = $(subst i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
Re: [Qemu-devel] [RFC PATCH 07/10] spapr_pci: Allow PCI host bridge DMA window to be configured
On Mon, Oct 05, 2015 at 04:13:30PM +0200, Paolo Bonzini wrote: > > > On 03/10/2015 02:25, Alexey Kardashevskiy wrote: > >> I think this is the aim of VMSTATE_UINT64_EQUAL() ? > > > > We use it only for things which cannot be set via the command line > > and ideally there should be no VMSTATE_*_EQUAL. If something can be > > set via the command line, then the management software (read - > > libvirt) runs QEMU with explicit parameters to guarantee that these > > are equal. > > VMSTATE_*_EQUAL is used when a value is later used as e.g. the size of > an array. It basically provides bounds checking for the subsequent > array, avoiding that an invalid migration file or an error issuing the > QEMU command on the destination transforms into a buffer overflow. > > Michael Roth did most of this work, IIRC. Documenting it in > docs/migration.txt would be nice. Ah.. which means we probably should use VMSTATE_*_EQUAL here since the window size determines the size of the array of actual TCEs to follow shortly. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region
On Mon, Oct 05, 2015 at 11:05:07AM +0200, Igor Mammedov wrote: > On Mon, 5 Oct 2015 14:05:23 +0530 > Bharata B Raowrote: > > > Size hotplug memory region assuming a 256MB max alignment every slot. > > > > Signed-off-by: Bharata B Rao > > --- > > hw/ppc/spapr.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > index fc5e7d6..2ec509b 100644 > > --- a/hw/ppc/spapr.c > > +++ b/hw/ppc/spapr.c > > @@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine) > > > > spapr->hotplug_memory.base = ROUND_UP(machine->ram_size, > >SPAPR_HOTPLUG_MEM_ALIGN); > > + > > +/* size hotplug region assuming 256M max alignment per slot */ > > +hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots; > Does target support hugepages backend? If it does then adjustment probably > should be max supported hugepage alignment. Hrm, so the maximum possible page size on Power is 16G (though we don't yet support that on "powernv" which is what the host system will generally be). Not sure if the possibility of 16G "colossal pages" in future is enough reason to put such a huge gap. There aren't any other page sizes between 16MB and 16GB. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
[Qemu-devel] [PATCH qemu] kvm-all: Align to qemu_real_host_page_size in kvm_set_phys_mem
As the comment in kvm_set_phys_mem() says, KVM works in page size chunks. However it uses hardcoded TARGET_PAGE_SIZE which is 4K on most platforms while actual host may use different page size, for example, PPC64 hosts use 64K system pages. This replaces static TARGET_PAGE_SIZE with run-time calculated qemu_real_host_page_size. Signed-off-by: Alexey Kardashevskiy--- This is the result of debugging VFIO quirks not working under PPC64 KVM. --- kvm-all.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 0be4615..6f04fbb 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -642,15 +642,15 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, /* kvm works in page size chunks, but the function may be called with sub-page size and unaligned start address. Pad the start address to next and truncate size to previous page boundary. */ -delta = (TARGET_PAGE_SIZE - (start_addr & ~TARGET_PAGE_MASK)); -delta &= ~TARGET_PAGE_MASK; +delta = qemu_real_host_page_size - (start_addr & ~qemu_real_host_page_mask); +delta &= ~qemu_real_host_page_mask; if (delta > size) { return; } start_addr += delta; size -= delta; -size &= TARGET_PAGE_MASK; -if (!size || (start_addr & ~TARGET_PAGE_MASK)) { +size &= qemu_real_host_page_mask; +if (!size || (start_addr & ~qemu_real_host_page_mask)) { return; } -- 2.5.0.rc3
Re: [Qemu-devel] [RFC PATCH 07/10] spapr_pci: Allow PCI host bridge DMA window to be configured
On Tue, Oct 06, 2015 at 02:25:07PM +1100, David Gibson wrote: > On Mon, Oct 05, 2015 at 04:13:30PM +0200, Paolo Bonzini wrote: > > > > > > On 03/10/2015 02:25, Alexey Kardashevskiy wrote: > > >> I think this is the aim of VMSTATE_UINT64_EQUAL() ? > > > > > > We use it only for things which cannot be set via the command line > > > and ideally there should be no VMSTATE_*_EQUAL. If something can be > > > set via the command line, then the management software (read - > > > libvirt) runs QEMU with explicit parameters to guarantee that these > > > are equal. > > > > VMSTATE_*_EQUAL is used when a value is later used as e.g. the size of > > an array. It basically provides bounds checking for the subsequent > > array, avoiding that an invalid migration file or an error issuing the > > QEMU command on the destination transforms into a buffer overflow. > > > > Michael Roth did most of this work, IIRC. Documenting it in > > docs/migration.txt would be nice. > > Ah.. which means we probably should use VMSTATE_*_EQUAL here since the > window size determines the size of the array of actual TCEs to follow > shortly. Wait.. no we don't. The vmstate for the sPAPRTCETable object which actually holds the IOMMU page table information already has a suitable VMSTATE_*_EQUAL to protect the variable sized array, so we don't need another one here. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v5 32/48] ivshmem-client: check the number of vectors
On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau> > Check the number of vectors received from the server, to avoid > out of bound array access. > > Signed-off-by: Marc-André Lureau > --- > contrib/ivshmem-client/ivshmem-client.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/contrib/ivshmem-client/ivshmem-client.c > b/contrib/ivshmem-client/ivshmem-client.c > index 11c805c..34a65b1 100644 > --- a/contrib/ivshmem-client/ivshmem-client.c > +++ b/contrib/ivshmem-client/ivshmem-client.c > @@ -128,6 +128,11 @@ ivshmem_client_handle_server_msg(IvshmemClient *client) > /* new vector */ > IVSHMEM_CLIENT_DEBUG(client, " new vector %d (fd=%d) for peer id %ld\n", > peer->vectors_count, fd, peer->id); > +if (peer->vectors_count >= G_N_ELEMENTS(peer->vectors)) { > +IVSHMEM_CLIENT_DEBUG(client, "Too many vector received, failing"); nit: "Too many vectors" Reviewed-by: Claudio Fontana > +return -1; > +} > + > peer->vectors[peer->vectors_count] = fd; > peer->vectors_count++; > >
Re: [Qemu-devel] [PATCH v5 40/48] glib-compat: add 2.38/2.40/2.46 asserts
On 05/10/2015 12:56, Claudio Fontana wrote: > Hi, I did not find g_assertion_message in any of the exported GLIB APIs. > In fact, I found it in glib/gtestutils.h under the section "internal ABI". > This is a hint that we should not be using it right? Since this only matters for these pre-2.46 versions, where we know that g_assertion_message exists, we can use it. Paolo
Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value
> On 5 October 2015 at 12:21, Chen Gangwrote: >> +static float32 t_to_float32 (uint32_t a) >> +{ >> + CPU_FloatU r; >> + r.l = a; >> + return r.f; >> +} > > This appears to be reimplementing make_float32(). > OK, thanks. >> + >> +static uint32_t float32_to_t(float32 a) >> +{ >> + CPU_FloatU r; >> + r.f = a; >> + return r.l; >> +} > > And this is just float32_val(). > OK, thanks. >> uint64_t helper_fsingle_add1(CPUTLGState *env, uint64_t rsrc, uint64_t rsrcb) >> { >> - FPUTLGState *fpu = >fpu; >> - return fsingle_calc(fpu, int64_to_float32(rsrc, _STATUS), >> - int64_to_float32(rsrcb, _STATUS), >> - float32_add); >> + return fsingle_calc(>fpu, t_to_float32((uint32_t)rsrc), >> + t_to_float32((uint32_t)rsrcb), float32_add); >> } > > Why is the helper for a single-precision operation taking a 64-bit > argument anyway? > Oh, the register are uint64_t, so I guess the input register value are 64-bit, too. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value
> From: peter.mayd...@linaro.org > On 5 October 2015 at 12:54, Chen Gangwrote: >>> On 5 October 2015 at 12:21, Chen Gang wrote: >>> Why is the helper for a single-precision operation taking a 64-bit >>> argument anyway? >>> >> >> Oh, the register are uint64_t, so I guess the input register value are >> 64-bit, too. > > Usually for single precision the generated code is also working > with 32-bit values. What you have is not necessarily > wrong, but it is odd. I'd have to check the architecture > manual and the translate.c code to be sure. > OK, thanks. Originally, I referenced alpha and s390x for it. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device
On Mon, Oct 05, 2015 at 11:00:36AM +0100, Mark Rutland wrote: > On Sat, Oct 03, 2015 at 07:28:05PM -0400, Gabriel L. Somlo wrote: > > From: "Gabriel Somlo"> > > > Allow access to QEMU firmware blobs, passed into the guest VM via > > the fw_cfg device, through SysFS entries. Blob meta-data (e.g. name, > > size, and fw_cfg key), as well as the raw binary blob data may be > > accessed. > > > > The SysFS access location is /sys/firmware/qemu_fw_cfg/... and was > > selected based on overall similarity to the type of information > > exposed under /sys/firmware/dmi/entries/... > > What is the intended use of these? > > Some of the keys in the example look like they'd come from other sources > (e.g. the *-tables entries), while others look like kernel/bootloader > configuration options (e.g. etc/boot-fail-wait, bootorder) -- I'm > concerned about redundancy here. Paolo already answered that (more eloquently than I would have) so I'll leave it at that, for now... > > > NEW (since v2): Using ACPI to detect the presence and details of the > > fw_cfg virtual hardware device. > > > > Device Tree has been suggested by Ard as a comment on v2 of this > > patch, but after some deliberation I decided to go with ACPI, > > since it's supported on both x86 and some (uefi-enabled) versions > > of aarch64. I really don't see how I'd reasonably use *both* DT (on > > ARM) *and* ACPI (on x86), and after all I'm mostly concerned with > > x86, but originally wanted to maximize portability (which is where > > the register probing in earlier versions came from). > > There are defintitely going to be arm64 VMs that don't use ACPI, so we > may need DT support depending on what the intended use is. > > I'm not sure I follow what the difficulty with supporting DT in addition > to ACPI is? It looks like all you need is a compatible string and a reg > entry. Bearing in mind that I have almost no experience with arm: I started out by probing all possible port-io and mmio locations where fw_cfg registers might have been found, from a "classic" module_init method. Arm has DT, which as far as I understand will answer the following two questions: 1. Do I have fw_cfg ? 2. If yes, what address range does it use ? So that I could continue using a classic module_init, but won't need to probe for the device. PC (my primary architecture, the one I actually care about) does not have DT. If I want to share the same code, I can't probe, so if I try DT and don't find fw_cfg there (or somehow DT is no-op-ed out because I'm on a PC guest), I could somehow look it up in ACPI the same way (i.e., use ACPI as sort of a stand-in for DT). But all ACPI-enabled drivers I could find use dedicated macros (i.e. no more classic module_init() and module_exit(), but rather module_acpi_driver() with .add and .remove methods on an acpi_driver object, etc.) Not sure how I'd glue DT back into something like that. In addition, Michael's comment earlier in the thread suggests that even my current acpi version isn't sufficiently "orthodox" w.r.t. ACPI, and I should be providing the hardware access routine as an ACPI/AML routine, to avoid race conditions with the rest of ACPI, and for encapsulation. I.e. it's even rude to use the fw_cfg node's ACPI _CRS method (the part where I'd be treating it like a DT stand-in only to query fw_cfg's hardware specifics). So far, all the information I've been able to pull together points away from a dual DT + ACPI all-in-one solution for fw_cfg. If you know of an example where that's done in an acceptable way, please let me know so I can use it for inspiration... Thanks much, --Gabriel > > > A patch set generating an ACPI device node for qemu's fw_cfg is > > currently under review on the qemu-devel list: > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg06946.html > > (sorry, gmane appears down at the moment...) > > > > In consequence: > > > > - Patch 1/4 is mostly the same as in v2; > > - Patch 2/4 switches device initialization from register > > probing to using ACPI; this is a separate patch only to > > illustrate the transition from probing to ACPI, and I'm > > assuming it will end up squashed on top of patch 1/4 in > > the final version. > > > > - Patches 3/4 and 4/4 add a "human-readable" directory > > hierarchy built from tokenizing fw_cfg blob names into > > '/'-separated components, with symlinks to each 'by_key' > > blob folder (same as in earlier versions). At Greg's > > suggestion I tried to build this folder hierarchy and > > leaf symlinks using udev rules, but so far I haven't been > > successful in figuring that out. If udev turns out to > > be applicable after all, these two patches can be dropped > > from this series. > > > > In other words, patches 1 and 2 give us the following "by_key" listing > > of blobs contained in the qemu fw_cfg device
Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device
On Mon, Oct 05, 2015 at 01:50:47PM +0100, Peter Maydell wrote: > On 5 October 2015 at 13:40, Gabriel L. Somlowrote: > > In addition, Michael's comment earlier in the thread suggests that > > even my current acpi version isn't sufficiently "orthodox" w.r.t. > > ACPI, and I should be providing the hardware access routine as > > an ACPI/AML routine, to avoid race conditions with the rest of ACPI, > > and for encapsulation. I.e. it's even rude to use the fw_cfg node's > > ACPI _CRS method (the part where I'd be treating it like a DT stand-in > > only to query fw_cfg's hardware specifics). > > If you want to try to support "firmware might also be reading > fw_cfg at the same time as the kernel" this is a (painful) > problem regardless of how the kernel figures out whether a > fw_cfg device is present. I had assumed that one of the design > assumptions of this series was that firmware would only > read the fw_cfg before booting the guest kernel and never touch > it afterwards. If it might touch it later then letting the > guest kernel also mess with fw_cfg seems like a really bad idea. I don't know of any case where firmware and kernel might race each other to access fw_cfg. The issue AFAICT is whether it's safe (future-proof) to rely on parsing _CRS for the fw_cfg i/o access information, or whether such logic could be rendered obsolete by potential future updates to fw_cfg's _CRS. If I "outsource" the fw_cfg_dump_blob_by_key() functionality entirely to an ACPI method, my kernel driver won't have to worry about keeping up with said future updates. On the down-side, that means the kernel driver will be ACPI or nothing (but I'm OK with that, at my curent level of understanding :) Thanks, --Gabriel
Re: [Qemu-devel] [PATCHv3 5/7] memory: Allow replay of IOMMU mapping notifications
On 30/09/2015 04:13, David Gibson wrote: > When we have guest visible IOMMUs, we allow notifiers to be registered > which will be informed of all changes to IOMMU mappings. This is used by > vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings. > > However, unlike with a memory region listener, an iommu notifier won't be > told about any mappings which already exist in the (guest) IOMMU at the > time it is registered. This can cause problems if hotplugging a VFIO > device onto a guest bus which had existing guest IOMMU mappings, but didn't > previously have an VFIO devices (and hence no host IOMMU mappings). > > This adds a memory_region_iommu_replay() function to handle this case. It > replays any existing mappings in an IOMMU memory region to a specified > notifier. Because the IOMMU memory region doesn't internally remember the > granularity of the guest IOMMU it has a small hack where the caller must > specify a granularity at which to replay mappings. > > If there are finer mappings in the guest IOMMU these will be reported in > the iotlb structures passed to the notifier which it must handle (probably > causing it to flag an error). This isn't new - the VFIO iommu notifier > must already handle notifications about guest IOMMU mappings too short > for it to represent in the host IOMMU. > > Signed-off-by: David Gibson> --- > include/exec/memory.h | 13 + > memory.c | 20 > 2 files changed, 33 insertions(+) > > diff --git a/include/exec/memory.h b/include/exec/memory.h > index 5baaf48..0f07159 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -583,6 +583,19 @@ void memory_region_notify_iommu(MemoryRegion *mr, > void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n); > > /** > + * memory_region_iommu_replay: replay existing IOMMU translations to > + * a notifier > + * > + * @mr: the memory region to observe > + * @n: the notifier to which to replay iommu mappings > + * @granularity: Minimum page granularity to replay notifications for > + * @is_write: Whether to treat the replay as a translate "write" > + * through the iommu > + */ > +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, > +hwaddr granularity, bool is_write); > + > +/** > * memory_region_unregister_iommu_notifier: unregister a notifier for > * changes to IOMMU translation entries. > * > diff --git a/memory.c b/memory.c > index ef87363..1b03d22 100644 > --- a/memory.c > +++ b/memory.c > @@ -1403,6 +1403,26 @@ void > memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n) > notifier_list_add(>iommu_notify, n); > } > > +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, > +hwaddr granularity, bool is_write) > +{ > +hwaddr addr; > +IOMMUTLBEntry iotlb; > + > +for (addr = 0; addr < memory_region_size(mr); addr += granularity) { > +iotlb = mr->iommu_ops->translate(mr, addr, is_write); > +if (iotlb.perm != IOMMU_NONE) { > +n->notify(n, ); > +} > + > +/* if (2^64 - MR size) < granularity, it's possible to get an > + * infinite loop here. This should catch such a wraparound */ > +if ((addr + granularity) < addr) { > +break; > +} > +} > +} > + > void memory_region_unregister_iommu_notifier(Notifier *n) > { > notifier_remove(n); > Acked-by: Paolo Bonzini
Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device
On Mon, Oct 05, 2015 at 01:56:47PM +0100, Mark Rutland wrote: > On Mon, Oct 05, 2015 at 08:43:46AM -0400, Gabriel L. Somlo wrote: > > On Mon, Oct 05, 2015 at 01:23:33PM +0100, Mark Rutland wrote: > > > On Mon, Oct 05, 2015 at 01:48:52PM +0200, Paolo Bonzini wrote: > > > > > > > > > > > > On 05/10/2015 12:00, Mark Rutland wrote: > > > > > Some of the keys in the example look like they'd come from other > > > > > sources > > > > > (e.g. the *-tables entries), while others look like kernel/bootloader > > > > > configuration options (e.g. etc/boot-fail-wait, bootorder) -- I'm > > > > > concerned about redundancy here. > > > > > > > > The redundancy is because the firmware and the bootloader actually > > > > _consume_ these fw_cfg strings to produce the others (the ACPI tables, > > > > the kernel configuration options). > > > > > > > > On the other hand, hiding some strings just because they ought to have > > > > been consumed already makes little sense. > > > > > > Sure. However, I'm concerned that providing redundant interfaces for > > > those could lead to people grabbing information from here (because it's > > > convenient) rather than the existing canonical locations, which means we > > > get more software that works on fewer systems for no good reason. > > > > > > What I couldn't figure out was what _additional_ information this > > > provided; it looked like a mixed bag of details we could already get > > > from disparate sources. If that's all it does, then it seems to me like > > > it doesn't add any benefit and potentially makes things worse. > > > > > > So what do we get from this interface that we cannot get elsewhere, and > > > why is this the best way of exposing it? > > > > Starting with qemu 2.4, it is possible to insert arbitrary named > > blobs into fw_cfg from the qemu command line. *Those* entries > > might be interesting to userspace, which is why it might be handy > > to access to fw_cfg blobs in general. > > So this is a mechanism to pass arbitrary key:value pairs to a guest > userspace? What would those be used for, and why would this be the > correct location for that? Yes to arbitrary host->guest arbitrary key:value pairs. fw_cfg because it's asynchronous (host supplies the data at guest start time, and no longer has to worry about whether and when guests may or may not start some sort of agent in order to be able to accept connections, etc); also because it's guest-os agnostic (no piggy-backing on e.g. kernel command line). Drivers to make data available to guest userspace can be written for any guest OS. > How do we avoid clashes between user-selected names and those we need to > pass actual FW data? Internally supplied blobs (by QEMU) meant for the firmware are, by convention, prefixed with "/etc/...". Command-line blobs are expected to use "opt/...". QEMU issues a warning if a name is used on the command line that doesn't begin with 'opt/'. Thanks, --Gabriel
Re: [Qemu-devel] [PATCH v5 40/48] glib-compat: add 2.38/2.40/2.46 asserts
Hi On Mon, Oct 5, 2015 at 12:56 PM, Claudio Fontanawrote: > Hi, I did not find g_assertion_message in any of the exported GLIB APIs. > In fact, I found it in glib/gtestutils.h under the section "internal ABI". > This is a hint that we should not be using it right? Good point, I didn't see that. Notice that this file has other weird comments, such as "semi-internal API". I think it's fine to use though, because it's an external symbol and glib has ABI stability guarantees. I opened https://bugzilla.gnome.org/show_bug.cgi?id=756077 to change the glib comments. -- Marc-André Lureau
Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions
On 5 October 2015 at 07:49, Markus Armbrusterwrote: > Peter Maydell writes: > >> On 2 October 2015 at 18:20, Markus Armbruster wrote: >>> QMP command device-list-properties regressed in 2.1: it can crash or >>> leave dangling pointers behind. >>> >>> -device FOO,help regressed in 2.2: it no longer works for >>> non-pluggable devices. I tried to fix that some time ago[*], but my >>> fix failed review. This is my second, more comprehensive try. >>> >>> PATCH 1-3 fix one class of bugs involved in the regressions, PATCH 4-5 >>> are libqtest preliminaries, PATCH 6 adds tests to demonstrate the >>> remaining bugs, PATCH 7-9 fix them to a degree (see PATCH 8 for >>> limitations), and PATCH 10 cleans up. >> >> This ordering breaks bisection of 'make check', as I found out when >> I tried to figure out which of the patches in this pull was causing >> an OSX test failure. Please can you reorder them so that 'make check' >> works at all points in the series? > > My ordering may be bad (and I'll recheck it, of course), or it may > temporarily expose a hidden bug. I better figure out what's going on > here. > >>> The following changes since commit ff770b07f34d28b79013a83989bd6c85f8f16b2f: >>> >>> Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into >>> staging (2015-10-02 11:01:18 +0100) >>> >>> are available in the git repository at: >>> >>> git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2015-10-02 >>> >>> for you to fetch changes up to e927162a6fa2fa6144de9d1d11cc9448a2143671: >>> >>> Revert "qdev: Use qdev_get_device_class() for -device ,help" >>> (2015-10-02 16:45:53 +0200) >>> >>> >>> Fix device introspection regressions >>> >>> >> >> 'make check' failure on OSX: >> >> /aarch64/device/introspect/list: OK >> /aarch64/device/introspect/none: OK >> /aarch64/device/introspect/abstract: OK >> /aarch64/device/introspect/concrete: ** >> ERROR:/Users/pm215/src/qemu-for-merges/qom/object.c:333:void >> object_initialize_with_type(void *, size_t, TypeImpl *): assertion >> failed: (type != NULL) >> Broken pipe >> FAIL >> >> I have no idea why this only failed on OSX... > > Can you re-run this with valgrind spliced in? Valgrind is not particularly helpful: it reports a couple of irrelevancies and an unimplemented syscall, then just reports the backtrace for the abort: ==26853== Memcheck, a memory error detector ==26853== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. ==26853== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info ==26853== Command: ./aarch64-softmmu/qemu-system-aarch64 -qtest unix:/tmp/qtest-26555.sock,nowait -qtest-log /dev/null -qmp unix:/tmp/qtest-26555.qmp,nowait -machine accel=qtest -display none -nodefaults -machine none ==26853== Parent PID: 26555 ==26853== ==26853== Syscall param __pthread_sigmask(set) points to uninitialised byte(s) ==26853==at 0x10434E2B6: __pthread_sigmask (in /usr/lib/system/libsystem_kernel.dylib) ==26853==by 0x10446406D: pthread_sigmask (in /usr/lib/system/libsystem_pthread.dylib) ==26853==by 0x100537022: qemu_thread_create (qemu-thread-posix.c:488) ==26853==by 0x100550ACB: rcu_init_complete (rcu.c:320) ==26853==by 0x100550B18: rcu_init (rcu.c:351) ==26853==by 0x7FFF5FC12D0A: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld) ==26853==by 0x7FFF5FC12E97: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld) ==26853==by 0x7FFF5FC0F890: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld) ==26853==by 0x7FFF5FC0F717: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld) ==26853==by 0x7FFF5FC0F988: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld) ==26853==by 0x7FFF5FC02244: dyld::initializeMainExecutable() (in /usr/lib/dyld) ==26853==by 0x7FFF5FC05C18: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld) ==26853== Address 0x1056e0c80 is on thread 1's stack ==26853== in frame #2, created by qemu_thread_create (qemu-thread-posix.c:461) ==26853== ==26853== Syscall param __pthread_sigmask(set) points to uninitialised byte(s) ==26853==at 0x10434E2B6: __pthread_sigmask (in /usr/lib/system/libsystem_kernel.dylib) ==26853==by 0x10446406D: pthread_sigmask (in
Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device
> > I'm not sure I follow what the difficulty with supporting DT in addition > > to ACPI is? It looks like all you need is a compatible string and a reg > > entry. > > Bearing in mind that I have almost no experience with arm: > > I started out by probing all possible port-io and mmio locations where > fw_cfg registers might have been found, from a "classic" module_init > method. > > Arm has DT, which as far as I understand will answer the following two > questions: 1. Do I have fw_cfg ? 2. If yes, what address range does it use ? > So that I could continue using a classic module_init, but won't need > to probe for the device. > > PC (my primary architecture, the one I actually care about) does not > have DT. If I want to share the same code, I can't probe, so if I try > DT and don't find fw_cfg there (or somehow DT is no-op-ed out because > I'm on a PC guest), I could somehow look it up in ACPI the same way > (i.e., use ACPI as sort of a stand-in for DT). I'd imagine that it's simple to have something in your probe path like: if (pdev->dev.of_node) parse_dt(pdev); else parse_acpi(pdev); > But all ACPI-enabled drivers I could find use dedicated macros (i.e. > no more classic module_init() and module_exit(), but rather > module_acpi_driver() with .add and .remove methods on an acpi_driver > object, etc.) Not sure how I'd glue DT back into something like that. You don't have to use those macros, and can simply use the classic module_{init,exit} functions, calling the requisite acpi driver registration functions at module {init,exit} time. > In addition, Michael's comment earlier in the thread suggests that > even my current acpi version isn't sufficiently "orthodox" w.r.t. > ACPI, and I should be providing the hardware access routine as > an ACPI/AML routine, to avoid race conditions with the rest of ACPI, > and for encapsulation. I.e. it's even rude to use the fw_cfg node's > ACPI _CRS method (the part where I'd be treating it like a DT stand-in > only to query fw_cfg's hardware specifics). As Peter stated, this sounds very much like it rules out sharing the interface with FW generally (and is certainly scary). > So far, all the information I've been able to pull together points > away from a dual DT + ACPI all-in-one solution for fw_cfg. If you know > of an example where that's done in an acceptable way, please let > me know so I can use it for inspiration... I'm not immediately aware, but I would imagine you could search for files that had both an of_match_table and a acpi_bus_register_driver call. Thanks, Mark.
[Qemu-devel] [PATCH] tests: Unique test path for /string-visitor/output
From: "Dr. David Alan Gilbert"Newer GLib's want unique test paths, and thus moan at dupes. (Seen on Fedora 23 which has glib 2.46) Uniqueify the paths. Signed-off-by: Dr. David Alan Gilbert --- tests/test-string-output-visitor.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/tests/test-string-output-visitor.c b/tests/test-string-output-visitor.c index 101fb27..fd5e67b 100644 --- a/tests/test-string-output-visitor.c +++ b/tests/test-string-output-visitor.c @@ -248,39 +248,39 @@ int main(int argc, char **argv) output_visitor_test_add("/string-visitor/output/int", _visitor_data, test_visitor_out_int, false); -output_visitor_test_add("/string-visitor/output/int", +output_visitor_test_add("/string-visitor/output/int-human", _visitor_data, test_visitor_out_int, true); output_visitor_test_add("/string-visitor/output/bool", _visitor_data, test_visitor_out_bool, false); -output_visitor_test_add("/string-visitor/output/bool", +output_visitor_test_add("/string-visitor/output/bool-human", _visitor_data, test_visitor_out_bool, true); output_visitor_test_add("/string-visitor/output/number", _visitor_data, test_visitor_out_number, false); -output_visitor_test_add("/string-visitor/output/number", +output_visitor_test_add("/string-visitor/output/number-human", _visitor_data, test_visitor_out_number, true); output_visitor_test_add("/string-visitor/output/string", _visitor_data, test_visitor_out_string, false); -output_visitor_test_add("/string-visitor/output/string", +output_visitor_test_add("/string-visitor/output/string-human", _visitor_data, test_visitor_out_string, true); output_visitor_test_add("/string-visitor/output/no-string", _visitor_data, test_visitor_out_no_string, false); -output_visitor_test_add("/string-visitor/output/no-string", +output_visitor_test_add("/string-visitor/output/no-string-human", _visitor_data, test_visitor_out_no_string, true); output_visitor_test_add("/string-visitor/output/enum", _visitor_data, test_visitor_out_enum, false); -output_visitor_test_add("/string-visitor/output/enum", +output_visitor_test_add("/string-visitor/output/enum-human", _visitor_data, test_visitor_out_enum, true); output_visitor_test_add("/string-visitor/output/enum-errors", _visitor_data, test_visitor_out_enum_errors, false); -output_visitor_test_add("/string-visitor/output/enum-errors", +output_visitor_test_add("/string-visitor/output/enum-errors-human", _visitor_data, test_visitor_out_enum_errors, true); output_visitor_test_add("/string-visitor/output/intList", _visitor_data, test_visitor_out_intList, false); -output_visitor_test_add("/string-visitor/output/intList", +output_visitor_test_add("/string-visitor/output/intList-human", _visitor_data, test_visitor_out_intList, true); g_test_run(); -- 2.5.0
Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value
On 5 October 2015 at 12:21, Chen Gangwrote: > From 6bb2ed5b7046cda545f6a12721b773fde40f07f1 Mon Sep 17 00:00:00 2001 > From: Chen Gang > Date: Mon, 5 Oct 2015 19:12:07 +0800 > Subject: [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for > the input register value > > Original implementation use int*_to_float32 and float32_to_int*, which > will generate incorrect result. > > Signed-off-by: Chen Gang > --- > target-tilegx/fpu_helper.c | 51 > +- > 1 file changed, 23 insertions(+), 28 deletions(-) > > diff --git a/target-tilegx/fpu_helper.c b/target-tilegx/fpu_helper.c > index daae570..2707f30 100644 > --- a/target-tilegx/fpu_helper.c > +++ b/target-tilegx/fpu_helper.c > @@ -68,6 +68,20 @@ static uint64_t float64_to_t(float64 fa) > return r.ll; > } > > +static float32 t_to_float32 (uint32_t a) > +{ > +CPU_FloatU r; > +r.l = a; > +return r.f; > +} This appears to be reimplementing make_float32(). > + > +static uint32_t float32_to_t(float32 a) > +{ > +CPU_FloatU r; > +r.f = a; > +return r.l; > +} And this is just float32_val(). > uint64_t helper_fsingle_add1(CPUTLGState *env, uint64_t rsrc, uint64_t rsrcb) > { > -FPUTLGState *fpu = >fpu; > -return fsingle_calc(fpu, int64_to_float32(rsrc, _STATUS), > -int64_to_float32(rsrcb, _STATUS), > -float32_add); > +return fsingle_calc(>fpu, t_to_float32((uint32_t)rsrc), > +t_to_float32((uint32_t)rsrcb), float32_add); > } Why is the helper for a single-precision operation taking a 64-bit argument anyway? thanks -- PMM