Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 18:11, Markus Armbruster  wrote:
> Sigh.  I'll stare at the original backtrace some more.

Here's one from a debug build (with some reasoning about
the cause underneath it).

As an aside, thanks to your valgrind example I figured out
how to get the test makefile to run tests under gdb:

QTEST_QEMU_BINARY="xterm -e gdb --tty $(tty) --args
aarch64-softmmu/qemu-system-aarch64" QTEST_QEMU_IMG=qemu-img gtester
-k --verbose -m=quick  tests/device-introspect-test

(which will start a new xterm with a gdb in it for
debugging each test).

#0  0x7fff9145e286 in __pthread_kill ()
#1  0x7fff912529f9 in pthread_kill ()
#2  0x7fff956db9b3 in abort ()
#3  0x00010103ec50 in g_assertion_message ()
#4  0x00010103ec95 in g_assertion_message_expr ()
#5  0x00010045bbb2 in object_initialize_with_type
(data=0x102056520, size=344, type=0x0) at
/Users/pm215/src/qemu-for-merges/qom/object.c:333
#6  0x00010045c112 in object_initialize (data=0x102056520,
size=344, typename=0x1005a7a96 "virtio-tablet-device") at
/Users/pm215/src/qemu-for-merges/qom/object.c:352
#7  0x0001000e7d74 in virtio_instance_init_common
(proxy_obj=0x10204e400, data=0x102056520, vdev_size=344,
vdev_name=0x1005a7a96 "virtio-tablet-device") at
/Users/pm215/src/qemu-for-merges/hw/virtio/virtio.c:1468
#8  0x0001003efe47 in virtio_tablet_initfn (obj=0x10204e400) at
/Users/pm215/src/qemu-for-merges/hw/virtio/virtio-pci.c:2133
#9  0x00010045c066 in object_init_with_type (obj=0x10204e400,
ti=0x10154f420) at /Users/pm215/src/qemu-for-merges/qom/object.c:314
#10 0x00010045bcf2 in object_initialize_with_type
(data=0x10204e400, size=33400, type=0x10154f420) at
/Users/pm215/src/qemu-for-merges/qom/object.c:344
#11 0x00010045c2a9 in object_new_with_type (type=0x10154f420) at
/Users/pm215/src/qemu-for-merges/qom/object.c:430
#12 0x00010045c2f2 in object_new (typename=0x101585cc0
"virtio-tablet-pci") at
/Users/pm215/src/qemu-for-merges/qom/object.c:440
#13 0x00010021923f in qmp_device_list_properties
(typename=0x101585cc0 "virtio-tablet-pci", errp=0x7fff5fbfd850) at
/Users/pm215/src/qemu-for-merges/qmp.c:529
#14 0x00010020e054 in qmp_marshal_device_list_properties
(args=0x102023200, ret=0x7fff5fbfd8c8, errp=0x7fff5fbfd8d0) at
qmp-marshal.c:1693
#15 0x000100047fe1 in handle_qmp_command (parser=0x10157f388,
tokens=0x1013069a0) at /Users/pm215/src/qemu-for-merges/monitor.c:3860
#16 0x00010052fb16 in json_message_process_token
(lexer=0x10157f390, token=0x101585b40, type=JSON_OPERATOR, x=15754,
y=0) at /Users/pm215/src/qemu-for-merges/qobject/json-streamer.c:87
#17 0x00010052f45a in json_lexer_feed_char (lexer=0x10157f390,
ch=125 '}', flush=false) at
/Users/pm215/src/qemu-for-merges/qobject/json-lexer.c:303
#18 0x00010052f2e1 in json_lexer_feed (lexer=0x10157f390,
buffer=0x7fff5fbfdb00 "}6\035�q&", size=1) at
/Users/pm215/src/qemu-for-merges/qobject/json-lexer.c:356
#19 0x00010052fbc7 in json_message_parser_feed
(parser=0x10157f388, buffer=0x7fff5fbfdb00 "}6\035�q&", size=1) at
/Users/pm215/src/qemu-for-merges/qobject/json-streamer.c:110
#20 0x000100047ce6 in monitor_qmp_read (opaque=0x10157f310,
buf=0x7fff5fbfdb00 "}6\035�q&", size=1) at
/Users/pm215/src/qemu-for-merges/monitor.c:3875
#21 0x0001001f06d7 in qemu_chr_be_write (s=0x10157e310,
buf=0x7fff5fbfdb00 "}6\035�q&", len=1) at
/Users/pm215/src/qemu-for-merges/qemu-char.c:305
#22 0x0001001f661a in tcp_chr_read (chan=0x10157e5a0,
cond=G_IO_IN, opaque=0x10157e310) at
/Users/pm215/src/qemu-for-merges/qemu-char.c:2873
#23 0x00010101f0bd in g_main_context_dispatch ()
#24 0x000100473aaa in glib_pollfds_poll () at
/Users/pm215/src/qemu-for-merges/main-loop.c:211
#25 0x00010047371b in os_host_main_loop_wait (timeout=0) at
/Users/pm215/src/qemu-for-merges/main-loop.c:256
#26 0x00010047358d in main_loop_wait (nonblocking=1) at
/Users/pm215/src/qemu-for-merges/main-loop.c:504
#27 0x00010020715c in main_loop () at
/Users/pm215/src/qemu-for-merges/vl.c:1880
#28 0x0001002015bc in qemu_main (argc=14, argv=0x7fff5fbff5c0,
envp=0x7fff5fbff638) at /Users/pm215/src/qemu-for-merges/vl.c:4634
#29 0x000100435c73 in main (argc=14, argv=0x7fff5fbff5c0) at
/Users/pm215/src/qemu-for-merges/ui/cocoa.m:1164

I think the problem here is that the "virtio-tablet-pci" device
is in hw/virtio/virtio-pci.c, which gets built for all hosts,
but it uses the device "virtio-tablet-device", which is defined
in hw/input/virtio-input-hid.c, which is only built if CONFIG_LINUX
is defined. So on non-Linux hosts we get a virtio-tablet-pci
device which can't be created.

The easy fix is to have some suitable ifdeffery in virtio-pci.c,
similar to how we only register the virtio_9p_pci and virtio_scsi_pci
types if they've been configured into this build.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-10-05 Thread Eduardo Habkost
On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote:
> On 17 September 2015 at 17:32, Paolo Bonzini  wrote:
> >
> >
> > On 17/09/2015 18:16, Peter Maydell wrote:
> >> On 17 September 2015 at 17:00, Paolo Bonzini  wrote:
> >>>
> >>>
> >>> On 17/09/2015 16:24, Peter Maydell wrote:
>  Can we revert this one, please? Checkpatch now warns about constructs
>  like
>    typedef struct MyDevice {
>    DeviceState parent;
> 
>    int reg0, reg1, reg2;
>    } MyDevice;
> 
> > I think it varies depending on the maintainer.  PPC, USB, SCSI, ACPI all
> > use a separate typedef.  I'll prepare a revert.
> 
> Ping on that revert patch? I can't find it onlist...

The x86 pull request I sent today triggers this error, BTW. I hope it
won't make the pull request be automatically rejected.

-- 
Eduardo



Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 19:11, Eduardo Habkost  wrote:
> On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote:
>> On 17 September 2015 at 17:32, Paolo Bonzini  wrote:
>> >
>> >
>> > On 17/09/2015 18:16, Peter Maydell wrote:
>> >> On 17 September 2015 at 17:00, Paolo Bonzini  wrote:
>> >>>
>> >>>
>> >>> On 17/09/2015 16:24, Peter Maydell wrote:
>>  Can we revert this one, please? Checkpatch now warns about constructs
>>  like
>>    typedef struct MyDevice {
>>    DeviceState parent;
>> 
>>    int reg0, reg1, reg2;
>>    } MyDevice;
>>
>> > I think it varies depending on the maintainer.  PPC, USB, SCSI, ACPI all
>> > use a separate typedef.  I'll prepare a revert.
>>
>> Ping on that revert patch? I can't find it onlist...
>
> The x86 pull request I sent today triggers this error, BTW. I hope it
> won't make the pull request be automatically rejected.

Nope, because I don't run checkpatch on pull requests.

-- PMM



Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect

2015-10-05 Thread Sair, Umair
> The first if handles the "default to N" case, the second handles "default to 
> Y", the (absent) else case handles "default to PF_UNSPEC".

Can you please elaborate it. Also I am not understanding the reason for 
inverting the values of addr->has_ipv* in second if condition.

I believe that the fix for the issue under discussion will be committed to qemu 
repo very soon, so I'll like to add one more thing which requires to be fixed 
along with it. In 'tcp_chr_accept' function of qemu-char.c, the data type of 
saddr should be sockaddr_in6 so that it works with both IPv6 and IPv4 on 
Windows (works for linux without it because of accept4 and works with this 
solution as well!).

Regards,
Umair Sair

-Original Message-
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
Sent: Sunday, October 04, 2015 3:48 PM
To: Peter Maydell; Sair, Umair
Cc: QEMU Developers; qemu-disc...@nongnu.org
Subject: Re: [Qemu-discuss] TCP options ipv4 and ipv6 have no effect

On 03/10/2015 00:36, Peter Maydell wrote:
> 
> I agree about the (!ipv4 || !ipv6) condition though.
> The three states I listed above ought to correspond to "qemu_opt not 
> set", "qemu_opt set to false" and "qemu_opt set to true",

The problem is that the underlying QemuOpts-based code treats "qemu_opt not 
set" and "qemu_opt set to false" the same way:

  ipv4   ipv6
  Y  Y   PF_INET6
  Y  N   PF_INET
  N  Y   PF_INET6
  N  N   PF_UNSPEC

We want:

 ipv4 ipv6
 YY PF_INET6
 YN PF_INET
 Y- PF_INET (ipv6 = N)
 NY PF_INET6
 NN PF_UNSPEC
 N- PF_INET6(ipv6 = Y)
 -Y PF_INET6(ipv4 = N)
 -N PF_INET (ipv4 = Y)
 -- PF_UNSPEC

I think this patch gets the desired semantics:

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c index 2add83a..fdcf3fa 
100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -586,12 +586,15 @@ fail:
 
 static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress *addr)  {
-bool ipv4 = addr->ipv4 || !addr->has_ipv4;
-bool ipv6 = addr->ipv6 || !addr->has_ipv6;
+bool ipv4 = addr->has_ipv4 && addr->ipv4;
+bool ipv6 = addr->has_ipv6 && addr->ipv6;
 
-if (!ipv4 || !ipv6) {
+if (ipv4 || ipv6) {
 qemu_opt_set_bool(opts, "ipv4", ipv4, _abort);
 qemu_opt_set_bool(opts, "ipv6", ipv6, _abort);
+} else if (addr->has_ipv4 || addr->has_ipv6) {
+qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort);
+qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort);
 }
 if (addr->has_to) {
 qemu_opt_set_number(opts, "to", addr->to, _abort);


The first if handles the "default to N" case, the second handles "default to 
Y", the (absent) else case handles "default to PF_UNSPEC".

Paolo


Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox

2015-10-05 Thread Markus Armbruster
"Namsun Ch'o"  writes:

>> Drawback: complexity.  If we decide to limit ourselves to the original
>> threat model (rogue guest), and enter the sandbox only after setup, we
>> can keep things simpler.
>
> We could do both without much complexity. This looks simple enough to me:
>
>   rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chroot), 1,
> SCMP_A0(SCMP_CMP_EQ, chroot_dir));
>   if (rc < 0)
> goto seccomp_return;
>
>   rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chdir), 1,
> SCMP_A0(SCMP_CMP_EQ, "/"));
>   if (rc < 0)
> goto seccomp_return;
>
> The only time chroot_dir is ever used is in os-posix.c:139:
>
>   if (chroot(chroot_dir) < 0) {

I'm afraid this materially weakens the sandbox.  chroot_dir is writable.

We don't need to permit chroot(chroot_dir) if we enter the sandbox only
after setup.



Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Markus Armbruster
Peter Maydell  writes:

> On 5 October 2015 at 20:27, Paolo Bonzini  wrote:
>>
>>
>> On 05/10/2015 20:46, Peter Maydell wrote:
>>> The easy fix is to have some suitable ifdeffery in virtio-pci.c,
>>> similar to how we only register the virtio_9p_pci and virtio_scsi_pci
>>
>> (vhost_scsi_pci)
>>
>>> types if they've been configured into this build.
>>
>> Hmm, actually there's no reason to limit
>>
>> common-obj-$(CONFIG_VIRTIO) += virtio-input.o
>> common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o
>>
>> to CONFIG_LINUX.  Does it work to extract them out of the if (which is
>> only correct for virtio-input-host.o)?
>
> Yes, though as you predicted we then fall over on virtio-input-host-pci.
> If I bodge that one out in virtio-pci.c then the device-introspect-test
> passes.

Does this do the trick?  If yes, I'll fill out the commit message and
post it properly.

>From e5ebf2f0680fcf145455db048b82cf8e5d9d6b9f Mon Sep 17 00:00:00 2001
From: Markus Armbruster 
Date: Tue, 6 Oct 2015 07:43:18 +0200
Subject: [PATCH] virtio-input: Fix device introspection on non-Linux hosts WIP

---
 hw/input/Makefile.objs | 2 +-
 hw/virtio/virtio-pci.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/input/Makefile.objs b/hw/input/Makefile.objs
index 624ba7e..7715d72 100644
--- a/hw/input/Makefile.objs
+++ b/hw/input/Makefile.objs
@@ -8,9 +8,9 @@ common-obj-$(CONFIG_STELLARIS_INPUT) += stellaris_input.o
 common-obj-$(CONFIG_TSC2005) += tsc2005.o
 common-obj-$(CONFIG_VMMOUSE) += vmmouse.o
 
-ifeq ($(CONFIG_LINUX),y)
 common-obj-$(CONFIG_VIRTIO) += virtio-input.o
 common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o
+ifeq ($(CONFIG_LINUX),y)
 common-obj-$(CONFIG_VIRTIO) += virtio-input-host.o
 endif
 
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index eda8205..23f131b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2228,7 +2228,9 @@ static void virtio_pci_register_types(void)
 type_register_static(_keyboard_pci_info);
 type_register_static(_mouse_pci_info);
 type_register_static(_tablet_pci_info);
+#ifdef CONFIG_LINUX
 type_register_static(_host_pci_info);
+#endif
 type_register_static(_pci_bus_info);
 type_register_static(_pci_info);
 #ifdef CONFIG_VIRTFS
-- 
2.4.3




Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 20:27, Paolo Bonzini  wrote:
>
>
> On 05/10/2015 20:46, Peter Maydell wrote:
>> The easy fix is to have some suitable ifdeffery in virtio-pci.c,
>> similar to how we only register the virtio_9p_pci and virtio_scsi_pci
>
> (vhost_scsi_pci)
>
>> types if they've been configured into this build.
>
> Hmm, actually there's no reason to limit
>
> common-obj-$(CONFIG_VIRTIO) += virtio-input.o
> common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o
>
> to CONFIG_LINUX.  Does it work to extract them out of the if (which is
> only correct for virtio-input-host.o)?

Yes, though as you predicted we then fall over on virtio-input-host-pci.
If I bodge that one out in virtio-pci.c then the device-introspect-test
passes.

thanks
-- PMM



Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 20:56, Davorin Mista  wrote:
> Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable
> in ARMCPUState struct (os_lock_status).
>
> Linux reads from this register during its suspend/resume procedure.
>
> Signed-off-by: Davorin Mista 

Thanks for this patch. I'm afraid it still needs some changes;
comments below.

> ---
> Changed in v2:
> -switched from using dummy registers to an actual register implementation
> -implemented write function for OSLAR_EL1 sysreg
> -added state variable to ARMCPUState struct
>
> Signed-off-by: Davorin Mista 
> ---
>  target-arm/cpu.h|  3 +++
>  target-arm/helper.c | 16 +++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index 5ea11a6..5aab654 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -500,6 +500,9 @@ typedef struct CPUARMState {
>  uint32_t cregs[16];
>  } iwmmxt;
>
> +/* OS Lock Status: accessed via OSLAR/OSLSR registers */
> +uint64_t os_lock_status;

Can you call this "oslsr_el1" and put it inside the cp15 substruct
with the other sysreg fields, please?

> +
>  /* For mixed endian mode.  */
>  bool bswap_code;
>
> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 9d62c4c..a6fad7a 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>  putchar(value);
>  }
>
> +/* write to os_lock_status state variable */
> +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
> value)
> +{
> +/* only bit 1 can be modified, register is always 0b10x0 */
> +raw_write(env, ri, 8 + (value & 2));

This is the write function for OSLAR, not OSLSR, so you should
call it oslar_write.

Your logic isn't implementing the behaviour the ARM ARM requires,
which is:
 * for AArch64 accesses, copy bit 0 of the written value into
   bit 1 of oslsr_el1
 * for AArch32 accesses, if the written value is 0xC5ACCE55
   then write 1 into bit 1 of oslsr_el1, else write 0

That looks something like:

int oslock;

if (ri->state == ARM_CP_STATE_AA32) {
oslock = (value == 0xC5ACCE55);
} else {
oslock = value & 1;
}

env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);


> +}
> +
>  static const ARMCPRegInfo debug_cp_reginfo[] = {
>  /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
>   * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
> @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
>  /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
>  { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
>.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
> -  .access = PL1_W, .type = ARM_CP_NOP },
> +  .access = PL1_W, .resetvalue = 10,

Write only registers don't need a reset value. You also
need .type = ARM_CP_NO_RAW, because a raw access to this register
doesn't make sense.

> +  .fieldoffset = offsetof(CPUARMState, os_lock_status),
> +  .writefn = oslsr_write },
> +/* We define a dummy OSLSR_EL1, because Linux reads from it. */
> +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
> +  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
> +  .access = PL1_R,
> +  .fieldoffset = offsetof(CPUARMState, os_lock_status) },

This is the reginfo that should have the reset value.

>  /* Dummy OSDLR_EL1: 32-bit Linux will read this */
>  { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
>.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
> --
> 2.6.0
>

thanks
-- PMM



Re: [Qemu-devel] [PATCH 3/5] qga: guest exec functionality

2015-10-05 Thread Yuri Pudgorodskiy

On 10/2/2015 1:59 AM, Michael Roth wrote:

+#
+# Since: 2.5
+##
+{ 'command': 'guest-exec',
+  'data':{ 'path': 'str', '*arg': ['str'], '*env': ['str'],
+   '*inp-data': 'str', '*capture-output': 'bool' },
+  'returns': 'GuestExec' }

Would it make sense to just add handle (pid) to GuestExecStatus, and
have this return GuestExecStatus just the same as guest-exec-status
does? I'm not sure either way personally, just thought I'd mention it.



I do not think it's a good idea. GuestExecStatus contains mostly the 
data about

the finished exec. Having GuestExec returns same struct may make wrong
assumption that it can be synchronous - wait for exec to complete and 
return all

data in a single call.

Implementing synchronous GuestExec is not and easy job - either we occupy
host-guest channel for all time until task finished, which is bad or we 
need to

implement multiplexed messages for concurrent qga commands.





[Qemu-devel] [PULL 02/10] hw/vfio/platform: change interrupt/unmask fields into pointer

2015-10-05 Thread Alex Williamson
From: Eric Auger 

unmask EventNotifier might not be initialized in case of edge
sensitive irq. Using EventNotifier pointers make life simpler to
handle the edge-sensitive irqfd setup.

Signed-off-by: Eric Auger 
Signed-off-by: Alex Williamson 
---
 hw/vfio/platform.c  |   35 ---
 include/hw/vfio/vfio-platform.h |4 ++--
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index d864342..cab1517 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -57,15 +57,20 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
 sysbus_init_irq(sbdev, >qemuirq);
 
 /* Get an eventfd for trigger */
-ret = event_notifier_init(>interrupt, 0);
+intp->interrupt = g_malloc0(sizeof(EventNotifier));
+ret = event_notifier_init(intp->interrupt, 0);
 if (ret) {
+g_free(intp->interrupt);
 g_free(intp);
 error_report("vfio: Error: trigger event_notifier_init failed ");
 return NULL;
 }
 /* Get an eventfd for resample/unmask */
-ret = event_notifier_init(>unmask, 0);
+intp->unmask = g_malloc0(sizeof(EventNotifier));
+ret = event_notifier_init(intp->unmask, 0);
 if (ret) {
+g_free(intp->interrupt);
+g_free(intp->unmask);
 g_free(intp);
 error_report("vfio: Error: resamplefd event_notifier_init failed");
 return NULL;
@@ -100,7 +105,7 @@ static int vfio_set_trigger_eventfd(VFIOINTp *intp,
 irq_set->start = 0;
 irq_set->count = 1;
 pfd = (int32_t *)_set->data;
-*pfd = event_notifier_get_fd(>interrupt);
+*pfd = event_notifier_get_fd(intp->interrupt);
 qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
 ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
 g_free(irq_set);
@@ -182,7 +187,7 @@ static void vfio_intp_mmap_enable(void *opaque)
 static void vfio_intp_inject_pending_lockheld(VFIOINTp *intp)
 {
 trace_vfio_platform_intp_inject_pending_lockheld(intp->pin,
-  event_notifier_get_fd(>interrupt));
+  event_notifier_get_fd(intp->interrupt));
 
 intp->state = VFIO_IRQ_ACTIVE;
 
@@ -224,18 +229,18 @@ static void vfio_intp_interrupt(VFIOINTp *intp)
 trace_vfio_intp_interrupt_set_pending(intp->pin);
 QSIMPLEQ_INSERT_TAIL(>pending_intp_queue,
  intp, pqnext);
-ret = event_notifier_test_and_clear(>interrupt);
+ret = event_notifier_test_and_clear(intp->interrupt);
 qemu_mutex_unlock(>intp_mutex);
 return;
 }
 
 trace_vfio_platform_intp_interrupt(intp->pin,
-  event_notifier_get_fd(>interrupt));
+  event_notifier_get_fd(intp->interrupt));
 
-ret = event_notifier_test_and_clear(>interrupt);
+ret = event_notifier_test_and_clear(intp->interrupt);
 if (!ret) {
 error_report("Error when clearing fd=%d (ret = %d)",
- event_notifier_get_fd(>interrupt), ret);
+ event_notifier_get_fd(intp->interrupt), ret);
 }
 
 intp->state = VFIO_IRQ_ACTIVE;
@@ -283,7 +288,7 @@ static void vfio_platform_eoi(VFIODevice *vbasedev)
 QLIST_FOREACH(intp, >intp_list, next) {
 if (intp->state == VFIO_IRQ_ACTIVE) {
 trace_vfio_platform_eoi(intp->pin,
-event_notifier_get_fd(>interrupt));
+event_notifier_get_fd(intp->interrupt));
 intp->state = VFIO_IRQ_INACTIVE;
 
 /* deassert the virtual IRQ */
@@ -360,7 +365,7 @@ static int vfio_set_resample_eventfd(VFIOINTp *intp)
 irq_set->start = 0;
 irq_set->count = 1;
 pfd = (int32_t *)_set->data;
-*pfd = event_notifier_get_fd(>unmask);
+*pfd = event_notifier_get_fd(intp->unmask);
 qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
 ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
 g_free(irq_set);
@@ -396,8 +401,8 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, 
qemu_irq irq)
 }
 assert(intp);
 
-if (kvm_irqchip_add_irqfd_notifier(kvm_state, >interrupt,
-   >unmask, irq) < 0) {
+if (kvm_irqchip_add_irqfd_notifier(kvm_state, intp->interrupt,
+   intp->unmask, irq) < 0) {
 goto fail_irqfd;
 }
 
@@ -411,11 +416,11 @@ static void vfio_start_irqfd_injection(SysBusDevice 
*sbdev, qemu_irq irq)
 intp->kvm_accel = true;
 
 trace_vfio_platform_start_irqfd_injection(intp->pin,
- event_notifier_get_fd(>interrupt),
- event_notifier_get_fd(>unmask));
+ event_notifier_get_fd(intp->interrupt),
+ event_notifier_get_fd(intp->unmask));
 

[Qemu-devel] [PULL 08/10] memory: Allow replay of IOMMU mapping notifications

2015-10-05 Thread Alex Williamson
From: David Gibson 

When we have guest visible IOMMUs, we allow notifiers to be registered
which will be informed of all changes to IOMMU mappings.  This is used by
vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.

However, unlike with a memory region listener, an iommu notifier won't be
told about any mappings which already exist in the (guest) IOMMU at the
time it is registered.  This can cause problems if hotplugging a VFIO
device onto a guest bus which had existing guest IOMMU mappings, but didn't
previously have an VFIO devices (and hence no host IOMMU mappings).

This adds a memory_region_iommu_replay() function to handle this case.  It
replays any existing mappings in an IOMMU memory region to a specified
notifier.  Because the IOMMU memory region doesn't internally remember the
granularity of the guest IOMMU it has a small hack where the caller must
specify a granularity at which to replay mappings.

If there are finer mappings in the guest IOMMU these will be reported in
the iotlb structures passed to the notifier which it must handle (probably
causing it to flag an error).  This isn't new - the VFIO iommu notifier
must already handle notifications about guest IOMMU mappings too short
for it to represent in the host IOMMU.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Acked-by: Paolo Bonzini 
Signed-off-by: Alex Williamson 
---
 include/exec/memory.h |   13 +
 memory.c  |   20 
 2 files changed, 33 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5baaf48..0f07159 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -583,6 +583,19 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
 
 /**
+ * memory_region_iommu_replay: replay existing IOMMU translations to
+ * a notifier
+ *
+ * @mr: the memory region to observe
+ * @n: the notifier to which to replay iommu mappings
+ * @granularity: Minimum page granularity to replay notifications for
+ * @is_write: Whether to treat the replay as a translate "write"
+ * through the iommu
+ */
+void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
+hwaddr granularity, bool is_write);
+
+/**
  * memory_region_unregister_iommu_notifier: unregister a notifier for
  * changes to IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index ef87363..1b03d22 100644
--- a/memory.c
+++ b/memory.c
@@ -1403,6 +1403,26 @@ void memory_region_register_iommu_notifier(MemoryRegion 
*mr, Notifier *n)
 notifier_list_add(>iommu_notify, n);
 }
 
+void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
+hwaddr granularity, bool is_write)
+{
+hwaddr addr;
+IOMMUTLBEntry iotlb;
+
+for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
+iotlb = mr->iommu_ops->translate(mr, addr, is_write);
+if (iotlb.perm != IOMMU_NONE) {
+n->notify(n, );
+}
+
+/* if (2^64 - MR size) < granularity, it's possible to get an
+ * infinite loop here.  This should catch such a wraparound */
+if ((addr + granularity) < addr) {
+break;
+}
+}
+}
+
 void memory_region_unregister_iommu_notifier(Notifier *n)
 {
 notifier_remove(n);




Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect

2015-10-05 Thread Paolo Bonzini


On 05/10/2015 20:03, Sair, Umair wrote:
>> The first if handles the "default to N" case, the second handles "default to 
>> Y", the (absent) else case handles "default to PF_UNSPEC".
> 
> Can you please elaborate it. Also I am not understanding the reason for 
> inverting the values of addr->has_ipv* in second if condition.

If a value is absent, you can either default it to Y (then the value is
!addr->has_xyz || addr->xyz) or default it to N (then the value is
addr->has_xyz && addr->xyz).

The desired logic is:

- with one absent value and one "on", the absent value should be "off"

- with one absent value and one "off", the absent value should be "on"

- with two absent values, the absent values can be both left absent

The patch works like this:

>>  static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress 
>> *addr)  {
>> -bool ipv4 = addr->ipv4 || !addr->has_ipv4;
>> -bool ipv6 = addr->ipv6 || !addr->has_ipv6;
>> +bool ipv4 = addr->has_ipv4 && addr->ipv4;
>> +bool ipv6 = addr->has_ipv6 && addr->ipv6;
>>  
>> -if (!ipv4 || !ipv6) {
>> +if (ipv4 || ipv6) {
>>  qemu_opt_set_bool(opts, "ipv4", ipv4, _abort);
>>  qemu_opt_set_bool(opts, "ipv6", ipv6, _abort);

This handles the case where the absent value should be "off": at least
one value is present _and_ true.

>> +} else if (addr->has_ipv4 || addr->has_ipv6) {
>> +qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort);
>> +qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort);

This handles the case where the absent value should be "on".  We know
that whichever the present value is, it is false; therefore
!addr->has_xyz || addr->xyz simplifies to just !addr->has_xyz.

> I believe that the fix for the issue under discussion will be
> committed to qemu repo very soon,

Did you test the patch, and did it work for you?  If so, it is customary
to reply with a line like "Tested by: Sair, Umair ".

> so I'll like to add one more thing
> which requires to be fixed along with it. In 'tcp_chr_accept'
> function of qemu-char.c, the data type of saddr should be
> sockaddr_in6 so that it works with both IPv6 and IPv4 on Windows
> (works for linux without it because of accept4 and works with this
> solution as well!).

Can you send a patch for it?

Thanks!

Paolo



[Qemu-devel] [PULL 04/10] vfio: Remove unneeded union from VFIOContainer

2015-10-05 Thread Alex Williamson
From: David Gibson 

Currently the VFIOContainer iommu_data field contains a union with
different information for different host iommu types.  However:
   * It only actually contains information for the x86-like "Type1" iommu
   * Because we have a common listener the Type1 fields are actually used
on all IOMMU types, including the SPAPR TCE type as well

In fact we now have a general structure for the listener which is unlikely
to ever need per-iommu-type information, so this patch removes the union.

In a similar way we can unify the setup of the vfio memory listener in
vfio_connect_container() that is currently split across a switch on iommu
type, but is effectively the same in both cases.

The iommu_data.release pointer was only needed as a cleanup function
which would handle potentially different data in the union.  With the
union gone, it too can be removed.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c  |   52 +++--
 include/hw/vfio/vfio-common.h |   16 ++---
 2 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0d341a3..1545f62 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -315,8 +315,7 @@ out:
 static void vfio_listener_region_add(MemoryListener *listener,
  MemoryRegionSection *section)
 {
-VFIOContainer *container = container_of(listener, VFIOContainer,
-iommu_data.type1.listener);
+VFIOContainer *container = container_of(listener, VFIOContainer, listener);
 hwaddr iova, end;
 Int128 llend;
 void *vaddr;
@@ -406,9 +405,9 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
  * can gracefully fail.  Runtime, there's not much we can do other
  * than throw a hardware error.
  */
-if (!container->iommu_data.type1.initialized) {
-if (!container->iommu_data.type1.error) {
-container->iommu_data.type1.error = ret;
+if (!container->initialized) {
+if (!container->error) {
+container->error = ret;
 }
 } else {
 hw_error("vfio: DMA mapping failed, unable to continue");
@@ -419,8 +418,7 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 static void vfio_listener_region_del(MemoryListener *listener,
  MemoryRegionSection *section)
 {
-VFIOContainer *container = container_of(listener, VFIOContainer,
-iommu_data.type1.listener);
+VFIOContainer *container = container_of(listener, VFIOContainer, listener);
 hwaddr iova, end;
 int ret;
 
@@ -485,7 +483,7 @@ static const MemoryListener vfio_memory_listener = {
 
 static void vfio_listener_release(VFIOContainer *container)
 {
-memory_listener_unregister(>iommu_data.type1.listener);
+memory_listener_unregister(>listener);
 }
 
 int vfio_mmap_region(Object *obj, VFIORegion *region,
@@ -683,21 +681,6 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
-
-container->iommu_data.type1.listener = vfio_memory_listener;
-container->iommu_data.release = vfio_listener_release;
-
-memory_listener_register(>iommu_data.type1.listener,
- container->space->as);
-
-if (container->iommu_data.type1.error) {
-ret = container->iommu_data.type1.error;
-error_report("vfio: memory listener initialization failed for 
container");
-goto listener_release_exit;
-}
-
-container->iommu_data.type1.initialized = true;
-
 } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
 ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
 if (ret) {
@@ -723,19 +706,24 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
-
-container->iommu_data.type1.listener = vfio_memory_listener;
-container->iommu_data.release = vfio_listener_release;
-
-memory_listener_register(>iommu_data.type1.listener,
- container->space->as);
-
 } else {
 error_report("vfio: No available IOMMU models");
 ret = -EINVAL;
 goto free_container_exit;
 }
 
+container->listener = vfio_memory_listener;
+
+memory_listener_register(>listener, container->space->as);
+
+if (container->error) {
+ret = container->error;
+error_report("vfio: memory listener initialization failed for 
container");
+goto listener_release_exit;
+}
+
+

Re: [Qemu-devel] QEMU+Linux ARMv7A current state

2015-10-05 Thread Beniamino Galvani
On Mon, Oct 05, 2015 at 11:13:33AM -0400, John Snow wrote:
> I'm looking into the cubieboard now. Is our emulation based on any
> particular model? (1-4?)

The first model, the one with Allwinner A10.

> I'm trying to see if I can find anything that resembles a spec to see
> what kind of registers this SoC has for its SATA controller.

There is some documentation on the SoC here, but apparently nothing on
SATA:

http://dl.linux-sunxi.org/A10/

Beniamino



Re: [Qemu-devel] [Qemu-discuss] TCP options ipv4 and ipv6 have no effect

2015-10-05 Thread Paolo Bonzini


On 05/10/2015 20:03, Sair, Umair wrote:
>> The first if handles the "default to N" case, the second handles "default to 
>> Y", the (absent) else case handles "default to PF_UNSPEC".
> 
> Can you please elaborate it. Also I am not understanding the reason for 
> inverting the values of addr->has_ipv* in second if condition.

If a value is absent, you can either default it to Y (then the value is
!addr->has_xyz || addr->xyz) or default it to N (then the value is
addr->has_xyz && addr->xyz).

The desired logic is:

- with one absent value and one "on", the absent value should be "off"

- with one absent value and one "off", the absent value should be "on"

- with two absent values, the absent values can be both left absent

The patch works like this:

>>  static void inet_addr_to_opts(QemuOpts *opts, const InetSocketAddress 
>> *addr)  {
>> -bool ipv4 = addr->ipv4 || !addr->has_ipv4;
>> -bool ipv6 = addr->ipv6 || !addr->has_ipv6;
>> +bool ipv4 = addr->has_ipv4 && addr->ipv4;
>> +bool ipv6 = addr->has_ipv6 && addr->ipv6;
>>  
>> -if (!ipv4 || !ipv6) {
>> +if (ipv4 || ipv6) {
>>  qemu_opt_set_bool(opts, "ipv4", ipv4, _abort);
>>  qemu_opt_set_bool(opts, "ipv6", ipv6, _abort);

This handles the case where the absent value should be "off": at least
one value is present _and_ true.

>> +} else if (addr->has_ipv4 || addr->has_ipv6) {
>> +qemu_opt_set_bool(opts, "ipv4", !addr->has_ipv4, _abort);
>> +qemu_opt_set_bool(opts, "ipv6", !addr->has_ipv6, _abort);

This handles the case where the absent value should be "on".  We know
that whichever the present value is, it is false; therefore
!addr->has_xyz || addr->xyz simplifies to just !addr->has_xyz.

> I believe that the fix for the issue under discussion will be
> committed to qemu repo very soon,

Did you test the patch, and did it work for you?  If so, it is customary
to reply with a line like "Tested by: Sair, Umair ".

> so I'll like to add one more thing
> which requires to be fixed along with it. In 'tcp_chr_accept'
> function of qemu-char.c, the data type of saddr should be
> sockaddr_in6 so that it works with both IPv6 and IPv4 on Windows
> (works for linux without it because of accept4 and works with this
> solution as well!).

Can you send a patch for it?

Thanks!

Paolo



Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-10-05 Thread Paolo Bonzini


On 05/10/2015 20:47, Peter Maydell wrote:
> On 5 October 2015 at 19:11, Eduardo Habkost  wrote:
>> On Sun, Oct 04, 2015 at 07:44:29PM +0100, Peter Maydell wrote:
>>> On 17 September 2015 at 17:32, Paolo Bonzini  wrote:


 On 17/09/2015 18:16, Peter Maydell wrote:
> On 17 September 2015 at 17:00, Paolo Bonzini  wrote:
>>
>>
>> On 17/09/2015 16:24, Peter Maydell wrote:
>>> Can we revert this one, please? Checkpatch now warns about constructs
>>> like
>>>   typedef struct MyDevice {
>>>   DeviceState parent;
>>>
>>>   int reg0, reg1, reg2;
>>>   } MyDevice;
>>>
 I think it varies depending on the maintainer.  PPC, USB, SCSI, ACPI all
 use a separate typedef.  I'll prepare a revert.
>>>
>>> Ping on that revert patch? I can't find it onlist...
>>
>> The x86 pull request I sent today triggers this error, BTW. I hope it
>> won't make the pull request be automatically rejected.
> 
> Nope, because I don't run checkpatch on pull requests.

I've sent the patch today, by the way.  Sorry for the delay.

Paolo



Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Paolo Bonzini


On 05/10/2015 20:46, Peter Maydell wrote:
> The easy fix is to have some suitable ifdeffery in virtio-pci.c,
> similar to how we only register the virtio_9p_pci and virtio_scsi_pci

(vhost_scsi_pci)

> types if they've been configured into this build.

Hmm, actually there's no reason to limit

common-obj-$(CONFIG_VIRTIO) += virtio-input.o
common-obj-$(CONFIG_VIRTIO) += virtio-input-hid.o

to CONFIG_LINUX.  Does it work to extract them out of the if (which is
only correct for virtio-input-host.o)?

That said, the ifdeffery _is_ needed for TYPE_VIRTIO_INPUT_HOST_PCI.

Paolo



[Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg

2015-10-05 Thread Davorin Mista
Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable 
in ARMCPUState struct (os_lock_status).

Linux reads from this register during its suspend/resume procedure.

Signed-off-by: Davorin Mista 

---
Changed in v2:
-switched from using dummy registers to an actual register implementation
-implemented write function for OSLAR_EL1 sysreg
-added state variable to ARMCPUState struct

Signed-off-by: Davorin Mista 
---
 target-arm/cpu.h|  3 +++
 target-arm/helper.c | 16 +++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 5ea11a6..5aab654 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -500,6 +500,9 @@ typedef struct CPUARMState {
 uint32_t cregs[16];
 } iwmmxt;
 
+/* OS Lock Status: accessed via OSLAR/OSLSR registers */
+uint64_t os_lock_status;
+
 /* For mixed endian mode.  */
 bool bswap_code;
 
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 9d62c4c..a6fad7a 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 putchar(value);
 }
 
+/* write to os_lock_status state variable */
+static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+/* only bit 1 can be modified, register is always 0b10x0 */
+raw_write(env, ri, 8 + (value & 2));
+}
+
 static const ARMCPRegInfo debug_cp_reginfo[] = {
 /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
  * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
@@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
 /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
 { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
-  .access = PL1_W, .type = ARM_CP_NOP },
+  .access = PL1_W, .resetvalue = 10,
+  .fieldoffset = offsetof(CPUARMState, os_lock_status),
+  .writefn = oslsr_write },
+/* We define a dummy OSLSR_EL1, because Linux reads from it. */
+{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
+  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
+  .access = PL1_R,
+  .fieldoffset = offsetof(CPUARMState, os_lock_status) },
 /* Dummy OSDLR_EL1: 32-bit Linux will read this */
 { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
-- 
2.6.0




[Qemu-devel] [PATCH] Remove macros IO_READ_PROTO and IO_WRITE_PROTO

2015-10-05 Thread nutanshinde1992
---
 hw/audio/adlib.c  |  9 ++---
 hw/audio/es1370.c | 17 ++---
 hw/audio/gus.c|  9 ++---
 hw/audio/sb16.c   | 15 +--
 4 files changed, 15 insertions(+), 35 deletions(-)

diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c
index 656eb37..af39920 100644
--- a/hw/audio/adlib.c
+++ b/hw/audio/adlib.c
@@ -57,11 +57,6 @@ void YMF262UpdateOneQEMU (int which, INT16 *dst, int length);
 #define SHIFT 1
 #endif
 
-#define IO_READ_PROTO(name) \
-uint32_t name (void *opaque, uint32_t nport)
-#define IO_WRITE_PROTO(name) \
-void name (void *opaque, uint32_t nport, uint32_t val)
-
 #define TYPE_ADLIB "adlib"
 #define ADLIB(obj) OBJECT_CHECK(AdlibState, (obj), TYPE_ADLIB)
 
@@ -124,7 +119,7 @@ static void adlib_kill_timers (AdlibState *s)
 }
 }
 
-static IO_WRITE_PROTO (adlib_write)
+static void adlib_write (void *opaque, uint32_t nport, uint32_t val)
 {
 AdlibState *s = opaque;
 int a = nport & 3;
@@ -141,7 +136,7 @@ static IO_WRITE_PROTO (adlib_write)
 #endif
 }
 
-static IO_READ_PROTO (adlib_read)
+static uint32_t adlib_read (void *opaque, uint32_t nport)
 {
 AdlibState *s = opaque;
 uint8_t data;
diff --git a/hw/audio/es1370.c b/hw/audio/es1370.c
index 8e7bcf5..dfb7c79 100644
--- a/hw/audio/es1370.c
+++ b/hw/audio/es1370.c
@@ -157,11 +157,6 @@ static const unsigned dac1_samplerate[] = { 5512, 11025, 
22050, 44100 };
 #define DAC2_CHANNEL 1
 #define ADC_CHANNEL 2
 
-#define IO_READ_PROTO(n) \
-static uint32_t n (void *opaque, uint32_t addr)
-#define IO_WRITE_PROTO(n) \
-static void n (void *opaque, uint32_t addr, uint32_t val)
-
 static void es1370_dac1_callback (void *opaque, int free);
 static void es1370_dac2_callback (void *opaque, int free);
 static void es1370_adc_callback (void *opaque, int avail);
@@ -474,7 +469,7 @@ static inline uint32_t es1370_fixup (ES1370State *s, 
uint32_t addr)
 return addr;
 }
 
-IO_WRITE_PROTO (es1370_writeb)
+static void es1370_writeb (void *opaque, uint32_t addr, uint32_t val)
 {
 ES1370State *s = opaque;
 uint32_t shift, mask;
@@ -512,7 +507,7 @@ IO_WRITE_PROTO (es1370_writeb)
 }
 }
 
-IO_WRITE_PROTO (es1370_writew)
+static void es1370_writew (void *opaque, uint32_t addr, uint32_t val)
 {
 ES1370State *s = opaque;
 addr = es1370_fixup (s, addr);
@@ -549,7 +544,7 @@ IO_WRITE_PROTO (es1370_writew)
 }
 }
 
-IO_WRITE_PROTO (es1370_writel)
+static void es1370_writel (void *opaque, uint32_t addr, uint32_t val)
 {
 ES1370State *s = opaque;
 struct chan *d = >chan[0];
@@ -615,7 +610,7 @@ IO_WRITE_PROTO (es1370_writel)
 }
 }
 
-IO_READ_PROTO (es1370_readb)
+static uint32_t es1370_readb (void *opaque, uint32_t addr)
 {
 ES1370State *s = opaque;
 uint32_t val;
@@ -650,7 +645,7 @@ IO_READ_PROTO (es1370_readb)
 return val;
 }
 
-IO_READ_PROTO (es1370_readw)
+static uint32_t es1370_readw (void *opaque, uint32_t addr)
 {
 ES1370State *s = opaque;
 struct chan *d = >chan[0];
@@ -692,7 +687,7 @@ IO_READ_PROTO (es1370_readw)
 return val;
 }
 
-IO_READ_PROTO (es1370_readl)
+static uint32_t es1370_readl (void *opaque, uint32_t addr)
 {
 ES1370State *s = opaque;
 uint32_t val;
diff --git a/hw/audio/gus.c b/hw/audio/gus.c
index 86223a9..6ea1e09 100644
--- a/hw/audio/gus.c
+++ b/hw/audio/gus.c
@@ -41,11 +41,6 @@
 #define GUS_ENDIANNESS 0
 #endif
 
-#define IO_READ_PROTO(name) \
-static uint32_t name (void *opaque, uint32_t nport)
-#define IO_WRITE_PROTO(name) \
-static void name (void *opaque, uint32_t nport, uint32_t val)
-
 #define TYPE_GUS "gus"
 #define GUS(obj) OBJECT_CHECK (GUSState, (obj), TYPE_GUS)
 
@@ -64,14 +59,14 @@ typedef struct GUSState {
 qemu_irq pic;
 } GUSState;
 
-IO_READ_PROTO (gus_readb)
+static uint32_t gus_readb (void *opaque, uint32_t nport)
 {
 GUSState *s = opaque;
 
 return gus_read (>emu, nport, 1);
 }
 
-IO_WRITE_PROTO (gus_writeb)
+static void gus_writeb (void *opaque, uint32_t nport, uint32_t val)
 {
 GUSState *s = opaque;
 
diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c
index b052de5..381ef56 100644
--- a/hw/audio/sb16.c
+++ b/hw/audio/sb16.c
@@ -40,11 +40,6 @@
 #define ldebug(...)
 #endif
 
-#define IO_READ_PROTO(name) \
-uint32_t name (void *opaque, uint32_t nport)
-#define IO_WRITE_PROTO(name)\
-void name (void *opaque, uint32_t nport, uint32_t val)
-
 static const char e3[] = "COPYRIGHT (C) CREATIVE TECHNOLOGY LTD, 1992.";
 
 #define TYPE_SB16 "sb16"
@@ -881,7 +876,7 @@ static void reset (SB16State *s)
 legacy_reset (s);
 }
 
-static IO_WRITE_PROTO (dsp_write)
+static void dsp_write (void *opaque, uint32_t nport, uint32_t val)
 {
 SB16State *s = opaque;
 int iport;
@@ -959,7 +954,7 @@ static IO_WRITE_PROTO (dsp_write)
 }
 }
 
-static IO_READ_PROTO (dsp_read)
+static uint32_t dsp_read (void *opaque, uint32_t nport)
 {
 SB16State *s = opaque;
 int iport, retval, ack = 0;
@@ -1058,14 +1053,14 @@ static void reset_mixer 

[Qemu-devel] [Bug 1502884] Re: Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option!

2015-10-05 Thread Mikael
What I request seems to be the same option as x11vnc's "-norepeat",
http://linux.die.net/man/1/x11vnc

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1502884

Title:
  Super important feature req: QEMU VNC server: Introduce a keyboard
  "norepeat" option!

Status in QEMU:
  New

Bug description:
  Hi,

  A big issue when using QEMU's VNC server (VNC KVM) is that, when
  there's a network lag, unintended keypresses go through to the QEMU
  guest VM.

  This is frequently "enter" keypresses, causing all kinds of unintended
  consequences in the VM. So basically it's extremely dangerous.

  This is because the VNC protocol's keyboard interaction is implemented
  in terms of key down - key up events, making the server's keyboard
  autorepeat kick in when it should not.

  
  For this reason, it would be great if QEMU's VNC server part would be 
enhanced with an option such that when a VNC protocol key down is received, 
then locally that is treated as one single keypress only (I don't know how that 
should be implemented but I guess either as an immediate key down - key up 
sequence locally, or key down + key up after say 0.05 seconds), instead of 
waiting for the key up event from the VNC client.

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1502884/+subscriptions



[Qemu-devel] [Bug 1502884] [NEW] Super important feature req: QEMU VNC server: Introduce a keyboard "norepeat" option!

2015-10-05 Thread Mikael
Public bug reported:

Hi,

A big issue when using QEMU's VNC server (VNC KVM) is that, when there's
a network lag, unintended keypresses go through to the QEMU guest VM.

This is frequently "enter" keypresses, causing all kinds of unintended
consequences in the VM. So basically it's extremely dangerous.

This is because the VNC protocol's keyboard interaction is implemented
in terms of key down - key up events, making the server's keyboard
autorepeat kick in when it should not.


For this reason, it would be great if QEMU's VNC server part would be enhanced 
with an option such that when a VNC protocol key down is received, then locally 
that is treated as one single keypress only (I don't know how that should be 
implemented but I guess either as an immediate key down - key up sequence 
locally, or key down + key up after say 0.05 seconds), instead of waiting for 
the key up event from the VNC client.

Thanks!

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: extension keyboard kvm norepeat repeat server vnc

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1502884

Title:
  Super important feature req: QEMU VNC server: Introduce a keyboard
  "norepeat" option!

Status in QEMU:
  New

Bug description:
  Hi,

  A big issue when using QEMU's VNC server (VNC KVM) is that, when
  there's a network lag, unintended keypresses go through to the QEMU
  guest VM.

  This is frequently "enter" keypresses, causing all kinds of unintended
  consequences in the VM. So basically it's extremely dangerous.

  This is because the VNC protocol's keyboard interaction is implemented
  in terms of key down - key up events, making the server's keyboard
  autorepeat kick in when it should not.

  
  For this reason, it would be great if QEMU's VNC server part would be 
enhanced with an option such that when a VNC protocol key down is received, 
then locally that is treated as one single keypress only (I don't know how that 
should be implemented but I guess either as an immediate key down - key up 
sequence locally, or key down + key up after say 0.05 seconds), instead of 
waiting for the key up event from the VNC client.

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1502884/+subscriptions



[Qemu-devel] [PULL 01/10] hw/vfio/platform: irqfd setup sequence update

2015-10-05 Thread Alex Williamson
From: Eric Auger 

With current implementation, eventfd VFIO signaling is first set up and
then irqfd is setup, if supported and allowed.

This start sequence causes several issues with IRQ forwarding setup
which, if supported, is transparently attempted on irqfd setup:
IRQ forwarding setup is likely to fail if the IRQ is detected as under
injection into the guest (active at irqchip level or VFIO masked).

This currently always happens because the current sequence explicitly
VFIO-masks the IRQ before setting irqfd.

Even if that masking were removed, we couldn't prevent the case where
the IRQ is under injection into the guest.

So the simpler solution is to remove this 2-step startup and directly
attempt irqfd setup. This is what this patch does.

Also in case the eventfd setup fails, there is no reason to go farther:
let's abort.

Signed-off-by: Eric Auger 
Signed-off-by: Alex Williamson 
---
 hw/vfio/platform.c |   51 +--
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index a6726cd..d864342 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -310,18 +310,29 @@ static void vfio_platform_eoi(VFIODevice *vbasedev)
 /**
  * vfio_start_eventfd_injection - starts the virtual IRQ injection using
  * user-side handled eventfds
- * @intp: the IRQ struct pointer
+ * @sbdev: the sysbus device handle
+ * @irq: the qemu irq handle
  */
 
-static int vfio_start_eventfd_injection(VFIOINTp *intp)
+static void vfio_start_eventfd_injection(SysBusDevice *sbdev, qemu_irq irq)
 {
 int ret;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIOINTp *intp;
+
+QLIST_FOREACH(intp, >intp_list, next) {
+if (intp->qemuirq == irq) {
+break;
+}
+}
+assert(intp);
 
 ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
 if (ret) {
-error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+error_report("vfio: failed to start eventfd signaling for IRQ %d: %m",
+ intp->pin);
+abort();
 }
-return ret;
 }
 
 /*
@@ -359,6 +370,15 @@ static int vfio_set_resample_eventfd(VFIOINTp *intp)
 return ret;
 }
 
+/**
+ * vfio_start_irqfd_injection - starts the virtual IRQ injection using
+ * irqfd
+ *
+ * @sbdev: the sysbus device handle
+ * @irq: the qemu irq handle
+ *
+ * In case the irqfd setup fails, we fallback to userspace handled eventfd
+ */
 static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq)
 {
 VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
@@ -366,7 +386,7 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, 
qemu_irq irq)
 
 if (!kvm_irqfds_enabled() || !kvm_resamplefds_enabled() ||
 !vdev->irqfd_allowed) {
-return;
+goto fail_irqfd;
 }
 
 QLIST_FOREACH(intp, >intp_list, next) {
@@ -376,13 +396,6 @@ static void vfio_start_irqfd_injection(SysBusDevice 
*sbdev, qemu_irq irq)
 }
 assert(intp);
 
-/* Get to a known interrupt state */
-qemu_set_fd_handler(event_notifier_get_fd(>interrupt),
-NULL, NULL, vdev);
-
-vfio_mask_single_irqindex(>vbasedev, intp->pin);
-qemu_set_irq(intp->qemuirq, 0);
-
 if (kvm_irqchip_add_irqfd_notifier(kvm_state, >interrupt,
>unmask, irq) < 0) {
 goto fail_irqfd;
@@ -395,9 +408,6 @@ static void vfio_start_irqfd_injection(SysBusDevice *sbdev, 
qemu_irq irq)
 goto fail_vfio;
 }
 
-/* Let's resume injection with irqfd setup */
-vfio_unmask_single_irqindex(>vbasedev, intp->pin);
-
 intp->kvm_accel = true;
 
 trace_vfio_platform_start_irqfd_injection(intp->pin,
@@ -406,9 +416,11 @@ static void vfio_start_irqfd_injection(SysBusDevice 
*sbdev, qemu_irq irq)
 return;
 fail_vfio:
 kvm_irqchip_remove_irqfd_notifier(kvm_state, >interrupt, irq);
+error_report("vfio: failed to start eventfd signaling for IRQ %d: %m",
+ intp->pin);
+abort();
 fail_irqfd:
-vfio_start_eventfd_injection(intp);
-vfio_unmask_single_irqindex(>vbasedev, intp->pin);
+vfio_start_eventfd_injection(sbdev, irq);
 return;
 }
 
@@ -646,7 +658,6 @@ static void vfio_platform_realize(DeviceState *dev, Error 
**errp)
 VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
 SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
 VFIODevice *vbasedev = >vbasedev;
-VFIOINTp *intp;
 int i, ret;
 
 vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
@@ -665,10 +676,6 @@ static void vfio_platform_realize(DeviceState *dev, Error 
**errp)
 vfio_map_region(vdev, i);
 sysbus_init_mmio(sbdev, >regions[i]->mem);
 }
-
-QLIST_FOREACH(intp, >intp_list, next) {
-vfio_start_eventfd_injection(intp);
-}
 }
 
 static const VMStateDescription vfio_platform_vmstate = {




[Qemu-devel] [PULL 05/10] vfio: Generalize vfio_listener_region_add failure path

2015-10-05 Thread Alex Williamson
From: David Gibson 

If a DMA mapping operation fails in vfio_listener_region_add() it
checks to see if we've already completed initial setup of the
container.  If so it reports an error so the setup code can fail
gracefully, otherwise throws a hw_error().

There are other potential failure cases in vfio_listener_region_add()
which could benefit from the same logic, so move it to its own
fail: block.  Later patches can use this to extend other failure cases
to fail as gracefully as possible under the circumstances.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Laurent Vivier 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c |   26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 1545f62..95a4850 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -399,19 +399,23 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
  "0x%"HWADDR_PRIx", %p) = %d (%m)",
  container, iova, end - iova, vaddr, ret);
+goto fail;
+}
 
-/*
- * On the initfn path, store the first error in the container so we
- * can gracefully fail.  Runtime, there's not much we can do other
- * than throw a hardware error.
- */
-if (!container->initialized) {
-if (!container->error) {
-container->error = ret;
-}
-} else {
-hw_error("vfio: DMA mapping failed, unable to continue");
+return;
+
+fail:
+/*
+ * On the initfn path, store the first error in the container so we
+ * can gracefully fail.  Runtime, there's not much we can do other
+ * than throw a hardware error.
+ */
+if (!container->initialized) {
+if (!container->error) {
+container->error = ret;
 }
+} else {
+hw_error("vfio: DMA mapping failed, unable to continue");
 }
 }
 




[Qemu-devel] [PULL 03/10] hw/vfio/platform: do not set resamplefd for edge-sensitive IRQS

2015-10-05 Thread Alex Williamson
From: Eric Auger 

In irqfd mode, current code attempts to set a resamplefd whatever
the type of the IRQ. For an edge-sensitive IRQ this attempt fails
and as a consequence, the whole irqfd setup fails and we fall back
to the slow mode. This patch bypasses the resamplefd setting for
non level-sentive IRQs.

Signed-off-by: Eric Auger 
Signed-off-by: Alex Williamson 
---
 hw/vfio/platform.c |   42 +++---
 trace-events   |4 +++-
 2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index cab1517..5c1156c 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -32,6 +32,11 @@
  * Functions used whatever the injection method
  */
 
+static inline bool vfio_irq_is_automasked(VFIOINTp *intp)
+{
+return intp->flags & VFIO_IRQ_INFO_AUTOMASKED;
+}
+
 /**
  * vfio_init_intp - allocate, initialize the IRQ struct pointer
  * and add it into the list of IRQs
@@ -65,15 +70,17 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
 error_report("vfio: Error: trigger event_notifier_init failed ");
 return NULL;
 }
-/* Get an eventfd for resample/unmask */
-intp->unmask = g_malloc0(sizeof(EventNotifier));
-ret = event_notifier_init(intp->unmask, 0);
-if (ret) {
-g_free(intp->interrupt);
-g_free(intp->unmask);
-g_free(intp);
-error_report("vfio: Error: resamplefd event_notifier_init failed");
-return NULL;
+if (vfio_irq_is_automasked(intp)) {
+/* Get an eventfd for resample/unmask */
+intp->unmask = g_malloc0(sizeof(EventNotifier));
+ret = event_notifier_init(intp->unmask, 0);
+if (ret) {
+g_free(intp->interrupt);
+g_free(intp->unmask);
+g_free(intp);
+error_report("vfio: Error: resamplefd event_notifier_init failed");
+return NULL;
+}
 }
 
 QLIST_INSERT_HEAD(>intp_list, intp, next);
@@ -294,7 +301,7 @@ static void vfio_platform_eoi(VFIODevice *vbasedev)
 /* deassert the virtual IRQ */
 qemu_set_irq(intp->qemuirq, 0);
 
-if (intp->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+if (vfio_irq_is_automasked(intp)) {
 /* unmasks the physical level-sensitive IRQ */
 vfio_unmask_single_irqindex(vbasedev, intp->pin);
 }
@@ -409,15 +416,20 @@ static void vfio_start_irqfd_injection(SysBusDevice 
*sbdev, qemu_irq irq)
 if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
 goto fail_vfio;
 }
-if (vfio_set_resample_eventfd(intp) < 0) {
-goto fail_vfio;
+if (vfio_irq_is_automasked(intp)) {
+if (vfio_set_resample_eventfd(intp) < 0) {
+goto fail_vfio;
+}
+trace_vfio_platform_start_level_irqfd_injection(intp->pin,
+event_notifier_get_fd(intp->interrupt),
+event_notifier_get_fd(intp->unmask));
+} else {
+trace_vfio_platform_start_edge_irqfd_injection(intp->pin,
+event_notifier_get_fd(intp->interrupt));
 }
 
 intp->kvm_accel = true;
 
-trace_vfio_platform_start_irqfd_injection(intp->pin,
- event_notifier_get_fd(intp->interrupt),
- event_notifier_get_fd(intp->unmask));
 return;
 fail_vfio:
 kvm_irqchip_remove_irqfd_notifier(kvm_state, intp->interrupt, irq);
diff --git a/trace-events b/trace-events
index 36db793..6790292 100644
--- a/trace-events
+++ b/trace-events
@@ -1624,7 +1624,9 @@ vfio_platform_intp_interrupt(int pin, int fd) "Inject IRQ 
#%d (fd = %d)"
 vfio_platform_intp_inject_pending_lockheld(int pin, int fd) "Inject pending 
IRQ #%d (fd = %d)"
 vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index 
%d: count %d, flags=0x%x"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
-vfio_platform_start_irqfd_injection(int index, int fd, int resamplefd) "IRQ 
index=%d, fd = %d, resamplefd = %d"
+vfio_platform_start_level_irqfd_injection(int index, int fd, int resamplefd) 
"IRQ index=%d, fd = %d, resamplefd = %d"
+vfio_platform_start_edge_irqfd_injection(int index, int fd) "IRQ index=%d, fd 
= %d"
+
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32




Re: [Qemu-devel] [PATCH] Remove macros IO_READ_PROTO and IO_WRITE_PROTO

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 20:06, nutanshinde1992  wrote:
> ---
>  hw/audio/adlib.c  |  9 ++---
>  hw/audio/es1370.c | 17 ++---
>  hw/audio/gus.c|  9 ++---
>  hw/audio/sb16.c   | 15 +--
>  4 files changed, 15 insertions(+), 35 deletions(-)

Hi; thanks for sending us this patch. Codewise the changes
look good; there's just a minor coding style nit, and a process
issue you need to deal with.

The process issue is that your git commit message has to
end with a 'Signed-off-by: Your Name ' line.
(If you look in the git commit logs at other peoples' changes you'll
see what I mean.)

This is you certifying to us that you wrote the patch and are
willing to submit it to us under the terms of QEMU's license.
(For a full statement of what you're signing to say, look here:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches?id=f6f94e2ab1b33f0082ac22d71f66385a60d8157f#n297
 -- we use the same idea as the Linux kernel for this.)

> diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c
> index 656eb37..af39920 100644
> --- a/hw/audio/adlib.c
> +++ b/hw/audio/adlib.c
> @@ -57,11 +57,6 @@ void YMF262UpdateOneQEMU (int which, INT16 *dst, int 
> length);
>  #define SHIFT 1
>  #endif
>
> -#define IO_READ_PROTO(name) \
> -uint32_t name (void *opaque, uint32_t nport)
> -#define IO_WRITE_PROTO(name) \
> -void name (void *opaque, uint32_t nport, uint32_t val)
> -
>  #define TYPE_ADLIB "adlib"
>  #define ADLIB(obj) OBJECT_CHECK(AdlibState, (obj), TYPE_ADLIB)
>
> @@ -124,7 +119,7 @@ static void adlib_kill_timers (AdlibState *s)
>  }
>  }
>
> -static IO_WRITE_PROTO (adlib_write)
> +static void adlib_write (void *opaque, uint32_t nport, uint32_t val)

The style issue is that all these functions should not have a
space between the function name and the '('.

If you could remove the stray spaces, and then send us the
fixed patch again with your signoff line in the commit message,
that would be great. (Make sure the subject line says "[PATCH v2]"
so we know it's the revised version.)

thanks
-- PMM



[Qemu-devel] [PULL 00/10] VFIO updates for 2015-10-05

2015-10-05 Thread Alex Williamson
The following changes since commit c0b520dfb8890294a9f8879f4759172900585995:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2015-10-02 16:59:21 +0100)

are available in the git repository at:


  git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20151005.0

for you to fetch changes up to 727299697dd31f0e1ccecc7eab1bf658e8ed3079:

  vfio: Expose a VFIO PCI device's group for EEH (2015-10-05 12:40:13 -0600)


VFIO updates 2015-10-05

 - Change platform device IRQ setup sequence for compatibility
   with upcoming IRQ forwarding (Eric Auger)
 - Extensions to support vfio-pci devices on spapr-pci-host-bridge
   (David Gibson)


David Gibson (7):
  vfio: Remove unneeded union from VFIOContainer
  vfio: Generalize vfio_listener_region_add failure path
  vfio: Check guest IOVA ranges against host IOMMU capabilities
  vfio: Record host IOMMU's available IO page sizes
  memory: Allow replay of IOMMU mapping notifications
  vfio: Allow hotplug of containers onto existing guest IOMMU mappings
  vfio: Expose a VFIO PCI device's group for EEH

Eric Auger (3):
  hw/vfio/platform: irqfd setup sequence update
  hw/vfio/platform: change interrupt/unmask fields into pointer
  hw/vfio/platform: do not set resamplefd for edge-sensitive IRQS

 hw/vfio/common.c| 140 
 hw/vfio/pci.c   |  14 
 hw/vfio/platform.c  | 116 -
 include/exec/memory.h   |  13 
 include/hw/vfio/vfio-common.h   |  23 +++
 include/hw/vfio/vfio-pci.h  |  11 
 include/hw/vfio/vfio-platform.h |   4 +-
 memory.c|  20 ++
 trace-events|   4 +-
 9 files changed, 229 insertions(+), 116 deletions(-)
 create mode 100644 include/hw/vfio/vfio-pci.h



[Qemu-devel] [PULL 07/10] vfio: Record host IOMMU's available IO page sizes

2015-10-05 Thread Alex Williamson
From: David Gibson 

Depending on the host IOMMU type we determine and record the available page
sizes for IOMMU translation.  We'll need this for other validation in
future patches.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Laurent Vivier 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c  |   13 +
 include/hw/vfio/vfio-common.h |1 +
 2 files changed, 14 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 2faf492..f666de2 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -677,6 +677,7 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) ||
 ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
 bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU);
+struct vfio_iommu_type1_info info;
 
 ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
 if (ret) {
@@ -702,6 +703,15 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
  */
 container->min_iova = 0;
 container->max_iova = (hwaddr)-1;
+
+/* Assume just 4K IOVA page size */
+container->iova_pgsizes = 0x1000;
+info.argsz = sizeof(info);
+ret = ioctl(fd, VFIO_IOMMU_GET_INFO, );
+/* Ignore errors */
+if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
+container->iova_pgsizes = info.iova_pgsizes;
+}
 } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
 struct vfio_iommu_spapr_tce_info info;
 
@@ -744,6 +754,9 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 }
 container->min_iova = info.dma32_window_start;
 container->max_iova = container->min_iova + info.dma32_window_size - 1;
+
+/* Assume just 4K IOVA pages for now */
+container->iova_pgsizes = 0x1000;
 } else {
 error_report("vfio: No available IOMMU models");
 ret = -EINVAL;
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 27a14c0..f037f3c 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -71,6 +71,7 @@ typedef struct VFIOContainer {
  * future
  */
 hwaddr min_iova, max_iova;
+uint64_t iova_pgsizes;
 QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
 QLIST_HEAD(, VFIOGroup) group_list;
 QLIST_ENTRY(VFIOContainer) next;




[Qemu-devel] [PULL 09/10] vfio: Allow hotplug of containers onto existing guest IOMMU mappings

2015-10-05 Thread Alex Williamson
From: David Gibson 

At present the memory listener used by vfio to keep host IOMMU mappings
in sync with the guest memory image assumes that if a guest IOMMU
appears, then it has no existing mappings.

This may not be true if a VFIO device is hotplugged onto a guest bus
which didn't previously include a VFIO device, and which has existing
guest IOMMU mappings.

Therefore, use the memory_region_register_iommu_notifier_replay()
function in order to fix this case, replaying existing guest IOMMU
mappings, bringing the host IOMMU into sync with the guest IOMMU.

Signed-off-by: David Gibson 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c |   23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f666de2..6797208 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -312,6 +312,11 @@ out:
 rcu_read_unlock();
 }
 
+static hwaddr vfio_container_granularity(VFIOContainer *container)
+{
+return (hwaddr)1 << ctz64(container->iova_pgsizes);
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
  MemoryRegionSection *section)
 {
@@ -369,26 +374,16 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
  * would be the right place to wire that up (tell the KVM
  * device emulation the VFIO iommu handles to use).
  */
-/*
- * This assumes that the guest IOMMU is empty of
- * mappings at this point.
- *
- * One way of doing this is:
- * 1. Avoid sharing IOMMUs between emulated devices or different
- * IOMMU groups.
- * 2. Implement VFIO_IOMMU_ENABLE in the host kernel to fail if
- * there are some mappings in IOMMU.
- *
- * VFIO on SPAPR does that. Other IOMMU models may do that different,
- * they must make sure there are no existing mappings or
- * loop through existing mappings to map them into VFIO.
- */
 giommu = g_malloc0(sizeof(*giommu));
 giommu->iommu = section->mr;
 giommu->container = container;
 giommu->n.notify = vfio_iommu_map_notify;
 QLIST_INSERT_HEAD(>giommu_list, giommu, giommu_next);
+
 memory_region_register_iommu_notifier(giommu->iommu, >n);
+memory_region_iommu_replay(giommu->iommu, >n,
+   vfio_container_granularity(container),
+   false);
 
 return;
 }




Re: [Qemu-devel] [PATCH v5 2/2] Use help sub-sections to create sub-help options

2015-10-05 Thread Michael S. Tsirkin
On Mon, Sep 14, 2015 at 08:01:20PM +0200, Laurent Vivier wrote:
> As '-help' output is 400 lines long it is not easy
> to find information, but generally we know from
> which area we want the information.

I agree - a bit more order is IMHO needed in the general
-help - I'm not sure this is the place to start.
E.g. it is not your fault that it lists e.g. a ton of random
stuff as Standard options but this becomes more of
a problem if you hide things behind sub-sections.

For example, how do I change the amount of memory?
Currently I do --help and search for memory.


This also diverges from the way one queries other help
(-device foo,help, -cpu help  etc).

How about we move detailed help into each option instead?
This will mean only 113 lines which is a lot, but
kind of reasonable.

-machine [type=]name[,prop[=value][,...]] select emulated machine ('-machine 
help' for help)
-cpu cpuselect CPU ('-cpu help' for list)
-smp [cpus=]n[,prop[=value][,...]] configure SMP ('-smp help' for help)
-numa node[,prop[=value][,...]] configure NUMA ('-numa help' for help)


Of course it's also true that often the help says
nothing useful at all.


> As sections already exist in the help description,
> add some options to only display the wanted section.
> 
> '-help' now can take an optional parameter of the
> form -help[=LIST], which is a comma separated list
> of sections to display:
> 
> all   display all help sections
> help  display help options
> standard  display standard options

What does standard options mean?

> block display block options
> usb   display usb options
> display   display display options
> machine   display machine options
> network   display network options
> character display character options

Character device options?

> url   display url options

what are these?

> btdisplay bt options

bluetooth options?

> tpm   display tpm options
> kerneldisplay kernel options
> expertdisplay expert options
> objectdisplay object options

what are these?

> 
> '-help' without option displays only help options.
> 
> Example:
> 
> $ x86_64-softmmu/qemu-system-x86_64 -help=kernel,usb
> QEMU emulator version 2.4.50, Copyright (c) 2003-2008 Fabrice Bellard
> usage: qemu-system-x86_64 [options] [disk_image]
> 
> 'disk_image' is a raw hard disk image for IDE hard disk 0
> 
> Linux/Multiboot boot specific:
> -kernel bzImage use 'bzImage' as kernel image
> -append cmdline use 'cmdline' as kernel command line
> -initrd fileuse 'file' as initial ram disk
> -dtbfileuse 'file' as device tree image
> 
> USB options:
> -usbenable the USB driver (will be the default soon)
> -usbdevice name add the host or guest USB device 'name'
> 
> Signed-off-by: Laurent Vivier 

Maybe a good place to start is rewriting our man page.
We can then distill that to a minimum that makes sense
in -help.


-- 
MST



Re: [Qemu-devel] [PATCH v8 04/54] Move configuration section writing

2015-10-05 Thread Amit Shah
On (Tue) 29 Sep 2015 [09:37:28], Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> The vmstate_configuration is currently written
> in 'qemu_savevm_state_begin', move it to
> 'qemu_savevm_state_header' since it's got a hard
> requirement that it must be the 1st thing after
> the header.
> (In postcopy some 'command' sections get sent
> early before the saving of the main sections
> and hence before qemu_savevm_state_begin).
> 
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Amit Shah 

The function name 'savevm_state_header()' isn't accurate anymore.  Not
serious for this series.

Amit



Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Markus Armbruster
Peter Maydell  writes:

> On 2 October 2015 at 18:20, Markus Armbruster  wrote:
>> QMP command device-list-properties regressed in 2.1: it can crash or
>> leave dangling pointers behind.
>>
>> -device FOO,help regressed in 2.2: it no longer works for
>> non-pluggable devices.  I tried to fix that some time ago[*], but my
>> fix failed review.  This is my second, more comprehensive try.
>>
>> PATCH 1-3 fix one class of bugs involved in the regressions, PATCH 4-5
>> are libqtest preliminaries, PATCH 6 adds tests to demonstrate the
>> remaining bugs, PATCH 7-9 fix them to a degree (see PATCH 8 for
>> limitations), and PATCH 10 cleans up.
>
> This ordering breaks bisection of 'make check', as I found out when
> I tried to figure out which of the patches in this pull was causing
> an OSX test failure. Please can you reorder them so that 'make check'
> works at all points in the series?

My ordering may be bad (and I'll recheck it, of course), or it may
temporarily expose a hidden bug.  I better figure out what's going on
here.

>> The following changes since commit ff770b07f34d28b79013a83989bd6c85f8f16b2f:
>>
>>   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into 
>> staging (2015-10-02 11:01:18 +0100)
>>
>> are available in the git repository at:
>>
>>   git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2015-10-02
>>
>> for you to fetch changes up to e927162a6fa2fa6144de9d1d11cc9448a2143671:
>>
>>   Revert "qdev: Use qdev_get_device_class() for -device ,help" 
>> (2015-10-02 16:45:53 +0200)
>>
>> 
>> Fix device introspection regressions
>>
>> 
>
> 'make check' failure on OSX:
>
>   /aarch64/device/introspect/list: OK
>   /aarch64/device/introspect/none: OK
>   /aarch64/device/introspect/abstract: OK
>   /aarch64/device/introspect/concrete: **
> ERROR:/Users/pm215/src/qemu-for-merges/qom/object.c:333:void
> object_initialize_with_type(void *, size_t, TypeImpl *): assertion
> failed: (type != NULL)
> Broken pipe
> FAIL
>
> I have no idea why this only failed on OSX...

Can you re-run this with valgrind spliced in?

I use something like

$ QTEST_QEMU_BINARY="valgrind --vgdb-error=1 --log-file=vg.log 
qemu-system-aarch64" QTEST_QEMU_IMG=qemu-img 
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k --verbose 
-m=quick  tests/device-introspect-test

> Backtrace:
[Confusing due to inlining and other optimizations; snipped for now...]



Re: [Qemu-devel] [PATCH 1/3] Target-microblaze: Remove unnecessary variable

2015-10-05 Thread Michael Tokarev
05.10.2015 08:18, Markus Armbruster wrote:
> Michael Tokarev  writes:
> 
>> 25.09.2015 11:37, Shraddha Barke wrote:
>>> Compress lines and remove the variable .
>>
>> Applied to -trivial, removing this piece of commit message:
>>
>> ---
>>> Change made using Coccinelle script
[..snip..]
>> ---
> 
> Why?  I like having the semantic patch in the commit message when
> there's any chance we'll want do the same mechanical change again later.
> 
> You could save space and include it by reference, though: "Same
> Coccinelle semantic patch as is commit 74c373e".

git commit messages aren't good documentation for various scripts
like this, this info will be lost in the noize.  If it might be
better to keep such scripts in a separate file where it is easier
to find, or in a wiki page on the site. The key point is where to
find the info, git log is difficult for that, especially when you
don't know what to search for or that such a script exists in
there in the first place.

On the other hand, when git log is cluttered by such a long messages
for such small changes, it becomes more difficult to find info which
you really look in git log -- namely, which changes were made that
might have introduced this regression, things like that.

So to me, the shorter and cleaner the commit message is, the better.

BTW, I've no idea why this email has been Cc'd to kvm@vger :)

Thanks,

/mjt



Re: [Qemu-devel] [PATCH 21/36] misc: spelling

2015-10-05 Thread Michael Tokarev
05.10.2015 08:09, Markus Armbruster пишет:
> Michael Tokarev  writes:
> 
>> 25.09.2015 19:08, Eric Blake wrote:
>>> On 09/25/2015 08:03 AM, marcandre.lur...@redhat.com wrote:
 From: Marc-André Lureau 
[]
>>> Reviewed-by: Eric Blake 
>>
>> Note there's no S-o-b line in the original patch (whole series,
>> looks like).  Hopefully it is okay for such a really trivial
>> patch :)
>>
>> Applied, thanks!
> 
> It may be legally safe, but do we really want to engage in judging
> whether patches are copyrightable or not?  Besides, it sets a bad
> example.

Sometimes I question our own sanity.  Even for a trivial spelling fix
we require significantly more beaurocracy(sp) than the fix is worth,
and want formal rules instead of using common sense. This is a common
trend in the world, to formalize everything instead of thinking, the
world is becoming "candy".  This reminded me an old movie, "Demolition
Man", -- the cops in the future reads instructions about what to do
in each situation they happened to come.  But oh well, no one want to
take responsibility, that's okay ;)

Sorry for somewhat non-technical answer, I'll revert this patch,
waiting for more beaurocracy.

> Marc-André, please repost your patches ready for -trivial with your
> S-o-B, cc: qemu-trivial.

Mind you, it was a large series, with wasn't intended for -trivial
at all.  That's more rules and more beaurocracy.  And many other
patches in that series didn't have s-o-b line too.

Thanks,

/mjt




Re: [Qemu-devel] [PATCH 1/3] Target-microblaze: Remove unnecessary variable

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 08:18, Michael Tokarev  wrote:
> 05.10.2015 08:18, Markus Armbruster wrote:
>> Why?  I like having the semantic patch in the commit message when
>> there's any chance we'll want do the same mechanical change again later.
>>
>> You could save space and include it by reference, though: "Same
>> Coccinelle semantic patch as is commit 74c373e".
>
> git commit messages aren't good documentation for various scripts
> like this, this info will be lost in the noize.  If it might be
> better to keep such scripts in a separate file where it is easier
> to find, or in a wiki page on the site. The key point is where to
> find the info, git log is difficult for that, especially when you
> don't know what to search for or that such a script exists in
> there in the first place.
>
> On the other hand, when git log is cluttered by such a long messages
> for such small changes, it becomes more difficult to find info which
> you really look in git log -- namely, which changes were made that
> might have introduced this regression, things like that.

I think it can be useful when you're looking at a commit
to know that it was automatically created, especially if
it's a big commit. It means that if you're looking for
a bug in it you can concentrate on the script that created
it rather than the possibly large set of changes it produced,
or if you're trying to cherry-pick it into another branch you
can just apply the script instead.

In a commit with a change this small it's not very significant
either way, though.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v5 20/48] ivshmem: simplify a bit the code

2015-10-05 Thread Claudio Fontana
On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Use some more explicit variables to simplify the code.
> 
> Signed-off-by: Marc-André Lureau 

Reviewed-by: Claudio Fontana 

> ---
>  hw/misc/ivshmem.c | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index 6ee4881..c054e52 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -488,9 +488,10 @@ static void ivshmem_read(void *opaque, const uint8_t 
> *buf, int size)
>  {
>  IVShmemState *s = opaque;
>  int incoming_fd;
> -int guest_max_eventfd;
> +int new_eventfd;
>  long incoming_posn;
>  Error *err = NULL;
> +Peer *peer;
>  
>  if (!fifo_update_and_get(s, buf, size,
>   _posn, sizeof(incoming_posn))) {
> @@ -517,6 +518,8 @@ static void ivshmem_read(void *opaque, const uint8_t 
> *buf, int size)
>  }
>  }
>  
> +peer = >peers[incoming_posn];
> +
>  if (incoming_fd == -1) {
>  /* if posn is positive and unseen before then this is our posn*/
>  if (incoming_posn >= 0 && s->vm_id == -1) {
> @@ -564,27 +567,24 @@ static void ivshmem_read(void *opaque, const uint8_t 
> *buf, int size)
>  return;
>  }
>  
> -/* each guest has an array of eventfds, and we keep track of how many
> - * guests for each VM */
> -guest_max_eventfd = s->peers[incoming_posn].nb_eventfds;
> +/* each peer has an associated array of eventfds, and we keep
> + * track of how many eventfds received so far */
> +/* get a new eventfd: */
> +new_eventfd = peer->nb_eventfds++;
>  
>  /* this is an eventfd for a particular guest VM */
>  IVSHMEM_DPRINTF("eventfds[%ld][%d] = %d\n", incoming_posn,
> -guest_max_eventfd, incoming_fd);
> -
> event_notifier_init_fd(>peers[incoming_posn].eventfds[guest_max_eventfd],
> -   incoming_fd);
> -
> -/* increment count for particular guest */
> -s->peers[incoming_posn].nb_eventfds++;
> +new_eventfd, incoming_fd);
> +event_notifier_init_fd(>eventfds[new_eventfd], incoming_fd);
>  
>  if (incoming_posn == s->vm_id) {
> -s->eventfd_chr[guest_max_eventfd] = create_eventfd_chr_device(s,
> -   >peers[s->vm_id].eventfds[guest_max_eventfd],
> -   guest_max_eventfd);
> +s->eventfd_chr[new_eventfd] = create_eventfd_chr_device(s,
> +   >peers[s->vm_id].eventfds[new_eventfd],
> +   new_eventfd);
>  }
>  
>  if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> -ivshmem_add_eventfd(s, incoming_posn, guest_max_eventfd);
> +ivshmem_add_eventfd(s, incoming_posn, new_eventfd);
>  }
>  }
>  
> 


-- 
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München




[Qemu-devel] implement smbios support for mach-virt: triggers usual QEMU makefile bug

2015-10-05 Thread Claudio Fontana
Hi Peter,

The commit "smbios: implement smbios support for mach-virt" seems to cause the 
usual problem in QEMU's makefiles to trigger:

hw/arm/virt.c:892: undefined reference to `smbios_set_defaults'
hw/arm/virt.c:895: undefined reference to `smbios_get_tables'

This is IIRC the consequence of adding CONFIG_SMBIOS=y to 
default-configs/rm-softmmu.mak,
which is not picked up by the build system until a clean of the working tree 
has been done, right?

This is worked around by

$ git clean -d -x -f

followed by reconfigure/rebuild. Just wanted to mention this on the list in 
case someone is looking to fix this longstanding issue..

Ciao, thanks

Claudio



Re: [Qemu-devel] [PATCH RFC 0/3] Checkpoint-assisted migration proposal

2015-10-05 Thread Thomas Knauth
Hi Amit,

On Tue, Sep 15, 2015 at 12:39 PM, Amit Shah  wrote:
> Could you please include a file in the docs/ directory that documents
> how this works, so it's easier to comment on the general idea?

sure, we will add this.

> From 'checkpointing', I was afraid this was going to use some
> checkpoint-restore framework, but instead it's a new checkpointing
> method that you're adding to qemu.
>
> Can you describe when checkpoints are taken, and what is checkpointed?
> How is it stored on the disk?

Checkpoints are taken after a migration (at the source). If a
checkpoint exists at the destination, the VM's state is reconstructed
from the local checkpoint as well as updated pages from the source.
This checkpoint-assisted migration can be faster, if network is the
bottleneck, and saves network bandwidth.

We can, in principle, reuse the existing checkpoint format of QEMU.
The current implementation writes its own checkpoint because it was
less effort on our side. We write the VM's main memory into a single
file.

> I'm sure the patches have all the details, but it's easier to check
> the soundness of the idea if there's a high-level doc that explains
> this, and then we can discuss the finer points over patches.

We've recently published a paper about the general idea and expected
benefits for a number of workloads (
http://se.inf.tu-dresden.de/pubs/papers/knauth2015vecycle.pdf )

> Overall, I think this approach can benefit some workloads, and since
> it's not affecting a lot of common code, we could look at adding it.
>
> Also, apologies for not getting to this earlier.

Kind regards,
Thomas.



[Qemu-devel] [RFC v0 0/2] Enforce gaps between DIMMs

2015-10-05 Thread Bharata B Rao
The suggested way to work around the virtio bug reported here

http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html

is to introduce gaps between DIMMs. Igor's patchset changes the pc-dimm
auto-address assignment to introduce gaps and ues the same from pc memhp.
This patchset does the same for sPAPR PowerPC.

Before introducing the gap, ensure that memory hotplug region has enough
room for alignment adjustment. We accommodate a max alignment of 256MB for
each slot since sPAPR memory hotplug enforces an alignment requirement of
256MB on RAM size, maxmem and NUMA node mem sizes.

This applies on David's spapr-next branch + Igor's patchset applied.

This has been very lightly tested and intention is to get feedback
on the correctness aspect of this.

Bharata B Rao (2):
  spapr: Accommadate alignment gaps in hotplug memory region
  spapr: Force gaps between DIMM's GPA

 hw/ppc/spapr.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

-- 
2.1.0




Re: [Qemu-devel] [PATCH RFC 0/3] Checkpoint-assisted migration proposal

2015-10-05 Thread Amit Shah
On (Mon) 05 Oct 2015 [10:33:01], Thomas Knauth wrote:
> Hi Amit,
> 
> On Tue, Sep 15, 2015 at 12:39 PM, Amit Shah  wrote:
> > Could you please include a file in the docs/ directory that documents
> > how this works, so it's easier to comment on the general idea?
> 
> sure, we will add this.

Thanks!

> > From 'checkpointing', I was afraid this was going to use some
> > checkpoint-restore framework, but instead it's a new checkpointing
> > method that you're adding to qemu.
> >
> > Can you describe when checkpoints are taken, and what is checkpointed?
> > How is it stored on the disk?
> 
> Checkpoints are taken after a migration (at the source). If a
> checkpoint exists at the destination, the VM's state is reconstructed
> from the local checkpoint as well as updated pages from the source.
> This checkpoint-assisted migration can be faster, if network is the
> bottleneck, and saves network bandwidth.
> 
> We can, in principle, reuse the existing checkpoint format of QEMU.
> The current implementation writes its own checkpoint because it was
> less effort on our side. We write the VM's main memory into a single
> file.
> 
> > I'm sure the patches have all the details, but it's easier to check
> > the soundness of the idea if there's a high-level doc that explains
> > this, and then we can discuss the finer points over patches.
> 
> We've recently published a paper about the general idea and expected
> benefits for a number of workloads (
> http://se.inf.tu-dresden.de/pubs/papers/knauth2015vecycle.pdf )

I'll give it a look, thanks.  I'm interested in knowing what workloads
benefit.

There was one outstanding question from Dave about collisions:

https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01614.html

Can you please address that in your next submission?

> > Overall, I think this approach can benefit some workloads, and since
> > it's not affecting a lot of common code, we could look at adding it.
> >
> > Also, apologies for not getting to this earlier.
> 
> Kind regards,
> Thomas.

Thanks,
Amit



Re: [Qemu-devel] [PATCH v4] linux-user/syscall.c: malloc()/calloc() to g_malloc()/g_try_malloc()/g_new0()

2015-10-05 Thread Stefan Hajnoczi
On Mon, Oct 5, 2015 at 4:32 AM, Harmandeep Kaur
 wrote:
> Convert malloc()/calloc() calls to g_malloc()/g_try_malloc()/g_new0()
> in linux-user/syscall.c file
>
> Signed-off-by: Harmandeep Kaur 
> ---
> v1->v2  convert the free() call in host_to_target_semarray()
> to g_free() and calls g_try_malloc(count)  instead of
> g_try_malloc(sizeof(count))
>
> v2->v3 used g_try_new() and friends to avoid overflow issues
>
> v3->v4 use g_free for unlock_iovec() and host_to_target_semarray().
>
>  linux-user/syscall.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)

Reviewed-by: Stefan Hajnoczi 



Re: [Qemu-devel] [PATCH 1/3] block: prohibit migration during BlockJobs

2015-10-05 Thread Kevin Wolf
Am 02.10.2015 um 20:17 hat John Snow geschrieben:
> On 10/01/2015 02:03 PM, Paolo Bonzini wrote:
> > On 01/10/2015 18:34, John Snow wrote:
> >> Unless we can prove this to be safe for specific cases,
> >> the default should be to prohibit migration during BlockJobs.
> > 
> > Block jobs do not affect the current block, only other block device,
> > hence they *are* safe for migration.
> > 
> 
> Can you elaborate for me here?
> 
> > What you want, I think, is the target not to be garbage when migration
> > ends.  Based on this you can block specific cases, namely mirror which
> > you already do allow (patch 2) and backup except for sync='none'.
> > 
> > Paolo
> 
> It would be nice if the target wasn't garbage, yes :)
> 
> I allow mirror in specific circumstances -- you can't start a mirror,
> but if an existing mirror has hit the sync phase, that's OK.
> 
> I can try to do a more exhaustive audit of what should and should not
> work, but my thought was "guilty before proven innocent."

I think I've come to the conclusion that qemu can't even know in all
cases (particularly mirroring) because it depends on what the resulting
image is used for.

If we mirror in order to do a live storage migration, then obviously it
needs to run during migration. In this case, your restriction "only in
sync phase" seems to be right.

If we mirror for some other reason, the mirroring job should probably
continue running on the destination host. The management tool can set up
the destination accordingly, but if it doesn't, the job is silently
cancelled. The point is that qemu (even more on the source host) doesn't
know which case is true, so rejecting the migration seems wrong.

If we do commit or stream, the job is silently cancelled and must be
restarted on the destination host. It's the same case as above.

All (or most) of the restarting jobs on the destination doesn't come for
free, we usually repeat some useless work that the source host had
already done. That's not a reason to reject migrations, just cost for
the user to consider before they start migration.

So in the end, I'm not sure how much qemu can actually do here.

Kevin



Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox

2015-10-05 Thread Daniel P. Berrange
On Mon, Oct 05, 2015 at 07:20:58AM +0200, Markus Armbruster wrote:
> "Namsun Ch'o"  writes:
> 
> >> If we intend seccomp to protect against flaws during QEMU setup, then 
> >> having
> >> it earlier is neccessary. eg QEMU opening a corrupt qcow2 image which might
> >> exploit QEMU before the guest CPUs start.
> >
> >> If the latter is the case, then we could start with a relaxed seccomp
> >> sandbox which included the setuid/chroot features, and then switch to a
> >> more restricted one which blocked them before main_loop() runs.
> >
> > That's not possible. Seccomp will not be enforced until seccomp_load(ctx) is
> > called, after which no new changes to the filter can be made.
> 
> That's a pity.
> 
> As long as it's the case, we need to pick: either we protect against
> rogue guests, or against rogue images.  The original idea was the
> former, and it still makes the most sense to me.

Yep, protecting against rogue guests seems much more important.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [RFC v0 2/2] spapr: Force gaps between DIMM's GPA

2015-10-05 Thread Bharata B Rao
Mapping DIMMs non contiguously allows to workaround virtio bug reported
earlier:
http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
In this case guest kernel doesn't allocate buffers that can cross DIMM
boundary keeping each buffer local to a DIMM.

Suggested-by: Michael S. Tsirkin 
Signed-off-by: Bharata B Rao 
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 2ec509b..a8526e9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2127,7 +2127,7 @@ static void spapr_memory_plug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
 goto out;
 }
 
-pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, false, 
_err);
+pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, true, _err);
 if (local_err) {
 goto out;
 }
-- 
2.1.0




[Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region

2015-10-05 Thread Bharata B Rao
Size hotplug memory region assuming a 256MB max alignment every slot.

Signed-off-by: Bharata B Rao 
---
 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fc5e7d6..2ec509b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine)
 
 spapr->hotplug_memory.base = ROUND_UP(machine->ram_size,
   SPAPR_HOTPLUG_MEM_ALIGN);
+
+/* size hotplug region assuming 256M max alignment per slot */
+hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots;
 memory_region_init(>hotplug_memory.mr, OBJECT(spapr),
"hotplug-memory", hotplug_mem_size);
 memory_region_add_subregion(sysmem, spapr->hotplug_memory.base,
-- 
2.1.0




Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA

2015-10-05 Thread Bharata B Rao
On Mon, Sep 28, 2015 at 11:13:42AM +0200, Igor Mammedov wrote:
> On Mon, 28 Sep 2015 10:09:26 +0530
> Bharata B Rao  wrote:
> 
> > On Sun, Sep 27, 2015 at 04:04:06PM +0200, Igor Mammedov wrote:
> > > On Sun, 27 Sep 2015 16:11:02 +0300
> > > "Michael S. Tsirkin"  wrote:
> > > 
> > > > On Sun, Sep 27, 2015 at 03:06:24PM +0200, Igor Mammedov wrote:
> > > > > On Sun, 27 Sep 2015 13:48:21 +0300
> > > > > "Michael S. Tsirkin"  wrote:
> > > > > 
> > > > > > On Fri, Sep 25, 2015 at 03:53:12PM +0200, Igor Mammedov wrote:
> > > > > > > mapping DIMMs non contiguously allows to workaround
> > > > > > > virtio bug reported earlier:
> > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > > > > > > in this case guest kernel doesn't allocate buffers
> > > > > > > that can cross DIMM boundary keeping each buffer
> > > > > > > local to a DIMM.
> > > > > > > 
> > > > > > > Suggested-by: Michael S. Tsirkin 
> > > > > > > Signed-off-by: Igor Mammedov 
> > > > > > > ---
> > > > > > > benefit of this workaround is that no guest side
> > > > > > > changes are required.
> > > > > > 
> > > > > > That's a hard requirement, I agree.
> > > > > > 
> > > > > > 
> > > > > > > ---
> > > > > > >  hw/i386/pc.c | 4 +++-
> > > > > > >  hw/i386/pc_piix.c| 3 +++
> > > > > > >  hw/i386/pc_q35.c | 3 +++
> > > > > > >  include/hw/i386/pc.h | 2 ++
> > > > > > >  4 files changed, 11 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > Aren't other architectures besides PC ever affected?
> > > > > > Do they all allocate all of memory contigious in HVA space?
> > > > > I'm not sure about other targets I've CCed interested parties.
> > > > > 
> > > > > > 
> > > > > > Also - does the issue only affect hotplugged memory?
> > > > > Potentially it affects -numa memdev=foo, but however I've
> > > > > tried I wasn't able to reproduce.
> > > > > We could do it as
> > > > > separate workaround later if it would affect someone
> > > > > and virtio is not fixed to handle split buffers by that time.
> > > > > 
> > > > 
> > > > You can't reproduce a crash or you can't reproduce getting
> > > > contigious GPA with fragmented HVA?
> > > > If you can see fragmentation that's enough to assume guest crash
> > > > can be triggered, even if it doesn't with Linux.
> > > I'll check it.
> > > 
> > > > 
> > > > >  
> > > > > > Can't the patch be local to pc-dimm (except maybe the
> > > > > > backwards compatibility thing)?
> > > > > I think decision about using gaps and its size
> > > > > should be done by board and not generic pc-dimm.
> > > > > 
> > > > 
> > > > Well virtio is generic and can be used by all boards.
> > > Still pc-dimm.addr is not allocation is not part of pc-dimm
> > > device. it's just helper functions that happen to live in
> > > the same file source file.
> > > 
> > > But more importantly every target might have it's own
> > > notion how it partitions hotplug address space so making
> > > the same gap global might break them.
> > > 
> > > It's safer to enable gaps per target, I think ppc guys
> > > will make their own patch on top of this to taking
> > > in account their target specific and compat stuff.
> > 
> > I have never seen this issue that you mention at
> > 
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > 
> > in PowerPC. I have not been able to reproduce the QEMU crash with the
> > commandline suggested there.
> > 
> > (# ./ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic
> > -machine pseries -m 8G,slots=32,maxmem=32G -device
> > virtio-blk-pci,drive=rootdisk -drive
> > file=/home/bharata/F20-snap1,if=none,cache=none,id=rootdisk,format=qcow2
> > -monitor telnet:localhost:1235,server,nowait -vga none
> > -bios /home/bharata/slof/slof.bin -smp 16,maxcpus=32 -netdev
> > tap,id=foo,ifname=tap0,script=/home/bharata/qemu-ifup -device
> > virtio-net-pci,id=n1,netdev=foo `for i in $(seq 0 15); do echo -n
> > "-object memory-backend-ram,id=m$i,size=256M -device
> > pc-dimm,id=dimm$i,memdev=m$i "; done` -snapshot)
> > 
> > PowerPC sPAPR memory hotplug enforces memory alignment of 256MB
> > for both boottime as well as hotplugged memory.
> > 
> > So not sure if anything other than the default gap=0 which you have
> > done in this patchset for PowerPC is necessary.
> The bigger initial memory and dimm sizes the less likelihood of
> triggering the bug. You don't see it mostly due to luck, but it doesn't
> rule out possibility of it happening in production.
> So please consider turning on gaps for ppc machine.
> 
> Looking at how hotplug_mem_size is sized in hw/ppc/spapr.c it doesn't
> look that just turning on gaps would work since it doesn't have a space
> for alignment adjustment.
> 
> try to plug dimm device in following order:
>   -m 8G,slots=2,maxmem=1256M \
> 
>   -object memory-backend-ram,id=m1,size=256M -device pc-dimm,memdev=m1 \
> 
>   -object 
> 

Re: [Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region

2015-10-05 Thread Igor Mammedov
On Mon,  5 Oct 2015 14:05:23 +0530
Bharata B Rao  wrote:

> Size hotplug memory region assuming a 256MB max alignment every slot.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index fc5e7d6..2ec509b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine)
>  
>  spapr->hotplug_memory.base = ROUND_UP(machine->ram_size,
>SPAPR_HOTPLUG_MEM_ALIGN);
> +
> +/* size hotplug region assuming 256M max alignment per slot */
> +hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots;
Does target support hugepages backend? If it does then adjustment probably
should be max supported hugepage alignment.

>  memory_region_init(>hotplug_memory.mr, OBJECT(spapr),
> "hotplug-memory", hotplug_mem_size);
>  memory_region_add_subregion(sysmem, spapr->hotplug_memory.base,




[Qemu-devel] [PATCH 2/3] virtio-9p: add unrealize handler

2015-10-05 Thread Greg Kurz
If the user tries to hot unplug a virtio-9p device, it seems to succeed but
in fact:
- virtio-9p coroutines thread pool and async queue are leaked
- QEMU crashes in virtio_vmstate_change() if the user tries to live migrate

This patch brings hot unplug support to virtio-9p-device. It fixes both
above issues.

Signed-off-by: Greg Kurz 
---
 hw/9pfs/virtio-9p-device.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 93a407c45926..ed133c40493a 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -138,6 +138,17 @@ out:
 v9fs_path_free();
 }
 
+static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+V9fsState *s = VIRTIO_9P(dev);
+
+v9fs_release_worker_threads();
+g_free(s->ctx.fs_root);
+g_free(s->tag);
+virtio_cleanup(vdev);
+}
+
 /* virtio-9p device */
 
 static Property virtio_9p_properties[] = {
@@ -154,6 +165,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void 
*data)
 dc->props = virtio_9p_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 vdc->realize = virtio_9p_device_realize;
+vdc->unrealize = virtio_9p_device_unrealize;
 vdc->get_features = virtio_9p_get_features;
 vdc->get_config = virtio_9p_get_config;
 }




[Qemu-devel] [PATCH 1/3] virtio-9p-coth: add release function and fix init error path

2015-10-05 Thread Greg Kurz
This is preliminary work to support hotplug/unplug of virtio-9p-device.

Signed-off-by: Greg Kurz 
---
 hw/9pfs/virtio-9p-coth.c |   13 +
 hw/9pfs/virtio-9p-coth.h |1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/virtio-9p-coth.c
index 8185c533c013..a4392586c9a2 100644
--- a/hw/9pfs/virtio-9p-coth.c
+++ b/hw/9pfs/virtio-9p-coth.c
@@ -66,10 +66,7 @@ int v9fs_init_worker_threads(void)
 }
 p->completed = g_async_queue_new();
 if (!p->completed) {
-/*
- * We are going to terminate.
- * So don't worry about cleanup
- */
+g_thread_pool_free(p->pool, true, true);
 ret = -1;
 goto err_out;
 }
@@ -80,3 +77,11 @@ err_out:
 pthread_sigmask(SIG_SETMASK, , NULL);
 return ret;
 }
+
+int v9fs_release_worker_threads(void)
+{
+V9fsThPool *p = _pool;
+
+g_thread_pool_free(p->pool, TRUE, TRUE);
+g_async_queue_unref(p->completed);
+}
diff --git a/hw/9pfs/virtio-9p-coth.h b/hw/9pfs/virtio-9p-coth.h
index 4f51b250d1d4..6502e422cea8 100644
--- a/hw/9pfs/virtio-9p-coth.h
+++ b/hw/9pfs/virtio-9p-coth.h
@@ -56,6 +56,7 @@ typedef struct V9fsThPool {
 
 extern void co_run_in_worker_bh(void *);
 extern int v9fs_init_worker_threads(void);
+extern int v9fs_release_worker_threads(void);
 extern int v9fs_co_readlink(V9fsPDU *, V9fsPath *, V9fsString *);
 extern int v9fs_co_readdir_r(V9fsPDU *, V9fsFidState *,
struct dirent *, struct dirent **result);




[Qemu-devel] [PATCH 3/3] virtio-9p: add savem handlers

2015-10-05 Thread Greg Kurz
We don't support migration of mounted 9p shares. This is handled by a
migration blocker.

One would expect, however, to be able to migrate if the share is unmounted.
Unfortunately virtio-9p-device does not register savevm handlers at all !
Migration succeeds and leaves the guest with a dangling device...

This patch simply registers migration handlers for virtio-9p-device. Whether
migration is possible or not still depends on the migration blocker.

Signed-off-by: Greg Kurz 
---
 hw/9pfs/virtio-9p-device.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index ed133c40493a..bd7f10a0a902 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -43,6 +43,16 @@ static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t 
*config)
 g_free(cfg);
 }
 
+static void virtio_9p_save(QEMUFile *f, void *opaque)
+{
+virtio_save(VIRTIO_DEVICE(opaque), f);
+}
+
+static int virtio_9p_load(QEMUFile *f, void *opaque, int version_id)
+{
+return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
+}
+
 static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -130,6 +140,7 @@ static void virtio_9p_device_realize(DeviceState *dev, 
Error **errp)
 }
 v9fs_path_free();
 
+register_savevm(dev, "virtio-9p", -1, 1, virtio_9p_save, virtio_9p_load, 
s);
 return;
 out:
 g_free(s->ctx.fs_root);
@@ -146,6 +157,7 @@ static void virtio_9p_device_unrealize(DeviceState *dev, 
Error **errp)
 v9fs_release_worker_threads();
 g_free(s->ctx.fs_root);
 g_free(s->tag);
+unregister_savevm(dev, "virtio-9p", s);
 virtio_cleanup(vdev);
 }
 




[Qemu-devel] [PATCH 0/3] virtio-9p: hotplug and migration support

2015-10-05 Thread Greg Kurz
This series brings both hoplug and migration support to virtio-9p devices.

Patch 1 is preliminary work.
Patch 2 brings hotplug and fixes a QEMU crash.
Patch 3 brings migration (of course it still depends on the 9p to be unmounted)

---

Greg Kurz (3):
  virtio-9p-coth: add release function and fix init error path
  virtio-9p: add unrealize handler
  virtio-9p: add savem handlers


 hw/9pfs/virtio-9p-coth.c   |   13 +
 hw/9pfs/virtio-9p-coth.h   |1 +
 hw/9pfs/virtio-9p-device.c |   24 
 3 files changed, 34 insertions(+), 4 deletions(-)




Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable

2015-10-05 Thread Gerd Hoffmann
On Fr, 2015-10-02 at 09:40 -0400, Kevin O'Connor wrote:
> On Fri, Oct 02, 2015 at 10:09:17AM +0200, Gerd Hoffmann wrote:
> > > - read four bytes from under the fw_cfg selector QEMU_CFG_KERNEL_SIZE
> > >   (0x0008),
> > > - if it is zero,return -1 --> no kernel boot requested,
> > > - if it is nonzero, return  0 --> which means "top priority".
> > > 
> > > In other words, I agree with:
> > > 
> > > > -option_rom[nb_option_roms].bootindex = 0;
> > > > +option_rom[nb_option_roms].bootindex = 1;
> 
> The bootindex in QEMU is not visible in the firmware, so if the rest
> of patch 6 is dropped then the above should be dropped as well.
> 
> > Hmm.  That makes the boot order undefined for "qemu -kernel foo -device
> > virtio-blk,drive=bar,bootindex=1" when using an old seabios.  I don't
> > think this is a good idea.
> 
> Wouldn't that make the bootorder undefined everywhere? What does it
> mean to use -kernel and specify a bootorder?

Kernel gets priority 0, everything else what you specify (typically
configs start with bootindex=1), so the kernel gets the highest
priority.  So unless the user uses F12 to enter the boot menu and picks
something else the kernel is booted.

cheers,
  Gerd





Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable

2015-10-05 Thread Gerd Hoffmann
On Fr, 2015-10-02 at 09:38 -0400, Kevin O'Connor wrote:
> On Fri, Oct 02, 2015 at 10:16:26AM +0200, Gerd Hoffmann wrote:
> >   Hi,
> > 
> > > That's fine with me.  Marc - I think qemu_vmlinux_setup() in SeaBIOS
> > > with the following would work:
> > > 
> > > void qemu_vmlinux_setup(void)
> > > {
> > > u32 kernel_size;
> > > qemu_cfg_read_entry(_size, QEMU_CFG_KERNEL_SIZE, 
> > > sizeof(kernel_size));
> > > if (kernel_size)
> > > boot_add_qemu_vmlinux("QEMU Kernel image", 0);
> > > }
> > 
> > It isn't that simple.  We also have support for multiboot kernels (using
> > multiboot.bin option rom).  So when doing this you need to be prepared
> > to find a multiboot kernel in fw_cfg.
> 
> Is there some way to detect if it's a multiboot kernel?

Yes.  There is a header with magic + checksum in the first 8k, see
hw/i386/multiboot.c (in qemu).

Or check the option rom name in the bootorder file, it's multiboot
instead of linuxboot.

>   If so,
> seabios can just fall back to using multiboot.bin.

Or add multiboot support to seabios.

cheers,
  Gerd





[Qemu-devel] [PATCH] target-mips: Add SIGRIE instruction

2015-10-05 Thread Yongbok Kim
Add SIGRIE (Signal Reserved Instruction Exception).
The instruction allows to use the 16-bit code field for software use.
This instruction is introduced by and required as of Release 6.

Signed-off-by: Yongbok Kim 
---
 target-mips/translate.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 4a47c2d..0af807a 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -323,6 +323,7 @@ enum {
 OPC_TLTIU= (0x0B << 16) | OPC_REGIMM,
 OPC_TEQI = (0x0C << 16) | OPC_REGIMM,
 OPC_TNEI = (0x0E << 16) | OPC_REGIMM,
+OPC_SIGRIE   = (0x17 << 16) | OPC_REGIMM,
 OPC_SYNCI= (0x1F << 16) | OPC_REGIMM,
 
 OPC_DAHI = (0x06 << 16) | OPC_REGIMM,
@@ -12017,7 +12018,8 @@ enum {
 LSA = 0x0f,
 ALIGN = 0x1f,
 EXT = 0x2c,
-POOL32AXF = 0x3c
+POOL32AXF = 0x3c,
+SIGRIE = 0x3f
 };
 
 /* POOL32AXF encoding of minor opcode field extension */
@@ -13636,6 +13638,10 @@ static void decode_micromips32_opc(CPUMIPSState *env, 
DisasContext *ctx)
 case BREAK32:
 generate_exception_end(ctx, EXCP_BREAK);
 break;
+case SIGRIE:
+check_insn(ctx, ISA_MIPS32R6);
+generate_exception_err(ctx, EXCP_RI, extract32(ctx->opcode, 6, 
16));
+break;
 default:
 pool32a_invalid:
 MIPS_INVAL("pool32a");
@@ -18958,6 +18964,10 @@ static void decode_opc(CPUMIPSState *env, DisasContext 
*ctx)
 check_insn_opc_removed(ctx, ISA_MIPS32R6);
 gen_trap(ctx, op1, rs, -1, imm);
 break;
+case OPC_SIGRIE:
+check_insn(ctx, ISA_MIPS32R6);
+generate_exception_err(ctx, EXCP_RI, extract32(ctx->opcode, 0, 
16));
+break;
 case OPC_SYNCI:
 check_insn(ctx, ISA_MIPS32R2);
 /* Break the TB to be able to sync copied instructions
-- 
1.7.1




Re: [Qemu-devel] [PATCH v8 00/54] Postcopy implementation

2015-10-05 Thread Dr. David Alan Gilbert
* Bharata B Rao (bhar...@linux.vnet.ibm.com) wrote:
> On Mon, Sep 28, 2015 at 05:51:39PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> >   This is the 8th cut of my version of postcopy.
> > 
> > The userfaultfd linux kernel code is now in the upstream kernel
> > tree, and so 4.3-rc3 can be used without modification.
> > 
> > This qemu series can be found at:
> > https://github.com/orbitfp7/qemu.git
> > on the wp3-postcopy-v8 tag
> > 
> > 
> > Testing status:
> >   * Tested heavily on x86
> >   * Smoke tested on aarch64 (so it does work on different page sizes)
> >   * Power is unhappy for me (but gets further than the htab problem
> > v7 used to have) (I get a kvm run failed)
> 
> As I said earlier, postcopy migration works on Power, but memory hotplug
> seems to have some problem.
> 
> qemu-system-ppc64 ... -object memory-backend-ram,id=ram0,size=2G -device 
> pc-dimm,memdev=ram0
> 
> qemu/exec.c:1278: find_ram_offset: Assertion `size != 0' failed.
> 
> Does this happen on x86 too ?

Hmm, yes it does - I hadn't tried that.
The problem is that I added a HOST_PAGE_ALIGN call during RAMBlock creation,
and -object gets parsed pretty early on, before the internal host page masks
have been setup.

The patch below passes a smoke test; I'll look to clean it up.

Thanks,

Dave

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index a3719b7..b4c4b6e 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -81,7 +81,6 @@ void cpu_gen_init(void);
 int cpu_gen_code(CPUArchState *env, struct TranslationBlock *tb,
  int *gen_code_size_ptr);
 bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc);
-void page_size_init(void);
 
 void QEMU_NORETURN cpu_resume_from_signal(CPUState *cpu, void *puc);
 void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 01d29dd..ae3530c 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -491,4 +491,6 @@ int parse_debug_env(const char *name, int max, int initial);
 
 const char *qemu_ether_ntoa(const MACAddr *mac);
 
+void page_size_init(void);
+
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index de1924c..62b71fe 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1461,7 +1461,6 @@ static int kvm_init(MachineState *ms)
  * page size for the system though.
  */
 assert(TARGET_PAGE_SIZE <= getpagesize());
-page_size_init();
 
 s->sigmask_len = 8;
 
diff --git a/vl.c b/vl.c
index e211f6a..818075c 100644
--- a/vl.c
+++ b/vl.c
@@ -4249,6 +4249,7 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+page_size_init();
 socket_init();
 
 if (qemu_opts_foreach(qemu_find_opts("object"),

> 
> Regards,
> Bharata.
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v5 31/48] contrib: add ivshmem client and server

2015-10-05 Thread Claudio Fontana
Hi,

On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote:
> From: David Marchand 
> 
> When using ivshmem devices, notifications between guests can be sent as
> interrupts using a ivshmem-server (typical use described in documentation).
> The client is provided as a debug tool.
> 
> Signed-off-by: Olivier Matz 
> Signed-off-by: David Marchand 
> [fix a valgrind warning, option and server_close() segvs, extra server
> headers includes, getopt() return type]
> Signed-off-by: Marc-André Lureau 
> ---
>  Makefile|   8 +
>  configure   |   3 +
>  contrib/ivshmem-client/ivshmem-client.c | 433 
> 
>  contrib/ivshmem-client/ivshmem-client.h | 212 
>  contrib/ivshmem-client/main.c   | 239 ++
>  contrib/ivshmem-server/ivshmem-server.c | 422 +++
>  contrib/ivshmem-server/ivshmem-server.h | 166 
>  contrib/ivshmem-server/main.c   | 264 +++
>  qemu-doc.texi   |  10 +-
>  9 files changed, 1754 insertions(+), 3 deletions(-)
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.c
>  create mode 100644 contrib/ivshmem-client/ivshmem-client.h
>  create mode 100644 contrib/ivshmem-client/main.c
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.c
>  create mode 100644 contrib/ivshmem-server/ivshmem-server.h
>  create mode 100644 contrib/ivshmem-server/main.c
> 
> diff --git a/Makefile b/Makefile
> index e370876..6bb4855 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -321,6 +321,14 @@ msi:
>   @echo "MSI build not configured or dependency resolution failed 
> (reconfigure with --enable-guest-agent-msi option)"
>  endif
>  
> +IVSHMEM_CLIENT_OBJS=$(addprefix $(SRC_PATH)/contrib/ivshmem-client/, 
> ivshmem-client.o main.o)
> +ivshmem-client$(EXESUF): $(IVSHMEM_CLIENT_OBJS)
> + $(call LINK, $^)
> +
> +IVSHMEM_SERVER_OBJS=$(addprefix $(SRC_PATH)/contrib/ivshmem-server/, 
> ivshmem-server.o main.o)
> +ivshmem-server$(EXESUF): $(IVSHMEM_SERVER_OBJS) libqemuutil.a libqemustub.a
> + $(call LINK, $^)
> +

I noticed only now, but this one puts the objects in the wrong place if 
BUILD_DIR != SRC_PATH,
ie it puts the ivshmem-client.o and ivshmem-server.o in the source tree instead 
of the build tree.

What about a submake in contrib/ ? (But can be done later, just would fit 
naturally to solve the problem above)

Ciao

Claudio

>  clean:
>  # avoid old build problems by removing potentially incorrect old files
>   rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h 
> gen-op-arm.h
> diff --git a/configure b/configure
> index f14454e..65da997 100755
> --- a/configure
> +++ b/configure
> @@ -4392,6 +4392,9 @@ if test "$want_tools" = "yes" ; then
>if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" ] ; then
>  tools="qemu-nbd\$(EXESUF) $tools"
>fi
> +  if [ "$kvm" = "yes" ] ; then
> +tools="ivshmem-client\$(EXESUF) ivshmem-server\$(EXESUF) $tools"
> +  fi
>  fi
>  if test "$softmmu" = yes ; then
>if test "$virtfs" != no ; then
> diff --git a/contrib/ivshmem-client/ivshmem-client.c 
> b/contrib/ivshmem-client/ivshmem-client.c
> new file mode 100644
> index 000..11c805c
> --- /dev/null
> +++ b/contrib/ivshmem-client/ivshmem-client.c
> @@ -0,0 +1,433 @@
> +/*
> + * Copyright 6WIND S.A., 2014
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * (at your option) any later version.  See the COPYING file in the
> + * top-level directory.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "qemu-common.h"
> +#include "qemu/queue.h"
> +
> +#include "ivshmem-client.h"
> +
> +/* log a message on stdout if verbose=1 */
> +#define IVSHMEM_CLIENT_DEBUG(client, fmt, ...) do { \
> +if ((client)->verbose) { \
> +printf(fmt, ## __VA_ARGS__); \
> +}\
> +} while (0)
> +
> +/* read message from the unix socket */
> +static int
> +ivshmem_client_read_one_msg(IvshmemClient *client, long *index, int *fd)
> +{
> +int ret;
> +struct msghdr msg;
> +struct iovec iov[1];
> +union {
> +struct cmsghdr cmsg;
> +char control[CMSG_SPACE(sizeof(int))];
> +} msg_control;
> +struct cmsghdr *cmsg;
> +
> +iov[0].iov_base = index;
> +iov[0].iov_len = sizeof(*index);
> +
> +memset(, 0, sizeof(msg));
> +msg.msg_iov = iov;
> +msg.msg_iovlen = 1;
> +msg.msg_control = _control;
> +msg.msg_controllen = sizeof(msg_control);
> +
> +ret = recvmsg(client->sock_fd, , 0);
> +if (ret < 0) {
> +IVSHMEM_CLIENT_DEBUG(client, "cannot read message: %s\n",
> + strerror(errno));
> +return -1;
> +}
> +if (ret == 0) {
> +

Re: [Qemu-devel] [PATCH 3/3] kvm-all: notice KVM of vcpu's TSC rate after migration

2015-10-05 Thread Haozhong Zhang
On Wed, Sep 30, 2015 at 05:36:11PM -0300, Eduardo Habkost wrote:
> On Wed, Sep 30, 2015 at 08:32:26AM +0800, Haozhong Zhang wrote:
> > > [...]
> > > > > Or maybe we shouldn't treat this as VM state, but as configuration, 
> > > > > and
> > > > > let management configure the TSC frequency explicitly if the user 
> > > > > really
> > > > > needs it to stay the same during migration.
> > > > >
> > > > > (CCing libvir-list to see if they have feedback)
> > > > >
> > > > 
> > > > Thanks for CC. I'll consider to add a command line option to control
> > > > the configuration of guest TSC fequency.
> > > 
> > > It already exists, -cpu has a "tsc-freq" option.
> > >
> > 
> > What I'm considering is to add a "-keep-tsc-freq" option to control
> > whether the TSC frequency should be migrated if "tsc-freq" is not
> > presented. Compared to "tsc-freq", "-keep-tsc-freq" can free users
> > from figuring out the guest TSC frequency manually in the migration.
> 
> If you do that, please make it a property of the CPU object, so it will
> can be set as a "-cpu" option.
>

Yes, I'll do so.

- Haozhong

> -- 
> Eduardo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] [PATCH 1/3] target-i386: add a subsection of vcpu's TSC rate in vmstate_x86_cpu

2015-10-05 Thread Haozhong Zhang
On Wed, Sep 30, 2015 at 09:07:08AM +0100, Dr. David Alan Gilbert wrote:
> * Haozhong Zhang (haozhong.zh...@intel.com) wrote:
> > On Tue, Sep 29, 2015 at 08:00:13PM +0100, Dr. David Alan Gilbert wrote:
> > > * Haozhong Zhang (haozhong.zh...@intel.com) wrote:
> > > > The newly added subsection 'vmstate_tsc_khz' in this patch results in
> > > > vcpu's TSC rate being saved on the source machine and loaded on the
> > > > target machine during the migration.
> > > > 
> > > > Signed-off-by: Haozhong Zhang 
> > > 
> > > Hi,
> > >   I'd appreciate it if you could tie this to only do it on newer
> > > machine types; that way it won't break back migration.
> > >
> > 
> > Does "back migration" mean migrating from QEMU w/ vmstate_tsc_khz
> > subsection to QEMU w/o that subsection?
> 
> Yes; like if we migrate from a newer qemu to an older qemu but with
> the same machine type.
>

This patch does break the back migration. I'll fix this in the next version.

- Haozhong

> Dave
> 
> > 
> > - Haozhong
> > 
> > > Dave
> > > 
> > > > ---
> > > >  target-i386/machine.c | 20 
> > > >  1 file changed, 20 insertions(+)
> > > > 
> > > > diff --git a/target-i386/machine.c b/target-i386/machine.c
> > > > index 9fa0563..80108a3 100644
> > > > --- a/target-i386/machine.c
> > > > +++ b/target-i386/machine.c
> > > > @@ -752,6 +752,25 @@ static const VMStateDescription vmstate_xss = {
> > > >  }
> > > >  };
> > > >  
> > > > +static bool tsc_khz_needed(void *opaque)
> > > > +{
> > > > +X86CPU *cpu = opaque;
> > > > +CPUX86State *env = >env;
> > > > +
> > > > +return env->tsc_khz != 0;
> > > > +}
> > > > +
> > > > +static const VMStateDescription vmstate_tsc_khz = {
> > > > +.name = "cpu/tsc_khz",
> > > > +.version_id = 1,
> > > > +.minimum_version_id = 1,
> > > > +.needed = tsc_khz_needed,
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_INT64(env.tsc_khz, X86CPU),
> > > > +VMSTATE_END_OF_LIST()
> > > > +}
> > > > +};
> > > > +
> > > >  VMStateDescription vmstate_x86_cpu = {
> > > >  .name = "cpu",
> > > >  .version_id = 12,
> > > > @@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = {
> > > >  _msr_hyperv_crash,
> > > >  _avx512,
> > > >  _xss,
> > > > +_tsc_khz,
> > > >  NULL
> > > >  }
> > > >  };
> > > > -- 
> > > > 2.4.8
> > > > 
> > > > 
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg

2015-10-05 Thread Davorin Mista
Thanks Peter, I've made all changes as you suggested, but there is no 
property "ARM_CP_NO_RAW", there's also nothing similar to it defined in 
cpu.h, here's all the options:


#define ARM_CP_SPECIAL 1
#define ARM_CP_CONST 2
#define ARM_CP_64BIT 4
#define ARM_CP_SUPPRESS_TB_END 8
#define ARM_CP_OVERRIDE 16
#define ARM_CP_NO_MIGRATE 32
#define ARM_CP_IO 64

Cheers,
Davorin

On 10/05/2015 01:27 PM, Peter Maydell wrote:

On 5 October 2015 at 20:56, Davorin Mista  wrote:

Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable
in ARMCPUState struct (os_lock_status).

Linux reads from this register during its suspend/resume procedure.

Signed-off-by: Davorin Mista 


Thanks for this patch. I'm afraid it still needs some changes;
comments below.


---
Changed in v2:
-switched from using dummy registers to an actual register implementation
-implemented write function for OSLAR_EL1 sysreg
-added state variable to ARMCPUState struct

Signed-off-by: Davorin Mista 
---
  target-arm/cpu.h|  3 +++
  target-arm/helper.c | 16 +++-
  2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 5ea11a6..5aab654 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -500,6 +500,9 @@ typedef struct CPUARMState {
  uint32_t cregs[16];
  } iwmmxt;

+/* OS Lock Status: accessed via OSLAR/OSLSR registers */
+uint64_t os_lock_status;


Can you call this "oslsr_el1" and put it inside the cp15 substruct
with the other sysreg fields, please?


+
  /* For mixed endian mode.  */
  bool bswap_code;

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 9d62c4c..a6fad7a 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
  putchar(value);
  }

+/* write to os_lock_status state variable */
+static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+/* only bit 1 can be modified, register is always 0b10x0 */
+raw_write(env, ri, 8 + (value & 2));


This is the write function for OSLAR, not OSLSR, so you should
call it oslar_write.

Your logic isn't implementing the behaviour the ARM ARM requires,
which is:
  * for AArch64 accesses, copy bit 0 of the written value into
bit 1 of oslsr_el1
  * for AArch32 accesses, if the written value is 0xC5ACCE55
then write 1 into bit 1 of oslsr_el1, else write 0

That looks something like:

 int oslock;

 if (ri->state == ARM_CP_STATE_AA32) {
 oslock = (value == 0xC5ACCE55);
 } else {
 oslock = value & 1;
 }

 env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);



+}
+
  static const ARMCPRegInfo debug_cp_reginfo[] = {
  /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
   * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
@@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
  /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
  { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
-  .access = PL1_W, .type = ARM_CP_NOP },
+  .access = PL1_W, .resetvalue = 10,


Write only registers don't need a reset value. You also
need .type = ARM_CP_NO_RAW, because a raw access to this register
doesn't make sense.


+  .fieldoffset = offsetof(CPUARMState, os_lock_status),
+  .writefn = oslsr_write },
+/* We define a dummy OSLSR_EL1, because Linux reads from it. */
+{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
+  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
+  .access = PL1_R,
+  .fieldoffset = offsetof(CPUARMState, os_lock_status) },


This is the reginfo that should have the reset value.


  /* Dummy OSDLR_EL1: 32-bit Linux will read this */
  { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
--
2.6.0



thanks
-- PMM





Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-05 Thread John Snow


On 09/21/2015 08:25 AM, Peter Lieven wrote:
> PIO read requests on the ATAPI interface used to be sync blk requests.
> This has to siginificant drawbacks. First the main loop hangs util an
> I/O request is completed and secondly if the I/O request does not
> complete (e.g. due to an unresponsive storage) Qemu hangs completely.
> 
> Signed-off-by: Peter Lieven 
> ---
>  hw/ide/atapi.c | 69 
> --
>  1 file changed, 43 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 747f466..9257e1c 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>  memset(buf, 0, 288);
>  }
>  
> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int 
> sector_size)
> +static void cd_read_sector_cb(void *opaque, int ret)
>  {
> -int ret;
> +IDEState *s = opaque;
>  
> -switch(sector_size) {
> -case 2048:
> -block_acct_start(blk_get_stats(s->blk), >acct,
> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
> -block_acct_done(blk_get_stats(s->blk), >acct);
> -break;
> -case 2352:
> -block_acct_start(blk_get_stats(s->blk), >acct,
> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
> -block_acct_done(blk_get_stats(s->blk), >acct);
> -if (ret < 0)
> -return ret;
> -cd_data_to_raw(buf, lba);
> -break;
> -default:
> -ret = -EIO;
> -break;
> +block_acct_done(blk_get_stats(s->blk), >acct);
> +
> +if (ret < 0) {
> +ide_atapi_io_error(s, ret);
> +return;
> +}
> +
> +if (s->cd_sector_size == 2352) {
> +cd_data_to_raw(s->io_buffer, s->lba);
>  }
> -return ret;
> +
> +s->lba++;
> +s->io_buffer_index = 0;
> +s->status &= ~BUSY_STAT;
> +
> +ide_atapi_cmd_reply_end(s);
> +}
> +
> +static int cd_read_sector(IDEState *s, int lba, void *buf, int sector_size)
> +{
> +if (sector_size != 2048 && sector_size != 2352) {
> +return -EINVAL;
> +}
> +
> +s->iov.iov_base = buf;
> +if (sector_size == 2352) {
> +buf += 4;
> +}
> +
> +s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
> +qemu_iovec_init_external(>qiov, >iov, 1);
> +
> +if (blk_aio_readv(s->blk, (int64_t)lba << 2, >qiov, 4,
> +  cd_read_sector_cb, s) == NULL) {
> +return -EIO;
> +}
> +
> +block_acct_start(blk_get_stats(s->blk), >acct,
> + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> +s->status |= BUSY_STAT;
> +return 0;
>  }
>  

We discussed this off-list a bit, but for upstream synchronization:

Unfortunately, I believe making cd_read_sector here non-blocking makes
ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
s->end_transfer_func() nonblocking, which functions like ide_data_readw
are not prepared to cope with.

My suggestion is to buffer an entire DRQ block of data at once
(byte_count_limit) to avoid the problem.

Thanks,
--js

>  void ide_atapi_cmd_ok(IDEState *s)
> @@ -190,10 +210,8 @@ void ide_atapi_cmd_reply_end(IDEState *s)
>  ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size);
>  if (ret < 0) {
>  ide_atapi_io_error(s, ret);
> -return;
>  }
> -s->lba++;
> -s->io_buffer_index = 0;
> +return;
>  }
>  if (s->elementary_transfer_size > 0) {
>  /* there are some data left to transmit in this elementary
> @@ -275,7 +293,6 @@ static void ide_atapi_cmd_read_pio(IDEState *s, int lba, 
> int nb_sectors,
>  s->io_buffer_index = sector_size;
>  s->cd_sector_size = sector_size;
>  
> -s->status = READY_STAT | SEEK_STAT;
>  ide_atapi_cmd_reply_end(s);
>  }
>  
> 



Re: [Qemu-devel] QEMU+Linux ARMv7A current state

2015-10-05 Thread John Snow


On 10/05/2015 04:44 PM, Beniamino Galvani wrote:
> On Mon, Oct 05, 2015 at 11:13:33AM -0400, John Snow wrote:
>> I'm looking into the cubieboard now. Is our emulation based on any
>> particular model? (1-4?)
> 
> The first model, the one with Allwinner A10.
> 
>> I'm trying to see if I can find anything that resembles a spec to see
>> what kind of registers this SoC has for its SATA controller.
> 
> There is some documentation on the SoC here, but apparently nothing on
> SATA:
> 
> http://dl.linux-sunxi.org/A10/
> 
> Beniamino
> 

http://linux-sunxi.org/SATA looks relevant...

http://dl.cubieforums.com/files/pdf/A10_development_board_user_manual--2011.9.23_English.pdf
mentions the feature list:

The board provides one SATA interface, which features:
- Support SATA 1.5Gb/s, and SATA 3.0Gb/s
- Compliant to SATA Spec. 2.6, and AHCI Revision 1.3 Specifications
- Support industry-standard AMBA High-Performance Bus (AHB) and fully
compliant to the AMBA Specification, Revision 2.0; Support 32-bit Little
Endian
- OOB signaling detection and generation
- SATA 1.5Gb/s and SATA 3.0Gb/s speed negotiation when Tx OOB signaling
is selected
- Support device hot-plugging
- Support power management features including automatic Partial to
Slumber transition
- Internal DMA Engine for Command and Data Transactions
- Support hardware-assisted Native Command Queuing (NCQ) for up to
32-entries
- Support external SATA (eSATA)

I can't find anything else, though. I'll just have to wait from Peter to
see what registers he needed to modify to get it working.

Thanks,
--js



Re: [Qemu-devel] [kvm-unit-tests PATCHv2] arm: Add PMU test

2015-10-05 Thread Wei Huang


On 10/02/2015 10:48 AM, Christopher Covington wrote:
> Add test the ARM Performance Monitors Unit (PMU). The informational
> fields from the control register are printed, but not checked, and
> the number of cycles it takes to run a known-instruction-count loop
> is printed, but not checked. Once QEMU is fixed, we can at least
> begin to check for IPC == 1 when using -icount.

Thanks for submitting it. I think this is a good starting point to add
PMU unit testing support for ARM. Some comments below.

> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c   | 89 
> +
>  arm/unittests.cfg   | 11 ++
>  config/config-arm64.mak |  4 ++-
>  3 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 arm/pmu.c
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> new file mode 100644
> index 000..f724c2c
> --- /dev/null
> +++ b/arm/pmu.c
> @@ -0,0 +1,89 @@
> +/*
> + * Test the ARM Performance Monitors Unit (PMU).
> + *
> + * Copyright 2015 The Linux Foundation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License version 2.1 and
> + * only version 2.1 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but 
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public 
> License
> + * for more details.
> + */
> +#include "libcflat.h"
> +
> +struct pmu_data {
> + union {
> + uint32_t pmcr_el0;
> + struct {
> + unsigned int enable:1;
> + unsigned int event_counter_reset:1;
> + unsigned int cycle_counter_reset:1;
> + unsigned int cycle_counter_clock_divider:1;
> + unsigned int event_counter_export:1;
> + unsigned int cycle_counter_disable_when_prohibited:1;
> + unsigned int cycle_counter_long:1;
> + unsigned int zeros:4;
> + unsigned int num_counters:5;
> + unsigned int identification_code:8;
> + unsigned int implementor:8;

Not saying it is a problem because "unsigned int" is 32-bit on 64bit
machine. But to make it consistent with pmcr_el0, I would prefer
"unsigned int" been replaced by "uint32_t".

> + };
> + };
> +};
> +
> +/* Execute a known number of guest instructions. Only odd instruction counts
> + * greater than or equal to 3 are supported. The control register (PMCR) is
> + * initialized with the provided value (allowing for example for the cycle
> + * counter or eventer count to be reset if needed). After the known 
> instruction
> + * count loop, zero is written to the PMCR to disable counting, allowing the
> + * cycle counter or event counters to be read as needed at a later time.
> + */
> +static void measure_instrs(int len, struct pmu_data pmcr)
> +{
> + int i = (len - 1) / 2;
> +
> + if (len < 3 || ((len - 1) % 2))
> + abort();
> +
> + asm volatile(
> + "msr pmcr_el0, %[pmcr]\n"
> + "1: subs %[i], %[i], #1\n"
> + "b.gt 1b\n"
> + "msr pmcr_el0, xzr"
> + : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc");
> +}
> +
> +int main()
> +{
> + struct pmu_data pmcr;
> + const int samples = 10;
> +
> + asm volatile("mrs %0, pmcr_el0" : "=r" (pmcr));
> +
> + printf("PMU implementor: %c\n", pmcr.implementor);
> + printf("Identification code: 0x%x\n", pmcr.identification_code);
> + printf("Event counters:  %d\n", pmcr.num_counters);
> +
> + pmcr.cycle_counter_reset = 1;
> + pmcr.enable = 1;
> +
> + printf("\ninstructions : cycles0 cycles1 ...\n");
> +
> + for (int i = 3; i < 300; i += 32) {
> + int avg, sum = 0;
> + printf("%d :", i);
> + for (int j = 0; j < samples; j++) {
> + int val;
> + measure_instrs(i, pmcr);
> + asm volatile("mrs %0, pmccntr_el0" : "=r" (val));
> + sum += val;
> + printf(" %d", val);
> + }
> + avg = sum / samples;
> + printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n", sum, avg, i / 
> avg, avg / i);
> + }

I understand that, as stated in commit message, it currently doesn't
check the correctness of PMU counter values. But it would be better if
testing code can be abstracted into an independent function (e.g.
instr_cycle_check) for report("Instruction Cycles",
instr_cycle_check()). You can return TRUE in the checking code for now.


> +
> + return report_summary();
> +}
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index 

Re: [Qemu-devel] [PATCHv2] target-arm: Implement AArch64 OSLSR_EL1 sysreg

2015-10-05 Thread Alistair Francis
On Mon, Oct 5, 2015 at 2:25 PM, Davorin Mista  wrote:
> Thanks Peter, I've made all changes as you suggested, but there is no
> property "ARM_CP_NO_RAW", there's also nothing similar to it defined in
> cpu.h, here's all the options:
>
> #define ARM_CP_SPECIAL 1
> #define ARM_CP_CONST 2
> #define ARM_CP_64BIT 4
> #define ARM_CP_SUPPRESS_TB_END 8
> #define ARM_CP_OVERRIDE 16
> #define ARM_CP_NO_MIGRATE 32
> #define ARM_CP_IO 64

Hello Davorin,


There is an ARM_CP_NO_RAW option
(https://github.com/qemu/qemu/blob/master/target-arm/cpu.h#L1154). It
looks like you are applying this on the Xilinx tree (or another out of
date tree) instead of mainline.

Thanks,

Alistair

>
> Cheers,
> Davorin
>
>
> On 10/05/2015 01:27 PM, Peter Maydell wrote:
>>
>> On 5 October 2015 at 20:56, Davorin Mista 
>> wrote:
>>>
>>> Added oslsr_write function to OSLAR_EL1 sysreg, using a status variable
>>> in ARMCPUState struct (os_lock_status).
>>>
>>> Linux reads from this register during its suspend/resume procedure.
>>>
>>> Signed-off-by: Davorin Mista 
>>
>>
>> Thanks for this patch. I'm afraid it still needs some changes;
>> comments below.
>>
>>> ---
>>> Changed in v2:
>>> -switched from using dummy registers to an actual register implementation
>>> -implemented write function for OSLAR_EL1 sysreg
>>> -added state variable to ARMCPUState struct
>>>
>>> Signed-off-by: Davorin Mista 
>>> ---
>>>   target-arm/cpu.h|  3 +++
>>>   target-arm/helper.c | 16 +++-
>>>   2 files changed, 18 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
>>> index 5ea11a6..5aab654 100644
>>> --- a/target-arm/cpu.h
>>> +++ b/target-arm/cpu.h
>>> @@ -500,6 +500,9 @@ typedef struct CPUARMState {
>>>   uint32_t cregs[16];
>>>   } iwmmxt;
>>>
>>> +/* OS Lock Status: accessed via OSLAR/OSLSR registers */
>>> +uint64_t os_lock_status;
>>
>>
>> Can you call this "oslsr_el1" and put it inside the cp15 substruct
>> with the other sysreg fields, please?
>>
>>> +
>>>   /* For mixed endian mode.  */
>>>   bool bswap_code;
>>>
>>> diff --git a/target-arm/helper.c b/target-arm/helper.c
>>> index 9d62c4c..a6fad7a 100644
>>> --- a/target-arm/helper.c
>>> +++ b/target-arm/helper.c
>>> @@ -3147,6 +3147,13 @@ static void dcc_write(CPUARMState *env, const
>>> ARMCPRegInfo *ri,
>>>   putchar(value);
>>>   }
>>>
>>> +/* write to os_lock_status state variable */
>>> +static void oslsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>> uint64_t value)
>>> +{
>>> +/* only bit 1 can be modified, register is always 0b10x0 */
>>> +raw_write(env, ri, 8 + (value & 2));
>>
>>
>> This is the write function for OSLAR, not OSLSR, so you should
>> call it oslar_write.
>>
>> Your logic isn't implementing the behaviour the ARM ARM requires,
>> which is:
>>   * for AArch64 accesses, copy bit 0 of the written value into
>> bit 1 of oslsr_el1
>>   * for AArch32 accesses, if the written value is 0xC5ACCE55
>> then write 1 into bit 1 of oslsr_el1, else write 0
>>
>> That looks something like:
>>
>>  int oslock;
>>
>>  if (ri->state == ARM_CP_STATE_AA32) {
>>  oslock = (value == 0xC5ACCE55);
>>  } else {
>>  oslock = value & 1;
>>  }
>>
>>  env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);
>>
>>
>>> +}
>>> +
>>>   static const ARMCPRegInfo debug_cp_reginfo[] = {
>>>   /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory
>>> mapped
>>>* debug components. The AArch64 version of DBGDRAR is named
>>> MDRAR_EL1;
>>> @@ -3179,7 +3186,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
>>>   /* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
>>>   { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
>>> .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
>>> -  .access = PL1_W, .type = ARM_CP_NOP },
>>> +  .access = PL1_W, .resetvalue = 10,
>>
>>
>> Write only registers don't need a reset value. You also
>> need .type = ARM_CP_NO_RAW, because a raw access to this register
>> doesn't make sense.
>>
>>> +  .fieldoffset = offsetof(CPUARMState, os_lock_status),
>>> +  .writefn = oslsr_write },
>>> +/* We define a dummy OSLSR_EL1, because Linux reads from it. */
>>> +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
>>> +  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
>>> +  .access = PL1_R,
>>> +  .fieldoffset = offsetof(CPUARMState, os_lock_status) },
>>
>>
>> This is the reginfo that should have the reset value.
>>
>>>   /* Dummy OSDLR_EL1: 32-bit Linux will read this */
>>>   { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
>>> .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
>>> --
>>> 2.6.0
>>>
>>
>> thanks
>> -- PMM
>>
>



Re: [Qemu-devel] [PATCH v5 0/2] Improve -help

2015-10-05 Thread Laurent Vivier


On 05/10/2015 23:08, Michael S. Tsirkin wrote:
> On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote:
>> no more comments, it should mean it's ok ?
>>
>> Can someone merge this or just say "stop this, I don't want that"...
>>
>> Laurent
> 
> I'm afraid as a first step, we need to bring some order into -help.

I agree with your comments, but the aim of this series is not to change
the content, only the container.

It will be easy to add sections once this done.

> I also suspect that a good place to start doing that is the man page.

Man page and help are generated from qemu-options.hx, so improving one
is also
improving the other, but I think I'm not the good guy to rewrite a man
page... I'm a mechanic, not a writer :)

Thanks,
Laurent



Re: [Qemu-devel] [PATCH v5 0/2] Improve -help

2015-10-05 Thread Michael S. Tsirkin
On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote:
> no more comments, it should mean it's ok ?
> 
> Can someone merge this or just say "stop this, I don't want that"...
> 
> Laurent

I'm afraid as a first step, we need to bring some order into -help.

I also suspect that a good place to start doing that is the man page.

-- 
MST



[Qemu-devel] [PATCHv3] target-arm: Implement AArch64 OSLAR/OSLSR_EL1 sysregs

2015-10-05 Thread Davorin Mista
Added oslar_write function to OSLAR_EL1 sysreg, using a status variable
in ARMCPUState.cp15 struct (oslsr_el1). This variable is also linked
to the newly added read-only OSLSR_EL1 register.

Linux reads from this register during its suspend/resume procedure.

Signed-off-by: Davorin Mista 

---
Changed in v2:
-switched from using dummy registers to an actual register implementation
-implemented write function for OSLAR_EL1 sysreg
-added state variable to ARMCPUState struct

Changed in v3:
-renamed variable to oslsr_el1 and moved to cp15
-renamed write frunction to oslar_write
-support both 32bit and 64bit ARM in oslar_write
-moved resetvalue to the corresponding read-only register
-removed "dummy" comments above registers
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 23 +--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 5ea11a6..f9f9bfb 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -399,6 +399,7 @@ typedef struct CPUARMState {
 uint64_t dbgwvr[16]; /* watchpoint value registers */
 uint64_t dbgwcr[16]; /* watchpoint control registers */
 uint64_t mdscr_el1;
+uint64_t oslsr_el1; /* OS Lock Status */
 /* If the counter is enabled, this stores the last time the counter
  * was reset. Otherwise it stores the counter value
  */
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 9d62c4c..ecc89c7 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -3147,6 +3147,20 @@ static void dcc_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 putchar(value);
 }
 
+/* write to os_lock_status state variable */
+static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+int oslock;
+
+if (ri->state == ARM_CP_STATE_AA32) {
+oslock = (value == 0xC5ACCE55);
+} else {
+oslock = value & 1;
+}
+
+env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);
+}
+
 static const ARMCPRegInfo debug_cp_reginfo[] = {
 /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
  * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
@@ -3176,10 +3190,15 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
   .access = PL1_R,
   .fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1),
   .resetfn = arm_cp_reset_ignore },
-/* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
 { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
-  .access = PL1_W, .type = ARM_CP_NOP },
+  .access = PL1_W,
+  .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1),
+  .writefn = oslar_write },
+{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
+  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
+  .access = PL1_R, .resetvalue = 10,
+  .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) },
 /* Dummy OSDLR_EL1: 32-bit Linux will read this */
 { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
-- 
2.6.0




Re: [Qemu-devel] [PATCH] target-arm: Avoid calling arm_el_is_aa64() function for unimplemented EL

2015-10-05 Thread Peter Maydell
On 2 October 2015 at 14:21, Sergey Sorokin  wrote:
> It is incorrect to call arm_el_is_aa64() function for unimplemented EL.
> This patch fixes several attempts to do so.
>
> Signed-off-by: Sergey Sorokin 
> ---
>  target-arm/cpu.h|  8 +---
>  target-arm/helper.c | 15 +--
>  2 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index cc1578c..df456a5 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -1015,11 +1015,11 @@ static inline bool access_secure_reg(CPUARMState *env)
>   */
>  #define A32_BANKED_CURRENT_REG_GET(_env, _regname)\
>  A32_BANKED_REG_GET((_env), _regname,\
> -   ((!arm_el_is_aa64((_env), 3) && arm_is_secure(_env
> +   (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3)))
>
>  #define A32_BANKED_CURRENT_REG_SET(_env, _regname, _val) 
>   \
>  A32_BANKED_REG_SET((_env), _regname,\
> -   ((!arm_el_is_aa64((_env), 3) && 
> arm_is_secure(_env))),  \
> +   (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3)), \
> (_val))
>
>  void arm_cpu_list(FILE *f, fprintf_function cpu_fprintf);
> @@ -1586,7 +1586,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
> unsigned int excp_idx,
>   * interrupt.
>   */
>  if ((target_el > cur_el) && (target_el != 1)) {
> -if (arm_el_is_aa64(env, 3) || ((scr || hcr) && (!secure))) {
> +/* ARM_FEATURE_AARCH64 enabled means the higher EL is AArch64. */
> +if (arm_feature(env, ARM_FEATURE_AARCH64) ||
> +((scr || hcr) && (!secure))) {
>  unmasked = 1;
>  }
>  }

I know we discussed this one before, but having looked more carefully
at it, I think the reason it's weird is that the original code is
only correct if we're not implementing EL2.

For instance (table D1-14 in the v8 ARMARM rev A.g)
if we're in an all-AArch64 environment executing in Secure EL0
then the interrupt mask is supposed to have an effect if the interrupt
is targeting EL2. But the current code will always set 'unmasked' to 1 if
EL3 is 64 bit.

So I think what the code ought to read is:

if ((target_el > cur_el) && (target_el != 1)) {
/* Exceptions targeting a higher EL may not be maskable */
if (arm_feature(env, ARM_FEATURE_AARCH64)) {
/* 64-bit masking rules are simple: exceptions to EL3
 * can't be masked, and exceptions to EL2 can only be
 * masked from Secure state.
 */
if (target_el == 3 || !secure) {
unmasked = 1;
}
} else {
/* The old 32-bit-only environment has a more complicated
 * masking setup.
 */
if ((scr || hcr) && !secure) {
unmasked = 1;
}
}
}

Except that then for the AArch64 case we've just calculated
scr and hcr and then not needed them. I think most of the
code calculating them ought to move into the else clause here.

I'll write a patch for this and post it tomorrow.

In the meantime, we can just make the comment say

   /* ARM_FEATURE_AARCH64 enabled means the highest EL is AArch64.
* This code currently assumes that EL2 is not implemented
* (and so that highest EL will be 3 and the target_el also 3).
*/

> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 8367997..1f11dbd 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -5220,11 +5220,22 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, 
> uint32_t excp_idx,
>   uint32_t cur_el, bool secure)
>  {
>  CPUARMState *env = cs->env_ptr;
> -int rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
> +int rw;
>  int scr;
>  int hcr;
>  int target_el;
> -int is64 = arm_el_is_aa64(env, 3);
> +/* Is the higher EL AArch64? */

"highest". I pointed this one out before...

> +int is64 = arm_feature(env, ARM_FEATURE_AARCH64);
> +
> +/* If the highest EL is in AArch64 state, and EL3 is not implemented,
> + * we must behave as if EL3 is implemented and is in AArch64 state.
> + * Therefore we need appropriate RW bit.
> + */

I think we could put this comment into the else branch, and
rephrase it a bit:

+if (arm_feature(env, ARM_FEATURE_EL3)) {
+rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
+} else {
+/* Either EL2 is the highest EL (and so the EL2 register width
+ * is given by is64); or there is no EL2 or EL3, in which case
+ * the value of 'rw' does not affect the table lookup anyway.
+ */
+rw = is64;
+}

> +if (arm_feature(env, ARM_FEATURE_EL3)) {
> +rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
> +} else {
> +rw = is64;
> +}

What I can do with 

Re: [Qemu-devel] [PATCHv3] target-arm: Implement AArch64 OSLAR/OSLSR_EL1 sysregs

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 22:36, Davorin Mista  wrote:
> Added oslar_write function to OSLAR_EL1 sysreg, using a status variable
> in ARMCPUState.cp15 struct (oslsr_el1). This variable is also linked
> to the newly added read-only OSLSR_EL1 register.
>
> Linux reads from this register during its suspend/resume procedure.
>
> Signed-off-by: Davorin Mista 
>
> ---
> Changed in v2:
> -switched from using dummy registers to an actual register implementation
> -implemented write function for OSLAR_EL1 sysreg
> -added state variable to ARMCPUState struct
>
> Changed in v3:
> -renamed variable to oslsr_el1 and moved to cp15
> -renamed write frunction to oslar_write
> -support both 32bit and 64bit ARM in oslar_write
> -moved resetvalue to the corresponding read-only register
> -removed "dummy" comments above registers
> ---
>  target-arm/cpu.h|  1 +
>  target-arm/helper.c | 23 +--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index 5ea11a6..f9f9bfb 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -399,6 +399,7 @@ typedef struct CPUARMState {
>  uint64_t dbgwvr[16]; /* watchpoint value registers */
>  uint64_t dbgwcr[16]; /* watchpoint control registers */
>  uint64_t mdscr_el1;
> +uint64_t oslsr_el1; /* OS Lock Status */
>  /* If the counter is enabled, this stores the last time the counter
>   * was reset. Otherwise it stores the counter value
>   */
> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 9d62c4c..ecc89c7 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -3147,6 +3147,20 @@ static void dcc_write(CPUARMState *env, const 
> ARMCPRegInfo *ri,
>  putchar(value);
>  }
>
> +/* write to os_lock_status state variable */
> +static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
> value)
> +{
> +int oslock;
> +
> +if (ri->state == ARM_CP_STATE_AA32) {
> +oslock = (value == 0xC5ACCE55);
> +} else {
> +oslock = value & 1;
> +}
> +
> +env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);
> +}
> +
>  static const ARMCPRegInfo debug_cp_reginfo[] = {
>  /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
>   * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
> @@ -3176,10 +3190,15 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
>.access = PL1_R,
>.fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1),
>.resetfn = arm_cp_reset_ignore },
> -/* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
>  { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
>.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
> -  .access = PL1_W, .type = ARM_CP_NOP },
> +  .access = PL1_W,
> +  .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1),

OSLAR doesn't want a fieldoffset.

Still missing .type = ARM_CP_NO_RAW. (As Alistair says, this does
exist in mainline -- please make sure you write and test your
patches against current git master, not any other tree.)

> +  .writefn = oslar_write },
> +{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
> +  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
> +  .access = PL1_R, .resetvalue = 10,
> +  .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) },

This part is OK.

>  /* Dummy OSDLR_EL1: 32-bit Linux will read this */
>  { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
>.cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
> --
> 2.6.0

thanks
-- PMM



[Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads

2015-10-05 Thread Kevin O'Connor
Return a static signature ("QEMU CFG") if the guest does a read to the
DMA address io register.

Signed-off-by: Kevin O'Connor 
---

Marc, if you decide to respin your fw_cfg series, I've updated the dma
signature patch.  This addresses the comments from Stefan, and I hope
it addresses the comments from Laszlo.

BTW, if you wanted to, it's possible to use deposit64 in
fw_cfg_dma_mem_write() to support all possible (validly aligned) write
sizes.  Then fw_cfg_dma_mem_valid() shouldn't be needed.  Something
like:

static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
 uint64_t value, unsigned size)
{
FWCfgState *s = opaque;
s->dma_addr = deposit64(s->dma_addr, (8 - addr - size)*8, size*8, value);
if (addr + size >= 8) {
fw_cfg_dma_transfer(s);
}
}

---
 docs/specs/fw_cfg.txt |  3 +++
 hw/nvram/fw_cfg.c | 14 --
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
index 2d6b2da..cbdce7d 100644
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x 
(FW_CFG_SIGNATURE),
 and reading four bytes from the data register. If the fw_cfg device is
 present, the four bytes read will contain the characters "QEMU".
 
+If the DMA interface is available, then reading the DMA Address
+Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format).
+
 === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
 
 A 32-bit little-endian unsigned int, this item is used to check for enabled
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 59933b3..cf5c5c4 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -53,6 +53,8 @@
 #define FW_CFG_DMA_CTL_SKIP0x04
 #define FW_CFG_DMA_CTL_SELECT  0x08
 
+#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */
+
 typedef struct FWCfgEntry {
 uint32_t len;
 uint8_t *data;
@@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
 trace_fw_cfg_read(s, 0);
 }
 
+static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr,
+unsigned size)
+{
+// Return a signature value (and handle various read sizes)
+return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size*8);
+}
+
 static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
  uint64_t value, unsigned size)
 {
@@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
 static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr,
   unsigned size, bool is_write)
 {
-return is_write && ((size == 4 && (addr == 0 || addr == 4)) ||
-(size == 8 && addr == 0));
+return !is_write || ((size == 4 && (addr == 0 || addr == 4)) ||
+ (size == 8 && addr == 0));
 }
 
 static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr,
@@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = {
 };
 
 static const MemoryRegionOps fw_cfg_dma_mem_ops = {
+.read = fw_cfg_dma_mem_read,
 .write = fw_cfg_dma_mem_write,
 .endianness = DEVICE_BIG_ENDIAN,
 .valid.accepts = fw_cfg_dma_mem_valid,
-- 
2.4.3




Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification

2015-10-05 Thread John Snow


On 09/15/2015 12:24 PM, Eric Blake wrote:
> On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>  docs/specs/qcow2.txt | 127 
>> ++-
>>  1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>  write to an image with unknown auto-clear features if it
>>  clears the respective bits from this field first.
>>  
>> -Bits 0-63:  Reserved (set to 0)
>> +Bit 0:  Dirty bitmaps bit. If this bit is set then
>> +there is a _consistent_ Dirty bitmaps 
>> extension
>> +in the image. If it is not set, but there 
>> is a
>> +Dirty bitmaps extension, its data should be
>> +considered as inconsistent.
> 
> Thanks for documenting this. I don't know that we use underscore for
> _emphasis_ anywhere else in the file, but I don't have any better
> suggestions.  Should you also require that it is an error if this bit is
> set but no Dirty bitmap extension header is present?
> 

An error, but one that can be safely corrected by any fsck-style
utility: clear the bit.

>> +
>> +Bits 1-63:  Reserved (set to 0)
>>  
>>   96 -  99:  refcount_order
>>  Describes the width of a reference count block entry 
>> (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the 
>> following:
>>  0x - End of the header extension area
>>  0xE2792ACA - Backing file format name
>>  0x6803f857 - Feature name table
>> +0x23852875 - Dirty bitmaps
>>  other  - Unknown header extension, can be safely
>>   ignored
>>  
>> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>>  terminated if it has full length)
>>  
>>  
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to 
>> store
>> +dirty bitmaps in a qcow2 image. The fields are:
> 
> Might not hurt to remind the reader about the auto-clear feature bit
> mentioned earlier controlling whether this extension can be trusted as
> consistent.
> 
>> +
>> +  0 -  3:  nb_dirty_bitmaps
>> +   The number of dirty bitmaps contained in the image. Valid
>> +   values: 0 - 65535.
>> +
>> +  4 -  7:  dirty_bitmap_directory_size
>> +   Size of the Dirty Bitmap Directory in bytes. Valid 
>> values:
>> +   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> 
> Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need
> a redundant field?  If not, then this wording needs help; from the rest
> of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps".
>  Also, while Dirty Bitmap Directory entries are variable length (and
> thus a variable maximum), they do have a minimum size (so the minimum
> value for dirty_bitmap_directory_size must be larger than 0 unless
> nb_dirty_bitmaps is 0, in which case why would we have this header
> extension)
> 

Agree.

>> +
>> +  8 - 15:  dirty_bitmap_directory_offset
>> +   Offset into the image file at which the Dirty Bitmap
>> +   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>  == Host cluster management ==
>>  
>>  qcow2 manages the allocation of host clusters by maintaining a reference 
>> count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>  
>>  variable:   Padding to round up the snapshot table entry size to the
>>  next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> 
> s/ONE/one/ (I didn't see the reason for the emphasis)
> 

Emphasis is likely because that's not how the cluster allocation
mechanism works in qcow2 otherwise. We're essentially storing data
straight into what would otherwise be the L1 table.

It's worth clarifying, in my opinion.

>> +
>> +The Dirty Bitmap Table has a variable size (stored in 

[Qemu-devel] QEMU Technical Talk: NVDIMM and persistent memory in QEMU

2015-10-05 Thread Stefan Hajnoczi
Marc Mari has volunteered to give the following online technical talk
on Monday, 12 October at 14:00 UTC:

"Marc Mari will present the new NVDIMM persistent memory device class
and how they integrate into QEMU and SeaBIOS.  The main concepts of
the hardware specification are covered, as well as how NVDIMMs can be
used by virtual machines.

This talk is aimed at QEMU and SeaBIOS developers."

Marc has been experimenting with Guangrong Xiao's NVDIMM patches and
is working on SeaBIOS boot-from-NVDIMM support.

To join the event:
https://plus.google.com/events/cfssoojfogaafulssb1qeijn07k

This is the first QEMU technical talk and we will be using Google+'s
Hangouts On Air feature for a live presentation.  Video will also be
archived on YouTube for viewing at a later date.

If you would like to speak on a technical topic, please contact me!  I
hope to host talks showcasing features of interest to QEMU users as
well as technical topics for QEMU developers.

Stefan



Re: [Qemu-devel] [RFC v0 0/2] Enforce gaps between DIMMs

2015-10-05 Thread Michael S. Tsirkin
On Mon, Oct 05, 2015 at 02:05:22PM +0530, Bharata B Rao wrote:
> The suggested way to work around the virtio bug reported here
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> 
> is to introduce gaps between DIMMs. Igor's patchset changes the pc-dimm
> auto-address assignment to introduce gaps and ues the same from pc memhp.
> This patchset does the same for sPAPR PowerPC.
> 
> Before introducing the gap, ensure that memory hotplug region has enough
> room for alignment adjustment. We accommodate a max alignment of 256MB for
> each slot since sPAPR memory hotplug enforces an alignment requirement of
> 256MB on RAM size, maxmem and NUMA node mem sizes.
> 
> This applies on David's spapr-next branch + Igor's patchset applied.
> 
> This has been very lightly tested and intention is to get feedback
> on the correctness aspect of this.
> 
> Bharata B Rao (2):
>   spapr: Accommadate alignment gaps in hotplug memory region
>   spapr: Force gaps between DIMM's GPA

PC needs this, PPC needs this ... I don't see why would this not
apply everywhere. Isn't it time we just converted everyone?
Drop the gap flag, set it unconditionally for new machine types.

Thoughts?


>  hw/ppc/spapr.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> -- 
> 2.1.0
> 



[Qemu-devel] [PULL 06/10] vfio: Check guest IOVA ranges against host IOMMU capabilities

2015-10-05 Thread Alex Williamson
From: David Gibson 

The current vfio core code assumes that the host IOMMU is capable of
mapping any IOVA the guest wants to use to where we need.  However, real
IOMMUs generally only support translating a certain range of IOVAs (the
"DMA window") not a full 64-bit address space.

The common x86 IOMMUs support a wide enough range that guests are very
unlikely to go beyond it in practice, however the IOMMU used on IBM Power
machines - in the default configuration - supports only a much more limited
IOVA range, usually 0..2GiB.

If the guest attempts to set up an IOVA range that the host IOMMU can't
map, qemu won't report an error until it actually attempts to map a bad
IOVA.  If guest RAM is being mapped directly into the IOMMU (i.e. no guest
visible IOMMU) then this will show up very quickly.  If there is a guest
visible IOMMU, however, the problem might not show up until much later when
the guest actually attempt to DMA with an IOVA the host can't handle.

This patch adds a test so that we will detect earlier if the guest is
attempting to use IOVA ranges that the host IOMMU won't be able to deal
with.

For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is
incorrect, but no worse than what we have already.  We can't do better for
now because the Type1 kernel interface doesn't tell us what IOVA range the
IOMMU actually supports.

For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
IOVA range and validate guest IOVA ranges against it, and this patch does
so.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c  |   40 +---
 include/hw/vfio/vfio-common.h |6 ++
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 95a4850..2faf492 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -343,14 +343,22 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 if (int128_ge(int128_make64(iova), llend)) {
 return;
 }
+end = int128_get64(llend);
+
+if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
+error_report("vfio: IOMMU container %p can't map guest IOVA region"
+ " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
+ container, iova, end - 1);
+ret = -EFAULT;
+goto fail;
+}
 
 memory_region_ref(section->mr);
 
 if (memory_region_is_iommu(section->mr)) {
 VFIOGuestIOMMU *giommu;
 
-trace_vfio_listener_region_add_iommu(iova,
-int128_get64(int128_sub(llend, int128_one(;
+trace_vfio_listener_region_add_iommu(iova, end - 1);
 /*
  * FIXME: We should do some checking to see if the
  * capabilities of the host VFIO IOMMU are adequate to model
@@ -387,7 +395,6 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 
 /* Here we assume that memory_region_is_ram(section->mr)==true */
 
-end = int128_get64(llend);
 vaddr = memory_region_get_ram_ptr(section->mr) +
 section->offset_within_region +
 (iova - section->offset_within_address_space);
@@ -685,7 +692,19 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
+
+/*
+ * FIXME: This assumes that a Type1 IOMMU can map any 64-bit
+ * IOVA whatsoever.  That's not actually true, but the current
+ * kernel interface doesn't tell us what it can map, and the
+ * existing Type1 IOMMUs generally support any IOVA we're
+ * going to actually try in practice.
+ */
+container->min_iova = 0;
+container->max_iova = (hwaddr)-1;
 } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
+struct vfio_iommu_spapr_tce_info info;
+
 ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
 if (ret) {
 error_report("vfio: failed to set group container: %m");
@@ -710,6 +729,21 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
+
+/*
+ * This only considers the host IOMMU's 32-bit window.  At
+ * some point we need to add support for the optional 64-bit
+ * window and dynamic windows
+ */
+info.argsz = sizeof(info);
+ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, );
+if (ret) {
+error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");
+ret = -errno;
+goto free_container_exit;
+}
+container->min_iova = info.dma32_window_start;
+container->max_iova = container->min_iova + info.dma32_window_size - 1;
 } else {
 

[Qemu-devel] [PULL 10/10] vfio: Expose a VFIO PCI device's group for EEH

2015-10-05 Thread Alex Williamson
From: David Gibson 

The Enhanced Error Handling (EEH) interface in PAPR operates on units of a
Partitionable Endpoint (PE).  For VFIO devices, the PE boundaries the guest
sees must match the PE (i.e. IOMMU group) boundaries on the host.  To
implement this it will need to discover from VFIO which group a given
device belongs to.

This exposes a new vfio_pci_device_group() function for this purpose.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Signed-off-by: Alex Williamson 
---
 hw/vfio/pci.c  |   14 ++
 include/hw/vfio/vfio-pci.h |   11 +++
 2 files changed, 25 insertions(+)
 create mode 100644 include/hw/vfio/vfio-pci.h

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index dcabb6d..49ae834 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -35,6 +35,8 @@
 #include "pci.h"
 #include "trace.h"
 
+#include "hw/vfio/vfio-pci.h"
+
 #define MSIX_CAP_LENGTH 12
 
 static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
@@ -2312,6 +2314,18 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice 
*vdev)
 vdev->req_enabled = false;
 }
 
+VFIOGroup *vfio_pci_device_group(PCIDevice *pdev)
+{
+VFIOPCIDevice *vdev;
+
+if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
+return NULL;
+}
+
+vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
+return vdev->vbasedev.group;
+}
+
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
diff --git a/include/hw/vfio/vfio-pci.h b/include/hw/vfio/vfio-pci.h
new file mode 100644
index 000..32105f7
--- /dev/null
+++ b/include/hw/vfio/vfio-pci.h
@@ -0,0 +1,11 @@
+#ifndef VFIO_PCI_H
+#define VFIO_PCI_H
+
+#include "qemu/typedefs.h"
+
+/* We expose the concept of a VFIOGroup, though not its internals */
+typedef struct VFIOGroup VFIOGroup;
+
+extern VFIOGroup *vfio_pci_device_group(PCIDevice *pdev);
+
+#endif /* VFIO_PCI_H */




Re: [Qemu-devel] [PATCH v7 08/18] qapi: Test use of 'number' within alternates

2015-10-05 Thread Eric Blake
On 09/29/2015 04:21 PM, Eric Blake wrote:
> Add some testsuite exposure for use of a 'number' as part of
> an alternate.  The current state of the tree has a few bugs
> exposed by this: our input parser depends on the ordering of
> how the qapi schema declared the alternate, and the parser
> does not accept integers for a 'number' in an alternate even
> though it does for numbers outside of an alternate.
> 
> Mixing 'int' and 'number' in the same alternate is unusual,
> since both are supplied by json-numbers, but there does not
> seem to be a technical reason to forbid it given that our
> json lexer distinguishes between json-numbers that can be
> represented as an int vs. those that cannot.
> 
> Improve the existing test_visitor_in_alternate() to match the
> style of the new test_visitor_in_alternate_number(), and to
> ensure full coverage of all possible qtype parsing.
> 
> Signed-off-by: Eric Blake 
> 
> ---

> +static void test_visitor_in_alternate_number(TestInputVisitorData *data,
> + const void *unused)
> +{
> +Visitor *v;
> +Error *err = NULL;
> +AltStrBool *asb;
> +AltStrNum *asn;
> +AltNumStr *ans;
> +AltStrInt *asi;
> +AltIntNum *ain;
> +AltNumInt *ani;
> +
> +/* Parsing an int */
> +
> +v = visitor_input_test_init(data, "42");
> +visit_type_AltStrBool(v, , NULL, );
> +g_assert(err);
> +qapi_free_AltStrBool(asb);
> +visitor_input_teardown(data, NULL);

This fails to reset err = NULL...

> +
> +/* FIXME: Order of alternate should not affect semantics; asn should
> + * parse the same as ans */
> +v = visitor_input_test_init(data, "42");
> +visit_type_AltStrNum(v, , NULL, );
> +/* FIXME g_assert_cmpint(asn->kind, == ALT_STR_NUM_KIND_N); */
> +/* FIXME g_assert_cmpfloat(asn->n, ==, 42); */
> +g_assert(err);
> +error_free(err);
> +err = NULL;

...which means that this test is not reliable.  Do you need a v8, or can
you squash this in?


diff --git a/tests/test-qmp-input-visitor.c b/tests/test-qmp-input-visitor.c
index 1b5a369..6104ac6 100644
--- a/tests/test-qmp-input-visitor.c
+++ b/tests/test-qmp-input-visitor.c
@@ -395,6 +395,8 @@ static void
test_visitor_in_alternate_number(TestInputVisitorData *data,
 v = visitor_input_test_init(data, "42");
 visit_type_AltStrBool(v, , NULL, );
 g_assert(err);
+error_free(err);
+err = NULL;
 qapi_free_AltStrBool(asb);

 v = visitor_input_test_init(data, "42");
-- 
2.4.3



-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size()

2015-10-05 Thread John Snow


On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block.c   | 5 +
>  include/block/block.h | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 6d14f5b..8c39d0a 100644
> --- a/block.c
> +++ b/block.c
> @@ -3632,6 +3632,11 @@ const char *bdrv_dirty_bitmap_name(const 
> BdrvDirtyBitmap *bitmap)
>  return bitmap->name;
>  }
>  
> +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
> +{
> +return bitmap->size;
> +}
> +
>  uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
>   uint64_t count)
>  {
> diff --git a/include/block/block.h b/include/block/block.h
> index fb7d410..8166640 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -510,6 +510,7 @@ void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t 
> offset);
>  int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
>  
>  const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
> +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
>  uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
>   uint64_t count);
>  void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
> 

Reviewed-by: John Snow 



Re: [Qemu-devel] [PATCH v6 08/16] qapi: Test use of 'number' within alternates

2015-10-05 Thread Eric Blake
On 09/29/2015 07:38 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> Add some testsuite exposure for use of a 'number' as part of
>> an alternate.  The current state of the tree has a few bugs
>> exposed by this: our input parser depends on the ordering of
>> how the qapi schema declared the alternate, and the parser
>> does not accept integers for a 'number' in an alternate even
>> though it does for numbers outside of an alternate.
>>
>> Mixing 'int' and 'number' in the same alternate is unusual,
>> since both are supplied by json-numbers, but there does not
>> seem to be a technical reason to forbid it given that our
>> json lexer distinguishes between json-numbers that can be
>> represented as an int vs. those that cannot.
>>
>> Improve the existing test_visitor_in_alternate() to match the
>> style of the new test_visitor_in_alternate_number(), and to
>> ensure full coverage of all possible qtype parsing.
>>
>> Signed-off-by: Eric Blake 
>>

>> +++ b/tests/test-qmp-input-visitor.c
>> @@ -371,12 +371,133 @@ static void 
>> test_visitor_in_alternate(TestInputVisitorData *data,
>>  UserDefAlternate *tmp;
>>
>>  v = visitor_input_test_init(data, "42");
>> -
>> -visit_type_UserDefAlternate(v, , NULL, );
>> -g_assert(err == NULL);
>> +visit_type_UserDefAlternate(v, , NULL, _abort);
>>  g_assert_cmpint(tmp->kind, ==, USER_DEF_ALTERNATE_KIND_I);
>>  g_assert_cmpint(tmp->i, ==, 42);
>>  qapi_free_UserDefAlternate(tmp);
>> +visitor_input_teardown(data, NULL);
> 
> Ugly in this test: visitor_input_test_init() is to be paired with
> visitor_input_teardown(), but each test's last visitor_input_teardown()
> can be omitted, because we also pass visitor_input_teardown to
> g_test_add().  Not your patch's fault.  Let's ignore it for now.

In fact, I have another patch coming down the pipeline that moves
visitor_input_teardown() into visitor_input_test_init() (teardown is
idempotent.  And while working on it, I noticed that we could consider
doing something like:

init(data, ...);
func(blah, >err);
g_assert(data->err);
teardown(data);

to let teardown() take care of err, instead of having to manually
 error_free(err); err = NULL;
on each reset (particularly nice if visitor_input_test_init() invokes
teardown() itself, for automatic reset semantics).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 0/2] Improve -help

2015-10-05 Thread Michael S. Tsirkin
On Tue, Oct 06, 2015 at 12:04:40AM +0200, Laurent Vivier wrote:
> 
> 
> On 05/10/2015 23:08, Michael S. Tsirkin wrote:
> > On Mon, Oct 05, 2015 at 03:20:19PM +0200, Laurent Vivier wrote:
> >> no more comments, it should mean it's ok ?
> >>
> >> Can someone merge this or just say "stop this, I don't want that"...
> >>
> >> Laurent
> > 
> > I'm afraid as a first step, we need to bring some order into -help.
> 
> I agree with your comments, but the aim of this series is not to change
> the content, only the container.

That's my point, with content as it is, changing the container
makes things worse not better IMHO.

> It will be easy to add sections once this done.
> 
> > I also suspect that a good place to start doing that is the man page.
> 
> Man page and help are generated from qemu-options.hx, so improving one
> is also
> improving the other, but I think I'm not the good guy to rewrite a man
> page... I'm a mechanic, not a writer :)
> 
> Thanks,
> Laurent



Re: [Qemu-devel] [PATCH] Add syscalls for -runas and -chroot tothe seccomp sandbox

2015-10-05 Thread Namsun Ch'o
> Drawback: complexity.  If we decide to limit ourselves to the original
> threat model (rogue guest), and enter the sandbox only after setup, we
> can keep things simpler.

We could do both without much complexity. This looks simple enough to me:

  rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chroot), 1,
SCMP_A0(SCMP_CMP_EQ, chroot_dir));
  if (rc < 0)
goto seccomp_return;

  rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(chdir), 1,
SCMP_A0(SCMP_CMP_EQ, "/"));
  if (rc < 0)
goto seccomp_return;

The only time chroot_dir is ever used is in os-posix.c:139:

  if (chroot(chroot_dir) < 0) {



Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default

2015-10-05 Thread Michael S. Tsirkin
On Mon, Oct 05, 2015 at 11:39:39PM +0100, Peter Maydell wrote:
> On 2 October 2015 at 14:45, Michael S. Tsirkin  wrote:
> > Most people don't run make check by default, so they skip vhost-user
> > unit tests.  Solve this by using tmpfs instead, unless hugetlbfs is
> > specified (using an environment variable).
> >
> > Signed-off-by: Michael S. Tsirkin 
> > Reviewed-by: Marc-André Lureau 
> 
> Unfortunately I didn't notice before applying the pull, but this
> is breaking 'make check' on AArch64 host for me:
> 
> TEST: tests/vhost-user-test... (pid=20205)
> Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl
> qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
> vhost-net support is not compiled in
> qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
> failed to init vhost_net for queue 0
> 
> Broken pipe
> FAIL: tests/vhost-user-test
> 
> Probably reproducible on x86 if you configure with --disable-vhost-net,
> though I haven't tried that.
> 
> Perhaps tests/vhost-user-test should be set up
> in tests/Makefile using
> check-qtest-i386-$(CONFIG_VHOST_USER) rather
> than CONFIG_LINUX ?
> 
> I'd appreciate a quick fix, because this machine is in my set
> of systems I test all pullreqs on now...
> 
> thanks
> -- PMM


ok first of all we need this: if I apply it, I see the same
bug as you do. need to fix that too.

Signed-off-by: Michael S. Tsirkin 


diff --git a/net/net.c b/net/net.c
index 28a5597..1e6a082 100644
--- a/net/net.c
+++ b/net/net.c
@@ -902,9 +902,7 @@ static int (* const 
net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
 [NET_CLIENT_OPTIONS_KIND_BRIDGE]= net_init_bridge,
 #endif
 [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
-#ifdef CONFIG_VHOST_NET_USED
 [NET_CLIENT_OPTIONS_KIND_VHOST_USER] = net_init_vhost_user,
-#endif
 #ifdef CONFIG_L2TPV3
 [NET_CLIENT_OPTIONS_KIND_L2TPV3]= net_init_l2tpv3,
 #endif



[Qemu-devel] [PATCH resent] linux-user: in poll(), if nfds is 0, pfd can be NULL

2015-10-05 Thread Laurent Vivier
This problem appears with yum in Fedora 20 / PPC64 container.

test case:

#include 
#include 

int main(void)
{
int ret;

ret = poll(NULL, 0, 1000);
printf("%d\n", ret);
}

target test environment: Fedora 20 / PPC64
host test environment: Ubuntu 14.0.2 / x86_64

original test result: -1

13451 poll(0,0,1000,274886297496,26854,268566648) = -1 errno=14 (Bad 
address)

patched test result: 0

13536 poll(0,0,1000,274886297496,26854,268566648) = 0

Signed-off-by: Laurent Vivier 
---
This patch has already been sent in April, this version is just rebased on 
master.
https://patchwork.ozlabs.org/patch/460950/

 linux-user/syscall.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 98b5766..9cdb2a2 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7893,14 +7893,20 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 struct pollfd *pfd;
 unsigned int i;
 
-target_pfd = lock_user(VERIFY_WRITE, arg1, sizeof(struct 
target_pollfd) * nfds, 1);
-if (!target_pfd)
-goto efault;
+pfd = NULL;
+target_pfd = NULL;
+if (nfds) {
+target_pfd = lock_user(VERIFY_WRITE, arg1,
+   sizeof(struct target_pollfd) * nfds, 1);
+if (!target_pfd) {
+goto efault;
+}
 
-pfd = alloca(sizeof(struct pollfd) * nfds);
-for(i = 0; i < nfds; i++) {
-pfd[i].fd = tswap32(target_pfd[i].fd);
-pfd[i].events = tswap16(target_pfd[i].events);
+pfd = alloca(sizeof(struct pollfd) * nfds);
+for (i = 0; i < nfds; i++) {
+pfd[i].fd = tswap32(target_pfd[i].fd);
+pfd[i].events = tswap16(target_pfd[i].events);
+}
 }
 
 # ifdef TARGET_NR_ppoll
-- 
2.4.3




Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default

2015-10-05 Thread Peter Maydell
On 2 October 2015 at 14:45, Michael S. Tsirkin  wrote:
> Most people don't run make check by default, so they skip vhost-user
> unit tests.  Solve this by using tmpfs instead, unless hugetlbfs is
> specified (using an environment variable).
>
> Signed-off-by: Michael S. Tsirkin 
> Reviewed-by: Marc-André Lureau 

Unfortunately I didn't notice before applying the pull, but this
is breaking 'make check' on AArch64 host for me:

TEST: tests/vhost-user-test... (pid=20205)
Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl
qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
vhost-net support is not compiled in
qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
failed to init vhost_net for queue 0

Broken pipe
FAIL: tests/vhost-user-test

Probably reproducible on x86 if you configure with --disable-vhost-net,
though I haven't tried that.

Perhaps tests/vhost-user-test should be set up
in tests/Makefile using
check-qtest-i386-$(CONFIG_VHOST_USER) rather
than CONFIG_LINUX ?

I'd appreciate a quick fix, because this machine is in my set
of systems I test all pullreqs on now...

thanks
-- PMM



Re: [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity()

2015-10-05 Thread John Snow


On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block.c   | 2 +-
>  include/block/block.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 4f7fc0d..6d14f5b 100644
> --- a/block.c
> +++ b/block.c
> @@ -3591,7 +3591,7 @@ uint32_t 
> bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
>  return granularity;
>  }
>  
> -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
> +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
>  {
>  return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
>  }
> diff --git a/include/block/block.h b/include/block/block.h
> index edc1510..fb7d410 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -495,7 +495,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
>  void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
>  BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
>  uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
> -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
> +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
>  bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
>  bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
>  DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
> 

As with Eric's review, with a commit message added:

Reviewed-by: John Snow 



Re: [Qemu-devel] [PULL 14/15] vhost-user-test: use tmpfs by default

2015-10-05 Thread Michael S. Tsirkin
On Mon, Oct 05, 2015 at 11:39:39PM +0100, Peter Maydell wrote:
> On 2 October 2015 at 14:45, Michael S. Tsirkin  wrote:
> > Most people don't run make check by default, so they skip vhost-user
> > unit tests.  Solve this by using tmpfs instead, unless hugetlbfs is
> > specified (using an environment variable).
> >
> > Signed-off-by: Michael S. Tsirkin 
> > Reviewed-by: Marc-André Lureau 
> 
> Unfortunately I didn't notice before applying the pull, but this
> is breaking 'make check' on AArch64 host for me:
> 
> TEST: tests/vhost-user-test... (pid=20205)
> Warning: path not on HugeTLBFS: /tmp/vhost-test-gRpbwl
> qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
> vhost-net support is not compiled in
> qemu-system-i386: -netdev vhost-user,id=net0,chardev=chr0,vhostforce:
> failed to init vhost_net for queue 0
> 
> Broken pipe
> FAIL: tests/vhost-user-test
> 
> Probably reproducible on x86 if you configure with --disable-vhost-net,
> though I haven't tried that.
> 
> Perhaps tests/vhost-user-test should be set up
> in tests/Makefile using
> check-qtest-i386-$(CONFIG_VHOST_USER) rather
> than CONFIG_LINUX ?
> 
> I'd appreciate a quick fix, because this machine is in my set
> of systems I test all pullreqs on now...
> 
> thanks
> -- PMM

I think you are right, but just to be on the safe side, let's test
both for now.

If this helps you, pls feel free to apply.

I will look at cleaning this up later.

-->

tests: vhost-user: disable unless CONFIG_VHOST_NET

vhost-user depends on vhost-net. We should probably fix that.
For now, let's disable the test otherwise.

Signed-off-by: Michael S. Tsirkin 

---

diff --git a/tests/Makefile b/tests/Makefile
index 4063639..e6474ba 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -188,7 +188,9 @@ gcov-files-i386-y += hw/usb/hcd-xhci.c
 check-qtest-i386-y += tests/pc-cpu-test$(EXESUF)
 check-qtest-i386-y += tests/q35-test$(EXESUF)
 gcov-files-i386-y += hw/pci-host/q35.c
+ifeq ($(CONFIG_VHOST_NET),y)
 check-qtest-i386-$(CONFIG_LINUX) += tests/vhost-user-test$(EXESUF)
+endif
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
 gcov-files-x86_64-y = $(subst 
i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))




Re: [Qemu-devel] [RFC PATCH 07/10] spapr_pci: Allow PCI host bridge DMA window to be configured

2015-10-05 Thread David Gibson
On Mon, Oct 05, 2015 at 04:13:30PM +0200, Paolo Bonzini wrote:
> 
> 
> On 03/10/2015 02:25, Alexey Kardashevskiy wrote:
> >> I think this is the aim of VMSTATE_UINT64_EQUAL() ?
> > 
> > We use it only for things which cannot be set via the command line
> > and ideally there should be no VMSTATE_*_EQUAL. If something can be
> > set via the command line, then the management software (read -
> > libvirt) runs QEMU with explicit parameters to guarantee that these
> > are equal.
> 
> VMSTATE_*_EQUAL is used when a value is later used as e.g. the size of
> an array.  It basically provides bounds checking for the subsequent
> array, avoiding that an invalid migration file or an error issuing the
> QEMU command on the destination transforms into a buffer overflow.
> 
> Michael Roth did most of this work, IIRC.  Documenting it in
> docs/migration.txt would be nice.

Ah.. which means we probably should use VMSTATE_*_EQUAL here since the
window size determines the size of the array of actual TCEs to follow
shortly.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [RFC v0 1/2] spapr: Accommadate alignment gaps in hotplug memory region

2015-10-05 Thread David Gibson
On Mon, Oct 05, 2015 at 11:05:07AM +0200, Igor Mammedov wrote:
> On Mon,  5 Oct 2015 14:05:23 +0530
> Bharata B Rao  wrote:
> 
> > Size hotplug memory region assuming a 256MB max alignment every slot.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/spapr.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index fc5e7d6..2ec509b 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1792,6 +1792,9 @@ static void ppc_spapr_init(MachineState *machine)
> >  
> >  spapr->hotplug_memory.base = ROUND_UP(machine->ram_size,
> >SPAPR_HOTPLUG_MEM_ALIGN);
> > +
> > +/* size hotplug region assuming 256M max alignment per slot */
> > +hotplug_mem_size += SPAPR_MEMORY_BLOCK_SIZE * machine->ram_slots;
> Does target support hugepages backend? If it does then adjustment probably
> should be max supported hugepage alignment.

Hrm, so the maximum possible page size on Power is 16G (though we
don't yet support that on "powernv" which is what the host system will
generally be).

Not sure if the possibility of 16G "colossal pages" in future is
enough reason to put such a huge gap.  There aren't any other page
sizes between 16MB and 16GB.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH qemu] kvm-all: Align to qemu_real_host_page_size in kvm_set_phys_mem

2015-10-05 Thread Alexey Kardashevskiy
As the comment in kvm_set_phys_mem() says, KVM works in page size chunks.
However it uses hardcoded TARGET_PAGE_SIZE which is 4K on most platforms
while actual host may use different page size, for example, PPC64 hosts
use 64K system pages.

This replaces static TARGET_PAGE_SIZE with run-time calculated
qemu_real_host_page_size.

Signed-off-by: Alexey Kardashevskiy 
---

This is the result of debugging VFIO quirks not working under PPC64 KVM.
---
 kvm-all.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 0be4615..6f04fbb 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -642,15 +642,15 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
 /* kvm works in page size chunks, but the function may be called
with sub-page size and unaligned start address. Pad the start
address to next and truncate size to previous page boundary. */
-delta = (TARGET_PAGE_SIZE - (start_addr & ~TARGET_PAGE_MASK));
-delta &= ~TARGET_PAGE_MASK;
+delta = qemu_real_host_page_size - (start_addr & 
~qemu_real_host_page_mask);
+delta &= ~qemu_real_host_page_mask;
 if (delta > size) {
 return;
 }
 start_addr += delta;
 size -= delta;
-size &= TARGET_PAGE_MASK;
-if (!size || (start_addr & ~TARGET_PAGE_MASK)) {
+size &= qemu_real_host_page_mask;
+if (!size || (start_addr & ~qemu_real_host_page_mask)) {
 return;
 }
 
-- 
2.5.0.rc3




Re: [Qemu-devel] [RFC PATCH 07/10] spapr_pci: Allow PCI host bridge DMA window to be configured

2015-10-05 Thread David Gibson
On Tue, Oct 06, 2015 at 02:25:07PM +1100, David Gibson wrote:
> On Mon, Oct 05, 2015 at 04:13:30PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 03/10/2015 02:25, Alexey Kardashevskiy wrote:
> > >> I think this is the aim of VMSTATE_UINT64_EQUAL() ?
> > > 
> > > We use it only for things which cannot be set via the command line
> > > and ideally there should be no VMSTATE_*_EQUAL. If something can be
> > > set via the command line, then the management software (read -
> > > libvirt) runs QEMU with explicit parameters to guarantee that these
> > > are equal.
> > 
> > VMSTATE_*_EQUAL is used when a value is later used as e.g. the size of
> > an array.  It basically provides bounds checking for the subsequent
> > array, avoiding that an invalid migration file or an error issuing the
> > QEMU command on the destination transforms into a buffer overflow.
> > 
> > Michael Roth did most of this work, IIRC.  Documenting it in
> > docs/migration.txt would be nice.
> 
> Ah.. which means we probably should use VMSTATE_*_EQUAL here since the
> window size determines the size of the array of actual TCEs to follow
> shortly.

Wait.. no we don't.

The vmstate for the sPAPRTCETable object which actually holds the
IOMMU page table information already has a suitable VMSTATE_*_EQUAL to
protect the variable sized array, so we don't need another one here.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v5 32/48] ivshmem-client: check the number of vectors

2015-10-05 Thread Claudio Fontana
On 02.10.2015 21:09, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Check the number of vectors received from the server, to avoid
> out of bound array access.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  contrib/ivshmem-client/ivshmem-client.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/contrib/ivshmem-client/ivshmem-client.c 
> b/contrib/ivshmem-client/ivshmem-client.c
> index 11c805c..34a65b1 100644
> --- a/contrib/ivshmem-client/ivshmem-client.c
> +++ b/contrib/ivshmem-client/ivshmem-client.c
> @@ -128,6 +128,11 @@ ivshmem_client_handle_server_msg(IvshmemClient *client)
>  /* new vector */
>  IVSHMEM_CLIENT_DEBUG(client, "  new vector %d (fd=%d) for peer id %ld\n",
>   peer->vectors_count, fd, peer->id);
> +if (peer->vectors_count >= G_N_ELEMENTS(peer->vectors)) {
> +IVSHMEM_CLIENT_DEBUG(client, "Too many vector received, failing");

nit: "Too many vectors"

Reviewed-by: Claudio Fontana 


> +return -1;
> +}
> +
>  peer->vectors[peer->vectors_count] = fd;
>  peer->vectors_count++;
>  
> 





Re: [Qemu-devel] [PATCH v5 40/48] glib-compat: add 2.38/2.40/2.46 asserts

2015-10-05 Thread Paolo Bonzini


On 05/10/2015 12:56, Claudio Fontana wrote:
> Hi, I did not find g_assertion_message in any of the exported GLIB APIs.
> In fact, I found it in glib/gtestutils.h under the section "internal ABI".
> This is a hint that we should not be using it right?

Since this only matters for these pre-2.46 versions, where we know that
g_assertion_message exists, we can use it.

Paolo



Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value

2015-10-05 Thread Chen Gang

> On 5 October 2015 at 12:21, Chen Gang  wrote:
>> +static float32 t_to_float32 (uint32_t a)
>> +{
>> + CPU_FloatU r;
>> + r.l = a;
>> + return r.f;
>> +}
>
> This appears to be reimplementing make_float32().
>

OK, thanks.

>> +
>> +static uint32_t float32_to_t(float32 a)
>> +{
>> + CPU_FloatU r;
>> + r.f = a;
>> + return r.l;
>> +}
>
> And this is just float32_val().
>

OK, thanks.

>> uint64_t helper_fsingle_add1(CPUTLGState *env, uint64_t rsrc, uint64_t rsrcb)
>> {
>> - FPUTLGState *fpu = >fpu;
>> - return fsingle_calc(fpu, int64_to_float32(rsrc, _STATUS),
>> - int64_to_float32(rsrcb, _STATUS),
>> - float32_add);
>> + return fsingle_calc(>fpu, t_to_float32((uint32_t)rsrc),
>> + t_to_float32((uint32_t)rsrcb), float32_add);
>> }
>
> Why is the helper for a single-precision operation taking a 64-bit
> argument anyway?
>

Oh, the register are uint64_t, so I guess the input register value are 64-bit, 
too.


Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
  

Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value

2015-10-05 Thread Chen Gang

> From: peter.mayd...@linaro.org
> On 5 October 2015 at 12:54, Chen Gang  wrote:
>>> On 5 October 2015 at 12:21, Chen Gang  wrote:
>>> Why is the helper for a single-precision operation taking a 64-bit
>>> argument anyway?
>>>
>>
>> Oh, the register are uint64_t, so I guess the input register value are 
>> 64-bit, too.
>
> Usually for single precision the generated code is also working
> with 32-bit values. What you have is not necessarily
> wrong, but it is odd. I'd have to check the architecture
> manual and the translate.c code to be sure.
>

OK, thanks. Originally, I referenced alpha and s390x for it.

--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
  

Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

2015-10-05 Thread Gabriel L. Somlo
On Mon, Oct 05, 2015 at 11:00:36AM +0100, Mark Rutland wrote:
> On Sat, Oct 03, 2015 at 07:28:05PM -0400, Gabriel L. Somlo wrote:
> > From: "Gabriel Somlo" 
> > 
> > Allow access to QEMU firmware blobs, passed into the guest VM via
> > the fw_cfg device, through SysFS entries. Blob meta-data (e.g. name,
> > size, and fw_cfg key), as well as the raw binary blob data may be
> > accessed.
> > 
> > The SysFS access location is /sys/firmware/qemu_fw_cfg/... and was
> > selected based on overall similarity to the type of information
> > exposed under /sys/firmware/dmi/entries/...
> 
> What is the intended use of these?
> 
> Some of the keys in the example look like they'd come from other sources
> (e.g. the *-tables entries), while others look like kernel/bootloader
> configuration options (e.g. etc/boot-fail-wait, bootorder) -- I'm
> concerned about redundancy here.

Paolo already answered that (more eloquently than I would have) so I'll
leave it at that, for now...

> 
> > NEW (since v2): Using ACPI to detect the presence and details of the
> > fw_cfg virtual hardware device.
> > 
> > Device Tree has been suggested by Ard as a comment on v2 of this
> > patch, but after some deliberation I decided to go with ACPI,
> > since it's supported on both x86 and some (uefi-enabled) versions
> > of aarch64. I really don't see how I'd reasonably use *both* DT (on
> > ARM) *and* ACPI (on x86), and after all I'm mostly concerned with
> > x86, but originally wanted to maximize portability (which is where
> > the register probing in earlier versions came from).
> 
> There are defintitely going to be arm64 VMs that don't use ACPI, so we
> may need DT support depending on what the intended use is.
> 
> I'm not sure I follow what the difficulty with supporting DT in addition
> to ACPI is? It looks like all you need is a compatible string and a reg
> entry.

Bearing in mind that I have almost no experience with arm:

I started out by probing all possible port-io and mmio locations where
fw_cfg registers might have been found, from a "classic" module_init
method.

Arm has DT, which as far as I understand will answer the following two
questions: 1. Do I have fw_cfg ? 2. If yes, what address range does it use ?
So that I could continue using a classic module_init, but won't need
to probe for the device.

PC (my primary architecture, the one I actually care about) does not
have DT. If I want to share the same code, I can't probe, so if I try
DT and don't find fw_cfg there (or somehow DT is no-op-ed out because
I'm on a PC guest), I could somehow look it up in ACPI the same way
(i.e., use ACPI as sort of a stand-in for DT).

But all ACPI-enabled drivers I could find use dedicated macros (i.e.
no more classic module_init() and module_exit(), but rather
module_acpi_driver() with .add and .remove methods on an acpi_driver
object, etc.) Not sure how I'd glue DT back into something like that.

In addition, Michael's comment earlier in the thread suggests that
even my current acpi version isn't sufficiently "orthodox" w.r.t.
ACPI, and I should be providing the hardware access routine as
an ACPI/AML routine, to avoid race conditions with the rest of ACPI,
and for encapsulation. I.e. it's even rude to use the fw_cfg node's
ACPI _CRS method (the part where I'd be treating it like a DT stand-in
only to query fw_cfg's hardware specifics).

So far, all the information I've been able to pull together points
away from a dual DT + ACPI all-in-one solution for fw_cfg. If you know
of an example where that's done in an acceptable way, please let
me know so I can use it for inspiration...

Thanks much,
--Gabriel

> 
> > A patch set generating an ACPI device node for qemu's fw_cfg is
> > currently under review on the qemu-devel list:
> > 
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg06946.html
> > (sorry, gmane appears down at the moment...)
> > 
> > In consequence:
> > 
> > - Patch 1/4 is mostly the same as in v2;
> > - Patch 2/4 switches device initialization from register
> >   probing to using ACPI; this is a separate patch only to
> >   illustrate the transition from probing to ACPI, and I'm
> >   assuming it will end up squashed on top of patch 1/4 in
> >   the final version.
> > 
> > - Patches 3/4 and 4/4 add a "human-readable" directory
> >   hierarchy built from tokenizing fw_cfg blob names into
> >   '/'-separated components, with symlinks to each 'by_key'
> >   blob folder (same as in earlier versions). At Greg's
> >   suggestion I tried to build this folder hierarchy and
> >   leaf symlinks using udev rules, but so far I haven't been
> >   successful in figuring that out. If udev turns out to 
> >   be applicable after all, these two patches can be dropped
> >   from this series.
> > 
> > In other words, patches 1 and 2 give us the following "by_key" listing
> > of blobs contained in the qemu fw_cfg device 

Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

2015-10-05 Thread Gabriel L. Somlo
On Mon, Oct 05, 2015 at 01:50:47PM +0100, Peter Maydell wrote:
> On 5 October 2015 at 13:40, Gabriel L. Somlo  wrote:
> > In addition, Michael's comment earlier in the thread suggests that
> > even my current acpi version isn't sufficiently "orthodox" w.r.t.
> > ACPI, and I should be providing the hardware access routine as
> > an ACPI/AML routine, to avoid race conditions with the rest of ACPI,
> > and for encapsulation. I.e. it's even rude to use the fw_cfg node's
> > ACPI _CRS method (the part where I'd be treating it like a DT stand-in
> > only to query fw_cfg's hardware specifics).
> 
> If you want to try to support "firmware might also be reading
> fw_cfg at the same time as the kernel" this is a (painful)
> problem regardless of how the kernel figures out whether a
> fw_cfg device is present. I had assumed that one of the design
> assumptions of this series was that firmware would only
> read the fw_cfg before booting the guest kernel and never touch
> it afterwards. If it might touch it later then letting the
> guest kernel also mess with fw_cfg seems like a really bad idea.

I don't know of any case where firmware and kernel might race each
other to access fw_cfg.

The issue AFAICT is whether it's safe (future-proof) to rely on
parsing _CRS for the fw_cfg i/o access information, or whether
such logic could be rendered obsolete by potential future updates
to fw_cfg's _CRS. If I "outsource" the fw_cfg_dump_blob_by_key()
functionality entirely to an ACPI method, my kernel driver won't
have to worry about keeping up with said future updates.

On the down-side, that means the kernel driver will be ACPI or
nothing (but I'm OK with that, at my curent level of understanding :)

Thanks,
--Gabriel



Re: [Qemu-devel] [PATCHv3 5/7] memory: Allow replay of IOMMU mapping notifications

2015-10-05 Thread Paolo Bonzini


On 30/09/2015 04:13, David Gibson wrote:
> When we have guest visible IOMMUs, we allow notifiers to be registered
> which will be informed of all changes to IOMMU mappings.  This is used by
> vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.
> 
> However, unlike with a memory region listener, an iommu notifier won't be
> told about any mappings which already exist in the (guest) IOMMU at the
> time it is registered.  This can cause problems if hotplugging a VFIO
> device onto a guest bus which had existing guest IOMMU mappings, but didn't
> previously have an VFIO devices (and hence no host IOMMU mappings).
> 
> This adds a memory_region_iommu_replay() function to handle this case.  It
> replays any existing mappings in an IOMMU memory region to a specified
> notifier.  Because the IOMMU memory region doesn't internally remember the
> granularity of the guest IOMMU it has a small hack where the caller must
> specify a granularity at which to replay mappings.
> 
> If there are finer mappings in the guest IOMMU these will be reported in
> the iotlb structures passed to the notifier which it must handle (probably
> causing it to flag an error).  This isn't new - the VFIO iommu notifier
> must already handle notifications about guest IOMMU mappings too short
> for it to represent in the host IOMMU.
> 
> Signed-off-by: David Gibson 
> ---
>  include/exec/memory.h | 13 +
>  memory.c  | 20 
>  2 files changed, 33 insertions(+)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 5baaf48..0f07159 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -583,6 +583,19 @@ void memory_region_notify_iommu(MemoryRegion *mr,
>  void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
>  
>  /**
> + * memory_region_iommu_replay: replay existing IOMMU translations to
> + * a notifier
> + *
> + * @mr: the memory region to observe
> + * @n: the notifier to which to replay iommu mappings
> + * @granularity: Minimum page granularity to replay notifications for
> + * @is_write: Whether to treat the replay as a translate "write"
> + * through the iommu
> + */
> +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
> +hwaddr granularity, bool is_write);
> +
> +/**
>   * memory_region_unregister_iommu_notifier: unregister a notifier for
>   * changes to IOMMU translation entries.
>   *
> diff --git a/memory.c b/memory.c
> index ef87363..1b03d22 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1403,6 +1403,26 @@ void 
> memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
>  notifier_list_add(>iommu_notify, n);
>  }
>  
> +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
> +hwaddr granularity, bool is_write)
> +{
> +hwaddr addr;
> +IOMMUTLBEntry iotlb;
> +
> +for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
> +iotlb = mr->iommu_ops->translate(mr, addr, is_write);
> +if (iotlb.perm != IOMMU_NONE) {
> +n->notify(n, );
> +}
> +
> +/* if (2^64 - MR size) < granularity, it's possible to get an
> + * infinite loop here.  This should catch such a wraparound */
> +if ((addr + granularity) < addr) {
> +break;
> +}
> +}
> +}
> +
>  void memory_region_unregister_iommu_notifier(Notifier *n)
>  {
>  notifier_remove(n);
> 

Acked-by: Paolo Bonzini 



Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

2015-10-05 Thread Gabriel L. Somlo
On Mon, Oct 05, 2015 at 01:56:47PM +0100, Mark Rutland wrote:
> On Mon, Oct 05, 2015 at 08:43:46AM -0400, Gabriel L. Somlo wrote:
> > On Mon, Oct 05, 2015 at 01:23:33PM +0100, Mark Rutland wrote:
> > > On Mon, Oct 05, 2015 at 01:48:52PM +0200, Paolo Bonzini wrote:
> > > > 
> > > > 
> > > > On 05/10/2015 12:00, Mark Rutland wrote:
> > > > > Some of the keys in the example look like they'd come from other 
> > > > > sources
> > > > > (e.g. the *-tables entries), while others look like kernel/bootloader
> > > > > configuration options (e.g. etc/boot-fail-wait, bootorder) -- I'm
> > > > > concerned about redundancy here.
> > > > 
> > > > The redundancy is because the firmware and the bootloader actually
> > > > _consume_ these fw_cfg strings to produce the others (the ACPI tables,
> > > > the kernel configuration options).
> > > > 
> > > > On the other hand, hiding some strings just because they ought to have
> > > > been consumed already makes little sense.
> > > 
> > > Sure. However, I'm concerned that providing redundant interfaces for
> > > those could lead to people grabbing information from here (because it's
> > > convenient) rather than the existing canonical locations, which means we
> > > get more software that works on fewer systems for no good reason.
> > > 
> > > What I couldn't figure out was what _additional_ information this
> > > provided; it looked like a mixed bag of details we could already get
> > > from disparate sources. If that's all it does, then it seems to me like
> > > it doesn't add any benefit and potentially makes things worse.
> > > 
> > > So what do we get from this interface that we cannot get elsewhere, and
> > > why is this the best way of exposing it?
> > 
> > Starting with qemu 2.4, it is possible to insert arbitrary named
> > blobs into fw_cfg from the qemu command line. *Those* entries
> > might be interesting to userspace, which is why it might be handy
> > to access to fw_cfg blobs in general.
> 
> So this is a mechanism to pass arbitrary key:value pairs to a guest
> userspace? What would those be used for, and why would this be the
> correct location for that?

Yes to arbitrary host->guest arbitrary key:value pairs.
fw_cfg because it's asynchronous (host supplies the data at guest
start time, and no longer has to worry about whether and when guests
may or may not start some sort of agent in order to be able to accept
connections, etc); also because it's guest-os agnostic (no
piggy-backing on e.g. kernel command line). Drivers to make data
available to guest userspace can be written for any guest OS.

> How do we avoid clashes between user-selected names and those we need to
> pass actual FW data?

Internally supplied blobs (by QEMU) meant for the firmware are, by
convention, prefixed with "/etc/...". Command-line blobs are expected
to use "opt/...". QEMU issues a warning if a name is used on the
command line that doesn't begin with 'opt/'.

Thanks,
--Gabriel



Re: [Qemu-devel] [PATCH v5 40/48] glib-compat: add 2.38/2.40/2.46 asserts

2015-10-05 Thread Marc-André Lureau
Hi

On Mon, Oct 5, 2015 at 12:56 PM, Claudio Fontana
 wrote:
> Hi, I did not find g_assertion_message in any of the exported GLIB APIs.
> In fact, I found it in glib/gtestutils.h under the section "internal ABI".
> This is a hint that we should not be using it right?

Good point, I didn't see that. Notice that this file has other weird
comments, such as "semi-internal API". I think it's fine to use
though, because it's an external symbol and glib has ABI stability
guarantees. I opened https://bugzilla.gnome.org/show_bug.cgi?id=756077
to change the glib comments.

-- 
Marc-André Lureau



Re: [Qemu-devel] [PULL 00/10] Fix device introspection regressions

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 07:49, Markus Armbruster  wrote:
> Peter Maydell  writes:
>
>> On 2 October 2015 at 18:20, Markus Armbruster  wrote:
>>> QMP command device-list-properties regressed in 2.1: it can crash or
>>> leave dangling pointers behind.
>>>
>>> -device FOO,help regressed in 2.2: it no longer works for
>>> non-pluggable devices.  I tried to fix that some time ago[*], but my
>>> fix failed review.  This is my second, more comprehensive try.
>>>
>>> PATCH 1-3 fix one class of bugs involved in the regressions, PATCH 4-5
>>> are libqtest preliminaries, PATCH 6 adds tests to demonstrate the
>>> remaining bugs, PATCH 7-9 fix them to a degree (see PATCH 8 for
>>> limitations), and PATCH 10 cleans up.
>>
>> This ordering breaks bisection of 'make check', as I found out when
>> I tried to figure out which of the patches in this pull was causing
>> an OSX test failure. Please can you reorder them so that 'make check'
>> works at all points in the series?
>
> My ordering may be bad (and I'll recheck it, of course), or it may
> temporarily expose a hidden bug.  I better figure out what's going on
> here.
>
>>> The following changes since commit ff770b07f34d28b79013a83989bd6c85f8f16b2f:
>>>
>>>   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into 
>>> staging (2015-10-02 11:01:18 +0100)
>>>
>>> are available in the git repository at:
>>>
>>>   git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2015-10-02
>>>
>>> for you to fetch changes up to e927162a6fa2fa6144de9d1d11cc9448a2143671:
>>>
>>>   Revert "qdev: Use qdev_get_device_class() for -device ,help" 
>>> (2015-10-02 16:45:53 +0200)
>>>
>>> 
>>> Fix device introspection regressions
>>>
>>> 
>>
>> 'make check' failure on OSX:
>>
>>   /aarch64/device/introspect/list: OK
>>   /aarch64/device/introspect/none: OK
>>   /aarch64/device/introspect/abstract: OK
>>   /aarch64/device/introspect/concrete: **
>> ERROR:/Users/pm215/src/qemu-for-merges/qom/object.c:333:void
>> object_initialize_with_type(void *, size_t, TypeImpl *): assertion
>> failed: (type != NULL)
>> Broken pipe
>> FAIL
>>
>> I have no idea why this only failed on OSX...
>
> Can you re-run this with valgrind spliced in?

Valgrind is not particularly helpful: it reports a couple of
irrelevancies and an unimplemented syscall, then just
reports the backtrace for the abort:

==26853== Memcheck, a memory error detector
==26853== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==26853== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==26853== Command: ./aarch64-softmmu/qemu-system-aarch64 -qtest
unix:/tmp/qtest-26555.sock,nowait -qtest-log /dev/null -qmp
unix:/tmp/qtest-26555.qmp,nowait -machine accel=qtest -display none
-nodefaults -machine none
==26853== Parent PID: 26555
==26853==
==26853== Syscall param __pthread_sigmask(set) points to uninitialised byte(s)
==26853==at 0x10434E2B6: __pthread_sigmask (in
/usr/lib/system/libsystem_kernel.dylib)
==26853==by 0x10446406D: pthread_sigmask (in
/usr/lib/system/libsystem_pthread.dylib)
==26853==by 0x100537022: qemu_thread_create (qemu-thread-posix.c:488)
==26853==by 0x100550ACB: rcu_init_complete (rcu.c:320)
==26853==by 0x100550B18: rcu_init (rcu.c:351)
==26853==by 0x7FFF5FC12D0A:
ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&)
(in /usr/lib/dyld)
==26853==by 0x7FFF5FC12E97:
ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&)
(in /usr/lib/dyld)
==26853==by 0x7FFF5FC0F890:
ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&,
unsigned int, ImageLoader::InitializerTimingList&,
ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==26853==by 0x7FFF5FC0F717:
ImageLoader::processInitializers(ImageLoader::LinkContext const&,
unsigned int, ImageLoader::InitializerTimingList&,
ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==26853==by 0x7FFF5FC0F988:
ImageLoader::runInitializers(ImageLoader::LinkContext const&,
ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==26853==by 0x7FFF5FC02244: dyld::initializeMainExecutable() (in
/usr/lib/dyld)
==26853==by 0x7FFF5FC05C18: dyld::_main(macho_header const*,
unsigned long, int, char const**, char const**, char const**, unsigned
long*) (in /usr/lib/dyld)
==26853==  Address 0x1056e0c80 is on thread 1's stack
==26853==  in frame #2, created by qemu_thread_create (qemu-thread-posix.c:461)
==26853==
==26853== Syscall param __pthread_sigmask(set) points to uninitialised byte(s)
==26853==at 0x10434E2B6: __pthread_sigmask (in
/usr/lib/system/libsystem_kernel.dylib)
==26853==by 0x10446406D: pthread_sigmask (in

Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

2015-10-05 Thread Mark Rutland
> > I'm not sure I follow what the difficulty with supporting DT in addition
> > to ACPI is? It looks like all you need is a compatible string and a reg
> > entry.
> 
> Bearing in mind that I have almost no experience with arm:
> 
> I started out by probing all possible port-io and mmio locations where
> fw_cfg registers might have been found, from a "classic" module_init
> method.
> 
> Arm has DT, which as far as I understand will answer the following two
> questions: 1. Do I have fw_cfg ? 2. If yes, what address range does it use ?
> So that I could continue using a classic module_init, but won't need
> to probe for the device.
> 
> PC (my primary architecture, the one I actually care about) does not
> have DT. If I want to share the same code, I can't probe, so if I try
> DT and don't find fw_cfg there (or somehow DT is no-op-ed out because
> I'm on a PC guest), I could somehow look it up in ACPI the same way
> (i.e., use ACPI as sort of a stand-in for DT).

I'd imagine that it's simple to have something in your probe path like:

if (pdev->dev.of_node)
parse_dt(pdev);
else
parse_acpi(pdev);

> But all ACPI-enabled drivers I could find use dedicated macros (i.e.
> no more classic module_init() and module_exit(), but rather
> module_acpi_driver() with .add and .remove methods on an acpi_driver
> object, etc.) Not sure how I'd glue DT back into something like that.

You don't have to use those macros, and can simply use the classic
module_{init,exit} functions, calling the requisite acpi driver
registration functions at module {init,exit} time.

> In addition, Michael's comment earlier in the thread suggests that
> even my current acpi version isn't sufficiently "orthodox" w.r.t.
> ACPI, and I should be providing the hardware access routine as
> an ACPI/AML routine, to avoid race conditions with the rest of ACPI,
> and for encapsulation. I.e. it's even rude to use the fw_cfg node's
> ACPI _CRS method (the part where I'd be treating it like a DT stand-in
> only to query fw_cfg's hardware specifics).

As Peter stated, this sounds very much like it rules out sharing the
interface with FW generally (and is certainly scary).

> So far, all the information I've been able to pull together points
> away from a dual DT + ACPI all-in-one solution for fw_cfg. If you know
> of an example where that's done in an acceptable way, please let
> me know so I can use it for inspiration...

I'm not immediately aware, but I would imagine you could search for
files that had both an of_match_table and a acpi_bus_register_driver
call.

Thanks,
Mark.



[Qemu-devel] [PATCH] tests: Unique test path for /string-visitor/output

2015-10-05 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Newer GLib's want unique test paths, and thus moan at dupes.
(Seen on Fedora 23 which has glib 2.46)

Uniqueify the paths.

Signed-off-by: Dr. David Alan Gilbert 
---
 tests/test-string-output-visitor.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tests/test-string-output-visitor.c 
b/tests/test-string-output-visitor.c
index 101fb27..fd5e67b 100644
--- a/tests/test-string-output-visitor.c
+++ b/tests/test-string-output-visitor.c
@@ -248,39 +248,39 @@ int main(int argc, char **argv)
 
 output_visitor_test_add("/string-visitor/output/int",
 _visitor_data, test_visitor_out_int, false);
-output_visitor_test_add("/string-visitor/output/int",
+output_visitor_test_add("/string-visitor/output/int-human",
 _visitor_data, test_visitor_out_int, true);
 output_visitor_test_add("/string-visitor/output/bool",
 _visitor_data, test_visitor_out_bool, false);
-output_visitor_test_add("/string-visitor/output/bool",
+output_visitor_test_add("/string-visitor/output/bool-human",
 _visitor_data, test_visitor_out_bool, true);
 output_visitor_test_add("/string-visitor/output/number",
 _visitor_data, test_visitor_out_number, false);
-output_visitor_test_add("/string-visitor/output/number",
+output_visitor_test_add("/string-visitor/output/number-human",
 _visitor_data, test_visitor_out_number, true);
 output_visitor_test_add("/string-visitor/output/string",
 _visitor_data, test_visitor_out_string, false);
-output_visitor_test_add("/string-visitor/output/string",
+output_visitor_test_add("/string-visitor/output/string-human",
 _visitor_data, test_visitor_out_string, true);
 output_visitor_test_add("/string-visitor/output/no-string",
 _visitor_data, test_visitor_out_no_string,
 false);
-output_visitor_test_add("/string-visitor/output/no-string",
+output_visitor_test_add("/string-visitor/output/no-string-human",
 _visitor_data, test_visitor_out_no_string,
 true);
 output_visitor_test_add("/string-visitor/output/enum",
 _visitor_data, test_visitor_out_enum, false);
-output_visitor_test_add("/string-visitor/output/enum",
+output_visitor_test_add("/string-visitor/output/enum-human",
 _visitor_data, test_visitor_out_enum, true);
 output_visitor_test_add("/string-visitor/output/enum-errors",
 _visitor_data, test_visitor_out_enum_errors,
 false);
-output_visitor_test_add("/string-visitor/output/enum-errors",
+output_visitor_test_add("/string-visitor/output/enum-errors-human",
 _visitor_data, test_visitor_out_enum_errors,
 true);
 output_visitor_test_add("/string-visitor/output/intList",
 _visitor_data, test_visitor_out_intList, 
false);
-output_visitor_test_add("/string-visitor/output/intList",
+output_visitor_test_add("/string-visitor/output/intList-human",
 _visitor_data, test_visitor_out_intList, true);
 
 g_test_run();
-- 
2.5.0




Re: [Qemu-devel] [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for the input register value

2015-10-05 Thread Peter Maydell
On 5 October 2015 at 12:21, Chen Gang  wrote:
> From 6bb2ed5b7046cda545f6a12721b773fde40f07f1 Mon Sep 17 00:00:00 2001
> From: Chen Gang 
> Date: Mon, 5 Oct 2015 19:12:07 +0800
> Subject: [PATCH] temp-floating-point: Use float32_to_t and t_to_float32 for
>  the input register value
>
> Original implementation use int*_to_float32 and float32_to_int*, which
> will generate incorrect result.
>
> Signed-off-by: Chen Gang 
> ---
>  target-tilegx/fpu_helper.c | 51 
> +-
>  1 file changed, 23 insertions(+), 28 deletions(-)
>
> diff --git a/target-tilegx/fpu_helper.c b/target-tilegx/fpu_helper.c
> index daae570..2707f30 100644
> --- a/target-tilegx/fpu_helper.c
> +++ b/target-tilegx/fpu_helper.c
> @@ -68,6 +68,20 @@ static uint64_t float64_to_t(float64 fa)
>  return r.ll;
>  }
>
> +static float32 t_to_float32 (uint32_t a)
> +{
> +CPU_FloatU r;
> +r.l = a;
> +return r.f;
> +}

This appears to be reimplementing make_float32().

> +
> +static uint32_t float32_to_t(float32 a)
> +{
> +CPU_FloatU r;
> +r.f = a;
> +return r.l;
> +}

And this is just float32_val().

>  uint64_t helper_fsingle_add1(CPUTLGState *env, uint64_t rsrc, uint64_t rsrcb)
>  {
> -FPUTLGState *fpu = >fpu;
> -return fsingle_calc(fpu, int64_to_float32(rsrc, _STATUS),
> -int64_to_float32(rsrcb, _STATUS),
> -float32_add);
> +return fsingle_calc(>fpu, t_to_float32((uint32_t)rsrc),
> +t_to_float32((uint32_t)rsrcb), float32_add);
>  }

Why is the helper for a single-precision operation taking a 64-bit
argument anyway?

thanks
-- PMM



  1   2   >