Re: [Qemu-devel] Isuue assiging devices using VFIO on x86

2012-08-28 Thread Alex Williamson
On Tue, 2012-08-28 at 17:10 +, Bhushan Bharat-R65777 wrote:
> 
> > -Original Message-
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Tuesday, August 28, 2012 9:27 PM
> > To: Bhushan Bharat-R65777
> > Cc: k...@vger.kernel.org; Avi Kivity; qemu-devel@nongnu.org
> > Subject: Re: Isuue assiging devices using VFIO on x86
> > 
> > On Tue, 2012-08-28 at 09:23 +, Bhushan Bharat-R65777 wrote:
> > > Hi Alex,
> > >
> > > In my susyem I have following devices:
> > >
> > > I tried assigning a following PCI devices:
> > > 00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI
> > Controller (rev 03)
> > > 00:03.2 IDE interface: Intel Corporation 4 Series Chipset PT IDER 
> > > Controller
> > (rev 03)
> > > 00:03.3 Serial controller: Intel Corporation 4 Series Chipset Serial KT
> > Controller (rev 03)
> > >
> > > and getting below error:
> > >
> > > ---
> > > Command:
> > > ---
> > > qemu-system-x86_64 -enable-kvm  -nographic  -kernel 
> > > /boot/vmlinuz-3.6.0-rc2+ -
> > initrd /boot/initramfs-3.6.0-rc2+.img -append "root=/dev/sda1" -m 1024 
> > -drive
> > file=/home/kvmdev/debian_squeeze_amd64_standard.qcow2 -device vfio-
> > pci,host=:00:03.0 -device vfio-pci,host=:00:03.2 -device vfio-
> > pci,host=:00:03.3
> > >
> > > Error prints:
> > > 
> > > qemu-system-x86_64: -device vfio-pci,host=:00:03.0: Warning, device
> > :00:03.0 does not support reset
> > >
> > > qemu-system-x86_64: -device vfio-pci,host=:00:03.0: VFIO :00:03.0 
> > > BAR
> > 0 is too small to mmap, this may affect performance.
> > >
> > > qemu-system-x86_64: -device vfio-pci,host=:00:03.2: Warning, device
> > :00:03.2 does not support reset
> > >
> > > qemu-system-x86_64: -device vfio-pci,host=:00:03.3: Warning, device
> > :00:03.3 does not support reset
> > >
> > > qemu: hardware error: register_ioport_read: invalid opaque for address 
> > > 0x3f6
> > > CPU #0:
> > > EAX=8003 EBX=3ffe0e80 ECX=8003 EDX=0cfc
> > > ESI=2800 EDI=80002804 EBP=6fc0 ESP=6f38
> > > EIP=3ffec005 EFL=0086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> > > ES =0010   00c09300 DPL=0 DS   [-WA]
> > > CS =0008   00c09b00 DPL=0 CS32 [-RA]
> > > SS =0010   00c09300 DPL=0 DS   [-WA]
> > > DS =0010   00c09300 DPL=0 DS   [-WA]
> > > FS =0010   00c09300 DPL=0 DS   [-WA]
> > > GS =0010   00c09300 DPL=0 DS   [-WA]
> > > LDT=   8200 DPL=0 LDT
> > > TR =   8b00 DPL=0 TSS32-busy
> > > GDT= 000fcd68 0037
> > > IDT= 000fdb60 
> > > CR0=0011 CR2= CR3= CR4=
> > > DR0= DR1= DR2=
> > DR3=
> > > DR6=0ff0 DR7=0400
> > > EFER=
> > > FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
> > > FPR0=  FPR1= 
> > > FPR2=  FPR3= 
> > > FPR4=  FPR5= 
> > > FPR6=  FPR7= 
> > > XMM00= 
> > > XMM01=
> > > XMM02= 
> > > XMM03=
> > > XMM04= 
> > > XMM05=
> > > XMM06= 
> > > XMM07=
> > > Aborted (core dumped)
> > >
> > > 
> > >
> > > Linux: http://github.com/awilliam/linux-vfio.git next
> > > Qemu: https://github.com/awilliam/qemu-vfio.git  iommu-group-vfio
> > >
> > > Any idea what's is the issue.
> > 
> > Please try the vfio-pci-for-qemu-1.2-v3 qemu-vfio.git tag, I'm not even
> > sure what's in that old iommu-group-vfio branch.  Thanks,
> 
> Can you share the command to configure qemu?

Nothing special: ./configure

I sometimes set --prefix or --target-list, but nothing that changes the
binaries.  Thanks,

Alex




Re: [Qemu-devel] [PATCH] Fix buffer run out in eepro100.

2012-08-28 Thread Stefan Hajnoczi
On Wed, Aug 29, 2012 at 1:55 AM, Bo Yang  wrote:
> On 08/28/2012 06:59 PM, Stefan Hajnoczi wrote:
>> On Tue, Aug 28, 2012 at 7:23 AM, Bo Yang  wrote:
>>> According
>>> to liunux driver's implementation, the descriptor with EL bit set
>>> must not be touched by hardware, usually, the buffer size of this
>>> descriptor is set to 0.
>>
>> Please describe the bug you are seeing and how to reproduce it.
>
> Unfortunately, I cannot reproduce it. It is reported by QA, I used the
> QA's test environment to debug here.

Please share the steps to reproduce in the commit message so we have
an idea of what this patch fixes (even if we can't reproduce it).

> It's
>> not clear to me that your patch implements the behavior specified in
>> the datasheet.  I have included relevant quotes from the datasheet:
>>
>> http://www.intel.com/content/dam/doc/manual/8255x-10-100-mbps-ethernet-controller-software-dev-manual.pdf
>>
>> "Table 55. CU Activities Performed at the End of Execution" shows that
>> the EL bit takes effect at end of command execution.  I think this
>> means the descriptor *will* be processed by the hardware.
>>
>> The following documents the behavior when the descriptor's buffer size is 0:
>>
>> "6.4.3.3.1 Configuring the Next RFD
>> [...]
>> 3. Initiates a receive DMA if the size of the data field in the RFD is
>> greater than zero. The receive
>> DMA is initiated with the address of the first byte of the destination
>> address to the byte count
>> specified by the RFD.
>> 4. Forces a receive DMA completion if the size of the data field in
>> the RFD is zero."
>>
>>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>>> index 50d117e..e0efd96 100644
>>> --- a/hw/eepro100.c
>>> +++ b/hw/eepro100.c
>>> @@ -1619,8 +1619,13 @@ static const MemoryRegionOps eepro100_ops = {
>>>  static int nic_can_receive(NetClientState *nc)
>>>  {
>>>  EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
>>> +ru_state_t state;
>>>  TRACE(RXTX, logout("%p\n", s));
>>> -return get_ru_state(s) == ru_ready;
>>> +state = get_ru_state(s);
>>> +if (state == ru_no_resources) {
>>> +   eepro100_rnr_interrupt(s);
>>> +}
>>> +return state == ru_ready;
>>>  #if 0
>>>  return !eepro100_buffer_full(s);
>>>  #endif
>>
>> This is a boolean function, it shouldn't have side-effects.
>>
>> Why did you add the eepro100_rnr_interrupt() call here when it's also
>> called below from nic_receive()?
>
> This is needed. Let's consider tap with e100.
> When rx descriptor is EL with size 0. e100 enter ru_no_resources state
> and rnr interrupt is raised. Then, driver in guest handles the
> interrupt, however, there can be out of memory condition in guest, so
> this time the rx descriptor filling fails, so no rx descriptor is ready.
>
> Have a look at tap, tap_can_send is ioh->fd_read_poll, tap_send is
> ioh->fd_read, tap_can_send calls nic_can_receive(), tap_send() call
> nic_receive(), if tap_can_send returns 0, the tap fd will not be added
> to read set, and tap_send() will never be called, so nic_receive is
> never called, no more rnr interrupt will be raised. then the network
> just stall. So we need interrupt to notify the guest to prepare rx
> descriptor again.

Here is how virtio-net handles this:
hw/virtio-net.c:virtio_net_handle_rx() calls
qemu_flush_queued_packets() followed by qemu_notify_event() to resume
rx.

static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIONet *n = to_virtio_net(vdev);

qemu_flush_queued_packets(&n->nic->nc);

/* We now have RX buffers, signal to the IO thread to break out of the
 * select to re-poll the tap file descriptor */
qemu_notify_event();
}

This function gets invoked when new rx buffers have been given to the
virtio-net NIC by the guest.

eepro100 should do something similar, maybe on eepro100_ru_command() RX_START?

BTW, guest driver out-of-memory is not the hardware's problem.  The
driver needs to handle it.  For example, the Linux e100 driver has a
watchdog timer which will raise an interrupt to refill the rx ring if
allocation failed.

Stefan



Re: [Qemu-devel] [PATCH v4] configure: properly check if -lrt and -lm is needed

2012-08-28 Thread Natanael Copa
On Tue, 28 Aug 2012 17:16:18 +
Blue Swirl  wrote:

> On Tue, Aug 28, 2012 at 7:33 AM, Natanael Copa
>  wrote:
> > On Tue, 21 Aug 2012 18:12:05 +
> > Blue Swirl  wrote:
> >>
> >> Now I get this on mingw32:
> >> config-host.mak is out-of-date, running configure
> >>
> >> Error: librt check failed
> >
> > Any news on the v4 patch, which should fix this?
> 
> No change:
> config-host.mak is out-of-date, running configure
> 
> Error: librt check failed

I have run out of guesses. Could I please have the last lines from your
config.log?

Thanks!

-nc



Re: [Qemu-devel] [PATCH] Add new client_present and client capabilities fields to QXLRom

2012-08-28 Thread Gerd Hoffmann
On 08/29/12 02:58, Søren Sandmann wrote:
> Gerd Hoffmann  writes:
> 
>> On 08/27/12 19:20, Søren Sandmann Pedersen wrote:
>>> From: Søren Sandmann Pedersen 
>>>
>>> The client_present field is a byte that is set of non-zero when a
>>> client is connected and to zero when no client is connected.
>>>
>>> The client_capabilities[58] array contains 464 bits that indicate the
>>> capabilities of the client.
>>
>> What is supposed to happen in case multiple clients are connected?
> 
> Is this case supported at all?

There is code for it, although disabled by default and nobody actively
working in it as far I know.  We should at least have a plan how to
handle that situation ...

> If it is, I'd say that the guest should not be aware of it and the bits
> advertise should be interpreted as "these are the capabilities that
> spice-server will marshall on to the clients that are
> connected". Presumably spice-server would then set the bit vector to the
> intersection of all the clients.

Makes sense.

>> How do you handle the race conditions, especially on capability
>> downgrade?  There might be not-yet processed commands in the command
>> queue which the client is unable to handle, or existing surfaces using
>> formats the client doesn't understand ...
> 
> Good question. 
> 
> I don't know of a good way to deal with the situation where the new
> client is unable to handle existing surfaces.

We need a sensible solution here.  If we can't handle capability
downgrade at runtime the capability negotiation between guest and client
doesn't make sense in the first place.

Failing that we can add a a8 switch to qxl.  When enabled qemu asks
spice-server to enable a8, which in turn will raise the display channel
minor version, basically requiring an updated spice client to connect.
With a8 disabled old clients can connect too.

> For commands, would it work for spice-server to just process everything
> in the command ring after changing the capability bits (ie., in possibly
> brief moment before a new client connects)? It seems that would be a
> good thing to do even without capability bits.

spice server could process (aka server-side rendering) all outstanding
commands after updating capability bits and before starting to forward
commands to the new client.  Yes, that should work.

cheers,
  Gerd



Re: [Qemu-devel] [BUG 1.2] qemu-system-arm segfault

2012-08-28 Thread Stefan Weil

Am 28.08.2012 22:26, schrieb Adam Lackorzynski:

Hi,

I'm getting a segfault for qemu-system-arm (git).
Git bisect points to 33e95c6328a3149a52615176617997c4f8f7088b.
Host is x86-32, I'm not getting it in a 64bit environment.
However, valgrind is showing a similar output for arm_gic_class_init and
arm_gic_init.




Hi,

my quick test on 32 bit Ubuntu Lucid confirms this segfault.
Valgrind does not work in this environment.

Anthony, I think this might be important for QEMU 1.2.

Regards,

Stefan W.




[Qemu-devel] [PATCH v10 6/6] allower the user to disable pv event support

2012-08-28 Thread Wen Congyang
Signed-off-by: Wen Congyang 
---
 hw/pc_piix.c|6 +-
 qemu-config.c   |4 
 qemu-options.hx |3 ++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 37eca23..10531a8 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -149,6 +149,8 @@ static void pc_init1(MemoryRegion *system_memory,
 MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
 void *fw_cfg = NULL;
+QemuOptsList *list = qemu_find_opts("machine");
+bool enable_pv_event;
 
 pc_cpus_init(cpu_model);
 
@@ -287,7 +289,9 @@ static void pc_init1(MemoryRegion *system_memory,
 pc_pci_device_init(pci_bus);
 }
 
-if (kvm_enabled()) {
+enable_pv_event = qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+"enable_pv_event", false);
+if (kvm_enabled() && enable_pv_event) {
 kvm_pv_event_init(isa_bus);
 }
 }
diff --git a/qemu-config.c b/qemu-config.c
index c05ffbc..a58bf71 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -612,6 +612,10 @@ static QemuOptsList qemu_machine_opts = {
 .name = "dump-guest-core",
 .type = QEMU_OPT_BOOL,
 .help = "Include guest memory in  a core dump",
+}, {
+.name = "enable_pv_event",
+.type = QEMU_OPT_BOOL,
+.help = "handle pv event"
 },
 { /* End of list */ }
 },
diff --git a/qemu-options.hx b/qemu-options.hx
index 3c411c4..c825d66 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,7 +38,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "supported accelerators are kvm, xen, tcg (default: tcg)\n"
 "kernel_irqchip=on|off controls accelerated irqchip 
support\n"
 "kvm_shadow_mem=size of KVM shadow MMU\n"
-"dump-guest-core=on|off include guest memory in a core 
dump (default=on)\n",
+"dump-guest-core=on|off include guest memory in a core 
dump (default=on)\n"
+"enable_pv_event=on|off controls pv event support\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
-- 
1.7.1



[Qemu-devel] [PATCH v10 5/6] introduce a new qom device to deal with panicked event

2012-08-28 Thread Wen Congyang
If the target is x86/x86_64, the guest's kernel will write 0x01 to the
port KVM_PV_EVENT_PORT when it is panciked. This patch introduces a new
qom device kvm_pv_ioport to listen this I/O port, and deal with panicked
event according to panicked_action's value. The possible actions are:
1. emit QEVENT_GUEST_PANICKED only
2. emit QEVENT_GUEST_PANICKED and pause the guest
3. emit QEVENT_GUEST_PANICKED and poweroff the guest
4. emit QEVENT_GUEST_PANICKED and reset the guest

I/O ports does not work for some targets(for example: s390). And you
can implement another qom device, and include it's code into pv_event.c
for such target.

Note: if we emit QEVENT_GUEST_PANICKED only, and the management
application does not receive this event(the management may not
run when the event is emitted), the management won't know the
guest is panicked.

Signed-off-by: Wen Congyang 
---
 hw/kvm/Makefile.objs |2 +-
 hw/kvm/pv_event.c|  197 ++
 hw/pc_piix.c |5 ++
 kvm-stub.c   |4 +
 kvm.h|2 +
 5 files changed, 209 insertions(+), 1 deletions(-)
 create mode 100644 hw/kvm/pv_event.c

diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
index 226497a..23e3b30 100644
--- a/hw/kvm/Makefile.objs
+++ b/hw/kvm/Makefile.objs
@@ -1 +1 @@
-obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o
+obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pv_event.o
diff --git a/hw/kvm/pv_event.c b/hw/kvm/pv_event.c
new file mode 100644
index 000..e14b030
--- /dev/null
+++ b/hw/kvm/pv_event.c
@@ -0,0 +1,197 @@
+/*
+ * QEMU KVM support, paravirtual event device
+ *
+ * Copyright Fujitsu, Corp. 2012
+ *
+ * Authors:
+ * Wen Congyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Possible values for action parameter. */
+#define PANICKED_REPORT 1   /* emit QEVENT_GUEST_PANICKED only */
+#define PANICKED_PAUSE  2   /* emit QEVENT_GUEST_PANICKED and pause VM */
+#define PANICKED_POWEROFF   3   /* emit QEVENT_GUEST_PANICKED and quit VM */
+#define PANICKED_RESET  4   /* emit QEVENT_GUEST_PANICKED and reset VM */
+
+#define PV_EVENT_DRIVER "kvm_pv_event"
+
+struct PVEventAction {
+char *panicked_action;
+int panicked_action_value;
+};
+
+#define DEFINE_PV_EVENT_PROPERTIES(_state, _conf)   \
+DEFINE_PROP_STRING("panicked_action", _state, _conf.panicked_action)
+
+static void panicked_mon_event(const char *action)
+{
+QObject *data;
+
+data = qobject_from_jsonf("{ 'action': %s }", action);
+monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
+qobject_decref(data);
+}
+
+static void panicked_perform_action(uint32_t panicked_action)
+{
+switch (panicked_action) {
+case PANICKED_REPORT:
+panicked_mon_event("report");
+break;
+
+case PANICKED_PAUSE:
+panicked_mon_event("pause");
+vm_stop(RUN_STATE_GUEST_PANICKED);
+break;
+
+case PANICKED_POWEROFF:
+panicked_mon_event("poweroff");
+qemu_system_shutdown_request();
+break;
+
+case PANICKED_RESET:
+panicked_mon_event("reset");
+qemu_system_reset_request();
+break;
+}
+}
+
+static uint64_t supported_event(void)
+{
+return 1 << KVM_PV_FEATURE_PANICKED;
+}
+
+static void handle_event(int event, struct PVEventAction *conf)
+{
+if (event == KVM_PV_EVENT_PANICKED) {
+panicked_perform_action(conf->panicked_action_value);
+}
+}
+
+static int pv_event_init(struct PVEventAction *conf)
+{
+if (!conf->panicked_action) {
+conf->panicked_action_value = PANICKED_REPORT;
+} else if (strcasecmp(conf->panicked_action, "none") == 0) {
+conf->panicked_action_value = PANICKED_REPORT;
+} else if (strcasecmp(conf->panicked_action, "pause") == 0) {
+conf->panicked_action_value = PANICKED_PAUSE;
+} else if (strcasecmp(conf->panicked_action, "poweroff") == 0) {
+conf->panicked_action_value = PANICKED_POWEROFF;
+} else if (strcasecmp(conf->panicked_action, "reset") == 0) {
+conf->panicked_action_value = PANICKED_RESET;
+} else {
+return -1;
+}
+
+return 0;
+}
+
+#if defined(KVM_PV_EVENT_PORT)
+
+#include "hw/isa.h"
+
+typedef struct {
+ISADevice dev;
+struct PVEventAction conf;
+MemoryRegion ioport;
+} PVIOPortState;
+
+static uint64_t pv_io_read(void *opaque, target_phys_addr_t addr, unsigned 
size)
+{
+return supported_event();
+}
+
+static void pv_io_write(void *opaque, target_phys_addr_t addr, uint64_t val,
+unsigned size)
+{
+PVIOPortState *s = opaque;
+
+handle_event(val, &s->conf);
+}
+
+static const MemoryRegionOps pv_io_ops = {
+.read = pv_io_read,
+.write = pv_io_write,
+.impl = {
+.min_access_size = 4,
+   

[Qemu-devel] [PATCH v10 4/6] add a new qevent: QEVENT_GUEST_PANICKED

2012-08-28 Thread Wen Congyang
This event will be emited when the guest is panicked.

Signed-off-by: Wen Congyang 
---
 monitor.c |1 +
 monitor.h |1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/monitor.c b/monitor.c
index b17b1bb..f74dd2d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -455,6 +455,7 @@ static const char *monitor_event_names[] = {
 [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK",
 [QEVENT_WAKEUP] = "WAKEUP",
 [QEVENT_BALLOON_CHANGE] = "BALLOON_CHANGE",
+[QEVENT_GUEST_PANICKED] = "GUEST_PANICKED",
 };
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(monitor_event_names) != QEVENT_MAX)
 
diff --git a/monitor.h b/monitor.h
index 47d556b..f48a502 100644
--- a/monitor.h
+++ b/monitor.h
@@ -43,6 +43,7 @@ typedef enum MonitorEvent {
 QEVENT_SUSPEND_DISK,
 QEVENT_WAKEUP,
 QEVENT_BALLOON_CHANGE,
+QEVENT_GUEST_PANICKED,
 
 /* Add to 'monitor_event_names' array in monitor.c when
  * defining new events here */
-- 
1.7.1




[Qemu-devel] [PATCH v10 3/6] add a new runstate: RUN_STATE_GUEST_PANICKED

2012-08-28 Thread Wen Congyang
The guest will be in this state when it is panicked.

Signed-off-by: Wen Congyang 
---
 qapi-schema.json |6 +-
 qmp.c|3 ++-
 vl.c |7 ++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index bd8ad74..edb090a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -149,11 +149,15 @@
 # @suspended: guest is suspended (ACPI S3)
 #
 # @watchdog: the watchdog action is configured to pause and has been triggered
+#
+# @guest-panicked: the panicked action is configured to pause and has been
+# triggered.
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
 'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
-'running', 'save-vm', 'shutdown', 'suspended', 'watchdog' ] }
+'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
+'guest-panicked' ] }
 
 ##
 # @StatusInfo:
diff --git a/qmp.c b/qmp.c
index c5a20f1..f863f56 100644
--- a/qmp.c
+++ b/qmp.c
@@ -148,7 +148,8 @@ void qmp_cont(Error **errp)
 error_set(errp, QERR_MIGRATION_EXPECTED);
 return;
 } else if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-   runstate_check(RUN_STATE_SHUTDOWN)) {
+   runstate_check(RUN_STATE_SHUTDOWN) ||
+   runstate_check(RUN_STATE_GUEST_PANICKED)) {
 error_set(errp, QERR_RESET_REQUIRED);
 return;
 } else if (runstate_check(RUN_STATE_SUSPENDED)) {
diff --git a/vl.c b/vl.c
index 316a977..947b23a 100644
--- a/vl.c
+++ b/vl.c
@@ -373,6 +373,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_RUNNING, RUN_STATE_SAVE_VM },
 { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
 { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
+{ RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
 
 { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -387,6 +388,9 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
 { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
 
+{ RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
+{ RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
+
 { RUN_STATE_MAX, RUN_STATE_MAX },
 };
 
@@ -1607,7 +1611,8 @@ static bool main_loop_should_exit(void)
 qemu_system_reset(VMRESET_REPORT);
 resume_all_vcpus();
 if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-runstate_check(RUN_STATE_SHUTDOWN)) {
+runstate_check(RUN_STATE_SHUTDOWN) ||
+runstate_check(RUN_STATE_GUEST_PANICKED)) {
 bdrv_iterate(iostatus_bdrv_it, NULL);
 vm_start();
 }
-- 
1.7.1




[Qemu-devel] [PATCH v10 2/6] kvm: Update kernel headers

2012-08-28 Thread Wen Congyang
Corresponding kvm.git hash: 1d92128f with my patch for kvm
---
 linux-headers/asm-s390/kvm.h  |2 +-
 linux-headers/asm-s390/kvm_para.h |2 +-
 linux-headers/asm-x86/kvm.h   |1 +
 linux-headers/asm-x86/kvm_para.h  |9 +
 linux-headers/linux/kvm.h |3 +++
 linux-headers/linux/kvm_para.h|6 ++
 6 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index bdcbe0f..d25da59 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -1,7 +1,7 @@
 #ifndef __LINUX_KVM_S390_H
 #define __LINUX_KVM_S390_H
 /*
- * asm-s390/kvm.h - KVM s390 specific structures and definitions
+ * KVM s390 specific structures and definitions
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-s390/kvm_para.h 
b/linux-headers/asm-s390/kvm_para.h
index 8e2dd67..870051f 100644
--- a/linux-headers/asm-s390/kvm_para.h
+++ b/linux-headers/asm-s390/kvm_para.h
@@ -1,5 +1,5 @@
 /*
- * asm-s390/kvm_para.h - definition for paravirtual devices on s390
+ * definition for paravirtual devices on s390
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index e7d1c19..246617e 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -12,6 +12,7 @@
 /* Select x86 specific features in  */
 #define __KVM_HAVE_PIT
 #define __KVM_HAVE_IOAPIC
+#define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_DEVICE_ASSIGNMENT
 #define __KVM_HAVE_MSI
 #define __KVM_HAVE_USER_NMI
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index f2ac46a..53aca59 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -22,6 +22,7 @@
 #define KVM_FEATURE_CLOCKSOURCE23
 #define KVM_FEATURE_ASYNC_PF   4
 #define KVM_FEATURE_STEAL_TIME 5
+#define KVM_FEATURE_PV_EOI 6
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -37,6 +38,7 @@
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
+#define MSR_KVM_PV_EOI_EN  0x4b564d04
 
 struct kvm_steal_time {
__u64 steal;
@@ -89,5 +91,12 @@ struct kvm_vcpu_pv_apf_data {
__u32 enabled;
 };
 
+#define KVM_PV_EOI_BIT 0
+#define KVM_PV_EOI_MASK (0x1 << KVM_PV_EOI_BIT)
+#define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
+#define KVM_PV_EOI_DISABLED 0x0
+
+#define KVM_PV_EVENT_PORT  (0x505UL)
+
 
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 5a9d4e3..4b9e575 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -617,6 +617,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_SIGNAL_MSI 77
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
+#define KVM_CAP_PPC_ALLOC_HTAB 80
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -828,6 +829,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_SIGNAL_MSI_IOW(KVMIO,  0xa5, struct kvm_msi)
 /* Available with KVM_CAP_PPC_GET_SMMU_INFO */
 #define KVM_PPC_GET_SMMU_INFO_IOR(KVMIO,  0xa6, struct kvm_ppc_smmu_info)
+/* Available with KVM_CAP_PPC_ALLOC_HTAB */
+#define KVM_PPC_ALLOCATE_HTAB_IOWR(KVMIO, 0xa7, __u32)
 
 /*
  * ioctls for vcpu fds
diff --git a/linux-headers/linux/kvm_para.h b/linux-headers/linux/kvm_para.h
index 7bdcf93..f6be0bb 100644
--- a/linux-headers/linux/kvm_para.h
+++ b/linux-headers/linux/kvm_para.h
@@ -20,6 +20,12 @@
 #define KVM_HC_FEATURES3
 #define KVM_HC_PPC_MAP_MAGIC_PAGE  4
 
+/* The bit of supported pv event */
+#define KVM_PV_FEATURE_PANICKED0
+
+/* The pv event value */
+#define KVM_PV_EVENT_PANICKED  1
+
 /*
  * hypercalls use architecture specific
  */
-- 
1.7.1




[Qemu-devel] [PATCH v10 1/6] start vm after reseting it

2012-08-28 Thread Wen Congyang
The guest should run after reseting it, but it does not run if its
old state is RUN_STATE_INTERNAL_ERROR or RUN_STATE_PAUSED.

We don't set runstate to RUN_STATE_PAUSED when reseting the guest,
so the runstate will be changed from RUN_STATE_INTERNAL_ERROR or
RUN_STATE_PAUSED to RUN_STATE_RUNNING(not RUN_STATE_PAUSED).

Signed-off-by: Wen Congyang 
---
 block.h |2 ++
 qmp.c   |2 +-
 vl.c|7 ---
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/block.h b/block.h
index 2e2be11..c3265c2 100644
--- a/block.h
+++ b/block.h
@@ -339,6 +339,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+void iostatus_bdrv_it(void *opaque, BlockDriverState *bs);
+
 enum BlockAcctType {
 BDRV_ACCT_READ,
 BDRV_ACCT_WRITE,
diff --git a/qmp.c b/qmp.c
index 8463922..c5a20f1 100644
--- a/qmp.c
+++ b/qmp.c
@@ -125,7 +125,7 @@ SpiceInfo *qmp_query_spice(Error **errp)
 };
 #endif
 
-static void iostatus_bdrv_it(void *opaque, BlockDriverState *bs)
+void iostatus_bdrv_it(void *opaque, BlockDriverState *bs)
 {
 bdrv_iostatus_reset(bs);
 }
diff --git a/vl.c b/vl.c
index 7c577fa..316a977 100644
--- a/vl.c
+++ b/vl.c
@@ -343,7 +343,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
 { RUN_STATE_INMIGRATE, RUN_STATE_PRELAUNCH },
 
-{ RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
+{ RUN_STATE_INTERNAL_ERROR, RUN_STATE_RUNNING },
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_IO_ERROR, RUN_STATE_RUNNING },
@@ -376,7 +376,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 
 { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
-{ RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
+{ RUN_STATE_SHUTDOWN, RUN_STATE_RUNNING },
 { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_DEBUG, RUN_STATE_SUSPENDED },
@@ -1608,7 +1608,8 @@ static bool main_loop_should_exit(void)
 resume_all_vcpus();
 if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
 runstate_check(RUN_STATE_SHUTDOWN)) {
-runstate_set(RUN_STATE_PAUSED);
+bdrv_iterate(iostatus_bdrv_it, NULL);
+vm_start();
 }
 }
 if (qemu_wakeup_requested()) {
-- 
1.7.1




[Qemu-devel] [PATCH v10] kvm: notify host when the guest is panicked

2012-08-28 Thread Wen Congyang
We can know the guest is panicked when the guest runs on xen.
But we do not have such feature on kvm.

Another purpose of this feature is: management app(for example:
libvirt) can do auto dump when the guest is panicked. If management
app does not do auto dump, the guest's user can do dump by hand if
he sees the guest is panicked.

We have three solutions to implement this feature:
1. use vmcall
2. use I/O port
3. use virtio-serial.

We have decided to avoid touching hypervisor. The reason why I choose
choose the I/O port is:
1. it is easier to implememt
2. it does not depend any virtual device
3. it can work when starting the kernel

Signed-off-by: Wen Congyang 
---
 Documentation/virtual/kvm/pv_event.txt |   32 
 arch/ia64/include/asm/kvm_para.h   |   14 ++
 arch/powerpc/include/asm/kvm_para.h|   14 ++
 arch/s390/include/asm/kvm_para.h   |   14 ++
 arch/x86/include/asm/kvm_para.h|   27 +++
 arch/x86/kernel/kvm.c  |   25 +
 include/linux/kvm_para.h   |   23 +++
 7 files changed, 149 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/pv_event.txt

diff --git a/Documentation/virtual/kvm/pv_event.txt 
b/Documentation/virtual/kvm/pv_event.txt
new file mode 100644
index 000..bb04de0
--- /dev/null
+++ b/Documentation/virtual/kvm/pv_event.txt
@@ -0,0 +1,32 @@
+The KVM paravirtual event interface
+=
+
+Initializing the paravirtual event interface
+==
+kvm_pv_event_init()
+Argiments:
+   None
+
+Return Value:
+   0: The guest kernel can use paravirtual event interface.
+   1: The guest kernel can't use paravirtual event interface.
+
+Querying whether the event can be ejected
+==
+kvm_pv_has_feature()
+Arguments:
+   feature: The bit value of this paravirtual event to query
+
+Return Value:
+   0 : The guest kernel can't eject this paravirtual event.
+   -1: The guest kernel can eject this paravirtual event.
+
+
+Ejecting paravirtual event
+==
+kvm_pv_eject_event()
+Arguments:
+   event: The event to be ejected.
+
+Return Value:
+   None
diff --git a/arch/ia64/include/asm/kvm_para.h b/arch/ia64/include/asm/kvm_para.h
index 2019cb9..b5ec658 100644
--- a/arch/ia64/include/asm/kvm_para.h
+++ b/arch/ia64/include/asm/kvm_para.h
@@ -31,6 +31,20 @@ static inline bool kvm_check_and_clear_guest_paused(void)
return false;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index c18916b..01b98c7 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -211,6 +211,20 @@ static inline bool kvm_check_and_clear_guest_paused(void)
return false;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/arch/s390/include/asm/kvm_para.h b/arch/s390/include/asm/kvm_para.h
index da44867..00ce058 100644
--- a/arch/s390/include/asm/kvm_para.h
+++ b/arch/s390/include/asm/kvm_para.h
@@ -154,6 +154,20 @@ static inline bool kvm_check_and_clear_guest_paused(void)
return false;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
 #endif
 
 #endif /* __S390_KVM_PARA_H */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 2f7712e..7d297f0 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -96,8 +96,11 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#define KVM_PV_EVENT_PORT  (0x505UL)
+
 #ifdef __KERNEL__
 #include 
+#include 
 
 extern void kvmclock_init(void);
 extern int kvm_register_clock(char *txt);
@@ -228,6 +231,30 @@ static inline void kvm_disable_steal_time(void)
 }
 #endif
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   if (!request_region(KVM_PV_EVENT_PORT, 1, "KVM_PV_EVENT"))
+   return -1;
+
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   unsigned int features = inl(KVM_PV_EVENT_PORT);
+
+   /* Reading from an invalid I/O port will return -1 */
+   if (features == ~0)
+   features = 0;
+
+   return features;
+}
+
+

[Qemu-devel] [Bug 1037675] Re: Guest Kernel Panic if using "-cpu host" in qemu-kvm 1.1.1

2012-08-28 Thread Stefan Kuhn
I have the same issue, but not on hardened.
Tried for 1-2 hours to send the output to serial console but failed.
The text below is what I posted at 
https://bugs.gentoo.org/show_bug.cgi?id=431640#c8:
###

Same issue here (same screenshot with qemu-kvm-1.1.1-r1), but not on hardened. 
Happens with 1.1.1-r3 with a different panic message (not the one
mentioned in bug #431444)

I fail miserably at "sending output to serial console or file", if anyone
would help me there I could provide the entire output of the kernel panic.

Below is what I've found out to far:
(Tests with qemu-kvm-1.1.1-r1)

Analysis:
=
Happens with the following LiveCDs:
- Fedora-17-x86_64-netinst.iso: uname -r = 3.3.4-5.fc17.x86_64; image: 
/LiveOS/squashfs.img
- pentoo-x86_64-2012.0_beta1.7.iso: uname -r = 3.4.2-pentoo; image: 
image.squashfs

Does not happen with:
- pentoo-x86_64-2009.0.iso: uname -r = 2.6.31-pentoo-r3; image: image.squashfs
- (Gentoo) install-amd64-minimal-20120621.iso: uname -r = 3.2.12-gentoo; 
image.squashfs
- systemrescuecd-x86-2.8.1.iso (64bit kernel): uname -r = 3.2.23-std281-amd64: 
sysrcd.dat

Tried installing qemu-kvm-1.1.1-r1 without the 2 patches, no difference.

Guess:
==
This seems to happen only with newer guest-kernels (on newer CPUs).

Hardware-Info:
==
see attachments

** Bug watch added: Gentoo Bugzilla #431640
   https://bugs.gentoo.org/show_bug.cgi?id=431640

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1037675

Title:
  Guest Kernel Panic if using "-cpu host" in qemu-kvm 1.1.1

Status in QEMU:
  New

Bug description:
  After Upgrading to qemu-kvm-1.1.1-r1 from version 1.0.1-r1 my virtual
  machines (running gentoo linux) panic at intel_pmu_init. (detailed
  information including stacktrace are in the uploaded screenshot). When
  i remove the "-cpu host" option, the system starts normally.

  the command line from whicht the system is bootet:

  qemu-kvm -vnc :7 -usbdevice tablet -daemonize -m 256 -drive
  file=/data/virtual_machines/wgs-l08.img,if=virtio  -boot c -k de -net
  nic,model=virtio,macaddr=12:12:00:12:34:63,vlan=0 -net
  tap,ifname=qtap6,script=no,downscript=no,vlan=0 -smp 2 -enable-kvm
  -cpu host -monitor unix:/var/run/qemu-
  kvm/wgs-l08.monitor,server,nowait

  also reported on gentoo bug tracker (with some more details of the
  host): https://bugs.gentoo.org/show_bug.cgi?id=431640

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1037675/+subscriptions



Re: [Qemu-devel] [PATCH] Fix buffer run out in eepro100.

2012-08-28 Thread Bo Yang
On 08/28/2012 06:59 PM, Stefan Hajnoczi wrote:
> On Tue, Aug 28, 2012 at 7:23 AM, Bo Yang  wrote:
>> According
>> to liunux driver's implementation, the descriptor with EL bit set
>> must not be touched by hardware, usually, the buffer size of this
>> descriptor is set to 0.
> 
> Please describe the bug you are seeing and how to reproduce it.  

Unfortunately, I cannot reproduce it. It is reported by QA, I used the
QA's test environment to debug here.

It's
> not clear to me that your patch implements the behavior specified in
> the datasheet.  I have included relevant quotes from the datasheet:
> 
> http://www.intel.com/content/dam/doc/manual/8255x-10-100-mbps-ethernet-controller-software-dev-manual.pdf
> 
> "Table 55. CU Activities Performed at the End of Execution" shows that
> the EL bit takes effect at end of command execution.  I think this
> means the descriptor *will* be processed by the hardware.
> 
> The following documents the behavior when the descriptor's buffer size is 0:
> 
> "6.4.3.3.1 Configuring the Next RFD
> [...]
> 3. Initiates a receive DMA if the size of the data field in the RFD is
> greater than zero. The receive
> DMA is initiated with the address of the first byte of the destination
> address to the byte count
> specified by the RFD.
> 4. Forces a receive DMA completion if the size of the data field in
> the RFD is zero."
> 
>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>> index 50d117e..e0efd96 100644
>> --- a/hw/eepro100.c
>> +++ b/hw/eepro100.c
>> @@ -1619,8 +1619,13 @@ static const MemoryRegionOps eepro100_ops = {
>>  static int nic_can_receive(NetClientState *nc)
>>  {
>>  EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
>> +ru_state_t state;
>>  TRACE(RXTX, logout("%p\n", s));
>> -return get_ru_state(s) == ru_ready;
>> +state = get_ru_state(s);
>> +if (state == ru_no_resources) {
>> +   eepro100_rnr_interrupt(s);
>> +}
>> +return state == ru_ready;
>>  #if 0
>>  return !eepro100_buffer_full(s);
>>  #endif
> 
> This is a boolean function, it shouldn't have side-effects.
> 
> Why did you add the eepro100_rnr_interrupt() call here when it's also
> called below from nic_receive()?

This is needed. Let's consider tap with e100.
When rx descriptor is EL with size 0. e100 enter ru_no_resources state
and rnr interrupt is raised. Then, driver in guest handles the
interrupt, however, there can be out of memory condition in guest, so
this time the rx descriptor filling fails, so no rx descriptor is ready.

Have a look at tap, tap_can_send is ioh->fd_read_poll, tap_send is
ioh->fd_read, tap_can_send calls nic_can_receive(), tap_send() call
nic_receive(), if tap_can_send returns 0, the tap fd will not be added
to read set, and tap_send() will never be called, so nic_receive is
never called, no more rnr interrupt will be raised. then the network
just stall. So we need interrupt to notify the guest to prepare rx
descriptor again.

> 
>> @@ -1732,6 +1737,15 @@ static ssize_t nic_receive(NetClientState *nc, const 
>> uint8_t * buf, size_t size)
>>   &rx, sizeof(eepro100_rx_t));
>>  uint16_t rfd_command = le16_to_cpu(rx.command);
>>  uint16_t rfd_size = le16_to_cpu(rx.size);
>> +/* don't touch the rx descriptor with EL set. */
>> +if (rfd_command & COMMAND_EL) {
>> +/* EL bit is set, so this was the last frame. */
>> +logout("receive: Running out of frames\n");
>> +set_ru_state(s, ru_no_resources);
>> +s->statistics.rx_resource_errors++;
>> +   eepro100_rnr_interrupt(s);
>> +   return -1;
>> +}
>>  if (size > rfd_size) {
>>  logout("Receive buffer (%" PRId16 " bytes) too small for data "
>> @@ -1767,11 +1781,6 @@ static ssize_t nic_receive(NetClientState *nc, const 
>> uint8_t * buf, size_t size)
>>  s->statistics.rx_good_frames++;
>>  eepro100_fr_interrupt(s);
>>  s->ru_offset = le32_to_cpu(rx.link);
>> -if (rfd_command & COMMAND_EL) {
>> -/* EL bit is set, so this was the last frame. */
>> -logout("receive: Running out of frames\n");
>> -set_ru_state(s, ru_suspended);
>> -}
> 
> "6.4.3.3.3 Completion of Reception" describes how this should work.  I
> think the rnr interrupt should be raised here and we should enter the
> no resources state.

I think you're right here. To comply to the specification of e100. We
should handle the rx descriptor with EL, and then enter ru_no_resources
state, raises rnr interrupt.

therefore,

set_ru_state(s, ru_no_resources);
eepro100_rnr_interrupt(s);

after handle rx descriptor with EL bit.

> 
> Stefan
> 




Re: [Qemu-devel] [PATCH] Add new client_present and client capabilities fields to QXLRom

2012-08-28 Thread Søren Sandmann
Gerd Hoffmann  writes:

> On 08/27/12 19:20, Søren Sandmann Pedersen wrote:
>> From: Søren Sandmann Pedersen 
>> 
>> The client_present field is a byte that is set of non-zero when a
>> client is connected and to zero when no client is connected.
>> 
>> The client_capabilities[58] array contains 464 bits that indicate the
>> capabilities of the client.
>
> What is supposed to happen in case multiple clients are connected?

Is this case supported at all?

If it is, I'd say that the guest should not be aware of it and the bits
advertise should be interpreted as "these are the capabilities that
spice-server will marshall on to the clients that are
connected". Presumably spice-server would then set the bit vector to the
intersection of all the clients.

> How do you handle the race conditions, especially on capability
> downgrade?  There might be not-yet processed commands in the command
> queue which the client is unable to handle, or existing surfaces using
> formats the client doesn't understand ...

Good question. 

I don't know of a good way to deal with the situation where the new
client is unable to handle existing surfaces. I suppose in principle
spice-server could emulate their existence, sending them as images, but
I'm not familiar enough with spice-server to say whether that is
feasible.

For commands, would it work for spice-server to just process everything
in the command ring after changing the capability bits (ie., in possibly
brief moment before a new client connects)? It seems that would be a
good thing to do even without capability bits.


Søren



Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Eduardo Habkost
On Wed, Aug 29, 2012 at 01:25:35AM +0300, Michael S. Tsirkin wrote:
> On Wed, Aug 29, 2012 at 01:21:13AM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 28, 2012 at 07:02:42PM -0300, Eduardo Habkost wrote:
> > > On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > > > > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > In preparation for adding PV EOI support, disable PV EOI by default 
> > > > > > for
> > > > > > 1.1 and older machine types, to avoid CPUID changing during 
> > > > > > migration.
> > > > > > 
> > > > > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > > > > Enable for 1.1
> > > > > > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > > > > Disable for 1.2
> > > > > > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > > > > 
> > > > > 
> > > > > What about users that are already running "qemu-1.1 -M pc-1.1" on a 
> > > > > host
> > > > > kernel that supports PV EOI already? They would get PV EOI disabled 
> > > > > when
> > > > > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > > > > 
> > > > > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host 
> > > > > kernel
> > > > > supporting PV EOI already have migration broken, so there's not much 
> > > > > we
> > > > > can do for them)
> > > > 
> > > > Exactly.
> > > > 
> > > > Talked to Gleb, long term I think we should rework code to make
> > > > it forward-compatible wrt adding new MSRs:
> > > > - source gets list of MSRs to be migrated from KVM and simply sends 
> > > > them all
> > > > - send all MSRS in key/value format
> > > > - destination gets list of MSRs to be migrated from KVM and
> > > >   only restores the supported ones
> > > 
> > > As far as I understand the migration code requirements/expectations, if
> > > the origin is sending some data, it is because it is part of the
> > > guest-visible machine state that must be kept while migrating. Because
> > > of that, the destination is not allowed to drop anything it doesn't know
> > > about.
> > 
> > 
> > We have a ton of code that reads in values then just
> > ignores them, for compat with old qemu.

The difference is that you ignore only the values you _know_ to be safe
to be ignored. You can't ignore a value simply because you don't know
what it means, and if it's important or not.

> > This will be exactly such a case:
> > we don't drop anything - protocol does not
> > support this. We read and simply do not tell kvm
> > about it.

This fits the definition of "dropping", to me.

> We also have tons of code that sends
> > useless values again for compatibility.

But these values should be ignored only if the receiver knows exactly
what it means, and knows exactly why/when it can be ignored.

> > 
> > > At the same time, if it's not part of guest-visible machine
> > > state, it doesn't have to be sent by the migration origin.

It may be not directly visible, but if it's part of the machine state,
it affects the guest in some way. If it didn't affect the guest in any
way, we wouldn't even have to send it while migrating, in the first
place.

> > > 
> > 
> > False, we often send internal device state which is not
> > directly guest visible.
> 
> Besides, all MSRs in KVM's MSR list are actually guest visible, even if
> guests normally do not invoke these MSRs - they can.

The difference is that the machine doesn't guarantee a MSR to be
migrated unless the corresponding feature bit reports the MSR as
supported.

e.g. using the PV EOI MSR without having the PV EOI CPUID bit set is
undefined behavior. The guest shouldn't do it.

On the other hand, if the PV EOI CPUID bit is set, the MSR should be
always kept, no matter what.

This means that we need to migrate the MSR only if the corresponding
CPUID bit is enabled.


> 
> > > On the other hand, a mode of operation that doesn't require updating
> > > QEMU every time there's a new bit of guest-visible state to be migrated
> > > would be nice (just like the "-cpu host" mode, that doesn't require
> > > updating QEMU for every new CPU feature, is nice for some use cases). I
> > > just don't know how to make work with the current migration protocol.
> > > 
> > 
> > I don't understand. What is the problem with the proposal?
> > What will not work with our protocol?
> > Can you give an example please?

Yes:

Suppose kernel 3.7 introduces KVM_FOO_MSR and CPUID_KVM_FOO.

Also, suppose QEMU 1.2 doesn't know anything about KVM_FOO, because it
was release before this feature was introduced.


Then you run "qemu-1.2 -M pc-1.2" on a 3.7 host kernel. qemu-1.2 can do
two things here:

1) Not enable CPUID_KVM_FOO

In this case, the guest should not use KVM_FOO_MSR and the MSR does not
need to be migrated (the guest may try to use it, but the behavior when
trying to use it is undefined). Sending the MSR value when migrating
would be useless.


2) Enable CPUID_KVM_FOO.

In this case, the guest m

[Qemu-devel] qemu-system-arm segfault

2012-08-28 Thread Adam Lackorzynski
Hi,

I'm getting a segfault for qemu-system-arm (git).
Git bisect points to 33e95c6328a3149a52615176617997c4f8f7088b.
Host is x86-32, I'm not getting it in a 64bit environment.
However, valgrind is showing a similar output for arm_gic_class_init and
arm_gic_init.

$ arm-softmmu/qemu-system-arm -M realview-eb
*** glibc detected *** arm-softmmu/qemu-system-arm: malloc(): memory 
corruption: 0xf7f15b38 ***
=== Backtrace: =
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x6e3b1)[0xf6da43b1]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x71194)[0xf6da7194]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_malloc+0x5c)[0xf6da8d9c]
arm-softmmu/qemu-system-arm(+0x15aae7)[0xf758dae7]
...

$ gdb --args arm-softmmu/qemu-system-arm -M realview-eb
(gdb) r
Starting program: /tmp/qemu/qemu/arm-softmmu/qemu-system-arm -M realview-eb
[Thread debugging using libthread_db enabled]
Using host libthread_db library 
"/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
[New Thread 0xf3ccab70 (LWP 11267)]

Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=, bytes=) at malloc.c:4674
4674malloc.c: No such file or directory.
(gdb) bt
#0  _int_malloc (av=, bytes=) at malloc.c:4674
#1  0xf7973d9c in *__GI___libc_malloc (bytes=32) at malloc.c:3660
#2  0x566afae7 in malloc_and_trace (n_bytes=32) at /tmp/qemu/qemu/vl.c:2322
#3  0xf7edd45c in ?? () from /lib/i386-linux-gnu/libglib-2.0.so.0
#4  0xf7edd78b in g_malloc0 () from /lib/i386-linux-gnu/libglib-2.0.so.0
#5  0x566f29c6 in object_property_add (obj=obj@entry=0x57042a18, 
name=name@entry=0x56881b00 "type", 
type=type@entry=0x568a1c87 "string", get=get@entry=0x566f1620 
, set=set@entry=0, 
release=release@entry=0x566f15e0 , opaque=0x570413a0, 
errp=0x0) at qom/object.c:623
#6  0x566f438d in object_property_add_str (obj=obj@entry=0x57042a18, 
name=name@entry=0x56881b00 "type", 
get=get@entry=0x566f14c0 , set=set@entry=0, 
errp=errp@entry=0x0) at qom/object.c:1179
#7  0x566f440b in object_instance_init (obj=0x57042a18) at qom/object.c:1193
#8  0x566f18af in object_init_with_type (obj=obj@entry=0x57042a18, 
ti=0x56fd0e10) at qom/object.c:294
#9  0x566f18a3 in object_init_with_type (obj=obj@entry=0x57042a18, 
ti=0x56fc8b88) at qom/object.c:290
#10 0x566f18a3 in object_init_with_type (obj=obj@entry=0x57042a18, 
ti=0x56fcea50) at qom/object.c:290
#11 0x566f18a3 in object_init_with_type (obj=obj@entry=0x57042a18, 
ti=0x56fd1470) at qom/object.c:290
#12 0x566f18a3 in object_init_with_type (obj=obj@entry=0x57042a18, 
ti=ti@entry=0x56fd1388) at qom/object.c:290
#13 0x566f1fae in object_initialize_with_type (data=data@entry=0x57042a18, 
type=type@entry=0x56fd1388) at qom/object.c:311
#14 0x566f21fe in object_new_with_type (type=0x56fd1388) at qom/object.c:397
#15 0x566f2291 in object_new (typename=0x56fd1388 "H\024\375V4", 
typename@entry=0x5688b60a "arm_gic") at qom/object.c:407
#16 0x565f93a2 in qdev_try_create (bus=bus@entry=0x0, 
type=type@entry=0x5688b60a "arm_gic") at hw/qdev.c:134
#17 0x565f944a in qdev_create (bus=bus@entry=0x0, name=name@entry=0x5688b60a 
"arm_gic") at hw/qdev.c:114
#18 0x567adf7e in realview_gic_init (dev=0x57041748) at 
/tmp/qemu/qemu/hw/arm/../realview_gic.c:34
#19 0x56697148 in sysbus_device_init (dev=0x57041748) at 
/tmp/qemu/qemu/hw/sysbus.c:121
#20 0x565fa6c8 in qdev_init (dev=dev@entry=0x57041748) at hw/qdev.c:160
#21 0x565fa84c in qdev_init_nofail (dev=dev@entry=0x57041748) at hw/qdev.c:261
#22 0x56697884 in sysbus_create_varargs (name=name@entry=0x5688b742 
"realview_gic", addr=268697600)
at /tmp/qemu/qemu/hw/sysbus.c:135
#23 0x567ada5c in sysbus_create_simple (irq=, addr=, name=0x5688b742 "realview_gic")
at /tmp/qemu/qemu/hw/arm/../sysbus.h:79
#24 realview_init (ram_size=, kernel_filename=0x0, 
kernel_cmdline=0x5685a80d "", initrd_filename=0x0, 
cpu_model=0x5689010b "arm926", board_type=BOARD_EB, 
boot_device=) at /tmp/qemu/qemu/hw/arm/../realview.c:168
#25 0x5658e7c8 in main (argc=3, argv=0xd6a4, envp=0xd6b4) at 
/tmp/qemu/qemu/vl.c:3616
(gdb) 

$ valgrind arm-softmmu/qemu-system-arm -M realview-eb   
 [master]
==11274== Memcheck, a memory error detector
==11274== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11274== Using Valgrind-3.8.0 and LibVEX; rerun with -h for copyright info
==11274== Command: arm-softmmu/qemu-system-arm -M realview-eb
==11274== 
==11274== Invalid write of size 4
==11274==at 0x3035AB: arm_gic_class_init (arm_gic.c:696)
==11274==by 0x2A4E48: type_initialize (object.c:281)
==11274==by 0x2A5633: object_class_by_name (object.c:510)
==11274==by 0x1AC395: qdev_try_create (qdev.c:131)
==11274==by 0x1AC449: qdev_create (qdev.c:114)
==11274==by 0x360F7D: realview_gic_init (realview_gic.c:34)
==11274==by 0x24A147: sysbus_device_init (sysbus.c:121)
==11274==by 0x1AD6C7: qdev_init (qdev.c:160)
==11274==by 0x1AD84B: qdev_init_nofail (qdev.c:261)
==11274==by 0x24A883: sysbus_create_varargs (sysbus.c:135)

[Qemu-devel] macvlan/macvtap: guest/host cannot communicate when network cable is unplugged

2012-08-28 Thread ching
Hi all,

I try to setup guest network with macvlan, as guest cannot communicate with 
host directly, i try to workaround by configuring host to using a macvlan 
interface

host: Gentoo x64, kernel 3.5.2, qemu-kvm 1.1.1-r1, libvirt 0.9.13
guest: Ubuntu 12.04, kernel 3.4.9, virtio-net

The host is using macvlan (interface name: znet0, ip: 192.168.1.2, bridge mode)

The guest is using macvtap managed by libvirt (interface name: macvtap0, ip: 
192.168.1.184, bridge mode, with vhost)


I am facing a strange problem that the guest/host can communicate only when the 
network cable of the ethernet adapter is connected.




The network config of the host

$ifconfig

eth0  Link encap:Ethernet  HWaddr f4:6d:xx:xx:xx:xx  
  inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:86507 errors:0 dropped:0 overruns:0 frame:0
  TX packets:55940 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:126005746 (120.1 MiB)  TX bytes:4394225 (4.1 MiB)

macvtap0  Link encap:Ethernet  HWaddr 52:54:xx:xx:xx:xx
  inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:70 errors:0 dropped:0 overruns:0 frame:0
  TX packets:84 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:500
  RX bytes:9036 (8.8 KiB)  TX bytes:14734 (14.3 KiB)

znet0 Link encap:Ethernet  HWaddr 00:60:xx:xx:xx:xx
  inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
  inet6 addr: 2002:xx:xx:xx:xx/64 Scope:Global
  inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:4463190 errors:0 dropped:0 overruns:0 frame:0
  TX packets:12527522 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:3959213697 (3.6 GiB)  TX bytes:18590336476 (17.3 GiB)

$ip -d link show

10: znet0@eth0:  mtu 1500 qdisc noqueue 
state UP mode DEFAULT
link/ether 00:60:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
macvlan  mode bridge
17: macvtap0@eth0:  mtu 1500 qdisc 
pfifo_fast state UNKNOWN mode DEFAULT qlen 500
link/ether 52:54:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
macvtap  mode bridge

libvirt config of the guest network


  
  
  
  


The network config of the guest

eth0  Link encap:Ethernet  HWaddr 52:54:xx:xx:xx:xx  
  inet addr:192.168.1.184  Bcast:192.168.1.255  Mask:255.255.255.0
  inet6 addr: 2002:xx:xx:xx:xxx/64 Scope:Global
  inet6 addr: 2002:xx:xx:xx:xx/64 Scope:Global
  inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:25 errors:0 dropped:0 overruns:0 frame:0
  TX packets:66 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:3108 (3.1 KB)  TX bytes:13066 (13.0 KB)

when network cable of host is plugged in

$ ping 192.168.1.184
PING 192.168.1.184 (192.168.1.184) 56(84) bytes of data.
64 bytes from 192.168.1.184: icmp_req=1 ttl=64 time=0.314 ms
64 bytes from 192.168.1.184: icmp_req=2 ttl=64 time=0.253 ms
64 bytes from 192.168.1.184: icmp_req=3 ttl=64 time=0.230 ms


Afterward, i unplugged the network cable to ensure that packet is not routed 
via external router, but the test seems failed.

$ ping 192.168.1.184
PING 192.168.1.184 (192.168.1.184) 56(84) bytes of data.
>From 192.168.1.2: icmp_seq=10 Destination Host Unreachable
>From 192.168.1.2: icmp_seq=11 Destination Host Unreachable


Do anyone have similar problem? Am i missing something?


Regards,
ching



Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Wed, Aug 29, 2012 at 01:21:13AM +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 28, 2012 at 07:02:42PM -0300, Eduardo Habkost wrote:
> > On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > > > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > > > In preparation for adding PV EOI support, disable PV EOI by default 
> > > > > for
> > > > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > > > > 
> > > > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > > > Enable for 1.1
> > > > > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > > > Disable for 1.2
> > > > > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > > > 
> > > > 
> > > > What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> > > > kernel that supports PV EOI already? They would get PV EOI disabled when
> > > > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > > > 
> > > > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> > > > supporting PV EOI already have migration broken, so there's not much we
> > > > can do for them)
> > > 
> > > Exactly.
> > > 
> > > Talked to Gleb, long term I think we should rework code to make
> > > it forward-compatible wrt adding new MSRs:
> > > - source gets list of MSRs to be migrated from KVM and simply sends them 
> > > all
> > > - send all MSRS in key/value format
> > > - destination gets list of MSRs to be migrated from KVM and
> > >   only restores the supported ones
> > 
> > As far as I understand the migration code requirements/expectations, if
> > the origin is sending some data, it is because it is part of the
> > guest-visible machine state that must be kept while migrating. Because
> > of that, the destination is not allowed to drop anything it doesn't know
> > about.
> 
> 
> We have a ton of code that reads in values then just
> ignores them, for compat with old qemu.
> This will be exactly such a case:
> we don't drop anything - protocol does not
> support this. We read and simply do not tell kvm
> about it. We also have tons of code that sends
> useless values again for compatibility.
> 
> > At the same time, if it's not part of guest-visible machine
> > state, it doesn't have to be sent by the migration origin.
> > 
> 
> False, we often send internal device state which is not
> directly guest visible.

Besides, all MSRs in KVM's MSR list are actually guest visible, even if
guests normally do not invoke these MSRs - they can.

> > On the other hand, a mode of operation that doesn't require updating
> > QEMU every time there's a new bit of guest-visible state to be migrated
> > would be nice (just like the "-cpu host" mode, that doesn't require
> > updating QEMU for every new CPU feature, is nice for some use cases). I
> > just don't know how to make work with the current migration protocol.
> > 
> 
> I don't understand. What is the problem with the proposal?
> What will not work with our protocol?
> Can you give an example please?
> 
> 
> > > Too late for 1.2?
> > 
> > Absolutely (in my opinion).
> > 
> > > 
> > > > While we don't make the KVM feature-bit handling sane (with defaults
> > > > that are not blindly derived from the host kernel capabilities), maybe
> > > > the safest bet is to expect users to not migrate between hosts running
> > > > kernels with different KVM capabilities? (I am not sure which option is
> > > > better)
> > > 
> > > Sorry not sure what you talk about here. What has KVM feature-bit
> > > handling to do with this patchset?
> > 
> > Everything? The whole point of this patch is to filter out the PV_EOI
> > KVM feature bit.
> > 
> > 
> > This part of the current code, specifically, is wrong:
> > 
> > > > >  plus_kvm_features = ~0; /* not supported bits will be filtered 
> > > > > out later */
> > 
> > The QEMU-side list of KVM features should be whitelist-based, not
> > blacklist-based (unless the user doesn't need migration, in that case he
> > can use "-cpu host" and get every feature blindly enabled), because QEMU
> > can't know if a new feature involves guest-visible state that has to be
> > migrated.
> > 
> > 
> > > 
> > > > 
> > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > ---
> > > > >  hw/Makefile.objs  |  2 +-
> > > > >  hw/cpu_flags.c| 32 
> > > > >  hw/cpu_flags.h|  9 +
> > > > >  hw/pc_piix.c  |  2 ++
> > > > >  target-i386/cpu.c |  8 
> > > > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 hw/cpu_flags.c
> > > > >  create mode 100644 hw/cpu_flags.h
> > > > > 
> > > > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > > > index 850b87b..3f2532a 100644
> > > > > --- a/hw/Makefile.objs
> > > > > +++ b/hw/Makefile.objs
> > > > > @@ -1,5 +1,5 @@
> > > > >  hw-obj-y = usb/ ide/
> > > > > -hw-obj-y += loader.o
> > > > > +hw-obj-y += loader.o cpu_flags.o
> 

Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 07:02:42PM -0300, Eduardo Habkost wrote:
> On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > > In preparation for adding PV EOI support, disable PV EOI by default for
> > > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > > > 
> > > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > > Enable for 1.1
> > > > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > > Disable for 1.2
> > > > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > > 
> > > 
> > > What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> > > kernel that supports PV EOI already? They would get PV EOI disabled when
> > > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > > 
> > > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> > > supporting PV EOI already have migration broken, so there's not much we
> > > can do for them)
> > 
> > Exactly.
> > 
> > Talked to Gleb, long term I think we should rework code to make
> > it forward-compatible wrt adding new MSRs:
> > - source gets list of MSRs to be migrated from KVM and simply sends them all
> > - send all MSRS in key/value format
> > - destination gets list of MSRs to be migrated from KVM and
> >   only restores the supported ones
> 
> As far as I understand the migration code requirements/expectations, if
> the origin is sending some data, it is because it is part of the
> guest-visible machine state that must be kept while migrating. Because
> of that, the destination is not allowed to drop anything it doesn't know
> about.


We have a ton of code that reads in values then just
ignores them, for compat with old qemu.
This will be exactly such a case:
we don't drop anything - protocol does not
support this. We read and simply do not tell kvm
about it. We also have tons of code that sends
useless values again for compatibility.

> At the same time, if it's not part of guest-visible machine
> state, it doesn't have to be sent by the migration origin.
> 

False, we often send internal device state which is not
directly guest visible.

> On the other hand, a mode of operation that doesn't require updating
> QEMU every time there's a new bit of guest-visible state to be migrated
> would be nice (just like the "-cpu host" mode, that doesn't require
> updating QEMU for every new CPU feature, is nice for some use cases). I
> just don't know how to make work with the current migration protocol.
> 

I don't understand. What is the problem with the proposal?
What will not work with our protocol?
Can you give an example please?


> > Too late for 1.2?
> 
> Absolutely (in my opinion).
> 
> > 
> > > While we don't make the KVM feature-bit handling sane (with defaults
> > > that are not blindly derived from the host kernel capabilities), maybe
> > > the safest bet is to expect users to not migrate between hosts running
> > > kernels with different KVM capabilities? (I am not sure which option is
> > > better)
> > 
> > Sorry not sure what you talk about here. What has KVM feature-bit
> > handling to do with this patchset?
> 
> Everything? The whole point of this patch is to filter out the PV_EOI
> KVM feature bit.
> 
> 
> This part of the current code, specifically, is wrong:
> 
> > > >  plus_kvm_features = ~0; /* not supported bits will be filtered out 
> > > > later */
> 
> The QEMU-side list of KVM features should be whitelist-based, not
> blacklist-based (unless the user doesn't need migration, in that case he
> can use "-cpu host" and get every feature blindly enabled), because QEMU
> can't know if a new feature involves guest-visible state that has to be
> migrated.
> 
> 
> > 
> > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > ---
> > > >  hw/Makefile.objs  |  2 +-
> > > >  hw/cpu_flags.c| 32 
> > > >  hw/cpu_flags.h|  9 +
> > > >  hw/pc_piix.c  |  2 ++
> > > >  target-i386/cpu.c |  8 
> > > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > > >  create mode 100644 hw/cpu_flags.c
> > > >  create mode 100644 hw/cpu_flags.h
> > > > 
> > > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > > index 850b87b..3f2532a 100644
> > > > --- a/hw/Makefile.objs
> > > > +++ b/hw/Makefile.objs
> > > > @@ -1,5 +1,5 @@
> > > >  hw-obj-y = usb/ ide/
> > > > -hw-obj-y += loader.o
> > > > +hw-obj-y += loader.o cpu_flags.o
> > > >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> > > >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > > >  hw-obj-y += fw_cfg.o
> > > > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > > > new file mode 100644
> > > > index 000..7a633c0
> > > > --- /dev/null
> > > > +++ b/hw/cpu_flags.c
> > > > @@ -0,0 +1,32 @@
> > > > +/*
> > > > + * CPU compatibility flags.
> > > > + *
> > > > + * Copyright (c) 2012 Red Hat Inc.
> > > > + * Au

Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Eduardo Habkost
On Wed, Aug 29, 2012 at 12:35:28AM +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> > On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > > In preparation for adding PV EOI support, disable PV EOI by default for
> > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > > 
> > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > Enable for 1.1
> > > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > Disable for 1.2
> > > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > 
> > 
> > What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> > kernel that supports PV EOI already? They would get PV EOI disabled when
> > migrating to a destination running "qemu-1.2 -M pc-1.1".
> > 
> > (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> > supporting PV EOI already have migration broken, so there's not much we
> > can do for them)
> 
> Exactly.
> 
> Talked to Gleb, long term I think we should rework code to make
> it forward-compatible wrt adding new MSRs:
> - source gets list of MSRs to be migrated from KVM and simply sends them all
> - send all MSRS in key/value format
> - destination gets list of MSRs to be migrated from KVM and
>   only restores the supported ones

As far as I understand the migration code requirements/expectations, if
the origin is sending some data, it is because it is part of the
guest-visible machine state that must be kept while migrating. Because
of that, the destination is not allowed to drop anything it doesn't know
about. At the same time, if it's not part of guest-visible machine
state, it doesn't have to be sent by the migration origin.

On the other hand, a mode of operation that doesn't require updating
QEMU every time there's a new bit of guest-visible state to be migrated
would be nice (just like the "-cpu host" mode, that doesn't require
updating QEMU for every new CPU feature, is nice for some use cases). I
just don't know how to make work with the current migration protocol.



> Too late for 1.2?

Absolutely (in my opinion).

> 
> > While we don't make the KVM feature-bit handling sane (with defaults
> > that are not blindly derived from the host kernel capabilities), maybe
> > the safest bet is to expect users to not migrate between hosts running
> > kernels with different KVM capabilities? (I am not sure which option is
> > better)
> 
> Sorry not sure what you talk about here. What has KVM feature-bit
> handling to do with this patchset?

Everything? The whole point of this patch is to filter out the PV_EOI
KVM feature bit.


This part of the current code, specifically, is wrong:

> > >  plus_kvm_features = ~0; /* not supported bits will be filtered out 
> > > later */

The QEMU-side list of KVM features should be whitelist-based, not
blacklist-based (unless the user doesn't need migration, in that case he
can use "-cpu host" and get every feature blindly enabled), because QEMU
can't know if a new feature involves guest-visible state that has to be
migrated.


> 
> > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >  hw/Makefile.objs  |  2 +-
> > >  hw/cpu_flags.c| 32 
> > >  hw/cpu_flags.h|  9 +
> > >  hw/pc_piix.c  |  2 ++
> > >  target-i386/cpu.c |  8 
> > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > >  create mode 100644 hw/cpu_flags.c
> > >  create mode 100644 hw/cpu_flags.h
> > > 
> > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > index 850b87b..3f2532a 100644
> > > --- a/hw/Makefile.objs
> > > +++ b/hw/Makefile.objs
> > > @@ -1,5 +1,5 @@
> > >  hw-obj-y = usb/ ide/
> > > -hw-obj-y += loader.o
> > > +hw-obj-y += loader.o cpu_flags.o
> > >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> > >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > >  hw-obj-y += fw_cfg.o
> > > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > > new file mode 100644
> > > index 000..7a633c0
> > > --- /dev/null
> > > +++ b/hw/cpu_flags.c
> > > @@ -0,0 +1,32 @@
> > > +/*
> > > + * CPU compatibility flags.
> > > + *
> > > + * Copyright (c) 2012 Red Hat Inc.
> > > + * Author: Michael S. Tsirkin.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License 
> > > along
> > > + * with this program; if not, see .
> > > + */
> > > +#include "hw/c

Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Anthony Liguori
Blue Swirl  writes:

> On Tue, Aug 28, 2012 at 7:31 PM, Anthony Liguori  
> wrote:
>> Blue Swirl  writes:
>>
>>> On Tue, Aug 28, 2012 at 5:28 PM, Michael S. Tsirkin  wrote:
 On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
> On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
> > On 27.08.2012 22:56, Blue Swirl wrote:
> > []
> >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
> >>> +{
> >>> +AssignedDevRegion *d = opaque;
> >>> +uint8_t *in = d->u.r_virtbase + addr;
> >>
> >> Don't perform arithmetic with void pointers.
> >
> > There are a few places in common qemu code which does this for a very
> > long time.  So I guess it is safe now.
>
> It's a non-standard GCC extension.

 So?  We use many other GCC extensions. grep for typeof.
>>>
>>> Dependencies should not be introduced trivially. In this case, it's
>>> pretty easy to avoid void pointer arithmetic as Jan's next version
>>> shows.
>>
>> The standard is vague with respect void arithmetic.  Most compilers
>> allow it.  A very good analysis of the standard can be found below.
>>
>> http://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c
>
> The analysis would seem to show that arithmetic may be acceptable, but
> it doesn't say that void pointers must be treated like char pointers.
> In my view, this would make sense:
>
> char *cptr;
> void *vptr;
>
> Since
> cptr++;
> is equivalent to
> cptr = (char *)((uintptr_t)cptr + sizeof(*cptr));
>
> therefore
>
> vptr++;
> should be equivalent to
> vptr = (void *)((uintptr_t)vptr + sizeof(*vptr));
> That is, vptr++ should be equivalent to vptr += 0 because sizeof(void)
> should be 0 if allowed.

sizeof(void) == 1

With GCC at least.

Regards,

Anthony Liguori

>>
>> Regards,
>>
>> Anthony Liguori
>>
>>>

 Is there a work in progress to build GCC with visual studio?
 If yes what are the chances KVM device assignment
 will work on windows?
>>>
>>> IIRC there was really a project to use KVM on Windows and another
>>> project to build QEMU with MSVC.
>>>

 Look QEMU codebase is what it is. Unless you rework all existing
 code to confirm to your taste, I do not see why you NACK valid new code
 unless it confirms to same.
>>>
>>> Yes, I'd be happy to fix the style with huge patches at once. But our
>>> fearless leader does not agree, so we are stuck with the codebase
>>> being what it is until it is fixed one step at a time.
>>>

> >
> > /mjt
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 09:02:05PM +0400, malc wrote:
> On Tue, 28 Aug 2012, Michael S. Tsirkin wrote:
> 
> > On Mon, Aug 27, 2012 at 07:40:56PM +, Blue Swirl wrote:
> > > On Mon, Aug 27, 2012 at 7:24 PM, Michael S. Tsirkin  
> > > wrote:
> > > > On Mon, Aug 27, 2012 at 07:12:27PM +, Blue Swirl wrote:
> > > >> On Mon, Aug 27, 2012 at 7:06 PM, Michael S. Tsirkin  
> > > >> wrote:
> > > >> > On Mon, Aug 27, 2012 at 06:58:29PM +, Blue Swirl wrote:
> > > >> >> On Mon, Aug 27, 2012 at 12:20 PM, Michael S. Tsirkin 
> > > >> >>  wrote:
> > > >> >> > In preparation for adding PV EOI support, disable PV EOI by 
> > > >> >> > default for
> > > >> >> > 1.1 and older machine types, to avoid CPUID changing during 
> > > >> >> > migration.
> > > >> >> >
> > > >> >> > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > >> >> > Enable for 1.1
> > > >> >> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > >> >> > Disable for 1.2
> > > >> >> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > > >> >> >
> > > >> >> > Signed-off-by: Michael S. Tsirkin 
> > > >> >> > ---
> > > >> >> >  hw/Makefile.objs  |  2 +-
> > > >> >> >  hw/cpu_flags.c| 32 
> > > >> >> >  hw/cpu_flags.h|  9 +
> > > >> >> >  hw/pc_piix.c  |  2 ++
> 
> [..snip..]
> 
> > > 
> > > No leading underscores. They are not used in QEMU.
> > 
> > They are *widely* used in QEMU to mark internal
> > stuff. E.g. parameters in many macros.
> > 
> 
> ISO/IEC 9899:TC3 7.1.3#1
> 
> - All identifiers that begin with an underscore and either an
>   uppercase letter or another underscore are always reserved for any use.
> 
> IOW no __ or _[A-Z] at all.
> 
> - All identifiers that begin with an underscore are always reserved
>   for use as identifiers with file scope in both the ordinary and tag
>   name spaces.
> 
> IOW _ as the name of an argument to a macro is (probably) okay.
> 
> > In reality __ is also widely used. I'm still mulling
> > removing 2.4 from HACKING - it appears too draconian,
> > the chances of a conflict with preprocessor are remote
> > and if it triggers, it's trivial to catch.
> > We also have lots of existing code violating this rule.
> > 
> > And the rule about _t suffix is just silly.
> 
> http://pubs.opengroup.org/onlinepubs/7908799/xns/namespace.html
> 
> [..snip..]

It's still silly :)

> -- 
> mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH v2 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 01:02:26PM +0200, Jan Kiszka wrote:
> This adds PCI device assignment for i386 targets using the classic KVM
> interfaces. This version is 100% identical to what is being maintained
> in qemu-kvm for several years and is supported by libvirt as well. It is
> expected to remain relevant for another couple of years until kernels
> without full-features and performance-wise equivalent VFIO support are
> obsolete.
> 
> A refactoring to-do that should be done in-tree is to model MSI and
> MSI-X support via the generic PCI layer, similar to what VFIO is already
> doing for MSI-X. This should improve the correctness and clean up the
> code from duplicate logic.
> 
> Signed-off-by: Jan Kiszka 
> ---
> 
> Changes in v2:
>  - Addressed review comments of Andreas and most of Blue (see [1] for
>unchanged aspects)

any chance you can address mine? Specifically I suggested
listing commit id that you started from. Also:

> +static void msix_reset(AssignedDevice *dev)
> +{
> +MSIXTableEntry *entry;
> +int i;
> +
> +if (!dev->msix_table) {
> +return;
> +}
> +
> +memset(dev->msix_table, 0, MSIX_PAGE_SIZE);
> +
> +for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) {
> +entry->ctrl = cpu_to_le32(0x1); /* Masked */
> +}
> +}

Is a bad name. Ideally file scope names should have a prefix
assigned_dev_ or pci_assign_ or something like that
but it's not a hard requirement. In this case
file includes msix.h so prefix msix_ is confusing.

-- 
MST



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 04:40:38PM -0300, Eduardo Habkost wrote:
> On Tue, Aug 28, 2012 at 02:13:18PM -0500, Anthony Liguori wrote:
> > "Michael S. Tsirkin"  writes:
> > 
> > > In preparation for adding PV EOI support, disable PV EOI by default for
> > > 1.1 and older machine types, to avoid CPUID changing during migration.
> > >
> > > PV EOI can still be enabled/disabled by specifying it explicitly.
> > > Enable for 1.1
> > > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > > Disable for 1.2
> > > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > >
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >  hw/Makefile.objs  |  2 +-
> > >  hw/cpu_flags.c| 32 
> > >  hw/cpu_flags.h|  9 +
> > >  hw/pc_piix.c  |  2 ++
> > >  target-i386/cpu.c |  8 
> > >  5 files changed, 52 insertions(+), 1 deletion(-)
> > >  create mode 100644 hw/cpu_flags.c
> > >  create mode 100644 hw/cpu_flags.h
> > >
> > > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > > index 850b87b..3f2532a 100644
> > > --- a/hw/Makefile.objs
> > > +++ b/hw/Makefile.objs
> > > @@ -1,5 +1,5 @@
> > >  hw-obj-y = usb/ ide/
> > > -hw-obj-y += loader.o
> > > +hw-obj-y += loader.o cpu_flags.o
> > >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> > >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > >  hw-obj-y += fw_cfg.o
> > > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > > new file mode 100644
> > > index 000..2422d20
> > > --- /dev/null
> > > +++ b/hw/cpu_flags.c
> > > @@ -0,0 +1,32 @@
> > > +/*
> > > + * CPU compatibility flags.
> > > + *
> > > + * Copyright (c) 2012 Red Hat Inc.
> > > + * Author: Michael S. Tsirkin.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License 
> > > along
> > > + * with this program; if not, see .
> > > + */
> > > +#include "hw/cpu_flags.h"
> > > +
> > > +static bool __kvm_pv_eoi_disabled;
> > > +
> > > +void disable_kvm_pv_eoi(void)
> > > +{
> > > +   __kvm_pv_eoi_disabled = true;
> > > +}
> > > +
> > > +bool kvm_pv_eoi_disabled(void)
> > > +{
> > > +   return __kvm_pv_eoi_disabled;
> > > +}
> > 
> > Would this make more sense as a CPU flag or at least the KVM PV LAPIC?
> 
> It is a CPU flag, already. The problem is that QEMU defaults to enabling
> blindly everything reported by the host kernel, potentially breaking
> migration (and this still has to be fixed).
> > 
> > We really ought to embed the LAPIC in the CPU objects.  Then it would
> > all make a bit more sense conceptionally.  But I definitely thing a CPU
> > feature flag makes the most sense here.
> > 
> > Then we can handle it via globals once we make CPU's qdev devices.
> 
> Yes, this is a perfect candidate to use globals/qdev for machine-type
> compatibility.

OK so ... ACK?

> > 
> > Regards,
> > 
> > Anthony Liguori
> > 
> > > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > > new file mode 100644
> > > index 000..05777b6
> > > --- /dev/null
> > > +++ b/hw/cpu_flags.h
> > > @@ -0,0 +1,9 @@
> > > +#ifndef HW_CPU_FLAGS_H
> > > +#define HW_CPU_FLAGS_H
> > > +
> > > +#include 
> > > +
> > > +void disable_kvm_pv_eoi(void);
> > > +bool kvm_pv_eoi_disabled(void);
> > > +
> > > +#endif
> > > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > > index 008d42f..bdbceda 100644
> > > --- a/hw/pc_piix.c
> > > +++ b/hw/pc_piix.c
> > > @@ -46,6 +46,7 @@
> > >  #ifdef CONFIG_XEN
> > >  #  include 
> > >  #endif
> > > +#include "cpu_flags.h"
> > >  
> > >  #define MAX_IDE_BUS 2
> > >  
> > > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> > >  
> > >  static void pc_machine_v1_1_compat(void)
> > >  {
> > > +disable_kvm_pv_eoi();
> > >  }
> > >  
> > >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > index 120a2e3..0d02fd1 100644
> > > --- a/target-i386/cpu.c
> > > +++ b/target-i386/cpu.c
> > > @@ -23,6 +23,7 @@
> > >  
> > >  #include "cpu.h"
> > >  #include "kvm.h"
> > > +#include "asm/kvm_para.h"
> > >  
> > >  #include "qemu-option.h"
> > >  #include "qemu-config.h"
> > > @@ -33,6 +34,7 @@
> > >  #include "hyperv.h"
> > >  
> > >  #include "hw/hw.h"
> > > +#include "hw/cpu_flags.h"
> > >  
> > >  /* feature flags taken from "Intel Processor Identification and the CPUID
> > >   * Instruction" and AMD's "CPUID Specification".  In cases of 
> > > disagreement
> > > @@ -889,6 +891,12 @@ static int cpu_x86_fin

Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 02:13:18PM -0500, Anthony Liguori wrote:
> "Michael S. Tsirkin"  writes:
> 
> > In preparation for adding PV EOI support, disable PV EOI by default for
> > 1.1 and older machine types, to avoid CPUID changing during migration.
> >
> > PV EOI can still be enabled/disabled by specifying it explicitly.
> > Enable for 1.1
> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > Disable for 1.2
> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> >
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/Makefile.objs  |  2 +-
> >  hw/cpu_flags.c| 32 
> >  hw/cpu_flags.h|  9 +
> >  hw/pc_piix.c  |  2 ++
> >  target-i386/cpu.c |  8 
> >  5 files changed, 52 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cpu_flags.c
> >  create mode 100644 hw/cpu_flags.h
> >
> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > index 850b87b..3f2532a 100644
> > --- a/hw/Makefile.objs
> > +++ b/hw/Makefile.objs
> > @@ -1,5 +1,5 @@
> >  hw-obj-y = usb/ ide/
> > -hw-obj-y += loader.o
> > +hw-obj-y += loader.o cpu_flags.o
> >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> >  hw-obj-y += fw_cfg.o
> > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > new file mode 100644
> > index 000..2422d20
> > --- /dev/null
> > +++ b/hw/cpu_flags.c
> > @@ -0,0 +1,32 @@
> > +/*
> > + * CPU compatibility flags.
> > + *
> > + * Copyright (c) 2012 Red Hat Inc.
> > + * Author: Michael S. Tsirkin.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see .
> > + */
> > +#include "hw/cpu_flags.h"
> > +
> > +static bool __kvm_pv_eoi_disabled;
> > +
> > +void disable_kvm_pv_eoi(void)
> > +{
> > +   __kvm_pv_eoi_disabled = true;
> > +}
> > +
> > +bool kvm_pv_eoi_disabled(void)
> > +{
> > +   return __kvm_pv_eoi_disabled;
> > +}
> 
> Would this make more sense as a CPU flag or at least the KVM PV LAPIC?
> 
> We really ought to embed the LAPIC in the CPU objects.  Then it would
> all make a bit more sense conceptionally.  But I definitely thing a CPU
> feature flag makes the most sense here.
> 
> Then we can handle it via globals once we make CPU's qdev devices.
> 
> Regards,
> 
> Anthony Liguori

Sorry I do not understand what you suggest.
This is a trivial bugfix patch it can be written in a million ways,
I am confused by all the improvement suggestions.
Could you please simply send me a patch?  I can even test it.

All I care about is that we get the functionality above:

 -M pc-1.1 --- disabled
 -M pc-1.2 --- enabled
 -M pc-1.1 -cpu kvm64,+kvm_pv_eoi -- enabled
 -M pc-1.2 -cpu kvm64,-kvm_pv_eoi -- disabled

> > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > new file mode 100644
> > index 000..05777b6
> > --- /dev/null
> > +++ b/hw/cpu_flags.h
> > @@ -0,0 +1,9 @@
> > +#ifndef HW_CPU_FLAGS_H
> > +#define HW_CPU_FLAGS_H
> > +
> > +#include 
> > +
> > +void disable_kvm_pv_eoi(void);
> > +bool kvm_pv_eoi_disabled(void);
> > +
> > +#endif
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 008d42f..bdbceda 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> > @@ -46,6 +46,7 @@
> >  #ifdef CONFIG_XEN
> >  #  include 
> >  #endif
> > +#include "cpu_flags.h"
> >  
> >  #define MAX_IDE_BUS 2
> >  
> > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> >  
> >  static void pc_machine_v1_1_compat(void)
> >  {
> > +disable_kvm_pv_eoi();
> >  }
> >  
> >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 120a2e3..0d02fd1 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -23,6 +23,7 @@
> >  
> >  #include "cpu.h"
> >  #include "kvm.h"
> > +#include "asm/kvm_para.h"
> >  
> >  #include "qemu-option.h"
> >  #include "qemu-config.h"
> > @@ -33,6 +34,7 @@
> >  #include "hyperv.h"
> >  
> >  #include "hw/hw.h"
> > +#include "hw/cpu_flags.h"
> >  
> >  /* feature flags taken from "Intel Processor Identification and the CPUID
> >   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> > @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t 
> > *x86_cpu_def, const char *cpu_model)
> >  
> >  plus_kvm_features = ~0; /* not supported bits will be filtered out 
> > later */
> >  
> > +/* Disable PV EOI for old machine types.
> > + * Fea

Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 04:13:38PM -0300, Eduardo Habkost wrote:
> On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> > In preparation for adding PV EOI support, disable PV EOI by default for
> > 1.1 and older machine types, to avoid CPUID changing during migration.
> > 
> > PV EOI can still be enabled/disabled by specifying it explicitly.
> > Enable for 1.1
> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > Disable for 1.2
> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > 
> 
> What about users that are already running "qemu-1.1 -M pc-1.1" on a host
> kernel that supports PV EOI already? They would get PV EOI disabled when
> migrating to a destination running "qemu-1.2 -M pc-1.1".
> 
> (On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
> supporting PV EOI already have migration broken, so there's not much we
> can do for them)

Exactly.

Talked to Gleb, long term I think we should rework code to make
it forward-compatible wrt adding new MSRs:
- source gets list of MSRs to be migrated from KVM and simply sends them all
- send all MSRS in key/value format
- destination gets list of MSRs to be migrated from KVM and
  only restores the supported ones
Too late for 1.2?

> While we don't make the KVM feature-bit handling sane (with defaults
> that are not blindly derived from the host kernel capabilities), maybe
> the safest bet is to expect users to not migrate between hosts running
> kernels with different KVM capabilities? (I am not sure which option is
> better)

Sorry not sure what you talk about here. What has KVM feature-bit
handling to do with this patchset?

> 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/Makefile.objs  |  2 +-
> >  hw/cpu_flags.c| 32 
> >  hw/cpu_flags.h|  9 +
> >  hw/pc_piix.c  |  2 ++
> >  target-i386/cpu.c |  8 
> >  5 files changed, 52 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cpu_flags.c
> >  create mode 100644 hw/cpu_flags.h
> > 
> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > index 850b87b..3f2532a 100644
> > --- a/hw/Makefile.objs
> > +++ b/hw/Makefile.objs
> > @@ -1,5 +1,5 @@
> >  hw-obj-y = usb/ ide/
> > -hw-obj-y += loader.o
> > +hw-obj-y += loader.o cpu_flags.o
> >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> >  hw-obj-y += fw_cfg.o
> > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > new file mode 100644
> > index 000..7a633c0
> > --- /dev/null
> > +++ b/hw/cpu_flags.c
> > @@ -0,0 +1,32 @@
> > +/*
> > + * CPU compatibility flags.
> > + *
> > + * Copyright (c) 2012 Red Hat Inc.
> > + * Author: Michael S. Tsirkin.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see .
> > + */
> > +#include "hw/cpu_flags.h"
> > +
> > +static bool kvm_pv_eoi_disabled_state;
> > +
> > +void disable_kvm_pv_eoi(void)
> > +{
> > +   kvm_pv_eoi_disabled_state = true;
> > +}
> > +
> > +bool kvm_pv_eoi_disabled(void)
> > +{
> > +   return kvm_pv_eoi_disabled_state;
> > +}
> > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > new file mode 100644
> > index 000..05777b6
> > --- /dev/null
> > +++ b/hw/cpu_flags.h
> > @@ -0,0 +1,9 @@
> > +#ifndef HW_CPU_FLAGS_H
> > +#define HW_CPU_FLAGS_H
> > +
> > +#include 
> > +
> > +void disable_kvm_pv_eoi(void);
> > +bool kvm_pv_eoi_disabled(void);
> > +
> > +#endif
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 008d42f..bdbceda 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> > @@ -46,6 +46,7 @@
> >  #ifdef CONFIG_XEN
> >  #  include 
> >  #endif
> > +#include "cpu_flags.h"
> >  
> >  #define MAX_IDE_BUS 2
> >  
> > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> >  
> >  static void pc_machine_v1_1_compat(void)
> >  {
> > +disable_kvm_pv_eoi();
> >  }
> >  
> >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 120a2e3..0d02fd1 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -23,6 +23,7 @@
> >  
> >  #include "cpu.h"
> >  #include "kvm.h"
> > +#include "asm/kvm_para.h"
> >  
> >  #include "qemu-option.h"
> >  #include "qemu-config.h"
> > @@ -33,6 +34,7 @@
> >  #include "hyperv.h"
> >  
> >  #include "hw/hw.h"
> > +#include "hw/cpu_flags.h"
> >  
> >  /* feature flags taken from "Intel Processor Identificatio

Re: [Qemu-devel] [PATCHv2 2/4] pc: refactor compat code

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 01:37:18PM -0300, Marcelo Tosatti wrote:
> On Tue, Aug 28, 2012 at 07:31:24PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 28, 2012 at 01:23:18PM -0300, Marcelo Tosatti wrote:
> > > On Mon, Aug 27, 2012 at 03:20:13PM +0300, Michael S. Tsirkin wrote:
> > > > In preparation to adding PV EOI migration for 1.2,
> > > > trivially refactor some some compat code
> > > > to make it easier to add version specific
> > > > cpuid tweaks.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > ---
> > > >  hw/pc_piix.c | 44 
> > > >  1 file changed, 36 insertions(+), 8 deletions(-)
> > > 
> > > Why can't you clear the cpuid bit in kvm paravirt leaf at hw/pc_piix.c, 
> > > leaving compat code isolated there?
> > 
> > This is not how we handle it for compat properties:
> > there we set flag in pc_piix and make devices look at
> > that flag.
> > And it makes sense because what you suggest does not scale: we can not
> > teach pc_piix about quirks of all hardware.
> 
> What is there to scale? Old machines are static, they don't change.

But we keep adding new features in each version and each of them adds
new device specific stuff. If we stick all logic in pc_piix.c it will
quickly have to know about internals of virtio,cdrom,ide,cpu ...

> This
> way you are putting the burden of compat support in the generic driver
> code, which has better things to worry about than old machine types.
> 
> IMO it is better to have all of the compat mess localized.
>
> > It will also scale better if we ever get interested about
> > compatibility and migration for non pc machines.
> > -- 
> > MST
> 
> Alright, will wait a couple of days before merging.



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Peter Maydell
On 27 August 2012 13:15, Jan Kiszka  wrote:
> On 2012-08-27 14:07, Andreas Färber wrote:
>> Am I correct to understand we compile this only for i386 / x86_64?
>
> This is correct.

Did we ever make a decision about whether architecture specific
KVM devices should all live in hw/kvm/ or in some directory
with the arch name in it? I guess they're all in hw/kvm/ at
the moment so they should stay there unless there's a pressing
reason to move them around...

Since this is arch-specific we should probably give the
resulting device a more specific name than "pci-assign",
which implies that it is (a) ok for any PCI system and
(b) not something that will be obsolete in the foreseen
future.

-- PMM



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 7:31 PM, Anthony Liguori  wrote:
> Blue Swirl  writes:
>
>> On Tue, Aug 28, 2012 at 5:28 PM, Michael S. Tsirkin  wrote:
>>> On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
 On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
 > On 27.08.2012 22:56, Blue Swirl wrote:
 > []
 >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
 >>> +{
 >>> +AssignedDevRegion *d = opaque;
 >>> +uint8_t *in = d->u.r_virtbase + addr;
 >>
 >> Don't perform arithmetic with void pointers.
 >
 > There are a few places in common qemu code which does this for a very
 > long time.  So I guess it is safe now.

 It's a non-standard GCC extension.
>>>
>>> So?  We use many other GCC extensions. grep for typeof.
>>
>> Dependencies should not be introduced trivially. In this case, it's
>> pretty easy to avoid void pointer arithmetic as Jan's next version
>> shows.
>
> The standard is vague with respect void arithmetic.  Most compilers
> allow it.  A very good analysis of the standard can be found below.
>
> http://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c

The analysis would seem to show that arithmetic may be acceptable, but
it doesn't say that void pointers must be treated like char pointers.
In my view, this would make sense:

char *cptr;
void *vptr;

Since
cptr++;
is equivalent to
cptr = (char *)((uintptr_t)cptr + sizeof(*cptr));

therefore

vptr++;
should be equivalent to
vptr = (void *)((uintptr_t)vptr + sizeof(*vptr));

That is, vptr++ should be equivalent to vptr += 0 because sizeof(void)
should be 0 if allowed.

>
> BTW: can we please stop arguing about C standards.  If we currently are
> using something in QEMU that's supported by clang and GCC, it's fine and
> we ought to continue using it.
>
> The reserved names actually did bite us when porting to a new platform.
> But the only requirement for C extensions ought to be reasonable support
> in GCC and clang.
>
> I don't care at all about supporting proprietary compilers.

We also don't have crowds banging doors with their money bags with a
need for such support.

>
> Regards,
>
> Anthony Liguori
>
>>
>>>
>>> Is there a work in progress to build GCC with visual studio?
>>> If yes what are the chances KVM device assignment
>>> will work on windows?
>>
>> IIRC there was really a project to use KVM on Windows and another
>> project to build QEMU with MSVC.
>>
>>>
>>> Look QEMU codebase is what it is. Unless you rework all existing
>>> code to confirm to your taste, I do not see why you NACK valid new code
>>> unless it confirms to same.
>>
>> Yes, I'd be happy to fix the style with huge patches at once. But our
>> fearless leader does not agree, so we are stuck with the codebase
>> being what it is until it is fixed one step at a time.
>>
>>>
 >
 > /mjt
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Anthony Liguori
Eduardo Habkost  writes:

> On Tue, Aug 28, 2012 at 02:15:30PM -0500, Anthony Liguori wrote:
>> Eduardo Habkost  writes:
>> 
>> > On Tue, Aug 28, 2012 at 07:59:47PM +0200, Andreas Färber wrote:
>> >> Am 28.08.2012 16:27, schrieb Eduardo Habkost:
>> >> > On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
>> >> >> On 28 August 2012 14:30, Eduardo Habkost  wrote:
>> >> >>> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
>> >> >>>   merged" patches can live until 1.2 is done;
>> >> >>
>> >> >> With 1.3 due for release in just over a week, it seems unlikely
>> >> >> that it's worth branching at this point...
>> >> > 
>> >> > Well, the closer to the release, the smaller the cost of branching as we
>> >> > won't have many patches entering the 1.2 branch, anyway.
>> >> 
>> >> The idea behind the new release model is to never branch for releases,
>> >> so that we can easily bisect between v1.2 and v1.3, both tags being on
>> >> the same branch. So I don't think a 1.2 branch is likely.
>> >
>> > That means that every branch to be merged after 1.2 has to be rebased on
>> > top of 1.2 before being merged?
>> 
>> I'd prefer not to do next trees unless it's for a clear subsystem that
>> already exists and will continue to exist.
>> 
>> If someone wants to be a CPU subsystem maintainer, that's great, and we
>> can keep the tree open regardless of the release.  But just having a
>> temporary tree for 3 weeks is more pain than it's worth.
>
> How exactly this would cause pain? I am already maintaining a branch for
> myself with a huge list of patches, to be able to continue working on
> things I want to send to 1.3.
>
> The difference is that in addition to that, I am willing to gather the
> patches that seem to be "ready to go" on a more stable branch, and send
> them as a single pull request (or even a plain patch series by mail) to
> the list once 1.2 is out.

Patches sent during the release window usually don't have enough so just
pulling them sort of defeats the purpose.

Regards,

Anthony Liguori

>
> -- 
> Eduardo



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread malc
On Tue, 28 Aug 2012, Anthony Liguori wrote:

> Blue Swirl  writes:
> 
> > On Tue, Aug 28, 2012 at 5:28 PM, Michael S. Tsirkin  wrote:
> >> On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
> >>> On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
> >>> > On 27.08.2012 22:56, Blue Swirl wrote:
> >>> > []
> >>> >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
> >>> >>> +{
> >>> >>> +AssignedDevRegion *d = opaque;
> >>> >>> +uint8_t *in = d->u.r_virtbase + addr;
> >>> >>
> >>> >> Don't perform arithmetic with void pointers.
> >>> >
> >>> > There are a few places in common qemu code which does this for a very
> >>> > long time.  So I guess it is safe now.
> >>>
> >>> It's a non-standard GCC extension.
> >>
> >> So?  We use many other GCC extensions. grep for typeof.
> >
> > Dependencies should not be introduced trivially. In this case, it's
> > pretty easy to avoid void pointer arithmetic as Jan's next version
> > shows.
> 
> The standard is vague with respect void arithmetic.  Most compilers
> allow it.  A very good analysis of the standard can be found below.
> 
> http://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c
> 
> BTW: can we please stop arguing about C standards.  If we currently are
> using something in QEMU that's supported by clang and GCC, it's fine and
> we ought to continue using it.

No we can not stop arguing. Besides you are wrong.

[..snip..]

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Eduardo Habkost
On Tue, Aug 28, 2012 at 02:13:18PM -0500, Anthony Liguori wrote:
> "Michael S. Tsirkin"  writes:
> 
> > In preparation for adding PV EOI support, disable PV EOI by default for
> > 1.1 and older machine types, to avoid CPUID changing during migration.
> >
> > PV EOI can still be enabled/disabled by specifying it explicitly.
> > Enable for 1.1
> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > Disable for 1.2
> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> >
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/Makefile.objs  |  2 +-
> >  hw/cpu_flags.c| 32 
> >  hw/cpu_flags.h|  9 +
> >  hw/pc_piix.c  |  2 ++
> >  target-i386/cpu.c |  8 
> >  5 files changed, 52 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cpu_flags.c
> >  create mode 100644 hw/cpu_flags.h
> >
> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > index 850b87b..3f2532a 100644
> > --- a/hw/Makefile.objs
> > +++ b/hw/Makefile.objs
> > @@ -1,5 +1,5 @@
> >  hw-obj-y = usb/ ide/
> > -hw-obj-y += loader.o
> > +hw-obj-y += loader.o cpu_flags.o
> >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> >  hw-obj-y += fw_cfg.o
> > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> > new file mode 100644
> > index 000..2422d20
> > --- /dev/null
> > +++ b/hw/cpu_flags.c
> > @@ -0,0 +1,32 @@
> > +/*
> > + * CPU compatibility flags.
> > + *
> > + * Copyright (c) 2012 Red Hat Inc.
> > + * Author: Michael S. Tsirkin.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see .
> > + */
> > +#include "hw/cpu_flags.h"
> > +
> > +static bool __kvm_pv_eoi_disabled;
> > +
> > +void disable_kvm_pv_eoi(void)
> > +{
> > +   __kvm_pv_eoi_disabled = true;
> > +}
> > +
> > +bool kvm_pv_eoi_disabled(void)
> > +{
> > +   return __kvm_pv_eoi_disabled;
> > +}
> 
> Would this make more sense as a CPU flag or at least the KVM PV LAPIC?

It is a CPU flag, already. The problem is that QEMU defaults to enabling
blindly everything reported by the host kernel, potentially breaking
migration (and this still has to be fixed).

> 
> We really ought to embed the LAPIC in the CPU objects.  Then it would
> all make a bit more sense conceptionally.  But I definitely thing a CPU
> feature flag makes the most sense here.
> 
> Then we can handle it via globals once we make CPU's qdev devices.

Yes, this is a perfect candidate to use globals/qdev for machine-type
compatibility.

> 
> Regards,
> 
> Anthony Liguori
> 
> > diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> > new file mode 100644
> > index 000..05777b6
> > --- /dev/null
> > +++ b/hw/cpu_flags.h
> > @@ -0,0 +1,9 @@
> > +#ifndef HW_CPU_FLAGS_H
> > +#define HW_CPU_FLAGS_H
> > +
> > +#include 
> > +
> > +void disable_kvm_pv_eoi(void);
> > +bool kvm_pv_eoi_disabled(void);
> > +
> > +#endif
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 008d42f..bdbceda 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> > @@ -46,6 +46,7 @@
> >  #ifdef CONFIG_XEN
> >  #  include 
> >  #endif
> > +#include "cpu_flags.h"
> >  
> >  #define MAX_IDE_BUS 2
> >  
> > @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
> >  
> >  static void pc_machine_v1_1_compat(void)
> >  {
> > +disable_kvm_pv_eoi();
> >  }
> >  
> >  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 120a2e3..0d02fd1 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -23,6 +23,7 @@
> >  
> >  #include "cpu.h"
> >  #include "kvm.h"
> > +#include "asm/kvm_para.h"
> >  
> >  #include "qemu-option.h"
> >  #include "qemu-config.h"
> > @@ -33,6 +34,7 @@
> >  #include "hyperv.h"
> >  
> >  #include "hw/hw.h"
> > +#include "hw/cpu_flags.h"
> >  
> >  /* feature flags taken from "Intel Processor Identification and the CPUID
> >   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> > @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t 
> > *x86_cpu_def, const char *cpu_model)
> >  
> >  plus_kvm_features = ~0; /* not supported bits will be filtered out 
> > later */
> >  
> > +/* Disable PV EOI for old machine types.
> > + * Feature flags can still override. */
> > +if (kvm_pv_eoi_disabled()) {
> > +plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> > +}
> > +
> > 

Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Anthony Liguori
Blue Swirl  writes:

> On Tue, Aug 28, 2012 at 5:28 PM, Michael S. Tsirkin  wrote:
>> On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
>>> On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
>>> > On 27.08.2012 22:56, Blue Swirl wrote:
>>> > []
>>> >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
>>> >>> +{
>>> >>> +AssignedDevRegion *d = opaque;
>>> >>> +uint8_t *in = d->u.r_virtbase + addr;
>>> >>
>>> >> Don't perform arithmetic with void pointers.
>>> >
>>> > There are a few places in common qemu code which does this for a very
>>> > long time.  So I guess it is safe now.
>>>
>>> It's a non-standard GCC extension.
>>
>> So?  We use many other GCC extensions. grep for typeof.
>
> Dependencies should not be introduced trivially. In this case, it's
> pretty easy to avoid void pointer arithmetic as Jan's next version
> shows.

The standard is vague with respect void arithmetic.  Most compilers
allow it.  A very good analysis of the standard can be found below.

http://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c

BTW: can we please stop arguing about C standards.  If we currently are
using something in QEMU that's supported by clang and GCC, it's fine and
we ought to continue using it.

The reserved names actually did bite us when porting to a new platform.
But the only requirement for C extensions ought to be reasonable support
in GCC and clang.

I don't care at all about supporting proprietary compilers.

Regards,

Anthony Liguori

>
>>
>> Is there a work in progress to build GCC with visual studio?
>> If yes what are the chances KVM device assignment
>> will work on windows?
>
> IIRC there was really a project to use KVM on Windows and another
> project to build QEMU with MSVC.
>
>>
>> Look QEMU codebase is what it is. Unless you rework all existing
>> code to confirm to your taste, I do not see why you NACK valid new code
>> unless it confirms to same.
>
> Yes, I'd be happy to fix the style with huge patches at once. But our
> fearless leader does not agree, so we are stuck with the codebase
> being what it is until it is fixed one step at a time.
>
>>
>>> >
>>> > /mjt
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Eduardo Habkost
On Tue, Aug 28, 2012 at 02:15:30PM -0500, Anthony Liguori wrote:
> Eduardo Habkost  writes:
> 
> > On Tue, Aug 28, 2012 at 07:59:47PM +0200, Andreas Färber wrote:
> >> Am 28.08.2012 16:27, schrieb Eduardo Habkost:
> >> > On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
> >> >> On 28 August 2012 14:30, Eduardo Habkost  wrote:
> >> >>> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
> >> >>>   merged" patches can live until 1.2 is done;
> >> >>
> >> >> With 1.3 due for release in just over a week, it seems unlikely
> >> >> that it's worth branching at this point...
> >> > 
> >> > Well, the closer to the release, the smaller the cost of branching as we
> >> > won't have many patches entering the 1.2 branch, anyway.
> >> 
> >> The idea behind the new release model is to never branch for releases,
> >> so that we can easily bisect between v1.2 and v1.3, both tags being on
> >> the same branch. So I don't think a 1.2 branch is likely.
> >
> > That means that every branch to be merged after 1.2 has to be rebased on
> > top of 1.2 before being merged?
> 
> I'd prefer not to do next trees unless it's for a clear subsystem that
> already exists and will continue to exist.
> 
> If someone wants to be a CPU subsystem maintainer, that's great, and we
> can keep the tree open regardless of the release.  But just having a
> temporary tree for 3 weeks is more pain than it's worth.

How exactly this would cause pain? I am already maintaining a branch for
myself with a huge list of patches, to be able to continue working on
things I want to send to 1.3.

The difference is that in addition to that, I am willing to gather the
patches that seem to be "ready to go" on a more stable branch, and send
them as a single pull request (or even a plain patch series by mail) to
the list once 1.2 is out.

-- 
Eduardo



Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Anthony Liguori
Eduardo Habkost  writes:

> On Tue, Aug 28, 2012 at 07:59:47PM +0200, Andreas Färber wrote:
>> Am 28.08.2012 16:27, schrieb Eduardo Habkost:
>> > On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
>> >> On 28 August 2012 14:30, Eduardo Habkost  wrote:
>> >>> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
>> >>>   merged" patches can live until 1.2 is done;
>> >>
>> >> With 1.3 due for release in just over a week, it seems unlikely
>> >> that it's worth branching at this point...
>> > 
>> > Well, the closer to the release, the smaller the cost of branching as we
>> > won't have many patches entering the 1.2 branch, anyway.
>> 
>> The idea behind the new release model is to never branch for releases,
>> so that we can easily bisect between v1.2 and v1.3, both tags being on
>> the same branch. So I don't think a 1.2 branch is likely.
>
> That means that every branch to be merged after 1.2 has to be rebased on
> top of 1.2 before being merged?

I'd prefer not to do next trees unless it's for a clear subsystem that
already exists and will continue to exist.

If someone wants to be a CPU subsystem maintainer, that's great, and we
can keep the tree open regardless of the release.  But just having a
temporary tree for 3 weeks is more pain than it's worth.

Regards,

Anthony Liguori

>
> -- 
> Eduardo



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Anthony Liguori
"Michael S. Tsirkin"  writes:

> In preparation for adding PV EOI support, disable PV EOI by default for
> 1.1 and older machine types, to avoid CPUID changing during migration.
>
> PV EOI can still be enabled/disabled by specifying it explicitly.
> Enable for 1.1
> -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> Disable for 1.2
> -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/Makefile.objs  |  2 +-
>  hw/cpu_flags.c| 32 
>  hw/cpu_flags.h|  9 +
>  hw/pc_piix.c  |  2 ++
>  target-i386/cpu.c |  8 
>  5 files changed, 52 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cpu_flags.c
>  create mode 100644 hw/cpu_flags.h
>
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index 850b87b..3f2532a 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -1,5 +1,5 @@
>  hw-obj-y = usb/ ide/
> -hw-obj-y += loader.o
> +hw-obj-y += loader.o cpu_flags.o
>  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
>  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
>  hw-obj-y += fw_cfg.o
> diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> new file mode 100644
> index 000..2422d20
> --- /dev/null
> +++ b/hw/cpu_flags.c
> @@ -0,0 +1,32 @@
> +/*
> + * CPU compatibility flags.
> + *
> + * Copyright (c) 2012 Red Hat Inc.
> + * Author: Michael S. Tsirkin.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +#include "hw/cpu_flags.h"
> +
> +static bool __kvm_pv_eoi_disabled;
> +
> +void disable_kvm_pv_eoi(void)
> +{
> +   __kvm_pv_eoi_disabled = true;
> +}
> +
> +bool kvm_pv_eoi_disabled(void)
> +{
> +   return __kvm_pv_eoi_disabled;
> +}

Would this make more sense as a CPU flag or at least the KVM PV LAPIC?

We really ought to embed the LAPIC in the CPU objects.  Then it would
all make a bit more sense conceptionally.  But I definitely thing a CPU
feature flag makes the most sense here.

Then we can handle it via globals once we make CPU's qdev devices.

Regards,

Anthony Liguori

> diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> new file mode 100644
> index 000..05777b6
> --- /dev/null
> +++ b/hw/cpu_flags.h
> @@ -0,0 +1,9 @@
> +#ifndef HW_CPU_FLAGS_H
> +#define HW_CPU_FLAGS_H
> +
> +#include 
> +
> +void disable_kvm_pv_eoi(void);
> +bool kvm_pv_eoi_disabled(void);
> +
> +#endif
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 008d42f..bdbceda 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -46,6 +46,7 @@
>  #ifdef CONFIG_XEN
>  #  include 
>  #endif
> +#include "cpu_flags.h"
>  
>  #define MAX_IDE_BUS 2
>  
> @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
>  
>  static void pc_machine_v1_1_compat(void)
>  {
> +disable_kvm_pv_eoi();
>  }
>  
>  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 120a2e3..0d02fd1 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -23,6 +23,7 @@
>  
>  #include "cpu.h"
>  #include "kvm.h"
> +#include "asm/kvm_para.h"
>  
>  #include "qemu-option.h"
>  #include "qemu-config.h"
> @@ -33,6 +34,7 @@
>  #include "hyperv.h"
>  
>  #include "hw/hw.h"
> +#include "hw/cpu_flags.h"
>  
>  /* feature flags taken from "Intel Processor Identification and the CPUID
>   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
> const char *cpu_model)
>  
>  plus_kvm_features = ~0; /* not supported bits will be filtered out later 
> */
>  
> +/* Disable PV EOI for old machine types.
> + * Feature flags can still override. */
> +if (kvm_pv_eoi_disabled()) {
> +plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> +}
> +
>  add_flagname_to_bitmaps("hypervisor", &plus_features,
>  &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
>  &plus_kvm_features, &plus_svm_features);
> -- 
> MST



Re: [Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Eduardo Habkost
On Tue, Aug 28, 2012 at 08:43:52PM +0300, Michael S. Tsirkin wrote:
> In preparation for adding PV EOI support, disable PV EOI by default for
> 1.1 and older machine types, to avoid CPUID changing during migration.
> 
> PV EOI can still be enabled/disabled by specifying it explicitly.
> Enable for 1.1
> -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> Disable for 1.2
> -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> 

What about users that are already running "qemu-1.1 -M pc-1.1" on a host
kernel that supports PV EOI already? They would get PV EOI disabled when
migrating to a destination running "qemu-1.2 -M pc-1.1".

(On the other hand, people running "qemu-1.1 -M pc-1.1" on a host kernel
supporting PV EOI already have migration broken, so there's not much we
can do for them)

While we don't make the KVM feature-bit handling sane (with defaults
that are not blindly derived from the host kernel capabilities), maybe
the safest bet is to expect users to not migrate between hosts running
kernels with different KVM capabilities? (I am not sure which option is
better)


> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/Makefile.objs  |  2 +-
>  hw/cpu_flags.c| 32 
>  hw/cpu_flags.h|  9 +
>  hw/pc_piix.c  |  2 ++
>  target-i386/cpu.c |  8 
>  5 files changed, 52 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cpu_flags.c
>  create mode 100644 hw/cpu_flags.h
> 
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index 850b87b..3f2532a 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -1,5 +1,5 @@
>  hw-obj-y = usb/ ide/
> -hw-obj-y += loader.o
> +hw-obj-y += loader.o cpu_flags.o
>  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
>  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
>  hw-obj-y += fw_cfg.o
> diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> new file mode 100644
> index 000..7a633c0
> --- /dev/null
> +++ b/hw/cpu_flags.c
> @@ -0,0 +1,32 @@
> +/*
> + * CPU compatibility flags.
> + *
> + * Copyright (c) 2012 Red Hat Inc.
> + * Author: Michael S. Tsirkin.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +#include "hw/cpu_flags.h"
> +
> +static bool kvm_pv_eoi_disabled_state;
> +
> +void disable_kvm_pv_eoi(void)
> +{
> +   kvm_pv_eoi_disabled_state = true;
> +}
> +
> +bool kvm_pv_eoi_disabled(void)
> +{
> +   return kvm_pv_eoi_disabled_state;
> +}
> diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> new file mode 100644
> index 000..05777b6
> --- /dev/null
> +++ b/hw/cpu_flags.h
> @@ -0,0 +1,9 @@
> +#ifndef HW_CPU_FLAGS_H
> +#define HW_CPU_FLAGS_H
> +
> +#include 
> +
> +void disable_kvm_pv_eoi(void);
> +bool kvm_pv_eoi_disabled(void);
> +
> +#endif
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 008d42f..bdbceda 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -46,6 +46,7 @@
>  #ifdef CONFIG_XEN
>  #  include 
>  #endif
> +#include "cpu_flags.h"
>  
>  #define MAX_IDE_BUS 2
>  
> @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
>  
>  static void pc_machine_v1_1_compat(void)
>  {
> +disable_kvm_pv_eoi();
>  }
>  
>  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 120a2e3..0d02fd1 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -23,6 +23,7 @@
>  
>  #include "cpu.h"
>  #include "kvm.h"
> +#include "asm/kvm_para.h"
>  
>  #include "qemu-option.h"
>  #include "qemu-config.h"
> @@ -33,6 +34,7 @@
>  #include "hyperv.h"
>  
>  #include "hw/hw.h"
> +#include "hw/cpu_flags.h"
>  
>  /* feature flags taken from "Intel Processor Identification and the CPUID
>   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
> const char *cpu_model)
>  
>  plus_kvm_features = ~0; /* not supported bits will be filtered out later 
> */
>  
> +/* Disable PV EOI for old machine types.
> + * Feature flags can still override. */
> +if (kvm_pv_eoi_disabled()) {
> +plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> +}
> +
>  add_flagname_to_bitmaps("hypervisor", &plus_features,
>  &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
>  &plus_kvm_features, &plus_svm_features);
> -- 
> MST
> 
> 

-- 
Eduardo



Re: [Qemu-devel] [PATCH v2 for-1.2 0/2] stream: complete early if end of backing file is reached

2012-08-28 Thread Kevin Wolf
Am 28.08.2012 16:26, schrieb Stefan Hajnoczi:
> Image streaming hangs if the backing image is smaller than the image file.  
> The
> problem is that the image streaming loop makes no progress when
> bdrv_co_is_allocated() returns 0 with pnum=0.  More details in the actual
> patch.
> 
> I have also included a qemu-iotest to check this scenario.  It hangs when run
> against qemu.git/master and passes when the patch is applied.
> 
> Stefan Hajnoczi (2):
>   stream: complete early if end of backing file is reached
>   qemu-iotests: add backing file smaller than image test case
> 
>  block/stream.c |6 ++
>  tests/qemu-iotests/030 |   33 +
>  tests/qemu-iotests/030.out |4 ++--
>  3 files changed, 41 insertions(+), 2 deletions(-)
> 

Thanks, applied all to the block branch.

Kevin



Re: [Qemu-devel] [PATCHv2 2/4] pc: refactor compat code

2012-08-28 Thread Marcelo Tosatti
On Tue, Aug 28, 2012 at 07:31:24PM +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 28, 2012 at 01:23:18PM -0300, Marcelo Tosatti wrote:
> > On Mon, Aug 27, 2012 at 03:20:13PM +0300, Michael S. Tsirkin wrote:
> > > In preparation to adding PV EOI migration for 1.2,
> > > trivially refactor some some compat code
> > > to make it easier to add version specific
> > > cpuid tweaks.
> > > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >  hw/pc_piix.c | 44 
> > >  1 file changed, 36 insertions(+), 8 deletions(-)
> > 
> > Why can't you clear the cpuid bit in kvm paravirt leaf at hw/pc_piix.c, 
> > leaving compat code isolated there?
> 
> This is not how we handle it for compat properties:
> there we set flag in pc_piix and make devices look at
> that flag.
> And it makes sense because what you suggest does not scale: we can not
> teach pc_piix about quirks of all hardware.

What is there to scale? Old machines are static, they don't change. This
way you are putting the burden of compat support in the generic driver
code, which has better things to worry about than old machine types.

IMO it is better to have all of the compat mess localized.

> It will also scale better if we ever get interested about
> compatibility and migration for non pc machines.
> -- 
> MST

Alright, will wait a couple of days before merging.




Re: [Qemu-devel] PPC heathrow broken - update OpenBIOS to r1063?

2012-08-28 Thread Aurelien Jarno
On Mon, Aug 27, 2012 at 05:59:26PM -0700, Alexander Graf wrote:
> 
> 
> On 27.08.2012, at 14:13, Aurelien Jarno  wrote:
> 
> > Hi,
> > 
> > As you probably know, the PPC machines with a heathrow controller is
> > broken following commit 9e56edcf ("vga: raise default vgamem size").
> > The PCI hole space is not big enough for such a new default size.
> > 
> > Alexander has fixed it in OpenBIOS r1063, while the current version in
> > QEMU is r1062. It would be nice if OpenBIOS PPC could be updated in git.
> > Do we want to keep the SPARC version in sync? Should I do it?
> 
> Sorry, yes. Source for SPARC didn't change in that commit. Could you please 
> just push the openbios patch and your altivec patch there?
> 

Thanks, just did it.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] [PATCHv4 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
In preparation for adding PV EOI support, disable PV EOI by default for
1.1 and older machine types, to avoid CPUID changing during migration.

PV EOI can still be enabled/disabled by specifying it explicitly.
Enable for 1.1
-M pc-1.1 -cpu kvm64,+kvm_pv_eoi
Disable for 1.2
-M pc-1.2 -cpu kvm64,-kvm_pv_eoi

Signed-off-by: Michael S. Tsirkin 
---
 hw/Makefile.objs  |  2 +-
 hw/cpu_flags.c| 32 
 hw/cpu_flags.h|  9 +
 hw/pc_piix.c  |  2 ++
 target-i386/cpu.c |  8 
 5 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 hw/cpu_flags.c
 create mode 100644 hw/cpu_flags.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 850b87b..3f2532a 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -1,5 +1,5 @@
 hw-obj-y = usb/ ide/
-hw-obj-y += loader.o
+hw-obj-y += loader.o cpu_flags.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 hw-obj-y += fw_cfg.o
diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
new file mode 100644
index 000..7a633c0
--- /dev/null
+++ b/hw/cpu_flags.c
@@ -0,0 +1,32 @@
+/*
+ * CPU compatibility flags.
+ *
+ * Copyright (c) 2012 Red Hat Inc.
+ * Author: Michael S. Tsirkin.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+#include "hw/cpu_flags.h"
+
+static bool kvm_pv_eoi_disabled_state;
+
+void disable_kvm_pv_eoi(void)
+{
+   kvm_pv_eoi_disabled_state = true;
+}
+
+bool kvm_pv_eoi_disabled(void)
+{
+   return kvm_pv_eoi_disabled_state;
+}
diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
new file mode 100644
index 000..05777b6
--- /dev/null
+++ b/hw/cpu_flags.h
@@ -0,0 +1,9 @@
+#ifndef HW_CPU_FLAGS_H
+#define HW_CPU_FLAGS_H
+
+#include 
+
+void disable_kvm_pv_eoi(void);
+bool kvm_pv_eoi_disabled(void);
+
+#endif
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 008d42f..bdbceda 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -46,6 +46,7 @@
 #ifdef CONFIG_XEN
 #  include 
 #endif
+#include "cpu_flags.h"
 
 #define MAX_IDE_BUS 2
 
@@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
 
 static void pc_machine_v1_1_compat(void)
 {
+disable_kvm_pv_eoi();
 }
 
 static void pc_init_pci_v1_1(ram_addr_t ram_size,
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 120a2e3..0d02fd1 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -23,6 +23,7 @@
 
 #include "cpu.h"
 #include "kvm.h"
+#include "asm/kvm_para.h"
 
 #include "qemu-option.h"
 #include "qemu-config.h"
@@ -33,6 +34,7 @@
 #include "hyperv.h"
 
 #include "hw/hw.h"
+#include "hw/cpu_flags.h"
 
 /* feature flags taken from "Intel Processor Identification and the CPUID
  * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
@@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 
 plus_kvm_features = ~0; /* not supported bits will be filtered out later */
 
+/* Disable PV EOI for old machine types.
+ * Feature flags can still override. */
+if (kvm_pv_eoi_disabled()) {
+plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
+}
+
 add_flagname_to_bitmaps("hypervisor", &plus_features,
 &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
 &plus_kvm_features, &plus_svm_features);
-- 
MST




Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Eduardo Habkost
On Tue, Aug 28, 2012 at 07:59:47PM +0200, Andreas Färber wrote:
> Am 28.08.2012 16:27, schrieb Eduardo Habkost:
> > On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
> >> On 28 August 2012 14:30, Eduardo Habkost  wrote:
> >>> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
> >>>   merged" patches can live until 1.2 is done;
> >>
> >> With 1.3 due for release in just over a week, it seems unlikely
> >> that it's worth branching at this point...
> > 
> > Well, the closer to the release, the smaller the cost of branching as we
> > won't have many patches entering the 1.2 branch, anyway.
> 
> The idea behind the new release model is to never branch for releases,
> so that we can easily bisect between v1.2 and v1.3, both tags being on
> the same branch. So I don't think a 1.2 branch is likely.

That means that every branch to be merged after 1.2 has to be rebased on
top of 1.2 before being merged?

-- 
Eduardo



[Qemu-devel] [PATCHv4 4/4] kvm: get/set PV EOI MSR

2012-08-28 Thread Michael S. Tsirkin
Support get/set of new PV EOI MSR, for migration.
Add an optional section for MSR value - send it
out in case MSR was changed from the default value (0).

Signed-off-by: Michael S. Tsirkin 
---
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 13 +
 target-i386/machine.c | 21 +
 3 files changed, 35 insertions(+)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index aabf993..3c57d8b 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -699,6 +699,7 @@ typedef struct CPUX86State {
 uint64_t system_time_msr;
 uint64_t wall_clock_msr;
 uint64_t async_pf_en_msr;
+uint64_t pv_eoi_en_msr;
 
 uint64_t tsc;
 uint64_t tsc_deadline;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5e2d4f5..6790180 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -64,6 +64,7 @@ static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_tsc_deadline;
 static bool has_msr_async_pf_en;
+static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
 static int lm_capable_kernel;
 
@@ -456,6 +457,8 @@ int kvm_arch_init_vcpu(CPUX86State *env)
 
 has_msr_async_pf_en = c->eax & (1 << KVM_FEATURE_ASYNC_PF);
 
+has_msr_pv_eoi_en = c->eax & (1 << KVM_FEATURE_PV_EOI);
+
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
 for (i = 0; i <= limit; i++) {
@@ -1018,6 +1021,10 @@ static int kvm_put_msrs(CPUX86State *env, int level)
 kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN,
   env->async_pf_en_msr);
 }
+if (has_msr_pv_eoi_en) {
+kvm_msr_entry_set(&msrs[n++], MSR_KVM_PV_EOI_EN,
+  env->pv_eoi_en_msr);
+}
 if (hyperv_hypercall_available()) {
 kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0);
 kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0);
@@ -1260,6 +1267,9 @@ static int kvm_get_msrs(CPUX86State *env)
 if (has_msr_async_pf_en) {
 msrs[n++].index = MSR_KVM_ASYNC_PF_EN;
 }
+if (has_msr_pv_eoi_en) {
+msrs[n++].index = MSR_KVM_PV_EOI_EN;
+}
 
 if (env->mcg_cap) {
 msrs[n++].index = MSR_MCG_STATUS;
@@ -1339,6 +1349,9 @@ static int kvm_get_msrs(CPUX86State *env)
 case MSR_KVM_ASYNC_PF_EN:
 env->async_pf_en_msr = msrs[i].data;
 break;
+case MSR_KVM_PV_EOI_EN:
+env->pv_eoi_en_msr = msrs[i].data;
+break;
 }
 }
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index a8be058..4771508 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -279,6 +279,13 @@ static bool async_pf_msr_needed(void *opaque)
 return cpu->async_pf_en_msr != 0;
 }
 
+static bool pv_eoi_msr_needed(void *opaque)
+{
+CPUX86State *cpu = opaque;
+
+return cpu->pv_eoi_en_msr != 0;
+}
+
 static const VMStateDescription vmstate_async_pf_msr = {
 .name = "cpu/async_pf_msr",
 .version_id = 1,
@@ -290,6 +297,17 @@ static const VMStateDescription vmstate_async_pf_msr = {
 }
 };
 
+static const VMStateDescription vmstate_pv_eoi_msr = {
+.name = "cpu/async_pv_eoi_msr",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField []) {
+VMSTATE_UINT64(pv_eoi_en_msr, CPUX86State),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static bool fpop_ip_dp_needed(void *opaque)
 {
 CPUX86State *env = opaque;
@@ -454,6 +472,9 @@ static const VMStateDescription vmstate_cpu = {
 .vmsd = &vmstate_async_pf_msr,
 .needed = async_pf_msr_needed,
 } , {
+.vmsd = &vmstate_pv_eoi_msr,
+.needed = pv_eoi_msr_needed,
+} , {
 .vmsd = &vmstate_fpop_ip_dp,
 .needed = fpop_ip_dp_needed,
 }, {
-- 
MST



Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Andreas Färber
Am 28.08.2012 16:27, schrieb Eduardo Habkost:
> On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
>> On 28 August 2012 14:30, Eduardo Habkost  wrote:
>>> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
>>>   merged" patches can live until 1.2 is done;
>>
>> With 1.3 due for release in just over a week, it seems unlikely
>> that it's worth branching at this point...
> 
> Well, the closer to the release, the smaller the cost of branching as we
> won't have many patches entering the 1.2 branch, anyway.

The idea behind the new release model is to never branch for releases,
so that we can easily bisect between v1.2 and v1.3, both tags being on
the same branch. So I don't think a 1.2 branch is likely.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 06:46:53PM +0100, Peter Maydell wrote:
> On 28 August 2012 18:32, Michael S. Tsirkin  wrote:
> > What about _t in POSIX? That seems fairly safe if name is long and qemu
> > specific enough.
> 
> Depends what you mean by "safe". The spec says "don't use this";
> it isn't any different to the __ and _[A-Z] prohibitions in that
> respect.

It's different: unlike C compiler POSIX can not mangle
names in an application. So it will not internally
create qemu_foo_bar_t from qemu_foo_bar, and thus
qemu_foo_bar_t is very unlikely to create conflicts.

OTOH compiler people do mysterious and strange things
to the point where you do not want to mess with them
if you can.

> Other posix namespace landgrabs you may not have expected:
>  * ctype.h reserves "is[a-z]" and to[a-z]" prefixes
>  * string.h takes "str[a-z]" and "mem[a-z]" prefixes
> 
> (and qemu-common.h includes both string.h and ctype.h so this
> effectively applies to all of qemu).
> 
> I'm in two minds about these, because on the one hand they're
> reserved but on the other hand abiding by the rules makes things
> uglier (whereas "use one underscore not two" doesn't produce
> an uglier name IMHO).
> 
> (posix also if I'm reading it correctly reserves all of 'prefix
> underscore', which is annoying.)
> 
> -- PMM

Exactly my points.

-- 
MST



[Qemu-devel] [PATCHv4 2/4] pc: refactor compat code

2012-08-28 Thread Michael S. Tsirkin
In preparation to adding PV EOI migration for 1.2,
trivially refactor some some compat code
to make it easier to add version specific
cpuid tweaks.

Signed-off-by: Michael S. Tsirkin 
---
 hw/pc_piix.c | 44 
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index a771d79..008d42f 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -369,6 +369,22 @@ static QEMUMachine pc_machine_v1_2 = {
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 };
 
+static void pc_machine_v1_1_compat(void)
+{
+}
+
+static void pc_init_pci_v1_1(ram_addr_t ram_size,
+ const char *boot_device,
+ const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename,
+ const char *cpu_model)
+{
+pc_machine_v1_1_compat();
+pc_init_pci(ram_size, boot_device, kernel_filename,
+kernel_cmdline, initrd_filename, cpu_model);
+}
+
 #define PC_COMPAT_1_1 \
 {\
 .driver   = "virtio-scsi-pci",\
@@ -403,7 +419,7 @@ static QEMUMachine pc_machine_v1_2 = {
 static QEMUMachine pc_machine_v1_1 = {
 .name = "pc-1.1",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -439,7 +455,7 @@ static QEMUMachine pc_machine_v1_1 = {
 static QEMUMachine pc_machine_v1_0 = {
 .name = "pc-1.0",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -455,7 +471,7 @@ static QEMUMachine pc_machine_v1_0 = {
 static QEMUMachine pc_machine_v0_15 = {
 .name = "pc-0.15",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -488,7 +504,7 @@ static QEMUMachine pc_machine_v0_15 = {
 static QEMUMachine pc_machine_v0_14 = {
 .name = "pc-0.14",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -519,10 +535,22 @@ static QEMUMachine pc_machine_v0_14 = {
 .value= stringify(1),\
 }
 
+static void pc_init_pci_v0_13(ram_addr_t ram_size,
+ const char *boot_device,
+ const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename,
+ const char *cpu_model)
+{
+pc_machine_v1_1_compat();
+pc_init_pci_no_kvmclock(ram_size, boot_device, kernel_filename,
+kernel_cmdline, initrd_filename, cpu_model);
+}
+
 static QEMUMachine pc_machine_v0_13 = {
 .name = "pc-0.13",
 .desc = "Standard PC",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -560,7 +588,7 @@ static QEMUMachine pc_machine_v0_13 = {
 static QEMUMachine pc_machine_v0_12 = {
 .name = "pc-0.12",
 .desc = "Standard PC",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -594,7 +622,7 @@ static QEMUMachine pc_machine_v0_12 = {
 static QEMUMachine pc_machine_v0_11 = {
 .name = "pc-0.11",
 .desc = "Standard PC, qemu 0.11",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -616,7 +644,7 @@ static QEMUMachine pc_machine_v0_11 = {
 static QEMUMachine pc_machine_v0_10 = {
 .name = "pc-0.10",
 .desc = "Standard PC, qemu 0.10",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
-- 
MST




Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Peter Maydell
On 28 August 2012 18:32, Michael S. Tsirkin  wrote:
> What about _t in POSIX? That seems fairly safe if name is long and qemu
> specific enough.

Depends what you mean by "safe". The spec says "don't use this";
it isn't any different to the __ and _[A-Z] prohibitions in that
respect.

Other posix namespace landgrabs you may not have expected:
 * ctype.h reserves "is[a-z]" and to[a-z]" prefixes
 * string.h takes "str[a-z]" and "mem[a-z]" prefixes

(and qemu-common.h includes both string.h and ctype.h so this
effectively applies to all of qemu).

I'm in two minds about these, because on the one hand they're
reserved but on the other hand abiding by the rules makes things
uglier (whereas "use one underscore not two" doesn't produce
an uglier name IMHO).

(posix also if I'm reading it correctly reserves all of 'prefix
underscore', which is annoying.)

-- PMM



[Qemu-devel] [PATCHv4 1/4] linux-headers: update to 3.6-rc3

2012-08-28 Thread Michael S. Tsirkin
Update linux-headers to version present in Linux 3.6-rc3.
Header asm-x96_64/kvm_para.h update is needed for the new PV EOI
feature.

Signed-off-by: Michael S. Tsirkin 
---
 linux-headers/asm-s390/kvm.h  | 2 +-
 linux-headers/asm-s390/kvm_para.h | 2 +-
 linux-headers/asm-x86/kvm.h   | 1 +
 linux-headers/asm-x86/kvm_para.h  | 7 +++
 linux-headers/linux/kvm.h | 3 +++
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index bdcbe0f..d25da59 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -1,7 +1,7 @@
 #ifndef __LINUX_KVM_S390_H
 #define __LINUX_KVM_S390_H
 /*
- * asm-s390/kvm.h - KVM s390 specific structures and definitions
+ * KVM s390 specific structures and definitions
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-s390/kvm_para.h 
b/linux-headers/asm-s390/kvm_para.h
index 8e2dd67..870051f 100644
--- a/linux-headers/asm-s390/kvm_para.h
+++ b/linux-headers/asm-s390/kvm_para.h
@@ -1,5 +1,5 @@
 /*
- * asm-s390/kvm_para.h - definition for paravirtual devices on s390
+ * definition for paravirtual devices on s390
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index e7d1c19..246617e 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -12,6 +12,7 @@
 /* Select x86 specific features in  */
 #define __KVM_HAVE_PIT
 #define __KVM_HAVE_IOAPIC
+#define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_DEVICE_ASSIGNMENT
 #define __KVM_HAVE_MSI
 #define __KVM_HAVE_USER_NMI
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index f2ac46a..a1c3d72 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -22,6 +22,7 @@
 #define KVM_FEATURE_CLOCKSOURCE23
 #define KVM_FEATURE_ASYNC_PF   4
 #define KVM_FEATURE_STEAL_TIME 5
+#define KVM_FEATURE_PV_EOI 6
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -37,6 +38,7 @@
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
+#define MSR_KVM_PV_EOI_EN  0x4b564d04
 
 struct kvm_steal_time {
__u64 steal;
@@ -89,5 +91,10 @@ struct kvm_vcpu_pv_apf_data {
__u32 enabled;
 };
 
+#define KVM_PV_EOI_BIT 0
+#define KVM_PV_EOI_MASK (0x1 << KVM_PV_EOI_BIT)
+#define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
+#define KVM_PV_EOI_DISABLED 0x0
+
 
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 5a9d4e3..4b9e575 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -617,6 +617,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_SIGNAL_MSI 77
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
+#define KVM_CAP_PPC_ALLOC_HTAB 80
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -828,6 +829,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_SIGNAL_MSI_IOW(KVMIO,  0xa5, struct kvm_msi)
 /* Available with KVM_CAP_PPC_GET_SMMU_INFO */
 #define KVM_PPC_GET_SMMU_INFO_IOR(KVMIO,  0xa6, struct kvm_ppc_smmu_info)
+/* Available with KVM_CAP_PPC_ALLOC_HTAB */
+#define KVM_PPC_ALLOCATE_HTAB_IOWR(KVMIO, 0xa7, __u32)
 
 /*
  * ioctls for vcpu fds
-- 
MST




[Qemu-devel] [PATCHv4 0/4] migrate PV EOI MSR

2012-08-28 Thread Michael S. Tsirkin
It turns out PV EOI gets disabled after migration -
until next guest reset.
This is because we are missing code to actually migrate it.
This patch fixes it up: it applies cleanly to qemu.git
as well as qemu-kvm.git, so I think it's cleaner
to apply it in qemu.git to keep diff to minimum.

Note: there's talk about adding infrastructure for
CPUID whitelisting which thinkably could be used
for migration compat support. I am guessing this won't be
1.2 material - when it's ready we can easily replace
a simple flag that this patchset adds with something else.

So this just adds minimal code to avoid regressing
cross-version migration.

Note: there's a kernel bug in linux 3.6-rc3 - apply
my patch 'kvm: fix KVM_GET_MSR for PV EOI' in order to
use this patchset on it.

Needed for 1.2.

Changes from v2 and v3:
Multiple tweaks in variable name to confirm to strict C99 (Blue Swirl)
Changes from v1:
Update all headers from 3.6-rc3 to keep them in sync (Jan)
Disable cpuid flag for qemu 1.2 and older (Orit)

Michael S. Tsirkin (4):
  linux-headers: update to 3.6-rc3
  pc: refactor compat code
  cpuid: disable pv eoi for 1.1 and older compat types
  kvm: get/set PV EOI MSR

 hw/Makefile.objs  |  2 +-
 hw/cpu_flags.c| 32 +++
 hw/cpu_flags.h|  9 
 hw/pc_piix.c  | 46 ---
 linux-headers/asm-s390/kvm.h  |  2 +-
 linux-headers/asm-s390/kvm_para.h |  2 +-
 linux-headers/asm-x86/kvm.h   |  1 +
 linux-headers/asm-x86/kvm_para.h  |  7 ++
 linux-headers/linux/kvm.h |  3 +++
 target-i386/cpu.c |  8 +++
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 13 +++
 target-i386/machine.c | 21 ++
 13 files changed, 136 insertions(+), 11 deletions(-)
 create mode 100644 hw/cpu_flags.c
 create mode 100644 hw/cpu_flags.h

-- 
MST



Re: [Qemu-devel] [PATCHv3 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 5:22 PM, Michael S. Tsirkin  wrote:
> On Tue, Aug 28, 2012 at 05:05:25PM +, Blue Swirl wrote:
>> > +static bool _kvm_pv_eoi_disabled;
>>
>> NACK. I find your lack of compliance disturbing.
>
> Compliance with what? Could you please add some
> motivation for the NACK?

You did not respect my review comments.

>
> --
> MST



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 5:28 PM, Michael S. Tsirkin  wrote:
> On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
>> On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
>> > On 27.08.2012 22:56, Blue Swirl wrote:
>> > []
>> >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
>> >>> +{
>> >>> +AssignedDevRegion *d = opaque;
>> >>> +uint8_t *in = d->u.r_virtbase + addr;
>> >>
>> >> Don't perform arithmetic with void pointers.
>> >
>> > There are a few places in common qemu code which does this for a very
>> > long time.  So I guess it is safe now.
>>
>> It's a non-standard GCC extension.
>
> So?  We use many other GCC extensions. grep for typeof.

Dependencies should not be introduced trivially. In this case, it's
pretty easy to avoid void pointer arithmetic as Jan's next version shows.

>
> Is there a work in progress to build GCC with visual studio?
> If yes what are the chances KVM device assignment
> will work on windows?

IIRC there was really a project to use KVM on Windows and another
project to build QEMU with MSVC.

>
> Look QEMU codebase is what it is. Unless you rework all existing
> code to confirm to your taste, I do not see why you NACK valid new code
> unless it confirms to same.

Yes, I'd be happy to fix the style with huge patches at once. But our
fearless leader does not agree, so we are stuck with the codebase
being what it is until it is fixed one step at a time.

>
>> >
>> > /mjt



Re: [Qemu-devel] [PATCHv3 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 05:28:22PM +, Blue Swirl wrote:
> On Tue, Aug 28, 2012 at 5:22 PM, Michael S. Tsirkin  wrote:
> > On Tue, Aug 28, 2012 at 05:05:25PM +, Blue Swirl wrote:
> >> > +static bool _kvm_pv_eoi_disabled;
> >>
> >> NACK. I find your lack of compliance disturbing.
> >
> > Compliance with what? Could you please add some
> > motivation for the NACK?
> 
> You did not respect my review comments.

I merely asking for motivation for the comments.
Anyway, malc explained it to me so I got it now.
Thanks for the review will post soon.

> >
> > --
> > MST



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 09:33:16PM +0400, malc wrote:
> On Tue, 28 Aug 2012, Peter Maydell wrote:
> 
> > On 28 August 2012 18:21, Michael S. Tsirkin  wrote:
> > > We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.
> > > And if it does happen then you run a simple script and fix
> > > this one instance.
> > 
> > Why not just use a name that doesn't use a double underscore
> > in the first place? The C standard specifically allows single
> > underscore + lowercase to give things other than the implementation
> > part of the underscore-namespace. In this case, "_kvm_pv_eoi"
> > would be OK.
> 
> No it wouldn't, _kvm_pv_eoi is a file scope identifier, and names 
> beginning with underscore are reserved in this context.

Looks like they are and I missed that.
Maybe we should add that to HACKING.

> > 
> > >> The tiny single benefit from violating the rules would be that you
> > >> could use a few additional possible classes of prefixes, in addition
> > >> to the infinite combinations already available.
> > >
> > > Benefit would be consistency with existing QEMU code
> > > which has both _t __  and _X, and consistency
> > > within HACKING itself.
> > 
> > HACKING and CODING_STYLE contain a number of rules which
> > the existing codebase doesn't fully conform to. The idea
> > is to incrementally improve consistency and correctness.
> > 
> > -- PMM
> > 
> 
> -- 
> mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 06:27:59PM +0100, Peter Maydell wrote:
> On 28 August 2012 18:21, Michael S. Tsirkin  wrote:
> > We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.
> > And if it does happen then you run a simple script and fix
> > this one instance.
> 
> Why not just use a name that doesn't use a double underscore
> in the first place? The C standard specifically allows single
> underscore + lowercase to give things other than the implementation
> part of the underscore-namespace. In this case, "_kvm_pv_eoi"
> would be OK.

BTW this is exactly what v2 of my patch did but Blue Swirl nacked this too.

> >> The tiny single benefit from violating the rules would be that you
> >> could use a few additional possible classes of prefixes, in addition
> >> to the infinite combinations already available.
> >
> > Benefit would be consistency with existing QEMU code
> > which has both _t __  and _X, and consistency
> > within HACKING itself.
> 
> HACKING and CODING_STYLE contain a number of rules which
> the existing codebase doesn't fully conform to. The idea
> is to incrementally improve consistency and correctness.
> 
> -- PMM

How about fixing HACKING itself? It recommends using ram_addr_t.





Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Andreas Färber
Am 28.08.2012 18:01, schrieb Michael S. Tsirkin:
> We copied HACKING from libvirt but it has some bogus stuff:
> neither underscore capital, double underscore, or underscore 't' suffixes
> are reserved in Posix/C: this appears to be based on misreading of the
> C standard. Using sane prefixes is enough to avoid conflicts.
> 
> These rules are also widely violated in our codebase,
> and it does not make sense to rework it all, apparently for
> no benefit.

It was generally not accepted to do Coding Style cleanups in the code
base so you will find lots of violations for any rule if you invest the
time to search. That's no reason to abolish any such rule.

Cc'ing malc, who has "violently" fought for this particular rule.

I also think that asking people not to introduce new ..._t typedefs is
different from asking to use appropriate existing typedefs, so HACKING
is not really contradicting itself in that aspect.

Regards,
Andreas

> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  HACKING | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/HACKING b/HACKING
> index 471cf1d..0a941fc 100644
> --- a/HACKING
> +++ b/HACKING
> @@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that is.
>  2.3. Typedefs
>  Typedefs are used to eliminate the redundant 'struct' keyword.
>  
> -2.4. Reserved namespaces in C and POSIX
> -Underscore capital, double underscore, and underscore 't' suffixes should be
> -avoided.
> -
>  3. Low level memory management
>  
>  Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread malc
On Tue, 28 Aug 2012, Peter Maydell wrote:

> On 28 August 2012 18:21, Michael S. Tsirkin  wrote:
> > We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.
> > And if it does happen then you run a simple script and fix
> > this one instance.
> 
> Why not just use a name that doesn't use a double underscore
> in the first place? The C standard specifically allows single
> underscore + lowercase to give things other than the implementation
> part of the underscore-namespace. In this case, "_kvm_pv_eoi"
> would be OK.

No it wouldn't, _kvm_pv_eoi is a file scope identifier, and names 
beginning with underscore are reserved in this context.

> 
> >> The tiny single benefit from violating the rules would be that you
> >> could use a few additional possible classes of prefixes, in addition
> >> to the infinite combinations already available.
> >
> > Benefit would be consistency with existing QEMU code
> > which has both _t __  and _X, and consistency
> > within HACKING itself.
> 
> HACKING and CODING_STYLE contain a number of rules which
> the existing codebase doesn't fully conform to. The idea
> is to incrementally improve consistency and correctness.
> 
> -- PMM
> 

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 06:23:38PM +0100, Peter Maydell wrote:
> On 28 August 2012 18:18, Michael S. Tsirkin  wrote:
> > On Tue, Aug 28, 2012 at 05:24:40PM +0100, Peter Maydell wrote:
> >>  C99 7.1.3
> >> reserves underscore capital and double underscore prefixes.
> >
> > This is taking it out of context - reserved means different
> > things in different parts of the spec.
> >
> > As far as I can see, 7.1.3 talks about what is permissible in headers:
> > naturally when a libc implementation adds stuff in a header it would
> > want to avoid breaking applications including this header. But
> > it does not talk about what is permissible in a legal program - to avoid
> > conflicts, you just need to use variables with reasonable names like
> > kvmXXX qemuXXX or virtioXXX.
> 
> 7.1.3 para 2 says "If the program declares or defines an identifier
> in a context in which it is reserved [...] the behavior is undefined."
> The rationale for 7.1.3 says that the underscore-cap and double
> underscore identifiers are "reserved for the implementor" and that
> "part of the name space of internal identifiers beginning with
> underscore is available to the user" which implies that the rest
> of that namespace is *not* available to the user.
> 
> I think that's fairly clear that we can't use these identifiers
> without entering the realm of undefined behavior.
> 

I'll have to re-read that, I missed it.
What about _t in POSIX? That seems fairly safe if name is long and qemu
specific enough.

-- 
MST



Re: [Qemu-devel] [Bug 1042388] Re: qemu: Unsupported syscall: 257 (timer_create)

2012-08-28 Thread Erik de Castro Lopo
Peter Maydell wrote:

> A couple of days for somebody who knows what they're doing and has
> a convenient test case.

Working on it.




[Qemu-devel] [Bug 1042654] [NEW] Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1

2012-08-28 Thread TC1988
Public bug reported:

When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is running on 
Qemu 1.2 rc, they are not recognized and the network is not working even though 
I set it correctly (especially the AMD PCnet adapter)
Here's some screenshot of the floppy error:
http://i49.tinypic.com/j77wcw.png

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1042654

Title:
  Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1

Status in QEMU:
  New

Bug description:
  When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is running on 
Qemu 1.2 rc, they are not recognized and the network is not working even though 
I set it correctly (especially the AMD PCnet adapter)
  Here's some screenshot of the floppy error:
  http://i49.tinypic.com/j77wcw.png

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1042654/+subscriptions



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread malc
On Tue, 28 Aug 2012, Michael S. Tsirkin wrote:

> On Tue, Aug 28, 2012 at 05:24:40PM +0100, Peter Maydell wrote:
> > On 28 August 2012 17:01, Michael S. Tsirkin  wrote:
> > > We copied HACKING from libvirt but it has some bogus stuff:
> > > neither underscore capital, double underscore, or underscore 't' suffixes
> > > are reserved in Posix/C: this appears to be based on misreading of the
> > > C standard. Using sane prefixes is enough to avoid conflicts.
> 
> > > -2.4. Reserved namespaces in C and POSIX
> > > -Underscore capital, double underscore, and underscore 't' suffixes 
> > > should be
> > > -avoided.
> > 
> > I think this is just a missing "prefixes". C99 7.1.3
> > reserves underscore capital and double underscore prefixes.
> 
> This is taking it out of context - reserved means different
> things in different parts of the spec.

Reserved means - reserved for implementation to use as it sees fit,
for instance - 6.2.5 

28) Implementation-defined keywords shall have the form of an identifier
reserved for any use as described in 7.1.3.

Future versions of C might define something in the form __xxx or _[A-Z]xxx 
precisely because those are invalid for user to define.. (_Imaginary, 
_Bool, whatever)

7.1.3#2

> 
> As far as I can see, 7.1.3 talks about what is permissible in headers: 

And you are wrong.

> naturally when a libc implementation adds stuff in a header it would 
> want to avoid breaking applications including this header. But it does 
> not talk about what is permissible in a legal program - to avoid 
> conflicts, you just need to use variables with reasonable names like 
> kvmXXX qemuXXX or virtioXXX.
> 
> So it does not look like this is a reason to not use __kvm_pv_eoi_disabled.
> 
> Chances that a libc header predefines this name are zero.
> 
> > POSIX reserves _t if any POSIX header is defined (POSIX.1-2008 section
> > 2.2.2, 
> > http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html)
> 
> Same thing really.
> So this says posix implementations can add types ending with _t in any
> header.  It does not prohibit applications from using such names
> in a reasonable manner, and it does not look like this is a reason to
> not use qemu_foo_bar_t.
> 
> 
> > I would suggest that the section is reworded to:
> > 
> > # Underscore capital and double underscore prefixes are reserved
> > # by the C standard. Underscore 't' suffixes are reserved by POSIX.
> > # They should be avoided in QEMU code.
> > 
> > -- PMM
> 
> This might be nice from language purism point of view
> but ignores the fact that QEMU already uses lots of these.
> Specifically a _t type is even mentioned in HACKING itself.
> 
> So the benefit to excluding this in new code is marginal.
> 
> 
> 

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Peter Maydell
On 28 August 2012 18:21, Michael S. Tsirkin  wrote:
> We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.
> And if it does happen then you run a simple script and fix
> this one instance.

Why not just use a name that doesn't use a double underscore
in the first place? The C standard specifically allows single
underscore + lowercase to give things other than the implementation
part of the underscore-namespace. In this case, "_kvm_pv_eoi"
would be OK.

>> The tiny single benefit from violating the rules would be that you
>> could use a few additional possible classes of prefixes, in addition
>> to the infinite combinations already available.
>
> Benefit would be consistency with existing QEMU code
> which has both _t __  and _X, and consistency
> within HACKING itself.

HACKING and CODING_STYLE contain a number of rules which
the existing codebase doesn't fully conform to. The idea
is to incrementally improve consistency and correctness.

-- PMM



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 05:01:55PM +, Blue Swirl wrote:
> On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
> > On 27.08.2012 22:56, Blue Swirl wrote:
> > []
> >>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
> >>> +{
> >>> +AssignedDevRegion *d = opaque;
> >>> +uint8_t *in = d->u.r_virtbase + addr;
> >>
> >> Don't perform arithmetic with void pointers.
> >
> > There are a few places in common qemu code which does this for a very
> > long time.  So I guess it is safe now.
> 
> It's a non-standard GCC extension.

So?  We use many other GCC extensions. grep for typeof.

Is there a work in progress to build GCC with visual studio?
If yes what are the chances KVM device assignment
will work on windows?

Look QEMU codebase is what it is. Unless you rework all existing
code to confirm to your taste, I do not see why you NACK valid new code
unless it confirms to same.

> >
> > /mjt



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 5:21 PM, Michael S. Tsirkin  wrote:
> On Tue, Aug 28, 2012 at 05:13:24PM +, Blue Swirl wrote:
>> On Tue, Aug 28, 2012 at 4:01 PM, Michael S. Tsirkin  wrote:
>> > We copied HACKING from libvirt but it has some bogus stuff:
>> > neither underscore capital, double underscore, or underscore 't' suffixes
>> > are reserved in Posix/C: this appears to be based on misreading of the
>> > C standard. Using sane prefixes is enough to avoid conflicts.
>> >
>> > These rules are also widely violated in our codebase,
>> > and it does not make sense to rework it all, apparently for
>> > no benefit.
>>
>> NACK. The benefit is improved standards compliance. One day we could
>> find QEMU being ported to environment which conflicts with these
>> symbols.
>
> We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.

How do you know for sure? The implementation could be that symbols are
converted by the compiler to use underscore prefixes.

> And if it does happen then you run a simple script and fix
> this one instance.

Then why not fix now? How about kvm_pv_eoi_state?

>
>> The tiny single benefit from violating the rules would be that you
>> could use a few additional possible classes of prefixes, in addition
>> to the infinite combinations already available.
>
> Benefit would be consistency with existing QEMU code
> which has both _t __  and _X, and consistency
> within HACKING itself.

There are plenty of bad things wrt style, but the plan is to get rid
of those eventually. It takes time since global reformatting etc. to
enforce the rules is not accepted.

>
>> >
>> > Signed-off-by: Michael S. Tsirkin 
>> > ---
>> >  HACKING | 4 
>> >  1 file changed, 4 deletions(-)
>> >
>> > diff --git a/HACKING b/HACKING
>> > index 471cf1d..0a941fc 100644
>> > --- a/HACKING
>> > +++ b/HACKING
>> > @@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that 
>> > is.
>> >  2.3. Typedefs
>> >  Typedefs are used to eliminate the redundant 'struct' keyword.
>> >
>> > -2.4. Reserved namespaces in C and POSIX
>> > -Underscore capital, double underscore, and underscore 't' suffixes should 
>> > be
>> > -avoided.
>> > -
>> >  3. Low level memory management
>> >
>> >  Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
>> > --
>> > MST



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Peter Maydell
On 28 August 2012 18:18, Michael S. Tsirkin  wrote:
> On Tue, Aug 28, 2012 at 05:24:40PM +0100, Peter Maydell wrote:
>>  C99 7.1.3
>> reserves underscore capital and double underscore prefixes.
>
> This is taking it out of context - reserved means different
> things in different parts of the spec.
>
> As far as I can see, 7.1.3 talks about what is permissible in headers:
> naturally when a libc implementation adds stuff in a header it would
> want to avoid breaking applications including this header. But
> it does not talk about what is permissible in a legal program - to avoid
> conflicts, you just need to use variables with reasonable names like
> kvmXXX qemuXXX or virtioXXX.

7.1.3 para 2 says "If the program declares or defines an identifier
in a context in which it is reserved [...] the behavior is undefined."
The rationale for 7.1.3 says that the underscore-cap and double
underscore identifiers are "reserved for the implementor" and that
"part of the name space of internal identifiers beginning with
underscore is available to the user" which implies that the rest
of that namespace is *not* available to the user.

I think that's fairly clear that we can't use these identifiers
without entering the realm of undefined behavior.

-- PMM



Re: [Qemu-devel] [PATCHv3 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 05:05:25PM +, Blue Swirl wrote:
> > +static bool _kvm_pv_eoi_disabled;
> 
> NACK. I find your lack of compliance disturbing.

Compliance with what? Could you please add some
motivation for the NACK?

-- 
MST



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 05:13:24PM +, Blue Swirl wrote:
> On Tue, Aug 28, 2012 at 4:01 PM, Michael S. Tsirkin  wrote:
> > We copied HACKING from libvirt but it has some bogus stuff:
> > neither underscore capital, double underscore, or underscore 't' suffixes
> > are reserved in Posix/C: this appears to be based on misreading of the
> > C standard. Using sane prefixes is enough to avoid conflicts.
> >
> > These rules are also widely violated in our codebase,
> > and it does not make sense to rework it all, apparently for
> > no benefit.
> 
> NACK. The benefit is improved standards compliance. One day we could
> find QEMU being ported to environment which conflicts with these
> symbols.

We are talking about stuff like __kvm_pv_eoi - so the chance is exactly 0.
And if it does happen then you run a simple script and fix
this one instance.

> The tiny single benefit from violating the rules would be that you
> could use a few additional possible classes of prefixes, in addition
> to the infinite combinations already available.

Benefit would be consistency with existing QEMU code
which has both _t __  and _X, and consistency
within HACKING itself.

> >
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  HACKING | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/HACKING b/HACKING
> > index 471cf1d..0a941fc 100644
> > --- a/HACKING
> > +++ b/HACKING
> > @@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that 
> > is.
> >  2.3. Typedefs
> >  Typedefs are used to eliminate the redundant 'struct' keyword.
> >
> > -2.4. Reserved namespaces in C and POSIX
> > -Underscore capital, double underscore, and underscore 't' suffixes should 
> > be
> > -avoided.
> > -
> >  3. Low level memory management
> >
> >  Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
> > --
> > MST



Re: [Qemu-devel] [RFC][PATCH v4 3/3] tcg: Optimize qemu_ld/st by generating slow paths at the end of a block

2012-08-28 Thread Andreas Färber
Am 27.08.2012 20:31, schrieb Peter Maydell:
> On 27 August 2012 08:23, Yeongkyoon Lee  wrote:
>> BTW, who will finally confirm my patches?
>> I have sent four version of my patches in which I have applied all the
>> reasonable feedbacks from this community.
> 
> If you'd like your patches committed you should not use the "[RFC]" tag
> in the Subject, because "RFC" means "I would like feedback on this
> patch but do not intend it to be committed to master".

Literally, RFC means request for comments.

Personally I differentiate between [RFC n/m] and [PATCH RFC n/m], where
the lack of PATCH means "don't commit this version" and the latter
indicating "I'm not so sure if this is how we want to do it, but if
people agree it can go in". ;)

Not sure how [RFC][PATCH n/m] is intended here? If everyone adds RFC to
a regular PATCH, it looses meaning. In the course of review when you
feel the patches are okay to be committed, RFC should disappear as you
may well get comments without asking for them anyway. :)

HTE,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 05:24:40PM +0100, Peter Maydell wrote:
> On 28 August 2012 17:01, Michael S. Tsirkin  wrote:
> > We copied HACKING from libvirt but it has some bogus stuff:
> > neither underscore capital, double underscore, or underscore 't' suffixes
> > are reserved in Posix/C: this appears to be based on misreading of the
> > C standard. Using sane prefixes is enough to avoid conflicts.

> > -2.4. Reserved namespaces in C and POSIX
> > -Underscore capital, double underscore, and underscore 't' suffixes should 
> > be
> > -avoided.
> 
> I think this is just a missing "prefixes". C99 7.1.3
> reserves underscore capital and double underscore prefixes.

This is taking it out of context - reserved means different
things in different parts of the spec.

As far as I can see, 7.1.3 talks about what is permissible in headers:
naturally when a libc implementation adds stuff in a header it would
want to avoid breaking applications including this header. But
it does not talk about what is permissible in a legal program - to avoid
conflicts, you just need to use variables with reasonable names like
kvmXXX qemuXXX or virtioXXX.

So it does not look like this is a reason to not use __kvm_pv_eoi_disabled.

Chances that a libc header predefines this name are zero.

> POSIX reserves _t if any POSIX header is defined (POSIX.1-2008 section
> 2.2.2, 
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html)

Same thing really.
So this says posix implementations can add types ending with _t in any
header.  It does not prohibit applications from using such names
in a reasonable manner, and it does not look like this is a reason to
not use qemu_foo_bar_t.


> I would suggest that the section is reworded to:
> 
> # Underscore capital and double underscore prefixes are reserved
> # by the C standard. Underscore 't' suffixes are reserved by POSIX.
> # They should be avoided in QEMU code.
> 
> -- PMM

This might be nice from language purism point of view
but ignores the fact that QEMU already uses lots of these.
Specifically a _t type is even mentioned in HACKING itself.

So the benefit to excluding this in new code is marginal.


-- 
MST




Re: [Qemu-devel] [PATCH v4] configure: properly check if -lrt and -lm is needed

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 7:33 AM, Natanael Copa  wrote:
> On Tue, 21 Aug 2012 18:12:05 +
> Blue Swirl  wrote:
>>
>> Now I get this on mingw32:
>> config-host.mak is out-of-date, running configure
>>
>> Error: librt check failed
>
> Any news on the v4 patch, which should fix this?

No change:
config-host.mak is out-of-date, running configure

Error: librt check failed

>
> Thanks!
>



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 4:01 PM, Michael S. Tsirkin  wrote:
> We copied HACKING from libvirt but it has some bogus stuff:
> neither underscore capital, double underscore, or underscore 't' suffixes
> are reserved in Posix/C: this appears to be based on misreading of the
> C standard. Using sane prefixes is enough to avoid conflicts.
>
> These rules are also widely violated in our codebase,
> and it does not make sense to rework it all, apparently for
> no benefit.

NACK. The benefit is improved standards compliance. One day we could
find QEMU being ported to environment which conflicts with these
symbols.

The tiny single benefit from violating the rules would be that you
could use a few additional possible classes of prefixes, in addition
to the infinite combinations already available.

>
> Signed-off-by: Michael S. Tsirkin 
> ---
>  HACKING | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/HACKING b/HACKING
> index 471cf1d..0a941fc 100644
> --- a/HACKING
> +++ b/HACKING
> @@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that is.
>  2.3. Typedefs
>  Typedefs are used to eliminate the redundant 'struct' keyword.
>
> -2.4. Reserved namespaces in C and POSIX
> -Underscore capital, double underscore, and underscore 't' suffixes should be
> -avoided.
> -
>  3. Low level memory management
>
>  Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
> --
> MST



Re: [Qemu-devel] Isuue assiging devices using VFIO on x86

2012-08-28 Thread Bhushan Bharat-R65777


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Tuesday, August 28, 2012 9:27 PM
> To: Bhushan Bharat-R65777
> Cc: k...@vger.kernel.org; Avi Kivity; qemu-devel@nongnu.org
> Subject: Re: Isuue assiging devices using VFIO on x86
> 
> On Tue, 2012-08-28 at 09:23 +, Bhushan Bharat-R65777 wrote:
> > Hi Alex,
> >
> > In my susyem I have following devices:
> >
> > I tried assigning a following PCI devices:
> > 00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI
> Controller (rev 03)
> > 00:03.2 IDE interface: Intel Corporation 4 Series Chipset PT IDER Controller
> (rev 03)
> > 00:03.3 Serial controller: Intel Corporation 4 Series Chipset Serial KT
> Controller (rev 03)
> >
> > and getting below error:
> >
> > ---
> > Command:
> > ---
> > qemu-system-x86_64 -enable-kvm  -nographic  -kernel 
> > /boot/vmlinuz-3.6.0-rc2+ -
> initrd /boot/initramfs-3.6.0-rc2+.img -append "root=/dev/sda1" -m 1024 -drive
> file=/home/kvmdev/debian_squeeze_amd64_standard.qcow2 -device vfio-
> pci,host=:00:03.0 -device vfio-pci,host=:00:03.2 -device vfio-
> pci,host=:00:03.3
> >
> > Error prints:
> > 
> > qemu-system-x86_64: -device vfio-pci,host=:00:03.0: Warning, device
> :00:03.0 does not support reset
> >
> > qemu-system-x86_64: -device vfio-pci,host=:00:03.0: VFIO :00:03.0 
> > BAR
> 0 is too small to mmap, this may affect performance.
> >
> > qemu-system-x86_64: -device vfio-pci,host=:00:03.2: Warning, device
> :00:03.2 does not support reset
> >
> > qemu-system-x86_64: -device vfio-pci,host=:00:03.3: Warning, device
> :00:03.3 does not support reset
> >
> > qemu: hardware error: register_ioport_read: invalid opaque for address 0x3f6
> > CPU #0:
> > EAX=8003 EBX=3ffe0e80 ECX=8003 EDX=0cfc
> > ESI=2800 EDI=80002804 EBP=6fc0 ESP=6f38
> > EIP=3ffec005 EFL=0086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> > ES =0010   00c09300 DPL=0 DS   [-WA]
> > CS =0008   00c09b00 DPL=0 CS32 [-RA]
> > SS =0010   00c09300 DPL=0 DS   [-WA]
> > DS =0010   00c09300 DPL=0 DS   [-WA]
> > FS =0010   00c09300 DPL=0 DS   [-WA]
> > GS =0010   00c09300 DPL=0 DS   [-WA]
> > LDT=   8200 DPL=0 LDT
> > TR =   8b00 DPL=0 TSS32-busy
> > GDT= 000fcd68 0037
> > IDT= 000fdb60 
> > CR0=0011 CR2= CR3= CR4=
> > DR0= DR1= DR2=
> DR3=
> > DR6=0ff0 DR7=0400
> > EFER=
> > FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
> > FPR0=  FPR1= 
> > FPR2=  FPR3= 
> > FPR4=  FPR5= 
> > FPR6=  FPR7= 
> > XMM00= 
> > XMM01=
> > XMM02= 
> > XMM03=
> > XMM04= 
> > XMM05=
> > XMM06= 
> > XMM07=
> > Aborted (core dumped)
> >
> > 
> >
> > Linux: http://github.com/awilliam/linux-vfio.git next
> > Qemu: https://github.com/awilliam/qemu-vfio.git  iommu-group-vfio
> >
> > Any idea what's is the issue.
> 
> Please try the vfio-pci-for-qemu-1.2-v3 qemu-vfio.git tag, I'm not even
> sure what's in that old iommu-group-vfio branch.  Thanks,

Can you share the command to configure qemu?

Thanks
-Bharat

> 
> Alex
> 



Re: [Qemu-devel] [PATCH V5 1/8] isa: add isa_address_space_io

2012-08-28 Thread Jan Kiszka
On 2012-08-28 17:42, Julien Grall wrote:
> On 08/24/2012 05:10 PM, Andreas Färber wrote:
>> Am 22.08.2012 14:27, schrieb Julien Grall:
>>
>>> This function permits to retrieve ISA IO address space.
>>> It will be usefull when we need to pass IO address space as argument.
>>>
>>> Signed-off-by: Julien Grall
>>> ---
>>>   hw/isa-bus.c |5 +
>>>   hw/isa.h |1 +
>>>   2 files changed, 6 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/hw/isa-bus.c b/hw/isa-bus.c
>>> index f9b2373..662c86b 100644
>>> --- a/hw/isa-bus.c
>>> +++ b/hw/isa-bus.c
>>> @@ -244,4 +244,9 @@ MemoryRegion *isa_address_space(ISADevice *dev)
>>>   return get_system_memory();
>>>   }
>>>
>>> +MemoryRegion *isa_address_space_io(ISADevice *dev)
>>> +{
>>> +return get_system_io();
>>> +}
>>>  
>> Unlike the address_space above, there's an address_space_io field in
>> ISABus, so I guess the implementation of this function should rather
>> obtain the device's BusState via isa_bus_from_device(dev) and return its
>> field rather than hardcoding get_system_io() here.
>>
> I use this function in hw/dma.c. For the moment, the code doesn't
> use ISA device, so I pass NULL to isa_address_space_io
> (See patch 6).

Yes, there are some old drivers in QEMU that doesn't follow QOM or even
qdev patterns. They may not work on systems Andreas is thinking of anyway.

But this is a generic service, so it should try harder: If a device is
given, use the io space of its bus. If not, fall back to isabus.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCHv3 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 1:22 PM, Michael S. Tsirkin  wrote:
> In preparation for adding PV EOI support, disable PV EOI by default for
> 1.1 and older machine types, to avoid CPUID changing during migration.
>
> PV EOI can still be enabled/disabled by specifying it explicitly.
> Enable for 1.1
> -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> Disable for 1.2
> -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/Makefile.objs  |  2 +-
>  hw/cpu_flags.c| 32 
>  hw/cpu_flags.h|  9 +
>  hw/pc_piix.c  |  2 ++
>  target-i386/cpu.c |  8 
>  5 files changed, 52 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cpu_flags.c
>  create mode 100644 hw/cpu_flags.h
>
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index 850b87b..3f2532a 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -1,5 +1,5 @@
>  hw-obj-y = usb/ ide/
> -hw-obj-y += loader.o
> +hw-obj-y += loader.o cpu_flags.o
>  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
>  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
>  hw-obj-y += fw_cfg.o
> diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> new file mode 100644
> index 000..d821d8c
> --- /dev/null
> +++ b/hw/cpu_flags.c
> @@ -0,0 +1,32 @@
> +/*
> + * CPU compatibility flags.
> + *
> + * Copyright (c) 2012 Red Hat Inc.
> + * Author: Michael S. Tsirkin.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +#include "hw/cpu_flags.h"
> +
> +static bool _kvm_pv_eoi_disabled;

NACK. I find your lack of compliance disturbing.

> +
> +void disable_kvm_pv_eoi(void)
> +{
> +   _kvm_pv_eoi_disabled = true;
> +}
> +
> +bool kvm_pv_eoi_disabled(void)
> +{
> +   return _kvm_pv_eoi_disabled;
> +}
> diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
> new file mode 100644
> index 000..05777b6
> --- /dev/null
> +++ b/hw/cpu_flags.h
> @@ -0,0 +1,9 @@
> +#ifndef HW_CPU_FLAGS_H
> +#define HW_CPU_FLAGS_H
> +
> +#include 
> +
> +void disable_kvm_pv_eoi(void);
> +bool kvm_pv_eoi_disabled(void);
> +
> +#endif
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 008d42f..bdbceda 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -46,6 +46,7 @@
>  #ifdef CONFIG_XEN
>  #  include 
>  #endif
> +#include "cpu_flags.h"
>
>  #define MAX_IDE_BUS 2
>
> @@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
>
>  static void pc_machine_v1_1_compat(void)
>  {
> +disable_kvm_pv_eoi();
>  }
>
>  static void pc_init_pci_v1_1(ram_addr_t ram_size,
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 120a2e3..0d02fd1 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -23,6 +23,7 @@
>
>  #include "cpu.h"
>  #include "kvm.h"
> +#include "asm/kvm_para.h"
>
>  #include "qemu-option.h"
>  #include "qemu-config.h"
> @@ -33,6 +34,7 @@
>  #include "hyperv.h"
>
>  #include "hw/hw.h"
> +#include "hw/cpu_flags.h"
>
>  /* feature flags taken from "Intel Processor Identification and the CPUID
>   * Instruction" and AMD's "CPUID Specification".  In cases of disagreement
> @@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
> const char *cpu_model)
>
>  plus_kvm_features = ~0; /* not supported bits will be filtered out later 
> */
>
> +/* Disable PV EOI for old machine types.
> + * Feature flags can still override. */
> +if (kvm_pv_eoi_disabled()) {
> +plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
> +}
> +
>  add_flagname_to_bitmaps("hypervisor", &plus_features,
>  &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
>  &plus_kvm_features, &plus_svm_features);
> --
> MST
>
>



Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 7:35 AM, Michael Tokarev  wrote:
> On 27.08.2012 22:56, Blue Swirl wrote:
> []
>>> +static uint32_t slow_bar_readb(void *opaque, target_phys_addr_t addr)
>>> +{
>>> +AssignedDevRegion *d = opaque;
>>> +uint8_t *in = d->u.r_virtbase + addr;
>>
>> Don't perform arithmetic with void pointers.
>
> There are a few places in common qemu code which does this for a very
> long time.  So I guess it is safe now.

It's a non-standard GCC extension.

>
> /mjt



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread malc
On Tue, 28 Aug 2012, Michael S. Tsirkin wrote:

> On Mon, Aug 27, 2012 at 07:40:56PM +, Blue Swirl wrote:
> > On Mon, Aug 27, 2012 at 7:24 PM, Michael S. Tsirkin  wrote:
> > > On Mon, Aug 27, 2012 at 07:12:27PM +, Blue Swirl wrote:
> > >> On Mon, Aug 27, 2012 at 7:06 PM, Michael S. Tsirkin  
> > >> wrote:
> > >> > On Mon, Aug 27, 2012 at 06:58:29PM +, Blue Swirl wrote:
> > >> >> On Mon, Aug 27, 2012 at 12:20 PM, Michael S. Tsirkin 
> > >> >>  wrote:
> > >> >> > In preparation for adding PV EOI support, disable PV EOI by default 
> > >> >> > for
> > >> >> > 1.1 and older machine types, to avoid CPUID changing during 
> > >> >> > migration.
> > >> >> >
> > >> >> > PV EOI can still be enabled/disabled by specifying it explicitly.
> > >> >> > Enable for 1.1
> > >> >> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> > >> >> > Disable for 1.2
> > >> >> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> > >> >> >
> > >> >> > Signed-off-by: Michael S. Tsirkin 
> > >> >> > ---
> > >> >> >  hw/Makefile.objs  |  2 +-
> > >> >> >  hw/cpu_flags.c| 32 
> > >> >> >  hw/cpu_flags.h|  9 +
> > >> >> >  hw/pc_piix.c  |  2 ++

[..snip..]

> > 
> > No leading underscores. They are not used in QEMU.
> 
> They are *widely* used in QEMU to mark internal
> stuff. E.g. parameters in many macros.
> 

ISO/IEC 9899:TC3 7.1.3#1

- All identifiers that begin with an underscore and either an
  uppercase letter or another underscore are always reserved for any use.

IOW no __ or _[A-Z] at all.

- All identifiers that begin with an underscore are always reserved
  for use as identifiers with file scope in both the ordinary and tag
  name spaces.

IOW _ as the name of an argument to a macro is (probably) okay.

> In reality __ is also widely used. I'm still mulling
> removing 2.4 from HACKING - it appears too draconian,
> the chances of a conflict with preprocessor are remote
> and if it triggers, it's trivial to catch.
> We also have lots of existing code violating this rule.
> 
> And the rule about _t suffix is just silly.

http://pubs.opengroup.org/onlinepubs/7908799/xns/namespace.html

[..snip..]

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [RFC][PATCH v4 3/3] tcg: Optimize qemu_ld/st by generating slow paths at the end of a block

2012-08-28 Thread Blue Swirl
On Tue, Aug 28, 2012 at 6:52 AM, Yeongkyoon Lee
 wrote:
>
>>> It's been a long time.
>>>
>>> I've tested the performances of one jump difference when fast qemu_ld/st
>>> (TLB hit).
>>> The result shows 3.6% CoreMark enhancement when reducing one jump where
>>> slow
>>> paths are generated at the end of block as same for the both cases.
>>> That means reducing one jump dominates the majority of performance
>>> enhancement from LDST_OPTIMIZATION.
>>> As a result, it needs extended MMU helper functions for attaining that
>>> performance rising, and those extended functions are used only
>>> implicitly.
>>>
>>> BTW, who will finally confirm my patches?
>>> I have sent four version of my patches in which I have applied all the
>>> reasonable feedbacks from this community.
>>> Currently, v4 is the final candidate though it might need merge with
>>> latest
>>> HEAD because it was sent 1 month before.
>>
>> I think the patches should be applied when 1.3 development opens.
>>
>
> Thanks for your reply.
> How do you estimate when 1.3 development is open?

2012-09-05 according to http://wiki.qemu.org/Planning/1.2



Re: [Qemu-devel] PPC heathrow broken - update OpenBIOS to r1063?

2012-08-28 Thread Blue Swirl
On Mon, Aug 27, 2012 at 9:13 PM, Aurelien Jarno  wrote:
> Hi,
>
> As you probably know, the PPC machines with a heathrow controller is
> broken following commit 9e56edcf ("vga: raise default vgamem size").
> The PCI hole space is not big enough for such a new default size.
>
> Alexander has fixed it in OpenBIOS r1063, while the current version in
> QEMU is r1062. It would be nice if OpenBIOS PPC could be updated in git.
> Do we want to keep the SPARC version in sync? Should I do it?

Keeping the images in sync would be nice considering also the git
submodule (however, it has been broken for some time).

>
> Regards,
> Aurelien
>
> --
> Aurelien Jarno  GPG: 1024D/F1BCDB73
> aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] Isuue assiging devices using VFIO on x86

2012-08-28 Thread Alex Williamson
On Tue, 2012-08-28 at 09:23 +, Bhushan Bharat-R65777 wrote:
> Hi Alex,
> 
> In my susyem I have following devices:
> 
> I tried assigning a following PCI devices:
> 00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI 
> Controller (rev 03)
> 00:03.2 IDE interface: Intel Corporation 4 Series Chipset PT IDER Controller 
> (rev 03)
> 00:03.3 Serial controller: Intel Corporation 4 Series Chipset Serial KT 
> Controller (rev 03)
> 
> and getting below error:
> 
> ---
> Command:
> ---
> qemu-system-x86_64 -enable-kvm  -nographic  -kernel /boot/vmlinuz-3.6.0-rc2+ 
> -initrd /boot/initramfs-3.6.0-rc2+.img -append "root=/dev/sda1" -m 1024 
> -drive file=/home/kvmdev/debian_squeeze_amd64_standard.qcow2 -device 
> vfio-pci,host=:00:03.0 -device vfio-pci,host=:00:03.2 -device 
> vfio-pci,host=:00:03.3
> 
> Error prints:
> 
> qemu-system-x86_64: -device vfio-pci,host=:00:03.0: Warning, device 
> :00:03.0 does not support reset
> 
> qemu-system-x86_64: -device vfio-pci,host=:00:03.0: VFIO :00:03.0 BAR 
> 0 is too small to mmap, this may affect performance.
> 
> qemu-system-x86_64: -device vfio-pci,host=:00:03.2: Warning, device 
> :00:03.2 does not support reset
> 
> qemu-system-x86_64: -device vfio-pci,host=:00:03.3: Warning, device 
> :00:03.3 does not support reset
> 
> qemu: hardware error: register_ioport_read: invalid opaque for address 0x3f6
> CPU #0:
> EAX=8003 EBX=3ffe0e80 ECX=8003 EDX=0cfc
> ESI=2800 EDI=80002804 EBP=6fc0 ESP=6f38
> EIP=3ffec005 EFL=0086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0010   00c09300 DPL=0 DS   [-WA]
> CS =0008   00c09b00 DPL=0 CS32 [-RA]
> SS =0010   00c09300 DPL=0 DS   [-WA]
> DS =0010   00c09300 DPL=0 DS   [-WA]
> FS =0010   00c09300 DPL=0 DS   [-WA]
> GS =0010   00c09300 DPL=0 DS   [-WA]
> LDT=   8200 DPL=0 LDT
> TR =   8b00 DPL=0 TSS32-busy
> GDT= 000fcd68 0037
> IDT= 000fdb60 
> CR0=0011 CR2= CR3= CR4=
> DR0= DR1= DR2= 
> DR3= 
> DR6=0ff0 DR7=0400
> EFER=
> FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
> FPR0=  FPR1= 
> FPR2=  FPR3= 
> FPR4=  FPR5= 
> FPR6=  FPR7= 
> XMM00= XMM01=
> XMM02= XMM03=
> XMM04= XMM05=
> XMM06= XMM07=
> Aborted (core dumped)
> 
> 
> 
> Linux: http://github.com/awilliam/linux-vfio.git next
> Qemu: https://github.com/awilliam/qemu-vfio.git  iommu-group-vfio
> 
> Any idea what's is the issue.

Please try the vfio-pci-for-qemu-1.2-v3 qemu-vfio.git tag, I'm not even
sure what's in that old iommu-group-vfio branch.  Thanks,

Alex




Re: [Qemu-devel] QEMU emulation per CPU

2012-08-28 Thread Mulyadi Santosa
Hi :)

On Tue, Aug 28, 2012 at 3:39 PM, Naresh Bhat  wrote:
> Hi Mulyadi Santosa,
>
> Thank you very much for quick response.  Can you share some documents
> ? I want to do it practically.

one of the google results you might find helpful is:
http://www.cyberciti.biz/tips/setting-processor-affinity-certain-task-or-process.html

hope that helps :)

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: [Qemu-devel] [PATCHv2 2/4] pc: refactor compat code

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 01:23:18PM -0300, Marcelo Tosatti wrote:
> On Mon, Aug 27, 2012 at 03:20:13PM +0300, Michael S. Tsirkin wrote:
> > In preparation to adding PV EOI migration for 1.2,
> > trivially refactor some some compat code
> > to make it easier to add version specific
> > cpuid tweaks.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/pc_piix.c | 44 
> >  1 file changed, 36 insertions(+), 8 deletions(-)
> 
> Why can't you clear the cpuid bit in kvm paravirt leaf at hw/pc_piix.c, 
> leaving compat code isolated there?

This is not how we handle it for compat properties:
there we set flag in pc_piix and make devices look at
that flag.
And it makes sense because what you suggest does not scale: we can not
teach pc_piix about quirks of all hardware.

It will also scale better if we ever get interested about
compatibility and migration for non pc machines.
-- 
MST



Re: [Qemu-devel] [PATCHv2 2/4] pc: refactor compat code

2012-08-28 Thread Marcelo Tosatti
On Mon, Aug 27, 2012 at 03:20:13PM +0300, Michael S. Tsirkin wrote:
> In preparation to adding PV EOI migration for 1.2,
> trivially refactor some some compat code
> to make it easier to add version specific
> cpuid tweaks.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/pc_piix.c | 44 
>  1 file changed, 36 insertions(+), 8 deletions(-)

Why can't you clear the cpuid bit in kvm paravirt leaf at hw/pc_piix.c, 
leaving compat code isolated there?




Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Peter Maydell
On 28 August 2012 17:01, Michael S. Tsirkin  wrote:
> We copied HACKING from libvirt but it has some bogus stuff:
> neither underscore capital, double underscore, or underscore 't' suffixes
> are reserved in Posix/C: this appears to be based on misreading of the
> C standard. Using sane prefixes is enough to avoid conflicts.

> -2.4. Reserved namespaces in C and POSIX
> -Underscore capital, double underscore, and underscore 't' suffixes should be
> -avoided.

I think this is just a missing "prefixes". C99 7.1.3
reserves underscore capital and double underscore prefixes.
POSIX reserves _t if any POSIX header is defined (POSIX.1-2008 section
2.2.2, http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html)

I would suggest that the section is reworded to:

# Underscore capital and double underscore prefixes are reserved
# by the C standard. Underscore 't' suffixes are reserved by POSIX.
# They should be avoided in QEMU code.

-- PMM



Re: [Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-28 Thread Michael S. Tsirkin
On Mon, Aug 27, 2012 at 07:40:56PM +, Blue Swirl wrote:
> On Mon, Aug 27, 2012 at 7:24 PM, Michael S. Tsirkin  wrote:
> > On Mon, Aug 27, 2012 at 07:12:27PM +, Blue Swirl wrote:
> >> On Mon, Aug 27, 2012 at 7:06 PM, Michael S. Tsirkin  
> >> wrote:
> >> > On Mon, Aug 27, 2012 at 06:58:29PM +, Blue Swirl wrote:
> >> >> On Mon, Aug 27, 2012 at 12:20 PM, Michael S. Tsirkin  
> >> >> wrote:
> >> >> > In preparation for adding PV EOI support, disable PV EOI by default 
> >> >> > for
> >> >> > 1.1 and older machine types, to avoid CPUID changing during migration.
> >> >> >
> >> >> > PV EOI can still be enabled/disabled by specifying it explicitly.
> >> >> > Enable for 1.1
> >> >> > -M pc-1.1 -cpu kvm64,+kvm_pv_eoi
> >> >> > Disable for 1.2
> >> >> > -M pc-1.2 -cpu kvm64,-kvm_pv_eoi
> >> >> >
> >> >> > Signed-off-by: Michael S. Tsirkin 
> >> >> > ---
> >> >> >  hw/Makefile.objs  |  2 +-
> >> >> >  hw/cpu_flags.c| 32 
> >> >> >  hw/cpu_flags.h|  9 +
> >> >> >  hw/pc_piix.c  |  2 ++
> >> >> >  target-i386/cpu.c |  8 
> >> >> >  5 files changed, 52 insertions(+), 1 deletion(-)
> >> >> >  create mode 100644 hw/cpu_flags.c
> >> >> >  create mode 100644 hw/cpu_flags.h
> >> >> >
> >> >> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> >> >> > index 850b87b..3f2532a 100644
> >> >> > --- a/hw/Makefile.objs
> >> >> > +++ b/hw/Makefile.objs
> >> >> > @@ -1,5 +1,5 @@
> >> >> >  hw-obj-y = usb/ ide/
> >> >> > -hw-obj-y += loader.o
> >> >> > +hw-obj-y += loader.o cpu_flags.o
> >> >> >  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
> >> >> >  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> >> >> >  hw-obj-y += fw_cfg.o
> >> >> > diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
> >> >> > new file mode 100644
> >> >> > index 000..2422d20
> >> >> > --- /dev/null
> >> >> > +++ b/hw/cpu_flags.c
> >> >> > @@ -0,0 +1,32 @@
> >> >> > +/*
> >> >> > + * CPU compatibility flags.
> >> >> > + *
> >> >> > + * Copyright (c) 2012 Red Hat Inc.
> >> >> > + * Author: Michael S. Tsirkin.
> >> >> > + *
> >> >> > + * This program is free software; you can redistribute it and/or 
> >> >> > modify
> >> >> > + * it under the terms of the GNU General Public License as published 
> >> >> > by
> >> >> > + * the Free Software Foundation; either version 2 of the License, or
> >> >> > + * (at your option) any later version.
> >> >> > + *
> >> >> > + * This program is distributed in the hope that it will be useful,
> >> >> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> >> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> >> > + * GNU General Public License for more details.
> >> >> > + *
> >> >> > + * You should have received a copy of the GNU General Public License 
> >> >> > along
> >> >> > + * with this program; if not, see .
> >> >> > + */
> >> >> > +#include "hw/cpu_flags.h"
> >> >> > +
> >> >> > +static bool __kvm_pv_eoi_disabled;
> >> >>
> >> >> Don't use identifiers with leading underscores.
> >> >
> >> > C99 spec says "
> >> > Any other predefined macro names
> >> > shall begin with a leading underscore followed by an uppercase letter or
> >> > a second underscore.
> >> > "
> >> >
> >> > what are chances of compiler predefining macro __kvm_pv_eoi_disabled?
> >>
> >> Why do you even consider that since it's trivially easy to use
> >> something else? If a standard (and HACKING in our case) specifies
> >> something, why do you want to fight it?
> >
> > I missed this in HACKING, you are right:
> >
> > 2.4. Reserved namespaces in C and POSIX
> > Underscore capital, double underscore, and underscore 't' suffixes
> > should be avoided.
> >
> > so _kvm_pv_eoi_disabled is ok __kvm_pv_eoi_disabled is not.
> > Will fix.
> 
> No leading underscores. They are not used in QEMU.

They are *widely* used in QEMU to mark internal
stuff. E.g. parameters in many macros.

In reality __ is also widely used. I'm still mulling
removing 2.4 from HACKING - it appears too draconian,
the chances of a conflict with preprocessor are remote
and if it triggers, it's trivial to catch.
We also have lots of existing code violating this rule.

And the rule about _t suffix is just silly.

> >
> >> >
> >> > But OK, will rename _kvm_pv_eoi_disabled.
> >> > _ + lower case is guaranteed OK.
> >>
> >> No, just use kvm_pv_eoi_disabled, the underscore is useless.
> >
> > It isn't useless, this avoids conflict with function name.
> > _ says it's an internal variable used to implement kvm_pv_eoi_disabled
> > in a very clear way.
> 
> Sure, but there are infinite number of ways of making the identifiers
> unique. Using leading underscores is a way to ever conflict with
> compiler, linker,  libc, POSIX etc.

I think you are mistaken. Anything to support this claim?

> Don't do it.
> Where's your imagination, can't you invent any other prefix or suffix?

I like _, I think it's a standard C way to mark internal

[Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
We copied HACKING from libvirt but it has some bogus stuff:
neither underscore capital, double underscore, or underscore 't' suffixes
are reserved in Posix/C: this appears to be based on misreading of the
C standard. Using sane prefixes is enough to avoid conflicts.

These rules are also widely violated in our codebase,
and it does not make sense to rework it all, apparently for
no benefit.

Signed-off-by: Michael S. Tsirkin 
---
 HACKING | 4 
 1 file changed, 4 deletions(-)

diff --git a/HACKING b/HACKING
index 471cf1d..0a941fc 100644
--- a/HACKING
+++ b/HACKING
@@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that is.
 2.3. Typedefs
 Typedefs are used to eliminate the redundant 'struct' keyword.
 
-2.4. Reserved namespaces in C and POSIX
-Underscore capital, double underscore, and underscore 't' suffixes should be
-avoided.
-
 3. Low level memory management
 
 Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
-- 
MST



Re: [Qemu-devel] [PATCH v1 4/4] hw: Add support for a dummy ARMv7-M board

2012-08-28 Thread Meador Inge
On 08/28/2012 07:48 AM, Peter Maydell wrote:

> On 27 August 2012 21:37, Meador Inge  wrote:
>> This patch adds support for a "dummy" ARMv7-M board so that
>> QEMU can be used as an ISS for ARMv7-M processors.  For example,
>> running an image compiled for the Cortex-M3 with -cpu cortex-m3
>> should just work.
> 
> So what programs would run on this 'dummy' board which would not
> execute correctly on the existing stellaris board models?

Similar programs will run on the Stellaris models, but the RAM on
those models is severely limited (8KB for LM3S811EVB and 64KB for LM3S6965EVB)
and it is hard wired.  Having some board that can handle ARMv7-M with more RAM
(and can be used with -m) is useful.

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software



Re: [Qemu-devel] [PATCH] HACKING: remove bogus restrictions

2012-08-28 Thread Michael S. Tsirkin
On Tue, Aug 28, 2012 at 07:01:16PM +0300, Michael S. Tsirkin wrote:
> We copied HACKING from libvirt but it has some bogus stuff:
> neither underscore capital, double underscore, or underscore 't' suffixes
> are reserved in Posix/C: this appears to be based on misreading of the
> C standard. Using sane prefixes is enough to avoid conflicts.
> 
> These rules are also widely violated in our codebase,

To add to that, they are even contradicted in HACKING itself
which suggests using ram_addr_t for RAM offsets (_t suffix).

> and it does not make sense to rework it all, apparently for
> no benefit.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  HACKING | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/HACKING b/HACKING
> index 471cf1d..0a941fc 100644
> --- a/HACKING
> +++ b/HACKING
> @@ -69,10 +69,6 @@ it points to, or it is aliased to another pointer that is.
>  2.3. Typedefs
>  Typedefs are used to eliminate the redundant 'struct' keyword.
>  
> -2.4. Reserved namespaces in C and POSIX
> -Underscore capital, double underscore, and underscore 't' suffixes should be
> -avoided.
> -
>  3. Low level memory management
>  
>  Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
> -- 
> MST



Re: [Qemu-devel] [PATCH V5 1/8] isa: add isa_address_space_io

2012-08-28 Thread Julien Grall

On 08/24/2012 05:10 PM, Andreas Färber wrote:

Am 22.08.2012 14:27, schrieb Julien Grall:
   

This function permits to retrieve ISA IO address space.
It will be usefull when we need to pass IO address space as argument.

Signed-off-by: Julien Grall
---
  hw/isa-bus.c |5 +
  hw/isa.h |1 +
  2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index f9b2373..662c86b 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -244,4 +244,9 @@ MemoryRegion *isa_address_space(ISADevice *dev)
  return get_system_memory();
  }

+MemoryRegion *isa_address_space_io(ISADevice *dev)
+{
+return get_system_io();
+}
 

Unlike the address_space above, there's an address_space_io field in
ISABus, so I guess the implementation of this function should rather
obtain the device's BusState via isa_bus_from_device(dev) and return its
field rather than hardcoding get_system_io() here.
   

I use this function in hw/dma.c. For the moment, the code doesn't
use ISA device, so I pass NULL to isa_address_space_io
(See patch 6).

Instead of using isa_bus_from_device, can I just return
"isabus->address_space_io" as in isa_register_portio_list ?

--
Julien Grall



Re: [Qemu-devel] [PATCH]] audio: previous audio buffer should be flushed

2012-08-28 Thread malc
On Tue, 28 Aug 2012, munkyu.im wrote:

> Winwave audio backend has problem with pausing and restart audio out.
> Unlike other backends, Winwave pausing API does not flush audio buffer.
> As a result, the previous audio data are playedin front of
> user expected sound when user restart audio.
> So changes it to waveOutReset()
> 
> Signed-off-by: Munkyu Im 
> ---
>  audio/winwaveaudio.c |   12 +++-
>  1 files changed, 3 insertions(+), 9 deletions(-)
> 

Thanks, applied with some cosmetics changes to the commit message.

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH v2 for-1.2 2/2] qemu-iotests: add backing file smaller than image test case

2012-08-28 Thread Paolo Bonzini
Il 28/08/2012 16:26, Stefan Hajnoczi ha scritto:
> This new test case checks that streaming completes successfully when the
> backing file is smaller than the image file.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tests/qemu-iotests/030 |   33 +
>  tests/qemu-iotests/030.out |4 ++--
>  2 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
> index f71ab8d..55b16f8 100755
> --- a/tests/qemu-iotests/030
> +++ b/tests/qemu-iotests/030
> @@ -125,6 +125,39 @@ class TestSingleDrive(ImageStreamingTestCase):
>  result = self.vm.qmp('block-stream', device='nonexistent')
>  self.assert_qmp(result, 'error/class', 'DeviceNotFound')
>  
> +
> +class TestSmallerBackingFile(ImageStreamingTestCase):
> +backing_len = 1 * 1024 * 1024 # MB
> +image_len = 2 * backing_len
> +
> +def setUp(self):
> +self.create_image(backing_img, self.backing_len)
> +qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
> backing_img, test_img, str(self.image_len))
> +self.vm = iotests.VM().add_drive(test_img)
> +self.vm.launch()
> +
> +# If this hangs, then you are missing a fix to complete streaming when 
> the
> +# end of the backing file is reached.
> +def test_stream(self):
> +self.assert_no_active_streams()
> +
> +result = self.vm.qmp('block-stream', device='drive0')
> +self.assert_qmp(result, 'return', {})
> +
> +completed = False
> +while not completed:
> +for event in self.vm.get_qmp_events(wait=True):
> +if event['event'] == 'BLOCK_JOB_COMPLETED':
> +self.assert_qmp(event, 'data/type', 'stream')
> +self.assert_qmp(event, 'data/device', 'drive0')
> +self.assert_qmp(event, 'data/offset', self.image_len)
> +self.assert_qmp(event, 'data/len', self.image_len)
> +completed = True
> +
> +self.assert_no_active_streams()
> +self.vm.shutdown()
> +
> +
>  class TestStreamStop(ImageStreamingTestCase):
>  image_len = 8 * 1024 * 1024 * 1024 # GB
>  
> diff --git a/tests/qemu-iotests/030.out b/tests/qemu-iotests/030.out
> index 3f8a935..2f7d390 100644
> --- a/tests/qemu-iotests/030.out
> +++ b/tests/qemu-iotests/030.out
> @@ -1,5 +1,5 @@
> -..
> +...
>  --
> -Ran 6 tests
> +Ran 7 tests
>  
>  OK
> 

Reviewed-by: Paolo Bonzini 



Re: [Qemu-devel] [PATCH v2 for-1.2 1/2] stream: complete early if end of backing file is reached

2012-08-28 Thread Paolo Bonzini
Il 28/08/2012 16:26, Stefan Hajnoczi ha scritto:
> It is possible to create an image that is larger than its backing file.
> Reading beyond the end of the backing file produces zeroes if no writes
> have been made to those sectors in the image file.
> 
> This patch finishes streaming early when the end of the backing file is
> reached.  Without this patch the block job hangs and continually tries
> to stream the first sectors beyond the end of the backing file.
> 
> To reproduce the hung block job bug:
> 
>   $ qemu-img create -f qcow2 backing.qcow2 128M
>   $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G
>   $ qemu -drive if=virtio,cache=none,file=image.qcow2
>   (qemu) block_stream virtio0
>   (qemu) info block-jobs
> 
> The qemu-iotests 030 streaming test still passes.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/stream.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/block/stream.c b/block/stream.c
> index 37c4652..c4f87dd 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -122,6 +122,12 @@ wait:
>   * known-unallocated area [sector_num, sector_num+n).  */
>  ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
>   sector_num, n, &n);
> +
> +/* Finish early if end of backing file has been reached */
> +if (ret == 0 && n == 0) {
> +n = end - sector_num;
> +}

Reviewed-by: Paolo Bonzini 

>  copy = (ret == 1);
>  }
>  trace_stream_one_iteration(s, sector_num, n, ret);
> 





[Qemu-devel] KVM call minutes for Tuesday, August 28th

2012-08-28 Thread Juan Quintela

Hi

As almost everybody was at LinuxCon, we only started talking about the
cpu patches from Eduardo and decided to left things for the list/next
week.

- cpu patches until 1.2
  * get a new tree somewhere
- 1.3 discussion will wait until 1.2 is out
  people too busy right now

- Remember than KVM Forum abstracts deadline is this Friday (August
  31th)

Thanks, Juan.


This were the topics asked from Eduardo:

- *-user and qdev (recent RFCs didn't get many comments in the list, and
  I don't see a conclusion);
- 1.2 branching, or creation of a "cpu-next" tree where "good to be
  merged" patches can live until 1.2 is done;
- CPU code roadmap[1].

[1] For reference, this is the list of pending work that I want to
target for 1.3:

- -cpu help fix
- CPU DeviceState
- move CPU models to C
- kill cpudef
- unduplicate feature names
- CPU properties
- CPU model classes
- Fix -cpu host to use GET_SUPPORTED_CPUID properly
- Fix -cpu check/enforce brokenness
  (including lots of refactoring of the currently-broken feature-flag
  checking code)
- APIC ID threads/cores topology fix

Thanks, Juan.



[Qemu-devel] [PATCH v2 for-1.2 1/2] stream: complete early if end of backing file is reached

2012-08-28 Thread Stefan Hajnoczi
It is possible to create an image that is larger than its backing file.
Reading beyond the end of the backing file produces zeroes if no writes
have been made to those sectors in the image file.

This patch finishes streaming early when the end of the backing file is
reached.  Without this patch the block job hangs and continually tries
to stream the first sectors beyond the end of the backing file.

To reproduce the hung block job bug:

  $ qemu-img create -f qcow2 backing.qcow2 128M
  $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G
  $ qemu -drive if=virtio,cache=none,file=image.qcow2
  (qemu) block_stream virtio0
  (qemu) info block-jobs

The qemu-iotests 030 streaming test still passes.

Signed-off-by: Stefan Hajnoczi 
---
 block/stream.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/block/stream.c b/block/stream.c
index 37c4652..c4f87dd 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -122,6 +122,12 @@ wait:
  * known-unallocated area [sector_num, sector_num+n).  */
 ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
  sector_num, n, &n);
+
+/* Finish early if end of backing file has been reached */
+if (ret == 0 && n == 0) {
+n = end - sector_num;
+}
+
 copy = (ret == 1);
 }
 trace_stream_one_iteration(s, sector_num, n, ret);
-- 
1.7.10.4




[Qemu-devel] [PATCH v2 for-1.2 2/2] qemu-iotests: add backing file smaller than image test case

2012-08-28 Thread Stefan Hajnoczi
This new test case checks that streaming completes successfully when the
backing file is smaller than the image file.

Signed-off-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/030 |   33 +
 tests/qemu-iotests/030.out |4 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index f71ab8d..55b16f8 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -125,6 +125,39 @@ class TestSingleDrive(ImageStreamingTestCase):
 result = self.vm.qmp('block-stream', device='nonexistent')
 self.assert_qmp(result, 'error/class', 'DeviceNotFound')
 
+
+class TestSmallerBackingFile(ImageStreamingTestCase):
+backing_len = 1 * 1024 * 1024 # MB
+image_len = 2 * backing_len
+
+def setUp(self):
+self.create_image(backing_img, self.backing_len)
+qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
backing_img, test_img, str(self.image_len))
+self.vm = iotests.VM().add_drive(test_img)
+self.vm.launch()
+
+# If this hangs, then you are missing a fix to complete streaming when the
+# end of the backing file is reached.
+def test_stream(self):
+self.assert_no_active_streams()
+
+result = self.vm.qmp('block-stream', device='drive0')
+self.assert_qmp(result, 'return', {})
+
+completed = False
+while not completed:
+for event in self.vm.get_qmp_events(wait=True):
+if event['event'] == 'BLOCK_JOB_COMPLETED':
+self.assert_qmp(event, 'data/type', 'stream')
+self.assert_qmp(event, 'data/device', 'drive0')
+self.assert_qmp(event, 'data/offset', self.image_len)
+self.assert_qmp(event, 'data/len', self.image_len)
+completed = True
+
+self.assert_no_active_streams()
+self.vm.shutdown()
+
+
 class TestStreamStop(ImageStreamingTestCase):
 image_len = 8 * 1024 * 1024 * 1024 # GB
 
diff --git a/tests/qemu-iotests/030.out b/tests/qemu-iotests/030.out
index 3f8a935..2f7d390 100644
--- a/tests/qemu-iotests/030.out
+++ b/tests/qemu-iotests/030.out
@@ -1,5 +1,5 @@
-..
+...
 --
-Ran 6 tests
+Ran 7 tests
 
 OK
-- 
1.7.10.4




[Qemu-devel] [PATCH v2 for-1.2 0/2] stream: complete early if end of backing file is reached

2012-08-28 Thread Stefan Hajnoczi
Image streaming hangs if the backing image is smaller than the image file.  The
problem is that the image streaming loop makes no progress when
bdrv_co_is_allocated() returns 0 with pnum=0.  More details in the actual
patch.

I have also included a qemu-iotest to check this scenario.  It hangs when run
against qemu.git/master and passes when the patch is applied.

Stefan Hajnoczi (2):
  stream: complete early if end of backing file is reached
  qemu-iotests: add backing file smaller than image test case

 block/stream.c |6 ++
 tests/qemu-iotests/030 |   33 +
 tests/qemu-iotests/030.out |4 ++--
 3 files changed, 41 insertions(+), 2 deletions(-)

-- 
1.7.10.4




Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Eduardo Habkost
On Tue, Aug 28, 2012 at 02:55:56PM +0100, Peter Maydell wrote:
> On 28 August 2012 14:30, Eduardo Habkost  wrote:
> > - 1.2 branching, or creation of a "cpu-next" tree where "good to be
> >   merged" patches can live until 1.2 is done;
> 
> With 1.3 due for release in just over a week, it seems unlikely
> that it's worth branching at this point...

Well, the closer to the release, the smaller the cost of branching as we
won't have many patches entering the 1.2 branch, anyway.

But in the end, this is more a problem of patch review capacity, than
about having a branch created. One can easily create a branch somewhere
(I am going to create a "cpu-next" branch for the patches that seem to
be "ready to go"), and propose to get it merged after 1.2 is out. But
the problem is to have enough eyeballs to look at it to decide if each
patch should go into that branch, or not.

-- 
Eduardo



Re: [Qemu-devel] [PATCH for-1.2] qed: refuse unaligned zero writes with a backing file

2012-08-28 Thread Paolo Bonzini
Il 28/08/2012 15:37, Stefan Hajnoczi ha scritto:
>> > The "right fix" would not be much more complex though, something like 
>> > this, right?
>> > (untested).
> Yes but it's more complicated.  To do a really good job we should
> slice off the first/last clusters if they are unaligned, handle them
> like regular allocating writes, and handle the middle of the request
> as a zero write.
> 
> I decided to do the simplest implementation since this scenario only
> occurs in test cases, not real guests.

Yes, I was curious because it reminded me of the patch I did to write
zeroes when I was playing with discard to avoid the large bounce buffer
in qed_aio_write_inplace.  That patch takes care of processing clusters
one by one (though that means one L2 write for each and every cluster,
not just the first and last).

It probably causes a performance hit, but anyway I attach it for
completeness.

Paolo
>From 98f2978ae5d0f34dca0369fcc727d1e533c0e6b0 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini 
Date: Thu, 8 Mar 2012 14:48:29 +0100
Subject: [PATCH 1/2] qed: make write-zeroes bounce buffer smaller than a
 single cluster

Currently, a write-zeroes operation could allocates memory for the whole
I/O operation if it is not aligned.  This is not necessary, because only
two unaligned clusters could be written.

This makes the write-zeroes operation proceed one cluster at a time,
even if all clusters are currently available and zero.  This does cause
worse performance due to multiple L2 table writes.  However, if zero-write
detection is enabled it means we're not interested in maximum performance.

Signed-off-by: Paolo Bonzini 
---
 block/qed.c | 75 +
 1 file modificato, 46 inserzioni(+), 29 rimozioni(-)

diff --git a/block/qed.c b/block/qed.c
index a02dbfd..bcea346 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -869,7 +869,9 @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
 /* Free the buffer we may have allocated for zero writes */
 if (acb->flags & QED_AIOCB_ZERO) {
 qemu_vfree(acb->qiov->iov[0].iov_base);
-acb->qiov->iov[0].iov_base = NULL;
+qemu_iovec_destroy(acb->qiov);
+g_free(acb->qiov);
+acb->qiov = NULL;
 }
 
 /* Arrange for a bh to invoke the completion function */
@@ -1096,6 +1098,34 @@ static void qed_aio_write_zero_cluster(void *opaque, int ret)
 }
 
 /**
+ * Calculate the I/O vector
+ *
+ * @acb:Write request
+ * @len:Length in bytes
+ */
+static void qed_prepare_qiov(QEDAIOCB *acb, size_t len)
+{
+/* Calculate the I/O vector */
+if (acb->flags & QED_AIOCB_ZERO) {
+/* Allocate buffer for zero writes */
+if (!acb->qiov) {
+BDRVQEDState *s = acb_to_s(acb);
+char *base;
+
+acb->qiov = g_malloc(sizeof(QEMUIOVector));
+base = qemu_blockalign(s->bs, s->header.cluster_size);
+qemu_iovec_init(acb->qiov, 1);
+qemu_iovec_add(acb->qiov, base, s->header.cluster_size);
+memset(base, 0, s->header.cluster_size);
+}
+assert(len <= acb->qiov->size);
+qemu_iovec_add(&acb->cur_qiov, acb->qiov->iov[0].iov_base, len);
+} else {
+qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
+}
+}
+
+/**
  * Write new data cluster
  *
  * @acb:Write request
@@ -1124,21 +1154,20 @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
 
 acb->cur_nclusters = qed_bytes_to_clusters(s,
 qed_offset_into_cluster(s, acb->cur_pos) + len);
-qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
 if (acb->flags & QED_AIOCB_ZERO) {
 /* Skip ahead if the clusters are already zero */
 if (acb->find_cluster_ret == QED_CLUSTER_ZERO) {
-qed_aio_next_io(acb, 0);
-return;
+cb = qed_aio_next_io;
+} else {
+cb = qed_aio_write_zero_cluster;
 }
-
-cb = qed_aio_write_zero_cluster;
 } else {
 cb = qed_aio_write_prefill;
 acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
 }
 
+qed_prepare_qiov(acb, len);
 if (qed_should_set_need_check(s)) {
 s->header.features |= QED_F_NEED_CHECK;
 qed_write_header(s, cb, acb);
@@ -1158,19 +1187,8 @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
  */
 static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
 {
-/* Allocate buffer for zero writes */
-if (acb->flags & QED_AIOCB_ZERO) {
-struct iovec *iov = acb->qiov->iov;
-
-if (!iov->iov_base) {
-iov->iov_base = qemu_blockalign(acb->common.bs, iov->iov_len);
-memset(iov->iov_base, 0, iov->iov_len);
-}
-}
-
-/* Calculate the I/O vector */
 acb->cur_cluster = offset;
-qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
+qed_prepare_qiov(acb, len);
 
 /* Do the act

Re: [Qemu-devel] [PATCH for-1.2] stream: complete early if end of backing file is reached

2012-08-28 Thread Stefan Hajnoczi
On Tue, Aug 28, 2012 at 2:58 PM, Stefan Hajnoczi
 wrote:
> It is possible to create an image that is larger than its backing file.
> Reading beyond the end of the backing file produces zeroes if no writes
> have been made to those sectors in the image file.
>
> This patch finishes streaming early when the end of the backing file is
> reached.  Without this patch the block job hangs and continually tries
> to stream the first sectors beyond the end of the backing file.
>
> To reproduce the hung block job bug:
>
>   $ qemu-img create -f qcow2 backing.qcow2 128M
>   $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G
>   $ qemu -drive if=virtio,cache=none,file=image.qcow2
>   (qemu) block_stream virtio0
>   (qemu) info block-jobs
>
> The qemu-iotests 030 streaming test still passes.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/stream.c |7 +++
>  1 file changed, 7 insertions(+)

I will resend this along with a test case.

Stefan



[Qemu-devel] [PATCH for-1.2] stream: complete early if end of backing file is reached

2012-08-28 Thread Stefan Hajnoczi
It is possible to create an image that is larger than its backing file.
Reading beyond the end of the backing file produces zeroes if no writes
have been made to those sectors in the image file.

This patch finishes streaming early when the end of the backing file is
reached.  Without this patch the block job hangs and continually tries
to stream the first sectors beyond the end of the backing file.

To reproduce the hung block job bug:

  $ qemu-img create -f qcow2 backing.qcow2 128M
  $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G
  $ qemu -drive if=virtio,cache=none,file=image.qcow2
  (qemu) block_stream virtio0
  (qemu) info block-jobs

The qemu-iotests 030 streaming test still passes.

Signed-off-by: Stefan Hajnoczi 
---
 block/stream.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/block/stream.c b/block/stream.c
index 37c4652..331ba21 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -122,6 +122,13 @@ wait:
  * known-unallocated area [sector_num, sector_num+n).  */
 ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
  sector_num, n, &n);
+
+/* Finish early if end of backing file has been reached */
+if (ret == 0 && n == 0) {
+sector_num = end;
+break;
+}
+
 copy = (ret == 1);
 }
 trace_stream_one_iteration(s, sector_num, n, ret);
-- 
1.7.10.4




Re: [Qemu-devel] KVM call agenda for Tuesda, August 28th

2012-08-28 Thread Peter Maydell
On 28 August 2012 14:30, Eduardo Habkost  wrote:
> - 1.2 branching, or creation of a "cpu-next" tree where "good to be
>   merged" patches can live until 1.2 is done;

With 1.3 due for release in just over a week, it seems unlikely
that it's worth branching at this point...

-- PMM



  1   2   >