Re: [Qemu-devel] [PATCH] flatload: fix bss clearing

2012-07-10 Thread Mike Frysinger
On Monday 09 July 2012 09:21:52 Andreas Färber wrote:
 On 09.07.2012 15:04, Mike Frysinger wrote:
  The current bss clear logic assumes the target mmap address and host
  address are the same.  Use g2h to translate from the target address
  space to the host so we can call memset on it.
 
 Patch looks sensible. Are you working on rebasing your Blackfin target
 to QOM and AREG0?

i've rebased them to the latest release (1.1.0).  FDPIC seems to work fine, as 
does basic ELF, but FLAT gets into an infinite loop and i haven't figured out 
why just yet.
-mike
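
A minimal sketch of the idea behind the patch description above, assuming a simplified model in which guest address 0 maps to the start of a fixed host buffer. The names guest_mem, g2h_sketch() and clear_bss() are hypothetical stand-ins for illustration only; they are not QEMU's actual g2h() macro or the flatload code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical guest memory: guest address 0 lives at guest_mem[0]. */
static uint8_t guest_mem[64];
static uint8_t *guest_base = guest_mem;

/* Simplified stand-in for g2h(): translate a guest address to a host pointer. */
static void *g2h_sketch(uint32_t guest_addr)
{
    assert(guest_addr <= sizeof(guest_mem));
    return guest_base + guest_addr;
}

/* Clear a bss range given in guest addresses. The memset target must be
 * the translated host pointer, never the raw guest address. */
static void clear_bss(uint32_t start_bss, uint32_t end_bss)
{
    assert(start_bss <= end_bss);
    assert(end_bss <= sizeof(guest_mem));
    memset(g2h_sketch(start_bss), 0, end_bss - start_bss);
}
```

Calling memset() directly on start_bss would only be correct when the guest and host address spaces happen to coincide, which is the assumption the patch removes.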




Re: [Qemu-devel] [PATCH] megasas: Fix compilation for 32 bit hosts

2012-07-10 Thread Hannes Reinecke
Hi Stefan,

you might've seen that Anthony objected to this in general.
Apparently I'm not allowed to use the instance address to seed the
SAS address.

So yes, your fix is valid, but might be pointless as I might have to
re-do this section anyway.
But wait and see what Anthony has to say here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)





Re: [Qemu-devel] [PATCH] megasas: Fix compilation for 32 bit hosts

2012-07-10 Thread Stefan Weil

On 10.07.2012 08:00, Hannes Reinecke wrote:

Hi Stefan,

you might've seen that Anthony objected to this in general.
Apparently I'm not allowed to use the instance address to seed the
SAS address.

So yes, your fix is valid, but might be pointless as I might have to
re-do this section anyway.
But wait and see what Anthony has to say here.

Cheers,

Hannes


There remains an additional problem with megasas_dcmd_dump_frame
because it takes too many arguments.

Builds with the simple trace backend fail.

Cheers,

Stefan





[Qemu-devel] make apic hot-plugable

2012-07-10 Thread Liu Ping Fan
The previous effort to make the APIC hot-pluggable is in the thread:
  [PATCH V3] Introduce a new bus ICC to connect APIC
See:
  http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg00413.html

But now we have QOM, so this series remodels the APIC as a child of CPUState
(leaving aside the dependent APIC, which is no longer common).

Any comments or suggestions?



Re: [Qemu-devel] [PATCH v2 0/2] QOMify AXI stream for Xilinx AXI ethernet/DMA

2012-07-10 Thread Peter Crosthwaite
Ping^2

On Wed, Jul 4, 2012 at 10:28 AM, Peter Crosthwaite
peter.crosthwa...@petalogix.com wrote:
 Ping!

 On Thu, Jun 28, 2012 at 8:41 PM, Peter A. G. Crosthwaite
 peter.crosthwa...@petalogix.com wrote:
 Next revision of the series for AXI-stream, rebased on Anthony's refactoring
 of the Interface system. Anthony's patch is already on the mailing list, but
 I have included it for completeness. P2 is all the actual axi-stream
 device-land stuff.

 Changed since V1:
 Rebased Anthony's patch (P1) (Heavy conflict with final phase qom-next merge)
 Rolled Interface + link bug patch (formerly P3) into P1

 Anthony Liguori (1):
   qom: Reimplement Interfaces

 Peter A. G. Crosthwaite (1):
   xilinx_axi*: Re-implemented interconnect

  hw/Makefile.objs         |    1 +
  hw/petalogix_ml605_mmu.c |   24 +++--
  hw/stream.c              |   23 +
  hw/stream.h              |   31 +++
  hw/xilinx.h              |   22 ++---
  hw/xilinx_axidma.c       |   74 +---
  hw/xilinx_axidma.h       |   39 
  hw/xilinx_axienet.c      |   32 ---
  include/qemu/object.h    |   46 ++
  qom/object.c             |  220 ++
  10 files changed, 255 insertions(+), 257 deletions(-)
  create mode 100644 hw/stream.c
  create mode 100644 hw/stream.h
  delete mode 100644 hw/xilinx_axidma.h

 --
 1.7.3.2




[Qemu-devel] [PATCH 3/5] qdev: export the bus reset interface

2012-07-10 Thread Liu Ping Fan
Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/qdev.c |   17 -
 hw/qdev.h |2 ++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index d2100a1..f7983e4 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -249,11 +249,9 @@ static int qdev_reset_one(DeviceState *dev, void *opaque)
 
 static int qbus_reset_one(BusState *bus, void *opaque)
 {
-    BusClass *bc = BUS_GET_CLASS(bus);
-    if (bc->reset) {
-        return bc->reset(bus);
-    }
-    return 0;
+    int ret;
+    ret = bus_reset(bus);
+    return ret;
 }
 
 void qdev_reset_all(DeviceState *dev)
@@ -766,6 +764,15 @@ void device_reset(DeviceState *dev)
 }
 }
 
+int bus_reset(BusState *bus)
+{
+    BusClass *bc = BUS_GET_CLASS(bus);
+    if (bc->reset) {
+        return bc->reset(bus);
+    }
+    return 0;
+}
+
 Object *qdev_get_machine(void)
 {
     static Object *dev;
diff --git a/hw/qdev.h b/hw/qdev.h
index aecc69e..5f88b4b 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -356,6 +356,8 @@ void qdev_machine_init(void);
  */
 void device_reset(DeviceState *dev);
 
+int bus_reset(BusState *bus);
+
 const VMStateDescription *qdev_get_vmsd(DeviceState *dev);
 
 const char *qdev_fw_name(DeviceState *dev);
-- 
1.7.4.4




Re: [Qemu-devel] [PATCH 4/6] device_tree: Add support for reading device tree properties

2012-07-10 Thread Peter Crosthwaite
On Sat, Jul 7, 2012 at 1:34 AM, Peter Maydell peter.mayd...@linaro.org wrote:
 On 6 July 2012 02:56, Peter Crosthwaite peter.crosthwa...@petalogix.com 
 wrote:
 Can we generalise and get functionality for reading cells with offsets
 as well? Your function assumes (and asserts) that the property is a
 single cell, but can we add a index parameter for reading a non-0th
 property out of a multi-cell prop? Needed for reading things like
 ranges, regs and interrupt properties.

 I was playing about with this and I'm really not sure that we should
 be providing a read a single u32 from a u32 array property at the
 device_tree.c layer. For example, for handling the ranges property
 what you really want to do is treat it as a list of tuples

The tuples concept layers on top of the cells concept. The meaning of
cells will differ from property to property but cells are cells. The
setter API manages cells but not tuples, so the same should be true of
the getter API. So if you are dealing with tuples, the device tree
layer should provide access to the cells (which is essentially reading
u32 from u32 array) then you can interpret them as tuples if you wish.

I guess what I'm trying to say is that tuples and cells should be
managed quite separately, the former in the client and the latter in
the device tree API.

 (including
 doing something sensible if it doesn't have the right length to be
 a complete list), so the code that knows the structure of the ranges
 property is better off calling qemu_devtree_getprop to get a uint32_t*
 for the whole array.

The logic for the byte reversal should still be in device tree
however. You have to be_to_cpu each read value after reading that
array which is tedious and an easy omission to make. I think if we are
going to wrap for single cell props then it makes sense for multi cell
as well.

 Then it has the whole thing as a straightforward
 C array which will be much easier and more efficient to handle than

Efficient yes, but easier no because of the endian issue. For my few
use cases out of tree I only ever get the one property at a time. I
never parse an entire cell array.

 constantly bouncing back into the fdt layer to read each uint32_t.


Constantly bouncing back is safer however. If you hang on to an
in-place pointer into the FDT (as returned by get_prop) and someone
comes along and set_props() then your pointer is corrupted. I've been
snagged before by doing exactly this and eventually came to the
brute-force approach of just requerying the DTB every touch rather
than try to work with pointers to arrays. Duping the property could
work, but it's a bit of a mess trying to free the returned copies.

(As an aside this is one of the reasons why my chicken-and-egg machine
model uses coroutines instead of threads. I need the get_prop()
followed by its immediate usage to be atomic).

If users care about efficiency over safety, then your get_prop + cast +
endian_swap approach will always be available to them. I just think a
single idx arg at the end creates more options for users. We could
vararg it and if it's omitted it just takes the 0th element to keep the
call sites for the 90% case a little more concise.
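
A rough sketch of what an indexed cell getter along these lines could look like, assuming the caller already holds the raw property bytes and their length. prop_get_cell() and be32_decode() are hypothetical names for illustration, not functions from device_tree.c or libfdt. Decoding byte by byte handles the big-endian conversion in one place and does not rely on the property pointer being 32-bit aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell from raw property bytes. */
static uint32_t be32_decode(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical getter: read cell number idx from a property of len bytes.
 * Returns 0 on success, or -1 if the arguments are NULL, len is not a
 * whole number of cells, or idx is past the last cell. */
static int prop_get_cell(const void *prop, size_t len, size_t idx,
                         uint32_t *out)
{
    if (prop == NULL || out == NULL || len % 4 != 0 || idx >= len / 4) {
        return -1;
    }
    *out = be32_decode((const uint8_t *)prop + idx * 4);
    return 0;
}
```

A client that wants tuples (for example for a ranges property) could then loop over idx itself and group the cells, keeping the tuple interpretation out of the device tree layer.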

Regards,
Peter

 I've also just realised that I'm assuming that the pointer returned
 by fdt_getprop() is naturally aligned for a 32 bit integer if the
 property is a 32 bit integer -- is that valid?

 -- PMM



[Qemu-devel] [PATCH 2/5] qom: introduce object_is_type_str(), so we can judge its type.

2012-07-10 Thread Liu Ping Fan
Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 include/qemu/object.h |2 ++
 qom/object.c  |6 ++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/qemu/object.h b/include/qemu/object.h
index 8b17776..a66e996 100644
--- a/include/qemu/object.h
+++ b/include/qemu/object.h
@@ -479,6 +479,8 @@ void object_initialize(void *obj, const char *typename);
  */
 void object_finalize(void *obj);
 
+bool object_is_type_str(Object *obj, const char *typename);
+
 /**
  * object_dynamic_cast:
  * @obj: The object to cast.
diff --git a/qom/object.c b/qom/object.c
index 00bb3b0..6c27d90 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -425,6 +425,12 @@ static bool type_is_ancestor(TypeImpl *type, TypeImpl *target_type)
     return false;
 }
 
+bool object_is_type_str(Object *obj, const char *typename)
+{
+    TypeImpl *target_type = type_get_by_name(typename);
+    return !target_type || type_is_ancestor(obj->class->type, target_type);
+}
+
 static bool object_is_type(Object *obj, TypeImpl *target_type)
 {
     return !target_type || type_is_ancestor(obj->class->type, target_type);
-- 
1.7.4.4




[Qemu-devel] [PATCH 5/5] apic: create apic as a child of cpu, not system_bus any longer

2012-07-10 Thread Liu Ping Fan
Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/pc.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index c7e9ab3..8df58c9 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -886,17 +886,17 @@ DeviceState *cpu_get_current_apic(void)
 }
 }
 
-static DeviceState *apic_init(void *env, uint8_t apic_id)
+static DeviceState *apic_init(void *cpu, void *env, uint8_t apic_id)
 {
     DeviceState *dev;
     static int apic_mapped;
 
     if (kvm_irqchip_in_kernel()) {
-        dev = qdev_create(NULL, "kvm-apic");
+        dev = qdev_create_kid(OBJECT(cpu), "kvm-apic");
     } else if (xen_enabled()) {
-        dev = qdev_create(NULL, "xen-apic");
+        dev = qdev_create_kid(OBJECT(cpu), "xen-apic");
     } else {
-        dev = qdev_create(NULL, "apic");
+        dev = qdev_create_kid(OBJECT(cpu), "apic");
     }
 
     qdev_prop_set_uint8(dev, "id", apic_id);
@@ -945,7 +945,7 @@ static X86CPU *pc_new_cpu(const char *cpu_model)
     }
     env = &cpu->env;
     if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
-        env->apic_state = apic_init(env, env->cpuid_apic_id);
+        env->apic_state = apic_init(cpu, env, env->cpuid_apic_id);
     }
     qemu_register_reset(pc_cpu_reset, cpu);
     pc_cpu_reset(cpu);
-- 
1.7.4.4




Re: [Qemu-devel] [PULL 00/14] SCSI updates for 2012-07-02

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 07:57, Hannes Reinecke wrote:
  This will make migration impossible not to mention the fact that
  casting a pointer to a uint64_t is really broken.
  
 Hey, this is _NOT_ an address. It's a simple way of generating a
 system-wide unique SAS address.
 
 The whole thing is informational anyway, and can only be seen when
 using the (proprietary) MegaCLI userspace command.

So even on real hardware it is not exported to the VPD (in the case of
the per-LUN address)?  And the per-port address is also not visible in VPD?

I recently added a wwn property to scsi-{hd,cd}, a similar property
should perhaps be added to the megasas device.  We can do the same thing
and add a default.  Not the pointer value, though, because it is not
migratable.  A counter is also problematic for migration when you have
hotplug/hotunplug.  You can instead use something like a CRC32 of the
device id.
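
To illustrate the suggestion above, here is a sketch of deriving a default address from a CRC32 of the device id, so that the value depends only on configuration and stays the same across migration. crc32_bytes() is a plain bitwise CRC-32 (the common reflected polynomial 0xEDB88320); default_sas_address() and its prefix value are hypothetical and chosen only for illustration, not taken from megasas or any registered NAA/OUI assignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, init and final XOR
 * of 0xFFFFFFFF. */
static uint32_t crc32_bytes(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, else zero. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical default: placeholder prefix in the high 32 bits, CRC32 of
 * the device id string in the low 32 bits. */
static uint64_t default_sas_address(const char *device_id)
{
    const uint64_t prefix = 0x50000000u; /* placeholder, not a real NAA value */

    assert(device_id != NULL);
    return (prefix << 32) | crc32_bytes(device_id, strlen(device_id));
}
```

Unlike a pointer value or a hotplug-order counter, the result is a pure function of the id string, so source and destination of a migration would compute the same address.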

Once it's added, we can add support for it in SCSIBusInfo so that it is
exported via VPD.

 Okay, so here's the challenge: We need to generate a system-wide
 unique SAS address, one per SCSI device and one per megasas instance.
 A simple counter won't work, as we might have several qemu instances
 running. Which would result in all of them having the same SAS
 address for the host.

That's not a problem as long as we're not supporting things like
persistent reservations across guests (just like it's not a problem if
you give the same MAC address to network cards with slirp).

Paolo



[Qemu-devel] [PATCH 4/5] qom-cpu: during cpu reset, it will reset its child

2012-07-10 Thread Liu Ping Fan
This gives embedded logic modules, such as the APIC, the opportunity to
reset.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 qom/cpu.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/qom/cpu.c b/qom/cpu.c
index 5b36046..6aea8e6 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -20,10 +20,26 @@
 
 #include "qemu/cpu.h"
 #include "qemu-common.h"
+#include "hw/qdev.h"
+
+static int cpu_reset_kid(Object *child, void *opaque)
+{
+    if (object_is_type_str(child, TYPE_DEVICE)) {
+        device_reset(DEVICE(child));
+    } else if (object_is_type_str(child, TYPE_BUS)) {
+        bus_reset(BUS(child));
+    } else {
+        printf("cpu's child must be DEVICE or BUS");
+        abort();
+    }
+    return 0;
+}
 
 void cpu_reset(CPUState *cpu)
 {
     CPUClass *klass = CPU_GET_CLASS(cpu);
+    Object *obj = OBJECT(cpu);
+    object_child_foreach(obj, cpu_reset_kid, NULL);
 
     if (klass->reset != NULL) {
         (*klass->reset)(cpu);
-- 
1.7.4.4




Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 07:04, Wenchao Xia wrote:
 On 2012-7-9 17:13, Paolo Bonzini wrote:
 On 09/07/2012 10:54, Wenchao Xia wrote:
 Following is my implementing plan draft:
1 introduce libqblock.so in sub directory in qemu.
2 write a nbd client in libqblock, similar to qemu nbd client. Then
 use it to talk with nbd server, by default is qemu-nbd, to get access
 to images. In this way, libqblock.so could be friendly LGPL licensed.

 Did you actually assess the license situation of the block layer?
 block.c and large parts of block/* are under a BSD license, for example.
   If the library only has to support raw files, it might do so using
 synchronous I/O only.  This would remove a large body of GPL-licensed code.

   If the library was built as nbd-client communicating with nbd-server,
 which then employ the BSO licensed code, could the library ignore the
 server side's license problem?

Yes, but if your first worry is the legal problem you are doomed to
design an awful library.

 The reasons for using the nbd-client approach are: it works around the
 qemu block layer license issue and is easy to implement

Working around the QEMU block layer license is not a goal per se,
especially because you haven't a) assessed _what_ is the GPL code that
the library would use; b) told us why the library should not be under
the GPL.

Please design first according to the functionality you want to
implement, then think about the implementation.

 , if
 other tools find this library useful, then consider directly
 employing the qemu block code.

Again, I find this to be quite backwards.  Writing a replacement for the
QEMU block layer just for licensing reasons is going to be a waste of
resources, since 90% of it is already BSD-licensed.

Perhaps you can produce two variants of the library, one using GPLed
backends and one entirely under the BSD license.  That would be good.
However we cannot help much in finding the best way to reach your goals,
again because you haven't reported on what actually is the GPL code that
you're worried about and why.

3 still not got a good way to get additional info in (2)(3)(4),
 currently in my head is patch qemu-nbd to add an additional nbd command,
 image-info, in which returns related info.

 On the Linux kernel mailing list I would have no qualms labeling such
 command as crap.  However, since the social standards on qemu-devel
 are a bit higher, I'll ask instead: what information would the command
 provide beyond the size?

   The API needs to report the image format it is using, such as
 qcow2. The API should also report whether a block at a given offset has
 been allocated or is a hole.

qemu-nbd is designed to always provide the image format as raw, so its
client has no business knowing whether the image is originally stored as
qcow2 or something else.

Paolo



[Qemu-devel] [PATCH 1/5] qdev: introduce qdev_create_kid(Object *parent, const char *type)

2012-07-10 Thread Liu Ping Fan
A DeviceState can be created as a child of a DeviceState/CPUState, not
necessarily attached to a bus. This helps to model real hardware
submodules that sit inside a package.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/qdev.c |   28 
 hw/qdev.h |1 +
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index af54467..d2100a1 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -26,6 +26,7 @@
this API directly.  */
 
 #include "net.h"
+#include "qemu/cpu.h"
 #include "qdev.h"
 #include "sysemu.h"
 #include "error.h"
@@ -145,6 +146,33 @@ DeviceState *qdev_try_create(BusState *bus, const char *type)
     return dev;
 }
 
+DeviceState *qdev_create_kid(Object *parent, const char *type)
+{
+    DeviceState *dev;
+    assert(parent);
+
+    if (object_class_by_name(type) == NULL) {
+        return NULL;
+    }
+
+    if (object_is_type_str(parent, TYPE_BUS)) {
+        return qdev_create(BUS(parent), type);
+    }
+
+    if (!object_is_type_str(parent, TYPE_DEVICE)
+        || !object_is_type_str(parent, TYPE_CPU)) {
+        return NULL;
+    }
+
+    dev = DEVICE(object_new(type));
+    if (!dev) {
+        return NULL;
+    }
+    object_property_add_child(OBJECT(parent), type, OBJECT(dev), NULL);
+
+    return dev;
+}
+
+
 /* Initialize a device.  Device properties should be set before calling
this function.  IRQs and MMIO regions should be connected/mapped after
calling this function.
diff --git a/hw/qdev.h b/hw/qdev.h
index f4683dc..aecc69e 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -154,6 +154,7 @@ typedef struct GlobalProperty {
 
 DeviceState *qdev_create(BusState *bus, const char *name);
 DeviceState *qdev_try_create(BusState *bus, const char *name);
+DeviceState *qdev_create_kid(Object *parent, const char *type);
 bool qdev_exists(const char *name);
 int qdev_device_help(QemuOpts *opts);
 DeviceState *qdev_device_add(QemuOpts *opts);
-- 
1.7.4.4




Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 07:37, Wenchao Xia wrote:

 For getting the other metadata about the disk image you mention, another
 possibility is to just make 'qemu-img info' return the data in a machine
 parseable format, i.e. JSON, and make a client API for extracting data from
 this JSON document.

   Thank you for the idea. The .so is introduced to let programs access the
 image more directly. Parsing strings is not so fast and it depends on
 another program's stdout output, so I hope to get a faster way.

I doubt you actually have profiled it.

Paolo




Re: [Qemu-devel] [PATCH] pseries: Add support for new KVM hash table control call

2012-07-10 Thread Benjamin Herrenschmidt
On Wed, 2012-06-27 at 22:10 +1000, Benjamin Herrenschmidt wrote:
 From: David Gibson da...@gibson.dropbear.id.au
 
 This adds support for the new reset htab ioctl which allows qemu
 to properly cleanup the MMU hash table when the guest is reset. With
 the corresponding kernel support, reset of a guest now works properly.
 
 This also paves the way for indicating a different size hash table
 to the kernel and for the kernel to be able to impose limits on
 the requested size.

Alex, this has a bug, if you already applied it, please sneak:

 +int kvmppc_reset_htab(int shift_hint)
 +{
 +uint32_t shift = shift_hint;
 +
 +if (kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) {

The above shall be if (kvm_enabled() &&

Else it will segfault in kvm_check_extension

Or let me know if I should re-submit.
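
The guard described above can be sketched in isolation. This is a simplified model that assumes the crash comes from the capability check dereferencing KVM state that does not exist when KVM is disabled; kvm_state_ptr, kvm_enabled_sketch(), check_extension_sketch() and CAP_ALLOC_HTAB_SKETCH are hypothetical stand-ins, not the real kvm_enabled(), kvm_check_extension() or KVM_CAP_PPC_ALLOC_HTAB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kvm_state_sketch {
    unsigned caps;                      /* bitmask of supported capabilities */
};

/* Assumption for this sketch: NULL whenever KVM is not in use. */
static struct kvm_state_sketch *kvm_state_ptr;

#define CAP_ALLOC_HTAB_SKETCH 0x1u

static bool kvm_enabled_sketch(void)
{
    return kvm_state_ptr != NULL;
}

/* Dereferences its argument, so it must not be called with NULL. */
static bool check_extension_sketch(const struct kvm_state_sketch *s,
                                   unsigned cap)
{
    return (s->caps & cap) != 0;
}

/* Short-circuit evaluation: the capability check only runs when the
 * state pointer is known to be valid. */
static bool can_reset_htab(void)
{
    return kvm_enabled_sketch() &&
           check_extension_sketch(kvm_state_ptr, CAP_ALLOC_HTAB_SKETCH);
}
```

Because && evaluates left to right and stops at the first false operand, the check is never reached in the non-KVM case.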

Cheers,
Ben.





Re: [Qemu-devel] [PATCH] megasas: disable due to build breakage

2012-07-10 Thread Gerd Hoffmann
On 07/10/12 01:18, Anthony Liguori wrote:
 The Buildbot has detected a new failure on builder default_i386_rhel61 while
 building qemu.
 
 Full details are available at:
  http://buildbot.b1-systems.de/qemu/builders/default_i386_rhel61/builds/304
 
 The proper fix is non-trivial so let's disable the build by default until it's
 fixed properly.

btw: what about make check still failing (for months now)?
I guess we should either disable or fix that one too ...

cheers,
  Gerd




Re: [Qemu-devel] [PATCH 23/25] fdc: Move floppy geometry guessing back from block.c

2012-07-10 Thread Kevin Wolf
On 09.07.2012 19:01, Anthony Liguori wrote:
 On 07/09/2012 11:46 AM, Eric Blake wrote:
 On 07/09/2012 10:07 AM, Markus Armbruster wrote:

 This is an unconditional use of fd_type[0].  If floppy == NULL, this is
 dereferencing an uninitialized value.

 I'm not sure why the explicit initialization was removed...

 Brain fart on my part, sorry.  The old loop assigns only if the drive
 exists.  The new loop assigns unconditionally.  Except the whole loop is
 still conditional.

 Testing can't flag this, because floppy is never null.

 Looks broken indeed. I just wonder why my gcc (or the buildbots) didn't
 complain.

 Me too.  Looks like I should upgrade to a more recent gcc.

 It's probably not the version of the gcc you used, but whether or not
 your CFLAGS include -O2.  Gcc has the (IMO very annoying) limitation
 that uninitialized-use analysis can only be performed if you are also
 doing optimization.  You have to use a tool like clang or Coverity if
 you want more reliable uninitialized-use analysis even while building
 -O0 debug images.

 
 Specifically, without -O, GCC doesn't do data flow analysis so any warning
 that requires DFA won't get triggered.
 
 So in general, if you are normally building with -O0, make sure to also build
 with -O in order to get full warnings.

Just checked it to be sure, this doesn't seem to be the reason:

CFLAGS=-O2 -D_FORTIFY_SOURCE=2 -g

Kevin



Re: [Qemu-devel] [PATCH 4/6] device_tree: Add support for reading device tree properties

2012-07-10 Thread Peter Maydell
On 10 July 2012 07:54, Peter Crosthwaite
peter.crosthwa...@petalogix.com wrote:
 If users care about efficiency over safety, then your get_prop + cast +
 endian_swap approach will always be available to them. I just think a
 single idx arg at the end creates more options for users. We could
 vararg it and if it's omitted it just takes the 0th element to keep the
 call sites for the 90% case a little more concise.

Hmm, OK, I'll have another go and see how the patch comes out.

-- PMM



Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-10 Thread Kevin Wolf
On 09.07.2012 19:35, Corey Bryant wrote:
 
 
 On 07/09/2012 11:46 AM, Kevin Wolf wrote:
 On 09.07.2012 17:05, Corey Bryant wrote:
 I'm not sure this is an issue with current design.  I know things have
 changed a bit as the email threads evolved, so I'll paste the current
 design that I am working from.  Please let me know if you still see any
 issues.

 FD passing:
 ---
 New monitor commands enable adding/removing an fd to/from a set.  New
 monitor command query-fdsets enables querying of current monitor fdsets.
 The set of fds should all refer to the same file, with each fd having
 different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
 the fd that has the matching access mode flags.

 Design points:
 --
 1. add-fd
 - fd is passed via SCM rights and qemu adds fd to first unused fdset
 (e.g. /dev/fdset/1)
 - add-fd monitor function initializes the monitor inuse flag for the
 fdset to true
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=3) to caller

 2. drive_add file=/dev/fdset/1
 - qemu_open uses the first fd in fdset1 that has access flags matching
 the qemu_open action flags and has remove flag set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'device-add' fails that
 refcount is not incremented

 3. add-fd fdset=1
 - fd is passed via SCM rights
 - add-fd monitor function adds the received fd to the specified fdset
 (or fails if fdset doesn't exist)
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=4) to caller

 4. block-commit
 - qemu_open performs reopen by using the first fd from the fdset that
 has access flags matching the qemu_open action flags and has remove flag
 set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'block-commit' fails that
 refcount is not incremented

 5. remove-fd fdset=1 fd=4
 - remove-fd monitor function fails if fdset doesn't exist
 - remove-fd monitor function turns on remove flag for fd=4

 What was again the reason why we keep removed fds in the fdset at all?
 
 Because if refcount is > 0 for the fd set, then the fd could be in use
 by a block device.  So we keep it around until refcount is decremented
 to zero, at which point it is safe to close.
 

 The removed flag would make sense for a fdset after a hypothetical
 close-fdset call because the fdset needs to be kept around until the
 last user closes it, but I think removed fds can be deleted immediately.
 
 fds in an fd set really need to be kept around until zero block devices
 reference them.  At that point, if '(refcount == 0 && (!inuse ||
 remove))' is true, then we'll officially close the fd.

Block devices don't reference an fd in the fdset. There are two
references in a block device. The first one is obviously the file
descriptor they are using; it is a fd dup()ed from an fd in the fdset,
but it's now independent of it. The other reference is the file name
that is kept in the BlockDriverState, and it always points to
/dev/fdset/X, that is, the whole fdset instead of a single fd.

What happens if you remove a file descriptor from an fdset that is in
use, is that you can't reopen the fdset with the flags of the removed
file descriptor any more. Which I believe is exactly the expected
behaviour. libvirt would use this to revoke r/w access, for example (and
which behaviour you already provide by checking removed in qemu_open).
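
The reopen behaviour described above can be sketched as a lookup over the fdset: pick the first fd whose access mode matches the requested open flags and which has not been marked removed. struct fdset_fd and fdset_pick() are hypothetical names for illustration, not the data structures from the patch series; a real qemu_open would dup() the chosen fd rather than return it directly.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one fd inside a monitor fdset. */
struct fdset_fd {
    int fd;         /* descriptor received via SCM_RIGHTS */
    int flags;      /* access flags the fd was opened with */
    bool removed;   /* set by remove-fd */
};

/* Return the first usable fd whose access mode matches open_flags,
 * or -1 if the fdset no longer offers that access mode. */
static int fdset_pick(const struct fdset_fd *fds, size_t n, int open_flags)
{
    for (size_t i = 0; i < n; i++) {
        if (!fds[i].removed &&
            (fds[i].flags & O_ACCMODE) == (open_flags & O_ACCMODE)) {
            return fds[i].fd;
        }
    }
    return -1;
}
```

In this model, marking the O_RDWR entry as removed is enough to revoke read/write reopening while read-only reopening keeps working, which matches the libvirt use case mentioned above.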

Are there any other use cases where it makes a difference whether a file
descriptor is kept in the fdset with removed=1 or whether it's actually
removed from the fdset?

 I think I might have confused remove-fd and close-fdset in earlier
 emails in this thread, so I hope this isn't inconsistent with what I
 said before.

 
 Ok no problem.
 
 6. qemu_close (need to replace all close calls in block layer with
 qemu_close)
 - qemu_close decrements refcount for fdset
 - qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
 - qemu_close frees the fdset if no fds remain in it

 7. disconnecting the QMP monitor
 - monitor disconnect visits all fdsets on monitor and turns off monitor
 in-use flag for fdset

 And close all fds with refcount == 0.

 
 Yes, this makes sense.
 
 It also makes sense to close removed fds with refcount == 0 in the
 remove-fd function.  Basically this will be the same thing we do in
 qemu_close.  We'll close any fds that evaluate the following as true:
 
 (refcount == 0 && (!inuse || remove))

Yes, whatever condition we'll come up with, but it should be the same
and checked in all places where its value might change.
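
One way to keep the condition identical everywhere is to put it in a single helper that qemu_close, remove-fd and the monitor disconnect path would all call. The sketch below only encodes the predicate quoted in this thread; fd_can_close() and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Predicate from the discussion: an fd may be closed once nothing
 * references the fdset any more and either the monitor no longer has the
 * fdset in use or the fd was explicitly removed.
 *
 * refcount: number of users of the fdset
 * inuse:    monitor in-use flag of the fdset
 * removed:  remove flag of this particular fd
 */
static bool fd_can_close(int refcount, bool inuse, bool removed)
{
    return refcount == 0 && (!inuse || removed);
}
```

With a single helper, a later change to the rule only has to be made in one place.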

Kevin



Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-10 Thread Kevin Wolf
On 09.07.2012 20:40, Anthony Liguori wrote:
 On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
 On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
 libvirt's sVirt security driver provides SELinux MAC isolation for
 Qemu guest processes and their corresponding image files.  In other
 words, sVirt uses SELinux to prevent a QEMU process from opening
 files that do not belong to it.

 sVirt provides this support by labeling guests and resources with
 security labels that are stored in file system extended attributes.
 Some file systems, such as NFS, do not support the extended
 attribute security namespace, and therefore cannot support sVirt
 isolation.

 A solution to this problem is to provide fd passing support, where
 libvirt opens files and passes file descriptors to QEMU.  This,
 along with SELinux policy to prevent QEMU from opening files, can
 provide image file isolation for NFS files stored on the same NFS
 mount.

 This patch series adds the pass-fd QMP monitor command, which allows
 an fd to be passed via SCM_RIGHTS, and returns the received file
 descriptor.  Support is also added to the block layer to allow QEMU
 to dup the fd when the filename is of the /dev/fd/X format.  This
 is useful if MAC policy prevents QEMU from opening specific types
 of files.

 I was thinking about some of the sources of complexity when using
 FD passing from libvirt and wanted to raise one idea for discussion
 before we continue.

 With this proposed series, we have usage akin to:

 1. pass_fd FDSET={M} -> returns a string /dev/fd/N showing QEMU's
    view of the FD
 2. drive_add file=/dev/fd/N
 3. if failure:
      close_fd /dev/fd/N

 My problem is that none of this FD passing is transactional.
 
 My original patch series did not suffer from this problem.
 
 QEMU owned the file descriptor once it received it from libvirt.
 
 I don't think the cited problem (QEMU failing an operation if libvirt was
 down) is really an actual problem since it would be libvirt that would be
 issuing the command in the first place (so the command would just fail which
 libvirt would have to assume anyway if it crashed).
 
 I really dislike where this thread has headed with /dev/fdset.  This has
 become extremely complex and cumbersome.

What exactly is complex about the interface we're going to provide? A
long discussion about how to get the details implemented best doesn't
mean at all that the result is complex.

 Perhaps we should reconsider using an RPC for QEMU to request an fd as this 
 solves all the cited problems in a much simpler fashion.

NACK. RPC is wrong and no way easier to handle for management.

Kevin



Re: [Qemu-devel] [PATCH 1/5] qdev: introduce qdev_create_kid(Object *parent, const char *type)

2012-07-10 Thread Andreas Färber
On 10.07.2012 08:16, Liu Ping Fan wrote:
 A DeviceState can be created as a child of a DeviceState/CPUState, not
 necessarily attached to a bus. This helps to model real hardware
 submodules that sit inside a package.
 
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  hw/qdev.c |   28 
  hw/qdev.h |1 +
  2 files changed, 29 insertions(+), 0 deletions(-)
 
 diff --git a/hw/qdev.c b/hw/qdev.c
 index af54467..d2100a1 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -26,6 +26,7 @@
 this API directly.  */
  
  #include "net.h"
 +#include "qemu/cpu.h"
  #include "qdev.h"
  #include "sysemu.h"
  #include "error.h"
 @@ -145,6 +146,33 @@ DeviceState *qdev_try_create(BusState *bus, const char *type)
      return dev;
  }
  
 +DeviceState *qdev_create_kid(Object *parent, const char *type)
 +{
 +    DeviceState *dev;
 +    assert(parent);
 +
 +    if (object_class_by_name(type) == NULL) {
 +        return NULL;
 +    }
 +
 +    if (object_is_type_str(parent, TYPE_BUS)) {

This is only introduced in patch 2, no?

Andreas

 +        return qdev_create(BUS(parent), type);
 +    }
 +
 +    if (!object_is_type_str(parent, TYPE_DEVICE)
 +        || !object_is_type_str(parent, TYPE_CPU)) {
 +        return NULL;
 +    }
 +
 +    dev = DEVICE(object_new(type));
 +    if (!dev) {
 +        return NULL;
 +    }
 +    object_property_add_child(OBJECT(parent), type, OBJECT(dev), NULL);
 +
 +    return dev;
 +}
 +
  /* Initialize a device.  Device properties should be set before calling
 this function.  IRQs and MMIO regions should be connected/mapped after
 calling this function.
 diff --git a/hw/qdev.h b/hw/qdev.h
 index f4683dc..aecc69e 100644
 --- a/hw/qdev.h
 +++ b/hw/qdev.h
 @@ -154,6 +154,7 @@ typedef struct GlobalProperty {
  
  DeviceState *qdev_create(BusState *bus, const char *name);
  DeviceState *qdev_try_create(BusState *bus, const char *name);
 +DeviceState *qdev_create_kid(Object *parent, const char *type);
  bool qdev_exists(const char *name);
  int qdev_device_help(QemuOpts *opts);
  DeviceState *qdev_device_add(QemuOpts *opts);
 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





[Qemu-devel] [Bug 1018530] Re: No write access in a 9p/virtfs shared folder

2012-07-10 Thread Georg Poppe
qemu is running as user libvirt-qemu, not root.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1018530

Title:
  No write access in a 9p/virtfs shared folder

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Fix Released

Bug description:
  Ubuntu version:  Ubuntu 12.04 LTS
  Kernel: 3.2.0-25-generic
  Version of qemu-kvm: 1.0+noroms-0ubuntu13

  I have created a shared folder for a virtual machine which is
  managed by libvirt.

  <filesystem type='mount' accessmode='passthrough'>
  <source dir='/storage/data'/>
  <target dir='data'/>
  <address type='pci' domain='0x' bus='0x00' slot='0x08' function='0x0'/>
  </filesystem>

  I mounted it in the virtual machine with this command:  mount -t 9p -o 
trans=virtio,version=9p2000.L data /data
  The filesystem permissions of all files and folders in the shared folder are 
set to 777. I expected to have full permissions in the virtual machine as 
well.

  Regardless of the permissions on the filesystem I cannot write or create 
files and folders in the virtual machine. The original filesystem (/storage) is 
XFS.
  In another shared folder (similar config in libvirt) which is originally NTFS 
I have no problems.

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: qemu-kvm 1.0+noroms-0ubuntu13
  ProcVersionSignature: Ubuntu 3.2.0-25.40-generic 3.2.18
  Uname: Linux 3.2.0-25-generic x86_64
  ApportVersion: 2.0.1-0ubuntu8
  Architecture: amd64
  Date: Wed Jun 27 20:15:20 2012
  InstallationMedia: Ubuntu-Server 12.04 LTS Precise Pangolin - Beta amd64 
(20120409)
  MachineType: To be filled by O.E.M. To be filled by O.E.M.
  ProcEnviron:
   TERM=xterm
   LANG=de_DE.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-25-generic 
root=/dev/mapper/system-root ro
  SourcePackage: qemu-kvm
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/18/2012
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 1208
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: M5A99X EVO
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev 1.xx
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 3
  dmi.chassis.vendor: To Be Filled By O.E.M.
  dmi.chassis.version: To Be Filled By O.E.M.
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvr1208:bd04/18/2012:svnTobefilledbyO.E.M.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnASUSTeKCOMPUTERINC.:rnM5A99XEVO:rvrRev1.xx:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
  dmi.product.name: To be filled by O.E.M.
  dmi.product.version: To be filled by O.E.M.
  dmi.sys.vendor: To be filled by O.E.M.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1018530/+subscriptions



Re: [Qemu-devel] [PATCH] pseries: Add support for new KVM hash table control call

2012-07-10 Thread Benjamin Herrenschmidt
On Tue, 2012-07-10 at 17:25 +1000, Benjamin Herrenschmidt wrote:
 On Wed, 2012-06-27 at 22:10 +1000, Benjamin Herrenschmidt wrote:
  From: David Gibson da...@gibson.dropbear.id.au
  
  This adds support for the new reset htab ioctl which allows qemu
  to properly cleanup the MMU hash table when the guest is reset. With
  the corresponding kernel support, reset of a guest now works properly.
  
  This also paves the way for indicating a different size hash table
  to the kernel and for the kernel to be able to impose limits on
  the requested size.
 
 Alex, this has a bug, if you already applied it, please sneak:

Actually just drop the whole thing, it also breaks PR KVM, I need
to work a bit more on it.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 0/3] apic: Fixes for userspace model

2012-07-10 Thread Avi Kivity
On 07/09/2012 05:42 PM, Jan Kiszka wrote:
 As Avi noted recently, there is a problem in the way we inject interrupts
 into the userspace APIC under KVM: The TPR check over the iothread may
 race with the VCPU raising the TPR value while in KVM mode. Patch 3
 addresses this issue.

 The other two patches fix problems I came across while thinking about
 the first one.

 Who would like to process this series, uq/master? Or should it go in
 directly?

I applied to uq/master, where it can get some extra testing.  Thanks.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] [PATCH 2/5] qom: introduce object_is_type_str(), so we can judge its type.

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 08:16, Liu Ping Fan wrote:
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  include/qemu/object.h |2 ++
  qom/object.c  |6 ++
  2 files changed, 8 insertions(+), 0 deletions(-)
 
 diff --git a/include/qemu/object.h b/include/qemu/object.h
 index 8b17776..a66e996 100644
 --- a/include/qemu/object.h
 +++ b/include/qemu/object.h
 @@ -479,6 +479,8 @@ void object_initialize(void *obj, const char *typename);
   */
  void object_finalize(void *obj);
  
 +bool object_is_type_str(Object *obj, const char *typename);

Please call this object_is_instance_of, and just call
object_dynamic_cast internally so that interfaces are handled properly.

Paolo
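
The suggestion above can be illustrated with a small stand-alone sketch. It does not use the real QOM API: `ToyType`, `ToyObject`, `toy_dynamic_cast` and `toy_is_instance_of` are hypothetical names invented for this example, and the toy hierarchy has no interfaces. The point is only that the boolean check is a thin wrapper around the cast. One visible difference from the quoted `!target_type || ...` expression is that an unknown type name yields false here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy type record: a name plus a link to the parent type. */
typedef struct ToyType {
    const char *name;
    const struct ToyType *parent;
} ToyType;

typedef struct ToyObject {
    const ToyType *type;
} ToyObject;

/* Return obj if its type, or any ancestor type, is called type_name;
 * return NULL otherwise (including for unknown type names). */
static ToyObject *toy_dynamic_cast(ToyObject *obj, const char *type_name)
{
    const ToyType *t;

    if (obj == NULL || type_name == NULL) {
        return NULL;
    }
    for (t = obj->type; t != NULL; t = t->parent) {
        if (strcmp(t->name, type_name) == 0) {
            return obj;
        }
    }
    return NULL;
}

/* The boolean check simply reuses the cast. */
static bool toy_is_instance_of(ToyObject *obj, const char *type_name)
{
    return toy_dynamic_cast(obj, type_name) != NULL;
}

/* A two-level toy hierarchy: "apic" derives from "device". */
static const ToyType toy_device = { "device", NULL };
static const ToyType toy_apic = { "apic", &toy_device };
```

Keeping a single code path for the cast means any later extension of the cast (for example, interface lookup) is picked up by the check automatically.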

 +
  /**
   * object_dynamic_cast:
   * @obj: The object to cast.
 diff --git a/qom/object.c b/qom/object.c
 index 00bb3b0..6c27d90 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -425,6 +425,12 @@ static bool type_is_ancestor(TypeImpl *type, TypeImpl 
 *target_type)
  return false;
  }
  
 +bool object_is_type_str(Object *obj, const char *typename)
 +{
 +TypeImpl *target_type = type_get_by_name(typename);
 +return !target_type || type_is_ancestor(obj-class-type, target_type);
 +}
 +
  static bool object_is_type(Object *obj, TypeImpl *target_type)
  {
  return !target_type || type_is_ancestor(obj-class-type, target_type);
 





Re: [Qemu-devel] [PATCH] megasas: Fix compilation for 32 bit hosts

2012-07-10 Thread Hannes Reinecke
On 07/10/2012 08:03 AM, Stefan Weil wrote:
 On 10.07.2012 08:00, Hannes Reinecke wrote:
 Hi Stefan,

 you might've seen that Anthony objected to this in general.
 Apparently I'm not allowed to use the instance address to seed the
 SAS address.

 So yes, your fix is valid, but might be pointless as I might have to
 re-do this section anyway.
 But wait and see what Anthony has to say here.

 Cheers,

 Hannes
 
 There remains an additional problem with megasas_dcmd_dump_frame
 because it takes too many arguments.
 
 Builds with the simple trace backend fail.
 
Yeah. Apparently the trace infrastructure can't handle 9 arguments.

I'll be sending a patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)





Re: [Qemu-devel] [PATCH 3/5] qdev: export the bus reset interface

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 08:16, Liu Ping Fan wrote:
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  hw/qdev.c |   17 -
  hw/qdev.h |2 ++
  2 files changed, 14 insertions(+), 5 deletions(-)
 
 diff --git a/hw/qdev.c b/hw/qdev.c
 index d2100a1..f7983e4 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -249,11 +249,9 @@ static int qdev_reset_one(DeviceState *dev, void *opaque)
  
  static int qbus_reset_one(BusState *bus, void *opaque)
  {
 -BusClass *bc = BUS_GET_CLASS(bus);
 -if (bc-reset) {
 -return bc-reset(bus);
 -}
 -return 0;
 +int ret;
 +ret = bus_reset(bus);
 +return ret;
  }
  
  void qdev_reset_all(DeviceState *dev)
 @@ -766,6 +764,15 @@ void device_reset(DeviceState *dev)
  }
  }
  
 +int bus_reset(BusState *bus)
 +{
 +BusClass *bc = BUS_GET_CLASS(bus);
 +if (bc-reset) {
 +return bc-reset(bus);
 +}
 +return 0;
 +}

Two comments:

1) Is this correct? Resetting a bus should reset all the children before
resetting the bus itself.

2) Does it make sense to export it, since we're going towards removing
unnecessary buses?

Paolo
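
The ordering raised in comment 1 (children first, then the bus itself) can be sketched as a post-order walk. This is a toy model, not QEMU code: `ToyNode`, `toy_reset_one`, `toy_reset_all` and the reset log are hypothetical names used only to make the order observable.

```c
#include <stddef.h>

#define TOY_MAX_CHILDREN 4
#define TOY_LOG_SIZE 16

/* Hypothetical node standing in for either a bus or a device. */
typedef struct ToyNode {
    const char *name;
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t n_children;
} ToyNode;

/* Records the order in which nodes were reset. */
static const char *toy_reset_log[TOY_LOG_SIZE];
static size_t toy_reset_count;

/* Reset a single node; here this only appends its name to the log. */
static void toy_reset_one(ToyNode *node)
{
    if (toy_reset_count < TOY_LOG_SIZE) {
        toy_reset_log[toy_reset_count++] = node->name;
    }
}

/* Post-order walk: every child is reset before its parent. */
static void toy_reset_all(ToyNode *node)
{
    size_t i;

    for (i = 0; i < node->n_children; i++) {
        toy_reset_all(node->children[i]);
    }
    toy_reset_one(node);
}
```

A function that only invokes the bus's own reset hook, as the exported bus_reset() in the quoted patch does, corresponds to `toy_reset_one` alone and leaves the walk to the caller.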

 +
  Object *qdev_get_machine(void)
  {
  static Object *dev;
 diff --git a/hw/qdev.h b/hw/qdev.h
 index aecc69e..5f88b4b 100644
 --- a/hw/qdev.h
 +++ b/hw/qdev.h
 @@ -356,6 +356,8 @@ void qdev_machine_init(void);
   */
  void device_reset(DeviceState *dev);
  
 +int bus_reset(BusState *bus);
 +
  const VMStateDescription *qdev_get_vmsd(DeviceState *dev);
  
  const char *qdev_fw_name(DeviceState *dev);
 





Re: [Qemu-devel] [PATCH 4/5] qom-cpu: during cpu reset, it will reset its child

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 08:16, Liu Ping Fan wrote:
 This gives an embedded logic module, such as the APIC, the
 opportunity to reset.
 
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  qom/cpu.c |   16 
  1 files changed, 16 insertions(+), 0 deletions(-)
 
 diff --git a/qom/cpu.c b/qom/cpu.c
 index 5b36046..6aea8e6 100644
 --- a/qom/cpu.c
 +++ b/qom/cpu.c
 @@ -20,10 +20,26 @@
  
  #include qemu/cpu.h
  #include qemu-common.h
 +#include hw/qdev.h
 +
 +static int cpu_reset_kid(Object *child, void *opaque)
 +{
 +if (object_is_type_str(child, TYPE_DEVICE)) {
 +device_reset(DEVICE(child));
 +} else if (object_is_type_str(child, TYPE_BUS)) {
 +bus_reset(BUS(child));
 +} else {
 +printf(cpu's child must be DEVICE or BUS);
 +abort();
 +}
 +return 0;
 +}
  
  void cpu_reset(CPUState *cpu)
  {
  CPUClass *klass = CPU_GET_CLASS(cpu);
 +Object *obj = OBJECT(cpu);
 +object_child_foreach(obj, cpu_reset_kid, NULL);

Ok, now I see what you want to do.  Next time, please add meaningful
commit messages to all patches in the series, even those that only add
infrastructure.

It really looks like the time is ripe to make CPUs children of Device, so
you can just use qdev_reset_all to reset the CPU.

Paolo

  if (klass-reset != NULL) {
  (*klass-reset)(cpu);
 





Re: [Qemu-devel] [PATCH 5/5] apic: create apic as a child of cpu, not system_bus any longer

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 08:16, Liu Ping Fan wrote:
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  hw/pc.c |   10 +-
  1 files changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/hw/pc.c b/hw/pc.c
 index c7e9ab3..8df58c9 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -886,17 +886,17 @@ DeviceState *cpu_get_current_apic(void)
  }
  }
  
 -static DeviceState *apic_init(void *env, uint8_t apic_id)
 +static DeviceState *apic_init(void *cpu, void *env, uint8_t apic_id)
  {
  DeviceState *dev;
  static int apic_mapped;
  
  if (kvm_irqchip_in_kernel()) {
 -dev = qdev_create(NULL, kvm-apic);
 +dev = qdev_create_kid(OBJECT(cpu), kvm-apic);
  } else if (xen_enabled()) {
 -dev = qdev_create(NULL, xen-apic);
 +dev = qdev_create_kid(OBJECT(cpu), xen-apic);
  } else {
 -dev = qdev_create(NULL, apic);
 +dev = qdev_create_kid(OBJECT(cpu), apic);
  }

Does it make sense instead to do this in the realize method of the CPU?

Paolo

  
  qdev_prop_set_uint8(dev, id, apic_id);
 @@ -945,7 +945,7 @@ static X86CPU *pc_new_cpu(const char *cpu_model)
  }
  env = cpu-env;
  if ((env-cpuid_features  CPUID_APIC) || smp_cpus  1) {
 -env-apic_state = apic_init(env, env-cpuid_apic_id);
 +env-apic_state = apic_init(cpu, env, env-cpuid_apic_id);
  }
  qemu_register_reset(pc_cpu_reset, cpu);
  pc_cpu_reset(cpu);
 





Re: [Qemu-devel] [PATCH 1/5] qdev: introduce qdev_create_kid(Object *parent, const char *type)

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 08:16, Liu Ping Fan wrote:
 A DeviceState can be created as a child of a DeviceState/CPUState, not necessarily
 attached to a bus. This will be helpful for simulating a real hardware
 submodule that sits inside a package.
 
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  hw/qdev.c |   28 
  hw/qdev.h |1 +
  2 files changed, 29 insertions(+), 0 deletions(-)
 
 diff --git a/hw/qdev.c b/hw/qdev.c
 index af54467..d2100a1 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -26,6 +26,7 @@
 this API directly.  */
  
  #include net.h
 +#include qemu/cpu.h
  #include qdev.h
  #include sysemu.h
  #include error.h
 @@ -145,6 +146,33 @@ DeviceState *qdev_try_create(BusState *bus, const char 
 *type)
  return dev;
  }
  
 +DeviceState *qdev_create_kid(Object *parent, const char *type)
 +{
 +DeviceState *dev;
 +assert(parent);
 +
 +if (object_class_by_name(type) == NULL) {
 +return NULL;
 +}
 +
 +if (object_is_type_str(parent, TYPE_BUS)) {
 +return qdev_create(BUS(parent), type);
 +}
 +
 +if (!object_is_type_str(parent, TYPE_DEVICE)
 +|| !object_is_type_str(parent, TYPE_CPU)) {
 +return NULL;
 +}
 +
 +dev = DEVICE(object_new(type));
 +if (!dev) {
 +return NULL;
 +}
 +object_property_add_child(OBJECT(parent), type, OBJECT(dev), NULL);

I don't like this.  The only additional functionality is magic
dispatching between qdev_create for buses and object_property_add_child
for devices.  This should be done with a method that is implemented in
both objects (e.g. an interface), not with type checks like this.

However, you're not even using the functionality, and designing APIs
without an effective need usually makes for bad APIs.

Instead, you can just move APIC creation in the CPU, and use
object_property_add_child there.

Paolo
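
The difference between type checks and a method implemented by both kinds of parent can be sketched as follows. This is a generic C illustration, not the QOM interface mechanism: `ToyParent`, `ToyParentOps`, `toy_add_child` and the two callbacks are hypothetical names, and the counters only stand in for the real attach logic.

```c
#include <stddef.h>

typedef struct ToyParent ToyParent;

/* Hypothetical operation table: each kind of parent supplies add_child. */
typedef struct ToyParentOps {
    int (*add_child)(ToyParent *parent, const char *child_type);
} ToyParentOps;

struct ToyParent {
    const ToyParentOps *ops;
    int attached_via_bus;       /* children attached the "bus" way */
    int attached_as_property;   /* children attached as child properties */
};

/* Bus flavour: stands in for creating the device on the bus. */
static int toy_bus_add_child(ToyParent *parent, const char *child_type)
{
    (void)child_type;
    parent->attached_via_bus++;
    return 0;
}

/* Device/CPU flavour: stands in for adding a child property. */
static int toy_device_add_child(ToyParent *parent, const char *child_type)
{
    (void)child_type;
    parent->attached_as_property++;
    return 0;
}

static const ToyParentOps toy_bus_ops = { toy_bus_add_child };
static const ToyParentOps toy_device_ops = { toy_device_add_child };

/* Generic entry point: no type checks, only a call through the method. */
static int toy_add_child(ToyParent *parent, const char *child_type)
{
    if (parent == NULL || parent->ops == NULL ||
        parent->ops->add_child == NULL) {
        return -1;
    }
    return parent->ops->add_child(parent, child_type);
}
```

With this shape, supporting a new kind of parent means supplying one more operation table rather than extending a chain of type comparisons in the generic function.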

 +return dev;
 +}
 +
  /* Initialize a device.  Device properties should be set before calling
 this function.  IRQs and MMIO regions should be connected/mapped after
 calling this function.
 diff --git a/hw/qdev.h b/hw/qdev.h
 index f4683dc..aecc69e 100644
 --- a/hw/qdev.h
 +++ b/hw/qdev.h
 @@ -154,6 +154,7 @@ typedef struct GlobalProperty {
  
  DeviceState *qdev_create(BusState *bus, const char *name);
  DeviceState *qdev_try_create(BusState *bus, const char *name);
 +DeviceState *qdev_create_kid(Object *parent, const char *type);
  bool qdev_exists(const char *name);
  int qdev_device_help(QemuOpts *opts);
  DeviceState *qdev_device_add(QemuOpts *opts);
 





Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-10 Thread Daniel P. Berrange
On Mon, Jul 09, 2012 at 04:00:37PM -0300, Luiz Capitulino wrote:
 On Mon, 09 Jul 2012 13:40:34 -0500
 Anthony Liguori aligu...@us.ibm.com wrote:
 
  On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
   On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
   libvirt's sVirt security driver provides SELinux MAC isolation for
   Qemu guest processes and their corresponding image files.  In other
   words, sVirt uses SELinux to prevent a QEMU process from opening
   files that do not belong to it.
  
   sVirt provides this support by labeling guests and resources with
   security labels that are stored in file system extended attributes.
   Some file systems, such as NFS, do not support the extended
   attribute security namespace, and therefore cannot support sVirt
   isolation.
  
   A solution to this problem is to provide fd passing support, where
   libvirt opens files and passes file descriptors to QEMU.  This,
   along with SELinux policy to prevent QEMU from opening files, can
   provide image file isolation for NFS files stored on the same NFS
   mount.
  
   This patch series adds the pass-fd QMP monitor command, which allows
   an fd to be passed via SCM_RIGHTS, and returns the received file
   descriptor.  Support is also added to the block layer to allow QEMU
   to dup the fd when the filename is of the /dev/fd/X format.  This
   is useful if MAC policy prevents QEMU from opening specific types
   of files.
  
   I was thinking about some of the sources of complexity when using
   FD passing from libvirt and wanted to raise one idea for discussion
   before we continue.
  
   With this proposed series, we have usage akin to:
  
  1. pass_fd FDSET={M} -  returns a string /dev/fd/N showing QEMU's
 view of the FD
  2. drive_add file=/dev/fd/N
  3. if failure:
   close_fd /dev/fd/N
  
   My problem is that none of this FD passing is transactional.
  
  My original patch series did not suffer from this problem.
  
  QEMU owned the file descriptor once it received it from libvirt.
  
  I don't think the cited problem (QEMU failing an operation if libvirt was
  down) is really an actual problem since it would be libvirt that would be
  issuing the command in the first place (so the command would just fail which
  libvirt would have to assume anyway if it crashed).
  
  I really dislike where this thread has headed with /dev/fdset.  This has
  become extremely complex and cumbersome.
 
 I agree, maybe it's time to start over and discuss the original problem again.

I must say, I'm not entirely sure of all the problems we're trying to
solve anymore. I don't think we've ever clearly stated in this thread
what all the requirements/problems are, so I'm finding it hard to see
what the optimal solution is.
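
For readers unfamiliar with the SCM_RIGHTS mechanism mentioned in the quoted cover letter, here is a minimal sketch of passing one file descriptor over a connected Unix domain socket with the POSIX sendmsg()/recvmsg() calls. It is not taken from the patch series; `toy_send_fd` and `toy_recv_fd` are hypothetical helper names, and error handling is reduced to returning -1.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send fd over the connected Unix socket sock.
 * Returns 0 on success, -1 on error. */
static int toy_send_fd(int sock, int fd)
{
    char byte = 'F';   /* one byte of ordinary data carries the message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;   /* forces suitable alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    /* The descriptor travels as ancillary data of type SCM_RIGHTS. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.
 * Returns the new descriptor, or -1 on error. */
static int toy_recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets its own descriptor number referring to the same open file, which is why the sender needs to be told the receiver's view of the fd (the /dev/fd/N string in the usage quoted above).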


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [RFC][PATCH v2 0/4] tcg: enhance code generation quality for qemu_ld/st IRs

2012-07-10 Thread Yeongkyoon Lee

On 2012-07-05 22:23, Yeongkyoon Lee wrote:

The features are summarized as follows.
  - All the changes are wrapped by the macro CONFIG_QEMU_LDST_OPTIMIZATION and
    are disabled by default.
  - They are enabled by configure --enable-ldst-optimization and need
    CONFIG_SOFTMMU.
  - They do not work with CONFIG_TCG_PASS_AREG0 because it seems better to
    apply them after the AREG0 code has stabilized.
  - Currently, they support only x86 and x86-64 and have been tested with x86
    and ARM Linux targets on x86/x86-64 host platforms.
  - Build tests have been done for all targets.


I'd like to summarize the community's feedback and observations and propose
a new patch for ldst optimization.


* Feedback/observations
1. It needs to work with PASS_AREG0 (CONFIG_TCG_PASS_AREG0).
2. It does not need to be configurable by the user.
3. Targets should be ldst-optimized on x86/64 hosts unconditionally rather
than optionally.
4. CONFIG_QEMU_LDST_OPTIMIZATION looks necessary because common code
(e.g. tcg.h/tcg.c) is also used on non-x86 hosts and it should support
non-softmmu targets.
5. It might need two versions of the MMU helpers, standard and extended,
simultaneously because C code might want to call the standard version.


* Modification proposals
1. Apply ldst optimization also when PASS_AREG0 is enabled.
2. Make softmmu targets always use ldst optimization on x86/64 hosts.
But testing for many targets is an issue...
3. Make the target mem(op) helpers provide the extended MMU helpers and
softmmu_header.h provide the standard MMU helpers.
4. Fix some typos and redundant checks.

I'm not sure whether my proposals are feasible; however, I'd like to try
them.

What do you think about it?


Yeongkyoon Lee (4):
   tcg: add declarations and templates of extended MMU helpers
   tcg: add extended MMU helpers to softmmu targets
   tcg: add optimized TCG qemu_ld/st generation
   configure: add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st
 optimization

  configure |   15 ++
  softmmu_defs.h|   13 ++
  softmmu_template.h|   51 +--
  target-alpha/mem_helper.c |   22 +++
  target-arm/op_helper.c|   23 +++
  target-cris/op_helper.c   |   22 +++
  target-i386/mem_helper.c  |   22 +++
  target-lm32/op_helper.c   |   23 +++-
  target-m68k/op_helper.c   |   22 +++
  target-microblaze/op_helper.c |   22 +++
  target-mips/op_helper.c   |   22 +++
  target-ppc/mem_helper.c   |   22 +++
  target-s390x/op_helper.c  |   22 +++
  target-sh4/op_helper.c|   22 +++
  target-sparc/ldst_helper.c|   23 +++
  target-xtensa/op_helper.c |   22 +++
  tcg/i386/tcg-target.c |  328 +
  tcg/tcg.c |   12 ++
  tcg/tcg.h |   35 +
  19 files changed, 732 insertions(+), 11 deletions(-)








[Qemu-devel] [PATCH v3 00/29] Disk geometry cleanup

2012-07-10 Thread Markus Armbruster
29 patches may look discouraging, but most patches are small, and the
ones that aren't just move code around.

Goals of this series:

1. One more step towards a clean separation of the block device's host and
   guest parts.

2. Purge CHS geometry from the block layer

Part I    [PATCH 01/29]:    Floppy geometry

Part II   [PATCH 02-03/29]: vvfat geometry bug fixes

Part III  [PATCH 04-10/29]: Clean up hard disk geometry guessing code

Part IV   [PATCH 11-12/29]: Clean up CMOS hard disk info setup

Part V    [PATCH 13-24/29]: qdev properties for disk geometry

Part VI   [PATCH 25-29/29]: A few more fixes and cleanups

A few more cleanups are in the works, in particular geometry checking
code duplication pointed out by Kevin.

This patch series is also available at
git://repo.or.cz/qemu/armbru.git
tag geo-v3

v3: Rebase; drop the three patches that have been committed already
    Fix uninitialized variable in PATCH 02/29 (Anthony)

v2: New hw/block-common.h (Blue & Kevin)
    Coding style here & there (Blue)
    Tracepoint parameter types (Stefan)

Markus Armbruster (29):
  fdc: Move floppy geometry guessing back from block.c
  vvfat: Fix partition table
  vvfat: Do not clobber the user's geometry
  qtest: Add hard disk geometry test
  hd-geometry: Move disk geometry guessing back from block.c
  hd-geometry: Add tracepoints
  hd-geometry: Unnest conditional in hd_geometry_guess()
  hd-geometry: Factor out guess_chs_for_size()
  hd-geometry: Clean up gratuitous goto in hd_geometry_guess()
  hd-geometry: Clean up confusing use of prior translation hint
  hd-geometry: Cut out block layer translation middleman
  ide pc: Cut out the block layer geometry middleman
  blockdev: Save geometry in DriveInfo
  qdev: Introduce block geometry properties
  hd-geometry: Switch to uint32_t to match BlockConf
  scsi-hd: qdev properties for disk geometry
  virtio-blk: qdev properties for disk geometry
  ide: qdev properties for disk geometry
  qtest: Cover qdev properties for disk geometry
  qdev: Collect private helpers in one place
  qdev: New property type chs-translation
  ide: qdev property for BIOS CHS translation
  qtest: Cover qdev property for BIOS CHS translation
  block: Geometry and translation hints are now useless, purge them
  ide pc: Put hard disk info into CMOS only for hard disks
  qtest: Test we don't put hard disk info into CMOS for a CD-ROM
  hd-geometry: Compute BIOS CHS translation in one place
  blockdev: Drop redundant CHS validation for if=ide
  Relax IDE CHS limits from 16383,16,63 to 65535,16,255

 block.c  |  254 --
 block.h  |   39 +
 block/vvfat.c|   57 ---
 block_int.h  |1 -
 blockdev.c   |   24 +--
 blockdev.h   |2 +
 hw/Makefile.objs |2 +-
 hw/block-common.h|   29 
 hw/fdc.c |  122 +--
 hw/fdc.h |   10 +-
 hw/hd-geometry.c |  157 ++
 hw/ide.h |4 +-
 hw/ide/core.c|   30 +++-
 hw/ide/internal.h|7 +-
 hw/ide/qdev.c|   46 +-
 hw/pc.c  |   78 --
 hw/qdev-properties.c |  160 ++-
 hw/qdev.h|3 +
 hw/s390-virtio-bus.c |1 +
 hw/scsi-disk.c   |   70 ++---
 hw/virtio-blk.c  |   42 -
 hw/virtio-pci.c  |1 +
 tests/Makefile   |2 +
 tests/hd-geo-test.c  |  428 ++
 trace-events |4 +
 vl.c |2 +-
 26 files changed, 1067 insertions(+), 508 deletions(-)
 create mode 100644 hw/block-common.h
 create mode 100644 hw/hd-geometry.c
 create mode 100644 tests/hd-geo-test.c

-- 
1.7.6.5




[Qemu-devel] [PATCH v3 16/29] scsi-hd: qdev properties for disk geometry

2012-07-10 Thread Markus Armbruster
Geometry needs to be qdev properties, because it belongs to the
disk's guest part.

Maintain backward compatibility exactly like for serial: fall back to
DriveInfo's geometry, set with -drive cyls=...

Do this only for scsi-hd.  scsi-disk is legacy.  scsi-cd doesn't have
a geometry.  scsi-block should get geometry from the host disk.

Bonus: info qtree now shows the geometry.
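
The fallback order described above (explicit property, then the legacy -drive value, then a guess, followed by range checks) can be sketched in isolation. This is a simplified model, not the code from the patch: `ToyGeometry`, `toy_guess` and `toy_resolve_geometry` are hypothetical names, the guessed values are arbitrary placeholders, and the limits are the ones named in the patch's error messages (cyls 1..65535, heads and secs 1..255).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ToyGeometry {
    uint32_t cyls;
    uint32_t heads;
    uint32_t secs;
} ToyGeometry;

/* A geometry counts as unset only if all three fields are zero. */
static int toy_geometry_unset(const ToyGeometry *g)
{
    return !g->cyls && !g->heads && !g->secs;
}

/* Stand-in for a real guesser; the numbers are arbitrary placeholders. */
static void toy_guess(ToyGeometry *g)
{
    g->cyls = 1024;
    g->heads = 16;
    g->secs = 63;
}

/* Resolve conf in place. Returns 0 if the result is valid, -1 otherwise.
 * legacy may be NULL when no legacy value exists. */
static int toy_resolve_geometry(ToyGeometry *conf, const ToyGeometry *legacy)
{
    if (toy_geometry_unset(conf) && legacy != NULL) {
        *conf = *legacy;            /* value set the legacy way */
    }
    if (toy_geometry_unset(conf)) {
        toy_guess(conf);            /* last resort */
    }
    /* After guessing, some field is always non-zero, so always validate. */
    if (conf->cyls < 1 || conf->cyls > 65535) {
        return -1;
    }
    if (conf->heads < 1 || conf->heads > 255) {
        return -1;
    }
    if (conf->secs < 1 || conf->secs > 255) {
        return -1;
    }
    return 0;
}
```

A partially specified geometry (for example only cyls set) is not treated as unset, so it fails validation instead of being silently completed.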

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/scsi-disk.c |   69 +--
 1 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index c881acf..0a182f9 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -966,9 +966,6 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 [MODE_PAGE_AUDIO_CTL]  = (1  TYPE_ROM),
 [MODE_PAGE_CAPABILITIES]   = (1  TYPE_ROM),
 };
-
-BlockDriverState *bdrv = s-qdev.conf.bs;
-uint32_t cylinders, heads, secs;
 uint8_t *p = *p_outbuf;
 
 if ((mode_sense_valid[page]  (1  s-qdev.type)) == 0) {
@@ -990,19 +987,18 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 break;
 }
 /* if a geometry hint is available, use it */
-hd_geometry_guess(bdrv, cylinders, heads, secs, NULL);
-p[2] = (cylinders  16)  0xff;
-p[3] = (cylinders  8)  0xff;
-p[4] = cylinders  0xff;
-p[5] = heads  0xff;
+p[2] = (s-qdev.conf.cyls  16)  0xff;
+p[3] = (s-qdev.conf.cyls  8)  0xff;
+p[4] = s-qdev.conf.cyls  0xff;
+p[5] = s-qdev.conf.heads  0xff;
 /* Write precomp start cylinder, disabled */
-p[6] = (cylinders  16)  0xff;
-p[7] = (cylinders  8)  0xff;
-p[8] = cylinders  0xff;
+p[6] = (s-qdev.conf.cyls  16)  0xff;
+p[7] = (s-qdev.conf.cyls  8)  0xff;
+p[8] = s-qdev.conf.cyls  0xff;
 /* Reduced current start cylinder, disabled */
-p[9] = (cylinders  16)  0xff;
-p[10] = (cylinders  8)  0xff;
-p[11] = cylinders  0xff;
+p[9] = (s-qdev.conf.cyls  16)  0xff;
+p[10] = (s-qdev.conf.cyls  8)  0xff;
+p[11] = s-qdev.conf.cyls  0xff;
 /* Device step rate [ns], 200ns */
 p[12] = 0;
 p[13] = 200;
@@ -1024,18 +1020,17 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 p[2] = 5000  8;
 p[3] = 5000  0xff;
 /* if a geometry hint is available, use it */
-hd_geometry_guess(bdrv, cylinders, heads, secs, NULL);
-p[4] = heads  0xff;
-p[5] = secs  0xff;
+p[4] = s-qdev.conf.heads  0xff;
+p[5] = s-qdev.conf.secs  0xff;
 p[6] = s-qdev.blocksize  8;
-p[8] = (cylinders  8)  0xff;
-p[9] = cylinders  0xff;
+p[8] = (s-qdev.conf.cyls  8)  0xff;
+p[9] = s-qdev.conf.cyls  0xff;
 /* Write precomp start cylinder, disabled */
-p[10] = (cylinders  8)  0xff;
-p[11] = cylinders  0xff;
+p[10] = (s-qdev.conf.cyls  8)  0xff;
+p[11] = s-qdev.conf.cyls  0xff;
 /* Reduced current start cylinder, disabled */
-p[12] = (cylinders  8)  0xff;
-p[13] = cylinders  0xff;
+p[12] = (s-qdev.conf.cyls  8)  0xff;
+p[13] = s-qdev.conf.cyls  0xff;
 /* Device step rate [100us], 100us */
 p[14] = 0;
 p[15] = 1;
@@ -1755,6 +1750,33 @@ static int scsi_initfn(SCSIDevice *dev)
 return -1;
 }
 
+if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
+/* try to fall back to value set with legacy -drive cyls=... */
+dinfo = drive_get_by_blockdev(s-qdev.conf.bs);
+dev-conf.cyls = dinfo-cyls;
+dev-conf.heads = dinfo-heads;
+dev-conf.secs = dinfo-secs;
+}
+if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
+hd_geometry_guess(s-qdev.conf.bs,
+  dev-conf.cyls, dev-conf.heads, dev-conf.secs,
+  NULL);
+}
+if (dev-conf.cyls || dev-conf.heads || dev-conf.secs) {
+if (dev-conf.cyls  1 || dev-conf.cyls  65535) {
+error_report(cyls must be between 1 and 65535);
+return -1;
+}
+if (dev-conf.heads  1 || dev-conf.heads  255) {
+error_report(heads must be between 1 and 255);
+return -1;
+}
+if (dev-conf.secs  1 || dev-conf.secs  255) {
+error_report(secs must be between 1 and 255);
+return -1;
+}
+}
+
 if (!s-serial) {
 /* try to fall back to value set with legacy -drive serial=... */
 dinfo = drive_get_by_blockdev(s-qdev.conf.bs);
@@ -1975,6 +1997,7 @@ static Property scsi_hd_properties[] = {
 DEFINE_PROP_BIT(dpofua, SCSIDiskState, features,
 SCSI_DISK_F_DPOFUA, false),
 DEFINE_PROP_HEX64(wwn, SCSIDiskState, wwn, 0),
+

Re: [Qemu-devel] [PATCH 3/9] isa: Add a way to query for a free interrupt

2012-07-10 Thread Paolo Bonzini
On 09/07/2012 21:17, miny...@acm.org wrote:
 From: Corey Minyard cminy...@mvista.com
 
 This lets devices that don't care about their interrupt number, like
 IPMI, just grab any unused interrupt.

I would try to avoid this.  It is too dependent on the actual
initialization order and command line.  Just pick a reasonable value for
the interrupt (5?) and make it customizable.

I only gave a cursory look at the series, but it looks really well done
except for command-line parsing.  I'll reply to the patches individually.

Paolo





Re: [Qemu-devel] [PATCH 5/9] IPMI: Add a PC ISA type structure

2012-07-10 Thread Paolo Bonzini
On 09/07/2012 21:17, miny...@acm.org wrote:
 From: Corey Minyard cminy...@mvista.com
 
 This provides the base infrastructure to tie IPMI low-level
 interfaces into a PC ISA bus.
 
 Signed-off-by: Corey Minyard cminy...@mvista.com
 ---
  default-configs/i386-softmmu.mak   |1 +
  default-configs/x86_64-softmmu.mak |1 +
  hw/Makefile.objs   |1 +
  hw/isa_ipmi.c  |  138 
 
  hw/pc.c|   12 +++
  hw/pc.h|   18 +
  hw/smbios.h|   12 +++
  7 files changed, 183 insertions(+), 0 deletions(-)
  create mode 100644 hw/isa_ipmi.c
 
 diff --git a/default-configs/i386-softmmu.mak 
 b/default-configs/i386-softmmu.mak
 index eb17afc..c0aff0d 100644
 --- a/default-configs/i386-softmmu.mak
 +++ b/default-configs/i386-softmmu.mak
 @@ -8,6 +8,7 @@ CONFIG_VGA_CIRRUS=y
  CONFIG_VMWARE_VGA=y
  CONFIG_VMMOUSE=y
  CONFIG_IPMI=y
 +CONFIG_ISA_IPMI=y
  CONFIG_SERIAL=y
  CONFIG_PARALLEL=y
  CONFIG_I8254=y
 diff --git a/default-configs/x86_64-softmmu.mak 
 b/default-configs/x86_64-softmmu.mak
 index e4e3e4f..615e4f2 100644
 --- a/default-configs/x86_64-softmmu.mak
 +++ b/default-configs/x86_64-softmmu.mak
 @@ -8,6 +8,7 @@ CONFIG_VGA_CIRRUS=y
  CONFIG_VMWARE_VGA=y
  CONFIG_VMMOUSE=y
  CONFIG_IPMI=y
 +CONFIG_ISA_IPMI=y
  CONFIG_SERIAL=y
  CONFIG_PARALLEL=y
  CONFIG_I8254=y
 diff --git a/hw/Makefile.objs b/hw/Makefile.objs
 index 0d55997..8f27ffe 100644
 --- a/hw/Makefile.objs
 +++ b/hw/Makefile.objs
 @@ -21,6 +21,7 @@ hw-obj-$(CONFIG_ESCC) += escc.o
  hw-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o
  
  hw-obj-$(CONFIG_IPMI) += ipmi.o
 +hw-obj-$(CONFIG_ISA_IPMI) += isa_ipmi.o
  
  hw-obj-$(CONFIG_SERIAL) += serial.o
  hw-obj-$(CONFIG_PARALLEL) += parallel.o
 diff --git a/hw/isa_ipmi.c b/hw/isa_ipmi.c
 new file mode 100644
 index 000..cad78b0
 --- /dev/null
 +++ b/hw/isa_ipmi.c
 @@ -0,0 +1,138 @@
 +/*
 + * QEMU ISA IPMI KCS emulation
 + *
 + * Copyright (c) 2012 Corey Minyard, MontaVista Software, LLC
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +#include hw.h
 +#include isa.h
 +#include pc.h
 +#include qemu-timer.h
 +#include sysemu.h
 +#include smbios.h
 +#include ipmi.h
 +
 +
 +typedef struct ISAIPMIState {
 +ISADevice dev;
 +uint32_t type;
 +uint32_t iobase;
 +uint32_t isairq;
 +uint8_t slave_addr;
 +IPMIState state;
 +} ISAIPMIState;

I would prefer to have a TYPE_IPMI_INTERFACE class for IPMIState.  Then
you can subclass that class in patches 6/7/8, and include a pointer in
ISAIPMIState.

Paolo

 +static int ipmi_isa_initfn(ISADevice *dev)
 +{
 +ISAIPMIState *isa = DO_UPCAST(ISAIPMIState, dev, dev);
 +struct smbios_type_38 smb38;
 +
 +if (isa-iobase == -1) {
 + /* If no I/O base is specified, set the defaults */
 + switch (isa-type) {
 + case IPMI_KCS:
 + isa-iobase = 0xca2;
 + break;
 + case IPMI_BT:
 + isa-iobase = 0xe4;
 + break;
 + case IPMI_SMIC:
 + isa-iobase = 0xca9;
 + break;
 + default:
 + fprintf(stderr, Unknown IPMI type: %d\n, isa-type);
 + abort();
 + }
 +}
 +
 +isa-state.slave_addr = isa-slave_addr;
 +
 +qdev_set_legacy_instance_id(dev-qdev, isa-iobase, 3);
 +
 +ipmi_init(isa-type, isa-state);
 +
 +if (isa-isairq  0) {
 + isa_init_irq(dev, isa-state.irq, isa-isairq);
 + isa-state.use_irq = 1;
 +}
 +
 +isa_register_ioport(dev, isa-state.io, isa-iobase);
 +
 +smb38.header.type = 38;
 +smb38.header.length = sizeof(smb38);
 +smb38.header.handle = 0x3000;
 +smb38.interface_type = isa-state.smbios_type;
 +smb38.ipmi_version = 0x20;
 +smb38.i2c_slave_addr = isa-state.slave_addr;
 +smb38.nv_storage_dev_addr = 0;
 +
 +/* or 1 to set it to I/O space */
 +smb38.base_addr = isa-iobase | 1;
 +
 + /* 1-byte 

Re: [Qemu-devel] [PATCH 4/9] Add a base IPMI interface

2012-07-10 Thread Daniel P. Berrange
On Mon, Jul 09, 2012 at 02:17:04PM -0500, miny...@acm.org wrote:
 diff --git a/qemu-options.hx b/qemu-options.hx
 index 125a4da..823f6bc 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -2204,6 +2204,41 @@ Three button serial mouse. Configure the guest to use 
 Microsoft protocol.
  @end table
  ETEXI
  
 +DEF(ipmi, HAS_ARG, QEMU_OPTION_ipmi, \
 +-ipmi [kcs|bt,]dev|local|none  IPMI interface to the dev, or internal 
 BMC\n,
 +QEMU_ARCH_ALL)
 +STEXI
 +@item -ipmi [bt|kcs,]@var{dev}|local|none
 +@findex -ipmi
 +Set up an IPMI interface.  The physical interface may either be
 +KCS or BT, the default is KCS.  Two options are available for
 +simulation of the IPMI BMC.  If @code{local} is specified, then a
 +minimal internal BMC is used.  This BMC is basically useful as a
 +watchdog timer and for fooling a system into thinking IPMI is there.
 +
 +If @var{dev} is specified (see the serial section above for details on
 +what can be specified for @var{dev}) then a connection to an external IPMI
 +simulator is made.  This interface has the ability to do power control
 +and reset, so it can do the normal IPMI types of things required.

 +The OpenIPMI project's lanserv simulator is capable of providing
 +this interface.  It is also capable of an IPMI LAN interface, and
 +you can do power control (the lanserv simulator is capable of starting
 +a VM, too) and reset of a virtual machine over a standard remote LAN
 +interface.  For details on this, see OpenIPMI.
 +
 +The remote connection to a LAN interface will reconnect if disconnected,
 +so if a remote BMC fails and restarts, it will still be usable.
 +
 +For instance, to connect to an external interface on the local machine
 +port 9002 with a BT physical interface, do the following:
 +@table @code
 +@item -ipmi bt,tcp:localhost:9002
 +@end table
 +
 +Use @code{-ipmi none} to disable IPMI.
 +ETEXI

I tend to question the wisdom of exposing a remote accessible TCP socket
with no encryption or authentication, which can be used to shutdown/reset
QEMU instances, and who knows what other functions in the future.

While it might be claimed that one would only enable this if QEMU were
on a trusted management LAN, IMHO, current network threats/attacks
mean there is really no such thing as a trusted LAN anymore. So I can't
see this being practical to actually use in a production deployment.


BTW, the syntax you show here is the legacy approach, where front-end
and back-end device configuration are mixed.  Does your patch work with
the modern QEMU syntax, which is something like

  -chardev name=ipmi0,tcp:localhost:9002 -device bt,chardev=ipmi0

If it doesn't work, then you'll need to update your patches to support
this approach.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH v3 05/29] hd-geometry: Move disk geometry guessing back from block.c

2012-07-10 Thread Markus Armbruster
Commit f3d54fc4 factored it out of hw/ide.c for reuse.  Sensible,
except it was put into block.c.  Device-specific functionality should
be kept in device code, not the block layer.  Move it to
hw/hd-geometry.c, and make stylistic changes required to keep
checkpatch.pl happy.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block.c   |  121 ---
 block.h   |1 -
 blockdev.h|1 +
 hw/Makefile.objs  |2 +-
 hw/block-common.h |   21 +++
 hw/hd-geometry.c  |  162 +
 hw/ide/core.c |3 +-
 hw/scsi-disk.c|5 +-
 hw/virtio-blk.c   |3 +-
 9 files changed, 192 insertions(+), 127 deletions(-)
 create mode 100644 hw/block-common.h
 create mode 100644 hw/hd-geometry.c

diff --git a/block.c b/block.c
index ffda1c2..06323cf 100644
--- a/block.c
+++ b/block.c
@@ -2132,127 +2132,6 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t 
*nb_sectors_ptr)
 *nb_sectors_ptr = length;
 }
 
-struct partition {
-uint8_t boot_ind;   /* 0x80 - active */
-uint8_t head;   /* starting head */
-uint8_t sector; /* starting sector */
-uint8_t cyl;/* starting cylinder */
-uint8_t sys_ind;/* What partition type */
-uint8_t end_head;   /* end head */
-uint8_t end_sector; /* end sector */
-uint8_t end_cyl;/* end cylinder */
-uint32_t start_sect;/* starting sector counting from 0 */
-uint32_t nr_sects;  /* nr of sectors in partition */
-} QEMU_PACKED;
-
-/* try to guess the disk logical geometry from the MSDOS partition table. 
Return 0 if OK, -1 if could not guess */
-static int guess_disk_lchs(BlockDriverState *bs,
-   int *pcylinders, int *pheads, int *psectors)
-{
-uint8_t buf[BDRV_SECTOR_SIZE];
-int i, heads, sectors, cylinders;
-struct partition *p;
-uint32_t nr_sects;
-uint64_t nb_sectors;
-
-bdrv_get_geometry(bs, nb_sectors);
-
-/**
- * The function will be invoked during startup not only in sync I/O mode,
- * but also in async I/O mode. So the I/O throttling function has to
- * be disabled temporarily here, not permanently.
- */
-if (bdrv_read_unthrottled(bs, 0, buf, 1)  0) {
-return -1;
-}
-/* test msdos magic */
-if (buf[510] != 0x55 || buf[511] != 0xaa)
-return -1;
-for(i = 0; i  4; i++) {
-p = ((struct partition *)(buf + 0x1be)) + i;
-nr_sects = le32_to_cpu(p-nr_sects);
-if (nr_sects  p-end_head) {
-/* We make the assumption that the partition terminates on
-   a cylinder boundary */
-heads = p-end_head + 1;
-sectors = p-end_sector  63;
-if (sectors == 0)
-continue;
-cylinders = nb_sectors / (heads * sectors);
-if (cylinders  1 || cylinders  16383)
-continue;
-*pheads = heads;
-*psectors = sectors;
-*pcylinders = cylinders;
-#if 0
-printf(guessed geometry: LCHS=%d %d %d\n,
-   cylinders, heads, sectors);
-#endif
-return 0;
-}
-}
-return -1;
-}
-
-void bdrv_guess_geometry(BlockDriverState *bs, int *pcyls, int *pheads, int 
*psecs)
-{
-int translation, lba_detected = 0;
-int cylinders, heads, secs;
-uint64_t nb_sectors;
-
-/* if a geometry hint is available, use it */
-bdrv_get_geometry(bs, nb_sectors);
-bdrv_get_geometry_hint(bs, cylinders, heads, secs);
-translation = bdrv_get_translation_hint(bs);
-if (cylinders != 0) {
-*pcyls = cylinders;
-*pheads = heads;
-*psecs = secs;
-} else {
-if (guess_disk_lchs(bs, cylinders, heads, secs) == 0) {
-if (heads  16) {
-/* if heads  16, it means that a BIOS LBA
-   translation was active, so the default
-   hardware geometry is OK */
-lba_detected = 1;
-goto default_geometry;
-} else {
-*pcyls = cylinders;
-*pheads = heads;
-*psecs = secs;
-/* disable any translation to be in sync with
-   the logical geometry */
-if (translation == BIOS_ATA_TRANSLATION_AUTO) {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_NONE);
-}
-}
-} else {
-default_geometry:
-/* if no geometry, use a standard physical disk geometry */
-cylinders = nb_sectors / (16 * 63);
-
-if (cylinders  16383)
-cylinders = 16383;
-else if (cylinders  2)
-cylinders = 2;
-*pcyls = cylinders;
-

[Qemu-devel] [PATCH v3 23/29] qtest: Cover qdev property for BIOS CHS translation

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/hd-geo-test.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c
index a47b945..5d9d2e4 100644
--- a/tests/hd-geo-test.c
+++ b/tests/hd-geo-test.c
@@ -321,15 +321,16 @@ static void test_ide_drive_user(const char *dev, bool 
trans)
 const CHST expected_chst = { secs / (4 * 32) , 4, 32, trans };
 
 argc = setup_common(argv, ARRAY_SIZE(argv));
-opts = g_strdup_printf(%s,cyls=%d,heads=%d,secs=%d%s,
-   dev  !trans ? dev : ,
+opts = g_strdup_printf(%s,%s%scyls=%d,heads=%d,secs=%d,
+   dev ?: ,
+   trans  dev ? bios-chs- : ,
+   trans ? trans=lba, : ,
expected_chst.cyls, expected_chst.heads,
-   expected_chst.secs,
-   trans ? ,trans=lba : );
+   expected_chst.secs);
 cur_ide[0] = expected_chst;
 argc = setup_ide(argc, argv, ARRAY_SIZE(argv),
- 0, dev  !trans ? opts : NULL, backend_small, mbr_chs,
- dev  !trans ?  : opts);
+ 0, dev ? opts : NULL, backend_small, mbr_chs,
+ dev ?  : opts);
 g_free(opts);
 qtest_start(g_strjoinv( , argv));
 test_cmos();
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 21/29] qdev: New property type chs-translation

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/qdev-properties.c |   15 +++
 hw/qdev.h|3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 002c7f9..0b18f8c 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -782,6 +782,21 @@ PropertyInfo qdev_prop_losttickpolicy = {
 .set   = set_enum,
 };
 
+/* --- BIOS CHS translation */
+
+static const char *bios_chs_trans_table[] = {
+    [BIOS_ATA_TRANSLATION_AUTO] = "auto",
+    [BIOS_ATA_TRANSLATION_NONE] = "none",
+    [BIOS_ATA_TRANSLATION_LBA]  = "lba",
+};
+
+PropertyInfo qdev_prop_bios_chs_trans = {
+    .name = "bios-chs-trans",
+    .enum_table = bios_chs_trans_table,
+    .get = get_enum,
+    .set = set_enum,
+};
+
 /* --- pci address --- */
 
 /*
diff --git a/hw/qdev.h b/hw/qdev.h
index f4683dc..9be35d4 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -232,6 +232,7 @@ extern PropertyInfo qdev_prop_chr;
 extern PropertyInfo qdev_prop_ptr;
 extern PropertyInfo qdev_prop_macaddr;
 extern PropertyInfo qdev_prop_losttickpolicy;
+extern PropertyInfo qdev_prop_bios_chs_trans;
 extern PropertyInfo qdev_prop_drive;
 extern PropertyInfo qdev_prop_netdev;
 extern PropertyInfo qdev_prop_vlan;
@@ -299,6 +300,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
 LostTickPolicy)
+#define DEFINE_PROP_BIOS_CHS_TRANS(_n, _s, _f, _d) \
+DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_bios_chs_trans, int)
 #define DEFINE_PROP_BLOCKSIZE(_n, _s, _f, _d) \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_blocksize, uint16_t)
 #define DEFINE_PROP_PCI_HOST_DEVADDR(_n, _s, _f) \
-- 
1.7.6.5




Re: [Qemu-devel] [PATCH 4/9] Add a base IPMI interface

2012-07-10 Thread Paolo Bonzini
Il 09/07/2012 21:17, miny...@acm.org ha scritto:
 +
 +/* Physical interface types. */
 +#define IPMI_KCS 1
 +#define IPMI_BT  2
 +#define IPMI_SMIC3

The code is not 100% consistent for values that are defined in hardware
specs (enums are preferred for new code, but that's not really
enforced).  However, for everything else please use enums rather than
arbitrary #defines.
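
As a rough illustration of the enum approach (a sketch only: the names
IPMIInterfaceType, ipmi_interface_names and ipmi_interface_from_name are
made up here and are not taken from the patch series), the interface
types could be an enum with a name table, much like the existing enum
properties in the tree:

```c
#include <string.h>

/* Hypothetical enum replacing the IPMI_KCS/IPMI_BT #defines. */
typedef enum IPMIInterfaceType {
    IPMI_INTERFACE_KCS,
    IPMI_INTERFACE_BT,
    IPMI_INTERFACE_MAX          /* number of entries, not a real type */
} IPMIInterfaceType;

/* Name table indexed by the enum, in the style of an enum property. */
static const char *const ipmi_interface_names[IPMI_INTERFACE_MAX] = {
    [IPMI_INTERFACE_KCS] = "kcs",
    [IPMI_INTERFACE_BT]  = "bt",
};

/* Map a user-supplied name to the enum value; return -1 if unknown. */
int ipmi_interface_from_name(const char *name)
{
    for (int i = 0; i < IPMI_INTERFACE_MAX; i++) {
        if (strcmp(name, ipmi_interface_names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

With a table like this, adding an interface type means adding one enum
value and one string, and a switch over the enum lets the compiler warn
about unhandled values.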

Paolo




Re: [Qemu-devel] [PATCH 5/9] IPMI: Add a PC ISA type structure

2012-07-10 Thread Paolo Bonzini
Il 09/07/2012 21:17, miny...@acm.org ha scritto:
 +
 +static Property ipmi_isa_properties[] = {
 +DEFINE_PROP_HEX32(type, ISAIPMIState, type,  IPMI_KCS),

You can add an enum property.  There is one example called
LostTickPolicy in the tree.  Please do not use the generic name "type";
use "interface", for example.

Start with an empty enum, and let each of the two patches 6/7 add an
item; same for the various switch statements on
IPMI_KCS/IPMI_BT/IPMI_SMIC.  (Actually if you do this the IPMI_SMIC
branches will disappear, right?)

 +DEFINE_PROP_HEX32(iobase, ISAIPMIState, iobase,  -1),
 +DEFINE_PROP_UINT32(irq,   ISAIPMIState, isairq,  0),
 +DEFINE_PROP_UINT8(slave_addr, ISAIPMIState, slave_addr,  0),
 +DEFINE_PROP_PTR(charopts,  ISAIPMIState, state.chropts),

Here, you should add a normal chardev property so that people can use
-chardev and -device to create the IPMI interface.  The device can be
created like this:

   -chardev ...,id=charipmi -device ipmi,interface=kcs,chardev=charipmi

If the chardev is absent, the local interface is used instead.

See docs/qdev-device-use.txt for more information.

 +DEFINE_PROP_END_OF_LIST(),
 +};
 +
 +static const VMStateDescription vmstate_isa_ipmi = {
 +.name = isa-ipmi,
 +.version_id = 3,
 +.minimum_version_id = 3,
 +.fields  = (VMStateField []) {
 +VMSTATE_STRUCT(state, ISAIPMIState, 0, vmstate_isa_ipmi,
 +IPMIState),
 +VMSTATE_END_OF_LIST()
 +}
 +};
 +
 +static void ipmi_isa_class_initfn(ObjectClass *klass, void *data)
 +{
 +DeviceClass *dc = DEVICE_CLASS(klass);
 +ISADeviceClass *ic = ISA_DEVICE_CLASS(klass);
 +ic-init = ipmi_isa_initfn;
 +dc-vmsd = vmstate_isa_ipmi;
 +dc-props = ipmi_isa_properties;
 +}
 +
 +static TypeInfo ipmi_isa_info = {
 +.name  = isa-ipmi,
 +.parent= TYPE_ISA_DEVICE,
 +.instance_size = sizeof(ISAIPMIState),
 +.class_init= ipmi_isa_class_initfn,
 +};
 +
 +static void ipmi_register_types(void)
 +{
 +type_register_static(ipmi_isa_info);
 +}
 +
 +type_init(ipmi_register_types)
 diff --git a/hw/pc.c b/hw/pc.c
 index c0acb6a..965e053 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -1173,6 +1173,18 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq 
 *gsi,
  fd[i] = drive_get(IF_FLOPPY, 0, i);
  }
  *floppy = fdctrl_init_isa(isa_bus, fd);
 +
 +i = 0;
 +if (do_local_ipmi) {
 + ipmi_isa_init(isa_bus, isa_find_free_irq(isa_bus), ipmi_types[i], NULL);
 + i++;
 +}
 +for(; i  MAX_IPMI_DEVICES; i++) {
 + if (ipmi_hds[i]) {
 + ipmi_isa_init(isa_bus, isa_find_free_irq(isa_bus),
 +   ipmi_types[i], ipmi_hds[i]);
 + }
 +}
  }
  
  void pc_pci_device_init(PCIBus *pci_bus)
 diff --git a/hw/pc.h b/hw/pc.h
 index 33ab689..5b6d947 100644
 --- a/hw/pc.h
 +++ b/hw/pc.h
 @@ -223,6 +223,24 @@ static inline bool isa_ne2000_init(ISABus *bus, int 
 base, int irq, NICInfo *nd)
  return true;
  }
  
 +/* IPMI */
 +static inline bool ipmi_isa_init(ISABus *bus, int irq,
 +  int type, QemuOpts *opts)
 +{
 +ISADevice *dev;
 +
 +dev = isa_try_create(bus, isa-ipmi);
 +if (!dev) {
 +return false;
 +}
 +qdev_prop_set_uint32(dev-qdev, type, type);
 +qdev_prop_set_uint32(dev-qdev, irq, irq);
 +qdev_prop_set_ptr(dev-qdev, charopts, opts);
 +if (qdev_init(dev-qdev)  0) {
 +return false;
 +}





Re: [Qemu-devel] First shot at adding IPMI to qemu

2012-07-10 Thread Paolo Bonzini
Il 09/07/2012 21:17, miny...@acm.org ha scritto:
 I had asked about getting an IPMI device into qemu and received some
 interest, and it's useful to me, so I've done some work to add it.
 The following patch set has a set of patches to add an IPMI KCS
 device, and IPMI BT device, a built-in BMC (IPMI management controller),
 and a way to attach an external BMC through a chardev.
 
 There was some discussion on whether to make the BMC internal or
 external, but I went ahead and added both.  The internal one is
 fairly basic and not extensible, at least without adding code.
 I've modified the OpenIPMI library simulator to work with the
 external interface to allow it to receive connections from the
 qemu external simulator with a fairly basic protocol.
 
 I've also added the ability for the OpenIPMI library to manage
 a VM to power it on, power it off, reset it, and handle an IPMI
 watchdog timer.  So it looks quite like a real system.  Instructions
 for using it are in the OpenIPMI release candidate I uploaded to
 https://sourceforge.net/projects/openipmi
 
 Since IPMI can advertise its presence via SMBIOS, I added a
 way for a driver to add an SMBIOS entry.  I also added a way
 to query a free interrupt from the ISA bus, since the interrupt
 is in the SMBIOS entry and nobody really cares which one is used.

I provided some feedback on the individual patches; addressing it
shouldn't be a lot of work compared to what you have done already!

It would be great if you could add a basic testcase using qtest, even if
only for the internal interface, to ensure it doesn't bitrot.

Paolo





[Qemu-devel] [PATCH v17 0/9] XBZRLE delta for live migration of large memory app

2012-07-10 Thread Orit Wasserman
Changes from v16:
- Change QMP migrate_set_cachesize to migrate-set-cache-size

Changes from v15:
- Fix example in documentation
- Fix indentation in qmp-commands.hx
- Fix missing comments from v13
- Fix other comments by Eric Blake

Changes from v14:
- rebase on top on Juan's patches
- Use clz64 to calculate pow2floor (round down to power of 2)
- Fix xbzrle_encode_buffer and xbzrle_decode_buffer
- Fix QMP commands documentation

Changes from v13:
- Fix round to power of 2 of cache size
- Add more checks to the XBZRLE encoding.
- use comparison instead of XOR when calculating zrun_len
- use strcmp trick for calculating nzrun_len (algorithm from Eric Blake)
- Fix other comments by Eric Blake
- Fix comments from Blue Swirl
- Display migration statistics after migration completes

Changes from v12:
- use bool for blk and shared params
- use long when decoding buffer
- fix QMP commands
- always display migration parameters in info migrate
- update current_addr inside the while loop in ram_save_block
- display statistics after migration completes
- fix other review comments from Eric Blake

Changes from v11: 
- divide patch 7 to several smaller patches.
- Use an array for setting migration parameters, in QMP only (there
  is no support for arrays in HMP commands). Parameters can be enabled
  or disabled.
- Do not use XBZRLE in stage 3; it is a very sensitive stage and CPU
  usage can be an issue.
- Fix review comments by Juan Quintela and Eric Blake

Changes from v10:
- Cache size will be in bytes, in case it is not a power of 2 it will be
  reduced to the nearest power of 2.
- fix documentation
- use cache_init with number of pages not cache size.

Changes from v9:
- move cache implementation to separate files. Kept our own 
implementation because GCache or GHashTable have no size limit.
- Add migrate_set_parameter function
- removed XBZRLE option from migrate command
- add cache size information to query_migrate command
- add documentation file
- write/read the exact XBZRLE header format
- fix other review comments by Anthony and Juan

Changes from v8:
Implement more efficient cache_resize method
fix set_cachesize command 

Changes from v7:
Copy the current page before encoding it; this prevents the page content
from changing during the encoding.
Allow changing the cache size during an active migration.
Fix comments by Avi.

Changes from v6:
 1) add assert checks to ULEB encoding/decoding
 2) no need to send last zero run

Changes from v5:
1) Add migration capabilities
2) Use ULEB to encode run length
3) Do not send unmodified (dirty) page
4) Fix other patch comments

Using GCache or GHashTable requires allocating a new buffer on every content
change, and neither has a size limit, so I decided to keep the simple cache
implementation.

Changes from v4:
1) Rebase
2) divide patch into 9 patches
3) move memory allocation into cache_insert

Future work:
 Use SSE for encoding.
 Page ranking according to dirty rate, and automatic activation/deactivation
 of the feature; these will be sent in a separate patch series.

By using XBZRLE (Xor Based Zero Run Length Encoding) we can reduce VM downtime
and total live-migration time of VMs running memory write intensive workloads
typical of large enterprise applications such as SAP ERP Systems, and generally
speaking for any application with a sparse memory update pattern.

The compression format exploits the fact that the delta contains many zeros
(a zero represents an unchanged value).
We represent the page data delta by zero and non-zero runs.
We represent a zero run with its length (in bytes).
We represent a non-zero run with its length (in bytes) and the data.
The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128).

page = zrun nzrun
   | zrun nzrun page

zrun = length

nzrun = length byte...

length = uleb128 encoded integer

On the sender side XBZRLE is used as a compact delta encoding of page updates,
retrieving the old page content from an LRU cache (default size of 512 MB). The
receiving side uses the existing page content and XBZRLE to decode the new page
content.
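
To make the format concrete, here is a minimal byte-at-a-time sketch of
the encode and decode steps described above. It is not the code from
patch 6/9 (which works a long word at a time); the sketch_* names and
the put_len/get_len helpers are invented for illustration, and run
lengths are assumed to fit in 14 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store n (<= 0x3fff) as one or two ULEB128 bytes. */
static int put_len(uint8_t *out, uint32_t n)
{
    assert(n <= 0x3fff);
    if (n < 0x80) {
        out[0] = (uint8_t)n;
        return 1;
    }
    out[0] = (uint8_t)((n & 0x7f) | 0x80);
    out[1] = (uint8_t)(n >> 7);
    return 2;
}

/* Hypothetical helper: read a 1- or 2-byte length; -1 on bad input. */
static int get_len(const uint8_t *in, int avail, uint32_t *n)
{
    if (avail < 1) {
        return -1;
    }
    if (!(in[0] & 0x80)) {
        *n = in[0];
        return 1;
    }
    if (avail < 2 || (in[1] & 0x80)) {
        return -1;
    }
    *n = (uint32_t)(in[0] & 0x7f) | ((uint32_t)in[1] << 7);
    return 2;
}

/* Returns bytes written, 0 if the page is unchanged, -1 if dst is full. */
int sketch_xbzrle_encode(const uint8_t *old_buf, const uint8_t *new_buf,
                         int len, uint8_t *dst, int dlen)
{
    int i = 0, d = 0;

    assert(len <= 0x3fff);
    while (i < len) {
        int start = i;
        while (i < len && old_buf[i] == new_buf[i]) {
            i++;                        /* zero run: bytes unchanged */
        }
        if (i == len) {
            break;                      /* the last zero run is not sent */
        }
        if (d + 2 > dlen) {
            return -1;
        }
        d += put_len(dst + d, (uint32_t)(i - start));

        start = i;
        while (i < len && old_buf[i] != new_buf[i]) {
            i++;                        /* non-zero run: bytes changed */
        }
        if (d + 2 + (i - start) > dlen) {
            return -1;
        }
        d += put_len(dst + d, (uint32_t)(i - start));
        memcpy(dst + d, new_buf + start, (size_t)(i - start));
        d += i - start;
    }
    return d;
}

/* dst holds the old page on entry; returns bytes covered, -1 on error. */
int sketch_xbzrle_decode(const uint8_t *src, int slen, uint8_t *dst, int dlen)
{
    int i = 0, d = 0;
    uint32_t count;

    while (i < slen) {
        int ret = get_len(src + i, slen - i, &count);
        if (ret < 0 || count > (uint32_t)(dlen - d)) {
            return -1;
        }
        i += ret;
        d += (int)count;                /* zero run: keep the old bytes */

        ret = get_len(src + i, slen - i, &count);
        if (ret < 0 || count == 0) {
            return -1;
        }
        i += ret;
        if (count > (uint32_t)(dlen - d) || count > (uint32_t)(slen - i)) {
            return -1;
        }
        memcpy(dst + d, src + i, count);
        d += (int)count;
        i += (int)count;
    }
    return d;
}
```

For an 8-byte "page" that changes from aaaaaaaa to aabbaaca, this sketch
emits the seven bytes 02 02 'b' 'b' 02 01 'c': a zero run of 2, a
non-zero run of 2 with its data, a zero run of 2, and a non-zero run of
1 with its data. The final unchanged byte is not sent.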

This is a more compact way to store the delta than the previous version.

This work was originally based on research results published at VEE 2011:
"Evaluation of Delta Compression Techniques for Efficient Live Migration of
Large Virtual Machines" by Benoit, Svard, Tordsson and Elmroth. The XBRLE
delta encoder used there was improved further and replaced by XBZRLE.

XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads making it
ideal for in-line, real-time encoding such as is needed for live-migration.

A typical 

[Qemu-devel] [PATCH v17 1/9] Add migration capabilities

2012-07-10 Thread Orit Wasserman
Add migration capabilities that can be queried by the management.
The management can query the source QEMU and the destination QEMU in order to
verify both support some migration capability (currently only XBZRLE).
The management can enable a capability for the next migration by using
the migrate_set_parameter command.

Signed-off-by: Orit Wasserman owass...@redhat.com
---
 hmp-commands.hx  |   16 
 hmp.c|   64 
 hmp.h|2 +
 migration.c  |   72 -
 migration.h  |2 +
 monitor.c|7 +
 qapi-schema.json |   53 ++-
 qmp-commands.hx  |   71 +++--
 8 files changed, 280 insertions(+), 7 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f5d9d91..9245bef 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -861,6 +861,20 @@ Set maximum tolerated downtime (in seconds) for migration.
 ETEXI
 
 {
+.name   = migrate_set_parameter,
+.args_type  = capability:s,state:b,
+.params = ,
+.help   = Enable/Disable the usage of a capability for migration,
+.mhandler.cmd = hmp_migrate_set_parameter,
+},
+
+STEXI
+@item migrate_set_parameter @var{capability} @var{state}
+@findex migrate_set_parameter
+Enable/Disable the usage of a capability @var{capability} for migration.
+ETEXI
+
+{
 .name   = client_migrate_info,
 .args_type  = 
protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?,
 .params = protocol hostname port tls-port cert-subject,
@@ -1419,6 +1433,8 @@ show CPU statistics
 show user network stack connection states
 @item info migrate
 show migration status
+@item info migration_capabilities
+show migration capabilities
 @item info balloon
 show balloon information
 @item info qtree
diff --git a/hmp.c b/hmp.c
index 4c6d4ae..b0440e6 100644
--- a/hmp.c
+++ b/hmp.c
@@ -131,9 +131,19 @@ void hmp_info_mice(Monitor *mon)
 void hmp_info_migrate(Monitor *mon)
 {
 MigrationInfo *info;
+MigrationCapabilityInfoList *cap;
 
 info = qmp_query_migrate(NULL);
 
+if (info-has_capabilities  info-capabilities) {
+monitor_printf(mon, capabilities: );
+for (cap = info-capabilities; cap; cap = cap-next) {
+monitor_printf(mon, %s: %s ,
+   MigrationCapability_lookup[cap-value-capability],
+   cap-value-state ? on : off);
+}
+monitor_printf(mon, \n);
+}
 if (info-has_status) {
 monitor_printf(mon, Migration status: %s\n, info-status);
 }
@@ -161,6 +171,25 @@ void hmp_info_migrate(Monitor *mon)
 qapi_free_MigrationInfo(info);
 }
 
+void hmp_info_migration_capabilities(Monitor *mon)
+{
+MigrationCapabilityInfoList *caps_list, *cap;
+
+caps_list = qmp_query_migration_capabilities(NULL);
+if (!caps_list) {
+monitor_printf(mon, No migration capabilities found\n);
+return;
+}
+
+for (cap = caps_list; cap; cap = cap-next) {
+monitor_printf(mon, %s: %s ,
+   MigrationCapability_lookup[cap-value-capability],
+   cap-value-state ? on : off);
+}
+
+qapi_free_MigrationCapabilityInfoList(caps_list);
+}
+
 void hmp_info_cpus(Monitor *mon)
 {
 CpuInfoList *cpu_list, *cpu;
@@ -735,6 +764,41 @@ void hmp_migrate_set_speed(Monitor *mon, const QDict 
*qdict)
 qmp_migrate_set_speed(value, NULL);
 }
 
+void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
+{
+const char *cap = qdict_get_str(qdict, capability);
+bool state = qdict_get_bool(qdict, state);
+Error *err = NULL;
+MigrationCapabilityInfoList *params = NULL;
+int i;
+
+for (i = 0; i  MIGRATION_CAPABILITY_MAX; i++) {
+if (strcmp(cap, MigrationCapability_lookup[i]) == 0) {
+if (!params) {
+params = g_malloc0(sizeof(*params));
+}
+params-value = g_malloc0(sizeof(*params-value));
+params-value-capability = i;
+params-value-state = state;
+params-next = NULL;
+qmp_migrate_set_parameters(params, err);
+break;
+}
+}
+
+if (i == MIGRATION_CAPABILITY_MAX) {
+error_set(err, QERR_INVALID_PARAMETER, cap);
+}
+
+qapi_free_MigrationCapabilityInfoList(params);
+
+if (err) {
+monitor_printf(mon, migrate_set_parameter: %s\n,
+   error_get_pretty(err));
+error_free(err);
+}
+}
+
 void hmp_set_password(Monitor *mon, const QDict *qdict)
 {
 const char *protocol  = qdict_get_str(qdict, protocol);
diff --git a/hmp.h b/hmp.h
index 79d138d..09ba198 100644
--- a/hmp.h
+++ b/hmp.h
@@ -25,6 +25,7 @@ void hmp_info_uuid(Monitor *mon);
 void hmp_info_chardev(Monitor *mon);
 void hmp_info_mice(Monitor *mon);
 void 

[Qemu-devel] [PATCH v17 4/9] Add uleb encoding/decoding functions

2012-07-10 Thread Orit Wasserman
Implement Unsigned Little Endian Base 128.

Signed-off-by: Orit Wasserman owass...@redhat.com
---
 cutils.c  |   33 +
 qemu-common.h |8 
 2 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/cutils.c b/cutils.c
index b0bdd4b..700f943 100644
--- a/cutils.c
+++ b/cutils.c
@@ -384,3 +384,36 @@ int64_t pow2floor(int64_t value)
 }
 return value;
 }
+
+/*
+ * Implementation of  ULEB128 (http://en.wikipedia.org/wiki/LEB128)
+ * Input is limited to 14-bit numbers
+ */
+int uleb128_encode_small(uint8_t *out, uint32_t n)
+{
+    g_assert(n <= 0x3fff);
+    if (n < 0x80) {
+        *out++ = n;
+        return 1;
+    } else {
+        *out++ = (n & 0x7f) | 0x80;
+        *out++ = n >> 7;
+        return 2;
+    }
+}
+
+int uleb128_decode_small(const uint8_t *in, uint32_t *n)
+{
+    if (!(*in & 0x80)) {
+        *n = *in++;
+        return 1;
+    } else {
+        *n = *in++ & 0x7f;
+        /* we exceed 14 bit number */
+        if (*in & 0x80) {
+            return -1;
+        }
+        *n |= *in++ << 7;
+        return 2;
+    }
+}
diff --git a/qemu-common.h b/qemu-common.h
index 195bab5..3188bdd 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -426,4 +426,12 @@ int64_t pow2floor(int64_t value);
 
 #include module.h
 
+/*
+ * Implementation of ULEB128 (http://en.wikipedia.org/wiki/LEB128)
+ * Input is limited to 14-bit numbers
+ */
+
+int uleb128_encode_small(uint8_t *out, uint32_t n);
+int uleb128_decode_small(const uint8_t *in, uint32_t *n);
+
 #endif
-- 
1.7.7.6




[Qemu-devel] [PATCH v17 6/9] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions

2012-07-10 Thread Orit Wasserman
For performance we encode a long word at a time.
For nzrun we use the long-word-at-a-time NUL-detection trick from strcmp():
the test ((lword - 0x0101010101010101) & (~lword) & 0x8080808080808080)
finds out whether any byte in the long word is zero.
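
A standalone sketch of that test, using fixed 64-bit words rather than
long (the function names are invented for this illustration):

```c
#include <stdint.h>

/* Nonzero if at least one of the 8 bytes in w is 0x00. */
int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/*
 * Applied to the XOR of an old and a new word: a zero byte in the XOR
 * means that byte is unchanged, so a non-zero run ends inside this word.
 */
int word_has_unchanged_byte(uint64_t old_word, uint64_t new_word)
{
    return has_zero_byte(old_word ^ new_word);
}
```

Subtracting 0x01 from each byte sets a byte's top bit when the byte was
zero (by borrowing) or when it was already 0x81 or above; the & ~w term
discards the second case. A single test therefore replaces eight byte
comparisons.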

Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
Signed-off-by: Eric Blake ebl...@redhat.com
---
 migration.h |4 ++
 savevm.c|  159 +++
 2 files changed, 163 insertions(+), 0 deletions(-)

diff --git a/migration.h b/migration.h
index acc0b94..c46af82 100644
--- a/migration.h
+++ b/migration.h
@@ -100,4 +100,8 @@ void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+ uint8_t *dst, int dlen);
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
+
 #endif
diff --git a/savevm.c b/savevm.c
index a15c163..26d 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2385,3 +2385,162 @@ void vmstate_register_ram_global(MemoryRegion *mr)
 {
 vmstate_register_ram(mr, NULL);
 }
+
+/*
+  page = zrun nzrun
+   | zrun nzrun page
+
+  zrun = length
+
+  nzrun = length byte...
+
+  length = uleb128 encoded integer
+ */
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+ uint8_t *dst, int dlen)
+{
+uint32_t zrun_len = 0, nzrun_len = 0;
+int d = 0, i = 0;
+long res, xor;
+uint8_t *nzrun_start = NULL;
+
+g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) %
+   sizeof(long)));
+
+while (i  slen) {
+/* overflow */
+if (d + 2  dlen) {
+return -1;
+}
+
+/* not aligned to sizeof(long) */
+res = (slen - i) % sizeof(long);
+while (res  old_buf[i] == new_buf[i]) {
+zrun_len++;
+i++;
+res--;
+}
+
+/* word at a time for speed */
+if (!res) {
+while (i  slen 
+   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
+i += sizeof(long);
+zrun_len += sizeof(long);
+}
+
+/* go over the rest */
+while (i  slen  old_buf[i] == new_buf[i]) {
+zrun_len++;
+i++;
+}
+}
+
+/* buffer unchanged */
+if (zrun_len == slen) {
+return 0;
+}
+
+/* skip last zero run */
+if (i == slen) {
+return d;
+}
+
+d += uleb128_encode_small(dst + d, zrun_len);
+
+zrun_len = 0;
+nzrun_start = new_buf + i;
+
+/* overflow */
+if (d + 2  dlen) {
+return -1;
+}
+/* not aligned to sizeof(long) */
+res = (slen - i) % sizeof(long);
+while (res  old_buf[i] != new_buf[i]) {
+i++;
+nzrun_len++;
+res--;
+}
+
+/* word at a time for speed, use of 32-bit long okay */
+if (!res) {
+/* truncation to 32-bit long okay */
+long mask = 0x0101010101010101ULL;
+while (i  slen) {
+xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
+if ((xor - mask)  ~xor  (mask  7)) {
+/* found the end of an nzrun within the current long */
+while (old_buf[i] != new_buf[i]) {
+nzrun_len++;
+i++;
+}
+break;
+} else {
+i += sizeof(long);
+nzrun_len += sizeof(long);
+}
+}
+}
+
+d += uleb128_encode_small(dst + d, nzrun_len);
+/* overflow */
+if (d + nzrun_len  dlen) {
+return -1;
+}
+memcpy(dst + d, nzrun_start, nzrun_len);
+d += nzrun_len;
+nzrun_len = 0;
+}
+
+return d;
+}
+
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
+{
+int i = 0, d = 0;
+int ret;
+uint32_t count = 0;
+
+while (i  slen) {
+
+/* zrun */
+if ((slen - i)  2) {
+return -1;
+}
+
+ret = uleb128_decode_small(src + i, count);
+if (ret  0 || (i  !count)) {
+return -1;
+}
+i += ret;
+d += count;
+
+/* overflow */
+if (d  dlen) {
+return -1;
+}
+
+/* nzrun */
+if ((slen - i)  2) {
+return -1;
+}
+
+ret = uleb128_decode_small(src + i, count);
+if (ret  0 || !count) {
+return -1;
+}
+i += ret;
+
+/* overflow */
+if (d + count  dlen || i + count  slen) {
+  

[Qemu-devel] [PATCH v17 8/9] Add migrate_set_cachesize command

2012-07-10 Thread Orit Wasserman
Change the XBZRLE cache size, given in bytes (the size should be a power of 2;
otherwise it will be rounded down to the nearest power of 2).
If the XBZRLE cache size is too small there will be many cache misses.
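
A small sketch of the rounding rule (the function name is made up, and
this loop-based version is only an illustration; the series itself
computes pow2floor with clz64). Returning 0 for non-positive input is an
assumption of the sketch.

```c
#include <stdint.h>

/* Largest power of 2 that is <= value; 0 for non-positive input. */
int64_t sketch_pow2floor(int64_t value)
{
    int64_t p = 1;

    if (value <= 0) {
        return 0;
    }
    /* Compare against value / 2 so that p * 2 can never overflow. */
    while (p <= value / 2) {
        p *= 2;
    }
    return p;
}
```

A requested cache size of 600 MB would therefore end up as 512 MB.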

Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
---
 arch_init.c  |   10 ++
 hmp-commands.hx  |   20 
 hmp.c|   13 +
 hmp.h|1 +
 migration.c  |   14 ++
 migration.h  |2 ++
 qapi-schema.json |   16 
 qmp-commands.hx  |   23 +++
 8 files changed, 99 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8bbd6da..be6670c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -192,6 +192,16 @@ static struct {
 .cache = NULL,
 };
 
+
+int64_t xbzrle_cache_resize(int64_t new_size)
+{
+if (XBZRLE.cache != NULL) {
+return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
+TARGET_PAGE_SIZE;
+}
+return pow2floor(new_size);
+}
+
 static void save_block_hdr(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
 int cont, int flag)
 {
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 9245bef..052a0a3 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -829,6 +829,26 @@ STEXI
 @item migrate_cancel
 @findex migrate_cancel
 Cancel the current VM migration.
+
+ETEXI
+
+{
+.name   = "migrate_set_cachesize",
+.args_type  = "value:o",
+.params = "value",
+.help   = "set cache size (in bytes) for XBZRLE migrations, "
+  "the cache size will be rounded down to the nearest "
+  "power of 2.\n"
+  "The cache size affects the number of cache misses. "
+  "In case of a high cache miss ratio you need to increase"
+  " the cache size",
+.mhandler.cmd = hmp_migrate_set_cachesize,
+},
+
+STEXI
+@item migrate_set_cachesize @var{value}
+@findex migrate_set_cachesize
+Set cache size to @var{value} (in bytes) for xbzrle migrations.
 ETEXI
 
 {
diff --git a/hmp.c b/hmp.c
index b0440e6..99ad00a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -758,6 +758,19 @@ void hmp_migrate_set_downtime(Monitor *mon, const QDict 
*qdict)
 qmp_migrate_set_downtime(value, NULL);
 }
 
+void hmp_migrate_set_cachesize(Monitor *mon, const QDict *qdict)
+{
+int64_t value = qdict_get_int(qdict, "value");
+Error *err = NULL;
+
+qmp_migrate_set_cache_size(value, &err);
+if (err) {
+monitor_printf(mon, "%s\n", error_get_pretty(err));
+error_free(err);
+return;
+}
+}
+
 void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict)
 {
 int64_t value = qdict_get_int(qdict, "value");
diff --git a/hmp.h b/hmp.h
index 09ba198..7c5117d 100644
--- a/hmp.h
+++ b/hmp.h
@@ -53,6 +53,7 @@ void hmp_migrate_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_downtime(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
+void hmp_migrate_set_cachesize(Monitor *mon, const QDict *qdict);
 void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
diff --git a/migration.c b/migration.c
index 1a264a9..d134bf6 100644
--- a/migration.c
+++ b/migration.c
@@ -533,6 +533,20 @@ void qmp_migrate_cancel(Error **errp)
 migrate_fd_cancel(migrate_get_current());
 }
 
+void qmp_migrate_set_cache_size(int64_t value, Error **errp)
+{
+MigrationState *s = migrate_get_current();
+
+/* Check for truncation */
+if (value != (size_t)value) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "exceeding address space");
+return;
+}
+
+s->xbzrle_cache_size = xbzrle_cache_resize(value);
+}
+
 void qmp_migrate_set_speed(int64_t value, Error **errp)
 {
 MigrationState *s;
diff --git a/migration.h b/migration.h
index 9b61e70..a73a34a 100644
--- a/migration.h
+++ b/migration.h
@@ -108,4 +108,6 @@ int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t 
*dst, int dlen);
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
 
+int64_t xbzrle_cache_resize(int64_t new_size);
+
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index a8408fd..a0f0f95 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1390,6 +1390,22 @@
 { 'command': 'migrate_set_speed', 'data': {'value': 'int'} }
 
 ##
+# @migrate-set-cache-size
+#
+# Set XBZRLE cache size
+#
+# @value: cache size in bytes
+#
+# The size will be rounded down to the nearest power of 2.
+# The cache size can be modified before and during ongoing migration
+#
+# Returns: nothing on success
+#
+# Since: 1.2
+##
+{ 'command': 

[Qemu-devel] [PATCH v17 7/9] Add XBZRLE to ram_save_block and ram_save_live

2012-07-10 Thread Orit Wasserman
In the outgoing migration, check whether the page is cached and has
changed; if so, send it compressed using the save_xbzrle_page function.
In the incoming migration, check whether RAM_SAVE_FLAG_XBZRLE is set
and, if so, decompress the page (using the load_xbzrle function).

Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
---
 arch_init.c |  187 +--
 migration.c |   24 
 migration.h |4 +
 3 files changed, 210 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 91e583f..8bbd6da 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -43,6 +43,7 @@
 #include "hw/smbios.h"
 #include "exec-memory.h"
 #include "hw/pcspk.h"
+#include "qemu/page_cache.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -102,6 +103,7 @@ const uint32_t arch_type = QEMU_ARCH;
 #define RAM_SAVE_FLAG_PAGE 0x08
 #define RAM_SAVE_FLAG_EOS  0x10
 #define RAM_SAVE_FLAG_CONTINUE 0x20
+#define RAM_SAVE_FLAG_XBZRLE   0x40
 
 #ifdef __ALTIVEC__
 #include <altivec.h>
@@ -169,6 +171,27 @@ static int is_dup_page(uint8_t *page)
 return 1;
 }
 
+/* XBZRLE (Xor Based Zero Run Length Encoding) */
+typedef struct XBZRLEHeader {
+uint16_t xh_len;
+uint8_t xh_flags;
+} XBZRLEHeader;
+
+/* struct contains XBZRLE cache and a static page
+   used by the compression */
+static struct {
+/* buffer used for XBZRLE encoding */
+uint8_t *encoded_buf;
+/* buffer used for XBZRLE decoding */
+uint8_t *decoded_buf;
+/* Cache for XBZRLE */
+PageCache *cache;
+} XBZRLE = {
+.encoded_buf = NULL,
+.decoded_buf = NULL,
+.cache = NULL,
+};
+
 static void save_block_hdr(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
 int cont, int flag)
 {
@@ -181,15 +204,76 @@ static void save_block_hdr(QEMUFile *f, RAMBlock *block, 
ram_addr_t offset,
 
 }
 
+#define ENCODING_FLAG_XBZRLE 0x1
+
+static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
+ram_addr_t current_addr, RAMBlock *block,
+ram_addr_t offset, int cont, int stage)
+{
+int encoded_len = 0, bytes_sent = -1, ret = -1;
+XBZRLEHeader hdr = {
+.xh_len = 0,
+.xh_flags = 0,
+};
+uint8_t *prev_cached_page;
+
+/* Stage 1 cache the page and exit.
+   Stage 2 check to see if page is cached, if not cache the page.
+   Stage 3 check if the page is cached and if not exit.
+*/
+if (stage == 1 || !cache_is_cached(XBZRLE.cache, current_addr)) {
+if (stage != 3) {
+cache_insert(XBZRLE.cache, current_addr,
+ g_memdup(current_data, TARGET_PAGE_SIZE));
+}
+return -1;
+}
+
+prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
+
+/* XBZRLE encoding (if there is no overflow) */
+encoded_len = xbzrle_encode_buffer(prev_cached_page, current_data,
+   TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
+   TARGET_PAGE_SIZE);
+if (encoded_len == 0) {
+DPRINTF("Skipping unmodified page\n");
+return 0;
+} else if (encoded_len == -1) {
+DPRINTF("Overflow\n");
+/* update data in the cache */
+memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE);
+return -1;
+}
+
+/* we need to update the data in the cache, in order to get the same data
+   we cached we decode the encoded page on the cached data */
+ret = xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len,
+   prev_cached_page, TARGET_PAGE_SIZE);
+g_assert(ret != -1);
+
+hdr.xh_len = encoded_len;
+hdr.xh_flags |= ENCODING_FLAG_XBZRLE;
+
+/* Send XBZRLE based compressed page */
+save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_XBZRLE);
+qemu_put_byte(f, hdr.xh_flags);
+qemu_put_be16(f, hdr.xh_len);
+qemu_put_buffer(f, XBZRLE.encoded_buf, encoded_len);
+bytes_sent = encoded_len + sizeof(hdr);
+
+return bytes_sent;
+}
+
 static RAMBlock *last_block;
 static ram_addr_t last_offset;
 
-static int ram_save_block(QEMUFile *f)
+static int ram_save_block(QEMUFile *f, int stage)
 {
 RAMBlock *block = last_block;
 ram_addr_t offset = last_offset;
 int bytes_sent = -1;
 MemoryRegion *mr;
+ram_addr_t current_addr;
 
 if (!block)
 block = QLIST_FIRST(ram_list.blocks);
@@ -210,13 +294,31 @@ static int ram_save_block(QEMUFile *f)
 save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_COMPRESS);
 qemu_put_byte(f, *p);
 bytes_sent = 1;
-} else {
+} else if (migrate_use_xbzrle() && stage != 3) {
+current_addr = block->offset + offset;
+/* In stage 1 we only cache the pages before sending them
+   from 

[Qemu-devel] [PATCH v3 11/29] hd-geometry: Cut out block layer translation middleman

2012-07-10 Thread Markus Armbruster
hd_geometry_guess() picks geometry and translation.  Callers can get
the geometry directly, via parameters, but for translation they need
to go through the block layer.

Add a parameter for translation, so that callers can optionally get it
the same way as geometry.  This prepares for purging translation from
the block layer, which will happen later in this series.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/block-common.h |3 ++-
 hw/hd-geometry.c  |   20 ++--
 hw/ide/core.c |2 +-
 hw/scsi-disk.c|4 ++--
 hw/virtio-blk.c   |2 +-
 5 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/hw/block-common.h b/hw/block-common.h
index 3a4d4c6..bba817a 100644
--- a/hw/block-common.h
+++ b/hw/block-common.h
@@ -16,6 +16,7 @@
 /* Hard disk geometry */
 
 void hd_geometry_guess(BlockDriverState *bs,
-   int *pcyls, int *pheads, int *psecs);
+   int *pcyls, int *pheads, int *psecs,
+   int *ptrans);
 
 #endif
diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index 241aed9..4d746b7 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -117,7 +117,8 @@ static void guess_chs_for_size(BlockDriverState *bs,
 }
 
 void hd_geometry_guess(BlockDriverState *bs,
-   int *pcyls, int *pheads, int *psecs)
+   int *pcyls, int *pheads, int *psecs,
+   int *ptrans)
 {
 int cylinders, heads, secs, translation;
 
@@ -129,6 +130,9 @@ void hd_geometry_guess(BlockDriverState *bs,
 *pcyls = cylinders;
 *pheads = heads;
 *psecs = secs;
+if (ptrans) {
+*ptrans = translation;
+}
 return;
 }
 
@@ -142,10 +146,10 @@ void hd_geometry_guess(BlockDriverState *bs,
translation was active, so a standard physical disk
geometry is OK */
 guess_chs_for_size(bs, pcyls, pheads, psecs);
-bdrv_set_translation_hint(bs,
-  *pcyls * *pheads <= 131072
-  ? BIOS_ATA_TRANSLATION_LARGE
-  : BIOS_ATA_TRANSLATION_LBA);
+translation = *pcyls * *pheads <= 131072
+? BIOS_ATA_TRANSLATION_LARGE
+: BIOS_ATA_TRANSLATION_LBA;
+bdrv_set_translation_hint(bs, translation);
 } else {
 /* LCHS guess with heads <= 16: use as physical geometry */
 *pcyls = cylinders;
@@ -153,7 +157,11 @@ void hd_geometry_guess(BlockDriverState *bs,
 *psecs = secs;
 /* disable any translation to be in sync with
the logical geometry */
-bdrv_set_translation_hint(bs, BIOS_ATA_TRANSLATION_NONE);
+translation = BIOS_ATA_TRANSLATION_NONE;
+bdrv_set_translation_hint(bs, translation);
+}
+if (ptrans) {
+*ptrans = translation;
 }
 bdrv_set_geometry_hint(bs, *pcyls, *pheads, *psecs);
 trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 0d1bf10..28f04ad 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1934,7 +1934,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
 s->drive_kind = kind;
 
 bdrv_get_geometry(bs, &nb_sectors);
-hd_geometry_guess(bs, &cylinders, &heads, &secs);
+hd_geometry_guess(bs, &cylinders, &heads, &secs, NULL);
 if (cylinders < 1 || cylinders > 16383) {
 error_report("cyls must be between 1 and 16383");
 return -1;
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 5339c2e..fc077f5 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -990,7 +990,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, uint8_t **p_outbuf,
 break;
 }
 /* if a geometry hint is available, use it */
-hd_geometry_guess(bdrv, &cylinders, &heads, &secs);
+hd_geometry_guess(bdrv, &cylinders, &heads, &secs, NULL);
 p[2] = (cylinders >> 16) & 0xff;
 p[3] = (cylinders >> 8) & 0xff;
 p[4] = cylinders & 0xff;
@@ -1024,7 +1024,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, uint8_t **p_outbuf,
 p[2] = 5000 >> 8;
 p[3] = 5000 & 0xff;
 /* if a geometry hint is available, use it */
-hd_geometry_guess(bdrv, &cylinders, &heads, &secs);
+hd_geometry_guess(bdrv, &cylinders, &heads, &secs, NULL);
 p[4] = heads & 0xff;
 p[5] = secs & 0xff;
 p[6] = s->qdev.blocksize >> 8;
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index f16c5ce..d2709a7 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -623,7 +623,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 s->blk = blk;
 s->rq = NULL;
 s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1;
-hd_geometry_guess(s->bs, &cylinders, &heads, &secs);
+hd_geometry_guess(s->bs, &cylinders, &heads, &secs, NULL);
 
 s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
 
-- 
1.7.6.5
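
The patch above does two things: it picks LARGE or LBA translation from the guessed cylinder and head counts, and it reports the choice through an out-parameter that callers may leave NULL. A small standalone sketch of just that pattern follows. The enum and the function name pick_translation() are hypothetical stand-ins, not QEMU's BIOS_ATA_TRANSLATION_* constants or API, and the sketch covers only the branch that chooses between LARGE and LBA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BIOS_ATA_TRANSLATION_* constants. */
enum chs_translation { TRANS_NONE, TRANS_LBA, TRANS_LARGE };

/* Choose LARGE when cyls * heads fits in 131072, else LBA, and store the
 * result only if the caller supplied somewhere to put it. */
static void pick_translation(int cyls, int heads,
                             enum chs_translation *ptrans)
{
    enum chs_translation t =
        (long long)cyls * heads <= 131072 ? TRANS_LARGE : TRANS_LBA;

    if (ptrans) {          /* NULL means "caller does not care" */
        *ptrans = t;
    }
}
```

Callers that only want the geometry pass NULL, which is what the IDE, SCSI and virtio-blk call sites do in this patch.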




[Qemu-devel] [PATCH v3 12/29] ide pc: Cut out the block layer geometry middleman

2012-07-10 Thread Markus Armbruster
PC BIOS setup needs IDE geometry information.  Get it directly from
the device model rather than through the block layer.  In preparation
of purging geometry from the block layer, which will happen later in
this series.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/ide.h  |4 +++-
 hw/ide/core.c |2 +-
 hw/ide/internal.h |2 +-
 hw/ide/qdev.c |   21 +
 hw/pc.c   |   51 +++
 5 files changed, 49 insertions(+), 31 deletions(-)

diff --git a/hw/ide.h b/hw/ide.h
index 0b18c90..2db4079 100644
--- a/hw/ide.h
+++ b/hw/ide.h
@@ -29,7 +29,9 @@ void mmio_ide_init (target_phys_addr_t membase, 
target_phys_addr_t membase2,
 qemu_irq irq, int shift,
 DriveInfo *hd0, DriveInfo *hd1);
 
-void ide_get_bs(BlockDriverState *bs[], BusState *qbus);
+int ide_get_geometry(BusState *bus, int unit,
+ int16_t *cyls, int8_t *heads, int8_t *secs);
+int ide_get_bios_chs_trans(BusState *bus, int unit);
 
 /* ide/core.c */
 void ide_drive_get(DriveInfo **hd, int max_bus);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 28f04ad..7f5ad07 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1934,7 +1934,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
 s->drive_kind = kind;
 
 bdrv_get_geometry(bs, &nb_sectors);
-hd_geometry_guess(bs, &cylinders, &heads, &secs, NULL);
+hd_geometry_guess(bs, &cylinders, &heads, &secs, &s->chs_trans);
 if (cylinders < 1 || cylinders > 16383) {
 error_report("cyls must be between 1 and 16383");
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 1a02f57..56c718e 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -344,7 +344,7 @@ struct IDEState {
 uint8_t unit;
 /* ide config */
 IDEDriveKind drive_kind;
-int cylinders, heads, sectors;
+int cylinders, heads, sectors, chs_trans;
 int64_t nb_sectors;
 int mult_sectors;
 int identify_set;
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index c122395..87e0b75 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -111,11 +111,24 @@ IDEDevice *ide_create_drive(IDEBus *bus, int unit, 
DriveInfo *drive)
 return DO_UPCAST(IDEDevice, qdev, dev);
 }
 
-void ide_get_bs(BlockDriverState *bs[], BusState *qbus)
+int ide_get_geometry(BusState *bus, int unit,
+ int16_t *cyls, int8_t *heads, int8_t *secs)
 {
-IDEBus *bus = DO_UPCAST(IDEBus, qbus, qbus);
-bs[0] = bus->master ? bus->master->conf.bs : NULL;
-bs[1] = bus->slave  ? bus->slave->conf.bs  : NULL;
+IDEState *s = &DO_UPCAST(IDEBus, qbus, bus)->ifs[unit];
+
+if (!s->bs) {
+return -1;
+}
+
+*cyls = s->cylinders;
+*heads = s->heads;
+*secs = s->sectors;
+return 0;
+}
+
+int ide_get_bios_chs_trans(BusState *bus, int unit)
+{
+return DO_UPCAST(IDEBus, qbus, bus)->ifs[unit].chs_trans;
 }
 
 /* - */
diff --git a/hw/pc.c b/hw/pc.c
index 91cf77d..89a0c66 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -216,11 +216,9 @@ static int cmos_get_fd_drive_type(FDriveType fd0)
 return val;
 }
 
-static void cmos_init_hd(int type_ofs, int info_ofs, BlockDriverState *hd,
- ISADevice *s)
+static void cmos_init_hd(ISADevice *s, int type_ofs, int info_ofs,
+ int16_t cylinders, int8_t heads, int8_t sectors)
 {
-int cylinders, heads, sectors;
-bdrv_get_geometry_hint(hd, &cylinders, &heads, &sectors);
 rtc_set_memory(s, type_ofs, 47);
 rtc_set_memory(s, info_ofs, cylinders);
 rtc_set_memory(s, info_ofs + 1, cylinders >> 8);
@@ -281,37 +279,42 @@ static int pc_boot_set(void *opaque, const char 
*boot_device)
 
 typedef struct pc_cmos_init_late_arg {
 ISADevice *rtc_state;
-BusState *idebus0, *idebus1;
+BusState *idebus[2];
 } pc_cmos_init_late_arg;
 
 static void pc_cmos_init_late(void *opaque)
 {
 pc_cmos_init_late_arg *arg = opaque;
 ISADevice *s = arg->rtc_state;
+int16_t cylinders;
+int8_t heads, sectors;
 int val;
-BlockDriverState *hd_table[4];
 int i;
 
-ide_get_bs(hd_table, arg->idebus0);
-ide_get_bs(hd_table + 2, arg->idebus1);
-
-rtc_set_memory(s, 0x12, (hd_table[0] ? 0xf0 : 0) | (hd_table[1] ? 0x0f : 0));
-if (hd_table[0])
-cmos_init_hd(0x19, 0x1b, hd_table[0], s);
-if (hd_table[1])
-cmos_init_hd(0x1a, 0x24, hd_table[1], s);
+val = 0;
+if (ide_get_geometry(arg->idebus[0], 0,
+ &cylinders, &heads, &sectors) >= 0) {
+cmos_init_hd(s, 0x19, 0x1b, cylinders, heads, sectors);
+val |= 0xf0;
+}
+if (ide_get_geometry(arg->idebus[0], 1,
+ &cylinders, &heads, &sectors) >= 0) {
+cmos_init_hd(s, 0x1a, 0x24, cylinders, heads, sectors);
+val |= 0x0f;
+}
+rtc_set_memory(s, 0x12, val);
 
 val = 0;
 for (i = 0; i < 4; i++) {
-if (hd_table[i]) {
-   

[Qemu-devel] [PATCH v17 3/9] Add cache handling functions

2012-07-10 Thread Orit Wasserman
Add an LRU page cache mechanism.
Pages are accessed by their address.

Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
---
 Makefile.objs |1 +
 cutils.c  |9 +
 qemu-common.h |   13 +
 3 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 5ebbcfa..e0fb69b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -77,6 +77,7 @@ common-obj-y += qemu-char.o #aio.o
 common-obj-y += block-migration.o iohandler.o
 common-obj-y += pflib.o
 common-obj-y += bitmap.o bitops.o
+common-obj-y += page_cache.o
 
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
 common-obj-$(CONFIG_WIN32) += version.o
diff --git a/cutils.c b/cutils.c
index e2bc1b8..b0bdd4b 100644
--- a/cutils.c
+++ b/cutils.c
@@ -375,3 +375,12 @@ int qemu_parse_fd(const char *param)
 }
 return fd;
 }
+
+/* round down to the nearest power of 2*/
+int64_t pow2floor(int64_t value)
+{
+if (!is_power_of_2(value)) {
+value = 0x8000000000000000ULL >> clz64(value);
+}
+return value;
+}
diff --git a/qemu-common.h b/qemu-common.h
index 09676f5..195bab5 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -1,3 +1,4 @@
+
 /* Common header file that is included by all of qemu.  */
 #ifndef QEMU_COMMON_H
 #define QEMU_COMMON_H
@@ -411,6 +412,18 @@ static inline uint64_t muldiv64(uint64_t a, uint32_t b, 
uint32_t c)
 /* Round number up to multiple */
 #define QEMU_ALIGN_UP(n, m) QEMU_ALIGN_DOWN((n) + (m) - 1, (m))
 
+static inline bool is_power_of_2(int64_t value)
+{
+if (!value) {
+return 0;
+}
+
+return !(value & (value - 1));
+}
+
+/* round down to the nearest power of 2*/
+int64_t pow2floor(int64_t value);
+
 #include "module.h"
 
 #endif
-- 
1.7.7.6
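
As a cross-check of what pow2floor() is meant to return, here is a portable sketch that does not depend on clz64(). It is not the patch's implementation: the name pow2floor_sketch() is made up, and returning 0 for non-positive input is an assumption of this sketch (the patch does not say what should happen in that case).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pow2floor(): largest power of two that is
 * <= value, or 0 when value is not positive (sketch-only convention). */
static int64_t pow2floor_sketch(int64_t value)
{
    int64_t p = 1;

    if (value <= 0) {
        return 0;
    }
    /* Double p while the doubled value would still be <= value.
     * Comparing against value / 2 avoids signed overflow. */
    while (p <= value / 2) {
        p *= 2;
    }
    return p;
}
```

With this, a requested cache size of 1000 bytes becomes 512, and an exact power of two such as 4096 is returned unchanged.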




[Qemu-devel] [PATCH v3 17/29] virtio-blk: qdev properties for disk geometry

2012-07-10 Thread Markus Armbruster
Geometry needs to be qdev properties, because it belongs to the
disk's guest part.

Maintain backward compatibility exactly like for serial: fall back to
DriveInfo's geometry, set with -drive cyls=...

Bonus: info qtree now shows the geometry.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/s390-virtio-bus.c |1 +
 hw/virtio-blk.c  |   41 -
 hw/virtio-pci.c  |1 +
 3 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index 4d49b96..a245684 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -402,6 +402,7 @@ static TypeInfo s390_virtio_net = {
 
 static Property s390_virtio_blk_properties[] = {
 DEFINE_BLOCK_PROPERTIES(VirtIOS390Device, blk.conf),
+DEFINE_BLOCK_CHS_PROPERTIES(VirtIOS390Device, blk.conf),
 DEFINE_PROP_STRING("serial", VirtIOS390Device, blk.serial),
 #ifdef __linux__
 DEFINE_PROP_BIT("scsi", VirtIOS390Device, blk.scsi, 0, true),
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 4344e28..3885904 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -479,19 +479,17 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 VirtIOBlock *s = to_virtio_blk(vdev);
 struct virtio_blk_config blkcfg;
 uint64_t capacity;
-int cylinders, heads, secs;
 int blk_size = s->conf->logical_block_size;
 
 bdrv_get_geometry(s->bs, &capacity);
-bdrv_get_geometry_hint(s->bs, &cylinders, &heads, &secs);
 memset(&blkcfg, 0, sizeof(blkcfg));
 stq_raw(&blkcfg.capacity, capacity);
 stl_raw(&blkcfg.seg_max, 128 - 2);
-stw_raw(&blkcfg.cylinders, cylinders);
+stw_raw(&blkcfg.cylinders, s->conf->cyls);
 stl_raw(&blkcfg.blk_size, blk_size);
 stw_raw(&blkcfg.min_io_size, s->conf->min_io_size / blk_size);
 stw_raw(&blkcfg.opt_io_size, s->conf->opt_io_size / blk_size);
-blkcfg.heads = heads;
+blkcfg.heads = s->conf->heads;
 /*
  * We must ensure that the block device capacity is a multiple of
  * the logical block size. If that is not the case, lets use
@@ -503,10 +501,10 @@ static void virtio_blk_update_config(VirtIODevice *vdev, 
uint8_t *config)
  * divided by 512 - instead it is the amount of blk_size blocks
  * per track (cylinder).
  */
-if (bdrv_getlength(s->bs) /  heads / secs % blk_size) {
-blkcfg.sectors = secs & ~s->sector_mask;
+if (bdrv_getlength(s->bs) /  s->conf->heads / s->conf->secs % blk_size) {
+blkcfg.sectors = s->conf->secs & ~s->sector_mask;
 } else {
-blkcfg.sectors = secs;
+blkcfg.sectors = s->conf->secs;
 }
 blkcfg.size_max = 0;
 blkcfg.physical_block_exp = get_physical_block_exp(s->conf);
@@ -590,7 +588,6 @@ static const BlockDevOps virtio_block_ops = {
 VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 {
 VirtIOBlock *s;
-uint32_t cylinders, heads, secs;
 static int virtio_blk_id;
 DriveInfo *dinfo;
 
@@ -623,7 +620,33 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 s->blk = blk;
 s->rq = NULL;
 s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1;
-hd_geometry_guess(s->bs, &cylinders, &heads, &secs, NULL);
+
+if (!blk->conf.cyls && !blk->conf.heads && !blk->conf.secs) {
+/* try to fall back to value set with legacy -drive cyls=... */
+dinfo = drive_get_by_blockdev(blk->conf.bs);
+blk->conf.cyls = dinfo->cyls;
+blk->conf.heads = dinfo->heads;
+blk->conf.secs = dinfo->secs;
+}
+if (!blk->conf.cyls && !blk->conf.heads && !blk->conf.secs) {
+hd_geometry_guess(s->bs,
+  &blk->conf.cyls, &blk->conf.heads, &blk->conf.secs,
+  NULL);
+}
+if (blk->conf.cyls || blk->conf.heads || blk->conf.secs) {
+if (blk->conf.cyls < 1 || blk->conf.cyls > 65535) {
+error_report("cyls must be between 1 and 65535");
+return NULL;
+}
+if (blk->conf.heads < 1 || blk->conf.heads > 255) {
+error_report("heads must be between 1 and 255");
+return NULL;
+}
+if (blk->conf.secs < 1 || blk->conf.secs > 255) {
+error_report("secs must be between 1 and 255");
+return NULL;
+}
+}
 
 s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
 
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 9342eed..557d1d3 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -936,6 +936,7 @@ static int virtio_balloon_exit_pci(PCIDevice *pci_dev)
 static Property virtio_blk_properties[] = {
 DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
 DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, blk.conf),
+DEFINE_BLOCK_CHS_PROPERTIES(VirtIOPCIProxy, blk.conf),
 DEFINE_PROP_STRING("serial", VirtIOPCIProxy, blk.serial),
 #ifdef __linux__
 DEFINE_PROP_BIT("scsi", VirtIOPCIProxy, blk.scsi, 0, true),
-- 
1.7.6.5
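
The range checks added to virtio_blk_init() above can be read as one small predicate. The sketch below restates them in isolation, under the assumption that an all-zero triple means "no geometry configured" and is therefore accepted. The function name check_chs() and its return convention (NULL on success, an error string otherwise) are invented for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical validator mirroring the limits in the patch:
 * cyls 1..65535, heads 1..255, secs 1..255.
 * Returns NULL when the geometry is acceptable, else an error string. */
static const char *check_chs(uint32_t cyls, uint32_t heads, uint32_t secs)
{
    if (!cyls && !heads && !secs) {
        return NULL;                 /* nothing configured, nothing to check */
    }
    if (cyls < 1 || cyls > 65535) {
        return "cyls must be between 1 and 65535";
    }
    if (heads < 1 || heads > 255) {
        return "heads must be between 1 and 255";
    }
    if (secs < 1 || secs > 255) {
        return "secs must be between 1 and 255";
    }
    return NULL;
}
```

Note that a partially specified geometry, for example heads and secs set but cyls left at 0, is rejected rather than guessed.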




[Qemu-devel] [PATCH v3 10/29] hd-geometry: Clean up confusing use of prior translation hint

2012-07-10 Thread Markus Armbruster
When hd_geometry_guess() picks a geometry, it also picks the
appropriate translation, but only when the prior translation hint is
BIOS_ATA_TRANSLATION_AUTO.  Looks wrong, because such a prior
translation would be passed to the BIOS whether it's suitable for the
geometry or not.

Fortunately, that can't happen.  There are just two ways for the
translation hint to get set to something other than
BIOS_ATA_TRANSLATION_AUTO: drive_init() on behalf of -drive trans=...,
and hd_geometry_guess().  Both set it only when they also set a valid
geometry hint, i.e. one with a non-zero number of cylinders.

Since hd_geometry_guess() returns right away when it finds a valid
geometry hint, translation can only be BIOS_ATA_TRANSLATION_AUTO in
the remainder of the function.

Assert this, and simplify accordingly.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/hd-geometry.c |   17 +++--
 1 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index fb849a3..241aed9 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -132,6 +132,8 @@ void hd_geometry_guess(BlockDriverState *bs,
 return;
 }
 
+assert(translation == BIOS_ATA_TRANSLATION_AUTO);
+
 if (guess_disk_lchs(bs, &cylinders, &heads, &secs) < 0) {
 /* no LCHS guess: use a standard physical disk geometry  */
 guess_chs_for_size(bs, pcyls, pheads, psecs);
@@ -140,12 +142,10 @@ void hd_geometry_guess(BlockDriverState *bs,
translation was active, so a standard physical disk
geometry is OK */
 guess_chs_for_size(bs, pcyls, pheads, psecs);
-if (translation == BIOS_ATA_TRANSLATION_AUTO) {
-bdrv_set_translation_hint(bs,
-  *pcyls * *pheads <= 131072
-  ? BIOS_ATA_TRANSLATION_LARGE
-  : BIOS_ATA_TRANSLATION_LBA);
-}
+bdrv_set_translation_hint(bs,
+  *pcyls * *pheads <= 131072
+  ? BIOS_ATA_TRANSLATION_LARGE
+  : BIOS_ATA_TRANSLATION_LBA);
 } else {
 /* LCHS guess with heads <= 16: use as physical geometry */
 *pcyls = cylinders;
@@ -153,10 +153,7 @@ void hd_geometry_guess(BlockDriverState *bs,
 *psecs = secs;
 /* disable any translation to be in sync with
the logical geometry */
-if (translation == BIOS_ATA_TRANSLATION_AUTO) {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_NONE);
-}
+bdrv_set_translation_hint(bs, BIOS_ATA_TRANSLATION_NONE);
 }
 bdrv_set_geometry_hint(bs, *pcyls, *pheads, *psecs);
 trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 25/29] ide pc: Put hard disk info into CMOS only for hard disks

2012-07-10 Thread Markus Armbruster
In particular, don't set disk type and geometry when a CD-ROM on bus
ide.0 has media during CMOS initialization.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/ide/qdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index f191dd3..84097fd 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -117,7 +117,7 @@ int ide_get_geometry(BusState *bus, int unit,
 {
 IDEState *s = &DO_UPCAST(IDEBus, qbus, bus)->ifs[unit];
 
-if (!s->bs) {
+if (s->drive_kind != IDE_HD || !s->bs) {
 return -1;
 }
 
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 19/29] qtest: Cover qdev properties for disk geometry

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/hd-geo-test.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c
index cc447a2..a47b945 100644
--- a/tests/hd-geo-test.c
+++ b/tests/hd-geo-test.c
@@ -321,13 +321,15 @@ static void test_ide_drive_user(const char *dev, bool trans)
 const CHST expected_chst = { secs / (4 * 32) , 4, 32, trans };
 
 argc = setup_common(argv, ARRAY_SIZE(argv));
-opts = g_strdup_printf(",cyls=%d,heads=%d,secs=%d%s",
+opts = g_strdup_printf("%s,cyls=%d,heads=%d,secs=%d%s",
+   dev && !trans ? dev : "",
    expected_chst.cyls, expected_chst.heads,
    expected_chst.secs,
    trans ? ",trans=lba" : "");
 cur_ide[0] = expected_chst;
 argc = setup_ide(argc, argv, ARRAY_SIZE(argv),
- 0, dev, backend_small, mbr_chs, opts);
+ 0, dev && !trans ? opts : NULL, backend_small, mbr_chs,
+ dev && !trans ? "" : opts);
 g_free(opts);
 qtest_start(g_strjoinv(" ", argv));
 test_cmos();
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 01/29] fdc: Move floppy geometry guessing back from block.c

2012-07-10 Thread Markus Armbruster
Commit 5bbdbb46 moved it to block.c because other geometry guessing
functions already reside in block.c.  Device-specific functionality
should be kept in device code, not the block layer.  Move it back.

Disk geometry guessing is still in block.c.  To be moved out in a
later patch series.

Bonus: the floppy type used in pc_cmos_init() now obviously matches
the one in the FDrive.  Before, we relied on
bdrv_get_floppy_geometry_hint() picking the same type both in
fd_revalidate() and in pc_cmos_init().

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block.c  |  101 ---
 block.h  |   18 -
 hw/fdc.c |  122 -
 hw/fdc.h |   10 +-
 hw/pc.c  |   11 +-
 5 files changed, 123 insertions(+), 139 deletions(-)

diff --git a/block.c b/block.c
index 0c923f2..ffda1c2 100644
--- a/block.c
+++ b/block.c
@@ -2282,107 +2282,6 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 bs->io_limits_enabled = bdrv_io_limits_enabled(bs);
 }
 
-/* Recognize floppy formats */
-typedef struct FDFormat {
-FDriveType drive;
-uint8_t last_sect;
-uint8_t max_track;
-uint8_t max_head;
-FDriveRate rate;
-} FDFormat;
-
-static const FDFormat fd_formats[] = {
-/* First entry is default format */
-/* 1.44 MB 31/2 floppy disks */
-{ FDRIVE_DRV_144, 18, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 20, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 21, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 21, 82, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 21, 83, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 22, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 23, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_144, 24, 80, 1, FDRIVE_RATE_500K, },
-/* 2.88 MB 3"1/2 floppy disks */
-{ FDRIVE_DRV_288, 36, 80, 1, FDRIVE_RATE_1M, },
-{ FDRIVE_DRV_288, 39, 80, 1, FDRIVE_RATE_1M, },
-{ FDRIVE_DRV_288, 40, 80, 1, FDRIVE_RATE_1M, },
-{ FDRIVE_DRV_288, 44, 80, 1, FDRIVE_RATE_1M, },
-{ FDRIVE_DRV_288, 48, 80, 1, FDRIVE_RATE_1M, },
-/* 720 kB 3"1/2 floppy disks */
-{ FDRIVE_DRV_144,  9, 80, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_144, 10, 80, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_144, 10, 82, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_144, 10, 83, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_144, 13, 80, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_144, 14, 80, 1, FDRIVE_RATE_250K, },
-/* 1.2 MB 5"1/4 floppy disks */
-{ FDRIVE_DRV_120, 15, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_120, 18, 80, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_120, 18, 82, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_120, 18, 83, 1, FDRIVE_RATE_500K, },
-{ FDRIVE_DRV_120, 20, 80, 1, FDRIVE_RATE_500K, },
-/* 720 kB 5"1/4 floppy disks */
-{ FDRIVE_DRV_120,  9, 80, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_120, 11, 80, 1, FDRIVE_RATE_250K, },
-/* 360 kB 5"1/4 floppy disks */
-{ FDRIVE_DRV_120,  9, 40, 1, FDRIVE_RATE_300K, },
-{ FDRIVE_DRV_120,  9, 40, 0, FDRIVE_RATE_300K, },
-{ FDRIVE_DRV_120, 10, 41, 1, FDRIVE_RATE_300K, },
-{ FDRIVE_DRV_120, 10, 42, 1, FDRIVE_RATE_300K, },
-/* 320 kB 5"1/4 floppy disks */
-{ FDRIVE_DRV_120,  8, 40, 1, FDRIVE_RATE_250K, },
-{ FDRIVE_DRV_120,  8, 40, 0, FDRIVE_RATE_250K, },
-/* 360 kB must match 5"1/4 better than 3"1/2... */
-{ FDRIVE_DRV_144,  9, 80, 0, FDRIVE_RATE_250K, },
-/* end */
-{ FDRIVE_DRV_NONE, -1, -1, 0, 0, },
-};
-
-void bdrv_get_floppy_geometry_hint(BlockDriverState *bs, int *nb_heads,
-   int *max_track, int *last_sect,
-   FDriveType drive_in, FDriveType *drive,
-   FDriveRate *rate)
-{
-const FDFormat *parse;
-uint64_t nb_sectors, size;
-int i, first_match, match;
-
-bdrv_get_geometry(bs, &nb_sectors);
-match = -1;
-first_match = -1;
-for (i = 0; ; i++) {
-parse = &fd_formats[i];
-if (parse->drive == FDRIVE_DRV_NONE) {
-break;
-}
-if (drive_in == parse->drive ||
-drive_in == FDRIVE_DRV_NONE) {
-size = (parse->max_head + 1) * parse->max_track *
-parse->last_sect;
-if (nb_sectors == size) {
-match = i;
-break;
-}
-if (first_match == -1) {
-first_match = i;
-}
-}
-}
-if (match == -1) {
-if (first_match == -1) {
-match = 1;
-} else {
-match = first_match;
-}
-parse = fd_formats[match];
-}
-*nb_heads = parse-max_head + 1;
-*max_track = parse-max_track;
-*last_sect = parse-last_sect;
-*drive = parse-drive;
-*rate = parse-rate;
-}
-
 int bdrv_get_translation_hint(BlockDriverState *bs)
 {
 return bs-translation;
diff --git a/block.h b/block.h
index e34d942..b24f664 100644
--- a/block.h
+++ 

[Qemu-devel] [PATCH v3 20/29] qdev: Collect private helpers in one place

2012-07-10 Thread Markus Armbruster
Just code motion, with one long line wrapped to keep checkpatch.pl
happy.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/qdev-properties.c |  144 +-
 1 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 0b89462..002c7f9 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -10,6 +10,78 @@ void *qdev_get_prop_ptr(DeviceState *dev, Property *prop)
 return ptr;
 }
 
+static void get_pointer(Object *obj, Visitor *v, Property *prop,
+const char *(*print)(void *ptr),
+const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+void **ptr = qdev_get_prop_ptr(dev, prop);
+char *p;
+
+p = (char *) (*ptr ? print(*ptr) : );
+visit_type_str(v, p, name, errp);
+}
+
+static void set_pointer(Object *obj, Visitor *v, Property *prop,
+int (*parse)(DeviceState *dev, const char *str,
+ void **ptr),
+const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Error *local_err = NULL;
+void **ptr = qdev_get_prop_ptr(dev, prop);
+char *str;
+int ret;
+
+if (dev-state != DEV_STATE_CREATED) {
+error_set(errp, QERR_PERMISSION_DENIED);
+return;
+}
+
+visit_type_str(v, str, name, local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+if (!*str) {
+g_free(str);
+*ptr = NULL;
+return;
+}
+ret = parse(dev, str, ptr);
+error_set_from_qdev_prop_error(errp, ret, dev, prop, str);
+g_free(str);
+}
+
+static void get_enum(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+int *ptr = qdev_get_prop_ptr(dev, prop);
+
+visit_type_enum(v, ptr, prop-info-enum_table,
+prop-info-name, prop-name, errp);
+}
+
+static void set_enum(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+int *ptr = qdev_get_prop_ptr(dev, prop);
+
+if (dev-state != DEV_STATE_CREATED) {
+error_set(errp, QERR_PERMISSION_DENIED);
+return;
+}
+
+visit_type_enum(v, ptr, prop-info-enum_table,
+prop-info-name, prop-name, errp);
+}
+
+/* Bit */
+
 static uint32_t qdev_get_prop_mask(Property *prop)
 {
 assert(prop-info == qdev_prop_bit);
@@ -26,8 +98,6 @@ static void bit_prop_set(DeviceState *dev, Property *props, 
bool val)
 *p = ~mask;
 }
 
-/* Bit */
-
 static int print_bit(DeviceState *dev, Property *prop, char *dest, size_t len)
 {
 uint32_t *p = qdev_get_prop_ptr(dev, prop);
@@ -435,48 +505,6 @@ static const char *print_drive(void *ptr)
 return bdrv_get_device_name(ptr);
 }
 
-static void get_pointer(Object *obj, Visitor *v, Property *prop,
-const char *(*print)(void *ptr),
-const char *name, Error **errp)
-{
-DeviceState *dev = DEVICE(obj);
-void **ptr = qdev_get_prop_ptr(dev, prop);
-char *p;
-
-p = (char *) (*ptr ? print(*ptr) : );
-visit_type_str(v, p, name, errp);
-}
-
-static void set_pointer(Object *obj, Visitor *v, Property *prop,
-int (*parse)(DeviceState *dev, const char *str, void 
**ptr),
-const char *name, Error **errp)
-{
-DeviceState *dev = DEVICE(obj);
-Error *local_err = NULL;
-void **ptr = qdev_get_prop_ptr(dev, prop);
-char *str;
-int ret;
-
-if (dev-state != DEV_STATE_CREATED) {
-error_set(errp, QERR_PERMISSION_DENIED);
-return;
-}
-
-visit_type_str(v, str, name, local_err);
-if (local_err) {
-error_propagate(errp, local_err);
-return;
-}
-if (!*str) {
-g_free(str);
-*ptr = NULL;
-return;
-}
-ret = parse(dev, str, ptr);
-error_set_from_qdev_prop_error(errp, ret, dev, prop, str);
-g_free(str);
-}
-
 static void get_drive(Object *obj, Visitor *v, void *opaque,
   const char *name, Error **errp)
 {
@@ -735,7 +763,6 @@ PropertyInfo qdev_prop_macaddr = {
 .set   = set_mac,
 };
 
-
 /* --- lost tick policy --- */
 
 static const char *lost_tick_policy_table[LOST_TICK_MAX+1] = {
@@ -748,33 +775,6 @@ static const char *lost_tick_policy_table[LOST_TICK_MAX+1] 
= {
 
 QEMU_BUILD_BUG_ON(sizeof(LostTickPolicy) != sizeof(int));
 
-static void get_enum(Object *obj, Visitor *v, void *opaque,
- const char *name, Error **errp)
-{
-DeviceState *dev = DEVICE(obj);
-Property *prop = opaque;
-int *ptr = qdev_get_prop_ptr(dev, prop);
-
-visit_type_enum(v, ptr, prop-info-enum_table,
-prop-info-name, prop-name, errp);

[Qemu-devel] [PATCH v3 15/29] hd-geometry: Switch to uint32_t to match BlockConf

2012-07-10 Thread Markus Armbruster
Best to use the same type, to avoid unwanted truncation or sign
extension.

BlockConf can't use plain int for cyls, heads and secs, because
integer properties require an exact width.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/block-common.h |2 +-
 hw/hd-geometry.c  |4 ++--
 hw/ide/core.c |2 +-
 hw/scsi-disk.c|2 +-
 hw/virtio-blk.c   |2 +-
 trace-events  |2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/block-common.h b/hw/block-common.h
index bba817a..2f65186 100644
--- a/hw/block-common.h
+++ b/hw/block-common.h
@@ -16,7 +16,7 @@
 /* Hard disk geometry */
 
 void hd_geometry_guess(BlockDriverState *bs,
-   int *pcyls, int *pheads, int *psecs,
+   uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs,
int *ptrans);
 
 #endif
diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index 4d746b7..7626cbb 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -98,7 +98,7 @@ static int guess_disk_lchs(BlockDriverState *bs,
 }
 
 static void guess_chs_for_size(BlockDriverState *bs,
-   int *pcyls, int *pheads, int *psecs)
+uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs)
 {
 uint64_t nb_sectors;
 int cylinders;
@@ -117,7 +117,7 @@ static void guess_chs_for_size(BlockDriverState *bs,
 }
 
 void hd_geometry_guess(BlockDriverState *bs,
-   int *pcyls, int *pheads, int *psecs,
+   uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs,
int *ptrans)
 {
 int cylinders, heads, secs, translation;
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 7f5ad07..f1966e3 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1927,7 +1927,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, 
IDEDriveKind kind,
const char *version, const char *serial, const char *model,
uint64_t wwn)
 {
-int cylinders, heads, secs;
+uint32_t cylinders, heads, secs;
 uint64_t nb_sectors;
 
 s-bs = bs;
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index fc077f5..c881acf 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -968,7 +968,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 };
 
 BlockDriverState *bdrv = s-qdev.conf.bs;
-int cylinders, heads, secs;
+uint32_t cylinders, heads, secs;
 uint8_t *p = *p_outbuf;
 
 if ((mode_sense_valid[page]  (1  s-qdev.type)) == 0) {
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index d2709a7..4344e28 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -590,7 +590,7 @@ static const BlockDevOps virtio_block_ops = {
 VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 {
 VirtIOBlock *s;
-int cylinders, heads, secs;
+uint32_t cylinders, heads, secs;
 static int virtio_blk_id;
 DriveInfo *dinfo;
 
diff --git a/trace-events b/trace-events
index acef082..5b5272b 100644
--- a/trace-events
+++ b/trace-events
@@ -143,7 +143,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) Read 
diagnostic %PRId64= %02x
 
 # hw/hd-geometry.c
 hd_geometry_lchs_guess(void *bs, int cyls, int heads, int secs) bs %p LCHS %d 
%d %d
-hd_geometry_guess(void *bs, int cyls, int heads, int secs, int trans) bs %p 
CHS %d %d %d trans %d
+hd_geometry_guess(void *bs, uint32_t cyls, uint32_t heads, uint32_t secs, int 
trans) bs %p CHS %u %u %u trans %d
 
 # hw/jazz-led.c
 jazz_led_read(uint64_t addr, uint8_t val) read addr=0x%PRIx64: 0x%x
-- 
1.7.6.5




Re: [Qemu-devel] [PATCH 4/5] qom-cpu: during cpu reset, it will reset its child

2012-07-10 Thread Andreas Färber
Am 10.07.2012 10:41, schrieb Paolo Bonzini:
 Il 10/07/2012 08:16, Liu Ping Fan ha scritto:
 This gives embedded logic modules, such as the APIC, the
 opportunity to reset.

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  qom/cpu.c |   16 
  1 files changed, 16 insertions(+), 0 deletions(-)

 diff --git a/qom/cpu.c b/qom/cpu.c
 index 5b36046..6aea8e6 100644
 --- a/qom/cpu.c
 +++ b/qom/cpu.c
 @@ -20,10 +20,26 @@
  
  #include qemu/cpu.h
  #include qemu-common.h
 +#include hw/qdev.h
 +
 +static int cpu_reset_kid(Object *child, void *opaque)
 +{
 +if (object_is_type_str(child, TYPE_DEVICE)) {
 +device_reset(DEVICE(child));
 +} else if (object_is_type_str(child, TYPE_BUS)) {
 +bus_reset(BUS(child));
 +} else {
 +printf(cpu's child must be DEVICE or BUS);
 +abort();
 +}
 +return 0;
 +}
  
  void cpu_reset(CPUState *cpu)
  {
  CPUClass *klass = CPU_GET_CLASS(cpu);
 +Object *obj = OBJECT(cpu);
 +object_child_foreach(obj, cpu_reset_kid, NULL);
 
 Ok, now I see what you want to do.  Next time, please add meaningful
 commit messages to all patches in the series, even those that only add
 infrastructure.
 
 It really looks like time is ripe to make CPUs children of Device, so
 you can just use qdev_reset_all to reset the CPU.

While we agree on that goal, the way there has proven controversial;
please review and comment on the two approaches under discussion.

One thing we definitely need to do is to split up qdev.h.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





Re: [Qemu-devel] [PATCH V2] Use clean shutdown request for ctrl-a x

2012-07-10 Thread Fabien Chouteau
On 07/09/2012 02:18 PM, Andreas Färber wrote:
 Am 09.07.2012 12:19, schrieb Fabien Chouteau:
 Any comment?

 On 07/04/2012 01:04 PM, Fabien Chouteau wrote:
 The goal is to make ctrl-a x to close Qemu in a clean way. The current
 exit(0) skips a lot of cleanup/close functions, for example in block
 drivers.

 Signed-off-by: Fabien Chouteau chout...@adacore.com
 ---
  qemu-char.c |2 +-
  sysemu.h|1 +
  vl.c|5 +
  3 files changed, 7 insertions(+), 1 deletion(-)

 diff --git a/qemu-char.c b/qemu-char.c
 index c2aaaee..7732846 100644
 --- a/qemu-char.c
 +++ b/qemu-char.c
 @@ -353,7 +353,7 @@ static int mux_proc_byte(CharDriverState *chr, 
 MuxDriver *d, int ch)
  {
   const char *term =  QEMU: Terminated\n\r;
   chr-chr_write(chr,(uint8_t *)term,strlen(term));
 - exit(0);
 + qemu_system_force_shutdown();
   break;
  }
  case 's':
 
 FWIW there was a recent patch by Hervé that exposed further occurrences
 of exit(), probably all would need to be reviewed and fixed.
 

There are a lot of exit() calls in QEMU (~660 with a quick grep), but it
doesn't always make sense to change them all.

In my opinion this one is a clean user request to shut down the system,
as opposed to an error state that requires exit().


-- 
Fabien Chouteau





[Qemu-devel] [PATCH v3 26/29] qtest: Test we don't put hard disk info into CMOS for a CD-ROM

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/hd-geo-test.c |   22 ++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c
index 5d9d2e4..9a31e85 100644
--- a/tests/hd-geo-test.c
+++ b/tests/hd-geo-test.c
@@ -369,6 +369,27 @@ static void test_ide_device_user_chst(void)
 test_ide_drive_user(ide-hd, true);
 }
 
+/*
+ * Test case: IDE devices (if=ide), but use index=0 for CD-ROM
+ */
+static void test_ide_drive_cd_0(void)
+{
+    char *argv[256];
+    int argc, ide_idx;
+    Backend i;
+
+    argc = setup_common(argv, ARRAY_SIZE(argv));
+    for (i = 0; i <= backend_empty; i++) {
+        ide_idx = backend_empty - i;
+        cur_ide[ide_idx] = &hd_chst[i][mbr_blank];
+        argc = setup_ide(argc, argv, ARRAY_SIZE(argv),
+                         ide_idx, NULL, i, mbr_blank, "");
+    }
+    qtest_start(g_strjoinv(" ", argv));
+    test_cmos();
+    qtest_quit(global_qtest);
+}
+
 int main(int argc, char **argv)
 {
 Backend i;
@@ -390,6 +411,7 @@ int main(int argc, char **argv)
     qtest_add_func("hd-geo/ide/drive/mbr/chs", test_ide_drive_mbr_chs);
     qtest_add_func("hd-geo/ide/drive/user/chs", test_ide_drive_user_chs);
     qtest_add_func("hd-geo/ide/drive/user/chst", test_ide_drive_user_chst);
+    qtest_add_func("hd-geo/ide/drive/cd_0", test_ide_drive_cd_0);
     qtest_add_func("hd-geo/ide/device/mbr/blank", test_ide_device_mbr_blank);
     qtest_add_func("hd-geo/ide/device/mbr/lba", test_ide_device_mbr_lba);
     qtest_add_func("hd-geo/ide/device/mbr/chs", test_ide_device_mbr_chs);
-- 
1.7.6.5




[Qemu-devel] [PATCH v17 2/9] Add XBZRLE documentation

2012-07-10 Thread Orit Wasserman
Signed-off-by: Orit Wasserman owass...@redhat.com
---
 docs/xbzrle.txt |  136 +++
 1 files changed, 136 insertions(+), 0 deletions(-)
 create mode 100644 docs/xbzrle.txt

diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
new file mode 100644
index 000..f70e851
--- /dev/null
+++ b/docs/xbzrle.txt
@@ -0,0 +1,136 @@
+XBZRLE (Xor Based Zero Run Length Encoding)
+===
+
+Using XBZRLE (Xor Based Zero Run Length Encoding) allows for the reduction
+of VM downtime and the total live-migration time of Virtual machines.
+It is particularly useful for virtual machines running memory write intensive
+workloads that are typical of large enterprise applications such as SAP ERP
+Systems, and generally speaking for any application that uses a sparse memory
+update pattern.
+
+Instead of sending the changed guest memory page this solution will send a
+compressed version of the updates, thus reducing the amount of data sent during
+live migration.
+In order to be able to calculate the update, the previous memory pages need to
+be stored on the source. Those pages are stored in a dedicated cache
+(hash table) and are
+accessed by their address.
+The larger the cache size the better the chances are that the page has already
+been stored in the cache.
+A small cache size will result in high cache miss rate.
+Cache size can be changed before and during migration.
+
+Format
+===
+
+The compression format performs a XOR between the previous and current content
+of the page, where zero represents an unchanged value.
+The page data delta is represented by zero and non zero runs.
+A zero run is represented by its length (in bytes).
+A non zero run is represented by its length (in bytes) and the new data.
+The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128)
+
+There can be more than one valid encoding; the sender may send a longer
+encoding to reduce computation cost.
+
+page = zrun nzrun
+   | zrun nzrun page
+
+zrun = length
+
+nzrun = length byte...
+
+length = uleb128 encoded integer
+
+On the sender side XBZRLE is used as a compact delta encoding of page updates,
+retrieving the old page content from the cache (default size of 512 MB). The
+receiving side uses the existing page's content and XBZRLE to decode the new
+page's content.
+
+This work was originally based on research results published in
+VEE 2011: Evaluation of Delta Compression Techniques for Efficient Live
+Migration of Large Virtual Machines by Benoit, Svard, Tordsson and Elmroth.
+The delta encoder XBRLE from that work was then improved further, resulting
+in XBZRLE.
+
+XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads making it
+ideal for in-line, real-time encoding such as is needed for live-migration.
+
+Example
+old buffer:
+1001 zeros
+05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
+3074 zeros
+
+new buffer:
+1001 zeros
+01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
+3074 zeros
+
+encoded buffer:
+
+encoded length 24
+e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
+
+Migration Capabilities
+==
+In order to use XBZRLE the destination QEMU version should be able to
+decode the new format.
+A new migration capabilities command allows external management
+to query for its support.
+A typical use for the destination
+{qemu} info migrate_capabilities
+{qemu} xbzrle, ...
+
+In order to enable capabilities for future live migration,
+a new command migrate_set_parameter is introduced:
+{qemu} migrate_set_parameter xbzrle
+
+Usage
+==
+
+1. Activate xbzrle
+2. Set the XBZRLE cache size - the cache size is in MBytes and should be a
+power of 2. The cache default value is 64MBytes.
+3. start outgoing migration
+
+A typical usage scenario:
+On the incoming QEMU:
+{qemu} migrate_set_parameter xbzrle on
+On the outgoing QEMU:
+{qemu} migrate_set_parameter xbzrle on
+{qemu} migrate_set_cachesize 256m
+{qemu} migrate -d tcp:destination.host:
+{qemu} info migrate
+...
+cache size: 67108864 bytes
+transferred ram-duplicate: A kbytes
+transferred ram-normal: B kbytes
+transferred ram-xbrle: C kbytes
+overflow ram-xbrle: D pages
+cache-miss ram-xbrle: E pages
+
+cache-miss: the number of cache misses to date - high cache-miss rate
+indicates that the cache size is set too low.
+overflow: the number of overflows, that is, cases where the delta could
+not be compressed. This can happen if the changes in the pages are too large
+or there are many short changes; for example, changing every second byte
+(half a page).
+
+Testing: Testing indicated that live migration with XBZRLE was completed in 110
+seconds, whereas without it would not be able to complete.
+
+A simple synthetic memory r/w load generator:
+..include stdlib.h
+..include stdio.h
+..int 
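
The format described in the XBZRLE documentation above can be sketched in C as follows. This is only an illustration under stated assumptions, not the encoder from the patch series: the names uleb128_put() and xbzrle_sketch_encode() are hypothetical, the scan is byte-by-byte, and trailing unchanged bytes are simply not encoded, which matches the 24-byte result in the document's example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write value as ULEB128 into out; return the number of bytes written. */
static size_t uleb128_put(uint8_t *out, uint32_t value)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        uint8_t byte = value & 0x7f;   /* low 7 bits */
        value >>= 7;
        if (value) {
            byte |= 0x80;              /* continuation bit: more bytes follow */
        }
        out[n++] = byte;
    } while (value);
    return n;
}

/*
 * Hypothetical sketch: encode new_buf against old_buf as alternating
 * zero runs (unchanged bytes) and non-zero runs (changed bytes).
 * Returns the encoded length, or -1 if dst is too small.
 */
static long xbzrle_sketch_encode(const uint8_t *old_buf,
                                 const uint8_t *new_buf, size_t len,
                                 uint8_t *dst, size_t dst_len)
{
    size_t i = 0, d = 0;

    while (i < len) {
        size_t start = i;
        size_t run;

        /* zrun: count bytes whose XOR with the old content is zero */
        while (i < len && old_buf[i] == new_buf[i]) {
            i++;
        }
        if (i == len) {
            break;                     /* trailing unchanged bytes: nothing to send */
        }
        if (d + 5 > dst_len) {         /* a 32-bit ULEB128 needs at most 5 bytes */
            return -1;
        }
        d += uleb128_put(dst + d, (uint32_t)(i - start));

        /* nzrun: length followed by the new data */
        start = i;
        while (i < len && old_buf[i] != new_buf[i]) {
            i++;
        }
        run = i - start;
        if (d + 5 + run > dst_len) {
            return -1;
        }
        d += uleb128_put(dst + d, (uint32_t)run);
        memcpy(dst + d, new_buf + start, run);
        d += run;
    }
    return (long)d;
}
```

Feeding the old and new buffers from the example into this sketch yields the same 24 bytes, starting with e9 07 for the zero run of 1001.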

[Qemu-devel] [PATCH v3 18/29] ide: qdev properties for disk geometry

2012-07-10 Thread Markus Armbruster
Geometry needs to be qdev properties, because it belongs to the
disk's guest part.

Maintain backward compatibility exactly like for serial: fall back to
DriveInfo's geometry, set with -drive cyls=...

Do this only for ide-hd.  ide-drive is legacy.  ide-cd doesn't have a
geometry.

Bonus: info qtree now shows the geometry.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/ide/core.c |   19 ++-
 hw/ide/internal.h |4 +++-
 hw/ide/qdev.c |   22 +-
 3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index f1966e3..bf1ce89 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1925,16 +1925,16 @@ static const BlockDevOps ide_cd_block_ops = {
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
const char *version, const char *serial, const char *model,
-   uint64_t wwn)
+   uint64_t wwn,
+   uint32_t cylinders, uint32_t heads, uint32_t secs,
+   int chs_trans)
 {
-uint32_t cylinders, heads, secs;
 uint64_t nb_sectors;
 
 s-bs = bs;
 s-drive_kind = kind;
 
 bdrv_get_geometry(bs, nb_sectors);
-hd_geometry_guess(bs, cylinders, heads, secs, s-chs_trans);
 if (cylinders  1 || cylinders  16383) {
 error_report(cyls must be between 1 and 16383);
 return -1;
@@ -1950,6 +1950,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, 
IDEDriveKind kind,
 s-cylinders = cylinders;
 s-heads = heads;
 s-sectors = secs;
+s-chs_trans = chs_trans;
 s-nb_sectors = nb_sectors;
 s-wwn = wwn;
 /* The SMART values should be preserved across power cycles
@@ -2076,17 +2077,25 @@ void ide_init2(IDEBus *bus, qemu_irq irq)
 void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
 DriveInfo *hd1, qemu_irq irq)
 {
-int i;
+int i, trans;
 DriveInfo *dinfo;
+uint32_t cyls, heads, secs;
 
 for(i = 0; i  2; i++) {
 dinfo = i == 0 ? hd0 : hd1;
 ide_init1(bus, i);
 if (dinfo) {
+cyls  = dinfo-cyls;
+heads = dinfo-heads;
+secs  = dinfo-secs;
+trans = dinfo-trans;
+if (!cyls  !heads  !secs) {
+hd_geometry_guess(dinfo-bdrv, cyls, heads, secs, trans);
+}
 if (ide_init_drive(bus-ifs[i], dinfo-bdrv,
dinfo-media_cd ? IDE_CD : IDE_HD, NULL,
*dinfo-serial ? dinfo-serial : NULL,
-   NULL, 0)  0) {
+   NULL, 0, cyls, heads, secs, trans)  0) {
 error_report(Can't set up IDE drive %s, dinfo-id);
 exit(1);
 }
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 56c718e..685e976 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -545,7 +545,9 @@ uint32_t ide_data_readl(void *opaque, uint32_t addr);
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
const char *version, const char *serial, const char *model,
-   uint64_t wwn);
+   uint64_t wwn,
+   uint32_t cylinders, uint32_t heads, uint32_t secs,
+   int chs_trans);
 void ide_init2(IDEBus *bus, qemu_irq irq);
 void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
 DriveInfo *hd1, qemu_irq irq);
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 87e0b75..3e297dc 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -21,6 +21,7 @@
 #include qemu-error.h
 #include hw/ide/internal.h
 #include blockdev.h
+#include hw/block-common.h
 #include sysemu.h
 
 /* - */
@@ -143,6 +144,7 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
 IDEState *s = bus-ifs + dev-unit;
 const char *serial;
 DriveInfo *dinfo;
+int trans;
 
 if (dev-conf.discard_granularity  dev-conf.discard_granularity != 512) 
{
 error_report(discard_granularity must be 512 for ide);
@@ -158,8 +160,25 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
kind)
 }
 }
 
+trans = BIOS_ATA_TRANSLATION_AUTO;
+if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
+/* try to fall back to value set with legacy -drive cyls=... */
+dinfo = drive_get_by_blockdev(dev-conf.bs);
+dev-conf.cyls  = dinfo-cyls;
+dev-conf.heads = dinfo-heads;
+dev-conf.secs  = dinfo-secs;
+trans   = dinfo-trans;
+}
+if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
+hd_geometry_guess(dev-conf.bs,
+  dev-conf.cyls, dev-conf.heads, dev-conf.secs,
+  trans);
+}
+
 if (ide_init_drive(s, dev-conf.bs, kind,
-   dev-version, serial, dev-model, dev-wwn)  0) {
+ 
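
The fallback order this patch describes (qdev property first, then the legacy -drive cyls=... value, then a guess) can be sketched in isolation. This is a simplified illustration, not the code from hw/ide/qdev.c: ChsSketch, chs_is_unset() and chs_resolve() are hypothetical names, and the guessed geometry is passed in by the caller rather than computed by hd_geometry_guess().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CHS triple; all-zero means "not configured". */
typedef struct {
    uint32_t cyls, heads, secs;
} ChsSketch;

static int chs_is_unset(const ChsSketch *g)
{
    assert(g != NULL);
    return !g->cyls && !g->heads && !g->secs;
}

/*
 * Pick the geometry in priority order:
 *   1. the qdev property,
 *   2. the legacy -drive cyls=... value,
 *   3. a guessed geometry (supplied by the caller in this sketch).
 */
static ChsSketch chs_resolve(ChsSketch prop, ChsSketch legacy,
                             ChsSketch guessed)
{
    if (chs_is_unset(&prop)) {
        prop = legacy;
    }
    if (chs_is_unset(&prop)) {
        prop = guessed;
    }
    return prop;
}
```

Treating an all-zero triple as "unset" mirrors the !cyls && !heads && !secs test in the patch; a partially specified geometry is deliberately left alone so that later validation can reject it.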

[Qemu-devel] [PATCH v3 08/29] hd-geometry: Factor out guess_chs_for_size()

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/hd-geometry.c |   32 
 1 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index db47846..1a58894 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -97,14 +97,31 @@ static int guess_disk_lchs(BlockDriverState *bs,
 return -1;
 }
 
+static void guess_chs_for_size(BlockDriverState *bs,
+                               int *pcyls, int *pheads, int *psecs)
+{
+    uint64_t nb_sectors;
+    int cylinders;
+
+    bdrv_get_geometry(bs, &nb_sectors);
+
+    cylinders = nb_sectors / (16 * 63);
+    if (cylinders > 16383) {
+        cylinders = 16383;
+    } else if (cylinders < 2) {
+        cylinders = 2;
+    }
+    *pcyls = cylinders;
+    *pheads = 16;
+    *psecs = 63;
+}
+
 void hd_geometry_guess(BlockDriverState *bs,
int *pcyls, int *pheads, int *psecs)
 {
 int translation, lba_detected = 0;
 int cylinders, heads, secs;
-uint64_t nb_sectors;
 
-bdrv_get_geometry(bs, nb_sectors);
 bdrv_get_geometry_hint(bs, cylinders, heads, secs);
 translation = bdrv_get_translation_hint(bs);
 
@@ -119,16 +136,7 @@ void hd_geometry_guess(BlockDriverState *bs,
 if (guess_disk_lchs(bs, cylinders, heads, secs)  0) {
 /* no LCHS guess: use a standard physical disk geometry  */
 default_geometry:
-cylinders = nb_sectors / (16 * 63);
-
-if (cylinders  16383) {
-cylinders = 16383;
-} else if (cylinders  2) {
-cylinders = 2;
-}
-*pcyls = cylinders;
-*pheads = 16;
-*psecs = 63;
+guess_chs_for_size(bs, pcyls, pheads, psecs);
 if ((lba_detected == 1)  (translation == BIOS_ATA_TRANSLATION_AUTO)) 
{
 if ((*pcyls * *pheads) = 131072) {
 bdrv_set_translation_hint(bs,
-- 
1.7.6.5




Re: [Qemu-devel] KVM call agenda for Tuesday, July 10th

2012-07-10 Thread Andreas Färber
Hi,

Am 09.07.2012 22:37, schrieb Juan Quintela:
 
 Please send in any agenda items you are interested in covering.

Steps towards CPU hotplug:
* how to model CPUState as a DeviceState
* coordination of APIC-related x86 CPU remodelling

Thanks,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





Re: [Qemu-devel] [PATCH] pseries: Add support for new KVM hash table control call

2012-07-10 Thread Alexander Graf

On 10.07.2012, at 10:16, Benjamin Herrenschmidt wrote:

 On Tue, 2012-07-10 at 17:25 +1000, Benjamin Herrenschmidt wrote:
 On Wed, 2012-06-27 at 22:10 +1000, Benjamin Herrenschmidt wrote:
 From: David Gibson da...@gibson.dropbear.id.au
 
 This adds support for the new reset htab ioctl which allows qemu
 to properly clean up the MMU hash table when the guest is reset. With
 the corresponding kernel support, reset of a guest now works properly.
 
 This also paves the way for indicating a different size hash table
 to the kernel and for the kernel to be able to impose limits on
 the requested size.
 
 Alex, this has a bug, if you already applied it, please sneak:
 
 Actually just drop the whole thing, it also breaks PR KVM, I need
 to work a bit more on it.

Alrighty. Dropped :).


Alex




[Qemu-devel] [PATCH v3 22/29] ide: qdev property for BIOS CHS translation

2012-07-10 Thread Markus Armbruster
This isn't quite orthodox.  CHS translation is firmware configuration,
communicated via the RTC's CMOS RAM, not a property of the disk.  But
it's best to treat it just like geometry anyway.

Maintain backward compatibility exactly like for geometry: fall back
to DriveInfo's translation, set with -drive trans=...

Bonus: info qtree now shows the translation.  Except when it shows
auto: that's resolved by pc_cmos_init_late().  To be addressed
shortly.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/ide/internal.h |1 +
 hw/ide/qdev.c |   10 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 685e976..c3ecafc 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -474,6 +474,7 @@ struct IDEDevice {
 DeviceState qdev;
 uint32_t unit;
 BlockConf conf;
+int chs_trans;
 char *version;
 char *serial;
 char *model;
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 3e297dc..f191dd3 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -144,7 +144,6 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
 IDEState *s = bus-ifs + dev-unit;
 const char *serial;
 DriveInfo *dinfo;
-int trans;
 
 if (dev-conf.discard_granularity  dev-conf.discard_granularity != 512) 
{
 error_report(discard_granularity must be 512 for ide);
@@ -160,25 +159,24 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
kind)
 }
 }
 
-trans = BIOS_ATA_TRANSLATION_AUTO;
 if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
 /* try to fall back to value set with legacy -drive cyls=... */
 dinfo = drive_get_by_blockdev(dev-conf.bs);
 dev-conf.cyls  = dinfo-cyls;
 dev-conf.heads = dinfo-heads;
 dev-conf.secs  = dinfo-secs;
-trans   = dinfo-trans;
+dev-chs_trans  = dinfo-trans;
 }
 if (!dev-conf.cyls  !dev-conf.heads  !dev-conf.secs) {
 hd_geometry_guess(dev-conf.bs,
   dev-conf.cyls, dev-conf.heads, dev-conf.secs,
-  trans);
+  dev-chs_trans);
 }
 
 if (ide_init_drive(s, dev-conf.bs, kind,
dev-version, serial, dev-model, dev-wwn,
dev-conf.cyls, dev-conf.heads, dev-conf.secs,
-   trans)  0) {
+   dev-chs_trans)  0) {
 return -1;
 }
 
@@ -222,6 +220,8 @@ static int ide_drive_initfn(IDEDevice *dev)
 static Property ide_hd_properties[] = {
 DEFINE_IDE_DEV_PROPERTIES(),
 DEFINE_BLOCK_CHS_PROPERTIES(IDEDrive, dev.conf),
+    DEFINE_PROP_BIOS_CHS_TRANS("bios-chs-trans",
+                IDEDrive, dev.chs_trans, BIOS_ATA_TRANSLATION_AUTO),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 24/29] block: Geometry and translation hints are now useless, purge them

2012-07-10 Thread Markus Armbruster
There are two producers of these hints: drive_init() on behalf of
-drive, and hd_geometry_guess().

The only consumer of the hint is hd_geometry_guess().

The callers of hd_geometry_guess() call it only when drive_init()
didn't set the hints.  Therefore, drive_init()'s hints are never used.

Thus, hd_geometry_guess() only ever sees hints it produced itself in a
prior call.  Only the first call computes something, subsequent calls
just repeat the first call's results.  However, hd_geometry_guess() is
never called more than once: the device models don't, and the block
device is destroyed on unplug.  Thus, dropping the repeat feature
doesn't break anything now.

If a block device wasn't destroyed on unplug and could be reused with
a new device, then repeating old results would be wrong.  Thus,
dropping the repeat feature prevents future breakage.

This renders the hints unused.  Purge them from the block layer.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block.c  |   32 
 block.h  |   12 
 block_int.h  |1 -
 blockdev.c   |   14 ++
 hw/block-common.h|6 ++
 hw/hd-geometry.c |   20 +---
 hw/pc.c  |1 +
 hw/qdev-properties.c |1 +
 vl.c |2 +-
 9 files changed, 12 insertions(+), 77 deletions(-)

diff --git a/block.c b/block.c
index 06323cf..ce7eb8f 100644
--- a/block.c
+++ b/block.c
@@ -996,12 +996,6 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 bs_dest-block_timer= bs_src-block_timer;
 bs_dest-io_limits_enabled  = bs_src-io_limits_enabled;
 
-/* geometry */
-bs_dest-cyls   = bs_src-cyls;
-bs_dest-heads  = bs_src-heads;
-bs_dest-secs   = bs_src-secs;
-bs_dest-translation= bs_src-translation;
-
 /* r/w error */
 bs_dest-on_read_error  = bs_src-on_read_error;
 bs_dest-on_write_error = bs_src-on_write_error;
@@ -2132,27 +2126,6 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t 
*nb_sectors_ptr)
 *nb_sectors_ptr = length;
 }
 
-void bdrv_set_geometry_hint(BlockDriverState *bs,
-int cyls, int heads, int secs)
-{
-bs-cyls = cyls;
-bs-heads = heads;
-bs-secs = secs;
-}
-
-void bdrv_set_translation_hint(BlockDriverState *bs, int translation)
-{
-bs-translation = translation;
-}
-
-void bdrv_get_geometry_hint(BlockDriverState *bs,
-int *pcyls, int *pheads, int *psecs)
-{
-*pcyls = bs-cyls;
-*pheads = bs-heads;
-*psecs = bs-secs;
-}
-
 /* throttling disk io limits */
 void bdrv_set_io_limits(BlockDriverState *bs,
 BlockIOLimit *io_limits)
@@ -2161,11 +2134,6 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 bs-io_limits_enabled = bdrv_io_limits_enabled(bs);
 }
 
-int bdrv_get_translation_hint(BlockDriverState *bs)
-{
-return bs-translation;
-}
-
 void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error,
BlockErrorAction on_write_error)
 {
diff --git a/block.h b/block.h
index 1cd8a01..29c5eab 100644
--- a/block.h
+++ b/block.h
@@ -257,18 +257,6 @@ int bdrv_has_zero_init(BlockDriverState *bs);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
   int *pnum);
 
-#define BIOS_ATA_TRANSLATION_AUTO   0
-#define BIOS_ATA_TRANSLATION_NONE   1
-#define BIOS_ATA_TRANSLATION_LBA2
-#define BIOS_ATA_TRANSLATION_LARGE  3
-#define BIOS_ATA_TRANSLATION_RECHS  4
-
-void bdrv_set_geometry_hint(BlockDriverState *bs,
-int cyls, int heads, int secs);
-void bdrv_set_translation_hint(BlockDriverState *bs, int translation);
-void bdrv_get_geometry_hint(BlockDriverState *bs,
-int *pcyls, int *pheads, int *psecs);
-int bdrv_get_translation_hint(BlockDriverState *bs);
 void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error,
BlockErrorAction on_write_error);
 BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
diff --git a/block_int.h b/block_int.h
index 1fb5352..d72317f 100644
--- a/block_int.h
+++ b/block_int.h
@@ -320,7 +320,6 @@ struct BlockDriverState {
 
 /* NOTE: the following infos are only hints for real hardware
drivers. They are not used by the block driver */
-int cyls, heads, secs, translation;
 BlockErrorAction on_read_error, on_write_error;
 bool iostatus_enabled;
 BlockDeviceIoStatus iostatus;
diff --git a/blockdev.c b/blockdev.c
index 161985b..06c997e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -7,8 +7,8 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
-#include block.h
 #include blockdev.h
+#include hw/block-common.h
 #include monitor.h
 #include qerror.h
 #include qemu-option.h
@@ -551,17 +551,7 @@ DriveInfo *drive_init(QemuOpts *opts, int 

[Qemu-devel] [PATCH v3 27/29] hd-geometry: Compute BIOS CHS translation in one place

2012-07-10 Thread Markus Armbruster
Currently, it is split between hd_geometry_guess() and
pc_cmos_init_late().  Confusing.  info qtree shows the result of the
former.  Also confusing.

Fold the part done in pc_cmos_init_late() into hd_geometry_guess().
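
The rule being consolidated can be sketched in standalone C. This is only an illustration, not the code in hw/hd-geometry.c: the enum values mirror the BIOS_ATA_TRANSLATION_* macros from block.h, and the names TRANS_* and auto_trans() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the BIOS_ATA_TRANSLATION_* macros from block.h. */
enum {
    TRANS_AUTO = 0,
    TRANS_NONE = 1,
    TRANS_LBA  = 2,
};

/*
 * Resolve "auto" translation (hypothetical helper name): a geometry that
 * fits within 1024 cylinders, 16 heads and 63 sectors needs no
 * translation; anything larger gets LBA translation.
 */
static int auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs)
{
    assert(cyls > 0 && heads > 0 && secs > 0);
    return cyls <= 1024 && heads <= 16 && secs <= 63
        ? TRANS_NONE
        : TRANS_LBA;
}
```

With the decision in one helper, both the device model and the CMOS setup can rely on a translation value that is never left at "auto".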

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/block-common.h |1 +
 hw/hd-geometry.c  |9 -
 hw/ide/core.c |2 ++
 hw/ide/qdev.c |3 +++
 hw/pc.c   |   19 ---
 5 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/hw/block-common.h b/hw/block-common.h
index ec7810d..31e12ba 100644
--- a/hw/block-common.h
+++ b/hw/block-common.h
@@ -24,5 +24,6 @@
 void hd_geometry_guess(BlockDriverState *bs,
uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs,
int *ptrans);
+int hd_bios_chs_auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs);
 
 #endif
diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index 74678a6..1cdb9fb 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -125,7 +125,7 @@ void hd_geometry_guess(BlockDriverState *bs,
     if (guess_disk_lchs(bs, &cylinders, &heads, &secs) < 0) {
         /* no LCHS guess: use a standard physical disk geometry  */
         guess_chs_for_size(bs, pcyls, pheads, psecs);
-        translation = BIOS_ATA_TRANSLATION_AUTO;
+        translation = hd_bios_chs_auto_trans(*pcyls, *pheads, *psecs);
     } else if (heads > 16) {
         /* LCHS guess with heads > 16 means that a BIOS LBA
            translation was active, so a standard physical disk
@@ -148,3 +148,10 @@ void hd_geometry_guess(BlockDriverState *bs,
 }
 trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
 }
+
+int hd_bios_chs_auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs)
+{
+    return cyls <= 1024 && heads <= 16 && secs <= 63
+        ? BIOS_ATA_TRANSLATION_NONE
+        : BIOS_ATA_TRANSLATION_LBA;
+}
diff --git a/hw/ide/core.c b/hw/ide/core.c
index bf1ce89..1ca7cdf 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2091,6 +2091,8 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
         trans = dinfo->trans;
         if (!cyls && !heads && !secs) {
             hd_geometry_guess(dinfo->bdrv, &cyls, &heads, &secs, &trans);
+        } else if (trans == BIOS_ATA_TRANSLATION_AUTO) {
+            trans = hd_bios_chs_auto_trans(cyls, heads, secs);
         }
         if (ide_init_drive(&bus->ifs[i], dinfo->bdrv,
                            dinfo->media_cd ? IDE_CD : IDE_HD, NULL,
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 84097fd..de9db3b 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -171,6 +171,9 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
         hd_geometry_guess(dev->conf.bs,
                           &dev->conf.cyls, &dev->conf.heads, &dev->conf.secs,
                           &dev->chs_trans);
+    } else if (dev->chs_trans == BIOS_ATA_TRANSLATION_AUTO) {
+        dev->chs_trans = hd_bios_chs_auto_trans(dev->conf.cyls,
+                                                dev->conf.heads, dev->conf.secs);
     }
 
     if (ide_init_drive(s, dev->conf.bs, kind,
diff --git a/hw/pc.c b/hw/pc.c
index 77b12b4..598267a 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -290,7 +290,7 @@ static void pc_cmos_init_late(void *opaque)
 int16_t cylinders;
 int8_t heads, sectors;
 int val;
-int i;
+int i, trans;
 
 val = 0;
 if (ide_get_geometry(arg-idebus[0], 0,
@@ -313,20 +313,9 @@ static void pc_cmos_init_late(void *opaque)
geometry can be different if a translation is done. */
         if (ide_get_geometry(arg->idebus[i / 2], i % 2,
                              &cylinders, &heads, &sectors) >= 0) {
-            int translation = ide_get_bios_chs_trans(arg->idebus[i / 2],
-                                                     i % 2);
-            if (translation == BIOS_ATA_TRANSLATION_AUTO) {
-                if (cylinders <= 1024 && heads <= 16 && sectors <= 63) {
-                    /* No translation. */
-                    translation = 0;
-                } else {
-                    /* LBA translation. */
-                    translation = 1;
-                }
-            } else {
-                translation--;
-            }
-            val |= translation << (i * 2);
+            trans = ide_get_bios_chs_trans(arg->idebus[i / 2], i % 2) - 1;
+            assert((trans & ~3) == 0);
+            val |= trans << (i * 2);
         }
     }
     rtc_set_memory(s, 0x39, val);
-- 
1.7.6.5




[Qemu-devel] [PATCH v17 5/9] Change ram_save_block to return -1 if there are no more changes

2012-07-10 Thread Orit Wasserman
It will return 0 if the page is unmodified.
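
The distinction matters for any loop that drains blocks until the end. A minimal sketch, with next_block() as a hypothetical stand-in for ram_save_block() and an array of per-page byte counts standing in for guest RAM:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for ram_save_block(): each entry of `pages` is the
 * number of bytes "sent" for one page, where 0 means the page was
 * unmodified.  Returns -1 once every page has been consumed.
 */
static int next_block(const int *pages, size_t n, size_t *pos)
{
    assert(pages != NULL && pos != NULL);
    if (*pos >= n) {
        return -1;              /* no more blocks */
    }
    return pages[(*pos)++];
}

/* Drain every page, summing the bytes that were actually sent. */
static long drain(const int *pages, size_t n)
{
    size_t pos = 0;
    long total = 0;
    int sent;

    while ((sent = next_block(pages, n, &pos)) != -1) {
        total += sent;          /* an unmodified page contributes 0 */
    }
    return total;
}
```

If 0 were still the end marker, the loop would stop at the first unmodified page and skip everything after it.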

Signed-off-by: Orit Wasserman owass...@redhat.com
---
 arch_init.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 5b0f562..91e583f 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -188,7 +188,7 @@ static int ram_save_block(QEMUFile *f)
 {
 RAMBlock *block = last_block;
 ram_addr_t offset = last_offset;
-int bytes_sent = 0;
+int bytes_sent = -1;
 MemoryRegion *mr;
 
 if (!block)
@@ -349,8 +349,11 @@ int ram_save_live(QEMUFile *f, int stage, void *opaque)
 int bytes_sent;
 
 bytes_sent = ram_save_block(f);
-        bytes_transferred += bytes_sent;
-        if (bytes_sent == 0) { /* no more blocks */
+        /* bytes_sent 0 represent unchanged page,
+           bytes_sent -1 represent no more blocks*/
+        if (bytes_sent > 0) {
+            bytes_transferred += bytes_sent;
+        } else if (bytes_sent == -1) { /* no more blocks */
             break;
         }
 /* we want to check in the 1st loop, just in case it was the 1st time
@@ -387,7 +390,7 @@ int ram_save_live(QEMUFile *f, int stage, void *opaque)
 int bytes_sent;
 
 /* flush all remaining blocks regardless of rate limiting */
-while ((bytes_sent = ram_save_block(f)) != 0) {
+while ((bytes_sent = ram_save_block(f)) != -1) {
 bytes_transferred += bytes_sent;
 }
 memory_global_dirty_log_stop();
-- 
1.7.7.6




[Qemu-devel] [PATCH v3 02/29] vvfat: Fix partition table

2012-07-10 Thread Markus Armbruster
Unless parameter :floppy: is given, vvfat creates a virtual image
with DOS MBR defining a single partition which holds the FAT file
system.  The size of the virtual image depends on the width of the
FAT: 32 MiB (CHS 64, 16, 63) for 12 bit FAT, 504 MiB (CHS 1024, 16,
63) for 16 and 32 bit FAT, leaving (64*16-1)*63 = 64449 and
(1024*16-1)*63 = 1032129 sectors for the partition.

However, it screws up the end of the partition in the MBR:

    FAT width param.  start CHS  end CHS     start LBA  size
        :32:          0,1,1      1023,14,63  63         1032065
        :16:          0,1,1      1023,14,55  63         1032057
        :12:          0,1,1      63,14,55    63         64377

The actual FAT file system nevertheless assumes the partition has
1032129 or 64449 sectors.  Oops.
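
The expected numbers can be checked with a little arithmetic. The sketch below is not vvfat code; Chs, lba_to_chs() and partition_size() are hypothetical names, and it assumes the partition begins on the second track (LBA 63) and runs to the last sector of the image.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cyl, head, sec;    /* sec is 1-based, as in an MBR entry */
} Chs;

/* Convert a linear sector number to CHS for the given geometry. */
static Chs lba_to_chs(uint32_t lba, uint32_t heads, uint32_t secs)
{
    Chs r;

    assert(heads > 0 && secs > 0);
    r.sec  = lba % secs + 1;
    lba   /= secs;
    r.head = lba % heads;
    r.cyl  = lba / heads;
    return r;
}

/* Everything except the first track belongs to the partition. */
static uint32_t partition_size(uint32_t cyls, uint32_t heads, uint32_t secs)
{
    return (cyls * heads - 1) * secs;
}
```

Under those assumptions, the last sector of the 504 MiB image maps to CHS 1023,15,63, and the partition sizes come out as 64449 and 1032129 sectors.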

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block/vvfat.c |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 0fd3367..e2b83a2 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -394,11 +394,12 @@ static void init_mbr(BDRVVVFATState* s)
 
 /* LBA is used when partition is outside the CHS geometry */
     lba = sector2CHS(s->bs, &partition->start_CHS, s->first_sectors_number-1);
-    lba|= sector2CHS(s->bs, &partition->end_CHS,   s->sector_count);
+    lba |= sector2CHS(s->bs, &partition->end_CHS, s->bs->total_sectors - 1);
 
     /*LBA partitions are identified only by start/length_sector_long not by CHS*/
-    partition->start_sector_long =cpu_to_le32(s->first_sectors_number-1);
-    partition->length_sector_long=cpu_to_le32(s->sector_count - s->first_sectors_number+1);
+    partition->start_sector_long  = cpu_to_le32(s->first_sectors_number - 1);
+    partition->length_sector_long = cpu_to_le32(s->bs->total_sectors
+                                                - s->first_sectors_number + 1);
 
 /* FAT12/FAT16/FAT32 */
 /* DOS uses different types when partition is LBA,
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 28/29] blockdev: Drop redundant CHS validation for if=ide

2012-07-10 Thread Markus Armbruster
Leave it to ide_init_drive().

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 blockdev.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 06c997e..5f8677e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -330,15 +330,15 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
 max_devs = if_max_devs[type];
 
 if (cyls || heads || secs) {
-        if (cyls < 1 || (type == IF_IDE && cyls > 16383)) {
+        if (cyls < 1) {
             error_report("invalid physical cyls number");
             return NULL;
         }
-        if (heads < 1 || (type == IF_IDE && heads > 16)) {
+        if (heads < 1) {
             error_report("invalid physical heads number");
             return NULL;
         }
-        if (secs < 1 || (type == IF_IDE && secs > 63)) {
+        if (secs < 1) {
             error_report("invalid physical secs number");
             return NULL;
         }
-- 
1.7.6.5




Re: [Qemu-devel] [PATCH 5/5] apic: create apic as a child of cpu, not system_bus any longer

2012-07-10 Thread Andreas Färber
On 10.07.2012 10:45, Paolo Bonzini wrote:
 On 10/07/2012 08:16, Liu Ping Fan wrote:
 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 ---
  hw/pc.c |   10 +-
  1 files changed, 5 insertions(+), 5 deletions(-)

 diff --git a/hw/pc.c b/hw/pc.c
 index c7e9ab3..8df58c9 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -886,17 +886,17 @@ DeviceState *cpu_get_current_apic(void)
  }
  }
  
 -static DeviceState *apic_init(void *env, uint8_t apic_id)
 +static DeviceState *apic_init(void *cpu, void *env, uint8_t apic_id)
  {
  DeviceState *dev;
  static int apic_mapped;
  
  if (kvm_irqchip_in_kernel()) {
 -dev = qdev_create(NULL, kvm-apic);
 +dev = qdev_create_kid(OBJECT(cpu), kvm-apic);
  } else if (xen_enabled()) {
 -dev = qdev_create(NULL, xen-apic);
 +dev = qdev_create_kid(OBJECT(cpu), xen-apic);
  } else {
 -dev = qdev_create(NULL, apic);
 +dev = qdev_create_kid(OBJECT(cpu), apic);
  }
 
 Does it make sense instead to do this in the realize method of the CPU?

Igor was working on patches to do that. We ran into other design issues
on that road, yesterday I made a proposal how we might proceed with his
approach:

http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg00992.html

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





Re: [Qemu-devel] [PATCH v3 2/7] memory: Flush coalesced MMIO on selected region access

2012-07-10 Thread Jan Kiszka
On 2012-07-02 11:07, Avi Kivity wrote:
 On 06/29/2012 07:37 PM, Jan Kiszka wrote:
 Instead of flushing pending coalesced MMIO requests on every vmexit,
 this provides a mechanism to selectively flush when memory regions
 related to the coalesced one are accessed. This first of all includes
 the coalesced region itself but can also be applied to other regions, e.g.
 of the same device, by calling memory_region_set_flush_coalesced.
 
 Looks fine.
 
 I have a hard time deciding whether this should go through the kvm tree
 or memory tree.  Anthony, perhaps you can commit it directly to avoid
 the livelock?
 
 Reviewed-by: Avi Kivity a...@redhat.com
 

Anthony, ping?

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux





[Qemu-devel] [PATCH v3 03/29] vvfat: Do not clobber the user's geometry

2012-07-10 Thread Markus Armbruster
vvfat creates a virtual VFAT filesystem with a certain logical
geometry that depends on its options.  It sets the geometry hint to
this geometry.  It is the only block driver to do this.

The geometry hint is about *physical* geometry, and used only by
certain hard disk device models.

vvfat's hint is normally invisible for device models, because
bdrv_open() puts a raw format on top of vvfat's fat protocol.  That
raw format is where drive_init() puts the user's geometry (if any),
and where the device model gets it from.

Nobody complained, because the default physical geometry is the same
as vvfat's logical geometry:

    opts        LCHS        def. PCHS
                1024,16,63  same
    :32:        1024,16,63  same
    :16:        1024,16,63  same
    :12:          64,16,63  same

Except when you specify :floppy:

    opts        LCHS        def. PCHS
       :floppy:   80, 2,36  5,16,63
    :32:floppy:   80, 2,36  5,16,63
    :16:floppy:   80, 2,36  5,16,63
    :12:floppy:   80, 2,18  2,16,63

Silly thing to do for use with a hard disk.

However, the raw format can be suppressed by adding a
redundant-looking format=vvfat to file=fat:FOO.  Then, vvfat's
hint clobbers the user's geometry, i.e. -drive options cyls, heads,
secs get silently ignored.  Don't do that.

No change without format=vvfat.  With it, the user's hard disk
geometry (-drive options cyls, heads, secs) is now obeyed, and the
default hard disk geometry with :floppy: now matches the one without
format=vvfat.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block/vvfat.c |   52 
 1 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index e2b83a2..07d637b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -359,11 +359,12 @@ typedef struct BDRVVVFATState {
  * if the position is outside the specified geometry, fill maximum value for 
CHS
  * and return 1 to signal overflow.
  */
-static int sector2CHS(BlockDriverState* bs, mbr_chs_t * chs, int spos){
+static int sector2CHS(mbr_chs_t *chs, int spos, int cyls, int heads, int secs)
+{
     int head,sector;
-    sector   = spos % (bs->secs);  spos/= bs->secs;
-    head     = spos % (bs->heads); spos/= bs->heads;
-    if(spos >= bs->cyls){
+    sector   = spos % secs;  spos /= secs;
+    head     = spos % heads; spos /= heads;
+    if (spos >= cyls) {
 /* Overflow,
 it happens if 32bit sector positions are used, while CHS is only 24bit.
 Windows/Dos is said to take 1023/255/63 as nonrepresentable CHS */
@@ -378,7 +379,7 @@ static int sector2CHS(BlockDriverState* bs, mbr_chs_t * 
chs, int spos){
 return 0;
 }
 
-static void init_mbr(BDRVVVFATState* s)
+static void init_mbr(BDRVVVFATState *s, int cyls, int heads, int secs)
 {
 /* TODO: if the files mbr.img and bootsect.img exist, use them */
 mbr_t* real_mbr=(mbr_t*)s-first_sectors;
@@ -393,8 +394,10 @@ static void init_mbr(BDRVVVFATState* s)
 partition-attributes=0x80; /* bootable */
 
 /* LBA is used when partition is outside the CHS geometry */
-lba = sector2CHS(s-bs, partition-start_CHS, s-first_sectors_number-1);
-lba |= sector2CHS(s-bs, partition-end_CHS, s-bs-total_sectors - 1);
+lba  = sector2CHS(partition-start_CHS, s-first_sectors_number - 1,
+ cyls, heads, secs);
+lba |= sector2CHS(partition-end_CHS,   s-bs-total_sectors - 1,
+ cyls, heads, secs);
 
 /*LBA partitions are identified only by start/length_sector_long not by 
CHS*/
 partition-start_sector_long  = cpu_to_le32(s-first_sectors_number - 1);
@@ -831,7 +834,7 @@ static inline off_t cluster2sector(BDRVVVFATState* s, 
uint32_t cluster_num)
 }
 
 static int init_directories(BDRVVVFATState* s,
-   const char* dirname)
+const char *dirname, int heads, int secs)
 {
 bootsector_t* bootsector;
 mapping_t* mapping;
@@ -958,8 +961,8 @@ static int init_directories(BDRVVVFATState* s,
 bootsector-media_type=(s-first_sectors_number1?0xf8:0xf0); /* media 
descriptor (f8=hd, f0=3.5 fd)*/
 s-fat.pointer[0] = bootsector-media_type;
 bootsector-sectors_per_fat=cpu_to_le16(s-sectors_per_fat);
-bootsector-sectors_per_track=cpu_to_le16(s-bs-secs);
-bootsector-number_of_heads=cpu_to_le16(s-bs-heads);
+bootsector-sectors_per_track = cpu_to_le16(secs);
+bootsector-number_of_heads = cpu_to_le16(heads);
 bootsector-hidden_sectors=cpu_to_le32(s-first_sectors_number==1?0:0x3f);
 
bootsector-total_sectors=cpu_to_le32(s-sector_count0x?s-sector_count:0);
 
@@ -992,7 +995,7 @@ static void vvfat_rebind(BlockDriverState *bs)
 static int vvfat_open(BlockDriverState *bs, const char* dirname, int flags)
 {
 BDRVVVFATState *s = bs-opaque;
-int i;
+int i, cyls, heads, secs;
 
 #ifdef DEBUG
 vvv = s;
@@ -1034,24 +1037,28 @@ DLOG(if (stderr == NULL) {
/* 1.44MB or 2.88MB floppy.  2.88MB can be FAT12 (default) or 

[Qemu-devel] [PATCH] Add --disable-bridge-helper configure flag

2012-07-10 Thread Fabien Chouteau
The bridge helper uses ioctls that are not available on old Linux
versions, so add this flag to disable its build.

Signed-off-by: Fabien Chouteau chout...@adacore.com
---
 Makefile  |2 +-
 configure |8 
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 34d6a9e..b46c6b0 100644
--- a/Makefile
+++ b/Makefile
@@ -37,7 +37,7 @@ $(call set-vpath, $(SRC_PATH))
 
 LIBS+=-lz $(LIBS_TOOLS)
 
-HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
+HELPERS-$(CONFIG_HELPERS) = qemu-bridge-helper$(EXESUF)
 
 ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 
QMP/qmp-commands.txt
diff --git a/configure b/configure
index 500fe24..5566752 100755
--- a/configure
+++ b/configure
@@ -195,6 +195,7 @@ zlib=yes
 guest_agent=yes
 libiscsi=
 coroutine=
+bridge_helper=yes
 
 # parse CC options first
 for opt do
@@ -824,6 +825,8 @@ for opt do
   ;;
   --disable-guest-agent) guest_agent=no
   ;;
+  --disable-bridge-helper) bridge_helper=no
+  ;;
   *) echo ERROR: unknown option $opt; show_help=yes
   ;;
   esac
@@ -1110,6 +1113,7 @@ echo   --disable-guest-agentdisable building of the 
QEMU Guest Agent
 echo   --enable-guest-agent enable building of the QEMU Guest Agent
 echo   --with-coroutine=BACKEND coroutine backend. Supported options:
 echogthread, ucontext, sigaltstack, windows
+echo   --disable-bridge-helper  disable building of the qemu-bridge-helper
 echo 
 echo NOTE: The object files are built at the place where configure is 
launched
 exit 1
@@ -3896,6 +3900,10 @@ if test $tcg_interpreter = yes ; then
   echo CONFIG_TCI_DIS=y   $libdis_config_mak
 fi
 
+if test "$bridge_helper" = "yes" && test "$linux" = "yes" ; then
+    echo "CONFIG_HELPERS=y" >> $config_host_mak
+fi
+
 case $ARCH in
 alpha)
   # Ensure there's only a single GP
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH] Add --disable-bridge-helper configure flag

2012-07-10 Thread Paolo Bonzini
On 10/07/2012 12:43, Fabien Chouteau wrote:
 Bridge helper uses ioctl's not available on old Linux versions, we add
 this flag to disable the build.

Which ioctls?  Please detect them, so that we can also work around them
perhaps.

Paolo



[Qemu-devel] [PATCH v3 07/29] hd-geometry: Unnest conditional in hd_geometry_guess()

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/hd-geometry.c |   84 +++---
 1 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index f0dd021..db47846 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -104,58 +104,58 @@ void hd_geometry_guess(BlockDriverState *bs,
 int cylinders, heads, secs;
 uint64_t nb_sectors;
 
-/* if a geometry hint is available, use it */
 bdrv_get_geometry(bs, nb_sectors);
 bdrv_get_geometry_hint(bs, cylinders, heads, secs);
 translation = bdrv_get_translation_hint(bs);
+
 if (cylinders != 0) {
+/* already got a geometry hint: use it */
 *pcyls = cylinders;
 *pheads = heads;
 *psecs = secs;
-} else {
-if (guess_disk_lchs(bs, cylinders, heads, secs) == 0) {
-if (heads  16) {
-/* if heads  16, it means that a BIOS LBA
-   translation was active, so the default
-   hardware geometry is OK */
-lba_detected = 1;
-goto default_geometry;
-} else {
-*pcyls = cylinders;
-*pheads = heads;
-*psecs = secs;
-/* disable any translation to be in sync with
-   the logical geometry */
-if (translation == BIOS_ATA_TRANSLATION_AUTO) {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_NONE);
-}
-}
-} else {
-default_geometry:
-/* if no geometry, use a standard physical disk geometry */
-cylinders = nb_sectors / (16 * 63);
+return;
+}
 
-if (cylinders  16383) {
-cylinders = 16383;
-} else if (cylinders  2) {
-cylinders = 2;
-}
-*pcyls = cylinders;
-*pheads = 16;
-*psecs = 63;
-if ((lba_detected == 1)
- (translation == BIOS_ATA_TRANSLATION_AUTO)) {
-if ((*pcyls * *pheads) = 131072) {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_LARGE);
-} else {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_LBA);
-}
+    if (guess_disk_lchs(bs, &cylinders, &heads, &secs) < 0) {
+        /* no LCHS guess: use a standard physical disk geometry  */
+    default_geometry:
+        cylinders = nb_sectors / (16 * 63);
+
+        if (cylinders > 16383) {
+            cylinders = 16383;
+        } else if (cylinders < 2) {
+            cylinders = 2;
+        }
+        *pcyls = cylinders;
+        *pheads = 16;
+        *psecs = 63;
+        if ((lba_detected == 1) && (translation == BIOS_ATA_TRANSLATION_AUTO)) {
+            if ((*pcyls * *pheads) <= 131072) {
+                bdrv_set_translation_hint(bs,
+                                          BIOS_ATA_TRANSLATION_LARGE);
+            } else {
+                bdrv_set_translation_hint(bs,
+                                          BIOS_ATA_TRANSLATION_LBA);
             }
         }
-        bdrv_set_geometry_hint(bs, *pcyls, *pheads, *psecs);
+    } else if (heads > 16) {
+        /* LCHS guess with heads > 16 means that a BIOS LBA
+           translation was active, so a standard physical disk
+           geometry is OK */
+        lba_detected = 1;
+        goto default_geometry;
+    } else {
+        /* LCHS guess with heads <= 16: use as physical geometry */
+        *pcyls = cylinders;
+        *pheads = heads;
+        *psecs = secs;
+        /* disable any translation to be in sync with
+           the logical geometry */
+        if (translation == BIOS_ATA_TRANSLATION_AUTO) {
+            bdrv_set_translation_hint(bs,
+                                      BIOS_ATA_TRANSLATION_NONE);
+        }
     }
+    bdrv_set_geometry_hint(bs, *pcyls, *pheads, *psecs);
     trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
 }
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 04/29] qtest: Add hard disk geometry test

2012-07-10 Thread Markus Armbruster
So far covers only IDE and tests only CMOS contents.
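
One of the CMOS values involved is the disk translation byte at register 0x39, which holds two bits per IDE drive. A minimal sketch of that packing, assuming four drive slots and translation codes numbered like BIOS_ATA_TRANSLATION_* (1 = none, 2 = LBA, 3 = large, 4 = rechs); cmos_disk_trans() is a hypothetical name and -1 is this sketch's own marker for "no hard disk":

```c
#include <assert.h>

/*
 * Pack per-drive translation codes into one byte, two bits per drive.
 * bios_trans[i] is 1..4 for a hard disk, or -1 if slot i has none
 * (the -1 marker is an assumption of this sketch).
 */
static unsigned cmos_disk_trans(const int bios_trans[4])
{
    unsigned val = 0;
    int i;

    for (i = 0; i < 4; i++) {
        int t;

        if (bios_trans[i] < 0) {
            continue;               /* empty slot contributes nothing */
        }
        t = bios_trans[i] - 1;      /* CMOS stores the code minus one */
        assert((t & ~3) == 0);      /* must fit in two bits */
        val |= (unsigned)t << (2 * i);
    }
    return val;
}
```

A test can compute the expected byte this way and compare it with what the guest-visible CMOS register returns.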

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/Makefile  |2 +
 tests/hd-geo-test.c |  403 +++
 2 files changed, 405 insertions(+), 0 deletions(-)
 create mode 100644 tests/hd-geo-test.c

diff --git a/tests/Makefile b/tests/Makefile
index d687ecc..9675ba7 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -21,6 +21,7 @@ check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 # All QTests for now are POSIX-only, but the dependencies are
 # really in libqtest, not in the testcases themselves.
 check-qtest-i386-y = tests/fdc-test$(EXESUF)
+check-qtest-i386-y += tests/hd-geo-test$(EXESUF)
 check-qtest-i386-y += tests/rtc-test$(EXESUF)
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 check-qtest-sparc-y = tests/m48t59-test$(EXESUF)
@@ -72,6 +73,7 @@ tests/test-visitor-serialization$(EXESUF): 
tests/test-visitor-serialization.o $(
 tests/rtc-test$(EXESUF): tests/rtc-test.o $(trace-obj-y)
 tests/m48t59-test$(EXESUF): tests/m48t59-test.o $(trace-obj-y)
 tests/fdc-test$(EXESUF): tests/fdc-test.o tests/libqtest.o $(trace-obj-y)
+tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o tests/libqtest.o $(trace-obj-y)
 
 # QTest rules
 
diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c
new file mode 100644
index 000..cc447a2
--- /dev/null
+++ b/tests/hd-geo-test.c
@@ -0,0 +1,403 @@
+/*
+ * Hard disk geometry test cases.
+ *
+ * Copyright (c) 2012 Red Hat Inc.
+ *
+ * Authors:
+ *  Markus Armbruster arm...@redhat.com,
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/*
+ * Covers only IDE and tests only CMOS contents.  Better than nothing.
+ * Improvements welcome.
+ */
+
+#include glib.h
+#include stdlib.h
+#include string.h
+#include unistd.h
+#include qemu-common.h
+#include libqtest.h
+
+static const char test_image[] = /tmp/qtest.XX;
+
+static char *create_test_img(int secs)
+{
+char *template = strdup(/tmp/qtest.XX);
+int fd, ret;
+
+fd = mkstemp(template);
+g_assert(fd = 0);
+ret = ftruncate(fd, (off_t)secs * 512);
+g_assert(ret == 0);
+close(fd);
+return template;
+}
+
+typedef struct {
+int cyls, heads, secs, trans;
+} CHST;
+
+typedef enum {
+mbr_blank, mbr_lba, mbr_chs,
+mbr_last
+} MBRcontents;
+
+typedef enum {
+/* order is relevant */
+backend_small, backend_large, backend_empty,
+backend_last
+} Backend;
+
+static const int img_secs[backend_last] = {
+[backend_small] = 61440,
+[backend_large] = 8388608,
+[backend_empty] = -1,
+};
+
+static const CHST hd_chst[backend_last][mbr_last] = {
+[backend_small] = {
+[mbr_blank] = { 60, 16, 63, 0 },
+[mbr_lba]   = { 60, 16, 63, 2 },
+[mbr_chs]   = { 60, 16, 63, 0 }
+},
+[backend_large] = {
+[mbr_blank] = { 8322, 16, 63, 1 },
+[mbr_lba]   = { 8322, 16, 63, 1 },
+[mbr_chs]   = { 8322, 16, 63, 0 }
+},
+};
+
+static const char *img_file_name[backend_last];
+
+static const CHST *cur_ide[4];
+
+static bool is_hd(const CHST *expected_chst)
+{
+return expected_chst  expected_chst-cyls;
+}
+
+static void test_cmos_byte(int reg, int expected)
+{
+enum { cmos_base = 0x70 };
+int actual;
+
+outb(cmos_base + 0, reg);
+actual = inb(cmos_base + 1);
+g_assert(actual == expected);
+}
+
+static void test_cmos_bytes(int reg0, int n, uint8_t expected[])
+{
+int i;
+
+for (i = 0; i  9; i++) {
+test_cmos_byte(reg0 + i, expected[i]);
+}
+}
+
+static void test_cmos_disk_data(void)
+{
+test_cmos_byte(0x12,
+   (is_hd(cur_ide[0]) ? 0xf0 : 0) |
+   (is_hd(cur_ide[1]) ? 0x0f : 0));
+}
+
+static void test_cmos_drive_cyl(int reg0, const CHST *expected_chst)
+{
+if (is_hd(expected_chst)) {
+int c = expected_chst-cyls;
+int h = expected_chst-heads;
+int s = expected_chst-secs;
+uint8_t expected_bytes[9] = {
+c  0xff, c  8, h, 0xff, 0xff, 0xc0 | ((h  8)  3),
+c  0xff, c  8, s
+};
+test_cmos_bytes(reg0, 9, expected_bytes);
+} else {
+int i;
+
+for (i = 0; i  9; i++) {
+test_cmos_byte(reg0 + i, 0);
+}
+}
+}
+
+static void test_cmos_drive1(void)
+{
+test_cmos_byte(0x19, is_hd(cur_ide[0]) ? 47 : 0);
+test_cmos_drive_cyl(0x1b, cur_ide[0]);
+}
+
+static void test_cmos_drive2(void)
+{
+test_cmos_byte(0x1a, is_hd(cur_ide[1]) ? 47 : 0);
+test_cmos_drive_cyl(0x24, cur_ide[1]);
+}
+
+static void test_cmos_disktransflag(void)
+{
+int val, i;
+
+val = 0;
+for (i = 0; i  ARRAY_SIZE(cur_ide); i++) {
+if (is_hd(cur_ide[i])) {
+val |= cur_ide[i]-trans  (2 * i);
+}
+}
+test_cmos_byte(0x39, val);
+}
+
+static void test_cmos(void)
+{
+test_cmos_disk_data();
+test_cmos_drive1();
+

[Qemu-devel] [PATCH v3 14/29] qdev: Introduce block geometry properties

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 block.h |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/block.h b/block.h
index 993894e..1cd8a01 100644
--- a/block.h
+++ b/block.h
@@ -426,6 +426,8 @@ typedef struct BlockConf {
 uint32_t opt_io_size;
 int32_t bootindex;
 uint32_t discard_granularity;
+/* geometry, not all devices use this */
+uint32_t cyls, heads, secs;
 } BlockConf;
 
 static inline unsigned int get_physical_block_exp(BlockConf *conf)
@@ -453,5 +455,9 @@ static inline unsigned int get_physical_block_exp(BlockConf 
*conf)
 DEFINE_PROP_UINT32(discard_granularity, _state, \
_conf.discard_granularity, 0)
 
-#endif
+#define DEFINE_BLOCK_CHS_PROPERTIES(_state, _conf)  \
+DEFINE_PROP_UINT32(cyls, _state, _conf.cyls, 0),  \
+DEFINE_PROP_UINT32(heads, _state, _conf.heads, 0), \
+DEFINE_PROP_UINT32(secs, _state, _conf.secs, 0)
 
+#endif
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 13/29] blockdev: Save geometry in DriveInfo

2012-07-10 Thread Markus Armbruster
In preparation for purging it from the block layer, which will happen
later in this series.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 blockdev.c |4 
 blockdev.h |1 +
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a85a429..161985b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -530,6 +530,10 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
 dinfo-type = type;
 dinfo-bus = bus_id;
 dinfo-unit = unit_id;
+dinfo-cyls = cyls;
+dinfo-heads = heads;
+dinfo-secs = secs;
+dinfo-trans = translation;
 dinfo-opts = opts;
 dinfo-refcount = 1;
 if (serial) {
diff --git a/blockdev.h b/blockdev.h
index 26454c9..bc8c2dc 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -35,6 +35,7 @@ struct DriveInfo {
 int unit;
 int auto_del;   /* see blockdev_mark_auto_del() */
 int media_cd;
+int cyls, heads, secs, trans;
 QemuOpts *opts;
 char serial[BLOCK_SERIAL_STRLEN + 1];
 QTAILQ_ENTRY(DriveInfo) next;
-- 
1.7.6.5




[Qemu-devel] [PATCH v3 06/29] hd-geometry: Add tracepoints

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/hd-geometry.c |7 +++
 trace-events |4 
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index c45eafd..f0dd021 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -32,6 +32,7 @@
 
 #include block.h
 #include hw/block-common.h
+#include trace.h
 
 struct partition {
 uint8_t boot_ind;   /* 0x80 - active */
@@ -89,10 +90,7 @@ static int guess_disk_lchs(BlockDriverState *bs,
 *pheads = heads;
 *psectors = sectors;
 *pcylinders = cylinders;
-#if 0
-printf(guessed geometry: LCHS=%d %d %d\n,
-   cylinders, heads, sectors);
-#endif
+trace_hd_geometry_lchs_guess(bs, cylinders, heads, sectors);
 return 0;
 }
 }
@@ -159,4 +157,5 @@ void hd_geometry_guess(BlockDriverState *bs,
 }
 bdrv_set_geometry_hint(bs, *pcyls, *pheads, *psecs);
 }
+trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
 }
diff --git a/trace-events b/trace-events
index 1f9fc98..acef082 100644
--- a/trace-events
+++ b/trace-events
@@ -141,6 +141,10 @@ ecc_mem_readl_ecr1(uint32_t ret) Read event count 2 %08x
 ecc_diag_mem_writeb(uint64_t addr, uint32_t val) Write diagnostic %PRId64 = 
%02x
 ecc_diag_mem_readb(uint64_t addr, uint32_t ret) Read diagnostic %PRId64= 
%02x
 
+# hw/hd-geometry.c
+hd_geometry_lchs_guess(void *bs, int cyls, int heads, int secs) bs %p LCHS %d 
%d %d
+hd_geometry_guess(void *bs, int cyls, int heads, int secs, int trans) bs %p 
CHS %d %d %d trans %d
+
 # hw/jazz-led.c
 jazz_led_read(uint64_t addr, uint8_t val) read addr=0x%PRIx64: 0x%x
 jazz_led_write(uint64_t addr, uint8_t new) write addr=0x%PRIx64: 0x%x
-- 
1.7.6.5




[Qemu-devel] [PATCH] uhci: initialize expire_time when loading v1 vmstate

2012-07-10 Thread Gerd Hoffmann
$subject says all: when loading old (v1) vmstate which doesn't contain
expire_time initialize it with a reasonable default (current time).

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/hcd-uhci.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index 8f652d2..2aac8a2 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -388,11 +388,23 @@ static const VMStateDescription vmstate_uhci_port = {
 }
 };
 
+static int uhci_post_load(void *opaque, int version_id)
+{
+    UHCIState *s = opaque;
+
+    if (version_id < 2) {
+        s->expire_time = qemu_get_clock_ns(vm_clock) +
+            (get_ticks_per_sec() / FRAME_TIMER_FREQ);
+    }
+    return 0;
+}
+
 static const VMStateDescription vmstate_uhci = {
 .name = uhci,
 .version_id = 2,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
+.post_load = uhci_post_load,
 .fields  = (VMStateField []) {
 VMSTATE_PCI_DEVICE(dev, UHCIState),
 VMSTATE_UINT8_EQUAL(num_ports_vmstate, UHCIState),
-- 
1.7.1




Re: [Qemu-devel] [PATCH v3] sheepdog: do not blindly memset all read buffers

2012-07-10 Thread Kevin Wolf
On 09.07.2012 20:23, MORITA Kazutaka wrote:
 At Mon, 9 Jul 2012 16:34:13 +0200,
 Christoph Hellwig wrote:

 Only buffers that map to unallocated blocks need to be zeroed.

 Signed-off-by: Christoph Hellwig h...@lst.de

 ---
  block/sheepdog.c |   37 ++---
  1 file changed, 18 insertions(+), 19 deletions(-)
 
 Acked-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

Thanks, applied to the block branch.

Kevin



[Qemu-devel] [PATCH v3 29/29] Relax IDE CHS limits from 16383, 16, 63 to 65535, 16, 255

2012-07-10 Thread Markus Armbruster
New limits straight from ATA4 6.2 Register delivered data transfer
command sector addressing.

I figure the old sector limit 63 was blindly copied from the BIOS
int 13 limit.  Doesn't apply to the hardware.  No idea where the old
cylinder limit comes from.
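
As a rough illustration of the relaxed check (not the code in hw/ide/core.c), the sketch below validates a geometry against the new limits; ide_chs_valid() is a hypothetical name, and the messages follow the wording used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check a CHS geometry against the relaxed IDE limits 65535, 16, 255.
 * On failure, *err points to a message naming the offending field.
 */
static bool ide_chs_valid(uint32_t cyls, uint32_t heads, uint32_t secs,
                          const char **err)
{
    assert(err != NULL);
    *err = NULL;
    if (cyls < 1 || cyls > 65535) {
        *err = "cyls must be between 1 and 65535";
    } else if (heads < 1 || heads > 16) {
        *err = "heads must be between 1 and 16";
    } else if (secs < 1 || secs > 255) {
        *err = "secs must be between 1 and 255";
    }
    return *err == NULL;
}
```

Geometries accepted under the old limits remain valid, since 16383 and 63 fall inside the new ranges.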

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/ide/core.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 1ca7cdf..58a454f 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1935,16 +1935,16 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, 
IDEDriveKind kind,
     s->drive_kind = kind;
 
     bdrv_get_geometry(bs, &nb_sectors);
-    if (cylinders < 1 || cylinders > 16383) {
-        error_report("cyls must be between 1 and 16383");
+    if (cylinders < 1 || cylinders > 65535) {
+        error_report("cyls must be between 1 and 65535");
         return -1;
     }
     if (heads < 1 || heads > 16) {
         error_report("heads must be between 1 and 16");
         return -1;
     }
-    if (secs < 1 || secs > 63) {
-        error_report("secs must be between 1 and 63");
+    if (secs < 1 || secs > 255) {
+        error_report("secs must be between 1 and 255");
         return -1;
     }
     s->cylinders = cylinders;
-- 
1.7.6.5




Re: [Qemu-devel] [PATCH] Add --disable-bridge-helper configure flag

2012-07-10 Thread Fabien Chouteau
On 07/10/2012 12:44 PM, Paolo Bonzini wrote:
 On 10/07/2012 12:43, Fabien Chouteau wrote:
 Bridge helper uses ioctl's not available on old Linux versions, we add
 this flag to disable the build.
 
 Which ioctls?  Please detect them, so that we can also work around them
 perhaps.
 

There's SIOCBRADDIF at least; it may be the only one.

-- 
Fabien Chouteau





Re: [Qemu-devel] [PATCH] Add --disable-bridge-helper configure flag

2012-07-10 Thread Paolo Bonzini
Il 10/07/2012 13:02, Fabien Chouteau ha scritto:
  Bridge helper uses ioctl's not available on old Linux versions, we add
  this flag to disable the build.
  
  Which ioctls?  Please detect them, so that we can also work around them
  perhaps.
  
 There's SIOCBRADDIF at least; it may be the only one.

So indeed you could also use SIOCDEVPRIVATE / BRCTL_ADD_IF if you were
inclined to do so...

Paolo




[Qemu-devel] [PATCH v3 09/29] hd-geometry: Clean up gratuitous goto in hd_geometry_guess()

2012-07-10 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/hd-geometry.c |   22 --
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/hw/hd-geometry.c b/hw/hd-geometry.c
index 1a58894..fb849a3 100644
--- a/hw/hd-geometry.c
+++ b/hw/hd-geometry.c
@@ -119,8 +119,7 @@ static void guess_chs_for_size(BlockDriverState *bs,
 void hd_geometry_guess(BlockDriverState *bs,
int *pcyls, int *pheads, int *psecs)
 {
-int translation, lba_detected = 0;
-int cylinders, heads, secs;
+int cylinders, heads, secs, translation;
 
 bdrv_get_geometry_hint(bs, &cylinders, &heads, &secs);
 translation = bdrv_get_translation_hint(bs);
@@ -135,23 +134,18 @@ void hd_geometry_guess(BlockDriverState *bs,
 
 if (guess_disk_lchs(bs, &cylinders, &heads, &secs) < 0) {
 /* no LCHS guess: use a standard physical disk geometry  */
-default_geometry:
 guess_chs_for_size(bs, pcyls, pheads, psecs);
-if ((lba_detected == 1) && (translation == BIOS_ATA_TRANSLATION_AUTO)) {
-if ((*pcyls * *pheads) <= 131072) {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_LARGE);
-} else {
-bdrv_set_translation_hint(bs,
-  BIOS_ATA_TRANSLATION_LBA);
-}
-}
 } else if (heads > 16) {
 /* LCHS guess with heads > 16 means that a BIOS LBA
 translation was active, so a standard physical disk
 geometry is OK */
-lba_detected = 1;
-goto default_geometry;
+guess_chs_for_size(bs, pcyls, pheads, psecs);
+if (translation == BIOS_ATA_TRANSLATION_AUTO) {
+bdrv_set_translation_hint(bs,
+  *pcyls * *pheads <= 131072
+  ? BIOS_ATA_TRANSLATION_LARGE
+  : BIOS_ATA_TRANSLATION_LBA);
+}
 } else {
 /* LCHS guess with heads <= 16: use as physical geometry */
 *pcyls = cylinders;
-- 
1.7.6.5
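
The translation choice that the cleaned-up branch makes can be looked at on its own. The sketch below is not QEMU code: the enum stands in for the BIOS_ATA_TRANSLATION_* constants (values illustrative only) and pick_translation is a hypothetical name. Only the 131072 threshold on cylinders * heads is taken from the patch.

```c
#include <assert.h>

/* Stand-ins for QEMU's BIOS_ATA_TRANSLATION_* constants; values are
 * illustrative only. */
enum translation { TRANSLATION_AUTO, TRANSLATION_LARGE, TRANSLATION_LBA };

/* Hypothetical helper mirroring the conditional expression in the patch:
 * LARGE if cylinders * heads fits in 131072, LBA otherwise. */
static enum translation pick_translation(int cyls, int heads)
{
    assert(cyls > 0 && heads > 0);
    return (long long)cyls * heads <= 131072 ? TRANSLATION_LARGE
                                             : TRANSLATION_LBA;
}
```

For example, 8192 cylinders with 16 heads sits exactly on the threshold and gets LARGE, while 8193 cylinders with 16 heads gets LBA.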




Re: [Qemu-devel] [PATCH 4/9] Add a base IPMI interface

2012-07-10 Thread Markus Armbruster
Daniel P. Berrange berra...@redhat.com writes:

 On Mon, Jul 09, 2012 at 02:17:04PM -0500, miny...@acm.org wrote:
 diff --git a/qemu-options.hx b/qemu-options.hx
 index 125a4da..823f6bc 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -2204,6 +2204,41 @@ Three button serial mouse. Configure the guest to use 
 Microsoft protocol.
  @end table
  ETEXI
  
 +DEF(ipmi, HAS_ARG, QEMU_OPTION_ipmi, \
 +-ipmi [kcs|bt,]dev|local|none  IPMI interface to the dev, or internal 
 BMC\n,
 +QEMU_ARCH_ALL)
 +STEXI
 +@item -ipmi [bt|kcs,]@var{dev}|local|none
 +@findex -ipmi
 +Set up an IPMI interface.  The physical interface may either be
 +KCS or BT, the default is KCS.  Two options are available for
 +simulation of the IPMI BMC.  If @code{local} is specified, then a
 +minimal internal BMC is used.  This BMC is basically useful as a
 +watchdog timer and for fooling a system into thinking IPMI is there.
 +
 +If @var{dev} is specified (see the serial section above for details on
 +what can be specified for @var{dev}) then a connection to an external IPMI
 +simulator is made.  This interface has the ability to do power control
 +and reset, so it can do the normal IPMI types of things required.

 +The OpenIPMI project's lanserv simulator is capable of providing
 +this interface.  It is also capable of an IPMI LAN interface, and
 +you can do power control (the lanserv simulator is capable of starting
 +a VM, too) and reset of a virtual machine over a standard remote LAN
 +interface.  For details on this, see OpenIPMI.
 +
 +The remote connection to a LAN interface will reconnect if disconnected,
 +so if a remote BMC fails and restarts, it will still be usable.
 +
 +For instance, to connect to an external interface on the local machine
 +port 9002 with a BT physical interface, do the following:
 +@table @code
 +@item -ipmi bt,tcp:localhost:9002
 +@end table
 +
 +Use @code{-ipmi none} to disable IPMI.
 +ETEXI
[...]
 BTW, the syntax you show here is the legacy approach where both front
 and backend device config is mixed.  Does your patch work with the
 modern QEMU syntax, which is something like

   -chardev name=ipmi0,tcp:localhost:9002 -device bt,chardev=ipmi0

 If it doesn't work, then you'll need to update your patches to support
 this approach.

Yup.  Working -device is mandatory for new devices.  Convenience options
are optional, and whether they're worth the trouble depends.

Please introduce the convenience option -ipmi in a separate patch.



[Qemu-devel] [PATCH v17 9/9] Add XBZRLE statistics

2012-07-10 Thread Orit Wasserman
Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
---
 arch_init.c  |   66 ++
 hmp.c|   13 ++
 migration.c  |   49 
 migration.h  |9 +++
 qapi-schema.json |   37 ++
 qmp-commands.hx  |   35 +++-
 6 files changed, 203 insertions(+), 6 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index be6670c..aa38801 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -202,6 +202,64 @@ int64_t xbzrle_cache_resize(int64_t new_size)
 return pow2floor(new_size);
 }
 
+/* accounting for migration statistics */
+typedef struct AccountingInfo {
+uint64_t dup_pages;
+uint64_t norm_pages;
+uint64_t xbzrle_bytes;
+uint64_t xbzrle_pages;
+uint64_t xbzrle_cache_miss;
+uint64_t iterations;
+uint64_t xbzrle_overflows;
+} AccountingInfo;
+
+static AccountingInfo acct_info;
+
+static void acct_clear(void)
+{
+memset(&acct_info, 0, sizeof(acct_info));
+}
+
+uint64_t dup_mig_bytes_transferred(void)
+{
+return acct_info.dup_pages * TARGET_PAGE_SIZE;
+}
+
+uint64_t dup_mig_pages_transferred(void)
+{
+return acct_info.dup_pages;
+}
+
+uint64_t norm_mig_bytes_transferred(void)
+{
+return acct_info.norm_pages * TARGET_PAGE_SIZE;
+}
+
+uint64_t norm_mig_pages_transferred(void)
+{
+return acct_info.norm_pages;
+}
+
+uint64_t xbzrle_mig_bytes_transferred(void)
+{
+return acct_info.xbzrle_bytes;
+}
+
+uint64_t xbzrle_mig_pages_transferred(void)
+{
+return acct_info.xbzrle_pages;
+}
+
+uint64_t xbzrle_mig_pages_cache_miss(void)
+{
+return acct_info.xbzrle_cache_miss;
+}
+
+uint64_t xbzrle_mig_pages_overflow(void)
+{
+return acct_info.xbzrle_overflows;
+}
+
 static void save_block_hdr(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
 int cont, int flag)
 {
@@ -236,6 +294,7 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,
 cache_insert(XBZRLE.cache, current_addr,
  g_memdup(current_data, TARGET_PAGE_SIZE));
 }
+acct_info.xbzrle_cache_miss++;
 return -1;
 }
 
@@ -250,6 +309,7 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,
 return 0;
 } else if (encoded_len == -1) {
 DPRINTF("Overflow\n");
+acct_info.xbzrle_overflows++;
 /* update data in the cache */
 memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE);
 return -1;
@@ -269,7 +329,9 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,
 qemu_put_byte(f, hdr.xh_flags);
 qemu_put_be16(f, hdr.xh_len);
 qemu_put_buffer(f, XBZRLE.encoded_buf, encoded_len);
+acct_info.xbzrle_pages++;
 bytes_sent = encoded_len + sizeof(hdr);
+acct_info.xbzrle_bytes += bytes_sent;
 
 return bytes_sent;
 }
@@ -301,6 +363,7 @@ static int ram_save_block(QEMUFile *f, int stage)
 p = memory_region_get_ram_ptr(mr) + offset;
 
 if (is_dup_page(p)) {
+acct_info.dup_pages++;
 save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_COMPRESS);
 qemu_put_byte(f, *p);
 bytes_sent = 1;
@@ -323,6 +386,7 @@ static int ram_save_block(QEMUFile *f, int stage)
 save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_PAGE);
 qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
 bytes_sent = TARGET_PAGE_SIZE;
+acct_info.norm_pages++;
 }
 
 /* if page is unmodified, continue to the next */
@@ -449,6 +513,7 @@ int ram_save_live(QEMUFile *f, int stage, void *opaque)
 return -1;
 }
 XBZRLE.encoded_buf = g_malloc0(TARGET_PAGE_SIZE);
+acct_clear();
 }
 
 /* Make sure all dirty bits are set */
@@ -484,6 +549,7 @@ int ram_save_live(QEMUFile *f, int stage, void *opaque)
bytes_sent -1 represent no more blocks*/
 if (bytes_sent > 0) {
 bytes_transferred += bytes_sent;
+acct_info.iterations++;
 } else if (bytes_sent == -1) { /* no more blocks */
 break;
 }
diff --git a/hmp.c b/hmp.c
index 99ad00a..0d7333b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -168,6 +168,19 @@ void hmp_info_migrate(Monitor *mon)
 info->disk->total >> 10);
 }
 
+if (info->has_cache) {
+monitor_printf(mon, "cache size: %" PRIu64 " bytes\n",
+   info->cache->cache_size);
+monitor_printf(mon, "xbzrle transferred: %" PRIu64 " kbytes\n",
+   info->cache->xbzrle_bytes >> 10);
+monitor_printf(mon, "xbzrle pages: %" PRIu64 " pages\n",
+   info->cache->xbzrle_pages);
+monitor_printf(mon, "xbzrle cache miss: %" PRIu64 "\n",

[Qemu-devel] [PATCH] Fix xen pci passthrough

2012-07-10 Thread Wei Wang
Hi, I found a passthrough issue in qemu when a pci device has multiple mmio
regions. In this case, the last few bytes of the qmp response string are
truncated, and passthrough then fails with the following output.

*** glibc detected *** xl: realloc(): invalid pointer: 0x02163f90 ***
=== Backtrace: =
/lib64/libc.so.6(+0x74c06)[0x7f62970e4c06]
/lib64/libc.so.6(+0x77d25)[0x7f62970e7d25]
/lib/libxenlight.so.2.0(+0x28d02)[0x7f6297a78d02]
/lib/libxenlight.so.2.0(+0x2eccf)[0x7f6297a7eccf]
/lib/libxenlight.so.2.0(+0x2f2f6)[0x7f6297a7f2f6]
/lib/libxenlight.so.2.0(+0x2fe18)[0x7f6297a7fe18]
/lib/libxenlight.so.2.0(+0x20027)[0x7f6297a70027]
/lib/libxenlight.so.2.0(+0x212a6)[0x7f6297a712a6]
/lib/libxenlight.so.2.0(+0x19e82)[0x7f6297a69e82]
/lib/libxenlight.so.2.0(+0x1c288)[0x7f6297a6c288]
/lib/libxenlight.so.2.0(+0x1c2a8)[0x7f6297a6c2a8]
/lib/libxenlight.so.2.0(+0x2657e)[0x7f6297a7657e]
/lib/libxenlight.so.2.0(+0x34076)[0x7f6297a84076]
/lib/libxenlight.so.2.0(libxl__fork_selfpipe_woken+0x92)[0x7f6297a84394]
/lib/libxenlight.so.2.0(+0x3254a)[0x7f6297a8254a]
/lib/libxenlight.so.2.0(+0x3276d)[0x7f6297a8276d]
/lib/libxenlight.so.2.0(+0x33944)[0x7f6297a83944]
/lib/libxenlight.so.2.0(+0x1c0a8)[0x7f6297a6c0a8]
/lib/libxenlight.so.2.0(libxl_domain_create_new+0x14)[0x7f6297a6c14f]
xl[0x40c1f2]
xl[0x40fc94]
xl[0x406c21]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f629709123d]
xl[0x406439]

The attached patch fixes this issue.

Thanks,
Wei

Signed-off-by: Wei Wang wei.wa...@amd.com

---
 monitor.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index f6107ba..9f30f5f 100644
--- a/monitor.c
+++ b/monitor.c
@@ -165,7 +165,7 @@ struct Monitor {
 int reset_seen;
 int flags;
 int suspend_cnt;
-uint8_t outbuf[1024];
+uint8_t outbuf[2048];
 int outbuf_index;
 ReadLineState *rs;
 MonitorControl *mc;
-- 
1.7.4
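
The patch above enlarges a fixed-size buffer. Purely as a general illustration (this is not what the patch does, and nothing in this message establishes where the truncation actually happens), a fixed buffer can also be made independent of the message length by flushing it whenever it fills up. All names below (OutBuf, outbuf_write, Sink, sink_append) are hypothetical and unrelated to monitor.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback that receives flushed bytes (hypothetical interface). */
typedef void (*flush_fn)(const char *data, size_t len, void *opaque);

typedef struct OutBuf {
    char data[16];   /* deliberately tiny so that flushing is exercised */
    size_t used;
    flush_fn flush;
    void *opaque;
} OutBuf;

/* Hand any pending bytes to the flush callback. */
static void outbuf_flush(OutBuf *b)
{
    if (b->used > 0) {
        b->flush(b->data, b->used, b->opaque);
        b->used = 0;
    }
}

/* Append len bytes, flushing each time the buffer becomes full, so the
 * output is never truncated regardless of the buffer size. */
static void outbuf_write(OutBuf *b, const char *s, size_t len)
{
    assert(b != NULL && b->flush != NULL);
    while (len > 0) {
        size_t room = sizeof(b->data) - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, s, n);
        b->used += n;
        s += n;
        len -= n;
        if (b->used == sizeof(b->data)) {
            outbuf_flush(b);
        }
    }
}

/* Example sink that simply collects everything it is given. */
typedef struct Sink {
    char data[128];
    size_t len;
} Sink;

static void sink_append(const char *data, size_t len, void *opaque)
{
    Sink *s = opaque;
    assert(s->len + len <= sizeof(s->data));
    memcpy(s->data + s->len, data, len);
    s->len += len;
}
```

Writing a string longer than 16 bytes through outbuf_write() followed by outbuf_flush() delivers every byte to the sink, in order.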





[Qemu-devel] [PATCH] hw/imx_avic.c: Avoid format error when target_phys_addr_t is 64 bits

2012-07-10 Thread Peter Maydell
Add a missing cast to avoid gcc complaining about format string
errors when printing an expression based on a target_phys_addr_t.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/imx_avic.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/imx_avic.c b/hw/imx_avic.c
index 25f47f3..4f010e8 100644
--- a/hw/imx_avic.c
+++ b/hw/imx_avic.c
@@ -267,7 +267,7 @@ static void imx_avic_write(void *opaque, target_phys_addr_t offset,
 /* Vector Registers not yet supported */
 if (offset >= 0x100 && offset <= 0x2fc) {
 IPRINTF("imx_avic_write to vector register %d ignored\n",
-(offset - 0x100) >> 2);
+(unsigned int)((offset - 0x100) >> 2));
 return;
 }
 
-- 
1.7.1
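
The underlying issue is generic C: a 64-bit argument passed for a %d or %u conversion does not match the format. A small sketch of the two usual ways out follows; phys_addr_t and both function names are stand-ins invented for this example, not QEMU definitions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t phys_addr_t;  /* stand-in for a 64-bit target_phys_addr_t */

/* Option 1, as in the patch: the value is known to be small, so narrow it
 * explicitly to the type the conversion expects. */
static int format_vector_msg(char *buf, size_t len, phys_addr_t offset)
{
    assert(offset >= 0x100 && offset <= 0x2fc);
    return snprintf(buf, len, "vector register %u ignored",
                    (unsigned int)((offset - 0x100) >> 2));
}

/* Option 2: keep the full width and use the matching <inttypes.h> macro. */
static int format_offset_msg(char *buf, size_t len, phys_addr_t offset)
{
    return snprintf(buf, len, "offset 0x%" PRIx64, offset);
}
```

The cast is reasonable here because the range check bounds the value; where the full address matters, the PRIx64 form avoids silent truncation.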




Re: [Qemu-devel] [PULL 00/14] SCSI updates for 2012-07-02

2012-07-10 Thread Anthony Liguori

On 07/10/2012 12:57 AM, Hannes Reinecke wrote:

On 07/10/2012 01:19 AM, Anthony Liguori wrote:

On 07/09/2012 06:09 PM, Alexander Graf wrote:


On 09.07.2012, at 18:48, Anthony Liguori wrote:


On 07/02/2012 04:41 AM, Paolo Bonzini wrote:

Anthony,

The following changes since commit
71ea2e016131a9fcde6f1ffd3e0e34a64c21f593:

bsd-user: fix build (2012-06-28 20:28:36 +)


Pulled.  Thanks.


Megasas? :)


So this code is really broken:

 info.host.type = MFI_INFO_HOST_PCIX;
 info.device.type = MFI_INFO_DEV_SAS3G;
 info.device.port_count = 2;
 info.device.port_addr[0] =
cpu_to_le64(megasas_gen_sas_addr((uint64_t)s));

This will make migration impossible not to mention the fact that
casting a pointer to a uint64_t is really broken.


Hey, this is _NOT_ an address. It's a simple way of generating a
system-wide unique SAS address.


But it's not stable across migration.  That's the problem.


The whole thing is informational anyway, and can only be seen when
using the (proprietary) MegaCLI userspace command.


Nonetheless, it's still guest visible.


This code needs to be refactored to not do this.  It's quite
pervasive though (there's a half a dozen instances like this).



Okay, so here's the challenge: We need to generate a system-wide
unique SAS address, one per SCSI device and one per megasas instance.
A simple counter won't work, as we might have several qemu instances
running. Which would result in all of them having the same SAS
address for the host.


You could use a hashed UUID.

Regards,

Anthony Liguori




I'm going to disable the build by default.  I don't want to see a
rash fix like (uint64_t)(intptr_t).  This needs to be fixed by not
making the pointer address guest visible.  It can then be
re-enabled.  Should be easy enough to update your .mak config if you
want to test between now and then.


As said, it's _not_ an address. The address is just used to seed the
SAS address.

But since you object, I'll see about using something else for seeding the SAS
address.

Cheers,

Hannes





Re: [Qemu-devel] [PULL 00/14] SCSI updates for 2012-07-02

2012-07-10 Thread Hannes Reinecke
On 07/10/2012 02:52 PM, Anthony Liguori wrote:
 On 07/10/2012 12:57 AM, Hannes Reinecke wrote:
 On 07/10/2012 01:19 AM, Anthony Liguori wrote:
 On 07/09/2012 06:09 PM, Alexander Graf wrote:

 On 09.07.2012, at 18:48, Anthony Liguori wrote:

 On 07/02/2012 04:41 AM, Paolo Bonzini wrote:
 Anthony,

 The following changes since commit
 71ea2e016131a9fcde6f1ffd3e0e34a64c21f593:

 bsd-user: fix build (2012-06-28 20:28:36 +)

 Pulled.  Thanks.

 Megasas? :)

 So this code is really broken:

  info.host.type = MFI_INFO_HOST_PCIX;
  info.device.type = MFI_INFO_DEV_SAS3G;
  info.device.port_count = 2;
  info.device.port_addr[0] =
 cpu_to_le64(megasas_gen_sas_addr((uint64_t)s));

 This will make migration impossible not to mention the fact that
 casting a pointer to a uint64_t is really broken.

 Hey, this is _NOT_ an address. It's a simple way of generating a
 system-wide unique SAS address.
 
 But it's not stable across migration.  That's the problem.
 
 The whole thing is informational anyway, and can only be seen when
 using the (proprietary) MegaCLI userspace command.
 
 Nonetheless, it's still guest visible.
 

Okay, I see your point. I'll be reworking things to not use the
pointer here.

 This code needs to be refactored to not do this.  It's quite
 pervasive though (there's a half a dozen instances like this).


 Okay, so here's the challenge: We need to generate a system-wide
 unique SAS address, one per SCSI device and one per megasas instance.
 A simple counter won't work, as we might have several qemu instances
 running. Which would result in all of them having the same SAS
 address for the host.
 
 You could use a hashed UUID.
 
Right. Will see how that'll work out.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
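
To make the "hashed UUID" suggestion concrete, here is one possible sketch. It is only an illustration: the thread does not say which hash or address layout would be used. The choice of 64-bit FNV-1a, the extra port byte, and forcing the top nibble to 5 (on the assumption that an NAA-5 style address is wanted) are all assumptions, and sas_addr_from_uuid is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a hash over a byte buffer. */
static uint64_t fnv1a64(const uint8_t *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hypothetical: derive an address from a 16-byte UUID plus a port index.
 * The result depends only on its inputs, so it stays the same across
 * migration as long as the UUID does. */
static uint64_t sas_addr_from_uuid(const uint8_t uuid[16], uint8_t port)
{
    uint8_t buf[17];
    for (size_t i = 0; i < 16; i++) {
        buf[i] = uuid[i];
    }
    buf[16] = port;
    /* Assumption: force the top nibble to 5 (NAA-5 style). */
    return (fnv1a64(buf, sizeof(buf)) & 0x0fffffffffffffffULL)
           | 0x5000000000000000ULL;
}
```

Unlike a value seeded from a host pointer, this is reproducible on the migration destination, and two QEMU instances with different UUIDs get different addresses, barring hash collisions.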





Re: [Qemu-devel] [PATCH 4/6] device_tree: Add support for reading device tree properties

2012-07-10 Thread Peter Maydell
On 10 July 2012 07:54, Peter Crosthwaite
peter.crosthwa...@petalogix.com wrote:
 Constantly bouncing back is safer, however. If you hang on to an
 in-place pointer into the FDT (as returned by get_prop) and someone
 comes along and calls set_props(), then your pointer is corrupted. I've been
 snagged before by doing exactly this and eventually came to the
 brute-force approach of just requerying the DTB on every touch rather
 than trying to work with pointers to arrays. Duping the property could
 work, but it's a bit of a mess trying to free the returned copies.

Incidentally, if you have two separate bits of code both accessing
the DTB in parallel then this sounds like a really weird corner case
use. I would expect that the standard thing would be at startup
we read the DTB, modify it slightly and after that ignore it,
all of which should be straightforward single threaded code with
no particular control flow/threading/coroutine issues.

-- PMM



Re: [Qemu-devel] [PATCH] Fix xen pci passthrough

2012-07-10 Thread Anthony PERARD

On 10/07/12 12:43, Wei Wang wrote:

Hi, I find a passthrough issue in qemu when pci device has multiple mmio
regions. In this case, the last few bytes of qmp response string are trimmed
and then passthru fails with following output.


Could you compile libxl with DEBUG_RECEIVED (uncomment the #define in 
tools/libxl/libxl_qmp.c) and then give the output of `xl -vvv create ...`.


Thanks,


*** glibc detected *** xl: realloc(): invalid pointer: 0x02163f90 ***
=== Backtrace: =
/lib64/libc.so.6(+0x74c06)[0x7f62970e4c06]
/lib64/libc.so.6(+0x77d25)[0x7f62970e7d25]
/lib/libxenlight.so.2.0(+0x28d02)[0x7f6297a78d02]
/lib/libxenlight.so.2.0(+0x2eccf)[0x7f6297a7eccf]
/lib/libxenlight.so.2.0(+0x2f2f6)[0x7f6297a7f2f6]
/lib/libxenlight.so.2.0(+0x2fe18)[0x7f6297a7fe18]
/lib/libxenlight.so.2.0(+0x20027)[0x7f6297a70027]
/lib/libxenlight.so.2.0(+0x212a6)[0x7f6297a712a6]
/lib/libxenlight.so.2.0(+0x19e82)[0x7f6297a69e82]
/lib/libxenlight.so.2.0(+0x1c288)[0x7f6297a6c288]
/lib/libxenlight.so.2.0(+0x1c2a8)[0x7f6297a6c2a8]
/lib/libxenlight.so.2.0(+0x2657e)[0x7f6297a7657e]
/lib/libxenlight.so.2.0(+0x34076)[0x7f6297a84076]
/lib/libxenlight.so.2.0(libxl__fork_selfpipe_woken+0x92)[0x7f6297a84394]
/lib/libxenlight.so.2.0(+0x3254a)[0x7f6297a8254a]
/lib/libxenlight.so.2.0(+0x3276d)[0x7f6297a8276d]
/lib/libxenlight.so.2.0(+0x33944)[0x7f6297a83944]
/lib/libxenlight.so.2.0(+0x1c0a8)[0x7f6297a6c0a8]
/lib/libxenlight.so.2.0(libxl_domain_create_new+0x14)[0x7f6297a6c14f]
xl[0x40c1f2]
xl[0x40fc94]
xl[0x406c21]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f629709123d]
xl[0x406439]

Attached patch can fix this issue.

Thanks,
Wei

Signed-off-by: Wei Wang wei.wa...@amd.com

---
  monitor.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index f6107ba..9f30f5f 100644
--- a/monitor.c
+++ b/monitor.c
@@ -165,7 +165,7 @@ struct Monitor {
  int reset_seen;
  int flags;
  int suspend_cnt;
-uint8_t outbuf[1024];
+uint8_t outbuf[2048];
  int outbuf_index;
  ReadLineState *rs;
  MonitorControl *mc;




--
Anthony PERARD





[Qemu-devel] [PATCH 1/2] target-i386: move cpu halted decision into x86_cpu_reset

2012-07-10 Thread Igor Mammedov
The MP initialization protocol differs between CPU families, and for P6 and
later models it is up to the CPU to decide whether it will be the BSP using
this protocol, so try to model this. However, there is no point in implementing
the MP initialization protocol in qemu, so the first CPU is always marked as BSP.

This patch:
 - moves the decision to designate the BSP from the board into the cpu, making
the cpu self-sufficient in this regard. Later this will allow cleaning up
hw/pc.c and removing cpu_reset and its wrappers from there.
 - stores the flag that the CPU is the BSP in IA32_APIC_BASE to model the
behavior described in Intel SDM vol 3a part 1 chapter 8.4.1
 - uses the MSR_IA32_APICBASE_BSP flag in apic_base to check whether a cpu is
the BSP

patch is based on Jan Kiszka's proposal:
http://thread.gmane.org/gmane.comp.emulators.qemu/100806

v2:
  - fix build for i386-linux-user
  spotted-by: Peter Maydell peter.mayd...@linaro.org
v3:
  - style change requested by Andreas Färber afaer...@suse.de

Signed-off-by: Igor Mammedov imamm...@redhat.com
---
 hw/apic.h|2 +-
 hw/apic_common.c |   20 ++--
 hw/pc.c  |9 -
 target-i386/cpu.c|9 +
 target-i386/helper.c |1 -
 target-i386/kvm.c|5 +++--
 6 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/hw/apic.h b/hw/apic.h
index 62179ce..d961ed4 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -20,9 +20,9 @@ void apic_init_reset(DeviceState *s);
 void apic_sipi(DeviceState *s);
 void apic_handle_tpr_access_report(DeviceState *d, target_ulong ip,
TPRAccess access);
+void apic_designate_bsp(DeviceState *d);
 
 /* pc.c */
-int cpu_is_bsp(CPUX86State *env);
 DeviceState *cpu_get_current_apic(void);
 
 #endif
diff --git a/hw/apic_common.c b/hw/apic_common.c
index 60b8259..095b09e 100644
--- a/hw/apic_common.c
+++ b/hw/apic_common.c
@@ -43,8 +43,8 @@ uint64_t cpu_get_apic_base(DeviceState *d)
 trace_cpu_get_apic_base((uint64_t)s->apicbase);
 return s->apicbase;
 } else {
-trace_cpu_get_apic_base(0);
-return 0;
+trace_cpu_get_apic_base(MSR_IA32_APICBASE_BSP);
+return MSR_IA32_APICBASE_BSP;
 }
 }
 
@@ -201,22 +201,30 @@ void apic_init_reset(DeviceState *d)
 s->timer_expiry = -1;
 }
 
+void apic_designate_bsp(DeviceState *d)
+{
+if (d == NULL) {
+return;
+}
+
+APICCommonState *s = APIC_COMMON(d);
+s->apicbase |= MSR_IA32_APICBASE_BSP;
+}
+
 static void apic_reset_common(DeviceState *d)
 {
 APICCommonState *s = DO_UPCAST(APICCommonState, busdev.qdev, d);
 APICCommonClass *info = APIC_COMMON_GET_CLASS(s);
-bool bsp;
 
-bsp = cpu_is_bsp(s->cpu_env);
 s->apicbase = 0xfee00000 |
-(bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
+(s->apicbase & MSR_IA32_APICBASE_BSP) | MSR_IA32_APICBASE_ENABLE;
 
 s->vapic_paddr = 0;
 info->vapic_base_update(s);
 
 apic_init_reset(d);
 
-if (bsp) {
+if (s->apicbase & MSR_IA32_APICBASE_BSP) {
 /*
  * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
  * time typically by BIOS, so PIC interrupt can be delivered to the
diff --git a/hw/pc.c b/hw/pc.c
index c7e9ab3..50c1715 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -871,12 +871,6 @@ void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
 nb_ne2k++;
 }
 
-int cpu_is_bsp(CPUX86State *env)
-{
-/* We hard-wire the BSP to the first CPU. */
-return env->cpu_index == 0;
-}
-
 DeviceState *cpu_get_current_apic(void)
 {
 if (cpu_single_env) {
@@ -927,10 +921,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
level)
 static void pc_cpu_reset(void *opaque)
 {
 X86CPU *cpu = opaque;
-CPUX86State *env = &cpu->env;
-
 cpu_reset(CPU(cpu));
-env->halted = !cpu_is_bsp(env);
 }
 
 static X86CPU *pc_new_cpu(const char *cpu_model)
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 5521709..f9ed6d8 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1686,6 +1686,15 @@ static void x86_cpu_reset(CPUState *s)
 env->dr[7] = DR7_FIXED_1;
 cpu_breakpoint_remove_all(env, BP_CPU);
 cpu_watchpoint_remove_all(env, BP_CPU);
+
+#if !defined(CONFIG_USER_ONLY)
+/* We hard-wire the BSP to the first CPU. */
+if (env->cpu_index == 0) {
+apic_designate_bsp(env->apic_state);
+}
+
+env->halted = !(cpu_get_apic_base(env->apic_state) & MSR_IA32_APICBASE_BSP);
+#endif
 }
 
 static void mce_init(X86CPU *cpu)
diff --git a/target-i386/helper.c b/target-i386/helper.c
index d3af6ea..b748d90 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1191,7 +1191,6 @@ void do_cpu_init(X86CPU *cpu)
 env->interrupt_request = sipi;
 env->pat = pat;
 apic_init_reset(env->apic_state);
-env->halted = !cpu_is_bsp(env);
 }
 
 void do_cpu_sipi(X86CPU *cpu)
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0d0d8f6..09621e5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -583,8 +583,9 @@ void kvm_arch_reset_vcpu(CPUX86State 

[Qemu-devel] [PATCH 0/2 v2] target-i386: refactor reset handling and move it into cpu.c

2012-07-10 Thread Igor Mammedov
v2:
  omitted moving x86_cpu_realize() from cpu_x86_init() to pc_new_cpu(),
  to keep the cpu_init implementation for -softmmu and -user targets the same,
  in a single place, and maintainable.

tree for testing:
  https://github.com/imammedo/qemu/tree/x86_reset

compile & run tested with x86_64-linux-user, x86_64-softmmu targets

Igor Mammedov (2):
  target-i386: move cpu halted decision into x86_cpu_reset
  target-i386: move cpu_reset and reset callback to cpu.c

 hw/apic.h|2 +-
 hw/apic_common.c |   20 ++--
 hw/pc.c  |   18 +-
 target-i386/cpu.c|   25 +
 target-i386/helper.c |1 -
 target-i386/kvm.c|5 +++--
 6 files changed, 44 insertions(+), 27 deletions(-)




[Qemu-devel] [PATCH 2/2] target-i386: move cpu_reset and reset callback to cpu.c

2012-07-10 Thread Igor Mammedov
Moving the reset callback from board level into the cpu object and
resetting the cpu at the end of x86_cpu_realize() will allow properly
creating a cpu object at run time (hotplug) without calling reset externally.

When reset over the QOM hierarchy is implemented, the reset callback
should be removed.

v2:
  leave cpu_reset in pc_new_cpu() for now; it is to be cleaned up when APIC
  init is moved into cpu.c

Signed-off-by: Igor Mammedov imamm...@redhat.com
---
 hw/pc.c   |9 +
 target-i386/cpu.c |   16 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 50c1715..d74ca6e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -918,12 +918,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
level)
 }
 }
 
-static void pc_cpu_reset(void *opaque)
-{
-X86CPU *cpu = opaque;
-cpu_reset(CPU(cpu));
-}
-
 static X86CPU *pc_new_cpu(const char *cpu_model)
 {
 X86CPU *cpu;
@@ -938,8 +932,7 @@ static X86CPU *pc_new_cpu(const char *cpu_model)
 if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
 env->apic_state = apic_init(env, env->cpuid_apic_id);
 }
-qemu_register_reset(pc_cpu_reset, cpu);
-pc_cpu_reset(cpu);
+cpu_reset(CPU(cpu));
 return cpu;
 }
 
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index f9ed6d8..65c7446 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -31,6 +31,8 @@
 
 #include "hyperv.h"
 
+#include "hw/hw.h"
+
 /* feature flags taken from Intel Processor Identification and the CPUID
  * Instruction and AMD's CPUID Specification.  In cases of disagreement
  * between feature naming conventions, aliases may be added.
@@ -1697,6 +1699,15 @@ static void x86_cpu_reset(CPUState *s)
 #endif
 }
 
+#ifndef CONFIG_USER_ONLY
+/* TODO: remove me, when reset over QOM tree is implemented */
+static void x86_cpu_machine_reset_cb(void *opaque)
+{
+X86CPU *cpu = opaque;
+cpu_reset(CPU(cpu));
+}
+#endif
+
 static void mce_init(X86CPU *cpu)
 {
 CPUX86State *cenv = &cpu->env;
@@ -1717,8 +1728,13 @@ void x86_cpu_realize(Object *obj, Error **errp)
 {
 X86CPU *cpu = X86_CPU(obj);
 
+#ifndef CONFIG_USER_ONLY
+qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
+#endif
+
 mce_init(cpu);
 qemu_init_vcpu(&cpu->env);
+cpu_reset(CPU(cpu));
 }
 
 static void x86_cpu_initfn(Object *obj)
-- 
1.7.1



