date:20131025

Re: [Qemu-devel] [PATCH 09/17] migration-local: override before_ram_iterate to send pipefd

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 05:38, Lei Li ha scritto:
 
 Just want to confirm: normally, should I take these 'otherwise looks
 good/ok'
 as a 'Reviewed-by' from you if the other comment is fixed in the updated
 version?

Depends on how much the patch changes... right now I'm still expecting
some changes so I didn't really look much at the patch and didn't test
it.  I prefer to take a more complete look at v3 before giving a
formal Reviewed-by.

Paolo

Re: [Qemu-devel] [PATCH 0/17 v2] Localhost migration with side channel for ram

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 06:58, Lei Li ha scritto:
 Right now just has inaccurate numbers without the new vmsplice, which
 based on
 the result from info migrate, as the guest ram size increases, although the
 'total time' is number of times less compared with the current live
 migration, but the 'downtime' performs badly.

Of course.
 
 For a 1GB ram guest,
 
 total time: 702 milliseconds
 downtime: 692 milliseconds
 
 And when the ram size of guest increases exponentially, those numbers are
 proportional to it.
  
 I will make a list of the performance with the new vmsplice later, I am
 sure it'd be much better than this at least.

Yes, please.  Is the memory usage still 2x without vmsplice?

I think you have a nice proof of concept, but on the other hand this
probably needs to be coupled with some kind of postcopy live migration,
that is:

* the source starts sending data

* but the destination starts running immediately

* if the machine needs a page that is missing, the destination asks the
source to send it

* as soon as it arrives, the destination can restart
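
The four steps above can be sketched as a toy in-process model. This is only an illustration of the demand-fetch idea, assuming a tiny fixed-size guest; the struct and function names (toy_source, toy_dest, request_page, background_send, guest_read) are hypothetical and are not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPAGES    4   /* toy guest with four pages */
#define TOY_PAGE_SIZE 8   /* toy page size in bytes */

/* Hypothetical model: the source still holds every page. */
struct toy_source {
    unsigned char ram[TOY_NPAGES][TOY_PAGE_SIZE];
};

/* Hypothetical model: the destination starts running with no pages present. */
struct toy_dest {
    unsigned char ram[TOY_NPAGES][TOY_PAGE_SIZE];
    bool present[TOY_NPAGES];
    size_t faults;            /* pages that had to be fetched on demand */
};

/* "The destination asks the source to send it": copy one page across. */
static void request_page(struct toy_dest *d, const struct toy_source *s,
                         size_t page)
{
    memcpy(d->ram[page], s->ram[page], TOY_PAGE_SIZE);
    d->present[page] = true;
}

/* Background stream from the source; pages already fetched are skipped. */
static void background_send(struct toy_dest *d, const struct toy_source *s,
                            size_t page)
{
    if (!d->present[page]) {
        request_page(d, s, page);
    }
}

/* Guest access on the destination: fetch the page first if it is missing. */
static unsigned char guest_read(struct toy_dest *d, const struct toy_source *s,
                                size_t page, size_t offset)
{
    if (!d->present[page]) {
        d->faults++;
        request_page(d, s, page);
    }
    return d->ram[page][offset];
}
```

In this model a page costs at most one fault: once it has arrived, either on demand or through the background stream, later accesses proceed without stalling.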

Using postcopy is problematic for reliability: if the destination fails,
the virtual machine is lost because the source doesn't have the latest
content of memory.  However, this is a much, much smaller problem for
live QEMU upgrade where the network cannot fail.

If you do this, you can achieve pretty much instantaneous live upgrade,
well within your original 200 ms goals.  But the flipping code with
vmsplice should be needed anyway to avoid doubling memory usage, and
it's looking pretty good in this version already!  I'm relieved that the
RDMA code was designed right!

Paolo

Re: [Qemu-devel] [PATCH 14/17] add new RanState RAN_STATE_FLIPPING_MIGRATE

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 05:30, Lei Li ha scritto:

 I am not sure about the name; for one thing, the new state would apply
 also to postcopy migration.
 
 About the name, how about 'live-upgrade'?
 
 OK, I'll add the transition between postcopy and this new state.

Note I didn't mean postmigrate.

For a description of postcopy, see my answer to the cover letter (patch
0).  The new state means somebody else has newer contents of the
memory.  Perhaps stale?

 And should it also apply from 'prelaunch' to 'flipping-migrate' too?

Yes, it should.  Good catch!

Paolo

Re: [Qemu-devel] [patch 2/2] i386: pc: align gpa<->hpa on 1GB boundary

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 05:58, Marcelo Tosatti ha scritto:
 On Fri, Oct 25, 2013 at 12:55:36AM +0100, Paolo Bonzini wrote:
 +        if (hpagesize == (1<<30)) {
 +            unsigned long holesize = 0x100000000ULL - below_4g_mem_size;
 +
 +            memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
 +                                     0x100000000ULL,
 +                                     above_4g_mem_size - holesize);
 +            memory_region_add_subregion(system_memory, 0x100000000ULL,
 +                                        ram_above_4g);
 +
 +            ram_above_4g_piecetwo = g_malloc(sizeof(*ram_above_4g_piecetwo));
 +            memory_region_init_alias(ram_above_4g_piecetwo, NULL,
 +                                     "ram-above-4g-piecetwo", ram,
 +                                     0x100000000ULL - holesize, holesize);
 +            memory_region_add_subregion(system_memory,
 +                                        0x100000000ULL +
 +                                        above_4g_mem_size - holesize,
 +                                        ram_above_4g_piecetwo);

 Why break it in two?  You can just allocate extra holesize bytes in the
 ram MemoryRegion, and not map the part that corresponds to
 [0x100000000ULL - holesize, 0x100000000ULL).
 
 - If the ram MemoryRegion is backed with 1GB hugepages, you might not 
 want to allocate extra holesize bytes (which might require an entire
 1GB page).
 
 - 1GB backed RAM can be mapped with 2MB pages.
 
 Also, as Peter said this cannot depend on host considerations.  Just do
 it unconditionally, but only for new machine types (pc-1.8 and q35-1.8,
 since unfortunately we're too close to hard freeze).
 
 Why is the description of memory subregions and aliases part of machine
 types?

It affects the migration stream, which stores RAM offsets instead of
physical addresses.

Let's say you have an 8 GB guest and the hole size is 0.25 GB.

If the huge page size is 2MB, you have:

   Physical address    Length     RAM offsets
   0                   3.75 GB    pc.ram @ 0
   4 GB                4.25 GB    pc.ram @ 3.75 GB

If the huge page size is 1GB, you have:

   Physical address    Length     RAM offsets
   0                   3.75 GB    pc.ram @ 0
   4 GB                4 GB       pc.ram @ 4 GB
   8 GB                0.25 GB    pc.ram @ 3.75 GB

So your memory rotates around the 3.75 GB boundary when you migrate
from a non-gbpages host to a gbpages host or vice versa.
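
The two layouts can be compared with a small address-translation sketch. It assumes a single "pc.ram" block and follows the tables above; the function name and parameters are illustrative, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

#define GB      (1ULL << 30)
#define FOUR_GB (4 * GB)

/*
 * Map a guest physical address to an offset inside the single pc.ram block.
 * below_4g and above_4g are the RAM sizes below and above the 4 GB boundary;
 * gbpages selects the 1 GB-aligned layout from the patch under review.
 * Returns UINT64_MAX for addresses that are not backed by RAM.
 */
static uint64_t gpa_to_ram_offset(uint64_t gpa, uint64_t below_4g,
                                  uint64_t above_4g, int gbpages)
{
    uint64_t holesize = FOUR_GB - below_4g;

    if (gpa < below_4g) {
        return gpa;                           /* pc.ram @ 0 in both layouts */
    }
    if (gpa < FOUR_GB || gpa >= FOUR_GB + above_4g) {
        return UINT64_MAX;                    /* hole below 4 GB, or past end */
    }
    if (!gbpages) {
        return below_4g + (gpa - FOUR_GB);    /* RAM offsets stay contiguous */
    }
    if (gpa < FOUR_GB + above_4g - holesize) {
        return gpa;                           /* identity: offset == address */
    }
    /* The last holesize bytes alias the skipped offsets just below 4 GB. */
    return (FOUR_GB - holesize) + (gpa - (FOUR_GB + above_4g - holesize));
}
```

For the 8 GB guest in the example, guest physical address 4 GB lands at RAM offset 3.75 GB in one layout and 4 GB in the other, which is why the migration stream is not compatible between the two.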

If we're doing it only for new machine types, it's even simpler to just
have two RAM regions:

   Physical address    Length     RAM offsets
   0                   3.75 GB    pc.ram-below-4g @ 0
   4 GB                4.25 GB    pc.ram-above-4g @ 0

Because offsets are zero, and lengths match the RAM block lengths, you
do not need any complication with aliasing.  This still has to be done
only for new machine types.

Paolo

Re: [Qemu-devel] [PATCH v2] qemu-iotests: Test for loading VM state from qcow2

2013-10-25 Thread Kevin Wolf

Am 24.10.2013 um 20:24 hat Max Reitz geschrieben:
 Add a test for saving a VM state from a qcow2 image and loading it back
 (with having restarted qemu in between); this should work without any
 problems.
 
 Signed-off-by: Max Reitz mre...@redhat.com

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH 0/17 v2] Localhost migration with side channel for ram

2013-10-25 Thread Anthony Liguori

On Oct 25, 2013 8:30 AM, Paolo Bonzini pbonz...@redhat.com wrote:

 Il 25/10/2013 06:58, Lei Li ha scritto:
  Right now just has inaccurate numbers without the new vmsplice, which
  based on
  the result from info migrate, as the guest ram size increases, although the
  'total time' is number of times less compared with the current live
  migration, but the 'downtime' performs badly.

 Of course.
 
  For a 1GB ram guest,
 
  total time: 702 milliseconds
  downtime: 692 milliseconds
 
  And when the ram size of guest increases exponentially, those numbers are
  proportional to it.
 
  I will make a list of the performance with the new vmsplice later, I am
  sure it'd be much better than this at least.

 Yes, please.  Is the memory usage still 2x without vmsplice?

 I think you have a nice proof of concept, but on the other hand this
 probably needs to be coupled with some kind of postcopy live migration,
 that is:

 * the source starts sending data

 * but the destination starts running immediately

 * if the machine needs a page that is missing, the destination asks the
 source to send it

 * as soon as it arrives, the destination can restart

 Using postcopy is problematic for reliability: if the destination fails,
 the virtual machine is lost because the source doesn't have the latest
 content of memory.  However, this is a much, much smaller problem for
 live QEMU upgrade where the network cannot fail.

 If you do this, you can achieve pretty much instantaneous live upgrade,
 well within your original 200 ms goals.

This is actually a very nice justification for post copy.

Regards,

Anthony Liguori

But the flipping code with
 vmsplice should be needed anyway to avoid doubling memory usage, and
 it's looking pretty good in this version already!  I'm relieved that the
 RDMA code was designed right!

 Paolo

Re: [Qemu-devel] [PATCH v2] qcow2: Flush image after creation

2013-10-25 Thread Kevin Wolf

Am 24.10.2013 um 20:35 hat Max Reitz geschrieben:
 Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
 the image creation. This means that the image has not yet been flushed
 to disk when qemu-img create exits. This flush is delayed until the next
 operation on the image involving opening it without BDRV_O_NO_FLUSH and
 closing (or directly flushing) it. For large images and/or images with a
 small cluster size and preallocated metadata, this flush may take a
 significant amount of time and may occur unexpectedly.
 
 Reopening the image without BDRV_O_NO_FLUSH right before the end of
 qcow2_create2() results in hoisting the potentially costly flush into
 the image creation, which is expected to take some time (whereas
 successive image operations may be not).
 
 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH 0/6] qapi: generate event defines automatically

2013-10-25 Thread Wenchao Xia

Hi, Markus
I am coding V2, which supports events in qapi-schema, and I just remembered
that it is on your TODO list. Is it OK if I implement it instead, as V2?

Re: [Qemu-devel] [PATCH] linux-user: create target_structs header to place ipc_perm and shmid_ds

2013-10-25 Thread Erik de Castro Lopo

Petar Jovanovic wrote:

 From: Petar Jovanovic petar.jovano...@imgtec.com
 
 Creating target_structs header in linux-user/$arch/ and making
 target_ipc_perm and target_shmid_ds its first inhabitants.
 The struct definitions may/should be further fine-tuned by arch maintainers.
 
 Signed-off-by: Petar Jovanovic petar.jovano...@imgtec.com

Reviewed-by: Erik de Castro Lopo er...@mega-nerd.com


I'm relatively new to QEMU and this is my first review. This change
looks sane to me, applies cleanly and compiles without any new warnings.

In future I will be attempting to review anything in the linux-user
tree.

Cheers,
Erik
-- 
--
Erik de Castro Lopo
http://www.mega-nerd.com/

Re: [Qemu-devel] Patch v3 : POSIX timer implementation for linux-user.

2013-10-25 Thread Erik de Castro Lopo

mle...@mega-nerd.com wrote:

 
 Changes from original:
 
 * Call host's libc functions directly rather than _syscall*() (as suggested
   by Peter Maydell).
 * Remove un-needed #defines.
 
 Launchpad bug is here: https://bugs.launchpad.net/bugs/1042388


Ping?
http://patchwork.ozlabs.org/patch/284786/

Erik
-- 
--
Erik de Castro Lopo
http://www.mega-nerd.com/

Re: [Qemu-devel] [patch 2/2] i386: pc: align gpa<->hpa on 1GB boundary

2013-10-25 Thread igor Mammedov

On Fri, 25 Oct 2013 02:58:05 -0200
Marcelo Tosatti mtosa...@redhat.com wrote:

 On Fri, Oct 25, 2013 at 12:55:36AM +0100, Paolo Bonzini wrote:
   +        if (hpagesize == (1<<30)) {
   +            unsigned long holesize = 0x100000000ULL - below_4g_mem_size;
   +
   +            memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
   +                                     0x100000000ULL,
   +                                     above_4g_mem_size - holesize);
   +            memory_region_add_subregion(system_memory, 0x100000000ULL,
   +                                        ram_above_4g);
   +
   +            ram_above_4g_piecetwo = g_malloc(sizeof(*ram_above_4g_piecetwo));
   +            memory_region_init_alias(ram_above_4g_piecetwo, NULL,
   +                                     "ram-above-4g-piecetwo", ram,
   +                                     0x100000000ULL - holesize, holesize);
   +            memory_region_add_subregion(system_memory,
   +                                        0x100000000ULL +
   +                                        above_4g_mem_size - holesize,
   +                                        ram_above_4g_piecetwo);
  
  Why break it in two?  You can just allocate extra holesize bytes in
  the ram MemoryRegion, and not map the part that corresponds to
   [0x100000000ULL - holesize, 0x100000000ULL).
 
 - If the ram MemoryRegion is backed with 1GB hugepages, you might
 not want to allocate extra holesize bytes (which might require an
 entire 1GB page).
From the POV of modelling current RAM as DIMM devices, aliasing
wouldn't work nicely. But breaking one block into two or more is fine, since
the blocks could then be represented as several DIMM devices.

With more than 3Gb of backend RAM, it could be split into blocks like this:

  [ 3Gb (1Gb pages backed) ]
  [tail1 (below_4gb - 3Gb) (2mb pages backed) ]
  [above_4gb whole X Gb pages (1Gb pages backed)]
  [tail2 (2mb pages backed)]
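
As a rough sketch of the split described above, each region (below 4 GB and above 4 GB) can be cut into a part that fills whole 1 GB pages and a tail left for 2 MB pages. The struct and function names are hypothetical, and the sketch ignores how the backing files would actually be supplied.

```c
#include <assert.h>
#include <stdint.h>

#define HUGE_1G (1ULL << 30)

/* Hypothetical result type: one 1 GB-backed block plus a 2 MB-backed tail. */
struct ram_split {
    uint64_t gb_backed;   /* bytes that fill whole 1 GB pages */
    uint64_t tail;        /* remainder, to be backed by 2 MB pages */
};

/* Split a region size into its 1 GB-aligned part and the leftover tail. */
static struct ram_split split_by_gbpages(uint64_t size)
{
    struct ram_split s;

    s.tail = size % HUGE_1G;
    s.gb_backed = size - s.tail;
    return s;
}
```

With 3.75 GB below 4 GB this gives a 3 GB block and a 0.75 GB tail (tail1); with 4.25 GB above 4 GB it gives a 4 GB block and a 0.25 GB tail (tail2).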

 - 1GB backed RAM can be mapped with 2MB pages.
 
  Also, as Peter said this cannot depend on host considerations.
  Just do it unconditionally, but only for new machine types (pc-1.8
  and q35-1.8, since unfortunately we're too close to hard freeze).
 
 Why is the description of memory subregions and aliases part of
 machine types?

Re: [Qemu-devel] [PATCH 1/1] sd: pl181: fix fifo count read support

2013-10-25 Thread Jean-Christophe PLAGNIOL-VILLARD

On 11:33 Sat 19 Oct , Jean-Christophe PLAGNIOL-VILLARD wrote:
 as it depends on the current direction

Any chance to get that applied?

Barebox relies on it so that it can work on both QEMU and real hw

Best Regards,
J.
 
 Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD plagn...@jcrosoft.com
 ---
  hw/sd/pl181.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
 index 03875bf..91adbbd 100644
 --- a/hw/sd/pl181.c
 +++ b/hw/sd/pl181.c
 @@ -344,7 +344,11 @@ static uint64_t pl181_read(void *opaque, hwaddr offset,
             data engine.  DataCnt is decremented after each byte is
             transferred between the serial engine and the card.
             We don't emulate this level of detail, so both can be the same.  */
 -        tmp = (s->datacnt + 3) >> 2;
 +        if (s->datactrl & PL181_DATA_DIRECTION)
 +            tmp = s->fifo_len;
 +        else
 +            tmp = s->datacnt;
 +        tmp = (tmp + 3) >> 2;
          if (s->linux_hack) {
              s->linux_hack = 0;
              pl181_fifo_run(s);
 -- 
 1.8.4.rc3
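
The change in the patch above can be read as a small pure function. This is a minimal sketch under stated assumptions: the struct is a trimmed-down, hypothetical stand-in for the device state, and the direction bit value is assumed for illustration rather than taken from hw/sd/pl181.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit value for illustration; check hw/sd/pl181.c for the real one. */
#define TOY_DATA_DIRECTION 0x2u

/* Hypothetical, trimmed-down device state. */
struct toy_pl181 {
    uint32_t datactrl;   /* data control register */
    uint32_t datacnt;    /* bytes left to transfer */
    uint32_t fifo_len;   /* bytes currently held in the FIFO */
};

/* FifoCnt is reported in 32-bit words, so the byte count is rounded up. */
static uint32_t fifo_count_words(const struct toy_pl181 *s)
{
    uint32_t bytes;

    if (s->datactrl & TOY_DATA_DIRECTION) {
        bytes = s->fifo_len;   /* direction bit set: use the FIFO fill level */
    } else {
        bytes = s->datacnt;    /* direction bit clear: use the remaining count */
    }
    return (bytes + 3) >> 2;
}
```

With the direction bit set and 5 bytes in the FIFO, the function reports 2 words even if datacnt is still 512; without the bit it reports 128 words, which matches the old behaviour.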

Re: [Qemu-devel] [PATCH v5 4/5] Update documentation for LTTng ust tracing

2013-10-25 Thread Alex Bennée


mohamad.ge...@gmail.com writes:

 Signed-off-by: Mohamad Gebai mohamad.ge...@polymtl.ca

All looks good to me now.

Reviewed-by: Alex Bennée a...@bennee.com

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Alex Bennée


tommu...@gmail.com writes:

 This patch adds routines to the softfloat library that are useful for
 the PowerPC VSX implementation.  The routines are, however, not specific
 to PowerPC and are appropriate for softfloat.
<snip>

Is it worth adding some sort of test into make check to defend these
softfloat functions against unintentional breakage? It would certainly
be worthwhile as soon as multiple arches use these functions as float
errors are often subtle and hard to track down.

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Peter Maydell

On 25 October 2013 12:34, Alex Bennée alex.ben...@linaro.org wrote:
 Is it worth adding some sort of test into make check to defend these
 softfloat functions against unintentional breakage? It would certainly
 be worthwhile as soon as multiple arches use these functions as float
 errors are often subtle and hard to track down.

Ideally, but there's zero infrastructure for doing the kind
of serious including-edge-cases testing at the moment, so I'm
not really in favour of making it a gating condition for
accepting patches.

If somebody wanted to set up such infrastructure, there are
a couple of approaches that spring to mind:
 (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
  on more target architectures, add the record-and-replay feature
  so it can be run without having target hardware, and then just
  test softfloat by testing the actual target fp instructions
 (b) something involving wiring up IBM's IEEE test suite
  vectors directly to our softfloat code:
 
https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenagsecond=webmaster
  (it's not clear to me what license the test vectors are
  under)

-- PMM

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Peter Maydell

On 24 October 2013 17:17, Tom Musta tommu...@gmail.com wrote:
 This patch adds routines to the softfloat library that are useful for
 the PowerPC VSX implementation.  The routines are, however, not specific
 to PowerPC and are appropriate for softfloat.

 The following routines are added:

   - float32_is_denormal() returns true if the 32-bit floating point number
 is denormalized.
   - float64_is_denormal() returns true if the 64-bit floating point number
 is denormalized.

Can you point me at the patches which use these, please?
I couldn't find them with a quick search in my email client.

   - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
 floating point number.
   - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
 floating point number.

These look rather odd to me, and again I can't find the uses in
your patchset. Returning just the exponent is a bit odd and
suggests that maybe the split between target code and softfloat
is in the wrong place.

   - float32_to_uint64() converts a 32-bit floating point number to an
 unsigned 64 bit number.

I would put this in its own patch, personally.


 +INLINE int float32_is_denormal(float32 a)
 +{
 +    return ((float32_val(a) & 0x7f800000) == 0) &&
 +           ((float32_val(a) & 0x007fffff) != 0);
 +}

return float32_is_zero_or_denormal(a) && !float32_is_zero(a);

is easier to review and less duplicative of code.
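
The two formulations can be checked against each other on raw IEEE 754 single-precision bit patterns. This is a standalone sketch that works on uint32_t instead of softfloat's float32 type; the f32_* helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define F32_EXP_MASK  0x7f800000u   /* 8 exponent bits */
#define F32_FRAC_MASK 0x007fffffu   /* 23 fraction bits */

/* Zero: everything except the sign bit is clear. */
static int f32_is_zero(uint32_t bits)
{
    return (bits & 0x7fffffffu) == 0;
}

/* Zero or denormal: the exponent field is all zeroes. */
static int f32_is_zero_or_denormal(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == 0;
}

/* Direct form, as in the patch: zero exponent and non-zero fraction. */
static int f32_is_denormal_direct(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == 0 && (bits & F32_FRAC_MASK) != 0;
}

/* Composed form, as suggested in the review. */
static int f32_is_denormal_composed(uint32_t bits)
{
    return f32_is_zero_or_denormal(bits) && !f32_is_zero(bits);
}
```

Both forms agree on the boundary cases: the smallest positive denormal (0x00000001) and the largest negative one (0x807fffff) are denormal, while positive and negative zero, the smallest normal number (0x00800000), and 1.0 (0x3f800000) are not.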

thanks
-- PMM

Re: [Qemu-devel] [PATCH 09/17] migration-local: override before_ram_iterate to send pipefd

2013-10-25 Thread Lei Li


On 10/25/2013 03:23 PM, Paolo Bonzini wrote:

Il 25/10/2013 05:38, Lei Li ha scritto:

Just want to confirm: normally, should I take these 'otherwise looks
good/ok'
as a 'Reviewed-by' from you if the other comment is fixed in the updated
version?

Depends on how much the patch changes... right now I'm still expecting
some changes so I didn't really look much at the patch and didn't test
it.  I prefer to take a more complete look at v3 before giving a
formal Reviewed-by.


I see, thanks for your explanation.



Paolo




--
Lei

Re: [Qemu-devel] [PATCH 14/17] add new RanState RAN_STATE_FLIPPING_MIGRATE

2013-10-25 Thread Lei Li


On 10/25/2013 03:31 PM, Paolo Bonzini wrote:

Il 25/10/2013 05:30, Lei Li ha scritto:

I am not sure about the name; for one thing, the new state would apply
also to postcopy migration.

About the name, how about 'live-upgrade'?

OK, I'll add the transition between postcopy and this new state.

Note I didn't mean postmigrate.

For a description of postcopy, see my answer to the cover letter (patch


Yes, I've realized that I misunderstood it...


0).  The new state means somebody else has newer contents of the
memory.  Perhaps stale?

And should it also apply from 'prelaunch' to 'flipping-migrate' too?

Yes, it should.  Good catch!

Paolo




--
Lei

Re: [Qemu-devel] [PATCH 0/17 v2] Localhost migration with side channel for ram

2013-10-25 Thread Lei Li


On 10/25/2013 03:30 PM, Paolo Bonzini wrote:

Il 25/10/2013 06:58, Lei Li ha scritto:

Right now just has inaccurate numbers without the new vmsplice, which
based on
the result from info migrate, as the guest ram size increases, although the
'total time' is number of times less compared with the current live
migration, but the 'downtime' performs badly.

Of course.

For a 1GB ram guest,

total time: 702 milliseconds
downtime: 692 milliseconds

And when the ram size of guest increases exponentially, those numbers are
proportional to it.
  
I will make a list of the performance with the new vmsplice later, I am

sure it'd be much better than this at least.

Yes, please.  Is the memory usage still 2x without vmsplice?

I think you have a nice proof of concept, but on the other hand this
probably needs to be coupled with some kind of postcopy live migration,
that is:

* the source starts sending data

* but the destination starts running immediately

* if the machine needs a page that is missing, the destination asks the
source to send it

* as soon as it arrives, the destination can restart

Using postcopy is problematic for reliability: if the destination fails,
the virtual machine is lost because the source doesn't have the latest
content of memory.  However, this is a much, much smaller problem for
live QEMU upgrade where the network cannot fail.

If you do this, you can achieve pretty much instantaneous live upgrade,
well within your original 200 ms goals.  But the flipping code with
vmsplice should be needed anyway to avoid doubling memory usage, and


Yes, I have read the postcopy migration patches. Postcopy does perform very
well on downtime, since it just sends the vmstates and then switches execution
to the destination host. And as you pointed out, it cannot avoid
doubling memory usage.

The numbers listed above are based on the old vmsplice, which actually copies
data rather than moving it; I have not yet worked on the performance
benchmark. As the feedback for this version is positive, I am now
trying to get real results with the new vmsplice.

BTW, the kernel side is looking into a huge page solution to improve
performance.

The recent kernel patches are at this link:

http://article.gmane.org/gmane.linux.kernel/1574277


it's looking pretty good in this version already!  I'm relieved that the
RDMA code was designed right!


I am happy with it too. :)
Those RDMA hooks really make things more flexible!



Paolo




--
Lei

[Qemu-devel] KVM call agenda for 2013-10-29

2013-10-25 Thread Juan Quintela



Hi

Please, send any topic that you are interested in covering.

Thanks, Juan.

Call details:

10:00 AM to 11:00 AM EDT
Every two weeks

If you need phone number details,  contact me privately.

[Qemu-devel] [PATCH] Fix COR by disabling BDRV_O_COPY_ON_READ before opening the backing_file.

2013-10-25 Thread Thibaut LAURENT

Since commit 0ebd24e0a203cf2852c310b59fbe050190dc6c8c,
bdrv_open_common will throw an error when trying to open a file
read-only with the BDRV_O_COPY_ON_READ flag set.
Although BDRV_O_RDWR is unset for the backing files,
BDRV_O_COPY_ON_READ is still passed on if copy-on-read was requested
for the drive. Let's unset this flag too before opening the backing
file, or bdrv_open_common will fail.

Signed-off-by: Thibaut LAURENT thibaut.laur...@gmail.com
---
 block.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index fd05a80..4474012 100644
--- a/block.c
+++ b/block.c
@@ -999,7 +999,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     }
 
     /* backing files always opened read-only */
-    back_flags = bs->open_flags & ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT);
+    back_flags = bs->open_flags & ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT |
+                                    BDRV_O_COPY_ON_READ);
 
     ret = bdrv_open(bs->backing_hd,
                     *backing_filename ? backing_filename : NULL, options,
-- 
1.8.4.1
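
The masking done by this patch can be sketched in isolation. The TOY_O_* flag values below are illustrative assumptions, not the definitions from QEMU's block headers; only the clearing pattern is the point.

```c
#include <assert.h>

/* Illustrative flag values; the real BDRV_O_* values live in QEMU's headers. */
#define TOY_O_RDWR         0x0002
#define TOY_O_SNAPSHOT     0x0008
#define TOY_O_NOCACHE      0x0020
#define TOY_O_COPY_ON_READ 0x0400

/*
 * Derive the flags for a backing file from the flags of the top image:
 * clear everything that only makes sense for a writable image.
 */
static int backing_open_flags(int open_flags)
{
    return open_flags & ~(TOY_O_RDWR | TOY_O_SNAPSHOT | TOY_O_COPY_ON_READ);
}
```

Flags unrelated to writability, such as the cache mode in this sketch, pass through unchanged, so the backing file is opened read-only without the copy-on-read request that would otherwise be rejected.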

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Tom Musta


Peter:  Thanks for your feedback.  Responses below.


On 10/25/2013 6:55 AM, Peter Maydell wrote:

On 24 October 2013 17:17, Tom Musta tommu...@gmail.com wrote:

This patch adds routines to the softfloat library that are useful for
the PowerPC VSX implementation.  The routines are, however, not specific
to PowerPC and are appropriate for softfloat.

The following routines are added:

   - float32_is_denormal() returns true if the 32-bit floating point number
 is denormalized.
   - float64_is_denormal() returns true if the 64-bit floating point number
 is denormalized.


Can you point me at the patches which use these, please?
I couldn't find them with a quick search in my email client.


Please see http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html




   - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
 floating point number.
   - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
 floating point number.


These look rather odd to me, and again I can't find the uses in
your patchset. Returning just the exponent is a bit odd and
suggests that maybe the split between target code and softfloat
is in the wrong place.


Please see http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html
and http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03107.html
and also the corresponding definitions of those instructions in the Power ISA.

What is odd here is the PowerPC instruction(s) :)

But given that softfloat code extracts exponents in numerous places, I do not 
find
it odd at all that a floating point instruction model for a non-standard
operation might have to do the same.

These functions can easily be kept within the PowerPC code proper if there are
objections to them being added to softfloat.  I would rename them, of course, so
that they do not look like softfloat routines.


   - float32_to_uint64() converts a 32-bit floating point number to an
 unsigned 64 bit number.


I would put this in its own patch, personally.


Fair enough.  Just so that I am clear ... do you mean submit this as a patch
just by itself (not as part of a series of VSX additions)?



+INLINE int float32_is_denormal(float32 a)
+{
+    return ((float32_val(a) & 0x7f800000) == 0) &&
+           ((float32_val(a) & 0x007fffff) != 0);
+}


return float32_is_zero_or_denormal(a) && !float32_is_zero(a);

is easier to review and less duplicative of code.

thanks


It surprised me that there were is_zero and is_zero_or_denormal functions but
not is_denormal functions.  I would find it more normal to implement the two
primitive functions and then construct is_zero_or_denormal to be the OR of
those two.  Until you look at efficiency of the implementation.

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Alex Bennée


peter.mayd...@linaro.org writes:

 On 25 October 2013 12:34, Alex Bennée alex.ben...@linaro.org wrote:
 Is it worth adding some sort of test into make check to defend these
 softfloat functions against unintentional breakage? It would certainly
 be worthwhile as soon as multiple arches use these functions as float
 errors are often subtle and hard to track down.

 Ideally, but there's zero infrastructure for doing the kind
 of serious including-edge-cases testing at the moment, so I'm
 not really in favour of making it a gating condition for
 accepting patches.

I'm not proposing to halt inclusion for that; I was just wondering aloud
how it could be defended. The softfloat routines themselves
could be tested within the existing tests/ stuff like
tests/check-qfloat.c without having to worry about hooking into target
arch specific test cases.

 If somebody wanted to set up such infrastructure, there are
 a couple of approaches that spring to mind:
  (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
   on more target architectures, add the record-and-replay feature
   so it can be run without having target hardware, and then just
   test softfloat by testing the actual target fp instructions

Interesting. Funnily we spent a lot of time at Transitive fixing up
translation failures that our random code generator threw up. It's also
equally interesting how far you can get with fairly broken translation
that no actual applications care about.

I'll have a look once I've fixed up build machinery around the existing
TCG tests.

  (b) something involving wiring up IBM's IEEE test suite
   vectors directly to our softfloat code:
  
 https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenagsecond=webmaster
   (it's not clear to me what license the test vectors are
   under)

 -- PMM

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH v5 1/5] Fix configure script for LTTng 2.x

2013-10-25 Thread Alex Bennée


mohamad.ge...@gmail.com writes:

 Signed-off-by: Mohamad Gebai mohamad.ge...@polymtl.ca
 ---
<snip>

Tested on Ubuntu 12.04 with and without the LTTNG PPA and seems to work
well enough.

Reviewed-by: Alex Bennée a...@bennee.com

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Tom Musta


On 10/25/2013 6:44 AM, Peter Maydell wrote:

On 25 October 2013 12:34, Alex Bennée alex.ben...@linaro.org wrote:

Is it worth adding some sort of test into make check to defend these
softfloat functions against unintentional breakage? It would certainly
be worthwhile as soon as multiple arches use these functions as float
errors are often subtle and hard to track down.


Ideally, but there's zero infrastructure for doing the kind
of serious including-edge-cases testing at the moment, so I'm
not really in favour of making it a gating condition for
accepting patches.

If somebody wanted to set up such infrastructure, there are
a couple of approaches that spring to mind:
  (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
   on more target architectures, add the record-and-replay feature
   so it can be run without having target hardware, and then just
   test softfloat by testing the actual target fp instructions
  (b) something involving wiring up IBM's IEEE test suite
   vectors directly to our softfloat code:
  
https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenagsecond=webmaster
   (it's not clear to me what license the test vectors are
   under)


Softfloat would seem to lend itself very well to unit testing which makes (b)
attractive.  Let me see if I can get an answer to the licensing question.

Re: [Qemu-devel] [patch 2/2] i386: pc: align gpa<->hpa on 1GB boundary

2013-10-25 Thread Marcelo Tosatti

On Fri, Oct 25, 2013 at 11:57:18AM +0200, igor Mammedov wrote:
 On Fri, 25 Oct 2013 02:58:05 -0200
 Marcelo Tosatti mtosa...@redhat.com wrote:
 
  On Fri, Oct 25, 2013 at 12:55:36AM +0100, Paolo Bonzini wrote:
   +        if (hpagesize == (1<<30)) {
   +            unsigned long holesize = 0x100000000ULL - below_4g_mem_size;
   +
   +            memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
   +                                     0x100000000ULL,
   +                                     above_4g_mem_size - holesize);
   +            memory_region_add_subregion(system_memory, 0x100000000ULL,
   +                                        ram_above_4g);
   +
   +            ram_above_4g_piecetwo = g_malloc(sizeof(*ram_above_4g_piecetwo));
   +            memory_region_init_alias(ram_above_4g_piecetwo, NULL,
   +                                     "ram-above-4g-piecetwo", ram,
   +                                     0x100000000ULL - holesize, holesize);
   +            memory_region_add_subregion(system_memory,
   +                                        0x100000000ULL +
   +                                        above_4g_mem_size - holesize,
   +                                        ram_above_4g_piecetwo);
   
   Why break it in two?  You can just allocate extra holesize bytes in
   the ram MemoryRegion, and not map the part that corresponds to
   [0x100000000ULL - holesize, 0x100000000ULL).
  
  - If the ram MemoryRegion is backed with 1GB hugepages, you might
  not want to allocate extra holesize bytes (which might require an
  entire 1GB page).
 From the POV of modelling current RAM as DIMM devices, aliasing
 wouldn't work nicely. But breaking one block into two or more is fine, since
 the blocks could then be represented as several DIMM devices.
 
 With more than 3Gb of backend RAM, it could be split into blocks like this:
 
   [ 3Gb (1Gb pages backed) ]
   [tail1 (below_4gb - 3Gb) (2mb pages backed) ]
   [above_4gb whole X Gb pages (1Gb pages backed)]
   [tail2 (2mb pages backed)]

Yes, I thought of that; unfortunately it's cumbersome to add an interface
for the user to supply both 2MB and 1GB hugetlbfs pages.

Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX

2013-10-25 Thread Peter Maydell

On 25 October 2013 14:01, Tom Musta tommu...@gmail.com wrote:
 On 10/25/2013 6:55 AM, Peter Maydell wrote:
 On 24 October 2013 17:17, Tom Musta tommu...@gmail.com wrote:
- float32_is_denormal() returns true if the 32-bit floating point
 number
  is denormalized.
- float64_is_denormal() returns true if the 64-bit floating point
 number
  is denormalized.


 Can you point me at the patches which use these, please?
 I couldn't find them with a quick search in my email client.


 Please see
 http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html

Thanks. For that code you can just use the existing
is_zero_or_denormal function if you like, since you've
already ruled out "is this zero?" by the time you're
checking for "is this denormal?". (In fact that logic
seems to do a number of pointless checks for "is this
zero?" when it's already ruled that case out very early;
it should probably be rephrased.)

However I don't think there's any harm in our providing some
*_is_denormal() functions in our softfloat API if the code
seems clearer if it's written to use them. It does fill
out an odd gap in the API shape, as you note below.

- float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
  floating point number.
- float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
  floating point number.


 These look rather odd to me, and again I can't find the uses in
 your patchset. Returning just the exponent is a bit odd and
 suggests that maybe the split between target code and softfloat
 is in the wrong place.


 Please see
 http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html
 and http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03107.html
 and also the corresponding definitions of those instructions in the Power
 ISA.

 What is odd here is the PowerPC instruction(s) :)

 But given that softfloat code extracts exponents in numerous places, I do
 not find it odd at all that a floating point instruction model for a
 non-standard operation might have to do the same.

 These functions can easily be kept within the PowerPC code proper if there
 are objections to them being added to softfloat.  I would rename them, of
 course, so that they do not look like softfloat routines.

Mmm. You'll notice that your calling code has to know rather
a lot about the format of the IEEE floats (in that it has
to know the min/max exponent and mantissa width). So I think
I'd just opencode these in the PPC routines. (This is what we
do in target-arm, see recpe_f32 and rsqrte_f32 for examples.)
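
As an illustration of what open-coding the exponent extraction might look like, here is a minimal sketch that works on raw IEEE 754 bit patterns held in plain integers. The helper names are hypothetical (they are not softfloat or target-ppc functions), and the field widths and biases are the standard binary32/binary64 ones that the calling code would have to know anyway.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of softfloat: take the raw bit
 * pattern, isolate the biased exponent field and subtract the bias. */
static inline int f32_unbiased_exp(uint32_t bits)
{
    /* binary32: 8 exponent bits starting at bit 23, bias 127 */
    return (int)((bits >> 23) & 0xff) - 127;
}

static inline int f64_unbiased_exp(uint64_t bits)
{
    /* binary64: 11 exponent bits starting at bit 52, bias 1023 */
    return (int)((bits >> 52) & 0x7ff) - 1023;
}
```

Note that zeros and denormals come out as -127 / -1023 here; a caller that cares has to special-case them, which is part of why this knowledge ends up in the target code.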

- float32_to_uint64() converts a 32-bit floating point number to an
  unsigned 64 bit number.


 I would put this in its own patch, personally.


 Fair enough.  Just so that I am clear ... do you mean submit this as a patch
 just by itself (not as part of a series of VSX additions)?

I mean in its own patch email so it is a separate commit and
clearly separated from other things for code review purposes.
You probably still keep it as part of this patch series. (In
fact it would also be a good idea to include the previous
patch this one depends on, if that has not yet been committed.)



 +INLINE int float32_is_denormal(float32 a)
 +{
 +    return ((float32_val(a) & 0x7f800000) == 0) &&
 +           ((float32_val(a) & 0x007fffff) != 0);
 +}


 return float32_is_zero_or_denormal(a) && !float32_is_zero(a);

 is easier to review and less duplicative of code.

 thanks


 It surprised me that there were is_zero and is_zero_or_denormal functions
 but not is_denormal functions.  I would find it more normal to implement the
 two primitive functions and then construct is_zero_or_denormal to be the OR
 of those two.  Until you look at efficiency of the implementation.

I think also the original uses of these functions didn't
need to distinguish zero from denormal, so it was a more
natural API for those uses.

-- PMM
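
For readers following the is_denormal discussion, a small sketch of the composition suggested above, written against raw binary32 bit patterns rather than the softfloat `float32` type. The `f32_*` names are hypothetical stand-ins for the softfloat predicates, used here only so the example is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the softfloat predicates, operating on
 * the raw bit pattern of an IEEE 754 binary32 value. */
static inline bool f32_is_zero(uint32_t bits)
{
    /* everything except the sign bit is clear */
    return (bits & 0x7fffffffu) == 0;
}

static inline bool f32_is_zero_or_denormal(uint32_t bits)
{
    /* exponent field is all zeroes */
    return (bits & 0x7f800000u) == 0;
}

static inline bool f32_is_denormal(uint32_t bits)
{
    /* zero exponent, but not a zero: the mantissa must be non-zero */
    return f32_is_zero_or_denormal(bits) && !f32_is_zero(bits);
}
```

The composed form avoids repeating the mask constants, at the cost of one extra test compared with checking the exponent and mantissa fields directly.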

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Tom Musta


On 10/24/2013 3:38 PM, Richard Henderson wrote:

On 10/24/2013 09:25 AM, Tom Musta wrote:

\

snip


+ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);  \
+ft0 = btp##_##sum(ft0, ft1, &env->fp_status); \
+xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);  \

snip

You want to be using tp##muladd instead of widening to 128 bits.


Thanks for the suggestion, Richard.  I will try it.




+s = xt;  \
+} \
+else {\
+m = xt;  \


Also be careful of the coding style.


To be fixed in V2 (checkpatch.pl missed this one).

Re: [Qemu-devel] [PATCH v5 1/5] Fix configure script for LTTng 2.x

2013-10-25 Thread Mohamad Gebai




Signed-off-by: Mohamad Gebai mohamad.ge...@polymtl.ca
---

snip

Tested on Ubuntu 12.04 with and without the LTTNG PPA and seems to work
well enough.

Reviewed-by: Alex Bennée a...@bennee.com

Yes, the bug is actually only in the Ubuntu package (missing liburcu*.pc
files). It is fixed everywhere else, including the LTTng PPA. There is a
bug report about it on Launchpad. Either way, this fallback avoids
getting an error with the Ubuntu packages.


Thanks!
Mohamad

Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions

2013-10-25 Thread Tom Musta


On 10/24/2013 5:10 PM, Peter Maydell wrote:

Can't you use the min and max softfloat functions? Those are
there specifically because the corner cases mean you can't
implement them using the comparisons. (For instance for
the example you quote of max(-0.0, +0.0) they return +0.0
as you require.)


I tried this but didn't have much luck getting results to match
the P7 hardware.  Unfortunately, I don't recall the details.
Let me try this approach again.

[Qemu-devel] e1000 patch for osx

2013-10-25 Thread jacek burghardt

Is there a patch for qemu git master that pre-initializes the e1000 so I can
get rid of the unplugged network cable message? I know there is a patch, but it
is for an older version of qemu; it seems that it no longer functions and does
not apply fully, as the code has changed.

Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions

2013-10-25 Thread Peter Maydell

On 25 October 2013 14:52, Tom Musta tommu...@gmail.com wrote:
 On 10/24/2013 5:10 PM, Peter Maydell wrote:

 Can't you use the min and max softfloat functions? Those are
 there specifically because the corner cases mean you can't
 implement them using the comparisons. (For instance for
 the example you quote of max(-0.0, +0.0) they return +0.0
 as you require.)

 I tried this but didn't have much luck getting results to match
 the P7 hardware.  Unfortunately, I don't recall the details.
 Let me try this approach again.

The functions are supposed to match the IEEE mandated min/max
behaviour, and I tested the ARM instructions that use them,
so unless the PPC chip designers have gone rather off-piste
they ought to work :-) (It can happen, though, IIRC x86 has
some rather weird non-IEEE min/max insns.)

-- PMM
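
To make the signed-zero corner case concrete, here is a minimal sketch on host doubles (not softfloat types). Both function names are hypothetical. The first is the comparison-only maximum, whose result for a pair of zeroes of opposite sign depends on operand order; the second special-cases zeroes so that max(-0.0, +0.0) is +0.0 either way. NaN handling is deliberately left out, so this is not a full IEEE maximum.

```c
#include <math.h>

/* Comparison-only maximum: +0.0 and -0.0 compare equal, so the result
 * for a pair of zeroes is simply whichever operand came second. */
static double naive_max(double a, double b)
{
    return a > b ? a : b;
}

/* Sketch of a zero-aware maximum (NaNs are not handled here). */
static double zero_aware_max(double a, double b)
{
    if (a == 0.0 && b == 0.0) {
        /* +0.0 wins unless both operands are -0.0 */
        return (signbit(a) && signbit(b)) ? -0.0 : 0.0;
    }
    return a > b ? a : b;
}
```

With these definitions, `naive_max(0.0, -0.0)` yields -0.0 while `naive_max(-0.0, 0.0)` yields +0.0, which is the kind of inconsistency the dedicated min/max functions exist to avoid.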

Re: [Qemu-devel] e1000 patch for osx

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 14:53, jacek burghardt ha scritto:
 Is there a patch for qemu git master that pre-initializes the e1000 so I can
 get rid of the unplugged network cable message? I know there is a patch, but
 it is for an older version of qemu; it seems that it no longer functions and
 does not apply fully, as the code has changed.

Which patch was that?

Paolo

Re: [Qemu-devel] e1000 patch for osx

2013-10-25 Thread jacek burghardt

https://github.com/saucelabs/mac-osx-on-kvm/blob/master/e1000-mac-hacks.patch

-    } else
-        s->phy_reg[addr] = data;
+    } else {
+        /* some (reset) bits are self clearing, so better clear them */
+        switch (addr) {
+        case PHY_CTRL:
+            s->phy_reg[addr] = data & 0x7eff;
+            if (s->phy_reg[addr] != data)
+                set_ics(s, 0, E1000_ICR_LSC);
+            break;
+        default:
+            s->phy_reg[addr] = data;
+        }
+    }



On Fri, Oct 25, 2013 at 8:21 AM, Paolo Bonzini pbonz...@redhat.com wrote:

 Il 25/10/2013 14:53, jacek burghardt ha scritto:
  Is there a patch for qemu git master that pre-initializes the e1000 so I can
  get rid of the unplugged network cable message? I know there is a patch, but
  it is for an older version of qemu; it seems that it no longer functions and
  does not apply fully, as the code has changed.

 Which patch was that?

 Paolo

Re: [Qemu-devel] [PATCH v5 1/5] Fix configure script for LTTng 2.x

2013-10-25 Thread Alex Bennée


mohamad.ge...@polymtl.ca writes:

 Signed-off-by: Mohamad Gebai mohamad.ge...@polymtl.ca
 ---
 snip

 Tested on Ubuntu 12.04 with and without the LTTNG PPA and seems to work
 well enough.

 Reviewed-by: Alex Bennée a...@bennee.com

 Yes, the bug is actually only in the Ubuntu package (missing liburcu*.pc
 files). It is fixed everywhere else, including the LTTng PPA. There is a
 bug report about it on Launchpad. Either way, this fallback avoids
 getting an error with the Ubuntu packages.
snip

Hopefully Stefan will be able to check on the RPM based ones.

Did you identify a bug report against Ubuntu for these packages?

-- 
Alex Bennée

[Qemu-devel] [PATCH 0/3 for 1.7] migration: introduce page flipping capability

2013-10-25 Thread Lei Li

This series is extracted from the latest localhost migration
with side channel for ram patch set, with the comments from Paolo
addressed. It is sent separately according to his suggestion.

Localhost migration with side channel for ram:
http://lists.gnu.org/archive/html/qemu-devel/2013-10/msg02787.html

Lei Li (3):
  QAPI: introduce magration capability unix_page_flipping
  migration: add migrate_unix_page_flipping()
  qmp-command.hx: add missing docs for migration capabilites

[Qemu-devel] [PATCH 3/3] qmp-command.hx: add missing docs for migration capabilites

2013-10-25 Thread Lei Li

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Lei Li li...@linux.vnet.ibm.com
---
 qmp-commands.hx |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index fba15cd..dcec433 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2898,6 +2898,10 @@ migrate-set-capabilities
 Enable/Disable migration capabilities
 
 - xbzrle: XBZRLE support
+- x-rdma-pin-all: Pin all pages during RDMA support
+- zero-blocks: Compress zero blocks during block migration
+- auto-converge: Block VCPU to help convergence of migration
+- unix-page-flipping: Page flipping for live QEMU upgrade
 
 Arguments:
 
@@ -2922,6 +2926,10 @@ Query current migration capabilities
 
 - capabilities: migration capabilities state
  - xbzrle : XBZRLE state (json-bool)
+ - x-rdma-pin-all: RDMA state (json-bool)
+ - zero-blocks: zero-blocks state (json-bool)
+ - auto-converge: Auto converge state (json-bool)
+ - unix-page-flipping: Page flipping state (json-bool)
 
 Arguments:
 
-- 
1.7.7.6

[Qemu-devel] [PATCH 1/3] QAPI: introduce magration capability unix_page_flipping

2013-10-25 Thread Lei Li

Introduce unix_page_flipping to MigrationCapability for localhost
migration.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Lei Li li...@linux.vnet.ibm.com
---
 qapi-schema.json |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 60f3fd1..7cb88af 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -661,10 +661,18 @@
 # @auto-converge: If enabled, QEMU will automatically throttle down the guest
 #  to speed up convergence of RAM migration. (since 1.6)
 #
+# @unix-page-flipping: If enabled, QEMU can optimize migration when the
+#  destination is a QEMU process that runs on the same host as
+#  the source (as is the case for live upgrade).  If the migration
+#  transport is a Unix socket, QEMU will flip RAM pages directly to
+#  the destination, so that memory is only allocated twice for the
+#  source and destination processes. Disabled by default. (since 1.8)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
-  'data': ['xbzrle', 'x-rdma-pin-all', 'auto-converge', 'zero-blocks'] }
+  'data': ['xbzrle', 'x-rdma-pin-all', 'auto-converge', 'zero-blocks',
+   'unix-page-flipping'] }
 
 ##
 # @MigrationCapabilityStatus
-- 
1.7.7.6

Re: [Qemu-devel] [PATCH] rdma: rename 'x-rdma' = 'rdma'

2013-10-25 Thread Michael R. Hines


On 10/22/2013 04:20 PM, Eric Blake wrote:

On 10/22/2013 05:59 PM, mrhi...@linux.vnet.ibm.com wrote:

From: Michael R. Hines mrhi...@us.ibm.com

As far as we can tell, all known bugs have been fixed, and
there has been very good participation in testing and running.

1. Parallel RDMA migrations are working
2. IPv6 migration is working
3. Libvirt patches are ready
4. virt-test is working

Any objections to removing the experimental tag?

There is one remaining bug: qemu-system-i386 does not compile
with RDMA. I have zero access to 32-bit hardware
using RDMA, so this hasn't been much of a priority. It seems
safer to *not* submit a non-testable patch rather than
submit a fix just for the sake of compiling =)
  
Signed-off-by: Michael R. Hines mrhi...@us.ibm.com

---
  
  TODO:

  =
-1. 'migrate x-rdma:host:port' and '-incoming x-rdma' options will be
+1. 'migrate rdma:host:port' and '-incoming rdma' options will be
 renamed to 'rdma' after the experimental phase of this work has
 completed upstream.

Shouldn't you remove step 1 and renumber the rest of the list
altogether, rather than just altering the comment to make it out-of-date?



Oops =)


+++ b/qapi-schema.json
@@ -615,7 +615,7 @@
  #  This feature allows us to minimize migration traffic for certain work
  #  loads, by sending compressed difference of the pages
  #
-# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
+# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
  #  mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
  #  Disabled by default. Experimental: may (or may not) be renamed after
  #  further testing is complete. (since 1.6)

I'd also recommend tweaking this to say 'since 1.7', since the spelling
'rdma-pin-all' is new to this release.



Ah, yes. =)

[Qemu-devel] [PATCH 2/3] migration: add migrate_unix_page_flipping()

2013-10-25 Thread Lei Li

Add migrate_unix_page_flipping() to check if
MIGRATION_CAPABILITY_UNIX_PAGE_FLIPPING is enabled.

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Lei Li li...@linux.vnet.ibm.com
---
 include/migration/migration.h |3 +++
 migration.c   |9 +
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 140e6b4..7e5d01a 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -131,10 +131,13 @@ void migrate_add_blocker(Error *reason);
 void migrate_del_blocker(Error *reason);
 
 bool migrate_rdma_pin_all(void);
+
 bool migrate_zero_blocks(void);
 
 bool migrate_auto_converge(void);
 
+bool migrate_unix_page_flipping(void);
+
 int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
  uint8_t *dst, int dlen);
 int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
diff --git a/migration.c b/migration.c
index 2b1ab20..4ac466b 100644
--- a/migration.c
+++ b/migration.c
@@ -541,6 +541,15 @@ int64_t migrate_xbzrle_cache_size(void)
     return s->xbzrle_cache_size;
 }
 
+bool migrate_unix_page_flipping(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_UNIX_PAGE_FLIPPING];
+}
+
 /* migration thread support */
 
 static void *migration_thread(void *opaque)
-- 
1.7.7.6

Re: [Qemu-devel] [PATCH] rdma: rename 'x-rdma' = 'rdma'

2013-10-25 Thread Michael R. Hines


On 10/23/2013 02:25 AM, Paolo Bonzini wrote:

Il 22/10/2013 21:20, Eric Blake ha scritto:

-# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
+# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
  #  mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
  #  Disabled by default. Experimental: may (or may not) be renamed after
  #  further testing is complete. (since 1.6)

I'd also recommend tweaking this to say 'since 1.7', since the spelling
'rdma-pin-all' is new to this release.

I would also leave this as experimental for now.

Basically the point of the experimental designation was to ensure that
RDMA protocol changes might not preserve backwards compatibility.  The
capability is a separate thing from the protocol, as it would likely
apply to any migration-over-RDMA implementation

Paolo



Well, I tried posting libvirt support with this naming scheme,
but they didn't accept it.

Their reason (Daniel, I think) is valid: experimental implies that it
shouldn't be exposed in the management software until it is
deemed stable at some point.

As far as we can tell, it is stable, and that is made very clear by the new
'setup' state in the migration state machine.

How would we expose it in libvirt as an experimental feature
without labeling it as an experimental feature?

- Michael

Re: [Qemu-devel] [PATCH 0/3 for 1.7] migration: introduce page flipping capability

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 15:59, Lei Li ha scritto:
 This series is extracted from the latest localhost migration
 with side channel for ram patch set, with the comments from Paolo
 addressed. It is sent separately according to his suggestion.
 
 Localhost migration with side channel for ram:
 http://lists.gnu.org/archive/html/qemu-devel/2013-10/msg02787.html
 
 Lei Li (3):
   QAPI: introduce magration capability unix_page_flipping
   migration: add migrate_unix_page_flipping()
   qmp-command.hx: add missing docs for migration capabilites
 

Sorry for the misunderstanding---I meant squashing them together in one
patch, not separating the series.

Paolo

Re: [Qemu-devel] [PATCH] rdma: rename 'x-rdma' = 'rdma'

2013-10-25 Thread Paolo Bonzini

Il 25/10/2013 16:03, Michael R. Hines ha scritto:
 
 Well, I tried posting libvirt support with this naming scheme,
 but they didn't accept it.
 
 Their reason (Daniel, I think) is valid: experimental implies that it
 shouldn't be exposed in the management software until it is
 deemed stable at some point.
 
 As far as we can tell, it is stable, and that is made very clear by the new
 'setup' state in the migration state machine.

Sure, x-rdma => rdma *is* stable.  I'm not sure about x-rdma-pin-all though.

Paolo

[Qemu-devel] [RFC] block io lost in the guest , possible related to qemu?

2013-10-25 Thread Jack Wang

Hi Experts,

We've seen guest block I/O get lost in a VM. Any response will be helpful.

The environment is:
guest OS: Ubuntu 13.04
running a busy database workload with xfs on a disk exported with virtio-blk

The exported vdb has very high in-flight I/O, over 300. Some time later a
lot of I/O processes are in D state; it looks like a lot of requests are lost
in the storage stack below.

We're using qemu-kvm 1.0, host kernel 3.4.51.

In the qemu log of virtio-blk.c
I found the commit below. I wonder whether it is possible that the workload
generates some unknown requests to qemu that get lost in virtio_blk_handle_read?
I did some fio tests myself, but I can't generate any so-called unknown request type.

Any response will be helpful.

Jack


commit 9e72c45033770b81b536ac6091e91807247cc25a
Author: Alexey Zaytsev alexey.zayt...@gmail.com
Date:   Thu Dec 13 09:03:43 2012 +0200

virtio-blk: Return UNSUPP for unknown request types

Currently, all unknown requests are treated as VIRTIO_BLK_T_IN

Signed-off-by: Alexey Zaytsev alexey.zayt...@gmail.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 92c745a..df57b35 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -398,10 +398,14 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
         qemu_iovec_init_external(&req->qiov, &req->elem.out_sg[1],
                                  req->elem.out_num - 1);
         virtio_blk_handle_write(req, mrb);
-    } else {
+    } else if (type == VIRTIO_BLK_T_IN || type == VIRTIO_BLK_T_BARRIER) {
+        /* VIRTIO_BLK_T_IN is 0, so we can't just & it. */
         qemu_iovec_init_external(&req->qiov, &req->elem.in_sg[0],
                                  req->elem.in_num - 1);
         virtio_blk_handle_read(req);
+    } else {
+        virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
+        g_free(req);
     }
 }
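
The comment in the commit ("VIRTIO_BLK_T_IN is 0, so we can't just & it") can be illustrated with a small standalone sketch. Everything here is hypothetical: the `REQ_T_*` constants and `classify()` are local to the example and only mirror, as an assumption, the request type values used by virtio-blk (read = 0, write = 1, flush = 4, barrier = high bit). The point is that flag-style types can be tested with a mask, while the zero-valued read type must be compared for equality, otherwise every unrecognised type falls through into the read path.

```c
#include <stdint.h>

/* Illustrative constants, assumed to mirror the virtio-blk request types. */
enum { REQ_T_IN = 0, REQ_T_OUT = 1, REQ_T_FLUSH = 4 };
#define REQ_T_BARRIER 0x80000000u

typedef enum { REQ_READ, REQ_WRITE, REQ_FLUSH, REQ_UNSUPP } req_class;

/* Hypothetical dispatcher: decide how a request of the given type
 * would be handled. */
static req_class classify(uint32_t type)
{
    if (type & REQ_T_FLUSH) {
        return REQ_FLUSH;
    }
    if (type & REQ_T_OUT) {
        return REQ_WRITE;
    }
    /* `type & REQ_T_IN` is always 0, so test for equality instead. */
    if (type == REQ_T_IN || type == REQ_T_BARRIER) {
        return REQ_READ;
    }
    /* Anything else is reported as unsupported, not treated as a read. */
    return REQ_UNSUPP;
}
```

With a plain `else` in place of the equality test, a type such as 2 or 8 would have been classified as a read, which is the behaviour the commit removes.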

Re: [Qemu-devel] [PATCH v5 1/5] Fix configure script for LTTng 2.x

2013-10-25 Thread Mohamad Gebai


On 10/25/2013 10:33 AM, Alex Bennée wrote:

mohamad.ge...@polymtl.ca writes:

Yes, the bug is actually only in the Ubuntu package (missing liburcu*.pc
files). It is fixed everywhere else, including the LTTng PPA. There is a
bug report about it on Launchpad. Either way, this fallback avoids
getting an error with the Ubuntu packages.

snip

Hopefully Stefan will be able to check on the RPM based ones.

Did you identify a bug report against Ubuntu for these packages?


Yes, you can find it here:
https://bugs.launchpad.net/ubuntu/+source/liburcu/+bug/1243391

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Tom Musta


On 10/24/2013 3:38 PM, Richard Henderson wrote:

On 10/24/2013 09:25 AM, Tom Musta wrote:

\
+        ft0 = tp##_to_##btp(xa.fld[i], &env->fp_status);              \
+        ft1 = tp##_to_##btp(m->fld[i], &env->fp_status);              \
+        ft0 = btp##_mul(ft0, ft1, &env->fp_status);                   \
+        if (unlikely(btp##_is_infinity(ft0) &&                        \
+                     tp##_is_infinity(s->fld[i]) &&                   \
+                     btp##_is_neg(ft0) cmp tp##_is_neg(s->fld[i]))) { \
+            xt.fld[i] = float64_to_##tp(                              \
+                          fload_invalid_op_excp(env,                  \
+                             POWERPC_EXCP_FP_VXISI,                   \
+                             sfprf),                                  \
+                          &env->fp_status);                           \
+        } else {                                                      \
+            ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);          \
+            ft0 = btp##_##sum(ft0, ft1, &env->fp_status);             \
+            xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);          \
+        }                                                             \
+        if (neg && likely(!tp##_is_any_nan(xt.fld[i]))) {             \
+            xt.fld[i] = tp##_chs(xt.fld[i]);                          \
+        }


You want to be using tp##muladd instead of widening to 128 bits.


I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is the
boundary case where the intermediate product and the summand are infinities of
the opposite sign.  This is the case handled by the first if in the code
snippet above.  PowerPC has a dedicated FPSCR bit for this type of condition
(VXISI) as well as a general invalid operation bit (VX).  As far as I can tell,
the softfloat code only has the equivalent of the VX bit.  Thus the
implementation that I proposed is a more accurate representation of the Power ISA.

The VSX code was modeled after the existing fmadd FPU instruction.  I suspect
the author of that code wrote it this way for similar reasons.

I am inclined to keep my proposed implementation, which is consistent with
the existing PowerPC code.

Thoughts?

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Richard Henderson

On 10/25/2013 09:25 AM, Tom Musta wrote:
 
 I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is
 the boundary case where the intermediate product and the summand are
 infinities of the opposite sign.  This is the case handled by the first if in
 the code snippet above.  PowerPC has a dedicated FPSCR bit for this type of
 condition (VXISI) as well as a general invalid operation bit (VX).  As far as
 I can tell, the softfloat code only has the equivalent of the VX bit.  Thus
 the implementation that I proposed is a more accurate representation of the
 Power ISA.
 
 The VSX code was modeled after the existing fmadd FPU instruction.  I suspect
 the author of that code wrote it this way for similar reasons.
 
 I am inclined to keep my proposed implementation, which is consistent with
 the existing PowerPC code.
 
 Thoughts?

Hmm.  I won't object to your current implementation, since it does produce
correct results.

I believe that a better implementation could use float*_muladd, and check the
result for float_flag_invalid.  If set, compute the intermediate product so you
can figure out the VXISI setting.  But we'd expect that to be an unlikely path.


r~

Re: [Qemu-devel] [PATCH 1/2] target-arm: sort TCG cpreg list by 64bit id version

2013-10-25 Thread Peter Maydell

On 11 October 2013 18:38, Alvise Rigo a.r...@virtualopensystems.com wrote:
 Both KVM and TCG populate the cpreg_list with 64 bit registers IDs, but in 
 the TCG side the cpreg_list is sorted using the 32 bit id version while in 
 the kvm side the 64 bit id version is used.
 This patch makes the sorting of the cpreg_list consistent between KVM and TCG.

 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com

Thanks, applied this and 2/2 to target-arm.next.
A couple of formatting notes for next time:
 * please use checkpatch.pl to check you haven't got
   coding style violations (both these patches had bad indent
   and missing braces)
 * please wrap your commit messages rather than having them
   be one very long line
 * if you're submitting a patchset with more than one patch
   in it please include a cover letter email (this set doesn't
   seem to have one)

I've fixed these issues up in my queue this time round since
I wanted to get the patches out in a pullreq this week, but
usually I'd just bounce a patch back for that sort of error.

(If you haven't read http://qemu-project.org/Contribute/SubmitAPatch
I'd recommend it; it tries to list various minor formatting
and process issues that can trip up first-time submitters.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Tom Musta


On 10/25/2013 11:42 AM, Richard Henderson wrote:

I believe that a better implementation could use float*_muladd, and check the
result for float_flag_invalid.  If set, compute the intermediate product so you
can figure out the VXISI setting.  But we'd expect that to be an unlikely path.


Interesting thought.  I think I see a way to re-arrange the code.  Thanks, 
Richard.

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Peter Maydell

On 25 October 2013 17:25, Tom Musta tommu...@gmail.com wrote:
 On 10/24/2013 3:38 PM, Richard Henderson wrote:
 You want to be using tp##muladd instead of widening to 128 bits.


 I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is
 the boundary case where the intermediate product and the summand are
 infinities of the opposite sign.  This is the case handled by the first if in
 the code snippet above.  PowerPC has a dedicated FPSCR bit for this type of
 condition (VXISI) as well as a general invalid operation bit (VX).  As far as
 I can tell, the softfloat code only has the equivalent of the VX bit.  Thus
 the implementation that I proposed is a more accurate representation of the
 Power ISA.

You could add the flag to the softfloat code -- this is what I did
for the somewhat ARM specific float_flag_output_denormal.

 The VSX code was modeled after the existing fmadd FPU instruction.  I
 suspect the author of that code wrote it this way for similar reasons.

I suspect it just predates the provision of fused multiply-add at
the softfloat level. It should ideally be rewritten to use the
softfloat functions.

Are you sure that doing the arithmetic with the softfloat 128 bit
float operations doesn't set the inexact flag anywhere it
shouldn't? (ie where the intermediate product is not exact in
128 bit format but the final result is exact in 64 or 32 bits).

-- PMM

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Richard Henderson

On 10/25/2013 10:13 AM, Tom Musta wrote:
 On 10/25/2013 11:42 AM, Richard Henderson wrote:
 I believe that a better implementation could use float*_muladd, and check the
 result for float_flag_invalid.  If set, compute the intermediate product so
 you can figure out the VXISI setting.  But we'd expect that to be an unlikely
 path.
 
 Interesting thought.  I think I see a way to re-arrange the code.  Thanks,
 Richard.

Actually, you don't even have to compute the intermediate product.

The only way you can have VXISI for a*b+c is for

  isinf(c) && (isinf(a) || isinf(b))

since the intermediate product a*b is infinite precision, and thus cannot
overflow to inf unless one of the multiplicands is already inf.


r~
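
A minimal sketch of how that observation could be turned into a standalone predicate, written on host doubles with `<math.h>` rather than on softfloat types. The function name `is_vxisi` is hypothetical. Beyond the necessary condition quoted above, the sketch also assumes the product and the addend must have opposite signs, and it excludes NaN operands and the inf * 0 case (which Power reports through a different invalid-operation bit, VXIMZ).

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: would a*b+c be an "infinity minus infinity"
 * invalid operation?  Decided from the operands alone, without
 * computing the intermediate product. */
static bool is_vxisi(double a, double b, double c)
{
    /* Necessary condition: the addend and at least one multiplicand
     * are infinite (finite * finite cannot overflow inside an fma). */
    if (!isinf(c) || !(isinf(a) || isinf(b))) {
        return false;
    }
    /* NaN operands and inf * 0 are other kinds of invalid operation. */
    if (isnan(a) || isnan(b) || a == 0.0 || b == 0.0) {
        return false;
    }
    /* The product is an infinity whose sign is the XOR of the operand
     * signs; inf + (-inf) needs the addend to have the opposite sign. */
    bool prod_neg = (signbit(a) != 0) != (signbit(b) != 0);
    return prod_neg != (signbit(c) != 0);
}
```

A helper would then only need to call such a check on the unlikely path where the fused multiply-add has already raised the invalid flag.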

Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions

2013-10-25 Thread Richard Henderson

On 10/25/2013 10:20 AM, Peter Maydell wrote:
 Are you sure that doing the arithmetic with the softfloat 128 bit
 float operations doesn't set the inexact flag anywhere it
 shouldn't? (ie where the intermediate product is not exact in
 128 bit format but the final result is exact in 64 or 32 bits).

The 128 bit multiply cannot give an inexact, and I believe that if the 128 bit
addition gives inexact then the 64-bit fma result would also have inexact.


r~

Re: [Qemu-devel] [PATCH v4] integrator: fix Linux boot failure by emulating dbg region

2013-10-25 Thread Peter Maydell

On 22 October 2013 15:16,  alex.ben...@linaro.org wrote:
 +typedef struct {
 +SysBusDevice parent_obj;
 +
 +MemoryRegion iomem;
 +
 +uint32_t alpha;
 +uint32_t leds;
 +uint32_t switches;
 +} IntegratorDebugState

You forgot to remove these unused fields. I've done so
and added that slightly-edited patch to target-arm.next.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 1/1] sd: pl181: fix fifo count read support

2013-10-25 Thread Peter Maydell

On 25 October 2013 12:04, Jean-Christophe PLAGNIOL-VILLARD
plagn...@jcrosoft.com wrote:
 On 11:33 Sat 19 Oct , Jean-Christophe PLAGNIOL-VILLARD wrote:
  as it depends on the current direction

 Any chance to get that applied?

 Barebox relies on it so it can work on both qemu and real hw

I can't see anything obvious in the PL181 data sheet that
says this register should change behaviour like this based
on the direction of transfer, so I'm afraid I can't accept
this patch without a much more detailed analysis of why
it is correct. (Just as a for-starters, how does this change
relate to the comment immediately above that mentions vagueness
in the documentation and claims we don't need to emulate things
to an exact level of detail? Is this change supposed to fix
that? Does the comment need to change? Which bit of the
PL181 documentation describes the behaviour the patch is
affecting? etc)

I'd also appreciate it if you could read
http://wiki.qemu.org/Contribute/SubmitAPatch
In particular, your patch has some obvious coding
style errors.

thanks
-- PMM

Re: [Qemu-devel] [sheepdog] [PATCH v2 0/2] sheepdog: make use of copy_policy

2013-10-25 Thread MORITA Kazutaka

At Wed, 23 Oct 2013 16:51:50 +0800,
Liu Yuan wrote:
 
 v2:
  - merge the reserved bits
 
 This patch set makes use of copy_policy in struct SheepdogInode in order to
 support recently introduced erasure coding volume in sheepdog.
 
 Thanks
 Yuan
 
 Liu Yuan (2):
   sheepdog: explicitly set copies as type uint8_t
   sheepdog: pass copy_policy in the request
 
  block/sheepdog.c |   30 +++---
  1 file changed, 19 insertions(+), 11 deletions(-)

Acked-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

[Qemu-devel] [PULL 5/6] target-arm: fix sorting issue of KVM cpreg list

2013-10-25 Thread Peter Maydell

From: Alvise Rigo a.r...@virtualopensystems.com

The compare_u64 function was not sorting the KVM cpreg_list in the
right way due to the wrong returned value.  Since we are comparing
two 64bit values we can't simply return their difference if the
returned type is int.

Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
Message-id: 1381513125-26802-2-git-send-email-a.r...@virtualopensystems.com
[PMM: fixed coding style, indent and commit message formatting]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/kvm.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index b92e00d..6e5cd36 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -67,7 +67,13 @@ static bool reg_syncs_via_tuple_list(uint64_t regidx)
 
 static int compare_u64(const void *a, const void *b)
 {
-return *(uint64_t *)a - *(uint64_t *)b;
+    if (*(uint64_t *)a > *(uint64_t *)b) {
+        return 1;
+    }
+    if (*(uint64_t *)a < *(uint64_t *)b) {
+        return -1;
+    }
+    return 0;
 }
 
 int kvm_arch_init_vcpu(CPUState *cs)
-- 
1.7.9.5
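
To illustrate why returning the difference goes wrong: for a = 0x100000000 and b = 0 the 64-bit difference is 0x100000000, and converting that to `int` typically discards the upper bits and yields 0, so the two values look equal to qsort(). The sketch below is a self-contained comparator in the same spirit as the patch; it uses the `(x > y) - (x < y)` idiom instead of the patch's explicit if chain, and `sort_u64` is a hypothetical wrapper added only for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two uint64_t values that never relies on
 * narrowing a 64-bit difference to int. */
static int compare_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    /* evaluates to 1, 0 or -1 */
    return (x > y) - (x < y);
}

/* Hypothetical convenience wrapper: sort an array of 64-bit IDs. */
static void sort_u64(uint64_t *v, size_t n)
{
    qsort(v, n, sizeof(v[0]), compare_u64);
}
```

IDs that differ only in their upper 32 bits, as 64-bit register IDs can, are then ordered correctly.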

[Qemu-devel] [PULL 0/6] target-arm queue

2013-10-25 Thread Peter Maydell

The following changes since commit fc8ead74674b7129e8f31c2595c76658e5622197:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-10-18 10:03:24 -0700)

are available in the git repository at:


  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20131025

for you to fetch changes up to 71c903cc3b78fc563122fe40c5cadd050068b91a:

  integrator: fix Linux boot failure by emulating dbg region (2013-10-25 18:27:07 +0100)


target-arm queue: a couple of trivial features to improve support
for some guest emulation cases, notably running UEFI images:
 * support VBAR (vector base address register)
 * allow running without specifying a kernel (ie just running
   an image from flash)
Plus some bugfixes.


Alex Bennée (1):
  integrator: fix Linux boot failure by emulating dbg region

Alvise Rigo (2):
  target-arm: sort TCG cpreg list by KVM-style 64 bit ID number
  target-arm: fix sorting issue of KVM cpreg list

Nathan Rossi (1):
  target-arm: Add CP15 VBAR support

Peter Maydell (2):
  hw/arm/boot: Make user not specifying a kernel not an error
  hw/arm: Tidy up conditional calls to arm_load_kernel

 default-configs/arm-softmmu.mak|1 +
 hw/arm/boot.c  |6 +-
 hw/arm/integratorcp.c  |2 +
 hw/arm/omap_sx1.c  |   10 ++--
 hw/arm/palm.c  |   10 ++--
 hw/arm/z2.c|   12 ++--
 hw/misc/Makefile.objs  |1 +
 hw/misc/arm_integrator_debug.c |   99 
 include/hw/misc/arm_integrator_debug.h |   18 ++
 target-arm/cpu.h   |1 +
 target-arm/helper.c|   33 ++-
 target-arm/kvm.c   |8 ++-
 12 files changed, 176 insertions(+), 25 deletions(-)
 create mode 100644 hw/misc/arm_integrator_debug.c
 create mode 100644 include/hw/misc/arm_integrator_debug.h

[Qemu-devel] [PULL 2/6] hw/arm: Tidy up conditional calls to arm_load_kernel

2013-10-25 Thread Peter Maydell

Now that arm_load_kernel doesn't insist on a kernel filename
being present, we can remove some unnecessary conditionals
in board models.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Message-id: 1379980897-21277-3-git-send-email-peter.mayd...@linaro.org
---
 hw/arm/omap_sx1.c |   10 --
 hw/arm/palm.c |   10 --
 hw/arm/z2.c   |   12 +---
 3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/hw/arm/omap_sx1.c b/hw/arm/omap_sx1.c
index b0f8664..03b3816 100644
--- a/hw/arm/omap_sx1.c
+++ b/hw/arm/omap_sx1.c
@@ -194,12 +194,10 @@ static void sx1_init(QEMUMachineInitArgs *args, const int version)
     }
 
     /* Load the kernel.  */
-    if (args->kernel_filename) {
-        sx1_binfo.kernel_filename = args->kernel_filename;
-        sx1_binfo.kernel_cmdline = args->kernel_cmdline;
-        sx1_binfo.initrd_filename = args->initrd_filename;
-        arm_load_kernel(mpu->cpu, &sx1_binfo);
-    }
+    sx1_binfo.kernel_filename = args->kernel_filename;
+    sx1_binfo.kernel_cmdline = args->kernel_cmdline;
+    sx1_binfo.initrd_filename = args->initrd_filename;
+    arm_load_kernel(mpu->cpu, &sx1_binfo);
 
     /* TODO: fix next line */
     //~ qemu_console_resize(ds, 640, 480);
diff --git a/hw/arm/palm.c b/hw/arm/palm.c
index 3e39044..0b72bbe 100644
--- a/hw/arm/palm.c
+++ b/hw/arm/palm.c
@@ -261,12 +261,10 @@ static void palmte_init(QEMUMachineInitArgs *args)
     }
 
     /* Load the kernel.  */
-    if (kernel_filename) {
-        palmte_binfo.kernel_filename = kernel_filename;
-        palmte_binfo.kernel_cmdline = kernel_cmdline;
-        palmte_binfo.initrd_filename = initrd_filename;
-        arm_load_kernel(mpu->cpu, &palmte_binfo);
-    }
+    palmte_binfo.kernel_filename = kernel_filename;
+    palmte_binfo.kernel_cmdline = kernel_cmdline;
+    palmte_binfo.initrd_filename = initrd_filename;
+    arm_load_kernel(mpu->cpu, &palmte_binfo);
 }
 
 static QEMUMachine palmte_machine = {
diff --git a/hw/arm/z2.c b/hw/arm/z2.c
index 2e0d5d4..a00fcc0 100644
--- a/hw/arm/z2.c
+++ b/hw/arm/z2.c
@@ -360,13 +360,11 @@ static void z2_init(QEMUMachineInitArgs *args)
     qdev_connect_gpio_out(mpu->gpio, Z2_GPIO_LCD_CS,
         qemu_allocate_irqs(z2_lcd_cs, z2_lcd, 1)[0]);
 
-    if (kernel_filename) {
-        z2_binfo.kernel_filename = kernel_filename;
-        z2_binfo.kernel_cmdline = kernel_cmdline;
-        z2_binfo.initrd_filename = initrd_filename;
-        z2_binfo.board_id = 0x6dd;
-        arm_load_kernel(mpu->cpu, &z2_binfo);
-    }
+    z2_binfo.kernel_filename = kernel_filename;
+    z2_binfo.kernel_cmdline = kernel_cmdline;
+    z2_binfo.initrd_filename = initrd_filename;
+    z2_binfo.board_id = 0x6dd;
+    arm_load_kernel(mpu->cpu, &z2_binfo);
 }
 
 static QEMUMachine z2_machine = {
-- 
1.7.9.5

[Qemu-devel] [PULL 6/6] integrator: fix Linux boot failure by emulating dbg region

2013-10-25 Thread Peter Maydell

From: Alex Bennée a...@bennee.com

Commit 9b8c69243 (since reverted) broke the ability to boot the kernel
as the value returned by unassigned_mem_read returned non-zero and left
the kernel looping forever waiting for it to change (see
integrator_led_set in the kernel code).

Relying on a varying implementation detail is incorrect anyway so this
introduces a basic stub of a memory region for the debug/LED section
on the integrator board.

Signed-off-by: Alex Bennée a...@bennee.com
Message-id: 1382451366-9539-1-git-send-email-alex.ben...@linaro.org
[PMM: removed three unused fields from struct IntegratorDebugState]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 default-configs/arm-softmmu.mak|1 +
 hw/arm/integratorcp.c  |2 +
 hw/misc/Makefile.objs  |1 +
 hw/misc/arm_integrator_debug.c |   99 
 include/hw/misc/arm_integrator_debug.h |   18 ++
 5 files changed, 121 insertions(+)
 create mode 100644 hw/misc/arm_integrator_debug.c
 create mode 100644 include/hw/misc/arm_integrator_debug.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index d13bc2b..7e69137 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -79,3 +79,4 @@ CONFIG_VERSATILE_PCI=y
 CONFIG_VERSATILE_I2C=y
 
 CONFIG_SDHCI=y
+CONFIG_INTEGRATOR_DEBUG=y
diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
index 2ef93ed..c44b2a4 100644
--- a/hw/arm/integratorcp.c
+++ b/hw/arm/integratorcp.c
@@ -11,6 +11,7 @@
 #include "hw/devices.h"
 #include "hw/boards.h"
 #include "hw/arm/arm.h"
+#include "hw/misc/arm_integrator_debug.h"
 #include "net/net.h"
 #include "exec/address-spaces.h"
 #include "sysemu/sysemu.h"
@@ -508,6 +509,7 @@ static void integratorcp_init(QEMUMachineInitArgs *args)
     icp_control_init(0xcb000000);
     sysbus_create_simple("pl050_keyboard", 0x18000000, pic[3]);
     sysbus_create_simple("pl050_mouse", 0x19000000, pic[4]);
+    sysbus_create_simple(TYPE_INTEGRATOR_DEBUG, 0x1a000000, 0);
     sysbus_create_varargs("pl181", 0x1c000000, pic[23], pic[24], NULL);
     if (nd_table[0].used)
         smc91c111_init(&nd_table[0], 0xc8000000, pic[27]);
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 2578e29..cca5c05 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -10,6 +10,7 @@ obj-$(CONFIG_VMPORT) += vmport.o
 
 # ARM devices
 common-obj-$(CONFIG_PL310) += arm_l2x0.o
+common-obj-$(CONFIG_INTEGRATOR_DEBUG) += arm_integrator_debug.o
 
 # PKUnity SoC devices
 common-obj-$(CONFIG_PUV3) += puv3_pm.o
diff --git a/hw/misc/arm_integrator_debug.c b/hw/misc/arm_integrator_debug.c
new file mode 100644
index 000..99b720f
--- /dev/null
+++ b/hw/misc/arm_integrator_debug.c
@@ -0,0 +1,99 @@
+/*
+ * LED, Switch and Debug control registers for ARM Integrator Boards
+ *
+ * This is currently a stub for this functionality but at least
+ * ensures something other than unassigned_mem_read() handles access
+ * to this area.
+ *
+ * The real h/w is described at:
+ *  http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0159b/Babbfijf.html
+ *
+ * Copyright (c) 2013 Alex Bennée a...@bennee.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "hw/hw.h"
+#include "hw/sysbus.h"
+#include "exec/address-spaces.h"
+#include "hw/misc/arm_integrator_debug.h"
+
+#define INTEGRATOR_DEBUG(obj) \
+    OBJECT_CHECK(IntegratorDebugState, (obj), TYPE_INTEGRATOR_DEBUG)
+
+typedef struct {
+    SysBusDevice parent_obj;
+
+    MemoryRegion iomem;
+} IntegratorDebugState;
+
+static uint64_t intdbg_control_read(void *opaque, hwaddr offset,
+                                    unsigned size)
+{
+    switch (offset >> 2) {
+    case 0: /* ALPHA */
+    case 1: /* LEDS */
+    case 2: /* SWITCHES */
+        qemu_log_mask(LOG_UNIMP,
+                      "%s: returning zero from %" HWADDR_PRIx ":%u\n",
+                      __func__, offset, size);
+        return 0;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Bad offset %" HWADDR_PRIx,
+                      __func__, offset);
+        return 0;
+    }
+}
+
+static void intdbg_control_write(void *opaque, hwaddr offset,
+                                 uint64_t value, unsigned size)
+{
+    switch (offset >> 2) {
+    case 1: /* ALPHA */
+    case 2: /* LEDS */
+    case 3: /* SWITCHES */
+        /* Nothing interesting implemented yet.  */
+        qemu_log_mask(LOG_UNIMP,
+                      "%s: ignoring write of %" PRIu64
+                      " to %" HWADDR_PRIx ":%u\n",
+                      __func__, value, offset, size);
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: write of %" PRIu64
+                      " to bad offset %" HWADDR_PRIx "\n",
+                      __func__, value, offset);
+    }
+}
+
+static const MemoryRegionOps intdbg_control_ops = {
+
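
The decoding idea used by the stub read handler above can be illustrated outside QEMU. This is only a sketch under assumptions: stub_read and the REG_* names are hypothetical, and the word indices simply mirror the case labels of the read handler in the patch. It shows the "byte offset >> 2" register decode and the split between known-but-unimplemented registers and bad offsets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word indices as used by the read handler in the patch (assumed here). */
enum {
    REG_ALPHA = 0,
    REG_LEDS = 1,
    REG_SWITCHES = 2
};

/* Hypothetical stub: always reads as zero, and reports through *known
 * whether the offset decoded to one of the registers listed above. */
uint64_t stub_read(uint64_t offset, bool *known)
{
    /* Registers are 32 bits wide, so byte offset / 4 is the register index. */
    switch (offset >> 2) {
    case REG_ALPHA:
    case REG_LEDS:
    case REG_SWITCHES:
        *known = true;   /* unimplemented, but a legitimate access */
        return 0;
    default:
        *known = false;  /* guest touched an offset with no register */
        return 0;
    }
}
```

Returning a stable zero in both cases is what keeps a polling guest from spinning on whatever an unassigned-memory read happens to return.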

[Qemu-devel] [PULL 4/6] target-arm: sort TCG cpreg list by KVM-style 64 bit ID number

2013-10-25 Thread Peter Maydell

From: Alvise Rigo a.r...@virtualopensystems.com

Both KVM and TCG populate the cpreg_list with 64 bit register IDs,
but in the TCG side the cpreg_list is sorted using the 32 bit ID
version while in the kvm side the 64 bit ID version is used.  This
patch makes the sorting of the cpreg_list consistent between KVM and
TCG.

Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
Message-id: 1381513125-26802-1-git-send-email-a.r...@virtualopensystems.com
[PMM: fixed indent, coding style and commit message formatting]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helper.c |   12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 73476ed..3445813 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -225,10 +225,16 @@ static void count_cpreg(gpointer key, gpointer opaque)
 
 static gint cpreg_key_compare(gconstpointer a, gconstpointer b)
 {
-    uint32_t aidx = *(uint32_t *)a;
-    uint32_t bidx = *(uint32_t *)b;
+    uint64_t aidx = cpreg_to_kvm_id(*(uint32_t *)a);
+    uint64_t bidx = cpreg_to_kvm_id(*(uint32_t *)b);
 
-    return aidx - bidx;
+    if (aidx > bidx) {
+        return 1;
+    }
+    if (aidx < bidx) {
+        return -1;
+    }
+    return 0;
 }
 
 static void cpreg_make_keylist(gpointer key, gpointer value, gpointer udata)
-- 
1.7.9.5
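
To see why sorting by the 32-bit key and sorting by the 64-bit ID can give different orders, here is a small sketch. The key layout and the mapping key_to_wide_id are invented for illustration (they are not QEMU's cpreg_to_kvm_id): a flag bit in the 32-bit key selects a larger "size" field in the upper bits of the 64-bit ID, so a numerically small key can end up with a large ID.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical encoding, for illustration only. */
#define KEY_IS_WIDE  (1u << 15)      /* flag bit inside the 32-bit key */
#define ID_SIZE_U32  (2ULL << 52)    /* size field for 32-bit registers */
#define ID_SIZE_U64  (3ULL << 52)    /* size field for 64-bit registers */

/* Map a 32-bit key to its 64-bit ID. */
uint64_t key_to_wide_id(uint32_t key)
{
    uint64_t id = key & ~KEY_IS_WIDE;

    id |= (key & KEY_IS_WIDE) ? ID_SIZE_U64 : ID_SIZE_U32;
    return id;
}

/* Comparator over 32-bit keys that orders them by their 64-bit ID. */
int key_compare_by_wide_id(const void *a, const void *b)
{
    uint64_t x = key_to_wide_id(*(const uint32_t *)a);
    uint64_t y = key_to_wide_id(*(const uint32_t *)b);

    if (x > y) {
        return 1;
    }
    if (x < y) {
        return -1;
    }
    return 0;
}

/* Sort 32-bit keys so the result matches a list sorted by 64-bit ID. */
void sort_keys(uint32_t *keys, size_t n)
{
    qsort(keys, n, sizeof(keys[0]), key_compare_by_wide_id);
}
```

In this made-up encoding the keys 0x8001 and 0x10000 swap places depending on which form is compared, which is the kind of mismatch the patch removes by using one ordering on both sides.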

[Qemu-devel] [PULL 1/6] hw/arm/boot: Make user not specifying a kernel not an error

2013-10-25 Thread Peter Maydell

Typically ARM boards will have some kind of flash which might contain
a boot ROM; it's therefore a valid use case to provide only an
image for the boot ROM and not require QEMU's internal boot loader
at all. Remove the fatal error if -kernel isn't specified.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Message-id: 1379980897-21277-2-git-send-email-peter.mayd...@linaro.org
---
 hw/arm/boot.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 1e313af..583ec79 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -354,8 +354,10 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
 
     /* Load the kernel.  */
     if (!info->kernel_filename) {
-        fprintf(stderr, "Kernel image must be specified\n");
-        exit(1);
+        /* If no kernel specified, do nothing; we will start from address 0
+         * (typically a boot ROM image) in the same way as hardware.
+         */
+        return;
     }
 
     info->dtb_filename = qemu_opt_get(qemu_get_machine_opts(), "dtb");
-- 
1.7.9.5

[Qemu-devel] [PULL 3/6] target-arm: Add CP15 VBAR support

2013-10-25 Thread Peter Maydell

From: Nathan Rossi nathan.ro...@xilinx.com

Added Vector Base Address remapping on ARM v7.

Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com
Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
[PMM: removed spurious mask of value with 1<<31]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/cpu.h|1 +
 target-arm/helper.c |   21 +
 2 files changed, 22 insertions(+)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 2c56740..9f110f1 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -176,6 +176,7 @@ typedef struct CPUARMState {
 uint32_t c9_pmxevtyper; /* perf monitor event type */
 uint32_t c9_pmuserenr; /* perf monitor user enable */
 uint32_t c9_pminten; /* perf monitor interrupt enables */
+uint32_t c12_vbar; /* vector base address register */
 uint32_t c13_fcse; /* FCSE PID.  */
 uint32_t c13_context; /* Context ID.  */
 uint32_t c13_tls1; /* User RW Thread register.  */
diff --git a/target-arm/helper.c b/target-arm/helper.c
index c63bbd7..73476ed 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -537,6 +537,13 @@ static int pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     return 0;
 }
 
+static int vbar_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                      uint64_t value)
+{
+    env->cp15.c12_vbar = value & ~0x1Ful;
+    return 0;
+}
+
 static int ccsidr_read(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t *value)
 {
@@ -622,6 +629,10 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_NO_MIGRATE,
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pminten),
       .resetvalue = 0, .writefn = pmintenclr_write, },
+    { .name = "VBAR", .cp = 15, .crn = 12, .crm = 0, .opc1 = 0, .opc2 = 0,
+      .access = PL1_RW, .writefn = vbar_write,
+      .fieldoffset = offsetof(CPUARMState, cp15.c12_vbar),
+      .resetvalue = 0 },
     { .name = "SCR", .cp = 15, .crn = 1, .crm = 1, .opc1 = 0, .opc2 = 0,
       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.c1_scr),
       .resetvalue = 0, },
@@ -2470,7 +2481,17 @@ void arm_cpu_do_interrupt(CPUState *cs)
     }
     /* High vectors.  */
     if (env->cp15.c1_sys & (1 << 13)) {
+        /* when enabled, base address cannot be remapped.  */
         addr += 0xffff0000;
+    } else {
+        /* ARM v7 architectures provide a vector base address register to remap
+         * the interrupt vector table.
+         * This register is only followed in non-monitor mode, and has a secure
+         * and un-secure copy. Since the cpu is always in a un-secure operation
+         * and is never in monitor mode this feature is always active.
+         * Note: only bits 31:5 are valid.
+         */
+        addr += env->cp15.c12_vbar;
     }
     switch_mode (env, new_mode);
     env->spsr = cpsr_read(env);
-- 
1.7.9.5

Re: [Qemu-devel] [patch 2/2] i386: pc: align gpa-hpa on 1GB boundary

2013-10-25 Thread Marcelo Tosatti

On Fri, Oct 25, 2013 at 09:52:34AM +0100, Paolo Bonzini wrote:
> Because offsets are zero, and lengths match the RAM block lengths, you
> do not need any complication with aliasing.  This still has to be done
> only for new machine types.

Not possible because you just wasted holesize bytes (if number of
additional bytes due to huge page alignment is smaller than holesize, a
new hugepage is required, which is not acceptable).

Is there a tree where the new machine types can live until 1.8 opens up?

Can you pick up the MAP_POPULATE patch?

[Qemu-devel] [PULL 00/29] ppc patch queue 2013-10-25

2013-10-25 Thread Alexander Graf

Hi Blue / Aurelien / Anthony,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit fc8ead74674b7129e8f31c2595c76658e5622197:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-10-18 10:03:24 -0700)

are available in the git repository at:


  git://github.com/agraf/qemu.git ppc-for-upstream

for you to fetch changes up to 3bbf37f2692652cc9d48030a9e7f34e2207429f6:

  spapr: Use DeviceClass::fw_name for device tree CPU node (2013-10-25 23:25:48 +0200)


Alexander Graf (1):
  PPC: Fix L2CR write accesses

Alexey Kardashevskiy (14):
  pseries: Update SLOF firmware image
  spapr: increase temporary fdt buffer size
  spapr: Add ibm, purr property on power7 and newer
  spapr-rtas: fix h_rtas parameters reading
  xics: move reset and cpu_setup
  spapr: move cpu_setup after kvmppc_set_papr
  xics: replace fprintf with error_report
  xics: add pre_save/post_load dispatchers
  xics: convert init() to realize()
  xics: add missing const specifiers to TypeInfo
  xics: split to xics and xics-common
  xics: add cpu_setup callback
  xics-kvm: enable irqfd for MSI
  spapr-pci: enable irqfd for INTx

Andreas Färber (2):
  target-ppc: Fill in OpenFirmware names for some PowerPCCPU families
  spapr: Use DeviceClass::fw_name for device tree CPU node

Aneesh Kumar K.V (5):
  target-ppc: Update slb array with correct index values.
  target-ppc: Check for error on address translation in memsave command
  target-ppc: Use #define for max slb entries
  dump-guest-memory: Check for the correct return value
  target-ppc: dump-guest-memory support

Benjamin Herrenschmidt (3):
  pseries: Fix loading of little endian kernels
  xics: Implement H_IPOLL
  xics: Implement H_XIRR_X

David Gibson (2):
  target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
  xics-kvm: Support for in-kernel XICS interrupt controller

Tom Musta (2):
  ppc: Add CFAR, DAR and DSISR to the dictionary of printable registers
  target-ppc: Little Endian Correction to Load/Store Vector Element

 cpus.c|   5 +-
 default-configs/ppc64-softmmu.mak |   1 +
 dump.c|   4 +-
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics.c| 327 -
 hw/intc/xics_kvm.c| 494 ++
 hw/ppc/spapr.c|  72 --
 hw/ppc/spapr_hcall.c  |   6 +-
 hw/ppc/spapr_pci.c|  13 +
 include/elf.h |   3 +
 include/hw/ppc/spapr.h|  11 +-
 include/hw/ppc/xics.h |  57 +
 monitor.c |   3 +
 pc-bios/README|   2 +-
 pc-bios/slof.bin  | Bin 909720 - 875424 bytes
 roms/SLOF |   2 +-
 target-ppc/Makefile.objs  |   2 +-
 target-ppc/arch_dump.c| 253 +++
 target-ppc/cpu-qom.h  |   5 +-
 target-ppc/cpu.h  |   3 +-
 target-ppc/kvm.c  |  35 ++-
 target-ppc/kvm_ppc.h  |   7 +
 target-ppc/machine.c  |   2 +-
 target-ppc/mem_helper.c   |   2 +
 target-ppc/translate_init.c   |  38 ++-
 25 files changed, 1235 insertions(+), 113 deletions(-)
 create mode 100644 hw/intc/xics_kvm.c
 create mode 100644 target-ppc/arch_dump.c

[Qemu-devel] [PULL 17/29] xics: add cpu_setup callback

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This adds a cpu_setup callback to the XICS device class (as XICS-KVM
will do it differently); xics_cpu_setup() will call it if it is set.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 5 +
 include/hw/ppc/xics.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5ed2618..1c6e6f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -37,9 +37,14 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     ICPState *ss = &icp->ss[cs->cpu_index];
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
 
     assert(cs->cpu_index < icp->nr_servers);
 
+    if (info->cpu_setup) {
+        info->cpu_setup(icp, cpu);
+    }
+
     switch (PPC_INPUT(env)) {
     case PPC_FLAGS_INPUT_POWER7:
         ss->output = env->irq_inputs[POWER7_INPUT_INT];
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 7e702a0..343bba8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -64,6 +64,7 @@ typedef struct ICSIRQState ICSIRQState;
 struct XICSStateClass {
 DeviceClass parent_class;
 
+void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
 void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
 void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
-- 
1.8.1.4

[Qemu-devel] [PULL 02/29] pseries: Fix loading of little endian kernels

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

Try loading the kernel as little endian if it fails big endian.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..5bf6c3b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -273,6 +273,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
hwaddr initrd_base,
hwaddr initrd_size,
hwaddr kernel_size,
+   bool little_endian,
const char *boot_device,
const char *kernel_cmdline,
uint32_t epow_irq)
@@ -326,6 +327,9 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
                               cpu_to_be64(kernel_size) };
 
         _FDT((fdt_property(fdt, "qemu,boot-kernel", &kprop, sizeof(kprop))));
+        if (little_endian) {
+            _FDT((fdt_property(fdt, "qemu,boot-kernel-le", NULL, 0)));
+        }
     }
     if (boot_device) {
         _FDT((fdt_property_string(fdt, "qemu,boot-device", boot_device)));
@@ -1102,6 +1106,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 uint32_t initrd_base = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
+bool kernel_le = false;
 char *filename;
 
 msi_supported = true;
@@ -1282,6 +1287,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0) {
+            kernel_size = load_elf(kernel_filename,
+                                   translate_kernel_address, NULL,
+                                   NULL, &lowaddr, NULL, 0, ELF_MACHINE, 0);
+            kernel_le = kernel_size > 0;
+        }
+        if (kernel_size < 0) {
             kernel_size = load_image_targphys(kernel_filename,
                                               KERNEL_LOAD_ADDR,
                                               load_limit - KERNEL_LOAD_ADDR);
@@ -1331,7 +1342,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
     /* Prepare the device tree */
     spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
                                             initrd_base, initrd_size,
-                                            kernel_size,
+                                            kernel_size, kernel_le,
                                             boot_device, kernel_cmdline,
                                             spapr->epow_irq);
     assert(spapr->fdt_skel != NULL);
-- 
1.8.1.4

[Qemu-devel] [PULL 20/29] xics: Implement H_XIRR_X

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

This implements H_XIRR_X hypercall in addition to H_XIRR as
it is mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

As the Partition Adjunct Option is not supported at the moment,
the CPPR parameter of the hypercall is ignored.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 14 ++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eb93276..a05 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -27,6 +27,7 @@
 
 #include "hw/hw.h"
 #include "trace.h"
+#include "qemu/timer.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
@@ -679,6 +680,18 @@ static target_ulong h_xirr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_xirr_x(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                             target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+    uint32_t xirr = icp_accept(ss);
+
+    args[0] = xirr;
+    args[1] = cpu_get_real_ticks();
+    return H_SUCCESS;
+}
+
 static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           target_ulong opcode, target_ulong *args)
 {
@@ -853,6 +866,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
 spapr_register_hypercall(H_CPPR, h_cppr);
 spapr_register_hypercall(H_IPI, h_ipi);
 spapr_register_hypercall(H_XIRR, h_xirr);
+spapr_register_hypercall(H_XIRR_X, h_xirr_x);
 spapr_register_hypercall(H_EOI, h_eoi);
 spapr_register_hypercall(H_IPOLL, h_ipoll);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6407c8a..5ae0b58 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_XIRR_X                0x2FC
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
 
-- 
1.8.1.4

[Qemu-devel] [PULL 25/29] target-ppc: Use #define for max slb entries

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Instead of opencoding 64 use MAX_SLB_ENTRIES. We don't update the kernel
header here.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/cpu.h | 3 ++-
 target-ppc/kvm.c | 4 ++--
 target-ppc/machine.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 422a6bb..26acdba 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -405,6 +405,7 @@ struct ppc_slb_t {
 uint64_t vsid;
 };
 
+#define MAX_SLB_ENTRIES 64
 #define SEGMENT_SHIFT_256M  28
 #define SEGMENT_MASK_256M   (~((1ULL << SEGMENT_SHIFT_256M) - 1))
 
@@ -949,7 +950,7 @@ struct CPUPPCState {
 #if !defined(CONFIG_USER_ONLY)
 #if defined(TARGET_PPC64)
 /* PowerPC 64 SLB area */
-ppc_slb_t slb[64];
+ppc_slb_t slb[MAX_SLB_ENTRIES];
 int32_t slb_nr;
 #endif
 /* segment registers */
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index e2f8b03..b77ce5e 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -818,7 +818,7 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 
         /* Sync SLB */
 #ifdef TARGET_PPC64
-        for (i = 0; i < 64; i++) {
+        for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
             sregs.u.s.ppc64.slb[i].slbe = env->slb[i].esid;
             sregs.u.s.ppc64.slb[i].slbv = env->slb[i].vsid;
         }
@@ -1040,7 +1040,7 @@ int kvm_arch_get_registers(CPUState *cs)
          * back in.
          */
         memset(env->slb, 0, sizeof(env->slb));
-        for (i = 0; i < 64; i++) {
+        for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
             target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
             target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
             /*
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 12e1512..12c174f 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -312,7 +312,7 @@ static const VMStateDescription vmstate_slb = {
 .minimum_version_id_old = 1,
 .fields  = (VMStateField []) {
 VMSTATE_INT32_EQUAL(env.slb_nr, PowerPCCPU),
-VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, 64),
+VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, MAX_SLB_ENTRIES),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.8.1.4

[Qemu-devel] [PULL 03/29] ppc: Add CFAR, DAR and DSISR to the dictionary of printable registers

2013-10-25 Thread Alexander Graf

From: Tom Musta tommu...@gmail.com

The CFAR, DAR and DSISR registers are currently missing from the
dictionary of registers that may be printed in the QEMU console.
These are interesting registers when debugging.  With this patch,
the following commands work properly:

 (qemu) print $cfar
 (qemu) print $dar
 (qemu) print $dsisr

Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 monitor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/monitor.c b/monitor.c
index 74f3f1b..b02b21c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3186,6 +3186,9 @@ static const MonitorDef monitor_defs[] = {
 
     { "srr0", offsetof(CPUPPCState, spr[SPR_SRR0]) },
     { "srr1", offsetof(CPUPPCState, spr[SPR_SRR1]) },
+    { "dar", offsetof(CPUPPCState, spr[SPR_DAR]) },
+    { "dsisr", offsetof(CPUPPCState, spr[SPR_DSISR]) },
+    { "cfar", offsetof(CPUPPCState, spr[SPR_CFAR]) },
     { "sprg0", offsetof(CPUPPCState, spr[SPR_SPRG0]) },
     { "sprg1", offsetof(CPUPPCState, spr[SPR_SPRG1]) },
     { "sprg2", offsetof(CPUPPCState, spr[SPR_SPRG2]) },
-- 
1.8.1.4

[Qemu-devel] [PULL 19/29] xics: Implement H_IPOLL

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

This adds support for the H_IPOLL hypercall which the guest
uses to poll for a pending interrupt. This hypercall is
mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 1c6e6f5..eb93276 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -689,6 +689,18 @@ static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_ipoll(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                            target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+
+    args[0] = ss->xirr;
+    args[1] = ss->mfrr;
+
+    return H_SUCCESS;
+}
+
 static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           uint32_t token,
                           uint32_t nargs, target_ulong args,
@@ -842,6 +854,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_IPI, h_ipi);
     spapr_register_hypercall(H_XIRR, h_xirr);
     spapr_register_hypercall(H_EOI, h_eoi);
+    spapr_register_hypercall(H_IPOLL, h_ipoll);
 
     object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
     if (error) {
-- 
1.8.1.4

[Qemu-devel] [PULL 05/29] PPC: Fix L2CR write accesses

2013-10-25 Thread Alexander Graf

Commit 2345f1c01 was supposed to render L2CR writes into noops. Instead,
it made them illegal instruction traps which apparently didn't confuse
XNU, but can easily confuse other OSs.

Fix it up by actually doing nothing when we write to L2CR.

Reported-by: Julio Guerra gu...@julio.in
Signed-off-by: Alexander Graf ag...@suse.de
Tested-by: Julio Guerra gu...@julio.in
---
 target-ppc/translate_init.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 651da6b..807dab3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -108,6 +108,11 @@ static void spr_write_clear (void *opaque, int sprn, int 
gprn)
 tcg_temp_free(t0);
 tcg_temp_free(t1);
 }
+
+static void spr_access_nop(void *opaque, int sprn, int gprn)
+{
+}
+
 #endif
 
 /* SPR common to all PowerPC */
@@ -1382,7 +1387,7 @@ static void gen_spr_74xx (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Not strictly an SPR */
 vscr_init(env, 0x0001);
@@ -5170,7 +5175,7 @@ static void init_proc_750 (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Time base */
 gen_tbl(env);
@@ -5233,7 +5238,7 @@ static void init_proc_750cl (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Time base */
 gen_tbl(env);
@@ -5419,7 +5424,7 @@ static void init_proc_750cx (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Time base */
 gen_tbl(env);
@@ -5486,7 +5491,7 @@ static void init_proc_750fx (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Time base */
 gen_tbl(env);
@@ -5558,7 +5563,7 @@ static void init_proc_750gx (CPUPPCState *env)
 /* XXX : not implemented (XXX: different from 750fx) */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Time base */
 gen_tbl(env);
@@ -5694,7 +5699,7 @@ static void init_proc_755 (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* XXX : not implemented */
 spr_register(env, SPR_L2PMCR, L2PMCR,
@@ -6650,7 +6655,7 @@ static void init_proc_970 (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Memory management */
 /* XXX: not correct */
@@ -6750,7 +6755,7 @@ static void init_proc_970FX (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Memory management */
 /* XXX: not correct */
@@ -6862,7 +6867,7 @@ static void init_proc_970GX (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Memory management */
 /* XXX: not correct */
@@ -6962,7 +6967,7 @@ static void init_proc_970MP (CPUPPCState *env)
 /* XXX : not implemented */
 spr_register(env, SPR_L2CR, L2CR,
  SPR_NOACCESS, SPR_NOACCESS,
- spr_read_generic, NULL,
+ spr_read_generic, spr_access_nop,
  0x);
 /* Memory management */
 /* XXX: not correct */
@@ -7054,7 +7059,7 @@ static void init_proc_power5plus(CPUPPCState
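
The difference between registering no write handler and registering a do-nothing handler, which is the whole point of this patch, can be modelled in a few lines. This is a sketch with invented names (do_write, write_nop and so on), not the QEMU translator: it assumes a dispatcher that treats a missing handler as an illegal access, as the commit message describes for the old behaviour.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*spr_write_fn)(uint32_t *reg, uint32_t value);

enum write_result {
    WRITE_OK,     /* the access was accepted */
    WRITE_TRAP    /* the access is treated as an illegal instruction */
};

/* Ordinary handler: store the value. */
void write_generic(uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* No-op handler: accept the write but leave the register untouched. */
void write_nop(uint32_t *reg, uint32_t value)
{
    (void)reg;
    (void)value;
}

/* Hypothetical dispatcher: a NULL handler means "no access allowed". */
enum write_result do_write(spr_write_fn fn, uint32_t *reg, uint32_t value)
{
    if (fn == NULL) {
        return WRITE_TRAP;
    }
    fn(reg, value);
    return WRITE_OK;
}
```

With write_nop registered, a guest write succeeds and simply has no effect, which is what "render L2CR writes into noops" asks for; with NULL the same write would trap.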

[Qemu-devel] [PULL 22/29] spapr-pci: enable irqfd for INTx

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This enables IRQFD for LSI (level triggered INTx interrupts) by adding
a spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus. This
callback is called to know the global interrupt number to link resampling fd
with IRQFD's fd in KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr_pci.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b6ee32..edb4cb0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -432,6 +432,17 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
 qemu_set_irq(spapr_phb_lsi_qirq(phb, irq_num), level);
 }
 
+static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
+{
+sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
+PCIINTxRoute route;
+
+route.mode = PCI_INTX_ENABLED;
+route.irq = sphb->lsi_table[pin].irq;
+
+return route;
+}
+
 /*
  * MSI/MSIX memory region implementation.
  * The handler handles both MSI and MSIX.
@@ -610,6 +621,8 @@ static int spapr_phb_init(SysBusDevice *s)
 
 pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
+pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq);
+
 QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
 
 /* Initialize the LSI table */
-- 
1.8.1.4

[Qemu-devel] [PULL 10/29] xics: move reset and cpu_setup

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This simple change makes the following patches nicer.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 72 +-
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index bb018d1..a0d71ef 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,42 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
+{
+CPUState *cs = CPU(cpu);
+CPUPPCState *env = &cpu->env;
+ICPState *ss = &icp->ss[cs->cpu_index];
+
+assert(cs->cpu_index < icp->nr_servers);
+
+switch (PPC_INPUT(env)) {
+case PPC_FLAGS_INPUT_POWER7:
+ss->output = env->irq_inputs[POWER7_INPUT_INT];
+break;
+
+case PPC_FLAGS_INPUT_970:
+ss->output = env->irq_inputs[PPC970_INPUT_INT];
+break;
+
+default:
+fprintf(stderr, "XICS interrupt controller does not support this CPU "
+"bus model\n");
+abort();
+}
+}
+
+static void xics_reset(DeviceState *d)
+{
+XICSState *icp = XICS(d);
+int i;
+
+for (i = 0; i < icp->nr_servers; i++) {
+device_reset(DEVICE(&icp->ss[i]));
+}
+
+device_reset(DEVICE(icp->ics));
+}
+
 /*
  * ICP: Presentation layer
  */
@@ -600,42 +636,6 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr,
  * XICS
  */
 
-static void xics_reset(DeviceState *d)
-{
-XICSState *icp = XICS(d);
-int i;
-
-for (i = 0; i < icp->nr_servers; i++) {
-device_reset(DEVICE(&icp->ss[i]));
-}
-
-device_reset(DEVICE(icp->ics));
-}
-
-void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
-{
-CPUState *cs = CPU(cpu);
-CPUPPCState *env = &cpu->env;
-ICPState *ss = &icp->ss[cs->cpu_index];
-
-assert(cs->cpu_index < icp->nr_servers);
-
-switch (PPC_INPUT(env)) {
-case PPC_FLAGS_INPUT_POWER7:
-ss->output = env->irq_inputs[POWER7_INPUT_INT];
-break;
-
-case PPC_FLAGS_INPUT_970:
-ss->output = env->irq_inputs[PPC970_INPUT_INT];
-break;
-
-default:
-fprintf(stderr, "XICS interrupt controller does not support this CPU "
-"bus model\n");
-abort();
-}
-}
-
 static void xics_realize(DeviceState *dev, Error **errp)
 {
 XICSState *icp = XICS(dev);
-- 
1.8.1.4

[Qemu-devel] [PULL 04/29] target-ppc: Little Endian Correction to Load/Store Vector Element

2013-10-25 Thread Alexander Graf

From: Tom Musta tommu...@gmail.com

The Load Vector Element (lve*x) and Store Vector Element (stve*x)
instructions not only byte-swap in Little Endian mode, they also
invert the element that is accessed. For example, the RTL for
lvehx contains this:

 eb <-- EA[60:63]
 if Big-Endian byte ordering then
 VRT[8*eb:8*eb+15] <-- MEM(EA,2)
 else
 VRT[112-(8*eb):127-(8*eb)] <-- MEM(EA,2)

This patch adds the element inversion, as described in the last line
of the RTL.

Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/mem_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
index d8e63ca..f35ed03 100644
--- a/target-ppc/mem_helper.c
+++ b/target-ppc/mem_helper.c
@@ -212,6 +212,7 @@ target_ulong helper_lscbx(CPUPPCState *env, target_ulong addr, uint32_t reg,
 int index = (addr & 0xf) >> sh; \
 \
 if (msr_le) {   \
+index = n_elems - index - 1;\
 r->element[LO_IDX ? index : (adjust - index)] = \
 swap(access(env, addr));\
 } else {\
@@ -236,6 +237,7 @@ LVE(lvewx, cpu_ldl_data, bswap32, u32)
 int index = (addr & 0xf) >> sh; \
 \
 if (msr_le) {   \
+index = n_elems - index - 1;\
 access(env, addr, swap(r->element[LO_IDX ? index :  \
   (adjust - index)]));  \
 } else {\
-- 
1.8.1.4
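
For readers following the RTL above, here is a minimal standalone sketch of
the element selection. It is not the QEMU macro: vec_element_index() is a
hypothetical helper, and it assumes a 16-byte vector register with elements
of 1, 2 or 4 bytes numbered from the big-endian end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return the element of a
 * 16-byte vector register that an lve*x/stve*x access at effective
 * address 'ea' touches.
 */
static size_t vec_element_index(uint64_t ea, size_t elem_size,
                                int little_endian)
{
    size_t n_elems;
    size_t index;

    assert(elem_size == 1 || elem_size == 2 || elem_size == 4);

    n_elems = 16 / elem_size;
    /* eb = EA[60:63], scaled down to an element number */
    index = (size_t)(ea & 0xf) / elem_size;

    if (little_endian) {
        /* Element inversion from the last line of the RTL */
        index = n_elems - index - 1;
    }
    return index;
}
```

With halfword elements and eb = 2, the big-endian case selects element 1
(bits 16..31) and the little-endian case selects element 6 (bits 96..111),
which matches VRT[112-(8*eb):127-(8*eb)].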

[Qemu-devel] [PULL 14/29] xics: convert init() to realize()

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This fixes XICS according to the new QOM rules.

This converts ICS's init() callbacks to realize().

This converts legacy qdev_init_nofail() to property_set(realized).

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eeb64f5..76654db 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -479,15 +479,17 @@ static const VMStateDescription vmstate_ics = {
 },
 };
 
-static int ics_realize(DeviceState *dev)
+static void ics_realize(DeviceState *dev, Error **errp)
 {
 ICSState *ics = ICS(dev);
 
+if (!ics->nr_irqs) {
+error_setg(errp, "Number of interrupts needs to be greater 0");
+return;
+}
 ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
 ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
 ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
-
-return 0;
 }
 
 static void ics_class_init(ObjectClass *klass, void *data)
@@ -495,7 +497,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 ICSStateClass *isc = ICS_CLASS(klass);
 
-dc->init = ics_realize;
+dc->realize = ics_realize;
 dc->vmsd = &vmstate_ics;
 dc->reset = ics_reset;
 isc->post_load = ics_post_load;
@@ -691,8 +693,14 @@ static void xics_realize(DeviceState *dev, Error **errp)
 {
 XICSState *icp = XICS(dev);
 ICSState *ics = icp->ics;
+Error *error = NULL;
 int i;
 
+if (!icp->nr_servers) {
+error_setg(errp, "Number of servers needs to be greater 0");
+return;
+}
+
 /* Registration of global state belongs into realize */
 spapr_rtas_register("ibm,set-xive", rtas_set_xive);
 spapr_rtas_register("ibm,get-xive", rtas_get_xive);
@@ -707,7 +715,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
 ics->nr_irqs = icp->nr_irqs;
 ics->offset = XICS_IRQ_BASE;
 ics->icp = icp;
-qdev_init_nofail(DEVICE(ics));
+object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
 
 icp->ss = g_malloc0(icp->nr_servers*sizeof(ICPState));
 for (i = 0; i < icp->nr_servers; i++) {
@@ -715,7 +727,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
 object_initialize(&icp->ss[i], sizeof(icp->ss[i]), TYPE_ICP);
 snprintf(buffer, sizeof(buffer), "icp[%d]", i);
 object_property_add_child(OBJECT(icp), buffer, OBJECT(&icp->ss[i]), NULL);
-qdev_init_nofail(DEVICE(&icp->ss[i]));
+object_property_set_bool(OBJECT(&icp->ss[i]), true, "realized", &error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
 }
 }
 
-- 
1.8.1.4

[Qemu-devel] [PULL 09/29] target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN

2013-10-25 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Recent PowerKVM allows the kernel to intercept some RTAS calls from the
guest directly.  This is used to implement the more efficient in-kernel
XICS for example.  qemu is still responsible for assigning the RTAS token
numbers however, and needs to tell the kernel which RTAS function name is
assigned to a given token value.  This patch adds a convenience wrapper for
the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/kvm.c | 14 ++
 target-ppc/kvm_ppc.h |  7 +++
 2 files changed, 21 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..0b5d391 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1789,6 +1789,20 @@ static int kvm_ppc_register_host_cpu_type(void)
 return 0;
 }
 
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function)
+{
+struct kvm_rtas_token_args args = {
+.token = token,
+};
+
+if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) {
+return -ENOENT;
+}
+
+strncpy(args.name, function, sizeof(args.name));
+
+return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args);
+}
 
 int kvmppc_get_htab_fd(bool write)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..5f78e4b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -38,6 +38,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
 #endif /* !CONFIG_USER_ONLY */
 int kvmppc_fixup_cpu(PowerPCCPU *cpu);
 bool kvmppc_has_cap_epr(void);
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
 int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
@@ -164,6 +165,12 @@ static inline bool kvmppc_has_cap_epr(void)
 return false;
 }
 
+static inline int kvmppc_define_rtas_kernel_token(uint32_t token,
+  const char *function)
+{
+return -1;
+}
+
 static inline int kvmppc_get_htab_fd(bool write)
 {
 return -1;
-- 
1.8.1.4
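
A side note on the strncpy() call in the wrapper above: strncpy() does not
NUL-terminate the destination when the source fills the whole buffer. The
sketch below shows one conservative variant that rejects names which do not
fit. It is not what the patch does; sketch_rtas_token_args,
SKETCH_RTAS_NAME_LEN (assumed to be 120) and sketch_fill_rtas_args() are
hypothetical stand-ins, not the real KVM structure or API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RTAS_NAME_LEN 120  /* assumed size of the name field */

/* Hypothetical stand-in for the ioctl argument structure */
struct sketch_rtas_token_args {
    char name[SKETCH_RTAS_NAME_LEN];
    uint64_t token;
};

/*
 * Fill the argument block for a token definition. Returns 0 on success,
 * or -EINVAL if the function name would not fit with its terminator.
 */
static int sketch_fill_rtas_args(struct sketch_rtas_token_args *args,
                                 uint32_t token, const char *function)
{
    assert(args != NULL && function != NULL);

    if (strlen(function) >= sizeof(args->name)) {
        return -EINVAL;
    }

    memset(args, 0, sizeof(*args));
    args->token = token;
    /* The zeroed buffer guarantees a trailing NUL after the copy */
    strncpy(args->name, function, sizeof(args->name) - 1);
    return 0;
}
```

Whether the kernel side needs the terminator is a question for the KVM API
documentation; the sketch only illustrates the userspace copy.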

[Qemu-devel] [PULL 12/29] xics: replace fprintf with error_report

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This replaces old-style fprintf with new style error_report.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a0d71ef..666888d 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -29,6 +29,7 @@
 #include "trace.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "qemu/error-report.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -48,8 +49,8 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 break;
 
 default:
-fprintf(stderr, "XICS interrupt controller does not support this CPU "
-"bus model\n");
+error_report("XICS interrupt controller does not support this CPU "
+ "bus model");
 abort();
 }
 }
-- 
1.8.1.4

[Qemu-devel] [PULL 06/29] spapr: increase temporary fdt buffer size

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

At the moment the size of the buffer is set to 64K which is
enough for approximately 150 VCPUs which is not the limit.

This increases the buffer up to 256K which allows having
a tree for approximately 600 VCPUs which is way beyond the real
number we need.

As only the real size of the tree is copied to the guest, there
will be no impact on existing configurations.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5bf6c3b..6322c98 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -62,7 +62,7 @@
  *
  * We load our kernel at 4M, leaving space for SLOF initial image
  */
-#define FDT_MAX_SIZE 0x10000
+#define FDT_MAX_SIZE 0x40000
 #define RTAS_MAX_SIZE   0x1
 #define FW_MAX_SIZE 0x40
 #define FW_FILE_NAMEslof.bin
-- 
1.8.1.4
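
The sizing estimate in the commit message can be checked with a few lines of
arithmetic. This is only a back-of-the-envelope sketch derived from the
figures quoted above (64K for roughly 150 VCPUs); the helper names are
hypothetical and the real per-VCPU cost depends on the properties emitted
for each CPU node.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate bytes of flattened device tree consumed per VCPU */
static size_t fdt_bytes_per_vcpu(size_t buf_size, size_t approx_vcpus)
{
    assert(approx_vcpus > 0);
    return buf_size / approx_vcpus;
}

/* Approximate number of VCPUs that fit in a buffer of buf_size bytes */
static size_t fdt_vcpu_capacity(size_t buf_size, size_t bytes_per_vcpu)
{
    assert(bytes_per_vcpu > 0);
    return buf_size / bytes_per_vcpu;
}
```

With 64K / 150 giving about 436 bytes per VCPU, a 256K buffer holds about
601 VCPUs, consistent with the "approximately 600" in the description.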

[Qemu-devel] [PULL 18/29] xics-kvm: Support for in-kernel XICS interrupt controller

2013-10-25 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Recent (host) kernels support emulating the PAPR defined XICS interrupt
controller system within KVM.  This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.

This should give considerable performance improvements.  e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
[Mike Qiu qiud...@linux.vnet.ibm.com: fixed mistype which caused ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics_kvm.c| 488 ++
 hw/ppc/spapr.c|  21 +-
 include/hw/ppc/xics.h |  10 +
 5 files changed, 520 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/xics_kvm.c

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 975112a..fb34a9b 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -46,6 +46,7 @@ CONFIG_E500=y
 CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For pSeries
 CONFIG_XICS=$(CONFIG_PSERIES)
+CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
 # For PReP
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 2851eed..47ac442 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -23,3 +23,4 @@ obj-$(CONFIG_OMAP) += omap_intc.o
 obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
 obj-$(CONFIG_SH4) += sh_intc.o
 obj-$(CONFIG_XICS) += xics.o
+obj-$(CONFIG_XICS_KVM) += xics_kvm.o
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
new file mode 100644
index 000..a2ccafa
--- /dev/null
+++ b/hw/intc/xics_kvm.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics, in-kernel emulation
+ *
+ * Copyright (c) 2013 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+
+#include "hw/hw.h"
+#include "trace.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
+#include "kvm_ppc.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+
+#include <sys/ioctl.h>
+
+typedef struct KVMXICSState {
+XICSState parent_obj;
+
+uint32_t set_xive_token;
+uint32_t get_xive_token;
+uint32_t int_off_token;
+uint32_t int_on_token;
+int kernel_xics_fd;
+} KVMXICSState;
+
+/*
+ * ICP-KVM
+ */
+static void icp_get_kvm_state(ICPState *ss)
+{
+uint64_t state;
+struct kvm_one_reg reg = {
+.id = KVM_REG_PPC_ICP_STATE,
+.addr = (uintptr_t)&state,
+};
+int ret;
+
+/* ICP for this CPU thread is not in use, exiting */
+if (!ss->cs) {
+return;
+}
+
+ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
+if (ret != 0) {
+error_report("Unable to retrieve KVM interrupt controller state"
+" for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+exit(1);
+}
+
+ss->xirr = state >> KVM_REG_PPC_ICP_XISR_SHIFT;
+ss->mfrr = (state >> KVM_REG_PPC_ICP_MFRR_SHIFT)
+& KVM_REG_PPC_ICP_MFRR_MASK;
+ss->pending_priority = (state >> KVM_REG_PPC_ICP_PPRI_SHIFT)
+& KVM_REG_PPC_ICP_PPRI_MASK;
+}
+
+static int icp_set_kvm_state(ICPState *ss, int version_id)
+{
+uint64_t state;
+struct kvm_one_reg reg = {
+.id = KVM_REG_PPC_ICP_STATE,
+.addr = (uintptr_t)&state,
+};
+int ret;
+
+/* ICP for this CPU thread is not in use, exiting */
+if (!ss->cs) {
+return 0;
+}
+
+state =

[Qemu-devel] [PULL 21/29] xics-kvm: enable irqfd for MSI

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This enables IRQFD support for sPAPR. The feature decreases the latency
of interrupt handling.

To enable IRQFD for MSI, this sets kvm_gsi_direct_mapping to true which
enables direct MSI mapping.

To enable IRQFD for LSI (level triggered INTx interrupts), a PCI host bus
callback is required. The patch for that is coming next.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics_kvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index a2ccafa..c203646 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -441,6 +441,12 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
 goto fail;
 }
 }
+
+kvm_kernel_irqchip = true;
+kvm_irqfds_allowed = true;
+kvm_msi_via_irqfd_allowed = true;
+kvm_gsi_direct_mapping = true;
+
 return;
 
 fail:
-- 
1.8.1.4

[Qemu-devel] [PULL 23/29] target-ppc: Update slb array with correct index values.

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Without this, a value of rb=0 and rs=0 results in replacing the 0th
index. This can be observed when using gdb remote debugging support.

(gdb) x/10i do_fork
   0xc0085330 <do_fork>: Cannot access memory at address 0xc0085330
(gdb)

This is because when we do the slb sync via kvm_cpu_synchronize_state,
we overwrite the slb entry (0th entry) for 0xc0085330

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/kvm.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 0b5d391..e2f8b03 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1033,9 +1033,22 @@ int kvm_arch_get_registers(CPUState *cs)
 
 /* Sync SLB */
 #ifdef TARGET_PPC64
+/*
+ * The packed SLB array we get from KVM_GET_SREGS only contains
+ * information about valid entries. So we flush our internal
+ * copy to get rid of stale ones, then put all valid SLB entries
+ * back in.
+ */
+memset(env->slb, 0, sizeof(env->slb));
 for (i = 0; i < 64; i++) {
-ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
-   sregs.u.s.ppc64.slb[i].slbv);
+target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
+target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
+/*
+ * Only restore valid entries
+ */
+if (rb & SLB_ESID_V) {
+ppc_store_slb(env, rb, rs);
+}
 }
 #endif
 
-- 
1.8.1.4
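
The flush-then-restore logic described in the comment above can be sketched
outside QEMU as follows. Everything here is a hypothetical stand-in: the
SketchSlbEntry type, the position of the valid bit (assumed to be bit 27)
and the convention that the low 12 bits of rb carry the slot number are
assumptions made for illustration, not definitions taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SLB_SLOTS      64
#define SKETCH_SLB_VALID      (1ULL << 27)  /* assumed valid bit */
#define SKETCH_SLB_SLOT_MASK  0xfffULL      /* assumed slot field in rb */

typedef struct {
    uint64_t esid;
    uint64_t vsid;
} SketchSlbEntry;

/*
 * Flush the local copy, then put back only the valid entries from the
 * packed array. Returns the number of entries restored.
 */
static int sketch_sync_slb(SketchSlbEntry dst[SKETCH_SLB_SLOTS],
                           const SketchSlbEntry *packed, size_t n)
{
    int restored = 0;
    size_t i;

    assert(dst != NULL && packed != NULL);
    memset(dst, 0, SKETCH_SLB_SLOTS * sizeof(dst[0]));

    for (i = 0; i < n; i++) {
        uint64_t rb = packed[i].esid;
        uint64_t slot = rb & SKETCH_SLB_SLOT_MASK;

        /* An all-zero (invalid) entry must not overwrite slot 0 */
        if (!(rb & SKETCH_SLB_VALID) || slot >= SKETCH_SLB_SLOTS) {
            continue;
        }
        dst[slot].esid = rb & ~SKETCH_SLB_SLOT_MASK;
        dst[slot].vsid = packed[i].vsid;
        restored++;
    }
    return restored;
}
```

Without the validity check, every unused entry with rb=0 and rs=0 would be
written to slot 0, which is the failure the commit message describes.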

[Qemu-devel] [PULL 29/29] spapr: Use DeviceClass::fw_name for device tree CPU node

2013-10-25 Thread Alexander Graf

From: Andreas Färber afaer...@suse.de

Instead of relying on cpu_model, obtain the device tree node label
per CPU. Use DeviceClass::fw_name as source.

Whenever DeviceClass::fw_name is unknown, default to PowerPC,UNKNOWN.

As a consequence, spapr_fixup_cpu_dt() can operate on each CPU's fw_name,
obsoleting sPAPREnvironment::cpu_model, and spapr_create_fdt_skel() can
drop its cpu_model argument.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
Signed-off-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c  | 26 ++
 include/hw/ppc/spapr.h  |  1 -
 target-ppc/translate_init.c |  2 ++
 3 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c0613e4..f76b355 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -204,9 +204,8 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 int smt = kvmppc_smt_threads();
 uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
 
-assert(spapr->cpu_model);
-
 CPU_FOREACH(cpu) {
+DeviceClass *dc = DEVICE_GET_CLASS(cpu);
 uint32_t associativity[] = {cpu_to_be32(0x5),
 cpu_to_be32(0x0),
 cpu_to_be32(0x0),
@@ -218,7 +217,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 continue;
 }
 
-snprintf(cpu_model, 32, "/cpus/%s@%x", spapr->cpu_model,
+snprintf(cpu_model, 32, "/cpus/%s@%x", dc->fw_name,
  cpu->cpu_index);
 
 offset = fdt_path_offset(fdt, cpu_model);
@@ -288,8 +287,7 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop,
 } while (0)
 
 
-static void *spapr_create_fdt_skel(const char *cpu_model,
-   hwaddr initrd_base,
+static void *spapr_create_fdt_skel(hwaddr initrd_base,
hwaddr initrd_size,
hwaddr kernel_size,
bool little_endian,
@@ -306,7 +304,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 char qemu_hypertas_prop[] = "hcall-memop1";
 uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
 uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
-char *modelname;
 int i, smt = kvmppc_smt_threads();
 unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
 
@@ -365,18 +362,10 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
 
-modelname = g_strdup(cpu_model);
-
-for (i = 0; i < strlen(modelname); i++) {
-modelname[i] = toupper(modelname[i]);
-}
-
-/* This is needed during FDT finalization */
-spapr->cpu_model = g_strdup(modelname);
-
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 CPUPPCState *env = &cpu->env;
+DeviceClass *dc = DEVICE_GET_CLASS(cs);
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
 int index = cs->cpu_index;
 uint32_t servers_prop[smp_threads];
@@ -393,7 +382,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 continue;
 }
 
-nodename = g_strdup_printf("%s@%x", modelname, index);
+nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
 
 _FDT((fdt_begin_node(fdt, nodename)));
 
@@ -477,8 +466,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_end_node(fdt)));
 }
 
-g_free(modelname);
-
 _FDT((fdt_end_node(fdt)));
 
 /* RTAS */
@@ -1363,8 +1350,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
  &savevm_htab_handlers, spapr);
 
 /* Prepare the device tree */
-spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
-initrd_base, initrd_size,
+spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
 kernel_size, kernel_le,
 boot_device, kernel_cmdline,
 spapr->epow_irq);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5ae0b58..fdaab2d 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -29,7 +29,6 @@ typedef struct sPAPREnvironment {
 target_ulong entry_point;
 uint32_t next_irq;
 uint64_t rtc_offset;
-char *cpu_model;
 bool has_graphics;
 
 uint32_t epow_irq;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 9e29caa..47825ac 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8587,6 +8587,8 @@ static void ppc_cpu_class_init(ObjectClass *oc, void *data)
 #else
 cc->gdb_core_xml_file = "power-core.xml";
 #endif
+
+dc->fw_name = "PowerPC,UNKNOWN";
 }
 
 static const

[Qemu-devel] [PULL 26/29] dump-guest-memory: Check for the correct return value

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

The error check after cpu_get_note_size() should test s->note_size
rather than ret.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 dump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dump.c b/dump.c
index 846155c..80a9116 100644
--- a/dump.c
+++ b/dump.c
@@ -66,7 +66,7 @@ typedef struct DumpState {
 uint32_t sh_info;
 bool have_section;
 bool resume;
-size_t note_size;
+ssize_t note_size;
 hwaddr memory_offset;
 int fd;
 
@@ -765,7 +765,7 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
 
 s->note_size = cpu_get_note_size(s->dump_info.d_class,
  s->dump_info.d_machine, nr_cpus);
-if (ret < 0) {
+if (s->note_size < 0) {
 error_set(errp, QERR_UNSUPPORTED);
 goto cleanup;
 }
-- 
1.8.1.4
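
The type change from size_t to ssize_t matters as much as the changed
condition: a negative return value stored in an unsigned field can never
compare below zero. A minimal sketch of the difference, where
sketch_note_size() is a hypothetical stand-in for a size query that returns
-1 on unsupported configurations:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical size query: a byte count, or -1 if unsupported */
static ssize_t sketch_note_size(int supported)
{
    return supported ? 128 : -1;
}

/* Signed storage: the failure is visible as a negative value */
static int sketch_init_signed(int supported, ssize_t *out)
{
    ssize_t n;

    assert(out != NULL);
    n = sketch_note_size(supported);
    if (n < 0) {
        return -1;
    }
    *out = n;
    return 0;
}

/* Unsigned storage: -1 silently turns into a huge size */
static size_t sketch_store_unsigned(int supported)
{
    return (size_t)sketch_note_size(supported);
}
```

On common platforms where size_t and ssize_t have the same width, the
unsigned variant yields SIZE_MAX for the error case instead of a value that
a "< 0" test could catch.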

[Qemu-devel] [PULL 28/29] target-ppc: Fill in OpenFirmware names for some PowerPCCPU families

2013-10-25 Thread Alexander Graf

From: Andreas Färber afaer...@suse.de

Set the expected values for POWER7, POWER7+, POWER8 and POWER5+.
Note that POWER5+ and POWER7+ are intentionally lacking the '+', so the
lack of a POWER7P family constitutes no problem.

Signed-off-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/translate_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index f778eaa..9e29caa 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7108,6 +7108,7 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+dc->fw_name = "PowerPC,POWER5";
 dc->desc = "POWER5+";
 pcc-init_proc = init_proc_power5plus;
 pcc-check_pow = check_pow_970FX;
@@ -7218,6 +7219,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+dc->fw_name = "PowerPC,POWER7";
 dc->desc = "POWER7";
 pcc-init_proc = init_proc_POWER7;
 pcc-check_pow = check_pow_nocheck;
@@ -7252,6 +7254,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+dc->fw_name = "PowerPC,POWER8";
 dc->desc = "POWER8";
 pcc-init_proc = init_proc_POWER7;
 pcc-check_pow = check_pow_nocheck;
-- 
1.8.1.4

[Qemu-devel] [PULL 27/29] target-ppc: dump-guest-memory support

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

This patch adds support for dumping guest memory using the
dump-guest-memory monitor command.

Before patch:

(qemu) dump-guest-memory testcrash
this feature or command is not currently supported
(qemu)

After patch:

(qemu) dump-guest-memory testcrash
(qemu)

crash was able to read the file:

crash> bt
PID: 0  TASK: c0c0d0d0  CPU: 0   COMMAND: swapper/0

 R0:  2884R1:  c0cafa50R2:  c0cb05b0
 R3:  R4:  c0bc4cb0R5:  
 R6:  001efe93b800R7:  R8:  
 R9:  b0001032R10: 0001R11: 0001eb2117e00d55

...

NOTE: Currently the crash tool doesn't look at ELF notes in the dump on ppc64.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 include/elf.h   |   3 +
 target-ppc/Makefile.objs|   2 +-
 target-ppc/arch_dump.c  | 253 
 target-ppc/cpu-qom.h|   5 +-
 target-ppc/translate_init.c |   4 +
 5 files changed, 265 insertions(+), 2 deletions(-)
 create mode 100644 target-ppc/arch_dump.c

diff --git a/include/elf.h b/include/elf.h
index 58bfbf8..b818091 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -1359,6 +1359,9 @@ typedef struct elf64_shdr {
 #define NT_S390_TODPREG 0x303   /* s390 TOD programmable register */
 #define NT_S390_TODCMP  0x302   /* s390 TOD clock comparator register */
 #define NT_S390_TIMER   0x301   /* s390 timer register */
+#define NT_PPC_VMX   0x100  /* PowerPC Altivec/VMX registers */
+#define NT_PPC_SPE   0x101  /* PowerPC SPE/EVR registers */
+#define NT_PPC_VSX   0x102  /* PowerPC VSX registers */
 
 
 /* Note header in a PT_NOTE section */
diff --git a/target-ppc/Makefile.objs b/target-ppc/Makefile.objs
index 94d6d0c..3cb23e0 100644
--- a/target-ppc/Makefile.objs
+++ b/target-ppc/Makefile.objs
@@ -2,7 +2,7 @@ obj-y += cpu-models.o
 obj-y += translate.o
 ifeq ($(CONFIG_SOFTMMU),y)
 obj-y += machine.o mmu_helper.o mmu-hash32.o
-obj-$(TARGET_PPC64) += mmu-hash64.o
+obj-$(TARGET_PPC64) += mmu-hash64.o arch_dump.o
 endif
 obj-$(CONFIG_KVM) += kvm.o kvm_ppc.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
diff --git a/target-ppc/arch_dump.c b/target-ppc/arch_dump.c
new file mode 100644
index 000..17fd4c6
--- /dev/null
+++ b/target-ppc/arch_dump.c
@@ -0,0 +1,253 @@
+/*
+ * writing ELF notes for ppc64 arch
+ *
+ *
+ * Copyright IBM, Corp. 2013
+ *
+ * Authors:
+ * Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "elf.h"
+#include "exec/cpu-all.h"
+#include "sysemu/dump.h"
+#include "sysemu/kvm.h"
+
+struct PPC64UserRegStruct {
+uint64_t gpr[32];
+uint64_t nip;
+uint64_t msr;
+uint64_t orig_gpr3;
+uint64_t ctr;
+uint64_t link;
+uint64_t xer;
+uint64_t ccr;
+uint64_t softe;
+uint64_t trap;
+uint64_t dar;
+uint64_t dsisr;
+uint64_t result;
+} QEMU_PACKED;
+
+struct PPC64ElfPrstatus {
+char pad1[112];
+struct PPC64UserRegStruct pr_reg;
+uint64_t pad2[4];
+} QEMU_PACKED;
+
+
+struct PPC64ElfFpregset {
+uint64_t fpr[32];
+uint64_t fpscr;
+}  QEMU_PACKED;
+
+
+struct PPC64ElfVmxregset {
+ppc_avr_t avr[32];
+ppc_avr_t vscr;
+union {
+ppc_avr_t unused;
+uint32_t value;
+} vrsave;
+}  QEMU_PACKED;
+
+struct PPC64ElfVsxregset {
+uint64_t vsr[32];
+}  QEMU_PACKED;
+
+struct PPC64ElfSperegset {
+uint32_t evr[32];
+uint64_t spe_acc;
+uint32_t spe_fscr;
+}  QEMU_PACKED;
+
+typedef struct noteStruct {
+Elf64_Nhdr hdr;
+char name[5];
+char pad3[3];
+union {
+struct PPC64ElfPrstatus  prstatus;
+struct PPC64ElfFpregset  fpregset;
+struct PPC64ElfVmxregset vmxregset;
+struct PPC64ElfVsxregset vsxregset;
+struct PPC64ElfSperegset speregset;
+} contents;
+} QEMU_PACKED Note;
+
+
+static void ppc64_write_elf64_prstatus(Note *note, PowerPCCPU *cpu)
+{
+int i;
+uint64_t cr;
+struct PPC64ElfPrstatus *prstatus;
+struct PPC64UserRegStruct *reg;
+
+note->hdr.n_type = cpu_to_be32(NT_PRSTATUS);
+
+prstatus = &note->contents.prstatus;
+memset(prstatus, 0, sizeof(*prstatus));
+reg = &prstatus->pr_reg;
+
+for (i = 0; i < 32; i++) {
+reg->gpr[i] = cpu_to_be64(cpu->env.gpr[i]);
+}
+reg->nip = cpu_to_be64(cpu->env.nip);
+reg->msr = cpu_to_be64(cpu->env.msr);
+reg->ctr = cpu_to_be64(cpu->env.ctr);
+reg->link = cpu_to_be64(cpu->env.lr);
+reg->xer = cpu_to_be64(cpu_read_xer(&cpu->env));
+
+cr = 0;
+for (i = 0; i < 8; i++) {
+cr |= (cpu->env.crf[i] & 15) << (4 * (7 - i));
+}
+reg->ccr = cpu_to_be64(cr);
+}
+
+static void

[Qemu-devel] [PULL 13/29] xics: add pre_save/post_load dispatchers

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

The upcoming support of in-kernel XICS will redefine migration callbacks
for both ICS and ICP so classes and callback pointers are added.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 56 ---
 include/hw/ppc/xics.h | 26 
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 666888d..eeb64f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -190,11 +190,35 @@ static void icp_irq(XICSState *icp, int server, int nr, uint8_t priority)
 }
 }
 
+static void icp_dispatch_pre_save(void *opaque)
+{
+ICPState *ss = opaque;
+ICPStateClass *info = ICP_GET_CLASS(ss);
+
+if (info->pre_save) {
+info->pre_save(ss);
+}
+}
+
+static int icp_dispatch_post_load(void *opaque, int version_id)
+{
+ICPState *ss = opaque;
+ICPStateClass *info = ICP_GET_CLASS(ss);
+
+if (info->post_load) {
+return info->post_load(ss, version_id);
+}
+
+return 0;
+}
+
 static const VMStateDescription vmstate_icp_server = {
 .name = "icp/server",
 .version_id = 1,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
+.pre_save = icp_dispatch_pre_save,
+.post_load = icp_dispatch_post_load,
 .fields  = (VMStateField []) {
 /* Sanity check */
 VMSTATE_UINT32(xirr, ICPState),
@@ -229,6 +253,7 @@ static TypeInfo icp_info = {
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICPState),
 .class_init = icp_class_init,
+.class_size = sizeof(ICPStateClass),
 };
 
 /*
@@ -390,10 +415,9 @@ static void ics_reset(DeviceState *dev)
 }
 }
 
-static int ics_post_load(void *opaque, int version_id)
+static int ics_post_load(ICSState *ics, int version_id)
 {
 int i;
-ICSState *ics = opaque;
 
 for (i = 0; i  ics-icp-nr_servers; i++) {
 icp_resend(ics-icp, i);
@@ -402,6 +426,28 @@ static int ics_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static void ics_dispatch_pre_save(void *opaque)
+{
+ICSState *ics = opaque;
+ICSStateClass *info = ICS_GET_CLASS(ics);
+
+if (info-pre_save) {
+info-pre_save(ics);
+}
+}
+
+static int ics_dispatch_post_load(void *opaque, int version_id)
+{
+ICSState *ics = opaque;
+ICSStateClass *info = ICS_GET_CLASS(ics);
+
+if (info-post_load) {
+return info-post_load(ics, version_id);
+}
+
+return 0;
+}
+
 static const VMStateDescription vmstate_ics_irq = {
 .name = ics/irq,
 .version_id = 1,
@@ -421,7 +467,8 @@ static const VMStateDescription vmstate_ics = {
 .version_id = 1,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
-.post_load = ics_post_load,
+.pre_save = ics_dispatch_pre_save,
+.post_load = ics_dispatch_post_load,
 .fields  = (VMStateField []) {
 /* Sanity check */
 VMSTATE_UINT32_EQUAL(nr_irqs, ICSState),
@@ -446,10 +493,12 @@ static int ics_realize(DeviceState *dev)
 static void ics_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
+ICSStateClass *isc = ICS_CLASS(klass);
 
 dc-init = ics_realize;
 dc-vmsd = vmstate_ics;
 dc-reset = ics_reset;
+isc-post_load = ics_post_load;
 }
 
 static TypeInfo ics_info = {
@@ -457,6 +506,7 @@ static TypeInfo ics_info = {
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICSState),
 .class_init = ics_class_init,
+.class_size = sizeof(ICSStateClass),
 };
 
 /*
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 66364c5..6e3b605 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -42,7 +42,9 @@
  *  that yet)
  */
 typedef struct XICSState XICSState;
+typedef struct ICPStateClass ICPStateClass;
 typedef struct ICPState ICPState;
+typedef struct ICSStateClass ICSStateClass;
 typedef struct ICSState ICSState;
 typedef struct ICSIRQState ICSIRQState;
 
@@ -59,6 +61,18 @@ struct XICSState {
 #define TYPE_ICP "icp"
 #define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
 
+#define ICP_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICPStateClass, (klass), TYPE_ICP)
+#define ICP_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICPStateClass, (obj), TYPE_ICP)
+
+struct ICPStateClass {
+    DeviceClass parent_class;
+
+    void (*pre_save)(ICPState *s);
+    int (*post_load)(ICPState *s, int version_id);
+};
+
 struct ICPState {
     /*< private >*/
     DeviceState parent_obj;
@@ -72,6 +86,18 @@ struct ICPState {
 #define TYPE_ICS "ics"
 #define ICS(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS)
 
+#define ICS_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICSStateClass, (klass), TYPE_ICS)
+#define ICS_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICSStateClass, (obj), TYPE_ICS)
+
+struct ICSStateClass {
+    DeviceClass parent_class;
+
+    void (*pre_save)(ICSState *s);
+    int (*post_load)(ICSState *s, int version_id);
+};
+
 struct

[Qemu-devel] [PULL 24/29] target-ppc: Check for error on address translation in memsave command

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

When translating the virtual address to a physical address, check for errors.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 cpus.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 398229e..912938c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1403,7 +1403,10 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename,
         l = sizeof(buf);
         if (l > size)
             l = size;
-        cpu_memory_rw_debug(cpu, addr, buf, l, 0);
+        if (cpu_memory_rw_debug(cpu, addr, buf, l, 0) != 0) {
+            error_setg(errp, "Invalid addr 0x%016" PRIx64 "specified", addr);
+            goto exit;
+        }
         if (fwrite(buf, 1, l, f) != l) {
             error_set(errp, QERR_IO_ERROR);
             goto exit;
-- 
1.8.1.4

[Qemu-devel] [PULL 11/29] spapr: move cpu_setup after kvmppc_set_papr

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This moves the xics_cpu_setup() call after kvmppc_set_papr()
in order to get VCPUs initialized as this is required by upcoming
XICS-KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 259df92..a276377 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1184,8 +1184,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         }
         env = &cpu->env;
 
-        xics_cpu_setup(spapr->icp, cpu);
-
         /* Set time-base frequency to 512 MHz */
         cpu_ppc_tb_init(env, TIMEBASE_FREQ);
 
@@ -1199,6 +1197,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
             kvmppc_set_papr(cpu);
         }
 
+        xics_cpu_setup(spapr->icp, cpu);
+
         qemu_register_reset(spapr_cpu_reset, cpu);
     }
 
-- 
1.8.1.4

[Qemu-devel] [PULL 07/29] spapr: Add ibm, purr property on power7 and newer

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

PAPR+ says that no ibm,purr tells the guest that H_PURR is not
supported. However some guests still try calling H_PURR on POWER7 unless
the property is present and equal to 0. This adds the property for CPUs
supporting the PURR special register.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6322c98..259df92 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -422,6 +422,10 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
         _FDT((fdt_property(fdt, "ibm,ppc-interrupt-gserver#s",
                            gservers_prop, sizeof(gservers_prop))));
 
+        if (env->spr_cb[SPR_PURR].oea_read) {
+            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
+        }
+
         if (env->mmu_model & POWERPC_MMU_1TSEG) {
             _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
                                segs, sizeof(segs))));
-- 
1.8.1.4

[Qemu-devel] [PULL 08/29] spapr-rtas: fix h_rtas parameters reading

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

On real hardware, RTAS is called in real mode, so the top 4 bits of the
address passed in the call are ignored. This patch makes QEMU do the same.

This converts h_rtas() to use the existing rtas_ld() helpers.

This fixes rtas_ld()/rtas_st() to ignore the top 4 bits.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr_hcall.c   | 6 +++---
 include/hw/ppc/spapr.h | 9 +++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f10ba8a..f755a53 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -521,9 +521,9 @@ static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                            target_ulong opcode, target_ulong *args)
 {
     target_ulong rtas_r3 = args[0];
-    uint32_t token = ldl_be_phys(rtas_r3);
-    uint32_t nargs = ldl_be_phys(rtas_r3 + 4);
-    uint32_t nret = ldl_be_phys(rtas_r3 + 8);
+    uint32_t token = rtas_ld(rtas_r3, 0);
+    uint32_t nargs = rtas_ld(rtas_r3, 1);
+    uint32_t nret = rtas_ld(rtas_r3, 2);
 
     return spapr_rtas_call(cpu, spapr, token, nargs, rtas_r3 + 12,
                            nret, rtas_r3 + 12 + 4*nargs);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..6407c8a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -332,14 +332,19 @@ static inline int spapr_allocate_lsi(int hint)
     return spapr_allocate_irq(hint, true);
 }
 
+static inline uint64_t ppc64_phys_to_real(uint64_t addr)
+{
+    return addr & ~0xF000000000000000ULL;
+}
+
 static inline uint32_t rtas_ld(target_ulong phys, int n)
 {
-    return ldl_be_phys(phys + 4*n);
+    return ldl_be_phys(ppc64_phys_to_real(phys + 4*n));
 }
 
 static inline void rtas_st(target_ulong phys, int n, uint32_t val)
 {
-    stl_be_phys(phys + 4*n, val);
+    stl_be_phys(ppc64_phys_to_real(phys + 4*n), val);
 }
 
 typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, sPAPREnvironment *spapr,
-- 
1.8.1.4
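
As a worked illustration of the "ignore the top 4 bits" rule in the patch
above, the following standalone sketch masks a 64-bit address before
computing the location of the n-th 32-bit word of an RTAS argument buffer.
The demo_* names are hypothetical, and the mask assumes a 64-bit address
whose four most significant bits are the ones to be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the four most significant bits of a 64-bit address, mirroring
 * what a real-mode access would ignore. */
static inline uint64_t demo_phys_to_real(uint64_t addr)
{
    return addr & ~0xF000000000000000ULL;
}

/* Address of the n-th 32-bit word of an argument buffer starting at base. */
static inline uint64_t demo_rtas_word_addr(uint64_t base, int n)
{
    return demo_phys_to_real(base + 4 * (uint64_t)n);
}
```

For example, a guest passing 0xC000000000001000 would have its buffer read
from 0x1000, while an address with the top bits already clear is unchanged.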

[Qemu-devel] [PATCHv1 3/4] Timers: Instrument timer_mod

2013-10-25 Thread Alex Bligh

Add instrumentation for timer_mod to allow measurement of the
average time delta to expiry plus the number of short delta
periods. This is only run when logging to a file because
getting the clock value may add appreciable expense.

Signed-off-by: Alex Bligh a...@alex.org.uk
---
 qemu-timer.c |   17 +
 1 file changed, 17 insertions(+)

diff --git a/qemu-timer.c b/qemu-timer.c
index 84a8932..16eaa1f 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -376,6 +376,23 @@ static bool timer_mod_ns_locked(QEMUTimerList *timer_list,
     ts->next = *pt;
     *pt = ts;
 
+    if (timer_debug_log) {
+        int64_t delta;
+
+        delta = ts->expire_time -
+            qemu_clock_get_ns(ts->timer_list->clock->type);
+        if (delta <= 0) {
+            delta = 0;
+        }
+
+        ts->tot_deltas += delta;
+        ts->num_deltas++;
+
+        if (delta < SCALE_US) {
+            ts->num_short++;
+        }
+    }
+
     return pt == &timer_list->active_timers;
 }
 
-- 
1.7.9.5
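
The bookkeeping added by the timer_mod instrumentation can be sketched on
its own as below. This is an illustration under assumptions, not the patch
itself: DemoTimerStats and the demo_* functions are hypothetical names, the
clock value is passed in by the caller instead of being read from a QEMU
clock, and the rounded average follows the formula used later in the series.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SCALE_US 1000 /* nanoseconds per microsecond */

typedef struct DemoTimerStats {
    int64_t tot_deltas; /* sum of all recorded deltas, in ns */
    int64_t num_deltas; /* number of modifications recorded */
    int64_t num_short;  /* deltas shorter than one microsecond */
} DemoTimerStats;

/* Record one timer modification; expire_time and now are in ns. */
static void demo_record(DemoTimerStats *st, int64_t expire_time, int64_t now)
{
    int64_t delta = expire_time - now;

    if (delta <= 0) {
        delta = 0; /* already expired: count as zero delay */
    }
    st->tot_deltas += delta;
    st->num_deltas++;
    if (delta < DEMO_SCALE_US) {
        st->num_short++;
    }
}

/* Average delta rounded to nearest; -1 if nothing was recorded. */
static int64_t demo_average(const DemoTimerStats *st)
{
    if (st->num_deltas == 0) {
        return -1;
    }
    return (st->tot_deltas + st->num_deltas / 2) / st->num_deltas;
}
```

Recording deltas of 4000 ns, 500 ns and an already-expired timer gives three
modifications, two of them short, and an average of 1500 ns.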

[Qemu-devel] [PATCHv1 0/4] Timers: add timer debugging through -timer-debug-log

2013-10-25 Thread Alex Bligh

This patch set adds facilities for debugging timers using the additional
command line option -timer-debug-log=FILE. If this option is selected,
a debugging file will be written showing information about the current
state of timers in the system, which the author feels will be useful for
debugging in the field.

Note that the option is a command line option rather than a configure
time option. This is because users in the field having issues are unlikely
to have a compile time option enabled.

It would be useful to get this feature in prior to 1.7 as it has little
impact other than making a major change to a subsystem more debuggable.
This patch set has been lightly tested.

Impact of changes whether or not -timer-debug-log is specified:

1. QEMUTimer is expanded to hold additional debugging information. Some
   of this is unused when the command line option is unspecified.

2. The file and line number of the caller that allocated the timer are
   recorded. This is useful for debugging in gdb.

It is felt these are minimal in nature.

Additional impact of changes only when -timer-debug-log is specified:

1. On every timer modification, the current clock time for that timer
   is read, and the additional debug information filled in.

2. Every second (roughly) a file is written (atomically) containing the
   timer debug information.

The debug information includes information on the number of timer
expiries since the timer was created, the average expiry time (in
nanoseconds), and the number of short expiries, being the number of
times the timer was asked to expire in less than one microsecond
(these usually but not always indicate a bug).

The file format is designed to be useful both to a mailing list and
to a user armed with gdb. An example of the output follows:

Timer list at 0x7f4d6cf0d6e0 clock 0:
   Address   Expiries  AvgLength   NumShort Source

Timer list at 0x7f4d6cf0cbc0 clock 0:
   Address   Expiries  AvgLength   NumShort Source

Timer list at 0x7f4d6cf0d750 clock 1:
   Address   Expiries  AvgLength   NumShort Source

Timer list at 0x7f4d6cf0cc30 clock 1:
   Address   Expiries  AvgLength   NumShort Source
0x7f4d6cf51550  1   27462700  0 i8254.c:333

Timer list at 0x7f4d6cf0d7c0 clock 2:
   Address   Expiries  AvgLength   NumShort Source

Timer list at 0x7f4d6cf0cca0 clock 2:
   Address   Expiries  AvgLength   NumShort Source
0x7f4d6cf6eed0  1  97000  0 mc146818rtc.c:858

Note that the somewhat strange choice to output to a file has been taken
because the tracing infrastructure is unlikely to be enabled in a distro
environment.

Alex Bligh (4):
  Timers: add debugging macros wrapping timer functions and debug
structures
  Timers: add command line option -timer-debug-log
  Timers: Instrument timer_mod
  Timers: produce timer-debug-log file

 include/block/aio.h  |   20 ++---
 include/qemu/timer.h |   70 ++
 qemu-options.hx  |   11 +
 qemu-timer.c |  118 --
 vl.c |3 ++
 5 files changed, 194 insertions(+), 28 deletions(-)

-- 
1.7.9.5

[Qemu-devel] [PATCHv1 1/4] Timers: add debugging macros wrapping timer functions and debug structures

2013-10-25 Thread Alex Bligh

Add debugging versions of functions creating timers to record the
file and line number that they were called from. Add macros to
call these transparently. Add fields to timer struct to store
debugging information.

Note this patch contains one checkpatch.pl warning (space before
parenthesis) and a rather arcane double stringify macro. These
are copied from audio_int.h and I believe are to work around
compiler incompatibilities.
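
For readers unfamiliar with the double stringify idiom mentioned above, here
is a small standalone sketch. The DEMO_* names are hypothetical. In standard
C the extra macro level is what allows an argument such as __LINE__ to be
expanded before the # operator turns it into a string; with a single level
the macro name itself is stringified.

```c
#include <assert.h>

#define DEMO_STRINGIFY_(n) #n
#define DEMO_STRINGIFY(n) DEMO_STRINGIFY_(n)

#define DEMO_VALUE 42

/* One level stringifies the macro name; two levels stringify its expansion. */
static const char *demo_one_level(void) { return DEMO_STRINGIFY_(DEMO_VALUE); }
static const char *demo_two_level(void) { return DEMO_STRINGIFY(DEMO_VALUE); }

/* "file:line" tag built at compile time by concatenating adjacent
 * string literals. */
#define DEMO_DBG __FILE__ ":" DEMO_STRINGIFY(__LINE__)

static const char *demo_here(void) { return DEMO_DBG; }
```

Here demo_one_level() yields the text DEMO_VALUE, demo_two_level() yields
42, and demo_here() yields the source file name followed by a colon and the
line number of the DEMO_DBG use.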

Signed-off-by: Alex Bligh a...@alex.org.uk
---
 include/block/aio.h  |   20 ++-
 include/qemu/timer.h |   69 --
 qemu-timer.c |8 +++---
 3 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 2efdf41..199728f 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -262,13 +262,17 @@ void qemu_aio_set_fd_handler(int fd,
  *
  * Returns: a pointer to the new timer
  */
-static inline QEMUTimer *aio_timer_new(AioContext *ctx, QEMUClockType type,
-                                       int scale,
-                                       QEMUTimerCB *cb, void *opaque)
+static inline QEMUTimer *aio_timer_new_dbg(AioContext *ctx, QEMUClockType type,
+                                           int scale,
+                                           QEMUTimerCB *cb, void *opaque,
+                                           const char *dbg)
 {
-    return timer_new_tl(ctx->tlg.tl[type], scale, cb, opaque);
+    return timer_new_tl_dbg(ctx->tlg.tl[type], scale, cb, opaque, dbg);
 }
 
+#define aio_timer_new(ctx, type, scale, opaque) \
+    aio_timer_new_dbg(ctx, type, scale, opaque, TIMER_DBG)
+
 /**
  * aio_timer_init:
  * @ctx: the aio context
@@ -284,9 +288,13 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, QEMUClockType type,
 static inline void aio_timer_init(AioContext *ctx,
                                   QEMUTimer *ts, QEMUClockType type,
                                   int scale,
-                                  QEMUTimerCB *cb, void *opaque)
+                                  QEMUTimerCB *cb, void *opaque,
+                                  const char *dbg)
 {
-    timer_init(ts, ctx->tlg.tl[type], scale, cb, opaque);
+    timer_init_dbg(ts, ctx->tlg.tl[type], scale, cb, opaque, dbg);
 }
 
+#define aio_timer_init(ctx, ts, type, scale, cb, opaque) \
+    aio_timer_init(ctx, ts, type, scale, cb, opaque, TIMER_DBG)
+
 #endif
 #endif
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 5afcffc..d3ab5b0 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -11,6 +11,11 @@
 #define SCALE_US 1000
 #define SCALE_NS 1
 
+/* debugging macros */
+#define TIMER_STRINGIFY_(n) #n
+#define TIMER_STRINGIFY(n) TIMER_STRINGIFY_(n)
+#define TIMER_DBG __FILE__ ":" TIMER_STRINGIFY (__LINE__)
+
 /**
  * QEMUClockType:
  *
@@ -61,6 +66,12 @@ struct QEMUTimer {
 void *opaque;
 QEMUTimer *next;
 int scale;
+
+/* these items are only used when debugging */
+const char *dbg;
+int64_t tot_deltas;
+int64_t num_deltas;
+int64_t num_short;
 };
 
 extern QEMUTimerListGroup main_loop_tlg;
@@ -415,9 +426,13 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup 
*tlg);
  * You need not call an explicit deinit call. Simply make
  * sure it is not on a list with timer_del.
  */
-void timer_init(QEMUTimer *ts,
-QEMUTimerList *timer_list, int scale,
-QEMUTimerCB *cb, void *opaque);
+void timer_init_dbg(QEMUTimer *ts,
+QEMUTimerList *timer_list, int scale,
+QEMUTimerCB *cb, void *opaque,
+const char *dbg);
+
+#define timer_init(ts, timer_list, scale, cb, opaque) \
+timer_init_dbg(ts, timer_list, scale, cb, opaque, TIMER_DBG)
 
 /**
  * timer_new_tl:
@@ -434,16 +449,20 @@ void timer_init(QEMUTimer *ts,
  *
  * Returns: a pointer to the timer
  */
-static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
-  int scale,
-  QEMUTimerCB *cb,
-  void *opaque)
+static inline QEMUTimer *timer_new_tl_dbg(QEMUTimerList *timer_list,
+  int scale,
+  QEMUTimerCB *cb,
+  void *opaque,
+  const char *dbg)
 {
 QEMUTimer *ts = g_malloc0(sizeof(QEMUTimer));
-timer_init(ts, timer_list, scale, cb, opaque);
+timer_init_dbg(ts, timer_list, scale, cb, opaque, dbg);
 return ts;
 }
 
+#define timer_new_tl(timer_list, scale, cb, opaque) \
+timer_new_tl_dbg(timer_list, scale, cb, opaque, TIMER_DBG)
+
 /**
  * timer_new:
  * @type: the clock type to use
@@ -456,12 +475,16 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList 
*timer_list,
  *
  * Returns: a pointer to the timer
  */
-static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
-

[Qemu-devel] [PATCHv1 2/4] Timers: add command line option -timer-debug-log

2013-10-25 Thread Alex Bligh

Add a command line option -timer-debug-log which takes the name
of a file to which periodic timer debugging information will be
written.

Signed-off-by: Alex Bligh a...@alex.org.uk
---
 include/qemu/timer.h |1 +
 qemu-options.hx  |   11 +++
 qemu-timer.c |1 +
 vl.c |3 +++
 4 files changed, 16 insertions(+)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index d3ab5b0..1f7c5e4 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -75,6 +75,7 @@ struct QEMUTimer {
 };
 
 extern QEMUTimerListGroup main_loop_tlg;
+extern const char *timer_debug_log;
 
 /*
  * QEMUClockType
diff --git a/qemu-options.hx b/qemu-options.hx
index 5dc8b75..605c1b9 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3116,6 +3116,17 @@ STEXI
 prepend a timestamp to each log message.(default:on)
 ETEXI
 
+DEF("timer-debug-log", HAS_ARG, QEMU_OPTION_timer_debug_log,
+    "-timer-debug-log FILE\n"
+    "                write timer debug log to FILE (default: don't write)",
+    QEMU_ARCH_ALL)
+STEXI
+@item -timer-debug-log @var{file}
+@findex -timer-debug-log
+Write timer debug output periodically to file @var{file}. By default,
+no timer debug logging is written.
+ETEXI
+
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
 @end table
diff --git a/qemu-timer.c b/qemu-timer.c
index 0e358ac..84a8932 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -57,6 +57,7 @@ typedef struct QEMUClock {
 
 QEMUTimerListGroup main_loop_tlg;
 QEMUClock qemu_clocks[QEMU_CLOCK_MAX];
+const char *timer_debug_log;
 
 /* A QEMUTimerList is a list of timers attached to a clock. More
  * than one QEMUTimerList can be attached to each clock, for instance
diff --git a/vl.c b/vl.c
index b42ac67..4564207 100644
--- a/vl.c
+++ b/vl.c
@@ -3794,6 +3794,9 @@ int main(int argc, char **argv, char **envp)
 }
 configure_msg(opts);
 break;
+case QEMU_OPTION_timer_debug_log:
+timer_debug_log = optarg;
+break;
 default:
 os_parse_cmd_args(popt->index, optarg);
 }
-- 
1.7.9.5

[Qemu-devel] [PATCHv1 4/4] Timers: produce timer-debug-log file

2013-10-25 Thread Alex Bligh

Write a timer-debug-log file if enabled containing data about the
currently existing timers.

Signed-off-by: Alex Bligh a...@alex.org.uk
---
 qemu-timer.c |   92 ++
 1 file changed, 92 insertions(+)

diff --git a/qemu-timer.c b/qemu-timer.c
index 16eaa1f..cbce7ba 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -58,6 +58,7 @@ typedef struct QEMUClock {
 QEMUTimerListGroup main_loop_tlg;
 QEMUClock qemu_clocks[QEMU_CLOCK_MAX];
 const char *timer_debug_log;
+static int64_t timer_last_debug;
 
 /* A QEMUTimerList is a list of timers attached to a clock. More
  * than one QEMUTimerList can be attached to each clock, for instance
@@ -396,6 +397,93 @@ static bool timer_mod_ns_locked(QEMUTimerList *timer_list,
     return pt == &timer_list->active_timers;
 }
 
+static void timer_debug(void)
+{
+    GString *debug_text;
+    GString *tmpfile;
+    QEMUClockType type;
+    FILE *f;
+    uint64_t now;
+
+    if (!timer_debug_log) {
+        return;
+    }
+
+    /* In order to avoid influencing the output, we don't use a timer
+     * here, but use this disappointingly manual method.
+     */
+    now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+    if ((now - timer_last_debug) < 1000 * SCALE_MS) {
+        return;
+    }
+    timer_last_debug = now;
+
+    debug_text = g_string_new("");
+    tmpfile = g_string_new(timer_debug_log);
+    g_string_append(tmpfile, ".tmp");
+
+    for (type = 0; type < QEMU_CLOCK_MAX; type++) {
+        QEMUTimerList *timer_list;
+        QEMUClock *clock = qemu_clock_ptr(type);
+
+        /* Iteration through timerlists means we need the BQL held to
+         * call this safely.
+         */
+        QLIST_FOREACH(timer_list, &clock->timerlists, list) {
+            QEMUTimer *ts;
+
+            g_string_append_printf(debug_text, "\nTimer list at %p clock %d:\n",
+                                   timer_list, (int) type);
+            g_string_append_printf(debug_text, "%18s %14s %14s %14s %s\n",
+                                   "Address",
+                                   "Expiries",
+                                   "AvgLength",
+                                   "NumShort",
+                                   "Source");
+            qemu_mutex_lock(&timer_list->active_timers_lock);
+            ts = timer_list->active_timers;
+            for (ts = timer_list->active_timers; ts; ts = ts->next) {
+                int64_t avg = -1;
+                if (ts->num_deltas) {
+                    avg = (ts->tot_deltas + (ts->num_deltas/2)) /
+                        ts->num_deltas;
+                }
+                const char *src = "unknown";
+                if (ts->dbg) {
+                    const char *slash;
+                    src = ts->dbg;
+                    slash = strrchr(src, '/');
+                    if (!slash) {
+                        slash = strrchr(src, '\\');
+                    }
+                    if (slash) {
+                        src = slash+1;
+                    }
+                }
+
+                g_string_append_printf(debug_text, "%18p %14" PRId64 " %14"
+                                       PRId64 " %14" PRId64 " %s\n",
+                                       ts,
+                                       ts->num_deltas,
+                                       avg,
+                                       ts->num_short,
+                                       src);
+            }
+            qemu_mutex_unlock(&timer_list->active_timers_lock);
+        }
+    }
+
+    f = fopen(tmpfile->str, "w");
+    if (f) {
+        fprintf(f, "%s", debug_text->str);
+        fclose(f);
+        rename(tmpfile->str, timer_debug_log);
+    }
+
+    g_string_free(tmpfile, true);
+    g_string_free(debug_text, true);
+}
+
 static void timerlist_rearm(QEMUTimerList *timer_list)
 {
     /* Interrupt execution to force deadline recalculation.  */
@@ -621,6 +709,10 @@ bool qemu_clock_run_all_timers(void)
     bool progress = false;
     QEMUClockType type;
 
+    if (timer_debug_log) {
+        timer_debug();
+    }
+
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
         progress |= qemu_clock_run_timers(type);
     }
-- 
1.7.9.5
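
The write-to-temporary-then-rename step at the end of timer_debug() can be
sketched in plain C as follows. This is a hedged illustration, not the
patch's code: demo_write_atomic() is a hypothetical helper, it adds error
handling the patch does not have, and it assumes POSIX rename() semantics,
where renaming over an existing file replaces it atomically.

```c
#include <assert.h>
#include <stdio.h>

/* Write text to path by first writing "path.tmp" and then renaming it
 * over path. Returns 0 on success, -1 on failure. */
static int demo_write_atomic(const char *path, const char *text)
{
    char tmp[512];
    FILE *f;
    int n;

    n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp)) {
        return -1; /* path too long for the temporary name */
    }

    f = fopen(tmp, "w");
    if (!f) {
        return -1;
    }
    if (fputs(text, f) == EOF) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {
        remove(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

A reader polling the log file therefore sees either the previous complete
snapshot or the new one, never a partially written file.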

[Qemu-devel] [PULL 16/29] xics: split to xics and xics-common

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

The upcoming XICS-KVM support will use bits of emulated XICS code.
So this introduces new level of hierarchy - xics-common class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.

The new xics-common class:
1. replaces the static nr_irqs and nr_servers properties with
dynamic ones and adds callbacks that are executed when the properties
are set;
2. renames the xics_cpu_setup() callback to xics_common_cpu_setup(), as
it is common to both XICS implementations;
3. renames xics_reset() to xics_common_reset() for the same reason.

The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the nr_servers property callback as realize() is too late to
create/initialize devices and instance_init() is too early to create
devices as the number of child devices comes via the nr_servers
property.
2. added ics_initfn() which does a little part of what xics_realize() did.
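
The "set once, create children in the setter" behaviour described above can
be sketched without QOM as follows. This is an illustration under
assumptions: DemoXics and demo_set_nr_servers() are hypothetical, the child
creation is reduced to a counter, and the range check on the value is an
addition of this sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef struct DemoXics {
    uint32_t nr_servers;
    uint32_t children_created; /* stand-in for the ICP child devices */
} DemoXics;

/* Set nr_servers exactly once and create the children at that point.
 * Returns 0 on success, -1 (with a message in err) otherwise. */
static int demo_set_nr_servers(DemoXics *x, int64_t value,
                               char *err, size_t errlen)
{
    if (x->nr_servers) {
        snprintf(err, errlen, "Number of servers is already set to %u",
                 (unsigned)x->nr_servers);
        return -1;
    }
    if (value <= 0 || value > UINT32_MAX) {
        snprintf(err, errlen, "Invalid number of servers");
        return -1;
    }
    x->nr_servers = (uint32_t)value;
    x->children_created = (uint32_t)value; /* children exist from here on */
    return 0;
}
```

Doing the work in the setter means the children exist as soon as the
property is known, which is earlier than realize() and later than instance
initialisation.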

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 156 +++---
 hw/ppc/spapr.c|   2 +-
 include/hw/ppc/xics.h |  20 +++
 3 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index c90eb0a..5ed2618 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
+#include "qapi/visitor.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -55,9 +56,12 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
-static void xics_reset(DeviceState *d)
+/*
+ * XICS Common class - parent for emulated XICS and KVM-XICS
+ */
+static void xics_common_reset(DeviceState *d)
 {
-    XICSState *icp = XICS(d);
+    XICSState *icp = XICS_COMMON(d);
     int i;
 
     for (i = 0; i < icp->nr_servers; i++) {
@@ -67,6 +71,99 @@ static void xics_reset(DeviceState *d)
     device_reset(DEVICE(icp->ics));
 }
 
+static void xics_prop_get_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_irqs;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_irqs) {
+        error_setg(errp, "Number of interrupts is already set to %u",
+                   icp->nr_irqs);
+        return;
+    }
+
+    assert(info->set_nr_irqs);
+    assert(icp->ics);
+    info->set_nr_irqs(icp, value, errp);
+}
+
+static void xics_prop_get_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_servers;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_servers) {
+        error_setg(errp, "Number of servers is already set to %u",
+                   icp->nr_servers);
+        return;
+    }
+
+    assert(info->set_nr_servers);
+    info->set_nr_servers(icp, value, errp);
+}
+
+static void xics_common_initfn(Object *obj)
+{
+    object_property_add(obj, "nr_irqs", "int",
+                        xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
+                        NULL, NULL, NULL);
+    object_property_add(obj, "nr_servers", "int",
+                        xics_prop_get_nr_servers, xics_prop_set_nr_servers,
+                        NULL, NULL, NULL);
+}
+
+static void xics_common_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->reset = xics_common_reset;
+}
+
+static const TypeInfo xics_common_info = {
+    .name          = TYPE_XICS_COMMON,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XICSState),
+    .class_size    = sizeof(XICSStateClass),
+    .instance_init = xics_common_initfn,
+    .class_init    = xics_common_class_init,
+};
+
+
 /*
  * ICP: Presentation layer
  */
@@ -479,6 +576,13 @@ static const VMStateDescription vmstate_ics = {
 },
 };

[Qemu-devel] [PULL 15/29] xics: add missing const specifiers to TypeInfo

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This adds missing const specifiers to ICS and ICP TypeInfo's.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 76654db..c90eb0a 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -248,7 +248,7 @@ static void icp_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_icp_server;
 }
 
-static TypeInfo icp_info = {
+static const TypeInfo icp_info = {
     .name = TYPE_ICP,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICPState),
@@ -503,7 +503,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     isc->post_load = ics_post_load;
 }
 
-static TypeInfo ics_info = {
+static const TypeInfo ics_info = {
     .name = TYPE_ICS,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICSState),
-- 
1.8.1.4

Re: [Qemu-devel] [patch 2/2] i386: pc: align gpa-hpa on 1GB boundary

2013-10-25 Thread Paolo Bonzini

On 25/10/2013 20:50, Marcelo Tosatti wrote:
> On Fri, Oct 25, 2013 at 09:52:34AM +0100, Paolo Bonzini wrote:
>> Because offsets are zero, and lengths match the RAM block lengths, you
>> do not need any complication with aliasing.  This still has to be done
>> only for new machine types.
>
> Not possible because you just wasted holesize bytes (if number of
> additional bytes due to huge page alignment is smaller than holesize, a
> new hugepage is required, which is not acceptable).

Ok.  Thanks for explaining---the patch seems good with the proper
compatibility option in the machine type.  Please run the
guest_memory_dump_analysis test in autotest too.

> Is there a tree the new machine types can live until 1.8 opens up?
>
> Can you pick up the MAP_POPULATE patch?

Yes, I can pick that one up next week.

Michael is usually gathering hw/i386/pc* patches in his PCI tree, you
can Cc him on v2 of this one.

Paolo

Re: [Qemu-devel] [PATCHv1 0/4] Timers: add timer debugging through -timer-debug-log

2013-10-25 Thread Paolo Bonzini

On 25/10/2013 23:30, Alex Bligh wrote:
> This patch set adds facilities for debugging timers using the additional
> command line option -timer-debug-log=FILE. If this option is selected,
> a debugging file will be written showing information about the current
> state of timers in the system, which the author feels will be useful for
> debugging in the field.
>
> Note that the option is a command line option rather than a configure
> time option. This is because users in the field having issues are unlikely
> to have a compile time option enabled.
>
> It would be useful to get this feature in prior to 1.7 as it has little
> impact other than making a major change to a subsystem more debuggable.
> This patch has been lightly test.
>
> Impact of changes whether or not -timer-debug-log is specified:
>
> 1. QEMUTimer is expanded to hold additional debugging information. Some
>    of this is unused when the command line option is unspecified.
>
> 2. The file and line number of the caller that allocated the timer are
>    recorded. This is useful for debugging in gdb.
>
> It is felt these are minimal in nature.
>
> Additional impact of changes only when -timer-debug-log is specified:
>
> 1. On every timer modification, the current clock time for that timer
>    is read, and the additional debug information filled in.
>
> 2. Every second (roughly) a file is written (atomically) containing the
>    timer debug information.
>
> The debug information includes information on the number of timer
> expiries since the timer was created, the average expiry time (in
> nanoseconds), and the number of short expiries, being the number of
> times the timer was asked to expire in less than one microsecond
> (these usually but not always indicate a bug).
>
> The file format is designed to be useful both to a mailing list and
> to a user armed with gdb. An example of the output follows:
>
> Timer list at 0x7f4d6cf0d6e0 clock 0:
>    Address   Expiries  AvgLength   NumShort Source
>
> Timer list at 0x7f4d6cf0cbc0 clock 0:
>    Address   Expiries  AvgLength   NumShort Source
>
> Timer list at 0x7f4d6cf0d750 clock 1:
>    Address   Expiries  AvgLength   NumShort Source
>
> Timer list at 0x7f4d6cf0cc30 clock 1:
>    Address   Expiries  AvgLength   NumShort Source
> 0x7f4d6cf51550  1   27462700  0 i8254.c:333
>
> Timer list at 0x7f4d6cf0d7c0 clock 2:
>    Address   Expiries  AvgLength   NumShort Source
>
> Timer list at 0x7f4d6cf0cca0 clock 2:
>    Address   Expiries  AvgLength   NumShort Source
> 0x7f4d6cf6eed0  1  97000  0 mc146818rtc.c:858
>
> Note that the somewhat strange choice to output to a file has been taken
> because the tracing infrastructure is unlikely to be enabled in a distro
> environment.

This is a bug in the distro, if it is Linux.  There is no reason not to
enable the stap trace format when running on Linux (Fedora does for
other packages than QEMU, too---most notably glib and glibc).

If it is useful, adding debugging information to timer_new_ns (please
make file and line two separate arguments, though) can definitely be
done unconditionally and added to the traces.  I think adding a
tracepoint in timerlist_run_timers would provide very similar
information to that in your file.
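
The suggestion to pass file and line as two separate arguments could look
roughly like the sketch below. All names here (DemoTimer,
demo_timer_new_dbg, demo_timer_new) are hypothetical and the allocation is
reduced to calloc(); this is not the QEMU timer API, only an illustration of
the calling convention.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoTimer {
    const char *file; /* __FILE__ of the allocation site */
    int line;         /* __LINE__ of the allocation site */
} DemoTimer;

/* Debug variant: the call site is passed as two separate arguments. */
static DemoTimer *demo_timer_new_dbg(const char *file, int line)
{
    DemoTimer *t = calloc(1, sizeof(*t));

    if (t) {
        t->file = file;
        t->line = line;
    }
    return t;
}

/* Callers keep using the short name; the macro supplies the call site. */
#define demo_timer_new() demo_timer_new_dbg(__FILE__, __LINE__)
```

Keeping the line as an integer avoids the stringify macros altogether and
lets a tracepoint print the two values with separate format specifiers.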

Paolo

Re: [Qemu-devel] [Qemu-ppc] [PULL 00/29] ppc patch queue 2013-10-25

2013-10-25 Thread Mark Cave-Ayland


On 25/10/13 22:27, Alexander Graf wrote:


Hi Blue / Aurelien / Anthony,

This is my current patch queue for ppc.  Please pull.

Alex


Hi Alex,

Did you get my repost of the PPC PCI configuration space patch to 
qemu-devel here: 
http://lists.gnu.org/archive/html/qemu-devel/2013-10/msg01491.html? Or 
should that go via someone else's tree?



ATB,

Mark.
