Re: [Qemu-devel] [PATCH v4 06/10] qemu-ga: Add Windows VSS provider to quiesce applications on fsfreeze

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 00:31, Tomoki Sekiyama ha scritto:
 From: Paolo Bonzini [paolo.bonz...@gmail.com] on behalf of Paolo Bonzini 
 [pbonz...@redhat.com]
 Il 06/06/2013 17:06, Tomoki Sekiyama ha scritto:
 +STDAPI VSSCheckOSVersion(void);
 +
 +STDAPI COMRegister(void);
 +STDAPI COMUnregister(void);
 +
 +STDAPI DllRegisterServer(void);
 +STDAPI DllUnregisterServer(void);

 Can you explain the difference between COMRegister/COMUnregister and
 DllRegisterServer/DllUnregisterServer (and why the COM+ part need not be
 done by regsvr32)?  Also, why does COMUnregister call
 DllUnregisterServer but COMRegister does not call DllRegisterServer?
 
 COMRegister and COMUnregister are called by`qemu-ga -s install`,
 to register/unregister the DLL into/from COM+ application catalogue.
 
 DllRegisterServer is automatically called inside
 pCatalog-InstallComponent() (like regsvr32 does), and register
 this DLL as a VSS provider. DllUnregisterServer will do the oposite.
 
 ICOMAdminCatalog (pCatalog) does not provide a method to uninstall
 component, so COMUnregister calls DllUnregisterServer by itself.

Understood, thanks.  Just one question remains: why is the COM+ part not
needed when you invoke regsvr32?

Thanks,

Paolo

 Also, is it needed to call VSSCheckOSVersion from the requestor?  I
 would think that checking VSSAPI.DLL is stronger than checking the
 version, and indeed you do that check too.
 
 In Windows XP, VSSAPI.DLL exists but it has different functionality
 and interfaces from newer Windows. 
 http://msdn.microsoft.com/en-us/library/windows/desktop/aa384627(v=vs.85).aspx
 
 It is checking the OS version because this patchset only supports
 Windows 2003 or later.
 
 Thanks,
 Tomoki Sekiyama
 




Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Peter Lieven

Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:

 On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
 
 Can you please review for the gluster, rbd, sheepdog and ssh driver
 whether it's safe to assume that the image reads back as zeros after
 bdrv_create?
 
 Gluster supports both file and block backends. While the above is true for
 file backend (which uses ftruncate), the same is not true for
 block backend (which uses lvcreate  lvresize).
 
 So overall it is not safe to assume that an image on GlusterFS volume
 reads back as zeroes after create.

Okay, so for safety we have to return has_zero_init = 0. Erroneously assuming
a device is zero initialized can bring severe filesystem corruption.

Do you see a way to query the information of the underlaying
backend from the storage and return 1 or 0 conditionally?

Peter



Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 04:59, liu ping fan ha scritto:
 The latter part could be the hard one in a multi-threaded context, but I
 think it's up to the device to ensure it.  It doesn't _have_ to be hard.
  For example, joining the data-plane thread would do that as well.

 It seems not easy, take consideration of the sequence:
 _bh_delete(), // ensure not scheduled, ie, no further access to 
 DeviceState
 wait for rcu grace period to come, // ensure not running
 free DeviceState in rcu-reclaimer
 But  in current code, _delete() can be placed in DeviceState
 finalization path(in reclaimer), which means while reclaiming, there
 still exist access path to the DeviceState.

It can be placed there, but it doesn't mean it is a good idea. :)

 On the other hand, pushing _delete() out of  finalization path is not
 easy, since we do not what time the DeviceState has done with its bh.

See above:

- if the BH will run in the iothread, the BH is definitely not running
(because the BH runs under the BQL, and the reclaimer owns it)

- if the BH is running in another thread, waiting for that thread to
terminate obviously makes the BH not running.

What we need is separation of removal and reclamation.  Without that,
any effort to make things unplug-safe is going to be way way more
complicated than necessary.

Paolo



Re: [Qemu-devel] [PATCH v11 14/15] rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state transition

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 02:31, Michael R. Hines ha scritto:
 On 06/25/2013 05:06 PM, Paolo Bonzini wrote:
 Il 25/06/2013 22:56, Michael R. Hines ha scritto:
 I was wrong - this does require a protocol extension.

 This is because the RDMA transfers are asynchronous, and thus
 we cannot know in advance that it is safe to unregister the memory
 associated with each individual transfer before the transfer actually
 completes.

 While the destination currently uses the protocol to participate in
 *registering* the page, the destination does not participate in the
 RDMA transfers themselves, only the source does, and thus would
 require a new exchange of messages to block and instruct the
 destination to unpin the memory.
 Yes, that's what I recalled too (really what mst told me :)).  Does it
 need to be blocking though?  As long as the pinning is blocking, and
 messages are processed in order, the source can proceed immediately
 after sending an unpin message.  This assumes of course that the chunk
 is not being transmitted, and I am not sure how easy the source can
 determine that.
 
 No, they're not processed in order. In fact, not only does the device
 write out of order, but also the PCI bus writes out of order.
 This was such a problem in fact, that I fixed several bugs as a result
 a few weeks ago (v7 of the patch with an in-depth description).
 
 The destination simply cannot assume whatsoever what the ordering
 of the writes are - that's really the whole point of using RDMA in the
 first place so that the software can get out of the way of the transfer
 process to lower the latency of each transfer.

The memory is processed out of order, but what about the messages?
Those must be in order.

Note that I wrote above This assumes of course that the chunk is not
being transmitted.  Can the source know when an asynchronous transfer
finished, and delay the unpinning until that time?

Paolo

 
 The only option is to send a blocking message to the other side to
 request the unpinning (in addition to unpinning on the source first upon
 completion of the original transfer).
 
 As you can expect, this would be very expensive and we must ensure
 that we have *very* good a-priori information that this memory will
 not need to be re-registered anytime in the near future.
 
 - Michael
 
 
 




Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Bharata B Rao
On Wed, Jun 26, 2013 at 07:59:27AM +0200, Peter Lieven wrote:
 
 Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:
 
  On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
  
  Can you please review for the gluster, rbd, sheepdog and ssh driver
  whether it's safe to assume that the image reads back as zeros after
  bdrv_create?
  
  Gluster supports both file and block backends. While the above is true for
  file backend (which uses ftruncate), the same is not true for
  block backend (which uses lvcreate  lvresize).
  
  So overall it is not safe to assume that an image on GlusterFS volume
  reads back as zeroes after create.
 
 Okay, so for safety we have to return has_zero_init = 0. Erroneously assuming
 a device is zero initialized can bring severe filesystem corruption.
 
 Do you see a way to query the information of the underlaying
 backend from the storage and return 1 or 0 conditionally?

Right now no. There have been efforts to get the capabilities (file, block etc)
of gluster volume from gluster cmdline, but there haven't been discussions
about providing such capabilities from libgfapi which is the library
QEMU uses to talk with gluster.

I see has_zero_init being used from qemu-img convert path. Is there any
other use of this in QEMU ?

Regards,
Bharata. 




Re: [Qemu-devel] [PATCH v3 1/2] net: introduce RX_FILTER_CHANGED event

2013-06-26 Thread Markus Armbruster
Eric Blake ebl...@redhat.com writes:

 On 05/23/2013 03:07 AM, Amos Kong wrote:
 Introduce this new QMP event to notify management after guest changes
 rx-filter configuration.
 
 Signed-off-by: Amos Kong ak...@redhat.com
 ---
  QMP/qmp-events.txt| 14 ++
  include/monitor/monitor.h |  1 +
  monitor.c |  1 +
  3 files changed, 16 insertions(+)
 
 diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
 index 92fe5fb..ad6612b 100644
 --- a/QMP/qmp-events.txt
 +++ b/QMP/qmp-events.txt
 @@ -154,6 +154,20 @@ Data:
  path: /machine/peripheral/virtio-net-pci-0 },
timestamp: { seconds: 1265044230, microseconds: 450486 } }
  
 +RX_FILTER_CHANGED
 +-

 +
  DEVICE_TRAY_MOVED
  -

 Isn't this file supposed to be kept in sorted order, to minimize merge
 conflicts when backporting events?

Yes, please.  Also helps humans navigate.

GUEST_PANICKED is out of order, needs fixing.



Re: [Qemu-devel] [PULL 2/8] i.MX: Implement a more complete version of the GPT timer.

2013-06-26 Thread Paolo Bonzini
Il 25/06/2013 22:53, Peter Maydell ha scritto:
 On 25 June 2013 19:42, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 25/06/2013 20:21, Peter Maydell ha scritto:
 @@ -416,7 +513,7 @@ static int imx_timerg_init(SysBusDevice *dev)

  sysbus_init_irq(dev, s-irq);
  memory_region_init_io(s-iomem, imx_timerg_ops,
 -  s, imxg-timer,
 +  s, TYPE_IMX_GPT,
0x1000);
  sysbus_init_mmio(dev, s-iomem);


 There was some agreement that this is not a good change.
 
 I agree (and more so regarding the use of the macro in the
 vmstate name), but nobody actually posted any comment to
 that effect against any of the versions of this patch that
 got sent out for review...

Yeah, the timing was bad... Can you post a revert, though?

Paolo



Re: [Qemu-devel] [PULL 2/8] i.MX: Implement a more complete version of the GPT timer.

2013-06-26 Thread jcd
For my own education, could you elaborate on the reason this is not a good 
change? 

Thanks. 

JC 

- Mail original -

 Il 25/06/2013 22:53, Peter Maydell ha scritto:
  On 25 June 2013 19:42, Paolo Bonzini pbonz...@redhat.com wrote:
  Il 25/06/2013 20:21, Peter Maydell ha scritto:
  @@ -416,7 +513,7 @@ static int imx_timerg_init(SysBusDevice *dev)
 
  sysbus_init_irq(dev, s-irq);
  memory_region_init_io(s-iomem, imx_timerg_ops,
  - s, imxg-timer,
  + s, TYPE_IMX_GPT,
  0x1000);
  sysbus_init_mmio(dev, s-iomem);
 
 
  There was some agreement that this is not a good change.
 
  I agree (and more so regarding the use of the macro in the
  vmstate name), but nobody actually posted any comment to
  that effect against any of the versions of this patch that
  got sent out for review...

 Yeah, the timing was bad... Can you post a revert, though?

 Paolo


Re: [Qemu-devel] [PULL 2/8] i.MX: Implement a more complete version of the GPT timer.

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 09:08, j...@tribudubois.net ha scritto:
 For my own education, could you elaborate on the reason this is not a
 good change?

Quoting from http://article.gmane.org/gmane.comp.emulators.qemu/218078:

 The MemoryRegion name is free text that can be changed as desired and
 doesn't need to match the type name.
 
 The TYPE_* constant serves to keep consistency between usages of the
 string literal but does not prohibit changing its value.
 
 The VMStateDescription should thus not use the TYPE_* constant because
 it must not ever change for live migration. Renaming a type is a valid
 thing to do if there is no command line compatibility to keep (think
 board-level xlnx.foo - xlnx,foo).

Paolo



Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-06-26 Thread Markus Armbruster
Eric Blake ebl...@redhat.com writes:

 On 05/23/2013 03:08 AM, Amos Kong wrote:
 We want to implement mac programming over macvtap through Libvirt,
 related rx-filter configuration contains main mac, some of rx-mode
 and mac-table.
 
 The previous patch adds QMP event to notify management of rx-filter
 change. This patch adds a monitor command to query rx-filter
 information.
 
 A flag is used to avoid events flooding, if user don't query

 s/don't/doesn't/

 rx-filter after receives one event, new events won't be sent

 s/after/after it/

 to qmp monitor.
 
 +++ b/net/net.c
 @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState *nc)
 nc-info_str);
  }
  
 +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name,
 +  Error **errp)
 +{
 +NetClientState *nc;
 +RxFilterInfoList *filter_list = NULL, *last_entry = NULL;
 +
 +QTAILQ_FOREACH(nc, net_clients, next) {
 +RxFilterInfoList *entry;
 +RxFilterInfo *info;
 +
 +if (nc-info-type != NET_CLIENT_OPTIONS_KIND_NIC) {
 +continue;
 +}
 +if (has_name  strcmp(nc-name, name) != 0) {

 Do you need the has_name argument here, or can you ensure that the
 caller passes NULL when the caller's has_name was false, for one less
 parameter and the same amount of information?

QAPI always generates a has_FOO parameter for an optional FOO parameter,
even when FOO is a pointer, which has the perfectly obvious special not
given value NULL.  I hate that.

[...]



Re: [Qemu-devel] [RFC PATCH 2/3] block: assign backing relationship in drive-backup

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 05:59, Fam Zheng ha scritto:
 Assign source image as the backing hd of target bs, so reading target bs
 gets the point-in-time copy of data from source image.
 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  blockdev.c | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/blockdev.c b/blockdev.c
 index 5e694f3..a1d816e 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -1504,6 +1504,10 @@ void qmp_drive_backup(const char *device, const char 
 *target,
  return;
  }
  
 +target_bs-backing_hd = bs;
 +pstrcpy(target_bs-backing_file, sizeof(target_bs-backing_file),
 +bs-filename);

backing_format missing.

Paolo

  backup_start(bs, target_bs, speed, on_source_error, on_target_error,
   block_job_cb, bs, local_err);
  if (local_err != NULL) {
 




Re: [Qemu-devel] [PULL 2/8] i.MX: Implement a more complete version of the GPT timer.

2013-06-26 Thread jcd
Thanks for the info. 

Is this consolidated into some developper documentation somewhere (a wiki, 
...)? 

JC 

- Mail original -

 Il 26/06/2013 09:08, j...@tribudubois.net ha scritto:
  For my own education, could you elaborate on the reason this is not a
  good change?

 Quoting from http://article.gmane.org/gmane.comp.emulators.qemu/218078:

  The MemoryRegion name is free text that can be changed as desired and
  doesn't need to match the type name.
 
  The TYPE_* constant serves to keep consistency between usages of the
  string literal but does not prohibit changing its value.
 
  The VMStateDescription should thus not use the TYPE_* constant because
  it must not ever change for live migration. Renaming a type is a valid
  thing to do if there is no command line compatibility to keep (think
  board-level xlnx.foo - xlnx,foo).

 Paolo


Re: [Qemu-devel] [PULL 2/8] i.MX: Implement a more complete version of the GPT timer.

2013-06-26 Thread Peter Crosthwaite
Hi

On Wed, Jun 26, 2013 at 4:56 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 25/06/2013 22:53, Peter Maydell ha scritto:
 On 25 June 2013 19:42, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 25/06/2013 20:21, Peter Maydell ha scritto:
 @@ -416,7 +513,7 @@ static int imx_timerg_init(SysBusDevice *dev)

  sysbus_init_irq(dev, s-irq);
  memory_region_init_io(s-iomem, imx_timerg_ops,
 -  s, imxg-timer,
 +  s, TYPE_IMX_GPT,
0x1000);
  sysbus_init_mmio(dev, s-iomem);


 There was some agreement that this is not a good change.

 I agree (and more so regarding the use of the macro in the
 vmstate name), but nobody actually posted any comment to
 that effect against any of the versions of this patch that
 got sent out for review...

 Yeah, the timing was bad... Can you post a revert, though?


The original string being replaced was a poor choice as well. IIUC the
consensus was string field of the memory regions is supposed to
indicate the purpose of the memory region for the device. Good
examples would be Control regs or MMIO. Naming the memory region
after the device type is a redundancy as that info will come via
memory region owners.

So rather than revert could you just choose something more descriptive?

Regards,
Peter

 Paolo




Re: [Qemu-devel] [RFC PATCH 0/3] Point-in-time snapshot exporting with drive-backup

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 05:59, Fam Zheng ha scritto:
 
 The block job originally had total control over target bs: when the job
 completes, the target is deleted. Now it's shared with NBD, so we should make
 sure the job completion or canceling wouldn't crash NBD server. This patch
 doesn't handle this case: if the block job is ended before NBD exporting,
 there'll be problem.

This should not be a problem.  When the job completes or cancels it will
close s-target.  The NBD server has installed a close notifier for
s-target, and will close connections with the clients.

Similarly, if the source disk is hot-unplugged, the bdrv_close will
cancel the job, which in turn closes connections with the clients.

So this just works as long as no hot-unplug happens and as long as the
job doesn't complete: the client can use the point-in-time snapshot via
NBD as long as it cares to.

This leads to another observation: a sync:'none' block-backup job
probably should never complete, and should instead go on until explicit
cancellation.  This is because the job does no background writes, and
thus completion would only happen after the guest has written the whole
disk.  Writing the whole disk is rare enough that it will likely cause
bugs in the clients.  It is easier just to never complete the job.

Paolo



Re: [Qemu-devel] [RFC PATCH 0/3] Point-in-time snapshot exporting with drive-backup

2013-06-26 Thread Fam Zheng
On Wed, 06/26 09:23, Paolo Bonzini wrote:
 Il 26/06/2013 05:59, Fam Zheng ha scritto:
  
  The block job originally had total control over target bs: when the job
  completes, the target is deleted. Now it's shared with NBD, so we should 
  make
  sure the job completion or canceling wouldn't crash NBD server. This patch
  doesn't handle this case: if the block job is ended before NBD exporting,
  there'll be problem.
 
 This should not be a problem.  When the job completes or cancels it will
 close s-target.  The NBD server has installed a close notifier for
 s-target, and will close connections with the clients.
 
 Similarly, if the source disk is hot-unplugged, the bdrv_close will
 cancel the job, which in turn closes connections with the clients.
 
 So this just works as long as no hot-unplug happens and as long as the
 job doesn't complete: the client can use the point-in-time snapshot via
 NBD as long as it cares to.

OK, thanks for pointing this out.

 
 This leads to another observation: a sync:'none' block-backup job
 probably should never complete, and should instead go on until explicit
 cancellation.  This is because the job does no background writes, and
 thus completion would only happen after the guest has written the whole
 disk.  Writing the whole disk is rare enough that it will likely cause
 bugs in the clients.  It is easier just to never complete the job.
 

Yes, the sync mode none will simply run forever until cancelled.

-- 
Fam



Re: [Qemu-devel] block: Review of .has_zero_init use

2013-06-26 Thread Kevin Wolf
Am 26.06.2013 um 08:46 hat Bharata B Rao geschrieben:
 On Wed, Jun 26, 2013 at 07:59:27AM +0200, Peter Lieven wrote:
  
  Am 26.06.2013 um 05:14 schrieb Bharata B Rao bhar...@linux.vnet.ibm.com:
  
   On Tue, Jun 25, 2013 at 01:39:11PM +0200, Kevin Wolf wrote:
   
   Can you please review for the gluster, rbd, sheepdog and ssh driver
   whether it's safe to assume that the image reads back as zeros after
   bdrv_create?
   
   Gluster supports both file and block backends. While the above is true for
   file backend (which uses ftruncate), the same is not true for
   block backend (which uses lvcreate  lvresize).
   
   So overall it is not safe to assume that an image on GlusterFS volume
   reads back as zeroes after create.
  
  Okay, so for safety we have to return has_zero_init = 0. Erroneously 
  assuming
  a device is zero initialized can bring severe filesystem corruption.
  
  Do you see a way to query the information of the underlaying
  backend from the storage and return 1 or 0 conditionally?
 
 Right now no. There have been efforts to get the capabilities (file, block 
 etc)
 of gluster volume from gluster cmdline, but there haven't been discussions
 about providing such capabilities from libgfapi which is the library
 QEMU uses to talk with gluster.
 
 I see has_zero_init being used from qemu-img convert path. Is there any
 other use of this in QEMU ?

No, it's the only user.

So what this means is that using 'qemu-img convert' with a glusterfs
target corrupts the image if the volume is backed by a block device.

Kevin



Re: [Qemu-devel] [PATCH] block/ssh: Set bdrv_has_zero_init according to the file type.

2013-06-26 Thread Kevin Wolf
Am 25.06.2013 um 19:15 hat Richard W.M. Jones geschrieben:
 From: Richard W.M. Jones rjo...@redhat.com
 
 If the remote is a regular file, set it to true (ie. reads of
 uninitialized areas in a newly created file will return zeroes).
 If we can't prove that, return false (a safe default).
 
 Tested by adding a debugging print statement [not part of this commit]
 and creating a remote file and a remote block device:
 
   $ ./qemu-img create ssh://localhost/tmp/new 100M
   Formatting 'ssh://localhost/tmp/new', fmt=raw size=104857600
   filename ssh://localhost/tmp/new: has_zero_init = 1
   $ sudo lvcreate -L 1G -n tmp /dev/fedora
 Logical volume tmp created
   $ ./qemu-img create ssh://localhost/dev/fedora/tmp 1G
   Formatting 'ssh://localhost/dev/fedora/tmp', fmt=raw size=1073741824
   filename ssh://localhost/dev/fedora/tmp: has_zero_init = 0
 
 Cc: Kevin Wolf kw...@redhat.com
 Signed-off-by: Richard W.M. Jones rjo...@redhat.com

Thanks, applied to the block branch, and CCed qemu-stable.

Kevin



Re: [Qemu-devel] [PATCH] block: add drive_backup HMP command

2013-06-26 Thread Stefan Hajnoczi
On Tue, Jun 25, 2013 at 04:36:53PM +0200, Paolo Bonzini wrote:
 Il 25/06/2013 16:06, Kevin Wolf ha scritto:
  Am 25.06.2013 um 15:49 hat Paolo Bonzini geschrieben:
  Il 25/06/2013 15:26, Kevin Wolf ha scritto:
  +if (!full) {
  +error_setg(errp, -f is not yet implemented);
  +hmp_handle_error(mon, errp);
  +return;
  +}
  Then why make it a valid option and confuse users in the help text by
  describing options that don't really exist?
 
  Because otherwise we're stuck with a meaning of the flag that is
  different between drive-mirror and block-backup.
  
  Do you mean when otherwise isn't only we don't add -f now, but also
  we accidentally add a -f with different meaning later? Not sure if
  there's a real danger of that when we're aware that we want -f with the
  same meaning as for mirroring.
 
 We have drive-mirror with:
 * the default is 'top'
 * -f gives 'full'
 
 block-backup for now only implements 'full'.  If we do not force the
 user to add -f, the default is 'full' and we should not change it later.
 
 However, I would move the not yet implemented error from HMP to QMP.
 This way, both drive-mirror and block-backup will have a mandatory
 'sync' argument.  We plan to implement it anyway, and it makes sense imo
 to avoid gratuitous differences in the APIs.

Thanks, I should have explained this in the commit message.  Requiring
-f now avoids changing semantics later when 'top' becomes the default to
match drive-mirror.

I'll move the error into qmp_drive_backup().

Stefan



[Qemu-devel] [PATCH] gluster: Return bdrv_has_zero_init = 0

2013-06-26 Thread Kevin Wolf
GlusterFS volumes can be backed by block devices, in which case
bdrv_create() doesn't make sure that the image is zeroed out. It is
currently not possibly to detect whether a given image is backed by a
file or a block device, and incorrectly assuming that it is zeroed
corrupts images during qemu-img convert, so let's err on the side of
caution and always return 0.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/gluster.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 91acde2..61424bc 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -574,6 +574,12 @@ static void qemu_gluster_close(BlockDriverState *bs)
 glfs_fini(s-glfs);
 }
 
+static int qemu_gluster_has_zero_init(BlockDriverState *bs)
+{
+/* GlusterFS volume could be backed by a block device */
+return 0;
+}
+
 static QEMUOptionParameter qemu_gluster_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -595,6 +601,7 @@ static BlockDriver bdrv_gluster = {
 .bdrv_aio_readv   = qemu_gluster_aio_readv,
 .bdrv_aio_writev  = qemu_gluster_aio_writev,
 .bdrv_aio_flush   = qemu_gluster_aio_flush,
+.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -610,6 +617,7 @@ static BlockDriver bdrv_gluster_tcp = {
 .bdrv_aio_readv   = qemu_gluster_aio_readv,
 .bdrv_aio_writev  = qemu_gluster_aio_writev,
 .bdrv_aio_flush   = qemu_gluster_aio_flush,
+.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -625,6 +633,7 @@ static BlockDriver bdrv_gluster_unix = {
 .bdrv_aio_readv   = qemu_gluster_aio_readv,
 .bdrv_aio_writev  = qemu_gluster_aio_writev,
 .bdrv_aio_flush   = qemu_gluster_aio_flush,
+.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -640,6 +649,7 @@ static BlockDriver bdrv_gluster_rdma = {
 .bdrv_aio_readv   = qemu_gluster_aio_readv,
 .bdrv_aio_writev  = qemu_gluster_aio_writev,
 .bdrv_aio_flush   = qemu_gluster_aio_flush,
+.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
 .create_options   = qemu_gluster_create_options,
 };
 
-- 
1.8.1.4




[Qemu-devel] Google Summer of Code 2013 has started

2013-06-26 Thread Stefan Hajnoczi
It is a pleasure to welcome the following GSoC 2013 students to the
QEMU, KVM, and libvirt communities:

Libvirt Wireshark Dissector - Yuto KAWAMURA (kawamuray)
http://qemu-project.org/Features/LibvirtWiresharkDissector

Libvirt Introduce API to query IP addresses for given domain - Nehal
J. Wani (nehaljwani)
http://www.google-melange.com/gsoc/project/google/gsoc2013/nehaljwani/51001

Libvirt More Intelligent virsh auto-completion - Tomas Meszaros
http://www.google-melange.com/gsoc/project/google/gsoc2013/examon/13001

QEMU Integrated Copy-Paste - Ozan Çağlayan and Pallav Agrawal (pallav)
http://qemu-project.org/Features/IntegratedCopyPaste

QEMU Continuation Passing C - Charlie Shepherd (cs648)
http://qemu-project.org/Features/Continuation-Passing_C

QEMU Kconfig - Ákos Kovács
http://qemu-project.org/Features/Kconfig

QEMU USB Media Transfer Protocol emulation - a|mond
http://www.google-melange.com/gsoc/project/google/gsoc2013/almond/1001

KVM Nested Virtualization Testsuite - Arthur Chunqi Li (xelatex)
http://www.google-melange.com/gsoc/project/google/gsoc2013/xelatex/19001

Coding started on Monday, 17th of June and ends Monday, 23rd of September.

Feel free to follow these projects - feature pages are being created
with git repo and blog links.

Stefan



Re: [Qemu-devel] [PATCH V2] build: remove compile warning

2013-06-26 Thread Paolo Bonzini
Il 24/06/2013 19:58, Stefan Weil ha scritto:
 hw/ppc/spapr_hcall.c:188:1: warning: control reaches end of non-void
 function [-Wreturn-type]
 hw/ppc/spapr_pci.c:454:1: warning: control reaches end of non-void
 function [-Wreturn-type]
 I think you could report this to mingw.  GCC should handle if (!0)
 foo() just fine if foo is noreturn, perhaps the assertion failure
 runtime function is not noreturn in mingw.
 
 It's a gcc problem. Removing the assertion manually in the source
 code and compiling with NDEBUG (which we do by default)results
 in the same compiler warning.

Huh?  That seems wrong, assertions are there for a reason...

Paolo



Re: [Qemu-devel] [PATCH 0/3] spapr: Fix remaining compiler warnings

2013-06-26 Thread Paolo Bonzini
Il 24/06/2013 19:48, Stefan Weil ha scritto:
 Hi Alex,
 
 this mini series fixes the remaining compiler warnings in
 my w32/w64 cross build environment on Debian wheezy.
 
 [PATCH 1/3] spapr: Use named enum for function remove_hpte
 [PATCH 2/3] spapr: Fix compiler warning for some versions of gcc
 [PATCH 3/3] spapr: Fix compiler warning for some versions of gcc
 
 The modifications are nearly trivial, but this time it would
 be better if you could apply them to your patch queue.

I think these patches are a bit backwards.

The right fix is just to use abort() instead of assert(0).

Paolo



Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug

2013-06-26 Thread liu ping fan
On Wed, Jun 26, 2013 at 2:34 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 26/06/2013 04:59, liu ping fan ha scritto:
 The latter part could be the hard one in a multi-threaded context, but I
 think it's up to the device to ensure it.  It doesn't _have_ to be hard.
  For example, joining the data-plane thread would do that as well.

 It seems not easy, take consideration of the sequence:
 _bh_delete(), // ensure not scheduled, ie, no further access to 
 DeviceState
 wait for rcu grace period to come, // ensure not running
 free DeviceState in rcu-reclaimer
 But  in current code, _delete() can be placed in DeviceState
 finalization path(in reclaimer), which means while reclaiming, there
 still exist access path to the DeviceState.

 It can be placed there, but it doesn't mean it is a good idea. :)

Unfortunately , these is what current code does, otherwise, they
should _bh_new() each time when scheduling bh, and _bh_delete() in
bh's callback, i.e. callback is the exact place to know we have done
with bh. (The other place is the destroy of DeviceState)

 On the other hand, pushing _delete() out of  finalization path is not
 easy, since we do not what time the DeviceState has done with its bh.

 See above:

 - if the BH will run in the iothread, the BH is definitely not running
 (because the BH runs under the BQL, and the reclaimer owns it)

It works for the case in which the DeviceState's reclaimer calls
_bh_delete(). But in another case(also exist in current code), where
we call _bh_delete() in callback, the bh will be scheduled, and oops!

 - if the BH is running in another thread, waiting for that thread to
 terminate obviously makes the BH not running.

This imply that except for qemu_aio_context, AioContext can be only
shared by one device, right? Otherwise we can not just terminate the
thread. If it is true, why can not we have more than one just like
qemu_aio_context?

 What we need is separation of removal and reclamation.  Without that,
 any effort to make things unplug-safe is going to be way way more
 complicated than necessary.

Agree, but when I tried for bh, I found the separation of removal and
reclamation are not easy.  We can not _bh_delete() in
acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the
same time. And as explained, only two places can hold the
_bh_delete().
In a short word, with rcu, we need to constrain the calling idiom of
bh, i.e., putting them in reclaimer.  On the other hand, my patch try
to leave the calling idiom as it is, and handle this issue inside bh.

Regards,
Pingfan

 Paolo



Re: [Qemu-devel] [PATCH 10/15] target-i386: convert 'hv_spinlocks' to static property

2013-06-26 Thread Igor Mammedov
On Tue, 25 Jun 2013 17:34:44 -0300
Eduardo Habkost ehabk...@redhat.com wrote:

 On Tue, Jun 25, 2013 at 05:30:50PM -0300, Eduardo Habkost wrote:
  On Wed, Jun 05, 2013 at 03:18:41PM +0200, Igor Mammedov wrote:
   Signed-off-by: Igor Mammedov imamm...@redhat.com
   ---
  [...]
   @@ -1632,6 +1677,7 @@ static void cpu_x86_parse_featurestr(X86CPU *cpu, 
   char *features, Error **errp)
} else if (!strcmp(featurestr, hv-spinlocks)) {
char *err;
const int min = 0xFFF;
   +char num[32];
numvalue = strtoul(val, err, 0);
if (!*val || *err) {
error_setg(errp, bad numerical value %s, val);
   @@ -1643,7 +1689,8 @@ static void cpu_x86_parse_featurestr(X86CPU *cpu, 
   char *features, Error **errp)
min);
numvalue = min;
}
   -env-hyperv_spinlock_attempts = numvalue;
   +snprintf(num, sizeof(num), % PRId32, numvalue);
   +object_property_parse(OBJECT(cpu), num, featurestr, 
   errp);
  
  Why not use object_property_set_int()?
 
 Oh, I believe I have asked that before and you have already explained
 it: you are using strings to allow the existing object_property_parse()
 calls to be easily converted to qdev_prop_register_global() calls later.
 Correct?
 

Yep, this will be used later to consolidate code.



Re: [Qemu-devel] [PATCH 08/15] target-i386: cpu: convert 'tsc-frequency' to static property

2013-06-26 Thread Igor Mammedov
On Mon, 24 Jun 2013 14:09:43 -0300
Eduardo Habkost ehabk...@redhat.com wrote:

 On Wed, Jun 05, 2013 at 03:18:39PM +0200, Igor Mammedov wrote:
  Signed-off-by: Igor Mammedov imamm...@redhat.com
  ---
   target-i386/cpu.c |   12 +---
   1 files changed, 9 insertions(+), 3 deletions(-)
  
  diff --git a/target-i386/cpu.c b/target-i386/cpu.c
  index 9f6fe06..ec6d33f 100644
  --- a/target-i386/cpu.c
  +++ b/target-i386/cpu.c
  @@ -1427,6 +1427,14 @@ static void x86_cpuid_set_tsc_freq(Object *obj, 
  Visitor *v, void *opaque,
   cpu-env.tsc_khz = value / 1000;
   }
   
  +static PropertyInfo qdev_prop_tsc_freq = {
  +.name  = int64,
 
 What about making it uint64 instead of keeping it signed? We already
 reject negative values.
 
 As this patch just converts the current code/semantics to use static
 properties, and the change to uint64 could be done in a separate patch:
Yes, purpose of patch is to make conversion and nothing else, so it would be 
easy
to review.

 
 Reviewed-by: Eduardo Habkost ehabk...@redhat.com
 
 
  +.get   = x86_cpuid_get_tsc_freq,
  +.set   = x86_cpuid_set_tsc_freq,
  +};
  +#define DEFINE_PROP_TSC_FREQ(_n)   
  \
  +DEFINE_PROP(_n, X86CPU, env.tsc_khz, qdev_prop_tsc_freq, int32_t)
  +
   static void x86_cpuid_get_apic_id(Object *obj, Visitor *v, void *opaque,
 const char *name, Error **errp)
   {
  @@ -1509,6 +1517,7 @@ static Property cpu_x86_properties[] = {
   DEFINE_PROP_UINT32(xlevel, X86CPU, env.cpuid_xlevel, 0),
   DEFINE_PROP_VENDOR(vendor),
   DEFINE_PROP_MODEL_ID(model-id),
  +DEFINE_PROP_TSC_FREQ(tsc-frequency),
   DEFINE_PROP_END_OF_LIST(),
   };
   
  @@ -2488,9 +2497,6 @@ static void x86_cpu_initfn(Object *obj)
   cs-env_ptr = env;
   cpu_exec_init(env);
   
  -object_property_add(obj, tsc-frequency, int,
  -x86_cpuid_get_tsc_freq,
  -x86_cpuid_set_tsc_freq, NULL, NULL, NULL);
   object_property_add(obj, apic-id, int,
   x86_cpuid_get_apic_id,
   x86_cpuid_set_apic_id, NULL, NULL, NULL);
  -- 
  1.7.1
  
  
 




Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 10:20, liu ping fan ha scritto:
 On the other hand, pushing _delete() out of  finalization path is not
 easy, since we do not what time the DeviceState has done with its bh.

 See above:

 - if the BH will run in the iothread, the BH is definitely not running
 (because the BH runs under the BQL, and the reclaimer owns it)

 It works for the case in which the DeviceState's reclaimer calls
 _bh_delete(). But in another case(also exist in current code), where
 we call _bh_delete() in callback, the bh will be scheduled, and oops!

But you may know that the BH is not scheduled after removal, too.

There are not that many devices that have bottom halves (apart from
those that use bottom halves in ptimer).  If they really need to, they
can do the object_ref/unref themselves.  But I expect this to be rare,
and generic code should not need an owner field in bottom halves.  For
example, block devices should stop sending requests after removal.

 - if the BH is running in another thread, waiting for that thread to
 terminate obviously makes the BH not running.

 This imply that except for qemu_aio_context, AioContext can be only
 shared by one device, right? Otherwise we can not just terminate the
 thread. If it is true, why can not we have more than one just like
 qemu_aio_context?

Yes, if you do it that way you can have only one AioContext per device.
 RCU is another way, and doesn't have the same limitation.

We should avoid introducing infrastructure that we are not sure is
needed.  For what it's worth, your patches to make the bottom half list
thread-safe are also slightly premature.  However, they do not change
the API and it makes some sense to add the infrastructure.  In this
case, I'm simply not sure that we're there yet.

If you look at the memory work, for example, the owner patches happen to
be useful for BQL-less dispatch too, but they are solving existing
hot-unplug bugs.

 What we need is separation of removal and reclamation.  Without that,
 any effort to make things unplug-safe is going to be way way more
 complicated than necessary.

 Agree, but when I tried for bh, I found the separation of removal and
 reclamation are not easy.  We can not _bh_delete() in
 acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the
 same time.

acpi_piix_eject_slot() is removal, not reclamation.  The plan there is
that qdev_free calls the exit callback immediately (which can do
qemu_bh_delete), and schedules an unref after the next RCU grace period.
 At the next RCU grace period the BH will not be running, thus it is
safe to finalize the object.

Paolo

 And as explained, only two places can hold the
 _bh_delete().
 In a short word, with rcu, we need to constrain the calling idiom of
 bh, i.e., putting them in reclaimer.  On the other hand, my patch try
 to leave the calling idiom as it is, and handle this issue inside bh.
 
 Regards,
 Pingfan
 
 Paolo




Re: [Qemu-devel] [PATCH] target-arm: Added CP15 VBAR support

2013-06-26 Thread Peter Maydell
On 26 June 2013 02:57, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote:
 On Wed, Feb 27, 2013 at 8:10 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 This one's tricky because the VBAR only exists in CPUs with
 TrustZone, and strictly speaking QEMU models a non-TrustZone
 core. However we already have some dummy cp15 registers which
 are really TZ-only just to get guests working, and this may
 well fall into that category.

 I need to have a think about how we can coherently define
 what QEMU does about TZ so we can have a better idea of
 what should and shouldn't be implemented. (If you have any
 suggestions I'm open to them.)


 ARM_FEATURE_TZ?

Yes, but what should this imply about QEMU's behaviour?
Naming a feature switch is easy, semantics are hard.

-- PMM



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Peter Maydell
On 26 June 2013 00:38, David Gibson da...@gibson.dropbear.id.au wrote:
 On Mon, Jun 24, 2013 at 12:02:39PM +0100, Peter Maydell wrote:
 On 24 June 2013 11:56, Alexander Graf ag...@suse.de wrote:
  This looks pretty complicated for something actually quite
  simple: All you want is to pass in a number of 64bit values,
  rather than 32bit ones, right?

 Nope. If the device tree blob says #address-cells is 1
 and #size-cells is 1, then I want to pass in values to
 go in 32 bit cells. If it says #address-cells is 2 but
 #size-cells is still 1, then I want to pass in a value
 for a 64 bit cell then one for a 32 bit cell. If they're
 both 2 then I want to pass in values for two 64 bit
 cells.

 Hmm.. the property certainly needs to be constructed that way.  But I
 think Alex's point is that you could make the arguments all 64-bit,
 and then truncate them in the generated property.

Er, the arguments *are* all 64 bits and truncated
in the generated property:
+ * @...: 0-terminated list of uint32_t number-of-cells, uint64_t value pairs

 There's a bigger problem, though, that exists with both versions.  In
 general people expect integer arguments like this not to care too much
 about the exact integer type, because it will be promoted to the right
 thing.  Except with varargs it won't.  So if you ever have a
 notionally 64-bit address that happens to fit in a 32-bit variable and
 you pass that it, this function will be broken.  And the worst of it
 is, it'll work most of the time, until you happen to hit the wrong ABI
 and parameter combination :(.

Do you have a suggested improvement to the API to avoid this?

thanks
-- PMM



[Qemu-devel] Libvirt : Bootstrap fails for local gnulib

2013-06-26 Thread chandrashekar shastri

Hi All,

The Libvirt compliation fails to bootstarp for local gnulib. We do not 
have outbound access for the test machines.

So, we ll pull the gnulib from and copy to the test machines.

When we run ./bootstrap --no-git  --gnulib-srcdir=/path/to/local, below 
is the error:


./bootstrap: Bootstrapping from checked-out libvirt sources...
./bootstrap: consider installing git-merge-changelog from gnulib
./bootstrap: getting gnulib files...
fatal: Needed a single revision
Unable to find current revision in submodule path '.gnulib'


If we run autogen.sh it tries to pull from git though the $GNULIB_SRCDIR 
is exported with the local gnulib.


./autogen.sh '--no-git'
I am going to run ./configure with no arguments - if you wish
to pass any to it, please specify them on the ./autogen.sh command line.
running bootstrap --no-git...
./bootstrap: Bootstrapping from checked-out libvirt sources...
./bootstrap: consider installing git-merge-changelog from gnulib
./bootstrap: getting gnulib files...
fatal: Unable to look up git.sv.gnu.org (port 9418) (Name or service not 
known)

Unable to fetch in submodule path '.gnulib'
Failed to bootstrap, please investigate.

Please provide us the workaround for the same.

Thanks,
Chandrashekar




[Qemu-devel] [PATCH v5 06/14] acpi_piix4 : Implement memory device hotplug registers

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

A 32-byte register is used to present up to 256 hotplug-able memory devices
to BIOS and OSPM. Hot-add and hot-remove functions trigger an ACPI hotplug
event through these. Only reads are allowed from these registers.

An ACPI hot-remove event but needs to wait for OSPM to eject the device.
We use a single-byte register to know when OSPM has called the _EJ function
for a particular dimm. A write to this byte will depopulate the respective dimm.
Only writes are allowed to this byte.

v1-v2:
mems_sts address moved from 0xaf20 to 0xaf80 (to accomodate more space for
cpu-hotplugging in the future).
_EJ array is reduced to a single byte.
Add documentation in docs/specs/acpi_hotplug.txt

v3-v4: Removed hot-remove functions, will be added separately. Updated for
memory API.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 docs/specs/acpi_mem_hotplug.txt | 14 
 hw/acpi/piix4.c | 72 -
 2 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/acpi_mem_hotplug.txt

diff --git a/docs/specs/acpi_mem_hotplug.txt b/docs/specs/acpi_mem_hotplug.txt
new file mode 100644
index 000..8391713
--- /dev/null
+++ b/docs/specs/acpi_mem_hotplug.txt
@@ -0,0 +1,14 @@
+QEMU-ACPI BIOS hotplug interface
+--
+This document describes the interface between QEMU and the ACPI BIOS for 
non-PCI
+space. For the PCI interface please look at docs/specs/acpi_pci_hotplug.txt
+
+QEMU-ACPI BIOS memory hotplug interface
+--
+
+Memory Dimm status array (IO port 0xaf80-0xaf9f, 1-byte access):
+---
+Dimm hot-plug notification pending. One bit per slot.
+
+Read by ACPI BIOS GPE.3 handler to notify OS of memory hot-add or hot-remove
+events.  Read-only.
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 756df3b..2795ab0 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -29,6 +29,7 @@
 #include exec/ioport.h
 #include hw/nvram/fw_cfg.h
 #include exec/address-spaces.h
+#include hw/mem-hotplug/dimm.h
 
 //#define DEBUG
 
@@ -50,9 +51,12 @@
 
 #define PIIX4_PROC_BASE 0xaf00
 #define PIIX4_PROC_LEN 32
+#define PIIX4_MEM_BASE 0xaf80
+#define PIIX4_MEM_LEN 32
 
 #define PIIX4_PCI_HOTPLUG_STATUS 2
 #define PIIX4_CPU_HOTPLUG_STATUS 4
+#define PIIX4_MEM_HOTPLUG_STATUS 8
 
 struct pci_status {
 uint32_t up; /* deprecated, maintained for migration compatibility */
@@ -63,6 +67,10 @@ typedef struct CPUStatus {
 uint8_t sts[PIIX4_PROC_LEN];
 } CPUStatus;
 
+typedef struct MemStatus {
+uint8_t mems_sts[PIIX4_MEM_LEN];
+} MemStatus;
+
 typedef struct PIIX4PMState {
 PCIDevice dev;
 
@@ -70,6 +78,7 @@ typedef struct PIIX4PMState {
 MemoryRegion io_gpe;
 MemoryRegion io_pci;
 MemoryRegion io_cpu;
+MemoryRegion io_mem;
 ACPIREGS ar;
 
 APMState apm;
@@ -94,6 +103,8 @@ typedef struct PIIX4PMState {
 
 CPUStatus gpe_cpu;
 Notifier cpu_added_notifier;
+
+MemStatus  gpe_mem;
 } PIIX4PMState;
 
 static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
@@ -113,7 +124,8 @@ static void pm_update_sci(PIIX4PMState *s)
ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
 (((s-ar.gpe.sts[0]  s-ar.gpe.en[0]) 
-  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
+  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS |
+   PIIX4_MEM_HOTPLUG_STATUS)) != 0);
 
 qemu_set_irq(s-irq, sci_level);
 /* schedule a timer interruption if needed */
@@ -664,8 +676,40 @@ static void piix4_init_cpu_status(CPUState *cpu, void 
*data)
 g-sts[id / 8] |= (1  (id % 8));
 }
 
+static uint32_t mem_status_readb(void *opaque, uint32_t addr)
+{
+PIIX4PMState *s = opaque;
+uint32_t val = 0;
+MemStatus *g = s-gpe_mem;
+if (addr  PIIX4_MEM_LEN) {
+val = (uint32_t) g-mems_sts[addr];
+}
+PIIX4_DPRINTF(memhp read % PRIu32  == % PRIu32 \n, addr, val);
+return val;
+}
+
+static const MemoryRegionOps mem_hotplug_ops = {
+.old_portio = (MemoryRegionPortio[]) {
+{
+.offset = 0,   .len = PIIX4_MEM_LEN, .size = 1,
+.read = mem_status_readb,
+},
+PORTIO_END_OF_LIST()
+},
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static void piix4_init_mem_status(PIIX4PMState *s)
+{
+int i;
+for (i = 0; i  PIIX4_MEM_LEN; i++) {
+s-gpe_mem.mems_sts[i] = 0;
+}
+}
+
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
 PCIHotplugState state);
+static int piix4_mem_hotplug(DeviceState *qdev, DimmDevice *dev, int add);
 
 static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
PCIBus *bus, PIIX4PMState *s)

[Qemu-devel] [PATCH v5 00/14] ACPI memory hotplug

2013-06-26 Thread Hu Tao
It's been quite a while since v4 and lots of changes happend
in qemu and v4 just can't apply anymore. So this series is
basically a rebase. Another purpose is to bring up discussions
to make consensus on some questions since v4, see
http://lists.gnu.org/archive/html/qemu-devel/2013-01/msg01219.html
and http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg05332.html

This series works with seabios counterpart.

changes from v4:

  - rebased on a recent qemu-git
  - based on another series which seperates i440fx refactor from v4.
http://lists.nongnu.org/archive/html/qemu-devel/2013-06/msg03500.html
  - hot-unplug patches not included, as suggested by Vasilis, since
hot-unplug has some more complications with refcounting memory regions.
  - fix some copy-paste errors in qapi-schema.json.

v4: http://lists.gnu.org/archive/html/qemu-devel/2012-12/msg02693.html

Issues:

- hot-remove needs to only unmap the dimm device from guest's view. Freeing the
  memory should happen when the last user of the device (e.g. virtio-blk) unrefs
  the device. A testcase is needed for this.

- Live Migration: Ramblocks are migrated before qdev VMStates are migrated. So
  the DimmDevice is handled diferrently than other devices. Should this be
  reworked ?( DimmDevice structure currently does not define a 
VMStateDescription)
  Live migration works as long as the dimm layout (command line args) are
  identical at the source and destination qemu command line, and destination 
takes
  into account hot-operations that have occured on source. (v3 patch 10/19
  created the DimmDevice that corresponds to an unknown incoming ramblock, e.g.
  for a dimm that was hot-added on source. but has been dropped for the moment).

- A main blocker issue is windows guest functionality. The patchset does not
  work for windows currently.  Testing on win2012 server RC or windows2008
  consumer prerelease, when adding a DIMM, there is a BSOD with ACPI_BIOS_ERROR
  message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows
  pnpmem driver obviosuly has a problem with the seabios dimm implementation
  (or the seabios dimm implementation is not fully ACPI-compliant). If someone
  can review the seabios patches or has any ideas to debug this, let me know.

- hot-operation notification lists need to be added to migration state.

- If a virtio sg element straddles a ramblock boundary, virtio_map_sg can't
  handle this and qemu exits with virtio: Trying to map MMIO memory assertion.
  This was discovered with stress testing in a VM with hotplugged DIMMs. The
  top-most commit in the qemu repo above tries to fix this (803aedf0) but maybe
  you have a better idea. This problem is critical for hot-add upstreaming and
  needs to be solved. Review/discussion on the list/irc is necessary.

- q35 was supposed/rumoured to have native acpi hotplug support, but I haven't
  found it in the spec (and I think other people in the list didn't either). 
Iiuc
  the plan is to support paravirtual memory hotplug for both piix4 and q35.

Hu Tao (5):
  qapi: make visit_type_size fallback to v-type_int()
  Implement dimm device abstraction
  memory: record below_4g_mem_size, above_4g_mem_size
  memory controller: initialize dram controller.
  pc: Add dimm paravirt SRAT info

Vasilis Liaskovitis (9):
  Add SIZE type to qdev properties
  qemu-option: export parse_option_number
  vl: handle -device dimm
  acpi_piix4 : Implement memory device hotplug registers
  acpi_ich9 : Implement memory device hotplug registers
  Introduce paravirt interface QEMU_CFG_PCI_WINDOW
  Implement info memory and query-memory
  balloon: update with hotplugged memory
  Implement dimm-info

 default-configs/x86_64-softmmu.mak |   1 +
 docs/specs/acpi_mem_hotplug.txt|  14 ++
 docs/specs/fwcfg.txt   |  28 
 hmp-commands.hx|   4 +
 hmp.c  |  25 
 hmp.h  |   2 +
 hw/Makefile.objs   |   1 +
 hw/acpi/ich9.c |  56 ++-
 hw/acpi/piix4.c|  72 -
 hw/core/qdev-properties.c  |  61 
 hw/i386/pc.c   |  74 +++--
 hw/i386/pc_piix.c  |   1 +
 hw/i386/pc_q35.c   |  17 ++-
 hw/mem-hotplug/Makefile.objs   |   1 +
 hw/mem-hotplug/dimm.c  | 298 +
 hw/pci-host/piix.c |  13 ++
 hw/virtio/virtio-balloon.c |  13 +-
 include/hw/acpi/ich9.h |  10 ++
 include/hw/i386/pc.h   |   8 +
 include/hw/mem-hotplug/dimm.h  |  78 ++
 include/hw/nvram/fw_cfg.h  |   1 +
 include/hw/qdev-properties.h   |   3 +
 include/qemu/option.h  |   4 +
 include/sysemu/sysemu.h|   1 +
 monitor.c  |  14 ++
 qapi-schema.json   |  40 +
 qapi/qapi-visit-core.c |   6 +-
 qmp-commands.hx

[Qemu-devel] [PATCH v5 01/14] qapi: make visit_type_size fallback to v-type_int()

2013-06-26 Thread Hu Tao
In case of v-type_uint64 is NULL, fallback to v-type_int, by calling
visit_type_uint64().

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 qapi/qapi-visit-core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 401ee6e..7ae2061 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -239,7 +239,11 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char 
*name, Error **errp)
 void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
 {
 if (!error_is_set(errp)) {
-(v-type_size ? v-type_size : v-type_uint64)(v, obj, name, errp);
+if (v-type_size) {
+v-type_size(v, obj, name, errp);
+} else {
+visit_type_uint64(v, obj, name, errp);
+}
 }
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 10/14] pc: Add dimm paravirt SRAT info

2013-06-26 Thread Hu Tao
The numa_fw_cfg paravirt interface is extended to include SRAT information for
all hotplug-able dimms. There are 3 words for each hotplug-able memory slot,
denoting start address, size and node proximity. The new info is appended after
existing numa info, so that the fw_cfg layout does not break.  This information
is used by Seabios to build hotplug memory device objects at runtime.
nb_numa_nodes is set to 1 by default (not 0), so that we always pass srat info
to SeaBIOS.

v3-v4: numa_fw_cfg needs to be initalized after memory controller sets up dimm
ranges.  Make changes for pc_piix and pc_q35 to set numa_fw_cfg after i440fx
initialization.

v2-v3: setting nb_numa_nodes to 1 is not needed

v1-v2:
Dimm SRAT info (#dimms) is appended at end of existing numa fw_cfg in order not
to break existing layout
Documentation of the new fwcfg layout is included in docs/specs/fwcfg.txt

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 docs/specs/fwcfg.txt| 28 
 hw/i386/pc.c| 30 --
 hw/i386/pc_piix.c   |  1 +
 hw/i386/pc_q35.c|  7 +--
 include/hw/i386/pc.h|  1 +
 include/sysemu/sysemu.h |  1 +
 6 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 docs/specs/fwcfg.txt

diff --git a/docs/specs/fwcfg.txt b/docs/specs/fwcfg.txt
new file mode 100644
index 000..e6fcd8f
--- /dev/null
+++ b/docs/specs/fwcfg.txt
@@ -0,0 +1,28 @@
+QEMU-BIOS Paravirt Documentation
+--
+
+This document describes paravirt data structures passed from QEMU to BIOS.
+
+fw_cfg SRAT paravirt info
+
+The SRAT info passed from QEMU to BIOS has the following layout:
+
+---
+#nodes | cpu0_pxm | cpu1_pxm | ... | cpulast_pxm | node0_mem | node1_mem | ... 
| nodelast_mem
+
+---
+#dimms | dimm0_start | dimm0_sz | dimm0_pxm | ... | dimmlast_start | 
dimmlast_sz | dimmlast_pxm
+
+Entry 0 contains the number of numa nodes (nb_numa_nodes).
+
+Entries 1..max_cpus: The next max_cpus entries describe node proximity for each
+one of the vCPUs in the system.
+
+Entries max_cpus+1..max_cpus+nb_numa_nodes+1:  The next nb_numa_nodes entries
+describe the memory size for each one of the NUMA nodes in the system.
+
+Entry max_cpus+nb_numa_nodes+1 contains the number of memory dimms 
(nb_hp_dimms)
+
+The last 3 * nb_hp_dimms entries are organized in triplets: Each triplet 
contains
+the physical address offset, size (in bytes), and node proximity for the
+respective dimm.
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 65838a6..b51d3b5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -55,6 +55,7 @@
 #include hw/acpi/acpi.h
 #include hw/cpu/icc_bus.h
 #include hw/boards.h
+#include hw/mem-hotplug/dimm.h
 
 /* debug PC/ISA interrupts */
 //#define DEBUG_IRQ
@@ -606,8 +607,6 @@ static FWCfgState *bochs_bios_init(void)
 FWCfgState *fw_cfg;
 uint8_t *smbios_table;
 size_t smbios_len;
-uint64_t *numa_fw_cfg;
-int i, j;
 unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);
 
 fw_cfg = fw_cfg_init(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 1, 0, 0);
@@ -640,11 +639,25 @@ static FWCfgState *bochs_bios_init(void)
  e820_table, sizeof(e820_table));
 
 fw_cfg_add_bytes(fw_cfg, FW_CFG_HPET, hpet_cfg, sizeof(hpet_cfg));
+
+return fw_cfg;
+}
+
+void bochs_meminfo_bios_init(void *fw_cfg)
+{
+uint64_t *numa_fw_cfg;
+uint64_t *hp_dimms_fw_cfg;
+int i, j;
+unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);
+
 /* allocate memory for the NUMA channel: one (64bit) word for the number
  * of nodes, one word for each VCPU-node and one word for each node to
  * hold the amount of memory.
+ * Finally one word for the number of hotplug memory slots and three words
+ * for each hotplug memory slot (start address, size and node proximity).
  */
-numa_fw_cfg = g_new0(uint64_t, 1 + apic_id_limit + nb_numa_nodes);
+numa_fw_cfg = g_new0(uint64_t,
+ 2 + apic_id_limit + nb_numa_nodes  + 3 * nb_hp_dimms);
 numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
 for (i = 0; i  max_cpus; i++) {
 unsigned int apic_id = x86_cpu_apic_id_from_index(i);
@@ -659,11 +672,16 @@ static FWCfgState *bochs_bios_init(void)
 for (i = 0; i  nb_numa_nodes; i++) {
 numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(node_mem[i]);
 }
+
+numa_fw_cfg[1 + apic_id_limit + nb_numa_nodes] = cpu_to_le64(nb_hp_dimms);
+
+hp_dimms_fw_cfg = numa_fw_cfg + 2 + apic_id_limit + nb_numa_nodes;
+if (nb_hp_dimms) {
+dimm_setup_fwcfg_layout(hp_dimms_fw_cfg);
+}
 fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
- (1 + apic_id_limit + 

[Qemu-devel] [PATCH v5 05/14] vl: handle -device dimm

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 vl.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/vl.c b/vl.c
index 767e020..9d88a79 100644
--- a/vl.c
+++ b/vl.c
@@ -170,6 +170,7 @@ int main(int argc, char **argv)
 
 #include ui/qemu-spice.h
 #include qapi/string-input-visitor.h
+#include hw/mem-hotplug/dimm.h
 
 //#define DEBUG_NET
 //#define DEBUG_SLIRP
@@ -252,6 +253,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 int nb_numa_nodes;
 uint64_t node_mem[MAX_NODES];
 unsigned long *node_cpumask[MAX_NODES];
+int nb_hp_dimms;
 
 uint8_t qemu_uuid[16];
 
@@ -2338,6 +2340,50 @@ static int chardev_init_func(QemuOpts *opts, void 
*opaque)
 return 0;
 }
 
+static int dimmcfg_init_func(QemuOpts *opts, void *opaque)
+{
+const char *driver;
+const char *id;
+uint64_t node, size;
+uint32_t populated;
+const char *buf, *busbuf;
+
+/* DimmDevice configuration needs to be known in order to initialize 
chipset
+ * with correct memory and pci ranges. But all devices are created after
+ * chipset / machine initialization. In * order to avoid this problem, we
+ * parse dimm information earlier into dimmcfg structs. */
+
+driver = qemu_opt_get(opts, driver);
+if (!strcmp(driver, dimm)) {
+
+id = qemu_opts_id(opts);
+buf = qemu_opt_get(opts, size);
+parse_option_size(size, buf, size, NULL);
+buf = qemu_opt_get(opts, node);
+parse_option_number(node, buf, node, NULL);
+busbuf = qemu_opt_get(opts, bus);
+buf = qemu_opt_get(opts, populated);
+if (!buf) {
+populated = 0;
+} else {
+populated = strcmp(buf, on) ? 0 : 1;
+}
+
+dimm_config_create((char *)id, size, busbuf ? busbuf : membus.0,
+   node, nb_hp_dimms);
+
+/* if !populated, we just keep the config. The real device
+ * will be created in the future with a normal device_add
+ * command. */
+if (!populated) {
+qemu_opts_del(opts);
+}
+nb_hp_dimms++;
+}
+
+return 0;
+}
+
 #ifdef CONFIG_VIRTFS
 static int fsdev_init_func(QemuOpts *opts, void *opaque)
 {
@@ -4260,6 +4306,11 @@ int main(int argc, char **argv, char **envp)
 }
 qemu_add_globals();
 
+/* init generic devices */
+if (qemu_opts_foreach(qemu_find_opts(device),
+   dimmcfg_init_func, NULL, 1) != 0) {
+exit(1);
+}
 qdev_machine_init();
 
 QEMUMachineInitArgs args = { .ram_size = ram_size,
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 3/7] acpi-dsdt: Implement functions for memory hotplug

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Extend the DSDT to include methods for handling memory hot-add and hot-remove
notifications and memory device status requests. These functions are called
from the memory device SSDT methods.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 src/acpi-dsdt-mem-hotplug.dsl | 57 +++
 src/acpi-dsdt.dsl |  5 +++-
 2 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
new file mode 100644
index 000..1d82532
--- /dev/null
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -0,0 +1,57 @@
+/
+ * Memory hotplug
+ /
+
+Scope(\_SB) {
+/* Objects filled in by run-time generated SSDT */
+External(MTFY, MethodObj)
+External(MEON, PkgObj)
+
+Method (CMST, 1, NotSerialized) {
+// _STA method - return ON status of memdevice
+// Local0 = MEON flag for this cpu
+Store(DerefOf(Index(MEON, Arg0)), Local0)
+If (Local0) { Return(0xF) } Else { Return(0x0) }
+}
+
+/* Memory hotplug notify array */
+OperationRegion(MEST, SystemIO, 0xaf80, 32)
+Field (MEST, ByteAcc, NoLock, Preserve)
+{
+MES, 256
+}
+
+Method(MESC, 0) {
+// Local5 = active memdevice bitmap
+Store (MES, Local5)
+// Local2 = last read byte from bitmap
+Store (Zero, Local2)
+// Local0 = memory device iterator
+Store (Zero, Local0)
+While (LLess(Local0, SizeOf(MEON))) {
+// Local1 = MEON flag for this memory device
+Store(DerefOf(Index(MEON, Local0)), Local1)
+If (And(Local0, 0x07)) {
+// Shift down previously read bitmap byte
+ShiftRight(Local2, 1, Local2)
+} Else {
+// Read next byte from memdevice bitmap
+Store(DerefOf(Index(Local5, ShiftRight(Local0, 3))), 
Local2)
+}
+// Local3 = active state for this memory device
+Store(And(Local2, 1), Local3)
+
+If (LNotEqual(Local1, Local3)) {
+// State change - update MEON with new state
+Store(Local3, Index(MEON, Local0))
+// Do MEM notify
+If (LEqual(Local3, 1)) {
+MTFY(Local0, 1)
+}
+}
+Increment(Local0)
+}
+Return(One)
+}
+
+}
diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl
index 158f6b4..98c9413 100644
--- a/src/acpi-dsdt.dsl
+++ b/src/acpi-dsdt.dsl
@@ -294,6 +294,7 @@ DefinitionBlock (
 }
 
 #include acpi-dsdt-cpu-hotplug.dsl
+#include acpi-dsdt-mem-hotplug.dsl
 
 
 /
@@ -313,7 +314,9 @@ DefinitionBlock (
 // CPU hotplug event
 \_SB.PRSC()
 }
-Method(_L03) {
+Method(_E03) {
+// Memory hotplug event
+\_SB.MESC()
 }
 Method(_L04) {
 }
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 02/14] Add SIZE type to qdev properties

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

This patch adds a 'SIZE' type property to qdev.

It will make dimm description more convenient by allowing sizes to be specified
with K,M,G,T prefixes instead of number of bytes e.g.:
-device dimm,id=mem0,size=2G,bus=membus.0

Credits go to Ian Molton for original patch. See:
http://patchwork.ozlabs.org/patch/38835/

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/core/qdev-properties.c| 61 
 include/hw/qdev-properties.h |  3 +++
 include/qemu/option.h|  2 ++
 util/qemu-option.c   |  4 +--
 4 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 3a324fb..9c34793 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -1134,3 +1134,64 @@ void qdev_prop_set_globals(DeviceState *dev, Error 
**errp)
 class = object_class_get_parent(class);
 } while (class);
 }
+
+/* --- 64bit unsigned int 'size' type --- */
+
+static void get_size(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+visit_type_size(v, ptr, name, errp);
+}
+
+static void set_size(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+if (dev-realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_size(v, ptr, name, errp);
+}
+
+static int parse_size(DeviceState *dev, Property *prop, const char *str)
+{
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+Error *errp = NULL;
+
+if (str != NULL) {
+parse_option_size(prop-name, str, ptr, errp);
+}
+assert_no_error(errp);
+return 0;
+}
+
+static int print_size(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+char suffixes[] = {'T', 'G', 'M', 'K', 'B'};
+int i = 0;
+uint64_t div;
+
+for (div = (long int)1  40; !(*ptr / div) ; div = 10) {
+i++;
+}
+
+return snprintf(dest, len, %0.03f%c, (double)*ptr / div, suffixes[i]);
+}
+
+PropertyInfo qdev_prop_size = {
+.name  = size,
+.parse = parse_size,
+.print = print_size,
+.get = get_size,
+.set = set_size,
+};
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 39448b7..692f82e 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -15,6 +15,7 @@ extern PropertyInfo qdev_prop_uint64;
 extern PropertyInfo qdev_prop_hex8;
 extern PropertyInfo qdev_prop_hex32;
 extern PropertyInfo qdev_prop_hex64;
+extern PropertyInfo qdev_prop_size;
 extern PropertyInfo qdev_prop_string;
 extern PropertyInfo qdev_prop_chr;
 extern PropertyInfo qdev_prop_ptr;
@@ -116,6 +117,8 @@ extern PropertyInfo qdev_prop_arraylen;
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
 #define DEFINE_PROP_HEX64(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex64, uint64_t)
+#define DEFINE_PROP_SIZE(_n, _s, _f, _d)   \
+DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_size, uint64_t)
 #define DEFINE_PROP_PCI_DEVFN(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_pci_devfn, int32_t)
 
diff --git a/include/qemu/option.h b/include/qemu/option.h
index a83c700..83c1e84 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -153,5 +153,7 @@ typedef int (*qemu_opts_loopfunc)(QemuOpts *opts, void 
*opaque);
 int qemu_opts_print(QemuOpts *opts, void *dummy);
 int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void 
*opaque,
   int abort_on_failure);
+void parse_option_size(const char *name, const char *value,
+   uint64_t *ret, Error **errp);
 
 #endif
diff --git a/util/qemu-option.c b/util/qemu-option.c
index 412c425..8c67857 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -173,8 +173,8 @@ static void parse_option_number(const char *name, const 
char *value,
 }
 }
 
-static void parse_option_size(const char *name, const char *value,
-  uint64_t *ret, Error **errp)
+void parse_option_size(const char *name, const char *value,
+   uint64_t *ret, Error **errp)
 {
 char *postfix;
 double sizef;
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 09/14] memory controller: initialize dram controller.

2013-06-26 Thread Hu Tao
Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/i386/pc.c | 27 +++
 include/hw/i386/pc.h |  5 +
 2 files changed, 32 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 55056b1..65838a6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1283,6 +1283,27 @@ void mc_set_smm(int val, void *arg)
 memory_region_transaction_commit();
 }
 
+static hwaddr mc_dimm_offset(DeviceState *dev, uint64_t size)
+{
+MemoryController *d = MEMORY_CONTROLLER(dev);
+MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(d);
+hwaddr ret;
+
+if (d-below_4g_mem_size + size = c-pci_hole_start) {
+/* if dimm fits before pci hole, append it normally */
+ret = d-below_4g_mem_size;
+d-below_4g_mem_size += size;
+} else {
+/* otherwise place it above 4GB */
+ret = d-above_4g_mem_size + c-pci_hole_end;
+d-above_4g_mem_size += size;
+}
+
+d-ram_size += size;
+
+return ret;
+}
+
 static int memory_controller_init(PCIDevice *dev)
 {
 MemoryController *m = MEMORY_CONTROLLER(dev);
@@ -1353,6 +1374,11 @@ static int memory_controller_init(PCIDevice *dev)
  PAM_EXPAN_SIZE);
 }
 
+m-dram_channel0 = dimm_bus_create(OBJECT(m), membus.0, 8,
+   c-dimm_offset);
+m-pv_dram_channel = dimm_bus_create(OBJECT(m), membus.pv, 0,
+ c-dimm_offset);
+
 ram_size = m-ram_size / 8 / 1024 / 1024;
 if (ram_size  255) {
 ram_size = 255;
@@ -1388,6 +1414,7 @@ static void memory_controller_class_init(ObjectClass 
*klass, void *data)
 dc-no_user = 1;
 mc-set_smm = mc_set_smm;
 mc-update = mc_update;
+mc-dimm_offset = mc_dimm_offset;
 }
 
 static const TypeInfo memory_controller_type_info = {
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index e2cbc1b..959b92b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -10,6 +10,7 @@
 #include hw/i386/ioapic.h
 #include hw/pci/pci.h
 #include hw/pci-host/pam.h
+#include hw/mem-hotplug/dimm.h
 
 #define TYPE_MEMORY_CONTROLLER memory controller
 #define MEMORY_CONTROLLER(obj) OBJECT_CHECK(MemoryController, (obj), 
TYPE_DEVICE)
@@ -29,6 +30,7 @@ typedef struct MemoryControllerClass {
 
 void (*set_smm)(int val, void *arg);
 void (*update)(MemoryController *m);
+hwaddr (*dimm_offset)(DeviceState *d, uint64_t size);
 } MemoryControllerClass;
 
 typedef struct MemoryController {
@@ -48,6 +50,9 @@ typedef struct MemoryController {
 MemoryRegion ram_above_4g;
 hwaddr below_4g_mem_size;
 hwaddr above_4g_mem_size;
+
+DimmBus *dram_channel0;
+DimmBus *pv_dram_channel;
 } MemoryController;
 
 void mc_update_pam(MemoryController *d);
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 03/14] qemu-option: export parse_option_number

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 include/qemu/option.h | 2 ++
 util/qemu-option.c| 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index 83c1e84..d7a0bb9 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -155,5 +155,7 @@ int qemu_opts_foreach(QemuOptsList *list, 
qemu_opts_loopfunc func, void *opaque,
   int abort_on_failure);
 void parse_option_size(const char *name, const char *value,
uint64_t *ret, Error **errp);
+void parse_option_number(const char *name, const char *value,
+ uint64_t *ret, Error **errp);
 
 #endif
diff --git a/util/qemu-option.c b/util/qemu-option.c
index 8c67857..dddabcc 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -155,8 +155,8 @@ static void parse_option_bool(const char *name, const char 
*value, bool *ret,
 }
 }
 
-static void parse_option_number(const char *name, const char *value,
-uint64_t *ret, Error **errp)
+void parse_option_number(const char *name, const char *value,
+ uint64_t *ret, Error **errp)
 {
 char *postfix;
 uint64_t number;
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 13/14] balloon: update with hotplugged memory

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

query-balloon and info balloon should report total memory available to the
guest.

balloon inflate/ deflate can also use all memory available to the guest (initial
+ hotplugged memory)

Ballon driver has been minimaly tested with the patch, please review and test.

Caveat: if the guest does not online hotplugged-memory, it's easy for a balloon
inflate command to OOM a guest.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 hw/virtio/virtio-balloon.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index d669756..c866000 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
 #endif
 
 #include hw/virtio/virtio-bus.h
+#include hw/mem-hotplug/dimm.h
 
 static void balloon_page(void *addr, int deflate)
 {
@@ -271,10 +272,11 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
 VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
 struct virtio_balloon_config config;
 uint32_t oldactual = dev-actual;
+uint64_t hotplugged_ram_size = get_hp_memory_total();
 memcpy(config, config_data, 8);
 dev-actual = le32_to_cpu(config.actual);
 if (dev-actual != oldactual) {
-qemu_balloon_changed(ram_size -
+qemu_balloon_changed(ram_size + hotplugged_ram_size -
((ram_addr_t) dev-actual  VIRTIO_BALLOON_PFN_SHIFT));
 }
 }
@@ -290,18 +292,21 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo 
*info)
 VirtIOBalloon *dev = opaque;
 info-actual = ram_size - ((uint64_t) dev-actual 
VIRTIO_BALLOON_PFN_SHIFT);
+info-actual += get_hp_memory_total();
 }
 
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
 VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+uint64_t hotplugged_ram_size = get_hp_memory_total();
 
-if (target  ram_size) {
-target = ram_size;
+if (target  ram_size + hotplugged_ram_size) {
+target = ram_size + hotplugged_ram_size;
 }
 if (target) {
-dev-num_pages = (ram_size - target)  VIRTIO_BALLOON_PFN_SHIFT;
+dev-num_pages = (ram_size + hotplugged_ram_size - target) 
+ VIRTIO_BALLOON_PFN_SHIFT;
 virtio_notify_config(vdev);
 }
 }
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 7/7] pci: Use paravirt interface for pcimem_start and pcimem64_start

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Initialize the 32-bit and 64-bit pci starting offsets from values passed in by
the qemu paravirt interface QEMU_CFG_PCI_WINDOW. Qemu calculates the starting
offsets based on initial memory and hotplug-able dimms.
It's possible to avoid the new paravirt interface, and calculate pci ranges from
srat entries. But the code changes are ugly, see:
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg03548.html

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 src/paravirt.c | 7 +++
 src/paravirt.h | 1 +
 src/pciinit.c  | 9 +
 3 files changed, 17 insertions(+)

diff --git a/src/paravirt.c b/src/paravirt.c
index 5925c63..9c1e511 100644
--- a/src/paravirt.c
+++ b/src/paravirt.c
@@ -134,6 +134,7 @@ qemu_platform_setup(void)
 #define QEMU_CFG_BOOT_MENU  0x0e
 #define QEMU_CFG_MAX_CPUS   0x0f
 #define QEMU_CFG_FILE_DIR   0x19
+#define QEMU_CFG_PCI_WINDOW 0x1a
 #define QEMU_CFG_ARCH_LOCAL 0x8000
 #define QEMU_CFG_ACPI_TABLES(QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES (QEMU_CFG_ARCH_LOCAL + 1)
@@ -339,3 +340,9 @@ void qemu_cfg_init(void)
  , 0, be32_to_cpu(qfile.size));
 }
 }
+
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start)
+{
+qemu_cfg_read_entry(pcimem_start, QEMU_CFG_PCI_WINDOW, sizeof(u64));
+qemu_cfg_read((u8*)(pcimem64_start), sizeof(u64));
+}
diff --git a/src/paravirt.h b/src/paravirt.h
index fce5af9..2c37d0d 100644
--- a/src/paravirt.h
+++ b/src/paravirt.h
@@ -27,5 +27,6 @@ static inline int runningOnKVM(void) {
 void qemu_preinit(void);
 void qemu_platform_setup(void);
 void qemu_cfg_init(void);
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start);
 
 #endif
diff --git a/src/pciinit.c b/src/pciinit.c
index 8370b96..7e63c5e 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -805,6 +805,7 @@ static void pci_bios_map_devices(struct pci_bus *busses)
 void
 pci_setup(void)
 {
+u64 pv_pcimem_start, pv_pcimem64_start;
 if (!CONFIG_QEMU)
 return;
 
@@ -837,6 +838,14 @@ pci_setup(void)
 
 pci_bios_init_devices();
 
+/* if qemu gives us other pci window values, it means there are 
hotplug-able
+ * dimms. Adjust accordingly */
+qemu_cfg_get_pci_offsets(pv_pcimem_start, pv_pcimem64_start);
+if (pv_pcimem_start  pcimem_start)
+pcimem_start = pv_pcimem_start;
+if (pv_pcimem64_start  pcimem64_start)
+pcimem64_start = pv_pcimem64_start;
+
 free(busses);
 
 pci_enable_default_vga();
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 04/14] Implement dimm device abstraction

2013-06-26 Thread Hu Tao
Each hotplug-able memory slot is a DimmDevice. All DimmDevices are attached
to a new bus called DimmBus. This bus is introduced so that we no longer
depend on hotplug-capability of main system bus (the main bus does not allow
hotplugging). The DimmBus should be attached to a chipset Device (i440fx in case
of the pc)

A hot-add operation for a particular dimm:
- creates a new DimmDevice and attaches it to the DimmBus
- creates a new MemoryRegion of the given physical address offset, size and
node proximity, and attaches it to main system memory as a sub_region.

Hotplug operations are done through normal device_add commands.
Also add properties to DimmDevice.

TODO/FIXME: Dimm configurations are specified at qemu startup at the command
line. Dimms with populated=off are actually not created, but their config
is saved in the corresponding bus. If user does not specify the correct bus
for a dimm device during device_add and there are multiple memory buses in the
machine (e.g. right now membus.0 and membus.pv are created in this pachset) the
first bus found in the device tree that matches the device type is picked. This
bus may be the wrong one, and the dimm config is not retrieved, thus creating a
useless dimm device. A solution would be to delete the new dimm if the dimm
config was not found (see commented qmp_device_del in dimm_init) but this does
not look clean. This design needs to be fixed somehow to avoid similar problems
if multiple memory buses are present in the same machine.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 default-configs/x86_64-softmmu.mak |   1 +
 hw/Makefile.objs   |   1 +
 hw/mem-hotplug/Makefile.objs   |   1 +
 hw/mem-hotplug/dimm.c  | 241 +
 include/hw/mem-hotplug/dimm.h  |  77 
 5 files changed, 321 insertions(+)
 create mode 100644 hw/mem-hotplug/Makefile.objs
 create mode 100644 hw/mem-hotplug/dimm.c
 create mode 100644 include/hw/mem-hotplug/dimm.h

diff --git a/default-configs/x86_64-softmmu.mak 
b/default-configs/x86_64-softmmu.mak
index 599b630..5348800 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -46,3 +46,4 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_ICC_BUS=y
 CONFIG_PVPANIC=y
+CONFIG_MEM_HOTPLUG=y
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 0243d6a..6d3dc73 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -27,6 +27,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += usb/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
+devices-dirs-$(CONFIG_MEM_HOTPLUG) += mem-hotplug/
 devices-dirs-y += core/
 common-obj-y += $(devices-dirs-y)
 obj-y += $(devices-dirs-y)
diff --git a/hw/mem-hotplug/Makefile.objs b/hw/mem-hotplug/Makefile.objs
new file mode 100644
index 000..7563ef5
--- /dev/null
+++ b/hw/mem-hotplug/Makefile.objs
@@ -0,0 +1 @@
+common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o
diff --git a/hw/mem-hotplug/dimm.c b/hw/mem-hotplug/dimm.c
new file mode 100644
index 000..09cfc4b
--- /dev/null
+++ b/hw/mem-hotplug/dimm.c
@@ -0,0 +1,241 @@
+/*
+ * Dimm device for Memory Hotplug
+ *
+ * Copyright ProfitBricks GmbH 2013
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see http://www.gnu.org/licenses/
+ */
+
+#include trace.h
+#include hw/qdev.h
+#include hw/mem-hotplug/dimm.h
+#include time.h
+#include qmp-commands.h
+
+/* the following list is used to hold dimm config info before machine
+ * is initialized. After machine init, the list is not used anymore.*/
+static DimmConfiglist dimmconfig_list =
+   QTAILQ_HEAD_INITIALIZER(dimmconfig_list);
+
+/* the list of memory buses */
+static QLIST_HEAD(, DimmBus) memory_buses;
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *dimmbus_get_fw_dev_path(DeviceState *dev);
+
+static Property dimm_properties[] = {
+DEFINE_PROP_UINT64(start, DimmDevice, start, 0),
+DEFINE_PROP_SIZE(size, DimmDevice, size, DEFAULT_DIMMSIZE),
+DEFINE_PROP_UINT32(node, DimmDevice, node, 0),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+{
+}
+
+static char *dimmbus_get_fw_dev_path(DeviceState *dev)
+{
+char path[40];
+
+snprintf(path, sizeof(path), %s, 

[Qemu-devel] [PATCH v5 11/14] Introduce paravirt interface QEMU_CFG_PCI_WINDOW

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Qemu calculates the 32-bit and 64-bit PCI starting offsets based on
initial memory and hotplug-able dimms. This info needs to be passed to Seabios
for PCI initialization.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/i386/pc_q35.c  | 10 ++
 hw/pci-host/piix.c| 13 +
 include/hw/nvram/fw_cfg.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 2c14977..f956c62 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -42,6 +42,7 @@
 #include hw/ide/ahci.h
 #include hw/usb.h
 #include hw/cpu/icc_bus.h
+#include hw/nvram/fw_cfg.h
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS 6
@@ -75,6 +76,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 PCIDevice *ahci;
 DeviceState *icc_bridge;
 void *fw_cfg = NULL;
+uint64_t *pci_window_fw_cfg;
 
 icc_bridge = qdev_create(NULL, TYPE_ICC_BRIDGE);
 object_property_add_child(qdev_get_machine(), icc-bridge,
@@ -119,6 +121,14 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 /* pci */
 qdev_init_nofail(DEVICE(q35_host));
 bochs_meminfo_bios_init(fw_cfg);
+
+pci_window_fw_cfg = g_new0(uint64_t, 2);
+pci_window_fw_cfg[0] = cpu_to_le64(MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
+pci_window_fw_cfg[1] = cpu_to_le64(0x1ULL +
+   q35_host-mch.dev.above_4g_mem_size);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+ (uint8_t *)pci_window_fw_cfg, 2 * 8);
+
 host_bus = q35_host-host.pci.bus;
 /* create ISA bus */
 lpc = pci_create_simple_multifunction(host_bus, PCI_DEVFN(ICH9_LPC_DEV,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 71a9b8b..ace8f2b 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -32,6 +32,7 @@
 #include hw/xen/xen.h
 #include hw/pci-host/pam.h
 #include sysemu/sysemu.h
+#include hw/nvram/fw_cfg.h
 
 /*
  * I440FX chipset data sheet.
@@ -229,6 +230,8 @@ static PCIBus *i440fx_common_init(const char *device_name,
 PIIX3State *piix3;
 I440FXPMCState *f;
 I440FXState *i440fx;
+uint64_t *pci_window_fw_cfg;
+void *fw_cfg;
 
 i440fx = I440FX_DEVICE(object_new(TYPE_I440FX_DEVICE));
 s = PCI_HOST_BRIDGE(i440fx);
@@ -266,6 +269,16 @@ static PCIBus *i440fx_common_init(const char *device_name,
 *piix3_devfn = piix3-dev.devfn;
 *pci_address_space = i440fx-pci_address_space;
 
+fw_cfg = fw_cfg_find();
+if (fw_cfg) {
+pci_window_fw_cfg = g_new0(uint64_t, 2);
+pci_window_fw_cfg[0] = cpu_to_le64(f-dev.below_4g_mem_size);
+pci_window_fw_cfg[1] = cpu_to_le64(0x1ULL +
+   f-dev.above_4g_mem_size);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+ (uint8_t *)pci_window_fw_cfg, 2 * 8);
+}
+
 return s-bus;
 }
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index f60dd67..41cfa32 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -35,6 +35,7 @@
 #define FW_CFG_SETUP_SIZE   0x17
 #define FW_CFG_SETUP_DATA   0x18
 #define FW_CFG_FILE_DIR 0x19
+#define FW_CFG_PCI_WINDOW   0x1a
 
 #define FW_CFG_FILE_FIRST   0x20
 #define FW_CFG_FILE_SLOTS   0x10
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 14/14] Implement dimm-info

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

query-dimm-info and info dimm will give current state of all dimms in the
system e.g.

dimm0: on
dimm1: off
dimm2: off
dimm3: on
etc.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hmp-commands.hx   |  2 ++
 hmp.c | 17 +
 hmp.h |  1 +
 hw/mem-hotplug/dimm.c | 43 +++
 monitor.c |  7 +++
 qapi-schema.json  | 27 +++
 6 files changed, 97 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index b2937c2..6cb1285 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1657,6 +1657,8 @@ show roms
 show memory-total
 @item info tpm
 show the TPM device
+@item info dimm
+show dimm
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index 0371da9..04586b3 100644
--- a/hmp.c
+++ b/hmp.c
@@ -683,6 +683,23 @@ void hmp_info_memory(Monitor *mon, const QDict *qdict)
 qapi_free_MemoryInfo(mem);
 }
 
+void hmp_info_dimm(Monitor *mon, const QDict *qdict)
+{
+DimmInfoList *info;
+DimmInfoList *item;
+DimmInfo *dimm;
+
+info = qmp_query_dimm_info(NULL);
+for (item = info; item; item = item-next) {
+dimm = item-value;
+monitor_printf(mon, dimm %s : %s\n, dimm-dimm,
+   dimm-state ? on : off);
+dimm-dimm = NULL;
+}
+
+qapi_free_DimmInfoList(info);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index b5a5afb..06cea4e 100644
--- a/hmp.h
+++ b/hmp.h
@@ -38,6 +38,7 @@ void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
 void hmp_info_memory(Monitor *mon, const QDict *qdict);
+void hmp_info_dimm(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/mem-hotplug/dimm.c b/hw/mem-hotplug/dimm.c
index da31a18..4b08706 100644
--- a/hw/mem-hotplug/dimm.c
+++ b/hw/mem-hotplug/dimm.c
@@ -170,6 +170,18 @@ static DimmConfig *dimmcfg_find_from_name(DimmBus *bus, 
const char *name)
 return NULL;
 }
 
+static DimmDevice *dimm_find_from_name(DimmBus *bus, const char *name)
+{
+DimmDevice *slot;
+
+QTAILQ_FOREACH(slot, bus-dimmlist, nextdimm) {
+if (!strcmp(slot-qdev.id, name)) {
+return slot;
+}
+}
+return NULL;
+}
+
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 {
 DimmConfig *slot;
@@ -185,6 +197,37 @@ void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 }
 }
 
+DimmInfoList *qmp_query_dimm_info(Error **errp)
+{
+DimmBus *bus;
+DimmConfig *slot;
+DimmInfoList *head = NULL, *info, *cur_item = NULL;
+
+QLIST_FOREACH(bus, memory_buses, next) {
+QTAILQ_FOREACH(slot, bus-dimmconfig_list, nextdimmcfg) {
+
+info = g_malloc0(sizeof(*info));
+info-value = g_malloc0(sizeof(*info-value));
+info-value-dimm = g_malloc0(sizeof(char) * 32);
+strcpy(info-value-dimm, slot-name);
+if (dimm_find_from_name(bus, slot-name)) {
+info-value-state = 1;
+} else {
+info-value-state = 0;
+}
+/* XXX: waiting for the qapi to support GSList */
+if (!cur_item) {
+head = cur_item = info;
+} else {
+cur_item-next = info;
+cur_item = info;
+}
+}
+}
+
+return head;
+}
+
 uint64_t get_hp_memory_total(void)
 {
 DimmBus *bus;
diff --git a/monitor.c b/monitor.c
index 14a955e..c2b046c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2753,6 +2753,13 @@ static mon_cmd_t info_cmds[] = {
 .mhandler.cmd = do_trace_print_events,
 },
 {
+.name   = dimm,
+.args_type  = ,
+.params = ,
+.help   = show active and non active dimms,
+.mhandler.cmd = hmp_info_dimm,
+},
+{
 .name   = tpm,
 .args_type  = ,
 .params = ,
diff --git a/qapi-schema.json b/qapi-schema.json
index d2940dc..88bbfac 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3621,3 +3621,30 @@
 { 'type': 'MemoryInfo',
   'data': { 'total': 'int' } }
 { 'command': 'query-memory', 'returns': 'MemoryInfo' }
+
+##
+# @DimmInfo:
+#
+# Information about status of dimm device
+#
+# @dimm: the name of the dimm
+#
+# @state: 'true' means the dimm device is plugged in, 'false' means
+# means the dimm device is plugged out.
+#
+# Since: 1.6
+#
+##
+{ 'type': 'DimmInfo',
+  'data': {'dimm': 'str', 'state': 'bool'} }
+
+##
+# @query-dimm-info:
+#
+# Returns the state of dimm devices
+#
+# Returns: list of @DimmInfo
+#
+# Since: 1.6
+##
+{ 

[Qemu-devel] [PATCH v5 07/14] acpi_ich9 : Implement memory device hotplug registers

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

This implements acpi dimm hot-add capability for q35 (ich9). The logic is the
same as for the pc machine (piix4).

TODO: Fix acpi irq delivery bug. Currently there is a flood of irqs when
delivering an acpi interrupt (should be just one). Guest complains as follows:
irq 9: nobody cared
[...]
Disabling IRQ #9

where #9 is the acpi irq

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/acpi/ich9.c | 56 --
 include/hw/acpi/ich9.h | 10 +
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 4a17f32..0034aa2 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -33,6 +33,7 @@
 #include exec/address-spaces.h
 
 #include hw/i386/ich9.h
+#include hw/mem-hotplug/dimm.h
 
 //#define DEBUG
 
@@ -49,11 +50,14 @@ static void pm_update_sci(ICH9LPCPMRegs *pm)
 
 pm1a_sts = acpi_pm1_evt_get_sts(pm-acpi_regs);
 
-sci_level = (((pm1a_sts  pm-acpi_regs.pm1.evt.en) 
+sci_level = pm1a_sts  pm-acpi_regs.pm1.evt.en) 
   (ACPI_BITMASK_RT_CLOCK_ENABLE |
ACPI_BITMASK_POWER_BUTTON_ENABLE |
ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
-   ACPI_BITMASK_TIMER_ENABLE)) != 0);
+   ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
+(((pm-acpi_regs.gpe.sts[0]  pm-acpi_regs.gpe.en[0]) 
+  (ICH9_MEM_HOTPLUG_STATUS)) != 0));
+
 qemu_set_irq(pm-irq, sci_level);
 
 /* schedule a timer interruption if needed */
@@ -202,6 +206,47 @@ static void pm_powerdown_req(Notifier *n, void *opaque)
 acpi_pm1_evt_power_down(pm-acpi_regs);
 }
 
+static uint32_t mem_status_readb(void *opaque, uint32_t addr)
+{
+ICH9LPCPMRegs *s = opaque;
+uint32_t val = 0;
+MemStatus *g = s-gpe_mem;
+if (addr  ICH9_MEM_LEN) {
+val = (uint32_t) g-mems_sts[addr];
+}
+ICH9_DEBUG(memhp read % PRIu32  == % PRIu32 \n, addr, val);
+return val;
+}
+
+static const MemoryRegionOps ich9_mem_hotplug_ops = {
+.old_portio = (MemoryRegionPortio[]) {
+{
+.offset = 0,   .len = ICH9_MEM_LEN, .size = 1,
+.read = mem_status_readb,
+},
+PORTIO_END_OF_LIST()
+},
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static void enable_mem_device(ICH9LPCState *s, int memdevice)
+{
+MemStatus *g = s-pm.gpe_mem;
+s-pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
+g-mems_sts[memdevice / 8] |= (1  (memdevice % 8));
+}
+
+static int ich9_mem_hotplug(DeviceState *dev, DimmDevice *dimm, int add)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(dev);
+
+if (add) {
+enable_mem_device(s, dimm-idx);
+}
+pm_update_sci(s-pm);
+return 0;
+}
+
 void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
   qemu_irq sci_irq)
 {
@@ -227,4 +272,11 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
 qemu_register_reset(pm_reset, pm);
 pm-powerdown_notifier.notify = pm_powerdown_req;
 qemu_register_powerdown_notifier(pm-powerdown_notifier);
+
+memory_region_init_io(pm-io_mem, ich9_mem_hotplug_ops, pm,
+  acpi-memory-hotplug0, ICH9_MEM_BASE);
+memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, pm-io_mem);
+
+dimm_bus_hotplug(ich9_mem_hotplug, lpc_pci-qdev);
+
 }
diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h
index b1fe71f..300e07f 100644
--- a/include/hw/acpi/ich9.h
+++ b/include/hw/acpi/ich9.h
@@ -23,6 +23,14 @@
 
 #include hw/acpi/acpi.h
 
+#define ICH9_MEM_BASE 0xaf80
+#define ICH9_MEM_LEN 32
+#define ICH9_MEM_HOTPLUG_STATUS 8
+
+typedef struct MemStatus {
+uint8_t mems_sts[ICH9_MEM_LEN];
+} MemStatus;
+
 typedef struct ICH9LPCPMRegs {
 /*
  * In ich9 spec says that pm1_cnt register is 32bit width and
@@ -34,6 +42,7 @@ typedef struct ICH9LPCPMRegs {
 MemoryRegion io;
 MemoryRegion io_gpe;
 MemoryRegion io_smi;
+MemoryRegion io_mem;
 
 uint32_t smi_en;
 uint32_t smi_sts;
@@ -42,6 +51,7 @@ typedef struct ICH9LPCPMRegs {
 
 uint32_t pm_io_base;
 Notifier powerdown_notifier;
+MemStatus gpe_mem;
 } ICH9LPCPMRegs;
 
 void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 12/14] Implement info memory and query-memory

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Returns total physical memory available to guest in bytes, including hotplugged
memory. Note that the number reported here may be different from what the guest
sees e.g. if the guest has not logically onlined hotplugged memory.

This functionality is provided independently of a balloon device, since a
guest can be using ACPI memory hotplug without using a balloon device.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hmp-commands.hx   |  2 ++
 hmp.c |  8 
 hmp.h |  1 +
 hw/mem-hotplug/dimm.c | 14 ++
 include/hw/mem-hotplug/dimm.h |  1 +
 monitor.c |  7 +++
 qapi-schema.json  | 13 +
 qmp-commands.hx   | 22 ++
 vl.c  |  9 +
 9 files changed, 77 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..b2937c2 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1653,6 +1653,8 @@ show device tree
 show qdev device model list
 @item info roms
 show roms
+@item info memory-total
+show memory-total
 @item info tpm
 show the TPM device
 @end table
diff --git a/hmp.c b/hmp.c
index 494a9aa..0371da9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -675,6 +675,14 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
 qapi_free_TPMInfoList(info_list);
 }
 
+void hmp_info_memory(Monitor *mon, const QDict *qdict)
+{
+MemoryInfo *mem;
+mem = qmp_query_memory(NULL);
+monitor_printf(mon, MemTotal: % PRIu64 \n, mem-total);
+qapi_free_MemoryInfo(mem);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 56d2e92..b5a5afb 100644
--- a/hmp.h
+++ b/hmp.h
@@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
+void hmp_info_memory(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/mem-hotplug/dimm.c b/hw/mem-hotplug/dimm.c
index 09cfc4b..da31a18 100644
--- a/hw/mem-hotplug/dimm.c
+++ b/hw/mem-hotplug/dimm.c
@@ -185,6 +185,20 @@ void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 }
 }
 
+uint64_t get_hp_memory_total(void)
+{
+DimmBus *bus;
+DimmDevice *slot;
+uint64_t info = 0;
+
+QLIST_FOREACH(bus, memory_buses, next) {
+QTAILQ_FOREACH(slot, bus-dimmlist, nextdimm) {
+info += slot-size;
+}
+}
+return info;
+}
+
 static void dimm_realize(DeviceState *s, Error **errp)
 {
 DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
diff --git a/include/hw/mem-hotplug/dimm.h b/include/hw/mem-hotplug/dimm.h
index 8625ae6..3947539 100644
--- a/include/hw/mem-hotplug/dimm.h
+++ b/include/hw/mem-hotplug/dimm.h
@@ -73,5 +73,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, 
uint32_t max_dimms,
 dimm_calcoffset_fn pmc_set_offset);
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t 
node,
 uint32_t dimm_idx);
+uint64_t get_hp_memory_total(void);
 
 #endif
diff --git a/monitor.c b/monitor.c
index 70ae8f5..14a955e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2760,6 +2760,13 @@ static mon_cmd_t info_cmds[] = {
 .mhandler.cmd = hmp_info_tpm,
 },
 {
+.name   = memory,
+.args_type  = ,
+.params = ,
+.help   = show memory informations including total memory size,
+.mhandler.cmd = hmp_info_memory,
+},
+{
 .name   = NULL,
 },
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index a80ee40..d2940dc 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3608,3 +3608,16 @@
 '*cpuid-input-ecx': 'int',
 'cpuid-register': 'X86CPURegister32',
 'features': 'int' } }
+
+##
+# @query-memory:
+#
+# Returns total memory in bytes, including hotplugged dimms
+#
+# Returns: int
+#
+# Since: 1.6
+##
+{ 'type': 'MemoryInfo',
+  'data': { 'total': 'int' } }
+{ 'command': 'query-memory', 'returns': 'MemoryInfo' }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 8cea5e5..58d7c7c 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2997,3 +2997,25 @@ Example:
 - { return: {} }
 
 EQMP
+
+{
+.name   = query-memory,
+.args_type  = ,
+.mhandler.cmd_new = qmp_marshal_input_query_memory
+},
+SQMP
+query-memory
+--
+
+Return a json-object with the following information:
+
+- total: total memory in bytes, including hotplugged dimms
+
+Example:
+
+- { execute: query-memory }
+- {
+  return: { total: 1073741824 }
+   }
+
+EQMP
diff 

[Qemu-devel] [PATCH v5 5/7] acpi: generate hotplug memory devices

2013-06-26 Thread Hu Tao
The memory device generation is guided by qemu paravirt info. Seabios
uses the info to setup SRAT entries for the hotplug-able memory slots,
and to generate appropriate memory device objects. One memory device
(and corresponding SRAT entry) is generated for each hotplug-able qemu
memslot. Currently no SSDT memory device is created for initial system
memory.

We only support up to 255 DIMMs for now (PackageOp used for the MEON
array can only describe an array of at most 255 elements. VarPackageOp
would be needed to support more than 255 devices)

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 src/acpi.c | 151 ++---
 src/paravirt.c |   8 +++
 2 files changed, 152 insertions(+), 7 deletions(-)

diff --git a/src/acpi.c b/src/acpi.c
index ce988e0..e9a0326 100644
--- a/src/acpi.c
+++ b/src/acpi.c
@@ -15,6 +15,8 @@
 #include config.h // CONFIG_*
 #include paravirt.h // RamSize
 #include dev-q35.h
+#include memmap.h
+#include paravirt.h
 
 #include acpi-dsdt.hex
 
@@ -250,11 +252,23 @@ encodeLen(u8 *ssdt_ptr, int length, int bytes)
 #define PCIHP_AML (ssdp_pcihp_aml + *ssdt_pcihp_start)
 #define PCI_SLOTS 32
 
+/* 0x5B 0x82 DeviceOp PkgLength NameString DimmID */
+#define MEM_BASE 0xaf80
+#define MEM_AML (ssdm_mem_aml + *ssdt_mem_start)
+#define MEM_SIZEOF (*ssdt_mem_end - *ssdt_mem_start)
+#define MEM_OFFSET_HEX (*ssdt_mem_name - *ssdt_mem_start + 2)
+#define MEM_OFFSET_ID (*ssdt_mem_id - *ssdt_mem_start)
+#define MEM_OFFSET_PXM 31
+#define MEM_OFFSET_START 55
+#define MEM_OFFSET_END   63
+#define MEM_OFFSET_SIZE  79
+
 #define SSDT_SIGNATURE 0x54445353 // SSDT
 #define SSDT_HEADER_LENGTH 36
 
 #include ssdt-misc.hex
 #include ssdt-pcihp.hex
+#include ssdt-mem.hex
 
 #define PCI_RMV_BASE 0xae0c
 
@@ -306,9 +320,100 @@ static void patch_pcihp(int slot, u8 *ssdt_ptr, u32 eject)
 }
 }
 
+static void build_memdev(u8 *ssdt_ptr, int i, u64 mem_base, u64 mem_len, u8 
node)
+{
+memcpy(ssdt_ptr, MEM_AML, MEM_SIZEOF);
+ssdt_ptr[MEM_OFFSET_HEX] = getHex(i  4);
+ssdt_ptr[MEM_OFFSET_HEX+1] = getHex(i);
+ssdt_ptr[MEM_OFFSET_ID] = i;
+ssdt_ptr[MEM_OFFSET_PXM] = node;
+*(u64*)(ssdt_ptr + MEM_OFFSET_START) = cpu_to_le64(mem_base);
+*(u64*)(ssdt_ptr + MEM_OFFSET_END) = cpu_to_le64(mem_base + mem_len);
+*(u64*)(ssdt_ptr + MEM_OFFSET_SIZE) = cpu_to_le64(mem_len);
+}
+
+static u8 *build_memssdt(u8 *ssdt_ptr, int memssdt_len,
+ u64 *numadimmsmap, int nb_memdevs)
+{
+u64 mem_base, mem_len;
+u64 *dimm = numadimmsmap;
+int node;
+int i;
+
+// build Scope(_SB_) header
+*(ssdt_ptr++) = 0x10; // ScopeOp
+ssdt_ptr = encodeLen(ssdt_ptr, memssdt_len, 3);
+*(ssdt_ptr++) = '_';
+*(ssdt_ptr++) = 'S';
+*(ssdt_ptr++) = 'B';
+*(ssdt_ptr++) = '_';
+
+for (i = 0; i  nb_memdevs; i++) {
+mem_base = *dimm++;
+mem_len = *dimm++;
+node = *dimm++;
+build_memdev(ssdt_ptr, i, mem_base, mem_len, node);
+ssdt_ptr += MEM_SIZEOF;
+}
+
+// build Method(MTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CM00, Arg1)} 
...}
+*(ssdt_ptr++) = 0x14; // MethodOp
+ssdt_ptr = encodeLen(ssdt_ptr, 2+5+(12*nb_memdevs), 2);
+*(ssdt_ptr++) = 'M';
+*(ssdt_ptr++) = 'T';
+*(ssdt_ptr++) = 'F';
+*(ssdt_ptr++) = 'Y';
+*(ssdt_ptr++) = 0x02;
+for (i=0; inb_memdevs; i++) {
+*(ssdt_ptr++) = 0xA0; // IfOp
+ssdt_ptr = encodeLen(ssdt_ptr, 11, 1);
+*(ssdt_ptr++) = 0x93; // LEqualOp
+*(ssdt_ptr++) = 0x68; // Arg0Op
+*(ssdt_ptr++) = 0x0A; // BytePrefix
+*(ssdt_ptr++) = i;
+*(ssdt_ptr++) = 0x86; // NotifyOp
+*(ssdt_ptr++) = 'M';
+*(ssdt_ptr++) = 'P';
+*(ssdt_ptr++) = getHex(i  4);
+*(ssdt_ptr++) = getHex(i);
+*(ssdt_ptr++) = 0x69; // Arg1Op
+}
+
+// build Name(MEON, Package() { One, One, ..., Zero, Zero, ... })
+*(ssdt_ptr++) = 0x08; // NameOp
+*(ssdt_ptr++) = 'M';
+*(ssdt_ptr++) = 'E';
+*(ssdt_ptr++) = 'O';
+*(ssdt_ptr++) = 'N';
+*(ssdt_ptr++) = 0x12; // PackageOp
+ssdt_ptr = encodeLen(ssdt_ptr, 2+1+(1*nb_memdevs), 2);
+*(ssdt_ptr++) = nb_memdevs;
+
+dimm = numadimmsmap;
+u8 memslot_status = 0, enabled;
+
+for (i = 0; i  nb_memdevs; i++) {
+enabled = 0;
+if (i % 8 == 0)
+memslot_status = inb(MEM_BASE + i/8);
+enabled = memslot_status  1;
+mem_base = *dimm++;
+mem_len = *dimm++;
+dimm++;  // node
+*(ssdt_ptr++) = enabled ? 0x01 : 0x00;
+if (enabled)
+add_e820(mem_base, mem_len, E820_RAM);
+memslot_status = memslot_status  1;
+}
+
+return ssdt_ptr;
+}
+
 static void*
 build_ssdt(void)
 {
+int nb_memdevs;
+u64 *numadimmsmap;
 int acpi_cpus = MaxCountCPUs  0xff ? 0xff : MaxCountCPUs;
 int length = (sizeof(ssdp_misc_aml) 

[Qemu-devel] [PATCH v5 0/7] support for ACPI memory hotplug

2013-06-26 Thread Hu Tao
This series adds memory devices and related methods to support
ACPI memory hotplug.

This series works with qemu counterpart. See qemu series for
a detailed description.

Hu Tao (2):
  set psize to 0 when romfile_loadfile failed
  acpi: generate hotplug memory devices

Vasilis Liaskovitis (5):
  Add ACPI_EXTRACT_DEVICE* macros
  Add SSDT memory device support
  acpi-dsdt: Implement functions for memory hotplug
  q35: Add memory hotplug handler
  pci: Use paravirt interface for pcimem_start and pcimem64_start

 Makefile  |   2 +-
 src/acpi-dsdt-mem-hotplug.dsl |  57 
 src/acpi-dsdt.dsl |   5 +-
 src/acpi.c| 151 --
 src/paravirt.c|  15 +
 src/paravirt.h|   1 +
 src/pciinit.c |   9 +++
 src/q35-acpi-dsdt.dsl |   6 +-
 src/romfile.c |  13 ++--
 src/ssdt-mem.dsl  |  61 +
 tools/acpi_extract.py |  28 
 11 files changed, 333 insertions(+), 15 deletions(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl
 create mode 100644 src/ssdt-mem.dsl

-- 
1.8.3.1




[Qemu-devel] [PATCH v5 2/7] Add SSDT memory device support

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Define SSDT hotplug-able memory devices in _SB namespace. The dynamically
generated SSDT includes per memory device hotplug methods. These methods
just call methods defined in the DSDT. Also dynamically generate a MTFY
method and a MEON array of the online/available memory devices.  ACPI
extraction macros are used to place the AML code in variables later used by
src/acpi. The design is taken from SSDT cpu generation.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 Makefile |  2 +-
 src/ssdt-mem.dsl | 61 
 2 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 src/ssdt-mem.dsl

diff --git a/Makefile b/Makefile
index 759..b88fb90 100644
--- a/Makefile
+++ b/Makefile
@@ -223,7 +223,7 @@ $(OUT)%.hex: src/%.dsl ./tools/acpi_extract_preprocess.py 
./tools/acpi_extract.p
$(Q)$(PYTHON) ./tools/acpi_extract.py $(OUT)$*.lst  $(OUT)$*.off
$(Q)cat $(OUT)$*.off  $@
 
-$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex 
$(OUT)ssdt-misc.hex $(OUT)q35-acpi-dsdt.hex
+$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex 
$(OUT)ssdt-misc.hex $(OUT)q35-acpi-dsdt.hex $(OUT)ssdt-mem.hex
 
  Kconfig rules
 
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
new file mode 100644
index 000..d0f6203
--- /dev/null
+++ b/src/ssdt-mem.dsl
@@ -0,0 +1,61 @@
+/* This file is the basis for the ssdt_mem[] variable in src/acpi.c.
+ * It is similar in design to the ssdt_proc variable.
+ * It defines the contents of the per-dimm QWordMemory() object.  At
+ * runtime, a dynamically generated SSDT will contain one copy of this
+ * AML snippet for every possible memory device in the system.  The
+ * objects will * be placed in the \_SB_ namespace.
+ *
+ * In addition to the aml code generated from this file, the
+ * src/acpi.c file creates a MTFY method with an entry for each memdevice:
+ * Method(MTFY, 2) {
+ * If (LEqual(Arg0, 0x00)) { Notify(MP00, Arg1) }
+ * If (LEqual(Arg0, 0x01)) { Notify(MP01, Arg1) }
+ * ...
+ * }
+ * and a MEON array with the list of active and inactive memory devices:
+ * Name(MEON, Package() { One, One, ..., Zero, Zero, ... })
+ */
+ACPI_EXTRACT_ALL_CODE ssdm_mem_aml
+
+DefinitionBlock (ssdt-mem.aml, SSDT, 0x02, BXPC, CSSDT, 0x1)
+/*  v-- DO NOT EDIT --v */
+{
+ACPI_EXTRACT_DEVICE_START ssdt_mem_start
+ACPI_EXTRACT_DEVICE_END ssdt_mem_end
+ACPI_EXTRACT_DEVICE_STRING ssdt_mem_name
+Device(MPAA) {
+ACPI_EXTRACT_NAME_BYTE_CONST ssdt_mem_id
+Name(ID, 0xAA)
+/*  ^-- DO NOT EDIT --^
+ *
+ * The src/acpi.c code requires the above layout so that it can update
+ * MPAA and 0xAA with the appropriate MEMDEVICE id (see
+ * SD_OFFSET_MEMHEX/MEMID1/MEMID2).  Don't change the above without
+ * also updating the C code.
+ */
+Name(_HID, EISAID(PNP0C80))
+Name(_PXM, 0xAA)
+
+External(CMST, MethodObj)
+External(MPEJ, MethodObj)
+
+Name(_CRS, ResourceTemplate() {
+QwordMemory(
+   ResourceConsumer,
+   , // _DEC
+   MinFixed, // _MIF
+   MaxFixed, // _MAF
+   Cacheable,// _MEM
+   ReadWrite,// _RW
+   0x0,  // _GRA
+   0xDEADBEEF,   // _MIN
+   0xE6ADBEEE,   // _MAX
+   0x,   // _TRA
+   0x0800,   // _LEN
+   )
+})
+Method (_STA, 0) {
+Return(CMST(ID))
+}
+}
+}
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 08/14] memory: record below_4g_mem_size, above_4g_mem_size

2013-06-26 Thread Hu Tao
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/i386/pc.c | 17 -
 include/hw/i386/pc.h |  2 ++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5bb4989..55056b1 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1288,7 +1288,6 @@ static int memory_controller_init(PCIDevice *dev)
 MemoryController *m = MEMORY_CONTROLLER(dev);
 MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(dev);
 ram_addr_t ram_size;
-hwaddr below_4g_mem_size, above_4g_mem_size;
 hwaddr pci_hole_start, pci_hole_size;
 hwaddr pci_hole64_start, pci_hole64_size;
 int i;
@@ -1296,11 +1295,11 @@ static int memory_controller_init(PCIDevice *dev)
 g_assert(m-system_memory != NULL);
 
 if(m-ram_size  c-pci_hole_start) {
-below_4g_mem_size = c-pci_hole_start;
-above_4g_mem_size = m-ram_size - c-pci_hole_start;
+m-below_4g_mem_size = c-pci_hole_start;
+m-above_4g_mem_size = m-ram_size - c-pci_hole_start;
 } else {
-below_4g_mem_size = m-ram_size;
-above_4g_mem_size = 0;
+m-below_4g_mem_size = m-ram_size;
+m-above_4g_mem_size = 0;
 }
 
 /* Allocate RAM.  We allocate it as a single memory region and use
@@ -1310,16 +1309,16 @@ static int memory_controller_init(PCIDevice *dev)
 memory_region_init_ram(m-ram, pc.ram, m-ram_size);
 vmstate_register_ram_global(m-ram);
 memory_region_init_alias(m-ram_below_4g, ram-below-4g, m-ram,
- 0, below_4g_mem_size);
+ 0, m-below_4g_mem_size);
 memory_region_add_subregion(m-system_memory, 0, m-ram_below_4g);
-if (above_4g_mem_size  0) {
+if (m-above_4g_mem_size  0) {
 memory_region_init_alias(m-ram_above_4g, ram-above-4g, m-ram,
- below_4g_mem_size, above_4g_mem_size);
+ m-below_4g_mem_size, m-above_4g_mem_size);
 memory_region_add_subregion(m-system_memory, c-pci_hole_end,
 m-ram_above_4g);
 }
 
-pci_hole_start = below_4g_mem_size;
+pci_hole_start = m-below_4g_mem_size;
 pci_hole_size = c-pci_hole_end - pci_hole_start;
 
 pci_hole64_start = c-pci_hole_end + m-ram_size - pci_hole_start;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 5d36558..e2cbc1b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -46,6 +46,8 @@ typedef struct MemoryController {
 MemoryRegion ram;
 MemoryRegion ram_below_4g;
 MemoryRegion ram_above_4g;
+hwaddr below_4g_mem_size;
+hwaddr above_4g_mem_size;
 } MemoryController;
 
 void mc_update_pam(MemoryController *d);
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 4/7] set psize to 0 when romfile_loadfile failed

2013-06-26 Thread Hu Tao
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 src/romfile.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/romfile.c b/src/romfile.c
index ea71d1f..b1b89bb 100644
--- a/src/romfile.c
+++ b/src/romfile.c
@@ -51,28 +51,33 @@ romfile_loadfile(const char *name, int *psize)
 {
 struct romfile_s *file = romfile_find(name);
 if (!file)
-return NULL;
+goto failed;
 
 int filesize = file-size;
 if (!filesize)
-return NULL;
+goto failed;
 
 char *data = malloc_tmphigh(filesize+1);
 if (!data) {
 warn_noalloc();
-return NULL;
+goto failed;
 }
 
 dprintf(5, Copying romfile '%s' (len %d)\n, name, filesize);
 int ret = file-copy(file, data, filesize);
 if (ret  0) {
 free(data);
-return NULL;
+goto failed;
 }
 if (psize)
 *psize = filesize;
 data[filesize] = '\0';
 return data;
+
+failed:
+if (psize)
+*psize = 0;
+return NULL;
 }
 
 // Attempt to load an integer from the given file - return 'defval'
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 1/7] Add ACPI_EXTRACT_DEVICE* macros

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

This allows to extract the beginning, end and name of a Device object.

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 tools/acpi_extract.py | 28 
 1 file changed, 28 insertions(+)

diff --git a/tools/acpi_extract.py b/tools/acpi_extract.py
index ab8ced6..8ad15d3 100755
--- a/tools/acpi_extract.py
+++ b/tools/acpi_extract.py
@@ -230,6 +230,28 @@ def aml_package_start(offset):
 offset += 1
 return offset + aml_pkglen_bytes(offset) + 1
 
+def aml_device_start(offset):
+#0x5B 0x82 DeviceOp PkgLength NameString ProcID
+if ((aml[offset] != 0x5B) or (aml[offset + 1] != 0x82)):
+die( Name offset 0x%x: expected 0x5B 0x83 actual 0x%x 0x%x %
+ (offset, aml[offset], aml[offset + 1]));
+return offset
+
+def aml_device_string(offset):
+#0x5B 0x82 DeviceOp PkgLength NameString ProcID
+start = aml_device_start(offset)
+offset += 2
+pkglenbytes = aml_pkglen_bytes(offset)
+offset += pkglenbytes
+return offset
+
+def aml_device_end(offset):
+start = aml_device_start(offset)
+offset += 2
+pkglenbytes = aml_pkglen_bytes(offset)
+pkglen = aml_pkglen(offset)
+return offset + pkglen
+
 lineno = 0
 for line in fileinput.input():
 # Strip trailing newline
@@ -322,6 +344,12 @@ for i in range(len(asl)):
 offset = aml_processor_end(offset)
 elif (directive == ACPI_EXTRACT_PKG_START):
 offset = aml_package_start(offset)
+elif (directive == ACPI_EXTRACT_DEVICE_START):
+offset = aml_device_start(offset)
+elif (directive == ACPI_EXTRACT_DEVICE_STRING):
+offset = aml_device_string(offset)
+elif (directive == ACPI_EXTRACT_DEVICE_END):
+offset = aml_device_end(offset)
 else:
 die(Unsupported directive %s % directive)
 
-- 
1.8.3.1




[Qemu-devel] [PATCH] vmdk: remove wrong calculation of relative path

2013-06-26 Thread Fam Zheng
When creating image with backing file, the driver tries to calculate the
relative path from created image file to backing file, but the path
computation is incorrect. e.g.:

$ qemu-img create -f vmdk -b vmdk-data-disk.vmdk vmdk-data-snapshot1
Formatting 'vmdk-data-snapshot1', fmt=vmdk size=10737418240
backing_file='vmdk-data-disk.vmdk' compat6=off zeroed_grain=off

$ qemu-img info vmdk-data-snapshot1
image: vmdk-data-snapshot1
file format: vmdk
virtual size: 10G (10737418240 bytes)
disk size: 12K
-  backing file: disk.vmdk

The common part in file names, vmdk-data-, is incorrectly forgotten by
relative_path(). As the VMDK specification has no restriction on
parentNameHint to be relative path, we simply remove this by using the
backing_file option.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/vmdk.c | 44 +---
 1 file changed, 1 insertion(+), 43 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 975e1d4..a28fb5e 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1487,45 +1487,6 @@ static int filename_decompose(const char *filename, char 
*path, char *prefix,
 return VMDK_OK;
 }
 
-static int relative_path(char *dest, int dest_size,
-const char *base, const char *target)
-{
-int i = 0;
-int n = 0;
-const char *p, *q;
-#ifdef _WIN32
-const char *sep = \\;
-#else
-const char *sep = /;
-#endif
-
-if (!(dest  base  target)) {
-return VMDK_ERROR;
-}
-if (path_is_absolute(target)) {
-pstrcpy(dest, dest_size, target);
-return VMDK_OK;
-}
-while (base[i] == target[i]) {
-i++;
-}
-p = base[i];
-q = target[i];
-while (*p) {
-if (*p == *sep) {
-n++;
-}
-p++;
-}
-dest[0] = '\0';
-for (; n; n--) {
-pstrcat(dest, dest_size, ..);
-pstrcat(dest, dest_size, sep);
-}
-pstrcat(dest, dest_size, q);
-return VMDK_OK;
-}
-
 static int vmdk_create(const char *filename, QEMUOptionParameter *options)
 {
 int fd, idx = 0;
@@ -1625,7 +1586,6 @@ static int vmdk_create(const char *filename, 
QEMUOptionParameter *options)
 return -ENOTSUP;
 }
 if (backing_file) {
-char parent_filename[PATH_MAX];
 BlockDriverState *bs = bdrv_new();
 ret = bdrv_open(bs, backing_file, NULL, 0, NULL);
 if (ret != 0) {
@@ -1638,10 +1598,8 @@ static int vmdk_create(const char *filename, 
QEMUOptionParameter *options)
 }
 parent_cid = vmdk_read_cid(bs, 0);
 bdrv_delete(bs);
-relative_path(parent_filename, sizeof(parent_filename),
-  filename, backing_file);
 snprintf(parent_desc_line, sizeof(parent_desc_line),
-parentFileNameHint=\%s\, parent_filename);
+parentFileNameHint=\%s\, backing_file);
 }
 
 /* Create extents */
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 6/7] q35: Add memory hotplug handler

2013-06-26 Thread Hu Tao
From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com

Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
---
 src/q35-acpi-dsdt.dsl | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/q35-acpi-dsdt.dsl b/src/q35-acpi-dsdt.dsl
index c031d83..5b28d72 100644
--- a/src/q35-acpi-dsdt.dsl
+++ b/src/q35-acpi-dsdt.dsl
@@ -403,7 +403,7 @@ DefinitionBlock (
 }
 
 #include acpi-dsdt-cpu-hotplug.dsl
-
+#include acpi-dsdt-mem-hotplug.dsl
 
 /
  * General purpose events
@@ -418,7 +418,9 @@ DefinitionBlock (
 // CPU hotplug event
 \_SB.PRSC()
 }
-Method(_L02) {
+Method(_E02) {
+// Memory hotplug event
+\_SB.MESC()
 }
 Method(_L03) {
 }
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-06-26 Thread Markus Armbruster
Eric Blake ebl...@redhat.com writes:

 On 05/24/2013 07:08 AM, Luiz Capitulino wrote:
 This sounds like premature optimization to me, but I wonder if instead
 of cluttering commands with arguments to do the filtering we could add
 some standard way of doing this in the QAPI.

 Maybe we could make QAPI support generic filtering for all query-*
 commands.  But there's still a cost with making all query-* commands
 malloc the space for an entire list only to then have a QAPI layer on
 top of it do filtering before handing back the single element over the
 wire, compared to passing the filtering down to the query-*
 implementation to do the filtering in place.

I'd expect the time spent on malloc to be dwarved several times over by
I/O latency.

If we worry about malloc impacting our latency, then we should not be
using QAPI!  It's really, really malloc-happy.

In QMP, we generally don't.

 It was you who suggested a filter command?

 No, Stefan suggested it on v1:
 https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03102.html

So far, I'm with Luiz here.



Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug

2013-06-26 Thread liu ping fan
On Wed, Jun 26, 2013 at 4:38 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 26/06/2013 10:20, liu ping fan ha scritto:
 On the other hand, pushing _delete() out of  finalization path is not
 easy, since we do not what time the DeviceState has done with its bh.

 See above:

 - if the BH will run in the iothread, the BH is definitely not running
 (because the BH runs under the BQL, and the reclaimer owns it)

 It works for the case in which the DeviceState's reclaimer calls
 _bh_delete(). But in another case(also exist in current code), where
 we call _bh_delete() in callback, the bh will be scheduled, and oops!

 But you may know that the BH is not scheduled after removal, too.

No, the removal can run in parallel with the other mmio-dispatch which
can resort to schedule a bh. I.e, the removal can not call
_bh_delete(), so it do not know whether bh will be scheduled or not.

 There are not that many devices that have bottom halves (apart from
 those that use bottom halves in ptimer).  If they really need to, they
 can do the object_ref/unref themselves.  But I expect this to be rare,
 and generic code should not need an owner field in bottom halves.  For

Agree :), so the aim is how to  handle bh safely when unplug, if we
can resolve it with rcu, it will be more fine!
 example, block devices should stop sending requests after removal.

Yes, but need we need take account for stop sending request != no
mmio-dispatch in fly ? And I think these mmio-dispatch are the
criminal :),  making it is hard to sync with bh's stop.
 - if the BH is running in another thread, waiting for that thread to
 terminate obviously makes the BH not running.

 This imply that except for qemu_aio_context, AioContext can be only
 shared by one device, right? Otherwise we can not just terminate the
 thread. If it is true, why can not we have more than one just like
 qemu_aio_context?

 Yes, if you do it that way you can have only one AioContext per device.
  RCU is another way, and doesn't have the same limitation.

 We should avoid introducing infrastructure that we are not sure is
 needed.  For what it's worth, your patches to make the bottom half list
 thread-safe are also slightly premature.  However, they do not change
 the API and it makes some sense to add the infrastructure.  In this
 case, I'm simply not sure that we're there yet.

See, thx.
 If you look at the memory work, for example, the owner patches happen to
 be useful for BQL-less dispatch too, but they are solving existing
 hot-unplug bugs.

Oh, I will read it again, since I had thought the owner is only used
for the purpose of refcnt.

 What we need is separation of removal and reclamation.  Without that,
 any effort to make things unplug-safe is going to be way way more
 complicated than necessary.

 Agree, but when I tried for bh, I found the separation of removal and
 reclamation are not easy.  We can not _bh_delete() in
 acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the
 same time.

 acpi_piix_eject_slot() is removal, not reclamation.  The plan there is
 that qdev_free calls the exit callback immediately (which can do
 qemu_bh_delete), and schedules an unref after the next RCU grace period.
  At the next RCU grace period the BH will not be running, thus it is
 safe to finalize the object.

I have two question:
1st, who trigger qdev_free?  //Not figuring out the total design, but
I feel it is quite different from kernel's design, where refcnt-0,
then exit is called.
2nd, _delete_bh() means even there is any not-severed request, the
request will be canceled. Is it legal, and will not lose data(not
sure, since I do not know who will call qdev_free)?

Thxregards,
Pingfan
 Paolo

  And as explained, only two places can hold the
 _bh_delete().
 In a short word, with rcu, we need to constrain the calling idiom of
 bh, i.e., putting them in reclaimer.  On the other hand, my patch try
 to leave the calling idiom as it is, and handle this issue inside bh.

 Regards,
 Pingfan

 Paolo




Re: [Qemu-devel] [PATCH v5 05/14] vl: handle -device dimm

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 11:13, Hu Tao ha scritto:
 From: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
 
 Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com
 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  vl.c | 51 +++
  1 file changed, 51 insertions(+)
 
 diff --git a/vl.c b/vl.c
 index 767e020..9d88a79 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -170,6 +170,7 @@ int main(int argc, char **argv)
  
  #include ui/qemu-spice.h
  #include qapi/string-input-visitor.h
 +#include hw/mem-hotplug/dimm.h
  
  //#define DEBUG_NET
  //#define DEBUG_SLIRP
 @@ -252,6 +253,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
  int nb_numa_nodes;
  uint64_t node_mem[MAX_NODES];
  unsigned long *node_cpumask[MAX_NODES];
 +int nb_hp_dimms;
  
  uint8_t qemu_uuid[16];
  
 @@ -2338,6 +2340,50 @@ static int chardev_init_func(QemuOpts *opts, void 
 *opaque)
  return 0;
  }
  
 +static int dimmcfg_init_func(QemuOpts *opts, void *opaque)
 +{
 +const char *driver;
 +const char *id;
 +uint64_t node, size;
 +uint32_t populated;
 +const char *buf, *busbuf;
 +
 +/* DimmDevice configuration needs to be known in order to initialize 
 chipset
 + * with correct memory and pci ranges. But all devices are created after
 + * chipset / machine initialization. In * order to avoid this problem, we
 + * parse dimm information earlier into dimmcfg structs. */
 +
 +driver = qemu_opt_get(opts, driver);
 +if (!strcmp(driver, dimm)) {
 +
 +id = qemu_opts_id(opts);
 +buf = qemu_opt_get(opts, size);
 +parse_option_size(size, buf, size, NULL);
 +buf = qemu_opt_get(opts, node);
 +parse_option_number(node, buf, node, NULL);
 +busbuf = qemu_opt_get(opts, bus);
 +buf = qemu_opt_get(opts, populated);
 +if (!buf) {
 +populated = 0;
 +} else {
 +populated = strcmp(buf, on) ? 0 : 1;
 +}
 +
 +dimm_config_create((char *)id, size, busbuf ? busbuf : membus.0,
 +   node, nb_hp_dimms);
 +
 +/* if !populated, we just keep the config. The real device
 + * will be created in the future with a normal device_add
 + * command. */
 +if (!populated) {
 +qemu_opts_del(opts);
 +}

I think you need another option than -device dimm.  For example it could
be declared together with the NUMA node.  This could declare two NUMA
nodes, each with 2G of populated and 2G of unpopulated RAM:

   -numa node,mem-range=0-2G,mem-range-hotplug=4G-6G \
   -numa node,mem-range=2G-4G,mem-range-hotplug=6G-8G

I'm not sure I like the names particularly though.  CCing Eduardo,
Bandan and Wanlong Gao.

Paolo

 +nb_hp_dimms++;
 +}
 +
 +return 0;
 +}
 +
  #ifdef CONFIG_VIRTFS
  static int fsdev_init_func(QemuOpts *opts, void *opaque)
  {
 @@ -4260,6 +4306,11 @@ int main(int argc, char **argv, char **envp)
  }
  qemu_add_globals();
  
 +/* init generic devices */
 +if (qemu_opts_foreach(qemu_find_opts(device),
 +   dimmcfg_init_func, NULL, 1) != 0) {
 +exit(1);
 +}
  qdev_machine_init();
  
  QEMUMachineInitArgs args = { .ram_size = ram_size,
 




Re: [Qemu-devel] [PATCH v5] QEMUBH: make AioContext's bh re-entrant

2013-06-26 Thread liu ping fan
On Tue, Jun 25, 2013 at 5:40 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 25.06.2013 um 19:26 hat Liu Ping Fan geschrieben:
 BH will be used outside big lock, so introduce lock to protect
 between the writers, ie, bh's adders and deleter. The lock only
 affects the writers and bh's callback does not take this extra lock.
 Note that for the same AioContext, aio_bh_poll() can not run in
 parallel yet.

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com

 -

 Please use exactly three dashes so that 'git am' recognises it as the
 end of the commit message.

Sorry, I will notice at the next time.  Should I re-post it?

Thanks and regards,
Pingfan
 This doesn't compile yet because smp_read_barrier_depends() isn't merged
 yet, so maybe there's still time for some nitpicking: Wouldn't using
 atomic_set/get better document things and make them easier to read?

 It should be correct anyway, so:

 Reviewed-by: Kevin Wolf kw...@redhat.com



Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-06-26 Thread Markus Armbruster
Amos Kong ak...@redhat.com writes:

 We want to implement mac programming over macvtap through Libvirt,
 related rx-filter configuration contains main mac, some of rx-mode
 and mac-table.

 The previous patch adds QMP event to notify management of rx-filter
 change. This patch adds a monitor command to query rx-filter
 information.

 A flag is used to avoid events flooding, if user don't query
 rx-filter after receives one event, new events won't be sent
 to qmp monitor.

 (qemu) info rx-filter vnet0
 vnet0:
  \ promiscuous: on
  \ multicast: normal
  \ unicast: normal
  \ broadcast-allowed: off
  \ multicast-overflow: off
  \ unicast-overflow: off
  \ main-mac: 52:54:00:12:34:56
  \ unicast-table:
  \ multicast-table:
 01:00:5e:00:00:01
 33:33:00:00:00:01
 33:33:ff:12:34:56

 Signed-off-by: Amos Kong ak...@redhat.com
 ---
  hmp-commands.hx |  2 ++
  hmp.c   | 49 
  hmp.h   |  1 +
  hw/net/virtio-net.c | 93 
 +
  include/net/net.h   |  2 ++
  monitor.c   |  8 +
  net/net.c   | 47 +++
  qapi-schema.json| 73 +
  qmp-commands.hx | 55 +++
  9 files changed, 330 insertions(+)

 diff --git a/hmp-commands.hx b/hmp-commands.hx
 index 9cea415..b7863eb 100644
 --- a/hmp-commands.hx
 +++ b/hmp-commands.hx
 @@ -1639,6 +1639,8 @@ show qdev device model list
  show roms
  @item info tpm
  show the TPM device
 +@item info rx-filter [net client name]
 +show the rx-filter information for all nics (or for the given nic)

Humor me: spell it NICs and NIC, because it's an acronym.

  @end table
  ETEXI
  
 diff --git a/hmp.c b/hmp.c
 index 4fb76ec..5b47382 100644
 --- a/hmp.c
 +++ b/hmp.c
 @@ -653,6 +653,55 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
  qapi_free_TPMInfoList(info_list);
  }
  
 +void hmp_info_rx_filter(Monitor *mon, const QDict *qdict)
 +{
 +RxFilterInfoList *filter_list, *entry;
 +strList *str_entry;
 +bool has_name = qdict_haskey(qdict, name);
 +const char *name = NULL;
 +
 +if (has_name) {
 +name = qdict_get_str(qdict, name);
 +}
 +
 +filter_list = qmp_query_rx_filter(has_name, name, NULL);

Rather roundabout way to do

const char *name = qdict_get_str(qdict, name);

filter_list = qmp_query_rx_filter(name != NULL, name);

 +entry = filter_list;
 +
 +while (entry) {

Recommend to put the loop control in one place:

for (entry = filter_list; entry; entry = entry-next) {

 +monitor_printf(mon, %s:\n, entry-value-name);
 +monitor_printf(mon,  \\ promiscuous: %s\n,
 +   entry-value-promiscuous ? on : off);
 +monitor_printf(mon,  \\ multicast: %s\n,
 +   RxState_lookup[entry-value-multicast]);
 +monitor_printf(mon,  \\ unicast: %s\n,
 +   RxState_lookup[entry-value-unicast]);
 +monitor_printf(mon,  \\ broadcast-allowed: %s\n,
 +   entry-value-broadcast_allowed ? on : off);
 +monitor_printf(mon,  \\ multicast-overflow: %s\n,
 +   entry-value-multicast_overflow ? on : off);
 +monitor_printf(mon,  \\ unicast-overflow: %s\n,
 +   entry-value-unicast_overflow ? on : off);
 +monitor_printf(mon,  \\ main-mac: %s\n, entry-value-main_mac);
 +
 +str_entry = entry-value-unicast_table;
 +monitor_printf(mon,  \\ unicast-table:\n);
 +while (str_entry) {
 +monitor_printf(mon, %s\n, str_entry-value);
 +str_entry = str_entry-next;
 +}
 +
 +str_entry = entry-value-multicast_table;
 +monitor_printf(mon,  \\ multicast-table:\n);
 +while (str_entry) {
 +monitor_printf(mon, %s\n, str_entry-value);
 +str_entry = str_entry-next;
 +}
 +
 +entry = entry-next;
 +}
 +qapi_free_RxFilterInfoList(filter_list);
 +}
 +
  void hmp_quit(Monitor *mon, const QDict *qdict)
  {
  monitor_suspend(mon);
 diff --git a/hmp.h b/hmp.h
 index 95fe76e..9af733e 100644
 --- a/hmp.h
 +++ b/hmp.h
 @@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict);
  void hmp_info_pci(Monitor *mon, const QDict *qdict);
  void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
  void hmp_info_tpm(Monitor *mon, const QDict *qdict);
 +void hmp_info_rx_filter(Monitor *mon, const QDict *qdict);
  void hmp_quit(Monitor *mon, const QDict *qdict);
  void hmp_stop(Monitor *mon, const QDict *qdict);
  void hmp_system_reset(Monitor *mon, const QDict *qdict);
 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index 1ea9556..f93e021 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -21,6 +21,8 @@
  #include hw/virtio/virtio-net.h
  #include net/vhost_net.h
  #include hw/virtio/virtio-bus.h
 +#include qapi/qmp/qjson.h
 

Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 11:44, liu ping fan ha scritto:
 On Wed, Jun 26, 2013 at 4:38 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 26/06/2013 10:20, liu ping fan ha scritto:
 On the other hand, pushing _delete() out of  finalization path is not
 easy, since we do not what time the DeviceState has done with its bh.

 See above:

 - if the BH will run in the iothread, the BH is definitely not running
 (because the BH runs under the BQL, and the reclaimer owns it)

 It works for the case in which the DeviceState's reclaimer calls
 _bh_delete(). But in another case(also exist in current code), where
 we call _bh_delete() in callback, the bh will be scheduled, and oops!

 But you may know that the BH is not scheduled after removal, too.

 No, the removal can run in parallel with the other mmio-dispatch which
 can resort to schedule a bh.

But then behavior is more or less undefined.  Of course the device must
ensure that something sane happens, but that's the responsibility of the
device, not of the generic layer.

 I.e, the removal can not call
 _bh_delete(), so it do not know whether bh will be scheduled or not.

It can call qemu_bh_delete(), then all concurrent qemu_bh_schedule()
will be no-ops.

 If you look at the memory work, for example, the owner patches happen to
 be useful for BQL-less dispatch too, but they are solving existing
 hot-unplug bugs.

 Oh, I will read it again, since I had thought the owner is only used
 for the purpose of refcnt.

It is.  But it goes beyond BQL-less dispatch.  Anything that gives away
the BQL between a map and unmap (including asynchronous I/O) needs an
owner.  We have ignored that until now because we do not have memory unplug.

 What we need is separation of removal and reclamation.  Without that,
 any effort to make things unplug-safe is going to be way way more
 complicated than necessary.

 Agree, but when I tried for bh, I found the separation of removal and
 reclamation are not easy.  We can not _bh_delete() in
 acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the
 same time.

 acpi_piix_eject_slot() is removal, not reclamation.  The plan there is
 that qdev_free calls the exit callback immediately (which can do
 qemu_bh_delete), and schedules an unref after the next RCU grace period.
  At the next RCU grace period the BH will not be running, thus it is
 safe to finalize the object.

 I have two question:
 1st, who trigger qdev_free?  //Not figuring out the total design, but
 I feel it is quite different from kernel's design, where refcnt-0,
 then exit is called.

qdev_free is triggered by the guest, but free is a misnomer.  It is
really make it inaccessible from the guest and management (the kernel
equivalent would be removal of /dev and /sys entries, for example).  The
actual free will happen later.

 2nd, _delete_bh() means even there is any not-severed request, the
 request will be canceled. Is it legal, and will not lose data(not
 sure, since I do not know who will call qdev_free)?

That's something that should be ensured by the device.  For example it
is not a problem to cancel virtio-net's tx_bh.  If it is a problem, the
device must do something else.  It could even be a bdrv_drain_all() in
the worst case.

Paolo



Re: [Qemu-devel] [PATCH for mst/pci] output nc-name in NIC_RX_FILTER_CHANGED event

2013-06-26 Thread Markus Armbruster
Amos Kong ak...@redhat.com writes:

 netclient 'name' entry in event is useful for management to know
 which device is changed. n-netclient_name is not always set.
 This patch changes to use nc-name. If we don't assign 'id',
 qemu will set a generated name to nc-name.

 Signed-off-by: Amos Kong ak...@redhat.com
 ---
  hw/net/virtio-net.c | 11 +++
  1 file changed, 3 insertions(+), 8 deletions(-)

 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index c88403a..e4d9752 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -200,14 +200,9 @@ static void rxfilter_notify(NetClientState *nc)
  VirtIONet *n = qemu_get_nic_opaque(nc);
  
  if (nc-rxfilter_notify_enabled) {
 -if (n-netclient_name) {
 -event_data = qobject_from_jsonf({ 'name': %s, 'path': %s },
 -n-netclient_name,
 -
 object_get_canonical_path(OBJECT(n-qdev)));
 -} else {
 -event_data = qobject_from_jsonf({ 'path': %s },
 -
 object_get_canonical_path(OBJECT(n-qdev)));
 -}
 +event_data = qobject_from_jsonf({ 'name': %s, 'path': %s },
 +   nc-name,
 +   object_get_canonical_path(OBJECT(n-qdev)));
  monitor_protocol_event(QEVENT_NIC_RX_FILTER_CHANGED, event_data);
  qobject_decref(event_data);

Is this on top of [PATCH v3 0/2] mac programming over macvtap?

Yes, qdev IDs are optional, and therefore can serve as reliable
identifier only when the user / management application always specifies
one, and even then, you're still screwed for auto-created devices.
Easily avoided for NICs, but yes, the problem is real.

However, I don't agree with the solution use NetCientState name,
because that's too ad hoc, and not general.  Can we please use QOM
paths?



Re: [Qemu-devel] [PATCH for mst/pci] output nc-name in NIC_RX_FILTER_CHANGED event

2013-06-26 Thread Markus Armbruster
Amos Kong ak...@redhat.com writes:

 On Mon, Jun 24, 2013 at 02:34:59PM +0800, Amos Kong wrote:
 netclient 'name' entry in event is useful for management to know
 which device is changed. n-netclient_name is not always set.
 This patch changes to use nc-name. If we don't assign 'id',
 qemu will set a generated name to nc-name.


 IRC: mst akong, what do other events include? name or id?

 I just checked QMP/qmp-event.txt, they all use 'device name'.
 (eg: BLOCK_IO_ERROR, DEVICE_DELETED, DEVICE_TRAY_MOVED, BLOCK_JOB_*)

 If we assign 'id' for -device, device name will be set to id.
 Otherwise, a generated device name will set to some device.

DEVICE_DELETED uses device (the qdev ID) and path (the QOM path).

For reasons I don't understand, it sets path only when the device has
no qdev ID.  That should be cleaned up.



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Alexander Graf

On 26.06.2013, at 10:49, Peter Maydell wrote:

 On 26 June 2013 00:38, David Gibson da...@gibson.dropbear.id.au wrote:
 On Mon, Jun 24, 2013 at 12:02:39PM +0100, Peter Maydell wrote:
 On 24 June 2013 11:56, Alexander Graf ag...@suse.de wrote:
 This looks pretty complicated for something actually quite
 simple: All you want is to pass in a number of 64bit values,
 rather than 32bit ones, right?
 
 Nope. If the device tree blob says #address-cells is 1
 and #size-cells is 1, then I want to pass in values to
 go in 32 bit cells. If it says #address-cells is 2 but
 #size-cells is still 1, then I want to pass in a value
 for a 64 bit cell then one for a 32 bit cell. If they're
 both 2 then I want to pass in values for two 64 bit
 cells.
 
 Hmm.. the property certainly needs to be constructed that way.  But I
 think Alex's point is that you could make the arguments all 64-bit,
 and then truncate them in the generated property.
 
 Er, the arguments *are* all 64 bits and truncated
 in the generated property:
 + * @...: 0-terminated list of uint32_t number-of-cells, uint64_t value pairs
 
 There's a bigger problem, though, that exists with both versions.  In
 general people expect integer arguments like this not to care too much
 about the exact integer type, because it will be promoted to the right
 thing.  Except with varargs it won't.  So if you ever have a
 notionally 64-bit address that happens to fit in a 32-bit variable and
 you pass that it, this function will be broken.  And the worst of it
 is, it'll work most of the time, until you happen to hit the wrong ABI
 and parameter combination :(.
 
 Do you have a suggested improvement to the API to avoid this?

I think it makes sense to make this API special-purpose for reg. We currently 
have a generic put any number of 32bit values into the property function 
(qemu_devtree_setprop_cells).

Can't we also just add a qemu_devtree_setprop_reg() that walks the tree 
downwards in search for #address-cells and #size-cells and assembles the 
correct reg property from a list of 64bit arguments?

  qemu_devtree_setprop_reg(fdt, /foo/bar, region[0].offset, region[0].size, 
region[1].offset, region[1].size);

Maybe we could even make it more generic to also cover ranges.


Alex




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Ian Campbell
On Wed, 2013-06-19 at 12:31 +0100, Paul Durrant wrote:

 That's correct. If a vendor wishes to ship a new driver for an
 existing piece of h/w they just post it. However, at some point the
 vendor will sell a new piece of h/w which requires a driver that will
 not work with older h/w. So, they update their device id or revision
 to something new, leave the old driver on windows update alone, and
 post a new driver that will only bind to the new h/w.

Right, but that is not analogous to this change. This change is more
akin to the driver folks going to the hardware guys and saying we've
ended up painted into a bit of corner, the easiest way to get out of it
is for you guys to reissue the exact same hardware (ASIC?) with a
different device id. I can imagine the hardware guys would be thrilled
with that prospect!

 This is model we have followed in XenServer: the platform device
 represents 'the set of PV drivers'

You may have followed that model in XenServer but it has never been the
case for upstream. The platform device simply provides an end point for
enabling/triggering baseline Xen detection/support. The set of PV
drivers (in so far as such a thing even exists in the mix and match
world of upstream Xen) is represented via the xenbus entries for the
devices and the associated feature flags and the like.

  and therefore to ship a new and non-backwards-compatible set of PV
 drivers we incremented the platform device id. That fits with the way
 that Windows update works and seems like a totally reasonable way to
 think of the platform device IMO. Clearly my opinion, and the reality
 of Windows Update, are somewhat contentious :-(

What's contentious IMHO is adding hacks to the Xen (or qemu) code base
to workaround an issue which is entirely internal to the guest Windows
update/driver bindings.

Ian.




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Ian Campbell
On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:
 At 07:47 + on 20 Jun (1371714432), Paul Durrant wrote:
I agree. If this is really the only solution, we would need to have
both versions presented to the guest so that old drivers continue to
work without any intervention.
   
   I suspect that if we expose both, both sets of drivers try to run the
   same PV connections, and hilarity ensues.
   
  
  Actually I think I can make that work, and it is the conclusion I came
  to after Alex's comment.
 
 Ah, nice!  In that case, I'm a lot less worried -- we can just expose
 both versions/devices by default and there's no need for a visible
 control knob tied to driver version (except maybe for debugging).
 
 It means an 'unsupported' device appearing on other/older OSes, which is
 unfortunate, but ISTR only Windows really complains visibly about that.

I'm not at all convinced this is a good approach. Are we going to add a
third, forth and fifth device whenever Linux, BSD, $other-OS paint
themselves into a corner somehow WRT their internal driver model vs
their Xen PV drivers?

AFAIK the Citrix PV drivers have never been formally supported on
anything other than XenServer and XCP (and I'm not sure about formally
for XCP), so this is really an issue of supporting upgrades for people
running those. I think rather than making hacks upstream, which will
effectively need to be supported forever, the hack should be done on the
XenServer side and take advantage of whatever the supported upgrade path
is (N+1 or N+2 or whatever). This way the hack can eventually go away.
For anyone who grabbed the older drivers and used them outside of the
context of XenServer or XCP this is a documentation/awareness issue.

Can we use the blacklisting functionality of the PV unplug protocol to
blacklist previous versions of the Citrix PV drivers? I wouldn't
consider this an unsuitable thing to do in upstream, in fact it would be
using it for exactly the purpose for which it was designed. As long as
this is sufficient to boot with emulated devices in order to switch to
the newer drivers that should be good enough.

Ian.




Re: [Qemu-devel] [PATCH v4 04/10] Add a script to extract VSS SDK headers on POSIX system

2013-06-26 Thread Laszlo Ersek
On 06/25/13 17:01, Paolo Bonzini wrote:
 Il 25/06/2013 17:01, Laszlo Ersek ha scritto:
 On 06/06/13 17:06, Tomoki Sekiyama wrote:

 +if ! command -v msiextract  /dev/null; then
 +  echo 'msiextract not found. Please install msitools.' 2
 +  exit 1
 +fi

 (This is not a review comment -- I'm trying to test it:)

 What msiextract version (and dependencies, like libgcab, libgsf etc) are
 you using? I'm unable to extract vsssdk.msi; msiextract spews a bunch
 of assertion failures.

 2e39646b7850a12673bc66ade85fece3  setup.exe
 433eb024ed0c669dd1563d952ca41091  vsssdk.msi

 My versions (RHEL-6.4.z distro):
 - msitools-0.92 (built from source)
 - gcab-0.4 (built from source)
 - libgsf-1.14.15-5.el6 (patch in [1] doesn't seem to help, it only
 changes the kinds of asserts that fail)
 - glib2-2.22.5-7.el6
 - libuuid-2.17.2-12.9.el6_4.3
 
 The attached patch may help building a newer libgsf on RHEL6.

It did help building it, however the runtime error has stayed. I'm
starting to worry this is a more fundamental issue, namely in glib.

The first errors I'm getting on the msiextract stderr are:

  (msiextract:28858): GLib-GObject-WARNING **: invalid cast from
  `GDataInputStream' to `GSeekable'

  (msiextract:28858): GLib-GIO-CRITICAL **: g_seekable_seek: assertion
  `G_IS_SEEKABLE (seekable)' failed

(The second clearly being a consequence of the first.)

Problem is, the first warning is wrong. According to the current, live
glib docs https://developer.gnome.org/gio/stable/GDataInputStream.html,

  GDataInputStream implements GSeekable.

The NEWS file in glib states,

  Overview of changes from GLib 2.32.1 to 2.33.1
  ==
  [...]
  * GIO:
   - implement GSeekable for the data and buffered stream classes

Commits:

  git log --oneline --reverse 2.32.1..2.33.1 | grep -i seekable
  90739ba Make GBufferedInputStream implement GSeekable
  43895e3 Make GBufferedOutputStream implement GSeekable
  a44e801 Make GDataOutputStream implement GSeekable

Bug: https://bugzilla.gnome.org/show_bug.cgi?id=673034

GDataOutputStream and GBufferedOutputStream are both direct descendants
of GFilterOutputStream, and they both implement the GSeekable interface
in isolation.

However GDataInputStream is derived from GBufferedInputStream (it the
former is not the sibling of the latter, as for output).
GBufferedInputStream was made to implement GSeekable explicitly, and
GDataInputStream gets it via inheritance. (In 2.33.1, that is.)

I guess I'll extract the MSI once on F19 and stash the contents...

Thanks!
Laszlo



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Peter Maydell
On 26 June 2013 11:31, Alexander Graf ag...@suse.de wrote:
 I think it makes sense to make this API special-purpose for reg.
 We currently have a generic put any number of 32bit values into the
 property function (qemu_devtree_setprop_cells).

Yes, but that doesn't work for things that aren't simple arrays
of 32 bit values, so I think that a generic way to deal
with those too would be useful. If you wanted to write a
ranges property you'd need this too, so it doesn't just
apply to reg.

I think we could avoid the varargs doesn't promote problem
by using a varargs-macro wrapper:

#define qemu_devtree_setprop_sized_cells(fdt, node, prop, ...) \
do {   \
uint64_t args[] = { __VA_ARGS__ }; \
do_qemu_devtree_setprop_sized_cells(fdt, node, prop, \
args, sizeof(args));
} while (0)

which will promote everything (including the size arguments,
harmlessly) to uint64_t, and avoids having a varargs function.

 Can't we also just add a qemu_devtree_setprop_reg() that walks
 the tree downwards in search for #address-cells and #size-cells
 and assembles the correct reg property from a list of 64bit
 arguments?

Do we have an actual use for this? It seems pretty complicated.
I would expect that in practice there are two major use cases:
 (a) create your own fdt from scratch (in which case you can
 just make everything 64 bits and in any case will know
 when creating nodes what the #address-cells etc are)
 (b) modify an existing fdt, in which case you definitely don't
 want to go poking around too deeply in the tree; anything
 more than just put an extra node in the root is starting
 to get pretty chancy.

thanks
-- PMM



Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Paul Durrant
 -Original Message-
 From: Ian Campbell
 Sent: 26 June 2013 11:40
 To: Tim (Xen.org)
 Cc: Paul Durrant; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; qemu-
 de...@nongnu.org
 Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
 version 2.
 
 On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:
  At 07:47 + on 20 Jun (1371714432), Paul Durrant wrote:
 I agree. If this is really the only solution, we would need to have
 both versions presented to the guest so that old drivers continue to
 work without any intervention.
   
I suspect that if we expose both, both sets of drivers try to run the
same PV connections, and hilarity ensues.
   
  
   Actually I think I can make that work, and it is the conclusion I came
   to after Alex's comment.
 
  Ah, nice!  In that case, I'm a lot less worried -- we can just expose
  both versions/devices by default and there's no need for a visible
  control knob tied to driver version (except maybe for debugging).
 
  It means an 'unsupported' device appearing on other/older OSes, which is
  unfortunate, but ISTR only Windows really complains visibly about that.
 
 I'm not at all convinced this is a good approach. Are we going to add a
 third, forth and fifth device whenever Linux, BSD, $other-OS paint
 themselves into a corner somehow WRT their internal driver model vs
 their Xen PV drivers?
 

Is there any harm in having a separate device for each OS's PV drivers? I agree 
it's not entirely elegant, but at least it allows for revision control when you 
need it.

 AFAIK the Citrix PV drivers have never been formally supported on
 anything other than XenServer and XCP (and I'm not sure about formally
 for XCP), so this is really an issue of supporting upgrades for people
 running those. I think rather than making hacks upstream, which will
 effectively need to be supported forever, the hack should be done on the
 XenServer side and take advantage of whatever the supported upgrade path
 is (N+1 or N+2 or whatever). This way the hack can eventually go away.
 For anyone who grabbed the older drivers and used them outside of the
 context of XenServer or XCP this is a documentation/awareness issue.
 
 Can we use the blacklisting functionality of the PV unplug protocol to
 blacklist previous versions of the Citrix PV drivers? I wouldn't
 consider this an unsuitable thing to do in upstream, in fact it would be
 using it for exactly the purpose for which it was designed. As long as
 this is sufficient to boot with emulated devices in order to switch to
 the newer drivers that should be good enough.
 

We could blacklist all existing Citrix PV drivers in upstream QEMU, to avoid 
the clash, but that seems like a very unfriendly approach. Also, it's not going 
to stop someone with an existing VM, who happens to be using legacy Citrix PV 
drivers (an AWS VM for instance) receiving a driver from Windows Update that 
will blue-screen their VM on next reboot. Hence the only way forward is to bind 
the new drivers to something new, that we can control, so we know what driver a 
VM is going to get from Windows Update. And we may indeed need to modify its 
revision in future so that we can retire old sets of PV drivers and replace 
them with new ones, but only for newer XenServer releases. Thus, I also propose 
to make the PCI revision of the new device a command line parameter.

  Paul


[Qemu-devel] [PATCH 2/2] fbdev: add monitor commands to enable/disable/query

2013-06-26 Thread Gerd Hoffmann
This patch adds a fbdev monitor command to enable/disable
the fbdev display at runtime to both qmp and hmp.

qmp: framebuffer-display enable=on|off scale=on|off device=/dev/fbn
hmp: framebuffer-display on|off

There is also a query-framebuffer command for qmp.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hmp-commands.hx  |   14 ++
 hmp.c|9 +
 hmp.h|1 +
 include/ui/console.h |1 +
 qapi-schema.json |   43 +++
 qmp-commands.hx  |   12 
 qmp.c|   31 +++
 ui/fbdev.c   |   20 
 8 files changed, 131 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..283106d 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1563,7 +1563,21 @@ STEXI
 @findex qemu-io
 
 Executes a qemu-io command on the given block device.
+ETEXI
+
+{
+.name   = framebuffer-display,
+.args_type  = enable:b,
+.params = on|off,
+.help   = enable/disable linux console framebuffer display,
+.mhandler.cmd = hmp_framebuffer_display,
+},
+
+STEXI
+@item framebuffer-display on | off
+@findex framebuffer-display
 
+enable/disable linux console framebuffer display.
 ETEXI
 
 {
diff --git a/hmp.c b/hmp.c
index 494a9aa..55f195f 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1464,3 +1464,12 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 
 hmp_handle_error(mon, err);
 }
+
+void hmp_framebuffer_display(Monitor *mon, const QDict *qdict)
+{
+int enable = qdict_get_bool(qdict, enable);
+Error *errp = NULL;
+
+qmp_framebuffer_display(enable, false, false, false, NULL, errp);
+hmp_handle_error(mon, errp);
+}
diff --git a/hmp.h b/hmp.h
index 56d2e92..c3a48e4 100644
--- a/hmp.h
+++ b/hmp.h
@@ -86,5 +86,6 @@ void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
 void hmp_chardev_add(Monitor *mon, const QDict *qdict);
 void hmp_chardev_remove(Monitor *mon, const QDict *qdict);
 void hmp_qemu_io(Monitor *mon, const QDict *qdict);
+void hmp_framebuffer_display(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/include/ui/console.h b/include/ui/console.h
index 71b538a..5a9207d 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -311,6 +311,7 @@ void sdl_display_init(DisplayState *ds, int full_screen, 
int no_frame);
 /* fbdev.c */
 int fbdev_display_init(const char *device, bool scale, Error **err);
 void fbdev_display_uninit(void);
+FramebufferInfo *framebuffer_info(void);
 
 /* cocoa.m */
 void cocoa_display_init(DisplayState *ds, int full_screen);
diff --git a/qapi-schema.json b/qapi-schema.json
index 6cc07c2..715dc1f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3608,3 +3608,46 @@
 '*cpuid-input-ecx': 'int',
 'cpuid-register': 'X86CPURegister32',
 'features': 'int' } }
+
+##
+# @framebuffer-display:
+#
+# Enable/disable linux console framebuffer display.
+#
+# @enable: whenever the framebuffer display should be enabled or disabled.
+#
+# @scale: #optional enables display scaling, default: off
+#
+# @device: #optional specifies framebuffer device, default: /dev/fb0
+#
+# Returns: Nothing.
+#
+# Since: 1.6
+#
+##
+{ 'command': 'framebuffer-display', 'data': {'enable'  : 'bool',
+ '*scale'  : 'bool',
+ '*device' : 'str' } }
+
+##
+# @FramebufferInfo:
+#
+# Since 1.6
+##
+{ 'type': 'FramebufferInfo',
+  'data': { 'enabled': 'bool',
+'*scale' : 'bool',
+'*device': 'str',
+'*vtno'  : 'int' } }
+
+##
+# @query-framebuffer:
+#
+# Query linux console framebuffer state.
+#
+# Returns: FramebufferInfo.
+#
+# Since: 1.6
+#
+##
+{ 'command': 'query-framebuffer', 'returns': 'FramebufferInfo' }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 8cea5e5..e0661f0 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2997,3 +2997,15 @@ Example:
 - { return: {} }
 
 EQMP
+
+{
+.name   = framebuffer-display,
+.args_type  = enable:b,scale:b?,device:s?,
+.mhandler.cmd_new = qmp_marshal_input_framebuffer_display,
+},
+
+{
+.name   = query-framebuffer,
+.args_type  = ,
+.mhandler.cmd_new = qmp_marshal_input_query_framebuffer,
+},
diff --git a/qmp.c b/qmp.c
index 4c149b3..b6d826c 100644
--- a/qmp.c
+++ b/qmp.c
@@ -404,6 +404,37 @@ void qmp_change(const char *device, const char *target,
 }
 }
 
+void qmp_framebuffer_display(bool enable,
+ bool has_scale, bool scale,
+ bool has_device, const char *device,
+ Error **errp)
+{
+#if defined(CONFIG_FBDEV)
+if (enable) {
+if (fbdev_display_init(has_device ? device : NULL,
+   has_scale  ? scale  : false,
+   errp) != 0) {

[Qemu-devel] [PATCH 1/2] fbdev: add linux framebuffer display driver.

2013-06-26 Thread Gerd Hoffmann
Display works, requires truecolor framebuffer with 16 or 32 bpp on the
host.  32bpp is recommended.  The framebuffer is used as-is, qemu
doesn't try to switch modes.  With LCD displays mode switching is pretty
pointless IMHO, also it wouldn't work anyway with the most common
fbdev drivers (vesafb, KMS).  Guest display is centered on the host
screen.

Mouse works, uses /dev/input/mice.

Keyboard works.  Guest screen has whatever keymap you load inside
the guest.  Text windows (monitor, serial, ...) have a simple en-us
keymap.  Good enough to type monitor commands.  Not goot enough to
work seriously on a serial terminal.  But the qemu terminal emulation
isn't good enough for that anyway ;)

Hot keys:
  Ctrl-Alt-Fnr  - host console switching.
  Ctrl-Alt-nr   - qemu console switching.
  Ctrl-Alt-S  - toggle display scaling.
  Ctrl-Alt-ESC- exit qemu.

Special feature:  Sane console switching.  Switching away stops screen
updates.  Switching back redraws the screen.  When started from the
linux console qemu uses the vt you've started it from (requires just
read/write access to /dev/fb0).  When starting from somewhere else qemu
tries to open a unused virtual terminal and switch to it (usually
requires root privileges to open /dev/ttynr).

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 configure   |   12 +
 include/sysemu/sysemu.h |1 +
 include/ui/console.h|4 +
 qemu-options.hx |4 +-
 trace-events|   14 +
 ui/Makefile.objs|1 +
 ui/fbdev.c  | 1164 +++
 ui/linux-keynames.h |  388 
 vl.c|   21 +
 9 files changed, 1608 insertions(+), 1 deletion(-)
 create mode 100644 ui/fbdev.c
 create mode 100644 ui/linux-keynames.h

diff --git a/configure b/configure
index 0e0adde..391e645 100755
--- a/configure
+++ b/configure
@@ -158,6 +158,7 @@ fdt=
 nptl=
 pixman=
 sdl=
+fbdev=no
 virtfs=
 vnc=yes
 sparse=no
@@ -549,6 +550,7 @@ Haiku)
   kvm=yes
   vhost_net=yes
   vhost_scsi=yes
+  fbdev=yes
   if [ $cpu = i386 -o $cpu = x86_64 ] ; then
 audio_possible_drivers=$audio_possible_drivers fmod
   fi
@@ -700,6 +702,10 @@ for opt do
   ;;
   --enable-qom-cast-debug) qom_cast_debug=yes
   ;;
+  --disable-fbdev) fbdev=no
+  ;;
+  --enable-fbdev) fbdev=yes
+  ;;
   --disable-virtfs) virtfs=no
   ;;
   --enable-virtfs) virtfs=yes
@@ -1051,6 +1057,8 @@ echo   --disable-sdldisable SDL
 echo   --enable-sdl enable SDL
 echo   --disable-gtkdisable gtk UI
 echo   --enable-gtk enable gtk UI
+echo   --disable-fbdev  disable linux framebuffer
+echo   --enable-fbdev   enable linux framebuffer
 echo   --disable-virtfs disable VirtFS
 echo   --enable-virtfs  enable VirtFS
 echo   --disable-vncdisable VNC
@@ -3492,6 +3500,7 @@ fi
 echo pixman$pixman
 echo SDL support   $sdl
 echo GTK support   $gtk
+echo fbdev support $fbdev
 echo curses support$curses
 echo curl support  $curl
 echo mingw32 support   $mingw32
@@ -3722,6 +3731,9 @@ if test $sdl = yes ; then
   echo CONFIG_SDL=y  $config_host_mak
   echo SDL_CFLAGS=$sdl_cflags  $config_host_mak
 fi
+if test $fbdev = yes ; then
+  echo CONFIG_FBDEV=y  $config_host_mak
+fi
 if test $cocoa = yes ; then
   echo CONFIG_COCOA=y  $config_host_mak
 fi
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 2fb71af..5922311 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -91,6 +91,7 @@ typedef enum DisplayType
 DT_CURSES,
 DT_SDL,
 DT_GTK,
+DT_FBDEV,
 DT_NOGRAPHIC,
 DT_NONE,
 } DisplayType;
diff --git a/include/ui/console.h b/include/ui/console.h
index 98edf41..71b538a 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -308,6 +308,10 @@ void register_vc_handler(VcHandler *handler);
 /* sdl.c */
 void sdl_display_init(DisplayState *ds, int full_screen, int no_frame);
 
+/* fbdev.c */
+int fbdev_display_init(const char *device, bool scale, Error **err);
+void fbdev_display_uninit(void);
+
 /* cocoa.m */
 void cocoa_display_init(DisplayState *ds, int full_screen);
 
diff --git a/qemu-options.hx b/qemu-options.hx
index ca6fdf6..259e76f 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -804,7 +804,7 @@ ETEXI
 
 DEF(display, HAS_ARG, QEMU_OPTION_display,
 -display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n
-[,window_close=on|off]|curses|none|\n
+[,window_close=on|off]|curses|fbdev|none|\n
 vnc=display[,optargs]\n
 select display type\n, QEMU_ARCH_ALL)
 STEXI
@@ -822,6 +822,8 @@ support a text mode, QEMU can display this output using a
 curses/ncurses interface. Nothing is displayed when the graphics
 device is in graphical mode or if the graphics device does not support
 a text mode. Generally only the VGA device models support text mode.
+@item fbdev
+Display 

Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Alexander Graf

On 26.06.2013, at 12:50, Peter Maydell wrote:

 On 26 June 2013 11:31, Alexander Graf ag...@suse.de wrote:
 I think it makes sense to make this API special-purpose for reg.
 We currently have a generic put any number of 32bit values into the
 property function (qemu_devtree_setprop_cells).
 
 Yes, but that doesn't work for things that aren't simple arrays
 of 32 bit values, so I think that a generic way to deal
 with those too would be useful. If you wanted to write a
 ranges property you'd need this too, so it doesn't just
 apply to reg.
 
 I think we could avoid the varargs doesn't promote problem
 by using a varargs-macro wrapper:
 
 #define qemu_devtree_setprop_sized_cells(fdt, node, prop, ...) \
do {   \
uint64_t args[] = { __VA_ARGS__ }; \
do_qemu_devtree_setprop_sized_cells(fdt, node, prop, \
args, sizeof(args));
} while (0)
 
 which will promote everything (including the size arguments,
 harmlessly) to uint64_t, and avoids having a varargs function.

That would work, yes :).

 
 Can't we also just add a qemu_devtree_setprop_reg() that walks
 the tree downwards in search for #address-cells and #size-cells
 and assembles the correct reg property from a list of 64bit
 arguments?
 
 Do we have an actual use for this? It seems pretty complicated.
 I would expect that in practice there are two major use cases:
 (a) create your own fdt from scratch (in which case you can
 just make everything 64 bits and in any case will know
 when creating nodes what the #address-cells etc are)
 (b) modify an existing fdt, in which case you definitely don't
 want to go poking around too deeply in the tree; anything
 more than just put an extra node in the root is starting
 to get pretty chancy.

Well, though I do agree it would mimic exactly what the interpreter will do 
when reading those values, ensuring consistency, no?


Alex




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Ian Campbell
On Wed, 2013-06-26 at 12:23 +0100, Paul Durrant wrote:
  -Original Message-
  From: Ian Campbell
  Sent: 26 June 2013 11:40
  To: Tim (Xen.org)
  Cc: Paul Durrant; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; qemu-
  de...@nongnu.org
  Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
  version 2.
  
  On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:
   At 07:47 + on 20 Jun (1371714432), Paul Durrant wrote:
  I agree. If this is really the only solution, we would need to have
  both versions presented to the guest so that old drivers continue to
  work without any intervention.

 I suspect that if we expose both, both sets of drivers try to run the
 same PV connections, and hilarity ensues.

   
Actually I think I can make that work, and it is the conclusion I came
to after Alex's comment.
  
   Ah, nice!  In that case, I'm a lot less worried -- we can just expose
   both versions/devices by default and there's no need for a visible
   control knob tied to driver version (except maybe for debugging).
  
   It means an 'unsupported' device appearing on other/older OSes, which is
   unfortunate, but ISTR only Windows really complains visibly about that.
  
  I'm not at all convinced this is a good approach. Are we going to add a
  third, forth and fifth device whenever Linux, BSD, $other-OS paint
  themselves into a corner somehow WRT their internal driver model vs
  their Xen PV drivers?
  
 
 Is there any harm in having a separate device for each OS's PV
 drivers? I agree it's not entirely elegant, but at least it allows for
 revision control when you need it.
 
  AFAIK the Citrix PV drivers have never been formally supported on
  anything other than XenServer and XCP (and I'm not sure about formally
  for XCP), so this is really an issue of supporting upgrades for people
  running those. I think rather than making hacks upstream, which will
  effectively need to be supported forever, the hack should be done on the
  XenServer side and take advantage of whatever the supported upgrade path
  is (N+1 or N+2 or whatever). This way the hack can eventually go away.
  For anyone who grabbed the older drivers and used them outside of the
  context of XenServer or XCP this is a documentation/awareness issue.
  
  Can we use the blacklisting functionality of the PV unplug protocol to
  blacklist previous versions of the Citrix PV drivers? I wouldn't
  consider this an unsuitable thing to do in upstream, in fact it would be
  using it for exactly the purpose for which it was designed. As long as
  this is sufficient to boot with emulated devices in order to switch to
  the newer drivers that should be good enough.
  
 
 We could blacklist all existing Citrix PV drivers in upstream QEMU, to
 avoid the clash, but that seems like a very unfriendly approach. Also,
 it's not going to stop someone with an existing VM, who happens to be
 using legacy Citrix PV drivers (an AWS VM for instance) receiving a
 driver from Windows Update that will blue-screen their VM on next
 reboot. Hence the only way forward is to bind the new drivers to
 something new, that we can control, so we know what driver a VM is
 going to get from Windows Update.

Is it not also possible to issue a new version of the old driver via
Windows update, e.g. a stunt one which just disables itself? Or to have
the new one somehow reach out and nobble the old one. Or have the new
one detect the presence of the old one and refuse to install until it
has been removed?

  And we may indeed need to modify its revision in future so that we
 can retire old sets of PV drivers and replace them with new ones, but
 only for newer XenServer releases. Thus, I also propose to make the
 PCI revision of the new device a command line parameter.

So this ugliness is really just the thin end of the wedge and all users
of these drivers in the future are going to need to be educated on which
magic qemu option they need to go with the version of the driver they
are running? And when they get an update they might need to go into
their configuration and change it or else risk a blue screen on the next
reboot? That sounds like madness to me...

It may be acceptable for XenServer to tie versions of the PV drivers to
specific versions of XenServer but I'm afraid that is not acceptable
upstream, we can't just go around willy nilly breaking compatibility
between front and backends, this is why we have feature flags,
negotiation and fallbacks.

It would be one thing to accept this unfortunate event and take a one
time hack to dig you out of a hole. But in that case it would really
need to be the case that the new version of the drivers are designed
with sufficient future proofing mechanisms that any future changes can
be handled internally. It seems to me like you intend to treat this
mechanism as an ongoing deliberate mechanism rather than a one off fix
for a historical mistake.

Ian.




Re: [Qemu-devel] [PATCH v2 1/2] ps2: add support of auto-repeat

2013-06-26 Thread Markus Armbruster
Luiz Capitulino lcapitul...@redhat.com writes:

 On Fri, 14 Jun 2013 13:46:41 +0800
 Amos Kong ak...@redhat.com wrote:

 On Fri, May 31, 2013 at 08:31:17PM +0800, Amos Kong wrote:
  On Thu, May 30, 2013 at 11:48:46AM -0500, Anthony Liguori wrote:
   Amos Kong ak...@redhat.com writes:
 
 
diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 3412079..8adbb4a 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -94,6 +94,10 @@ typedef struct {
 int translate;
 int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */
 int ledstate;
+int repeat_period; /* typematic period, ms */
+int repeat_delay; /* typematic delay, ms */
+int repeat_key; /* keycode to repeat */
+QEMUTimer *repeat_timer;
   
   This state needs to be migrated, no?  I suspect it can/should be done
   via a subsection too.
  
  It sounds only reasonable for 'sendkey' command. We want to repeat one
  key for 100 times, the key should be continaully repeated in the dest
  vm until it reaches to 100 times.
  
  For implement this, we should also migrate key_timer in ui/input.c,
  then it will send a release event to ps2 queue when the key_timer
  is expired. The bottom patch migrates repeat_timer  repeat_key,
  where should we save key_timer for migration?
 
 Luiz, any suggestion about migrate the key_timer in ui/input.c?

 I don't have any. Maybe Markus or Juan can help (CC'ed).

 
 We need to migrate it, then sendkey can continually work in dest vm
 until the timer is expired.

I read the thread, but I'm not sure I have enough context for a sensible
answer.  Let me try to piece it together.

On a real PC keyboard, key down generates that key's make code, key up
generates the key's break code.  If the key is typematic, the make code
repeats while the key is down.  First repeat is after a programmable
delay, subsequent repeats happen at a programmable rate.

Which keys are typematic is programmable in scan code set 3, but we
don't implement the commands to do so.  Oh well, the world is full of
crappy clone keyboards, this is just one more.

What problem are you trying to solve?  Your cover letter mentions the
sendkey command.  Takes an array of keys and an optional hold-time,
defaulting to 100ms.

Aside: array of keys + a single hold time is a rotten design.  Outside
the scope of this patch.

100ms is well below the smallest typematic delay, so by default no
repeat happens.  But if you specify a sufficiently large hold-time, it
arguably should.  Is that the problem you're trying to fix?

Why is this problem worth fixing?

Does your patch affect anything but the sendkey command?  I'm asking
because I'm not at all sure injecting emulated repeat between the user's
keyboard and the guest OS is a good idea.  I'd expect the user's
keyboard to repeat according to the user's wishes already, and I'm
concerned a second repeat could mess up things.



Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Tim Deegan
At 11:23 + on 26 Jun (1372245783), Paul Durrant wrote:
  We could blacklist all existing Citrix PV drivers in upstream QEMU,
 to avoid the clash, but that seems like a very unfriendly
 approach. Also, it's not going to stop someone with an existing VM,
 who happens to be using legacy Citrix PV drivers (an AWS VM for
 instance) receiving a driver from Windows Update that will blue-screen
 their VM on next reboot. Hence the only way forward is to bind the new
 drivers to something new, that we can control, so we know what driver
 a VM is going to get from Windows Update.

I don't think you ever got an answer about whether this could be
finagled using version numbers in the drivers.

Also: would it not be possible to have a blkfront driver (even a trivial
one) in your new bus driver so it can detect and avoid this problem?

Or: have your bus driver detect when the blkfront driver is missing and
not unplug the emulated disk?  In fact wouldn't having the emulated disk
driver do the unplug solve it for free?

 And we may indeed need to
 modify its revision in future so that we can retire old sets of PV
 drivers and replace them with new ones, but only for newer XenServer
 releases. Thus, I also propose to make the PCI revision of the new
 device a command line parameter.

I'd rather not.  This gets straight back to having host-admin controls
that have to manually be matched to in-guest software. 

Tim.



[Qemu-devel] [PATCH] mac-io: Add escc-legacy memory alias region

2013-06-26 Thread Alexander Graf
Mac OS X's debugging serial driver accesses the ESCC through a different
register layout, called escc-legacy. This layout differs from the normal
escc register layout purely by the location of the respective registers.

This patch adds a memory alias region that takes normal escc registers and
maps them into the escc-legacy register space.

With this patch applied, a Mac OS X guest successfully emits debug output
on the serial port when run with debug parameters set, for example by running:

  $ qemu-system-ppc -prom-env -'boot-args=-v debug=0x8 io=0xff serial=0x3' \
-cdrom 10.4.iso -boot d

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/misc/macio/macio.c | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c
index 9ac004a..d9971e2 100644
--- a/hw/misc/macio/macio.c
+++ b/hw/misc/macio/macio.c
@@ -67,12 +67,59 @@ typedef struct NewWorldMacIOState {
 MACIOIDEState ide[2];
 } NewWorldMacIOState;
 
+/*
+ * The mac-io has two interfaces to the ESCC. One is called escc-legacy,
+ * while the other one is the normal, current ESCC interface.
+ *
+ * The magic below creates memory aliases to spawn the escc-legacy device
+ * purely by rerouting the respective registers to our escc region. This
+ * works because the only difference between the two memory regions is the
+ * register layout, not their semantics.
+ *
+ * Reference: 
ftp://ftp.software.ibm.com/rs6000/technology/spec/chrp/inwork/CHRP_IORef_1.0.pdf
+ */
+static void macio_escc_legacy_setup(MacIOState *macio_state)
+{
+MemoryRegion *escc_legacy = g_new(MemoryRegion, 1);
+MemoryRegion *bar = macio_state-bar;
+int i;
+static const int maps[] = {
+0x00, 0x00,
+0x02, 0x20,
+0x04, 0x10,
+0x06, 0x30,
+0x08, 0x40,
+0x0A, 0x50,
+0x60, 0x60,
+0x70, 0x70,
+0x80, 0x70,
+0x90, 0x80,
+0xA0, 0x90,
+0xB0, 0xA0,
+0xC0, 0xB0,
+0xD0, 0xC0,
+0xE0, 0xD0,
+0xF0, 0xE0,
+};
+
+memory_region_init(escc_legacy, escc-legacy, 256);
+for (i = 0; i  ARRAY_SIZE(maps); i += 2) {
+MemoryRegion *port = g_new(MemoryRegion, 1);
+memory_region_init_alias(port, escc-legacy-port, 
macio_state-escc_mem,
+ maps[i+1], 0x2);
+memory_region_add_subregion(escc_legacy, maps[i], port);
+}
+
+memory_region_add_subregion(bar, 0x12000, escc_legacy);
+}
+
 static void macio_bar_setup(MacIOState *macio_state)
 {
 MemoryRegion *bar = macio_state-bar;
 
 if (macio_state-escc_mem) {
 memory_region_add_subregion(bar, 0x13000, macio_state-escc_mem);
+macio_escc_legacy_setup(macio_state);
 }
 }
 
-- 
1.8.1.4




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Paul Durrant
 -Original Message-
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: 26 June 2013 12:58
 To: Paul Durrant
 Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; qemu-
 de...@nongnu.org
 Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
 version 2.
 
 At 11:23 + on 26 Jun (1372245783), Paul Durrant wrote:
   We could blacklist all existing Citrix PV drivers in upstream QEMU,
  to avoid the clash, but that seems like a very unfriendly
  approach. Also, it's not going to stop someone with an existing VM,
  who happens to be using legacy Citrix PV drivers (an AWS VM for
  instance) receiving a driver from Windows Update that will blue-screen
  their VM on next reboot. Hence the only way forward is to bind the new
  drivers to something new, that we can control, so we know what driver
  a VM is going to get from Windows Update.
 
 I don't think you ever got an answer about whether this could be
 finagled using version numbers in the drivers.

No, we thought about that and I don't believe it would be possible.
 
 Also: would it not be possible to have a blkfront driver (even a trivial
 one) in your new bus driver so it can detect and avoid this problem?
 
 Or: have your bus driver detect when the blkfront driver is missing and
 not unplug the emulated disk?  In fact wouldn't having the emulated disk
 driver do the unplug solve it for free?


The issue is the old s/w not the new s/w. The old drivers make assumptions 
about each other's presence as we can't change that because they are already 
out there.
 
  And we may indeed need to
  modify its revision in future so that we can retire old sets of PV
  drivers and replace them with new ones, but only for newer XenServer
  releases. Thus, I also propose to make the PCI revision of the new
  device a command line parameter.
 
 I'd rather not.  This gets straight back to having host-admin controls
 that have to manually be matched to in-guest software.
 

Well not really. This is just the same as a h/w vendor shipping a new device. 
The drivers for the old device are still there on Windows Update; so no change. 
The new drivers are for the new device so only download when it is present. 
Setting the revision number for the 'citrix pv bus' device would only be like 
choosing which emulated NIC you want.

  Paul



Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-06-26 Thread Markus Armbruster
Meh, I got confused and reviewed an out-of-date version.  Hope it's not
entirely noise.



[Qemu-devel] [PATCH v2 0/2] block: add drive_backup HMP command

2013-06-26 Thread Stefan Hajnoczi
These patches add the sync mode argument to drive-backup (just like
drive-mirror) and then introduce the drive_backup HMP command.

It's necessary to add the sync mode argument although only the 'full' is
supported, since the drive_mirror HMP command defaults to sync mode 'top'.  To
avoid confusion, we implement the same behavior as drive_mirror and use the
sync mode argument which Ian Main im...@redhat.com is currently working on
implementing.

Stefan Hajnoczi (2):
  blockdev: add sync mode to drive-backup QMP command
  block: add drive_backup HMP command

 blockdev.c |  6 ++
 hmp-commands.hx| 20 
 hmp.c  | 28 
 hmp.h  |  1 +
 qapi-schema.json   | 14 --
 qmp-commands.hx|  6 +-
 tests/qemu-iotests/055 | 36 +---
 7 files changed, 93 insertions(+), 18 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [PATCH v2 1/2] blockdev: add sync mode to drive-backup QMP command

2013-06-26 Thread Stefan Hajnoczi
The drive-backup command is similar to the drive-mirror command, except
no guest data written after the command executes gets copied.  Add a
sync mode argument which determines whether the entire disk is copied,
just allocated clusters, or only clusters being written to by the guest.

Currently only sync mode 'full' is supported - it copies the entire disk.
For read-only point-in-time snapshots we may only need sync mode 'none'
since the target can be a qcow2 file using the guest's disk as its
backing file (no need to copy the entire disk).  Finally, sync mode
'top' is useful if we wish to preserve the backing chain.

Note that this patch just adds the sync mode argument to drive-backup.
It does not implement sync modes 'top' or 'none'.  This patch is
necessary so we can add a drive-backup HMP command that behaves like the
existing drive-mirror HMP command and takes a sync mode.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 blockdev.c |  6 ++
 qapi-schema.json   | 14 --
 qmp-commands.hx|  6 +-
 tests/qemu-iotests/055 | 36 +---
 4 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index edca843..fe73278 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -936,6 +936,7 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 
 qmp_drive_backup(backup-device, backup-target,
  backup-has_format, backup-format,
+ backup-sync,
  backup-has_mode, backup-mode,
  backup-has_speed, backup-speed,
  backup-has_on_source_error, backup-on_source_error,
@@ -1421,6 +1422,7 @@ void qmp_block_commit(const char *device,
 
 void qmp_drive_backup(const char *device, const char *target,
   bool has_format, const char *format,
+  enum MirrorSyncMode sync,
   bool has_mode, enum NewImageMode mode,
   bool has_speed, int64_t speed,
   bool has_on_source_error, BlockdevOnError 
on_source_error,
@@ -1435,6 +1437,10 @@ void qmp_drive_backup(const char *device, const char 
*target,
 int64_t size;
 int ret;
 
+if (sync != MIRROR_SYNC_MODE_FULL) {
+error_setg(errp, only sync mode 'full' is currently supported);
+return;
+}
 if (!has_speed) {
 speed = 0;
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index d6479e1..1e19e4b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1626,6 +1626,10 @@
 # @format: #optional the format of the new destination, default is to
 #  probe if @mode is 'existing', else the format of the source
 #
+# @sync: what parts of the disk image should be copied to the destination
+#(all the disk, only the sectors allocated in the topmost image, or
+#only new I/O).
+#
 # @mode: #optional whether and how QEMU should create a new image, default is
 #'absolute-paths'.
 #
@@ -1647,7 +1651,8 @@
 ##
 { 'type': 'DriveBackup',
   'data': { 'device': 'str', 'target': 'str', '*format': 'str',
-'*mode': 'NewImageMode', '*speed': 'int',
+'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
+'*speed': 'int',
 '*on-source-error': 'BlockdevOnError',
 '*on-target-error': 'BlockdevOnError' } }
 
@@ -1807,6 +1812,10 @@
 #
 # @speed: #optional the maximum speed, in bytes per second
 #
+# @sync: what parts of the disk image should be copied to the destination
+#(all the disk, only the sectors allocated in the topmost image, or
+#only new I/O).
+#
 # @on-source-error: #optional the action to take on an error on the source,
 #   default 'report'.  'stop' and 'enospc' can only be used
 #   if the block device supports io-status (see BlockInfo).
@@ -1826,7 +1835,8 @@
 ##
 { 'command': 'drive-backup',
   'data': { 'device': 'str', 'target': 'str', '*format': 'str',
-'*mode': 'NewImageMode', '*speed': 'int',
+'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
+'*speed': 'int',
 '*on-source-error': 'BlockdevOnError',
 '*on-target-error': 'BlockdevOnError' } }
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 362f0e1..e075df4 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -913,7 +913,7 @@ EQMP
 
 {
 .name   = drive-backup,
-.args_type  = device:B,target:s,speed:i?,mode:s?,format:s?,
+.args_type  = sync:s,device:B,target:s,speed:i?,mode:s?,format:s?,
   on-source-error:s?,on-target-error:s?,
 .mhandler.cmd_new = qmp_marshal_input_drive_backup,
 },
@@ -939,6 +939,10 @@ Arguments:
 - format: the format of the new destination, default is to probe if 'mode' is
 'existing', else the format of the source
 (json-string, optional)
+- sync: what parts of the disk 

[Qemu-devel] [PATCH v2 2/2] block: add drive_backup HMP command

2013-06-26 Thread Stefan Hajnoczi
Make drive_backup available on the HMP monitor:

  drive_backup [-n] [-f] device target [format]

The -n flag requests QEMU to reuse the image found in new-image-file,
instead of recreating it from scratch.

The -f flag requests QEMU to copy the whole disk, so that the result
does not need a backing file.  Note that this flag *must* currently be
passed since the other sync modes ('none' and 'top') have not been
implemented yet.  Requiring it ensures that drive_backup behaves like
drive_mirror.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 hmp-commands.hx | 20 
 hmp.c   | 28 
 hmp.h   |  1 +
 3 files changed, 49 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..800101b 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1059,6 +1059,26 @@ using the specified target.
 ETEXI
 
 {
+.name   = drive_backup,
+.args_type  = reuse:-n,full:-f,device:B,target:s,format:s?,
+.params = [-n] [-f] device target [format],
+.help   = initiates a point-in-time\n\t\t\t
+  copy for a device. The device's contents are\n\t\t\t
+  copied to the new image file, excluding data 
that\n\t\t\t
+  is written after the command is started.\n\t\t\t
+  The -n flag requests QEMU to reuse the image 
found\n\t\t\t
+  in new-image-file, instead of recreating it from 
scratch.\n\t\t\t
+  The -f flag requests QEMU to copy the whole 
disk,\n\t\t\t
+  so that the result does not need a backing 
file.\n\t\t\t,
+.mhandler.cmd = hmp_drive_backup,
+},
+STEXI
+@item drive_backup
+@findex drive_backup
+Start a point-in-time copy of a block device to a specificed target.
+ETEXI
+
+{
 .name   = drive_add,
 .args_type  = pci_addr:s,opts:s,
 .params = [[domain:]bus:]slot\n
diff --git a/hmp.c b/hmp.c
index 494a9aa..f9fb999 100644
--- a/hmp.c
+++ b/hmp.c
@@ -889,6 +889,34 @@ void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, errp);
 }
 
+void hmp_drive_backup(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, device);
+const char *filename = qdict_get_str(qdict, target);
+const char *format = qdict_get_try_str(qdict, format);
+int reuse = qdict_get_try_bool(qdict, reuse, 0);
+int full = qdict_get_try_bool(qdict, full, 0);
+enum NewImageMode mode;
+Error *errp = NULL;
+
+if (!filename) {
+error_set(errp, QERR_MISSING_PARAMETER, target);
+hmp_handle_error(mon, errp);
+return;
+}
+
+if (reuse) {
+mode = NEW_IMAGE_MODE_EXISTING;
+} else {
+mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+}
+
+qmp_drive_backup(device, filename, !!format, format,
+ full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
+ true, mode, false, 0, false, 0, false, 0, errp);
+hmp_handle_error(mon, errp);
+}
+
 void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
 {
 const char *device = qdict_get_str(qdict, device);
diff --git a/hmp.h b/hmp.h
index 56d2e92..6c3bdcd 100644
--- a/hmp.h
+++ b/hmp.h
@@ -55,6 +55,7 @@ void hmp_balloon(Monitor *mon, const QDict *qdict);
 void hmp_block_resize(Monitor *mon, const QDict *qdict);
 void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict);
 void hmp_drive_mirror(Monitor *mon, const QDict *qdict);
+void hmp_drive_backup(Monitor *mon, const QDict *qdict);
 void hmp_migrate_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_downtime(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict);
-- 
1.8.1.4




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Tim Deegan
At 12:06 + on 26 Jun (1372248391), Paul Durrant wrote:
  -Original Message-
  From: Tim Deegan [mailto:t...@xen.org]
  Sent: 26 June 2013 12:58
  To: Paul Durrant
  Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; qemu-
  de...@nongnu.org
  Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
  version 2.
  
  At 11:23 + on 26 Jun (1372245783), Paul Durrant wrote:
We could blacklist all existing Citrix PV drivers in upstream QEMU,
   to avoid the clash, but that seems like a very unfriendly
   approach. Also, it's not going to stop someone with an existing VM,
   who happens to be using legacy Citrix PV drivers (an AWS VM for
   instance) receiving a driver from Windows Update that will blue-screen
   their VM on next reboot. Hence the only way forward is to bind the new
   drivers to something new, that we can control, so we know what driver
   a VM is going to get from Windows Update.
  
  I don't think you ever got an answer about whether this could be
  finagled using version numbers in the drivers.
 
 No, we thought about that and I don't believe it would be possible.

This doc makes it look like it's just a matter of choosing appropriate
version and dates:
http://msdn.microsoft.com/en-us/library/windows/hardware/gg487453.aspx

  Also: would it not be possible to have a blkfront driver (even a trivial
  one) in your new bus driver so it can detect and avoid this problem?
  
  Or: have your bus driver detect when the blkfront driver is missing and
  not unplug the emulated disk?  In fact wouldn't having the emulated disk
  driver do the unplug solve it for free?
 
 The issue is the old s/w not the new s/w. The old drivers make
 assumptions about each other's presence as we can't change that
 because they are already out there.

The issue you mentioned was that the old drivers bound the block driver
to the PCI device, and when your new driver is installed you get STOP 7B
because the system disk is missing (because I guess you'd need the new
xenbus driver to come up before it will trigger installing the new block
driver).

So can the new driver not fix this, either by running a trivial blkfront
itself or by allowing the emulated IDE controller to live?

   And we may indeed need to
   modify its revision in future so that we can retire old sets of PV
   drivers and replace them with new ones, but only for newer XenServer
   releases. Thus, I also propose to make the PCI revision of the new
   device a command line parameter.
  
  I'd rather not.  This gets straight back to having host-admin controls
  that have to manually be matched to in-guest software.
 
 Well not really. This is just the same as a h/w vendor shipping a new
 device.

Well, that would be more like having the PCI revision reflect the Xen
version.  Which might be a reasonable idea, if there is to be a second
PCI device.  

But metaphors aside, it's still requiring an admin change in order to
match the software inside the VM, which as we've seen is unpopular with
admins. :)

Tim.



Re: [Qemu-devel] [PATCH v11 14/15] rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state transition

2013-06-26 Thread Michael R. Hines

On 06/26/2013 02:37 AM, Paolo Bonzini wrote:

Il 26/06/2013 02:31, Michael R. Hines ha scritto:

On 06/25/2013 05:06 PM, Paolo Bonzini wrote:

Il 25/06/2013 22:56, Michael R. Hines ha scritto:

I was wrong - this does require a protocol extension.

This is because the RDMA transfers are asynchronous, and thus
we cannot know in advance that it is safe to unregister the memory
associated with each individual transfer before the transfer actually
completes.

While the destination currently uses the protocol to participate in
*registering* the page, the destination does not participate in the
RDMA transfers themselves, only the source does, and thus would
require a new exchange of messages to block and instruct the
destination to unpin the memory.

Yes, that's what I recalled too (really what mst told me :)).  Does it
need to be blocking though?  As long as the pinning is blocking, and
messages are processed in order, the source can proceed immediately
after sending an unpin message.  This assumes of course that the chunk
is not being transmitted, and I am not sure how easy the source can
determine that.

No, they're not processed in order. In fact, not only does the device
write out of order, but also the PCI bus writes out of order.
This was such a problem in fact, that I fixed several bugs as a result
a few weeks ago (v7 of the patch with an in-depth description).

The destination simply cannot assume whatsoever what the ordering
of the writes are - that's really the whole point of using RDMA in the
first place so that the software can get out of the way of the transfer
process to lower the latency of each transfer.

The memory is processed out of order, but what about the messages?
Those must be in order.

Note that I wrote above This assumes of course that the chunk is not
being transmitted.  Can the source know when an asynchronous transfer
finished, and delay the unpinning until that time?

Paolo


Yes, the source does know. There's no problem unpinning on the source.

But both sides must do the unpinning, not just the source.

Did I misunderstand you? Are you suggesting *only* unpinning on the source?

- Michael




Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Peter Crosthwaite
Hi,

On Wed, Jun 26, 2013 at 8:50 PM, Peter Maydell peter.mayd...@linaro.org wrote:
 On 26 June 2013 11:31, Alexander Graf ag...@suse.de wrote:
 I think it makes sense to make this API special-purpose for reg.
 We currently have a generic put any number of 32bit values into the
 property function (qemu_devtree_setprop_cells).

 Yes, but that doesn't work for things that aren't simple arrays
 of 32 bit values, so I think that a generic way to deal
 with those too would be useful. If you wanted to write a
 ranges property you'd need this too, so it doesn't just
 apply to reg.


+1. And wouldn't an implementation of such a reg-specific function
back onto Peter's new function quite nicely anyway?

 I think we could avoid the varargs doesn't promote problem
 by using a varargs-macro wrapper:

 #define qemu_devtree_setprop_sized_cells(fdt, node, prop, ...) \
 do {   \
 uint64_t args[] = { __VA_ARGS__ }; \
 do_qemu_devtree_setprop_sized_cells(fdt, node, prop, \
 args, sizeof(args));
 } while (0)


Are statement expressions sanctioned? Or do we need to give up the
nice caller accessible return codes?

And can we factor out common functionality (mainly the error checking)
with existing set_prop_cells to make the interfaces consistent? (we
need to get rid of those aborts sooner or later)

 which will promote everything (including the size arguments,
 harmlessly) to uint64_t, and avoids having a varargs function.

 Can't we also just add a qemu_devtree_setprop_reg() that walks
 the tree downwards in search for #address-cells and #size-cells
 and assembles the correct reg property from a list of 64bit
 arguments?


I have a patch in my tree that generalises qemu_devtree_getprop* to
allow you walk parents for properties (as per the #foo-cells
semantic). I use it for interrupt cells however, which kinda suggests
that this wish for parent traversal is more generic than just
populating reg. I think that Peters patch, along with a parent search
friendly property search will be enough to be able to do whatever you
want in only a handful of lines.

 Do we have an actual use for this? It seems pretty complicated.
 I would expect that in practice there are two major use cases:
  (a) create your own fdt from scratch (in which case you can
  just make everything 64 bits and in any case will know
  when creating nodes what the #address-cells etc are)
  (b) modify an existing fdt, in which case you definitely don't
  want to go poking around too deeply in the tree; anything
  more than just put an extra node in the root is starting
  to get pretty chancy.


Looking to the future, what about -device adding a periph then having
qemu add it to the dtb before boot?

Regards,
Peter

 thanks
 -- PMM



Re: [Qemu-devel] [PATCH v11 14/15] rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state transition

2013-06-26 Thread Paolo Bonzini
Il 26/06/2013 14:37, Michael R. Hines ha scritto:
 On 06/26/2013 02:37 AM, Paolo Bonzini wrote:
 Il 26/06/2013 02:31, Michael R. Hines ha scritto:
 On 06/25/2013 05:06 PM, Paolo Bonzini wrote:
 Il 25/06/2013 22:56, Michael R. Hines ha scritto:
 I was wrong - this does require a protocol extension.

 This is because the RDMA transfers are asynchronous, and thus
 we cannot know in advance that it is safe to unregister the memory
 associated with each individual transfer before the transfer actually
 completes.

 While the destination currently uses the protocol to participate in
 *registering* the page, the destination does not participate in the
 RDMA transfers themselves, only the source does, and thus would
 require a new exchange of messages to block and instruct the
 destination to unpin the memory.
 Yes, that's what I recalled too (really what mst told me :)).  Does it
 need to be blocking though?  As long as the pinning is blocking, and
 messages are processed in order, the source can proceed immediately
 after sending an unpin message.  This assumes of course that the chunk
 is not being transmitted, and I am not sure how easy the source can
 determine that.
 No, they're not processed in order. In fact, not only does the device
 write out of order, but also the PCI bus writes out of order.
 This was such a problem in fact, that I fixed several bugs as a result
 a few weeks ago (v7 of the patch with an in-depth description).

 The destination simply cannot assume whatsoever what the ordering
 of the writes are - that's really the whole point of using RDMA in the
 first place so that the software can get out of the way of the transfer
 process to lower the latency of each transfer.
 The memory is processed out of order, but what about the messages?
 Those must be in order.

 Note that I wrote above This assumes of course that the chunk is not
 being transmitted.  Can the source know when an asynchronous transfer
 finished, and delay the unpinning until that time?
 
 Yes, the source does know. There's no problem unpinning on the source.
 
 But both sides must do the unpinning, not just the source.
 
 Did I misunderstand you? Are you suggesting *only* unpinning on the source?

I'm suggesting (if possible) that the source only tells the destination
to unpin once it knows it is safe for the destination to do it.  As long
as unpin and pin messages are not reordered, it should be possible to do
it with an asynchronous message from the source to the destination.

Paolo



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Alexander Graf

On 06/26/2013 02:38 PM, Peter Crosthwaite wrote:

Hi,

On Wed, Jun 26, 2013 at 8:50 PM, Peter Maydellpeter.mayd...@linaro.org  wrote:

On 26 June 2013 11:31, Alexander Grafag...@suse.de  wrote:

I think it makes sense to make this API special-purpose for reg.
We currently have a generic put any number of 32bit values into the
property function (qemu_devtree_setprop_cells).

Yes, but that doesn't work for things that aren't simple arrays
of 32 bit values, so I think that a generic way to deal
with those too would be useful. If you wanted to write a
ranges property you'd need this too, so it doesn't just
apply to reg.


+1. And wouldn't an implementation of such a reg-specific function
back onto Peter's new function quite nicely anyway?


I think we could avoid the varargs doesn't promote problem
by using a varargs-macro wrapper:

#define qemu_devtree_setprop_sized_cells(fdt, node, prop, ...) \
 do {   \
 uint64_t args[] = { __VA_ARGS__ }; \
 do_qemu_devtree_setprop_sized_cells(fdt, node, prop, \
 args, sizeof(args));
 } while (0)


Are statement expressions sanctioned? Or do we need to give up the
nice caller accessible return codes?

And can we factor out common functionality (mainly the error checking)
with existing set_prop_cells to make the interfaces consistent? (we
need to get rid of those aborts sooner or later)


which will promote everything (including the size arguments,
harmlessly) to uint64_t, and avoids having a varargs function.


Can't we also just add a qemu_devtree_setprop_reg() that walks
the tree downwards in search for #address-cells and #size-cells
and assembles the correct reg property from a list of 64bit
arguments?

I have a patch in my tree that generalises qemu_devtree_getprop* to
allow you walk parents for properties (as per the #foo-cells
semantic). I use it for interrupt cells however, which kinda suggests
that this wish for parent traversal is more generic than just
populating reg. I think that Peters patch, along with a parent search
friendly property search will be enough to be able to do whatever you
want in only a handful of lines.


Do we have an actual use for this? It seems pretty complicated.
I would expect that in practice there are two major use cases:
  (a) create your own fdt from scratch (in which case you can
  just make everything 64 bits and in any case will know
  when creating nodes what the #address-cells etc are)
  (b) modify an existing fdt, in which case you definitely don't
  want to go poking around too deeply in the tree; anything
  more than just put an extra node in the root is starting
  to get pretty chancy.


Looking to the future, what about -device adding a periph then having
qemu add it to the dtb before boot?


I've had a lengthy discussion about that with Anthony a while ago. His 
take was that this is perfectly reasonable, as long as the device tree 
generation code stays within the machine model. The machine would just 
traverse the QOM hierachy and generate device tree nodes for everything 
it knows.



Alex




Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

2013-06-26 Thread Paul Durrant
 -Original Message-
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: 26 June 2013 13:36
 To: Paul Durrant
 Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; qemu-
 de...@nongnu.org
 Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
 version 2.
 
 At 12:06 + on 26 Jun (1372248391), Paul Durrant wrote:
   -Original Message-
   From: Tim Deegan [mailto:t...@xen.org]
   Sent: 26 June 2013 12:58
   To: Paul Durrant
   Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-de...@lists.xen.org;
 qemu-
   de...@nongnu.org
   Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI
 device
   version 2.
  
   At 11:23 + on 26 Jun (1372245783), Paul Durrant wrote:
 We could blacklist all existing Citrix PV drivers in upstream QEMU,
to avoid the clash, but that seems like a very unfriendly
approach. Also, it's not going to stop someone with an existing VM,
who happens to be using legacy Citrix PV drivers (an AWS VM for
instance) receiving a driver from Windows Update that will blue-screen
their VM on next reboot. Hence the only way forward is to bind the
 new
drivers to something new, that we can control, so we know what driver
a VM is going to get from Windows Update.
  
   I don't think you ever got an answer about whether this could be
   finagled using version numbers in the drivers.
 
  No, we thought about that and I don't believe it would be possible.
 
 This doc makes it look like it's just a matter of choosing appropriate
 version and dates:
 http://msdn.microsoft.com/en-
 us/library/windows/hardware/gg487453.aspx
 

The issue, I believe, is that DriverVerDate trumps DriverVerVersion and AFAIK 
we can't fake an old date because the driver signing process will not allow it.

   Also: would it not be possible to have a blkfront driver (even a trivial
   one) in your new bus driver so it can detect and avoid this problem?
  
   Or: have your bus driver detect when the blkfront driver is missing and
   not unplug the emulated disk?  In fact wouldn't having the emulated disk
   driver do the unplug solve it for free?
 
  The issue is the old s/w not the new s/w. The old drivers make
  assumptions about each other's presence as we can't change that
  because they are already out there.
 
 The issue you mentioned was that the old drivers bound the block driver
 to the PCI device, and when your new driver is installed you get STOP 7B
 because the system disk is missing (because I guess you'd need the new
 xenbus driver to come up before it will trigger installing the new block
 driver).
 
 So can the new driver not fix this, either by running a trivial blkfront
 itself or by allowing the emulated IDE controller to live?
 

Aha, the old drivers have a registry override which will cause them to quiesce 
so I could have the new driver go and set that to make them shut up. That might 
allow us to cleanly take over on the next reboot. 

And we may indeed need to
modify its revision in future so that we can retire old sets of PV
drivers and replace them with new ones, but only for newer XenServer
releases. Thus, I also propose to make the PCI revision of the new
device a command line parameter.
  
   I'd rather not.  This gets straight back to having host-admin controls
   that have to manually be matched to in-guest software.
 
  Well not really. This is just the same as a h/w vendor shipping a new
  device.
 
 Well, that would be more like having the PCI revision reflect the Xen
 version.  Which might be a reasonable idea, if there is to be a second
 PCI device.
 
 But metaphors aside, it's still requiring an admin change in order to
 match the software inside the VM, which as we've seen is unpopular with
 admins. :)
 

Yes, but that's more of a business decision than an architectural one. As you 
say, it may well be unpopular and thus is something I'd prefer we didn't do 
unless we absolutely have to. What we can't do is have drivers being downloaded 
from Windows Update into systems where we know they are going to break things 
so I would like the new device to have enough tunability to avoid a trainwreck, 
and a parameterised revision is just enough :-) Thus, I still prefer to have a 
new device and leave the old one alone.

  Paul



Re: [Qemu-devel] Monitoring Screen Activity in QEMU/KVM

2013-06-26 Thread Richard W.M. Jones
On Wed, Jun 26, 2013 at 07:47:47AM +0530, Shehbaz Jaffer wrote:
 I want to determine the amount of screen activity taking place on VGA
 monitor/ Screen for different applications (eg. playing vlc video, normal
 typing.)
 
 When I do not start the X server, I can easily determine the screen
 activity by counting the number of pages accessed in the region 0xA -
 0xB (This is the VGA Monitor region in boot screen).
 
 However when I start the X Server, A diffrent set of pages are hit. Could
 anyone please explain how the VGA Monitor works in QEMU?

qemu emulates a PCI device.  Probably an ancient Cirrus Logic CL 5446,
but other devices are possible.

 Or if someone
 could suggest an alternate solution to determine amount of screen activity
 while playing diffrent applications?

I would take an existing VNC client and modify it to log the screeen
activity you want to log.  VNC is a well-documented protocol, there
are several high quality open source clients [gtk-vnc is the one I'd
pick], and doing this means the guest can run at full speed.

It depends a lot on how you define screen activity.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Peter Maydell
On 26 June 2013 13:38, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote:
 On Wed, Jun 26, 2013 at 8:50 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 I think we could avoid the varargs doesn't promote problem
 by using a varargs-macro wrapper:

 #define qemu_devtree_setprop_sized_cells(fdt, node, prop, ...) \
 do {   \
 uint64_t args[] = { __VA_ARGS__ }; \
 do_qemu_devtree_setprop_sized_cells(fdt, node, prop, \
 args, sizeof(args));
 } while (0)


 Are statement expressions sanctioned? Or do we need to give up the
 nice caller accessible return codes?

I just borrowed the syntax from the existing qemu_devtree_setprop_cells
macro to illustrate how the varargs bit would work; I'll tweak it to
actually get the return code right for the patch proper.

 And can we factor out common functionality (mainly the error checking)
 with existing set_prop_cells to make the interfaces consistent? (we
 need to get rid of those aborts sooner or later)

Maybe we should at some point switch to using Error** for error
reporting in our wrapper APIs here? It is a bit odd that half
our device tree functions die on error and the other half
report a failure code.

 which will promote everything (including the size arguments,
 harmlessly) to uint64_t, and avoids having a varargs function.

 Can't we also just add a qemu_devtree_setprop_reg() that walks
 the tree downwards in search for #address-cells and #size-cells
 and assembles the correct reg property from a list of 64bit
 arguments?


 I have a patch in my tree that generalises qemu_devtree_getprop* to
 allow you walk parents for properties (as per the #foo-cells
 semantic). I use it for interrupt cells however, which kinda suggests
 that this wish for parent traversal is more generic than just
 populating reg. I think that Peters patch, along with a parent search
 friendly property search will be enough to be able to do whatever you
 want in only a handful of lines.

Sounds good to me.

-- PMM



Re: [Qemu-devel] [PATCH 1/2] device_tree: Add qemu_devtree_setprop_sized_cells() utility function

2013-06-26 Thread Peter Maydell
On 26 June 2013 13:38, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote:
 Are statement expressions sanctioned?

..and to answer this specific question which I missed first time
round, we already use them (eg container_of, MAKE_TCGV_PTR).

We seem to mark some but not all of them with __extension__;
I think on balance that's probably unnecessary given we aren't
going to be trying to compile with -pedantic any time soon.

-- PMM



Re: [Qemu-devel] [PATCH 2/5] tcg: Simplify logic using TCG_OPF_NOT_PRESENT

2013-06-26 Thread Claudio Fontana
Hello Richard,

eons ago Richard wrote:
 
 Expand the definition of not present to include should not be present.
 This means we can simplify the logic surrounding the generic tcg opcodes
 for which the host backend ought not be providing definitions.
 
 Signed-off-by: Richard Henderson address@hidden
 ---
  tcg/tcg-opc.h | 26 +++---
  tcg/tcg.c |  4 +---
  tcg/tcg.h |  3 ++-
  3 files changed, 18 insertions(+), 15 deletions(-)
 
 diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
 index db5e6e5..83f7147 100644
 --- a/tcg/tcg-opc.h
 +++ b/tcg/tcg-opc.h
 @@ -27,17 +27,21 @@
   */
  
  /* predefined ops */
 -DEF(end, 0, 0, 0, 0) /* must be kept first */
 -DEF(nop, 0, 0, 0, 0)
 -DEF(nop1, 0, 0, 1, 0)
 -DEF(nop2, 0, 0, 2, 0)
 -DEF(nop3, 0, 0, 3, 0)
 -DEF(nopn, 0, 0, 1, 0) /* variable number of parameters */
 +DEF(end, 0, 0, 0, TCG_OPF_NOT_PRESENT) /* must be kept first */
 +DEF(nop, 0, 0, 0, TCG_OPF_NOT_PRESENT)
 +DEF(nop1, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 +DEF(nop2, 0, 0, 2, TCG_OPF_NOT_PRESENT)
 +DEF(nop3, 0, 0, 3, TCG_OPF_NOT_PRESENT)
  
 -DEF(discard, 1, 0, 0, 0)
 +/* variable number of parameters */
 +DEF(nopn, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 +
 +DEF(discard, 1, 0, 0, TCG_OPF_NOT_PRESENT)
 +DEF(set_label, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_NOT_PRESENT)
 +
 +/* variable number of parameters */
 +DEF(call, 0, 1, 2, TCG_OPF_CALL_CLOBBER | TCG_OPF_NOT_PRESENT)

If TCG_OPF_NOT_PRESENT is supposed to mark opcodes that should not be 
implemented by the target, then setting TCG_OPF_NOT_PRESENT for 'call' seems 
wrong to me.
This will actually trigger the code in the ifdef CONFIG_DEBUG_TCG and cause an 
assert to fail. 

  
 -DEF(set_label, 0, 0, 1, TCG_OPF_BB_END)
 -DEF(call, 0, 1, 2, TCG_OPF_CALL_CLOBBER) /* variable number of parameters */
  DEF(br, 0, 0, 1, TCG_OPF_BB_END)
  
  #define IMPL(X) (__builtin_constant_p(X)  !(X) ? TCG_OPF_NOT_PRESENT : 0)
 @@ -166,9 +170,9 @@ DEF(muls2_i64, 2, 2, 0, IMPL64 | 
 IMPL(TCG_TARGET_HAS_muls2_i64))
  
  /* QEMU specific */
  #if TARGET_LONG_BITS  TCG_TARGET_REG_BITS
 -DEF(debug_insn_start, 0, 0, 2, 0)
 +DEF(debug_insn_start, 0, 0, 2, TCG_OPF_NOT_PRESENT)
  #else
 -DEF(debug_insn_start, 0, 0, 1, 0)
 +DEF(debug_insn_start, 0, 0, 1, TCG_OPF_NOT_PRESENT)
  #endif
  DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_END)
  DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_END)
 diff --git a/tcg/tcg.c b/tcg/tcg.c
 index 1d8099c..c7e6567 100644
 --- a/tcg/tcg.c
 +++ b/tcg/tcg.c
 @@ -1160,9 +1160,7 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef 
 *tdefs)
  i = 0;
  for (op = 0; op  ARRAY_SIZE(tcg_op_defs); op++) {
  const TCGOpDef *def = tcg_op_defs[op];
 -if (op  INDEX_op_call
 -|| op == INDEX_op_debug_insn_start
 -|| (def-flags  TCG_OPF_NOT_PRESENT)) {
 +if (def-flags  TCG_OPF_NOT_PRESENT) {


this will trigger for op = 'call'


  /* Wrong entry in op definitions? */
  if (def-used) {
  fprintf(stderr, Invalid op definition for %s\n, def-name);
 diff --git a/tcg/tcg.h b/tcg/tcg.h
 index df375cf..72b694f 100644
 --- a/tcg/tcg.h
 +++ b/tcg/tcg.h
 @@ -593,7 +593,8 @@ enum {
  TCG_OPF_SIDE_EFFECTS = 0x04,
  /* Instruction operands are 64-bits (otherwise 32-bits).  */
  TCG_OPF_64BIT= 0x08,
 -/* Instruction is optional and not implemented by the host.  */
 +/* Instruction is optional and not implemented by the host, or insn
 +   is generic and should not be implemened by the host.  */
  TCG_OPF_NOT_PRESENT  = 0x10,
  };
  
 -- 
 1.8.1.4





[Qemu-devel] [PATCH v4 02/12] boot-order-test: New; covering just PC for now

2013-06-26 Thread Markus Armbruster

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/Makefile  |  2 ++
 tests/boot-order-test.c | 73 +
 2 files changed, 75 insertions(+)
 create mode 100644 tests/boot-order-test.c

diff --git a/tests/Makefile b/tests/Makefile
index c107489..394e029 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -54,6 +54,7 @@ gcov-files-i386-y = hw/fdc.c
 check-qtest-i386-y += tests/ide-test$(EXESUF)
 check-qtest-i386-y += tests/hd-geo-test$(EXESUF)
 gcov-files-i386-y += hw/hd-geometry.c
+check-qtest-i386-y += tests/boot-order-test$(EXESUF)
 check-qtest-i386-y += tests/rtc-test$(EXESUF)
 check-qtest-i386-y += tests/i440fx-test$(EXESUF)
 check-qtest-i386-y += tests/fw_cfg-test$(EXESUF)
@@ -130,6 +131,7 @@ tests/m48t59-test$(EXESUF): tests/m48t59-test.o
 tests/fdc-test$(EXESUF): tests/fdc-test.o
 tests/ide-test$(EXESUF): tests/ide-test.o $(libqos-pc-obj-y)
 tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
+tests/boot-order-test$(EXESUF): tests/boot-order-test.o
 tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
 tests/i440fx-test$(EXESUF): tests/i440fx-test.o $(libqos-pc-obj-y)
 tests/fw_cfg-test$(EXESUF): tests/fw_cfg-test.o $(libqos-pc-obj-y)
diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
new file mode 100644
index 000..447be31
--- /dev/null
+++ b/tests/boot-order-test.c
@@ -0,0 +1,73 @@
+/*
+ * Boot order test cases.
+ *
+ * Copyright (c) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Markus Armbruster arm...@redhat.com,
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include glib.h
+#include libqtest.h
+
+static void test_pc_cmos_byte(int reg, int expected)
+{
+int actual;
+
+outb(0x70, reg);
+actual = inb(0x71);
+g_assert_cmphex(actual, ==, expected);
+}
+
+static void test_pc_cmos(uint8_t boot1, uint8_t boot2)
+{
+test_pc_cmos_byte(0x38, boot1);
+test_pc_cmos_byte(0x3d, boot2);
+}
+
+static void test_pc_with_args(const char *test_args,
+  uint8_t boot1, uint8_t boot2,
+  uint8_t reboot1, uint8_t reboot2)
+{
+char *args = g_strdup_printf(-nodefaults -display none %s, test_args);
+
+qtest_start(args);
+test_pc_cmos(boot1, boot2);
+qmp({ 'execute': 'system_reset' });
+/*
+ * system_reset only requests reset.  We get a RESET event after
+ * the actual reset completes.  Need to wait for that.
+ */
+qmp();/* HACK: wait for event */
+test_pc_cmos(reboot1, reboot2);
+qtest_quit(global_qtest);
+g_free(args);
+}
+
+static void test_pc_boot_order(void)
+{
+test_pc_with_args(, 0x30, 0x12, 0x30, 0x12);
+test_pc_with_args(-no-fd-bootchk, 0x31, 0x12, 0x31, 0x12);
+test_pc_with_args(-boot c, 0, 0x02, 0, 0x02);
+test_pc_with_args(-boot nda, 0x10, 0x34, 0x10, 0x34);
+test_pc_with_args(-boot order=, 0, 0, 0, 0);
+test_pc_with_args(-boot order= -boot order=c, 0, 0x02, 0, 0x02);
+test_pc_with_args(-boot once=a, 0, 0x01, 0x30, 0x12);
+test_pc_with_args(-boot once=a -no-fd-bootchk, 0x01, 0x01, 0x31, 0x12);
+test_pc_with_args(-boot once=a,order=c, 0, 0x01, 0, 0x02);
+test_pc_with_args(-boot once=d -boot order=nda, 0, 0x03, 0x10, 0x34);
+test_pc_with_args(-boot once=a -boot once=b -boot once=c,
+  0, 0x02, 0x30, 0x12);
+}
+
+int main(int argc, char *argv[])
+{
+g_test_init(argc, argv, NULL);
+
+qtest_add_func(boot-order/pc, test_pc_boot_order);
+
+return g_test_run();
+}
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 09/12] boot-order-test: Add tests for PowerPC PREP

2013-06-26 Thread Markus Armbruster
Cc: Andreas Färber afaer...@suse.de
Cc: Alexander Graf ag...@suse.de
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/boot-order-test.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index c711c71..75b1642 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -112,6 +112,30 @@ static void test_pc_boot_order(void)
 test_boot_orders(NULL, read_boot_order_pc, test_cases_pc);
 }
 
+static uint8_t read_m48t59(uint64_t addr, uint16_t reg)
+{
+writeb(addr, reg  0xff);
+writeb(addr + 1, reg  8);
+return readb(addr + 3);
+}
+
+static uint64_t read_boot_order_prep(void)
+{
+return read_m48t59(0x8000 + 0x74, 0x34);
+}
+
+static const boot_order_test test_cases_prep[] = {
+{ , 'c', 'c' },
+{ -boot c, 'c', 'c' },
+{ -boot d, 'd', 'd' },
+{}
+};
+
+static void test_prep_boot_order(void)
+{
+test_boot_orders(prep, read_boot_order_prep, test_cases_prep);
+}
+
 static uint64_t read_boot_order_pmac(void)
 {
 QFWCFG *fw_cfg = mm_fw_cfg_init(0xf510);
@@ -146,6 +170,7 @@ int main(int argc, char *argv[])
 if (strcmp(arch, i386) == 0 || strcmp(arch, x86_64) == 0) {
 qtest_add_func(boot-order/pc, test_pc_boot_order);
 } else if (strcmp(arch, ppc) == 0 || strcmp(arch, ppc64) == 0) {
+qtest_add_func(boot-order/prep, test_prep_boot_order);
 qtest_add_func(boot-order/pmac_oldworld,
test_pmac_oldworld_boot_order);
 qtest_add_func(boot-order/pmac_newworld,
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 10/12] boot-order-test: Add tests for Sun4m

2013-06-26 Thread Markus Armbruster
Cc: Blue Swirl blauwir...@gmail.com
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/boot-order-test.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index 75b1642..a3928ed 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -161,6 +161,18 @@ static void test_pmac_newworld_boot_order(void)
 test_boot_orders(mac99, read_boot_order_pmac, test_cases_fw_cfg);
 }
 
+static uint64_t read_boot_order_sun4m(void)
+{
+QFWCFG *fw_cfg = mm_fw_cfg_init(0xd0510ULL);
+
+return qfw_cfg_get_u16(fw_cfg, FW_CFG_BOOT_DEVICE);
+}
+
+static void test_sun4m_boot_order(void)
+{
+test_boot_orders(SS-5, read_boot_order_sun4m, test_cases_fw_cfg);
+}
+
 int main(int argc, char *argv[])
 {
 const char *arch = qtest_get_arch();
@@ -175,6 +187,8 @@ int main(int argc, char *argv[])
test_pmac_oldworld_boot_order);
 qtest_add_func(boot-order/pmac_newworld,
test_pmac_newworld_boot_order);
+} else if (strcmp(arch, sparc) == 0) {
+qtest_add_func(boot-order/sun4m, test_sun4m_boot_order);
 }
 
 return g_test_run();
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 03/12] libqos: include dependencies

2013-06-26 Thread Markus Armbruster
From: Anthony Liguori aligu...@us.ibm.com

Otherwise rebuilds can fail when libqos is modified.

Reported-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/Makefile b/tests/Makefile
index 394e029..34a2325 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -224,3 +224,4 @@ check-block: $(patsubst %,check-%, $(check-block-y))
 check: check-unit check-qtest
 
 -include $(wildcard tests/*.d)
+-include $(wildcard tests/libqos/*.d)
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 05/12] boot-order-test: Add tests for PowerMacs

2013-06-26 Thread Markus Armbruster
From: Andreas Färber afaer...@suse.de

They set the boot device via fw_cfg, which is then translated to a boot
path of hd or cd in OpenBIOS.

Signed-off-by: Andreas Färber afaer...@suse.de
Cc: Alexander Graf ag...@suse.de
Cc: qemu-...@nongnu.org
Converted to libqos/fw_cfg on Anthony's request.
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/Makefile  |  4 +++-
 tests/boot-order-test.c | 50 -
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index 34a2325..8c7352d 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -67,6 +67,8 @@ gcov-files-sparc-y += hw/m48t59.c
 gcov-files-sparc64-y += hw/m48t59.c
 check-qtest-arm-y = tests/tmp105-test$(EXESUF)
 gcov-files-arm-y += hw/tmp105.c
+check-qtest-ppc-y += tests/boot-order-test$(EXESUF)
+check-qtest-ppc64-y += tests/boot-order-test$(EXESUF)
 
 GENERATED_HEADERS += tests/test-qapi-types.h tests/test-qapi-visit.h 
tests/test-qmp-commands.h
 
@@ -131,7 +133,7 @@ tests/m48t59-test$(EXESUF): tests/m48t59-test.o
 tests/fdc-test$(EXESUF): tests/fdc-test.o
 tests/ide-test$(EXESUF): tests/ide-test.o $(libqos-pc-obj-y)
 tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
-tests/boot-order-test$(EXESUF): tests/boot-order-test.o
+tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
 tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
 tests/i440fx-test$(EXESUF): tests/i440fx-test.o $(libqos-pc-obj-y)
 tests/fw_cfg-test$(EXESUF): tests/fw_cfg-test.o $(libqos-pc-obj-y)
diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index 447be31..766981d 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -10,7 +10,9 @@
  * See the COPYING file in the top-level directory.
  */
 
+#include string.h
 #include glib.h
+#include libqos/fw_cfg.h
 #include libqtest.h
 
 static void test_pc_cmos_byte(int reg, int expected)
@@ -63,11 +65,57 @@ static void test_pc_boot_order(void)
   0, 0x02, 0x30, 0x12);
 }
 
+#define G3BEIGE_CFG_ADDR 0xf510
+#define MAC99_CFG_ADDR   0xf510
+
+#define NO_QEMU_PROTOS
+#include hw/nvram/fw_cfg.h
+#undef NO_QEMU_PROTOS
+
+static void test_powermac_with_args(bool newworld, const char *extra_args,
+uint16_t expected_boot,
+uint16_t expected_reboot)
+{
+char *args = g_strdup_printf(-nodefaults -display none -machine %s %s,
+ newworld ? mac99 : g3beige, extra_args);
+QFWCFG *fw_cfg = mm_fw_cfg_init(newworld ? MAC99_CFG_ADDR
+: G3BEIGE_CFG_ADDR);
+uint16_t actual;
+
+qtest_start(args);
+actual = qfw_cfg_get_u16(fw_cfg, FW_CFG_BOOT_DEVICE);
+g_assert_cmphex(actual, ==, expected_boot);
+qmp({ 'execute': 'system_reset' });
+actual = qfw_cfg_get_u16(fw_cfg, FW_CFG_BOOT_DEVICE);
+g_assert_cmphex(actual, ==, expected_reboot);
+qtest_quit(global_qtest);
+g_free(args);
+}
+
+static void test_powermac_boot_order(void)
+{
+int i;
+
+for (i = 0; i  2; i++) {
+bool newworld = (i == 1);
+
+test_powermac_with_args(newworld, , 'c', 'c');
+test_powermac_with_args(newworld, -boot c, 'c', 'c');
+test_powermac_with_args(newworld, -boot d, 'd', 'd');
+}
+}
+
 int main(int argc, char *argv[])
 {
+const char *arch = qtest_get_arch();
+
 g_test_init(argc, argv, NULL);
 
-qtest_add_func(boot-order/pc, test_pc_boot_order);
+if (strcmp(arch, i386) == 0 || strcmp(arch, x86_64) == 0) {
+qtest_add_func(boot-order/pc, test_pc_boot_order);
+} else if (strcmp(arch, ppc) == 0 || strcmp(arch, ppc64) == 0) {
+qtest_add_func(boot-order/powermac, test_powermac_boot_order);
+}
 
 return g_test_run();
 }
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 01/12] qtest: Don't reset on qtest chardev connect

2013-06-26 Thread Markus Armbruster
libqtest's qtest_init() connecting to the qtest socket triggers reset.
This was coded in the hope we could use the same QEMU process for
multiple tests that way.  Never used.  Injects an extra reset even
when it's not used, and that can mess up tests such as the one of
-boot once I'm about to add.  Drop it.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 qtest.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/qtest.c b/qtest.c
index 07a9612..74f1842 100644
--- a/qtest.c
+++ b/qtest.c
@@ -472,7 +472,12 @@ static void qtest_event(void *opaque, int event)
 
 switch (event) {
 case CHR_EVENT_OPENED:
-qemu_system_reset(false);
+/*
+ * We used to call qemu_system_reset() here, hoping we could
+ * use the same process for multiple tests that way.  Never
+ * used.  Injects an extra reset even when it's not used, and
+ * that can mess up tests, e.g. -boot once.
+ */
 for (i = 0; i  ARRAY_SIZE(irq_levels); i++) {
 irq_levels[i] = 0;
 }
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 00/12] Boot order tests

2013-06-26 Thread Markus Armbruster
v4:
* Old PATCH 1-6 got merged, only tests left; cover letter changed
  accordingly
* PATCH 1,6 unchanged
* PATCH 2 fix system-reset race (Anthony)
* New PATCH 4,11 to make libqos/fw_cfg usable for the ppc, Sun4 tests
* New PATCH 3 shanghaied from Anthony, to make PATCH 4 compile without
  make clean.
* PATCH 5,10,12 use libqos/fw_cfg (Anthony)
* PATCH 7-9 rebased, minor tweaks
v3:
* Rebased, with only trivial conflicts
* PATCH 08 cosmetic improvements
* More test cases: new PATCH 09-16
v2:
* New PATCH 7 to make testing -boot once possible
* Old PATCH 5 reworked and extended became PATCH 8
* Writing more tests uncovered -no-fd-bootchk weirdness, cleaned up in
  new PATCH 5+6

Andreas Färber (1):
  boot-order-test: Add tests for PowerMacs

Anthony Liguori (1):
  libqos: include dependencies

Markus Armbruster (10):
  qtest: Don't reset on qtest chardev connect
  boot-order-test: New; covering just PC for now
  libqos: Add support for memory-mapped fw_cfg
  boot-order-test: Cover -boot once in ppc tests
  boot-order-test: Better separate target-specific and generic parts
  boot-order-test: Code motion for better readability
  boot-order-test: Add tests for PowerPC PREP
  boot-order-test: Add tests for Sun4m
  libqos: Generalize I/O-mapped fw_cfg
  boot-order-test: Add tests for Sun4u

 qtest.c  |   7 +-
 tests/Makefile   |   7 +-
 tests/boot-order-test.c  | 209 +++
 tests/fw_cfg-test.c  |   2 +-
 tests/libqos/fw_cfg-pc.c |  40 -
 tests/libqos/fw_cfg-pc.h |  20 -
 tests/libqos/fw_cfg.c|  55 +
 tests/libqos/fw_cfg.h|   9 ++
 tests/libqos/malloc-pc.c |   2 +-
 9 files changed, 287 insertions(+), 64 deletions(-)
 create mode 100644 tests/boot-order-test.c
 delete mode 100644 tests/libqos/fw_cfg-pc.c
 delete mode 100644 tests/libqos/fw_cfg-pc.h

-- 
1.7.11.7




[Qemu-devel] [PATCH v4 12/12] boot-order-test: Add tests for Sun4u

2013-06-26 Thread Markus Armbruster
Cc: Blue Swirl blauwir...@gmail.com
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/boot-order-test.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index a3928ed..4b233d0 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -173,6 +173,18 @@ static void test_sun4m_boot_order(void)
 test_boot_orders(SS-5, read_boot_order_sun4m, test_cases_fw_cfg);
 }
 
+static uint64_t read_boot_order_sun4u(void)
+{
+QFWCFG *fw_cfg = io_fw_cfg_init(0x510);
+
+return qfw_cfg_get_u16(fw_cfg, FW_CFG_BOOT_DEVICE);
+}
+
+static void test_sun4u_boot_order(void)
+{
+test_boot_orders(sun4u, read_boot_order_sun4u, test_cases_fw_cfg);
+}
+
 int main(int argc, char *argv[])
 {
 const char *arch = qtest_get_arch();
@@ -189,6 +201,8 @@ int main(int argc, char *argv[])
test_pmac_newworld_boot_order);
 } else if (strcmp(arch, sparc) == 0) {
 qtest_add_func(boot-order/sun4m, test_sun4m_boot_order);
+} else if (strcmp(arch, sparc64) == 0) {
+qtest_add_func(boot-order/sun4u, test_sun4u_boot_order);
 }
 
 return g_test_run();
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 04/12] libqos: Add support for memory-mapped fw_cfg

2013-06-26 Thread Markus Armbruster
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/libqos/fw_cfg.c | 29 +
 tests/libqos/fw_cfg.h |  3 +++
 2 files changed, 32 insertions(+)

diff --git a/tests/libqos/fw_cfg.c b/tests/libqos/fw_cfg.c
index e386ff7..49d1683 100644
--- a/tests/libqos/fw_cfg.c
+++ b/tests/libqos/fw_cfg.c
@@ -2,15 +2,19 @@
  * libqos fw_cfg support
  *
  * Copyright IBM, Corp. 2012-2013
+ * Copyright (C) 2013 Red Hat Inc.
  *
  * Authors:
  *  Anthony Liguori   aligu...@us.ibm.com
+ *  Markus Armbruster arm...@redhat.com
  *
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
 
+#include glib.h
 #include libqos/fw_cfg.h
+#include libqtest.h
 #include qemu/bswap.h
 
 void qfw_cfg_select(QFWCFG *fw_cfg, uint16_t key)
@@ -50,3 +54,28 @@ uint64_t qfw_cfg_get_u64(QFWCFG *fw_cfg, uint16_t key)
 return le64_to_cpu(value);
 }
 
+static void mm_fw_cfg_select(QFWCFG *fw_cfg, uint16_t key)
+{
+writew(fw_cfg-base, key);
+}
+
+static void mm_fw_cfg_read(QFWCFG *fw_cfg, void *data, size_t len)
+{
+uint8_t *ptr = data;
+int i;
+
+for (i = 0; i  len; i++) {
+ptr[i] = readb(fw_cfg-base + 2);
+}
+}
+
+QFWCFG *mm_fw_cfg_init(uint64_t base)
+{
+QFWCFG *fw_cfg = g_malloc0(sizeof(*fw_cfg));
+
+fw_cfg-base = base;
+fw_cfg-select = mm_fw_cfg_select;
+fw_cfg-read = mm_fw_cfg_read;
+
+return fw_cfg;
+}
diff --git a/tests/libqos/fw_cfg.h b/tests/libqos/fw_cfg.h
index 44fc42b..19bb053 100644
--- a/tests/libqos/fw_cfg.h
+++ b/tests/libqos/fw_cfg.h
@@ -20,6 +20,7 @@ typedef struct QFWCFG QFWCFG;
 
 struct QFWCFG
 {
+uint64_t base;
 void (*select)(QFWCFG *fw_cfg, uint16_t key);
 void (*read)(QFWCFG *fw_cfg, void *data, size_t len);
 };
@@ -31,4 +32,6 @@ uint16_t qfw_cfg_get_u16(QFWCFG *fw_cfg, uint16_t key);
 uint32_t qfw_cfg_get_u32(QFWCFG *fw_cfg, uint16_t key);
 uint64_t qfw_cfg_get_u64(QFWCFG *fw_cfg, uint16_t key);
 
+QFWCFG *mm_fw_cfg_init(uint64_t base);
+
 #endif
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 08/12] boot-order-test: Code motion for better readability

2013-06-26 Thread Markus Armbruster
Cc: Andreas Färber afaer...@suse.de
Cc: Alexander Graf ag...@suse.de
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/boot-order-test.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index 78249fc..c711c71 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -15,19 +15,15 @@
 #include libqos/fw_cfg.h
 #include libqtest.h
 
-static uint8_t read_mc146818(uint16_t port, uint8_t reg)
-{
-outb(port, reg);
-return inb(port + 1);
-}
-
-static uint64_t read_boot_order_pc(void)
-{
-uint8_t b1 = read_mc146818(0x70, 0x38);
-uint8_t b2 = read_mc146818(0x70, 0x3d);
+#define NO_QEMU_PROTOS
+#include hw/nvram/fw_cfg.h
+#undef NO_QEMU_PROTOS
 
-return b1 | (b2  8);
-}
+typedef struct {
+const char *args;
+uint64_t expected_boot;
+uint64_t expected_reboot;
+} boot_order_test;
 
 static void test_a_boot_order(const char *machine,
   const char *test_args,
@@ -57,12 +53,6 @@ static void test_a_boot_order(const char *machine,
 g_free(args);
 }
 
-typedef struct {
-const char *args;
-uint64_t expected_boot;
-uint64_t expected_reboot;
-} boot_order_test;
-
 static void test_boot_orders(const char *machine,
  uint64_t (*read_boot_order)(void),
  const boot_order_test *tests)
@@ -77,6 +67,20 @@ static void test_boot_orders(const char *machine,
 }
 }
 
+static uint8_t read_mc146818(uint16_t port, uint8_t reg)
+{
+outb(port, reg);
+return inb(port + 1);
+}
+
+static uint64_t read_boot_order_pc(void)
+{
+uint8_t b1 = read_mc146818(0x70, 0x38);
+uint8_t b2 = read_mc146818(0x70, 0x3d);
+
+return b1 | (b2  8);
+}
+
 static const boot_order_test test_cases_pc[] = {
 { ,
   0x1230, 0x1230 },
@@ -108,10 +112,6 @@ static void test_pc_boot_order(void)
 test_boot_orders(NULL, read_boot_order_pc, test_cases_pc);
 }
 
-#define NO_QEMU_PROTOS
-#include hw/nvram/fw_cfg.h
-#undef NO_QEMU_PROTOS
-
 static uint64_t read_boot_order_pmac(void)
 {
 QFWCFG *fw_cfg = mm_fw_cfg_init(0xf510);
-- 
1.7.11.7




[Qemu-devel] [PATCH v4 07/12] boot-order-test: Better separate target-specific and generic parts

2013-06-26 Thread Markus Armbruster
The initial version did just PC.  I didn't bother to separate out
generic parts, because I don't like to abstract from a single case.

Now we have two cases, PC and PowerMac, and I'm about to add more.
Time to do it right.

To ease review, this commit changes the code in-place, and the next
commit reorders it for better readability.

Cc: Andreas Färber afaer...@suse.de
Cc: Alexander Graf ag...@suse.de
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster arm...@redhat.com
---
 tests/boot-order-test.c | 148 +---
 1 file changed, 91 insertions(+), 57 deletions(-)

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index 1a8e22a..78249fc 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -15,95 +15,126 @@
 #include libqos/fw_cfg.h
 #include libqtest.h
 
-static void test_pc_cmos_byte(int reg, int expected)
+static uint8_t read_mc146818(uint16_t port, uint8_t reg)
 {
-int actual;
-
-outb(0x70, reg);
-actual = inb(0x71);
-g_assert_cmphex(actual, ==, expected);
+outb(port, reg);
+return inb(port + 1);
 }
 
-static void test_pc_cmos(uint8_t boot1, uint8_t boot2)
+static uint64_t read_boot_order_pc(void)
 {
-test_pc_cmos_byte(0x38, boot1);
-test_pc_cmos_byte(0x3d, boot2);
+uint8_t b1 = read_mc146818(0x70, 0x38);
+uint8_t b2 = read_mc146818(0x70, 0x3d);
+
+return b1 | (b2  8);
 }
 
-static void test_pc_with_args(const char *test_args,
-  uint8_t boot1, uint8_t boot2,
-  uint8_t reboot1, uint8_t reboot2)
+static void test_a_boot_order(const char *machine,
+  const char *test_args,
+  uint64_t (*read_boot_order)(void),
+  uint64_t expected_boot,
+  uint64_t expected_reboot)
 {
-char *args = g_strdup_printf(-nodefaults -display none %s, test_args);
+char *args;
+uint64_t actual;
 
+args = g_strdup_printf(-nodefaults -display none%s%s %s,
+   machine ?  -M  : ,
+   machine ?: ,
+   test_args);
 qtest_start(args);
-test_pc_cmos(boot1, boot2);
+actual = read_boot_order();
+g_assert_cmphex(actual, ==, expected_boot);
 qmp({ 'execute': 'system_reset' });
 /*
  * system_reset only requests reset.  We get a RESET event after
  * the actual reset completes.  Need to wait for that.
  */
 qmp();/* HACK: wait for event */
-test_pc_cmos(reboot1, reboot2);
+actual = read_boot_order();
+g_assert_cmphex(actual, ==, expected_reboot);
 qtest_quit(global_qtest);
 g_free(args);
 }
 
-static void test_pc_boot_order(void)
+typedef struct {
+const char *args;
+uint64_t expected_boot;
+uint64_t expected_reboot;
+} boot_order_test;
+
+static void test_boot_orders(const char *machine,
+ uint64_t (*read_boot_order)(void),
+ const boot_order_test *tests)
 {
-test_pc_with_args(, 0x30, 0x12, 0x30, 0x12);
-test_pc_with_args(-no-fd-bootchk, 0x31, 0x12, 0x31, 0x12);
-test_pc_with_args(-boot c, 0, 0x02, 0, 0x02);
-test_pc_with_args(-boot nda, 0x10, 0x34, 0x10, 0x34);
-test_pc_with_args(-boot order=, 0, 0, 0, 0);
-test_pc_with_args(-boot order= -boot order=c, 0, 0x02, 0, 0x02);
-test_pc_with_args(-boot once=a, 0, 0x01, 0x30, 0x12);
-test_pc_with_args(-boot once=a -no-fd-bootchk, 0x01, 0x01, 0x31, 0x12);
-test_pc_with_args(-boot once=a,order=c, 0, 0x01, 0, 0x02);
-test_pc_with_args(-boot once=d -boot order=nda, 0, 0x03, 0x10, 0x34);
-test_pc_with_args(-boot once=a -boot once=b -boot once=c,
-  0, 0x02, 0x30, 0x12);
+int i;
+
+for (i = 0; tests[i].args; i++) {
+test_a_boot_order(machine, tests[i].args,
+  read_boot_order,
+  tests[i].expected_boot,
+  tests[i].expected_reboot);
+}
 }
 
-#define G3BEIGE_CFG_ADDR 0xf510
-#define MAC99_CFG_ADDR   0xf510
+static const boot_order_test test_cases_pc[] = {
+{ ,
+  0x1230, 0x1230 },
+{ -no-fd-bootchk,
+  0x1231, 0x1231 },
+{ -boot c,
+  0x0200, 0x0200 },
+{ -boot nda,
+  0x3410, 0x3410 },
+{ -boot order=,
+  0, 0 },
+{ -boot order= -boot order=c,
+  0x0200, 0x0200 },
+{ -boot once=a,
+  0x0100, 0x1230 },
+{ -boot once=a -no-fd-bootchk,
+  0x0101, 0x1231 },
+{ -boot once=a,order=c,
+  0x0100, 0x0200 },
+{ -boot once=d -boot order=nda,
+  0x0300, 0x3410 },
+{ -boot once=a -boot once=b -boot once=c,
+  0x0200, 0x1230 },
+{}
+};
+
+static void test_pc_boot_order(void)
+{
+test_boot_orders(NULL, read_boot_order_pc, test_cases_pc);
+}
 
 #define NO_QEMU_PROTOS
 #include hw/nvram/fw_cfg.h
 #undef NO_QEMU_PROTOS
 
-static void 

  1   2   >