date:20120718

[Qemu-devel] qemu in full emulation on win32

2012-07-18 Thread Alexey Kardashevskiy

Hi!

I found two problems while debugging qemu/ppc64-softmmu/qemu-system-ppc64.exe
on Windows XP SP3 Pro, 32-bit, with i686-pc-mingw32-gcc (GCC) 4.5.2.


1. The size of the following is 28 bytes (7 32-bit words) on Linux and 32 bytes (8 words) on Windows:
struct {
uint32_t hi;
uint64_t child;
uint64_t parent;
uint64_t size;
} __attribute__((packed)) ranges[];

The structure is used between QEMU and Open Firmware (powerpc bios) so it is 
important.

The behaviour is described here:
http://stackoverflow.com/questions/7789668/why-would-the-size-of-a-packed-structure-be-different-on-linux-and-windows-when
In short, there is GCC packing and MS packing, and they differ :)

The solutions are:
1. Add MS-specific #pragma pack(push,1) and #pragma pack(pop).
2. Add -mno-ms-bitfields (gcc >= 4.7.0)
3. Change the structure above to use only uint32_t.

What is the common way of solving such problems in QEMU?
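
To illustrate solution 3, here is a minimal sketch of a layout built only from uint32_t fields. It assumes each 64-bit value is split into a high and a low big-endian word; the struct and helper names are hypothetical and not taken from QEMU. Because every member is 4 bytes wide and 4-byte aligned, no padding is expected under either packing convention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: every 64-bit field becomes two 32-bit halves. */
struct example_ranges_u32 {
    uint32_t hi;
    uint32_t child_hi, child_lo;
    uint32_t parent_hi, parent_lo;
    uint32_t size_hi, size_lo;
};

/* Seven 32-bit words, so no room for compiler-specific padding. */
_Static_assert(sizeof(struct example_ranges_u32) == 28,
               "expected 28 bytes without padding");

/* Return v laid out in memory as big-endian, independent of host order. */
static uint32_t example_to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Store a 64-bit value as two consecutive big-endian 32-bit words. */
static void example_set_be64(uint32_t *hi, uint32_t *lo, uint64_t v)
{
    *hi = example_to_be32((uint32_t)(v >> 32));
    *lo = example_to_be32((uint32_t)v);
}
```

The cost of this approach is that every 64-bit field needs a helper such as example_set_be64() instead of a plain assignment.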


2. QEMU cannot allocate 1024MB for the guest RAM. Specifically, VirtualAlloc()
fails on 1024MB, BUT it succeeds if I allocate 1023MB and 64MB in 2 subsequent
calls. We allocate RAM via memory_region_init_ram(). I am pretty sure this
does not happen on 64bit Windows, and I suspect that it also happens with
qemu-system-x86.exe, doesn't it?

Do we care that there is actually enough RAM and we could allocate it in 
several chunks?



-- 
Alexey

Re: [Qemu-devel] [PATCH v5 2/4] exynos4210: Added SD host controller model

2012-07-18 Thread Peter Crosthwaite

Will merge Igor's corrections into v6

Regards,
Peter

On Wed, Jul 18, 2012 at 1:04 AM, Peter Maydell peter.mayd...@linaro.org wrote:
 On 17 July 2012 15:58, Igor Mitsyanko i.mitsya...@samsung.com wrote:
 On 07/17/2012 05:37 PM, Peter Maydell wrote:
 I would suggest two functions:

 int sdhci_slotint(SDHCIState *s)
 which just calculates and returns the external interrupt line state
 (might be able to make this 'static'), and

 Ok, but I'd rather make sdhci_slotint() return uint8_t, to emphasize that
 this function returns a value of hardware SLOTINT register.

 That's fine; the thing I want to avoid is having by-hand updates
 of slotint all over the code.

 -- PMM

Re: [Qemu-devel] qemu in full emulation on win32

2012-07-18 Thread Peter Maydell

On 18 July 2012 07:30, Alexey Kardashevskiy a...@ozlabs.ru wrote:
 1. The size of the following is 28 bytes (7 32-bit words) on Linux and 32 bytes (8 words) on Windows:
 struct {
 uint32_t hi;
 uint64_t child;
 uint64_t parent;
 uint64_t size;
 } __attribute__((packed)) ranges[];

 The structure is used between QEMU and Open Firmware (powerpc bios) so it is 
 important.

I think this struct should use QEMU_PACKED, which will
ensure that it is packed to GCC rules rather than MS
rules.

We also seem to have let a pile of new uses of __attribute__((packed))
slip into hw/mfi.h. Those are probably bugs too.

-- PMM
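
As a rough sketch of the idea behind QEMU_PACKED: the follow-up patch in this thread describes it as __attribute__((gcc_struct, packed)) on Windows. The macro and struct names below are hypothetical stand-ins, not QEMU's own definitions, and the compile-time check assumes a GCC-compatible C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical macro: on Windows builds, ask for GCC layout rules in
 * addition to packing; elsewhere plain packing is assumed to suffice.
 */
#if defined(_WIN32) && defined(__GNUC__)
# define EXAMPLE_PACKED __attribute__((gcc_struct, packed))
#else
# define EXAMPLE_PACKED __attribute__((packed))
#endif

/* Same field layout as the ranges[] entries discussed above. */
struct example_range {
    uint32_t hi;
    uint64_t child;
    uint64_t parent;
    uint64_t size;
} EXAMPLE_PACKED;

/* Fail the build instead of silently changing the firmware interface. */
_Static_assert(sizeof(struct example_range) == 28,
               "range entry must be 4 + 3 * 8 bytes");
_Static_assert(offsetof(struct example_range, child) == 4,
               "child must directly follow hi");
```

A compile-time assertion like this is one way to catch a layout change on a new toolchain before it reaches the guest firmware.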

[Qemu-devel] [PATCH] powerpc pci: fixed packing of ranges[]

2012-07-18 Thread Alexey Kardashevskiy

By default mingw-gcc packs structures in a way that preserves binary
compatibility with MS Visual C, which leads to incorrect and
unexpected padding in the PCI bus ranges property of the sPAPR PHB.

The patch replaces __attribute__((packed)) with the stricter QEMU_PACKED,
which is actually __attribute__((gcc_struct, packed)) on Windows.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/spapr_pci.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/spapr_pci.c b/hw/spapr_pci.c
index b3032d2..0261d2e 100644
--- a/hw/spapr_pci.c
+++ b/hw/spapr_pci.c
@@ -418,7 +418,7 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
 uint64_t child;
 uint64_t parent;
 uint64_t size;
-} __attribute__((packed)) ranges[] = {
+} QEMU_PACKED ranges[] = {
 {
 cpu_to_be32(b_ss(1)), cpu_to_be64(0),
 cpu_to_be64(phb->io_win_addr),
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH] eventfd: making it thread safe

2012-07-18 Thread Alexey Kardashevskiy

Ping again?


On 09/07/12 13:10, Alexey Kardashevskiy wrote:
 Ping?
 
 On 02/07/12 05:48, Alexey Kardashevskiy wrote:
 QEMU uses IO handlers to run select() in the main loop. The handlers list is
 managed by the qemu_set_fd_handler() helper, which works fine when called from
 the main thread, as it is not called while select() is waiting.

 However, sometimes we need to update the handlers list from another thread.
 For that, the main loop's select() needs to be restarted with the updated
 list.

 The patch adds a qemu_notify_event() call to interrupt select() and make
 the wrapping code restart select() with the updated IO handlers list.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 Reviewed-by: Paolo Bonzini pbonz...@redhat.com

 ---
  iohandler.c |1 +
  1 file changed, 1 insertion(+)

 diff --git a/iohandler.c b/iohandler.c
 index 3c74de6..dea4355 100644
 --- a/iohandler.c
 +++ b/iohandler.c
 @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
  ioh->fd_write = fd_write;
  ioh->opaque = opaque;
  ioh->deleted = 0;
 +qemu_notify_event();
  }
  return 0;
  }

 
 


-- 
Alexey
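
The general technique of interrupting a blocked select() from another thread can be sketched with a notification pipe. This is only an illustration for POSIX systems under assumed names (example_notifier and friends are hypothetical); it does not claim to show how qemu_notify_event() is implemented.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical notifier: the read end is always in the select() set. */
struct example_notifier {
    int rfd;
    int wfd;
};

static int example_notifier_init(struct example_notifier *n)
{
    int fds[2];

    if (pipe(fds) < 0) {
        return -1;
    }
    /* Non-blocking so that draining and notifying never stall. */
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* May be called from any thread: makes the read end readable. */
static void example_notifier_notify(struct example_notifier *n)
{
    char c = 0;
    ssize_t r = write(n->wfd, &c, 1);

    (void)r; /* a full pipe already guarantees a pending wakeup */
}

/* Returns 1 if woken by a notification, 0 on timeout, -1 on error. */
static int example_wait(struct example_notifier *n, int timeout_ms)
{
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    fd_set rfds;
    char buf[16];
    int ret;

    FD_ZERO(&rfds);
    FD_SET(n->rfd, &rfds);
    ret = select(n->rfd + 1, &rfds, NULL, NULL, &tv);
    if (ret <= 0) {
        return ret;
    }
    /* Drain pending bytes so the next select() can block again. */
    while (read(n->rfd, buf, sizeof(buf)) > 0) {
    }
    return 1;
}
```

After example_wait() returns 1, a main loop would rebuild its fd sets from the updated handler list and call select() again.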

Re: [Qemu-devel] [PATCH 1/4] QMP, Introduce set-global-dirty-log command.

2012-07-18 Thread Avi Kivity

On 07/17/2012 08:58 PM, Stefano Stabellini wrote:
 On Tue, 17 Jul 2012, Avi Kivity wrote:
 On 07/17/2012 04:30 PM, Anthony PERARD wrote:
  This command is used during a migration of a guest under Xen. It calls
  memory_global_dirty_log_start or memory_global_dirty_log_stop according to
  the argument passed to the command.
 
 Is the command truly needed?  Can't it come from the xen library you
 link to?
 
 I thought that a while ago we decided to use QMP rather than xenstore to
 issue Xen commands to QEMU.
 The only xenstore stuff left is the PV protocols and a few parameters.
 
 Of course we could use xenstore to issue commands again, but it goes
 against the last year and a half of development :-)

This particular command is weird.  You enable dirty logging via an
external interface, but all output goes through the internal interface.
 It doesn't make much sense.

But let's not reopen the issue, if it was decided to go with qmp command
for those things then so be it.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 3/4] exec, memory: Call to xen_modified_memory.

2012-07-18 Thread Avi Kivity

On 07/17/2012 09:36 PM, Stefano Stabellini wrote:
 On Tue, 17 Jul 2012, Avi Kivity wrote:
 On 07/17/2012 04:59 PM, Anthony PERARD wrote:
 
  This is pretty ugly.  An alternative is to set up a periodic bitmap scan
  that looks at the qemu dirty bitmap and calls xen_modified_memory() for
  dirty page ranges, and clears the bitmap for the next pass.  Is it
  workable?
  
  I don't think a periodic scan can do anything useful, unfortunately.
 
 Why not?
 
 I vaguely remember that we used to have a bitmap years ago, but, aside from
 making the code much more complicated, it caused blue screens on
 intensive disk accesses.

Surely it was some bug, not the scan itself.

 
 
  (is xen_modified_memory a hypercall, or does it maintain an in-memory
  structure?)
  
  It's a hypercall. The function does something (calls the hypercall) only
  during migration; otherwise it returns immediately.
 
 I see.  I guess it isn't expensive for you because there isn't much dma
 done by qemu usually with xen (unlike kvm where pv block devices are
 implemented in qemu).
 
 How about pushing the call into cpu_physical_memory_set_dirty_flags()?
 Would that reduce the number of call sites?
 
 Pushing the calls to cpu_physical_memory_set_dirty_flags and
 cpu_physical_memory_set_dirty_range would make the code much nicer.
 However, since these functions are in exec-obsolete.h, are they at risk of
 removal?

exec-obsolete.h just means don't add new call sites.  The functions
won't be removed, instead they'll be absorbed into the memory code with
different names and different implementations.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Wenchao Xia


  Hi, the following is an API draft. The prototypes were taken from
qemu/block.h, and the API prefix is changed from bdrv to qbdrvs to indicate
that the related object is BlockDriverState, not BlockDriver. One issue here
is that it may require including block_int.h, which is not LGPL2 licensed yet.
  The API format is kept mostly the same as the qemu generic block layer, to
make it easier to implement, and easier to migrate qemu onto it if
possible.


/* structure init and uninit */
BlockDriverState *qbdrvs_new(const char *device_name);
void qbdrvs_delete(BlockDriverState *bs);


/* file open and close */
int qbdrvs_open(BlockDriverState *bs, const char *filename, int flags,
  BlockDriver *drv);
void qbdrvs_close(BlockDriverState *bs);
int qbdrvs_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags);


/* sync access */
int qbdrvs_read(BlockDriverState *bs, int64_t sector_num,
  uint8_t *buf, int nb_sectors);
int qbdrvs_write(BlockDriverState *bs, int64_t sector_num,
   const uint8_t *buf, int nb_sectors);


/* info retrieve */
//sector, size and geometry info
int qbdrvs_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
int64_t qbdrvs_getlength(BlockDriverState *bs);
int64_t qbdrvs_get_allocated_file_size(BlockDriverState *bs);
void qbdrvs_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
//image type
const char *qbdrvs_get_format_name(BlockDriverState *bs);
//backing file info
void qbdrvs_get_backing_filename(BlockDriverState *bs,
   char *filename, int filename_size);
void qbdrvs_get_full_backing_filename(BlockDriverState *bs,
char *dest, size_t sz);


/* advanced image content access */
int qbdrvs_is_allocated(BlockDriverState *bs, int64_t sector_num,
                        int nb_sectors, int *pnum);
int qbdrvs_discard(BlockDriverState *bs, int64_t sector_num,
                   int nb_sectors);
int qbdrvs_has_zero_init(BlockDriverState *bs);


On 16/07/2012 10:16, Wenchao Xia wrote:



   Many thanks for the investigation. I spent quite some time digging out
which licenses are compatible with LGPL, and this has sorted it out.
   The coroutines and the structures inside are quite a challenge.


Coroutines are really just a small complication in the program flow if
all you support is synchronous access to files (i.e. no HTTP etc.).
Their usage should be completely transparent.


What about
providing the library first with nbd + sync access, and waiting for
feedback from the library's users? If it is good to use, then replace the
implementation with native qemu block layer code and change the code's
license, while keeping the API unchanged.


You can start by proposing the API.

Paolo




--
Best Regards

Wenchao Xia

Re: [Qemu-devel] [PATCH v7 0/3] Simpletrace v2: Support multiple args, strings.

2012-07-18 Thread Stefan Hajnoczi

On Tue, Jul 17, 2012 at 8:51 PM, Harsh Bora ha...@linux.vnet.ibm.com wrote:
 On 07/17/2012 08:53 PM, Stefan Hajnoczi wrote:

 On Tue, Jul 3, 2012 at 10:20 AM, Harsh Prateek Bora
 ha...@linux.vnet.ibm.com wrote:

 Existing simpletrace backend allows to trace at max 6 args and does not
 support strings. This newer tracelog format gets rid of fixed size
 records
 and therefore allows to trace variable number of args including strings.

 Sample trace:
 v9fs_version 0.000 tag=0x id=0x64 msize=0x2000 version=9P2000.L
 v9fs_version_return 6.705 tag=0x id=0x64 msize=0x2000
 version=9P2000.L
 v9fs_attach 174.467 tag=0x1 id=0x68 fid=0x0 afid=0x
 uname=nobody aname=
 v9fs_attach_return 4720.454 tag=0x1 id=0x68 type=0xff80
 version=0x4f2a4dd0  path=0x220ea6


 I have successfully tested it with the fix that I posted.  Writing
 simpletrace.Analyzer Python scripts still works - now with string
 arguments too :).

 Besides the last few comments on Patch 2, this looks okay now.


 Thanks, I have the updated patches ready except for the question asked on
 prev reply. Shall I fold your fix in patch 2 with comments as necessary. Let
 me know if I need to add your s-o-b after merging your fix in patch #2? You
 can update the commit message as you feel appropriate while merging to your
 tree though.

It's a trivial patch, feel free to squash it without noting anything.

Stefan

Re: [Qemu-devel] [PATCH v7 2/3] Simpletrace v2: Support multiple arguments, strings.

2012-07-18 Thread Stefan Hajnoczi

On Tue, Jul 17, 2012 at 8:01 PM, Harsh Bora ha...@linux.vnet.ibm.com wrote:
 On 07/17/2012 08:51 PM, Stefan Hajnoczi wrote:

 On Tue, Jul 3, 2012 at 10:20 AM, Harsh Prateek Bora
 ha...@linux.vnet.ibm.com wrote:

 Existing simpletrace backend allows to trace at max 6 args and does not
 support strings. This newer tracelog format gets rid of fixed size
 records
 and therefore allows to trace variable number of args including strings.

 Sample trace with strings:
 v9fs_version 0.000 tag=0x id=0x64 msize=0x2000 version=9P2000.L
 v9fs_version_return 6.705 tag=0x id=0x64 msize=0x2000
 version=9P2000.L

 Signed-off-by: Harsh Prateek Bora ha...@linux.vnet.ibm.com
 ---
   scripts/tracetool/backend/simple.py |   84 +---
   trace/simple.c  |  256
 ++-
   trace/simple.h  |   39 +-
   3 files changed, 260 insertions(+), 119 deletions(-)

 diff --git a/scripts/tracetool/backend/simple.py
 b/scripts/tracetool/backend/simple.py
 index fbb5717..d3cf4da 100644
 --- a/scripts/tracetool/backend/simple.py
 +++ b/scripts/tracetool/backend/simple.py
 @@ -15,9 +15,16 @@ __email__  = stefa...@linux.vnet.ibm.com

   from tracetool import out

 +def is_string(arg):
 +    strtype = ('const char*', 'char*', 'const char *', 'char *')
 +    if arg.lstrip().startswith(strtype):
 +        return True
 +    else:
 +        return False

   def c(events):
       out('#include "trace.h"',
 +        '#include "trace/simple.h"',
           '',
           'TraceEvent trace_list[] = {')

 @@ -26,30 +33,69 @@ def c(events):
   name = e.name,
   )

 -out('};')
 -
 -def h(events):
 -out('#include trace/simple.h',
 +out('};',
   '')

 -for num, e in enumerate(events):
 -if len(e.args):
 -argstr = e.args.names()
 -arg_prefix = ', (uint64_t)(uintptr_t)'
 -cast_args = arg_prefix + arg_prefix.join(argstr)
 -simple_args = (str(num) + cast_args)
 -else:
 -simple_args = str(num)
 +    for num, event in enumerate(events):
 +        sizes = []
 +        for type_, name in event.args:
 +            if is_string(type_):
 +                sizes.append("4 + ((" + name + " ? strlen(" + name + ") : 0) % MAX_TRACE_STRLEN)")


 trace_record_write_str() and this code both use % to truncate the
 string.  If the string is 512 characters long you get an empty string.
   That's weird and not normally how truncation works.

 Perhaps it's better to change this Python code to emit something like:
 size_t arg%(num)d_len = %(name)s ? MAX(strlen(%(name)s), MAX_TRACE_STRLEN)
 : 0;


 I think we need to use MIN instead of MAX, right ?

Yes, my bad :).

Stefan
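
The difference between truncating with % and clamping with MIN can be shown with a small sketch. It assumes MAX_TRACE_STRLEN is 512, as the v7 changelog in this thread states; the helper names and the local MIN macro are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TRACE_STRLEN 512
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Modulo "truncation": a 512-character string yields length 0. */
static size_t example_len_mod(const char *s)
{
    return (s ? strlen(s) : 0) % MAX_TRACE_STRLEN;
}

/* Clamping: lengths above the cap are cut down to the cap itself. */
static size_t example_len_min(const char *s)
{
    return s ? MIN(strlen(s), (size_t)MAX_TRACE_STRLEN) : 0;
}
```

With a string of exactly 512 characters, example_len_mod() returns 0 while example_len_min() returns 512, which is the behaviour normally expected from truncation.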

Re: [Qemu-devel] [PATCH v7 2/3] Simpletrace v2: Support multiple arguments, strings.

2012-07-18 Thread Stefan Hajnoczi

On Tue, Jul 17, 2012 at 9:08 PM, Harsh Bora ha...@linux.vnet.ibm.com wrote:
 On 07/18/2012 12:31 AM, Harsh Bora wrote:

 On 07/17/2012 08:51 PM, Stefan Hajnoczi wrote:

 On Tue, Jul 3, 2012 at 10:20 AM, Harsh Prateek Bora
 ha...@linux.vnet.ibm.com wrote:
 @@ -75,16 +96,31 @@ static char *trace_file_name = NULL;
*
* Returns false if the record is not valid.
*/
 -static bool get_trace_record(unsigned int idx, TraceRecord *record)
 +static bool get_trace_record(unsigned int idx, TraceRecord **recordptr)
   {
 -if (!(trace_buf[idx].event & TRACE_RECORD_VALID)) {
 +uint8_t rec_hdr[sizeof(TraceRecord)];
 +uint64_t event_flag = 0;
 +TraceRecord *record = (TraceRecord *) rec_hdr;


 Declaring rec_hdr as a uint8_t array is only because you're trying to
 avoid a cast later?  The easiest way to make this nice is to do what
 memset(), memcpy() and friends do: just use a void *buf argument.
 That way a caller can pass a TraceRecord* or any other pointer without
 explicit casts, unions, or the uint8_t array trick you are using.


 Are you suggesting to use malloc() here?

 void *rec_hdr = malloc(sizeof(TraceRecord));

 I kept a static array to make sure structure padding doesn't take place.
 I am not sure if using malloc here is recommended as we are reading
 the header and then writing this header byte-by-byte?


 Ah, I confused it with trace_record_finish where we write back the
 previously read header, which is not the case here. However, I still feel
 using an array is better here probably because:
 1) We can't use memcpy here anyway to read from the global buffer; we have to use
 read_from_buffer to take care of buffer boundaries.
 2) Isn't malloc() expensive for such a small allocation requirement?

No malloc.

The code is basically playing games with types because the read/write
functions take a uint8_t* buffer argument but callers actually want to
use TraceRecord.  You ended up declaring a uint8_t array and then
pointing a TraceRecord* into the array.

A cleaner way of doing this is to just use TraceRecord and make the
read/write functions take void* buffer arguments.  My comment about
memset/memcpy was that these library functions do the same thing - it
allows callers to write clean and simple code.

You can drop the uint8_t arrays and TraceRecord* alias pointers.  You
can also drop the union you have in one of the functions.

Just use a TraceRecord local variable and change the read/write helper
buffer argument to void*.

Stefan
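
The void* suggestion above can be sketched as follows. This is a simplified illustration, not the code from the patch: the ring buffer, its size, the record fields, and the helper names are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_BUF_LEN 64  /* hypothetical ring buffer size */

static uint8_t example_ring[EXAMPLE_BUF_LEN];

/* Hypothetical record header, standing in for TraceRecord. */
typedef struct {
    uint64_t event;
    uint64_t timestamp_ns;
    uint32_t length;
    uint32_t reserved;
} ExampleRecord;

/* Copy size bytes out of the ring starting at idx, wrapping at the end. */
static unsigned int example_read(unsigned int idx, void *dataptr, size_t size)
{
    uint8_t *out = dataptr;  /* the only conversion lives in the helper */
    size_t i;

    for (i = 0; i < size; i++) {
        out[i] = example_ring[(idx + i) % EXAMPLE_BUF_LEN];
    }
    return (unsigned int)((idx + size) % EXAMPLE_BUF_LEN);
}

/* Copy size bytes into the ring starting at idx, wrapping at the end. */
static unsigned int example_write(unsigned int idx, const void *dataptr,
                                  size_t size)
{
    const uint8_t *in = dataptr;
    size_t i;

    for (i = 0; i < size; i++) {
        example_ring[(idx + i) % EXAMPLE_BUF_LEN] = in[i];
    }
    return (unsigned int)((idx + size) % EXAMPLE_BUF_LEN);
}
```

A caller can then declare a plain ExampleRecord local and pass its address directly, with no uint8_t array, alias pointer, or union.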

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Paolo Bonzini

On 18/07/2012 10:51, Wenchao Xia wrote:
   Hi, following is API draft, prototypes were taken from qemu/block.h,
 and the API prefix is changed from bdrv to qbdrvs, to declare related
 object is BlockDriverState, not BlockDriver. One issue here is it may
 require include block_int.h, which is not LGPL2 licensed yet.

block_int.h is BSD-licensed.  It's compatible.

   API format is kept mostly the same with qemu generic block layer, to
 make it easier for implement, and easier to make qemu migrate on it if
 possible.

I'll note that img_create, and almost all retrieval functions cannot be
implemented on top of NBD.

 /* structure init and uninit */
 BlockDriverState *qbdrvs_new(const char *device_name);

device_name is not needed in the library.

 void qbdrvs_delete(BlockDriverState *bs);

Being able to reuse the same BDS with close/open is one of the worst
parts of the QEMU block layer.  It is only needed to implement eject in
QEMU.  Let's get things right, and only have open/close:

int qbdrvs_open(BlockDriverState **bs, const char *filename, int flags,
const char *format_name);
void qbdrvs_close(BlockDriverState *bs);

 
 
 /* file open and close */
 int qbdrvs_open(BlockDriverState *bs, const char *filename, int flags,
   BlockDriver *drv);

You didn't have an API to find the BlockDriver *.  Let's just pass a
format name, or NULL for probing (see above). This is consistent with
qbdrvs_img_create.

 void qbdrvs_close(BlockDriverState *bs);
 int qbdrvs_img_create(const char *filename, const char *fmt,
 const char *base_filename, const char *base_fmt,
 char *options, uint64_t img_size, int flags);
 
 
 /* sync access */
 int qbdrvs_read(BlockDriverState *bs, int64_t sector_num,
   uint8_t *buf, int nb_sectors);
 int qbdrvs_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);

I would like to have also a scatter gather API (qbdrvs_readv and
qbdrvs_writev) taking a struct iovec *iov, int niov instead of
uint8_t *buf, int nb_sectors.

flush is missing.

 
 /* info retrieve */
 //sector, size and geometry info
 int qbdrvs_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);

One problem here is that BlockDriverInfo may grow in the future.  This
is a problem for the ABI of the library.  I don't have any particular
ideas here, except adding a separate API to the library for each member
of BlockDriverInfo.

 int64_t qbdrvs_getlength(BlockDriverState *bs);

qbdrvs_get_length.

 int64_t qbdrvs_get_allocated_file_size(BlockDriverState *bs);
 void qbdrvs_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);

Not needed, it's the same as bdrv_getlength.

 //image type
 const char *qbdrvs_get_format_name(BlockDriverState *bs);
 //backing file info
 void qbdrvs_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
 void qbdrvs_get_full_backing_filename(BlockDriverState *bs,
 char *dest, size_t sz);
 
 
 /* advanced image content access */
 int qbdrvs_is_allocated(BlockDriverState *bs, int64_t sector_num, int
 nb_sectors,
   int *pnum);
 int qbdrvs_discard(BlockDriverState *bs, int64_t sector_num, int
 nb_sectors);
 int qbdrvs_has_zero_init(BlockDriverState *bs);

Paolo
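
The ABI concern about BlockDriverInfo growing, and the idea of one accessor per member, can be sketched with an opaque type. All names below are hypothetical and the two fields are placeholders; this is one possible shape under those assumptions, not a proposed library interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* Public side: callers only ever see an incomplete type. */
typedef struct ExampleInfo ExampleInfo;

/* Private side: fields can be appended later without breaking callers. */
struct ExampleInfo {
    int cluster_size;
    int64_t vm_state_offset;
};

static ExampleInfo *example_info_new(int cluster_size, int64_t vm_state_offset)
{
    ExampleInfo *info = malloc(sizeof(*info));

    if (info) {
        info->cluster_size = cluster_size;
        info->vm_state_offset = vm_state_offset;
    }
    return info;
}

static void example_info_free(ExampleInfo *info)
{
    free(info);
}

/* One accessor per member: the struct size never leaks into the ABI. */
static int example_info_get_cluster_size(const ExampleInfo *info)
{
    return info->cluster_size;
}

static int64_t example_info_get_vm_state_offset(const ExampleInfo *info)
{
    return info->vm_state_offset;
}
```

In a real library the struct definition would live in a private source file, so that applications compiled against an older version keep working when a member is added.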

Re: [Qemu-devel] [PATCH 5/7 v6] introduce a new qom device to deal with panicked event

2012-07-18 Thread Jan Kiszka

On 2012-07-18 03:54, Wen Congyang wrote:
 At 07/06/2012 07:05 PM, Jan Kiszka Wrote:
 On 2012-07-06 11:41, Wen Congyang wrote:
 If the target is x86/x86_64, the guest's kernel will write 0x01 to the
 port KVM_PV_PORT when it is panicked. This patch introduces a new qom
 device kvm_pv_ioport to listen on this I/O port, and deal with the panicked
 event according to panicked_action's value. The possible actions are:
 1. emit QEVENT_GUEST_PANICKED only
 2. emit QEVENT_GUEST_PANICKED and pause the guest
 3. emit QEVENT_GUEST_PANICKED and poweroff the guest
 4. emit QEVENT_GUEST_PANICKED and reset the guest

 I/O ports do not work for some targets (for example: s390). For such a
 target you can implement another qom device and include its code in
 pv_event.c.

 Note: if we emit QEVENT_GUEST_PANICKED only, and the management
 application does not receive this event(the management may not
 run when the event is emitted), the management won't know the
 guest is panicked.

 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 ---
  hw/kvm/Makefile.objs |2 +-
  hw/kvm/pv_event.c|   73 +++
  hw/kvm/pv_ioport.c   |  133 
 ++
  kvm-stub.c   |9 +++
  kvm.h|3 +
  vl.c |4 ++
  6 files changed, 223 insertions(+), 1 deletions(-)
  create mode 100644 hw/kvm/pv_event.c
  create mode 100644 hw/kvm/pv_ioport.c

 diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
 index 226497a..23e3b30 100644
 --- a/hw/kvm/Makefile.objs
 +++ b/hw/kvm/Makefile.objs
 @@ -1 +1 @@
 -obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o
 +obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pv_event.o
 diff --git a/hw/kvm/pv_event.c b/hw/kvm/pv_event.c
 new file mode 100644
 index 000..d7ded37
 --- /dev/null
 +++ b/hw/kvm/pv_event.c
 @@ -0,0 +1,73 @@
 +/*
 + * QEMU KVM support, paravirtual event device
 + *
 + * Copyright Fujitsu, Corp. 2012
 + *
 + * Authors:
 + * Wen Congyang we...@cn.fujitsu.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or 
 later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include <linux/kvm_para.h>
 +#include <asm/kvm_para.h>
 +#include "qobject.h"
 +#include "qjson.h"
 +#include "monitor.h"
 +#include "sysemu.h"
 +#include "kvm.h"
 +
 +/* Possible values for action parameter. */
 +#define PANICKED_REPORT 1   /* emit QEVENT_GUEST_PANICKED only */
 +#define PANICKED_PAUSE  2   /* emit QEVENT_GUEST_PANICKED and pause VM 
 */
 +#define PANICKED_POWEROFF   3   /* emit QEVENT_GUEST_PANICKED and quit VM 
 */
 +#define PANICKED_RESET  4   /* emit QEVENT_GUEST_PANICKED and reset VM 
 */
 +
 +static int panicked_action = PANICKED_REPORT;

 Avoid global variables please when there are device states. This one is
 unneeded anyway (and will generate warnings when built without KVM_PV_PORT).
 
 Hmm, do you mean introduce another qom device to store event action?

I think you should be fine with one device per bus binding, but those
will consist of a common event layer and just different I/O layers (for
bus registration and access).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 5/7 v6] introduce a new qom device to deal with panicked event

2012-07-18 Thread Jan Kiszka

On 2012-07-18 11:19, Jan Kiszka wrote:
 On 2012-07-18 03:54, Wen Congyang wrote:
 At 07/06/2012 07:05 PM, Jan Kiszka Wrote:
 On 2012-07-06 11:41, Wen Congyang wrote:
 If the target is x86/x86_64, the guest's kernel will write 0x01 to the
 port KVM_PV_PORT when it is panicked. This patch introduces a new qom
 device kvm_pv_ioport to listen on this I/O port, and deal with the panicked
 event according to panicked_action's value. The possible actions are:
 1. emit QEVENT_GUEST_PANICKED only
 2. emit QEVENT_GUEST_PANICKED and pause the guest
 3. emit QEVENT_GUEST_PANICKED and poweroff the guest
 4. emit QEVENT_GUEST_PANICKED and reset the guest

 I/O ports do not work for some targets (for example: s390). For such a
 target you can implement another qom device and include its code in
 pv_event.c.

 Note: if we emit QEVENT_GUEST_PANICKED only, and the management
 application does not receive this event(the management may not
 run when the event is emitted), the management won't know the
 guest is panicked.

 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 ---
  hw/kvm/Makefile.objs |2 +-
  hw/kvm/pv_event.c|   73 +++
  hw/kvm/pv_ioport.c   |  133 
 ++
  kvm-stub.c   |9 +++
  kvm.h|3 +
  vl.c |4 ++
  6 files changed, 223 insertions(+), 1 deletions(-)
  create mode 100644 hw/kvm/pv_event.c
  create mode 100644 hw/kvm/pv_ioport.c

 diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
 index 226497a..23e3b30 100644
 --- a/hw/kvm/Makefile.objs
 +++ b/hw/kvm/Makefile.objs
 @@ -1 +1 @@
 -obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o
 +obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pv_event.o
 diff --git a/hw/kvm/pv_event.c b/hw/kvm/pv_event.c
 new file mode 100644
 index 000..d7ded37
 --- /dev/null
 +++ b/hw/kvm/pv_event.c
 @@ -0,0 +1,73 @@
 +/*
 + * QEMU KVM support, paravirtual event device
 + *
 + * Copyright Fujitsu, Corp. 2012
 + *
 + * Authors:
 + * Wen Congyang we...@cn.fujitsu.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or 
 later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include <linux/kvm_para.h>
 +#include <asm/kvm_para.h>
 +#include "qobject.h"
 +#include "qjson.h"
 +#include "monitor.h"
 +#include "sysemu.h"
 +#include "kvm.h"
 +
 +/* Possible values for action parameter. */
 +#define PANICKED_REPORT 1   /* emit QEVENT_GUEST_PANICKED only */
 +#define PANICKED_PAUSE  2   /* emit QEVENT_GUEST_PANICKED and pause 
 VM */
 +#define PANICKED_POWEROFF   3   /* emit QEVENT_GUEST_PANICKED and quit VM 
 */
 +#define PANICKED_RESET  4   /* emit QEVENT_GUEST_PANICKED and reset 
 VM */
 +
 +static int panicked_action = PANICKED_REPORT;

 Avoid global variables please when there are device states. This one is
 unneeded anyway (and will generate warnings when built without KVM_PV_PORT).

 Hmm, do you mean introduce another qom device to store event action?
 
 I think you should be fine with one device per bus binding, but those
 will consist of a common event layer and just different I/O layers (for
 bus registration and access).

To make this clearer: the I/O layer should embed a common state
structure of the event layer in its device state so that the event layer
can keep things like the action mode there.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
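
The layering described above, where a bus-specific I/O device embeds a common event-layer state, might look roughly like this. The type and function names are hypothetical; the action values mirror the PANICKED_* constants quoted in the patch, and the 0x01 value is the one the commit message says the guest writes.

```c
#include <stdint.h>

/* Action values as in the quoted patch (report, pause, poweroff, reset). */
enum {
    EXAMPLE_REPORT = 1,
    EXAMPLE_PAUSE = 2,
    EXAMPLE_POWEROFF = 3,
    EXAMPLE_RESET = 4
};

/* Common event layer: owns the action mode, knows nothing about buses. */
typedef struct ExampleEventState {
    int action;
    int events_seen;
} ExampleEventState;

/* Returns the action the caller should carry out for this event. */
static int example_event_panicked(ExampleEventState *s)
{
    s->events_seen++;
    return s->action;
}

/* I/O layer for port I/O: embeds the common state in its device state. */
typedef struct ExampleIoportDevice {
    uint16_t port;
    ExampleEventState event;
} ExampleIoportDevice;

/* Returns the selected action, or 0 if the write is not a panic event. */
static int example_ioport_write(ExampleIoportDevice *d, uint16_t port,
                                uint8_t val)
{
    if (port != d->port || val != 0x01) {
        return 0;
    }
    return example_event_panicked(&d->event);
}
```

Because the action lives in each device's state rather than in a file-scope variable, two device instances can be configured independently, and another bus binding could embed the same ExampleEventState.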

Re: [Qemu-devel] We need more reviewers/maintainers!!

2012-07-18 Thread Peter Maydell

On 12 March 2012 20:12, Stefan Weil s...@weilnetz.de wrote:
 We also need more resources for technical maintenance of the
 QEMU infrastructure. For example, the official mirror of the
 QEMU git repository (https://github.com/qemu/QEMU) is several
 months behind, http://git.savannah.gnu.org/cgit/qemu.git is
 even older.

The github mirror is still wildly out of date -- can we
get the link to it removed from http://wiki.qemu.org/Download
please? (I'd do it myself but the page is 'locked'.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH] qemu kvm: Recognize PCID feature

2012-07-18 Thread Jan Kiszka

On 2012-07-18 10:44, Mao, Junjie wrote:
 Hi, Avi
 
 Any comments on this patch? :)

Always include qemu-devel when you are changing QEMU, qemu-kvm is just
staging for the latter. This patch can actually go into upstream
directly, maybe even via qemu-trivial as it just makes that flag selectable.

Jan

 
 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
 Behalf Of Mao, Junjie
 Sent: Friday, July 13, 2012 12:58 PM
 To: 'k...@vger.kernel.org'
 Subject: [PATCH] qemu kvm: Recognize PCID feature

 This patch makes Qemu recognize the PCID feature specified from
 configuration or command line options.

 Signed-off-by: Junjie Mao junjie@intel.com
 ---
  target-i386/cpu.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/target-i386/cpu.c b/target-i386/cpu.c
 index 5521709..efc6ece 100644
 --- a/target-i386/cpu.c
 +++ b/target-i386/cpu.c
 @@ -50,7 +50,7 @@ static const char *ext_feature_name[] = {
      "ds_cpl", "vmx", "smx", "est",
      "tm2", "ssse3", "cid", NULL,
      "fma", "cx16", "xtpr", "pdcm",
 -    NULL, NULL, "dca", "sse4.1|sse4_1",
 +    NULL, "pcid", "dca", "sse4.1|sse4_1",
      "sse4.2|sse4_2", "x2apic", "movbe", "popcnt",
      "tsc-deadline", "aes", "xsave", "osxsave",
      "avx", NULL, NULL, "hypervisor",
 --
 1.7.1
 --
 To unsubscribe from this list: send the line unsubscribe kvm in the body 
 of a
 message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
 

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Stefan Hajnoczi

On Wed, Jul 18, 2012 at 9:51 AM, Wenchao Xia xiaw...@linux.vnet.ibm.com wrote:
 /* sync access */
 int qbdrvs_read(BlockDriverState *bs, int64_t sector_num,
   uint8_t *buf, int nb_sectors);
 int qbdrvs_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);

Whether to provide sync and/or async access is a key question.

Synchronous APIs are great for writing dedicated tools like dd, cp,
convert, etc.

Asynchronous APIs are essential for integrating image file I/O into
event-driven programs like libvirt.  Here, the ability to do other
things while image file I/O is in progress is a requirement.  It may
also be necessary to cancel or timeout if an operation is not making
progress or the user decides to stop it.

I think we need to provide both sync and async.  Libraries like
libssh2 and libcurl already do this so their APIs can be used as a
starting point for async I/O.

Stefan
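
To make the sync/async distinction concrete, here is a toy sketch of a callback-based request that can be completed later or cancelled. It only illustrates the shape such an interface could take: every name is hypothetical, the "I/O" is a memcpy, and it is not modelled on the actual libssh2 or libcurl APIs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Completion callback: ret is 0 on success or a negative errno value. */
typedef void ExampleCompletionFunc(void *opaque, int ret);

typedef struct ExampleRequest {
    const uint8_t *src;
    uint8_t *dst;
    size_t len;
    ExampleCompletionFunc *cb;
    void *opaque;
    int pending;
} ExampleRequest;

/* Submit returns immediately; the caller is free to do other work. */
static void example_submit_read(ExampleRequest *req, const uint8_t *src,
                                uint8_t *dst, size_t len,
                                ExampleCompletionFunc *cb, void *opaque)
{
    req->src = src;
    req->dst = dst;
    req->len = len;
    req->cb = cb;
    req->opaque = opaque;
    req->pending = 1;
}

/* Called from the application's event loop; returns 1 if it completed. */
static int example_poll(ExampleRequest *req)
{
    if (!req->pending) {
        return 0;
    }
    memcpy(req->dst, req->src, req->len);
    req->pending = 0;
    req->cb(req->opaque, 0);
    return 1;
}

/* Cancel a request that has not completed yet. */
static int example_cancel(ExampleRequest *req)
{
    if (!req->pending) {
        return -1;
    }
    req->pending = 0;
    req->cb(req->opaque, -ECANCELED);
    return 0;
}

/* Simple callback that stores the result code for the caller. */
static void example_store_result(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```

A synchronous wrapper could be layered on top by submitting a request and polling until the callback fires, which is one reason to design the asynchronous interface first.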

[Qemu-devel] [PATCH v8 0/3] Simpletrace v2: Support multiple args, strings.

2012-07-18 Thread Harsh Prateek Bora

The existing simpletrace backend can trace at most 6 arguments and does not
support strings. This newer tracelog format gets rid of fixed-size records
and therefore allows tracing a variable number of arguments, including strings.

Sample trace:
v9fs_version 0.000 tag=0x id=0x64 msize=0x2000 version=9P2000.L
v9fs_version_return 6.705 tag=0x id=0x64 msize=0x2000 version=9P2000.L
v9fs_attach 174.467 tag=0x1 id=0x68 fid=0x0 afid=0x
uname=nobody aname=
v9fs_attach_return 4720.454 tag=0x1 id=0x68 type=0xff80
version=0x4f2a4dd0  path=0x220ea6

v8:
- Addressed Stefan's comment on v7 series.
- Included fix suggested by Stefan to use raw malloc/free in writer thread.

v7:
- Introduced clear_buffer_range to reset buffer range after consuming record.
- Introduced cap on string length using MAX_TRACE_STRLEN (currently set to 512)
  to avoid calculating wrong length when string length is  sizeof(uint32_t).
- Reverted idx refresh back to inside while-loop in writeout_thread.

v6:
- Corrected flush condition and reverted un-necessary check in get_trace_record
- moved idx refresh out of while-loop in writeout_thread, see comments in code.

v5:
- Addressed Stefan's review comments on v4 series.
- Introduced code to handle corrupted records due to racing tracers.

v4:
- removed unused safe_strlen interface (missed in v3).

v3:
- Addressed Stefan's review comments on v2 series

v2:
- moved elimination of st_print_trace to 1/3 patch
- use sizeof(TraceRecord) for #define ST_REC_HDR_LEN

v1:
- Simpletrace v2 for the new (pythonized) tracetool


Harsh Prateek Bora (3):
  monitor: remove unused do_info_trace
  Simpletrace v2: Support multiple arguments, strings.
  Update simpletrace.py for new log format

 monitor.c   |   16 ---
 scripts/simpletrace.py  |  116 +--
 scripts/tracetool/backend/simple.py |   90 +---
 trace/simple.c  |  271 ---
 trace/simple.h  |   40 --
 5 files changed, 337 insertions(+), 196 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCH v8 2/3] Simpletrace v2: Support multiple arguments, strings.

2012-07-18 Thread Harsh Prateek Bora

The existing simpletrace backend can trace at most 6 arguments and does not
support strings. The newer trace log format drops fixed-size records and
can therefore trace a variable number of arguments, including strings.

Sample trace with strings:
v9fs_version 0.000 tag=0x id=0x64 msize=0x2000 version=9P2000.L
v9fs_version_return 6.705 tag=0x id=0x64 msize=0x2000 version=9P2000.L
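
The per-event size computation done by the generator can be tried in
isolation. This is a standalone sketch: record_size_expr is a
hypothetical helper name, and the string literals are written out as I
assume they appear in the generator (8 bytes per non-string argument,
4 bytes of length plus the string bytes otherwise).

```python
def is_string(arg_type):
    """True if a C argument type is a char pointer."""
    strtype = ('const char*', 'char*', 'const char *', 'char *')
    return arg_type.lstrip().startswith(strtype)


def record_size_expr(args):
    """Build the C expression for a record's payload size.

    args is a list of (type, name) pairs, as for a trace event.
    """
    sizes = []
    for type_, name in args:
        if is_string(type_):
            sizes.append("4 + arg%s_len" % name)  # length prefix + string bytes
        else:
            sizes.append("8")                     # everything else as uint64_t
    return " + ".join(sizes) if sizes else "0"
```

For an event with arguments (uint16_t tag, const char *version) this
yields the expression "8 + 4 + argversion_len", which is evaluated at
run time in the generated C function.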

Signed-off-by: Harsh Prateek Bora ha...@linux.vnet.ibm.com
---
 scripts/tracetool/backend/simple.py |   90 ++---
 trace/simple.c  |  253 ++-
 trace/simple.h  |   39 +-
 3 files changed, 262 insertions(+), 120 deletions(-)

diff --git a/scripts/tracetool/backend/simple.py 
b/scripts/tracetool/backend/simple.py
index fbb5717..c7e47d6 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -15,9 +15,16 @@ __email__  = stefa...@linux.vnet.ibm.com
 
 from tracetool import out
 
+def is_string(arg):
+    strtype = ('const char*', 'char*', 'const char *', 'char *')
+    if arg.lstrip().startswith(strtype):
+        return True
+    else:
+        return False
 
 def c(events):
     out('#include "trace.h"',
+        '#include "trace/simple.h"',
         '',
         'TraceEvent trace_list[] = {')
 
@@ -26,30 +33,75 @@ def c(events):
             name = e.name,
             )
 
-    out('};')
+    out('};',
+        '')
+
+    for num, event in enumerate(events):
+        out('void trace_%(name)s(%(args)s)',
+            '{',
+            'TraceBufferRecord rec;',
+            name = event.name,
+            args = event.args,
+            )
+        sizes = []
+        for type_, name in event.args:
+            if is_string(type_):
+                out('size_t arg%(name)s_len = %(name)s ? MIN(strlen(%(name)s), MAX_TRACE_STRLEN) : 0;',
+                    name = name,
+                   )
+                strsizeinfo = "4 + arg%s_len" % name
+                sizes.append(strsizeinfo)
+            else:
+                sizes.append("8")
+        sizestr = " + ".join(sizes)
+        if len(event.args) == 0:
+            sizestr = '0'
+
+
+        out('',
+            'if (!trace_list[%(event_id)s].state) {',
+            'return;',
+            '}',
+            '',
+            'if (trace_record_start(&rec, %(event_id)s, %(size_str)s)) {',
+            'return; /* Trace Buffer Full, Event Dropped ! */',
+            '}',
+            event_id = num,
+            size_str = sizestr,
+            )
+
+        if len(event.args) > 0:
+            for type_, name in event.args:
+                # string
+                if is_string(type_):
+                    out('trace_record_write_str(&rec, %(name)s, arg%(name)s_len);',
+                        name = name,
+                       )
+                # pointer var (not string)
+                elif type_.endswith('*'):
+                    out('trace_record_write_u64(&rec, (uint64_t)(uint64_t *)%(name)s);',
+                        name = name,
+                       )
+                # primitive data type
+                else:
+                    out('trace_record_write_u64(&rec, (uint64_t)%(name)s);',
+                        name = name,
+                       )
+
+        out('trace_record_finish(&rec);',
+            '}',
+            '')
+
+
 
 def h(events):
 out('#include trace/simple.h',
 '')
 
-for num, e in enumerate(events):
-if len(e.args):
-argstr = e.args.names()
-arg_prefix = ', (uint64_t)(uintptr_t)'
-cast_args = arg_prefix + arg_prefix.join(argstr)
-simple_args = (str(num) + cast_args)
-else:
-simple_args = str(num)
-
-out('static inline void trace_%(name)s(%(args)s)',
-'{',
-'trace%(argc)d(%(trace_args)s);',
-'}',
-name = e.name,
-args = e.args,
-argc = len(e.args),
-trace_args = simple_args,
+for event in events:
+out('void trace_%(name)s(%(args)s);',
+name = event.name,
+args = event.args,
 )
-
+out('')
 out('#define NR_TRACE_EVENTS %d' % len(events))
 out('extern TraceEvent trace_list[NR_TRACE_EVENTS];')
diff --git a/trace/simple.c b/trace/simple.c
index b64bcf4..b700ea3 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -27,7 +27,7 @@
 #define HEADER_MAGIC 0xf2b177cb0aa429b4ULL
 
 /** Trace file version number, bump if format changes */
-#define HEADER_VERSION 0
+#define HEADER_VERSION 2
 
 /** Records were dropped event ID */
 #define DROPPED_EVENT_ID (~(uint64_t)0 - 1)
@@ -35,23 +35,6 @@
 /** Trace record is valid */
 #define TRACE_RECORD_VALID ((uint64_t)1 << 63)
 
-/** Trace buffer entry */
-typedef struct {
-uint64_t event;
-uint64_t timestamp_ns;
-uint64_t x1;
-uint64_t x2;
-uint64_t x3;
-uint64_t x4;
-

[Qemu-devel] [PATCH v8 3/3] Update simpletrace.py for new log format

2012-07-18 Thread Harsh Prateek Bora

Support new tracelog format for multiple arguments and strings.

Signed-off-by: Harsh Prateek Bora ha...@linux.vnet.ibm.com
---
 scripts/simpletrace.py |  116 +++-
 1 file changed, 75 insertions(+), 41 deletions(-)

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index f55e5e6..9b4419f 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -12,53 +12,69 @@
 import struct
 import re
 import inspect
+from tracetool import _read_events, Event
+from tracetool.backend.simple import is_string
 
 header_event_id = 0x
 header_magic= 0xf2b177cb0aa429b4
-header_version  = 0
 dropped_event_id = 0xfffe
 
-trace_fmt = '='
-trace_len = struct.calcsize(trace_fmt)
-event_re  = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\).*')
+log_header_fmt = '=QQQ'
+rec_header_fmt = '=QQII'
 
-def parse_events(fobj):
-Parse a trace-events file into {event_num: (name, arg1, ...)}.
-
-def get_argnames(args):
-Extract argument names from a parameter list.
-return tuple(arg.split()[-1].lstrip('*') for arg in args.split(','))
-
-events = {dropped_event_id: ('dropped', 'count')}
-event_num = 0
-for line in fobj:
-m = event_re.match(line.strip())
-if m is None:
-continue
-
-disable, name, args = m.groups()
-events[event_num] = (name,) + get_argnames(args)
-event_num += 1
-return events
+def read_header(fobj, hfmt):
+'''Read a trace record header'''
+hlen = struct.calcsize(hfmt)
+hdr = fobj.read(hlen)
+if len(hdr) != hlen:
+return None
+return struct.unpack(hfmt, hdr)
 
-def read_record(fobj):
+def get_record(edict, rechdr, fobj):
 Deserialize a trace record from a file into a tuple (event_num, 
timestamp, arg1, ..., arg6).
-s = fobj.read(trace_len)
-if len(s) != trace_len:
+if rechdr is None:
 return None
-return struct.unpack(trace_fmt, s)
+rec = (rechdr[0], rechdr[1])
+if rechdr[0] != dropped_event_id:
+event_id = rechdr[0]
+event = edict[event_id]
+for type, name in event.args:
+if is_string(type):
+l = fobj.read(4)
+(len,) = struct.unpack('=L', l)
+s = fobj.read(len)
+rec = rec + (s,)
+else:
+(value,) = struct.unpack('=Q', fobj.read(8))
+rec = rec + (value,)
+else:
+(value,) = struct.unpack('=Q', fobj.read(8))
+rec = rec + (value,)
+return rec
+
+
+def read_record(edict, fobj):
+Deserialize a trace record from a file into a tuple (event_num, 
timestamp, arg1, ..., arg6).
+rechdr = read_header(fobj, rec_header_fmt)
+return get_record(edict, rechdr, fobj) # return tuple of record elements
 
-def read_trace_file(fobj):
+def read_trace_file(edict, fobj):
 Deserialize trace records from a file, yielding record tuples 
(event_num, timestamp, arg1, ..., arg6).
-header = read_record(fobj)
+header = read_header(fobj, log_header_fmt)
 if header is None or \
header[0] != header_event_id or \
-   header[1] != header_magic or \
-   header[2] != header_version:
-raise ValueError('not a trace file or incompatible version')
+   header[1] != header_magic:
+raise ValueError('Not a valid trace file!')
+if header[2] != 0 and \
+   header[2] != 2:
+raise ValueError('Unknown version of tracelog format!')
+
+log_version = header[2]
+if log_version == 0:
+raise ValueError('Older log format, not supported with this Qemu 
release!')
 
 while True:
-rec = read_record(fobj)
+rec = read_record(edict, fobj)
 if rec is None:
 break
 
@@ -89,16 +105,29 @@ class Analyzer(object):
 def process(events, log, analyzer):
 Invoke an analyzer on each event in a log.
 if isinstance(events, str):
-events = parse_events(open(events, 'r'))
+events = _read_events(open(events, 'r'))
 if isinstance(log, str):
 log = open(log, 'rb')
 
+enabled_events = []
+dropped_event = Event.build("Dropped_Event(uint64_t num_events_dropped)")
+edict = {dropped_event_id: dropped_event}
+
+for e in events:
+if 'disable' not in e.properties:
+enabled_events.append(e)
+for num, event in enumerate(enabled_events):
+edict[num] = event
+
 def build_fn(analyzer, event):
-fn = getattr(analyzer, event[0], None)
+if isinstance(event, str):
+return analyzer.catchall
+
+fn = getattr(analyzer, event.name, None)
 if fn is None:
 return analyzer.catchall
 
-event_argcount = len(event) - 1
+event_argcount = len(event.args)
 fn_argcount = len(inspect.getargspec(fn)[0]) - 1
 if fn_argcount == event_argcount + 1:
 # Include timestamp as first argument
@@ -109,9 +138,9 @@

[Qemu-devel] [PATCH v8 1/3] monitor: remove unused do_info_trace

2012-07-18 Thread Harsh Prateek Bora

Going forward with simpletrace v2 variable size trace records, we cannot
have a generic function to print trace event info and therefore this
interface becomes invalid.

As per Stefan Hajnoczi:

This command is only available from the human monitor.  It's not very
useful because it historically hasn't been able to pretty-print events
or show them in the right order (we use a ringbuffer but it prints
them out from index 0).

Therefore, I don't think we're under any obligation to keep this
command around.  No one has complained about its limitations - I
think this is a sign that no one has used it.  I'd be okay with a
patch that removes it.

Ref: http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg01268.html
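
To illustrate the ordering problem mentioned in the quote: once a ring
buffer has wrapped, walking it from index 0 mixes new and old entries.
A small sketch of reading it in write order instead; oldest_first and
its parameters are hypothetical names, not code from trace/simple.c.

```python
def oldest_first(ring, next_idx, wrapped):
    """Return ring buffer entries in the order they were written.

    ring     -- fixed-size list used as the buffer
    next_idx -- slot the next write would go to
    wrapped  -- True once the writer has gone past the end at least once
    """
    if not wrapped:
        return ring[:next_idx]
    # The oldest entry sits at next_idx; the newest just before it.
    return ring[next_idx:] + ring[:next_idx]
```

With a 4-slot buffer holding [5, 6, 3, 4] and next_idx == 2, printing
from index 0 gives 5, 6, 3, 4, while the write order is 3, 4, 5, 6.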

Signed-off-by: Harsh Prateek Bora ha...@linux.vnet.ibm.com
---
 monitor.c  |   16 
 trace/simple.c |   18 --
 trace/simple.h |1 -
 3 files changed, 35 deletions(-)

diff --git a/monitor.c b/monitor.c
index a3bc2c7..1fb00ab 100644
--- a/monitor.c
+++ b/monitor.c
@@ -795,13 +795,6 @@ static void do_info_cpu_stats(Monitor *mon)
 }
 #endif
 
-#if defined(CONFIG_TRACE_SIMPLE)
-static void do_info_trace(Monitor *mon)
-{
-st_print_trace((FILE *)mon, monitor_fprintf);
-}
-#endif
-
 static void do_trace_print_events(Monitor *mon)
 {
 trace_print_events((FILE *)mon, monitor_fprintf);
@@ -2568,15 +2561,6 @@ static mon_cmd_t info_cmds[] = {
 .help   = show roms,
 .mhandler.info = do_info_roms,
 },
-#if defined(CONFIG_TRACE_SIMPLE)
-{
-.name   = trace,
-.args_type  = ,
-.params = ,
-.help   = show current contents of trace buffer,
-.mhandler.info = do_info_trace,
-},
-#endif
 {
 .name   = trace-events,
 .args_type  = ,
diff --git a/trace/simple.c b/trace/simple.c
index b4a3c6e..b64bcf4 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -291,24 +291,6 @@ void st_print_trace_file_status(FILE *stream, int 
(*stream_printf)(FILE *stream,
   trace_file_name, trace_fp ? on : off);
 }
 
-void st_print_trace(FILE *stream, int (*stream_printf)(FILE *stream, const char *fmt, ...))
-{
-    unsigned int i;
-
-    for (i = 0; i < TRACE_BUF_LEN; i++) {
-        TraceRecord record;
-
-        if (!get_trace_record(i, &record)) {
-            continue;
-        }
-        stream_printf(stream, "Event %" PRIu64 " : %" PRIx64 " %" PRIx64
-                      " %" PRIx64 " %" PRIx64 " %" PRIx64 " %" PRIx64 "\n",
-                      record.event, record.x1, record.x2,
-                      record.x3, record.x4, record.x5,
-                      record.x6);
-    }
-}
-
 void st_flush_trace_buffer(void)
 {
 flush_trace_file(true);
diff --git a/trace/simple.h b/trace/simple.h
index 466e75b..6b5358c 100644
--- a/trace/simple.h
+++ b/trace/simple.h
@@ -29,7 +29,6 @@ void trace3(TraceEventID event, uint64_t x1, uint64_t x2, 
uint64_t x3);
 void trace4(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4);
 void trace5(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4, uint64_t x5);
 void trace6(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4, uint64_t x5, uint64_t x6);
-void st_print_trace(FILE *stream, fprintf_function stream_printf);
 void st_print_trace_file_status(FILE *stream, fprintf_function stream_printf);
 void st_set_trace_file_enabled(bool enable);
 bool st_set_trace_file(const char *file);
-- 
1.7.10.4

[Qemu-devel] [PATCH] update-linux-headers.sh: Don't hard code list of architectures

2012-07-18 Thread Peter Maydell

Rather than hardcoding the list of architectures in the kernel
header update script, just import headers for every architecture
which supports KVM (with a blacklist exception for ia64 which
has KVM headers but is dead). This reduces the number of QEMU
files which need to be updated to add support for a new KVM
architecture.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
Changes v1-v2:
 * added a blacklist for ia64, to avoid noise and importing
   a pointless set of headers that will get dropped later

 scripts/update-linux-headers.sh |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 9d2a4bc..57ce69f 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -28,7 +28,21 @@ if [ -z $output ]; then
 output=$PWD
 fi
 
-for arch in x86 powerpc s390; do
+# This will pick up non-directories too (eg Kconfig) but we will
+# ignore them in the next loop.
+ARCHLIST=$(cd "$linux/arch" && echo *)
+
+for arch in $ARCHLIST; do
+    # Discard anything which isn't a KVM-supporting architecture
+    if ! [ -e "$linux/arch/$arch/include/asm/kvm.h" ]; then
+        continue
+    fi
+
+    # Blacklist architectures which have KVM headers but are actually dead
+    if [ "$arch" = "ia64" ]; then
+        continue
+    fi
+
     make -C "$linux" INSTALL_HDR_PATH="$tmpdir" SRCARCH=$arch headers_install
 
     rm -rf "$output/linux-headers/asm-$arch"
-- 
1.7.5.4
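
For readers who want to experiment with the selection rule outside the
shell script, here is the same logic as a small Python sketch:
keep every entry under arch/ that ships include/asm/kvm.h, except the
blacklisted ones. The function name kvm_arches and the DEAD_ARCHES set
are made up for this illustration.

```python
from pathlib import Path

DEAD_ARCHES = {"ia64"}  # has KVM headers but is considered dead


def kvm_arches(linux_tree):
    """List kernel architectures whose KVM headers should be imported."""
    result = []
    for entry in sorted((Path(linux_tree) / "arch").iterdir()):
        # Non-directories such as Kconfig fail this check and are skipped.
        if not (entry / "include" / "asm" / "kvm.h").exists():
            continue
        if entry.name in DEAD_ARCHES:
            continue
        result.append(entry.name)
    return result
```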

Re: [Qemu-devel] [PATCH] update-linux-headers.sh: Don't hard code list of architectures

2012-07-18 Thread Jan Kiszka

On 2012-07-18 12:11, Peter Maydell wrote:
 Rather than hardcoding the list of architectures in the kernel
 header update script, just import headers for every architecture
 which supports KVM (with a blacklist exception for ia64 which
 has KVM headers but is dead). This reduces the number of QEMU
 files which need to be updated to add support for a new KVM
 architecture.
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Acked-by: Jan Kiszka jan.kis...@siemens.com

 ---
 Changes v1-v2:
  * added a blacklist for ia64, to avoid noise and importing
a pointless set of headers that will get dropped later
 
  scripts/update-linux-headers.sh |   16 +++-
  1 files changed, 15 insertions(+), 1 deletions(-)
 
 diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
 index 9d2a4bc..57ce69f 100755
 --- a/scripts/update-linux-headers.sh
 +++ b/scripts/update-linux-headers.sh
 @@ -28,7 +28,21 @@ if [ -z $output ]; then
  output=$PWD
  fi
  
 -for arch in x86 powerpc s390; do
 +# This will pick up non-directories too (eg Kconfig) but we will
 +# ignore them in the next loop.
 +ARCHLIST=$(cd "$linux/arch" && echo *)
 +
 +for arch in $ARCHLIST; do
 +    # Discard anything which isn't a KVM-supporting architecture
 +    if ! [ -e "$linux/arch/$arch/include/asm/kvm.h" ]; then
 +        continue
 +    fi
 +
 +    # Blacklist architectures which have KVM headers but are actually dead
 +    if [ "$arch" = "ia64" ]; then
 +        continue
 +    fi
 +
      make -C "$linux" INSTALL_HDR_PATH="$tmpdir" SRCARCH=$arch headers_install
  
      rm -rf "$output/linux-headers/asm-$arch"
 


-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Andreas Färber

Am 16.07.2012 17:25, schrieb Peter Maydell:
 Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
 to set a QOM or qdev property after the object/device has been
 realized. This allows a slightly more informative diagnostic
 than the previous permission denied message.
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 Changes since the v1 (which was sent way back in March...):
  * rebased on master now a pile of qdev/qom changes have landed
  * fixed some overlong lines
  * avoid gcc '?:' extension
  * a couple of set_ functions in qdev-properties.c are new since v1
and needed their QERR_PERMISSION_DENIED checks changing

This does not yet seem to take into account the discussion with libvirt
and Anthony on what parameters to pass. The ID generalization was
nack'ed by Anthony and a QOM path was suggested as alternative. Could
you please look into that?

Thanks,
Andreas

 
  hw/qdev-properties.c |   42 --
  qerror.c |5 +
  qerror.h |3 +++
  3 files changed, 36 insertions(+), 14 deletions(-)
 
 diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
 index 0b89462..ae0c7a7 100644
 --- a/hw/qdev-properties.c
 +++ b/hw/qdev-properties.c
 @@ -54,7 +54,8 @@ static void set_bit(Object *obj, Visitor *v, void *opaque,
  bool value;
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -94,7 +95,8 @@ static void set_uint8(Object *obj, Visitor *v, void *opaque,
  uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -161,7 +163,8 @@ static void set_uint16(Object *obj, Visitor *v, void 
 *opaque,
  uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -194,7 +197,8 @@ static void set_uint32(Object *obj, Visitor *v, void 
 *opaque,
  uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -219,7 +223,8 @@ static void set_int32(Object *obj, Visitor *v, void 
 *opaque,
  int32_t *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -292,7 +297,8 @@ static void set_uint64(Object *obj, Visitor *v, void 
 *opaque,
  uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -380,7 +386,8 @@ static void set_string(Object *obj, Visitor *v, void 
 *opaque,
  char *str;
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -458,7 +465,8 @@ static void set_pointer(Object *obj, Visitor *v, Property 
 *prop,
  int ret;
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -627,7 +635,8 @@ static void set_vlan(Object *obj, Visitor *v, void 
 *opaque,
  VLANState *vlan;
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -697,7 +706,8 @@ static void set_mac(Object *obj, Visitor *v, void *opaque,
  char *str, *p;
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);
 +error_set(errp, QERR_PROPERTY_SET_AFTER_REALIZE,
 +  dev-id ? dev-id : , name);
  return;
  }
  
 @@ -767,7 +777,8 @@ static void set_enum(Object *obj, Visitor *v, void 
 *opaque,
  int *ptr = qdev_get_prop_ptr(dev, prop);
  
  if (dev-state != DEV_STATE_CREATED) {
 -error_set(errp, QERR_PERMISSION_DENIED);

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Peter Maydell

On 18 July 2012 11:20, Andreas Färber afaer...@suse.de wrote:
 Am 16.07.2012 17:25, schrieb Peter Maydell:
 Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
 to set a QOM or qdev property after the object/device has been
 realized. This allows a slightly more informative diagnostic
 than the previous permission denied message.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 Changes since the v1 (which was sent way back in March...):
  * rebased on master now a pile of qdev/qom changes have landed
  * fixed some overlong lines
  * avoid gcc '?:' extension
  * a couple of set_ functions in qdev-properties.c are new since v1
and needed their QERR_PERMISSION_DENIED checks changing

 This does not yet seem to take into account the discussion with libvirt
 and Anthony on what parameters to pass. The ID generalization was
 nack'ed by Anthony and a QOM path was suggested as alternative. Could
 you please look into that?

I'm afraid I'm not really sure what you're referring to here --
do you have a link to a discussion?

All I want is for errors printed to the user to be a bit more
helpful; the whole qerror infrastructure seems to make it
somewhere between difficult and impossible to do that :-(

-- PMM

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Paolo Bonzini

 Whether to provide sync and/or async access is a key question.

Indeed.

 Synchronous APIs are great for writing dedicated tools like dd, cp,
 convert, etc.
 
 Asynchronous APIs are essential for integrating image file I/O into
 event-driven programs like libvirt.  Here, the ability to do other
 things while image file I/O is in progress is a requirement.  It may
 also be necessary to cancel or timeout if an operation is not making
 progress or the user decides to stop it.
 
 I think we need to provide both sync and async.  Libraries like
 libssh2 and libcurl already do this so their APIs can be used as a
 starting point for async I/O.

If we want to provide an asynchronous API, the easiest thing would
be to provide a GSource and that's it.  That would even make sense for
QEMU itself, in fact.

What I'm worried about is how to support _both_ synchronous and
asynchronous access.  I'd like the library to be clean of things like
qemu_aio_wait() and qemu_aio_flush(), at least in the beginning.
That's why I think async can come later, once we actually get
applications needing it.  Right now, libvirt's requirements are
simple (e.g. probing the backing file chain) and would be synchronous
anyway.

Paolo

[Qemu-devel] [PATCH] vfio-powerpc: added VFIO support (v4)

2012-07-18 Thread Alexey Kardashevskiy

It literally does the following:

1. POWERPC IOMMU support (the kernel counterpart is required)

2. The patch assumes that IOAPIC calls are going to be replaced
with something generic.

3. Added sPAPRVFIOData (hw/spapr_iommu_vfio.h) which describes
the interface between VFIO and sPAPR IOMMU.

4. Change sPAPR PHB to scan the PCI bus which is used for
the IOMMU-VFIO group. Now it is enough to add the following to
the QEMU command line to get VFIO up with all the devices from
IOMMU group with id=3:
-device spapr-pci-host-bridge,busname=E1000E,buid=0x3,iommu=3,\
 
mem_win_addr=0x2300,io_win_addr=0x2400,msi_win_addr=0x2500

With the patches posted a bit earlier today, this patch fully supports
VFIO, including MSI-X.

ps. yes, I know that linux_vfio.h has moved, will fix it later :)

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/linux-vfio.h   |   26 +++
 hw/ppc/Makefile.objs  |3 ++
 hw/spapr.h|4 ++
 hw/spapr_iommu.c  |   62 -
 hw/spapr_iommu_vfio.h |   34 ++
 hw/spapr_pci.c|  124 +++--
 hw/spapr_pci.h|6 +++
 hw/vfio_pci.c |   64 +
 hw/vfio_pci.h |2 +
 trace-events  |1 +
 10 files changed, 320 insertions(+), 6 deletions(-)
 create mode 100644 hw/spapr_iommu_vfio.h

diff --git a/hw/linux-vfio.h b/hw/linux-vfio.h
index 300d49b..27a0501 100644
--- a/hw/linux-vfio.h
+++ b/hw/linux-vfio.h
@@ -442,4 +442,30 @@ struct vfio_iommu_type1_dma_unmap {
 
 #define VFIO_IOMMU_UNMAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 14)
 
+/*
+ * Interface to SPAPR TCE (POWERPC Book3S)
+ */
+#define SPAPR_TCE_IOMMU 2
+
+struct tce_iommu_info {
+__u32 argsz;
+__u32 flags;
+__u32 dma32_window_start;
+__u32 dma32_window_size;
+__u64 dma64_window_start;
+__u64 dma64_window_size;
+};
+
+#define SPAPR_TCE_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
+
+struct tce_iommu_dma_map {
+__u32 argsz;
+__u32 flags;
+__u64 va;
+__u64 dmaaddr;
+};
+
+#define SPAPR_TCE_IOMMU_MAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 13)
+#define SPAPR_TCE_IOMMU_UNMAP_DMA   _IO(VFIO_TYPE, VFIO_BASE + 14)
+
 #endif /* VFIO_H */
diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index f573a95..c46a049 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -25,4 +25,7 @@ obj-$(CONFIG_FDT) += ../device_tree.o
 # Xilinx PPC peripherals
 obj-y += xilinx_ethlite.o
 
+# VFIO PCI device assignment
+obj-$(CONFIG_VFIO_PCI) += vfio_pci.o
+
 obj-y := $(addprefix ../,$(obj-y))
diff --git a/hw/spapr.h b/hw/spapr.h
index b37f337..0c15c88 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -340,4 +340,8 @@ int spapr_dma_dt(void *fdt, int node_off, const char 
*propname,
 int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
   DMAContext *dma);
 
+struct sPAPRVFIOData;
+void spapr_vfio_init_dma(int group_id, uint32_t liobn,
+ struct sPAPRVFIOData *data);
+
 #endif /* !defined (__HW_SPAPR_H__) */
diff --git a/hw/spapr_iommu.c b/hw/spapr_iommu.c
index 50c288d..0a82842 100644
--- a/hw/spapr_iommu.c
+++ b/hw/spapr_iommu.c
@@ -23,6 +23,8 @@
 #include dma.h
 
 #include hw/spapr.h
+#include hw/spapr_iommu_vfio.h
+#include hw/vfio_pci.h
 
 #include libfdt.h
 
@@ -183,6 +185,60 @@ static int put_tce_emu(target_ulong liobn, target_ulong 
ioba, target_ulong tce)
 return 0;
 }
 
+typedef struct sPAPRVFIOTable {
+    struct sPAPRVFIOData *data;
+    uint32_t liobn;
+    QLIST_ENTRY(sPAPRVFIOTable) list;
+} sPAPRVFIOTable;
+
+QLIST_HEAD(vfio_tce_tables, sPAPRVFIOTable) vfio_tce_tables;
+
+void spapr_vfio_init_dma(int group_id, uint32_t liobn,
+                         struct sPAPRVFIOData *data)
+{
+    sPAPRVFIOTable *t;
+
+    t = g_malloc0(sizeof(*t));
+    t->data = data;
+    t->liobn = liobn;
+
+    QLIST_INSERT_HEAD(&vfio_tce_tables, t, list);
+}
+
+static int put_tce_vfio(uint32_t liobn, target_ulong ioba, target_ulong tce)
+{
+    sPAPRVFIOTable *t;
+    struct tce_iommu_dma_map map = {
+        .argsz = sizeof(map),
+        .va = 0,
+        .dmaaddr = ioba,
+    };
+
+    QLIST_FOREACH(t, &vfio_tce_tables, list) {
+        if (t->liobn != liobn) {
+            continue;
+        }
+        if (!t->data) {
+            return H_NO_MEM;
+        }
+        if (tce) {
+            map.va = (uintptr_t)qemu_get_ram_ptr(tce & ~SPAPR_TCE_PAGE_MASK);
+
+            if (t->data->map(t->data->groupid, &map)) {
+                perror("TCE_MAP_DMA");
+                return H_PARAMETER;
+            }
+        } else {
+            if (t->data->unmap(t->data->groupid, &map)) {
+                perror("TCE_UNMAP_DMA");
+                return H_PARAMETER;
+            }
+        }
+        return H_SUCCESS;
+    }
+    return H_CONTINUE; /* positive non-zero value */
+}
+
 static target_ulong h_put_tce(CPUPPCState *env, sPAPREnvironment *spapr,

[Qemu-devel] [PATCH] configure: Don't implicitly hardcode list of KVM architectures

2012-07-18 Thread Peter Maydell

The code creating the symlink from linux-headers/asm to the
architecture specific linux-headers/asm-$arch directory was
implicitly hardcoding a list of KVM supporting architectures.
Add a default case for the common Linux architecture name and
QEMU CPU name match case, so future architectures will only
need to add code if they've managed to get mismatched names.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
This means there's now only one place where configure has a
list of KVM enabled targets (the check where we set CONFIG_KVM).
I think we have to have one list, but we don't need to have
more than one...

NB: this patch means we'll now set up an asm/ link for
s390 as well as s390x. That should be harmless (s390 can't be
trying to include any headers from asm/ or it wouldn't build).

 configure |   14 +++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 0a3896e..c8d2895 100755
--- a/configure
+++ b/configure
@@ -3482,15 +3482,23 @@ if test $linux = yes ; then
   mkdir -p linux-headers
   case $cpu in
   i386|x86_64)
-symlink $source_path/linux-headers/asm-x86 linux-headers/asm
+linux_arch=x86
 ;;
   ppcemb|ppc|ppc64)
-symlink $source_path/linux-headers/asm-powerpc linux-headers/asm
+linux_arch=powerpc
 ;;
   s390x)
-symlink $source_path/linux-headers/asm-s390 linux-headers/asm
+linux_arch=s390
+;;
+  *)
+# For most CPUs the kernel architecture name and QEMU CPU name match.
+linux_arch=$cpu
 ;;
   esac
+# For non-KVM architectures we will not have asm headers.
+if [ -e $source_path/linux-headers/asm-$linux_arch ]; then
+symlink $source_path/linux-headers/asm-$linux_arch linux-headers/asm
+fi
 fi
 
 for target in $target_list; do
-- 
1.7.5.4

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Markus Armbruster

Peter Maydell peter.mayd...@linaro.org writes:

 On 18 July 2012 11:20, Andreas Färber afaer...@suse.de wrote:
 Am 16.07.2012 17:25, schrieb Peter Maydell:
 Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
 to set a QOM or qdev property after the object/device has been
 realized. This allows a slightly more informative diagnostic
 than the previous permission denied message.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 Changes since the v1 (which was sent way back in March...):
  * rebased on master now a pile of qdev/qom changes have landed
  * fixed some overlong lines
  * avoid gcc '?:' extension
  * a couple of set_ functions in qdev-properties.c are new since v1
and needed their QERR_PERMISSION_DENIED checks changing

 This does not yet seem to take into account the discussion with libvirt
 and Anthony on what parameters to pass. The ID generalization was
 nack'ed by Anthony and a QOM path was suggested as alternative. Could
 you please look into that?

 I'm afraid I'm not really sure what you're referring to here --
 do you have a link to a discussion?

 All I want is for errors printed to the user to be a bit more
 helpful; the whole qerror infrastructure seems to make it
 somewhere between difficult and impossible to do that :-(

Yup.  One of the reasons why I detest it.

A recent thread on how to recover from this disaster:
http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg03469.html

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Peter Maydell

On 18 July 2012 12:19, Markus Armbruster arm...@redhat.com wrote:
 Peter Maydell peter.mayd...@linaro.org writes:

 On 18 July 2012 11:20, Andreas Färber afaer...@suse.de wrote:
 Am 16.07.2012 17:25, schrieb Peter Maydell:
 Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
 to set a QOM or qdev property after the object/device has been
 realized. This allows a slightly more informative diagnostic
 than the previous permission denied message.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 Changes since the v1 (which was sent way back in March...):
  * rebased on master now a pile of qdev/qom changes have landed
  * fixed some overlong lines
  * avoid gcc '?:' extension
  * a couple of set_ functions in qdev-properties.c are new since v1
and needed their QERR_PERMISSION_DENIED checks changing

 This does not yet seem to take into account the discussion with libvirt
 and Anthony on what parameters to pass. The ID generalization was
 nack'ed by Anthony and a QOM path was suggested as alternative. Could
 you please look into that?

 I'm afraid I'm not really sure what you're referring to here --
 do you have a link to a discussion?

 All I want is for errors printed to the user to be a bit more
 helpful; the whole qerror infrastructure seems to make it
 somewhere between difficult and impossible to do that :-(

 Yup.  One of the reasons why I detest it.

 A recent thread on how to recover from this disaster:
 http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg03469.html

That's interesting but I'm not sure how it's relevant. We already
have QERR_PROPERTY values just this new one, so I don't see why
this is any worse than the ones we have. If we come up with some
new scheme we can convert this with all the rest. And I don't
really want to block improving this error message on getting
agreement for some big redesign effort...

-- PMM

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Peter Maydell

On 18 July 2012 12:36, Peter Maydell peter.mayd...@linaro.org wrote:
 That's interesting but I'm not sure how it's relevant. We already
 have QERR_PROPERTY values just this new one, so I don't see why
 this is any worse than the ones we have.

just like.

-- PMM

Re: [Qemu-devel] [PATCH] eventfd: making it rhread safe

2012-07-18 Thread Michael S. Tsirkin

On Mon, Jul 02, 2012 at 05:48:16AM +1000, Alexey Kardashevskiy wrote:
Subject: Re: [PATCH] eventfd: making it rhread safe

typo in the subject

 QEMU uses IO handlers to run select() in the main loop. The handlers list is 
 managed by qemu_set_fd_handler() helper which works fine when called from the 
 main thread as it is called not when select() is waiting.

git commit logs should break lines at 70-80 chars.
Sometimes people go beyond that a bit. But 214 chars is not reasonable.

 However sometime we need to update the handlers list from another thread.

Want to be more specific?

 For that the main loop's select() needs to be restarted with the updated list.
 
 The patch adds the qemu_notify_event() call to interrupt select() and make 
 wrapping code to restart select() with the updated IO handlers list.

What does 'and make wrapping code' mean?

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Does this fix any bugs? If yes commit log should mention this.

 ---
  iohandler.c |1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/iohandler.c b/iohandler.c
 index 3c74de6..dea4355 100644
 --- a/iohandler.c
 +++ b/iohandler.c
 @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
          ioh->fd_write = fd_write;
          ioh->opaque = opaque;
          ioh->deleted = 0;
 +        qemu_notify_event();
      }
      return 0;
  }
 -- 
 1.7.10

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Markus Armbruster

Peter Maydell peter.mayd...@linaro.org writes:

 On 18 July 2012 12:19, Markus Armbruster arm...@redhat.com wrote:
 Peter Maydell peter.mayd...@linaro.org writes:

 On 18 July 2012 11:20, Andreas Färber afaer...@suse.de wrote:
 Am 16.07.2012 17:25, schrieb Peter Maydell:
 Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
 to set a QOM or qdev property after the object/device has been
 realized. This allows a slightly more informative diagnostic
 than the previous permission denied message.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 Changes since the v1 (which was sent way back in March...):
  * rebased on master now a pile of qdev/qom changes have landed
  * fixed some overlong lines
  * avoid gcc '?:' extension
  * a couple of set_ functions in qdev-properties.c are new since v1
and needed their QERR_PERMISSION_DENIED checks changing

 This does not yet seem to take into account the discussion with libvirt
 and Anthony on what parameters to pass. The ID generalization was
 nack'ed by Anthony and a QOM path was suggested as alternative. Could
 you please look into that?

 I'm afraid I'm not really sure what you're referring to here --
 do you have a link to a discussion?

 All I want is for errors printed to the user to be a bit more
 helpful; the whole qerror infrastructure seems to make it
 somewhere between difficult and impossible to do that :-(

 Yup.  One of the reasons why I detest it.

 A recent thread on how to recover from this disaster:
 http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg03469.html

 That's interesting but I'm not sure how it's relevant. We already
 have QERR_PROPERTY values just this new one, so I don't see why
 this is any worse than the ones we have. If we come up with some
 new scheme we can convert this with all the rest. And I don't
 really want to block improve this error message on getting
 agreement for some big redesign effort...

I'm not objecting to your patch (I didn't even review it), just pointing
out there's a glimmer of hope on the "emitting error messages fit for
humans is somewhere between difficult and impossible" front.

[Qemu-devel] [PATCH] eventfd: making it thread safe

2012-07-18 Thread Alexey Kardashevskiy

QEMU uses IO handlers to run select() in the main loop.
The handlers list is managed by qemu_set_fd_handler() helper
which works fine when called from the main thread as it is
called not when select() is waiting.

However IO handlers list can be changed in the thread other than
the main one doing os_host_main_loop_wait(), for example, as a result
of a hypercall which changes PCI config space (VFIO on POWER is the case)
and enables/disables MSI/MSIX which closes/creates eventfd handles.
If the main loop is waiting on such eventfd, it has to be restarted.

The patch adds the qemu_notify_event() call to interrupt select()
and make main_loop() to restart select() with the updated IO
handlers list.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
---
 iohandler.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/iohandler.c b/iohandler.c
index 3c74de6..dea4355 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
         ioh->fd_write = fd_write;
         ioh->opaque = opaque;
         ioh->deleted = 0;
+        qemu_notify_event();
     }
     return 0;
 }
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH] cpu-defs.h: pull in qemu-common.h for HOST_LONG_BITS

2012-07-18 Thread Mike Frysinger

On Monday 16 July 2012 01:26:50 Stefan Weil wrote:
 Am 15.07.2012 23:54, schrieb Mike Frysinger:
  On Sunday 15 July 2012 15:34:33 Stefan Weil wrote:
  Am 15.07.2012 22:25, schrieb Mike Frysinger:
  This file uses the define HOST_LONG_BITS, but doesn't explicitly
  include qemu-common.h for it leading to build warnings for some
  setups: In file included from qemu/target-bfin/cpu.h:17,
  
 from qemu/cputlb.c:21:
  qemu/cpu-defs.h:83:5: warning: HOST_LONG_BITS is not defined
  
  Signed-off-by: Mike Frysinger vap...@gentoo.org
  ---
  
 cpu-defs.h |1 +
 1 file changed, 1 insertion(+)
  
  diff --git a/cpu-defs.h b/cpu-defs.h
  index f49e950..0d6018d 100644
  --- a/cpu-defs.h
  +++ b/cpu-defs.h
  @@ -28,6 +28,7 @@
  
  #include <inttypes.h>
  #include <signal.h>
  #include "osdep.h"
  
  +#include "qemu-common.h"
  
  #include "qemu-queue.h"
  #include "targphys.h"
  
  No. Of course this works, but I don't think that it is reasonable
  to include qemu-common.h in every *.h file. There are already too
  many of them.
  
  target-bfin/cpu.h should start like all other cpu.h files with
  
  these include statements:
  sorry, but that's fragile junk.  if a header file uses defines from
  another header file, it should be including it.
  -mike
 
 There are different ways how things can be done.
 
 Normally, I agree with you that each header file should be complete,
 but that's not the QEMU style.
 
 In your special case, it's more important to keep all */cpu.h similar.
 qemu/target-bfin/cpu.h is still not part of the official QEMU code,
 so it can be fixed before it is committed.

a lot of existing files in the top level pull in qemu-common.h.  i don't think 
this is a special case considering it's the first failure i've seen since i 
started the Blackfin port over a year ago.
-mike
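
For readers following the thread, the warning under discussion is the kind gcc emits with -Wundef: an identifier that is not defined when an #if is evaluated is silently treated as 0. A minimal sketch of the failure mode and of a header that guards against it is below. The fallback definition is a hypothetical stand-in invented for this example; it is not how QEMU derives HOST_LONG_BITS.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for the definition a common header would provide.
 * If nothing defines HOST_LONG_BITS, "#if HOST_LONG_BITS == 32" compares
 * 0 with 32, so every width-specific branch is skipped without an error.
 */
#ifndef HOST_LONG_BITS
# if ULONG_MAX == 0xffffffffUL
#  define HOST_LONG_BITS 32
# else
#  define HOST_LONG_BITS 64
# endif
#endif

/* A header that depends on the macro can refuse to build without it. */
#if HOST_LONG_BITS == 32
# define HOST_LONG_DESC "32-bit long"
#elif HOST_LONG_BITS == 64
# define HOST_LONG_DESC "64-bit long"
#else
# error "HOST_LONG_BITS is missing or has an unexpected value"
#endif

/* Returns the configured width after cross-checking it with the compiler. */
int host_long_bits(void)
{
    assert(HOST_LONG_BITS == (int)(sizeof(long) * CHAR_BIT));
    return HOST_LONG_BITS;
}

/* Human-readable description selected at preprocessing time. */
const char *host_long_desc(void)
{
    return HOST_LONG_DESC;
}
```

Whether the defining header is pulled in by the header that uses the macro or by every file that includes it is exactly the style question being argued here; the sketch only shows why the missing definition matters.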



Re: [Qemu-devel] [PATCH] configure: do not quote $PKG_CONFIG

2012-07-18 Thread Mike Frysinger

On Monday 16 July 2012 11:58:55 Stefan Weil wrote:
 Am 16.07.2012 17:39, schrieb Eric Blake:
  On 07/15/2012 01:54 PM, Stefan Weil wrote:
  Am 15.07.2012 22:26, schrieb Mike Frysinger:
  We should not quote the PKG_CONFIG setting as this deviates from the
  canonical upstream behavior that gets integrated with all other build
  systems, and deviates from how we treat all other toolchain variables
  that we get from the environment.
  
  Ultimately, the point is that it breaks passing custom flags directly
  to pkg-config via the env var where this normally works elsewhere,
  and it used to work in the past.
  
  What about passing custom flags with QEMU_PKG_CONFIG_FLAGS?
  
  Removing the quotes will not allow paths containing spaces,
  so that's not a good idea.
  
  Actually, it IS a good idea.  The de facto standard build environment
  requires that pkg-config is not allowed to live in a path containing
  spaces, precisely so that you can override the variable to pass options
  to your preferred location of pkg-config; and if your build setup is
  truly so messed up as to have pkg-config installed in a canonical
  location with spaces, then you can also tweak your unusual environment
  to provide a symlink to pkg-config that does not contain spaces as the
  workaround.
 
 That sounds reasonable. Then the following patch was at least partially
 unnecessary:
 
 commit 17884d7b6462b0fe497f08fec6091ffbe04caa8d
 Author: Sergei Trofimovich sly...@gentoo.org
 Date:   Tue Jan 31 22:03:45 2012 +0300
 
  ./configure: request pkg-config to provide private libs when static
 linking
 
  Added wrapper around pkg-config to allow:
  - safe options injection via ${QEMU_PKG_CONFIG_FLAGS}
  - spaces in path to pkg-config
 
  Signed-off-by: Sergei Trofimovich sly...@gentoo.org
  CC: Peter Maydell peter.mayd...@linaro.org
  Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 
 With Mike's new patch, QEMU_PKG_CONFIG_FLAGS is no longer needed
 because options can be passed using the pkg-config macro.
 I suggest to remove it.

i'm ambivalent on the additional functionality that qemu provides in its build 
system -- i'm just concerned with the baseline being the same as all other 
build systems.  some people probably find this handy.
-mike



Re: [Qemu-devel] [PATCH] eventfd: making it thread safe

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 10:08:53PM +1000, Alexey Kardashevskiy wrote:
 QEMU uses IO handlers to run select() in the main loop.
 The handlers list is managed by qemu_set_fd_handler() helper
 which works fine when called from the main thread as it is
 called not when select() is waiting.

when select() is not waiting?

 
 However IO handlers list can be changed in the thread other than
 the main one doing os_host_main_loop_wait(), for example, as a result
 of a hypercall which changes PCI config space (VFIO on POWER is the case)

So the problem is only with VFIO? Can it affect vhost-net?

 and enables/disabled MSI/MSIX which closes/creates eventfd handles.

There doesn't seem to be a notification in case an fd is
deleted. It's probably not at all urgent to remove
an fd from select - why do you mention closing handles?

 If the main loop is waiting on such eventfd, it has to be restarted.

Do you really mean 'should be waiting on the newly created
eventfd'?

 The patch adds the qemu_notify_event() call to interrupt select()
 and make main_loop() to restart select()

s/and make main_loop() to restart/to make main_loop() restart/?

 with the updated IO
 handlers list.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 Reviewed-by: Paolo Bonzini pbonz...@redhat.com
 ---
  iohandler.c |1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/iohandler.c b/iohandler.c
 index 3c74de6..dea4355 100644
 --- a/iohandler.c
 +++ b/iohandler.c
 @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
          ioh->fd_write = fd_write;
          ioh->opaque = opaque;
          ioh->deleted = 0;
 +        qemu_notify_event();
      }
      return 0;
  }
 -- 
 1.7.10.4

Re: [Qemu-devel] [PATCH v2 1/6] hw/arm_boot.c: Make ram_size a uint64_t

2012-07-18 Thread Peter Crosthwaite

On Mon, Jul 16, 2012 at 11:24 PM, Peter Maydell
peter.mayd...@linaro.org wrote:
 Make the RAM size in arm_boot_info a uint64_t so it can express
 the larger RAM sizes that may be seen in LPAE systems.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Reviewed-by: Peter A. G. Crosthwaite peter.crosthwa...@petalogix.com

 ---
  hw/arm-misc.h |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/hw/arm-misc.h b/hw/arm-misc.h
 index 1f96229..bdd8fec 100644
 --- a/hw/arm-misc.h
 +++ b/hw/arm-misc.h
 @@ -25,7 +25,7 @@ qemu_irq *armv7m_init(MemoryRegion *address_space_mem,

  /* arm_boot.c */
  struct arm_boot_info {
 -int ram_size;
 +uint64_t ram_size;
  const char *kernel_filename;
  const char *kernel_cmdline;
  const char *initrd_filename;
 --
 1.7.5.4

Re: [Qemu-devel] [PATCH v2 3/6] hw/arm_boot.c: Check for RAM sizes exceeding ATAGS capacity

2012-07-18 Thread Peter Crosthwaite

On Mon, Jul 16, 2012 at 11:24 PM, Peter Maydell
peter.mayd...@linaro.org wrote:
 The legacy ATAGS format for passing information to the kernel only
 allows RAM sizes which fit in 32 bits; enforce this restriction
 rather than silently doing something weird.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Reviewed-by: Peter A. G. Crosthwaite peter.crosthwa...@petalogix.com

 ---
  hw/arm_boot.c |6 ++
  1 files changed, 6 insertions(+), 0 deletions(-)

 diff --git a/hw/arm_boot.c b/hw/arm_boot.c
 index 29ae324..af71ed6 100644
 --- a/hw/arm_boot.c
 +++ b/hw/arm_boot.c
 @@ -399,6 +399,12 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
          bootloader[5] = dtb_start;
      } else {
          bootloader[5] = info->loader_start + KERNEL_ARGS_ADDR;
 +        if (info->ram_size >= (1ULL << 32)) {
 +            fprintf(stderr, "qemu: RAM size must be less than 4GB to boot"
 +                    " Linux kernel using ATAGS (try passing a device tree"
 +                    " using -dtb)\n");
 +            exit(1);
 +        }
      }
      bootloader[6] = entry;
      for (n = 0; n < sizeof(bootloader) / 4; n++) {
 --
 1.7.5.4
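
The boundary that this check enforces can be tried in isolation. The following is a minimal sketch, not QEMU code: the function names are made up for the example, and only the comparison against 1ULL << 32 is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the RAM size can be stored in a 32-bit ATAGS size field. */
bool ram_size_fits_atags(uint64_t ram_size)
{
    return ram_size < (1ULL << 32);
}

/*
 * Narrow the size for a 32-bit field. The caller must have checked the
 * range first; narrowing 4GB or more would silently drop the upper bits,
 * which is the "something weird" the commit message refers to.
 */
uint32_t atags_mem_size(uint64_t ram_size)
{
    assert(ram_size_fits_atags(ram_size));
    return (uint32_t)ram_size;
}
```

With this split, 0xffffffff bytes is the largest size that passes, and exactly 4GB is the first one rejected.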

Re: [Qemu-devel] [PATCH v2 4/6] device_tree: Add support for reading device tree properties

2012-07-18 Thread Peter Crosthwaite

On Mon, Jul 16, 2012 at 11:24 PM, Peter Maydell
peter.mayd...@linaro.org wrote:
 Add support for reading device tree properties (both generic
 and single-cell ones) to QEMU's convenience wrapper layer.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Reviewed-by: Peter A. G. Crosthwaite peter.crosthwa...@petalogix.com

 ---
  device_tree.c |   30 ++
  device_tree.h |4 
  2 files changed, 34 insertions(+), 0 deletions(-)

 diff --git a/device_tree.c b/device_tree.c
 index b366fdd..d7a9b6b 100644
 --- a/device_tree.c
 +++ b/device_tree.c
 @@ -178,6 +178,36 @@ int qemu_devtree_setprop_string(void *fdt, const char 
 *node_path,
  return r;
  }

 +const void *qemu_devtree_getprop(void *fdt, const char *node_path,
 +                                 const char *property, int *lenp)
 +{
 +    int len;
 +    const void *r;
 +    if (!lenp) {
 +        lenp = &len;
 +    }
 +    r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
 +    if (!r) {
 +        fprintf(stderr, "%s: Couldn't get %s/%s: %s\n", __func__,
 +                node_path, property, fdt_strerror(*lenp));
 +        exit(1);
 +    }
 +    return r;
 +}
 +
 +uint32_t qemu_devtree_getprop_cell(void *fdt, const char *node_path,
 +                                   const char *property)
 +{
 +    int len;
 +    const uint32_t *p = qemu_devtree_getprop(fdt, node_path, property, &len);
 +    if (len != 4) {
 +        fprintf(stderr, "%s: %s/%s not 4 bytes long (not a cell?)\n",
 +                __func__, node_path, property);
 +        exit(1);
 +    }
 +    return be32_to_cpu(*p);
 +}
 +
  uint32_t qemu_devtree_get_phandle(void *fdt, const char *path)
  {
  uint32_t r;
 diff --git a/device_tree.h b/device_tree.h
 index 2244270..f7a3e6c 100644
 --- a/device_tree.h
 +++ b/device_tree.h
 @@ -28,6 +28,10 @@ int qemu_devtree_setprop_string(void *fdt, const char 
 *node_path,
  int qemu_devtree_setprop_phandle(void *fdt, const char *node_path,
   const char *property,
   const char *target_node_path);
 +const void *qemu_devtree_getprop(void *fdt, const char *node_path,
 + const char *property, int *lenp);
 +uint32_t qemu_devtree_getprop_cell(void *fdt, const char *node_path,
 +   const char *property);
  uint32_t qemu_devtree_get_phandle(void *fdt, const char *path);
  uint32_t qemu_devtree_alloc_phandle(void *fdt);
  int qemu_devtree_nop_node(void *fdt, const char *node_path);
 --
 1.7.5.4
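
The single-cell case in the patch comes down to two things: the property must be exactly 4 bytes long, and its bytes are stored big-endian. A minimal sketch of that decoding without libfdt follows. The function name cell_to_u32 is invented for this example, and it returns an error code where the patch prints a message and exits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one device tree cell (32 bits, big-endian) from raw property bytes.
 * Returns 0 on success, or -1 when the property is not exactly 4 bytes long.
 */
int cell_to_u32(const void *prop, int len, uint32_t *out)
{
    const uint8_t *b = prop;

    assert(prop != NULL && out != NULL);
    if (len != 4) {
        return -1;
    }
    /* Assemble byte by byte so the result is independent of host endianness. */
    *out = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    return 0;
}
```

Assembling the value from individual bytes also avoids the unaligned 32-bit load that dereferencing a cast pointer could cause on some hosts.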

Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data

2012-07-18 Thread Michael S. Tsirkin

On Thu, Jun 21, 2012 at 09:39:10PM +1000, Alexey Kardashevskiy wrote:
 Added (msi|msix)_set_message() functions.
 
 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 be written there by the guest or QEMU.
 
 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

So guests do enable MSI through config space, but do
not fill in vectors? Very strange. Are you sure it's not
just a guest bug? How does it work for other PCI devices?
Can't we just fix guest drivers to program the vectors properly?

Also pls address the comment below.

Thanks!

 ---
  hw/msi.c  |   13 +
  hw/msi.h  |1 +
  hw/msix.c |9 +
  hw/msix.h |2 ++
  4 files changed, 25 insertions(+)
 
 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..cc6102f 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const PCIDevice* dev, bool msi64bit)
      return dev->msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : PCI_MSI_PENDING_32);
  }
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +    uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
 +    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
 +
 +    if (msi64bit) {
 +        pci_set_quad(dev->config + msi_address_lo_off(dev), msg.address);
 +    } else {
 +        pci_set_long(dev->config + msi_address_lo_off(dev), msg.address);
 +    }
 +    pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +

Please add documentation. Something like

/*
 * Special API for POWER to configure the vectors through
 * a side channel. Should never be used by devices.
 */

  bool msi_enabled(const PCIDevice *dev)
  {
      return msi_present(dev) &&
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..6ec1f99 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,7 @@ struct MSIMessage {
  
  extern bool msi_supported;
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
  bool msi_enabled(const PCIDevice *dev);
  int msi_init(struct PCIDevice *dev, uint8_t offset,
   unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..5f7d6d3 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
      return msg;
  }
  
 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +    uint8_t *table_entry = dev->msix_table_page + vector * PCI_MSIX_ENTRY_SIZE;
 +
 +    pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +    pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +    table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
  /* Add MSI-X capability to the config space for the device. */
  /* Given a bar and its size, add MSI-X table on top of it
   * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..26a437e 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,8 @@
  #include "qemu-common.h"
  #include "pci.h"
  
 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
  int msix_init(PCIDevice *pdev, unsigned short nentries,
MemoryRegion *bar,
unsigned bar_nr, unsigned bar_size);
 -- 
 1.7.10
 
 ps. double '-' and git version is an end-of-patch scissor as I read 
 somewhere, cannot recall where exactly :)
 
 
 
 
 
 
 On 21/06/12 20:56, Jan Kiszka wrote:
  On 2012-06-21 12:50, Alexey Kardashevskiy wrote:
  On 21/06/12 20:38, Jan Kiszka wrote:
  On 2012-06-21 12:28, Alexey Kardashevskiy wrote:
  On 21/06/12 17:39, Jan Kiszka wrote:
  On 2012-06-21 09:18, Alexey Kardashevskiy wrote:
 
  agrhhh. sha1 of the patch changed after rebasing :)
 
 
 
  Added (msi|msix)_(set|get)_message() function for whoever might
  want to use them.
 
  Currently msi_notify()/msix_notify() write to these vectors to
  signal the guest about an interrupt so the correct values have to
  written there by the guest or QEMU.
 
  For example, POWER guest never initializes MSI/MSIX vectors, instead
  it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
  POWER we have to initialize MSI/MSIX message from QEMU.
 
  As only set* function are required by now, the get functions were 
  added
  or made public for a symmetry.
 
  Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
  ---
   hw/msi.c  |   29 +
   hw/msi.h  |2 ++
   hw/msix.c |   11 ++-
   hw/msix.h |3 +++
   4 files changed, 44 insertions(+), 1 deletion(-)
 
  diff --git a/hw/msi.c b/hw/msi.c
  index 5233204..9ad84a4 100644
  --- a/hw/msi.c
  +++ b/hw/msi.c
  @@ -105,6 +105,35 @@ static inline uint8_t msi_pending_off(const 
  PCIDevice* dev, bool
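
As a rough illustration of what msi_set_message() in the patch under review does, the sketch below writes an address/data pair into a raw MSI capability held in a byte buffer. It assumes the standard MSI capability layout (message control at offset 2, address at 4 and 8, data at 8 or 12 depending on the 64-bit flag). The MSI_* names and msi_cap_set_message() are made up for this example; QEMU's own helpers and offset macros are not used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets inside the MSI capability (assumed standard layout). */
#define MSI_FLAGS        2     /* message control register */
#define MSI_FLAGS_64BIT  0x80  /* 64-bit address capable */
#define MSI_ADDRESS_LO   4
#define MSI_ADDRESS_HI   8     /* present only with 64-bit addressing */
#define MSI_DATA_32      8     /* data register, 32-bit layout */
#define MSI_DATA_64      12    /* data register, 64-bit layout */

/* PCI config space is little-endian. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    put_le16(p, (uint16_t)(v & 0xffff));
    put_le16(p + 2, (uint16_t)(v >> 16));
}

/* cap points at the first byte of the MSI capability in config space. */
void msi_cap_set_message(uint8_t *cap, uint64_t address, uint16_t data)
{
    bool is64;

    assert(cap != NULL);
    is64 = (get_le16(cap + MSI_FLAGS) & MSI_FLAGS_64BIT) != 0;

    put_le32(cap + MSI_ADDRESS_LO, (uint32_t)address);
    if (is64) {
        put_le32(cap + MSI_ADDRESS_HI, (uint32_t)(address >> 32));
        put_le16(cap + MSI_DATA_64, data);
    } else {
        /* The upper address half does not exist; data follows directly. */
        put_le16(cap + MSI_DATA_32, data);
    }
}
```

The data register moves from offset 8 to offset 12 when the device advertises 64-bit addressing, which is why the patch passes msi64bit to msi_data_off().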

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Stefan Hajnoczi

On Wed, Jul 18, 2012 at 06:42:24AM -0400, Paolo Bonzini wrote:
  Synchronous APIs are great for writing dedicated tools like dd, cp,
  convert, etc.
  
  Asynchronous APIs are essential for integrating image file I/O into
  event-driven programs like libvirt.  Here, the ability to do other
  things while image file I/O is in progress is a requirement.  It may
  also be necessary to cancel or timeout if an operation is not making
  progress or the user decides to stop it.
  
  I think we need to provide both sync and async.  Libraries like
  libssh2 and libcurl already do this so their APIs can be used as a
  starting point for async I/O.
 
 If we want to provide an asynchronous API, the easiest thing would
 be to provide a GSource and that's it.  That would even make sense for
 QEMU itself, in fact.
 
 What I'm worried about, is how to support _both_ synchronous and
 asynchronous access.  I'd like the library to be clean of things like
 qemu_aio_wait() and qemu_aio_flush(), at least in the beginning.
 That's why I think async can come later, once we actually get
 applications needing it.  Right now, libvirt's requirements are
 simple (e.g. probing the backing file chain) and would be synchronous
 anyway.

Yes, qemu_aio_wait() and qemu_aio_flush() are ugly.

Starting with sync makes sense, it's a convenient API to have even if we
add async later.

Stefan

[Qemu-devel] [PATCH] eventfd: making it thread safe

2012-07-18 Thread Alexey Kardashevskiy

QEMU uses IO handlers to run select() in the main loop.
The handlers list is managed by qemu_set_fd_handler() helper
which works fine when called from the main thread as it is
called when select() is not waiting.

However IO handlers list can be changed in the thread other than
the main one doing os_host_main_loop_wait(), for example, as a result
of a hypercall which changes PCI config space (VFIO on POWER is the case)
and enables/disables MSI/MSIX which closes/creates eventfd handles.
As the main loop should be waiting on the newly created eventfds,
it has to be restarted.

The patch adds the qemu_notify_event() call to interrupt select()
to make main_loop() restart select() with the updated IO handlers
list.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
---
 iohandler.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/iohandler.c b/iohandler.c
index 3c74de6..dea4355 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
         ioh->fd_write = fd_write;
         ioh->opaque = opaque;
         ioh->deleted = 0;
+        qemu_notify_event();
     }
     return 0;
 }
-- 
1.7.10.4
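
The mechanism the commit message relies on, waking a loop that sleeps in select() so that it can rebuild its descriptor set, can be sketched outside QEMU with a plain pipe. This is an illustration under stated assumptions, not QEMU's implementation: struct waker, waker_notify() and loop_wait_once() are names invented for the example, and the pipe stands in for whatever qemu_notify_event() uses internally.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

enum wait_result { WAIT_TIMEOUT = 0, WAIT_RESTART = 1, WAIT_READY = 2 };

/* Hypothetical notification channel: the loop watches rfd, others write wfd. */
struct waker {
    int rfd;
    int wfd;
};

int waker_init(struct waker *w)
{
    int p[2];

    assert(w != NULL);
    if (pipe(p) < 0) {
        return -1;
    }
    w->rfd = p[0];
    w->wfd = p[1];
    return 0;
}

/* Called after the handler list changed: makes a sleeping select() return. */
void waker_notify(const struct waker *w)
{
    char c = 1;
    ssize_t n = write(w->wfd, &c, 1);

    (void)n; /* error handling omitted in this sketch */
}

/*
 * One iteration of the loop. watched_fd may be -1 when no handler is
 * registered yet. WAIT_RESTART tells the caller to rebuild its fd set
 * from the updated handler list and wait again.
 */
enum wait_result loop_wait_once(const struct waker *w, int watched_fd,
                                int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int maxfd = w->rfd;
    char buf[64];

    FD_ZERO(&rfds);
    FD_SET(w->rfd, &rfds);
    if (watched_fd >= 0) {
        FD_SET(watched_fd, &rfds);
        if (watched_fd > maxfd) {
            maxfd = watched_fd;
        }
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &rfds, NULL, NULL, &tv) <= 0) {
        return WAIT_TIMEOUT; /* errors are folded into timeout for brevity */
    }
    if (FD_ISSET(w->rfd, &rfds)) {
        ssize_t n = read(w->rfd, buf, sizeof(buf)); /* drain pending wake-ups */

        (void)n;
        return WAIT_RESTART;
    }
    return WAIT_READY;
}
```

Without the notification, a descriptor registered by another thread would only be picked up once select() returned for some unrelated reason, which is the delay the one-line patch removes.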

Re: [Qemu-devel] [PATCH v4 5/6] qapi: convert sendkey

2012-07-18 Thread Amos Kong


On 12/07/12 23:09, Luiz Capitulino wrote:




Hi Luiz,


On Thu,  5 Jul 2012 20:48:44 +0800
Amos Kong ak...@redhat.com wrote:


Convert 'sendkey' to use QAPI. do_sendkey() depends on some
variables/functions in monitor.c, so reserve qmp_sendkey()
to monitor.c

key_defs[] in console.h is the mapping of key name to keycode,
A key's index in the enum and in key_defs[] is the same.

'send-key' of QMP doesn't support key in hexadecimal format.

Signed-off-by: Amos Kong ak...@redhat.com
---
  console.h|  152 ++
  hmp-commands.hx  |2 +-
  hmp.c|   64 +++
  hmp.h|1 +
  monitor.c|  239 ++
  qapi-schema.json |   46 +++
  qmp-commands.hx  |   28 +++
  7 files changed, 317 insertions(+), 215 deletions(-)

diff --git a/console.h b/console.h
index 4334db5..e1b0c45 100644
--- a/console.h
+++ b/console.h
@@ -6,6 +6,7 @@
  #include "notify.h"
  #include "monitor.h"
  #include "trace.h"
+#include "qapi-types.h"

  /* keyboard/mouse support */

@@ -397,4 +398,155 @@ static inline int vnc_display_pw_expire(DisplayState *ds, 
time_t expires)
  /* curses.c */
  void curses_display_init(DisplayState *ds, int full_screen);

+typedef struct {
+int keycode;
+const char *name;


I don't think we need 'name', as key names are also provided by 
KeyCodes_lookup[].
See more on this below.



Yes, I tried defining key_defs[] as an int array and getting the key name from
KeyCodes_lookup[],

and it works.

const int key_defs[] = {
[KEY_CODES_SHIFT] = 0x2a,




+} KeyDef;
+
+static const KeyDef key_defs[] = {


We can't have an array defined in a header file because it will be defined in
each .c file that includes it.

Please, define it in input.c (along with qmp_send_key())


Ok.


and write the following public functions:

  o KeyCode keycode_from_key(const char *key);
  o KeyCode keycode_from_code(int code);



void qmp_send_key(KeyCodesList *keys, bool has_hold_time, int64_t 
hold_time, ...)

^
\_ when we use qmp, a key list will be passed, the 
values are the index

   in enum KeyCodes. not the real KeyCode.

{ 'enum': 'KeyCodes',
  'data': [ 'shift', 'shift_r', 'al...

So we need to get this kind of 'index' in hmp_send_key() and pass to 
qmp_send_key().

then convert this 'index' to keycode in qmp_send_key()

I didn't find a way to define a non-serial enum.

eg: (then int qmp_marshal_input_send_key() would pass real keycode to 
qmp_send_key())

{ 'enum': 'KeyCodes',
  'data': [ 'shift' = 0x2a, 'shift_r' = 0x36, 'alt' = 0x38, ...


If we still pass 'index' to qmp_send_key as patch V4.

extern int index_from_key(const char *key);   - it's used in hmp_send_key()
extern int index_from_keycode(int code);  - it's used in hmp_send_key()
extern char *key_from_keycode(int idx);   - it's used in 
monitor_find_completion()

extern int keycode_from_key(const char *key); - it's used in qmp_send_key()



and then use these functions where using key_defs would be necessary. Also,
note that keycode_from_key() can use KeyCodes_lookup[] instead of key_defs (this
way we can drop 'name' from KeyDef).





+#endif
+#endif
+[KEY_CODES_MAX] = { 0, NULL },
+};
+
  #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index e336251..865eea9 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -505,7 +505,7 @@ ETEXI
  .args_type  = "keys:s,hold-time:i?",
  .params = "keys [hold_ms]",
  .help   = "send keys to the VM (e.g. 'sendkey ctrl-alt-f1', default hold time=100 ms)",
-.mhandler.cmd = do_sendkey,
+.mhandler.cmd = hmp_send_key,
  },

  STEXI
diff --git a/hmp.c b/hmp.c
index b9cec1d..cfdc106 100644
--- a/hmp.c
+++ b/hmp.c
@@ -19,6 +19,7 @@
  #include "qemu-timer.h"
  #include "qmp-commands.h"
  #include "monitor.h"
+#include "console.h"

  static void hmp_handle_error(Monitor *mon, Error **errp)
  {
@@ -1000,3 +1001,66 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict)
      qmp_netdev_del(id, &err);
      hmp_handle_error(mon, &err);
  }
+
+static int get_key_index(const char *key)
+{
+int i, keycode;
+char *endp;
+
+    for (i = 0; i < KEY_CODES_MAX; i++)
+        if (key_defs[i].keycode && !strcmp(key, key_defs[i].name))
+            return i;


Here you can do:

   keycode = keycode_from_key(key);
   if (keycode != KEY_CODES_MAX) {
return keycode;
   }


+
+    if (strstart(key, "0x", NULL)) {
+        keycode = strtoul(key, &endp, 0);
+        if (*endp == '\0' && keycode >= 0x01 && keycode <= 0xff)
+            for (i = 0; i < KEY_CODES_MAX; i++)
+                if (keycode == key_defs[i].keycode)
+                    return i;


You can drop that for loop and do instead:

   keycode = keycode_from_code(keycode);



+}
+
+return -1;
+}
+
+void hmp_send_key(Monitor *mon, const QDict *qdict)
+{
+const char *keys = qdict_get_str(qdict,
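
The arrangement discussed in this review, an enum whose values index both a name table and a scancode table, can be sketched in a few lines. This is an assumption-laden miniature, not the patch: the enum is cut down to three keys, and Key, key_names[], key_scancodes[] and the three helpers are hypothetical stand-ins for the QAPI-generated KeyCodes, KeyCodes_lookup[] and key_defs[]. Only the scancodes 0x2a, 0x36 and 0x38 come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened stand-in for the generated enum; KEY_MAX doubles as "not found". */
typedef enum { KEY_SHIFT, KEY_SHIFT_R, KEY_ALT, KEY_MAX } Key;

/* Names indexed by enum value. */
static const char *const key_names[KEY_MAX] = {
    [KEY_SHIFT]   = "shift",
    [KEY_SHIFT_R] = "shift_r",
    [KEY_ALT]     = "alt",
};

/* Scancodes indexed by the same enum value, so no name field is needed. */
static const int key_scancodes[KEY_MAX] = {
    [KEY_SHIFT]   = 0x2a,
    [KEY_SHIFT_R] = 0x36,
    [KEY_ALT]     = 0x38,
};

/* Key name -> enum index; returns KEY_MAX for an unknown name. */
Key key_from_name(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < KEY_MAX; i++) {
        if (strcmp(name, key_names[i]) == 0) {
            return (Key)i;
        }
    }
    return KEY_MAX;
}

/* Raw scancode -> enum index; returns KEY_MAX for an unknown code. */
Key key_from_scancode(int code)
{
    int i;

    for (i = 0; i < KEY_MAX; i++) {
        if (key_scancodes[i] == code) {
            return (Key)i;
        }
    }
    return KEY_MAX;
}

/* Enum index -> scancode, the conversion done just before injecting the key. */
int scancode_from_key(Key k)
{
    assert((int)k >= 0 && k < KEY_MAX);
    return key_scancodes[k];
}
```

Keeping the tables in one .c file and exposing only the lookup functions also addresses the objection to defining an array in a header.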

Re: [Qemu-devel] [PATCH] eventfd: making it thread safe

2012-07-18 Thread Alexey Kardashevskiy

On 18/07/12 22:22, Michael S. Tsirkin wrote:
 On Wed, Jul 18, 2012 at 10:08:53PM +1000, Alexey Kardashevskiy wrote:
 QEMU uses IO handlers to run select() in the main loop.
 The handlers list is managed by qemu_set_fd_handler() helper
 which works fine when called from the main thread as it is
 called not when select() is waiting.
 
 when select() is not waiting?
 

 However IO handlers list can be changed in the thread other than
 the main one doing os_host_main_loop_wait(), for example, as a result
 of a hypercall which changes PCI config space (VFIO on POWER is the case)
 
 So the problem is only with VFIO? Can it affect vhost-net?

Honestly I have no idea about vhost-net as I never tried it.


 and enables/disabled MSI/MSIX which closes/creates eventfd handles.
 
 There doesn't seem to be a notification in case an fd is
 deleted. It's probably not at all urgent to remove
 an fd from select - why do you mention closing handles?

Agrhh. I missed this comment in the patch I just reposted.
I mentioned it because the file* stays open even when it is no longer needed.
It has no effect for eventfd but may have for somebody else so we probably
want to add a notification on deletion. Dunno.


 If the main loop is waiting on such eventfd, it has to be restarted.
 
 Do you really mean 'should be waiting on the newly created
 eventfd'?
 
 The patch adds the qemu_notify_event() call to interrupt select()
 and make main_loop() to restart select()
 
 s/and make main_loop() to restart/to make main_loop() restart/?

Thanks for the comments. David used to polish my English but he is on vacation
now :)


 with the updated IO
 handlers list.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 Reviewed-by: Paolo Bonzini pbonz...@redhat.com
 ---
  iohandler.c |1 +
  1 file changed, 1 insertion(+)

 diff --git a/iohandler.c b/iohandler.c
 index 3c74de6..dea4355 100644
 --- a/iohandler.c
 +++ b/iohandler.c
 @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
          ioh->fd_write = fd_write;
          ioh->opaque = opaque;
          ioh->deleted = 0;
 +        qemu_notify_event();
      }
      return 0;
  }
 -- 
 1.7.10.4


-- 
Alexey

Re: [Qemu-devel] [PATCH v2] qerror: Add QERR_PROPERTY_SET_AFTER_REALIZE

2012-07-18 Thread Luiz Capitulino

On Wed, 18 Jul 2012 13:59:06 +0200
Markus Armbruster arm...@redhat.com wrote:

 Peter Maydell peter.mayd...@linaro.org writes:
 
  On 18 July 2012 12:19, Markus Armbruster arm...@redhat.com wrote:
  Peter Maydell peter.mayd...@linaro.org writes:
 
  On 18 July 2012 11:20, Andreas Färber afaer...@suse.de wrote:
  Am 16.07.2012 17:25, schrieb Peter Maydell:
  Add a new QError QERR_PROPERTY_SET_AFTER_REALIZE for attempts
  to set a QOM or qdev property after the object/device has been
  realized. This allows a slightly more informative diagnostic
  than the previous permission denied message.
 
  Signed-off-by: Peter Maydell peter.mayd...@linaro.org
  ---
  Changes since the v1 (which was sent way back in March...):
   * rebased on master now a pile of qdev/qom changes have landed
   * fixed some overlong lines
   * avoid gcc '?:' extension
   * a couple of set_ functions in qdev-properties.c are new since v1
 and needed their QERR_PERMISSION_DENIED checks changing
 
  This does not yet seem to take into account the discussion with libvirt
  and Anthony on what parameters to pass. The ID generalization was
  nack'ed by Anthony and a QOM path was suggested as alternative. Could
  you please look into that?
 
  I'm afraid I'm not really sure what you're referring to here --
  do you have a link to a discussion?
 
  All I want is for errors printed to the user to be a bit more
  helpful; the whole qerror infrastructure seems to make it
  somewhere between difficult and impossible to do that :-(
 
  Yup.  One of the reasons why I detest it.
 
  A recent thread on how to recover from this disaster:
  http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg03469.html
 
  That's interesting but I'm not sure how it's relevant. We already
  have QERR_PROPERTY values just like this new one, so I don't see why
  this is any worse than the ones we have. If we come up with some
  new scheme we can convert this with all the rest. And I don't
  really want to block improving this error message on getting
  agreement for some big redesign effort...
 
 I'm not objecting to your patch (I didn't even review it), just pointing
 out there's a glimmer of hope on the "emitting error messages fit for
 humans is somewhere between difficult and impossible" front.

Yeah, I plan to fix that soon.
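For readers who have not followed the patch under discussion, the behaviour it is after can be sketched roughly as below. This is only an illustration with made-up names (`ToyDevice`, `PropError`, `toy_set_frequency()`, `prop_error_message()`); it does not use the qerror or QOM APIs. The idea is simply that a setter called after realize reports a dedicated error instead of a generic "permission denied".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical error codes, standing in for distinct QError classes. */
typedef enum {
    PROP_OK = 0,
    PROP_ERR_PERMISSION_DENIED,
    PROP_ERR_SET_AFTER_REALIZE,
} PropError;

/* Hypothetical device with one settable property. */
typedef struct {
    bool realized;
    uint32_t frequency;
} ToyDevice;

/* Refuse the write once the device is realized and say why. */
static PropError toy_set_frequency(ToyDevice *dev, uint32_t value)
{
    if (dev->realized) {
        return PROP_ERR_SET_AFTER_REALIZE;
    }
    dev->frequency = value;
    return PROP_OK;
}

static const char *prop_error_message(PropError err)
{
    switch (err) {
    case PROP_OK:
        return "ok";
    case PROP_ERR_PERMISSION_DENIED:
        return "permission denied";
    case PROP_ERR_SET_AFTER_REALIZE:
        return "property cannot be set after the device is realized";
    }
    return "unknown error";
}
```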

Re: [Qemu-devel] [PATCH] cpu-defs.h: pull in qemu-common.h for HOST_LONG_BITS

2012-07-18 Thread Andreas Färber

Am 18.07.2012 14:13, schrieb Mike Frysinger:
 On Monday 16 July 2012 01:26:50 Stefan Weil wrote:
 Am 15.07.2012 23:54, schrieb Mike Frysinger:
 On Sunday 15 July 2012 15:34:33 Stefan Weil wrote:
 Am 15.07.2012 22:25, schrieb Mike Frysinger:
 This file uses the define HOST_LONG_BITS, but doesn't
 explicitly include qemu-common.h for it leading to build
 warnings for some setups: In file included from
 qemu/target-bfin/cpu.h:17,
 
 from qemu/cputlb.c:21: qemu/cpu-defs.h:83:5: warning:
 HOST_LONG_BITS is not defined
 
 Signed-off-by: Mike Frysinger vap...@gentoo.org
 ---
  cpu-defs.h |    1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/cpu-defs.h b/cpu-defs.h
 index f49e950..0d6018d 100644
 --- a/cpu-defs.h
 +++ b/cpu-defs.h
 @@ -28,6 +28,7 @@
  #include <inttypes.h>
  #include <signal.h>
  #include "osdep.h"
 +#include "qemu-common.h"
  #include "qemu-queue.h"
  #include "targphys.h"
 
 No. Of course this works, but I don't think that it is
 reasonable to include qemu-common.h in every *.h file. There
 are already too many of them.
 
 target-bfin/cpu.h should start like all other cpu.h files
 with
 
 these include statements:
 sorry, but that's fragile junk.  if a header file uses defines
 from another header file, it should be including it. -mike
 
 There are different ways how things can be done.
 
 Normally, I agree with you that each header file should be
 complete, but that's not the QEMU style.
 
 In your special case, it's more important to keep all */cpu.h
 similar. qemu/target-bfin/cpu.h is still not part of the official
 QEMU code, so it can be fixed before it is committed.
 
 a lot of existing files in the top level pull in qemu-common.h.  i
 don't think this is a special case considering it's the first
 failure i've seen since i started the Blackfin port over a year
 ago.

Long-term the cpu-defs.h header should go away, so moving stuff into
it and dropping it elsewhere seems not the best of ideas.

qemu-common.h includes cpu.h under some circumstances, which will
include cpu-defs.h, so a circular dependency, not a good idea either.

CPU is pretty tricky terrain. I've been working on improving it but
progress is slow because a solution for one thing tends to conflict
with someone else's work...

Andreas
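As background on the warning quoted above, here is a small sketch of how a width macro like HOST_LONG_BITS can be derived and then tested in an `#if`. The derivation from `UINTPTR_MAX` is an assumption for illustration, not a copy of what qemu-common.h does, and `host_word` and `host_word_bits()` are hypothetical names. If a header evaluates `#if HOST_LONG_BITS == 32` before any such definition is visible, the preprocessor silently treats the macro as 0 (and warns only under -Wundef), which is the "is not defined" message in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed derivation: pick the width from the pointer-sized integer type. */
#if UINTPTR_MAX == UINT32_MAX
# define HOST_LONG_BITS 32
#elif UINTPTR_MAX == UINT64_MAX
# define HOST_LONG_BITS 64
#else
# error Unknown pointer size
#endif

/* A user of the macro; this must come after the definition above. */
#if HOST_LONG_BITS == 32
typedef uint32_t host_word;
#else
typedef uint64_t host_word;
#endif

static unsigned host_word_bits(void)
{
    return (unsigned)(sizeof(host_word) * 8);
}
```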

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data

2012-07-18 Thread Alexey Kardashevskiy

On 18/07/12 22:43, Michael S. Tsirkin wrote:
 On Thu, Jun 21, 2012 at 09:39:10PM +1000, Alexey Kardashevskiy wrote:
 Added (msi|msix)_set_message() functions.

 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 be written there by the guest or QEMU.

 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 
 So guests do enable MSI through config space, but do
 not fill in vectors? 

Yes. msix_capability_init() calls arch_setup_msi_irqs() which does everything 
it needs to do (i.e. calls hypervisor) before msix_capability_init() writes 
PCI_MSIX_FLAGS_ENABLE to the PCI_MSIX_FLAGS register.

These vectors are PCI bus addresses, and the way they are set is specific to the
PCI host controller, so I do not see why the current scheme is a bug.
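To make the msi_set_message() hunk quoted further down easier to follow, here is a standalone sketch of the same idea against a plain byte array instead of a PCIDevice. The register offsets are the ones from the Linux pci_regs.h header; `Msg`, `put_le()`, `get_le()` and `sketch_msi_set_message()` are hypothetical names for this example only, and the little-endian helpers stand in for pci_set_long()/pci_set_quad()/pci_set_word().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets inside the MSI capability, as in Linux pci_regs.h. */
#define PCI_MSI_FLAGS        2
#define PCI_MSI_FLAGS_64BIT  0x80
#define PCI_MSI_ADDRESS_LO   4
#define PCI_MSI_DATA_32      8
#define PCI_MSI_DATA_64      12

/* Stand-in for MSIMessage. */
typedef struct {
    uint64_t address;
    uint32_t data;
} Msg;

/* Store the low nbytes of v in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Load nbytes in little-endian order. */
static uint64_t get_le(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;

    for (int i = 0; i < nbytes; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* cap points at the start of the MSI capability in config space. */
static void sketch_msi_set_message(uint8_t *cap, Msg msg)
{
    bool msi64bit = get_le(cap + PCI_MSI_FLAGS, 2) & PCI_MSI_FLAGS_64BIT;

    if (msi64bit) {
        /* 64-bit capable: address is 8 bytes, data follows at offset 12. */
        put_le(cap + PCI_MSI_ADDRESS_LO, msg.address, 8);
        put_le(cap + PCI_MSI_DATA_64, msg.data, 2);
    } else {
        /* 32-bit only: address is 4 bytes, data follows at offset 8. */
        put_le(cap + PCI_MSI_ADDRESS_LO, msg.address, 4);
        put_le(cap + PCI_MSI_DATA_32, msg.data, 2);
    }
}
```

The point of the 64-bit check is that the data register moves depending on whether the upper address dword is present, so the writer has to consult the flags first.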


 Very strange. Are you sure it's not
 just a guest bug? How does it work for other PCI devices?

I did not get the question. It works the same way for every PCI device under a
POWER guest.


 Can't we just fix guest drivers to program the vectors properly?
 
 Also pls address the comment below.

Comment below.

 Thanks!
 
 ---
  hw/msi.c  |   13 +
  hw/msi.h  |1 +
  hw/msix.c |9 +
  hw/msix.h |2 ++
  4 files changed, 25 insertions(+)

 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..cc6102f 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const PCIDevice* dev, bool msi64bit)
      return dev->msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : PCI_MSI_PENDING_32);
  }
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +    uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
 +    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
 +
 +    if (msi64bit) {
 +        pci_set_quad(dev->config + msi_address_lo_off(dev), msg.address);
 +    } else {
 +        pci_set_long(dev->config + msi_address_lo_off(dev), msg.address);
 +    }
 +    pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +
 
 Please add documentation. Something like
 
 /*
  * Special API for POWER to configure the vectors through
  * a side channel. Should never be used by devices.
  */


I believe it is useful for any para-virtualized environment, isn't it?
For s390 as well, provided it supports PCI, which I am not sure it does :)



  bool msi_enabled(const PCIDevice *dev)
  {
      return msi_present(dev) &&
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..6ec1f99 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,7 @@ struct MSIMessage {
  
  extern bool msi_supported;
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
  bool msi_enabled(const PCIDevice *dev);
  int msi_init(struct PCIDevice *dev, uint8_t offset,
   unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..5f7d6d3 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
      return msg;
  }
  
 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +    uint8_t *table_entry = dev->msix_table_page + vector * PCI_MSIX_ENTRY_SIZE;
 +
 +    pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +    pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +    table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
  /* Add MSI-X capability to the config space for the device. */
  /* Given a bar and its size, add MSI-X table on top of it
   * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..26a437e 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,8 @@
  #include "qemu-common.h"
  #include "pci.h"
  
 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
  int msix_init(PCIDevice *pdev, unsigned short nentries,
MemoryRegion *bar,
unsigned bar_nr, unsigned bar_size);
 -- 
 1.7.10

 ps. double '-' and git version is an end-of-patch scissor as I read 
 somewhere, cannot recall where exactly :)






 On 21/06/12 20:56, Jan Kiszka wrote:
 On 2012-06-21 12:50, Alexey Kardashevskiy wrote:
 On 21/06/12 20:38, Jan Kiszka wrote:
 On 2012-06-21 12:28, Alexey Kardashevskiy wrote:
 On 21/06/12 17:39, Jan Kiszka wrote:
 On 2012-06-21 09:18, Alexey Kardashevskiy wrote:

 agrhhh. sha1 of the patch changed after rebasing :)



 Added (msi|msix)_(set|get)_message() function for whoever might
 want to use them.

 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 be written there by the guest or QEMU.

 For example, POWER guest never initializes MSI/MSIX vectors, instead

[Qemu-devel] [PATCH 1/2] qemu-nbd: reorganize help message

2012-07-18 Thread Paolo Bonzini

This patch separates qemu-nbd's options in logical groups, thus making
the help message easier to read.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |   33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 5a0300e..1c32290 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -46,28 +46,39 @@ static int nb_fds;
 
 static void usage(const char *name)
 {
-    printf(
+    (printf) (
 "Usage: %s [OPTIONS] FILE\n"
 "QEMU Disk Network Block Device Server\n"
 "\n"
+"  -h, --help           display this help and exit\n"
+"  -V, --version        output version information and exit\n"
+"\n"
+"Connection properties:\n"
 "  -p, --port=PORT      port to listen on (default `%d')\n"
-"  -o, --offset=OFFSET  offset into the image\n"
 "  -b, --bind=IFACE     interface to bind to (default `0.0.0.0')\n"
 "  -k, --socket=PATH    path to the unix socket\n"
 "                       (default '"SOCKET_PATH"')\n"
-"  -r, --read-only      export read-only\n"
-"  -P, --partition=NUM  only expose partition NUM\n"
-"  -s, --snapshot       use snapshot file\n"
-"  -n, --nocache        disable host cache\n"
-"  -c, --connect=DEV    connect FILE to the local NBD device DEV\n"
-"  -d, --disconnect     disconnect the specified device\n"
 "  -e, --shared=NUM     device can be shared by NUM clients (default '1')\n"
 "  -t, --persistent     don't exit on the last connection\n"
 "  -v, --verbose        display extra debugging information\n"
-"  -h, --help           display this help and exit\n"
-"  -V, --version        output version information and exit\n"
 "\n"
-"Report bugs to <anth...@codemonkey.ws>\n"
+"Exposing part of the image:\n"
+"  -o, --offset=OFFSET  offset into the image\n"
+"  -P, --partition=NUM  only expose partition NUM\n"
+"\n"
+#ifdef __linux__
+"Kernel NBD client support:\n"
+"  -c, --connect=DEV    connect FILE to the local NBD device DEV\n"
+"  -d, --disconnect     disconnect the specified device\n"
+"\n"
+#endif
+"\n"
+"Block device options:\n"
+"  -r, --read-only      export read-only\n"
+"  -s, --snapshot       use snapshot file\n"
+"  -n, --nocache        disable host cache\n"
+"\n"
+"Report bugs to <qemu-devel@nongnu.org>\n"
     , name, NBD_DEFAULT_PORT, "DEVICE");
 }
 
-- 
1.7.10.4

[Qemu-devel] [PATCH 0/2] qemu-nbd: add --cache and --aio options

2012-07-18 Thread Paolo Bonzini

Two simple patches that let qemu-nbd use native AIO and --cache=unsafe
mode.

Paolo Bonzini (2):
  qemu-nbd: reorganize help message
  qemu-nbd: add --cache and --aio options

 qemu-nbd.c |   75 +---
 1 file changed, 62 insertions(+), 13 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCH 2/2] qemu-nbd: add --cache and --aio options

2012-07-18 Thread Paolo Bonzini

Add two options to tune the I/O implementation of qemu-nbd, matching
the possibilities given by the QEMU -drive option.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |   42 --
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 1c32290..1c1cf6a 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -33,7 +33,9 @@
 #include <libgen.h>
 #include <pthread.h>
 
-#define SOCKET_PATH    "/var/lock/qemu-nbd-%s"
+#define SOCKET_PATH         "/var/lock/qemu-nbd-%s"
+#define QEMU_NBD_OPT_CACHE  1
+#define QEMU_NBD_OPT_AIO    2
 
 static NBDExport *exp;
 static int verbose;
@@ -77,6 +79,10 @@ static void usage(const char *name)
 "  -r, --read-only      export read-only\n"
 "  -s, --snapshot       use snapshot file\n"
 "  -n, --nocache        disable host cache\n"
+"      --cache=MODE     set cache mode (none, writeback, ...)\n"
+#ifdef CONFIG_LINUX_AIO
+"      --aio=MODE       set AIO mode (native or threads)\n"
+#endif
 "\n"
 "Report bugs to <qemu-devel@nongnu.org>\n"
     , name, NBD_DEFAULT_PORT, "DEVICE");
@@ -306,6 +312,10 @@ int main(int argc, char **argv)
         { "disconnect", 0, NULL, 'd' },
         { "snapshot", 0, NULL, 's' },
         { "nocache", 0, NULL, 'n' },
+        { "cache", 1, NULL, QEMU_NBD_OPT_CACHE },
+#ifdef CONFIG_LINUX_AIO
+        { "aio", 1, NULL, QEMU_NBD_OPT_AIO },
+#endif
         { "shared", 1, NULL, 'e' },
         { "persistent", 0, NULL, 't' },
         { "verbose", 0, NULL, 'v' },
@@ -320,6 +330,10 @@ int main(int argc, char **argv)
 int ret;
 int fd;
 int persistent = 0;
+bool seen_cache = false;
+#ifdef CONFIG_LINUX_AIO
+bool seen_aio = false;
+#endif
 pthread_t client_thread;
 
 /* The client thread uses SIGTERM to interrupt the server.  A signal
@@ -336,8 +350,32 @@ int main(int argc, char **argv)
             flags |= BDRV_O_SNAPSHOT;
             break;
         case 'n':
-            flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
+            optarg = (char *) "none";
+            /* fallthrough */
+        case QEMU_NBD_OPT_CACHE:
+            if (seen_cache) {
+                errx(EXIT_FAILURE, "-n and --cache can only be specified once");
+            }
+            seen_cache = true;
+            if (bdrv_parse_cache_flags(optarg, &flags) == -1) {
+                errx(EXIT_FAILURE, "Invalid cache mode `%s'", optarg);
+            }
             break;
+#ifdef CONFIG_LINUX_AIO
+        case QEMU_NBD_OPT_AIO:
+            if (seen_aio) {
+                errx(EXIT_FAILURE, "--aio can only be specified once");
+            }
+            seen_aio = true;
+            if (!strcmp(optarg, "native")) {
+                flags |= BDRV_O_NATIVE_AIO;
+            } else if (!strcmp(optarg, "threads")) {
+                /* this is the default */
+            } else {
+                errx(EXIT_FAILURE, "invalid aio mode `%s'", optarg);
+            }
+            break;
+#endif
         case 'b':
             bindto = optarg;
             break;
-- 
1.7.10.4
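The option handling in this patch can be tried out in isolation with a sketch like the one below. It assumes a getopt_long() implementation (glibc or similar); the `Opts` struct and `parse_opts()` are hypothetical names for this example, and errors are returned as -1 rather than reported through errx(). It shows the two points the patch relies on: long-only options get return values that cannot collide with short option letters, and -n is folded into --cache=none by falling through to the same case.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Same values as in the patch; any code that is not a printable
 * short-option letter would do. */
#define QEMU_NBD_OPT_CACHE 1
#define QEMU_NBD_OPT_AIO   2

/* Hypothetical result holder for this sketch. */
typedef struct {
    const char *cache;
    const char *aio;
} Opts;

/* Returns 0 on success, -1 on a duplicate or invalid option. */
static int parse_opts(int argc, char **argv, Opts *o)
{
    static const struct option lopt[] = {
        { "nocache", no_argument, NULL, 'n' },
        { "cache", required_argument, NULL, QEMU_NBD_OPT_CACHE },
        { "aio", required_argument, NULL, QEMU_NBD_OPT_AIO },
        { NULL, 0, NULL, 0 }
    };
    int seen_cache = 0, seen_aio = 0;
    int ch;

    o->cache = NULL;
    o->aio = NULL;
    optind = 1;   /* restart scanning so the function can be called again */
    opterr = 0;   /* keep getopt quiet; the caller handles errors */

    while ((ch = getopt_long(argc, argv, "n", lopt, NULL)) != -1) {
        switch (ch) {
        case 'n':
            optarg = (char *)"none";
            /* fall through */
        case QEMU_NBD_OPT_CACHE:
            if (seen_cache) {
                return -1;   /* -n and --cache may only be given once */
            }
            seen_cache = 1;
            o->cache = optarg;
            break;
        case QEMU_NBD_OPT_AIO:
            if (seen_aio) {
                return -1;
            }
            seen_aio = 1;
            if (strcmp(optarg, "native") && strcmp(optarg, "threads")) {
                return -1;   /* unknown AIO mode */
            }
            o->aio = optarg;
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

Calling it with `--cache=none --aio=native` fills both fields, while `-n --cache=writeback` is rejected as a duplicate cache setting.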

[Qemu-devel] [PATCH v4 3/5] convert pci-host to QOM

2012-07-18 Thread Wanpeng Li

[CCing ML]

From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
---
 hw/pci_host.c |   26 ++
 hw/pci_host.h |5 +
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/hw/pci_host.c b/hw/pci_host.c
index 8041778..095bfe3 100644
--- a/hw/pci_host.c
+++ b/hw/pci_host.c
@@ -165,4 +165,30 @@ const MemoryRegionOps pci_host_data_be_ops = {
 .endianness = DEVICE_BIG_ENDIAN,
 };
 
+void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value)
+{
+    object_property_set_link(OBJECT(s), OBJECT(value), "mmio", NULL);
+}
+
+static void pci_host_initfn(Object *obj)
+{
+    PCIHostState *s = PCI_HOST(obj);
+
+    object_property_add_link(obj, "mmio", TYPE_MEMORY_REGION,
+                             (Object **)&s->address_space, NULL);
+}
+
+static TypeInfo pci_host_type_info = {
+    .name = TYPE_PCI_HOST,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(PCIHostState),
+    .instance_init = pci_host_initfn,
+};
+
+static void register_devices(void)
+{
+    type_register_static(&pci_host_type_info);
+}
+
+type_init(register_devices)
 
diff --git a/hw/pci_host.h b/hw/pci_host.h
index 359e38f..084e15c 100644
--- a/hw/pci_host.h
+++ b/hw/pci_host.h
@@ -30,6 +30,9 @@
 
 #include "sysbus.h"
 
+#define TYPE_PCI_HOST "pci-host"
+#define PCI_HOST(obj) OBJECT_CHECK(PCIHostState, (obj), TYPE_PCI_HOST)
+
 struct PCIHostState {
 SysBusDevice busdev;
 MemoryRegion conf_mem;
@@ -49,6 +52,8 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
uint32_t addr,
 void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len);
 uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len);
 
+void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value);
+
 extern const MemoryRegionOps pci_host_conf_le_ops;
 extern const MemoryRegionOps pci_host_conf_be_ops;
 extern const MemoryRegionOps pci_host_data_le_ops;
-- 
1.7.5.4

[Qemu-devel] [PATCH v4 2/5] convert MemoryRegion to QOM

2012-07-18 Thread Wanpeng Li

[CCing ML]

From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
---
 memory.c |   94 ++
 memory.h |8 +
 2 files changed, 78 insertions(+), 24 deletions(-)

diff --git a/memory.c b/memory.c
index aab4a31..3674535 100644
--- a/memory.c
+++ b/memory.c
@@ -797,35 +797,26 @@ static bool memory_region_wrong_endianness(MemoryRegion *mr)
 #endif
 }
 
-void memory_region_init(MemoryRegion *mr,
-                        const char *name,
-                        uint64_t size)
+void memory_region_set_name(MemoryRegion *mr, const char *name)
+{
+    mr->name = g_strdup(name);
+}
+
+void memory_region_set_size(MemoryRegion *mr, uint64_t size)
 {
-    mr->ops = NULL;
-    mr->parent = NULL;
     mr->size = int128_make64(size);
     if (size == UINT64_MAX) {
         mr->size = int128_2_64();
     }
-    mr->addr = 0;
-    mr->subpage = false;
-    mr->enabled = true;
-    mr->terminates = false;
-    mr->ram = false;
-    mr->readable = true;
-    mr->readonly = false;
-    mr->rom_device = false;
-    mr->destructor = memory_region_destructor_none;
-    mr->priority = 0;
-    mr->may_overlap = false;
-    mr->alias = NULL;
-    QTAILQ_INIT(&mr->subregions);
-    memset(&mr->subregions_link, 0, sizeof mr->subregions_link);
-    QTAILQ_INIT(&mr->coalesced);
-    mr->name = g_strdup(name);
-    mr->dirty_log_mask = 0;
-    mr->ioeventfd_nb = 0;
-    mr->ioeventfds = NULL;
+}
+
+void memory_region_init(MemoryRegion *mr,
+                        const char *name,
+                        uint64_t size)
+{
+    object_initialize(mr, TYPE_MEMORY_REGION);
+    memory_region_set_name(mr, name);
+    memory_region_set_size(mr, size);
 }
 
 static bool memory_region_access_valid(MemoryRegion *mr,
@@ -1645,3 +1636,58 @@ void mtree_info(fprintf_function mon_printf, void *f)
 g_free(ml);
 }
 }
+
+static void memory_region_initfn(Object *obj)
+{
+    MemoryRegion *mr = MEMORY_REGION(obj);
+    mr->ops = NULL;
+    mr->parent = NULL;
+    mr->size = int128_2_64();
+    mr->addr = 0;
+    mr->subpage = false;
+    mr->enabled = true;
+    mr->terminates = false;
+    mr->ram = false;
+    mr->readable = true;
+    mr->readonly = false;
+    mr->rom_device = false;
+    mr->destructor = memory_region_destructor_none;
+    mr->priority = 0;
+    mr->may_overlap = false;
+    mr->alias = NULL;
+    mr->name = NULL;
+    QTAILQ_INIT(&mr->subregions);
+    memset(&mr->subregions_link, 0, sizeof mr->subregions_link);
+    QTAILQ_INIT(&mr->coalesced);
+    mr->dirty_log_mask = 0;
+    mr->ioeventfd_nb = 0;
+    mr->ioeventfds = NULL;
+}
+
+static void memory_region_finalize(Object *obj)
+{
+    MemoryRegion *mr = MEMORY_REGION(obj);
+
+    assert(QTAILQ_EMPTY(&mr->subregions));
+    mr->destructor(mr);
+    memory_region_clear_coalescing(mr);
+    if (mr->name) {
+        g_free((char *)mr->name);
+    }
+    g_free(mr->ioeventfds);
+}
+
+static TypeInfo memory_region_type = {
+    .name = TYPE_MEMORY_REGION,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(MemoryRegion),
+    .instance_init = memory_region_initfn,
+    .instance_finalize = memory_region_finalize,
+};
+
+static void register_devices(void)
+{
+    type_register_static(&memory_region_type);
+}
+
+type_init(register_devices)
diff --git a/memory.h b/memory.h
index 740c48e..90a53f7 100644
--- a/memory.h
+++ b/memory.h
@@ -25,6 +25,7 @@
 #include "iorange.h"
 #include "ioport.h"
 #include "int128.h"
+#include "qemu/object.h"
 
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegion MemoryRegion;
@@ -116,6 +117,9 @@ struct MemoryRegionOps {
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
+#define TYPE_MEMORY_REGION "memory-region"
+#define MEMORY_REGION(obj) OBJECT_CHECK(MemoryRegion, (obj), TYPE_MEMORY_REGION)
+
 struct MemoryRegion {
 /* All fields are private - violators will be prosecuted */
 const MemoryRegionOps *ops;
@@ -748,6 +752,10 @@ void memory_global_dirty_log_stop(void);
 
 void mtree_info(fprintf_function mon_printf, void *f);
 
+void memory_region_set_name(MemoryRegion *mr, const char *name);
+
+void memory_region_set_size(MemoryRegion *mr, uint64_t size);
+
 #endif
 
 #endif
-- 
1.7.5.4

[Qemu-devel] [PATCH v4 4/5] prepare to create HPET, RTC and i8254 through composition

2012-07-18 Thread Wanpeng Li

[CCing ML]
 
 From: Anthony Liguori aligu...@us.ibm.com

The HPET usually sits on the LPC bus (which replaces ISA in modern systems).
It's sometimes a dedicated chip but can certainly co-exist in a Super IO chip.
I think in terms of where it would live in this hypothetical device model,
putting it in the PIIX is rational.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
---
 hw/hpet.c   |   39 ++-
 hw/hpet_emul.h  |   41 +
 hw/i8254.c  |2 +-
 hw/i8254_internal.h |2 +-
 hw/mc146818rtc.c|   26 --
 hw/mc146818rtc.h|   30 ++
 6 files changed, 75 insertions(+), 65 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index fd3ddca..fc0ff6c 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -42,41 +42,6 @@
 
 #define HPET_MSI_SUPPORT0
 
-struct HPETState;
-typedef struct HPETTimer {  /* timers */
-uint8_t tn; /*timer number*/
-QEMUTimer *qemu_timer;
-struct HPETState *state;
-/* Memory-mapped, software visible timer registers */
-uint64_t config;/* configuration/cap */
-uint64_t cmp;   /* comparator */
-uint64_t fsb;   /* FSB route */
-/* Hidden register state */
-uint64_t period;/* Last value written to comparator */
-uint8_t wrap_flag;  /* timer pop will indicate wrap for one-shot 32-bit
- * mode. Next pop will be actual timer expiration.
- */
-} HPETTimer;
-
-typedef struct HPETState {
-SysBusDevice busdev;
-MemoryRegion iomem;
-uint64_t hpet_offset;
-qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
-uint32_t flags;
-uint8_t rtc_irq_level;
-qemu_irq pit_enabled;
-uint8_t num_timers;
-HPETTimer timer[HPET_MAX_TIMERS];
-
-/* Memory-mapped, software visible registers */
-uint64_t capability;/* capabilities */
-uint64_t config;/* configuration */
-uint64_t isr;   /* interrupt status reg */
-uint64_t hpet_counter;  /* main counter */
-uint8_t  hpet_id;   /* instance id */
-} HPETState;
-
 static uint32_t hpet_in_legacy_mode(HPETState *s)
 {
     return s->config & HPET_CFG_LEGACY;
@@ -278,7 +243,7 @@ static const VMStateDescription vmstate_hpet_timer = {
 };
 
 static const VMStateDescription vmstate_hpet = {
-    .name = "hpet",
+    .name = TYPE_HPET,
 .version_id = 2,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
@@ -746,7 +711,7 @@ static void hpet_device_class_init(ObjectClass *klass, void *data)
 }
 
 static TypeInfo hpet_device_info = {
-    .name          = "hpet",
+    .name          = TYPE_HPET,
 .parent= TYPE_SYS_BUS_DEVICE,
 .instance_size = sizeof(HPETState),
 .class_init= hpet_device_class_init,
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index 757f79f..836c5c8 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -13,6 +13,9 @@
 #ifndef QEMU_HPET_EMUL_H
 #define QEMU_HPET_EMUL_H
 
+#include "hw.h"
+#include "sysbus.h"
+
 #define HPET_BASE   0xfed0
 #define HPET_CLK_PERIOD 1000ULL /* 1000 femtoseconds == 10ns*/
 
@@ -71,4 +74,42 @@ struct hpet_fw_config
 } QEMU_PACKED;
 
 extern struct hpet_fw_config hpet_cfg;
+
+#define TYPE_HPET "hpet"
+
+struct HPETState;
+typedef struct HPETTimer {  /* timers */
+uint8_t tn; /*timer number*/
+QEMUTimer *qemu_timer;
+struct HPETState *state;
+/* Memory-mapped, software visible timer registers */
+uint64_t config;/* configuration/cap */
+uint64_t cmp;   /* comparator */
+uint64_t fsb;   /* FSB route */
+/* Hidden register state */
+uint64_t period;/* Last value written to comparator */
+uint8_t wrap_flag;  /* timer pop will indicate wrap for one-shot 32-bit
+ * mode. Next pop will be actual timer expiration.
+ */
+} HPETTimer;
+
+typedef struct HPETState {
+SysBusDevice busdev;
+MemoryRegion iomem;
+uint64_t hpet_offset;
+qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
+uint32_t flags;
+uint8_t rtc_irq_level;
+qemu_irq pit_enabled;
+uint8_t num_timers;
+HPETTimer timer[HPET_MAX_TIMERS];
+
+/* Memory-mapped, software visible registers */
+uint64_t capability;/* capabilities */
+uint64_t config;/* configuration */
+uint64_t isr;   /* interrupt status reg */
+uint64_t hpet_counter;  /* main counter */
+uint8_t  hpet_id;   /* instance id */
+} HPETState;
+
 #endif
diff --git a/hw/i8254.c b/hw/i8254.c
index 77bd5e8..9d855ec 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -346,7 +346,7 @@ static void pit_class_initfn(ObjectClass *klass, void *data)
 }
 
 static TypeInfo pit_info = {
-    .name          = "isa-pit",
+.name  =

[Qemu-devel] [PATCH v4 1/5] eliminate piix_pci.c and module i440fx and piix3

2012-07-18 Thread Wanpeng Li

[CCing ML]

From: Anthony Liguori aligu...@us.ibm.com

The big picture about the patch is shown as follows:

1) pc_init creates an I440FX, any bus devices (ISA serial port, PCI
vga and nics, etc.), sets properties appropriately, and realizes the
devices.
2) I440FX is-a PCIHost, has-a I440FX-PMC, has-a PIIX3
3) PIIX3 has-a RTC, has-a I8042, has-a DMAController, etc.
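The is-a / has-a relationships in the list above can be pictured with plain C struct embedding, as in the rough sketch below. This is not the QOM code from the series: all type and function names here (`ToyPCIHost`, `ToyI440FX`, `toy_i440fx_as_pci_host()` and so on) are made up for illustration. "is-a" is modelled by placing the parent as the first member, "has-a" by embedding the child as an ordinary member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real device state structs. */
typedef struct ToyPCIHost { int bus_nr; } ToyPCIHost;
typedef struct ToyRTC { int irq; } ToyRTC;
typedef struct ToyPMC { unsigned long ram_size; } ToyPMC;

/* PIIX3 has-a RTC (the other children are left out of the sketch). */
typedef struct ToyPIIX3 {
    ToyRTC rtc;
} ToyPIIX3;

/* I440FX is-a PCIHost (parent first), has-a PMC, has-a PIIX3. */
typedef struct ToyI440FX {
    ToyPCIHost parent;
    ToyPMC pmc;
    ToyPIIX3 piix3;
} ToyI440FX;

/* Upcast: every I440FX can be used where a PCIHost is expected. */
static ToyPCIHost *toy_i440fx_as_pci_host(ToyI440FX *s)
{
    return &s->parent;
}

/* Downcast: recover the containing object from its embedded parent. */
static ToyI440FX *toy_pci_host_as_i440fx(ToyPCIHost *h)
{
    return (ToyI440FX *)((char *)h - offsetof(ToyI440FX, parent));
}
```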

i440fx-pcihost => i440fx
i440fx => i440fx-pmc

i440fx-pmc is the Programmable Memory Controller integrated in the I440FX
chipset; this patch moves RAM initialization into i440fx-pmc.

It might seem like a small change, but it better reflects the fact
that the PMC is contained within the i440fx which we will now reflect in
composition in the next few changesets.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
---
 hw/i386/Makefile.objs |2 +-
 hw/i440fx.c   |  434 +++
 hw/i440fx.h   |   77 +++
 hw/piix3.c|  292 
 hw/piix3.h|   79 +++
 hw/piix_pci.c |  599 -
 6 files changed, 883 insertions(+), 600 deletions(-)
 create mode 100644 hw/i440fx.c
 create mode 100644 hw/i440fx.h
 create mode 100644 hw/piix3.c
 create mode 100644 hw/piix3.h
 delete mode 100644 hw/piix_pci.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 8c764bb..49b32d0 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -1,6 +1,6 @@
 obj-y += mc146818rtc.o pc.o
 obj-y += apic_common.o apic.o kvmvapic.o
-obj-y += sga.o ioapic_common.o ioapic.o piix_pci.o
+obj-y += sga.o ioapic_common.o ioapic.o i440fx.o piix3.o
 obj-y += vmport.o
 obj-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-y += debugcon.o multiboot.o
diff --git a/hw/i440fx.c b/hw/i440fx.c
new file mode 100644
index 000..8c4408f
--- /dev/null
+++ b/hw/i440fx.c
@@ -0,0 +1,434 @@
+/*
+ * QEMU i440FX PCI Host Bridge Emulation
+ *
+ * Copyright (c) 2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "i440fx.h"
+#include "range.h"
+#include "xen.h"
+#include "loader.h"
+#include "pc.h"
+
+#define BIOS_FILENAME "bios.bin"
+
+/*
+ * I440FX chipset data sheet.
+ * http://download.intel.com/design/chipsets/datashts/29054901.pdf
+ *
+ * The I440FX is a package that contains an integrated PCI Host controller,
+ * memory controller, and is usually packaged with a PCI-ISA bus and super I/O
+ * chipset.
+ *
+ * The i440FX device is the PCI host controller.  On function 0.0, there is a
+ * memory controller called the Programmable Memory Controller (PMC).  On
+ * function 1.0, there is the PCI-ISA bus/super I/O chip called the PIIX3.
+ */
+
+#define I440FX_PMC_PCI_HOLE 0xE000ULL
+#define I440FX_PMC_PCI_HOLE_END 0x1ULL
+
+#define I440FX_PAM  0x59
+#define I440FX_PAM_SIZE 7
+#define I440FX_SMRAM0x72
+
+static void piix3_set_irq(void *opaque, int pirq, int level)
+{
+PIIX3State *piix3 = opaque;
+piix3_set_irq_level(piix3, pirq, level);
+}
+
+/*
+ * return the global irq number corresponding to a given device irq
+ * pin. We could also use the bus number to have a more precise
+ * mapping.
+ */
+static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
+{
+    int slot_addend;
+    slot_addend = (pci_dev->devfn >> 3) - 1;
+    return (pci_intx + slot_addend) & 3;
+}
+
+static void update_pam(I440FXPMCState *d, uint32_t start, uint32_t end, int r,
+                       PAMMemoryRegion *mem)
+{
+    if (mem->initialized) {
+        memory_region_del_subregion(d->system_memory, &mem->mem);
+        memory_region_destroy(&mem->mem);
+    }
+
+    switch (r) {
+    case 3:
+        /* RAM */
+        memory_region_init_alias(&mem->mem, "pam-ram", d->ram_memory,
+                                 start, end - start);
+        break;
+    case 1:
+        /* ROM (XXX: not quite correct) */
+        memory_region_init_alias(&mem->mem, "pam-rom",

[Qemu-devel] [PATCH v3] qemu-img: correct size parsers and help message

2012-07-18 Thread Dong Xu Wang

qemu-img not only supports k/K/M/G/T/b, but also supports m/g/t/B. So correct
it in help message.

Also use the same parser in parse_option_size function.

Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
CC: riegama...@gmail.com
---
v1->v2: also correct error reporting.
v2->v3: use the same parser in parse_option_size function.

 qemu-img.c|9 +
 qemu-option.c |6 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 80cfb9b..7f2fde4 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -69,8 +69,9 @@ static void help(void)
            "    options are: 'none', 'writeback' (default, except for convert), 'writethrough',\n"
            "    'directsync' and 'unsafe' (default for convert)\n"
            "  'size' is the disk image size in bytes. Optional suffixes\n"
-           "    'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' (gigabyte, 1024M)\n"
-           "    and T (terabyte, 1024G) are supported. 'b' is ignored.\n"
+           "    'k' or 'K' (kilobyte, 1024), 'm' or 'M' (megabyte, 1024k),\n"
+           "    'g' or 'G' (gigabyte, 1024M) and 't' or 'T' (terabyte, 1024G) are supported.\n"
+           "    'b' or 'B' is ignored.\n"
            "  'output_filename' is the destination disk image filename\n"
            "  'output_fmt' is the destination format\n"
            "  'options' is a comma separated list of format specific options in a\n"
@@ -341,8 +342,8 @@ static int img_create(int argc, char **argv)
         char *end;
         sval = strtosz_suffix(argv[optind++], &end, STRTOSZ_DEFSUFFIX_B);
         if (sval < 0 || *end) {
-            error_report("Invalid image size specified! You may use k, M, G or "
-                  "T suffixes for ");
+            error_report("Invalid image size specified! You may use k/K, m/M, "
+                  "g/G or t/T suffixes for ");
             error_report("kilobytes, megabytes, gigabytes and terabytes.");
             ret = -1;
             goto out;
diff --git a/qemu-option.c b/qemu-option.c
index bb3886c..abc3b90 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -213,25 +213,29 @@ static void parse_option_size(const char *name, const 
char *value,
 sizef = strtod(value, postfix);
 switch (*postfix) {
 case 'T':
+case 't':
 sizef *= 1024;
 /* fall through */
 case 'G':
+case 'g':
 sizef *= 1024;
 /* fall through */
 case 'M':
+case 'm':
 sizef *= 1024;
 /* fall through */
 case 'K':
 case 'k':
 sizef *= 1024;
 /* fall through */
+case 'B':
 case 'b':
 case '\0':
 *ret = (uint64_t) sizef;
 break;
 default:
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, name, "a size");
-error_printf_unless_qmp("You may use k, M, G or T suffixes for "
+error_printf_unless_qmp("You may use K/k, M/m, G/g or T/t suffixes for "
 "kilobytes, megabytes, gigabytes and terabytes.\n");
 return;
 }
-- 
1.7.1

Re: [Qemu-devel] [PATCH v3] qemu-img: correct size parsers and help message

2012-07-18 Thread Kevin Wolf

On 18.07.2012 15:23, Dong Xu Wang wrote:
 qemu-img not only supports k/K/M/G/T/b, but also supports m/g/t/B. So correct
 it in help message.
 
 Also use the same parser in parse_option_size function.

This is not what the patch does. It uses a parser that seems slightly
more compatible with strtosz_suffix() than before, but it still doesn't
use the same one. It really should call strtosz_suffix() instead of
implementing a parser here.
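
To illustrate the point about having one parser rather than two, here is a
minimal sketch of a single case-insensitive size parser that every caller
could share. It is not strtosz_suffix(); the name parse_size() and its return
convention are assumptions made for this example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, NOT QEMU's strtosz_suffix(): parse "<number>[suffix]"
 * where the suffix is one of b, k, m, g, t in either case (powers of 1024).
 * Returns 0 and stores the byte count in *out on success; returns -1 on a
 * malformed number, an unknown suffix, trailing characters, or a value that
 * does not fit in uint64_t.
 */
static int parse_size(const char *str, uint64_t *out)
{
    char *end;
    double val;

    assert(str != NULL && out != NULL);

    val = strtod(str, &end);
    if (end == str) {
        return -1;              /* no number at all */
    }

    switch (tolower((unsigned char)*end)) {
    case 't':
        val *= 1024;
        /* fall through */
    case 'g':
        val *= 1024;
        /* fall through */
    case 'm':
        val *= 1024;
        /* fall through */
    case 'k':
        val *= 1024;
        /* fall through */
    case 'b':
        end++;                  /* consume the suffix character */
        break;
    case '\0':
        break;                  /* no suffix: plain bytes */
    default:
        return -1;              /* unknown suffix */
    }

    if (*end != '\0') {
        return -1;              /* trailing characters, e.g. "10MB" */
    }
    /* Rejects negative values, NaN, infinity and anything >= 2^64. */
    if (!(val >= 0) || val >= 18446744073709551616.0) {
        return -1;
    }

    *out = (uint64_t)val;
    return 0;
}
```

With one helper like this, both the command-line code and the option code
would accept exactly the same suffixes, so the help text and the error
messages only need to describe one behaviour.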

Kevin

Re: [Qemu-devel] q35 and ahci

2012-07-18 Thread Gerd Hoffmann

 The current command line I'm using something such as:
 
 $ /usr/local/bin/qemu-system-x86_64  -name f16 -M pc_q35 -m 1G -smp 4
 -hda ./f16.img --enable-kvm  -bios path/q35-seabios/out/bios.bin
 -acpitable file=path/q35-seabios/out/q35-acpi-dsdt.aml  -monitor stdio
 
 So it might be nice to avoid the '-acpitable' flag. Perhaps, we can
 teach qemu to pull in the correct acpitable depending on the -M
 specification...

I expect those switches to be a temporary solution.  Once everything is
upstream in qemu and seabios there should be no need to specify an ACPI
table manually.  Likewise for the bios binary.

cheers,
  Gerd

[Qemu-devel] [PATCH v4 0/5] refactor PC machine, i440fx and piix3 to take advantage of QOM

2012-07-18 Thread Wanpeng Li

[CCing ML]

This series aggressively refactors the PC machine initialization to be more
modelled and less ad-hoc.  The highlights of this series are:

1) Things like -m and -bios-name are now device model properties

2) The i440fx and piix3 are now modelled in a thorough fashion

3) Most of the chipset features of the piix3 are modelled through composition

4) i440fx_init is trivialized to creating devices and setting properties

5) convert MemoryRegion to QOM

6) convert PCI host bridge to QOM

The point (4) is the most important one.  As we refactor in this fashion,
we should quickly get to the point where machine-init disappears completely in
favor of just creating a handful of devices.

The two stage initialization of QOM is important here.  instance_init() is when
composed devices are created which means that after you've created a device, all
of its children are visible in the device model.  This lets you set properties
of the parent and its children.

realize() (which is still called DeviceState::init today) will be called right
before the guest starts up for the first time.
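
The two-stage idea can be sketched with a toy model. None of the names below
are QOM API; ToyPIIX3, ToyRTC and the functions are hypothetical and only
show the ordering: children exist after instance_init, properties are set
next, and realize runs last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy child device (hypothetical, stands in for something like an RTC). */
typedef struct ToyRTC {
    int base_year;      /* a settable "property" */
    bool realized;
} ToyRTC;

/* Toy parent device that owns its child by composition. */
typedef struct ToyPIIX3 {
    ToyRTC rtc;
    bool realized;
} ToyPIIX3;

/* Stage 1: create children and give every property a default. */
static void toy_piix3_instance_init(ToyPIIX3 *s)
{
    assert(s != NULL);
    s->rtc.base_year = 2000;
    s->rtc.realized = false;
    s->realized = false;
}

/* Between the stages the machine code may set properties on parent or child. */
static void toy_piix3_set_rtc_base_year(ToyPIIX3 *s, int year)
{
    assert(s != NULL && !s->realized);   /* properties are frozen by realize */
    s->rtc.base_year = year;
}

/* Stage 2: called once, right before the guest would start. */
static void toy_piix3_realize(ToyPIIX3 *s)
{
    assert(s != NULL);
    s->rtc.realized = true;              /* realize children first */
    s->realized = true;
}
```

In this shape, machine init reduces to "create, set properties, realize",
which is the direction point (4) describes.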

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com

Change in v4:

*rebase patchset 

Changes in v3:

* fix coding style issues
* fix rebase error 
* add changes log 

Changes in v2:

* Rebase patch series of i440fx in Anthony's qom-rebase.12 branch to upstream
* convert MemoryRegion to QOM
* convert pci_host to QOM


Anthony Liguori (5):
  eliminate piix_pci.c and module i440fx and piix3
  convert MemoryRegion to QOM
  convert pci-host to QOM
  prepare to create HPET, RTC and i8254 through composition
  merge pc_piix.c to pc.c

 hw/hpet.c |   39 +---
 hw/hpet_emul.h|   41 +++
 hw/i386/Makefile.objs |3 +-
 hw/i440fx.c   |  434 +
 hw/i440fx.h   |   77 +
 hw/i8254.c|2 +-
 hw/i8254_internal.h   |2 +-
 hw/mc146818rtc.c  |   26 --
 hw/mc146818rtc.h  |   30 ++
 hw/pc.c   |  741 +++--
 hw/pc.h   |   46 +---
 hw/pc_piix.c  |  661 ---
 hw/pci_host.c |   26 ++
 hw/pci_host.h |5 +
 hw/piix3.c|  292 +++
 hw/piix3.h|   79 ++
 hw/piix_pci.c |  599 ---
 memory.c  |   94 +--
 memory.h  |8 +
 19 files changed, 1722 insertions(+), 1483 deletions(-)
 create mode 100644 hw/i440fx.c
 create mode 100644 hw/i440fx.h
 delete mode 100644 hw/pc_piix.c
 create mode 100644 hw/piix3.c
 create mode 100644 hw/piix3.h
 delete mode 100644 hw/piix_pci.c

-- 
1.7.5.4

Re: [Qemu-devel] [PATCH v3] qemu-img: correct size parsers and help message

2012-07-18 Thread Eric Blake

On 07/18/2012 07:23 AM, Dong Xu Wang wrote:
 qemu-img not only supports k/K/M/G/T/b, but also supports m/g/t/B. So correct
 it in help message.
 

 +++ b/qemu-img.c
 @@ -69,8 +69,9 @@ static void help(void)
 options are: 'none', 'writeback' (default, except for 
 convert), 'writethrough',\n
 'directsync' and 'unsafe' (default for convert)\n
   'size' is the disk image size in bytes. Optional suffixes\n
 -   'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' 
 (gigabyte, 1024M)\n
 -   and T (terabyte, 1024G) are supported. 'b' is ignored.\n
 +   'k' or 'K' (kilobyte, 1024), 'm' or 'M' (megabyte, 1024k),\n
 +   'g' or 'G' (gigabyte, 1024M) and 't' or 'T' (terabyte, 
 1024G) are supported.\n
 +   'b' or 'B' is ignored.\n

Technically, 'kilobyte' is only 1000 bytes; the correct term for 1024
bytes is 'kibibyte'.  Likewise for 'megabyte' (1000k, or 1,000,000 bytes)
vs. 'mebibyte' (1024k, or 1,048,576 bytes); and so on for gibibytes and tebibytes.
Since disk manufacturers have already forced the rest of the world to
ask whether the number of bytes they are looking at is a power of 10 or
a power of 2 suffix, we might as well be precise in our naming to
document that we really are using powers of 2.
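
To put numbers on the difference, here is a small sketch that computes how
much larger each binary unit is than its decimal namesake. The function names
are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* 1024^power: kibi (1), mebi (2), gibi (3), tebi (4). */
static uint64_t binary_unit(int power)
{
    uint64_t v = 1;

    assert(power >= 0 && power <= 4);
    while (power-- > 0) {
        v *= 1024;
    }
    return v;
}

/* 1000^power: kilo (1), mega (2), giga (3), tera (4). */
static uint64_t decimal_unit(int power)
{
    uint64_t v = 1;

    assert(power >= 0 && power <= 4);
    while (power-- > 0) {
        v *= 1000;
    }
    return v;
}

/*
 * How much bigger the binary unit is than the decimal one, in parts per
 * thousand, rounded down.  For power 4 this is 99, i.e. a tebibyte is
 * almost 10% larger than a terabyte.
 */
static unsigned unit_gap_permille(int power)
{
    uint64_t bin = binary_unit(power);
    uint64_t dec = decimal_unit(power);

    return (unsigned)((bin - dec) * 1000 / dec);
}
```

The gap grows with each step (roughly 2.4%, 4.8%, 7.3%, 9.9%), which is why
the naming matters more for the larger suffixes.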

Furthermore, I think you can compress this by mentioning that the parse
is case-insensitive, instead of spelling out all the options:

'size' is the disk image size in bytes, scaled by an optional
case-insensitive suffix: 'k' (kibibyte, 1024), 'M' (mebibyte, 1024k),
'G' (gibibyte, 1024M), 'T' (tebibyte, 1024G), or 'b' (no scaling).

 @@ -341,8 +342,8 @@ static int img_create(int argc, char **argv)
  char *end;
  sval = strtosz_suffix(argv[optind++], &end, STRTOSZ_DEFSUFFIX_B);
  if (sval < 0 || *end) {
 -error_report("Invalid image size specified! You may use k, M, G or "
 -  "T suffixes for ");
 +error_report("Invalid image size specified! You may use k/K, m/M, "
 +  "g/G or t/T suffixes for ");

I personally dislike this change.  Just because we're lenient in what we
accept does not mean we have to document all of the possibilities that
we parse when correcting a user error; rather, we need only document the
preferred possibilities.

  default:
  error_set(errp, QERR_INVALID_PARAMETER_VALUE, name, "a size");
 -error_printf_unless_qmp("You may use k, M, G or T suffixes for "
 +error_printf_unless_qmp("You may use K/k, M/m, G/g or T/t suffixes for "

Again, in an error message, I'd only document the preferred capitalization.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH] build: add make dist target (v2)

2012-07-18 Thread Gerd Hoffmann

On 07/17/12 21:12, Michael Roth wrote:
 On Tue, Jul 17, 2012 at 01:33:32PM -0500, Anthony Liguori wrote:
 Let's stop screwing up releases by having a script do the work that Anthony's
 fat fingers can't seem to get right.

 Cc: Michael Roth mdr...@linux.vnet.ibm.com
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 
 Breaks if there's no tag corresponding with the contents of VERSION,
 but that might be considered a feature (an alternative might be to
 assume it's a development release, use current HEAD for master, and append the
 short git hash to the version). Works well as far as I can tell though,
 and I made a special point to confirm it did indeed output a bz2 :)

Or just use 'git describe --long' to figure out what the version is.  This
way you can easily build a tarball for any git commit, and a release
tarball is just 'git checkout v$version; make dist'.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v4 5/6] qapi: convert sendkey

2012-07-18 Thread Luiz Capitulino

On Wed, 18 Jul 2012 20:56:54 +0800
Amos Kong ak...@redhat.com wrote:

  +} KeyDef;
  +
  +static const KeyDef key_defs[] = {
 
  We can't have an array defined in a header file because it will be defined 
  in
  each .c file that includes it.
 
  Please, define it in input.c (along with qmp_send_key())
 
 Ok.
 
  and write the following public functions:
 
o KeyCode keycode_from_key(const char *key);
o KeyCode keycode_from_code(int code);
 
 
 void qmp_send_key(KeyCodesList *keys, bool has_hold_time, int64_t 
 hold_time, ...)
  ^
  \_ when we use qmp, a key list will be passed, the 
 values are the index
 in enum KeyCodes. not the real KeyCode.

Right.

 
  { 'enum': 'KeyCodes',
'data': [ 'shift', 'shift_r', 'al...
 
 So we need to get this kind of 'index' in hmp_send_key() and pass to 
 qmp_send_key().

Yes, that's what keycode_from_key() would do, something like this:

KeyCode keycode_from_key(const char *key)
{
    int i;

    for (i = 0; i < KEY_CODES_MAX; i++) {
        if (!strcmp(key, KeyCodes_lookup[i])) {
            return i;
        }
    }

    return KEY_CODES_MAX;
}

Note that it returns the KeyCode index, and should be defined in input.c.

 then convert this 'index' to keycode in qmp_send_key()

Exactly, qmp_send_key() can access key_defs[] to get the keycode from the
index.
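
A minimal sketch of the split described above: name to index on the HMP
side, index to scancode on the QMP side. The Toy* names are hypothetical
stand-ins for the generated enum and lookup table; only the three scancode
values (0x2a, 0x36, 0x38) come from this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the QAPI-generated enum. */
typedef enum ToyKeyCode {
    TOY_KEY_SHIFT,
    TOY_KEY_SHIFT_R,
    TOY_KEY_ALT,
    TOY_KEY_MAX          /* also used as the "not found" value */
} ToyKeyCode;

/* Hypothetical stand-in for the generated name table. */
static const char *const toy_key_names[TOY_KEY_MAX] = {
    [TOY_KEY_SHIFT]   = "shift",
    [TOY_KEY_SHIFT_R] = "shift_r",
    [TOY_KEY_ALT]     = "alt",
};

/* Private table: enum index -> scancode.  Lives in one .c file only. */
static const int toy_key_scancodes[TOY_KEY_MAX] = {
    [TOY_KEY_SHIFT]   = 0x2a,
    [TOY_KEY_SHIFT_R] = 0x36,
    [TOY_KEY_ALT]     = 0x38,
};

/* HMP side: key name -> enum index, TOY_KEY_MAX if unknown. */
static ToyKeyCode toy_index_from_key(const char *key)
{
    int i;

    assert(key != NULL);
    for (i = 0; i < TOY_KEY_MAX; i++) {
        if (!strcmp(key, toy_key_names[i])) {
            return (ToyKeyCode)i;
        }
    }
    return TOY_KEY_MAX;
}

/* HMP side: raw scancode (e.g. from "0x2a") -> enum index. */
static ToyKeyCode toy_index_from_code(int code)
{
    int i;

    for (i = 0; i < TOY_KEY_MAX; i++) {
        if (toy_key_scancodes[i] == code) {
            return (ToyKeyCode)i;
        }
    }
    return TOY_KEY_MAX;
}

/* QMP side: enum index -> scancode to inject into the guest. */
static int toy_scancode_from_index(ToyKeyCode idx)
{
    assert(idx >= 0 && idx < TOY_KEY_MAX);
    return toy_key_scancodes[idx];
}
```

Because the wire format carries only the enum index, the scancode table stays
private to one file and no array needs to be defined in a header.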

 
 I didn't find a way to define a non-serial enum.

I'm not sure I follow you here, I think that what I suggested above will work.

 
 eg: (then int qmp_marshal_input_send_key() would pass real keycode to 
 qmp_send_key())
 { 'enum': 'KeyCodes',
'data': [ 'shift' = 0x2a, 'shift_r' = 0x36, 'alt' = 0x38, ...
 
 
 If we still pass 'index' to qmp_send_key as patch V4.
 
 extern int index_from_key(const char *key);   - it's used in hmp_send_key()
 extern int index_from_keycode(int code);  - it's used in hmp_send_key()
 extern char *key_from_keycode(int idx);   - it's used in 
 monitor_find_completion()
 extern int keycode_from_key(const char *key); - it's used in qmp_send_key()
 
 
  and then use these functions where using key_defs would be necessary. Also,
  note that keycode_from_key() can use KeyCodes_lookup[] instead of key_defs 
  (this
  way we can drop 'name' from KeyDef).
 
 
 
  +#endif
  +#endif
  +[KEY_CODES_MAX] = { 0, NULL },
  +};
  +
#endif
  diff --git a/hmp-commands.hx b/hmp-commands.hx
  index e336251..865eea9 100644
  --- a/hmp-commands.hx
  +++ b/hmp-commands.hx
  @@ -505,7 +505,7 @@ ETEXI
.args_type  = keys:s,hold-time:i?,
.params = keys [hold_ms],
.help   = send keys to the VM (e.g. 'sendkey ctrl-alt-f1', 
  default hold time=100 ms),
  -.mhandler.cmd = do_sendkey,
  +.mhandler.cmd = hmp_send_key,
},
 
STEXI
  diff --git a/hmp.c b/hmp.c
  index b9cec1d..cfdc106 100644
  --- a/hmp.c
  +++ b/hmp.c
  @@ -19,6 +19,7 @@
#include qemu-timer.h
#include qmp-commands.h
#include monitor.h
  +#include console.h
 
static void hmp_handle_error(Monitor *mon, Error **errp)
{
  @@ -1000,3 +1001,66 @@ void hmp_netdev_del(Monitor *mon, const QDict 
  *qdict)
qmp_netdev_del(id,err);
hmp_handle_error(mon,err);
}
  +
  +static int get_key_index(const char *key)
  +{
  +int i, keycode;
  +char *endp;
  +
  +for (i = 0; i < KEY_CODES_MAX; i++)
  +if (key_defs[i].keycode && !strcmp(key, key_defs[i].name))
  +return i;
 
  Here you can call do:
 
 keycode = keycode_from_key(key);
 if (keycode != KEY_CODES_MAX) {
  return keycode;
 }
 
  +
  +if (strstart(key, "0x", NULL)) {
  +keycode = strtoul(key, &endp, 0);
  +if (*endp == '\0' && keycode >= 0x01 && keycode <= 0xff)
  +for (i = 0; i < KEY_CODES_MAX; i++)
  +if (keycode == key_defs[i].keycode)
  +return i;
 
  You can drop that for loop and do instead:
 
 keycode = keycode_from_code(keycode);
 
 
  +}
  +
  +return -1;
  +}
  +
  +void hmp_send_key(Monitor *mon, const QDict *qdict)
  +{
  +const char *keys = qdict_get_str(qdict, keys);
  +KeyCodesList *keylist, *head = NULL, *tmp = NULL;
  +int has_hold_time = qdict_haskey(qdict, hold-time);
  +int hold_time = qdict_get_try_int(qdict, hold-time, -1);
  +Error *err = NULL;
  +char keyname_buf[16];
  +char *separator;
  +int keyname_len;
  +
  +while (1) {
  +separator = strchr(keys, '-');
  +keyname_len = separator ? separator - keys : strlen(keys);
  +pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
  +
  +/* Be compatible with old interface, convert user inputted  */
  +if (!strncmp(keyname_buf, , 1)  keyname_len == 1) {
  +pstrcpy(keyname_buf, sizeof(keyname_buf), less);
  +keyname_len = 4;
  +}
  +keyname_buf[keyname_len] = 0;
  +
  +keylist =

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Andreas Färber

On 18.07.2012 10:51, Wenchao Xia wrote:
   Hi, following is API draft, prototypes were taken from qemu/block.h,
 and the API prefix is changed from bdrv to qbdrvs, to declare related
 object is BlockDriverState, not BlockDriver. [...]

So let the bikeshedding begin: ;)

What about qbds_ prefix rather than qbdrvs_? I find the proposed mixture
of acronym (q for QEMU, b for Block, s for State) and abbreviation (drv
as in Driver) a bit ugly.

Or simply go for qblock, which might be more memorable. :)

Just my 2¢ (on something I don't use much),

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] [PATCH v4 5/5] merge pc_piix.c to pc.c

2012-07-18 Thread Wanpeng Li

[CCing ML]

From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
---
 hw/i386/Makefile.objs |1 -
 hw/pc.c   |  753 +++--
 hw/pc.h   |   46 +---
 hw/pc_piix.c  |  661 ---
 4 files changed, 667 insertions(+), 794 deletions(-)
 delete mode 100644 hw/pc_piix.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 49b32d0..868020c 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -4,7 +4,6 @@ obj-y += sga.o ioapic_common.o ioapic.o i440fx.o piix3.o
 obj-y += vmport.o
 obj-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-y += debugcon.o multiboot.o
-obj-y += pc_piix.o
 obj-y += pc_sysfw.o
 obj-$(CONFIG_XEN) += xen_platform.o xen_apic.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
diff --git a/hw/pc.c b/hw/pc.c
index c7e9ab3..7c04339 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -27,6 +27,7 @@
 #include fdc.h
 #include ide.h
 #include pci.h
+#include usb.h
 #include vmware_vga.h
 #include monitor.h
 #include fw_cfg.h
@@ -47,7 +48,10 @@
 #include ui/qemu-spice.h
 #include memory.h
 #include exec-memory.h
+#include kvm/clock.h
 #include arch_init.h
+#include smbus.h
+#include boards.h
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -75,6 +79,8 @@
 
 #define E820_NR_ENTRIES16
 
+#define MAX_IDE_BUS 2
+
 struct e820_entry {
 uint64_t address;
 uint64_t length;
@@ -86,10 +92,14 @@ struct e820_table {
 struct e820_entry entry[E820_NR_ENTRIES];
 } QEMU_PACKED __attribute((__aligned__(4)));
 
+static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
+static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
+static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
+
 static struct e820_table e820_table;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
-void gsi_handler(void *opaque, int n, int level)
+static void gsi_handler(void *opaque, int n, int level)
 {
 GSIState *s = opaque;
 
@@ -107,7 +117,7 @@ static void ioport80_write(void *opaque, uint32_t addr, 
uint32_t data)
 /* MSDOS compatibility mode FPU exception support */
 static qemu_irq ferr_irq;
 
-void pc_register_ferr_irq(qemu_irq irq)
+static void pc_register_ferr_irq(qemu_irq irq)
 {
 ferr_irq = irq;
 }
@@ -330,7 +340,7 @@ static void pc_cmos_init_late(void *opaque)
 qemu_unregister_reset(pc_cmos_init_late, opaque);
 }
 
-void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
+static void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
   const char *boot_device,
   ISADevice *floppy, BusState *idebus0, BusState *idebus1,
   ISADevice *s)
@@ -860,7 +870,7 @@ static const int ne2000_irq[NE2000_NB_MAX] = { 9, 10, 11, 
3, 4, 5 };
 static const int parallel_io[MAX_PARALLEL_PORTS] = { 0x378, 0x278, 0x3bc };
 static const int parallel_irq[MAX_PARALLEL_PORTS] = { 7, 7, 7 };
 
-void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
+static void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
 {
 static int nb_ne2k = 0;
 
@@ -915,7 +925,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 return dev;
 }
 
-void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
+static void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
 {
 CPUX86State *s = opaque;
 
@@ -952,7 +962,7 @@ static X86CPU *pc_new_cpu(const char *cpu_model)
 return cpu;
 }
 
-void pc_cpus_init(const char *cpu_model)
+static void pc_cpus_init(const char *cpu_model)
 {
 int i;
 
@@ -970,55 +980,18 @@ void pc_cpus_init(const char *cpu_model)
 }
 }
 
-void *pc_memory_init(MemoryRegion *system_memory,
+static void *pc_memory_init(MemoryRegion *system_memory,
 const char *kernel_filename,
 const char *kernel_cmdline,
 const char *initrd_filename,
 ram_addr_t below_4g_mem_size,
-ram_addr_t above_4g_mem_size,
-MemoryRegion *rom_memory,
-MemoryRegion **ram_memory)
+ram_addr_t above_4g_mem_size)
 {
 int linux_boot, i;
-MemoryRegion *ram, *option_rom_mr;
-MemoryRegion *ram_below_4g, *ram_above_4g;
 void *fw_cfg;
 
 linux_boot = (kernel_filename != NULL);
 
-/* Allocate RAM.  We allocate it as a single memory region and use
- * aliases to address portions of it, mostly for backwards compatibility
- * with older qemus that used qemu_ram_alloc().
- */
-ram = g_malloc(sizeof(*ram));
-memory_region_init_ram(ram, pc.ram,
-   below_4g_mem_size + above_4g_mem_size);
-vmstate_register_ram_global(ram);
-*ram_memory = ram;
-ram_below_4g = g_malloc(sizeof(*ram_below_4g));
-memory_region_init_alias(ram_below_4g, ram-below-4g, ram,
- 0,

Re: [Qemu-devel] [PATCH] build: add make dist target (v2)

2012-07-18 Thread Anthony Liguori

Gerd Hoffmann kra...@redhat.com writes:

 On 07/17/12 21:12, Michael Roth wrote:
 On Tue, Jul 17, 2012 at 01:33:32PM -0500, Anthony Liguori wrote:
 Let's stop screwing up releases by having a script do the work that 
 Anthony's
 fat fingers can't seem to get right.

 Cc: Michael Roth mdr...@linux.vnet.ibm.com
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 
 Breaks if there's no tag corresponding with the contents of VERSION,
 but that might be considered a feature (an alternative might be to
 assume it's a development release, use current HEAD for master, and append 
 the
 short git hash to the version). Works well as far as I can tell though,
 and I made a special point to confirm it did indeed output a bz2 :)

 Or just use 'git describe --long' to figure out what the version is.  This
 way you can easily build a tarball for any git commit, and a release
 tarball is just 'git checkout v$version; make dist'.

As long as it doesn't break release tarballs, I'm very open to patches
to make this more generally useful.

Regards,

Anthony Liguori


 cheers,
   Gerd

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Kevin Wolf

On 18.07.2012 15:51, Andreas Färber wrote:
 On 18.07.2012 10:51, Wenchao Xia wrote:
   Hi, following is API draft, prototypes were taken from qemu/block.h,
 and the API prefix is changed from bdrv to qbdrvs, to declare related
 object is BlockDriverState, not BlockDriver. [...]

After the refactoring that Markus is working on it won't refer to a
BlockDriverState, but to a BlockBackend. (And changes like this make
quite clear why the internals of the current block layer are not
suitable as a public API. The API needs to be defined in a separate
layer that can abstract such changes away.)

 So let the bikeshedding begin: ;)
 
 What about qbds_ prefix rather than qbdrvs_? I find the proposed mixture
 of acronym (q for QEMU, b for Block, s for State) and abbreviation (drv
 as in Driver) a bit ugly.
 
 Or simply go for qblock, which might be more memorable. :)

Yes, something like qblk that isn't tied to internals sounds better.

Kevin

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Daniel P. Berrange

On Wed, Jul 18, 2012 at 04:51:03PM +0800, Wenchao Xia wrote:
   Hi, following is API draft, prototypes were taken from qemu/block.h,
 and the API prefix is changed from bdrv to qbdrvs, to declare related
 object is BlockDriverState, not BlockDriver. One issue here is it may
 require include block_int.h, which is not LGPL2 licensed yet.
   API format is kept mostly the same with qemu generic block layer, to
 make it easier for implement, and easier to make qemu migrate on it if
 possible.


How is error reporting dealt with, and what is the intent around
thread safety of the APIs ?  I'd like to see a fully thread safe
API - multiple threads can use the same 'BlockDriverState *'
concurrently, and thread-local error reporting.
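
As a rough sketch of what thread-local error reporting could look like in a
library API: each thread keeps its own last-error record, so one thread's
failure never overwrites another's. All toy_* names are hypothetical and
nothing here is taken from the proposed library; it assumes a C11 compiler
for _Thread_local.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef struct ToyError {
    int code;           /* errno-style code, 0 when no error is pending */
    char msg[128];      /* human-readable detail */
} ToyError;

/* One record per thread (C11 thread-local storage). */
static _Thread_local ToyError toy_last_error;

/* Record an error for the calling thread only. */
static void toy_set_error(int code, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    toy_last_error.code = code;
    va_start(ap, fmt);
    vsnprintf(toy_last_error.msg, sizeof(toy_last_error.msg), fmt, ap);
    va_end(ap);
}

/* Fetch the calling thread's last error; valid until its next API call. */
static const ToyError *toy_get_last_error(void)
{
    return &toy_last_error;
}

/* Example API entry point: still returns -errno, but also leaves detail. */
static int toy_open(const char *filename)
{
    if (filename == NULL || filename[0] == '\0') {
        toy_set_error(EINVAL, "cannot open image: empty filename");
        return -EINVAL;
    }
    /* Success clears any stale error for this thread. */
    toy_last_error.code = 0;
    toy_last_error.msg[0] = '\0';
    return 0;
}
```

The return value stays errno-compatible, while the per-thread record carries
the extra context that a bare errno cannot.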

 
 
 /* structure init and uninit */
 BlockDriverState *qbdrvs_new(const char *device_name);
 void qbdrvs_delete(BlockDriverState *bs);
 
 
 /* file open and close */
 int qbdrvs_open(BlockDriverState *bs, const char *filename, int flags,
   BlockDriver *drv);
 void qbdrvs_close(BlockDriverState *bs);
 int qbdrvs_img_create(const char *filename, const char *fmt,
 const char *base_filename, const char *base_fmt,
 char *options, uint64_t img_size, int flags);

s/img_create/create/

Can this return an actual BlockDriverState struct too?

 
 
 /* sync access */
 int qbdrvs_read(BlockDriverState *bs, int64_t sector_num,
   uint8_t *buf, int nb_sectors);
 int qbdrvs_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
 
 
 /* info retrieve */
 //sector, size and geometry info
 int qbdrvs_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);

What is in BlockDriverInfo and what is the intended ABI stability
policy for it ?

 int64_t qbdrvs_getlength(BlockDriverState *bs);
 int64_t qbdrvs_get_allocated_file_size(BlockDriverState *bs);
 void qbdrvs_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);

Could this data all just be part of BlockDriverInfo data ?

 //image type
 const char *qbdrvs_get_format_name(BlockDriverState *bs);
 //backing file info
 void qbdrvs_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);

You need to include the backing file format here too.

 void qbdrvs_get_full_backing_filename(BlockDriverState *bs,
 char *dest, size_t sz);

Not sure I see why we need this in addition to the above ?


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

[Qemu-devel] QMP event for suspend to disk

2012-07-18 Thread Luiz Capitulino

Hi,

QEMU can now distinguish between S4 and power off for the guest OSes that
suspend to disk through S4. This means that we can have a QMP event for S4
(so that QMP clients can distinguish between S3 and S4).

However, as we already emit the SHUTDOWN event for this, it turns out that
there are three different ways of supporting the new event. We could:

 1. not emit the SHUTDOWN event and only emit the SUSPEND_DISK event

cons: breaks compatibility, as we'll stop sending an event we send today.

 2. emit both, that is, first SUSPEND_DISK and then SHUTDOWN

cons: emitting two very similar events.

 3. extend the SHUTDOWN event to say whether it's a suspend to disk or not,
like this:

   { "event": "SHUTDOWN", "data": { "suspend-to-disk": true }, ... }

cons: we already have the SUSPEND event, so from an API perspective
  it would make sense to have SUSPEND_DISK.

I prefer item 3, because it seems to be the cleanest one.
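
For illustration, a sketch of what item 3 could mean for the emitter and for
a client. The function names are hypothetical, the other event fields (the
"..." above) are left out, and the client check is a crude substring match
rather than a real JSON parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Emitter side: format a SHUTDOWN event carrying the proposed flag. */
static int toy_format_shutdown_event(char *buf, size_t len, bool suspend_to_disk)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "{\"event\": \"SHUTDOWN\", "
                    "\"data\": {\"suspend-to-disk\": %s}}",
                    suspend_to_disk ? "true" : "false");
}

/*
 * Client side: a client that ignores "data" keeps working unchanged; one
 * that cares can look for the flag.  Substring matching is for the sketch
 * only and assumes the exact spacing produced above.
 */
static bool toy_event_is_suspend_to_disk(const char *event_json)
{
    assert(event_json != NULL);
    return strstr(event_json, "\"event\": \"SHUTDOWN\"") != NULL &&
           strstr(event_json, "\"suspend-to-disk\": true") != NULL;
}
```

Existing clients that only look at the event name would see no change, which
is the compatibility argument for this option.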

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Paolo Bonzini

On 18/07/2012 15:58, Daniel P. Berrange wrote:
 How is error reporting dealt with

These APIs just return errno values.

 , and what is the intent around
 thread safety of the APIs ?  I'd like to see a fully thread safe
 API - multiple threads can use the same 'BlockDriverState *'
 concurrently, and thread-local error reporting.

This is a bit difficult to provide, since the QEMU block layer itself is
not thread-safe.

Another missing feature is passwords.

Paolo

Re: [Qemu-devel] [PATCH v4 3/5] convert pci-host to QOM

2012-07-18 Thread Andreas Färber

On 18.07.2012 15:19, Wanpeng Li wrote:
 [CCing ML]
 
 From: Anthony Liguori aligu...@us.ibm.com
 
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
 ---
  hw/pci_host.c |   26 ++
  hw/pci_host.h |5 +
  2 files changed, 31 insertions(+), 0 deletions(-)

Note: This is a resend of an old patch that conflicts with my more
recent pci_host series and does not take recent review comments into
account (e.g., PCI_HOST_BRIDGE() was requested). Please take a look at
those patches and participate in the review, so that we can get it in soon.

Thanks,
Andreas

 
 diff --git a/hw/pci_host.c b/hw/pci_host.c
 index 8041778..095bfe3 100644
 --- a/hw/pci_host.c
 +++ b/hw/pci_host.c
 @@ -165,4 +165,30 @@ const MemoryRegionOps pci_host_data_be_ops = {
  .endianness = DEVICE_BIG_ENDIAN,
  };
  
 +void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value)
 +{
 +object_property_set_link(OBJECT(s), OBJECT(value), "mmio", NULL);
 +}
 +
 +static void pci_host_initfn(Object *obj)
 +{
 +PCIHostState *s = PCI_HOST(obj);
 +
 +object_property_add_link(obj, "mmio", TYPE_MEMORY_REGION,
 +(Object **)&s->address_space, NULL);
 +}
 +
 +static TypeInfo pci_host_type_info = {
 +.name = TYPE_PCI_HOST,
 +.parent = TYPE_SYS_BUS_DEVICE,
 +.instance_size = sizeof(PCIHostState),
 +.instance_init = pci_host_initfn,
 +};
 +
 +static void register_devices(void)
 +{
 +type_register_static(&pci_host_type_info);
 +}
 +
 +type_init(register_devices)
  
 diff --git a/hw/pci_host.h b/hw/pci_host.h
 index 359e38f..084e15c 100644
 --- a/hw/pci_host.h
 +++ b/hw/pci_host.h
 @@ -30,6 +30,9 @@
  
  #include sysbus.h
  
 +#define TYPE_PCI_HOST "pci-host"
 +#define PCI_HOST(obj) OBJECT_CHECK(PCIHostState, (obj), TYPE_PCI_HOST)
 +
  struct PCIHostState {
  SysBusDevice busdev;
  MemoryRegion conf_mem;
 @@ -49,6 +52,8 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
 uint32_t addr,
  void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len);
  uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len);
  
 +void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value);
 +
  extern const MemoryRegionOps pci_host_conf_le_ops;
  extern const MemoryRegionOps pci_host_conf_be_ops;
  extern const MemoryRegionOps pci_host_data_le_ops;
 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] [PATCH 00/11] configure: Fix -Werror issues

2012-07-18 Thread Peter Maydell

This patch series:
 1. turns off -Werror for configure tests
 2. fixes a large pile of warnings in various configure tests
 3. turns on -Werror for configure tests again, but in a way that means
that errors mean configure stops so the warnings are as obvious
to developers as they would be for a -Werror compile of the main code

The turn off -Werror patch has already been sent to the list;
I've included it here as it is a dependency, but it should probably
be committed ASAP even if the rest of this series needs review and
updating. I've also put all of Stefan's recent patches in since without
them configure won't run once -Werror is reenabled.

NB: this works for me, but it's possible there are some still lurking
warnings in a few compilation tests that are themselves protected by
tests for some library/host I don't have. They should be easy to fix
as they turn up, though, and the configure failure message is clear
about how to work around in the meantime.

Peter Maydell (7):
  configure: Don't run configure tests with -Werror enabled
  configure: -march=i486 belongs in QEMU_CFLAGS, not CFLAGS
  configure: Fix compile warning in PNG test
  configure: Fix warnings in VDE library probe
  configure: Fix compile warning in utimensat/futimens test
  configure: -I\$(SRC_PATH) goes in QEMU_INCLUDES not QEMU_CFLAGS
  configure: Check for -Werror causing failures when compiling tests

Stefan Weil (4):
  configure: Fix build with ALSA audio driver
  configure: Fix build with capabilities
  configure: Replace bash code by standard shell code
  configure: Fix errors in test for __sync_fetch_and_and

 configure |   73 +++-
 1 files changed, 57 insertions(+), 16 deletions(-)

-- 
1.7.5.4

[Qemu-devel] [PATCH 04/11] configure: Replace bash code by standard shell code

2012-07-18 Thread Peter Maydell

From: Stefan Weil s...@weilnetz.de

+= does not work with dash and other simple /bin/sh implementations.

The new code prepends the flag while the old code either did not work
(it continued after an error message which typically was not read) or
appended the flag. That difference should not matter here.

Reported-by: Olaf Hering o...@aepfle.de
Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index feadbe7..f54415d 100755
--- a/configure
+++ b/configure
@@ -2810,7 +2810,7 @@ int main(int argc, char **argv)
 }
 EOF
   if ! compile_prog   ; then
-CFLAGS+=-march=i486
+CFLAGS=-march=i486 $CFLAGS
   fi
 fi
 
-- 
1.7.5.4

Re: [Qemu-devel] [PATCH] build: add make dist target (v2)

2012-07-18 Thread Daniel P. Berrange

On Tue, Jul 17, 2012 at 01:33:32PM -0500, Anthony Liguori wrote:
 Let's stop screwing up releases by having a script do the work that Anthony's
 fat fingers can't seem to get right.
 
 Cc: Michael Roth mdr...@linux.vnet.ibm.com
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
 v1 - v2
  - include the scripts for real this time
  - remove tar/tarbin from PHONY
 ---
  Makefile |   19 ---
  scripts/make-release |   24 
  2 files changed, 32 insertions(+), 11 deletions(-)
  create mode 100755 scripts/make-release
 
 diff --git a/Makefile b/Makefile
 index 9707fa0..abf825d 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -31,7 +31,7 @@ Makefile: ;
  configure: ;
  
  .PHONY: all clean cscope distclean dvi html info install install-doc \
 - pdf recurse-all speed tar tarbin test build-all
 + pdf recurse-all speed test build-all dist
  
  $(call set-vpath, $(SRC_PATH):$(SRC_PATH)/hw)
  
 @@ -232,6 +232,13 @@ clean:
   rm -f $$d/qemu-options.def; \
  done
  
 +VERSION ?= $(shell cat VERSION)
 +
 +dist: qemu-$(VERSION).tar.bz2
 +
 +qemu-%.tar.bz2:
 + $(SRC_PATH)/scripts/make-release $(SRC_PATH) $(patsubst 
 qemu-%.tar.bz2,%,$@)
 +
  distclean: clean
   rm -f config-host.mak config-host.h* config-host.ld $(DOCS) 
 qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
   rm -f config-all-devices.mak
 @@ -390,15 +397,5 @@ qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
   qemu-img.texi qemu-nbd.texi qemu-options.texi \
   qemu-monitor.texi qemu-img-cmds.texi
  
 -VERSION ?= $(shell cat VERSION)
 -FILE = qemu-$(VERSION)
 -
 -# tar release (use 'make -k tar' on a checkouted tree)
 -tar:
 - rm -rf /tmp/$(FILE)
 - cp -r . /tmp/$(FILE)
 - cd /tmp  tar zcvf ~/$(FILE).tar.gz $(FILE) --exclude CVS --exclude 
 .git --exclude .svn
 - rm -rf /tmp/$(FILE)
 -
  # Include automatically generated dependency files
  -include $(wildcard *.d audio/*.d slirp/*.d block/*.d net/*.d ui/*.d 
 qapi/*.d qga/*.d)
 diff --git a/scripts/make-release b/scripts/make-release
 new file mode 100755
 index 000..196c755
 --- /dev/null
 +++ b/scripts/make-release
 @@ -0,0 +1,24 @@
 +#!/bin/bash -e
 +#
 +# QEMU Release Script
 +#
 +# Copyright IBM, Corp. 2012
 +#
 +# Authors:
 +#  Anthony Liguori aligu...@us.ibm.com
 +#
 +# This work is licensed under the terms of the GNU GPLv2 or later.
 +# See the COPYING file in the top-level directory.
 +
 +src=$1
 +version=$2
 +destination=qemu-${version}
 +
 +git clone ${src} ${destination}
 +pushd ${destination}
 +git checkout v${version}
 +git submodule update --init
 +rm -rf .git roms/*/.git
 +popd
 +tar cfj ${destination}.tar.bz2 ${destination}
 +rm -rf ${destination}

Fancy providing an XZ compressed archive, in addition to the bz2 one?
It is almost 20% smaller with XZ with default compression levels...

$ ls -ahl qemu-1.1.1-1.tar*
-rw-rw-r--. 1 berrange berrange 9.2M Jul 17 19:20 qemu-1.1.1-1.tar.bz2
-rw-rw-r--. 1 berrange berrange 7.6M Jul 18 15:03 qemu-1.1.1-1.tar.xz

You can get it down to 7.3M if you use xz --best

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-18 Thread Daniel P. Berrange

On Wed, Jul 18, 2012 at 04:02:15PM +0200, Paolo Bonzini wrote:
 On 18/07/2012 15:58, Daniel P. Berrange wrote:
  How is error reporting dealt with
 
 These APIs just return errno values.

Which has led to somewhat unhelpful error reporting in the past. If we're
designing a library API it'd be nice to improve on this.

 
  , and what is the intent around
  thread safety of the APIs ?  I'd like to see a fully thread safe
  API - multiple threads can use the same 'BlockDriverState *'
  concurrently, and thread-local error reporting.
 
 This is a bit difficult to provide, since the QEMU block layer itself is
 not thread-safe.

Yep, I'd expect that this is something we'd need to fix when turning the
code into a library.  NB, I don't mean to say QEMU should protect against
an app doing stupid things like letting 2 threads write to the same area
of the file at once. That's up to the application. I simply mean that the
BlockDriverState shouldn't corrupt itself if 2 separate APIs are called
concurrently on the same instance.
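
As a rough illustration of the thread-local error reporting asked for above, each thread could keep its own last-error buffer, so that a failure in one thread never overwrites the message another thread is about to read. This is only a sketch: every demo_* name is hypothetical and nothing here is an existing QEMU or library API.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: one error buffer per thread (C11 _Thread_local). */
static _Thread_local char demo_last_error[128];

/* Record a human-readable message for the calling thread only. */
static void demo_set_error(const char *op, int err)
{
    snprintf(demo_last_error, sizeof demo_last_error,
             "%s: %s", op, strerror(err));
}

/* Return the calling thread's last error message ("" if none). */
static const char *demo_get_last_error(void)
{
    return demo_last_error;
}

/* Stand-in for a library entry point: returns 0 or a negative errno value. */
static int demo_open(const char *filename)
{
    if (filename == NULL || filename[0] == '\0') {
        demo_set_error("demo_open", EINVAL);
        return -EINVAL;
    }
    demo_last_error[0] = '\0';   /* success clears the previous message */
    return 0;
}
```

The return value stays an errno-style code, as the current APIs do; the extra string is what would let a caller report something more helpful than the bare number.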

 Another missing feature is passwords.

Oh yes.

Daniel

[Qemu-devel] [PATCH 11/11] configure: Check for -Werror causing failures when compiling tests

2012-07-18 Thread Peter Maydell

Add support for checking whether test case code can compile without
warnings, by recompiling each successful test with -Werror. If the
-Werror version doesn't pass, we bail out. This gives us the same
level of visibility of warnings in test code as --enable-werror
provides for the main compile.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |   32 
 1 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 8140464..1939bdb 100755
--- a/configure
+++ b/configure
@@ -27,16 +27,40 @@ printf " '%s'" "$0" "$@" >> config.log
 echo >> config.log
 echo "#" >> config.log
 
+do_cc() {
+    # Run the compiler, capturing its output to the log.
+    echo $cc "$@" >> config.log
+    $cc "$@" >> config.log 2>&1 || return $?
+    # Test passed. If this is an --enable-werror build, rerun
+    # the test with -Werror and bail out if it fails. This
+    # makes warning-generating-errors in configure test code
+    # obvious to developers.
+    if test "$werror" != "yes"; then
+        return 0
+    fi
+    # Don't bother rerunning the compile if we were already using -Werror
+    case "$*" in
+        *-Werror*)
+           return 0
+        ;;
+    esac
+    echo $cc -Werror "$@" >> config.log
+    $cc -Werror "$@" >> config.log 2>&1 && return $?
+    echo "ERROR: configure test passed without -Werror but failed with -Werror."
+    echo "This is probably a bug in the configure script. The failing command"
+    echo "will be at the bottom of config.log."
+    echo "You can run configure with --disable-werror to bypass this check."
+    exit 1
+}
+
 compile_object() {
-  echo $cc $QEMU_CFLAGS -c -o $TMPO $TMPC >> config.log
-  $cc $QEMU_CFLAGS -c -o $TMPO $TMPC >> config.log 2>&1
+  do_cc $QEMU_CFLAGS -c -o $TMPO $TMPC
 }
 
 compile_prog() {
   local_cflags="$1"
   local_ldflags="$2"
-  echo $cc $QEMU_CFLAGS $local_cflags -o $TMPE $TMPC $LDFLAGS $local_ldflags >> config.log
-  $cc $QEMU_CFLAGS $local_cflags -o $TMPE $TMPC $LDFLAGS $local_ldflags >> config.log 2>&1
+  do_cc $QEMU_CFLAGS $local_cflags -o $TMPE $TMPC $LDFLAGS $local_ldflags
 }
 
 # symbolically link $1 to $2.  Portable version of ln -sf.
-- 
1.7.5.4

Re: [Qemu-devel] QMP event for suspend to disk

2012-07-18 Thread Eric Blake

On 07/18/2012 08:02 AM, Luiz Capitulino wrote:
 Hi,
 
 QEMU can now distinguish between S4 and power off for guest OSes that
 suspend to disk through S4. This means that we can have a QMP event for S4
 (so that QMP clients can distinguish between S3 and S4).
 
 However, as we already emit the SHUTDOWN event for this, it turns out that
 there are three different ways of supporting the new event. We could:
 
  1. not emit the SHUTDOWN event and only emit the SUSPEND_DISK event
 
 cons: breaks compatibility, as we'll stop sending an event we send today.

At least the new 'query-events' monitor command will allow libvirt to
know which events to listen for; so this is workable.

 
  2. emit both, that is, first SUSPEND_DISK and then SHUTDOWN
 
 cons: emitting two very similar events.

This has the most back-compat, but is noisier in that it generates more
events.

 
  3. extend the SHUTDOWN event to say whether it's a suspend to disk or not,
 like this:
 
    { "event": "SHUTDOWN", "data": { "suspend-to-disk": true }, ... }
 
 cons: we already have the SUSPEND event, so from an API perspective
   it would make sense to have SUSPEND_DISK.
 
 I prefer item 3, because it seems to be cleanest one.

I'm also liking 3, then 1, then 2; but think any of them will work from
libvirt's POV.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH] vfio-powerpc: added VFIO support (v4)

2012-07-18 Thread Alex Williamson

On Wed, 2012-07-18 at 21:09 +1000, Alexey Kardashevskiy wrote:
 It literally does the following:
 
 1. POWERPC IOMMU support (the kernel counterpart is required)
 
 2. The patch assumes that IOAPIC calls are going to be replaced
 with something generic.
 
 3. Added sPAPRVFIOData (hw/spapr_iommu_vfio.h) which describes
 the interface between VFIO and sPAPR IOMMU.
 
 4. Change sPAPR PHB to scan the PCI bus which is used for
 the IOMMU-VFIO group. Now it is enough to add the following to
 the QEMU command line to get VFIO up with all the devices from
 IOMMU group with id=3:
 -device spapr-pci-host-bridge,busname=E1000E,buid=0x3,iommu=3,\
  
 mem_win_addr=0x2300,io_win_addr=0x2400,msi_win_addr=0x2500
 
 With the patches posted a bit earlier today, this patch fully supports
 VFIO, including MSI-X.
 
 ps. yes, I know that linux_vfio.h has moved, will fix it later :)
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/linux-vfio.h   |   26 +++
  hw/ppc/Makefile.objs  |3 ++
  hw/spapr.h|4 ++
  hw/spapr_iommu.c  |   62 -
  hw/spapr_iommu_vfio.h |   34 ++
  hw/spapr_pci.c|  124 
 +++--
  hw/spapr_pci.h|6 +++
  hw/vfio_pci.c |   64 +
  hw/vfio_pci.h |2 +
  trace-events  |1 +
  10 files changed, 320 insertions(+), 6 deletions(-)
  create mode 100644 hw/spapr_iommu_vfio.h
 
 diff --git a/hw/linux-vfio.h b/hw/linux-vfio.h
 index 300d49b..27a0501 100644
 --- a/hw/linux-vfio.h
 +++ b/hw/linux-vfio.h
 @@ -442,4 +442,30 @@ struct vfio_iommu_type1_dma_unmap {
  
  #define VFIO_IOMMU_UNMAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 14)
  
 +/*
 + * Interface to SPAPR TCE (POWERPC Book3S)
 + */
 +#define SPAPR_TCE_IOMMU 2
 +
 +struct tce_iommu_info {
 +__u32 argsz;
 +__u32 flags;
 +__u32 dma32_window_start;
 +__u32 dma32_window_size;
 +__u64 dma64_window_start;
 +__u64 dma64_window_size;
 +};
 +
 +#define SPAPR_TCE_IOMMU_GET_INFO    _IO(VFIO_TYPE, VFIO_BASE + 12)
 +
 +struct tce_iommu_dma_map {
 +__u32 argsz;
 +__u32 flags;
 +__u64 va;
 +__u64 dmaaddr;
 +};
 +
 +#define SPAPR_TCE_IOMMU_MAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 13)
 +#define SPAPR_TCE_IOMMU_UNMAP_DMA   _IO(VFIO_TYPE, VFIO_BASE + 14)
 +
  #endif /* VFIO_H */
 diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
 index f573a95..c46a049 100644
 --- a/hw/ppc/Makefile.objs
 +++ b/hw/ppc/Makefile.objs
 @@ -25,4 +25,7 @@ obj-$(CONFIG_FDT) += ../device_tree.o
  # Xilinx PPC peripherals
  obj-y += xilinx_ethlite.o
  
 +# VFIO PCI device assignment
 +obj-$(CONFIG_VFIO_PCI) += vfio_pci.o
 +
  obj-y := $(addprefix ../,$(obj-y))
 diff --git a/hw/spapr.h b/hw/spapr.h
 index b37f337..0c15c88 100644
 --- a/hw/spapr.h
 +++ b/hw/spapr.h
 @@ -340,4 +340,8 @@ int spapr_dma_dt(void *fdt, int node_off, const char 
 *propname,
  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
DMAContext *dma);
  
 +struct sPAPRVFIOData;
 +void spapr_vfio_init_dma(int group_id, uint32_t liobn,
 + struct sPAPRVFIOData *data);
 +
  #endif /* !defined (__HW_SPAPR_H__) */
 diff --git a/hw/spapr_iommu.c b/hw/spapr_iommu.c
 index 50c288d..0a82842 100644
 --- a/hw/spapr_iommu.c
 +++ b/hw/spapr_iommu.c
 @@ -23,6 +23,8 @@
  #include "dma.h"
  
  #include "hw/spapr.h"
 +#include "hw/spapr_iommu_vfio.h"
 +#include "hw/vfio_pci.h"
  
  #include <libfdt.h>
  
 @@ -183,6 +185,60 @@ static int put_tce_emu(target_ulong liobn, target_ulong 
 ioba, target_ulong tce)
  return 0;
  }
  
 +typedef struct sPAPRVFIOTable {
 +    struct sPAPRVFIOData *data;
 +    uint32_t liobn;
 +    QLIST_ENTRY(sPAPRVFIOTable) list;
 +} sPAPRVFIOTable;
 +
 +QLIST_HEAD(vfio_tce_tables, sPAPRVFIOTable) vfio_tce_tables;
 +
 +void spapr_vfio_init_dma(int group_id, uint32_t liobn,
 +                         struct sPAPRVFIOData *data)
 +{
 +    sPAPRVFIOTable *t;
 +
 +    t = g_malloc0(sizeof(*t));
 +    t->data = data;
 +    t->liobn = liobn;
 +
 +    QLIST_INSERT_HEAD(&vfio_tce_tables, t, list);
 +}
 +
 +static int put_tce_vfio(uint32_t liobn, target_ulong ioba, target_ulong tce)
 +{
 +    sPAPRVFIOTable *t;
 +    struct tce_iommu_dma_map map = {
 +        .argsz = sizeof(map),
 +        .va = 0,
 +        .dmaaddr = ioba,
 +    };
 +
 +    QLIST_FOREACH(t, &vfio_tce_tables, list) {
 +        if (t->liobn != liobn) {
 +            continue;
 +        }
 +        if (!t->data) {
 +            return H_NO_MEM;
 +        }

Why would this ever happen?

 +        if (tce) {
 +            map.va = (uintptr_t)qemu_get_ram_ptr(tce & ~SPAPR_TCE_PAGE_MASK);
 +
 +            if (t->data->map(t->data->groupid, &map)) {

Just pass t->data, this is why the VFIOContainer has a union.

 +                perror("TCE_MAP_DMA");
 +                return H_PARAMETER;
 +            }
 +        } else {
 +            if

[Qemu-devel] [PATCH 09/11] configure: Fix compile warning in utimensat/futimens test

2012-07-18 Thread Peter Maydell

Fix compile warning in the utimensat/futimens test (implicit
declaration of function 'utimensat', ditto futimens) by
adding a missing include.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 9c2a84d..638e486 100755
--- a/configure
+++ b/configure
@@ -2341,6 +2341,7 @@ cat > $TMPC << EOF
 #define _ATFILE_SOURCE
 #include <stddef.h>
 #include <fcntl.h>
+#include <sys/stat.h>
 
 int main(void)
 {
-- 
1.7.5.4

[Qemu-devel] [PATCH 08/11] configure: Fix warnings in VDE library probe

2012-07-18 Thread Peter Maydell

Fix compile warnings in the VDE library probe (passing argument 1 of
'vde_open_real' discards 'const' qualifier from pointer target type,
ditto argument 2).

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 784325a..9c2a84d 100755
--- a/configure
+++ b/configure
@@ -1820,7 +1820,8 @@ if test "$vde" != "no" ; then
 int main(void)
 {
     struct vde_open_args a = {0, 0, 0};
-    vde_open("", "", &a);
+    char s[] = "";
+    vde_open(s, s, &a);
     return 0;
 }
 EOF
-- 
1.7.5.4

[Qemu-devel] [PATCH v4] add -machine mem-merge=on|off option

2012-07-18 Thread Luiz Capitulino

This option allows disabling memory merge support (KSM on Linux), which is
otherwise enabled by default.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---

IMPORTANT: this is on top of this series:

   http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01798.html

o v4

- rename option to mem-merge
- rebase on top of latest master
- rebase on top of the machine option rename series

 exec.c  | 19 ---
 qemu-config.c   |  4 
 qemu-options.hx |  7 ++-
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index c9fa17d..59d29ae 100644
--- a/exec.c
+++ b/exec.c
@@ -2510,6 +2510,19 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev)
     }
 }
 
+static int memory_try_enable_merging(void *addr, size_t len)
+{
+    QemuOpts *opts;
+
+    opts = qemu_opts_find(qemu_find_opts("machine"), 0);
+    if (opts && !qemu_opt_get_bool(opts, "mem-merge", true)) {
+        /* disabled by the user */
+        return 0;
+    }
+
+    return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
+}
+
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr)
 {
@@ -2529,7 +2542,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
             new_block->host = file_ram_alloc(new_block, size, mem_path);
             if (!new_block->host) {
                 new_block->host = qemu_vmalloc(size);
-                qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
+                memory_try_enable_merging(new_block->host, size);
             }
 #else
             fprintf(stderr, "-mem-path option unsupported\n");
@@ -2544,7 +2557,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
             } else {
                 new_block->host = qemu_vmalloc(size);
             }
-            qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
+            memory_try_enable_merging(new_block->host, size);
         }
     }
     new_block->length = size;
@@ -2670,7 +2683,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
                             length, addr);
                     exit(1);
                 }
-                qemu_madvise(vaddr, length, QEMU_MADV_MERGEABLE);
+                memory_try_enable_merging(vaddr, length);
             }
             return;
         }
diff --git a/qemu-config.c b/qemu-config.c
index 9dac3be..3355e45 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -599,6 +599,10 @@ static QemuOptsList qemu_machine_opts = {
             .alias= "dt_compatible",
             .type = QEMU_OPT_STRING,
             .help = "Overrides the \"compatible\" property of the dt root node",
+        }, {
+            .name = "mem-merge",
+            .type = QEMU_OPT_BOOL,
+            .help = "enable/disable memory merge support",
         },
         { /* End of list */ }
     },
diff --git a/qemu-options.hx b/qemu-options.hx
index 90eab87..25ce07e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -33,7 +33,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     "                property accel=accel1[:accel2[:...]] selects accelerator\n"
     "                supported accelerators are kvm, xen, tcg (default: tcg)\n"
     "                kernel-irqchip=on|off controls accelerated irqchip support\n"
-    "                kvm-shadow-mem=size of KVM shadow MMU\n",
+    "                kvm-shadow-mem=size of KVM shadow MMU\n"
+    "                mem-merge=on|off controls memory merge support (default: on)\n",
     QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -50,6 +51,10 @@ to initialize.
 Enables in-kernel irqchip support for the chosen accelerator when available.
 @item kvm-shadow-mem=size
 Defines the size of the KVM shadow MMU.
+@item mem-merge=on|off
+Enables or disables memory merge support. This feature, when supported by
+the host, de-duplicates identical memory pages among VM instances
+(enabled by default).
 @end table
 ETEXI
 
-- 
1.7.11.1.116.g8228a23

[Qemu-devel] [PATCH] improve scripts/make-release

2012-07-18 Thread Gerd Hoffmann

'make dist' creates a tarball for the current checkout.
'make qemu-${version}.tar.bz2' creates a tarball for git tag v${version}.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 Makefile |5 ++---
 scripts/make-release |8 +++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index ab82ef3..4a5f399 100644
--- a/Makefile
+++ b/Makefile
@@ -233,9 +233,8 @@ clean:
rm -f $$d/qemu-options.def; \
 done
 
-VERSION ?= $(shell cat VERSION)
-
-dist: qemu-$(VERSION).tar.bz2
+dist:
+   $(SRC_PATH)/scripts/make-release $(SRC_PATH)
 
 qemu-%.tar.bz2:
$(SRC_PATH)/scripts/make-release $(SRC_PATH) $(patsubst 
qemu-%.tar.bz2,%,$@)
diff --git a/scripts/make-release b/scripts/make-release
index 196c755..181ca77 100755
--- a/scripts/make-release
+++ b/scripts/make-release
@@ -12,11 +12,17 @@
 
 src=$1
 version=$2
+if test "$version" = ""; then
+   commit=$(git describe --long)
+   version=${commit#v}
+else
+   commit=v${version}
+fi
 destination=qemu-${version}
 
 git clone ${src} ${destination}
 pushd ${destination}
-git checkout v${version}
+git checkout ${commit}
 git submodule update --init
 rm -rf .git roms/*/.git
 popd
-- 
1.7.1

[Qemu-devel] [PATCH 03/11] configure: Fix build with capabilities

2012-07-18 Thread Peter Maydell

From: Stefan Weil s...@weilnetz.de

Since commit 417c9d72d48275d19c60861896efd4962d21aca2 all configure tests
normally run with -Werror. Some of these tests now fail because they
raised a compiler warning.

This patch fixes support for capabilities.

Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 63156a7..feadbe7 100755
--- a/configure
+++ b/configure
@@ -2083,7 +2083,7 @@ if test "$cap" != "no" ; then
   cat > $TMPC <<EOF
 #include <stdio.h>
 #include <sys/capability.h>
-int main(void) { cap_t caps; caps = cap_init(); }
+int main(void) { cap_t caps; caps = cap_init(); return caps != NULL; }
 EOF
   if compile_prog "" "-lcap" ; then
 cap=yes
-- 
1.7.5.4

[Qemu-devel] [PATCH 06/11] configure: Fix errors in test for__sync_fetch_and_and

2012-07-18 Thread Peter Maydell

From: Stefan Weil s...@weilnetz.de

The old test code raises two compiler warnings which are errors since
commit 417c9d72d48275d19c60861896efd4962d21aca2.

These errors could cause the test to fail spuriously, so that QEMU was
compiled with -march=i486 (losing all the nice features of newer processors).

Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 3de4ea7..aced52e 100755
--- a/configure
+++ b/configure
@@ -2797,7 +2797,7 @@ fi
 # specification is necessary
 if test "$vhost_net" = "yes" && test "$cpu" = "i386"; then
   cat > $TMPC << EOF
-int sfaa(unsigned *ptr)
+static int sfaa(int *ptr)
 {
   return __sync_fetch_and_and(ptr, 0);
 }
-- 
1.7.5.4
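
For readers unfamiliar with the builtin being probed: __sync_fetch_and_and is a GCC atomic builtin that ANDs a mask into a memory location and returns the value that was there before. The sketch below only shows those semantics; demo_fetch_and_and is a hypothetical wrapper name. The configure test presumably exists because a compiler targeting a plain i386 cannot expand this builtin inline, which is why the following patch deals with -march=i486.

```c
/*
 * Hypothetical wrapper around the GCC builtin: atomically performs
 * *ptr &= mask and returns the value *ptr held before the operation.
 */
static int demo_fetch_and_and(int *ptr, int mask)
{
    return __sync_fetch_and_and(ptr, mask);
}
```

With a mask of 0, as in the configure test, the call clears the variable and hands back its old contents.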

[Qemu-devel] [PATCH 05/11] configure: -march=i486 belongs in QEMU_CFLAGS, not CFLAGS

2012-07-18 Thread Peter Maydell

The distinction between QEMU_CFLAGS and CFLAGS is that the
former is for flags without which QEMU can't compile, whereas
the latter is for flags like -g -O2 which the user can
safely override. -march=i486 is in the former category, and
so belongs in QEMU_CFLAGS.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index f54415d..3de4ea7 100755
--- a/configure
+++ b/configure
@@ -2810,7 +2810,7 @@ int main(int argc, char **argv)
 }
 EOF
   if ! compile_prog "" "" ; then
-    CFLAGS="-march=i486 $CFLAGS"
+    QEMU_CFLAGS="-march=i486 $QEMU_CFLAGS"
   fi
 fi
 
-- 
1.7.5.4

Re: [Qemu-devel] [PATCH v4 4/5] prepare to create HPET, RTC and i8254 through composition

2012-07-18 Thread Markus Armbruster

Wanpeng Li liw...@linux.vnet.ibm.com writes:

 [CCing ML]
  
  From: Anthony Liguori aligu...@us.ibm.com

 The HPET usually sits on the LPC bus (which replaces ISA in modern systems).
 It's sometimes a dedicated chip but can certainly co-exist in a Super IO chip.
 I think in terms of where it would live in this hypothetical device model,
 putting it in the PIIX is rational.

Could you explain briefly why you have to move struct definitions from
.c to .h?

Re: [Qemu-devel] [PATCH] improve scripts/make-release

2012-07-18 Thread Eric Blake

On 07/18/2012 08:31 AM, Gerd Hoffmann wrote:
 'make dist' creates a tarball for the current checkout.
 'make qemu-${version}.tar.bz2' creates a tarball for git tag v${version}.
 
 Signed-off-by: Gerd Hoffmann kra...@redhat.com
 ---
  Makefile |5 ++---
  scripts/make-release |8 +++-
  2 files changed, 9 insertions(+), 4 deletions(-)

 +++ b/scripts/make-release
 @@ -12,11 +12,17 @@
  
  src=$1
  version=$2
 +if test "$version" = ""; then

test ! "$version"

is less typing, and still POSIX compliant.

 + commit=$(git describe --long)
 + version=${commit#v}

Sticking with my earlier theme started against Anthony's original
implementation of this script, in complaining about inconsistent shell
quoting styles, it might look nicer to favor minimal quoting:

commit=$(git describe --long)
version=${commit#v}

or maximal quoting:

commit="$(git describe --long)"
version="${commit#v}"

rather than an ad hoc mix.  But as that is cosmetic, and does not impact
functionality, you have my:

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




[Qemu-devel] [PATCH 01/11] configure: Don't run configure tests with -Werror enabled

2012-07-18 Thread Peter Maydell

Don't run configure tests with -Werror in the compiler flags. The idea
of -Werror is that it makes problems very obvious to developers, so
they get fixed quickly. However, when running configure tests, failures
due to -Werror are far from obvious -- they simply result in the test
quietly failing when it should have passed. Not using -Werror is in
line with recommended practice in the Autoconf world.

This commit is essentially backing out the changes in commit 417c9d72.
Instead we fix the problem that commit was trying to address in a
different way: we add -Werror only for the test of the nss headers,
with a comment that this is specifically intended to detect a bug
in some releases of nss.

We also have to clean up a bug in the smartcard test where it was
trying to include smartcard_cflags in the test compile flags: this
would always result in a failure with -Werror, because they include
an escaped $(SRC_PATH) which is only valid when used in the final
makefile.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Reviewed-by: Stefan Weil s...@weilnetz.de
---
 configure |   22 ++
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 0a3896e..383fa3d 100755
--- a/configure
+++ b/configure
@@ -1156,9 +1156,10 @@ gcc_flags="-Wold-style-declaration -Wold-style-definition -Wtype-limits"
 gcc_flags="-Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers $gcc_flags"
 gcc_flags="-Wmissing-include-dirs -Wempty-body -Wnested-externs $gcc_flags"
 gcc_flags="-fstack-protector-all -Wendif-labels $gcc_flags"
-if test "$werror" = "yes" ; then
-    gcc_flags="-Werror $gcc_flags"
-fi
+# Note that we do not add -Werror to gcc_flags here, because that would
+# enable it for all configure tests. If a configure test failed due
+# to -Werror this would just silently disable some features,
+# so it's too error prone.
 cat > $TMPC << EOF
 int main(void) { return 0; }
 EOF
@@ -2656,8 +2657,16 @@ EOF
     smartcard_cflags="-I\$(SRC_PATH)/libcacard"
     libcacard_libs="$($pkg_config --libs nss 2>/dev/null) $glib_libs"
     libcacard_cflags="$($pkg_config --cflags nss 2>/dev/null) $glib_cflags"
+    test_cflags="$libcacard_cflags"
+    # The header files in nss < 3.13.3 have a bug which causes them to
+    # emit a warning. If we're going to compile QEMU with -Werror, then
+    # test that the headers don't have this bug. Otherwise we would pass
+    # the configure test but fail to compile QEMU later.
+    if test "$werror" = "yes"; then
+        test_cflags="-Werror $test_cflags"
+    fi
     if $pkg_config --atleast-version=3.12.8 nss >/dev/null 2>&1 && \
-      compile_prog "$smartcard_cflags $libcacard_cflags" "$libcacard_libs"; then
+      compile_prog "$test_cflags" "$libcacard_libs"; then
         smartcard_nss="yes"
         QEMU_CFLAGS="$QEMU_CFLAGS $smartcard_cflags $libcacard_cflags"
         libs_softmmu="$libcacard_libs $libs_softmmu"
@@ -2903,6 +2912,11 @@ if test -z "$zero_malloc" ; then
     fi
 fi
 
+# Now we've finished running tests it's OK to add -Werror to the compiler flags
+if test "$werror" = "yes"; then
+    QEMU_CFLAGS="-Werror $QEMU_CFLAGS"
+fi
+
 if test "$solaris" = "no" ; then
     if $ld --version 2>/dev/null | grep "GNU ld" >/dev/null 2>/dev/null ; then
         LDFLAGS="-Wl,--warn-common $LDFLAGS"
-- 
1.7.5.4

[Qemu-devel] [PATCH 02/11] configure: Fix build with ALSA audio driver

2012-07-18 Thread Peter Maydell

From: Stefan Weil s...@weilnetz.de

Since commit 417c9d72d48275d19c60861896efd4962d21aca2,
all configure tests normally run with -Werror.

Some of these tests now fail because they raised a compiler warning.

This patch fixes a build breakage for ALSA (configure --audio-drv-list=alsa).

Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 383fa3d..63156a7 100755
--- a/configure
+++ b/configure
@@ -1889,7 +1889,7 @@ for drv in $audio_drv_list; do
 case $drv in
 alsa)
     audio_drv_probe $drv alsa/asoundlib.h -lasound \
-        "snd_pcm_t **handle; return snd_pcm_close(*handle);"
+        "return snd_pcm_close((snd_pcm_t *)0);"
     libs_softmmu="-lasound $libs_softmmu"
 ;;
 
-- 
1.7.5.4

Re: [Qemu-devel] Can't Build i386-bsd-user on Freebsd

2012-07-18 Thread Wei-Ren Chen

  CC'ing Blue, since he is the bsd-user maintainer according to the MAINTAINERS file.

On Mon, Jul 16, 2012 at 07:07:38PM -0700, Paramjot Oberoi wrote:
 Hey all,
 
 I'm having trouble building user mode BSD emulation on FreeBSD. I've tried
 1.0.1, 1.1.1, and stable from Git. I build with ./configure
 --target-list=i386-bsd-user and then run gmake. The first error I get
 concerns CTLTYPE_QUAD.
 
 /usr/home/qemu-1.0.1/bsd-user/syscall.c: In function 'sysctl_oldcvt':
 /usr/home/qemu-1.0.1/bsd-user/syscall.c:214: error: 'CTLTYPE_QUAD' undeclared
 (first use in this function)
 /usr/home/qemu-1.0.1/bsd-user/syscall.c:214: error: (Each undeclared 
 identifier
 is reported only once
 /usr/home/qemu-1.0.1/bsd-user/syscall.c:214: error: for each function it
 appears in.)
 gmake[1]: *** [syscall.o] Error 1
 gmake: *** [subdir-i386-bsd-user] Error 2
 
 To fix this error I added an #ifdef/#endif around the switch statement as
 specified in this thread:
 http://comments.gmane.org/gmane.comp.emulators.qemu/104657
 Now I'm stuck with the following error:
 
   CC    i386-bsd-user/helper.o
   CC    i386-bsd-user/cpu.o
   CC    i386-bsd-user/disas.o
   CC    i386-bsd-user/ioport-user.o
   LINK  i386-bsd-user/qemu-i386
 /usr/local/lib/libgthread-2.0.so: could not read symbols: File in wrong format
 gmake[1]: *** [qemu-i386] Error 1
 gmake: *** [subdir-i386-bsd-user] Error 2
 
 Any advice on how to get it to build? I'm running 32-bit FreeBSD. Thanks in
 advance, I appreciate it .

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj

[Qemu-devel] [RFC v9 02/27] virtio-blk: Set up host notifier for data plane

2012-07-18 Thread Stefan Hajnoczi

Set up the virtqueue notify ioeventfd that the data plane will monitor.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/virtio-blk.c |   37 +
 1 file changed, 37 insertions(+)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index a627427..0389294 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -26,6 +26,8 @@ typedef struct VirtIOBlock
     char *serial;
     unsigned short sector_mask;
     DeviceState *qdev;
+
+    bool data_plane_started;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -33,6 +35,39 @@ static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
     return (VirtIOBlock *)vdev;
 }
 
+static void virtio_blk_data_plane_start(VirtIOBlock *s)
+{
+    if (s->vdev.binding->set_host_notifier(s->vdev.binding_opaque, 0, true) != 0) {
+        fprintf(stderr, "virtio-blk failed to set host notifier\n");
+        return;
+    }
+
+    s->data_plane_started = true;
+}
+
+static void virtio_blk_data_plane_stop(VirtIOBlock *s)
+{
+    s->data_plane_started = false;
+
+    s->vdev.binding->set_host_notifier(s->vdev.binding_opaque, 0, false);
+}
+
+static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t val)
+{
+    VirtIOBlock *s = to_virtio_blk(vdev);
+
+    /* Toggle host notifier only on status change */
+    if (s->data_plane_started == !!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
+        return;
+    }
+
+    if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
+        virtio_blk_data_plane_start(s);
+    } else {
+        virtio_blk_data_plane_stop(s);
+    }
+}
+
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     fprintf(stderr, "virtio_blk_handle_output: should never get here,"
@@ -115,6 +150,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf,
 
     s->vdev.get_config = virtio_blk_update_config;
     s->vdev.get_features = virtio_blk_get_features;
+    s->vdev.set_status = virtio_blk_set_status;
     s->bs = conf->bs;
     s->conf = conf;
     s->serial = *serial;
@@ -122,6 +158,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf,
     bdrv_guess_geometry(s->bs, &cylinders, &heads, &secs);
 
     s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
+    s->data_plane_started = false;
 
     s->qdev = dev;
     bdrv_set_buffer_alignment(s->bs, conf->logical_block_size);
-- 
1.7.10.4

[Qemu-devel] [RFC v9 20/27] virtio-blk: Add ioscheduler to detect mergable requests

2012-07-18 Thread Stefan Hajnoczi

---
 hw/dataplane/iosched.h |   78 
 hw/virtio-blk.c|5 
 2 files changed, 83 insertions(+)
 create mode 100644 hw/dataplane/iosched.h

diff --git a/hw/dataplane/iosched.h b/hw/dataplane/iosched.h
new file mode 100644
index 000..12ebccc
--- /dev/null
+++ b/hw/dataplane/iosched.h
@@ -0,0 +1,78 @@
+#ifndef IOSCHED_H
+#define IOSCHED_H
+
+#include "hw/dataplane/ioq.h"
+
+typedef struct {
+    unsigned long iocbs;
+    unsigned long merges;
+    unsigned long sched_calls;
+} IOSched;
+
+static int iocb_cmp(const void *a, const void *b)
+{
+    const struct iocb *iocb_a = a;
+    const struct iocb *iocb_b = b;
+
+    /*
+     * Note that we can't simply subtract req2->sector from req1->sector
+     * here as that could overflow the return value.
+     */
+    if (iocb_a->u.c.offset > iocb_b->u.c.offset) {
+        return 1;
+    } else if (iocb_a->u.c.offset < iocb_b->u.c.offset) {
+        return -1;
+    } else {
+        return 0;
+    }
+}
+
+static size_t iocb_nbytes(struct iocb *iocb)
+{
+    struct iovec *iov = iocb->u.c.buf;
+    size_t nbytes = 0;
+    size_t i;
+    for (i = 0; i < iocb->u.c.nbytes; i++) {
+        nbytes += iov->iov_len;
+        iov++;
+    }
+    return nbytes;
+}
+
+static void iosched_init(IOSched *iosched)
+{
+    memset(iosched, 0, sizeof *iosched);
+}
+
+static void iosched_print_stats(IOSched *iosched)
+{
+    fprintf(stderr, "iocbs = %lu merges = %lu sched_calls = %lu\n",
+            iosched->iocbs, iosched->merges, iosched->sched_calls);
+    memset(iosched, 0, sizeof *iosched);
+}
+
+static void iosched(IOSched *iosched, struct iocb *unsorted[], unsigned int count)
+{
+    struct iocb *sorted[count];
+    struct iocb *last;
+    unsigned int i;
+
+    if ((++iosched->sched_calls % 1000) == 0) {
+        iosched_print_stats(iosched);
+    }
+
+    memcpy(sorted, unsorted, sizeof sorted);
+    qsort(sorted, count, sizeof sorted[0], iocb_cmp);
+
+    iosched->iocbs += count;
+    last = sorted[0];
+    for (i = 1; i < count; i++) {
+        if (last->aio_lio_opcode == sorted[i]->aio_lio_opcode &&
+            last->u.c.offset + iocb_nbytes(last) == sorted[i]->u.c.offset) {
+            iosched->merges++;
+        }
+        last = sorted[i];
+    }
+}
+
+#endif /* IOSCHED_H */
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index f67fdb7..75cb0f2 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -22,6 +22,7 @@
 #include hw/dataplane/event-poll.h
 #include hw/dataplane/vring.h
 #include hw/dataplane/ioq.h
+#include hw/dataplane/iosched.h
 #include kvm.h
 
 enum {
@@ -57,6 +58,7 @@ typedef struct {
 EventHandler notify_handler;/* virtqueue notify handler */
 
 IOQueue ioqueue;/* Linux AIO queue (should really be per 
dataplane thread) */
+IOSched iosched;/* I/O scheduler */
 VirtIOBlockRequest requests[REQ_MAX]; /* pool of requests, managed by the 
queue */
 } VirtIOBlock;
 
@@ -249,6 +251,8 @@ static bool handle_notify(EventHandler *handler)
 }
 }
 
+iosched(&s->iosched, s->ioqueue.queue, s->ioqueue.queue_idx);
+
 /* Submit requests, if any */
 int rc = ioq_submit(&s->ioqueue);
 if (unlikely(rc < 0)) {
@@ -289,6 +293,7 @@ static void data_plane_start(VirtIOBlock *s)
 {
 int i;
 
+iosched_init(&s->iosched);
 vring_setup(&s->vring, &s->vdev, 0);
 
 /* Set up guest notifier (irq) */
-- 
1.7.10.4

[Qemu-devel] [RFC v9 26/27] msix: use upstream kvm_irqchip_set_irq()

2012-07-18 Thread Stefan Hajnoczi

Commit 9507e305ec54062fccc88fcf6fccf1898a7e7141 changed the
kvm_set_irq() function to kvm_irqchip_set_irq().

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/msix.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/msix.c b/hw/msix.c
index 0ed1013..373017a 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -512,7 +512,7 @@ bool msix_try_notify_from_thread(PCIDevice *dev, unsigned vector)
 return false;
 }
 if (likely(kvm_enabled() && kvm_irqchip_in_kernel())) {
-kvm_set_irq(dev->msix_irq_entries[vector].gsi, 1, NULL);
+kvm_irqchip_set_irq(kvm_state, dev->msix_irq_entries[vector].gsi, 1);
 return true;
 }
 return false;
-- 
1.7.10.4

[Qemu-devel] [RFC v9 24/27] virtio-blk: fix incorrect length

2012-07-18 Thread Stefan Hajnoczi
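
For context, the one-line change below swaps sizeof on a member for sizeof on
what the member points to.  A minimal sketch of why that matters, assuming
req->status is a pointer to a one-byte status field (the dereference in the fix
suggests this, but the struct definition is not shown here).  The names
demo_request, pushed_len_old and pushed_len_fixed are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a request whose status byte lives elsewhere. */
struct demo_request {
    uint8_t *status;   /* assumed: points at the one-byte status field */
};

/* Old calculation: adds the size of the pointer itself (4 or 8 bytes). */
static size_t pushed_len_old(const struct demo_request *req, size_t len)
{
    assert(req != NULL);
    return len + sizeof req->status;
}

/* Fixed calculation: adds the size of the pointed-to status byte. */
static size_t pushed_len_fixed(const struct demo_request *req, size_t len)
{
    assert(req != NULL);
    return len + sizeof(*req->status);
}
```

Under that assumption the old expression over-reports the length pushed onto
the vring by the pointer size minus one.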

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/virtio-blk.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 8734029..cff2298 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -131,7 +131,7 @@ static void complete_one_request(VirtIOBlockRequest *req, VirtIOBlock *s, ssize_
  * written to, but for virtio-blk it seems to be the number of bytes
  * transferred plus the status bytes.
  */
-vring_push(&s->vring, req->head, len + sizeof req->status);
+vring_push(&s->vring, req->head, len + sizeof(*req->status));
 }
 
 static bool is_request_merged(VirtIOBlockRequest *req)
-- 
1.7.10.4

[Qemu-devel] [RFC v9 00/27] virtio: virtio-blk data plane

2012-07-18 Thread Stefan Hajnoczi

This series implements a dedicated thread for virtio-blk processing using Linux
AIO for raw image files only.  It is based on qemu-kvm.git a0bc8c3 and is
somewhat old, but I wanted to share it on the list since it has been mentioned
on mailing lists and IRC recently.

These patches can be used for benchmarking and discussion about how to improve
block performance.  Paolo Bonzini has also worked in this area and might want
to share his patches.

The basic approach is:
1. Each virtio-blk device has a thread dedicated to handling ioeventfd
   signalling when the guest kicks the virtqueue.
2. Requests are processed without going through the QEMU block layer using
   Linux AIO directly.
3. Completion interrupts are injected via ioctl from the dedicated thread.
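
To illustrate step 1, here is a minimal sketch of the kick mechanism, assuming
the notifier is backed by a plain Linux eventfd (the code removed in patch 16
writes a uint64_t to the notifier's fd, which is consistent with that).  The
names demo_kick and demo_wait_for_kick are hypothetical and not part of the
series:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal the consumer: add 1 to the eventfd counter. Returns 0 on success. */
static int demo_kick(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/*
 * Consume all pending kicks with a single read.  A non-semaphore eventfd
 * returns the accumulated counter and resets it to zero, so several kicks
 * are coalesced into one wakeup.
 */
static uint64_t demo_wait_for_kick(int fd)
{
    uint64_t count = 0;
    ssize_t n = read(fd, &count, sizeof count);
    assert(n == (ssize_t)sizeof count);
    (void)n;
    return count;
}
```

The dedicated thread would block in the read (or poll on the fd) and then
process the vring until it is empty, so coalesced kicks are not lost.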

The series also contains request merging as a bdrv_aio_multiwrite() equivalent.
This was only to get a comparison against the QEMU block layer and I would drop
it for other types of analysis.
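
As a rough sketch of how mergeable requests can be detected (this mirrors the
idea in hw/dataplane/iosched.h but uses a simplified, hypothetical demo_req
struct rather than struct iocb): sort the batch by offset, then count
neighbours with the same opcode where one ends exactly where the next begins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified request: not the libaio struct iocb. */
struct demo_req {
    int opcode;        /* e.g. 0 = read, 1 = write */
    uint64_t offset;   /* byte offset in the image file */
    uint64_t nbytes;   /* total length of the request */
};

/* Compare by offset without subtracting, which could overflow an int. */
static int demo_req_cmp(const void *a, const void *b)
{
    const struct demo_req *ra = a;
    const struct demo_req *rb = b;

    if (ra->offset > rb->offset) {
        return 1;
    } else if (ra->offset < rb->offset) {
        return -1;
    }
    return 0;
}

/* Sort reqs in place and count adjacent pairs that could be merged. */
static size_t demo_count_mergeable(struct demo_req *reqs, size_t count)
{
    size_t merges = 0;
    size_t i;

    assert(reqs != NULL || count == 0);
    if (count == 0) {
        return 0;
    }
    qsort(reqs, count, sizeof reqs[0], demo_req_cmp);
    for (i = 1; i < count; i++) {
        const struct demo_req *prev = &reqs[i - 1];
        if (prev->opcode == reqs[i].opcode &&
            prev->offset + prev->nbytes == reqs[i].offset) {
            merges++;
        }
    }
    return merges;
}
```

Sorting the structs directly keeps the comparator simple; sorting an array of
pointers instead would need one extra dereference in demo_req_cmp.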

The effect of this series is that O_DIRECT Linux AIO on raw files can bypass
the QEMU global mutex and block layer.  This means higher performance.

A cleaned up version of this approach could be added to QEMU as a raw O_DIRECT
Linux AIO fast path.  Image file formats, protocols, and other block layer
features are not supported by virtio-blk-data-plane.

Git repo:
http://repo.or.cz/w/qemu-kvm/stefanha.git/shortlog/refs/heads/virtio-blk-data-plane

Stefan Hajnoczi (27):
  virtio-blk: Remove virtqueue request handling code
  virtio-blk: Set up host notifier for data plane
  virtio-blk: Data plane thread event loop
  virtio-blk: Map vring
  virtio-blk: Do cheapest possible memory mapping
  virtio-blk: Take PCI memory range into account
  virtio-blk: Put dataplane code into its own directory
  virtio-blk: Read requests from the vring
  virtio-blk: Add Linux AIO queue
  virtio-blk: Stop data plane thread cleanly
  virtio-blk: Indirect vring and flush support
  virtio-blk: Add workaround for BUG_ON() dependency in virtio_ring.h
  virtio-blk: Increase max requests for indirect vring
  virtio-blk: Use pthreads instead of qemu-thread
  notifier: Add a function to set the notifier
  virtio-blk: Kick data plane thread using event notifier set
  virtio-blk: Use guest notifier to raise interrupts
  virtio-blk: Call ioctl() directly instead of irqfd
  virtio-blk: Disable guest->host notifies while processing vring
  virtio-blk: Add ioscheduler to detect mergable requests
  virtio-blk: Add basic request merging
  virtio-blk: Fix request merging
  virtio-blk: Stub out SCSI commands
  virtio-blk: fix incorrect length
  msix: fix irqchip breakage in msix_try_notify_from_thread()
  msix: use upstream kvm_irqchip_set_irq()
  virtio-blk: add EVENT_IDX support to dataplane

 event_notifier.c  |7 +
 event_notifier.h  |1 +
 hw/dataplane/event-poll.h |  116 +++
 hw/dataplane/ioq.h|  128 
 hw/dataplane/iosched.h|   97 ++
 hw/dataplane/vring.h  |  334 
 hw/msix.c |   15 +
 hw/msix.h |1 +
 hw/virtio-blk.c   |  753 +
 hw/virtio-pci.c   |8 +
 hw/virtio.c   |9 +
 hw/virtio.h   |3 +
 12 files changed, 1074 insertions(+), 398 deletions(-)
 create mode 100644 hw/dataplane/event-poll.h
 create mode 100644 hw/dataplane/ioq.h
 create mode 100644 hw/dataplane/iosched.h
 create mode 100644 hw/dataplane/vring.h

-- 
1.7.10.4

[Qemu-devel] [RFC v9 17/27] virtio-blk: Use guest notifier to raise interrupts

2012-07-18 Thread Stefan Hajnoczi

The data plane thread isn't allowed to call virtio_irq() directly
because that function is not thread-safe.  Use the guest notifier just
like virtio-net to handle IRQs.

When MSI-X is in use and the vector is unmasked, the guest notifier
directly sets the IRQ inside the host kernel.  If the vector is masked,
then QEMU's iothread needs to take note of the IRQ.  If MSI-X is not in
use, then QEMU's iothread handles the IRQ and this will be slower than
synchronously calling notify_irq() from the data plane thread.
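
The new virtio_blk_notify_guest() also decides whether the guest wants an
interrupt at all.  As a sketch, that condition can be isolated as a pure
predicate.  The constants carry the values of VRING_AVAIL_F_NO_INTERRUPT and
VIRTIO_F_NOTIFY_ON_EMPTY; the function name and the flattened parameter list
are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_AVAIL_F_NO_INTERRUPT 1u  /* value of VRING_AVAIL_F_NO_INTERRUPT */
#define DEMO_F_NOTIFY_ON_EMPTY    24  /* bit number of VIRTIO_F_NOTIFY_ON_EMPTY */

/* Returns true if the guest should be interrupted after a completion. */
static bool demo_should_notify(uint16_t avail_flags,
                               uint16_t avail_idx,
                               uint16_t last_avail_idx,
                               uint32_t guest_features)
{
    bool suppressed = (avail_flags & DEMO_AVAIL_F_NO_INTERRUPT) != 0;
    bool ring_empty = avail_idx == last_avail_idx;
    bool notify_on_empty =
        (guest_features & (1u << DEMO_F_NOTIFY_ON_EMPTY)) != 0;

    /* Suppression is honoured unless the ring is empty and the guest
     * negotiated notify-on-empty. */
    if (suppressed && (!ring_empty || !notify_on_empty)) {
        return false;
    }
    return true;
}
```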
---
 hw/virtio-blk.c |   28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index d75c187..bdff68a 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -73,6 +73,18 @@ static int get_raw_posix_fd_hack(VirtIOBlock *s)
 return *(int*)s->bs->file->opaque;
 }
 
+/* Raise an interrupt to signal guest, if necessary */
+static void virtio_blk_notify_guest(VirtIOBlock *s)
+{
+/* Always notify when queue is empty (when feature acknowledge) */
+   if ((s->vring.vr.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) &&
+   (s->vring.vr.avail->idx != s->vring.last_avail_idx ||
+!(s->vdev.guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY))))
+   return;
+
+event_notifier_set(virtio_queue_get_guest_notifier(s->vq));
+}
+
 static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque)
 {
 VirtIOBlock *s = opaque;
@@ -154,7 +166,7 @@ static void process_request(IOQueue *ioq, struct iovec iov[], unsigned int out_n
 fdatasync(get_raw_posix_fd_hack(s));
 inhdr->status = VIRTIO_BLK_S_OK;
 vring_push(&s->vring, head, sizeof *inhdr);
-virtio_irq(s->vq);
+virtio_blk_notify_guest(s);
 }
 return;
 
@@ -222,8 +234,7 @@ static bool handle_io(EventHandler *handler)
 VirtIOBlock *s = container_of(handler, VirtIOBlock, io_handler);
 
 if (ioq_run_completion(&s->ioqueue, complete_request, s) > 0) {
-/* TODO is this thread-safe and can it be done faster? */
-virtio_irq(s->vq);
+virtio_blk_notify_guest(s);
 }
 
 /* If there were more requests than iovecs, the vring will not be empty yet
@@ -251,11 +262,17 @@ static void data_plane_start(VirtIOBlock *s)
 
 vring_setup(&s->vring, &s->vdev, 0);
 
+/* Set up guest notifier (irq) */
+if (s->vdev.binding->set_guest_notifier(s->vdev.binding_opaque, 0, true) != 0) {
+fprintf(stderr, "virtio-blk failed to set guest notifier, ensure -enable-kvm is set\n");
+exit(1);
+}
+
 event_poll_init(&s->event_poll);
 
 /* Set up virtqueue notify */
 if (s->vdev.binding->set_host_notifier(s->vdev.binding_opaque, 0, true) != 0) {
-fprintf(stderr, "virtio-blk failed to set host notifier, ensure -enable-kvm is set\n");
+fprintf(stderr, "virtio-blk failed to set host notifier\n");
 exit(1);
 }
 event_poll_add(&s->event_poll, &s->notify_handler,
@@ -296,6 +313,9 @@ static void data_plane_stop(VirtIOBlock *s)
 s->vdev.binding->set_host_notifier(s->vdev.binding_opaque, 0, false);
 
 event_poll_cleanup(&s->event_poll);
+
+/* Clean up guest notifier (irq) */
+s->vdev.binding->set_guest_notifier(s->vdev.binding_opaque, 0, false);
 }
 
 static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t val)
-- 
1.7.10.4

[Qemu-devel] [RFC v9 22/27] virtio-blk: Fix request merging

2012-07-18 Thread Stefan Hajnoczi

Khoa Huynh k...@us.ibm.com discovered that request merging is broken.
The merged iocb is not updated to reflect the total number of iovecs and
the offset is also outdated.

This patch fixes request merging.
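
A minimal sketch of what a correct merge has to update, using a simplified,
hypothetical demo_vreq struct in place of struct iocb: the merged request
takes the combined iovec array, the summed iovec count, and the offset of the
earlier request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical, simplified vectored request: not the libaio struct iocb. */
struct demo_vreq {
    struct iovec *vec;  /* scatter-gather list */
    unsigned int nr;    /* number of entries in vec */
    off_t offset;       /* starting byte offset of the request */
};

/*
 * Fold request a into request b, where a ends exactly where b begins.
 * merged[] must have room for a->nr + b->nr entries and must not overlap
 * either input array.
 */
static void demo_merge(const struct demo_vreq *a, struct demo_vreq *b,
                       struct iovec *merged)
{
    assert(a != NULL && b != NULL && merged != NULL);
    memcpy(merged, a->vec, a->nr * sizeof merged[0]);
    memcpy(merged + a->nr, b->vec, b->nr * sizeof merged[0]);

    b->vec = merged;
    b->nr += a->nr;         /* total number of iovecs */
    b->offset = a->offset;  /* merged request starts where a started */
}
```

Leaving out either of the last two assignments gives the symptoms described
above: the iovec count and the offset still describe only the second request.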

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/virtio-blk.c |   10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 9131a7a..51807b5 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -178,13 +178,17 @@ static void merge_request(struct iocb *iocb_a, struct iocb *iocb_b)
 req_a->len = iocb_nbytes(iocb_a);
 }
 
-iocb_b->u.v.vec = iovec;
-req_b->len = iocb_nbytes(iocb_b);
-req_b->next_merged = req_a;
 /*
 fprintf(stderr, "merged %p (%u) and %p (%u), %u iovecs in total\n",
 req_a, iocb_a->u.v.nr, req_b, iocb_b->u.v.nr, iocb_a->u.v.nr + iocb_b->u.v.nr);
 */
+
+iocb_b->u.v.vec = iovec;
+iocb_b->u.v.nr += iocb_a->u.v.nr;
+iocb_b->u.v.offset = iocb_a->u.v.offset;
+
+req_b->len = iocb_nbytes(iocb_b);
+req_b->next_merged = req_a;
 }
 
 static void process_request(IOQueue *ioq, struct iovec iov[], unsigned int out_num, unsigned int in_num, unsigned int head)
-- 
1.7.10.4

[Qemu-devel] [PATCH 10/11] configure: -I\$(SRC_PATH) goes in QEMU_INCLUDES not QEMU_CFLAGS

2012-07-18 Thread Peter Maydell

If the smartcard configure check passes, add '-I\$(SRC_PATH)/libcacard'
to QEMU_INCLUDES, not QEMU_CFLAGS. Otherwise the unexpanded SRC_PATH
will cause a warning in every following configure test.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 638e486..8140464 100755
--- a/configure
+++ b/configure
@@ -2656,7 +2656,7 @@ if test "$smartcard" != "no" ; then
 #include <pk11pub.h>
 int main(void) { PK11_FreeSlot(0); return 0; }
 EOF
-smartcard_cflags="-I\$(SRC_PATH)/libcacard"
+smartcard_includes="-I\$(SRC_PATH)/libcacard"
 libcacard_libs="$($pkg_config --libs nss 2>/dev/null) $glib_libs"
 libcacard_cflags="$($pkg_config --cflags nss 2>/dev/null) $glib_cflags"
 test_cflags="$libcacard_cflags"
@@ -2670,7 +2670,8 @@ EOF
 if $pkg_config --atleast-version=3.12.8 nss >/dev/null 2>&1 && \
   compile_prog "$test_cflags" "$libcacard_libs"; then
 smartcard_nss="yes"
-QEMU_CFLAGS="$QEMU_CFLAGS $smartcard_cflags $libcacard_cflags"
+QEMU_CFLAGS="$QEMU_CFLAGS $libcacard_cflags"
+QEMU_INCLUDES="$QEMU_INCLUDES $smartcard_includes"
 libs_softmmu="$libcacard_libs $libs_softmmu"
 else
 if test "$smartcard_nss" = "yes"; then
-- 
1.7.5.4

[Qemu-devel] [RFC v9 19/27] virtio-blk: Disable guest->host notifies while processing vring

2012-07-18 Thread Stefan Hajnoczi
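
Reading the diff below, the pattern is: set the no-notify hint, drain the
ring, clear the hint, then check once more for descriptors that arrived in the
meantime.  A single-threaded sketch of that loop, with a hypothetical
demo_ring struct standing in for Vring and a counter standing in for request
processing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_USED_F_NO_NOTIFY 1u  /* same value as VRING_USED_F_NO_NOTIFY */

/* Hypothetical, flattened stand-in for the vring state used here. */
struct demo_ring {
    unsigned int avail_idx;       /* advanced by the guest */
    unsigned int last_avail_idx;  /* advanced by the host as it consumes */
    unsigned int used_flags;      /* hints from host to guest */
};

/* Re-enable notifies; returns false if more descriptors arrived meanwhile. */
static bool demo_ring_enable_cb(struct demo_ring *r)
{
    r->used_flags &= ~DEMO_USED_F_NO_NOTIFY;
    __sync_synchronize();  /* full barrier before re-checking avail_idx */
    return r->avail_idx == r->last_avail_idx;
}

/* Drain the ring with notifies suppressed; returns descriptors consumed. */
static unsigned int demo_ring_drain(struct demo_ring *r)
{
    unsigned int consumed = 0;

    assert(r != NULL);
    for (;;) {
        r->used_flags |= DEMO_USED_F_NO_NOTIFY;
        while (r->avail_idx != r->last_avail_idx) {
            r->last_avail_idx++;  /* stands in for processing one request */
            consumed++;
        }
        if (demo_ring_enable_cb(r)) {
            break;                /* ring still empty, safe to wait again */
        }
    }
    return consumed;
}
```

The re-check after clearing the flag closes the window in which the guest adds
a descriptor but skips the kick because it still saw the no-notify hint.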

---
 hw/dataplane/vring.h |   28 +++-
 hw/virtio-blk.c  |   47 +++
 2 files changed, 58 insertions(+), 17 deletions(-)

diff --git a/hw/dataplane/vring.h b/hw/dataplane/vring.h
index 44ef4a9..cdd4d4a 100644
--- a/hw/dataplane/vring.h
+++ b/hw/dataplane/vring.h
@@ -69,11 +69,29 @@ static void vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 vring->vr.desc, vring->vr.avail, vring->vr.used);
 }
 
+/* Are there more descriptors available? */
 static bool vring_more_avail(Vring *vring)
 {
return vring->vr.avail->idx != vring->last_avail_idx;
 }
 
+/* Hint to disable guest->host notifies */
+static void vring_disable_cb(Vring *vring)
+{
+vring->vr.used->flags |= VRING_USED_F_NO_NOTIFY;
+}
+
+/* Re-enable guest->host notifies
+ *
+ * Returns false if there are more descriptors in the ring.
+ */
+static bool vring_enable_cb(Vring *vring)
+{
+vring->vr.used->flags &= ~VRING_USED_F_NO_NOTIFY;
+__sync_synchronize(); /* mb() */
+return !vring_more_avail(vring);
+}
+
 /* This is stolen from linux-2.6/drivers/vhost/vhost.c. */
 static bool get_indirect(Vring *vring,
struct iovec iov[], struct iovec *iov_end,
@@ -160,7 +178,7 @@ static bool get_indirect(Vring *vring,
  *
  * Stolen from linux-2.6/drivers/vhost/vhost.c.
  */
-static unsigned int vring_pop(Vring *vring,
+static int vring_pop(Vring *vring,
  struct iovec iov[], struct iovec *iov_end,
  unsigned int *out_num, unsigned int *in_num)
 {
@@ -178,9 +196,9 @@ static unsigned int vring_pop(Vring *vring,
exit(1);
}
 
-   /* If there's nothing new since last we looked, return invalid. */
+   /* If there's nothing new since last we looked. */
if (avail_idx == last_avail_idx)
-   return num;
+   return -EAGAIN;
 
/* Only get avail ring entries after they have been exposed by guest. */
__sync_synchronize(); /* smp_rmb() */
@@ -215,7 +233,7 @@ static unsigned int vring_pop(Vring *vring,
 desc = vring->vr.desc[i];
if (desc.flags & VRING_DESC_F_INDIRECT) {
if (!get_indirect(vring, iov, iov_end, out_num, in_num, &desc)) {
-return num; /* not enough iovecs, stop for now */
+return -ENOBUFS; /* not enough iovecs, stop for now */
 }
 continue;
}
@@ -225,7 +243,7 @@ static unsigned int vring_pop(Vring *vring,
  * with the current set.
  */
 if (iov >= iov_end) {
-return num;
+return -ENOBUFS;
 }
 
 iov->iov_base = phys_to_host(vring, desc.addr);
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index efeffa0..f67fdb7 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -202,7 +202,8 @@ static bool handle_notify(EventHandler *handler)
  * accept more I/O.  This is not implemented yet.
  */
 struct iovec iovec[VRING_MAX];
-struct iovec *iov, *end = &iovec[VRING_MAX];
+struct iovec *end = &iovec[VRING_MAX];
+struct iovec *iov = iovec;
 
 /* When a request is read from the vring, the index of the first descriptor
  * (aka head) is returned so that the completed request can be pushed onto
@@ -211,19 +212,41 @@ static bool handle_notify(EventHandler *handler)
  * The number of hypervisor read-only iovecs is out_num.  The number of
  * hypervisor write-only iovecs is in_num.
  */
-unsigned int head, out_num = 0, in_num = 0;
+int head;
+unsigned int out_num = 0, in_num = 0;
 
-for (iov = iovec; ; iov += out_num + in_num) {
-head = vring_pop(&s->vring, iov, end, &out_num, &in_num);
-if (head >= vring_get_num(&s->vring)) {
-break; /* no more requests */
-}
+for (;;) {
+/* Disable guest->host notifies to avoid unnecessary vmexits */
+vring_disable_cb(&s->vring);
+
+for (;;) {
+head = vring_pop(&s->vring, iov, end, &out_num, &in_num);
+if (head < 0) {
+break; /* no more requests */
+}
 
-/*
-fprintf(stderr, "out_num=%u in_num=%u head=%u\n", out_num, in_num, head);
-*/
+/*
+fprintf(stderr, "out_num=%u in_num=%u head=%d\n", out_num, in_num, head);
+*/
 
-process_request(&s->ioqueue, iov, out_num, in_num, head);
+process_request(&s->ioqueue, iov, out_num, in_num, head);
+iov += out_num + in_num;
+}
+
+if (likely(head == -EAGAIN)) { /* vring emptied */
+/* Re-enable guest->host notifies and stop processing the vring.
+ * But if the guest has snuck in more descriptors, keep processing.
+ */
+if (likely(vring_enable_cb(&s->vring))) {
+break;
+}
+} else { /* head == -ENOBUFS, cannot continue since iovecs[] is depleted */
+/* Since

[Qemu-devel] [PATCH 07/11] configure: Fix compile warning in PNG test

2012-07-18 Thread Peter Maydell

Fix compile warning (variable 'png_ptr' set but not used) in the
PNG detection test code.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index aced52e..784325a 100755
--- a/configure
+++ b/configure
@@ -1727,7 +1727,7 @@ cat > $TMPC << EOF
 int main(void) {
 png_structp png_ptr;
 png_ptr = png_create_write_struct(PNG_LIBPNG_VER_STRING, NULL, NULL, NULL);
-return 0;
+return png_ptr != 0;
 }
 EOF
   if $pkg_config libpng --modversion >/dev/null 2>&1; then
-- 
1.7.5.4

[Qemu-devel] [RFC v9 16/27] virtio-blk: Kick data plane thread using event notifier set

2012-07-18 Thread Stefan Hajnoczi

---
 hw/virtio-blk.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 1616be5..d75c187 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -339,8 +339,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 virtio_blk_set_status(vdev, VIRTIO_CONFIG_S_DRIVER_OK); /* start the thread */
 
 /* Now kick the thread */
-uint64_t dummy = 1;
-ssize_t unused __attribute__((unused)) = write(event_notifier_get_fd(virtio_queue_get_host_notifier(s->vq)), &dummy, sizeof dummy);
+event_notifier_set(virtio_queue_get_host_notifier(s->vq));
 }
 
 /* coalesce internal state, copy to pci i/o region 0
-- 
1.7.10.4
