[Qemu-devel] ARM bootloader boot blobbing.

2012-09-26 Thread Peter Crosthwaite
Hi All,

Can anyone think of a reason why the ARM primary bootloader can't be
done by just interacting directly with the CPU? Currently we have this
...

/* The worlds second smallest bootloader.  Set r0-r2, then jump to kernel.  */
static uint32_t bootloader[] = {
  0xe3a00000, /* mov r0, #0 */
  0xe59f1004, /* ldr r1, [pc, #4] */
  0xe59f2004, /* ldr r2, [pc, #4] */
  0xe59ff004, /* ldr pc, [pc, #4] */
  0, /* Board ID */
  0, /* Address of kernel args.  Set by integratorcp_init.  */
  0  /* Kernel entry point.  Set by integratorcp_init.  */
};

... which gets injected into RAM; then we set the PC to this blob and
go. But couldn't we just set R0-R2 directly from the bootloader and jump
straight to the kernel entry point? Why do we have to blob in a
lightweight bootloader?
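
For reference, here is a minimal sketch of how the blob's three loads pick
up the trailing data words. It is not QEMU code; both helper names are
made up for illustration. It relies only on the ARM rule that, in ARM
state, the PC reads as the address of the current instruction plus 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: extract the 12-bit immediate of an ARM
 * "ldr rX, [pc, #imm]" encoding such as 0xe59f1004. */
static uint32_t ldr_imm12(uint32_t insn)
{
    return insn & 0xfffu;
}

/* Hypothetical helper: index (in 32-bit words) of the word loaded by a
 * PC-relative ldr located at word index insn_index of the blob.
 * In ARM state the PC reads as the instruction address plus 8. */
static size_t ldr_pc_target_index(size_t insn_index, uint32_t imm)
{
    assert(imm % 4 == 0);               /* word-aligned offsets only */
    return (insn_index * 4 + 8 + imm) / 4;
}
```

With the blob above, the ldr at index 1 reads word 4 (board ID), the one
at index 2 reads word 5 (kernel args) and the one at index 3 reads word 6
(kernel entry point), which is why the board code only has to fill in the
last three words.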

Regards,
Peter



Re: [Qemu-devel] [PATCH 06/17] aio: introduce AioContext, move bottom halves there

2012-09-26 Thread Paolo Bonzini
On 25/09/2012 23:51, Anthony Liguori wrote:
   typedef struct QEMUTimer QEMUTimer;
   typedef struct QEMUFile QEMUFile;
  +typedef struct QEMUBH QEMUBH;
   typedef struct DeviceState DeviceState;
   
   struct Monitor;
 Any reason to do this here vs. just #include "qemu-aio.h" in
 qemu-common.h?
 
 I don't see an obvious dependency on qemu-common.h in qemu-aio.h other
 than this typedef.

I thought we were moving away from including everything in
qemu-common.h.  In fact, the only includes from QEMU in qemu-common.h are:

#ifdef _WIN32
#include "qemu-os-win32.h"
#endif

#ifdef CONFIG_POSIX
#include "qemu-os-posix.h"
#endif

#include "osdep.h"
#include "bswap.h"

#include "cpu.h"

where cpu.h could probably be removed---perhaps should.

Paolo



Re: [Qemu-devel] [PATCH] usb: Fix usb_packet_map() in the presence of IOMMUs

2012-09-26 Thread Gerd Hoffmann
On 09/26/12 04:59, David Gibson wrote:
 With the IOMMU infrastructure introduced before 1.2, we need to use
 dma_memory_map() to obtain a qemu pointer to memory from an IO bus address.
 However, dma_memory_map() alters the given length to reflect the length
 over which the used DMA translation is valid - which could be either more
 or less than the requested length.
 
 usb_packet_map() does not correctly handle these cases, simply failing if
 dma_memory_map() alters the requested length.  If dma_memory_map()
 increased the length, we just need to use the requested length for the
 qemu_iovec_add().  However, if it decreased the length, it means that a
 single DMA translation is not valid for the whole sglist element, and so
 we need to loop, splitting it up into multiple iovec entries for each
 piece with a DMA translation (in practice >2 pieces is unlikely).
 
 This patch implements the correct behaviour

Patch added to usb patch queue.
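
The splitting described in the commit message can be sketched roughly as
follows. This is not the patch itself: toy_map_len() is a hypothetical
stand-in for dma_memory_map() that models translations as fixed-size
pages, and struct piece stands in for an iovec entry.

```c
#include <assert.h>
#include <stdint.h>

struct piece {
    uint64_t addr;
    uint64_t len;
};

/* Hypothetical stand-in for dma_memory_map(): number of bytes, starting
 * at addr, over which a single translation is valid.  The result may be
 * larger or smaller than what the caller actually needs. */
static uint64_t toy_map_len(uint64_t addr, uint64_t page_size)
{
    assert(page_size > 0);
    return page_size - (addr % page_size);
}

/* Split [base, base + len) into pieces that are each covered by one
 * translation.  Returns the number of pieces, or -1 if out[] is full. */
static int split_element(uint64_t base, uint64_t len, uint64_t page_size,
                         struct piece *out, int max)
{
    int n = 0;

    while (len > 0) {
        uint64_t xlen = toy_map_len(base, page_size);

        if (xlen > len) {
            xlen = len;         /* mapping longer than requested: clamp */
        }
        if (n == max) {
            return -1;
        }
        out[n].addr = base;     /* mapping shorter: emit piece and loop */
        out[n].len = xlen;
        n++;
        base += xlen;
        len -= xlen;
    }
    return n;
}
```

An element that fits in one translation yields one piece; an element that
crosses a translation boundary yields two or more.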

thanks,
  Gerd



Re: [Qemu-devel] [PATCH 09/17] aio: prepare for introducing GSource-based dispatch

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 00:01, Anthony Liguori wrote:
  +node->pfd.events = G_IO_ERR;
  +node->pfd.events |= (io_read ? G_IO_IN | G_IO_HUP : 0);
  +node->pfd.events |= (io_write ? G_IO_OUT : 0);
   }
 Should we even set G_IO_ERR?  I think that corresponds to exceptfd

No, that would be G_IO_PRI.

 in select() but we've never set that historically.  I know glib recommends
 it but I don't think it's applicable to how we use it.
 
 Moreover, the way you do dispatch, if G_IO_ERR did occur, we'd dispatch
 both the read and write handlers which definitely isn't right.

I'm not sure what gives POLLERR.  Probably a connect() that fails, and
in that case dispatching on the write handler would be okay.  But I was
not sure and calling both is safe: handlers have to be ready for
spurious wakeups anyway, it happens if qemu_aio_wait dispatches from a
VCPU thread before the main loop gets hold of the big lock.

 I think it's easiest just to drop it.

That's indeed the case, since the current code never sets either
G_IO_HUP or G_IO_ERR.
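
To make the double-dispatch concern concrete, here is a small sketch using
the POSIX poll() flags in place of the G_IO_* names. The dispatch rule in
handlers_to_call() is an assumption made for illustration, not the code
from the series.

```c
#include <poll.h>
#include <stdbool.h>

enum {
    CALL_READ  = 1,
    CALL_WRITE = 2,
};

/* Assumed dispatch rule: an error condition wakes up every installed
 * handler, a hangup wakes up the read handler only. */
static int handlers_to_call(short revents, bool has_read, bool has_write)
{
    int calls = 0;

    if (has_read && (revents & (POLLIN | POLLHUP | POLLERR))) {
        calls |= CALL_READ;
    }
    if (has_write && (revents & (POLLOUT | POLLERR))) {
        calls |= CALL_WRITE;
    }
    return calls;
}
```

Under that rule a lone POLLERR runs both handlers, which is only safe
because handlers must tolerate spurious wakeups anyway.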

Paolo



Re: [Qemu-devel] [PATCH 3/6] target-ppc: Extend FPU state for newer POWER CPUs

2012-09-26 Thread Aurelien Jarno
On Wed, Sep 26, 2012 at 01:12:18PM +1000, David Gibson wrote:
 This patch adds some extra FPU state to CPUPPCState.  Specifically, fpscr
 is extended to 64 bits, since some recent CPUs now have more status bits
 than fit inside 32 bits, and we add the 32 VSR registers present on CPUs
 with VSX (these extend the standard FP regs, which together with the
 Altivec/VMX registers form a 64 x 128bit register file for VSX).
 
 We don't actually support the instructions using these extra registers in
 TCG yet, but we still need a place to store the state so we can sync it with
 KVM and savevm/loadvm it.  This patch updates the savevm code to not
 fail on the extended state, but also does not actually save it - that's
 a project for another patch.
 
 Signed-off-by: David Gibson da...@gibson.dropbear.id.au
 ---
  target-ppc/cpu.h   |4 +++-
  target-ppc/machine.c   |8 ++--
  target-ppc/translate.c |2 +-
  3 files changed, 10 insertions(+), 4 deletions(-)
 
 diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
 index faf4404..846778f 100644
 --- a/target-ppc/cpu.h
 +++ b/target-ppc/cpu.h
 @@ -963,7 +963,7 @@ struct CPUPPCState {
  /* floating point registers */
  float64 fpr[32];
  /* floating point status and control register */
 -uint32_t fpscr;
 +uint64_t fpscr;

This will break the TCG code, as fpscr is mapped as an i32 in TCG. Also
if it is 64-bit only on PPC64 machines, it might be a good idea to
change it to target_ulong instead, and use _tl in the TCG code.

  /* Next instruction pointer */
  target_ulong nip;
 @@ -1014,6 +1014,8 @@ struct CPUPPCState {
  /* Altivec registers */
  ppc_avr_t avr[32];
  uint32_t vscr;
 +/* VSX registers */
 +uint64_t vsr[32];
  /* SPE registers */
  uint64_t spe_acc;
  uint32_t spe_fscr;
 diff --git a/target-ppc/machine.c b/target-ppc/machine.c
 index 21ce757..5e7bc00 100644
 --- a/target-ppc/machine.c
 +++ b/target-ppc/machine.c
 @@ -6,6 +6,7 @@ void cpu_save(QEMUFile *f, void *opaque)
  {
  CPUPPCState *env = (CPUPPCState *)opaque;
  unsigned int i, j;
 +uint32_t fpscr;
  
  for (i = 0; i < 32; i++)
  qemu_put_betls(f, &env->gpr[i]);
 @@ -30,7 +31,8 @@ void cpu_save(QEMUFile *f, void *opaque)
  u.d = env->fpr[i];
  qemu_put_be64(f, u.l);
  }
 -qemu_put_be32s(f, &env->fpscr);
 +fpscr = env->fpscr;
 +qemu_put_be32s(f, &fpscr);
  qemu_put_sbe32s(f, &env->access_type);
  #if defined(TARGET_PPC64)
  qemu_put_betls(f, &env->asr);
 @@ -90,6 +92,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
  CPUPPCState *env = (CPUPPCState *)opaque;
  unsigned int i, j;
  target_ulong sdr1;
 +uint32_t fpscr;
  
  for (i = 0; i < 32; i++)
  qemu_get_betls(f, &env->gpr[i]);
 @@ -114,7 +117,8 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
  u.l = qemu_get_be64(f);
  env->fpr[i] = u.d;
  }
 -qemu_get_be32s(f, &env->fpscr);
 +qemu_get_be32s(f, &fpscr);
 +env->fpscr = fpscr;
  qemu_get_sbe32s(f, &env->access_type);
  #if defined(TARGET_PPC64)
  qemu_get_betls(f, &env->asr);
 diff --git a/target-ppc/translate.c b/target-ppc/translate.c
 index ac915cc..c8122b7 100644
 --- a/target-ppc/translate.c
 +++ b/target-ppc/translate.c
 @@ -9463,7 +9463,7 @@ void cpu_dump_state (CPUPPCState *env, FILE *f, 
 fprintf_function cpu_fprintf,
  if ((i & (RFPL - 1)) == (RFPL - 1))
  cpu_fprintf(f, "\n");
  }
 -cpu_fprintf(f, "FPSCR %08x\n", env->fpscr);
 +cpu_fprintf(f, "FPSCR %08" PRIx64 "\n", env->fpscr);
  #if !defined(CONFIG_USER_ONLY)
  cpu_fprintf(f, " SRR0 " TARGET_FMT_lx "  SRR1 " TARGET_FMT_lx
 "    PVR " TARGET_FMT_lx " VRSAVE " TARGET_FMT_lx "\n",
 -- 
 1.7.10.4
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH 14/17] main-loop: use GSource to poll AIO file descriptors

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 00:09, Anthony Liguori wrote:
 What do you think about deprecating bottom halves in the !block code in
 favor of idle functions?  I don't see any reason to keep using bottom
 halves...

The ptimer.c code uses bottom halves internally.  Otherwise I'd agree.

Paolo

 Reviewed-by: Anthony Liguori aligu...@us.ibm.com




Re: [Qemu-devel] [PATCH] target-xtensa: de-optimize EXTUI

2012-09-26 Thread Aurelien Jarno
On Wed, Sep 26, 2012 at 03:05:18AM +0400, Max Filippov wrote:
 On Wed, Sep 26, 2012 at 2:57 AM, Aurelien Jarno aurel...@aurel32.net wrote:
  Now that "and" with 0xff, 0xffff and 0xffffffff is optimized in
  tcg/tcg-op.h, there is no need to do it in target-xtensa/translate.c.
 
  Cc: Max Filippov jcmvb...@gmail.com
  Signed-off-by: Aurelien Jarno aurel...@aurel32.net
  ---
   target-xtensa/translate.c |   15 +--
   1 file changed, 1 insertion(+), 14 deletions(-)
 
  diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
  index ba3ffcb..c1358ee 100644
  --- a/target-xtensa/translate.c
  +++ b/target-xtensa/translate.c
  @@ -1835,20 +1835,7 @@ static void disas_xtensa_insn(DisasContext *dc)
   } else {
   tcg_gen_mov_i32(tmp, cpu_R[RRR_T]);
   }
 
 I guess shri above may be de-optimized as well.
 In any case Acked-by: Max Filippov jcmvb...@gmail.com

Good catch, I looked for some patterns in the targets code, and didn't
see this one. I'll send an updated patch.

  -
  -switch (maskimm) {
  -case 0xff:
  -tcg_gen_ext8u_i32(cpu_R[RRR_R], tmp);
  -break;
  -
  -case 0xffff:
  -tcg_gen_ext16u_i32(cpu_R[RRR_R], tmp);
  -break;
  -
  -default:
  -tcg_gen_andi_i32(cpu_R[RRR_R], tmp, maskimm);
  -break;
  -}
  +tcg_gen_andi_i32(cpu_R[RRR_R], tmp, maskimm);
   tcg_temp_free(tmp);
   }
   break;
  --
  1.7.10.4
 
 
 
 -- 
 Thanks.
 -- Max
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH 11/17] aio: make AioContexts GSources

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 00:06, Anthony Liguori wrote:
   if (node) {
  +g_source_remove_poll(&ctx->source, &node->pfd);
  +
 Why remove vs. setting events = 0?

Because otherwise you'd get a dangling pointer to node->pfd. :)

Paolo

 add_poll/remove_poll also comes with an event loop notify which I don't
 think is strictly necessary here.
 




Re: [Qemu-devel] [PATCH 0/2] add pci-serial device.

2012-09-26 Thread Gerd Hoffmann
On 09/26/12 01:43, Anthony Liguori wrote:
 Gerd Hoffmann kra...@redhat.com writes:
 
   Hi,

 Two patches, first split up serial.c a bit,
 then actually add the pci-based serial device.
 
 The series looks good to me.  A couple requests:
 
 1) Could you add a spec describing this new PCI device?  Doesn't need to
be more than a couple paragraphs since the device is super simple.

Well, it is pretty straightforward:  A single IO bar, 8 bytes in size,
where the 16550 uart is mapped to:

[kraxel@fedora ~]$ lspci -vse
00:0e.0 Serial controller: Red Hat, Inc. Device 0002 (rev 01) (prog-if
00 [8250])
Subsystem: Red Hat, Inc Device 1100
Physical Slot: 14
Flags: fast devsel, IRQ 11
I/O ports at c130 [size=8]
Kernel driver in use: serial

But I can surely add a comment about it.

 2) Could you make the inf file a separate patch and either include
documentation in the commit message on how to use it with Windows or
just add a comment to the inf file?

I think a comment is better, easier to find than a commit message.  Will do.

 This is a new PCI space for QEMU too.  

It isn't new, I just followed what the pci bridge is doing (which has
1b36:0001).

 Is this a driver that is owned
 by QEMU and Red Hat is donating the PCI id or is this a driver that RH
 controls that we're implementing?

I consider it being owned by qemu.

cheers,
  Gerd




Re: [Qemu-devel] [PATCH 09/17] aio: prepare for introducing GSource-based dispatch

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 00:01, Anthony Liguori wrote:
  +revents = node->pfd.revents & node->pfd.events;
  +node->pfd.revents &= ~revents;
 
 This is interesting and I must admit I don't understand why it's
 necessary.  What case are you trying to handle?

That's for the case where you got a write event for fd Y, but disabled
the write handler from the handler of fd X (called before fd Y).  But
what the current code does is just eat the event, so I can do the same
and set node->pfd.revents to 0.
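
As a standalone illustration of the masking in the quoted hunk (struct
toy_node and take_events() are made-up names, and POSIX poll() flags
replace the GLib ones):

```c
#include <poll.h>

/* Hypothetical miniature of an AIO handler node. */
struct toy_node {
    struct pollfd pfd;
};

/* Return the pending events that are still wanted and clear exactly
 * those bits, leaving unwanted pending events behind in revents. */
static short take_events(struct toy_node *node)
{
    short revents = node->pfd.revents & node->pfd.events;

    node->pfd.revents &= ~revents;
    return revents;
}
```

If a write event was reported but the write handler has been removed in
the meantime, events no longer contains POLLOUT, so the event is not
delivered. Setting revents to 0 instead simply discards it.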

Paolo



Re: [Qemu-devel] [PATCH 0/2] add pci-serial device.

2012-09-26 Thread Gerd Hoffmann
  Hi,

 The only reason I ask is whether this is something we can add
 new features to.  I can't think of one off hand, but it can't
 hurt to work this out up front.
 
 Multiport e.g. (to save PCI slots). There was some proposal
 recently to add a model of a real multiport PCI card, I just can't
 find the mail right now...

Easy enough to add.  Should be a separate device with its own pci id
though.  According to the linux source code there seem to be two
common ways to implement this:  A single, large pci bar where all
uarts are mapped one after the other.  Or one pci bar for each uart.
Now I need to figure which is easier to handle for windows guests ...
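
A rough sketch of the single-bar layout, assuming each 16550 keeps the
8-byte register window of the single-port device and remembering that a
PCI bar size has to be a power of two. The helper names are hypothetical.

```c
#include <stdint.h>

#define UART_REGION_SIZE 8u     /* one 16550: 8 bytes of I/O space */

/* Single-bar layout: UART i starts right after UART i - 1. */
static uint32_t single_bar_offset(unsigned int uart_index)
{
    return uart_index * UART_REGION_SIZE;
}

/* Size of the single bar: the UART windows back to back, rounded up to
 * the next power of two as PCI requires. */
static uint32_t single_bar_size(unsigned int n_uarts)
{
    uint32_t need = n_uarts * UART_REGION_SIZE;
    uint32_t size = UART_REGION_SIZE;

    while (size < need) {
        size <<= 1;
    }
    return size;
}
```

With one bar per uart, every bar is simply 8 bytes and the offset inside
it is always 0.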

cheers,
  Gerd



Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Michael Tokarev
On 26.09.2012 01:19, Anthony Liguori wrote:
 Combining -nographic and -daemonize doesn't make sense.  I'd rather error
 out with this combination.
 
 I think what the user is after is -daemonize -vga none OR -daemonize
 -display none.

So what's the difference?

I know lots of people use -nographic -daemonize to run headless
guests in background (like, for example, a router).  I guess it
came well before the -vga option was introduced, but at least I
know about -vga (but not about -vga none).  For one, I never saw
-display before.  And it looks like -nographic is a synonym for
-display none, and -curses is a synonym for -display curses.

It looks like we have way too many confusing options doing the
same thing.  And I think they should be consistent, at least
when they SMELL like they do the same thing, instead of forbidding
one or another in some situations.

Besides, the patch which I based my change on, curses: don't initialize
curses when qemu is daemonized, probably makes no sense too, since
it is a situation with -curses -daemonize (or, -- is there a difference? --
-display curses -daemonize).  That situation would be better errored
out than worked around, I think.  (You just pulled that patch from
Stefanha).

Thanks,

/mjt



Re: [Qemu-devel] KVM call agenda for September 25th

2012-09-26 Thread Markus Armbruster
Anthony Liguori anth...@codemonkey.ws writes:

 Kevin Wolf kw...@redhat.com writes:

 On 25.09.2012 14:57, Anthony Liguori wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
 On 24/09/2012 13:28, Juan Quintela wrote:
 Hi

 Please send in any agenda items you are interested in covering.

 URI parsing library for glusterfs: libxml2 vs. in-tree fork of the
 same code.
 
 The call is a bit late for Bharata but I think copying is the way to go.
 
 Something I've been thinking about since this discussion started
 though.  Maybe we could standardize on using URIs as short-hand syntax
 for backends.

 Compared with QemuOpts, it's not really short-hand or even convenient
 for manual use. For management tools it might be nice because URIs have
 a well-known syntax, can escape anything and implementations exist. But
 I think we must still maintain an easy to use syntax for human users.

 For example:
 
 qemu -hda file:///foo.img
 
 Or:
 
 qemu -device virtio-net-pci,netdev=tap:///vnet0?script=/etc/qemu-ifup
 
 Or:
 
 qemu -device \
   isa-serial,index=0,chr=tcp://localhost:1025/?server=on&wait=off

 Your examples kind of prove this: They aren't much shorter than what
 exists today, but they contain ? and &, which are nasty characters on
 the command line.

 This works particularly well with a "treat unknown options as -device"
 mechanism so that we could do:
 
 qemu -isa-serial chr=tcp://localhost:1025/?server=on&wait=off
 
 We could even introduce a secondary implied option to shorten this
 further to:
 
 qemu -isa-serial tcp://localhost:1025/?server=on&wait=off

Too much magic for my taste.

I'm afraid it leads to rather obscure error messages on misspellings.

 This is something that I was thinking of in the context of -blockdev a
 while ago (without URLs): Define the block device inside of -device
 specifications. The problem of nesting an option string inside another
 one is solved in theory by URLs because they allow (nested) escaping,
 but in practice we'll need to use some kind of brackets instead if we
 want it to be usable.

 qemu -isa-serial 'tcp://localhost:1025/?server=on&wait=off'

 I don't think it's really that much better.  And yeah, your thoughts are
 exactly mine.  Having two syntaxes allows us to use a single option.

 Hopefully most options could avoid having query parameters so escaping
 wasn't a problem.  It's unfortunate that the TCP character device uses
 client mode by default.

You could fold a limited set of common flags into the scheme.  Prior
art: socat supports syntax like

TCP:host:port
TCP4:host:port
TCP-LISTEN:port

I'm not saying it's a good idea for QEMU.



[Qemu-devel] [PATCH 2/8] compat: turn off msi/msix on xhci for old machine types

2012-09-26 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/pc_piix.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 5a0796b..afd8361 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -358,14 +358,30 @@ static QEMUMachine pc_machine_v1_3 = {
 .is_default = 1,
 };
 
+#define PC_COMPAT_1_2 \
+{\
+.driver   = "nec-usb-xhci",\
+.property = "msi",\
+.value    = "off",\
+},{\
+.driver   = "nec-usb-xhci",\
+.property = "msix",\
+.value    = "off",\
+}
+
 static QEMUMachine pc_machine_v1_2 = {
 .name = "pc-1.2",
 .desc = "Standard PC",
 .init = pc_init_pci,
 .max_cpus = 255,
+.compat_props = (GlobalProperty[]) {
+PC_COMPAT_1_2,
+{ /* end of list */ }
+},
 };
 
 #define PC_COMPAT_1_1 \
+PC_COMPAT_1_2,\
 {\
 .driver   = "virtio-scsi-pci",\
 .property = "hotplug",\
-- 
1.7.1




[Qemu-devel] [PATCH 4/8] xhci: route string usb hub support

2012-09-26 Thread Gerd Hoffmann
Parse route string in slot contexts and
support devices connected via hub.
---
 hw/usb/hcd-xhci.c |   86 ++---
 1 files changed, 55 insertions(+), 31 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 1414826..8c0155b 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -363,7 +363,7 @@ typedef struct XHCIEPContext {
 typedef struct XHCISlot {
 bool enabled;
 dma_addr_t ctx;
-unsigned int port;
+USBPort *uport;
 unsigned int devaddr;
 XHCIEPContext * eps[31];
 } XHCISlot;
@@ -1230,7 +1230,7 @@ static TRBCCode xhci_reset_ep(XHCIState *xhci, unsigned 
int slotid,
 ep |= 0x80;
 }
 
-dev = xhci->ports[xhci->slots[slotid-1].port-1].uport->dev;
+dev = xhci->slots[slotid-1].uport->dev;
 if (!dev) {
 return CC_USB_TRANSACTION_ERROR;
 }
@@ -1412,18 +1412,9 @@ static void xhci_stall_ep(XHCITransfer *xfer)
 static int xhci_submit(XHCIState *xhci, XHCITransfer *xfer,
XHCIEPContext *epctx);
 
-static USBDevice *xhci_find_device(XHCIPort *port, uint8_t addr)
-{
-if (!(port->portsc & PORTSC_PED)) {
-return NULL;
-}
-return usb_find_device(port->uport, addr);
-}
-
 static int xhci_setup_packet(XHCITransfer *xfer)
 {
 XHCIState *xhci = xfer->xhci;
-XHCIPort *port;
 USBDevice *dev;
 USBEndpoint *ep;
 int dir;
@@ -1434,13 +1425,12 @@ static int xhci_setup_packet(XHCITransfer *xfer)
 ep = xfer->packet.ep;
 dev = ep->dev;
 } else {
-port = &xhci->ports[xhci->slots[xfer->slotid-1].port-1];
-dev = xhci_find_device(port, xhci->slots[xfer->slotid-1].devaddr);
-if (!dev) {
-fprintf(stderr, "xhci: slot %d port %d has no device\n",
-xfer->slotid, xhci->slots[xfer->slotid-1].port);
+if (!xhci->slots[xfer->slotid-1].uport) {
+fprintf(stderr, "xhci: slot %d has no device\n",
+xfer->slotid);
 return -1;
 }
+dev = xhci->slots[xfer->slotid-1].uport->dev;
 ep = usb_ep_get(dev, dir, xfer->epid >> 1);
 }
 
@@ -1772,7 +1762,7 @@ static TRBCCode xhci_enable_slot(XHCIState *xhci, 
unsigned int slotid)
 trace_usb_xhci_slot_enable(slotid);
 assert(slotid >= 1 && slotid <= MAXSLOTS);
 xhci->slots[slotid-1].enabled = 1;
-xhci->slots[slotid-1].port = 0;
+xhci->slots[slotid-1].uport = NULL;
 memset(xhci->slots[slotid-1].eps, 0, sizeof(XHCIEPContext*)*31);
 
 return CC_SUCCESS;
@@ -1795,17 +1785,42 @@ static TRBCCode xhci_disable_slot(XHCIState *xhci, 
unsigned int slotid)
 return CC_SUCCESS;
 }
 
+static USBPort *xhci_lookup_uport(XHCIState *xhci, uint32_t *slot_ctx)
+{
+USBPort *uport;
+char path[32];
+int i, pos, port;
+
+port = (slot_ctx[1]>>16) & 0xFF;
+port = xhci->ports[port-1].uport->index+1;
+pos = snprintf(path, sizeof(path), "%d", port);
+for (i = 0; i < 5; i++) {
+port = (slot_ctx[0] >> 4*i) & 0x0f;
+if (!port) {
+break;
+}
+pos += snprintf(path + pos, sizeof(path) - pos, ".%d", port);
+}
+
+QTAILQ_FOREACH(uport, &xhci->bus.used, next) {
+if (strcmp(uport->path, path) == 0) {
+return uport;
+}
+}
+return NULL;
+}
+
 static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
   uint64_t pictx, bool bsr)
 {
 XHCISlot *slot;
+USBPort *uport;
 USBDevice *dev;
 dma_addr_t ictx, octx, dcbaap;
 uint64_t poctx;
 uint32_t ictl_ctx[2];
 uint32_t slot_ctx[4];
 uint32_t ep0_ctx[5];
-unsigned int port;
 int i;
 TRBCCode res;
 
@@ -1837,27 +1852,28 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, 
unsigned int slotid,
 DPRINTF("xhci: input ep0 context: %08x %08x %08x %08x %08x\n",
 ep0_ctx[0], ep0_ctx[1], ep0_ctx[2], ep0_ctx[3], ep0_ctx[4]);
 
-port = (slot_ctx[1]>>16) & 0xFF;
-dev = xhci->ports[port-1].uport->dev;
-
-if (port < 1 || port > xhci->numports) {
-fprintf(stderr, "xhci: bad port %d\n", port);
+uport = xhci_lookup_uport(xhci, slot_ctx);
+if (uport == NULL) {
+fprintf(stderr, "xhci: port not found\n");
 return CC_TRB_ERROR;
-} else if (!dev) {
-fprintf(stderr, "xhci: port %d not connected\n", port);
+}
+
+dev = uport->dev;
+if (!dev) {
+fprintf(stderr, "xhci: port %s not connected\n", uport->path);
 return CC_USB_TRANSACTION_ERROR;
 }
 
 for (i = 0; i < MAXSLOTS; i++) {
-if (xhci->slots[i].port == port) {
-fprintf(stderr, "xhci: port %d already assigned to slot %d\n",
-port, i+1);
+if (xhci->slots[i].uport == uport) {
+fprintf(stderr, "xhci: port %s already assigned to slot %d\n",
+uport->path, i+1);
 return CC_TRB_ERROR;
 }
 }
 
 slot = &xhci->slots[slotid-1];
-slot->port = port;
+slot->uport = uport;
 slot->ctx = octx;
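
The path construction done by xhci_lookup_uport() can be tried in
isolation with a sketch like the one below. route_to_path() is a
hypothetical standalone helper, not part of the patch; it takes the root
port number and the route string (4 bits per hub tier, up to 5 tiers, a
zero nibble ends the string) and produces the dotted port path.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "root.tier1.tier2..." in path[]. */
static void route_to_path(unsigned int root_port, uint32_t route,
                          char *path, size_t size)
{
    int pos = snprintf(path, size, "%u", root_port);
    int i;

    for (i = 0; i < 5; i++) {
        unsigned int port = (route >> (4 * i)) & 0x0f;

        if (!port) {
            break;                      /* end of route string */
        }
        if (pos < 0 || (size_t)pos >= size) {
            break;                      /* buffer full, stop appending */
        }
        pos += snprintf(path + pos, size - pos, ".%u", port);
    }
}
```

For example, root port 2 with route string 0x31 gives the path "2.1.3":
port 1 on the first hub, then port 3 on the hub behind it.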
 
  

Re: [Qemu-devel] [PATCH v4 0/3] qapi: convert add_client

2012-09-26 Thread Markus Armbruster
Luiz Capitulino lcapitul...@redhat.com writes:

 v4

  - Fix misworded error message in monitor_get_fd() [Markus]
  - small doc & commit log improvements [Markus]

I'd still like 1/3 to describe the change to the parsing of property
configfd.  No biggie, so:

Reviewed-by: Markus Armbruster arm...@redhat.com



[Qemu-devel] [PATCH 1/8] add pc-1.3 machine type

2012-09-26 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/pc_piix.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 88ff041..5a0796b 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -349,8 +349,8 @@ static void pc_xen_hvm_init(ram_addr_t ram_size,
 }
 #endif
 
-static QEMUMachine pc_machine_v1_2 = {
-.name = "pc-1.2",
+static QEMUMachine pc_machine_v1_3 = {
+.name = "pc-1.3",
 .alias = "pc",
 .desc = "Standard PC",
 .init = pc_init_pci,
@@ -358,6 +358,13 @@ static QEMUMachine pc_machine_v1_2 = {
 .is_default = 1,
 };
 
+static QEMUMachine pc_machine_v1_2 = {
+.name = "pc-1.2",
+.desc = "Standard PC",
+.init = pc_init_pci,
+.max_cpus = 255,
+};
+
 #define PC_COMPAT_1_1 \
 {\
 .driver   = "virtio-scsi-pci",\
@@ -655,6 +662,7 @@ static QEMUMachine xenfv_machine = {
 
 static void pc_machine_init(void)
 {
+qemu_register_machine(&pc_machine_v1_3);
 qemu_register_machine(&pc_machine_v1_2);
 qemu_register_machine(&pc_machine_v1_1);
 qemu_register_machine(&pc_machine_v1_0);
-- 
1.7.1




Re: [Qemu-devel] linux aio and cache mode

2012-09-26 Thread Kevin Wolf
On 26.09.2012 01:22, ching wrote:
 On 09/25/2012 09:33 PM, Kevin Wolf wrote:
 On 25.09.2012 00:40, ching wrote:
 On 09/24/2012 08:30 PM, Kevin Wolf wrote:
 On 24.09.2012 13:32, ching wrote:
 Hi all,

 My host is qemu-1.1.1 and x64 kernel 3.5.4. The guest is using 
 aio=native

 I am trying to use unsafe cache mode to boost i/o performance.
 aio=native requires the image to be opened with O_DIRECT, i.e.
 cache=none or cache=directsync. If you specify a different cache option,
 it will silently fall back to aio=threads.

 Kevin

 will qemu log an entry for the silent fallback?
 No, that's why it's silent. :-)

 Reason:

 I am testing sparse image on btrfs with mount option: 
 rw,noatime,space_cache,autodefrag,inode_cache

 i encounter a speed difference (around 2X-3X) between 
 aio=threads,cache=unsafe and aio=native,cache=unsafe

 aio=threads is much faster, I guess there is a conflict between autodefrag 
 and linux aio
 This is odd. The point is that with cache=unsafe it shouldn't even be
 using Linux AIO in the first place. I can't see why there would be any
 difference between aio=threads and aio=native with cache=unsafe.

 Kevin

 
 is it possible to check the open mode of file and whether it is using aio at 
 runtime?

You can attach strace and look for open/pwritev/iosubmit and friends. I
did that yesterday with qemu-io and didn't see any iosubmit for 'qemu-io
-k -t unsafe'.

Kevin



Re: [Qemu-devel] [PATCH 1/3] pci-assign: use monitor_handle_fd_param

2012-09-26 Thread Markus Armbruster
Markus Armbruster arm...@redhat.com writes:

 Luiz Capitulino lcapitul...@redhat.com writes:

 From: Paolo Bonzini pbonz...@redhat.com

 There is no need to open-code the choice between a file descriptor
 number or a named one.  Just use monitor_handle_fd_param, which
 also takes care of printing the error message.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
 ---
  hw/kvm/pci-assign.c | 12 +++-
  1 file changed, 3 insertions(+), 9 deletions(-)

 diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c
 index 05b93d9..7a0998c 100644
 --- a/hw/kvm/pci-assign.c
 +++ b/hw/kvm/pci-assign.c
 @@ -579,15 +579,9 @@ static int get_real_device(AssignedDevice *pci_dev, 
 uint16_t r_seg,
  snprintf(name, sizeof(name), "%sconfig", dir);
  
  if (pci_dev->configfd_name && *pci_dev->configfd_name) {
 -if (qemu_isdigit(pci_dev->configfd_name[0])) {
 -dev->config_fd = strtol(pci_dev->configfd_name, NULL, 0);
 -} else {
 -dev->config_fd = monitor_get_fd(cur_mon, 
 pci_dev->configfd_name);
 -if (dev->config_fd < 0) {
 -error_report("%s: (%s) unkown", __func__,
 - pci_dev->configfd_name);
 -return 1;
 -}
 +dev->config_fd = monitor_handle_fd_param(cur_mon, 
 pci_dev->configfd_name);
 +if (dev->config_fd < 0) {
 +return 1;
  }
  } else {
  dev->config_fd = open(name, O_RDWR);

 Silent change: no longer accepts file descriptors in octal and hex.

 Silent fix: now rejects junk after numeric file descriptor.  

 Both are fine with me, but perhaps worth mentioning in the commit
 message.

One more thought: this parses qdev kvm-pci-assign property configfd.
A real fd property (qdev_prop_fd & friends) would be cleaner, but as
long as this is the only instance, it's not worth the trouble.
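
The two silent changes noted in the review can be illustrated with a
small comparison. Both functions are hypothetical sketches:
parse_fd_lenient() mirrors the removed strtol(..., NULL, 0) call, while
parse_fd_strict() is only an assumption about what a decimal-only parser
that rejects trailing junk looks like, not the actual
monitor_handle_fd_param() code.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Old behaviour: base 0 accepts octal and hex, trailing junk is ignored. */
static int parse_fd_lenient(const char *s)
{
    return (int)strtol(s, NULL, 0);
}

/* Assumed new behaviour: decimal only, the whole string must be a number.
 * Returns the fd, or -1 if the string is rejected. */
static int parse_fd_strict(const char *s)
{
    char *end;
    long v;

    if (!isdigit((unsigned char)s[0])) {
        return -1;
    }
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > INT_MAX) {
        return -1;
    }
    return (int)v;
}
```

With these, "0x10" is 16 for the lenient parser but rejected by the strict
one, and "12junk" is 12 for the lenient parser but rejected by the strict
one.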



Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Peter Maydell
On 26 September 2012 08:09, Michael Tokarev m...@tls.msk.ru wrote:
 On 26.09.2012 01:19, Anthony Liguori wrote:
 Combining -nographic and -daemonize doesn't make sense.  I'd rather error
 out with this combination.

 I think what the user is after is -daemonize -vga none OR -daemonize
 -display none.

 So what's the difference?

 I know lots of people use -nographic -daemonize to run headless
 guests in background (like, for example, a router).  I guess it
 came well before the -vga option was introduced, but at least I
 know about -vga (but not about -vga none).  For one, I never saw
 -display before.  And it looks like -nographic is a synonym for
 -display none, and -curses is a synonym for -display curses.

-nographic does about three different things at once (and I think
some of its effects aren't documented). It's a legacy option retained
for backward compatibility with old command lines.
If you want something that is non-confusing and makes sense, then
use -display none to disable graphics, -serial stdio to send serial
to stdio, and so on. These newer options do one clear thing each
and can be combined straightforwardly.

 It looks like we have way too many confusing options doing the
 same thing.  And I think they should be consistent, at least
 when they SMELL like they do the same thing, instead of forbidding
 one or another in some situations.

I'd love to drop -nographic but we'd break huge numbers of
existing setups...

-- PMM



[Qemu-devel] [PATCH 6/8] ehci: Fix interrupt packet MULT handling

2012-09-26 Thread Gerd Hoffmann
From: Hans de Goede hdego...@redhat.com

There are several issues with our handling of the MULT epcap field
of interrupt qhs, which this patch fixes.

1) When we don't execute a transaction because of the transaction counter
being 0, p->async stays EHCI_ASYNC_NONE, and the next time we process the
same qtd we hit an assert in ehci_state_fetchqtd because of this. Even though
I believe that this is caused by 3 below, this patch still removes the assert,
as that can still happen without 3, when multiple packets are queued for the
same interrupt ep.

2) We only *check* the transaction counter from ehci_state_execute, any
packets queued up by fill_queue bypass this check. This is fixed by not calling
fill_queue for interrupt packets.

3) Some versions of Windows set the MULT field of the qh to 0, which is a
clear violation of the EHCI spec, but still they do it. This means that we
will never execute a qtd for these, making interrupt ep-s on USB-2 devices
not work, and after recent changes, triggering 1).

So far we've stored the transaction counter in our copy of the mult field,
but with this beginning at 0 already when dealing with these versions of
Windows this won't work. So this patch adds a transact_ctr field to our qh
struct, and sets this to the MULT field value on fetchqh. When the MULT field
value is 0, we set it to 4. This assumes that Windows gets away with setting
it to 0 because the actual hardware goes horizontal on a 1 -> 0 transition,
which will give it 4 transactions (MULT goes from 0 -> 3).

Note that we cannot stop on detecting the 1 -> 0 transition, as our decrement
of the transaction counter, and checking for it are done in 2 different places.
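
The reasoning behind the value 4 can be checked with a toy model. It
assumes, as the commit message does, that real hardware decrements a
2-bit counter after every transaction and only goes horizontal on the
1 -> 0 transition; both function names are made up for this sketch.

```c
/* Assumed hardware model: count transactions executed before the 2-bit
 * counter makes its 1 -> 0 transition, starting from the MULT value. */
static int transactions_until_stop(unsigned int mult)
{
    unsigned int ctr = mult & 3;
    int n = 0;

    for (;;) {
        unsigned int prev = ctr;

        ctr = (ctr - 1) & 3;            /* 0 wraps around to 3 */
        n++;
        if (prev == 1) {
            return n;
        }
    }
}

/* What the patch stores in transact_ctr on fetchqh. */
static int initial_transact_ctr(unsigned int mult)
{
    return mult ? (int)mult : 4;
}
```

Under that model a MULT of 0 yields four transactions (0 -> 3 -> 2 -> 1
-> 0), and for MULT values 1 to 3 the count equals MULT, so the patch's
initial counter matches the modelled hardware in all four cases.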

Reported-by: Shawn Starr shawn.st...@rogers.com
Signed-off-by: Hans de Goede hdego...@redhat.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/hcd-ehci.c |   40 
 1 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 6a5da84..8bdb806 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -373,6 +373,7 @@ struct EHCIQueue {
 uint32_t seen;
 uint64_t ts;
 int async;
+int transact_ctr;
 
 /* cached data from guest - needs to be flushed
  * when guest removes an entry (doorbell, handshake sequence)
@@ -1837,6 +1838,11 @@ static EHCIQueue *ehci_state_fetchqh(EHCIState *ehci, 
int async)
 }
 q->qh = qh;
 
+q->transact_ctr = get_field(q->qh.epcap, QH_EPCAP_MULT);
+if (q->transact_ctr == 0) { /* Guest bug in some versions of windows */
+q->transact_ctr = 4;
+}
+
 if (q->dev == NULL) {
 q->dev = ehci_find_device(q->ehci, devaddr);
 }
@@ -2014,11 +2020,8 @@ static int ehci_state_fetchqtd(EHCIQueue *q)
 } else if (p != NULL) {
 switch (p->async) {
 case EHCI_ASYNC_NONE:
-/* Should never happen packet should at least be initialized */
-assert(0);
-break;
 case EHCI_ASYNC_INITIALIZED:
-/* Previously nacked packet (likely interrupt ep) */
+/* Not yet executed (MULT), or previously nacked (int) packet */
 ehci_set_state(q->ehci, q->async, EST_EXECUTE);
 break;
 case EHCI_ASYNC_INFLIGHT:
@@ -2107,15 +2110,12 @@ static int ehci_state_execute(EHCIQueue *q)
 
 // TODO verify enough time remains in the uframe as in 4.4.1.1
 // TODO write back ptr to async list when done or out of time
-// TODO Windows does not seem to ever set the MULT field
 
-if (!q->async) {
-int transactCtr = get_field(q->qh.epcap, QH_EPCAP_MULT);
-if (!transactCtr) {
-ehci_set_state(q->ehci, q->async, EST_HORIZONTALQH);
-again = 1;
-goto out;
-}
+/* 4.10.3, bottom of page 82, go horizontal on transaction counter == 0 */
+if (!q->async && q->transact_ctr == 0) {
+ehci_set_state(q->ehci, q->async, EST_HORIZONTALQH);
+again = 1;
+goto out;
 }
 
 if (q->async) {
@@ -2132,7 +2132,11 @@ static int ehci_state_execute(EHCIQueue *q)
 trace_usb_ehci_packet_action(p->queue, p, "async");
 p->async = EHCI_ASYNC_INFLIGHT;
 ehci_set_state(q->ehci, q->async, EST_HORIZONTALQH);
-again = (ehci_fill_queue(p) == USB_RET_PROCERR) ? -1 : 1;
+if (q->async) {
+again = (ehci_fill_queue(p) == USB_RET_PROCERR) ? -1 : 1;
+} else {
+again = 1;
+}
 goto out;
 }
 
@@ -2152,13 +2156,9 @@ static int ehci_state_executing(EHCIQueue *q)
 
 ehci_execute_complete(q);
 
-// 4.10.3
-if (!q->async) {
-int transactCtr = get_field(q->qh.epcap, QH_EPCAP_MULT);
-transactCtr--;
-set_field(&q->qh.epcap, transactCtr, QH_EPCAP_MULT);
-// 4.10.3, bottom of page 82, should exit this state when transaction
-// counter decrements to 0
+/* 4.10.3 */
+if (!q->async && q->transact_ctr > 0) {
+q->transact_ctr--;
 }
 
 /* 4.10.5 */
-- 
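
The counter handling in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: QueueSketch and the sketch_* functions are hypothetical stand-ins, not the real EHCIQueue or the hcd-ehci.c state machine, and the loop at the end just models repeated visits to the execute/executing states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-queue state; not the real EHCIQueue. */
typedef struct {
    bool async;        /* true for the async schedule, false for periodic */
    int transact_ctr;  /* cached copy of the QH MULT field */
} QueueSketch;

/* On fetching the QH: cache MULT, treating 0 as 4 (guest bug workaround). */
static void sketch_fetch_qh(QueueSketch *q, int mult_field)
{
    q->transact_ctr = (mult_field == 0) ? 4 : mult_field;
}

/* Before executing: a periodic queue with no transactions left goes horizontal. */
static bool sketch_must_go_horizontal(const QueueSketch *q)
{
    return !q->async && q->transact_ctr == 0;
}

/* After a transaction completes: decrement, but never below zero. */
static void sketch_transaction_done(QueueSketch *q)
{
    if (!q->async && q->transact_ctr > 0) {
        q->transact_ctr--;
    }
}

/* Count how many transactions a periodic queue may run after one QH fetch. */
static int sketch_count_transactions(int mult_field)
{
    QueueSketch q = { .async = false, .transact_ctr = 0 };
    int n = 0;

    assert(mult_field >= 0 && mult_field <= 3);  /* MULT is a 2-bit field */
    sketch_fetch_qh(&q, mult_field);
    while (!sketch_must_go_horizontal(&q)) {
        sketch_transaction_done(&q);
        n++;
    }
    return n;
}
```

Because the check and the decrement live in different functions, the counter is kept outside the guest-visible QH, which is the point of the transact_ctr field added by the patch.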

Re: [Qemu-devel] [libvirt] [PATCH 0/2] Fixed QEMU 1.0.1 support

2012-09-26 Thread Michal Privoznik
On 25.09.2012 19:08, Doug Goldstein wrote:
 On Tue, Sep 25, 2012 at 12:01 PM, Daniel P. Berrange
 berra...@redhat.com wrote:
 On Tue, Sep 25, 2012 at 10:57:23AM -0600, Eric Blake wrote:
 On 09/25/2012 06:54 AM, Daniel P. Berrange wrote:
 On Tue, Sep 25, 2012 at 02:49:00PM +0200, Michal Privoznik wrote:
 On 25.09.2012 10:58, Dmitry Fleytman wrote:
 This patch fixes incorrect help screen parsing for QEMU 1.0.1 package
 Version line changed from
 QEMU emulator version 1.0 (qemu-kvm-1.0), Copyright (c) 2003-2008 
 Fabrice Bellard
 To
 QEMU emulator version 1.0,1 (qemu-kvm-1.0.1), Copyright (c) 
 2003-2008 Fabrice Bellard

 This seems like a bug to me. If it is a micro version number, why is it
 delimited with a comma instead of a dot? If it is not a micro version
 number, can we treat it like it is?

 I agree, it smells very much like a QEMU/distro bug to me.

 It is an upstream bug:

 https://lists.gnu.org/archive/html/qemu-devel/2012-02/msg02527.html

 Distros should probably be backporting that particular patch, but
 there's still the question of whether we should deal with it in libvirt
 because it is upstream.

 Well it is a bug on only one branch of upstream, that was promptly
 fixed, so I still don't think we should complicate libvirt by dealing
 with it. It is trivial for QEMU maintainers to fix


 Daniel
 --
 
 FWIW, the raw tarball from qemu.org still contains the bug. They
 didn't reissue the tarball. First commit on the list here:
 http://wiki.qemu.org/ChangeLog/1.0
 

[CC'ing QEMU devel list]

Maybe QEMU guys can reissue the tarball since Fedora (and probably other
distros as well) is using this tarball when building a package?
Or is it distro's business to backport the patch?

Michal



Re: [Qemu-devel] [PATCH v2] register reset handler to write image into memory

2012-09-26 Thread Yin Olivia-R63875
Hi Alex,

I checked all the rom_add_*() functions.
Multiple platforms of different architectures use rom_add_* to save images.
hw/arm_boot.c
hw/exynos4210.
hw/highbank.
hw/mips_fulong2e.c
hw/mips_malta.c
hw/mips_r4k.c
hw/r2d.c

Even for PowerPC, it also uses rom_add_blob() to write the dtb into memory.
hw/ppc/e500.c:  rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, 
fdt_size, addr);
hw/ppc440_bamboo.c

You also mentioned the ELF file.
hw/elf_ops.h:   rom_add_blob_fixed(label, data, mem_size, addr);

pstrcpy_targphys() also calls rom_add_blob_fixed(), so we also need to
verify
hw/alpha_dp264.c
hw/cris-boot.c
hw/lm32_boards.c
hw/microblaze_boot.c
hw/milkymist.c
hw/ppc.c
hw/ppc_newworld.c
hw/ppc_oldworld.c
hw/sun4m.c
hw/sun4m.c

Should we register a reset handler for each of the above users?

The callers of rom_ptr() function:
hw/s390-virtio.c
hw/sun4m.c
hw/sun4u.c
target-arm/cpu.c
But I don't understand why rom_ptr should be changed.

Best Regards,
Olivia


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, September 20, 2012 8:10 PM
 To: Andreas Färber
 Cc: Yin Olivia-R63875; qemu-...@nongnu.org; qemu-devel@nongnu.org
 Subject: Re: [Qemu-devel] [PATCH v2] register reset handler to write
 image into memory
 
 
 On 23.08.2012, at 13:38, Andreas Färber wrote:
 
  Hi,
 
  Am 23.08.2012 11:45, schrieb Yin Olivia-R63875:
  Dear All,
 
  I can't find MAINTAINER of hw/loader.c.
  Who can help review and apply this patch?
 
  This patch is not a small bugfix so it won't be applied during the
  v1.2 Hard Freeze. You based it onto ppc-next so the obvious answer is,
  Alex needs to review it, whom you forgot to cc.
 
  This patch does not answer the question why you try to avoid the ROM
  blobs and what ROM blobs are still being used for after your patch. I
  don't think it makes much sense to work around them for your use cases
  and to leave them behind - if there's something fundamentally wrong
  with them they should be ripped out completely or fixed. But maybe I'm
  misunderstanding in the absence of explanations?
 
 The fundamental problem is the memory footprint. We only need ROM blobs
 on reset, where they get copied into guest RAM. That means during the
 lifetime of the VM, we waste host memory for no good reason. Imagine a
 guest that runs with -kernel and -initrd, each 10MB in size. Then that VM
 wastes 20MB of precious host memory.
 
 Eventually, I don't think we will need the full-blown rom interface with
 in-memory rom blobs anymore. Everything should be constructed on demand
 during reset.
 
 However, if you look at code like the s390 initialization, some users of
 the rom interface expect changes done once to be persistent. These will
 have to be rewritten to redo their changes on reset.
 
 So Olivia, please do the following:
 
   - Make sure that _no_ persistent rom code is left over eventually. This
 also means that you need to convert ELF.
   - Go through every user of rom_ptr and write the code differently. For
 now, probably by just registering a reset handler that overwrites the
 respective memory location on every reset, rather than modify the rom
 blob.
 
 
 Alex
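
The "redo the change on every reset" approach requested above can be sketched with a toy registry. Everything here is hypothetical: the sketch_* names and the SketchPatch type are invented for illustration and are not QEMU's reset or rom API; the sketch only shows the pattern of re-applying a fixup from a reset handler instead of editing a persistent rom blob once.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_HANDLERS 8

typedef void SketchResetHandler(void *opaque);

/* Toy registry of reset callbacks (hypothetical, not QEMU's implementation). */
static struct {
    SketchResetHandler *func;
    void *opaque;
} sketch_handlers[SKETCH_MAX_HANDLERS];
static size_t sketch_nhandlers;

static void sketch_register_reset(SketchResetHandler *func, void *opaque)
{
    assert(sketch_nhandlers < SKETCH_MAX_HANDLERS);
    sketch_handlers[sketch_nhandlers].func = func;
    sketch_handlers[sketch_nhandlers].opaque = opaque;
    sketch_nhandlers++;
}

/* Run every registered handler, as a system reset would. */
static void sketch_system_reset(void)
{
    for (size_t i = 0; i < sketch_nhandlers; i++) {
        sketch_handlers[i].func(sketch_handlers[i].opaque);
    }
}

/* One fixup that used to be written once into the rom blob. */
typedef struct {
    unsigned char *guest_ram;  /* start of the (toy) guest memory */
    size_t offset;             /* location to overwrite */
    unsigned char value;       /* value to store on every reset */
} SketchPatch;

/* Reset handler: overwrite the memory location again on each reset. */
static void sketch_patch_reset(void *opaque)
{
    SketchPatch *p = opaque;

    p->guest_ram[p->offset] = p->value;
}
```

With this shape, nothing needs to stay resident between resets except the small SketchPatch description, which is the memory-footprint argument made in the message.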
 





Re: [Qemu-devel] [PATCH 04/25] ahci: add ide device initialization helper

2012-09-26 Thread Markus Armbruster
Jason Baron jba...@redhat.com writes:

 On Mon, Sep 24, 2012 at 06:52:29PM +0200, Markus Armbruster wrote:
 Jason Baron jba...@redhat.com writes:
 
  On Fri, Sep 21, 2012 at 04:05:14PM +0200, Markus Armbruster wrote:
  Jason Baron jba...@redhat.com writes:
  
   From: Isaku Yamahata yamah...@valinux.co.jp
  
   Introduce a helper function which initializes the ahci port with
   ide devices.
   It will be used by q35 support.
  
   Cc: Alexander Graf ag...@suse.de
   Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
   Signed-off-by: Jason Baron jba...@redhat.com
   ---
hw/ide.h  |3 +++
hw/ide/ahci.c |   16 
2 files changed, 19 insertions(+), 0 deletions(-)
  
   diff --git a/hw/ide.h b/hw/ide.h
   index 2db4079..8df872e 100644
   --- a/hw/ide.h
   +++ b/hw/ide.h
   @@ -36,4 +36,7 @@ int ide_get_bios_chs_trans(BusState *bus, int unit);
/* ide/core.c */
void ide_drive_get(DriveInfo **hd, int max_bus);

   +/* ide/ahci.c */
   +void pci_ahci_ide_create_devs(PCIDevice *pci_dev, DriveInfo 
   **hd_table);
   +
#endif /* HW_IDE_H */
   diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
   index 5ea3cad..9561210 100644
   --- a/hw/ide/ahci.c
   +++ b/hw/ide/ahci.c
   @@ -1260,3 +1260,19 @@ static void sysbus_ahci_register_types(void)
}

type_init(sysbus_ahci_register_types)
   +
   +void pci_ahci_ide_create_devs(PCIDevice *pci_dev, DriveInfo **hd_table)
   +{
   + struct AHCIPCIState *dev = DO_UPCAST(struct AHCIPCIState, card,
   pci_dev);
   +int i;
   +
   +for (i = 0; i < dev->ahci.ports; i++) {
   +/* master device only, ignore slaves */
   +if (hd_table[i * MAX_IDE_DEVS] == NULL) {
   +continue;
   +}
   +ide_create_drive(&dev->ahci.dev[i].port, 0,
   + hd_table[i * MAX_IDE_DEVS]);
   +}
   +}
   +
  
  Ignores odd entries in hd_table[] (MAX_IDE_DEVS is 2).  Here's my
  attempt at explaining why.
  
  -drive has parameters bus, unit, and index.  index and (bus, unit) are
  related in a well-known way that depends on parameter if.  For if=ide,
  index = bus * 2 + unit.  This relationship is ABI, i.e. it cannot be
  changed.
  
  bus * 2 + unit makes sense for IDE, because each IDE bus can connect
  two IDE devices, master and slave.
  
  Boards implementing IDE reject drives with (bus, unit) that make no
  sense for the board's IDE controller(s).  A typical board has a single
  controller with two buses, which means bus > 1 gets rejected.
  
  q35 implements AHCI instead of IDE.  It connects if=ide drives to AHCI,
  because that's felt to be convenient.
  
  An AHCI port can connect a single AHCI device, unlike an IDE bus.  This
  patch maps -drive's bus to the AHCI port number.
  
  PATCH 11/25 sets up argument hd_table[] as follows:
  
  ide_drive_get(hd, MAX_SATA_PORTS);
  
  This rejects bus >= MAX_SATA_PORTS.  It doesn't reject unit == 1.  I
  believe these get silently ignored.  Bug or feature?
  
  Should we reject unit == 1 instead?
  
  Should we map -drive's index to AHCI port number instead?
 
  Right, so now that we have ide disks that can be attached to either the
  legacy ide controller or to ahci, I think we need to differentiate which
  controller we mean. That is, as proposed q35 is treating -drive if=ide
  as an ide attached to the ahci controller. I think that is broken
  behavior b/c we need a way to differentiate between the controllers.
 
 What exactly is broken?
 

 I think that -drive if=ide should result in a disk attached to piix3-ide,
 not in an ide disk attached to the ahci controller (which is the current q35
 behavior, and is 'broken' b/c we don't want that to change after q35 is
 introduced). The reason is that I think there should be an easy way
 to create an ide drive on piix3-ide, and an ide drive on the ahci
 controller. But it sounds like you don't agree with this point.

Two issues with that:

1. Why should q35 have a piix3-ide?  The ICH9 southbridge provides only
SATA, so the board needs additional circuitry to provide PATA.  As far
as I can tell, Intel's Q35 doesn't.  ICH9-based boards that do certainly
won't use a piix3-ide, because that's a *function* of the PIIX
southbridge device.  It doesn't exist separately.

2. Why should we connect -drive if=ide to a slow PATA controller instead
of a perfectly serviceable SATA controller?

  As Alexander Graf has proposed before, I think we need a -drive if=ahci
  introduced. In that case, I think we reject unit > 0, as you've
  suggested.
 
 Achieved by setting if_max_devs[IF_AHCI] to one.  bus becomes an alias
 for index, and unit must be zero.
 
 Alternatively, keep if_max_devs[IF_AHCI] zero.  Swaps role of bus and
 unit.
 
 Alex had if_max_devs[IF_AHCI] = 6.
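
The relationship between index and (bus, unit) discussed above can be sketched as follows. This is a minimal illustration, assuming the if=ide rule index = bus * 2 + unit generalizes to index = bus * max_devs + unit, and that max_devs == 0 means a single bus where unit equals index. The function names are hypothetical and are not claimed to be QEMU's own helpers.

```c
#include <assert.h>

/* Hypothetical helper: which bus a drive index lands on. */
static int sketch_index_to_bus(int max_devs, int index)
{
    assert(max_devs >= 0 && index >= 0);
    return max_devs ? index / max_devs : 0;
}

/* Hypothetical helper: which unit on that bus a drive index lands on. */
static int sketch_index_to_unit(int max_devs, int index)
{
    assert(max_devs >= 0 && index >= 0);
    return max_devs ? index % max_devs : index;
}
```

With max_devs = 2 (IDE), index 3 is bus 1, unit 1. With max_devs = 1, bus becomes an alias for index and unit is always zero. With max_devs = 0, the roles swap: unit follows index and bus is always zero.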
 
  In terms of the current q35 patch series, I think the first step would
  be to introduce the ahci interface type, and have hda-hdd be added with
  the default type for q35 of ahci. Then, we can simply fetch ahci drives
  of index 

[Qemu-devel] [PULL 0/8] usb patch queue

2012-09-26 Thread Gerd Hoffmann
  Hi,

This is the usb patch queue.  Adds a pc-1.3 machine type (patch #1) so I
can add xhci compat properties (patch #2).  xhci gets usb hub support.
Other than that just a bunch of bugfixes.

please pull,
  Gerd

The following changes since commit d9b41bcda91ea7285d934a9c2333c49cd32d1ad3:

  Merge remote-tracking branch 'origin/master' into staging (2012-09-25 
18:12:07 -0500)

are available in the git repository at:

  git://git.kraxel.org/qemu usb.66

David Gibson (1):
  usb: Fix usb_packet_map() in the presence of IOMMUs

Gerd Hoffmann (5):
  add pc-1.3 machine type
  compat: turn off msi/msix on xhci for old machine types
  xhci: tweak limits
  xhci: route string & usb hub support
  xhci: create a memory region for each port

Hans de Goede (2):
  ehci: Fix interrupt packet MULT handling
  usb-redir: Adjust pkg-config check for usbredirparser .pc file rename (v2)

 configure |6 +-
 hw/pc_piix.c  |   28 -
 hw/usb/hcd-ehci.c |   40 ++--
 hw/usb/hcd-xhci.c |  179 ++---
 hw/usb/libhw.c|   24 +---
 5 files changed, 166 insertions(+), 111 deletions(-)



[Qemu-devel] [PATCH 5/8] xhci: create a memory region for each port

2012-09-26 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/hcd-xhci.c |   85 +++--
 1 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 8c0155b..e79a872 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -285,6 +285,8 @@ typedef enum TRBCCode {
 #define SLOT_CONTEXT_ENTRIES_MASK 0x1f
 #define SLOT_CONTEXT_ENTRIES_SHIFT 27
 
+typedef struct XHCIState XHCIState;
+
 typedef enum EPType {
 ET_INVALID = 0,
 ET_ISO_OUT,
@@ -303,15 +305,15 @@ typedef struct XHCIRing {
 } XHCIRing;
 
 typedef struct XHCIPort {
+XHCIState *xhci;
 uint32_t portsc;
 uint32_t portnr;
 USBPort  *uport;
 uint32_t speedmask;
+char name[16];
+MemoryRegion mem;
 } XHCIPort;
 
-struct XHCIState;
-typedef struct XHCIState XHCIState;
-
 typedef struct XHCITransfer {
 XHCIState *xhci;
 USBPacket packet;
@@ -2430,20 +2432,14 @@ static uint64_t xhci_cap_read(void *ptr, 
target_phys_addr_t reg, unsigned size)
 return ret;
 }
 
-static uint32_t xhci_port_read(XHCIState *xhci, uint32_t reg)
+static uint64_t xhci_port_read(void *ptr, target_phys_addr_t reg, unsigned size)
 {
-uint32_t port = reg >> 4;
+XHCIPort *port = ptr;
 uint32_t ret;
 
-if (port >= xhci->numports) {
-fprintf(stderr, "xhci_port_read: port %d out of bounds\n", port);
-ret = 0;
-goto out;
-}
-
-switch (reg & 0xf) {
+switch (reg) {
 case 0x00: /* PORTSC */
-ret = xhci->ports[port].portsc;
+ret = port->portsc;
 break;
 case 0x04: /* PORTPMSC */
 case 0x08: /* PORTLI */
@@ -2452,30 +2448,25 @@ static uint32_t xhci_port_read(XHCIState *xhci, 
uint32_t reg)
 case 0x0c: /* reserved */
 default:
 fprintf(stderr, "xhci_port_read (port %d): reg 0x%x unimplemented\n",
-port, reg);
+port->portnr, (uint32_t)reg);
 ret = 0;
 }
 
-out:
-trace_usb_xhci_port_read(port, reg & 0x0f, ret);
+trace_usb_xhci_port_read(port->portnr, reg, ret);
 return ret;
 }
 
-static void xhci_port_write(XHCIState *xhci, uint32_t reg, uint32_t val)
+static void xhci_port_write(void *ptr, target_phys_addr_t reg,
+uint64_t val, unsigned size)
 {
-uint32_t port = reg >> 4;
+XHCIPort *port = ptr;
 uint32_t portsc;
 
-trace_usb_xhci_port_write(port, reg & 0x0f, val);
+trace_usb_xhci_port_write(port->portnr, reg, val);
 
-if (port >= xhci->numports) {
-fprintf(stderr, "xhci_port_read: port %d out of bounds\n", port);
-return;
-}
-
-switch (reg & 0xf) {
+switch (reg) {
 case 0x00: /* PORTSC */
-portsc = xhci->ports[port].portsc;
+portsc = port->portsc;
 /* write-1-to-clear bits*/
 portsc &= ~(val & (PORTSC_CSC|PORTSC_PEC|PORTSC_WRC|PORTSC_OCC|
 PORTSC_PRC|PORTSC_PLC|PORTSC_CEC));
@@ -2490,16 +2481,16 @@ static void xhci_port_write(XHCIState *xhci, uint32_t 
reg, uint32_t val)
 /* write-1-to-start bits */
 if (val & PORTSC_PR) {
 DPRINTF("xhci: port %d reset\n", port);
-usb_device_reset(xhci->ports[port].uport->dev);
+usb_device_reset(port->uport->dev);
 portsc |= PORTSC_PRC | PORTSC_PED;
 }
-xhci->ports[port].portsc = portsc;
+port->portsc = portsc;
 break;
 case 0x04: /* PORTPMSC */
 case 0x08: /* PORTLI */
 default:
 fprintf(stderr, "xhci_port_write (port %d): reg 0x%x unimplemented\n",
-port, reg);
+port->portnr, (uint32_t)reg);
 }
 }
 
@@ -2508,10 +2499,6 @@ static uint64_t xhci_oper_read(void *ptr, 
target_phys_addr_t reg, unsigned size)
 XHCIState *xhci = ptr;
 uint32_t ret;
 
-if (reg >= 0x400) {
-return xhci_port_read(xhci, reg - 0x400);
-}
-
 switch (reg) {
 case 0x00: /* USBCMD */
 ret = xhci->usbcmd;
@@ -2554,11 +2541,6 @@ static void xhci_oper_write(void *ptr, 
target_phys_addr_t reg,
 {
 XHCIState *xhci = ptr;
 
-if (reg >= 0x400) {
-xhci_port_write(xhci, reg - 0x400, val);
-return;
-}
-
 trace_usb_xhci_oper_write(reg, val);
 
 switch (reg) {
@@ -2777,6 +2759,14 @@ static const MemoryRegionOps xhci_oper_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static const MemoryRegionOps xhci_port_ops = {
+.read = xhci_port_read,
+.write = xhci_port_write,
+.valid.min_access_size = 4,
+.valid.max_access_size = 4,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static const MemoryRegionOps xhci_runtime_ops = {
 .read = xhci_runtime_read,
 .write = xhci_runtime_write,
@@ -2850,7 +2840,7 @@ static void xhci_child_detach(USBPort *uport, USBDevice 
*child)
 }
 }
 
-static USBPortOps xhci_port_ops = {
+static USBPortOps xhci_uport_ops = {
 .attach   = xhci_attach,
 .detach   = xhci_detach,
 .wakeup   = xhci_wakeup,
@@ -2930,6 +2920,7 @@ 
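
The change in dispatch style made by this patch can be sketched outside QEMU. This is an illustration under stated assumptions: SketchPort and the sketch_* functions are hypothetical, only the PORTSC register is modelled, and the 0x400 base and 0x10 stride are taken from the removed `reg - 0x400`, `reg >> 4` and `reg & 0xf` decoding shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PORT_BASE   0x400u  /* first port register block, per the old code */
#define SKETCH_PORT_STRIDE 0x10u   /* bytes per port, per the old "reg >> 4" decode */

/* Hypothetical stand-in for XHCIPort. */
typedef struct {
    uint32_t portnr;
    uint32_t portsc;
} SketchPort;

/* Old style: one handler decodes port index and register from the offset. */
static uint32_t sketch_read_decoded(SketchPort *ports, uint32_t nports,
                                    uint32_t oper_reg)
{
    uint32_t reg, port;

    assert(oper_reg >= SKETCH_PORT_BASE);
    reg = oper_reg - SKETCH_PORT_BASE;
    port = reg >> 4;
    if (port >= nports) {
        return 0;  /* out of bounds */
    }
    return (reg & 0xf) == 0x00 ? ports[port].portsc : 0;
}

/* New style: the opaque pointer already is the port, and the offset is
 * relative to that port's own region, so no bounds check is needed. */
static uint32_t sketch_read_per_port(void *opaque, uint32_t reg)
{
    SketchPort *port = opaque;

    return reg == 0x00 ? port->portsc : 0;
}
```

Both functions return the same value for the same register; the per-port form simply moves the decoding into the region layout.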

Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Michael Tokarev
On 26.09.2012 12:00, Peter Maydell wrote:

 I know lots of people use -nographic -daemonize to run headless
 guests in background (like, for example, a router).  I guess it
 came well before the -vga option was introduced, but at least I
 know about -vga (but not about -vga none).  For one, I never saw
 -display before.  And it looks like -nographic is a synonym for
 -display none, and -curses is a synonym for -display curses.

I mean, -nographic is about the same as -vga none -display none.

 -nographic does about three different things at once (and I think
 some of its effects aren't documented). It's a legacy option retained
 for backward compatibility with old command lines.

Sure.  Just like, for example, -stdvga was at the time being.

 If you want something that is non-confusing and makes sense, then
 use -display none to disable graphics, -serial stdio to send serial
 to stdio, and so on. These newer options do one clear thing each
 and can be combined straightforwardly.
 
 It looks like we have way too many confusing options doing the
 same thing.  And I think they should be consistent, at least
 when they SMELL like they do the same thing, instead of forbidding
 one or another in some situations.
 
 I'd love to drop -nographic but we'd break huge numbers of
 existing setups...

So let's make it actually work as expected till we're able to finally
drop it.

What is the equivalent of -nographic in terms of -vga/-display/-...?
From the code it is something like

 -vga none -display none -serial mon:stdio -parallel null

(this is the code I tried to patch).

Note: this, combined with -daemonize, also has the same issue,
namely, the tty is left in a bad state after the qemu process is backgrounded,
and for the very same reason: -serial stdio switches the tty into
raw mode.  So this should be fixed too -- somehow, either by forbidding
this combination completely or by silently substituting stdio for
-serial with null.  But it will be done in a subsequent patch.

Note also: by forbidding -nographic -daemonize, we'll break lots of
existing setups too, and I still don't see why this combination is
bad, I already demonstrated that it can be made to work in a more
or less reasonable/expected way.

Thanks,

/mjt



[Qemu-devel] [PATCH V3] virtio-blk: add default serial id

2012-09-26 Thread Dave Young
For a virtio block device, if the user does not specify the serial attribute,
no serial is available, which makes it inconvenient to identify
the disk.

As is done for ide disks, add a default VD? serial
number if the user does not specify one.

Signed-off-by: Dave Young dyo...@redhat.com
---
 hw/virtio-blk.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

--- qemu-kvm.orig/hw/virtio-blk.c   2012-09-02 09:45:03.115696878 +0800
+++ qemu-kvm/hw/virtio-blk.c2012-09-20 20:57:56.177206991 +0800
@@ -22,6 +22,7 @@
 # include <scsi/sg.h>
 #endif
 
+#define DEFAULT_VIRTIO_BLK_SERIAL_LEN 8
 typedef struct VirtIOBlock
 {
 VirtIODevice vdev;
@@ -33,6 +34,7 @@ typedef struct VirtIOBlock
 VirtIOBlkConf *blk;
 unsigned short sector_mask;
 DeviceState *qdev;
+int drive_serial;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -364,6 +366,7 @@ static void virtio_blk_handle_request(Vi
 MultiReqBuffer *mrb)
 {
 uint32_t type;
+char serial[DEFAULT_VIRTIO_BLK_SERIAL_LEN];
 
 if (req->elem.out_num < 1 || req->elem.in_num < 1) {
 error_report("virtio-blk missing headers");
@@ -388,12 +391,14 @@ static void virtio_blk_handle_request(Vi
 } else if (type & VIRTIO_BLK_T_GET_ID) {
 VirtIOBlock *s = req->dev;
 
+snprintf(serial, DEFAULT_VIRTIO_BLK_SERIAL_LEN,
+ "VD%05d", s->drive_serial);
 /*
  * NB: per existing s/n string convention the string is
  * terminated by '\0' only when shorter than buffer.
  */
 strncpy(req->elem.in_sg[0].iov_base,
-s->blk->serial ? s->blk->serial : "",
+s->blk->serial ? s->blk->serial : serial,
 MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));
 virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 g_free(req);
@@ -611,6 +616,7 @@ static const BlockDevOps virtio_block_op
 
 VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 {
+static int drive_serial = 1;
 VirtIOBlock *s;
 static int virtio_blk_id;
 
@@ -632,6 +638,7 @@ VirtIODevice *virtio_blk_init(DeviceStat
   sizeof(struct virtio_blk_config),
   sizeof(VirtIOBlock));
 
+s->drive_serial = drive_serial++;
 s->vdev.get_config = virtio_blk_update_config;
 s->vdev.set_config = virtio_blk_set_config;
 s->vdev.get_features = virtio_blk_get_features;
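
The fallback logic in the patch above can be sketched on its own. This is an illustration, assuming the same 8-byte buffer and "VD%05d" format as the patch; sketch_fill_serial and its parameters are hypothetical names, and the destination buffer stands in for the guest's GET_ID buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_SERIAL_LEN 8  /* mirrors DEFAULT_VIRTIO_BLK_SERIAL_LEN in the patch */

/* Copy the user-supplied serial if there is one, else a generated default.
 * Like the patch, this uses strncpy, so the result is NUL-terminated only
 * when the chosen string is shorter than dst_len. */
static void sketch_fill_serial(char *dst, size_t dst_len,
                               const char *user_serial, int drive_serial)
{
    char fallback[SKETCH_SERIAL_LEN];

    assert(dst != NULL);
    /* "VD" plus five digits is 7 characters, which fits with the NUL. */
    snprintf(fallback, sizeof(fallback), "VD%05d", drive_serial);
    strncpy(dst, user_serial ? user_serial : fallback, dst_len);
}
```

One consequence of the 8-byte buffer is that a drive counter of 100000 or more is truncated by snprintf to seven characters, so defaults would stop being unique at that point.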




[Qemu-devel] [PATCH 8/8] usb: Fix usb_packet_map() in the presence of IOMMUs

2012-09-26 Thread Gerd Hoffmann
From: David Gibson da...@gibson.dropbear.id.au

With the IOMMU infrastructure introduced before 1.2, we need to use
dma_memory_map() to obtain a qemu pointer to memory from an IO bus address.
However, dma_memory_map() alters the given length to reflect the length
over which the used DMA translation is valid - which could be either more
or less than the requested length.

usb_packet_map() does not correctly handle these cases, simply failing if
dma_memory_map() alters the requested length.  If dma_memory_map()
increased the length, we just need to use the requested length for the
qemu_iovec_add().  However, if it decreased the length, it means that a
single DMA translation is not valid for the whole sglist element, and so
we need to loop, splitting it up into multiple iovec entries for each
piece with a DMA translation (in practice more than 2 pieces is unlikely).

This patch implements the correct behaviour.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/libhw.c |   24 +++-
 1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/hw/usb/libhw.c b/hw/usb/libhw.c
index c0de30e..703e2d2 100644
--- a/hw/usb/libhw.c
+++ b/hw/usb/libhw.c
@@ -28,19 +28,25 @@ int usb_packet_map(USBPacket *p, QEMUSGList *sgl)
 {
 DMADirection dir = (p->pid == USB_TOKEN_IN) ?
 DMA_DIRECTION_FROM_DEVICE : DMA_DIRECTION_TO_DEVICE;
-dma_addr_t len;
 void *mem;
 int i;
 
 for (i = 0; i < sgl->nsg; i++) {
-len = sgl->sg[i].len;
-mem = dma_memory_map(sgl->dma, sgl->sg[i].base, &len, dir);
-if (!mem) {
-goto err;
-}
-qemu_iovec_add(&p->iov, mem, len);
-if (len != sgl->sg[i].len) {
-goto err;
+dma_addr_t base = sgl->sg[i].base;
+dma_addr_t len = sgl->sg[i].len;
+
+while (len) {
+dma_addr_t xlen = len;
+mem = dma_memory_map(sgl->dma, sgl->sg[i].base, &xlen, dir);
+if (!mem) {
+goto err;
+}
+if (xlen > len) {
+xlen = len;
+}
+qemu_iovec_add(&p->iov, mem, xlen);
+len -= xlen;
+base += xlen;
 }
 }
 return 0;
-- 
1.7.1
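
The splitting loop described in the commit message can be sketched without any QEMU types. This is an illustration under stated assumptions: sketch_map is a hypothetical stand-in for a DMA map call whose translation is valid only up to the end of a 4096-byte window, and SketchIov stands in for an iovec entry. Note that the sketch maps each piece starting from the advancing base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WINDOW  4096u  /* hypothetical size of one contiguous translation */
#define SKETCH_MAX_IOV 8

typedef struct {
    uint64_t base;
    uint64_t len;
} SketchIov;

/* Hypothetical map call: reports how many bytes starting at base are covered
 * by a single translation. The result may be more or less than requested. */
static int sketch_map(uint64_t base, uint64_t *len)
{
    *len = SKETCH_WINDOW - (base % SKETCH_WINDOW);
    return 1;
}

/* Split [base, base + len) into pieces that each fit one translation.
 * Returns the number of pieces, or -1 on failure. */
static int sketch_split(uint64_t base, uint64_t len,
                        SketchIov *iov, size_t max_iov)
{
    size_t n = 0;

    assert(iov != NULL);
    while (len) {
        uint64_t xlen = len;

        if (!sketch_map(base, &xlen) || n == max_iov) {
            return -1;
        }
        if (xlen > len) {
            xlen = len;      /* mapping covers more than we asked for */
        }
        iov[n].base = base;
        iov[n].len = xlen;
        n++;
        len -= xlen;
        base += xlen;        /* next piece starts where this one ended */
    }
    return (int)n;
}
```

A 200-byte request starting at offset 4000 yields two pieces (96 and 104 bytes), while a request that fits inside one window yields a single piece clamped to the requested length.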




Re: [Qemu-devel] [PATCH v2] qemu/xen: Add 64 bits big bar support on qemu

2012-09-26 Thread Hao, Xudong
 -Original Message-
 From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com]
 Sent: Tuesday, September 25, 2012 6:52 PM
 To: Hao, Xudong
 Cc: Stefano Stabellini; xen-de...@lists.xen.org; qemu-devel@nongnu.org;
 Zhang, Xiantao
 Subject: Re: [PATCH v2] qemu/xen: Add 64 bits big bar support on qemu
 
 On Tue, 25 Sep 2012, Xudong Hao wrote:
  Changes from v1:
  - Rebase to qemu upstream from qemu-xen
 
 Thanks. Please run scripts/checkpatch.pl on this patch, you'll find
 some coding style issues that need to be fixed.
 
OK, will use this script to check code style.

 
  Currently it is assumed that a PCI device BAR accesses memory below 4G. If a
  device has a BAR larger than 4G, it must access memory addresses above 4G.
  This patch enables 64-bit big BAR support in qemu.
 
  Signed-off-by: Xudong Hao xudong@intel.com
  Signed-off-by: Xiantao Zhang xiantao.zh...@intel.com
  ---
   hw/xen_pt.c |   16 
   hw/xen_pt_config_init.c |   42
 +-
 
  diff --git a/hw/xen_pt.c b/hw/xen_pt.c
  index 307119a..2a8bcf3 100644
  --- a/hw/xen_pt.c
  +++ b/hw/xen_pt.c
  @@ -403,21 +403,21 @@ static int
 xen_pt_register_regions(XenPCIPassthroughState *s)
 
   s-bases[i].access.u = r-base_addr;
 
  -if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
  +if (r->type & XEN_HOST_PCI_REGION_TYPE_IO)
   type = PCI_BASE_ADDRESS_SPACE_IO;
  -} else {
  +else if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64)
  +type = PCI_BASE_ADDRESS_MEM_TYPE_64;
  +else if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH)
  +type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
  +else
   type = PCI_BASE_ADDRESS_SPACE_MEMORY;
  -if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
  -type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
  -}
  -}
 
 Aside from the coding style issue here, this changes the behavior for type
 PCI_BASE_ADDRESS_SPACE_MEMORY. Before we could have:
 
 type =
 PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_PREFETCH;
 
 now we cannot anymore.
 

Will change to:
-if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
+if (r->type & XEN_HOST_PCI_REGION_TYPE_IO)
 type = PCI_BASE_ADDRESS_SPACE_IO;
-} else {
+else
 type = PCI_BASE_ADDRESS_SPACE_MEMORY;
-if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
+if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64)
+type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
+   else if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH)
 type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
-}
-}
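
One way to keep both memory flags available, as the review comment asks, is to test them independently rather than in an if/else-if chain. The sketch below is only an illustration: the SK_PCI_* values follow the usual PCI BAR flag bits, while the SK_REGION_* bit positions are assumptions chosen for the example and may not match the real XEN_HOST_PCI_REGION_TYPE_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* PCI BAR flag bits (low bits of a base address register). */
#define SK_PCI_SPACE_IO      0x01u
#define SK_PCI_SPACE_MEMORY  0x00u
#define SK_PCI_MEM_TYPE_64   0x04u
#define SK_PCI_MEM_PREFETCH  0x08u

/* Hypothetical host-region flag bits, for illustration only. */
#define SK_REGION_IO         (1u << 1)
#define SK_REGION_PREFETCH   (1u << 3)
#define SK_REGION_MEM_64     (1u << 4)

/* Translate host region flags into BAR type flags. */
static uint32_t sketch_bar_type(uint32_t region_type)
{
    uint32_t type;

    if (region_type & SK_REGION_IO) {
        return SK_PCI_SPACE_IO;
    }
    type = SK_PCI_SPACE_MEMORY;
    if (region_type & SK_REGION_MEM_64) {
        type |= SK_PCI_MEM_TYPE_64;
    }
    if (region_type & SK_REGION_PREFETCH) {
        type |= SK_PCI_MEM_PREFETCH;  /* independent of the 64-bit flag */
    }
    assert((type & SK_PCI_SPACE_IO) == 0);
    return type;
}
```

With independent tests, a region that is both 64-bit and prefetchable ends up with both bits set, and a 32-bit prefetchable region still gets the prefetch bit as before the patch.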

 
   memory_region_init_io(s-bar[i], ops, s-dev,
 xen-pci-pt-bar, r-size);
   pci_register_bar(s-dev, i, type, s-bar[i]);
 
  -XEN_PT_LOG(s-dev, IO region %i registered
 (size=0x%08PRIx64
  -base_addr=0x%08PRIx64 type: %#x)\n,
  +XEN_PT_LOG(s-dev, IO region %i registered
 (size=0x%lxPRIx64
  +base_addr=0x%lxPRIx64 type: %#x)\n,
  i, r-size, r-base_addr, type);
   }
 
  diff --git a/hw/xen_pt_config_init.c b/hw/xen_pt_config_init.c
  index e524a40..5e7ca22 100644
  --- a/hw/xen_pt_config_init.c
  +++ b/hw/xen_pt_config_init.c
  @@ -342,6 +342,7 @@ static int
 xen_pt_cmd_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
   #define XEN_PT_BAR_IO_RO_MASK 0x0003  /* BAR ReadOnly
 mask(I/O) */
   #define XEN_PT_BAR_IO_EMU_MASK0xFFFC  /* BAR emul
 mask(I/O) */
 
  +static uint64_t xen_pt_get_bar_size(PCIIORegion *r);
 
 there is just one user of xen_pt_get_bar_size, maybe you can just
 move the implementation here
 

Ok, thanks.

 
   static XenPTBarFlag xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
XenPTRegInfo *reg)
   {
  @@ -366,7 +367,7 @@ static XenPTBarFlag
 xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
 
   /* check unused BAR */
   r = &d->io_regions[index];
  -if (r->size == 0) {
  +if (!xen_pt_get_bar_size(r)) {
   return XEN_PT_BAR_FLAG_UNUSED;
   }
 
  @@ -383,6 +384,24 @@ static XenPTBarFlag
 xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
   }
   }
 
  +static bool is_64bit_bar(PCIIORegion *r)
  +{
  +return !!(r->type & PCI_BASE_ADDRESS_MEM_TYPE_64);
  +}
  +
  +static uint64_t xen_pt_get_bar_size(PCIIORegion *r)
  +{
  +if (is_64bit_bar(r))
  +{
  +uint64_t size64;
  +size64 = (r + 1)->size;
  +size64 <<= 32;
  +size64 += r->size;
  +return size64;
  +}
  +return r->size;
  +}
  +
   static inline uint32_t base_address_with_flags(XenHostPCIIORegion *hr)
   {
   if (hr->type & XEN_HOST_PCI_REGION_TYPE_IO) {
  @@ -481,7 +500,10 @@ static int
 xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
   switch 

[Qemu-devel] [PATCH 7/8] usb-redir: Adjust pkg-config check for usbredirparser .pc file rename (v2)

2012-09-26 Thread Gerd Hoffmann
From: Hans de Goede hdego...@redhat.com

The usbredir 0.5 release introduced the new API for 64 bit packet ids, but
it kept the libusbredirparser.pc name as is, meaning that older versions of
qemu will still have their pkg-config check for usbredirparser fulfilled,
and build with the usb-redir device. Due to the API change there will be
some compiler warnings, but the build will succeed, however the usb-redir
device will be broken on 32 bit machines.

To solve this a new usbredir-0.5.2 release is coming, which renames the
libusbredirparser.pc file to libusbredirparser-0.5.pc, so that it will no
longer fulfill the pkg-config check of the qemu-1.2 and older releases,
stopping the (silent) breakage. This patch adjusts qemu master's configure
to properly detect the new usbredir release.

Changes in v2:
-Not only use the new .pc name in the check but also when getting cflags
 and libs!

Signed-off-by: Hans de Goede hdego...@redhat.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 configure |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 1b86517..4f24062 100755
--- a/configure
+++ b/configure
@@ -2752,10 +2752,10 @@ fi
 
 # check for usbredirparser for usb network redirection support
 if test "$usb_redir" != "no" ; then
-if $pkg_config --atleast-version=0.5 libusbredirparser >/dev/null 2>&1 ; then
+if $pkg_config --atleast-version=0.5 libusbredirparser-0.5 >/dev/null 2>&1 ; then
 usb_redir=yes
-usb_redir_cflags=$($pkg_config --cflags libusbredirparser 2>/dev/null)
-usb_redir_libs=$($pkg_config --libs libusbredirparser 2>/dev/null)
+usb_redir_cflags=$($pkg_config --cflags libusbredirparser-0.5 2>/dev/null)
+usb_redir_libs=$($pkg_config --libs libusbredirparser-0.5 2>/dev/null)
 QEMU_CFLAGS="$QEMU_CFLAGS $usb_redir_cflags"
 libs_softmmu="$libs_softmmu $usb_redir_libs"
 else
-- 
1.7.1




[Qemu-devel] [PATCH 3/8] xhci: tweak limits

2012-09-26 Thread Gerd Hoffmann
Set maxports to 15.  This is what the usb3 route string can handle.

Set maxslots to 64.  This is more than the number of root ports we
can have, but with additional hubs you can end up with more devices.

Set maxintrs (aka msi vectors) to 16.  Should be enough, especially
considering that vectors are a limited resource.  Linux guests use
only three at the moment.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/hcd-xhci.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index e0ca690..1414826 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -37,12 +37,12 @@
 #define FIXME() do { fprintf(stderr, "FIXME %s:%d\n", \
  __func__, __LINE__); abort(); } while (0)
 
-#define MAXPORTS_2 8
-#define MAXPORTS_3 8
+#define MAXPORTS_2 15
+#define MAXPORTS_3 15
 
 #define MAXPORTS (MAXPORTS_2+MAXPORTS_3)
-#define MAXSLOTS MAXPORTS
-#define MAXINTRS MAXPORTS
+#define MAXSLOTS 64
+#define MAXINTRS 16
 
 #define TD_QUEUE 24
 
-- 
1.7.1




Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Peter Maydell
On 26 September 2012 09:17, Michael Tokarev m...@tls.msk.ru wrote:
 On 26.09.2012 12:00, Peter Maydell wrote:

 I know lots of people use -nographic -daemonize to run headless
 guests in background (like, for example, a router).  I guess it
 came well before the -vga option was introduced, but at least I
 know about -vga (but not about -vga none).  For one, I never saw
 -display before.  And it looks like -nographic is a synonym for
 -display none, and -curses is a synonym for -display curses.

 I mean, -nographic is about the same as -vga none -display none.

...except that it *also* messes around with where the serial output
goes and with the parallel port and maybe something else.

 What is the equivalent of -nographic in terms of -vga/-display/-...?
 From the code it is something like

  -vga none -display none -serial mon:stdio -parallel null

It's something like that. It would be nice to implement -nographic
as a plain alias for that set of options, but IIRC it isn't quite doable.
(maybe I misremember)

 (this is the code I tried to patch).

 Note: this, combined with -daemonize, also has the same issue,
 namely, the tty is left in a bad state after the qemu process is backgrounded,
 and for the very same reason: -serial stdio switches the tty into
 raw mode.  So this should be fixed too -- somehow, either by forbidding
 this combination completely or by silently substituting stdio for
 -serial with null.  But it will be done in a subsequent patch.

 Note also: by forbidding -nographic -daemonize, we'll break lots of
 existing setups too, and I still don't see why this combination is
 bad, I already demonstrated that it can be made to work in a more
 or less reasonable/expected way.

Because you've asked both "put me into the background" and "please
send stuff to stdio". Admittedly you've probably done that because
you didn't really understand that '-nographic' doesn't mean
'-display none', but you've still asked for a nonsensical combination.

-- PMM



Re: [Qemu-devel] [Qemu-ppc] [PATCH v10 1/1] Add USB option in machine options

2012-09-26 Thread Li Zhang
Thanks Gerd.

Hi Alex,
Can this patch be pushed to upstream?

Thanks. -:)

On Wed, Sep 26, 2012 at 2:29 PM, Gerd Hoffmann kra...@redhat.com wrote:
 On 09/26/12 06:20, Li Zhang wrote:
 Would you please have a look at my patch when you have time?
 Because it is related to USB, I hope to get your approval before
 it is pushed.

 Looks good to me.

 cheers,
   Gerd



-- 

Best Regards
-Li



Re: [Qemu-devel] [Qemu-ppc] RFC: NVRAM for pseries machine

2012-09-26 Thread Alexander Graf


On 26.09.2012, at 03:18, David Gibson da...@gibson.dropbear.id.au wrote:

 On Wed, Sep 26, 2012 at 03:03:10AM +0200, Alexander Graf wrote:
 On 26.09.2012, at 02:27, David Gibson wrote:
 On Mon, Sep 24, 2012 at 12:38:59PM +0200, Alexander Graf wrote:
 On 24.09.2012, at 02:31, David Gibson wrote:
 [snip]
 So, if you look at the patch there is actually a -device form within
 there, the machine option is a wrapper around it.  Without the machine
 option, I don't see how to get the desired properties for the
 configuration that is:
 * NVRAM is always instantiated by default (even if it's
 non-persistent)
 * It's easy to set the drive for that always-present NVRAM
 
 I suppose the idea is that when creating a machine from a qtree
 dump, we can still recreate it. Or maybe when using -nodefaults? Not
 sure. But the way you do it right now is very close to how we want
 to model USB too, so I do like it. It's consistent.
 
 I really don't follow what point you're making here.
 
 The problem with -device syntax for my purpose is that with *no* extra
 command line arguments we should always have some sort of NVRAM - it's
 mandated by the platform spec, and should always be there, just like
 the PCI bridge and VIO bridge.  That means instantiating the device
 from the machine setup code.  But then, using -device will create a
 second instance of the device, which is no good, because only one can
 actually be used.
 
 What I'm trying to say is that the machine file should create a
 device. Always in the case of PAPR. But I suppose pseudo-code is
 easier to read:
 
 spapr.c:
 
  create_device(spapr-nvram, drive=machine_opts[nvram]);
 
 Ok.  That's what I do now.
 
 spapr-nvram:
 
  if (!drive || checksum_is_bad(drive))
autogenerate_nvram_contents();
 
 Actually, I'm planning for the initialization of the content to be
 done from the guest firmware.

Does the guest have all information necessary to construct a workable nvram 
image? If so, then yes, that's even better.

Alex

 
 
 Then we can later add in vl.c:
 
  case OPTION_nvram:
create_drive(nvram, option);
machine_opts[nvram] = drive[nvram];
 
 Ok, that all works for me.
 
 Blue, does that seem reasonable to you?
 
 -- 
 David Gibson| I'll have my music baroque, and my code
 david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
 http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH V3] virtio-blk: add default serial id

2012-09-26 Thread Dave Young
Hi, Eric

This is in fact the same as v1 except for the spelling fixes.

I switched back to v1 because of your concern about the same id being
reused. For v1, unplugging 10 times is unlikely and would be insane, so I
think we can safely ignore it.

Unless qemu-img creates a UUID for disk images, I think there's no way
to generate really ideal *unique* ids.

On 09/26/2012 04:18 PM, Dave Young wrote:

 For a virtio block device, if the user does not specify the serial attribute,
 there will be no serial available, which is not convenient for identifying
 the disk.
 
 Doing something similar to ide disks, add a VD? default serial
 number if user does not specify it.
 
 Signed-off-by: Dave Young dyo...@redhat.com
 ---
  hw/virtio-blk.c |9 -
  1 file changed, 8 insertions(+), 1 deletion(-)
 
 --- qemu-kvm.orig/hw/virtio-blk.c 2012-09-02 09:45:03.115696878 +0800
 +++ qemu-kvm/hw/virtio-blk.c  2012-09-20 20:57:56.177206991 +0800
 @@ -22,6 +22,7 @@
  # include <scsi/sg.h>
  #endif
  
 +#define DEFAULT_VIRTIO_BLK_SERIAL_LEN 8
  typedef struct VirtIOBlock
  {
  VirtIODevice vdev;
 @@ -33,6 +34,7 @@ typedef struct VirtIOBlock
  VirtIOBlkConf *blk;
  unsigned short sector_mask;
  DeviceState *qdev;
 +int drive_serial;
  } VirtIOBlock;
  
  static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
 @@ -364,6 +366,7 @@ static void virtio_blk_handle_request(Vi
  MultiReqBuffer *mrb)
  {
  uint32_t type;
 +char serial[DEFAULT_VIRTIO_BLK_SERIAL_LEN];
  
  if (req->elem.out_num < 1 || req->elem.in_num < 1) {
  error_report("virtio-blk missing headers");
 @@ -388,12 +391,14 @@ static void virtio_blk_handle_request(Vi
  } else if (type & VIRTIO_BLK_T_GET_ID) {
  VirtIOBlock *s = req->dev;
  
 +snprintf(serial, DEFAULT_VIRTIO_BLK_SERIAL_LEN,
 + "VD%05d", s->drive_serial);
  /*
   * NB: per existing s/n string convention the string is
   * terminated by '\0' only when shorter than buffer.
   */
  strncpy(req->elem.in_sg[0].iov_base,
 -s->blk->serial ? s->blk->serial : "",
 +s->blk->serial ? s->blk->serial : serial,
  MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));
  virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
  g_free(req);
 @@ -611,6 +616,7 @@ static const BlockDevOps virtio_block_op
  
  VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
  {
 +static int drive_serial = 1;
  VirtIOBlock *s;
  static int virtio_blk_id;
  
 @@ -632,6 +638,7 @@ VirtIODevice *virtio_blk_init(DeviceStat
sizeof(struct virtio_blk_config),
sizeof(VirtIOBlock));
  
 +s->drive_serial = drive_serial++;
  s->vdev.get_config = virtio_blk_update_config;
  s->vdev.set_config = virtio_blk_set_config;
  s->vdev.get_features = virtio_blk_get_features;
 
 



-- 
Thanks
Dave
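
For illustration, the fallback in the patch above can be sketched outside
QEMU as follows. This is only a minimal sketch, not the patch itself:
pick_serial() and DEFAULT_SERIAL_LEN are hypothetical names. The only
behaviour taken from the patch is that a user-supplied serial wins and that
the default is formatted as "VD%05d" into an 8-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* "VD" + 5 digits + terminating NUL; mirrors the 8-byte buffer in the patch. */
#define DEFAULT_SERIAL_LEN 8

/*
 * Hypothetical helper: return the user-supplied serial if there is one,
 * otherwise format a default "VDnnnnn" serial into buf and return buf.
 */
static const char *pick_serial(const char *user_serial, int drive_index,
                               char *buf, size_t len)
{
    assert(buf != NULL && len >= DEFAULT_SERIAL_LEN);

    if (user_serial) {
        return user_serial;
    }
    /* snprintf always NUL-terminates; an index above 99999 gets truncated. */
    snprintf(buf, len, "VD%05d", drive_index);
    return buf;
}
```

Since the counter in the patch is a static that only ever increments, two
devices never share a default serial within one QEMU run, but the same
serial can come back after a restart, which is the reuse concern mentioned
above.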



Re: [Qemu-devel] [Qemu-ppc] [PATCH v10 1/1] Add USB option in machine options

2012-09-26 Thread Alexander Graf


On 26.09.2012, at 10:50, Li Zhang zhlci...@gmail.com wrote:

 Thanks Gerd.
 
 Hi Alex,
 Can this patch be pushed to upstream?

Gerd, do you want to take this through the usb queue? If I get a formal ack 
from you I can also take it into ppc-next, but it doesn't quite belong there :).

Alex

 
 Thanks. -:)
 
 On Wed, Sep 26, 2012 at 2:29 PM, Gerd Hoffmann kra...@redhat.com wrote:
 On 09/26/12 06:20, Li Zhang wrote:
 Would you please have a look at my patch when you have time?
 Because it is related to USB, I hope to get your approval before
 it is pushed.
 
 Looks good to me.
 
 cheers,
  Gerd
 
 
 
 -- 
 
 Best Regards
 -Li



Re: [Qemu-devel] [PATCH v2] Add infrastructure for QIDL-based device serialization

2012-09-26 Thread Paolo Bonzini
Il 25/09/2012 23:12, Anthony Liguori ha scritto:
 Just like sparse is a compiler, so is qidl.  We are free to use the
 '_' + lowercase prefix.
 
   ISADevice _immutable dev;
 
 It's an established practice in wide-use.

But QEMU is also compiled with GCC, so we're not.  The Linux kernel
coding standards are not something to imitate for what concerns use of
reserved identifiers, if only because it targets a freestanding environment.

Paolo



Re: [Qemu-devel] [PATCH v9 4/4] block: Support GlusterFS as a QEMU block backend.

2012-09-26 Thread Kevin Wolf
Am 24.09.2012 11:13, schrieb Bharata B Rao:
 block: Support GlusterFS as a QEMU block backend.
 
 From: Bharata B Rao bhar...@linux.vnet.ibm.com
 
 This patch adds gluster as the new block backend in QEMU. This gives
 QEMU the ability to boot VM images from gluster volumes. It's already
 possible to boot from VM images on gluster volumes using FUSE mount, but
 this patchset provides the ability to boot VM images from gluster volumes
 by by-passing the FUSE layer in gluster. This is made possible by
 using libgfapi routines to perform IO on gluster volumes directly.
 
 VM Image on gluster volume is specified like this:
 
 file=gluster[+transport]://[server[:port]]/volname/image[?socket=...]
 
 'gluster' is the protocol.
 
 'transport' specifies the transport type used to connect to gluster
 management daemon (glusterd). Valid transport types are
 tcp, unix and rdma. If a transport type isn't specified, then tcp
 type is assumed.
 
 'server' specifies the server where the volume file specification for
 the given volume resides. This can be either hostname, ipv4 address
 or ipv6 address. ipv6 address needs to be within square brackets [ ].
 If transport type is 'unix', then server field is ignored, but the
 'socket' field needs to be populated with the path to unix domain
 socket.
 
 'port' is the port number on which glusterd is listening. This is optional
 and if not specified, QEMU will send 0 which will make gluster use the
 default port. port is ignored for unix type of transport.
 
 'volname' is the name of the gluster volume which contains the VM image.
 
 'image' is the path to the actual VM image that resides on gluster volume.
 
 Examples:
 
 file=gluster://1.2.3.4/testvol/a.img
 file=gluster+tcp://1.2.3.4/testvol/a.img
 file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
 file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
 file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
 file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
 file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
 file=gluster+rdma://1.2.3.4:24007/testvol/a.img
 
 Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
 ---
 
  block/Makefile.objs |1 
  block/gluster.c |  642 
 +++
  2 files changed, 643 insertions(+), 0 deletions(-)
  create mode 100644 block/gluster.c
 
 
 diff --git a/block/Makefile.objs b/block/Makefile.objs
 index b5754d3..a1ae67f 100644
 --- a/block/Makefile.objs
 +++ b/block/Makefile.objs
 @@ -9,3 +9,4 @@ block-obj-$(CONFIG_POSIX) += raw-posix.o
  block-obj-$(CONFIG_LIBISCSI) += iscsi.o
  block-obj-$(CONFIG_CURL) += curl.o
  block-obj-$(CONFIG_RBD) += rbd.o
 +block-obj-$(CONFIG_GLUSTERFS) += gluster.o
 diff --git a/block/gluster.c b/block/gluster.c
 new file mode 100644
 index 000..a2f8303
 --- /dev/null
 +++ b/block/gluster.c
 @@ -0,0 +1,642 @@
 +/*
 + * GlusterFS backend for QEMU
 + *
 + * Copyright (C) 2012 Bharata B Rao bhar...@linux.vnet.ibm.com
 + *
 + * Pipe handling mechanism in AIO implementation is derived from
 + * block/rbd.c. Hence,
 + *
 + * Copyright (C) 2010-2011 Christian Brunner c...@muc.de,
 + * Josh Durgin josh.dur...@dreamhost.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2.  See
 + * the COPYING file in the top-level directory.
 + *
 + * Contributions after 2012-01-13 are licensed under the terms of the
 + * GNU GPL, version 2 or (at your option) any later version.
 + */
 +#include <glusterfs/api/glfs.h>
 +#include "block_int.h"
 +#include "qemu_socket.h"
 +#include "uri.h"
 +
 +typedef struct GlusterAIOCB {
 +BlockDriverAIOCB common;
 +int64_t size;
 +int ret;
 +bool *finished;
 +QEMUBH *bh;
 +} GlusterAIOCB;
 +
 +typedef struct BDRVGlusterState {
 +struct glfs *glfs;
 +int fds[2];
 +struct glfs_fd *fd;
 +int qemu_aio_count;
 +int event_reader_pos;
 +GlusterAIOCB *event_acb;
 +} BDRVGlusterState;
 +
 +#define GLUSTER_FD_READ  0
 +#define GLUSTER_FD_WRITE 1
 +
 +typedef struct GlusterConf {
 +char *server;
 +int port;
 +char *volname;
 +char *image;
 +char *transport;
 +} GlusterConf;
 +
 +static void qemu_gluster_gconf_free(GlusterConf *gconf)
 +{
 +g_free(gconf->server);
 +g_free(gconf->volname);
 +g_free(gconf->image);
 +g_free(gconf->transport);
 +g_free(gconf);
 +}
 +
 +static int parse_volume_options(GlusterConf *gconf, char *path)
 +{
 +char *token, *saveptr;
 +
 +/* volname */
 +token = strtok_r(path, "/", &saveptr);
 +if (!token) {
 +return -EINVAL;
 +}
 +gconf->volname = g_strdup(token);
 +
 +/* image */
 +token = strtok_r(NULL, "?", &saveptr);

If I understand uri.c right, there is no ? in the path, so there's no
reason to call strtok. You could just use the rest of the string.

 +if (!token) {
 +return -EINVAL;
 +}
 +gconf->image = g_strdup(token);
 +return 0;
 +}
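
To make the suggestion concrete: if the path never contains a '?', the
image is simply the rest of the string after the volume name. A rough,
standalone sketch of that idea follows. VolumePath and split_volume_path()
are made-up names for illustration, and plain malloc() stands in for
g_strdup(); this is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the volname/image fields of GlusterConf. */
typedef struct VolumePath {
    char *volname;
    char *image;
} VolumePath;

/* Duplicate the first n bytes of s as a NUL-terminated string. */
static char *dup_range(const char *s, size_t n)
{
    char *p = malloc(n + 1);

    if (p) {
        memcpy(p, s, n);
        p[n] = '\0';
    }
    return p;
}

/*
 * Split "/volname/image" without strtok: the volume name is the first
 * path component, the image is everything after the following '/'.
 * Returns 0 on success, -EINVAL on a malformed path, -ENOMEM on OOM.
 */
static int split_volume_path(const char *path, VolumePath *out)
{
    const char *vol, *sep;

    assert(out != NULL);
    out->volname = NULL;
    out->image = NULL;
    if (!path) {
        return -EINVAL;
    }

    vol = path + strspn(path, "/");     /* skip leading slashes */
    sep = strchr(vol, '/');
    if (!sep || sep[1] == '\0') {
        return -EINVAL;                 /* no volume name or no image part */
    }

    out->volname = dup_range(vol, (size_t)(sep - vol));
    out->image = dup_range(sep + 1, strlen(sep + 1));
    if (!out->volname || !out->image) {
        free(out->volname);
        free(out->image);
        out->volname = NULL;
        out->image = NULL;
        return -ENOMEM;
    }
    return 0;
}
```

Unlike strtok_r(), this leaves the input string unmodified, so the caller
does not need a writable copy of the path.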
 +
 +/*
 + * 

Re: [Qemu-devel] [PATCH v9 4/4] block: Support GlusterFS as a QEMU block backend.

2012-09-26 Thread Paolo Bonzini
Il 26/09/2012 12:00, Kevin Wolf ha scritto:
  +
  +ret = write(fd, (void *)&acb, sizeof(acb));
  +if (ret >= 0) {
  +break;
  +}
  +if (errno == EINTR) {
  +continue;
  +}
  +if (errno != EAGAIN) {
  +break;
  +}
 Variatio delectat? ;-)
 
 How about just do { ... } while (errno == EINTR || errno == EAGAIN); ?

That should be

while ((ret < 0) && (errno == EINTR || errno == EAGAIN));

However, fd here is blocking, so you can just use qemu_write_full.

Paolo
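
For readers following along, a full-write helper for a blocking fd could
look roughly like the sketch below. It is an illustration of the technique
only, not QEMU's qemu_write_full; the name write_full() is an assumption.
Because the fd is blocking, EAGAIN is not expected and only EINTR is
retried.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write exactly count bytes to a blocking fd, retrying on EINTR and
 * continuing after short writes.  Returns the number of bytes written
 * (normally count) or -1 with errno set by write().
 */
static ssize_t write_full(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t total = 0;

    assert(buf != NULL || count == 0);
    while (total < count) {
        ssize_t ret = write(fd, p + total, count - total);

        if (ret < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before any data: retry */
            }
            return -1;          /* real error */
        }
        if (ret == 0) {
            break;              /* no progress; avoid spinning forever */
        }
        total += (size_t)ret;
    }
    return (ssize_t)total;
}
```

Note that ret is tested before errno, which is the point of the corrected
loop condition above: errno is only meaningful after a call has failed.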




Re: [Qemu-devel] [PATCH 5/9] mm: compaction: Acquire the zone-lru_lock as late as possible

2012-09-26 Thread Mel Gorman
On Tue, Sep 25, 2012 at 02:39:31PM -0700, Andrew Morton wrote:
 On Tue, 25 Sep 2012 17:13:27 +0900
 Minchan Kim minc...@kernel.org wrote:
 
  I see. To me, your saying is better than current comment.
  I hope comment could be more explicit.
  
  diff --git a/mm/compaction.c b/mm/compaction.c
  index df01b4e..f1d2cc7 100644
  --- a/mm/compaction.c
  +++ b/mm/compaction.c
  @@ -542,8 +542,9 @@ isolate_migratepages_range(struct zone *zone, struct 
  compact_control *cc,
   * splitting and collapsing (collapsing has already happened
   * if PageLRU is set) but the lock is not necessarily taken
   * here and it is wasteful to take it just to check 
  transhuge.
  -* Check transhuge without lock and skip if it's either a
  -* transhuge or hugetlbfs page.
  +* Check transhuge without lock and *skip* if it's either a
  +* transhuge or hugetlbfs page because it's not safe to call
  +* compound_order.
   */
  if (PageTransHuge(page)) {
  if (!locked)
 
 Going a bit further:
 
 --- 
 a/mm/compaction.c~mm-compaction-acquire-the-zone-lru_lock-as-late-as-possible-fix
 +++ a/mm/compaction.c
 @@ -415,7 +415,8 @@ isolate_migratepages_range(struct zone *
* if PageLRU is set) but the lock is not necessarily taken
* here and it is wasteful to take it just to check transhuge.
* Check transhuge without lock and skip if it's either a
 -  * transhuge or hugetlbfs page.
 +  * transhuge or hugetlbfs page because calling compound_order()
 +  * requires lru_lock to exclude isolation and splitting.
*/
   if (PageTransHuge(page)) {
   if (!locked)
 _
 
 but...  the requirement to hold lru_lock for compound_order() is news
 to me.  It doesn't seem to be written down or explained anywhere, and
 one wonders why the cheerily undocumented compound_lock() doesn't have
 this effect.  What's going on here??
 

The lru_lock is not *required* for compound_order(). Normally, users of
compound_order() know that the page is not going to collapse underneath
them. The slub allocator is not going to have a compound page it controls
disappear unexpectedly and does not need additional locking for example.

In the case where we are potentially dealing with a THP page, we have to
take into account if it can collapse underneath us. In this case, there is a
race between when PageTransHuge is checked and compound_order is called.
The race is probably harmless but it was easy to take into account in
this case.

I think the comment saying that lru_lock is required is misleading. How
about this?

 * Check TransHuge without lock and skip the whole
 * pageblock if it's either a transhuge or hugetlbfs page
 * as calling compound_order without preventing THP
 * splitting the page underneath us may return
 * surprising results.

-- 
Mel Gorman
SUSE Labs



Re: [Qemu-devel] [PATCH v2] Add infrastructure for QIDL-based device serialization

2012-09-26 Thread Kevin Wolf
Am 25.09.2012 23:12, schrieb Anthony Liguori:
 Michael Roth mdr...@linux.vnet.ibm.com writes:
 
 On Tue, Sep 25, 2012 at 08:37:16AM +0200, Paolo Bonzini wrote:
 Il 24/09/2012 20:14, Michael Roth ha scritto:
 I went with qUppercase because it avoids all the previous issues with
 using leading underscores, and it's reserved in terms of QEMU coding
 guidelines as far as I can tell (we generally require leading capital
 for typedefs and lowercase for variable names, and can work around
 exceptions on a case by case basis by using QIDL() or some other name).
 I also had it as q_* for a bit but that didn't seem much better on the
 eyes when looking at converted structures.

 It looks like Hungarian notation and very much unlike other QEMU code.
 I'd use q_ or qidl_ prefix instead, or rather QIDL().

 I wanted some way to distinguish from other qemu code to avoid conflicts,
 but I think q_* seems reasonable if we reserve the prefix via CODING_STYLE.
 Then for conflicts outside our control we can either use a different name
 for the annotations or use the long-form QIDL() style depending on the
 circumstances.

 I'm not sure why we need two ways to say the same thing...  I know it's
 just bikeshedding to some extent, but I'd really like to standardize on
 a single form.

 QIDL() (or maybe qidl()) should be the One True Form. It's the
 only one that provides both proper namespacing and can be used both for
 simple annotations and for ones that take parameters.

 I guess the real question is whether or not it makes sense to provide
 shortcuts for the more common annotations to avoid clutter. I've heard
 it both ways, so it's hard to decide.

 So let's bikeshed a bit. Maybe to put things into perspective, we're looking
 at (and I'm just gonna go ahead and switch the OTF to qidl() now so we're
 looking at the best case scenarios for both, and include q_* as well):

 a) One True Form:
 QIDL_DECLARE(RTCState) { 

 ISADevice dev qidl(immutable);
 MemoryRegion io qidl(immutable);
 
 Just like sparse is a compiler, so is qidl.  We are free to use the
 '_' + lowercase prefix.
 
   ISADevice _immutable dev;
 
 It's an established practice in wide-use.

Not commenting on the underscore, but you did one thing that I want to
support: Put the (q)_immutable in a place where it looks like a
qualifier. Not so important for the qidl(...) syntax, but with the
simplified forms I definitely like it better.

I think I would even have made it '(q)_immutable ISADevice dev;', but
having the field name last is what really matters for readability.

Kevin



Re: [Qemu-devel] [PATCH v2] Add infrastructure for QIDL-based device serialization

2012-09-26 Thread Paolo Bonzini
Il 26/09/2012 12:20, Kevin Wolf ha scritto:
  QIDL_DECLARE(RTCState) { 
 
  ISADevice dev qidl(immutable);
  MemoryRegion io qidl(immutable);
  
  Just like sparse is a compiler, so is qidl.  We are free to use the
  '_' + lowercase prefix.
  
ISADevice _immutable dev;
  
  It's an established practice in wide-use.
 Not commenting on the underscore, but you did one thing that I want to
 support: Put the (q)_immutable in a place where it looks like a
 qualifier. Not so important for the qidl(...) syntax, but with the
 simplified forms I definitely like it better.
 
 I think I would even have made it '(q)_immutable ISADevice dev;', but
 having the field name last is what really matters for readability.

Agreed.  I don't want to be a nuisance, so: Michael, please pick one between

ISADevice QIDL(immutable) dev
ISADevice q_immutable dev
ISADevice qidl(immutable) dev

and if you choose the second, let's make QIDL an implementation detail,
i.e. document that every new attribute we introduce should define a new
q_* macro.

Paolo



Re: [Qemu-devel] [PATCH 04/25] ahci: add ide device initialization helper

2012-09-26 Thread Kevin Wolf
Am 26.09.2012 10:15, schrieb Markus Armbruster:
 Jason Baron jba...@redhat.com writes:
 I think that -drive if=ide should result in a disk attached piix3-ide.
 Not in an ide disk attached to the ahci controller (which is current q35
 behavior, and is 'broken' b/c we don't want that to change after q35 is
 introduced). The reason being is that I think there should be an easy way
 to create an ide drive on piix3-ide, and an ide drive on the ahci
 controller. But it sounds like you don't agree with this point.
 
 Two issues with that:
 
 1. Why should q35 have a piix3-ide?  The ICH9 southbridge provides only
 SATA, so the board needs additional circuitry to provide PATA.  As far
 as I can tell, intel's Q35 doesn't.  ICH9-based boards that do certainly
 won't use a piix3-ide, because that's a *function* of the PIIX
 southbridge device.  It doesn't exist separately.
 
 2. Why should we connect -drive if=ide to a slow PATA controller instead
 of a perfectly serviceable SATA controller?

Because the guest OS doesn't have an AHCI driver.

But I don't think that connecting to a slow PATA controller describes
exactly what we would ideally want to emulate. We'll probably want to
emulate a SATA controller that is in IDE emulation mode (or whatever it
is called). But as long as we don't have it, the PATA controller is
probably the right thing to provide.

 But I'm not sure about using if=ide to use ich9-ahci. I'm suggesting
 that if=ide should continue to refer to piix3-ide.
 
 Perhaps it should mean the board's preferred ATA controller, perhaps
 it should mean the board's preferred PATA controller.  Certainly
 debatable.
 
 If the latter, not necessarily piix3-ide.

I'd rephrase as the board's preferred way to provide an IDE interface,
but yes, I think that's the real question.

If we ever want to make Q35 the default, if=ide must implement these
semantics. If we want to stay with PIIX forever, we have a choice. I'm
leaning towards implementing it anyway and just making if=ahci the
default for Q35 if no if is specified at all.

Kevin



Re: [Qemu-devel] [PATCH v2] qemu/xen: Add 64 bits big bar support on qemu

2012-09-26 Thread Stefano Stabellini
On Wed, 26 Sep 2012, Hao, Xudong wrote:
  -Original Message-
  From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com]
  Sent: Tuesday, September 25, 2012 6:52 PM
  To: Hao, Xudong
  Cc: Stefano Stabellini; xen-de...@lists.xen.org; qemu-devel@nongnu.org;
  Zhang, Xiantao
  Subject: Re: [PATCH v2] qemu/xen: Add 64 bits big bar support on qemu
  
  On Tue, 25 Sep 2012, Xudong Hao wrote:
   Changes from v1:
   - Rebase to qemu upstream from qemu-xen
  
  Thanks. Please run scripts/checkpatch.pl on this patch, you'll find
  some coding style issues that need to be fixed.
  
 OK, will use this script to check code style.
 
  
   Currently it is assumed that PCI device BARs access memory below 4G. If
   there is a device whose BAR size is larger than 4G, it must access memory
   addresses above 4G.
   This patch enables 64-bit big BAR support in qemu.
  
   Signed-off-by: Xudong Hao xudong@intel.com
   Signed-off-by: Xiantao Zhang xiantao.zh...@intel.com
   ---
hw/xen_pt.c |   16 
hw/xen_pt_config_init.c |   42
  +-
  
   diff --git a/hw/xen_pt.c b/hw/xen_pt.c
   index 307119a..2a8bcf3 100644
   --- a/hw/xen_pt.c
   +++ b/hw/xen_pt.c
   @@ -403,21 +403,21 @@ static int
  xen_pt_register_regions(XenPCIPassthroughState *s)
  
s->bases[i].access.u = r->base_addr;
  
   -if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
   +if (r->type & XEN_HOST_PCI_REGION_TYPE_IO)
type = PCI_BASE_ADDRESS_SPACE_IO;
   -} else {
   +else if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64)
   +type = PCI_BASE_ADDRESS_MEM_TYPE_64;
   +else if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH)
   +type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
   +else
type = PCI_BASE_ADDRESS_SPACE_MEMORY;
   -if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
   -type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
   -}
   -}
  
  Aside from the coding style issue here, this changes the behavior for type
  PCI_BASE_ADDRESS_SPACE_MEMORY. Before we could have:
  
  type =
  PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_PREFETCH;
  
  now we cannot anymore.
  
 
 Will change to:
 -if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
 +if (r->type & XEN_HOST_PCI_REGION_TYPE_IO)
  type = PCI_BASE_ADDRESS_SPACE_IO;
 -} else {
 +else
  type = PCI_BASE_ADDRESS_SPACE_MEMORY;
 -if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
 +if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64)
 +type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
 +   else if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH)
  type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
 -}
 -}

Isn't it possible that both XEN_HOST_PCI_REGION_TYPE_MEM_64 and
XEN_HOST_PCI_REGION_TYPE_PREFETCH are set? It doesn't look like this can
cover that case.
The following seems to be what you are looking for:


if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
    type = PCI_BASE_ADDRESS_SPACE_IO;
} else {
    type = PCI_BASE_ADDRESS_SPACE_MEMORY;
    if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
        type |= PCI_BASE_ADDRESS_MEM_PREFETCH;
    }
    if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64) {
        type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
    }
}


static XenPTBarFlag xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
 XenPTRegInfo *reg)
{
   @@ -366,7 +367,7 @@ static XenPTBarFlag
  xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
  
/* check unused BAR */
r = &d->io_regions[index];
   -if (r->size == 0) {
   +if (!xen_pt_get_bar_size(r)) {
return XEN_PT_BAR_FLAG_UNUSED;
}
  
   @@ -383,6 +384,24 @@ static XenPTBarFlag
  xen_pt_bar_reg_parse(XenPCIPassthroughState *s,
}
}
  
   +static bool is_64bit_bar(PCIIORegion *r)
   +{
   +return !!(r->type & PCI_BASE_ADDRESS_MEM_TYPE_64);
   +}
   +
   +static uint64_t xen_pt_get_bar_size(PCIIORegion *r)
   +{
   +if (is_64bit_bar(r))
   +{
   +uint64_t size64;
   +size64 = (r + 1)->size;
   +size64 <<= 32;
   +size64 += r->size;
   +return size64;
   +}
   +return r->size;
   +}
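
As a side note on the helper quoted above, the computation it performs
(take the next region's size as the upper 32 bits, add this region's size
as the lower 32 bits) can be sketched in isolation like this. Region and
bar_size() are simplified, hypothetical stand-ins and not the QEMU types;
the bounds checks are an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a PCI I/O region descriptor. */
typedef struct Region {
    uint32_t size;      /* 32-bit size value stored for this BAR slot */
    bool     is_64bit;  /* set on the lower half of a 64-bit BAR */
} Region;

/*
 * Size of the BAR starting at regions[index].  For a 64-bit BAR the
 * following slot holds the upper 32 bits, so it must exist.
 */
static uint64_t bar_size(const Region *regions, size_t nregions, size_t index)
{
    const Region *r;

    assert(index < nregions);
    r = &regions[index];
    if (r->is_64bit) {
        assert(index + 1 < nregions);
        /* Widen before shifting; a 32-bit shift by 32 would be undefined. */
        return ((uint64_t)regions[index + 1].size << 32) + r->size;
    }
    return r->size;
}
```

The explicit index check is what the `(r + 1)->size` form cannot express:
the last BAR slot must never be treated as the lower half of a 64-bit BAR.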
   +
static inline uint32_t base_address_with_flags(XenHostPCIIORegion *hr)
{
if (hr->type & XEN_HOST_PCI_REGION_TYPE_IO) {
   @@ -481,7 +500,10 @@ static int
  xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
switch (s->bases[index].bar_flag) {
case XEN_PT_BAR_FLAG_MEM:
bar_emu_mask = XEN_PT_BAR_MEM_EMU_MASK;
   -bar_ro_mask = XEN_PT_BAR_MEM_RO_MASK | (r_size - 1);
   +if (!r_size)
   +bar_ro_mask = XEN_PT_BAR_ALLF;
   +else
   +bar_ro_mask = XEN_PT_BAR_MEM_RO_MASK | (r_size - 1);
break;
  
  Is this an actual mistake everywhere?
 

Re: [Qemu-devel] [PATCH v2] register reset handler to write image into memory

2012-09-26 Thread Alexander Graf

On 26.09.2012, at 10:13, Yin Olivia-R63875 wrote:

 Hi Alex,
 
 I checked all the rom_add_*() functions.
 Multiple platforms of different architectures use rom_add_* to save images.
   hw/arm_boot.c
   hw/exynos4210.
   hw/highbank.
   hw/mips_fulong2e.c
   hw/mips_malta.c
   hw/mips_r4k.c
   hw/r2d.c
 
 Even for PowerPC, it also use rom_add_blob() to write dtb in memory.
   hw/ppc/e500.c:  rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, 
 fdt_size, addr);
   hw/ppc440_bamboo.c
 
 You also mentioned the ELF file.
   hw/elf_ops.h:   rom_add_blob_fixed(label, data, mem_size, addr);
 
 pstrcpy_targphys() does also call rom_add_blob_fixed() function, so we need 
 also verify
   hw/alpha_dp264.c
   hw/cris-boot.c
   hw/lm32_boards.c
   hw/microblaze_boot.c
   hw/milkymist.c
   hw/ppc.c
   hw/ppc_newworld.c
   hw/ppc_oldworld.c
   hw/sun4m.c
   hw/sun4m.c
 
 Should we register reset handler for each above user?

I suppose we should decide on a case-by-case basis. If it's easy to reconstruct 
the binary, convert it to a reset handler. If it's too hard, leave it to 
whoever cares about the board.

 The callers of rom_ptr() function:
   hw/s390-virtio.c
   hw/sun4m.c
   hw/sun4u.c
   target-arm/cpu.c
 But I don't understand why rom_ptr should be changed.

Because rom_ptr works on roms. When we get rid of roms, rom_ptr won't work 
anymore, because there is no in-memory representation of the rom to work on.

So instead of calling rom_ptr, the respective code should have a reset handler 
that gets invoked after the rom restore handler (or a callback function?) which 
can restore the change to the rom that code wants to do.


Alex
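
To illustrate the ordering idea in the reply above, here is a toy model of
reset handlers that run in registration order: one handler restores the
image into memory, and a later one re-applies a board-specific change on
top of it. Everything in it (register_reset(), system_reset(), ImageReload,
BytePatch) is invented for this sketch and only loosely modelled on the
discussion; it is not QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void ResetHandler(void *opaque);

/* Tiny fixed-size registry; handlers run in registration order. */
#define MAX_HANDLERS 8
static struct {
    ResetHandler *fn;
    void *opaque;
} handlers[MAX_HANDLERS];
static size_t nb_handlers;

static void register_reset(ResetHandler *fn, void *opaque)
{
    assert(nb_handlers < MAX_HANDLERS);
    handlers[nb_handlers].fn = fn;
    handlers[nb_handlers].opaque = opaque;
    nb_handlers++;
}

static void system_reset(void)
{
    size_t i;

    for (i = 0; i < nb_handlers; i++) {
        handlers[i].fn(handlers[i].opaque);
    }
}

/* Handler 1: copy the image back into (pretend) guest RAM on every reset. */
typedef struct ImageReload {
    unsigned char *guest_ram;
    size_t addr;
    const unsigned char *data;
    size_t len;
} ImageReload;

static void image_reload_reset(void *opaque)
{
    ImageReload *r = opaque;

    memcpy(r->guest_ram + r->addr, r->data, r->len);
}

/* Handler 2: re-apply a one-byte fixup, standing in for what rom_ptr()
 * users do today by modifying the in-memory rom directly. */
typedef struct BytePatch {
    unsigned char *guest_ram;
    size_t addr;
    unsigned char value;
} BytePatch;

static void byte_patch_reset(void *opaque)
{
    BytePatch *p = opaque;

    p->guest_ram[p->addr] = p->value;
}
```

Registering the patch handler after the reload handler is what keeps the
fixup from being overwritten when the image is written back.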




[Qemu-devel] [qemu-kvm PATCH] mips: Fix link error with 'piix4_pm_init'

2012-09-26 Thread Cole Robinson
  LINK  mipsel-softmmu/qemu-system-mipsel
  hw/mips/../mips_malta.o: In function `mips_malta_init':
  mips_malta.c:962: undefined reference to `piix4_pm_init'

Can reproduce with:

./configure --target-list=mipsel-softmmu --disable-werror
make

However only on qemu-kvm, not qemu.git or qemu 1.2.0. We are
carrying this in Fedora since we build everything from qemu-kvm.git

Signed-off-by: Cole Robinson crobi...@redhat.com
---
 hw/mips/Makefile.objs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/mips/Makefile.objs b/hw/mips/Makefile.objs
index 29a5d0d..89af0e9 100644
--- a/hw/mips/Makefile.objs
+++ b/hw/mips/Makefile.objs
@@ -1,6 +1,7 @@
 obj-y = mips_r4k.o mips_jazz.o mips_malta.o mips_mipssim.o
 obj-y += mips_addr.o mips_timer.o mips_int.o
 obj-y += gt64xxx.o mc146818rtc.o
+obj-y += acpi.o acpi_piix4.o
 obj-$(CONFIG_FULONG) += bonito.o vt82c686.o mips_fulong2e.o
 
 obj-y := $(addprefix ../,$(obj-y))
-- 
1.7.11.2




[Qemu-devel] [PATCH v2 4/4] serial: add 2x + 4x pci variant

2012-09-26 Thread Gerd Hoffmann
Add multiport serial card implementation, with two variants,
one featuring two and one featuring four ports.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 docs/qemupciserial.inf |2 +
 hw/serial-pci.c|  157 
 2 files changed, 159 insertions(+), 0 deletions(-)

diff --git a/docs/qemupciserial.inf b/docs/qemupciserial.inf
index 905a929..911eaa6 100644
--- a/docs/qemupciserial.inf
+++ b/docs/qemupciserial.inf
@@ -11,6 +11,8 @@
 ; (Com+Lpt) from the list.  Click Have a disk.  Select this file.
 ; Procedure may vary a bit depending on the windows version.
 
+; FIXME: This file covers the single port version only.
+
 [Version]
 Signature=$CHICAGO$
 Class=Ports
diff --git a/hw/serial-pci.c b/hw/serial-pci.c
index 88b71f5..54bd4eb 100644
--- a/hw/serial-pci.c
+++ b/hw/serial-pci.c
@@ -28,6 +28,14 @@
  *pci region 0 is a io bar, 8 bytes long, with the 16550 uart mapped to it.
  *interrupt is wired to pin A.
  *
+ * pci-serial-4x spec:
+ *pci region 0 is a io bar, with four 16550 uarts mapped after each other,
+ *the first at offset 0, second at 8, third at 16 and fourth at 24.
+ *interrupt is wired to pin A.
+ *
+ * pci-serial-2x spec:
+ *same as pci-serial-4x but with two uarts only.
+ *
  * [root@fedora ~]# lspci -vnse
  * 00:0e.0 0700: 1b36:0002 (rev 01) (prog-if 00 [8250])
  * Subsystem: 1af4:1100
@@ -40,11 +48,23 @@
 #include "serial.h"
 #include "pci.h"
 
+#define PCI_SERIAL_MAX_PORTS 4
+
 typedef struct PCISerialState {
 PCIDevice dev;
 SerialState state;
 } PCISerialState;
 
+typedef struct PCIMultiSerialState {
+PCIDevicedev;
+MemoryRegion iobar;
+uint32_t ports;
+char *name[PCI_SERIAL_MAX_PORTS];
+SerialState  state[PCI_SERIAL_MAX_PORTS];
+uint32_t level[PCI_SERIAL_MAX_PORTS];
+qemu_irq *irqs;
+} PCIMultiSerialState;
+
 static int serial_pci_init(PCIDevice *dev)
 {
 PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
@@ -61,6 +81,56 @@ static int serial_pci_init(PCIDevice *dev)
 return 0;
 }
 
+static void multi_serial_irq_mux(void *opaque, int n, int level)
+{
+    PCIMultiSerialState *pci = opaque;
+    int i, pending = 0;
+
+    pci->level[n] = level;
+    for (i = 0; i < pci->ports; i++) {
+        if (pci->level[i]) {
+            pending = 1;
+        }
+    }
+    qemu_set_irq(pci->dev.irq[0], pending);
+}
+
+static int multi_serial_pci_init(PCIDevice *dev)
+{
+    PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
+    PCIMultiSerialState *pci = DO_UPCAST(PCIMultiSerialState, dev, dev);
+    SerialState *s;
+    int i;
+
+    switch (pc->device_id) {
+    case 0x0003:
+        pci->ports = 2;
+        break;
+    case 0x0004:
+        pci->ports = 4;
+        break;
+    }
+    assert(pci->ports > 0);
+    assert(pci->ports <= PCI_SERIAL_MAX_PORTS);
+
+    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
+    memory_region_init(&pci->iobar, "multiserial", 8 * pci->ports);
+    pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &pci->iobar);
+    pci->irqs = qemu_allocate_irqs(multi_serial_irq_mux, pci,
+                                   pci->ports);
+
+    for (i = 0; i < pci->ports; i++) {
+        s = pci->state + i;
+        s->baudbase = 115200;
+        serial_init_core(s);
+        s->irq = pci->irqs[i];
+        pci->name[i] = g_strdup_printf("uart #%d", i+1);
+        memory_region_init_io(&s->io, &serial_io_ops, s, pci->name[i], 8);
+        memory_region_add_subregion(&pci->iobar, 8 * i, &s->io);
+    }
+    return 0;
+}
+
 static void serial_pci_exit(PCIDevice *dev)
 {
     PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
@@ -70,6 +140,22 @@ static void serial_pci_exit(PCIDevice *dev)
     memory_region_destroy(&s->io);
 }
 
+static void multi_serial_pci_exit(PCIDevice *dev)
+{
+    PCIMultiSerialState *pci = DO_UPCAST(PCIMultiSerialState, dev, dev);
+    SerialState *s;
+    int i;
+
+    for (i = 0; i < pci->ports; i++) {
+        s = pci->state + i;
+        qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
+        memory_region_destroy(&s->io);
+        g_free(pci->name[i]);
+    }
+    memory_region_destroy(&pci->iobar);
+    qemu_free_irqs(pci->irqs);
+}
+
 static const VMStateDescription vmstate_pci_serial = {
     .name = "pci-serial",
     .version_id = 1,
@@ -81,11 +167,38 @@ static const VMStateDescription vmstate_pci_serial = {
     }
 };
 
+static const VMStateDescription vmstate_pci_multi_serial = {
+    .name = "pci-serial-multi",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_PCI_DEVICE(dev, PCIMultiSerialState),
+        VMSTATE_STRUCT_ARRAY(state, PCIMultiSerialState, PCI_SERIAL_MAX_PORTS,
+                             0, vmstate_serial, SerialState),
+        VMSTATE_UINT32_ARRAY(level, PCIMultiSerialState, PCI_SERIAL_MAX_PORTS),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static Property serial_pci_properties[] = {
     DEFINE_PROP_CHR("chardev",  PCISerialState, state.chr),
 

[Qemu-devel] [PATCH v2 0/4] add pci-serial devices.

2012-09-26 Thread Gerd Hoffmann
  Hi,

Second version.  Added a comment specifying the virtual hardware.
Split the Windows inf file into a separate patch.  Added multiport
versions (2x and 4x) of the card.
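
For readers who want to see the multiport layout in isolation, here is a
minimal standalone sketch, not the QEMU code itself.  It assumes the layout
given in the series' spec comment (UARTs packed 8 bytes apart in one I/O BAR,
all sharing interrupt pin A).  The ToyMultiSerial type and the toy_* function
names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_PORTS 4
#define TOY_UART_REGS 8   /* each 16550 occupies 8 bytes of the I/O BAR */

/* Hypothetical stand-in for the card state; not a QEMU type. */
typedef struct ToyMultiSerial {
    uint32_t ports;               /* 2 or 4 */
    bool level[TOY_MAX_PORTS];    /* last interrupt level seen per UART */
    bool pin_a;                   /* state of the shared PCI interrupt pin */
} ToyMultiSerial;

/* Map an offset inside the BAR to a UART index, or -1 if out of range. */
int toy_port_for_offset(const ToyMultiSerial *card, uint32_t offset)
{
    uint32_t port = offset / TOY_UART_REGS;

    return port < card->ports ? (int)port : -1;
}

/* Record the level of UART n, then drive pin A with the OR of all levels. */
void toy_irq_mux(ToyMultiSerial *card, uint32_t n, bool level)
{
    uint32_t i;
    bool pending = false;

    assert(n < card->ports);
    card->level[n] = level;
    for (i = 0; i < card->ports; i++) {
        pending = pending || card->level[i];
    }
    card->pin_a = pending;
}
```

The per-port level array is what lets one UART drop its interrupt without
clearing the shared pin while another UART is still asserting.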

Gerd Hoffmann (4):
  serial: split serial.c
  serial: add pci variant
  serial: add windows inf file for the pci card to docs
  serial: add 2x + 4x pci variant

 default-configs/pci.mak  |2 +
 docs/qemupciserial.inf   |  109 +++
 hw/Makefile.objs |3 +-
 hw/alpha_dp264.c |1 +
 hw/kzm.c |2 +-
 hw/mips_fulong2e.c   |1 +
 hw/mips_jazz.c   |1 +
 hw/mips_malta.c  |1 +
 hw/mips_mipssim.c|2 +-
 hw/mips_r4k.c|1 +
 hw/musicpal.c|2 +-
 hw/omap_uart.c   |3 +-
 hw/openrisc_sim.c|3 +-
 hw/pc.c  |1 +
 hw/pc.h  |   27 -
 hw/pci_ids.h |1 +
 hw/petalogix_ml605_mmu.c |2 +-
 hw/ppc/e500.c|2 +-
 hw/ppc405_uc.c   |2 +-
 hw/ppc440_bamboo.c   |2 +-
 hw/ppc_prep.c|1 +
 hw/pxa2xx.c  |2 +-
 hw/serial-isa.c  |  130 ++
 hw/serial-pci.c  |  271 ++
 hw/serial.c  |  143 +---
 hw/serial.h  |   73 +
 hw/sm501.c   |2 +-
 hw/sun4u.c   |1 +
 hw/virtex_ml507.c|2 +-
 hw/xtensa_lx60.c |3 +-
 30 files changed, 616 insertions(+), 180 deletions(-)
 create mode 100644 docs/qemupciserial.inf
 create mode 100644 hw/serial-isa.c
 create mode 100644 hw/serial-pci.c
 create mode 100644 hw/serial.h




[Qemu-devel] [PATCH v2 2/4] serial: add pci variant

2012-09-26 Thread Gerd Hoffmann
So we get a hot-pluggable 16550 uart.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 default-configs/pci.mak |2 +
 hw/Makefile.objs|1 +
 hw/pci_ids.h|1 +
 hw/serial-pci.c |  115 +++
 4 files changed, 119 insertions(+), 0 deletions(-)
 create mode 100644 hw/serial-pci.c

diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index 69e18f1..ae9d1eb 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -19,3 +19,5 @@ CONFIG_IDE_PCI=y
 CONFIG_AHCI=y
 CONFIG_ESP=y
 CONFIG_ESP_PCI=y
+CONFIG_SERIAL=y
+CONFIG_SERIAL_PCI=y
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 7a27889..9ab8878 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -21,6 +21,7 @@ hw-obj-$(CONFIG_ESCC) += escc.o
 hw-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o
 
 hw-obj-$(CONFIG_SERIAL) += serial.o serial-isa.o
+hw-obj-$(CONFIG_SERIAL_PCI) += serial-pci.o
 hw-obj-$(CONFIG_PARALLEL) += parallel.o
 hw-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
 hw-obj-$(CONFIG_PCSPK) += pcspk.o
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 301bf1c..c017a79 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -37,6 +37,7 @@
 #define PCI_CLASS_BRIDGE_PCI 0x0604
 #define PCI_CLASS_BRIDGE_OTHER   0x0680
 
+#define PCI_CLASS_COMMUNICATION_SERIAL   0x0700
 #define PCI_CLASS_COMMUNICATION_OTHER0x0780
 
 #define PCI_CLASS_PROCESSOR_CO   0x0b40
diff --git a/hw/serial-pci.c b/hw/serial-pci.c
new file mode 100644
index 000..88b71f5
--- /dev/null
+++ b/hw/serial-pci.c
@@ -0,0 +1,115 @@
+/*
+ * QEMU 16550A UART emulation
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2008 Citrix Systems, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+/*
+ * pci-serial spec:
+ *pci region 0 is a io bar, 8 bytes long, with the 16550 uart mapped to it.
+ *interrupt is wired to pin A.
+ *
+ * [root@fedora ~]# lspci -vnse
+ * 00:0e.0 0700: 1b36:0002 (rev 01) (prog-if 00 [8250])
+ * Subsystem: 1af4:1100
+ * Physical Slot: 14
+ * Flags: fast devsel, IRQ 11
+ * I/O ports at c130 [size=8]
+ * Kernel driver in use: serial
+ */
+
+#include "serial.h"
+#include "pci.h"
+
+typedef struct PCISerialState {
+    PCIDevice dev;
+    SerialState state;
+} PCISerialState;
+
+static int serial_pci_init(PCIDevice *dev)
+{
+    PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
+    SerialState *s = &pci->state;
+
+    s->baudbase = 115200;
+    serial_init_core(s);
+
+    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
+    s->irq = pci->dev.irq[0];
+
+    memory_region_init_io(&s->io, &serial_io_ops, s, "serial", 8);
+    pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
+    return 0;
+}
+
+static void serial_pci_exit(PCIDevice *dev)
+{
+    PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
+    SerialState *s = &pci->state;
+
+    qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
+    memory_region_destroy(&s->io);
+}
+
+static const VMStateDescription vmstate_pci_serial = {
+    .name = "pci-serial",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_PCI_DEVICE(dev, PCISerialState),
+        VMSTATE_STRUCT(state, PCISerialState, 0, vmstate_serial, SerialState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static Property serial_pci_properties[] = {
+    DEFINE_PROP_CHR("chardev",  PCISerialState, state.chr),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void serial_pci_class_initfn(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass);
+    pc->init = serial_pci_init;
+    pc->exit = serial_pci_exit;
+    pc->vendor_id = 0x1b36; /* Red Hat */
+    pc->device_id = 0x0002;
+    pc->revision = 1;
+    pc->class_id = PCI_CLASS_COMMUNICATION_SERIAL;
+    dc->vmsd = &vmstate_pci_serial;
+    dc->props = serial_pci_properties;
+}
+

[Qemu-devel] [PATCH v2 1/4] serial: split serial.c

2012-09-26 Thread Gerd Hoffmann
Split serial.c into serial.c, serial.h and serial-isa.c.  While creating
the serial.h header file, also move the serial prototypes from pc.h to
the new serial.h.  The latter leads to s/pc.h/serial.h/ in many boards
which just want the serial bits from pc.h.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/Makefile.objs |2 +-
 hw/alpha_dp264.c |1 +
 hw/kzm.c |2 +-
 hw/mips_fulong2e.c   |1 +
 hw/mips_jazz.c   |1 +
 hw/mips_malta.c  |1 +
 hw/mips_mipssim.c|2 +-
 hw/mips_r4k.c|1 +
 hw/musicpal.c|2 +-
 hw/omap_uart.c   |3 +-
 hw/openrisc_sim.c|3 +-
 hw/pc.c  |1 +
 hw/pc.h  |   27 -
 hw/petalogix_ml605_mmu.c |2 +-
 hw/ppc/e500.c|2 +-
 hw/ppc405_uc.c   |2 +-
 hw/ppc440_bamboo.c   |2 +-
 hw/ppc_prep.c|1 +
 hw/pxa2xx.c  |2 +-
 hw/serial-isa.c  |  130 +
 hw/serial.c  |  143 ++
 hw/serial.h  |   73 +++
 hw/sm501.c   |2 +-
 hw/sun4u.c   |1 +
 hw/virtex_ml507.c|2 +-
 hw/xtensa_lx60.c |3 +-
 26 files changed, 232 insertions(+), 180 deletions(-)
 create mode 100644 hw/serial-isa.c
 create mode 100644 hw/serial.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 6dfebd2..7a27889 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -20,7 +20,7 @@ hw-obj-$(CONFIG_M48T59) += m48t59.o
 hw-obj-$(CONFIG_ESCC) += escc.o
 hw-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o
 
-hw-obj-$(CONFIG_SERIAL) += serial.o
+hw-obj-$(CONFIG_SERIAL) += serial.o serial-isa.o
 hw-obj-$(CONFIG_PARALLEL) += parallel.o
 hw-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
 hw-obj-$(CONFIG_PCSPK) += pcspk.o
diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index 9eb939f..faeb275 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -15,6 +15,7 @@
 #include "mc146818rtc.h"
 #include "ide.h"
 #include "i8254.h"
+#include "serial.h"
 
 #define MAX_IDE_BUS 2
 
diff --git a/hw/kzm.c b/hw/kzm.c
index 68cd1b4..1f3082b 100644
--- a/hw/kzm.c
+++ b/hw/kzm.c
@@ -21,7 +21,7 @@
 #include "net.h"
 #include "sysemu.h"
 #include "boards.h"
-#include "pc.h" /* for the FPGA UART that emulates a 16550 */
+#include "serial.h"
 #include "imx.h"
 
 /* Memory map for Kzm Emulation Baseboard:
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index 38e4b86..8a38cd9 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -20,6 +20,7 @@
 
 #include "hw.h"
 #include "pc.h"
+#include "serial.h"
 #include "fdc.h"
 #include "net.h"
 #include "boards.h"
diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index db927f1..d35cd54 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -26,6 +26,7 @@
 #include "mips.h"
 #include "mips_cpudevs.h"
 #include "pc.h"
+#include "serial.h"
 #include "isa.h"
 #include "fdc.h"
 #include "sysemu.h"
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index ad23f26..05a1eaa 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -24,6 +24,7 @@
 
 #include "hw.h"
 #include "pc.h"
+#include "serial.h"
 #include "fdc.h"
 #include "net.h"
 #include "boards.h"
diff --git a/hw/mips_mipssim.c b/hw/mips_mipssim.c
index 830f635..0ee6756 100644
--- a/hw/mips_mipssim.c
+++ b/hw/mips_mipssim.c
@@ -27,7 +27,7 @@
 #include "hw.h"
 #include "mips.h"
 #include "mips_cpudevs.h"
-#include "pc.h"
+#include "serial.h"
 #include "isa.h"
 #include "net.h"
 #include "sysemu.h"
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index 967a76e..b3be80b 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -11,6 +11,7 @@
 #include "mips.h"
 #include "mips_cpudevs.h"
 #include "pc.h"
+#include "serial.h"
 #include "isa.h"
 #include "net.h"
 #include "sysemu.h"
diff --git a/hw/musicpal.c b/hw/musicpal.c
index f305e21..346fe41 100644
--- a/hw/musicpal.c
+++ b/hw/musicpal.c
@@ -15,7 +15,7 @@
 #include "net.h"
 #include "sysemu.h"
 #include "boards.h"
-#include "pc.h"
+#include "serial.h"
 #include "qemu-timer.h"
 #include "ptimer.h"
 #include "block.h"
diff --git a/hw/omap_uart.c b/hw/omap_uart.c
index 167d5c4..1c16a54 100644
--- a/hw/omap_uart.c
+++ b/hw/omap_uart.c
@@ -20,8 +20,7 @@
 #include "qemu-char.h"
 #include "hw.h"
 #include "omap.h"
-/* We use pc-style serial ports.  */
-#include "pc.h"
+#include "serial.h"
 #include "exec-memory.h"
 
 /* UARTs */
diff --git a/hw/openrisc_sim.c b/hw/openrisc_sim.c
index 55e97f0..e484613 100644
--- a/hw/openrisc_sim.c
+++ b/hw/openrisc_sim.c
@@ -21,7 +21,8 @@
 #include "hw.h"
 #include "boards.h"
 #include "elf.h"
-#include "pc.h"
+#include "serial.h"
+#include "net.h"
 #include "loader.h"
 #include "exec-memory.h"
 #include "sysemu.h"
diff --git a/hw/pc.c b/hw/pc.c
index 7e7e0e2..f056777 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -23,6 +23,7 @@
  */
 #include "hw.h"
 #include "pc.h"
+#include "serial.h"
 #include "apic.h"
 #include "fdc.h"
 #include "ide.h"
diff --git a/hw/pc.h b/hw/pc.h
index e4db071..170a265 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ 

Re: [Qemu-devel] [RFC PATCH 00/17] Support for multiple AIO contexts

2012-09-26 Thread Kevin Wolf
Am 25.09.2012 14:55, schrieb Paolo Bonzini:
 This series removes the globals from async.c/aio-posix.c so that
 multiple AIO contexts (mini event loops) can be added.  Right now,
 all block devices still use qemu_bh_new, but switching them to
 aio_bh_new would let you associate different files with different
 AIO contexts.
 
 As an added bonus, integration with the glib main loop now happens
 via GSource.  Each AIO context is a GSource, which means you can
 choose either to run it in its own thread (this of course needs
 proper locking which is not yet here), or to attach it to the main
 thread.
 
 In this state this is a bit of an academic exercise (though it works
 and may even make sense for 1.3), but I think it's an example of the
 tiny steps that can lead us towards an upstreamable version of the
 data-plane code.

Do you have a git tree where I could see what things would look like in
the end?

I wonder how this relates to my plans of getting rid of qemu_aio_flush()
and friends in favour of BlockDriver.bdrv_drain(). In fact, after
removing io_flush, I don't really see what makes AIO fd handlers special
any more.

qemu_aio_wait() only calls these handlers, but would it do any harm if
we called all fd handlers? And other than that it's just a small main
loop, so I guess it could share code with the real main loop.

So, considering this, any reason to let aio.c survive at all?

Kevin



[Qemu-devel] [PATCH v2 3/4] serial: add windows inf file for the pci card to docs

2012-09-26 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 docs/qemupciserial.inf |  107 
 1 files changed, 107 insertions(+), 0 deletions(-)
 create mode 100644 docs/qemupciserial.inf

diff --git a/docs/qemupciserial.inf b/docs/qemupciserial.inf
new file mode 100644
index 000..905a929
--- /dev/null
+++ b/docs/qemupciserial.inf
@@ -0,0 +1,107 @@
+; qemupciserial.inf for qemu, based on MSPORTS.INF
+
+; The driver itself is shipped with windows (serial.sys).  This is
+; just a inf file to tell windows which pci id the serial pci card
+; emulated by qemu has, and to apply a name tag to it which windows
+; will show in the device manager.
+
+; Installing the driver: Go to device manager.  You should find a pci
+; serial card tagged with a yellow question mark.  Open properties.
+; Pick update driver.  Then select driver manually.  Pick Ports
+; (Com+Lpt) from the list.  Click Have a disk.  Select this file.
+; Procedure may vary a bit depending on the windows version.
+
+[Version]
+Signature=$CHICAGO$
+Class=Ports
+ClassGuid={4D36E978-E325-11CE-BFC1-08002BE10318}
+Provider=%QEMU%
+DriverVer=09/24/2012,1.3.0
+
+[SourceDisksNames]
+3426=windows cd
+
+[SourceDisksFiles]
+serial.sys = 3426
+serenum.sys= 3426
+
+[DestinationDirs]
+DefaultDestDir  = 11;LDID_SYS
+ComPort.NT.Copy = 12;DIRID_DRIVERS
+SerialEnumerator.NT.Copy=12 ;DIRID_DRIVERS
+
+; Drivers
+;--
+[Manufacturer]
+%QEMU%=QEMU,NTx86
+
+[QEMU.NTx86]
+%QEMU-PCI_SERIAL.DeviceDesc% = ComPort, PCI\VEN_1b36&DEV_0002&CC_0700
+
+; COM sections
+;--
+[ComPort.AddReg]
+HKR,,PortSubClass,1,01
+
+[ComPort.NT]
+AddReg=ComPort.AddReg, ComPort.NT.AddReg
+LogConfig=caa
+SyssetupPnPFlags = 1
+
+[ComPort.NT.HW]
+AddReg=ComPort.NT.HW.AddReg
+
+[ComPort.NT.AddReg]
+HKR,,EnumPropPages32,,MsPorts.dll,SerialPortPropPageProvider
+
+[ComPort.NT.HW.AddReg]
+HKR,,UpperFilters,0x0001,serenum
+
+;-- Service installation
+; Port Driver (function driver for this device)
+[ComPort.NT.Services]
+AddService = Serial, 0x0002, Serial_Service_Inst, Serial_EventLog_Inst
+AddService = Serenum,,Serenum_Service_Inst
+
+; -- Serial Port Driver install sections
+[Serial_Service_Inst]
+DisplayName= %Serial.SVCDESC%
+ServiceType= 1   ; SERVICE_KERNEL_DRIVER
+StartType  = 1   ; SERVICE_SYSTEM_START (this driver may do detection)
+ErrorControl   = 0   ; SERVICE_ERROR_IGNORE
+ServiceBinary  = %12%\serial.sys
+LoadOrderGroup = Extended base
+
+; -- Serenum Driver install section
+[Serenum_Service_Inst]
+DisplayName= %Serenum.SVCDESC%
+ServiceType= 1   ; SERVICE_KERNEL_DRIVER
+StartType  = 3   ; SERVICE_DEMAND_START
+ErrorControl   = 1   ; SERVICE_ERROR_NORMAL
+ServiceBinary  = %12%\serenum.sys
+LoadOrderGroup = PNP Filter
+
+[Serial_EventLog_Inst]
+AddReg = Serial_EventLog_AddReg
+
+[Serial_EventLog_AddReg]
+HKR,,EventMessageFile,0x0002,%%SystemRoot%%\System32\IoLogMsg.dll;%%SystemRoot%%\System32\drivers\serial.sys
+HKR,,TypesSupported,0x00010001,7
+
+; The following sections are COM port resource configs.
+; Section name format means:
+; Char 1 = c (COM port)
+; Char 2 = I/O config: 1 (3f8), 2 (2f8), 3 (3e8), 4 (2e8), a (any)
+; Char 3 = IRQ config: #, a (any)
+
+[caa]   ; Any base, any IRQ
+ConfigPriority=HARDRECONFIG
+IOConfig=8@100-%fff8(3ff::)
+IRQConfig=S:3,4,5,7,9,10,11,12,14,15
+
+[Strings]
+QEMU=QEMU
+QEMU-PCI_SERIAL.DeviceDesc=QEMU Serial PCI Card
+
+Serial.SVCDESC   = Serial port driver
+Serenum.SVCDESC = Serenum Filter Driver
-- 
1.7.1




Re: [Qemu-devel] [PATCH v5 0/4] non-blocking connect address handling cleanup

2012-09-26 Thread Anthony Liguori
Orit Wasserman owass...@redhat.com writes:

 Changes from v4:
   - Rename ConnectHandler to NonBlockingConnectHandler
   - move function comments to functions definitions
   - move connect_state allocation to outside of the loop
   - fix comments text

 Changes from v3:
   - add missing parenthesis QEMU_SOCKET_RC_INPROGRESS macro
   - remove block from dummy_opts
   - remove in_progress from external API (inet_connect_opts and 
 inet_nonblocking_connect)
   - Allocate ConnectState inside inet_connect_opts, this make the 
 structure internal to qemu-sockets.c
   - fix migrate_fd_cleanup to handle invalid fd.
   
 Changes from v2:
   - remove the use of getnameinfo
   - remove errp for inet_connect_addr
   - remove QemuOpt block
   - fix errors in wait_for_connect 
   - pass ConnectState as a parameter to allow concurrent connect ops

 getaddrinfo can give us a list of addresses, but we only try to
 connect to the first one. If that fails we never proceed to
 the next one.  This is common on desktop setups that often have ipv6
 configured but not actually working.
 A simple way to reproduce the problem is migration:
 for the destination use -incoming tcp:0:, run migrate -d 
 tcp:localhost:
 migration will fail on hosts that have both IPv4 and IPV6 address for 
 localhost.

 To fix this, refactor address resolution code and make 
 inet_nonblocking_connect
 retry connection with a different address.

 Michael S. Tsirkin (1):
   Refactor inet_connect_opts function

 Orit Wasserman (3):
   Separate inet_connect into inet_connect (blocking) and
 inet_nonblocking_connect
   Fix address handling in inet_nonblocking_connect
   Clear handler only for valid fd

  migration-tcp.c |   37 ++--
  migration.c |4 +-
  nbd.c   |2 +-
  qemu-char.c |2 +-
  qemu-sockets.c  |  279 
 +--
  qemu_socket.h   |   15 +++-
  ui/vnc.c|2 +-
  7 files changed, 237 insertions(+), 104 deletions(-)

Applied. Thanks.

Regards,

Anthony Liguori


 -- 
 1.7.7.6




Re: [Qemu-devel] [PATCH 0/3] add pc-1.3, fix xhci comat, ivshmem 64bit option

2012-09-26 Thread Anthony Liguori
Gerd Hoffmann kra...@redhat.com writes:

   Hi,

 Small series with three patches, the first one adds a new machine type
 for the upcoming 1.3 release so we can add compat properties as needed,
 the other two patches actually add compat properties.  The xhci one is
 a pure compat fix which turns off msi+msix on old machine types.  The
 other is the 64bit ivshmem patch which has been posted multiple times
 and still didn't make it into master yet.  It adds a compat property
 too to turn off 64bit on old machine types.

 cheers,
   Gerd

 The following changes since commit e0a1e32dbc41e6b2aabb436a9417dfd32177a3dc:

   Merge branch 'usb.64' of git://git.kraxel.org/qemu (2012-09-11 18:06:56 
 +0200)

 are available in the git repository at:

   git://git.kraxel.org/qemu misc.1


Applied. Thanks.

Regards,

Anthony Liguori

 Gerd Hoffmann (3):
   add pc-1.3 machine type
   compat: turn off msi/msix on xhci for old machine types
   ivshmem: add 64bit option

  hw/ivshmem.c |   13 ++---
  hw/pc_piix.c |   32 ++--
  2 files changed, 40 insertions(+), 5 deletions(-)



Re: [Qemu-devel] [PATCH] configure: Allow builds without any system or user emulation

2012-09-26 Thread Anthony Liguori
Stefan Weil s...@weilnetz.de writes:

 The old code aborted configure when no emulation target was selected.
 Even after removing the 'exit 1', it tried to read from STDIN
 when QEMU was configured with

 'configure' '--disable-user' '--disable-system'

 This is fixed here.

 Signed-off-by: Stefan Weil s...@weilnetz.de

Applied. Thanks.

Regards,

Anthony Liguori

 ---

 This patch can be applied after 66d5499b3 was reverted.

 It also works on top of 66d5499b3. In this case only Makefile
 needs modifications, and the configure part of the patch must be removed.

 Regards

 Stefan Weil


  Makefile  |5 +
  configure |4 
  2 files changed, 5 insertions(+), 4 deletions(-)

 diff --git a/Makefile b/Makefile
 index 9523e05..d38ac0f 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -52,8 +52,13 @@ SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory) BUILD_DIR=$(BUILD_DIR)
  SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS))
  SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %/config-devices.mak.d, $(TARGET_DIRS))
  
 +ifeq ($(SUBDIR_DEVICES_MAK),)
 +config-all-devices.mak:
 +	$(call quiet-command,echo '# no devices' > $@,"  GEN   $@")
 +else
  config-all-devices.mak: $(SUBDIR_DEVICES_MAK)
  	$(call quiet-command,cat $(SUBDIR_DEVICES_MAK) | grep =y | sort -u > $@,"  GEN   $@")
 +endif
  
  -include $(SUBDIR_DEVICES_MAK_DEP)
  
 diff --git a/configure b/configure
 index fc27bd9..a9305f3 100755
 --- a/configure
 +++ b/configure
 @@ -1331,10 +1331,6 @@ if test -z "$target_list" ; then
  else
      target_list=`echo "$target_list" | sed -e 's/,/ /g'`
  fi
 -if test -z "$target_list" ; then
 -    echo "No targets enabled"
 -    exit 1
 -fi
  # see if system emulation was really requested
  case " $target_list " in
*-softmmu *) softmmu=yes
 -- 
 1.7.0.4




Re: [Qemu-devel] [PATCH 0/5 v3] convert system_powerdown command to notifiers

2012-09-26 Thread Anthony Liguori
Igor Mammedov imamm...@redhat.com writes:

 The global variable qemu_system_powerdown in sysemu.h is the only dependency
 on qemu_irq there, and raising an IRQ is not a generic way to signal the guest
 that it should shut down.

 Replace it with notifiers and allow each implementation to have its own way
 of notifying the guest.
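
 As a rough illustration of the notifier idea (a standalone sketch, not the
 code in this series and not QEMU's Notifier API), each board or device can
 register a callback on a list, and the powerdown request walks the list.
 All Toy* types and toy_* functions below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyNotifier ToyNotifier;
struct ToyNotifier {
    void (*notify)(ToyNotifier *n, void *data);
    ToyNotifier *next;
};

typedef struct ToyNotifierList {
    ToyNotifier *head;
} ToyNotifierList;

/* Register a callback; newest entries are called first. */
void toy_notifier_add(ToyNotifierList *list, ToyNotifier *n)
{
    assert(n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

/* Call every registered callback; returns how many were called. */
int toy_notifier_fire(ToyNotifierList *list, void *data)
{
    ToyNotifier *n;
    int called = 0;

    for (n = list->head; n != NULL; n = n->next) {
        n->notify(n, data);
        called++;
    }
    return called;
}

/* Example device embedding a notifier, e.g. a power button model. */
typedef struct ToyPowerButton {
    ToyNotifier powerdown;
    int pressed;
} ToyPowerButton;

/* Device-specific reaction: recover the device from the embedded notifier. */
void toy_button_powerdown(ToyNotifier *n, void *data)
{
    ToyPowerButton *b =
        (ToyPowerButton *)((char *)n - offsetof(ToyPowerButton, powerdown));

    (void)data;
    b->pressed++;
}
```

 The generic code only needs the list; how the guest is told (ACPI event,
 GPIO, board-specific register) stays inside each callback.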

 git repo for testing:
 https://github.com/imammedo/qemu/tree/shutdown_notifier.v3

 compile tested:
target-list=x86_64-linux-user,x86_64-softmmu,sparc-softmmu,arm-softmmu
 runtime tested:
x86_64-softmmu + win7 guest

 v3-v2:
 - fixed bisectability issues in the series
 - made the series independent of the cpu_as_device series
  

Applied. Thanks.

Regards,

Anthony Liguori

 Igor Mammedov (5):
   Introduce powerdown_notifiers
   acpi: use notifier for signaling guest system_powerdown command
   target-arm: use notifier for signaling guest system_powerdown command
   target-sparc: use notifier for signaling guest system_powerdown
 command
   Cleanup unused global var qemu_system_powerdown

  hw/acpi_piix4.c |  8 +---
  hw/nseries.c| 14 +-
  hw/sun4m.c  | 14 +-
  sysemu.h|  2 +-
  vl.c| 18 ++
  5 files changed, 46 insertions(+), 10 deletions(-)

 -- 
 1.7.11.4




Re: [Qemu-devel] [PATCH 4/6] pseries: Implement PAPR NVRAM

2012-09-26 Thread Alexander Graf

On 26.09.2012, at 05:12, David Gibson wrote:

 The PAPR specification requires a certain amount of NVRAM, accessed via
 RTAS, which we don't currently implement in qemu.  This patch addresses
 this deficiency, implementing the NVRAM as a VIO device, with some glue to
 instantiate it automatically based on a machine option.
 
 The machine option specifies a drive id, which is used to back the NVRAM,
 making it persistent.  If nothing is specified, the driver instead simply
 allocates space for the NVRAM, which will not be persistent
 
 Signed-off-by: David Gibson da...@gibson.dropbear.id.au
 ---
 hw/ppc/Makefile.objs |1 +
 hw/spapr.c   |3 +
 hw/spapr.h   |3 +
 hw/spapr_nvram.c |  225 ++
 qemu-config.c|4 +
 5 files changed, 236 insertions(+)
 create mode 100644 hw/spapr_nvram.c
 
 diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
 index 951e407..91cbe8c 100644
 --- a/hw/ppc/Makefile.objs
 +++ b/hw/ppc/Makefile.objs
 @@ -11,6 +11,7 @@ obj-y += ppc_newworld.o
 obj-$(CONFIG_PSERIES) += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
 obj-$(CONFIG_PSERIES) += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o pci-hotplug.o spapr_iommu.o
 +obj-$(CONFIG_PSERIES) += spapr_nvram.o
 # PowerPC 4xx boards
 obj-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
 obj-y += ppc440_bamboo.o
 diff --git a/hw/spapr.c b/hw/spapr.c
 index a8bd3c1..079825a 100644
 --- a/hw/spapr.c
 +++ b/hw/spapr.c
 @@ -804,6 +804,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 }
 }
 
 +/* We always have at least the nvram device on VIO */
 +spapr_create_nvram(spapr);
 +
 /* Set up PCI */
 spapr_pci_rtas_init();
 
 diff --git a/hw/spapr.h b/hw/spapr.h
 index e984e3f..d9c3b4a 100644
 --- a/hw/spapr.h
 +++ b/hw/spapr.h
 @@ -6,11 +6,13 @@
 
 struct VIOsPAPRBus;
 struct sPAPRPHBState;
 +struct sPAPRNVRAM;
 struct icp_state;
 
 typedef struct sPAPREnvironment {
 struct VIOsPAPRBus *vio_bus;
 QLIST_HEAD(, sPAPRPHBState) phbs;
 +struct sPAPRNVRAM *nvram;
 struct icp_state *icp;
 
 target_phys_addr_t ram_limit;
 @@ -336,6 +338,7 @@ typedef struct sPAPRTCE {
 #define SPAPR_PCI_BASE_LIOBN0x8000
 
 void spapr_iommu_init(void);
 +void spapr_create_nvram(sPAPREnvironment *spapr);
 DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size);
 void spapr_tce_free(DMAContext *dma);
 void spapr_tce_reset(DMAContext *dma);
 diff --git a/hw/spapr_nvram.c b/hw/spapr_nvram.c
 new file mode 100644
 index 000..8cd8a53
 --- /dev/null
 +++ b/hw/spapr_nvram.c
 @@ -0,0 +1,225 @@
 +/*
 + * QEMU sPAPR NVRAM emulation
 + *
 + * Copyright (C) 2012 David Gibson, IBM Corporation.
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +#include <sys/mman.h>
 +#include <libfdt.h>
 +
 +#include "device_tree.h"
 +#include "hw/sysbus.h"
 +#include "hw/spapr.h"
 +#include "hw/spapr_vio.h"
 +
 +typedef struct sPAPRNVRAM {
 +    VIOsPAPRDevice sdev;
 +    uint32_t size;
 +    uint8_t *buf;
 +    BlockDriverState *drive;
 +} sPAPRNVRAM;
 +
 +#define MIN_NVRAM_SIZE 8192
 +#define DEFAULT_NVRAM_SIZE 16384
 +#define MAX_NVRAM_SIZE (UINT16_MAX * 16)
 +
 +static void rtas_nvram_fetch(sPAPREnvironment *spapr,
 +                             uint32_t token, uint32_t nargs,
 +                             target_ulong args,
 +                             uint32_t nret, target_ulong rets)
 +{
 +    sPAPRNVRAM *nvram = spapr->nvram;
 +    target_phys_addr_t offset, buffer, len;
 +    int alen;
 +    void *membuf;
 +
 +    if ((nargs != 3) || (nret != 2)) {
 +        rtas_st(rets, 0, -3);
 +        return;
 +    }
 +
 +    if (!nvram) {
 +        rtas_st(rets, 0, -1);
 +        rtas_st(rets, 1, 0);
 +        return;
 +    }
 +
 +    offset = rtas_ld(args, 0);
 +    buffer = rtas_ld(args, 1);
 +    len = rtas_ld(args, 2);
 +
 +if (((offset + 

Re: [Qemu-devel] [0/6] Pending pseries updates

2012-09-26 Thread Alexander Graf

On 26.09.2012, at 05:12, David Gibson wrote:

 Hi Alex,
 
 Here's another batch of updates for pseries, some of which affect
 wider target-ppc code.  I have sent a few of these before, but I don't
 believe any have made it into ppc-next so far.  5/6 is an important
 bugfix we've discussed before, which I've CCed to qemu-stable.

Thanks, applied 1/6, 2/6, 5/6, 6/6.

4/6 still needs fixing for TCG. Please compile QEMU with --enable-debug-tcg to 
see the warnings emerging from your change :).
5/6 still has a comment in flight.


Alex




Re: [Qemu-devel] [PATCH 0/2] add pci-serial device.

2012-09-26 Thread Anthony Liguori
Gerd Hoffmann kra...@redhat.com writes:

 On 09/26/12 01:43, Anthony Liguori wrote:
 Gerd Hoffmann kra...@redhat.com writes:
 
   Hi,

 Two patches, first split up serial.c a bit,
 then actually add the pci-based serial device.
 
 The series looks good to me.  A couple requests:
 
 1) Could you add a spec describing this new PCI device?  Doesn't need to
be more than a couple paragraphs since the device is super simple.

 Well, it is pretty straightforward:  A single IO bar, 8 bytes in size,
 where the 16550 uart is mapped to:

 [kraxel@fedora ~]$ lspci -vse
 00:0e.0 Serial controller: Red Hat, Inc. Device 0002 (rev 01) (prog-if
 00 [8250])
   Subsystem: Red Hat, Inc Device 1100
   Physical Slot: 14
   Flags: fast devsel, IRQ 11
   I/O ports at c130 [size=8]
   Kernel driver in use: serial

 But I can surely add a comment about it.

Understood, but I'd really prefer a file in docs/.  We should be
rigorous about having formal specs for all of our paravirtual devices.
The code shouldn't be the spec.


 2) Could you make the inf file a separate patch and either include
documentation in the commit message on how to use it with Windows or
just add a comment to the inf file?

 I think a comment is better, easier to find than a commit message.
 Will do.

So do I.  Thanks.

 This is a new PCI space for QEMU too.  

 It isn't new, I just followed what the pci bridge is doing (which has
 1b36:0001).

Ah, Michael, could you add a quick spec to docs for the pci_bridge
device?


 Is this a driver that is owned
 by QEMU and Red Hat is donating the PCI id or is this a driver that RH
 controls that we're implementing?

 I consider it being owned by qemu.

Great.

Regards,

Anthony Liguori


 cheers,
   Gerd



Re: [Qemu-devel] [RFC PATCH 00/17] Support for multiple AIO contexts

2012-09-26 Thread Paolo Bonzini
Il 26/09/2012 14:28, Kevin Wolf ha scritto:
 Do you have a git tree where I could see what things would look like in
 the end?

I will push it to aio-context on git://github.com/bonzini/qemu.git as
soon as github comes back.

 I wonder how this relates to my plans of getting rid of qemu_aio_flush()
 and friends in favour of BlockDriver.bdrv_drain().

Mostly unrelated, I think.  The introduction of the non-blocking
aio_poll in this series might help implementing bdrv_drain, like this:

blocking = false;
while (bs has requests) {
    progress = aio_poll(aio context of bs, blocking);
    if (progress) {
        blocking = false;
        continue;
    }
    if (bs has throttled requests) {
        restart throttled requests
        blocking = false;
        continue;
    }

    /* No progress, must have been non-blocking.  We must wait.  */
    assert(!blocking);
    blocking = true;
}
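
The pseudocode above can be turned into a small runnable model. This is only a sketch under stated assumptions: `ToyBS`, `toy_aio_poll()` and the split into ready/pending/throttled requests are invented for illustration and do not correspond to real QEMU structures or to the actual aio_poll() signature.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a BlockDriverState; not a QEMU type. */
typedef struct ToyBS {
    int ready;          /* requests that can complete without waiting */
    int pending;        /* requests that complete only after a blocking wait */
    int throttled;      /* requests held back by I/O throttling */
    int blocking_polls; /* how many times we actually had to wait */
} ToyBS;

static bool toy_has_requests(const ToyBS *bs)
{
    return bs->ready + bs->pending + bs->throttled > 0;
}

/* Hypothetical aio_poll(): returns true if a request completed. */
static bool toy_aio_poll(ToyBS *bs, bool blocking)
{
    if (bs->ready > 0) {
        bs->ready--;
        return true;
    }
    if (blocking && bs->pending > 0) {
        bs->pending--;
        bs->blocking_polls++;
        return true;
    }
    return false;
}

static void toy_drain(ToyBS *bs)
{
    bool blocking = false;

    while (toy_has_requests(bs)) {
        if (toy_aio_poll(bs, blocking)) {
            blocking = false;
            continue;
        }
        if (bs->throttled > 0) {
            /* Restart throttled requests so they can make progress. */
            bs->pending += bs->throttled;
            bs->throttled = 0;
            blocking = false;
            continue;
        }
        /* No progress, so the poll must have been non-blocking: wait. */
        assert(!blocking);
        blocking = true;
    }
}
```

The point of the model is that the loop blocks only after a non-blocking poll and the throttling check have both failed to make progress.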

BTW, is it true that "bs->file has requests || bs->backing_hd has
requests" (or any other underlying file, like vmdk extents) implies "bs
has requests"?

 In fact, after removing io_flush, I don't really see what makes AIO
 fd handlers special any more.

Note that while the handlers aren't that special indeed, there is still
some magic because qemu_aio_wait() also runs bottom halves.

 qemu_aio_wait() only calls these handlers, but would it do any harm if
 we called all fd handlers?

Unfortunately yes.  You could get re-entrant calls from the monitor
while a monitor command drains the AIO queue for example.

 And other than that it's just a small main
 loop, so I guess it could share code with the real main loop.

Yes, the nested (and blocking) event loops are ugly.  On the other hand,
it is even uglier to have hooks to call the main loop from aio.c (for
handlers) and vice versa (for bottom halves).  One of the points of the
series is to make AIO just another GSource, with the bottom half magic
and fdhandler hooks handled in a single place (async.c).

Moving towards separate-thread event processing for one or more
BlockDriverStates can be done both with GMainLoop + GSource, or with
aio_poll.  I haven't put much thought in this, but a thread doing while
(aio_poll(ctx, false)); would look very much like Stefan's data-plane code.

Paolo



Re: [Qemu-devel] [PATCH 0/5] i386: cpu: remove duplicate feature names

2012-09-26 Thread Igor Mammedov
On Thu,  6 Sep 2012 17:05:34 -0300
Eduardo Habkost ehabk...@redhat.com wrote:

 The problem:
 
  - Some features are reported at the same time on both CPUID[1].EDX and
CPUID[8000_0001].EDX on AMD CPUs (e.g. fpu, tsc, msr, pae, mmx).
  - -cpu model,+feature should enable the bit only on CPUID[1] if
it's not an AMD CPU, but it should enable the bit on both CPUID[1] and
CPUID[8000_0001] if it's an AMD CPU.
  - The same should happen when implementing CPU properties: setting the
property that enables a feature should set the duplicate
 CPUID[8000_0001].EDX bit only if CPU vendor is AMD.
 
 Reference: http://article.gmane.org/gmane.comp.emulators.qemu/166024
 
 The solution implemented by this series is:
  - On the CPU model table and while parsing CPU options/properties, set the
 bit only on CPUID[1] (the x86_def_t.features field).
  - When finishing initialization of the CPU cpuid fields, duplicate those
feature bits on cpuid_ext2_features if and only if the CPU vendor is AMD.
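
The approach described above can be sketched as follows. This is an illustration, not the code from the series: the `ToyCPUDef` type and `toy_finish_cpuid()` are hypothetical, and `TOY_AMD_ALIASES` covers only the features named in the cover letter rather than the full alias set. The bit positions are the architectural CPUID[1].EDX positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID[1].EDX bit positions for the features named in the cover letter. */
#define TOY_CPUID_FPU (1u << 0)
#define TOY_CPUID_TSC (1u << 4)
#define TOY_CPUID_MSR (1u << 5)
#define TOY_CPUID_PAE (1u << 6)
#define TOY_CPUID_MMX (1u << 23)
#define TOY_CPUID_SSE (1u << 25) /* not an alias in CPUID[8000_0001].EDX */

/* Illustrative subset only; the real alias set covers more bits. */
#define TOY_AMD_ALIASES (TOY_CPUID_FPU | TOY_CPUID_TSC | TOY_CPUID_MSR | \
                         TOY_CPUID_PAE | TOY_CPUID_MMX)

/* Hypothetical CPU definition; not QEMU's x86_def_t. */
typedef struct ToyCPUDef {
    bool vendor_is_amd;
    uint32_t features;      /* CPUID[1].EDX */
    uint32_t ext2_features; /* CPUID[8000_0001].EDX */
} ToyCPUDef;

/* Final step of CPU init: mirror the alias bits only for AMD CPUs. */
static void toy_finish_cpuid(ToyCPUDef *def)
{
    assert(def != NULL);
    if (def->vendor_is_amd) {
        def->ext2_features |= def->features & TOY_AMD_ALIASES;
    }
}
```

Option parsing then only ever touches `features`; the vendor check lives in one place.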
 
 This series depends on the x86 CPU patches that didn't get into 1.2
 series: http://article.gmane.org/gmane.comp.emulators.qemu/168633
   Message-Id: 1346877673-9136-1-git-send-email-ehabk...@redhat.com
 
 
 Eduardo Habkost (5):
   i386: kvm: bit 10 of CPUID[8000_0001].EDX is reserved
   i386: kvm: use a #define for the set of alias feature bits
   i386: cpu: replace EXT2_FEATURE_MASK with CPUID_EXT2_AMD_ALIASES
   i386: cpu: eliminate duplicate feature names
   i386: -cpu help: remove reference to specific CPUID leaves/registers
 
  target-i386/cpu.c | 59
 +++ target-i386/cpu.h |
 12 +++ target-i386/kvm.c |  2 +-
  3 files changed, 51 insertions(+), 22 deletions(-)
 

Reviewed-by: Igor Mammedov imamm...@redhat.com



Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Anthony Liguori
Peter Maydell peter.mayd...@linaro.org writes:

 On 26 September 2012 09:17, Michael Tokarev m...@tls.msk.ru wrote:
 On 26.09.2012 12:00, Peter Maydell wrote:

 I know lots of people use -nographic -daemonize to run headless
 guests in background (like, for example, a router).  I guess it
 came way before the -vga option was introduced, but at least I
 know about -vga (but not about -vga none).  For one, I never saw
 -display before.  And it looks like -nographic is a synonym for
 -display none, and -curses is a synonym for -display curses.

 I mean, -nographic is about the same as -vga none -display none.

 ...except that it *also* messes around with where the serial output
 goes and with the parallel port and maybe something else.

 What is equivalent of -nographic in terms of -vga/-display/-...?
 From the code it is something like

  -vga none -display none -serial mon:stdio -parallel null

 It's something like that. It would be nice to implement -nographic
 as this is an alias for  but IIRC it isn't quite doable.
 (maybe I misremember)

 (this is the code I tried to patch).

 Note: this, combined with -daemonize, also has the same issue,
 namely, the tty is left in a bad state after the qemu process is backgrounded,
 and for the very same reason: -serial stdio switches the tty into
 raw mode.  So this should be fixed too -- somehow, either by forbidding
 this combination completely or by silently substituting stdio for
 -serial with null.  But it will be done in a subsequent patch.

 Note also: by forbidding -nographic -daemonize, we'll break lots of
 existing setups too, and I still don't see why this combination is
 bad, I already demonstrated that it can be made to work in a more
 or less reasonable/expected way.

 Because you've asked both "put me into the background" and "please
 send stuff to stdio". Admittedly you've probably done that because
 you didn't really understand that '-nographic' doesn't mean
 '-display none', but you've still asked for a nonsensical combination.

This is a good example of where we need improved documentation but I
agree 100% with Peter.

Regards,

Anthony Liguori


 -- PMM



Re: [Qemu-devel] [RESEND PATCH v5 4/4] vfio: Enable vfio-pci and mark supported

2012-09-26 Thread Anthony Liguori
Alex Williamson alex.william...@redhat.com writes:

 Enabled for all softmmu guests supporting PCI on Linux hosts.  Note
 that currently only x86 hosts have the kernel side VFIO IOMMU support
 for this.  PPC (g3beige) is the only non-x86 guest known to work.
 ARM (versatile) hangs in firmware, others untested.

 Signed-off-by: Alex Williamson alex.william...@redhat.com
 Acked-by: Michael S. Tsirkin m...@redhat.com
 ---

  MAINTAINERS  |5 +
  configure|6 ++
  hw/Makefile.objs |3 ++-
  3 files changed, 13 insertions(+), 1 deletion(-)

 diff --git a/MAINTAINERS b/MAINTAINERS
 index 25733fc..29aac4f 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -474,6 +474,11 @@ M: Gerd Hoffmann kra...@redhat.com
  S: Maintained
  F: hw/usb*
  
 +VFIO
 +M: Alex Williamson alex.william...@redhat.com
 +S: Supported
 +F: hw/vfio*
 +
  vhost
  M: Michael S. Tsirkin m...@redhat.com
  S: Supported
 diff --git a/configure b/configure
 index 1b86517..c2c0d4f 100755
 --- a/configure
 +++ b/configure
 @@ -165,6 +165,7 @@ attr=
  libattr=
  xfs=
  
 +vfio_pci=no
  vhost_net=no
  kvm=no
  gprof=no
 @@ -509,6 +510,7 @@ Haiku)
usb=linux
kvm=yes
vhost_net=yes
 +  vfio_pci=yes
if [ $cpu = i386 -o $cpu = x86_64 ] ; then
  audio_possible_drivers=$audio_possible_drivers fmod
fi
 @@ -3174,6 +3176,7 @@ echo libiscsi support  $libiscsi
  echo build guest agent $guest_agent
  echo seccomp support   $seccomp
  echo coroutine backend $coroutine_backend
 +echo VFIO PCI support  $vfio_pci
  
  if test $sdl_too_old = yes; then
  echo - Your SDL version is too old - please upgrade to have SDL support
 @@ -3911,6 +3914,9 @@ if test $target_softmmu = yes ; then
if test $smartcard_nss = yes ; then
  echo subdir-$target: subdir-libcacard  $config_host_mak
fi
 +  if test $vfio_pci = yes ; then
 +echo CONFIG_VFIO_PCI=y  $config_target_mak
 +  fi
case $target_arch2 in
  i386|x86_64)
echo CONFIG_HAVE_CORE_DUMP=y  $config_target_mak
 diff --git a/hw/Makefile.objs b/hw/Makefile.objs
 index 6dfebd2..7f8d3e4 100644
 --- a/hw/Makefile.objs
 +++ b/hw/Makefile.objs
 @@ -198,7 +198,8 @@ obj-$(CONFIG_VGA) += vga.o
  obj-$(CONFIG_SOFTMMU) += device-hotplug.o
  obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
  
 -# Inter-VM PCI shared memory
 +# Inter-VM PCI shared memory  VFIO PCI device assignment
  ifeq ($(CONFIG_PCI), y)
  obj-$(CONFIG_KVM) += ivshmem.o
 +obj-$(CONFIG_VFIO_PCI) += vfio_pci.o

Why not just make this

obj-$(CONFIG_LINUX) += vfio_pci.o

?

All you're doing in configure is setting CONFIG_VFIO_PCI if
CONFIG_LINUX.

Regards,

Anthony Liguori

  endif





Re: [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images.

2012-09-26 Thread Kevin Wolf
On 25.09.2012 18:29, Jeff Cody wrote:
 Add bdrv_find_overlay(), and bdrv_drop_intermediate().
 
 bdrv_find_overlay():  given 'bs' and the active (topmost) BDS of an image 
 chain,
 find the image that is the immediate top of 'bs'
 
 bdrv_drop_intermediate():
 Given 3 BDS (active, top, base), drop images above
 base up to and including top, and set base to be the
 backing file of top's overlay node.
 
 E.g., this converts:
 
 bottom <- base <- intermediate <- top <- active
 
 to
 
 bottom <- base <- active
 
 Signed-off-by: Jeff Cody jc...@redhat.com

 +/*
 + * Drops images above 'base' up to and including 'top', and sets the image
 + * above 'top' to have base as its backing file.
 + *
 + * Requires that the overlay to 'top' is opened r/w, so that the backing file
 + * information in 'bs' can be properly updated.
 + *
 + * E.g., this will convert the following chain:
 + * bottom <- base <- intermediate <- top <- active
 + *
 + * to
 + *
 + * bottom <- base <- active
 + *
 + * It is allowed for bottom==base, in which case it converts:
 + *
 + * base <- intermediate <- top <- active
 + *
 + * to
 + *
 + * base <- active

Compared to RFC v2 you deleted this part of the comment:

  + *
  + * It is also allowed for top==active, except in that case active is not
  + * deleted:

You describe the condition now as an error (which I think is what our
conclusion on IRC was), but now I think the following example must be
removed as well:

 + *
 + * base <- intermediate <- top
 + *
 + * becomes
 + *
 + * base <- top
 + *
 + * Error conditions:
 + *  if active == top, that is considered an error
 + *
 + */
 +int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
 +                           BlockDriverState *base)
 +{
 +    BlockDriverState *intermediate;
 +    BlockDriverState *base_bs = NULL;
 +    BlockDriverState *new_top_bs = NULL;
 +    BlkIntermediateStates *intermediate_state, *next;
 +    int ret = -EIO;
 +
 +    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
 +    QSIMPLEQ_INIT(&states_to_delete);
 +
 +    if (!top->drv || !base->drv) {
 +        goto exit;
 +    }
 +
 +    new_top_bs = bdrv_find_overlay(active, top);
 +
 +    if (new_top_bs == NULL) {
 +        /* we could not find the image above 'top', this is an error */
 +        goto exit;
 +    }
 +
 +    /* special case of new_top_bs->backing_hd already pointing to base - nothing
 +     * to do, no intermediate images */
 +    if (new_top_bs->backing_hd == base) {
 +        ret = 0;
 +        goto exit;
 +    }
 +
 +    intermediate = new_top_bs->backing_hd;  /* this should be the same
 +                                               as 'top' */

Should? Is it the same or not? If yes, you can write it as an assert();
if not, the comment is only confusing.

(In fact it's obvious enough that I'd just go with intermediate = top;
without any comment)
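
For readers following the chain manipulation, here is a minimal sketch of the two operations on a toy singly linked backing chain. `ToyImage` and the `toy_` functions are hypothetical stand-ins, not the BlockDriverState code from the patch; the sketch only relinks pointers and does not delete or reopen anything.

```c
#include <assert.h>
#include <stddef.h>

/* Toy image node: each image points at its backing file. */
typedef struct ToyImage {
    const char *name;
    struct ToyImage *backing;
} ToyImage;

/* Find the image directly above 'bs', searching down from 'active'. */
static ToyImage *toy_find_overlay(ToyImage *active, ToyImage *bs)
{
    ToyImage *overlay = active;

    while (overlay != NULL && overlay->backing != bs) {
        overlay = overlay->backing;
    }
    return overlay; /* NULL if 'bs' is not below 'active' */
}

/* Drop everything above 'base' up to and including 'top'. */
static int toy_drop_intermediate(ToyImage *active, ToyImage *top,
                                 ToyImage *base)
{
    ToyImage *overlay;
    ToyImage *p;

    assert(active != NULL && top != NULL && base != NULL);

    if (active == top) {
        return -1; /* treated as an error, as in the patch comment */
    }
    overlay = toy_find_overlay(active, top);
    if (overlay == NULL) {
        return -1; /* no image above 'top' was found */
    }
    /* 'base' must be reachable from 'top'. */
    for (p = top; p != NULL && p != base; p = p->backing) {
    }
    if (p == NULL) {
        return -1;
    }
    overlay->backing = base;
    return 0;
}
```

Applied to bottom <- base <- intermediate <- top <- active, the call with (active, top, base) leaves bottom <- base <- active.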

Kevin



Re: [Qemu-devel] [PATCH v2] Align PCI capabilities in pci_find_space

2012-09-26 Thread Alex Williamson
On Tue, 2012-09-25 at 21:08 -0600, Alex Williamson wrote:
 On Tue, 2012-09-25 at 20:01 -0500, m...@cs.wisc.edu wrote:
  From: Matt Renzelmann m...@cs.wisc.edu
  
  The current implementation of pci_find_space does not correctly align
  PCI capabilities in the PCI configuration space.  This patch fixes
  this issue.
  
  Signed-off-by: Matt Renzelmann m...@cs.wisc.edu
  ---
  
  Alex Williamson alex.william...@redhat.com wrote:
   I think you could just search every 4th byte.  In fact, this whole used
   byte-map could be turned into a single uint64_t bitmap for standard
   config space.  Thanks,
  
  I've not tested this version of the patch, in contrast to the last, so
  I'm a bit less confident of its correctness.  I did not reimplement it
  as suggested as I'm not that familiar with this code, and instead just
  applied the every 4th byte strategy.
  
   hw/pci.c |   12 
   1 files changed, 8 insertions(+), 4 deletions(-)
  
  diff --git a/hw/pci.c b/hw/pci.c
  index f855cf3..e99866a 100644
  --- a/hw/pci.c
  +++ b/hw/pci.c
  @@ -1631,11 +1631,15 @@ static int pci_find_space(PCIDevice *pdev, uint8_t 
  size)
   int config_size = pci_config_size(pdev);
   int offset = PCI_CONFIG_HEADER_SIZE;
   int i;
  -    for (i = PCI_CONFIG_HEADER_SIZE; i < config_size; ++i)
  -        if (pdev->used[i])
  -            offset = i + 1;
  -        else if (i - offset + 1 == size)
  +
  +    for (i = PCI_CONFIG_HEADER_SIZE; i < config_size; i += 4) {
  +        if (pdev->used[i]) {
  +            offset = i + 4;
  +        } else if (i - offset + 1 == size) {
 
 This test needs to change as well.  Looks like it should now be:
 
  (i - offset + 4 >= size)
 
 Whereas we were previously calculating the difference from the offset to
 the current pointer plus the current unused byte, we're now assuming the
 current dword is empty because we're only handing out dword aligned
 offsets and it would be broken for something to not mark the first entry
 used.  Probably worthwhile to also add a comment noting the PCI spec
 requires dword alignment for capabilities.  Thanks,

BTW, rather than assume the rest of the dword is empty, we could just
check each dword instead of each byte, something like

uint32_t *dword_used = (uint32_t *)&pdev->used[PCI_CONFIG_HEADER_SIZE];

for (i = PCI_CONFIG_HEADER_SIZE; i < config_size; i += 4, dword_used++) {
    if (*dword_used) {
        offset = i + 4;
    } else if (i - offset + 4 >= size) {
        return offset;
    }
}

It also occurs to me that this function is broken for PCIe devices as we
should stop at PCI_CONFIG_SPACE_SIZE instead of config_size.  There
should be a separate allocator for extended config space, or a flag to
this function to indicate standard or extended.  Thanks,
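
A self-contained sketch of the dword-granular search discussed above, written against a plain byte map rather than a PCIDevice. The `toy_` names are hypothetical; 0x40 and 0x100 are the standard PCI header size and standard config space size, and the search deliberately stops at the end of standard config space as suggested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PCI_CONFIG_HEADER_SIZE 0x40  /* standard header, never handed out */
#define TOY_PCI_CONFIG_SPACE_SIZE  0x100 /* end of standard config space */

/*
 * Return the dword-aligned offset of the first run of free dwords that can
 * hold 'size' bytes, or 0 if there is none.  'used' holds one flag per byte.
 */
static int toy_find_space(const uint8_t *used, uint8_t size)
{
    int offset = TOY_PCI_CONFIG_HEADER_SIZE;
    int i;

    assert(used != NULL);

    for (i = TOY_PCI_CONFIG_HEADER_SIZE;
         i + 4 <= TOY_PCI_CONFIG_SPACE_SIZE; i += 4) {
        if (used[i] || used[i + 1] || used[i + 2] || used[i + 3]) {
            offset = i + 4;   /* restart the run after this dword */
        } else if (i - offset + 4 >= size) {
            return offset;    /* run [offset, i + 4) is large enough */
        }
    }
    return 0;
}
```

Checking all four bytes of each dword avoids relying on the first byte of every capability being marked used.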

Alex

   return offset;
  +}
  +}
  +
   return 0;
   }
   
 
 






Re: [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality

2012-09-26 Thread Kevin Wolf
On 25.09.2012 18:29, Jeff Cody wrote:
 This adds the live commit coroutine.  This iteration focuses on the
 commit only below the active layer, and not the active layer itself.
 
 The behaviour is similar to block streaming; the sectors are walked
 through, and anything that exists above 'base' is committed back down
 into base.  At the end, intermediate images are deleted, and the
 chain stitched together.  Images are restored to their original open
 flags upon completion.
 
 Signed-off-by: Jeff Cody jc...@redhat.com

Reviewed-by: Kevin Wolf kw...@redhat.com



Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'

2012-09-26 Thread Kevin Wolf
On 25.09.2012 18:29, Jeff Cody wrote:
 The command for live block commit is added, which has the following
 arguments:
 
 device: the block device to perform the commit on (mandatory)
 base:   the base image to commit into; optional (if not specified,
 it is the underlying original image)
 top:the top image of the commit - all data from inside top down
 to base will be committed into base. optional (if not specified,
 it is one below the active image) - see note below
 speed:  maximum speed, in bytes/sec
 
 note: eventually this will support merging down the active layer,
   but that code is not yet complete.  If the active layer is passed
   in currently as top, or top is left to the default, then an error
   will be returned.
 
 This is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
 be emitted.
 
 Signed-off-by: Jeff Cody jc...@redhat.com

 diff --git a/qapi-schema.json b/qapi-schema.json
 index 14e4419..e614453 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -1468,6 +1468,41 @@
'returns': 'str' }
  
  ##
 +# @block-commit
 +#
 +# Live commit of data from overlay image nodes into backing nodes - i.e.,
 +# writes data between 'top' and 'base' into 'base'.
 +#
 +# @device:  the name of the device
 +#
 +# @base:   #optional The file name of the backing image to write data into.
 +#If not specified, this is the deepest backing image
 +#
 +# @top:#optional The file name of the backing image within the image 
 chain,
 +#which contains the topmost data to be committed down.
 +#If not specified, this is one layer below the active
 +#layer (i.e. active->backing_hd).

Why isn't active the default any more? I know, we don't support it yet,
but long term this is what makes most sense as a default.

 +#
 +#If top == base, that is an error.
 +#
 +#
 +# @speed:  #optional the maximum speed, in bytes per second
 +#
 +# Returns: Nothing on success
 +#  If commit or stream is already active on this device, DeviceInUse
 +#  If @device does not exist, DeviceNotFound
 +#  If image commit is not supported by this device, NotSupported
 +#  If @base does not exist, a generic error is returned
 +#  If @top does not exist, a generic error is returned
 +#  If @speed is invalid, InvalidParameter
 +#
 +# Since: 1.3
 +#
 +##
 +{ 'command': 'block-commit',
 +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
 +'*speed': 'int' } }

Kevin



Re: [Qemu-devel] [RESEND PATCH v5 4/4] vfio: Enable vfio-pci and mark supported

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 08:50 -0500, Anthony Liguori wrote:
 Alex Williamson alex.william...@redhat.com writes:
 
  Enabled for all softmmu guests supporting PCI on Linux hosts.  Note
  that currently only x86 hosts have the kernel side VFIO IOMMU support
  for this.  PPC (g3beige) is the only non-x86 guest known to work.
  ARM (versatile) hangs in firmware, others untested.
 
  Signed-off-by: Alex Williamson alex.william...@redhat.com
  Acked-by: Michael S. Tsirkin m...@redhat.com
  ---
 
   MAINTAINERS  |5 +
   configure|6 ++
   hw/Makefile.objs |3 ++-
   3 files changed, 13 insertions(+), 1 deletion(-)
 
  diff --git a/MAINTAINERS b/MAINTAINERS
  index 25733fc..29aac4f 100644
  --- a/MAINTAINERS
  +++ b/MAINTAINERS
  @@ -474,6 +474,11 @@ M: Gerd Hoffmann kra...@redhat.com
   S: Maintained
   F: hw/usb*
   
  +VFIO
  +M: Alex Williamson alex.william...@redhat.com
  +S: Supported
  +F: hw/vfio*
  +
   vhost
   M: Michael S. Tsirkin m...@redhat.com
   S: Supported
  diff --git a/configure b/configure
  index 1b86517..c2c0d4f 100755
  --- a/configure
  +++ b/configure
  @@ -165,6 +165,7 @@ attr=
   libattr=
   xfs=
   
  +vfio_pci=no
   vhost_net=no
   kvm=no
   gprof=no
  @@ -509,6 +510,7 @@ Haiku)
 usb=linux
 kvm=yes
 vhost_net=yes
  +  vfio_pci=yes
 if [ $cpu = i386 -o $cpu = x86_64 ] ; then
   audio_possible_drivers=$audio_possible_drivers fmod
 fi
  @@ -3174,6 +3176,7 @@ echo libiscsi support  $libiscsi
   echo build guest agent $guest_agent
   echo seccomp support   $seccomp
   echo coroutine backend $coroutine_backend
  +echo VFIO PCI support  $vfio_pci
   
   if test $sdl_too_old = yes; then
   echo - Your SDL version is too old - please upgrade to have SDL support
  @@ -3911,6 +3914,9 @@ if test $target_softmmu = yes ; then
 if test $smartcard_nss = yes ; then
   echo subdir-$target: subdir-libcacard  $config_host_mak
 fi
  +  if test $vfio_pci = yes ; then
  +echo CONFIG_VFIO_PCI=y  $config_target_mak
  +  fi
 case $target_arch2 in
   i386|x86_64)
 echo CONFIG_HAVE_CORE_DUMP=y  $config_target_mak
  diff --git a/hw/Makefile.objs b/hw/Makefile.objs
  index 6dfebd2..7f8d3e4 100644
  --- a/hw/Makefile.objs
  +++ b/hw/Makefile.objs
  @@ -198,7 +198,8 @@ obj-$(CONFIG_VGA) += vga.o
   obj-$(CONFIG_SOFTMMU) += device-hotplug.o
   obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
   
  -# Inter-VM PCI shared memory
  +# Inter-VM PCI shared memory  VFIO PCI device assignment
   ifeq ($(CONFIG_PCI), y)
   obj-$(CONFIG_KVM) += ivshmem.o
  +obj-$(CONFIG_VFIO_PCI) += vfio_pci.o
 
 Why not just make this
 
 obj-$(CONFIG_LINUX) += vfio_pci.o
 
 ?
 
 All you're doing in configure is setting CONFIG_VFIO_PCI if
 CONFIG_LINUX.

Ok.  I thought I needed linux + softmmu + pci, but maybe the softmmu is
implied with CONFIG_PCI?  Thanks,

Alex




Re: [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only

2012-09-26 Thread Kevin Wolf
On 25.09.2012 18:29, Jeff Cody wrote:
 Currently, after a live snapshot of a drive, the image that has
 been 'demoted' to be below the new active layer remains r/w.
 This patch reopens it read-only.
 
 Note that we do not check for error on the reopen(), because we
 will not abort the snapshots if the reopen fails.
 
 This patch depends on the bdrv_reopen() series.
 
 Signed-off-by: Jeff Cody jc...@redhat.com

This should be independent from the live commit patches, so I already
applied this one to the block branch.

Kevin



Re: [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only

2012-09-26 Thread Jeff Cody
On 09/26/2012 10:20 AM, Kevin Wolf wrote:
 On 25.09.2012 18:29, Jeff Cody wrote:
 Currently, after a live snapshot of a drive, the image that has
 been 'demoted' to be below the new active layer remains r/w.
 This patch reopens it read-only.

 Note that we do not check for error on the reopen(), because we
 will not abort the snapshots if the reopen fails.

 This patch depends on the bdrv_reopen() series.

 Signed-off-by: Jeff Cody jc...@redhat.com
 
 This should be independent from the live commit patches, so I already
 applied this one to the block branch.
 
 Kevin
 

Thanks



Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'

2012-09-26 Thread Jeff Cody
On 09/26/2012 10:13 AM, Kevin Wolf wrote:
 On 25.09.2012 18:29, Jeff Cody wrote:
 The command for live block commit is added, which has the following
 arguments:

 device: the block device to perform the commit on (mandatory)
 base:   the base image to commit into; optional (if not specified,
 it is the underlying original image)
 top:the top image of the commit - all data from inside top down
 to base will be committed into base. optional (if not specified,
 it is one below the active image) - see note below
 speed:  maximum speed, in bytes/sec

 note: eventually this will support merging down the active layer,
   but that code is not yet complete.  If the active layer is passed
   in currently as top, or top is left to the default, then an error
   will be returned.

 This is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
 be emitted.

 Signed-off-by: Jeff Cody jc...@redhat.com
 
 diff --git a/qapi-schema.json b/qapi-schema.json
 index 14e4419..e614453 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -1468,6 +1468,41 @@
'returns': 'str' }
  
  ##
 +# @block-commit
 +#
 +# Live commit of data from overlay image nodes into backing nodes - i.e.,
 +# writes data between 'top' and 'base' into 'base'.
 +#
 +# @device:  the name of the device
 +#
 +# @base:   #optional The file name of the backing image to write data into.
 +#If not specified, this is the deepest backing image
 +#
 +# @top:#optional The file name of the backing image within the image 
 chain,
 +#which contains the topmost data to be committed down.
 +#If not specified, this is one layer below the active
 +#layer (i.e. active->backing_hd).
 
 Why isn't active the default any more? I know, we don't support it yet,
 but long term this is what makes most sense as a default.
 

Eric had a similar question, and asked if anyone had any preference -
this was my response:

---

I guess I don't have a strong preference either - I originally had it
the other way, but then that meant the default in the current
implementation was actually an error.

Also, I assumed (danger!) that the most common use of commit would be a
snapshot, followed by a commit of active->backing_hd. With that
assumption, it seemed like a sane default.

---

I can certainly revert back to having the active layer be the top, if
that is the preference.

 +#
 +#If top == base, that is an error.
 +#
 +#
 +# @speed:  #optional the maximum speed, in bytes per second
 +#
 +# Returns: Nothing on success
 +#  If commit or stream is already active on this device, DeviceInUse
 +#  If @device does not exist, DeviceNotFound
 +#  If image commit is not supported by this device, NotSupported
 +#  If @base does not exist, a generic error is returned
 +#  If @top does not exist, a generic error is returned
 +#  If @speed is invalid, InvalidParameter
 +#
 +# Since: 1.3
 +#
 +##
 +{ 'command': 'block-commit',
 +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
 +'*speed': 'int' } }
 
 Kevin
 




Re: [Qemu-devel] [RFC PATCH 00/17] Support for multiple AIO contexts

2012-09-26 Thread Kevin Wolf
On 26.09.2012 15:32, Paolo Bonzini wrote:
 On 26/09/2012 14:28, Kevin Wolf wrote:
 Do you have a git tree where I could see what things would look like in
 the end?
 
 I will push it to aio-context on git://github.com/bonzini/qemu.git as
 soon as github comes back.
 
 I wonder how this relates to my plans of getting rid of qemu_aio_flush()
 and friends in favour of BlockDriver.bdrv_drain().
 
 Mostly unrelated, I think.  The introduction of the non-blocking
 aio_poll in this series might help implementing bdrv_drain, like this:
 
 blocking = false;
 while (bs has requests) {
     progress = aio_poll(aio context of bs, blocking);
     if (progress) {
         blocking = false;
         continue;
     }
     if (bs has throttled requests) {
         restart throttled requests
         blocking = false;
         continue;
     }
 
     /* No progress, must have been non-blocking.  We must wait.  */
     assert(!blocking);
     blocking = true;
 }

Yes, possibly.

 BTW, is it true that "bs->file has requests || bs->backing_hd has
 requests" (or any other underlying file, like vmdk extents) implies "bs
 has requests"?

I think each block driver is responsible for draining the requests that
it sent. This means that it will drain bs->file (because no one else
should directly go there) and in most cases also bs->backing_hd, but if
for example live commit has a request in flight that directly accesses
the backing file, I wouldn't expect that a block driver is required to
wait for the completion of this request.

 In fact, after removing io_flush, I don't really see what makes AIO
 fd handlers special any more.
 
 Note that while the handlers aren't that special indeed, there is still
 some magic because qemu_aio_wait() also runs bottom halves.

Do you mean the qemu_bh_poll() call? But the normal main loop does the
same, so I don't see what would be special about it.

 qemu_aio_wait() only calls these handlers, but would it do any harm if
 we called all fd handlers?
 
 Unfortunately yes.  You could get re-entrant calls from the monitor
 while a monitor command drains the AIO queue for example.

Hm, that's true... Who's special here - is it the block layer or the
monitor? I'm not quite sure. If it's the monitor, maybe we should plan
to change that sometime when we have some spare time... ;-)

Kevin



Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'

2012-09-26 Thread Kevin Wolf
On 26.09.2012 16:25, Jeff Cody wrote:
 On 09/26/2012 10:13 AM, Kevin Wolf wrote:
 On 25.09.2012 18:29, Jeff Cody wrote:
 The command for live block commit is added, which has the following
 arguments:

 device: the block device to perform the commit on (mandatory)
 base:   the base image to commit into; optional (if not specified,
 it is the underlying original image)
 top:the top image of the commit - all data from inside top down
 to base will be committed into base. optional (if not specified,
 it is one below the active image) - see note below
 speed:  maximum speed, in bytes/sec

 note: eventually this will support merging down the active layer,
   but that code is not yet complete.  If the active layer is passed
   in currently as top, or top is left to the default, then an error
   will be returned.

 This is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
 be emitted.

 Signed-off-by: Jeff Cody jc...@redhat.com

 diff --git a/qapi-schema.json b/qapi-schema.json
 index 14e4419..e614453 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -1468,6 +1468,41 @@
'returns': 'str' }
  
  ##
 +# @block-commit
 +#
 +# Live commit of data from overlay image nodes into backing nodes - i.e.,
 +# writes data between 'top' and 'base' into 'base'.
 +#
 +# @device:  the name of the device
 +#
 +# @base:   #optional The file name of the backing image to write data into.
 +#If not specified, this is the deepest backing image
 +#
 +# @top:#optional The file name of the backing image within the image 
 chain,
 +#which contains the topmost data to be committed down.
 +#If not specified, this is one layer below the active
 +#layer (i.e. active->backing_hd).

 Why isn't active the default any more? I know, we don't support it yet,
 but long term this is what makes most sense as a default.

 
 Eric had a similar question, and asked if anyone had any preference -
 this was my response:
 
 ---
 
 I guess I don't have a strong preference either - I originally had it
 the other way, but then that meant the default in the current
 implementation was actually an error.

We can make it non-optional for now and use active as the default once
we introduce support for committing the active layer.

 Also, I assumed (danger!) that the most common use of commit would be a
 snapshot, followed by a commit of active->backing_hd. With that
 assumption, it seemed like a sane default.
 
 ---
 
 I can certainly revert back to having the active layer be the top, if
 that is the preference.

I think it is, if nothing else for consistency with the existing
synchronous 'commit' command.

Kevin



Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'

2012-09-26 Thread Eric Blake
On 09/26/2012 08:25 AM, Jeff Cody wrote:

 +# @top:#optional The file name of the backing image within the image 
 chain,
 +#which contains the topmost data to be committed down.
 +#If not specified, this is one layer below the active
 +#layer (i.e. active->backing_hd).

 Why isn't active the default any more? I know, we don't support it yet,
 but long term this is what makes most sense as a default.

 
 Eric had a similar question, and asked if anyone had any preference -
 this was my response:
 
 ---
 
 I guess I don't have a strong preference either - I originally had it
 the other way, but then that meant the default in the current
 implementation was actually an error.
 
 Also, I assumed (danger!) that the most common use of commit would be a
 snapshot, followed by a commit of active->backing_hd. With that
 assumption, it seemed like a sane default.

Actually, I envision my personal use case to be:

Take a snapshot, do an experiment in the guest (such as install a
questionable package), and then either roll back to the snapshot
(experiment failed) or commit the active (experiment worked, no need to
have a snapshot any more); either way, taking the snapshot created a new
temporary file name, and after I make my decision on whether to commit
or discard the snapshot, I want to get back to the original file name.
Since the snapshot effectively created a temporary file name, I'd rather
not have to know the name of that temporary file just to pass it to an
explicit 'top' argument when committing the active layer.

Having to specify 'top' to avoid an error when not committing the active
layer is not as bad as getting the defaults wrong for when we do add
support for committing the active layer.

 ---
 
 I can certainly revert back to having the active layer be the top, if
 that is the preference.

Given that it has been asked again, I'd say yes, go ahead and revert to
the behavior of defaulting 'top' to the active layer.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH] coroutine: always use pooling

2012-09-26 Thread Peter Maydell
On 25 September 2012 13:44, Paolo Bonzini pbonz...@redhat.com wrote:
 It makes sense to use it for other implementations than ucontext, too.
  Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
  {
 -    Coroutine *co = qemu_coroutine_new();
 +    Coroutine *co;
 +
 +    co = QSLIST_FIRST(&pool);
 +    if (co) {
 +        QSLIST_REMOVE_HEAD(&pool, pool_next);
 +        pool_size--;
 +    } else {
 +        co = qemu_coroutine_new();
 +    }
      co->entry = entry;
      return co;
  }

Since this is obviously going to blow up badly if it's called
from multiple threads, is there some kind of assert we can add
so that we fail obviously if somebody makes that coding error?
[the difficulty is probably when your backend is the gthread
one, at least I assume creating a coroutine inside a coroutine
is allowed.]

-- PMM



[Qemu-devel] [PATCH] block: live snapshot documentation tweaks

2012-09-26 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qapi-schema.json | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 8719a9d..26ac21f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1402,7 +1402,7 @@
 # @format: #optional the format of the snapshot image, default is 'qcow2'.
 #
 # @mode: #optional whether and how QEMU should create a new image, default is
-# 'absolute-paths'.
+#'absolute-paths'.
 ##
 { 'type': 'BlockdevSnapshot',
   'data': { 'device': 'str', 'snapshot-file': 'str', '*format': 'str',
@@ -1456,7 +1456,7 @@
 # @format: #optional the format of the snapshot image, default is 'qcow2'.
 #
 # @mode: #optional whether and how QEMU should create a new image, default is
-# 'absolute-paths'.
+#'absolute-paths'.
 #
 # Returns: nothing on success
 #  If @device is not a valid block device, DeviceNotFound
-- 
1.7.12




[Qemu-devel] [PATCH] Add amd iommu emulation for Xen.

2012-09-26 Thread Wei Wang

Hi,
The attached patch adds AMD IOMMU emulation for Xen. Please review it.
Thanks,
Wei





From 122517435641384e4f5e36eaad8302ff273648e8 Mon Sep 17 00:00:00 2001
From: Wei Wang wei.wa...@amd.com
Date: Wed, 26 Sep 2012 16:43:40 +0200
Subject: [PATCH] Add amd iommu emulation for Xen.

To pass through AMD Southern Islands series GPUs to a guest, a virtual IOMMU
device must be registered on the QEMU PCI bus. It uses a new hypercall,
xc_domain_update_iommu_msi, to notify Xen of the IOMMU's MSI vector.

Signed-off-by: Wei Wang wei.wa...@amd.com
---
 hw/i386/Makefile.objs |2 +-
 hw/pc_piix.c  |6 ++
 hw/xen_iommu.c|  191 +
 hw/xen_pt.h   |1 +
 4 files changed, 199 insertions(+), 1 deletions(-)
 create mode 100644 hw/xen_iommu.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 8c764bb..8b231ab 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -8,7 +8,7 @@ obj-y += pc_piix.o
 obj-y += pc_sysfw.o
 obj-$(CONFIG_XEN) += xen_platform.o xen_apic.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
-obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
+obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o xen_iommu.o
 obj-y += kvm/
 obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index fd5898f..0b5d034 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -46,6 +46,9 @@
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
+#ifdef CONFIG_XEN_PCI_PASSTHROUGH
+#  include "xen_pt.h"
+#endif
 
 #define MAX_IDE_BUS 2
 
@@ -228,6 +231,9 @@ static void pc_init1(MemoryRegion *system_memory,
     pc_vga_init(isa_bus, pci_enabled ? pci_bus : NULL);
     if (xen_enabled()) {
         pci_create_simple(pci_bus, -1, "xen-platform");
+#ifdef CONFIG_XEN_PCI_PASSTHROUGH
+        xen_pt_iommu_create(pci_bus);
+#endif
     }
 
 /* init basic PC hardware */
diff --git a/hw/xen_iommu.c b/hw/xen_iommu.c
new file mode 100644
index 000..9a9ede4
--- /dev/null
+++ b/hw/xen_iommu.c
@@ -0,0 +1,191 @@
+/*
+ * amd iommu support
+ *
+ * Copyright (C) 2012 Advanced Micro Devices, Inc.
+ * Author: Wei Wang wei.wa...@amd.com 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "xen_pt.h"
+#include "xen_backend.h"
+
+#pragma pack(1)
+
+typedef struct iommu_capability_block {
+    uint8_t     id;
+    uint8_t     next_ptr;
+    uint8_t     cap_info;
+    uint8_t     flags;
+    uint32_t    base_low;
+    uint32_t    base_high;
+    uint32_t    range;
+    uint32_t    misc;
+} iommu_capability_t;
+
+typedef struct msi_capability_block {
+    uint8_t     id;
+    uint8_t     next_ptr;
+    uint16_t    msg_ctrl;
+    uint32_t    addr_low;
+    uint32_t    addr_high;
+    uint32_t    msi_data;
+} msi_capability_t;
+
+struct amd_iommu_config {
+    uint16_t    vendor_id;
+    uint16_t    device_id;
+    uint16_t    command;
+    uint16_t    status;
+    uint8_t     revision;
+    uint8_t     api;
+    uint8_t     subclass;
+    uint8_t     class;
+    uint8_t     cache_line_size;
+    uint8_t     latency_timer;
+    uint8_t     header_type;
+    uint8_t     bist;
+    uint32_t    base_address_regs[6];
+    uint32_t    reserved1;
+    uint16_t    subsystem_vendor_id;
+    uint16_t    subsystem_id;
+    uint32_t    rom_addr;
+    uint8_t     cap_ptr;
+    uint8_t     reserved3[3];
+    uint32_t    reserved4;
+    uint8_t     interrupt_line;
+    uint8_t     interrupt_pin;
+    uint8_t     min_gnt;
+    uint8_t     max_lat;
+    iommu_capability_t cap;
+    msi_capability_t   msi;
+};
+#pragma pack()
+
+#ifndef PCI_CAP_ID_SEC
+#define PCI_CAP_ID_SEC  0x0F
+#endif
+#define PCI_CLASS_SYSTEM_AMD_IOMMU  0x0806
+#define PCI_DEVICE_AMD_IOMMU_V2 0x
+#define IOMMU_CAP_FLAGS_IOTLB   0
+#define IOMMU_CAP_FLAGS_EFRSUP  3
+#define IOMMU_CAP_TYPE  0x3
+#define IOMMU_CAP_REV   0x1
+
+#define MSI_DATA_VECTOR_SHIFT       0
+#define MSI_DATA_DELIVERY_SHIFT     8
+#define MSI_DATA_LEVEL_SHIFT        14
+#define MSI_DATA_TRIGGER_SHIFT      15
+#define MSI_ADDR_DESTID_MASK        0xffff
+#define MSI_ADDR_DESTMODE_SHIFT     2
+#define MSI_ADDR_REDIRECTION_SHIFT  3
+#define MSI_TARGET_CPU_SHIFT        12
+#define PCI_STATUS_CAPABILITIES     0x010
+
+static void amd_iommu_pci_write_config(PCIDevice *d, uint32_t address,
+                                       uint32_t val, int len)
+{
+    struct amd_iommu_config *iommu_config;
+    uint64_t msi_addr;
+    uint32_t msi_data;
+    uint8_t offset_msi_data, vector, en;
+    uint8_t dest_mode, dest, delivery_mode, trig_mode;
+
+    pci_default_write_config(d, address, val, len);
+
+    iommu_config = (struct amd_iommu_config *)d->config;
+
+    offset_msi_data = iommu_config->cap.next_ptr + sizeof(uint32_t) +
+   

[Qemu-devel] [PATCH 4 of 6 V6] libxc: add wrappers for new hypercalls

2012-09-26 Thread Wei Wang



From 0e5259161a6055dcbebb7b9e978b5c384c7a3efe Mon Sep 17 00:00:00 2001
From: Wei Wang wei.wa...@amd.com
Date: Wed, 26 Sep 2012 11:47:03 +0200
Subject: [PATCH 4/6] libxc: add wrappers for new hypercalls

Please see patch 1 for hypercall description.

Signed-off-by: Wei Wang wei.wa...@amd.com
---
 tools/libxc/xc_domain.c |   53 +++
 tools/libxc/xenctrl.h   |   15 +
 2 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index d98e68b..7a0d437 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1352,6 +1352,59 @@ int xc_domain_bind_pt_isa_irq(
   PT_IRQ_TYPE_ISA, 0, 0, 0, machine_irq));
 }
 
+int xc_domain_update_iommu_msi(
+    xc_interface *xch,
+    uint32_t domid,
+    uint8_t vector,
+    uint8_t dest,
+    uint8_t dest_mode,
+    uint8_t delivery_mode,
+    uint8_t trig_mode)
+{
+    int rc;
+    DECLARE_DOMCTL;
+    xen_domctl_guest_iommu_op_t * iommu_op;
+
+    domctl.cmd = XEN_DOMCTL_guest_iommu_op;
+    domctl.domain = (domid_t)domid;
+
+    iommu_op = &(domctl.u.guest_iommu_op);
+    iommu_op->op = XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI;
+    iommu_op->u.msi.vector = vector;
+    iommu_op->u.msi.dest = dest;
+    iommu_op->u.msi.dest_mode = dest_mode;
+    iommu_op->u.msi.delivery_mode = delivery_mode;
+    iommu_op->u.msi.trig_mode = trig_mode;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
+
+int xc_domain_bind_pt_bdf(xc_interface *xch,
+    uint32_t domid,
+    uint16_t gseg,
+    uint16_t gbdf,
+    uint16_t mseg,
+    uint16_t mbdf)
+{
+    int rc;
+    DECLARE_DOMCTL;
+    xen_domctl_guest_iommu_op_t * guest_op;
+
+    domctl.cmd = XEN_DOMCTL_guest_iommu_op;
+    domctl.domain = (domid_t)domid;
+
+    guest_op = &(domctl.u.guest_iommu_op);
+    guest_op->op = XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF;
+    guest_op->u.bdf_bind.g_seg = gseg;
+    guest_op->u.bdf_bind.g_bdf = gbdf;
+    guest_op->u.bdf_bind.m_seg = mseg;
+    guest_op->u.bdf_bind.m_bdf = mbdf;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
+
 int xc_domain_memory_mapping(
 xc_interface *xch,
 uint32_t domid,
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 7eb5743..1e510a0 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1730,6 +1730,21 @@ int xc_domain_bind_pt_isa_irq(xc_interface *xch,
   uint32_t domid,
   uint8_t machine_irq);
 
+int xc_domain_bind_pt_bdf(xc_interface *xch,
+  uint32_t domid,
+  uint16_t gseg,
+  uint16_t gbdf,
+  uint16_t mseg, 
+  uint16_t mbdf);
+
+int xc_domain_update_iommu_msi(xc_interface *xch,
+   uint32_t domid,
+   uint8_t vector,
+   uint8_t dest,
+   uint8_t dest_mode,
+   uint8_t delivery_mode,
+   uint8_t trig_mode);
+
 int xc_domain_set_machine_address_size(xc_interface *xch,
   uint32_t domid,
   unsigned int width);
-- 
1.7.4




Re: [Qemu-devel] [PATCH v2] stop using stdio for monitor/serial/etc with -daemonize

2012-09-26 Thread Michael Tokarev
On 26.09.2012 17:46, Anthony Liguori wrote:
[]
 This is a good example of where we need improved documentation but I
 agree 100% with Peter.

So what do we do?

We've a clear bug, I can only fix it in the patch to the Debian
package, since I've been asked about this bug multiple times,
and I care about our users. It is at least consistent with the
expectations.  Maybe at the same time, it's a good idea to print
a warning about -nographic being deprecated, but this part should
definitely be done upstream first.

Thanks,

/mjt



Re: [Qemu-devel] [PATCH 1/4] s390: sclp base support

2012-09-26 Thread Alexander Graf

On 20.08.2012, at 16:28, Jens Freimann wrote:

 From: Heinz Graalfs graa...@linux.vnet.ibm.com
 
 This adds a more generic infrastructure for handling Service-Call
 requests on s390. Currently we only support a small subset of Read
 SCP Info directly in target-s390x. This patch provides the base
 infrastructure for supporting more commands and moves Read SCP
 Info.
 In the future we could add additional commands for hotplug, call
 home and event handling.
 
 Signed-off-by: Heinz Graalfs graa...@linux.vnet.ibm.com
 Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
 Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
 ---
 hw/s390x/Makefile.objs   |   1 +
 hw/s390x/sclp.c  | 107 +++
 hw/s390x/sclp.h  |  76 +
 target-s390x/cpu.h   |  14 +--
 target-s390x/kvm.c   |   5 +--
 target-s390x/op_helper.c |  31 +++---
 6 files changed, 192 insertions(+), 42 deletions(-)
 create mode 100644 hw/s390x/sclp.c
 create mode 100644 hw/s390x/sclp.h
 
 diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
 index dcdcac8..1c14b96 100644
 --- a/hw/s390x/Makefile.objs
 +++ b/hw/s390x/Makefile.objs
 @@ -1,3 +1,4 @@
 obj-y = s390-virtio-bus.o s390-virtio.o
 
 obj-y := $(addprefix ../,$(obj-y))
 +obj-y += sclp.o
 diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
 new file mode 100644
 index 000..322a0e2
 --- /dev/null
 +++ b/hw/s390x/sclp.c
 @@ -0,0 +1,107 @@
 +/*
 + * SCLP Support
 + *
 + * Copyright IBM, Corp. 2012
 + *
 + * Authors:
 + *  Christian Borntraeger borntrae...@de.ibm.com
 + *  Heinz Graalfs graa...@linux.vnet.ibm.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or (at your
 + * option) any later version.  See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include "cpu.h"
 +#include "kvm.h"
 +
 +#include "sclp.h"
 +
 +/* Provide information about the configuration, CPUs and storage */
 +static void read_SCP_info(SCCB *sccb)
 +{
 +    ReadInfo *read_info = (ReadInfo *) sccb;
 +    int shift = 0;
 +
 +    while ((ram_size >> (20 + shift)) > 65535) {
 +        shift++;
 +    }
 +    read_info->rnmax = cpu_to_be16(ram_size >> (20 + shift));
 +    read_info->rnsize = 1 << shift;

Any reason we can't just always expose rnmax2 and rnsize2 instead? This way
we're quite limited in the amount of RAM we can support, right?

 +    sccb->h.response_code = cpu_to_be16(SCLP_RC_NORMAL_READ_COMPLETION);
 +}
 +
 +static void sclp_execute(SCCB *sccb, uint64_t code)
 +{
 +    switch (code) {
 +    case SCLP_CMDW_READ_SCP_INFO:
 +    case SCLP_CMDW_READ_SCP_INFO_FORCED:
 +        read_SCP_info(sccb);
 +        break;
 +    default:
 +        sccb->h.response_code = cpu_to_be16(SCLP_RC_INVALID_SCLP_COMMAND);
 +        break;
 +    }
 +}
 +
 +int do_sclp_service_call(uint32_t sccb, uint64_t code)
 +{
 +    int r = 0;
 +    SCCB work_sccb;
 +
 +    target_phys_addr_t sccb_len = sizeof(SCCB);
 +
 +    /*
 +     * we want to work on a private copy of the sccb, to prevent guests
 +     * from playing dirty tricks by modifying the memory content after
 +     * the host has checked the values
 +     */
 +    cpu_physical_memory_read(sccb, &work_sccb, sccb_len);
 +
 +    /* Valid sccb sizes */
 +    if (be16_to_cpu(work_sccb.h.length) < 8 ||

sizeof(SCCBHeader)

 +        be16_to_cpu(work_sccb.h.length) > 4096) {

SCCB_SIZE

 +        r = -PGM_SPECIFICATION;
 +        goto out;
 +    }
 +
 +    sclp_execute((SCCB *)&work_sccb, code);
 +
 +    cpu_physical_memory_write(sccb, &work_sccb,
 +                              be16_to_cpu(work_sccb.h.length));
 +
 +    sclp_service_interrupt(sccb);
 +
 +out:
 +    return r;
 +}
 +
 +void sclp_service_interrupt(uint32_t sccb)
 +{
 +    s390_sclp_extint(sccb & ~3);
 +}
 +
 +/* qemu object creation and initialization functions */
 +
 +static void s390_sclp_device_class_init(ObjectClass *klass, void *data)
 +{
 +    SysBusDeviceClass *dc = SYS_BUS_DEVICE_CLASS(klass);
 +
 +    dc->init = s390_sclp_dev_init;
 +}
 +
 +static TypeInfo s390_sclp_device_info = {
 +    .name = TYPE_DEVICE_S390_SCLP,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(S390SCLPDevice),
 +    .class_init = s390_sclp_device_class_init,
 +    .class_size = sizeof(S390SCLPDeviceClass),
 +    .abstract = true,
 +};
 +
 +static void s390_sclp_register_types(void)
 +{
 +    type_register_static(&s390_sclp_device_info);
 +}
 +
 +type_init(s390_sclp_register_types)
 diff --git a/hw/s390x/sclp.h b/hw/s390x/sclp.h
 new file mode 100644
 index 000..e9ad42b
 --- /dev/null
 +++ b/hw/s390x/sclp.h
 @@ -0,0 +1,76 @@
 +/*
 + * SCLP Support
 + *
 + * Copyright IBM, Corp. 2012
 + *
 + * Authors:
 + *  Christian Borntraeger borntrae...@de.ibm.com
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or (at your
 + * option) any later version.  See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#ifndef HW_S390_SCLP_H
 +#define HW_S390_SCLP_H
 +
 

Re: [Qemu-devel] [PATCH v2] Add infrastructure for QIDL-based device serialization

2012-09-26 Thread Michael Roth
On Wed, Sep 26, 2012 at 12:33:17PM +0200, Paolo Bonzini wrote:
 On 26/09/2012 12:20, Kevin Wolf wrote:
   QIDL_DECLARE(RTCState) {   

   ISADevice dev qidl(immutable);
   MemoryRegion io qidl(immutable);
   
   Just like sparse is a compiler, so is qidl.  We are free to use the
   '_' + lowercase prefix.
   
 ISADevice _immutable dev;
   
   It's an established practice in wide-use.
  Not commenting on the underscore, but you did one thing that I want to
  support: Put the (q)_immutable in a place where it looks like a
  qualifier. Not so important for the qidl(...) syntax, but with the
  simplified forms I definitely like it better.
  
  I think I would even have made it '(q)_immutable ISADevice dev;', but
  having the field name last is what really matters for readability.
 
 Agreed.  I don't want to be a nuisance, so: Michael, please pick one between

Not a problem, the parser supports both before/after. I prefer before as well,
except in the case of q_property(name, options) where we often need to put
the variable name on a second line, but those aren't too common so let's just
standardize on before for now since that'll benefit the common use case better.

 
 ISADevice QIDL(immutable) dev
 ISADevice q_immutable dev
 ISADevice qidl(immutable) dev
 
 and if you choose the second, let's make QIDL an implementation detail,
 i.e. document that every new attribute we introduce should define a new
 q_* macro.

OK, sounds like a plan. Let's do q_*.

 
 Paolo
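
A minimal sketch of the q_* convention settled on above, assuming the
annotations expand to nothing for the C compiler and are interpreted only
by the QIDL parser. The empty QIDL() definition, the stand-in types and
the cmos_index field are assumptions made for illustration; they are not
taken from the actual QIDL patch.

```c
#include <stdint.h>

/* Assumption: the C compiler sees no trace of a QIDL annotation;
 * only the separate QIDL parser would read it. */
#define QIDL(...)

/* One q_* macro per attribute, so QIDL() stays an implementation detail. */
#define q_immutable QIDL(immutable)

/* Illustrative stand-in types, not the real QEMU definitions. */
typedef struct ISADevice { int id; } ISADevice;
typedef struct MemoryRegion { uint64_t size; } MemoryRegion;

typedef struct RTCState {
    q_immutable ISADevice dev;      /* annotation placed before the type */
    q_immutable MemoryRegion io;
    uint8_t cmos_index;             /* unannotated: ordinary device state */
} RTCState;
```

With the annotation in qualifier position the field name stays last on the
line, which is the readability point made earlier in the thread.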
 



Re: [Qemu-devel] [PATCH] New syscalls to the seccomp whitelist

2012-09-26 Thread Paul Moore
On Thursday, September 20, 2012 06:00:59 PM Eduardo Otubo wrote:
 Seccomp syscall whitelist updated after tests running qemu under
 libvirt ...

Hi Eduardo,

I know from our off-list discussions that you have an additional debugging
patch to help identify missing syscalls; could you submit that patch too?
I think we would want the debugging patch #ifdef'd out in normal use, but
it might help the QEMU developers.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] coroutine: always use pooling

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 16:34, Peter Maydell wrote:
 It makes sense to use it for other implementations than ucontext, too.
   Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
   {
  -    Coroutine *co = qemu_coroutine_new();
  +    Coroutine *co;
  +
  +    co = QSLIST_FIRST(&pool);
  +    if (co) {
  +        QSLIST_REMOVE_HEAD(&pool, pool_next);
  +        pool_size--;
  +    } else {
  +        co = qemu_coroutine_new();
  +    }
       co->entry = entry;
       return co;
   }
 Since this is obviously going to blow up badly if it's called
 from multiple threads, is there some kind of assert we can add
 so that we fail obviously if somebody makes that coding error?
 [the difficulty is probably when your backend is the gthread
 one, at least I assume creating a coroutine inside a coroutine
 is allowed.]

Yes, it is; however, creating a coroutine outside the big QEMU lock is
not allowed.  Since only one coroutine runs at a time, the code is safe,
just like the other global variables that are used in
qemu-coroutine-lock.c, for example.

Paolo
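
A rough sketch of the kind of assertion asked about in this thread, under
the assumption that "the caller holds the global lock" can be tracked in a
flag. The global_lock_held flag, the POOL_MAX_SIZE cap and the function
names are hypothetical; QEMU's real pool uses QSLIST and
qemu_coroutine_new(), and this is not its implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef void CoroutineEntry(void *opaque);

typedef struct Coroutine {
    CoroutineEntry *entry;
    struct Coroutine *pool_next;    /* link used only while pooled */
} Coroutine;

enum { POOL_MAX_SIZE = 64 };        /* arbitrary cap for this sketch */

static Coroutine *pool;             /* singly linked free list */
static unsigned int pool_size;
static bool global_lock_held;       /* hypothetical stand-in for the big lock */

Coroutine *coroutine_create(CoroutineEntry *entry)
{
    Coroutine *co;

    assert(global_lock_held);       /* fail loudly on unlocked callers */
    co = pool;
    if (co) {
        pool = co->pool_next;       /* reuse a pooled coroutine */
        pool_size--;
    } else {
        co = calloc(1, sizeof(*co));
        assert(co != NULL);
    }
    co->entry = entry;
    return co;
}

void coroutine_delete(Coroutine *co)
{
    assert(global_lock_held);
    if (pool_size < POOL_MAX_SIZE) {
        co->pool_next = pool;       /* return it to the pool */
        pool = co;
        pool_size++;
    } else {
        free(co);
    }
}
```

The pool itself stays unsynchronized, as in the patch; the assertion only
turns a misuse into an immediate failure instead of silent corruption.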



[Qemu-devel] [PATCH v3] Align PCI capabilities in pci_find_space

2012-09-26 Thread mjr
From: Matt Renzelmann m...@cs.wisc.edu

The current implementation of pci_find_space does not correctly align
PCI capabilities in the PCI configuration space.  It also does not
distinguish PCI and PCI-Express devices.  This patch fixes these
issues.

Thanks to Alex Williamson for continuing feedback.

Signed-off-by: Matt Renzelmann m...@cs.wisc.edu
---

In this patch, I've revised the pci_find_space function as suggested
(more-or-less).  I searched for calls to pci_add_capability, and at
this time, most rely only on capabilities that fit in the PCI config
space.  More importantly, almost all specify the capability offset
instead of relying on pci_find_space, so this change does not impact
any calls that specify an offset manually.  However, it's important to
double-check that there are no calls from PCI-E virtual devices to
pci_add_capability that both:

(a) relied on pci_find_space to find them space

(b) needed the PCI-E extended config space searched in addition to the
PCI space

as these would break with this patch. Here is the list of files that
refer to pcie_cap_init:

./hw/pcie.c
./hw/pcie.h
./hw/ioh3420.c
./hw/usb/hcd-xhci.c
./hw/xio3130_upstream.c
./hw/xio3130_downstream.c

The goal of this search was simply to find PCI-E devices--there may be
a better way.  The next list contains calls to pci_add_capability:

./hw/pci_bridge.c
./hw/shpc.c
./hw/pcie.c
./hw/kvm/pci-assign.c
./hw/msi.c
./hw/pci.c
./hw/ide/ich.c
./hw/pci.h
./hw/eepro100.c
./hw/msix.c
./hw/slotid_cap.c


 hw/pci.c |   28 +---
 1 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index f855cf3..2217dda 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1626,16 +1626,30 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name)
     return pci_create_simple_multifunction(bus, devfn, false, name);
 }
 
-static int pci_find_space(PCIDevice *pdev, uint8_t size)
+static int pci_find_space(PCIDevice *pdev, uint8_t size, bool include_pcie)
 {
-    int config_size = pci_config_size(pdev);
+    int config_size;
     int offset = PCI_CONFIG_HEADER_SIZE;
     int i;
-    for (i = PCI_CONFIG_HEADER_SIZE; i < config_size; ++i)
-        if (pdev->used[i])
-            offset = i + 1;
-        else if (i - offset + 1 == size)
+    uint32_t *dword_used = &pdev->used[PCI_CONFIG_HEADER_SIZE];
+
+    if (include_pcie) {
+        assert (pci_config_size(pdev) >= PCIE_CONFIG_SPACE_SIZE);
+        config_size = PCIE_CONFIG_SPACE_SIZE;
+    } else {
+        config_size = PCI_CONFIG_SPACE_SIZE;
+    }
+
+    /* This approach ensures the capability is dword-aligned, as
+       required by the PCI specification */
+    for (i = PCI_CONFIG_HEADER_SIZE; i < config_size; i += 4, dword_used++) {
+        if (*dword_used) {
+            offset = i + 4;
+        } else if (i - offset + 4 >= size) {
             return offset;
+        }
+    }
+
     return 0;
 }
 
@@ -1826,7 +1840,7 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
     int i, overlapping_cap;
 
     if (!offset) {
-        offset = pci_find_space(pdev, size);
+        offset = pci_find_space(pdev, size, false);
         if (!offset) {
             return -ENOSPC;
         }
-- 
1.7.5.4
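
For readers who want to try the search outside QEMU, here is a standalone
sketch of the dword-aligned scan the patch describes. It assumes a plain
byte array in which a non-zero entry marks a used byte; the names
find_aligned_space, HEADER_SIZE and CONFIG_SIZE are illustrative, and the
sketch checks each dword byte by byte rather than through a uint32_t
pointer as the patch does.

```c
#include <stdint.h>

enum {
    HEADER_SIZE = 0x40,     /* standard PCI header, always reserved */
    CONFIG_SIZE = 0x100     /* conventional PCI configuration space */
};

/* Return the offset of the first dword-aligned free run of at least
 * `size` bytes, or 0 if there is none.  used[i] != 0 marks byte i taken. */
int find_aligned_space(const uint8_t used[CONFIG_SIZE], uint8_t size)
{
    int offset = HEADER_SIZE;
    int i, j;

    for (i = HEADER_SIZE; i < CONFIG_SIZE; i += 4) {
        int dword_used = 0;

        for (j = 0; j < 4; j++) {
            dword_used |= used[i + j];
        }
        if (dword_used) {
            offset = i + 4;             /* restart after this dword */
        } else if (i - offset + 4 >= size) {
            return offset;              /* free run is long enough */
        }
    }
    return 0;
}
```

Because both the candidate offset and the step are multiples of four, any
offset returned is dword-aligned, even when a single used byte sits in the
middle of a dword.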




Re: [Qemu-devel] [PATCH v3 3/5] target-arm: convert sar, shl and shr helpers to TCG

2012-09-26 Thread Peter Maydell
On 25 September 2012 23:52, Aurelien Jarno aurel...@aurel32.net wrote:
 Now that the movcond TCG op is available, it's possible to replace
 shl and shr helpers by TCG code. The code generated by TCG is slightly
 longer than the code generated by GCC for the helper but is still worth
 it as this avoids all the consequences of using a helper: globals saved
 back to memory, no possible optimization, call overhead, etc.

 Cc: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Aurelien Jarno aurel...@aurel32.net

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

-- PMM



Re: [Qemu-devel] [PATCH v3 0/5] target-arm: misc optimizations

2012-09-26 Thread Peter Maydell
On 25 September 2012 23:51, Aurelien Jarno aurel...@aurel32.net wrote:
 This patch series optimizes the ARM target by:
  - using globals instead of ld/st function
  - using TCG code instead of helpers
  - marking some helpers const and pure

I've put this series into target-arm.next. Thanks!

-- PMM



Re: [Qemu-devel] [RFC PATCH 00/17] Support for multiple AIO contexts

2012-09-26 Thread Paolo Bonzini
On 26/09/2012 16:31, Kevin Wolf wrote:

 In fact, after removing io_flush, I don't really see what makes AIO
 fd handlers special any more.

 Note that while the handlers aren't that special indeed, there is still
 some magic because qemu_aio_wait() also runs bottom halves.
 
 Do you mean the qemu_bh_poll() call? But the normal main loop does the
 same, so I don't see what would be special about it.

That's an abstraction leakage, IMHO.  After this series the normal main
loop no longer needs to call bottom halves.

(Most usage of bottom halves in hw/* is pointless and also falls under
the category of leaked abstractions.  The other uses could also in
principle be called at the wrong time inside monitor commands.  Many
would be served better by a thread pool if it wasn't for our beloved big
lock).

 qemu_aio_wait() only calls these handlers, but would it do any harm if
 we called all fd handlers?

 Unfortunately yes.  You could get re-entrant calls from the monitor
 while a monitor command drains the AIO queue for example.
 
 Hm, that's true... Who's special here - is it the block layer or the
 monitor? I'm not quite sure. If it's the monitor, maybe we should plan
 to change that sometime when we have some spare time... ;-)

It feels like it's the monitor.  But I think in general it is better if
as little QEMU infrastructure as possible is used by the block layer,
because you end up with impossibly-knotted dependencies.  Using things
such as GSource to mediate between the block layer and everything else
is also better with an eye to libqblock.

Also, consider that under Windows there's a big difference: after this
series, qemu_aio_wait() only works with EventNotifiers, while
qemu_set_fd_handler2 only works with sockets.  Networked block drivers
are disabled for Windows by these patches; there's really no way to move
forward without sacrificing them.

Paolo



Re: [Qemu-devel] [PATCH v3 2/2] Versatile Express: Add modelling of NOR flash

2012-09-26 Thread Peter Maydell
On 19 September 2012 16:57, Francesco Lavra francescolavra...@gmail.com wrote:
 This patch adds modelling of the two NOR flash banks found on the
 Versatile Express motherboard. Tested with U-Boot running on an emulated
 Versatile Express, with either A9 or A15 CoreTile.

 Signed-off-by: Francesco Lavra francescolavra...@gmail.com

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

Thanks -- I've added these two patches to arm-devs.next.

-- PMM



[Qemu-devel] [PATCH v2 00/45] Block job improvements for 1.3

2012-09-26 Thread Paolo Bonzini
Hi all, this is the resubmission of my block job patches, originally
meant for 1.2.  This still does not include a persistent dirty bitmap,
which I hope to post in October.

The patches are organized as follows:

01-13   preparatory work for block job errors, including support for
pausing and resuming jobs

14-18   introduce block job errors, and add support in block-stream

19-25   preparatory work for block mirroring: new commands/concepts
and creating new functions out of existing code.

26-33   introduce a simple version of mirroring.  The initial patch
adds the mirroring logic, followed by the ability to switch to
the destination of migration and to handle errors during the job.
All these changes come with testcases.  Removing the ability to
query the target file is the main change from v1.

34-41   These patches introduce the first optimizations, namely supporting
an arbitrary granularity for the dirty bitmap.  The current default,
1M, is too coarse to let the job converge quickly and in almost
real-time.  These patches reimplement the block device dirty bitmap
to allow efficient iteration, and add cluster copy-on-write logic.
Cluster copy-on-write is needed because management will want to
start the copy before the backing file is in place in the destination;
if mirroring takes care of copy-on-write, BDRV_O_NO_BACKING can be
used even if the granularity is smaller than the cluster size.

42-45   A second round of optimizations, replacing serialized read-write
operations with multiple asynchronous I/O operations.  The various
in-flight operations can be of arbitrary size.  The initial copy
will end up reading large chunks sequentially (10M by default),
while subsequent passes can mimic more closely the guest's I/O
patterns.
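
To make the granularity argument concrete, here is a small sketch of a
dirty bitmap whose chunk size is a parameter. It is not the HBitmap from
this series: it uses one byte per chunk for clarity, and every name in it
(DirtyBitmap, dirty_bitmap_set and so on) is an assumption made for the
example.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct DirtyBitmap {
    uint64_t granularity;   /* bytes covered by one chunk */
    uint64_t nchunks;
    uint8_t *dirty;         /* one byte per chunk, for clarity */
} DirtyBitmap;

DirtyBitmap *dirty_bitmap_new(uint64_t disk_size, uint64_t granularity)
{
    DirtyBitmap *bm = calloc(1, sizeof(*bm));

    if (!bm) {
        return NULL;
    }
    bm->granularity = granularity;
    bm->nchunks = (disk_size + granularity - 1) / granularity;
    bm->dirty = calloc(bm->nchunks, 1);
    if (!bm->dirty) {
        free(bm);
        return NULL;
    }
    return bm;
}

/* Mark every chunk touched by the byte range [offset, offset + len). */
void dirty_bitmap_set(DirtyBitmap *bm, uint64_t offset, uint64_t len)
{
    uint64_t first, last, i;

    if (len == 0) {
        return;
    }
    first = offset / bm->granularity;
    last = (offset + len - 1) / bm->granularity;
    for (i = first; i <= last && i < bm->nchunks; i++) {
        bm->dirty[i] = 1;
    }
}

/* Number of bytes a copy pass would have to transfer again. */
uint64_t dirty_bitmap_bytes(const DirtyBitmap *bm)
{
    uint64_t i, count = 0;

    for (i = 0; i < bm->nchunks; i++) {
        count += bm->dirty[i];
    }
    return count * bm->granularity;
}

void dirty_bitmap_free(DirtyBitmap *bm)
{
    free(bm->dirty);
    free(bm);
}
```

In this model a 4 KiB guest write dirties a full 1 MiB at the 1M
granularity but only 64 KiB at a 64K granularity, which is the reason a
finer granularity lets the job converge with less copying.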

All comments from Kevin's partial review are addressed, so I believe
the first 29 patches should be ready to go.  Laszlo already reviewed some
of the subsequent parts.

Please review!

Jeff Cody (1):
  blockdev: rename block_stream_cb to a generic block_job_cb

Paolo Bonzini (44):
  qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE
  block: fix documentation of block_job_cancel_sync
  block: move job APIs to separate files
  block: add block_job_query
  block: add support for job pause/resume
  qmp: add block-job-pause and block-job-resume
  qemu-iotests: add test for pausing a streaming operation
  block: rename block_job_complete to block_job_completed
  iostatus: rename BlockErrorAction, BlockQMPEventAction
  iostatus: move BlockdevOnError declaration to QAPI
  iostatus: change is_read to a bool
  iostatus: reorganize io error code
  block: introduce block job error
  stream: add on-error argument
  blkdebug: process all set_state rules in the old state
  qemu-iotests: map underscore to dash in QMP argument names
  qemu-iotests: add tests for streaming error handling
  block: add bdrv_query_info
  block: add bdrv_query_stats
  block: add bdrv_open_backing_file
  block: introduce new dirty bitmap functionality
  block: export dirty bitmap information in query-block
  block: add block-job-complete
  block: introduce BLOCK_JOB_READY event
  mirror: introduce mirror job
  qmp: add drive-mirror command
  mirror: implement completion
  qemu-iotests: add mirroring test case
  iostatus: forward bdrv_iostatus_reset to block job
  mirror: add support for on-source-error/on-target-error
  qmp: add pull_event function
  qemu-iotests: add testcases for mirroring
on-source-error/on-target-error
  host-utils: add ffsl
  add hierarchical bitmap data type and test cases
  block: implement dirty bitmap using HBitmap
  block: make round_to_clusters public
  mirror: perform COW if the cluster size is bigger than the
granularity
  block: return count of dirty sectors, not chunks
  block: allow customizing the granularity of the dirty bitmap
  mirror: allow customizing the granularity
  mirror: switch mirror_iteration to AIO
  mirror: add buf-size argument to drive-mirror
  mirror: support more than one in-flight AIO operation
  mirror: support arbitrarily-sized iterations

 Makefile.objs |   5 +-
 QMP/qmp-events.txt|  42 +++
 QMP/qmp.py|  20 ++
 block-migration.c |   7 +-
 block.c   | 480 +-
 block.h   |  37 ++-
 block/Makefile.objs   |   3 +-
 block/blkdebug.c  |  12 +-
 block/mirror.c| 576 
 block/stream.c|  33 ++-
 block_int.h   | 192 +++-
 blockdev.c| 259 ++---
 blockjob.c| 282 ++
 blockjob.h| 280 ++
 hbitmap.c | 400 +
 hbitmap.h | 207 

[Qemu-devel] [PATCH v2 01/45] qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE

2012-09-26 Thread Paolo Bonzini
The DeviceNotActive text is not a particularly good match, add
a separate text while keeping the same class.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
v1-v2: rebased for Error changes

 blockdev.c | 4 ++--
 qerror.h   | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index e5d450f..de5457d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1142,7 +1142,7 @@ void qmp_block_job_set_speed(const char *device, int64_t speed, Error **errp)
     BlockJob *job = find_block_job(device);
 
     if (!job) {
-        error_set(errp, QERR_DEVICE_NOT_ACTIVE, device);
+        error_set(errp, QERR_BLOCK_JOB_NOT_ACTIVE, device);
         return;
     }
 
@@ -1154,7 +1154,7 @@ void qmp_block_job_cancel(const char *device, Error **errp)
     BlockJob *job = find_block_job(device);
 
     if (!job) {
-        error_set(errp, QERR_DEVICE_NOT_ACTIVE, device);
+        error_set(errp, QERR_BLOCK_JOB_NOT_ACTIVE, device);
         return;
     }
 
diff --git a/qerror.h b/qerror.h
index d0a76a4..485c773 100644
--- a/qerror.h
+++ b/qerror.h
@@ -48,6 +48,9 @@ void assert_no_error(Error *err);
 #define QERR_BASE_NOT_FOUND \
     ERROR_CLASS_GENERIC_ERROR, "Base '%s' not found"
 
+#define QERR_BLOCK_JOB_NOT_ACTIVE \
+    ERROR_CLASS_DEVICE_NOT_ACTIVE, "No active block job on device '%s'"
+
 #define QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED \
     ERROR_CLASS_GENERIC_ERROR, "Block format '%s' used by device '%s' does not support feature '%s'"
 
-- 
1.7.12
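
The QERR_* macros work by expanding to two arguments at once, an error
class and a format string. The sketch below shows that mechanism in
isolation. SketchError and sketch_error_set() are hypothetical stand-ins
(the real error_set() takes an Error **), and the enum lists only the two
classes that appear in the hunk above.

```c
#include <stdarg.h>
#include <stdio.h>

typedef enum ErrorClass {
    ERROR_CLASS_GENERIC_ERROR,
    ERROR_CLASS_DEVICE_NOT_ACTIVE
} ErrorClass;

/* Simplified stand-in for QEMU's Error object. */
typedef struct SketchError {
    ErrorClass err_class;
    char msg[128];
} SketchError;

/* The macro supplies two arguments at once: the class and the format. */
#define QERR_BLOCK_JOB_NOT_ACTIVE \
    ERROR_CLASS_DEVICE_NOT_ACTIVE, "No active block job on device '%s'"

/* Hypothetical helper for this sketch only. */
void sketch_error_set(SketchError *err, ErrorClass err_class,
                      const char *fmt, ...)
{
    va_list ap;

    err->err_class = err_class;
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
}
```

A call such as sketch_error_set(&err, QERR_BLOCK_JOB_NOT_ACTIVE, "virtio0")
therefore keeps the DeviceNotActive class while producing the new, more
specific message text.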





[Qemu-devel] [PATCH v2 05/45] block: add block_job_query

2012-09-26 Thread Paolo Bonzini
Extract it out of the implementation of info block-jobs.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
v1-v2: use g_new0.

 blockdev.c | 15 ++-
 blockjob.c | 11 +++
 blockjob.h |  8 
 3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7ab7d5e..5772c11 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1169,19 +1169,8 @@ static void do_qmp_query_block_jobs_one(void *opaque, BlockDriverState *bs)
     BlockJob *job = bs->job;
 
     if (job) {
-        BlockJobInfoList *elem;
-        BlockJobInfo *info = g_new(BlockJobInfo, 1);
-        *info = (BlockJobInfo){
-            .type   = g_strdup(job->job_type->job_type),
-            .device = g_strdup(bdrv_get_device_name(bs)),
-            .len    = job->len,
-            .offset = job->offset,
-            .speed  = job->speed,
-        };
-
-        elem = g_new0(BlockJobInfoList, 1);
-        elem->value = info;
-
+        BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
+        elem->value = block_job_query(bs->job);
         (*prev)->next = elem;
         *prev = elem;
     }
diff --git a/blockjob.c b/blockjob.c
index 9737a43..dea63f8 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -161,3 +161,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClock *clock, int64_t ns)
         job->busy = true;
     }
 }
+
+BlockJobInfo *block_job_query(BlockJob *job)
+{
+    BlockJobInfo *info = g_new0(BlockJobInfo, 1);
+    info->type   = g_strdup(job->job_type->job_type);
+    info->device = g_strdup(bdrv_get_device_name(job->bs));
+    info->len    = job->len;
+    info->offset = job->offset;
+    info->speed  = job->speed;
+    return info;
+}
diff --git a/blockjob.h b/blockjob.h
index 559518a..c6af0fb 100644
--- a/blockjob.h
+++ b/blockjob.h
@@ -163,6 +163,14 @@ void block_job_cancel(BlockJob *job);
 bool block_job_is_cancelled(BlockJob *job);
 
 /**
+ * block_job_query:
+ * @job: The job to get information about.
+ *
+ * Return information about a job.
+ */
+BlockJobInfo *block_job_query(BlockJob *job);
+
+/**
  * block_job_cancel_sync:
  * @job: The job to be canceled.
  *
-- 
1.7.12





[Qemu-devel] [PATCH v2 42/45] mirror: switch mirror_iteration to AIO

2012-09-26 Thread Paolo Bonzini
There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target.  However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
v1->v2: some simplification thanks to mirror_error_action

 block/mirror.c | 155 +++--
 trace-events   |   2 +
 2 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 335f17c..fc39621 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -33,8 +33,19 @@ typedef struct MirrorBlockJob {
     unsigned long *cow_bitmap;
     HBitmapIter hbi;
     uint8_t *buf;
+
+    int in_flight;
+    int ret;
 } MirrorBlockJob;
 
+typedef struct MirrorOp {
+    MirrorBlockJob *s;
+    QEMUIOVector qiov;
+    struct iovec iov;
+    int64_t sector_num;
+    int nb_sectors;
+} MirrorOp;
+
 static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
                                             int error)
 {
@@ -48,15 +59,60 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
     }
 }
 
-static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
-                                         BlockErrorAction *p_action)
+static void mirror_iteration_done(MirrorOp *op)
+{
+    MirrorBlockJob *s = op->s;
+
+    s->in_flight--;
+    trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors);
+    g_slice_free(MirrorOp, op);
+    qemu_coroutine_enter(s->common.co, NULL);
+}
+
+static void mirror_write_complete(void *opaque, int ret)
+{
+    MirrorOp *op = opaque;
+    MirrorBlockJob *s = op->s;
+    if (ret < 0) {
+        BlockDriverState *source = s->common.bs;
+        BlockErrorAction action;
+
+        bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
+        action = mirror_error_action(s, false, -ret);
+        if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
+            s->ret = ret;
+        }
+    }
+    mirror_iteration_done(op);
+}
+
+static void mirror_read_complete(void *opaque, int ret)
+{
+    MirrorOp *op = opaque;
+    MirrorBlockJob *s = op->s;
+    if (ret < 0) {
+        BlockDriverState *source = s->common.bs;
+        BlockErrorAction action;
+
+        bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
+        action = mirror_error_action(s, true, -ret);
+        if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
+            s->ret = ret;
+        }
+
+        mirror_iteration_done(op);
+        return;
+    }
+    bdrv_aio_writev(s->target, op->sector_num, &op->qiov, op->nb_sectors,
+                    mirror_write_complete, op);
+}
+
+static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->common.bs;
-    BlockDriverState *target = s->target;
-    QEMUIOVector qiov;
-    int ret, nb_sectors, nb_sectors_chunk;
+    int nb_sectors, nb_sectors_chunk;
     int64_t end, sector_num, cluster_num;
-    struct iovec iov;
+    MirrorOp *op;
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
     if (s->sector_num < 0) {
@@ -87,31 +143,30 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
 
     end = s->common.len >> BDRV_SECTOR_BITS;
     nb_sectors = MIN(nb_sectors, end - sector_num);
+
+    /* Allocate a MirrorOp that is used as an AIO callback.  */
+    op = g_slice_new(MirrorOp);
+    op->s = s;
+    op->iov.iov_base = s->buf;
+    op->iov.iov_len  = nb_sectors * 512;
+    op->sector_num = sector_num;
+    op->nb_sectors = nb_sectors;
+    qemu_iovec_init_external(&op->qiov, &op->iov, 1);
+
     bdrv_reset_dirty(source, sector_num, nb_sectors);
 
     /* Copy the dirty cluster.  */
-    iov.iov_base = s->buf;
-    iov.iov_len  = nb_sectors * 512;
-    qemu_iovec_init_external(&qiov, &iov, 1);
-
+    s->in_flight++;
     trace_mirror_one_iteration(s, sector_num, nb_sectors);
-    ret = bdrv_co_readv(source, sector_num, nb_sectors, &qiov);
-    if (ret < 0) {
-        *p_action = mirror_error_action(s, true, -ret);
-        goto fail;
-    }
-    ret = bdrv_co_writev(target, sector_num, nb_sectors, &qiov);
-    if (ret < 0) {
-        *p_action = mirror_error_action(s, false, -ret);
-        s->synced = false;
-        goto fail;
-    }
-    return 0;
+    bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
+                   mirror_read_complete, op);
+}
 
-fail:
-    /* Try again later.  */
-    bdrv_set_dirty(source, sector_num, nb_sectors);
-    return ret;
+static void mirror_drain(MirrorBlockJob *s)
+{
+    while (s->in_flight > 0) {
+        qemu_coroutine_yield();
+    }
 }
 
 static void coroutine_fn mirror_run(void *opaque)
@@ -119,6 +174,7 @@ static void coroutine_fn mirror_run(void *opaque)
     MirrorBlockJob *s = opaque;
     BlockDriverState *bs = s->common.bs;
     int64_t sector_num, end, nb_sectors_chunk, length;
+uint64_t 

Re: [Qemu-devel] [PATCH 1/4] s390: sclp base support

2012-09-26 Thread Christian Borntraeger
On 26/09/12 17:00, Alexander Graf wrote:

> +/* Provide information about the configuration, CPUs and storage */
> +static void read_SCP_info(SCCB *sccb)
> +{
> +    ReadInfo *read_info = (ReadInfo *) sccb;
> +    int shift = 0;
> +
> +    while ((ram_size >> (20 + shift)) > 65535) {
> +        shift++;
> +    }
> +    read_info->rnmax = cpu_to_be16(ram_size >> (20 + shift));
> +    read_info->rnsize = 1 << shift;
>
> Any reason we can't just always expose rnmax2 and rnsize2 instead? This way
> we're quite limited on the amount of RAM we can support, right?

Well, we have 65535 * 256 * 1MB == 16TB, which is OK for the next 2 or 3 years,
I guess.
There are actually some rules that decide about rnmax vs rnmax2 etc., and here
the non-2 variants are appropriate. This might change for systems > 16TB or
systems with memory hotplug, but I would prefer to use those for now. We will
add the full logic in case we add memory hotplug.
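
As a rough illustration of the rnmax/rnsize arithmetic in the quoted hunk, here
is a minimal standalone sketch. The struct and function names are hypothetical
and are not part of the patch, the byte-order conversion (cpu_to_be16) is left
out, and rnsize is kept in a wide field here, which may not match the real
ReadInfo layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values; not the real ReadInfo layout. */
struct ram_increments {
    uint16_t rnmax;   /* number of storage increments */
    uint32_t rnsize;  /* size of one increment, in MB */
};

/* Pick the smallest power-of-two increment size (in MB) for which the
 * increment count still fits in 16 bits, as the quoted loop does. */
static struct ram_increments encode_ram(uint64_t ram_size)
{
    struct ram_increments r;
    int shift = 0;

    /* ram_size >> 20 is the size in MB; grow the increment until it fits. */
    while ((ram_size >> (20 + shift)) > 65535) {
        shift++;
    }
    r.rnmax = (uint16_t)(ram_size >> (20 + shift));
    r.rnsize = UINT32_C(1) << shift;
    return r;
}
```

With 4 GB of RAM the loop never runs, so the increment size stays at 1 MB; with
1 TB it settles on 32 MB increments.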


[...]

> +    if (be16_to_cpu(work_sccb.h.length) < 8 ||
>
> sizeof(SCCBHeader)

ok


>
> +        be16_to_cpu(work_sccb.h.length) > 4096) {
>
> SCCB_SIZE

ok


>   */
> -int sclp_service_call(CPUS390XState *env, uint32_t sccb, uint64_t code)
> +int sclp_service_call(uint32_t sccb, uint64_t code)
>
> Why not move the whole thing into sclp.c? The only thing remaining here are a
> few sanity checks that would just as well work in generic sclp handling code,
> right?

The idea was two-fold:
- to have one single place where we cross between target-s390x and hw (review
  feedback from the first series; originally we had everything in sclp.c)
- to have the checks that the CPU can do over there and the complex things that
  look into the sccb in sclp.c

But we could certainly move that; your take.

Christian






[Qemu-devel] [PATCH v2 44/45] mirror: support more than one in-flight AIO operation

2012-09-26 Thread Paolo Bonzini
With AIO support in place, we can start copying more than one chunk
in parallel.  This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.

Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.

In addition, two different iterations on the HBitmap may want to
copy the same cluster.  We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.
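
The free-list idea described above can be sketched in isolation. This is only
an illustration under stated assumptions: the names (chunk_pool, pool_get,
pool_put) are hypothetical, a hand-rolled FIFO stands in for QEMU's QSIMPLEQ
macros, and the chunk size stands in for the mirroring granularity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A free chunk stores the link to the next free chunk in its own first
 * bytes, so the free list needs no memory beyond the buffer itself. */
struct chunk {
    struct chunk *next;
};

struct chunk_pool {
    unsigned char *buf;       /* one allocation, split into equal chunks */
    struct chunk *free_head;  /* oldest free chunk */
    struct chunk *free_tail;  /* most recently released chunk */
    int free_count;
};

/* Return a chunk to the tail of the free list. */
static void pool_put(struct chunk_pool *p, void *mem)
{
    struct chunk *c = mem;

    c->next = NULL;
    if (p->free_tail) {
        p->free_tail->next = c;
    } else {
        p->free_head = c;
    }
    p->free_tail = c;
    p->free_count++;
}

/* Take the chunk at the head of the free list, or NULL if none is free. */
static void *pool_get(struct chunk_pool *p)
{
    struct chunk *c = p->free_head;

    if (!c) {
        return NULL;
    }
    p->free_head = c->next;
    if (!p->free_head) {
        p->free_tail = NULL;
    }
    p->free_count--;
    return c;
}

/* Split one buffer into nb_chunks chunks and put them all on the free list.
 * chunk_size must be a non-zero multiple of the pointer size. */
static int pool_init(struct chunk_pool *p, size_t chunk_size, int nb_chunks)
{
    if (chunk_size == 0 || chunk_size % sizeof(void *) != 0 || nb_chunks <= 0) {
        return -1;
    }
    p->buf = malloc(chunk_size * (size_t)nb_chunks);
    if (!p->buf) {
        return -1;
    }
    p->free_head = p->free_tail = NULL;
    p->free_count = 0;
    for (int i = 0; i < nb_chunks; i++) {
        pool_put(p, p->buf + (size_t)i * chunk_size);
    }
    return 0;
}
```

A caller that needs several chunks for one operation would check free_count
first and wait if too few are free, which is the role the yield loops play in
the patch.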

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block/mirror.c | 109 ++---
 trace-events   |   4 ++-
 2 files changed, 100 insertions(+), 13 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index e6426bb..9545f90 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -17,7 +17,15 @@
 #include "qemu/ratelimit.h"
 #include "bitmap.h"
 
-#define SLICE_TIME 100000000ULL /* ns */
+#define SLICE_TIME    100000000ULL /* ns */
+#define MAX_IN_FLIGHT 16
+
+/* The mirroring buffer is a list of granularity-sized chunks.
+ * Free chunks are organized in a list.
+ */
+typedef struct MirrorBuffer {
+    QSIMPLEQ_ENTRY(MirrorBuffer) next;
+} MirrorBuffer;
 
 typedef struct MirrorBlockJob {
     BlockJob common;
@@ -33,7 +41,10 @@ typedef struct MirrorBlockJob {
     unsigned long *cow_bitmap;
     HBitmapIter hbi;
     uint8_t *buf;
+    QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
+    int buf_free_count;
 
+    unsigned long *in_flight_bitmap;
     int in_flight;
     int ret;
 } MirrorBlockJob;
@@ -41,7 +52,6 @@ typedef struct MirrorBlockJob {
 typedef struct MirrorOp {
     MirrorBlockJob *s;
     QEMUIOVector qiov;
-    struct iovec iov;
     int64_t sector_num;
     int nb_sectors;
 } MirrorOp;
@@ -62,8 +72,22 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
 static void mirror_iteration_done(MirrorOp *op)
 {
     MirrorBlockJob *s = op->s;
+    struct iovec *iov;
+    int64_t cluster_num;
+    int i, nb_chunks;
 
     s->in_flight--;
+    iov = op->qiov.iov;
+    for (i = 0; i < op->qiov.niov; i++) {
+        MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
+        QSIMPLEQ_INSERT_TAIL(&s->buf_free, buf, next);
+        s->buf_free_count++;
+    }
+
+    cluster_num = op->sector_num / s->granularity;
+    nb_chunks = op->nb_sectors / s->granularity;
+    bitmap_clear(s->in_flight_bitmap, cluster_num, nb_chunks);
+
     trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors);
     g_slice_free(MirrorOp, op);
     qemu_coroutine_enter(s->common.co, NULL);
@@ -110,8 +134,8 @@ static void mirror_read_complete(void *opaque, int ret)
 static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->common.bs;
-    int nb_sectors, nb_sectors_chunk;
-    int64_t end, sector_num, cluster_num;
+    int nb_sectors, nb_sectors_chunk, nb_chunks;
+    int64_t end, sector_num, cluster_num, next_sector, hbitmap_next_sector;
     MirrorOp *op;
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -122,6 +146,8 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
         assert(s->sector_num >= 0);
     }
 
+    hbitmap_next_sector = s->sector_num;
+
     /* If we have no backing file yet in the destination, and the cluster size
      * is very large, we need to do COW ourselves.  The first time a cluster is
      * copied, copy it entirely.
@@ -137,21 +163,58 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
         bdrv_round_to_clusters(s->target,
                                sector_num, nb_sectors_chunk,
                                &sector_num, &nb_sectors);
-        bitmap_set(s->cow_bitmap, sector_num / nb_sectors_chunk,
-                   nb_sectors / nb_sectors_chunk);
+
+        /* The rounding may make us copy sectors before the
+         * first dirty one.
+         */
+        cluster_num = sector_num / nb_sectors_chunk;
+    }
+
+    /* Wait for I/O to this cluster (from a previous iteration) to be done.  */
+    while (test_bit(cluster_num, s->in_flight_bitmap)) {
+        trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
+        qemu_coroutine_yield();
     }
 
     end = s->common.len >> BDRV_SECTOR_BITS;
     nb_sectors = MIN(nb_sectors, end - sector_num);
+    nb_chunks = (nb_sectors + nb_sectors_chunk - 1) / nb_sectors_chunk;
+    while (s->buf_free_count < nb_chunks) {
+        trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
+        qemu_coroutine_yield();
+    }
+
+    /* We have enough free space to copy these sectors.  */
+    if (s->cow_bitmap) {
+        bitmap_set(s->cow_bitmap, cluster_num, nb_chunks);
+    }
 
     /* Allocate a MirrorOp that is used as an AIO callback.  */
     op = g_slice_new(MirrorOp);
     op->s = s;
-    op->iov.iov_base = s->buf;
-

Re: [Qemu-devel] [PATCH v9 4/4] block: Support GlusterFS as a QEMU block backend.

2012-09-26 Thread Bharata B Rao
On Wed, Sep 26, 2012 at 12:00:47PM +0200, Kevin Wolf wrote:
> On 24.09.2012 11:13, Bharata B Rao wrote:
> > +static int parse_volume_options(GlusterConf *gconf, char *path)
> > +{
> > +    char *token, *saveptr;
> > +
> > +    /* volname */
> > +    token = strtok_r(path, "/", &saveptr);
> > +    if (!token) {
> > +        return -EINVAL;
> > +    }
> > +    gconf->volname = g_strdup(token);
> > +
> > +    /* image */
> > +    token = strtok_r(NULL, "?", &saveptr);
>
> If I understand uri.c right, there is no ? in the path, so there's no
> reason to call strtok. You could just use the rest of the string.

As you note, I don't strictly need the 2nd strtok, since the rest of the string
is available in saveptr. But I thought using saveptr directly was not ideal or
preferred. I wanted to use the most appropriate/safe delimiter to extract the
image string in the 2nd strtok and decided to use '?'.

If you think using saveptr is fine, then I could use that as below...

    /* image */
    if (!*saveptr) {
        return -EINVAL;
    }
    gconf->image = g_strdup(saveptr);
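
For comparison, here is a sketch of the same split that avoids both the second
strtok_r() call and any reliance on what saveptr points to. It is not the
patch's code: split_volume_path is a hypothetical helper, plain strndup() and
strdup() stand in for g_strdup(), -1 stands in for -EINVAL, and unlike the
variant above it also skips repeated slashes between the two components.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split "/volname/image..." into its two components without modifying
 * the input.  Returns 0 on success and -1 if either part is missing.
 * On success the caller owns *volname and *image and must free() them. */
static int split_volume_path(const char *path, char **volname, char **image)
{
    const char *start = path + strspn(path, "/");  /* skip leading slashes */
    size_t vol_len = strcspn(start, "/");          /* length of volname */
    const char *rest = start + vol_len;

    if (vol_len == 0) {
        return -1;                                 /* no volume name */
    }
    rest += strspn(rest, "/");                     /* skip the separator */
    if (*rest == '\0') {
        return -1;                                 /* no image part */
    }

    *volname = strndup(start, vol_len);
    *image = strdup(rest);
    if (!*volname || !*image) {
        free(*volname);
        free(*image);
        *volname = *image = NULL;
        return -1;
    }
    return 0;
}
```

Everything after the volume name, including further slashes, ends up in the
image string, which matches the intent of using "the rest of the string".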

 
> > +    if (!token) {
> > +        return -EINVAL;
> > +    }
> > +    gconf->image = g_strdup(token);
> > +    return 0;
> > +}
> > +
> > +
> > +    if (uri->query) {
> > +        unescape_str = uri_string_unescape(uri->query, -1, NULL);
> > +        if (!unescape_str) {
> > +            ret = -EINVAL;
> > +            goto out;
> > +        }
> > +    }
>
> I agree with Paolo here, this needs to go away.

OK, will do that.

  +
> > +    if (is_unix) {
> > +        if (strcmp(qp->p[0].name, "socket")) {
> > +            ret = -EINVAL;
> > +            goto out;
> > +        }
> > +        gconf->server = g_strdup(qp->p[0].value);
>
> Maybe add a check that uri->server is empty?

I am saying that we will ignore the server and port if the
transport type is unix. But I guess I will add this check and change
the comments and patch description accordingly.

 
> > +    } else {
> > +        gconf->server = g_strdup(uri->server);
> > +        gconf->port = uri->port;
> > +    }
> > +
> > +out:
> > +    if (qp) {
> > +        query_params_free(qp);
> > +    }
> > +    g_free(unescape_str);
> > +    uri_free(uri);
> > +    return ret;
> > +}
> > +
> > +static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename)
> > +{
> > +    struct glfs *glfs = NULL;
> > +    int ret;
> > +
> > +    ret = qemu_gluster_parseuri(gconf, filename);
> > +    if (ret < 0) {
> > +        error_report("Usage: file=gluster[+transport]://[server[:port]]/"
> > +            "volname/image[?socket=...]");
> > +        errno = -ret;
> > +        goto out;
> > +    }
> > +
> > +    glfs = glfs_new(gconf->volname);
> > +    if (!glfs) {
> > +        goto out;
> > +    }
> > +
> > +    ret = glfs_set_volfile_server(glfs, gconf->transport, gconf->server,
> > +            gconf->port);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    /*
> > +     * TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
> > +     * GlusterFS makes GF_LOG_* macros available to libgfapi users.
> > +     */
> > +    ret = glfs_set_logging(glfs, "-", 4);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    ret = glfs_init(glfs);
> > +    if (ret) {
> > +        error_report("Gluster connection failed for server=%s port=%d "
> > +             "volume=%s image=%s transport=%s\n", gconf->server, gconf->port,
> > +             gconf->volname, gconf->image, gconf->transport);
> > +        goto out;
> > +    }
> > +    return glfs;
> > +
> > +out:
> > +    if (glfs) {
> > +        glfs_fini(glfs);
>
> Does this corrupt errno?

Currently glfs_fini() isn't implemented and it returns -1. I guess it could
modify errno when it's implemented. At one point, I had logic to save
the errno value from previous calls and restore it to errno if glfs_fini()
fails, but that looked ugly since I had to save errno values from
4 previous calls. Should I just save the errno from glfs_init(), since
that does most of the validation, connection establishment etc. and is more
likely to fail?
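
One possible shape for this, shown only as a sketch: save errno once on entry
to the error path and restore it after the cleanup call, so it does not matter
which of the earlier calls failed. The function names below are hypothetical
and noisy_cleanup() merely stands in for a cleanup call such as glfs_fini()
that might overwrite errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical cleanup routine that clobbers errno, standing in for a
 * library teardown call whose failure should not hide the original error. */
static int noisy_cleanup(void)
{
    errno = EIO;
    return -1;
}

/* Run a cleanup function while preserving the errno value that describes
 * the failure which brought us onto the error path. */
static void cleanup_preserving_errno(int (*cleanup)(void))
{
    int saved_errno = errno;

    cleanup();
    errno = saved_errno;
}
```

Because the save happens at the single cleanup site, there is no need to track
errno separately after each of the four earlier calls.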

> > +static int qemu_gluster_open(BlockDriverState *bs, const char *filename,
> > +    int bdrv_flags)
> > +{
> > +    BDRVGlusterState *s = bs->opaque;
> > +    int open_flags = 0;
> > +    int ret = 0;
> > +    GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
> > +
> > +    s->glfs = qemu_gluster_init(gconf, filename);
> > +    if (!s->glfs) {
> > +        ret = -errno;
> > +        goto out;
> > +    }
> > +
> > +    open_flags |=  O_BINARY;
> > +    open_flags &= ~O_ACCMODE;
>
> open_flags == O_BINARY here, so no O_ACCMODE bits to clear.

Right, will fix.

 
> > +static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb)
> > +{
> > +    int ret = 0;
> > +
> > +    while (1) {
> > +        int fd = s->fds[GLUSTER_FD_WRITE];
> > +
> > +        ret = write(fd, (void *)&acb, sizeof(acb));
> > +        if (ret >= 0) {
> > +            break;
> > +        }
> > +        if (errno == EINTR) {
> > +            continue;
> > +        }
> > +        if (errno != EAGAIN) {
> > +            break;
> > +        }
>
> Variatio delectat? ;-)
>
> How about just do { ... } while (errno == EINTR || errno == EAGAIN); ?

I will go with qemu_write_full(). With that I could get 

[Qemu-devel] [PATCH 1/4] pl190: fix read of VECTADDR

2012-09-26 Thread Peter Maydell
From: Brendan Fennell bfenn...@skynet.ie

Reading VECTADDR was causing us to set the current priority to
the wrong value, the most obvious effect of which was that we
would return the vector for the wrong interrupt as the result
of the read.
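
The off-by-one can be seen in a small standalone model of the priority scan.
This is only a sketch: NUM_PRIO, add_irq() and pending_priority() are
hypothetical names, and the mask layout follows the rule stated in the patch's
comment, namely that an interrupt X at priority P sets bit X in prio_mask[Y]
for every Y greater than P.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIO 17  /* assumed number of priority levels for this sketch */

/* Record interrupt `irq` at priority `prio`: set its bit in prio_mask[y]
 * for every y greater than prio. */
static void add_irq(uint32_t prio_mask[NUM_PRIO + 1], int irq, int prio)
{
    for (int y = prio + 1; y <= NUM_PRIO; y++) {
        prio_mask[y] |= UINT32_C(1) << irq;
    }
}

/* Scan as in the fixed loop: testing prio_mask[i + 1] makes the loop stop
 * with i equal to the priority of the highest-priority pending interrupt.
 * Testing prio_mask[i] instead would stop one level too late. */
static int pending_priority(const uint32_t prio_mask[NUM_PRIO + 1],
                            uint32_t pending, int current_priority)
{
    int i;

    for (i = 0; i < current_priority; i++) {
        if (pending & prio_mask[i + 1]) {
            break;
        }
    }
    return i;
}
```

With an interrupt registered at priority 2, the scan returns 2; with nothing
pending it runs to current_priority, which is the case the following
"no pending interrupts" check in the driver handles.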

Signed-off-by: Brendan Fennell bfenn...@skynet.ie
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/pl190.c |   18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/pl190.c b/hw/pl190.c
index cb50afb..7332f4d 100644
--- a/hw/pl190.c
+++ b/hw/pl190.c
@@ -117,12 +117,18 @@ static uint64_t pl190_read(void *opaque, target_phys_addr_t offset,
         return s->protected;
     case 12: /* VECTADDR */
         /* Read vector address at the start of an ISR.  Increases the
-           current priority level to that of the current interrupt.  */
-        for (i = 0; i < s->priority; i++)
-          {
-            if ((s->level | s->soft_level) & s->prio_mask[i])
-              break;
-          }
+         * current priority level to that of the current interrupt.
+         *
+         * Since an enabled interrupt X at priority P causes prio_mask[Y]
+         * to have bit X set for all Y > P, this loop will stop with
+         * i == the priority of the highest priority set interrupt.
+         */
+        for (i = 0; i < s->priority; i++) {
+            if ((s->level | s->soft_level) & s->prio_mask[i + 1]) {
+                break;
+            }
+        }
+
         /* Reading this value with no pending interrupts is undefined.
            We return the default address.  */
         if (i == PL190_NUM_PRIO)
-- 
1.7.9.5




[Qemu-devel] [PATCH 3/4] Versatile Express: Fix NOR flash 0 address and remove flash alias

2012-09-26 Thread Peter Maydell
From: Francesco Lavra francescolavra...@gmail.com

In the A series memory map (implemented in the Cortex A15 CoreTile), the
first NOR flash bank (flash 0) is mapped to address 0x08000000, while
address 0x00000000 can be configured as an alias to either the first or the
second flash bank. This patch fixes the definition of the flash 0 address
and, for simplicity, removes the alias definition.

Signed-off-by: Francesco Lavra francescolavra...@gmail.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/vexpress.c |7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/vexpress.c b/hw/vexpress.c
index b615844..454c2bb 100644
--- a/hw/vexpress.c
+++ b/hw/vexpress.c
@@ -62,7 +62,6 @@ enum {
     VE_COMPACTFLASH,
     VE_CLCD,
     VE_NORFLASH0,
-    VE_NORFLASH0ALIAS,
     VE_NORFLASH1,
     VE_SRAM,
     VE_VIDEORAM,
@@ -104,9 +103,8 @@ static target_phys_addr_t motherboard_legacy_map[] = {
 };
 
 static target_phys_addr_t motherboard_aseries_map[] = {
-    /* CS0: 0x00000000 .. 0x0c000000 */
-    [VE_NORFLASH0] = 0x00000000,
-    [VE_NORFLASH0ALIAS] = 0x08000000,
+    /* CS0: 0x08000000 .. 0x0c000000 */
+    [VE_NORFLASH0] = 0x08000000,
     /* CS4: 0x0c000000 .. 0x10000000 */
     [VE_NORFLASH1] = 0x0c000000,
     /* CS5: 0x10000000 .. 0x14000000 */
@@ -413,7 +411,6 @@ static void vexpress_common_init(const VEDBoardInfo *daughterboard,
     sysbus_create_simple("pl111", map[VE_CLCD], pic[14]);
 
     /* VE_NORFLASH0: not modelled */
-    /* VE_NORFLASH0ALIAS: not modelled */
     /* VE_NORFLASH1: not modelled */
 
     sram_size = 0x2000000;
-- 
1.7.9.5




[Qemu-devel] [PATCH v2 35/45] add hierarchical bitmap data type and test cases

2012-09-26 Thread Paolo Bonzini
HBitmap provides an array of bits.  The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.

In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level.  When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).

Given an index in the bitmap, it can be split into groups of bits like
this (for the 64-bit case):

     bits 0-57 => word in the last bitmap     | bits 58-63 => bit in the word
     bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
     bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word

So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits.  To move down, you shift the index left
similarly, and add the word index within the group.  Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.

When setting or clearing a range of m bits on all levels, the work to
perform is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.

When iterating on a bitmap, each bit (on any level) is only visited
once.  Hence, the total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps.  Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
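
To make the "move up / move down" arithmetic concrete, here is a minimal
sketch that is independent of the actual hbitmap.c code. It assumes 64-bit
words, so each level groups 2^6 bits; LEVEL_SHIFT and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for this sketch: 64-bit words, i.e. log2(BITS_PER_LONG) == 6. */
#define LEVEL_SHIFT 6
#define WORD_BITS   (1u << LEVEL_SHIFT)

/* Position of a bit within one level: which word, and which bit in it. */
struct bit_pos {
    uint64_t word;
    unsigned bit;
};

static struct bit_pos locate(uint64_t index)
{
    struct bit_pos p;

    p.word = index >> LEVEL_SHIFT;
    p.bit = (unsigned)(index & (WORD_BITS - 1));
    return p;
}

/* Moving up: the bit that summarizes a word in the next-coarser level has
 * the same number as that word, so shift the index right. */
static uint64_t parent_index(uint64_t index)
{
    return index >> LEVEL_SHIFT;
}

/* Moving down: shift the coarser index left and add the position of the
 * bit inside the word it summarizes. */
static uint64_t child_index(uint64_t parent, unsigned bit)
{
    return (parent << LEVEL_SHIFT) | bit;
}
```

For example, index 1000 lives in word 15 at bit 40 of its level, and the
summary bit for that word is bit 15 of the next-coarser level.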

Reviewed-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
v1->v2: reworked iterator API, various other changes after review
from Laszlo.  Added a testcase to test-hbitmap.c.

 hbitmap.c| 400 ++
 hbitmap.h| 207 ++
 tests/Makefile   |   2 +
 tests/test-hbitmap.c | 408 +++
 trace-events |   5 +
 5 files changed, 1022 insertions(+)
 create mode 100644 hbitmap.c
 create mode 100644 hbitmap.h
 create mode 100644 tests/test-hbitmap.c

diff --git a/hbitmap.c b/hbitmap.c
new file mode 100644
index 0000000..90facab
--- /dev/null
+++ b/hbitmap.c
@@ -0,0 +1,400 @@
+/*
+ * Hierarchical Bitmap Data Type
+ *
+ * Copyright Red Hat, Inc., 2012
+ *
+ * Author: Paolo Bonzini pbonz...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "osdep.h"
+#include "hbitmap.h"
+#include "host-utils.h"
+#include "trace.h"
+#include <string.h>
+#include <glib.h>
+#include <assert.h>
+
+/* HBitmap provides an array of bits.  The bits are stored as usual in an
+ * array of unsigned longs, but HBitmap is also optimized to provide fast
+ * iteration over set bits; going from one bit to the next is O(logB n)
+ * worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
+ * that the number of levels is in fact fixed.
+ *
+ * In order to do this, it stacks multiple bitmaps with progressively coarser
+ * granularity; in all levels except the last, bit N is set iff the N-th
+ * unsigned long is nonzero in the immediately next level.  When iteration
+ * completes on the last level it can examine the 2nd-last level to quickly
+ * skip entire words, and even do so recursively to skip blocks of 64 words or
+ * powers thereof (32 on 32-bit machines).
+ *
+ * Given an index in the bitmap, it can be split into groups of bits like
+ * this (for the 64-bit case):
+ *
+ *     bits 0-57 => word in the last bitmap     | bits 58-63 => bit in the word
+ *     bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
+ *     bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
+ *
+ * So it is easy to move up simply by shifting the index right by
+ * log2(BITS_PER_LONG) bits.  To move down, you shift the index left
+ * similarly, and add the word index within the group.  Iteration uses
+ * ffs (find first set bit) to find the next word to examine; this
+ * operation can be done in constant time in most current architectures.
+ *
+ * Setting or clearing a range of m bits on all levels, the work to perform
+ * is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
+ *
+ * When iterating on a bitmap, each bit (on any level) is only visited
+ * once.  Hence, the total cost of visiting a bitmap with m bits in it is
+ * the number of bits that are set in all bitmaps.  Unless the bitmap is
+ * extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
+ 

[Qemu-devel] [PATCH v2 34/45] host-utils: add ffsl

2012-09-26 Thread Paolo Bonzini
We can provide fast versions based on the other functions defined
by host-utils.h.  Some care is required on glibc, which provides
ffsl already.
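
For readers unfamiliar with the ffsl() contract (1-based position of the least
significant set bit, 0 when no bit is set), here is a portable sketch together
with the usual way such a function drives iteration over set bits. The names
sketch_ffsl and collect_set_bits are hypothetical; the patch itself uses
compiler builtins or the existing ctz helpers rather than a loop.

```c
#include <assert.h>

/* Portable fallback with the same contract as ffsl(): returns the 1-based
 * position of the least significant set bit, or 0 if val is zero. */
static int sketch_ffsl(unsigned long val)
{
    int pos = 1;

    if (!val) {
        return 0;
    }
    while (!(val & 1UL)) {
        val >>= 1;
        pos++;
    }
    return pos;
}

/* Visit the set bits of a word from least to most significant, storing
 * their 0-based positions in out[].  Returns how many were stored. */
static int collect_set_bits(unsigned long word, int *out, int max)
{
    int n = 0;

    while (word && n < max) {
        out[n++] = sketch_ffsl(word) - 1;
        word &= word - 1;  /* clear the lowest set bit */
    }
    return n;
}
```

Each loop iteration does work only for a bit that is actually set, which is
why a fast ffsl matters for iterating over sparse words.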

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 host-utils.h | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/host-utils.h b/host-utils.h
index 821db93..2724be0 100644
--- a/host-utils.h
+++ b/host-utils.h
@@ -24,6 +24,7 @@
  */
 
 #include "compiler.h"   /* QEMU_GNUC_PREREQ */
+#include <string.h>     /* ffsl */
 
 #if defined(__x86_64__)
 #define __HAVE_FAST_MULU64__
@@ -234,3 +235,28 @@ static inline int ctpop64(uint64_t val)
     return val;
 #endif
 }
+
+/* glibc does not provide an inline version of ffsl, so always define
+ * ours.  We need to give it a different name, however.
+ */
+#ifdef __GLIBC__
+#define ffsl qemu_ffsl
+#endif
+static inline int ffsl(long val)
+{
+    if (!val) {
+        return 0;
+    }
+
+#if QEMU_GNUC_PREREQ(3, 4)
+    return __builtin_ctzl(val) + 1;
+#else
+    if (sizeof(long) == 4) {
+        return ctz32(val) + 1;
+    } else if (sizeof(long) == 8) {
+        return ctz64(val) + 1;
+    } else {
+        abort();
+    }
+#endif
+}
-- 
1.7.12




