date:20111028

Leib, David david.l...@sap.com writes:

 Hi,
 I am trying to assign a new virtio block device in addition to a normal 
 virtio block device who are accessing exactly the same cdrom drive 
 (/dev/sr0) because I additionaly want to access the block device in my way 
 by manually calling the virtqueue_pop and virtqueue_push and not the normal 
 way they are called.
 At the kvm startup I am assigning this additional qemu rblock device in the 
 vm_config_groups by adding a new QemuOptsList:

 static QemuOptsList qemu_ablock_opts = {
 .name = ablock,
 .head = QTAILQ_HEAD_INITIALIZER(qemu_ablock_opts.head),
 .desc = {
 . normal options like the original virtio block 
 device .
 { /* end of list */ }
 },
 };

 and insert the same data like the normal virtio block device (file=/dev/sr0 
 and if=virtio) in qemu_config.c.
 After that I am calling the normal drive_init_func  (vl.c) with this 
 command :

 qemu_opts_foreach(qemu_find_opts(ablock), drive_init_func, 
 machine-use_scsi, 1);

Your new qemu_ablock_opts won't do a thing without something that writes
configuration data to it.  For qemu_drive_opts, that's drive_def(),
which runs on behalf of -drive and drive hot plug.

 I also added PCIDeviceInfo to the virtio_info array who looks like this:
 {
 .qdev.name = additional_blk_pci,
 .qdev.alias = additional-blk,
 .qdev.size = sizeof(VirtIOPCIProxy),
 .init  = virtio_blk_init_pci_additional,
 .exit  = virtio_blk_exit_pci,
 .vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
 .device_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
 .revision  = VIRTIO_PCI_ABI_VERSION,
 .class_id  = PCI_CLASS_STORAGE_SCSI,
 .qdev.props = (Property[]) {
 DEFINE_PROP_HEX32(class, VirtIOPCIProxy, 
 class_code, 0),
 DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block),
 DEFINE_PROP_STRING(serial, VirtIOPCIProxy, 
 block_serial),
 DEFINE_PROP_BIT(ioeventfd, VirtIOPCIProxy, flags,
 
 VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
 DEFINE_PROP_UINT32(vectors, VirtIOPCIProxy, 
 nvectors, 2),
 DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, 
 host_features),
 DEFINE_PROP_END_OF_LIST(),
 },
 .qdev.reset = virtio_pci_reset,
 },
 It is completely the same like the normal virtio-blk-pci except the .init 
 function that I replaced with my own init-function.
 My problem now is that this init-function is never called when I am starting 
 up the kvm. It only calling the the init-function of virtio-blk-pci two 
 times and my PCIDeviceInfo init-function is completely ignored.
 The initialisation of all virtio_info's in virtio-pci.c works fine but my 
 init-function is never used.
 I tried to initialise only my additional-virtio-blk-pci device but is still 
 calling the init-function from virtio-blk-pci.
 I hope somebody can give me idea where the problem is.

Try -device additional_blk_pci.  Check out docs/qdev-device-use.txt for
how to go from -drive if=virtio (which doesn't use your device) to
-device (which should be able to use your device).

Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppcr: Avoid decrementer related kvm exits

2011-10-28 Thread Alexander Graf


On 25.10.2011, at 21:38, Scott Wood wrote:

 On 10/14/2011 01:44 AM, Alexander Graf wrote:
 Wouldn't a simple
 
 if (kvm_enabled()) {
return;
 }
 
 in the beginning of the function make more sense? There's no code connecting 
 the in-qemu and the in-kvm decrementors atm, so any logic applying to the 
 in-qemu one is moot for kvm.
 
 On book3e at least, we can use sregs to set the decrementer, and we
 probably want this to happen on reset.

Sure.

if (kvm_enabled()) {
kvmppc_set_dec(x);
return;
}


Alex

Re: [Qemu-devel] [PATCH v2 1/3] qemu-tls.h: Add abstraction layer for TLS variables

Andreas Färber afaer...@suse.de writes:

 Am 27.10.2011 13:37, schrieb Peter Maydell:
 Add an abstraction layer for defining and using thread-local
 variables. For the moment this is implemented only for Linux,
 which means they can only be used in restricted circumstances.
 The abstraction layer allows us to add POSIX and Win32 support
 later.
 

 [Paolo's SoB missing]

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
  qemu-tls.h |   51 +++
  1 files changed, 51 insertions(+), 0 deletions(-)
  create mode 100644 qemu-tls.h
 
 diff --git a/qemu-tls.h b/qemu-tls.h
 new file mode 100644
 index 000..d96a159
 --- /dev/null
 +++ b/qemu-tls.h
 @@ -0,0 +1,51 @@
 +/*
 + * Abstraction layer for defining and using TLS variables
 + *
 + * Copyright (c) 2011 Red Hat, Inc, Linaro Limited

 The concatenation looks kind of funny. ;)

I'd split into

 * Copyright (c) 2011 Red Hat, Inc
 * Copyright (c) 2011 Linaro Limited

 + *
 + * Authors:
 + *  Paolo Bonzini pbonz...@redhat.com
 + *  Peter Maydell peter.mayd...@linaro.org
 + *
 + * This program is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU General Public License as
 + * published by the Free Software Foundation; either version 2 of
 + * the License, or (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see http://www.gnu.org/licenses/.
 + */
 +
 +#ifndef QEMU_TLS_GCC_H
 +#define QEMU_TLS_GCC_H

 Extra _GCC. But does no harm.

Unless you file unusual under harm, which I happen to do :)

 +
 +/* Per-thread variables. Note that we only have implementations
 + * which are really thread-local on Linux; the dummy implementations
 + * define plain global variables.
 + *
 + * This means that for the moment use should be restricted to
 + * per-VCPU variables, which are OK because:
 + *  - the only -user mode supporting multiple VCPU threads is linux-user
 + *  - TCG system mode is single-threaded regarding VCPUs
 + *  - KVM system mode is multi-threaded but limited to Linux
 + *
 + * TODO: proper implementations via Win32 .tls sections and
 + * POSIX pthread_getspecific.
 + */
 +#ifdef __linux__
 +#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 +#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
 +#define get_tls(x)   tls__##x
 +#else
 +/* Dummy implementations which define plain global variables */
 +#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 +#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
 +#define get_tls(x)   tls__##x
 +#endif

Any particular reason for pasting tls__ onto the identifier?

 +
 +#endif
[...]

I wish we'll be able to ditch these cumbersome, ugly macros in favor of
C1X threading support.  It'll probably remain a wish forever, given the
lengths we go to support systems that stubbornly refuse to update their
C programming environment for the 21st century.

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.
 
 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)
 
 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
  SpinState *s = opaque;
  uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +uint64_t result = 0;
  
  switch (len) {
  case 1:
 -return ldub_p(spin_p);
 +result = ldub_p(spin_p);
 +break;
  case 2:
 -return lduw_p(spin_p);
 +result = lduw_p(spin_p);
 +break;
  case 4:
 -return ldl_p(spin_p);
 +result = ldl_p(spin_p);
 +break;
  default:
  assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

Why is it useful to make failed assertions stop the program regardless
of NDEBUG only when the assertion happens to be can't reach?

If you worry about assertions failing, don't compile with -DNDEBUG.

Doctor, it hurts when I disable assertions!
Don't disable them, then.
;-P

Re: [Qemu-devel] [PATCH v2 1/3] qemu-tls.h: Add abstraction layer for TLS variables

On 28 October 2011 08:27, Markus Armbruster arm...@redhat.com wrote:
 Andreas Färber afaer...@suse.de writes:

 Am 27.10.2011 13:37, schrieb Peter Maydell:
 + * Copyright (c) 2011 Red Hat, Inc, Linaro Limited

 The concatenation looks kind of funny. ;)

 I'd split into

  * Copyright (c) 2011 Red Hat, Inc
  * Copyright (c) 2011 Linaro Limited

 +#ifndef QEMU_TLS_GCC_H
 +#define QEMU_TLS_GCC_H

 Extra _GCC. But does no harm.

 Unless you file unusual under harm, which I happen to do :)

OK, I'll fix these nits and resend later this morning.

 +#ifdef __linux__
 +#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 +#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
 +#define get_tls(x)           tls__##x
 +#else
 +/* Dummy implementations which define plain global variables */
 +#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 +#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
 +#define get_tls(x)           tls__##x
 +#endif

 Any particular reason for pasting tls__ onto the identifier?

It means we catch accidentally using the identifier directly
rather than via get_tls() at compile time. (That doesn't matter
for the __thread case, obviously, but does for other implementations.)
(also it means we get out of the way of the cpu_single_env macro
we define in patch 3.)

-- PMM

Re: [Qemu-devel] [RFC] Introduce qemu_abort?

Peter Maydell peter.mayd...@linaro.org writes:

 On 26 October 2011 19:34, Stefan Weil s...@weilnetz.de wrote:
 Adding the infrastructure (macros / implementation) could be done
 faster. I suggest these interfaces in qemu-common.h:

 qemu_abort() - abort QEMU with a message containing function name,
 file name and line (macro, see message text in my previous mail cited above)

 qemu_fatal(formatstring, ...) - abort QEMU with a printf like message
 (function, prints QEMU aborted,  and the text according to the parameters)

 We already have a couple of print a message and abort functions:
   hw_error()  (prints qemu: hardware error message, dumps cpu state
for all cores and abort()s)
   cpu_abort() (prints qemu: fatal: message, dumps cpu state for one
core and abort()s)
   tcg_abort() (takes no arguments, acts like your proposed qemu_abort())

 ...perhaps we should try to straighten out the naming inconsistencies
 here and document which ones to use when.

Yes, that would be useful.

 I think the qemu_fatal() might be better addressed as part of a set
 of functions for handling fatal and other errors in a more consistent
 way. At the moment it's a bit random whether a device model will
 abort, warn but continue or silently continue if a guest does something
 that tickles unimplemented behaviour or does something that's probably
 a guest error (eg accessing nonexistent registers). It would be good
 to have some of this be user configurable (some people want to see
 my guest OS is doing something that's probably a guest OS bug warnings,
 some don't, for instance).

Yes.

assert() and abort() are strictly for programming errors: the programmer
screwed up, program is in disorder, and can't continue.  They're not
appropriate ways to handle environmental conditions that can be expected
(out of memory, network kaputt, ...), or bad input.  For QEMU, input
includes guest actions.

I generally don't bother to provide nice ways to abort on programming
error.  assert() is standard, and it's just fine.  To do anything about
it, I'll need a debugger anyway.  If you don't like abort() not telling
you function and line, use assert().

That said, if you want to wrap around yet another standard function, go
right ahead, one more won't make a difference.

 Regarding whether to put this into qemu 1.0 or not -- my preference
 would be for 'not' because any change touching a lot of files has
 the potential to cause pending patches/pullreqs people are trying
 to get in before the feature freeze deadline to no longer apply.

Agree.

Re: [Qemu-devel] [PATCH] Teach block/vdi about discarded (no longer allocated) blocks

Am 27.10.2011 18:12, schrieb Stefan Weil:
 Am 27.10.2011 10:53, schrieb Kevin Wolf:
 Am 26.10.2011 21:51, schrieb Eric Sunshine:
 An entry in the VDI block map will hold an offset to the actual block if
 the block is allocated, or one of two specially-interpreted values if
 not allocated. Using VirtualBox terminology, value VDI_IMAGE_BLOCK_FREE
 (0x) represents a never-allocated block (semantically arbitrary
 content). VDI_IMAGE_BLOCK_ZERO (0xfffe) represents a discarded
 block (semantically zero-filled). block/vdi knows only about
 VDI_IMAGE_BLOCK_FREE. Teach it about VDI_IMAGE_BLOCK_ZERO.

 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com

 Thanks, applied to the block branch.

 Kevin
 
 
 Kevin, I don't want to block improvements. Nevertheless
 I'd like to see a small modification in this patch:
 both #defines should be implemented without a type cast.
 Please change them or wait until Eric sends an update.
 
 My favorite is this:
 
 #define VDI_UNALLOCATED UINT32_MAX
 #define VDI_DISCARD (VDI_UNALLOCATED - 1)
 
 This would also be ok:
 
 #define VDI_UNALLOCATED 0xU
 #define VDI_DISCARD 0xfffeU
 
 Using the macro names and the definitions (with type cast)
 from the original VirtualBox code would also be ok.

I did see your comments, and I waited for the endianness thing to be
answered. However, how the definition of these constants is written is
really not a functional defect, but simply a matter of taste. It's an
old rule that whoever does the work also decides on the details.

I really think it's wasting our time if we need to discuss if a type
cast in the constant definition is only allowed after typedefing
uint32_t to something else like in VBox.

So my preferred way is to leave the patch as it is. The code is simple
and clear and objectively seen it won't get any better with your taste
applied. If Eric prefers, I can update it to use 0xU, though.

Kevin

Re: [Qemu-devel] [PATCH] Teach block/vdi about discarded (no longer allocated) blocks

2011-10-28 Thread Eric Sunshine



On Oct 28, 2011, at 4:00 AM, Kevin Wolf wrote:


Am 27.10.2011 18:12, schrieb Stefan Weil:

Am 27.10.2011 10:53, schrieb Kevin Wolf:

Am 26.10.2011 21:51, schrieb Eric Sunshine:
An entry in the VDI block map will hold an offset to the actual  
block if
the block is allocated, or one of two specially-interpreted  
values if
not allocated. Using VirtualBox terminology, value  
VDI_IMAGE_BLOCK_FREE
(0x) represents a never-allocated block (semantically  
arbitrary
content). VDI_IMAGE_BLOCK_ZERO (0xfffe) represents a  
discarded

block (semantically zero-filled). block/vdi knows only about
VDI_IMAGE_BLOCK_FREE. Teach it about VDI_IMAGE_BLOCK_ZERO.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com


Thanks, applied to the block branch.

Kevin



Kevin, I don't want to block improvements. Nevertheless
I'd like to see a small modification in this patch:
both #defines should be implemented without a type cast.
Please change them or wait until Eric sends an update.

My favorite is this:

#define VDI_UNALLOCATED UINT32_MAX
#define VDI_DISCARD (VDI_UNALLOCATED - 1)

This would also be ok:

#define VDI_UNALLOCATED 0xU
#define VDI_DISCARD 0xfffeU

Using the macro names and the definitions (with type cast)
from the original VirtualBox code would also be ok.


I did see your comments, and I waited for the endianness thing to be
answered. However, how the definition of these constants is written is
really not a functional defect, but simply a matter of taste. It's an
old rule that whoever does the work also decides on the details.

I really think it's wasting our time if we need to discuss if a type
cast in the constant definition is only allowed after typedefing
uint32_t to something else like in VBox.

So my preferred way is to leave the patch as it is. The code is simple
and clear and objectively seen it won't get any better with your taste
applied. If Eric prefers, I can update it to use 0xU, though.


The 0xU notation has the benefit of being explicit, whereas  
the ((uint32_t)~0) notation, taken from the VirtualBox source, is  
somewhat magical for a reader who does not perform an automatic  
((uint32_t)~0) == 0xU conversion in his head. Consequently,  
the 0xU notation might a better choice, if it's not too much  
bother for you to amend the patch.


-- ES

[Qemu-devel] [Bug 498107] Re: www.qemu.org and www.nongnu.org/qemu have a lot of bugs

2011-10-28 Thread Klaus Rennecke

What is this report supposed to tell me in the context of the QEMU
software? I believe this entry borders on mobbing the maintainer, and
looks suspiciously like a plug for the book co-authored by the OP. As a
user, it's just clutter in the search results on launchpad.

** Changed in: qemu
   Status: New = Opinion

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/498107

Title:
  www.qemu.org and www.nongnu.org/qemu have a lot of bugs

Status in QEMU:
  Opinion

Bug description:
  The http://websites www.qemu.org and http://www.nongnu.org/qemu have a
  lot of bugs:

  Why two websites with different oudated content?

  No contact address
  -
  It is not possible to contact the webmaster.

  Outdated content
  --
  The current relase is 0.12.0-rc2.!
  - http://www.nongnu.org/qemu/index.html Jul 30, 2009 QEMU version 0.11.0-rc1 
is out
  - http://www.nongnu.org/qemu/download.html
  - http://www.qemu.org QEMU version 0.12.0-rc1 is out 
  - http://www.qemu.org/download.html

  Many Links are outdated or broken
  
  - http://www.qemu.org/links.html
  - http://www.nongnu.org/qemu/links.html
  For example QEMU on Windows. Why not http://www.davereyn.co.uk/download.htm 
?

  - http://www.qemu.org/user-doc.html
  - http://www.nongnu.org/qemu/user-doc.html
  For example: Quick Start, FAQ and QEMU Wiki. 

  No word about the end of KQEMU support next.

  - http://www.qemu.org/qemu-doc.html
  - http://www.nongnu.org/qemu/qemu-doc.html
  There are a lot of differences to
  qemu --help
  help in QEMU-Monitor

  For example
  -rtc-td-hack
  -localtime
  -startdate
  -netdev
  -mem-path
  -mem-prealloc
  -tdf
  -nvram
  -enable-nesting
  -no-kvm-irqchip
  -no-kvm-pit
  -no-kvm-pit-reinjection
  -xen-domid id
  -xen-create
  -xen-attach
  -readconfig file
  -writeconfig file
  (qemu) host_net_redir
  (qemu) acl_reset

  Please see also

  - http://qemu-buch.de/d/Anhang/_Startoptionen_von_QEMU_und_KVM 
  - http://qemu-buch.de/d/Anhang/_QEMU-Monitor

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/498107/+subscriptions

Re: [Qemu-devel] [PATCH] net: tap-linux: Fix unhelpful error message

On Fri, Oct 14, 2011 at 03:05:10PM -0300, Luiz Capitulino wrote:
 I'm getting:
 
 could not configure /dev/net/tun (tap%d): Operation not permitted
 
 When the ioctl() fails, ifr.ifr_name will most likely not be overwritten.
 So we better only use it when ifname contains a string.
 
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
 ---
 
 PS: Trivial tree candidate.
 
  net/tap-linux.c |6 +-
  1 files changed, 5 insertions(+), 1 deletions(-)

Thanks, applied to the trivial patches -next tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches-next

Stefan

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.

 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)

 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
      SpinState *s = opaque;
      uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +    uint64_t result = 0;

      switch (len) {
      case 1:
 -        return ldub_p(spin_p);
 +        result = ldub_p(spin_p);
 +        break;
      case 2:
 -        return lduw_p(spin_p);
 +        result = lduw_p(spin_p);
 +        break;
      case 4:
 -        return ldl_p(spin_p);
 +        result = ldl_p(spin_p);
 +        break;
      default:
          assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

My point is that it should not be an assertion.  The program has a
control flow path which should never be taken.  In any safe
programming environment the program will terminate with a diagnostic.

That's exactly what we need to do here too.  assert(3) is the wrong
tool for this; we're not even asserting anything.

Stefan

[Qemu-devel] [PATCH 2/2] hw/9pfs: Supply missing va_end()

C99 7.15.1: Each invocation of the va_start and va_copy macros shall
be matched by a corresponding invocation of the va_end macro in the
same function.

Spotted by Coverity.  Harmless on the (common) systems where va_end()
does nothing.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/9pfs/virtio-9p.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index aab3beb..1e22696 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -231,6 +231,7 @@ v9fs_string_alloc_printf(char **strp, const char *fmt, 
va_list ap)
 arg_ulong = va_arg(ap2, unsigned long);
 len += number_to_string((void *)arg_ulong, 'U');
 } else {
+va_end(ap2);
 return -1;
 }
 break;
@@ -244,11 +245,14 @@ v9fs_string_alloc_printf(char **strp, const char *fmt, 
va_list ap)
 default:
 fprintf(stderr,
v9fs_string_alloc_printf:Incorrect format %c, *iter);
+va_end(ap2);
 return -1;
 }
 iter++;
 }
 
+va_end(ap2);
+
 alloc_print:
 *strp = g_malloc((len + 1) * sizeof(**strp));
 
-- 
1.7.6.4

[Qemu-devel] [PATCH 1/2] sysbus: Supply missing va_end()

C99 7.15.1: Each invocation of the va_start and va_copy macros shall
be matched by a corresponding invocation of the va_end macro in the
same function.

Spotted by Coverity.  Harmless on the (common) systems where va_end()
does nothing.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/sysbus.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/sysbus.c b/hw/sysbus.c
index 4fab5a4..fd2fc6a 100644
--- a/hw/sysbus.c
+++ b/hw/sysbus.c
@@ -198,6 +198,7 @@ DeviceState *sysbus_create_varargs(const char *name,
 sysbus_connect_irq(s, n, irq);
 n++;
 }
+va_end(va);
 return dev;
 }
 
@@ -229,6 +230,7 @@ DeviceState *sysbus_try_create_varargs(const char *name,
 sysbus_connect_irq(s, n, irq);
 n++;
 }
+va_end(va);
 return dev;
 }
 
-- 
1.7.6.4

[Qemu-devel] [PATCH 0/2] Trivial portability fix: missing va_end()

Markus Armbruster (2):
  sysbus: Supply missing va_end()
  hw/9pfs: Supply missing va_end()

 hw/9pfs/virtio-9p.c |4 
 hw/sysbus.c |2 ++
 2 files changed, 6 insertions(+), 0 deletions(-)

-- 
1.7.6.4

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.

 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)

 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
      SpinState *s = opaque;
      uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +    uint64_t result = 0;

      switch (len) {
      case 1:
 -        return ldub_p(spin_p);
 +        result = ldub_p(spin_p);
 +        break;
      case 2:
 -        return lduw_p(spin_p);
 +        result = lduw_p(spin_p);
 +        break;
      case 4:
 -        return ldl_p(spin_p);
 +        result = ldl_p(spin_p);
 +        break;
      default:
          assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

 My point is that it should not be an assertion.  The program has a
 control flow path which should never be taken.  In any safe
 programming environment the program will terminate with a diagnostic.

In the not-so-safe C programming environment, we can disable that safety
check by defining NDEBUG.

Disabling safety checks is almost always a bad idea.

 That's exactly what we need to do here too.  assert(3) is the wrong
 tool for this; we're not even asserting anything.

We do, actually: can't reach.  That's as good an assertion as any.

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

On Fri, Oct 28, 2011 at 9:59 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.

 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)

 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
      SpinState *s = opaque;
      uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +    uint64_t result = 0;

      switch (len) {
      case 1:
 -        return ldub_p(spin_p);
 +        result = ldub_p(spin_p);
 +        break;
      case 2:
 -        return lduw_p(spin_p);
 +        result = lduw_p(spin_p);
 +        break;
      case 4:
 -        return ldl_p(spin_p);
 +        result = ldl_p(spin_p);
 +        break;
      default:
          assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

 My point is that it should not be an assertion.  The program has a
 control flow path which should never be taken.  In any safe
 programming environment the program will terminate with a diagnostic.

 In the not-so-safe C programming environment, we can disable that safety
 check by defining NDEBUG.

 Disabling safety checks is almost always a bad idea.

There's a difference in a safety check that slows down the system and
a failure condition where the program must terminate.

assert(3) is for safety checks that can be disabled because they may
slow down the system.

I've been saying that assert(3) isn't appropriate here because the
program can't continue and there's no check we can skip here.

Stefan

[Qemu-devel] [Bug 611142] Re: seabios should have native scsi support

2011-10-28 Thread Launchpad Bug Tracker

Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: seabios (Ubuntu)
   Status: New = Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/611142

Title:
  seabios should have native scsi support

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Triaged
Status in “seabios” package in Ubuntu:
  Confirmed

Bug description:
  Binary package hint: seabios

  Currently when a grub multiboot image is booted with 'kvm -kernel' and
  'biosdisk' module, it will see block devices of type IDE or virtio.
  It will not see scsi devices.

  To demonstrate this:
  $ qemu-img create -f qcow2 disk.img 1G
  $ grub-mkrescue --output=rescue.iso
  $ grub-mkimage -O i386-pc --output=grub-mb.img biosdisk minicmd part_msdos

  An 'ls' inside the grub prompt will show hard disks (hd0) on if:
  a.) -drive uses interface of virtio or scsi
  or
  b.) kvm boots boot from a cdrom or floppy 

  For example, with these commands, grub will see a '(hd0)'
  $ kvm -drive file=disk.img,if=scsi,boot=on -cdrom rescue.iso -boot d
  $ kvm -drive file=disk.img,if=scsi,boot=on -floppy rescue.iso -boot a
  $ kvm -drive file=disk.img,if=virtio,boot=on -cdrom rescue.iso -boot d
  $ kvm -drive file=disk.img,if=ide,boot=on -cdrom rescue.iso -boot d
  $ kvm -drive file=disk.img,if=virtio,boot=on -kernel grub-mb.img 
  $ kvm -drive file=disk.img,if=ide,boot=on -kernel grub-mb.img 

  But the following will not:
  $ kvm -drive file=disk.img,if=scsi,boot=on -kernel grub-mb.img

  ProblemType: Bug
  DistroRelease: Ubuntu 10.10
  Package: seabios 0.6.0-0ubuntu1
  ProcVersionSignature: User Name 2.6.32-305.9-ec2 2.6.32.11+drm33.2
  Uname: Linux 2.6.32-305-ec2 i686
  Architecture: i386
  Date: Thu Jul 29 03:21:21 2010
  Dependencies:
   
  Ec2AMI: ami-e930db80
  Ec2AMIManifest: 
ubuntu-images-testing-us/ubuntu-maverick-daily-i386-server-20100727.manifest.xml
  Ec2AvailabilityZone: us-east-1b
  Ec2InstanceType: m1.small
  Ec2Kernel: aki-407d9529
  Ec2Ramdisk: unavailable
  PackageArchitecture: all
  ProcEnviron:
   PATH=(custom, user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: seabios

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/611142/+subscriptions

[Qemu-devel] [PATCH 0/0] Manpage. Fix typo and add Sheepdog description

2011-10-28 Thread Ronnie Sahlberg

These patches
1, fixes a typo in a previous patch   NDB - NBD
2, adds description of sheepdog URL syntax to the manpage

[Qemu-devel] [PATCH 1/2] Documentation: Fix typo in manpage. NDB - NBD

2011-10-28 Thread Ronnie Sahlberg


Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
---
 qemu-options.hx |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index bf2ebb3..c55080c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1761,10 +1761,10 @@ compiled and linked against libiscsi.
 QEMU supports NBD (Network Block Devices) both using TCP protocol as well
 as Unix Domain Sockets.
 
-Syntax for specifying a NDB device using TCP
+Syntax for specifying a NBD device using TCP
 ``nbd:server-ip:port[:exportname=export]''
 
-Syntax for specifying a NDB device using Unix Domain Sockets
+Syntax for specifying a NBD device using Unix Domain Sockets
 ``nbd:unix:domain-socket[:exportname=export]''
 
 
-- 
1.7.3.1

[Qemu-devel] [PATCH 2/2] Documentation: Add syntax for using sheepdog devices

2011-10-28 Thread Ronnie Sahlberg


Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
---
 qemu-options.hx |   27 +++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index c55080c..38d0f57 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1778,6 +1778,33 @@ Example for Unix Domain Sockets
 qemu --drive file=nbd:unix:/tmp/nbd-socket
 @end example
 
+@item Sheepdog
+Sheepdog is a distributed storage system for QEMU.
+QEMU sopports using either local sheepdog devices or remote networked
+devices.
+
+Syntax for specifying a sheepdog device
+@table @list
+``sheepdog:vdiname''
+
+``sheepdog:vdiname:snapid''
+
+``sheepdog:vdiname:tag''
+
+``sheepdog:host:port:vdiname''
+
+``sheepdog:host:port:vdiname:snapid''
+
+``sheepdog:host:port:vdiname:tag''
+@end table
+
+Example
+@example
+qemu --drive file=sheepdog:192.0.2.1:3:MyVirtualMachine
+@end example
+
+See also @url{http://http://www.osrg.net/sheepdog/}.
+
 @end table
 ETEXI
 
-- 
1.7.3.1

Re: [Qemu-devel] [PATCH] net: Only flush queue or call sent callback on successful delivery

On Thu, Oct 27, 2011 at 10:02 AM, Mark Wu wu...@linux.vnet.ibm.com wrote:
 Now queue flushing and sent callback could be invoked even on delivery
 failure. We add a checking of receiver's return value to avoid this
 case.

 Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
 ---
  net/queue.c |   12 +++-
  1 files changed, 7 insertions(+), 5 deletions(-)

What problem are you trying to fix?

 @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue)
             break;
         }

 -        if (packet-sent_cb) {
 +        if (ret  0  packet-sent_cb) {
             packet-sent_cb(packet-sender, ret);

This looks wrong.  ret is passed as an argument to the callback.  You
are skipping the callback on error and not giving it a chance to see
negative ret.

Looking at virtio_net_tx_complete() this causes a virtqueue element leak.

Stefan

Re: [Qemu-devel] [PATCH 2/2] Documentation: Add syntax for using sheepdog devices

2011-10-28 Thread Eric Sunshine


On Oct 28, 2011, at 5:13 AM, Ronnie Sahlberg wrote:

Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
---
qemu-options.hx |   27 +++
1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index c55080c..38d0f57 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1778,6 +1778,33 @@ Example for Unix Domain Sockets
qemu --drive file=nbd:unix:/tmp/nbd-socket
@end example

+@item Sheepdog
+Sheepdog is a distributed storage system for QEMU.
+QEMU sopports using either local sheepdog devices or remote networked


s/sopports/supports/

-- ES

Re: [Qemu-devel] qemu-img vmdk converted from iso not accepted by vSphere

Rich: Thanks for the ESXi 4.1.0 and vSphere Client 4.1.0 version info.

Fam: Do you have any suggestions how to debug the Error uploading
file path/to/image.vmdk to server.  Not a supported disk format
(sparse VMDK version too old) issue?  Ideally qemu-img would create
images that work with current VMware tools by default.

Stefan

Re: [Qemu-devel] [PATCH] Teach block/vdi about discarded (no longer allocated) blocks

Am 28.10.2011 10:15, schrieb Eric Sunshine:
 
 On Oct 28, 2011, at 4:00 AM, Kevin Wolf wrote:
 
 Am 27.10.2011 18:12, schrieb Stefan Weil:
 Am 27.10.2011 10:53, schrieb Kevin Wolf:
 Am 26.10.2011 21:51, schrieb Eric Sunshine:
 An entry in the VDI block map will hold an offset to the actual  
 block if
 the block is allocated, or one of two specially-interpreted  
 values if
 not allocated. Using VirtualBox terminology, value  
 VDI_IMAGE_BLOCK_FREE
 (0x) represents a never-allocated block (semantically  
 arbitrary
 content). VDI_IMAGE_BLOCK_ZERO (0xfffe) represents a  
 discarded
 block (semantically zero-filled). block/vdi knows only about
 VDI_IMAGE_BLOCK_FREE. Teach it about VDI_IMAGE_BLOCK_ZERO.

 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com

 Thanks, applied to the block branch.

 Kevin


 Kevin, I don't want to block improvements. Nevertheless
 I'd like to see a small modification in this patch:
 both #defines should be implemented without a type cast.
 Please change them or wait until Eric sends an update.

 My favorite is this:

 #define VDI_UNALLOCATED UINT32_MAX
 #define VDI_DISCARD (VDI_UNALLOCATED - 1)

 This would also be ok:

 #define VDI_UNALLOCATED 0xU
 #define VDI_DISCARD 0xfffeU

 Using the macro names and the definitions (with type cast)
 from the original VirtualBox code would also be ok.

 I did see your comments, and I waited for the endianness thing to be
 answered. However, how the definition of these constants is written is
 really not a functional defect, but simply a matter of taste. It's an
 old rule that whoever does the work also decides on the details.

 I really think it's wasting our time if we need to discuss if a type
 cast in the constant definition is only allowed after typedefing
 uint32_t to something else like in VBox.

 So my preferred way is to leave the patch as it is. The code is simple
 and clear and objectively seen it won't get any better with your taste
 applied. If Eric prefers, I can update it to use 0xU, though.
 
 The 0xU notation has the benefit of being explicit, whereas  
 the ((uint32_t)~0) notation, taken from the VirtualBox source, is  
 somewhat magical for a reader who does not perform an automatic  
 ((uint32_t)~0) == 0xU conversion in his head. Consequently,  
 the 0xU notation might a better choice, if it's not too much  
 bother for you to amend the patch.

I'll amend it with this change:

diff --git a/block/vdi.c b/block/vdi.c
index 25790c4..523a640 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -115,10 +115,10 @@ void uuid_unparse(const uuid_t uu, char *out);
 #define VDI_TEXT  QEMU VM Virtual Disk Image \n

 /* A never-allocated block; semantically arbitrary content. */
-#define VDI_UNALLOCATED ((uint32_t)~0)
+#define VDI_UNALLOCATED 0xU

 /* A discarded (no longer allocated) block; semantically zero-filled. */
-#define VDI_DISCARDED ((uint32_t)~1)
+#define VDI_DISCARDED   0xfffeU

 #define VDI_IS_ALLOCATED(X) ((X)  VDI_DISCARDED)

Re: [Qemu-devel] [PATCH 1/2] Documentation: Fix typo in manpage. NDB - NBD

Am 28.10.2011 11:13, schrieb Ronnie Sahlberg:
 Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
 ---
  qemu-options.hx |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/qemu-options.hx b/qemu-options.hx
 index bf2ebb3..c55080c 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -1761,10 +1761,10 @@ compiled and linked against libiscsi.
  QEMU supports NBD (Network Block Devices) both using TCP protocol as well
  as Unix Domain Sockets.
  
 -Syntax for specifying a NDB device using TCP
 +Syntax for specifying a NBD device using TCP
  ``nbd:server-ip:port[:exportname=export]''
  
 -Syntax for specifying a NDB device using Unix Domain Sockets
 +Syntax for specifying a NBD device using Unix Domain Sockets
  ``nbd:unix:domain-socket[:exportname=export]''

I already fixed this one locally in the original commit.

Kevin

Re: [Qemu-devel] [PATCH 2/2] Documentation: Add syntax for using sheepdog devices

Am 28.10.2011 11:13, schrieb Eric Sunshine:
 On Oct 28, 2011, at 5:13 AM, Ronnie Sahlberg wrote:
 Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
 ---
 qemu-options.hx |   27 +++
 1 files changed, 27 insertions(+), 0 deletions(-)

 diff --git a/qemu-options.hx b/qemu-options.hx
 index c55080c..38d0f57 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -1778,6 +1778,33 @@ Example for Unix Domain Sockets
 qemu --drive file=nbd:unix:/tmp/nbd-socket
 @end example

 +@item Sheepdog
 +Sheepdog is a distributed storage system for QEMU.
 +QEMU sopports using either local sheepdog devices or remote networked
 
 s/sopports/supports/

Thanks, applied to the block branch with fixed typo.

I took a look at qemu-doc.html now, and I think that this information is
very hard to find. We should probably consider moving it to it's own
section that is referenced in the table of contents. Alternatively, I
saw that there are already sections about NBD and Sheepdog under 3.6
Disk images. Maybe moving the new information there would be an option,
too.

Kevin

Re: [Qemu-devel] [PATCH] Add wildcard trace event support

On Thu, Oct 20, 2011 at 10:38 AM, Mark Wu wu...@linux.vnet.ibm.com wrote:
  The tracetool script automates tedious trace event code generation and also
 diff --git a/trace/simple.c b/trace/simple.c
 index b639dda..869e315 100644
 --- a/trace/simple.c
 +++ b/trace/simple.c
 @@ -324,14 +324,29 @@ void trace_print_events(FILE *stream, fprintf_function 
 stream_printf)
  bool trace_event_set_state(const char *name, bool state)
  {
     unsigned int i;
 -
 +    unsigned int len;
 +    bool wildcard = false;
 +    bool matched = false;
 +
 +    len = strlen(name);
 +    if (name[len-1] == '*') {

I think it's worth making a small change:

if (len  0  name[len - 1] == '*') {

Normally strlen(name)  0 but just in case we should prevent accessing name[-1].

Seems fine otherwise.  Perhaps we can figure out how to share code
between simple.c and stderr.c in the future.

Stefan

[Qemu-devel] [PATCH v3 3/3] Make cpu_single_env thread-local

From: Paolo Bonzini pbonz...@redhat.com

Make cpu_single_env thread-local. This fixes a regression
in handling of multi-threaded programs in linux-user mode
(bug 823902).

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
[Peter Maydell: rename tls_cpu_single_env to cpu_single_env]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Reviewed-by: Andreas Färber afaer...@suse.de
---
 cpu-all.h |4 +++-
 exec.c|2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 42a5fa0..5f47ab8 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -20,6 +20,7 @@
 #define CPU_ALL_H
 
 #include qemu-common.h
+#include qemu-tls.h
 #include cpu-common.h
 
 /* some important defines:
@@ -334,7 +335,8 @@ void cpu_dump_statistics(CPUState *env, FILE *f, 
fprintf_function cpu_fprintf,
 void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 extern CPUState *first_cpu;
-extern CPUState *cpu_single_env;
+DECLARE_TLS(CPUState *,cpu_single_env);
+#define cpu_single_env get_tls(cpu_single_env)
 
 /* Flags for use in ENV-INTERRUPT_PENDING.
 
diff --git a/exec.c b/exec.c
index 9dc4edb..18e26cb 100644
--- a/exec.c
+++ b/exec.c
@@ -120,7 +120,7 @@ static MemoryRegion *system_io;
 CPUState *first_cpu;
 /* current CPU in the current thread. It is only valid inside
cpu_exec() */
-CPUState *cpu_single_env;
+DEFINE_TLS(CPUState *,cpu_single_env);
 /* 0 = Do not count executed instructions.
1 = Precise instruction counting.
2 = Adaptive rate instruction counting.  */
-- 
1.7.1

[Qemu-devel] [PATCH v3 1/3] qemu-tls.h: Add abstraction layer for TLS variables

Add an abstraction layer for defining and using thread-local
variables. For the moment this is implemented only for Linux,
which means they can only be used in restricted circumstances.
The abstraction layer allows us to add POSIX and Win32 support
later.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 qemu-tls.h |   52 
 1 files changed, 52 insertions(+), 0 deletions(-)
 create mode 100644 qemu-tls.h

diff --git a/qemu-tls.h b/qemu-tls.h
new file mode 100644
index 000..5b70f10
--- /dev/null
+++ b/qemu-tls.h
@@ -0,0 +1,52 @@
+/*
+ * Abstraction layer for defining and using TLS variables
+ *
+ * Copyright (c) 2011 Red Hat, Inc
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Paolo Bonzini pbonz...@redhat.com
+ *  Peter Maydell peter.mayd...@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#ifndef QEMU_TLS_H
+#define QEMU_TLS_H
+
+/* Per-thread variables. Note that we only have implementations
+ * which are really thread-local on Linux; the dummy implementations
+ * define plain global variables.
+ *
+ * This means that for the moment use should be restricted to
+ * per-VCPU variables, which are OK because:
+ *  - the only -user mode supporting multiple VCPU threads is linux-user
+ *  - TCG system mode is single-threaded regarding VCPUs
+ *  - KVM system mode is multi-threaded but limited to Linux
+ *
+ * TODO: proper implementations via Win32 .tls sections and
+ * POSIX pthread_getspecific.
+ */
+#ifdef __linux__
+#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
+#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
+#define get_tls(x)   tls__##x
+#else
+/* Dummy implementations which define plain global variables */
+#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
+#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
+#define get_tls(x)   tls__##x
+#endif
+
+#endif
-- 
1.7.1

[Qemu-devel] [PATCH v3 0/3] TLS abstraction layer for thread-local cpu_single_env on Linux

These patches add enough of the TLS abstraction layer to allow us
to make cpu_single_env thread-local on Linux systems. This fixes
the regression described in bug 823902 for the 1.0 release; we
can add the Win32 and POSIX implementations later.

I haven't included Paolo's Prepare Windows port for thread-local
cpu_single_env patch -- it would be safe to do so but it isn't
necessary until we actually implement TLS for Win32.

Changes v1-v2:
 * fix Paolo's email address
 * split the darwin-user change out into a separate patch
 * drop the 'tls_' prefix from the cpu_single_env tls var name
Changes v2-v3:
 * minor rearrangement of copyright notice in comment
 * added a missing Signed-off-by
 * fixed the name of the multiple-include-guard #define

Paolo Bonzini (2):
  darwin-user/main.c: Drop unused cpu_single_env definition
  Make cpu_single_env thread-local

Peter Maydell (1):
  qemu-tls.h: Add abstraction layer for TLS variables

 cpu-all.h  |4 +++-
 darwin-user/main.c |2 --
 exec.c |2 +-
 qemu-tls.h |   52 
 4 files changed, 56 insertions(+), 4 deletions(-)
 create mode 100644 qemu-tls.h

[Qemu-devel] [PATCH v3 2/3] darwin-user/main.c: Drop unused cpu_single_env definition

From: Paolo Bonzini pbonz...@redhat.com

Drop the cpu_single_env definition as it is unused.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 darwin-user/main.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/darwin-user/main.c b/darwin-user/main.c
index 1a881a0..c0f14f8 100644
--- a/darwin-user/main.c
+++ b/darwin-user/main.c
@@ -729,8 +729,6 @@ static void usage(void)
 
 /* XXX: currently only used for async signals (see signal.c) */
 CPUState *global_env;
-/* used only if single thread */
-CPUState *cpu_single_env = NULL;
 
 /* used to free thread contexts */
 TaskState *first_task_state;
-- 
1.7.1

[Qemu-devel] [PATCH v6 2/2] hw/vexpress.c, hw/realview.c: Add PL041 to VExpress, Realview boards

Instantiate the PL041 audio on the Versatile Express and
Realview board models.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/realview.c |8 +++-
 hw/vexpress.c |7 ++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/realview.c b/hw/realview.c
index 14281b0..9a8e63c 100644
--- a/hw/realview.c
+++ b/hw/realview.c
@@ -125,7 +125,7 @@ static void realview_init(ram_addr_t ram_size,
 MemoryRegion *ram_hi = g_new(MemoryRegion, 1);
 MemoryRegion *ram_alias = g_new(MemoryRegion, 1);
 MemoryRegion *ram_hack = g_new(MemoryRegion, 1);
-DeviceState *dev, *sysctl, *gpio2;
+DeviceState *dev, *sysctl, *gpio2, *pl041;
 SysBusDevice *busdev;
 qemu_irq *irqp;
 qemu_irq pic[64];
@@ -232,6 +232,12 @@ static void realview_init(ram_addr_t ram_size,
 pic[n] = qdev_get_gpio_in(dev, n);
 }
 
+pl041 = qdev_create(NULL, pl041);
+qdev_prop_set_uint32(pl041, nc_fifo_depth, 512);
+qdev_init_nofail(pl041);
+sysbus_mmio_map(sysbus_from_qdev(pl041), 0, 0x10004000);
+sysbus_connect_irq(sysbus_from_qdev(pl041), 0, pic[19]);
+
 sysbus_create_simple(pl050_keyboard, 0x10006000, pic[20]);
 sysbus_create_simple(pl050_mouse, 0x10007000, pic[21]);
 
diff --git a/hw/vexpress.c b/hw/vexpress.c
index c9766dd..0940a26 100644
--- a/hw/vexpress.c
+++ b/hw/vexpress.c
@@ -41,7 +41,7 @@ static void vexpress_a9_init(ram_addr_t ram_size,
 {
 CPUState *env = NULL;
 ram_addr_t ram_offset, vram_offset, sram_offset;
-DeviceState *dev, *sysctl;
+DeviceState *dev, *sysctl, *pl041;
 SysBusDevice *busdev;
 qemu_irq *irqp;
 qemu_irq pic[64];
@@ -118,6 +118,11 @@ static void vexpress_a9_init(ram_addr_t ram_size,
 /* 0x10001000 SP810 system control */
 /* 0x10002000 serial bus PCI */
 /* 0x10004000 PL041 audio */
+pl041 = qdev_create(NULL, pl041);
+qdev_prop_set_uint32(pl041, nc_fifo_depth, 512);
+qdev_init_nofail(pl041);
+sysbus_mmio_map(sysbus_from_qdev(pl041), 0, 0x10004000);
+sysbus_connect_irq(sysbus_from_qdev(pl041), 0, pic[11]);
 
 dev = sysbus_create_varargs(pl181, 0x10005000, pic[9], pic[10], NULL);
 /* Wire up MMC card detect and read-only signals */
-- 
1.7.1

[Qemu-devel] [PATCH v6 0/2] Add AACI audio playback support to ARM devboards

This is Mathieu's v5 patch, except that I have corrected the one-liner
minor error spotted by Andrzej Zaborowski, tested, rebased and thrown
in my patch adding the PL041 to the other devboards.

I hope this is OK to apply now; I'm hoping to get it in before the
freeze for 1.0...

Thanks to Mathieu for all his hard work on this patch.

-- PMM


Mathieu Sonet (1):
  Add AACI audio playback support to the ARM Versatile/PB platform

Peter Maydell (1):
  hw/vexpress.c, hw/realview.c: Add PL041 to VExpress, Realview boards

 Makefile.target  |1 +
 hw/lm4549.c  |  336 
 hw/lm4549.h  |   43 
 hw/pl041.c   |  636 ++
 hw/pl041.h   |  135 
 hw/pl041.hx  |   81 +++
 hw/realview.c|8 +-
 hw/versatilepb.c |8 +
 hw/vexpress.c|7 +-
 9 files changed, 1253 insertions(+), 2 deletions(-)
 create mode 100644 hw/lm4549.c
 create mode 100644 hw/lm4549.h
 create mode 100644 hw/pl041.c
 create mode 100644 hw/pl041.h
 create mode 100644 hw/pl041.hx

[Qemu-devel] [PATCH v6 1/2] Add AACI audio playback support to the ARM Versatile/PB platform

From: Mathieu Sonet cont...@elasticsheep.com

This driver emulates the ARM AACI interface (PL041) connected to a LM4549 codec.
It enables audio playback for the Versatile/PB platform.

Limitations:
- Supports only a playback on one channel (Versatile/Vexpress)
- Supports only one TX FIFO in compact-mode or non-compact mode.
- Supports playback of 12, 16, 18 and 20 bits samples.
- Record is not supported.
- The PL041 is hardwired to a LM4549 codec.

Versatile/PB test build:
linux-2.6.38.5
buildroot-2010.11
alsa-lib-1.0.22
alsa-utils-1.0.22
mpg123-0.66

Qemu host: Ubuntu 10.04 in Vmware/OS X

Playback tested successfully with speaker-test/aplay/mpg123.

Signed-off-by: Mathieu Sonet cont...@elasticsheep.com
[Peter Maydell: fixed typo in code clearing SL1RXBUSY/SL2RXBUSY
 bits, as spotted by Andrzej Zaborowski]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 Makefile.target  |1 +
 hw/lm4549.c  |  336 
 hw/lm4549.h  |   43 
 hw/pl041.c   |  636 ++
 hw/pl041.h   |  135 
 hw/pl041.hx  |   81 +++
 hw/versatilepb.c |8 +
 7 files changed, 1240 insertions(+), 0 deletions(-)
 create mode 100644 hw/lm4549.c
 create mode 100644 hw/lm4549.h
 create mode 100644 hw/pl041.c
 create mode 100644 hw/pl041.h
 create mode 100644 hw/pl041.hx

diff --git a/Makefile.target b/Makefile.target
index 1e90df7..1b1156d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -361,6 +361,7 @@ obj-arm-y += syborg_virtio.o
 obj-arm-y += vexpress.o
 obj-arm-y += strongarm.o
 obj-arm-y += collie.o
+obj-arm-y += pl041.o lm4549.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/lm4549.c b/hw/lm4549.c
new file mode 100644
index 000..4d5b831
--- /dev/null
+++ b/hw/lm4549.c
@@ -0,0 +1,336 @@
+/*
+ * LM4549 Audio Codec Interface
+ *
+ * Copyright (c) 2011
+ * Written by Mathieu Sonet - www.elasticsheep.com
+ *
+ * This code is licenced under the GPL.
+ *
+ * *
+ *
+ * This driver emulates the LM4549 codec.
+ *
+ * It supports only one playback voice and no record voice.
+ */
+
+#include hw.h
+#include audio/audio.h
+#include lm4549.h
+
+#if 0
+#define LM4549_DEBUG  1
+#endif
+
+#if 0
+#define LM4549_DUMP_DAC_INPUT 1
+#endif
+
+#ifdef LM4549_DEBUG
+#define DPRINTF(fmt, ...) \
+do { printf(lm4549:  fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#if defined(LM4549_DUMP_DAC_INPUT)
+#include stdio.h
+static FILE *fp_dac_input;
+#endif
+
+/* LM4549 register list */
+enum {
+LM4549_Reset= 0x00,
+LM4549_Master_Volume= 0x02,
+LM4549_Line_Out_Volume  = 0x04,
+LM4549_Master_Volume_Mono   = 0x06,
+LM4549_PC_Beep_Volume   = 0x0A,
+LM4549_Phone_Volume = 0x0C,
+LM4549_Mic_Volume   = 0x0E,
+LM4549_Line_In_Volume   = 0x10,
+LM4549_CD_Volume= 0x12,
+LM4549_Video_Volume = 0x14,
+LM4549_Aux_Volume   = 0x16,
+LM4549_PCM_Out_Volume   = 0x18,
+LM4549_Record_Select= 0x1A,
+LM4549_Record_Gain  = 0x1C,
+LM4549_General_Purpose  = 0x20,
+LM4549_3D_Control   = 0x22,
+LM4549_Powerdown_Ctrl_Stat  = 0x26,
+LM4549_Ext_Audio_ID = 0x28,
+LM4549_Ext_Audio_Stat_Ctrl  = 0x2A,
+LM4549_PCM_Front_DAC_Rate   = 0x2C,
+LM4549_PCM_ADC_Rate = 0x32,
+LM4549_Vendor_ID1   = 0x7C,
+LM4549_Vendor_ID2   = 0x7E
+};
+
+static void lm4549_reset(lm4549_state *s)
+{
+uint16_t *regfile = s-regfile;
+
+regfile[LM4549_Reset]   = 0x0d50;
+regfile[LM4549_Master_Volume]   = 0x8008;
+regfile[LM4549_Line_Out_Volume] = 0x8000;
+regfile[LM4549_Master_Volume_Mono]  = 0x8000;
+regfile[LM4549_PC_Beep_Volume]  = 0x;
+regfile[LM4549_Phone_Volume]= 0x8008;
+regfile[LM4549_Mic_Volume]  = 0x8008;
+regfile[LM4549_Line_In_Volume]  = 0x8808;
+regfile[LM4549_CD_Volume]   = 0x8808;
+regfile[LM4549_Video_Volume]= 0x8808;
+regfile[LM4549_Aux_Volume]  = 0x8808;
+regfile[LM4549_PCM_Out_Volume]  = 0x8808;
+regfile[LM4549_Record_Select]   = 0x;
+regfile[LM4549_Record_Gain] = 0x8000;
+regfile[LM4549_General_Purpose] = 0x;
+regfile[LM4549_3D_Control]  = 0x0101;
+regfile[LM4549_Powerdown_Ctrl_Stat] = 0x000f;
+regfile[LM4549_Ext_Audio_ID]= 0x0001;
+regfile[LM4549_Ext_Audio_Stat_Ctrl] = 0x;
+regfile[LM4549_PCM_Front_DAC_Rate]  = 0xbb80;
+regfile[LM4549_PCM_ADC_Rate]= 0xbb80;
+regfile[LM4549_Vendor_ID1]  = 0x4e53;
+regfile[LM4549_Vendor_ID2]

[Qemu-devel] [Bug 882997] [NEW] 64-bit linux guests fail to start on oneiric running 3.0 kernel

2011-10-28 Thread Klaus Rennecke

Public bug reported:

Host: Ubuntu 11.10 kernel vmlinuz-3.0.0-12-generic or vmlinuz-3.0.0-12-server 
on AMD Athlon(tm) II P360 Dual-Core
Guests: SLES 10 or 11, all 64 bit

32 bit windows guest starts fine. All 64 bit linux guests loop during
boot, when GRUB is starting. VMs are managed using libvirt
0.9.2-4ubuntu15 and virt-manager 0.9.0.

Log file shows:
 KVM internal error. Suberror: 1
 emulation failure

repeated for each GRUB attempt.

Starting the same host with vmlinuz-2.6.38-11-generic makes all VMs run
OK.

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: amd64 oneiric

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/882997

Title:
  64-bit linux guests fail to start on oneiric running 3.0 kernel

Status in QEMU:
  New

Bug description:
  Host: Ubuntu 11.10 kernel vmlinuz-3.0.0-12-generic or vmlinuz-3.0.0-12-server 
on AMD Athlon(tm) II P360 Dual-Core
  Guests: SLES 10 or 11, all 64 bit

  32 bit windows guest starts fine. All 64 bit linux guests loop during
  boot, when GRUB is starting. VMs are managed using libvirt
  0.9.2-4ubuntu15 and virt-manager 0.9.0.

  Log file shows:
   KVM internal error. Suberror: 1
   emulation failure

  repeated for each GRUB attempt.

  Starting the same host with vmlinuz-2.6.38-11-generic makes all VMs
  run OK.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/882997/+subscriptions

[Qemu-devel] [PATCH v9 0/4] The intro of QEMU block I/O throttling

The main goal of the patch is to effectively cap the disk I/O speed or counts 
of one single VM.It is only one draft, so it unavoidably has some drawbacks, if 
you catch them, please let me know.

The patch will mainly introduce one block I/O throttling algorithm, one timer 
and one block queue for each I/O limits enabled drive.

When a block request is coming in, the throttling algorithm will check if its 
I/O rate or counts exceed the limits; if yes, then it will enqueue to the block 
queue; The timer will handle the I/O requests in it.

Some available features follow as below:
(1) global bps limit.
   -drive bps=xxxin bytes/s
(2) only read bps limit
   -drive bps_rd=xxx in bytes/s
(3) only write bps limit
   -drive bps_wr=xxx in bytes/s
(4) global iops limit
   -drive iops=xxx   in ios/s
(5) only read iops limit
   -drive iops_rd=xxxin ios/s
(6) only write iops limit
   -drive iops_wr=xxxin ios/s
(7) the combination of some limits.
   -drive bps=xxx,iops=xxx

Known Limitations:
(1) #1 can not coexist with #2, #3
(2) #4 can not coexist with #5, #6

Changes since code V8:
1.) made a lot of changes based on kevin's comments.
2.) slice_time is dynamically adjusted.
3.) rebase the latest qemu upstream.

 v8: fix the build per patch based on stefan's comments.

 v7: Mainly simply the block queue.
 Adjust codes based on stefan's comments.

 v6: Mainly fix the aio callback issue for block queue.
 Adjust codes based on Ram Pai's comments.

 v5: add qmp/hmp support.
 Adjust the codes based on stefan's comments
 qmp/hmp: add block_set_io_throttle

 v4: fix memory leaking based on ryan's feedback.

 v3: Added the code for extending slice time, and modified the method to 
compute wait time for the timer.

 v2: The codes V2 for QEMU disk I/O limits.
 Modified the codes mainly based on stefan's comments.

 v1: Submit the codes for QEMU disk I/O limits.
 Only a code draft.


Zhi Yong Wu (4):
  block: add the block queue support
  block: add the command line support
  block: add the block throttling algorithm
  qmp/hmp: add block_set_io_throttle

 Makefile.objs |2 +-
 block.c   |  543 ++---
 block.h   |   24 +++
 block/blk-queue.c |  201 
 block/blk-queue.h |   63 ++
 block_int.h   |   45 +
 blockdev.c|   83 
 blockdev.h|2 +
 hmp-commands.hx   |   15 ++
 qemu-config.c |   24 +++
 qemu-options.hx   |1 +
 qerror.c  |4 +
 qerror.h  |3 +
 qmp-commands.hx   |   53 +-
 14 files changed, 1038 insertions(+), 25 deletions(-)
 create mode 100644 block/blk-queue.c
 create mode 100644 block/blk-queue.h

-- 
1.7.6

[Qemu-devel] [PATCH v9 1/4] block: add the block queue support

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com
---
 Makefile.objs |2 +-
 block.c   |   71 +--
 block.h   |   20 +
 block/blk-queue.c |  201 +
 block/blk-queue.h |   63 +
 block_int.h   |   23 ++
 6 files changed, 371 insertions(+), 9 deletions(-)
 create mode 100644 block/blk-queue.c
 create mode 100644 block/blk-queue.h

diff --git a/Makefile.objs b/Makefile.objs
index 01587c8..98891b3 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -33,7 +33,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o 
dmg.o bochs.o vpc.o vv
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-nested-y += qed-check.o
-block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
+block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o 
blk-queue.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
diff --git a/block.c b/block.c
index 70aab63..40ab277 100644
--- a/block.c
+++ b/block.c
@@ -1026,6 +1026,7 @@ typedef struct RwCo {
 QEMUIOVector *qiov;
 bool is_write;
 int ret;
+bool limit_skip;
 } RwCo;
 
 static void coroutine_fn bdrv_rw_co_entry(void *opaque)
@@ -1060,6 +1061,7 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t 
sector_num, uint8_t *buf,
 .qiov = qiov,
 .is_write = is_write,
 .ret = NOT_DONE,
+.limit_skip = false,
 };
 
 qemu_iovec_init_external(qiov, iov, 1);
@@ -1242,6 +1244,64 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t 
offset,
 return 0;
 }
 
+typedef struct BlockDriverAIOCBCoroutine {
+BlockDriverAIOCB common;
+BlockRequest req;
+bool is_write;
+QEMUBH *bh;
+bool cb_skip;
+bool limit_skip;
+void *blkq_acb;
+} BlockDriverAIOCBCoroutine;
+
+/* block I/O throttling */
+typedef struct CoroutineCB {
+BlockDriverState *bs;
+int64_t sector_num;
+int nb_sectors;
+QEMUIOVector *qiov;
+} CoroutineCB;
+
+static void bdrv_io_limits_skip_set(void *opaque,
+BlockAPIType co_type,
+bool cb_skip,
+bool limit_skip) {
+RwCo *rwco;
+BlockDriverAIOCBCoroutine *aioco;
+
+if (co_type == BDRV_API_SYNC) {
+rwco = opaque;
+rwco-limit_skip = limit_skip;
+} else if (co_type == BDRV_API_ASYNC) {
+aioco = opaque;
+aioco-cb_skip = cb_skip;
+aioco-limit_skip = limit_skip;
+} else {
+abort();
+}
+}
+
+void bdrv_io_limits_issue_request(void *opaque,
+  BlockAPIType co_type) {
+BlockQueueAIOCB *blkq_acb = opaque;
+
+if (blkq_acb-co_type == BDRV_API_CO) {
+qemu_coroutine_enter(blkq_acb-co, blkq_acb-cocb);
+} else {
+CoroutineEntry *entry = NULL;
+if (co_type == BDRV_API_SYNC) {
+entry = bdrv_rw_co_entry;
+} else if (co_type == BDRV_API_ASYNC) {
+entry = bdrv_co_do_rw;
+}
+
+bdrv_io_limits_skip_set(blkq_acb-cocb,
+co_type, false, true);
+Coroutine *co = qemu_coroutine_create(entry);
+qemu_coroutine_enter(co, blkq_acb-cocb);
+}
+}
+
 /*
  * Handle a read request in coroutine context
  */
@@ -2650,14 +2710,6 @@ static BlockDriverAIOCB 
*bdrv_aio_writev_em(BlockDriverState *bs,
 return bdrv_aio_rw_vector(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
 }
 
-
-typedef struct BlockDriverAIOCBCoroutine {
-BlockDriverAIOCB common;
-BlockRequest req;
-bool is_write;
-QEMUBH* bh;
-} BlockDriverAIOCBCoroutine;
-
 static void bdrv_aio_co_cancel_em(BlockDriverAIOCB *blockacb)
 {
 qemu_aio_flush();
@@ -2711,6 +2763,9 @@ static BlockDriverAIOCB 
*bdrv_co_aio_rw_vector(BlockDriverState *bs,
 acb-req.nb_sectors = nb_sectors;
 acb-req.qiov = qiov;
 acb-is_write = is_write;
+acb-blkq_acb = NULL;
+acb-cb_skip  = false;
+acb-limit_skip = false;
 
 co = qemu_coroutine_create(bdrv_co_do_rw);
 qemu_coroutine_enter(co, acb);
diff --git a/block.h b/block.h
index 5a042c9..b7fbc40 100644
--- a/block.h
+++ b/block.h
@@ -82,6 +82,17 @@ typedef enum {
 BDRV_IOS_MAX
 } BlockIOStatus;
 
+/* disk I/O throttling */
+typedef enum {
+BDRV_BLKQ_ENQ_FIRST, BDRV_BLKQ_ENQ_AGAIN, BDRV_BLKQ_DEQ_PASS,
+BDRV_BLKQ_PASS, BDRV_IO_MAX
+} BlockQueueRetType;
+
+typedef enum {
+BDRV_API_SYNC, BDRV_API_ASYNC, BDRV_API_CO,
+BDRV_API_MAX
+} BlockAPIType;
+
 void bdrv_iostatus_enable(BlockDriverState *bs);
 void bdrv_iostatus_reset(BlockDriverState *bs);
 void bdrv_iostatus_disable(BlockDriverState *bs);
@@ -94,6 +105,8 @@ void bdrv_info(Monitor *mon, QObject **ret_data);
 void bdrv_stats_print(Monitor *mon,

Re: [Qemu-devel] [PATCH] net: Only flush queue or call sent callback on successful delivery

2011-10-28 Thread Mark Wu


On 10/28/2011 05:13 PM, Stefan Hajnoczi wrote:

On Thu, Oct 27, 2011 at 10:02 AM, Mark Wuwu...@linux.vnet.ibm.com  wrote:

Now queue flushing and sent callback could be invoked even on delivery
failure. We add a checking of receiver's return value to avoid this
case.

Signed-off-by: Mark Wuwu...@linux.vnet.ibm.com
---
  net/queue.c |   12 +++-
  1 files changed, 7 insertions(+), 5 deletions(-)

What problem are you trying to fix?


@@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue)
 break;
 }

-if (packet-sent_cb) {
+if (ret  0  packet-sent_cb) {
 packet-sent_cb(packet-sender, ret);

This looks wrong.  ret is passed as an argument to the callback.  You
are skipping the callback on error and not giving it a chance to see
negative ret.

Looking at virtio_net_tx_complete() this causes a virtqueue element leak.

Thanks for your review!
Yes, that's a problem. I thought only tap call queue send function with 
a callback (tap_send_completed) and confirmed that no memory leak in the 
case of tap. I agree that it will cause a
descriptor leak, but actually virtio_net_tx_complete doesn't check 
'ret'. It just pushes the elem to the virtqueue with 'async_tx.len' and 
flushes the tx queue. I think it assumes the callback is only called on 
success. Otherwise, it doesn't make sense for me. My point is that flush 
shouldn't happen on a deliver failure. Probably it will cause more 
failures. tap_send_completed assumes it's called on successfully deliver 
a packet too because it re-enables polling of tap fd.  That's why I add 
a checking of 'ret'.


I am not sure if the original code really needs a fix because it will 
not cause any visible problems.

Stefan

[Qemu-devel] [PATCH v9 3/4] block: add the block throttling algorithm

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com
---
 block.c |  360 --
 1 files changed, 348 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 8f92950..2d85b64 100644
--- a/block.c
+++ b/block.c
@@ -63,9 +63,11 @@ static int coroutine_fn bdrv_co_writev_em(BlockDriverState 
*bs,
  int64_t sector_num, int nb_sectors,
  QEMUIOVector *iov);
 static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
-int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
+void *opaque, BlockAPIType co_type);
 static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
-int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
+void *opaque, BlockAPIType co_type);
 static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
@@ -75,6 +77,13 @@ static BlockDriverAIOCB 
*bdrv_co_aio_rw_vector(BlockDriverState *bs,
bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 
+static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
+bool is_write, double elapsed_time, uint64_t *wait);
+static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
+double elapsed_time, uint64_t *wait);
+static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
+bool is_write, int64_t *wait);
+
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 QTAILQ_HEAD_INITIALIZER(bdrv_states);
 
@@ -161,6 +170,26 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs)
  || io_limits-iops[BLOCK_IO_LIMIT_TOTAL];
 }
 
+static BlockQueueAIOCB *bdrv_io_limits_perform(BlockDriverState *bs,
+   BlockRequestHandler *handler, int64_t sector_num,
+   QEMUIOVector *qiov, int nb_sectors)
+{
+BlockQueueAIOCB *ret = NULL;
+int64_t wait_time = -1;
+
+if (bdrv_exceed_io_limits(bs, nb_sectors, false, wait_time)) {
+ret = qemu_block_queue_enqueue(bs-block_queue, bs, handler,
+   sector_num, qiov,
+   nb_sectors, NULL, NULL);
+if (wait_time != -1) {
+qemu_mod_timer(bs-block_timer,
+   wait_time + qemu_get_clock_ns(vm_clock));
+}
+}
+
+return ret;
+}
+
 /* check if the path starts with protocol: */
 static int path_has_protocol(const char *path)
 {
@@ -1112,10 +1141,12 @@ static void coroutine_fn bdrv_rw_co_entry(void *opaque)
 
 if (!rwco-is_write) {
 rwco-ret = bdrv_co_do_readv(rwco-bs, rwco-sector_num,
- rwco-nb_sectors, rwco-qiov);
+ rwco-nb_sectors, rwco-qiov,
+ rwco, BDRV_API_SYNC);
 } else {
 rwco-ret = bdrv_co_do_writev(rwco-bs, rwco-sector_num,
-  rwco-nb_sectors, rwco-qiov);
+  rwco-nb_sectors, rwco-qiov,
+  rwco, BDRV_API_SYNC);
 }
 }
 
@@ -1331,6 +1362,17 @@ typedef struct BlockDriverAIOCBCoroutine {
 void *blkq_acb;
 } BlockDriverAIOCBCoroutine;
 
+static void bdrv_co_rw_bh(void *opaque)
+{
+BlockDriverAIOCBCoroutine *acb = opaque;
+
+acb-common.cb(acb-common.opaque, acb-req.error);
+
+acb-blkq_acb = NULL;
+qemu_bh_delete(acb-bh);
+qemu_aio_release(acb);
+}
+
 /* block I/O throttling */
 typedef struct CoroutineCB {
 BlockDriverState *bs;
@@ -1339,6 +1381,25 @@ typedef struct CoroutineCB {
 QEMUIOVector *qiov;
 } CoroutineCB;
 
+static bool bdrv_io_limits_skip_query(void *opaque,
+  BlockAPIType co_type) {
+bool limit_skip;
+RwCo *rwco;
+BlockDriverAIOCBCoroutine *aioco;
+
+if (co_type == BDRV_API_SYNC) {
+rwco = opaque;
+limit_skip = rwco-limit_skip;
+} else if (co_type == BDRV_API_ASYNC) {
+aioco = opaque;
+limit_skip = aioco-limit_skip;
+} else {
+abort();
+}
+
+return limit_skip;
+}
+
 static void bdrv_io_limits_skip_set(void *opaque,
 BlockAPIType co_type,
 bool cb_skip,
@@ -1379,13 +1440,52 @@ void bdrv_io_limits_issue_request(void *opaque,
 }
 }
 
+static int bdrv_io_limits_intercept(BlockDriverState *bs,
+BlockRequestHandler *handler, int64_t sector_num,
+QEMUIOVector *qiov, int nb_sectors, void *opaque, BlockAPIType co_type)
+{
+BlockQueueAIOCB *blkq_acb = NULL;
+BlockQueueRetType ret = BDRV_BLKQ_PASS;
+
+blkq_acb =

[Qemu-devel] [PATCH v9 4/4] qmp/hmp: add block_set_io_throttle

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com
---
 block.c |   27 ---
 blockdev.c  |   51 +++
 blockdev.h  |2 ++
 hmp-commands.hx |   15 +++
 qerror.c|4 
 qerror.h|3 +++
 qmp-commands.hx |   53 -
 7 files changed, 151 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 2d85b64..395bbd7 100644
--- a/block.c
+++ b/block.c
@@ -2164,6 +2164,16 @@ static void bdrv_print_dict(QObject *obj, void *opaque)
 qdict_get_bool(qdict, ro),
 qdict_get_str(qdict, drv),
 qdict_get_bool(qdict, encrypted));
+
+monitor_printf(mon,  bps=% PRId64  bps_rd=% PRId64
+ bps_wr=% PRId64  iops=% PRId64
+ iops_rd=% PRId64  iops_wr=% PRId64,
+qdict_get_int(qdict, bps),
+qdict_get_int(qdict, bps_rd),
+qdict_get_int(qdict, bps_wr),
+qdict_get_int(qdict, iops),
+qdict_get_int(qdict, iops_rd),
+qdict_get_int(qdict, iops_wr));
 } else {
 monitor_printf(mon,  [not inserted]);
 }
@@ -2214,10 +2224,22 @@ void bdrv_info(Monitor *mon, QObject **ret_data)
 QObject *obj;
 
 obj = qobject_from_jsonf({ 'file': %s, 'ro': %i, 'drv': %s, 
- 'encrypted': %i },
+ 'encrypted': %i, 
+ 'bps': % PRId64 ,
+ 'bps_rd': % PRId64 ,
+ 'bps_wr': % PRId64 ,
+ 'iops': % PRId64 ,
+ 'iops_rd': % PRId64 ,
+ 'iops_wr': % PRId64 },
  bs-filename, bs-read_only,
  bs-drv-format_name,
- bdrv_is_encrypted(bs));
+ bdrv_is_encrypted(bs),
+ bs-io_limits.bps[BLOCK_IO_LIMIT_TOTAL],
+ bs-io_limits.bps[BLOCK_IO_LIMIT_READ],
+ bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE],
+ bs-io_limits.iops[BLOCK_IO_LIMIT_TOTAL],
+ bs-io_limits.iops[BLOCK_IO_LIMIT_READ],
+ bs-io_limits.iops[BLOCK_IO_LIMIT_WRITE]);
 if (bs-backing_file[0] != '\0') {
 QDict *qdict = qobject_to_qdict(obj);
 qdict_put(qdict, backing_file,
@@ -2405,7 +2427,6 @@ void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent 
event)
 }
 
 return drv-bdrv_debug_event(bs, event);
-
 }
 
 /**/
diff --git a/blockdev.c b/blockdev.c
index faf8c56..9eed973 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -745,6 +745,57 @@ int do_change_block(Monitor *mon, const char *device,
 return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
 }
 
+/* throttling disk I/O limits */
+int do_block_set_io_throttle(Monitor *mon,
+   const QDict *qdict, QObject **ret_data)
+{
+const char *devname = qdict_get_str(qdict, device);
+int64_t bps= qdict_get_try_int(qdict, bps, -1);
+int64_t bps_rd = qdict_get_try_int(qdict, bps_rd, -1);
+int64_t bps_wr = qdict_get_try_int(qdict, bps_wr, -1);
+int64_t iops   = qdict_get_try_int(qdict, iops, -1);
+int64_t iops_rd= qdict_get_try_int(qdict, iops_rd, -1);
+int64_t iops_wr= qdict_get_try_int(qdict, iops_wr, -1);
+BlockDriverState *bs;
+
+bs = bdrv_find(devname);
+if (!bs) {
+qerror_report(QERR_DEVICE_NOT_FOUND, devname);
+return -1;
+}
+
+if ((bps == -1) || (bps_rd == -1) || (bps_wr == -1)
+|| (iops == -1) || (iops_rd == -1) || (iops_wr == -1)) {
+qerror_report(QERR_MISSING_PARAMETER,
+  bps/bps_rd/bps_wr/iops/iops_rd/iops_wr);
+return -1;
+}
+
+if ((bps != 0  (bps_rd != 0  || bps_wr != 0))
+|| (iops != 0  (iops_rd != 0 || iops_wr != 0))) {
+qerror_report(QERR_INVALID_PARAMETER_COMBINATION);
+return -1;
+}
+
+bs-io_limits.bps[BLOCK_IO_LIMIT_TOTAL] = bps;
+bs-io_limits.bps[BLOCK_IO_LIMIT_READ]  = bps_rd;
+bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE] = bps_wr;
+bs-io_limits.iops[BLOCK_IO_LIMIT_TOTAL] = iops;
+bs-io_limits.iops[BLOCK_IO_LIMIT_READ]  = iops_rd;
+bs-io_limits.iops[BLOCK_IO_LIMIT_WRITE] = iops_wr;
+bs-slice_time = BLOCK_IO_SLICE_TIME;
+
+if (!bs-io_limits_enabled  bdrv_io_limits_enabled(bs)) {
+

[Qemu-devel] [PATCH v9 2/4] block: add the command line support

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com
---
 block.c |   85 +++
 block.h |4 ++
 block_int.h |   22 ++
 blockdev.c  |   32 
 qemu-config.c   |   24 +++
 qemu-options.hx |1 +
 6 files changed, 168 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 40ab277..8f92950 100644
--- a/block.c
+++ b/block.c
@@ -30,6 +30,9 @@
 #include qemu-objects.h
 #include qemu-coroutine.h
 
+#include qemu-timer.h
+#include block/blk-queue.h
+
 #ifdef CONFIG_BSD
 #include sys/types.h
 #include sys/stat.h
@@ -104,6 +107,60 @@ int is_windows_drive(const char *filename)
 }
 #endif
 
+/* throttling disk I/O limits */
+void bdrv_io_limits_disable(BlockDriverState *bs)
+{
+bs-io_limits_enabled = false;
+
+if (bs-block_queue) {
+qemu_block_queue_submit(bs-block_queue, qemu_block_queue_cb);
+qemu_del_block_queue(bs-block_queue);
+
+bs-block_queue = NULL;
+}
+
+if (bs-block_timer) {
+qemu_del_timer(bs-block_timer);
+qemu_free_timer(bs-block_timer);
+bs-block_timer = NULL;
+}
+
+bs-slice_start = 0;
+bs-slice_end   = 0;
+bs-slice_time  = 0;
+memset(bs-io_disps, 0, sizeof(bs-io_disps));
+}
+
+static void bdrv_block_timer(void *opaque)
+{
+BlockDriverState *bs = opaque;
+BlockQueue *queue= bs-block_queue;
+
+qemu_block_queue_submit(queue, qemu_block_queue_cb);
+}
+
+void bdrv_io_limits_enable(BlockDriverState *bs)
+{
+bs-io_limits_enabled = true;
+bs-block_queue   = qemu_new_block_queue();
+bs-block_timer   = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
+bs-slice_time= 5 * BLOCK_IO_SLICE_TIME;
+bs-slice_start   = qemu_get_clock_ns(vm_clock);
+bs-slice_end = bs-slice_start + bs-slice_time;
+memset(bs-io_disps, 0, sizeof(bs-io_disps));
+}
+
+bool bdrv_io_limits_enabled(BlockDriverState *bs)
+{
+BlockIOLimit *io_limits = bs-io_limits;
+return io_limits-bps[BLOCK_IO_LIMIT_READ]
+ || io_limits-bps[BLOCK_IO_LIMIT_WRITE]
+ || io_limits-bps[BLOCK_IO_LIMIT_TOTAL]
+ || io_limits-iops[BLOCK_IO_LIMIT_READ]
+ || io_limits-iops[BLOCK_IO_LIMIT_WRITE]
+ || io_limits-iops[BLOCK_IO_LIMIT_TOTAL];
+}
+
 /* check if the path starts with protocol: */
 static int path_has_protocol(const char *path)
 {
@@ -684,6 +741,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
int flags,
 bdrv_dev_change_media_cb(bs, true);
 }
 
+/* throttling disk I/O limits */
+if (bs-io_limits_enabled) {
+bdrv_io_limits_enable(bs);
+}
+
 return 0;
 
 unlink_and_fail:
@@ -719,6 +781,21 @@ void bdrv_close(BlockDriverState *bs)
 
 bdrv_dev_change_media_cb(bs, false);
 }
+
+/*throttling disk I/O limits*/
+bdrv_io_limits_disable(bs);
+
+/* throttling disk I/O limits */
+if (bs-block_queue) {
+qemu_del_block_queue(bs-block_queue);
+bs-block_queue = NULL;
+}
+
+if (bs-block_timer) {
+qemu_del_timer(bs-block_timer);
+qemu_free_timer(bs-block_timer);
+bs-block_timer = NULL;
+}
 }
 
 void bdrv_close_all(void)
@@ -1576,6 +1653,14 @@ void bdrv_get_geometry_hint(BlockDriverState *bs,
 *psecs = bs-secs;
 }
 
+/* throttling disk io limits */
+void bdrv_set_io_limits(BlockDriverState *bs,
+BlockIOLimit *io_limits)
+{
+bs-io_limits = *io_limits;
+bs-io_limits_enabled = bdrv_io_limits_enabled(bs);
+}
+
 /* Recognize floppy formats */
 typedef struct FDFormat {
 FDriveType drive;
diff --git a/block.h b/block.h
index b7fbc40..1cd6b8b 100644
--- a/block.h
+++ b/block.h
@@ -105,6 +105,10 @@ void bdrv_info(Monitor *mon, QObject **ret_data);
 void bdrv_stats_print(Monitor *mon, const QObject *data);
 void bdrv_info_stats(Monitor *mon, QObject **ret_data);
 
+/* disk I/O throttling */
+void bdrv_io_limits_enable(BlockDriverState *bs);
+void bdrv_io_limits_disable(BlockDriverState *bs);
+bool bdrv_io_limits_enabled(BlockDriverState *bs);
 void bdrv_io_limits_issue_request(void *opaque, BlockAPIType co_type);
 
 void bdrv_init(void);
diff --git a/block_int.h b/block_int.h
index 86a355d..3a4379c 100644
--- a/block_int.h
+++ b/block_int.h
@@ -34,6 +34,13 @@
 #define BLOCK_FLAG_ENCRYPT 1
 #define BLOCK_FLAG_COMPAT6 4
 
+#define BLOCK_IO_LIMIT_READ 0
+#define BLOCK_IO_LIMIT_WRITE1
+#define BLOCK_IO_LIMIT_TOTAL2
+
+#define BLOCK_IO_SLICE_TIME 1
+#define NANOSECONDS_PER_SECOND  10.0
+
 #define BLOCK_OPT_SIZE  size
 #define BLOCK_OPT_ENCRYPT   encryption
 #define BLOCK_OPT_COMPAT6   compat6
@@ -55,6 +62,11 @@ typedef struct BlockIOLimit {
 uint64_t iops[3];
 } BlockIOLimit;
 
+typedef struct BlockIODisp {
+uint64_t bytes[2];
+uint64_t ios[2];
+} BlockIODisp;
+
 struct BlockDriver {
 const char *format_name;
 int instance_size;
@@ -189,7

Re: [Qemu-devel] [PATCH] net: Only flush queue or call sent callback on successful delivery

On Fri, Oct 28, 2011 at 11:02 AM, Mark Wu wu...@linux.vnet.ibm.com wrote:
 On 10/28/2011 05:13 PM, Stefan Hajnoczi wrote:

 On Thu, Oct 27, 2011 at 10:02 AM, Mark Wuwu...@linux.vnet.ibm.com
  wrote:

 Now queue flushing and sent callback could be invoked even on delivery
 failure. We add a checking of receiver's return value to avoid this
 case.

 Signed-off-by: Mark Wuwu...@linux.vnet.ibm.com
 ---
  net/queue.c |   12 +++-
  1 files changed, 7 insertions(+), 5 deletions(-)

 What problem are you trying to fix?

 @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue)
             break;
         }

 -        if (packet-sent_cb) {
 +        if (ret  0  packet-sent_cb) {
             packet-sent_cb(packet-sender, ret);

 This looks wrong.  ret is passed as an argument to the callback.  You
 are skipping the callback on error and not giving it a chance to see
 negative ret.

 Looking at virtio_net_tx_complete() this causes a virtqueue element leak.

 Thanks for your review!
 Yes, that's a problem. I thought only tap call queue send function with a
 callback (tap_send_completed) and confirmed that no memory leak in the case
 of tap. I agree that it will cause a
 descriptor leak, but actually virtio_net_tx_complete doesn't check 'ret'. It
 just pushes the elem to the virtqueue with 'async_tx.len' and flushes the tx
 queue. I think it assumes the callback is only called on success. Otherwise,
 it doesn't make sense for me. My point is that flush shouldn't happen on a
 deliver failure. Probably it will cause more failures. tap_send_completed
 assumes it's called on successfully deliver a packet too because it
 re-enables polling of tap fd.  That's why I add a checking of 'ret'.

 I am not sure if the original code really needs a fix because it will not
 cause any visible problems.

In that case I would leave it.  I agree that if we just got an error
it is likely that sending more will also cause an error.  But we don't
know when in the future things will succeed.  Also Ethernet is lossly
and the guest networking stack is designed to handle failures and
dropped packets.

Stefan

[Qemu-devel] [PATCH 1/4] qemu-nbd: exit if the child exits before a socket connection is established

In -c mode qemu-nbd forks with the parent running the in-kernel NBD
client and the child running the server.  The next patch will introduce
a case in which the child fails after forking.  Detect this situation
and exit the parent.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index d8d3e15..972524d 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -24,6 +24,7 @@
 #include stdio.h
 #include getopt.h
 #include err.h
+#include sys/wait.h
 #include sys/types.h
 #include sys/socket.h
 #include netinet/in.h
@@ -382,7 +383,9 @@ int main(int argc, char **argv)
 do {
 sock = unix_socket_outgoing(socket);
 if (sock == -1) {
-if (errno != ENOENT  errno != ECONNREFUSED) {
+/* If the child exits before we connect, fail.  */
+if (waitpid(pid, NULL, WNOHANG) == pid ||
+(errno != ENOENT  errno != ECONNREFUSED)) {
 ret = 1;
 goto out;
 }
-- 
1.7.6.4

[Qemu-devel] [PATCH 0/4] fix qemu-nbd -c

qemu-nbd -c is another casualty of removing raw_read/raw_write.  This
series fixes it.

Unfortunately, as a side effect of this qemu-nbd will have to daemonize
before detecting all possible errors.  For this reason patches 2 and 3
make qemu-nbd write errors to syslog when daemonized.

Paolo Bonzini (4):
  qemu-nbd: exit if the child exits before a socket connection is
established
  qemu-nbd: include our own err/errx implementation
  qemu-nbd: report errors to syslog when daemonized
  qemu-nbd: do not start the block layer in the parent

 qemu-nbd.c |   78 +--
 1 files changed, 59 insertions(+), 19 deletions(-)

-- 
1.7.6.4

Re: [Qemu-devel] qemu/qemu-kvm floppy regression brought by 212ec7baa28cc9d819234fed1541fc1423cfe3d8

On Fri, Oct 28, 2011 at 10:20 AM, Lucas Meneghel Rodrigues
l...@redhat.com wrote:
 On Thu 27 Oct 2011 11:17:48 PM BRST, Zhi Yong Wu wrote:

 On Fri, Oct 28, 2011 at 2:57 AM, Lucas Meneghel Rodrigues
 l...@redhat.com  wrote:

 On 10/27/2011 05:17 AM, Stefan Hajnoczi wrote:

 On Wed, Oct 26, 2011 at 03:19:17PM -0200, Lucas Meneghel Rodrigues
 wrote:

 On 10/26/2011 01:47 PM, Kevin Wolf wrote:

 Am 26.10.2011 16:41, schrieb Lucas Meneghel Rodrigues:

 Hi folks:

 We've captured a regression with floppy disk on recent qemu (and
 qemu-kvm, after a code merge). We bisected it to be caused by:

 commit 212ec7baa28cc9d819234fed1541fc1423cfe3d8
 Author: Richard Hendersonr...@twiddle.net
 Date:   Mon Aug 15 15:08:45 2011 -0700

      fdc: Convert to isa_register_portio_list

      Signed-off-by: Richard Hendersonr...@twiddle.net
      Signed-off-by: Avi Kivitya...@redhat.com

 Since this commit, the guest doesn't see a floppy disk attached to it
 anymore, blocking kvm autotest ability to install windows guests
 automatically. This is a big deal for kvm autotest (ruins our
 automated
 regression jobs), so please take a look at it.

 Can you please try again with the latest block branch? I think there
 is
 a patch queued that will fix it.

 Kevin, I did try with HEAD of your repo:

 git://repo.or.cz/qemu/kevin.git

 [lmr@freedom qemu-kwolf]$ git branch -r
   origin/HEAD -    origin/master
   origin/blkqueue
   origin/blkqueue-v1
   origin/block
   origin/coroutine
   origin/coroutine-block
   origin/coroutine-devel
   origin/devel
   origin/ehci
   origin/for-anthony
   origin/for-stable-0.14
   origin/inplace-conversion
   origin/master

 With this repo, master branch, the problem persists. With the block
 branch, the problem persists.

 Now, with the blkqueue branch the problem is resolved. Cleber had
 the same results booting a FreeDOS floppy. So the fix is indeed in
 blkqueue.

 Oh, you might want to check the blkqueue branch, it does have quite
 a bunch of set but unused variables, which will cause compilation
 errors unless --disable-werror is passed to the configure script.

 I think blkqueue is an older development branch of the block queue
 feature that Kevin was working on.  It is not Kevin's block tree (see
 his block branch).

 So no, the block branch does not resolve the floppy access problem.

 Well, considering the tests of the stable set I'm running against qemu
 right
 now, this is not the biggest of our problems... I'm verifying qemu is
 segfaulting on nearly every prolonged attempt of doing migration... I'm
 about to write an email about it.

 What is the OS of your guest? fedora16 or RHEL6? I would like recently
 to use floppy device in guest.

 Fedora 15.
Just i tried to use Fedora-15-x86_64-DVD.iso to install one guest, but  hang up.

My qemu.git/HEAD is 9f60639b848944200c3d33a89233d808de0b5a43.

While it can work when using ./Fedora-14-x86_64-DVD.iso.

Regards,




-- 
Regards,

Zhi Yong Wu

Re: [Qemu-devel] qemu/qemu-kvm floppy regression brought by 212ec7baa28cc9d819234fed1541fc1423cfe3d8

On Fri, Oct 28, 2011 at 6:24 PM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Fri, Oct 28, 2011 at 10:20 AM, Lucas Meneghel Rodrigues
 l...@redhat.com wrote:
 On Thu 27 Oct 2011 11:17:48 PM BRST, Zhi Yong Wu wrote:

 On Fri, Oct 28, 2011 at 2:57 AM, Lucas Meneghel Rodrigues
 l...@redhat.com  wrote:

 On 10/27/2011 05:17 AM, Stefan Hajnoczi wrote:

 On Wed, Oct 26, 2011 at 03:19:17PM -0200, Lucas Meneghel Rodrigues
 wrote:

 On 10/26/2011 01:47 PM, Kevin Wolf wrote:

 Am 26.10.2011 16:41, schrieb Lucas Meneghel Rodrigues:

 Hi folks:

 We've captured a regression with floppy disk on recent qemu (and
 qemu-kvm, after a code merge). We bisected it to be caused by:

 commit 212ec7baa28cc9d819234fed1541fc1423cfe3d8
 Author: Richard Hendersonr...@twiddle.net
 Date:   Mon Aug 15 15:08:45 2011 -0700

      fdc: Convert to isa_register_portio_list

      Signed-off-by: Richard Hendersonr...@twiddle.net
      Signed-off-by: Avi Kivitya...@redhat.com

 Since this commit, the guest doesn't see a floppy disk attached to it
 anymore, blocking kvm autotest ability to install windows guests
 automatically. This is a big deal for kvm autotest (ruins our
 automated
 regression jobs), so please take a look at it.

 Can you please try again with the latest block branch? I think there
 is
 a patch queued that will fix it.

 Kevin, I did try with HEAD of your repo:

 git://repo.or.cz/qemu/kevin.git

 [lmr@freedom qemu-kwolf]$ git branch -r
   origin/HEAD -    origin/master
   origin/blkqueue
   origin/blkqueue-v1
   origin/block
   origin/coroutine
   origin/coroutine-block
   origin/coroutine-devel
   origin/devel
   origin/ehci
   origin/for-anthony
   origin/for-stable-0.14
   origin/inplace-conversion
   origin/master

 With this repo, master branch, the problem persists. With the block
 branch, the problem persists.

 Now, with the blkqueue branch the problem is resolved. Cleber had
 the same results booting a FreeDOS floppy. So the fix is indeed in
 blkqueue.

 Oh, you might want to check the blkqueue branch, it does have quite
 a bunch of set but unused variables, which will cause compilation
 errors unless --disable-werror is passed to the configure script.

 I think blkqueue is an older development branch of the block queue
 feature that Kevin was working on.  It is not Kevin's block tree (see
 his block branch).

 So no, the block branch does not resolve the floppy access problem.

 Well, considering the tests of the stable set I'm running against qemu
 right
 now, this is not the biggest of our problems... I'm verifying qemu is
 segfaulting on nearly every prolonged attempt of doing migration... I'm
 about to write an email about it.

 What is the OS of your guest? fedora16 or RHEL6? I would like recently
 to use floppy device in guest.

 Fedora 15.
And on one host with OS Fedora 15, after floppy.ko is loaded, i have
not seen its device file /dev/fd0.

On one guest with RH6 or Fedora 14, it fails to load floppy.ko.


 Just i tried to use Fedora-15-x86_64-DVD.iso to install one guest, but  hang 
 up.

 My qemu.git/HEAD is 9f60639b848944200c3d33a89233d808de0b5a43.

 While it can work when using ./Fedora-14-x86_64-DVD.iso.

Regards,




 --
 Regards,

 Zhi Yong Wu




-- 
Regards,

Zhi Yong Wu

[Qemu-devel] [PATCH 3/4] qemu-nbd: report errors to syslog when daemonized

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |   16 ++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 573bf3d..5031158 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -30,6 +30,7 @@
 #include netinet/tcp.h
 #include arpa/inet.h
 #include signal.h
+#include syslog.h
 #include libgen.h
 
 #define SOCKET_PATH/var/lock/qemu-nbd-%s
@@ -37,6 +38,7 @@
 #define NBD_BUFFER_SIZE (1024*1024)
 
 static int verbose;
+static int daemonized;
 
 static void usage(const char *name)
 {
@@ -87,7 +89,11 @@ static void err(int status, const char *format, ...)
 if (vasprintf(s, format, ap) == -1) {
abort();
 }
-fprintf(stderr, qemu-nbd: %s: %s\n, s, msg);
+if (daemonized) {
+syslog(LOG_ERR, %s: %s, s, msg);
+} else {
+fprintf(stderr, qemu-nbd: %s: %s\n, s, msg);
+}
 free(s);
 exit(status);
 }
@@ -101,7 +107,11 @@ static void errx(int status, const char *format, ...)
 if (vasprintf(s, format, ap) == -1) {
 abort();
 }
-fprintf(stderr, qemu-nbd: %s\n, s);
+if (daemonized) {
+syslog(LOG_ERR, %s, s);
+} else {
+fprintf(stderr, qemu-nbd: %s\n, s);
+}
 free(s);
 exit(status);
 }
@@ -387,9 +397,11 @@ int main(int argc, char **argv)
 
 if (!verbose) {
 /* detach client and server */
+openlog(qemu-nbd, LOG_PID, LOG_USER);
 if (qemu_daemon(0, 0) == -1) {
 err(EXIT_FAILURE, Failed to daemonize);
 }
+daemonized = 1;
 }
 
 if (socket == NULL) {
-- 
1.7.6.4

[Qemu-devel] [PATCH 2/4] qemu-nbd: include our own err/errx implementation

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |   30 +-
 1 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 972524d..573bf3d 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -23,7 +23,6 @@
 #include stdarg.h
 #include stdio.h
 #include getopt.h
-#include err.h
 #include sys/wait.h
 #include sys/types.h
 #include sys/socket.h
@@ -78,6 +77,35 @@ static void version(const char *name)
 , name);
 }
 
+static void err(int status, const char *format, ...)
+{
+char *s, *msg;
+va_list ap;
+
+msg = strerror(errno);
+va_start(ap, format);
+if (vasprintf(s, format, ap) == -1) {
+   abort();
+}
+fprintf(stderr, qemu-nbd: %s: %s\n, s, msg);
+free(s);
+exit(status);
+}
+
+static void errx(int status, const char *format, ...)
+{
+char *s;
+va_list ap;
+
+va_start(ap, format);
+if (vasprintf(s, format, ap) == -1) {
+abort();
+}
+fprintf(stderr, qemu-nbd: %s\n, s);
+free(s);
+exit(status);
+}
+
 struct partition_record
 {
 uint8_t bootable;
-- 
1.7.6.4

Re: [Qemu-devel] [PATCH] This patch adds a new block driver : iSCSI

On Wed, Sep 21, 2011 at 5:37 PM, Ronnie Sahlberg
ronniesahlb...@gmail.com wrote:
 This provides built-in support for iSCSI to QEMU.
 This has the advantage that the iSCSI devices need not be made visible to the 
 host, which is useful if you have very many virtual machines and very many 
 iscsi devices.
 It also has the benefit that non-root users of QEMU can access iSCSI devices 
 across the network without requiring root privilege on the host.

 This driver interfaces with the multiplatform posix library for iscsi 
 initiator/client access to iscsi devices hosted at
    git://github.com/sahlberg/libiscsi.git

 The patch adds the driver to interface with the iscsi library.
 It also updated the configure script to
 * by default, probe is libiscsi is available and if so, build
  qemu against libiscsi.
 * --enable-libiscsi
  Force a build against libiscsi. If libiscsi is not available
  the build will fail.
 * --disable-libiscsi
  Do not link against libiscsi, even if it is available.

 When linked with libiscsi, qemu gains support to access iscsi resources such 
 as disks and cdrom directly, without having to make the devices visible to 
 the host.

 You can specify devices using a iscsi url of the form :
 iscsi://[username[:password@]]host[:port]/target-iqn-name/lun
 When using authentication, the password can optionally be set with
 LIBISCSI_CHAP_PASSWORD=password to avoid it showing up in the process list

 Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
 ---
  Makefile.objs |    1 +
  block/iscsi.c |  596 
 +
  configure     |   31 +++
  trace-events  |    7 +
  4 files changed, 635 insertions(+), 0 deletions(-)
  create mode 100644 block/iscsi.c

 diff --git a/Makefile.objs b/Makefile.objs
 index a529a11..8c8e420 100644
 --- a/Makefile.objs
 +++ b/Makefile.objs
 @@ -36,6 +36,7 @@ block-nested-y += qed-check.o
  block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
  block-nested-$(CONFIG_WIN32) += raw-win32.o
  block-nested-$(CONFIG_POSIX) += raw-posix.o
 +block-nested-$(CONFIG_LIBISCSI) += iscsi.o
  block-nested-$(CONFIG_CURL) += curl.o
  block-nested-$(CONFIG_RBD) += rbd.o

 diff --git a/block/iscsi.c b/block/iscsi.c
 new file mode 100644
 index 000..6517576
 --- /dev/null
 +++ b/block/iscsi.c
 @@ -0,0 +1,596 @@
 +/*
 + * QEMU Block driver for iSCSI images
 + *
 + * Copyright (c) 2010-2011 Ronnie Sahlberg ronniesahlb...@gmail.com
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include config-host.h
 +
 +#include poll.h
 +#include sysemu.h
 +#include qemu-common.h
 +#include qemu-error.h
 +#include block_int.h
 +#include trace.h
 +
 +#include iscsi/iscsi.h
 +#include iscsi/scsi-lowlevel.h
 +
 +
 +typedef struct IscsiLun {
 +    struct iscsi_context *iscsi;
 +    int lun;
 +    int block_size;
 +    unsigned long num_blocks;
 +} IscsiLun;
 +
 +typedef struct IscsiAIOCB {
 +    BlockDriverAIOCB common;
 +    QEMUIOVector *qiov;
 +    QEMUBH *bh;
 +    IscsiLun *iscsilun;
 +    struct scsi_task *task;
 +    uint8_t *buf;
 +    int canceled;
 +    int status;
 +    size_t read_size;
 +    size_t read_offset;
 +} IscsiAIOCB;
 +
 +struct IscsiTask {
 +    IscsiLun *iscsilun;
 +    int status;
 +    int complete;
 +};
 +
 +static void
 +iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void 
 *command_data,
 +                    void *private_data)
 +{
 +}
 +
 +static void
 +iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
 +{
 +    IscsiAIOCB *acb = (IscsiAIOCB *)blockacb;
 +    IscsiLun *iscsilun = acb-iscsilun;
 +
 +    acb-status = -ECANCELED;
 +    acb-common.cb(acb-common.opaque, acb-status);
 +    acb-canceled = 1;
 +
 +    /* send a task mgmt call to the target to cancel the task on the target 
 */
 +    iscsi_task_mgmt_abort_task_async(iscsilun-iscsi, acb-task,
 +                                     iscsi_abort_task_cb, NULL);
 +
 +    /* then also

Re: [Qemu-devel] qemu-img vmdk converted from iso not accepted by vSphere

2011-10-28 Thread Richard Wellum

I see one other report of the same thing on the vmware forums:

http://communities.vmware.com/message/1839303

Thanks,

||Rich

On Oct 28, 2011, at 5:17 AM, Stefan Hajnoczi wrote:

 Rich: Thanks for the ESXi 4.1.0 and vSphere Client 4.1.0 version info.
 
 Fam: Do you have any suggestions how to debug the Error uploading
 file path/to/image.vmdk to server.  Not a supported disk format
 (sparse VMDK version too old) issue?  Ideally qemu-img would create
 images that work with current VMware tools by default.
 
 Stefan

[Qemu-devel] [PATCH 4/4] qemu-nbd: do not start the block layer in the parent

Forking and threading do not behave very well together.  With qemu-nbd in
-c mode it could happen that the thread pool is started in the parent, and
the children (the one actually doing the I/O) is left without working I/O.
Fix this by only running bdrv_init in the child process.

Reported-by: Pierre Riteau pierre.rit...@irisa.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c |   31 ++-
 1 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 5031158..962952c 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -371,21 +371,6 @@ int main(int argc, char **argv)
return 0;
 }
 
-bdrv_init();
-
-bs = bdrv_new(hda);
-
-if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
-errno = -ret;
-err(EXIT_FAILURE, Failed to bdrv_open '%s', argv[optind]);
-}
-
-fd_size = bs-total_sectors * 512;
-
-if (partition != -1 
-find_partition(bs, partition, dev_offset, fd_size))
-err(EXIT_FAILURE, Could not find partition %d, partition);
-
 if (device) {
 pid_t pid;
 int sock;
@@ -418,7 +403,6 @@ int main(int argc, char **argv)
 size_t blocksize;
 
 ret = 0;
-bdrv_close(bs);
 
 do {
 sock = unix_socket_outgoing(socket);
@@ -473,8 +457,21 @@ int main(int argc, char **argv)
 /* children */
 }
 
+bdrv_init();
+bs = bdrv_new(hda);
+if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
+errno = -ret;
+err(EXIT_FAILURE, Failed to bdrv_open '%s', argv[optind]);
+}
+
+fd_size = bs-total_sectors * 512;
+
+if (partition != -1 
+find_partition(bs, partition, dev_offset, fd_size)) {
+err(EXIT_FAILURE, Could not find partition %d, partition);
+}
+
 sharing_fds = g_malloc((shared + 1) * sizeof(int));
-
 if (socket) {
 sharing_fds[0] = unix_socket_incoming(socket);
 } else {
-- 
1.7.6.4

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 9:59 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com 
 wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.

 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)

 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
      SpinState *s = opaque;
      uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +    uint64_t result = 0;

      switch (len) {
      case 1:
 -        return ldub_p(spin_p);
 +        result = ldub_p(spin_p);
 +        break;
      case 2:
 -        return lduw_p(spin_p);
 +        result = lduw_p(spin_p);
 +        break;
      case 4:
 -        return ldl_p(spin_p);
 +        result = ldl_p(spin_p);
 +        break;
      default:
          assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

 My point is that it should not be an assertion.  The program has a
 control flow path which should never be taken.  In any safe
 programming environment the program will terminate with a diagnostic.

 In the not-so-safe C programming environment, we can disable that safety
 check by defining NDEBUG.

 Disabling safety checks is almost always a bad idea.

 There's a difference in a safety check that slows down the system and
 a failure condition where the program must terminate.

 assert(3) is for safety checks that can be disabled because they may
 slow down the system.

 I've been saying that assert(3) isn't appropriate here because the
 program can't continue and there's no check we can skip here.

a. Program can't continue: same for pretty much any assertion anywhere.

b. There's no code we can skip here: calling abort() is code.

What I've been saying is

1. Attempting to distinguish between safety checks that are safe to skip
and failure conditions that aren't is a fool's errand.  If a check is
safe to skip it's not a safety check by definition.  It's debugging
code, perhaps.

2. Optionally disabling expensive safety checks should be done based
on measurements, if at all.  Arbitrarily declaring all can't reach
checks inexpensive and all other assertions expensive isn't
measuring, it's guesswork.

[Qemu-devel] [ANNOUNCE] Xvisor: eXtensible Versatile hypervISOR

2011-10-28 Thread Chauhan, Himanshu

We are pleased to announcing Xvisor v0.1.0, a new open source bare-metal
hypervisor, which aims
towards providing virtualization solution, which is light-weight, portable,
and flexible with small memory foot print and less virtualization overhead.
It is distributed under GNU Public License (GPLv2).

Xvisor has most of the features expected from a modern full fledged
hypervisor:
- Tree based configuration
- Tickless and high resolution timekeeping
- Threading framework
- Device driver framework
- CPU virtualization
- Address Space virtualization
- Device emulation framework
- Serial port virtualization
- Managment terminal

Xvisor is currently being ported for two architectures: ARM and MIPS. The
development testing is being done using QEMU (0.14.xx or higher).

Xvisor ARM is able to boot multiple unmodified Linux-2.6.30.10 or
Linux-3.0.4
guest with a fairly interactive and smooth Busybox 0.19.2 console. Currently
supported host for Xvisor ARM is Realview-PB-A8 Board emulated by QEMU. It
is
also being ported to real hardware, specifically Beagle Board.

Please try out our Xivsor ARM demo on QEMU (0.14.xx or higher). To download
the demo tarball visit the link below:
https://docs.google.com/a/brainfault.org/leaf?id=0B0ABS_s60oP_OGE2MDA0MWYtNDUxOS00YWZkLTllMmMtN2M5MmE1ZGM0NDkyhl=en_GB

Xvisor MIPS is still a work in progress and will be announced very soon.

Xvisor is currently hosted on Github, to clone or download source code
please
visit following links:
Mainline: https://github.com/xvisor/xvisor
ARM Development: https://github.com/avpatel/xvisor-arm
MIPS Development: https://github.com/hschauhan/xvisor-mips

Our developer mailing list is xvisor-devel[at]googlegroups[dot]com, please
feel
free to post your queries or join our development mailing list.

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2

Am 27.10.2011 16:32, schrieb Kevin Wolf:
 Am 27.10.2011 16:15, schrieb Kevin Wolf:
 Am 27.10.2011 15:57, schrieb Stefan Hajnoczi:
 On Thu, Oct 27, 2011 at 03:26:23PM +0200, Kevin Wolf wrote:
 Am 19.09.2011 16:37, schrieb Frediano Ziglio:
 Now that iothread is always compiled sending a signal seems only an
 additional step. This patch also avoid writing to two pipe (one from 
 signal
 and one in qemu_service_io).

 Work with kvm enabled or disabled. strace output is more readable (less 
 syscalls).

 Signed-off-by: Frediano Ziglio fredd...@gmail.com

 Something in this change has bad effects, in the sense that it seems to
 break bdrv_read_em.

 How does it break bdrv_read_em?  Are you seeing QEMU hung with 100% CPU
 utilization or deadlocked?

 Sorry, I should have been more detailed here.

 No, it's nothing obvious, it must be some subtle side effect. The result
 of bdrv_read_em itself seems to be correct (return value and checksum of
 the read buffer).

 However instead of booting into the DOS setup I only get an error
 message Kein System oder Laufwerksfehler (don't know how it reads in
 English DOS versions), which seems to be produced by the boot sector.

 I excluded all of the minor changes, so I'm sure that it's caused by the
 switch from kill() to a direct call of the function that writes into the
 pipe.

 One interesting thing is that qemu_aio_wait() does not release the QEMU
 mutex, so we cannot write to a pipe with the mutex held and then spin
 waiting for the iothread to do work for us.

 Exactly how kill and qemu_notify_event() were different I'm not sure
 right now but it could be a factor.

 This would cause a hang, right? Then it isn't what I'm seeing.
 
 While trying out some more things, I added some fprintfs to
 posix_aio_process_queue() and suddenly it also fails with the kill()
 version. So what has changed might really just be the timing, and it
 could be a race somewhere that has always (?) existed.

Replying to myself again... It looks like there is a problem with
reentrancy in fdctrl_transfer_handler. I think this would have been
guarded by the AsyncContexts before, but we don't have them any more.

qemu-system-x86_64: /root/upstream/qemu/hw/fdc.c:1253:
fdctrl_transfer_handler: Assertion `reentrancy == 0' failed.

Program received signal SIGABRT, Aborted.

(gdb) bt
#0  0x003ccd2329a5 in raise () from /lib64/libc.so.6
#1  0x003ccd234185 in abort () from /lib64/libc.so.6
#2  0x003ccd22b935 in __assert_fail () from /lib64/libc.so.6
#3  0x0046ff09 in fdctrl_transfer_handler (opaque=value
optimized out, nchan=value optimized out, dma_pos=value optimized out,
dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1253
#4  0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#5  DMA_run () at /root/upstream/qemu/hw/dma.c:378
#6  0x0040b0e1 in qemu_bh_poll () at async.c:70
#7  0x0040aa19 in qemu_aio_wait () at aio.c:147
#8  0x0041c355 in bdrv_read_em (bs=0x131fd80, sector_num=19,
buf=value optimized out, nb_sectors=1) at block.c:2896
#9  0x0041b3d2 in bdrv_read (bs=0x131fd80, sector_num=19,
buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
#10 0x0041b3d2 in bdrv_read (bs=0x131f430, sector_num=19,
buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
#11 0x0046fbb8 in do_fdctrl_transfer_handler (opaque=0x1785788,
nchan=2, dma_pos=value optimized out, dma_len=512)
at /root/upstream/qemu/hw/fdc.c:1178
#12 0x0046fecf in fdctrl_transfer_handler (opaque=value
optimized out, nchan=value optimized out, dma_pos=value optimized out,
dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1255
#13 0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#14 DMA_run () at /root/upstream/qemu/hw/dma.c:378
#15 0x0046e456 in fdctrl_start_transfer (fdctrl=0x1785788,
direction=1) at /root/upstream/qemu/hw/fdc.c:1107
#16 0x00558a41 in kvm_handle_io (env=0x1323ff0) at
/root/upstream/qemu/kvm-all.c:834
#17 kvm_cpu_exec (env=0x1323ff0) at /root/upstream/qemu/kvm-all.c:976
#18 0x0053686a in qemu_kvm_cpu_thread_fn (arg=0x1323ff0) at
/root/upstream/qemu/cpus.c:661
#19 0x003ccda077e1 in start_thread () from /lib64/libpthread.so.0
#20 0x003ccd2e151d in clone () from /lib64/libc.so.6

I'm afraid that we can only avoid things like this reliably if we
convert all devices to be direct users of AIO/coroutines. The current
block layer infrastructure doesn't emulate the behaviour of bdrv_read
accurately as bottom halves can be run in the nested main loop.

For floppy, the following seems to be a quick fix (Lucas, Cleber, does
this solve your problems?), though it's not very satisfying. And I'm not
quite sure yet why it doesn't always happen with kill() in
posix-aio-compat.c.

diff --git a/hw/dma.c b/hw/dma.c
index 8a7302a..1d3b6f1 100644
--- a/hw/dma.c
+++ b/hw/dma.c
@@ -358,6 +358,13 @@ static void DMA_run (void)
 struct dma_cont *d;

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2

Am 28.10.2011 13:33, schrieb Kevin Wolf:
 Am 27.10.2011 16:32, schrieb Kevin Wolf:
 Am 27.10.2011 16:15, schrieb Kevin Wolf:
 Am 27.10.2011 15:57, schrieb Stefan Hajnoczi:
 On Thu, Oct 27, 2011 at 03:26:23PM +0200, Kevin Wolf wrote:
 Am 19.09.2011 16:37, schrieb Frediano Ziglio:
 Now that iothread is always compiled sending a signal seems only an
 additional step. This patch also avoid writing to two pipe (one from 
 signal
 and one in qemu_service_io).

 Work with kvm enabled or disabled. strace output is more readable (less 
 syscalls).

 Signed-off-by: Frediano Ziglio fredd...@gmail.com

 Something in this change has bad effects, in the sense that it seems to
 break bdrv_read_em.

 How does it break bdrv_read_em?  Are you seeing QEMU hung with 100% CPU
 utilization or deadlocked?

 Sorry, I should have been more detailed here.

 No, it's nothing obvious, it must be some subtle side effect. The result
 of bdrv_read_em itself seems to be correct (return value and checksum of
 the read buffer).

 However instead of booting into the DOS setup I only get an error
 message Kein System oder Laufwerksfehler (don't know how it reads in
 English DOS versions), which seems to be produced by the boot sector.

 I excluded all of the minor changes, so I'm sure that it's caused by the
 switch from kill() to a direct call of the function that writes into the
 pipe.

 One interesting thing is that qemu_aio_wait() does not release the QEMU
 mutex, so we cannot write to a pipe with the mutex held and then spin
 waiting for the iothread to do work for us.

 Exactly how kill and qemu_notify_event() were different I'm not sure
 right now but it could be a factor.

 This would cause a hang, right? Then it isn't what I'm seeing.

 While trying out some more things, I added some fprintfs to
 posix_aio_process_queue() and suddenly it also fails with the kill()
 version. So what has changed might really just be the timing, and it
 could be a race somewhere that has always (?) existed.
 
 Replying to myself again... It looks like there is a problem with
 reentrancy in fdctrl_transfer_handler. I think this would have been
 guarded by the AsyncContexts before, but we don't have them any more.
 
 qemu-system-x86_64: /root/upstream/qemu/hw/fdc.c:1253:
 fdctrl_transfer_handler: Assertion `reentrancy == 0' failed.
 
 Program received signal SIGABRT, Aborted.
 
 (gdb) bt
 #0  0x003ccd2329a5 in raise () from /lib64/libc.so.6
 #1  0x003ccd234185 in abort () from /lib64/libc.so.6
 #2  0x003ccd22b935 in __assert_fail () from /lib64/libc.so.6
 #3  0x0046ff09 in fdctrl_transfer_handler (opaque=value
 optimized out, nchan=value optimized out, dma_pos=value optimized out,
 dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1253
 #4  0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
 #5  DMA_run () at /root/upstream/qemu/hw/dma.c:378
 #6  0x0040b0e1 in qemu_bh_poll () at async.c:70
 #7  0x0040aa19 in qemu_aio_wait () at aio.c:147
 #8  0x0041c355 in bdrv_read_em (bs=0x131fd80, sector_num=19,
 buf=value optimized out, nb_sectors=1) at block.c:2896
 #9  0x0041b3d2 in bdrv_read (bs=0x131fd80, sector_num=19,
 buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
 #10 0x0041b3d2 in bdrv_read (bs=0x131f430, sector_num=19,
 buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
 #11 0x0046fbb8 in do_fdctrl_transfer_handler (opaque=0x1785788,
 nchan=2, dma_pos=value optimized out, dma_len=512)
 at /root/upstream/qemu/hw/fdc.c:1178
 #12 0x0046fecf in fdctrl_transfer_handler (opaque=value
 optimized out, nchan=value optimized out, dma_pos=value optimized out,
 dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1255
 #13 0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
 #14 DMA_run () at /root/upstream/qemu/hw/dma.c:378
 #15 0x0046e456 in fdctrl_start_transfer (fdctrl=0x1785788,
 direction=1) at /root/upstream/qemu/hw/fdc.c:1107
 #16 0x00558a41 in kvm_handle_io (env=0x1323ff0) at
 /root/upstream/qemu/kvm-all.c:834
 #17 kvm_cpu_exec (env=0x1323ff0) at /root/upstream/qemu/kvm-all.c:976
 #18 0x0053686a in qemu_kvm_cpu_thread_fn (arg=0x1323ff0) at
 /root/upstream/qemu/cpus.c:661
 #19 0x003ccda077e1 in start_thread () from /lib64/libpthread.so.0
 #20 0x003ccd2e151d in clone () from /lib64/libc.so.6
 
 I'm afraid that we can only avoid things like this reliably if we
 convert all devices to be direct users of AIO/coroutines. The current
 block layer infrastructure doesn't emulate the behaviour of bdrv_read
 accurately as bottom halves can be run in the nested main loop.
 
 For floppy, the following seems to be a quick fix (Lucas, Cleber, does
 this solve your problems?), though it's not very satisfying. And I'm not
 quite sure yet why it doesn't always happen with kill() in
 posix-aio-compat.c.
 
 diff --git a/hw/dma.c b/hw/dma.c
 index 8a7302a..1d3b6f1 100644
 ---

Re: [Qemu-devel] qemu/qemu-kvm floppy regression brought by 212ec7baa28cc9d819234fed1541fc1423cfe3d8

2011-10-28 Thread Lucas Meneghel Rodrigues


On 10/27/2011 11:17 PM, Zhi Yong Wu wrote:

On Fri, Oct 28, 2011 at 2:57 AM, Lucas Meneghel Rodrigues
l...@redhat.com  wrote:

On 10/27/2011 05:17 AM, Stefan Hajnoczi wrote:


On Wed, Oct 26, 2011 at 03:19:17PM -0200, Lucas Meneghel Rodrigues wrote:


On 10/26/2011 01:47 PM, Kevin Wolf wrote:


Am 26.10.2011 16:41, schrieb Lucas Meneghel Rodrigues:


Hi folks:

We've captured a regression with floppy disk on recent qemu (and
qemu-kvm, after a code merge). We bisected it to be caused by:

commit 212ec7baa28cc9d819234fed1541fc1423cfe3d8
Author: Richard Hendersonr...@twiddle.net
Date:   Mon Aug 15 15:08:45 2011 -0700

  fdc: Convert to isa_register_portio_list

  Signed-off-by: Richard Hendersonr...@twiddle.net
  Signed-off-by: Avi Kivitya...@redhat.com

Since this commit, the guest doesn't see a floppy disk attached to it
anymore, blocking kvm autotest ability to install windows guests
automatically. This is a big deal for kvm autotest (ruins our automated
regression jobs), so please take a look at it.


Can you please try again with the latest block branch? I think there is
a patch queued that will fix it.


Kevin, I did try with HEAD of your repo:

git://repo.or.cz/qemu/kevin.git

[lmr@freedom qemu-kwolf]$ git branch -r
   origin/HEAD -origin/master
   origin/blkqueue
   origin/blkqueue-v1
   origin/block
   origin/coroutine
   origin/coroutine-block
   origin/coroutine-devel
   origin/devel
   origin/ehci
   origin/for-anthony
   origin/for-stable-0.14
   origin/inplace-conversion
   origin/master

With this repo, master branch, the problem persists. With the block
branch, the problem persists.

Now, with the blkqueue branch the problem is resolved. Cleber had
the same results booting a FreeDOS floppy. So the fix is indeed in
blkqueue.

Oh, you might want to check the blkqueue branch, it does have quite
a bunch of set but unused variables, which will cause compilation
errors unless --disable-werror is passed to the configure script.


I think blkqueue is an older development branch of the block queue
feature that Kevin was working on.  It is not Kevin's block tree (see
his block branch).


So no, the block branch does not resolve the floppy access problem.

Well, considering the tests of the stable set I'm running against qemu right
now, this is not the biggest of our problems... I'm verifying qemu is
segfaulting on nearly every prolonged attempt of doing migration... I'm
about to write an email about it.

What is the OS of your guest? fedora16 or RHEL6? I would like recently
to use floppy device in guest.


Ok, correcting my answer, it was Windows 7 SP1. For some reason, I read 
'host' instead of guest. I had a long day when I wrote the 1st reply.


It's been a while that we've verified that the linux floppy driver is 
quite unstable. In general linux + floppy devices on a guest tends to 
crash the guest kernel, and that's why in kvm autotest we moved from 
floppy devices to cdroms to hold the kickstart file. The fact you 
managed to get things working under F14 means you are lucky, and that 
particular floppy driver bug does not happen under F14's kernel. We 
never had such a kernel crash problem with any of the Windows's kernels.


So if I were you, I would *not* use a floppy with a linux guest. Well, 
unless you want to debug the floppy driver bug and fix this on upstream 
linux once for all, so future versions of linux won't have this problem.

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2


On 10/28/2011 01:33 PM, Kevin Wolf wrote:

I'm afraid that we can only avoid things like this reliably if we
convert all devices to be direct users of AIO/coroutines. The current
block layer infrastructure doesn't emulate the behaviour of bdrv_read
accurately as bottom halves can be run in the nested main loop.

For floppy, the following seems to be a quick fix (Lucas, Cleber, does
this solve your problems?), though it's not very satisfying. And I'm not
quite sure yet why it doesn't always happen with kill() in
posix-aio-compat.c.


Another fix is to change idle bottom halves (at least the one in 
hw/dma.c) to 10ms timers.


Paolo

Re: [Qemu-devel] [PATCH 4/4] qemu-nbd: do not start the block layer in the parent

2011-10-28 Thread Pierre Riteau

Thanks a lot Paolo, I confirm this patchset fixes the bug!

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/

On 28 oct. 2011, at 12:17, Paolo Bonzini wrote:

 Forking and threading do not behave very well together.  With qemu-nbd in
 -c mode it could happen that the thread pool is started in the parent, and
 the children (the one actually doing the I/O) is left without working I/O.
 Fix this by only running bdrv_init in the child process.
 
 Reported-by: Pierre Riteau pierre.rit...@irisa.fr
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
 qemu-nbd.c |   31 ++-
 1 files changed, 14 insertions(+), 17 deletions(-)
 
 diff --git a/qemu-nbd.c b/qemu-nbd.c
 index 5031158..962952c 100644
 --- a/qemu-nbd.c
 +++ b/qemu-nbd.c
 @@ -371,21 +371,6 @@ int main(int argc, char **argv)
   return 0;
 }
 
 -bdrv_init();
 -
 -bs = bdrv_new(hda);
 -
 -if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
 -errno = -ret;
 -err(EXIT_FAILURE, Failed to bdrv_open '%s', argv[optind]);
 -}
 -
 -fd_size = bs-total_sectors * 512;
 -
 -if (partition != -1 
 -find_partition(bs, partition, dev_offset, fd_size))
 -err(EXIT_FAILURE, Could not find partition %d, partition);
 -
 if (device) {
 pid_t pid;
 int sock;
 @@ -418,7 +403,6 @@ int main(int argc, char **argv)
 size_t blocksize;
 
 ret = 0;
 -bdrv_close(bs);
 
 do {
 sock = unix_socket_outgoing(socket);
 @@ -473,8 +457,21 @@ int main(int argc, char **argv)
 /* children */
 }
 
 +bdrv_init();
 +bs = bdrv_new(hda);
 +if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
 +errno = -ret;
 +err(EXIT_FAILURE, Failed to bdrv_open '%s', argv[optind]);
 +}
 +
 +fd_size = bs-total_sectors * 512;
 +
 +if (partition != -1 
 +find_partition(bs, partition, dev_offset, fd_size)) {
 +err(EXIT_FAILURE, Could not find partition %d, partition);
 +}
 +
 sharing_fds = g_malloc((shared + 1) * sizeof(int));
 -
 if (socket) {
 sharing_fds[0] = unix_socket_incoming(socket);
 } else {
 -- 
 1.7.6.4

Re: [Qemu-devel] [PATCH 4/4] qemu-nbd: do not start the block layer in the parent


On 10/28/2011 01:56 PM, Pierre Riteau wrote:

Thanks a lot Paolo, I confirm this patchset fixes the bug!


Do you believe the limitation on error reporting to be too strong?

Paolo

Re: [Qemu-devel] [PATCH 1/2] seabios: Add Local APIC NMI Structure to ACPI MADT

2011-10-28 Thread Kenji Kaneshige

Avi, Jan,

Could you comment on these patches?

Inject-NMI doesn't work on Windows guest without these patches. Windows seems
to setup LVT based on ACPI NMI structure information which is missing in current
seabios. LVT LINT1 are never unmasked by Windows guest without the patches.

Those patches were already reviewed by seabios people, but need ack from 
qemu/kvm
side.

Regards,
Kenji Kaneshige



(2011/10/10 15:06), Lai Jiangshan wrote:
 From: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
 
 ACPI NMI Structure describes LINT pin (LINT0 or LINT1) information to
 which NMI is connected, and it is needed by OS to initialize local APIC.
 
 Signed-off-by: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
 Reviewed-by: Lai Jiangshanla...@cn.fujitsu.com
 ---
   src/acpi.c |   22 --
   1 file changed, 20 insertions(+), 2 deletions(-)
 
 Index: seabios/src/acpi.c
 ===
 --- seabios.orig/src/acpi.c
 +++ seabios/src/acpi.c
 @@ -134,6 +134,14 @@ struct madt_intsrcovr {
   u16 flags;
   } PACKED;
 
 +struct madt_local_nmi {
 +ACPI_SUB_HEADER_DEF
 +u8  processor_id;   /* ACPI processor id */
 +u16 flags;  /* MPS INTI flags */
 +u8  lint;   /* Local APIC LINT# */
 +} PACKED;
 +
 +
   /*
* ACPI 2.0 Generic Address Space definition.
*/
 @@ -288,7 +296,9 @@ build_madt(void)
   int madt_size = (sizeof(struct multiple_apic_table)
+ sizeof(struct madt_processor_apic) * MaxCountCPUs
+ sizeof(struct madt_io_apic)
 - + sizeof(struct madt_intsrcovr) * 16);
 + + sizeof(struct madt_intsrcovr) * 16
 + + sizeof(struct madt_local_nmi));
 +
   struct multiple_apic_table *madt = malloc_high(madt_size);
   if (!madt) {
   warn_noalloc();
 @@ -340,7 +350,15 @@ build_madt(void)
   intsrcovr++;
   }
 
 -build_header((void*)madt, APIC_SIGNATURE, (void*)intsrcovr - 
 (void*)madt, 1);
 +struct madt_local_nmi *local_nmi = (void*)intsrcovr;
 +local_nmi-type = APIC_LOCAL_NMI;
 +local_nmi-length   = sizeof(*local_nmi);
 +local_nmi-processor_id = 0xff; /* all processors */
 +local_nmi-flags= 0;
 +local_nmi-lint = 1; /* LINT1 */
 +local_nmi++;
 +
 +build_header((void*)madt, APIC_SIGNATURE, (void*)local_nmi - 
 (void*)madt, 1);
   return madt;
   }
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] qemu/qemu-kvm floppy regression brought by 212ec7baa28cc9d819234fed1541fc1423cfe3d8

2011-10-28 Thread Gleb Natapov

On Fri, Oct 28, 2011 at 09:46:33AM -0200, Lucas Meneghel Rodrigues wrote:
 On 10/27/2011 11:17 PM, Zhi Yong Wu wrote:
 On Fri, Oct 28, 2011 at 2:57 AM, Lucas Meneghel Rodrigues
 l...@redhat.com  wrote:
 On 10/27/2011 05:17 AM, Stefan Hajnoczi wrote:
 
 On Wed, Oct 26, 2011 at 03:19:17PM -0200, Lucas Meneghel Rodrigues wrote:
 
 On 10/26/2011 01:47 PM, Kevin Wolf wrote:
 
 Am 26.10.2011 16:41, schrieb Lucas Meneghel Rodrigues:
 
 Hi folks:
 
 We've captured a regression with floppy disk on recent qemu (and
 qemu-kvm, after a code merge). We bisected it to be caused by:
 
 commit 212ec7baa28cc9d819234fed1541fc1423cfe3d8
 Author: Richard Hendersonr...@twiddle.net
 Date:   Mon Aug 15 15:08:45 2011 -0700
 
   fdc: Convert to isa_register_portio_list
 
   Signed-off-by: Richard Hendersonr...@twiddle.net
   Signed-off-by: Avi Kivitya...@redhat.com
 
 Since this commit, the guest doesn't see a floppy disk attached to it
 anymore, blocking kvm autotest ability to install windows guests
 automatically. This is a big deal for kvm autotest (ruins our automated
 regression jobs), so please take a look at it.
 
 Can you please try again with the latest block branch? I think there is
 a patch queued that will fix it.
 
 Kevin, I did try with HEAD of your repo:
 
 git://repo.or.cz/qemu/kevin.git
 
 [lmr@freedom qemu-kwolf]$ git branch -r
origin/HEAD -origin/master
origin/blkqueue
origin/blkqueue-v1
origin/block
origin/coroutine
origin/coroutine-block
origin/coroutine-devel
origin/devel
origin/ehci
origin/for-anthony
origin/for-stable-0.14
origin/inplace-conversion
origin/master
 
 With this repo, master branch, the problem persists. With the block
 branch, the problem persists.
 
 Now, with the blkqueue branch the problem is resolved. Cleber had
 the same results booting a FreeDOS floppy. So the fix is indeed in
 blkqueue.
 
 Oh, you might want to check the blkqueue branch, it does have quite
 a bunch of set but unused variables, which will cause compilation
 errors unless --disable-werror is passed to the configure script.
 
 I think blkqueue is an older development branch of the block queue
 feature that Kevin was working on.  It is not Kevin's block tree (see
 his block branch).
 
 So no, the block branch does not resolve the floppy access problem.
 
 Well, considering the tests of the stable set I'm running against qemu right
 now, this is not the biggest of our problems... I'm verifying qemu is
 segfaulting on nearly every prolonged attempt of doing migration... I'm
 about to write an email about it.
 What is the OS of your guest? fedora16 or RHEL6? I would like recently
 to use floppy device in guest.
 
 Ok, correcting my answer, it was Windows 7 SP1. For some reason, I
 read 'host' instead of guest. I had a long day when I wrote the 1st
 reply.
 
 It's been a while that we've verified that the linux floppy driver
 is quite unstable. In general linux + floppy devices on a guest
 tends to crash the guest kernel, and that's why in kvm autotest we
 moved from floppy devices to cdroms to hold the kickstart file. The
 fact you managed to get things working under F14 means you are
 lucky, and that particular floppy driver bug does not happen under
 F14's kernel. We never had such a kernel crash problem with any of
 the Windows's kernels.
 
 So if I were you, I would *not* use a floppy with a linux guest.
 Well, unless you want to debug the floppy driver bug and fix this on
 upstream linux once for all, so future versions of linux won't have
 this problem.
AFAIK the problem exists only on smp guest.

--
Gleb.

Re: [Qemu-devel] [PATCH 4/4] qemu-nbd: do not start the block layer in the parent

2011-10-28 Thread Pierre Riteau

On 28 oct. 2011, at 13:57, Paolo Bonzini wrote:

 On 10/28/2011 01:56 PM, Pierre Riteau wrote:
 Thanks a lot Paolo, I confirm this patchset fixes the bug!
 
 Do you believe the limitation on error reporting to be too strong?
 
 Paolo

Yes, it would be better if we could have error output on stderr. Now, simple 
errors such as a missing image file (or wrong path to the image) are reported 
to syslog instead. It could be the source of some headaches...

Is there a way we could have the child send the error to the parent over a pipe 
and have the parent print it on stderr?

Pierre

Re: [Qemu-devel] [PATCH 4/4] qemu-nbd: do not start the block layer in the parent


On 10/28/2011 02:16 PM, Pierre Riteau wrote:

Yes, it would be better if we could have error output on stderr. Now,
simple errors such as a missing image file (or wrong path to the
image) are reported to syslog instead. It could be the source of some
headaches...

Is there a way we could have the child send the error to the parent
over a pipe and have the parent print it on stderr?


A way could be to change the fork() into a separate thread, so that you 
can daemonize as soon as you accept the socket rather than having to do 
it early.


Paolo

Re: [Qemu-devel] [PATCH 1/2] seabios: Add Local APIC NMI Structure to ACPI MADT

2011-10-28 Thread Gleb Natapov

On Fri, Oct 28, 2011 at 09:08:18PM +0900, Kenji Kaneshige wrote:
 Avi, Jan,
 
 Could you comment on these patches?
 
 Inject-NMI doesn't work on Windows guest without these patches. Windows seems
 to setup LVT based on ACPI NMI structure information which is missing in 
 current
 seabios. LVT LINT1 are never unmasked by Windows guest without the patches.
 
 Those patches were already reviewed by seabios people, but need ack from 
 qemu/kvm
 side.
 
 Regards,
 Kenji Kaneshige
 
 
 
Acked-by: Gleb Natapov g...@redhat.com

 (2011/10/10 15:06), Lai Jiangshan wrote:
  From: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
  
  ACPI NMI Structure describes LINT pin (LINT0 or LINT1) information to
  which NMI is connected, and it is needed by OS to initialize local APIC.
  
  Signed-off-by: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
  Reviewed-by: Lai Jiangshanla...@cn.fujitsu.com
  ---
src/acpi.c |   22 --
1 file changed, 20 insertions(+), 2 deletions(-)
  
  Index: seabios/src/acpi.c
  ===
  --- seabios.orig/src/acpi.c
  +++ seabios/src/acpi.c
  @@ -134,6 +134,14 @@ struct madt_intsrcovr {
u16 flags;
} PACKED;
  
  +struct madt_local_nmi {
  +ACPI_SUB_HEADER_DEF
  +u8  processor_id;   /* ACPI processor id */
  +u16 flags;  /* MPS INTI flags */
  +u8  lint;   /* Local APIC LINT# */
  +} PACKED;
  +
  +
/*
 * ACPI 2.0 Generic Address Space definition.
 */
  @@ -288,7 +296,9 @@ build_madt(void)
int madt_size = (sizeof(struct multiple_apic_table)
 + sizeof(struct madt_processor_apic) * MaxCountCPUs
 + sizeof(struct madt_io_apic)
  - + sizeof(struct madt_intsrcovr) * 16);
  + + sizeof(struct madt_intsrcovr) * 16
  + + sizeof(struct madt_local_nmi));
  +
struct multiple_apic_table *madt = malloc_high(madt_size);
if (!madt) {
warn_noalloc();
  @@ -340,7 +350,15 @@ build_madt(void)
intsrcovr++;
}
  
  -build_header((void*)madt, APIC_SIGNATURE, (void*)intsrcovr - 
  (void*)madt, 1);
  +struct madt_local_nmi *local_nmi = (void*)intsrcovr;
  +local_nmi-type = APIC_LOCAL_NMI;
  +local_nmi-length   = sizeof(*local_nmi);
  +local_nmi-processor_id = 0xff; /* all processors */
  +local_nmi-flags= 0;
  +local_nmi-lint = 1; /* LINT1 */
  +local_nmi++;
  +
  +build_header((void*)madt, APIC_SIGNATURE, (void*)local_nmi - 
  (void*)madt, 1);
return madt;
}
  
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
  

--
Gleb.

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2

2011-10-28 Thread Cleber Rosa


On 10/28/2011 08:33 AM, Kevin Wolf wrote:

Am 27.10.2011 16:32, schrieb Kevin Wolf:

Am 27.10.2011 16:15, schrieb Kevin Wolf:

Am 27.10.2011 15:57, schrieb Stefan Hajnoczi:

On Thu, Oct 27, 2011 at 03:26:23PM +0200, Kevin Wolf wrote:

Am 19.09.2011 16:37, schrieb Frediano Ziglio:

Now that iothread is always compiled sending a signal seems only an
additional step. This patch also avoid writing to two pipe (one from signal
and one in qemu_service_io).

Work with kvm enabled or disabled. strace output is more readable (less 
syscalls).

Signed-off-by: Frediano Zigliofredd...@gmail.com

Something in this change has bad effects, in the sense that it seems to
break bdrv_read_em.

How does it break bdrv_read_em?  Are you seeing QEMU hung with 100% CPU
utilization or deadlocked?

Sorry, I should have been more detailed here.

No, it's nothing obvious, it must be some subtle side effect. The result
of bdrv_read_em itself seems to be correct (return value and checksum of
the read buffer).

However instead of booting into the DOS setup I only get an error
message Kein System oder Laufwerksfehler (don't know how it reads in
English DOS versions), which seems to be produced by the boot sector.

I excluded all of the minor changes, so I'm sure that it's caused by the
switch from kill() to a direct call of the function that writes into the
pipe.


One interesting thing is that qemu_aio_wait() does not release the QEMU
mutex, so we cannot write to a pipe with the mutex held and then spin
waiting for the iothread to do work for us.

Exactly how kill and qemu_notify_event() were different I'm not sure
right now but it could be a factor.

This would cause a hang, right? Then it isn't what I'm seeing.

While trying out some more things, I added some fprintfs to
posix_aio_process_queue() and suddenly it also fails with the kill()
version. So what has changed might really just be the timing, and it
could be a race somewhere that has always (?) existed.

Replying to myself again... It looks like there is a problem with
reentrancy in fdctrl_transfer_handler. I think this would have been
guarded by the AsyncContexts before, but we don't have them any more.

qemu-system-x86_64: /root/upstream/qemu/hw/fdc.c:1253:
fdctrl_transfer_handler: Assertion `reentrancy == 0' failed.

Program received signal SIGABRT, Aborted.

(gdb) bt
#0  0x003ccd2329a5 in raise () from /lib64/libc.so.6
#1  0x003ccd234185 in abort () from /lib64/libc.so.6
#2  0x003ccd22b935 in __assert_fail () from /lib64/libc.so.6
#3  0x0046ff09 in fdctrl_transfer_handler (opaque=value
optimized out, nchan=value optimized out, dma_pos=value optimized out,
 dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1253
#4  0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#5  DMA_run () at /root/upstream/qemu/hw/dma.c:378
#6  0x0040b0e1 in qemu_bh_poll () at async.c:70
#7  0x0040aa19 in qemu_aio_wait () at aio.c:147
#8  0x0041c355 in bdrv_read_em (bs=0x131fd80, sector_num=19,
buf=value optimized out, nb_sectors=1) at block.c:2896
#9  0x0041b3d2 in bdrv_read (bs=0x131fd80, sector_num=19,
buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
#10 0x0041b3d2 in bdrv_read (bs=0x131f430, sector_num=19,
buf=0x1785a00 IO  SYS!, nb_sectors=1) at block.c:1062
#11 0x0046fbb8 in do_fdctrl_transfer_handler (opaque=0x1785788,
nchan=2, dma_pos=value optimized out, dma_len=512)
 at /root/upstream/qemu/hw/fdc.c:1178
#12 0x0046fecf in fdctrl_transfer_handler (opaque=value
optimized out, nchan=value optimized out, dma_pos=value optimized out,
 dma_len=value optimized out) at /root/upstream/qemu/hw/fdc.c:1255
#13 0x0046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#14 DMA_run () at /root/upstream/qemu/hw/dma.c:378
#15 0x0046e456 in fdctrl_start_transfer (fdctrl=0x1785788,
direction=1) at /root/upstream/qemu/hw/fdc.c:1107
#16 0x00558a41 in kvm_handle_io (env=0x1323ff0) at
/root/upstream/qemu/kvm-all.c:834
#17 kvm_cpu_exec (env=0x1323ff0) at /root/upstream/qemu/kvm-all.c:976
#18 0x0053686a in qemu_kvm_cpu_thread_fn (arg=0x1323ff0) at
/root/upstream/qemu/cpus.c:661
#19 0x003ccda077e1 in start_thread () from /lib64/libpthread.so.0
#20 0x003ccd2e151d in clone () from /lib64/libc.so.6

I'm afraid that we can only avoid things like this reliably if we
convert all devices to be direct users of AIO/coroutines. The current
block layer infrastructure doesn't emulate the behaviour of bdrv_read
accurately as bottom halves can be run in the nested main loop.

For floppy, the following seems to be a quick fix (Lucas, Cleber, does
this solve your problems?), though it's not very satisfying. And I'm not
quite sure yet why it doesn't always happen with kill() in
posix-aio-compat.c.

diff --git a/hw/dma.c b/hw/dma.c
index 8a7302a..1d3b6f1 100644
--- a/hw/dma.c
+++ b/hw/dma.c
@@ -358,6 +358,13 @@ static void DMA_run (void)
  struct

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2

Am 28.10.2011 13:50, schrieb Paolo Bonzini:
 On 10/28/2011 01:33 PM, Kevin Wolf wrote:
 I'm afraid that we can only avoid things like this reliably if we
 convert all devices to be direct users of AIO/coroutines. The current
 block layer infrastructure doesn't emulate the behaviour of bdrv_read
 accurately as bottom halves can be run in the nested main loop.

 For floppy, the following seems to be a quick fix (Lucas, Cleber, does
 this solve your problems?), though it's not very satisfying. And I'm not
 quite sure yet why it doesn't always happen with kill() in
 posix-aio-compat.c.
 
 Another fix is to change idle bottom halves (at least the one in 
 hw/dma.c) to 10ms timers.

Which would be using the fact that timers are only executed in the real
main loop. Which makes me wonder if it would be enough for floppy if we
changed qemu_bh_poll() to take a bool run_idle_bhs that would be true in
the main loop and false an qemu_aio_wait().

Still this wouldn't be a general solution as normal BHs have the very
same problem if they are scheduled before a bdrv_read/write call. To
solve that I guess we'd have to reintroduce AsyncContext, but it has its
own problems and was removed for a reason.

Or we make some serious effort now to convert devices to AIO.

Kevin

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2

On Fri, Oct 28, 2011 at 1:29 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 28.10.2011 13:50, schrieb Paolo Bonzini:
 On 10/28/2011 01:33 PM, Kevin Wolf wrote:
 I'm afraid that we can only avoid things like this reliably if we
 convert all devices to be direct users of AIO/coroutines. The current
 block layer infrastructure doesn't emulate the behaviour of bdrv_read
 accurately as bottom halves can be run in the nested main loop.

 For floppy, the following seems to be a quick fix (Lucas, Cleber, does
 this solve your problems?), though it's not very satisfying. And I'm not
 quite sure yet why it doesn't always happen with kill() in
 posix-aio-compat.c.

 Another fix is to change idle bottom halves (at least the one in
 hw/dma.c) to 10ms timers.

 Which would be using the fact that timers are only executed in the real
 main loop. Which makes me wonder if it would be enough for floppy if we
 changed qemu_bh_poll() to take a bool run_idle_bhs that would be true in
 the main loop and false an qemu_aio_wait().

 Still this wouldn't be a general solution as normal BHs have the very
 same problem if they are scheduled before a bdrv_read/write call. To
 solve that I guess we'd have to reintroduce AsyncContext, but it has its
 own problems and was removed for a reason.

 Or we make some serious effort now to convert devices to AIO.

Zhi Yong: We were just talking about converting devices to aio.  If
you have time to do that for fdc, sd, or any other synchronous API
users in hw/ that would be helpful.  Please let us know which device
you are refactoring so we don't duplicate work.

Stefan

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

On Fri, Oct 28, 2011 at 12:07 PM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 9:59 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com 
 wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Mon, Oct 24, 2011 at 10:18:43PM +0200, Stefan Weil wrote:
 For compilations with -DNDEBUG, the default case did not return
 a value which caused a compiler warning.

 Signed-off-by: Stefan Weil s...@weilnetz.de
 ---
  hw/ppce500_spin.c |   11 ---
  1 files changed, 8 insertions(+), 3 deletions(-)

 diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
 index cccd940..5b5ffe0 100644
 --- a/hw/ppce500_spin.c
 +++ b/hw/ppce500_spin.c
 @@ -168,17 +168,22 @@ static uint64_t spin_read(void *opaque, 
 target_phys_addr_t addr, unsigned len)
  {
      SpinState *s = opaque;
      uint8_t *spin_p = ((uint8_t*)s-spin)[addr];
 +    uint64_t result = 0;

      switch (len) {
      case 1:
 -        return ldub_p(spin_p);
 +        result = ldub_p(spin_p);
 +        break;
      case 2:
 -        return lduw_p(spin_p);
 +        result = lduw_p(spin_p);
 +        break;
      case 4:
 -        return ldl_p(spin_p);
 +        result = ldl_p(spin_p);
 +        break;
      default:
          assert(0);

 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

 My point is that it should not be an assertion.  The program has a
 control flow path which should never be taken.  In any safe
 programming environment the program will terminate with a diagnostic.

 In the not-so-safe C programming environment, we can disable that safety
 check by defining NDEBUG.

 Disabling safety checks is almost always a bad idea.

 There's a difference in a safety check that slows down the system and
 a failure condition where the program must terminate.

 assert(3) is for safety checks that can be disabled because they may
 slow down the system.

 I've been saying that assert(3) isn't appropriate here because the
 program can't continue and there's no check we can skip here.

 a. Program can't continue: same for pretty much any assertion anywhere.

 b. There's no code we can skip here: calling abort() is code.

 What I've been saying is

 1. Attempting to distinguish between safety checks that are safe to skip
 and failure conditions that aren't is a fool's errand.  If a check is
 safe to skip it's not a safety check by definition.  It's debugging
 code, perhaps.

 2. Optionally disabling expensive safety checks should be done based
 on measurements, if at all.  Arbitrarily declaring all can't reach
 checks inexpensive and all other assertions expensive isn't
 measuring, it's guesswork.

I'm tempted to continue the thread but at the end of the day we just
need to make the function compile with -NDEBUG.  Using abort(3) or
qemu_abort() would be the smallest and IMO sanest change, but if it's
something else that's fine.

Stefan

Re: [Qemu-devel] [PATCH 1/2] seabios: Add Local APIC NMI Structure to ACPI MADT

2011-10-28 Thread Jun Koi

2011/10/28 Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com:
 Avi, Jan,

 Could you comment on these patches?

 Inject-NMI doesn't work on Windows guest without these patches.

sorry but i am really curious here: why Windows still works well even
if it desnt see the inject-NMI?
or there are still invisible side-effects that we are not awere of???

thanks,
Jun

[Qemu-devel] ARM hosts: code_gen_alloc() maps code buffer on top of libc heap

I've been tracking down a bug where qemu run on ARM hosts will
(about half of the time) abort early in execution with:

qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
(((mbinptr) (((char *) ((av)-bins[((1) - 1) * 2])) -
__builtin_offsetof (struct malloc_chunk, fd  old_size == 0) ||
((unsigned long) (old_size) = (unsigned long)__builtin_offsetof
(struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) 
~((2 * (sizeof(size_t))) - 1)))  ((old_top)-size  0x1) 
((unsigned long)old_end  pagemask) == 0)' failed.

This turns out to be because code_gen_alloc() is using
mmap(MAP_FIXED) to map the code buffer at address
0x0100UL, which is in the area glibc happens to be using
for its heap. This tends to make the next malloc() abort,
although occasionally the stars align and we pass that
and fail weirdly later on.

I suspect we need to drop the MAP_FIXED requirement and
fix the TCG code to cope with emitting code for longer-range
branches for calls to host fns etc (calls/branches within the
generated code should be ok to keep using the short-range
branch insn I think). There is already no guarantee that
the generated code and the host C code are within short
branch range of each other...

-- PMM

Re: [Qemu-devel] [PATCH] Teach block/vdi about discarded (no longer allocated) blocks

2011-10-28 Thread Stefan Weil


Am 28.10.2011 11:22, schrieb Kevin Wolf:

Am 28.10.2011 10:15, schrieb Eric Sunshine:

On Oct 28, 2011, at 4:00 AM, Kevin Wolf wrote:


Am 27.10.2011 18:12, schrieb Stefan Weil:

Am 27.10.2011 10:53, schrieb Kevin Wolf:

Am 26.10.2011 21:51, schrieb Eric Sunshine:

An entry in the VDI block map will hold an offset to the actual
block if
the block is allocated, or one of two specially-interpreted
values if
not allocated. Using VirtualBox terminology, value
VDI_IMAGE_BLOCK_FREE
(0x) represents a never-allocated block (semantically
arbitrary
content). VDI_IMAGE_BLOCK_ZERO (0xfffe) represents a
discarded
block (semantically zero-filled). block/vdi knows only about
VDI_IMAGE_BLOCK_FREE. Teach it about VDI_IMAGE_BLOCK_ZERO.

Signed-off-by: Eric Sunshinesunsh...@sunshineco.com

Thanks, applied to the block branch.

Kevin


Kevin, I don't want to block improvements. Nevertheless
I'd like to see a small modification in this patch:
both #defines should be implemented without a type cast.
Please change them or wait until Eric sends an update.

My favorite is this:

#define VDI_UNALLOCATED UINT32_MAX
#define VDI_DISCARD (VDI_UNALLOCATED - 1)

This would also be ok:

#define VDI_UNALLOCATED 0xU
#define VDI_DISCARD 0xfffeU

Using the macro names and the definitions (with type cast)
from the original VirtualBox code would also be ok.

I did see your comments, and I waited for the endianness thing to be
answered. However, how the definition of these constants is written is
really not a functional defect, but simply a matter of taste. It's an
old rule that whoever does the work also decides on the details.

I really think it's wasting our time if we need to discuss if a type
cast in the constant definition is only allowed after typedefing
uint32_t to something else like in VBox.

So my preferred way is to leave the patch as it is. The code is simple
and clear and objectively seen it won't get any better with your taste
applied. If Eric prefers, I can update it to use 0xU, though.

The 0xU notation has the benefit of being explicit, whereas
the ((uint32_t)~0) notation, taken from the VirtualBox source, is
somewhat magical for a reader who does not perform an automatic
((uint32_t)~0) == 0xU conversion in his head. Consequently,
the 0xU notation might a better choice, if it's not too much
bother for you to amend the patch.

I'll amend it with this change:

diff --git a/block/vdi.c b/block/vdi.c
index 25790c4..523a640 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -115,10 +115,10 @@ void uuid_unparse(const uuid_t uu, char *out);
  #define VDI_TEXT   QEMU VM Virtual Disk Image\n

  /* A never-allocated block; semantically arbitrary content. */
-#define VDI_UNALLOCATED ((uint32_t)~0)
+#define VDI_UNALLOCATED 0xU

  /* A discarded (no longer allocated) block; semantically zero-filled. */
-#define VDI_DISCARDED ((uint32_t)~1)
+#define VDI_DISCARDED   0xfffeU

  #define VDI_IS_ALLOCATED(X) ((X)  VDI_DISCARDED)


Thanks for this update. With your patch added to Eric's, I can add

Acked-by: Stefan Weil s...@weilnetz.de

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)

Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 12:07 PM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 9:59 AM, Markus Armbruster arm...@redhat.com 
 wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:

 On Fri, Oct 28, 2011 at 8:40 AM, Markus Armbruster arm...@redhat.com 
 wrote:
 Stefan Hajnoczi stefa...@gmail.com writes:
[...]
 I would replace assert(3) with abort(3).  If this ever happens the
 program is broken - returning 0 instead of an undefined value doesn't
 help.

 Why is it useful to make failed assertions stop the program regardless
 of NDEBUG only when the assertion happens to be can't reach?

 My point is that it should not be an assertion.  The program has a
 control flow path which should never be taken.  In any safe
 programming environment the program will terminate with a diagnostic.

 In the not-so-safe C programming environment, we can disable that safety
 check by defining NDEBUG.

 Disabling safety checks is almost always a bad idea.

 There's a difference in a safety check that slows down the system and
 a failure condition where the program must terminate.

 assert(3) is for safety checks that can be disabled because they may
 slow down the system.

 I've been saying that assert(3) isn't appropriate here because the
 program can't continue and there's no check we can skip here.

 a. Program can't continue: same for pretty much any assertion anywhere.

 b. There's no code we can skip here: calling abort() is code.

 What I've been saying is

 1. Attempting to distinguish between safety checks that are safe to skip
 and failure conditions that aren't is a fool's errand.  If a check is
 safe to skip it's not a safety check by definition.  It's debugging
 code, perhaps.

 2. Optionally disabling expensive safety checks should be done based
 on measurements, if at all.  Arbitrarily declaring all can't reach
 checks inexpensive and all other assertions expensive isn't
 measuring, it's guesswork.

 I'm tempted to continue the thread but at the end of the day we just
 need to make the function compile with -NDEBUG.  Using abort(3) or

Do we?

Anyone crazy^Wadventurous enough to compile with -DNDEBUG should be
capable of configure --disable-werror.

 qemu_abort() would be the smallest and IMO sanest change, but if it's
 something else that's fine.

[Qemu-devel] [PATCH] dma: Avoid reentrancy in DMA transfer handlers

With the conversion of the block layer to coroutines, bdrv_read/write
have changed to run a nested event loop that calls qemu_bh_poll.
Consequently a scheduled BH can be called while a DMA transfer handler
runs and this means that DMA_run becomes reentrant.

Devices haven't been designed to cope with that, so instead of running a
nested transfer handler just wait for the next invocation of the BH from the
main loop.

This fixes some problems with the floppy device.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/dma.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/hw/dma.c b/hw/dma.c
index 8a7302a..e8d6341 100644
--- a/hw/dma.c
+++ b/hw/dma.c
@@ -358,6 +358,13 @@ static void DMA_run (void)
 struct dma_cont *d;
 int icont, ichan;
 int rearm = 0;
+static int running = 0;
+
+if (running) {
+goto out;
+} else {
+running = 1;
+}
 
 d = dma_controllers;
 
@@ -374,6 +381,8 @@ static void DMA_run (void)
 }
 }
 
+out:
+running = 0;
 if (rearm)
 qemu_bh_schedule_idle(dma_bh);
 }
-- 
1.7.6.4

[Qemu-devel] [PATCH] Fix segfault after migration completes

2011-10-28 Thread Luiz Capitulino

To reproduce:

1. Start the source VM with:

 # qemu [...] -S

2. Start the destination VM with:

 # qemu source VM cmd-line -incoming tcp:0:

3. In the source VM:

  (qemu) migrate -d tcp:0:

3. The source VM will segfault as soon as migration completes (might not
   happen in the first try)

Here's the backtrace:

#0  0x00516f39 in qemu_file_get_error (f=0x0) at 
/home/lcapitulino/src/qmp-unstable/savevm.c:431
431 return f-last_error;

#0  0x00516f39 in qemu_file_get_error (f=0x0) at 
/home/lcapitulino/src/qmp-unstable/savevm.c:431
#1  0x004e7a9a in migrate_fd_put_notify (opaque=0x987640) at 
/home/lcapitulino/src/qmp-unstable/migration.c:255
#2  0x0046d59a in qemu_iohandler_poll (readfds=0x7fff45ccfe50, 
writefds=0x7fff45ccfdd0, xfds=0x7fff45ccfd50, ret=1)
at /home/lcapitulino/src/qmp-unstable/iohandler.c:124
#3  0x004e6033 in main_loop_wait (nonblocking=0) at 
/home/lcapitulino/src/qmp-unstable/main-loop.c:463
#4  0x004db5b0 in main_loop () at 
/home/lcapitulino/src/qmp-unstable/vl.c:1478
#5  0x004dffed in main (argc=16, argv=0x7fff45cd0318, 
envp=0x7fff45cd03a0) at /home/lcapitulino/src/qmp-unstable/vl.c:3449

So, 's-file' is NULL in migrate_fd_put_notify(). The interesting thing
is that it's valid in the qemu_file_put_notify() call, which makes me
think that either: there's a race somewhere or qemu_file_put_notify() is
itself clearing 's-file'. In both cases the fix below could just be hiding
the real issue, but let's get started...

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 migration.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index bdca72e..f6e6208 100644
--- a/migration.c
+++ b/migration.c
@@ -252,7 +252,7 @@ static void migrate_fd_put_notify(void *opaque)
 
 qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
 qemu_file_put_notify(s-file);
-if (qemu_file_get_error(s-file)) {
+if (s-file  qemu_file_get_error(s-file)) {
 migrate_fd_error(s);
 }
 }
-- 
1.7.7.1.488.ge8e1c.dirty

[Qemu-devel] [PATCH] qapi: fix typos in documentation JSON examples

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 docs/qapi-code-gen.txt |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index f345866..c0a9325 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -41,7 +41,7 @@ dictionary.  This corresponds to a struct in C or an Object 
in JSON.  An
 example of a complex type is:
 
  { 'type': 'MyType',
-   'data' { 'member1': 'str', 'member2': 'int', '*member3': 'str } }
+   'data': { 'member1': 'str', 'member2': 'int', '*member3': 'str' } }
 
 The use of '*' as a prefix to the name means the member is optional.  Optional
 members should always be added to the end of the dictionary to preserve
@@ -63,7 +63,7 @@ An example command is:
 
  { 'command': 'my-command',
'data': { 'arg1': 'str', '*arg2': 'str' },
-   'returns': 'str' ]
+   'returns': 'str' }
 
 Command names should be all lower case with words separated by a hyphen.
 
-- 
1.7.7

[Qemu-devel] [PATCH] acl: Fix use after free in qemu_acl_reset()

Reproducer:

$ MALLOC_PERTURB_=234 qemu-system-x86_64 -vnc :0,acl,sasl [...]
QEMU 0.15.50 monitor - type 'help' for more information
(qemu) acl_add vnc.username fred allow
acl: added rule at position 1
(qemu) acl_reset vnc.username
Segmentation fault (core dumped)

Spotted by Coverity.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 acl.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/acl.c b/acl.c
index 0654f38..e840b9b 100644
--- a/acl.c
+++ b/acl.c
@@ -95,13 +95,13 @@ int qemu_acl_party_is_allowed(qemu_acl *acl,
 
 void qemu_acl_reset(qemu_acl *acl)
 {
-qemu_acl_entry *entry;
+qemu_acl_entry *entry, *next_entry;
 
 /* Put back to deny by default, so there is no window
  * of open access while the user re-initializes the
  * access control list */
 acl-defaultDeny = 1;
-QTAILQ_FOREACH(entry, acl-entries, next) {
+QTAILQ_FOREACH_SAFE(entry, acl-entries, next, next_entry) {
 QTAILQ_REMOVE(acl-entries, entry, next);
 free(entry-match);
 free(entry);
-- 
1.7.6.4

[Qemu-devel] [PATCH V3 00/10] Xen PCI Passthrough

Hi all,

This patch series introduces the PCI passthrough for Xen.

First, we have HostPCIDevice that help to access one PCI device of the host.

Then, there is an additions in the QEMU code, pci_check_bar_overlap.

There are also several change in pci_ids and pci_regs.

Last part, but not least, the PCI passthrough device himself. Cut in 3 parts
(or file), there is one to take care of the initialisation of a passthrough
device. The second one handle everything about the config address space, there
are specifics functions for every config register. The third one is to handle
MSI.

There is a patch series on xen-devel that add the support of setting a PCI
passthrough device through QMP from libxl (xen tool stack). It is just a call
to device_add, with the driver parametter hostaddr=:00:1b.0.

Change since v2;
  - in host-pci-device.c:
- Return more usefull error code in get_ressource().
- Use macro in host_pci_find_ext_cap_offset instead of raw number. But I
  still not sure if PCI_MAX_EXT_CAP is right, it's result is 480 like it
  was before, so it's maybe ok.
  - All use of MSI stuff in two first pci passthrough patch have been removed
and move to the last patch.


Change v1-v2:
  - fix style issue (checkpatch.pl)
  - set the original authors, add some missing copyright headers
  - HostPCIDevice:
- introduce HostPCIIORegions (with base_addr, size, flags)
- save all flags from ./resource and store it in a separate field.
- fix endianess on write
- new host_pci_dev_put function
- use pci.c like interface host_pci_get/set_byte/word/long (instead of
  host_pci_read/write_)
  - compile HostPCIDevice only on linux (as well as xen_pci_passthrough)
  - introduce apic-msidef.h file.
  - no more run_one_timer, if a pci device is in the middle of a power
transition, just return an error in config read/write
  - use a global var mapped_machine_irq (local to xen_pci_passthrough.c)
  - add msitranslate and power-mgmt ad qdev property


Allen Kay (2):
  Introduce Xen PCI Passthrough, qdevice (1/3)
  Introduce Xen PCI Passthrough, PCI config space helpers (2/3)

Anthony PERARD (6):
  configure: Introduce --enable-xen-pci-passthrough.
  Introduce HostPCIDevice to access a pci device on the host.
  pci_ids: Add INTEL_82599_VF id.
  pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.
  pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE
  Introduce apic-msidef.h

Jiang Yunhong (1):
  Introduce Xen PCI Passthrough, MSI (3/3)

Yuji Shimada (1):
  pci.c: Add pci_check_bar_overlap

 Makefile.target  |7 +
 configure|   25 +
 hw/apic-msidef.h |   30 +
 hw/apic.c|   11 +-
 hw/host-pci-device.c |  252 
 hw/host-pci-device.h |   75 +
 hw/pci.c |   47 +
 hw/pci.h |3 +
 hw/pci_ids.h |1 +
 hw/pci_regs.h|3 +-
 hw/xen_pci_passthrough.c |  861 
 hw/xen_pci_passthrough.h |  280 
 hw/xen_pci_passthrough_config_init.c | 2553 ++
 hw/xen_pci_passthrough_helpers.c |   46 +
 hw/xen_pci_passthrough_msi.c |  667 +
 15 files changed, 4850 insertions(+), 11 deletions(-)
 create mode 100644 hw/apic-msidef.h
 create mode 100644 hw/host-pci-device.c
 create mode 100644 hw/host-pci-device.h
 create mode 100644 hw/xen_pci_passthrough.c
 create mode 100644 hw/xen_pci_passthrough.h
 create mode 100644 hw/xen_pci_passthrough_config_init.c
 create mode 100644 hw/xen_pci_passthrough_helpers.c
 create mode 100644 hw/xen_pci_passthrough_msi.c

-- 
Anthony PERARD

[Qemu-devel] [PATCH V3 01/10] configure: Introduce --enable-xen-pci-passthrough.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 Makefile.target |2 ++
 configure   |   25 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index fe5f6f7..867d687 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -215,6 +215,8 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 
 obj-i386-$(CONFIG_XEN) += xen_platform.o
 
+# Xen PCI Passthrough
+
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
 ifeq ($(CONFIG_KVM), y)
diff --git a/configure b/configure
index 4f87e0a..301ab44 100755
--- a/configure
+++ b/configure
@@ -127,6 +127,7 @@ vnc_png=
 vnc_thread=no
 xen=
 xen_ctrl_version=
+xen_pci_passthrough=
 linux_aio=
 attr=
 xfs=
@@ -641,6 +642,10 @@ for opt do
   ;;
   --enable-xen) xen=yes
   ;;
+  --disable-xen-pci-passthrough) xen_pci_passthrough=no
+  ;;
+  --enable-xen-pci-passthrough) xen_pci_passthrough=yes
+  ;;
   --disable-brlapi) brlapi=no
   ;;
   --enable-brlapi) brlapi=yes
@@ -979,6 +984,8 @@ echo(affects only QEMU, not 
qemu-img)
 echo   --enable-mixemu  enable mixer emulation
 echo   --disable-xendisable xen backend driver support
 echo   --enable-xen enable xen backend driver support
+echo   --disable-xen-pci-passthrough
+echo   --enable-xen-pci-passthrough
 echo   --disable-brlapi disable BrlAPI
 echo   --enable-brlapi  enable BrlAPI
 echo   --disable-vnc-tlsdisable TLS encryption for VNC server
@@ -1342,6 +1349,21 @@ EOF
   fi
 fi
 
+if test $xen_pci_passthrough != no; then
+  if test $xen = yes  test $linux = yes; then
+xen_pci_passthrough=yes
+  else
+if test $xen_pci_passthrough = yes; then
+  echo ERROR
+  echo ERROR: User requested feature Xen PCI Passthrough
+  echo ERROR: but this feature require /sys from Linux
+  echo ERROR
+  exit 1;
+fi
+xen_pci_passthrough=no
+  fi
+fi
+
 ##
 # pkg-config probe
 
@@ -3398,6 +3420,9 @@ case $target_arch2 in
 if test $xen = yes -a $target_softmmu = yes ; then
   target_phys_bits=64
   echo CONFIG_XEN=y  $config_target_mak
+  if test $xen_pci_passthrough = yes; then
+echo CONFIG_XEN_PCI_PASSTHROUGH=y  $config_target_mak
+  fi
 else
   echo CONFIG_NO_XEN=y  $config_target_mak
 fi
-- 
Anthony PERARD

[Qemu-devel] [PATCH V3 10/10] Introduce Xen PCI Passthrough, MSI (3/3)

From: Jiang Yunhong yunhong.ji...@intel.com

Signed-off-by: Jiang Yunhong yunhong.ji...@intel.com
Signed-off-by: Shan Haitao haitao.s...@intel.com
Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 Makefile.target  |1 +
 hw/apic-msidef.h |2 +
 hw/xen_pci_passthrough.c |   27 ++-
 hw/xen_pci_passthrough.h |   55 +++
 hw/xen_pci_passthrough_config_init.c |  495 +-
 hw/xen_pci_passthrough_msi.c |  667 ++
 6 files changed, 1240 insertions(+), 7 deletions(-)
 create mode 100644 hw/xen_pci_passthrough_msi.c

diff --git a/Makefile.target b/Makefile.target
index c32c688..17b8857 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -220,6 +220,7 @@ obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += host-pci-device.o
 obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough.o
 obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_helpers.o
 obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_config_init.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_msi.o
 
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
diff --git a/hw/apic-msidef.h b/hw/apic-msidef.h
index 3182f0b..6e2eb71 100644
--- a/hw/apic-msidef.h
+++ b/hw/apic-msidef.h
@@ -22,6 +22,8 @@
 
 #define MSI_ADDR_DEST_MODE_SHIFT2
 
+#define MSI_ADDR_REDIRECTION_SHIFT  3
+
 #define MSI_ADDR_DEST_ID_SHIFT  12
 #define  MSI_ADDR_DEST_ID_MASK  0x000
 
diff --git a/hw/xen_pci_passthrough.c b/hw/xen_pci_passthrough.c
index b97c5b6..4b9eb74 100644
--- a/hw/xen_pci_passthrough.c
+++ b/hw/xen_pci_passthrough.c
@@ -417,6 +417,7 @@ static void pt_iomem_map(XenPCIPassthroughState *s, int i,
 }
 
 if (!first_map  old_ebase != -1) {
+pt_add_msix_mapping(s, i);
 /* Remove old mapping */
 ret = xc_domain_memory_mapping(xen_xc, xen_domid,
old_ebase  XC_PAGE_SHIFT,
@@ -441,6 +442,15 @@ static void pt_iomem_map(XenPCIPassthroughState *s, int i,
 if (ret != 0) {
 PT_LOG(Error: create new mapping failed!\n);
 }
+
+ret = pt_remove_msix_mapping(s, i);
+if (ret != 0) {
+PT_LOG(Error: remove MSI-X mmio mapping failed!\n);
+}
+
+if (old_ebase != e_phys  old_ebase != -1) {
+pt_msix_update_remap(s, i);
+}
 }
 }
 
@@ -737,6 +747,9 @@ static int pt_initfn(PCIDevice *pcidev)
 mapped_machine_irq[machine_irq]++;
 }
 
+/* setup MSI-INTx translation if support */
+rc = pt_enable_msi_translate(s);
+
 /* bind machine_irq to device */
 if (rc  0  machine_irq != 0) {
 uint8_t e_device = PCI_SLOT(s-dev.devfn);
@@ -765,7 +778,8 @@ static int pt_initfn(PCIDevice *pcidev)
 
 out:
 PT_LOG(Real physical device %02x:%02x.%x registered successfuly!\n
-   IRQ type = %s\n, bus, slot, func, INTx);
+   IRQ type = %s\n, bus, slot, func,
+   s-msi_trans_en ? MSI-INTx : INTx);
 
 return 0;
 }
@@ -782,7 +796,7 @@ static int pt_unregister_device(PCIDevice *pcidev)
 e_intx = pci_intx(s);
 machine_irq = s-machine_irq;
 
-if (machine_irq) {
+if (s-msi_trans_en == 0  machine_irq) {
 rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
  PT_IRQ_TYPE_PCI, 0, e_device, e_intx, 0);
 if (rc  0) {
@@ -790,6 +804,13 @@ static int pt_unregister_device(PCIDevice *pcidev)
 }
 }
 
+if (s-msi) {
+pt_msi_disable(s);
+}
+if (s-msix) {
+pt_msix_disable(s);
+}
+
 if (machine_irq) {
 mapped_machine_irq[machine_irq]--;
 
@@ -824,6 +845,8 @@ static PCIDeviceInfo xen_pci_passthrough = {
 .is_express = 0,
 .qdev.props = (Property[]) {
 DEFINE_PROP_STRING(hostaddr, XenPCIPassthroughState, hostaddr),
+DEFINE_PROP_BIT(msitranslate, XenPCIPassthroughState, msi_trans_cap,
+0, true),
 DEFINE_PROP_BIT(power-mgmt, XenPCIPassthroughState, power_mgmt,
 0, false),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/xen_pci_passthrough.h b/hw/xen_pci_passthrough.h
index ebc04fd..5f404b0 100644
--- a/hw/xen_pci_passthrough.h
+++ b/hw/xen_pci_passthrough.h
@@ -63,6 +63,10 @@ typedef int (*conf_byte_restore)
 
 #define PT_BAR_ALLF0x  /* BAR ALLF value */
 
+/* MSI-X */
+#define PT_MSI_FLAG_UNINIT 0x1000
+#define PT_MSI_FLAG_MAPPED 0x2000
+
 
 typedef enum {
 GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */
@@ -166,6 +170,34 @@ typedef struct XenPTRegGroup {
 } XenPTRegGroup;
 
 
+typedef struct XenPTMSI {
+uint32_t flags;
+uint32_t ctrl_offset; /* saved control offset */
+int pirq;  /* guest pirq corresponding */
+uint32_t addr_lo;  /* guest message address */
+uint32_t addr_hi;  /* guest message upper address */
+uint16_t data; /* guest

Re: [Qemu-devel] [PATCH] Fix segfault after migration completes


On 10/28/2011 04:58 PM, Luiz Capitulino wrote:

To reproduce:

1. Start the source VM with:

  # qemu [...] -S

2. Start the destination VM with:

  # qemusource VM cmd-line  -incoming tcp:0:

3. In the source VM:

   (qemu) migrate -d tcp:0:

3. The source VM will segfault as soon as migration completes (might not
happen in the first try)

Here's the backtrace:

#0  0x00516f39 in qemu_file_get_error (f=0x0) at 
/home/lcapitulino/src/qmp-unstable/savevm.c:431
431 return f-last_error;

#0  0x00516f39 in qemu_file_get_error (f=0x0) at 
/home/lcapitulino/src/qmp-unstable/savevm.c:431
#1  0x004e7a9a in migrate_fd_put_notify (opaque=0x987640) at 
/home/lcapitulino/src/qmp-unstable/migration.c:255
#2  0x0046d59a in qemu_iohandler_poll (readfds=0x7fff45ccfe50, 
writefds=0x7fff45ccfdd0, xfds=0x7fff45ccfd50, ret=1)
 at /home/lcapitulino/src/qmp-unstable/iohandler.c:124
#3  0x004e6033 in main_loop_wait (nonblocking=0) at 
/home/lcapitulino/src/qmp-unstable/main-loop.c:463
#4  0x004db5b0 in main_loop () at 
/home/lcapitulino/src/qmp-unstable/vl.c:1478
#5  0x004dffed in main (argc=16, argv=0x7fff45cd0318, 
envp=0x7fff45cd03a0) at /home/lcapitulino/src/qmp-unstable/vl.c:3449

So, 's-file' is NULL in migrate_fd_put_notify(). The interesting thing
is that it's valid in the qemu_file_put_notify() call, which makes me
think that either: there's a race somewhere or qemu_file_put_notify() is
itself clearing 's-file'. In both cases the fix below could just be hiding
the real issue, but let's get started...

Signed-off-by: Luiz Capitulinolcapitul...@redhat.com
---
  migration.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index bdca72e..f6e6208 100644
--- a/migration.c
+++ b/migration.c
@@ -252,7 +252,7 @@ static void migrate_fd_put_notify(void *opaque)

  qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
  qemu_file_put_notify(s-file);
-if (qemu_file_get_error(s-file)) {
+if (s-file  qemu_file_get_error(s-file)) {
  migrate_fd_error(s);
  }
  }


Just one comment, it would be good to mention in the commit message the 
call chain.  The one that Eduardo had tracked offlist looks indeed 
correct to me:


select loop - migrate_fd_put_notify() - qemu_file_put_notify() - 
buffered_put_buffer() - migrate_fd_put_ready() - 
migrate_fd_completed() - migrate_fd_cleanup().


Anyway, code-wise:

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Paolo

[Qemu-devel] [PATCH V3 05/10] pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.

Value check in PCI Express Base Specification rev 1.1

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pci_regs.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index e8357c3..6b42515 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -393,7 +393,7 @@
 #define  PCI_EXP_TYPE_DOWNSTREAM 0x6   /* Downstream Port */
 #define  PCI_EXP_TYPE_PCI_BRIDGE 0x7   /* PCI/PCI-X Bridge */
 #define  PCI_EXP_TYPE_RC_END   0x9 /* Root Complex Integrated Endpoint */
-#define  PCI_EXP_TYPE_RC_EC0x10/* Root Complex Event Collector */
+#define  PCI_EXP_TYPE_RC_EC 0xa /* Root Complex Event Collector */
 #define PCI_EXP_FLAGS_SLOT 0x0100  /* Slot implemented */
 #define PCI_EXP_FLAGS_IRQ  0x3e00  /* Interrupt message number */
 #define PCI_EXP_DEVCAP 4   /* Device capabilities */
-- 
Anthony PERARD

[Qemu-devel] [PATCH V3 07/10] Introduce Xen PCI Passthrough, qdevice (1/3)

From: Allen Kay allen.m@intel.com

Signed-off-by: Allen Kay allen.m@intel.com
Signed-off-by: Guy Zana g...@neocleus.com
Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 Makefile.target  |2 +
 hw/xen_pci_passthrough.c |  838 ++
 hw/xen_pci_passthrough.h |  223 ++
 hw/xen_pci_passthrough_helpers.c |   46 ++
 4 files changed, 1109 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_pci_passthrough.c
 create mode 100644 hw/xen_pci_passthrough.h
 create mode 100644 hw/xen_pci_passthrough_helpers.c

diff --git a/Makefile.target b/Makefile.target
index 243f9f2..36ea47d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -217,6 +217,8 @@ obj-i386-$(CONFIG_XEN) += xen_platform.o
 
 # Xen PCI Passthrough
 obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += host-pci-device.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_helpers.o
 
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
diff --git a/hw/xen_pci_passthrough.c b/hw/xen_pci_passthrough.c
new file mode 100644
index 000..b97c5b6
--- /dev/null
+++ b/hw/xen_pci_passthrough.c
@@ -0,0 +1,838 @@
+/*
+ * Copyright (c) 2007, Neocleus Corporation.
+ * Copyright (c) 2007, Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Alex Novik a...@neocleus.com
+ * Allen Kay allen.m@intel.com
+ * Guy Zana g...@neocleus.com
+ *
+ * This file implements direct PCI assignment to a HVM guest
+ */
+
+/*
+ * Interrupt Disable policy:
+ *
+ * INTx interrupt:
+ *   Initialize(register_real_device)
+ * Map INTx(xc_physdev_map_pirq):
+ *   fail
+ * - Set real Interrupt Disable bit to '1'.
+ * - Set machine_irq and assigned_device-machine_irq to '0'.
+ * * Don't bind INTx.
+ *
+ * Bind INTx(xc_domain_bind_pt_pci_irq):
+ *   fail
+ * - Set real Interrupt Disable bit to '1'.
+ * - Unmap INTx.
+ * - Decrement mapped_machine_irq[machine_irq]
+ * - Set assigned_device-machine_irq to '0'.
+ *
+ *   Write to Interrupt Disable bit by guest software(pt_cmd_reg_write)
+ * Write '0'
+ *   ptdev-msi_trans_en is false
+ * - Set real bit to '0' if assigned_device-machine_irq isn't '0'.
+ *
+ * Write '1'
+ *   ptdev-msi_trans_en is false
+ * - Set real bit to '1'.
+ *
+ * MSI-INTx translation.
+ *   Initialize(xc_physdev_map_pirq_msi/pt_msi_setup)
+ * Bind MSI-INTx(xc_domain_bind_pt_irq)
+ *   fail
+ * - Unmap MSI.
+ *   success
+ * - Set dev-msi-pirq to '-1'.
+ *   fail
+ * - Do nothing.
+ *
+ *   Write to Interrupt Disable bit by guest software(pt_cmd_reg_write)
+ * Write '0'
+ *   ptdev-msi_trans_en is true
+ * - Set MSI Enable bit to '1'.
+ *
+ * Write '1'
+ *   ptdev-msi_trans_en is true
+ * - Set MSI Enable bit to '0'.
+ *
+ * MSI interrupt:
+ *   Initialize MSI register(pt_msi_setup, pt_msi_update)
+ * Bind MSI(xc_domain_update_msi_irq)
+ *   fail
+ * - Unmap MSI.
+ * - Set dev-msi-pirq to '-1'.
+ *
+ * MSI-X interrupt:
+ *   Initialize MSI-X register(pt_msix_update_one)
+ * Bind MSI-X(xc_domain_update_msi_irq)
+ *   fail
+ * - Unmap MSI-X.
+ * - Set entry-pirq to '-1'.
+ */
+
+#include sys/ioctl.h
+
+#include pci.h
+#include xen.h
+#include xen_backend.h
+#include xen_pci_passthrough.h
+
+#define PCI_BAR_ENTRIES (6)
+
+#define PT_NR_IRQS  (256)
+char mapped_machine_irq[PT_NR_IRQS] = {0};
+
+/* Config Space */
+static int pt_pci_config_access_check(PCIDevice *d, uint32_t address, int len)
+{
+/* check offset range */
+if (address = 0xFF) {
+PT_LOG(Error: Failed to access register with offset exceeding FFh. 
+   [%02x:%02x.%x][Offset:%02xh][Length:%d]\n,
+   pci_bus_num(d-bus), PCI_SLOT(d-devfn), PCI_FUNC(d-devfn),
+   address, len);
+return -1;
+}
+
+/* check read size */
+if ((len != 1)  (len != 2)  (len != 4)) {
+PT_LOG(Error: Failed to access register with invalid access length. 
+   [%02x:%02x.%x][Offset:%02xh][Length:%d]\n,
+   pci_bus_num(d-bus), PCI_SLOT(d-devfn), PCI_FUNC(d-devfn),
+   address, len);
+return -1;
+}
+
+/* check offset alignment */
+if (address  (len - 1)) {
+PT_LOG(Error: Failed to access register with invalid access size 
+alignment. [%02x:%02x.%x][Offset:%02xh][Length:%d]\n,
+pci_bus_num(d-bus), PCI_SLOT(d-devfn), PCI_FUNC(d-devfn),
+address, len);
+return -1;
+}
+
+return 0;
+}
+
+int pt_bar_offset_to_index(uint32_t offset)
+{
+int index = 0;
+
+/* check Exp ROM BAR */
+if (offset == PCI_ROM_ADDRESS) {
+return PCI_ROM_SLOT;
+}
+
+

Re: [Qemu-devel] [PATCH] Fix segfault after migration completes

2011-10-28 Thread Eduardo Habkost

On Fri, Oct 28, 2011 at 12:58:04PM -0200, Luiz Capitulino wrote:
[...]
 
 So, 's-file' is NULL in migrate_fd_put_notify(). The interesting thing
 is that it's valid in the qemu_file_put_notify() call, which makes me
 think that either: there's a race somewhere or qemu_file_put_notify() is
 itself clearing 's-file'. In both cases the fix below could just be hiding
 the real issue, but let's get started...
 
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com

Acked-by: Eduardo Habkost ehabk...@redhat.com

However, it looks like the error-check interface of QEMUFile is really
easy to misuse, and can be improved:

- Either errors are always triggered synchronously inside
  qemu_file_put_notify(), or they can be triggered asynchronously
  elsewhere too.
- If they are always triggered synchronously during the
  qemu_file_put_notify() call, then qemu_file_put_notify() should return
  error information itself instead of requiring a qemu_file_get_error()
  call.
- If errors can be triggered asynchronously, then we need an error
  notification mechanism that makes sure no error is ever missed,
  instead of this error check on migrate_fd_put_notify().

 ---
  migration.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/migration.c b/migration.c
 index bdca72e..f6e6208 100644
 --- a/migration.c
 +++ b/migration.c
 @@ -252,7 +252,7 @@ static void migrate_fd_put_notify(void *opaque)
  
  qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
  qemu_file_put_notify(s-file);
 -if (qemu_file_get_error(s-file)) {
 +if (s-file  qemu_file_get_error(s-file)) {
  migrate_fd_error(s);
  }
  }
 -- 
 1.7.7.1.488.ge8e1c.dirty
 

-- 
Eduardo

[Qemu-devel] [PATCH V3 09/10] Introduce apic-msidef.h

This patch move the msi definition from apic.c to apic-msidef.h. So it can be
used also by other .c files.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/apic-msidef.h |   28 
 hw/apic.c|   11 +--
 2 files changed, 29 insertions(+), 10 deletions(-)
 create mode 100644 hw/apic-msidef.h

diff --git a/hw/apic-msidef.h b/hw/apic-msidef.h
new file mode 100644
index 000..3182f0b
--- /dev/null
+++ b/hw/apic-msidef.h
@@ -0,0 +1,28 @@
+#ifndef HW_APIC_MSIDEF_H
+#define HW_APIC_MSIDEF_H
+
+/*
+ * Intel APIC constants: from include/asm/msidef.h
+ */
+
+/*
+ * Shifts for MSI data
+ */
+
+#define MSI_DATA_VECTOR_SHIFT   0
+#define  MSI_DATA_VECTOR_MASK   0x00ff
+
+#define MSI_DATA_DELIVERY_MODE_SHIFT8
+#define MSI_DATA_LEVEL_SHIFT14
+#define MSI_DATA_TRIGGER_SHIFT  15
+
+/*
+ * Shift/mask fields for msi address
+ */
+
+#define MSI_ADDR_DEST_MODE_SHIFT2
+
+#define MSI_ADDR_DEST_ID_SHIFT  12
+#define  MSI_ADDR_DEST_ID_MASK  0x000
+
+#endif /* HW_APIC_MSIDEF_H */
diff --git a/hw/apic.c b/hw/apic.c
index 8289eef..18c4a87 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,6 +24,7 @@
 #include sysbus.h
 #include trace.h
 #include pc.h
+#include apic-msidef.h
 
 /* APIC Local Vector Table */
 #define APIC_LVT_TIMER   0
@@ -65,16 +66,6 @@
 #define MAX_APICS 255
 #define MAX_APIC_WORDS 8
 
-/* Intel APIC constants: from include/asm/msidef.h */
-#define MSI_DATA_VECTOR_SHIFT  0
-#define MSI_DATA_VECTOR_MASK   0x00ff
-#define MSI_DATA_DELIVERY_MODE_SHIFT   8
-#define MSI_DATA_TRIGGER_SHIFT 15
-#define MSI_DATA_LEVEL_SHIFT   14
-#define MSI_ADDR_DEST_MODE_SHIFT   2
-#define MSI_ADDR_DEST_ID_SHIFT 12
-#defineMSI_ADDR_DEST_ID_MASK   0x000
-
 #define MSI_ADDR_SIZE   0x10
 
 typedef struct APICState APICState;
-- 
Anthony PERARD

[Qemu-devel] [PATCH V3 02/10] Introduce HostPCIDevice to access a pci device on the host.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 Makefile.target  |1 +
 hw/host-pci-device.c |  252 ++
 hw/host-pci-device.h |   75 +++
 3 files changed, 328 insertions(+), 0 deletions(-)
 create mode 100644 hw/host-pci-device.c
 create mode 100644 hw/host-pci-device.h

diff --git a/Makefile.target b/Makefile.target
index 867d687..243f9f2 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -216,6 +216,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 obj-i386-$(CONFIG_XEN) += xen_platform.o
 
 # Xen PCI Passthrough
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += host-pci-device.o
 
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c
new file mode 100644
index 000..5eafc49
--- /dev/null
+++ b/hw/host-pci-device.c
@@ -0,0 +1,252 @@
+/*
+ * Copyright (C) 2011   Citrix Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include qemu-common.h
+#include host-pci-device.h
+
+#define PCI_MAX_EXT_CAP \
+((PCIE_CONFIG_SPACE_SIZE - PCI_CONFIG_SPACE_SIZE) / (PCI_CAP_SIZEOF + 4))
+
+enum error_code {
+ERROR_SYNTAX = 1,
+};
+
+static int path_to(const HostPCIDevice *d,
+   const char *name, char *buf, ssize_t size)
+{
+return snprintf(buf, size, /sys/bus/pci/devices/%04x:%02x:%02x.%x/%s,
+d-domain, d-bus, d-dev, d-func, name);
+}
+
+static int get_resource(HostPCIDevice *d)
+{
+int i, rc = 0;
+FILE *f;
+char path[PATH_MAX];
+unsigned long long start, end, flags, size;
+
+path_to(d, resource, path, sizeof (path));
+f = fopen(path, r);
+if (!f) {
+fprintf(stderr, Error: Can't open %s: %s\n, path, strerror(errno));
+return -errno;
+}
+
+for (i = 0; i  PCI_NUM_REGIONS; i++) {
+if (fscanf(f, %llx %llx %llx, start, end, flags) != 3) {
+fprintf(stderr, Error: Syntax error in %s\n, path);
+rc = ERROR_SYNTAX;
+break;
+}
+if (start) {
+size = end - start + 1;
+} else {
+size = 0;
+}
+
+if (i  PCI_ROM_SLOT) {
+d-io_regions[i].base_addr = start;
+d-io_regions[i].size = size;
+d-io_regions[i].flags = flags;
+} else {
+d-rom.base_addr = start;
+d-rom.size = size;
+d-rom.flags = flags;
+}
+}
+
+fclose(f);
+return rc;
+}
+
+static unsigned long get_value(HostPCIDevice *d, const char *name)
+{
+char path[PATH_MAX];
+FILE *f;
+unsigned long value;
+
+path_to(d, name, path, sizeof (path));
+f = fopen(path, r);
+if (!f) {
+fprintf(stderr, Error: Can't open %s: %s\n, path, strerror(errno));
+return -1;
+}
+if (fscanf(f, %lx\n, value) != 1) {
+fprintf(stderr, Error: Syntax error in %s\n, path);
+value = -1;
+}
+fclose(f);
+return value;
+}
+
+static int pci_dev_is_virtfn(HostPCIDevice *d)
+{
+int rc;
+char path[PATH_MAX];
+struct stat buf;
+
+path_to(d, physfn, path, sizeof (path));
+rc = !stat(path, buf);
+
+return rc;
+}
+
+static int host_pci_config_fd(HostPCIDevice *d)
+{
+char path[PATH_MAX];
+
+if (d-config_fd  0) {
+path_to(d, config, path, sizeof (path));
+d-config_fd = open(path, O_RDWR);
+if (d-config_fd  0) {
+fprintf(stderr, HostPCIDevice: Can not open '%s': %s\n,
+path, strerror(errno));
+}
+}
+return d-config_fd;
+}
+static int host_pci_config_read(HostPCIDevice *d, int pos, void *buf, int len)
+{
+int fd = host_pci_config_fd(d);
+int res = 0;
+
+res = pread(fd, buf, len, pos);
+if (res  0) {
+fprintf(stderr, host_pci_config: read failed: %s (fd: %i)\n,
+strerror(errno), fd);
+return -1;
+}
+return res;
+}
+static int host_pci_config_write(HostPCIDevice *d,
+ int pos, const void *buf, int len)
+{
+int fd = host_pci_config_fd(d);
+int res = 0;
+
+res = pwrite(fd, buf, len, pos);
+if (res  0) {
+fprintf(stderr, host_pci_config: write failed: %s\n,
+strerror(errno));
+return -1;
+}
+return res;
+}
+
+uint8_t host_pci_get_byte(HostPCIDevice *d, int pos)
+{
+  uint8_t buf;
+  host_pci_config_read(d, pos, buf, 1);
+  return buf;
+}
+uint16_t host_pci_get_word(HostPCIDevice *d, int pos)
+{
+  uint16_t buf;
+  host_pci_config_read(d, pos, buf, 2);
+  return le16_to_cpu(buf);
+}
+uint32_t host_pci_get_long(HostPCIDevice *d, int pos)
+{
+  uint32_t buf;
+  host_pci_config_read(d, pos, buf, 4);
+  return le32_to_cpu(buf);
+}
+int host_pci_get_block(HostPCIDevice *d, int pos, uint8_t *buf, int len)
+{
+  return host_pci_config_read(d, pos, buf, len);
+}
+
+int host_pci_set_byte(HostPCIDevice *d, int

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2


On 10/28/2011 02:31 PM, Stefan Hajnoczi wrote:

Zhi Yong: We were just talking about converting devices to aio.  If
you have time to do that for fdc, sd, or any other synchronous API
users in hw/ that would be helpful.  Please let us know which device
you are refactoring so we don't duplicate work.


The problem is not really fdc or sd themselves, but whoever uses the 
result of the synchronous reads---respectively DMA and the SD clients.


Some SD clients talk to the SD card in a relatively confined way and 
have interrupts that they set when the operation is done, so these 
confined parts that talk to the card could be changed to a coroutine and 
locked with a CoMutex.  However, not even all of these can do it (in 
particular I'm not sure about ssi-sd.c cannot).


I'm thinking that the problem with the floppy is really that it mixes 
synchronous and asynchronous parts.  As long as you're entirely 
synchronous you should not have any problem, but as soon as you add 
asynchronicity (via bottom halves) you now have to deal with reentrancy.


git grep _bh hw/ suggests that this should not be a huge problem; most 
if not all occurrences are related to ptimers, or are in entirely 
asynchronous code (IDE, SCSI, virtio).  Floppy+DMA seems to be the only 
problematic occurrence, and any fix (switch to timers, drop idle BH in 
qemu_aio_wait, reschedule if DMA reenters during I/O, drop BH completely 
and just loop) is as good as the others.


(Actually, another one worth checking is ATAPI, but I don't know the 
code and the standards well enough).


Paolo

[Qemu-devel] [PATCH V3 04/10] pci_ids: Add INTEL_82599_VF id.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pci_ids.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 83f3893..2ea5ec2 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -117,6 +117,7 @@
 #define PCI_DEVICE_ID_INTEL_82801I_UHCI6 0x2939
 #define PCI_DEVICE_ID_INTEL_82801I_EHCI1 0x293a
 #define PCI_DEVICE_ID_INTEL_82801I_EHCI2 0x293c
+#define PCI_DEVICE_ID_INTEL_82599_VF 0x10ed
 
 #define PCI_VENDOR_ID_XEN   0x5853
 #define PCI_DEVICE_ID_XEN_PLATFORM  0x0001
-- 
Anthony PERARD

[Qemu-devel] [PATCH] xen_disk: remove dead code

Xen_disk.c has support for using synchronous I/O instead of asynchronous,
but it is compiled out by default.  Remove it.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/xen_disk.c |   86 +---
 1 files changed, 2 insertions(+), 84 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index db86395..f5c3fec 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -52,7 +52,6 @@ static int syncwrite= 0;
 static int batch_maps   = 0;
 
 static int max_requests = 32;
-static int use_aio  = 1;
 
 /* - */
 
@@ -317,76 +316,6 @@ static int ioreq_map(struct ioreq *ioreq)
 return 0;
 }
 
-static int ioreq_runio_qemu_sync(struct ioreq *ioreq)
-{
-struct XenBlkDev *blkdev = ioreq-blkdev;
-int i, rc;
-off_t pos;
-
-if (ioreq-req.nr_segments  ioreq_map(ioreq) == -1) {
-goto err_no_map;
-}
-if (ioreq-presync) {
-bdrv_flush(blkdev-bs);
-}
-
-switch (ioreq-req.operation) {
-case BLKIF_OP_READ:
-pos = ioreq-start;
-for (i = 0; i  ioreq-v.niov; i++) {
-rc = bdrv_read(blkdev-bs, pos / BLOCK_SIZE,
-   ioreq-v.iov[i].iov_base,
-   ioreq-v.iov[i].iov_len / BLOCK_SIZE);
-if (rc != 0) {
-xen_be_printf(blkdev-xendev, 0, rd I/O error (%p, len 
%zd)\n,
-  ioreq-v.iov[i].iov_base,
-  ioreq-v.iov[i].iov_len);
-goto err;
-}
-pos += ioreq-v.iov[i].iov_len;
-}
-break;
-case BLKIF_OP_WRITE:
-case BLKIF_OP_WRITE_BARRIER:
-if (!ioreq-req.nr_segments) {
-break;
-}
-pos = ioreq-start;
-for (i = 0; i  ioreq-v.niov; i++) {
-rc = bdrv_write(blkdev-bs, pos / BLOCK_SIZE,
-ioreq-v.iov[i].iov_base,
-ioreq-v.iov[i].iov_len / BLOCK_SIZE);
-if (rc != 0) {
-xen_be_printf(blkdev-xendev, 0, wr I/O error (%p, len 
%zd)\n,
-  ioreq-v.iov[i].iov_base,
-  ioreq-v.iov[i].iov_len);
-goto err;
-}
-pos += ioreq-v.iov[i].iov_len;
-}
-break;
-default:
-/* unknown operation (shouldn't happen -- parse catches this) */
-goto err;
-}
-
-if (ioreq-postsync) {
-bdrv_flush(blkdev-bs);
-}
-ioreq-status = BLKIF_RSP_OKAY;
-
-ioreq_unmap(ioreq);
-ioreq_finish(ioreq);
-return 0;
-
-err:
-ioreq_unmap(ioreq);
-err_no_map:
-ioreq_finish(ioreq);
-ioreq-status = BLKIF_RSP_ERROR;
-return -1;
-}
-
 static void qemu_aio_complete(void *opaque, int ret)
 {
 struct ioreq *ioreq = opaque;
@@ -557,9 +486,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 rp = blkdev-rings.common.sring-req_prod;
 xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
-if (use_aio) {
-blk_send_response_all(blkdev);
-}
+blk_send_response_all(blkdev);
 while (rc != rp) {
 /* pull request from ring */
 if (RING_REQUEST_CONS_OVERFLOW(blkdev-rings.common, rc)) {
@@ -582,16 +509,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 continue;
 }
 
-if (use_aio) {
-/* run i/o in aio mode */
-ioreq_runio_qemu_aio(ioreq);
-} else {
-/* run i/o in sync mode */
-ioreq_runio_qemu_sync(ioreq);
-}
-}
-if (!use_aio) {
-blk_send_response_all(blkdev);
+ioreq_runio_qemu_aio(ioreq);
 }
 
 if (blkdev-more_work  blkdev-requests_inflight  max_requests) {
-- 
1.7.6.4

[Qemu-devel] [PATCH V3 03/10] pci.c: Add pci_check_bar_overlap

From: Yuji Shimada shimada-...@necst.nec.co.jp

This function help Xen PCI Passthrough device to check for overlap.

Signed-off-by: Yuji Shimada shimada-...@necst.nec.co.jp
Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pci.c |   47 +++
 hw/pci.h |3 +++
 2 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index e8cc1b0..9f65216 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -2120,3 +2120,50 @@ MemoryRegion *pci_address_space_io(PCIDevice *dev)
 {
 return dev-bus-address_space_io;
 }
+
+int pci_check_bar_overlap(PCIDevice *dev,
+  pcibus_t addr, pcibus_t size, uint8_t type)
+{
+PCIBus *bus = dev-bus;
+PCIDevice *devices = NULL;
+PCIIORegion *r;
+int i, j;
+int rc = 0;
+
+/* check Overlapped to Base Address */
+for (i = 0; i  ARRAY_SIZE(bus-devices); i++) {
+devices = bus-devices[i];
+if (!devices) {
+continue;
+}
+
+/* skip itself */
+if (devices-devfn == dev-devfn) {
+continue;
+}
+
+for (j = 0; j  PCI_NUM_REGIONS; j++) {
+r = devices-io_regions[j];
+
+/* skip different resource type, but don't skip when
+ * prefetch and non-prefetch memory are compared.
+ */
+if (type != r-type) {
+if (type == PCI_BASE_ADDRESS_SPACE_IO ||
+r-type == PCI_BASE_ADDRESS_SPACE_IO) {
+continue;
+}
+}
+
+if ((addr  (r-addr + r-size))  ((addr + size)  r-addr)) {
+printf(Overlapped to device[%02x:%02x.%x][Region:%d]
+   [Address:%PRIx64h][Size:%PRIx64h]\n,
+   pci_bus_num(bus), PCI_SLOT(devices-devfn),
+   PCI_FUNC(devices-devfn), j, r-addr, r-size);
+rc = 1;
+}
+}
+}
+
+return rc;
+}
diff --git a/hw/pci.h b/hw/pci.h
index 86a81c8..0e1a07d 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -487,4 +487,7 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
 return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : PCI_CONFIG_SPACE_SIZE;
 }
 
+int pci_check_bar_overlap(PCIDevice *dev,
+  pcibus_t addr, pcibus_t size, uint8_t type);
+
 #endif
-- 
Anthony PERARD

[Qemu-devel] [PATCH V3 06/10] pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pci_regs.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index 6b42515..56a404b 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -392,6 +392,7 @@
 #define  PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
 #define  PCI_EXP_TYPE_DOWNSTREAM 0x6   /* Downstream Port */
 #define  PCI_EXP_TYPE_PCI_BRIDGE 0x7   /* PCI/PCI-X Bridge */
+#define  PCI_EXP_TYPE_PCIE_BRIDGE 0x8   /* PCI/PCI-X to PCIE Bridge */
 #define  PCI_EXP_TYPE_RC_END   0x9 /* Root Complex Integrated Endpoint */
 #define  PCI_EXP_TYPE_RC_EC 0xa /* Root Complex Event Collector */
 #define PCI_EXP_FLAGS_SLOT 0x0100  /* Slot implemented */
-- 
Anthony PERARD

Re: [Qemu-devel] ARM hosts: code_gen_alloc() maps code buffer on top of libc heap


On 10/28/2011 04:32 PM, Peter Maydell wrote:


I suspect we need to drop the MAP_FIXED requirement and
fix the TCG code to cope with emitting code for longer-range
branches for calls to host fns etc (calls/branches within the
generated code should be ok to keep using the short-range
branch insn I think). There is already no guarantee that
the generated code and the host C code are within short
branch range of each other...


Does USE_STATIC_CODE_GEN_BUFFER fix it?  Do you know why Currently it 
is not recommended to allocate big chunks of data in user mode?


Paolo

Re: [Qemu-devel] ARM hosts: code_gen_alloc() maps code buffer on top of libc heap

On 28 October 2011 17:14, Paolo Bonzini pbonz...@redhat.com wrote:
 On 10/28/2011 04:32 PM, Peter Maydell wrote:

 I suspect we need to drop the MAP_FIXED requirement and
 fix the TCG code to cope with emitting code for longer-range
 branches for calls to host fns etc (calls/branches within the
 generated code should be ok to keep using the short-range
 branch insn I think). There is already no guarantee that
 the generated code and the host C code are within short
 branch range of each other...

 Does USE_STATIC_CODE_GEN_BUFFER fix it?

Well, it avoids the malloc abort. We seem to get stuck in an
infinite cycle of the guest trying to start grub and rebooting
before it manages to print the blue-background grub menu screen,
though, so something's still not right. Likely a different bug,
though.

 Do you know why Currently it is
 not recommended to allocate big chunks of data in user mode?

Don't know, but my guess is that it is or was likely to clash
with attempts to put the guest binary in the right place in
the address space.

-- PMM

Re: [Qemu-devel] [PATCH] dma: Avoid reentrancy in DMA transfer handlers


On 10/28/2011 04:18 PM, Kevin Wolf wrote:

With the conversion of the block layer to coroutines, bdrv_read/write
have changed to run a nested event loop that calls qemu_bh_poll.
Consequently a scheduled BH can be called while a DMA transfer handler
runs and this means that DMA_run becomes reentrant.

Devices haven't been designed to cope with that, so instead of running a
nested transfer handler just wait for the next invocation of the BH from the
main loop.

This fixes some problems with the floppy device.

Signed-off-by: Kevin Wolfkw...@redhat.com
---
  hw/dma.c |9 +
  1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/hw/dma.c b/hw/dma.c
index 8a7302a..e8d6341 100644
--- a/hw/dma.c
+++ b/hw/dma.c
@@ -358,6 +358,13 @@ static void DMA_run (void)
  struct dma_cont *d;
  int icont, ichan;
  int rearm = 0;
+static int running = 0;
+
+if (running) {
+goto out;
+} else {
+running = 1;
+}

  d = dma_controllers;

@@ -374,6 +381,8 @@ static void DMA_run (void)
  }
  }

+out:
+running = 0;
  if (rearm)
  qemu_bh_schedule_idle(dma_bh);
  }


Hmm, I think you should set rearm = 1 to ensure the BH is run when 
ultimately you leave the sync read.  Sorry for not spotting this before.


Paolo

Re: [Qemu-devel] [PATCH] Fix compiler warning (always return a value)


On 10/28/2011 04:39 PM, Markus Armbruster wrote:


  I'm tempted to continue the thread but at the end of the day we just
  need to make the function compile with -NDEBUG.  Using abort(3) or

Do we?

Anyone crazy^Wadventurous enough to compile with -DNDEBUG should be
capable of configure --disable-werror.


Also, I'm not really sure of the benefit of NDEBUG.  For something like 
QEMU, hardware emulation is not going to be your bottleneck and for KVM 
you badly want those asserts to protect against guest-to-host 
escalation.  If you have a really expensive check, you should wrap it 
inside #ifdef DEBUG.


And if you use NDEBUG because you want to live on the edge and ignore 
possible problems, remember there's an assert (p != NULL) hidden 
before any pointer dereference, and you cannot disable that with NDEBUG. :)


Paolo

[Qemu-devel] [PATCH v2] Fix segfault on migration completion

2011-10-28 Thread Luiz Capitulino

A simple migration reproduces it:

1. Start the source VM with:

   # qemu [...] -S

2. Start the destination VM with:

   # qemu source VM cmd-line -incoming tcp:0:

3. In the source VM:

   (qemu) migrate -d tcp:0:

4. The source VM will segfault as soon as migration completes (might not
   happen in the first try)

What is happening here is that qemu_file_put_notify() can end up closing
's-file' (in which case it's also set to NULL). The call stack is rather
complex, but Eduardo helped tracking it to:

select loop - migrate_fd_put_notify() - qemu_file_put_notify() -
buffered_put_buffer() - migrate_fd_put_ready() -
migrate_fd_completed() - migrate_fd_cleanup().

To be honest, it's not completely clear to me in which cases 's-file'
is not closed (on error maybe)? But I doubt this fix will make anything
worse.

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Acked-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---

V2: better commit log

 migration.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index bdca72e..f6e6208 100644
--- a/migration.c
+++ b/migration.c
@@ -252,7 +252,7 @@ static void migrate_fd_put_notify(void *opaque)
 
 qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
 qemu_file_put_notify(s-file);
-if (qemu_file_get_error(s-file)) {
+if (s-file  qemu_file_get_error(s-file)) {
 migrate_fd_error(s);
 }
 }
-- 
1.7.7.1.488.ge8e1c.dirty

Re: [Qemu-devel] [PATCH 5/8] scsi-disk: add scsi-block for device passthrough

Am 25.10.2011 12:53, schrieb Paolo Bonzini:
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/scsi-disk.c |  118 
 
  1 files changed, 118 insertions(+), 0 deletions(-)

This has lost its commit message since the last version. I'll pick it
from there:

scsi-block is a new device that supports device passthrough of Linux
block devices (i.e. /dev/sda, not /dev/sg0).  It uses SG_IO for commands
other than I/O commands, and regular AIO read/writes for I/O commands.
Besides being simpler to configure (no mapping required to scsi-generic
device names), this removes the need for a large bounce buffer and,
in the future, will get scatter/gather support for free from scsi-disk.

Kevin

Re: [Qemu-devel] [PATCH 6/8] block: add eject request callback