date:20101216

Re: [Qemu-devel] Re: [Spice-devel] RFC; usb redirection protocol

2010-12-16 Thread Gerd Hoffmann


  Hi,


I know of devices that will enumerate twice, first as one device, then
after a certain setup exchange as another. But that seems to be covered
by the suggestion here, it will just be identicle to two completely different
transports.


Or identify devices by physical port instead of bus address or 
vendor/device id.  Having that would be good for usb passthrough too I 
guess.


cheers,
  Gerd

[Qemu-devel] Re: [Spice-devel] RFC; usb redirection protocol

2010-12-16 Thread Gerd Hoffmann


  Hi,


It could be that the implementation decides to not handle synchroneous
commands synchroneous at all. The main difference I'm trying to make
here is between commands which do things to end points which cannot
be done while other packets are in flight (like setting configuration,
or interface alt setting) and commands (normal packets) of which there
can be multiple in flight. I'm open to using a different term then
synchroneous and async here.


Name them control / data packets?

cheers,
  Gerd

Re: [Qemu-devel] Re: [PATCH 00/15] Megasas HBA emulation and SCSI update v.3

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 1:45 AM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
 On Mon, 2010-12-13 at 08:32 +0100, Hannes Reinecke wrote:
 On 12/10/2010 11:14 PM, Paolo Bonzini wrote:
  On 11/24/2010 05:50 PM, Christoph Hellwig wrote:
  Btw, it might make sense to split this series into two.
 
  Patches 1 to 11 are genuine improvements to the SCSI code, which I'd
  like to see merged ASAP.  The rest is the actual megasas driver, which
  I still want to see, but haven't even gotten to review yet.
 
  Ping for patches 1 to 11?
 
  Paolo

 The first few already have been merged by Kevin Wolf; I'll see to
 prepare an updated patchset.

 Actually, I was about to ask as I'd like to base some new work of mine
 on top of these. I don't see any recent commit from Kevin in the qemu
 master branch (nor in any other branch on
 http://git.savannah.gnu.org/cgit/qemu.git/log/).

 Does Kevin maintain a separate staging tree ?

http://repo.or.cz/w/qemu/kevin.git/shortlog/refs/heads/block

Stefan

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-16 Thread Michael S. Tsirkin

On Thu, Dec 16, 2010 at 04:08:16PM +0900, Isaku Yamahata wrote:
 On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
  On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
   On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
  
  I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
  assigned device, so I'm pretty sure we're not going to hurt 
  migration,
  but the code is clearly wrong and I'd like to make sure we don't 
  trip on
  a migration failure for a minor device config space change.
 
 Which reminds me: maybe just mark nested bridges as non-migrateable
 for now?  Care writing such a patch?

Hmm, this is trickier than it sounds.
   
   Hmm, since 0 is put in the path instead of the bridge number,
   will the correct bridge be restored?
   
 We're really only broken wrt
migration if a device under a bridge calls qemu_ram_alloc.
   
   I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?
  
  You're right, it's more broken than that.  Anything that calls
  get_dev_path is broken for migration of bridges since the path is
  determined before the guest updates bus numbers.  That includes
  qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
  side.  So perhaps the right answer, for the moment, is to block
  migration if there's a p2p bridge.
 
 Eww. That's bad. Anyway, I agree to disable it for the moment.

Patch?

 Let's fix it after 0.14 release.
 -- 
 yamahata

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Avi Kivity


On 12/15/2010 08:00 PM, Luiz Capitulino wrote:

  
Looks like a GUI feature to me,

  Really?  Can't see how you can build NMI to all CPUs from NMI this
  CPU.  Or am I misunderstanding you?

I guess so. Avi referred to 'nmi button on many machines', I assumed he
meant a virtual machine GUI, am I wrong?


I meant a real machine's GUI (it's a physical button you can press with 
your finger, if you have thin fingers).


--
error compiling committee.c: too many arguments to function

[Qemu-devel] Re: [PATCH 11/21] ioport: insert event_tap_ioport() to ioport_write().

2010-12-16 Thread Michael S. Tsirkin

On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
 2010/11/28 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
  On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
  Record ioport event to replay it upon failover.
 
  Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 
  Interesting. This will have to be extended to support ioeventfd.
  Since each eventfd is really just a binary trigger
  it should be enough to read out the fd state.
 
  Haven't thought about eventfd yet.  Will try doing it in the next
  spin.
 
 Hi Michael,
 
 I looked into eventfd and realized it's only used with vhost now.

There are patches on list to use it for block/userspace net.

  However, I
 believe vhost bypass the net layer in qemu, and there is no way for Kemari to
 detect the outputs.  To me, it doesn't make sense to extend this patch to
 support eventfd...
 
 Thanks,
 
 Yoshi
 
 
  Yoshi
 
 
  ---
   ioport.c |    2 ++
   1 files changed, 2 insertions(+), 0 deletions(-)
 
  diff --git a/ioport.c b/ioport.c
  index aa4188a..74aebf5 100644
  --- a/ioport.c
  +++ b/ioport.c
  @@ -27,6 +27,7 @@
 
   #include ioport.h
   #include trace.h
  +#include event-tap.h
 
   /***/
   /* IO Port */
  @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, 
  uint32_t data)
           default_ioport_writel
       };
       IOPortWriteFunc *func = ioport_write_table[index][address];
  +    event_tap_ioport(index, address, data);
       if (!func)
           func = default_func[index];
       func(ioport_opaque[address], address, data);
  --
  1.7.1.2
 
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Wed, 15 Dec 2010 18:45:09 +0100
 Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Wed, 15 Dec 2010 19:18:32 +0200
  Avi Kivity a...@redhat.com wrote:
[...]
  I'd like to see cpu-index made optional; if not present, nmi all cpus 
  (that's what the nmi button on many machines does, or at least I think 
  that's what it does).
 
  Looks like a GUI feature to me,
 
 Really?  Can't see how you can build NMI to all CPUs from NMI this
 CPU.  Or am I misunderstanding you?

 I guess so. Avi referred to 'nmi button on many machines', I assumed he
 meant a virtual machine GUI, am I wrong?

  _might_ turn out to be an undesirable
  side effect to client writers.
 
 They seem to be coping fine with optional arguments elsewhere.

 Which we might want to review.

 I guess I prefer a to-all-cpus argument.
 
 How would that look like?  cpu-index: all?

 Like this:

 { execute: inject-nmi, arguments: { to-all-cpus: true } }

 But this looks like an optimization to me, because it's also easy to do:

 for cpu in query-cpus; do
   inject-nmi cpu

 Unless we want to do this in an atomic way, due to side effects I'm
 not aware about.

I'd expect a physical NMI button to interrupt all CPUs simultaneously
(modulo wire length).

[...]

[Qemu-devel] Re: [PATCH 11/21] ioport: insert event_tap_ioport() to ioport_write().

2010-12-16 Thread Yoshiaki Tamura

2010/12/16 Michael S. Tsirkin m...@redhat.com:
 On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
 2010/11/28 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
  On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
  Record ioport event to replay it upon failover.
 
  Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 
  Interesting. This will have to be extended to support ioeventfd.
  Since each eventfd is really just a binary trigger
  it should be enough to read out the fd state.
 
  Haven't thought about eventfd yet.  Will try doing it in the next
  spin.

 Hi Michael,

 I looked into eventfd and realized it's only used with vhost now.

 There are patches on list to use it for block/userspace net.

Thanks.  Now I understand.
In that case, inserting an even-tap function to the following code
should be appropriate?

int event_notifier_test_and_clear(EventNotifier *e)
{
uint64_t value;
int r = read(e-fd, value, sizeof(value));
return r == sizeof(value);
}


  However, I
 believe vhost bypass the net layer in qemu, and there is no way for Kemari to
 detect the outputs.  To me, it doesn't make sense to extend this patch to
 support eventfd...

 Thanks,

 Yoshi

 
  Yoshi
 
 
  ---
   ioport.c |    2 ++
   1 files changed, 2 insertions(+), 0 deletions(-)
 
  diff --git a/ioport.c b/ioport.c
  index aa4188a..74aebf5 100644
  --- a/ioport.c
  +++ b/ioport.c
  @@ -27,6 +27,7 @@
 
   #include ioport.h
   #include trace.h
  +#include event-tap.h
 
   /***/
   /* IO Port */
  @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, 
  uint32_t data)
           default_ioport_writel
       };
       IOPortWriteFunc *func = ioport_write_table[index][address];
  +    event_tap_ioport(index, address, data);
       if (!func)
           func = default_func[index];
       func(ioport_opaque[address], address, data);
  --
  1.7.1.2
 
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

2010-12-16 Thread Michael S. Tsirkin

On Thu, Dec 16, 2010 at 04:36:16PM +0900, Yoshiaki Tamura wrote:
 2010/12/3 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
  2010/12/2 Michael S. Tsirkin m...@redhat.com:
  On Wed, Dec 01, 2010 at 05:03:43PM +0900, Yoshiaki Tamura wrote:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
   On Sun, Nov 28, 2010 at 08:27:58PM +0900, Yoshiaki Tamura wrote:
   2010/11/28 Michael S. Tsirkin m...@redhat.com:
On Thu, Nov 25, 2010 at 03:06:44PM +0900, Yoshiaki Tamura wrote:
Modify inuse type to uint16_t, let save/load to handle, and revert
last_avail_idx with inuse if there are outstanding emulation.
   
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
   
This changes migration format, so it will break compatibility with
existing drivers. More generally, I think migrating internal
state that is not guest visible is always a mistake
as it ties migration format to an internal implementation
(yes, I know we do this sometimes, but we should at least
try not to add such cases).  I think the right thing to do in this 
case
is to flush outstanding
work when vm is stopped.  Then, we are guaranteed that inuse is 0.
I sent patches that do this for virtio net and block.
  
   Could you give me the link of your patches?  I'd like to test
   whether they work with Kemari upon failover.  If they do, I'm
   happy to drop this patch.
  
   Yoshi
  
   Look for this:
   stable migration image on a stopped vm
   sent on:
   Wed, 24 Nov 2010 17:52:49 +0200
 
  Thanks for the info.
 
  However, The patch series above didn't solve the issue.  In
  case of Kemari, inuse is mostly  0 because it queues the
  output, and while last_avail_idx gets incremented
  immediately, not sending inuse makes the state inconsistent
  between Primary and Secondary.
 
  Hmm. Can we simply avoid incrementing last_avail_idx?
 
  I think we can calculate or prepare an internal last_avail_idx,
  and update the external when inuse is decremented.  I'll try
  whether it work w/ w/o Kemari.
 
 Hi Michael,
 
 Could you please take a look at the following patch?

Which version is this against?

 commit 36ee7910059e6b236fe9467a609f5b4aed866912
 Author: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 Date:   Thu Dec 16 14:50:54 2010 +0900
 
 virtio: update last_avail_idx when inuse is decreased.
 
 Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

It would be better to have a commit description explaining why a change
is made, and why it is correct, not just repeating what can be seen from
the diff anyway.

 diff --git a/hw/virtio.c b/hw/virtio.c
 index c8a0fc6..6688c02 100644
 --- a/hw/virtio.c
 +++ b/hw/virtio.c
 @@ -237,6 +237,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
  wmb();
  trace_virtqueue_flush(vq, count);
  vring_used_idx_increment(vq, count);
 +vq-last_avail_idx += count;
  vq-inuse -= count;
  }
 
 @@ -385,7 +386,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
  unsigned int i, head, max;
  target_phys_addr_t desc_pa = vq-vring.desc;
 
 -if (!virtqueue_num_heads(vq, vq-last_avail_idx))
 +if (!virtqueue_num_heads(vq, vq-last_avail_idx + vq-inuse))
  return 0;
 
  /* When we start there are none of either input nor output. */
 @@ -393,7 +394,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
  max = vq-vring.num;
 
 -i = head = virtqueue_get_head(vq, vq-last_avail_idx++);
 +i = head = virtqueue_get_head(vq, vq-last_avail_idx + vq-inuse);
 
  if (vring_desc_flags(desc_pa, i)  VRING_DESC_F_INDIRECT) {
  if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
 

Hmm, will virtio_queue_empty be wrong now? What about virtqueue_avail_bytes?
Previous patch version sure looked simpler, and this seems functionally
equivalent, so my question still stands: here it is rephrased in a
different way:

assume that we have in avail ring 2 requests at start of ring: A and B 
in this order

host pops A, then B, then completes B and flushes

now with this patch last_avail_idx will be 1, and then
remote will get it, it will execute B again. As a result
B will complete twice, and apparently A will never complete.


This is what I was saying below: assuming that there are
outstanding requests when we migrate, there is no way
a single index can be enough to figure out which requests
need to be handled and which are in flight already.

We must add some kind of bitmask to tell us which is which.

 
 
   I'm wondering why
  last_avail_idx is OK to send but not inuse.
 
  last_avail_idx is at some level a mistake, it exposes part of
  our internal implementation, but it does *also* express
  a guest observable state.
 
  Here's the problem that it solves: just looking at the rings in virtio
  there is no way to detect that a specific request has already been
  completed. And the protocol forbids completing the same request twice.
 
  Our

[Qemu-devel] Re: [PATCH 11/21] ioport: insert event_tap_ioport() to ioport_write().

2010-12-16 Thread Michael S. Tsirkin

On Thu, Dec 16, 2010 at 06:50:04PM +0900, Yoshiaki Tamura wrote:
 2010/12/16 Michael S. Tsirkin m...@redhat.com:
  On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
  2010/11/28 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
   2010/11/28 Michael S. Tsirkin m...@redhat.com:
   On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
   Record ioport event to replay it upon failover.
  
   Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
  
   Interesting. This will have to be extended to support ioeventfd.
   Since each eventfd is really just a binary trigger
   it should be enough to read out the fd state.
  
   Haven't thought about eventfd yet.  Will try doing it in the next
   spin.
 
  Hi Michael,
 
  I looked into eventfd and realized it's only used with vhost now.
 
  There are patches on list to use it for block/userspace net.
 
 Thanks.  Now I understand.
 In that case, inserting an even-tap function to the following code
 should be appropriate?
 
 int event_notifier_test_and_clear(EventNotifier *e)
 {
 uint64_t value;
 int r = read(e-fd, value, sizeof(value));
 return r == sizeof(value);
 }

Possibly.

 
   However, I
  believe vhost bypass the net layer in qemu, and there is no way for Kemari 
  to
  detect the outputs.


Then maybe you should check for this combination and either disable
vhost-net on the backend when kemari is active or fail.

   To me, it doesn't make sense to extend this patch to
  support eventfd...
  Thanks,
 
  Yoshi
 
  
   Yoshi
  
  
   ---
    ioport.c |    2 ++
    1 files changed, 2 insertions(+), 0 deletions(-)
  
   diff --git a/ioport.c b/ioport.c
   index aa4188a..74aebf5 100644
   --- a/ioport.c
   +++ b/ioport.c
   @@ -27,6 +27,7 @@
  
    #include ioport.h
    #include trace.h
   +#include event-tap.h
  
    /***/
    /* IO Port */
   @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t 
   address, uint32_t data)
            default_ioport_writel
        };
        IOPortWriteFunc *func = ioport_write_table[index][address];
   +    event_tap_ioport(index, address, data);
        if (!func)
            func = default_func[index];
        func(ioport_opaque[address], address, data);
   --
   1.7.1.2
  
   --
   To unsubscribe from this list: send the line unsubscribe kvm in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
   --
   To unsubscribe from this list: send the line unsubscribe kvm in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
  
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH 1/2] Introduce strtosz_suffix()

2010-12-16 Thread Markus Armbruster

Sorry for the late review.

jes.soren...@redhat.com writes:

 From: Jes Sorensen jes.soren...@redhat.com

 This introduces strtosz_suffix() which allows the caller to specify a
 default suffix in case the non default of MB is wanted.

 strtosz() is kept as a wrapper for strtosz_suffix() which keeps it's
 current default of MB.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  cutils.c  |   17 ++---
  qemu-common.h |7 +++
  2 files changed, 21 insertions(+), 3 deletions(-)

 diff --git a/cutils.c b/cutils.c
 index 28089aa..7984bc1 100644
 --- a/cutils.c
 +++ b/cutils.c
 @@ -291,10 +291,10 @@ int fcntl_setfl(int fd, int flag)
   * value must be terminated by whitespace, ',' or '\0'. Return -1 on
   * error.
   */
 -ssize_t strtosz(const char *nptr, char **end)
 +ssize_t strtosz_suffix(const char *nptr, char **end, const char 
 default_suffix)

Last parameter's const is of marginal uility.  Matter of taste.

  {
  ssize_t retval = -1;
 -char *endptr, c;
 +char *endptr, c, d;
  int mul_required = 0;
  double val, mul, integral, fraction;
  
 @@ -313,10 +313,16 @@ ssize_t strtosz(const char *nptr, char **end)
   * part of a multi token argument.
   */
  c = *endptr;
 +d = c;
  if (isspace(c) || c == '\0' || c == ',') {
  c = 0;
 +if (default_suffix) {
 +d = default_suffix;
 +} else {
 +d = c;
 +}

The else clause d = c is effectively d = 0 (due to prior c = 0),
which is the same as d = default_suffix (due to else's condition).
Thus, you can replace the conditional by a simple

d = default_suffix;

  }
 -switch (c) {
 +switch (d) {
  case 'B':
  case 'b':
  mul = 1;
 @@ -371,3 +377,8 @@ fail:
  
  return retval;
  }
 +
 +ssize_t strtosz(const char *nptr, char **end)
 +{
 +return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 +}
 diff --git a/qemu-common.h b/qemu-common.h
 index de82c2e..1ed32e5 100644
 --- a/qemu-common.h
 +++ b/qemu-common.h
 @@ -149,7 +149,14 @@ time_t mktimegm(struct tm *tm);
  int qemu_fls(int i);
  int qemu_fdatasync(int fd);
  int fcntl_setfl(int fd, int flag);
 +
 +#define STRTOSZ_DEFSUFFIX_TB 'T'
 +#define STRTOSZ_DEFSUFFIX_GB 'G'
 +#define STRTOSZ_DEFSUFFIX_MB 'M'
 +#define STRTOSZ_DEFSUFFIX_KB 'K'
 +#define STRTOSZ_DEFSUFFIX_B  'B'

I don't see what these defines buy us over the literals, but if it makes
you or other reviewers happier...

  ssize_t strtosz(const char *nptr, char **end);
 +ssize_t strtosz_suffix(const char *nptr, char **end, const char 
 default_suffix);
  
  /* path.c */
  void init_paths(const char *prefix);

Re: [Qemu-devel] [PATCH 0/1] introduce spice-qemu-char chardev

2010-12-16 Thread Markus Armbruster

Alon Levy al...@redhat.com writes:

 Adding a chardev backend for spice, for usage by spice vdagent over a
 with virtio-serial device.

Please put this into PATCH 1/1, so it gets committed.  Thanks.

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Avi Kivity


On 12/16/2010 12:48 PM, Luiz Capitulino wrote:

Ok, I didn't know that, but I had another idea: the command could accept
either a single cpu index or a list:


   { execute: inject-nmi, arguments: { cpus: 2 } }

   { execute: inject-nmi, arguments: { cpus: [1, 2, 3, 4] } }

This has the feature of injecting the nmi in just some cpus, although I'm
not sure this is going to be desired/useful.

If we agree on this we'll have to wait because the monitor doesn't currently
support hybrid arguments.


I hope it never does.  They're hard to support in old-school statically 
typed languages.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Luiz Capitulino

On Thu, 16 Dec 2010 11:03:38 +0200
Avi Kivity a...@redhat.com wrote:

 On 12/15/2010 08:00 PM, Luiz Capitulino wrote:

  Looks like a GUI feature to me,
  
Really?  Can't see how you can build NMI to all CPUs from NMI this
CPU.  Or am I misunderstanding you?
 
  I guess so. Avi referred to 'nmi button on many machines', I assumed he
  meant a virtual machine GUI, am I wrong?
 
 I meant a real machine's GUI (it's a physical button you can press with 
 your finger, if you have thin fingers).

Ok, I didn't know that, but I had another idea: the command could accept
either a single cpu index or a list:


  { execute: inject-nmi, arguments: { cpus: 2 } }

  { execute: inject-nmi, arguments: { cpus: [1, 2, 3, 4] } }

This has the feature of injecting the nmi in just some cpus, although I'm
not sure this is going to be desired/useful.

If we agree on this we'll have to wait because the monitor doesn't currently
support hybrid arguments.

[Qemu-devel] [PATCH 1/3] qemu-img.c: Re-factor img_create()

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

This patch re-factors img_create() moving the code doing the actual
work into block.c where it can be shared with QEMU. This is needed to
be able to create images from QEMU to be used for live snapshots.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block.c|  144 
 block.h|4 ++
 qemu-img.c |  108 +
 3 files changed, 150 insertions(+), 106 deletions(-)

diff --git a/block.c b/block.c
index b4aaf41..765f9f3 100644
--- a/block.c
+++ b/block.c
@@ -2758,3 +2758,147 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
 {
 return bs-dirty_count;
 }
+
+int bdrv_img_create(const char *filename, const char *fmt,
+const char *base_filename, const char *base_fmt,
+char *options, uint64_t img_size, int flags)
+{
+QEMUOptionParameter *param = NULL, *create_options = NULL;
+QEMUOptionParameter *backing_fmt;
+BlockDriverState *bs = NULL;
+BlockDriver *drv, *proto_drv;
+int ret = 0;
+
+/* Find driver and parse its options */
+drv = bdrv_find_format(fmt);
+if (!drv) {
+error_report(Unknown file format '%s', fmt);
+ret = -1;
+goto out;
+}
+
+proto_drv = bdrv_find_protocol(filename);
+if (!proto_drv) {
+error_report(Unknown protocol '%s', filename);
+ret = -1;
+goto out;
+}
+
+create_options = append_option_parameters(create_options,
+  drv-create_options);
+create_options = append_option_parameters(create_options,
+  proto_drv-create_options);
+
+/* Create parameter list with default values */
+param = parse_option_parameters(, create_options, param);
+
+set_option_parameter_int(param, BLOCK_OPT_SIZE, img_size);
+
+/* Parse -o options */
+if (options) {
+param = parse_option_parameters(options, create_options, param);
+if (param == NULL) {
+error_report(Invalid options for file format '%s'., fmt);
+ret = -1;
+goto out;
+}
+}
+
+if (base_filename) {
+if (set_option_parameter(param, BLOCK_OPT_BACKING_FILE,
+ base_filename)) {
+error_report(Backing file not supported for file format '%s',
+ fmt);
+ret = -1;
+goto out;
+}
+}
+
+backing_fmt = get_option_parameter(param, BLOCK_OPT_BACKING_FMT);
+if (backing_fmt  backing_fmt-value.s) {
+if (!bdrv_find_format(backing_fmt-value.s)) {
+error_report(Unknown backing file format '%s',
+ backing_fmt-value.s);
+ret = -1;
+goto out;
+}
+}
+
+if (base_fmt) {
+if (set_option_parameter(param, BLOCK_OPT_BACKING_FMT, base_fmt)) {
+error_report(Backing file format not supported for file 
+ format '%s', fmt);
+ret = -1;
+goto out;
+}
+}
+
+// The size for the image must always be specified, with one exception:
+// If we are using a backing file, we can obtain the size from there
+if (get_option_parameter(param, BLOCK_OPT_SIZE)-value.n == -1) {
+QEMUOptionParameter *backing_file =
+get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
+
+if (backing_file  backing_file-value.s) {
+uint64_t size;
+const char *fmt = NULL;
+char buf[32];
+
+if (backing_fmt  backing_fmt-value.s) {
+fmt = backing_fmt-value.s;
+}
+
+bs = bdrv_new();
+if (!bs) {
+error_report(Not enough memory to allocate BlockDriverState);
+ret = -1;
+goto out;
+}
+ret = bdrv_open(bs, backing_file-value.s, flags, drv);
+if (ret  0) {
+error_report(Could not open '%s', filename);
+ret = -1;
+goto out;
+}
+bdrv_get_geometry(bs, size);
+size *= 512;
+
+snprintf(buf, sizeof(buf), % PRId64, size);
+set_option_parameter(param, BLOCK_OPT_SIZE, buf);
+} else {
+error_report(Image creation needs a size parameter);
+ret = -1;
+goto out;
+}
+}
+
+printf(Formatting '%s', fmt=%s , filename, fmt);
+print_option_parameters(param);
+puts();
+
+ret = bdrv_create(drv, filename, param);
+free_option_parameters(create_options);
+free_option_parameters(param);
+
+if (ret  0) {
+if (ret == -ENOTSUP) {
+error_report(Formatting or formatting option not supported for 
+ file format '%s', fmt);
+} else if (ret == -EFBIG) {
+

[Qemu-devel] [PATCH 3/3] Prevent creating an image with the same filename as backing file

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 765f9f3..027dc6a 100644
--- a/block.c
+++ b/block.c
@@ -2769,6 +2769,13 @@ int bdrv_img_create(const char *filename, const char 
*fmt,
 BlockDriver *drv, *proto_drv;
 int ret = 0;
 
+if (!strcmp(filename, base_filename)) {
+error_report(Error: Trying to create a snapshot with the same 
+ filename as the backing file);
+ret = -1;
+goto out;
+}
+
 /* Find driver and parse its options */
 drv = bdrv_find_format(fmt);
 if (!drv) {
-- 
1.7.3.3

[Qemu-devel] [PATCH v2 0/3] Re-factor img_create() and add live snapshots

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Hi,

This set of patches re-factors img_create() and moves the core part of
it into block.c so it can be accessed from qemu as well as
qemu-img. The second patch adds basic live snapshots support to the
code, however only snapshots to external QCOW2 images is supported for
now. QED support should be trivial once the QED patches go into
upstream.

The last patch fixes a small gotcha which is present in the old code
as well. Try to catch cases where a user tries to create an image with
itself as the backing file. QEMU does 'interesting' things when you do
this.

Many thanks to Kevin for his help with block layer internals!

New in v2:
 - Fix error return value in monitor command
 - Clarify help message for command
 - Fix patch conflict against block tree. It's all Stefan's fault :)
   f8feb11f4d76f390dddc5cc5345abf99f7659a78

Cheers,
Jes

Jes Sorensen (3):
  qemu-img.c: Re-factor img_create()
  Introduce do_snapshot_blkdev() and monitor command to handle it.
  Prevent creating an image with the same filename as backing file

 block.c |  151 +++
 block.h |4 ++
 blockdev.c  |   61 ++
 blockdev.h  |1 +
 hmp-commands.hx |   19 +++
 qemu-img.c  |  108 +--
 6 files changed, 238 insertions(+), 106 deletions(-)

-- 
1.7.3.3

[Qemu-devel] [PATCH 2/5] ccid: add passthru card device

2010-12-16 Thread Alon Levy

The passthru ccid card is a device sitting on the usb-ccid bus and
using a chardevice to communicate with a remote device using the
VSCard protocol defined in libcacard/vscard_common.h

Usage docs available in following patch in docs/ccid.txt

Signed-off-by: Alon Levy al...@redhat.com
---
 Makefile.objs |2 +-
 hw/ccid-card-passthru.c   |  277 +
 libcacard/vscard_common.h |  130 +
 3 files changed, 408 insertions(+), 1 deletions(-)
 create mode 100644 hw/ccid-card-passthru.c
 create mode 100644 libcacard/vscard_common.h

diff --git a/Makefile.objs b/Makefile.objs
index 1454264..c4a445c 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -193,7 +193,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
 hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
 hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
-hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/hw/ccid-card-passthru.c b/hw/ccid-card-passthru.c
new file mode 100644
index 000..e90ba0e
--- /dev/null
+++ b/hw/ccid-card-passthru.c
@@ -0,0 +1,277 @@
+/*
+ * CCID Card Device emulation
+ *
+ * Copyright (c) 2010 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This code is licenced under the LGPL.
+ */
+
+#include qemu-char.h
+#include monitor.h
+#include hw/ccid.h
+#include libcacard/vscard_common.h
+
+#define DPRINTF(card, lvl, fmt, ...) \
+do { if (lvl = card-debug) { printf(ccid-card:  fmt , ## __VA_ARGS__); } } 
while (0)
+
+/* Passthru card */
+
+
+// TODO: do we still need this?
+uint8_t DEFAULT_ATR[] = {
+/* From some example somewhere
+ 0x3B, 0xB0, 0x18, 0x00, 0xD1, 0x81, 0x05, 0xB1, 0x40, 0x38, 0x1F, 0x03, 0x28
+ */
+
+/* From an Athena smart card */
+ 0x3B, 0xD5, 0x18, 0xFF, 0x80, 0x91, 0xFE, 0x1F, 0xC3, 0x80, 0x73, 0xC8, 0x21, 
0x13, 0x08
+
+}; /* maximum size of ATR - from 7816-3 */
+
+
+#define PASSTHRU_DEV_NAME ccid-card-passthru
+#define VSCARD_IN_SIZE 65536
+#define MAX_ATR_SIZE40
+
+typedef struct PassthruState PassthruState;
+
+struct PassthruState {
+CCIDCardState base;
+CharDriverState *cs;
+uint8_t  vscard_in_data[VSCARD_IN_SIZE];
+uint32_t vscard_in_pos;
+uint32_t vscard_in_hdr;
+uint8_t  atr[MAX_ATR_SIZE];
+uint8_t  atr_length;
+uint8_t debug;
+};
+
+/* VSCard protocol over chardev
+ * This code should not depend on the card type.
+ * */
+
+static void ccid_card_vscard_send_msg(
+PassthruState *s, VSCMsgType type, reader_id_t reader_id,
+const uint8_t* payload, uint32_t length)
+{
+VSCMsgHeader scr_msg_header;
+
+scr_msg_header.type = type;
+scr_msg_header.reader_id = reader_id;
+scr_msg_header.length = length;
+qemu_chr_write(s-cs, (uint8_t*)scr_msg_header, sizeof(VSCMsgHeader));
+qemu_chr_write(s-cs, payload, length);
+}
+
+static void ccid_card_vscard_send_apdu(
+PassthruState *s, const uint8_t* apdu, uint32_t length)
+{
+ccid_card_vscard_send_msg(s, VSC_APDU, VSCARD_MINIMAL_READER_ID, apdu, 
length);
+}
+
+static void ccid_card_vscard_send_error(
+PassthruState *s, reader_id_t reader_id, VSCErrorCode code)
+{
+VSCMsgError msg = {.code=code};
+
+ccid_card_vscard_send_msg(s, VSC_Error, reader_id, (uint8_t*)msg, 
sizeof(msg));
+}
+
+static void ccid_card_vscard_send_init(PassthruState *s)
+{
+VSCMsgInit msg = {.version=VSCARD_VERSION};
+
+ccid_card_vscard_send_msg(s, VSC_Init, VSCARD_UNDEFINED_READER_ID,
+ (uint8_t*)msg, sizeof(msg));
+}
+
+static int ccid_card_vscard_can_read(void *opaque)
+{
+return 65535;
+}
+
+static void ccid_card_vscard_handle_message(PassthruState *card,
+VSCMsgHeader* scr_msg_header)
+{
+uint8_t *data = (uint8_t*)scr_msg_header[1];
+
+switch (scr_msg_header-type) {
+case VSC_ATR:
+DPRINTF(card, 1, VSC_ATR %d\n, scr_msg_header-length);
+assert(scr_msg_header-length = MAX_ATR_SIZE);
+memcpy(card-atr, data, scr_msg_header-length);
+card-atr_length = scr_msg_header-length;
+ccid_card_card_inserted(card-base);
+break;
+case VSC_APDU:
+ccid_card_send_apdu_to_guest(card-base, data, 
scr_msg_header-length);
+break;
+case VSC_CardRemove:
+DPRINTF(card, 1, VSC_CardRemove\n);
+ccid_card_card_removed(card-base);
+break;
+case VSC_Init:
+break;
+case VSC_Error:
+ccid_card_card_error(card-base, *(uint64_t*)data);
+break;
+case VSC_ReaderAdd:
+if (ccid_card_ccid_attach(card-base)  0) {
+ccid_card_vscard_send_error(card, VSCARD_UNDEFINED_READER_ID,
+  VSC_CANNOT_ADD_MORE_READERS);
+} else {
+ccid_card_vscard_send_msg(card, VSC_ReaderAddResponse,
+

[Qemu-devel] [PATCH 0/5] usb-ccid (v10)

2010-12-16 Thread Alon Levy

This patchset adds three new devices, usb-ccid, ccid-card-passthru and
ccid-card-emulated, providing a CCID bus, a simple passthru protocol
implementing card requiring a client, and a standalone emulated card.

It also introduces a new directory libcaccard with CAC card emulation,
CAC is a type of ISO 7816 smart card.

v8-v10 changes:
 * usb-ccid:
  * add slot for future use (Gerd)
  * ifdef ENABLE_MIGRATION for migration support on account of usb
   migration not being ready in general. (Gerd)
 * verbosified commit messages. (Gerd)
 * put libcacard docs in libcacard commit. (Gerd)

v8-v9 changes:
 * Blue Swirl comments:
  * white space fixes
  * enabled by default, disabled only if missing nss
  * forgotten fix from v8 (don't build libcacard.so)
 * added a note about device being little endian
 * library renamed from libcaccard to libcacard
 * squashed both of libcacard patches, they touched different files anyway.

v7-v8 changes:
 * Blue Swirl comments:
  * usb-ccid: deannonymize some structs
  * usb-ccid: coding style change - answer_t and bulk_in_t fixed
  * usb-ccid: handle endianess conversion between guest and host
 * usb-ccid: s/ccid_bulk_in_copy_out/ccid_bulk_in_copy_to_guest/
 * ccid-card-emulated: fix segfault if backend not specified
 * ccid-card-emulated: let last reader inserted win
 * libcaccard: remove double vscard_common.h

v6-v7 changes:
 * external libcaccard became internal directory libcaccard
  * statically link object files into qemu
  * produce libcaccard.so for usage by external projects
  * applied coding style to new code (please check me)
  - did not use the qemu options parsing for libcaccard, since
   it seems to draw large amounts of qemu code (monitor for instance).

v5-v6 changes:
 * really remove static debug (I apologize for claiming to have done so before)

v4-v5 changes:
 * rebased to latest
 * remove static debug in card devices
 * fix --enable-smartcard to link
 * stall instead of assert when exceeding BULK_OUT_DATA_SIZE
 * make ccid_reserve_recv_buf for too large len discard message, not exit
 * make ccid_reserve_recv_buf return void*
 * fix typo
 * remove commented code in VMState

v3-v4:
 * remove ccid field in CCIDBus
 * remove static debug in bus
 * add back docs

v2-v3:
 * split into bus (usb-ccid.c, uses ccid.h) and card (ccid-card-passthru.c).
 * removed documentation (being revised).

v1-v2:
 * all QSIMPLEQ turned into fixed sized rings
 * all allocated buffers turned into fixed size buffers
 * added migration support
 * added a message to tell client qemu has migrated to ip:port
  * for lack of monitor commands ip:port are 0:0, which causes the updated
   vscclient to connect to one port higher on the same host. will add monitor
   commands in a separate patch. tested with current setup.


*** BLURB HERE ***

Alon Levy (4):
  usb-ccid: add CCID bus
  ccid: add passthru card device
  ccid: add ccid-card-emulated device (v2)
  ccid: add docs

Robert Relyea (1):
  libcacard: initial commit after coding style fixes

 Makefile|6 +-
 Makefile.objs   |6 +
 Makefile.target |2 +
 configure   |   25 +
 docs/ccid.txt   |  125 
 hw/ccid-card-emulated.c |  501 
 hw/ccid-card-passthru.c |  277 +
 hw/ccid.h   |   35 ++
 hw/usb-ccid.c   | 1362 +++
 libcacard/Makefile  |   13 +
 libcacard/cac.c |  411 +
 libcacard/cac.h |   20 +
 libcacard/card_7816.c   |  780 +
 libcacard/card_7816.h   |   60 ++
 libcacard/card_7816t.h  |  163 +
 libcacard/config.h  |   81 +++
 libcacard/event.c   |  112 
 libcacard/eventt.h  |   28 +
 libcacard/link_test.c   |   20 +
 libcacard/mutex.h   |   59 ++
 libcacard/passthru.c|  612 +++
 libcacard/passthru.h|   50 ++
 libcacard/vcard.c   |  350 +++
 libcacard/vcard.h   |   85 +++
 libcacard/vcard_emul.h  |   59 ++
 libcacard/vcard_emul_nss.c  | 1147 
 libcacard/vcard_emul_type.c |   60 ++
 libcacard/vcard_emul_type.h |   29 +
 libcacard/vcardt.h  |   66 +++
 libcacard/vevent.h  |   26 +
 libcacard/vreader.c |  515 
 libcacard/vreader.h |   53 ++
 libcacard/vreadert.h|   23 +
 libcacard/vscard_common.h   |  130 
 libcacard/vscclient.c   |  710 ++
 35 files changed, 7999 insertions(+), 2 deletions(-)
 create mode 100644 docs/ccid.txt
 create mode 100644 hw/ccid-card-emulated.c
 create mode 100644 hw/ccid-card-passthru.c
 create mode 100644 hw/ccid.h
 create mode 100644 hw/usb-ccid.c
 create mode 100644 libcacard/Makefile
 create mode 100644 libcacard/cac.c
 create mode 100644 libcacard/cac.h
 create mode 100644 libcacard/card_7816.c
 create mode 100644

[Qemu-devel] [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

The monitor command is:
snapshot_blkdev device [snapshot-file] [format]

Default format is qcow2. For now snapshots without a snapshot-file, eg
internal snapshots, are not supported.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 blockdev.c  |   61 +++
 blockdev.h  |1 +
 hmp-commands.hx |   19 +
 3 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 3b3b82d..9d6f72c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -516,6 +516,67 @@ void do_commit(Monitor *mon, const QDict *qdict)
 }
 }
 
+int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+const char *device = qdict_get_str(qdict, device);
+const char *filename = qdict_get_try_str(qdict, snapshot_file);
+const char *format = qdict_get_try_str(qdict, format);
+const char format_qcow2[] = qcow2;
+BlockDriverState *bs;
+BlockDriver *drv, *proto_drv;
+int ret = 0;
+int flags;
+
+bs = bdrv_find(device);
+if (!bs) {
+qerror_report(QERR_DEVICE_NOT_FOUND, device);
+ret = -1;
+goto out;
+}
+
+if (!format) {
+format = format_qcow2;
+}
+
+drv = bdrv_find_format(format);
+if (!drv) {
+qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
+ret = -1;
+goto out;
+}
+
+proto_drv = bdrv_find_protocol(filename);
+if (!proto_drv) {
+qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
+ret = -1;
+goto out;
+}
+
+ret = bdrv_img_create(filename, format, bs-filename,
+  bs-drv-format_name, NULL, -1, bs-open_flags);
+if (ret) {
+goto out;
+}
+
+qemu_aio_flush();
+bdrv_flush(bs);
+
+flags = bs-open_flags;
+bdrv_close(bs);
+ret = bdrv_open(bs, filename, flags, drv);
+/*
+ * If reopening the image file we just created fails, we really
+ * are in trouble :(
+ */
+assert(ret == 0);
+out:
+if (ret) {
+ret = -1;
+}
+
+return ret;
+}
+
 static int eject_device(Monitor *mon, BlockDriverState *bs, int force)
 {
 if (!force) {
diff --git a/blockdev.h b/blockdev.h
index 4cb8ca9..4536b5c 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -52,5 +52,6 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, 
QObject **ret_data);
 int do_change_block(Monitor *mon, const char *device,
 const char *filename, const char *fmt);
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
+int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 23024ba..dd3db36 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -801,6 +801,25 @@ STEXI
 Set maximum tolerated downtime (in seconds) for migration.
 ETEXI
 
+{
+.name   = snapshot_blkdev,
+.args_type  = device:s,snapshot_file:s?,format:s?,
+.params = device [new-image-file] [format],
+.help   = initiates a live snapshot\n\t\t\t
+  of device. If a new image file is specified, 
the\n\t\t\t
+  new image file will become the new root image.\n\t\t\t
+  If format is specified, the snapshot file will\n\t\t\t
+  be created in that format. Otherwise the\n\t\t\t
+  snapshot will be internal! (currently unsupported),
+.mhandler.cmd_new = do_snapshot_blkdev,
+},
+
+STEXI
+...@item snapshot_blkdev
+...@findex snapshot_blkdev
+Snapshot device, using snapshot file as target if provided
+ETEXI
+
 #if defined(TARGET_I386)
 {
 .name   = drive_add,
-- 
1.7.3.3

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Luiz Capitulino

On Thu, 16 Dec 2010 12:51:14 +0200
Avi Kivity a...@redhat.com wrote:

 On 12/16/2010 12:48 PM, Luiz Capitulino wrote:
  Ok, I didn't know that, but I had another idea: the command could accept
  either a single cpu index or a list:
 
 
 { execute: inject-nmi, arguments: { cpus: 2 } }
 
 { execute: inject-nmi, arguments: { cpus: [1, 2, 3, 4] } }
 
  This has the feature of injecting the nmi in just some cpus, although I'm
  not sure this is going to be desired/useful.
 
  If we agree on this we'll have to wait because the monitor doesn't currently
  support hybrid arguments.
 
 I hope it never does.  They're hard to support in old-school statically 
 typed languages.

We could accept only a list then.

Re: [Qemu-devel] [Qestion] What status of memory stats feature

2010-12-16 Thread Luiz Capitulino

On Wed, 15 Dec 2010 16:58:03 -0600
Adam Litke a...@us.ibm.com wrote:

 On Wed, 2010-12-15 at 15:39 -0200, Luiz Capitulino wrote:
  On Wed, 15 Dec 2010 16:20:05 +0900
  Ken'ichi Ohmichi oomi...@mxs.nes.nec.co.jp wrote:
  
   
   Hi,
   
   I tried to get the memory stats by using virDomainMemoryStats() of 
   libvirt,
   but it could not do it because of the following patch:
   
   [PATCH 03/23] disable guest-provided stats on info balloon command
 2010/10/01 from Luiz Capitulino
 http://www.mail-archive.com/qemu-devel@nongnu.org/msg43024.html 
   
   According to the related thread, the above patch avoids hanging user
   monitor, and people hope the memory stats feature will be able in the
   future. So I'd like to know the status of this feature.
   
   Is there the patch for enabling the feature ?
   If the patch exists, I'd like to try it.
  
  It doesn't, afaik.
 
 It depends on what you are looking to do.  If you just want to play
 around and aren't concerned about lockups, You could undo the above
 patch to re-enable the interface (although I cannot guarantee that even
 this will work).  
 
   What is the essential problem ?
  
  There are two essential problems here.
  
  The first one is that QMP lacks true asynchronous command support (it does
  have some code for it, but it's incomplete).
 
 I was never completely clear on the problems with the QMP async support.
 
  The second problem is that we must not make an existing synchronous command
  asynchronous, because this breaks the ABI.
 
 The memory stats interface has always advertised itself as an async
 command so this one shouldn't matter.  Whether those semantics actually
 ever worked is another issue.

The query-balloon command used to be synchronous (or quite fast, or would
never lockup). We changed that.

   Does some infinite loop happen ?
  
  No, the guest doesn't respond.
  
   Unfortunately, I cannot understand the cause of hanging user monitor.
  
  The balloon command needs guest cooperation (ie. it talks to the guest),
  if the guest doesn't respond the command will never return.
 
 Isn't that only a problem for the user monitor?  For QMP you might just
 never get a response to the stats request.  IIRC, the QMP monitor is
 never 'suspended' in the same way that the user monitor is so I wouldn't
 expect it to be susceptible to the same lockup issues.

We need to have a way to cancel the request. This is not the cause of
this problem, though.

  We don't have a precise ETA for async support, but if it depends only on
  the new error framework work, then we could have it for 0.15. If it depends
  on the full monitor redesign work, then I guess we'll two releases.
 
 I can't provide a definitive answer to this unfortunately since I
 haven't looked in awhile (nor am I privy to the details of the full
 monitor redesign proposal.

Basically, we're splitting HMP and QMP. The end result will be that HMP
will use QMP as a C library. Not hard to do, but there's lots of code
churn involved.

Now, regarding async support, I already have a proposal, but I'll send it
only next year because I'll be in vacations for two weeks.

[Qemu-devel] [PATCH 4/5] ccid: add ccid-card-emulated device (v2)

2010-12-16 Thread Alon Levy

This devices uses libcacard (internal) to emulate a smartcard conforming
to the CAC standard. It attaches to the usb-ccid bus. Usage instructions
(example command lines) are in the following patch in docs/ccid.txt. It
uses libcacard which uses nss, so it can work with both hw cards and
certificates (files).

changes from v1:
remove stale comments, use only c-style comments
bugfix, forgot to set recv_len
change reader name to 'Virtual Reader'

Signed-off-by: Alon Levy al...@redhat.com
---
 Makefile.objs   |2 +-
 hw/ccid-card-emulated.c |  501 +++
 2 files changed, 502 insertions(+), 1 deletions(-)
 create mode 100644 hw/ccid-card-emulated.c

diff --git a/Makefile.objs b/Makefile.objs
index 3f44a91..54b22ca 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -193,7 +193,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
 hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
 hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
-hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o 
ccid-card-emulated.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/hw/ccid-card-emulated.c b/hw/ccid-card-emulated.c
new file mode 100644
index 000..7ecdf75
--- /dev/null
+++ b/hw/ccid-card-emulated.c
@@ -0,0 +1,501 @@
+/*
+ * CCID Card Device. Emulated card.
+ *
+ * It can be used to provide access to the local hardware in a non exclusive
+ * way, or it can use certificates. It requires the usb-ccid bus.
+ *
+ * Usage 1: standard, mirror hardware reader+card:
+ * qemu .. -usb -device usb-ccid -device ccid-card-emulated
+ *
+ * Usage 2: use certificates, no hardware required
+ * one time: create the certificates:
+ *  for i in 1 2 3; do certutil -d /etc/pki/nssdb -x -t CT,CT,CT -S -s 
CN=user$i -n user$i; done
+ * qemu .. -usb -device usb-ccid -device 
ccid-card-emulated,cert1=user1,cert2=user2,cert3=user3
+ *
+ * If you use a non default db for the certificates you can specify it using 
the db parameter.
+ *
+ *
+ * Copyright (c) 2010 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This code is licenced under the LGPL.
+ */
+
+#include pthread.h
+#include eventt.h
+#include vevent.h
+#include vreader.h
+#include vcard_emul.h
+#include qemu-char.h
+#include monitor.h
+#include hw/ccid.h
+
+#define DPRINTF(card, lvl, fmt, ...) \
+do { if (lvl = card-debug) { printf(ccid-card-emul: %s:  fmt , __func__, 
## __VA_ARGS__); } } while (0)
+
+#define EMULATED_DEV_NAME ccid-card-emulated
+
+#define BACKEND_NSS_EMULATED nss-emulated /* the default */
+#define BACKEND_CERTIFICATES certificates
+
+typedef struct EmulatedState EmulatedState;
+
+enum {
+EMUL_READER_INSERT = 0,
+EMUL_READER_REMOVE,
+EMUL_CARD_INSERT,
+EMUL_CARD_REMOVE,
+EMUL_GUEST_APDU,
+EMUL_RESPONSE_APDU,
+EMUL_ERROR,
+};
+
+static const char* emul_event_to_string(uint32_t emul_event)
+{
+switch (emul_event) {
+case EMUL_READER_INSERT: return EMUL_READER_INSERT;
+case EMUL_READER_REMOVE: return EMUL_READER_REMOVE;
+case EMUL_CARD_INSERT: return EMUL_CARD_INSERT;
+case EMUL_CARD_REMOVE: return EMUL_CARD_REMOVE;
+case EMUL_GUEST_APDU: return EMUL_GUEST_APDU;
+case EMUL_RESPONSE_APDU: return EMUL_RESPONSE_APDU;
+case EMUL_ERROR: return EMUL_ERROR;
+default:
+break;
+}
+return UNKNOWN;
+}
+
+typedef struct EmulEvent {
+QSIMPLEQ_ENTRY(EmulEvent) entry;
+union {
+struct {
+uint32_t type;
+} gen;
+struct {
+uint32_t type;
+uint64_t code;
+} error;
+struct {
+uint32_t type;
+uint32_t len;
+uint8_t data[];
+} data;
+} p;
+} EmulEvent;
+
+#define MAX_ATR_SIZE 40
+struct EmulatedState {
+CCIDCardState base;
+uint8_t  debug;
+char*backend;
+char*cert1;
+char*cert2;
+char*cert3;
+char*db;
+uint8_t  atr[MAX_ATR_SIZE];
+uint8_t  atr_length;
+QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
+pthread_mutex_t event_list_mutex;
+VReader *reader;
+QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
+pthread_mutex_t vreader_mutex; /* and guest_apdu_list mutex */
+pthread_mutex_t handle_apdu_mutex;
+pthread_cond_t handle_apdu_cond;
+int  pipe[2];
+int  quit_apdu_thread;
+pthread_mutex_t apdu_thread_quit_mutex;
+pthread_cond_t apdu_thread_quit_cond;
+};
+
+static void emulated_apdu_from_guest(CCIDCardState *base, const uint8_t *apdu, 
uint32_t len)
+{
+EmulatedState *card = DO_UPCAST(EmulatedState, base, base);
+EmulEvent *event = (EmulEvent*)malloc(sizeof(EmulEvent) + len);
+
+assert(event);
+event-p.data.type = EMUL_GUEST_APDU;
+event-p.data.len = len;
+memcpy(event-p.data.data, apdu, len);
+pthread_mutex_lock(card-vreader_mutex);
+

[Qemu-devel] [PATCH 1/5] usb-ccid: add CCID bus

2010-12-16 Thread Alon Levy

A CCID device is a smart card reader. It is a USB device, defined at [1].
This patch introduces the usb-ccid device that is a ccid bus. Next patches will
introduce two card types to use it, a passthru card and an emulated card.

 [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.

Signed-off-by: Alon Levy al...@redhat.com
---
 Makefile.objs |1 +
 configure |6 +
 hw/ccid.h |   35 ++
 hw/usb-ccid.c | 1362 +
 4 files changed, 1404 insertions(+), 0 deletions(-)
 create mode 100644 hw/ccid.h
 create mode 100644 hw/usb-ccid.c

diff --git a/Makefile.objs b/Makefile.objs
index cebb945..1454264 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -193,6 +193,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
 hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
 hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index 2917874..584cc8e 100755
--- a/configure
+++ b/configure
@@ -332,6 +332,7 @@ zero_malloc=
 trace_backend=nop
 trace_file=trace
 spice=
+smartcard=yes
 
 # OS specific
 if check_define __linux__ ; then
@@ -2354,6 +2355,7 @@ echo vhost-net support $vhost_net
 echo Trace backend $trace_backend
 echo Trace output file $trace_file-pid
 echo spice support $spice
+echo smartcard support $smartcard
 
 if test $sdl_too_old = yes; then
 echo - Your SDL version is too old - please upgrade to have SDL support
@@ -2617,6 +2619,10 @@ if test $spice = yes ; then
   echo CONFIG_SPICE=y  $config_host_mak
 fi
 
+if test $smartcard = yes ; then
+  echo CONFIG_SMARTCARD=y  $config_host_mak
+fi
+
 # XXX: suppress that
 if [ $bsd = yes ] ; then
   echo CONFIG_BSD=y  $config_host_mak
diff --git a/hw/ccid.h b/hw/ccid.h
new file mode 100644
index 000..af59070
--- /dev/null
+++ b/hw/ccid.h
@@ -0,0 +1,35 @@
+#ifndef __CCID_H__
+#define __CCID_H__
+
+#include qdev.h
+
+typedef struct CCIDCardState CCIDCardState;
+typedef struct CCIDCardInfo CCIDCardInfo;
+
+struct CCIDCardState {
+DeviceState qdev;
+uint32_tslot; // For future use with multiple slot reader.
+};
+
+struct CCIDCardInfo {
+DeviceInfo qdev;
+void (*print)(Monitor *mon, CCIDCardState *card, int indent);
+const uint8_t *(*get_atr)(CCIDCardState *card, uint32_t *len);
+void (*apdu_from_guest)(CCIDCardState *card, const uint8_t *apdu, uint32_t 
len);
+int (*exitfn)(CCIDCardState *card);
+int (*initfn)(CCIDCardState *card);
+};
+
+void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t* apdu, uint32_t 
len);
+void ccid_card_card_removed(CCIDCardState *card);
+void ccid_card_card_inserted(CCIDCardState *card);
+void ccid_card_card_error(CCIDCardState *card, uint64_t error);
+void ccid_card_qdev_register(CCIDCardInfo *card);
+
+/* support guest visible insertion/removal of ccid devices based on actual
+ * devices connected/removed. Called by card implementation (passthru, local) 
*/
+int ccid_card_ccid_attach(CCIDCardState *card);
+void ccid_card_ccid_detach(CCIDCardState *card);
+
+#endif // __CCID_H__
+
diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
new file mode 100644
index 000..f431ba4
--- /dev/null
+++ b/hw/usb-ccid.c
@@ -0,0 +1,1362 @@
+/*
+ * CCID Device emulation
+ *
+ * Based on usb-serial.c:
+ * Copyright (c) 2006 CodeSourcery.
+ * Copyright (c) 2008 Samuel Thibault samuel.thiba...@ens-lyon.org
+ * Written by Paul Brook, reused for FTDI by Samuel Thibault,
+ * Reused for CCID by Alon Levy.
+ * Contributed to by Robert Relyea
+ * Copyright (c) 2010 Red Hat.
+ *
+ * This code is licenced under the LGPL.
+ */
+
+/* References:
+ *
+ * CCID Specification Revision 1.1 April 22nd 2005
+ *  Universal Serial Bus, Device Class: Smart Card
+ *  Specification for Integrated Circuit(s) Cards Interface Devices
+ *
+ * Endianess note: from the spec (1.3)
+ *  Fields that are larger than a byte are stored in little endian
+ *
+ * KNOWN BUGS
+ * 1. remove/insert can sometimes result in removed state instead of inserted.
+ * This is a result of the following:
+ *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This happens
+ *  when we send a too short packet, seen in uhci-usb.c, resulting from
+ *  a urb requesting SPD and us returning a smaller packet.
+ *  Not sure which messages trigger this.
+ *
+ * Migration note:
+ *
+ * All the VMStateDescription's are left here for future use, but
+ * not enabled right now since there is no support for USB migration.
+ *
+ * To enable define ENABLE_MIGRATION
+ */
+
+#include qemu-common.h
+#include qemu-error.h
+#include usb.h
+#include monitor.h
+
+#include hw/ccid.h
+
+//#define DEBUG_CCID
+
+#define DPRINTF(s, lvl, fmt, ...) \
+do { if (lvl = s-debug) { printf(usb-ccid:  fmt , ## __VA_ARGS__); } } 
while (0)
+
+#define CCID_DEV_NAME usb-ccid
+
+/* The two options for variable sized buffers:
+ * make them constant size, for large enough constant,
+ * or

[Qemu-devel] Re: [PATCH 1/3] qemu-img.c: Re-factor img_create()

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 12:04, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 This patch re-factors img_create() moving the code doing the actual
 work into block.c where it can be shared with QEMU. This is needed to
 be able to create images from QEMU to be used for live snapshots.
 
 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  block.c|  144 
 
  block.h|4 ++
  qemu-img.c |  108 +
  3 files changed, 150 insertions(+), 106 deletions(-)
 
 diff --git a/block.c b/block.c
 index b4aaf41..765f9f3 100644
 --- a/block.c
 +++ b/block.c
 @@ -2758,3 +2758,147 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
  {
  return bs-dirty_count;
  }
 +
 +int bdrv_img_create(const char *filename, const char *fmt,
 +const char *base_filename, const char *base_fmt,
 +char *options, uint64_t img_size, int flags)
 +{
 +QEMUOptionParameter *param = NULL, *create_options = NULL;
 +QEMUOptionParameter *backing_fmt;
 +BlockDriverState *bs = NULL;
 +BlockDriver *drv, *proto_drv;
 +int ret = 0;
 +
 +/* Find driver and parse its options */
 +drv = bdrv_find_format(fmt);
 +if (!drv) {
 +error_report(Unknown file format '%s', fmt);
 +ret = -1;
 +goto out;
 +}
 +
 +proto_drv = bdrv_find_protocol(filename);
 +if (!proto_drv) {
 +error_report(Unknown protocol '%s', filename);
 +ret = -1;
 +goto out;
 +}
 +
 +create_options = append_option_parameters(create_options,
 +  drv-create_options);
 +create_options = append_option_parameters(create_options,
 +  proto_drv-create_options);
 +
 +/* Create parameter list with default values */
 +param = parse_option_parameters(, create_options, param);
 +
 +set_option_parameter_int(param, BLOCK_OPT_SIZE, img_size);
 +
 +/* Parse -o options */
 +if (options) {
 +param = parse_option_parameters(options, create_options, param);
 +if (param == NULL) {
 +error_report(Invalid options for file format '%s'., fmt);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +if (base_filename) {
 +if (set_option_parameter(param, BLOCK_OPT_BACKING_FILE,
 + base_filename)) {
 +error_report(Backing file not supported for file format '%s',
 + fmt);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +backing_fmt = get_option_parameter(param, BLOCK_OPT_BACKING_FMT);
 +if (backing_fmt  backing_fmt-value.s) {
 +if (!bdrv_find_format(backing_fmt-value.s)) {
 +error_report(Unknown backing file format '%s',
 + backing_fmt-value.s);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +if (base_fmt) {
 +if (set_option_parameter(param, BLOCK_OPT_BACKING_FMT, base_fmt)) {
 +error_report(Backing file format not supported for file 
 + format '%s', fmt);
 +ret = -1;
 +goto out;
 +}
 +}

The order is wrong here. If you use -F, the backing format won't be checked.

 +
 +// The size for the image must always be specified, with one exception:
 +// If we are using a backing file, we can obtain the size from there
 +if (get_option_parameter(param, BLOCK_OPT_SIZE)-value.n == -1) {
 +QEMUOptionParameter *backing_file =
 +get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
 +
 +if (backing_file  backing_file-value.s) {
 +uint64_t size;
 +const char *fmt = NULL;
 +char buf[32];
 +
 +if (backing_fmt  backing_fmt-value.s) {
 +fmt = backing_fmt-value.s;
 +}
 +
 +bs = bdrv_new();
 +if (!bs) {
 +error_report(Not enough memory to allocate 
 BlockDriverState);
 +ret = -1;
 +goto out;
 +}

bdrv_new never returns NULL (it's an indirect qemu_malloc call).

 +ret = bdrv_open(bs, backing_file-value.s, flags, drv);
 +if (ret  0) {
 +error_report(Could not open '%s', filename);
 +ret = -1;
 +goto out;
 +}
 +bdrv_get_geometry(bs, size);
 +size *= 512;
 +
 +snprintf(buf, sizeof(buf), % PRId64, size);
 +set_option_parameter(param, BLOCK_OPT_SIZE, buf);
 +} else {
 +error_report(Image creation needs a size parameter);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +printf(Formatting '%s', fmt=%s , filename, fmt);
 +print_option_parameters(param);
 +puts();
 +
 +ret =

Re: [Qemu-devel] [PATCH 3/3] Prevent creating an image with the same filename as backing file

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 11:04 AM,  jes.soren...@redhat.com wrote:
 From: Jes Sorensen jes.soren...@redhat.com

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  block.c |    7 +++
  1 files changed, 7 insertions(+), 0 deletions(-)

 diff --git a/block.c b/block.c
 index 765f9f3..027dc6a 100644
 --- a/block.c
 +++ b/block.c
 @@ -2769,6 +2769,13 @@ int bdrv_img_create(const char *filename, const char 
 *fmt,
     BlockDriver *drv, *proto_drv;
     int ret = 0;

 +    if (!strcmp(filename, base_filename)) {
 +        error_report(Error: Trying to create a snapshot with the same 
 +                     filename as the backing file);

Can we avoid using the word snapshot here?  Just image would make sense too.

Users could hit this if they incorrectly create an image manually.
They might not be thinking in terms of snapshot and that word is
already overloaded in QEMU.

Stefan

[Qemu-devel] [PATCH 5/5] ccid: add docs

2010-12-16 Thread Alon Levy

Add documentation for the usb-ccid device and accompanying two card
devices, ccid-card-emulated and ccid-card-passthru.
---
 docs/ccid.txt  |  125 ++
 docs/libcacard.txt |  483 
 2 files changed, 125 insertions(+), 483 deletions(-)
 create mode 100644 docs/ccid.txt
 delete mode 100644 docs/libcacard.txt

diff --git a/docs/ccid.txt b/docs/ccid.txt
new file mode 100644
index 000..791c28c
--- /dev/null
+++ b/docs/ccid.txt
@@ -0,0 +1,125 @@
+Qemu CCID Device Documentation.
+
+Contents
+1. USB CCID device
+2. Building
+3. Using ccid-card-emulated with hardware
+4. Using ccid-card-emulated with certificates
+5. Using ccid-card-passthru with client side hardware
+6. Using ccid-card-passthru with client side certificates
+7. Passthrough protocol scenario
+8. libcaccard
+
+1. USB CCID device
+
+The USB CCID device is a USB device implementing the CCID specification, which
+lets one connect smart card readers that implement the same spec. For more
+information see the specification:
+
+ Universal Serial Bus
+ Device Class: Smart Card
+ CCID
+ Specification for
+ Integrated Circuit(s) Cards Interface Devices
+ Revision 1.1
+ April 22rd, 2005
+
+Smartcard are used for authentication, single sign on, decryption in
+public/private schemes and digital signatures. A smartcard reader on the client
+cannot be used on a guest with simple usb passthrough since it will then not be
+available on the client, possibly locking the computer when it is removed. On
+the other hand this device can let you use the smartcard on both the client and
+the guest machine. It is also possible to have a completely virtual smart card
+reader and smart card (i.e. not backed by a physical device) using this device.
+
+2. Building
+
+The cryptographic functions and access to the physical card is done via NSS.
+
+Installing NSS:
+
+In redhat/fedora:
+yum install nss-devel
+In ubuntu/debian:
+apt-get install libnss3-dev
+(not tested on ubuntu)
+
+Configuring and building:
+./configure --enable-smartcard  make
+
+3. Using ccid-card-emulated with hardware
+
+Assuming you have a working smartcard on the host with the current
+user, using NSS, qemu acts as another NSS client using ccid-card-emulated:
+
+qemu -usb -device usb-ccid -device ccid-card-emualated
+
+4. Using ccid-card-emulated with certificates
+
+You must create the certificates. This is a one time process. We use NSS
+certificates:
+
+certutil -d /etc/pki/nssdb -x -t CT,CT,CT -S -s CN=cert1 -n cert1
+
+Note: you must have exactly three certificates.
+
+Assuming the current user can access the certificates (use certutil -L to
+verify), you can use the emulated card type with the certificates backend:
+
+qemu -usb -device usb-ccid -device 
ccid-card-emulated,backend=certificates,cert1=cert1,cert2=cert2,cert3=cert3
+
+5. Using ccid-card-passthru with client side hardware
+
+on the host specify the ccid-card-passthru device with a suitable chardev:
+
+qemu -chardev socket,server,host=0.0.0.0,port=2001,id=ccid,nowait -usb 
-device usb-ccid -device ccid-card-passthru,chardev=ccid
+
+on the client run vscclient, built when you built the libcaccard library:
+libcaccard/vscclient qemu-host 2001
+
+6. Using ccid-card-passthru with client side certificates
+
+Run qemu as per #5, and run vscclient as follows:
+(Note: vscclient command line interface is in a state of change)
+
+libcaccard/vscclient -e db=\/etc/pki/nssdb\ use_hw=no 
soft=(,Test,CAC,,cert1,cert2,cert3) qemu-host 2001
+
+7. Passthrough protocol scenario
+
+This is a typical interchange of messages when using the passthru card device.
+usb-ccid is a usb device. It defaults to an unattached usb device on startup.
+usb-ccid expects a chardev and expects the protocol defined in
+cac_card/vscard_common.h to be passed over that.
+
+A typical interchange is:
+
+client event  |  vscclient   |passthru| usb-ccid  
|  guest event
+--
+  |  VSC_Init||   |
+  |  VSC_ReaderAdd   || attach|
+  |  ||   
|  sees new usb device.
+card inserted |  ||   |
+  |  VSC_ATR ||   |
+  |  ||   
|  guest operation, APDU transfer via CCID
+  |  |   VSC_APDU |   |
+  |  VSC_APDU||   |
+client-physical |  ||   |
+card APDU exchange|  ||   |
+

[Qemu-devel] [PATCH 2/2] ARM: Implement correct NaN propagation rules

2010-12-16 Thread Peter Maydell

Implement the correct NaN propagation rules for ARM targets by
providing an appropriate pickNaN function.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat-specialize.h |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index 3015480..fd1b114 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -149,6 +149,28 @@ static float32 commonNaNToFloat32( commonNaNT a )
 | tie-break rule.
 **/
 
+#if defined(TARGET_ARM)
+static int pickNaN(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
+flag aIsLargerSignificand)
+{
+/* ARM mandated NaN propagation rules: take the first of:
+ *  1. A if it is signaling
+ *  2. B if it is signaling
+ *  3. A (quiet)
+ *  4. B (quiet)
+ * A signaling NaN is always quietened before returning it.
+ */
+if (aIsSNaN) {
+return 0;
+} else if (bIsSNaN) {
+return 1;
+} else if (aIsQNaN) {
+return 0;
+} else {
+return 1;
+}
+}
+#else
 static int pickNaN(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
 flag aIsLargerSignificand)
 {
@@ -178,6 +200,7 @@ static int pickNaN(flag aIsQNaN, flag aIsSNaN, flag 
bIsQNaN, flag bIsSNaN,
 return 1;
 }
 }
+#endif
 
 /*
 | Takes two single-precision floating-point values `a' and `b', one of which
-- 
1.6.3.3

[Qemu-devel] Re: [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 12:04, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 The monitor command is:
 snapshot_blkdev device [snapshot-file] [format]
 
 Default format is qcow2. For now snapshots without a snapshot-file, eg
 internal snapshots, are not supported.
 
 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  blockdev.c  |   61 
 +++
  blockdev.h  |1 +
  hmp-commands.hx |   19 +
  3 files changed, 81 insertions(+), 0 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index 3b3b82d..9d6f72c 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -516,6 +516,67 @@ void do_commit(Monitor *mon, const QDict *qdict)
  }
  }
  
 +int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 +const char *device = qdict_get_str(qdict, device);
 +const char *filename = qdict_get_try_str(qdict, snapshot_file);
 +const char *format = qdict_get_try_str(qdict, format);
 +const char format_qcow2[] = qcow2;
 +BlockDriverState *bs;
 +BlockDriver *drv, *proto_drv;
 +int ret = 0;
 +int flags;
 +
 +bs = bdrv_find(device);
 +if (!bs) {
 +qerror_report(QERR_DEVICE_NOT_FOUND, device);
 +ret = -1;
 +goto out;
 +}
 +
 +if (!format) {
 +format = format_qcow2;

Why introducing format_qcow2 instead of directly using the string
literal here?

 +}
 +
 +drv = bdrv_find_format(format);
 +if (!drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +proto_drv = bdrv_find_protocol(filename);
 +if (!proto_drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +ret = bdrv_img_create(filename, format, bs-filename,
 +  bs-drv-format_name, NULL, -1, bs-open_flags);
 +if (ret) {
 +goto out;
 +}
 +
 +qemu_aio_flush();
 +bdrv_flush(bs);
 +
 +flags = bs-open_flags;
 +bdrv_close(bs);
 +ret = bdrv_open(bs, filename, flags, drv);
 +/*
 + * If reopening the image file we just created fails, we really
 + * are in trouble :(
 + */
 +assert(ret == 0);

bdrv_commit handles this case by setting bs-drv = NULL. After this the
device will fail all requests with -ENOMEDIUM, but at least you don't
lose your VM immediately.

Kevin

[Qemu-devel] [PATCH 0/2] ARM: Implement correct NaN propagation rules

2010-12-16 Thread Peter Maydell

IEEE754 doesn't specify precisely what NaN should be returned as the
result of an operation on two input NaNs. The existing softfloat code
hardwires x87 propagation rules. This patchset abstracts out the code
in the various propagateFloat*NaN() functions which picks a NaN to
return, so that it can be easily replaced on a per-target basis.
It also implements the correct version of this routine for ARM.

I've tested this by the usual random instruction set method. The
first patch makes no change to NaN handling (ie it is purely a
refactoring of the code); the second brings ARM into line with the
hardware implementation.

Maintainers of other targets should consider implementing a suitable
version of the pickNaN() function. I know the SPARC architecture
manual documents NaN rules, and they're not the x87 ones, for instance.


Peter Maydell (2):
  softfloat: abstract out target-specific NaN propagation rules
  ARM: Implement correct NaN propagation rules

 fpu/softfloat-specialize.h |  183 +--
 1 files changed, 123 insertions(+), 60 deletions(-)

[Qemu-devel] Re: [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-16 Thread Jes Sorensen

On 12/16/10 12:45, Kevin Wolf wrote:
 Am 16.12.2010 12:04, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com

 The monitor command is:
 snapshot_blkdev device [snapshot-file] [format]

 Default format is qcow2. For now snapshots without a snapshot-file, eg
 internal snapshots, are not supported.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  blockdev.c  |   61 
 +++
  blockdev.h  |1 +
  hmp-commands.hx |   19 +
  3 files changed, 81 insertions(+), 0 deletions(-)

 diff --git a/blockdev.c b/blockdev.c
 index 3b3b82d..9d6f72c 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -516,6 +516,67 @@ void do_commit(Monitor *mon, const QDict *qdict)
  }
  }
  
 +int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 +const char *device = qdict_get_str(qdict, device);
 +const char *filename = qdict_get_try_str(qdict, snapshot_file);
 +const char *format = qdict_get_try_str(qdict, format);
 +const char format_qcow2[] = qcow2;
 +BlockDriverState *bs;
 +BlockDriver *drv, *proto_drv;
 +int ret = 0;
 +int flags;
 +
 +bs = bdrv_find(device);
 +if (!bs) {
 +qerror_report(QERR_DEVICE_NOT_FOUND, device);
 +ret = -1;
 +goto out;
 +}
 +
 +if (!format) {
 +format = format_qcow2;
 
 Why introducing format_qcow2 instead of directly using the string
 literal here?

It should generate the same code - I kinda liked my style better, but
I'll change it.

 +flags = bs-open_flags;
 +bdrv_close(bs);
 +ret = bdrv_open(bs, filename, flags, drv);
 +/*
 + * If reopening the image file we just created fails, we really
 + * are in trouble :(
 + */
 +assert(ret == 0);
 
 bdrv_commit handles this case by setting bs-drv = NULL. After this the
 device will fail all requests with -ENOMEDIUM, but at least you don't
 lose your VM immediately.

Well if we hit this situation something catastrophic happened. I don't
see how we can continue beyond this point really. The old image was
dropped in the swap and the new one is not accessible ... we're dead :(

Cheers,
Jes

[Qemu-devel] Re: [PATCH 3/3] Prevent creating an image with the same filename as backing file

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 12:04, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  block.c |7 +++
  1 files changed, 7 insertions(+), 0 deletions(-)
 
 diff --git a/block.c b/block.c
 index 765f9f3..027dc6a 100644
 --- a/block.c
 +++ b/block.c
 @@ -2769,6 +2769,13 @@ int bdrv_img_create(const char *filename, const char 
 *fmt,
  BlockDriver *drv, *proto_drv;
  int ret = 0;
  
 +if (!strcmp(filename, base_filename)) {
 +error_report(Error: Trying to create a snapshot with the same 
 + filename as the backing file);
 +ret = -1;
 +goto out;
 +}
 +
  /* Find driver and parse its options */
  drv = bdrv_find_format(fmt);
  if (!drv) {

This doesn't catch things like qemu-img create -f qcow2
-obacking_file=foo.qcow2 foo.qcow2 though it will work if you use the
legacy -b option. I think we should keep it consistent.

Kevin

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Thu, 16 Dec 2010 11:03:38 +0200
 Avi Kivity a...@redhat.com wrote:

 On 12/15/2010 08:00 PM, Luiz Capitulino wrote:

  Looks like a GUI feature to me,
  
Really?  Can't see how you can build NMI to all CPUs from NMI this
CPU.  Or am I misunderstanding you?
 
  I guess so. Avi referred to 'nmi button on many machines', I assumed he
  meant a virtual machine GUI, am I wrong?
 
 I meant a real machine's GUI (it's a physical button you can press with 
 your finger, if you have thin fingers).

 Ok, I didn't know that, but I had another idea: the command could accept
 either a single cpu index or a list:


   { execute: inject-nmi, arguments: { cpus: 2 } }

   { execute: inject-nmi, arguments: { cpus: [1, 2, 3, 4] } }

 This has the feature of injecting the nmi in just some cpus, although I'm
 not sure this is going to be desired/useful.

Use case for NMI-ing a subset of the CPUs?

Heck, use case for anything else but NMI all?

 If we agree on this we'll have to wait because the monitor doesn't currently
 support hybrid arguments.

Let's keep the schema simple.

[Qemu-devel] Re: [PATCH] ide: Register vm change state handler once only

2010-12-16 Thread Kevin Wolf

Am 10.12.2010 15:47, schrieb Stefan Hajnoczi:
 We register the vm change state handler in a PCI BAR map() function.
 This function can be called multiple times throughout the lifetime of a
 PCI IDE device.  This results in duplicate vm change state handlers
 being register, none of which are ever unregistered.
 
 Instead, register the vm change state handler in the device's init
 function once and for all.
 
 piix tested, cmd646 and via not tested.
 
 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

This conflicts with the AHCI patches. Can you rebase on top of the block
branch?

Kevin

[Qemu-devel] Re: [PATCH 1/3] qemu-img.c: Re-factor img_create()

2010-12-16 Thread Jes Sorensen

On 12/16/10 12:35, Kevin Wolf wrote:
 Am 16.12.2010 12:04, schrieb jes.soren...@redhat.com:
 +
 +backing_fmt = get_option_parameter(param, BLOCK_OPT_BACKING_FMT);
 +if (backing_fmt  backing_fmt-value.s) {
 +if (!bdrv_find_format(backing_fmt-value.s)) {
 +error_report(Unknown backing file format '%s',
 + backing_fmt-value.s);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +if (base_fmt) {
 +if (set_option_parameter(param, BLOCK_OPT_BACKING_FMT, base_fmt)) {
 +error_report(Backing file format not supported for file 
 + format '%s', fmt);
 +ret = -1;
 +goto out;
 +}
 +}
 
 The order is wrong here. If you use -F, the backing format won't be checked.

Urgh yes, my bad! I had it the other way, but decided to change it -
very silly.

 +
 +// The size for the image must always be specified, with one exception:
 +// If we are using a backing file, we can obtain the size from there
 +if (get_option_parameter(param, BLOCK_OPT_SIZE)-value.n == -1) {
 +QEMUOptionParameter *backing_file =
 +get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
 +
 +if (backing_file  backing_file-value.s) {
 +uint64_t size;
 +const char *fmt = NULL;
 +char buf[32];
 +
 +if (backing_fmt  backing_fmt-value.s) {
 +fmt = backing_fmt-value.s;
 +}
 +
 +bs = bdrv_new();
 +if (!bs) {
 +error_report(Not enough memory to allocate 
 BlockDriverState);
 +ret = -1;
 +goto out;
 +}
 
 bdrv_new never returns NULL (it's an indirect qemu_malloc call).

I see - this was copied wholesale from qemu-img.c but I'll just remove
the error check.

 +ret = bdrv_open(bs, backing_file-value.s, flags, drv);
 +if (ret  0) {
 +error_report(Could not open '%s', filename);
 +ret = -1;
 +goto out;
 +}
 +bdrv_get_geometry(bs, size);
 +size *= 512;
 +
 +snprintf(buf, sizeof(buf), % PRId64, size);
 +set_option_parameter(param, BLOCK_OPT_SIZE, buf);
 +} else {
 +error_report(Image creation needs a size parameter);
 +ret = -1;
 +goto out;
 +}
 +}
 +
 +printf(Formatting '%s', fmt=%s , filename, fmt);
 +print_option_parameters(param);
 +puts();
 +
 +ret = bdrv_create(drv, filename, param);
 +free_option_parameters(create_options);
 +free_option_parameters(param);
 
 These need to be after out: to avoid leaking in error cases.
 
 You're basically reverting a87a6721d with this.

Whoops - another one of those conflicting ones. It's all Stefan's fault :)

 +
 +if (ret  0) {
 +if (ret == -ENOTSUP) {
 +error_report(Formatting or formatting option not supported for 
 
 + file format '%s', fmt);
 +} else if (ret == -EFBIG) {
 +error_report(The image size is too large for file format '%s',
 + fmt);
 +} else {
 +error_report(%s: error while creating %s: %s, filename, fmt,
 + strerror(-ret));
 +}
 +}
 +
 +out:
 +if (bs) {
 +bdrv_delete(bs);
 +}
 +if (ret) {
 +return 1;
 +}
 
 Maybe we should better use the usual 0/-errno style. In qemu-img it was
 the exit code of the program, but now it's a block layer function.

I like errno, I made it behave like this for consistency with the rest
of QEMU. In most places we don't seem to like anything but -1/1 for
error values.

Let me know what you prefer, or alternatively we can change it in a
follow-on patch?

Cheers,
Jes

Re: [Qemu-devel] [PATCH 1/5] usb-ccid: add CCID bus

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:06, Alon Levy wrote:

A CCID device is a smart card reader. It is a USB device, defined at [1].
This patch introduces the usb-ccid device that is a ccid bus. Next patches will
introduce two card types to use it, a passthru card and an emulated card.

  [1]http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.



Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 2/5] ccid: add passthru card device

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:06, Alon Levy wrote:

The passthru ccid card is a device sitting on the usb-ccid bus and
using a chardevice to communicate with a remote device using the
VSCard protocol defined in libcacard/vscard_common.h

Usage docs available in following patch in docs/ccid.txt



Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd

[Qemu-devel] [PATCH 1/1] spice: add chardev

2010-12-16 Thread Alon Levy

Adding a chardev backend for spice, for usage by spice vdagent in
conjunction with a properly named virtio-serial device.
---
 Makefile.objs |2 +-
 qemu-char.c   |7 ++
 qemu-config.c |9 ++
 qemu-options.hx   |   18 -
 spice-qemu-char.c |  222 +
 spice-qemu-char.h |9 ++
 6 files changed, 265 insertions(+), 2 deletions(-)
 create mode 100644 spice-qemu-char.c
 create mode 100644 spice-qemu-char.h

diff --git a/Makefile.objs b/Makefile.objs
index cebb945..320b2a9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -102,7 +102,7 @@ common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
 common-obj-$(CONFIG_WIN32) += version.o
 
-common-obj-$(CONFIG_SPICE) += ui/spice-core.o ui/spice-input.o 
ui/spice-display.o
+common-obj-$(CONFIG_SPICE) += ui/spice-core.o ui/spice-input.o 
ui/spice-display.o spice-qemu-char.o
 
 audio-obj-y = audio.o noaudio.o wavaudio.o mixeng.o
 audio-obj-$(CONFIG_SDL) += sdlaudio.o
diff --git a/qemu-char.c b/qemu-char.c
index edc9ad6..82789ba 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -98,6 +98,10 @@
 
 #include qemu_socket.h
 
+#ifdef CONFIG_SPICE
+#include spice-qemu-char.h
+#endif
+
 #define READ_BUF_LEN 4096
 
 /***/
@@ -2495,6 +2499,9 @@ static const struct {
 || defined(__FreeBSD_kernel__)
 { .name = parport,   .open = qemu_chr_open_pp },
 #endif
+#ifdef CONFIG_SPICE
+{ .name = spicevmc, .open = qemu_chr_open_spice },
+#endif
 };
 
 CharDriverState *qemu_chr_open_opts(QemuOpts *opts,
diff --git a/qemu-config.c b/qemu-config.c
index 965fa46..42cd977 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -147,6 +147,15 @@ static QemuOptsList qemu_chardev_opts = {
 .name = signal,
 .type = QEMU_OPT_BOOL,
 },
+#ifdef CONFIG_SPICE
+{
+.name = name,
+.type = QEMU_OPT_STRING,
+},{
+.name = debug,
+.type = QEMU_OPT_NUMBER,
+},
+#endif
 { /* end of list */ }
 },
 };
diff --git a/qemu-options.hx b/qemu-options.hx
index 4d99a58..d8ad070 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1357,6 +1357,9 @@ DEF(chardev, HAS_ARG, QEMU_OPTION_chardev,
 #if defined(__linux__) || defined(__FreeBSD__) || defined(__DragonFly__)
 -chardev parport,id=id,path=path[,mux=on|off]\n
 #endif
+#if defined(CONFIG_SPICE)
+-chardev spicevmc,id=id,debug=debug,name=name\n
+#endif
 , QEMU_ARCH_ALL
 )
 
@@ -1381,7 +1384,10 @@ Backend is one of:
 @option{stdio},
 @option{braille},
 @option{tty},
-...@option{parport}.
+...@option{parport}
+#if defined(CONFIG_SPICE)
+...@option{spicevmc}.
+#endif
 The specific backend will determine the applicable options.
 
 All devices must have an id, which can be any string up to 127 characters long.
@@ -1557,6 +1563,16 @@ Connect to a local parallel port.
 @option{path} specifies the path to the parallel port device. @option{path} is
 required.
 
+#if defined(CONFIG_SPICE)
+...@item -chardev spicevmc ,i...@var{id} ,deb...@var{debug}, na...@var{name}
+
+...@option{debug} debug level for spicevmc
+
+...@option{name} name of spice channel to connect to
+
+Connect to a spice virtual machine channel, such as vdiport.
+#endif
+
 @end table
 ETEXI
 
diff --git a/spice-qemu-char.c b/spice-qemu-char.c
new file mode 100644
index 000..a080796
--- /dev/null
+++ b/spice-qemu-char.c
@@ -0,0 +1,222 @@
+#include config-host.h
+#include ui/qemu-spice.h
+#include spice.h
+#include spice-experimental.h
+
+#include osdep.h
+
+#include spice-qemu-char.h
+
+//#define SPICE_QEMU_CHAR_USE_IOCTL
+
+#define dprintf(_scd, _level, _fmt, ...)\
+do {\
+static unsigned __dprintf_counter = 0;  \
+if (_scd-debug = _level) {\
+fprintf(stderr, scd: %3d:  _fmt, ++__dprintf_counter, ## 
__VA_ARGS__);\
+}   \
+} while (0)
+
+#define VMC_MAX_HOST_WRITE2048
+
+typedef struct SpiceCharDriver {
+CharDriverState*  chr;
+SpiceCharDeviceInstance sin;
+char  *subtype;
+bool  active;
+uint8_t   *buffer;
+uint8_t   *datapos;
+ssize_t   bufsize, datalen;
+uint32_t  debug;
+} SpiceCharDriver;
+
+
+static int vmc_write(SpiceCharDeviceInstance *sin, const uint8_t *buf, int len)
+{
+SpiceCharDriver *scd = container_of(sin, SpiceCharDriver, sin);
+ssize_t out = 0;
+ssize_t last_out;
+uint8_t* p = (uint8_t*)buf;
+
+while (len  0) {
+last_out = MIN(len, VMC_MAX_HOST_WRITE);
+qemu_chr_read(scd-chr, p, last_out);
+if (last_out  0) {
+out += last_out;
+

[Qemu-devel] [PATCH 1/2] softfloat: abstract out target-specific NaN propagation rules

2010-12-16 Thread Peter Maydell

IEEE754 doesn't specify precisely what NaN should be returned as
the result of an operation on two input NaNs. This is therefore
target-specific. Abstract out the code in propagateFloat*NaN()
which was implementing the x87 propagation rules, so that it
can be easily replaced on a per-target basis.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat-specialize.h |  160 +++
 1 files changed, 100 insertions(+), 60 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index 8e6aceb..3015480 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -134,6 +134,52 @@ static float32 commonNaNToFloat32( commonNaNT a )
 }
 
 /*
+| Select which NaN to propagate for a two-input operation.
+| IEEE754 doesn't specify all the details of this, so the
+| algorithm is target-specific.
+| The routine is passed various bits of information about the
+| two NaNs and should return 0 to select NaN a and 1 for NaN b.
+| Note that signalling NaNs are always squashed to quiet NaNs
+| by the caller, by flipping the SNaN bit before returning them.
+|
+| aIsLargerSignificand is only valid if both a and b are NaNs
+| of some kind, and is true if a has the larger significand,
+| or if both a and b have the same significand but a is
+| positive but b is negative. It is only needed for the x87
+| tie-break rule.
+**/
+
+static int pickNaN(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
+flag aIsLargerSignificand)
+{
+/* This implements x87 NaN propagation rules:
+ * SNaN + QNaN = return the QNaN
+ * two SNaNs = return the one with the larger significand, silenced
+ * two QNaNs = return the one with the larger significand
+ * SNaN and a non-NaN = return the SNaN, silenced
+ * QNaN and a non-NaN = return the QNaN
+ *
+ * If we get down to comparing significands and they are the same,
+ * return the NaN with the positive sign bit (if any).
+ */
+if (aIsSNaN) {
+if (bIsSNaN) {
+return aIsLargerSignificand ? 0 : 1;
+}
+return bIsQNaN ? 1 : 0;
+}
+else if (aIsQNaN) {
+if (bIsSNaN || !bIsQNaN)
+return 0;
+else {
+return aIsLargerSignificand ? 0 : 1;
+}
+} else {
+return 1;
+}
+}
+
+/*
 | Takes two single-precision floating-point values `a' and `b', one of which
 | is a NaN, and returns the appropriate NaN result.  If either `a' or `b' is a
 | signaling NaN, the invalid exception is raised.
@@ -141,7 +187,7 @@ static float32 commonNaNToFloat32( commonNaNT a )
 
 static float32 propagateFloat32NaN( float32 a, float32 b STATUS_PARAM)
 {
-flag aIsNaN, aIsSignalingNaN, bIsNaN, bIsSignalingNaN;
+flag aIsNaN, aIsSignalingNaN, bIsNaN, bIsSignalingNaN, 
aIsLargerSignificand;
 bits32 av, bv, res;
 
 if ( STATUS(default_nan_mode) )
@@ -161,26 +207,22 @@ static float32 propagateFloat32NaN( float32 a, float32 b 
STATUS_PARAM)
 bv |= 0x0040;
 #endif
 if ( aIsSignalingNaN | bIsSignalingNaN ) float_raise( float_flag_invalid 
STATUS_VAR);
-if ( aIsSignalingNaN ) {
-if ( bIsSignalingNaN ) goto returnLargerSignificand;
-res = bIsNaN ? bv : av;
-}
-else if ( aIsNaN ) {
-if ( bIsSignalingNaN || ! bIsNaN )
-res = av;
-else {
- returnLargerSignificand:
-if ( (bits32) ( av1 )  (bits32) ( bv1 ) )
-res = bv;
-else if ( (bits32) ( bv1 )  (bits32) ( av1 ) )
-res = av;
-else
-res = ( av  bv ) ? av : bv;
-}
+
+if ((bits32)(av1)  (bits32)(bv1)) {
+aIsLargerSignificand = 0;
+} else if ((bits32)(bv1)  (bits32)(av1)) {
+aIsLargerSignificand = 1;
+} else {
+aIsLargerSignificand = (av  bv) ? 1 : 0;
 }
-else {
+
+if (pickNaN(aIsNaN, aIsSignalingNaN, bIsNaN, bIsSignalingNaN,
+aIsLargerSignificand)) {
 res = bv;
+} else {
+res = av;
 }
+
 return make_float32(res);
 }
 
@@ -276,7 +318,7 @@ static float64 commonNaNToFloat64( commonNaNT a )
 
 static float64 propagateFloat64NaN( float64 a, float64 b STATUS_PARAM)
 {
-flag aIsNaN, aIsSignalingNaN, bIsNaN, bIsSignalingNaN;
+flag aIsNaN, aIsSignalingNaN, bIsNaN, bIsSignalingNaN, 
aIsLargerSignificand;
 bits64 av, bv, res;
 
 if ( STATUS(default_nan_mode) )
@@ -296,26 +338,22 @@ static float64 propagateFloat64NaN( float64 a, float64 b 
STATUS_PARAM)
 bv |= LIT64( 0x0008 );
 #endif
 if ( aIsSignalingNaN | bIsSignalingNaN ) float_raise( float_flag_invalid 
STATUS_VAR);
-if ( aIsSignalingNaN ) {
-if ( bIsSignalingNaN ) goto

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-16 Thread Paolo Bonzini


On 12/15/2010 08:04 PM, Michael S. Tsirkin wrote:

This assuming upstream developers do not care about downstreams.
To give a chance for downstream to cherry-pick changes, upstream
should use subsections instead of version ids too.


Then version ids should be deprecated altogether.  Nothing against it, 
but I never heard about this plan.


Paolo

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Avi Kivity


On 12/16/2010 01:47 PM, Markus Armbruster wrote:


  This has the feature of injecting the nmi in just some cpus, although I'm
  not sure this is going to be desired/useful.

Use case for NMI-ing a subset of the CPUs?

Heck, use case for anything else but NMI all?


Exactly.

--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 2/4] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

The monitor command is:
snapshot_blkdev device [snapshot-file] [format]

Default format is qcow2. For now snapshots without a snapshot-file, eg
internal snapshots, are not supported.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 blockdev.c  |   62 +++
 blockdev.h  |1 +
 hmp-commands.hx |   19 
 3 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 3b3b82d..d7add36 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -516,6 +516,68 @@ void do_commit(Monitor *mon, const QDict *qdict)
 }
 }
 
+int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+const char *device = qdict_get_str(qdict, device);
+const char *filename = qdict_get_try_str(qdict, snapshot_file);
+const char *format = qdict_get_try_str(qdict, format);
+BlockDriverState *bs;
+BlockDriver *drv, *proto_drv;
+int ret = 0;
+int flags;
+
+bs = bdrv_find(device);
+if (!bs) {
+qerror_report(QERR_DEVICE_NOT_FOUND, device);
+ret = -1;
+goto out;
+}
+
+if (!format) {
+format = qcow2;
+}
+
+drv = bdrv_find_format(format);
+if (!drv) {
+qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
+ret = -1;
+goto out;
+}
+
+proto_drv = bdrv_find_protocol(filename);
+if (!proto_drv) {
+qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
+ret = -1;
+goto out;
+}
+
+ret = bdrv_img_create(filename, format, bs-filename,
+  bs-drv-format_name, NULL, -1, bs-open_flags);
+if (ret) {
+goto out;
+}
+
+qemu_aio_flush();
+bdrv_flush(bs);
+
+flags = bs-open_flags;
+bdrv_close(bs);
+ret = bdrv_open(bs, filename, flags, drv);
+/*
+ * If reopening the image file we just created fails, we really
+ * are in trouble :(
+ */
+if (ret != 0) {
+abort();
+}
+out:
+if (ret) {
+ret = -1;
+}
+
+return ret;
+}
+
 static int eject_device(Monitor *mon, BlockDriverState *bs, int force)
 {
 if (!force) {
diff --git a/blockdev.h b/blockdev.h
index 4cb8ca9..4536b5c 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -52,5 +52,6 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, 
QObject **ret_data);
 int do_change_block(Monitor *mon, const char *device,
 const char *filename, const char *fmt);
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
+int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 23024ba..dd3db36 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -801,6 +801,25 @@ STEXI
 Set maximum tolerated downtime (in seconds) for migration.
 ETEXI
 
+{
+.name   = snapshot_blkdev,
+.args_type  = device:s,snapshot_file:s?,format:s?,
+.params = device [new-image-file] [format],
+.help   = initiates a live snapshot\n\t\t\t
+  of device. If a new image file is specified, 
the\n\t\t\t
+  new image file will become the new root image.\n\t\t\t
+  If format is specified, the snapshot file will\n\t\t\t
+  be created in that format. Otherwise the\n\t\t\t
+  snapshot will be internal! (currently unsupported),
+.mhandler.cmd_new = do_snapshot_blkdev,
+},
+
+STEXI
+...@item snapshot_blkdev
+...@findex snapshot_blkdev
+Snapshot device, using snapshot file as target if provided
+ETEXI
+
 #if defined(TARGET_I386)
 {
 .name   = drive_add,
-- 
1.7.3.3

[Qemu-devel] [PATCH 1/4] qemu-img.c: Re-factor img_create()

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

This patch re-factors img_create() moving the code doing the actual
work into block.c where it can be shared with QEMU. This is needed to
be able to create images from QEMU to be used for live snapshots.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block.c|  141 
 block.h|4 ++
 qemu-img.c |  108 +-
 3 files changed, 147 insertions(+), 106 deletions(-)

diff --git a/block.c b/block.c
index b4aaf41..a48b30c 100644
--- a/block.c
+++ b/block.c
@@ -2758,3 +2758,144 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
 {
 return bs-dirty_count;
 }
+
+int bdrv_img_create(const char *filename, const char *fmt,
+const char *base_filename, const char *base_fmt,
+char *options, uint64_t img_size, int flags)
+{
+QEMUOptionParameter *param = NULL, *create_options = NULL;
+QEMUOptionParameter *backing_fmt;
+BlockDriverState *bs = NULL;
+BlockDriver *drv, *proto_drv;
+int ret = 0;
+
+/* Find driver and parse its options */
+drv = bdrv_find_format(fmt);
+if (!drv) {
+error_report(Unknown file format '%s', fmt);
+ret = -1;
+goto out;
+}
+
+proto_drv = bdrv_find_protocol(filename);
+if (!proto_drv) {
+error_report(Unknown protocol '%s', filename);
+ret = -1;
+goto out;
+}
+
+create_options = append_option_parameters(create_options,
+  drv-create_options);
+create_options = append_option_parameters(create_options,
+  proto_drv-create_options);
+
+/* Create parameter list with default values */
+param = parse_option_parameters(, create_options, param);
+
+set_option_parameter_int(param, BLOCK_OPT_SIZE, img_size);
+
+/* Parse -o options */
+if (options) {
+param = parse_option_parameters(options, create_options, param);
+if (param == NULL) {
+error_report(Invalid options for file format '%s'., fmt);
+ret = -1;
+goto out;
+}
+}
+
+if (base_filename) {
+if (set_option_parameter(param, BLOCK_OPT_BACKING_FILE,
+ base_filename)) {
+error_report(Backing file not supported for file format '%s',
+ fmt);
+ret = -1;
+goto out;
+}
+}
+
+if (base_fmt) {
+if (set_option_parameter(param, BLOCK_OPT_BACKING_FMT, base_fmt)) {
+error_report(Backing file format not supported for file 
+ format '%s', fmt);
+ret = -1;
+goto out;
+}
+}
+
+backing_fmt = get_option_parameter(param, BLOCK_OPT_BACKING_FMT);
+if (backing_fmt  backing_fmt-value.s) {
+if (!bdrv_find_format(backing_fmt-value.s)) {
+error_report(Unknown backing file format '%s',
+ backing_fmt-value.s);
+ret = -1;
+goto out;
+}
+}
+
+// The size for the image must always be specified, with one exception:
+// If we are using a backing file, we can obtain the size from there
+if (get_option_parameter(param, BLOCK_OPT_SIZE)-value.n == -1) {
+QEMUOptionParameter *backing_file =
+get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
+
+if (backing_file  backing_file-value.s) {
+uint64_t size;
+const char *fmt = NULL;
+char buf[32];
+
+if (backing_fmt  backing_fmt-value.s) {
+fmt = backing_fmt-value.s;
+}
+
+bs = bdrv_new();
+
+ret = bdrv_open(bs, backing_file-value.s, flags, drv);
+if (ret  0) {
+error_report(Could not open '%s', filename);
+ret = -1;
+goto out;
+}
+bdrv_get_geometry(bs, size);
+size *= 512;
+
+snprintf(buf, sizeof(buf), % PRId64, size);
+set_option_parameter(param, BLOCK_OPT_SIZE, buf);
+} else {
+error_report(Image creation needs a size parameter);
+ret = -1;
+goto out;
+}
+}
+
+printf(Formatting '%s', fmt=%s , filename, fmt);
+print_option_parameters(param);
+puts();
+
+ret = bdrv_create(drv, filename, param);
+
+if (ret  0) {
+if (ret == -ENOTSUP) {
+error_report(Formatting or formatting option not supported for 
+ file format '%s', fmt);
+} else if (ret == -EFBIG) {
+error_report(The image size is too large for file format '%s',
+ fmt);
+} else {
+error_report(%s: error while creating %s: %s, filename, fmt,
+ strerror(-ret));
+}
+}
+

[Qemu-devel] [PATCH v3 0/4] Re-factor img_create() and add live snapshots

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Hi,

This set of patches re-factors img_create() and moves the core part of
it into block.c so it can be accessed from qemu as well as
qemu-img. The second patch adds basic live snapshots support to the
code, however only snapshots to external QCOW2 images is supported for
now. QED support should be trivial once the QED patches go into
upstream.

The last patch fixes a small gotcha which is present in the old code
as well. Try to catch cases where a user tries to create an image with
itself as the backing file. QEMU does 'interesting' things when you do
this.

Many thanks to Kevin for his help with block layer internals!

New in v2:
 - Fix error return value in monitor command
 - Clarify help message for command
 - Fix patch conflict against block tree. It's all Stefan's fault :)
   f8feb11f4d76f390dddc5cc5345abf99f7659a78

New in v3:
 - Address issues pointed out by Stefan and Kevin
 - Additional patch to return proper -errno error values on error in
   bdrv_img_create() as suggested by Kevin.

Jes Sorensen (4):
  qemu-img.c: Re-factor img_create()
  Introduce do_snapshot_blkdev() and monitor command to handle it.
  Prevent creating an image with the same filename as backing file
  bdrv_img_create() use proper errno return values

 block.c |  145 +++
 block.h |4 ++
 blockdev.c  |   62 +++
 blockdev.h  |1 +
 hmp-commands.hx |   19 +++
 qemu-img.c  |  108 +
 6 files changed, 233 insertions(+), 106 deletions(-)

-- 
1.7.3.3

Re: [Qemu-devel] [PATCH 3/5] libcacard: initial commit after coding style fixes

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:06, Alon Levy wrote:

libcacard emulates a Common Access Card (CAC) which is a standard
for smartcards. It is used by the emulated ccid card introduced in
a following patch. Docs are available in docs/libcacard.txt


Looks good to me, although I'm not a crypto expert.  Most of the 
critical crypto stuff is in nss anyway though, so:


Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd

[Qemu-devel] [PATCH 4/4] bdrv_img_create() use proper errno return values

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Kevin suggested to have bdrv_img_create() return proper -errno values
on error.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block.c |   23 ++-
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index 0c14eee..fe07d0b 100644
--- a/block.c
+++ b/block.c
@@ -2773,14 +2773,14 @@ int bdrv_img_create(const char *filename, const char 
*fmt,
 drv = bdrv_find_format(fmt);
 if (!drv) {
 error_report(Unknown file format '%s', fmt);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 
 proto_drv = bdrv_find_protocol(filename);
 if (!proto_drv) {
 error_report(Unknown protocol '%s', filename);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 
@@ -2799,7 +2799,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 param = parse_option_parameters(options, create_options, param);
 if (param == NULL) {
 error_report(Invalid options for file format '%s'., fmt);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2809,7 +2809,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
  base_filename)) {
 error_report(Backing file not supported for file format '%s',
  fmt);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2818,7 +2818,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 if (set_option_parameter(param, BLOCK_OPT_BACKING_FMT, base_fmt)) {
 error_report(Backing file format not supported for file 
  format '%s', fmt);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2828,7 +2828,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 if (!strcmp(filename, backing_file-value.s)) {
 error_report(Error: Trying to create an image with the 
  same filename as the backing file);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2838,7 +2838,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 if (!bdrv_find_format(backing_fmt-value.s)) {
 error_report(Unknown backing file format '%s',
  backing_fmt-value.s);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2860,7 +2860,6 @@ int bdrv_img_create(const char *filename, const char *fmt,
 ret = bdrv_open(bs, backing_file-value.s, flags, drv);
 if (ret  0) {
 error_report(Could not open '%s', filename);
-ret = -1;
 goto out;
 }
 bdrv_get_geometry(bs, size);
@@ -2870,7 +2869,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 set_option_parameter(param, BLOCK_OPT_SIZE, buf);
 } else {
 error_report(Image creation needs a size parameter);
-ret = -1;
+ret = -EINVAL;
 goto out;
 }
 }
@@ -2901,8 +2900,6 @@ out:
 if (bs) {
 bdrv_delete(bs);
 }
-if (ret) {
-return 1;
-}
-return 0;
+
+return ret;
 }
-- 
1.7.3.3

Re: [Qemu-devel] [PATCH 4/5] ccid: add ccid-card-emulated device (v2)

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:06, Alon Levy wrote:

This devices uses libcacard (internal) to emulate a smartcard conforming
to the CAC standard. It attaches to the usb-ccid bus. Usage instructions
(example command lines) are in the following patch in docs/ccid.txt. It
uses libcacard which uses nss, so it can work with both hw cards and
certificates (files).


Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd

[Qemu-devel] [PATCH 3/4] Prevent creating an image with the same filename as backing file

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block.c |   15 +++
 1 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index a48b30c..0c14eee 100644
--- a/block.c
+++ b/block.c
@@ -2764,7 +2764,7 @@ int bdrv_img_create(const char *filename, const char *fmt,
 char *options, uint64_t img_size, int flags)
 {
 QEMUOptionParameter *param = NULL, *create_options = NULL;
-QEMUOptionParameter *backing_fmt;
+QEMUOptionParameter *backing_fmt, *backing_file;
 BlockDriverState *bs = NULL;
 BlockDriver *drv, *proto_drv;
 int ret = 0;
@@ -2823,6 +2823,16 @@ int bdrv_img_create(const char *filename, const char 
*fmt,
 }
 }
 
+backing_file = get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
+if (backing_file  backing_file-value.s) {
+if (!strcmp(filename, backing_file-value.s)) {
+error_report(Error: Trying to create an image with the 
+ same filename as the backing file);
+ret = -1;
+goto out;
+}
+}
+
 backing_fmt = get_option_parameter(param, BLOCK_OPT_BACKING_FMT);
 if (backing_fmt  backing_fmt-value.s) {
 if (!bdrv_find_format(backing_fmt-value.s)) {
@@ -2836,9 +2846,6 @@ int bdrv_img_create(const char *filename, const char *fmt,
 // The size for the image must always be specified, with one exception:
 // If we are using a backing file, we can obtain the size from there
 if (get_option_parameter(param, BLOCK_OPT_SIZE)-value.n == -1) {
-QEMUOptionParameter *backing_file =
-get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
-
 if (backing_file  backing_file-value.s) {
 uint64_t size;
 const char *fmt = NULL;
-- 
1.7.3.3

Re: [Qemu-devel] [PATCH 5/5] ccid: add docs

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:06, Alon Levy wrote:

Add documentation for the usb-ccid device and accompanying two card
devices, ccid-card-emulated and ccid-card-passthru.
---
  docs/ccid.txt  |  125 ++
  docs/libcacard.txt |  483 


Guess that isn't intentional, right?


+A typical interchange is:
+
+client event  |  vscclient   |passthru| usb-ccid  
|  guest event
+--
+  |  VSC_Init||   |
+  |  VSC_ReaderAdd   || attach|
+  |  ||   
|  sees new usb device.
+card inserted |  ||   |
+  |  VSC_ATR ||   |
+  |  ||   
|  guest operation, APDU transfer via CCID
+  |  |   VSC_APDU |   |
+  |  VSC_APDU||   |
+client-physical |  ||   |
+card APDU exchange|  ||   |
+[APDU-APDU repeats several times]
+card removed  |  ||   |
+  |  VSC_CardRemove  ||   |
+kill/quit |  ||   |
+  vscclient   |  ||   |
+  |  VSC_ReaderRemove||detach |
+  |  ||   
|   usb device removed.


This needs updating, doesn't it?  I think you plugin and -out just the 
cards on the ccid bus instead of the whole usb device, right?


cheers,
  Gerd

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Luiz Capitulino

On Thu, 16 Dec 2010 14:50:08 +0200
Avi Kivity a...@redhat.com wrote:

 On 12/16/2010 01:47 PM, Markus Armbruster wrote:
  
This has the feature of injecting the nmi in just some cpus, although I'm
not sure this is going to be desired/useful.
 
  Use case for NMI-ing a subset of the CPUs?
 
  Heck, use case for anything else but NMI all?
 
 Exactly.

Then I think the to-all-cpus argument is better.

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Avi Kivity


On 12/16/2010 03:09 PM, Luiz Capitulino wrote:

On Thu, 16 Dec 2010 14:50:08 +0200
Avi Kivitya...@redhat.com  wrote:

  On 12/16/2010 01:47 PM, Markus Armbruster wrote:

   This has the feature of injecting the nmi in just some cpus, although 
I'm
   not sure this is going to be desired/useful.
  
Use case for NMI-ing a subset of the CPUs?
  
Heck, use case for anything else but NMI all?

  Exactly.

Then I think the to-all-cpus argument is better.


Why have an argument at all?  Always nmi to all cpus.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-16 Thread Luiz Capitulino

On Thu, 16 Dec 2010 15:11:50 +0200
Avi Kivity a...@redhat.com wrote:

 On 12/16/2010 03:09 PM, Luiz Capitulino wrote:
  On Thu, 16 Dec 2010 14:50:08 +0200
  Avi Kivitya...@redhat.com  wrote:
 
On 12/16/2010 01:47 PM, Markus Armbruster wrote:
  
 This has the feature of injecting the nmi in just some cpus, 
   although I'm
 not sure this is going to be desired/useful.

  Use case for NMI-ing a subset of the CPUs?

  Heck, use case for anything else but NMI all?
  
Exactly.
 
  Then I think the to-all-cpus argument is better.
 
 Why have an argument at all?  Always nmi to all cpus.

That's possible too.

Re: [Qemu-devel] [PATCH 1/1] spice: add chardev

2010-12-16 Thread Gerd Hoffmann


On 12/16/10 12:29, Alon Levy wrote:

Adding a chardev backend for spice, for usage by spice vdagent in
conjunction with a properly named virtio-serial device.


Usage example would be nice here.


+#ifdef CONFIG_SPICE
+#include spice-qemu-char.h
+#endif


#ifdef can be dropped.


+#ifdef CONFIG_SPICE
+{
+.name = name,
+.type = QEMU_OPT_STRING,
+},{
+.name = debug,
+.type = QEMU_OPT_NUMBER,
+},
+#endif


This too.


@@ -1381,7 +1384,10 @@ Backend is one of:
  @option{stdio},
  @option{braille},
  @option{tty},
-...@option{parport}.
+...@option{parport}
+#if defined(CONFIG_SPICE)
+...@option{spicevmc}.
+#endif


This too, documentation should be there unconditionally.


+//#define SPICE_QEMU_CHAR_USE_IOCTL


Why is this disabled?
Does it depend on the chardev patches from Amit?


diff --git a/spice-qemu-char.h b/spice-qemu-char.h
new file mode 100644
index 000..32d5009
--- /dev/null
+++ b/spice-qemu-char.h
@@ -0,0 +1,9 @@
+#ifndef __SPICE_QEMU_CHAR_H__
+#define __SPICE_QEMU_CHAR_H__
+
+#include qemu-char.h
+
+CharDriverState *qemu_chr_open_spice(QemuOpts *opts);
+
+#endif // __SPICE_QEMU_CHAR_H__
+


Hmm, maybe add this to ui/qemu-spice.h instead, so we don't clutter the 
tree with lots of tiny includes?


cheers,
  Gerd

[Qemu-devel] [PATCH] qemu.img.c: Use error_report() instead of own error() implementation

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-img.c |  128 +--
 1 files changed, 63 insertions(+), 65 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 603bdb3..d3921b6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -26,6 +26,7 @@
 #include osdep.h
 #include sysemu.h
 #include block_int.h
+#include qerror.h
 #include stdio.h
 
 #ifdef _WIN32
@@ -40,16 +41,6 @@ typedef struct img_cmd_t {
 /* Default to cache=writeback as data integrity is not important for qemu-tcg. 
*/
 #define BDRV_O_FLAGS BDRV_O_CACHE_WB
 
-static void GCC_FMT_ATTR(1, 2) error(const char *fmt, ...)
-{
-va_list ap;
-va_start(ap, fmt);
-fprintf(stderr, qemu-img: );
-vfprintf(stderr, fmt, ap);
-fprintf(stderr, \n);
-va_end(ap);
-}
-
 static void format_print(void *opaque, const char *name)
 {
 printf( %s, name);
@@ -196,13 +187,13 @@ static int print_block_option_help(const char *filename, 
const char *fmt)
 /* Find driver and parse its options */
 drv = bdrv_find_format(fmt);
 if (!drv) {
-error(Unknown file format '%s', fmt);
+error_report(Unknown file format '%s', fmt);
 return 1;
 }
 
 proto_drv = bdrv_find_protocol(filename);
 if (!proto_drv) {
-error(Unknown protocol '%s', filename);
+error_report(Unknown protocol '%s', filename);
 return 1;
 }
 
@@ -225,30 +216,30 @@ static BlockDriverState *bdrv_new_open(const char 
*filename,
 
 bs = bdrv_new();
 if (!bs) {
-error(Not enough memory);
+error_report(Not enough memory);
 goto fail;
 }
 if (fmt) {
 drv = bdrv_find_format(fmt);
 if (!drv) {
-error(Unknown file format '%s', fmt);
+error_report(Unknown file format '%s', fmt);
 goto fail;
 }
 } else {
 drv = NULL;
 }
 if (bdrv_open(bs, filename, flags, drv)  0) {
-error(Could not open '%s', filename);
+error_report(Could not open '%s', filename);
 goto fail;
 }
 if (bdrv_is_encrypted(bs)) {
 printf(Disk image '%s' is encrypted.\n, filename);
 if (read_password(password, sizeof(password))  0) {
-error(No password given);
+error_report(No password given);
 goto fail;
 }
 if (bdrv_set_key(bs, password)  0) {
-error(invalid password);
+error_report(invalid password);
 goto fail;
 }
 }
@@ -266,13 +257,15 @@ static int add_old_style_options(const char *fmt, 
QEMUOptionParameter *list,
 {
 if (base_filename) {
 if (set_option_parameter(list, BLOCK_OPT_BACKING_FILE, base_filename)) 
{
-error(Backing file not supported for file format '%s', fmt);
+error_report(Backing file not supported for file format '%s',
+ fmt);
 return -1;
 }
 }
 if (base_fmt) {
 if (set_option_parameter(list, BLOCK_OPT_BACKING_FMT, base_fmt)) {
-error(Backing file format not supported for file format '%s', 
fmt);
+error_report(Backing file format not supported for file 
+ format '%s', fmt);
 return -1;
 }
 }
@@ -309,11 +302,11 @@ static int img_create(int argc, char **argv)
 fmt = optarg;
 break;
 case 'e':
-error(qemu-img: option -e is deprecated, please use \'-o 
+error_report(qemu-img: option -e is deprecated, please use \'-o 
   encryption\' instead!);
 return 1;
 case '6':
-error(qemu-img: option -6 is deprecated, please use \'-o 
+error_report(qemu-img: option -6 is deprecated, please use \'-o 
   compat6\' instead!);
 return 1;
 case 'o':
@@ -333,9 +326,9 @@ static int img_create(int argc, char **argv)
 ssize_t sval;
 sval = strtosz_suffix(argv[optind++], NULL, STRTOSZ_DEFSUFFIX_B);
 if (sval  0) {
-error(Invalid image size specified! You may use k, M, G or 
+error_report(Invalid image size specified! You may use k, M, G or 

   T suffixes for );
-error(kilobytes, megabytes, gigabytes and terabytes.);
+error_report(kilobytes, megabytes, gigabytes and terabytes.);
 ret = -1;
 goto out;
 }
@@ -399,7 +392,7 @@ static int img_check(int argc, char **argv)
 ret = bdrv_check(bs, result);
 
 if (ret == -ENOTSUP) {
-error(This image format does not support checks);
+error_report(This image format does not support checks);
 bdrv_delete(bs);
 return 1;
 }
@@ -481,16 +474,16 @@ static int img_commit(int argc, char **argv)
 printf(Image committed.\n);
 break;
 case -ENOENT:
-error(No disk

[Qemu-devel] [PATCH] qemu-img: Call error_set_progname

2010-12-16 Thread Kevin Wolf

Call error_set_progname during the qemu-img initialization, so that error
messages printed with error_report() use the right prefix.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-img.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 603bdb3..0ff179f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -23,6 +23,7 @@
  */
 #include qemu-common.h
 #include qemu-option.h
+#include qemu-error.h
 #include osdep.h
 #include sysemu.h
 #include block_int.h
@@ -1508,6 +1509,8 @@ int main(int argc, char **argv)
 const img_cmd_t *cmd;
 const char *cmdname;
 
+error_set_progname(argv[0]);
+
 bdrv_init();
 if (argc  2)
 help();
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH v3 0/4] Re-factor img_create() and add live snapshots

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 13:52, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 Hi,
 
 This set of patches re-factors img_create() and moves the core part of
 it into block.c so it can be accessed from qemu as well as
 qemu-img. The second patch adds basic live snapshots support to the
 code, however only snapshots to external QCOW2 images is supported for
 now. QED support should be trivial once the QED patches go into
 upstream.
 
 The last patch fixes a small gotcha which is present in the old code
 as well. Try to catch cases where a user tries to create an image with
 itself as the backing file. QEMU does 'interesting' things when you do
 this.
 
 Many thanks to Kevin for his help with block layer internals!
 
 New in v2:
  - Fix error return value in monitor command
  - Clarify help message for command
  - Fix patch conflict against block tree. It's all Stefan's fault :)
f8feb11f4d76f390dddc5cc5345abf99f7659a78
 
 New in v3:
  - Address issues pointed out by Stefan and Kevin
  - Additional patch to return proper -errno error values on error in
bdrv_img_create() as suggested by Kevin.
 
 Jes Sorensen (4):
   qemu-img.c: Re-factor img_create()
   Introduce do_snapshot_blkdev() and monitor command to handle it.
   Prevent creating an image with the same filename as backing file
   bdrv_img_create() use proper errno return values
 
  block.c |  145 
 +++
  block.h |4 ++
  blockdev.c  |   62 +++
  blockdev.h  |1 +
  hmp-commands.hx |   19 +++
  qemu-img.c  |  108 +
  6 files changed, 233 insertions(+), 106 deletions(-)

Thanks, applied all to the block branch.

Kevin

[Qemu-devel] Re: [PATCH] qemu.img.c: Use error_report() instead of own error() implementation

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 14:31, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 Signed-off-by: Jes Sorensen jes.soren...@redhat.com

Thanks, applied to the block branch.

Kevin

[Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

2010-12-16 Thread Yoshiaki Tamura

2010/12/16 Michael S. Tsirkin m...@redhat.com:
 On Thu, Dec 16, 2010 at 04:36:16PM +0900, Yoshiaki Tamura wrote:
 2010/12/3 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
  2010/12/2 Michael S. Tsirkin m...@redhat.com:
  On Wed, Dec 01, 2010 at 05:03:43PM +0900, Yoshiaki Tamura wrote:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
   On Sun, Nov 28, 2010 at 08:27:58PM +0900, Yoshiaki Tamura wrote:
   2010/11/28 Michael S. Tsirkin m...@redhat.com:
On Thu, Nov 25, 2010 at 03:06:44PM +0900, Yoshiaki Tamura wrote:
Modify inuse type to uint16_t, let save/load to handle, and revert
last_avail_idx with inuse if there are outstanding emulation.
   
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
   
This changes migration format, so it will break compatibility with
existing drivers. More generally, I think migrating internal
state that is not guest visible is always a mistake
as it ties migration format to an internal implementation
(yes, I know we do this sometimes, but we should at least
try not to add such cases).  I think the right thing to do in this 
case
is to flush outstanding
work when vm is stopped.  Then, we are guaranteed that inuse is 0.
I sent patches that do this for virtio net and block.
  
   Could you give me the link of your patches?  I'd like to test
   whether they work with Kemari upon failover.  If they do, I'm
   happy to drop this patch.
  
   Yoshi
  
   Look for this:
   stable migration image on a stopped vm
   sent on:
   Wed, 24 Nov 2010 17:52:49 +0200
 
  Thanks for the info.
 
  However, The patch series above didn't solve the issue.  In
  case of Kemari, inuse is mostly  0 because it queues the
  output, and while last_avail_idx gets incremented
  immediately, not sending inuse makes the state inconsistent
  between Primary and Secondary.
 
  Hmm. Can we simply avoid incrementing last_avail_idx?
 
  I think we can calculate or prepare an internal last_avail_idx,
  and update the external when inuse is decremented.  I'll try
  whether it work w/ w/o Kemari.

 Hi Michael,

 Could you please take a look at the following patch?

 Which version is this against?

Oops.  It should be very old.
67f895bfe69f323b427b284430b6219c8a62e8d4

 commit 36ee7910059e6b236fe9467a609f5b4aed866912
 Author: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 Date:   Thu Dec 16 14:50:54 2010 +0900

     virtio: update last_avail_idx when inuse is decreased.

     Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

 It would be better to have a commit description explaining why a change
 is made, and why it is correct, not just repeating what can be seen from
 the diff anyway.

Sorry for being lazy here.

 diff --git a/hw/virtio.c b/hw/virtio.c
 index c8a0fc6..6688c02 100644
 --- a/hw/virtio.c
 +++ b/hw/virtio.c
 @@ -237,6 +237,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
      wmb();
      trace_virtqueue_flush(vq, count);
      vring_used_idx_increment(vq, count);
 +    vq-last_avail_idx += count;
      vq-inuse -= count;
  }

 @@ -385,7 +386,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
      unsigned int i, head, max;
      target_phys_addr_t desc_pa = vq-vring.desc;

 -    if (!virtqueue_num_heads(vq, vq-last_avail_idx))
 +    if (!virtqueue_num_heads(vq, vq-last_avail_idx + vq-inuse))
          return 0;

      /* When we start there are none of either input nor output. */
 @@ -393,7 +394,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)

      max = vq-vring.num;

 -    i = head = virtqueue_get_head(vq, vq-last_avail_idx++);
 +    i = head = virtqueue_get_head(vq, vq-last_avail_idx + vq-inuse);

      if (vring_desc_flags(desc_pa, i)  VRING_DESC_F_INDIRECT) {
          if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {


 Hmm, will virtio_queue_empty be wrong now? What about virtqueue_avail_bytes?

I think there are two problems.

1. When to update last_avail_idx.
2. The ordering issue you're mentioning below.

The patch above is only trying to address 1 because last time you
mentioned that modifying last_avail_idx upon save may break the
guest, which I agree.  If virtio_queue_empty and
virtqueue_avail_bytes are only used internally, meaning invisible
to the guest, I guess the approach above can be applied too.

 Previous patch version sure looked simpler, and this seems functionally
 equivalent, so my question still stands: here it is rephrased in a
 different way:

        assume that we have in avail ring 2 requests at start of ring: A and B 
 in this order

        host pops A, then B, then completes B and flushes

        now with this patch last_avail_idx will be 1, and then
        remote will get it, it will execute B again. As a result
        B will complete twice, and apparently A will never complete.


 This is what I was saying below: assuming that there are
 outstanding requests when we migrate, there is no way
 a single index can be enough to figure out which

[Qemu-devel] Re: [PATCH] qemu-img: Call error_set_progname

2010-12-16 Thread Jes Sorensen

On 12/16/10 15:13, Kevin Wolf wrote:
 Call error_set_progname during the qemu-img initialization, so that error
 messages printed with error_report() use the right prefix.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com

Acked-by: Jes Sorensen jes.soren...@redhat.com


 ---
  qemu-img.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)
 
 diff --git a/qemu-img.c b/qemu-img.c
 index 603bdb3..0ff179f 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
 @@ -23,6 +23,7 @@
   */
  #include qemu-common.h
  #include qemu-option.h
 +#include qemu-error.h
  #include osdep.h
  #include sysemu.h
  #include block_int.h
 @@ -1508,6 +1509,8 @@ int main(int argc, char **argv)
  const img_cmd_t *cmd;
  const char *cmdname;
  
 +error_set_progname(argv[0]);
 +
  bdrv_init();
  if (argc  2)
  help();

[Qemu-devel] [PATCH] Remove NULL checks for bdrv_new return value

2010-12-16 Thread Kevin Wolf

It's an indirect call to qemu_malloc, which never returns an error.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/xen_disk.c |   17 ++---
 qemu-img.c|5 +
 qemu-io.c |2 --
 qemu-nbd.c|2 --
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 85a1c85..1954311 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -634,17 +634,12 @@ static int blk_init(struct XenDevice *xendev)
 if (!blkdev-dinfo) {
 /* setup via xenbus - create new block driver instance */
 xen_be_printf(blkdev-xendev, 2, create new bdrv (xenbus setup)\n);
-   blkdev-bs = bdrv_new(blkdev-dev);
-   if (blkdev-bs) {
-   if (bdrv_open(blkdev-bs, blkdev-filename, qflags,
-   bdrv_find_whitelisted_format(blkdev-fileproto))
-!= 0) {
-   bdrv_delete(blkdev-bs);
-   blkdev-bs = NULL;
-   }
-   }
-   if (!blkdev-bs)
-   return -1;
+blkdev-bs = bdrv_new(blkdev-dev);
+if (bdrv_open(blkdev-bs, blkdev-filename, qflags,
+  bdrv_find_whitelisted_format(blkdev-fileproto)) != 0) {
+bdrv_delete(blkdev-bs);
+blkdev-bs = NULL;
+}
 } else {
 /* setup via qemu cmdline - already setup for us */
 xen_be_printf(blkdev-xendev, 2, get configured bdrv (cmdline 
setup)\n);
diff --git a/qemu-img.c b/qemu-img.c
index 0b871d8..afd9ed2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -215,10 +215,7 @@ static BlockDriverState *bdrv_new_open(const char 
*filename,
 char password[256];
 
 bs = bdrv_new();
-if (!bs) {
-error_report(Not enough memory);
-goto fail;
-}
+
 if (fmt) {
 drv = bdrv_find_format(fmt);
 if (!drv) {
diff --git a/qemu-io.c b/qemu-io.c
index ff353eb..0f6d1b6 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1509,8 +1509,6 @@ static int openfile(char *name, int flags, int growable)
}
} else {
bs = bdrv_new(hda);
-   if (!bs)
-   return 1;
 
if (bdrv_open(bs, name, flags, NULL)  0) {
fprintf(stderr, %s: can't open device %s\n, progname, 
name);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 99f1d22..e858033 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -336,8 +336,6 @@ int main(int argc, char **argv)
 bdrv_init();
 
 bs = bdrv_new(hda);
-if (bs == NULL)
-return 1;
 
 if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
 errno = -ret;
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

2010-12-16 Thread Michael S. Tsirkin

On Thu, Dec 16, 2010 at 11:28:46PM +0900, Yoshiaki Tamura wrote:
 2010/12/16 Michael S. Tsirkin m...@redhat.com:
  On Thu, Dec 16, 2010 at 04:36:16PM +0900, Yoshiaki Tamura wrote:
  2010/12/3 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
   2010/12/2 Michael S. Tsirkin m...@redhat.com:
   On Wed, Dec 01, 2010 at 05:03:43PM +0900, Yoshiaki Tamura wrote:
   2010/11/28 Michael S. Tsirkin m...@redhat.com:
On Sun, Nov 28, 2010 at 08:27:58PM +0900, Yoshiaki Tamura wrote:
2010/11/28 Michael S. Tsirkin m...@redhat.com:
 On Thu, Nov 25, 2010 at 03:06:44PM +0900, Yoshiaki Tamura wrote:
 Modify inuse type to uint16_t, let save/load to handle, and 
 revert
 last_avail_idx with inuse if there are outstanding emulation.

 Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

 This changes migration format, so it will break compatibility with
 existing drivers. More generally, I think migrating internal
 state that is not guest visible is always a mistake
 as it ties migration format to an internal implementation
 (yes, I know we do this sometimes, but we should at least
 try not to add such cases).  I think the right thing to do in 
 this case
 is to flush outstanding
 work when vm is stopped.  Then, we are guaranteed that inuse is 0.
 I sent patches that do this for virtio net and block.
   
Could you give me the link of your patches?  I'd like to test
whether they work with Kemari upon failover.  If they do, I'm
happy to drop this patch.
   
Yoshi
   
Look for this:
stable migration image on a stopped vm
sent on:
Wed, 24 Nov 2010 17:52:49 +0200
  
   Thanks for the info.
  
   However, The patch series above didn't solve the issue.  In
   case of Kemari, inuse is mostly  0 because it queues the
   output, and while last_avail_idx gets incremented
   immediately, not sending inuse makes the state inconsistent
   between Primary and Secondary.
  
   Hmm. Can we simply avoid incrementing last_avail_idx?
  
   I think we can calculate or prepare an internal last_avail_idx,
   and update the external when inuse is decremented.  I'll try
   whether it work w/ w/o Kemari.
 
  Hi Michael,
 
  Could you please take a look at the following patch?
 
  Which version is this against?
 
 Oops.  It should be very old.
 67f895bfe69f323b427b284430b6219c8a62e8d4
 
  commit 36ee7910059e6b236fe9467a609f5b4aed866912
  Author: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
  Date:   Thu Dec 16 14:50:54 2010 +0900
 
      virtio: update last_avail_idx when inuse is decreased.
 
      Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 
  It would be better to have a commit description explaining why a change
  is made, and why it is correct, not just repeating what can be seen from
  the diff anyway.
 
 Sorry for being lazy here.
 
  diff --git a/hw/virtio.c b/hw/virtio.c
  index c8a0fc6..6688c02 100644
  --- a/hw/virtio.c
  +++ b/hw/virtio.c
  @@ -237,6 +237,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
       wmb();
       trace_virtqueue_flush(vq, count);
       vring_used_idx_increment(vq, count);
  +    vq-last_avail_idx += count;
       vq-inuse -= count;
   }
 
  @@ -385,7 +386,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement 
  *elem)
       unsigned int i, head, max;
       target_phys_addr_t desc_pa = vq-vring.desc;
 
  -    if (!virtqueue_num_heads(vq, vq-last_avail_idx))
  +    if (!virtqueue_num_heads(vq, vq-last_avail_idx + vq-inuse))
           return 0;
 
       /* When we start there are none of either input nor output. */
  @@ -393,7 +394,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement 
  *elem)
 
       max = vq-vring.num;
 
  -    i = head = virtqueue_get_head(vq, vq-last_avail_idx++);
  +    i = head = virtqueue_get_head(vq, vq-last_avail_idx + vq-inuse);
 
       if (vring_desc_flags(desc_pa, i)  VRING_DESC_F_INDIRECT) {
           if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
 
 
  Hmm, will virtio_queue_empty be wrong now? What about virtqueue_avail_bytes?
 
 I think there are two problems.
 
 1. When to update last_avail_idx.
 2. The ordering issue you're mentioning below.
 
 The patch above is only trying to address 1 because last time you
 mentioned that modifying last_avail_idx upon save may break the
 guest, which I agree.  If virtio_queue_empty and
 virtqueue_avail_bytes are only used internally, meaning invisible
 to the guest, I guess the approach above can be applied too.

So IMHO 2 is the real issue. This is what was problematic
with the save patch, otherwise of course changes in save
are better than changes all over the codebase.

  Previous patch version sure looked simpler, and this seems functionally
  equivalent, so my question still stands: here it is rephrased in a
  different way:
 
         assume that we have in avail ring 2 requests at start of ring: A and 
  B in this order
 
         host pops A, then B, then completes B and flushes

[Qemu-devel] [PATCH v2] Remove NULL checks for bdrv_new return value

2010-12-16 Thread Kevin Wolf

It's an indirect call to qemu_malloc, which never returns an error.

Signed-off-by: Kevin Wolf kw...@redhat.com
---

v2:
- Fixed bdrv_open failure case in xen_disk

 hw/xen_disk.c |   17 ++---
 qemu-img.c|5 +
 qemu-io.c |2 --
 qemu-nbd.c|2 --
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 85a1c85..ed9e5eb 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -634,17 +634,12 @@ static int blk_init(struct XenDevice *xendev)
 if (!blkdev-dinfo) {
 /* setup via xenbus - create new block driver instance */
 xen_be_printf(blkdev-xendev, 2, create new bdrv (xenbus setup)\n);
-   blkdev-bs = bdrv_new(blkdev-dev);
-   if (blkdev-bs) {
-   if (bdrv_open(blkdev-bs, blkdev-filename, qflags,
-   bdrv_find_whitelisted_format(blkdev-fileproto))
-!= 0) {
-   bdrv_delete(blkdev-bs);
-   blkdev-bs = NULL;
-   }
-   }
-   if (!blkdev-bs)
-   return -1;
+blkdev-bs = bdrv_new(blkdev-dev);
+if (bdrv_open(blkdev-bs, blkdev-filename, qflags,
+  bdrv_find_whitelisted_format(blkdev-fileproto)) != 0) {
+bdrv_delete(blkdev-bs);
+return -1;
+}
 } else {
 /* setup via qemu cmdline - already setup for us */
 xen_be_printf(blkdev-xendev, 2, get configured bdrv (cmdline 
setup)\n);
diff --git a/qemu-img.c b/qemu-img.c
index 0b871d8..afd9ed2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -215,10 +215,7 @@ static BlockDriverState *bdrv_new_open(const char 
*filename,
 char password[256];
 
 bs = bdrv_new();
-if (!bs) {
-error_report(Not enough memory);
-goto fail;
-}
+
 if (fmt) {
 drv = bdrv_find_format(fmt);
 if (!drv) {
diff --git a/qemu-io.c b/qemu-io.c
index ff353eb..0f6d1b6 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1509,8 +1509,6 @@ static int openfile(char *name, int flags, int growable)
}
} else {
bs = bdrv_new(hda);
-   if (!bs)
-   return 1;
 
if (bdrv_open(bs, name, flags, NULL)  0) {
fprintf(stderr, %s: can't open device %s\n, progname, 
name);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 99f1d22..e858033 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -336,8 +336,6 @@ int main(int argc, char **argv)
 bdrv_init();
 
 bs = bdrv_new(hda);
-if (bs == NULL)
-return 1;
 
 if ((ret = bdrv_open(bs, argv[optind], flags, NULL))  0) {
 errno = -ret;
-- 
1.7.2.3

Re: [Qemu-devel] Re: [PATCH 00/15] Megasas HBA emulation and SCSI update v.3

2010-12-16 Thread Kevin Wolf

Am 16.12.2010 02:45, schrieb Benjamin Herrenschmidt:
 On Mon, 2010-12-13 at 08:32 +0100, Hannes Reinecke wrote:
 On 12/10/2010 11:14 PM, Paolo Bonzini wrote:
 On 11/24/2010 05:50 PM, Christoph Hellwig wrote:
 Btw, it might make sense to split this series into two.

 Patches 1 to 11 are genuine improvements to the SCSI code, which I'd
 like to see merged ASAP.  The rest is the actual megasas driver, which
 I still want to see, but haven't even gotten to review yet.

 Ping for patches 1 to 11?

 Paolo

 The first few already have been merged by Kevin Wolf; I'll see to
 prepare an updated patchset.
 
 Actually, I was about to ask as I'd like to base some new work of mine
 on top of these. I don't see any recent commit from Kevin in the qemu
 master branch (nor in any other branch on
 http://git.savannah.gnu.org/cgit/qemu.git/log/).

Patches 1 to 5 are already in master, for the rest I'm waiting for the
next update. If you need them earlier for basing your own work on them I
can take this version of the patches into my block branch, even though
some of the need an update before they can be merged into master.

 Does Kevin maintain a separate staging tree ?

It's at git://repo.or.cz/qemu/kevin.git block

Kevin

[Qemu-devel] classic emulator Vs QEMU-TCG

2010-12-16 Thread Stefano Bonifazi


Hi all!
I am a student, trying to understand QEMU, specifically TCG 
translation/execution.
After spending much time on the code I still have big doubts. I think my 
doubts are due to the classic idea I have of an emulator.
Actually as a student, I've never developed even a simple classic 
emulator myself, but in my idea it should follow this flow:

1) Fetch target instruction
 i.e. PC(0x532652) : 0x104265 (I am just inventing)
2) Decode
 Opcode 0x10 :  ADD,  R1: 0x42, R2: 0x65
3) Look up instruction function table:
 switch(opcode)
  case add :
   add(R1, R2)
  break;
4) Execution
 void add(int R1, int R2)
 { env-reg[R1] = env-reg[R1] + env[R2];}

Now all of that would be compiled offline for the host machine and at 
runtime the host macine would just execute the binary host code for the 
instruction  env-reg[R1] = env-reg[R1] + env[R2]; (its host binary 
translation)


In QEMU/TCG, thanks to the help of Mr. Blue Swirl, I understood there is 
a runtime creation of host binary, starting from the loaded target binary..
My big doubt is, how can I execute that new binary? .. Shall TCG put it 
in some memory location, and then make the process branch to that 
address (and then back) ?

I really can't see how that happens in the code :(

in cpu-exec.c : cpu_exec_nocache i find:


/* execute the generated code */
next_tb = tcg_qemu_tb_exec(tb-tc_ptr);

and in cpu-exec.c : cpu_exec


/* execute the generated code */

next_tb = tcg_qemu_tb_exec(tc_ptr);
so I thought tcg_qemu_tb_exec function should do the work of executing 
the translated binary in the host.

But then I found out it is just a define in tcg.h:

#define tcg_qemu_tb_exec(tb_ptr) ((long REGPARM (*)(void 
*))code_gen_prologue)(tb_ptr)

and again in exec.c


uint8_t code_gen_prologue[1024] code_gen_section;
Maybe I have some problems with that C syntax, but I really don't 
understand what happens there.. how the execution happens!


I think for all of you working for so long on QEMU, with a long 
successful experience in this field should be very easy.. but atm I 
really can't figure it out alone.. I can't find good documents 
explaining it, and I can't understand myself from the code!

Thank you very very much for any help! :)
Stefano B.

Re: [Qemu-devel] classic emulator Vs QEMU-TCG

2010-12-16 Thread Peter Maydell

On 16 December 2010 15:20, Stefano Bonifazi stefboombas...@gmail.com wrote:
 so I thought tcg_qemu_tb_exec function should do the work of executing the
 translated binary in the host.
 But then I found out it is just a define in tcg.h:

 #define tcg_qemu_tb_exec(tb_ptr) ((long REGPARM (*)(void
 *))code_gen_prologue)(tb_ptr)

 and again in exec.c

 uint8_t code_gen_prologue[1024] code_gen_section;

 Maybe I have some problems with that C syntax, but I really don't understand
 what happens there.. how the execution happens!

Some hints:
 * go and look up the C syntax for function pointers and
casting things to function pointers
 * code_gen_prologue[] contains code which has been generated
once on startup -- go and find the function which is doing this,
which ought to tell you what the prologue code actually does...
 * try single stepping individual machine instructions in the
debugger as you go through tcg_qemu_tb_exec() and matching
this up with what is really happening here and with the bits of
qemu which generated that code.

-- PMM

Re: [Qemu-devel] ]PATCH 0/7] add TRIM/UNMAP support, v3

2010-12-16 Thread Kevin Wolf

Am 10.12.2010 16:00, schrieb Christoph Hellwig:
 This patchset adds support for the ATA TRIM and SCSI WRITE SAME with
 unmap commands, which allow reclaiming free space from a backing image.
 
 The user facing implementation is pretty complete, but not really
 efficient because the underlying bdrv_discard implementation doesn't
 use the aio implementation yet.  The reason for that is that the SCSI
 layer doesn't really allow any asynchronous commands except for
 READ/WRITE by design, and implementing the ATA TRIM command with it's
 multiple ranges is rather painful, and combined with the SCSI limitation
 I didn't bother yet.  The only backend support so far is the XFS hole
 punching ioctl, but others can be added easily when they become
 available.  A virtio implementation for a discard command would also
 be pretty easy, but until we actually support a better backend then
 a plain sparse file it's not worth using for production enviroments
 anyway, but more for playing with the thin provisioning infrastructure,
 or observing guest behaviour when TRIM / unmap is supported.
 
 If the support is enabled and the backend doesn't support hole punching
 the TRIM / WRITE SAME commands become no-ops so that migration from
 hosts supporting or not supporting it works.
 
 Version 3:
   - refactor IDE dma support code
   - proper brace obsfucation
   - fix compile without xfs headers
   - use bool instead of int for a one-byte flag
 
 Version 2:
   - replace tabs with spaces
   - return -ENOMEDIUM from bdrv_discard if there's no driver
 assigned
   - actually list the TP EVPD page as supported when querying
 for supported EVPD pages

The SCSI changes seem to apply, but the rest conflicts with recent
changes, most notably AHCI. Can you rebase on top of the block branch?

I also found some minor things in the SCSI code, so I'll take the chance
to comment on them.

Kevin

Re: [Qemu-devel] [PATCH 2/7] scsi-disk: support WRITE SAME (16) with unmap bit

2010-12-16 Thread Kevin Wolf

Am 10.12.2010 16:00, schrieb Christoph Hellwig:
 Support discards via the WRITE SAME command with the unmap bit set, and
 tell the initiator about the support for it via the block limit and the
 new thin provisioning EVPD pages.  Also fix the comment which incorrectly
 describedthe block limits EVPD page.
 
 Signed-off-by: Christoph Hellwig h...@lst.de
 
 Index: qemu/hw/scsi-disk.c
 ===
 --- qemu.orig/hw/scsi-disk.c  2010-12-07 09:35:44.203009851 +0100
 +++ qemu/hw/scsi-disk.c   2010-12-07 09:42:07.984255141 +0100
 @@ -424,7 +424,8 @@ static int scsi_disk_emulate_inquiry(SCS
  outbuf[buflen++] = 0x80; // unit serial number
  outbuf[buflen++] = 0x83; // device identification
  if (bdrv_get_type_hint(s-bs) != BDRV_TYPE_CDROM) {
 -outbuf[buflen++] = 0xb0; // block device characteristics
 +outbuf[buflen++] = 0xb0; // block limits
 +outbuf[buflen++] = 0xb2; // thin provisioning
  }
  outbuf[pages] = buflen - pages - 1; // number of pages
  break;
 @@ -466,8 +467,10 @@ static int scsi_disk_emulate_inquiry(SCS
  buflen += id_len;
  break;
  }
 -case 0xb0: /* block device characteristics */
 +case 0xb0: /* block limits */
  {
 +unsigned int unmap_sectors =
 +s-qdev.conf.discard_granularity / s-qdev.blocksize;
  unsigned int min_io_size =
  s-qdev.conf.min_io_size / s-qdev.blocksize;
  unsigned int opt_io_size =
 @@ -492,6 +495,21 @@ static int scsi_disk_emulate_inquiry(SCS
  outbuf[13] = (opt_io_size  16)  0xff;
  outbuf[14] = (opt_io_size  8)  0xff;
  outbuf[15] = opt_io_size  0xff;
 +
 +/* optimal unmap granularity */
 +outbuf[28] = (unmap_sectors  24)  0xff;
 +outbuf[29] = (unmap_sectors  16)  0xff;
 +outbuf[30] = (unmap_sectors  8)  0xff;
 +outbuf[31] = unmap_sectors  0xff;
 +break;
 +}
 +case 0xb2: /* thin provisioning */
 +{
 +outbuf[3] = buflen = 8;
 +outbuf[4] = 0;
 +outbuf[5] = 0x40; /* write same with unmap supported */
 +outbuf[6] = 0;
 +outbuf[7] = 0;
  break;
  }
  default:
 @@ -959,6 +977,11 @@ static int scsi_disk_emulate_command(SCS
  outbuf[11] = 0;
  outbuf[12] = 0;
  outbuf[13] = get_physical_block_exp(s-qdev.conf);
 +
 +/* set TPE bit if the format supports discard */
 +if (s-qdev.conf.discard_granularity)
 +outbuf[14] = 0x80;

Missing braces.

 +
  /* Protection, exponent and lowest lba field left blank. */
  buflen = req-cmd.xfer;
  break;
 @@ -1123,6 +1146,34 @@ static int32_t scsi_send_command(SCSIDev
  goto illegal_lba;
  }
  break;
 +case WRITE_SAME_16:
 +len = r-req.cmd.xfer / d-blocksize;
 +
 +DPRINTF(WRITE SAME(16) (sector % PRId64 , count %d)\n,
 +r-req.cmd.lba, len);
 +
 +if (r-req.cmd.lba  s-max_lba) {
 +goto illegal_lba;
 +}
 +
 +/*
 + * We only support WRITE SAME with the unmap bit set for now.
 + */
 +if (!(buf[1]  0x8)) {
 +goto fail;
 +}
 +
 +rc = bdrv_discard(s-bs, r-req.cmd.lba * s-cluster_size,
 +  len * s-cluster_size);
 +if (rc  0) {
 +/* XXX: better error code ?*/
 +goto fail;
 +}
 +
 +scsi_req_set_status(r, GOOD, NO_SENSE);
 +scsi_req_complete(r-req);
 +scsi_remove_request(r);

Isn't this the same as scsi_command_complete()?

 +return 0;

And if you break; here instead of returning (like all other commands)
and remove the three lines above completely, I think it would just do
the right thing.

Kevin

[Qemu-devel] [PATCH v2] ide: Register vm change state handler once only

2010-12-16 Thread Stefan Hajnoczi

We register the vm change state handler in a PCI BAR map() function.
This function can be called multiple times throughout the lifetime of a
PCI IDE device.  This results in duplicate vm change state handlers
being register, none of which are ever unregistered.

Instead, register the vm change state handler in the device's init
function once and for all.

piix tested, cmd646 and via not tested.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
v2:
 * Rebased on kevin/block

 hw/ide/cmd646.c |   18 ++
 hw/ide/piix.c   |   34 --
 hw/ide/via.c|   34 --
 3 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index e191ee6..89ba836 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -167,10 +167,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num,
 
 for(i = 0;i  2; i++) {
 BMDMAState *bm = d-bmdma[i];
-bmdma_init(d-bus[i], bm);
-bm-bus = d-bus+i;
-qemu_add_vm_change_state_handler(d-bus[i].dma-ops-restart_cb,
- bm-dma);
 
 if (i == 0) {
 register_ioport_write(addr, 4, 1, bmdma_writeb_0, d);
@@ -228,6 +224,7 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev)
 PCIIDEState *d = DO_UPCAST(PCIIDEState, dev, dev);
 uint8_t *pci_conf = d-dev.config;
 qemu_irq *irq;
+int i;
 
 pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_CMD);
 pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_CMD_646);
@@ -253,10 +250,15 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev)
 pci_conf[PCI_INTERRUPT_PIN] = 0x01; // interrupt on pin 1
 
 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2);
-ide_bus_new(d-bus[0], d-dev.qdev, 0);
-ide_bus_new(d-bus[1], d-dev.qdev, 1);
-ide_init2(d-bus[0], irq[0]);
-ide_init2(d-bus[1], irq[1]);
+for (i = 0; i  2; i++) {
+ide_bus_new(d-bus[i], d-dev.qdev, i);
+ide_init2(d-bus[i], irq[i]);
+
+bmdma_init(d-bus[i], d-bmdma[i]);
+bm-bus = d-bus[i];
+qemu_add_vm_change_state_handler(d-bus[i].dma-ops-restart_cb,
+ d-bmdma[i]-dma);
+}
 
 vmstate_register(dev-qdev, 0, vmstate_ide_pci, d);
 qemu_register_reset(cmd646_reset, d);
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index a6b5d92..1cad906 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -76,10 +76,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num,
 
 for(i = 0;i  2; i++) {
 BMDMAState *bm = d-bmdma[i];
-bmdma_init(d-bus[i], bm);
-bm-bus = d-bus+i;
-qemu_add_vm_change_state_handler(d-bus[i].dma-ops-restart_cb,
- bm-dma);
 
 register_ioport_write(addr, 1, 1, bmdma_cmd_writeb, bm);
 
@@ -112,6 +108,29 @@ static void piix3_reset(void *opaque)
 pci_conf[0x20] = 0x01; /* BMIBA: 20-23h */
 }
 
+static void pci_piix_init_ports(PCIIDEState *d) {
+int i;
+struct {
+int iobase;
+int iobase2;
+int isairq;
+} port_info[] = {
+{0x1f0, 0x3f6, 14},
+{0x170, 0x376, 15},
+};
+
+for (i = 0; i  2; i++) {
+ide_bus_new(d-bus[i], d-dev.qdev, i);
+ide_init_ioport(d-bus[i], port_info[i].iobase, port_info[i].iobase2);
+ide_init2(d-bus[i], isa_reserve_irq(port_info[i].isairq));
+
+bmdma_init(d-bus[i], d-bmdma[i]);
+d-bmdma[i].bus = d-bus[i];
+qemu_add_vm_change_state_handler(d-bus[i].dma-ops-restart_cb,
+ d-bmdma[i].dma);
+}
+}
+
 static int pci_piix_ide_initfn(PCIIDEState *d)
 {
 uint8_t *pci_conf = d-dev.config;
@@ -125,13 +144,8 @@ static int pci_piix_ide_initfn(PCIIDEState *d)
 
 vmstate_register(d-dev.qdev, 0, vmstate_ide_pci, d);
 
-ide_bus_new(d-bus[0], d-dev.qdev, 0);
-ide_bus_new(d-bus[1], d-dev.qdev, 1);
-ide_init_ioport(d-bus[0], 0x1f0, 0x3f6);
-ide_init_ioport(d-bus[1], 0x170, 0x376);
+pci_piix_init_ports(d);
 
-ide_init2(d-bus[0], isa_reserve_irq(14));
-ide_init2(d-bus[1], isa_reserve_irq(15));
 return 0;
 }
 
diff --git a/hw/ide/via.c b/hw/ide/via.c
index 2603110..5b70bd2 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -78,10 +78,6 @@ static void bmdma_map(PCIDevice *pci_dev, int region_num,
 
 for(i = 0;i  2; i++) {
 BMDMAState *bm = d-bmdma[i];
-bmdma_init(d-bus[i], bm);
-bm-bus = d-bus+i;
-qemu_add_vm_change_state_handler(d-bus[i].dma-ops-restart_cb,
- bm-dma);
 
 register_ioport_write(addr, 1, 1, bmdma_cmd_writeb, bm);
 
@@ -135,6 +131,29 @@ static void via_reset(void *opaque)
 pci_set_long(pci_conf + 0xc0, 0x00020001);
 }
 
+static void vt82c686b_init_ports(PCIIDEState *d) {
+int i;
+struct {
+int iobase;
+int iobase2;
+int isairq;
+} port_info[] = {
+{0x1f0, 0x3f6,

Re: [Qemu-devel] classic emulator Vs QEMU-TCG

2010-12-16 Thread Mulyadi Santosa

Hi Stefano

I'll try to share what I know about TCG..

On Thu, Dec 16, 2010 at 22:20, Stefano Bonifazi
stefboombas...@gmail.com wrote:
 Actually as a student, I've never developed even a simple classic emulator
 myself,

you're not alone...trust me.. :)

but in my idea it should follow this flow:
 1) Fetch target instruction
  i.e. PC(0x532652) : 0x104265 (I am just inventing)
 2) Decode
  Opcode 0x10 :  ADD,  R1: 0x42, R2: 0x65
 3) Look up instruction function table:
  switch(opcode)
  case add :
   add(R1, R2)
  break;
 4) Execution
  void add(int R1, int R2)
  { env-reg[R1] = env-reg[R1] + env[R2];}

You're right. Basically, we're taught that emulation is like big giant
swith..case with lots of condition. And that's exactly what Bochs
does AFAIK...

The pros of this approach is instruction could be simulated as precise
as possible and we could have more precise control about
timing...however the cons is... as we saw that big case
branching...cache miss could likely happen (in host machine I mean)
and pipeline stalls might happen more.

By doing what Qemu does, be it using the old dyngen or new TCG, we try
to maintain execution fluidity by interpreting instruction as less
as possible and strings together  the already translated blocks ...
And don't forget that Qemu sometimes does things like lazy flags
update, somewhat simple dead code elimination and so on. More like
tiny compiler...right?

 Now all of that would be compiled offline for the host machine and at
 runtime the host macine would just execute the binary host code for the
 instruction  env-reg[R1] = env-reg[R1] + env[R2]; (its host binary
 translation)

 My big doubt is, how can I execute that new binary? .. Shall TCG put it in
 some memory location, and then make the process branch to that address (and
 then back) ?
 I really can't see how that happens in the code :(

 in cpu-exec.c : cpu_exec_nocache i find:

 /* execute the generated code */
    next_tb = tcg_qemu_tb_exec(tb-tc_ptr);

 and in cpu-exec.c : cpu_exec

 /* execute the generated code */

                    next_tb = tcg_qemu_tb_exec(tc_ptr);

 so I thought tcg_qemu_tb_exec function should do the work of executing the
 translated binary in the host.
 But then I found out it is just a define in tcg.h:

 #define tcg_qemu_tb_exec(tb_ptr) ((long REGPARM (*)(void
 *))code_gen_prologue)(tb_ptr)

 and again in exec.c

 uint8_t code_gen_prologue[1024] code_gen_section;

 Maybe I have some problems with that C syntax, but I really don't understand
 what happens there.. how the execution happens!

With my limited C knowledge, I saw that as a instruction jump (to
tb_ptr). The code_gen_prologue seems to me like a cast. casting
each opcode in tb_ptr as uint8_t with maximum length=1024

I hope that's the right interpretation...I must admit Qemu is full of
gcc and C tricks here and there...


-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

[Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

2010-12-16 Thread Yoshiaki Tamura

2010/12/16 Michael S. Tsirkin m...@redhat.com:
 On Thu, Dec 16, 2010 at 11:28:46PM +0900, Yoshiaki Tamura wrote:
 2010/12/16 Michael S. Tsirkin m...@redhat.com:
  On Thu, Dec 16, 2010 at 04:36:16PM +0900, Yoshiaki Tamura wrote:
  2010/12/3 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
   2010/12/2 Michael S. Tsirkin m...@redhat.com:
   On Wed, Dec 01, 2010 at 05:03:43PM +0900, Yoshiaki Tamura wrote:
   2010/11/28 Michael S. Tsirkin m...@redhat.com:
On Sun, Nov 28, 2010 at 08:27:58PM +0900, Yoshiaki Tamura wrote:
2010/11/28 Michael S. Tsirkin m...@redhat.com:
 On Thu, Nov 25, 2010 at 03:06:44PM +0900, Yoshiaki Tamura wrote:
 Modify inuse type to uint16_t, let save/load to handle, and 
 revert
 last_avail_idx with inuse if there are outstanding emulation.

 Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

 This changes migration format, so it will break compatibility 
 with
 existing drivers. More generally, I think migrating internal
 state that is not guest visible is always a mistake
 as it ties migration format to an internal implementation
 (yes, I know we do this sometimes, but we should at least
 try not to add such cases).  I think the right thing to do in 
 this case
 is to flush outstanding
 work when vm is stopped.  Then, we are guaranteed that inuse is 
 0.
 I sent patches that do this for virtio net and block.
   
Could you give me the link of your patches?  I'd like to test
whether they work with Kemari upon failover.  If they do, I'm
happy to drop this patch.
   
Yoshi
   
Look for this:
stable migration image on a stopped vm
sent on:
Wed, 24 Nov 2010 17:52:49 +0200
  
   Thanks for the info.
  
   However, The patch series above didn't solve the issue.  In
   case of Kemari, inuse is mostly  0 because it queues the
   output, and while last_avail_idx gets incremented
   immediately, not sending inuse makes the state inconsistent
   between Primary and Secondary.
  
   Hmm. Can we simply avoid incrementing last_avail_idx?
  
   I think we can calculate or prepare an internal last_avail_idx,
   and update the external when inuse is decremented.  I'll try
   whether it work w/ w/o Kemari.
 
  Hi Michael,
 
  Could you please take a look at the following patch?
 
  Which version is this against?

 Oops.  It should be very old.
 67f895bfe69f323b427b284430b6219c8a62e8d4

  commit 36ee7910059e6b236fe9467a609f5b4aed866912
  Author: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
  Date:   Thu Dec 16 14:50:54 2010 +0900
 
      virtio: update last_avail_idx when inuse is decreased.
 
      Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 
  It would be better to have a commit description explaining why a change
  is made, and why it is correct, not just repeating what can be seen from
  the diff anyway.

 Sorry for being lazy here.

  diff --git a/hw/virtio.c b/hw/virtio.c
  index c8a0fc6..6688c02 100644
  --- a/hw/virtio.c
  +++ b/hw/virtio.c
  @@ -237,6 +237,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int 
  count)
       wmb();
       trace_virtqueue_flush(vq, count);
       vring_used_idx_increment(vq, count);
  +    vq-last_avail_idx += count;
       vq-inuse -= count;
   }
 
  @@ -385,7 +386,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement 
  *elem)
       unsigned int i, head, max;
       target_phys_addr_t desc_pa = vq-vring.desc;
 
  -    if (!virtqueue_num_heads(vq, vq-last_avail_idx))
  +    if (!virtqueue_num_heads(vq, vq-last_avail_idx + vq-inuse))
           return 0;
 
       /* When we start there are none of either input nor output. */
  @@ -393,7 +394,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement 
  *elem)
 
       max = vq-vring.num;
 
  -    i = head = virtqueue_get_head(vq, vq-last_avail_idx++);
  +    i = head = virtqueue_get_head(vq, vq-last_avail_idx + vq-inuse);
 
       if (vring_desc_flags(desc_pa, i)  VRING_DESC_F_INDIRECT) {
           if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
 
 
  Hmm, will virtio_queue_empty be wrong now? What about 
  virtqueue_avail_bytes?

 I think there are two problems.

 1. When to update last_avail_idx.
 2. The ordering issue you're mentioning below.

 The patch above is only trying to address 1 because last time you
 mentioned that modifying last_avail_idx upon save may break the
 guest, which I agree.  If virtio_queue_empty and
 virtqueue_avail_bytes are only used internally, meaning invisible
 to the guest, I guess the approach above can be applied too.

 So IMHO 2 is the real issue. This is what was problematic
 with the save patch, otherwise of course changes in save
 are better than changes all over the codebase.

All right.  Then let's focus on 2 first.

  Previous patch version sure looked simpler, and this seems functionally
  equivalent, so my question still stands: here it is rephrased in a
  different way:
 
         assume that we have in avail ring 2 requests

[Qemu-devel] [PATCH 2/2] Add proper -errno error return values to qcow2_open()

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

In addition this adds missing braces to the function to be consistent
with the coding style.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block/qcow2.c |   61 
 1 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index d7fd167..b4a9e5e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -140,12 +140,14 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 static int qcow2_open(BlockDriverState *bs, int flags)
 {
 BDRVQcowState *s = bs-opaque;
-int len, i;
+int len, i, ret = 0;
 QCowHeader header;
 uint64_t ext_end;
 
-if (bdrv_pread(bs-file, 0, header, sizeof(header)) != sizeof(header))
+if (bdrv_pread(bs-file, 0, header, sizeof(header)) != sizeof(header)) {
+ret = -EIO; 
 goto fail;
+}
 be32_to_cpus(header.magic);
 be32_to_cpus(header.version);
 be64_to_cpus(header.backing_file_offset);
@@ -160,16 +162,23 @@ static int qcow2_open(BlockDriverState *bs, int flags)
 be64_to_cpus(header.snapshots_offset);
 be32_to_cpus(header.nb_snapshots);
 
-if (header.magic != QCOW_MAGIC || header.version != QCOW_VERSION)
+if (header.magic != QCOW_MAGIC || header.version != QCOW_VERSION) {
+ret = -EINVAL;
 goto fail;
+}
 if (header.cluster_bits  MIN_CLUSTER_BITS ||
-header.cluster_bits  MAX_CLUSTER_BITS)
+header.cluster_bits  MAX_CLUSTER_BITS) {
+ret = -EINVAL;
 goto fail;
-if (header.crypt_method  QCOW_CRYPT_AES)
+}
+if (header.crypt_method  QCOW_CRYPT_AES) {
+ret = -EINVAL;
 goto fail;
+}
 s-crypt_method_header = header.crypt_method;
-if (s-crypt_method_header)
+if (s-crypt_method_header) {
 bs-encrypted = 1;
+}
 s-cluster_bits = header.cluster_bits;
 s-cluster_size = 1  s-cluster_bits;
 s-cluster_sectors = 1  (s-cluster_bits - 9);
@@ -191,15 +200,20 @@ static int qcow2_open(BlockDriverState *bs, int flags)
 s-l1_vm_state_index = size_to_l1(s, header.size);
 /* the L1 table must contain at least enough entries to put
header.size bytes */
-if (s-l1_size  s-l1_vm_state_index)
+if (s-l1_size  s-l1_vm_state_index) {
+ret = -EINVAL;
 goto fail;
+}
 s-l1_table_offset = header.l1_table_offset;
 if (s-l1_size  0) {
 s-l1_table = qemu_mallocz(
 align_offset(s-l1_size * sizeof(uint64_t), 512));
-if (bdrv_pread(bs-file, s-l1_table_offset, s-l1_table, s-l1_size * 
sizeof(uint64_t)) !=
-s-l1_size * sizeof(uint64_t))
+if (bdrv_pread(bs-file, s-l1_table_offset, s-l1_table,
+   s-l1_size * sizeof(uint64_t)) !=
+s-l1_size * sizeof(uint64_t)) {
+ret = -EIO;
 goto fail;
+}
 for(i = 0;i  s-l1_size; i++) {
 be64_to_cpus(s-l1_table[i]);
 }
@@ -212,35 +226,46 @@ static int qcow2_open(BlockDriverState *bs, int flags)
   + 512);
 s-cluster_cache_offset = -1;
 
-if (qcow2_refcount_init(bs)  0)
+ret = qcow2_refcount_init(bs);
+if (ret != 0) {
 goto fail;
+}
 
 QLIST_INIT(s-cluster_allocs);
 
 /* read qcow2 extensions */
-if (header.backing_file_offset)
+if (header.backing_file_offset) {
 ext_end = header.backing_file_offset;
-else
+} else {
 ext_end = s-cluster_size;
-if (qcow2_read_extensions(bs, sizeof(header), ext_end))
+}
+if (qcow2_read_extensions(bs, sizeof(header), ext_end)) {
+ret = -EINVAL;
 goto fail;
+}
 
 /* read the backing file name */
 if (header.backing_file_offset != 0) {
 len = header.backing_file_size;
-if (len  1023)
+if (len  1023) {
 len = 1023;
-if (bdrv_pread(bs-file, header.backing_file_offset, bs-backing_file, 
len) != len)
+}
+if (bdrv_pread(bs-file, header.backing_file_offset,
+   bs-backing_file, len) != len) {
+ret = -EIO;
 goto fail;
+}
 bs-backing_file[len] = '\0';
 }
-if (qcow2_read_snapshots(bs)  0)
+if (qcow2_read_snapshots(bs)  0) {
+ret = -EINVAL;
 goto fail;
+}
 
 #ifdef DEBUG_ALLOC
 qcow2_check_refcounts(bs);
 #endif
-return 0;
+return ret;
 
  fail:
 qcow2_free_snapshots(bs);
@@ -249,7 +274,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
 qemu_free(s-l2_cache);
 qemu_free(s-cluster_cache);
 qemu_free(s-cluster_data);
-return -1;
+return ret;
 }
 
 static int qcow2_set_key(BlockDriverState *bs, const char *key)
-- 
1.7.3.3

[Qemu-devel] [PATCH 0/2] qcow2 cleanups

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Hi,

These two patches tries to clean up the qcow2 code a little. First
makes the function names consistent, ie. we shouldn't have qcow_
functions in the qcow2 code. Second tries to add proper errno return
values to qcow2_open().

Jes Sorensen (2):
  block/qcow2.c: rename qcow_ functions to qcow2_
  Add proper -errno error return values to qcow2_open()

 block/qcow2-cluster.c  |6 +-
 block/qcow2-snapshot.c |6 +-
 block/qcow2.c  |  269 +++-
 3 files changed, 158 insertions(+), 123 deletions(-)

-- 
1.7.3.3

[Qemu-devel] [PATCH 1/2] block/qcow2.c: rename qcow_ functions to qcow2_

2010-12-16 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

It doesn't really make sense for functions in qcow2.c to be named
qcow_ so convert the names to match correctly.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 block/qcow2-cluster.c  |6 +-
 block/qcow2-snapshot.c |6 +-
 block/qcow2.c  |  210 +---
 3 files changed, 116 insertions(+), 106 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b040208..6928c63 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -352,8 +352,8 @@ void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t 
sector_num,
 }
 
 
-static int qcow_read(BlockDriverState *bs, int64_t sector_num,
- uint8_t *buf, int nb_sectors)
+static int qcow2_read(BlockDriverState *bs, int64_t sector_num,
+  uint8_t *buf, int nb_sectors)
 {
 BDRVQcowState *s = bs-opaque;
 int ret, index_in_cluster, n, n1;
@@ -419,7 +419,7 @@ static int copy_sectors(BlockDriverState *bs, uint64_t 
start_sect,
 if (n = 0)
 return 0;
 BLKDBG_EVENT(bs-file, BLKDBG_COW_READ);
-ret = qcow_read(bs, start_sect + n_start, s-cluster_data, n);
+ret = qcow2_read(bs, start_sect + n_start, s-cluster_data, n);
 if (ret  0)
 return ret;
 if (s-crypt_method) {
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index aacf357..74823a5 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -116,7 +116,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
 }
 
 /* add at the end of the file a new list of snapshots */
-static int qcow_write_snapshots(BlockDriverState *bs)
+static int qcow2_write_snapshots(BlockDriverState *bs)
 {
 BDRVQcowState *s = bs-opaque;
 QCowSnapshot *sn;
@@ -300,7 +300,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, 
QEMUSnapshotInfo *sn_info)
 s-snapshots = snapshots1;
 s-snapshots[s-nb_snapshots++] = *sn;
 
-if (qcow_write_snapshots(bs)  0)
+if (qcow2_write_snapshots(bs)  0)
 goto fail;
 #ifdef DEBUG_ALLOC
 qcow2_check_refcounts(bs);
@@ -378,7 +378,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char 
*snapshot_id)
 qemu_free(sn-name);
 memmove(sn, sn + 1, (s-nb_snapshots - snapshot_index - 1) * sizeof(*sn));
 s-nb_snapshots--;
-ret = qcow_write_snapshots(bs);
+ret = qcow2_write_snapshots(bs);
 if (ret  0) {
 /* XXX: restore snapshot if error ? */
 return ret;
diff --git a/block/qcow2.c b/block/qcow2.c
index 537c479..d7fd167 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -50,10 +50,10 @@ typedef struct {
 uint32_t magic;
 uint32_t len;
 } QCowExtension;
-#define  QCOW_EXT_MAGIC_END 0
-#define  QCOW_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
+#define  QCOW2_EXT_MAGIC_END 0
+#define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
 
-static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
+static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
 const QCowHeader *cow_header = (const void *)buf;
 
@@ -73,14 +73,14 @@ static int qcow_probe(const uint8_t *buf, int buf_size, 
const char *filename)
  * unknown magic is skipped (future extension this version knows nothing about)
  * return 0 upon success, non-0 otherwise
  */
-static int qcow_read_extensions(BlockDriverState *bs, uint64_t start_offset,
-uint64_t end_offset)
+static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
+ uint64_t end_offset)
 {
 QCowExtension ext;
 uint64_t offset;
 
 #ifdef DEBUG_EXT
-printf(qcow_read_extensions: start=%ld end=%ld\n, start_offset, 
end_offset);
+printf(qcow2_read_extensions: start=%ld end=%ld\n, start_offset, 
end_offset);
 #endif
 offset = start_offset;
 while (offset  end_offset) {
@@ -88,13 +88,13 @@ static int qcow_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 #ifdef DEBUG_EXT
 /* Sanity check */
 if (offset  s-cluster_size)
-printf(qcow_handle_extension: suspicious offset %lu\n, offset);
+printf(qcow_read_extension: suspicious offset %lu\n, offset);
 
 printf(attemting to read extended header in offset %lu\n, offset);
 #endif
 
 if (bdrv_pread(bs-file, offset, ext, sizeof(ext)) != sizeof(ext)) {
-fprintf(stderr, qcow_handle_extension: ERROR: 
+fprintf(stderr, qcow_read_extension: ERROR: 
 pread fail from offset % PRIu64 \n,
 offset);
 return 1;
@@ -106,10 +106,10 @@ static int qcow_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 printf(ext.magic = 0x%x\n, ext.magic);
 #endif
 switch (ext.magic) {
-case QCOW_EXT_MAGIC_END:
+case QCOW2_EXT_MAGIC_END:
 return 0;
 
-case QCOW_EXT_MAGIC_BACKING_FORMAT:
+case QCOW2_EXT_MAGIC_BACKING_FORMAT:
 if

[Qemu-devel] -snapshot

2010-12-16 Thread Amador Pahim

Hello,

Is a bad idea if I run multiples qemu vms pointing to the same disk
img using -snapshot parameter? What kind of problem may I have?

Thanks,

[Qemu-devel] Re: [PATCH 11/21] ioport: insert event_tap_ioport() to ioport_write().

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 9:50 AM, Yoshiaki Tamura
tamura.yoshi...@lab.ntt.co.jp wrote:
 2010/12/16 Michael S. Tsirkin m...@redhat.com:
 On Thu, Dec 16, 2010 at 04:37:41PM +0900, Yoshiaki Tamura wrote:
 2010/11/28 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
  On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
  Record ioport event to replay it upon failover.
 
  Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
 
  Interesting. This will have to be extended to support ioeventfd.
  Since each eventfd is really just a binary trigger
  it should be enough to read out the fd state.
 
  Haven't thought about eventfd yet.  Will try doing it in the next
  spin.

 Hi Michael,

 I looked into eventfd and realized it's only used with vhost now.

 There are patches on list to use it for block/userspace net.

 Thanks.  Now I understand.
 In that case, inserting an even-tap function to the following code
 should be appropriate?

 int event_notifier_test_and_clear(EventNotifier *e)
 {
    uint64_t value;
    int r = read(e-fd, value, sizeof(value));
    return r == sizeof(value);
 }


  However, I
 believe vhost bypass the net layer in qemu, and there is no way for Kemari 
 to
 detect the outputs.  To me, it doesn't make sense to extend this patch to
 support eventfd...

Here is the userspace ioeventfd patch series:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg49208.html

Instead of switching to QEMU userspace to handle the virtqueue kick
pio write, we signal the eventfd inside the kernel and resume guest
code execution.  The I/O thread can then process the virtqueue kick in
parallel to guest code execution.

I think this can still be tied into Kemari.  If you are switching to a
pure net/block-layer event tap instead of pio/mmio, then I think it
should just work.

For vhost it would be more difficult to integrate with Kemari.

Stefan

Re: [Qemu-devel] -snapshot

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 4:21 PM, Amador Pahim ama...@pahim.org wrote:
 Is a bad idea if I run multiples qemu vms pointing to the same disk
 img using -snapshot parameter? What kind of problem may I have?

It should work fine.  -snapshot means that QEMU creates a temporary
qcow2 file backed by the disk image you've given it.  No changes will
be made to your disk image - it is read-only.

Stefan

Re: [Qemu-devel] -snapshot

2010-12-16 Thread Amador Pahim

Hello Stefan,

Thank you for your answer. Just one more question: If, while my
snapshot vms are running, the main disk is modified by a non
snapshot vm? For example, installing some extra software.. this can
freeze vms or something?

Regards,
Pahim

On Thu, Dec 16, 2010 at 2:28 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Thu, Dec 16, 2010 at 4:21 PM, Amador Pahim ama...@pahim.org wrote:
 Is a bad idea if I run multiples qemu vms pointing to the same disk
 img using -snapshot parameter? What kind of problem may I have?

 It should work fine.  -snapshot means that QEMU creates a temporary
 qcow2 file backed by the disk image you've given it.  No changes will
 be made to your disk image - it is read-only.

 Stefan

Re: [Qemu-devel] ]PATCH 0/7] add TRIM/UNMAP support, v3

2010-12-16 Thread Christoph Hellwig

On Thu, Dec 16, 2010 at 04:43:20PM +0100, Kevin Wolf wrote:
 The SCSI changes seem to apply, but the rest conflicts with recent
 changes, most notably AHCI. Can you rebase on top of the block branch?

I've tried to, but with the maze of pointer indirections I'm totally
lost for now.  I'll have to drop the IDE side for now until that area
gets less messy or I get a lot more time.

[Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify

2010-12-16 Thread Stefan Hajnoczi

On Wed, Dec 15, 2010 at 12:59 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Wed, Dec 15, 2010 at 11:42:12AM +, Stefan Hajnoczi wrote:
 Are you happy with this patchset if I remove virtio-net-pci
 ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with
 default on)?  For block we've found it to be a win and the initial
 results looked good for net too.

Please let me know if I should disable ioeventfd for virtio-net.

Stefan

Re: [Qemu-devel] [PATCH 2/7] scsi-disk: support WRITE SAME (16) with unmap bit

2010-12-16 Thread Christoph Hellwig

On Thu, Dec 16, 2010 at 04:48:15PM +0100, Kevin Wolf wrote:
  +scsi_req_set_status(r, GOOD, NO_SENSE);
  +scsi_req_complete(r-req);
  +scsi_remove_request(r);
 
 Isn't this the same as scsi_command_complete()?

Yes.

 
  +return 0;
 
 And if you break; here instead of returning (like all other commands)
 and remove the three lines above completely, I think it would just do
 the right thing.

Yes, that looks doable.

Re: [Qemu-devel] [PATCH 1/1] spice: add chardev

2010-12-16 Thread Alon Levy

On Thu, Dec 16, 2010 at 02:21:16PM +0100, Gerd Hoffmann wrote:
 On 12/16/10 12:29, Alon Levy wrote:
 Adding a chardev backend for spice, for usage by spice vdagent in
 conjunction with a properly named virtio-serial device.
 
 Usage example would be nice here.
 
 +#ifdef CONFIG_SPICE
 +#include spice-qemu-char.h
 +#endif
 
 #ifdef can be dropped.
 
ok.

 +#ifdef CONFIG_SPICE
 +{
 +.name = name,
 +.type = QEMU_OPT_STRING,
 +},{
 +.name = debug,
 +.type = QEMU_OPT_NUMBER,
 +},
 +#endif
 
 This too.

ok.

 
 @@ -1381,7 +1384,10 @@ Backend is one of:
   @option{stdio},
   @option{braille},
   @option{tty},
 -...@option{parport}.
 +...@option{parport}
 +#if defined(CONFIG_SPICE)
 +...@option{spicevmc}.
 +#endif
 
 This too, documentation should be there unconditionally.
 
ok.

 +//#define SPICE_QEMU_CHAR_USE_IOCTL
 
 Why is this disabled?
 Does it depend on the chardev patches from Amit?
 

There was a long discussion that concluded we don't want IOCTL's at all,
and that there should be some other mechanism for connection state
communication between the two sides. Meanwhile I found out I don't need
these (I don't remember exactly what I used instead, but basically just
the regular results of write/read).

 diff --git a/spice-qemu-char.h b/spice-qemu-char.h
 new file mode 100644
 index 000..32d5009
 --- /dev/null
 +++ b/spice-qemu-char.h
 @@ -0,0 +1,9 @@
 +#ifndef __SPICE_QEMU_CHAR_H__
 +#define __SPICE_QEMU_CHAR_H__
 +
 +#include qemu-char.h
 +
 +CharDriverState *qemu_chr_open_spice(QemuOpts *opts);
 +
 +#endif // __SPICE_QEMU_CHAR_H__
 +
 
 Hmm, maybe add this to ui/qemu-spice.h instead, so we don't clutter
 the tree with lots of tiny includes?

ok.

 
 cheers,
   Gerd

Re: [Qemu-devel] [PATCH 5/5] ccid: add docs

2010-12-16 Thread Alon Levy

On Thu, Dec 16, 2010 at 02:01:34PM +0100, Gerd Hoffmann wrote:
 On 12/16/10 12:06, Alon Levy wrote:
 Add documentation for the usb-ccid device and accompanying two card
 devices, ccid-card-emulated and ccid-card-passthru.
 ---
   docs/ccid.txt  |  125 ++
   docs/libcacard.txt |  483 
  
 
 Guess that isn't intentional, right?
 

doh!

 +A typical interchange is:
 +
 +client event  |  vscclient   |passthru| 
 usb-ccid  |  guest event
 +--
 +  |  VSC_Init|| 
   |
 +  |  VSC_ReaderAdd   || attach  
   |
 +  |  || 
   |  sees new usb device.
 +card inserted |  || 
   |
 +  |  VSC_ATR || 
   |
 +  |  || 
   |  guest operation, APDU transfer via CCID
 +  |  |   VSC_APDU | 
   |
 +  |  VSC_APDU|| 
   |
 +client-physical |  || 
   |
 +card APDU exchange|  || 
   |
 +[APDU-APDU repeats several times]
 +card removed  |  || 
   |
 +  |  VSC_CardRemove  || 
   |
 +kill/quit |  || 
   |
 +  vscclient   |  || 
   |
 +  |  VSC_ReaderRemove||detach   
   |
 +  |  || 
   |   usb device removed.
 
 This needs updating, doesn't it?  I think you plugin and -out just
 the cards on the ccid bus instead of the whole usb device, right?
 

actually I'm not sure.. I'll check and update.

 cheers,
   Gerd

Re: [Qemu-devel] [PATCH 1/1] spice: add chardev

2010-12-16 Thread Gerd Hoffmann


  Hi,


+//#define SPICE_QEMU_CHAR_USE_IOCTL


Why is this disabled?
Does it depend on the chardev patches from Amit?



There was a long discussion that concluded we don't want IOCTL's at all,
and that there should be some other mechanism for connection state
communication between the two sides. Meanwhile I found out I don't need
these (I don't remember exactly what I used instead, but basically just
the regular results of write/read).


Ok, so when it is obsolete now it can be dropped altogether I guess?

cheers,
  Gerd

Re: [Qemu-devel] -snapshot

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 4:34 PM, Amador Pahim ama...@pahim.org wrote:
 Thank you for your answer. Just one more question: If, while my
 snapshot vms are running, the main disk is modified by a non
 snapshot vm? For example, installing some extra software.. this can
 freeze vms or something?

Correct, it is not safe to modify the base image while there is
another disk image backed off it.

The reason for this is that the image only needs to store the changes
that were made on top of the base image.  For anything which hasn't
been modified it will go back to the base image and read data from
there.

If you modify the base image, then the filesystem in the base image is
not longer what your image file was created from and you have an
inconsistent view of the disk.  It leads to odd behavior and is
unsafe.

Stefan

Re: [Qemu-devel] [PATCH v2] Remove NULL checks for bdrv_new return value

2010-12-16 Thread Stefan Hajnoczi

On Thu, Dec 16, 2010 at 2:44 PM, Kevin Wolf kw...@redhat.com wrote:
 It's an indirect call to qemu_malloc, which never returns an error.

 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---

 v2:
 - Fixed bdrv_open failure case in xen_disk

  hw/xen_disk.c |   17 ++---
  qemu-img.c    |    5 +
  qemu-io.c     |    2 --
  qemu-nbd.c    |    2 --
  4 files changed, 7 insertions(+), 19 deletions(-)

Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

[Qemu-devel] Re: [PATCH 4/4] qcow2: Use block-queue

2010-12-16 Thread Stefan Hajnoczi

On Mon, Dec 13, 2010 at 4:29 PM, Kevin Wolf kw...@redhat.com wrote:
 diff --git a/block/qcow2.c b/block/qcow2.c
 index 537c479..e445913 100644
 --- a/block/qcow2.c
 +++ b/block/qcow2.c
 @@ -136,6 +136,20 @@ static int qcow_read_extensions(BlockDriverState *bs, 
 uint64_t start_offset,
     return 0;
  }

 +static bool qcow_blkqueue_error_cb(void *opaque, int ret)
 +{
 +    BlockDriverState *bs = opaque;
 +    BlockErrorAction action = bdrv_get_on_error(bs, 0);
 +
 +    if ((action == BLOCK_ERR_STOP_ENOSPC  ret == -ENOSPC)
 +        || action == BLOCK_ERR_STOP_ANY)
 +    {
 +        bdrv_mon_event(bs, BDRV_ACTION_STOP, 0);
 +        return vm_stop(0);
 +    }
 +
 +    return false;
 +}

Not sure this is qcow2-specific.  Is it possible to put this in a
generic sysemu-specific place and qemu-tool.c?  That would allow the
vm_stop() change to be dropped.

Stefan

[Qemu-devel] [PATCH 0/2] Fix rtl8139 migration with hotplug

2010-12-16 Thread Alex Williamson

Ok, I think this might actually make everyone happy, but I've been
known to be wrong about that many times before.  Juan challenged me
to find an rtl8139 migration scenario that fails when hotplug is
not involved (and not switch device creation order since that's a
usage bug).  I couldn't come up with one.  We had been arguing that
a subsection didn't make sense for the change to rtl8139 vmstate
because the needed function would be {return 1}. but what if we
could detect if the VM had done any other hotplugs and only include
the subsection in those cases.  That's what this short series does.

So, I hope Juan is happy because this preserves the migration ABI
for the majority of the use cases, and I hope Michael is happy
because it does so using a subsection.  Thanks,

Alex

---

Alex Williamson (2):
  rtl8139: Use subsection to restrict migration after hotplug
  qdev: Track runtime machine modifications


 hw/qdev.c|   10 ++
 hw/qdev.h|1 +
 hw/rtl8139.c |   28 +++-
 3 files changed, 38 insertions(+), 1 deletions(-)

[Qemu-devel] [patch 1/3] add migration_active function

2010-12-16 Thread Marcelo Tosatti

To query whether migration is active.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-kvm-block-copy/migration.c
===
--- qemu-kvm-block-copy.orig/migration.c
+++ qemu-kvm-block-copy/migration.c
@@ -448,3 +448,13 @@ int migrate_fd_close(void *opaque)
 qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
 return s-close(s);
 }
+
+bool migration_active(void)
+{
+if (current_migration 
+current_migration-get_status(current_migration) == MIG_STATE_ACTIVE) {
+return true;
+}
+
+return false;
+}
Index: qemu-kvm-block-copy/migration.h
===
--- qemu-kvm-block-copy.orig/migration.h
+++ qemu-kvm-block-copy/migration.h
@@ -134,4 +134,6 @@ static inline FdMigrationState *migrate_
 return container_of(mig_state, FdMigrationState, mig_state);
 }
 
+bool migration_active(void);
+
 #endif

Re: [Qemu-devel] [PATCH 00/14] [PULL] ARM fixes, v2

2010-12-16 Thread Peter Maydell

On 7 December 2010 15:50, Peter Maydell peter.mayd...@linaro.org wrote:
 The following changes since commit 2c90fe2b71df2534884bce96d90cbfcc93aeedb8:
  Kirill Batuzov (1):
        Speedup 'tb_find_slow' by using the same heuristic as during
 memory page lookup

 are available in the git repository at:

  git://git.linaro.org/qemu/qemu-arm.git for-anthony

Ping?

-- PMM

[Qemu-devel] [patch 0/3] add support for live block copy

2010-12-16 Thread Marcelo Tosatti

Add support for live block copy. This is similar (and based on), block
migration, except it is performed without VM migration.

Re: [Qemu-devel] -snapshot

2010-12-16 Thread Stefan Weil


Am 16.12.2010 18:45, schrieb Stefan Hajnoczi:

On Thu, Dec 16, 2010 at 4:34 PM, Amador Pahim ama...@pahim.org wrote:

Thank you for your answer. Just one more question: If, while my
snapshot vms are running, the main disk is modified by a non
snapshot vm? For example, installing some extra software.. this can
freeze vms or something?


Correct, it is not safe to modify the base image while there is
another disk image backed off it.

The reason for this is that the image only needs to store the changes
that were made on top of the base image. For anything which hasn't
been modified it will go back to the base image and read data from
there.

If you modify the base image, then the filesystem in the base image is
not longer what your image file was created from and you have an
inconsistent view of the disk. It leads to odd behavior and is
unsafe.

Stefan


There are useful scenarios where using the same disk
simultaneously from a snapshot vm and a real system
works.

If you have a hard disk with a dual boot configuration,
it is sometimes useful to boot one configuration with
the real system, then start qemu and boot the second
configuration.

Even booting the same configuration twice
(once with the real machine, once with qemu snapshot)
is sometimes useful and works to a limited degree.
It is a simple way to try new bootloader configurations
or other boot setups.

Regards
Stefan

[Qemu-devel] [PATCH] qdev: sysbus_get_default must not return a NULL pointer (fix regression)

2010-12-16 Thread Stefan Weil

Every system should have some sort of main system bus,
so sysbus_get_default should always return a valid bus.

Without this patch, at least mipssim and malta no longer
start but raise a null pointer access exception (caused by
commit ec990eb622ad46df5ddcb1e94c418c271894d416).

Cc: Anthony Liguori anth...@codemonkey.ws
Signed-off-by: Stefan Weil w...@mail.berlios.de
---
 hw/qdev.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index 10e28df..6fc9b02 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -107,10 +107,7 @@ DeviceState *qdev_create(BusState *bus, const char *name)
 DeviceInfo *info;
 
 if (!bus) {
-if (!main_system_bus) {
-main_system_bus = qbus_create(system_bus_info, NULL, 
main-system-bus);
-}
-bus = main_system_bus;
+bus = sysbus_get_default();
 }
 
 info = qdev_find_info(bus-info, name);
@@ -311,6 +308,10 @@ static int qdev_reset_one(DeviceState *dev, void *opaque)
 
 BusState *sysbus_get_default(void)
 {
+if (!main_system_bus) {
+main_system_bus = qbus_create(system_bus_info, NULL,
+  main-system-bus);
+}
 return main_system_bus;
 }
 
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH 0/3] add TRIM/UNMAP support, v4

2010-12-16 Thread Christoph Hellwig

This patchset adds support for the SCSI WRITE SAME with unmap command,
which allows reclaiming free space from a backing image.

The user facing implementation is pretty complete, but not really
efficient because the underlying bdrv_discard implementation doesn't
use the aio implementation yet.  The reason for that is that the SCSI
layer doesn't really allow any asynchronous commands except for
READ/WRITE by design.  The only support backend so far is the XFS hole
punching ioctl, but others can be added easily when they become
available.  A virtio implementation for a discard command would also
be pretty easy, but until we actually support a better backend then
a plain sparse file it's not worth using for production enviroments
anyway, but more for playing with the thin provisioning infrastructure,
or observing guest behaviour when TRIM / unmap is supported.

If the support is enabled and the backend doesn't support hole punching
the WRITE SAME command becomes a no-op so that migration from
hosts supporting or not supporting it works.

Version 4:
- incorporate Kevin's review comments for the scsi patch
- update to the current block queue
- drop ide TRIM support, as it doesn't easily port to the
  refactoring of the IDE code

Version 3:
- refactor IDE dma support code
- proper brace obsfucation
- fix compile without xfs headers
- use bool instead of int for a one-byte flag
 
Version 2:
- replace tabs with spaces
- return -ENOMEDIUM from bdrv_discard if there's no driver
  assigned
- actually list the TP EVPD page as supported when querying
  for supported EVPD pages

[Qemu-devel] [PATCH 1/3] block: add discard support

2010-12-16 Thread Christoph Hellwig

Add a new bdrv_discard method to free blocks in a mapping image, and a new
drive property to set the granularity for these discard.  If no discard
granularity support is set discard support is disabled.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/block.c
===
--- qemu.orig/block.c   2010-12-16 16:51:43.0 +0100
+++ qemu/block.c2010-12-16 16:52:05.875253180 +0100
@@ -1515,6 +1515,17 @@ int bdrv_has_zero_init(BlockDriverState
 return 1;
 }
 
+int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+{
+if (!bs-drv) {
+return -ENOMEDIUM;
+}
+if (!bs-drv-bdrv_discard) {
+return 0;
+}
+return bs-drv-bdrv_discard(bs, sector_num, nb_sectors);
+}
+
 /*
  * Returns true iff the specified sector is present in the disk image. Drivers
  * not implementing the functionality are assumed to not support backing files,
Index: qemu/block.h
===
--- qemu.orig/block.h   2010-12-16 16:51:43.0 +0100
+++ qemu/block.h2010-12-16 16:52:05.875253180 +0100
@@ -146,6 +146,7 @@ int bdrv_flush(BlockDriverState *bs);
 void bdrv_flush_all(void);
 void bdrv_close_all(void);
 
+int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init(BlockDriverState *bs);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum);
Index: qemu/block_int.h
===
--- qemu.orig/block_int.h   2010-12-16 16:51:43.0 +0100
+++ qemu/block_int.h2010-12-16 16:52:38.542254787 +0100
@@ -72,6 +72,8 @@ struct BlockDriver {
 BlockDriverCompletionFunc *cb, void *opaque);
 BlockDriverAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
 BlockDriverCompletionFunc *cb, void *opaque);
+int (*bdrv_discard)(BlockDriverState *bs, int64_t sector_num,
+int nb_sectors);
 
 int (*bdrv_aio_multiwrite)(BlockDriverState *bs, BlockRequest *reqs,
 int num_reqs);
@@ -227,6 +229,7 @@ typedef struct BlockConf {
 uint16_t min_io_size;
 uint32_t opt_io_size;
 int32_t bootindex;
+uint32_t discard_granularity;
 } BlockConf;
 
 static inline unsigned int get_physical_block_exp(BlockConf *conf)
@@ -250,6 +253,8 @@ static inline unsigned int get_physical_
_conf.physical_block_size, 512), \
 DEFINE_PROP_UINT16(min_io_size, _state, _conf.min_io_size, 0),  \
 DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0),\
-DEFINE_PROP_INT32(bootindex, _state, _conf.bootindex, -1) \
+DEFINE_PROP_INT32(bootindex, _state, _conf.bootindex, -1),\
+DEFINE_PROP_UINT32(discard_granularity, _state, \
+   _conf.discard_granularity, 0)
 
 #endif /* BLOCK_INT_H */
Index: qemu/block/raw.c
===
--- qemu.orig/block/raw.c   2010-12-15 10:05:38.0 +0100
+++ qemu/block/raw.c2010-12-16 16:52:05.882253111 +0100
@@ -65,6 +65,11 @@ static int raw_probe(const uint8_t *buf,
return 1; /* everything can be opened as raw image */
 }
 
+static int raw_discard(BlockDriverState *bs, int64_t sector_num, int 
nb_sectors)
+{
+return bdrv_discard(bs-file, sector_num, nb_sectors);
+}
+
 static int raw_is_inserted(BlockDriverState *bs)
 {
 return bdrv_is_inserted(bs-file);
@@ -130,6 +135,7 @@ static BlockDriver bdrv_raw = {
 .bdrv_aio_readv = raw_aio_readv,
 .bdrv_aio_writev= raw_aio_writev,
 .bdrv_aio_flush = raw_aio_flush,
+.bdrv_discard   = raw_discard,
 
 .bdrv_is_inserted   = raw_is_inserted,
 .bdrv_eject = raw_eject,

[Qemu-devel] [PATCH 2/3] scsi-disk: support WRITE SAME (16) with unmap bit

2010-12-16 Thread Christoph Hellwig

Support discards via the WRITE SAME command with the unmap bit set, and
tell the initiator about the support for it via the block limit and the
new thin provisioning EVPD pages.  Also fix the comment which incorrectly
describedthe block limits EVPD page.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/hw/scsi-disk.c
===
--- qemu.orig/hw/scsi-disk.c2010-12-15 10:06:14.260003984 +0100
+++ qemu/hw/scsi-disk.c 2010-12-16 16:57:26.881003566 +0100
@@ -424,7 +424,8 @@ static int scsi_disk_emulate_inquiry(SCS
 outbuf[buflen++] = 0x80; // unit serial number
 outbuf[buflen++] = 0x83; // device identification
 if (bdrv_get_type_hint(s-bs) != BDRV_TYPE_CDROM) {
-outbuf[buflen++] = 0xb0; // block device characteristics
+outbuf[buflen++] = 0xb0; // block limits
+outbuf[buflen++] = 0xb2; // thin provisioning
 }
 outbuf[pages] = buflen - pages - 1; // number of pages
 break;
@@ -466,8 +467,10 @@ static int scsi_disk_emulate_inquiry(SCS
 buflen += id_len;
 break;
 }
-case 0xb0: /* block device characteristics */
+case 0xb0: /* block limits */
 {
+unsigned int unmap_sectors =
+s-qdev.conf.discard_granularity / s-qdev.blocksize;
 unsigned int min_io_size =
 s-qdev.conf.min_io_size / s-qdev.blocksize;
 unsigned int opt_io_size =
@@ -492,6 +495,21 @@ static int scsi_disk_emulate_inquiry(SCS
 outbuf[13] = (opt_io_size  16)  0xff;
 outbuf[14] = (opt_io_size  8)  0xff;
 outbuf[15] = opt_io_size  0xff;
+
+/* optimal unmap granularity */
+outbuf[28] = (unmap_sectors  24)  0xff;
+outbuf[29] = (unmap_sectors  16)  0xff;
+outbuf[30] = (unmap_sectors  8)  0xff;
+outbuf[31] = unmap_sectors  0xff;
+break;
+}
+case 0xb2: /* thin provisioning */
+{
+outbuf[3] = buflen = 8;
+outbuf[4] = 0;
+outbuf[5] = 0x40; /* write same with unmap supported */
+outbuf[6] = 0;
+outbuf[7] = 0;
 break;
 }
 default:
@@ -959,6 +977,12 @@ static int scsi_disk_emulate_command(SCS
 outbuf[11] = 0;
 outbuf[12] = 0;
 outbuf[13] = get_physical_block_exp(s-qdev.conf);
+
+/* set TPE bit if the format supports discard */
+if (s-qdev.conf.discard_granularity) {
+outbuf[14] = 0x80;
+}
+
 /* Protection, exponent and lowest lba field left blank. */
 buflen = req-cmd.xfer;
 break;
@@ -1123,6 +1147,31 @@ static int32_t scsi_send_command(SCSIDev
 goto illegal_lba;
 }
 break;
+case WRITE_SAME_16:
+len = r-req.cmd.xfer / d-blocksize;
+
+DPRINTF(WRITE SAME(16) (sector % PRId64 , count %d)\n,
+r-req.cmd.lba, len);
+
+if (r-req.cmd.lba  s-max_lba) {
+goto illegal_lba;
+}
+
+/*
+ * We only support WRITE SAME with the unmap bit set for now.
+ */
+if (!(buf[1]  0x8)) {
+goto fail;
+}
+
+rc = bdrv_discard(s-bs, r-req.cmd.lba * s-cluster_size,
+  len * s-cluster_size);
+if (rc  0) {
+/* XXX: better error code ?*/
+goto fail;
+}
+
+break;
 default:
 DPRINTF(Unknown SCSI command (%2.2x)\n, buf[0]);
 fail:
Index: qemu/hw/scsi-defs.h
===
--- qemu.orig/hw/scsi-defs.h2010-12-15 10:05:38.301003914 +0100
+++ qemu/hw/scsi-defs.h 2010-12-16 16:52:55.803004683 +0100
@@ -84,6 +84,7 @@
 #define MODE_SENSE_10 0x5a
 #define PERSISTENT_RESERVE_IN 0x5e
 #define PERSISTENT_RESERVE_OUT 0x5f
+#define WRITE_SAME_16 0x93
 #define MAINTENANCE_IN0xa3
 #define MAINTENANCE_OUT   0xa4
 #define MOVE_MEDIUM   0xa5

[Qemu-devel] [PATCH 3/3] raw-posix: add discard support

2010-12-16 Thread Christoph Hellwig

Add support to discard blocks in a raw image residing on an XFS filesystem
by calling the XFS_IOC_UNRESVSP64 ioctl to punch holes.  Support for other
hole punching mechanisms can be added when they become available.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/block/raw-posix.c
===
--- qemu.orig/block/raw-posix.c 2010-12-15 10:05:37.0 +0100
+++ qemu/block/raw-posix.c  2010-12-16 17:40:47.617253460 +0100
@@ -69,6 +69,10 @@
 #include sys/diskslice.h
 #endif
 
+#ifdef CONFIG_XFS
+#include xfs/xfs.h
+#endif
+
 //#define DEBUG_FLOPPY
 
 //#define DEBUG_BLOCK
@@ -120,6 +124,9 @@ typedef struct BDRVRawState {
 #endif
 uint8_t *aligned_buf;
 unsigned aligned_buf_size;
+#ifdef CONFIG_XFS
+bool is_xfs : 1;
+#endif
 } BDRVRawState;
 
 static int fd_open(BlockDriverState *bs);
@@ -196,6 +203,12 @@ static int raw_open_common(BlockDriverSt
 #endif
 }
 
+#ifdef CONFIG_XFS
+if (platform_test_xfs_fd(s-fd)) {
+s-is_xfs = 1;
+}
+#endif
+
 return 0;
 
 out_free_buf:
@@ -740,6 +753,36 @@ static int raw_flush(BlockDriverState *b
 return qemu_fdatasync(s-fd);
 }
 
+#ifdef CONFIG_XFS
+static int xfs_discard(BDRVRawState *s, int64_t sector_num, int nb_sectors)
+{
+struct xfs_flock64 fl;
+
+memset(fl, 0, sizeof(fl));
+fl.l_whence = SEEK_SET;
+fl.l_start = sector_num  9;
+fl.l_len = (int64_t)nb_sectors  9;
+
+if (xfsctl(NULL, s-fd, XFS_IOC_UNRESVSP64, fl)  0) {
+printf(cannot punch hole (%s)\n, strerror(errno));
+return -errno;
+}
+
+return 0;
+}
+#endif
+
+static int raw_discard(BlockDriverState *bs, int64_t sector_num, int 
nb_sectors)
+{
+#ifdef CONFIG_XFS
+BDRVRawState *s = bs-opaque;
+
+if (s-is_xfs)
+return xfs_discard(s, sector_num, nb_sectors);
+#endif
+
+return 0;
+}
 
 static QEMUOptionParameter raw_create_options[] = {
 {
@@ -761,6 +804,7 @@ static BlockDriver bdrv_file = {
 .bdrv_close = raw_close,
 .bdrv_create = raw_create,
 .bdrv_flush = raw_flush,
+.bdrv_discard = raw_discard,
 
 .bdrv_aio_readv = raw_aio_readv,
 .bdrv_aio_writev = raw_aio_writev,
Index: qemu/configure
===
--- qemu.orig/configure 2010-12-16 16:51:43.0 +0100
+++ qemu/configure  2010-12-16 17:41:11.410254507 +0100
@@ -288,6 +288,7 @@ xen=
 linux_aio=
 attr=
 vhost_net=
+xfs=
 
 gprof=no
 debug_tcg=no
@@ -1399,6 +1400,27 @@ EOF
 fi
 
 ##
+# xfsctl() probe, used for raw-posix
+if test $xfs != no ; then
+  cat  $TMPC  EOF
+#include xfs/xfs.h
+int main(void)
+{
+xfsctl(NULL, 0, 0, NULL);
+return 0;
+}
+EOF
+  if compile_prog   ; then
+xfs=yes
+  else
+if test $xfs = yes ; then
+  feature_not_found xfs
+fi
+xfs=no
+  fi
+fi
+
+##
 # vde libraries probe
 if test $vde != no ; then
   vde_libs=-lvdeplug
@@ -2403,6 +2425,7 @@ echo Trace backend $trace_backend
 echo Trace output file $trace_file-pid
 echo spice support $spice
 echo rbd support   $rbd
+echo xfsctl support$xfs
 
 if test $sdl_too_old = yes; then
 echo - Your SDL version is too old - please upgrade to have SDL support
@@ -2548,6 +2571,9 @@ fi
 if test $uuid = yes ; then
   echo CONFIG_UUID=y  $config_host_mak
 fi
+if test $xfs = yes ; then
+  echo CONFIG_XFS=y  $config_host_mak
+fi
 qemu_version=`head $source_path/VERSION`
 echo VERSION=$qemu_version $config_host_mak
 echo PKGVERSION=$pkgversion $config_host_mak

[Qemu-devel] [PATCH 1/2] qdev: Track runtime machine modifications

2010-12-16 Thread Alex Williamson

Create a trivial interface to track whether the machine has been
modified since boot.  Adding or removing devices will trigger this
to return true.  An example usage scenario for such an interface is
the rtl8139 driver which includes a cpu_register_io_memory() value
in it's migration stream.  For the majority of migrations, where
no hotplug has occured in the machine, this works correctly.  Once
the machine is modified, we can use this interface to detect that
and include a subsection for the device to prevent migrations to
rtl8139 versions with this bug.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

 hw/qdev.c |   10 ++
 hw/qdev.h |1 +
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index 35858cb..e6e7a57 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -32,6 +32,8 @@
 #include blockdev.h
 
 static int qdev_hotplug = 0;
+static bool qdev_hot_added = false;
+static bool qdev_hot_removed = false;
 
 /* This is a nasty hack to allow passing a NULL bus to qdev_create.  */
 static BusState *main_system_bus;
@@ -93,6 +95,7 @@ static DeviceState *qdev_create_from_info(BusState *bus, 
DeviceInfo *info)
 if (qdev_hotplug) {
 assert(bus-allow_hotplug);
 dev-hotplugged = 1;
+qdev_hot_added = true;
 }
 dev-instance_id_alias = -1;
 dev-state = DEV_STATE_CREATED;
@@ -305,6 +308,8 @@ int qdev_unplug(DeviceState *dev)
 }
 assert(dev-info-unplug != NULL);
 
+qdev_hot_removed = true;
+
 return dev-info-unplug(dev);
 }
 
@@ -370,6 +375,11 @@ void qdev_machine_creation_done(void)
 qdev_hotplug = 1;
 }
 
+bool qdev_machine_modified(void)
+{
+return qdev_hot_added || qdev_hot_removed;
+}
+
 /* Get a character (serial) device interface.  */
 CharDriverState *qdev_init_chardev(DeviceState *dev)
 {
diff --git a/hw/qdev.h b/hw/qdev.h
index 579328a..66b8aee 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -123,6 +123,7 @@ int qdev_unplug(DeviceState *dev);
 void qdev_free(DeviceState *dev);
 int qdev_simple_unplug_cb(DeviceState *dev);
 void qdev_machine_creation_done(void);
+bool qdev_machine_modified(void);
 
 qemu_irq qdev_get_gpio_in(DeviceState *dev, int n);
 void qdev_connect_gpio_out(DeviceState *dev, int n, qemu_irq pin);

[Qemu-devel] [patch 3/3] do not allow migration if block copy in progress

2010-12-16 Thread Marcelo Tosatti

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-kvm-block-copy/migration.c
===
--- qemu-kvm-block-copy.orig/migration.c
+++ qemu-kvm-block-copy/migration.c
@@ -19,6 +19,7 @@
 #include block.h
 #include qemu_socket.h
 #include block-migration.h
+#include block-copy.h
 #include qemu-objects.h
 
 //#define DEBUG_MIGRATION
@@ -88,6 +89,11 @@ int do_migrate(Monitor *mon, const QDict
 return -1;
 }
 
+if (block_copy_active()) {
+monitor_printf(mon, block copy in progress\n);
+return -1;
+}
+
 if (strstart(uri, tcp:, p)) {
 s = tcp_start_outgoing_migration(mon, p, max_throttle, detach,
  blk, inc);

[Qemu-devel] [patch 2/3] live block copy

2010-12-16 Thread Marcelo Tosatti

Add support for live block copy.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-kvm/block-copy.c
===
--- /dev/null
+++ qemu-kvm/block-copy.c
@@ -0,0 +1,728 @@
+/*
+ * QEMU live block copy
+ *
+ * Copyright (C) 2010 Red Hat Inc.
+ *
+ * Authors: Marcelo Tosatti mtosa...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include qemu-common.h
+#include block_int.h
+#include qemu-queue.h
+#include qemu-timer.h
+#include monitor.h
+#include block-copy.h
+#include migration.h
+#include sysemu.h
+#include qjson.h
+#include assert.h
+
+#define BLOCK_SIZE (BDRV_SECTORS_PER_DIRTY_CHUNK  BDRV_SECTOR_BITS)
+#define MAX_IS_ALLOCATED_SEARCH 65536
+
+/*
+ * Stages:
+ *
+ * STAGE_BULK: bulk reads/writes in progress
+ * STAGE_BULK_FINISHED: bulk reads finished, bulk writes in progress
+ * STAGE_DIRTY: bulk writes finished, dirty reads/writes in progress
+ * STAGE_SWITCH_FINISHED: switched to new image.
+ */
+
+enum BdrvCopyStage {
+STAGE_BULK,
+STAGE_BULK_FINISHED,
+STAGE_DIRTY,
+STAGE_SWITCH_FINISHED,
+};
+
+typedef struct BdrvCopyState {
+BlockDriverState *src;
+BlockDriverState *dst;
+bool shared_base;
+
+int64_t curr_sector;
+int64_t completed_sectors;
+int64_t nr_sectors;
+
+enum BdrvCopyStage stage;
+int inflight_reads;
+int error;
+int failed;
+int cancelled;
+QLIST_HEAD(, BdrvCopyBlock) io_list;
+unsigned long *aio_bitmap;
+QEMUTimer *aio_timer;
+QLIST_ENTRY(BdrvCopyState) list;
+
+int64_t blocks;
+int64_t total_time;
+
+char src_device_name[32];
+char dst_filename[1024];
+int commit_fd;
+} BdrvCopyState;
+
+typedef struct BdrvCopyBlock {
+BdrvCopyState *state;
+uint8_t *buf;
+int64_t sector;
+int64_t nr_sectors;
+struct iovec iov;
+QEMUIOVector qiov;
+BlockDriverAIOCB *aiocb;
+int64_t time;
+QLIST_ENTRY(BdrvCopyBlock) list;
+} BdrvCopyBlock;
+
+static QLIST_HEAD(, BdrvCopyState) block_copy_list =
+QLIST_HEAD_INITIALIZER(block_copy_list);
+
+static void alloc_aio_bitmap(BdrvCopyState *s)
+{
+BlockDriverState *bs = s-src;
+int64_t bitmap_size;
+
+bitmap_size = (bdrv_getlength(bs)  BDRV_SECTOR_BITS) +
+BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
+bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
+
+s-aio_bitmap = qemu_mallocz(bitmap_size);
+}
+
+static bool aio_inflight(BdrvCopyState *s, int64_t sector)
+{
+int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
+
+if (s-aio_bitmap 
+(sector  BDRV_SECTOR_BITS)  bdrv_getlength(s-src)) {
+return !!(s-aio_bitmap[chunk / (sizeof(unsigned long) * 8)] 
+(1UL  (chunk % (sizeof(unsigned long) * 8;
+} else {
+return 0;
+}
+}
+
+static void set_aio_inflight(BdrvCopyState *s, int64_t sector_num,
+ int nb_sectors, int set)
+{
+int64_t start, end;
+unsigned long val, idx, bit;
+
+start = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
+end = (sector_num + nb_sectors - 1) / BDRV_SECTORS_PER_DIRTY_CHUNK;
+
+for (; start = end; start++) {
+idx = start / (sizeof(unsigned long) * 8);
+bit = start % (sizeof(unsigned long) * 8);
+val = s-aio_bitmap[idx];
+if (set) {
+if (!(val  (1UL  bit))) {
+val |= 1UL  bit;
+}
+} else {
+if (val  (1UL  bit)) {
+val = ~(1UL  bit);
+}
+}
+s-aio_bitmap[idx] = val;
+}
+}
+
+static void blkcopy_set_stage(BdrvCopyState *s, enum BdrvCopyStage stage)
+{
+s-stage = stage;
+
+switch (stage) {
+case STAGE_BULK:
+BLKDBG_EVENT(s-dst-file, BLKDBG_BLKCOPY_STAGE_BULK);
+break;
+case STAGE_BULK_FINISHED:
+BLKDBG_EVENT(s-dst-file, BLKDBG_BLKCOPY_STAGE_BULK_FINISHED);
+break;
+case STAGE_DIRTY:
+BLKDBG_EVENT(s-dst-file, BLKDBG_BLKCOPY_STAGE_DIRTY);
+break;
+case STAGE_SWITCH_FINISHED:
+BLKDBG_EVENT(s-dst-file, BLKDBG_BLKCOPY_STAGE_SWITCH_FINISHED);
+break;
+default:
+break;
+}
+}
+
+static void blk_copy_handle_cb_error(BdrvCopyState *s, int ret)
+{
+s-error = ret;
+qemu_mod_timer(s-aio_timer, qemu_get_clock(rt_clock));
+}
+
+static inline void add_avg_transfer_time(BdrvCopyState *s, int64_t time)
+{
+s-blocks++;
+s-total_time += time;
+}
+
+static void blk_copy_write_cb(void *opaque, int ret)
+{
+BdrvCopyBlock *blk = opaque;
+BdrvCopyState *s = blk-state;
+
+if (ret  0) {
+QLIST_REMOVE(blk, list);
+qemu_free(blk-buf);
+qemu_free(blk);
+blk_copy_handle_cb_error(s, ret);
+return;
+}
+
+QLIST_REMOVE(blk, list);
+add_avg_transfer_time(s, qemu_get_clock_ns(rt_clock) - blk-time);
+
+/* schedule switch to

Re: [Qemu-devel] -snapshot

2010-12-16 Thread Amador Pahim

Thanks. Some stuff are clear now. I started a VDI Open Source project
(http://www.ucs.br/projeto/osdvt/) and your information will help a
lot.

Pahim

On Thu, Dec 16, 2010 at 4:16 PM, Stefan Weil w...@mail.berlios.de wrote:
 Am 16.12.2010 18:45, schrieb Stefan Hajnoczi:

 On Thu, Dec 16, 2010 at 4:34 PM, Amador Pahim ama...@pahim.org wrote:

 Thank you for your answer. Just one more question: If, while my
 snapshot vms are running, the main disk is modified by a non
 snapshot vm? For example, installing some extra software.. this can
 freeze vms or something?

 Correct, it is not safe to modify the base image while there is
 another disk image backed off it.

 The reason for this is that the image only needs to store the changes
 that were made on top of the base image. For anything which hasn't
 been modified it will go back to the base image and read data from
 there.

 If you modify the base image, then the filesystem in the base image is
 not longer what your image file was created from and you have an
 inconsistent view of the disk. It leads to odd behavior and is
 unsafe.

 Stefan

 There are useful scenarios where using the same disk
 simultaneously from a snapshot vm and a real system
 works.

 If you have a hard disk with a dual boot configuration,
 it is sometimes useful to boot one configuration with
 the real system, then start qemu and boot the second
 configuration.

 Even booting the same configuration twice
 (once with the real machine, once with qemu snapshot)
 is sometimes useful and works to a limited degree.
 It is a simple way to try new bootloader configurations
 or other boot setups.

 Regards
 Stefan

1 2 >

1 - 100 of 113 matches

Mail list logo