date:20101215

Re: [Qemu-devel] Sometimes qem-kvm hang at kvm_put_vcpu_events when virsh restore. Both version 0.12.5 / 0.13.0

2010-12-15 Thread changlimin

1. I create a guest use libvirt;
2. I start the guest use: virsh start guest;
3. I save the guest to a file, use: virsh save guest guest.save
4. I retore the guest, use : virsh retore guest.save
Sometimes, the guest restore successful. sometimes restore fail. 
I debug the code, the qemu-kvm hang at  kvm_put_vcpu_events, the process 
state is D, uninterruptable sleep. Only can killed by kill -9.
I print the events when kvm_put_vcpu_events / kvm_get_vcpu_events, it is 
always different, event when restore successful

- Original Message - 
From: Mulyadi Santosa mulyadi.sant...@gmail.com
To: changlimin changli...@h3c.com
Cc: qemu-devel@nongnu.org
Sent: Wednesday, December 15, 2010 3:46 PM
Subject: Re: [Qemu-devel] Sometimes qem-kvm hang at kvm_put_vcpu_events when 
virsh restore. Both version 0.12.5 / 0.13.0


 2010/12/15 changlimin changli...@h3c.com:
 BTW windows 2008 guest start failed at qem-kvm 0.13.0, but 0.12.5 start
 success.
 
 I am not qemu/kvm expert, but I think if you could provide either qemu
 instruction trace (using -d) or traces inside the guest OS itself
 (windows live debugger maybe?), it would  be much easier for people
 here to decipher the problem.
 
 At least could you write down how do you invoke qemu-kvm back then?
 
 -- 
 regards,
 
 Mulyadi Santosa
 Freelance Linux trainer and consultant
 
 blog: the-hydra.blogspot.com
 training: mulyaditraining.blogspot.com

[Qemu-devel] qemu-x86_64 segments on fedora14

2010-12-15 Thread Rania Mameesh

Dear All,

I am getting a the following segmentation fault when I run the command
qemu-x86_64 /bin/ls:

qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

I am using fedora14 operating system. Thanks for your help.

Rania Mameesh
University of Siena
Italy

[Qemu-devel] Re: [PATCH 00/11] AHCI emulation support v9

2010-12-15 Thread Gerd Hoffmann


  Hi,


One thing that might be worth considering is to have preliminary
seabios support for booting. Gerd does have something in the queue
there ;).


Not in the queue, it's upstream.  Just pull latest seabios.  As it 
didn't got much testing yet it is off by default, so make sure to flip 
CONFIG_AHCI.  Thats it ;)


cheers,
  Gerd

Re: [Qemu-devel] Sometimes qem-kvm hang at kvm_put_vcpu_events when virsh restore. Both version 0.12.5 / 0.13.0

2010-12-15 Thread Mulyadi Santosa

On Wed, Dec 15, 2010 at 15:08, changlimin changli...@h3c.com wrote:
 1. I create a guest use libvirt;
 2. I start the guest use: virsh start guest;
 3. I save the guest to a file, use: virsh save guest guest.save
 4. I retore the guest, use : virsh retore guest.save
    Sometimes, the guest restore successful. sometimes restore fail.
    I debug the code, the qemu-kvm hang at  kvm_put_vcpu_events, the process 
 state is D, uninterruptable sleep. Only can killed by kill -9.
    I print the events when kvm_put_vcpu_events / kvm_get_vcpu_events, it is 
 always different, event when restore successful

I could only say, i smell deadlock due to race condition here...but I
have no further prove...how many (virtual) CPU did you simulate at
that time?

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

Re: [Qemu-devel] qemu-x86_64 segments on fedora14

2010-12-15 Thread Mulyadi Santosa

On Wed, Dec 15, 2010 at 15:26, Rania Mameesh mame...@dii.unisi.it wrote:
 Dear All,

 I am getting a the following segmentation fault when I run the command
 qemu-x86_64 /bin/ls:

 qemu: uncaught target signal 11 (Segmentation fault) - core dumped
 Segmentation fault (core dumped)

that was the user mode Qemu emulation, right?

Sounds like missing NPTL implementation to meAFAIK
it's still incomplete in some parts

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

Re: [Qemu-devel] Sometimes qem-kvm hang at kvm_put_vcpu_events when virsh restore. Both version 0.12.5 / 0.13.0

2010-12-15 Thread changlimin

I have tried 1 vcpu and 2 vcpus, both sometimes restore fail.

- Original Message - 
From: Mulyadi Santosa mulyadi.sant...@gmail.com
To: changlimin changli...@h3c.com
Cc: qemu-devel@nongnu.org
Sent: Wednesday, December 15, 2010 4:31 PM
Subject: Re: [Qemu-devel] Sometimes qem-kvm hang at kvm_put_vcpu_events when 
virsh restore. Both version 0.12.5 / 0.13.0

On Wed, Dec 15, 2010 at 15:08, changlimin changli...@h3c.com wrote:
 1. I create a guest use libvirt;
 2. I start the guest use: virsh start guest;
 3. I save the guest to a file, use: virsh save guest guest.save
 4. I retore the guest, use : virsh retore guest.save
 Sometimes, the guest restore successful. sometimes restore fail.
 I debug the code, the qemu-kvm hang at kvm_put_vcpu_events, the process state 
 is D, uninterruptable sleep. Only can killed by kill -9.
 I print the events when kvm_put_vcpu_events / kvm_get_vcpu_events, it is 
 always different, event when restore successful

I could only say, i smell deadlock due to race condition here...but I
have no further prove...how many (virtual) CPU did you simulate at
that time?

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

[Qemu-devel] Re: [PATCH 00/11] AHCI emulation support v9

2010-12-15 Thread Alexander Graf


Am 15.12.2010 um 09:28 schrieb Gerd Hoffmann kra...@redhat.com:

  Hi,
 
 One thing that might be worth considering is to have preliminary
 seabios support for booting. Gerd does have something in the queue
 there ;).
 
 Not in the queue, it's upstream.  Just pull latest seabios.  As it didn't got 
 much testing yet it is off by default, so make sure to flip CONFIG_AHCI.  
 Thats it ;)
 

Even better! Could we get that in for 0.14 still? I assume that means we need a 
stable seabios release.

Alex

[Qemu-devel] Re: [PATCH 00/11] AHCI emulation support v9

2010-12-15 Thread Gerd Hoffmann


  Hi,


Even better! Could we get that in for 0.14 still? I assume that means
we need a stable seabios release.


Also for Gleb's bootorder bits we'll need a seabios update.  I think for 
that not all bits are in upstream seabios yet though.  A new release 
once this is settled is probably a good idea, so we don't have to grab a 
random snapshot.


cheers,
  Gerd

[Qemu-devel] [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Lai Jiangshan


Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).

changed from v1
Add document.
Add error handling when the cpu index is invalid.

changed from v2
use QERR_INVALID_PARAMETER_VALUE as Markus suggest.

Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com
---
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 23024ba..f86d9fe 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -724,7 +724,8 @@ ETEXI
 .args_type  = cpu_index:i,
 .params = cpu,
 .help   = inject an NMI on the given CPU,
-.mhandler.cmd = do_inject_nmi,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
 },
 #endif
 STEXI
diff --git a/monitor.c b/monitor.c
index ec31eac..3e33a96 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const QDict 
*qdict)
 #endif
 
 #if defined(TARGET_I386)
-static void do_inject_nmi(Monitor *mon, const QDict *qdict)
+static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 CPUState *env;
 int cpu_index = qdict_get_int(qdict, cpu_index);
@@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const QDict 
*qdict)
 for (env = first_cpu; env != NULL; env = env-next_cpu)
 if (env-cpu_index == cpu_index) {
 cpu_interrupt(env, CPU_INTERRUPT_NMI);
-break;
+return 0;
 }
+
+qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
+  a CPU number);
+return -1;
 }
 #endif
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e5f157f..fcb6bf2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -429,6 +429,33 @@ Example:
 
 EQMP
 
+#if defined(TARGET_I386)
+{
+.name   = inject_nmi,
+.args_type  = cpu_index:i,
+.params = cpu,
+.help   = inject an NMI on the given CPU,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
+},
+#endif
+SQMP
+inject_nmi
+--
+
+Inject an NMI on the given CPU (x86 only).
+
+Arguments:
+
+- cpu_index: the index of the CPU to be injected NMI (json-int)
+
+Example:
+
+- { execute: inject_nmi, arguments: { cpu_index: 0 } }
+- { return: {} }
+
+EQMP
+
 {
 .name   = migrate,
 .args_type  = detach:-d,blk:-b,inc:-i,uri:s,

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Michael S. Tsirkin

On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
   
   I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
   assigned device, so I'm pretty sure we're not going to hurt migration,
   but the code is clearly wrong and I'd like to make sure we don't trip on
   a migration failure for a minor device config space change.
  
  Which reminds me: maybe just mark nested bridges as non-migrateable
  for now?  Care writing such a patch?
 
 Hmm, this is trickier than it sounds.

Hmm, since 0 is put in the path instead of the bridge number,
will the correct bridge be restored?

  We're really only broken wrt
 migration if a device under a bridge calls qemu_ram_alloc.

I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?

  Any device
 is free to do this, but typically it only happens via
 pci_add_option_rom() (not counting vga as typical).  So maybe the better
 approach for now is to prevent the problem by disallowing option ROMs
 for devices below a bridge.  We obviously risk devices coming along that
 allocate RAM on their own, but we could still allow the most common
 issue with almost no lost functionality (assuming no one wants to boot
 off that nested device).  Thoughts?  Thanks,
 
 Alex

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-15 Thread Michael S. Tsirkin

On Tue, Dec 14, 2010 at 08:41:55AM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 14:32 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 10:00:48PM -0700, Alex Williamson wrote:
   On Tue, 2010-12-14 at 06:43 +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 12:15:08PM -0700, Alex Williamson wrote:
 On Mon, 2010-12-13 at 21:06 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 11:59:16AM -0700, Alex Williamson wrote:
   On Mon, 2010-12-13 at 20:54 +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 11:00:44AM -0700, Alex Williamson wrote:
 On Mon, 2010-12-13 at 19:50 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 10:43:22AM -0700, Alex Williamson 
  wrote:
   So, unfortunately, I stand by my original patch.
  
  What about the one that put -1 in saved index for a 
  hotplugged device?
 
 There are still examples that don't work even without hotplug 
 (example 2
 and example 3 after the reboot).  That hack limits the 
 damage, but still
 leaves a latent bug for reboot and doesn't address the 
 non-hotplug
 scenarios.  So, I don't think it's worthwhile to pursue, and 
 we
 shouldn't pretend we can use it to avoid bumping the 
 version_id.
 Thanks,
 
 Alex

I guess when we bump it we tell users: migration is completely
borken to the old version, don't even try it.

Is there a way for libvirt to discover such incompatibilities
and avoid the migration?
   
   I don't know if libvirt has a way to query this in advance.  If a
   migration is attempted, the target will report:
   
   savevm: unsupported version 5 for ':00:03.0/rtl8139' v4
   
   And the source will continue running.  We waste plenty of bits 
   getting
   to that point,
  
  Yes, this happens after all of memory has been migrated.
 
 Better late than never :^\

One other question: can we do the same by creating a new (empty)
section? As was discussed in the past this is easier for
downstreams to cherry-pick.
   
   The only way I can think to do that would be to have a subsection that
   is always included, but saves no data.  That would force a failure on
   new-old migration, but I don't think it really matches the intended
   purpose of subsections and feels like it's adding cruft for no gain.
   Maybe I'm missing something.  Juan, is there any advantage to trapping
   this in a subsection?  Thanks,
   
   Alex
  
  Maybe in this particular case the advantage is minimal.
  But it seems easier to stick to a rule of no more version
  bumps than argue about each case.
 
 Do we have such a rule?

I think it's easier to have a hard rule than start arguing whether
each change is a subsection or not for each patch.
What is the downside?

 If we have a subsection who's needed function
 is return 1, I think that's a good indication that it's not appropriate
 for a subsection and the end result is equivalent to bumping the main
 driver vmstate version.

So if you feel strongly about it, go ahead, I'm just concerned
we'll now need to argue version or subsection for each migration-related
change.

 It's convoluted to try to hide a one-way upgrade in a subsection.

It may be convoluted but I think it works better with downstreams.
When we'll bump the version again in the future, it
should be possible for a downstream to pick up
just the next change, and skip this one.

But that's a general comment. In this case, I expect all
downstream to be able to pick this bugfix with no real downsides.

  Thanks,
 
 Alex

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-15 Thread Michael S. Tsirkin

On Tue, Dec 14, 2010 at 04:59:55PM +0100, Paolo Bonzini wrote:
 On 12/14/2010 04:41 PM, Alex Williamson wrote:
   Maybe in this particular case the advantage is minimal.
   But it seems easier to stick to a rule of no more version
   bumps than argue about each case.
 
 Do we have such a rule?  If we have a subsection who's needed function
 is return 1, I think that's a good indication that it's not appropriate
 for a subsection and the end result is equivalent to bumping the main
 driver vmstate version.  It's convoluted to try to hide a one-way
 upgrade in a subsection.  Thanks,
 
 Indeed, subsections are for data that is rarely needed so that
 there's some chance (sometimes ~100%) of migration working
 seemlessly.

If a subsection arrives that qemu does
not know about, won't migratin fail?

  In this case it's either
 no-bump-and-live-with-the-consequences, or changing the version id.
 
 Paolo

This was discussed to death already.  version ids have the problem
that they don't play nicely with downstreams.

-- 
MST

Re: [Qemu-devel] [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Markus Armbruster

Lai Jiangshan la...@cn.fujitsu.com writes:

 Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).

 changed from v1
 Add document.
 Add error handling when the cpu index is invalid.

 changed from v2
 use QERR_INVALID_PARAMETER_VALUE as Markus suggest.

 Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com

A note on commit messages:

The commit message should describe the current version of the patch.
Don't repeat the subject line in the body.

Patch history is very useful for review, but usually uninteresting once
the patch is committed.  Thus, it's best to put it after the ---
separator.

Subsystem tags in the subject line are helpful.  But qemu doesn't
provide any information there :)


Regarding the patch:

The conversion looks good.

The new QMP command is called inject_nmi, while the existing human
monitor command is called nmi.  Luiz asked for this name change.  I
don't mind.  But should we rename the human monitor command for
consistency?

Regardless, the differing command name is worth mentioning in the commit
message.

[Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify

2010-12-15 Thread Stefan Hajnoczi

On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Mon, Dec 13, 2010 at 05:57:28PM +, Stefan Hajnoczi wrote:
 On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
  On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin m...@redhat.com 
  wrote:
  On Mon, Dec 13, 2010 at 03:27:06PM +, Stefan Hajnoczi wrote:
  On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin m...@redhat.com 
  wrote:
   On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 01:11:27PM +, Stefan Hajnoczi wrote:
Fresh results:
   
192.168.0.1 - host (runs netperf)
192.168.0.2 - guest (runs netserver)
   
host$ src/netperf -H 192.168.0.2 -- -m 200
   
ioeventfd=on
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
 87380  16384    200    10.00    1759.25
   
ioeventfd=off
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.2
(192.168.0.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
   
 87380  16384    200    10.00    1757.15
   
The results vary approx +/- 3% between runs.
   
Invocation:
$ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev
type=tap,id=net0,ifname=tap0,script=no,downscript=no -device
virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive
if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img
   
I am running qemu.git with v5 patches, based off
36888c6335422f07bbc50bf3443a39f24b90c7c6.
   
Host:
1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz
8 GB RAM
RHEL 6 host
   
Next I will try the patches on latest qemu-kvm.git
   
Stefan
  
   One interesting thing is that I put virtio-net earlier on
   command line.
  
   Sorry I mean I put it after disk, you put it before.
 
  I can't find a measurable difference when swapping -drive and -netdev.
 
  One other concern I have is that we are apparently using
  ioeventfd for all VQs. E.g. for virtio-net we probably should not
  use it for the control VQ - it's a waste of resources.
 
  One option is a per-device (block, net, etc) bitmap that masks out
  virtqueues.  Is that something you'd like to see?
 
  I'm tempted to mask out the RX vq too and see how that affects the
  qemu-kvm.git specific issue.

 As expected, the rx virtqueue is involved in the degradation.  I
 enabled ioeventfd only for the TX virtqueue and got the same good
 results as userspace virtio-net.

 When I enable only the rx virtqueue, performs decreases as we've seen above.

 Stefan

 Interesting. In particular this implies something's wrong with the
 queue: we should not normally be getting notifications from rx queue
 at all. Is it running low on buffers? Does it help to increase the vq
 size?  Any other explanation?

I made a mistake, it is the *tx* vq that causes reduced performance on
short packets with ioeventfd.  I double-checked the results and the rx
vq doesn't affect performance.

Initially I thought the fix would be to adjust the tx mitigation
mechanism since ioeventfd does its own mitigation of sorts.  Multiple
eventfd signals will be coalesced into one qemu-kvm event handler call
if qemu-kvm didn't have a chance to handle the first event before the
eventfd was signalled again.

I added -device virtio-net-pci tx=immediate to flush the TX queue
immediately instead of scheduling a BH or timer.  Unfortunately this
had little measurable effect and performance stayed the same.  This
suggests most of the latency is between the guest's pio write and
qemu-kvm getting around to handling the event.

You mentioned that vhost-net has the same performance issue on this
benchmark.  I guess a solution for vhost-net may help virtio-ioeventfd
and vice versa.

Are you happy with this patchset if I remove virtio-net-pci
ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with
default on)?  For block we've found it to be a win and the initial
results looked good for net too.

Stefan

[Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify

2010-12-15 Thread Stefan Hajnoczi

For the record, here are the commits to selectively mask virtqueues
for ioeventfd and to add -device virtio-net-pci,tx=immediate:
http://repo.or.cz/w/qemu-kvm/stefanha.git/shortlog/refs/heads/virtio-ioeventfd-2

I'm posting this in case you want to try it out too.

Stefan

Re: [Qemu-devel] [PULL] spice: add qxl device, qmp events + monitor commands.

2010-12-15 Thread Gerd Hoffmann


On 12/09/10 14:33, Gerd Hoffmann wrote:

The following changes since commit
138b38b61bf92d4e9588acf934e532499c94e185:

ppc: kvm: fix signedness warning (2010-12-08 21:30:19 +0100)

are available in the git repository at:
git://anongit.freedesktop.org/spice/qemu spice.v23.pull


Ping?

cheers,
  Gerd

Re: [Qemu-devel] Re: [PATCH 1/7] usb-linux: introduce a usb_linux_alt_setting function

2010-12-15 Thread Gerd Hoffmann


On 12/09/10 14:16, Gerd Hoffmann wrote:

On 11/26/10 19:13, Hans de Goede wrote:

The next patch in this series introduces multiple ways to get the
alt setting dependent upon usb_fs_type, it is cleaner to put this
into its own function.

Note that this patch also changes the assumed alt setting in case
of an error getting the alt setting to be 0 (a sane default) rather
then the interface numberwhich makes no sense.


Patch series looks good.

Acked-by: Gerd Hoffmann kra...@redhat.com


Ping?  Any chance this can go in for 0.14?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 0/5] usb-ccid (v9)

2010-12-15 Thread Gerd Hoffmann


On 12/12/10 19:37, Alon Levy wrote:

This patchset adds three new devices, usb-ccid, ccid-card-passthru and
ccid-card-emulated, providing a CCID bus, a simple passthru protocol
implementing card requiring a client, and a standalone emulated card.


Hmm, 'git am' refuses to apply these and complains about a corrupted patch:

Applying: libcacard: initial commit after coding style fixes
fatal: corrupt patch at line 5839

Is there a git tree somewhere with the patches to pull from?

cheers,
  Gerd

[Qemu-devel] Re: [Spice-devel] RFC; usb redirection protocol

2010-12-15 Thread Hans de Goede


Hi,

Thanks for taking the time to read all that!

On 12/13/2010 12:21 PM, Gerd Hoffmann wrote:

Basic packet structure / communication
--

Each packet exchanged between the vm-host and the usb-host starts
with a usb_redir_header, followed by an optional command specific
header follow by optional additional data.

The usb_redir_header each packet starts with looks as follows:

struct usb_redir_header { uint32_t command; uint32_t length; }


uint32_t id; ? A reply would then carry the id of the request ...



That sounds like a good idea (all though for iso streams the packets
will be only send in one direction).


Given that everything is done over a potentially slow transport in
practice the diferentiating between synchroneous and asynchroneous
commands may seem odd. The difference is how the usb-host will handle
them once received. For synchroneous commands the usb-host will hand
the request over to the host os and then *wait* for a response. This
means that the vm-host is guaranteed to get an immediate response.
Where as for asynchroneous commands to usb-host hands the request
over to the host os with the request to let the usb-host process know
when the request is done.


Hmm. Looks like you are planning for one tcp stream and one thread (on the 
usb-host side) for each usb device. That will not work very good for 
usb-over-vnc because there is a single tcp stream only. We could of course 
multiplex multiple logical usb connections over vnc, but even then blocking on 
the usb-host side looks bad as this could disrupt other usb devices forwarded 
over the same connection.


The idea is for this protocol to be transport independent (*) so it
could be one stream, or it could be multiplexed into another transport.

Assuring that there are not too large latencies, and handling things
like not blocking for too long when handling multiple devices from the
same thread are left to the implementation.

It could be that the implementation decides to not handle synchroneous
commands synchroneous at all. The main difference I'm trying to make
here is between commands which do things to end points which cannot
be done while other packets are in flight (like setting configuration,
or interface alt setting) and commands (normal packets) of which there
can be multiple in flight. I'm open to using a different term then
synchroneous and async here.

*) Assuming a reliable, ordered transport




usb_redir_report_descriptor ---

usb_redir_header.command: usb_redir_report_desciptor
usb_redir_header.length: sizeof usb device descriptors

No command specific header.

The command specific additional data contains the entire descriptors
for the usb device.

A packet of this type is send by the usb-host directly after the
hello packet it contains the usb descriptor tables for the usb
device.


Device addressing isn't done at all in the protocol, i.e. there is a fixed device 
- connection relation ship?



The idea is there is a fixed device - transport pipe relation ship, yes.
For the simple tcp server usb-host I plan to implement as first usb-host,
the device would be specified on the cmdline while starting the server.

When one wants to use multiple devices, this means starting one usb-server
per device.

For things like vnc and spice I assume there will be some sort of control
channel where usb device management is done. To be more precise I expect
the vnc client / spice client to have some UI which allows selecting a usb
device to plug into the virtual machine. And then the client will setup a
transport for this (reserve a channel within its existing stream) and tell
the server that it has an usb device at sub channel id #, at which point
the server will add a new usb redir device to qemu, passing in a transport
object which is connected to this channel.


Please let me know what you think of this.


Do you know whenever certain low-level usb ops can work with this?



I expect most usb devices to work with this, I don't know about
really weird ones.


Specifically iphone firmware flashing was mentioned on the list.



I think that should work, but that is an interesting case about which
I don't know enough yet.


Also I remember somewhere in the ehci (or xhci?) specs was mentioned

 with some devices it can be needed to talk to them *before* an bus address is 
assigned ...

If some devices need this, and the setup done there cannot be done
by the client machines os, we loose, as device enumeration and address
assignment is done at the OS level, and at least under Linux we won't
get a chance to talk to the device till after it has been assigned an
address.

Regards,

Hans

[Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify

2010-12-15 Thread Michael S. Tsirkin

On Wed, Dec 15, 2010 at 11:42:12AM +, Stefan Hajnoczi wrote:
 On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin m...@redhat.com wrote:
  On Mon, Dec 13, 2010 at 05:57:28PM +, Stefan Hajnoczi wrote:
  On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi stefa...@gmail.com 
  wrote:
   On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin m...@redhat.com 
   wrote:
   On Mon, Dec 13, 2010 at 03:27:06PM +, Stefan Hajnoczi wrote:
   On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin m...@redhat.com 
   wrote:
On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 01:11:27PM +, Stefan Hajnoczi wrote:
 Fresh results:

 192.168.0.1 - host (runs netperf)
 192.168.0.2 - guest (runs netserver)

 host$ src/netperf -H 192.168.0.2 -- -m 200

 ioeventfd=on
 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
 192.168.0.2
 (192.168.0.2) port 0 AF_INET
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec
  87380  16384    200    10.00    1759.25

 ioeventfd=off
 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
 192.168.0.2
 (192.168.0.2) port 0 AF_INET
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384    200    10.00    1757.15

 The results vary approx +/- 3% between runs.

 Invocation:
 $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev
 type=tap,id=net0,ifname=tap0,script=no,downscript=no -device
 virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive
 if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img

 I am running qemu.git with v5 patches, based off
 36888c6335422f07bbc50bf3443a39f24b90c7c6.

 Host:
 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz
 8 GB RAM
 RHEL 6 host

 Next I will try the patches on latest qemu-kvm.git

 Stefan
   
One interesting thing is that I put virtio-net earlier on
command line.
   
Sorry I mean I put it after disk, you put it before.
  
   I can't find a measurable difference when swapping -drive and -netdev.
  
   One other concern I have is that we are apparently using
   ioeventfd for all VQs. E.g. for virtio-net we probably should not
   use it for the control VQ - it's a waste of resources.
  
   One option is a per-device (block, net, etc) bitmap that masks out
   virtqueues.  Is that something you'd like to see?
  
   I'm tempted to mask out the RX vq too and see how that affects the
   qemu-kvm.git specific issue.
 
  As expected, the rx virtqueue is involved in the degradation.  I
  enabled ioeventfd only for the TX virtqueue and got the same good
  results as userspace virtio-net.
 
  When I enable only the rx virtqueue, performs decreases as we've seen 
  above.
 
  Stefan
 
  Interesting. In particular this implies something's wrong with the
  queue: we should not normally be getting notifications from rx queue
  at all. Is it running low on buffers? Does it help to increase the vq
  size?  Any other explanation?
 
 I made a mistake, it is the *tx* vq that causes reduced performance on
 short packets with ioeventfd.  I double-checked the results and the rx
 vq doesn't affect performance.
 
 Initially I thought the fix would be to adjust the tx mitigation
 mechanism since ioeventfd does its own mitigation of sorts.  Multiple
 eventfd signals will be coalesced into one qemu-kvm event handler call
 if qemu-kvm didn't have a chance to handle the first event before the
 eventfd was signalled again.
 
 I added -device virtio-net-pci tx=immediate to flush the TX queue
 immediately instead of scheduling a BH or timer.  Unfortunately this
 had little measurable effect and performance stayed the same.  This
 suggests most of the latency is between the guest's pio write and
 qemu-kvm getting around to handling the event.
 
 You mentioned that vhost-net has the same performance issue on this
 benchmark.  I guess a solution for vhost-net may help virtio-ioeventfd
 and vice versa.
 
 Are you happy with this patchset if I remove virtio-net-pci
 ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with
 default on)?  For block we've found it to be a win and the initial
 results looked good for net too.
 
 Stefan

I'm concerned that the tests were done on qemu.git.
Could you check block with qemu-kvm too please?

[Qemu-devel] Re: [Spice-devel] RFC; usb redirection protocol

2010-12-15 Thread Frédéric Grelot

  Do you know whenever certain low-level usb ops can work with this?
 
 
 I expect most usb devices to work with this, I don't know about
 really weird ones.
 
  Specifically iphone firmware flashing was mentioned on the list.
 
 
 I think that should work, but that is an interesting case about which
 I don't know enough yet.

If that can help you, I may try to capture the usb activities while doing the 
next firmware upgrade.
Unfortunately, it'll be under Windows so I don't know what tool I could use for 
that.
If I can confirm that the upgrade works with Oracle Virtualbox, I could also 
try that, and capture it at guest /and/ host level.

I heard that next update should happen within a few week, so I'll keep you in 
touch about it.

Frederic.

Re: [Qemu-devel] [PATCH 1/5] usb-ccid: add CCID bus

2010-12-15 Thread Gerd Hoffmann


On 12/12/10 19:37, Alon Levy wrote:

A CCID device is a smart card reader. It is a USB device, defined at [1].
This patch introduces the usb-ccid device that is a ccid bus. Next patches will
introduce two card types to use it, a passthru card and an emulated card.


Looks good overall, just some minor nits / questions:


+struct CCIDCardState {
+DeviceState qdev;


Add uint32_t slot here?

Also adding a bus property for it would be good, even though it can't be 
anything but '0' right now, to prepare the interfaces for the day when 
we'll support more than a single card slot and thus can attach multiple 
cards to the ccid bus.



+static VMStateDescription ccid_vmstate = {
+.name = CCID_DEV_NAME,
+.version_id = 1,
+.minimum_version_id = 1,
+.post_load = ccid_post_load,
+.pre_save = ccid_pre_save,
+.fields = (VMStateField []) {
+VMSTATE_STRUCT(dev, USBCCIDState, 1, usb_device_vmstate, USBDevice),


Can we please disable this for now?  USB migration support doesn't 
exist, and when we add it chances are that it isn't compatible with what 
you have here.



+static struct USBDeviceInfo ccid_info = {
+.product_desc   = QEMU USB CCID,
+.qdev.name  = CCID_DEV_NAME,
+.qdev.size  = sizeof(USBCCIDState),
+.qdev.vmsd  =ccid_vmstate,


Best leave the vmstate structs in the code (with a comment) and just zap 
this line which winds it up.


thanks,
  Gerd

Re: [Qemu-devel] [PATCH 2/5] ccid: add passthru card device

2010-12-15 Thread Gerd Hoffmann


  Hi,

Commit message could be a bit more verbose, even if it is redundant with 
what the patch 1/5 message says.  Also short usage information (or 
pointer to the file patch 5/5 adds) would be good.


Patch itself looks fine to me.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 3/5] libcacard: initial commit after coding style fixes

2010-12-15 Thread Gerd Hoffmann


On 12/12/10 19:37, Alon Levy wrote:

From: Robert Relyearrel...@redhat.com

Signed-off-by: Alon Levyal...@redhat.com


Commit message could be more verbose too.  This is the smart card 
emulation library, right?


Patch looks good on a quick glance, I'm not a crypto expert though.

Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 4/5] ccid: add ccid-card-emulated device (v2)

2010-12-15 Thread Gerd Hoffmann


On 12/12/10 19:37, Alon Levy wrote:

changes from v1:
 remove stale comments, use only c-style comments
 bugfix, forgot to set recv_len
 change reader name to 'Virtual Reader'

Signed-off-by: Alon Levyal...@redhat.com


Should be more verbose too.

Please explain what the threads are used for and why they are needed 
(/me guesses that libcacard wants run async).


thanks,
  Gerd

Re: [Qemu-devel] [PATCH 5/5] smartcard: add docs

2010-12-15 Thread Gerd Hoffmann


  Hi,


  docs/libcacard.txt |  483 


Move this to the libcacard patch and add a pointer to it to the commit 
message?


cheers,
  Gerd

[Qemu-devel] Re: [PATCH v5 0/4] virtio: Use ioeventfd for virtqueue notify

2010-12-15 Thread Stefan Hajnoczi

On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Wed, Dec 15, 2010 at 11:42:12AM +, Stefan Hajnoczi wrote:
 On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin m...@redhat.com wrote:
  On Mon, Dec 13, 2010 at 05:57:28PM +, Stefan Hajnoczi wrote:
  On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi stefa...@gmail.com 
  wrote:
   On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin m...@redhat.com 
   wrote:
   On Mon, Dec 13, 2010 at 03:27:06PM +, Stefan Hajnoczi wrote:
   On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin m...@redhat.com 
   wrote:
On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 01:11:27PM +, Stefan Hajnoczi wrote:
 Fresh results:

 192.168.0.1 - host (runs netperf)
 192.168.0.2 - guest (runs netserver)

 host$ src/netperf -H 192.168.0.2 -- -m 200

 ioeventfd=on
 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
 192.168.0.2
 (192.168.0.2) port 0 AF_INET
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec
  87380  16384    200    10.00    1759.25

 ioeventfd=off
 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
 192.168.0.2
 (192.168.0.2) port 0 AF_INET
 Recv   Send    Send
 Socket Socket  Message  Elapsed
 Size   Size    Size     Time     Throughput
 bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384    200    10.00    1757.15

 The results vary approx +/- 3% between runs.

 Invocation:
 $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev
 type=tap,id=net0,ifname=tap0,script=no,downscript=no -device
 virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive
 if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img

 I am running qemu.git with v5 patches, based off
 36888c6335422f07bbc50bf3443a39f24b90c7c6.

 Host:
 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz
 8 GB RAM
 RHEL 6 host

 Next I will try the patches on latest qemu-kvm.git

 Stefan
   
One interesting thing is that I put virtio-net earlier on
command line.
   
Sorry I mean I put it after disk, you put it before.
  
   I can't find a measurable difference when swapping -drive and -netdev.
  
   One other concern I have is that we are apparently using
   ioeventfd for all VQs. E.g. for virtio-net we probably should not
   use it for the control VQ - it's a waste of resources.
  
   One option is a per-device (block, net, etc) bitmap that masks out
   virtqueues.  Is that something you'd like to see?
  
   I'm tempted to mask out the RX vq too and see how that affects the
   qemu-kvm.git specific issue.
 
  As expected, the rx virtqueue is involved in the degradation.  I
  enabled ioeventfd only for the TX virtqueue and got the same good
  results as userspace virtio-net.
 
  When I enable only the rx virtqueue, performs decreases as we've seen 
  above.
 
  Stefan
 
  Interesting. In particular this implies something's wrong with the
  queue: we should not normally be getting notifications from rx queue
  at all. Is it running low on buffers? Does it help to increase the vq
  size?  Any other explanation?

 I made a mistake, it is the *tx* vq that causes reduced performance on
 short packets with ioeventfd.  I double-checked the results and the rx
 vq doesn't affect performance.

 Initially I thought the fix would be to adjust the tx mitigation
 mechanism since ioeventfd does its own mitigation of sorts.  Multiple
 eventfd signals will be coalesced into one qemu-kvm event handler call
 if qemu-kvm didn't have a chance to handle the first event before the
 eventfd was signalled again.

 I added -device virtio-net-pci tx=immediate to flush the TX queue
 immediately instead of scheduling a BH or timer.  Unfortunately this
 had little measurable effect and performance stayed the same.  This
 suggests most of the latency is between the guest's pio write and
 qemu-kvm getting around to handling the event.

 You mentioned that vhost-net has the same performance issue on this
 benchmark.  I guess a solution for vhost-net may help virtio-ioeventfd
 and vice versa.

 Are you happy with this patchset if I remove virtio-net-pci
 ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with
 default on)?  For block we've found it to be a win and the initial
 results looked good for net too.

 Stefan

 I'm concerned that the tests were done on qemu.git.
 Could you check block with qemu-kvm too please?

The following results show qemu-kvm with virtio-ioeventfd v3 for both
aio=native and aio=threads:

http://wiki.qemu.org/Features/VirtioIoeventfd

Stefan

Re: [Qemu-devel] [PATCH 0/5] usb-ccid (v9)

2010-12-15 Thread Alon Levy

On Wed, Dec 15, 2010 at 01:01:56PM +0100, Gerd Hoffmann wrote:
 On 12/12/10 19:37, Alon Levy wrote:
 This patchset adds three new devices, usb-ccid, ccid-card-passthru and
 ccid-card-emulated, providing a CCID bus, a simple passthru protocol
 implementing card requiring a client, and a standalone emulated card.
 
 Hmm, 'git am' refuses to apply these and complains about a corrupted patch:
 
 Applying: libcacard: initial commit after coding style fixes
 fatal: corrupt patch at line 5839
 
 Is there a git tree somewhere with the patches to pull from?
 
git://anongit.freedesktop.org/~alon/qemu usb_ccid.v9

It contains some other patches (usb attach/detach and qtree extra
parameter patches), so cherry-pick from it. I'll make a straight branch
too after I address your comments.

 cheers,
   Gerd

Re: [Qemu-devel] Re: sparc OBP psr value

2010-12-15 Thread Bob Breuer

Forget this.  My test was flawed because I still wasn't comparing apples
to apples.  I was comparing the pre-bootloader state to the
post-bootloader state, and it seems that OBP, even on a real machine,
shows all the registers as zero before it runs any program.

However, I still think there's something wrong with the RT625 version
somewhere that causes it to show as a 605e in the banner, but I don't
have the real thing to compare against.  What I do have for comparison
is a 150MHz hyperSparc which shows up as an RT626.

Bob

Re: [Qemu-devel] [PATCH 26/27] blockdev: Collect block device code in new blockdev.c

2010-12-15 Thread Artyom Tarasenko

On Fri, Jun 4, 2010 at 6:33 PM, Kevin Wolf kw...@redhat.com wrote:
 From: Markus Armbruster arm...@redhat.com

 Anything that moves hundreds of lines out of vl.c can't be all bad.

I know I'm late for this train, but why does this patch change the
license of the former vl.c code from a BSD-like one to GPLv2?
It seems to be contradicting the vl.c paragraph:

 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

Re: [Qemu-devel] Re: [Spice-devel] RFC; usb redirection protocol

2010-12-15 Thread Alon Levy

On Wed, Dec 15, 2010 at 01:15:58PM +0100, Hans de Goede wrote:
 Hi,
 
 Thanks for taking the time to read all that!
 
 On 12/13/2010 12:21 PM, Gerd Hoffmann wrote:
 Basic packet structure / communication
 --
 
 Each packet exchanged between the vm-host and the usb-host starts
 with a usb_redir_header, followed by an optional command specific
 header follow by optional additional data.
 
 The usb_redir_header each packet starts with looks as follows:
 
 struct usb_redir_header { uint32_t command; uint32_t length; }
 
 uint32_t id; ? A reply would then carry the id of the request ...
 
 
 That sounds like a good idea (all though for iso streams the packets
 will be only send in one direction).
 
 Given that everything is done over a potentially slow transport in
 practice the diferentiating between synchroneous and asynchroneous
 commands may seem odd. The difference is how the usb-host will handle
 them once received. For synchroneous commands the usb-host will hand
 the request over to the host os and then *wait* for a response. This
 means that the vm-host is guaranteed to get an immediate response.
 Where as for asynchroneous commands to usb-host hands the request
 over to the host os with the request to let the usb-host process know
 when the request is done.
 
 Hmm. Looks like you are planning for one tcp stream and one thread (on the 
 usb-host side) for each usb device. That will not work very good for 
 usb-over-vnc because there is a single tcp stream only. We could of course 
 multiplex multiple logical usb connections over vnc, but even then blocking 
 on the usb-host side looks bad as this could disrupt other usb devices 
 forwarded over the same connection.
 
 The idea is for this protocol to be transport independent (*) so it
 could be one stream, or it could be multiplexed into another transport.
 
 Assuring that there are not too large latencies, and handling things
 like not blocking for too long when handling multiple devices from the
 same thread are left to the implementation.
 
 It could be that the implementation decides to not handle synchroneous
 commands synchroneous at all. The main difference I'm trying to make
 here is between commands which do things to end points which cannot
 be done while other packets are in flight (like setting configuration,
 or interface alt setting) and commands (normal packets) of which there
 can be multiple in flight. I'm open to using a different term then
 synchroneous and async here.
 
 *) Assuming a reliable, ordered transport
 
 
 usb_redir_report_descriptor ---
 
 usb_redir_header.command: usb_redir_report_desciptor
 usb_redir_header.length: sizeof usb device descriptors
 
 No command specific header.
 
 The command specific additional data contains the entire descriptors
 for the usb device.
 
 A packet of this type is send by the usb-host directly after the
 hello packet it contains the usb descriptor tables for the usb
 device.
 
 Device addressing isn't done at all in the protocol, i.e. there is a fixed 
 device - connection relation ship?
 
 
 The idea is there is a fixed device - transport pipe relation ship, yes.
 For the simple tcp server usb-host I plan to implement as first usb-host,
 the device would be specified on the cmdline while starting the server.
 
 When one wants to use multiple devices, this means starting one usb-server
 per device.
 
 For things like vnc and spice I assume there will be some sort of control
 channel where usb device management is done. To be more precise I expect
 the vnc client / spice client to have some UI which allows selecting a usb
 device to plug into the virtual machine. And then the client will setup a
 transport for this (reserve a channel within its existing stream) and tell
 the server that it has an usb device at sub channel id #, at which point
 the server will add a new usb redir device to qemu, passing in a transport
 object which is connected to this channel.
 
 Please let me know what you think of this.
 
 Do you know whenever certain low-level usb ops can work with this?
 
 
 I expect most usb devices to work with this, I don't know about
 really weird ones.
 
 Specifically iphone firmware flashing was mentioned on the list.
 
 
 I think that should work, but that is an interesting case about which
 I don't know enough yet.
 
 Also I remember somewhere in the ehci (or xhci?) specs was mentioned
  with some devices it can be needed to talk to them *before* an bus address 
  is assigned ...
 
 If some devices need this, and the setup done there cannot be done
 by the client machines os, we loose, as device enumeration and address
 assignment is done at the OS level, and at least under Linux we won't
 get a chance to talk to the device till after it has been assigned an
 address.
 

I know of devices that will enumerate twice, first as one device, then
after a certain setup exchange as another. But that seems to be covered
by the

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Alex Williamson

On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
 On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
  On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:

I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
assigned device, so I'm pretty sure we're not going to hurt migration,
but the code is clearly wrong and I'd like to make sure we don't trip on
a migration failure for a minor device config space change.
   
   Which reminds me: maybe just mark nested bridges as non-migrateable
   for now?  Care writing such a patch?
  
  Hmm, this is trickier than it sounds.
 
 Hmm, since 0 is put in the path instead of the bridge number,
 will the correct bridge be restored?
 
   We're really only broken wrt
  migration if a device under a bridge calls qemu_ram_alloc.
 
 I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?

You're right, it's more broken than that.  Anything that calls
get_dev_path is broken for migration of bridges since the path is
determined before the guest updates bus numbers.  That includes
qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
side.  So perhaps the right answer, for the moment, is to block
migration if there's a p2p bridge.

Alex

   Any device
  is free to do this, but typically it only happens via
  pci_add_option_rom() (not counting vga as typical).  So maybe the better
  approach for now is to prevent the problem by disallowing option ROMs
  for devices below a bridge.  We obviously risk devices coming along that
  allocate RAM on their own, but we could still allow the most common
  issue with almost no lost functionality (assuming no one wants to boot
  off that nested device).  Thoughts?  Thanks,
  
  Alex

Re: [Qemu-devel] [PATCH 26/27] blockdev: Collect block device code in new blockdev.c

2010-12-15 Thread Kevin Wolf

Am 15.12.2010 16:04, schrieb Artyom Tarasenko:
 On Fri, Jun 4, 2010 at 6:33 PM, Kevin Wolf kw...@redhat.com wrote:
 From: Markus Armbruster arm...@redhat.com

 Anything that moves hundreds of lines out of vl.c can't be all bad.
 
 I know I'm late for this train, but why does this patch change the
 license of the former vl.c code from a BSD-like one to GPLv2?
 It seems to be contradicting the vl.c paragraph:
 
  * The above copyright notice and this permission notice shall be included in
  * all copies or substantial portions of the Software.

Right, I think this needs to fixed. Markus, can you send a patch?

Kevin

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-15 Thread Alex Williamson

On Wed, 2010-12-15 at 12:07 +0200, Michael S. Tsirkin wrote:
 On Tue, Dec 14, 2010 at 08:41:55AM -0700, Alex Williamson wrote:
  On Tue, 2010-12-14 at 14:32 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 10:00:48PM -0700, Alex Williamson wrote:
On Tue, 2010-12-14 at 06:43 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 12:15:08PM -0700, Alex Williamson wrote:
  On Mon, 2010-12-13 at 21:06 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 11:59:16AM -0700, Alex Williamson wrote:
On Mon, 2010-12-13 at 20:54 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 11:00:44AM -0700, Alex Williamson 
 wrote:
  On Mon, 2010-12-13 at 19:50 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 10:43:22AM -0700, Alex Williamson 
   wrote:
So, unfortunately, I stand by my original patch.
   
   What about the one that put -1 in saved index for a 
   hotplugged device?
  
  There are still examples that don't work even without 
  hotplug (example 2
  and example 3 after the reboot).  That hack limits the 
  damage, but still
  leaves a latent bug for reboot and doesn't address the 
  non-hotplug
  scenarios.  So, I don't think it's worthwhile to pursue, 
  and we
  shouldn't pretend we can use it to avoid bumping the 
  version_id.
  Thanks,
  
  Alex
 
 I guess when we bump it we tell users: migration is completely
 borken to the old version, don't even try it.
 
 Is there a way for libvirt to discover such incompatibilities
 and avoid the migration?

I don't know if libvirt has a way to query this in advance.  If 
a
migration is attempted, the target will report:

savevm: unsupported version 5 for ':00:03.0/rtl8139' v4

And the source will continue running.  We waste plenty of bits 
getting
to that point,
   
   Yes, this happens after all of memory has been migrated.
  
  Better late than never :^\
 
 One other question: can we do the same by creating a new (empty)
 section? As was discussed in the past this is easier for
 downstreams to cherry-pick.

The only way I can think to do that would be to have a subsection that
is always included, but saves no data.  That would force a failure on
new-old migration, but I don't think it really matches the intended
purpose of subsections and feels like it's adding cruft for no gain.
Maybe I'm missing something.  Juan, is there any advantage to trapping
this in a subsection?  Thanks,

Alex
   
   Maybe in this particular case the advantage is minimal.
   But it seems easier to stick to a rule of no more version
   bumps than argue about each case.
  
  Do we have such a rule?
 
 I think it's easier to have a hard rule than start arguing whether
 each change is a subsection or not for each patch.

We need to have these discussions to make sure we're doing everything we
can do preserve the ABI.  AIUI, subsections are only useful in helping
us with that if there's a worthwhile percentage of the time that the
subsection isn't needed in the migration stream.  Even then it's unclear
to me how high level tools are supposed to deal with this.  Do they
re-try the migration, hoping the subsection was only needed temporarily?
In this case, the subsection would be needed all the time, so I don't
think it's applicable and I don't see a dogmatic rule about using them
to have any value.

 What is the downside?

Some empty structures to do things in a completely convoluted way.
What's the upside?

  If we have a subsection who's needed function
  is return 1, I think that's a good indication that it's not appropriate
  for a subsection and the end result is equivalent to bumping the main
  driver vmstate version.
 
 So if you feel strongly about it, go ahead, I'm just concerned
 we'll now need to argue version or subsection for each migration-related
 change.

Subsections don't magically preserve the migration ABI, so we still need
to have these discussions

  It's convoluted to try to hide a one-way upgrade in a subsection.
 
 It may be convoluted but I think it works better with downstreams.
 When we'll bump the version again in the future, it
 should be possible for a downstream to pick up
 just the next change, and skip this one.
 
 But that's a general comment. In this case, I expect all
 downstream to be able to pick this bugfix with no real downsides.

If a subsection can maintain the migration ABI in a worthwhile
percentage of use cases and high level tools are signed up to understand
whether to retry a migration blocked by a subsection describing an
inflight DMA vs avoiding a basic incompatibility, sure, let's use them.
I don't think

[Qemu-devel] Re: [PULL 00/14] Block patches

2010-12-15 Thread Kevin Wolf

Am 09.12.2010 12:09, schrieb Kevin Wolf:
 The following changes since commit 138b38b61bf92d4e9588acf934e532499c94e185:
 
   ppc: kvm: fix signedness warning (2010-12-08 21:30:19 +0100)
 
 are available in the git repository at:
   git://repo.or.cz/qemu/kevin.git for-anthony
 
 Christian Brunner (1):
   ceph/rbd block driver for qemu-kvm
 
 Jes Sorensen (8):
   Add missing tracing to qemu_mallocz()
   Use qemu_mallocz() instead of calloc() in img_convert()
   img_convert(): Only try to free bs[] entries if bs is valid.
   Consolidate printing of block driver options
   Fix formatting and missing braces in qemu-img.c
   Fail if detecting an unknown option
   Make error handling more consistent in img_create() and img_resize()
   qemu-img: Deprecate obsolete -6 and -e options
 
 Stefan Hajnoczi (5):
   block: Make bdrv_create_file() ':' handling consistent
   qemu-option: Don't reinvent append_option_parameters()
   qemu-option: Fix parse_option_parameters() documentation typo
   qemu-img: Free option parameter lists in img_create()
   qemu-img: Fail creation if backing format is invalid
 
  Makefile.objs |1 +
  block.c   |2 +-
  block/rbd.c   | 1059 
 +
  block/rbd_types.h |   71 
  block_int.h   |1 -
  configure |   52 +++
  qemu-img.c|  247 -
  qemu-malloc.c |5 +-
  qemu-option.c |   13 +-
  9 files changed, 1344 insertions(+), 107 deletions(-)
  create mode 100644 block/rbd.c
  create mode 100644 block/rbd_types.h

Ping?

Kevin

Re: [Qemu-devel] [PATCH v2 1/1] qemu-img.c: Clean up handling of image size in img_create()

2010-12-15 Thread Markus Armbruster

jes.soren...@redhat.com writes:

 From: Jes Sorensen jes.soren...@redhat.com

 This cleans up the handling of image size in img_create() by parsing
 the value early, and then only setting it once if a value has been
 added as the last argument to the command line.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  qemu-img.c |   14 --
  1 files changed, 8 insertions(+), 6 deletions(-)


Patch conflicts with commit c2abccec.

 diff --git a/qemu-img.c b/qemu-img.c
 index d146d8c..9986004 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
 @@ -282,6 +282,7 @@ static int add_old_style_options(const char *fmt, 
 QEMUOptionParameter *list,
  static int img_create(int argc, char **argv)
  {
  int c, ret = 0;
 +uint64_t img_size = -1;
  const char *fmt = raw;
  const char *base_fmt = NULL;
  const char *filename;
 @@ -329,6 +330,11 @@ static int img_create(int argc, char **argv)
  }
  filename = argv[optind++];
  
 +/* Get image size, if specified */
 +if (optind  argc) {
 +img_size = strtosz(argv[optind++], NULL);

strtosz() can fail.  More below.

 +}
 +
  if (options  !strcmp(options, ?)) {
  ret = print_block_option_help(filename, fmt);
  goto out;
 @@ -356,7 +362,8 @@ static int img_create(int argc, char **argv)
  
  /* Create parameter list with default values */
  param = parse_option_parameters(, create_options, param);
 -set_option_parameter_int(param, BLOCK_OPT_SIZE, -1);
 +
 +set_option_parameter_int(param, BLOCK_OPT_SIZE, img_size);
  
  /* Parse -o options */
  if (options) {
 @@ -368,11 +375,6 @@ static int img_create(int argc, char **argv)
  }
  }
  
 -/* Add size to parameters */
 -if (optind  argc) {
 -set_option_parameter(param, BLOCK_OPT_SIZE, argv[optind++]);
 -}
 -
  /* Add old-style options to parameters */
  ret = add_old_style_options(fmt, param, base_filename, base_fmt);
  if (ret  0) {

This switches parsing of the size argument from parse_option_size() (via
set_option_parameter()) to strtosz().  I'm fine with that, but:

* Before:

$ qemu-img create xxx xxx
Parameter 'size' expects a size
You may use k, M, G or T suffixes for kilobytes, megabytes, gigabytes and 
terabytes.
qemu-img: Image creation needs a size parameter

* After:

$ qemu-img create xxx xxx
qemu-img: Image creation needs a size parameter

Intentional?

[Qemu-devel] Re: [PATCH 0/3] Re-factor img_create() and add live snapshots

2010-12-15 Thread Kevin Wolf

Am 13.12.2010 08:32, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 Hi,
 
 This set of patches re-factors img_create() and moves the core part of
 it into block.c so it can be accessed from qemu as well as
 qemu-img. The second patch adds basic live snapshots support to the
 code, however only snapshots to external QCOW2 images is supported for
 now. QED support should be trivial once the QED patches go into
 upstream.
 
 The last patch fixes a small gotcha which is present in the old code
 as well. Try to catch cases where a user tries to create an image with
 itself as the backing file. QEMU does 'interesting' things when you do
 this.
 
 Many thanks to Kevin for his help with block layer internals!
 
 Cheers,
 Jes
 
 
 Jes Sorensen (3):
   qemu-img.c: Re-factor img_create()
   Introduce do_snapshot_blkdev() and monitor command to handle it.
   Prevent creating an image with the same filename as backing file

This doesn't seem to apply cleanly any more. Can you please rebase
(ideally on top of my block branch)?

Kevin

[Qemu-devel] Re: [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-15 Thread Kevin Wolf

Am 13.12.2010 08:32, schrieb jes.soren...@redhat.com:
 From: Jes Sorensen jes.soren...@redhat.com
 
 The monitor command is:
 snapshot_blkdev device [snapshot-file] [format]
 
 Default format is qcow2. For now snapshots without a snapshot-file, eg
 internal snapshots, are not supported.
 
 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  blockdev.c  |   61 
 +++
  blockdev.h  |1 +
  hmp-commands.hx |   19 +
  3 files changed, 81 insertions(+), 0 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index f6ac439..1ea24d7 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -514,6 +514,67 @@ void do_commit(Monitor *mon, const QDict *qdict)
  }
  }
  
 +int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 +const char *device = qdict_get_str(qdict, device);
 +const char *filename = qdict_get_try_str(qdict, snapshot_file);
 +const char *format = qdict_get_try_str(qdict, format);
 +const char format_qcow2[] = qcow2;
 +BlockDriverState *bs;
 +BlockDriver *drv, *proto_drv;
 +int ret = 0;
 +int flags;
 +
 +bs = bdrv_find(device);
 +if (!bs) {
 +qerror_report(QERR_DEVICE_NOT_FOUND, device);
 +ret = -1;
 +goto out;
 +}
 +
 +if (!format) {
 +format = format_qcow2;
 +}
 +
 +drv = bdrv_find_format(format);
 +if (!drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +proto_drv = bdrv_find_protocol(filename);
 +if (!proto_drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +ret = bdrv_img_create(filename, format, bs-filename,
 +  bs-drv-format_name, NULL, -1, bs-open_flags);
 +if (ret) {
 +goto out;
 +}
 +
 +qemu_aio_flush();
 +bdrv_flush(bs);
 +
 +flags = bs-open_flags;
 +bdrv_close(bs);
 +ret = bdrv_open(bs, filename, flags, drv);
 +/*
 + * If reopening the image file we just created fails, we really
 + * are in trouble :(
 + */
 +assert(ret == 0);
 +out:
 +if (ret) {
 +ret = 1;

I seem to remember that errors are always -1 for monitor commands.

Kevin

Re: [Qemu-devel] [PATCH v2 1/1] qemu-img.c: Clean up handling of image size in img_create()

2010-12-15 Thread Jes Sorensen

On 12/15/10 17:47, Markus Armbruster wrote:
 jes.soren...@redhat.com writes:
 
 From: Jes Sorensen jes.soren...@redhat.com

 This cleans up the handling of image size in img_create() by parsing
 the value early, and then only setting it once if a value has been
 added as the last argument to the command line.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  qemu-img.c |   14 --
  1 files changed, 8 insertions(+), 6 deletions(-)

 
 Patch conflicts with commit c2abccec.

ARGH :(

 This switches parsing of the size argument from parse_option_size() (via
 set_option_parameter()) to strtosz().  I'm fine with that, but:
 
 * Before:
 
 $ qemu-img create xxx xxx
 Parameter 'size' expects a size
 You may use k, M, G or T suffixes for kilobytes, megabytes, gigabytes and 
 terabytes.
 qemu-img: Image creation needs a size parameter
 
 * After:
 
 $ qemu-img create xxx xxx
 qemu-img: Image creation needs a size parameter
 
 Intentional?

This was addressed in the later revision when I introduced strtosz_suffix()

Cheers,
Jes

[Qemu-devel] Re: [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-15 Thread Jes Sorensen

On 12/15/10 17:55, Kevin Wolf wrote:
 +int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 +const char *device = qdict_get_str(qdict, device);
 +const char *filename = qdict_get_try_str(qdict, snapshot_file);
 +const char *format = qdict_get_try_str(qdict, format);
 +const char format_qcow2[] = qcow2;
 +BlockDriverState *bs;
 +BlockDriver *drv, *proto_drv;
 +int ret = 0;
 +int flags;
 +
 +bs = bdrv_find(device);
 +if (!bs) {
 +qerror_report(QERR_DEVICE_NOT_FOUND, device);
 +ret = -1;
 +goto out;
 +}
 +
 +if (!format) {
 +format = format_qcow2;
 +}
 +
 +drv = bdrv_find_format(format);
 +if (!drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +proto_drv = bdrv_find_protocol(filename);
 +if (!proto_drv) {
 +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
 +ret = -1;
 +goto out;
 +}
 +
 +ret = bdrv_img_create(filename, format, bs-filename,
 +  bs-drv-format_name, NULL, -1, bs-open_flags);
 +if (ret) {
 +goto out;
 +}
 +
 +qemu_aio_flush();
 +bdrv_flush(bs);
 +
 +flags = bs-open_flags;
 +bdrv_close(bs);
 +ret = bdrv_open(bs, filename, flags, drv);
 +/*
 + * If reopening the image file we just created fails, we really
 + * are in trouble :(
 + */
 +assert(ret == 0);
 +out:
 +if (ret) {
 +ret = 1;
 
 I seem to remember that errors are always -1 for monitor commands.

I mapped it after something else, but admitted I cannot remember where -
can someone clarify?

Cheers,
Jes

[Qemu-devel] Re: [PATCH 0/3] Re-factor img_create() and add live snapshots

2010-12-15 Thread Jes Sorensen

On 12/15/10 17:52, Kevin Wolf wrote:
 Am 13.12.2010 08:32, schrieb jes.soren...@redhat.com:
 Jes Sorensen (3):
   qemu-img.c: Re-factor img_create()
   Introduce do_snapshot_blkdev() and monitor command to handle it.
   Prevent creating an image with the same filename as backing file
 
 This doesn't seem to apply cleanly any more. Can you please rebase
 (ideally on top of my block branch)?
 
 Kevin

Darn! I'll try and rebase it tomorrow.

Thanks,
Jes

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-15 Thread Paolo Bonzini


On 12/15/2010 11:00 AM, Michael S. Tsirkin wrote:

Indeed, subsections are for data that is rarely needed so that
there's some chance (sometimes ~100%) of migration working
seemlessly.


If a subsection arrives that qemu does
not know about, won't migratin fail?


Yes, that's why rarely needed = some high chance of migration working 
(though no certainty).



In this case it's either
no-bump-and-live-with-the-consequences, or changing the version id.


This was discussed to death already.  version ids have the problem
that they don't play nicely with downstreams.


Downstream version bumps don't play nicely with upstream, so downstream 
does have a reason for always-necessary subsections.  But upstream can 
bump the version id as much as they care.


Paolo

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 17:49:27 +0800
Lai Jiangshan la...@cn.fujitsu.com wrote:

 
 Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).
 
 changed from v1
 Add document.
 Add error handling when the cpu index is invalid.
 
 changed from v2
 use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
 
 Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com
 ---
 diff --git a/hmp-commands.hx b/hmp-commands.hx
 index 23024ba..f86d9fe 100644
 --- a/hmp-commands.hx
 +++ b/hmp-commands.hx
 @@ -724,7 +724,8 @@ ETEXI
  .args_type  = cpu_index:i,
  .params = cpu,
  .help   = inject an NMI on the given CPU,
 -.mhandler.cmd = do_inject_nmi,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_inject_nmi,
  },
  #endif
  STEXI
 diff --git a/monitor.c b/monitor.c
 index ec31eac..3e33a96 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const QDict 
 *qdict)
  #endif
  
  #if defined(TARGET_I386)
 -static void do_inject_nmi(Monitor *mon, const QDict *qdict)
 +static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject 
 **ret_data)
  {
  CPUState *env;
  int cpu_index = qdict_get_int(qdict, cpu_index);
 @@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const QDict 
 *qdict)
  for (env = first_cpu; env != NULL; env = env-next_cpu)
  if (env-cpu_index == cpu_index) {
  cpu_interrupt(env, CPU_INTERRUPT_NMI);
 -break;
 +return 0;
  }
 +
 +qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
 +  a CPU number);
 +return -1;
  }
  #endif
  
 diff --git a/qmp-commands.hx b/qmp-commands.hx
 index e5f157f..fcb6bf2 100644
 --- a/qmp-commands.hx
 +++ b/qmp-commands.hx
 @@ -429,6 +429,33 @@ Example:
  
  EQMP
  
 +#if defined(TARGET_I386)
 +{
 +.name   = inject_nmi,
 +.args_type  = cpu_index:i,
 +.params = cpu,
 +.help   = inject an NMI on the given CPU,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_inject_nmi,
 +},
 +#endif
 +SQMP
 +inject_nmi
 +--
 +
 +Inject an NMI on the given CPU (x86 only).
 +
 +Arguments:
 +
 +- cpu_index: the index of the CPU to be injected NMI (json-int)

Please, use cpu-index, that's what we're using for the human-monitor-command.

Avi, Anthony, can you please ACK this new command?

 +
 +Example:
 +
 +- { execute: inject_nmi, arguments: { cpu_index: 0 } }
 +- { return: {} }
 +
 +EQMP
 +
  {
  .name   = migrate,
  .args_type  = detach:-d,blk:-b,inc:-i,uri:s,

Re: [Qemu-devel] [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 11:49:23 +0100
Markus Armbruster arm...@redhat.com wrote:

 Lai Jiangshan la...@cn.fujitsu.com writes:
 
  Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).
 
  changed from v1
  Add document.
  Add error handling when the cpu index is invalid.
 
  changed from v2
  use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
 
  Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com
 
 A note on commit messages:
 
 The commit message should describe the current version of the patch.
 Don't repeat the subject line in the body.
 
 Patch history is very useful for review, but usually uninteresting once
 the patch is committed.  Thus, it's best to put it after the ---
 separator.
 
 Subsystem tags in the subject line are helpful.  But qemu doesn't
 provide any information there :)
 
 
 Regarding the patch:
 
 The conversion looks good.
 
 The new QMP command is called inject_nmi, while the existing human
 monitor command is called nmi.  Luiz asked for this name change.  I
 don't mind.  But should we rename the human monitor command for
 consistency?

I don't think so, we don't need (and maybe don't even want) naming parity
between QMP and HMP. Remember that one of our mistakes was to couple the two.

Also, Avi asked for more descriptive names in QMP and I agree with him, I
would even be in favor of calling it inject-non-maskable-interrupt.

 
 Regardless, the differing command name is worth mentioning in the commit
 message.

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Avi Kivity


On 12/15/2010 07:09 PM, Luiz Capitulino wrote:

On Wed, 15 Dec 2010 17:49:27 +0800
Lai Jiangshanla...@cn.fujitsu.com  wrote:


  Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).

  changed from v1
  Add document.
  Add error handling when the cpu index is invalid.

  changed from v2
  use QERR_INVALID_PARAMETER_VALUE as Markus suggest.

  Signed-off-by:  Lai Jiangshanla...@cn.fujitsu.com
  ---
  diff --git a/hmp-commands.hx b/hmp-commands.hx
  index 23024ba..f86d9fe 100644
  --- a/hmp-commands.hx
  +++ b/hmp-commands.hx
  @@ -724,7 +724,8 @@ ETEXI
   .args_type  = cpu_index:i,
   .params = cpu,
   .help   = inject an NMI on the given CPU,
  -.mhandler.cmd = do_inject_nmi,
  +.user_print = monitor_user_noop,
  +.mhandler.cmd_new = do_inject_nmi,
   },
   #endif
   STEXI
  diff --git a/monitor.c b/monitor.c
  index ec31eac..3e33a96 100644
  --- a/monitor.c
  +++ b/monitor.c
  @@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const QDict 
*qdict)
   #endif

   #if defined(TARGET_I386)
  -static void do_inject_nmi(Monitor *mon, const QDict *qdict)
  +static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject 
**ret_data)
   {
   CPUState *env;
   int cpu_index = qdict_get_int(qdict, cpu_index);
  @@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const QDict 
*qdict)
   for (env = first_cpu; env != NULL; env = env-next_cpu)
   if (env-cpu_index == cpu_index) {
   cpu_interrupt(env, CPU_INTERRUPT_NMI);
  -break;
  +return 0;
   }
  +
  +qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
  +  a CPU number);
  +return -1;
   }
   #endif

  diff --git a/qmp-commands.hx b/qmp-commands.hx
  index e5f157f..fcb6bf2 100644
  --- a/qmp-commands.hx
  +++ b/qmp-commands.hx
  @@ -429,6 +429,33 @@ Example:

   EQMP

  +#if defined(TARGET_I386)
  +{
  +.name   = inject_nmi,
  +.args_type  = cpu_index:i,
  +.params = cpu,
  +.help   = inject an NMI on the given CPU,
  +.user_print = monitor_user_noop,
  +.mhandler.cmd_new = do_inject_nmi,
  +},
  +#endif
  +SQMP
  +inject_nmi
  +--
  +
  +Inject an NMI on the given CPU (x86 only).
  +
  +Arguments:
  +
  +- cpu_index: the index of the CPU to be injected NMI (json-int)

Please, use cpu-index, that's what we're using for the human-monitor-command.

Avi, Anthony, can you please ACK this new command?



I'd like to see cpu-index made optional; if not present, nmi all cpus 
(that's what the nmi button on many machines does, or at least I think 
that's what it does).


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v2 1/1] qemu-img.c: Clean up handling of image size in img_create()

2010-12-15 Thread Markus Armbruster

Jes Sorensen jes.soren...@redhat.com writes:

 On 12/15/10 17:47, Markus Armbruster wrote:
 jes.soren...@redhat.com writes:
 
 From: Jes Sorensen jes.soren...@redhat.com

 This cleans up the handling of image size in img_create() by parsing
 the value early, and then only setting it once if a value has been
 added as the last argument to the command line.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  qemu-img.c |   14 --
  1 files changed, 8 insertions(+), 6 deletions(-)

 
 Patch conflicts with commit c2abccec.

 ARGH :(

 This switches parsing of the size argument from parse_option_size() (via
 set_option_parameter()) to strtosz().  I'm fine with that, but:
 
 * Before:
 
 $ qemu-img create xxx xxx
 Parameter 'size' expects a size
 You may use k, M, G or T suffixes for kilobytes, megabytes, gigabytes 
 and terabytes.
 qemu-img: Image creation needs a size parameter
 
 * After:
 
 $ qemu-img create xxx xxx
 qemu-img: Image creation needs a size parameter
 
 Intentional?

 This was addressed in the later revision when I introduced strtosz_suffix()

Looks like I'm getting lost in my post-vacation mail backlog %-}

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 19:18:32 +0200
Avi Kivity a...@redhat.com wrote:

 On 12/15/2010 07:09 PM, Luiz Capitulino wrote:
  On Wed, 15 Dec 2010 17:49:27 +0800
  Lai Jiangshanla...@cn.fujitsu.com  wrote:
 
  
Convert do_inject_nmi() to QObject, QError, we need to use it(via 
   libvirt).
  
changed from v1
Add document.
Add error handling when the cpu index is invalid.
  
changed from v2
use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
  
Signed-off-by:  Lai Jiangshanla...@cn.fujitsu.com
---
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 23024ba..f86d9fe 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -724,7 +724,8 @@ ETEXI
 .args_type  = cpu_index:i,
 .params = cpu,
 .help   = inject an NMI on the given CPU,
-.mhandler.cmd = do_inject_nmi,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
 },
 #endif
 STEXI
diff --git a/monitor.c b/monitor.c
index ec31eac..3e33a96 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const 
   QDict *qdict)
 #endif
  
 #if defined(TARGET_I386)
-static void do_inject_nmi(Monitor *mon, const QDict *qdict)
+static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject 
   **ret_data)
 {
 CPUState *env;
 int cpu_index = qdict_get_int(qdict, cpu_index);
@@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const 
   QDict *qdict)
 for (env = first_cpu; env != NULL; env = env-next_cpu)
 if (env-cpu_index == cpu_index) {
 cpu_interrupt(env, CPU_INTERRUPT_NMI);
-break;
+return 0;
 }
+
+qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
+  a CPU number);
+return -1;
 }
 #endif
  
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e5f157f..fcb6bf2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -429,6 +429,33 @@ Example:
  
 EQMP
  
+#if defined(TARGET_I386)
+{
+.name   = inject_nmi,
+.args_type  = cpu_index:i,
+.params = cpu,
+.help   = inject an NMI on the given CPU,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
+},
+#endif
+SQMP
+inject_nmi
+--
+
+Inject an NMI on the given CPU (x86 only).
+
+Arguments:
+
+- cpu_index: the index of the CPU to be injected NMI (json-int)
 
  Please, use cpu-index, that's what we're using for the 
  human-monitor-command.
 
  Avi, Anthony, can you please ACK this new command?
 
 
 I'd like to see cpu-index made optional; if not present, nmi all cpus 
 (that's what the nmi button on many machines does, or at least I think 
 that's what it does).

Looks like a GUI feature to me, _might_ turn out to be an undesirable
side effect to client writers. I guess I prefer a to-all-cpus argument.

[Qemu-devel] Re: [PATCH 2/3] Introduce do_snapshot_blkdev() and monitor command to handle it.

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 17:57:25 +0100
Jes Sorensen jes.soren...@redhat.com wrote:

 On 12/15/10 17:55, Kevin Wolf wrote:
  +int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject 
  **ret_data)
  +{
  +const char *device = qdict_get_str(qdict, device);
  +const char *filename = qdict_get_try_str(qdict, snapshot_file);
  +const char *format = qdict_get_try_str(qdict, format);
  +const char format_qcow2[] = qcow2;
  +BlockDriverState *bs;
  +BlockDriver *drv, *proto_drv;
  +int ret = 0;
  +int flags;
  +
  +bs = bdrv_find(device);
  +if (!bs) {
  +qerror_report(QERR_DEVICE_NOT_FOUND, device);
  +ret = -1;
  +goto out;
  +}
  +
  +if (!format) {
  +format = format_qcow2;
  +}
  +
  +drv = bdrv_find_format(format);
  +if (!drv) {
  +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
  +ret = -1;
  +goto out;
  +}
  +
  +proto_drv = bdrv_find_protocol(filename);
  +if (!proto_drv) {
  +qerror_report(QERR_INVALID_BLOCK_FORMAT, format);
  +ret = -1;
  +goto out;
  +}
  +
  +ret = bdrv_img_create(filename, format, bs-filename,
  +  bs-drv-format_name, NULL, -1, bs-open_flags);
  +if (ret) {
  +goto out;
  +}
  +
  +qemu_aio_flush();
  +bdrv_flush(bs);
  +
  +flags = bs-open_flags;
  +bdrv_close(bs);
  +ret = bdrv_open(bs, filename, flags, drv);
  +/*
  + * If reopening the image file we just created fails, we really
  + * are in trouble :(
  + */
  +assert(ret == 0);
  +out:
  +if (ret) {
  +ret = 1;
  
  I seem to remember that errors are always -1 for monitor commands.
 
 I mapped it after something else, but admitted I cannot remember where -
 can someone clarify?

-1

 
 Cheers,
 Jes

Re: [Qemu-devel] [PATCH v4 1/2] Minimal RAM API support

2010-12-15 Thread Paul Brook

 This adds a minimum chunk of Anthony's RAM API support so that we
 can identify actual VM RAM versus all the other things that make
 use of qemu_ram_alloc.

Why do we care? How are you defining actual VM RAM?

Surely the whole point of qemu_ram_alloc is to allocate a chunk of memory that 
can be mapped into the guest physical address space, so all uses of 
qemu_ram_alloc should be using this API.

Paul

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Wed, 15 Dec 2010 19:18:32 +0200
 Avi Kivity a...@redhat.com wrote:

 On 12/15/2010 07:09 PM, Luiz Capitulino wrote:
  On Wed, 15 Dec 2010 17:49:27 +0800
  Lai Jiangshanla...@cn.fujitsu.com  wrote:
 
  
Convert do_inject_nmi() to QObject, QError, we need to use it(via 
   libvirt).
  
changed from v1
Add document.
Add error handling when the cpu index is invalid.
  
changed from v2
use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
  
Signed-off-by:  Lai Jiangshanla...@cn.fujitsu.com
---
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 23024ba..f86d9fe 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -724,7 +724,8 @@ ETEXI
 .args_type  = cpu_index:i,
 .params = cpu,
 .help   = inject an NMI on the given CPU,
-.mhandler.cmd = do_inject_nmi,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
 },
 #endif
 STEXI
diff --git a/monitor.c b/monitor.c
index ec31eac..3e33a96 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const 
   QDict *qdict)
 #endif
  
 #if defined(TARGET_I386)
-static void do_inject_nmi(Monitor *mon, const QDict *qdict)
+static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject 
   **ret_data)
 {
 CPUState *env;
 int cpu_index = qdict_get_int(qdict, cpu_index);
@@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const 
   QDict *qdict)
 for (env = first_cpu; env != NULL; env = env-next_cpu)
 if (env-cpu_index == cpu_index) {
 cpu_interrupt(env, CPU_INTERRUPT_NMI);
-break;
+return 0;
 }
+
+qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
+  a CPU number);
+return -1;
 }
 #endif
  
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e5f157f..fcb6bf2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -429,6 +429,33 @@ Example:
  
 EQMP
  
+#if defined(TARGET_I386)
+{
+.name   = inject_nmi,
+.args_type  = cpu_index:i,
+.params = cpu,
+.help   = inject an NMI on the given CPU,
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_inject_nmi,
+},
+#endif
+SQMP
+inject_nmi
+--
+
+Inject an NMI on the given CPU (x86 only).
+
+Arguments:
+
+- cpu_index: the index of the CPU to be injected NMI (json-int)
 
  Please, use cpu-index, that's what we're using for the 
  human-monitor-command.
 
  Avi, Anthony, can you please ACK this new command?
 
 
 I'd like to see cpu-index made optional; if not present, nmi all cpus 
 (that's what the nmi button on many machines does, or at least I think 
 that's what it does).

 Looks like a GUI feature to me,

Really?  Can't see how you can build NMI to all CPUs from NMI this
CPU.  Or am I misunderstanding you?

 _might_ turn out to be an undesirable
 side effect to client writers.

They seem to be coping fine with optional arguments elsewhere.

I guess I prefer a to-all-cpus argument.

How would that look like?  cpu-index: all?

I find optional json-int a simpler schema than either a json-int or
the json-string all.

Re: [Qemu-devel] [PATCH 5/6] [RFC] Emulation of Leon3.

2010-12-15 Thread Fabien Chouteau


On 12/13/2010 07:18 PM, Blue Swirl wrote:

On Mon, Dec 13, 2010 at 3:51 PM, Fabien Chouteauchout...@adacore.com  wrote:

On 12/11/2010 10:56 AM, Blue Swirl wrote:


On Tue, Dec 7, 2010 at 11:40 AM, Fabien Chouteauchout...@adacore.com
  wrote:


On 12/06/2010 06:53 PM, Blue Swirl wrote:


On Mon, Dec 6, 2010 at 9:26 AM, Fabien Chouteauchout...@adacore.com
  wrote:


Signed-off-by: Fabien Chouteauchout...@adacore.com
---
  Makefile.target  |5 +-
  hw/leon3.c   |  310
++
  target-sparc/cpu.h   |   10 ++
  target-sparc/helper.c|2 +-
  target-sparc/op_helper.c |   30 -
  5 files changed, 353 insertions(+), 4 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 2800f47..f40e04f 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -290,7 +290,10 @@ obj-sparc-y += cirrus_vga.o
  else
  obj-sparc-y = sun4m.o lance.o tcx.o sun4m_iommu.o slavio_intctl.o
  obj-sparc-y += slavio_timer.o slavio_misc.o sparc32_dma.o
-obj-sparc-y += cs4231.o eccmemctl.o sbi.o sun4c_intctl.o
+obj-sparc-y += cs4231.o eccmemctl.o sbi.o sun4c_intctl.o leon3.o
+
+# GRLIB
+obj-sparc-y += grlib_gptimer.o grlib_irqmp.o grlib_apbuart.o
  endif

  obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o
diff --git a/hw/leon3.c b/hw/leon3.c
new file mode 100644
index 000..ba61081
--- /dev/null
+++ b/hw/leon3.c
@@ -0,0 +1,310 @@
+/*
+ * QEMU Leon3 System Emulator
+ *
+ * Copyright (c) 2010 AdaCore
+ *
+ * Permission is hereby granted, free of charge, to any person
obtaining
a copy
+ * of this software and associated documentation files (the
Software),
to deal
+ * in the Software without restriction, including without limitation
the
rights
+ * to use, copy, modify, merge, publish, distribute, sublicense,
and/or
sell
+ * copies of the Software, and to permit persons to whom the Software
is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be
included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
OR
OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include hw.h
+#include qemu-timer.h
+#include qemu-char.h
+#include sysemu.h
+#include boards.h
+#include loader.h
+#include elf.h
+
+#include grlib.h
+
+/* #define DEBUG_LEON3 */
+
+#ifdef DEBUG_LEON3
+#define DPRINTF(fmt, ...)   \
+do { printf(Leon3:  fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...)
+#endif
+
+/* Default system clock.  */
+#define CPU_CLK (40 * 1000 * 1000)
+
+#define PROM_FILENAMEu-boot.bin
+
+#define MAX_PILS 16
+
+typedef struct Leon3State
+{
+uint32_t cache_control;
+uint32_t inst_cache_conf;
+uint32_t data_cache_conf;
+
+uint64_t entry; /* save kernel entry in case of reset
*/
+} Leon3State;
+
+Leon3State leon3_state;


Again global state, please refactor. Perhaps most of the cache
handling code belong to target-sparc/op_helper.c and this structure to
CPUSPARCState.


I will try to find a solution for that.
Is it OK to add some Leon3 specific stuff in the CPUSPARCState?


Yes, no problem. You can also drop the intermediate Leon3State
structure if there is no benefit.


+
+/* Cache control: emulate the behavior of cache control registers but
without
+   any effect on the emulated CPU */
+
+#define CACHE_DISABLED 0x0
+#define CACHE_FROZEN   0x1
+#define CACHE_ENABLED  0x3
+
+/* Cache Control register fields */
+
+#define CACHE_CTRL_IF (14)  /* Instruction Cache Freeze on
Interrupt */
+#define CACHE_CTRL_DF (15)  /* Data Cache Freeze on Interrupt
*/
+#define CACHE_CTRL_DP (1  14)  /* Data cache flush pending */
+#define CACHE_CTRL_IP (1  15)  /* Instruction cache flush pending
*/
+#define CACHE_CTRL_IB (1  16)  /* Instruction burst fetch */
+#define CACHE_CTRL_FI (1  21)  /* Flush Instruction cache (Write
only)
*/
+#define CACHE_CTRL_FD (1  22)  /* Flush Data cache (Write only) */
+#define CACHE_CTRL_DS (1  23)  /* Data cache snoop enable */
+
+void leon3_cache_control_int(void)
+{
+uint32_t state = 0;
+
+if (leon3_state.cache_control  CACHE_CTRL_IF) {
+/* Instruction cache state */
+state = leon3_state.cache_control  0x3;


Please add a new define CACHE_CTRL_xxx to replace 0x3.



Done.


+if (state == CACHE_ENABLED) {
+state = CACHE_FROZEN;
+DPRINTF(Instruction cache: freeze\n);
+}
+
+leon3_state.cache_control= ~0x3;
+

Re: [Qemu-devel] [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Wed, 15 Dec 2010 11:49:23 +0100
 Markus Armbruster arm...@redhat.com wrote:

 Lai Jiangshan la...@cn.fujitsu.com writes:
 
  Convert do_inject_nmi() to QObject, QError, we need to use it(via libvirt).
 
  changed from v1
  Add document.
  Add error handling when the cpu index is invalid.
 
  changed from v2
  use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
 
  Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com
 
 A note on commit messages:
 
 The commit message should describe the current version of the patch.
 Don't repeat the subject line in the body.
 
 Patch history is very useful for review, but usually uninteresting once
 the patch is committed.  Thus, it's best to put it after the ---
 separator.
 
 Subsystem tags in the subject line are helpful.  But qemu doesn't
 provide any information there :)
 
 
 Regarding the patch:
 
 The conversion looks good.
 
 The new QMP command is called inject_nmi, while the existing human
 monitor command is called nmi.  Luiz asked for this name change.  I
 don't mind.  But should we rename the human monitor command for
 consistency?

 I don't think so, we don't need (and maybe don't even want) naming parity
 between QMP and HMP. Remember that one of our mistakes was to couple the two.

Human nmi and QMP inject_nmi are identical commands, aren't they?
Giving the same things the same name isn't coupling :)

The mistake that matters here was adopting existing human commands for
QMP uncritically.

 Also, Avi asked for more descriptive names in QMP and I agree with him, I
 would even be in favor of calling it inject-non-maskable-interrupt.

I like inject_nmi better than nmi.  inject-non-maskable-interrupt is too
long even for QMP.

 Regardless, the differing command name is worth mentioning in the commit
 message.

[Qemu-devel] Re: [PATCH] qemu-kvm: Switch to upstream -enable-kvm semantics

2010-12-15 Thread Anthony Liguori


On 12/15/2010 09:50 AM, Markus Armbruster wrote:

We currently enable KVM by default, and when it's not available, we
print a message and fall back to TCG.  Option -enable-kvm is ignored.
Option -no-kvm suppresses KVM.

Upstream works differently: KVM is off by default, -enable-kvm
switches it on.  -enable-kvm terminates the process unsuccessfully if
KVM is not available.

upstream qemu   |  default  |-enable-kvm
+---+---
KVM available   | disabled  |  enabled
KVM unavailable | disabled  |fail

qemu-kvm|  default  |-enable-kvm
+---+---
KVM available   |  enabled* |  enabled
KVM unavailable | disabled  | disabled*

* differs from upstream

Users of qemu and qemu-kvm need to be aware of these differences to
enable / disable use of KVM reliably.  This is bothersome.

Consider -enable-kvm when KVM is unavailable: If the user expects
qemu-kvm behavior (fall back), but qemu fails, he'll likely be
surprised and unhappy.  If the user expects upstream behavior (fail),
but qemu-kvm falls back to TCG, the guest runs slow as molasses, and
the user will likely be confused and unhappy (unless he spots and
understands the disable KVM message).

Switch to upstream semantics: KVM off by default, -enable-kvm switches
it on, and when it can't, it's fatal.

Having to enable KVM explicitly is annoying, but the proper place to
address that is upstream.

Signed-off-by: Markus Armbrusterarm...@redhat.com
   


Backwards compatibility is going to kill us if we try to make this change.

Current qemu-kvm behavior:

default: -accel kvm,tcg
-no-kvm: -accel tcg
-enable-kvm: -accel kvm,tcg

Current upstream behavior

default: -accel tcg
-enable-kvm: -accel kvm

I think we should tie `-accel' to the machine type.  For qemu-kvm, a 
different default machine type should be used than upstream qemu (it 
really should be a configure switch).


For `pc', the default `-accel' behavior should remain 'tcg'.  For 
`kvmpc', the default `-accel' behavior should be 'kvm,tcg'.


-no-kvm should be deprecated.  -enable-kvm should also be deprecated in 
favor of the `-accel' option.


In the short term, it would be a good idea to modify qemu-kvm to switch 
the -enable-kvm semantics to match upstream (fail if KVM isn't 
available).  Adding an alias for 'kvmpc' upstream and qemu-kvm and 
making qemu-kvm default to 'kvmpc' would be helpful for management tools 
too.


Regards,

Anthony Liguori


---
  vl.c |   10 +-
  1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/vl.c b/vl.c
index e3c8919..87e88c2 100644
--- a/vl.c
+++ b/vl.c
@@ -247,7 +247,7 @@ static void *boot_set_opaque;
  static NotifierList exit_notifiers =
  NOTIFIER_LIST_INITIALIZER(exit_notifiers);

-int kvm_allowed = 1;
+int kvm_allowed = 0;
  uint32_t xen_domid;
  enum xen_mode xen_mode = XEN_EMULATE;

@@ -2436,10 +2436,8 @@ int main(int argc, char **argv, char **envp)
  case QEMU_OPTION_smbios:
  do_smbios_option(optarg);
  break;
-#ifdef OBSOLETE_KVM_IMPL
  case QEMU_OPTION_enable_kvm:
  kvm_allowed = 1;
-#endif
  break;
case QEMU_OPTION_no_kvm:
kvm_allowed = 0;
@@ -2789,18 +2787,12 @@ int main(int argc, char **argv, char **envp)
  if (kvm_allowed) {
  int ret = kvm_init(smp_cpus);
  if (ret  0) {
-#if defined(OBSOLETE_KVM_IMPL) || defined(CONFIG_NO_CPU_EMULATION)
  if (!kvm_available()) {
  printf(KVM not supported for this target\n);
  } else {
  fprintf(stderr, failed to initialize KVM: %s\n, 
strerror(-ret));
  }
  exit(1);
-#endif
-#ifdef CONFIG_KVM
-fprintf(stderr, Could not initialize KVM, will disable KVM 
support\n);
-kvm_allowed = 0;
-#endif
  }
  }

Re: [Qemu-devel] [Qestion] What status of memory stats feature

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 16:20:05 +0900
Ken'ichi Ohmichi oomi...@mxs.nes.nec.co.jp wrote:

 
 Hi,
 
 I tried to get the memory stats by using virDomainMemoryStats() of libvirt,
 but it could not do it because of the following patch:
 
 [PATCH 03/23] disable guest-provided stats on info balloon command
   2010/10/01 from Luiz Capitulino
   http://www.mail-archive.com/qemu-devel@nongnu.org/msg43024.html 
 
 According to the related thread, the above patch avoids hanging user
 monitor, and people hope the memory stats feature will be able in the
 future. So I'd like to know the status of this feature.
 
 Is there the patch for enabling the feature ?
 If the patch exists, I'd like to try it.

It doesn't, afaik.

 What is the essential problem ?

There are two essential problems here.

The first one is that QMP lacks true asynchronous command support (it does
have some code for it, but it's incomplete).

The second problem is that we must not make an existing synchronous command
asynchronous, because this breaks the ABI.

 Does some infinite loop happen ?

No, the guest doesn't respond.

 Unfortunately, I cannot understand the cause of hanging user monitor.

The balloon command needs guest cooperation (ie. it talks to the guest),
if the guest doesn't respond the command will never return.

We don't have a precise ETA for async support, but if it depends only on
the new error framework work, then we could have it for 0.15. If it depends
on the full monitor redesign work, then I guess we'll two releases.

Re: [Qemu-devel] [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 18:39:07 +0100
Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Wed, 15 Dec 2010 11:49:23 +0100
  Markus Armbruster arm...@redhat.com wrote:
 
  Lai Jiangshan la...@cn.fujitsu.com writes:
  
   Convert do_inject_nmi() to QObject, QError, we need to use it(via 
   libvirt).
  
   changed from v1
   Add document.
   Add error handling when the cpu index is invalid.
  
   changed from v2
   use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
  
   Signed-off-by:  Lai Jiangshan la...@cn.fujitsu.com
  
  A note on commit messages:
  
  The commit message should describe the current version of the patch.
  Don't repeat the subject line in the body.
  
  Patch history is very useful for review, but usually uninteresting once
  the patch is committed.  Thus, it's best to put it after the ---
  separator.
  
  Subsystem tags in the subject line are helpful.  But qemu doesn't
  provide any information there :)
  
  
  Regarding the patch:
  
  The conversion looks good.
  
  The new QMP command is called inject_nmi, while the existing human
  monitor command is called nmi.  Luiz asked for this name change.  I
  don't mind.  But should we rename the human monitor command for
  consistency?
 
  I don't think so, we don't need (and maybe don't even want) naming parity
  between QMP and HMP. Remember that one of our mistakes was to couple the 
  two.
 
 Human nmi and QMP inject_nmi are identical commands, aren't they?

At this point in time yes, but they might not be in the near future. Assuming
they might be different is the safest thing to do.

That's true for all existing commands.

 Giving the same things the same name isn't coupling :)

Expecting them to be the same in the future is.

 The mistake that matters here was adopting existing human commands for
 QMP uncritically.

That's the protocol visible mistake, yes.

  Also, Avi asked for more descriptive names in QMP and I agree with him, I
  would even be in favor of calling it inject-non-maskable-interrupt.
 
 I like inject_nmi better than nmi.  inject-non-maskable-interrupt is too
 long even for QMP.

It's not supposed to be typed that much, but I'm not that strong about that.

nitpick: I think we should be consistent in the use of _ or -, eg. we
 should pick inject-nmi or inject_nmi?

 
  Regardless, the differing command name is worth mentioning in the commit
  message.

[Qemu-devel] Re: [PATCH] qemu-kvm: Switch to upstream -enable-kvm semantics

2010-12-15 Thread Markus Armbruster

Anthony Liguori anth...@codemonkey.ws writes:

 On 12/15/2010 09:50 AM, Markus Armbruster wrote:
 We currently enable KVM by default, and when it's not available, we
 print a message and fall back to TCG.  Option -enable-kvm is ignored.
 Option -no-kvm suppresses KVM.

 Upstream works differently: KVM is off by default, -enable-kvm
 switches it on.  -enable-kvm terminates the process unsuccessfully if
 KVM is not available.

 upstream qemu   |  default  |-enable-kvm
 +---+---
 KVM available   | disabled  |  enabled
 KVM unavailable | disabled  |fail

 qemu-kvm|  default  |-enable-kvm
 +---+---
 KVM available   |  enabled* |  enabled
 KVM unavailable | disabled  | disabled*

 * differs from upstream

 Users of qemu and qemu-kvm need to be aware of these differences to
 enable / disable use of KVM reliably.  This is bothersome.

 Consider -enable-kvm when KVM is unavailable: If the user expects
 qemu-kvm behavior (fall back), but qemu fails, he'll likely be
 surprised and unhappy.  If the user expects upstream behavior (fail),
 but qemu-kvm falls back to TCG, the guest runs slow as molasses, and
 the user will likely be confused and unhappy (unless he spots and
 understands the disable KVM message).

 Switch to upstream semantics: KVM off by default, -enable-kvm switches
 it on, and when it can't, it's fatal.

 Having to enable KVM explicitly is annoying, but the proper place to
 address that is upstream.

 Signed-off-by: Markus Armbrusterarm...@redhat.com


 Backwards compatibility is going to kill us if we try to make this change.

 Current qemu-kvm behavior:

 default: -accel kvm,tcg
 -no-kvm: -accel tcg
 -enable-kvm: -accel kvm,tcg

 Current upstream behavior

 default: -accel tcg
 -enable-kvm: -accel kvm

 I think we should tie `-accel' to the machine type.  For qemu-kvm, a
 different default machine type should be used than upstream qemu (it
 really should be a configure switch).

 For `pc', the default `-accel' behavior should remain 'tcg'.  For
 kvmpc', the default `-accel' behavior should be 'kvm,tcg'.

 -no-kvm should be deprecated.  -enable-kvm should also be deprecated
 in favor of the `-accel' option.

I'm fine with -accel and deprecating the old options.  But until we have
that:

 In the short term, it would be a good idea to modify qemu-kvm to
 switch the -enable-kvm semantics to match upstream (fail if KVM isn't
 available).

That's what my patch does.

Additionally, it changes the default to match upstream: KVM disabled.

What do you want changed in my patch?

  Adding an alias for 'kvmpc' upstream and qemu-kvm and
 making qemu-kvm default to 'kvmpc' would be helpful for management
 tools too.

That would address Having to enable KVM explicitly is annoying.

[Qemu-devel] Re: [PATCH v3] qemu, qmp: convert do_inject_nmi() to QObject, QError

2010-12-15 Thread Luiz Capitulino

On Wed, 15 Dec 2010 18:45:09 +0100
Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Wed, 15 Dec 2010 19:18:32 +0200
  Avi Kivity a...@redhat.com wrote:
 
  On 12/15/2010 07:09 PM, Luiz Capitulino wrote:
   On Wed, 15 Dec 2010 17:49:27 +0800
   Lai Jiangshanla...@cn.fujitsu.com  wrote:
  
   
 Convert do_inject_nmi() to QObject, QError, we need to use it(via 
libvirt).
   
 changed from v1
 Add document.
 Add error handling when the cpu index is invalid.
   
 changed from v2
 use QERR_INVALID_PARAMETER_VALUE as Markus suggest.
   
 Signed-off-by:  Lai Jiangshanla...@cn.fujitsu.com
 ---
 diff --git a/hmp-commands.hx b/hmp-commands.hx
 index 23024ba..f86d9fe 100644
 --- a/hmp-commands.hx
 +++ b/hmp-commands.hx
 @@ -724,7 +724,8 @@ ETEXI
  .args_type  = cpu_index:i,
  .params = cpu,
  .help   = inject an NMI on the given CPU,
 -.mhandler.cmd = do_inject_nmi,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_inject_nmi,
  },
  #endif
  STEXI
 diff --git a/monitor.c b/monitor.c
 index ec31eac..3e33a96 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -2119,7 +2119,7 @@ static void do_wav_capture(Monitor *mon, const 
QDict *qdict)
  #endif
   
  #if defined(TARGET_I386)
 -static void do_inject_nmi(Monitor *mon, const QDict *qdict)
 +static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject 
**ret_data)
  {
  CPUState *env;
  int cpu_index = qdict_get_int(qdict, cpu_index);
 @@ -2127,8 +2127,12 @@ static void do_inject_nmi(Monitor *mon, const 
QDict *qdict)
  for (env = first_cpu; env != NULL; env = env-next_cpu)
  if (env-cpu_index == cpu_index) {
  cpu_interrupt(env, CPU_INTERRUPT_NMI);
 -break;
 +return 0;
  }
 +
 +qerror_report(QERR_INVALID_PARAMETER_VALUE, cpu_index,
 +  a CPU number);
 +return -1;
  }
  #endif
   
 diff --git a/qmp-commands.hx b/qmp-commands.hx
 index e5f157f..fcb6bf2 100644
 --- a/qmp-commands.hx
 +++ b/qmp-commands.hx
 @@ -429,6 +429,33 @@ Example:
   
  EQMP
   
 +#if defined(TARGET_I386)
 +{
 +.name   = inject_nmi,
 +.args_type  = cpu_index:i,
 +.params = cpu,
 +.help   = inject an NMI on the given CPU,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_inject_nmi,
 +},
 +#endif
 +SQMP
 +inject_nmi
 +--
 +
 +Inject an NMI on the given CPU (x86 only).
 +
 +Arguments:
 +
 +- cpu_index: the index of the CPU to be injected NMI (json-int)
  
   Please, use cpu-index, that's what we're using for the 
   human-monitor-command.
  
   Avi, Anthony, can you please ACK this new command?
  
  
  I'd like to see cpu-index made optional; if not present, nmi all cpus 
  (that's what the nmi button on many machines does, or at least I think 
  that's what it does).
 
  Looks like a GUI feature to me,
 
 Really?  Can't see how you can build NMI to all CPUs from NMI this
 CPU.  Or am I misunderstanding you?

I guess so. Avi referred to 'nmi button on many machines', I assumed he
meant a virtual machine GUI, am I wrong?

  _might_ turn out to be an undesirable
  side effect to client writers.
 
 They seem to be coping fine with optional arguments elsewhere.

Which we might want to review.

 I guess I prefer a to-all-cpus argument.
 
 How would that look like?  cpu-index: all?

Like this:

{ execute: inject-nmi, arguments: { to-all-cpus: true } }

But this looks like an optimization to me, because it's also easy to do:

for cpu in query-cpus; do
  inject-nmi cpu

Unless we want to do this in an atomic way, due to side effects I'm
not aware about.

 I find optional json-int a simpler schema than either a json-int or
 the json-string all.

[Qemu-devel] [Bug 690776] Re: Overwrite argv to set process title, eliminating 16-character prctl() limit.

2010-12-15 Thread John Morrissey


** Attachment added: qemu-overwrite-argv-to-set-process-title.patch
   
https://bugs.launchpad.net/bugs/690776/+attachment/1767059/+files/qemu-overwrite-argv-to-set-process-title.patch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/690776

Title:
  Overwrite argv to set process title, eliminating 16-character prctl() limit.

Status in QEMU:
  New

Bug description:
  I've modified qemu to overwrite its arguments to set the process title, since 
its current prctl() method has a 16-character limit.

I posted the original patch to qemu-devel, made the changes others suggested, 
then re-posted to qemu-devel. I flailed around a bit with the patch submission 
process and think I finally got it right, but haven't been able to gain the 
notice of a committer to have this pushed. Maybe this will get more attention 
when reported in the BTS.

[Qemu-devel] [Bug 690776] [NEW] Overwrite argv to set process title, eliminating 16-character prctl() limit.

2010-12-15 Thread John Morrissey

Public bug reported:

I've modified qemu to overwrite its arguments to set the process title,
since its current prctl() method has a 16-character limit.

I posted the original patch to qemu-devel, made the changes others
suggested, then re-posted to qemu-devel. I flailed around a bit with the
patch submission process and think I finally got it right, but haven't
been able to gain the notice of a committer to have this pushed. Maybe
this will get more attention when reported in the BTS.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/690776

Title:
  Overwrite argv to set process title, eliminating 16-character prctl() limit.

Status in QEMU:
  New

Bug description:
  I've modified qemu to overwrite its arguments to set the process title, since 
its current prctl() method has a 16-character limit.

I posted the original patch to qemu-devel, made the changes others suggested, 
then re-posted to qemu-devel. I flailed around a bit with the patch submission 
process and think I finally got it right, but haven't been able to gain the 
notice of a committer to have this pushed. Maybe this will get more attention 
when reported in the BTS.

[Qemu-devel] Re: [PATCH] rtl8139: IO memory is not part of vmstate

2010-12-15 Thread Michael S. Tsirkin

On Wed, Dec 15, 2010 at 06:12:25PM +0100, Paolo Bonzini wrote:
 On 12/15/2010 11:00 AM, Michael S. Tsirkin wrote:
 Indeed, subsections are for data that is rarely needed so that
 there's some chance (sometimes ~100%) of migration working
 seemlessly.
 
 If a subsection arrives that qemu does
 not know about, won't migratin fail?
 
 Yes, that's why rarely needed = some high chance of migration
 working (though no certainty).
 
 In this case it's either
 no-bump-and-live-with-the-consequences, or changing the version id.
 
 This was discussed to death already.  version ids have the problem
 that they don't play nicely with downstreams.
 
 Downstream version bumps don't play nicely with upstream, so
 downstream does have a reason for always-necessary subsections.  But
 upstream can bump the version id as much as they care.
 
 Paolo

This assuming upstream developers do not care about downstreams.
To give a chance for downstream to cherry-pick changes, upstream
should use subsections instead of version ids too.

-- 
MST

Re: [Qemu-devel] [PATCH v4 1/2] Minimal RAM API support

2010-12-15 Thread Alex Williamson

On Wed, 2010-12-15 at 17:23 +, Paul Brook wrote:
  This adds a minimum chunk of Anthony's RAM API support so that we
  can identify actual VM RAM versus all the other things that make
  use of qemu_ram_alloc.
 
 Why do we care? How are you defining actual VM RAM?
 
 Surely the whole point of qemu_ram_alloc is to allocate a chunk of memory 
 that 
 can be mapped into the guest physical address space, so all uses of 
 qemu_ram_alloc should be using this API.

http://wiki.qemu.org/Features/RamAPI

Re: [Qemu-devel] [PATCH v4 1/2] Minimal RAM API support

2010-12-15 Thread Anthony Liguori


On 12/15/2010 11:23 AM, Paul Brook wrote:

This adds a minimum chunk of Anthony's RAM API support so that we
can identify actual VM RAM versus all the other things that make
use of qemu_ram_alloc.
 

Why do we care? How are you defining actual VM RAM?

Surely the whole point of qemu_ram_alloc is to allocate a chunk of memory that
can be mapped into the guest physical address space, so all uses of
qemu_ram_alloc should be using this API.
   


actual VM RAM == the DIMM devices.  This address has exactly a 1-1 
mapping between memory content and an address.  It doesn't change during 
program execution.


It may be mapped in the CPU in weird ways, it may be visibly different 
to devices, but that's a different interface.


Why do we care about differentiating actual VM RAM from things that 
behave like RAM but are not actually RAM (like device ROM)?  Because the 
semantics are different.  ROM is non-volatile and RAM is volatile.  If 
we don't make that distinction in our interfaces, we loose the ability 
to model the behavioral differences.


For things like paravirtual devices, we can take short cuts (to optimize 
performance) by saying the device is directly connecting to RAM (and 
doesn't go through the normal translation hierarchy).


Regards,

Anthony Liguori


Paul
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Qemu-devel] [PATCH] Fix migrate set speed doc arg

2010-12-15 Thread Luiz Capitulino

We used to ignore any fractional part in 0.13, but due to recent
changes (started with 9f9b17a4f0865286391e4d3a0a735230122a2289)
migrate_set_speed will reject the fractional part.

We don't expect existing clients to be relying on this, but we
need to update the documentation to reflect the change.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 qmp-commands.hx |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 3486223..164ecbc 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -510,7 +510,7 @@ Set maximum speed for migrations.
 
 Arguments:
 
-- value: maximum speed, in bytes per second (json-number)
+- value: maximum speed, in bytes per second (json-int)
 
 Example:
 
-- 
1.7.3.3.585.g74f6e

Re: [Qemu-devel] [PATCH] Fix migrate set speed doc arg

2010-12-15 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 We used to ignore any fractional part in 0.13, but due to recent
 changes (started with 9f9b17a4f0865286391e4d3a0a735230122a2289)
 migrate_set_speed will reject the fractional part.

 We don't expect existing clients to be relying on this, but we
 need to update the documentation to reflect the change.

 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com

Documentation broken by commit ed3d4a80.

ACK

[Qemu-devel] [Bug 685096] Re: USB Passthrough not working for Windows 7 guest

2010-12-15 Thread Carlos-velasco

Same problem here, using:
qemu-kvm 0.13
kernel 2.6.36.2
kvm-intel

Guest:
Windows 7 Enterprise x64

INFO USBHOST:
  Device 2.2, speed 480 Mb/s
Class 00: USB device 054c:02a5, Storage Media

INFO USB:
Device 0.3, Speed 480 Mb/s, Product Storage Media

Device appears in Windows 7 but in Error Code 10.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/685096

Title:
  USB Passthrough not working for Windows 7 guest

Status in QEMU:
  New

Bug description:
  USB Passthrough from host to guest is not working for a 32-bit Windows 7 
guest, while it works perfectly for a 32-bit Windows XP guest. 

The device appears in the device manager of Windows 7, but with Error code 10: 
device cannot start. I have tried this with numerous USB thumbdrives and a USB 
wireless NIC, all with the same result. The device name and functionality is 
recognized, so at least some USB negotiation is taking place.

I am trying this with the latest git-pull of QEMU-KVM. 

The command line to launch qemu-kvm for win7 is:
sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 
2 -vga std -hda ./disk_images/win7.qcow -vnc :1 -boot c -usb -usbdevice tablet 
-usbdevice host:0781:5150

The command line to launch qemu-kvm for winxp is:
sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 
2 -usb -vga std -hda ./winxpsp3.qcow -vnc :0 -boot c -usbdevice tablet 
-usbdevice host:0781:5150

Any help is appreciated.

[Qemu-devel] Storing instructions executed in CPUState

2010-12-15 Thread James Greensky

Hello, I am attempting to fold a cache model into Qemu.  For this I need to
store the instructions executed between actual lds/sts to Qemu memory, for
performance reasons.  I figured that a buffer in the CPUState
could accommodate this requirement, however tcg only gives the ability to
ld/st from host memory at constant offsets.  Is there a better solution to
this problem than adding new ld/st instructions to tcg that take a computed
offset? Thx, James

[Qemu-devel] [Bug 685096] Re: USB Passthrough not working for Windows 7 guest

2010-12-15 Thread Carlos-velasco

Ugh... I have just realized that KVM only supports UHCI, so not USB 2.0
support

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/685096

Title:
  USB Passthrough not working for Windows 7 guest

Status in QEMU:
  New

Bug description:
  USB Passthrough from host to guest is not working for a 32-bit Windows 7 
guest, while it works perfectly for a 32-bit Windows XP guest. 

The device appears in the device manager of Windows 7, but with Error code 10: 
device cannot start. I have tried this with numerous USB thumbdrives and a USB 
wireless NIC, all with the same result. The device name and functionality is 
recognized, so at least some USB negotiation is taking place.

I am trying this with the latest git-pull of QEMU-KVM. 

The command line to launch qemu-kvm for win7 is:
sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 
2 -vga std -hda ./disk_images/win7.qcow -vnc :1 -boot c -usb -usbdevice tablet 
-usbdevice host:0781:5150

The command line to launch qemu-kvm for winxp is:
sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 
2 -usb -vga std -hda ./winxpsp3.qcow -vnc :0 -boot c -usbdevice tablet 
-usbdevice host:0781:5150

Any help is appreciated.

Re: [Qemu-devel] [Qestion] What status of memory stats feature

2010-12-15 Thread Adam Litke

On Wed, 2010-12-15 at 15:39 -0200, Luiz Capitulino wrote:
 On Wed, 15 Dec 2010 16:20:05 +0900
 Ken'ichi Ohmichi oomi...@mxs.nes.nec.co.jp wrote:
 
  
  Hi,
  
  I tried to get the memory stats by using virDomainMemoryStats() of libvirt,
  but it could not do it because of the following patch:
  
  [PATCH 03/23] disable guest-provided stats on info balloon command
2010/10/01 from Luiz Capitulino
http://www.mail-archive.com/qemu-devel@nongnu.org/msg43024.html 
  
  According to the related thread, the above patch avoids hanging user
  monitor, and people hope the memory stats feature will be able in the
  future. So I'd like to know the status of this feature.
  
  Is there the patch for enabling the feature ?
  If the patch exists, I'd like to try it.
 
 It doesn't, afaik.

It depends on what you are looking to do.  If you just want to play
around and aren't concerned about lockups, You could undo the above
patch to re-enable the interface (although I cannot guarantee that even
this will work).  

  What is the essential problem ?
 
 There are two essential problems here.
 
 The first one is that QMP lacks true asynchronous command support (it does
 have some code for it, but it's incomplete).

I was never completely clear on the problems with the QMP async support.

 The second problem is that we must not make an existing synchronous command
 asynchronous, because this breaks the ABI.

The memory stats interface has always advertised itself as an async
command so this one shouldn't matter.  Whether those semantics actually
ever worked is another issue.

  Does some infinite loop happen ?
 
 No, the guest doesn't respond.
 
  Unfortunately, I cannot understand the cause of hanging user monitor.
 
 The balloon command needs guest cooperation (ie. it talks to the guest),
 if the guest doesn't respond the command will never return.

Isn't that only a problem for the user monitor?  For QMP you might just
never get a response to the stats request.  IIRC, the QMP monitor is
never 'suspended' in the same way that the user monitor is so I wouldn't
expect it to be susceptible to the same lockup issues.

 We don't have a precise ETA for async support, but if it depends only on
 the new error framework work, then we could have it for 0.15. If it depends
 on the full monitor redesign work, then I guess we'll two releases.

I can't provide a definitive answer to this unfortunately since I
haven't looked in awhile (nor am I privy to the details of the full
monitor redesign proposal.

-- 
Thanks,
Adam

Re: [Qemu-devel] Re: [PATCH 00/15] Megasas HBA emulation and SCSI update v.3

2010-12-15 Thread Benjamin Herrenschmidt

On Mon, 2010-12-13 at 08:32 +0100, Hannes Reinecke wrote:
 On 12/10/2010 11:14 PM, Paolo Bonzini wrote:
  On 11/24/2010 05:50 PM, Christoph Hellwig wrote:
  Btw, it might make sense to split this series into two.
 
  Patches 1 to 11 are genuine improvements to the SCSI code, which I'd
  like to see merged ASAP.  The rest is the actual megasas driver, which
  I still want to see, but haven't even gotten to review yet.
  
  Ping for patches 1 to 11?
  
  Paolo
 
 The first few already have been merged by Kevin Wolf; I'll see to
 prepare an updated patchset.

Actually, I was about to ask as I'd like to base some new work of mine
on top of these. I don't see any recent commit from Kevin in the qemu
master branch (nor in any other branch on
http://git.savannah.gnu.org/cgit/qemu.git/log/).

Does Kevin maintain a separate staging tree ?

Cheers,
Ben.

Re: [Qemu-devel] Re: [PATCH 00/15] Megasas HBA emulation and SCSI update v.3

2010-12-15 Thread Benjamin Herrenschmidt

On Thu, 2010-12-16 at 12:45 +1100, Benjamin Herrenschmidt wrote:
 On Mon, 2010-12-13 at 08:32 +0100, Hannes Reinecke wrote:
  On 12/10/2010 11:14 PM, Paolo Bonzini wrote:
   On 11/24/2010 05:50 PM, Christoph Hellwig wrote:
   Btw, it might make sense to split this series into two.
  
   Patches 1 to 11 are genuine improvements to the SCSI code, which I'd
   like to see merged ASAP.  The rest is the actual megasas driver, which
   I still want to see, but haven't even gotten to review yet.
   
   Ping for patches 1 to 11?
   
   Paolo
  
  The first few already have been merged by Kevin Wolf; I'll see to
  prepare an updated patchset.
 
 Actually, I was about to ask as I'd like to base some new work of mine
 on top of these. I don't see any recent commit from Kevin in the qemu
 master branch (nor in any other branch on
 http://git.savannah.gnu.org/cgit/qemu.git/log/).
 
 Does Kevin maintain a separate staging tree ?

BTW. I could use patch 12 too (get_sense() callback :-) Would save me
from manufacturing REQUEST_SENSE etc... I can carry it locally for the
time being but it would be nice to have it upstream.

Cheers,
Ben.

Re: [Qemu-devel] Re: [PATCH] scsi-generic: bugfixes for 'SCSIRequest' conversion

2010-12-15 Thread Benjamin Herrenschmidt


  The commit message of this commit says it's a workaround for a problem
  with lsi:
  
  - when a read is aborted due to a mark/EOF/EOD/EOM, the len reported to
  controller can be 0. LSI controller emulation doesn't know how to manage
  this. A workaround found is to call the completion routine with
  SCSI_REASON_DONE just after calling it with SCSI_REASON_DATA with len=0.
  
  Are you sure that it's not needed any more?
  
 Don't ask me. I didn't do the patch, and my knowledge of lsi HBA
 internals is scanty.
 Nic, can you comment here?

Back to this topic...

On my current WIP code, currently based on qemu upstream commit
f711df67d611e4762966a249742a5f7499e19f99, I've just observed the
following behaviour:

Use -cdrom /dev/cdrom (which points to /dev/sr0)

No disk in the drive, scsi-disk kicks in. test-unit-ready properly says
there's no medium, so far so good.

Now, my (buggy) guest code tries to do a READ_10. At this point, qemu
hangs. What happens is that my scsi complete callback get called with
reason 1 (DATA) and arg 0. I then perform no data copy, and call back
sdev-info-read_data() which eventually calls me back again with reason
1 and arg 0, etc... in a loop.

I haven't had a chance yet to look at what's happening in the guts of
scsi-disk, but here, either I'm supposed to special case DATA with 0
bytes requested in my HBA driver, or there's a bug in scsi-disk.

I can work around it by doing a cancel_io in that case and then
completing the request is if reason had been 0 (DONE) but that's of
course just a band-aid.

I'll let you know what I find out when I get a chance to look at this
again (hopefully soon).

Cheers,
Ben.

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Isaku Yamahata

On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
 On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
  On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
   On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
 
 I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
 assigned device, so I'm pretty sure we're not going to hurt migration,
 but the code is clearly wrong and I'd like to make sure we don't trip 
 on
 a migration failure for a minor device config space change.

Which reminds me: maybe just mark nested bridges as non-migrateable
for now?  Care writing such a patch?
   
   Hmm, this is trickier than it sounds.
  
  Hmm, since 0 is put in the path instead of the bridge number,
  will the correct bridge be restored?
  
We're really only broken wrt
   migration if a device under a bridge calls qemu_ram_alloc.
  
  I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?
 
 You're right, it's more broken than that.  Anything that calls
 get_dev_path is broken for migration of bridges since the path is
 determined before the guest updates bus numbers.  That includes
 qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
 side.  So perhaps the right answer, for the moment, is to block
 migration if there's a p2p bridge.

Eww. That's bad. Anyway, I agree to disable it for the moment.
Let's fix it after 0.14 release.
-- 
yamahata

Re: [Qemu-devel] [PATCH] RFC: delay pci_update_mappings for 64-bit BARs

2010-12-15 Thread Isaku Yamahata

On Tue, Dec 14, 2010 at 10:42:43AM -0700, Cam Macdonell wrote:
 On Mon, Dec 13, 2010 at 8:00 PM, Isaku Yamahata yamah...@valinux.co.jp 
 wrote:
  On Mon, Dec 13, 2010 at 03:43:44PM -0700, Cam Macdonell wrote:
  Do not call pci_update_mappings on the lower 32-bits of a 64-bit bar. 
  ?Wait for the upper 32 or else Qemu will try to map on just the lower 32 
  which is probably going to corrupt memory.
 
  I was encountering crashes when mapping certain PCI region sizes. ?The 
  problem turns out that pci_update_mappings is being called without all 
  64-bits in the BAR. ?For example when mapping to 0x18000, once the 
  lower 32-bits were written the remapping happened (mapping to 0x800) 
  which would overwrite something.
 
  I'm not certain if this is completely correct, I'm simply testing the 
  lower 4-bits to only be MEM_TYPE_64 flag. ?Upper 32-bit address parts can 
  be values like 0xff which is tricky to test against.
 
  You're assuming that guest OS always write lower 32bit and them upper 32bit.
  Is the assumption correct?
  I found Linux does, but I don't know about other OSes.
  And I couldn't find any sentence about how to update (64bit) BAR in the 
  specs.
  (Please correct me if I missed it)
 
 I think you're right, we probably can't assume the order.
 
 
  Some work around would be necessary regardless of 32bit-or-64bit.
  because qemu doesn't emulate bus accurately at the moment.
  How about the followings?
  If BAR overlaps with RAM, don't map BAR.
  If BAR overlaps with other BARs, record the overlapping and
  when updating one of the BARs, update all the overlapping BARs.
  Which BAR wins depends on the order of updating, it doesn't matter because
  it's anomaly case.
 
 But the addresses in the BARs may not overlap.  For example, Linux
 allocates memory from top down, so I recently had the mapping of a BAR
 to address 0xffc000
 
 So BAR 0x18 sees 0xc004
 Then BAR 0x1c sees 0xff
 
 So if I understand what you mean by overlapping BARs, 0xc000 and
 0xffc000 will not be detected as overlapping and so we can't
 record it.  But, we can allow harmless mappings of the incomplete
 lower-32 to proceed and then get remapped when the upper bits are
 written.  (This is what happens currently, but fails when the lower-32
 overwrite RAM).
 
 Case of writing upper-then-lower (non-Linux case):
 The addresses in the upper 32-bits are going to be limited to 16-bits
 (at most 48-bit addresses currently) and so those shouldn't update
 mappings because they will overlap with RAM.  When the lower-bits are
 written, we have the full 64-bit address and can update mappings.
 
 Case of writing lower-then-upper:
 If the lower 32-bit BAR address doesn't conflict with RAM, map it.
 When the upper bits are written, update to the correct mapping.
 
 We would just have to ensure the first mapping is indeed harmless.

Agreed. When only lower 32-bit is set, it may conflicts with other BAR.
So when upper 32-bit is written, the conflicted BAR also needs update.
It might be acceptable to update all pci devices instead of conflicted ones.
Although it would be quite slow, setting BAR is not performance critical.


 Would that work?

I hope so. Anyway it's just work around.


 Cam
 
 
  This way, 32bit BAR case is also covered.
 
  thanks,
 
 
  Cam
  ---
  ?hw/pci.c | ? ?5 -
  ?1 files changed, 4 insertions(+), 1 deletions(-)
 
  diff --git a/hw/pci.c b/hw/pci.c
  index 438c0d1..3b81792 100644
  --- a/hw/pci.c
  +++ b/hw/pci.c
  @@ -1000,6 +1000,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
  addr, uint32_t val, int l)
  ?{
  ? ? ?int i, was_irq_disabled = pci_irq_disabled(d);
  ? ? ?uint32_t config_size = pci_config_size(d);
  + ? ?int is_64 = 0;
  +
  + ? ?is_64 = ((val  0xf) == PCI_BASE_ADDRESS_MEM_TYPE_64);
 
  ? ? ?for (i = 0; i  l  addr + i  config_size; val = 8, ++i) {
  ? ? ? ? ?uint8_t wmask = d-wmask[addr + i];
  @@ -1008,7 +1011,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
  addr, uint32_t val, int l)
  ? ? ? ? ?d-config[addr + i] = (d-config[addr + i]  ~wmask) | (val  
  wmask);
  ? ? ? ? ?d-config[addr + i] = ~(val  w1cmask); /* W1C: Write 1 to Clear 
  */
  ? ? ?}
  - ? ?if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
  + ? ?if ((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24)  (!is_64)) ||
  ? ? ? ? ?ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
  ? ? ? ? ?ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
  ? ? ? ? ?range_covers_byte(addr, l, PCI_COMMAND))
  --
  1.7.0.4
 
 
 
  --
  yamahata
 
 
 

-- 
yamahata

[Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

2010-12-15 Thread Yoshiaki Tamura

2010/12/3 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
 2010/12/2 Michael S. Tsirkin m...@redhat.com:
 On Wed, Dec 01, 2010 at 05:03:43PM +0900, Yoshiaki Tamura wrote:
 2010/11/28 Michael S. Tsirkin m...@redhat.com:
  On Sun, Nov 28, 2010 at 08:27:58PM +0900, Yoshiaki Tamura wrote:
  2010/11/28 Michael S. Tsirkin m...@redhat.com:
   On Thu, Nov 25, 2010 at 03:06:44PM +0900, Yoshiaki Tamura wrote:
   Modify inuse type to uint16_t, let save/load to handle, and revert
   last_avail_idx with inuse if there are outstanding emulation.
  
   Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
  
   This changes migration format, so it will break compatibility with
   existing drivers. More generally, I think migrating internal
   state that is not guest visible is always a mistake
   as it ties migration format to an internal implementation
   (yes, I know we do this sometimes, but we should at least
   try not to add such cases).  I think the right thing to do in this case
   is to flush outstanding
   work when vm is stopped.  Then, we are guaranteed that inuse is 0.
   I sent patches that do this for virtio net and block.
 
  Could you give me the link of your patches?  I'd like to test
  whether they work with Kemari upon failover.  If they do, I'm
  happy to drop this patch.
 
  Yoshi
 
  Look for this:
  stable migration image on a stopped vm
  sent on:
  Wed, 24 Nov 2010 17:52:49 +0200

 Thanks for the info.

 However, The patch series above didn't solve the issue.  In
 case of Kemari, inuse is mostly  0 because it queues the
 output, and while last_avail_idx gets incremented
 immediately, not sending inuse makes the state inconsistent
 between Primary and Secondary.

 Hmm. Can we simply avoid incrementing last_avail_idx?

 I think we can calculate or prepare an internal last_avail_idx,
 and update the external when inuse is decremented.  I'll try
 whether it work w/ w/o Kemari.

Hi Michael,

Could you please take a look at the following patch?

commit 36ee7910059e6b236fe9467a609f5b4aed866912
Author: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Date:   Thu Dec 16 14:50:54 2010 +0900

virtio: update last_avail_idx when inuse is decreased.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

diff --git a/hw/virtio.c b/hw/virtio.c
index c8a0fc6..6688c02 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -237,6 +237,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
 wmb();
 trace_virtqueue_flush(vq, count);
 vring_used_idx_increment(vq, count);
+vq-last_avail_idx += count;
 vq-inuse -= count;
 }

@@ -385,7 +386,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 unsigned int i, head, max;
 target_phys_addr_t desc_pa = vq-vring.desc;

-if (!virtqueue_num_heads(vq, vq-last_avail_idx))
+if (!virtqueue_num_heads(vq, vq-last_avail_idx + vq-inuse))
 return 0;

 /* When we start there are none of either input nor output. */
@@ -393,7 +394,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)

 max = vq-vring.num;

-i = head = virtqueue_get_head(vq, vq-last_avail_idx++);
+i = head = virtqueue_get_head(vq, vq-last_avail_idx + vq-inuse);

 if (vring_desc_flags(desc_pa, i)  VRING_DESC_F_INDIRECT) {
 if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {





  I'm wondering why
 last_avail_idx is OK to send but not inuse.

 last_avail_idx is at some level a mistake, it exposes part of
 our internal implementation, but it does *also* express
 a guest observable state.

 Here's the problem that it solves: just looking at the rings in virtio
 there is no way to detect that a specific request has already been
 completed. And the protocol forbids completing the same request twice.

 Our implementation always starts processing the requests
 in order, and since we flush outstanding requests
 before save, it works to just tell the remote 'process only requests
 after this place'.

 But there's no such requirement in the virtio protocol,
 so to be really generic we could add a bitmask of valid avail
 ring entries that did not complete yet. This would be
 the exact representation of the guest observable state.
 In practice we have rings of up to 512 entries.
 That's 64 byte per ring, not a lot at all.

 However, if we ever do change the protocol to send the bitmask,
 we would need some code to resubmit requests
 out of order, so it's not trivial.

 Another minor mistake with last_avail_idx is that it has
 some redundancy: the high bits in the index
 ( vq size) are not necessary as they can be
 got from avail idx.  There's a consistency check
 in load but we really should try to use formats
 that are always consistent.

 The following patch does the same thing as original, yet
 keeps the format of the virtio.  It shouldn't break live
 migration either because inuse should be 0.

 Yoshi

 Question is, can you flush to make inuse 0 in kemari too?
 And if not, how do you handle the fact that some requests
 are

[Qemu-devel] Re: [PATCH 11/21] ioport: insert event_tap_ioport() to ioport_write().

2010-12-15 Thread Yoshiaki Tamura

2010/11/28 Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp:
 2010/11/28 Michael S. Tsirkin m...@redhat.com:
 On Thu, Nov 25, 2010 at 03:06:50PM +0900, Yoshiaki Tamura wrote:
 Record ioport event to replay it upon failover.

 Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp

 Interesting. This will have to be extended to support ioeventfd.
 Since each eventfd is really just a binary trigger
 it should be enough to read out the fd state.

 Haven't thought about eventfd yet.  Will try doing it in the next
 spin.

Hi Michael,

I looked into eventfd and realized it's only used with vhost now.  However, I
believe vhost bypass the net layer in qemu, and there is no way for Kemari to
detect the outputs.  To me, it doesn't make sense to extend this patch to
support eventfd...

Thanks,

Yoshi


 Yoshi


 ---
  ioport.c |    2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

 diff --git a/ioport.c b/ioport.c
 index aa4188a..74aebf5 100644
 --- a/ioport.c
 +++ b/ioport.c
 @@ -27,6 +27,7 @@

  #include ioport.h
  #include trace.h
 +#include event-tap.h

  /***/
  /* IO Port */
 @@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, 
 uint32_t data)
          default_ioport_writel
      };
      IOPortWriteFunc *func = ioport_write_table[index][address];
 +    event_tap_ioport(index, address, data);
      if (!func)
          func = default_func[index];
      func(ioport_opaque[address], address, data);
 --
 1.7.1.2

 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

75 matches

Mail list logo