date:20100519

[PATCH] pc: fix segfault introduced by 3d53f5c36ff6

2010-05-19 Thread Eduard - Gabriel Munteanu

Commit 3d53f5c36ff6 introduced a segfault by erroneously making fw_cfg a
'void **' and passing it around in different ways.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/pc.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index fee08c9..4a4a706 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -822,7 +822,7 @@ void pc_memory_init(ram_addr_t ram_size,
 ram_addr_t ram_addr, bios_offset, option_rom_offset;
 ram_addr_t below_4g_mem_size, above_4g_mem_size = 0;
 int bios_size, isa_bios_size;
-void **fw_cfg;
+void *fw_cfg;
 
 if (ram_size >= 0xe000 ) {
 above_4g_mem_size = ram_size - 0xe000;
@@ -905,7 +905,7 @@ void pc_memory_init(ram_addr_t ram_size,
 rom_set_fw(fw_cfg);
 
 if (linux_boot) {
-load_linux(*fw_cfg, kernel_filename, initrd_filename, kernel_cmdline, below_4g_mem_size);
+load_linux(fw_cfg, kernel_filename, initrd_filename, kernel_cmdline, below_4g_mem_size);
 }
 
 for (i = 0; i < nb_option_roms; i++) {
-- 
1.6.4.4
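
For readers unfamiliar with this class of bug, here is a minimal sketch of
what goes wrong when an opaque handle is declared 'void **' and then
dereferenced before being passed on. It is not taken from hw/pc.c:
struct cfg, make_cfg() and use_cfg() are hypothetical stand-ins for the
fw_cfg object, the function that creates it, and load_linux().

```c
#include <stdlib.h>

/* Hypothetical stand-in for the opaque firmware config object. */
struct cfg {
    int magic;
};

/* Hypothetical creator: hands the object back as an opaque 'void *'. */
static void *make_cfg(void)
{
    struct cfg *c = malloc(sizeof(*c));

    if (c)
        c->magic = 42;
    return c;
}

/* Hypothetical consumer, standing in for load_linux(). */
static int use_cfg(void *opaque)
{
    struct cfg *c = opaque;

    return c->magic;
}

/*
 * Pattern after the patch: keep the handle as 'void *' and pass it
 * through unchanged.
 *
 * The broken pattern would be:
 *     void **fw_cfg = make_cfg();
 *     use_cfg(*fw_cfg);
 * which hands the callee the first pointer-sized word stored inside
 * the object instead of the object's address.
 */
int correct_usage(void)
{
    void *fw_cfg = make_cfg();
    int magic;

    if (!fw_cfg)
        return -1;
    magic = use_cfg(fw_cfg);
    free(fw_cfg);
    return magic;
}
```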

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 10/12] kvm: enable smp > 1

2010-05-19 Thread Avi Kivity


On 05/19/2010 11:02 PM, Udo Lembke wrote:
>> Unrelated, what are your smp issues?
>
> If I use one CPU I get good I/O performance:
> e.g. over 500 MB/s with the "install" profile of the I/O benchmark
> h2benchw.exe.
> (aio=threads | SAS RAID-0 |
> ftp://ftp.heise.de/pub/ct/ctsi/h2benchw.zip | h2benchw.exe -p -w
> iotest 0)
>
> The same test with two CPUs gives results between 27 and 298 MB/s!
>
> It is also noticeable in real life, not only with a benchmark. I use a
> Windows VM with two CPUs for PostScript ripping and see a performance
> drop due to the bad I/O.

What's your block device model?  virtio or ide?

What does cpu usage look like on guest or host?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-19 Thread Avi Kivity


On 05/20/2010 01:27 AM, Michael S. Tsirkin wrote:
> On Wed, May 19, 2010 at 08:04:51PM +0300, Avi Kivity wrote:
>> On 05/18/2010 04:19 AM, Michael S. Tsirkin wrote:
>>> With PUBLISH_USED_IDX, guest tells us which used entries
>>> it has consumed. This can be used to reduce the number
>>> of interrupts: after we write a used entry, if the guest has not yet
>>> consumed the previous entry, or if the guest has already consumed the
>>> new entry, we do not need to interrupt.
>>> This improves bandwidth by 30% under some workflows.
>>>
>>> Signed-off-by: Michael S. Tsirkin
>>> ---
>>>
>>> Rusty, Dave, this patch depends on the patch
>>> "virtio: put last seen used index into ring itself"
>>> which is currently destined at Rusty's tree.
>>> Rusty, if you are taking that one for 2.6.35, please
>>> take this one as well.
>>> Dave, any objections?
>>
>> I object: I think the index should have its own cacheline,
>
> The issue here is that host/guest do not know each
> other's cache line size. I guess we could just put it
> at offset 128 or something like that ... Rusty?

Not so pretty, but ok.

>> and that it should be documented before merging.
>
> I think you meant to object to the virtio patch, not this one.  This
> patch does not introduce new layout, just implements host support.
> virtio spec patch will follow: it is not part of linux tree so
> there is no patch dependency.

There is a reviewer dependency.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
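
The interrupt-suppression rule quoted in this thread can be sketched in a
few lines. This is only an illustration of the rule as described, not the
vhost-net code: need_interrupt() and its parameter names are hypothetical,
and the indices are assumed to be free-running 16-bit counters that wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * used_idx_after_write: host's used index once the new entry is written
 *                       (so the new entry is number used_idx_after_write - 1).
 * guest_last_seen:      index the guest published as "consumed up to here".
 *
 * No interrupt is needed if the guest is still behind the previous entry
 * (it will find the new one on its own), or if it has already consumed
 * the new entry. That leaves exactly one case where we must interrupt:
 * the guest has consumed everything up to, but not including, the new entry.
 */
bool need_interrupt(uint16_t used_idx_after_write, uint16_t guest_last_seen)
{
    return guest_last_seen == (uint16_t)(used_idx_after_write - 1);
}
```

The cast keeps the comparison correct when the 16-bit index wraps from
65535 back to 0.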


Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Avi Kivity


On 05/20/2010 01:33 AM, Michael S. Tsirkin wrote:
>> Virtio is already way too bouncy due to the indirection between the
>> avail/used rings and the descriptor pool.  A device with out of order
>> completion (like virtio-blk) will quickly randomize the unused
>> descriptor indexes, so every descriptor fetch will require a bounce.
>>
>> In contrast, if the rings hold the descriptors themselves instead of
>> pointers, we bounce (sizeof(descriptor)/cache_line_size) cache lines for
>> every descriptor, amortized.
>
> On the other hand, consider that on fast path we are never using all
> of the ring. With a good allocator we might be able to keep
> reusing only small part of the ring, instead of wrapping around
> all of it all of the time.

It's still suboptimal, we have to bounce both the avail/used rings and
the descriptor pool, compared to just the descriptor ring with a direct
design.  Plus we don't need a fancy allocator.


When amortizing cachelines, simple data structures win.
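
To put a number on the amortized figure above, here is a small sketch. It
assumes a 16-byte descriptor with the same fields as the existing vring
descriptor; struct desc and descriptors_per_line() are illustrative names,
and the cache line size is whatever the caller assumes (64 bytes is a
common value, not something stated in this thread).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed descriptor shape: address, length, flags, next index. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/*
 * With descriptors stored contiguously in the ring itself, one cache line
 * carries this many descriptors, so each descriptor costs
 * sizeof(struct desc) / line_size of a line transfer, amortized.
 */
size_t descriptors_per_line(size_t line_size)
{
    return line_size / sizeof(struct desc);
}
```

With a 64-byte line that is four descriptors per transfer, i.e. a quarter
of a bounce per descriptor, versus up to one bounce per descriptor when
the indexes have been randomized.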

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Michael Tokarev


20.05.2010 02:30, Anthony Liguori wrote:
> On 05/19/2010 05:29 PM, Andre Przywara wrote:
>> Michael Tokarev wrote:
>>> ...
>>>
>>> Also, thanks to Andre Przywara, whole winNT thing works but it requires
>>> -cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU. This

[]

> It'd be nice if we had more flexibility in defining custom machine types
> so you could just do qemu -M win98.

This is wrong IMHO.  win98 and winNT can run on various different
machines, including all modern ones (yes I tried the same winNT
on my Athlon X2-64, just had to switch SATA from AHCI to IDE;
win95 works too)...  just not in kvm :)

>> BTW: Does anyone know what the problem with Windows95/98 on KVM is? I
>> tried some tracing today, but couldn't find a hint.

Um.  The bug report(s) come as a surprise to me: I tried to
install win98 in kvm several times in the past but setup
always failed - different messages in different versions
of kvm, either "unable to emulate" or "real mode trap" or
something else, or just lockup, usually on first reboot.
So - the bug reports talk about the mouse not working, but
this means win98 itself works somehow...  I dunno :)

/mjt

Kvm device passthrough

2010-05-19 Thread Mu Lin

Hi, Folks:

Could you provide a pointer to the KVM device passthrough howto/FAQ?

I have two questions:

1. My host OS (Linux) has no native driver for some home-grown PCI
devices; the driver exists only in the guest OS. Does device passthrough
work in this case, assuming I have VT-d?

2. How about i2c devices?

Thanks

Mu

[PATCH] add support for protocol driver create_options

2010-05-19 Thread MORITA Kazutaka

This patch enables protocol drivers to use their create options which
are not supported by the format.  For example, protocol drivers can use
a backing_file option with raw format.

Signed-off-by: MORITA Kazutaka 
---
 block.c   |7 +++
 block.h   |1 +
 qemu-img.c|   49 ++---
 qemu-option.c |   52 +---
 qemu-option.h |2 ++
 5 files changed, 85 insertions(+), 26 deletions(-)

diff --git a/block.c b/block.c
index 48d8468..0ab9424 100644
--- a/block.c
+++ b/block.c
@@ -56,7 +56,6 @@ static int bdrv_read_em(BlockDriverState *bs, int64_t sector_num,
 uint8_t *buf, int nb_sectors);
 static int bdrv_write_em(BlockDriverState *bs, int64_t sector_num,
  const uint8_t *buf, int nb_sectors);
-static BlockDriver *find_protocol(const char *filename);
 
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 QTAILQ_HEAD_INITIALIZER(bdrv_states);
@@ -210,7 +209,7 @@ int bdrv_create_file(const char* filename, QEMUOptionParameter *options)
 {
 BlockDriver *drv;
 
-drv = find_protocol(filename);
+drv = bdrv_find_protocol(filename);
 if (drv == NULL) {
 drv = bdrv_find_format("file");
 }
@@ -283,7 +282,7 @@ static BlockDriver *find_hdev_driver(const char *filename)
 return drv;
 }
 
-static BlockDriver *find_protocol(const char *filename)
+BlockDriver *bdrv_find_protocol(const char *filename)
 {
 BlockDriver *drv1;
 char protocol[128];
@@ -469,7 +468,7 @@ int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags)
 BlockDriver *drv;
 int ret;
 
-drv = find_protocol(filename);
+drv = bdrv_find_protocol(filename);
 if (!drv) {
 return -ENOENT;
 }
diff --git a/block.h b/block.h
index 24efeb6..9034ebb 100644
--- a/block.h
+++ b/block.h
@@ -54,6 +54,7 @@ void bdrv_info_stats(Monitor *mon, QObject **ret_data);
 
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
+BlockDriver *bdrv_find_protocol(const char *filename);
 BlockDriver *bdrv_find_format(const char *format_name);
 BlockDriver *bdrv_find_whitelisted_format(const char *format_name);
 int bdrv_create(BlockDriver *drv, const char* filename,
diff --git a/qemu-img.c b/qemu-img.c
index d3c30a7..8ae7184 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -252,8 +252,8 @@ static int img_create(int argc, char **argv)
 const char *base_fmt = NULL;
 const char *filename;
 const char *base_filename = NULL;
-BlockDriver *drv;
-QEMUOptionParameter *param = NULL;
+BlockDriver *drv, *proto_drv;
+QEMUOptionParameter *param = NULL, *create_options = NULL;
 char *options = NULL;
 
 flags = 0;
@@ -286,33 +286,42 @@ static int img_create(int argc, char **argv)
 }
 }
 
+/* Get the filename */
+if (optind >= argc)
+help();
+filename = argv[optind++];
+
 /* Find driver and parse its options */
 drv = bdrv_find_format(fmt);
 if (!drv)
 error("Unknown file format '%s'", fmt);
 
+proto_drv = bdrv_find_protocol(filename);
+if (!proto_drv)
+error("Unknown protocol '%s'", filename);
+
+create_options = append_option_parameters(create_options,
+  drv->create_options);
+create_options = append_option_parameters(create_options,
+  proto_drv->create_options);
+
 if (options && !strcmp(options, "?")) {
-print_option_help(drv->create_options);
+print_option_help(create_options);
 return 0;
 }
 
 /* Create parameter list with default values */
-param = parse_option_parameters("", drv->create_options, param);
+param = parse_option_parameters("", create_options, param);
 set_option_parameter_int(param, BLOCK_OPT_SIZE, -1);
 
 /* Parse -o options */
 if (options) {
-param = parse_option_parameters(options, drv->create_options, param);
+param = parse_option_parameters(options, create_options, param);
 if (param == NULL) {
 error("Invalid options for file format '%s'.", fmt);
 }
 }
 
-/* Get the filename */
-if (optind >= argc)
-help();
-filename = argv[optind++];
-
 /* Add size to parameters */
 if (optind < argc) {
 set_option_parameter(param, BLOCK_OPT_SIZE, argv[optind++]);
@@ -362,6 +371,7 @@ static int img_create(int argc, char **argv)
 puts("");
 
 ret = bdrv_create(drv, filename, param);
+free_option_parameters(create_options);
 free_option_parameters(param);
 
 if (ret < 0) {
@@ -543,14 +553,14 @@ static int img_convert(int argc, char **argv)
 {
 int c, ret, n, n1, bs_n, bs_i, flags, cluster_size, cluster_sectors;
 const char *fmt, *out_fmt, *out_baseimg, *out_filename;
-BlockDriver *drv;
+BlockDriver *drv, *proto_drv;
 BlockDriverState **bs, *out_bs;
 int64_t total_sect
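
The img_create() hunk above builds one combined option list by appending
first the format driver's create options and then the protocol driver's.
The body of append_option_parameters() is not visible in this truncated
diff, so the following is only a rough sketch of the idea using a
simplified, hypothetical struct opt and append_opts(). Skipping names that
are already present is an assumption, not something the diff shows.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical option descriptor; a NULL name ends a list. */
struct opt {
    const char *name;
    const char *help;
};

/* Number of entries before the NULL-name terminator (0 for a NULL list). */
size_t count_opts(const struct opt *list)
{
    size_t n = 0;

    while (list && list[n].name)
        n++;
    return n;
}

/* Is 'name' among the first n entries of list? */
static int contains(const struct opt *list, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return 1;
    return 0;
}

/*
 * Grow 'dest' (heap-allocated or NULL) with the entries of 'extra' that
 * it does not already contain, and re-terminate it. Returns the new
 * list, or NULL on allocation failure (dest is then left untouched).
 */
struct opt *append_opts(struct opt *dest, const struct opt *extra)
{
    size_t nd = count_opts(dest), ne = count_opts(extra), i;
    struct opt *out = realloc(dest, (nd + ne + 1) * sizeof(*out));

    if (!out)
        return NULL;
    for (i = 0; i < ne; i++)
        if (!contains(out, nd, extra[i].name))
            out[nd++] = extra[i];
    out[nd].name = NULL;
    out[nd].help = NULL;
    return out;
}
```

Called twice, first with the format list and then with the protocol list,
this yields a single list that "-o ?" style help and option parsing can
both work from.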

Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Rusty Russell

On Thu, 20 May 2010 02:31:50 pm Rusty Russell wrote:
> On Wed, 19 May 2010 05:36:42 pm Avi Kivity wrote:
> > > Note that this is a exclusive->shared->exclusive bounce only, too.
> > >
> > 
> > A bounce is a bounce.
> 
> I tried to measure this to show that you were wrong, but I was only able
> to show that you're right.  How annoying.  Test code below.

This time for sure!

#define _GNU_SOURCE
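#include <err.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <unistd.h>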

/* We share memory via an mmap. */
struct counter {
unsigned int cacheline1;
char pad[256];
unsigned int cacheline2;
};

#define MAX_BOUNCES 1

enum mode {
SHARE,
UNSHARE,
LOCKSHARE,
LOCKUNSHARE,
};

int main(int argc, char *argv[])
{
cpu_set_t cpuset;
volatile struct counter *counter;
struct timeval start, stop;
bool child;
unsigned int count;
uint64_t usec;
enum mode mode;

if (argc != 4)
errx(1, "Usage: cachebounce share|unshare|lockshare|lockunshare 
 ");

if (strcmp(argv[1], "share") == 0)
mode = SHARE;
else if (strcmp(argv[1], "unshare") == 0)
mode = UNSHARE;
else if (strcmp(argv[1], "lockshare") == 0)
mode = LOCKSHARE;
else if (strcmp(argv[1], "lockunshare") == 0)
mode = LOCKUNSHARE;
else
errx(1, "Usage: cachebounce share|unshare|lockshare|lockunshare 
 ");

CPU_ZERO(&cpuset);

counter = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE, 
MAP_ANONYMOUS|MAP_SHARED, -1, 0);
if (counter == MAP_FAILED)
err(1, "Mapping page");

/* Fault it in. */
counter->cacheline1 = counter->cacheline2 = 0;

child = (fork() == 0);

CPU_SET(atoi(argv[2 + child]), &cpuset);
if (sched_setaffinity(getpid(), sizeof(cpu_set_t), &cpuset) != 0)
err(1, "Calling sched_setaffinity()");

gettimeofday(&start, NULL);

if (child) {
count = 1;
switch (mode) {
case SHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while (counter->cacheline1 != count);
count++;
counter->cacheline1 = count;
count++;
}
break;
case UNSHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while (counter->cacheline1 != count);
count++;
counter->cacheline2 = count;
count++;
}
break;
case LOCKSHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while 
(__sync_val_compare_and_swap(&counter->cacheline1, count, count+1)
   != count);
count += 2;
}
break;
case LOCKUNSHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while (counter->cacheline1 != count);

__sync_val_compare_and_swap(&counter->cacheline2, count, count+1);
count += 2;
}
break;
}
} else {
count = 0;
switch (mode) {
case SHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while (counter->cacheline1 != count);
count++;
counter->cacheline1 = count;
count++;
}
break;
case UNSHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to change it. */
while (counter->cacheline2 != count);
count++;
counter->cacheline1 = count;
count++;
}
break;
case LOCKSHARE:
while (count < MAX_BOUNCES) {
/* Spin waiting for other side to cha

Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Rusty Russell

On Wed, 19 May 2010 05:36:42 pm Avi Kivity wrote:
> > Note that this is a exclusive->shared->exclusive bounce only, too.
> >
> 
> A bounce is a bounce.

I tried to measure this to show that you were wrong, but I was only able
to show that you're right.  How annoying.  Test code below.

> Virtio is already way too bouncy due to the indirection between the 
> avail/used rings and the descriptor pool.

I tried to do a more careful analysis below, and I think this is an
overstatement.

> A device with out of order 
> completion (like virtio-blk) will quickly randomize the unused 
> descriptor indexes, so every descriptor fetch will require a bounce.
> 
> In contrast, if the rings hold the descriptors themselves instead of 
> pointers, we bounce (sizeof(descriptor)/cache_line_size) cache lines for 
> every descriptor, amortized.

We already have indirect, this would be a logical next step.  So let's
think about it. The avail ring would contain 64 bit values, the used ring
would contain indexes into the avail ring.

So client writes descriptor page and adds to avail ring, then writes to
index.  Server reads index, avail ring, descriptor page (3).  Writes used
entry (1).  Updates last_used (1).  Client reads used (1), derefs avail (1),
updates last_used (1), cleans descriptor page (1).

That's 9 cacheline transfers, worst case.  Best case of a half-full ring
in steady state, assuming 128-byte cache lines, the avail ring costs are
1/16, the used entry is 1/64.  This drops it to 6 and 9/64 transfers.
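
The arithmetic behind "6 and 9/64" can be reproduced as follows. This is
one reading of the numbers, not something spelled out above: it assumes
that two of the nine transfers (the avail ring accesses) amortize to 1/16
each, one (a used ring access) amortizes to 1/64, and the remaining six
stay at one full transfer each. The 2-byte used entry is inferred from the
1/64 figure with 128-byte lines. The function name is made up.

```c
/*
 * Best-case cache line transfers per request, under the assumptions
 * stated in the lead-in. line_bytes is the cache line size in bytes.
 */
double best_case_transfers(unsigned line_bytes)
{
    const unsigned avail_entry_bytes = 8; /* 64-bit avail entries, per the text */
    const unsigned used_entry_bytes = 2;  /* assumption: 16-bit index into avail */
    const double full_transfers = 6.0;    /* assumption: steps that never amortize */
    double avail_cost = (double)avail_entry_bytes / line_bytes; /* 1/16 at 128 */
    double used_cost = (double)used_entry_bytes / line_bytes;   /* 1/64 at 128 */

    return full_transfers + 2 * avail_cost + used_cost;
}
```

For 128-byte lines this gives 6 + 2/16 + 1/64 = 6 + 9/64.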

(Note, the current scheme adds 2 more cacheline transfers, for the descriptor
table, worst case.  Assuming indirect, we get 2/8 xfer best case.  Either way,
it's not the main source of cacheline xfers).

Can we do better?  The obvious idea is to try to get rid of last_used and
used, and use the ring itself.  We would use an invalid entry to mark the
head of the ring.
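
To make the "invalid entry marks the head" idea concrete, here is a
single-threaded sketch. Everything in it is hypothetical (struct ring,
ring_put(), ring_get(), the choice of 0 as the invalid value), and it
leaves out the memory barriers a real shared-memory ring would need. Each
side keeps only a private position; there is no shared index to bounce.

```c
#include <stdint.h>

#define RING_SIZE 8
#define INVALID_ENTRY 0 /* assumption: 0 is never a valid entry */

struct ring {
    uint64_t slot[RING_SIZE];
};

/*
 * Producer: write at its private head. A slot that is still valid means
 * the consumer has not caught up, i.e. the ring is full.
 */
int ring_put(struct ring *r, unsigned *head, uint64_t value)
{
    unsigned i = *head % RING_SIZE;

    if (value == INVALID_ENTRY || r->slot[i] != INVALID_ENTRY)
        return -1;
    r->slot[i] = value;
    (*head)++;
    return 0;
}

/*
 * Consumer: read at its private tail. Hitting an invalid slot means it
 * has reached the producer's head. Consumed slots are marked invalid
 * again so the producer can reuse them.
 */
int ring_get(struct ring *r, unsigned *tail, uint64_t *value)
{
    unsigned i = *tail % RING_SIZE;

    if (r->slot[i] == INVALID_ENTRY)
        return -1;
    *value = r->slot[i];
    r->slot[i] = INVALID_ENTRY;
    (*tail)++;
    return 0;
}
```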

Any other thoughts?
Rusty.

Re: [PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-19 Thread Rusty Russell

On Thu, 20 May 2010 07:57:18 am Michael S. Tsirkin wrote:
> On Wed, May 19, 2010 at 08:04:51PM +0300, Avi Kivity wrote:
> > On 05/18/2010 04:19 AM, Michael S. Tsirkin wrote:
> >> With PUBLISH_USED_IDX, guest tells us which used entries
> >> it has consumed. This can be used to reduce the number
> >> of interrupts: after we write a used entry, if the guest has not yet
> >> consumed the previous entry, or if the guest has already consumed the
> >> new entry, we do not need to interrupt.
> >> This improves bandwidth by 30% under some workflows.
> >>
> >> Signed-off-by: Michael S. Tsirkin
> >> ---
> >>
> >> Rusty, Dave, this patch depends on the patch
> >> "virtio: put last seen used index into ring itself"
> >> which is currently destined at Rusty's tree.
> >> Rusty, if you are taking that one for 2.6.35, please
> >> take this one as well.
> >> Dave, any objections?
> >>
> >
> > I object: I think the index should have its own cacheline,
> 
> The issue here is that host/guest do not know each
> other's cache line size. I guess we could just put it
> at offset 128 or something like that ... Rusty?

I was assuming you'd put it at the end of the padding.

I think it's a silly optimization, but Avi obviously feels strongly about
it and I respect his opinion.

Please resubmit...
Thanks,
Rusty.
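
For reference, the "offset 128" suggestion quoted above amounts to padding
so that the two indexes can never share a cache line, whatever line size
(up to 128 bytes) either side has. A minimal sketch, which is not the
virtio ring layout: struct index_pair and its field names are hypothetical,
and the structure is assumed to start on a line boundary.

```c
#include <stddef.h>
#include <stdint.h>

#define ASSUMED_MAX_LINE 128 /* assumption: no side has lines larger than this */

struct index_pair {
    uint16_t producer_idx;                             /* written by one side */
    uint8_t pad[ASSUMED_MAX_LINE - sizeof(uint16_t)];  /* fill up to offset 128 */
    uint16_t consumer_idx;                             /* written by the other side */
};

/* Byte offset of the second index within the structure. */
size_t consumer_idx_offset(void)
{
    return offsetof(struct index_pair, consumer_idx);
}

/* Do two byte offsets fall into the same cache line of the given size? */
int on_same_line(size_t off_a, size_t off_b, size_t line_size)
{
    return off_a / line_size == off_b / line_size;
}
```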

Autotest 0.12.0-rc1

2010-05-19 Thread Lucas Meneghel Rodrigues

Hi folks,

Autotest 0.12.0 is almost ready, due to be released in a few days

http://autotest.kernel.org/milestone/0.12.0

I rolled out some pre-release tarballs that can be found on

http://test.kernel.org/releases/0.12.0-rc1/

If you want to test it to see what it looks like, be my guest and don't
hesitate to report bugs against it

http://autotest.kernel.org/newticket

This release will sport the web interface post django 1.0 conversion,
and up to date tests and profilers. Please check it out!

Cheers,

Lucas


[PATCH] Print a user-friendly message on failed vmentry

2010-05-19 Thread Mohammed Gamal

This patch addresses the bug report at https://bugs.launchpad.net/qemu/+bug/530077.

Failed vmentries were handled with handle_unhandled() which prints a rather
unfriendly message to the user. This patch separates handling vmentry failures
from unknown exit reasons and prints a friendly message to the user.

Signed-off-by: Mohammed Gamal 
---
 qemu-kvm.c |   19 ++-
 1 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 35a4c8a..1fdb6fe 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -106,6 +106,23 @@ static int handle_unhandled(uint64_t reason)
 return -EINVAL;
 }
 
+static int handle_failed_vmentry(uint64_t reason)
+{
+fprintf(stderr, "kvm: vm entry failed with error 0x%" PRIx64 "\n\n", reason);
+fprintf(stderr, "If you're running a guest on an Intel machine without\n");
+fprintf(stderr, "unrestricted mode support, the failure is most likely\n");
+fprintf(stderr, "due to the guest entering an invalid state for Intel VT.\n");
+fprintf(stderr, "For example, the guest may be running in big real mode\n");
+fprintf(stderr, "which is not supported on less recent Intel processors.\n\n");
+fprintf(stderr, "You may want to try enabling KVM real mode emulation. To\n");
+fprintf(stderr, "enable it, you can run the following commands as root:\n");
+fprintf(stderr, "# rmmod kvm_intel\n");
+fprintf(stderr, "# rmmod kvm\n");
+fprintf(stderr, "# modprobe kvm_intel emulate_invalid_guest_state=1\n\n");
+fprintf(stderr, "WARNING: Real mode emulation is still work-in-progress\n");
+fprintf(stderr, "and thus it is not always guaranteed to work.\n\n");
+return -EINVAL;
+}
 
 static inline void set_gsi(kvm_context_t kvm, unsigned int gsi)
 {
@@ -586,7 +603,7 @@ int kvm_run(CPUState *env)
 r = handle_unhandled(run->hw.hardware_exit_reason);
 break;
 case KVM_EXIT_FAIL_ENTRY:
-r = handle_unhandled(run->fail_entry.hardware_entry_failure_reason);
+r = handle_failed_vmentry(run->fail_entry.hardware_entry_failure_reason);
 break;
 case KVM_EXIT_EXCEPTION:
 fprintf(stderr, "exception %d (%x)\n", run->ex.exception,
-- 
1.7.0.4


Shadow MMU state preserved across kvm_mmu_zap_all?

2010-05-19 Thread Marek Olszewski


Hello,

I'm trying to track down a bug I'm observing in a branched version of
kvm I'm using for research.  I'm hoping someone might be able to point
me in the right direction, as I haven't had any luck with it on my
own.  Here are the details:


I have made some changes to kvm that enable guest user applications to
use duplicate shadow pages to do interesting things (essentially I
duplicate the shadow page table tree for a process multiple times, once
for each thread).  During my tests, my guest application enables this
new feature, completes correctly, and then disables it.  Unfortunately,
after the test application completes, random programs begin segfaulting
for unknown reasons.  This is despite the fact that my changes to KVM no
longer get executed (verified with kgdb).  At first I thought that I
had corrupted the shadow page tables somehow; however, calling
kvm_mmu_zap_all does not solve the problem.  I then figured I had
corrupted the guest OS somehow, but the problem persists even if I
reboot the guest OS.

So my question is this: Are there any other data structures that survive
both a call to kvm_mmu_zap_all and a guest reboot?


Thanks!

Marek


Re: [PATCH] Print a user-friendly message on failed vmentry

2010-05-19 Thread Mohammed Gamal

On Thu, May 20, 2010 at 4:28 AM, Ryan Harper  wrote:
>
> * Mohammed Gamal  [2010-05-19 16:17]:
> > This patch address bug report in 
> > https://bugs.launchpad.net/qemu/+bug/530077.
> >
> > Failed vmentries were handled with handle_unhandled() which prints a rather
> > unfriendly message to the user. This patch separates handling vmentry 
> > failures
> > from unknown exit reasons and prints a friendly message to the user.
> >
> > Signed-off-by: Mohammed Gamal 
> > ---
> >  qemu-kvm.c |   16 +++-
> >  1 files changed, 15 insertions(+), 1 deletions(-)
> >
> > diff --git a/qemu-kvm.c b/qemu-kvm.c
> > index 35a4c8a..deb4df8 100644
> > --- a/qemu-kvm.c
> > +++ b/qemu-kvm.c
> > @@ -106,6 +106,20 @@ static int handle_unhandled(uint64_t reason)
> >      return -EINVAL;
> >  }
> >
> > +static int handle_failed_vmentry(uint64_t reason)
> > +{
> > +    fprintf(stderr, "kvm: vm entry failed with error 0x%" PRIx64 "\n\n", 
> > reason);
> > +    fprintf(stderr, "If you're runnning a guest on an Intel machine, it 
> > can be\n");
> > +    fprintf(stderr, "most-likely due to the guest going into an invalid 
> > state\n");
> > +    fprintf(stderr, "for Intel VT. For example, the guest maybe running in 
> > big\n");
> > +    fprintf(stderr, "real mode which is not supported by Intel VT.\n\n");
>
> We might want to qualify this with certain cpu versions.  IIRC, the VMX
> unrestricted mode should handle big real mode correctly, no?   Maybe,
>
> +    fprintf(stderr, "on some Intel processors. For example, the guest maybe 
> running in big\n");
> +    fprintf(stderr, "real mode which is not supported on most Intel 
> processors.\n\n");
>
Good point. Will correct and resend.

> --
> Ryan Harper
> Software Engineer; Linux Technology Center
> IBM Corp., Austin, Tx
> ry...@us.ibm.com

Re: [PATCH] Print a user-friendly message on failed vmentry

2010-05-19 Thread Ryan Harper

* Mohammed Gamal  [2010-05-19 16:17]:
> This patch address bug report in https://bugs.launchpad.net/qemu/+bug/530077.
> 
> Failed vmentries were handled with handle_unhandled() which prints a rather
> unfriendly message to the user. This patch separates handling vmentry failures
> from unknown exit reasons and prints a friendly message to the user.
> 
> Signed-off-by: Mohammed Gamal 
> ---
>  qemu-kvm.c |   16 +++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/qemu-kvm.c b/qemu-kvm.c
> index 35a4c8a..deb4df8 100644
> --- a/qemu-kvm.c
> +++ b/qemu-kvm.c
> @@ -106,6 +106,20 @@ static int handle_unhandled(uint64_t reason)
>  return -EINVAL;
>  }
> 
> +static int handle_failed_vmentry(uint64_t reason)
> +{
> +fprintf(stderr, "kvm: vm entry failed with error 0x%" PRIx64 "\n\n", 
> reason);
> +fprintf(stderr, "If you're runnning a guest on an Intel machine, it can 
> be\n");
> +fprintf(stderr, "most-likely due to the guest going into an invalid 
> state\n");
> +fprintf(stderr, "for Intel VT. For example, the guest maybe running in 
> big\n");
> +fprintf(stderr, "real mode which is not supported by Intel VT.\n\n");

We might want to qualify this with certain cpu versions.  IIRC, the VMX
unrestricted mode should handle big real mode correctly, no?   Maybe, 

+fprintf(stderr, "on some Intel processors. For example, the guest maybe 
running in big\n");
+fprintf(stderr, "real mode which is not supported on most Intel 
processors.\n\n");

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com

Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Michael S. Tsirkin

On Wed, May 19, 2010 at 11:06:42AM +0300, Avi Kivity wrote:
> On 05/19/2010 10:39 AM, Rusty Russell wrote:
>>
>> I think we're talking about the last 2 entries of the avail ring.  That means
>> the worst case is 1 false bounce every time around the ring.
>
> It's low, but why introduce an inefficiency when you can avoid doing it  
> for the same effort?
> 
>> I think that's
>> why we're debating it instead of measuring it :)
>>
>
> Measure before optimize is good for code but not for protocols.   
> Protocols have to be robust against future changes.  Virtio is warty  
> enough already, we can't keep doing local optimizations.
>
>> Note that this is a exclusive->shared->exclusive bounce only, too.
>>
>
> A bounce is a bounce.
>
> Virtio is already way too bouncy due to the indirection between the  
> avail/used rings and the descriptor pool.  A device with out of order  
> completion (like virtio-blk) will quickly randomize the unused  
> descriptor indexes, so every descriptor fetch will require a bounce.
>
> In contrast, if the rings hold the descriptors themselves instead of  
> pointers, we bounce (sizeof(descriptor)/cache_line_size) cache lines for  
> every descriptor, amortized.

On the other hand, consider that on fast path we are never using all
of the ring. With a good allocator we might be able to keep
reusing only small part of the ring, instead of wrapping around
all of it all of the time.


> -- 
> Do not meddle in the internals of kernels, for they are subtle and quick to 
> panic.

Re: [PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-19 Thread Michael S. Tsirkin

On Wed, May 19, 2010 at 08:04:51PM +0300, Avi Kivity wrote:
> On 05/18/2010 04:19 AM, Michael S. Tsirkin wrote:
>> With PUBLISH_USED_IDX, guest tells us which used entries
>> it has consumed. This can be used to reduce the number
>> of interrupts: after we write a used entry, if the guest has not yet
>> consumed the previous entry, or if the guest has already consumed the
>> new entry, we do not need to interrupt.
>> This improves bandwidth by 30% under some workflows.
>>
>> Signed-off-by: Michael S. Tsirkin
>> ---
>>
>> Rusty, Dave, this patch depends on the patch
>> "virtio: put last seen used index into ring itself"
>> which is currently destined at Rusty's tree.
>> Rusty, if you are taking that one for 2.6.35, please
>> take this one as well.
>> Dave, any objections?
>>
>
> I object: I think the index should have its own cacheline,

The issue here is that host/guest do not know each
other's cache line size. I guess we could just put it
at offset 128 or something like that ... Rusty?

> and that it should be documented before merging.

I think you meant to object to the virtio patch, not this one.  This
patch does not introduce new layout, just implements host support.
virtio spec patch will follow: it is not part of linux tree so
there is no patch dependency.

> -- 
> Do not meddle in the internals of kernels, for they are subtle and quick to 
> panic.

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Anthony Liguori


On 05/19/2010 05:29 PM, Andre Przywara wrote:
> Michael Tokarev wrote:
>> ...
>>
>> Also, thanks to Andre Przywara, whole winNT thing works but it requires
>> -cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU.  This
>> is also testing, but it's not obvious what to do with the result...
>
> Can't we use the file based CPU models for that? Actually it looks
> like a template config file for certain guest operating systems (like
> -vga std -net nic,model=ne2k_pci for older Windows versions) makes
> sense. These could include all quirks that we find on the way.

It'd be nice if we had more flexibility in defining custom machine types
so you could just do qemu -M win98.

Regards,

Anthony Liguori

> BTW: Does anyone know what the problem with Windows95/98 on KVM is? I
> tried some tracing today, but couldn't find a hint.
>
> Regards,
> Andre.




[PATCH] kvm: Switch kvm_update_guest_debug to run_on_cpu

2010-05-19 Thread Jan Kiszka

From: Jan Kiszka 

Guest debugging under KVM is currently broken once io-threads are
enabled. Easily fixable by switching the fake on_vcpu to the real
run_on_cpu implementation.

Signed-off-by: Jan Kiszka 
---
 kvm-all.c |   12 +---
 1 files changed, 1 insertions(+), 11 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index c238f54..5684e51 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1033,16 +1033,6 @@ void kvm_setup_guest_memory(void *start, size_t size)
 }
 
 #ifdef KVM_CAP_SET_GUEST_DEBUG
-static void on_vcpu(CPUState *env, void (*func)(void *data), void *data)
-{
-#ifdef CONFIG_IOTHREAD
-if (env != cpu_single_env) {
-abort();
-}
-#endif
-func(data);
-}
-
 struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *env,
  target_ulong pc)
 {
@@ -1086,7 +1076,7 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap)
 kvm_arch_update_guest_debug(env, &data.dbg);
 data.env = env;
 
-on_vcpu(env, kvm_invoke_set_guest_debug, &data);
+run_on_cpu(env, kvm_invoke_set_guest_debug, &data);
 return data.err;
 }
 
-- 
1.6.0.2

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Andre Przywara


Michael Tokarev wrote:

...

Also, thanks to Andre Przywara, whole winNT thing works but it requires
-cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU.  This
is also testing, but it's not obvious what to do with the result...

Can't we use the file-based CPU models for that? Actually it looks like
a template config file for certain guest operating systems (like -vga
std -net nic,model=ne2k_pci for older Windows versions) makes sense. These
could include all quirks that we find on the way.


BTW: Does anyone know what the problem with Windows95/98 on KVM is? I 
tried some tracing today, but couldn't find a hint.


Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 488-3567-12


Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Anthony Liguori


On 05/19/2010 04:59 PM, Chris Wright wrote:

I'm not sure I have a better suggestion though.
 

Anything else requires inventing some new commandline options and monitor
commands (-pcidevice anyone? ;-).  I'm not sure of the benefit, esp with
hopes of moving to uio.
   

Yeah, I think device passthrough is going to require new command
line syntax to be more qemu friendly...
 

Right, and I think we should do that in the context of redoing the
infrastructure.  What do you think?
   


Yeah, as long as it's understood that this is on the horizon 
(particularly on the libvirt side).


Regards,

Anthony Liguori


thanks,
-chris
   



Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 05/19/2010 04:10 PM, Chris Wright wrote:
> >* Anthony Liguori (anth...@codemonkey.ws) wrote:
> >>An fd as a qdev property seems like a bad idea to me.
> >What is your concern?
> 
> qdev properties are supposed to represent device tunables.  A file
> descriptor is not a tunable.

It tunes which config file to read from ;-)

> It's like passing the tap device fd via qdev to an e1000.
> 
> >>I'm not sure I have a better suggestion though.
> >Anything else requires inventing some new commandline options and monitor
> >commands (-pcidevice anyone? ;-).  I'm not sure of the benefit, esp with
> >hopes of moving to uio.
> 
> Yeah, I think device passthrough is going to require new command
> line syntax to be more qemu friendly...

Right, and I think we should do that in the context of redoing the
infrastructure.  What do you think?

thanks,
-chris

Re: [PATCH] Print a user-friendly message on failed vmentry

2010-05-19 Thread Anthony Liguori


On 05/19/2010 04:16 PM, Mohammed Gamal wrote:

This patch addresses the bug report at https://bugs.launchpad.net/qemu/+bug/530077.

Failed vmentries were handled with handle_unhandled() which prints a rather
unfriendly message to the user. This patch separates handling vmentry failures
from unknown exit reasons and prints a friendly message to the user.

Signed-off-by: Mohammed Gamal
---
  qemu-kvm.c |   16 +++-
  1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 35a4c8a..deb4df8 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -106,6 +106,20 @@ static int handle_unhandled(uint64_t reason)
  return -EINVAL;
  }

+static int handle_failed_vmentry(uint64_t reason)
+{
+fprintf(stderr, "kvm: vm entry failed with error 0x%" PRIx64 "\n\n", reason);
+fprintf(stderr, "If you're running a guest on an Intel machine, it can be\n");
+fprintf(stderr, "most likely due to the guest going into an invalid state\n");
+fprintf(stderr, "for Intel VT. For example, the guest may be running in big\n");
+fprintf(stderr, "real mode which is not supported by Intel VT.\n\n");
+fprintf(stderr, "You may want to try enabling real mode emulation in KVM.\n");
+fprintf(stderr, "To enable it, you may run the following commands as root:\n");
+fprintf(stderr, "# rmmod kvm_intel\n");
+fprintf(stderr, "# rmmod kvm\n");
+fprintf(stderr, "# modprobe kvm_intel emulate_invalid_guest_state=1\n");
+return -EINVAL;
+}

   


Very nice.

Regards,

Anthony Liguori


  static inline void set_gsi(kvm_context_t kvm, unsigned int gsi)
  {
@@ -586,7 +600,7 @@ int kvm_run(CPUState *env)
  r = handle_unhandled(run->hw.hardware_exit_reason);
  break;
  case KVM_EXIT_FAIL_ENTRY:
-r = handle_unhandled(run->fail_entry.hardware_entry_failure_reason);
+r = handle_failed_vmentry(run->fail_entry.hardware_entry_failure_reason);
  break;
  case KVM_EXIT_EXCEPTION:
  fprintf(stderr, "exception %d (%x)\n", run->ex.exception,
   



Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Anthony Liguori


On 05/19/2010 04:10 PM, Chris Wright wrote:

* Anthony Liguori (anth...@codemonkey.ws) wrote:
   

An fd as a qdev property seems like a bad idea to me.
 

What is your concern?
   


qdev properties are supposed to represent device tunables.  A file 
descriptor is not a tunable.


It's like passing the tap device fd via qdev to an e1000.


I'm not sure I have a better suggestion though.
 

Anything else requires inventing some new commandline options and monitor
commands (-pcidevice anyone? ;-).  I'm not sure of the benefit, esp with
hopes of moving to uio.
   


Yeah, I think device passthrough is going to require new command line 
syntax to be more qemu friendly...


Regards,

Anthony Liguori


thanks,
-chris
   



[PATCH] Print a user-friendly message on failed vmentry

2010-05-19 Thread Mohammed Gamal

This patch addresses the bug report at https://bugs.launchpad.net/qemu/+bug/530077.

Failed vmentries were handled with handle_unhandled() which prints a rather
unfriendly message to the user. This patch separates handling vmentry failures
from unknown exit reasons and prints a friendly message to the user.

Signed-off-by: Mohammed Gamal 
---
 qemu-kvm.c |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 35a4c8a..deb4df8 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -106,6 +106,20 @@ static int handle_unhandled(uint64_t reason)
 return -EINVAL;
 }
 
+static int handle_failed_vmentry(uint64_t reason)
+{
+fprintf(stderr, "kvm: vm entry failed with error 0x%" PRIx64 "\n\n", reason);
+fprintf(stderr, "If you're running a guest on an Intel machine, it can be\n");
+fprintf(stderr, "most likely due to the guest going into an invalid state\n");
+fprintf(stderr, "for Intel VT. For example, the guest may be running in big\n");
+fprintf(stderr, "real mode which is not supported by Intel VT.\n\n");
+fprintf(stderr, "You may want to try enabling real mode emulation in KVM.\n");
+fprintf(stderr, "To enable it, you may run the following commands as root:\n");
+fprintf(stderr, "# rmmod kvm_intel\n");
+fprintf(stderr, "# rmmod kvm\n");
+fprintf(stderr, "# modprobe kvm_intel emulate_invalid_guest_state=1\n");
+return -EINVAL;
+}
 
 static inline void set_gsi(kvm_context_t kvm, unsigned int gsi)
 {
@@ -586,7 +600,7 @@ int kvm_run(CPUState *env)
 r = handle_unhandled(run->hw.hardware_exit_reason);
 break;
 case KVM_EXIT_FAIL_ENTRY:
-r = handle_unhandled(run->fail_entry.hardware_entry_failure_reason);
+r = handle_failed_vmentry(run->fail_entry.hardware_entry_failure_reason);
 break;
 case KVM_EXIT_EXCEPTION:
 fprintf(stderr, "exception %d (%x)\n", run->ex.exception,
-- 
1.7.0.4


[PATCH] VMX: Properly return error to userspace on vmentry failure

2010-05-19 Thread Mohammed Gamal

The vmexit handler returns KVM_EXIT_UNKNOWN since there is no handler
for vmentry failures. This intercepts vmentry failures and returns
KVM_EXIT_FAIL_ENTRY to userspace instead.

Signed-off-by: Mohammed Gamal 
---
 arch/x86/kvm/vmx.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 99ae513..4edcffb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3665,6 +3665,13 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
if (enable_ept && is_paging(vcpu))
vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
 
+   if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
+   vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+   vcpu->run->fail_entry.hardware_entry_failure_reason
+   = exit_reason & ~VMX_EXIT_REASONS_FAILED_VMENTRY;
+   return 0;
+   }
+
if (unlikely(vmx->fail)) {
vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
vcpu->run->fail_entry.hardware_entry_failure_reason
-- 
1.7.0.4
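
For readers following the bit test in the hunk above, here is a minimal userspace sketch of the same decoding. It assumes bit 31 of the exit reason flags a failed VM entry, which is what the kernel's VMX_EXIT_REASONS_FAILED_VMENTRY mask covers; the demo_* names are hypothetical and exist only so the sketch builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's VMX_EXIT_REASONS_FAILED_VMENTRY mask
 * (bit 31), redefined under a demo name for a standalone build. */
#define DEMO_FAILED_VMENTRY 0x80000000u

struct demo_exit {
    int failed_entry;      /* non-zero when the VM entry itself failed */
    uint32_t basic_reason; /* exit reason with the failure bit cleared */
};

/* Split a raw exit reason the same way the patch does before filling
 * in hardware_entry_failure_reason. */
static struct demo_exit demo_decode_exit(uint32_t exit_reason)
{
    struct demo_exit d;

    d.failed_entry = (exit_reason & DEMO_FAILED_VMENTRY) != 0;
    d.basic_reason = exit_reason & ~DEMO_FAILED_VMENTRY;
    assert((d.basic_reason & DEMO_FAILED_VMENTRY) == 0);
    return d;
}
```

A raw value of 0x80000021 would decode as a failed entry with basic reason 0x21, while 0x30 would pass through as an ordinary exit.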


Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> An fd as a qdev property seems like a bad idea to me.

What is your concern?

> I'm not sure I have a better suggestion though.

Anything else requires inventing some new commandline options and monitor
commands (-pcidevice anyone? ;-).  I'm not sure of the benefit, esp with
hopes of moving to uio.

thanks,
-chris

Re: [PATCH 10/12] kvm: enable smp > 1

2010-05-19 Thread Udo Lembke


Avi Kivity wrote:

On 05/19/2010 12:57 PM, Udo Lembke wrote:

Jan Kiszka wrote:

...
--enable-io-thread?

If you had it disabled, it would also answer my question if -smp works
without problems without that feature.

Jan


Hi,
I have a dumb question: what is "--enable-io-thread"? Is this a 
kvm switch?


It's a ./configure switch for upstream qemu (don't use with qemu-kvm 
yet).


My kvm 0.12.4 doesn't accept this switch. I only know "threads=n" as an 
smp parameter and "aio=threads" as a drive parameter.


I ask because I am looking for a way to get better I/O performance from 
Windows guests with more than one CPU...


Unrelated, what are your smp issues?


If I use one CPU I get good I/O performance:
e.g. over 500MB/s with the "install" profile of the I/O benchmark h2benchw.exe.
( aio=threads | SAS-Raid-0 | ftp://ftp.heise.de/pub/ct/ctsi/h2benchw.zip 
| h2benchw.exe -p -w iotest 0)

The same test with two CPUs gives results between 27 and 298 MB/s!

It's also noticeable in real life, not only in a benchmark. I use a 
Windows VM with two CPUs for PostScript ripping and see a performance drop 
due to the bad I/O.


Udo

Re: KVM call agenda for May 18

2010-05-19 Thread Anthony Liguori


On 05/19/2010 03:20 AM, Christoph Hellwig wrote:

On Tue, May 18, 2010 at 08:52:36AM -0500, Anthony Liguori wrote:
   

This should be filed in launchpad as a qemu bug and it should be tested
against the latest git.  This bug sounds like we're using an int to
represent sector offset somewhere but there's not enough info in the bug
report to figure out for sure.  I just audited the virtio-blk ->  raw ->
aio=threads path and I don't see an obvious place that we're getting it
wrong.
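
The kind of bug suspected here can be sketched in a few lines. This is only an illustration of a sector value passing through a 32-bit type, assuming 512-byte sectors; it uses unsigned arithmetic so the wraparound is well defined, and the demo_* helpers are hypothetical, not code from the virtio-blk or raw paths.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u

/* Buggy variant: the sector number is squeezed into 32 bits before the
 * multiply, so the resulting byte offset wraps modulo 2^32. */
static uint64_t demo_offset_narrow(uint64_t sector)
{
    uint32_t s = (uint32_t)sector;

    return (uint32_t)(s * DEMO_SECTOR_SIZE);
}

/* Correct variant: 64-bit arithmetic from start to finish. */
static uint64_t demo_offset_wide(uint64_t sector)
{
    uint64_t off = sector * DEMO_SECTOR_SIZE;

    assert(off / DEMO_SECTOR_SIZE == sector); /* no 64-bit overflow */
    return off;
}
```

Both helpers agree for small sectors, but at sector 8388608 (the 4 GiB mark) the narrow one wraps back to offset 0, which would send I/O to the start of the disk.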
 

FYI: I'm going to ignore everything that's in launchpad - even more than
in the stupid SF bugtracker.  While the SF one is almost unusable,
launchpad is entirely unusable.  If you don't have an account with the
evil spacement empire you can't even check the email addresses of the
reporters, so any communication with them is entirely impossible.
   


All bug traffic will now come to the list and you can just respond 
directly to that.  The mails include the submitter's contact information.


Regards,

Anthony Liguori


It's time we get a proper bugzilla.qemu.org for both qemu and qemu-kvm
that can be used sanely.  If you ask nicely you might even get a virtual
instance of bugzilla.kernel.org which works quite nicely.

   



Re: kvm: network problem with Solaris 10u8 guest

2010-05-19 Thread Michael Tokarev


19.05.2010 21:09, Avi Kivity wrote:

On 05/19/2010 04:46 PM, Harald Dunkel wrote:

I am trying to run Solaris 10u8 as a guest in kvm (kernel
2.6.33.2). Problem: The virtual network devices don't work
with this Solaris version.

e1000 and pcnet work just by chance, as it seems. I can ping
the guest (even though some packets are lost). I cannot use
ssh to login.

rtl8139 and ne2k_pci are not even listed by "ifconfig -a" on
the guest.

Solaris 10u6 worked fine (using the e1000 emulation). Same for
the Linux guests.


Does opensolaris exhibit the same problems? If so, you can probably
bisect the driver to find the change that broke the device. With that we
can probably deduce if it is the device or driver that is broken, and
what the issue is.


I verified this right after Harald posted his original
bug report against the Debian qemu-kvm package (*).  No,
opensolaris does NOT show this bug.

I even tried the Solaris install image -- sol-10-u8-ga-x86-dvd.iso,
but it works here as far as I can see.  I performed only basic
tests, however, basically because I just don't remember how
to _use_ Solaris (it's been about 10 years), and can only
do some telnet/ping/ftp.

(*) http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579751

/mjt

[RFC PATCH 1/1] ceph/rbd block driver for qemu-kvm

2010-05-19 Thread Christian Brunner

The attached patch is a block driver for the distributed file system
Ceph (http://ceph.newdream.net/). This driver uses librados (which
is part of the Ceph server) for direct access to the Ceph object
store and is running entirely in userspace. Therefore it is
called "rbd" - rados block device.

To compile the driver a recent version of ceph (>= 0.20.1) is needed
and you have to pass "--enable-rbd" when running configure.

Additional information is available on the Ceph-Wiki:

http://ceph.newdream.net/wiki/Kvm-rbd

---
 Makefile  |3 +
 Makefile.objs |1 +
 block/rados.h |  376 ++
 block/rbd.c   |  585 +
 block/rbd_types.h |   48 +
 configure |   27 +++
 6 files changed, 1040 insertions(+), 0 deletions(-)
 create mode 100644 block/rados.h
 create mode 100644 block/rbd.c
 create mode 100644 block/rbd_types.h

diff --git a/Makefile b/Makefile
index eb9e02b..b1ab3e9 100644
--- a/Makefile
+++ b/Makefile
@@ -27,6 +27,9 @@ configure: ;
 $(call set-vpath, $(SRC_PATH):$(SRC_PATH)/hw)
 
 LIBS+=-lz $(LIBS_TOOLS)
+ifdef CONFIG_RBD
+LIBS+=-lrados
+endif
 
 ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8
diff --git a/Makefile.objs b/Makefile.objs
index acbaf22..85791ac 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -18,6 +18,7 @@ block-nested-y += parallels.o nbd.o blkdebug.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
+block-nested-$(CONFIG_RBD) += rbd.o
 
 block-obj-y +=  $(addprefix block/, $(block-nested-y))
 
diff --git a/block/rados.h b/block/rados.h
new file mode 100644
index 000..6cde9a1
--- /dev/null
+++ b/block/rados.h
@@ -0,0 +1,376 @@
+#ifndef __RADOS_H
+#define __RADOS_H
+
+/*
+ * Data types for the Ceph distributed object storage layer RADOS
+ * (Reliable Autonomic Distributed Object Store).
+ */
+
+
+
+/*
+ * osdmap encoding versions
+ */
+#define CEPH_OSDMAP_INC_VERSION 5
+#define CEPH_OSDMAP_INC_VERSION_EXT 5
+#define CEPH_OSDMAP_VERSION 5
+#define CEPH_OSDMAP_VERSION_EXT 5
+
+/*
+ * fs id
+ */
+struct ceph_fsid {
+   unsigned char fsid[16];
+};
+
+static inline int ceph_fsid_compare(const struct ceph_fsid *a,
+   const struct ceph_fsid *b)
+{
+   return memcmp(a, b, sizeof(*a));
+}
+
+/*
+ * ino, object, etc.
+ */
+typedef __le64 ceph_snapid_t;
+#define CEPH_SNAPDIR ((__u64)(-1))  /* reserved for hidden .snap dir */
+#define CEPH_NOSNAP  ((__u64)(-2))  /* "head", "live" revision */
+#define CEPH_MAXSNAP ((__u64)(-3))  /* largest valid snapid */
+
+struct ceph_timespec {
+   __le32 tv_sec;
+   __le32 tv_nsec;
+} __attribute__ ((packed));
+
+
+/*
+ * object layout - how objects are mapped into PGs
+ */
+#define CEPH_OBJECT_LAYOUT_HASH 1
+#define CEPH_OBJECT_LAYOUT_LINEAR   2
+#define CEPH_OBJECT_LAYOUT_HASHINO  3
+
+/*
+ * pg layout -- how PGs are mapped onto (sets of) OSDs
+ */
+#define CEPH_PG_LAYOUT_CRUSH  0
+#define CEPH_PG_LAYOUT_HASH   1
+#define CEPH_PG_LAYOUT_LINEAR 2
+#define CEPH_PG_LAYOUT_HYBRID 3
+
+
+/*
+ * placement group.
+ * we encode this into one __le64.
+ */
+struct ceph_pg {
+   __le16 preferred; /* preferred primary osd */
+   __le16 ps;/* placement seed */
+   __le32 pool;  /* object pool */
+} __attribute__ ((packed));
+
+/*
+ * pg_pool is a set of pgs storing a pool of objects
+ *
+ *  pg_num -- base number of pseudorandomly placed pgs
+ *
+ *  pgp_num -- effective number when calculating pg placement.  this
+ * is used for pg_num increases.  new pgs result in data being "split"
+ * into new pgs.  for this to proceed smoothly, new pgs are initially
+ * colocated with their parents; that is, pgp_num doesn't increase
+ * until the new pgs have successfully split.  only _then_ are the new
+ * pgs placed independently.
+ *
+ *  lpg_num -- localized pg count (per device).  replicas are randomly
+ * selected.
+ *
+ *  lpgp_num -- as above.
+ */
+#define CEPH_PG_TYPE_REP 1
+#define CEPH_PG_TYPE_RAID4   2
+#define CEPH_PG_POOL_VERSION 2
+struct ceph_pg_pool {
+   __u8 type;/* CEPH_PG_TYPE_* */
+   __u8 size;/* number of osds in each pg */
+   __u8 crush_ruleset;   /* crush placement rule */
+   __u8 object_hash; /* hash mapping object name to ps */
+   __le32 pg_num, pgp_num;   /* number of pg's */
+   __le32 lpg_num, lpgp_num; /* number of localized pg's */
+   __le32 last_change;   /* most recent epoch changed */
+   __le64 snap_seq;  /* seq for per-pool snapshot */
+   __le32 snap_epoch;/* epoch of last snap */
+   __le32 num_snaps;
+   __le32 num_removed_snap_intervals; /* if non-empty, NO per-pool snaps */
+   __le64 auid;   /* who owns the pg */
+} __attribute__ ((packed));
+
+/*
+ * stable_mod func is used to control number of

[RFC PATCH 0/1] ceph/rbd block driver for qemu-kvm

2010-05-19 Thread Christian Brunner

Hi,

this patch is a block driver for the distributed file system Ceph
(http://ceph.newdream.net/). Ceph was included in the Linux v2.6.34
kernel. However, this driver uses librados (which is part of the Ceph
server) for direct access to the Ceph object store and is running entirely
in userspace. Therefore it is called "rbd" - rados block device.

The basic idea is to stripe a VM block device over (by default) 4MB objects
stored in the Ceph distributed object store.  This is very similar to what
the sheepdog project is doing, but uses the ceph server as a storage backend.
If you don't plan on using the entire ceph filesystem you may leave out the
metadata service of ceph.
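
As a rough illustration of the striping idea, the sketch below maps a byte offset on the virtual block device to an object number and an offset inside that object. It assumes a plain linear mapping and takes "4MB" to mean 4 MiB; the real mapping and object naming live in block/rbd.c, which is not shown here, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Default object size quoted in the mail: 4 MB, taken here as 4 MiB. */
#define DEMO_OBJ_SIZE (4ull * 1024 * 1024)

struct demo_extent {
    uint64_t obj_no;  /* which object holds the byte */
    uint64_t obj_off; /* offset of the byte inside that object */
};

/* Assumed linear striping: consecutive 4 MiB chunks of the device land
 * in consecutive objects. */
static struct demo_extent demo_map_offset(uint64_t dev_off)
{
    struct demo_extent e;

    e.obj_no = dev_off / DEMO_OBJ_SIZE;
    e.obj_off = dev_off % DEMO_OBJ_SIZE;
    assert(e.obj_off < DEMO_OBJ_SIZE);
    return e;
}
```

Under that assumption, a request that crosses a 4 MiB boundary has to be split into one piece per object before it is sent to the object store.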

Yehuda Sadeh helped me with the implementation and put some additional
usage information on the Ceph-Wiki (http://ceph.newdream.net/wiki/Kvm-rbd).
He has also written a Linux kernel driver to make an rbd image accessible as
a block device.

To compile the driver a recent version of ceph (>= 0.20.1) is needed and
you have to pass "--enable-rbd" when running configure.

Since our tests were quite promising, I would like to have feedback from
other qemu-kvm developers/users. If someone could check the AIO handling
of the driver I would be glad, too.

Thanks,
Christian

PS: The patch is based on git://repo.or.cz/qemu/kevin.git block

---
 Makefile  |3 +
 Makefile.objs |1 +
 block/rados.h |  376 ++
 block/rbd.c   |  585 +
 block/rbd_types.h |   48 +
 configure |   27 +++
 6 files changed, 1040 insertions(+), 0 deletions(-)
 create mode 100644 block/rados.h
 create mode 100644 block/rbd.c
 create mode 100644 block/rbd_types.h


Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Anthony Liguori


On 05/19/2010 02:00 PM, Chris Wright wrote:

When libvirt launches a guest it first chowns the relevant
/sys/bus/pci/.../config file for an assigned device then drops privileges.

This causes an issue for device assignment because despite being file
owner, the sysfs config space file checks for CAP_SYS_ADMIN before
allowing access to device dependent config space.

This adds a new qdev configfd property which allows libvirt to open the
sysfs config space file and give qemu an already opened file descriptor.
Along with a change pending for the 2.6.35 kernel, this allows the
capability check to compare against privileges from when the file was
opened.

Signed-off-by: Chris Wright
   


An fd as a qdev property seems like a bad idea to me.  I'm not sure I 
have a better suggestion though.


Regards,

Anthony Liguori


---
  hw/device-assignment.c |   12 
  1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index eb31c78..172f0c9 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -612,12 +612,15 @@ static int get_real_device(AssignedDevice *pci_dev, uint16_t r_seg,

  snprintf(name, sizeof(name), "%sconfig", dir);

-fd = open(name, O_RDWR);
+fd = dev->config_fd;
  if (fd == -1) {
-fprintf(stderr, "%s: %s: %m\n", __func__, name);
-return 1;
+fd = open(name, O_RDWR);
+if (fd == -1) {
+fprintf(stderr, "%s: %s: %m\n", __func__, name);
+return 1;
+}
+dev->config_fd = fd;
  }
-dev->config_fd = fd;
  again:
  r = read(fd, pci_dev->dev.config, pci_config_size(&pci_dev->dev));
  if (r < 0) {
@@ -1433,6 +1436,7 @@ static PCIDeviceInfo assign_info = {
  .qdev.props   = (Property[]) {
  DEFINE_PROP("host", AssignedDevice, host, qdev_prop_hostaddr, PCIHostDevice),
  DEFINE_PROP_UINT32("iommu", AssignedDevice, use_iommu, 1),
+DEFINE_PROP_INT32("configfd", AssignedDevice, real_device.config_fd, -1),
  DEFINE_PROP_END_OF_LIST(),
  },
  };
   



Re: [PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Alex Williamson

On Wed, 2010-05-19 at 12:00 -0700, Chris Wright wrote:
> When libvirt launches a guest it first chowns the relevant
> /sys/bus/pci/.../config file for an assigned device then drops privileges.
> 
> This causes an issue for device assignment because despite being file
> owner, the sysfs config space file checks for CAP_SYS_ADMIN before
> allowing access to device dependent config space.
> 
> This adds a new qdev configfd property which allows libvirt to open the
> sysfs config space file and give qemu an already opened file descriptor.
> Along with a change pending for the 2.6.35 kernel, this allows the
> capability check to compare against privileges from when the file was
> opened.
> 
> Signed-off-by: Chris Wright 
> ---
>  hw/device-assignment.c |   12 
>  1 files changed, 8 insertions(+), 4 deletions(-)

Acked-by: Alex Williamson 


[PATCH qemu-kvm] device-assignment: add config fd qdev property

2010-05-19 Thread Chris Wright

When libvirt launches a guest it first chowns the relevant
/sys/bus/pci/.../config file for an assigned device then drops privileges.

This causes an issue for device assignment because despite being file
owner, the sysfs config space file checks for CAP_SYS_ADMIN before
allowing access to device dependent config space.

This adds a new qdev configfd property which allows libvirt to open the
sysfs config space file and give qemu an already opened file descriptor.
Along with a change pending for the 2.6.35 kernel, this allows the
capability check to compare against privileges from when the file was
opened.

Signed-off-by: Chris Wright 
---
 hw/device-assignment.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index eb31c78..172f0c9 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -612,12 +612,15 @@ static int get_real_device(AssignedDevice *pci_dev, uint16_t r_seg,
 
 snprintf(name, sizeof(name), "%sconfig", dir);
 
-fd = open(name, O_RDWR);
+fd = dev->config_fd;
 if (fd == -1) {
-fprintf(stderr, "%s: %s: %m\n", __func__, name);
-return 1;
+fd = open(name, O_RDWR);
+if (fd == -1) {
+fprintf(stderr, "%s: %s: %m\n", __func__, name);
+return 1;
+}
+dev->config_fd = fd;
 }
-dev->config_fd = fd;
 again:
 r = read(fd, pci_dev->dev.config, pci_config_size(&pci_dev->dev));
 if (r < 0) {
@@ -1433,6 +1436,7 @@ static PCIDeviceInfo assign_info = {
 .qdev.props   = (Property[]) {
 DEFINE_PROP("host", AssignedDevice, host, qdev_prop_hostaddr, PCIHostDevice),
 DEFINE_PROP_UINT32("iommu", AssignedDevice, use_iommu, 1),
+DEFINE_PROP_INT32("configfd", AssignedDevice, real_device.config_fd, -1),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
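
The first hunk boils down to "use the descriptor the management tool handed over, otherwise open the file ourselves". A minimal standalone sketch of that fallback follows. The helper name and its signature are hypothetical; only the -1 default and the open(name, O_RDWR) call are taken from the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns a usable descriptor, or -1 on failure. *config_fd is -1 when
 * the caller passed nothing in; it is filled in after a successful
 * open() so later calls reuse the same descriptor. */
static int demo_get_config_fd(int *config_fd, const char *path)
{
    assert(config_fd != NULL && path != NULL);

    if (*config_fd == -1) {
        int fd = open(path, O_RDWR);

        if (fd == -1) {
            perror(path);
            return -1;
        }
        *config_fd = fd;
    }
    return *config_fd;
}
```

When a descriptor was supplied up front, the path is never touched, which is the property that lets an unprivileged qemu read config space through a file that a privileged libvirt opened.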

Re: Gentoo guest with smp: emerge freeze while recompile world

2010-05-19 Thread Alexander Graf



On 19.05.2010 at 18:25, Avi Kivity wrote:


On 05/19/2010 11:33 AM, Riccardo wrote:


Hi,
I have a dual Xeon quad-core server with Gentoo and qemu:
app-emulation/qemu-kvm-0.12.3-r1  USE="aio gnutls ncurses sasl vde -alsa -bluetooth -curl -esd -fdt -hardened -kvm-trace -pulseaudio -qemu-ifup -sdl -static"

Any suggestions?
Is it possible to enable logging somewhere, and for which service?



These are almost impossible to debug.

Try copying vmlinux out of your guest and attach with gdb when it  
hangs.  Then issue the command


 (gdb) thread apply all backtrace

to see what the guest is doing.


Another thing that comes to mind is sysrq. Do ctrl-alt-2 and type
"sendkey alt-print-l" (IIRC, please check the docs for sysrq letters).
That should give you a backtrace for stuck CPUs. If that doesn't help,
go with dumping the state of all processes. For easy dumping, boot the
guest with -serial stdio and pass 'console=ttyS0' on the kernel
command line.


Alex






Re: kvm: network problem with Solaris 10u8 guest

2010-05-19 Thread Avi Kivity


On 05/19/2010 04:46 PM, Harald Dunkel wrote:

Hi folks,

   


Please post kvm issues to the kvm mailing list.


I am trying to run Solaris 10u8 as a guest in kvm (kernel
2.6.33.2). Problem: The virtual network devices don't work
with this Solaris version.

e1000 and pcnet work just by chance, as it seems. I can ping
the guest (even though some packets are lost). I cannot use
ssh to login.

rtl8139 and ne2k_pci are not even listed by "ifconfig -a" on
the guest.

Solaris 10u6 worked fine (using the e1000 emulation). Same for
the Linux guests.

Can anybody reproduce this problem?


Any helpful comment would be highly appreciated. Of course
I would be glad to help to track this down.
   


Does opensolaris exhibit the same problems?  If so, you can probably 
bisect the driver to find the change that broke the device.  With that 
we can probably deduce if it is the device or driver that is broken, and 
what the issue is.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-19 Thread Avi Kivity


On 05/18/2010 04:19 AM, Michael S. Tsirkin wrote:

With PUBLISH_USED_IDX, guest tells us which used entries
it has consumed. This can be used to reduce the number
of interrupts: after we write a used entry, if the guest has not yet
consumed the previous entry, or if the guest has already consumed the
new entry, we do not need to interrupt.
This improves bandwidth by 30% under some workflows.

Signed-off-by: Michael S. Tsirkin
---

Rusty, Dave, this patch depends on the patch
"virtio: put last seen used index into ring itself"
which is currently destined at Rusty's tree.
Rusty, if you are taking that one for 2.6.35, please
take this one as well.
Dave, any objections?
   


I object: I think the index should have its own cacheline, and that it 
should be documented before merging.
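
For reference while the layout is being discussed, the suppression rule from the patch description can be sketched as below. This assumes the guest-published index counts the used entries it has consumed so far, and that both indices are free-running 16-bit counters; the function is hypothetical and is not taken from the vhost patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* new_used_idx: the ring's used index after the host added one entry.
 * guest_seen:   the index the guest published (assumed: number of used
 *               entries consumed so far).
 * Returns non-zero when an interrupt is needed. */
static int demo_need_interrupt(uint16_t new_used_idx, uint16_t guest_seen)
{
    /* 16-bit subtraction wraps, so this stays correct across overflow. */
    uint16_t pending = (uint16_t)(new_used_idx - guest_seen);

    /* pending == 0: the guest already consumed the new entry.
     * pending >= 2: the guest has not consumed the previous entry yet,
     *               so it is still working and will find the new one.
     * pending == 1: only the new entry is outstanding, so interrupt. */
    return pending == 1;
}
```

So with a used index of 5, a published index of 4 triggers an interrupt, while 3 or 5 does not.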


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: KVM call minutes for May 18

2010-05-19 Thread Avi Kivity


On 05/18/2010 05:29 PM, Chris Wright wrote:


sourceforge bug tracker...
- sucks
- unclear if there's active triage
- anthony prefers the launchpad instance
   


Kernel bugs can go to bugzilla.kernel.org.  Of course it isn't always 
clear if a bug is a kernel or qemu bug.  Recommend we ask users to post 
to the list first, get some guidance, then the bug is either resolved or 
we ask them to file a bug report.  Should improve the experience for 
users and developers.



- alex likes the sf email to list, would be good to keep that feature
   


Pretty critical IMO.  The little attention the tracker gets is entirely 
due to the email.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Anthony Liguori


On 05/19/2010 09:52 AM, Luiz Capitulino wrote:

On Tue, 18 May 2010 17:38:27 -0500
Anthony Liguori  wrote:

   

Hi,

In an effort to improve the 0.13 release quality, I'd like to host a Bug
Day on June 1st, 2010.  I've set up a quick wiki page with some more info
(http://wiki.qemu.org/BugDay/June2010).
 

  Tuesday is our call conf day and other people have reported that they have
more confs during that day. Suggest Jun 2.
   


I don't have a problem with the 2nd.


Here's my basic thinking:

   - Anyone who can should try to spend some time either triaging bugs,
updating bug status, or actually fixing bugs.
 

  And testing, Fedora has a number of test cases already written, but
I guess that just a few are qemu specific:

https://fedoraproject.org/wiki/Test_Day:2010-04-08_Virtualization

  We could link those and write our own, or at least list the major
features to be tested..

  Of course we could have a different day for testing too, but I think
this is a way to get everyone busy.
   


Yup, I think it's a good suggestion.  Please update the wiki if there's 
anything you'd like to add.



   - We'll have a special IRC channel (#qemu-bugday) on OFTC.  As many
QEMU and KVM developers as possible should join this channel for that
day to help assist people working on bugs.
   - We'll try to migrate as many confirmable bugs from the Source Forge
tracker to Launchpad.
 

  Can't this be automated?
   


It could, but I think doing it manually could be helpful in culling bad 
bug reports.



If this is successful, we'll try to have regular bug days.  Any
suggestions on how to make the experience as fun and productive as
possible are certainly appreciated!
 

  Lots of foods and a Monty Python session in the evening.
   


Of course :-)

Regards,

Anthony Liguori


Re: [PATCH] qemu-kvm: Enable xsave related CPUID

2010-05-19 Thread Avi Kivity


On 05/19/2010 11:34 AM, Sheng Yang wrote:

Signed-off-by: Sheng Yang
---
  target-i386/cpuid.c |   32 
   


You can send this to Anthony directly; while tcg doesn't support xsave/ymm, all 
the code here is generic.



  1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index eebf038..21e94f3 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -1067,6 +1067,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
  *ecx = 0;
  *edx = 0;
  break;
+case 0xD:
+/* Processor Extended State */
+if (!(env->cpuid_ext_features & CPUID_EXT_XSAVE)) {
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+break;
+}
+if (count == 0) {
+*eax = 0x7;
+*ebx = 0x340;
+*ecx = 0x340;
+*edx = 0;
+} else if (count == 1) {
+/* eax = 1, so we can continue with others */
+*eax = 1;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+} else if (count == 2) {
+*eax = 0x100;
+*ebx = 0x240;
+*ecx = 0;
+*edx = 0;
+} else {
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+}
+break;
   


Lots of magic numbers.  Symbolic constants or explanatory comments.
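
As an illustration of what symbolic constants could look like here, a rough
sketch follows.  The XSTATE_* bit names match the ones used in the KVM patch
in this thread; every other name (the XSAVE_* sizes, struct cpuid_regs,
cpuid_0xd()) is made up for the example, and the sub-leaf 1 value simply
mirrors what the patch returns.

```c
#include <stdint.h>

/* XCR0 feature bits reported in CPUID.0xD sub-leaf 0 EAX. */
#define XSTATE_FP   (1u << 0)   /* x87 state */
#define XSTATE_SSE  (1u << 1)   /* SSE (XMM) state */
#define XSTATE_YMM  (1u << 2)   /* AVX (upper YMM) state */

/* Layout of the XSAVE area, in bytes (hypothetical constant names). */
#define XSAVE_LEGACY_SIZE  0x200  /* FXSAVE-compatible region */
#define XSAVE_HDR_SIZE     0x40   /* XSAVE header */
#define XSAVE_YMM_OFFSET   (XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE)  /* 0x240 */
#define XSAVE_YMM_SIZE     0x100  /* 16 registers x 16 upper bytes */
#define XSAVE_TOTAL_SIZE   (XSAVE_YMM_OFFSET + XSAVE_YMM_SIZE)   /* 0x340 */

struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Same values as the patch, spelled with names instead of literals. */
static struct cpuid_regs cpuid_0xd(uint32_t count)
{
    struct cpuid_regs r = { 0, 0, 0, 0 };

    switch (count) {
    case 0:
        r.eax = XSTATE_FP | XSTATE_SSE | XSTATE_YMM; /* supported XCR0 bits */
        r.ebx = XSAVE_TOTAL_SIZE;  /* size for currently enabled features */
        r.ecx = XSAVE_TOTAL_SIZE;  /* size for all supported features */
        break;
    case 1:
        r.eax = 1;                 /* mirrors the patch */
        break;
    case 2:
        r.eax = XSAVE_YMM_SIZE;    /* size of the YMM component */
        r.ebx = XSAVE_YMM_OFFSET;  /* its offset in the XSAVE area */
        break;
    default:
        break;                     /* unknown sub-leaf: all zeros */
    }
    return r;
}
```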

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH v2] KVM: VMX: Enable XSAVE/XRSTORE for guest

2010-05-19 Thread Avi Kivity


On 05/19/2010 11:34 AM, Sheng Yang wrote:

From: Dexuan Cui

Enable XSAVE/XRSTORE for guest.

Change from V1:

1. Use FPU API.
2. Fix CPUID issue.
3. Save/restore all possible guest xstate fields when switching. Because we
don't know which fields guest has already touched.

Signed-off-by: Dexuan Cui
Signed-off-by: Sheng Yang
---
  arch/x86/include/asm/kvm_host.h |1 +
  arch/x86/include/asm/vmx.h  |1 +
  arch/x86/kvm/vmx.c  |   28 +
  arch/x86/kvm/x86.c  |   85 +++---
  4 files changed, 108 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d08bb4a..78d7b06 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -302,6 +302,7 @@ struct kvm_vcpu_arch {
} update_pte;

struct fpu guest_fpu;
+   uint64_t xcr0, host_xcr0;
   


host_xcr0 can be a global.


  /*
   * Interruption-information format
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 99ae513..2ee8ff6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -36,6 +36,8 @@
  #include
  #include
  #include
+#include
+#include

  #include "trace.h"

@@ -2616,6 +2618,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
if (enable_ept)
vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_PGE;
+   if (cpu_has_xsave)
+   vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_OSXSAVE;
   


First, we should only allow the guest to play with cr4.osxsave if 
guest_has_xsave in cpuid; otherwise we need to #GP if the guest sets 
it.  Second, it may be better to trap when the guest sets it (should be 
rare); this way, we only need to save/restore xcr0 if the guest has 
enabled cr4.osxsave.



vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);

tsc_base = vmx->vcpu.kvm->arch.vm_init_tsc;
@@ -3354,6 +3358,29 @@ static int handle_wbinvd(struct kvm_vcpu *vcpu)
return 1;
  }

+static int handle_xsetbv(struct kvm_vcpu *vcpu)
+{
+   u64 new_bv = ((u64)(kvm_register_read(vcpu, VCPU_REGS_RDX) << 32)) |
+   kvm_register_read(vcpu, VCPU_REGS_RAX);
   


I think you need to trim the upper 32 bits of rax.

Please introduce helpers for reading edx:eax into a u64 and vice versa.  
We can then use the helpers here and in the msr code.
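
A minimal sketch of such helpers, written as plain C outside the kernel so it
stands alone.  The function names are hypothetical and not from the KVM tree;
the point is only that both halves get truncated to 32 bits before being
combined.

```c
#include <stdint.h>

/* Combine the low 32 bits of rdx and rax into one 64-bit value (edx:eax).
 * Masking rax explicitly drops any stale upper 32 bits. */
static inline uint64_t edx_eax_to_u64(uint64_t rdx, uint64_t rax)
{
    return ((rdx & 0xffffffffull) << 32) | (rax & 0xffffffffull);
}

/* Split a 64-bit value back into its edx (high) and eax (low) halves. */
static inline void u64_to_edx_eax(uint64_t value, uint32_t *edx, uint32_t *eax)
{
    *edx = (uint32_t)(value >> 32);
    *eax = (uint32_t)value;
}
```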



+
+   if (kvm_register_read(vcpu, VCPU_REGS_RCX) != 0)
+   goto err;
+   if (vmx_get_cpl(vcpu) != 0)
+   goto err;
+   if (!(new_bv & XSTATE_FP) ||
+(new_bv & ~vcpu->arch.host_xcr0))
+   goto err;
+   if ((new_bv & XSTATE_YMM) && !(new_bv & XSTATE_SSE))
+   goto err;
   


This is a little worrying.  What if a new bit is introduced later that 
depends on other bits?  We'll need to add a dependency between ZMM and 
YMM or whatever, and old versions will be broken.


So I think we need to check xcr0 not against host_xcr0 but instead 
against a whitelist of xcr0 bits that we know how to handle (currently 
fpu, sse, and ymm).
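
Roughly, the whitelist check could look like the sketch below.  It is a
standalone illustration rather than kernel code: KVM_SUPPORTED_XCR0 and
xcr0_is_valid() are hypothetical names, and intersecting the whitelist with
host_xcr0 is my assumption about how the two would be combined.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP   (1ull << 0)
#define XSTATE_SSE  (1ull << 1)
#define XSTATE_YMM  (1ull << 2)

/* Bits whose dependencies this code knows how to validate. */
#define KVM_SUPPORTED_XCR0 (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)

static bool xcr0_is_valid(uint64_t new_bv, uint64_t host_xcr0)
{
    /* x87 state must always be enabled. */
    if (!(new_bv & XSTATE_FP))
        return false;
    /* Reject anything the host lacks or that is not whitelisted, so a
     * future bit with unknown dependencies is never passed through. */
    if (new_bv & ~(host_xcr0 & KVM_SUPPORTED_XCR0))
        return false;
    /* YMM state depends on SSE state. */
    if ((new_bv & XSTATE_YMM) && !(new_bv & XSTATE_SSE))
        return false;
    return true;
}
```

With this shape, adding support for a new xcr0 bit means extending the
whitelist and its dependency checks together.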



+   vcpu->arch.xcr0 = new_bv;
+   xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
+   skip_emulated_instruction(vcpu);
+   return 1;
+err:
+   kvm_inject_gp(vcpu, 0);
+   return 1;
+}
+
   



@@ -149,6 +150,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ NULL }
  };

+static inline u32 bit(int bitno)
+{
+   return 1 << (bitno & 31);
+}
+
  static void kvm_on_user_return(struct user_return_notifier *urn)
  {
unsigned slot;
@@ -473,6 +479,17 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
  }
  EXPORT_SYMBOL_GPL(kvm_lmsw);

+static bool guest_cpuid_has_xsave(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpuid_entry2 *best;
+
+   best = kvm_find_cpuid_entry(vcpu, 1, 0);
+   if (best->ecx & bit(X86_FEATURE_XSAVE))
   


Sanity:  if (best && ...)


+   return true;
+
+   return false;
   


Can avoid the if (): return best && (best->ecx & ...);
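
Put together, the two suggestions amount to something like this standalone
sketch.  The struct and function names are stand-ins for the kernel ones, and
the X86_FEATURE_XSAVE encoding (word 4, bit 26, i.e. CPUID.1:ECX bit 26) is
assumed to mirror the kernel's definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct kvm_cpuid_entry2; only the registers matter here. */
struct cpuid_entry {
    uint32_t eax, ebx, ecx, edx;
};

/* Assumed to mirror the kernel encoding: word 4, bit 26. */
#define X86_FEATURE_XSAVE (4 * 32 + 26)

static inline uint32_t bit(int bitno)
{
    return 1u << (bitno & 31);
}

/* NULL-safe and without the redundant if/return pair. */
static bool entry_has_xsave(const struct cpuid_entry *best)
{
    return best && (best->ecx & bit(X86_FEATURE_XSAVE));
}
```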


+}
+
  int __kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
  {
unsigned long old_cr4 = kvm_read_cr4(vcpu);
@@ -481,6 +498,9 @@ int __kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
if (cr4 & CR4_RESERVED_BITS)
return 1;

+   if (!guest_cpuid_has_xsave(vcpu) && X86_CR4_OSXSAVE)
   


s/&&.*//


+   return 1;
+
if (is_long_mode(vcpu)) {
if (!(cr4 & X86_CR4_PAE))
   



return 1;

@@ -1887,6 +1902,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, 
u32 function,
unsigned f_lm = 0;
  #endif
unsigned f_rdtscp = kvm_x86_ops->rdtscp_supported() ? F(RDTSCP) : 0;
+   unsigned f_xsave = cpu_has_xsave ? F(XSAVE) : 0;

/* cpuid 1.edx */
const u32 kvm_supported_word0_x86_features =
@@ -1916,7 +1932,7 @@ static void do_

Re: Gentoo guest with smp: emerge freeze while recompile world

2010-05-19 Thread Avi Kivity


On 05/19/2010 11:33 AM, Riccardo wrote:


Hi,
I have a dual Xeon quad-core server with Gentoo and qemu:
app-emulation/qemu-kvm-0.12.3-r1  USE="aio gnutls ncurses sasl vde -alsa
-bluetooth -curl -esd -fdt -hardened -kvm-trace -pulseaudio -qemu-ifup -sdl
-static"

There are a lot of VMs running Ubuntu and Fedora without problems.

I installed one VM with the latest Gentoo amd64, stage3 and portage.
When I try to do an emerge -e world, the process freezes after a while, each time
in a different package, and there aren't any errors in the logs.
This is a screenshot of the frozen VM:
http://yfrog.com/0iscre1j

top - 10:00:50 up 10:53,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 130 total,   1 running, 124 sleeping,   0 stopped,   5 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8194776k total,   886508k used,  7308268k free,   225080k buffers
Swap:  2048248k total,        0k used,  2048248k free,   476956k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0  3760  656  552 S    0  0.0   0:00.86 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:01.78 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:02.01 migration/1
    6 root      20   0     0    0    0 S    0  0.0   0:00.01 ksoftirqd/1
    7 root      RT   0     0    0    0 S    0  0.0   0:02.05 migration/2
    8 root      20   0     0    0    0 S    0  0.0   0:00.01 ksoftirqd/2
    9 root      RT   0     0    0    0 S    0  0.0   0:02.15 migration/3
   10 root      20   0     0    0    0 S    0  0.0   0:00.01 ksoftirqd/3
   11 root      RT   0     0    0    0 S    0  0.0   0:01.69 migration/4
   12 root      20   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/4
   13 root      RT   0     0    0    0 S    0  0.0   0:01.49 migration/5
   14 root      20   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/5
   15 root      20   0     0    0    0 S    0  0.0   0:00.11 events/0
   16 root      20   0     0    0    0 S    0  0.0   0:00.21 events/1

ps -elf:
1 S root       776     2  0  80   0 -     0 scsi_e May18 ?        00:00:00 [scsi_eh_1]
1 S root       810     2  0  80   0 -     0 worker May18 ?        00:00:00 [kpsmoused]
1 S root       818     2  0  80   0 -     0 worker May18 ?        00:00:00 [kstriped]
1 S root       821     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/0]
1 S root       822     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/1]
1 S root       823     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/2]
1 S root       824     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/3]
1 S root       825     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/4]
1 S root       826     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpathd/5]
1 S root       827     2  0  80   0 -     0 worker May18 ?        00:00:00 [kmpath_handlerd]
1 S root       828     2  0  80   0 -     0 worker May18 ?        00:00:00 [ksnapd]
1 S root       859     2  0  80   0 -     0 worker May18 ?        00:00:00 [usbhid_resumer]
1 S root       900     2  0  80   0 -     0 kjourn May18 ?        00:00:00 [jbd2/vda3-8]
1 S root       901     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
1 S root       902     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
1 S root       903     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
1 S root       904     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
1 S root       905     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
1 S root       906     2  0  80   0 -     0 worker May18 ?        00:00:00 [ext4-dio-unwrit]
5 S root      1005     1  0  76  -4 -  3098 poll_s May18 ?        00:00:00 /sbin/udevd --daemon
1 S root      2661     1  0  80   0 -  7492 wait   May18 ?        00:00:00 supervising syslog-ng
5 S root      2662  2661  0  80   0 -  7525 poll_s May18 ?        00:00:00 /usr/sbin/syslog-ng
1 S root      3250     1  0  80   0 -  9477 poll_s May18 ?        00:00:00 /usr/sbin/sshd
1 S root      3370     1  0  80   0 -  4086 hrtime May18 ?        00:00:00 /usr/sbin/cron
4 S root      3437     1  0  80   0 - 13988 wait   May18 tty1     00:00:00 /bin/login --
0 S root      3438     1  0  80   0 -  1464 n_tty_ May18 tty2     00:00:00 /sbin/agetty 38400 tty2 linux
0 S root      3439     1  0  80   0 -  1465 n_tty_ May18 tty3     00:00:00 /sbin/agetty 38400 tty3 linux
0 S root      3440     1  0  80   0 -  1464 n_tty_ May18 tty4     00:00:00 /sbin/agetty 38400 tty4 linux
0 S root      3441     1  0  80   0 -  1465 n_tty_ May18 tty5     00:00:00 /sbin/agetty 38400 tty5 linux
0 S root  3442 1  0  80   0 -  1465 n_t

Re: system_powerdown not working for qemu-kvm 0.12.4?

2010-05-19 Thread Avi Kivity


On 05/19/2010 12:23 PM, Teck Choon Giam wrote:

On Sun, May 16, 2010 at 7:52 PM, Avi Kivity  wrote:
   

On 05/15/2010 04:19 AM, Teck Choon Giam wrote:
 

Hi,

Has anyone else encountered system_powerdown no longer working
since upgrading to qemu-kvm 0.12.4?

   

Compared with what version?

--
error compiling committee.c: too many arguments to function


 

Hi Avi,

Sorry for this late reply.

Here are my test results:

freebsd 8: qemu-kvm-0.12.3 system_powerdown not working and
qemu-kvm-0.12.4 system_powerdown not working
centos 5: qemu-kvm-0.12.3 system_powerdown working and qemu-kvm-0.12.4
system_powerdown not working
windows 2008 R2: qemu-kvm-0.12.3 system_powerdown working and
qemu-kvm-0.12.4 system_powerdown not working

If you want me to test on other qemu-kvm version, please state which version.
   


Can you try to bisect between qemu-kvm-0.12.3 and 0.12.4 to see which 
commit introduced the regression?


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH 10/12] kvm: enable smp > 1

2010-05-19 Thread Avi Kivity


On 05/19/2010 12:57 PM, Udo Lembke wrote:

Jan Kiszka wrote:

...
--enable-io-thread?

If you had it disabled, it would also answer my question if -smp works
without problems without that feature.

Jan


Hi,
I have a dumb question: what is "--enable-io-thread"? Is this a 
kvm switch?


It's a ./configure switch for upstream qemu (don't use with qemu-kvm yet).

My kvm 0.12.4 doesn't accept this switch. I only know "threads=n" as an 
smp parameter and "aio=threads" as a drive parameter.


I ask because I am looking for a way to get better I/O performance from 
Windows guests with more than one CPU...


Unrelated, what are your smp issues?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH 1/3] KVM: MMU: split kvm_sync_page() function

2010-05-19 Thread Avi Kivity


On 05/15/2010 01:51 PM, Xiao Guangrong wrote:

Split kvm_sync_page() into kvm_sync_page() and kvm_sync_page_transient()
to clarify the code and address Avi's suggestion.

The kvm_sync_page_transient() function only updates the shadow page; it does not
mark it sync and does not write protect sp->gfn. It will be used by a later patch.
   



Applied all three, thanks.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH][v2 0/3] Convert KVM to use FPU API

2010-05-19 Thread Avi Kivity


On 05/17/2010 12:37 PM, Avi Kivity wrote:

On 05/17/2010 12:08 PM, Sheng Yang wrote:

Change from v1:
Use unlazy_fpu() to handle host FPU, avoiding save/restore of host 
FPU states.




Looks good, will wait a bit for more reviews and apply.



Now applied.  Thanks.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCHv2] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-19 Thread Avi Kivity


On 05/18/2010 08:47 PM, Avi Kivity wrote:

On 05/18/2010 05:21 AM, Michael S. Tsirkin wrote:

With PUBLISH_USED_IDX, guest tells us which used entries
it has consumed. This can be used to reduce the number
of interrupts: after we write a used entry, if the guest has not yet
consumed the previous entry, or if the guest has already consumed the
new entry, we do not need to interrupt.
This improves bandwidth by 30% under some workflows.


Seems to be missing the cacheline alignment.

Rusty's clarification did not satisfy me, I think it's needed.



Oh, and this should definitely follow the patch to the virtio spec, not 
precede it.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH 0/2] AMD Erratum 383 workaround for KVM

2010-05-19 Thread Avi Kivity


On 05/19/2010 03:03 PM, Joerg Roedel wrote:

On Mon, May 17, 2010 at 08:43:33AM -0400, Joerg Roedel wrote:
   

these two patches implement the workaround for AMD Erratum 383 in KVM.
This is necessary to prevent the host from crashing if a guest triggers the
erratum.
For details on the erratum please see page 96 of

http://support.amd.com/us/Processor_TechDocs/41322.pdf

The workaround implemented in these patches will be documented in the
next update of the revision guide.
 

Hey Marcelo, Avi,

have you had a chance to look at this? Any opinions?

   


Sorry, holiday here.  Patches applied, thanks.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Jamie Lokier

Michael Tokarev wrote:
> Anthony Liguori wrote:
> []
> > For the Bug Day, anything is interesting IMHO.  My main interest is to
> > get as many people involved in testing and bug fixing as possible.  If
> > folks are interested in testing specific things like unusual or older
> > OSes, I'm happy to see it!
> 
> Well, interesting or not, but I for one don't know what to do with the
> results.  There was a thread on kvm@ about sigsegv in cirrus code when
> running winNT. The issue has been identified and appears to be fixed,
> as in, kvm process does not SIGSEGV anymore, but it does not work anyway,
> now printing:
> 
>  BUG: kvm_dirty_pages_log_enable_slot: invalid parameters
> 
> with garbled guest display.  Thanks goes to Stefano Stabellini for
> finding the SIGSEGV case, but unfortunately his hard work isn't quite
> useful since the behaviour isn't very much different from the previous
> version... ;)

A "BUG:" is good to see in a bug report: It gives you something
specific to analyse.  Good luck ;-)

Imho, it'd be quite handy to keep a timeline of working/non-working
guests in a table somewhere, and which qemu versions and options they
were observed to work or break with.

> Also, thanks to Andre Przywara, whole winNT thing works but it requires
> -cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU.  This
> is also testing, but it's not obvious what to do with the result...

Doesn't WinNT work with qemu32 or kvm32?
It's a 32-bit OS after all.

- Jamie

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Luiz Capitulino

On Tue, 18 May 2010 17:38:27 -0500
Anthony Liguori  wrote:

> Hi,
> 
> In an effort to improve the 0.13 release quality, I'd like to host a Bug 
> Day on June 1st, 2010.  I've set up a quick wiki page with some more info 
> (http://wiki.qemu.org/BugDay/June2010).

 Tuesday is our call conf day and other people have reported that they have
more confs during that day. Suggest Jun 2.

> Here's my basic thinking:
> 
>   - Anyone who can should try to spend some time either triaging bugs, 
> updating bug status, or actually fixing bugs.

 And testing, Fedora has a number of test cases already written, but
I guess that just a few are qemu specific:

https://fedoraproject.org/wiki/Test_Day:2010-04-08_Virtualization

 We could link those and write our own, or at least list the major
features to be tested..

 Of course we could have a different day for testing too, but I think
this is a way to get everyone busy.

>   - We'll have a special IRC channel (#qemu-bugday) on OFTC.  As many 
> QEMU and KVM developers as possible should join this channel for that 
> day to help assist people working on bugs.
>   - We'll try to migrate as many confirmable bugs from the Source Forge 
> tracker to Launchpad.

 Can't this be automated?

> If this is successful, we'll try to have regular bug days.  Any 
> suggestions on how to make the experience as fun and productive as 
> possible are certainly appreciated!

 Lots of foods and a Monty Python session in the evening.

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Natalia Portillo

Hi,

>> There are a couple of.
>> 
>> The majority of them are because SeaBIOS is not behaving as real hardware 
>> should and must.
>> 
>> If you know any OS that's not in the OS Support List and is broken please 
>> commit.
> 
> If something is broken, please file a bug against it.  The OS Support List is 
> a very useful resource but we still need bugs to actually work on fixing the 
> problems.
The OS Support List clearly specifies that in case of a failure, bug, or 
non-working case, a bug should be filed, so I think using the OS Support List 
includes bugs in Launchpad; however, the reverse is not true.

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Natalia Portillo

Hi,

>>> There have been reports of several legacy OSes being unable to install
>>> or boot in the newer qemu while working in the older one.  They're
>>> probably not in the "OS Support List" though.  Are they effectively
>>> uninteresting for the purpose of the 0.13 release?
>>> 
> 
> For the Bug Day, anything is interesting IMHO.  My main interest is to get as 
> many people involved in testing and bug fixing as possible.  If folks are 
> interested in testing specific things like unusual or older OSes, I'm happy 
> to see it!

Well the idea is not only about unusual or older OSes.
That list detected bugs and regressions appearing in usual and new OSes (for 
example, the infamous SeaBIOS problem with Windows 9x and the mouse also affects 
the brand-new OpenSuSE Linux 11.2).

Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Anthony Liguori


On 05/19/2010 12:04 AM, Aurelien Jarno wrote:

On Tue, May 18, 2010 at 05:38:27PM -0500, Anthony Liguori wrote:
   

Hi,

In an effort to improve the 0.13 release quality, I'd like to host a
Bug Day on June 1st, 2010.  I've set up a quick wiki page with some
more info (http://wiki.qemu.org/BugDay/June2010).

Here's my basic thinking:

  - Anyone who can should try to spend some time either triaging
bugs, updating bug status, or actually fixing bugs.
  - We'll have a special IRC channel (#qemu-bugday) on OFTC.  As many
QEMU and KVM developers as possible should join this channel for
that day to help assist people working on bugs.
  - We'll try to migrate as many confirmable bugs from the Source
Forge tracker to Launchpad.

If this is successful, we'll try to have regular bug days.  Any
suggestions on how to make the experience as fun and productive as
possible are certainly appreciated!
 

The idea is nice, but would it be possible to hold this on a weekend?
I personally won't be able to attend such a thing on a weekday.

Or maybe holding that on two days: friday and saturday so that people
can participate at least one of the two days, depending if they do that
from work or from home.
   


The work week in Israel is Sunday - Thursday.

It would have to be Sunday and Monday but honestly, I think both days 
tend to be bad for this sort of thing.


I'd much rather do more frequent bug days and alternate between a 
weekday and a Saturday.


Regards,

Anthony Liguori



Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Anthony Liguori


On 05/19/2010 08:19 AM, Michael Tokarev wrote:

Anthony Liguori wrote:
[]
   

For the Bug Day, anything is interesting IMHO.  My main interest is to
get as many people involved in testing and bug fixing as possible.  If
folks are interested in testing specific things like unusual or older
OSes, I'm happy to see it!
 

Well, interesting or not, but I for one don't know what to do with the
results.  There was a thread on kvm@ about sigsegv in cirrus code when
running winNT. The issue has been identified and appears to be fixed,
as in, kvm process does not SIGSEGV anymore, but it does not work anyway,
now printing:

  BUG: kvm_dirty_pages_log_enable_slot: invalid parameters

with garbled guest display.  Thanks goes to Stefano Stabellini for
finding the SIGSEGV case, but unfortunately his hard work isn't quite
useful since the behaviour isn't very much different from the previous
version... ;)
   


File a bug in Launchpad.  Even if it isn't fixed immediately, at least 
that way it will not be lost.  Things are too easily lost on the mailing 
list.


Regards,

Anthony Liguori


Also, thanks to Andre Przywara, whole winNT thing works but it requires
-cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU.  This
is also testing, but it's not obvious what to do with the result...

Thanks!

/mjt

   



Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Michael Tokarev

Anthony Liguori wrote:
[]
> For the Bug Day, anything is interesting IMHO.  My main interest is to
> get as many people involved in testing and bug fixing as possible.  If
> folks are interested in testing specific things like unusual or older
> OSes, I'm happy to see it!

Well, interesting or not, but I for one don't know what to do with the
results.  There was a thread on kvm@ about sigsegv in cirrus code when
running winNT. The issue has been identified and appears to be fixed,
as in, kvm process does not SIGSEGV anymore, but it does not work anyway,
now printing:

 BUG: kvm_dirty_pages_log_enable_slot: invalid parameters

with garbled guest display.  Thanks goes to Stefano Stabellini for
finding the SIGSEGV case, but unfortunately his hard work isn't quite
useful since the behaviour isn't very much different from the previous
version... ;)

Also, thanks to Andre Przywara, whole winNT thing works but it requires
-cpu qemu64,level=1 (or level=2 or =3), -- _not_ with default CPU.  This
is also testing, but it's not obvious what to do with the result...

Thanks!

/mjt


Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Anthony Liguori


On 05/18/2010 08:45 PM, Jamie Lokier wrote:

Natalia Portillo wrote:
   

Hi,

 

- We'll try to migrate as many confirmable bugs from the Source Forge tracker 
to Launchpad.
   

I think that part of the bug day should also include retesting OSes that appear 
in the OS Support List as having a bug, confirming whether the bug is still present 
and whether it's in Launchpad or not.


We don't have a supported OS list.  Everything is best effort.  The OS 
Support List Natalia is referring to is a reflection of what's been 
reported to work, not what we expect should work.



There have been reports of several legacy OSes being unable to install
or boot in the newer qemu while working in the older one.  They're
probably not in the "OS Support List" though.  Are they effectively
uninteresting for the purpose of the 0.13 release?
 


For the Bug Day, anything is interesting IMHO.  My main interest is to 
get as many people involved in testing and bug fixing as possible.  If 
folks are interested in testing specific things like unusual or older 
OSes, I'm happy to see it!


Regards,

Anthony Liguori


Unfortunately I doubt I will have time to participate in the Bug Day.

Thanks,
-- Jamie

 



Re: [PATCH 0/2] AMD Erratum 383 workaround for KVM

2010-05-19 Thread Joerg Roedel

On Mon, May 17, 2010 at 08:43:33AM -0400, Joerg Roedel wrote:
> these two patches implement the workaround for AMD Erratum 383 in KVM.
> This is necessary to prevent the host from crashing if a guest triggers the
> erratum.
> For details on the erratum please see page 96 of
> 
> http://support.amd.com/us/Processor_TechDocs/41322.pdf
> 
> The workaround implemented in these patches will be documented in the
> next update of the revision guide.

Hey Marcelo, Avi,

have you had a chance to look at this? Any opinions?

Thanks,

Joerg


Re: [Qemu-devel] [RFC] Bug Day - June 1st, 2010

2010-05-19 Thread Miguel Di Ciurcio Filho

On Wed, May 19, 2010 at 2:04 AM, Aurelien Jarno  wrote:
>
> The idea is nice, but would it be possible to hold this on a weekend?
> I personally won't be able to attend such a thing on a weekday.
>
> Or maybe holding that on two days: friday and saturday so that people
> can participate at least one of the two days, depending if they do that
> from work or from home.
>

I second that. How about June 4th and 5th?

Miguel

Re: [PATCH 10/12] kvm: enable smp > 1

2010-05-19 Thread Udo Lembke


Jan Kiszka wrote:
> ...
> --enable-io-thread?
>
> If you had it disabled, it would also answer my question if -smp works
> without problems without that feature.
>
> Jan

  

Hi,
I have a basic question: what is "--enable-io-thread"? Is it a
kvm switch?
My kvm 0.12.4 doesn't accept this switch. I only know "threads=n" as an
smp parameter and "aio=threads" as a drive parameter.


I'm asking because I'm looking for a way to get better I/O performance
from Windows guests with more than one CPU...


best regards

Udo




Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Christoph Hellwig

On Wed, May 19, 2010 at 10:23:44AM +0100, Stefan Hajnoczi wrote:
> On Wed, May 19, 2010 at 10:06 AM, Avi Kivity  wrote:
> >> In the cache=writeback case the virtio-blk guest driver does:
> >>
> >> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
> >>
> >
> > I don't follow.  What's the implication?
> 
> I was wondering whether the queue is incorrectly set to a mode where
> overlapping write requests aren't being ordered.  Anyway, Christoph
> says overlapping write requests can be issued so my theory is broken.

They can happen, but won't during usual pagecache based writeback.  So
this should not happen for the pagecache based mke2fs workload.  It
could happen for a workload with mkfs that uses O_DIRECT.  And we
need to handle it either way.

And btw, our barrier handling for devices using multiwrite (aka virtio)
is utterly busted right now, patch will follow ASAP.
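
For reference, "overlapping" in this thread means that the sector ranges of two write requests intersect. A minimal sketch of that check follows, assuming a hypothetical WriteRequest type with a start sector and a length. The names are illustrative and are not taken from the QEMU block layer, and merging only exactly adjacent requests is just one conservative policy, not necessarily what the patch does.

```python
from dataclasses import dataclass


@dataclass
class WriteRequest:
    """Hypothetical write request: first sector and number of sectors."""
    sector: int
    nb_sectors: int


def requests_overlap(a: WriteRequest, b: WriteRequest) -> bool:
    """Return True if the half-open sector ranges of a and b intersect."""
    a_end = a.sector + a.nb_sectors
    b_end = b.sector + b.nb_sectors
    return a.sector < b_end and b.sector < a_end


def can_merge(a: WriteRequest, b: WriteRequest) -> bool:
    """Conservative policy: merge only if b starts exactly where a ends."""
    return a.sector + a.nb_sectors == b.sector
```

With this policy two requests that touch the same sector are never merged, so whatever ordering the layers below provide still applies to them individually.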

Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Stefan Hajnoczi

On Wed, May 19, 2010 at 10:06 AM, Avi Kivity  wrote:
>> In the cache=writeback case the virtio-blk guest driver does:
>>
>> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
>>
>
> I don't follow.  What's the implication?

I was wondering whether the queue is incorrectly set to a mode where
overlapping write requests aren't being ordered.  Anyway, Christoph
says overlapping write requests can be issued so my theory is broken.

> btw I really dislike how the cache attribute (which I see as a pure host
> choice) is exposed to the guest.  It means we can't change caching mode on
> the fly (for example after live migration), or that changing caching mode
> during a restart may expose a previously hidden guest bug.

Christoph has mentioned wanting to make the write cache switchable
from inside the guest.

Stefan

Re: system_powerdown not working for qemu-kvm 0.12.4?

2010-05-19 Thread Teck Choon Giam

On Sun, May 16, 2010 at 7:52 PM, Avi Kivity  wrote:
> On 05/15/2010 04:19 AM, Teck Choon Giam wrote:
>>
>> Hi,
>>
>> Anyone encountered the same issue as me about system_powerdown no
>> longer working since upgraded to qemu-kvm 0.12.4?
>>
>
> Compared with what version?
>
> --
> error compiling committee.c: too many arguments to function
>
>

Hi Avi,

Sorry for this late reply.

Here are my test results:

freebsd 8: system_powerdown not working with qemu-kvm-0.12.3 or
qemu-kvm-0.12.4
centos 5: system_powerdown working with qemu-kvm-0.12.3, not working with
qemu-kvm-0.12.4
windows 2008 R2: system_powerdown working with qemu-kvm-0.12.3, not working
with qemu-kvm-0.12.4

If you want me to test other qemu-kvm versions, please tell me which ones.

Thanks.

Kindest regards,
Giam Teck Choon

[PATCH 3/3] KVM test: Add implementation of network based unattended installation

2010-05-19 Thread Jason Wang

This patch lets the unattended installation be done through one of the
following methods:

- cdrom: the original method, which does the installation from cdrom
- url: installs the linux guest from http or ftp; the tree url is specified
through url
- nfs: installs the linux guest from nfs; the server address is
specified through nfs_server, and the directory through nfs_dir

For url and nfs installation, extra_params needs to be configured
to specify the location of the unattended file:

- If the unattended file in the tree is used, "extra_params= append
  ks=floppy" and the unattended_file param need to be specified in the
  configuration file.
- If an unattended file located on a remote server is used, the
  unattended_file option must be none, "extra_params= append
  ks=http://xxx" needs to be specified in the configuration file, and
  don't forget to add the finish notification part.

--kernel and --initrd are used directly for the network
installation instead of the tftp/bootp params because user mode
networking is too slow for this.

Only the unattended files for RHEL and Fedora guests are modified;
the others are unchanged and can still do the installation from cdrom.
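
As an illustration only, and not part of the patch itself, the mapping from installation medium to the kickstart line that replaces KVM_TEST_MEDIUM could be sketched as below. The function names are hypothetical, ValueError stands in for the test's own SetupError, and the code is written for Python 3 rather than the Python 2 used by the patch.

```python
import re


def install_source_line(medium, url="", nfs_server="", nfs_dir=""):
    """Return the kickstart line that replaces KVM_TEST_MEDIUM."""
    if medium == "cdrom":
        return "cdrom"
    if medium == "url":
        return "url --url %s" % url
    if medium == "nfs":
        return "nfs --server=%s --dir=%s" % (nfs_server, nfs_dir)
    raise ValueError("Unexpected installation medium %r" % medium)


def fill_unattended(contents, source_line):
    """Substitute the placeholder in the unattended file contents.

    A callable replacement keeps re.sub from interpreting backslashes
    that might appear in a url or path.
    """
    return re.sub(r"\bKVM_TEST_MEDIUM\b", lambda _m: source_line, contents)
```

For example, fill_unattended(text, install_source_line("nfs", nfs_server="10.0.0.1", nfs_dir="/export/tree")) would produce a kickstart file that installs over NFS; the address and directory here are made-up values.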

Signed-off-by: Jason Wang 
---
 client/tests/kvm/scripts/unattended.py   |  103 +-
 client/tests/kvm/tests_base.cfg.sample   |1 
 client/tests/kvm/unattended/Fedora-10.ks |2 -
 client/tests/kvm/unattended/Fedora-11.ks |2 -
 client/tests/kvm/unattended/Fedora-12.ks |2 -
 client/tests/kvm/unattended/Fedora-8.ks  |2 -
 client/tests/kvm/unattended/Fedora-9.ks  |2 -
 client/tests/kvm/unattended/RHEL-3-series.ks |2 -
 client/tests/kvm/unattended/RHEL-4-series.ks |2 -
 client/tests/kvm/unattended/RHEL-5-series.ks |2 -
 10 files changed, 107 insertions(+), 13 deletions(-)

diff --git a/client/tests/kvm/scripts/unattended.py 
b/client/tests/kvm/scripts/unattended.py
index fdadd03..b738e3f 100755
--- a/client/tests/kvm/scripts/unattended.py
+++ b/client/tests/kvm/scripts/unattended.py
@@ -50,6 +50,7 @@ class UnattendedInstall(object):
 self.cdrom_iso = os.path.join(kvm_test_dir, cdrom_iso)
 self.floppy_mount = tempfile.mkdtemp(prefix='floppy_', dir='/tmp')
 self.cdrom_mount = tempfile.mkdtemp(prefix='cdrom_', dir='/tmp')
+self.nfs_mount = tempfile.mkdtemp(prefix='nfs_', dir='/tmp')
 flopy_name = os.environ['KVM_TEST_floppy']
 self.floppy_img = os.path.join(kvm_test_dir, flopy_name)
 floppy_dir = os.path.dirname(self.floppy_img)
@@ -60,6 +61,16 @@ class UnattendedInstall(object):
 self.pxe_image = os.environ.get('KVM_TEST_pxe_image', '')
 self.pxe_initrd = os.environ.get('KVM_TEST_pxe_initrd', '')
 
+self.medium = os.environ.get('KVM_TEST_medium', '')
+self.url = os.environ.get('KVM_TEST_url', '')
+self.kernel = os.environ.get('KVM_TEST_kernel', '')
+self.initrd = os.environ.get('KVM_TEST_initrd', '')
+self.nfs_server = os.environ.get('KVM_TEST_nfs_server', '')
+self.nfs_dir = os.environ.get('KVM_TEST_nfs_dir', '')
+self.image_path = kvm_test_dir
+self.kernel_path = os.path.join(self.image_path, self.kernel)
+self.initrd_path = os.path.join(self.image_path, self.initrd)
+
 
 def create_boot_floppy(self):
 """
@@ -106,7 +117,8 @@ class UnattendedInstall(object):
 dest = os.path.join(self.floppy_mount, dest_fname)
 
 # Replace KVM_TEST_CDKEY (in the unattended file) with the cdkey
-# provided for this test
+# provided for this test and replace the KVM_TEST_MEDIUM with
+# the tree url or nfs address provided for this test.
 unattended_contents = open(self.unattended_file).read()
 dummy_cdkey_re = r'\bKVM_TEST_CDKEY\b'
 real_cdkey = os.environ.get('KVM_TEST_cdkey')
@@ -117,7 +129,20 @@ class UnattendedInstall(object):
 else:
 print ("WARNING: 'cdkey' required but not specified for "
"this unattended installation")
+
+dummy_re = r'\bKVM_TEST_MEDIUM\b'
+if self.medium == "cdrom":
+content = "cdrom"
+elif self.medium == "url":
+content = "url --url %s" % self.url
+elif self.medium == "nfs":
+content = "nfs --server=%s --dir=%s" % (self.nfs_server, 
self.nfs_dir)
+else:
+raise SetupError("Unexpected installation medium %s" % self.medium)
+
+unattended_contents = re.sub(dummy_re, content, 
unattended_contents)
 
+print unattended_contents
 # Write the unattended file contents to 'dest'
 open(dest, 'w').write(unattended_contents)
 
@@ -216,6 +241,58 @@ class UnattendedInstall(object):
 print "PXE boot successfuly set"
 
 
+def setup_url

[PATCH 2/3] KVM test: Do not use the hard-coded address during unattended installation

2010-05-19 Thread Jason Wang

When we do the unattended installation in tap mode, we should use
vm.get_address() instead of 'localhost' in order to connect to
the finish program running in the guest.

Signed-off-by: Jason Wang 
---
 client/tests/kvm/tests/unattended_install.py |   25 +
 1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/client/tests/kvm/tests/unattended_install.py 
b/client/tests/kvm/tests/unattended_install.py
index e2cec8e..e71f993 100644
--- a/client/tests/kvm/tests/unattended_install.py
+++ b/client/tests/kvm/tests/unattended_install.py
@@ -17,7 +17,6 @@ def run_unattended_install(test, params, env):
 vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
 
 port = vm.get_port(int(params.get("guest_port_unattended_install")))
-addr = ('localhost', port)
 if params.get("post_install_delay"):
 post_install_delay = int(params.get("post_install_delay"))
 else:
@@ -31,17 +30,19 @@ def run_unattended_install(test, params, env):
 time_elapsed = 0
 while time_elapsed < install_timeout:
 client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-try:
-client.connect(addr)
-msg = client.recv(1024)
-if msg == 'done':
-if post_install_delay:
-logging.debug("Post install delay specified, "
-  "waiting %ss...", post_install_delay)
-time.sleep(post_install_delay)
-break
-except socket.error:
-pass
+addr = vm.get_address()
+if addr:
+try:
+client.connect((addr, port))
+msg = client.recv(1024)
+if msg == 'done':
+if post_install_delay:
+logging.debug("Post install delay specified, "
+  "waiting %ss...", post_install_delay)
+time.sleep(post_install_delay)
+break
+except socket.error:
+pass
 time.sleep(1)
 client.close()
 end_time = time.time()
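
As an illustration only, the polling loop above can be sketched as a standalone Python 3 function. Here get_address is a hypothetical callable standing in for vm.get_address(), and the 5 second connect timeout and the poll interval are assumed values, not ones taken from the test.

```python
import socket
import time


def wait_for_install_done(get_address, port, timeout, poll_interval=1.0):
    """Poll (address, port) until the guest sends 'done' or timeout expires.

    get_address may return None while the guest has no address yet.
    Returns True if 'done' was received, False on timeout.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        addr = get_address()
        if addr:
            try:
                # The context manager closes the socket on every path.
                with socket.create_connection((addr, port), timeout=5) as client:
                    if client.recv(1024) == b"done":
                        return True
            except OSError:
                # Guest not reachable yet; retry after the poll interval.
                pass
        time.sleep(poll_interval)
    return False
```

Looking the address up on every iteration matters in tap mode, where the guest typically has no address until it has finished booting and obtained one.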


[PATCH 1/3] KVM test: Add the support of kernel and initrd option for qemu-kvm

2010-05-19 Thread Jason Wang

"-kernel" option is useful for both unattended installation and the
unittest in /kvm/user/test.

Signed-off-by: Jason Wang 
---
 client/tests/kvm/kvm_vm.py |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index c203e14..2515859 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -288,6 +288,16 @@ class VM:
 tftp = kvm_utils.get_path(root_dir, tftp)
 qemu_cmd += " -tftp %s" % tftp
 
+kernel = params.get("kernel")
+if kernel:
+kernel = kvm_utils.get_path(root_dir, kernel)
+qemu_cmd += " -kernel %s" % kernel
+
+initrd = params.get("initrd")
+if initrd:
+initrd = kvm_utils.get_path(root_dir, initrd)
+qemu_cmd += " -initrd %s" % initrd
+
 extra_params = params.get("extra_params")
 if extra_params:
 qemu_cmd += " %s" % extra_params
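
As an illustration only, the effect of the two new parameters on the command line can be sketched as follows. The function name is hypothetical, and os.path.join is used as a stand-in for kvm_utils.get_path, on the assumption that the helper resolves relative paths against root_dir and leaves absolute paths alone.

```python
import os


def boot_options(params, root_dir):
    """Build the ' -kernel ... -initrd ...' fragment of a qemu command line."""
    cmd = ""
    for key in ("kernel", "initrd"):
        value = params.get(key)
        if value:
            # os.path.join keeps an absolute value as-is and resolves a
            # relative one against root_dir.
            cmd += " -%s %s" % (key, os.path.join(root_dir, value))
    return cmd
```

When neither parameter is set the fragment is empty, so existing configurations produce the same command line as before.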


[v3 PATCH] KVM test: Add a helper to search the panic in the log

2010-05-19 Thread Jason Wang

This checker serves as the post_command to find the panic information
in the file which contains the content of guest serial console.

Changes from v2:
- Put all things into __main__
- Fix some typos

Signed-off-by: Jason Wang 
---
 client/tests/kvm/scripts/check_serial.py |   24 
 client/tests/kvm/tests_base.cfg.sample   |7 +--
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 client/tests/kvm/scripts/check_serial.py

diff --git a/client/tests/kvm/scripts/check_serial.py 
b/client/tests/kvm/scripts/check_serial.py
new file mode 100644
index 000..6432c27
--- /dev/null
+++ b/client/tests/kvm/scripts/check_serial.py
@@ -0,0 +1,24 @@
+import os, sys, glob, re
+
+
+class SerialCheckerError(Exception):
+"""
+Simple wrapper for the builtin Exception class.
+"""
+pass
+
+
+if __name__ == "__main__":
+client_dir =  os.environ['AUTODIR']
+pattern = os.environ['KVM_TEST_search_pattern']
+shortname = os.environ['KVM_TEST_shortname']
+debugdir = os.path.join(client_dir, "results/default/kvm.%s/debug" 
+% shortname)
+serial_files = glob.glob(os.path.join(debugdir, 'serial*'))
+
+fail = [ f for f in serial_files if
+ re.findall(pattern, file(f).read(), re.I) ]
+if fail:
+print "%s is found in %s" % (pattern, fail)
+raise SerialCheckerError("Error found during the check, please " 
+ "check the log")
diff --git a/client/tests/kvm/tests_base.cfg.sample 
b/client/tests/kvm/tests_base.cfg.sample
index e85bb4a..c4e522a 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -52,6 +52,10 @@ address_index = 0
 # Misc
 profilers = kvm_stat
 
+# pattern to search in guest serial console
+search_pattern = panic
+post_command = "python scripts/check_serial.py"
+post_command_noncritical = no
 
 # Tests
 variants:
@@ -1324,10 +1328,9 @@ virtio|virtio_blk|e1000|balloon_check:
 variants:
 - @qcow2:
 image_format = qcow2
-post_command = " python scripts/check_image.py;"
+post_command += " && python scripts/check_image.py"
 remove_image = no
 post_command_timeout = 600
-post_command_noncritical = yes
 - vmdk:
 only Fedora Ubuntu Windows
 only smp2
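
As an illustration only, the core of the checker can be sketched as a reusable Python 3 function. The function name is hypothetical; the serial* glob and the case-insensitive search follow the script above.

```python
import glob
import os
import re


def logs_matching(debugdir, pattern):
    """Return the serial* files under debugdir whose text matches pattern."""
    matches = []
    for path in sorted(glob.glob(os.path.join(debugdir, "serial*"))):
        # errors="replace" tolerates stray non-text bytes from the console.
        with open(path, errors="replace") as log:
            if re.search(pattern, log.read(), re.I):
                matches.append(path)
    return matches
```

A caller would raise an error, or exit non-zero, when the returned list is not empty, so that a post_command marked as critical fails the test.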


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Avi Kivity


On 05/19/2010 12:01 PM, Stefan Hajnoczi wrote:
> On Wed, May 19, 2010 at 9:09 AM, Avi Kivity  wrote:
>> On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
>>> What cache= mode are you running?
>>
>> writeback.
>
> In the cache=writeback case the virtio-blk guest driver does:
>
> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)

I don't follow.  What's the implication?

btw I really dislike how the cache attribute (which I see as a pure host 
choice) is exposed to the guest.  It means we can't change caching mode 
on the fly (for example after live migration), or that changing caching 
mode during a restart may expose a previously hidden guest bug.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: the >1Tb block issue

2010-05-19 Thread Avi Kivity


On 05/19/2010 11:57 AM, Christoph Hellwig wrote:
> On Tue, May 18, 2010 at 08:38:22PM +0300, Avi Kivity wrote:
>> Yes.  Why would Linux post overlapping requests? makes
>> 0x sense.
>>
>> There may be a guest bug in here too.  Christoph?
>
> Overlapping writes are entirely fine from the guest POV, although they
> should be rather unusual.  We can update a page and send it out again
> when it gets redirtied while still out on the wire.


But the device may reorder requests:

  system                                   device

  issue request r1 for sector n page p
                                           dma into buffer b1
  modify contents of page p
  issue request r2 for sector n page p
                                           dma into buffer b2
                                           complete r2
                                           complete r1

Is there any guarantee r2 will complete after r1, or that b1 and b2 are 
coherent?  I'm not aware of any.


What about NFS O_DIRECT backing virtio-blk?  Here, requests can 
definitely be reordered, and the buffers are certainly not coherent 
(since they don't even exist once the data has left the NIC).
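
As an illustration only, the concern can be modelled with a toy sketch: if the device is free to complete the two requests in either order, the final contents of sector n depend on which one lands last. The names and data below are made up for the example and do not model any real device.

```python
from itertools import permutations


def final_sector_contents(writes, completion_order):
    """Apply writes to one sector in the order the device completes them.

    writes maps a request name to the data snapshot the device captured;
    the last request to complete determines what ends up on disk.
    """
    contents = None
    for name in completion_order:
        contents = writes[name]
    return contents


# r1 captured the page before it was modified, r2 after.
writes = {"r1": "old page contents", "r2": "new page contents"}
outcomes = {order: final_sector_contents(writes, order)
            for order in permutations(writes)}
```

In this model the order (r2, r1) leaves the old page contents on disk even though the guest issued the newer write last.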


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Stefan Hajnoczi

On Wed, May 19, 2010 at 9:09 AM, Avi Kivity  wrote:
> On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
>> What cache= mode are you running?
>
> writeback.

In the cache=writeback case the virtio-blk guest driver does:

blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)

Stefan

Re: the >1Tb block issue

2010-05-19 Thread Christoph Hellwig

On Tue, May 18, 2010 at 08:38:22PM +0300, Avi Kivity wrote:
> Yes.  Why would Linux post overlapping requests? makes  
> 0x sense.
>
> There may be a guest bug in here too.  Christoph?

Overlapping writes are entirely fine from the guest POV, although they
should be rather unusual.  We can update a page and send it out again
when it gets redirtied while still out on the wire.


[GIT PULL] KVM updates for the 2.6.35 merge window

2010-05-19 Thread Avi Kivity


Linus, please pull from

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.35

to receive the KVM updates for the 2.6.35 merge window.  Highlights 
include a ppc64 port, timekeeping improvements, a lot of emulator work, 
and perf integration.


Shortlog/diffstat below.  Messy due to a merge of tip's perf/core, which 
you have already pulled.


Alexander Graf (84):
  KVM: PPC: Add QPR registers
  KVM: PPC: Make fpscr 64-bit
  KVM: PPC: Enable MMIO to do 64 bits, fprs and qprs
  KVM: PPC: Teach MMIO Signedness
  KVM: PPC: Add AGAIN type for emulation return
  KVM: PPC: Add hidden flag for paired singles
  KVM: PPC: Add Gekko SPRs
  KVM: PPC: Combine extension interrupt handlers
  KVM: PPC: Preload FPU when possible
  KVM: PPC: Fix typo in book3s_32 debug code
  KVM: PPC: Implement mtsr instruction emulation
  KVM: PPC: Make software load/store return eaddr
  KVM: PPC: Make ext giveup non-static
  KVM: PPC: Add helpers to call FPU instructions
  KVM: PPC: Fix error in BAT assignment
  KVM: PPC: Add helpers to modify ppc fields
  KVM: PPC: Enable program interrupt to do MMIO
  KVM: PPC: Implement Paired Single emulation
  KVM: PPC: Add capability for paired singles
  KVM: PPC: Enable use of secondary htab bucket
  KVM: PPC: Simplify kvmppc_load_up_(FPU|VMX|VSX)
  KVM: PPC: Allocate vcpu struct using vmalloc
  KVM: PPC: Memset vcpu to zeros
  KVM: PPC: Destory timer on vcpu destruction
  KVM: PPC: Ensure split mode works
  KVM: PPC: Allow userspace to unset the IRQ line
  KVM: PPC: Make DSISR 32 bits wide
  KVM: PPC: Book3S_32 guest MMU fixes
  KVM: PPC: Split instruction reading out
  KVM: PPC: Don't reload FPU with invalid values
  KVM: PPC: Load VCPU for register fetching
  KVM: PPC: Implement mfsr emulation
  KVM: PPC: Implement BAT reads
  KVM: PPC: Make XER load 32 bit
  KVM: PPC: Implement emulation for lbzux and lhax
  KVM: PPC: Implement alignment interrupt
  KVM: Add support for enabling capabilities per-vcpu
  KVM: PPC: Add OSI hypercall interface
  KVM: PPC: Make build work without CONFIG_VSX/ALTIVEC
  KVM: PPC: Fix dcbz emulation
  KVM: PPC: Add emulation for dcba
  KVM: PPC: Add check if pte was mapped secondary
  KVM: PPC: Use ULL for big numbers
  KVM: PPC: Make bools bitfields
  KVM: PPC: Disable MSR_FEx for Cell hosts
  KVM: PPC: Only use QPRs when available
  KVM: PPC: Don't export Book3S symbols on BookE
  KVM: PPC: Add dequeue for external on BookE
  KVM: PPC: Name generic 64-bit code generic
  KVM: PPC: Add host MMU Support
  KVM: PPC: Add SR swapping code
  KVM: PPC: Add generic segment switching code
  PPC: Split context init/destroy functions
  KVM: PPC: Add kvm_book3s_64.h
  KVM: PPC: Add kvm_book3s_32.h
  KVM: PPC: Add fields to shadow vcpu
  KVM: PPC: Improve indirect svcpu accessors
  KVM: PPC: Use KVM_BOOK3S_HANDLER
  KVM: PPC: Use CONFIG_PPC_BOOK3S define
  PPC: Add STLU
  KVM: PPC: Use now shadowed vcpu fields
  KVM: PPC: Extract MMU init
  KVM: PPC: Make real mode handler generic
  KVM: PPC: Make highmem code generic
  KVM: PPC: Make SLB switching code the new segment framework
  KVM: PPC: Release clean pages as clean
  KVM: PPC: Remove fetch fail code
  KVM: PPC: Add SVCPU to Book3S_32
  KVM: PPC: Emulate segment fault
  KVM: PPC: Add Book3S compatibility code
  KVM: PPC: Export MMU variables
  PPC: Export SWITCH_FRAME_SIZE
  KVM: PPC: Check max IRQ prio
  KVM: PPC: Add KVM intercept handlers
  KVM: PPC: Enable Book3S_32 KVM building
  KVM: PPC: Convert u64 -> ulong
  KVM: PPC: Make Performance Counters work
  KVM: PPC: Improve split mode
  KVM: PPC: Make Alignment interrupts work again
  KVM: PPC: Be more informative on BUG
  KVM: PPC: Set VSID_PR also for Book3S_64
  KVM: PPC: Fix Book3S_64 Host MMU debug output
  KVM: PPC: Find HTAB ourselves
  KVM: PPC: Enable native paired singles

Andre Przywara (1):
  KVM: SVM: implement NEXTRIPsave SVM feature

Andrea Gelmini (1):
  KVM: arch/x86/kvm/kvm_timer.h checkpatch cleanup

Avi Kivity (32):
  KVM: Move kvm_exit tracepoint rip reading inside tracepoint
  KVM: Trace exception injection
  KVM: MMU: Consolidate two guest pte reads in kvm_mmu_pte_write()
  KVM: Make locked operations truly atomic
  KVM: Don't follow an atomic operation by a non-atomic one
  KVM: MMU: Do not instantiate nontrapping spte on unsync page
  KVM: MMU: Reinstate pte prefetch on invlpg
  KVM: MMU: Disassociate direct maps from guest levels
  KVM: Document KVM_SET_USER_MEMORY_REGION
  KVM: Document KVM_SET_TSS_ADDR
  KVM: Document replacements for KVM_EXIT_HYPERCALL
  KVM: x86 emulator: Don't overwrite decode cache
  KVM: Trace emulated instructions

Gentoo guest with smp: emerge freeze while recompile world

2010-05-19 Thread Riccardo

Hi,
I have a dual Xeon quad-core server running Gentoo and qemu:
app-emulation/qemu-kvm-0.12.3-r1  USE="aio gnutls ncurses sasl vde -alsa
-bluetooth -curl -esd -fdt -hardened -kvm-trace -pulseaudio -qemu-ifup -sdl
-static"

It runs many VMs with Ubuntu and Fedora without problems.

I installed one VM with the latest Gentoo amd64, stage3 and portage.
When I run emerge -e world, the process freezes after a while, each time
in a different package, and there are no errors in the logs.
This is a screenshot of the frozen VM:
http://yfrog.com/0iscre1j

top - 10:00:50 up 10:53,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 130 total,   1 running, 124 sleeping,   0 stopped,   5 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8194776k total,   886508k used,  7308268k free,   225080k buffers
Swap:  2048248k total,0k used,  2048248k free,   476956k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
1 root  20   0  3760  656  552 S0  0.0   0:00.86 init
2 root  20   0 000 S0  0.0   0:00.00 kthreadd
3 root  RT   0 000 S0  0.0   0:01.78 migration/0
4 root  20   0 000 S0  0.0   0:00.00 ksoftirqd/0
5 root  RT   0 000 S0  0.0   0:02.01 migration/1
6 root  20   0 000 S0  0.0   0:00.01 ksoftirqd/1
7 root  RT   0 000 S0  0.0   0:02.05 migration/2
8 root  20   0 000 S0  0.0   0:00.01 ksoftirqd/2
9 root  RT   0 000 S0  0.0   0:02.15 migration/3
   10 root  20   0 000 S0  0.0   0:00.01 ksoftirqd/3
   11 root  RT   0 000 S0  0.0   0:01.69 migration/4
   12 root  20   0 000 S0  0.0   0:00.00 ksoftirqd/4
   13 root  RT   0 000 S0  0.0   0:01.49 migration/5
   14 root  20   0 000 S0  0.0   0:00.00 ksoftirqd/5
   15 root  20   0 000 S0  0.0   0:00.11 events/0
   16 root  20   0 000 S0  0.0   0:00.21 events/1

ps -elf:
1 S root   776 2  0  80   0 - 0 scsi_e May18 ?00:00:00
[scsi_eh_1]
1 S root   810 2  0  80   0 - 0 worker May18 ?00:00:00
[kpsmoused]
1 S root   818 2  0  80   0 - 0 worker May18 ?00:00:00
[kstriped]
1 S root   821 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/0]
1 S root   822 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/1]
1 S root   823 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/2]
1 S root   824 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/3]
1 S root   825 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/4]
1 S root   826 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpathd/5]
1 S root   827 2  0  80   0 - 0 worker May18 ?00:00:00
[kmpath_handlerd]
1 S root   828 2  0  80   0 - 0 worker May18 ?00:00:00
[ksnapd]
1 S root   859 2  0  80   0 - 0 worker May18 ?00:00:00
[usbhid_resumer]
1 S root   900 2  0  80   0 - 0 kjourn May18 ?00:00:00
[jbd2/vda3-8]
1 S root   901 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
1 S root   902 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
1 S root   903 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
1 S root   904 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
1 S root   905 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
1 S root   906 2  0  80   0 - 0 worker May18 ?00:00:00
[ext4-dio-unwrit]
5 S root  1005 1  0  76  -4 -  3098 poll_s May18 ?00:00:00
/sbin/udevd --daemon
1 S root  2661 1  0  80   0 -  7492 wait   May18 ?00:00:00
supervising syslog-ng
5 S root  2662  2661  0  80   0 -  7525 poll_s May18 ?00:00:00
/usr/sbin/syslog-ng
1 S root  3250 1  0  80   0 -  9477 poll_s May18 ?00:00:00
/usr/sbin/sshd
1 S root  3370 1  0  80   0 -  4086 hrtime May18 ?00:00:00
/usr/sbin/cron
4 S root  3437 1  0  80   0 - 13988 wait   May18 tty1 00:00:00
/bin/login --
0 S root  3438 1  0  80   0 -  1464 n_tty_ May18 tty2 00:00:00
/sbin/agetty 38400 tty2 linux
0 S root  3439 1  0  80   0 -  1465 n_tty_ May18 tty3 00:00:00
/sbin/agetty 38400 tty3 linux
0 S root  3440 1  0  80   0 -  1464 n_tty_ May18 tty4 00:00:00
/sbin/agetty 38400 tty4 linux
0 S root  3441 1  0  80   0 -  1465 n_tty_ May18 tty5 00:00:00
/sbin/agetty 38400 tty5 linux
0 S root  3442 1  0  80   0 -  1465 n_tty_ May18 tty6 00:00:00
/sbin/agetty 38400 tty6 linux
4

[PATCH v2] KVM: VMX: Enable XSAVE/XRSTORE for guest

2010-05-19 Thread Sheng Yang

From: Dexuan Cui 

Enable XSAVE/XRSTORE for guest.

Change from V1:

1. Use FPU API.
2. Fix CPUID issue.
3. Save/restore all possible guest xstate fields when switching, because we
don't know which fields the guest has already touched.

Signed-off-by: Dexuan Cui 
Signed-off-by: Sheng Yang 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/include/asm/vmx.h  |1 +
 arch/x86/kvm/vmx.c  |   28 +
 arch/x86/kvm/x86.c  |   85 +++---
 4 files changed, 108 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d08bb4a..78d7b06 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -302,6 +302,7 @@ struct kvm_vcpu_arch {
} update_pte;
 
struct fpu guest_fpu;
+   uint64_t xcr0, host_xcr0;
 
gva_t mmio_fault_cr2;
struct kvm_pio_request pio;
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 9e6779f..346ea66 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -266,6 +266,7 @@ enum vmcs_field {
 #define EXIT_REASON_EPT_VIOLATION   48
 #define EXIT_REASON_EPT_MISCONFIG   49
 #define EXIT_REASON_WBINVD 54
+#define EXIT_REASON_XSETBV 55
 
 /*
  * Interruption-information format
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 99ae513..2ee8ff6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -36,6 +36,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "trace.h"
 
@@ -2616,6 +2618,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
if (enable_ept)
vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_PGE;
+   if (cpu_has_xsave)
+   vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_OSXSAVE;
vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);
 
tsc_base = vmx->vcpu.kvm->arch.vm_init_tsc;
@@ -3354,6 +3358,29 @@ static int handle_wbinvd(struct kvm_vcpu *vcpu)
return 1;
 }
 
+static int handle_xsetbv(struct kvm_vcpu *vcpu)
+{
+   u64 new_bv = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDX) << 32) |
+   kvm_register_read(vcpu, VCPU_REGS_RAX);
+
+   if (kvm_register_read(vcpu, VCPU_REGS_RCX) != 0)
+   goto err;
+   if (vmx_get_cpl(vcpu) != 0)
+   goto err;
+   if (!(new_bv & XSTATE_FP) ||
+(new_bv & ~vcpu->arch.host_xcr0))
+   goto err;
+   if ((new_bv & XSTATE_YMM) && !(new_bv & XSTATE_SSE))
+   goto err;
+   vcpu->arch.xcr0 = new_bv;
+   xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
+   skip_emulated_instruction(vcpu);
+   return 1;
+err:
+   kvm_inject_gp(vcpu, 0);
+   return 1;
+}
+
 static int handle_apic_access(struct kvm_vcpu *vcpu)
 {
return emulate_instruction(vcpu, 0, 0, 0) == EMULATE_DONE;
@@ -3632,6 +3659,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu 
*vcpu) = {
[EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
[EXIT_REASON_APIC_ACCESS] = handle_apic_access,
[EXIT_REASON_WBINVD]  = handle_wbinvd,
+   [EXIT_REASON_XSETBV]  = handle_xsetbv,
[EXIT_REASON_TASK_SWITCH] = handle_task_switch,
[EXIT_REASON_MCE_DURING_VMENTRY]  = handle_machine_check,
[EXIT_REASON_EPT_VIOLATION]   = handle_ept_violation,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7be1d36..5e20f37 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -64,6 +64,7 @@
(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
  | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR  \
+ | X86_CR4_OSXSAVE \
  | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
@@ -149,6 +150,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ NULL }
 };
 
+static inline u32 bit(int bitno)
+{
+   return 1 << (bitno & 31);
+}
+
 static void kvm_on_user_return(struct user_return_notifier *urn)
 {
unsigned slot;
@@ -473,6 +479,17 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
 }
 EXPORT_SYMBOL_GPL(kvm_lmsw);
 
+static bool guest_cpuid_has_xsave(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpuid_entry2 *best;
+
+   best = kvm_find_cpuid_entry(vcpu, 1, 0);
+   if (best->ecx & bit(X86_FEATURE_XSAVE))
+   return true;
+
+   return false;
+}
+
 int __kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
unsigned long old_cr4 = kvm_read_cr4(vcpu);
@@ -481,6 +498,9 @@ int __kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
if (cr4 & CR4_RESERVED_BIT
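
As an illustration only, the validity checks that handle_xsetbv applies to a new XCR0 value can be sketched in Python. The XSTATE_* bit values are the architectural XCR0 bits; the function names are hypothetical, and masking both registers to 32 bits is an assumption about the intended EDX:EAX semantics.

```python
XSTATE_FP = 0x1    # x87 state, bit 0 of XCR0
XSTATE_SSE = 0x2   # SSE state, bit 1
XSTATE_YMM = 0x4   # AVX (YMM) state, bit 2


def xsetbv_value(rdx, rax):
    """Combine EDX:EAX into the 64-bit value XSETBV would load."""
    return ((rdx & 0xFFFFFFFF) << 32) | (rax & 0xFFFFFFFF)


def xcr0_is_valid(new_bv, host_xcr0, index=0, cpl=0):
    """Mirror the checks in handle_xsetbv; False means inject #GP."""
    if index != 0:                   # only XCR0 is handled
        return False
    if cpl != 0:                     # XSETBV is privileged
        return False
    if not new_bv & XSTATE_FP:       # the x87 bit must always be set
        return False
    if new_bv & ~host_xcr0:          # no features the host lacks
        return False
    if (new_bv & XSTATE_YMM) and not (new_bv & XSTATE_SSE):
        return False                 # YMM requires SSE
    return True
```

For example, a guest asking for FP and YMM without SSE (0x5) is rejected, while FP alone (0x1) is accepted on any host that has XSAVE.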

[PATCH] qemu-kvm: Enable xsave related CPUID

2010-05-19 Thread Sheng Yang


Signed-off-by: Sheng Yang 
---
 target-i386/cpuid.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index eebf038..21e94f3 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -1067,6 +1067,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 *ecx = 0;
 *edx = 0;
 break;
+case 0xD:
+/* Processor Extended State */
+if (!(env->cpuid_ext_features & CPUID_EXT_XSAVE)) {
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+break;
+}
+if (count == 0) {
+*eax = 0x7;
+*ebx = 0x340;
+*ecx = 0x340;
+*edx = 0;
+} else if (count == 1) {
+/* eax = 1, so we can continue with others */
+*eax = 1;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+} else if (count == 2) {
+*eax = 0x100;
+*ebx = 0x240;
+*ecx = 0;
+*edx = 0;
+} else {
+*eax = 0;
+*ebx = 0;
+*ecx = 0;
+*edx = 0;
+}
+break;
 case 0x8000:
 *eax = env->cpuid_xlevel;
 *ebx = env->cpuid_vendor1;
-- 
1.7.0.1
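
For reference, the constants returned for leaf 0xD are consistent with each other: sub-leaf 2 places the AVX state right after the 512-byte legacy area plus the 64-byte XSAVE header, and its offset plus size equals the total area size reported in sub-leaf 0. A minimal sketch of that check follows. It only mirrors the values in the patch; the function and constant names are made up for illustration and are not QEMU identifiers.

```python
# Sketch only: mirrors the constants from the 0xD case in the patch.
# Names here (cpuid_0xd, LEGACY_AREA, ...) are illustrative assumptions.
LEGACY_AREA = 512    # FXSAVE-compatible legacy region, in bytes
XSAVE_HEADER = 64    # XSAVE header that follows the legacy region


def cpuid_0xd(count, has_xsave=True):
    """Return (eax, ebx, ecx, edx) as the patch does for leaf 0xD."""
    if not has_xsave:
        return (0, 0, 0, 0)
    leaves = {
        0: (0x7, 0x340, 0x340, 0),   # x87 | SSE | AVX bits, area size 0x340
        1: (1, 0, 0, 0),
        2: (0x100, 0x240, 0, 0),     # AVX state: size 0x100 at offset 0x240
    }
    return leaves.get(count, (0, 0, 0, 0))


def layout_is_consistent():
    """Check that sub-leaf 2 fits exactly at the end of the save area."""
    _, _, total, _ = cpuid_0xd(0)
    size, offset, _, _ = cpuid_0xd(2)
    return offset == LEGACY_AREA + XSAVE_HEADER and offset + size == total
```

With these values, 0x240 + 0x100 = 0x340, so the AVX component ends exactly where the reported save area ends.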


Re: [KVM-AUTOTEST PATCH] KVM test: use command line option wrapper functions

2010-05-19 Thread Feng Yang

Hi, Michael

Thanks for your patch.
We plan to add "netdev" parameter support to make_qemu_command.  Since you are 
working on this part, could you add netdev support in your patch? Hopefully 
netdev can be supported by default in make_qemu_command when qemu supports it. 
Thanks very much!

I think the idea behind this patch is good, and we need this kind of patch.
But I don't think we need to add so many new functions, especially ones that 
only return a string directly and do nothing more.
This adds function call overhead.


- "Michael Goldish"  wrote:

> From: "Michael Goldish" 
> To: autot...@test.kernel.org, kvm@vger.kernel.org
> Cc: "Michael Goldish" 
> Sent: Monday, May 17, 2010 9:29:35 PM GMT +08:00 Beijing / Chongqing / Hong 
> Kong / Urumqi
> Subject: [KVM-AUTOTEST PATCH] KVM test: use command line option wrapper 
> functions
>
> In order to support multiple versions of qemu which use different
> command line
> options or syntaxes, wrap all command line options in small helper
> functions,
> which append text to the command line according to the output of 'qemu
> -help'.
> 
> Signed-off-by: Michael Goldish 
> ---
>  client/tests/kvm/kvm_vm.py |  198
> ++--
>  1 files changed, 135 insertions(+), 63 deletions(-)
> 
> diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
> index 047505a..94bacdf 100755
> --- a/client/tests/kvm/kvm_vm.py
> +++ b/client/tests/kvm/kvm_vm.py
> @@ -186,12 +186,100 @@ class VM:
> nic_model -- string to pass as 'model' parameter for
> this
> NIC (e.g. e1000)
>  """
> -if name is None:
> -name = self.name
> -if params is None:
> -params = self.params
> -if root_dir is None:
> -root_dir = self.root_dir
> +# Helper function for command line option wrappers
> +def has_option(help, option):
> +return bool(re.search(r"^-%s(\s|$)" % option, help,
> re.MULTILINE))
> +
> +# Wrappers for all supported qemu command line parameters.
> +# This is meant to allow support for multiple qemu versions.
> +# Each of these functions receives the output of 'qemu -help'
> as a
> +# parameter, and should add the requested command line
> option
> +# accordingly.
> +
> +def add_name(help, name):
> +return " -name '%s'" % name

I don't think we need to add so many new functions, especially ones that 
only return a string directly and do nothing more.
This adds function call overhead.
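
For comparison, the wrappers that do depend on the qemu version are the ones that consult the help text through has_option. A minimal standalone sketch of that pattern is below. The two help strings are invented for illustration and are not real 'qemu -help' output; re.escape is an addition that the quoted patch does not have.

```python
import re


def has_option(help_text, option):
    # True if a line of the help text starts with "-<option>" followed by
    # whitespace or end of line, as in the quoted patch.
    pattern = r"^-%s(\s|$)" % re.escape(option)
    return bool(re.search(pattern, help_text, re.MULTILINE))


def add_cdrom(help_text, filename, index=2):
    # Prefer -drive when the binary advertises it, else fall back to -cdrom.
    if has_option(help_text, "drive"):
        return " -drive file=%s,index=%d,media=cdrom" % (filename, index)
    return " -cdrom %s" % filename


# Hypothetical help excerpts, for illustration only.
NEW_HELP = "-cdrom file     use 'file' as cdrom image\n-drive [file=file][,if=type]\n"
OLD_HELP = "-cdrom file     use 'file' as cdrom image\n"

new_cmd = add_cdrom(NEW_HELP, "cd.iso")
old_cmd = add_cdrom(OLD_HELP, "cd.iso")
```

Here only add_cdrom needs the help text; a wrapper like add_mem would ignore it, which is the case being questioned above.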

> +
> +def add_unix_socket_monitor(help, filename):
> +return " -monitor unix:%s,server,nowait" % filename
Same as above
> +
> +def add_mem(help, mem):
> +return " -m %s" % mem
Same as above
> +
> +def add_smp(help, smp):
> +return " -smp %s" % smp
Same as above.

> +
> +def add_cdrom(help, filename, index=2):
> +if has_option(help, "drive"):
> +return " -drive file=%s,index=%d,media=cdrom" %
> (filename,
> +
> index)
> +else:
> +return " -cdrom %s" % filename
> +
> +def add_drive(help, filename, format=None, cache=None,
> werror=None,
> +  serial=None, snapshot=False, boot=False):
> +cmd = " -drive file=%s" % filename
> +if format: cmd += ",if=%s" % format
> +if cache: cmd += ",cache=%s" % cache
> +if werror: cmd += ",werror=%s" % werror
> +if serial: cmd += ",serial=%s" % serial
> +if snapshot: cmd += ",snapshot=on"
> +if boot: cmd += ",boot=on"
> +return cmd
> +
> +def add_nic(help, vlan, model=None, mac=None):
> +cmd = " -net nic,vlan=%d" % vlan
> +if model: cmd += ",model=%s" % model
> +if mac: cmd += ",macaddr=%s" % mac
> +return cmd
> +
> +def add_net(help, vlan, mode, ifname=None, script=None,
> +downscript=None):
> +cmd = " -net %s,vlan=%d" % (mode, vlan)
> +if mode == "tap":
> +if ifname: cmd += ",ifname=%s" % ifname
> +if script: cmd += ",script=%s" % script
> +cmd += ",downscript=%s" % (downscript or "no")
> +return cmd
> +
> +def add_floppy(help, filename):
> +return " -fda %s" % filename
> +
> +def add_tftp(help, filename):
> +return " -tftp %s" % filename
> +
> +def add_tcp_redir(help, host_port, guest_port):
> +return " -redir tcp:%s::%s" % (host_port, guest_port)
> +
> +def add_vnc(help, vnc_port):
> +return " -vnc :%d" % (vnc_port - 5900)
> +
> +def add_sdl(help):
> +if has_option(help, "sdl"):
> +return " -sdl"
> +else:
> +ret

Re: KVM call agenda for May 18

2010-05-19 Thread Christoph Hellwig

On Tue, May 18, 2010 at 08:52:36AM -0500, Anthony Liguori wrote:
> This should be filed in launchpad as a qemu bug and it should be tested  
> against the latest git.  This bug sounds like we're using an int to  
> represent sector offset somewhere but there's not enough info in the bug  
> report to figure out for sure.  I just audited the virtio-blk -> raw ->  
> aio=threads path and I don't see an obvious place that we're getting it  
> wrong.

FYI: I'm going to ignore everything that's in launchpad - even more than
in the stupid SF bugtracker.  While the SF one is almost unusable
launchpad is entirely unusable.  If you don't have an account with the
evil spacement empire you can't even check the email addresses of the
reporters, so any communication with them is entirely impossible.

It's time we get a proper bugzilla.qemu.org for both qemu and qemu-kvm
that can be used sanely.  If you ask nicely you might even get a virtual
instance of bugzilla.kernel.org which works quite nicely.


Re: qemu-kvm hangs if multipath device is queing

2010-05-19 Thread Peter Lieven


Kevin Wolf wrote:
> On 19.05.2010 09:29, Christoph Hellwig wrote:
>> On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
>>> I think it's stuck here in an endless loop:
>>>
>>> while (laiocb->ret == -EINPROGRESS)
>>> qemu_laio_completion_cb(laiocb->ctx);
>>>
>>> Can you verify this by single-stepping one or two loop iterations? ret
>>> and errno after the read call could be interesting, too.
>>
>> Maybe the compiler is just too smart.  Without some form of barrier
>> it could just optimize the loop away as laiocb->ret couldn't change
>> in a normal single-threaded environment.
>
> It probably could in theory, but in practice we're in a read() inside
> qemu_laio_completion, so it didn't do it here.

If you supply a patch that adds some usleeps at the point in
question, I'm willing to test whether it solves the 100% CPU problem.

> Kevin


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Avi Kivity


On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
> On Tue, May 18, 2010 at 6:18 PM, Avi Kivity  wrote:
>> The block multiwrite code pretends to be able to merge overlapping requests,
>> but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
>> of a large virtio disk).
>
> Are overlapping write requests correct guest behavior?  I thought the
> ordering semantics require a flush between overlapping writes to
> ensure A is written before B.
>
> What cache= mode are you running?

writeback.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Avi Kivity


On 05/19/2010 10:39 AM, Rusty Russell wrote:
> I think we're talking about the last 2 entries of the avail ring.  That means
> the worst case is 1 false bounce every time around the ring.

It's low, but why introduce an inefficiency when you can avoid doing it 
for the same effort?

> I think that's
> why we're debating it instead of measuring it :)

Measure before optimize is good for code but not for protocols.  
Protocols have to be robust against future changes.  Virtio is warty 
enough already, we can't keep doing local optimizations.

> Note that this is an exclusive->shared->exclusive bounce only, too.

A bounce is a bounce.

Virtio is already way too bouncy due to the indirection between the 
avail/used rings and the descriptor pool.  A device with out of order 
completion (like virtio-blk) will quickly randomize the unused 
descriptor indexes, so every descriptor fetch will require a bounce.


In contrast, if the rings hold the descriptors themselves instead of 
pointers, we bounce (sizeof(descriptor)/cache_line_size) cache lines for 
every descriptor, amortized.
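
To put a rough number on that ratio, here is a small sketch. It assumes the existing 16-byte descriptor layout (64-bit address, 32-bit length, 16-bit flags, 16-bit next) and a 64-byte cache line; the line size is hardware dependent, and an in-ring descriptor format could differ from today's.

```python
import struct

# Existing descriptor layout: u64 addr, u32 len, u16 flags, u16 next.
# Little-endian with no padding, so calcsize gives the packed size.
VRING_DESC_FMT = "<QIHH"
DESC_SIZE = struct.calcsize(VRING_DESC_FMT)


def lines_per_descriptor(cache_line_size=64):
    """Amortized cache lines bounced per descriptor stored in the ring."""
    return DESC_SIZE / float(cache_line_size)


ratio = lines_per_descriptor(64)
```

Under those assumptions, four descriptors share one cache line, so sequential ring traffic costs about a quarter of a line per descriptor.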


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: qemu-kvm hangs if multipath device is queing

2010-05-19 Thread Kevin Wolf

On 19.05.2010 09:29, Christoph Hellwig wrote:
> On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
>> I think it's stuck here in an endless loop:
>>
>> while (laiocb->ret == -EINPROGRESS)
>> qemu_laio_completion_cb(laiocb->ctx);
>>
>> Can you verify this by single-stepping one or two loop iterations? ret
>> and errno after the read call could be interesting, too.
> 
> Maybe the compiler is just too smart.  Without some form of barrier
> it could just optimize the loop away as laiocb->ret couldn't change
> in a normal single-threaded environment.

It probably could in theory, but in practice we're in a read() inside
qemu_laio_completion, so it didn't do it here.

Kevin

Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

2010-05-19 Thread Rusty Russell

On Wed, 12 May 2010 04:57:22 am Avi Kivity wrote:
> On 05/07/2010 06:23 AM, Rusty Russell wrote:
> > On Thu, 6 May 2010 07:30:00 pm Avi Kivity wrote:
> >
> >> On 05/05/2010 11:58 PM, Michael S. Tsirkin wrote:
> >>  
> >>> + /* We publish the last-seen used index at the end of the available ring.
> >>> +  * It is at the end for backwards compatibility. */
> >>> + vr->last_used_idx =&(vr)->avail->ring[num];
> >>> + /* Verify that last used index does not spill over the used ring. */
> >>> + BUG_ON((void *)vr->last_used_idx +
> >>> +sizeof *vr->last_used_idx>   (void *)vr->used);
> >>>}
> >>>
> >>>
> >> Shouldn't this be on its own cache line?
> >>  
> > It's next to the available ring; because that's where the guest publishes
> > its data.  That whole page is guest-write, host-read.
> >
> > Putting it on a cacheline by itself would be a slight pessimization; the 
> > host
> > cpu would have to get the last_used_idx cacheline and the avail descriptor
> > cacheline every time.  This way, they are sometimes the same cacheline.
> 
> If one peer writes the tail of the available ring, while the other reads 
> last_used_idx, it's a false bounce, no?

I think we're talking about the last 2 entries of the avail ring.  That means
the worst case is 1 false bounce every time around the ring.  I think that's
why we're debating it instead of measuring it :)

Note that this is an exclusive->shared->exclusive bounce only, too.
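
The "last 2 entries" figure can be checked with a little arithmetic. The sketch below assumes the legacy ring layout (a page-aligned descriptor table of num 16-byte entries, followed directly by the avail ring with 16-bit flags, 16-bit idx, and num 16-bit entries), a 64-byte cache line, and last_used_idx stored at ring[num] as in the quoted patch. The function name is made up for illustration.

```python
DESC_SIZE = 16     # bytes per descriptor table entry
CACHE_LINE = 64    # assumed cache line size; hardware dependent


def avail_entries_sharing_line(num, cache_line=CACHE_LINE):
    """Indexes of avail ring entries on the same line as last_used_idx."""
    avail_base = num * DESC_SIZE           # avail ring follows the descriptors
    ring_base = avail_base + 4             # skip u16 flags and u16 idx
    last_used_off = ring_base + 2 * num    # &avail->ring[num]
    line = last_used_off // cache_line
    return [i for i in range(num)
            if (ring_base + 2 * i) // cache_line == line]


shared = avail_entries_sharing_line(256)
```

For a 256-entry ring this gives entries 254 and 255, so a false bounce is only possible when the guest writes one of those two slots.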

Cheers,
Rusty.

Re: qemu-kvm hangs if multipath device is queing

2010-05-19 Thread Christoph Hellwig

On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
> I think it's stuck here in an endless loop:
> 
> while (laiocb->ret == -EINPROGRESS)
> qemu_laio_completion_cb(laiocb->ctx);
> 
> Can you verify this by single-stepping one or two loop iterations? ret
> and errno after the read call could be interesting, too.

Maybe the compiler is just too smart.  Without some form of barrier
it could just optimize the loop away as laiocb->ret couldn't change
in a normal single-threaded environment.

