Re: KVM call agenda for May 18

2010-05-17 Thread Brian Jackson
On Monday 17 May 2010 22:23:46 Chris Wright wrote:
> Please send in any agenda items you are interested in covering.
> 
> If we have a lack of agenda items I'll cancel the week's call.


Perceived long standing bugs that nobody seems to care about. There are a few, 
one of which is the > 1TB [1] bug that has existed for 4+ months.

And others.

This can wait for a later call if necessary... not worth a call on it's own.


Etc:
[1] 
http://sourceforge.net/tracker/?func=detail&aid=2933400&group_id=180599&atid=893831


> 
> thanks,
> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [resend] fix non-mergeable buffers packet too large error handling

2010-05-17 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Tue, 18 May 2010 06:12:52 +0300

> DaveM, just to clarify, this patch is on top of the series
> we are working on with David L Stevens. It's not for your net tree.

Understood.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-17 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Tue, 18 May 2010 04:19:31 +0300

> With PUBLISH_USED_IDX, guest tells us which used entries
> it has consumed. This can be used to reduce the number
> of interrupts: after we write a used entry, if the guest has not yet
> consumed the previous entry, or if the guest has already consumed the
> new entry, we do not need to interrupt.
> This imporves bandwidth by 30% under some workflows.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> Rusty, Dave, this patch depends on the patch
> "virtio: put last seen used index into ring itself"
> which is currently destined at Rusty's tree.
> Rusty, if you are taking that one for 2.6.35, please
> take this one as well.
> Dave, any objections?

None:

Acked-by: David S. Miller 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


KVM call agenda for May 18

2010-05-17 Thread Chris Wright
Please send in any agenda items you are interested in covering.

If we have a lack of agenda items I'll cancel the week's call.

thanks,
-chris
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [resend] fix non-mergeable buffers packet too large error handling

2010-05-17 Thread Michael S. Tsirkin
DaveM, just to clarify, this patch is on top of the series
we are working on with David L Stevens. It's not for your net tree.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2976863 ] 32PAE Windows guest blue screen when booting with apci on

2010-05-17 Thread SourceForge.net
Bugs item #2976863, was opened at 2010-03-26 14:44
Message generated for change (Comment added) made by haoxudong
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2976863&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Xudong Hao (haoxudong)
Assigned to: Nobody/Anonymous (nobody)
Summary: 32PAE Windows guest blue screen when booting with apci on

Initial Comment:
Environment:

Host OS (ia32/ia32e/IA64):32e and pae
Guest OS (ia32/ia32e/IA64): ia32pae
Guest OS Type (Linux/Windows): windows(xp, vista, and Win7)
kvm.git Commit: b8ed93d4.. 
qemu-kvm Commit: aa4df134.. 
Host Kernel Version:2.6.34-rc2
Hardware: Stoakely


Bug detailed description:
--
when we boot a 32pae Windows guest with acpi on, the guest will become bule 
screen of death.


Reproduce steps:

1. boot into host, loal kvm and kvm_intel module
2. use following command to boot a guest: qemu-system-x86_64  -m 256 -net
nic,macaddr=00:16:3e:27:b6:8a,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup
-hda /share/xvs/var/Windows.img
3. the guest will become bule screen of death.

--

>Comment By: Xudong Hao (haoxudong)
Date: 2010-05-18 11:11

Message:
>This problem you are experiencing is only reproducible on 32-bit host,
>64-bit host is OK, correct?

No, on 64-bit host, 32pae guest get this booting slow problem with acpi on
and shandow mode too.

Mtosatti, What platform(CPU) do you use?

--

Comment By: Marcelo Tosatti (mtosatti)
Date: 2010-04-22 00:51

Message:
> Now BSOD issue got out on kvm commit:
>
bc8a97a218d0d2367d125db3e11ec45bb0fa0a3f-a1b705841caa33fca0b95928505c01d5105b706f.

Please save the BSOD message.

> However, boot a 32-bit XP with acpi on will cost about 9 minutes, did
you
> meet similar issue on your side?

No. It boots similarly fast as with 64-bit host. 

This problem you are experiencing is only reproducible on 32-bit host,
64-bit host is OK, correct?


--

Comment By: Xudong Hao (haoxudong)
Date: 2010-04-20 10:35

Message:
Now BSOD issue got out on kvm commit:
bc8a97a218d0d2367d125db3e11ec45bb0fa0a3f-a1b705841caa33fca0b95928505c01d5105b706f.

However, boot a 32-bit XP with acpi on will cost about 9 minutes, did you
meet similar issue on your side?

--

Comment By: Marcelo Tosatti (mtosatti)
Date: 2010-04-15 04:51

Message:
1) 32-bit WinXP

qemu-system-x86_64 -drive file=/root/winxp-32-good.qcow2,cache=writeback 
-m 2000 -vnc :1 -usbdevice tablet -net nic,model=e1000 -net
tap,script=/root/ifup

2) acpi on 

Control panes shows "ACPI multiprocessor PC"

3) shadow mode

[r...@localhost ~]# cat /sys/module/kvm_intel/parameters/ept 
N

host:

Linux localhost.localdomain 2.6.34-rc3 #101 SMP Mon Apr 12 18:18:39 BRT
2010 i686 i686 i386 GNU/Linux


Can you send a screenshot of the hang? 

--

Comment By: Xudong Hao (haoxudong)
Date: 2010-04-13 13:45

Message:
Motsatti,

I checked again, guest still booting stop at the loading scroll bar. Let
find what difference between ours.
Can you check if your reproduce steps:
1) 32-bit WinXP
1) acpi on
2) shadow mode

--

Comment By: Marcelo Tosatti (mtosatti)
Date: 2010-04-13 06:28

Message:
Xudong,

Tested WinXP.32 and WinVista.32 on 32-bit host, both work fine.


--

Comment By: Xudong Hao (haoxudong)
Date: 2010-04-12 09:56

Message:
Latest commit: 069c71e7a5e5f968099ca551f68e5dfe5a3f71bc
qemu: 5c781061a45abe1855ad0a95d834336da574703c
Windows guest did not blue screen with latest KVM commit, but guest hang
at the loading scroll bar. 

This issue only happen on shadow mode. Windows guest can boot up with EPT
mode.


--

Comment By: Marcelo Tosatti (mtosatti)
Date: 2010-04-06 02:50

Message:
Xudong,

Can you reproduce the problem with latest git repositories? 

The BSOD screenshot is truncated, the error number is not present.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2976863&group_id=180599
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-17 Thread Michael S. Tsirkin
With PUBLISH_USED_IDX, guest tells us which used entries
it has consumed. This can be used to reduce the number
of interrupts: after we write a used entry, if the guest has not yet
consumed the previous entry, or if the guest has already consumed the
new entry, we do not need to interrupt.
This imporves bandwidth by 30% under some workflows.

Signed-off-by: Michael S. Tsirkin 
---

This is on top of Rusty's tree and depends on the virtio patch.

Changes from v1:
fix build


 drivers/vhost/vhost.c |   27 +--
 drivers/vhost/vhost.h |4 ++--
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 750effe..18c4f6e 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -278,14 +278,15 @@ static int memory_access_ok(struct vhost_dev *d, struct 
vhost_memory *mem,
return 1;
 }
 
-static int vq_access_ok(unsigned int num,
+static int vq_access_ok(struct vhost_dev *d, unsigned int num,
struct vring_desc __user *desc,
struct vring_avail __user *avail,
struct vring_used __user *used)
 {
+   size_t s = vhost_has_feature(d, VIRTIO_RING_F_PUBLISH_USED) ? 2 : 0;
return access_ok(VERIFY_READ, desc, num * sizeof *desc) &&
   access_ok(VERIFY_READ, avail,
-sizeof *avail + num * sizeof *avail->ring) &&
+sizeof *avail + num * sizeof *avail->ring + s) &&
   access_ok(VERIFY_WRITE, used,
sizeof *used + num * sizeof *used->ring);
 }
@@ -312,7 +313,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, 
void __user *log_base)
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-   return vq_access_ok(vq->num, vq->desc, vq->avail, vq->used) &&
+   return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used) &&
vq_log_access_ok(vq, vq->log_base);
 }
 
@@ -448,7 +449,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, 
void __user *argp)
 * If it is not, we don't as size might not have been setup.
 * We will verify when backend is configured. */
if (vq->private_data) {
-   if (!vq_access_ok(vq->num,
+   if (!vq_access_ok(d, vq->num,
(void __user *)(unsigned long)a.desc_user_addr,
(void __user *)(unsigned long)a.avail_user_addr,
(void __user *)(unsigned 
long)a.used_user_addr)) {
@@ -473,6 +474,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, 
void __user *argp)
vq->log_used = !!(a.flags & (0x1 << VHOST_VRING_F_LOG));
vq->desc = (void __user *)(unsigned long)a.desc_user_addr;
vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
+   vq->last_used = (u16 __user *)&vq->avail->ring[vq->num];
vq->log_addr = a.log_guest_addr;
vq->used = (void __user *)(unsigned long)a.used_user_addr;
break;
@@ -993,7 +995,8 @@ void vhost_discard_vq_desc(struct vhost_virtqueue *vq)
 
 /* After we've used one of their buffers, we tell them about it.  We'll then
  * want to notify the guest, using eventfd. */
-int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
+static int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head,
+ int len)
 {
struct vring_used_elem __user *used;
 
@@ -1034,9 +1037,10 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned 
int head, int len)
 }
 
 /* This actually signals the guest, using eventfd. */
-void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+static void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
__u16 flags;
+   __u16 used;
/* Flush out used index updates. This is paired
 * with the barrier that the Guest executes when enabling
 * interrupts. */
@@ -1053,6 +1057,17 @@ void vhost_signal(struct vhost_dev *dev, struct 
vhost_virtqueue *vq)
 !vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
return;
 
+   if (vhost_has_feature(dev, VIRTIO_RING_F_PUBLISH_USED)) {
+   __u16 used;
+   if (get_user(used, vq->last_used)) {
+   vq_err(vq, "Failed to get last used idx");
+   return;
+   }
+
+   if (used != (u16)(vq->last_used_idx - 1))
+   return;
+   }
+
/* Signal the Guest tell them we used something up. */
if (vq->call_ctx)
eventfd_signal(vq->call_ctx, 1);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 44591ba..bd01aca 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -52,6 +52,7 @@

Re: [PATCH] [resend] fix non-mergeable buffers packet too large error handling

2010-05-17 Thread Michael S. Tsirkin
On Mon, May 17, 2010 at 02:16:36PM -0700, David L Stevens wrote:
> This patch enforces single-buffer allocation when
> mergeable rx buffers is not enabled.
> 
> Reported-by: Michael S. Tsirkin 
> Signed-off-by: David L Stevens 

Thanks! Why are you resending?

> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 309c570..c346304 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -361,13 +361,21 @@ static void handle_rx(struct vhost_net *net)
>   break;
>   }
>   /* TODO: Should check and handle checksum. */
> - if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF) &&
> - memcpy_toiovecend(vq->hdr, (unsigned char *)&headcount,
> -   offsetof(typeof(hdr), num_buffers),
> -   sizeof hdr.num_buffers)) {
> - vq_err(vq, "Failed num_buffers write");
> + if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF)) {
> + if (memcpy_toiovecend(vq->hdr,
> +   (unsigned char *)&headcount,
> +   offsetof(typeof(hdr),
> +   num_buffers),
> +   sizeof hdr.num_buffers)) {
> + vq_err(vq, "Failed num_buffers write");
> + vhost_discard_desc(vq, headcount);
> + break;
> + }
> + } else if (headcount > 1) {
> + vq_err(vq, "rx packet too large (%d) for guest",
> +sock_len);
>   vhost_discard_desc(vq, headcount);
> - break;
> + continue;
>   }
>   vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
>   headcount);
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] vhost-net: utilize PUBLISH_USED_IDX feature

2010-05-17 Thread Michael S. Tsirkin
With PUBLISH_USED_IDX, guest tells us which used entries
it has consumed. This can be used to reduce the number
of interrupts: after we write a used entry, if the guest has not yet
consumed the previous entry, or if the guest has already consumed the
new entry, we do not need to interrupt.
This imporves bandwidth by 30% under some workflows.

Signed-off-by: Michael S. Tsirkin 
---

Rusty, Dave, this patch depends on the patch
"virtio: put last seen used index into ring itself"
which is currently destined at Rusty's tree.
Rusty, if you are taking that one for 2.6.35, please
take this one as well.
Dave, any objections?

 drivers/vhost/vhost.c |   27 +--
 drivers/vhost/vhost.h |2 ++
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 750effe..2a66cf3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -278,14 +278,15 @@ static int memory_access_ok(struct vhost_dev *d, struct 
vhost_memory *mem,
return 1;
 }
 
-static int vq_access_ok(unsigned int num,
+static int vq_access_ok(struct vhost_dev *d, unsigned int num,
struct vring_desc __user *desc,
struct vring_avail __user *avail,
struct vring_used __user *used)
 {
+   size_t s = vhost_has_feature(d, VIRTIO_RING_F_PUBLISH_USED) ? 2 : 0;
return access_ok(VERIFY_READ, desc, num * sizeof *desc) &&
   access_ok(VERIFY_READ, avail,
-sizeof *avail + num * sizeof *avail->ring) &&
+sizeof *avail + num * sizeof *avail->ring + s) &&
   access_ok(VERIFY_WRITE, used,
sizeof *used + num * sizeof *used->ring);
 }
@@ -312,7 +313,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, 
void __user *log_base)
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-   return vq_access_ok(vq->num, vq->desc, vq->avail, vq->used) &&
+   return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used,) &&
vq_log_access_ok(vq, vq->log_base);
 }
 
@@ -448,7 +449,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, 
void __user *argp)
 * If it is not, we don't as size might not have been setup.
 * We will verify when backend is configured. */
if (vq->private_data) {
-   if (!vq_access_ok(vq->num,
+   if (!vq_access_ok(d, vq->num,
(void __user *)(unsigned long)a.desc_user_addr,
(void __user *)(unsigned long)a.avail_user_addr,
(void __user *)(unsigned 
long)a.used_user_addr)) {
@@ -473,6 +474,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, 
void __user *argp)
vq->log_used = !!(a.flags & (0x1 << VHOST_VRING_F_LOG));
vq->desc = (void __user *)(unsigned long)a.desc_user_addr;
vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
+   vq->last_used = (u16 __user *)&vq->avail->ring[vq->num];
vq->log_addr = a.log_guest_addr;
vq->used = (void __user *)(unsigned long)a.used_user_addr;
break;
@@ -993,7 +995,8 @@ void vhost_discard_vq_desc(struct vhost_virtqueue *vq)
 
 /* After we've used one of their buffers, we tell them about it.  We'll then
  * want to notify the guest, using eventfd. */
-int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
+static int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head,
+ int len)
 {
struct vring_used_elem __user *used;
 
@@ -1034,9 +1037,10 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned 
int head, int len)
 }
 
 /* This actually signals the guest, using eventfd. */
-void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+static void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
__u16 flags;
+   __u16 used;
/* Flush out used index updates. This is paired
 * with the barrier that the Guest executes when enabling
 * interrupts. */
@@ -1053,6 +1057,17 @@ void vhost_signal(struct vhost_dev *dev, struct 
vhost_virtqueue *vq)
 !vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
return;
 
+   if (vhost_has_feature(d, VIRTIO_RING_F_PUBLISH_USED)) {
+   __u16 used;
+   if (get_user(used, vq->last_used)) {
+   vq_err(vq, "Failed to get last used idx");
+   return;
+   }
+
+   if (used != (u16)(vq->last_used_idx - 1))
+   return;
+   }
+
/* Signal the Guest tell them we used something up. */
if (vq->call_ctx)
eventfd_signal(vq->call_ctx, 1);
diff --git a/drive

Re: [Autotest][PATCH V2] KVM Test: Add ioquit test case

2010-05-17 Thread Lucas Meneghel Rodrigues
On Fri, 2010-05-14 at 17:43 +0800, Feng Yang wrote:
> Emulate the powercut under IO workload(dd so far) using kill -9.
> Then check image in post command.
> This case want to make sure powercut under IO workload will not
> break qcow2 image.

The big question that came to my mind here is: Are we really expected to
keep the hard disk image consistency even if the power is cut down? I am
not sure about that. Dor, what is the expected behavior on this
situation?

> Now it only work on linux.

I have a couple of minor comments on this one:

> Signed-off-by: Feng Yang 
> ---
>  client/tests/kvm/tests/ioquit.py   |   39 
> 
>  client/tests/kvm/tests_base.cfg.sample |   10 +++-
>  2 files changed, 48 insertions(+), 1 deletions(-)
>  create mode 100644 client/tests/kvm/tests/ioquit.py
> 
> diff --git a/client/tests/kvm/tests/ioquit.py 
> b/client/tests/kvm/tests/ioquit.py
> new file mode 100644
> index 000..a202297
> --- /dev/null
> +++ b/client/tests/kvm/tests/ioquit.py
> @@ -0,0 +1,39 @@
> +import logging, time, random
> +from autotest_lib.client.common_lib import error
> +import kvm_test_utils
> +
> +
> +def run_ioquit(test, params, env):
> +"""
> +Emulate the poweroff under IO workload(dd so far) using kill -9.
> +
> +@param test: Kvm test object
> +@param params: Dictionary with the test parameters.
> +@param env: Dictionary with test environment.
> +"""
> +
> +vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
> +session = kvm_test_utils.wait_for_login(vm,
> +  timeout=int(params.get("login_timeout", 360)))
> +session2 = kvm_test_utils.wait_for_login(vm,
> +  timeout=int(params.get("login_timeout", 360)))
> +try:
> +bg_cmd = params.get("background_cmd")
> +logging.info("Add IO workload for guest OS.")
> +(s, o) = session.get_command_status_output(bg_cmd, timeout=60)
> +check_cmd = params.get("check_cmd")
> +(s, o) = session2.get_command_status_output(check_cmd, timeout=60)
> +if int(o) <= 0:
> +raise error.TestError("Fail to add IO workload for Guest OS")
> +
> +logging.info("Sleep for a while")
> +time.sleep(random.randrange(30,100))
> +(s, o) = session2.get_command_status_output(check_cmd, timeout=300)
> +if int(o) <= 0:
> +logging.info("Background command finish before kill VM")

^ "IO workload finished before the VM was killed"

> +logging.info("Kill the virtual machine")
> +vm.process.close()
> +finally:
> +session.close()
> +session2.close()
> diff --git a/client/tests/kvm/tests_base.cfg.sample 
> b/client/tests/kvm/tests_base.cfg.sample
> index bb3646c..4387a36 100644
> --- a/client/tests/kvm/tests_base.cfg.sample
> +++ b/client/tests/kvm/tests_base.cfg.sample
> @@ -389,7 +389,11 @@ variants:
>  rebase_mode = unsafe
>  image_name_snapshot1 = sn1
>  image_name_snapshot2 = sn2
> -
> +- ioquit:
> +type = ioquit
> +background_cmd = "for i in 1 2 3 4; do (nohup dd if=/dev/urandom 
> of=/tmp/file bs=102400 count=1000 &) done"
> +check_cmd = ps -a |grep dd |wc -l

^ Here we can add login_timeout explicitly so people can tune this
timeout.

> +
>  # system_powerdown, system_reset and shutdown *must* be the last ones
>  # defined (in this order), since the effect of such tests can leave
>  # the VM on a bad state.
> @@ -1347,6 +1351,10 @@ variants:
>  pre_command += " scripts/hugepage.py /mnt/kvm_hugepage;"
>  extra_params += " -mem-path /mnt/kvm_hugepage"
>  
> +ioquit:
> +post_command_noncritical = no
> +only qcow2
> +only Linux
>  
>  variants:
>  - @no_pci_assignable:


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 10/10] KVM test: Add a helper to search the panic in the log

2010-05-17 Thread Lucas Meneghel Rodrigues
On Mon, 2010-05-17 at 15:28 +0800, Jason Wang wrote:
> Michael Goldish wrote:
> > On 05/11/2010 12:04 PM, Jason Wang wrote:
> >   
> >> This checker serves as the post_command to find the panic information
> >> in the file which contains the content of guest serial console.
> >>
> >> Signed-off-by: Jason Wang 
> >> ---
> >>  client/tests/kvm/scripts/check_serial.py |   41 
> >> ++
> >>  client/tests/kvm/tests_base.cfg.sample   |7 -
> >>  2 files changed, 46 insertions(+), 2 deletions(-)
> >>  create mode 100644 client/tests/kvm/scripts/check_serial.py
> >>
> >> diff --git a/client/tests/kvm/scripts/check_serial.py 
> >> b/client/tests/kvm/scripts/check_serial.py
> >> new file mode 100644
> >> index 000..969bbe3
> >> --- /dev/null
> >> +++ b/client/tests/kvm/scripts/check_serial.py
> >> @@ -0,0 +1,41 @@
> >> +import os, sys, glob, re
> >> +
> >> +
> >> +class SerialCheckerError(Exception):
> >> +"""
> >> +Simple wrapper for the builtin Exception class.
> >> +"""
> >> +pass
> >> +
> >> +
> >> +class SerialChecker(object):
> >> +"""
> >> +Serach the panic or other keywords in the guest serial.

^ Typo here, search.

> >> +"""
> >> +def __init__(self):
> >> +"""
> >> +Gets params from environment variables and sets class attributes.
> >> +"""
> >> +client_dir =  os.environ['AUTODIR']
> >> +self.pattern = os.environ['KVM_TEST_search_pattern']
> >> +self.shortname = os.environ['KVM_TEST_shortname']
> >> +self.debugdir = os.path.join(client_dir, 
> >> "results/default/kvm.%s/debug" \
> >> 
> >
> > I think the final backslash is unnecessary.
> >
> >   
> >> + % self.shortname)
> >> +self.serial_files = glob.glob(os.path.join(self.debugdir, 
> >> 'serial*'))
> >> +
> >> +
> >> +def check(self):
> >> +"""
> >> +Check whether the pattern were found in the serial files
> >> +"""
> >> +fail = [ f for f in self.serial_files if
> >> + re.findall(self.pattern, file(f).read(), re.I) ]
> >> +if fail:
> >> +print "%s is found in %s" % (self.pattern, fail)
> >> +raise SerialCheckerError("Error found during the check, 
> >> Please" \
> >> 
> >
> > Same here.
> >
> >   
> >> + " check the log")
> >> +
> >> +
> >> +if __name__ == "__main__":
> >> +checker = SerialChecker()
> >> +checker.check()
> >> 
> >
> > I wonder why we need a class.  Why not just put all the code here?
> >
> >   
> Just follow the style of other pre_command, maybe Lucas like it.

When I made the first pre command script, I opted to use a class to
represent the unattended install setup, and other pre commands followed
this style. Since the code is small, I see no problem in putting it
under __main__ altogether. So Jason, you can remove the code from the
class and put it under __main__, please.

> >> diff --git a/client/tests/kvm/tests_base.cfg.sample 
> >> b/client/tests/kvm/tests_base.cfg.sample
> >> index 3c0933e..3ac8f0d 100644
> >> --- a/client/tests/kvm/tests_base.cfg.sample
> >> +++ b/client/tests/kvm/tests_base.cfg.sample
> >> @@ -52,6 +52,10 @@ address_index = 0
> >>  # Misc
> >>  profilers = kvm_stat
> >>  
> >> +# pattern serach in guest serial console
> >> +serach_pattern = panic

^ typo, should be search_pattern

> >> +post_command = "python scripts/check_serial.py"
> >> +post_command_noncritical = no
> >>  
> >>  # Tests
> >>  variants:
> >> @@ -1314,10 +1318,9 @@ virtio|virtio_blk|e1000|balloon_check:
> >>  variants:
> >>  - @qcow2:
> >>  image_format = qcow2
> >> -post_command = " python scripts/check_image.py;"
> >> +post_command = " python scripts/check_image.py; python 
> >> scripts/check_serial.py"
> >> 
> > ^
> > This should be +=, because post_command may have been previously
> > assigned some value.
> >
> >   
> Would change it, and do you have any other comments about this patchset?

Other than the issues pointed out, looks good.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [resend] fix non-mergeable buffers packet too large error handling

2010-05-17 Thread David L Stevens
This patch enforces single-buffer allocation when
mergeable rx buffers is not enabled.

Reported-by: Michael S. Tsirkin 
Signed-off-by: David L Stevens 

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 309c570..c346304 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -361,13 +361,21 @@ static void handle_rx(struct vhost_net *net)
break;
}
/* TODO: Should check and handle checksum. */
-   if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF) &&
-   memcpy_toiovecend(vq->hdr, (unsigned char *)&headcount,
- offsetof(typeof(hdr), num_buffers),
- sizeof hdr.num_buffers)) {
-   vq_err(vq, "Failed num_buffers write");
+   if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF)) {
+   if (memcpy_toiovecend(vq->hdr,
+ (unsigned char *)&headcount,
+ offsetof(typeof(hdr),
+ num_buffers),
+ sizeof hdr.num_buffers)) {
+   vq_err(vq, "Failed num_buffers write");
+   vhost_discard_desc(vq, headcount);
+   break;
+   }
+   } else if (headcount > 1) {
+   vq_err(vq, "rx packet too large (%d) for guest",
+  sock_len);
vhost_discard_desc(vq, headcount);
-   break;
+   continue;
}
vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
headcount);


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [QEMU-KVM]: Megasas + TCM_Loop + SG_IO into Windows XP guests

2010-05-17 Thread Nicholas A. Bellinger
On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
> On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
> > Nicholas A. Bellinger wrote:
> > > Greetings Hannes and co,
> > > 
> 
> > Let's see if I can find some time working on the megasas emulation.
> > Maybe I find something.
> > Last time I checked it was with a Windows7 build, but I didn't do
> > any real tests there. Basically just checking if the system boots up :-)
> > 
> 
> Nothing fancy just yet.  This is involving a normal NTFS filesystem
> format on a small TCM/FILEIO LUN using scsi-generic and a userspace
> FILEIO with scsi-disk.
> 
> This involves the XP guest waiting until the very last READ_10 once the
> format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
> status AFAICT) before announcing that mkfs.ntfs failed without any
> helpful exception message (due to missing metadata of some sort I would
> assume..?)
> 
> So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
> correct blocks from megasas_handle_io() are actually making it out to
> KVM host is going to be my next option.  ;)
> 

Greetings Hannes,

So I spent some more time with XP guests this weekend, and I noticed two
things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
with the same two TCM_Loop SAS LUNs via SG_IO from last week:

1) With lsi53c895a, XP guests are able to boot successfully w/ out the
synchronous SG_IO hack that is currently required to get past the first
36-byte INQUIRY for megasas + XP SP2

2) With lsi53c895a, XP is able to successfully create and mount a NTFS
filesystem, reboot, and read blocks appear to be functioning properly.
FYI I have not run any 'write known pattern then read-back and compare
blocks' data integrity tests from with in the XP guests just yet, but I
am confident that TCM scatterlist -> se_mem_t mapping is working as
expected on the KVM Host.

Futhermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
then rebooting with megasas with the same two configured TCM_Loop SG_IO
devices, it appears to be able to mount and read blocks successfully.
Attempting to write new blocks on the mounted filesystem also appears to
work to some degree, but throughput slows down to a crawl during XP
guest buffer cache flush, which is likely attributed to the use of my
quick SYNC SG_IO hack.

So it appears that there are two seperate issues here, and AFAICT they
both look to be XP and megasas specific.  For #2, it may be something
about the format of the incoming scatterlists generated during XP's
mkfs.ntfs that is causing some issues.  While watching output during fs
creation, I noticed the following WRITE_10s with a starting 4088 byte
scatterlist and a trailing 8 byte scatterlist:

megasas: writel mmio 40: 2b0b003
megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
megasas: Enqueue frame context 82b0b000 tail 493 busy 1
megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
scsi-generic: Using cur_addr: 0x0ff6c008 cur_len: 0x0ff8
scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0ff8
scsi-generic: Using cur_addr: 0x0fd6e000 cur_len: 0x1000
scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x1000
scsi-generic: Using cur_addr: 0x0fe2f000 cur_len: 0x1000
scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x1000
scsi-generic: Using cur_addr: 0x0fdf cur_len: 0x1000
scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x1000
scsi-generic: Using cur_addr: 0x0fded000 cur_len: 0x0008
scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0008
scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
scsi-generic: ---> Issuing SG_IO CDB len 10: 0x2a 00 00 00 
fa be 00 00 20 00 
scsi-generic: scsi_write_complete() ret = 0
scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 
len 16384
megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0

Also, the final READ_10 that produces the 'could not create filesystem'
exception is for LBA 63 and XP looking for the first FS blocks after
GPT.

Could there be some breakage in megasas with a length < PAGE_SIZE for
the scatterlist..?As lsi53c895a seems to work OK for this case, is
there something about the logic of parsing the incoming struct
scatterlists that is different between the two HBA drivers..?  AFAICT
both are using Gerd's common code in hw/scsi-bus.c, unless there is
something about megasas_map_sgl() that is causing issues with the
above..?

Best,

--nab

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Autotest][PATCH] KVM Test: Update qemu-ifup script to set bridge's forwarding delay to 0.

2010-05-17 Thread Lucas Meneghel Rodrigues
On Mon, 2010-05-10 at 15:41 +0800, Feng Yang wrote:
> Our pxe case always fail. The problem is that the bridge takes some time 
> to enter the forwarding state. Before that, packages are simply dropped by
> the bridge.
> If one sets:
> $ brctl setfd  0 # forward delay = 0
> 
> It works.

Also, per Dor's recommendation, set STP to 'off'. Commited with small
changes:

http://autotest.kernel.org/changeset/4493


> Signed-off-by: Feng Yang 
> ---
>  client/tests/kvm/scripts/qemu-ifup |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/client/tests/kvm/scripts/qemu-ifup 
> b/client/tests/kvm/scripts/qemu-ifup
> index bcd9a7a..3d23fb4 100755
> --- a/client/tests/kvm/scripts/qemu-ifup
> +++ b/client/tests/kvm/scripts/qemu-ifup
> @@ -6,3 +6,4 @@ switch=$(/usr/sbin/brctl show | awk 'NR==2 { print $1 }')
>  
>  /sbin/ifconfig $1 0.0.0.0 up
>  /usr/sbin/brctl addif ${switch} $1
> +/usr/sbin/brctl setfd ${switch} 0


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] Increase maximum number of VNC ports

2010-05-17 Thread Lucas Meneghel Rodrigues
On Sat, 2010-05-08 at 18:01 +0100, Lukas Doktor wrote:
> Signed-off-by: Lukas Doktor 

The new patch set looks good, commited them all, see:

http://autotest.kernel.org/changeset/4492

Thanks!

> ---
>  client/tests/kvm/kvm_vm.py |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
> index d1e0246..c203e14 100755
> --- a/client/tests/kvm/kvm_vm.py
> +++ b/client/tests/kvm/kvm_vm.py
> @@ -396,7 +396,7 @@ class VM:
>  
>  # Find available VNC port, if needed
>  if params.get("display") == "vnc":
> -self.vnc_port = kvm_utils.find_free_port(5900, 6000)
> +self.vnc_port = kvm_utils.find_free_port(5900, 6100)
>  
>  # Find random UUID if specified 'uuid = random' in config file
>  if params.get("uuid") == "random":


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: kvmclock / tsc server side fix

2010-05-17 Thread Zachary Amsden

On 05/17/2010 05:36 AM, Glauber Costa wrote:

On Fri, May 14, 2010 at 04:07:43PM -1000, Zachary Amsden wrote:
   

I believe this fixes the root cause of the kvmclock warp.  It's
quite a plausible phenomenon, and explains why it was so easy to
produce.

 

You mean this is the case for both SMP and UP, or just UP as we talked
before?
   


It's possible on both SMP and UP, guest and host.  It is impossible on 
UP host unless special circumstances come into play (one of my patches 
created these circumstances).



I don't get the role of upscale in your patch. Frequency changes are
already handled by the cpufreq notifier.
   


The only purpose of upscale is to downscale the measurement of delta 
used for counting stats if CPU frequency was raised since last 
observed.  This is because moving to a faster TSC rate means we might 
have counted some cycles at the wrong rate while the rate was in 
transition.  It doesn't much matter, as the delta for which "overrun" is 
logged was computed wrong anyway.


I'll clean up my patches and resend as a series today.

Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 00/12] [PULL] qemu-kvm.git uq/master queue

2010-05-17 Thread Anthony Liguori

On 05/12/2010 04:24 PM, Marcelo Tosatti wrote:

The following changes since commit 54d7cf136f040713095cbc064f62d753bff6f9d2:
   Markus Armbruster (1):
 doc: Clean up monitor command function index
   


Pulled.  Thanks.

Regards,

Anthony Liguori


are available in the git repository at:

   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

Gleb Natapov (2):
   Do not stop VM if emulation failed in userspace.
   kvm: fix 8001.EDX supported bit filtering

Jan Kiszka (2):
   kvm: synchronize state from cpu context
   kvm: validate context for kvm cpu get/put operations

Marcelo Tosatti (8):
   Fix -mem-path with hugetlbfs
   kvm: set cpu_single_env around KVM_RUN ioctl
   make SIG_IPI to tcg vcpu thread reliable
   standardize on qemu_cpu_kick for signalling cpu thread(s)
   port qemu-kvm's on_vcpu code
   add cpu_is_stopped helper
   move stop/stopped CPU_COMMON fields after area zeroed by reset
   kvm: enable smp>  1

  cpu-all.h  |2 +
  cpu-defs.h |6 ++-
  cpu-exec.c |7 
  cpus.c |   88 ---
  exec-all.h |3 ++
  exec.c |8 +++-
  kvm-all.c  |   24 ++---
  kvm.h  |4 ++
  qemu-common.h  |8 +
  target-i386/kvm.c  |   29 -
  target-ppc/kvm.c   |   10 ++
  target-s390x/kvm.c |   10 ++
  12 files changed, 169 insertions(+), 30 deletions(-)


   


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: kvmclock / tsc server side fix

2010-05-17 Thread Glauber Costa
On Fri, May 14, 2010 at 04:07:43PM -1000, Zachary Amsden wrote:
> I believe this fixes the root cause of the kvmclock warp.  It's
> quite a plausible phenomenon, and explains why it was so easy to
> produce.
> 
You mean this is the case for both SMP and UP, or just UP as we talked
before?

I don't get the role of upscale in your patch. Frequency changes are
already handled by the cpufreq notifier.

> Currently it depends on some other patches; I can send a whole
> patchset, but with all the patch activity, it isn't clear what has
> been applied and to what trees.  Where have Glauber's recent patches
> been applied?
> 
> I am looking for comments if this is a reasonably good explanation
> and fix for the problem.
> 
> I realize I messed up the overshoot calculation, it is not converted
> to nsec, but the debug stats are just for debugging.
> 
> Thanks,
> 
> Zach

> commit 24e1f31a4cdb43a8e5cab6cfb95d710c7c7bf18a
> Author: Zachary Amsden 
> Date:   Fri Feb 26 15:13:31 2010 -1000
> 
> Fix a possible backwards warp of kvmclock
> 
> Kernel time, which advances in discrete steps may progress much slower
> than TSC.  As a result, when kvmclock is adjusted to a new base, the
> apparent time to the guest, which runs at a much higher, nsec scaled
> rate based on the current TSC, may have already been observed to have
> a larger value (kernel_ns + scaled tsc) than the value to which we are
> setting it (kernel_ns + 0).
> 
> We must instead compute the clock as potentially observed by the guest
> for kernel_ns to make sure it does not go backwards.
> 
> Signed-off-by: Zachary Amsden 
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 83df4db..ba765fa 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -453,6 +453,8 @@ struct kvm_vcpu_stat {
>   u32 hypercalls;
>   u32 irq_injections;
>   u32 nmi_injections;
> + u32 tsc_overshoot;
> + u32 tsc_ahead;
>  };
>  
>  struct kvm_x86_ops {
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index bb44f9e..2bf7e86 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -134,6 +134,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>   { "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
>   { "irq_injections", VCPU_STAT(irq_injections) },
>   { "nmi_injections", VCPU_STAT(nmi_injections) },
> + { "tsc_overshoot", VCPU_STAT(tsc_overshoot) },
> + { "tsc_ahead", VCPU_STAT(tsc_ahead) },
>   { "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
>   { "mmu_pte_write", VM_STAT(mmu_pte_write) },
>   { "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
> @@ -849,35 +851,80 @@ static int kvm_recompute_guest_time(struct kvm_vcpu *v)
>   struct kvm_vcpu_arch *vcpu = &v->arch;
>   void *shared_kaddr;
>   unsigned long this_tsc_khz;
> + s64 kernel_ns, delta;
> + u64 tsc_timestamp;
> + bool upscale;
>  
>   if ((!vcpu->time_page))
>   return 0;
>  
> - this_tsc_khz = get_cpu_var(cpu_tsc_khz);
> - put_cpu_var(cpu_tsc_khz);
> + /*
> +  * The protection we require is simple: we must not be preempted from
> +  * the CPU between our read of the TSC khz and our read of the TSC.
> +  * Interrupt protection is not strictly required, but it does result in
> +  * greater accuracy for the TSC / kernel_ns measurement.
> +  */
> + local_irq_save(flags);
> + this_tsc_khz = __get_cpu_var(cpu_tsc_khz);
> + kvm_get_msr(v, MSR_IA32_TSC, &tsc_timestamp);
> + ktime_get_ts(&ts);
> + monotonic_to_bootbased(&ts);
> + kernel_ns = timespec_to_ns(&ts);
> + local_irq_restore(flags);
> +
>   if (unlikely(this_tsc_khz == 0)) {
>   kvm_request_guest_time_update(v);
>   return 1;
>   }
>  
> + /*
> +  * Time as measured by the TSC may go backwards when resetting the base
> +  * tsc_timestamp.  The reason for this is that the TSC resolution is
> +  * higher than the resolution of the other clock scales.  Thus, many
> +  * possible measurments of the TSC correspond to one measurement of any
> +  * other clock, and so a spread of values is possible.  This is not a
> +  * problem for the computation of the nanosecond clock; with TSC rates
> +  * around 1GHZ, there can only be a few cycles which correspond to one
> +  * nanosecond value, and any path through this code will inevitably
> +  * take longer than that.  However, with the kernel_ns value itself,
> +  * the precision may be much lower, down to HZ granularity.  If the
> +  * first sampling of TSC against kernel_ns ends in the low part of the
> +  * range, and the second in the high end of the range, we can get:
> +  *
> +  * (TSC - offset_low) * S + kns_old > (TSC - offset_high) * S + kns_new
> +  *
> +  * As the sampling errors potentially range in the thousands of cyc

Re: [Autotest] [PATCH] KVM Test: Make remote_scp() more robust.

2010-05-17 Thread Michael Goldish
On 05/07/2010 01:26 PM, Feng Yang wrote:
> 1. In remote_scp(), if SCP connetion stalled for some reason, following
> code will be ran.
> else:  # match == None
> 
> logging.debug("Timeout elapsed or process terminated")
> status = sub.get_status()
> sub.close()
> return status == 0
> At this moment, kvm_subprocess server is still running which means
> lock_server_running_filename is still locked. But sub.get_status()
> tries to lock it again.  If kvm_subprocess server keeps running,
> a deadlock will happen. This patch will fix this issue by enable

Agreed.  It's a mistake (my mistake) to call get_status() on a process
that's still running and isn't expected to terminate soon.  I think even
the docstring of get_status() says that it blocks, so that's expected
behavior.
However, there's a simple solution to that, and I don't see why an
additional timeout is necessary.

> timeout parameter. Update default value for timeout to 600, it should
> be enough.
> 
> 2. Add "-v" in scp command to catch more infomation. Also add "Exit status"
> and "stalled" match prompt in remote_scp().
> Signed-off-by: Feng Yang 
> ---
>  client/tests/kvm/kvm_utils.py |   36 
>  client/tests/kvm/kvm_vm.py|4 ++--
>  2 files changed, 30 insertions(+), 10 deletions(-)
> 
> diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
> index 25f3c8c..3db4dec 100644
> --- a/client/tests/kvm/kvm_utils.py
> +++ b/client/tests/kvm/kvm_utils.py
> @@ -524,7 +524,7 @@ def remote_login(command, password, prompt, linesep="\n", 
> timeout=10):
>  return None
>  
>  
> -def remote_scp(command, password, timeout=300, login_timeout=10):
> +def remote_scp(command, password, timeout=600, login_timeout=10):
>  """
>  Run the given command using kvm_spawn and provide answers to the 
> questions
>  asked. If timeout expires while waiting for the transfer to complete ,
> @@ -548,12 +548,18 @@ def remote_scp(command, password, timeout=300, 
> login_timeout=10):
>  
>  password_prompt_count = 0
>  _timeout = login_timeout
> +end_time = time.time() + timeout
> +logging.debug("Trying to SCP...")
>  
> -logging.debug("Trying to login...")
>  
>  while True:
> +if end_time <= time.time():
> +logging.debug("transfer timeout!")
> +sub.close()
> +return False
>  (match, text) = sub.read_until_last_line_matches(
> -[r"[Aa]re you sure", r"[Pp]assword:\s*$", r"lost 
> connection"],
> +[r"[Aa]re you sure", r"[Pp]assword:\s*$", r"lost connection",
> + r"Exit status", r"stalled"],
>  timeout=_timeout, internal_timeout=0.5)
>  if match == 0:  # "Are you sure you want to continue connecting"
>  logging.debug("Got 'Are you sure...'; sending 'yes'")
> @@ -574,15 +580,29 @@ def remote_scp(command, password, timeout=300, 
> login_timeout=10):
>  logging.debug("Got 'lost connection'")
>  sub.close()
>  return False
> +elif match == 3: # "Exit status"

This check for "Exit status" is redundant.  When the process terminates,
read_until_last_line_matches() will return None and get_status() will
return the exit status.

> +sub.close()
> +if "Exit status 0" in text:
> +logging.debug("SCP command completed successfully")
> +return True
> +else:
> +logging.debug("SCP command fail with exit status %s" % text)
> +return False
> +elif match == 4: # "stalled"
> +logging.debug("SCP connection stalled for some reason")
> +continue
> +
>  else:  # match == None
> -logging.debug("Timeout elapsed or process terminated")
> +if sub.is_alive():
> +continue
> +logging.debug("Process terminated for some reason")
>  status = sub.get_status()
>  sub.close()
>  return status == 0

To avoid a deadlock, we can simply check if the process is still alive
before calling get_status(), i.e.

else:  # match == None
if sub.is_alive():
logging.debug("Timeout elapsed")
sub.close()
return False
else:
status = sub.get_status()
sub.close()
logging.debug("Process exited with status %d", status)
return status == 0

This looks like the simplest solution to me.

>  
>  
>  def scp_to_remote(host, port, username, password, local_path, remote_path,
> -  timeout=300):
> +  timeout=600):
>  """
>  Copy files to a remote host (guest).
>  
> @@ -596,14 +616,14 @@ def scp_to_remote(host, port, username, password, 
> local_path, remote_path,
>  
>  @return: True on success and False on failure.
>  """
> -command = ("scp -o UserKnownHostsFile=/dev/null "

Re: [Autotest] [KVM-AUTOTEST PATCH] KVM test: make use of tcpdump optional

2010-05-17 Thread Michael Goldish
On 05/17/2010 04:35 PM, Lucas Meneghel Rodrigues wrote:
> On Mon, 2010-05-17 at 16:29 +0300, Michael Goldish wrote:
>> To disable tcpdump, set 'run_tcpdump = no' in a config file.
>> If 'run_tcpdump' isn't set at all, it defaults to 'yes'.
>>
>> (Currently TAP mode cannot be used without tcpdump.)
> 
> Maybe we can just tie tcpdump execution to tap mode - if running on tap
> mode, enable tcpdump, else, disable it. I'd rather prefer this
> approach. 

Checking for nic_mode == 'tap' is incorrect IMO, because nic_mode is a
VM parameter, not a global parameter.
There can be several VMs, some with nic_mode == 'user' and some with
nic_mode == 'tap', e.g.

nic_mode = user # default value for all VMs
nic_mode_vm2 = tap  # specific to vm2 (overrides nic_mode)

so if we went by nic_mode, we wouldn't run tcpdump, which is required by
vm2.

Also, a test is free to start its own VMs.  Consider this (somewhat
unlikely) scenario: nic_mode=user, so vm1 uses user mode, and the test
clones it, changes the clone's mode to tap, and starts the clone.  If we
went by nic_mode, we wouldn't start tcpdump even though the clone needs it.

>> Signed-off-by: Michael Goldish 
>> ---
>>  client/tests/kvm/kvm_preprocessing.py  |2 +-
>>  client/tests/kvm/tests_base.cfg.sample |1 +
>>  2 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/client/tests/kvm/kvm_preprocessing.py 
>> b/client/tests/kvm/kvm_preprocessing.py
>> index 50db65c..e11207a 100644
>> --- a/client/tests/kvm/kvm_preprocessing.py
>> +++ b/client/tests/kvm/kvm_preprocessing.py
>> @@ -196,7 +196,7 @@ def preprocess(test, params, env):
>>  if "tcpdump" in env and not env["tcpdump"].is_alive():
>>  env["tcpdump"].close()
>>  del env["tcpdump"]
>> -if "tcpdump" not in env:
>> +if "tcpdump" not in env and params.get("run_tcpdump", "yes") == "yes":
>>  command = "/usr/sbin/tcpdump -npvi any 'dst port 68'"
>>  logging.debug("Starting tcpdump (%s)...", command)
>>  env["tcpdump"] = kvm_subprocess.kvm_tail(
>> diff --git a/client/tests/kvm/tests_base.cfg.sample 
>> b/client/tests/kvm/tests_base.cfg.sample
>> index c9dfd0b..1276267 100644
>> --- a/client/tests/kvm/tests_base.cfg.sample
>> +++ b/client/tests/kvm/tests_base.cfg.sample
>> @@ -46,6 +46,7 @@ nic_mode = user
>>  #nic_mode = tap
>>  nic_script = scripts/qemu-ifup
>>  address_index = 0
>> +run_tcpdump = yes
>>  
>>  # Misc
>>  run_kvm_stat = yes
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Autotest] [KVM-AUTOTEST PATCH] KVM test: make use of tcpdump optional

2010-05-17 Thread Lucas Meneghel Rodrigues
On Mon, 2010-05-17 at 16:29 +0300, Michael Goldish wrote:
> To disable tcpdump, set 'run_tcpdump = no' in a config file.
> If 'run_tcpdump' isn't set at all, it defaults to 'yes'.
> 
> (Currently TAP mode cannot be used without tcpdump.)

Maybe we can just tie tcpdump execution to tap mode - if running on tap
mode, enable tcpdump, else, disable it. I'd rather prefer this
approach. 

> Signed-off-by: Michael Goldish 
> ---
>  client/tests/kvm/kvm_preprocessing.py  |2 +-
>  client/tests/kvm/tests_base.cfg.sample |1 +
>  2 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/client/tests/kvm/kvm_preprocessing.py 
> b/client/tests/kvm/kvm_preprocessing.py
> index 50db65c..e11207a 100644
> --- a/client/tests/kvm/kvm_preprocessing.py
> +++ b/client/tests/kvm/kvm_preprocessing.py
> @@ -196,7 +196,7 @@ def preprocess(test, params, env):
>  if "tcpdump" in env and not env["tcpdump"].is_alive():
>  env["tcpdump"].close()
>  del env["tcpdump"]
> -if "tcpdump" not in env:
> +if "tcpdump" not in env and params.get("run_tcpdump", "yes") == "yes":
>  command = "/usr/sbin/tcpdump -npvi any 'dst port 68'"
>  logging.debug("Starting tcpdump (%s)...", command)
>  env["tcpdump"] = kvm_subprocess.kvm_tail(
> diff --git a/client/tests/kvm/tests_base.cfg.sample 
> b/client/tests/kvm/tests_base.cfg.sample
> index c9dfd0b..1276267 100644
> --- a/client/tests/kvm/tests_base.cfg.sample
> +++ b/client/tests/kvm/tests_base.cfg.sample
> @@ -46,6 +46,7 @@ nic_mode = user
>  #nic_mode = tap
>  nic_script = scripts/qemu-ifup
>  address_index = 0
> +run_tcpdump = yes
>  
>  # Misc
>  run_kvm_stat = yes


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] vhost: Storage class should be before const qualifier

2010-05-17 Thread Tobias Klauser
On 2010-05-17 at 15:13:35 +0200, Michael S. Tsirkin  wrote:
> On Mon, May 17, 2010 at 03:12:49PM +0200, Tobias Klauser wrote:
> > The C99 specification states in section 6.11.5:
> > 
> > The placement of a storage-class specifier other than at the beginning
> > of the declaration specifiers in a declaration is an obsolescent
> > feature.
> > 
> > Signed-off-by: Tobias Klauser 
> 
> 
> Will apply, thanks!
> Just to clarify: does some compiler/checker actually barf on this?

GCC does emit a warning if the options '-std=c99 -W -Wall' are present.
ICC also does warn about this, though I don't know whether this depends
on any commandline options.

Cheers
Tobias
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-AUTOTEST PATCH] KVM test: make use of tcpdump optional

2010-05-17 Thread Michael Goldish
To disable tcpdump, set 'run_tcpdump = no' in a config file.
If 'run_tcpdump' isn't set at all, it defaults to 'yes'.

(Currently TAP mode cannot be used without tcpdump.)

Signed-off-by: Michael Goldish 
---
 client/tests/kvm/kvm_preprocessing.py  |2 +-
 client/tests/kvm/tests_base.cfg.sample |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py 
b/client/tests/kvm/kvm_preprocessing.py
index 50db65c..e11207a 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -196,7 +196,7 @@ def preprocess(test, params, env):
 if "tcpdump" in env and not env["tcpdump"].is_alive():
 env["tcpdump"].close()
 del env["tcpdump"]
-if "tcpdump" not in env:
+if "tcpdump" not in env and params.get("run_tcpdump", "yes") == "yes":
 command = "/usr/sbin/tcpdump -npvi any 'dst port 68'"
 logging.debug("Starting tcpdump (%s)...", command)
 env["tcpdump"] = kvm_subprocess.kvm_tail(
diff --git a/client/tests/kvm/tests_base.cfg.sample 
b/client/tests/kvm/tests_base.cfg.sample
index c9dfd0b..1276267 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -46,6 +46,7 @@ nic_mode = user
 #nic_mode = tap
 nic_script = scripts/qemu-ifup
 address_index = 0
+run_tcpdump = yes
 
 # Misc
 run_kvm_stat = yes
-- 
1.5.4.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM-AUTOTEST PATCH] KVM test: use command line option wrapper functions

2010-05-17 Thread Michael Goldish
In order to support multiple versions of qemu which use different command line
options or syntaxes, wrap all command line options in small helper functions,
which append text to the command line according to the output of 'qemu -help'.

Signed-off-by: Michael Goldish 
---
 client/tests/kvm/kvm_vm.py |  198 ++--
 1 files changed, 135 insertions(+), 63 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index 047505a..94bacdf 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -186,12 +186,100 @@ class VM:
nic_model -- string to pass as 'model' parameter for this
NIC (e.g. e1000)
 """
-if name is None:
-name = self.name
-if params is None:
-params = self.params
-if root_dir is None:
-root_dir = self.root_dir
+# Helper function for command line option wrappers
+def has_option(help, option):
+return bool(re.search(r"^-%s(\s|$)" % option, help, re.MULTILINE))
+
+# Wrappers for all supported qemu command line parameters.
+# This is meant to allow support for multiple qemu versions.
+# Each of these functions receives the output of 'qemu -help' as a
+# parameter, and should add the requested command line option
+# accordingly.
+
+def add_name(help, name):
+return " -name '%s'" % name
+
+def add_unix_socket_monitor(help, filename):
+return " -monitor unix:%s,server,nowait" % filename
+
+def add_mem(help, mem):
+return " -m %s" % mem
+
+def add_smp(help, smp):
+return " -smp %s" % smp
+
+def add_cdrom(help, filename, index=2):
+if has_option(help, "drive"):
+return " -drive file=%s,index=%d,media=cdrom" % (filename,
+ index)
+else:
+return " -cdrom %s" % filename
+
+def add_drive(help, filename, format=None, cache=None, werror=None,
+  serial=None, snapshot=False, boot=False):
+cmd = " -drive file=%s" % filename
+if format: cmd += ",if=%s" % format
+if cache: cmd += ",cache=%s" % cache
+if werror: cmd += ",werror=%s" % werror
+if serial: cmd += ",serial=%s" % serial
+if snapshot: cmd += ",snapshot=on"
+if boot: cmd += ",boot=on"
+return cmd
+
+def add_nic(help, vlan, model=None, mac=None):
+cmd = " -net nic,vlan=%d" % vlan
+if model: cmd += ",model=%s" % model
+if mac: cmd += ",macaddr=%s" % mac
+return cmd
+
+def add_net(help, vlan, mode, ifname=None, script=None,
+downscript=None):
+cmd = " -net %s,vlan=%d" % (mode, vlan)
+if mode == "tap":
+if ifname: cmd += ",ifname=%s" % ifname
+if script: cmd += ",script=%s" % script
+cmd += ",downscript=%s" % (downscript or "no")
+return cmd
+
+def add_floppy(help, filename):
+return " -fda %s" % filename
+
+def add_tftp(help, filename):
+return " -tftp %s" % filename
+
+def add_tcp_redir(help, host_port, guest_port):
+return " -redir tcp:%s::%s" % (host_port, guest_port)
+
+def add_vnc(help, vnc_port):
+return " -vnc :%d" % (vnc_port - 5900)
+
+def add_sdl(help):
+if has_option(help, "sdl"):
+return " -sdl"
+else:
+return ""
+
+def add_nographic(help):
+return " -nographic"
+
+def add_uuid(help, uuid):
+return " -uuid %s" % uuid
+
+def add_pcidevice(help, host):
+return " -pcidevice host=%s" % host
+
+# End of command line option wrappers
+
+if name is None: name = self.name
+if params is None: params = self.params
+if root_dir is None: root_dir = self.root_dir
+
+qemu_binary = kvm_utils.get_path(root_dir, params.get("qemu_binary",
+  "qemu"))
+# Get the output of 'qemu -help' (log a message in case this call never
+# returns or causes some other kind of trouble)
+logging.debug("Getting output of 'qemu -help'")
+help = commands.getoutput("%s -help" % qemu_binary)
 
 # Start constructing the qemu command
 qemu_cmd = ""
@@ -199,65 +287,49 @@ class VM:
 if params.get("x11_display"):
 qemu_cmd += "DISPLAY=%s " % params.get("x11_display")
 # Add the qemu binary
-qemu_cmd += kvm_utils.get_path(root_dir, params.get("qemu_binary",
-"qemu"))
+qemu_cmd += qemu_binary
 # Add the VM

[KVM-AUTOTEST PATCH] KVM test: formatting improvements to scan_results.py

2010-05-17 Thread Michael Goldish
Print results clearly even if test names are very long.
Also, for consistency, use the same quote character everywhere.

Signed-off-by: Michael Goldish 
---
 client/tests/kvm/scan_results.py |   49 ++---
 1 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/client/tests/kvm/scan_results.py b/client/tests/kvm/scan_results.py
index f7bafa9..f7073e4 100755
--- a/client/tests/kvm/scan_results.py
+++ b/client/tests/kvm/scan_results.py
@@ -24,19 +24,19 @@ def parse_results(text):
 # Found a START line -- get start time
 if (line.startswith("START") and len(parts) >= 5 and
 parts[3].startswith("timestamp")):
-start_time = float(parts[3].split('=')[1])
+start_time = float(parts[3].split("=")[1])
 start_time_list.append(start_time)
 info_list.append("")
 
 # Found an END line -- get end time, name and status
 elif (line.startswith("END") and len(parts) >= 5 and
   parts[3].startswith("timestamp")):
-end_time = float(parts[3].split('=')[1])
+end_time = float(parts[3].split("=")[1])
 start_time = start_time_list.pop()
 info = info_list.pop()
 test_name = parts[2]
 test_status = parts[0].split()[1]
-# Remove 'kvm.' prefix
+# Remove "kvm." prefix
 if test_name.startswith("kvm."):
 test_name = test_name.split("kvm.")[1]
 result_list.append((test_name, test_status,
@@ -50,39 +50,48 @@ def parse_results(text):
 return result_list
 
 
-def print_result(result):
-"""Nicely print a single Autotest result.
+def print_result(result, name_width):
+"""
+Nicely print a single Autotest result.
 
-result -- a 4-tuple
+@param result: a 4-tuple
+@param name_width: test name maximum width
 """
 if result:
-print '%-48s\t\t%s\t%s\t%s' % tuple(map(str, result))
+format = "%%-%ds%%-10s %%-8s %%s" % name_width
+print format % result
 
 
 def main(resfiles):
-print_result(('Test', 'Status', 'Seconds', 'Info'))
-print_result(('', '--', '---', ''))
+result_lists = []
+name_width = 40
 
 for resfile in resfiles:
-print '(Result file: %s)' % resfile
 try:
-f = file(resfile)
-text = f.read()
-f.close()
+text = open(resfile).read()
 except IOError:
-print 'Bad result file: %s' % resfile
-return
+print "Bad result file: %s" % resfile
+continue
 results = parse_results(text)
-map(print_result, results)
+result_lists.append((resfile, results))
+name_width = max(name_width, max(len(r[0]) for r in results))
+
+print_result(("Test", "Status", "Seconds", "Info"), name_width)
+print_result(("", "--", "---", ""), name_width)
+
+for resfile, results in result_lists:
+print "(Result file: %s)" % resfile
+for r in results:
+print_result(r, name_width)
 
 
-if __name__ == '__main__':
+if __name__ == "__main__":
 import sys, os, glob
 
-resfiles = glob.glob('../../results/default/status*')
+resfiles = glob.glob("../../results/default/status*")
 if len(sys.argv) > 1:
-if sys.argv[1] == '-h' or sys.argv[1] == '--help':
-print 'Usage: %s [result files]' % sys.argv[0]
+if sys.argv[1] == "-h" or sys.argv[1] == "--help":
+print "Usage: %s [result files]" % sys.argv[0]
 sys.exit(0)
 resfiles = sys.argv[1:]
 main(resfiles)
-- 
1.5.4.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] vhost: Storage class should be before const qualifier

2010-05-17 Thread Tobias Klauser
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.

Signed-off-by: Tobias Klauser 
---
 drivers/vhost/net.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index aa88911..cd36f5f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -626,7 +626,7 @@ static long vhost_net_compat_ioctl(struct file *f, unsigned 
int ioctl,
 }
 #endif
 
-const static struct file_operations vhost_net_fops = {
+static const struct file_operations vhost_net_fops = {
.owner  = THIS_MODULE,
.release= vhost_net_release,
.unlocked_ioctl = vhost_net_ioctl,
-- 
1.6.3.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] vhost: Storage class should be before const qualifier

2010-05-17 Thread Michael S. Tsirkin
On Mon, May 17, 2010 at 03:12:49PM +0200, Tobias Klauser wrote:
> The C99 specification states in section 6.11.5:
> 
> The placement of a storage-class specifier other than at the beginning
> of the declaration specifiers in a declaration is an obsolescent
> feature.
> 
> Signed-off-by: Tobias Klauser 


Will apply, thanks!
Just to clarify: does some compiler/checker actually barf on this?

> ---
>  drivers/vhost/net.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index aa88911..cd36f5f 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -626,7 +626,7 @@ static long vhost_net_compat_ioctl(struct file *f, 
> unsigned int ioctl,
>  }
>  #endif
>  
> -const static struct file_operations vhost_net_fops = {
> +static const struct file_operations vhost_net_fops = {
>   .owner  = THIS_MODULE,
>   .release= vhost_net_release,
>   .unlocked_ioctl = vhost_net_ioctl,
> -- 
> 1.6.3.3
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [RFC PATCH v3 2/3] block: call the snapshot handlers of the protocol drivers

2010-05-17 Thread MORITA Kazutaka
On Mon, May 17, 2010 at 9:20 PM, Kevin Wolf  wrote:
> Am 17.05.2010 14:19, schrieb MORITA Kazutaka:
>> At Mon, 17 May 2010 13:08:08 +0200,
>> Kevin Wolf wrote:
>>>
>>> Am 17.05.2010 12:19, schrieb MORITA Kazutaka:

  int bdrv_snapshot_goto(BlockDriverState *bs,
                         const char *snapshot_id)
  {
      BlockDriver *drv = bs->drv;
 +    int ret, open_ret;
 +
      if (!drv)
          return -ENOMEDIUM;
 -    if (!drv->bdrv_snapshot_goto)
 -        return -ENOTSUP;
 -    return drv->bdrv_snapshot_goto(bs, snapshot_id);
 +    if (drv->bdrv_snapshot_goto)
 +        return drv->bdrv_snapshot_goto(bs, snapshot_id);
 +
 +    if (bs->file) {
 +        drv->bdrv_close(bs);
 +        ret = bdrv_snapshot_goto(bs->file, snapshot_id);
 +        open_ret = drv->bdrv_open(bs, bs->open_flags);
 +        if (open_ret < 0) {
 +            bdrv_delete(bs);
>>>
>>> I think you mean bs->file here.
>>>
>>> Kevin
>>
>> This is an error of re-opening the format driver, so what we should
>> delete here is not bs->file but bs, isn't it?  If we failed to open bs
>> here, the drive doesn't seem to work anymore.
>
> But bdrv_delete means basically free it. This is almost guaranteed to
> lead to crashes because that BlockDriverState is still in use in other
> places.
>
> One additional case of use after free is in the very next line:
>
 +            bs->drv = NULL;
>
> You can't do that when bs is freed, obviously. But I think just setting
> bs->drv to NULL without bdrv_deleting it before is the better way. It
> will fail any requests (with -ENOMEDIUM), but can't produce crashes.
> This is also what bdrv_commit does in such situations.
>
> In this state, we don't access the underlying file any more, so we could
> delete bs->file - this is why I thought you actually meant to do that.
>

I'm sorry for the confusion.  I understand what we should do here.
I'll fix it for the next post.

Thanks,

Kazutaka
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] AMD Erratum 383 workaround for KVM

2010-05-17 Thread Joerg Roedel
Hi,

these two patches implement the workaround for AMD Erratum 383 into KVM.
This is necessary to prevent the host to crash if a guest triggers the
erratum.
For details on the erratum please see page 96 of

http://support.amd.com/us/Processor_TechDocs/41322.pdf

The workaround implemented in these patches will be documented in the
next update of the revision guide.

Joerg

Diffstat:

 arch/x86/include/asm/msr-index.h |1 +
 arch/x86/kvm/svm.c   |   98 +-
 2 files changed, 98 insertions(+), 1 deletions(-)

Shortlog:

Joerg Roedel (2):
  KVM: SVM: Handle MCEs early in the vmexit process
  KVM: SVM: Implement workaround for Erratum 383


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] KVM: SVM: Implement workaround for Erratum 383

2010-05-17 Thread Joerg Roedel
This patch implements a workaround for AMD erratum 383 into
KVM. Without this erratum fix it is possible for a guest to
kill the host machine. This patch implements the suggested
workaround for hypervisors which will be published by the
next revision guide update.

Cc: sta...@kernel.org
Signed-off-by: Joerg Roedel 
---
 arch/x86/include/asm/msr-index.h |1 +
 arch/x86/kvm/svm.c   |   83 ++
 2 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4eb2f72..fdb6af2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -112,6 +112,7 @@
 #define MSR_AMD64_PATCH_LOADER 0xc0010020
 #define MSR_AMD64_OSVW_ID_LENGTH   0xc0010140
 #define MSR_AMD64_OSVW_STATUS  0xc0010141
+#define MSR_AMD64_DC_CFG   0xc0011022
 #define MSR_AMD64_IBSFETCHCTL  0xc0011030
 #define MSR_AMD64_IBSFETCHLINAD0xc0011031
 #define MSR_AMD64_IBSFETCHPHYSAD   0xc0011032
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f92d191..8a42a0d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 
+#include 
 #include 
 
 #include 
@@ -56,6 +57,8 @@ MODULE_LICENSE("GPL");
 
 #define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
 
+static bool erratum_383_found __read_mostly;
+
 static const u32 host_save_user_msrs[] = {
 #ifdef CONFIG_X86_64
MSR_STAR, MSR_LSTAR, MSR_CSTAR, MSR_SYSCALL_MASK, MSR_KERNEL_GS_BASE,
@@ -374,6 +377,31 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu, 
unsigned nr,
svm->vmcb->control.event_inj_err = error_code;
 }
 
+static void svm_init_erratum_383(void)
+{
+   u32 low, high;
+   int err;
+   u64 val;
+
+   /* Only Fam10h is affected */
+   if (boot_cpu_data.x86 != 0x10)
+   return;
+
+   /* Use _safe variants to not break nested virtualization */
+   val = native_read_msr_safe(MSR_AMD64_DC_CFG, &err);
+   if (err)
+   return;
+
+   val |= (1ULL << 47);
+
+   low  = lower_32_bits(val);
+   high = upper_32_bits(val);
+
+   native_write_msr_safe(MSR_AMD64_DC_CFG, low, high);
+
+   erratum_383_found = true;
+}
+
 static int has_svm(void)
 {
const char *msg;
@@ -429,6 +457,8 @@ static int svm_hardware_enable(void *garbage)
 
wrmsrl(MSR_VM_HSAVE_PA, page_to_pfn(sd->save_area) << PAGE_SHIFT);
 
+   svm_init_erratum_383();
+
return 0;
 }
 
@@ -1410,8 +1440,61 @@ static int nm_interception(struct vcpu_svm *svm)
return 1;
 }
 
+static bool is_erratum_383(void)
+{
+   int err, i;
+   u64 value;
+
+   if (!erratum_383_found)
+   return false;
+
+   value = native_read_msr_safe(MSR_IA32_MC0_STATUS, &err);
+   if (err)
+   return false;
+
+   /* Bit 62 may or may not be set for this mce */
+   value &= ~(1ULL << 62);
+
+   if (value != 0xb6010015)
+   return false;
+
+   /* Clear MCi_STATUS registers */
+   for (i = 0; i < 6; ++i)
+   native_write_msr_safe(MSR_IA32_MCx_STATUS(i), 0, 0);
+
+   value = native_read_msr_safe(MSR_IA32_MCG_STATUS, &err);
+   if (!err) {
+   u32 low, high;
+
+   value &= ~(1ULL << 2);
+   low= lower_32_bits(value);
+   high   = upper_32_bits(value);
+
+   native_write_msr_safe(MSR_IA32_MCG_STATUS, low, high);
+   }
+
+   /* Flush tlb to evict multi-match entries */
+   __flush_tlb_all();
+
+   return true;
+}
+
 static void svm_handle_mce(struct vcpu_svm *svm)
 {
+   if (is_erratum_383()) {
+   /*
+* Erratum 383 triggered. Guest state is corrupt so kill the
+* guest.
+*/
+   struct kvm_run *kvm_run = svm->vcpu.run;
+
+   pr_err("KVM: Guest triggered AMD Erratum 383\n");
+
+   set_bit(KVM_REQ_TRIPLE_FAULT, &svm->vcpu.requests);
+
+   return;
+   }
+
/*
 * On an #MC intercept the MCE handler is not called automatically in
 * the host. So do it by hand here.
-- 
1.7.1


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] KVM: SVM: Handle MCEs early in the vmexit process

2010-05-17 Thread Joerg Roedel
This patch moves handling of the MC vmexits to an earlier
point in the vmexit. The handle_exit function is too late
because the vcpu might alreadry have changed its physical
cpu.

Cc: sta...@kernel.org
Signed-off-by: Joerg Roedel 
---
 arch/x86/kvm/svm.c |   15 ++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cea916f..f92d191 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1410,7 +1410,7 @@ static int nm_interception(struct vcpu_svm *svm)
return 1;
 }
 
-static int mc_interception(struct vcpu_svm *svm)
+static void svm_handle_mce(struct vcpu_svm *svm)
 {
/*
 * On an #MC intercept the MCE handler is not called automatically in
@@ -1420,6 +1420,11 @@ static int mc_interception(struct vcpu_svm *svm)
"int $0x12\n");
/* not sure if we ever come back to this point */
 
+   return;
+}
+
+static int mc_interception(struct vcpu_svm *svm)
+{
return 1;
 }
 
@@ -3183,6 +3188,14 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
}
+
+   /*
+* We need to handle MC intercepts here before the vcpu has a chance to
+* change the physical cpu
+*/
+   if (unlikely(svm->vmcb->control.exit_code ==
+SVM_EXIT_EXCP_BASE + MC_VECTOR))
+   svm_handle_mce(svm);
 }
 
 #undef R
-- 
1.7.1


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [RFC PATCH v3 2/3] block: call the snapshot handlers of the protocol drivers

2010-05-17 Thread Kevin Wolf
Am 17.05.2010 14:19, schrieb MORITA Kazutaka:
> At Mon, 17 May 2010 13:08:08 +0200,
> Kevin Wolf wrote:
>>
>> Am 17.05.2010 12:19, schrieb MORITA Kazutaka:
>>>  
>>>  int bdrv_snapshot_goto(BlockDriverState *bs,
>>> const char *snapshot_id)
>>>  {
>>>  BlockDriver *drv = bs->drv;
>>> +int ret, open_ret;
>>> +
>>>  if (!drv)
>>>  return -ENOMEDIUM;
>>> -if (!drv->bdrv_snapshot_goto)
>>> -return -ENOTSUP;
>>> -return drv->bdrv_snapshot_goto(bs, snapshot_id);
>>> +if (drv->bdrv_snapshot_goto)
>>> +return drv->bdrv_snapshot_goto(bs, snapshot_id);
>>> +
>>> +if (bs->file) {
>>> +drv->bdrv_close(bs);
>>> +ret = bdrv_snapshot_goto(bs->file, snapshot_id);
>>> +open_ret = drv->bdrv_open(bs, bs->open_flags);
>>> +if (open_ret < 0) {
>>> +bdrv_delete(bs);
>>
>> I think you mean bs->file here.
>>
>> Kevin
> 
> This is an error of re-opening the format driver, so what we should
> delete here is not bs->file but bs, isn't it?  If we failed to open bs
> here, the drive doesn't seem to work anymore.

But bdrv_delete means basically free it. This is almost guaranteed to
lead to crashes because that BlockDriverState is still in use in other
places.

One additional case of use after free is in the very next line:

>>> +bs->drv = NULL;

You can't do that when bs is freed, obviously. But I think just setting
bs->drv to NULL without bdrv_deleting it before is the better way. It
will fail any requests (with -ENOMEDIUM), but can't produce crashes.
This is also what bdrv_commit does in such situations.

In this state, we don't access the underlying file any more, so we could
delete bs->file - this is why I thought you actually meant to do that.

Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 2/3] block: call the snapshot handlers of the protocol drivers

2010-05-17 Thread MORITA Kazutaka
At Mon, 17 May 2010 13:08:08 +0200,
Kevin Wolf wrote:
> 
> Am 17.05.2010 12:19, schrieb MORITA Kazutaka:
> >  
> >  int bdrv_snapshot_goto(BlockDriverState *bs,
> > const char *snapshot_id)
> >  {
> >  BlockDriver *drv = bs->drv;
> > +int ret, open_ret;
> > +
> >  if (!drv)
> >  return -ENOMEDIUM;
> > -if (!drv->bdrv_snapshot_goto)
> > -return -ENOTSUP;
> > -return drv->bdrv_snapshot_goto(bs, snapshot_id);
> > +if (drv->bdrv_snapshot_goto)
> > +return drv->bdrv_snapshot_goto(bs, snapshot_id);
> > +
> > +if (bs->file) {
> > +drv->bdrv_close(bs);
> > +ret = bdrv_snapshot_goto(bs->file, snapshot_id);
> > +open_ret = drv->bdrv_open(bs, bs->open_flags);
> > +if (open_ret < 0) {
> > +bdrv_delete(bs);
> 
> I think you mean bs->file here.
> 
> Kevin

This is an error of re-opening the format driver, so what we should
delete here is not bs->file but bs, isn't it?  If we failed to open bs
here, the drive doesn't seem to work anymore.

Regards,

Kazutaka

> > +bs->drv = NULL;
> > +return open_ret;
> > +}
> > +return ret;
> > +}
> > +
> > +return -ENOTSUP;
> >  }
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [RFC PATCH v3 2/3] block: call the snapshot handlers of the protocol drivers

2010-05-17 Thread Kevin Wolf
Am 17.05.2010 12:19, schrieb MORITA Kazutaka:
> When snapshot handlers are not defined in the format driver, it is
> better to call the ones of the protocol driver.  This enables us to
> implement snapshot support in the protocol driver.
> 
> We need to call bdrv_close() and bdrv_open() handlers of the format
> driver before and after bdrv_snapshot_goto() call of the protocol.  It is
> because the contents of the block driver state may need to be changed
> after loading vmstate.
> 
> Signed-off-by: MORITA Kazutaka 
> ---
>  block.c |   61 +++--
>  1 files changed, 43 insertions(+), 18 deletions(-)
> 
> diff --git a/block.c b/block.c
> index f3bf3f2..c987e57 100644
> --- a/block.c
> +++ b/block.c
> @@ -1683,9 +1683,11 @@ int bdrv_save_vmstate(BlockDriverState *bs, const 
> uint8_t *buf,
>  BlockDriver *drv = bs->drv;
>  if (!drv)
>  return -ENOMEDIUM;
> -if (!drv->bdrv_save_vmstate)
> -return -ENOTSUP;
> -return drv->bdrv_save_vmstate(bs, buf, pos, size);
> +if (drv->bdrv_save_vmstate)
> +return drv->bdrv_save_vmstate(bs, buf, pos, size);
> +if (bs->file)
> +return bdrv_save_vmstate(bs->file, buf, pos, size);
> +return -ENOTSUP;
>  }
>  
>  int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
> @@ -1694,9 +1696,11 @@ int bdrv_load_vmstate(BlockDriverState *bs, uint8_t 
> *buf,
>  BlockDriver *drv = bs->drv;
>  if (!drv)
>  return -ENOMEDIUM;
> -if (!drv->bdrv_load_vmstate)
> -return -ENOTSUP;
> -return drv->bdrv_load_vmstate(bs, buf, pos, size);
> +if (drv->bdrv_load_vmstate)
> +return drv->bdrv_load_vmstate(bs, buf, pos, size);
> +if (bs->file)
> +return bdrv_load_vmstate(bs->file, buf, pos, size);
> +return -ENOTSUP;
>  }
>  
>  void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event)
> @@ -1720,20 +1724,37 @@ int bdrv_snapshot_create(BlockDriverState *bs,
>  BlockDriver *drv = bs->drv;
>  if (!drv)
>  return -ENOMEDIUM;
> -if (!drv->bdrv_snapshot_create)
> -return -ENOTSUP;
> -return drv->bdrv_snapshot_create(bs, sn_info);
> +if (drv->bdrv_snapshot_create)
> +return drv->bdrv_snapshot_create(bs, sn_info);
> +if (bs->file)
> +return bdrv_snapshot_create(bs->file, sn_info);
> +return -ENOTSUP;
>  }
>  
>  int bdrv_snapshot_goto(BlockDriverState *bs,
> const char *snapshot_id)
>  {
>  BlockDriver *drv = bs->drv;
> +int ret, open_ret;
> +
>  if (!drv)
>  return -ENOMEDIUM;
> -if (!drv->bdrv_snapshot_goto)
> -return -ENOTSUP;
> -return drv->bdrv_snapshot_goto(bs, snapshot_id);
> +if (drv->bdrv_snapshot_goto)
> +return drv->bdrv_snapshot_goto(bs, snapshot_id);
> +
> +if (bs->file) {
> +drv->bdrv_close(bs);
> +ret = bdrv_snapshot_goto(bs->file, snapshot_id);
> +open_ret = drv->bdrv_open(bs, bs->open_flags);
> +if (open_ret < 0) {
> +bdrv_delete(bs);

I think you mean bs->file here.

Kevin

> +bs->drv = NULL;
> +return open_ret;
> +}
> +return ret;
> +}
> +
> +return -ENOTSUP;
>  }
>  
>  int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Jan Kiszka
Juan Quintela wrote:
> Jan Kiszka  wrote:
>> Juan Quintela wrote:
>>> Lack of "proper" subsections.  IDE is something like:
>>>
>>> const VMStateDescription vmstate_ide_drive = {
>>> .version_id = 4,
>>> 
>>> }
>>>
>>> static const VMStateDescription vmstate_bmdma = {
>>> .name = "ide bmdma",
>>> .version_id = 4,
>>> ...
>>> }
>>>
>>> const VMStateDescription vmstate_ide_pci = {
>>> .name = "ide",
>>> .version_id = 4,
>>> 
>>> VMSTATE_STRUCT_ARRAY(bmdma, PCIIDEState, 2, 0,
>>>  vmstate_bmdma, BMDMAState),
>>> VMSTATE_IDE_DRIVES(bus[0].ifs, PCIIDEState),
>>> VMSTATE_IDE_DRIVES(bus[1].ifs, PCIIDEState),
>>> 
>>> }
>>>
>>>
>>> Notice that everything is at version 4.  It used to be everything at
>>> version 3.  Now the problem is that when migrating from v3 -> v4.  We
>>> put in one place v3, But we only have a version number at the toplevel,
>>> rest of "subsections" don't sent a version number.  There is no way to
>>> fix it in the general case.  We can hack something around for ide, but
>>> that will just be a hack, or we can backport marcelo change and port it
>>> as a proper subsection (that is my plan).  I expect to have time at the
>>> end of next time to work on this.
>> BTW, the IDE subsystem is yet lacking a proper vmstate section split-up
>> along qdev boundaries (ie. vmstate_ide_pci should not contain drive
>> structures). Do you plan to address this as well?
> 
> Not for Friday, and not for 0.12.

For sure. I missed that this was only a 0.12 issue.

> 
> That is 0.13 material, and have to get one agreement on how to go.
> We can go for:
> - good structure
> - backward compatibility
> 
> I can't see any good way to get both at this stage :(  But I am open to
> sugestions.

Based on recent experiments with vmstate to enhance the hpet, I'm fairly
optimistic that we can have both (just the code complexity suffers a
bit): Split up the drive sections for new versions, but keep the legacy
fields with attached .field_exists() filters for reading of old
versions. But I may also underestimate issues of this particular case.

> 
> Later, Juan.
> 
> PD. BTW, very good work with printing the vmstate, that was one of the goals
> when we added it, that was the next step after porting everything to
> vmstate :)
> 

I'm sorry for stealing you the pleasure to add it. :)

Jan



signature.asc
Description: OpenPGP digital signature


buildbot failure in qemu-kvm on default_x86_64_out_of_tree

2010-05-17 Thread qemu-kvm
The Buildbot has detected a new failure of default_x86_64_out_of_tree on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_out_of_tree/builds/337

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason: 
Build Source Stamp: [branch master] HEAD
Blamelist: Amit Shah ,Andrzej Zaborowski 
,Anthony Liguori ,Avi Kivity 
,Blue Swirl ,Corentin Chary 
,Gleb Natapov ,Igor V. Kovalenko 
,Isaku Yamahata ,Jan Kiszka 
,Marcelo Tosatti ,Markus 
Armbruster ,Shin-ichiro KAWASAKI 
,Stefan Weil 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


buildbot failure in qemu-kvm on default_i386_out_of_tree

2010-05-17 Thread qemu-kvm
The Buildbot has detected a new failure of default_i386_out_of_tree on qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_out_of_tree/builds/335

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: 
Build Source Stamp: [branch master] HEAD
Blamelist: Amit Shah ,Andrzej Zaborowski 
,Anthony Liguori ,Avi Kivity 
,Blue Swirl ,Corentin Chary 
,Gleb Natapov ,Igor V. Kovalenko 
,Isaku Yamahata ,Jan Kiszka 
,Marcelo Tosatti ,Markus 
Armbruster ,Shin-ichiro KAWASAKI 
,Stefan Weil 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


buildbot failure in qemu-kvm on default_i386_debian_5_0

2010-05-17 Thread qemu-kvm
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/398

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: 
Build Source Stamp: [branch master] HEAD
Blamelist: Amit Shah ,Andrzej Zaborowski 
,Anthony Liguori ,Avi Kivity 
,Blue Swirl ,Corentin Chary 
,Gleb Natapov ,Igor V. Kovalenko 
,Isaku Yamahata ,Jan Kiszka 
,Marcelo Tosatti ,Markus 
Armbruster ,Shin-ichiro KAWASAKI 
,Stefan Weil 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


buildbot failure in qemu-kvm on default_x86_64_debian_5_0

2010-05-17 Thread qemu-kvm
The Buildbot has detected a new failure of default_x86_64_debian_5_0 on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_debian_5_0/builds/396

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason: 
Build Source Stamp: [branch master] HEAD
Blamelist: Amit Shah ,Andrzej Zaborowski 
,Anthony Liguori ,Avi Kivity 
,Blue Swirl ,Corentin Chary 
,Gleb Natapov ,Igor V. Kovalenko 
,Isaku Yamahata ,Jan Kiszka 
,Marcelo Tosatti ,Markus 
Armbruster ,Shin-ichiro KAWASAKI 
,Stefan Weil 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ANNOUNCE] kvm-kmod-2.6.34

2010-05-17 Thread Jan Kiszka
Linux 2.6.34 is out, and here comes the corresponding kvm-kmod package.
Download is available at

http://downloads.sourceforge.net/project/kvm/kvm-kmod/2.6.34/kvm-kmod-2.6.34.tar.bz2

KVM changes since kvm-kmod-2.6.34-rc3:
 - VMX: blocked-by-sti must not defer NMI injections
 - x86: Call vcpu_load and vcpu_put in cpuid_update
 - SVM: Fix wrong intercept masks on 32 bit
 - convert ioapic lock to spinlock
 - x86: Fix TSS size check for 16-bit tasks
 - Add missing srcu_read_lock() for kvm_mmu_notifier_release()
 - Increase NR_IOBUS_DEVS limit to 200
 - fix the handling of dirty bitmaps to avoid overflows
 - MMU: fix kvm_mmu_zap_page() and its calling path
 - VMX: Save/restore rflags.vm correctly in real mode
 - allow bit 10 to be cleared in MSR_IA32_MC4_CTL
 - Don't spam kernel log when injecting exceptions due to bad cr writes
 - SVM: Fix memory leaks that happen when svm_create_vcpu() fails
 - take srcu lock before call to complete_pio()

kvm-kmod changes:
 - stop distributing generated kvm-kmod-config.h

As previously announced, this release also marks the end-of-life for the
kvm-kmod-2.6.33.x series.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 3/3] block: add sheepdog driver for distributed storage support

2010-05-17 Thread MORITA Kazutaka
Hi,

Thank you very much for the reviewing!

At Fri, 14 May 2010 13:08:06 +0200,
Kevin Wolf wrote:

> > +
> > +struct sd_req {
> > +   uint8_t proto_ver;
> > +   uint8_t opcode;
> > +   uint16_tflags;
> > +   uint32_tepoch;
> > +   uint32_tid;
> > +   uint32_tdata_length;
> > +   uint32_topcode_specific[8];
> > +};
> 
> CODING_STYLE says that structs should be typedefed and their names
> should be in CamelCase. So something like this:
> 
> typedef struct SheepdogReq {
> ...
> } SheepdogReq;
> 
> (Or, if your prefer, SDReq; but with things like SDAIOCB I think it
> becomes hard to read)
> 

I see. I use Sheepdog as a prefix, like SheepdogReq.


> > +/*
> > +
> > +#undef eprintf
> > +#define eprintf(fmt, args...)  
> > \
> > +do {   
> > \
> > +   fprintf(stderr, "%s %d: " fmt, __func__, __LINE__, ##args); \
> > +} while (0)
> 
> What about using error_report() instead of fprintf? Though it should be
> the same currently.
> 

Yes, using common helper functions is better.  I replaced all the
printf.


> > +
> > +   for (i = 0; i < ARRAY_SIZE(errors); ++i)
> > +   if (errors[i].err == err)
> > +   return errors[i].desc;
> 
> CODING_STYLE requires braces here.
> 

I fixed all the missing braces.


> > +
> > +   return "Invalid error code";
> > +}
> > +
> > +static inline int before(uint32_t seq1, uint32_t seq2)
> > +{
> > +return (int32_t)(seq1 - seq2) < 0;
> > +}
> > +
> > +static inline int after(uint32_t seq1, uint32_t seq2)
> > +{
> > +   return (int32_t)(seq2 - seq1) < 0;
> > +}
> 
> These functions look strange... Is the difference to seq1 < seq2 that
> the cast introduces intentional? (after(0x0, 0xabcdefff) == 1)
> 
> If yes, why is this useful? This needs a comment. If no, why even bother
> to have this function instead of directly using < or > ?
> 

These functions are used to compare sequential numbers which can be
wrap-around. For example, linux/net/tcp.h in the linux kernel.

Anyway, sheepdog doesn't use these functions, so I removed them.


> > +   if (snapid)
> > +   dprintf("%" PRIx32 " non current inode was open.\n", vid);
> > +   else
> > +   s->is_current = 1;
> > +
> > +   fd = connect_to_sdog(s->addr);
> 
> I wonder why you need to open another connection here instead of using
> s->fd. This pattern repeats at least in the snapshot functions, so I'm
> sure it's there for a reason. Maybe add a comment?
> 

We can use s->fd only for aio read/write operations.  It is because
the block driver may be during waiting response from the server, so we
cannot send other requests to the discriptor to avoid receiving wrong
data.

I added the comment to get_sheep_fd().


> > +
> > +   iov.iov_base = &s->inode;
> > +   iov.iov_len = sizeof(s->inode);
> > +   aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
> > +   data_len, offset, 0, 0, offset);
> > +   if (!aio_req) {
> > +   eprintf("too many requests\n");
> > +   acb->ret = -EIO;
> > +   goto out;
> > +   }
> 
> Randomly failing requests is probably not a good idea. The guest might
> decide that the disk/file system is broken and stop using it. Can't you
> use a list like in AIOPool, so you can dynamically add new requests as
> needed?
> 

I agree.  In the v3 patch, AIO requests are allocated dynamically, and
all the requests are linked to the outstanding_aio_head in the
BDRVSheepdogState.


> > +
> > +static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
> > +{
> > +   struct bdrv_sd_state *s = bs->opaque;
> > +   struct bdrv_sd_state *old_s;
> > +   char vdi[SD_MAX_VDI_LEN];
> > +   char *buf = NULL;
> > +   uint32_t vid;
> > +   uint32_t snapid = 0;
> > +   int ret = -ENOENT, fd;
> > +
> > +   old_s = qemu_malloc(sizeof(struct bdrv_sd_state));
> > +   if (!old_s) {
> 
> qemu_malloc never returns NULL.
> 

I removed all the NULL checks.


> > +
> > +BlockDriver bdrv_sheepdog = {
> > +   .format_name = "sheepdog",
> > +   .protocol_name = "sheepdog",
> > +   .instance_size = sizeof(struct bdrv_sd_state),
> > +   .bdrv_file_open = sd_open,
> > +   .bdrv_close = sd_close,
> > +   .bdrv_create = sd_create,
> > +
> > +   .bdrv_aio_readv = sd_aio_readv,
> > +   .bdrv_aio_writev = sd_aio_writev,
> > +
> > +   .bdrv_snapshot_create = sd_snapshot_create,
> > +   .bdrv_snapshot_goto = sd_snapshot_goto,
> > +   .bdrv_snapshot_delete = sd_snapshot_delete,
> > +   .bdrv_snapshot_list = sd_snapshot_list,
> > +
> > +   .bdrv_save_vmstate = sd_save_vmstate,
> > +   .bdrv_load_vmstate = sd_load_vmstate,
> > +
> > +   .create_options = sd_create_options,
> > +};
> 
> Please align the = to the same column, at least in each block.
> 

I have aligned in the v3 patch.


Thanks,

Kazutaka
--
To unsu

[RFC PATCH v3 1/3] close all the block drivers before the qemu process exits

2010-05-17 Thread MORITA Kazutaka
This patch calls the close handler of the block driver before the qemu
process exits.

This is necessary because the sheepdog block driver releases the lock
of VM images in the close handler.

Signed-off-by: MORITA Kazutaka 
---
 block.c |9 +
 block.h |1 +
 vl.c|1 +
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index bfe46e3..f3bf3f2 100644
--- a/block.c
+++ b/block.c
@@ -636,6 +636,15 @@ void bdrv_close(BlockDriverState *bs)
 }
 }
 
+void bdrv_close_all(void)
+{
+BlockDriverState *bs;
+
+QTAILQ_FOREACH(bs, &bdrv_states, list) {
+bdrv_close(bs);
+}
+}
+
 void bdrv_delete(BlockDriverState *bs)
 {
 /* remove from list, if necessary */
diff --git a/block.h b/block.h
index 278259c..531e802 100644
--- a/block.h
+++ b/block.h
@@ -121,6 +121,7 @@ BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
 /* Ensure contents are flushed to disk.  */
 void bdrv_flush(BlockDriverState *bs);
 void bdrv_flush_all(void);
+void bdrv_close_all(void);
 
 int bdrv_has_zero_init(BlockDriverState *bs);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
diff --git a/vl.c b/vl.c
index 85bcc84..5ce7807 100644
--- a/vl.c
+++ b/vl.c
@@ -2007,6 +2007,7 @@ static void main_loop(void)
 exit(0);
 }
 }
+bdrv_close_all();
 pause_all_vcpus();
 }
 
-- 
1.5.6.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 0/3] Sheepdog: distributed storage system for QEMU

2010-05-17 Thread MORITA Kazutaka
Hi all,

This patch adds a block driver for Sheepdog distributed storage
system.

Changes from v2 to v3 are:

 - add drv->bdrv_close() and drv->bdrv_open() before and after
   bdrv_snapshot_goto() call of the protocol.
 - address the review comments on the sheepdog driver code.
   I'll send the details in the reply of the review mail.

Changes from v1 to v2 are:

 - rebase onto git://repo.or.cz/qemu/kevin.git block
 - modify the sheepdog driver as a protocol driver
 - add new patch to call the snapshot handler of the protocol

If this patchset is okay, I'll work on the image creation problem of the
protocol driver.

Thanks,

Kazutaka


MORITA Kazutaka (3):
  close all the block drivers before the qemu process exits
  block: call the snapshot handlers of the protocol drivers
  block: add sheepdog driver for distributed storage support

 Makefile.objs|2 +-
 block.c  |   70 ++-
 block.h  |1 +
 block/sheepdog.c | 1845 ++
 vl.c |1 +
 5 files changed, 1900 insertions(+), 19 deletions(-)
 create mode 100644 block/sheepdog.c

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 3/3] block: add sheepdog driver for distributed storage support

2010-05-17 Thread MORITA Kazutaka
Sheepdog is a distributed storage system for QEMU. It provides highly
available block level storage volumes to VMs like Amazon EBS.  This
patch adds a qemu block driver for Sheepdog.

Sheepdog features are:
- No node in the cluster is special (no metadata node, no control
  node, etc)
- Linear scalability in performance and capacity
- No single point of failure
- Autonomous management (zero configuration)
- Useful volume management support such as snapshot and cloning
- Thin provisioning
- Autonomous load balancing

The more details are available at the project site:
http://www.osrg.net/sheepdog/

Signed-off-by: MORITA Kazutaka 
---
 Makefile.objs|2 +-
 block/sheepdog.c | 1845 ++
 2 files changed, 1846 insertions(+), 1 deletions(-)
 create mode 100644 block/sheepdog.c

diff --git a/Makefile.objs b/Makefile.objs
index ecdd53e..6edbc57 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -14,7 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
-block-nested-y += parallels.o nbd.o blkdebug.o
+block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
diff --git a/block/sheepdog.c b/block/sheepdog.c
new file mode 100644
index 000..4672f00
--- /dev/null
+++ b/block/sheepdog.c
@@ -0,0 +1,1845 @@
+/*
+ * Copyright (C) 2009-2010 Nippon Telegraph and Telephone Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+#include 
+#include 
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "block_int.h"
+
+#define SD_PROTO_VER 0x01
+
+#define SD_DEFAULT_ADDR "localhost:7000"
+
+#define SD_OP_CREATE_AND_WRITE_OBJ  0x01
+#define SD_OP_READ_OBJ   0x02
+#define SD_OP_WRITE_OBJ  0x03
+
+#define SD_OP_NEW_VDI0x11
+#define SD_OP_LOCK_VDI   0x12
+#define SD_OP_RELEASE_VDI0x13
+#define SD_OP_GET_VDI_INFO   0x14
+#define SD_OP_READ_VDIS  0x15
+
+#define SD_FLAG_CMD_WRITE0x01
+#define SD_FLAG_CMD_COW  0x02
+
+#define SD_RES_SUCCESS   0x00 /* Success */
+#define SD_RES_UNKNOWN   0x01 /* Unknown error */
+#define SD_RES_NO_OBJ0x02 /* No object found */
+#define SD_RES_EIO   0x03 /* I/O error */
+#define SD_RES_VDI_EXIST 0x04 /* Vdi exists already */
+#define SD_RES_INVALID_PARMS 0x05 /* Invalid parameters */
+#define SD_RES_SYSTEM_ERROR  0x06 /* System error */
+#define SD_RES_VDI_LOCKED0x07 /* Vdi is locked */
+#define SD_RES_NO_VDI0x08 /* No vdi found */
+#define SD_RES_NO_BASE_VDI   0x09 /* No base vdi found */
+#define SD_RES_VDI_READ  0x0A /* Cannot read requested vdi */
+#define SD_RES_VDI_WRITE 0x0B /* Cannot write requested vdi */
+#define SD_RES_BASE_VDI_READ 0x0C /* Cannot read base vdi */
+#define SD_RES_BASE_VDI_WRITE   0x0D /* Cannot write base vdi */
+#define SD_RES_NO_TAG0x0E /* Requested tag is not found */
+#define SD_RES_STARTUP   0x0F /* Sheepdog is on starting up */
+#define SD_RES_VDI_NOT_LOCKED   0x10 /* Vdi is not locked */
+#define SD_RES_SHUTDOWN  0x11 /* Sheepdog is shutting down */
+#define SD_RES_NO_MEM0x12 /* Cannot allocate memory */
+#define SD_RES_FULL_VDI  0x13 /* we already have the maximum vdis */
+#define SD_RES_VER_MISMATCH  0x14 /* Protocol version mismatch */
+#define SD_RES_NO_SPACE  0x15 /* Server has no room for new objects */
+#define SD_RES_WAIT_FOR_FORMAT  0x16 /* Sheepdog is waiting for a format 
operation */
+#define SD_RES_WAIT_FOR_JOIN0x17 /* Sheepdog is waiting for other nodes 
joining */
+#define SD_RES_JOIN_FAILED   0x18 /* Target node had failed to join sheepdog */
+
+/*
+ * Object ID rules
+ *
+ *  0 - 19 (20 bits): data object space
+ * 20 - 31 (12 bits): reserved data object space
+ * 32 - 55 (24 bits): vdi object space
+ * 56 - 59 ( 4 bits): reserved vdi object space
+ * 60 - 63 ( 4 bits): object type indentifier space
+ */
+
+#define VDI_SPACE_SHIFT   32
+#define VDI_BIT (UINT64_C(1) << 63)
+#define VMSTATE_BIT (UINT64_C(1) << 62)
+#define MAX_DATA_OBJS (1ULL << 20)
+#define MAX_CHILDREN 1024
+#define SD_MAX_VDI_LEN 256
+#define SD_NR_VDIS   (1U << 24)
+#define SD_DATA_OBJ_SIZE (UINT64_C(1) << 22)
+
+#define SD_INODE_SIZE (sizeof(SheepdogInode))
+#define CURRENT_VDI_ID 0
+
+typedef struct SheepdogReq {
+   uint8_t proto_ver;
+   uint8_t opcode;
+   uint16_tflags;
+   uint32_tepoch;
+   uint32_tid;
+   uint32_tdata_length

[RFC PATCH v3 2/3] block: call the snapshot handlers of the protocol drivers

2010-05-17 Thread MORITA Kazutaka
When snapshot handlers are not defined in the format driver, it is
better to call the ones of the protocol driver.  This enables us to
implement snapshot support in the protocol driver.

We need to call bdrv_close() and bdrv_open() handlers of the format
driver before and after bdrv_snapshot_goto() call of the protocol.  It is
because the contents of the block driver state may need to be changed
after loading vmstate.

Signed-off-by: MORITA Kazutaka 
---
 block.c |   61 +++--
 1 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/block.c b/block.c
index f3bf3f2..c987e57 100644
--- a/block.c
+++ b/block.c
@@ -1683,9 +1683,11 @@ int bdrv_save_vmstate(BlockDriverState *bs, const 
uint8_t *buf,
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_save_vmstate)
-return -ENOTSUP;
-return drv->bdrv_save_vmstate(bs, buf, pos, size);
+if (drv->bdrv_save_vmstate)
+return drv->bdrv_save_vmstate(bs, buf, pos, size);
+if (bs->file)
+return bdrv_save_vmstate(bs->file, buf, pos, size);
+return -ENOTSUP;
 }
 
 int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
@@ -1694,9 +1696,11 @@ int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_load_vmstate)
-return -ENOTSUP;
-return drv->bdrv_load_vmstate(bs, buf, pos, size);
+if (drv->bdrv_load_vmstate)
+return drv->bdrv_load_vmstate(bs, buf, pos, size);
+if (bs->file)
+return bdrv_load_vmstate(bs->file, buf, pos, size);
+return -ENOTSUP;
 }
 
 void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event)
@@ -1720,20 +1724,37 @@ int bdrv_snapshot_create(BlockDriverState *bs,
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_snapshot_create)
-return -ENOTSUP;
-return drv->bdrv_snapshot_create(bs, sn_info);
+if (drv->bdrv_snapshot_create)
+return drv->bdrv_snapshot_create(bs, sn_info);
+if (bs->file)
+return bdrv_snapshot_create(bs->file, sn_info);
+return -ENOTSUP;
 }
 
 int bdrv_snapshot_goto(BlockDriverState *bs,
const char *snapshot_id)
 {
 BlockDriver *drv = bs->drv;
+int ret, open_ret;
+
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_snapshot_goto)
-return -ENOTSUP;
-return drv->bdrv_snapshot_goto(bs, snapshot_id);
+if (drv->bdrv_snapshot_goto)
+return drv->bdrv_snapshot_goto(bs, snapshot_id);
+
+if (bs->file) {
+drv->bdrv_close(bs);
+ret = bdrv_snapshot_goto(bs->file, snapshot_id);
+open_ret = drv->bdrv_open(bs, bs->open_flags);
+if (open_ret < 0) {
+bdrv_delete(bs);
+bs->drv = NULL;
+return open_ret;
+}
+return ret;
+}
+
+return -ENOTSUP;
 }
 
 int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
@@ -1741,9 +1762,11 @@ int bdrv_snapshot_delete(BlockDriverState *bs, const 
char *snapshot_id)
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_snapshot_delete)
-return -ENOTSUP;
-return drv->bdrv_snapshot_delete(bs, snapshot_id);
+if (drv->bdrv_snapshot_delete)
+return drv->bdrv_snapshot_delete(bs, snapshot_id);
+if (bs->file)
+return bdrv_snapshot_delete(bs->file, snapshot_id);
+return -ENOTSUP;
 }
 
 int bdrv_snapshot_list(BlockDriverState *bs,
@@ -1752,9 +1775,11 @@ int bdrv_snapshot_list(BlockDriverState *bs,
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
-if (!drv->bdrv_snapshot_list)
-return -ENOTSUP;
-return drv->bdrv_snapshot_list(bs, psn_info);
+if (drv->bdrv_snapshot_list)
+return drv->bdrv_snapshot_list(bs, psn_info);
+if (bs->file)
+return bdrv_snapshot_list(bs->file, psn_info);
+return -ENOTSUP;
 }
 
 #define NB_SUFFIXES 4
-- 
1.5.6.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][v2 3/3] KVM: x86: Use FPU API

2010-05-17 Thread Avi Kivity

On 05/17/2010 12:08 PM, Sheng Yang wrote:

Convert KVM to use generic FPU API.

  void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
  {
if (vcpu->guest_fpu_loaded)
@@ -5156,7 +5134,7 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)

vcpu->guest_fpu_loaded = 1;
unlazy_fpu(current);
-   kvm_fx_restore(&vcpu->arch.guest_fx_image);
+   fpu_restore_checking(&vcpu->arch.guest_fpu);
trace_kvm_fpu(1);
  }
   


Do we need to do something on error here?

I think an error can only occur if userspace sets bad fpu state, so 
maybe do the check in the ioctl.


In any case, unrelated to this patch.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Alexander Graf

On 17.05.2010, at 11:11, Michael Tokarev wrote:

> 17.05.2010 13:07, Juan Quintela wrote:
>> Michael Tokarev  wrote:
> []
>>> http://www.mail-archive.com/kvm@vger.kernel.org/msg34051.html
>>> I wonder why it is not noticed before -- it's broken since 0.12...
>> 
>> People has become rich and everybody has a 64 bit hardware :-)
> 
> Actually I don't think there's a CPU that's 32bits and has
> necessary VT instructions.  At least not the ones to care
> about - maybe there were some models but they'e not common
> and are gone long time ago.

Intel Core Solo / Core Duo (old)
Intel Atom z5xx (new)

In fact, my mobile KVM development machine is a z530 based notebook. That one's 
32-bit only. But then again I have no idea why anyone would need migration 
there.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][v2 0/3] Convert KVM to use FPU API

2010-05-17 Thread Avi Kivity

On 05/17/2010 12:08 PM, Sheng Yang wrote:

Change from v1:
Use unlazy_fpu() to handle host FPU, avoiding save/restore of host FPU states.

   


Looks good, will wait a bit for more reviews and apply.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][v2 1/3] x86: Export FPU API for KVM use

2010-05-17 Thread Avi Kivity

On 05/17/2010 12:19 PM, Sheng Yang wrote:

On Monday 17 May 2010 17:19:02 Avi Kivity wrote:
   

On 05/17/2010 12:08 PM, Sheng Yang wrote:
 

Also add some constants.


+++ b/arch/x86/kernel/process.c
@@ -28,6 +28,7 @@ unsigned long idle_nomwait;

   EXPORT_SYMBOL(idle_nomwait);

   struct kmem_cache *task_xstate_cachep;

+EXPORT_SYMBOL(task_xstate_cachep);

   int arch_dup_task_struct(struct task_struct *dst, struct task_struct
   *src) {
   

_GPL() unless good reason not to.
 

Oops... Just copied and pasted the above line...

   


No matter, I'll update it when applying if no further comments 
(reviewing 3 now).


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][v2 1/3] x86: Export FPU API for KVM use

2010-05-17 Thread Sheng Yang
Also add some constants.

Signed-off-by: Sheng Yang 
---
 arch/x86/include/asm/i387.h  |2 ++
 arch/x86/include/asm/xsave.h |3 +++
 arch/x86/kernel/i387.c   |3 ++-
 arch/x86/kernel/process.c|1 +
 4 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 8002e9c..9610628 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -488,6 +488,8 @@ static inline void fpu_copy(struct fpu *dst, struct fpu 
*src)
memcpy(dst->state, src->state, xstate_size);
 }
 
+extern void fpu_finit(struct fpu *fpu);
+
 #endif /* __ASSEMBLY__ */
 
 #define PSHUFB_XMM5_XMM0 .byte 0x66, 0x0f, 0x38, 0x00, 0xc5
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 2c4390c..29ee4e4 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,9 @@
 
 #define FXSAVE_SIZE512
 
+#define XSTATE_YMM_SIZE 256
+#define XSTATE_YMM_OFFSET (512 + 64)
+
 /*
  * These are the features that the OS can handle currently.
  */
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 86cef6b..cbc 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -107,7 +107,7 @@ void __cpuinit fpu_init(void)
 }
 #endif /* CONFIG_X86_64 */
 
-static void fpu_finit(struct fpu *fpu)
+void fpu_finit(struct fpu *fpu)
 {
 #ifdef CONFIG_X86_32
if (!HAVE_HWFP) {
@@ -132,6 +132,7 @@ static void fpu_finit(struct fpu *fpu)
fp->fos = 0xu;
}
 }
+EXPORT_SYMBOL_GPL(fpu_finit);
 
 /*
  * The _current_ task is using the FPU for the first time
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8bcc21f..0f331f4 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -28,6 +28,7 @@ unsigned long idle_nomwait;
 EXPORT_SYMBOL(idle_nomwait);
 
 struct kmem_cache *task_xstate_cachep;
+EXPORT_SYMBOL_GPL(task_xstate_cachep);
 
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
-- 
1.7.0.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][v2 1/3] x86: Export FPU API for KVM use

2010-05-17 Thread Sheng Yang
On Monday 17 May 2010 17:19:02 Avi Kivity wrote:
> On 05/17/2010 12:08 PM, Sheng Yang wrote:
> > Also add some constants.
> > 
> > 
> > +++ b/arch/x86/kernel/process.c
> > @@ -28,6 +28,7 @@ unsigned long idle_nomwait;
> > 
> >   EXPORT_SYMBOL(idle_nomwait);
> >   
> >   struct kmem_cache *task_xstate_cachep;
> > 
> > +EXPORT_SYMBOL(task_xstate_cachep);
> > 
> >   int arch_dup_task_struct(struct task_struct *dst, struct task_struct
> >   *src) {
> 
> _GPL() unless good reason not to.

Oops... Just copied and pasted the above line...

--
regards
Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][v2 1/3] x86: Export FPU API for KVM use

2010-05-17 Thread Avi Kivity

On 05/17/2010 12:08 PM, Sheng Yang wrote:

Also add some constants.
   



+++ b/arch/x86/kernel/process.c
@@ -28,6 +28,7 @@ unsigned long idle_nomwait;
  EXPORT_SYMBOL(idle_nomwait);

  struct kmem_cache *task_xstate_cachep;
+EXPORT_SYMBOL(task_xstate_cachep);

  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
  {
   


_GPL() unless good reason not to.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Michael Tokarev

17.05.2010 13:07, Juan Quintela wrote:

Michael Tokarev  wrote:

[]

http://www.mail-archive.com/kvm@vger.kernel.org/msg34051.html
I wonder why it is not noticed before -- it's broken since 0.12...


People has become rich and everybody has a 64 bit hardware :-)


Actually I don't think there's a CPU that's 32bits and has
necessary VT instructions.  At least not the ones to care
about - maybe there were some models but they'e not common
and are gone long time ago.

But apparently there are quite some people who use 32bit
OS on 64bit-capable hardware, for one reason or another.
I'm doing it due to historical reasons, but at least I'm
running 64bit kernel... ;)


Will take a look at the end of the week.


Thank you!

/mjt
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][v2 1/3] x86: Export FPU API for KVM use

2010-05-17 Thread Sheng Yang
Also add some constants.

Signed-off-by: Sheng Yang 
---
 arch/x86/include/asm/i387.h  |2 ++
 arch/x86/include/asm/xsave.h |3 +++
 arch/x86/kernel/i387.c   |3 ++-
 arch/x86/kernel/process.c|1 +
 4 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 8002e9c..9610628 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -488,6 +488,8 @@ static inline void fpu_copy(struct fpu *dst, struct fpu 
*src)
memcpy(dst->state, src->state, xstate_size);
 }
 
+extern void fpu_finit(struct fpu *fpu);
+
 #endif /* __ASSEMBLY__ */
 
 #define PSHUFB_XMM5_XMM0 .byte 0x66, 0x0f, 0x38, 0x00, 0xc5
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 2c4390c..29ee4e4 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,9 @@
 
 #define FXSAVE_SIZE512
 
+#define XSTATE_YMM_SIZE 256
+#define XSTATE_YMM_OFFSET (512 + 64)
+
 /*
  * These are the features that the OS can handle currently.
  */
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 86cef6b..cbc 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -107,7 +107,7 @@ void __cpuinit fpu_init(void)
 }
 #endif /* CONFIG_X86_64 */
 
-static void fpu_finit(struct fpu *fpu)
+void fpu_finit(struct fpu *fpu)
 {
 #ifdef CONFIG_X86_32
if (!HAVE_HWFP) {
@@ -132,6 +132,7 @@ static void fpu_finit(struct fpu *fpu)
fp->fos = 0xu;
}
 }
+EXPORT_SYMBOL_GPL(fpu_finit);
 
 /*
  * The _current_ task is using the FPU for the first time
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8bcc21f..373fec9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -28,6 +28,7 @@ unsigned long idle_nomwait;
 EXPORT_SYMBOL(idle_nomwait);
 
 struct kmem_cache *task_xstate_cachep;
+EXPORT_SYMBOL(task_xstate_cachep);
 
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
-- 
1.7.0.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][v2 2/3] KVM: x86: Use unlazy_fpu() for host FPU

2010-05-17 Thread Sheng Yang
We can avoid unnecessary fpu load when userspace process
didn't use FPU frequently.

Derived from Avi's idea.

Signed-off-by: Sheng Yang 
---
 arch/x86/include/asm/kvm_host.h |1 -
 arch/x86/kvm/x86.c  |   18 ++
 2 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0c06148..d93601c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -301,7 +301,6 @@ struct kvm_vcpu_arch {
unsigned long mmu_seq;
} update_pte;
 
-   struct i387_fxsave_struct host_fx_image;
struct i387_fxsave_struct guest_fx_image;
 
gva_t mmio_fault_cr2;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7500cba..ba8c2d2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -52,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MAX_IO_MSRS 256
 #define CR0_RESERVED_BITS  \
@@ -5134,21 +5135,10 @@ void fx_init(struct kvm_vcpu *vcpu)
 {
unsigned after_mxcsr_mask;
 
-   /*
-* Touch the fpu the first time in non atomic context as if
-* this is the first fpu instruction the exception handler
-* will fire before the instruction returns and it'll have to
-* allocate ram with GFP_KERNEL.
-*/
-   if (!used_math())
-   kvm_fx_save(&vcpu->arch.host_fx_image);
-
/* Initialize guest FPU by resetting ours and saving into guest's */
preempt_disable();
-   kvm_fx_save(&vcpu->arch.host_fx_image);
kvm_fx_finit();
kvm_fx_save(&vcpu->arch.guest_fx_image);
-   kvm_fx_restore(&vcpu->arch.host_fx_image);
preempt_enable();
 
vcpu->arch.cr0 |= X86_CR0_ET;
@@ -5165,7 +5155,7 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
return;
 
vcpu->guest_fpu_loaded = 1;
-   kvm_fx_save(&vcpu->arch.host_fx_image);
+   unlazy_fpu(current);
kvm_fx_restore(&vcpu->arch.guest_fx_image);
trace_kvm_fpu(1);
 }
@@ -5177,7 +5167,6 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 
vcpu->guest_fpu_loaded = 0;
kvm_fx_save(&vcpu->arch.guest_fx_image);
-   kvm_fx_restore(&vcpu->arch.host_fx_image);
++vcpu->stat.fpu_reload;
set_bit(KVM_REQ_DEACTIVATE_FPU, &vcpu->requests);
trace_kvm_fpu(0);
@@ -5203,9 +5192,6 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
int r;
 
-   /* We do fxsave: this must be aligned. */
-   BUG_ON((unsigned long)&vcpu->arch.host_fx_image & 0xF);
-
vcpu->arch.mtrr_state.have_fixed = 1;
vcpu_load(vcpu);
r = kvm_arch_vcpu_reset(vcpu);
-- 
1.7.0.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][v2 3/3] KVM: x86: Use FPU API

2010-05-17 Thread Sheng Yang
Convert KVM to use generic FPU API.

Signed-off-by: Sheng Yang 
---
 arch/x86/include/asm/kvm_host.h |   17 +
 arch/x86/kvm/x86.c  |   52 ---
 2 files changed, 17 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d93601c..d08bb4a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -301,7 +301,7 @@ struct kvm_vcpu_arch {
unsigned long mmu_seq;
} update_pte;
 
-   struct i387_fxsave_struct guest_fx_image;
+   struct fpu guest_fpu;
 
gva_t mmio_fault_cr2;
struct kvm_pio_request pio;
@@ -708,21 +708,6 @@ static inline unsigned long read_msr(unsigned long msr)
 }
 #endif
 
-static inline void kvm_fx_save(struct i387_fxsave_struct *image)
-{
-   asm("fxsave (%0)":: "r" (image));
-}
-
-static inline void kvm_fx_restore(struct i387_fxsave_struct *image)
-{
-   asm("fxrstor (%0)":: "r" (image));
-}
-
-static inline void kvm_fx_finit(void)
-{
-   asm("finit");
-}
-
 static inline u32 get_rdx_init_val(void)
 {
return 0x600; /* P6 family */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ba8c2d2..7be1d36 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MAX_IO_MSRS 256
 #define CR0_RESERVED_BITS  \
@@ -5058,27 +5059,6 @@ out:
 }
 
 /*
- * fxsave fpu state.  Taken from x86_64/processor.h.  To be killed when
- * we have asm/x86/processor.h
- */
-struct fxsave {
-   u16 cwd;
-   u16 swd;
-   u16 twd;
-   u16 fop;
-   u64 rip;
-   u64 rdp;
-   u32 mxcsr;
-   u32 mxcsr_mask;
-   u32 st_space[32];   /* 8*16 bytes for each FP-reg = 128 bytes */
-#ifdef CONFIG_X86_64
-   u32 xmm_space[64];  /* 16*16 bytes for each XMM-reg = 256 bytes */
-#else
-   u32 xmm_space[32];  /* 8*16 bytes for each XMM-reg = 128 bytes */
-#endif
-};
-
-/*
  * Translate a guest virtual address to a guest physical address.
  */
 int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
@@ -5101,7 +5081,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   struct fxsave *fxsave = (struct fxsave *)&vcpu->arch.guest_fx_image;
+   struct i387_fxsave_struct *fxsave =
+   &vcpu->arch.guest_fpu.state->fxsave;
 
memcpy(fpu->fpr, fxsave->st_space, 128);
fpu->fcw = fxsave->cwd;
@@ -5117,7 +5098,8 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
 
 int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   struct fxsave *fxsave = (struct fxsave *)&vcpu->arch.guest_fx_image;
+   struct i387_fxsave_struct *fxsave =
+   &vcpu->arch.guest_fpu.state->fxsave;
 
memcpy(fxsave->st_space, fpu->fpr, 128);
fxsave->cwd = fpu->fcw;
@@ -5133,22 +5115,18 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
 
 void fx_init(struct kvm_vcpu *vcpu)
 {
-   unsigned after_mxcsr_mask;
-
-   /* Initialize guest FPU by resetting ours and saving into guest's */
-   preempt_disable();
-   kvm_fx_finit();
-   kvm_fx_save(&vcpu->arch.guest_fx_image);
-   preempt_enable();
+   fpu_alloc(&vcpu->arch.guest_fpu);
+   fpu_finit(&vcpu->arch.guest_fpu);
 
vcpu->arch.cr0 |= X86_CR0_ET;
-   after_mxcsr_mask = offsetof(struct i387_fxsave_struct, st_space);
-   vcpu->arch.guest_fx_image.mxcsr = 0x1f80;
-   memset((void *)&vcpu->arch.guest_fx_image + after_mxcsr_mask,
-  0, sizeof(struct i387_fxsave_struct) - after_mxcsr_mask);
 }
 EXPORT_SYMBOL_GPL(fx_init);
 
+static void fx_free(struct kvm_vcpu *vcpu)
+{
+   fpu_free(&vcpu->arch.guest_fpu);
+}
+
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
if (vcpu->guest_fpu_loaded)
@@ -5156,7 +5134,7 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 
vcpu->guest_fpu_loaded = 1;
unlazy_fpu(current);
-   kvm_fx_restore(&vcpu->arch.guest_fx_image);
+   fpu_restore_checking(&vcpu->arch.guest_fpu);
trace_kvm_fpu(1);
 }
 
@@ -5166,7 +5144,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
return;
 
vcpu->guest_fpu_loaded = 0;
-   kvm_fx_save(&vcpu->arch.guest_fx_image);
+   fpu_save_init(&vcpu->arch.guest_fpu);
++vcpu->stat.fpu_reload;
set_bit(KVM_REQ_DEACTIVATE_FPU, &vcpu->requests);
trace_kvm_fpu(0);
@@ -5179,6 +5157,7 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
vcpu->arch.time_page = NULL;
}
 
+   fx_free(vcpu);
kvm_x86_ops->vcpu_free(vcpu);
 }
 
@@ -5213,6 +5192,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
kvm_mmu_unload(vcpu);
vcp

[PATCH][v2 0/3] Convert KVM to use FPU API

2010-05-17 Thread Sheng Yang
Change from v1:
Use unlazy_fpu() to handle host FPU, avoiding save/restore of host FPU states.

Sheng Yang (3):
  x86: Export FPU API for KVM use
  KVM: x86: Use unlazy_fpu() for host FPU
  KVM: x86: Use FPU API

 arch/x86/include/asm/i387.h |2 +
 arch/x86/include/asm/kvm_host.h |   18 +-
 arch/x86/include/asm/xsave.h|3 ++
 arch/x86/kernel/i387.c  |3 +-
 arch/x86/kernel/process.c   |1 +
 arch/x86/kvm/x86.c  |   70 ++-
 6 files changed, 27 insertions(+), 70 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Juan Quintela
Michael Tokarev  wrote:
> 17.05.2010 11:00, Avi Kivity wrote:
>> On 05/16/2010 11:04 PM, Juan Quintela wrote:
> []
 We've regressed from failing some migrations to failing all migrations.
>>> Humm, 0.12.4 -> 0.12.4 should work. My advise is just revert the patch
>>> and live with it for another week, what do you think?
>>
>> A week is fine. I'm not going to release 0.12.4.1 for the week until you
>> fix it, and just changing things in git doesn't help users.
>
> There are at least 1 other problem in the migration code that's
> worth looking at (IMHO anyway): namely, migration on 32 bit host
> is completely (as far as i can see) broken (instant kvm process
> crash) in 0.12, but it works on 64bits:
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg34051.html
>
> I wonder why it is not noticed before -- it's broken since 0.12...

People has become rich and everybody has a 64 bit hardware :-)

Will take a look at the end of the week.

Thank for the report, Juan.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH 0/12] KVM, x86, ppc, asm-generic: moving dirty bitmaps to user space

2010-05-17 Thread Takuya Yoshikawa



User allocated bitmaps have the advantage of reducing pinned memory.
However we have plenty more pinned memory allocated in memory slots, so
by itself, user allocated bitmaps don't justify this change.


In that sense, what do you think about the question I sent last week?

=== REPOST 1 ===
>>
>> mark_page_dirty is called with the mmu_lock spinlock held in set_spte.
>> Must find a way to move it outside of the spinlock section.
>>
>
> Oh, it's a serious problem. I have to consider it.

Avi, Marcelo,

Sorry but I have to say that mmu_lock spin_lock problem was completely out of
my mind. Although I looked through the code, it seems not easy to move the
set_bit_user to outside of spinlock section without breaking the semantics of
its protection.

So this may take some time to solve.

But personally, I want to do something for x86's "vmallc() every time" problem
even though moving dirty bitmaps to user space cannot be achieved soon.

In that sense, do you mind if we do double buffering without moving dirty 
bitmaps to
user space?

I know that the resource for vmalloc() is precious for x86 but even now, at the 
timing
of get_dirty_log, we use the same amount of memory as double buffering.
=== 1 END ===




Perhaps if we optimize memory slot write protection (I have some ideas
about this) we can make the performance improvement more pronounced.



It's really nice!

Even now we can measure the performance improvement by introducing switch ioctl
when guest is relatively idle, so the combination will be really effective!

=== REPOST 2 ===
>>
>> Can you post such a test, for an idle large guest?
>
> OK, I'll do!


Result of "low workload test" (running top during migration) first,

4GB guest
picked up slots[1](len=3757047808) only
*
get.org get.optswitch.opt

1060875 310292 190335
1076754 301295 188600
 655504 318284 196029
 529769 301471325
 694796  70216 221172
 651868 353073 196184
 543339 312865 213236
1061938  72785 203090
 689527 323901 249519
 621364 323881473
1063671  70703 192958
 915903 336318 174008
1046462 332384782
1037942  72783 190655
 680122 318305 243544
 688156 314935 193526
 558658 265934 190550
 652454 372135 196270
 660140  68613352
1101947 378642 186575
.........
*

As expected we've got the difference more clearly.

In this case, switch.opt reduced 1/3 (.1 msec) compared to get.opt
for each iteration.

And when the slot is cleaner, the ratio is bigger.
=== 2 END ===
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: x86: Use FPU API

2010-05-17 Thread Sheng Yang
On Monday 17 May 2010 16:18:22 Avi Kivity wrote:
> On 05/17/2010 11:09 AM, Sheng Yang wrote:
> > Convert KVM to use generic FPU API.
> > 
> > Signed-off-by: Sheng Yang
> > ---
> > Like this? (Drop patch 1)
> 
> Will be more readable with a patch that converts host_fx_image to
> unlazy_fpu(), and a second patch that converts guest_fx_image to the fpu
> API.

OK.
 
> I think unlazy_fpu() is even a performance win in case userspace doesn't
> do a lot of floating point (which is the case with qemu).  I wonder why
> we didn't think of it before.

...
--
regards
Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Michael Tokarev

17.05.2010 11:00, Avi Kivity wrote:

On 05/16/2010 11:04 PM, Juan Quintela wrote:

[]

We've regressed from failing some migrations to failing all migrations.

Humm, 0.12.4 -> 0.12.4 should work. My advise is just revert the patch
and live with it for another week, what do you think?


A week is fine. I'm not going to release 0.12.4.1 for the week until you
fix it, and just changing things in git doesn't help users.


There are at least 1 other problem in the migration code that's
worth looking at (IMHO anyway): namely, migration on 32 bit host
is completely (as far as i can see) broken (instant kvm process
crash) in 0.12, but it works on 64bits:

http://www.mail-archive.com/kvm@vger.kernel.org/msg34051.html

I wonder why it is not noticed before -- it's broken since 0.12...

Thanks!

/mjt
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: x86: Use FPU API

2010-05-17 Thread Avi Kivity

On 05/17/2010 11:09 AM, Sheng Yang wrote:

Convert KVM to use generic FPU API.

Signed-off-by: Sheng Yang
---
Like this? (Drop patch 1)

   


Will be more readable with a patch that converts host_fx_image to 
unlazy_fpu(), and a second patch that converts guest_fx_image to the fpu 
API.


I think unlazy_fpu() is even a performance win in case userspace doesn't 
do a lot of floating point (which is the case with qemu).  I wonder why 
we didn't think of it before.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] KVM: x86: Use FPU API

2010-05-17 Thread Sheng Yang
Convert KVM to use generic FPU API.

Signed-off-by: Sheng Yang 
---
Like this? (Drop patch 1)

 arch/x86/include/asm/kvm_host.h |   18 +-
 arch/x86/kvm/x86.c  |   70 ++-
 2 files changed, 19 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0c06148..d08bb4a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -301,8 +301,7 @@ struct kvm_vcpu_arch {
unsigned long mmu_seq;
} update_pte;
 
-   struct i387_fxsave_struct host_fx_image;
-   struct i387_fxsave_struct guest_fx_image;
+   struct fpu guest_fpu;
 
gva_t mmio_fault_cr2;
struct kvm_pio_request pio;
@@ -709,21 +708,6 @@ static inline unsigned long read_msr(unsigned long msr)
 }
 #endif
 
-static inline void kvm_fx_save(struct i387_fxsave_struct *image)
-{
-   asm("fxsave (%0)":: "r" (image));
-}
-
-static inline void kvm_fx_restore(struct i387_fxsave_struct *image)
-{
-   asm("fxrstor (%0)":: "r" (image));
-}
-
-static inline void kvm_fx_finit(void)
-{
-   asm("finit");
-}
-
 static inline u32 get_rdx_init_val(void)
 {
return 0x600; /* P6 family */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7500cba..7be1d36 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -52,6 +52,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #define MAX_IO_MSRS 256
 #define CR0_RESERVED_BITS  \
@@ -5057,27 +5059,6 @@ out:
 }
 
 /*
- * fxsave fpu state.  Taken from x86_64/processor.h.  To be killed when
- * we have asm/x86/processor.h
- */
-struct fxsave {
-   u16 cwd;
-   u16 swd;
-   u16 twd;
-   u16 fop;
-   u64 rip;
-   u64 rdp;
-   u32 mxcsr;
-   u32 mxcsr_mask;
-   u32 st_space[32];   /* 8*16 bytes for each FP-reg = 128 bytes */
-#ifdef CONFIG_X86_64
-   u32 xmm_space[64];  /* 16*16 bytes for each XMM-reg = 256 bytes */
-#else
-   u32 xmm_space[32];  /* 8*16 bytes for each XMM-reg = 128 bytes */
-#endif
-};
-
-/*
  * Translate a guest virtual address to a guest physical address.
  */
 int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
@@ -5100,7 +5081,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   struct fxsave *fxsave = (struct fxsave *)&vcpu->arch.guest_fx_image;
+   struct i387_fxsave_struct *fxsave =
+   &vcpu->arch.guest_fpu.state->fxsave;
 
memcpy(fpu->fpr, fxsave->st_space, 128);
fpu->fcw = fxsave->cwd;
@@ -5116,7 +5098,8 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
 
 int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
-   struct fxsave *fxsave = (struct fxsave *)&vcpu->arch.guest_fx_image;
+   struct i387_fxsave_struct *fxsave =
+   &vcpu->arch.guest_fpu.state->fxsave;
 
memcpy(fxsave->st_space, fpu->fpr, 128);
fxsave->cwd = fpu->fcw;
@@ -5132,41 +5115,26 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
 
 void fx_init(struct kvm_vcpu *vcpu)
 {
-   unsigned after_mxcsr_mask;
-
-   /*
-* Touch the fpu the first time in non atomic context as if
-* this is the first fpu instruction the exception handler
-* will fire before the instruction returns and it'll have to
-* allocate ram with GFP_KERNEL.
-*/
-   if (!used_math())
-   kvm_fx_save(&vcpu->arch.host_fx_image);
-
-   /* Initialize guest FPU by resetting ours and saving into guest's */
-   preempt_disable();
-   kvm_fx_save(&vcpu->arch.host_fx_image);
-   kvm_fx_finit();
-   kvm_fx_save(&vcpu->arch.guest_fx_image);
-   kvm_fx_restore(&vcpu->arch.host_fx_image);
-   preempt_enable();
+   fpu_alloc(&vcpu->arch.guest_fpu);
+   fpu_finit(&vcpu->arch.guest_fpu);
 
vcpu->arch.cr0 |= X86_CR0_ET;
-   after_mxcsr_mask = offsetof(struct i387_fxsave_struct, st_space);
-   vcpu->arch.guest_fx_image.mxcsr = 0x1f80;
-   memset((void *)&vcpu->arch.guest_fx_image + after_mxcsr_mask,
-  0, sizeof(struct i387_fxsave_struct) - after_mxcsr_mask);
 }
 EXPORT_SYMBOL_GPL(fx_init);
 
+static void fx_free(struct kvm_vcpu *vcpu)
+{
+   fpu_free(&vcpu->arch.guest_fpu);
+}
+
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
if (vcpu->guest_fpu_loaded)
return;
 
vcpu->guest_fpu_loaded = 1;
-   kvm_fx_save(&vcpu->arch.host_fx_image);
-   kvm_fx_restore(&vcpu->arch.guest_fx_image);
+   unlazy_fpu(current);
+   fpu_restore_checking(&vcpu->arch.guest_fpu);
trace_kvm_fpu(1);
 }
 
@@ -5176,8 +5144,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
return;
 
vcpu->

Re: [PATCH v2 10/10] KVM test: Add a helper to search the panic in the log

2010-05-17 Thread Jason Wang

Michael Goldish wrote:

On 05/11/2010 12:04 PM, Jason Wang wrote:
  

This checker serves as the post_command to find the panic information
in the file which contains the content of guest serial console.

Signed-off-by: Jason Wang 
---
 client/tests/kvm/scripts/check_serial.py |   41 ++
 client/tests/kvm/tests_base.cfg.sample   |7 -
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 client/tests/kvm/scripts/check_serial.py

diff --git a/client/tests/kvm/scripts/check_serial.py 
b/client/tests/kvm/scripts/check_serial.py
new file mode 100644
index 000..969bbe3
--- /dev/null
+++ b/client/tests/kvm/scripts/check_serial.py
@@ -0,0 +1,41 @@
+import os, sys, glob, re
+
+
+class SerialCheckerError(Exception):
+"""
+Simple wrapper for the builtin Exception class.
+"""
+pass
+
+
+class SerialChecker(object):
+"""
+Serach the panic or other keywords in the guest serial.
+"""
+def __init__(self):
+"""
+Gets params from environment variables and sets class attributes.
+"""
+client_dir =  os.environ['AUTODIR']
+self.pattern = os.environ['KVM_TEST_search_pattern']
+self.shortname = os.environ['KVM_TEST_shortname']
+self.debugdir = os.path.join(client_dir, 
"results/default/kvm.%s/debug" \



I think the final backslash is unnecessary.

  

+ % self.shortname)
+self.serial_files = glob.glob(os.path.join(self.debugdir, 'serial*'))
+
+
+def check(self):
+"""
+Check whether the pattern were found in the serial files
+"""
+fail = [ f for f in self.serial_files if
+ re.findall(self.pattern, file(f).read(), re.I) ]
+if fail:
+print "%s is found in %s" % (self.pattern, fail)
+raise SerialCheckerError("Error found during the check, Please" \



Same here.

  

+ " check the log")
+
+
+if __name__ == "__main__":
+checker = SerialChecker()
+checker.check()



I wonder why we need a class.  Why not just put all the code here?

  

Just follow the style of other pre_command, maybe Lucas like it.

diff --git a/client/tests/kvm/tests_base.cfg.sample 
b/client/tests/kvm/tests_base.cfg.sample
index 3c0933e..3ac8f0d 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -52,6 +52,10 @@ address_index = 0
 # Misc
 profilers = kvm_stat
 
+# pattern serach in guest serial console

+serach_pattern = panic
+post_command = "python scripts/check_serial.py"
+post_command_noncritical = no
 
 # Tests

 variants:
@@ -1314,10 +1318,9 @@ virtio|virtio_blk|e1000|balloon_check:
 variants:
 - @qcow2:
 image_format = qcow2
-post_command = " python scripts/check_image.py;"
+post_command = " python scripts/check_image.py; python 
scripts/check_serial.py"


^
This should be +=, because post_command may have been previously
assigned some value.

  

Would change it, and do you have any other comments about this patchset?

 remove_image = no
 post_command_timeout = 600
-post_command_noncritical = yes
 - vmdk:
 only Fedora Ubuntu Windows
 only smp2

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
  


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] KVM test: Add rtc into the guest test

2010-05-17 Thread Jason Wang
rtc test is a test for rtc driver which could also be useful for
testing the rtc emulation in kvm.

Signed-off-by: Jason Wang 
---
 0 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/autotest_control/rtc.control 
b/client/tests/kvm/autotest_control/rtc.control
new file mode 100644
index 000..b7597e3
--- /dev/null
+++ b/client/tests/kvm/autotest_control/rtc.control
@@ -0,0 +1,15 @@
+TIME="SHORT"
+AUTHOR = "Jason Wang "
+DOC = """
+rtc is a simple test of realtime clock driver which was grabbed from 
+Documentation/rtc.txt. It does the functional test of interrupt, alarm and 
+requeseted frequency.
+
+Please refer the kernel documentation for details.
+"""
+NAME = 'rtc'
+TEST_CLASS = 'kernel'
+TEST_CATEGORY = 'Functional'
+TEST_TYPE = 'client'
+
+job.run_test('rtc', maxfreq=8192)
diff --git a/client/tests/kvm/tests_base.cfg.sample 
b/client/tests/kvm/tests_base.cfg.sample
index bb3646c..b003a3e 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -151,6 +151,8 @@ variants:
 test_control_file = scrashme.control
 - hwclock:
 test_control_file = hwclock.control
+- rtc:
+test_control_file = rtc.control
 
 - linux_s3: install setup unattended_install
 type = linux_s3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] Adding rtctest as client test

2010-05-17 Thread Jason Wang
This new autotest module implements a simple test for the driver of realtime
clock. It do the testing of interrupt, date reading, alarm and
frequency.

Please refer the Documentation/rtc.txt for details.

Signed-off-by: Jason Wang 
---
 client/tests/rtc/src/Makefile  |   19 +++
 client/tests/rtc/src/rtctest.c |  261 
 2 files changed, 280 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/rtc/src/Makefile
 create mode 100644 client/tests/rtc/src/rtctest.c

diff --git a/client/tests/rtc/control b/client/tests/rtc/control
new file mode 100644
index 000..f0f64c9
--- /dev/null
+++ b/client/tests/rtc/control
@@ -0,0 +1,15 @@
+TIME="SHORT"
+AUTHOR = "Jason Wang "
+DOC = """
+rtc is a simple test of realtime clock driver which was grabbed from 
+Documentation/rtc.txt. It does the functional test of interrupt, alarm and 
+requeseted frequency.
+
+Please refer the kernel documentation for details.
+"""
+NAME = 'rtc'
+TEST_CLASS = 'kernel'
+TEST_CATEGORY = 'Functional'
+TEST_TYPE = 'client'
+
+job.run_test('rtc')
diff --git a/client/tests/rtc/rtc.py b/client/tests/rtc/rtc.py
new file mode 100644
index 000..d4a1500
--- /dev/null
+++ b/client/tests/rtc/rtc.py
@@ -0,0 +1,23 @@
+import os
+from autotest_lib.client.bin import test, utils
+from autotest_lib.client.common_lib import error
+
+class rtc(test.test):
+version = 1
+
+preserve_srcdir = True
+
+def setup(self):
+os.chdir(self.srcdir)
+utils.system('make')
+
+
+def initialize(self):
+self.job.require_gcc()
+
+
+def run_once(self, def_rtc="/dev/rtc0", maxfreq=64):
+if not os.path.exists(def_rtc):
+raise error.TestNAError("%s doest not existed." % def_rtc)
+result = utils.system(self.srcdir + '/rtctest %s %s' % (def_rtc,
+maxfreq))
diff --git a/client/tests/rtc/src/Makefile b/client/tests/rtc/src/Makefile
new file mode 100644
index 000..f99dc60
--- /dev/null
+++ b/client/tests/rtc/src/Makefile
@@ -0,0 +1,19 @@
+CC=cc
+CFLAGS=-O -Wall -Wstrict-prototypes
+
+PROGS= rtctest
+
+SRCS=  rtctest.c
+OBJS=  ${SRCS:.c=.o}
+
+
+all:   $(PROGS)
+
+rtctest:   $(OBJS)
+   $(CC) $(LDFLAGS) -o rtctest $(OBJS)
+
+clean:
+   -rm -f $(OBJS)
+
+clobber:   clean
+   -rm -f $(PROGS)
diff --git a/client/tests/rtc/src/rtctest.c b/client/tests/rtc/src/rtctest.c
new file mode 100644
index 000..25b74af
--- /dev/null
+++ b/client/tests/rtc/src/rtctest.c
@@ -0,0 +1,261 @@
+/*
+ *  Real Time Clock Driver Test/Example Program
+ *
+ *  Compile with:
+ *  gcc -s -Wall -Wstrict-prototypes rtctest.c -o rtctest
+ *
+ *  Copyright (C) 1996, Paul Gortmaker.
+ *  Copyright (C) 2010, Jason Wang 
+ *
+ *  Released under the GNU General Public License, version 2,
+ *  included herein by reference.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+/*
+ * This expects the new RTC class driver framework, working with
+ * clocks that will often not be clones of what the PC-AT had.
+ * Use the command line to specify another RTC if you need one.
+ */
+static const char default_rtc[] = "/dev/rtc0";
+static int maxfreq = 64;
+
+int main(int argc, char **argv)
+{
+   int i, fd, retval, irqcount = 0;
+   unsigned long tmp, data;
+   struct rtc_time rtc_tm;
+   const char *rtc = default_rtc;
+
+   switch (argc) {
+   case 3:
+   maxfreq = atoi(argv[2]);
+   case 2:
+   rtc = argv[1];
+   /* FALLTHROUGH */
+   case 1:
+   break;
+   default:
+   fprintf(stderr, "usage:  rtctest [rtcdev] [maxfreq]\n");
+   return 1;
+   }
+
+   fd = open(rtc, O_RDONLY);
+
+   if (fd ==  -1) {
+   perror(rtc);
+   exit(errno);
+   }
+
+   fprintf(stderr, "\n\t\t\tRTC Driver Test Example.\n\n");
+
+   /* Turn on update interrupts (one per second) */
+   retval = ioctl(fd, RTC_UIE_ON, 0);
+   if (retval == -1) {
+   if (errno == ENOTTY) {
+   fprintf(stderr,
+   "\n...Update IRQs not supported.\n");
+   goto test_READ;
+   }
+   perror("RTC_UIE_ON ioctl");
+   exit(errno);
+   }
+
+   fprintf(stderr, "Counting 5 update (1/sec) interrupts from reading %s:",
+   rtc);
+   fflush(stderr);
+   for (i=1; i<6; i++) {
+   /* This read will block */
+   retval = read(fd, &data, sizeof(unsigned long));
+   if (retval == -1) {
+   perror("read");
+   exit(errno);
+   }
+   fprintf(stderr, " %d",i);
+   fflush(stderr);
+

[PATCH -v3] KVM, Fix QEMU-KVM is killed by guest SRAO MCE

2010-05-17 Thread Huang Ying
In common cases, guest SRAO MCE will cause corresponding poisoned page
be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay
the MCE to guest OS.

But it is reported that if the poisoned page is accessed in guest
after un-mapped and before MCE is relayed to guest OS, QEMU-KVM will
be killed.

The reason is as follow. Because poisoned page has been un-mapped,
guest access will cause guest exit and kvm_mmu_page_fault will be
called. kvm_mmu_page_fault can not get the poisoned page for fault
address, so kernel and user space MMIO processing is tried in turn. In
user MMIO processing, poisoned page is accessed again, then QEMU-KVM
is killed by force_sig_info.

To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM
and do not try kernel and user space MMIO processing for poisoned
page.


Changelog:

v3:

- Make is_hwpoison_address workable for pud_large or pmd_large address.

v2:

- Use page table walker to determine whether the virtual address is
  poisoned to avoid change user space interface (via changing
  get_user_pages).

- Wrap bad page processing into kvm_handle_bad_page to avoid code
  duplicating.

Reported-by: Max Asbock 
Signed-off-by: Huang Ying 
---
 arch/x86/kvm/mmu.c |   34 ++
 arch/x86/kvm/paging_tmpl.h |7 ++-
 include/linux/kvm_host.h   |1 +
 include/linux/mm.h |8 
 mm/memory-failure.c|   30 ++
 virt/kvm/kvm_main.c|   30 --
 6 files changed, 95 insertions(+), 15 deletions(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1975,6 +1976,27 @@ static int __direct_map(struct kvm_vcpu
return pt_write;
 }
 
+static void kvm_send_hwpoison_signal(struct kvm *kvm, gfn_t gfn)
+{
+   char buf[1];
+   void __user *hva;
+   int r;
+
+   /* Touch the page, so send SIGBUS */
+   hva = (void __user *)gfn_to_hva(kvm, gfn);
+   r = copy_from_user(buf, hva, 1);
+}
+
+static int kvm_handle_bad_page(struct kvm *kvm, gfn_t gfn, pfn_t pfn)
+{
+   kvm_release_pfn_clean(pfn);
+   if (is_hwpoison_pfn(pfn)) {
+   kvm_send_hwpoison_signal(kvm, gfn);
+   return 0;
+   }
+   return 1;
+}
+
 static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 {
int r;
@@ -1998,10 +2020,8 @@ static int nonpaging_map(struct kvm_vcpu
pfn = gfn_to_pfn(vcpu->kvm, gfn);
 
/* mmio */
-   if (is_error_pfn(pfn)) {
-   kvm_release_pfn_clean(pfn);
-   return 1;
-   }
+   if (is_error_pfn(pfn))
+   return kvm_handle_bad_page(vcpu->kvm, gfn, pfn);
 
spin_lock(&vcpu->kvm->mmu_lock);
if (mmu_notifier_retry(vcpu, mmu_seq))
@@ -2204,10 +2224,8 @@ static int tdp_page_fault(struct kvm_vcp
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
pfn = gfn_to_pfn(vcpu->kvm, gfn);
-   if (is_error_pfn(pfn)) {
-   kvm_release_pfn_clean(pfn);
-   return 1;
-   }
+   if (is_error_pfn(pfn))
+   return kvm_handle_bad_page(vcpu->kvm, gfn, pfn);
spin_lock(&vcpu->kvm->mmu_lock);
if (mmu_notifier_retry(vcpu, mmu_seq))
goto out_unlock;
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -424,11 +424,8 @@ static int FNAME(page_fault)(struct kvm_
pfn = gfn_to_pfn(vcpu->kvm, walker.gfn);
 
/* mmio */
-   if (is_error_pfn(pfn)) {
-   pgprintk("gfn %lx is mmio\n", walker.gfn);
-   kvm_release_pfn_clean(pfn);
-   return 1;
-   }
+   if (is_error_pfn(pfn))
+   return kvm_handle_bad_page(vcpu->kvm, walker.gfn, pfn);
 
spin_lock(&vcpu->kvm->mmu_lock);
if (mmu_notifier_retry(vcpu, mmu_seq))
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -254,6 +254,7 @@ extern pfn_t bad_pfn;
 
 int is_error_page(struct page *page);
 int is_error_pfn(pfn_t pfn);
+int is_hwpoison_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
 int kvm_set_memory_region(struct kvm *kvm,
  struct kvm_userspace_memory_region *mem,
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1464,6 +1464,14 @@ extern int sysctl_memory_failure_recover
 extern void shake_page(struct page *p, int access);
 extern atomic_long_t mce_bad_pages;
 extern int soft_offline_page(struct page *page, int flags);
+#ifdef CONFIG_MEMORY_FAILURE
+int is_hwpoison_address(unsigned long addr);
+#else
+static inline int is_hwpoison_address(unsigned long addr)
+{
+   return 0;
+}
+#endif
 
 extern void dump_page(struct page *page);
 
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 
 int sysctl_memory_failure_early_kill __read_mostly = 0;
@@ -1296,3 +1297,32 @

Re: Qemu-KVM Livate Migration 0.12.2 -> 0.12.3/4 broken?

2010-05-17 Thread Avi Kivity

On 05/16/2010 11:04 PM, Juan Quintela wrote:



So, to make the story short: I know what is happening, and I know how to
fix it, just that fix is not trivial.  I just need time.

   

Meanwhile, we have a broken 0.12.4.  Is there a quick'n'dirty
workaround that will be forward compatible with the real fix that we
can push out?
 

revert the patch.  It almost never happen (being in the middle of one
IO) while migrating.
   


If the guest does any work, it will always happen.

   

We've regressed from failing some migrations to failing all migrations.
 

Humm, 0.12.4 ->  0.12.4 should work.  My advise is just revert the patch
and live with it for another week, what do you think?
   


A week is fine.  I'm not going to release 0.12.4.1 for the week until 
you fix it, and just changing things in git doesn't help users.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html