Re: [kvm-devel] Changing disks from IDE to SCSI on a Windows VM

2008-04-16 Thread Felix Leimbach
Jun, what happens when you boot into Windows' safe mode?
It usually displays the drivers as it loads them. Would be interesting 
to see where it hangs.

Also try this:
* Boot with the working IDE disk and the dummy SCSI disk attached, as 
per Alberto's instructions
* After you have confirmed that the SCSI controller and disk show up in 
Device Manager, delete the currently used IDE controller from Device Manager
* Reboot into the SCSI setup

Good luck,
Felix

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 1/5] Add some trace markers and expose interfaces in kernel for tracing

2008-04-16 Thread Liu, Eric E
Hollis Blanchard wrote:
> On Tuesday 15 April 2008 22:13:28 Liu, Eric E wrote:
>> Hollis Blanchard wrote:
>>> On Wednesday 09 April 2008 05:01:36 Liu, Eric E wrote:
>>>> +/* This structure represents a single trace buffer record. */
>>>> +struct kvm_trace_rec {
>>>> +	__u32 event:28;
>>>> +	__u32 extra_u32:3;
>>>> +	__u32 cycle_in:1;
>>>> +	__u32 pid;
>>>> +	__u32 vcpu_id;
>>>> +	union {
>>>> +		struct {
>>>> +			__u32 cycle_lo, cycle_hi;
>>>> +			__u32 extra_u32[KVM_TRC_EXTRA_MAX];
>>>> +		} cycle;
>>>> +		struct {
>>>> +			__u32 extra_u32[KVM_TRC_EXTRA_MAX];
>>>> +		} nocycle;
>>>> +	} u;
>>>> +};
>>>
>>> Do we really need bitfields here? They are notoriously non-portable.
>>>
>>> Practically speaking, this will prevent me from copying a trace file
>>> from my big-endian target to my little-endian workstation for
>>> analysis, at least without some ugly hacking in the userland tool.
>>
>> Here the main consideration for using bitfields is to save storage
>> space for each record, but as you said, it is non-portable for the
>> case you mentioned, so should we adjust the struct like this?
>>   __u32 event;
>>   __u16 extra_u32;
>>   __u16 cycle_in;
>
> If space really is a worry, you could still combine the fields, and
> just use masks to extract the data later. No matter what,
> byteswapping is required in the userland tool. I suspect this isn't
> there already, but it will be easier to add without the bitfields.
>
> Hmm, while we're on the subject, I'm not sure what the best way to
> automatically byteswap will be. It probably isn't worth it to convert
> all trace data to a standard ordering (which would add overhead to
> tracing), but I suppose there is no metadata in the trace log? A
> command line switch might be inconvenient but inevitable.

One (somewhat tricky) approach: we insert metadata into the trace file before 
the trace log is read, so that the analysis tool can look at the metadata to 
check whether it needs to convert the byte order?
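
The two ideas discussed above (masks instead of bitfields, and metadata
that lets the tool detect byte order) could be sketched roughly as follows.
This is only an illustration, not code from the patch: the bit positions,
the magic value TRC_MAGIC, and all trc_* helper names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing of the three fields into one 32-bit word:
 * bits 0-27 event, bits 28-30 extra_u32 count, bit 31 cycle_in.
 * The patch itself does not define this layout. */
#define TRC_EVENT_MASK   0x0fffffffu
#define TRC_EXTRA_SHIFT  28
#define TRC_EXTRA_MASK   0x7u
#define TRC_CYCLE_SHIFT  31

/* Hypothetical magic word written once at the start of the trace file.
 * Any value that differs from its own byte-swapped form would do. */
#define TRC_MAGIC        0x4b564d54u

/* Reverse the byte order of a 32-bit word. */
uint32_t trc_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Inspect the magic word as read by the analysis tool.
 * Returns 0 if no swap is needed, 1 if every word must be swapped,
 * -1 if the file is not recognized at all. */
int trc_need_swap(uint32_t magic_as_read)
{
	if (magic_as_read == TRC_MAGIC)
		return 0;
	if (trc_swab32(magic_as_read) == TRC_MAGIC)
		return 1;
	return -1;
}

/* Combine the three fields into one word (producer side). */
uint32_t trc_pack(uint32_t event, uint32_t extra, uint32_t cycle_in)
{
	assert(extra <= TRC_EXTRA_MASK && cycle_in <= 1);
	return (event & TRC_EVENT_MASK) |
	       (extra << TRC_EXTRA_SHIFT) |
	       (cycle_in << TRC_CYCLE_SHIFT);
}

/* Extract the fields again with masks (consumer side). */
uint32_t trc_event(uint32_t w)    { return w & TRC_EVENT_MASK; }
uint32_t trc_extra(uint32_t w)    { return (w >> TRC_EXTRA_SHIFT) & TRC_EXTRA_MASK; }
uint32_t trc_cycle_in(uint32_t w) { return w >> TRC_CYCLE_SHIFT; }
```

With something like this, the tool would read the first word, call
trc_need_swap(), and run trc_swab32() over each 32-bit word of every record
before extracting fields; tracing itself pays no conversion cost.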




[kvm-devel] Extboot Option ROM rewritten in C - v2

2008-04-16 Thread Nguyen Anh Quynh
Hi Anthony,

I found a bug in the last code: send_command() failed to copy the
result back into the extboot_cmd structure. This patch fixes it.

I successfully tested this version with Win2K (fully updated,
SCSI boot) and Linux 2.6.25-rc8 (virtio) guests.

Let me know if you can boot Windows with this version.

Thanks,
Quynh
---
This code is an attempt to rewrite the current extboot option ROM in
C. The new code minimizes the assembly, which is now very small and
simple: boot.S's only job is to interface with the C code, which does
all the dirty work. signrom is modified to handle the new resulting
binary image.

The resulting option ROM has the same size as the original one, 1.5KB,
while the actual code size is about the same, 1.2KB (gcc optimizes
really well).

To install this option ROM, do the following steps as root:

make
make save      --- back up the original option ROM to /usr/share/qemu/extboot.bin.org
make install   --- overwrite /usr/share/qemu/extboot.bin with the new option ROM


extboot2.tbz2
Description: Binary data


Re: [kvm-devel] Changing disks from IDE to SCSI on a Windows VM

2008-04-16 Thread Elmar Haneke
> That's it.  If anyone can attest to this procedure and report, it would 
> be greatly appreciated.

I tried to do so, but I get a disk read error.

kvm is invoked via

qemu-system-x86_64 \
-m 768 \
-drive file=Vista.img,if=scsi,bus=0,index=0,media=disk,boot=on \
-net nic,model=e1000,macaddr=52:54:00:12:34:$ifnum \
-net tap,ifname=$iface,script=no \
-k de \
-usb \
-usbdevice tablet \
-monitor stdio


What's going wrong?

Elmar
inline: screen.png



Re: [kvm-devel] [patch 2/2] QEMU: decrease console refresh rate with -nographic

2008-04-16 Thread Avi Kivity
Anthony Liguori wrote:

>> What about aio completions?  The only race-free way to handle both 
>> posix aio completion and fd readiness is signals AFAIK.
>
> We poll aio completion after the select don't we?  Worst case scenario 
> we miss a signal and wait to poll after the next select event.  That's 
> going to occur very often because of the timer.

If select() doesn't enable signals (like you can do with pselect) you 
may sit for a long time in select() until the timer expires.

Consider a 100Hz Linux guest running 'ls -lR' out of a cold cache: 
instead of 1-2 ms disk latencies you'll see 10 ms latencies, killing 
performance by a factor of 5.

I see the following possible solutions:

1. Apply Anders' patch and keep I/O completions signal based.

2. Use signalfd() to convert aio completions to fd readiness, emulating 
signalfd() using a thread which does sigwait()+write() (to a pipe) on 
older hosts

3. Use a separate thread for aio completions

4. Use pselect(), live with the race on older hosts (it was introduced 
in 2.6.16, which we barely support anyway), live with the signal 
delivery inefficiency.

When I started writing this email I was in favor of (1), but now with 
the new signalfd emulation I'm leaning towards (2).  I still think (1) 
should be merged, preferably to qemu upstream.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)

2008-04-16 Thread Avi Kivity
David S. Ahern wrote:
> I have been looking at RHEL3 based guests lately, and to say the least the
> performance is horrible. Rather than write a long tome on what I've done and
> observed, I'd like to find out if anyone has some insights or known problem
> areas running 2.4 guests. The short of it is that % system time spikes from
> time to time (e.g., on exec of a new process such as running /bin/true).
>
> I do not see the problem running RHEL3 on ESX, and an equivalent VM running
> RHEL4 runs fine. That suggests that the 2.4 kernel is doing something in a way
> that is not handled efficiently by kvm.
>
> Can someone shed some light on it?

It's not something that I test regularly.  If you're running a 32-bit 
kernel, I'd suspect kmap(), or perhaps false positives from the fork 
detector.

kvmtrace will probably give enough info to tell exactly what's going on; 
'kvmstat -1' while the badness is happening may also help.

-- 
error compiling committee.c: too many arguments to function




[kvm-devel] KVM Test result, kernel 8d3a833.., userspace bae043c..

2008-04-16 Thread Yunfeng Zhao
Hi All,
 
This is today's KVM test result against kvm.git 
8d3a833dc9d42f0967e57717f89c518375d6a417 and kvm-userspace.git 
bae043c2ddf35ed1965f062131394afa75e45b17.

Three Old Issues:

1. Booting four guests likely fails
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599

2. Booting SMP Windows guests has a 30% chance of hanging
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1910923&group_id=180599

3. Cannot boot guests with hugetlbfs
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599



Test environment

Platform      Woodcrest
CPU           4
Memory size   8G
 
Details

IA32-pae:
1. boot guest with 256M memory                              PASS
2. boot two windows xp guest                                PASS
3. boot 4 same guest in parallel                            PASS
4. boot linux and windows guest in parallel                 PASS
5. boot guest with 1500M memory                             PASS
6. boot windows 2003 with ACPI enabled                      PASS
7. boot Windows xp with ACPI enabled                        PASS
8. boot Windows 2000 without ACPI                           PASS
9. kernel build on SMP linux guest                          PASS
10. LTP on SMP linux guest                                  PASS
11. boot base kernel linux                                  PASS
12. save/restore 32-bit HVM guests                          PASS
13. live migration 32-bit HVM guests                        PASS
14. boot SMP Windows xp with ACPI enabled                   PASS
15. boot SMP Windows 2003 with ACPI enabled                 PASS
16. boot SMP Windows 2000 with ACPI enabled                 PASS

IA32e:
1. boot four 32-bit guest in parallel                       PASS
2. boot four 64-bit guest in parallel                       PASS
3. boot 4G 64-bit guest                                     PASS
4. boot 4G pae guest                                        PASS
5. boot 32-bit linux and 32 bit windows guest in parallel   PASS
6. boot 32-bit guest with 1500M memory                      PASS
7. boot 64-bit guest with 1500M memory                      PASS
8. boot 32-bit guest with 256M memory                       PASS
9. boot 64-bit guest with 256M memory                       PASS
10. boot two 32-bit windows xp in parallel                  PASS
11. boot four 32-bit different guest in parallel            PASS
12. save/restore 64-bit linux guests                        PASS
13. save/restore 32-bit linux guests                        PASS
14. boot 32-bit SMP windows 2003 with ACPI enabled          PASS
15. boot 32-bit SMP Windows 2000 with ACPI enabled          PASS
16. boot 32-bit SMP Windows xp with ACPI enabled            PASS
17. boot 32-bit Windows 2000 without ACPI                   PASS
18. boot 64-bit Windows xp with ACPI enabled                PASS
19. boot 32-bit Windows xp without ACPI                     PASS
20. boot 64-bit UP vista                                    PASS
21. boot 64-bit SMP vista                                   PASS
22. kernel build in 32-bit linux guest OS                   PASS
23. kernel build in 64-bit linux guest OS                   PASS
24. LTP on SMP 32-bit linux guest OS                        PASS
25. LTP on SMP 64-bit linux guest OS                        PASS
26. boot 64-bit guests with ACPI enabled                    PASS
27. boot 32-bit x-server                                    PASS
28. boot 64-bit SMP windows XP with ACPI enabled            PASS
29. boot 64-bit SMP windows 2003 with ACPI enabled          PASS
30. live migration 64bit linux guests                       PASS
31. live migration 32bit linux guests                       PASS
32. reboot 32bit windows xp guest                           PASS
33. reboot 32bit windows xp guest                           PASS
 
 
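Report Summary on IA32-pae
Summary Test Report of Last Session
=====================================================================
                Total   Pass    Fail    NoResult   Crash
=====================================================================
control_panel   7       7       0       0          0
Restart         2       2       0       0          0
gtest           15      15      0       0          0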

Re: [kvm-devel] Automatic page sharing/consolidation

2008-04-16 Thread Avi Kivity
Enrico Weigelt wrote:
> Hi folks,
>
> I'm using dozens of VEs / jails to separate applications
> (even complete webapps w/ their own httpd) for easier maintenance
> and better security. But this tends to consume a lot of memory,
> since code sharing (.so's) cannot take effect here (each jail/VE
> has its own complete tree).
>
> Now I wonder if it might be possible to let the kernel automatically
> find identical pages and map them together.
>
> A quick comparison showed that at least 50% of the code could be
> shared (maybe more with some tuning, and maybe even data). So it's
> (IMHO) really worth it.
>
> Would this be possible? What would have to be done for this?

Izik (copied) is working on this for kvm.  Results so far are very 
encouraging, but currently merged pages are not swappable.

-- 
error compiling committee.c: too many arguments to function




[kvm-devel] [PATCH 1/1] Enable a guest to access a device's memory mapped I/O regions directly.

2008-04-16 Thread benami
From: Ben-Ami Yassour [EMAIL PROTECTED]

Signed-off-by: Ben-Ami Yassour [EMAIL PROTECTED]
Signed-off-by: Muli Ben-Yehuda [EMAIL PROTECTED]
---
 libkvm/libkvm.c   |   24 
 qemu/hw/pci-passthrough.c |   89 +++--
 qemu/hw/pci-passthrough.h |2 +
 3 files changed, 40 insertions(+), 75 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index de91328..8c02af9 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -400,7 +400,7 @@ void *kvm_create_userspace_phys_mem(kvm_context_t kvm, unsigned long phys_start,
 {
 	int r;
 	int prot = PROT_READ;
-	void *ptr;
+	void *ptr = NULL;
 	struct kvm_userspace_memory_region memory = {
 		.memory_size = len,
 		.guest_phys_addr = phys_start,
@@ -410,16 +410,24 @@ void *kvm_create_userspace_phys_mem(kvm_context_t kvm, unsigned long phys_start,
 	if (writable)
 		prot |= PROT_WRITE;
 
-	ptr = mmap(NULL, len, prot, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
-	if (ptr == MAP_FAILED) {
-		fprintf(stderr, "create_userspace_phys_mem: %s", strerror(errno));
-		return 0;
-	}
+	if (len > 0) {
+		ptr = mmap(NULL, len, prot, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+		if (ptr == MAP_FAILED) {
+			fprintf(stderr, "create_userspace_phys_mem: %s",
+				strerror(errno));
+			return 0;
+		}
 
-	memset(ptr, 0, len);
+		memset(ptr, 0, len);
+	}
 
 	memory.userspace_addr = (unsigned long)ptr;
-	memory.slot = get_free_slot(kvm);
+
+	if (len > 0)
+		memory.slot = get_free_slot(kvm);
+	else
+		memory.slot = get_slot(phys_start);
+
 	r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &memory);
 	if (r == -1) {
 		fprintf(stderr, "create_userspace_phys_mem: %s", strerror(errno));
diff --git a/qemu/hw/pci-passthrough.c b/qemu/hw/pci-passthrough.c
index 7ffcc7b..a5894d9 100644
--- a/qemu/hw/pci-passthrough.c
+++ b/qemu/hw/pci-passthrough.c
@@ -25,18 +25,6 @@ typedef __u64 resource_size_t;
 extern kvm_context_t kvm_context;
 extern FILE *logfile;
 
-CPUReadMemoryFunc *pt_mmio_read_cb[3] = {
-	pt_mmio_readb,
-	pt_mmio_readw,
-	pt_mmio_readl
-};
-
-CPUWriteMemoryFunc *pt_mmio_write_cb[3] = {
-	pt_mmio_writeb,
-	pt_mmio_writew,
-	pt_mmio_writel
-};
-
 //#define PT_DEBUG
 
 #ifdef PT_DEBUG
@@ -45,47 +33,6 @@ CPUWriteMemoryFunc *pt_mmio_write_cb[3] = {
 #define DEBUG(fmt, args...)
 #endif
 
-#define pt_mmio_write(suffix, type) \
-void pt_mmio_write##suffix(void *opaque, target_phys_addr_t e_phys, \
-				uint32_t value) \
-{ \
-	pt_region_t *r_access = (pt_region_t *)opaque; \
-	void *r_virt = (u8 *)r_access->r_virtbase + \
-			(e_phys - r_access->e_physbase); \
-	if (r_access->debug & PT_DEBUG_MMIO) { \
-		fprintf(logfile, "pt_mmio_write" #suffix \
-			": e_physbase=%p e_phys=%p r_virt=%p value=%08x\n", \
-			(void *)r_access->e_physbase, (void *)e_phys, \
-			r_virt, value); \
-	} \
-	*(type *)r_virt = (type)value; \
-}
-
-pt_mmio_write(b, u8)
-pt_mmio_write(w, u16)
-pt_mmio_write(l, u32)
-
-#define pt_mmio_read(suffix, type) \
-uint32_t pt_mmio_read##suffix(void *opaque, target_phys_addr_t e_phys) \
-{ \
-	pt_region_t *r_access = (pt_region_t *)opaque; \
-	void *r_virt = (u8 *)r_access->r_virtbase + \
-			(e_phys - r_access->e_physbase); \
-	uint32_t value = (u32) (*(type *) r_virt); \
-	if (r_access->debug & PT_DEBUG_MMIO) { \
-		fprintf(logfile, \
-			"pt_mmio_read" #suffix ": e_physbase=%p" \
-			"e_phys=%p r_virt=%p value=%08x\n", \
-			(void *)r_access->e_physbase, \
-			(void *)e_phys, r_virt, value); \
-	} \
-	return value; \
-}
-
-pt_mmio_read(b, u8)
-pt_mmio_read(w, u16)
-pt_mmio_read(l, u32)
-
 #define pt_ioport_write(suffix)

[kvm-devel] direct mmio for passthrough - kernel part

2008-04-16 Thread benami

This patch for PCI passthrough devices enables a guest to access a device's
memory mapped I/O regions directly, without requiring the host to trap and
emulate every MMIO access. 

Updated from last version: we create a memory slot for each MMIO region of the
guest's devices, and then use the /sys/bus/pci/.../resource# mapping to find the
hfn for that MMIO region. The kernel part and the userspace part of this
patchset apply to Amit's pv-dma tree.  Tested on a Lenovo M57p with an e1000 NIC
assigned directly to an FC8 guest.
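
As an illustration of the userspace side only: a PCI resource file exposed
through sysfs can be mapped with an ordinary mmap(), roughly as sketched
below. This is not taken from the patchset; map_region() is a hypothetical
helper, and the caller is assumed to supply the full path of the resource
file and the size of the region.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes of the file at `path` (for example a sysfs PCI
 * resource file) shared and read/write into this process.
 * Returns the mapping, or NULL on failure. */
void *map_region(const char *path, size_t len)
{
	void *p;
	int fd;

	assert(path != NULL && len > 0);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	return p == MAP_FAILED ? NULL : p;
}
```

Once such a mapping exists in the qemu process, the host frame behind it
can be looked up when the guest faults on the corresponding memory slot,
which is the step the kernel part is described as handling.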

Comments are appreciated. 





[kvm-devel] Announcement: Proxmox Virtual Environment

2008-04-16 Thread Dietmar Maurer
Hi all,

I am glad to announce the first beta release of 'Proxmox Virtual
Environment' - an open source virtualization platform for the
enterprise. 

The main features are:

- All code is GPL
- OpenVZ and KVM support
- bare metal installer (debian etch 64)
- Backup/restore with vzdump/LVM2
- web based management
- integrated virtual appliance download (include certified
appliances)
- configuration cluster

You can find more information at http://pve.proxmox.com

We encourage anyone interested to download and test.
The CD image is available at: http://pve.proxmox.com/wiki/Downloads

Let us know what you think! 

Best regards,

Dietmar

--
Dietmar Maurer   Proxmox Server Solutions GmbH
   CTO
[EMAIL PROTECTED] http://www.proxmox.com
--

 




[kvm-devel] [PATCH 1/1] Enable a guest to access a device's memory mapped I/O regions directly.

2008-04-16 Thread benami
From: Ben-Ami Yassour [EMAIL PROTECTED]

Signed-off-by: Ben-Ami Yassour [EMAIL PROTECTED]
Signed-off-by: Muli Ben-Yehuda [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |   59 +--
 arch/x86/kvm/paging_tmpl.h |   19 +
 include/linux/kvm_host.h   |2 +-
 virt/kvm/kvm_main.c|   17 +++-
 4 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 078a7f1..c89029d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -112,6 +112,8 @@ static int dbg = 1;
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
 #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
+#define PT_SHADOW_IO_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
+
 #define VALID_PAGE(x) ((x) != INVALID_PAGE)
 
 #define PT64_LEVEL_BITS 9
@@ -237,6 +239,9 @@ static int is_dirty_pte(unsigned long pte)
 
 static int is_rmap_pte(u64 pte)
 {
+	if (pte & PT_SHADOW_IO_MARK)
+		return false;
+
 	return is_shadow_present_pte(pte);
 }
 
@@ -1034,7 +1039,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			 unsigned pt_access, unsigned pte_access,
 			 int user_fault, int write_fault, int dirty,
 			 int *ptwrite, int largepage, gfn_t gfn,
-			 pfn_t pfn, bool speculative)
+			 pfn_t pfn, bool speculative,
+			 int direct_mmio)
 {
 	u64 spte;
 	int was_rmapped = 0;
@@ -1114,6 +1120,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 		}
 	}
 
+	if (direct_mmio)
+		spte |= PT_SHADOW_IO_MARK;
+
 unshadowed:
 
 	if (pte_access & ACC_WRITE_MASK)
@@ -1129,16 +1138,19 @@ unshadowed:
 		++vcpu->kvm->stat.lpages;
 
 	page_header_update_slot(vcpu->kvm, shadow_pte, gfn);
-	if (!was_rmapped) {
-		rmap_add(vcpu, shadow_pte, gfn, largepage);
-		if (!is_rmap_pte(*shadow_pte))
-			kvm_release_pfn_clean(pfn);
-	} else {
-		if (was_writeble)
-			kvm_release_pfn_dirty(pfn);
-		else
-			kvm_release_pfn_clean(pfn);
+	if (!direct_mmio) {
+		if (!was_rmapped) {
+			rmap_add(vcpu, shadow_pte, gfn, largepage);
+			if (!is_rmap_pte(*shadow_pte))
+				kvm_release_pfn_clean(pfn);
+		} else {
+			if (was_writeble)
+				kvm_release_pfn_dirty(pfn);
+			else
+				kvm_release_pfn_clean(pfn);
+		}
 	}
+
 	if (!ptwrite || !*ptwrite)
 		vcpu->arch.last_pte_updated = shadow_pte;
 }
@@ -1149,7 +1161,7 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 
 static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 			int largepage, gfn_t gfn, pfn_t pfn,
-			int level)
+			int level, int direct_mmio)
 {
 	hpa_t table_addr = vcpu->arch.mmu.root_hpa;
 	int pt_write = 0;
@@ -1163,13 +1175,15 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 
 		if (level == 1) {
 			mmu_set_spte(vcpu, &table[index], ACC_ALL, ACC_ALL,
-				     0, write, 1, &pt_write, 0, gfn, pfn, false);
+				     0, write, 1, &pt_write, 0, gfn, pfn,
+				     false, direct_mmio);
 			return pt_write;
 		}
 
 		if (largepage && level == 2) {
 			mmu_set_spte(vcpu, &table[index], ACC_ALL, ACC_ALL,
-				     0, write, 1, &pt_write, 1, gfn, pfn, false);
+				     0, write, 1, &pt_write, 1, gfn, pfn,
+				     false, direct_mmio);
 			return pt_write;
 		}
 
@@ -1200,6 +1214,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 	int r;
 	int largepage = 0;
 	pfn_t pfn;
+	int direct_mmio = 0;
 
 	down_read(&current->mm->mmap_sem);
 	if (is_largepage_backed(vcpu, gfn & ~(KVM_PAGES_PER_HPAGE-1))) {
@@ -1207,10 +1222,10 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 		largepage = 1;
 	}
 
-	pfn = gfn_to_pfn(vcpu->kvm, gfn);
+	pfn = gfn_to_pfn(vcpu->kvm, gfn, &direct_mmio);
 	up_read(&current->mm->mmap_sem);
 
-	/* mmio */
+	/* handle emulated mmio */
 	if (is_error_pfn(pfn)) {
 		kvm_release_pfn_clean(pfn);
 		return 1;
@@ -1219,7 +1234,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 	spin_lock(&vcpu->kvm->mmu_lock);

Re: [kvm-devel] [patch 2/2] QEMU: decrease console refresh rate with -nographic

2008-04-16 Thread Anthony Liguori
Avi Kivity wrote:
> Anthony Liguori wrote:
>>>
>>> What about aio completions?  The only race-free way to handle both 
>>> posix aio completion and fd readiness is signals AFAIK.
>>
>> We poll aio completion after the select don't we?  Worst case 
>> scenario we miss a signal and wait to poll after the next select 
>> event.  That's going to occur very often because of the timer.
>
> If select() doesn't enable signals (like you can do with pselect) you 
> may sit for a long time in select() until the timer expires.
>
> Consider a 100Hz Linux guest running 'ls -lR' out of a cold cache: 
> instead of 1-2 ms disk latencies you'll see 10 ms latencies, killing 
> performance by a factor of 5.
>
> I see the following possible solutions:
>
> 1. Apply Anders' patch and keep I/O completions signal based.
>
> 2. Use signalfd() to convert aio completions to fd readiness, 
> emulating signalfd() using a thread which does sigwait()+write() (to a 
> pipe) on older hosts
>
> 3. Use a separate thread for aio completions
>
> 4. Use pselect(), live with the race on older hosts (it was introduced 
> in 2.6.16, which we barely support anyway), live with the signal 
> delivery inefficiency.
>
> When I started writing this email I was in favor of (1), but now with 
> the new signalfd emulation I'm leaning towards (2).  I still think (1) 
> should be merged, preferably to qemu upstream.

There is a 5th option.  Do away with the use of posix aio.  We get 
absolutely no benefit from it because it's limited to a single thread.  
Fabrice has reverted a patch to change that in the past.

Regards,

Anthony Liguori




[kvm-devel] [PATCH 2/2] virtio-s390: Change virtio interrupt definitions to follow architecture

2008-04-16 Thread Carsten Otte
From: Christian Borntraeger [EMAIL PROTECTED]

This patch changes the interrupt definitions for virtio on s390. We now use
the extint number 0x2603, which is already used as a host interrupt by z/VM
for pfault and dasd_diag.
We will use subcode 0x0D to distinguish virtio from dasd and pfault.

Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
Signed-off-by: Carsten Otte [EMAIL PROTECTED]
---
 arch/s390/kvm/interrupt.c |6 +-
 drivers/s390/kvm/kvm_virtio.c |8 +++-
 2 files changed, 12 insertions(+), 2 deletions(-)

Index: kvm/arch/s390/kvm/interrupt.c
===
--- kvm.orig/arch/s390/kvm/interrupt.c
+++ kvm/arch/s390/kvm/interrupt.c
@@ -162,7 +162,11 @@ static void __do_deliver_interrupt(struc
 		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%lx",
 			   inti->ext.ext_params, inti->ext.ext_params2);
 		vcpu->stat.deliver_virtio_interrupt++;
-		rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x1237);
+		rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x2603);
+		if (rc == -EFAULT)
+			exception = 1;
+
+		rc = put_guest_u16(vcpu, __LC_CPU_ADDRESS, 0x0d00);
 		if (rc == -EFAULT)
 			exception = 1;
 
Index: kvm/drivers/s390/kvm/kvm_virtio.c
===
--- kvm.orig/drivers/s390/kvm/kvm_virtio.c
+++ kvm/drivers/s390/kvm/kvm_virtio.c
@@ -23,6 +23,8 @@
 #include <asm/setup.h>
 #include <asm/s390_ext.h>
 
+#define VIRTIO_SUBCODE_64 0x0D00
+
 /*
  * The pointer to our (page) of device descriptions.
  */
@@ -291,6 +293,10 @@ static void scan_devices(void)
 static void kvm_extint_handler(u16 code)
 {
 	void *data = (void *) *(long *) __LC_PFAULT_INTPARM;
+	u16 subcode = S390_lowcore.cpu_addr;
+
+	if ((subcode & 0xff00) != VIRTIO_SUBCODE_64)
+		return;
 
 	vring_interrupt(0, data);
 }
@@ -319,8 +325,8 @@ static int __init kvm_devices_init(void)
 
 	kvm_devices  = (void *) (max_pfn << PAGE_SHIFT);
 
-	register_external_interrupt(0x1237, kvm_extint_handler);
 	ctl_set_bit(0, 9);
+	register_external_interrupt(0x2603, kvm_extint_handler);
 
 	scan_devices();
 	return 0;
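
Because 0x2603 is shared with pfault and dasd_diag, every handler
registered for it has to filter on the subcode. A standalone sketch of that
check follows; the constants come from the patch above, while the macro
names and is_virtio_extint() are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EXT_IRQ_SHARED   0x2603   /* external interrupt code used by the patch */
#define VIRTIO_SUBCODE   0x0d00   /* VIRTIO_SUBCODE_64 in the patch */
#define SUBCODE_MASK     0xff00   /* only the high byte selects the user */

static_assert((VIRTIO_SUBCODE & SUBCODE_MASK) == VIRTIO_SUBCODE,
	      "subcode must fit inside the mask");

/* Return nonzero if an external interrupt with this code and subcode
 * is meant for virtio rather than for another user of 0x2603. */
int is_virtio_extint(uint16_t code, uint16_t subcode)
{
	return code == EXT_IRQ_SHARED &&
	       (subcode & SUBCODE_MASK) == VIRTIO_SUBCODE;
}
```

Only the high byte of the subcode is compared, so the low byte remains
free to carry other information without affecting the dispatch.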





Re: [kvm-devel] Announcement: Proxmox Virtual Environment

2008-04-16 Thread Alexey Eremenko
On Wed, Apr 16, 2008 at 1:14 PM, Dietmar Maurer [EMAIL PROTECTED] wrote:
> Hi all,
>
> I am glad to announce the first beta release of 'Proxmox Virtual
> Environment' - an open source virtualization platform for the
> enterprise.
>
> The main features are:
>
> - All code is GPL
> - OpenVZ and KVM support
> - bare metal installer (debian etch 64)
> - Backup/restore with vzdump/LVM2
> - web based management
> - integrated virtual appliance download (include certified
>   appliances)
> - configuration cluster
>
> You can find more information at http://pve.proxmox.com
>
> We encourage anyone interested to download and test.
> The CD image is available at: http://pve.proxmox.com/wiki/Downloads
>
> Let us know what you think!
>
> Best regards,
>
> Dietmar

This technology looks promising... I will try it as soon as time permits.

-- 
-Alexey Eromenko Technologov



[kvm-devel] [PATCH 0/2] kvm-s390 fixes

2008-04-16 Thread Carsten Otte
Hi Avi,

these two fixes repair two things in kvm-s390:
- #1 makes kvm compile again on s390 after a common code change
- #2 changes our virtio interrupt definitions to the values that will
  be reserved for kvm use in s390 architecture
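
To illustrate fix #2: the guest-side handler in that patch only accepts an
external interrupt when the high byte of the value read from the lowcore
CPU-address field matches the virtio subcode. A minimal stand-alone sketch of
that check follows. It assumes nothing beyond the constants visible in the
patch; the helper name is made up for illustration and is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the patch (VIRTIO_SUBCODE_64 in kvm_virtio.c). */
#define VIRTIO_SUBCODE_64 0x0D00

/*
 * Hypothetical helper, not part of the patch: returns true when the 16-bit
 * value stored by the host in the CPU-address field carries the virtio
 * subcode in its high byte. The low byte is ignored, as in the handler.
 */
static bool is_virtio_subcode(uint16_t cpu_addr)
{
	return (cpu_addr & 0xff00) == VIRTIO_SUBCODE_64;
}
```

With this filter, an external interrupt 0x2603 that carries some other
subcode is simply ignored by the virtio handler.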

It'd be great if both could make 2.6.26.

so long,
Carsten




[kvm-devel] [PATCH 1/2] kvm-s390: provide get/set_mp_state stubs to fix compile error

2008-04-16 Thread Carsten Otte
From: Christian Borntraeger [EMAIL PROTECTED]

Since 

commit ded6fb24fb694bcc5f308a02ec504d45fbc8aaa6
Author: Marcelo Tosatti [EMAIL PROTECTED]
Date:   Fri Apr 11 13:24:45 2008 -0300
KVM: add ioctls to save/store mpstate

kvm does not compile on s390. 
This patch provides ioctl stubs for s390 to make kvm.git compile again.
As migration is not yet supported, the ioctl definitions are empty.

Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
Signed-off-by: Carsten Otte [EMAIL PROTECTED]
---
 arch/s390/kvm/kvm-s390.c |   12 
 1 file changed, 12 insertions(+)

Index: kvm/arch/s390/kvm/kvm-s390.c
===
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -414,6 +414,18 @@ int kvm_arch_vcpu_ioctl_debug_guest(stru
 	return -EINVAL; /* not implemented yet */
 }
 
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
 static void __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);





Re: [kvm-devel] Extboot Option ROM rewritten in C - v2

2008-04-16 Thread Anthony Liguori
Nguyen Anh Quynh wrote:
> Hi Anthony,
>
> I found a bug in the last code: send_command() failed to copy back the
> result into extboot_cmd structure. This patch fixes it.
>
> I successfully tested this version with guest Win2K (fully updated,
> scsi boot) and Linux 2.6.25-rc8 (virtio).
>
> Let me know if you can boot Windows with this version.
   

I'll test it out.  Please send it to the list as a patch against 
kvm-userspace.

Regards,

Anthony Liguori

> Thanks,
> Quynh
> ---
> This code is an attempt to rewrite the current extboot option rom in
> C. The new code now minimizes the assembly code, so that the assembly
> code is very small and simple: boot.S's only job is to interface with
> C code, which does all the dirty job. signrom is modified to adapt
> with the new result binary image.
>
> The result option rom has the same size as the original one: 1.5KB,
> while the actual code size is around the same: 1.2KB (gcc can optimize
> really well)
>
> To install this option rom, do the following steps as root:
>
> make
> make save --- backup the original option rom to
> /usr/share/qemu/extboot.bin.org
> make install   --- overwrite the new option rom to
> /usr/share/qemu/extboot.bin
   




Re: [kvm-devel] [patch 2/2] QEMU: decrease console refresh rate with -nographic

2008-04-16 Thread Carsten Otte
Anthony Liguori wrote:
> There is a 5th option.  Do away with the use of posix aio.  We get 
> absolutely no benefit from it because it's limited to a single thread.  
> Fabrice has reverted a patch to change that in the past.
How about using linux aio for it? It seems much better, because it 
doesn't use userspace threads but has a direct in-kernel 
implementation. I've had good performance on zldisk with that, and 
it's stable.



[kvm-devel] [ANNOUNCE] kvm-66 release

2008-04-16 Thread Avi Kivity
Today's new kvm architecture is ia64, aka Itanium 2.  Like s390, it is 
only provided in the git tree, not in the tarball.  Windows and Linux 
guests are supported.

On good old x86, we have the new kvmtrace performance monitoring 
framework together with a sizable number of bug fixes.

Changes from kvm-65:
- adjust external module for 2.6.25 module locations (Anthony Liguori)
- fix userspace compilation failure without kernel pit (Joerg Roedel)
- kvmtrace performance monitoring mechanism (Eric Liu)
- stop all vcpus before saving their state (Marcelo Tosatti)
   - fixes smp live migration
- save/restore kernel apicbase (Marcelo Tosatti)
- block SIG_IPI signals (Marcelo Tosatti)
- smsw mem16, lmsw mem16 emulation and unit tests
- fix compile warnings (Jerone Young)
- fix reset with iothread
- ia64 architecture support (Xiantao Zhang, Anthony Xu)
- don't assume guest pages are backed by a 'struct page' (Anthony Liguori)
   - needed for pci device assignment
- register kvm's ioctl range
- fix hardware task switching buglet (Izik Eidus)
- fix mce handling on AMD (Joerg Roedel)
- do hardware task switching in hardware when NPT is enabled (Joerg Roedel)
- fix timer race waking up a halted vcpu with smp (Marcelo Tosatti)
- fix irq race leading to irqs delivery delays (Marcelo Tosatti)
- fix triple fault handling on AMD
- fix lea instruction emulation


Notes:
  If you use the modules bundled with kvm-66, you can use any version
of Linux from 2.6.17 upwards.
  If you use the modules bundled with Linux 2.6.20, you need to use
kvm-12.
  If you use the modules bundled with Linux 2.6.21, you need to use
kvm-17.
  Modules from Linux 2.6.22 and up will work with any kvm version from
kvm-22.  Some features may only be available in newer releases.
  For best performance, use Linux 2.6.23-rc2 or later as the host.

http://kvm.qumranet.com




[kvm-devel] [PATCH 5/5] SVM: remove now obsolete FIXME comment

2008-04-16 Thread Joerg Roedel
With the usage of the V_TPR field this comment is now obsolete.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 61bb2cb..d643605 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -916,13 +916,6 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
 
 }
 
-/* FIXME:
-
-	svm(vcpu)->vmcb->control.int_ctl &= ~V_TPR_MASK;
-	svm(vcpu)->vmcb->control.int_ctl |= (sregs->cr8 & V_TPR_MASK);
-
-*/
-
 static int svm_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
 {
 	return -EOPNOTSUPP;
-- 
1.5.3.7





[kvm-devel] [PATCH 3/5] SVM: sync V_TPR with LAPIC.TPR if CR8 write intercept is disabled

2008-04-16 Thread Joerg Roedel
If the CR8 write intercept is disabled the V_TPR field of the VMCB needs to be
synced with the TPR field in the local apic.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f8ce36e..ee2ee83 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1620,6 +1620,16 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline void sync_cr8_to_lapic(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR8_MASK)) {
+		int cr8 = svm->vmcb->control.int_ctl & V_TPR_MASK;
+		kvm_lapic_set_tpr(vcpu, cr8);
+	}
+}
+
 static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1791,6 +1801,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	stgi();
 
+	sync_cr8_to_lapic(vcpu);
+
 	svm->next_rip = 0;
 }
 
-- 
1.5.3.7





[kvm-devel] [PATCH 0/5] SVM CR8 optimization patches

2008-04-16 Thread Joerg Roedel
This patch series implements optimizations to the CR8 intercept handling in
SVM. With these patches applied CR8 reads are not intercepted anymore. The
writes to CR8 are only intercepted if the TPR masks interrupts. This
significantly reduces the number of total CR8 intercepts when running Windows
64 bit versions. Some quick numbers:

Boot and shutdown of Vista 64:

Without these patches: ~38.000.000 CR8 writes intercepted
With these patches:    ~38.000 CR8 writes intercepted

diffstat:

 arch/x86/kvm/lapic.c |1 +
 arch/x86/kvm/svm.c   |   68 -
 2 files changed, 56 insertions(+), 13 deletions(-)






[kvm-devel] [PATCH 4/5] SVM: disable CR8 intercept when tpr is not masking interrupts

2008-04-16 Thread Joerg Roedel
This patch disables the intercept of CR8 writes if the TPR is not masking
interrupts. This reduces the total number of CR8 intercepts to below 1 percent
of what we have without this patch using Windows 64 bit guests.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   31 +++
 1 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ee2ee83..61bb2cb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1502,6 +1502,27 @@ static void svm_set_irq(struct kvm_vcpu *vcpu, int irq)
 	svm_inject_irq(svm, irq);
 }
 
+static void update_cr8_intercept(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb *vmcb = svm->vmcb;
+	int max_irr, tpr;
+
+	if (!irqchip_in_kernel(vcpu->kvm) || vcpu->arch.apic->vapic_addr)
+		return;
+
+	vmcb->control.intercept_cr_write &= ~INTERCEPT_CR8_MASK;
+
+	max_irr = kvm_lapic_find_highest_irr(vcpu);
+	if (max_irr == -1)
+		return;
+
+	tpr = kvm_lapic_get_cr8(vcpu) << 4;
+
+	if (tpr >= (max_irr & 0xf0))
+		vmcb->control.intercept_cr_write |= INTERCEPT_CR8_MASK;
+}
+
 static void svm_intr_assist(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1514,14 +1535,14 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
 			      SVM_EVTINJ_VEC_MASK;
 		vmcb->control.exit_int_info = 0;
 		svm_inject_irq(svm, intr_vector);
-		return;
+		goto out;
 	}
 
 	if (vmcb->control.int_ctl & V_IRQ_MASK)
-		return;
+		goto out;
 
 	if (!kvm_cpu_has_interrupt(vcpu))
-		return;
+		goto out;
 
 	if (!(vmcb->save.rflags & X86_EFLAGS_IF) ||
 	    (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
@@ -1529,12 +1550,14 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
 		/* unable to deliver irq, set pending irq */
 		vmcb->control.intercept |= (1ULL << INTERCEPT_VINTR);
 		svm_inject_irq(svm, 0x0);
-		return;
+		goto out;
 	}
 	/* Okay, we can deliver the interrupt: grab it and update PIC state. */
 	intr_vector = kvm_cpu_get_interrupt(vcpu);
 	svm_inject_irq(svm, intr_vector);
 	kvm_timer_intr_post(vcpu, intr_vector);
+out:
+	update_cr8_intercept(vcpu);
 }
 
 static void kvm_reput_irq(struct vcpu_svm *svm)
-- 
1.5.3.7
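
The decision taken by update_cr8_intercept() can be modelled outside the
kernel as a pure function. This is only a sketch: the function name and the
plain-integer parameters are assumptions made for illustration, and the
irqchip/vapic early exits of the real code are left out.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-alone model of the comparison in
 * update_cr8_intercept().
 *
 * max_irr: highest pending interrupt vector, or -1 if none is pending
 * cr8:     guest CR8 value, i.e. the task priority class (0..15)
 *
 * Returns true when CR8 writes still have to be intercepted, which is the
 * case while the TPR masks the highest pending interrupt.
 */
static bool cr8_write_needs_intercept(int max_irr, unsigned int cr8)
{
	int tpr;

	if (max_irr == -1)
		return false;	/* nothing pending, nothing to unmask */

	tpr = (int)(cr8 << 4);	/* priority class moved to bits 7:4 */

	return tpr >= (max_irr & 0xf0);
}
```

For a pending vector 0x51, a guest running with CR8 = 5 keeps the intercept
(0x50 >= 0x50), while CR8 = 4 lets CR8 writes run unintercepted.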





[kvm-devel] [PATCH] SVM: remove selective CR0 comment

2008-04-16 Thread Joerg Roedel
There is no selective CR0 intercept bug. The code in the comment sets the
CR0.PG bit. But KVM always sets the CR0.PG bit for SVM to implement the paged
real mode. So the 'mov %eax,%cr0' instruction does not change the CR0.PG bit.
Selective CR0 intercepts only occur when a bit is actually changed. So it's the
right behavior that there is no intercept on this instruction.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   11 ---
 1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3379e13..55b5076 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -514,17 +514,6 @@ static void init_vmcb(struct vcpu_svm *svm)
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
 				(1ULL << INTERCEPT_NMI) |
 				(1ULL << INTERCEPT_SMI) |
-		/*
-		 * selective cr0 intercept bug?
-		 *	0:   0f 22 d8                mov    %eax,%cr3
-		 *	3:   0f 20 c0                mov    %cr0,%eax
-		 *	6:   0d 00 00 00 80          or     $0x80000000,%eax
-		 *	b:   0f 22 c0                mov    %eax,%cr0
-		 * set cr3 ->interception
-		 * get cr0 ->interception
-		 * set cr0 -> no interception
-		 */
-		/*              (1ULL << INTERCEPT_SELECTIVE_CR0) | */
 				(1ULL << INTERCEPT_CPUID) |
 				(1ULL << INTERCEPT_INVD) |
 				(1ULL << INTERCEPT_HLT) |
-- 
1.5.3.7





Re: [kvm-devel] Extboot Option ROM rewritten in C - v2

2008-04-16 Thread Anthony Liguori
Anthony Liguori wrote:
> A couple general comments.
>
> I'd feel a lot more comfortable with the int13 handler returning an 
> int and the asm stub code uses that result to determine how to set 
> CF.  You set CF deep within the function stack and there's no 
> guarantee that GCC isn't going to stomp on it.

Ignore that bit, I missed that you were only setting it within the regs 
structure.

Regards,

Anthony Liguori

> I also don't think we want to raise int18 when we get a command we 
> don't understand.  We should just not change any of the register 
> state.  There are a number of extended commands that look for a magic 
> value to determine whether the command exists or not.
>
> Regards,
>
> Anthony Liguori
>
> Nguyen Anh Quynh wrote:
>> Hi Anthony,
>>
>> I found a bug in the last code: send_command() failed to copy back the
>> result into extboot_cmd structure. This patch fixes it.
>>
>> I successfully tested this version with guest Win2K (fully updated,
>> scsi boot) and Linux 2.6.25-rc8 (virtio).
>>
>> Let me know if you can boot Windows with this version.
>>
>> Thanks,
>> Quynh
>> ---
>> This code is an attempt to rewrite the current extboot option rom in
>> C. The new code now minimizes the assembly code, so that the assembly
>> code is very small and simple: boot.S's only job is to interface with
>> C code, which does all the dirty job. signrom is modified to adapt
>> with the new result binary image.
>>
>> The result option rom has the same size as the original one: 1.5KB,
>> while the actual code size is around the same: 1.2KB (gcc can optimize
>> really well)
>>
>> To install this option rom, do the following steps as root:
>>
>> make
>> make save --- backup the original option rom to
>> /usr/share/qemu/extboot.bin.org
>> make install   --- overwrite the new option rom to
>> /usr/share/qemu/extboot.bin
   






[kvm-devel] [PATCH 2/5] X86: export kvm_lapic_set_tpr to modules

2008-04-16 Thread Joerg Roedel
This patch exports the kvm_lapic_set_tpr() function from the lapic code to
modules. It is required in the kvm-amd module to optimize CR8 intercepts.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/lapic.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2ccf994..57ac4e4 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -822,6 +822,7 @@ void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
 	apic_set_tpr(apic, ((cr8 & 0x0f) << 4)
 		     | (apic_get_reg(apic, APIC_TASKPRI) & 4));
 }
+EXPORT_SYMBOL_GPL(kvm_lapic_set_tpr);
 
 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
 {
-- 
1.5.3.7





[kvm-devel] [PATCH 1/5] SVM: sync TPR value to V_TPR field in the VMCB

2008-04-16 Thread Joerg Roedel
This patch adds syncing of the lapic.tpr field to the V_TPR field of the VMCB.
With this change we can safely remove the CR8 read intercept.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3379e13..f8ce36e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -486,8 +486,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 
 	control->intercept_cr_read = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
-					INTERCEPT_CR4_MASK |
-					INTERCEPT_CR8_MASK;
+					INTERCEPT_CR4_MASK;
 
 	control->intercept_cr_write = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
@@ -1621,6 +1620,19 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 cr8;
+
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return;
+
+	cr8 = kvm_get_cr8(vcpu);
+	svm->vmcb->control.int_ctl &= ~V_TPR_MASK;
+	svm->vmcb->control.int_ctl |= cr8 & V_TPR_MASK;
+}
+
 static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1630,6 +1642,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	pre_svm_run(svm);
 
+	sync_lapic_to_cr8(vcpu);
+
 	save_host_msrs(vcpu);
 	fs_selector = read_fs();
 	gs_selector = read_gs();
-- 
1.5.3.7
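
The read-modify-write that sync_lapic_to_cr8() performs on int_ctl can be
tried in isolation. The sketch below assumes that V_TPR occupies the low four
bits of int_ctl (mask 0x0f); the function name is invented for this example
and does not exist in svm.c.

```c
#include <stdint.h>

/* Assumption: the V_TPR field is the low four bits of int_ctl. */
#define V_TPR_MASK 0x0f

/*
 * Hypothetical helper: returns int_ctl with its V_TPR field replaced by the
 * low bits of cr8, leaving every other control bit untouched.
 */
static uint32_t set_v_tpr(uint32_t int_ctl, uint64_t cr8)
{
	int_ctl &= ~(uint32_t)V_TPR_MASK;		/* clear the old V_TPR */
	int_ctl |= (uint32_t)(cr8 & V_TPR_MASK);	/* insert the new one */
	return int_ctl;
}
```

Because the guest then reads CR8 from V_TPR directly, the value has to be
refreshed before every guest entry, which is why the patch calls the sync
function from svm_vcpu_run().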





Re: [kvm-devel] [PATCH 1/2] kvm-s390: provide get/set_mp_state stubs to fix compile error

2008-04-16 Thread Hollis Blanchard
By the way Marcelo, it would be polite to provide these stubs yourself to 
avoid breaking the build on other architectures.

It looks like IA64 is still broken because of this.

-- 
Hollis Blanchard
IBM Linux Technology Center

On Wednesday 16 April 2008 09:06:34 Carsten Otte wrote:
> From: Christian Borntraeger [EMAIL PROTECTED]
>
> Since 
>
> commit ded6fb24fb694bcc5f308a02ec504d45fbc8aaa6
> Author: Marcelo Tosatti [EMAIL PROTECTED]
> Date:   Fri Apr 11 13:24:45 2008 -0300
> KVM: add ioctls to save/store mpstate
>
> kvm does not compile on s390. 
> This patch provides ioctl stubs for s390 to make kvm.git compile again.
> As migration is not yet supported, the ioctl definitions are empty.
>
> Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
> Signed-off-by: Carsten Otte [EMAIL PROTECTED]
> ---
>  arch/s390/kvm/kvm-s390.c |   12 
>  1 file changed, 12 insertions(+)
>
> Index: kvm/arch/s390/kvm/kvm-s390.c
> ===
> --- kvm.orig/arch/s390/kvm/kvm-s390.c
> +++ kvm/arch/s390/kvm/kvm-s390.c
> @@ -414,6 +414,18 @@ int kvm_arch_vcpu_ioctl_debug_guest(stru
>  	return -EINVAL; /* not implemented yet */
>  }
>  
> +int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
> +				    struct kvm_mp_state *mp_state)
> +{
> +	return -EINVAL; /* not implemented yet */
> +}
> +
> +int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> +				    struct kvm_mp_state *mp_state)
> +{
> +	return -EINVAL; /* not implemented yet */
> +}
> +
>  static void __vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
 
 
 
 





[kvm-devel] Second KVM process hangs eating 80-100% CPU on host during startup

2008-04-16 Thread Alex Davis
Host software:
Linux 2.6.24.4
KVM 65 (I am using the kernel modules from this release).
X11 7.2 from Xorg
SDL 1.2.13
GCC 4.1.1
Glibc 2.4

Host hardware:
Asus P5B Deluxe (P965 chipset based) motherboard
4 GB RAM
Intel E6700 CPU

Guest software:
Slackware 12.0 installed from CD-ROM.

Command used to start first KVM instance:
/usr/local/bin/qemu-system-x86_64 -hda /spare/vdisk1.img -cdrom /dev/cdrom -boot c -m 384 -net nic,macaddr=DE:AD:BE:EF:11:29 -net tap,ifname=tap0,script=no

Command used to start second KVM instance:
/usr/local/bin/qemu-system-x86_64 -hda /spare/vdisk2.img -cdrom /dev/cdrom -boot c -m 384 -net nic,macaddr=DE:AD:BE:EF:11:30 -net tap,ifname=tap1,script=no

tap0 and tap1 are bridged on the host. The guest OS was installed on
/spare/vdisk1.img, which was initially created by
/usr/local/bin/qemu-img create -f qcow /spare/vdisk.img 10G
After the guest installation completed, vdisk1 was copied to vdisk2.

The second instance always stops after printing
"Checking if the processor honours the WP bit even in supervisor mode... Ok."
It stays hung until I press the return key in the first instance; sometimes
clicking in another X window will wake it up as well.

This is a test machine so I can test patches (almost) at will.




I code, therefore I am


  




Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Robin Holt

I don't think this lock mechanism is completely working.  I have
gotten a few failures trying to dereference 0x100100 which appears to
be LIST_POISON1.
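
For context on that address: the kernel's list_del() overwrites the links of
a removed entry with poison pointers, so a later walk through a stale entry
faults at a recognisable address. A simplified user-space sketch of the idea
follows. The demo_ names are made up here, and the two constants are assumed
to mirror LIST_POISON1 and LIST_POISON2 from include/linux/poison.h of that
kernel generation.

```c
#include <stdint.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Assumed to match LIST_POISON1 / LIST_POISON2 in the kernel headers. */
#define DEMO_POISON1 ((struct list_node *)(uintptr_t)0x00100100)
#define DEMO_POISON2 ((struct list_node *)(uintptr_t)0x00200200)

/* An empty circular list points at itself. */
static void demo_list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert n right after head. */
static void demo_list_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and poison its links so that reuse is caught early. */
static void demo_list_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = DEMO_POISON1;
	n->prev = DEMO_POISON2;
}
```

A fault at 0x100100 therefore suggests that some code followed the next
pointer of an entry that had already been removed from its list.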

Thanks,
Robin



Re: [kvm-devel] Extboot Option ROM rewritten in C - v2

2008-04-16 Thread Anthony Liguori
A couple general comments.

I'd feel a lot more comfortable with the int13 handler returning an int 
and the asm stub code uses that result to determine how to set CF.  You 
set CF deep within the function stack and there's no guarantee that GCC 
isn't going to stomp on it.

I also don't think we want to raise int18 when we get a command we don't 
understand.  We should just not change any of the register state.  There 
are a number of extended commands that look for a magic value to 
determine whether the command exists or not.
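
One possible shape for these two suggestions, as a sketch only: the C handler
returns a status, the stub decides what to do with CF, and an unknown command
leaves the register frame untouched. All names below (struct regs, the enum,
both functions) are hypothetical and not taken from the extboot code.

```c
#include <stdint.h>

/* Hypothetical register frame handed from the asm stub to C. */
struct regs {
	uint16_t ax, bx, cx, dx;
	uint16_t flags;
};

#define FLAG_CF 0x0001

/* Hypothetical status codes returned to the stub. */
enum int13_result { INT13_OK, INT13_ERROR, INT13_UNHANDLED };

/* C side: never touches CF, only reports what happened. */
static enum int13_result handle_int13(struct regs *r)
{
	uint8_t cmd = (uint8_t)(r->ax >> 8);	/* AH holds the function */

	switch (cmd) {
	case 0x00:			/* reset disk system */
		r->ax &= 0x00ff;	/* AH = 0: success */
		return INT13_OK;
	default:
		/*
		 * Unknown function: change nothing, so a caller probing
		 * with a magic value (for example 0x55aa in BX) sees its
		 * registers come back unmodified.
		 */
		return INT13_UNHANDLED;
	}
}

/* Stands in for what the asm stub would do with the result. */
static void finish_int13(struct regs *r, enum int13_result res)
{
	if (res == INT13_OK)
		r->flags &= (uint16_t)~FLAG_CF;
	else if (res == INT13_ERROR)
		r->flags |= FLAG_CF;
	/* INT13_UNHANDLED: leave the saved flags as they were */
}
```

Keeping the CF decision in one place makes it independent of how the compiler
schedules code inside the handler.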

Regards,

Anthony Liguori

Nguyen Anh Quynh wrote:
 Hi Anthony,

 I found a bug in the last code: send_command() failed to copy back the
 result into extboot_cmd structure. This patch fixes it.

 I successfully tested this version with guest Win2K (fully updated,
 scsi boot) and Linux 2.6.25-rc8 (virtio).

 Let me know if you can boot Windows with this version.

 Thanks,
 Quynh
 ---
 This code is an attempt to rewrite the current extboot option rom in
 C. The new code now minimizes the assembly code, so that the assembly
 code is very small and simple: boot.S's only job is to interface with
 C code, which does all the dirty job. signrom is modified to adapt
 with the new result binary image.

 The result option rom has the same size as the original one: 1.5KB,
 while the actual code size is around the same: 1.2KB (gcc can optimize
 really well)

 To install this option rom, do the following steps as root:

 make
 make save --- backup the original option rom to
 /usr/share/qemu/extboot.bin.org
 make install   --- overwrite the new option rom to
 /usr/share/qemu/extboot.bin
   




[kvm-devel] [PATCH] kvm: move kvmclock initialization inside kvm_guest_init

2008-04-16 Thread Glauber Costa
It makes no sense for the clock initialization to be
hanging around in setup_32.c when we have a generic kvm guest
initialization function available. So, we move kvmclock_init()
inside such a function, leading to cleaner code.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 arch/x86/kernel/kvm.c  |2 ++
 arch/x86/kernel/setup_32.c |4 
 include/linux/kvm_para.h   |5 +
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index d9121f9..5cad368 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -210,6 +210,8 @@ static void paravirt_ops_setup(void)
 	pv_info.name = "KVM";
 	pv_info.paravirt_enabled = 1;
 
+	kvmclock_init();
+
 	if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
 		pv_cpu_ops.io_delay = kvm_io_delay;
 
diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c
index 65f3a23..029350c 100644
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -771,10 +771,6 @@ void __init setup_arch(char **cmdline_p)
 
 	max_low_pfn = setup_memory();
 
-#ifdef CONFIG_KVM_CLOCK
-	kvmclock_init();
-#endif
-
 #ifdef CONFIG_VMI
 	/*
 	 * Must be after max_low_pfn is determined, and before kernel
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 3ddce03..c5e662c 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -28,6 +28,11 @@ void __init kvm_guest_init(void);
 #else
 #define kvm_guest_init() do { } while (0)
 #endif
+#ifdef CONFIG_KVM_CLOCK
+void kvmclock_init(void);
+#else
+#define kvmclock_init() do { } while (0)
+#endif
 
 static inline int kvm_para_has_feature(unsigned int feature)
 {
-- 
1.5.0.6
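
The kvm_para.h hunk relies on the usual kernel idiom of compiling a call away
when a config option is off. A small stand-alone sketch of that idiom follows;
DEMO_HAVE_CLOCK and the demo_ names are placeholders, not kernel symbols.

```c
/* Placeholder for CONFIG_KVM_CLOCK; left undefined so the stub is used. */
/* #define DEMO_HAVE_CLOCK */

static int demo_clock_inits;	/* counts real clock initialisations */
static int demo_fallbacks;	/* counts the non-paravirt path */

#ifdef DEMO_HAVE_CLOCK
static void demo_clock_init(void)
{
	demo_clock_inits++;
}
#else
/*
 * do { } while (0) keeps "demo_clock_init();" a single statement, so the
 * unbraced if/else below still parses when the feature is compiled out.
 */
#define demo_clock_init() do { } while (0)
#endif

static void demo_guest_init(int paravirt)
{
	if (paravirt)
		demo_clock_init();
	else
		demo_fallbacks++;
}
```

The caller needs no #ifdef of its own, which is what allows the patch to drop
the conditional block from setup_32.c.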




Re: [kvm-devel] [PATCH] kvm: move kvmclock initialization inside kvm_guest_init

2008-04-16 Thread Glauber de Oliveira Costa
Glauber Costa wrote:
> It makes no sense for the clock initialization to be
> hanging around in setup_32.c when we have a generic kvm guest
> initialization function available. So, we move kvmclock_init()
> inside such a function, leading to a cleaner code.
>
> Signed-off-by: Glauber Costa [EMAIL PROTECTED]
> ---
>  arch/x86/kernel/kvm.c  |2 ++
>  arch/x86/kernel/setup_32.c |4 
>  include/linux/kvm_para.h   |5 +
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index d9121f9..5cad368 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -210,6 +210,8 @@ static void paravirt_ops_setup(void)
>  	pv_info.name = "KVM";
>  	pv_info.paravirt_enabled = 1;
>  
> +	kvmclock_init();
> +
>  	if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
>  		pv_cpu_ops.io_delay = kvm_io_delay;
>  
> diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c
> index 65f3a23..029350c 100644
> --- a/arch/x86/kernel/setup_32.c
> +++ b/arch/x86/kernel/setup_32.c
> @@ -771,10 +771,6 @@ void __init setup_arch(char **cmdline_p)
>  
>  	max_low_pfn = setup_memory();
>  
> -#ifdef CONFIG_KVM_CLOCK
> -	kvmclock_init();
> -#endif
> -
>  #ifdef CONFIG_VMI
>  	/*
>  	 * Must be after max_low_pfn is determined, and before kernel
> diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
> index 3ddce03..c5e662c 100644
> --- a/include/linux/kvm_para.h
> +++ b/include/linux/kvm_para.h
> @@ -28,6 +28,11 @@ void __init kvm_guest_init(void);
>  #else
>  #define kvm_guest_init() do { } while (0)
>  #endif
> +#ifdef CONFIG_KVM_CLOCK
> +void kvmclock_init(void);
> +#else
> +#define kvmclock_init() do { } while (0)
> +#endif
>  
>  static inline int kvm_para_has_feature(unsigned int feature)
>  {
   

Forget about it. Marcelo just screamed at me (and somehow I heard it) 
that this creates a bogus dependency between the clock and the mmu functions. 
Duh.

I'll resend a better version
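
For reference, the do { } while (0) stub in the quoted kvm_para.h hunk is what allows a caller to drop its own #ifdef. A minimal standalone sketch of that pattern follows; clock_inits and boot_sequence are hypothetical names invented for illustration, and the function body is only a stand-in for real clock setup.

```c
#include <stdio.h>

/* Build with -DCONFIG_KVM_CLOCK to select the real function instead of the stub. */
static int clock_inits;          /* hypothetical: counts real initializations */

#ifdef CONFIG_KVM_CLOCK
static void kvmclock_init(void)
{
    clock_inits++;               /* stand-in for the real clock setup */
}
#else
/* Expands to an empty statement, so callers need no #ifdef of their own. */
#define kvmclock_init() do { } while (0)
#endif

/* Hypothetical caller: the same source compiles in both configurations. */
static int boot_sequence(void)
{
    kvmclock_init();
    return clock_inits;
}
```

Without the config symbol the call compiles away and boot_sequence() reports zero initializations; with it, the real function runs once per call.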



Re: [kvm-devel] [PATCH 1/2] kvm-s390: provide get/set_mp_state stubs to fix compile error

2008-04-16 Thread Marcelo Tosatti
On Wed, Apr 16, 2008 at 11:21:05AM -0500, Hollis Blanchard wrote:
 By the way Marcelo, it would be polite to provide these stubs yourself to 
 avoid breaking the build on other architectures.

Indeed, should have been more careful.

 It looks like IA64 is still broken because of this.

Now I'm not sure if IA64 supports migration. Should it return -EINVAL or
the ia64 mpstate ?




[kvm-devel] [PATCH] add virtio disk geometry feature

2008-04-16 Thread Ryan Harper
From: Ryan Harper [EMAIL PROTECTED]

Rather than faking up some geometry, allow the backend to push the disk
geometry via virtio pci config option.  Keep the old geo code around for
compatibility.

Signed-off-by: Ryan Harper [EMAIL PROTECTED]
Reviewed-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0cfbe8c..1d2142a 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -157,10 +157,28 @@ static int virtblk_ioctl(struct inode *inode, struct file *filp,
 /* We provide getgeo only to please some old bootloader/partitioning tools */
 static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo)
 {
-   /* some standard values, similar to sd */
-   geo->heads = 1 << 6;
-   geo->sectors = 1 << 5;
-   geo->cylinders = get_capacity(bd->bd_disk) >> 11;
+   struct virtio_blk *vblk = bd->bd_disk->private_data;
+   int err = 0;
+
+   /* see if the host passed in geometry config */
+   err = virtio_config_val(vblk->vdev, VIRTIO_BLK_F_GEOMETRY,
+                           offsetof(struct virtio_blk_config, cylinders),
+                           &geo->cylinders);
+
+   /* if host sets geo flag, all 3 values must be present */
+   if (!err) {
+       __virtio_config_val(vblk->vdev,
+                           offsetof(struct virtio_blk_config, heads),
+                           &geo->heads);
+       __virtio_config_val(vblk->vdev,
+                           offsetof(struct virtio_blk_config, sectors),
+                           &geo->sectors);
+   } else {
+       /* some standard values, similar to sd */
+       geo->heads = 1 << 6;
+       geo->sectors = 1 << 5;
+       geo->cylinders = get_capacity(bd->bd_disk) >> 11;
+   }
return 0;
 }
 
diff --git a/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
index bca0b10..142c496 100644
--- a/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.h
@@ -9,6 +9,7 @@
 #define VIRTIO_BLK_F_BARRIER   0   /* Does host support barriers? */
 #define VIRTIO_BLK_F_SIZE_MAX  1   /* Indicates maximum segment size */
 #define VIRTIO_BLK_F_SEG_MAX   2   /* Indicates maximum # of segments */
+#define VIRTIO_BLK_F_GEOMETRY  4   /* Legacy geometry available  */
 
 struct virtio_blk_config
 {
@@ -18,6 +19,12 @@ struct virtio_blk_config
__le32 size_max;
/* The maximum number of segments (if VIRTIO_BLK_F_SEG_MAX) */
__le32 seg_max;
+   /* cylinders of the device (if VIRTIO_BLK_F_GEOMETRY) */
+   __le16 cylinders;
+   /* heads of the device (if VIRTIO_BLK_F_GEOMETRY) */
+   __u8 heads;
+   /* sectors of the device (if VIRTIO_BLK_F_GEOMETRY) */
+   __u8 sectors;
 } __attribute__((packed));
 
 /* These two define direction. */



[kvm-devel] [PATCH] pass virtio disk geometry via config space

2008-04-16 Thread Ryan Harper
From: Ryan Harper [EMAIL PROTECTED]

Rather than faking up some geometry, allow the backend to push the disk
geometry via virtio pci config option.  Keep the old geo code around for
compatibility.

Signed-off-by: Ryan Harper [EMAIL PROTECTED]
Reviewed-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index ae87ab9..e78647a 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -1130,7 +1130,7 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
DriveInfo *info = &drives_table[extboot_drive];
int cyls, heads, secs;
 
-   if (info->type != IF_IDE) {
+   if (info->type != IF_IDE && info->type != IF_VIRTIO) {
bdrv_guess_geometry(info->bdrv, &cyls, &heads, &secs);
bdrv_set_geometry_hint(info->bdrv, cyls, heads, secs);
}
diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
index 492bd7f..d51501e 100644
--- a/qemu/hw/virtio-blk.c
+++ b/qemu/hw/virtio-blk.c
@@ -25,12 +25,16 @@
 #define VIRTIO_BLK_F_BARRIER   0   /* Does host support barriers? */
 #define VIRTIO_BLK_F_SIZE_MAX  1   /* Indicates maximum segment size */
 #define VIRTIO_BLK_F_SEG_MAX   2   /* Indicates maximum # of segments */
+#define VIRTIO_BLK_F_GEOMETRY  4   /* Indicates support of legacy geometry */
 
 struct virtio_blk_config
 {
 uint64_t capacity;
 uint32_t size_max;
 uint32_t seg_max;
+uint16_t cylinders;
+uint8_t heads;
+uint8_t sectors;
 };
 
 /* These two define direction. */
@@ -132,32 +136,40 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 VirtIOBlock *s = to_virtio_blk(vdev);
 struct virtio_blk_config blkcfg;
 int64_t capacity;
+int cylinders, heads, secs;
 
 bdrv_get_geometry(s->bs, &capacity);
+bdrv_get_geometry_hint(s->bs, &cylinders, &heads, &secs);
 blkcfg.capacity = cpu_to_le64(capacity);
 blkcfg.seg_max = cpu_to_le32(128 - 2);
+blkcfg.cylinders = cpu_to_le16(cylinders);
+blkcfg.heads = heads;
+blkcfg.sectors = secs;
 memcpy(config, &blkcfg, sizeof(blkcfg));
 }
 
 static uint32_t virtio_blk_get_features(VirtIODevice *vdev)
 {
-return (1 << VIRTIO_BLK_F_SEG_MAX);
+return (1 << VIRTIO_BLK_F_SEG_MAX | 1 << VIRTIO_BLK_F_GEOMETRY);
 }
 
 void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
  BlockDriverState *bs)
 {
 VirtIOBlock *s;
+int cylinders, heads, secs;
 
 s = (VirtIOBlock *)virtio_init_pci(bus, "virtio-blk", vendor, device,
                                    0, VIRTIO_ID_BLOCK,
                                    0x01, 0x80, 0x00,
-                                   16, sizeof(VirtIOBlock));
+                                   sizeof(struct virtio_blk_config), sizeof(VirtIOBlock));
 
 s->vdev.update_config = virtio_blk_update_config;
 s->vdev.get_features = virtio_blk_get_features;
 s->bs = bs;
 bs->devfn = s->vdev.pci_dev.devfn;
+bdrv_guess_geometry(s->bs, &cylinders, &heads, &secs);
+bdrv_set_geometry_hint(s->bs, cylinders, heads, secs);
 
 virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
 



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Robin Holt
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
 On Wed, 16 Apr 2008, Robin Holt wrote:
 
  I don't think this lock mechanism is completely working.  I have
  gotten a few failures trying to dereference 0x100100 which appears to
  be LIST_POISON1.
 
 How does xpmem unregistering of notifiers work?

For the tests I have been running, we are waiting for the release
callout as part of exit.

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote:

 I don't think this lock mechanism is completely working.  I have
 gotten a few failures trying to dereference 0x100100 which appears to
 be LIST_POISON1.

How does xpmem unregistering of notifiers work?



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote:

 On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
  On Wed, 16 Apr 2008, Robin Holt wrote:
  
   I don't think this lock mechanism is completely working.  I have
   gotten a few failures trying to dereference 0x100100 which appears to
   be LIST_POISON1.
  
  How does xpmem unregistering of notifiers work?
 
 For the tests I have been running, we are waiting for the release
 callout as part of exit.

Some more details on the failure may be useful. AFAICT list_del[_rcu] is 
the culprit here and that is only used on release or unregister.
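
As background for the 0x100100 address in this thread: list_del() overwrites the removed entry's pointers with poison values, so any code that still holds the entry and follows next afterwards faults at LIST_POISON1. The following is a userspace imitation of that behaviour, not the kernel source; list_entry_is_poisoned is a hypothetical helper added for illustration, and the poison constants are the 32-bit-style values matching the address reported above.

```c
#include <stddef.h>

/* Userspace imitation of the kernel's doubly linked list poisoning. */
#define LIST_POISON1 ((void *)0x00100100)
#define LIST_POISON2 ((void *)0x00200200)

struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert item right after head. */
static void list_add(struct list_head *item, struct list_head *head)
{
    item->next = head->next;
    item->prev = head;
    head->next->prev = item;
    head->next = item;
}

/* Unlink the entry and poison its pointers, as list_del() does. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Hypothetical helper: does this entry look like it was already deleted? */
static int list_entry_is_poisoned(const struct list_head *entry)
{
    return entry->next == LIST_POISON1;
}
```

A walker that reaches a deleted entry and dereferences entry->next would touch address 0x100100, which is consistent with a use-after-delete rather than with ordinary list corruption.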





Re: [kvm-devel] [Qemu-devel] [PATCH 1/5] PCI DMA API (v3)

2008-04-16 Thread Blue Swirl
On 4/16/08, Anthony Liguori [EMAIL PROTECTED] wrote:
 This patch introduces a DMA API and plumbs support through the DMA layer.  We
  use a mostly opaque structure, IOVector to represent a scatter/gather list of
  physical memory.  Associated with each IOVector is a read/write function and
  an opaque pointer.  This allows arbitrary transformation/mapping of the
  data while providing an easy mechanism to short-cut the zero-copy case
  in the block/net backends.

This looks much better also for Sparc uses. I converted pcnet to use
the IOVectors (see patch), it does not work yet but looks doable.

IMHO the read/write functions should be a property of the bus so that
they are hidden from the device, for pcnet it does not matter as we
have to do the swapping anyway.


pcnet_dma_api.diff
Description: plain/text


Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host during startup

2008-04-16 Thread Alex Davis

--- Alex Davis [EMAIL PROTECTED] wrote:

 Host software:
 Linux 2.6.24.4
 KVM 65 (I am using the kernel modules from this release).
 X11 7.2 from Xorg
 SDL 1.2.13
 GCC 4.1.1
 Glibc 2.4
 
 Host hardware:
 Asus P5B Deluxe (P965 chipset based) motherboard
 4 GB RAM
 Intel E6700 CPU
 
 Guest software:
 Slackware 12.0 installed from CD-ROM.
 
Additional information: host arch. is x86_64(64-bit); guest arch. is 
x86(32-bit).

I code, therefore I am


  



Re: [kvm-devel] [Qemu-devel] [PATCH 1/5] PCI DMA API (v3)

2008-04-16 Thread Anthony Liguori
Blue Swirl wrote:
 On 4/16/08, Anthony Liguori [EMAIL PROTECTED] wrote:
   
 This patch introduces a DMA API and plumbs support through the DMA layer.  We
  use a mostly opaque structure, IOVector to represent a scatter/gather list 
 of
  physical memory.  Associated with each IOVector is a read/write function and
  an opaque pointer.  This allows arbitrary transformation/mapping of the
  data while providing an easy mechanism to short-cut the zero-copy case
  in the block/net backends.
 

 This looks much better also for Sparc uses. I converted pcnet to use
 the IOVectors (see patch), it does not work yet but looks doable.
   

Excellent!

 IMHO the read/write functions should be a property of the bus so that
 they are hidden from the device, for pcnet it does not matter as we
 have to do the swapping anyway.
   

For an IOMMU that has a per-device mapping, the read/write functions 
have to operate on a per-device basis.
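
To make the per-device point concrete, here is a rough sketch of a scatter/gather vector that carries its own read/write hooks and an opaque pointer. The patch itself is not reproduced in this thread, so every name and field below (IOVectorElement, IOVectorCopyFunc, DemoDevice, iovector_gather) is an assumption for illustration, not the actual API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the real IOVector is described only as "mostly opaque". */
typedef struct IOVectorElement {
    uint64_t base;              /* bus address of the segment */
    size_t len;
} IOVectorElement;

typedef void IOVectorCopyFunc(void *opaque, uint64_t addr,
                              uint8_t *buf, size_t len);

typedef struct IOVector {
    IOVectorCopyFunc *read;     /* device-visible memory -> buf */
    IOVectorCopyFunc *write;    /* buf -> device-visible memory */
    void *opaque;               /* per-device state, e.g. an IOMMU context */
    int num;
    IOVectorElement sg[4];
} IOVector;

/* Hypothetical device whose "IOMMU" is a fixed offset into private RAM. */
typedef struct DemoDevice {
    uint8_t ram[64];
    uint64_t iommu_offset;
} DemoDevice;

static void demo_read(void *opaque, uint64_t addr, uint8_t *buf, size_t len)
{
    DemoDevice *d = opaque;
    memcpy(buf, d->ram + (addr - d->iommu_offset), len);
}

static void demo_write(void *opaque, uint64_t addr, uint8_t *buf, size_t len)
{
    DemoDevice *d = opaque;
    memcpy(d->ram + (addr - d->iommu_offset), buf, len);
}

/* Gather every segment into one flat buffer through the vector's own hook. */
static size_t iovector_gather(const IOVector *iov, uint8_t *out)
{
    size_t done = 0;
    for (int i = 0; i < iov->num; i++) {
        iov->read(iov->opaque, iov->sg[i].base, out + done, iov->sg[i].len);
        done += iov->sg[i].len;
    }
    return done;
}
```

Because the hooks receive the opaque pointer, two devices behind different mappings can share the same gather code while translating addresses differently.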

Regards,

Anthony Liguori




[kvm-devel] KVM: MMU: kvm_pv_mmu_op should not take mmap_sem

2008-04-16 Thread Marcelo Tosatti

kvm_pv_mmu_op should not take mmap_sem. All gfn_to_page() callers down
in the MMU processing will take it if necessary, so as it is it can
deadlock.

Apparently a leftover from the days before slots_lock.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]


diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 078a7f1..2ad6f54 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2173,8 +2173,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
int r;
struct kvm_pv_mmu_op_buffer buffer;
 
-   down_read(&current->mm->mmap_sem);
-
buffer.ptr = buffer.buf;
buffer.len = min_t(unsigned long, bytes, sizeof buffer.buf);
buffer.processed = 0;
@@ -2194,7 +2192,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
r = 1;
 out:
*ret = buffer.processed;
-   up_read(&current->mm->mmap_sem);
return r;
 }
 



Re: [kvm-devel] [PATCH] add virtio disk geometry feature

2008-04-16 Thread Rusty Russell
On Thursday 17 April 2008 04:56:37 Ryan Harper wrote:
 From: Ryan Harper [EMAIL PROTECTED]

 Rather than faking up some geometry, allow the backend to push the disk
 geometry via virtio pci config option.  Keep the old geo code around for
 compatibility.

Hi Ryan,

   Looks good! Some brief review below.  Mainly just how I would have done
things stuff.  BTW, does this help in real life?  I assume something in
userspace wants it?

 + int err = 0;
 +
 + /* see if the host passed in geometry config */
 + err = virtio_config_val(vblk->vdev, VIRTIO_BLK_F_GEOMETRY,
 + offsetof(struct virtio_blk_config, cylinders),
 + &geo->cylinders);

Unnecessary err initialization.  Sometimes gcc catches bugs when you defer 
initializations of err to as late as possible (ie. paths where err isn't
set properly), so I tend to do it.

 + /* if host sets geo flag, all 3 values must be present */
 + if (!err) {
 + __virtio_config_val(vblk->vdev,
 + offsetof(struct virtio_blk_config, heads),
 + &geo->heads);
 + __virtio_config_val(vblk->vdev,
 + offsetof(struct virtio_blk_config, sectors),
 + &geo->sectors);

Kind of ugly; we can represent this in the data structure explicitly tho...

 @@ -18,6 +19,12 @@ struct virtio_blk_config
   __le32 size_max;
   /* The maximum number of segments (if VIRTIO_BLK_F_SEG_MAX) */
   __le32 seg_max;
 + /* cylinders of the device (if VIRTIO_BLK_F_GEOMETRY) */
 + __le16 cylinders;
 + /* heads of the device (if VIRTIO_BLK_F_GEOMETRY) */
 + __u8 heads;
 + /* sectors of the device (if VIRTIO_BLK_F_GEOMETRY) */
 + __u8 sectors;
  } __attribute__((packed));

... using a struct-within-a-struct.

Here's the result:
Subject: add virtio disk geometry feature
Date: Wed, 16 Apr 2008 13:56:37 -0500
From: Ryan Harper [EMAIL PROTECTED]

Rather than faking up some geometry, allow the backend to push the disk
geometry via virtio pci config option.  Keep the old geo code around for
compatibility.

Signed-off-by: Ryan Harper [EMAIL PROTECTED]
Reviewed-by: Anthony Liguori [EMAIL PROTECTED]
Signed-off-by: Rusty Russell [EMAIL PROTECTED] (modified to single struct)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -157,10 +157,25 @@ static int virtblk_ioctl(struct inode *i
 /* We provide getgeo only to please some old bootloader/partitioning tools */
 static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo)
 {
-   /* some standard values, similar to sd */
-   geo->heads = 1 << 6;
-   geo->sectors = 1 << 5;
-   geo->cylinders = get_capacity(bd->bd_disk) >> 11;
+   struct virtio_blk *vblk = bd->bd_disk->private_data;
+   struct virtio_blk_geometry vgeo;
+   int err;
+
+   /* see if the host passed in geometry config */
+   err = virtio_config_val(vblk->vdev, VIRTIO_BLK_F_GEOMETRY,
+                           offsetof(struct virtio_blk_config, geometry),
+                           &vgeo);
+
+   if (!err) {
+       geo->heads = vgeo.heads;
+       geo->sectors = vgeo.sectors;
+       geo->cylinders = vgeo.cylinders;
+   } else {
+       /* some standard values, similar to sd */
+       geo->heads = 1 << 6;
+       geo->sectors = 1 << 5;
+       geo->cylinders = get_capacity(bd->bd_disk) >> 11;
+   }
return 0;
 }
 
diff --git a/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
--- a/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.h
@@ -9,6 +9,7 @@
 #define VIRTIO_BLK_F_BARRIER   0   /* Does host support barriers? */
 #define VIRTIO_BLK_F_SIZE_MAX  1   /* Indicates maximum segment size */
 #define VIRTIO_BLK_F_SEG_MAX   2   /* Indicates maximum # of segments */
+#define VIRTIO_BLK_F_GEOMETRY  4   /* Legacy geometry available  */
 
 struct virtio_blk_config
 {
@@ -18,6 +19,12 @@ struct virtio_blk_config
__le32 size_max;
/* The maximum number of segments (if VIRTIO_BLK_F_SEG_MAX) */
__le32 seg_max;
+   /* geometry the device (if VIRTIO_BLK_F_GEOMETRY) */
+   struct virtio_blk_geometry {
+   __le16 cylinders;
+   __u8 heads;
+   __u8 sectors;
+   } geometry;
 } __attribute__((packed));
 
 /* These two define direction. */



Re: [kvm-devel] [PATCH] add virtio disk geometry feature

2008-04-16 Thread Anthony Liguori
Rusty Russell wrote:
 On Thursday 17 April 2008 04:56:37 Ryan Harper wrote:
   
 From: Ryan Harper [EMAIL PROTECTED]

 Rather than faking up some geometry, allow the backend to push the disk
 geometry via virtio pci config option.  Keep the old geo code around for
 compatibility.
 

 Hi Ryan,

Looks good! Some brief review below.  Mainly just how I would have done
 things stuff.  BTW, does this help in real life?  I assume something in
 userspace wants it?
   

Boot loaders (like grub) query the geometry from the kernel to figure 
out how to setup the stage1/stage2.  We've seen strange issues with grub 
thinking it has crazy geometries when installed on a virtio disk (as 
opposed to booting from virtio with an existing disk).

Ryan: have you tested a hardy install with your patches?  Does it help 
when installing to virtio?  I could pretty reliably reproduce the 
strangeness with a 20GB disk image FWIW.
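
For a sense of scale, the fallback values in the patch (64 heads, 32 sectors, capacity shifted right by 11) can be worked out for a 20GB image. This is arithmetic only and makes no claim about what grub does with the result; struct chs and fallback_geometry are hypothetical names, not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical container for the three values; not struct hd_geometry. */
struct chs {
    uint32_t cylinders;
    uint32_t heads;
    uint32_t sectors;
};

/* capacity is in 512-byte sectors, the unit get_capacity() reports. */
static struct chs fallback_geometry(uint64_t capacity)
{
    struct chs g;

    g.heads = 1 << 6;       /* 64 */
    g.sectors = 1 << 5;     /* 32 */
    /* 64 heads * 32 sectors = 2048 = 2^11 sectors per cylinder */
    g.cylinders = (uint32_t)(capacity >> 11);
    return g;
}
```

A 20GB image holds 20 * 2^21 = 41943040 sectors, so the fallback yields 20480 cylinders with 64 heads and 32 sectors per track.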

 diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
 --- a/drivers/block/virtio_blk.c
 +++ b/drivers/block/virtio_blk.c
 @@ -157,10 +157,25 @@ static int virtblk_ioctl(struct inode *i
  /* We provide getgeo only to please some old bootloader/partitioning tools */
  static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo)
  {
 - /* some standard values, similar to sd */
 - geo->heads = 1 << 6;
 - geo->sectors = 1 << 5;
 - geo->cylinders = get_capacity(bd->bd_disk) >> 11;
 + struct virtio_blk *vblk = bd->bd_disk->private_data;
 + struct virtio_blk_geometry vgeo;
 + int err;
 +
 + /* see if the host passed in geometry config */
 + err = virtio_config_val(vblk->vdev, VIRTIO_BLK_F_GEOMETRY,
 + offsetof(struct virtio_blk_config, geometry),
 + &vgeo);
 +
 + if (!err) {
 + geo->heads = vgeo.heads;
 + geo->sectors = vgeo.sectors;
 + geo->cylinders = vgeo.cylinders;
 + } else {
 + /* some standard values, similar to sd */
 + geo->heads = 1 << 6;
 + geo->sectors = 1 << 5;
 + geo->cylinders = get_capacity(bd->bd_disk) >> 11;
 + }
   return 0;
  }
   

You're probably breaking PPC since the values in the config space are in 
little endian format.  virtio_config_val does automagic endianness 
conversion if the size is 2, 4, or 8.  In this case, the structure size 
is 4 so the endianness conversion will do the wrong thing.

Magic endianness conversion based on read size is looking pretty evil to 
me... Perhaps we need explicit *_val[8,16,32,64]?
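
The failure mode can be modelled without any virtio code. The sketch below assumes the config bytes are laid out as in the geometry struct from the patch (16-bit little-endian cylinders, then heads, then sectors) and simulates a big-endian guest that converts all four bytes as one 32-bit little-endian value before reading the fields. The function names are invented for illustration.

```c
#include <stdint.h>

struct geometry {
    unsigned cylinders;
    unsigned heads;
    unsigned sectors;
};

/* Correct decode: little-endian 16-bit cylinders, then two single bytes. */
static struct geometry decode_per_field(const uint8_t cfg[4])
{
    struct geometry g;

    g.cylinders = cfg[0] | (cfg[1] << 8);
    g.heads = cfg[2];
    g.sectors = cfg[3];
    return g;
}

/*
 * Model of a big-endian guest that treats the 4 bytes as one le32,
 * converts it to CPU order, stores it, and then reads the struct
 * members from that memory in native (big-endian) order.
 */
static struct geometry decode_as_swapped_word(const uint8_t cfg[4])
{
    /* Memory image after the 32-bit conversion: byte order reversed. */
    const uint8_t mem[4] = { cfg[3], cfg[2], cfg[1], cfg[0] };
    struct geometry g;

    g.cylinders = (mem[0] << 8) | mem[1];   /* native BE read of the 16-bit slot */
    g.heads = mem[2];
    g.sectors = mem[3];
    return g;
}
```

With config bytes for 20480 cylinders, 16 heads and 63 sectors, the per-field decode returns those values, while the whole-word model returns 16144 cylinders, 80 heads and 0 sectors.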

Regards,

Anthony Liguori




[kvm-devel] [RFC] Multiple QEMU AIO implementations

2008-04-16 Thread Anthony Liguori
This isn't fully cooked yet, but pretty close.  The basic idea is to 
make the aio usage in block-raw go to a set of function pointers and 
allow multiple simultaneous AIO implementations.


I converted the posix-aio support to this, and also introduced a unix 
aio which just uses O_NONBLOCK and select().  The latter only supports 1 
simultaneous request per-fd but currently posix-aio is limited to that 
too.  At least with my QEMU testing, the unix aio implementation 
outperforms posix-aio by a factor of 2.
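
The attached patch is not shown inline, so the following is only a guess at the general shape: a table of function pointers per AIO backend, with a "unix" backend built on O_NONBLOCK and select(). DemoAioOps, demo_unix_read and demo_unix_aio are hypothetical names. For brevity the wait is collapsed into one call; an emulator would instead register the fd with its main loop and complete the request from a callback.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical backend table; the names are not taken from the patch. */
typedef struct DemoAioOps {
    const char *name;
    /* Returns the number of bytes read, or -1 with errno set. */
    ssize_t (*read)(int fd, void *buf, size_t len);
} DemoAioOps;

/* "unix" style backend: non-blocking fd, wait with select(), retry the read. */
static ssize_t demo_unix_read(int fd, void *buf, size_t len)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, len);
        fd_set rfds;

        /* Done on success, EOF, or any error other than "not ready yet". */
        if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR))
            return n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, NULL) == -1 && errno != EINTR)
            return -1;
    }
}

static const DemoAioOps demo_unix_aio = { "unix", demo_unix_read };
```

A second backend (posix-aio, or a linux-aio one) would fill in the same table, and the block layer would call through whichever table was selected at startup.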


And it uses no signals...

I'm inclined to suggest that we use signalfd with posix-aio, and for 
older guests, just fall back to unix aio.  We can also introduce a 
linux-aio and use that when possible.


Regards,

Anthony Liguori


qemu-block-aio-unix.patch
Description: application/mbox


Re: [kvm-devel] [PATCH] add virtio disk geometry feature

2008-04-16 Thread Hollis Blanchard
On Wednesday 16 April 2008 16:32:30 Anthony Liguori wrote:
  diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
  --- a/drivers/block/virtio_blk.c
  +++ b/drivers/block/virtio_blk.c
  @@ -157,10 +157,25 @@ static int virtblk_ioctl(struct inode *i
   /* We provide getgeo only to please some old bootloader/partitioning tools */
   static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo)
   {
  - /* some standard values, similar to sd */
  - geo->heads = 1 << 6;
  - geo->sectors = 1 << 5;
  - geo->cylinders = get_capacity(bd->bd_disk) >> 11;
  + struct virtio_blk *vblk = bd->bd_disk->private_data;
  + struct virtio_blk_geometry vgeo;
  + int err;
  +
  + /* see if the host passed in geometry config */
  + err = virtio_config_val(vblk->vdev, VIRTIO_BLK_F_GEOMETRY,
  + offsetof(struct virtio_blk_config, geometry),
  + &vgeo);
  +
  + if (!err) {
  + geo->heads = vgeo.heads;
  + geo->sectors = vgeo.sectors;
  + geo->cylinders = vgeo.cylinders;
  + } else {
  + /* some standard values, similar to sd */
  + geo->heads = 1 << 6;
  + geo->sectors = 1 << 5;
  + geo->cylinders = get_capacity(bd->bd_disk) >> 11;
  + }
    return 0;
   }
    
 
 You're probably breaking PPC since the values in the config space are in 
 little endian format.  virtio_config_val does automagic endianness 
 conversion if the size is 2, 4, or 8.  In this case, the structure size 
 is 4 so the endianness conversion will do the wrong thing.

Good catch; byte-swapping an entire structure is a terrible terrible idea.

 Magic endianness conversion based on read size is looking pretty evil to 
 me... Perhaps we need explicit *_val[8,16,32,64]?

Implicit byteswapping based on access size is the standard way of implementing 
accessors.

In this case, reading each structure member individually will do the right 
implicit swapping, rather than trying to load the whole thing as a single 
access.
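
A small sketch of what a size-dispatched accessor looks like when it is called once per member. It is a standalone illustration, not the virtio_config_val implementation; demo_blk_config, demo_config_read and demo_config_val are hypothetical names, and the config space is modelled as a plain byte array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout, mirroring the geometry fields discussed above. */
struct demo_blk_config {
    uint16_t cylinders;     /* little-endian in config space */
    uint8_t heads;
    uint8_t sectors;
};

/*
 * Read len bytes at offset and convert from little-endian to a native
 * integer of that width. Building the value with shifts gives the same
 * result on any host byte order. Returns -1 for widths that have no
 * integer meaning, such as a whole structure of another size.
 */
static int demo_config_read(const uint8_t *cfg, size_t offset,
                            void *out, size_t len)
{
    uint64_t v = 0;
    size_t i;

    if (len != 1 && len != 2 && len != 4 && len != 8)
        return -1;
    for (i = 0; i < len; i++)
        v |= (uint64_t)cfg[offset + i] << (8 * i);

    switch (len) {
    case 1: { uint8_t x = (uint8_t)v; memcpy(out, &x, 1); break; }
    case 2: { uint16_t x = (uint16_t)v; memcpy(out, &x, 2); break; }
    case 4: { uint32_t x = (uint32_t)v; memcpy(out, &x, 4); break; }
    default: memcpy(out, &v, 8); break;
    }
    return 0;
}

/* One access per member, sized by the destination variable. */
#define demo_config_val(cfg, member, out) \
    demo_config_read((cfg), offsetof(struct demo_blk_config, member), \
                     (out), sizeof(*(out)))
```

Calling demo_config_val three times, once each for cylinders, heads and sectors, gives every member a conversion of its own width, which is the per-member access pattern described in the message.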

-- 
Hollis Blanchard
IBM Linux Technology Center


Re: [kvm-devel] disappointing speed with virtio_blk

2008-04-16 Thread Gerd von Egidy
Hi Marcelo,

 virtio-blk is doing synchronous IO which blocks the guest CPU.

 This is especially bad for write intensive loads where the guest
 will hang in the host write throttling logic.

 In the meantime please try the following patch:

 http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg14732.html

Thank you for this hint.

I tried it this evening with kvm 66 - which should include your patch, right?

The result, at least with bonnie++, is nearly the same.

I looked at the patch for the guest kernel you sent with this patch but 
looking at the discussion it did not work.

Kind regards,

Gerd

-- 
Address (better: trap) for people I really don't want to get mail from:
[EMAIL PROTECTED]



Re: [kvm-devel] disappointing speed with virtio_blk

2008-04-16 Thread Marcelo Tosatti
On Thu, Apr 17, 2008 at 01:05:50AM +0200, Gerd von Egidy wrote:
 Hi Marcelo,
 
  virtio-blk is doing synchronous IO which blocks the guest CPU.
 
  This is especially bad for write intensive loads where the guest
  will hang in the host write throttling logic.
 
  In the meantime please try the following patch:
 
  http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg14732.html
 
 Thank you for this hint.
 
 I tried it this evening with kvm 66 - which should include your patch, right?

Hi Gerd,

No, it's not included. The issue is being worked on.



[kvm-devel] [PATCH 0/3] Qemu crashes with pci passthrough

2008-04-16 Thread Glauber de Oliveira Costa
Hi, 

I've got some qemu crashes while trying to passthrough an ide device
to a kvm guest. After some investigation, it turned out that 
register_ioport_{read/write} will abort on errors instead of returning
a meaningful error.

However, even if we do return an error, the asynchronous nature of pci
config space mapping updates makes it a little bit hard to treat.

This series of patches basically treats errors in the mapping functions in
the pci layer. If anything goes wrong, we unregister the pci device, unmapping
any mappings that happened to be successful already.
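
The idea can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the qemu code: DemoRegion, DEMO_STATUS_REGISTERED and the demo_* functions are hypothetical, and the mapping callbacks simply return 0 or -1 where the real ones would register ioports or memory.

```c
#include <stdio.h>

#define DEMO_NUM_REGIONS 7
#define DEMO_STATUS_REGISTERED 1

/* Hypothetical, simplified stand-in for a PCI I/O region. */
typedef struct DemoRegion {
    unsigned size;
    unsigned char status;
    int (*map_func)(struct DemoRegion *r);  /* returns 0 or -1, never aborts */
} DemoRegion;

/* Undo only the regions that were actually mapped. */
static void demo_unmap_registered(DemoRegion *regions)
{
    for (int i = 0; i < DEMO_NUM_REGIONS; i++) {
        if (!regions[i].size || regions[i].status != DEMO_STATUS_REGISTERED)
            continue;
        regions[i].status = 0;  /* a real device would unassign ports/memory here */
    }
}

/* Map every region; on the first failure roll back the ones that succeeded. */
static int demo_map_all(DemoRegion *regions)
{
    for (int i = 0; i < DEMO_NUM_REGIONS; i++) {
        if (!regions[i].size)
            continue;
        if (regions[i].map_func(&regions[i]) < 0) {
            fprintf(stderr, "region %d: mapping failed, rolling back\n", i);
            demo_unmap_registered(regions);
            return -1;
        }
        regions[i].status = DEMO_STATUS_REGISTERED;
    }
    return 0;
}

/* Trivial callbacks used to exercise both paths. */
static int demo_map_ok(DemoRegion *r) { (void)r; return 0; }
static int demo_map_fail(DemoRegion *r) { (void)r; return -1; }
```

Tracking a per-region status is what lets the rollback skip regions that never got mapped, so a partial failure does not try to unassign resources it never owned.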

After these patches are applied, a lot of warnings appear. And, you know,
every time there is a warning, god kills a kitten. But I'm not planning on
touching the other pieces of qemu code for this until we settle (or not) on
this solution.

Comments are very welcome, especially from qemu folks (since it is a bit 
invasive).






[kvm-devel] [PATCH 2/3] map regions as registered

2008-04-16 Thread Glauber de Oliveira Costa
From: Glauber Costa [EMAIL PROTECTED]

Mark which io regions were already successfully registered.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 qemu/hw/pci.c |5 +++--
 qemu/hw/pci.h |3 +++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index 1937408..7e4ce2d 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -199,7 +199,7 @@ static void pci_unregister_io_regions(PCIDevice *pci_dev)
 
 for(i = 0; i < PCI_NUM_REGIONS; i++) {
 r = &pci_dev->io_regions[i];
-if (!r->size)
+if ((!r->size) || (r->status != PCI_STATUS_REGISTERED))
 continue;
 if (r->type == PCI_ADDRESS_SPACE_IO) {
 isa_unassign_ioport(r->addr, r->size);
@@ -321,7 +321,7 @@ static void pci_update_mappings(PCIDevice *d)
 } else {
 isa_unassign_ioport(r->addr, r->size);
 }
-} else {
+} else if (r->status == PCI_STATUS_REGISTERED) {
 cpu_register_physical_memory(pci_to_cpu_addr(r->addr),
  r->size,
  IO_MEM_UNASSIGNED);
@@ -330,6 +330,7 @@ static void pci_update_mappings(PCIDevice *d)
 r->addr = new_addr;
 if (r->addr != -1) {
 r->map_func(d, i, r->addr, r->size, r->type);
+r->status = PCI_STATUS_REGISTERED;
 }
 }
 }
diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index e11fbbf..6350ad2 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -27,9 +27,12 @@ typedef struct PCIIORegion {
     uint32_t addr; /* current PCI mapping address. -1 means not mapped */
     uint32_t size;
     uint8_t type;
+    uint8_t status;
     PCIMapIORegionFunc *map_func;
 } PCIIORegion;
 
+#define PCI_STATUS_REGISTERED  1
+
 #define PCI_ROM_SLOT 6
 #define PCI_NUM_REGIONS 7
 
-- 
1.5.5
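
For readers following the thread, the idea behind this patch can be sketched in isolation. This is only an illustration under simplified assumptions: struct region, region_map() and regions_unregister() are hypothetical stand-ins, not QEMU's PCIIORegion or its functions. The point is that a per-region status flag lets the teardown path skip regions that have a size but were never actually mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGIONS       7   /* mirrors PCI_NUM_REGIONS in the patch */
#define STATUS_REGISTERED 1   /* mirrors PCI_STATUS_REGISTERED */

/* Hypothetical, simplified stand-in for PCIIORegion. */
struct region {
    uint32_t addr;
    uint32_t size;
    uint8_t  status;   /* 0 until the region has actually been mapped */
};

/* Record that a region was mapped at addr; stands in for the map_func call site. */
static void region_map(struct region *r, uint32_t addr)
{
    assert(r != NULL);
    r->addr = addr;
    r->status = STATUS_REGISTERED;
}

/* Tear down only the regions that were really mapped; returns how many were. */
static int regions_unregister(struct region *regions, size_t n)
{
    int released = 0;

    for (size_t i = 0; i < n; i++) {
        struct region *r = &regions[i];

        if (!r->size || r->status != STATUS_REGISTERED)
            continue;   /* empty or never mapped: nothing to undo */
        r->status = 0;
        released++;
    }
    return released;
}
```

With two sized regions of which only one was mapped, regions_unregister() touches just the mapped one, and a second call finds nothing left to release.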




[kvm-devel] [PATCH 1/3] don't exit on errors while registering ioports

2008-04-16 Thread Glauber de Oliveira Costa
From: Glauber Costa [EMAIL PROTECTED]

Currently, any error in register_ioports makes qemu
abort through hw_error(). But there are situations
in which those errors are not fatal. Just return
< 0 instead.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 qemu/vl.c |   12 +++-
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/qemu/vl.c b/qemu/vl.c
index 35a0465..d7e07e2 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -351,13 +351,13 @@ int register_ioport_read(int start, int length, int size,
     } else if (size == 4) {
         bsize = 2;
     } else {
-        hw_error("register_ioport_read: invalid size");
+        fprintf(stderr, "register_ioport_read: invalid size\n");
         return -1;
     }
     for(i = start; i < start + length; i += size) {
         ioport_read_table[bsize][i] = func;
         if (ioport_opaque[i] != NULL && ioport_opaque[i] != opaque)
-            hw_error("register_ioport_read: invalid opaque");
+            fprintf(stderr, "register_ioport_read: invalid opaque\n");
         ioport_opaque[i] = opaque;
     }
     return 0;
@@ -376,13 +376,15 @@ int register_ioport_write(int start, int length, int size,
     } else if (size == 4) {
         bsize = 2;
     } else {
-        hw_error("register_ioport_write: invalid size");
+        fprintf(stderr, "register_ioport_write: invalid size\n");
         return -1;
     }
     for(i = start; i < start + length; i += size) {
         ioport_write_table[bsize][i] = func;
-        if (ioport_opaque[i] != NULL && ioport_opaque[i] != opaque)
-            hw_error("register_ioport_write: invalid opaque");
+        if (ioport_opaque[i] != NULL && ioport_opaque[i] != opaque) {
+            fprintf(stderr, "register_ioport_write: invalid opaque\n");
+            return -1;
+        }
         ioport_opaque[i] = opaque;
     }
     return 0;
-- 
1.5.5
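
As a standalone illustration of the "report and return a negative value instead of aborting" approach, here is a minimal sketch. The names claim_ports() and port_owner[] and the 16-entry table are hypothetical, not QEMU's. One deliberate difference from the patch, stated as an assumption of this sketch: it validates the whole range before writing to the table, so a failed call leaves no partially claimed ports behind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PORTS 16   /* tiny table for illustration only */

/* Hypothetical stand-in for ioport_opaque[]; NULL marks a free port. */
static void *port_owner[MAX_PORTS];

/*
 * Claim ports [start, start + length) for owner.
 * Returns 0 on success, or -1 if the range is invalid or any port in it
 * already belongs to a different owner. The caller decides whether the
 * failure is fatal.
 */
static int claim_ports(int start, int length, void *owner)
{
    assert(owner != NULL);   /* NULL is reserved to mean "unowned" */

    if (start < 0 || length <= 0 || start + length > MAX_PORTS) {
        fprintf(stderr, "claim_ports: invalid range\n");
        return -1;
    }
    /* First pass: check for conflicts without modifying anything. */
    for (int i = start; i < start + length; i++) {
        if (port_owner[i] != NULL && port_owner[i] != owner) {
            fprintf(stderr, "claim_ports: port %d already owned\n", i);
            return -1;
        }
    }
    /* Second pass: the range is free (or already ours), so claim it. */
    for (int i = start; i < start + length; i++)
        port_owner[i] = owner;
    return 0;
}
```

A caller such as a passthrough device can then test the return value and back out of its own setup rather than taking the whole emulator down.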




[kvm-devel] [PATCH 3/3] propagate errors from ioport registering up to pci level

2008-04-16 Thread Glauber de Oliveira Costa
From: Glauber Costa [EMAIL PROTECTED]

In situations like pci-passthrough, ioport registration can
fail because another device is already present and in charge of
an io address. Currently this crashes qemu, but we can avoid that
by propagating the errors up to the pci layer.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 qemu/hw/pci-passthrough.c |   28 
 qemu/hw/pci.c |   30 --
 qemu/hw/pci.h |2 +-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/qemu/hw/pci-passthrough.c b/qemu/hw/pci-passthrough.c
index 7ffcc7b..3912447 100644
--- a/qemu/hw/pci-passthrough.c
+++ b/qemu/hw/pci-passthrough.c
@@ -127,7 +127,7 @@ pt_ioport_read(b)
 pt_ioport_read(w)
 pt_ioport_read(l)
 
-static void pt_iomem_map(PCIDevice * d, int region_num,
+static int pt_iomem_map(PCIDevice * d, int region_num,
                          uint32_t e_phys, uint32_t e_size, int type)
 {
     pt_dev_t *r_dev = (pt_dev_t *) d;
@@ -141,6 +141,7 @@ static void pt_iomem_map(PCIDevice * d, int region_num,
     cpu_register_physical_memory(e_phys,
                                  r_dev->dev.io_regions[region_num].size,
                                  r_dev->v_addrs[region_num].memory_index);
+    return 0;
 }
 
 
@@ -148,7 +149,8 @@ static void pt_ioport_map(PCIDevice * pci_dev, int region_num,
                           uint32_t addr, uint32_t size, int type)
 {
     pt_dev_t *r_dev = (pt_dev_t *) pci_dev;
-    int i;
+    int i, err;
+
     uint32_t ((*rf[])(void *, uint32_t)) =  { pt_ioport_readb,
                                               pt_ioport_readw,
                                               pt_ioport_readl
@@ -163,10 +165,14 @@ static void pt_ioport_map(PCIDevice * pci_dev, int region_num,
           "region_num=%d \n", addr, type, size, region_num);
 
     for (i = 0; i < 3; i++) {
-        register_ioport_write(addr, size, 1<<i, wf[i],
+        err = register_ioport_write(addr, size, 1<<i, wf[i],
                               (void *) (r_dev->v_addrs + region_num));
-        register_ioport_read(addr, size, 1<<i, rf[i],
+        if (err < 0)
+            return err;
+        err = register_ioport_read(addr, size, 1<<i, rf[i],
                              (void *) (r_dev->v_addrs + region_num));
+        if (err < 0)
+            return err;
     }
 }
 
@@ -455,6 +461,18 @@ struct {
 int nptdevs;
 extern int piix_get_irq(int);
 
+static int pt_pci_unregister(PCIDevice *pci_dev)
+{
+    pt_dev_t *pt = (pt_dev_t *)pci_dev;
+    int i;
+    for (i = 0; i < MAX_PTDEVS ; i++) {
+        if (ptdevs[i].ptdev == pt)
+            ptdevs[i].ptdev = NULL;
+    }
+    return 0;
+}
+
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -572,6 +590,8 @@ int pt_init(PCIBus * bus)
             ret = -1;
         }
         ptdevs[i].ptdev = dev;
+        /* FIXME: Can the unregister callback be ever called before this point? */
+        dev->dev.unregister = pt_pci_unregister;
     }
 
     if (kvm_enabled() && !qemu_kvm_irqchip_in_kernel())
diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index 7e4ce2d..5265b81 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -48,7 +48,7 @@ struct PCIBus {
     int irq_count[];
 };
 
-static void pci_update_mappings(PCIDevice *d);
+static int pci_update_mappings(PCIDevice *d);
 static void pci_set_irq(void *opaque, int irq_num, int level);
 void pci_pt_update_irq(PCIDevice *d);
 
@@ -133,13 +133,14 @@ void pci_device_save(PCIDevice *s, QEMUFile *f)
 int pci_device_load(PCIDevice *s, QEMUFile *f)
 {
     uint32_t version_id;
-    int i;
+    int i, err;
 
     version_id = qemu_get_be32(f);
     if (version_id > 2)
         return -EINVAL;
     qemu_get_buffer(f, s->config, 256);
-    pci_update_mappings(s);
+    if ((err = pci_update_mappings(s)) < 0)
+        return err;
 
     if (version_id >= 2)
         for (i = 0; i < 4; i ++)
@@ -192,7 +193,7 @@ static target_phys_addr_t pci_to_cpu_addr(target_phys_addr_t addr)
     return addr + pci_mem_base;
 }
 
-static void pci_unregister_io_regions(PCIDevice *pci_dev)
+void pci_unregister_io_regions(PCIDevice *pci_dev)
 {
     PCIIORegion *r;
     int i;
@@ -256,11 +257,22 @@ void pci_register_io_region(PCIDevice *pci_dev, int region_num,
     *(uint32_t *)(pci_dev->config + addr) = cpu_to_le32(type);
 }
 
+static int map_pci_region(PCIDevice *d, int i, PCIIORegion *r)
+{
+    int err = 0;
 
-static void pci_update_mappings(PCIDevice *d)
+    if ((err = r->map_func(d, i, r->addr, r->size, r->type)) < 0) {
+        fprintf(stderr, "Could not map pci device %s\n", d->name);
+        pci_unregister_device(d);
+    }
+    r->status = PCI_STATUS_REGISTERED;
+    return err;
+}
+
+static int 

Re: [kvm-devel] Extboot Option ROM rewritten in C - v2

2008-04-16 Thread Nguyen Anh Quynh
On Thu, Apr 17, 2008 at 12:02 AM, Anthony Liguori [EMAIL PROTECTED] wrote:
> A couple general comments.
>
> I'd feel a lot more comfortable with the int13 handler returning an int and
> the asm stub code uses that result to determine how to set CF.  You set CF
> deep within the function stack and there's no guarantee that GCC isn't going
> to stomp on it.
>
> I also don't think we want to raise int18 when we get a command we don't
> understand.  We should just not change any of the register state.  There are
> a number of extended commands that look for a magic value to determine
> whether the command exists or not.

Absolutely. We should return error code in that case.

Thanks,
Quynh
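
The split discussed above (the C handler only returns a status, and the stub alone decides the carry flag) can be modelled in plain C roughly as follows. This is a sketch, not the extboot code: struct int13_frame, int13_dispatch() and int13_finish() are hypothetical names, the frame layout is invented for illustration, and the choice to report an unknown function as status 0x01 with CF set follows the reply above rather than the "leave registers untouched" suggestion.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_CF 0x0001u   /* the carry flag is bit 0 of FLAGS */

/* Hypothetical saved-register frame, as an asm stub might push it. */
struct int13_frame {
    uint16_t ax;
    uint16_t flags;
};

/*
 * C-level handler: returns 0 on success or a BIOS-style status code.
 * It never touches the flags, so the compiler cannot clobber CF
 * between the handler's decision and the return from the interrupt.
 */
static int int13_dispatch(const struct int13_frame *f)
{
    uint8_t function = (uint8_t)(f->ax >> 8);   /* AH selects the function */

    switch (function) {
    case 0x00:          /* reset disk system: nothing to do in this sketch */
        return 0;
    default:
        return 0x01;    /* "invalid function" status */
    }
}

/* Models the stub's epilogue: status goes into AH, CF mirrors the result. */
static void int13_finish(struct int13_frame *f, int status)
{
    assert(status >= 0 && status <= 0xff);

    f->ax = (uint16_t)((f->ax & 0x00ffu) | ((unsigned)status << 8));
    if (status)
        f->flags |= FLAG_CF;
    else
        f->flags &= (uint16_t)~FLAG_CF;
}
```

In the real ROM the second function would be a few instructions in the assembly stub operating on the saved FLAGS word; it is written in C here only so the logic can be read and tested on its own.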



[kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-16 Thread Nguyen Anh Quynh
This patch replaces the current assembly code of Extboot option rom
with new C code. Patch is against kvm-66.

This version returns an error code when the int 13 handler cannot
handle a requested function.

Signed-off-by: Nguyen Anh Quynh [EMAIL PROTECTED]


# diffstat extboot3.diff
 b/extboot/Makefile  |   67 ++--
 b/extboot/boot.S|  119 
 b/extboot/farvar.h  |  113 
 b/extboot/rom.c |  366 ++
 b/extboot/signrom.c |   83 --
 b/extboot/util.h|   89 ++
 extboot/extboot.S   |  705 
 7 files changed, 786 insertions(+), 756 deletions(-)
commit bd0c778b70bbbcc5f120478278a965f87b8af7f4
Author: Nguyen Anh Quynh [EMAIL PROTECTED]
Date:   Thu Apr 17 10:22:00 2008 +0900

Extboot Option ROM rewritten in C

Signed-off-by: Nguyen Anh Quynh [EMAIL PROTECTED]

diff --git a/extboot/Makefile b/extboot/Makefile
index ab2dae7..8ac5b7b 100644
--- a/extboot/Makefile
+++ b/extboot/Makefile
@@ -1,35 +1,34 @@
-OBJCOPY=objcopy
-
-# from kernel sources - scripts/Kbuild.include
-# try-run
-# Usage: option = $(call try-run, $(CC)...-o $$TMP,option-ok,otherwise)
-# Exit code chooses option. $$TMP is can be used as temporary file and
-# is automatically cleaned up.
-try-run = $(shell set -e;		\
-	TMP=$(TMPOUT)..tmp;	\
-	if ($(1)) >/dev/null 2>&1;	\
-	then echo $(2);		\
-	else echo $(3);		\
-	fi;\
-	rm -f $$TMP)
-
-# cc-option-yn
-# Usage: flag := $(call cc-option-yn,-march=winchip-c6)
-cc-option-yn = $(call try-run,\
-	$(CC) $(KBUILD_CFLAGS) $(1) -S -xc /dev/null -o $$TMP,y,n)
-
-CFLAGS = -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin
-ifeq ($(call cc-option-yn,-fno-stack-protector),y)
-CFLAGS += -fno-stack-protector
-endif
-
-all: extboot.bin
+# Makefile for extboot Option ROM
+# Nguyen Anh Quynh [EMAIL PROTECTED]
 
-%.o: %.S
-	$(CC) $(CFLAGS) -o $@ -c $<
+CC = gcc
+CCFLAGS = -g -Wall -Werror -nostdlib -fno-builtin -fomit-frame-pointer -Os
+
+cc-option = $(shell if test -z "`$(1) $(2) -S -o /dev/null -xc \
+  /dev/null 2>&1`"; then echo "$(2)"; else echo "$(3)"; fi ;)
+CCFLAGS += $(call cc-option,$(CC),-nopie,)
+CCFLAGS += $(call cc-option,$(CC),-fno-stack-protector,)
+CCFLAGS += $(call cc-option,$(CC),-fno-stack-protector-all,)
+
+INSTALLDIR = /usr/share/qemu
+
+.PHONY: all
+all: clean extboot.bin
+
+.PHONY: install
+install: extboot.bin
+	cp extboot.bin $(INSTALLDIR)
+
+.PHONY: save
+save:
+	mv $(INSTALLDIR)/extboot.bin $(INSTALLDIR)/extboot.bin.org
+
+.PHONY: clean
+clean:
+	$(RM) *.o *.img *.bin signrom *~
 
-extboot.img: extboot.o
-	$(LD) --oformat binary -Ttext 0 -o $@ $<
+extboot.img: boot.o rom.o
+	$(LD) --oformat binary -Ttext 0 $^ -o $@ 
 
 extboot.bin: extboot.img signrom
 	./signrom extboot.img extboot.bin
@@ -37,5 +36,9 @@ extboot.bin: extboot.img signrom
 signrom: signrom.c
 	$(CC) -o $@ -g -Wall $^
 
-clean:
-	$(RM) *.o *.img *.bin signrom *~
+%.o: %.c
+	$(CC) $(CCFLAGS) -c $<
+
+%.o: %.S
+	$(CC) $(CCFLAGS) -c $<
+
diff --git a/extboot/boot.S b/extboot/boot.S
new file mode 100644
index 000..bc7e20c
--- /dev/null
+++ b/extboot/boot.S
@@ -0,0 +1,119 @@
+/*
+ * extboot.c
+ * Extended Boot Option ROM for QEMU.
+
+ * Copyright (C) by Nguyen Anh Quynh [EMAIL PROTECTED], 2008. 
+ *
+ * Based on the ASM version extboot.S of Anthony Liguori [EMAIL PROTECTED]
+
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+	.text
+	.code16gcc
+	.globl _start
+	.extern setup
+_start:
+	.short 0xAA55		/* ROM signature */
+	.byte 0/* ROM size - to be patched at built-time */
+	/* ROM entry: initializing */
+	pushal
+	/* %es is clobbered, so save it */
+	pushw %es
+	/* -fomit-frame-pointer might clobber %ebp */
+	pushl %ebp
+	call setup
+	popl  %ebp
+	popw  %es
+	popal
+	lretw
+
+	/* interrupt 19 handler */
+	.globl int19_handler
+	.extern int19_handler_C
+	.extern linux_boot
+	int19_handler:
+	pushal
+	/* %es is clobbered, so save it */
+	pushw %es
+	/* -fomit-frame-pointer might clobber %ebp */
+	pushl %ebp
+	call int19_handler_C
+	popl  %ebp
+	popw  %es
+orw  %ax, %ax	/* Do we need to execute original INT 19? */  
+	popal
+	jnz   linux_boot	/* No, just boot Linux kernel */
+	int $0x2b
+
+	/* interrupt 13 handler */
+	.globl int13_handler
+	.extern int13_handler_C
+int13_handler:
+	cmpb $0x80, %dl
+	/* only 

[kvm-devel] [ kvm-Bugs-1944629 ] Can't boot smp guests on ia32e host

2008-04-16 Thread SourceForge.net
Bugs item #1944629, was opened at 2008-04-17 13:44
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1944629&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: yunfeng (yunfeng)
Assigned to: Nobody/Anonymous (nobody)
Summary: Can't boot smp guests on ia32e host

Initial Comment:
Environment:

Host:ia32e 
Guest:  ia32pae/ia32e
Commits: kernel eeff99b6f7e47630356cf2aee2e58d1b3e1af931  
userspace cd6ac0431e01c01c7870aadfafc6ddaf09a8fb95
Hardware:Platform Woodcrest
 CPU  4
 Memory size  8G

Bug detailed description:
--
Can't boot smp guests on ia32e platform, both windows and linux smp guests will
hang as the attachment shows.
Using raw images or qcow images to boot, guests hang at the same point.
With -no-kvm or -no-kvm-irqchip option, smp linux guests can boot; with
-no-kvm-irqchip smp windows guests can boot.

Reproduce steps:

1.prepare a windows or linux image
2.boot guest with the command:
qemu-system-x86_64 -m 256 -smp 2 -net
nic,macaddr=00:16:3e:66:c3:4a,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup
-hda /share/xvs/var/ia32p_xpsp2_smp_acpi.img

Current result:



Expected result:



Basic root-causing log:
--


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1944629&group_id=180599



[kvm-devel] KVM Test result, kernel eeff99b.., userspace cd6ac04.. -- One new issue

2008-04-16 Thread Yunfeng Zhao
Hi All,
 
This is today's KVM test result against kvm.git 
eeff99b6f7e47630356cf2aee2e58d1b3e1af931 and kvm-userspace.git 
cd6ac0431e01c01c7870aadfafc6ddaf09a8fb95.

One New Issue

1.Can't boot smp guests on ia32e host
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1944629&group_id=180599

Three Old Issues:

2. Booting four guests likely fails
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599
 

3.  booting smp windows guests has 30% chance of hang
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1910923&group_id=180599
 

4. Cannot boot guests with hugetlbfs
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599
 


Test environment
  
Platform     Woodcrest
CPU          4
Memory size  8G
 
Details

IA32-pae:
1. boot guest with 256M memory                 PASS
2. boot two windows xp guest                   PASS
3. boot 4 same guest in parallel               PASS
4. boot linux and windows guest in parallel    PASS
5. boot guest with 1500M memory                PASS
6. boot windows 2003 with ACPI enabled         PASS
7. boot Windows xp with ACPI enabled           PASS
8. boot Windows 2000 without ACPI              PASS
9. kernel build on SMP linux guest             PASS
10. LTP on SMP linux guest                     PASS
11. boot base kernel linux                     PASS
12. save/restore 32-bit HVM guests             PASS
13. live migration 32-bit HVM guests           PASS
14. boot SMP Windows xp with ACPI enabled      PASS
15. boot SMP Windows 2003 with ACPI enabled    PASS
16. boot SMP Windows 2000 with ACPI enabled    PASS
 

IA32e:
1. boot four 32-bit guest in parallel                        PASS
2. boot four 64-bit guest in parallel                        PASS
3. boot 4G 64-bit guest                                      PASS
4. boot 4G pae guest                                         PASS
5. boot 32-bit linux and 32 bit windows guest in parallel    PASS
6. boot 32-bit guest with 1500M memory                       PASS
7. boot 64-bit guest with 1500M memory                       PASS
8. boot 32-bit guest with 256M memory                        PASS
9. boot 64-bit guest with 256M memory                        PASS
10. boot two 32-bit windows xp in parallel                   PASS
11. boot four 32-bit different guest in para                 PASS
12. save/restore 64-bit linux guests                         PASS
13. save/restore 32-bit linux guests                         PASS
14. boot 32-bit SMP windows 2003 with ACPI enabled           FAIL
15. boot 32-bit SMP Windows 2000 with ACPI enabled           FAIL
16. boot 32-bit SMP Windows xp with ACPI enabled             FAIL
17. boot 32-bit Windows 2000 without ACPI                    PASS
18. boot 64-bit Windows xp with ACPI enabled                 PASS
19. boot 32-bit Windows xp without ACPI                      PASS
20. boot 64-bit UP vista                                     PASS
21. boot 64-bit SMP vista                                    FAIL
22. kernel build in 32-bit linux guest OS                    PASS
23. kernel build in 64-bit linux guest OS                    PASS
24. LTP on 32-bit linux guest OS                             PASS
25. LTP on 64-bit linux guest OS                             PASS
26. boot 64-bit guests with ACPI enabled                     PASS
27. boot 32-bit x-server                                     PASS
28. boot 64-bit SMP windows XP with ACPI enabled             FAIL
29. boot 64-bit SMP windows 2003 with ACPI enabled           FAIL
30. live migration 64bit linux guests                        PASS
31. live migration 32bit linux guests                        PASS
32. reboot 32bit windows xp guest                            PASS
33. reboot 32bit windows xp guest                            PASS
 
 
Report Summary on IA32-pae
Summary Test Report of Last Session
=
                Total   Pass    Fail    NoResult   Crash
=
control_panel   7   7   0 0