date:20120803

Re: [PATCH V5 0/4] Improve virtio-blk performance

2012-08-03 Thread Asias He


On 08/03/2012 05:40 AM, Jens Axboe wrote:

On 08/02/2012 08:25 AM, Asias He wrote:

Hi folks,

This version adds REQ_FLUSH and REQ_FUA support, as suggested by Christoph,
and is rebased against Linus's latest tree.

Jens, could you please consider picking up the dependencies 1/4 and
2/4 in your tree? Thanks!


Picked up, thanks for getting that done!


Cheers. Thanks, Jens.

--
Asias
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/3] KVM: PPC: booke: Allow multiple exception types

2012-08-03 Thread Bharat Bhushan

Currently every handler in kvmppc_booke_handlers is generated by the
same macro (KVM_HANDLER), so all handlers are assumed to be the same
size. That assumption breaks once different handlers use different macros.

This patch records the start address of each handler in a table
(kvmppc_booke_handler_addr), so that handlers generated by different
macros, and therefore of different sizes, can be supported.
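
As a rough illustration of the idea (a userspace model, not the kernel code), the sketch below uses an offset table with one extra end entry, so that the size of entry i is the difference between two neighbouring entries. The names `blob`, `handler_off`, `handler_size` and `copy_handlers` are hypothetical, and plain byte offsets stand in for the real handler addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_HANDLERS 3

/* Hypothetical stand-in for the handler code: three "handlers" of
 * different sizes laid out back to back in one blob. */
static const unsigned char blob[] = {
	0xA0, 0xA1,                   /* handler 0: 2 bytes */
	0xB0, 0xB1, 0xB2, 0xB3, 0xB4, /* handler 1: 5 bytes */
	0xC0, 0xC1, 0xC2,             /* handler 2: 3 bytes */
};

/* Start offset of each handler plus one final entry marking the end.
 * This plays the role of kvmppc_booke_handler_addr in the patch. */
static const size_t handler_off[NR_HANDLERS + 1] = { 0, 2, 7, 10 };

/* Size of handler i, derived from two neighbouring table entries. */
static size_t handler_size(int i)
{
	assert(i >= 0 && i < NR_HANDLERS);
	return handler_off[i + 1] - handler_off[i];
}

/* Copy each handler to its own destination offset (the role played by
 * ivor[i] in the patch). Returns the end of the highest copied region,
 * i.e. the upper bound a cache flush would have to cover. */
static size_t copy_handlers(unsigned char *dst, const size_t *dst_off)
{
	size_t end = 0;
	int i;

	for (i = 0; i < NR_HANDLERS; i++) {
		size_t len = handler_size(i);

		memcpy(dst + dst_off[i], blob + handler_off[i], len);
		if (dst_off[i] + len > end)
			end = dst_off[i] + len;
	}
	return end;
}
```

The extra end entry is what makes the size of the last handler computable without a separate length symbol.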

Signed-off-by: Liu Yu yu@freescale.com
[bharat.bhus...@freescale.com: Substantial changes]
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/kvm_ppc.h  |2 -
 arch/powerpc/kvm/booke.c|9 ---
 arch/powerpc/kvm/booke.h|1 +
 arch/powerpc/kvm/booke_interrupts.S |   37 --
 arch/powerpc/kvm/e500.c |   13 +++
 5 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 1438a5e..deb55bd 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -47,8 +47,6 @@ enum emulation_result {
 
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
-extern char kvmppc_handlers_start[];
-extern unsigned long kvmppc_handler_len;
 extern void kvmppc_handler_highmem(void);
 
 extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index aebbb8b..1338233 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1541,6 +1541,7 @@ int __init kvmppc_booke_init(void)
 {
 #ifndef CONFIG_KVM_BOOKE_HV
 	unsigned long ivor[16];
+	unsigned long *handler = kvmppc_booke_handler_addr;
 	unsigned long max_ivor = 0;
 	int i;
 
@@ -1574,14 +1575,14 @@ int __init kvmppc_booke_init(void)
 
 	for (i = 0; i < 16; i++) {
 		if (ivor[i] > max_ivor)
-			max_ivor = ivor[i];
+			max_ivor = i;
 
 		memcpy((void *)kvmppc_booke_handlers + ivor[i],
-		       kvmppc_handlers_start + i * kvmppc_handler_len,
-		       kvmppc_handler_len);
+		       (void *)handler[i], handler[i + 1] - handler[i]);
 	}
 	flush_icache_range(kvmppc_booke_handlers,
-			   kvmppc_booke_handlers + max_ivor + kvmppc_handler_len);
+			   kvmppc_booke_handlers + ivor[max_ivor] +
+			       handler[max_ivor + 1] - handler[max_ivor]);
 #endif /* !BOOKE_HV */
 	return 0;
 }
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index ba61974..de9e526 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -65,6 +65,7 @@
  (1 << BOOKE_IRQPRIO_CRITICAL))
 
 extern unsigned long kvmppc_booke_handlers;
+extern unsigned long kvmppc_booke_handler_addr[];
 
 void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr);
 void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr);
diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S
index bb46b32..3539805 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -73,6 +73,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
bctr
 .endm
 
+.macro KVM_HANDLER_ADDR ivor_nr
+   .long   kvmppc_handler_\ivor_nr
+.endm
+
+.macro KVM_HANDLER_END
+   .long   kvmppc_handlers_end
+.endm
+
 _GLOBAL(kvmppc_handlers_start)
 KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
 KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK  SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0
@@ -93,9 +101,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0
 KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0
-
-_GLOBAL(kvmppc_handler_len)
-   .long kvmppc_handler_1 - kvmppc_handler_0
+_GLOBAL(kvmppc_handlers_end)
 
 /* Registers:
  *  SPRG_SCRATCH0: guest r4
@@ -463,6 +469,31 @@ lightweight_exit:
lwz r4, VCPU_GPR(R4)(r4)
rfi
 
+   .data
+   .align  4
+   .globl  kvmppc_booke_handler_addr
+kvmppc_booke_handler_addr:
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_CRITICAL
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_MACHINE_CHECK
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_DATA_STORAGE
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_INST_STORAGE
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_EXTERNAL
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_ALIGNMENT
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_PROGRAM
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_FP_UNAVAIL
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_SYSCALL
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_AP_UNAVAIL
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_DECREMENTER
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_FIT
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_WATCHDOG
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_DTLB_MISS
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_ITLB_MISS
+KVM_HANDLER_ADDR BOOKE_INTERRUPT_DEBUG

[PATCH 1/3] booke: Added ONE_REG interface for IAC/DAC debug registers

2012-08-03 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

The IAC/DAC debug registers are defined as 32-bit in the sregs interface,
although they are 64 bits wide on 64-bit processors. Add a ONE_REG
interface so that the full 64-bit values can be set and read.
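
The following userspace sketch illustrates the general shape of a get/set-by-ID interface for 64-bit registers. It is only a model under stated assumptions: the `REG_*` IDs, `struct debug_regs`, `reg_slot`, `get_one_reg` and `set_one_reg` are hypothetical names, and `memcpy` stands in for copy_to_user()/copy_from_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register IDs; the patch builds the real ones from
 * KVM_REG_PPC and KVM_REG_SIZE_U64. */
enum { REG_IAC1 = 0x2, REG_IAC2, REG_IAC3, REG_IAC4, REG_DAC1, REG_DAC2 };

struct debug_regs {
	uint64_t iac[4];
	uint64_t dac[2];
};

/* Map a register ID to its 64-bit storage slot, or NULL if unknown. */
static uint64_t *reg_slot(struct debug_regs *d, unsigned int id)
{
	if (id >= REG_IAC1 && id <= REG_IAC4)
		return &d->iac[id - REG_IAC1];
	if (id >= REG_DAC1 && id <= REG_DAC2)
		return &d->dac[id - REG_DAC1];
	return NULL;
}

/* Read one register into *out; returns 0 or -EINVAL for an unknown ID. */
static int get_one_reg(struct debug_regs *d, unsigned int id, uint64_t *out)
{
	uint64_t *slot = reg_slot(d, id);

	assert(out != NULL);
	if (!slot)
		return -EINVAL;
	memcpy(out, slot, sizeof(*slot));	/* models copy_to_user() */
	return 0;
}

/* Write one register from *in; returns 0 or -EINVAL for an unknown ID. */
static int set_one_reg(struct debug_regs *d, unsigned int id, const uint64_t *in)
{
	uint64_t *slot = reg_slot(d, id);

	assert(in != NULL);
	if (!slot)
		return -EINVAL;
	memcpy(slot, in, sizeof(*slot));	/* models copy_from_user() */
	return 0;
}
```

Because every slot is a full 64-bit value, nothing is truncated the way it would be through a __u32 field.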

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/kvm.h  |   12 ++
 arch/powerpc/include/asm/kvm_host.h |   24 +++-
 arch/powerpc/kvm/booke.c|   66 +-
 arch/powerpc/kvm/booke_emulate.c|8 ++--
 4 files changed, 102 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index 1bea4d8..3c14202 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -221,6 +221,12 @@ struct kvm_sregs {
 
__u32 dbsr; /* KVM_SREGS_E_UPDATE_DBSR */
__u32 dbcr[3];
+   /*
+* iac/dac registers are 64bit wide, while this API
+* interface provides only lower 32 bits on 64 bit
+* processors. ONE_REG interface is added for 64bit
+* iac/dac registers.
+*/
__u32 iac[4];
__u32 dac[2];
__u32 dvc[2];
@@ -326,5 +332,11 @@ struct kvm_book3e_206_tlb_params {
 };
 
 #define KVM_REG_PPC_HIOR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
+#define KVM_REG_PPC_IAC1   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
+#define KVM_REG_PPC_IAC2   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
+#define KVM_REG_PPC_IAC3   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x4)
+#define KVM_REG_PPC_IAC4   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x5)
+#define KVM_REG_PPC_DAC1   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x6)
+#define KVM_REG_PPC_DAC2   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x7)
 
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 873ec11..dcee499 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -344,6 +344,27 @@ struct kvmppc_slb {
bool class  : 1;
 };
 
+# ifdef CONFIG_PPC_FSL_BOOK3E
+#define KVMPPC_BOOKE_IAC_NUM   2
+#define KVMPPC_BOOKE_DAC_NUM   2
+# else
+#define KVMPPC_BOOKE_IAC_NUM   4
+#define KVMPPC_BOOKE_DAC_NUM   2
+# endif
+#define KVMPPC_BOOKE_MAX_IAC   4
+#define KVMPPC_BOOKE_MAX_DAC   2
+
+struct kvmppc_booke_debug_reg {
+   u32 dbcr0;
+   u32 dbcr1;
+   u32 dbcr2;
+#ifdef CONFIG_KVM_E500MC
+   u32 dbcr4;
+#endif
+   u64 iac[KVMPPC_BOOKE_MAX_IAC];
+   u64 dac[KVMPPC_BOOKE_MAX_DAC];
+};
+
 struct kvm_vcpu_arch {
ulong host_stack;
u32 host_pid;
@@ -438,9 +459,8 @@ struct kvm_vcpu_arch {
 
u32 ccr0;
u32 ccr1;
-   u32 dbcr0;
-   u32 dbcr1;
u32 dbsr;
+   struct kvmppc_booke_debug_reg dbg_reg;
 
u64 mmcr[3];
u32 pmc[8];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 68ed863..aebbb8b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1379,12 +1379,74 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 {
-	return -EINVAL;
+	int r = -EINVAL;
+
+	switch(reg->id) {
+	case KVM_REG_PPC_IAC1:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.iac[0], sizeof(u64));
+		break;
+	case KVM_REG_PPC_IAC2:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.iac[1], sizeof(u64));
+		break;
+#ifndef CONFIG_PPC_FSL_BOOK3E
+	case KVM_REG_PPC_IAC3:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.iac[2], sizeof(u64));
+		break;
+	case KVM_REG_PPC_IAC4:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.iac[3], sizeof(u64));
+		break;
+#endif
+	case KVM_REG_PPC_DAC1:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.dac[0], sizeof(u64));
+		break;
+	case KVM_REG_PPC_DAC2:
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.dac[1], sizeof(u64));
+		break;
+	default:
+		break;
+	}
+	return r;
 }
 
 int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 {
-	return -EINVAL;
+	int r = -EINVAL;
+
+	switch(reg->id) {
+	case KVM_REG_PPC_IAC1:
+		r = copy_from_user(&vcpu->arch.dbg_reg.iac[0],
+				   (u64 __user *)(long)reg->addr, sizeof(u64));
+		break;
+	case KVM_REG_PPC_IAC2:
+		r =

[PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-03 Thread Bharat Bhushan

The debug handler installed here will be used for guest debug support
and for debug facility emulation (patches for these features will
follow this patch).

Signed-off-by: Liu Yu yu@freescale.com
[bharat.bhus...@freescale.com: Substantial changes]
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com

---
 arch/powerpc/include/asm/kvm_host.h |1 +
 arch/powerpc/kernel/asm-offsets.c   |1 +
 arch/powerpc/kvm/booke_interrupts.S |   45 +++
 3 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index dcee499..bd78523 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -494,6 +494,7 @@ struct kvm_vcpu_arch {
u32 tlbcfg[4];
u32 mmucfg;
u32 epr;
+   u32 crit_save;
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 85b05c4..92f149b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -563,6 +563,7 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
 #endif /* CONFIG_PPC_BOOK3S */
 #endif /* CONFIG_KVM */
 
diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S
index 3539805..890673c 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
bctr
 .endm
 
+.macro KVM_DBG_HANDLER ivor_nr scratch srr0
+_GLOBAL(kvmppc_handler_\ivor_nr)
+   mtspr   \scratch, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz r4, THREAD_KVM_VCPU(r4)
+   stw r3, VCPU_CRIT_SAVE(r4)
+   mfcr    r3
+   mfspr   r4, SPRN_CSRR1
+   andi.   r4, r4, MSR_PR
+   bne 1f
+   /* debug interrupt happened in enter/exit path */
+   mfspr   r4, SPRN_CSRR1
+   rlwinm  r4, r4, 0, ~MSR_DE
+   mtspr   SPRN_CSRR1, r4
+   lis r4, 0x
+   ori r4, r4, 0x
+   mtspr   SPRN_DBSR, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz r4, THREAD_KVM_VCPU(r4)
+   mtcr    r3
+   lwz r3, VCPU_CRIT_SAVE(r4)
+   mfspr   r4, \scratch
+   rfci
+1: /* debug interrupt happened in guest */
+   mfspr   r4, \scratch
+   mtcr    r3
+   mr  r3, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz r4, THREAD_KVM_VCPU(r4)
+   stw r3, VCPU_GPR(R4)(r4)
+   stw r5, VCPU_GPR(R5)(r4)
+   stw r6, VCPU_GPR(R6)(r4)
+   lwz r3, VCPU_CRIT_SAVE(r4)
+   mfspr   r5, \srr0
+   stw r3, VCPU_GPR(R3)(r4)
+   stw r5, VCPU_PC(r4)
+   mfctr   r5
+   lis r6, kvmppc_resume_host@h
+   stw r5, VCPU_CTR(r4)
+   li  r5, \ivor_nr
+   ori r6, r6, kvmppc_resume_host@l
+   mtctr   r6
+   bctr
+.endm
+
 .macro KVM_HANDLER_ADDR ivor_nr
.long   kvmppc_handler_\ivor_nr
 .endm
-- 
1.7.0.4



Re: [PATCH V5 3/4] virtio-blk: Add bio-based IO path for virtio-blk

2012-08-03 Thread Asias He


On 08/03/2012 06:28 AM, Michael S. Tsirkin wrote:

On Thu, Aug 02, 2012 at 02:25:55PM +0800, Asias He wrote:

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index c0bbeb4..95cfeed 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -14,6 +14,9 @@

  #define PART_BITS 4

+static bool use_bio;
+module_param(use_bio, bool, S_IRUGO);
+
  static int major;
  static DEFINE_IDA(vd_index_ida);



OK, but eventually, you plan to make this use a feature bit, yes?


Yes, I think so.

--
Asias

Re: Attaching audio CD to VM instance

2012-08-03 Thread Paolo Bonzini

On 02/08/2012 22:21, Erik Lotspeich wrote:
 Hi,
 
 I haven't seen this question asked before. Is it possible to mount an
 audio CD in a raw mode so that a VM instance can access the audio
 tracks as if it were a physical CD-ROM drive? It seems that the current
 implementation presents a CD-ROM to the VM, not CD-DA.

You can do this with SCSI pass-through.  With QEMU you can use this
command line:

   -device virtio-scsi-pci
   -drive if=none,file=/dev/sr0,id=sr0
   -device scsi-block,drive=sr0

if you have a new enough guest (Linux 3.4+ or Fedora 16 or RHEL6.3 or
clones).  Older guests can use -device lsi instead.

Paolo

[PATCH v2 01/10] KVM: iommu: fix releasing unmapped page

2012-08-03 Thread Xiao Guangrong

This patch fixes two bugs:
- the 'error page' is never released
  [ releasing it is unnecessary after commit a2766325cf9f9, but for
    backports we still call kvm_release_pfn_clean for the error pfn ]

- guest pages are always released, even when the page was never
  mapped (e.g. because of hwpoison)
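
The second fix can be sketched in isolation as follows. This is a simplified model, not the kernel's IOMMU code: `struct model`, `phys_of`, `pin_count` and `put_pages` are hypothetical, and a physical address of 0 is assumed to mean "this gfn was never mapped".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the mapping state for a range of guest frames. */
struct model {
	const uint64_t *phys_of;	/* per-gfn physical address, 0 if unmapped */
	int *pin_count;			/* per-gfn pin count */
};

/* Release every mapped page in [base, base + npages) and return how
 * many pages were released. Unmapped gfns are skipped, so a page that
 * was never pinned is never unpinned. */
static size_t put_pages(struct model *m, size_t base, size_t npages)
{
	size_t gfn = base;
	size_t end = base + npages;
	size_t released = 0;

	while (gfn < end) {
		uint64_t phys = m->phys_of[gfn];

		if (!phys) {		/* nothing was mapped here */
			gfn++;
			continue;
		}

		assert(m->pin_count[gfn] > 0);
		m->pin_count[gfn]--;
		released++;
		gfn++;
	}
	return released;
}
```

Without the early `continue`, the loop would drop a reference for the unmapped frame as well and unbalance its count.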

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 virt/kvm/iommu.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index c03f1fb..6a67bea 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -107,6 +107,7 @@ int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
 */
pfn = kvm_pin_pages(slot, gfn, page_size);
if (is_error_pfn(pfn)) {
+   kvm_release_pfn_clean(pfn);
gfn += 1;
continue;
}
@@ -300,6 +301,12 @@ static void kvm_iommu_put_pages(struct kvm *kvm,

/* Get physical address */
phys = iommu_iova_to_phys(domain, gfn_to_gpa(gfn));
+
+   if (!phys) {
+   gfn++;
+   continue;
+   }
+
pfn  = phys >> PAGE_SHIFT;

/* Unmap address from IO address space */
-- 
1.7.7.6


[PATCH v2 02/10] KVM: introduce KVM_PFN_ERR_FAULT

2012-08-03 Thread Xiao Guangrong

Introduce KVM_PFN_ERR_FAULT so that get_fault_pfn, which is exported
and cannot be inlined, can be removed.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c   |2 +-
 include/linux/kvm_host.h |3 ++-
 virt/kvm/kvm_main.c  |   12 +++-
 3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a9a2052..39ed315 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2514,7 +2514,7 @@ static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,

slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, no_dirty_log);
if (!slot)
-   return get_fault_pfn();
+   return KVM_PFN_ERR_FAULT;

hva = gfn_to_hva_memslot(slot, gfn);

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dbc65f9..4c39543 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -48,6 +48,8 @@
 #define KVM_MAX_MMIO_FRAGMENTS \
(KVM_MMIO_SIZE / KVM_USER_MMIO_SIZE + KVM_EXTRA_MMIO_FRAGMENTS)

+#define KVM_PFN_ERR_FAULT  (-EFAULT)
+
 /*
  * vcpu->requests bit members
  */
@@ -444,7 +446,6 @@ void kvm_release_pfn_clean(pfn_t pfn);
 void kvm_set_pfn_dirty(pfn_t pfn);
 void kvm_set_pfn_accessed(pfn_t pfn);
 void kvm_get_pfn(pfn_t pfn);
-pfn_t get_fault_pfn(void);

 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
int len);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bcf973e..9c084f8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -948,12 +948,6 @@ static pfn_t get_bad_pfn(void)
return -ENOENT;
 }

-pfn_t get_fault_pfn(void)
-{
-   return -EFAULT;
-}
-EXPORT_SYMBOL_GPL(get_fault_pfn);
-
 static pfn_t get_hwpoison_pfn(void)
 {
return -EHWPOISON;
@@ -1124,7 +1118,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 		struct vm_area_struct *vma;

 		if (atomic)
-			return get_fault_pfn();
+			return KVM_PFN_ERR_FAULT;

 		down_read(&current->mm->mmap_sem);
 		if (npages == -EHWPOISON ||
@@ -1136,7 +1130,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 		vma = find_vma_intersection(current->mm, addr, addr+1);

 		if (vma == NULL)
-			pfn = get_fault_pfn();
+			pfn = KVM_PFN_ERR_FAULT;
 		else if ((vma->vm_flags & VM_PFNMAP)) {
 			pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) +
 				vma->vm_pgoff;
@@ -1144,7 +1138,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 		} else {
 			if (async && (vma->vm_flags & VM_WRITE))
 				*async = true;
-			pfn = get_fault_pfn();
+			pfn = KVM_PFN_ERR_FAULT;
 		}
 		up_read(&current->mm->mmap_sem);
 	} else
-- 
1.7.7.6


[PATCH v2 03/10] KVM: introduce KVM_PFN_ERR_HWPOISON

2012-08-03 Thread Xiao Guangrong

Introduce KVM_PFN_ERR_HWPOISON so that get_hwpoison_pfn and
is_hwpoison_pfn can be removed.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/x86/kvm/mmu.c   |2 +-
 include/linux/kvm_host.h |2 +-
 virt/kvm/kvm_main.c  |   13 +
 3 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 39ed315..924b4e8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2651,7 +2651,7 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
 static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 {
 	kvm_release_pfn_clean(pfn);
-	if (is_hwpoison_pfn(pfn)) {
+	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
 		return 0;
 	}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4c39543..cbd5af8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -49,6 +49,7 @@
(KVM_MMIO_SIZE / KVM_USER_MMIO_SIZE + KVM_EXTRA_MMIO_FRAGMENTS)

 #define KVM_PFN_ERR_FAULT  (-EFAULT)
+#define KVM_PFN_ERR_HWPOISON   (-EHWPOISON)

 /*
  * vcpu->requests bit members
@@ -396,7 +397,6 @@ extern struct page *bad_page;

 int is_error_page(struct page *page);
 int is_error_pfn(pfn_t pfn);
-int is_hwpoison_pfn(pfn_t pfn);
 int is_noslot_pfn(pfn_t pfn);
 int is_invalid_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9c084f8..f17ce44 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -948,17 +948,6 @@ static pfn_t get_bad_pfn(void)
return -ENOENT;
 }

-static pfn_t get_hwpoison_pfn(void)
-{
-   return -EHWPOISON;
-}
-
-int is_hwpoison_pfn(pfn_t pfn)
-{
-   return pfn == -EHWPOISON;
-}
-EXPORT_SYMBOL_GPL(is_hwpoison_pfn);
-
 int is_noslot_pfn(pfn_t pfn)
 {
return pfn == -ENOENT;
@@ -1124,7 +1113,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 		if (npages == -EHWPOISON ||
 			(!async && check_user_page_hwpoison(addr))) {
 			up_read(&current->mm->mmap_sem);
-			return get_hwpoison_pfn();
+			return KVM_PFN_ERR_HWPOISON;
 		}

 		vma = find_vma_intersection(current->mm, addr, addr+1);
-- 
1.7.7.6


[PATCH v2 04/10] KVM: introduce KVM_PFN_ERR_BAD

2012-08-03 Thread Xiao Guangrong

Introduce KVM_PFN_ERR_BAD so that get_bad_pfn can be removed.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 include/linux/kvm_host.h |1 +
 virt/kvm/kvm_main.c  |7 +--
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index cbd5af8..ba8b222 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -50,6 +50,7 @@

 #define KVM_PFN_ERR_FAULT  (-EFAULT)
 #define KVM_PFN_ERR_HWPOISON   (-EHWPOISON)
+#define KVM_PFN_ERR_BAD        (-ENOENT)

 /*
  * vcpu->requests bit members
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f17ce44..75d3c76 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -943,11 +943,6 @@ int is_error_pfn(pfn_t pfn)
 }
 EXPORT_SYMBOL_GPL(is_error_pfn);

-static pfn_t get_bad_pfn(void)
-{
-   return -ENOENT;
-}
-
 int is_noslot_pfn(pfn_t pfn)
 {
return pfn == -ENOENT;
@@ -1152,7 +1147,7 @@ static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,

addr = gfn_to_hva(kvm, gfn);
if (kvm_is_error_hva(addr))
-   return get_bad_pfn();
+   return KVM_PFN_ERR_BAD;

return hva_to_pfn(addr, atomic, async, write_fault, writable);
 }
-- 
1.7.7.6


[PATCH v2 05/10] KVM: inline is_*_pfn functions

2012-08-03 Thread Xiao Guangrong

These functions are exported and therefore cannot be inlined. Move them
to kvm_host.h as static inline functions to eliminate the function call
overhead.
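
For readers unfamiliar with the encoding, here is a minimal userspace sketch of how a negative errno can live in an unsigned pfn and be tested by inline helpers. It follows the spirit of the kernel's IS_ERR_VALUE() but is not the kernel code: `pfn_err` is a hypothetical helper, `pfn_t` is assumed to be 64 bits wide, and `MAX_ERRNO` is defined locally.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t pfn_t;

#define MAX_ERRNO 4095

/* Build an error pfn from a positive errno value (hypothetical helper). */
static inline pfn_t pfn_err(int err)
{
	assert(err > 0 && err <= MAX_ERRNO);
	return (pfn_t)(-(int64_t)err);
}

/* The top MAX_ERRNO values of the unsigned range are reserved for
 * error codes, so a single comparison identifies all of them. */
static inline int is_error_pfn(pfn_t pfn)
{
	return pfn >= (pfn_t)(-(int64_t)MAX_ERRNO);
}

/* -ENOENT is the "no memslot" marker. */
static inline int is_noslot_pfn(pfn_t pfn)
{
	return pfn == pfn_err(ENOENT);
}

/* Any error other than "no memslot" counts as invalid. */
static inline int is_invalid_pfn(pfn_t pfn)
{
	return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
}
```

Since each helper is a one-line comparison, defining them as static inline in a header lets the compiler fold them into the caller.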

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 include/linux/kvm_host.h |   19 ---
 virt/kvm/kvm_main.c  |   18 --
 2 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ba8b222..f604d1d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -21,6 +21,7 @@
 #include <linux/slab.h>
 #include <linux/rcupdate.h>
 #include <linux/ratelimit.h>
+#include <linux/err.h>
 #include <asm/signal.h>

 #include <linux/kvm.h>
@@ -52,6 +53,21 @@
 #define KVM_PFN_ERR_HWPOISON   (-EHWPOISON)
 #define KVM_PFN_ERR_BAD        (-ENOENT)

+static inline int is_error_pfn(pfn_t pfn)
+{
+   return IS_ERR_VALUE(pfn);
+}
+
+static inline int is_noslot_pfn(pfn_t pfn)
+{
+   return pfn == -ENOENT;
+}
+
+static inline int is_invalid_pfn(pfn_t pfn)
+{
+   return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
+}
+
 /*
  * vcpu->requests bit members
  */
@@ -397,9 +413,6 @@ id_to_memslot(struct kvm_memslots *slots, int id)
 extern struct page *bad_page;

 int is_error_page(struct page *page);
-int is_error_pfn(pfn_t pfn);
-int is_noslot_pfn(pfn_t pfn);
-int is_invalid_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
 int kvm_set_memory_region(struct kvm *kvm,
  struct kvm_userspace_memory_region *mem,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 75d3c76..08b600b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -937,24 +937,6 @@ int is_error_page(struct page *page)
 }
 EXPORT_SYMBOL_GPL(is_error_page);

-int is_error_pfn(pfn_t pfn)
-{
-   return IS_ERR_VALUE(pfn);
-}
-EXPORT_SYMBOL_GPL(is_error_pfn);
-
-int is_noslot_pfn(pfn_t pfn)
-{
-   return pfn == -ENOENT;
-}
-EXPORT_SYMBOL_GPL(is_noslot_pfn);
-
-int is_invalid_pfn(pfn_t pfn)
-{
-   return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
-}
-EXPORT_SYMBOL_GPL(is_invalid_pfn);
-
 struct page *get_bad_page(void)
 {
return ERR_PTR(-ENOENT);
-- 
1.7.7.6


[PATCH v2 06/10] KVM: remove the unused declare

2012-08-03 Thread Xiao Guangrong

Remove the bad_page declaration, since it is no longer used.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 include/linux/kvm_host.h |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f604d1d..bdf2182 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -410,8 +410,6 @@ id_to_memslot(struct kvm_memslots *slots, int id)
return slot;
 }

-extern struct page *bad_page;
-
 int is_error_page(struct page *page);
 int kvm_is_error_hva(unsigned long addr);
 int kvm_set_memory_region(struct kvm *kvm,
-- 
1.7.7.6


[PATCH v2 07/10] KVM: introduce KVM_ERR_PTR_BAD_PAGE

2012-08-03 Thread Xiao Guangrong

It is used to eliminate function call overhead and clean up
the code.
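
The pointer flavour of the same trick can be sketched as below. This is a userspace approximation of the kernel's ERR_PTR()/IS_ERR() helpers, written in lower case to make clear that it is not the real implementation; `struct page` is a placeholder type and `BAD_PAGE` is a hypothetical stand-in for KVM_ERR_PTR_BAD_PAGE.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

struct page { int dummy; };	/* placeholder type for this sketch */

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the last MAX_ERRNO addresses of the address
 * space, which no valid object can use. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

#define BAD_PAGE ((struct page *)err_ptr(-ENOENT))

static inline int is_error_page(const struct page *page)
{
	return is_err(page);
}
```

With a constant in place of a getter function, callers can assign or compare the bad page without an out-of-line call.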

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 include/linux/kvm_host.h |9 +++--
 virt/kvm/async_pf.c  |2 +-
 virt/kvm/kvm_main.c  |   13 +
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bdf2182..0aebe7a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -68,6 +68,13 @@ static inline int is_invalid_pfn(pfn_t pfn)
return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
 }

+#define KVM_ERR_PTR_BAD_PAGE   (ERR_PTR(-ENOENT))
+
+static inline int is_error_page(struct page *page)
+{
+   return IS_ERR(page);
+}
+
 /*
  * vcpu->requests bit members
  */
@@ -410,7 +417,6 @@ id_to_memslot(struct kvm_memslots *slots, int id)
return slot;
 }

-int is_error_page(struct page *page);
 int kvm_is_error_hva(unsigned long addr);
 int kvm_set_memory_region(struct kvm *kvm,
  struct kvm_userspace_memory_region *mem,
@@ -437,7 +443,6 @@ void kvm_arch_flush_shadow(struct kvm *kvm);
 int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
int nr_pages);

-struct page *get_bad_page(void);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
 void kvm_release_page_clean(struct page *page);
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 7972278..56f5533 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -203,7 +203,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
 	if (!work)
 		return -ENOMEM;

-	work->page = get_bad_page();
+	work->page = KVM_ERR_PTR_BAD_PAGE;
 	INIT_LIST_HEAD(&work->queue); /* for list_del to work */

 	spin_lock(&vcpu->async_pf.lock);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 08b600b..5873031 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -931,17 +931,6 @@ void kvm_disable_largepages(void)
 }
 EXPORT_SYMBOL_GPL(kvm_disable_largepages);

-int is_error_page(struct page *page)
-{
-   return IS_ERR(page);
-}
-EXPORT_SYMBOL_GPL(is_error_page);
-
-struct page *get_bad_page(void)
-{
-   return ERR_PTR(-ENOENT);
-}
-
 static inline unsigned long bad_hva(void)
 {
return PAGE_OFFSET;
@@ -1188,7 +1177,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn)
WARN_ON(kvm_is_mmio_pfn(pfn));

if (is_error_pfn(pfn) || kvm_is_mmio_pfn(pfn))
-   return get_bad_page();
+   return KVM_ERR_PTR_BAD_PAGE;

return pfn_to_page(pfn);
 }
-- 
1.7.7.6


[PATCH v2 08/10] KVM: do not release the error pfn

2012-08-03 Thread Xiao Guangrong

After commit a2766325cf9f9, the error pfn is replaced by an error
code, so it no longer needs to be released.

[ The patch has been compile-tested for powerpc ]
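
The resulting calling convention can be sketched as follows: an error pfn carries no reference, so the caller returns early without calling the release function. This is a simplified model; `refcount`, `get_pfn`, `release_pfn` and `touch_page` are hypothetical names, and the error range check is written out by hand.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t pfn_t;

#define NR_PAGES 8

static int refcount[NR_PAGES];	/* hypothetical per-page reference counts */

static int is_error_pfn(pfn_t pfn)
{
	return pfn >= (pfn_t)(-4095);
}

/* Take a reference and return the pfn, or an error code on failure. */
static pfn_t get_pfn(unsigned int gfn)
{
	if (gfn >= NR_PAGES)
		return (pfn_t)(-EFAULT);
	refcount[gfn]++;
	return gfn;
}

/* Drop a reference. Passing an error pfn here would be a bug. */
static void release_pfn(pfn_t pfn)
{
	assert(!is_error_pfn(pfn));
	refcount[pfn]--;
}

static int touch_page(unsigned int gfn)
{
	pfn_t pfn = get_pfn(gfn);

	if (is_error_pfn(pfn))
		return (int)(int64_t)pfn;	/* no reference held, nothing to release */

	/* ... use the page here ... */
	release_pfn(pfn);
	return 0;
}
```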

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/powerpc/kvm/e500_tlb.c |1 -
 arch/x86/kvm/mmu.c  |7 +++
 arch/x86/kvm/mmu_audit.c|4 +---
 arch/x86/kvm/paging_tmpl.h  |8 ++--
 virt/kvm/iommu.c|1 -
 virt/kvm/kvm_main.c |   14 --
 6 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index c8f6c58..09ce5ac 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -524,7 +524,6 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		if (is_error_pfn(pfn)) {
 			printk(KERN_ERR "Couldn't get real page for gfn %lx!\n",
 					(long)gfn);
-			kvm_release_pfn_clean(pfn);
 			return;
 		}

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 924b4e8..d68223f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2498,7 +2498,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
rmap_recycle(vcpu, sptep, gfn);
}
}
-   kvm_release_pfn_clean(pfn);
+
+   if (!is_error_pfn(pfn))
+   kvm_release_pfn_clean(pfn);
 }

 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
@@ -2650,7 +2652,6 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *

 static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 {
-	kvm_release_pfn_clean(pfn);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
 		return 0;
@@ -3275,8 +3276,6 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
 	if (!async)
 		return false; /* *pfn has correct page already */

-	kvm_release_pfn_clean(*pfn);
-
 	if (!prefault && can_do_async_pf(vcpu)) {
 		trace_kvm_try_async_get_page(gva, gfn);
 		if (kvm_find_async_pf_gfn(vcpu, gfn)) {
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 7d7d0b9..bac5fa4 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -116,10 +116,8 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
 	gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
 	pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);

-	if (is_error_pfn(pfn)) {
-		kvm_release_pfn_clean(pfn);
+	if (is_error_pfn(pfn))
 		return;
-	}

 	hpa =  pfn << PAGE_SHIFT;
 	if ((*sptep & PT64_BASE_ADDR_MASK) != hpa)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index bb7cf01..bf8c42b 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -370,10 +370,8 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	pgprintk("%s: gpte %llx spte %p\n", __func__, (u64)gpte, spte);
 	pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte, true);
 	pfn = gfn_to_pfn_atomic(vcpu->kvm, gpte_to_gfn(gpte));
-	if (mmu_invalid_pfn(pfn)) {
-		kvm_release_pfn_clean(pfn);
+	if (mmu_invalid_pfn(pfn))
 		return;
-	}

/*
 * we call mmu_set_spte() with host_writable = true because that
@@ -448,10 +446,8 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
 		gfn = gpte_to_gfn(gpte);
 		pfn = pte_prefetch_gfn_to_pfn(vcpu, gfn,
 				      pte_access & ACC_WRITE_MASK);
-		if (mmu_invalid_pfn(pfn)) {
-			kvm_release_pfn_clean(pfn);
+		if (mmu_invalid_pfn(pfn))
 			break;
-		}

 		mmu_set_spte(vcpu, spte, sp->role.access, pte_access, 0, 0,
 			     NULL, PT_PAGE_TABLE_LEVEL, gfn,
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 6a67bea..037cb67 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -107,7 +107,6 @@ int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
 */
pfn = kvm_pin_pages(slot, gfn, page_size);
if (is_error_pfn(pfn)) {
-   kvm_release_pfn_clean(pfn);
gfn += 1;
continue;
}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5873031..bd175f7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -102,9 +102,6 @@ static bool largepages_enabled = true;

 bool kvm_is_mmio_pfn(pfn_t pfn)
 {
-   if (is_error_pfn(pfn))
-   return false;
-
if (pfn_valid(pfn)) {

[PATCH v2 09/10] KVM: do not release the error page

2012-08-03 Thread Xiao Guangrong

After commit a2766325cf9f9, the error page is replaced by an
error code, so it no longer needs to be released.

[ The patch has been compile-tested for powerpc ]

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 arch/powerpc/kvm/44x_tlb.c   |1 -
 arch/powerpc/kvm/book3s_pr.c |4 +---
 arch/x86/kvm/svm.c   |1 -
 arch/x86/kvm/vmx.c   |5 ++---
 arch/x86/kvm/x86.c   |9 +++--
 include/linux/kvm_host.h |2 +-
 virt/kvm/async_pf.c  |4 ++--
 virt/kvm/kvm_main.c  |5 +++--
 8 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 33aa715..5dd3ab4 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -319,7 +319,6 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, 
gpa_t gpaddr,
if (is_error_page(new_page)) {
printk(KERN_ERR "Couldn't get guest page for gfn %llx!\n",
(unsigned long long)gfn);
-   kvm_release_page_clean(new_page);
return;
}
hpaddr = page_to_phys(new_page);
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index a1baec3..05c28f5 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -242,10 +242,8 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, 
struct kvmppc_pte *pte)
int i;

hpage = gfn_to_page(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
-   if (is_error_page(hpage)) {
-   kvm_release_page_clean(hpage);
+   if (is_error_page(hpage))
return;
-   }

hpage_offset = pte->raddr & ~PAGE_MASK;
hpage_offset &= ~0xFFFULL;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 687d0c3..31be4a5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2105,7 +2105,6 @@ static void *nested_svm_map(struct vcpu_svm *svm, u64 
gpa, struct page **_page)
return kmap(page);

 error:
-   kvm_release_page_clean(page);
kvm_inject_gp(&svm->vcpu, 0);

return NULL;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2300e53..4b6e809 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -596,10 +596,9 @@ static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu 
*vcpu)
 static struct page *nested_get_page(struct kvm_vcpu *vcpu, gpa_t addr)
 {
struct page *page = gfn_to_page(vcpu->kvm, addr >> PAGE_SHIFT);
-   if (is_error_page(page)) {
-   kvm_release_page_clean(page);
+   if (is_error_page(page))
return NULL;
-   }
+
return page;
 }

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6379e5..3c94d80 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1635,10 +1635,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 data)
vcpu->arch.time_page =
gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);

-   if (is_error_page(vcpu->arch.time_page)) {
-   kvm_release_page_clean(vcpu->arch.time_page);
+   if (is_error_page(vcpu->arch.time_page))
vcpu->arch.time_page = NULL;
-   }
+
break;
}
case MSR_KVM_ASYNC_PF_EN:
@@ -3941,10 +3940,8 @@ static int emulator_cmpxchg_emulated(struct 
x86_emulate_ctxt *ctxt,
goto emul_write;

page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
-   if (is_error_page(page)) {
-   kvm_release_page_clean(page);
+   if (is_error_page(page))
goto emul_write;
-   }

kaddr = kmap_atomic(page);
kaddr += offset_in_page(gpa);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0aebe7a..a8989fc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -458,7 +458,7 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
 pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
  bool *writable);
 pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
-void kvm_release_pfn_dirty(pfn_t);
+void kvm_release_pfn_dirty(pfn_t pfn);
 void kvm_release_pfn_clean(pfn_t pfn);
 void kvm_set_pfn_dirty(pfn_t pfn);
 void kvm_set_pfn_accessed(pfn_t pfn);
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 56f5533..ea475cd 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -111,7 +111,7 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu 
*vcpu)
list_entry(vcpu->async_pf.done.next,
   typeof(*work), link);
list_del(&work->link);
-   if (work->page)
+   if (!is_error_page(work->page))
kvm_release_page_clean(work->page);
kmem_cache_free(async_pf_cache, work);
}
@@ -138,7 +138,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
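
A note on the pattern this patch relies on: once a lookup reports failure through an encoded error value instead of a refcounted "bad page", the error path holds no reference and has nothing to release. The sketch below is only a userspace approximation of the kernel's ERR_PTR()/IS_ERR() idea. The names err_ptr, is_err, ptr_err, lookup_page, release_page and use_page are made up for illustration and are not KVM code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* errors are encoded in the top MAX_ERRNO addresses */

struct page { int refcount; };

/* Encode a negative errno as a pointer value (illustrative helper). */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }

/* True if the pointer value lies in the reserved error range. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno from an error pointer. */
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

static struct page the_page = { 0 };

/* Hypothetical lookup: takes a reference on success, returns an error code otherwise. */
static struct page *lookup_page(bool ok)
{
	if (!ok)
		return err_ptr(-ENOENT);
	the_page.refcount++;
	return &the_page;
}

static void release_page(struct page *p) { p->refcount--; }

/* Returns 0 on success or a negative errno; the error path releases nothing. */
static long use_page(bool ok)
{
	struct page *p = lookup_page(ok);

	if (is_err(p))
		return ptr_err(p); /* no reference was taken, so no release */
	release_page(p);
	return 0;
}
```

Calling use_page(false) returns -ENOENT and leaves the reference count untouched, which is the property the removed kvm_release_page_clean() calls no longer need to maintain.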

[PATCH v2 10/10] KVM: let the error pfn not depend on error code

2012-08-03 Thread Xiao Guangrong

Currently, we use the error code as the error pfn to indicate the error
condition. This is not straightforward, and it does not work on a PAE
32-bit CPU with huge memory, since a valid physical address
can be up to 52 bits wide.

For a normal pfn, the highest 12 bits should be zero, so we can
use these bits to indicate the error.

Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 include/linux/kvm_host.h |   24 +++-
 1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a8989fc..2aaff6e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -49,28 +49,34 @@
 #define KVM_MAX_MMIO_FRAGMENTS \
(KVM_MMIO_SIZE / KVM_USER_MMIO_SIZE + KVM_EXTRA_MMIO_FRAGMENTS)

-#define KVM_PFN_ERR_FAULT  (-EFAULT)
-#define KVM_PFN_ERR_HWPOISON   (-EHWPOISON)
-#define KVM_PFN_ERR_BAD        (-ENOENT)
+/*
+ * For the normal pfn, the highest 12 bits should be zero,
+ * so we can mask these bits to indicate the error.
+ */
+#define KVM_PFN_ERR_MASK   (0xfffULL << 52)
+
+#define KVM_PFN_ERR_FAULT  (KVM_PFN_ERR_MASK)
+#define KVM_PFN_ERR_HWPOISON   (KVM_PFN_ERR_MASK + 1)
+#define KVM_PFN_ERR_BAD        (KVM_PFN_ERR_MASK + 2)

-static inline int is_error_pfn(pfn_t pfn)
+static inline bool is_error_pfn(pfn_t pfn)
 {
-   return IS_ERR_VALUE(pfn);
+   return !!(pfn & KVM_PFN_ERR_MASK);
 }

-static inline int is_noslot_pfn(pfn_t pfn)
+static inline bool is_noslot_pfn(pfn_t pfn)
 {
-   return pfn == -ENOENT;
+   return pfn == KVM_PFN_ERR_BAD;
 }

-static inline int is_invalid_pfn(pfn_t pfn)
+static inline bool is_invalid_pfn(pfn_t pfn)
 {
return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
 }

 #define KVM_ERR_PTR_BAD_PAGE   (ERR_PTR(-ENOENT))

-static inline int is_error_page(struct page *page)
+static inline bool is_error_page(struct page *page)
 {
return IS_ERR(page);
 }
-- 
1.7.7.6
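
For readers who want to try the encoding outside the kernel, here is a minimal userspace sketch of the helpers from this patch. The pfn_t typedef and the MAX_VALID_PFN constant (which assumes 4 KiB pages and 52-bit physical addresses) are stand-ins added for the example; the macros and predicates follow the patch text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pfn_t; /* stand-in for the kernel's pfn_t */

/* The top 12 bits of a valid pfn are zero, so they can flag an error. */
#define KVM_PFN_ERR_MASK     (0xfffULL << 52)

#define KVM_PFN_ERR_FAULT    (KVM_PFN_ERR_MASK)
#define KVM_PFN_ERR_HWPOISON (KVM_PFN_ERR_MASK + 1)
#define KVM_PFN_ERR_BAD      (KVM_PFN_ERR_MASK + 2)

/* Assumption for the example: 52-bit physical addresses, PAGE_SHIFT of 12. */
#define MAX_VALID_PFN        ((1ULL << (52 - 12)) - 1)

static inline bool is_error_pfn(pfn_t pfn)
{
	return !!(pfn & KVM_PFN_ERR_MASK);
}

static inline bool is_noslot_pfn(pfn_t pfn)
{
	return pfn == KVM_PFN_ERR_BAD;
}

/* An error pfn that is not the "no memslot" marker. */
static inline bool is_invalid_pfn(pfn_t pfn)
{
	return !is_noslot_pfn(pfn) && is_error_pfn(pfn);
}
```

Because the check only looks at the mask bits, it does not depend on the width of unsigned long, which is the portability point the commit message makes.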

Re: [PATCH 8/9] KVM: do not release the error pfn

2012-08-03 Thread Xiao Guangrong

On 08/02/2012 09:14 PM, Marcelo Tosatti wrote:
 On Sun, Jul 29, 2012 at 04:18:58PM +0800, Xiao Guangrong wrote:
 After commit a2766325cf9f9, the error pfn is replaced by the
 error code, it need not be released anymore

 [ The patch is compiling tested for powerpc ]

 Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
 ---
  arch/powerpc/kvm/e500_tlb.c |1 -
  arch/x86/kvm/mmu.c  |6 +++---
  arch/x86/kvm/mmu_audit.c|4 +---
  arch/x86/kvm/paging_tmpl.h  |8 ++--
  virt/kvm/iommu.c|1 -
  virt/kvm/kvm_main.c |   14 --
  6 files changed, 14 insertions(+), 20 deletions(-)

 diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
 index c8f6c58..09ce5ac 100644
 --- a/arch/powerpc/kvm/e500_tlb.c
 +++ b/arch/powerpc/kvm/e500_tlb.c
 @@ -524,7 +524,6 @@ static inline void kvmppc_e500_shadow_map(struct 
 kvmppc_vcpu_e500 *vcpu_e500,
  if (is_error_pfn(pfn)) {
  printk(KERN_ERR Couldn't get real page for gfn %lx!\n,
  (long)gfn);
 -kvm_release_pfn_clean(pfn);
  return;
  }

 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index 320a781..949a5b8 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -2498,7 +2498,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 
 *sptep,
  rmap_recycle(vcpu, sptep, gfn);
  }
  }
 -kvm_release_pfn_clean(pfn);
 +
 +if (!is_error_pfn(pfn))
 +kvm_release_pfn_clean(pfn);
  }
 
 Can it ever be error_pfn? Seems a problem if so.
 

Yes, it can be the no-slot pfn; in that case we cache the MMIO access in the spte.

Re: [PATCH 02/10] nEPT: Add EPT tables support to paging_tmpl.h

2012-08-03 Thread Xiao Guangrong

On 08/03/2012 05:25 AM, Nadav Har'El wrote:
 On Thu, Aug 02, 2012, Xiao Guangrong wrote about Re: [PATCH 02/10] nEPT: Add 
 EPT tables support to paging_tmpl.h:
 +   #ifdef CONFIG_X86_64
 +   #define PT_MAX_FULL_LEVELS 4
 +   #define CMPXCHG cmpxchg
 +   #else
 +   #define CMPXCHG cmpxchg64
 +   #define PT_MAX_FULL_LEVELS 2
 +   #endif

 Missing the case of FULL_LEVELS == 3? Oh, you mentioned it
 as PAE case in the PATCH 0.
 
 I understood this differently (and it would not be surprising if
 wrongly...): With nested EPT, we only deal with two *EPT* tables -
 the shadowed page table and shadow page table are both EPT.
 And EPT tables cannot have three levels - even if PAE is used. Or at least,
 that's what I thought...
 
 Note A/D bits are supported on new Intel CPUs, so this function should be
 reworked for nEPT. I know you did not export this feature to the guest, but we
 can reduce the difference between nEPT and other MMU models if A/D are supported.
 
 I'm not sure what you meant: If the access/dirty bits are supported in
 newer cpus, do you think we *should* support them also in the processor
 L1 processor, or are you saying that it would be easier to support them
 because this is what the shadow page table code normally does anyway,
 so *not* supporting them will take effort?

I mean it would be easier to support them
 because this is what the shadow page table code normally does anyway,
 so *not* supporting them will take effort :)

Then we can drop the #ifndef PTTYPE_EPT...

Actually, we can redefine some bits (like PRESENT, WRITABLE, DIRTY...) to
let the paging_tmpl code work for all models.
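
To make the "redefine some bits" idea concrete, here is a rough standalone sketch. It is not how paging_tmpl.h is written (that file is instantiated through the preprocessor); struct pt_format and the helper names are invented for illustration. The bit positions are the architectural ones as I understand them: P=0, R/W=1, A=5, D=6 for ordinary x86 PTEs, and R/W/X=0..2, A=8, D=9 for EPT entries with A/D enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-format description of the bits a generic walker needs. */
struct pt_format {
	uint64_t present_mask;  /* entry is present if any of these bits is set */
	uint64_t writable_mask;
	uint64_t accessed_mask;
	uint64_t dirty_mask;
};

/* Ordinary x86 paging: P = bit 0, R/W = bit 1, A = bit 5, D = bit 6. */
static const struct pt_format fmt_x86 = {
	.present_mask  = 1ULL << 0,
	.writable_mask = 1ULL << 1,
	.accessed_mask = 1ULL << 5,
	.dirty_mask    = 1ULL << 6,
};

/* EPT: present if any of R/W/X (bits 0-2) is set, W = bit 1, A = bit 8, D = bit 9. */
static const struct pt_format fmt_ept = {
	.present_mask  = 0x7ULL,
	.writable_mask = 1ULL << 1,
	.accessed_mask = 1ULL << 8,
	.dirty_mask    = 1ULL << 9,
};

static inline bool pte_is_present(const struct pt_format *f, uint64_t pte)
{
	return (pte & f->present_mask) != 0;
}

static inline bool pte_is_writable(const struct pt_format *f, uint64_t pte)
{
	return (pte & f->writable_mask) != 0;
}

/* A write sets both accessed and dirty, whatever the format. */
static inline uint64_t pte_mark_dirty(const struct pt_format *f, uint64_t pte)
{
	return pte | f->accessed_mask | f->dirty_mask;
}
```

With the bits described per format, the walker logic itself has no EPT special case, which is the reduction in differences being suggested here.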

 
 +#if PTTYPE != PTTYPE_EPT
  static int FNAME(walk_addr_nested)(struct guest_walker *walker,
struct kvm_vcpu *vcpu, gva_t addr,
u32 access)
 @@ -335,6 +395,7 @@ static int FNAME(walk_addr_nested)(struc
 return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.nested_mmu,
 addr, access);
  }
 +#endif


 Hmm, you do not need the special walking functions?
 
 Since these functions are static, the compiler warns me on every
 function that is never used, so I had to #if them out...
 
 

IIUC, you did not implement the functions (like walk_addr_nested) to translate
L2's VA to L2's PA, yes? (it is needed for emulation.)

Re: [PATCH 4/9] KVM: define kvm_bad_pfn statically

2012-08-03 Thread Xiao Guangrong

Marcelo, Paul,

Thanks for your review!

On 08/03/2012 08:01 AM, Paul Mackerras wrote:
 On Thu, Aug 02, 2012 at 10:15:27AM -0300, Marcelo Tosatti wrote:
 
 Remind me what is the guarantee that -Exxx does not clash with
 a valid pfn number?
 
 A pfn number is an address >> PAGE_SHIFT, so it will have the top 12
 (at least) bits clear, whereas -Exxx will have the top bit set.
 

Yes.

As this approach is hard to understand and it breaks huge memory support
on 32-bit PAE CPUs, I have used a new approach in v2:

http://marc.info/?l=linux-kernel&m=134398012027025&w=2

Please review the new version.
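
As an illustration of the 32-bit concern, here is a small userspace sketch. It assumes the old check behaves like a comparison against (unsigned long)-MAX_ERRNO, and it models the 32-bit case with uint32_t; the function names are made up for the example and this is my reading of the problem rather than a quote from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO  4095
#define PAGE_SHIFT 12

typedef uint64_t pfn_t; /* pfn_t is 64 bits wide even on 32-bit hosts */

/* Old-style check where unsigned long is 64 bits wide. */
static inline bool is_error_pfn_ulong64(pfn_t pfn)
{
	return pfn >= (uint64_t)-MAX_ERRNO;
}

/* Same check where unsigned long is 32 bits: the threshold is only 0xfffff001. */
static inline bool is_error_pfn_ulong32(pfn_t pfn)
{
	return pfn >= (uint32_t)-MAX_ERRNO;
}

static inline pfn_t addr_to_pfn(uint64_t phys_addr)
{
	return phys_addr >> PAGE_SHIFT;
}
```

With a 64-bit threshold, every pfn derived from a 52-bit address stays below the error range. With the 32-bit threshold, a pfn from the top of a 52-bit address space compares as an error even though it is valid, which is the clash v2 avoids.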



[PATCH 1/2] pci-assign: Switch to pci_device_route_intx_to_irq interface

2012-08-03 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Drop pci_map_irq/piix_get_irq in favor of upstream's new interface. This
should also properly model disabling of the line at PCI host controller
level.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/device-assignment.c |   29 +++--
 hw/pc.h|4 
 hw/pci.c   |4 
 hw/pci.h   |2 --
 hw/piix_pci.c  |7 ---
 5 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index cc39958..d14c327 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -115,7 +115,7 @@ typedef struct AssignedDevice {
 AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1];
 PCIDevRegions real_device;
 int run;
-int girq;
+PCIINTxRoute intx_route;
 uint16_t h_segnr;
 uint8_t h_busnr;
 uint8_t h_devfn;
@@ -865,21 +865,24 @@ static int assign_device(AssignedDevice *dev)
 static int assign_irq(AssignedDevice *dev)
 {
 struct kvm_assigned_irq assigned_irq_data;
-int irq, r = 0;
+PCIINTxRoute intx_route;
+int r = 0;
 
 /* Interrupt PIN 0 means don't use INTx */
 if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0)
 return 0;
 
-irq = pci_map_irq(&dev->dev, dev->intpin);
-irq = piix_get_irq(irq);
+intx_route = pci_device_route_intx_to_irq(&dev->dev, 0);
+assert(intx_route.mode != PCI_INTX_INVERTED);
 
-if (dev->girq == irq)
+if (dev->intx_route.mode == intx_route.mode &&
+dev->intx_route.irq == intx_route.irq) {
 return r;
+}
 
 memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
 assigned_irq_data.assigned_dev_id = calc_assigned_dev_id(dev);
-assigned_irq_data.guest_irq = irq;
+assigned_irq_data.guest_irq = intx_route.irq;
 if (dev->irq_requested_type) {
 assigned_irq_data.flags = dev->irq_requested_type;
 r = kvm_deassign_irq(kvm_state, &assigned_irq_data);
@@ -889,6 +892,11 @@ static int assign_irq(AssignedDevice *dev)
 dev->irq_requested_type = 0;
 }
 
+if (intx_route.mode == PCI_INTX_DISABLED) {
+dev->intx_route = intx_route;
+return 0;
+}
+
 retry:
 assigned_irq_data.flags = KVM_DEV_IRQ_GUEST_INTX;
 if (dev->features & ASSIGNED_DEVICE_PREFER_MSI_MASK &&
@@ -917,7 +925,7 @@ retry:
 return r;
 }
 
-dev->girq = irq;
+dev->intx_route = intx_route;
 dev->irq_requested_type = assigned_irq_data.flags;
 return r;
 }
@@ -1029,7 +1037,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
 perror("assigned_dev_enable_msi: assign irq");
 }
 
-assigned_dev->girq = -1;
+assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
 assigned_dev->irq_requested_type = assigned_irq_data.flags;
 } else {
 assign_irq(assigned_dev);
@@ -1160,7 +1168,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
 return;
 }
 }
-assigned_dev->girq = -1;
+assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
 assigned_dev->irq_requested_type = assigned_irq_data.flags;
 } else {
 assign_irq(assigned_dev);
@@ -1784,7 +1792,8 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
 e_intx = dev->dev.config[0x3d] - 1;
 dev->intpin = e_intx;
 dev->run = 0;
-dev->girq = -1;
+dev->intx_route.mode = PCI_INTX_DISABLED;
+dev->intx_route.irq = -1;
 dev->h_segnr = dev->host.domain;
 dev->h_busnr = dev->host.bus;
 dev->h_devfn = PCI_DEVFN(dev->host.slot, dev->host.function);
diff --git a/hw/pc.h b/hw/pc.h
index a662090..5b36eb5 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -171,10 +171,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
*piix_devfn,
 extern PCIDevice *piix4_dev;
 int piix4_init(PCIBus *bus, ISABus **isa_bus, int devfn);
 
-int piix_get_irq(int pin);
-
-int ipf_map_irq(PCIDevice *pci_dev, int irq_num);
-
 /* vga.c */
 enum vga_retrace_method {
 VGA_RETRACE_DUMB,
diff --git a/hw/pci.c b/hw/pci.c
index 5ef3453..0b22913 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1089,10 +1089,6 @@ static void pci_set_irq(void *opaque, int irq_num, int 
level)
 pci_change_irq_level(pci_dev, irq_num, change);
 }
 
-int pci_map_irq(PCIDevice *pci_dev, int pin)
-{
-return pci_dev->bus->map_irq(pci_dev, pin);
-}
 /* Special hooks used by device assignment */
 void pci_bus_set_route_irq_fn(PCIBus *bus, pci_route_irq_fn route_intx_to_irq)
 {
diff --git a/hw/pci.h b/hw/pci.h
index c69af01..4b6ab3d 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -274,8 +274,6 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
   uint8_t attr, MemoryRegion *memory);
 pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num);
 
-int pci_map_irq(PCIDevice *pci_dev, int pin);
-
 int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
uint8_t offset, uint8_t size);
 
diff --git a/hw/piix_pci.c
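
The core of the change in assign_irq() is that the cached state becomes a (mode, irq) pair instead of a bare IRQ number, and a disabled route is recorded without programming anything. Below is a simplified standalone sketch of that logic. The types intx_mode, intx_route and assigned_dev, and the reprogram_count field, are stand-ins modelled on the names in the patch; they are not QEMU's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins modelled on the names used in the patch; not QEMU's types. */
enum intx_mode { INTX_ENABLED, INTX_INVERTED, INTX_DISABLED };

struct intx_route {
	enum intx_mode mode;
	int irq;
};

struct assigned_dev {
	struct intx_route route; /* last route we acted on */
	int reprogram_count;     /* how often the kernel side would be touched */
};

static bool route_equal(struct intx_route a, struct intx_route b)
{
	return a.mode == b.mode && a.irq == b.irq;
}

/* Returns true if the cached route changed. */
static bool update_route(struct assigned_dev *dev, struct intx_route new_route)
{
	if (route_equal(dev->route, new_route))
		return false;           /* same mode and irq: nothing to do */
	if (new_route.mode != INTX_DISABLED)
		dev->reprogram_count++; /* the real code would assign the IRQ here */
	dev->route = new_route;
	return true;
}
```

Comparing both fields matters because a line can be disabled and re-enabled on the same IRQ number, which a comparison of the number alone would miss.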

[PATCH 2/2] pci-assign: Use pci_device_set_intx_routing_notifier

2012-08-03 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Replace the hack in pci_default_write_config with upstream's generic
callback mechanism to get informed about changes on the PCI INTx
routing.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/Makefile.objs   |3 +--
 hw/device-assignment.c |   48 
 hw/device-assignment.h |   33 -
 hw/pc.h|3 ---
 hw/pci.c   |   11 ---
 hw/piix_pci.c  |3 ---
 6 files changed, 17 insertions(+), 84 deletions(-)
 delete mode 100644 hw/device-assignment.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 30f9ba6..fa8bb08 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -3,7 +3,7 @@ hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 hw-obj-y += fw_cfg.o
-hw-obj-$(CONFIG_PCI) += pci_bridge.o pci_bridge_dev.o
+hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
 hw-obj-$(CONFIG_PCI) += msix.o msi.o
 hw-obj-$(CONFIG_PCI) += shpc.o
 hw-obj-$(CONFIG_PCI) += slotid_cap.o
@@ -164,7 +164,6 @@ obj-$(CONFIG_SOFTMMU) += vhost_net.o
 obj-$(CONFIG_VHOST_NET) += vhost.o
 obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/
 obj-$(CONFIG_NO_PCI) += pci-stub.o
-obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_VGA) += vga.o
 obj-$(CONFIG_SOFTMMU) += device-hotplug.o
 obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index d14c327..7a90027 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -36,7 +36,6 @@
 #include "pc.h"
 #include "qemu-error.h"
 #include "console.h"
-#include "device-assignment.h"
 #include "loader.h"
 #include "monitor.h"
 #include "range.h"
@@ -143,6 +142,8 @@ typedef struct AssignedDevice {
 QLIST_ENTRY(AssignedDevice) next;
 } AssignedDevice;
 
+static void assigned_dev_update_irq_routing(PCIDevice *dev);
+
 static void assigned_dev_load_option_rom(AssignedDevice *dev);
 
 static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev);
@@ -869,8 +870,13 @@ static int assign_irq(AssignedDevice *dev)
 int r = 0;
 
 /* Interrupt PIN 0 means don't use INTx */
-if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0)
+if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0) {
+pci_device_set_intx_routing_notifier(&dev->dev, NULL);
 return 0;
+}
+
+pci_device_set_intx_routing_notifier(&dev->dev,
+ assigned_dev_update_irq_routing);
 
 intx_route = pci_device_route_intx_to_irq(&dev->dev, 0);
 assert(intx_route.mode != PCI_INTX_INVERTED);
@@ -944,43 +950,19 @@ static void deassign_device(AssignedDevice *dev)
 dev->dev.qdev.id, strerror(-r));
 }
 
-#if 0
-AssignedDevInfo *get_assigned_device(int pcibus, int slot)
-{
-AssignedDevice *assigned_dev = NULL;
-AssignedDevInfo *adev = NULL;
-
-QLIST_FOREACH(adev, &adev_head, next) {
-assigned_dev = adev->assigned_dev;
-if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
-PCI_SLOT(assigned_dev->dev.devfn) == slot)
-return adev;
-}
-
-return NULL;
-}
-#endif
-
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
-void assigned_dev_update_irqs(void)
+static void assigned_dev_update_irq_routing(PCIDevice *dev)
 {
-AssignedDevice *dev, *next;
+AssignedDevice *assigned_dev = DO_UPCAST(AssignedDevice, dev, dev);
 Error *err = NULL;
 int r;
 
-dev = QLIST_FIRST(&devs);
-while (dev) {
-next = QLIST_NEXT(dev, next);
-if (dev->irq_requested_type & KVM_DEV_IRQ_HOST_INTX) {
-r = assign_irq(dev);
-if (r < 0) {
-qdev_unplug(&dev->dev.qdev, &err);
-assert(!err);
-}
-}
-dev = next;
+r = assign_irq(assigned_dev);
+if (r < 0) {
+qdev_unplug(&dev->qdev, &err);
+assert(!err);
 }
 }
 
@@ -1009,6 +991,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
 perror("assigned_dev_update_msi: deassign irq");
 
 assigned_dev->irq_requested_type = 0;
+pci_device_set_intx_routing_notifier(pci_dev, NULL);
 }
 
 if (ctrl_byte & PCI_MSI_FLAGS_ENABLE) {
@@ -1151,6 +1134,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
 perror("assigned_dev_update_msix: deassign irq");
 
 assigned_dev->irq_requested_type = 0;
+pci_device_set_intx_routing_notifier(pci_dev, NULL);
 }
 
 if (ctrl_word & PCI_MSIX_FLAGS_ENABLE) {
diff --git a/hw/device-assignment.h b/hw/device-assignment.h
deleted file mode 100644
index 3fcb804..000
--- a/hw/device-assignment.h
+++ /dev/null
@@ -1,33 +0,0 @@
-/*
- * Copyright (c) 2007, Neocleus Corporation.
- * Copyright (c) 2007, Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of
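
The callback mechanism this patch switches to can be pictured with a small standalone sketch: the device registers a function pointer, the bus code calls it when routing changes, and passing NULL unregisters it (as the MSI and MSI-X paths do above). The names struct device, set_routing_notifier, routing_changed and on_routing_update are invented for the example and do not reflect QEMU's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

struct device;
typedef void (*routing_notifier)(struct device *dev);

struct device {
	routing_notifier notifier; /* NULL means nobody is listening */
	int route_irq;
	int updates_seen;
};

/* Register a callback, or pass NULL to stop receiving notifications. */
static void set_routing_notifier(struct device *dev, routing_notifier fn)
{
	dev->notifier = fn;
}

/* Called by the (hypothetical) bus code whenever the routing changes. */
static void routing_changed(struct device *dev, int new_irq)
{
	dev->route_irq = new_irq;
	if (dev->notifier)
		dev->notifier(dev);
}

/* Device-side callback; the real one would re-run assign_irq(). */
static void on_routing_update(struct device *dev)
{
	dev->updates_seen++;
}
```

Compared with the old approach of walking a global device list from the config-write path, only devices that asked for notification are touched, and the generic PCI code needs no knowledge of device assignment.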

[no subject]

2012-08-03 Thread Stefan Bader

Subject: Re: Nested kvm_intel broken on pre 3.3 hosts

 No, you're backporting the entire feature.  All we need is to expose
 RDPMC intercept to the guest.

Oh well, I thought that was the thing you asked for...

 It should be sufficient to backport the bits in
 nested_vmx_setup_ctls_msrs() and nested_vmx_exit_handled().

Ok, how about that? It is probably wrong again, but at least it
allows loading the kvm-intel module from within a nested guest.
Since the feature is not there, pretending that the instruction
fails seems the closest thing to do...

---

From 0aeb99348363b7aeb2b0bd92428cb212159fa468 Mon Sep 17 00:00:00 2001
From: Stefan Bader stefan.ba...@canonical.com
Date: Thu, 10 Nov 2011 14:57:25 +0200
Subject: [PATCH] KVM: VMX: Fake intercept RDPMC

Based on commit fee84b079d5ddee2247b5c1f53162c330c622902 upstream.

  Intercept RDPMC and forward it to the PMU emulation code.

But drop the requirement for the feature being present and instead
of forwarding, cause a GP as if the call had failed.

BugLink: http://bugs.launchpad.net/bugs/1031090
Signed-off-by: Stefan Bader stefan.ba...@canonical.com
---
 arch/x86/kvm/vmx.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7315488..fc937f2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1956,6 +1956,7 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 #endif
CPU_BASED_MOV_DR_EXITING | CPU_BASED_UNCOND_IO_EXITING |
CPU_BASED_USE_IO_BITMAPS | CPU_BASED_MONITOR_EXITING |
+   CPU_BASED_RDPMC_EXITING |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
/*
 * We can allow some features even when not supported by the
@@ -4613,6 +4614,14 @@ static int handle_invlpg(struct kvm_vcpu *vcpu)
return 1;
 }
 
+static int handle_rdpmc(struct kvm_vcpu *vcpu)
+{
+   /* Instead of implementing the feature, cause a GP */
+   kvm_complete_insn_gp(vcpu, 1);
+
+   return 1;
+}
+
 static int handle_wbinvd(struct kvm_vcpu *vcpu)
 {
skip_emulated_instruction(vcpu);
@@ -5563,6 +5572,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu 
*vcpu) = {
[EXIT_REASON_HLT] = handle_halt,
[EXIT_REASON_INVD]= handle_invd,
[EXIT_REASON_INVLPG]  = handle_invlpg,
+   [EXIT_REASON_RDPMC]   = handle_rdpmc,
[EXIT_REASON_VMCALL]  = handle_vmcall,
[EXIT_REASON_VMCLEAR] = handle_vmclear,
[EXIT_REASON_VMLAUNCH]= handle_vmlaunch,
-- 
1.7.9.5
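
The shape of the change, a new handler plus one entry in a table indexed by exit reason, can be sketched in isolation as follows. The struct vcpu fields, the enum values and dispatch_exit are hypothetical stand-ins; the real handler calls kvm_complete_insn_gp() rather than setting a flag, and the fallback for a missing entry here is just a choice made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the vcpu state touched by the handlers. */
struct vcpu { int pending_gp; int halted; };

enum exit_reason { EXIT_HLT, EXIT_RDPMC, EXIT_UNKNOWN, EXIT_MAX };

static int handle_halt(struct vcpu *v)  { v->halted = 1; return 1; }

/* Instead of emulating the instruction, queue a #GP for the guest. */
static int handle_rdpmc(struct vcpu *v) { v->pending_gp = 1; return 1; }

/* Designated initializers leave unlisted reasons as NULL. */
static int (*const exit_handlers[EXIT_MAX])(struct vcpu *) = {
	[EXIT_HLT]   = handle_halt,
	[EXIT_RDPMC] = handle_rdpmc,
};

/* Returns the handler's result, or 0 if no handler is registered. */
static int dispatch_exit(struct vcpu *v, enum exit_reason reason)
{
	if (reason < EXIT_MAX && exit_handlers[reason])
		return exit_handlers[reason](v);
	return 0;
}
```

Enabling the intercept bit without adding a table entry would leave that exit reason without a handler, which is why the patch adds both together.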

[PATCH v2] configure: Don't implicitly hardcode list of KVM architectures

2012-08-03 Thread Peter Maydell

The code creating the symlink from linux-headers/asm to the
architecture specific linux-headers/asm-$arch directory was
implicitly hardcoding a list of KVM supporting architectures.
Add a default case for the common Linux architecture name and
QEMU CPU name match case, so future architectures will only
need to add code if they've managed to get mismatched names.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
v1-v2 changes: conform to same indent rules as surrounding code

 configure |   14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 9f071b7..eafb81f 100755
--- a/configure
+++ b/configure
@@ -3485,15 +3485,23 @@ if test $linux = yes ; then
   mkdir -p linux-headers
   case $cpu in
   i386|x86_64)
-symlink $source_path/linux-headers/asm-x86 linux-headers/asm
+linux_arch=x86
 ;;
   ppcemb|ppc|ppc64)
-symlink $source_path/linux-headers/asm-powerpc linux-headers/asm
+linux_arch=powerpc
 ;;
   s390x)
-symlink $source_path/linux-headers/asm-s390 linux-headers/asm
+linux_arch=s390
+;;
+  *)
+# For most CPUs the kernel architecture name and QEMU CPU name match.
+linux_arch=$cpu
 ;;
   esac
+# For non-KVM architectures we will not have asm headers
+if [ -e $source_path/linux-headers/asm-$linux_arch ]; then
+  symlink $source_path/linux-headers/asm-$linux_arch linux-headers/asm
+fi
 fi
 
 for target in $target_list; do
-- 
1.7.9.5
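
The mapping that the reworked case statement implements is small enough to restate as a function. The sketch below transliterates it into C purely for illustration (configure itself is a shell script); linux_arch_for_cpu is a made-up name, and the existence check for the asm-$linux_arch directory is left out.

```c
#include <assert.h>
#include <string.h>

/* Map a QEMU CPU name to the kernel's architecture directory name. */
static const char *linux_arch_for_cpu(const char *cpu)
{
	if (!strcmp(cpu, "i386") || !strcmp(cpu, "x86_64"))
		return "x86";
	if (!strcmp(cpu, "ppcemb") || !strcmp(cpu, "ppc") || !strcmp(cpu, "ppc64"))
		return "powerpc";
	if (!strcmp(cpu, "s390x"))
		return "s390";
	/* Default case: the kernel name and the QEMU CPU name already match. */
	return cpu;
}
```

The default branch is what removes the implicit hardcoded list: an architecture whose names agree needs no new entry, and the symlink is only created if the matching header directory exists.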

Re: [PATCH 2/2] pci-assign: Use pci_device_set_intx_routing_notifier

2012-08-03 Thread Alex Williamson

On Fri, 2012-08-03 at 12:52 +0200, Jan Kiszka wrote:
 From: Jan Kiszka jan.kis...@siemens.com
 
 Replace the hack in pci_default_write_config with upstream's generic
 callback mechanism to get informed about changes on the PCI INTx
 routing.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/Makefile.objs   |3 +--
  hw/device-assignment.c |   48 
 
  hw/device-assignment.h |   33 -
  hw/pc.h|3 ---
  hw/pci.c   |   11 ---
  hw/piix_pci.c  |3 ---
  6 files changed, 17 insertions(+), 84 deletions(-)
  delete mode 100644 hw/device-assignment.h
 
 diff --git a/hw/Makefile.objs b/hw/Makefile.objs
 index 30f9ba6..fa8bb08 100644
 --- a/hw/Makefile.objs
 +++ b/hw/Makefile.objs
 @@ -3,7 +3,7 @@ hw-obj-y += loader.o
  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
  hw-obj-y += fw_cfg.o
 -hw-obj-$(CONFIG_PCI) += pci_bridge.o pci_bridge_dev.o
 +hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
  hw-obj-$(CONFIG_PCI) += msix.o msi.o
  hw-obj-$(CONFIG_PCI) += shpc.o
  hw-obj-$(CONFIG_PCI) += slotid_cap.o
 @@ -164,7 +164,6 @@ obj-$(CONFIG_SOFTMMU) += vhost_net.o
  obj-$(CONFIG_VHOST_NET) += vhost.o
  obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/
  obj-$(CONFIG_NO_PCI) += pci-stub.o
 -obj-$(CONFIG_PCI) += pci.o
  obj-$(CONFIG_VGA) += vga.o
  obj-$(CONFIG_SOFTMMU) += device-hotplug.o
  obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index d14c327..7a90027 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -36,7 +36,6 @@
  #include pc.h
  #include qemu-error.h
  #include console.h
 -#include device-assignment.h
  #include loader.h
  #include monitor.h
  #include range.h
 @@ -143,6 +142,8 @@ typedef struct AssignedDevice {
  QLIST_ENTRY(AssignedDevice) next;
  } AssignedDevice;
  
 +static void assigned_dev_update_irq_routing(PCIDevice *dev);
 +
  static void assigned_dev_load_option_rom(AssignedDevice *dev);
  
  static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev);
 @@ -869,8 +870,13 @@ static int assign_irq(AssignedDevice *dev)
  int r = 0;
  
  /* Interrupt PIN 0 means don't use INTx */
 -if (assigned_dev_pci_read_byte(dev-dev, PCI_INTERRUPT_PIN) == 0)
 +if (assigned_dev_pci_read_byte(dev-dev, PCI_INTERRUPT_PIN) == 0) {
 +pci_device_set_intx_routing_notifier(dev-dev, NULL);
  return 0;
 +}
 +
 +pci_device_set_intx_routing_notifier(dev-dev,
 + assigned_dev_update_irq_routing);
  
  intx_route = pci_device_route_intx_to_irq(dev-dev, 0);
  assert(intx_route.mode != PCI_INTX_INVERTED);
 @@ -944,43 +950,19 @@ static void deassign_device(AssignedDevice *dev)
  dev-dev.qdev.id, strerror(-r));
  }
  
 -#if 0
 -AssignedDevInfo *get_assigned_device(int pcibus, int slot)
 -{
 -AssignedDevice *assigned_dev = NULL;
 -AssignedDevInfo *adev = NULL;
 -
 -QLIST_FOREACH(adev, adev_head, next) {
 -assigned_dev = adev-assigned_dev;
 -if (pci_bus_num(assigned_dev-dev.bus) == pcibus 
 -PCI_SLOT(assigned_dev-dev.devfn) == slot)
 -return adev;
 -}
 -
 -return NULL;
 -}
 -#endif
 -
  /* The pci config space got updated. Check if irq numbers have changed
   * for our devices
   */
 -void assigned_dev_update_irqs(void)
 +static void assigned_dev_update_irq_routing(PCIDevice *dev)
  {
 -AssignedDevice *dev, *next;
 +AssignedDevice *assigned_dev = DO_UPCAST(AssignedDevice, dev, dev);
  Error *err = NULL;
  int r;
  
 -dev = QLIST_FIRST(devs);
 -while (dev) {
 -next = QLIST_NEXT(dev, next);
 -if (dev-irq_requested_type  KVM_DEV_IRQ_HOST_INTX) {
 -r = assign_irq(dev);
 -if (r  0) {
 -qdev_unplug(dev-dev.qdev, err);
 -assert(!err);
 -}
 -}
 -dev = next;
 +r = assign_irq(assigned_dev);
 +if (r  0) {
 +qdev_unplug(dev-qdev, err);
 +assert(!err);
  }
  }
  
 @@ -1009,6 +991,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
  perror("assigned_dev_update_msi: deassign irq");
  
  assigned_dev->irq_requested_type = 0;
 +pci_device_set_intx_routing_notifier(pci_dev, NULL);
  }
  
  if (ctrl_byte & PCI_MSI_FLAGS_ENABLE) {
 @@ -1151,6 +1134,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
  perror("assigned_dev_update_msix: deassign irq");
  
  assigned_dev->irq_requested_type = 0;
 +pci_device_set_intx_routing_notifier(pci_dev, NULL);
  }
  
  if (ctrl_word & PCI_MSIX_FLAGS_ENABLE) {
 diff --git a/hw/device-assignment.h b/hw/device-assignment.h
 deleted file mode 100644
 index 3fcb804..000
 --- a/hw/device-assignment.h
 +++ /dev/null
 @@ -1,33 +0,0 @@
 -/*
 - *

[PATCH v2 1/2] pci-assign: Switch to pci_device_route_intx_to_irq interface

2012-08-03 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Drop pci_map_irq/piix_get_irq in favor of upstream's new interface. This
should also properly model disabling of the line at PCI host controller
level.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---

Changes in v2:
 - always set intx_route.irq to -1 when disabling

 hw/device-assignment.c |   31 +--
 hw/pc.h|4 
 hw/pci.c   |4 
 hw/pci.h   |2 --
 hw/piix_pci.c  |7 ---
 5 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index cc39958..35fc604 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -115,7 +115,7 @@ typedef struct AssignedDevice {
 AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1];
 PCIDevRegions real_device;
 int run;
-int girq;
+PCIINTxRoute intx_route;
 uint16_t h_segnr;
 uint8_t h_busnr;
 uint8_t h_devfn;
@@ -865,21 +865,24 @@ static int assign_device(AssignedDevice *dev)
 static int assign_irq(AssignedDevice *dev)
 {
 struct kvm_assigned_irq assigned_irq_data;
-int irq, r = 0;
+PCIINTxRoute intx_route;
+int r = 0;
 
 /* Interrupt PIN 0 means don't use INTx */
 if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0)
 return 0;
 
-irq = pci_map_irq(&dev->dev, dev->intpin);
-irq = piix_get_irq(irq);
+intx_route = pci_device_route_intx_to_irq(&dev->dev, 0);
+assert(intx_route.mode != PCI_INTX_INVERTED);
 
-if (dev->girq == irq)
+if (dev->intx_route.mode == intx_route.mode &&
+dev->intx_route.irq == intx_route.irq) {
 return r;
+}
 
 memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
 assigned_irq_data.assigned_dev_id = calc_assigned_dev_id(dev);
-assigned_irq_data.guest_irq = irq;
+assigned_irq_data.guest_irq = intx_route.irq;
 if (dev->irq_requested_type) {
 assigned_irq_data.flags = dev->irq_requested_type;
 r = kvm_deassign_irq(kvm_state, &assigned_irq_data);
@@ -889,6 +892,11 @@ static int assign_irq(AssignedDevice *dev)
 dev->irq_requested_type = 0;
 }
 
+if (intx_route.mode == PCI_INTX_DISABLED) {
+dev->intx_route = intx_route;
+return 0;
+}
+
 retry:
 assigned_irq_data.flags = KVM_DEV_IRQ_GUEST_INTX;
 if (dev->features & ASSIGNED_DEVICE_PREFER_MSI_MASK &&
@@ -917,7 +925,7 @@ retry:
 return r;
 }
 
-dev->girq = irq;
+dev->intx_route = intx_route;
 dev->irq_requested_type = assigned_irq_data.flags;
 return r;
 }
@@ -1029,7 +1037,8 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
 perror("assigned_dev_enable_msi: assign irq");
 }
 
-assigned_dev->girq = -1;
+assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
+assigned_dev->intx_route.irq = -1;
 assigned_dev->irq_requested_type = assigned_irq_data.flags;
 } else {
 assign_irq(assigned_dev);
@@ -1160,7 +1169,8 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
 return;
 }
 }
-assigned_dev->girq = -1;
+assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
+assigned_dev->intx_route.irq = -1;
 assigned_dev->irq_requested_type = assigned_irq_data.flags;
 } else {
 assign_irq(assigned_dev);
@@ -1784,7 +1794,8 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
 e_intx = dev->dev.config[0x3d] - 1;
 dev->intpin = e_intx;
 dev->run = 0;
-dev->girq = -1;
+dev->intx_route.mode = PCI_INTX_DISABLED;
+dev->intx_route.irq = -1;
 dev->h_segnr = dev->host.domain;
 dev->h_busnr = dev->host.bus;
 dev->h_devfn = PCI_DEVFN(dev->host.slot, dev->host.function);
diff --git a/hw/pc.h b/hw/pc.h
index a662090..5b36eb5 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -171,10 +171,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
 extern PCIDevice *piix4_dev;
 int piix4_init(PCIBus *bus, ISABus **isa_bus, int devfn);
 
-int piix_get_irq(int pin);
-
-int ipf_map_irq(PCIDevice *pci_dev, int irq_num);
-
 /* vga.c */
 enum vga_retrace_method {
 VGA_RETRACE_DUMB,
diff --git a/hw/pci.c b/hw/pci.c
index 5ef3453..0b22913 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1089,10 +1089,6 @@ static void pci_set_irq(void *opaque, int irq_num, int level)
 pci_change_irq_level(pci_dev, irq_num, change);
 }
 
-int pci_map_irq(PCIDevice *pci_dev, int pin)
-{
-return pci_dev->bus->map_irq(pci_dev, pin);
-}
 /* Special hooks used by device assignment */
 void pci_bus_set_route_irq_fn(PCIBus *bus, pci_route_irq_fn route_intx_to_irq)
 {
diff --git a/hw/pci.h b/hw/pci.h
index c69af01..4b6ab3d 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -274,8 +274,6 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
   uint8_t attr, MemoryRegion *memory);
 pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num);
 
-int pci_map_irq(PCIDevice *pci_dev, int

Re: [PATCH 2/2] pci-assign: Use pci_device_set_intx_routing_notifier

2012-08-03 Thread Jan Kiszka

On 2012-08-03 17:35, Alex Williamson wrote:
 On Fri, 2012-08-03 at 12:52 +0200, Jan Kiszka wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Replace the hack in pci_default_write_config with upstream's generic
 callback mechanism to get informed about changes on the PCI INTx
 routing.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/Makefile.objs   |3 +--
  hw/device-assignment.c |   48 
 
  hw/device-assignment.h |   33 -
  hw/pc.h|3 ---
  hw/pci.c   |   11 ---
  hw/piix_pci.c  |3 ---
  6 files changed, 17 insertions(+), 84 deletions(-)
  delete mode 100644 hw/device-assignment.h

 diff --git a/hw/Makefile.objs b/hw/Makefile.objs
 index 30f9ba6..fa8bb08 100644
 --- a/hw/Makefile.objs
 +++ b/hw/Makefile.objs
 @@ -3,7 +3,7 @@ hw-obj-y += loader.o
  hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
  hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
  hw-obj-y += fw_cfg.o
 -hw-obj-$(CONFIG_PCI) += pci_bridge.o pci_bridge_dev.o
 +hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
  hw-obj-$(CONFIG_PCI) += msix.o msi.o
  hw-obj-$(CONFIG_PCI) += shpc.o
  hw-obj-$(CONFIG_PCI) += slotid_cap.o
 @@ -164,7 +164,6 @@ obj-$(CONFIG_SOFTMMU) += vhost_net.o
  obj-$(CONFIG_VHOST_NET) += vhost.o
  obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/
  obj-$(CONFIG_NO_PCI) += pci-stub.o
 -obj-$(CONFIG_PCI) += pci.o
  obj-$(CONFIG_VGA) += vga.o
  obj-$(CONFIG_SOFTMMU) += device-hotplug.o
  obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index d14c327..7a90027 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -36,7 +36,6 @@
  #include "pc.h"
  #include "qemu-error.h"
  #include "console.h"
 -#include "device-assignment.h"
  #include "loader.h"
  #include "monitor.h"
  #include "range.h"
 @@ -143,6 +142,8 @@ typedef struct AssignedDevice {
  QLIST_ENTRY(AssignedDevice) next;
  } AssignedDevice;
  
 +static void assigned_dev_update_irq_routing(PCIDevice *dev);
 +
  static void assigned_dev_load_option_rom(AssignedDevice *dev);
  
  static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev);
 @@ -869,8 +870,13 @@ static int assign_irq(AssignedDevice *dev)
  int r = 0;
  
  /* Interrupt PIN 0 means don't use INTx */
 -if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0)
 +if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0) {
 +pci_device_set_intx_routing_notifier(&dev->dev, NULL);
  return 0;
 +}
 +
 +pci_device_set_intx_routing_notifier(&dev->dev,
 + assigned_dev_update_irq_routing);
  
  intx_route = pci_device_route_intx_to_irq(&dev->dev, 0);
  assert(intx_route.mode != PCI_INTX_INVERTED);
 @@ -944,43 +950,19 @@ static void deassign_device(AssignedDevice *dev)
  dev->dev.qdev.id, strerror(-r));
  }
  
 -#if 0
 -AssignedDevInfo *get_assigned_device(int pcibus, int slot)
 -{
 -AssignedDevice *assigned_dev = NULL;
 -AssignedDevInfo *adev = NULL;
 -
 -QLIST_FOREACH(adev, &adev_head, next) {
 -assigned_dev = adev->assigned_dev;
 -if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
 -PCI_SLOT(assigned_dev->dev.devfn) == slot)
 -return adev;
 -}
 -
 -return NULL;
 -}
 -#endif
 -
  /* The pci config space got updated. Check if irq numbers have changed
   * for our devices
   */
 -void assigned_dev_update_irqs(void)
 +static void assigned_dev_update_irq_routing(PCIDevice *dev)
  {
 -AssignedDevice *dev, *next;
 +AssignedDevice *assigned_dev = DO_UPCAST(AssignedDevice, dev, dev);
  Error *err = NULL;
  int r;
  
 -dev = QLIST_FIRST(&devs);
 -while (dev) {
 -next = QLIST_NEXT(dev, next);
 -if (dev->irq_requested_type & KVM_DEV_IRQ_HOST_INTX) {
 -r = assign_irq(dev);
 -if (r < 0) {
 -qdev_unplug(&dev->dev.qdev, &err);
 -assert(!err);
 -}
 -}
 -dev = next;
 +r = assign_irq(assigned_dev);
 +if (r < 0) {
 +qdev_unplug(&dev->qdev, &err);
 +assert(!err);
  }
  }
  
 @@ -1009,6 +991,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
  perror("assigned_dev_update_msi: deassign irq");
  
  assigned_dev->irq_requested_type = 0;
 +pci_device_set_intx_routing_notifier(pci_dev, NULL);
  }
  
  if (ctrl_byte & PCI_MSI_FLAGS_ENABLE) {
 @@ -1151,6 +1134,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
  perror("assigned_dev_update_msix: deassign irq");
  
  assigned_dev->irq_requested_type = 0;
 +pci_device_set_intx_routing_notifier(pci_dev, NULL);
  }
  
  if (ctrl_word & PCI_MSIX_FLAGS_ENABLE) {
 diff --git a/hw/device-assignment.h b/hw/device-assignment.h
 deleted file mode 100644
 index 3fcb804..000
 ---

Re: [PATCH v2 1/2] pci-assign: Switch to pci_device_route_intx_to_irq interface

2012-08-03 Thread Alex Williamson

On Fri, 2012-08-03 at 19:02 +0200, Jan Kiszka wrote:
 From: Jan Kiszka jan.kis...@siemens.com
 
 Drop pci_map_irq/piix_get_irq in favor of upstream's new interface. This
 should also properly model disabling of the line at PCI host controller
 level.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
 
 Changes in v2:
  - always set intx_route.irq to -1 when disabling
 
  hw/device-assignment.c |   31 +--
  hw/pc.h|4 
  hw/pci.c   |4 
  hw/pci.h   |2 --
  hw/piix_pci.c  |7 ---
  5 files changed, 21 insertions(+), 27 deletions(-)
 
 diff --git a/hw/device-assignment.c b/hw/device-assignment.c
 index cc39958..35fc604 100644
 --- a/hw/device-assignment.c
 +++ b/hw/device-assignment.c
 @@ -115,7 +115,7 @@ typedef struct AssignedDevice {
  AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1];
  PCIDevRegions real_device;
  int run;
 -int girq;
 +PCIINTxRoute intx_route;
  uint16_t h_segnr;
  uint8_t h_busnr;
  uint8_t h_devfn;
 @@ -865,21 +865,24 @@ static int assign_device(AssignedDevice *dev)
  static int assign_irq(AssignedDevice *dev)
  {
  struct kvm_assigned_irq assigned_irq_data;
 -int irq, r = 0;
 +PCIINTxRoute intx_route;
 +int r = 0;
  
  /* Interrupt PIN 0 means don't use INTx */
  if (assigned_dev_pci_read_byte(&dev->dev, PCI_INTERRUPT_PIN) == 0)
  return 0;
  
 -irq = pci_map_irq(&dev->dev, dev->intpin);
 -irq = piix_get_irq(irq);
 +intx_route = pci_device_route_intx_to_irq(&dev->dev, 0);
 +assert(intx_route.mode != PCI_INTX_INVERTED);
  
 -if (dev->girq == irq)
 +if (dev->intx_route.mode == intx_route.mode &&
 +dev->intx_route.irq == intx_route.irq) {
  return r;
 +}
  
  memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
  assigned_irq_data.assigned_dev_id = calc_assigned_dev_id(dev);
 -assigned_irq_data.guest_irq = irq;
 +assigned_irq_data.guest_irq = intx_route.irq;
  if (dev->irq_requested_type) {
  assigned_irq_data.flags = dev->irq_requested_type;
  r = kvm_deassign_irq(kvm_state, &assigned_irq_data);
 @@ -889,6 +892,11 @@ static int assign_irq(AssignedDevice *dev)
  dev->irq_requested_type = 0;
  }
  
 +if (intx_route.mode == PCI_INTX_DISABLED) {
 +dev->intx_route = intx_route;
 +return 0;
 +}
 +
  retry:
  assigned_irq_data.flags = KVM_DEV_IRQ_GUEST_INTX;
  if (dev->features & ASSIGNED_DEVICE_PREFER_MSI_MASK &&
 @@ -917,7 +925,7 @@ retry:
  return r;
  }
  
 -dev->girq = irq;
 +dev->intx_route = intx_route;
  dev->irq_requested_type = assigned_irq_data.flags;
  return r;
  }
 @@ -1029,7 +1037,8 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
  perror("assigned_dev_enable_msi: assign irq");
  }
  
 -assigned_dev->girq = -1;
 +assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
 +assigned_dev->intx_route.irq = -1;
  assigned_dev->irq_requested_type = assigned_irq_data.flags;
  } else {
  assign_irq(assigned_dev);
 @@ -1160,7 +1169,8 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
  return;
  }
  }
 -assigned_dev->girq = -1;
 +assigned_dev->intx_route.mode = PCI_INTX_DISABLED;
 +assigned_dev->intx_route.irq = -1;
  assigned_dev->irq_requested_type = assigned_irq_data.flags;
  } else {
  assign_irq(assigned_dev);
 @@ -1784,7 +1794,8 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
  e_intx = dev->dev.config[0x3d] - 1;
  dev->intpin = e_intx;
  dev->run = 0;
 -dev->girq = -1;
 +dev->intx_route.mode = PCI_INTX_DISABLED;
 +dev->intx_route.irq = -1;
  dev->h_segnr = dev->host.domain;
  dev->h_busnr = dev->host.bus;
  dev->h_devfn = PCI_DEVFN(dev->host.slot, dev->host.function);
 diff --git a/hw/pc.h b/hw/pc.h
 index a662090..5b36eb5 100644
 --- a/hw/pc.h
 +++ b/hw/pc.h
 @@ -171,10 +171,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
  extern PCIDevice *piix4_dev;
  int piix4_init(PCIBus *bus, ISABus **isa_bus, int devfn);
  
 -int piix_get_irq(int pin);
 -
 -int ipf_map_irq(PCIDevice *pci_dev, int irq_num);
 -
  /* vga.c */
  enum vga_retrace_method {
  VGA_RETRACE_DUMB,
 diff --git a/hw/pci.c b/hw/pci.c
 index 5ef3453..0b22913 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -1089,10 +1089,6 @@ static void pci_set_irq(void *opaque, int irq_num, int level)
  pci_change_irq_level(pci_dev, irq_num, change);
  }
  
 -int pci_map_irq(PCIDevice *pci_dev, int pin)
 -{
 -return pci_dev->bus->map_irq(pci_dev, pin);
 -}
  /* Special hooks used by device assignment */
  void pci_bus_set_route_irq_fn(PCIBus *bus, pci_route_irq_fn route_intx_to_irq)
  {
 diff --git a/hw/pci.h b/hw/pci.h
 index c69af01..4b6ab3d 100644
 --- a/hw/pci.h
 +++ b/hw/pci.h
 @@ -274,8 +274,6 @@ void

Re: [PATCH] update KVM_SAVE_MSRS_BEGIN to correct value

2012-08-03 Thread Marcelo Tosatti

On Wed, Aug 01, 2012 at 05:01:42PM +0300, Gleb Natapov wrote:
 When MSR_KVM_PV_EOI_EN was added to msrs_to_save array
 KVM_SAVE_MSRS_BEGIN was not updated accordingly.
 
 Signed-off-by: Gleb Natapov g...@redhat.com

Applied, thanks.


[GIT PULL] KVM fixes for 3.6-rc2

2012-08-03 Thread Marcelo Tosatti


Linus,

Please pull from

   git://git.kernel.org/pub/scm/virt/kvm/kvm.git master

To receive the following KVM _bug fixes_:

- Fix DS/ES segment register corruption on x86_32.
- Fix kvmclock wallclock migration offset.
- Fix PIT interrupt ACK vs system reset logic bug.



Avi Kivity (1):
  KVM: VMX: Fix ds/es corruption on i386 with preemption

Bruce Rogers (1):
  KVM: x86: apply kvmclock offset to guest wall clock time

Gleb Natapov (1):
  KVM: PIC: call ack notifiers for irqs that are dropped form irr


 arch/x86/kvm/i8259.c |   17 +
 arch/x86/kvm/vmx.c   |   20 +---
 arch/x86/kvm/x86.c   |4 
 3 files changed, 34 insertions(+), 7 deletions(-)

Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 02:41:13PM -0300, Marcelo Tosatti wrote:
 On Fri, Aug 03, 2012 at 06:20:18AM +1000, Benjamin Herrenschmidt wrote:
  On Thu, 2012-08-02 at 15:35 +0300, Avi Kivity wrote:
   This is actually documented in api.txt, though not in relation to
   reset:
   
   NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the
   corresponding operations are complete (and guest state is consistent)
   only after userspace has re-entered the kernel with KVM_RUN.  The
   kernel side will first finish incomplete operations and then check
   for pending signals.  Userspace can re-enter the guest with an
   unmasked signal pending to complete pending operations.
   
   For x86 the issue was with live migration - you can't copy guest
   register state in the middle of an I/O operation.  Reset is actually
   similar, but it involves writing state (which can then be overwritten)
   instead of reading it.
  
  Hrm, except that doing KVM_RUN with a signal is very cumbersome to do
  and I couldn't quite find the logic in qemu to do it ... but I might
  just have missed it. I can see indeed that in the migration case you
  want to actually complete the operation rather than just abort it.
  
  Any chance you can point me to the code that performs that trick qemu
  side for migration ?
 
 kvm-all.c:
 
 kvm_arch_pre_run(env, run);
 if (env->exit_request) {
 DPRINTF("interrupt exit requested\n");
 /*
  * KVM requires us to reenter the kernel after IO exits to complete
  * instruction emulation. This self-signal will ensure that we
  * leave ASAP again.
  */
 qemu_cpu_kick_self();
 }


See kvm_arch_process_async_events() call to qemu_system_reset_request()
in target-i386/kvm.c.

The whole thing is fragile, though: we rely on the order events
are processed inside KVM_RUN, in x86:

1) If there is pending MMIO, process it.
2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
there is a signal pending.

That way, the vcpu will not process the stop event from the main loop
(ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.
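
To make the ordering concrete, here is a toy userspace model in C. It is only a sketch, not kernel code: `struct toy_run_vcpu`, its fields and `toy_kvm_run()` are made-up names, and the real KVM_RUN path does far more. It mimics only the two steps listed above: a pending MMIO operation is completed first, and only then does a pending signal cause an early return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-vcpu state; not the real struct kvm_vcpu. */
struct toy_run_vcpu {
    bool mmio_pending;    /* an MMIO exit was taken and not yet completed */
    bool signal_pending;  /* e.g. the self-signal sent before re-entry */
    int  mmio_completed;  /* number of MMIO operations finished so far */
};

/* Toy KVM_RUN: returns -EINTR when interrupted by a signal, 0 otherwise. */
static int toy_kvm_run(struct toy_run_vcpu *v)
{
    assert(v != NULL);

    /* 1) Finish the incomplete operation before anything else. */
    if (v->mmio_pending) {
        v->mmio_pending = false;
        v->mmio_completed++;
    }

    /* 2) Only now look at signals. */
    if (v->signal_pending) {
        v->signal_pending = false;
        return -EINTR;
    }

    /* The guest would be entered here. */
    return 0;
}
```

With both `mmio_pending` and `signal_pending` set, a single call completes the MMIO operation and still returns -EINTR, which is the property the stop event relies on.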

  Anthony seems to think that for reset we can just abort the operation
  state in the kernel when the MP state changes.
  
  Cheers,
  Ben.

Re: [PATCH v3 4/8] KVM-HV: Add VCPU running/pre-empted state for guest

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 11:25:44AM +0530, Nikunj A Dadhania wrote:
 On Thu, 2 Aug 2012 16:56:28 -0300, Marcelo Tosatti mtosa...@redhat.com 
 wrote:

   + case MSR_KVM_VCPU_STATE:
   + vcpu->arch.v_state.vs_page = gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
   + vcpu->arch.v_state.vs_offset = data & ~(PAGE_MASK | KVM_MSR_ENABLED);
  
  Assign vs_offset after success.
  
   +
   + if (is_error_page(vcpu->arch.v_state.vs_page)) {
   + kvm_release_page_clean(vcpu->arch.time_page);
   + vcpu->arch.v_state.vs_page = NULL;
   + pr_info("KVM: VCPU_STATE - Unable to pin the page\n");
  
  Missing break or return;
  
   + }
   + vcpu->arch.v_state.msr_val = data;
   + break;
   +
 case MSR_IA32_MCG_CTL:
  
  Please verify this code carefully again.
  
  Also leaking the page reference.
  
 vcpu->arch.apf.msr_val = 0;
 vcpu->arch.st.msr_val = 0;
   + vcpu->arch.v_state.msr_val = 0;
  
  Add a newline and comment (or even better a new helper).

 kvmclock_reset(vcpu);
  
 
 How about something like the below. I have tried to look at time_page
 for reference:
 
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 580abcf..c82cc12 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -1604,6 +1604,16 @@ static void kvm_clear_vcpu_state(struct kvm_vcpu *vcpu)
   kunmap_atomic(kaddr);
  }
  
 +static void kvm_vcpu_state_reset(struct kvm_vcpu *vcpu)
 +{
 + vcpu->arch.v_state.msr_val = 0;
 + vcpu->arch.v_state.vs_offset = 0;
 + if (vcpu->arch.v_state.vs_page) {
 + kvm_release_page_dirty(vcpu->arch.v_state.vs_page);
 + vcpu->arch.v_state.vs_page = NULL;
 + }
 +}
 +
  int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
  {
   bool pr = false;
 @@ -1724,14 +1734,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
   break;
  
   case MSR_KVM_VCPU_STATE:
 + kvm_vcpu_state_reset(vcpu);
 +
   vcpu->arch.v_state.vs_page = gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
 - vcpu->arch.v_state.vs_offset = data & ~(PAGE_MASK | KVM_MSR_ENABLED);

Should also fail if its not enabled (KVM_MSR_ENABLED bit).

What is the point of having non-NULL vs_page pointer if KVM_MSR_ENABLED
bit is not set?

The rest is fine, thanks.
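
One way to read the request above is to decode the MSR value first and touch the page only when the enable bit is set. The following is a userspace sketch under stated assumptions, not the patch itself: it assumes 4 KiB pages and that the enable flag is bit 0 (`TOY_MSR_ENABLED` stands in for `KVM_MSR_ENABLED`), and `struct toy_vcpu_state_msr` and `toy_decode_vcpu_state_msr()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12
#define TOY_PAGE_SIZE   (UINT64_C(1) << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK   (~(TOY_PAGE_SIZE - 1))
#define TOY_MSR_ENABLED UINT64_C(1)   /* assumed: enable flag is bit 0 */

/* Decoded form of the MSR value (hypothetical helper type). */
struct toy_vcpu_state_msr {
    uint64_t gfn;     /* guest frame number of the shared page */
    uint32_t offset;  /* offset of the state area inside that page */
};

/*
 * Returns false and leaves *out untouched when the enable bit is clear,
 * so the caller never ends up with a pinned page for a disabled MSR.
 */
static bool toy_decode_vcpu_state_msr(uint64_t data,
                                      struct toy_vcpu_state_msr *out)
{
    assert(out != NULL);

    if (!(data & TOY_MSR_ENABLED))
        return false;

    out->gfn = data >> TOY_PAGE_SHIFT;
    out->offset = (uint32_t)(data & ~(TOY_PAGE_MASK | TOY_MSR_ENABLED));
    return true;
}
```

In the real handler the page lookup and the assignment of `vs_offset` would then happen only after this check succeeds.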


Re: [PATCH v3 6/8] KVM-HV: Add flush_on_enter before guest enter

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 11:07:13AM +0530, Nikunj A Dadhania wrote:
 On Thu, 2 Aug 2012 17:16:41 -0300, Marcelo Tosatti mtosa...@redhat.com 
 wrote:
  On Thu, Aug 02, 2012 at 05:14:02PM -0300, Marcelo Tosatti wrote:
   On Tue, Jul 31, 2012 at 04:19:02PM +0530, Nikunj A. Dadhania wrote:
From: Nikunj A. Dadhania nik...@linux.vnet.ibm.com
 
 static void kvm_set_vcpu_state(struct kvm_vcpu *vcpu)
 {
@@ -1584,7 +1573,8 @@ static void kvm_set_vcpu_state(struct kvm_vcpu *vcpu)
kaddr = kmap_atomic(vcpu->arch.v_state.vs_page);
kaddr += vcpu->arch.v_state.vs_offset;
vs = kaddr;
-   kvm_set_atomic(&vs->state, 0, 1 << KVM_VCPU_STATE_IN_GUEST_MODE);
+   if (xchg(&vs->state, VS_IN_GUEST) == VS_SHOULD_FLUSH)
+   kvm_x86_ops->tlb_flush(vcpu);
kunmap_atomic(kaddr);
 }
 
@@ -1600,7 +1590,8 @@ static void kvm_clear_vcpu_state(struct kvm_vcpu *vcpu)
kaddr = kmap_atomic(vcpu->arch.v_state.vs_page);
kaddr += vcpu->arch.v_state.vs_offset;
vs = kaddr;
-   kvm_set_atomic(&vs->state, 1 << KVM_VCPU_STATE_IN_GUEST_MODE, 0);
+   if (xchg(&vs->state, VS_NOT_IN_GUEST) == VS_SHOULD_FLUSH)
+   kvm_x86_ops->tlb_flush(vcpu);
kunmap_atomic(kaddr);
 }
   
   Nevermind the early comment (the other comments on that message are
   valid).
 I assume the above is related to kvm_set_atomic comment in the [3/8]

Yes.

  Ah, so the pseudocode mentions flush-on-exit because we can be clearing 
  the flag on xchg. Setting KVM_REQ_TLB_FLUSH instead should be enough
  (please verify).
  
 Yes, that will work while exiting. 
 
 In the vm_enter case, we need to do a kvm_x86_ops->tlb_flush(vcpu), as
 we have already passed the phase of checking the request.

Yes.
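
The enter/exit asymmetry agreed on above can be sketched in portable C11, with `atomic_exchange()` standing in for the kernel's `xchg()`. Everything here is illustrative: `struct toy_flush_vcpu`, the `TOY_*` states and the three helpers are hypothetical, and `flush_requested` only models a deferred request such as KVM_REQ_TLB_FLUSH.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_vs { TOY_NOT_IN_GUEST, TOY_IN_GUEST, TOY_SHOULD_FLUSH };

struct toy_flush_vcpu {
    _Atomic int state;     /* shared with whoever asks for a remote flush */
    bool flush_requested;  /* models a deferred flush request */
    int  flushes;          /* counts flushes actually performed */
};

/* Guest entry: requests were already checked, so flush right away. */
static void toy_enter_guest(struct toy_flush_vcpu *v)
{
    assert(v != NULL);
    if (atomic_exchange(&v->state, TOY_IN_GUEST) == TOY_SHOULD_FLUSH)
        v->flushes++;
}

/* Guest exit: deferring to the request mechanism is enough. */
static void toy_exit_guest(struct toy_flush_vcpu *v)
{
    assert(v != NULL);
    if (atomic_exchange(&v->state, TOY_NOT_IN_GUEST) == TOY_SHOULD_FLUSH)
        v->flush_requested = true;
}

/* Runs before the next entry, like the normal request check. */
static void toy_process_requests(struct toy_flush_vcpu *v)
{
    assert(v != NULL);
    if (v->flush_requested) {
        v->flush_requested = false;
        v->flushes++;
    }
}
```

The exchange both publishes the new state and reports whether a flush was asked for in the meantime, so no flush request can be lost between the read and the write.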


Re: [PATCH v3] kvm: Check if smp_cpus exceeds max cpus supported by kvm

2012-08-03 Thread Marcelo Tosatti

On Tue, Jul 31, 2012 at 07:18:17PM +0800, riegama...@gmail.com wrote:
 From: Dunrong Huang riegama...@gmail.com
 
 Add a helper function for fetching max cpus supported by kvm.
 
 Make QEMU exit with an error message if smp_cpus exceeds limit
 of VCPU count retrieved by invoking this helper function.
 
 Signed-off-by: Dunrong Huang riegama...@gmail.com
 ---
 v1 -> v2:
   * Fix indentation (thanks to Stefan Hajnoczi for his review)

 v2 -> v3 (thanks to Peter Maydell for this hint):
   * Add a comment for the helper function
   * Use goto err instead of exit(1)

  kvm-all.c |   29 +
  1 files changed, 29 insertions(+), 0 deletions(-)

Applied to uq/master, thanks.


Re: [PATCH v2] configure: Don't implicitly hardcode list of KVM architectures

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 01:51:25PM +0100, Peter Maydell wrote:
 The code creating the symlink from linux-headers/asm to the
 architecture specific linux-headers/asm-$arch directory was
 implicitly hardcoding a list of KVM supporting architectures.
 Add a default case for the common Linux architecture name and
 QEMU CPU name match case, so future architectures will only
 need to add code if they've managed to get mismatched names.
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 v1->v2 changes: conform to same indent rules as surrounding code

Applied to uq/master, thanks.


Re: [PATCH v2 01/10] KVM: iommu: fix releasing unmapped page

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 03:36:52PM +0800, Xiao Guangrong wrote:
 There are two bugs:
 - the 'error page' is never released
   [ it is unneeded after commit a2766325cf9f9, for backport, we
     still do kvm_release_pfn_clean for the error pfn ]
 
 - guest pages are always released regardless of the unmapped page
   (e.g. caused by hwpoison)
 
 Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com

Looks good.


KVM: x86: fix pvclock guest stopped flag reporting (v2)

2012-08-03 Thread Marcelo Tosatti


kvm_guest_time_update unconditionally clears hv_clock.flags field,
so the notification never reaches the guest.

Fix it by allowing PVCLOCK_GUEST_STOPPED to pass through.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48e7131..e775547 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -420,6 +420,8 @@ struct kvm_vcpu_arch {
unsigned int hw_tsc_khz;
unsigned int time_offset;
struct page *time_page;
+   /* set guest stopped flag in pvclock flags field */
+   bool pvclock_set_guest_stopped_request;
 
struct {
u64 msr_val;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a87c82a..ddcf8ad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1136,6 +1136,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
unsigned long this_tsc_khz;
s64 kernel_ns, max_kernel_ns;
u64 tsc_timestamp;
+   u8 pvclock_flags;
 
/* Keep irq disabled to prevent changes to the clock */
local_irq_save(flags);
@@ -1217,7 +1218,14 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
vcpu->last_kernel_ns = kernel_ns;
vcpu->last_guest_tsc = tsc_timestamp;
-   vcpu->hv_clock.flags = 0;
+
+   pvclock_flags = 0;
+   if (vcpu->pvclock_set_guest_stopped_request) {
+   pvclock_flags |= PVCLOCK_GUEST_STOPPED;
+   vcpu->pvclock_set_guest_stopped_request = false;
+   }
+
+   vcpu->hv_clock.flags = pvclock_flags;
 
/*
 * The interface expects us to write an even number signaling that the
@@ -2628,10 +2636,9 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
  */
 static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 {
-   struct pvclock_vcpu_time_info *src = &vcpu->arch.hv_clock;
if (!vcpu->arch.time_page)
return -EINVAL;
-   src->flags |= PVCLOCK_GUEST_STOPPED;
+   vcpu->arch.pvclock_set_guest_stopped_request = true;
kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
return 0;
 }

Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-03 Thread Benjamin Herrenschmidt

On Fri, 2012-08-03 at 14:41 -0300, Marcelo Tosatti wrote:

  Hrm, except that doing KVM_RUN with a signal is very cumbersome to do
  and I couldn't quite find the logic in qemu to do it ... but I might
  just have missed it. I can see indeed that in the migration case you
  want to actually complete the operation rather than just abort it.
  
  Any chance you can point me to the code that performs that trick qemu
  side for migration ?
 
 kvm-all.c:
 
 kvm_arch_pre_run(env, run);
 if (env->exit_request) {
 DPRINTF("interrupt exit requested\n");
 /*
  * KVM requires us to reenter the kernel after IO exits to complete
  * instruction emulation. This self-signal will ensure that we
  * leave ASAP again.
  */
 qemu_cpu_kick_self();
 }
 
 
  Anthony seems to think that for reset we can just abort the operation
  state in the kernel when the MP state changes.

Ok, I see now, this is scary really... set a flag somewhere, pray that
we are in the right part of the loop on elsewhere... oh well.

In fact there's some fun (though harmless) bits to be found, look at
x86 for example:

if (env->exception_injected == EXCP08_DBLE) {
/* this means triple fault */
qemu_system_reset_request();
env->exit_request = 1;
return 0;
}

Well, qemu_system_reset_request() calls cpu_stop_current() which calls
cpu_exit() which also does:

   env->exit_request = 1;
 
So obviously it will be well set by that time :-)

Now I can see how having it set during kvm_arch_process_async_events()
will work as this is called right before the run loop. Similarly,
EXIT_MMIO and EXIT_IO would work so they are a non issue.

Our problem I suspect with PAPR doing resets via hcalls is that
our kvm_arch_handle_exit() does:

case KVM_EXIT_PAPR_HCALL:
dprintf("handle PAPR hypercall\n");
run->papr_hcall.ret = spapr_hypercall(env, run->papr_hcall.nr,
  run->papr_hcall.args);
ret = 1;
break;

See the ret = 1 here ? That means that the caller (the kvm_cpu_exec() main
loop) will exit immediately upon return. If we had instead returned 0,
it would have looped, seen the exit_request set by
qemu_system_reset_request(), which would have then done the signal + run
trick, completed the hypercall, and returned this time in a clean state.
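
The difference between the two return values can be seen in a toy model of the run loop. This is a sketch under loose assumptions rather than the qemu code: `struct toy_env`, `toy_run_once()` and `toy_cpu_exec()` are hypothetical, and a `kicked` flag stands in for the self-signal that makes the next run return right after completing the pending operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_exit { TOY_EXIT_HCALL, TOY_EXIT_INTR };

struct toy_env {
    bool exit_request;     /* set when the hcall handler requests a reset */
    bool hcall_in_flight;  /* kernel side has not yet completed the hcall */
    int  runs;             /* how many times the toy run call was made */
};

/* Toy run call: completes a pending hcall first, then honours the kick. */
static enum toy_exit toy_run_once(struct toy_env *env, bool kicked)
{
    env->runs++;
    if (env->hcall_in_flight)
        env->hcall_in_flight = false;   /* operation completed on re-entry */
    if (kicked)
        return TOY_EXIT_INTR;
    env->hcall_in_flight = true;        /* guest issues the reset hypercall */
    return TOY_EXIT_HCALL;
}

/*
 * Toy run loop. hcall_ret is what the exit handler returns for the
 * hypercall exit: nonzero leaves the loop at once, 0 goes around again.
 * Returns true if the loop was left with the hypercall completed.
 */
static bool toy_cpu_exec(struct toy_env *env, int hcall_ret)
{
    int ret;

    assert(env != NULL);
    do {
        bool kicked = env->exit_request;   /* models the self-signal */

        if (toy_run_once(env, kicked) == TOY_EXIT_INTR) {
            ret = 1;                       /* interrupted: leave the loop */
        } else {
            env->exit_request = true;      /* handler asked for a reset */
            ret = hcall_ret;
        }
    } while (ret == 0);

    return !env->hcall_in_flight;
}
```

In this model, `toy_cpu_exec(&env, 1)` leaves the loop after one run with the hypercall still in flight, while `toy_cpu_exec(&env, 0)` performs a second run that completes it before stopping.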

So it seems (I don't have the machine at hand to test right now) that by
just changing that ret = 1 to ret = 0, we might be fixing our problem,
including the case where another vcpu is taking a hypercall at the time
of the reset (this other CPU will also need to complete the operation
before stopping).

David, is there any reason why you did ret = 1 to begin with ? For
things like reset etc... the exit will happen as a result of
env->exit_request being set by cpu_exit().

Are there other cases where you wish the hcall to go back to the main
loop ? In that case, shouldn't it set env->exit_request rather than
returning 1 at that point ? (H_CEDE for example ?)

Paul, David, I don't have time to test that before Tuesday (I'm away on
monday) but if you have a chance, revert the kernel change we did and
try this qemu patch for reset:

--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -766,7 +766,7 @@ int kvm_arch_handle_exit(CPUPPCState *env, struct kvm_run *r
 dprintf("handle PAPR hypercall\n");
 run->papr_hcall.ret = spapr_hypercall(env, run->papr_hcall.nr,
   run->papr_hcall.args);
-ret = 1;
+ret = 0;
 break;
 #endif
 default:

It might also need something like:

diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index a5990a9..d4d7fb0 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -545,6 +545,7 @@ static target_ulong h_cede(CPUPPCState *env, sPAPREnvironmen
 hreg_compute_hflags(env);
 if (!cpu_has_work(env)) {
 env->halted = 1;
+env->exit_request = 1;
 }
 return H_SUCCESS;
 }

Though I don't think H_CEDE ever goes down to qemu, does it ?

Cheers,
Ben.



Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-03 Thread Benjamin Herrenschmidt

On Fri, 2012-08-03 at 15:05 -0300, Marcelo Tosatti wrote:

 See kvm_arch_process_async_events() call to qemu_system_reset_request()
 in target-i386/kvm.c.
 
 The whole thing is fragile, though: we rely on the order events
 are processed inside KVM_RUN, in x86:
 
 1) If there is pending MMIO, process it.
 2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
 there is a signal pending.
 
 That way, the vcpu will not process the stop event from the main loop
 (ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.

Right, it is fragile, thankfully we appear to adhere to the same
ordering on powerpc so far :-)

So we'll need to test but it looks like we might be able to fix our
problem without a kernel or API change, just by changing qemu to
do the same exit_request trick for our reboot hypercall.
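
A minimal sketch of that exit_request trick, written as a toy C model
rather than real QEMU or KVM code. Every name below (toy_vcpu,
toy_kvm_run, toy_handle_reboot_hcall, toy_cpu_exec) is hypothetical, and
the real loop maps -EINTR to its own return code; the sketch only shows
the ordering: the hypercall handler returns 0 and sets exit_request, the
loop queues a self-signal, and the next entry into the kernel finishes
the pending operation before the signal makes it return.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these names are real QEMU or KVM symbols. */
struct toy_vcpu {
    bool op_pending;     /* kernel still has to finish an MMIO/hcall */
    bool signal_pending; /* self-signal queued by the "kick" */
    bool exit_request;   /* userspace wants to leave the vcpu loop */
    int  ops_completed;  /* number of pending operations finished */
    int  guest_entries;  /* number of times guest code actually ran */
};

/* Stand-in for KVM_RUN: finish pending work first, then honour signals. */
static int toy_kvm_run(struct toy_vcpu *v)
{
    if (v->op_pending) {
        v->op_pending = false;
        v->ops_completed++;
    }
    if (v->signal_pending) {
        v->signal_pending = false;
        return -EINTR;
    }
    v->guest_entries++;
    v->op_pending = true; /* the toy guest issues the reboot hypercall */
    return 0;
}

/* Stand-in for the reboot hypercall handler: request an exit, return 0. */
static int toy_handle_reboot_hcall(struct toy_vcpu *v)
{
    v->exit_request = true;
    return 0; /* 0 means: stay in the loop and re-enter the kernel */
}

/* Stand-in for the vcpu execution loop. */
static int toy_cpu_exec(struct toy_vcpu *v)
{
    int ret;

    do {
        if (v->exit_request) {
            v->signal_pending = true; /* models the self-kick */
        }
        ret = toy_kvm_run(v);
        if (ret == -EINTR) {
            v->exit_request = false;
            break; /* back to the main loop, state is consistent */
        }
        ret = toy_handle_reboot_hcall(v);
    } while (ret == 0);

    return ret;
}
```

In this model the loop leaves only after the pending hypercall has been
completed by the second toy_kvm_run() call, and the guest is not entered
again in between.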

Long run however, I wonder whether we should consider an explicit ioctl
to complete those pending operations instead...

Cheers,
Ben.



[PATCH 3/3] KVM: PPC: booke: Added debug handler

2012-08-03 Thread Bharat Bhushan

The installed debug handler will be used for guest debug support and
debug facility emulation features (patches for these features
will follow this patch).

Signed-off-by: Liu Yu yu@freescale.com
[bharat.bhus...@freescale.com: Substantial changes]
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com

---
 arch/powerpc/include/asm/kvm_host.h |1 +
 arch/powerpc/kernel/asm-offsets.c   |1 +
 arch/powerpc/kvm/booke_interrupts.S |   45 +++
 3 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index dcee499..bd78523 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -494,6 +494,7 @@ struct kvm_vcpu_arch {
    u32 tlbcfg[4];
    u32 mmucfg;
    u32 epr;
+   u32 crit_save;
 #endif
    gpa_t paddr_accessed;
    gva_t vaddr_accessed;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 85b05c4..92f149b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -563,6 +563,7 @@ int main(void)
    DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
    DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
    DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
 #endif /* CONFIG_PPC_BOOK3S */
 #endif /* CONFIG_KVM */
 
diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S
index 3539805..890673c 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -73,6 +73,51 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
    bctr
 .endm
 
+.macro KVM_DBG_HANDLER ivor_nr scratch srr0
+_GLOBAL(kvmppc_handler_\ivor_nr)
+   mtspr   \scratch, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz     r4, THREAD_KVM_VCPU(r4)
+   stw     r3, VCPU_CRIT_SAVE(r4)
+   mfcr    r3
+   mfspr   r4, SPRN_CSRR1
+   andi.   r4, r4, MSR_PR
+   bne     1f
+   /* debug interrupt happened in enter/exit path */
+   mfspr   r4, SPRN_CSRR1
+   rlwinm  r4, r4, 0, ~MSR_DE
+   mtspr   SPRN_CSRR1, r4
+   lis     r4, 0x
+   ori     r4, r4, 0x
+   mtspr   SPRN_DBSR, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz     r4, THREAD_KVM_VCPU(r4)
+   mtcr    r3
+   lwz     r3, VCPU_CRIT_SAVE(r4)
+   mfspr   r4, \scratch
+   rfci
+1: /* debug interrupt happened in guest */
+   mfspr   r4, \scratch
+   mtcr    r3
+   mr      r3, r4
+   mfspr   r4, SPRN_SPRG_THREAD
+   lwz     r4, THREAD_KVM_VCPU(r4)
+   stw     r3, VCPU_GPR(R4)(r4)
+   stw     r5, VCPU_GPR(R5)(r4)
+   stw     r6, VCPU_GPR(R6)(r4)
+   lwz     r3, VCPU_CRIT_SAVE(r4)
+   mfspr   r5, \srr0
+   stw     r3, VCPU_GPR(R3)(r4)
+   stw     r5, VCPU_PC(r4)
+   mfctr   r5
+   lis     r6, kvmppc_resume_host@h
+   stw     r5, VCPU_CTR(r4)
+   li      r5, \ivor_nr
+   ori     r6, r6, kvmppc_resume_host@l
+   mtctr   r6
+   bctr
+.endm
+
 .macro KVM_HANDLER_ADDR ivor_nr
    .long   kvmppc_handler_\ivor_nr
 .endm
-- 
1.7.0.4
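
The branch structure of KVM_DBG_HANDLER can be read as the following toy
C model. It is a sketch under stated assumptions, not kernel code: the
type and function names are hypothetical, the MSR bit values are assumed
for illustration only, and because the constant written to DBSR is
elided in the patch text above, the model simply assumes the intent is
to clear the pending debug status.

```c
#include <stdint.h>

/* Assumed bit values, for illustration only; check the kernel headers. */
#define TOY_MSR_PR 0x4000u /* problem state: set while the guest runs */
#define TOY_MSR_DE 0x0200u /* debug interrupt enable */

enum toy_dbg_path {
    TOY_DBG_HOST_PATH, /* interrupt hit the host enter/exit path */
    TOY_DBG_GUEST      /* interrupt hit guest code */
};

/* Only the state the decision depends on; hypothetical struct. */
struct toy_crit_regs {
    uint32_t csrr1; /* MSR saved by the critical-class interrupt */
    uint32_t dbsr;  /* pending debug status */
};

static enum toy_dbg_path toy_debug_entry(struct toy_crit_regs *r)
{
    if (r->csrr1 & TOY_MSR_PR) {
        /*
         * Guest case: the macro saves r3-r6, PC and CTR into the vcpu
         * and jumps to kvmppc_resume_host; nothing is modified here.
         */
        return TOY_DBG_GUEST;
    }

    /*
     * Host enter/exit path: drop MSR_DE from the saved MSR so debug
     * events stay off after the return, and clear the status register
     * (assumption, since the DBSR constant is elided in the source).
     */
    r->csrr1 &= ~TOY_MSR_DE;
    r->dbsr = 0;
    return TOY_DBG_HOST_PATH;
}
```

The model leaves out the register save and restore through
VCPU_CRIT_SAVE, which is what lets the macro use r3 and r4 as scratch
registers before it knows which path it is on.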



Re: [PATCH 4/4] powerpc/booke: Add CPU_FTR_EMB_HV check for e5500.

2012-08-03 Thread Kumar Gala


On Jul 9, 2012, at 8:04 AM, Varun Sethi wrote:

> Added CPU_FTR_EMB_HV feature check for e550.
>
> Signed-off-by: Varun Sethi varun.se...@freescale.com
> Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
> ---
> arch/powerpc/kernel/cpu_setup_fsl_booke.S |6 ++
> 1 files changed, 6 insertions(+), 0 deletions(-)

[ fixed typo e550 -> e5500 ]

applied to next

- k

Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 06:20:18AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2012-08-02 at 15:35 +0300, Avi Kivity wrote:
> > This is actually documented in api.txt, though not in relation to
> > reset:
> >
> >   NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the
> >   corresponding operations are complete (and guest state is consistent)
> >   only after userspace has re-entered the kernel with KVM_RUN.  The
> >   kernel side will first finish incomplete operations and then check
> >   for pending signals.  Userspace can re-enter the guest with an
> >   unmasked signal pending to complete pending operations.
> >
> > For x86 the issue was with live migration - you can't copy guest
> > register state in the middle of an I/O operation.  Reset is actually
> > similar, but it involves writing state (which can then be overwritten)
> > instead of reading it.
>
> Hrm, except that doing KVM_RUN with a signal is very cumbersome to do
> and I couldn't quite find the logic in qemu to do it ... but I might
> just have missed it. I can see indeed that in the migration case you
> want to actually complete the operation rather than just abort it.
>
> Any chance you can point me to the code that performs that trick qemu
> side for migration ?

kvm-all.c:

        kvm_arch_pre_run(env, run);
        if (env->exit_request) {
            DPRINTF("interrupt exit requested\n");
            /*
             * KVM requires us to reenter the kernel after IO exits to complete
             * instruction emulation. This self-signal will ensure that we
             * leave ASAP again.
             */
            qemu_cpu_kick_self();
        }


> Anthony seems to think that for reset we can just abort the operation
> state in the kernel when the MP state changes.
>
> Cheers,
> Ben.

Re: Reset problem vs. MMIO emulation, hypercalls, etc...

2012-08-03 Thread Marcelo Tosatti

On Fri, Aug 03, 2012 at 02:41:13PM -0300, Marcelo Tosatti wrote:
> On Fri, Aug 03, 2012 at 06:20:18AM +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2012-08-02 at 15:35 +0300, Avi Kivity wrote:
> > > This is actually documented in api.txt, though not in relation to
> > > reset:
> > >
> > >   NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the
> > >   corresponding operations are complete (and guest state is consistent)
> > >   only after userspace has re-entered the kernel with KVM_RUN.  The
> > >   kernel side will first finish incomplete operations and then check
> > >   for pending signals.  Userspace can re-enter the guest with an
> > >   unmasked signal pending to complete pending operations.
> > >
> > > For x86 the issue was with live migration - you can't copy guest
> > > register state in the middle of an I/O operation.  Reset is actually
> > > similar, but it involves writing state (which can then be overwritten)
> > > instead of reading it.
> >
> > Hrm, except that doing KVM_RUN with a signal is very cumbersome to do
> > and I couldn't quite find the logic in qemu to do it ... but I might
> > just have missed it. I can see indeed that in the migration case you
> > want to actually complete the operation rather than just abort it.
> >
> > Any chance you can point me to the code that performs that trick qemu
> > side for migration ?
>
> kvm-all.c:
>
>         kvm_arch_pre_run(env, run);
>         if (env->exit_request) {
>             DPRINTF("interrupt exit requested\n");
>             /*
>              * KVM requires us to reenter the kernel after IO exits to complete
>              * instruction emulation. This self-signal will ensure that we
>              * leave ASAP again.
>              */
>             qemu_cpu_kick_self();
>         }


See kvm_arch_process_async_events() call to qemu_system_reset_request()
in target-i386/kvm.c.

The whole thing is fragile, though: we rely on the order events
are processed inside KVM_RUN, in x86:

1) If there is pending MMIO, process it.
2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
there is a signal pending.

That way, the vcpu will not process the stop event from the main loop
(ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.
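
Why the ordering matters can be shown with a small toy model in C. This
is only a sketch: toy_state, toy_order and toy_run are hypothetical
names, and the "signal first" variant is an invented counter-example,
not behaviour of any real kernel. With MMIO handled first, the call
returns -EINTR with nothing left pending; with the checks swapped, it
returns -EINTR while the MMIO is still incomplete, which is the state
userspace must not observe.

```c
#include <errno.h>
#include <stdbool.h>

/* Which check the toy "KVM_RUN" performs first (hypothetical). */
enum toy_order { TOY_MMIO_FIRST, TOY_SIGNAL_FIRST };

struct toy_state {
    bool mmio_pending;   /* an MMIO exit has not been completed yet */
    bool signal_pending; /* a signal is waiting for this thread */
};

static int toy_run(struct toy_state *s, enum toy_order order)
{
    /* Invented counter-example: signal checked before pending MMIO. */
    if (order == TOY_SIGNAL_FIRST && s->signal_pending) {
        s->signal_pending = false;
        return -EINTR; /* MMIO is still pending here */
    }

    /* 1) If there is pending MMIO, process it. */
    if (s->mmio_pending) {
        s->mmio_pending = false;
    }

    /* 2) If a signal is pending, return with -EINTR. */
    if (s->signal_pending) {
        s->signal_pending = false;
        return -EINTR;
    }

    return 0; /* would enter the guest */
}
```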

> > Anthony seems to think that for reset we can just abort the operation
> > state in the kernel when the MP state changes.
> >
> > Cheers,
> > Ben.
