Re: Q: What is the struct kvm srcu protecting?

2013-05-05 Thread Gleb Natapov
On Fri, May 03, 2013 at 10:21:09AM -0700, David Daney wrote:
 On 05/03/2013 03:51 AM, Gleb Natapov wrote:
 Hi David,
 
 On Thu, May 02, 2013 at 10:48:36PM -0300, Marcelo Tosatti wrote:
 On Thu, May 02, 2013 at 11:22:52AM -0700, David Daney wrote:
 Hi,
 
 I am working on the MIPS KVM port, and am trying to figure out under
 which circumstances do I need to srcu_read_lock()/srcu_read_unlock()
 the kvm-srcu.
 
 Is your work somehow related to the work of Sanjay Lal that can be found
 here: https://git.linux-mips.org/?p=ralf/upstream-sfr.git;a=summary?
 
 
 It is related in that a single asm/uapi/kvm.h must be shared between
 the implementations.  It differs in that it is based on the MIPS-VZ
 hardware virtualization feature, whereas Sanjay's code is a pure
 software solution.
 
 If possible some code might be shared between the two, but it may
 end up looking somewhat like the x86 implementation where there are
 separate VMX and SVM implementations.
 
Sanjay's code already has such a design, and the reason he gave for it was
the upcoming HW virtualization support.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state

2013-05-05 Thread Jan Kiszka
On 2013-04-30 14:42, Jan Kiszka wrote:
 On 2013-04-30 13:46, Gleb Natapov wrote:
 On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote:
 On 2013-02-23 22:35, Jan Kiszka wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
 state transition that may prevent loading L1's cr0.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  arch/x86/kvm/vmx.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 26d47e9..94f3b66 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu 
 *vcpu,
 * fpu_active (which may have changed).
 * Note that vmx_set_cr0 refers to efer set above.
 */
 -  kvm_set_cr0(vcpu, vmcs12->host_cr0);
 +  vmx_set_cr0(vcpu, vmcs12->host_cr0);
/*
 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
 * to apply the same changes to L1's vmcs. We just set cr0 correctly,


 This one still applies, is necessary for nested unrestricted guest mode,
 and I'm still convinced it's an appropriate way to fix the bug. How to
 proceed?

 What check that is done by kvm_set_cr0() fails?
 
 Would have to reproduce the bug to confirm, but from the top of my head
 and from looking at the code again:
 
 if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
         if ((vcpu->arch.efer & EFER_LME)) {
                 int cs_db, cs_l;
 
                 if (!is_pae(vcpu))
                         return 1;
                 kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
                 if (cs_l)
                         return 1;
 
 I seem to remember that this last check triggered. When we come from the
 guest with paging off, we may run through this check and incorrectly bail
 out here when the host state fulfills the conditions (PG, EFER_LME, and
 L bit set).

Just retried, and actually the first check (!is_pae) fails right now
(with nested unrestricted guest mode patched in). The second one
stumbles if I set CR4 before CR0 in load_vmcs12_host_state.

So, however you put it, calling kvm_set_cr0 remains wrong.

Jan
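
The failure mode discussed in this thread can be reproduced in a small user-space model. The sketch below is not kernel code: `struct vcpu_model` and both `model_*` functions are hypothetical names, and only the quoted fragment of the check is modelled. It shows why a setter that validates the state transition can refuse a host value that is perfectly valid on its own, while an unchecked load accepts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_CR0_PG (1ULL << 31) /* CR0.PG: paging enabled */
#define EFER_LME   (1ULL << 8)  /* EFER.LME: long mode enable */

/* Hypothetical, simplified stand-in for the state the check reads. */
struct vcpu_model {
    uint64_t cr0;  /* CR0 currently loaded (the guest's value at exit) */
    uint64_t efer; /* EFER, already switched to the host value */
    bool pae;      /* CR4.PAE as currently loaded */
    bool cs_l;     /* CS.L as currently loaded */
};

/*
 * Models only the quoted fragment: when paging goes from off to on with
 * EFER.LME set, PAE must be on and CS.L must be clear.
 * Returns 1 when the load is refused, 0 when it is accepted.
 */
static int model_set_cr0_checked(struct vcpu_model *v, uint64_t new_cr0)
{
    assert(v != NULL);

    bool paging_now = (v->cr0 & X86_CR0_PG) != 0;

    if (!paging_now && (new_cr0 & X86_CR0_PG) && (v->efer & EFER_LME)) {
        if (!v->pae)
            return 1;
        if (v->cs_l)
            return 1;
    }
    v->cr0 = new_cr0;
    return 0;
}

/* Models an unchecked load, which is what restoring host state needs. */
static void model_set_cr0_unchecked(struct vcpu_model *v, uint64_t new_cr0)
{
    assert(v != NULL);
    v->cr0 = new_cr0;
}
```

In this model the checked setter fails whenever the guest left paging off and CR4 or CS have not been switched to host values yet, regardless of the order in which the remaining registers are restored afterwards.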





Re: [PATCH] kvm: Add compat_ioctl for device control API

2013-05-05 Thread Gleb Natapov
On Tue, Apr 30, 2013 at 08:00:45PM -0500, Scott Wood wrote:
 This API shouldn't have 32/64-bit issues, but VFS assumes it does
 unless told otherwise.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
Applied, thanks.

 ---
  virt/kvm/kvm_main.c |3 +++
  1 file changed, 3 insertions(+)
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 54a14fa..ee0adad 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -2221,6 +2221,9 @@ static int kvm_device_release(struct inode *inode, 
 struct file *filp)
  
  static const struct file_operations kvm_device_fops = {
   .unlocked_ioctl = kvm_device_ioctl,
 +#ifdef CONFIG_COMPAT
 + .compat_ioctl = kvm_device_ioctl,
 +#endif
   .release = kvm_device_release,
  };
  
 -- 
 1.7.10.4

--
Gleb.
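
For readers wondering why an ioctl argument may or may not have "32/64-bit issues", here is a hedged user-space sketch. `struct needs_translation` is purely hypothetical. `struct device_attr_model` is modelled on what I understand the device control API's attribute structure to look like (fields `flags`, `group`, `attr`, `addr`); treat that field list as an assumption and check the uapi header before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout with a pointer-sized member: its size and offsets
 * differ between 32-bit and 64-bit user space, so a 64-bit kernel would
 * need a translating compat handler for it.
 */
struct needs_translation {
    uint32_t flags;
    void *addr;
};

/*
 * Fixed-width layout (assumed field order). Every member has the same
 * size and offset under both ABIs, so the native handler can be reused
 * as the compat handler unchanged.
 */
struct device_attr_model {
    uint32_t flags;
    uint32_t group;
    uint64_t attr;
    uint64_t addr; /* user pointer carried as a 64-bit integer */
};

/* Compile-time proof that the fixed-width layout has no hidden padding. */
_Static_assert(sizeof(struct device_attr_model) == 24,
               "fixed-width layout must be ABI independent");
_Static_assert(offsetof(struct device_attr_model, attr) == 8,
               "attr must follow the two 32-bit fields directly");
```

Because the layout is identical for both kinds of callers, pointing `.compat_ioctl` at the same function as `.unlocked_ioctl` is sufficient, which is what the patch does.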


Re: [RFC PATCH 04/11] kvm tools: console: unconditionally output to any console

2013-05-05 Thread Asias He
On Fri, May 3, 2013 at 5:19 PM, Pekka Enberg penb...@kernel.org wrote:
 On Wed, May 1, 2013 at 6:50 PM, Will Deacon will.dea...@arm.com wrote:
 From: Marc Zyngier marc.zyng...@arm.com

 Kvmtool suppresses any output to a console that has not been elected
 as *the* console.

 While this makes sense on the input side (we want the input to be sent
 to one console driver only), it seems to be the wrong thing to do on
 the output side, as it effectively prevents the guest from switching
 from one console to another (think earlyprintk using 8250 to virtio
 console).

 After all, the guest *does* poke this device and outputs something
 there.

 Just remove the kvm-cfg.active_console test from the output paths.

 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 Signed-off-by: Will Deacon will.dea...@arm.com

 Seems reasonable. Asias, Sasha?

This patch itself looks good to me.

But we have more issues for the console devices and terminals with
regard to multiple console support:

1) All the console outputs (spapr_hvcons.c, spapr_rtas.c,
virtio/console.c) are redirected to term 0.
2) With multiple console support, the cfg.active_console logic is not
very useful at all.
3) Four serial devices ttyS0-3 are initialized unconditionally and
mapped to term 0-3.
4) Using --tty option, we can map a term to /dev/pts/N on host. I
think we can merge --tty option to --console option.

I have something like this in my mind:

--console type=serial,backend=stdio
--console type=virtio,backend=pts
--console type=hv,backend=pts

e.g. to add two serial consoles ttyS0 and ttyS1 and one virtio console
hvc0, where ttyS0 is mapped to stdio and ttyS1 and hvc0 are mapped to pts,
we use this:

--console type=serial,backend=stdio  --console type=serial,backend=pts
 --console type=virtio,backend=pts
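
A minimal sketch of how such an option string could be parsed, assuming the `type=...,backend=...` syntax proposed above. Nothing here exists in kvmtool: `struct console_opt` and `parse_console_opt()` are hypothetical names, and the accepted keys are limited to the two that appear in the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing one --console argument. */
struct console_opt {
    char type[16];    /* e.g. "serial", "virtio", "hv" */
    char backend[16]; /* e.g. "stdio", "pts" */
};

/*
 * Parses "key=value[,key=value...]" with the keys "type" and "backend".
 * Returns 0 on success, -1 if the string is malformed, too long, uses an
 * unknown key, or leaves one of the two keys unset.
 */
static int parse_console_opt(const char *arg, struct console_opt *out)
{
    char buf[64];
    char *p;

    assert(arg != NULL && out != NULL);

    if (strlen(arg) >= sizeof(buf))
        return -1;
    strcpy(buf, arg);
    out->type[0] = '\0';
    out->backend[0] = '\0';

    p = buf;
    while (p && *p) {
        char *next = strchr(p, ','); /* end of this key=value pair */
        char *eq;

        if (next)
            *next++ = '\0';
        eq = strchr(p, '=');
        if (!eq)
            return -1;
        *eq++ = '\0';

        if (strcmp(p, "type") == 0)
            snprintf(out->type, sizeof(out->type), "%s", eq);
        else if (strcmp(p, "backend") == 0)
            snprintf(out->backend, sizeof(out->backend), "%s", eq);
        else
            return -1;
        p = next;
    }
    return (out->type[0] && out->backend[0]) ? 0 : -1;
}
```

Each `--console` occurrence would be parsed independently, so the three-console example above would simply produce three `struct console_opt` values in command-line order.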





--
Asias


[GIT PULL] KVM updates for the 3.10 merge window

2013-05-05 Thread Gleb Natapov
Linus,

Please pull from

git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/kvm-3.10-1

to receive the KVM updates for the 3.10 merge window. 

Highlights of the updates are:

general:
 - new emulated device API
 - legacy device assignment is now optional
 - irqfd interface is more generic and can be shared between arches
x86:
 - VMCS shadow support and other nested VMX improvements
 - APIC virtualization and Posted Interrupt hardware support
 - Optimize mmio spte zapping
ppc:
  - BookE: in-kernel MPIC emulation with irqfd support
  - Book3S: in-kernel XICS emulation (incomplete)
  - Book3S: HV: migration fixes
  - BookE: more debug support preparation
  - BookE: e6500 support
ARM:
 - reworking of Hyp idmaps
s390:
 - ioeventfd for virtio-ccw

And many other bug fixes, cleanups and improvements.

Abel Gordon (11):
  KVM: nVMX: Shadow-vmcs control fields/bits
  KVM: nVMX: Detect shadow-vmcs capability
  KVM: nVMX: Introduce vmread and vmwrite bitmaps
  KVM: nVMX: Refactor handle_vmwrite
  KVM: nVMX: Fix VMXON emulation
  KVM: nVMX: Allocate shadow vmcs
  KVM: nVMX: Release shadow vmcs
  KVM: nVMX: Copy processor-specific shadow-vmcs to VMCS12
  KVM: nVMX: Copy VMCS12 to processor-specific shadow vmcs
  KVM: nVMX: Synchronize VMCS12 content with the shadow vmcs
  KVM: nVMX: Enable and disable shadow vmcs functionality

Alex Williamson (2):
  kvm: Allow build-time configuration of KVM device assignment
  kvm: KVM_CAP_IOMMU only available with device assignment

Alexander Graf (15):
  Merge commit 'origin/next' into kvm-ppc-next
  KVM: ARM: Fix kvm_vm_ioctl_irq_line
  Merge commit 'origin/next' into kvm-ppc-next
  KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  KVM: Remove kvm_get_intr_delivery_bitmask
  KVM: Move irq routing to generic code
  KVM: Extract generic irqchip logic into irqchip.c
  KVM: Move irq routing setup to irqchip.c
  KVM: Move irqfd resample cap handling to generic code
  KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  KVM: PPC: MPIC: Restrict to e500 platforms
  KVM: IA64: Carry non-ia64 changes into ia64

Andre Przywara (1):
  ARM: KVM: iterate over all CPUs for CPU compatibility check

Andrew Honig (1):
  KVM: x86: Fix memory leak in vmx.c

Arnd Bergmann (1):
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally

Benjamin Herrenschmidt (3):
  KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller
  KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVM
  KVM: PPC: Book3S HV: Add support for real mode ICP in XICS emulation

Bharat Bhushan (9):
  KVM: PPC: move tsr update in a separate function
  KVM: PPC: Added one_reg interface for timer registers
  KVM: PPC: booke: Added debug handler
  Added ONE_REG interface for debug instruction
  KVM: PPC: cache flush for kernel managed pages
  KVM: PPC: debug stub interface parameter defined
  Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
  KVM: extend EMULATE_EXIT_USER to support different exit reasons
  booke: exit to user space if emulator request

Borislav Petkov (1):
  kvm, svm: Fix typo in printk message

Chegu Vinod (1):
  KVM: x86: Increase the hard max VCPU limit

Chen Gang (1):
  arch/x86/kvm: beautify source code for __u32 irq which is never < 0

Christian Borntraeger (1):
  KVM: s390: Dont do a gmap update on minor memslot changes

Christoffer Dall (3):
  KVM: ARM: Reintroduce trace_kvm_hvc
  KVM: ARM: Fix API documentation for ONE_REG encoding
  KVM: ARM: Fix spelling in error message

Cornelia Huck (6):
  KVM: s390: Export virtio-ccw api.
  KVM: Initialize irqfd from kvm_init().
  KVM: Introduce KVM_VIRTIO_CCW_NOTIFY_BUS.
  KVM: ioeventfd for virtio-ccw devices.
  KVM: s390: Wire up ioeventfd.
  KVM: s390: virtio_ccw: reset errors for new I/O.

Geoff Levand (4):
  KVM: Move vm_list kvm_lock declarations out of x86
  KVM: Make local routines static
  KVM: Move kvm_spurious_fault to x86.c
  KVM: Move kvm_rebooting declaration out of x86

Gleb Natapov (10):
  Merge  'git://github.com/agraf/linux-2.6.git kvm-ppc-next' into queue
  KVM: emulator: fix unimplemented instruction detection
  KVM: VMX: do not try to reexecute failed instruction while emulating 
invalid guest state
  KVM: emulator: Do not fail on emulation of undefined opcode
  KVM: emulator: mark 0xff 0x7d opcode as undefined.
  KVM: VMX: Fix check guest state validity if a guest is in VM86 mode
  Merge git://github.com/agraf/linux-2.6.git kvm-ppc-next into queue
  KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x 
instructions
  Merge branch 'kvm-arm-cleanup' from 

[PATCH] KVM: add missing misc_deregister() on error in kvm_init()

2013-05-05 Thread Wei Yongjun
From: Wei Yongjun yongjun_...@trendmicro.com.cn

Add the missing misc_deregister() before return from kvm_init()
in the debugfs init error handling case.

Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn
---
 virt/kvm/kvm_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f18013f..3eb4d16 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3012,6 +3012,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
vcpu_align,
 
 out_undebugfs:
unregister_syscore_ops(&kvm_syscore_ops);
+   misc_deregister(&kvm_dev);
 out_unreg:
kvm_async_pf_deinit();
 out_free:
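
The class of bug fixed here is easy to see in a reduced model of goto-based unwinding. The sketch below is a user-space illustration under stated assumptions: the `RES_*` flags and `model_init()` are hypothetical, and the three setup steps only loosely mirror the order implied by the labels in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the real registrations. */
#define RES_ASYNC_PF (1u << 0)
#define RES_MISC_DEV (1u << 1)
#define RES_SYSCORE  (1u << 2)

/*
 * Simplified model of an init routine with label-based error unwinding.
 * Returns the mask of resources still registered when the function exits.
 * 'debugfs_fails' makes the last setup step fail; 'deregister_misc'
 * selects whether the error path contains the added deregistration.
 */
static unsigned int model_init(bool debugfs_fails, bool deregister_misc)
{
    unsigned int live = 0;

    live |= RES_ASYNC_PF; /* step 1 succeeds */
    live |= RES_MISC_DEV; /* step 2 succeeds */
    live |= RES_SYSCORE;  /* step 3 succeeds */
    assert(live == (RES_ASYNC_PF | RES_MISC_DEV | RES_SYSCORE));

    if (debugfs_fails)
        goto out_undebugfs;
    return live; /* success: everything stays registered */

out_undebugfs:
    live &= ~RES_SYSCORE;
    if (deregister_misc)
        live &= ~RES_MISC_DEV;
    /* corresponds to the out_unreg label */
    live &= ~RES_ASYNC_PF;
    return live;
}
```

A non-zero return on the failure path means something was left registered; without the extra deregistration the model leaks `RES_MISC_DEV`, with it the mask is empty.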



Re: [PATCH 3/4] vhost-net: Free ubuf when vhost_dev_ioctl fails

2013-05-05 Thread Michael S. Tsirkin
On Fri, May 03, 2013 at 02:25:17PM +0800, Asias He wrote:
 Free ubuf when vhost_dev_ioctl for VHOST_SET_OWNER fails.
 
 Signed-off-by: Asias He as...@redhat.com
 ---
  drivers/vhost/net.c | 20 ++--
  1 file changed, 18 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index b2f6b41..eb73217 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -152,6 +152,19 @@ void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref 
 *ubufs)
   kfree(ubufs);
  }
  
 +static void vhost_net_clear_ubuf_info(struct vhost_net *n)
 +{
 +
 + bool zcopy;
 + int i;
 +
 + for (i = 0; i < n->dev.nvqs; ++i) {
 + zcopy = vhost_zcopy_mask & (0x1 << i);
 + if (zcopy)
 + kfree(n->vqs[i].ubuf_info);
 + }
 +}
 +
  int vhost_net_set_ubuf_info(struct vhost_net *n)
  {
   bool zcopy;
 @@ -1069,10 +1082,13 @@ static long vhost_net_ioctl(struct file *f, unsigned 
 int ioctl,
   goto out;
   }
   r = vhost_dev_ioctl(&n->dev, ioctl, argp);
 - if (r == -ENOIOCTLCMD)
 + if (r == -ENOIOCTLCMD) {
   r = vhost_vring_ioctl(&n->dev, ioctl, argp);
 - else
 + } else {
 + if (r < 0 && ioctl == VHOST_SET_OWNER)
 + vhost_net_clear_ubuf_info(n);
   vhost_net_flush(n);
 + }

This is becoming too complex.
Let's just export vhost_dev_set_owner from vhost.c
and have a separate case statement for VHOST_SET_OWNER.


Also - could you please send a separate series
with bugfixes, so I can apply for 3.10?
Cleanups I will queue for 3.11.

Thanks!

  out:
   mutex_unlock(&n->dev.mutex);
   return r;
 -- 
 1.8.1.4


Fwd: Booting physically installed Windows while in Arch (AHCI support in OVMF?)

2013-05-05 Thread Evert Heylen
Please, any help?

I'm currently in such a state that I won't be able to sleep well before I
make some progress on this.
I've already described my situation quite precisely; if anyone needs even
more information, just ask.

I've now also tried with a separate img containing DUET, so I can use
the default seabios to boot DUET, which can boot Windows in UEFI
mode. However, DUET just doesn't see my disk at all, be it in IDE or
AHCI mode. If I boot the same img *physically* (from a usb), I can
enter DUET and I can see my physical disk (which is running in AHCI
mode). So I guess this is an issue with KVM/QEMU.

Any ideas would be greatly appreciated.

On Sun, Apr 28, 2013 at 6:29 PM, Evert Heylen everthey...@gmail.com wrote:
 Hi all, My situation is the following:
 My PC (x64) has an UEFI capable motherboard (ASRock Z77). On my hard
 drive (which is GPT formatted ofc), I have Windows 7 installed on
 /dev/sda3 and Arch Linux on /dev/sda2. I can boot both OS'es. However,
 I would like to boot Windows while in Arch, using KVM. I'm using the
 OVMF images. I tried it right away with this command:

 qemu-system-x86_64 -enable-kvm -smp 4 -cpu host -m 4096 -hda /dev/sda
 -L /path/to/ovmf/

 It doesn't work. When booting in safe mode in windows, I can see that
 windows fails when trying to load CLASSPNP.sys . After some googling I
 found out that it might be because qemu 'mounts' the drive in IDE
 mode, while windows expects it to be in AHCI mode (because it was
 installed in AHCI mode). Then, after some more googling, I tried this
 command, which should (correct me if I'm wrong) mount the drive in
 AHCI mode.

 qemu-system-x86_64 -enable-kvm -smp 4 -cpu host -m 4096 -L
 /path/to/ovmf -device ahci,id=ahci0 -drive
 if=none,file=/dev/sda,format=raw,id=drive-sata0-0-0 -device
 driver=ide-drive,bus=ahci0.0,drive=drive-sata0-0-0,id=sata0-0-0

 However, with this command OVMF doesn't seem to recognise any drive at
 all, the 'Boot from file' screen is empty.

 So, I would like to know if OVMF supports AHCI, and if it doesn't, do
 you have any other ideas?
 I know it's generally not a good idea to boot a physically installed
 OS in a vm, but I want to try it anyway.

 Thanks,
 Evert


[uq/master PATCH] kvmvapic: add ioport read accessor

2013-05-05 Thread Marcelo Tosatti

Necessary since memory region accessor assumes read and write
methods are registered. Otherwise reading I/O port 0x7e segfaults.

https://bugzilla.redhat.com/show_bug.cgi?id=954306

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 5b558aa..655483b 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -687,8 +687,14 @@ static void vapic_write(void *opaque, hwaddr addr, 
uint64_t data,
 }
 }
 
+static uint64_t vapic_read(void *opaque, hwaddr addr, unsigned size)
+{
+return 0x;
+}
+
 static const MemoryRegionOps vapic_ops = {
 .write = vapic_write,
+.read = vapic_read,
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
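
To illustrate the crash described in the commit message, here is a small user-space model of a callback table whose dispatcher calls `read` without checking it. All names (`struct region_ops_model`, `model_dispatch_read`, and so on) are hypothetical and are not QEMU's API. The value returned by `model_read()` is a placeholder chosen for this sketch, not a claim about what the patch returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table, loosely shaped like a region ops struct. */
struct region_ops_model {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t data, unsigned size);
};

static void model_write(void *opaque, uint64_t addr, uint64_t data,
                        unsigned size)
{
    (void)opaque; (void)addr; (void)data; (void)size;
}

/* Stub accessor: reads have no side effects and return a fixed value. */
static uint64_t model_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque; (void)addr; (void)size;
    return UINT64_MAX; /* placeholder value, an assumption of this sketch */
}

/*
 * Mirrors a dispatcher that assumes both callbacks are registered. With a
 * NULL 'read' the call below would jump through a null pointer; the assert
 * makes that precondition explicit instead of crashing.
 */
static uint64_t model_dispatch_read(const struct region_ops_model *ops,
                                    uint64_t addr, unsigned size)
{
    assert(ops != NULL && ops->read != NULL);
    return ops->read(NULL, addr, size);
}

/* Before the fix: only 'write' is set, so 'read' is implicitly NULL. */
static const struct region_ops_model write_only_ops = {
    .write = model_write,
};

/* After the fix: both callbacks are present. */
static const struct region_ops_model read_write_ops = {
    .read = model_read,
    .write = model_write,
};
```

Registering a trivial stub keeps the dispatcher unchanged; the alternative of adding a NULL check in the dispatcher would touch a hot path shared by every region.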
 



Re: [PATCH] kvm/ppc/booke64: Hard disable interrupts when entering the guest

2013-05-05 Thread Benjamin Herrenschmidt
On Fri, 2013-05-03 at 18:45 -0500, Scott Wood wrote:
 kvmppc_lazy_ee_enable() was causing interrupts to be soft-enabled
 (albeit hard-disabled) in kvmppc_restart_interrupt().  This led to
 warnings, and possibly breakage if the interrupt state was later saved
 and then restored (leading to interrupts being hard-and-soft enabled
 when they should be at least soft-disabled).
 
 Simply removing kvmppc_lazy_ee_enable() leaves interrupts only
 soft-disabled when we enter the guest, but they will be hard-disabled
 when we exit the guest -- without PACA_IRQ_HARD_DIS ever being set, so
 the local_irq_enable() fails to hard-enable.
 
 While we could just set PACA_IRQ_HARD_DIS after an exit to compensate,
 instead hard-disable interrupts before entering the guest.  This way,
 we won't have to worry about interactions if we take an interrupt
 during the guest entry code.  While I don't see any obvious
 interactions, it could change in the future (e.g. it would be bad if
 the non-hv code were used on 64-bit or if 32-bit guest lazy interrupt
 disabling, since the non-hv code changes IVPR among other things).

Shouldn't the interrupts be marked soft-enabled (even if hard disabled)
when entering the guest ?

Ie. The last stage of entry will hard enable, so they should be
soft-enabled too... if not, latency trackers will consider the whole
guest periods as interrupt disabled...

Now, kvmppc_lazy_ee_enable() seems to be clearly bogus to me. It will
unconditionally set soft_enabled and clear irq_happened from a
soft-disabled state, thus potentially losing a pending event.

Book3S HV seems to be keeping interrupts fully enabled all the way
until the asm hard disables, which would be fine except that I'm worried
we are racy vs. need_resched & signals.

One thing you may be able to do is call prep_irq_for_idle(). This will
tell you if something happened, giving you a chance to abort/re-enable
before you go to the guest.
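
The difference between the two approaches can be sketched in a user-space model of lazy interrupt state. This is not the powerpc implementation: `struct paca_model`, the `MODEL_*` flags and both functions are hypothetical simplifications, meant only to show how an unconditional enable can drop a latched event while a check-then-commit helper cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IRQ_HARD_DIS 0x01u /* hypothetical: hardware interrupts are off */
#define MODEL_IRQ_PENDING  0x02u /* hypothetical: event latched while soft-disabled */

/* Hypothetical per-CPU lazy interrupt state. */
struct paca_model {
    bool soft_enabled;         /* software view: interrupts enabled? */
    unsigned int irq_happened; /* flags recorded while soft-disabled */
};

/* The criticised pattern: enable unconditionally and forget what was latched. */
static void model_lazy_ee_enable(struct paca_model *p)
{
    assert(p != NULL);
    p->soft_enabled = true;
    p->irq_happened = 0;
}

/*
 * The suggested pattern: report whether anything other than the hard-disable
 * marker was latched. Returns false without changing state when an event is
 * pending, so the caller can abort, re-enable and let it be handled.
 */
static bool model_prep_for_entry(struct paca_model *p)
{
    assert(p != NULL);
    if (p->irq_happened & ~MODEL_IRQ_HARD_DIS)
        return false;
    p->irq_happened &= ~MODEL_IRQ_HARD_DIS;
    p->soft_enabled = true;
    return true;
}
```

In the model, calling `model_lazy_ee_enable()` with `MODEL_IRQ_PENDING` set silently discards the event, whereas `model_prep_for_entry()` leaves it in place and tells the caller to back out.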

Ben.




[v1][KVM][PATCH 1/1] kvm:ppc: enable doorbell exception with E500MC

2013-05-05 Thread Tiejun Chen
Actually E500MC also supports the doorbell exception, and CONFIG_PPC_E500MC
can cover BOOK3E/BOOK3E_64 as well.

Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
---
 arch/powerpc/kvm/booke.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..dc1f590 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -795,7 +795,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
kvmppc_fill_pt_regs(regs);
timer_interrupt(regs);
break;
-#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
+#if defined(CONFIG_PPC_E500MC)
case BOOKE_INTERRUPT_DOORBELL:
kvmppc_fill_pt_regs(regs);
doorbell_exception(regs);
-- 
1.7.9.5



[RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts

2013-05-05 Thread Tiejun Chen
For the external interrupt, the decrementer exception and the doorbell
exception, we also need to soft-disable interrupts while running the host
interrupt handlers, since the DO_KVM hook always skips EXCEPTION_COMMON
and so misses the original chance to do this via 'ints' (INTS_DISABLE).

Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
---
 arch/powerpc/kvm/bookehv_interrupts.S |9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index e8ed7d6..2fd62bf 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -33,6 +33,8 @@
 
 #ifdef CONFIG_64BIT
 #include <asm/exception-64e.h>
+#include <asm/hw_irq.h>
+#include <asm/irqflags.h>
 #else
 #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */
 #endif
@@ -469,6 +471,13 @@ _GLOBAL(kvmppc_resume_host)
PPC_LL  r3, HOST_RUN(r1)
mr  r5, r14 /* intno */
mr  r14, r4 /* Save vcpu pointer. */
+#ifdef CONFIG_64BIT
+   /* Should we soft-disable interrupts? */
+   andi.   r6, r5, BOOKE_INTERRUPT_EXTERNAL | BOOKE_INTERRUPT_DECREMENTER 
| BOOKE_INTERRUPT_DOORBELL
+   beq skip_soft_dis
+   SOFT_DISABLE_INTS(r7,r8)
+skip_soft_dis:
+#endif
bl  kvmppc_handle_exit
 
/* Restore vcpu pointer and the nonvolatiles we used. */
-- 
1.7.9.5



Re: [RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts

2013-05-05 Thread tiejun.chen

On 05/06/2013 11:10 AM, Tiejun Chen wrote:

For the external interrupt, the decrementer exception and the doorbell
exception, we also need to soft-disable interrupts while running the host
interrupt handlers, since the DO_KVM hook always skips EXCEPTION_COMMON
and so misses the original chance to do this via 'ints' (INTS_DISABLE).


Sorry, I forgot to send this to Ben.

Tiejun



Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
---
  arch/powerpc/kvm/bookehv_interrupts.S |9 +
  1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index e8ed7d6..2fd62bf 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -33,6 +33,8 @@

  #ifdef CONFIG_64BIT
  #include <asm/exception-64e.h>
+#include <asm/hw_irq.h>
+#include <asm/irqflags.h>
  #else
  #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */
  #endif
@@ -469,6 +471,13 @@ _GLOBAL(kvmppc_resume_host)
PPC_LL  r3, HOST_RUN(r1)
mr  r5, r14 /* intno */
mr  r14, r4 /* Save vcpu pointer. */
+#ifdef CONFIG_64BIT
+   /* Should we soft-disable interrupts? */
+   andi.   r6, r5, BOOKE_INTERRUPT_EXTERNAL | BOOKE_INTERRUPT_DECREMENTER 
| BOOKE_INTERRUPT_DOORBELL
+   beq skip_soft_dis
+   SOFT_DISABLE_INTS(r7,r8)
+skip_soft_dis:
+#endif
bl  kvmppc_handle_exit

/* Restore vcpu pointer and the nonvolatiles we used. */





[PATCH 0/2] vhost-net fix ubuf

2013-05-05 Thread Asias He
Asias He (2):
  vhost: Export vhost_dev_set_owner
  vhost-net: Free ubuf when vhost_dev_set_owner fails

 drivers/vhost/net.c   | 38 --
 drivers/vhost/vhost.c |  2 +-
 drivers/vhost/vhost.h |  1 +
 3 files changed, 34 insertions(+), 7 deletions(-)

-- 
1.8.1.4



[PATCH 1/2] vhost: Export vhost_dev_set_owner

2013-05-05 Thread Asias He
Signed-off-by: Asias He as...@redhat.com
---
 drivers/vhost/vhost.c | 2 +-
 drivers/vhost/vhost.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 749b5ab..de9441a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -344,7 +344,7 @@ static int vhost_attach_cgroups(struct vhost_dev *dev)
 }
 
 /* Caller should have device mutex */
-static long vhost_dev_set_owner(struct vhost_dev *dev)
+long vhost_dev_set_owner(struct vhost_dev *dev)
 {
struct task_struct *worker;
int err;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index b58f4ae..cc23bc4 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -135,6 +135,7 @@ struct vhost_dev {
 };
 
 long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int 
nvqs);
+long vhost_dev_set_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
 struct vhost_memory *vhost_dev_reset_owner_prepare(void);
 void vhost_dev_reset_owner(struct vhost_dev *, struct vhost_memory *);
-- 
1.8.1.4



[PATCH 2/2] vhost-net: Free ubuf when vhost_dev_set_owner fails

2013-05-05 Thread Asias He
Signed-off-by: Asias He as...@redhat.com
---
 drivers/vhost/net.c | 38 --
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a3645bd..354665a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -146,6 +146,19 @@ void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref *ubufs)
kfree(ubufs);
 }
 
+static void vhost_net_clear_ubuf_info(struct vhost_net *n)
+{
+
+   bool zcopy;
+   int i;
+
+   for (i = 0; i < n->dev.nvqs; ++i) {
+   zcopy = vhost_zcopy_mask & (0x1 << i);
+   if (zcopy)
+   kfree(n->vqs[i].ubuf_info);
+   }
+}
+
 int vhost_net_set_ubuf_info(struct vhost_net *n)
 {
bool zcopy;
@@ -1027,6 +1040,23 @@ static int vhost_net_set_features(struct vhost_net *n, 
u64 features)
return 0;
 }
 
+static long vhost_net_set_owner(struct vhost_net *n)
+{
+   int r;
+
+   mutex_lock(&n->dev.mutex);
+   r = vhost_net_set_ubuf_info(n);
+   if (r)
+   goto out;
+   r = vhost_dev_set_owner(&n->dev);
+   if (r)
+   vhost_net_clear_ubuf_info(n);
+   vhost_net_flush(n);
+out:
+   mutex_unlock(&n->dev.mutex);
+   return r;
+}
+
 static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
unsigned long arg)
 {
@@ -1055,19 +1085,15 @@ static long vhost_net_ioctl(struct file *f, unsigned 
int ioctl,
return vhost_net_set_features(n, features);
case VHOST_RESET_OWNER:
return vhost_net_reset_owner(n);
+   case VHOST_SET_OWNER:
+   return vhost_net_set_owner(n);
default:
mutex_lock(&n->dev.mutex);
-   if (ioctl == VHOST_SET_OWNER) {
-   r = vhost_net_set_ubuf_info(n);
-   if (r)
-   goto out;
-   }
r = vhost_dev_ioctl(&n->dev, ioctl, argp);
if (r == -ENOIOCTLCMD)
r = vhost_vring_ioctl(&n->dev, ioctl, argp);
else
vhost_net_flush(n);
-out:
mutex_unlock(&n->dev.mutex);
return r;
}
-- 
1.8.1.4



Re: [PATCH 3/4] vhost-net: Free ubuf when vhost_dev_ioctl fails

2013-05-05 Thread Asias He
On Sun, May 05, 2013 at 04:50:07PM +0300, Michael S. Tsirkin wrote:
 On Fri, May 03, 2013 at 02:25:17PM +0800, Asias He wrote:
  Free ubuf when vhost_dev_ioctl for VHOST_SET_OWNER fails.
  
  Signed-off-by: Asias He as...@redhat.com
  ---
   drivers/vhost/net.c | 20 ++--
   1 file changed, 18 insertions(+), 2 deletions(-)
  
  diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
  index b2f6b41..eb73217 100644
  --- a/drivers/vhost/net.c
  +++ b/drivers/vhost/net.c
  @@ -152,6 +152,19 @@ void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref 
  *ubufs)
  kfree(ubufs);
   }
   
  +static void vhost_net_clear_ubuf_info(struct vhost_net *n)
  +{
  +
  +   bool zcopy;
  +   int i;
  +
  +   for (i = 0; i < n->dev.nvqs; ++i) {
  +   zcopy = vhost_zcopy_mask & (0x1 << i);
  +   if (zcopy)
  +   kfree(n->vqs[i].ubuf_info);
  +   }
  +}
  +
   int vhost_net_set_ubuf_info(struct vhost_net *n)
   {
  bool zcopy;
  @@ -1069,10 +1082,13 @@ static long vhost_net_ioctl(struct file *f, 
  unsigned int ioctl,
  goto out;
  }
  r = vhost_dev_ioctl(&n->dev, ioctl, argp);
  -   if (r == -ENOIOCTLCMD)
  +   if (r == -ENOIOCTLCMD) {
  r = vhost_vring_ioctl(&n->dev, ioctl, argp);
  -   else
  +   } else {
  +   if (r < 0 && ioctl == VHOST_SET_OWNER)
  +   vhost_net_clear_ubuf_info(n);
  vhost_net_flush(n);
  +   }
 
 This is becoming too complex.
 Let's just export vhost_dev_set_owner from vhost.c
 and have a separate case statement for VHOST_SET_OWNER.

done.

 
 Also - could you please send a separate series
 with bugfixes, so I can apply for 3.10?
 Cleanups I will queue for 3.11.

done.

 Thanks!
 
   out:
  mutex_unlock(&n->dev.mutex);
  return r;
  -- 
  1.8.1.4

-- 
Asias


Re: [PATCH v17 RESEND] pvpanic: pvpanic device driver

2013-05-05 Thread Hu Tao
On Fri, May 03, 2013 at 06:59:18PM -0300, Marcelo Tosatti wrote:
 On Fri, May 03, 2013 at 10:47:10AM +0800, Hu Tao wrote:
  pvpanic device is a qemu simulated device through which guest panic
  event is sent to host.
  
  Signed-off-by: Hu Tao hu...@cn.fujitsu.com
  ---
   drivers/platform/x86/Kconfig   |   7 +++
   drivers/platform/x86/Makefile  |   2 +
   drivers/platform/x86/pvpanic.c | 115 
  +
   3 files changed, 124 insertions(+)
   create mode 100644 drivers/platform/x86/pvpanic.c
  
  diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
  index 3338437..527ed04 100644
  --- a/drivers/platform/x86/Kconfig
  +++ b/drivers/platform/x86/Kconfig
  @@ -781,4 +781,11 @@ config APPLE_GMUX
graphics as well as the backlight. Currently only backlight
control is supported by the driver.
   
  +config PVPANIC
  +   tristate "pvpanic device support"
  +   depends on ACPI
  +   ---help---
  + This driver provides support for pvpanic device, which is a qemu
  + simulated device through which guest panic event is sent to host.
  +
   endif # X86_PLATFORM_DEVICES
  diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
  index ace2b38..ef0ec74 100644
  --- a/drivers/platform/x86/Makefile
  +++ b/drivers/platform/x86/Makefile
  @@ -51,3 +51,5 @@ obj-$(CONFIG_INTEL_OAKTRAIL)  += intel_oaktrail.o
   obj-$(CONFIG_SAMSUNG_Q10)  += samsung-q10.o
   obj-$(CONFIG_APPLE_GMUX)   += apple-gmux.o
   obj-$(CONFIG_CHROMEOS_LAPTOP)  += chromeos_laptop.o
  +
  +obj-$(CONFIG_PVPANIC)   += pvpanic.o
  diff --git a/drivers/platform/x86/pvpanic.c b/drivers/platform/x86/pvpanic.c
  new file mode 100644
  index 000..81c95ec
  --- /dev/null
  +++ b/drivers/platform/x86/pvpanic.c
  @@ -0,0 +1,115 @@
  +/*
  + *  pvpanic.c - pvpanic Device Support
  + *
  + *  Copyright (C) 2013 Fujitsu.
  + *
  + *  This program is free software; you can redistribute it and/or modify
  + *  it under the terms of the GNU General Public License as published by
  + *  the Free Software Foundation; either version 2 of the License, or
  + *  (at your option) any later version.
  + *
  + *  This program is distributed in the hope that it will be useful,
  + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  + *  GNU General Public License for more details.
  + *
  + *  You should have received a copy of the GNU General Public License
  + *  along with this program; if not, write to the Free Software
  + *  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
  + */
  +
  +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
  +
  +#include <linux/kernel.h>
  +#include <linux/module.h>
  +#include <linux/init.h>
  +#include <linux/types.h>
  +#include <acpi/acpi_bus.h>
  +#include <acpi/acpi_drivers.h>
  +
  +MODULE_AUTHOR("Hu Tao <hu...@cn.fujitsu.com>");
  +MODULE_DESCRIPTION("pvpanic device driver");
  +MODULE_LICENSE("GPL");
  +
  +static int pvpanic_add(struct acpi_device *device);
  +static int pvpanic_remove(struct acpi_device *device);
  +
  +static const struct acpi_device_id pvpanic_device_ids[] = {
  +   { "QEMU0001", 0},
  +   { "", 0},
  +};
  +MODULE_DEVICE_TABLE(acpi, pvpanic_device_ids);
  +
  +#define PVPANIC_PANICKED   (1 << 0)
  +
  +static acpi_handle handle;
  +
  +static struct acpi_driver pvpanic_driver = {
  +   .name =     "pvpanic",
  +   .class =    "QEMU",
  +   .ids =      pvpanic_device_ids,
  +   .ops =      {
  +   .add =      pvpanic_add,
  +   .remove =   pvpanic_remove,
  +   },
  +   .owner =    THIS_MODULE,
  +};
  +
  +static void
  +pvpanic_send_event(unsigned int event)
  +{
  +   union acpi_object arg;
  +   struct acpi_object_list arg_list;
  +
  +   if (!handle)
  +   return;
  +
  +   arg.type = ACPI_TYPE_INTEGER;
  +   arg.integer.value = event;
  +
  +   arg_list.count = 1;
  +   arg_list.pointer = &arg;
  +
  +   acpi_evaluate_object(handle, "WRPT", &arg_list, NULL);
  +}
 
 Is it safe to call acpi_evaluate_object from a panic notifier? For
 example:
 
 - Has it been confirmed that no code invoked via acpi_evaluate_object can 
 panic() ?

Confirmed.

 - acpi_ex_enter_interpreter grabs a mutex. Is that path ever used?

Unfortunately yes. As far as I can tell, there are 2 places in the path that
grab a mutex: when searching the namespace for the method, and when executing
the method. I didn't find a non-blocking version of acpi_evaluate_object.
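
To make the concern concrete: in a panic path the current holder of a lock
may never run again, so a blocking acquire can hang forever. The following
is only a userspace model, not ACPI code. The names interp_lock,
notify_host and try_send_event are hypothetical, and a C11 atomic_flag
stands in for the interpreter mutex. It sketches a try-lock guard that
drops the event rather than blocking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ACPI interpreter mutex. */
static atomic_flag interp_lock = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for the host-visible side effect. */
static unsigned int last_event_seen_by_host;

static void notify_host(unsigned int event)
{
	/* In this model, events are non-zero bit masks. */
	assert(event != 0);
	last_event_seen_by_host = event;
}

/*
 * Try to deliver an event without ever blocking.
 * Returns true if delivered, false if the lock was already held
 * (in a real panic path the holder might never release it).
 */
static bool try_send_event(unsigned int event)
{
	/* test_and_set returns the previous value: true means "held". */
	if (atomic_flag_test_and_set(&interp_lock))
		return false;

	notify_host(event);
	atomic_flag_clear(&interp_lock);
	return true;
}
```

The trade-off in this model is that an event can be lost when the lock is
contended, which may or may not be acceptable for a panic notification.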



Re: [PATCH v4 4/6] KVM: MMU: fast invalid all shadow pages

2013-05-05 Thread Xiao Guangrong
On 05/04/2013 08:52 AM, Marcelo Tosatti wrote:
 On Sat, May 04, 2013 at 12:51:06AM +0800, Xiao Guangrong wrote:
 On 05/03/2013 11:53 PM, Marcelo Tosatti wrote:
 On Fri, May 03, 2013 at 01:52:07PM +0800, Xiao Guangrong wrote:
 On 05/03/2013 09:05 AM, Marcelo Tosatti wrote:

 +
 +/*
 + * Fast invalidate all shadow pages belonging to @slot.
 + *
 + * @slot != NULL means the invalidation is caused by the memslot specified
 + * by @slot being deleted; in this case, we should ensure that the rmap
 + * and lpage-info of @slot can not be used after calling the function.
 + *
 + * @slot == NULL means the invalidation is due to other reasons; we need
 + * not care about rmap and lpage-info since they are still valid after
 + * calling the function.
 + */
 +void kvm_mmu_invalid_memslot_pages(struct kvm *kvm,
 +   struct kvm_memory_slot *slot)
 +{
 +spin_lock(&kvm->mmu_lock);
 +kvm->arch.mmu_valid_gen++;
 +
 +/*
 + * All shadow pages are invalid, reset the large page info,
 + * then we can safely destroy the memslot, it is also good
 + * for large page used.
 + */
 +kvm_clear_all_lpage_info(kvm);

 Xiao,

 I understood it was agreed that simple mmu_lock lockbreak while
 avoiding zapping of newly instantiated pages upon a

   if(spin_needbreak)
   cond_resched_lock()

 cycle was enough as a first step? And then later introduce root zapping
 along with measurements.

 https://lkml.org/lkml/2013/4/22/544

 Yes, it is.

 See the changelog in 0/0:

  we use lock-break technique to zap all sptes linked on the
 invalid rmap, it is not very effective but good for the first step.

 Thanks!

 Sure, but what is up with zeroing kvm_clear_all_lpage_info(kvm) and
 zapping the root? Only lock-break technique along with generation number 
 was what was agreed.

 Marcelo,

 Please Wait... I am completely confused. :(

 Let's clarify zeroing kvm_clear_all_lpage_info(kvm) and zapping the root 
 first.
 Are these changes you wanted?

 void kvm_mmu_invalid_memslot_pages(struct kvm *kvm,
 struct kvm_memory_slot *slot)
 {
  spin_lock(&kvm->mmu_lock);
  kvm->arch.mmu_valid_gen++;

  /* Zero all root pages.*/
 restart:
  list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
          if (!sp->root_count)
                  continue;

          if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
                  goto restart;
  }

  /*
   * All shadow pages are invalid, reset the large page info,
   * then we can safely destroy the memslot, it is also good
   * for large page used.
   */
  kvm_clear_all_lpage_info(kvm);

  kvm_mmu_commit_zap_page(kvm, &invalid_list);
  spin_unlock(&kvm->mmu_lock);
 }

 static void rmap_remove(struct kvm *kvm, u64 *spte)
 {
  struct kvm_mmu_page *sp;
  gfn_t gfn;
  unsigned long *rmapp;

  sp = page_header(__pa(spte));
 +
 +   /* Do not let an invalid sp access its rmap. */
 +if (!sp_is_valid(sp))
 +return;
 +
  gfn = kvm_mmu_page_get_gfn(sp, spte - sp->spt);
  rmapp = gfn_to_rmap(kvm, gfn, sp->role.level);
  pte_list_remove(spte, rmapp);
 }

 If yes, here is the reason, which I mentioned before, why we can not do this:

 after calling kvm_mmu_invalid_memslot_pages(), the memslot->rmap will be
 destroyed.
 Later, if the host reclaims a page, the mmu-notifier handlers, ->invalidate_page
 and ->invalidate_range_start, can not find any spte using the host page, so
 Accessed/Dirty tracking for the host page is lost
 (kvm_set_pfn_accessed and kvm_set_pfn_dirty are not called properly).

 What's your idea?
 
 
 Step 1) Fix kvm_mmu_zap_all's behaviour: introduce lockbreak via
 spin_needbreak. Use generation numbers so that in case kvm_mmu_zap_all 
 releases mmu_lock and reacquires it again, only shadow pages 
 from the generation with which kvm_mmu_zap_all started are zapped (this
 guarantees forward progress and eventual termination).
 
 kvm_mmu_zap_generation()
   spin_lock(mmu_lock)
   int generation = kvm->arch.mmu_generation;
 
   for_each_shadow_page(sp) {
           if (sp->generation == generation)
                   zap_page(sp)
           if (spin_needbreak(mmu_lock)) {
                   kvm->arch.mmu_generation++;
                   cond_resched_lock(mmu_lock);
           }
   }
 
 kvm_mmu_zap_all()
   spin_lock(mmu_lock)
   for_each_shadow_page(sp) {
   if (spin_needbreak(mmu_lock)) {
   cond_resched_lock(mmu_lock);
   }
   }
 
 Use kvm_mmu_zap_generation for kvm_arch_flush_shadow_memslot.
 Use kvm_mmu_zap_all for kvm_mmu_notifier_release, kvm_destroy_vm.
 
 This addresses the main problem: excessively long hold times 
 of kvm_mmu_zap_all with very large guests.
 
 Do you see any problem with this logic? This was what i was thinking 
 we agreed.

No. I understand it and it can work.


RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-05-05 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Alexander Graf
 Sent: Friday, May 03, 2013 6:48 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
 
 
 On 03.05.2013, at 15:11, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, May 03, 2013 6:00 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421
  Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub
  support
 
 
  On 03.05.2013, at 13:08, Alexander Graf wrote:
 
 
 
  Am 03.05.2013 um 12:48 schrieb Bhushan Bharat-R65777 
  r65...@freescale.com:
 
  +static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu *vcpu)
  +{
  +if (!vcpu->arch.debug_active)
  +return;
  +
  +/* Disable all debug events and clear pending debug events */
  +mtspr(SPRN_DBCR0, 0x0);
  +kvmppc_clear_dbsr();
  +
  +/*
  + * Check whether the guest still needs debug resources; if not, then
  + * there is no need to restore guest context.
  + */
  +if (!vcpu->arch.shadow_dbg_reg.dbcr0)
  +return;
  +
  +/* Load Guest Context */
  +mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
  +mtspr(SPRN_DBCR2, vcpu->arch.shadow_dbg_reg.dbcr2);
  +#ifdef CONFIG_KVM_E500MC
  +mtspr(SPRN_DBCR4, vcpu->arch.shadow_dbg_reg.dbcr4);
 
  You need to make sure DBCR4 is 0 when you leave things back to
  normal user space. Otherwise guest debug can interfere with host 
  debug.
 
 
  ok
 
 
  +#endif
  +mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
  +mtspr(SPRN_IAC2, vcpu->arch.shadow_dbg_reg.iac[1]);
  +#if CONFIG_PPC_ADV_DEBUG_IACS > 2
  +mtspr(SPRN_IAC3, vcpu->arch.shadow_dbg_reg.iac[2]);
  +mtspr(SPRN_IAC4, vcpu->arch.shadow_dbg_reg.iac[3]);
  +#endif
  +mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
  +mtspr(SPRN_DAC2, vcpu->arch.shadow_dbg_reg.dac[1]);
  +
  +/* Enable debug events after other debug registers restored */
  +mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);
  +}
 
  All of the code above looks suspiciously similar to
  prime_debug_regs();. Can't we somehow reuse that?
 
  I think we can if
  - Save thread->debug_regs in local data structure
 
  Yes, it can even be on the stack.
 
  - Load vcpu->arch->debug_regs in thread->debug_regs
  - Call prime_debug_regs();
  - Restore thread->debug_regs from local save values in first step
 
  On heavyweight exit, based on the values on stack, yes.
 
  This is how I think we can save/restore debug context. Please
  correct if I am
  missing something.
 
  Sounds about right :)
 
  Actually, what happens if a guest breakpoint is set to a kernel
  address that happens to be within the scope of kvm code?
 
  You mean address of kvm code in guest or host?
 
  If host, we already mentioned that we do not support that. Right?
 
 QEMU wants to debug the guest at address 0xc123. kvm_run happens to be at
 that address. We switch the debug registers through prime_debug_regs. Will the
 host kernel receive a debug interrupt when it runs kvm_run()?

No.
On e500v2, we use DBCR1 and DBCR2 to not allow debug events when MSR.PR = 0.
On e500mc+, we use EPCR.DUVD to not allow debug events when in hypervisor mode.

-Bharat
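
The save/load/prime/restore sequence quoted earlier in this message can
be modelled in userspace as follows. This is a sketch only: struct
debug_regs_model is a hypothetical, reduced register set,
model_prime_debug_regs() merely imitates what prime_debug_regs() is
described as doing in the thread, and re-priming the hardware on exit is
an assumption of this model rather than something the thread states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced register set; the real structures hold more. */
struct debug_regs_model {
	unsigned long dbcr0;
	unsigned long iac1;
	unsigned long dac1;
};

/* Models thread->debug_regs (the host task's debug state). */
static struct debug_regs_model thread_debug_regs;
/* Models the hardware debug registers. */
static struct debug_regs_model hw_debug_regs;

/* Imitates prime_debug_regs(): program hardware from thread state. */
static void model_prime_debug_regs(void)
{
	hw_debug_regs = thread_debug_regs;
}

/*
 * Guest entry: save the host state into caller-provided storage
 * (which may live on the stack), load the guest state, prime hardware.
 */
static void model_load_guest_debug(const struct debug_regs_model *guest,
				   struct debug_regs_model *saved_host)
{
	assert(guest != NULL && saved_host != NULL);
	*saved_host = thread_debug_regs;
	thread_debug_regs = *guest;
	model_prime_debug_regs();
}

/* Heavyweight exit: put the host state back (assumed: re-prime too). */
static void model_restore_host_debug(const struct debug_regs_model *saved_host)
{
	assert(saved_host != NULL);
	thread_debug_regs = *saved_host;
	model_prime_debug_regs();
}
```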

 
 
 Alex
 


Re: [PATCH] kvm/ppc/booke64: Hard disable interrupts when entering the guest

2013-05-05 Thread Benjamin Herrenschmidt
On Fri, 2013-05-03 at 18:45 -0500, Scott Wood wrote:
 kvmppc_lazy_ee_enable() was causing interrupts to be soft-enabled
 (albeit hard-disabled) in kvmppc_restart_interrupt().  This led to
 warnings, and possibly breakage if the interrupt state was later saved
 and then restored (leading to interrupts being hard-and-soft enabled
 when they should be at least soft-disabled).
 
 Simply removing kvmppc_lazy_ee_enable() leaves interrupts only
 soft-disabled when we enter the guest, but they will be hard-disabled
 when we exit the guest -- without PACA_IRQ_HARD_DIS ever being set, so
 the local_irq_enable() fails to hard-enable.
 
 While we could just set PACA_IRQ_HARD_DIS after an exit to compensate,
 instead hard-disable interrupts before entering the guest.  This way,
 we won't have to worry about interactions if we take an interrupt
 during the guest entry code.  While I don't see any obvious
 interactions, it could change in the future (e.g. it would be bad if
 the non-hv code were used on 64-bit or if 32-bit guest lazy interrupt
 disabling, since the non-hv code changes IVPR among other things).

Shouldn't the interrupts be marked soft-enabled (even if hard disabled)
when entering the guest ?

Ie. The last stage of entry will hard enable, so they should be
soft-enabled too... if not, latency trackers will consider the whole
guest periods as interrupt disabled...

Now, kvmppc_lazy_ee_enable() seems to be clearly bogus to me. It will
unconditionally set soft_enabled and clear irq_happened from a
soft-disabled state, thus potentially losing a pending event.

Book3S HV seems to be keeping interrupts fully enabled all the way
until the asm hard disables, which would be fine except that I'm worried
we are racy vs. need_resched & signals.

One thing you may be able to do is call prep_irq_for_idle(). This will
tell you if something happened, giving you a chance to abort/re-enable
before you go to the guest.

Ben.
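
The difference between the two approaches discussed in this message can
be shown with a small userspace model. It is a sketch under assumptions:
struct lazy_irq_model and MODEL_IRQ_PENDING are hypothetical,
model_unconditional_enable() imitates the behaviour attributed to
kvmppc_lazy_ee_enable() above, and model_prep_for_guest() only follows
the spirit of the prep_irq_for_idle() suggestion; neither is the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state, loosely named after the thread's terms. */
struct lazy_irq_model {
	bool soft_enabled;         /* what local_irq_* callers see */
	bool hard_enabled;         /* models the hardware enable bit */
	unsigned int irq_happened; /* events latched while soft-disabled */
};

#define MODEL_IRQ_PENDING 0x1u /* hypothetical "interrupt latched" bit */

/*
 * The criticised pattern: force soft-enable and drop latched events.
 * Returns true if a pending event was lost.
 */
static bool model_unconditional_enable(struct lazy_irq_model *s)
{
	bool lost;

	assert(!s->soft_enabled); /* called from a soft-disabled state */
	lost = (s->irq_happened & MODEL_IRQ_PENDING) != 0;
	s->soft_enabled = true;
	s->irq_happened = 0;
	return lost;
}

/*
 * The suggested alternative: hard-disable first, then refuse to
 * proceed if something is pending so the caller can abort/re-enable.
 */
static bool model_prep_for_guest(struct lazy_irq_model *s)
{
	assert(!s->soft_enabled);
	s->hard_enabled = false;
	if (s->irq_happened & MODEL_IRQ_PENDING)
		return false; /* pending event is preserved for replay */
	s->irq_happened = 0;
	s->soft_enabled = true;
	return true;
}
```

In this model the second function never clears a latched event, which is
the property the unconditional variant lacks.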


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[v1][KVM][PATCH 1/1] kvm:ppc: enable doorbell exception with E500MC

2013-05-05 Thread Tiejun Chen
Actually, E500MC also supports the doorbell exception, and CONFIG_PPC_E500MC
can cover BOOK3E/BOOK3E_64 as well.

Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
---
 arch/powerpc/kvm/booke.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..dc1f590 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -795,7 +795,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
kvmppc_fill_pt_regs(regs);
timer_interrupt(regs);
break;
-#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
+#if defined(CONFIG_PPC_E500MC)
case BOOKE_INTERRUPT_DOORBELL:
kvmppc_fill_pt_regs(regs);
doorbell_exception(regs);
-- 
1.7.9.5



[RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts

2013-05-05 Thread Tiejun Chen
For the external interrupt, the decrementer exception and the doorbell
exception, we also need to soft-disable interrupts while running the host
interrupt handlers, since the DO_KVM hook always skips EXCEPTION_COMMON
and so misses the original chance to do this via 'ints' (INTS_DISABLE).

Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
---
 arch/powerpc/kvm/bookehv_interrupts.S |9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index e8ed7d6..2fd62bf 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -33,6 +33,8 @@
 
 #ifdef CONFIG_64BIT
 #include <asm/exception-64e.h>
+#include <asm/hw_irq.h>
+#include <asm/irqflags.h>
 #else
 #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */
 #endif
@@ -469,6 +471,13 @@ _GLOBAL(kvmppc_resume_host)
PPC_LL  r3, HOST_RUN(r1)
mr  r5, r14 /* intno */
mr  r14, r4 /* Save vcpu pointer. */
+#ifdef CONFIG_64BIT
+   /* Should we soft-disable interrupts? */
+   andi.   r6, r5, BOOKE_INTERRUPT_EXTERNAL | BOOKE_INTERRUPT_DECREMENTER | BOOKE_INTERRUPT_DOORBELL
+   beq skip_soft_dis
+   SOFT_DISABLE_INTS(r7,r8)
+skip_soft_dis:
+#endif
bl  kvmppc_handle_exit
 
/* Restore vcpu pointer and the nonvolatiles we used. */
-- 
1.7.9.5



Re: [RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts

2013-05-05 Thread tiejun.chen

On 05/06/2013 11:10 AM, Tiejun Chen wrote:

For the external interrupt, the decrementer exception and the doorbell
exception, we also need to soft-disable interrupts while running the host
interrupt handlers, since the DO_KVM hook always skips EXCEPTION_COMMON
and so misses the original chance to do this via 'ints' (INTS_DISABLE).


Sorry, I forgot to send this to Ben.

Tiejun





