Hi Mario,
On Thu, Apr 17 2014 at 2:32:22 am BST, Mario Smarduch m.smard...@samsung.com wrote:
Revised iteration after initial comments. Still just for ARMv7. I looked
at the ARMv8 code and, yes, it appears to reuse most of the ARMv7 fault
handling; I wasn't aware so much code was
On Thu, Apr 17 2014 at 2:33:17 am BST, Mario Smarduch m.smard...@samsung.com wrote:
Add a HYP API to invalidate all VM TLBs without passing the address
parameter that kvm_tlb_flush_vmid_ipa() uses. Hopefully this is a valid way
to do it. Tests show nothing is broken.
The address parameter is
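For illustration, such an API might look like the sketch below; the name
follows the existing kvm_tlb_flush_vmid_ipa() pattern, but the body is an
assumption, not the patch's code.

/* Sketch (assumed shape): flush every stage-2 TLB entry for this VM's
 * VMID via a HYP call, with no IPA argument, for callers that want to
 * invalidate the whole VM at once.
 */
static void kvm_tlb_flush_vmid(struct kvm *kvm)
{
	if (kvm)
		kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
}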
On Thu, Apr 17 2014 at 2:34:19 am BST, Mario Smarduch m.smard...@samsung.com wrote:
Add support for dirty bitmap management. Wanted to make it generic, but the
function does a couple of things differently from the x86 version.
Signed-off-by: Mario Smarduch m.smard...@samsung.com
---
On Thu, Apr 17 2014 at 2:34:51 am BST, Mario Smarduch m.smard...@samsung.com wrote:
This should be in an earlier patch, omitted by mistake.
Please fix this up before sending the patch series. It is not like it
would take very long to rebase and squash.
Signed-off-by: Mario Smarduch
On Thu, Apr 17 2014 at 2:34:39 am BST, Mario Smarduch m.smard...@samsung.com wrote:
Additional logic to handle second stage page faults during migration.
Primarily, page faults are prevented from creating huge pages.
Signed-off-by: Mario Smarduch m.smard...@samsung.com
---
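The logic being described presumably reduces to a check like the following
sketch in the stage-2 fault path; the helper shape is an assumption, not
the patch itself.

/* Sketch only: while a slot is being dirty-logged (migration live),
 * refuse huge mappings so writes fault and get logged at 4K granularity.
 */
static bool stage2_may_use_hugepage(struct kvm_memory_slot *slot,
				    bool fault_supports_hugepage)
{
	if (slot->dirty_bitmap)		/* dirty logging enabled on this slot */
		return false;
	return fault_supports_hugepage;
}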
This reverts commit 5befdc385ddb2d5ae8995ad89004529a3acf58fc.
Since we will allow TLB flushes outside of the mmu-lock in a later
patch.
Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
arch/x86/kvm/paging_tmpl.h | 7 +++
include/linux/kvm_host.h | 4 +---
virt/kvm/kvm_main.c
Use sp->role.level instead of @level, since @level is not taken from the
page table hierarchy.
There is no issue in the current code, since fast page fault currently only
fixes faults caused by dirty-log, which are always on the last level
(level = 1).
This patch makes the code more readable and
Relax the TLB flush condition, since we will write-protect the spte outside
of the mmu-lock. Note that lockless write-protection only marks a writable
spte read-only, and the spte can be writable only if both SPTE_HOST_WRITEABLE
and SPTE_MMU_WRITEABLE are set (which are tested by
Now we can flush all the TLBs outside of the mmu-lock without TLB corruption
when write-protecting the sptes, because:
- we have marked large sptes read-only instead of dropping them, which means
we just change the spte from writable to read-only, so we only need to care
about the case of changing
Currently, kvm zaps the large spte if write protection is needed, so a later
read can fault on that spte. Actually, we can make the large spte read-only
instead of making it non-present; the page fault caused by read access can
then be avoided.
The idea is from Avi:
| As I mentioned before,
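A sketch of the difference; PT_WRITABLE_MASK and mmu_spte_update() are
names from arch/x86/kvm/mmu.c, but this exact function is illustrative,
not the patch.

/* Sketch: write-protect a large spte by clearing its writable bit
 * instead of zapping it, so subsequent reads still hit the mapping
 * and only writes take a fault.
 */
static void wrprot_large_spte(u64 *sptep)
{
	u64 spte = *sptep;

	if (spte & PT_WRITABLE_MASK)
		mmu_spte_update(sptep, spte & ~PT_WRITABLE_MASK);
	/* old behaviour: drop_spte(), making even reads fault */
}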
Since Marcelo has agreed to the comment improvements in off-list mail, I
consider this his Ack. :) Please let me know if I misunderstood it.
This patchset is split from my previous patchset:
[PATCH v3 00/15] KVM: MMU: locklessly write-protect
that can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=74251
Bug ID: 74251
Summary: Assign I350 NIC VF and Intel 82599 NIC VF to win7/win8
32bit guest, the interface cannot get IP
Product: Virtualization
Version: unspecified
Kernel
On Tue, Mar 25, 2014 at 11:15:29AM +0100, Paolo Bonzini wrote:
On 24/03/2014 21:49, Christian Borntraeger wrote:
event_legacy_tracepoint:
+PE_NAME '-' PE_NAME ':' PE_NAME
+{
+struct parse_events_evlist *data = _data;
+struct list_head *list;
+char sys_name[strlen($1) +
On 17/04/14 13:32, Jiri Olsa wrote:
On Tue, Mar 25, 2014 at 11:15:29AM +0100, Paolo Bonzini wrote:
On 24/03/2014 21:49, Christian Borntraeger wrote:
event_legacy_tracepoint:
+PE_NAME '-' PE_NAME ':' PE_NAME
+{
+ struct parse_events_evlist *data = _data;
+ struct list_head *list;
+
On Thu, Apr 17, 2014 at 01:41:56PM +0200, Christian Borntraeger wrote:
On 17/04/14 13:32, Jiri Olsa wrote:
On Tue, Mar 25, 2014 at 11:15:29AM +0100, Paolo Bonzini wrote:
On 24/03/2014 21:49, Christian Borntraeger wrote:
event_legacy_tracepoint:
+PE_NAME '-' PE_NAME ':' PE_NAME
+{
On Mon, Mar 31, 2014 at 09:50:44PM +0300, Michael S. Tsirkin wrote:
With KVM, MMIO is much slower than PIO, due to the need to
do a page walk and emulation. But with EPT, it does not have to be: we
know the address from the VMCS, so if the address is unique, we can look
up the eventfd directly,
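For context, the eventfd-per-address plumbing is set up from userspace with
the existing KVM_IOEVENTFD ioctl, roughly as in this sketch; the doorbell
address and length are made up.

#include <linux/kvm.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>

/* Sketch: bind an eventfd to a guest MMIO doorbell so a write there
 * signals the fd instead of taking the full emulation path.
 */
int register_mmio_doorbell(int vm_fd)
{
	int efd = eventfd(0, 0);
	struct kvm_ioeventfd kick = {
		.addr  = 0xfe001000,	/* illustrative doorbell GPA */
		.len   = 4,
		.fd    = efd,
		.flags = 0,		/* MMIO; PIO would set KVM_IOEVENTFD_FLAG_PIO */
	};

	if (efd < 0 || ioctl(vm_fd, KVM_IOEVENTFD, &kick) < 0)
		return -1;
	return efd;
}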
Signed-off-by: Andrew Jones drjo...@redhat.com
---
x86/unittests.cfg | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 7930c026a38d6..d78fe0eafe2b6 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -156,5 +156,5 @@ extra_params
On Mon, Mar 24, 2014 at 09:49:00PM +0100, Christian Borntraeger wrote:
From: Alexander Yarygin yary...@linux.vnet.ibm.com
Trace events potentially can have a '-' in their trace system name,
e.g. kvm on s390 defines kvm-s390:* tracepoints.
tools/perf could not parse them, because there was no
Whilst our IO port is fixed at CPU physical address 0x0, changing
ARM_IOPORT_AREA should be all that's necessary to move it around in CPU
physical space (it will still be at 0x0 in the bus address space).
This patch ensures we subtract KVM_IOPORT_AREA from the faulting CPU
physical address when
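The translation being described amounts to something like this sketch
(KVM_IOPORT_AREA is from kvmtool's ARM headers; the helper itself is
assumed):

/* Sketch: convert a faulting CPU physical address in the IO window
 * into a bus port number, which starts at 0x0 in bus address space.
 */
static u16 ioport_from_phys_addr(u64 phys_addr)
{
	return (u16)(phys_addr - KVM_IOPORT_AREA);
}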
Now that the dust has settled on the devicetree bindings for the generic
PCI host controller in the Linux kernel, update the node generated by
kvmtool to match what mainline kernels will expect.
Signed-off-by: Will Deacon will.dea...@arm.com
---
tools/kvm/arm/fdt.c | 1 +
tools/kvm/arm/kvm.c |
This patch extracts the logic for the exchange of new and previous tail
code words into a new xchg_tail() function which can be optimized in a
later patch.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h |2 +
kernel/locking/qspinlock.c|
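The extracted helper plausibly looks like the cmpxchg loop below (a sketch
based on the surrounding qspinlock code, not the patch text); a later patch
can then replace the loop with a single 16-bit exchange.

/* Sketch: atomically publish our tail code word, preserving the
 * locked/pending bits, and return the previous value so the caller
 * can find its predecessor in the queue.
 */
static u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
	u32 old, new, val = atomic_read(&lock->val);

	for (;;) {
		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
		old = atomic_cmpxchg(&lock->val, val, new);
		if (old == val)
			break;
		val = old;
	}
	return old;
}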
In order to fully resolve the lock waiter preemption problem in virtual
guests, it is necessary to enable lock stealing in the lock waiters.
A simple test-and-set lock, however, has 2 main problems:
1) The constant spinning on the lock word puts a lot of cacheline
contention traffic on the
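For reference, a minimal test-and-set lock looks like this sketch; every
acquisition attempt writes the shared lock word, which is exactly the
cacheline traffic problem (and it is also completely unfair).

/* Illustrative test-and-set spinlock, not from the series. */
static inline void tas_spin_lock(atomic_t *lock)
{
	while (atomic_xchg(lock, 1)) {		/* write: bounces the cacheline */
		while (atomic_read(lock))	/* read-spin to limit write traffic */
			cpu_relax();
	}
}

static inline void tas_spin_unlock(atomic_t *lock)
{
	atomic_set(lock, 0);
}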
With the pending addition of more code to support unfair locks and
PV spinlocks, the complexity of the slowpath function increases to
the point that the number of scratch-pad registers in the x86-64
architecture is not enough, and so additional non-scratch-pad
registers will need to be used.
This patch introduces a new generic queue spinlock implementation that
can serve as an alternative to the default ticket spinlock. Compared
with the ticket spinlock, this queue spinlock should be almost as fair
as the ticket spinlock. It has about the same speed in the single-threaded
case, and it can be much
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/spinlock.h |4 ++--
arch/x86/kernel/kvm.c|2 +-
This patch adds base para-virtualization support to the queue
spinlock in the same way as was done in the PV ticket lock code. In
essence, the lock waiters will spin for a specified number of times
(QSPIN_THRESHOLD = 2^14) and then halt themselves. The queue head waiter,
unlike the other waiters,
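The spin-then-halt behaviour described above might look like the following
sketch; pv_halt_cpu() and the qnode field layout are assumptions standing
in for the patch's actual hooks.

#define QSPIN_THRESHOLD	(1 << 14)

/* Sketch: a non-head waiter spins on its own node up to QSPIN_THRESHOLD
 * iterations, then asks the hypervisor to halt this vCPU until kicked.
 */
static void pv_wait_node(struct qnode *node)
{
	int loop;

	for (loop = 0; loop < QSPIN_THRESHOLD; loop++) {
		if (ACCESS_ONCE(node->mcs.locked))
			return;
		cpu_relax();
	}
	pv_halt_cpu(node);	/* hypercall; waiter re-checks on wakeup */
}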
The simple unfair queue lock cannot completely solve the lock waiter
preemption problem as a preempted CPU at the front of the queue will
block forward progress in all the other CPUs behind it in the queue.
To allow those CPUs to move forward, it is necessary to enable lock
stealing for those lock
This patch adds the necessary KVM specific code to allow KVM to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Two KVM guests of 20 CPU cores (2 nodes) were created for performance
testing in one of the following three configurations:
1) Only 1 VM is active
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact
the lock can be stolen. Code is added for stolen lock detection.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 26
There is a problem in the current trylock_pending() function. When the
lock is free, but the pending bit holder hasn't grabbed the lock and
cleared the pending bit yet, the trylock_pending() function will fail.
As a result, the regular queuing code path will be used most of
the time even when there
Currently, atomic_cmpxchg() is used to get the lock. However, this is
not really necessary if there is more than one task in the queue and
the queue head doesn't need to reset the queue code word. For that case,
a simple write to set the lock bit is enough, as the queue head will
be the only one
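The simple-write variant presumably reduces to a plain byte store, as in
this sketch (it assumes __qspinlock also exposes a byte-sized locked field,
per the byte-level layout discussed in the series):

/* Sketch: the queue head is the only CPU that can set the lock bit
 * here, and the tail need not change, so a plain store suffices.
 */
static void set_locked(struct qspinlock *lock)
{
	struct __qspinlock *l = (void *)lock;

	ACCESS_ONCE(l->locked) = 1;
}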
When we allow for a max NR_CPUS < 2^14 we can optimize the pending
wait-acquire and the xchg_tail() operations.
By growing the pending bit to a byte, we reduce the tail to 16 bits.
This means we can use xchg16 for the tail part and do away with all
the repeated cmpxchg() operations.
This in turn
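With the 16-bit tail, the earlier cmpxchg loop can collapse to a single
exchange, roughly as sketched here:

/* Sketch: the tail now fits a halfword, so one xchg16 on l->tail
 * replaces the cmpxchg retry loop while leaving locked/pending alone.
 */
static u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
	struct __qspinlock *l = (void *)lock;

	return (u32)xchg(&l->tail, tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
}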
This patch enables the coexistence of both the PV qspinlock and
unfair lock. When both are enabled, however, only the lock fastpath
will perform lock stealing whereas the slowpath will have that disabled
to get the best of both features.
We also need to transition a CPU spinning too long in the
Locking is always an issue in a virtualized environment because of 2
different types of problems:
1) Lock holder preemption
2) Lock waiter preemption
One solution to the lock waiter preemption problem is to allow unfair
lock in a virtualized environment. In this case, a new lock acquirer
can
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
- Break the more complex patches into smaller ones to ease review effort.
- Fix a race condition in the PV qspinlock code.
v7->v8:
This patch makes the necessary changes at the x86 architecture
specific layer to enable the use of queue spinlock for x86-64. As
x86-32 machines are typically not multi-socket, the benefit of queue
spinlock may not be apparent there, so it is not enabled.
Currently, there is some
This patch adds the necessary XEN specific code to allow XEN to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/xen/spinlock.c | 146 +--
This patch modifies the para-virtualization (PV) infrastructure code
of the x86-64 architecture to support the PV queue spinlock. Three
new virtual methods are added to support PV qspinlock:
1) kick_cpu - schedule in a virtual CPU
2) halt_cpu - schedule out a virtual CPU
3) lockstat - update
Because the qspinlock needs to touch a second cacheline, add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h | 12
In order to support additional virtualization features like unfair lock
and para-virtualized spinlock, it is necessary to store additional
CPU specific data into the queue node structure. As a result, a new
qnode structure is created and the mcs_spinlock structure is now part
of the new structure.
On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit word
+ * Return: 1 if lock acquired, 0
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock,
u32 val)
node->next = NULL;
/*
+ * We touched a (possibly) cold cacheline; attempt the trylock once
+ * more in the hope someone
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+#if !defined(__LITTLE_ENDIAN) && !defined(__BIG_ENDIAN)
+#error Missing either LITTLE_ENDIAN or BIG_ENDIAN definition.
+#endif
This seems entirely superfluous, I don't think a kernel build will go
anywhere if either is missing.
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
@@ -48,6 +53,9 @@
* We can further change the first spinner to spin on a bit in the lock word
* instead of its node; whereby avoiding the need to carry a node from lock to
* unlock, and preserving API.
+ *
+ * N.B. The
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+	union {
+		atomic_t val;
+		struct {
+#ifdef __LITTLE_ENDIAN
+			u16	locked_pending;
+			u16	tail;
+#else
+			u16	tail;
+			u16	locked_pending;
+#endif
+		};
+	};
+};
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+static __always_inline void
+clear_pending_set_locked(struct qspinlock *lock, u32 val)
+{
+ struct __qspinlock *l = (void *)lock;
+
+ ACCESS_ONCE(l->locked_pending) = 1;
+}
@@ -157,8 +251,13 @@ static inline int
On Thu, Apr 17, 2014 at 11:03:58AM -0400, Waiman Long wrote:
There is a problem in the current trylock_pending() function. When the
lock is free, but the pending bit holder hasn't grabbed the lock and
cleared the pending bit yet, the trylock_pending() function will fail.
I remember seeing some
On Thu, Apr 17, 2014 at 11:03:59AM -0400, Waiman Long wrote:
kernel/locking/qspinlock.c | 61 +++
1 files changed, 44 insertions(+), 17 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 497da24..80fe9ee 100644
On Thu, Apr 17, 2014 at 03:33:55PM +0300, Michael S. Tsirkin wrote:
On Mon, Mar 31, 2014 at 09:50:44PM +0300, Michael S. Tsirkin wrote:
With KVM, MMIO is much slower than PIO, due to the need to
do a page walk and emulation. But with EPT, it does not have to be: we
know the address from the
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
- Break the more complex patches into smaller ones to ease review
On 04/17/2014 10:53 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
- Break the
https://bugzilla.kernel.org/show_bug.cgi?id=73331
Shesha shes...@gmail.com changed:
What|Removed |Added
CC||shes...@gmail.com
--- Comment
On 04/17/2014 11:42 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32
val)
node->next = NULL;
/*
+* We touched a (possibly) cold cacheline; attempt the
On 04/17/2014 11:50 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+#if !defined(__LITTLE_ENDIAN) && !defined(__BIG_ENDIAN)
+#error Missing either LITTLE_ENDIAN or BIG_ENDIAN definition.
+#endif
This seems entirely superfluous, I don't think a kernel build
On 04/17/2014 11:51 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
@@ -48,6 +53,9 @@
* We can further change the first spinner to spin on a bit in the lock word
* instead of its node; whereby avoiding the need to carry a node from lock to
*
Hello,
I had some basic questions regarding KVM, and would appreciate any help. :)
I have been reading about the KVM architecture, and as I understand
it, the guest shows up as a regular process on the host itself.
I had some questions around that:
1. Are the guest processes
On 04/17/2014 11:56 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+ union {
+ atomic_t val;
+ struct {
+#ifdef __LITTLE_ENDIAN
+ u16 locked_pending;
+
On 04/17/2014 11:58 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+static __always_inline void
+clear_pending_set_locked(struct qspinlock *lock, u32 val)
+{
+ struct __qspinlock *l = (void *)lock;
+
+ ACCESS_ONCE(l->locked_pending) = 1;
+}
@@
This series of patches fixes various scenarios in which KVM behavior does not
follow the x86 specification. Each patch actually deals with a separate bug.
These bugs can cause the guest to get stuck (i.e., make no progress), encounter
spurious injected exceptions, or cause guest code to misbehave. As
If a guest enables a performance counter but does not enable PMI, the
hypervisor currently does not reprogram the performance counter once it
overflows. As a result, the host performance counter is kept with the original
sampling period which was configured according to the value of the guest's
According to Intel specifications, PAE and non-PAE paging do not have any
reserved bits here. In long mode, regardless of PCIDE, only the high bits
(above the physical address width) are reserved.
Signed-off-by: Nadav Amit na...@cs.technion.ac.il
---
:100644 100644 7de069af.. e21aee9... M
The IN instruction is not affected by the REP prefix as INS is. Therefore, the
emulation should ignore the REP prefix as well. The current emulator
implementation tries to perform a writeback when an IN instruction with a REP
prefix is emulated. This causes it to perform a wrong memory write or a spurious
On 04/17/2014 12:36 PM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:58AM -0400, Waiman Long wrote:
There is a problem in the current trylock_pending() function. When the
lock is free, but the pending bit holder hasn't grabbed the lock and
cleared the pending bit yet, the trylock_pending()
On 04/17/2014 01:23 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
- Break the
On 04/17/2014 01:40 PM, Raghavendra K T wrote:
On 04/17/2014 10:53 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
MZ> So let's play the difference game with x86:
MZ> int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log
kvm_vm_ioctl_get_dirty_log() is now identical to the x86 version, so I moved
it to kvm_main.c; to make it generic, it's declared weak. Do I go into x86
and remove that function?
Or
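The arrangement being asked about is the standard weak-symbol pattern,
roughly as sketched here; the call into a common helper is an assumption.

/* virt/kvm/kvm_main.c -- generic version; __weak lets an architecture
 * supply its own kvm_vm_ioctl_get_dirty_log() that the linker prefers.
 */
int __weak kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
				      struct kvm_dirty_log *log)
{
	return kvm_get_dirty_log(kvm, log);	/* assumed common helper */
}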
When using the address-size override prefix with string instructions in
long mode, ESI/EDI/ECX are zero-extended if they are affected by the
instruction (incremented/decremented). Currently, the KVM emulator does not
do so. In addition, although it is not well documented, when the address
override
If EFER.LMA is off, cs.l does not determine execution mode.
Currently, the emulation engine assumes otherwise.
Signed-off-by: Nadav Amit na...@cs.technion.ac.il
---
:100644 100644 f4d9839... c99f7eb... M arch/x86/kvm/x86.c
arch/x86/kvm/x86.c |2 +-
1 file changed, 1 insertion(+), 1
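The rule being enforced reduces to something like this sketch; the
constants are the emulator's mode values, but the helper itself is
illustrative.

/* Sketch: cs.l only selects 64-bit mode when EFER.LMA is set;
 * otherwise the D bit picks between 32- and 16-bit modes.
 */
static int emul_mode(u64 efer, bool cs_l, bool cs_db)
{
	if ((efer & EFER_LMA) && cs_l)
		return X86EMUL_MODE_PROT64;
	return cs_db ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
}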
On 16/04/2014 18:52, Marcelo Tosatti wrote:
How about handling VM-entry error due to invalid state with
vmx->emulation_required = true;
continue to main vcpu loop;
What would reset it to false though? None of the places that call
emulation_required() is a hot path right now, and this