On 9/14/23 03:32, Leonardo Bras wrote:
On Tue, Sep 12, 2023 at 09:08:34AM +0800, Guo Ren wrote:
On Mon, Sep 11, 2023 at 11:34 PM Waiman Long wrote:
On 9/10/23 04:29, guo...@kernel.org wrote:
From: Guo Ren
Allow cmdline to force the kernel to use queued_spinlock when
On 9/13/23 14:54, Palmer Dabbelt wrote:
On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sor...@fastmail.com wrote:
On Wed, Aug 2, 2023, at 12:46 PM, guo...@kernel.org wrote:
From: Guo Ren
According to qspinlock requirements, RISC-V provides a weak LR/SC
forward progress guarantee which does not sa
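For context, this is the shape of the LR/SC retry loop behind cmpxchg on RISC-V (a sketch modeled on arch/riscv/include/asm/cmpxchg.h); qspinlock assumes such a loop eventually succeeds, which is exactly the forward-progress guarantee at issue:

static inline u32 cmpxchg32_sketch(volatile u32 *ptr, u32 old, u32 new)
{
        u32 ret, rc;

        __asm__ __volatile__ (
                "0:     lr.w %0, %2\n"          /* load-reserved */
                "       bne  %0, %z3, 1f\n"     /* value changed: bail out */
                "       sc.w %1, %z4, %2\n"     /* store-conditional */
                "       bnez %1, 0b\n"          /* SC failed: retry, possibly forever */
                "1:\n"
                : "=&r" (ret), "=&r" (rc), "+A" (*ptr)
                : "rJ" (old), "rJ" (new)
                : "memory");
        return ret;
}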
On 9/13/23 08:52, Guo Ren wrote:
On Wed, Sep 13, 2023 at 4:55 PM Leonardo Bras wrote:
On Tue, Sep 12, 2023 at 09:10:08AM +0800, Guo Ren wrote:
On Mon, Sep 11, 2023 at 9:03 PM Waiman Long wrote:
On 9/10/23 23:09, Guo Ren wrote:
On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote:
On 9/10
On 9/10/23 04:29, guo...@kernel.org wrote:
From: Guo Ren
Allow cmdline to force the kernel to use queued_spinlock when
CONFIG_RISCV_COMBO_SPINLOCKS=y.
Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/riscv/kernel/setup.c
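A minimal sketch of what such a boot-time switch can look like; the parameter name, variable, and wiring into the combo-spinlock selection are assumptions based on the diffstat, not the exact patch:

static bool force_qspinlock __initdata;

static int __init force_queued_spinlock(char *p)
{
        /* combo-spinlock setup would check this before SMP bringup */
        force_qspinlock = true;
        return 0;
}
early_param("qspinlock", force_queued_spinlock);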
On 9/10/23 23:09, Guo Ren wrote:
On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote:
On 9/10/23 04:28, guo...@kernel.org wrote:
From: Guo Ren
The goal of xchg_tail is to write the tail into the lock value, so
adding prefetchw can help the next cmpxchg step, which may
decrease the cmpxchg
On 9/10/23 04:28, guo...@kernel.org wrote:
From: Guo Ren
The goal of xchg_tail is to write the tail into the lock value, so
adding prefetchw can help the next cmpxchg step, which may
decrease the number of cmpxchg retry loops in xchg_tail. Some processors may
utilize this feature to give a forward gu
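The generic xchg_tail() for large NR_CPUS is a cmpxchg loop, so the proposed hint amounts to the following sketch; the prefetchw placement is the assumption here, while the loop shape follows kernel/locking/qspinlock.c:

#include <linux/prefetch.h>

static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
        u32 old, new, val = atomic_read(&lock->val);

        prefetchw(&lock->val);  /* proposed: grab the line for write early */
        for (;;) {
                new = (val & _Q_LOCKED_PENDING_MASK) | tail;
                old = atomic_cmpxchg_relaxed(&lock->val, val, new);
                if (old == val)
                        break;
                val = old;      /* lost the race: retry with the fresh value */
        }
        return old;
}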
On 8/11/23 20:24, Guo Ren wrote:
On Sat, Aug 12, 2023 at 4:42 AM Waiman Long wrote:
On 8/2/23 12:47, guo...@kernel.org wrote:
From: Guo Ren
The pv_ops mechanism belongs to x86's custom infrastructure; clean up
cna_configure_spin_lock_slowpath() with standard code instead. This is
preparation for riscv
On 8/2/23 12:47, guo...@kernel.org wrote:
From: Guo Ren
The pv_ops mechanism belongs to x86's custom infrastructure; clean up
cna_configure_spin_lock_slowpath() with standard code instead. This is
preparation for RISC-V support of CNA qspinlock.
CNA qspinlock has not been merged into mainline yet. I will su
On 8/2/23 12:46, guo...@kernel.org wrote:
From: Guo Ren
The combo spinlock can support both queued and ticket spinlocks in one
Linux image, selected at boot time via the errata mechanism. Here is the
function size (bytes) comparison table:
TYPE: COMBO | TICKET | QUEUED
arch_spin_loc
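A combo lock of this kind typically dispatches through a boot-time static branch; a minimal sketch, where the key name, its polarity, and the ticket_spin_lock() helper are assumptions:

DECLARE_STATIC_KEY_TRUE(combo_qspinlock_key);   /* flipped once during boot */

static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
        if (static_branch_likely(&combo_qspinlock_key))
                queued_spin_lock(lock);
        else
                ticket_spin_lock(lock);
}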
On 8/2/23 12:46, guo...@kernel.org wrote:
\
diff --git a/arch/riscv/include/asm/spinlock.h
b/arch/riscv/include/asm/spinlock.h
new file mode 100644
index ..c644a92d4548
--- /dev/null
+++ b/arch/riscv/include/asm/spinlock.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2
…vcpu_is_preempted(node_cpu(node->prev
+ vcpu_is_preempted_node(node)))
return true;
/* unqueue */
Reviewed-by: Waiman Long
On 6/28/22 08:54, Guo Hui wrote:
On x86 physical machines, the instructions that set up the vcpu_is_preempted()
function parameter are redundant, causing multi-core Unixbench
performance to drop by about 4% to 5%.
The C function is as follows:
static bool vcpu_is_
On 6/27/22 10:27, Guo Hui wrote:
On x86 physical machines, the instructions that set up the vcpu_is_preempted()
function parameter are redundant, causing multi-core Unixbench
performance to drop by about 4% to 5%.
The C function is as follows:
static bool vcpu_is_
On 6/27/22 01:54, Guo Hui wrote:
Thank you very much, Longman. My patch does as you said: it only disables
node_cpu on x86 and keeps it enabled on the arm64, powerpc, and s390
architectures; the code is in arch/x86/kernel/paravirt-spinlocks.c:
DECLARE_STATIC_KEY_FALSE(preemted_key);
static_branch_enab
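Putting the quoted pieces together, the idea is roughly the following sketch; the helper name is made up for illustration, and preemted_key keeps the spelling from the posted patch:

DEFINE_STATIC_KEY_FALSE(preemted_key);  /* enabled on x86 bare metal */

/* hypothetical helper: skip the pv check when the key is enabled */
static inline bool prev_vcpu_is_preempted(struct optimistic_spin_node *node)
{
        return !static_branch_unlikely(&preemted_key) &&
               vcpu_is_preempted(node_cpu(node->prev));
}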
On 6/26/22 22:13, Guo Hui wrote:
On x86 physical machines, the instructions that set up the vcpu_is_preempted()
function parameter are redundant, causing multi-core Unixbench
performance to drop by about 4% to 5%.
The C function is as follows:
static bool vcpu_is_
On 7/25/20 1:26 PM, Peter Zijlstra wrote:
On Fri, Jul 24, 2020 at 03:10:59PM -0400, Waiman Long wrote:
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have
On 7/24/20 3:10 PM, Waiman Long wrote:
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info
qspinlock patch?
I will
create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
That patch series looks good to me. Thanks for working on this.
For the series,
Acked-by: Waiman Long
On 7/24/20 9:14 AM, Nicholas Piggin wrote:
This implements smp_cond_load_relaed with the slowpath busy loop using the
Nit: "smp_cond_load_relaxed"
Cheers,
Longman
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
I will have to update the patch to fix the
On 7/23/20 3:58 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote:
On 7/23/20 2:47 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch
On 7/23/20 2:47 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
I will have to update the patch to fix the reported 0-day test problem, but
I want to collect other feedback
On 7/23/20 10:00 AM, Peter Zijlstra wrote:
On Thu, Jul 09, 2020 at 12:06:13PM -0400, Waiman Long wrote:
We don't really need to do a pv_spinlocks_init() if pv_kick() isn't
supported.
Waiman, if you cannot explain how not having kick is a sane thing, what
are you saying here?
The c
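The point under debate reduces to a guard like this sketch, where pv_kick_supported() is a hypothetical predicate, not an existing API:

void __init pv_spinlocks_init(void)
{
        if (!pv_kick_supported())       /* hypothetical: no way to kick a vCPU */
                return;                 /* keep the native slowpath */
        __pv_init_lock_hash();
}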
On 7/23/20 9:30 AM, Nicholas Piggin wrote:
I would prefer to extract the pending bit handling code into a
separate helper function that can be overridden by the arch code
instead of breaking the slowpath into 2 pieces.
You mean have the arch provide a queued_spin_lock_slowpath_pending
f
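Concretely, the suggestion amounts to a weak/overridable helper along these lines; a sketch whose generic body mirrors the pending-wait step already in kernel/locking/qspinlock.c:

#ifndef queued_spin_lock_slowpath_pending
static __always_inline u32
queued_spin_lock_slowpath_pending(struct qspinlock *lock, u32 val)
{
        /* wait for an in-flight pending->locked handover to finish */
        if (val == _Q_PENDING_VAL)
                val = atomic_cond_read_relaxed(&lock->val,
                                               (VAL != _Q_PENDING_VAL));
        return val;
}
#endif  /* an arch header can define its own version instead */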
On 7/21/20 7:08 AM, Nicholas Piggin wrote:
diff --git a/arch/powerpc/include/asm/qspinlock.h
b/arch/powerpc/include/asm/qspinlock.h
index b752d34517b3..26d8766a1106 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -31,16 +31,57 @@ static inline void
arch/powerpc/platforms/pseries/Kconfig | 5 ++
arch/powerpc/platforms/pseries/setup.c | 6 +-
include/asm-generic/qspinlock.h | 2 +
Another ack?
I am OK with adding the #ifdef around queued_spin_lock().
Acked-by: Waiman Long
diff --git a/arch/powerpc
On 7/8/20 7:50 PM, Waiman Long wrote:
On 7/8/20 1:10 AM, Nicholas Piggin wrote:
Excerpts from Waiman Long's message of July 8, 2020 1:33 pm:
On 7/7/20 1:57 AM, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few param
On 7/8/20 4:41 AM, Peter Zijlstra wrote:
On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.
Can you clarify? The slow path is already in use on ARM64 which is weak
On 7/8/20 4:32 AM, Peter Zijlstra wrote:
On Tue, Jul 07, 2020 at 11:33:45PM -0400, Waiman Long wrote:
From 5d7941a498935fb225b2c7a3108cbf590114c3db Mon Sep 17 00:00:00 2001
From: Waiman Long
Date: Tue, 7 Jul 2020 22:29:16 -0400
Subject: [PATCH 2/9] locking/pvqspinlock: Introduce
On 7/8/20 1:10 AM, Nicholas Piggin wrote:
Excerpts from Waiman Long's message of July 8, 2020 1:33 pm:
On 7/7/20 1:57 AM, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.
We don't have a good alternate
…result?
Thanks,
Longman
From 161e545523a7eb4c42c145c04e9a5a15903ba3d9 Mon Sep 17 00:00:00 2001
From: Waiman Long
Date: Tue, 7 Jul 2020 20:46:51 -0400
Subject: [PATCH 1/9] locking/pvqspinlock: Code relocation and extraction
Move pv_kick_node() and the unlock functions up and extract out the hash
and lock code from pv_wait
On 7/6/20 12:35 AM, Nicholas Piggin wrote:
v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).
Thanks,
Nick
Nicholas Piggin (6):
powerpc/powernv: must include hvcall.h to get PAPR defines
powerpc/pseries: move some PAPR paravirt functions to their own file
powe
On 7/3/20 3:35 AM, Nicholas Piggin wrote:
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/paravirt.h | 28 ++
arch/powerpc/include/asm/qspinlock.h | 55 +++
arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++
arch/powerpc/platforms/p
On 7/2/20 12:15 PM, kernel test robot wrote:
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on tip/locking/core v5.8-rc3 next-20200702]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when subm
On 7/2/20 3:48 AM, Nicholas Piggin wrote:
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/paravirt.h | 23
arch/powerpc/include/asm/qspinlock.h | 55 +++
arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++
arch/powerpc/platforms/pse
On 6/16/20 2:53 PM, Joe Perches wrote:
On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote:
v4:
- Break out the memzero_explicit() change as suggested by Dan Carpenter
so that it can be backported to stable.
- Drop the "crypto: Remove unnecessary memzero_explicit()"
On 6/16/20 2:09 PM, Andrew Morton wrote:
On Tue, 16 Jun 2020 11:43:11 -0400 Waiman Long wrote:
As said by Linus:
A symmetric naming is only helpful if it implies symmetries in use.
Otherwise it's actively misleading.
In "kzalloc()", the z is meaningful and an importa
Acked-by: David Howells
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Signed-off-by: Waiman Long
---
arch/s390/crypto/prng.c | 4 +--
arch/x86/power/hibernate.c | 2 +-
crypto/adiantum.c | 2 +-
cry
Acked-by: Michal Hocko
Signed-off-by: Waiman Long
---
mm/slab_common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9e72ba224175..37d48a56431d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1726,7 +1726,7 @@ void kz
…optimizer, especially if LTO is
used. Instead, the new kfree_sensitive() uses memzero_explicit() which
won't get compiled out.
Waiman Long (2):
mm/slab: Use memzero_explicit() in kzfree()
mm, treewide: Rename kzfree() to kfree_sensitive()
arch/s390/crypto/prng.c | 4
On 6/16/20 10:26 AM, Dan Carpenter wrote:
Last time you sent this we couldn't decide which tree it should go
through. Either the crypto tree or through Andrew seems like the right
thing to me.
Also the other issue is that it risks breaking things if people add
new kzfree() instances while we ar
On 6/16/20 10:48 AM, David Sterba wrote:
On Mon, Jun 15, 2020 at 09:57:18PM -0400, Waiman Long wrote:
In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc()
was incorrectly paired with kzfree(). According to David Sterba, there
isn't any sensitive information i
On 6/15/20 11:30 PM, Eric Biggers wrote:
On Mon, Jun 15, 2020 at 09:57:16PM -0400, Waiman Long wrote:
The kzfree() function is normally used to clear some sensitive
information, like encryption keys, in the buffer before freeing it back
to the pool. memset() is currently used for the buffer
…kfree() instead.
Reported-by: David Sterba
Signed-off-by: Waiman Long
---
fs/btrfs/ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f1dd9e4271e9..e8f7c5f00894 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2692,7 +2692,7
…clearing isn't totally safe either, as the compiler
may compile out the clearing in its optimizer, especially if LTO is
used. Instead, the new kfree_sensitive() uses memzero_explicit() which
won't get compiled out.
Waiman Long (3):
mm/slab: Use memzero_explicit() in kzfree()
mm, treewide
especially if LTO is being used. To make sure that this
optimization will not happen, memzero_explicit(), which was introduced
in v3.18, is now used in kzfree() to do the clearing.
Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
Cc: sta...@vger.kernel.org
Signed-off-by: Waiman Long
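For reference, the fix boils down to this; roughly the shape of the function after the change, with the ZERO_OR_NULL_PTR handling as in mm/slab_common.c:

void kzfree(const void *p)
{
        size_t ks;
        void *mem = (void *)p;

        if (unlikely(ZERO_OR_NULL_PTR(mem)))
                return;
        ks = ksize(mem);
        memzero_explicit(mem, ks);      /* cannot be elided by the optimizer */
        kfree(mem);
}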
On 6/15/20 2:07 PM, Dan Carpenter wrote:
On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 23c7500eea7d..c08bc7eb20bd 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1707,17 +1707,17 @@ void *krealloc(const void *p
On 4/14/20 3:16 PM, Michal Suchánek wrote:
> On Tue, Apr 14, 2020 at 12:24:36PM -0400, Waiman Long wrote:
>> On 4/14/20 2:08 AM, Christophe Leroy wrote:
>>>
>>> Le 14/04/2020 à 00:28, Waiman Long a écrit :
>>>> Since kfree_sensitive() will do an implicit me
On 4/14/20 8:48 AM, David Sterba wrote:
> On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:
>> fs/btrfs/ioctl.c | 2 +-
>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 40b729dce91c..eab3f8510426 100644
>> ---
On 4/14/20 2:08 AM, Christophe Leroy wrote:
>
>
> Le 14/04/2020 à 00:28, Waiman Long a écrit :
>> Since kfree_sensitive() will do an implicit memzero_explicit(), there
>> is no need to call memzero_explicit() before it. Eliminate those
>> memzero_explicit() calls and simplify
Signed-off-by: Waiman Long
---
.../allwinner/sun8i-ce/sun8i-ce-cipher.c | 19 +-
.../allwinner/sun8i-ss/sun8i-ss-cipher.c | 20 +--
drivers/crypto/amlogic/amlogic-gxl-cipher.c | 12 +++
drivers/crypto/inside-secure/safexcel_hash.c | 3 +--
4 files changed, 14
On 4/13/20 5:31 PM, Joe Perches wrote:
> On Mon, 2020-04-13 at 17:15 -0400, Waiman Long wrote:
>> Since kfree_sensitive() will do an implicit memzero_explicit(), there
>> is no need to call memzero_explicit() before it. Eliminate those
>> memzero_explicit() calls and simplify the ca
The renaming is done by using the command sequence:
git grep -w --name-only kzfree |\
xargs sed -i 's/\bkzfree\b/kfree_sensitive/'
followed by some editing of the kfree_sensitive() kerneldoc and the
use of memzero_explicit() instead of memset().
Suggested-by: Joe Perches
Since kfree_sensitive() will do an implicit memzero_explicit(), there
is no need to call memzero_explicit() before it. Eliminate those
memzero_explicit() calls and simplify the call sites.
Signed-off-by: Waiman Long
---
.../crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 15 +++
.../crypto
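The call-site simplification the series performs looks like this sketch; the buffer and length names are invented for illustration:

/* before: redundant explicit clear ahead of the free */
memzero_explicit(keybuf, keylen);
kfree_sensitive(keybuf);

/* after: kfree_sensitive() already zeroes the whole allocation */
kfree_sensitive(keybuf);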
…the compiler
may compile out the clearing in its optimizer. Instead, the new
kfree_sensitive() uses memzero_explicit() which won't get compiled out.
Waiman Long (2):
mm, treewide: Rename kzfree() to kfree_sensitive()
crypto: Remove unnecessary memzero_explicit()
arch/s390/cry
On 04/01/2019 02:38 AM, Juergen Gross wrote:
> On 25/03/2019 19:03, Waiman Long wrote:
>> On 03/25/2019 12:40 PM, Juergen Gross wrote:
>>> On 25/03/2019 16:57, Waiman Long wrote:
>>>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted()
On 03/25/2019 12:40 PM, Juergen Gross wrote:
> On 25/03/2019 16:57, Waiman Long wrote:
>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted()
>> might panic the kernel in a VM guest. For example,
>>
>> [2.531077] Oops: [#1] SMP PTI
…__raw_callee_save___kvm_vcpu_is_preempted+0x0/0x20
To guard against this kind of kernel panic, a check is added to
pv_vcpu_is_preempted() to make sure that no invalid cpu number will
be used.
Signed-off-by: Waiman Long
---
arch/x86/include/asm/paravirt.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/include/asm
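The guard being described is essentially this sketch; the PVOP call form varies by kernel version, and nr_cpu_ids is the natural bound:

static inline bool pv_vcpu_is_preempted(long cpu)
{
        /* reject invalid cpu numbers before indexing per-cpu data */
        if ((unsigned long)cpu >= nr_cpu_ids)
                return false;
        return PVOP_CALLEE1(bool, lock.vcpu_is_preempted, cpu);
}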
On 11/01/2017 06:01 PM, Boris Ostrovsky wrote:
> On 11/01/2017 04:58 PM, Waiman Long wrote:
>> +/* TODO: To be removed in a future kernel version */
>> static __init int xen_parse_nopvspin(char *arg)
>> {
>> -xen_pvspin = false;
>> +pr_warn("xen_n
With the new pvlock_type kernel parameter, xen_nopvspin is no longer
needed. This patch deprecates the xen_nopvspin parameter by removing
its documentation and treating it as an alias of "pvlock_type=queued".
Signed-off-by: Waiman Long
---
Documentation/admin-guide/kernel-parameter
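A sketch of the deprecation shim, assuming the pvlock_type machinery from patch 1; the pv_spinlock_type variable and locktype_queued enum are assumptions:

static __init int xen_parse_nopvspin(char *arg)
{
        pr_warn("xen_nopvspin is deprecated, replace it with \"pvlock_type=queued\"\n");
        if (!pv_spinlock_type)
                pv_spinlock_type = locktype_queued;
        return 0;
}
early_param("xen_nopvspin", xen_parse_nopvspin);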
Patch 2 deprecates Xen's xen_nopvspin parameter as it is no longer
needed.
Waiman Long (2):
x86/paravirt: Add kernel parameter to choose paravirt lock type
x86/xen: Deprecate xen_nopvspin
Documentation/admin-guide/kernel-parameters.txt | 11 ---
arch/x86/include/asm/paravirt.h
…this new parameter
in determining if pvqspinlock should be used. The parameter, however,
will override Xen's xen_nopvspin in terms of disabling the unfair lock.
Signed-off-by: Waiman Long
---
Documentation/admin-guide/kernel-parameters.txt | 7 +
arch/x86/include/asm/paravirt.h
On 11/01/2017 03:01 PM, Boris Ostrovsky wrote:
> On 11/01/2017 12:28 PM, Waiman Long wrote:
>> On 11/01/2017 11:51 AM, Juergen Gross wrote:
>>> On 01/11/17 16:32, Waiman Long wrote:
>>>> Currently, there are 3 different lock types that can be chosen
On 11/01/2017 11:51 AM, Juergen Gross wrote:
> On 01/11/17 16:32, Waiman Long wrote:
>> Currently, there are 3 different lock types that can be chosen for
>> the x86 architecture:
>>
>> - qspinlock
>> - pvqspinlock
>> - unfair lock
>>
>> One
…the guest can decide whether to use paravirtualized
>> spinlocks, the current fallback to the unfair test-and-set scheme, or
>> to mimic the bare metal behavior.
>>
>> V3:
>> - remove test for hypervisor environment from virt_spin_lock() as
>> suggested by Waiman Long
>
On 09/06/2017 12:04 PM, Peter Zijlstra wrote:
> On Wed, Sep 06, 2017 at 11:49:49AM -0400, Waiman Long wrote:
>>> #define virt_spin_lock virt_spin_lock
>>> static inline bool virt_spin_lock(struct qspinlock *lock)
>>> {
>>> + if (!s
>
> if (!xen_pvspin) {
> printk(KERN_DEBUG "xen: PV spinlocks disabled\n");
> + static_branch_disable(&virt_spin_lock_key);
> return;
> }
> printk(KERN_DEBUG "xen: PV spinlocks enabled\n"
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index 294294c71ba4..838d235b87ef 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -76,6 +76,10 @@
> #define MAX_NODES 4
> #endif
>
> +#ifdef CONFIG_PARAVIRT
> +DEFI
On 09/06/2017 09:06 AM, Peter Zijlstra wrote:
> On Wed, Sep 06, 2017 at 08:44:09AM -0400, Waiman Long wrote:
>> On 09/06/2017 03:08 AM, Peter Zijlstra wrote:
>>> Guys, please trim email.
>>>
>>> On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote:
On 09/06/2017 03:08 AM, Peter Zijlstra wrote:
> Guys, please trim email.
>
> On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote:
>> For clarification, I was actually asking if you consider just adding one
>> more jump label to skip it for Xen/KVM instead of making
>
On 09/05/2017 10:24 AM, Waiman Long wrote:
> On 09/05/2017 10:18 AM, Juergen Gross wrote:
>> On 05/09/17 16:10, Waiman Long wrote:
>>> On 09/05/2017 09:24 AM, Juergen Gross wrote:
>>>> There are cases where a guest tries to switch spinlocks to bare metal
On 09/05/2017 10:18 AM, Juergen Gross wrote:
> On 05/09/17 16:10, Waiman Long wrote:
>> On 09/05/2017 09:24 AM, Juergen Gross wrote:
>>> There are cases where a guest tries to switch spinlocks to bare metal
>>> behavior (e.g. by setting "xen_nopvspin" bo
On 09/05/2017 10:08 AM, Peter Zijlstra wrote:
> On Tue, Sep 05, 2017 at 10:02:57AM -0400, Waiman Long wrote:
>> On 09/05/2017 09:24 AM, Juergen Gross wrote:
>>> +static inline bool native_virt_spin_lock(struct qspinlock *lock)
>>> +{
>>> + if (!s
On 09/05/2017 09:24 AM, Juergen Gross wrote:
> There are cases where a guest tries to switch spinlocks to bare metal
> behavior (e.g. by setting "xen_nopvspin" boot parameter). Today this
> has the downside of falling back to unfair test and set scheme for
> qspinlocks due to virt_spin_lock() detec
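The shape this eventually took upstream is close to the following sketch (see arch/x86/include/asm/qspinlock.h): the static key keeps the fast path on bare metal, and only hypervisor guests fall through to the test-and-set loop.

#define virt_spin_lock virt_spin_lock
static inline bool virt_spin_lock(struct qspinlock *lock)
{
        if (!static_branch_likely(&virt_spin_lock_key))
                return false;

        /* unfair test-and-set fallback, as described above */
        do {
                while (atomic_read(&lock->val) != 0)
                        cpu_relax();
        } while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

        return true;
}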
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/x86/kernel/asm-offsets_64.c | 9 +
arch/x86/kernel/kvm.c | 24
2 files changed, 33 insertions(+)
diff --git a/arch/x86
intends to reduce this performance
overhead by replacing the C __kvm_vcpu_is_preempted() function with
an optimized version of __raw_callee_save___kvm_vcpu_is_preempted()
written in assembly.
Waiman Long (2):
x86/paravirt: Change vcp_is_preempted() arg type to long
x86/kvm: Provide opt
number won't exceed
32 bits.
Signed-off-by: Waiman Long
---
arch/x86/include/asm/paravirt.h | 2 +-
arch/x86/include/asm/qspinlock.h | 2 +-
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/paravirt-spinlocks.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
On 02/16/2017 11:09 AM, Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote:
>> The cpu argument in the function prototype of vcpu_is_preempted()
>> is changed from int to long. That makes it easier to provide a better
>> optimized assembly ver
On 02/16/2017 11:48 AM, Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 04:37:50PM -0500, Waiman Long wrote:
>> +/*
>> + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and
>> + * restoring to/from the stack. It is assumed that the preempted value
>>
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/x86/kernel/kvm.c | 30 ++
1 file changed, 30 insertions(+)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 85ed343
…__kvm_vcpu_is_preempted()
can have some impact on system performance on a VM guest, especially
an x86-64 guest. This patch set intends to reduce this performance
overhead by replacing the C __kvm_vcpu_is_preempted() function with
an optimized version of __raw_callee_save___kvm_vcpu_is_preempted()
written in assembly
…performance on a VM guest, especially
an x86-64 guest. This patch set intends to reduce this performance
overhead by replacing the C __kvm_vcpu_is_preempted() function with
an optimized version of __raw_callee_save___kvm_vcpu_is_preempted()
written in assembly.
Waiman Long (2):
x86/parav
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/x86/kernel/kvm.c | 28
1 file changed, 28 insertions(+)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 85ed343
On 02/14/2017 04:39 AM, Peter Zijlstra wrote:
> On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote:
>> It is the address of &steal_time that will exceed the 32-bit limit.
> That seems extremely unlikely. That would mean we have more than 4G
> worth of per-cpu variab
On 02/13/2017 04:52 PM, Peter Zijlstra wrote:
> On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote:
>> On 02/13/2017 02:42 PM, Waiman Long wrote:
>>> On 02/13/2017 05:53 AM, Peter Zijlstra wrote:
>>>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra
On 02/13/2017 03:06 PM, h...@zytor.com wrote:
> On February 13, 2017 2:53:43 AM PST, Peter Zijlstra
> wrote:
>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote:
>>> That way we'd end up with something like:
>>>
>>> asm("
>>> push %rdi;
>>> movslq %edi, %rdi;
>>> movq __per_cpu_offs
On 02/13/2017 02:42 PM, Waiman Long wrote:
> On 02/13/2017 05:53 AM, Peter Zijlstra wrote:
>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote:
>>> That way we'd end up with something like:
>>>
>>> asm("
>>> push %rdi;
>>
On 02/13/2017 05:53 AM, Peter Zijlstra wrote:
> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote:
>> That way we'd end up with something like:
>>
>> asm("
>> push %rdi;
>> movslq %edi, %rdi;
>> movq __per_cpu_offset(,%rdi,8), %rax;
>> cmpb $0, %[offset](%rax);
>> setne %al;
>> pop %rd
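For reference, a completed form of the asm being sketched above, assuming the steal_time.preempted offset is generated via asm-offsets as mentioned elsewhere in the thread:

asm(
".pushsection .text;"
".global __raw_callee_save___kvm_vcpu_is_preempted;"
".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
"__raw_callee_save___kvm_vcpu_is_preempted:"
"movq   __per_cpu_offset(,%rdi,8), %rax;"       /* cpu -> per-cpu base */
"cmpb   $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
"setne  %al;"                                   /* return the preempted flag */
"ret;"
".popsection");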
On 02/13/2017 05:47 AM, Peter Zijlstra wrote:
> On Fri, Feb 10, 2017 at 12:00:43PM -0500, Waiman Long wrote:
>
>>>> +asm(
>>>> +".pushsection .text;"
>>>> +".global __raw_callee_save___kvm_vcpu_is_preempted;"
>
On 02/10/2017 11:35 AM, Waiman Long wrote:
> On 02/10/2017 11:19 AM, Peter Zijlstra wrote:
>> On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote:
>>> It was found when running fio sequential write test with a XFS ramdisk
>>> on a VM running on a 2-socket x86-6
On 02/10/2017 11:19 AM, Peter Zijlstra wrote:
> On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote:
>> It was found when running fio sequential write test with a XFS ramdisk
>> on a VM running on a 2-socket x86-64 system, the %CPU times as reported
>> by perf were as
…% fio [k] osq_lock
10.14% 10.14% fio [k] __kvm_vcpu_is_preempted
On bare metal, the patch doesn't introduce any performance
regression. On KVM guest, it produces noticeable performance
improvement (up to 7%).
Signed-off-by: Waiman Long
---
v1->v2:
- Rerun the fio test on a differe
On 02/08/2017 02:05 PM, Peter Zijlstra wrote:
> On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote:
>> It was found when running fio sequential write test with a XFS ramdisk
>> on a 2-socket x86-64 system, the %CPU times as reported by perf were
>> as follows:
>