Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-10-05 Thread Paul E. McKenney
On Thu, Oct 05, 2017 at 09:31:48PM +0200, Andrea Parri wrote:
> Hi Will,
> 
> none of my comments below represent objections to this patch, but
> let me remark:
> 
> 
> On Thu, Oct 05, 2017 at 05:31:54PM +0100, Will Deacon wrote:
> > Hi Paul,
> > 
> > On Tue, Oct 03, 2017 at 12:11:10PM -0700, Paul E. McKenney wrote:
> > > On Fri, Sep 29, 2017 at 05:33:49PM +0100, Will Deacon wrote:
> > > > On Fri, Sep 29, 2017 at 09:29:39AM -0700, Paul E. McKenney wrote:
> > > > > On Fri, Sep 29, 2017 at 10:08:43AM +0100, Will Deacon wrote:
> > > > > > Ok, but where does that leave us wrt my initial proposal of moving
> > > > > > smp_read_barrier_depends() into READ_ONCE and getting rid of
> > > > > > lockless_dereference?
> > > > > > 
> > > > > > Michael (or anybody else running mainline on SMP Alpha) -- would 
> > > > > > you be
> > > > > > able to give the diff below a spin and see whether there's a 
> > > > > > measurable
> > > > > > performance impact?
> > > > > 
> > > > > This will be a sensitive test.  The smp_read_barrier_depends() can be
> > > > > removed from lockless_dereference().  Without this removal Alpha will
> > > > > get two memory barriers from rcu_dereference() and friends.
> > > > 
> > > > Oh yes, good point. I was trying to keep the diff simple, but you're
> > > > right that this is packing too many barriers. Fixed diff below.
> > > 
> > > Not seeing any objections thus far.  If there are none by (say) the
> > > end of this week, I would be happy to queue a patch for the 4.16
> > > merge window.  That should give ample opportunity for further review
> > > and testing.
> > 
> > Ok, full patch below.
> > 
> > Will
> > 
> > --->8
> > 
> > From 15956d0cc6b37208d8542b1858a8d8b64227acf4 Mon Sep 17 00:00:00 2001
> > From: Will Deacon 
> > Date: Thu, 5 Oct 2017 16:57:36 +0100
> > Subject: [PATCH] locking/barriers: Kill lockless_dereference
> > 
> > lockless_dereference is a nice idea, but it's gained little traction in
> > kernel code since it's introduction three years ago. This is partly
> > because it's a pain to type, but also because using READ_ONCE instead
> > will work correctly on all architectures apart from Alpha, which is a
> > fully supported but somewhat niche architecture these days.
> 
> lockless_dereference might be a mouthful, but it does (explicitly)
> say/remark: "Yep, we are relying on the following address dep. to
> be "in strong-ppo" ".
> 
> Such information will be lost or, at least, not immediately clear
> by just reading a READ_ONCE(). (And Yes, this information is only
> relevant when we "include" Alpha in the picture/analysis.)

It is possible to argue that lockless_dereference() should remain for
this reason, even given READ_ONCE() containing smp_read_barrier_depends().
However, such arguments would be much more compelling if there were
tools that cared about the difference.

> > This patch moves smp_read_barrier_depends() (a NOP on all architectures
> > other than Alpha) from lockless_dereference into READ_ONCE, converts
> > the few actual users over to READ_ONCE and then finally removes
> > lockless_dereference altogether.
> 
> Notice that several "potential users" of lockless_dereference are
> currently hidden in other call sites for smp_read_barrier_depends
> (i.e., cases where this barrier is not called from within a lock-
> less or an RCU dereference).
> 
> Some of these usages (e.g.,
> 
>   include/linux/percpu-refcount.h:__ref_is_percpu,
>   mm/ksm.c:get_ksm_page,
>   security/keys/keyring.c:search_nested_keyrings )
> 
> precedes this barrier with a READ_ONCE; others (e.g.,
> 
>   arch/alpha/include/asm/pgtable.h:pmd_offset,
>   net/ipv4/netfilter/arp_tables.c:arpt_do_table
>   kernel/kernel/events/uprobes.c:get_trampiline_vaddr )
> 
> with a plain read.

I would welcome patches for the cases where smp_read_barrier_depends() is
preceded by READ_ONCE().  Perhaps the others should gain a READ_ONCE(),
and I suspect that they should, but ultimately that decision is in
the hands of the relevant maintainer, so any such patches need to be
separated and will need at least an ack from the relevant maintainers.

> There also appear to be cases where the barrier is preceded by an
> ACCESS_ONCE (c.f, fs/dcache.c:prepend_name) or by an xchg_release
> (c.f., kernel/locking/qspinlock.c:queued_spin_lock_slowpath), and
> it would not be difficult to imagine/create different usages.

It would indeed be good to replace ACCESS_ONCE() with READ_ONCE() or
with WRITE_ONCE() where this works.  And yes, I agree that there are
other usage patterns possible.

> > Signed-off-by: Will Deacon 
> 
> I understand that we all agree we're missing a Tested-by here ;-).

Indeed, hence my "applied for testing and review".  ;-)

Thanx, Paul

>   Andrea
> 
> 
> > ---
> >  Documentation/memory-barriers.txt   | 12 
> >  .../translations/ko_KR/memory-barriers.txt  | 12 

Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-10-05 Thread Andrea Parri
Hi Will,

none of my comments below represent objections to this patch, but
let me remark:


On Thu, Oct 05, 2017 at 05:31:54PM +0100, Will Deacon wrote:
> Hi Paul,
> 
> On Tue, Oct 03, 2017 at 12:11:10PM -0700, Paul E. McKenney wrote:
> > On Fri, Sep 29, 2017 at 05:33:49PM +0100, Will Deacon wrote:
> > > On Fri, Sep 29, 2017 at 09:29:39AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Sep 29, 2017 at 10:08:43AM +0100, Will Deacon wrote:
> > > > > Ok, but where does that leave us wrt my initial proposal of moving
> > > > > smp_read_barrier_depends() into READ_ONCE and getting rid of
> > > > > lockless_dereference?
> > > > > 
> > > > > Michael (or anybody else running mainline on SMP Alpha) -- would you 
> > > > > be
> > > > > able to give the diff below a spin and see whether there's a 
> > > > > measurable
> > > > > performance impact?
> > > > 
> > > > This will be a sensitive test.  The smp_read_barrier_depends() can be
> > > > removed from lockless_dereference().  Without this removal Alpha will
> > > > get two memory barriers from rcu_dereference() and friends.
> > > 
> > > Oh yes, good point. I was trying to keep the diff simple, but you're
> > > right that this is packing too many barriers. Fixed diff below.
> > 
> > Not seeing any objections thus far.  If there are none by (say) the
> > end of this week, I would be happy to queue a patch for the 4.16
> > merge window.  That should give ample opportunity for further review
> > and testing.
> 
> Ok, full patch below.
> 
> Will
> 
> --->8
> 
> From 15956d0cc6b37208d8542b1858a8d8b64227acf4 Mon Sep 17 00:00:00 2001
> From: Will Deacon 
> Date: Thu, 5 Oct 2017 16:57:36 +0100
> Subject: [PATCH] locking/barriers: Kill lockless_dereference
> 
> lockless_dereference is a nice idea, but it's gained little traction in
> kernel code since it's introduction three years ago. This is partly
> because it's a pain to type, but also because using READ_ONCE instead
> will work correctly on all architectures apart from Alpha, which is a
> fully supported but somewhat niche architecture these days.

lockless_dereference might be a mouthful, but it does (explicitly)
say/remark: "Yep, we are relying on the following address dep. to
be "in strong-ppo" ".

Such information will be lost or, at least, not immediately clear
by just reading a READ_ONCE(). (And Yes, this information is only
relevant when we "include" Alpha in the picture/analysis.)


> 
> This patch moves smp_read_barrier_depends() (a NOP on all architectures
> other than Alpha) from lockless_dereference into READ_ONCE, converts
> the few actual users over to READ_ONCE and then finally removes
> lockless_dereference altogether.

Notice that several "potential users" of lockless_dereference are
currently hidden in other call sites for smp_read_barrier_depends
(i.e., cases where this barrier is not called from within a lock-
less or an RCU dereference).

Some of these usages (e.g.,

  include/linux/percpu-refcount.h:__ref_is_percpu,
  mm/ksm.c:get_ksm_page,
  security/keys/keyring.c:search_nested_keyrings )

precedes this barrier with a READ_ONCE; others (e.g.,

  arch/alpha/include/asm/pgtable.h:pmd_offset,
  net/ipv4/netfilter/arp_tables.c:arpt_do_table
  kernel/kernel/events/uprobes.c:get_trampiline_vaddr )

with a plain read.

There also appear to be cases where the barrier is preceded by an
ACCESS_ONCE (c.f, fs/dcache.c:prepend_name) or by an xchg_release
(c.f., kernel/locking/qspinlock.c:queued_spin_lock_slowpath), and
it would not be difficult to imagine/create different usages.


> 
> Signed-off-by: Will Deacon 

I understand that we all agree we're missing a Tested-by here ;-).

  Andrea


> ---
>  Documentation/memory-barriers.txt   | 12 
>  .../translations/ko_KR/memory-barriers.txt  | 12 
>  arch/x86/events/core.c  |  2 +-
>  arch/x86/include/asm/mmu_context.h  |  4 ++--
>  arch/x86/kernel/ldt.c   |  2 +-
>  drivers/md/dm-mpath.c   | 20 ++--
>  fs/dcache.c |  4 ++--
>  fs/overlayfs/ovl_entry.h|  2 +-
>  fs/overlayfs/readdir.c  |  2 +-
>  include/linux/compiler.h| 21 
> +
>  include/linux/rculist.h |  4 ++--
>  include/linux/rcupdate.h|  4 ++--
>  kernel/events/core.c|  4 ++--
>  kernel/seccomp.c|  2 +-
>  kernel/task_work.c  |  2 +-
>  mm/slab.h   |  2 +-
>  16 files changed, 28 insertions(+), 71 deletions(-)
> 
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index b759a60624fd..470a682f3fa4 100644
> --- 

Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-10-05 Thread Will Deacon
Hi Paul,

On Tue, Oct 03, 2017 at 12:11:10PM -0700, Paul E. McKenney wrote:
> On Fri, Sep 29, 2017 at 05:33:49PM +0100, Will Deacon wrote:
> > On Fri, Sep 29, 2017 at 09:29:39AM -0700, Paul E. McKenney wrote:
> > > On Fri, Sep 29, 2017 at 10:08:43AM +0100, Will Deacon wrote:
> > > > Ok, but where does that leave us wrt my initial proposal of moving
> > > > smp_read_barrier_depends() into READ_ONCE and getting rid of
> > > > lockless_dereference?
> > > > 
> > > > Michael (or anybody else running mainline on SMP Alpha) -- would you be
> > > > able to give the diff below a spin and see whether there's a measurable
> > > > performance impact?
> > > 
> > > This will be a sensitive test.  The smp_read_barrier_depends() can be
> > > removed from lockless_dereference().  Without this removal Alpha will
> > > get two memory barriers from rcu_dereference() and friends.
> > 
> > Oh yes, good point. I was trying to keep the diff simple, but you're
> > right that this is packing too many barriers. Fixed diff below.
> 
> Not seeing any objections thus far.  If there are none by (say) the
> end of this week, I would be happy to queue a patch for the 4.16
> merge window.  That should give ample opportunity for further review
> and testing.

Ok, full patch below.

Will

--->8

>From 15956d0cc6b37208d8542b1858a8d8b64227acf4 Mon Sep 17 00:00:00 2001
From: Will Deacon 
Date: Thu, 5 Oct 2017 16:57:36 +0100
Subject: [PATCH] locking/barriers: Kill lockless_dereference

lockless_dereference is a nice idea, but it's gained little traction in
kernel code since it's introduction three years ago. This is partly
because it's a pain to type, but also because using READ_ONCE instead
will work correctly on all architectures apart from Alpha, which is a
fully supported but somewhat niche architecture these days.

This patch moves smp_read_barrier_depends() (a NOP on all architectures
other than Alpha) from lockless_dereference into READ_ONCE, converts
the few actual users over to READ_ONCE and then finally removes
lockless_dereference altogether.

Signed-off-by: Will Deacon 
---
 Documentation/memory-barriers.txt   | 12 
 .../translations/ko_KR/memory-barriers.txt  | 12 
 arch/x86/events/core.c  |  2 +-
 arch/x86/include/asm/mmu_context.h  |  4 ++--
 arch/x86/kernel/ldt.c   |  2 +-
 drivers/md/dm-mpath.c   | 20 ++--
 fs/dcache.c |  4 ++--
 fs/overlayfs/ovl_entry.h|  2 +-
 fs/overlayfs/readdir.c  |  2 +-
 include/linux/compiler.h| 21 +
 include/linux/rculist.h |  4 ++--
 include/linux/rcupdate.h|  4 ++--
 kernel/events/core.c|  4 ++--
 kernel/seccomp.c|  2 +-
 kernel/task_work.c  |  2 +-
 mm/slab.h   |  2 +-
 16 files changed, 28 insertions(+), 71 deletions(-)

diff --git a/Documentation/memory-barriers.txt 
b/Documentation/memory-barriers.txt
index b759a60624fd..470a682f3fa4 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1886,18 +1886,6 @@ There are some more advanced barrier functions:
  See Documentation/atomic_{t,bitops}.txt for more information.
 
 
- (*) lockless_dereference();
-
- This can be thought of as a pointer-fetch wrapper around the
- smp_read_barrier_depends() data-dependency barrier.
-
- This is also similar to rcu_dereference(), but in cases where
- object lifetime is handled by some mechanism other than RCU, for
- example, when the objects removed only when the system goes down.
- In addition, lockless_dereference() is used in some data structures
- that can be used both with and without RCU.
-
-
  (*) dma_wmb();
  (*) dma_rmb();
 
diff --git a/Documentation/translations/ko_KR/memory-barriers.txt 
b/Documentation/translations/ko_KR/memory-barriers.txt
index a7a813258013..ec3b46e27b7a 100644
--- a/Documentation/translations/ko_KR/memory-barriers.txt
+++ b/Documentation/translations/ko_KR/memory-barriers.txt
@@ -1858,18 +1858,6 @@ Mandatory 배리어들은 SMP 시스템에서도 UP 시스템에서도 SMP 효
  참고하세요.
 
 
- (*) lockless_dereference();
-
- 이 함수는 smp_read_barrier_depends() 데이터 의존성 배리어를 사용하는
- 포인터 읽어오기 래퍼(wrapper) 함수로 생각될 수 있습니다.
-
- 객체의 라이프타임이 RCU 외의 메커니즘으로 관리된다는 점을 제외하면
- rcu_dereference() 와도 유사한데, 예를 들면 객체가 시스템이 꺼질 때에만
- 제거되는 경우 등입니다.  또한, lockless_dereference() 은 RCU 와 함께
- 사용될수도, RCU 없이 사용될 수도 있는 일부 데이터 구조에 사용되고
- 있습니다.
-
-
  (*) dma_wmb();
  (*) dma_rmb();
 
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 80534d3c2480..589af1eec7c1 100644
--- a/arch/x86/events/core.c
+++ 

Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-29 Thread Will Deacon
On Fri, Sep 29, 2017 at 09:29:39AM -0700, Paul E. McKenney wrote:
> On Fri, Sep 29, 2017 at 10:08:43AM +0100, Will Deacon wrote:
> > On Thu, Sep 28, 2017 at 05:58:30PM -0700, Paul E. McKenney wrote:
> > > On Fri, Sep 29, 2017 at 07:59:09AM +1300, Michael Cree wrote:
> > > > On Thu, Sep 28, 2017 at 08:43:54AM -0700, Paul E. McKenney wrote:
> > > > > On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> > > > > > On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > > > > > > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > > > > > > In many cases, page tables can be accessed concurrently by 
> > > > > > > > either another
> > > > > > > > CPU (due to things like fast gup) or by the hardware page table 
> > > > > > > > walker
> > > > > > > > itself, which may set access/dirty bits. In such cases, it is 
> > > > > > > > important
> > > > > > > > to use READ_ONCE/WRITE_ONCE when accessing page table entries 
> > > > > > > > so that
> > > > > > > > entries cannot be torn, merged or subject to apparent loss of 
> > > > > > > > coherence.
> > > > > > > 
> > > > > > > In fact, we should use lockless_dereference() for many of them. 
> > > > > > > Yes
> > > > > > > Alpha is the only one that cares about the difference between 
> > > > > > > that and
> > > > > > > READ_ONCE() and they do have the extra barrier, but if we're 
> > > > > > > going to do
> > > > > > > this, we might as well do it 'right' :-)
> > > > > > 
> > > > > > I know this sounds daft, but I think one of the big reasons why
> > > > > > lockless_dereference() doesn't get an awful lot of use is because 
> > > > > > it's
> > > > > > such a mouthful! Why don't we just move the 
> > > > > > smp_read_barrier_depends()
> > > > > > into READ_ONCE? Would anybody actually care about the potential 
> > > > > > impact on
> > > > > > Alpha (which, frankly, is treading on thin ice given the low 
> > > > > > adoption of
> > > > > > lockless_dereference())?
> > > > > 
> > > > > This is my cue to ask my usual question...  ;-)
> > > > > 
> > > > > Are people still running mainline kernels on Alpha?  (Added Alpha 
> > > > > folks.)
> > > > 
> > > > Yes.  I run two Alpha build daemons that build the unofficial
> > > > debian-alpha port.  Debian popcon reports nine machines running
> > > > Alpha, which are likely to be running the 4.12.y kernel which
> > > > is currently in debian-alpha, (and presumably soon to be 4.13.y
> > > > which is now built on Alpha in experimental).
> > > 
> > > I salute your dedication to Alpha!  ;-)
> > 
> > Ok, but where does that leave us wrt my initial proposal of moving
> > smp_read_barrier_depends() into READ_ONCE and getting rid of
> > lockless_dereference?
> > 
> > Michael (or anybody else running mainline on SMP Alpha) -- would you be
> > able to give the diff below a spin and see whether there's a measurable
> > performance impact?
> 
> This will be a sensitive test.  The smp_read_barrier_depends() can be
> removed from lockless_dereference().  Without this removal Alpha will
> get two memory barriers from rcu_dereference() and friends.

Oh yes, good point. I was trying to keep the diff simple, but you're
right that this is packing too many barriers. Fixed diff below.

Thanks,

Will

--->8

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e95a2631e545..c4ee9d6d8f2d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -340,6 +340,7 @@ static __always_inline void __write_once_size(volatile void 
*p, void *res, int s
__read_once_size(&(x), __u.__c, sizeof(x)); \
else\
__read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
+   smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
__u.__val;  \
 })
 #define READ_ONCE(x) __READ_ONCE(x, 1)
@@ -620,7 +621,6 @@ static __always_inline void __write_once_size(volatile void 
*p, void *res, int s
 ({ \
typeof(p) _p1 = READ_ONCE(p); \
typeof(*(p)) *___typecheck_p __maybe_unused; \
-   smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_p1); \
 })
 
--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-29 Thread Paul E. McKenney
On Fri, Sep 29, 2017 at 10:08:43AM +0100, Will Deacon wrote:
> On Thu, Sep 28, 2017 at 05:58:30PM -0700, Paul E. McKenney wrote:
> > On Fri, Sep 29, 2017 at 07:59:09AM +1300, Michael Cree wrote:
> > > On Thu, Sep 28, 2017 at 08:43:54AM -0700, Paul E. McKenney wrote:
> > > > On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> > > > > On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > > > > > In many cases, page tables can be accessed concurrently by either 
> > > > > > > another
> > > > > > > CPU (due to things like fast gup) or by the hardware page table 
> > > > > > > walker
> > > > > > > itself, which may set access/dirty bits. In such cases, it is 
> > > > > > > important
> > > > > > > to use READ_ONCE/WRITE_ONCE when accessing page table entries so 
> > > > > > > that
> > > > > > > entries cannot be torn, merged or subject to apparent loss of 
> > > > > > > coherence.
> > > > > > 
> > > > > > In fact, we should use lockless_dereference() for many of them. Yes
> > > > > > Alpha is the only one that cares about the difference between that 
> > > > > > and
> > > > > > READ_ONCE() and they do have the extra barrier, but if we're going 
> > > > > > to do
> > > > > > this, we might as well do it 'right' :-)
> > > > > 
> > > > > I know this sounds daft, but I think one of the big reasons why
> > > > > lockless_dereference() doesn't get an awful lot of use is because it's
> > > > > such a mouthful! Why don't we just move the smp_read_barrier_depends()
> > > > > into READ_ONCE? Would anybody actually care about the potential 
> > > > > impact on
> > > > > Alpha (which, frankly, is treading on thin ice given the low adoption 
> > > > > of
> > > > > lockless_dereference())?
> > > > 
> > > > This is my cue to ask my usual question...  ;-)
> > > > 
> > > > Are people still running mainline kernels on Alpha?  (Added Alpha 
> > > > folks.)
> > > 
> > > Yes.  I run two Alpha build daemons that build the unofficial
> > > debian-alpha port.  Debian popcon reports nine machines running
> > > Alpha, which are likely to be running the 4.12.y kernel which
> > > is currently in debian-alpha, (and presumably soon to be 4.13.y
> > > which is now built on Alpha in experimental).
> > 
> > I salute your dedication to Alpha!  ;-)
> 
> Ok, but where does that leave us wrt my initial proposal of moving
> smp_read_barrier_depends() into READ_ONCE and getting rid of
> lockless_dereference?
> 
> Michael (or anybody else running mainline on SMP Alpha) -- would you be
> able to give the diff below a spin and see whether there's a measurable
> performance impact?

This will be a sensitive test.  The smp_read_barrier_depends() can be
removed from lockless_dereference().  Without this removal Alpha will
get two memory barriers from rcu_dereference() and friends.

Thanx, Paul

> Cheers,
> 
> Will
> 
> --->8
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index e95a2631e545..0ce21e25492a 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -340,6 +340,7 @@ static __always_inline void __write_once_size(volatile 
> void *p, void *res, int s
>   __read_once_size(&(x), __u.__c, sizeof(x)); \
>   else\
>   __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
> + smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
>   __u.__val;  \
>  })
>  #define READ_ONCE(x) __READ_ONCE(x, 1)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-29 Thread Will Deacon
On Thu, Sep 28, 2017 at 05:58:30PM -0700, Paul E. McKenney wrote:
> On Fri, Sep 29, 2017 at 07:59:09AM +1300, Michael Cree wrote:
> > On Thu, Sep 28, 2017 at 08:43:54AM -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> > > > On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > > > > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > > > > In many cases, page tables can be accessed concurrently by either 
> > > > > > another
> > > > > > CPU (due to things like fast gup) or by the hardware page table 
> > > > > > walker
> > > > > > itself, which may set access/dirty bits. In such cases, it is 
> > > > > > important
> > > > > > to use READ_ONCE/WRITE_ONCE when accessing page table entries so 
> > > > > > that
> > > > > > entries cannot be torn, merged or subject to apparent loss of 
> > > > > > coherence.
> > > > > 
> > > > > In fact, we should use lockless_dereference() for many of them. Yes
> > > > > Alpha is the only one that cares about the difference between that and
> > > > > READ_ONCE() and they do have the extra barrier, but if we're going to 
> > > > > do
> > > > > this, we might as well do it 'right' :-)
> > > > 
> > > > I know this sounds daft, but I think one of the big reasons why
> > > > lockless_dereference() doesn't get an awful lot of use is because it's
> > > > such a mouthful! Why don't we just move the smp_read_barrier_depends()
> > > > into READ_ONCE? Would anybody actually care about the potential impact 
> > > > on
> > > > Alpha (which, frankly, is treading on thin ice given the low adoption of
> > > > lockless_dereference())?
> > > 
> > > This is my cue to ask my usual question...  ;-)
> > > 
> > > Are people still running mainline kernels on Alpha?  (Added Alpha folks.)
> > 
> > Yes.  I run two Alpha build daemons that build the unofficial
> > debian-alpha port.  Debian popcon reports nine machines running
> > Alpha, which are likely to be running the 4.12.y kernel which
> > is currently in debian-alpha, (and presumably soon to be 4.13.y
> > which is now built on Alpha in experimental).
> 
> I salute your dedication to Alpha!  ;-)

Ok, but where does that leave us wrt my initial proposal of moving
smp_read_barrier_depends() into READ_ONCE and getting rid of
lockless_dereference?

Michael (or anybody else running mainline on SMP Alpha) -- would you be
able to give the diff below a spin and see whether there's a measurable
performance impact?

Cheers,

Will

--->8

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e95a2631e545..0ce21e25492a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -340,6 +340,7 @@ static __always_inline void __write_once_size(volatile void 
*p, void *res, int s
__read_once_size(&(x), __u.__c, sizeof(x)); \
else\
__read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
+   smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
__u.__val;  \
 })
 #define READ_ONCE(x) __READ_ONCE(x, 1)
--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-28 Thread Michael Cree
On Thu, Sep 28, 2017 at 08:43:54AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> > On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > > In many cases, page tables can be accessed concurrently by either 
> > > > another
> > > > CPU (due to things like fast gup) or by the hardware page table walker
> > > > itself, which may set access/dirty bits. In such cases, it is important
> > > > to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
> > > > entries cannot be torn, merged or subject to apparent loss of coherence.
> > > 
> > > In fact, we should use lockless_dereference() for many of them. Yes
> > > Alpha is the only one that cares about the difference between that and
> > > READ_ONCE() and they do have the extra barrier, but if we're going to do
> > > this, we might as well do it 'right' :-)
> > 
> > I know this sounds daft, but I think one of the big reasons why
> > lockless_dereference() doesn't get an awful lot of use is because it's
> > such a mouthful! Why don't we just move the smp_read_barrier_depends()
> > into READ_ONCE? Would anybody actually care about the potential impact on
> > Alpha (which, frankly, is treading on thin ice given the low adoption of
> > lockless_dereference())?
> 
> This is my cue to ask my usual question...  ;-)
> 
> Are people still running mainline kernels on Alpha?  (Added Alpha folks.)

Yes.  I run two Alpha build daemons that build the unofficial
debian-alpha port.  Debian popcon reports nine machines running
Alpha, which are likely to be running the 4.12.y kernel which
is currently in debian-alpha, (and presumably soon to be 4.13.y
which is now built on Alpha in experimental).

Cheers
Michael
--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-28 Thread Paul E. McKenney
On Thu, Sep 28, 2017 at 04:49:54PM +0100, Will Deacon wrote:
> On Thu, Sep 28, 2017 at 08:43:54AM -0700, Paul E. McKenney wrote:
> > On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> > > On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > > > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > > > In many cases, page tables can be accessed concurrently by either 
> > > > > another
> > > > > CPU (due to things like fast gup) or by the hardware page table walker
> > > > > itself, which may set access/dirty bits. In such cases, it is 
> > > > > important
> > > > > to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
> > > > > entries cannot be torn, merged or subject to apparent loss of 
> > > > > coherence.
> > > > 
> > > > In fact, we should use lockless_dereference() for many of them. Yes
> > > > Alpha is the only one that cares about the difference between that and
> > > > READ_ONCE() and they do have the extra barrier, but if we're going to do
> > > > this, we might as well do it 'right' :-)
> > > 
> > > I know this sounds daft, but I think one of the big reasons why
> > > lockless_dereference() doesn't get an awful lot of use is because it's
> > > such a mouthful! Why don't we just move the smp_read_barrier_depends()
> > > into READ_ONCE? Would anybody actually care about the potential impact on
> > > Alpha (which, frankly, is treading on thin ice given the low adoption of
> > > lockless_dereference())?
> > 
> > This is my cue to ask my usual question...  ;-)
> > 
> > Are people still running mainline kernels on Alpha?  (Added Alpha folks.)
> > 
> > As always, if anyone is, we must continue to support Alpha, but sounds
> > like time to check again.
> 
> I'll be honest and say that I haven't updated mine for a while, but I do
> have a soft spot for those machines :(

Let's see what the Alpha folks say.  I myself have had a close
relationship with Alpha for almost 20 years, but I suspect that in
my case it is more a hard spot on my head rather than a soft spot in
my heart.  ;-)

Thanx,
Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

2017-09-28 Thread Paul E. McKenney
On Thu, Sep 28, 2017 at 09:45:35AM +0100, Will Deacon wrote:
> On Thu, Sep 28, 2017 at 10:38:01AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 27, 2017 at 04:49:28PM +0100, Will Deacon wrote:
> > > In many cases, page tables can be accessed concurrently by either another
> > > CPU (due to things like fast gup) or by the hardware page table walker
> > > itself, which may set access/dirty bits. In such cases, it is important
> > > to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
> > > entries cannot be torn, merged or subject to apparent loss of coherence.
> > 
> > In fact, we should use lockless_dereference() for many of them. Yes
> > Alpha is the only one that cares about the difference between that and
> > READ_ONCE() and they do have the extra barrier, but if we're going to do
> > this, we might as well do it 'right' :-)
> 
> I know this sounds daft, but I think one of the big reasons why
> lockless_dereference() doesn't get an awful lot of use is because it's
> such a mouthful! Why don't we just move the smp_read_barrier_depends()
> into READ_ONCE? Would anybody actually care about the potential impact on
> Alpha (which, frankly, is treading on thin ice given the low adoption of
> lockless_dereference())?

This is my cue to ask my usual question...  ;-)

Are people still running mainline kernels on Alpha?  (Added Alpha folks.)

As always, if anyone is, we must continue to support Alpha, but sounds
like time to check again.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html