Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-08 Thread Dan Williams
On Mon, May 8, 2017 at 1:32 PM, Ross Zwisler wrote:
> On Fri, Apr 28, 2017 at 12:39:12PM -0700, Dan Williams wrote:
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination writes
>> are not cached. It is sufficient for the writes to be flushed to a
>> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
>> userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
>> REQ_FLUSH to the pmem driver which will turn around and fence previous
>> writes with an "sfence".
>>
>> Implement __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
>> which guarantee that the destination buffer is not dirty in the cpu cache
>> on completion. The new copy_from_iter_wt and sub-routines will be used
>> to replace the "pmem api" (include/linux/pmem.h +
>> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
>> and memcpy_wt() is gated by the CONFIG_ARCH_HAS_UACCESS_WT config
>> symbol, and they fall back to copy_from_iter_nocache() and plain memcpy()
>> otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver wants
>> to do something beyond the normal nocache semantics it should be
>> something private to that driver [1], and Al's concern that anything
>> uaccess related belongs with the rest of the uaccess code [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc: 
>> Cc: Jan Kara 
>> Cc: Jeff Moyer 
>> Cc: Ingo Molnar 
>> Cc: Christoph Hellwig 
>> Cc: "H. Peter Anvin" 
>> Cc: Al Viro 
>> Cc: Thomas Gleixner 
>> Cc: Matthew Wilcox 
>> Cc: Ross Zwisler 
>> Signed-off-by: Dan Williams 
[..]
> I took a pretty hard look at the changes in arch/x86/lib/usercopy_64.c, and
> they look correct to me.  The inline assembly for non-temporal copies mixed
> with C for loop control is IMHO much easier to follow than the pure assembly
> of __copy_user_nocache().
>
> Reviewed-by: Ross Zwisler 

Thanks Ross, I appreciate it.
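
For readers following along, the config gating described in the changelog
can be pictured with a small stand-alone sketch. It is only an illustration
and not the patch itself: the macro is treated as an ordinary preprocessor
symbol rather than a real Kconfig option, and the fallback body simply
follows the changelog's statement that memcpy_wt() degrades to plain
memcpy() when the symbol is not set.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumption: CONFIG_ARCH_HAS_UACCESS_WT is modelled as a plain macro
 * that could be passed on the compiler command line. In the kernel it
 * would come from Kconfig.
 */
#ifdef CONFIG_ARCH_HAS_UACCESS_WT
/* The architecture supplies the real routine in another object file. */
void memcpy_wt(void *dst, const void *src, size_t n);
#else
/* Fallback named in the changelog: an ordinary cached memcpy(). */
static inline void memcpy_wt(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}
#endif
```

Callers use memcpy_wt() unconditionally; only the definition changes with
the configuration, which keeps the #ifdef out of the driver code.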


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-08 Thread Ross Zwisler
On Fri, Apr 28, 2017 at 12:39:12PM -0700, Dan Williams wrote:
> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination writes
> are not cached. It is sufficient for the writes to be flushed to a
> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
> userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
> REQ_FLUSH to the pmem driver which will turn around and fence previous
> writes with an "sfence".
> 
> Implement __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
> which guarantee that the destination buffer is not dirty in the cpu cache
> on completion. The new copy_from_iter_wt and sub-routines will be used
> to replace the "pmem api" (include/linux/pmem.h +
> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
> and memcpy_wt() is gated by the CONFIG_ARCH_HAS_UACCESS_WT config
> symbol, and they fall back to copy_from_iter_nocache() and plain memcpy()
> otherwise.
> 
> This is meant to satisfy the concern from Linus that if a driver wants
> to do something beyond the normal nocache semantics it should be
> something private to that driver [1], and Al's concern that anything
> uaccess related belongs with the rest of the uaccess code [2].
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
> 
> Cc: 
> Cc: Jan Kara 
> Cc: Jeff Moyer 
> Cc: Ingo Molnar 
> Cc: Christoph Hellwig 
> Cc: "H. Peter Anvin" 
> Cc: Al Viro 
> Cc: Thomas Gleixner 
> Cc: Matthew Wilcox 
> Cc: Ross Zwisler 
> Signed-off-by: Dan Williams 
> ---
<>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c5504b9a472e..07ded30c7e89 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -171,6 +171,10 @@ unsigned long raw_copy_in_user(void __user *dst, const void __user *src, unsigne
>  extern long __copy_user_nocache(void *dst, const void __user *src,
>  				unsigned size, int zerorest);
>  
> +extern long __copy_user_wt(void *dst, const void __user *src, unsigned size);
> +extern void memcpy_page_wt(char *to, struct page *page, size_t offset,
> +			   size_t len);
> +
>  static inline int
>  __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
>  				  unsigned size)
> @@ -179,6 +183,13 @@ __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
>  	return __copy_user_nocache(dst, src, size, 0);
>  }
>  
> +static inline int
> +__copy_from_user_inatomic_wt(void *dst, const void __user *src, unsigned size)
> +{
> +	kasan_check_write(dst, size);
> +	return __copy_user_wt(dst, src, size);
> +}
> +
> +
>  unsigned long
>  copy_user_handle_tail(char *to, char *from, unsigned len);
>  
> diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
> index 3b7c40a2e3e1..0aeff66a022f 100644
> --- a/arch/x86/lib/usercopy_64.c
> +++ b/arch/x86/lib/usercopy_64.c
> @@ -7,6 +7,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  
>  /*
>   * Zero Userspace
> @@ -73,3 +74,130 @@ copy_user_handle_tail(char *to, char *from, unsigned len)
>   clac();
>   return len;
>  }
> +
> +#ifdef CONFIG_ARCH_HAS_UACCESS_WT
> +/**
> + * clean_cache_range - write back a cache range with CLWB
> + * @addr:	virtual start address
> + * @size:	number of bytes to write back
> + *
> + * Write back a cache range using the CLWB (cache line write back)
> + * instruction. Note that @size is internally rounded up to be cache
> + * line size aligned.
> + */
> +static void clean_cache_range(void *addr, size_t size)
> +{
> +	u16 x86_clflush_size = boot_cpu_data.x86_clflush_size;
> +	unsigned long clflush_mask = x86_clflush_size - 1;
> +	void *vend = addr + size;
> +	void *p;
> +
> +	for (p = (void *)((unsigned long)addr & ~clflush_mask);
> +	     p < vend; p += x86_clflush_size)
> +		clwb(p);
> +}
> +
> +long __copy_user_wt(void *dst, const void __user *src, unsigned size)
> +{
> +	unsigned long flushed, dest = (unsigned long) dst;
> +	long rc = __copy_user_nocache(dst, src, size, 0);
> +
> +	/*
> +	 * __copy_user_nocache() uses non-temporal stores for the bulk
> +	 * of the transfer, but we need to manually flush if the
> +	 * transfer is unaligned. A cached memory copy is used when
> +	 * destination or size is not naturally aligned. That is:
> +	 *   - Require 8-byte alignment when size is 8 bytes or larger.
> +	 *   - Require 4-byte alignment when size is 4 bytes.
> +	 */
> +	if (size < 8) {
> +		if (!IS_ALIGNED(dest, 4) || size != 4)
> +			clean_cache_range(dst, 1);
> +	} else {
> +		if (!IS_ALIGNED(dest, 8)) {
> +			dest 
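
The quoted hunk is cut off in the archive, so as an aid here is a
user-space sketch of the two ideas that are visible above: how the loop in
clean_cache_range() rounds the start address down to a cache line, and the
alignment rule from the comment in __copy_user_wt(). The helper names
(lines_touched, head_needs_flush) and the fixed 64-byte line size are
assumptions made for illustration; the patch reads the line size from
boot_cpu_data.x86_clflush_size, and the truncated tail handling is
deliberately not reconstructed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size; the patch uses boot_cpu_data.x86_clflush_size. */
#define LINE_SIZE 64u

/*
 * Count the cache lines the clean_cache_range() loop would visit for
 * [addr, addr + size): start at the line containing addr and step one
 * line at a time until the end of the range is passed.
 */
static size_t lines_touched(uintptr_t addr, size_t size)
{
	uintptr_t mask = LINE_SIZE - 1;
	uintptr_t end = addr + size;
	uintptr_t p;
	size_t n = 0;

	for (p = addr & ~mask; p < end; p += LINE_SIZE)
		n++;
	return n;
}

/*
 * True when the first destination byte needs an explicit write-back,
 * using only the conditions visible in the quoted hunk:
 *   - size < 8:  anything other than an aligned 4-byte copy
 *   - size >= 8: a destination that is not 8-byte aligned
 */
static bool head_needs_flush(uintptr_t dest, size_t size)
{
	if (size < 8)
		return (dest % 4) != 0 || size != 4;
	return (dest % 8) != 0;
}
```

A 2-byte range starting at offset 63 straddles two lines, which is why the
loop starts from the rounded-down address instead of from addr itself.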

RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-07 Thread Kani, Toshimitsu
> * Dan Williams  wrote:
> 
> > On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar  wrote:
> > >
> > > * Dan Williams  wrote:
> > >
> > >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> > >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> > >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> > >> >  :
> > >> >> > > ---
> > >> >> > > Changes since the initial RFC:
> > >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > >> >> > > set_memory_wt(), etc. (Ingo)
> > >> >> >
> > >> >> > Sorry I should have said earlier, but I think the term "wt" is
> > >> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> > >> >> > semantics, not WT semantics.
> > >> >>
> > >> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> > >> >> non-temporal stores and explicit cache flushing.
> > >> >>
> > >> >> > How about using "nocache" as it's been
> > >> >> > used in __copy_user_nocache()?
> > >> >>
> > >> >> The difference in my mind is that the "_nocache" suffix indicates
> > >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> > >> >> strictly arranges for caches not to contain dirty data upon
> > >> >> completion of the routine. For example, non-temporal stores on older
> > >> >> x86 cpus could potentially leave dirty data in the cache, so
> > >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> > >> >
> > >> > I see.  I agree that its behavior is different from the existing one
> > >> > with "_nocache".   That said, I think "wt" or "write-through" generally
> > >> > means that writes allocate cachelines and keep them clean by writing to
> > >> > memory.  So, subsequent reads to the destination will hit the
> > >> > cachelines.  This is not the case with this interface.
> > >>
> > >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> > >> comes along and is surprised that the cache is not warm for reads
> > >> after memcpy_wt(), at which point we can ask "why not just use plain
> > >> memcpy then?", or set the page-attributes to WT.
> > >
> > > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> > > and that no cache line is left around afterwards (dirty or clean)?
> >
> > Yes, I think "flush" belongs in the name, and to make it easily
> > grep-able separate from _nocache we can call it _flushcache? An
> > efficient implementation will use _nocache / non-temporal stores
> > internally, but external consumers just care about the state of the
> > cache after the call.
> 
> _flushcache() works for me too.
> 

Works for me too.
Thanks,
-Toshi



Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-07 Thread Ingo Molnar

* Dan Williams  wrote:

> On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar  wrote:
> >
> > * Dan Williams  wrote:
> >
> >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> >> >  :
> >> >> > > ---
> >> >> > > Changes since the initial RFC:
> >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> >> > > set_memory_wt(), etc. (Ingo)
> >> >> >
> >> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> >> >> > semantics, not WT semantics.
> >> >>
> >> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> >> non-temporal stores and explicit cache flushing.
> >> >>
> >> >> > How about using "nocache" as it's been
> >> >> > used in __copy_user_nocache()?
> >> >>
> >> >> The difference in my mind is that the "_nocache" suffix indicates
> >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> >> strictly arranges for caches not to contain dirty data upon
> >> >> completion of the routine. For example, non-temporal stores on older
> >> >> x86 cpus could potentially leave dirty data in the cache, so
> >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >> >
> >> > I see.  I agree that its behavior is different from the existing one
> >> > with "_nocache".   That said, I think "wt" or "write-through" generally
> >> > means that writes allocate cachelines and keep them clean by writing to
> >> > memory.  So, subsequent reads to the destination will hit the
> >> > cachelines.  This is not the case with this interface.
> >>
> >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> >> comes along and is surprised that the cache is not warm for reads
> >> after memcpy_wt(), at which point we can ask "why not just use plain
> >> memcpy then?", or set the page-attributes to WT.
> >
> > > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> > > and that no cache line is left around afterwards (dirty or clean)?
> 
> Yes, I think "flush" belongs in the name, and to make it easily
> grep-able separate from _nocache we can call it _flushcache? An
> efficient implementation will use _nocache / non-temporal stores
> internally, but external consumers just care about the state of the
> cache after the call.

_flushcache() works for me too.

Thanks,

Ingo


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-06 Thread Dan Williams
On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar  wrote:
>
> * Dan Williams  wrote:
>
>> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu  wrote:
>> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
>> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
>> >  :
>> >> > > ---
>> >> > > Changes since the initial RFC:
>> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
>> >> > > set_memory_wt(), etc. (Ingo)
>> >> >
>> >> > Sorry I should have said earlier, but I think the term "wt" is
>> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
>> >> > semantics, not WT semantics.
>> >>
>> >> The non-temporal stores do, but memcpy_wt() is using a combination of
>> >> non-temporal stores and explicit cache flushing.
>> >>
>> >> > How about using "nocache" as it's been
>> >> > used in __copy_user_nocache()?
>> >>
>> >> The difference in my mind is that the "_nocache" suffix indicates
>> >> opportunistic / optional cache pollution avoidance whereas "_wt"
>> >> strictly arranges for caches not to contain dirty data upon
>> >> completion of the routine. For example, non-temporal stores on older
>> >> x86 cpus could potentially leave dirty data in the cache, so
>> >> memcpy_wt on those cpus would need to use explicit cache flushing.
>> >
>> > I see.  I agree that its behavior is different from the existing one
>> > with "_nocache".   That said, I think "wt" or "write-through" generally
>> > means that writes allocate cachelines and keep them clean by writing to
>> > memory.  So, subsequent reads to the destination will hit the
>> > cachelines.  This is not the case with this interface.
>>
>> True... maybe _nocache_strict()? Or, leave it _wt() until someone
>> comes along and is surprised that the cache is not warm for reads
>> after memcpy_wt(), at which point we can ask "why not just use plain
>> memcpy then?", or set the page-attributes to WT.
>
> Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal and
> that no cache line is left around afterwards (dirty or clean)?

Yes, I think "flush" belongs in the name, and to make it easily
grep-able separate from _nocache we can call it _flushcache? An
efficient implementation will use _nocache / non-temporal stores
internally, but external consumers just care about the state of the
cache after the call.
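
To make the "non-temporal stores internally, clean cache externally" point
concrete, here is a rough user-space analogue. It is a sketch under stated
assumptions and not the kernel routine: copy_flushcache() is a hypothetical
name, the code uses compiler intrinsics (_mm_stream_si64, _mm_clflush,
_mm_sfence) where the patch uses inline assembly and CLWB, and on anything
other than x86-64 it just falls back to memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64(), _mm_clflush(), _mm_sfence() */
#endif

/* Hypothetical user-space analogue of a "_flushcache" style copy. */
static void copy_flushcache(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__)
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i = 0, head, tail;
	long long v;

	/* Head: ordinary cached stores up to the first 8-byte boundary. */
	while (i < n && ((uintptr_t)(d + i) & 7) != 0) {
		d[i] = s[i];
		i++;
	}
	head = i;

	/* Bulk: 8-byte non-temporal stores to the now-aligned destination. */
	for (; n - i >= 8; i += 8) {
		memcpy(&v, s + i, sizeof(v));
		_mm_stream_si64((long long *)(d + i), v);
	}
	tail = i;

	/* Tail: fewer than 8 bytes remain, again written with cached stores. */
	for (; i < n; i++)
		d[i] = s[i];

	/*
	 * Head and tail each sit inside one aligned 8-byte chunk, so one
	 * flush per piece covers every line dirtied by the cached stores.
	 */
	if (head)
		_mm_clflush(d);
	if (tail < n)
		_mm_clflush(d + tail);

	/* Fence so the non-temporal stores are ordered before returning. */
	_mm_sfence();
#else
	/* Assumption: no equivalent instructions here, so use memcpy(). */
	memcpy(dst, src, n);
#endif
}
```

The caller only sees "data copied, no dirty lines left behind by this
call"; whether a given byte went out through a non-temporal store or a
cached store plus flush is an internal detail, which is the argument for a
name that describes the resulting cache state.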


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-06 Thread Ingo Molnar

* Dan Williams  wrote:

> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu  wrote:
> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> >  :
> >> > > ---
> >> > > Changes since the initial RFC:
> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> > > set_memory_wt(), etc. (Ingo)
> >> >
> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> >> > semantics, not WT semantics.
> >>
> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> non-temporal stores and explicit cache flushing.
> >>
> >> > How about using "nocache" as it's been
> >> > used in __copy_user_nocache()?
> >>
> >> The difference in my mind is that the "_nocache" suffix indicates
> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> strictly arranges for caches not to contain dirty data upon
> >> completion of the routine. For example, non-temporal stores on older
> >> x86 cpus could potentially leave dirty data in the cache, so
> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >
> > I see.  I agree that its behavior is different from the existing one
> > with "_nocache".   That said, I think "wt" or "write-through" generally
> > means that writes allocate cachelines and keep them clean by writing to
> > memory.  So, subsequent reads to the destination will hit the
> > cachelines.  This is not the case with this interface.
> 
> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> comes along and is surprised that the cache is not warm for reads
> after memcpy_wt(), at which point we can ask "why not just use plain
> memcpy then?", or set the page-attributes to WT.

Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal and
that no cache line is left around afterwards (dirty or clean)?

Thanks,

Ingo


RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Kani, Toshimitsu
> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu 
> wrote:
> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu 
> >> wrote:
> >  :
> >> > > ---
> >> > > Changes since the initial RFC:
> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> > > set_memory_wt(), etc. (Ingo)
> >> >
> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> >> > semantics, not WT semantics.
> >>
> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> non-temporal stores and explicit cache flushing.
> >>
> >> > How about using "nocache" as it's been
> >> > used in __copy_user_nocache()?
> >>
> >> The difference in my mind is that the "_nocache" suffix indicates
> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> strictly arranges for caches not to contain dirty data upon
> >> completion of the routine. For example, non-temporal stores on older
> >> x86 cpus could potentially leave dirty data in the cache, so
> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >
> > I see.  I agree that its behavior is different from the existing one
> > with "_nocache".   That said, I think "wt" or "write-through" generally
> > means that writes allocate cachelines and keep them clean by writing to
> > memory.  So, subsequent reads to the destination will hit the
> > cachelines.  This is not the case with this interface.
> 
> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> comes along and is surprised that the cache is not warm for reads
> after memcpy_wt(), at which point we can ask "why not just use plain
> memcpy then?", or set the page-attributes to WT.

I prefer _nocache_strict(), if it's not too long, since it avoids any
confusion.  If other arches actually implement it with WT semantics,
we might become the one to change it, instead of the caller.

Thanks,
-Toshi



Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Dan Williams
On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu  wrote:
> On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
>> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu 
>> wrote:
>  :
>> > > ---
>> > > Changes since the initial RFC:
>> > > * s/writethru/wt/ since we already have ioremap_wt(),
>> > > set_memory_wt(), etc. (Ingo)
>> >
>> > Sorry I should have said earlier, but I think the term "wt" is
>> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
>> > semantics, not WT semantics.
>>
>> The non-temporal stores do, but memcpy_wt() is using a combination of
>> non-temporal stores and explicit cache flushing.
>>
>> > How about using "nocache" as it's been
>> > used in __copy_user_nocache()?
>>
>> The difference in my mind is that the "_nocache" suffix indicates
>> opportunistic / optional cache pollution avoidance whereas "_wt"
>> strictly arranges for caches not to contain dirty data upon
>> completion of the routine. For example, non-temporal stores on older
>> x86 cpus could potentially leave dirty data in the cache, so
>> memcpy_wt on those cpus would need to use explicit cache flushing.
>
> I see.  I agree that its behavior is different from the existing one
> with "_nocache".   That said, I think "wt" or "write-through" generally
> means that writes allocate cachelines and keep them clean by writing to
> memory.  So, subsequent reads to the destination will hit the
> cachelines.  This is not the case with this interface.

True... maybe _nocache_strict()? Or, leave it _wt() until someone
comes along and is surprised that the cache is not warm for reads
after memcpy_wt(), at which point we can ask "why not just use plain
memcpy then?", or set the page-attributes to WT.


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Kani, Toshimitsu
On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu 
> wrote:
 :
> > > ---
> > > Changes since the initial RFC:
> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > > set_memory_wt(), etc. (Ingo)
> > 
> > Sorry I should have said earlier, but I think the term "wt" is
> > misleading.  Non-temporal stores used in memcpy_wt() provide WC
> > semantics, not WT semantics.
> 
> The non-temporal stores do, but memcpy_wt() is using a combination of
> non-temporal stores and explicit cache flushing.
> 
> > How about using "nocache" as it's been
> > used in __copy_user_nocache()?
> 
> The difference in my mind is that the "_nocache" suffix indicates
> opportunistic / optional cache pollution avoidance whereas "_wt"
> strictly arranges for caches not to contain dirty data upon
> completion of the routine. For example, non-temporal stores on older
> x86 cpus could potentially leave dirty data in the cache, so
> memcpy_wt on those cpus would need to use explicit cache flushing.

I see.  I agree that its behavior is different from the existing one
with "_nocache".   That said, I think "wt" or "write-through" generally
means that writes allocate cachelines and keep them clean by writing to
memory.  So, subsequent reads to the destination will hit the
cachelines.  This is not the case with this interface.

Thanks,
-Toshi
 

Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Dan Williams
On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu  wrote:
> On Fri, 2017-04-28 at 12:39 -0700, Dan Williams wrote:
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination
>> writes are not cached. It is sufficient for the writes to be flushed
>> to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we
>> expect userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA
>> or REQ_FLUSH to the pmem driver which will turn around and fence
>> previous writes with an "sfence".
>>
>> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and
>> memcpy_wt, that guarantee that the destination buffer is not dirty in
>> the cpu cache on completion. The new copy_from_iter_wt and sub-
>> routines will be used to replace the "pmem api" (include/linux/pmem.h
>> + arch/x86/include/asm/pmem.h). The availability of
>> copy_from_iter_wt() and memcpy_wt() are gated by the
>> CONFIG_ARCH_HAS_UACCESS_WT config symbol, and fallback to
>> copy_from_iter_nocache() and plain memcpy() otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver
>> wants to do something beyond the normal nocache semantics it should
>> be something private to that driver [1], and Al's concern that
>> anything uaccess related belongs with the rest of the uaccess code
>> [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc: 
>> Cc: Jan Kara 
>> Cc: Jeff Moyer 
>> Cc: Ingo Molnar 
>> Cc: Christoph Hellwig 
>> Cc: "H. Peter Anvin" 
>> Cc: Al Viro 
>> Cc: Thomas Gleixner 
>> Cc: Matthew Wilcox 
>> Cc: Ross Zwisler 
>> Signed-off-by: Dan Williams 
>> ---
>> Changes since the initial RFC:
>> * s/writethru/wt/ since we already have ioremap_wt(),
>> set_memory_wt(), etc. (Ingo)
>
> Sorry I should have said earlier, but I think the term "wt" is
> misleading.  Non-temporal stores used in memcpy_wt() provide WC
> semantics, not WT semantics.

The non-temporal stores do, but memcpy_wt() is using a combination of
non-temporal stores and explicit cache flushing.

> How about using "nocache" as it's been
> used in __copy_user_nocache()?

The difference in my mind is that the "_nocache" suffix indicates
opportunistic / optional cache pollution avoidance whereas "_wt"
strictly arranges for caches not to contain dirty data upon completion
of the routine. For example, non-temporal stores on older x86 cpus
could potentially leave dirty data in the cache, so memcpy_wt on those
cpus would need to use explicit cache flushing.
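
To make the "non-temporal stores plus explicit cache flushing" combination
concrete, here is a minimal user-space sketch. It is not the code from the
patch: the function names (memcpy_flush_sketch, flush_range), the 64-byte
cache-line size and the 8-byte store width are all assumptions chosen for
illustration. Aligned 8-byte words go out with non-temporal stores; the
unaligned head and tail are copied through the cache and then flushed so
that no dirty line is left behind. On non-x86-64 builds the sketch falls
back to a plain copy with no flushing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_stream_si64, _mm_clflush, _mm_sfence */
#endif

#define CACHELINE 64    /* assumed cache-line size, for illustration only */

/* Hypothetical helper: write back and evict every line touching the range. */
static void flush_range(const void *addr, size_t len)
{
#if defined(__x86_64__)
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);
#else
    (void)addr;
    (void)len;          /* no flush primitive in the portable fallback */
#endif
}

/* Hypothetical user-space model of the idea; not the kernel's memcpy_wt(). */
static void *memcpy_flush_sketch(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t n = len;

    /* Head: bytes before the first 8-byte boundary go through the cache. */
    size_t head = (8 - ((uintptr_t)d & 7)) & 7;
    if (head > n)
        head = n;
    if (head) {
        memcpy(d, s, head);
        flush_range(d, head);
        d += head;
        s += head;
        n -= head;
    }

#if defined(__x86_64__)
    /* Body: 8-byte non-temporal stores that bypass the cache. */
    while (n >= 8) {
        long long v;

        memcpy(&v, s, sizeof(v));
        _mm_stream_si64((long long *)d, v);
        d += 8;
        s += 8;
        n -= 8;
    }
#endif

    /* Tail (or everything, in the fallback): cached copy, then flush. */
    if (n) {
        memcpy(d, s, n);
        flush_range(d, n);
    }

#if defined(__x86_64__)
    _mm_sfence();       /* order the non-temporal stores and flushes */
#endif
    return dst;
}
```

The head and tail are the reason a routine built only on non-temporal stores
cannot promise "no dirty data in the cache": any bytes that do not fit the
store width have to be written normally and flushed afterwards.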


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Kani, Toshimitsu
On Fri, 2017-04-28 at 12:39 -0700, Dan Williams wrote:
> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination
> writes are not cached. It is sufficient for the writes to be flushed
> to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we
> expect userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA
> or REQ_FLUSH to the pmem driver which will turn around and fence
> previous writes with an "sfence".
> 
> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and
> memcpy_wt, that guarantee that the destination buffer is not dirty in
> the cpu cache on completion. The new copy_from_iter_wt and sub-
> routines will be used to replace the "pmem api" (include/linux/pmem.h
> + arch/x86/include/asm/pmem.h). The availability of
> copy_from_iter_wt() and memcpy_wt() are gated by the
> CONFIG_ARCH_HAS_UACCESS_WT config symbol, and fallback to
> copy_from_iter_nocache() and plain memcpy() otherwise.
> 
> This is meant to satisfy the concern from Linus that if a driver
> wants to do something beyond the normal nocache semantics it should
> be something private to that driver [1], and Al's concern that
> anything uaccess related belongs with the rest of the uaccess code
> [2].
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
> 
> Cc: 
> Cc: Jan Kara 
> Cc: Jeff Moyer 
> Cc: Ingo Molnar 
> Cc: Christoph Hellwig 
> Cc: "H. Peter Anvin" 
> Cc: Al Viro 
> Cc: Thomas Gleixner 
> Cc: Matthew Wilcox 
> Cc: Ross Zwisler 
> Signed-off-by: Dan Williams 
> ---
> Changes since the initial RFC:
> * s/writethru/wt/ since we already have ioremap_wt(),
> set_memory_wt(), etc. (Ingo)

Sorry I should have said earlier, but I think the term "wt" is
misleading.  Non-temporal stores used in memcpy_wt() provide WC
semantics, not WT semantics.  How about using "nocache" as it's been
used in __copy_user_nocache()?

Thanks,
-Toshi


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Dan Williams
On Thu, May 4, 2017 at 11:54 PM, Ingo Molnar  wrote:
>
> * Dan Williams  wrote:
>
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination writes
>> are not cached. It is sufficient for the writes to be flushed to a
>> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
>> userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
>> REQ_FLUSH to the pmem driver which will turn around and fence previous
>> writes with an "sfence".
>>
>> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
>> that guarantee that the destination buffer is not dirty in the cpu cache
>> on completion. The new copy_from_iter_wt and sub-routines will be used
>> to replace the "pmem api" (include/linux/pmem.h +
>> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
>> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
>> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
>> otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver wants
>> to do something beyond the normal nocache semantics it should be
>> something private to that driver [1], and Al's concern that anything
>> uaccess related belongs with the rest of the uaccess code [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc: 
>> Cc: Jan Kara 
>> Cc: Jeff Moyer 
>> Cc: Ingo Molnar 
>> Cc: Christoph Hellwig 
>> Cc: "H. Peter Anvin" 
>> Cc: Al Viro 
>> Cc: Thomas Gleixner 
>> Cc: Matthew Wilcox 
>> Cc: Ross Zwisler 
>> Signed-off-by: Dan Williams 
>> ---
>> Changes since the initial RFC:
>> * s/writethru/wt/ since we already have ioremap_wt(), set_memory_wt(),
>>   etc. (Ingo)
>
> Looks good to me. I suspect you'd like to carry this in the nvdimm tree?
>
> Acked-by: Ingo Molnar 

Thanks, Ingo! Yes, I'll carry it in nvdimm.git for 4.13.


Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations

2017-05-05 Thread Ingo Molnar

* Dan Williams  wrote:

> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination writes
> are not cached. It is sufficient for the writes to be flushed to a
> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
> userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
> REQ_FLUSH to the pmem driver which will turn around and fence previous
> writes with an "sfence".
> 
> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
> that guarantee that the destination buffer is not dirty in the cpu cache
> on completion. The new copy_from_iter_wt and sub-routines will be used
> to replace the "pmem api" (include/linux/pmem.h +
> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
> otherwise.
> 
> This is meant to satisfy the concern from Linus that if a driver wants
> to do something beyond the normal nocache semantics it should be
> something private to that driver [1], and Al's concern that anything
> uaccess related belongs with the rest of the uaccess code [2].
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
> 
> Cc: 
> Cc: Jan Kara 
> Cc: Jeff Moyer 
> Cc: Ingo Molnar 
> Cc: Christoph Hellwig 
> Cc: "H. Peter Anvin" 
> Cc: Al Viro 
> Cc: Thomas Gleixner 
> Cc: Matthew Wilcox 
> Cc: Ross Zwisler 
> Signed-off-by: Dan Williams 
> ---
> Changes since the initial RFC:
> * s/writethru/wt/ since we already have ioremap_wt(), set_memory_wt(),
>   etc. (Ingo)

Looks good to me. I suspect you'd like to carry this in the nvdimm tree?

Acked-by: Ingo Molnar 

Thanks,

Ingo

