Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Mon, May 8, 2017 at 1:32 PM, Ross Zwisler wrote:
> On Fri, Apr 28, 2017 at 12:39:12PM -0700, Dan Williams wrote:
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination writes
>> are not cached. It is sufficient for the writes to be flushed to a
>> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
>> userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
>> REQ_FLUSH to the pmem driver which will turn around and fence previous
>> writes with an "sfence".
>>
>> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
>> that guarantee that the destination buffer is not dirty in the cpu cache
>> on completion. The new copy_from_iter_wt and sub-routines will be used
>> to replace the "pmem api" (include/linux/pmem.h +
>> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
>> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
>> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
>> otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver wants
>> to do something beyond the normal nocache semantics it should be
>> something private to that driver [1], and Al's concern that anything
>> uaccess related belongs with the rest of the uaccess code [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc:
>> Cc: Jan Kara
>> Cc: Jeff Moyer
>> Cc: Ingo Molnar
>> Cc: Christoph Hellwig
>> Cc: "H. Peter Anvin"
>> Cc: Al Viro
>> Cc: Thomas Gleixner
>> Cc: Matthew Wilcox
>> Cc: Ross Zwisler
>> Signed-off-by: Dan Williams
[..]
> I took a pretty hard look at the changes in arch/x86/lib/usercopy_64.c, and
> they look correct to me. The inline assembly for non-temporal copies mixed
> with C for loop control is IMHO much easier to follow than the pure assembly
> of __copy_user_nocache().
>
> Reviewed-by: Ross Zwisler

Thanks Ross, I appreciate it.
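
For readers following along, the flow described in the quoted changelog
(non-temporal stores for the copy, an sfence later when fsync() arrives,
and a plain fallback where the architecture has no support) can be
sketched in userspace C. This is only an illustration under assumptions:
sketch_memcpy_wt() and sketch_fence() are hypothetical names, the
__x86_64__ test stands in for CONFIG_ARCH_HAS_UACCESS_WT, and the
compiler intrinsics _mm_stream_si64() / _mm_sfence() take the place of
the kernel's inline assembly. It is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64(), _mm_sfence() */
#define SKETCH_HAS_WT 1	/* stand-in for CONFIG_ARCH_HAS_UACCESS_WT */
#else
#define SKETCH_HAS_WT 0
#endif

/*
 * Hypothetical helper: copy n bytes, using non-temporal 8-byte stores
 * when dst is 8-byte aligned and n is a multiple of 8.
 */
static void sketch_memcpy_wt(void *dst, const void *src, size_t n)
{
#if SKETCH_HAS_WT
	if (((uintptr_t)dst % 8) == 0 && (n % 8) == 0) {
		long long *d = dst;
		const unsigned char *s = src;

		for (size_t i = 0; i < n / 8; i++) {
			long long v;

			memcpy(&v, s + i * 8, sizeof(v));	/* src may be unaligned */
			_mm_stream_si64(&d[i], v);		/* movnti store */
		}
		return;
	}
#endif
	/*
	 * Fallback: plain cached copy. The patch additionally writes
	 * back the affected cache lines for unaligned transfers; that
	 * step is left out of this sketch.
	 */
	memcpy(dst, src, n);
}

/* Hypothetical helper: order earlier non-temporal stores (the "sfence" step). */
static void sketch_fence(void)
{
#if SKETCH_HAS_WT
	_mm_sfence();
#endif
}
```

A caller would run sketch_memcpy_wt() for each write and call
sketch_fence() once at the point where durability is requested, which
mirrors the split between the copy routine and the fsync()-triggered
fence in the changelog.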
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Fri, Apr 28, 2017 at 12:39:12PM -0700, Dan Williams wrote:
> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination writes
> are not cached. It is sufficient for the writes to be flushed to a
> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
> userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
> REQ_FLUSH to the pmem driver which will turn around and fence previous
> writes with an "sfence".
>
> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
> that guarantee that the destination buffer is not dirty in the cpu cache
> on completion. The new copy_from_iter_wt and sub-routines will be used
> to replace the "pmem api" (include/linux/pmem.h +
> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
> otherwise.
>
> This is meant to satisfy the concern from Linus that if a driver wants
> to do something beyond the normal nocache semantics it should be
> something private to that driver [1], and Al's concern that anything
> uaccess related belongs with the rest of the uaccess code [2].
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>
> Cc:
> Cc: Jan Kara
> Cc: Jeff Moyer
> Cc: Ingo Molnar
> Cc: Christoph Hellwig
> Cc: "H. Peter Anvin"
> Cc: Al Viro
> Cc: Thomas Gleixner
> Cc: Matthew Wilcox
> Cc: Ross Zwisler
> Signed-off-by: Dan Williams
> ---
<>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c5504b9a472e..07ded30c7e89 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -171,6 +171,10 @@ unsigned long raw_copy_in_user(void __user *dst, const void __user *src, unsigne
>  extern long __copy_user_nocache(void *dst, const void __user *src,
>  				unsigned size, int zerorest);
>
> +extern long __copy_user_wt(void *dst, const void __user *src, unsigned size);
> +extern void memcpy_page_wt(char *to, struct page *page, size_t offset,
> +			   size_t len);
> +
>  static inline int
>  __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
>  				  unsigned size)
> @@ -179,6 +183,13 @@ __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
>  	return __copy_user_nocache(dst, src, size, 0);
>  }
>
> +static inline int
> +__copy_from_user_inatomic_wt(void *dst, const void __user *src, unsigned size)
> +{
> +	kasan_check_write(dst, size);
> +	return __copy_user_wt(dst, src, size);
> +}
> +
>  unsigned long
>  copy_user_handle_tail(char *to, char *from, unsigned len);
>
> diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
> index 3b7c40a2e3e1..0aeff66a022f 100644
> --- a/arch/x86/lib/usercopy_64.c
> +++ b/arch/x86/lib/usercopy_64.c
> @@ -7,6 +7,7 @@
>   */
>  #include
>  #include
> +#include
>
>  /*
>   * Zero Userspace
> @@ -73,3 +74,130 @@ copy_user_handle_tail(char *to, char *from, unsigned len)
>  	clac();
>  	return len;
>  }
> +
> +#ifdef CONFIG_ARCH_HAS_UACCESS_WT
> +/**
> + * clean_cache_range - write back a cache range with CLWB
> + * @vaddr:	virtual start address
> + * @size:	number of bytes to write back
> + *
> + * Write back a cache range using the CLWB (cache line write back)
> + * instruction. Note that @size is internally rounded up to be cache
> + * line size aligned.
> + */
> +static void clean_cache_range(void *addr, size_t size)
> +{
> +	u16 x86_clflush_size = boot_cpu_data.x86_clflush_size;
> +	unsigned long clflush_mask = x86_clflush_size - 1;
> +	void *vend = addr + size;
> +	void *p;
> +
> +	for (p = (void *)((unsigned long)addr & ~clflush_mask);
> +	     p < vend; p += x86_clflush_size)
> +		clwb(p);
> +}
> +
> +long __copy_user_wt(void *dst, const void __user *src, unsigned size)
> +{
> +	unsigned long flushed, dest = (unsigned long) dst;
> +	long rc = __copy_user_nocache(dst, src, size, 0);
> +
> +	/*
> +	 * __copy_user_nocache() uses non-temporal stores for the bulk
> +	 * of the transfer, but we need to manually flush if the
> +	 * transfer is unaligned. A cached memory copy is used when
> +	 * destination or size is not naturally aligned. That is:
> +	 *   - Require 8-byte alignment when size is 8 bytes or larger.
> +	 *   - Require 4-byte alignment when size is 4 bytes.
> +	 */
> +	if (size < 8) {
> +		if (!IS_ALIGNED(dest, 4) || size != 4)
> +			clean_cache_range(dst, 1);
> +	} else {
> +		if (!IS_ALIGNED(dest, 8)) {
> +			dest
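
The two pieces of arithmetic visible in the quoted hunk (rounding the
start address down to a cache-line boundary in clean_cache_range(), and
the alignment checks at the top of __copy_user_wt()) can be tried out in
plain userspace C. The sketch below is an assumption-laden illustration,
not the kernel code: the names sketch_lines_touched() and
sketch_head_needs_flush() are hypothetical, the line size is fixed at 64
bytes instead of being read from boot_cpu_data.x86_clflush_size, and no
cache instruction is issued. The quoted diff is cut off before the tail
handling, so only the checks that are actually shown are modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size; the kernel reads boot_cpu_data.x86_clflush_size. */
#define SKETCH_LINE_SIZE 64u

/*
 * Count the cache lines the loop in clean_cache_range() would visit:
 * start at addr rounded down to a line boundary, step one line at a
 * time, stop once the cursor reaches addr + size.
 */
static size_t sketch_lines_touched(uintptr_t addr, size_t size)
{
	uintptr_t mask = SKETCH_LINE_SIZE - 1;
	uintptr_t end = addr + size;
	size_t count = 0;

	for (uintptr_t p = addr & ~mask; p < end; p += SKETCH_LINE_SIZE)
		count++;	/* the kernel issues clwb(p) at this point */
	return count;
}

/*
 * Mirror the visible checks: does the start of the destination need an
 * explicit write-back after the non-temporal copy?
 */
static bool sketch_head_needs_flush(uintptr_t dest, size_t size)
{
	if (size < 8)
		return (dest % 4) != 0 || size != 4;
	return (dest % 8) != 0;
}
```

For example, a 2-byte range starting at 0x103f straddles a line boundary
and therefore visits two lines, while an aligned 4-byte copy needs no
extra write-back under the checks shown.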
RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
> * Dan Williams wrote:
>
> > On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar wrote:
> > >
> > > * Dan Williams wrote:
> > >
> > >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> > >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> > >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> > >> > :
> > >> >> > > ---
> > >> >> > > Changes since the initial RFC:
> > >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > >> >> > > set_memory_wt(), etc. (Ingo)
> > >> >> >
> > >> >> > Sorry I should have said earlier, but I think the term "wt" is
> > >> >> > misleading. Non-temporal stores used in memcpy_wt() provide WC
> > >> >> > semantics, not WT semantics.
> > >> >>
> > >> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> > >> >> non-temporal stores and explicit cache flushing.
> > >> >>
> > >> >> > How about using "nocache" as it's been
> > >> >> > used in __copy_user_nocache()?
> > >> >>
> > >> >> The difference in my mind is that the "_nocache" suffix indicates
> > >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> > >> >> strictly arranges for caches not to contain dirty data upon
> > >> >> completion of the routine. For example, non-temporal stores on older
> > >> >> x86 cpus could potentially leave dirty data in the cache, so
> > >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> > >> >
> > >> > I see. I agree that its behavior is different from the existing one
> > >> > with "_nocache". That said, I think "wt" or "write-through" generally
> > >> > means that writes allocate cachelines and keep them clean by writing to
> > >> > memory. So, subsequent reads to the destination will hit the
> > >> > cachelines. This is not the case with this interface.
> > >>
> > >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> > >> comes along and is surprised that the cache is not warm for reads
> > >> after memcpy_wt(), at which point we can ask "why not just use plain
> > >> memcpy then?", or set the page-attributes to WT.
> > >
> > > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> > > and that no cache line is left around afterwards (dirty or clean)?
> >
> > Yes, I think "flush" belongs in the name, and to make it easily
> > grep-able separate from _nocache we can call it _flushcache? An
> > efficient implementation will use _nocache / non-temporal stores
> > internally, but external consumers just care about the state of the
> > cache after the call.
>
> _flushcache() works for me too.
>

Works for me too.

Thanks,
-Toshi
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
* Dan Williams wrote:

> On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar wrote:
> >
> > * Dan Williams wrote:
> >
> >> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu
> >> wrote:
> >> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu
> >> >> wrote:
> >> > :
> >> >> > > ---
> >> >> > > Changes since the initial RFC:
> >> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> >> > > set_memory_wt(), etc. (Ingo)
> >> >> >
> >> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> >> > misleading. Non-temporal stores used in memcpy_wt() provide WC
> >> >> > semantics, not WT semantics.
> >> >>
> >> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> >> non-temporal stores and explicit cache flushing.
> >> >>
> >> >> > How about using "nocache" as it's been
> >> >> > used in __copy_user_nocache()?
> >> >>
> >> >> The difference in my mind is that the "_nocache" suffix indicates
> >> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> >> strictly arranges for caches not to contain dirty data upon
> >> >> completion of the routine. For example, non-temporal stores on older
> >> >> x86 cpus could potentially leave dirty data in the cache, so
> >> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >> >
> >> > I see. I agree that its behavior is different from the existing one
> >> > with "_nocache". That said, I think "wt" or "write-through" generally
> >> > means that writes allocate cachelines and keep them clean by writing to
> >> > memory. So, subsequent reads to the destination will hit the
> >> > cachelines. This is not the case with this interface.
> >>
> >> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> >> comes along and is surprised that the cache is not warm for reads
> >> after memcpy_wt(), at which point we can ask "why not just use plain
> >> memcpy then?", or set the page-attributes to WT.
> >
> > Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
> > and that no cache line is left around afterwards (dirty or clean)?
>
> Yes, I think "flush" belongs in the name, and to make it easily
> grep-able separate from _nocache we can call it _flushcache? An
> efficient implementation will use _nocache / non-temporal stores
> internally, but external consumers just care about the state of the
> cache after the call.

_flushcache() works for me too.

Thanks,

	Ingo
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Sat, May 6, 2017 at 2:46 AM, Ingo Molnar wrote:
>
> * Dan Williams wrote:
>
>> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
>> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
>> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu
>> >> wrote:
>> > :
>> >> > > ---
>> >> > > Changes since the initial RFC:
>> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
>> >> > > set_memory_wt(), etc. (Ingo)
>> >> >
>> >> > Sorry I should have said earlier, but I think the term "wt" is
>> >> > misleading. Non-temporal stores used in memcpy_wt() provide WC
>> >> > semantics, not WT semantics.
>> >>
>> >> The non-temporal stores do, but memcpy_wt() is using a combination of
>> >> non-temporal stores and explicit cache flushing.
>> >>
>> >> > How about using "nocache" as it's been
>> >> > used in __copy_user_nocache()?
>> >>
>> >> The difference in my mind is that the "_nocache" suffix indicates
>> >> opportunistic / optional cache pollution avoidance whereas "_wt"
>> >> strictly arranges for caches not to contain dirty data upon
>> >> completion of the routine. For example, non-temporal stores on older
>> >> x86 cpus could potentially leave dirty data in the cache, so
>> >> memcpy_wt on those cpus would need to use explicit cache flushing.
>> >
>> > I see. I agree that its behavior is different from the existing one
>> > with "_nocache". That said, I think "wt" or "write-through" generally
>> > means that writes allocate cachelines and keep them clean by writing to
>> > memory. So, subsequent reads to the destination will hit the
>> > cachelines. This is not the case with this interface.
>>
>> True... maybe _nocache_strict()? Or, leave it _wt() until someone
>> comes along and is surprised that the cache is not warm for reads
>> after memcpy_wt(), at which point we can ask "why not just use plain
>> memcpy then?", or set the page-attributes to WT.
>
> Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal and
> that no cache line is left around afterwards (dirty or clean)?

Yes, I think "flush" belongs in the name, and to make it easily
grep-able separate from _nocache we can call it _flushcache? An
efficient implementation will use _nocache / non-temporal stores
internally, but external consumers just care about the state of the
cache after the call.
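
The naming question in this sub-thread comes down to which guarantee
each suffix makes about the cache once the call returns. A small C table
of those contracts, as they are stated in the messages above, may help
keep them apart. Everything in it is a hypothetical illustration: the
enum, the function names, and the reduction of each suffix to two yes/no
properties are assumptions made for this sketch, not kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical labels for the copy flavors compared in the thread. */
enum copy_flavor {
	FLAVOR_PLAIN,		/* plain memcpy() */
	FLAVOR_NOCACHE,		/* _nocache: opportunistic cache avoidance */
	FLAVOR_FLUSHCACHE,	/* _flushcache: the name proposed above */
};

/* May the destination still be dirty in the CPU cache on return? */
static bool may_leave_dirty_lines(enum copy_flavor f)
{
	switch (f) {
	case FLAVOR_PLAIN:
	case FLAVOR_NOCACHE:
		return true;	/* neither makes a guarantee */
	case FLAVOR_FLUSHCACHE:
		return false;	/* the property external callers rely on */
	}
	return true;
}

/*
 * Should a caller expect later reads of the destination to hit the
 * cache? Per the thread, that is what "write-through" would suggest
 * and what this interface does not provide.
 */
static bool reads_likely_warm(enum copy_flavor f)
{
	return f == FLAVOR_PLAIN;
}
```

Read this way, _flushcache is the only flavor for which
may_leave_dirty_lines() is false, and it makes no promise about warm
reads, which is why "wt" was judged misleading.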
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
* Dan Williams wrote:

> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu
> >> wrote:
> > :
> >> > > ---
> >> > > Changes since the initial RFC:
> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> > > set_memory_wt(), etc. (Ingo)
> >> >
> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> > misleading. Non-temporal stores used in memcpy_wt() provide WC
> >> > semantics, not WT semantics.
> >>
> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> non-temporal stores and explicit cache flushing.
> >>
> >> > How about using "nocache" as it's been
> >> > used in __copy_user_nocache()?
> >>
> >> The difference in my mind is that the "_nocache" suffix indicates
> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> strictly arranges for caches not to contain dirty data upon
> >> completion of the routine. For example, non-temporal stores on older
> >> x86 cpus could potentially leave dirty data in the cache, so
> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >
> > I see. I agree that its behavior is different from the existing one
> > with "_nocache". That said, I think "wt" or "write-through" generally
> > means that writes allocate cachelines and keep them clean by writing to
> > memory. So, subsequent reads to the destination will hit the
> > cachelines. This is not the case with this interface.
>
> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> comes along and is surprised that the cache is not warm for reads
> after memcpy_wt(), at which point we can ask "why not just use plain
> memcpy then?", or set the page-attributes to WT.

Perhaps a _nocache_flush() postfix, to signal both that it's non-temporal
and that no cache line is left around afterwards (dirty or clean)?

Thanks,

	Ingo
RE: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
> On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> > On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> >> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> > :
> >> > > ---
> >> > > Changes since the initial RFC:
> >> > > * s/writethru/wt/ since we already have ioremap_wt(),
> >> > >   set_memory_wt(), etc. (Ingo)
> >> >
> >> > Sorry I should have said earlier, but I think the term "wt" is
> >> > misleading. Non-temporal stores used in memcpy_wt() provide WC
> >> > semantics, not WT semantics.
> >>
> >> The non-temporal stores do, but memcpy_wt() is using a combination of
> >> non-temporal stores and explicit cache flushing.
> >>
> >> > How about using "nocache" as it's been
> >> > used in __copy_user_nocache()?
> >>
> >> The difference in my mind is that the "_nocache" suffix indicates
> >> opportunistic / optional cache pollution avoidance whereas "_wt"
> >> strictly arranges for caches not to contain dirty data upon
> >> completion of the routine. For example, non-temporal stores on older
> >> x86 cpus could potentially leave dirty data in the cache, so
> >> memcpy_wt on those cpus would need to use explicit cache flushing.
> >
> > I see. I agree that its behavior is different from the existing one
> > with "_nocache". That said, I think "wt" or "write-through" generally
> > means that writes allocate cachelines and keep them clean by writing to
> > memory. So, subsequent reads to the destination will hit the
> > cachelines. This is not the case with this interface.
>
> True... maybe _nocache_strict()? Or, leave it _wt() until someone
> comes along and is surprised that the cache is not warm for reads
> after memcpy_wt(), at which point we can ask "why not just use plain
> memcpy then?", or set the page-attributes to WT.

I prefer _nocache_strict(), if it's not too long, since it avoids any
confusion. If other arches actually implement it with WT semantics, we
might become the one to change it, instead of the caller.

Thanks,
-Toshi
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Fri, May 5, 2017 at 3:44 PM, Kani, Toshimitsu wrote:
> On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
>> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> :
>> > > ---
>> > > Changes since the initial RFC:
>> > > * s/writethru/wt/ since we already have ioremap_wt(),
>> > >   set_memory_wt(), etc. (Ingo)
>> >
>> > Sorry I should have said earlier, but I think the term "wt" is
>> > misleading. Non-temporal stores used in memcpy_wt() provide WC
>> > semantics, not WT semantics.
>>
>> The non-temporal stores do, but memcpy_wt() is using a combination of
>> non-temporal stores and explicit cache flushing.
>>
>> > How about using "nocache" as it's been
>> > used in __copy_user_nocache()?
>>
>> The difference in my mind is that the "_nocache" suffix indicates
>> opportunistic / optional cache pollution avoidance whereas "_wt"
>> strictly arranges for caches not to contain dirty data upon
>> completion of the routine. For example, non-temporal stores on older
>> x86 cpus could potentially leave dirty data in the cache, so
>> memcpy_wt on those cpus would need to use explicit cache flushing.
>
> I see. I agree that its behavior is different from the existing one
> with "_nocache". That said, I think "wt" or "write-through" generally
> means that writes allocate cachelines and keep them clean by writing to
> memory. So, subsequent reads to the destination will hit the
> cachelines. This is not the case with this interface.

True... maybe _nocache_strict()? Or, leave it _wt() until someone
comes along and is surprised that the cache is not warm for reads
after memcpy_wt(), at which point we can ask "why not just use plain
memcpy then?", or set the page-attributes to WT.
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Fri, 2017-05-05 at 15:25 -0700, Dan Williams wrote:
> On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
 :
> > > ---
> > > Changes since the initial RFC:
> > > * s/writethru/wt/ since we already have ioremap_wt(),
> > >   set_memory_wt(), etc. (Ingo)
> >
> > Sorry I should have said earlier, but I think the term "wt" is
> > misleading. Non-temporal stores used in memcpy_wt() provide WC
> > semantics, not WT semantics.
>
> The non-temporal stores do, but memcpy_wt() is using a combination of
> non-temporal stores and explicit cache flushing.
>
> > How about using "nocache" as it's been
> > used in __copy_user_nocache()?
>
> The difference in my mind is that the "_nocache" suffix indicates
> opportunistic / optional cache pollution avoidance whereas "_wt"
> strictly arranges for caches not to contain dirty data upon
> completion of the routine. For example, non-temporal stores on older
> x86 cpus could potentially leave dirty data in the cache, so
> memcpy_wt on those cpus would need to use explicit cache flushing.

I see. I agree that its behavior is different from the existing one
with "_nocache". That said, I think "wt" or "write-through" generally
means that writes allocate cachelines and keep them clean by writing to
memory. So, subsequent reads to the destination will hit the
cachelines. This is not the case with this interface.

Thanks,
-Toshi
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Fri, May 5, 2017 at 1:39 PM, Kani, Toshimitsu wrote:
> On Fri, 2017-04-28 at 12:39 -0700, Dan Williams wrote:
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination
>> writes are not cached. It is sufficient for the writes to be flushed
>> to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we
>> expect userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA
>> or REQ_FLUSH to the pmem driver which will turn around and fence
>> previous writes with an "sfence".
>>
>> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and
>> memcpy_wt, that guarantee that the destination buffer is not dirty in
>> the cpu cache on completion. The new copy_from_iter_wt and sub-
>> routines will be used to replace the "pmem api" (include/linux/pmem.h
>> + arch/x86/include/asm/pmem.h). The availability of
>> copy_from_iter_wt() and memcpy_wt() are gated by the
>> CONFIG_ARCH_HAS_UACCESS_WT config symbol, and fallback to
>> copy_from_iter_nocache() and plain memcpy() otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver
>> wants to do something beyond the normal nocache semantics it should
>> be something private to that driver [1], and Al's concern that
>> anything uaccess related belongs with the rest of the uaccess code
>> [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc:
>> Cc: Jan Kara
>> Cc: Jeff Moyer
>> Cc: Ingo Molnar
>> Cc: Christoph Hellwig
>> Cc: "H. Peter Anvin"
>> Cc: Al Viro
>> Cc: Thomas Gleixner
>> Cc: Matthew Wilcox
>> Cc: Ross Zwisler
>> Signed-off-by: Dan Williams
>> ---
>> Changes since the initial RFC:
>> * s/writethru/wt/ since we already have ioremap_wt(),
>>   set_memory_wt(), etc. (Ingo)
>
> Sorry I should have said earlier, but I think the term "wt" is
> misleading. Non-temporal stores used in memcpy_wt() provide WC
> semantics, not WT semantics.

The non-temporal stores do, but memcpy_wt() is using a combination of
non-temporal stores and explicit cache flushing.

> How about using "nocache" as it's been
> used in __copy_user_nocache()?

The difference in my mind is that the "_nocache" suffix indicates
opportunistic / optional cache pollution avoidance whereas "_wt"
strictly arranges for caches not to contain dirty data upon
completion of the routine. For example, non-temporal stores on older
x86 cpus could potentially leave dirty data in the cache, so
memcpy_wt on those cpus would need to use explicit cache flushing.
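To make the "combination of non-temporal stores and explicit cache flushing" concrete, here is a minimal user-space sketch in C. It is not the code from the patch: the names memcpy_flushed_sketch(), store_nt64(), flush_range() and CACHELINE_SIZE are hypothetical, the 64-byte line size is an assumption, and the non-x86 branches are plain-copy stand-ins so the sketch still builds elsewhere. The idea shown is only that aligned 8-byte words go out with non-temporal stores, while the unaligned head and tail take ordinary cached stores and are then flushed so no dirty line is left behind.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64, _mm_clflush, _mm_sfence */
#endif

#define CACHELINE_SIZE 64	/* assumption; real code should query the CPU */

/* Write back and evict every cache line overlapping [addr, addr + len). */
static void flush_range(const void *addr, size_t len)
{
#if defined(__x86_64__)
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE_SIZE - 1);
	uintptr_t end = (uintptr_t)addr + len;

	for (; p < end; p += CACHELINE_SIZE)
		_mm_clflush((const void *)p);
#else
	(void)addr;
	(void)len;	/* stand-in: no flush on other architectures */
#endif
}

/* Store one 8-byte word; non-temporal on x86-64, plain copy elsewhere. */
static void store_nt64(void *dst, const void *src)
{
#if defined(__x86_64__)
	long long v;

	memcpy(&v, src, sizeof(v));
	_mm_stream_si64((long long *)dst, v);
#else
	memcpy(dst, src, 8);
#endif
}

/* Hypothetical analogue of the idea discussed for memcpy_wt(). */
static void memcpy_flushed_sketch(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t head = (size_t)(-(uintptr_t)d & 7);	/* bytes to 8-byte alignment */

	if (head > len)
		head = len;
	if (head) {
		/* Cached stores for the unaligned head, then flush them. */
		memcpy(d, s, head);
		flush_range(d, head);
		d += head; s += head; len -= head;
	}

	/* Aligned body: non-temporal stores. */
	for (; len >= 8; d += 8, s += 8, len -= 8)
		store_nt64(d, s);

	if (len) {
		/* Cached stores for the tail, then flush them. */
		memcpy(d, s, len);
		flush_range(d, len);
	}

#if defined(__x86_64__)
	/*
	 * The thread describes the pmem driver issuing the sfence later, on
	 * REQ_FUA / REQ_FLUSH; it is issued here only to keep the sketch
	 * self-contained.
	 */
	_mm_sfence();
#endif
}
```

Calling memcpy_flushed_sketch(dst, src, n) leaves dst with the same bytes a plain memcpy() would; the intended difference is only in cache state afterwards, which is the property the "_wt" versus "_nocache" naming debate is about.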
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Fri, 2017-04-28 at 12:39 -0700, Dan Williams wrote:
> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination
> writes are not cached. It is sufficient for the writes to be flushed
> to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we
> expect userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA
> or REQ_FLUSH to the pmem driver which will turn around and fence
> previous writes with an "sfence".
>
> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and
> memcpy_wt, that guarantee that the destination buffer is not dirty in
> the cpu cache on completion. The new copy_from_iter_wt and sub-
> routines will be used to replace the "pmem api" (include/linux/pmem.h
> + arch/x86/include/asm/pmem.h). The availability of
> copy_from_iter_wt() and memcpy_wt() are gated by the
> CONFIG_ARCH_HAS_UACCESS_WT config symbol, and fallback to
> copy_from_iter_nocache() and plain memcpy() otherwise.
>
> This is meant to satisfy the concern from Linus that if a driver
> wants to do something beyond the normal nocache semantics it should
> be something private to that driver [1], and Al's concern that
> anything uaccess related belongs with the rest of the uaccess code
> [2].
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>
> Cc:
> Cc: Jan Kara
> Cc: Jeff Moyer
> Cc: Ingo Molnar
> Cc: Christoph Hellwig
> Cc: "H. Peter Anvin"
> Cc: Al Viro
> Cc: Thomas Gleixner
> Cc: Matthew Wilcox
> Cc: Ross Zwisler
> Signed-off-by: Dan Williams
> ---
> Changes since the initial RFC:
> * s/writethru/wt/ since we already have ioremap_wt(),
>   set_memory_wt(), etc. (Ingo)

Sorry I should have said earlier, but I think the term "wt" is
misleading. Non-temporal stores used in memcpy_wt() provide WC
semantics, not WT semantics. How about using "nocache" as it's been
used in __copy_user_nocache()?

Thanks,
-Toshi
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
On Thu, May 4, 2017 at 11:54 PM, Ingo Molnar wrote:
>
> * Dan Williams wrote:
>
>> The pmem driver has a need to transfer data with a persistent memory
>> destination and be able to rely on the fact that the destination writes
>> are not cached. It is sufficient for the writes to be flushed to a
>> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
>> userspace to call fsync() to ensure data-writes have reached a
>> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
>> REQ_FLUSH to the pmem driver which will turn around and fence previous
>> writes with an "sfence".
>>
>> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
>> that guarantee that the destination buffer is not dirty in the cpu cache
>> on completion. The new copy_from_iter_wt and sub-routines will be used
>> to replace the "pmem api" (include/linux/pmem.h +
>> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
>> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
>> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
>> otherwise.
>>
>> This is meant to satisfy the concern from Linus that if a driver wants
>> to do something beyond the normal nocache semantics it should be
>> something private to that driver [1], and Al's concern that anything
>> uaccess related belongs with the rest of the uaccess code [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>>
>> Cc:
>> Cc: Jan Kara
>> Cc: Jeff Moyer
>> Cc: Ingo Molnar
>> Cc: Christoph Hellwig
>> Cc: "H. Peter Anvin"
>> Cc: Al Viro
>> Cc: Thomas Gleixner
>> Cc: Matthew Wilcox
>> Cc: Ross Zwisler
>> Signed-off-by: Dan Williams
>> ---
>> Changes since the initial RFC:
>> * s/writethru/wt/ since we already have ioremap_wt(), set_memory_wt(),
>>   etc. (Ingo)
>
> Looks good to me. I suspect you'd like to carry this in the nvdimm tree?
>
> Acked-by: Ingo Molnar

Thanks, Ingo!. Yes, I'll carry it in nvdimm.git for 4.13.
Re: [PATCH v2] x86, uaccess: introduce copy_from_iter_wt for pmem / writethrough operations
* Dan Williams wrote:

> The pmem driver has a need to transfer data with a persistent memory
> destination and be able to rely on the fact that the destination writes
> are not cached. It is sufficient for the writes to be flushed to a
> cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we expect
> userspace to call fsync() to ensure data-writes have reached a
> power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA or
> REQ_FLUSH to the pmem driver which will turn around and fence previous
> writes with an "sfence".
>
> Implement a __copy_from_user_inatomic_wt, memcpy_page_wt, and memcpy_wt,
> that guarantee that the destination buffer is not dirty in the cpu cache
> on completion. The new copy_from_iter_wt and sub-routines will be used
> to replace the "pmem api" (include/linux/pmem.h +
> arch/x86/include/asm/pmem.h). The availability of copy_from_iter_wt()
> and memcpy_wt() are gated by the CONFIG_ARCH_HAS_UACCESS_WT config
> symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
> otherwise.
>
> This is meant to satisfy the concern from Linus that if a driver wants
> to do something beyond the normal nocache semantics it should be
> something private to that driver [1], and Al's concern that anything
> uaccess related belongs with the rest of the uaccess code [2].
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
>
> Cc:
> Cc: Jan Kara
> Cc: Jeff Moyer
> Cc: Ingo Molnar
> Cc: Christoph Hellwig
> Cc: "H. Peter Anvin"
> Cc: Al Viro
> Cc: Thomas Gleixner
> Cc: Matthew Wilcox
> Cc: Ross Zwisler
> Signed-off-by: Dan Williams
> ---
> Changes since the initial RFC:
> * s/writethru/wt/ since we already have ioremap_wt(), set_memory_wt(),
>   etc. (Ingo)

Looks good to me. I suspect you'd like to carry this in the nvdimm tree?

Acked-by: Ingo Molnar

Thanks,
Ingo
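The config gating described in the quoted changelog can be pictured roughly as below. This is only a sketch of the pattern, not the header from the patch: the macro ARCH_HAS_UACCESS_WT_SKETCH is a hypothetical stand-in for CONFIG_ARCH_HAS_UACCESS_WT, and the memcpy_wt() signature is assumed to mirror memcpy() without a return value. When the symbol is absent, the routine degrades to a plain memcpy(), as the changelog says; the copy_from_iter_wt() to copy_from_iter_nocache() fallback would follow the same shape but needs kernel types and is left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * ARCH_HAS_UACCESS_WT_SKETCH is a hypothetical stand-in for the
 * CONFIG_ARCH_HAS_UACCESS_WT symbol named in the changelog.
 */
#ifdef ARCH_HAS_UACCESS_WT_SKETCH
/* The architecture supplies a version that leaves no dirty cache lines. */
void memcpy_wt(void *dst, const void *src, size_t cnt);
#else
/* Fallback: no stronger guarantee available, so use plain memcpy(). */
static inline void memcpy_wt(void *dst, const void *src, size_t cnt)
{
	memcpy(dst, src, cnt);
}
#endif
```

With this shape, callers use memcpy_wt() unconditionally and only the architecture decides whether the stronger cache guarantee is actually provided.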