Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-23 Thread Dan Williams
Davidlohr Bueso wrote:
> On Mon, 22 Aug 2022, Dan Williams wrote:
> 
> >Davidlohr Bueso wrote:
> >> On Sun, 21 Aug 2022, Christoph Hellwig wrote:
> >>
> >> >On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> >> >> index b192d917a6d0..ac4d4fd4e508 100644
> >> >> --- a/arch/x86/include/asm/cacheflush.h
> >> >> +++ b/arch/x86/include/asm/cacheflush.h
> >> >> @@ -10,4 +10,8 @@
> >> >>
> >> >>  void clflush_cache_range(void *addr, unsigned int size);
> >> >>
> >> >> +/* see comments in the stub version */
> >> >> +#define flush_all_caches() \
> >> >> +   do { wbinvd_on_all_cpus(); } while(0)
> >> >
> >> >Yikes.  This is just a horrible, horrible name and placement for a bad
> >> >hack that should have no generic relevance.
> >>
> >> Why does this have no generic relevance? There's already been discussions
> >> on how much wbinv is hated[0].
> >>
> >> >Please fix up the naming to make it clear that this function is for a
> >> >very specific nvdimm use case, and move it to a nvdimm-specific header
> >> >file.
> >>
> >> Do you have any suggestions for a name? And, as the changelog describes,
> >> this is not nvdimm specific anymore, and the whole point of all this is
> >> volatile memory components for cxl, hence nvdimm namespace is bogus.
> >>
> >> [0] 
> >> https://lore.kernel.org/all/yvtc2u1j%2fqip8...@worktop.programming.kicks-ass.net/
> >
> >While it is not nvdimm specific anymore, it's still specific to "memory
> >devices that can bulk invalidate a physical address space". I.e. it's
> >not as generic as its location in arch/x86/include/asm/cacheflush.h
> >would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
> >device-driver-specific header file, because hch and peterz are right, we
> >need to make this much more clear that it is not for general
> >consumption.
> 
> Fine, I won't argue - although I don't particularly agree, at least wrt
> the naming. Imo my naming does _exactly_ what it should do and is much
> easier to read than arch_has_flush_memregion() which is counter intuitive
> when we are in fact flushing everything. This does not either make it
> any more clearer about virt vs physical mappings either (except that
> it's no longer associated to cacheflush). But, excepting arm cacheflush.h's
> rare arch with braino cache users get way too much credit in their namespace
> usage.


> 
> But yes there is no doubt that my version is more inviting than it should be,
> which made me think of naming it to flush_all_caches_careful() so the user
> is forced to at least check the function (or one would hope).

So I'm not married to arch_has_flush_memregion() or even including the
physical address range to flush, the only aspect of the prototype I want
to see incorporated is something about the target / motivation for the
flush.

"flush_all_caches_careful()" says nothing about what the API is being
"careful" about. It reminds of Linus' comments on memcpy_mcsafe()

https://lore.kernel.org/all/CAHk-=wh1SPyuGkTkQESsacwKTpjWd=_-KwoCK5o=suc3ymd...@mail.gmail.com/

"Naming - like comments - shouldn't be about what some implementation
is, but about the concept."

So "memregion" was meant to represent a memory device backed physical
address range, but that association may only be in my own head.  How
about something even more explicit like:
"flush_after_memdev_invalidate()" where someone would feel icky using it
for anything other than what we have been talking about in this thread.

> Anyway, I'll send a new version based on the below - I particularly agree
> with the hypervisor bits.

Ok, just one more lap around the bikeshed track, but I think we're
converging.



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-23 Thread Davidlohr Bueso

On Mon, 22 Aug 2022, Dan Williams wrote:


Davidlohr Bueso wrote:

On Sun, 21 Aug 2022, Christoph Hellwig wrote:

>On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
>> index b192d917a6d0..ac4d4fd4e508 100644
>> --- a/arch/x86/include/asm/cacheflush.h
>> +++ b/arch/x86/include/asm/cacheflush.h
>> @@ -10,4 +10,8 @@
>>
>>  void clflush_cache_range(void *addr, unsigned int size);
>>
>> +/* see comments in the stub version */
>> +#define flush_all_caches() \
>> +  do { wbinvd_on_all_cpus(); } while(0)
>
>Yikes.  This is just a horrible, horrible name and placement for a bad
>hack that should have no generic relevance.

Why does this have no generic relevance? There's already been discussions
on how much wbinv is hated[0].

>Please fix up the naming to make it clear that this function is for a
>very specific nvdimm use case, and move it to a nvdimm-specific header
>file.

Do you have any suggestions for a name? And, as the changelog describes,
this is not nvdimm specific anymore, and the whole point of all this is
volatile memory components for cxl, hence nvdimm namespace is bogus.

[0] 
https://lore.kernel.org/all/yvtc2u1j%2fqip8...@worktop.programming.kicks-ass.net/


While it is not nvdimm specific anymore, it's still specific to "memory
devices that can bulk invalidate a physical address space". I.e. it's
not as generic as its location in arch/x86/include/asm/cacheflush.h
would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
device-driver-specific header file, because hch and peterz are right, we
need to make this much more clear that it is not for general
consumption.


Fine, I won't argue - although I don't particularly agree, at least wrt
the naming. Imo my naming does _exactly_ what it should do and is much
easier to read than arch_has_flush_memregion() which is counter intuitive
when we are in fact flushing everything. This does not either make it
any more clearer about virt vs physical mappings either (except that
it's no longer associated to cacheflush). But, excepting arm cacheflush.h's
rare arch with braino cache users get way too much credit in their namespace
usage.

But yes there is no doubt that my version is more inviting than it should be,
which made me think of naming it to flush_all_caches_careful() so the user
is forced to at least check the function (or one would hope).

Anyway, I'll send a new version based on the below - I particularly agree
with the hypervisor bits.

Thanks,
Davidlohr



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-22 Thread Dan Williams
Davidlohr Bueso wrote:
> On Sun, 21 Aug 2022, Christoph Hellwig wrote:
> 
> >On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> >> index b192d917a6d0..ac4d4fd4e508 100644
> >> --- a/arch/x86/include/asm/cacheflush.h
> >> +++ b/arch/x86/include/asm/cacheflush.h
> >> @@ -10,4 +10,8 @@
> >>
> >>  void clflush_cache_range(void *addr, unsigned int size);
> >>
> >> +/* see comments in the stub version */
> >> +#define flush_all_caches() \
> >> +  do { wbinvd_on_all_cpus(); } while(0)
> >
> >Yikes.  This is just a horrible, horrible name and placement for a bad
> >hack that should have no generic relevance.
> 
> Why does this have no generic relevance? There's already been discussions
> on how much wbinv is hated[0].
> 
> >Please fix up the naming to make it clear that this function is for a
> >very specific nvdimm use case, and move it to a nvdimm-specific header
> >file.
> 
> Do you have any suggestions for a name? And, as the changelog describes,
> this is not nvdimm specific anymore, and the whole point of all this is
> volatile memory components for cxl, hence nvdimm namespace is bogus.
> 
> [0] 
> https://lore.kernel.org/all/yvtc2u1j%2fqip8...@worktop.programming.kicks-ass.net/

While it is not nvdimm specific anymore, it's still specific to "memory
devices that can bulk invalidate a physical address space". I.e. it's
not as generic as its location in arch/x86/include/asm/cacheflush.h
would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
device-driver-specific header file, because hch and peterz are right, we
need to make this much more clear that it is not for general
consumption.

There is already include/linux/memregion.h for identifying memory regions
with a common id across ACPI NFIT, DAX "soft reserved" regions, and CXL.
So how about something like along the same lines as
arch_invalidate_pmem():

diff --git a/include/linux/memregion.h b/include/linux/memregion.h
index c04c4fd2e209..0310135d7a42 100644
--- a/include/linux/memregion.h
+++ b/include/linux/memregion.h
@@ -21,3 +21,23 @@ static inline void memregion_free(int id)
 }
 #endif
 #endif /* _MEMREGION_H_ */
+
+/*
+ * Device memory technologies like NVDIMM and CXL have events like
+ * secure erase and dynamic region provision that can invalidate an
+ * entire physical memory address range at once. Limit that
+ * functionality to architectures that have an efficient way to
+ * writeback and invalidate potentially terabytes of memory at once.
+ */
+#ifdef CONFIG_ARCH_HAS_MEMREGION_INVALIDATE
+void arch_flush_memregion(phys_addr_t phys, resource_size_t size);
+bool arch_has_flush_memregion(void);
+#else
+static inline bool arch_has_flush_memregion(void)
+{
+   return false;
+}
+static void arch_flush_memregion(phys_addr_t phys, resource_size_t size);
+{
+}
+#endif

...where arch_has_flush_memregion() on x86 is:

bool arch_has_flush_memregion(void)
{
return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR)
}

...to make it clear that this API is unavailable in virtual
environments.



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-22 Thread Davidlohr Bueso

On Sun, 21 Aug 2022, Christoph Hellwig wrote:


On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:

index b192d917a6d0..ac4d4fd4e508 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -10,4 +10,8 @@

 void clflush_cache_range(void *addr, unsigned int size);

+/* see comments in the stub version */
+#define flush_all_caches() \
+   do { wbinvd_on_all_cpus(); } while(0)


Yikes.  This is just a horrible, horrible name and placement for a bad
hack that should have no generic relevance.


Why does this have no generic relevance? There's already been discussions
on how much wbinv is hated[0].


Please fix up the naming to make it clear that this function is for a
very specific nvdimm use case, and move it to a nvdimm-specific header
file.


Do you have any suggestions for a name? And, as the changelog describes,
this is not nvdimm specific anymore, and the whole point of all this is
volatile memory components for cxl, hence nvdimm namespace is bogus.

[0] 
https://lore.kernel.org/all/yvtc2u1j%2fqip8...@worktop.programming.kicks-ass.net/

Thanks,
Davidlohr



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-22 Thread Christoph Hellwig
On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> index b192d917a6d0..ac4d4fd4e508 100644
> --- a/arch/x86/include/asm/cacheflush.h
> +++ b/arch/x86/include/asm/cacheflush.h
> @@ -10,4 +10,8 @@
>  
>  void clflush_cache_range(void *addr, unsigned int size);
>  
> +/* see comments in the stub version */
> +#define flush_all_caches() \
> + do { wbinvd_on_all_cpus(); } while(0)

Yikes.  This is just a horrible, horrible name and placement for a bad
hack that should have no generic relevance.

Please fix up the naming to make it clear that this function is for a
very specific nvdimm use case, and move it to a nvdimm-specific header
file.



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-20 Thread Davidlohr Bueso

On Fri, 19 Aug 2022, Ira Weiny wrote:


Did you mean "must"?


Yep.


+ * such as those which caches are in a consistent state. The
+ * caller can verify the situation early on.
+ */
+#ifndef flush_all_caches
+# define flush_all_caches_capable() false
+static inline void flush_all_caches(void)
+{
+   WARN_ON_ONCE("cache invalidation required\n");
+}


With the addition of flush_all_caches_capable() will flush_all_caches() ever be
called?


No, it should not. Hence you get a splat if you call it bogusly.



Re: [PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-19 Thread Ira Weiny
On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> With CXL security features, global CPU cache flushing nvdimm requirements
> are no longer specific to that subsystem, even beyond the scope of
> security_ops. CXL will need such semantics for features not necessarily
> limited to persistent memory.
> 
> The functionality this is enabling is to be able to instantaneously
> secure erase potentially terabytes of memory at once and the kernel
> needs to be sure that none of the data from before the secure is still
> present in the cache. It is also used when unlocking a memory device
> where speculative reads and firmware accesses could have cached poison
> from before the device was unlocked.
> 
> This capability is typically only used once per-boot (for unlock), or
> once per bare metal provisioning event (secure erase), like when handing
> off the system to another tenant or decomissioning a device. That small
> scope plus the fact that none of this is available to a VM limits the
> potential damage.
> 
> While the scope of this is for physical address space, add a new
> flush_all_caches() in cacheflush headers such that each architecture
> can define it, when capable. For x86 just use the wbinvd hammer and
> prevent any other arch from being capable.
> 
> Signed-off-by: Davidlohr Bueso 
> ---
> 
> Changes from v1 
> (https://lore.kernel.org/all/20220815160706.tqd42dv24tgb7x7y@offworld/):
> - Added comments and improved changelog to reflect this is
>   routine should be avoided and not considered a general API (Peter, Dan).
> 
>  arch/x86/include/asm/cacheflush.h |  4 +++
>  drivers/acpi/nfit/intel.c | 41 ++-
>  include/asm-generic/cacheflush.h  | 31 +++
>  3 files changed, 53 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cacheflush.h 
> b/arch/x86/include/asm/cacheflush.h
> index b192d917a6d0..ac4d4fd4e508 100644
> --- a/arch/x86/include/asm/cacheflush.h
> +++ b/arch/x86/include/asm/cacheflush.h
> @@ -10,4 +10,8 @@
>  
>  void clflush_cache_range(void *addr, unsigned int size);
>  
> +/* see comments in the stub version */
> +#define flush_all_caches() \
> + do { wbinvd_on_all_cpus(); } while(0)
> +
>  #endif /* _ASM_X86_CACHEFLUSH_H */
> diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
> index 8dd792a55730..f2f6c31e6ab7 100644
> --- a/drivers/acpi/nfit/intel.c
> +++ b/drivers/acpi/nfit/intel.c
> @@ -4,6 +4,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "intel.h"
>  #include "nfit.h"
>  
> @@ -190,8 +191,6 @@ static int intel_security_change_key(struct nvdimm 
> *nvdimm,
>   }
>  }
>  
> -static void nvdimm_invalidate_cache(void);
> -
>  static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
>   const struct nvdimm_key_data *key_data)
>  {
> @@ -210,6 +209,9 @@ static int __maybe_unused intel_security_unlock(struct 
> nvdimm *nvdimm,
>   };
>   int rc;
>  
> + if (!flush_all_caches_capable())
> + return -EINVAL;
> +
>   if (!test_bit(NVDIMM_INTEL_UNLOCK_UNIT, _mem->dsm_mask))
>   return -ENOTTY;
>  
> @@ -228,7 +230,7 @@ static int __maybe_unused intel_security_unlock(struct 
> nvdimm *nvdimm,
>   }
>  
>   /* DIMM unlocked, invalidate all CPU caches before we read it */
> - nvdimm_invalidate_cache();
> + flush_all_caches();
>  
>   return 0;
>  }
> @@ -294,11 +296,14 @@ static int __maybe_unused intel_security_erase(struct 
> nvdimm *nvdimm,
>   },
>   };
>  
> + if (!flush_all_caches_capable())
> + return -EINVAL;
> +
>   if (!test_bit(cmd, _mem->dsm_mask))
>   return -ENOTTY;
>  
>   /* flush all cache before we erase DIMM */
> - nvdimm_invalidate_cache();
> + flush_all_caches();
>   memcpy(nd_cmd.cmd.passphrase, key->data,
>   sizeof(nd_cmd.cmd.passphrase));
>   rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, _cmd, sizeof(nd_cmd), NULL);
> @@ -318,7 +323,7 @@ static int __maybe_unused intel_security_erase(struct 
> nvdimm *nvdimm,
>   }
>  
>   /* DIMM erased, invalidate all CPU caches before we read it */
> - nvdimm_invalidate_cache();
> + flush_all_caches();
>   return 0;
>  }
>  
> @@ -338,6 +343,9 @@ static int __maybe_unused 
> intel_security_query_overwrite(struct nvdimm *nvdimm)
>   },
>   };
>  
> + if (!flush_all_caches_capable())
> + return -EINVAL;
> +
>   if (!test_bit(NVDIMM_INTEL_QUERY_OVERWRITE, _mem->dsm_mask))
>   return -ENOTTY;
>  
> @@ -355,7 +363,7 @@ static int __maybe_unused 
> intel_security_query_overwrite(struct nvdimm *nvdimm)
>   }
>  
>   /* flush all cache before we make the nvdimms available */
> - nvdimm_invalidate_cache();
> + flush_all_caches();
>   return 0;
>  }
>  
> @@ -377,11 +385,14 @@ static int __maybe_unused 
> intel_security_overwrite(struct nvdimm *nvdimm,
>   

[PATCH v2] arch/cacheflush: Introduce flush_all_caches()

2022-08-19 Thread Davidlohr Bueso
With CXL security features, global CPU cache flushing nvdimm requirements
are no longer specific to that subsystem, even beyond the scope of
security_ops. CXL will need such semantics for features not necessarily
limited to persistent memory.

The functionality this is enabling is to be able to instantaneously
secure erase potentially terabytes of memory at once and the kernel
needs to be sure that none of the data from before the secure is still
present in the cache. It is also used when unlocking a memory device
where speculative reads and firmware accesses could have cached poison
from before the device was unlocked.

This capability is typically only used once per-boot (for unlock), or
once per bare metal provisioning event (secure erase), like when handing
off the system to another tenant or decomissioning a device. That small
scope plus the fact that none of this is available to a VM limits the
potential damage.

While the scope of this is for physical address space, add a new
flush_all_caches() in cacheflush headers such that each architecture
can define it, when capable. For x86 just use the wbinvd hammer and
prevent any other arch from being capable.

Signed-off-by: Davidlohr Bueso 
---

Changes from v1 
(https://lore.kernel.org/all/20220815160706.tqd42dv24tgb7x7y@offworld/):
- Added comments and improved changelog to reflect this is
  routine should be avoided and not considered a general API (Peter, Dan).

 arch/x86/include/asm/cacheflush.h |  4 +++
 drivers/acpi/nfit/intel.c | 41 ++-
 include/asm-generic/cacheflush.h  | 31 +++
 3 files changed, 53 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h 
b/arch/x86/include/asm/cacheflush.h
index b192d917a6d0..ac4d4fd4e508 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -10,4 +10,8 @@
 
 void clflush_cache_range(void *addr, unsigned int size);
 
+/* see comments in the stub version */
+#define flush_all_caches() \
+   do { wbinvd_on_all_cpus(); } while(0)
+
 #endif /* _ASM_X86_CACHEFLUSH_H */
diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
index 8dd792a55730..f2f6c31e6ab7 100644
--- a/drivers/acpi/nfit/intel.c
+++ b/drivers/acpi/nfit/intel.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "intel.h"
 #include "nfit.h"
 
@@ -190,8 +191,6 @@ static int intel_security_change_key(struct nvdimm *nvdimm,
}
 }
 
-static void nvdimm_invalidate_cache(void);
-
 static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
const struct nvdimm_key_data *key_data)
 {
@@ -210,6 +209,9 @@ static int __maybe_unused intel_security_unlock(struct 
nvdimm *nvdimm,
};
int rc;
 
+   if (!flush_all_caches_capable())
+   return -EINVAL;
+
if (!test_bit(NVDIMM_INTEL_UNLOCK_UNIT, _mem->dsm_mask))
return -ENOTTY;
 
@@ -228,7 +230,7 @@ static int __maybe_unused intel_security_unlock(struct 
nvdimm *nvdimm,
}
 
/* DIMM unlocked, invalidate all CPU caches before we read it */
-   nvdimm_invalidate_cache();
+   flush_all_caches();
 
return 0;
 }
@@ -294,11 +296,14 @@ static int __maybe_unused intel_security_erase(struct 
nvdimm *nvdimm,
},
};
 
+   if (!flush_all_caches_capable())
+   return -EINVAL;
+
if (!test_bit(cmd, _mem->dsm_mask))
return -ENOTTY;
 
/* flush all cache before we erase DIMM */
-   nvdimm_invalidate_cache();
+   flush_all_caches();
memcpy(nd_cmd.cmd.passphrase, key->data,
sizeof(nd_cmd.cmd.passphrase));
rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, _cmd, sizeof(nd_cmd), NULL);
@@ -318,7 +323,7 @@ static int __maybe_unused intel_security_erase(struct 
nvdimm *nvdimm,
}
 
/* DIMM erased, invalidate all CPU caches before we read it */
-   nvdimm_invalidate_cache();
+   flush_all_caches();
return 0;
 }
 
@@ -338,6 +343,9 @@ static int __maybe_unused 
intel_security_query_overwrite(struct nvdimm *nvdimm)
},
};
 
+   if (!flush_all_caches_capable())
+   return -EINVAL;
+
if (!test_bit(NVDIMM_INTEL_QUERY_OVERWRITE, _mem->dsm_mask))
return -ENOTTY;
 
@@ -355,7 +363,7 @@ static int __maybe_unused 
intel_security_query_overwrite(struct nvdimm *nvdimm)
}
 
/* flush all cache before we make the nvdimms available */
-   nvdimm_invalidate_cache();
+   flush_all_caches();
return 0;
 }
 
@@ -377,11 +385,14 @@ static int __maybe_unused intel_security_overwrite(struct 
nvdimm *nvdimm,
},
};
 
+   if (!flush_all_caches_capable())
+   return -EINVAL;
+
if (!test_bit(NVDIMM_INTEL_OVERWRITE, _mem->dsm_mask))
return -ENOTTY;
 
/* flush all cache before we erase DIMM */
-