On 6/30/20 2:24 PM, Michal Suchánek wrote:
On Mon, Jun 29, 2020 at 06:50:15PM -0700, Dan Williams wrote:
On Mon, Jun 29, 2020 at 1:41 PM Aneesh Kumar K.V
<aneesh.ku...@linux.ibm.com> wrote:

Michal Suchánek <msucha...@suse.de> writes:


On Mon, Jun 29, 2020 at 07:27:20PM +0530, Aneesh Kumar K.V wrote:
nvdimm expects the flush routines to just mark the cache clean. The barrier
that marks the store globally visible is done in nvdimm_flush().

Update the papr_scm driver to use a simplified nvdimm_flush callback that does
only the required barrier.

Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
---
  arch/powerpc/lib/pmem.c                   |  6 ------
  arch/powerpc/platforms/pseries/papr_scm.c | 13 +++++++++++++
  2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 5a61aaeb6930..21210fa676e5 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -19,9 +19,6 @@ static inline void __clean_pmem_range(unsigned long start, unsigned long stop)

      for (i = 0; i < size >> shift; i++, addr += bytes)
              asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
-    asm volatile(PPC_PHWSYNC ::: "memory");

  static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
@@ -34,9 +31,6 @@ static inline void __flush_pmem_range(unsigned long start, unsigned long stop)

      for (i = 0; i < size >> shift; i++, addr += bytes)
              asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
-    asm volatile(PPC_PHWSYNC ::: "memory");

  static inline void clean_pmem_range(unsigned long start, unsigned long stop)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
index 9c569078a09f..9a9a0766f8b6 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -630,6 +630,18 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 

      return 0;
+/*
+ * We have made sure the pmem writes are done such that before calling this
+ * all the caches are flushed/clean. We use dcbf/dcbfps to ensure this. Here
+ * we just need to add the necessary barrier to make sure the above flushes
+ * have updated persistent storage before any data access or data transfer
+ * caused by subsequent instructions is initiated.
+ */
+static int papr_scm_flush_sync(struct nd_region *nd_region, struct bio *bio)
+{
+    arch_pmem_flush_barrier();
+    return 0;
+}

  static ssize_t flags_show(struct device *dev,
                        struct device_attribute *attr, char *buf)
@@ -743,6 +755,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
      ndr_desc.mapping = &mapping;
      ndr_desc.num_mappings = 1;
      ndr_desc.nd_set = &p->nd_set;
+    ndr_desc.flush = papr_scm_flush_sync;

AFAICT currently the only device that implements flush is virtio_pmem.
How does the nfit driver get away without implementing flush?

generic_nvdimm_flush() does the required barrier for nfit. The reason for
adding the ndr_desc.flush callback for papr_scm was to avoid the use
of iomem-based deep flushing (ndr_region_data.flush_wpq), which is not
supported by papr_scm.

BTW we do return NULL for ndrd_get_flush_wpq() on power. So the upstream
code also does the same thing, but in a different way.
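
For readers following the thread, here is a minimal userspace sketch of the dispatch described above, assuming nvdimm_flush() calls the region's flush callback when one is registered and otherwise falls back to the generic path. Every name prefixed with model_ and every counter field is hypothetical and exists only for illustration; the real kernel structures and barriers are not reproduced here.

```c
#include <errno.h>
#include <stddef.h>

struct bio;                         /* opaque stand-in, never dereferenced */

/* Hypothetical, heavily reduced stand-in for the kernel's struct nd_region. */
struct nd_region {
	int (*flush)(struct nd_region *nd_region, struct bio *bio);
	int has_flush_wpq;          /* 0 models ndrd_get_flush_wpq() returning NULL */
	int barriers_issued;        /* counts modelled barriers */
	int wpq_flushes;            /* counts modelled iomem flush-hint writes */
};

/* Loose model of the generic path: barrier plus optional flush-hint write. */
static int model_generic_nvdimm_flush(struct nd_region *r)
{
	r->barriers_issued++;
	if (r->has_flush_wpq)
		r->wpq_flushes++;
	return 0;
}

/* Model of the papr_scm callback from the patch: only the barrier. */
static int model_papr_scm_flush_sync(struct nd_region *r, struct bio *bio)
{
	(void)bio;                  /* unused, as noted in the thread */
	r->barriers_issued++;
	return 0;
}

/* Model of the dispatch: use the callback if set, else the generic path. */
static int model_nvdimm_flush(struct nd_region *r, struct bio *bio)
{
	if (!r->flush)
		return model_generic_nvdimm_flush(r);
	return r->flush(r, bio) ? -EIO : 0;
}
```

In this model a region with no callback and no flush hints ends up doing only the barrier, which is the "same thing, but in a different way" point made above.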

Also the flush takes arguments that are completely unused but a user of
the pmem region must assume they are used, and call flush() on the
region rather than arch_pmem_flush_barrier() directly.

The bio argument can help a pmem driver do range-based flushing in
the case of pmem_make_request. If bio is NULL then we must assume a full
device flush.

The bio argument isn't for range-based flushing; it is for flush
operations that need to complete asynchronously.
How does the block layer determine that the pmem device needs
asynchronous flushing?

        set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
and dax_synchronous(dev)
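
A minimal userspace sketch of the flag-based distinction mentioned here, assuming a region counts as synchronous exactly when its async bit is clear. The bit number, the struct, and the model_ helpers are hypothetical, non-atomic stand-ins for the kernel's set_bit()/test_bit() and ND_REGION_ASYNC.

```c
#include <stdbool.h>

/* Hypothetical bit number; the kernel's ND_REGION_ASYNC value is not assumed. */
enum { MODEL_REGION_ASYNC = 3 };

/* Hypothetical reduced region holding only a flags word. */
struct model_region {
	unsigned long flags;
};

/* Non-atomic stand-in for the kernel's set_bit(). */
static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* A region is treated as synchronous when the async bit is not set. */
static bool model_region_is_sync(const struct model_region *r)
{
	return !model_test_bit(MODEL_REGION_ASYNC, &r->flags);
}
```

Under this assumption a driver that registers a flush callback but never sets the async bit, as papr_scm does in the patch, would still be treated as synchronous.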

The flush() was designed for that purpose, with the bio argument, and only
virtio_pmem, which is flushed asynchronously, used it. Now that papr_scm
reuses it for a different purpose, how do you tell?

