From: John Groves <[email protected]>

Clear holder_ops before holder_data so that a concurrent fs_dax_get()
cannot have its newly installed holder_ops overwritten. cmpxchg()
provides release ordering on weakly-ordered architectures, ensuring the
WRITE_ONCE(holder_ops, NULL) store is visible to any CPU that observes
the holder_data release.

Add a WARN_ON() that fires only when the cmpxchg observes a non-NULL
value that is not @holder, i.e. fs_put_dax() called by something that
is not the current holder. That is an API contract violation; the
WARN_ON() does not prevent the damage but makes the bug visible.

A NULL cmpxchg result is deliberately tolerated: kill_dax() clears
holder_data while a holder is still attached when a device is removed
out from under a mounted filesystem (after delivering MF_MEM_PRE_REMOVE).
The holder's subsequent fs_put_dax() - e.g. xfs_free_buftarg() after a
forced shutdown - then legitimately finds holder_data already NULL, so
warning on that case would turn supported device removal into a splat
(or a panic with panic_on_warn).

Also add a kerneldoc comment documenting that fs_put_dax() must only
be called by the current holder.

Fixes: eec38f5d86d27 ("dax: Add fs_dax_get() func to prepare dax for fs-dax 
usage")
Signed-off-by: John Groves <[email protected]>
---
 drivers/dax/super.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 25cf99dd9360b..96f778dcde50b 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -116,11 +116,47 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
 
 #if IS_ENABLED(CONFIG_FS_DAX)
 
+/**
+ * fs_put_dax() - release holder ownership of a dax_device
+ * @dax_dev: dax device to release (may be NULL)
+ * @holder: the holder pointer previously passed to fs_dax_get() or
+ *          fs_dax_get_by_bdev(); must match exactly, as it is used
+ *          in a cmpxchg to atomically release ownership
+ *
+ * Must only be called by the current holder. Clears holder_ops before
+ * holder_data to avoid a race where a concurrent fs_dax_get() could have
+ * its newly installed holder_ops overwritten.
+ */
 void fs_put_dax(struct dax_device *dax_dev, void *holder)
 {
-       if (dax_dev && holder &&
-           cmpxchg(&dax_dev->holder_data, holder, NULL) == holder)
-               dax_dev->holder_ops = NULL;
+       if (dax_dev && holder) {
+               void *prev;
+
+               /*
+                * Clear holder_ops before releasing holder_data. A concurrent
+                * dax_holder_notify_failure() that sees NULL ops returns
+                * -EOPNOTSUPP cleanly. A concurrent fs_dax_get() that acquires
+                * holder_data after the cmpxchg below is guaranteed to observe
+                * holder_ops=NULL first (cmpxchg provides release ordering), so
+                * its subsequent store of new ops will not be overwritten.
+                */
+               WRITE_ONCE(dax_dev->holder_ops, NULL);
+               prev = cmpxchg(&dax_dev->holder_data, holder, NULL);
+
+               /*
+                * prev == holder: normal release.
+                * prev == NULL:   already released by kill_dax() when the
+                *                 device was removed under a live holder;
+                *                 not a bug.
+                * prev != holder (non-NULL): fs_put_dax() called by something
+                *                 that is not the current holder; an API
+                *                 contract violation. A lock would be needed
+                *                 to guard against this, but we WARN_ON()
+                *                 instead since violating the contract is
+                *                 a bug.
+                */
+               WARN_ON(prev && prev != holder);
+       }
        put_dax(dax_dev);
 }
 EXPORT_SYMBOL_GPL(fs_put_dax);
-- 
2.53.0



Reply via email to