On 02/03/2026 12:36, Nilay Shroff wrote:
On 2/25/26 9:02 PM, John Garry wrote:
Add code for path selection.

NVMe ANA is abstracted into enum mpath_access_state. The motivation here is
so that SCSI ALUA can be used. Callbacks .is_disabled, .is_optimized,
.get_access_state are added to get the path access state.

Path selection modes round-robin, NUMA, and queue-depth are added, same
as NVMe supports.

NVMe has almost like-for-like equivalents here:
- __mpath_find_path() -> __nvme_find_path()
- mpath_find_path() -> nvme_find_path()

and similar for all introduced callee functions.

Functions mpath_set_iopolicy() and mpath_get_iopolicy() are added for
setting default iopolicy.

A separate mpath_iopolicy structure is introduced. There is no iopolicy
member included in the mpath_head structure as it may not suit NVMe, where
iopolicy is per-subsystem and not per namespace.

Signed-off-by: John Garry <[email protected]>
---
  include/linux/multipath.h |  36 ++++++
  lib/multipath.c           | 251 ++++++++++++++++++++++++++++++++++++++
  2 files changed, 287 insertions(+)

diff --git a/include/linux/multipath.h b/include/linux/multipath.h
index be9dd9fb83345..c964a1aba9c42 100644
--- a/include/linux/multipath.h
+++ b/include/linux/multipath.h
@@ -7,6 +7,22 @@
  extern const struct block_device_operations mpath_ops;
+enum mpath_iopolicy_e {
+    MPATH_IOPOLICY_NUMA,
+    MPATH_IOPOLICY_RR,
+    MPATH_IOPOLICY_QD,
+};
+
+struct mpath_iopolicy {
+    enum mpath_iopolicy_e    iopolicy;
+};
+
+enum mpath_access_state {
+    MPATH_STATE_OPTIMIZED,
+    MPATH_STATE_ACTIVE,
+    MPATH_STATE_INVALID    = 0xFF
+};
Hmm so here we don't have MPATH_STATE_NONOPTIMIZED.
We are morphing NVME_ANA_NONOPTIMIZED as MPATH_STATE_ACTIVE.

Yes, well it is treated the same (as NVME_ANA_NONOPTIMIZED) for path selection.

Is it because SCSI doesn't have (NONOPTIMIZED) state?

It does have an active (and optimal) state, but I think that keeping NVMe terminology may be better for now.
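
To illustrate the state mapping discussed above, here is a minimal userspace sketch. The names prefixed with sketch_/SKETCH_ are hypothetical stand-ins and not part of the patch. Only the NONOPTIMIZED -> MPATH_STATE_ACTIVE folding comes from this thread; mapping every other ANA state to MPATH_STATE_INVALID is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the NVMe ANA states; values are illustrative. */
enum sketch_ana_state {
    SKETCH_ANA_OPTIMIZED,
    SKETCH_ANA_NONOPTIMIZED,
    SKETCH_ANA_INACCESSIBLE,
    SKETCH_ANA_PERSISTENT_LOSS,
    SKETCH_ANA_CHANGE,
};

/* Copied from the quoted patch. */
enum mpath_access_state {
    MPATH_STATE_OPTIMIZED,
    MPATH_STATE_ACTIVE,
    MPATH_STATE_INVALID = 0xFF
};

/*
 * Fold the transport-specific state into the generic one.
 * Non-optimized is reported as ACTIVE, as discussed in the thread.
 * Assumption: all remaining states are treated as INVALID here.
 */
static enum mpath_access_state sketch_ana_to_mpath(enum sketch_ana_state ana)
{
    switch (ana) {
    case SKETCH_ANA_OPTIMIZED:
        return MPATH_STATE_OPTIMIZED;
    case SKETCH_ANA_NONOPTIMIZED:
        return MPATH_STATE_ACTIVE;
    default:
        return MPATH_STATE_INVALID;
    }
}

/* What an .is_optimized style callback might reduce to (hypothetical). */
static bool sketch_is_optimized(enum sketch_ana_state ana)
{
    return sketch_ana_to_mpath(ana) == MPATH_STATE_OPTIMIZED;
}
```

With a mapping of this shape, a SCSI ALUA driver could presumably supply its own translation into the same enum without the path selector needing to know which transport it is dealing with.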


+
  struct mpath_disk {
      struct gendisk        *disk;
      struct kref        ref;
@@ -18,10 +34,16 @@ struct mpath_disk {
  struct mpath_device {
      struct list_head    siblings;
+    atomic_t        nr_active;
      struct gendisk        *disk;
+    int            numa_node;
  };
I haven't seen any API that helps set nr_active or numa_node.

I missed setting numa_node for NVMe. As for nr_active, it is set and read by the NVMe code, e.g. in nvme_mpath_start_request(). I did try to abstract that function into a common helper, but it just became a mess.

Do we need to have those under struct mpath_head_template ?

I think that the drivers can handle these directly.
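
As a rough picture of what "the drivers handle these directly" could look like for nr_active, here is a minimal userspace sketch using C11 atomics in place of the kernel's atomic_t. All sketch_ names are hypothetical; this is not the NVMe code, only an illustration of a driver bumping the counter around each request and a queue-depth style selector reading it.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mpath_device from the quoted patch. */
struct sketch_mpath_device {
    atomic_int nr_active;   /* in-flight requests on this path */
    int numa_node;          /* set by the driver at path setup */
};

/* Driver side: account for a request being issued on this path. */
static void sketch_start_request(struct sketch_mpath_device *dev)
{
    atomic_fetch_add(&dev->nr_active, 1);
}

/* Driver side: account for a request completing on this path. */
static void sketch_end_request(struct sketch_mpath_device *dev)
{
    int prev = atomic_fetch_sub(&dev->nr_active, 1);

    assert(prev > 0);   /* completion without a matching start is a bug */
    (void)prev;
}

/*
 * Selector side: return the path with the fewest in-flight requests,
 * or NULL when there are no paths. Ties go to the earliest entry.
 */
static struct sketch_mpath_device *
sketch_pick_least_busy(struct sketch_mpath_device *devs, size_t n)
{
    struct sketch_mpath_device *best = NULL;
    int best_depth = INT_MAX;

    for (size_t i = 0; i < n; i++) {
        int depth = atomic_load(&devs[i].nr_active);

        if (depth < best_depth) {
            best_depth = depth;
            best = &devs[i];
        }
    }
    return best;
}
```

The split shown here (driver writes the counter, common code only reads it) is one reading of the reply above, not something the patch itself spells out.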

Thanks
