On 3/3/26 6:11 PM, John Garry wrote:
>> nvme_mpath_start_request() increments ns->ctrl->nr_active, and
>> nvme_mpath_end_request() decrements it. This means that nr_active is
>> maintained per controller. If multiple NVMe namespaces are created and
>> attached to the same controller, their I/O activity is accumulated in
>> the single ctrl->nr_active counter.
>>
>> In contrast, libmultipath defines nr_active in struct mpath_device,
>> which is referenced from struct nvme_ns. Even if we add code to update
>> mpath_device->nr_active, that accounting would effectively be per
>> namespace, not per controller.
>
> Right, I need to change that back to per-controller.
>
>> The nr_active value is used by the queue-depth policy. Currently,
>> mpath_queue_depth_path() accesses mpath_device->nr_active to make
>> forwarding decisions. However, if mpath_device->nr_active is
>> maintained per namespace, it does not correctly reflect
>> controller-wide load when multiple namespaces share the same
>> controller.
>
> Yes
>
>> Therefore, instead of maintaining a separate nr_active in struct
>> mpath_device, it may be more appropriate for mpath_queue_depth_path()
>> to reference ns->ctrl->nr_active directly. In that case, nr_active
>> could be removed from struct mpath_device entirely.
>
> I think so, but we will need SCSI to maintain such a count internally to
> support this policy. And for NVMe we will need some abstraction to
> look up the per-controller QD for an mpath_device.

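To make the lookup idea concrete, here is a rough userspace sketch (not
kernel code, untested against the series). It assumes a per-transport
callback that returns the shared in-flight counter for a path. Apart
from nr_active, every name below (mpath_device_ops, get_nr_active,
transport_data, fake_ctrl, mpath_pick_min_depth) is hypothetical and
not taken from the patches, and the selection loop is a simplified
"fewest requests in flight" pick rather than the actual policy code:

```c
/*
 * Userspace model only. All names except nr_active are hypothetical
 * and exist purely for illustration.
 */
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

struct mpath_device;

/* Hypothetical per-transport hook: return the shared load counter. */
struct mpath_device_ops {
	atomic_int *(*get_nr_active)(struct mpath_device *dev);
};

struct mpath_device {
	const struct mpath_device_ops *ops;
	void *transport_data;	/* e.g. the controller behind this path */
};

/* Stand-in for a controller shared by several namespaces. */
struct fake_ctrl {
	atomic_int nr_active;
};

static atomic_int *fake_nvme_get_nr_active(struct mpath_device *dev)
{
	struct fake_ctrl *ctrl = dev->transport_data;

	return &ctrl->nr_active;
}

static const struct mpath_device_ops fake_nvme_ops = {
	.get_nr_active = fake_nvme_get_nr_active,
};

/* Accounting goes through the hook, so it lands on the controller. */
static void mpath_start_request(struct mpath_device *dev)
{
	atomic_fetch_add(dev->ops->get_nr_active(dev), 1);
}

static void mpath_end_request(struct mpath_device *dev)
{
	atomic_fetch_sub(dev->ops->get_nr_active(dev), 1);
}

/* Pick the path whose controller has the fewest requests in flight. */
static struct mpath_device *
mpath_pick_min_depth(struct mpath_device **paths, size_t n)
{
	struct mpath_device *best = NULL;
	int min_depth = INT_MAX;

	for (size_t i = 0; i < n; i++) {
		int depth = atomic_load(paths[i]->ops->get_nr_active(paths[i]));

		if (depth < min_depth) {
			min_depth = depth;
			best = paths[i];
		}
	}
	return best;
}
```

With this shape, two mpath_device instances that point at the same
controller share one counter, so I/O issued on one namespace is visible
when choosing a path for another. SCSI could supply its own hook that
returns whatever count it decides to maintain internally.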
This raises another question regarding the current framework. From what
I can see, all NVMe multipath I/O policies are currently supported for
SCSI as well. Going forward, if we introduce a new I/O policy for NVMe
that does not make sense for SCSI, how can we ensure that the new policy
is supported only for NVMe and not for SCSI? Conversely, we may also
want to introduce a policy that is relevant only for SCSI.

With the current framework, it seems difficult to restrict a policy to a
specific transport. It appears that all policies are implicitly shared
between NVMe and SCSI.

Would it make sense to introduce some abstraction for I/O policies in
the framework so that a given policy can be implemented and exposed only
for the relevant transport (e.g., NVMe-only or SCSI-only), rather than
requiring it to be supported by both?

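One possible shape, as a rough userspace sketch rather than a concrete
proposal: each policy carries a mask of the transports it applies to,
and the lookup refuses a policy the calling transport has not opted
into. The struct, enum, and function names are hypothetical, as are the
two "*-only-example" entries; numa, round-robin, and queue-depth are the
existing NVMe policy names, marked here as shared by both transports:

```c
/*
 * Userspace model only. Type and function names are hypothetical;
 * the "*-only-example" policies are placeholders, not real policies.
 */
#include <stddef.h>
#include <string.h>

enum mpath_transport {
	MPATH_TRANSPORT_NVME = 1u << 0,
	MPATH_TRANSPORT_SCSI = 1u << 1,
};

#define MPATH_TRANSPORT_ALL (MPATH_TRANSPORT_NVME | MPATH_TRANSPORT_SCSI)

struct mpath_iopolicy {
	const char *name;
	unsigned int transports;	/* mask of enum mpath_transport */
};

static const struct mpath_iopolicy mpath_iopolicies[] = {
	{ "numa",              MPATH_TRANSPORT_ALL },
	{ "round-robin",       MPATH_TRANSPORT_ALL },
	{ "queue-depth",       MPATH_TRANSPORT_ALL },
	{ "nvme-only-example", MPATH_TRANSPORT_NVME },
	{ "scsi-only-example", MPATH_TRANSPORT_SCSI },
};

/*
 * Return the named policy if it is valid for @transport, else NULL.
 * A transport would call this when a user writes a policy name.
 */
static const struct mpath_iopolicy *
mpath_find_iopolicy(const char *name, unsigned int transport)
{
	size_t n = sizeof(mpath_iopolicies) / sizeof(mpath_iopolicies[0]);

	for (size_t i = 0; i < n; i++) {
		const struct mpath_iopolicy *p = &mpath_iopolicies[i];

		if (strcmp(p->name, name) == 0)
			return (p->transports & transport) ? p : NULL;
	}
	return NULL;
}
```

Here a SCSI caller asking for "nvme-only-example" gets NULL, while
"queue-depth" resolves for both transports. An alternative would be for
each transport to register its own policy table, which avoids a central
list at the cost of some duplication.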
Thanks,
--Nilay