nvme_mpath_start_request() increments ns->ctrl->nr_active and
nvme_mpath_end_request() decrements it, so nr_active is maintained per
controller. If multiple NVMe namespaces are created and attached to the
same controller, their I/O activity accumulates in the single
ctrl->nr_active counter.
In contrast, libmultipath defines nr_active in struct mpath_device,
which is referenced from struct nvme_ns. Even if we add code to update
mpath_device->nr_active, that accounting would effectively be per
namespace, not per controller.
Right, I need to change that back to per-controller.
The nr_active value is used by the queue-depth policy. Currently,
mpath_queue_depth_path() accesses mpath_device->nr_active to make
forwarding decisions. However, if mpath_device->nr_active is maintained
per namespace, it does not correctly reflect controller-wide load when
multiple namespaces share the same controller.
Yes
Therefore, instead of maintaining a separate nr_active in struct
mpath_device, it may be more appropriate for mpath_queue_depth_path() to
reference ns->ctrl->nr_active directly. In that case, nr_active could be
removed from struct mpath_device entirely.
I think so, but we will need SCSI to maintain such a count internally to
support this policy. And for NVMe we will need some abstraction to
look up the per-controller QD for a mpath_device.
Thanks for checking!