On Wed, 2025-08-20 at 23:32 +0200, Xose Vazquez Perez wrote:
> One file per vendor, or device, is a bit excessive for two to four
> rules.
> 
> 
> If possible, select round-robin (>=5.1), or queue-depth (>=6.11).
> round-robin is a basic selector, and only works well under ideal
> conditions.
> 
> An NVMe benchmark, round-robin vs queue-depth, shows how bad it is:
> https://marc.info/?l=linux-kernel&m=171931850925572
> https://marc.info/?l=linux-kernel&m=171931852025575
> https://github.com/johnmeneghini/iopolicy/?tab=readme-ov-file#sample-data
> https://people.redhat.com/jmeneghi/ALPSS_2023/NVMe_QD_Multipathing.pdf
> 
> 
> [ctrl_loss_tmo default value is 600 (ten minutes)]
> 
> 
> v3:
>  - add Fujitsu/ETERNUS AB/HB
>  - add Hitachi/VSP
> 
> v2:
>  - fix ctrl_loss_tmo comment
>  - add Infinidat/InfiniBox
> 
> 
> Cc: Wayne Berthiaume <wayne.berthia...@dell.com>
> Cc: Vasuki Manikarnike <vasuki.manikarn...@hpe.com>
> Cc: Matthias Rudolph <matthias.rudo...@hitachivantara.com>
> Cc: Martin George <mart...@netapp.com>
> Cc: NetApp RDAC team <ng-eseries-upstream-maintain...@netapp.com>
> Cc: Zou Ming <zouming.zoum...@huawei.com>
> Cc: Li Xiaokeng <lixiaok...@huawei.com>
> Cc: Randy Jennings <ran...@purestorage.com>
> Cc: Jyoti Rani <jr...@purestorage.com>
> Cc: Brian Bunker <br...@purestorage.com>
> Cc: Uday Shankar <ushan...@purestorage.com>
> Cc: Chaitanya Kulkarni <k...@nvidia.com>
> Cc: Sagi Grimberg <s...@grimberg.me>
> Cc: Keith Busch <kbu...@kernel.org>
> Cc: Christoph Hellwig <h...@lst.de>
> Cc: Marco Patalano <mpata...@redhat.com>
> Cc: Ewan D. Milne <emi...@redhat.com>
> Cc: John Meneghini <jmene...@redhat.com>
> Cc: Daniel Wagner <dwag...@suse.de>
> Cc: Daniel Wagner <w...@monom.org>
> Cc: Hannes Reinecke <h...@suse.de>
> Cc: Martin Wilck <mwi...@suse.com>
> Cc: Benjamin Marzinski <bmarz...@redhat.com>
> Cc: Christophe Varoqui <christophe.varo...@opensvc.com>
> Cc: BLOCK-ML <linux-bl...@vger.kernel.org>
> Cc: NVME-ML <linux-n...@lists.infradead.org>
> Cc: SCSI-ML <linux-s...@vger.kernel.org>
> Cc: DM_DEVEL-ML <dm-devel@lists.linux.dev>
> Signed-off-by: Xose Vazquez Perez <xose.vazq...@gmail.com>
> ---
> 
> This will be the last iteration of this patch; there are no more NVMe
> storage array manufacturers.
> 
> 
> Maybe these rules should be merged into this new file?
> 71-nvmf-hpe.rules.in
> 71-nvmf-netapp.rules.in
> 71-nvmf-vastdata.rules.in
> 
> ---
>  .../80-nvmf-storage_arrays.rules.in           | 48 +++++++++++++++++++
>  1 file changed, 48 insertions(+)
>  create mode 100644 nvmf-autoconnect/udev-rules/80-nvmf-storage_arrays.rules.in
> 
> diff --git a/nvmf-autoconnect/udev-rules/80-nvmf-storage_arrays.rules.in b/nvmf-autoconnect/udev-rules/80-nvmf-storage_arrays.rules.in
> new file mode 100644
> index 00000000..ac5df797
> --- /dev/null
> +++ b/nvmf-autoconnect/udev-rules/80-nvmf-storage_arrays.rules.in
> @@ -0,0 +1,48 @@
> +##### Storage arrays
> +
> +#### Set iopolicy for NVMe-oF
> +### iopolicy: numa (default), round-robin (>=5.1), or queue-depth (>=6.11)
> +
> +## Dell EMC
> +# PowerMax
> +ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", ATTR{iopolicy}="round-robin", ATTR{model}=="EMC PowerMax"

Do you have a specific reason to add the model match after the
assignment? It isn't wrong AFAIK, but highly unusual and confusing.

> +ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", ATTR{iopolicy}="queue-depth", ATTR{model}=="EMC PowerMax"

I am assuming the idea here is that if queue-depth is unsupported, the
second assignment will fail, and round-robin, set by the first rule,
will remain in effect? I am not sure that's a good idea.

The "best" iopolicy doesn't depend on the storage array in use.
It depends on what the kernel supports, the workload, and the user
preferences.
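
For reference, the policy the running kernel actually accepted can be
inspected (and changed) at runtime through sysfs. A rough sketch; the
snippet simply prints nothing on hosts without NVMe multipath
subsystems:

```shell
# List the iopolicy of every NVMe subsystem on this host.
# The attribute only exists when native NVMe multipath is enabled.
for s in /sys/class/nvme-subsystem/nvme-subsys*; do
    [ -r "$s/iopolicy" ] || continue
    printf '%s: %s\n' "${s##*/}" "$(cat "$s/iopolicy")"
done
```

Writing an unsupported value to the iopolicy attribute fails with an
error, which is exactly what the back-to-back ATTR assignments above
rely on.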

I suggest using something like this instead:

ENV{.NVME_IOPOLICY}!="?*", ENV{.NVME_IOPOLICY}="queue-depth"

This allows users to add an early rule file (e.g. 00-nvme-policy.rules)
to override the default:

ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ENV{.NVME_IOPOLICY}="round-robin"


Then you could simply do this:


ACTION!="add|change", GOTO="iopolicy_end"
SUBSYSTEM!="nvme-subsystem", GOTO="iopolicy_end"
ATTR{subsystype}!="nvm", GOTO="iopolicy_end"

ATTR{model}=="dellemc-powerstore", ATTR{iopolicy}="$env{.NVME_IOPOLICY}"
# other models ...

LABEL="iopolicy_end"


Anyway, I dislike the idea of maintaining a potentially ever-growing
list of storage models with special policies in generic udev rules.
Udev rules are the wrong place for that kind of model database.

*If* we want to pursue model-specific settings, we should rather use
the systemd hwdb [1] for that purpose. For example,


ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", \
    ENV{.NVME_IOPOLICY}!="?*", \
    IMPORT{builtin}="hwdb nvme_subsys:$attr{model}"

# .NVME_IOPOLICY would be set by hwdb if a match was found
ENV{.NVME_IOPOLICY}=="?*", ATTR{iopolicy}="$env{.NVME_IOPOLICY}"


Vendors could then just put their preference into the hwdb.
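
Such an hwdb fragment might look like the sketch below. The
`nvme_subsys:` prefix matches the lookup string in the rule above; the
file name and model glob are just examples, and whether hwdb accepts a
dot-prefixed (non-persistent) property name would need checking:

```
# /etc/udev/hwdb.d/75-nvme-iopolicy.hwdb (hypothetical example)
nvme_subsys:EMC PowerMax*
 .NVME_IOPOLICY=queue-depth
```

After editing, the binary index has to be rebuilt with
`systemd-hwdb update` for the change to take effect.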

But first of all I'd be curious why this setting would be 
model-specific in the first place.

Regards
Martin

[1] https://man7.org/linux/man-pages/man7/hwdb.7.html
