On 10/04/2026 08:06, Nilay Shroff wrote:
# Part b: Ensure writes work for intermittent disconnect
_nvme_connect_subsys
nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
ns=$(_find_nvme_ns "${def_subsys_uuid}")
echo 10 > "/sys/block/${ns}/delayed_removal_secs"
bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
if [ "$bytes_written" != 4096 ]; then
	echo "could not write successfully initially"
fi
sleep 1
_nvme_disconnect_ctrl "${nvmedev}"
sleep 1
ns=$(_find_nvme_ns "${def_subsys_uuid}")
if [[ "${ns}" = "" ]]; then
	echo "could not find ns after disconnect"
fi
_delayed_nvme_reconnect_ctrl &
sleep 1
bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
if [ "$bytes_written" != 4096 ]; then
	echo "could not write successfully with reconnect"
fi
It seems there may be a race here if we attempt to write to $ns before
the reconnect has completed in _delayed_nvme_reconnect_ctrl.
If the intention is simply to verify that the controller reconnect occurs
within the delayed removal window and test pwrite,
Not exactly. I want to verify that if I write between the disconnect and
the reconnect, then the write succeeds.
then it may be sufficient to:
- perform the reconnect, and
- then validate the write (pwrite) afterwards.
I think that this is something subtly different.
For your revised test, if we reconnect, we always expect the subsequent
write to succeed even without the delayed removal, so I am not sure what
we achieve.
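To make the ordering concrete, here is a rough sketch using stand-ins only. Nothing below touches a real device: fake_disconnect, fake_reconnect and fake_write are hypothetical placeholders, and a state file plays the role of the controller. It assumes (this is my understanding of delayed removal, not something the sketch proves) that a write issued while no path is available is held until a path returns or the window expires.

```sh
#!/bin/sh
# Simulation only: a temp file stands in for the controller state.
state=$(mktemp)
echo live > "$state"

# Hypothetical stand-ins for the disconnect and the delayed reconnect.
fake_disconnect() { echo gone > "$state"; }
fake_reconnect() { sleep 1; echo live > "$state"; }

# Hypothetical stand-in for the write. It polls once per second and
# gives up after "window" seconds, modelling the assumed behaviour of
# a write that is held while the delayed removal window is open.
fake_write() {
	window=$1
	waited=0
	while [ "$(cat "$state")" != live ]; do
		[ "$waited" -ge "$window" ] && return 1
		sleep 1
		waited=$((waited + 1))
	done
	echo 4096
}

# Case 1: the write is issued between the disconnect and the reconnect.
fake_disconnect
fake_reconnect &
with_reconnect=$(fake_write 10) || with_reconnect=failed
wait

# Case 2: no reconnect ever happens, so the window expires.
fake_disconnect
without_reconnect=$(fake_write 2) || without_reconnect=failed

rm -f "$state"
echo "with reconnect: $with_reconnect"
echo "without reconnect: $without_reconnect"
```

Case 1 is what Part b is after: the write starts before the reconnect has finished and still completes. Reconnecting first and only then writing would never put a write into that window.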
In that case, we could either:
- run _delayed_nvme_reconnect_ctrl in the foreground, or
- open-code the reconnect directly in the script before issuing the write.
How would that open-coded reconnect look? I was just using the subsystem
connect, which I think is not optimal.
Thanks,
John