Hi

On Thu, 8 May 2025, Martin Wilck wrote:

> Hello Kevin,
> 
> [I'm sorry for the previous email. It seems that I clicked "send" in
> the wrong window].
> 
> On Tue, 2025-04-29 at 18:50 +0200, Kevin Wolf wrote:
> > Multipath cannot directly provide failover for ioctls in the kernel
> > because it doesn't know what each ioctl means and which result could
> > indicate a path error. Userspace generally knows what the ioctl it
> > issued means and if it might be a path error, but neither does it
> > know
> > which path the ioctl took nor does it necessarily have the privileges
> > to
> > fail a path using the control device.
> 
> Thanks for this effort.
> 
> I have some remarks about your approach. The most important one is that
> the DM_MPATH_PROBE_PATHS_CMD ioctl appears to be a dangerous command.
> It sends IO to possibly broken paths and waits for it to complete. In
> the common error case of a device not responding, this might cause the
> code to hang for a long time in the kernel ioctl code path, in
> uninterruptible sleep (note that these probing commands will be queued

Normal reads and writes would also hang in an uninterruptible sleep if a 
path stops responding.

> after other possibly hanging IO). In the past we have put a lot of
> effort into other code paths in multipath-tools and elsewhere to avoid
> this kind of hang to the extent possible. It seems to me that your set
> re-introduces this risk.
> 
> Apart from that, minor observations are that in patch 2/2 you don't
> check whether the map is in queueing state, and I don't quite
> understand why you only check paths in the map's active path group
> without attempting a PG failover in the case where all paths in the
> current PG fail.
> 
> I am wondering whether the DM_MPATH_PROBE_PATHS_CMD ioctl is necessary
> at all. Rather than triggering explicit path probing, you could have
> dm-mpath "handle" IO errors by failing the active path for "path
> errors". That would be similar to my patch set from 2021 [1], but it
> would avoid the extra IO and thus the additional risk of hanging in the
> kernel.

What exactly do you mean? You say "you could have dm-mpath 'handle' IO 
errors by failing the active path for 'path errors'".

Christoph doesn't want dm-mpath to look at the error code - so dm-mpath 
doesn't know whether a path error occurred or not. qemu could look at the 
error code, but qemu doesn't know which path the SG_IO went through - so 
it can't instruct dm-mpath to fail that path.

One of the possibilities that we discussed was to add a path ID to SG_IO, 
so that dm-mpath would record which path the SG_IO went through and qemu 
could instruct dm-mpath to fail that path. But we rejected it as being 
more complicated than the current approach.
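
To make the flow concrete, here is a minimal userspace sketch of the 
current approach: issue SG_IO on the dm device, guess from the result 
whether it was a path error, ask dm-mpath to probe its active path group, 
and retry. The error heuristic, the retry count and the exact uapi macro 
for the probe ioctl are my assumptions here, not qemu's actual code:

/*
 * Minimal sketch, assuming the probe ioctl from Kevin's series is
 * exposed through <linux/dm-ioctl.h>; names are illustrative only.
 */
#include <stdbool.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>
#include <linux/dm-ioctl.h>

#define MAX_RETRIES 5

/* Guess from the SG_IO result whether this looks like a transport/path
 * error rather than an error reported by the target itself. */
static bool looks_like_path_error(const struct sg_io_hdr *hdr)
{
        return hdr->host_status != 0;
}

static int sgio_with_probe_retry(int dm_fd, struct sg_io_hdr *hdr)
{
        for (int i = 0; i < MAX_RETRIES; i++) {
                if (ioctl(dm_fd, SG_IO, hdr) < 0)
                        return -1;      /* the ioctl itself failed */
                if (!looks_like_path_error(hdr))
                        return 0;       /* success, or a genuine target error */

                /*
                 * Possible path error: ask dm-mpath to probe the paths in
                 * the active path group so that broken paths get failed,
                 * then retry the SG_IO.  The request name is taken from
                 * this thread; the exact uapi definition may differ.
                 */
                if (ioctl(dm_fd, DM_MPATH_PROBE_PATHS_CMD) < 0)
                        return -1;
        }
        return -1;                      /* retries exhausted */
}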

> Contrary to my set, you wouldn't attempt retries in the kernel, but
> leave this part to qemu instead, like in the current set. That would
> avoid Christoph's main criticism that "Failing over SG_IO does not make
> sense" [2].
> 
> Doing the failover in qemu has the general disadvantage that qemu has
> no notion about the number of available and/or healthy paths; it can
> thus only guess the reasonable number of retries for any given I/O. But
> that's unavoidable, given that the idea to do kernel-level failover on
> SG_IO is rejected.
> 
> Please let me know your thoughts.
> 
> Best Regards
> Martin
> 
> [1] https://lore.kernel.org/all/20210628151558.2289-1-mwi...@suse.com/
> [2] https://lore.kernel.org/all/20210701075629.ga25...@lst.de/

Mikulas
