On Thu, 2018-04-12 at 13:41 -0500, Benjamin Marzinski wrote:
> On Mon, Apr 02, 2018 at 09:50:48PM +0200, Martin Wilck wrote:
> > For "find_multipaths smart", check if a path is already in use
> > before setting DM_MULTIPATH_DEVICE_PATH to 1 or 2 (and thus,
> > SYSTEMD_READY=0). If we don't do this, a device which has already
> > been
> > mounted (e.g. during initrd processing) may be unmounted by
> > systemd, causing
> > havoc to the boot process.
> I'm reviewing  v3 of this patch because I don't see patch 17/20 in
> your
> emails from v4. Am I missing an email, or did it not get sent?

It seems so, it didn't reach the dm-devel archive either. Strange.
I got it on my suse.com address, so maybe something went wrong in our
outgoing server. Anyway, v3/17 and v4/17 are identical.

> Signed-off-by: Martin Wilck <mwi...@suse.com>
> > +
> > +           /*
> > +            * If opening the path with O_EXCL fails, the path
> > +            * is in use (e.g. mounted during initramfs
> > processing).
> > +            * We know that it's not used by dm-multipath.
> > +            * We may not set SYSTEMD_READY=0 on such devices,
> > it
> > +            * might cause systemd to umount the device.
> > +            * Use O_RDONLY, because udevd would trigger
> > another
> > +            * uevent for close-after-write.
> > +            *
> > +            * get_refwwid() above stores the path we examine
> > in slot 0.
> > +            */
> > +           pp = VECTOR_SLOT(pathvec, 0);
> > +           fd = open(udev_device_get_devnode(pp->udev),
> > +                     O_RDONLY|O_EXCL);
> I'm worried about this.  Since we can't be sure that is_failed_wwid()
> will really tell us that multipathd has tried to multipath the device
> and failed, 

As I said already, I don't understand why you say that.

I can assert that if is_failed_wwid() returns true, multipathd has
definitely tried and failed since the last reboot, and no (other
instance of) multipathd or multipath has succeeded since then.

If is_failed_wwid() returns false, it's possible that the map already
exists (see patch 18), or that previous/current instances of multipathd
simply didn't try -  we have to check by other means.

> it is totally possible to get a maybe after multipath has
> turned the path device over to the rest of the system.

A transition from "no" to "maybe" is only possible if a single path,
which isn't in the WWIDs file and isn't part of a multipath map,
transitions A) from "failed" to  "not failed" or B) from "blacklisted"
to "not blacklisted". A) means that multipathd has successfully created
a map, thus the path is now part of a map, and we will transition to
"yes" and not to "maybe". B) is pathogical except for the coldplug

However, transitioning from "no" to "yes" in multipath -u is just as
bad as "no" to "maybe", unless the device has already been multipathed.
This is a common case: a second path appears for a once-released
device. I agree that we shouldn't try open(O_EXCL) in that situation.

> Of course, this means I would exlcude the whole second "if (cmd ==
> CMD_VALID_PATH)" section in configure() unless we know that it is
> safe
> to grab the device.  Otherwise, there is nothing to stop us from
> claiming a device that is in use. Clearly that exclusive grab check
> is
> racy at any time except on add events or when the device already is
> set
> to SYSTEMD_READY=0.  I'm pretty sure that the coldplug add event
> after
> the switchroot is safe, since nothing will be racing to grab the
> device
> then. 
> You've already agreed that it should be fine to allow multipathd to
> try
> to create a multipath device on top of a non-claimed path, since we
> can
> just claim it later by issuing a uevent.  I feel like this is just
> another instance of that.  If this isn't a new path, where we have
> excluded everyone else from using it, we can't suddenly claim it just
> because a second path appears. However, if multipathd manages to
> create
> a multipath device on top of it, then it will add the wwid to the
> wwids
> file, and be able to claim it.  But otherwise, I don't think that the
> exclusive grab is safe or reliable enough to allow us to simply do
> this
> on any uevent.
> I would add a new option to multipath, that works with -u, to tell it
> that maybes are allowed. If find_multipaths == FIND_MULTIPATHS_SMART,
> then it should not claim the device if it doesn't get positively
> claimed
> in the first "if (cmd == CMD_VALID_PATH)" section of configure().
> That
> will save us from claiming devices that are already in use, and speed
> the multipath -u calls up.

I don't think we need another option. We can use the uevent environment
in the -u case.


Dr. Martin Wilck <mwi...@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imend├Ârffer, Jane Smithard, Graham Norton
HRB 21284 (AG N├╝rnberg)

dm-devel mailing list

Reply via email to