On Thu, Dec 04, 2025 at 08:55:58AM -0800, Brian Bunker wrote:
> On Wed, Dec 3, 2025 at 11:37 AM Benjamin Marzinski <[email protected]> 
> wrote:
> >
> > On Thu, Nov 27, 2025 at 12:31:12PM +0100, Martin Wilck wrote:
> > > On Mon, 2025-11-24 at 15:11 -0500, Benjamin Marzinski wrote:
> > > > On Fri, Nov 21, 2025 at 02:35:23PM -0800, Brian Bunker wrote:
> > > >
> > > > >
> > > > > I added the purge thread because I didn't want to starve the
> > > > > checker thread when a large number of volumes disconnect. I
> > > > > noticed that if I purged inline with the check, things stopped
> > > > > scaling linearly with the number of devices after a point, and
> > > > > the checker thread seemed to be significantly starved. Doing
> > > > > the purge in another thread seemed to relieve that pressure.
> > > >
> > > > That might be because the check thread has safeguards to try to avoid
> > > > starving the other threads. Since a lot of multipathd's work is gated
> > > > by
> > > > the vecs lock, there's only so much parallelism that can happen with
> > > > multiple threads. If deleting the devices is taking a long time, it
> > > > might
> > > > be better for this to get interrupted, so that other threads can run.
> > > >
> > > > Since purging the paths is fairly low priority, we could continue to
> > > > run
> > > > it in its own thread, but instead of running through all the devices
> > > > at
> > > > once, purgeloop could lock the vecs->lock, handle one device, and
> > > > then
> > > > unlock and loop. This means it would need to search through the list
> > > > from the start each time it regrabbed the lock, since the list could
> > > > have changed while it wasn't holding it. When purgeloop makes it all
> > > > the
> > > > way through the list without finding any paths that are due for a
> > > > delete
> > > > attempt, it can sleep or block waiting for more paths.
> > > >
> > > > This would mean that paths would need to remember if they had already
> > > > been handled in a cycle. That could be done by purgeloop keeping
> > > > track
> > > > of which cycle it was on, and the paths storing the cycle number
> > > > whenever they were checked.
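
For reference, the loop I was imagining looks roughly like this
(purge_cycle, next_purge_candidate(), and do_purge() are made-up names
for the sake of the sketch, not existing multipath-tools code):

#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

struct path {
	unsigned int purge_cycle;	/* cycle this path was last handled in */
	/* ... */
};

extern pthread_mutex_t vecs_lock;
/* Returns a path that is due for a delete attempt and hasn't been
 * stamped with the current cycle number yet, or NULL. */
extern struct path *next_purge_candidate(unsigned int cycle);
extern void do_purge(struct path *pp);

static void *purgeloop(void *arg)
{
	unsigned int cycle = 0;

	for (;;) {
		struct path *pp;

		pthread_mutex_lock(&vecs_lock);
		/* Rescan from the start every time: the pathvec may
		 * have changed while the lock was dropped. */
		pp = next_purge_candidate(cycle);
		if (pp) {
			pp->purge_cycle = cycle;
			do_purge(pp);
		}
		pthread_mutex_unlock(&vecs_lock);

		if (!pp) {
			/* Made it all the way through the list: start
			 * a new cycle and sleep or block for more work. */
			cycle++;
			sleep(1);
		}
	}
	return NULL;
}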
> > >
> > > As I wrote in my other post, I wish that this thread wouldn't hold
> > > the vecs lock at all. multipathd could simply use a pipe to write
> > > the dev_t's of to-be-purged paths to the purge thread (or process
> > > :-) ).
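
Something like this, presumably (purge_pipe and purge_devt() are
illustrative names, not existing multipathd code):

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static int purge_pipe[2];	/* set up with pipe(purge_pipe) at startup:
				 * [0] read end (worker), [1] write end */

extern int purge_devt(dev_t devt);	/* open, verify, delete via sysfs */

/* Checker side: queue a path for purging without blocking on it. */
static void queue_purge(dev_t devt)
{
	/* A dev_t is smaller than PIPE_BUF, so this write is atomic;
	 * real code would still handle EINTR and short writes. */
	if (write(purge_pipe[1], &devt, sizeof(devt)) < 0)
		perror("queue_purge");
}

/* Worker thread (or process): drain the pipe and purge, without ever
 * touching the vecs lock. */
static void *purge_worker(void *arg)
{
	dev_t devt;

	while (read(purge_pipe[0], &devt, sizeof(devt)) ==
	       (ssize_t)sizeof(devt))
		purge_devt(devt);
	return NULL;
}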
> >
> > This seems like a good idea. Obviously, writing to sysfs while the vecs
> > lock is being held means that multipath will continue to have that scsi
> > device open, so there's no chance of the device name getting reused if
> > the device is removed from the system. But this other thread/process
> > could simply open the scsi device, double check that it's still
> > disconnected, write to sysfs, and then close the device. That would
> > offer the same guarantee, without the need to interact with anything
> > else.
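
Concretely, the worker's purge_devt() from the sketch above could look
something like this (still_disconnected() is a placeholder for
re-reading the device state from sysfs, and the error handling is
simplified):

#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

extern bool still_disconnected(dev_t devt);

int purge_devt(dev_t devt)
{
	char node[64], attr[96];
	int fd, dfd, ret = -1;

	/* Holding our own open fd pins this instance of the device,
	 * so the devnode can't be recycled underneath us. */
	snprintf(node, sizeof(node), "/dev/block/%u:%u",
		 major(devt), minor(devt));
	fd = open(node, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* Double check that the device is still disconnected, then
	 * ask the kernel to delete it. */
	if (still_disconnected(devt)) {
		snprintf(attr, sizeof(attr),
			 "/sys/dev/block/%u:%u/device/delete",
			 major(devt), minor(devt));
		dfd = open(attr, O_WRONLY);
		if (dfd >= 0) {
			ret = (write(dfd, "1", 1) == 1) ? 0 : -1;
			close(dfd);
		}
	}
	close(fd);
	return ret;
}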
> >
> > > Martin
> >
> Martin,

Actually, that wasn't Martin; it was me.

> If you look at the v3 implementation that I sent up yesterday, you will
> see we are no longer holding the vecs lock during the sysfs delete
> operation. We also made the delete operation non-blocking.
> 
> The two places in the purgeloop where we do hold the lock are both
> very short operations:
> 
> 1. We scan the vector to find one path that needs purging, update
> pp->purge_cycle, and increment the retry counter pp->purge_retries.
> Then we copy the data the purge needs into the purge_info data
> structure and release the lock. These operations shouldn't cause much
> contention.
> 
> 2. We re-acquire the lock to make sure the path still exists in the
> pathvec, then clear purge_path and reset purge_retries. This should
> also be very fast.
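
If I'm reading that right, the first critical section is roughly this
(struct purge_info, find_purge_candidate(), and the field names are
illustrative, not your actual v3 code):

#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>

struct purge_info {
	dev_t devt;
	char dev[32];
};

struct path {
	dev_t devt;
	char dev[32];
	unsigned int purge_cycle;
	int purge_retries;
	/* ... */
};

extern pthread_mutex_t vecs_lock;
extern struct path *find_purge_candidate(unsigned int cycle);

/* Pick a candidate, stamp it, copy out what the purge needs, and drop
 * the lock before any sysfs I/O happens. */
static int grab_purge_work(unsigned int cycle, struct purge_info *pi)
{
	struct path *pp;
	int found = 0;

	pthread_mutex_lock(&vecs_lock);
	pp = find_purge_candidate(cycle);
	if (pp) {
		pp->purge_cycle = cycle;
		pp->purge_retries++;
		pi->devt = pp->devt;
		snprintf(pi->dev, sizeof(pi->dev), "%s", pp->dev);
		found = 1;
	}
	pthread_mutex_unlock(&vecs_lock);
	return found;
}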
> 
> We can't take the approach of opening the device, since it is
> disconnected; I think the open will just fail or hang. We hold the
> udev reference to make sure the device object stays alive and isn't
> freed underneath us.

That makes sense. Unfortunately, I checked, and holding a reference on
the udev device does not keep the scsi device from being deleted, or
prevent a new device from being created with the same devnode.

My comments on your v3 patch talk about using dup() to copy the
existing pp->fd to deal with this, which should work regardless of the
path state.
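
Roughly (a sketch, with struct path trimmed down to the one relevant
field):

#include <unistd.h>

struct path { int fd; /* ... */ };

/* Take a private reference to the already-open device while the path
 * is known to be valid (i.e. under the vecs lock). dup() copies the
 * existing open file description, so it works regardless of the path
 * state -- no new open() against a disconnected device is needed. */
static int grab_device_ref(struct path *pp)
{
	return dup(pp->fd);
}

The purge code can then hold that fd across the sysfs delete and close
it afterwards, which keeps the devnode from being reused in the
meantime.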

-Ben
 
> Thanks,
> Brian
> 
> -- 
> Brian Bunker
> PURE Storage, Inc.
> [email protected]

