Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?
On 4/11/19, Rick Macklem wrote: > Mateusz Guzik wrote: >>On 4/11/19, Rick Macklem wrote: >>> Hi, >>> >>> I finally got around to looking at what effect replacing pfind_locked() >>> with >>> pfind() has for the NFSv4 client and it is broken. >>> >>> The problem is that the NFS code needs to call some variant of "pfind()" >>> while >>> holding a mutex lock. The current _pfind() code uses the >>> pidhashtbl_locks, >>> which are "sx" locks. >>> >>> There are a few ways to fix this: >>> 1 - Create a custom version of _pfind() for the NFS client with the >>> sx_X() >>> calls >>> removed, plus replace the locking of allproc_lock with locking of >>> all >>> the >>> pidhashtbl_locks, so that the "sx" locks are acquired before the >>> mutex. >>> --> Not very efficient, but since it is only done once/sec, I can >>> live >>> with it. >>> 2 - Similar to the above, but still lock the allproc_lock and use a loop >>> of >>> FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the >>> custom pfind(). (I don't know if this would be preferable to >>> locking >>> all >>> the pidhashtbl_locks for other users of pfind()?) >>> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't >>> need >>> to acquire any proc related locks and it just works. >>> I can't see anywhere that "sleeps" while holding the >>> pidhashtbl_locks, >>> so I >>> think they can be converted, although I haven't tried it yet? >>> >>> From my perspective, #3 seems the better solution. >>> What do others think? >>> >> >>Changing the lock type to rwlock may be doable and worthwhile on its own, >>but I don't think it would constitute the right fix. >> >>Preferably there would be an easy to use mechanism which allows >>registering per-process callbacks. Currently it can be somewhat emulated >>with EVENTHANDLERs, but it would give calls for all exiting processes, not >>only the ones of interest. Then there would be no need to periodically >>scan as you would always get notified on process exit. > Long ago when I first did the NFSv4 code for OpenBSD2.6, I had a callback > function > pointer in "struct proc" which the NFS code set non-null to get a callback. > { The code still has remnants of that because it still has > nfscl_cleanup_common(), >which was code shared by that callback and this approach which was used > for >the Mac OS X port, where I couldn't change "struct proc". } > I have never added anything like that for FreeBSD, but I suppose we could > look > at doing it that way. > To be honest, since the current code works fine and can be difficult to test > well, > I hesitate to change to using a callback. > >>Note the current code does not ref processes it is interested in any >>manner and just performs a timestamp check to see if it got the one it >>expected (with pid reuse in mind). >> >>So I think a temporary hack which will do the trick will take the current >>approach further: rely on struct proc being type-stable (i.e. never being >>freed) and also store the pointer. You can always safely PROC_LOCK it, do >>checks to see the proc is alive and has the right timestamp... > Hmm, so you are saying that every element of the proc_zone always has a > valid > p_mtx field in it that can be safely PROC_LOCK()'d no matter if the element > refers to a process at that time? > I would also need help with the code to determine if the structure refers > to > a process that currently exists with the same pid and creation time. You can just: PROC_LOCK(p); if (p->p_state != PRS_NORMAL) { /* it's not the one you want, bail */ } /* the current timestamp check goes here */ PROC_UNLOCK(p); > > I suppose saving "p" with the lock/open owner string and then doing what > you > suggest is possible, but it would take some work. > > For now, I can just grab all the pidhashtbl_locks once/sec and fix head so > it works. > -- Mateusz Guzik ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?
Mateusz Guzik wrote: >On 4/11/19, Rick Macklem wrote: >> Hi, >> >> I finally got around to looking at what effect replacing pfind_locked() >> with >> pfind() has for the NFSv4 client and it is broken. >> >> The problem is that the NFS code needs to call some variant of "pfind()" >> while >> holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks, >> which are "sx" locks. >> >> There are a few ways to fix this: >> 1 - Create a custom version of _pfind() for the NFS client with the sx_X() >> calls >> removed, plus replace the locking of allproc_lock with locking of all >> the >> pidhashtbl_locks, so that the "sx" locks are acquired before the >> mutex. >> --> Not very efficient, but since it is only done once/sec, I can live >> with it. >> 2 - Similar to the above, but still lock the allproc_lock and use a loop of >> FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the >> custom pfind(). (I don't know if this would be preferable to locking >> all >> the pidhashtbl_locks for other users of pfind()?) >> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't >> need >> to acquire any proc related locks and it just works. >> I can't see anywhere that "sleeps" while holding the pidhashtbl_locks, >> so I >> think they can be converted, although I haven't tried it yet? >> >> From my perspective, #3 seems the better solution. >> What do others think? >> > >Changing the lock type to rwlock may be doable and worthwhile on its own, >but I don't think it would constitute the right fix. > >Preferably there would be an easy to use mechanism which allows >registering per-process callbacks. Currently it can be somewhat emulated >with EVENTHANDLERs, but it would give calls for all exiting processes, not >only the ones of interest. Then there would be no need to periodically >scan as you would always get notified on process exit. Long ago when I first did the NFSv4 code for OpenBSD2.6, I had a callback function pointer in "struct proc" which the NFS code set non-null to get a callback. { The code still has remnants of that because it still has nfscl_cleanup_common(), which was code shared by that callback and this approach which was used for the Mac OS X port, where I couldn't change "struct proc". } I have never added anything like that for FreeBSD, but I suppose we could look at doing it that way. To be honest, since the current code works fine and can be difficult to test well, I hesitate to change to using a callback. >Note the current code does not ref processes it is interested in any >manner and just performs a timestamp check to see if it got the one it >expected (with pid reuse in mind). > >So I think a temporary hack which will do the trick will take the current >approach further: rely on struct proc being type-stable (i.e. never being >freed) and also store the pointer. You can always safely PROC_LOCK it, do >checks to see the proc is alive and has the right timestamp... Hmm, so you are saying that every element of the proc_zone always has a valid p_mtx field in it that can be safely PROC_LOCK()'d no matter if the element refers to a process at that time? I would also need help with the code to determine if the structure refers to a process that currently exists with the same pid and creation time. I suppose saving "p" with the lock/open owner string and then doing what you suggest is possible, but it would take some work. For now, I can just grab all the pidhashtbl_locks once/sec and fix head so it works. rick ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?
On 4/11/19, Rick Macklem wrote: > Hi, > > I finally got around to looking at what effect replacing pfind_locked() > with > pfind() has for the NFSv4 client and it is broken. > > The problem is that the NFS code needs to call some variant of "pfind()" > while > holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks, > which are "sx" locks. > > There are a few ways to fix this: > 1 - Create a custom version of _pfind() for the NFS client with the sx_X() > calls > removed, plus replace the locking of allproc_lock with locking of all > the > pidhashtbl_locks, so that the "sx" locks are acquired before the > mutex. > --> Not very efficient, but since it is only done once/sec, I can live > with it. > 2 - Similar to the above, but still lock the allproc_lock and use a loop of > FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the > custom pfind(). (I don't know if this would be preferable to locking > all > the pidhashtbl_locks for other users of pfind()?) > 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't > need > to acquire any proc related locks and it just works. > I can't see anywhere that "sleeps" while holding the pidhashtbl_locks, > so I > think they can be converted, although I haven't tried it yet? > > From my perspective, #3 seems the better solution. > What do others think? > Changing the lock type to rwlock may be doable and worthwhile on its own, but I don't think it would constitute the right fix. Preferably there would be an easy to use mechanism which allows registering per-process callbacks. Currently it can be somewhat emulated with EVENTHANDLERs, but it would give calls for all exiting processes, not only the ones of interest. Then there would be no need to periodically scan as you would always get notified on process exit. Note the current code does not ref processes it is interested in any manner and just performs a timestamp check to see if it got the one it expected (with pid reuse in mind). So I think a temporary hack which will do the trick will take the current approach further: rely on struct proc being type-stable (i.e. never being freed) and also store the pointer. You can always safely PROC_LOCK it, do checks to see the proc is alive and has the right timestamp, all without needing to pfind or similar. -- Mateusz Guzik ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Do the pidhashtbl_locks added by r340742 need to be sx locks?
Hi, I finally got around to looking at what effect replacing pfind_locked() with pfind() has for the NFSv4 client and it is broken. The problem is that the NFS code needs to call some variant of "pfind()" while holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks, which are "sx" locks. There are a few ways to fix this: 1 - Create a custom version of _pfind() for the NFS client with the sx_X() calls removed, plus replace the locking of allproc_lock with locking of all the pidhashtbl_locks, so that the "sx" locks are acquired before the mutex. --> Not very efficient, but since it is only done once/sec, I can live with it. 2 - Similar to the above, but still lock the allproc_lock and use a loop of FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the custom pfind(). (I don't know if this would be preferable to locking all the pidhashtbl_locks for other users of pfind()?) 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't need to acquire any proc related locks and it just works. I can't see anywhere that "sleeps" while holding the pidhashtbl_locks, so I think they can be converted, although I haven't tried it yet? >From my perspective, #3 seems the better solution. What do others think? rick ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"