Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?

2019-04-10 Thread Mateusz Guzik
On 4/11/19, Rick Macklem  wrote:
> Mateusz Guzik wrote:
>>On 4/11/19, Rick Macklem  wrote:
>>> Hi,
>>>
>>> I finally got around to looking at what effect replacing pfind_locked()
>>> with
>>> pfind() has for the NFSv4 client and it is broken.
>>>
>>> The problem is that the NFS code needs to call some variant of "pfind()"
>>> while
>>> holding a mutex lock. The current _pfind() code uses the
>>> pidhashtbl_locks,
>>> which are "sx" locks.
>>>
>>> There are a few ways to fix this:
>>> 1 - Create a custom version of _pfind() for the NFS client with the
>>> sx_X()
>>> calls
>>>   removed, plus replace the locking of allproc_lock with locking of
>>> all
>>> the
>>>   pidhashtbl_locks, so that the "sx" locks are acquired before the
>>> mutex.
>>>   --> Not very efficient, but since it is only done once/sec, I can
>>> live
>>> with it.
>>> 2 - Similar to the above, but still lock the allproc_lock and use a loop
>>> of
>>>  FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the
>>>  custom pfind(). (I don't know if this would be preferable to
>>> locking
>>> all
>>>  the pidhashtbl_locks for other users of pfind()?)
>>> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't
>>> need
>>>  to acquire any proc related locks and it just works.
>>>  I can't see anywhere that "sleeps" while holding the
>>> pidhashtbl_locks,
>>> so I
>>>  think they can be converted, although I haven't tried it yet?
>>>
>>> From my perspective, #3 seems the better solution.
>>> What do others think?
>>>
>>
>>Changing the lock type to rwlock may be doable and worthwhile on its own,
>>but I don't think it would constitute the right fix.
>>
>>Preferably there would be an easy to use mechanism which allows
>>registering per-process callbacks. Currently it can be somewhat emulated
>>with EVENTHANDLERs, but it would give calls for all exiting processes, not
>>only the ones of interest. Then there would be no need to periodically
>>scan as you would always get notified on process exit.
> Long ago when I first did the NFSv4 code for OpenBSD2.6, I had a callback
> function
> pointer in "struct proc" which the NFS code set non-null to get a callback.
> { The code still has remnants of that because it still has
> nfscl_cleanup_common(),
>which was code shared by that callback and this approach which was used
> for
>the Mac OS X port, where I couldn't change "struct proc". }
> I have never added anything like that for FreeBSD, but I suppose we could
> look
> at doing it that way.
> To be honest, since the current code works fine and can be difficult to test
> well,
> I hesitate to change to using a callback.
>
>>Note the current code does not ref processes it is interested in any
>>manner and just performs a timestamp check to see if it got the one it
>>expected (with pid reuse in mind).
>>
>>So I think a temporary hack which will do the trick will take the current
>>approach further: rely on struct proc being type-stable (i.e. never being
>>freed) and also store the pointer. You can always safely PROC_LOCK it, do
>>checks to see the proc is alive and has the right timestamp...
> Hmm, so you are saying that every element of the proc_zone always has a
> valid
> p_mtx field in it that can be safely PROC_LOCK()'d no matter if the element
> refers to a process at that time?
> I would also need help with the code to determine if the structure refers
> to
> a process that currently exists with the same pid and creation time.

You can just:
PROC_LOCK(p);
if (p->p_state != PRS_NORMAL) { /* it's not the one you want, bail */ }
/* the current timestamp check goes here */
PROC_UNLOCK(p);
>
> I suppose saving "p" with the lock/open owner string and then doing what
> you
> suggest is possible, but it would take some work.
>
> For now, I can just grab all the pidhashtbl_locks once/sec and fix head so
> it works.
>

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?

2019-04-10 Thread Rick Macklem
Mateusz Guzik wrote:
>On 4/11/19, Rick Macklem  wrote:
>> Hi,
>>
>> I finally got around to looking at what effect replacing pfind_locked()
>> with
>> pfind() has for the NFSv4 client and it is broken.
>>
>> The problem is that the NFS code needs to call some variant of "pfind()"
>> while
>> holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks,
>> which are "sx" locks.
>>
>> There are a few ways to fix this:
>> 1 - Create a custom version of _pfind() for the NFS client with the sx_X()
>> calls
>>   removed, plus replace the locking of allproc_lock with locking of all
>> the
>>   pidhashtbl_locks, so that the "sx" locks are acquired before the
>> mutex.
>>   --> Not very efficient, but since it is only done once/sec, I can live
>> with it.
>> 2 - Similar to the above, but still lock the allproc_lock and use a loop of
>>  FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the
>>  custom pfind(). (I don't know if this would be preferable to locking
>> all
>>  the pidhashtbl_locks for other users of pfind()?)
>> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't
>> need
>>  to acquire any proc related locks and it just works.
>>  I can't see anywhere that "sleeps" while holding the pidhashtbl_locks,
>> so I
>>  think they can be converted, although I haven't tried it yet?
>>
>> From my perspective, #3 seems the better solution.
>> What do others think?
>>
>
>Changing the lock type to rwlock may be doable and worthwhile on its own,
>but I don't think it would constitute the right fix.
>
>Preferably there would be an easy to use mechanism which allows
>registering per-process callbacks. Currently it can be somewhat emulated
>with EVENTHANDLERs, but it would give calls for all exiting processes, not
>only the ones of interest. Then there would be no need to periodically
>scan as you would always get notified on process exit.
Long ago when I first did the NFSv4 code for OpenBSD2.6, I had a callback 
function
pointer in "struct proc" which the NFS code set non-null to get a callback.
{ The code still has remnants of that because it still has 
nfscl_cleanup_common(),
   which was code shared by that callback and this approach which was used for
   the Mac OS X port, where I couldn't change "struct proc". }
I have never added anything like that for FreeBSD, but I suppose we could look
at doing it that way.
To be honest, since the current code works fine and can be difficult to test 
well,
I hesitate to change to using a callback.

>Note the current code does not ref processes it is interested in any
>manner and just performs a timestamp check to see if it got the one it
>expected (with pid reuse in mind).
>
>So I think a temporary hack which will do the trick will take the current
>approach further: rely on struct proc being type-stable (i.e. never being
>freed) and also store the pointer. You can always safely PROC_LOCK it, do
>checks to see the proc is alive and has the right timestamp...
Hmm, so you are saying that every element of the proc_zone always has a valid
p_mtx field in it that can be safely PROC_LOCK()'d no matter if the element
refers to a process at that time?
I would also need help with the code to determine if the structure refers to
a process that currently exists with the same pid and creation time.

I suppose saving "p" with the lock/open owner string and then doing what you
suggest is possible, but it would take some work.

For now, I can just grab all the pidhashtbl_locks once/sec and fix head so it 
works.

rick
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Do the pidhashtbl_locks added by r340742 need to be sx locks?

2019-04-10 Thread Mateusz Guzik
On 4/11/19, Rick Macklem  wrote:
> Hi,
>
> I finally got around to looking at what effect replacing pfind_locked()
> with
> pfind() has for the NFSv4 client and it is broken.
>
> The problem is that the NFS code needs to call some variant of "pfind()"
> while
> holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks,
> which are "sx" locks.
>
> There are a few ways to fix this:
> 1 - Create a custom version of _pfind() for the NFS client with the sx_X()
> calls
>   removed, plus replace the locking of allproc_lock with locking of all
> the
>   pidhashtbl_locks, so that the "sx" locks are acquired before the
> mutex.
>   --> Not very efficient, but since it is only done once/sec, I can live
> with it.
> 2 - Similar to the above, but still lock the allproc_lock and use a loop of
>  FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the
>  custom pfind(). (I don't know if this would be preferable to locking
> all
>  the pidhashtbl_locks for other users of pfind()?)
> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't
> need
>  to acquire any proc related locks and it just works.
>  I can't see anywhere that "sleeps" while holding the pidhashtbl_locks,
> so I
>  think they can be converted, although I haven't tried it yet?
>
> From my perspective, #3 seems the better solution.
> What do others think?
>

Changing the lock type to rwlock may be doable and worthwhile on its own,
but I don't think it would constitute the right fix.

Preferably there would be an easy to use mechanism which allows
registering per-process callbacks. Currently it can be somewhat emulated
with EVENTHANDLERs, but it would give calls for all exiting processes, not
only the ones of interest. Then there would be no need to periodically
scan as you would always get notified on process exit.

Note the current code does not ref processes it is interested in any
manner and just performs a timestamp check to see if it got the one it
expected (with pid reuse in mind).

So I think a temporary hack which will do the trick will take the current
approach further: rely on struct proc being type-stable (i.e. never being
freed) and also store the pointer. You can always safely PROC_LOCK it, do
checks to see the proc is alive and has the right timestamp, all without
needing to pfind or similar.

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Do the pidhashtbl_locks added by r340742 need to be sx locks?

2019-04-10 Thread Rick Macklem
Hi,

I finally got around to looking at what effect replacing pfind_locked() with
pfind() has for the NFSv4 client and it is broken.

The problem is that the NFS code needs to call some variant of "pfind()" while
holding a mutex lock. The current _pfind() code uses the pidhashtbl_locks,
which are "sx" locks.

There are a few ways to fix this:
1 - Create a custom version of _pfind() for the NFS client with the sx_X() calls
  removed, plus replace the locking of allproc_lock with locking of all the
  pidhashtbl_locks, so that the "sx" locks are acquired before the mutex.
  --> Not very efficient, but since it is only done once/sec, I can live 
with it.
2 - Similar to the above, but still lock the allproc_lock and use a loop of
 FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the
 custom pfind(). (I don't know if this would be preferable to locking all
 the pidhashtbl_locks for other users of pfind()?)
3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't need
 to acquire any proc related locks and it just works.
 I can't see anywhere that "sleeps" while holding the pidhashtbl_locks, so I
 think they can be converted, although I haven't tried it yet?

>From my perspective, #3 seems the better solution.
What do others think?

rick
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"