tentitive ucred mutex cleanup patch, MATT's version (was Re: ucred holding patch, BDE version)

2002-02-16 Thread Matthew Dillon

Please review this patch.  I haven't looked at BDE's patch but this was
so simple I decided to just go do it.

There are many situations where the allocation of a system structure can 
be safely done with Giant, but deallocation occurs dangerously deep
in the call stack.  For these situations I believe it makes sense to
have a recycle queue (maybe even a global recycle queue) to handle
freeing the refcount-0 structures at a later time (like in the allocation
code where we might already need Giant).  At least until MALLOC/FREE are
really made MP safe.

In anycase, here is the patch.  It is VERY simple and it also cuts the
size of the ucred structure down by a huge amount.

As a bonus, here are the gettimeofday() loop test results with this
code and the gettimeofday() giant removal:

CURRENT, gettimeofday() SMP patch, ucred patch:

1 TG running:   396365 calls/sec
2 TGs running:  322000 calls/sec each.

Now with my gdb -k on /dev/mem I note that the contention is occuring
in userret(), where Giant is being obtained for the signal processing.
If we could optimize that I'll bet the call rate would go above 400K calls
per second with two running.

-Matt

Index: sys/ucred.h
===
RCS file: /home/ncvs/src/sys/sys/ucred.h,v
retrieving revision 1.26
diff -u -r1.26 ucred.h
--- sys/ucred.h 11 Oct 2001 23:38:17 -  1.26
+++ sys/ucred.h 16 Feb 2002 21:36:58 -
@@ -60,8 +60,9 @@
struct  uidinfo *cr_uidinfo;/* per euid resource consumption */
struct  uidinfo *cr_ruidinfo;   /* per ruid resource consumption */
struct  prison *cr_prison;  /* jail(4) */
-#definecr_endcopy cr_mtx
-   struct  mtx cr_mtx; /* protect refcount */
+#definecr_endcopy cr_mtxp
+   struct  mtx *cr_mtxp;   /* protect refcount */
+   struct  ucred *cr_freenext; /* next on free list */
 };
 #define cr_gid cr_groups[0]
 #define NOCRED ((struct ucred *)0) /* no credential available */
Index: kern/kern_prot.c
===
RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v
retrieving revision 1.133
diff -u -r1.133 kern_prot.c
--- kern/kern_prot.c16 Jan 2002 06:55:30 -  1.133
+++ kern/kern_prot.c16 Feb 2002 21:42:17 -
@@ -72,6 +72,9 @@
int dummy;
 };
 #endif
+
+static struct ucred *ucred_free_list;
+
 /*
  * MPSAFE
  */
@@ -1582,11 +1585,40 @@
 struct ucred *
 crget()
 {
-   register struct ucred *cr;
+   struct ucred *cr;
+
+   GIANT_REQUIRED;
+
+   mtx_pool_lock(ucred_free_list);
+   if (ucred_free_list) {
+   cr = ucred_free_list;
+   ucred_free_list = cr-cr_freenext;
+   cr-cr_freenext = NULL;
+   mtx_pool_unlock(ucred_free_list);
 
-   MALLOC(cr, struct ucred *, sizeof(*cr), M_CRED, M_WAITOK | M_ZERO);
+   /*
+* Cleanup previous user of this recycled ucred
+*
+* Some callers of crget(), such as nfs_statfs(),
+* allocate a temporary credential, but don't
+* allocate a uidinfo structure.
+*/
+   if (cr-cr_uidinfo != NULL)
+   uifree(cr-cr_uidinfo);
+   if (cr-cr_ruidinfo != NULL)
+   uifree(cr-cr_ruidinfo);
+   /*
+* Free a prison, if any.
+*/
+   if (jailed(cr))
+   prison_free(cr-cr_prison);
+   } else {
+   mtx_pool_unlock(ucred_free_list);
+   MALLOC(cr, struct ucred *, sizeof(*cr), M_CRED, M_WAITOK);
+   }
+   bzero(cr, sizeof(struct ucred));
cr-cr_ref = 1;
-   mtx_init(cr-cr_mtx, ucred, MTX_DEF);
+   cr-cr_mtxp = mtx_pool_find(cr);
return (cr);
 }
 
@@ -1598,42 +1630,36 @@
struct ucred *cr;
 {
 
-   mtx_lock(cr-cr_mtx);
+   mtx_lock(cr-cr_mtxp);
cr-cr_ref++;
-   mtx_unlock(cr-cr_mtx);
+   mtx_unlock(cr-cr_mtxp);
return (cr);
 }
 
 /*
  * Free a cred structure.
  * Throws away space when ref count gets to 0.
+ *
+ * XXX we should not have to obtain the mutex lock if cr_ref is 1, since
+ * we are the last user.
  */
 void
 crfree(cr)
struct ucred *cr;
 {
+   struct mtx *mtxp = cr-cr_mtxp;
 
-   mtx_lock(cr-cr_mtx);
+   mtx_lock(mtxp);
KASSERT(cr-cr_ref  0, (bad ucred refcount: %d, cr-cr_ref));
if (--cr-cr_ref == 0) {
-   mtx_destroy(cr-cr_mtx);
-   /*
-* Some callers of crget(), such as nfs_statfs(),
-* allocate a temporary credential, but don't
-* allocate a uidinfo structure.
-*/
-   if (cr-cr_uidinfo != NULL)
-   

Re: ucred holding patch, BDE version

2002-02-11 Thread Terry Lambert

Julian Elischer wrote:
  This makes little sense to me.
 
  Maybe I'm missing something, but by virtue of ownership we don't
  have to worry about the ucred's refcount on entry into the kernel
  because it is the owner and no one else is allowed to change our
  privledges besideds ourselves via set[ug]id().
 
 multiple threads can do it..
 
  The proclock is needed to get the reference,
 guarding against other threads, and giant is needed fo rnot to free it
 because if it reaches a refcount of 0 it needs to call free(). (which john
 assures me needs Giant at this time).
 We could avoid the proclock with judicious use of an atomic refcount
 incrementing method.
 
 When Giant goes away it won't be so bad but it will STILL be quicker to
 not drop it across userland.

The multiple threads argument is bogus, since the calls
to [gs]et[ug]id() are on a per process, not a per thread
basis.

An argument which *may* not be bogus (I am unconvinced) is
that creds are immutable once instanced, and that the calls
to [gs]et[ug]id() instance a new cred and replace, rather
than changing an existing cred (this logically follows from
credential inheritance, or the first set call would change
the cred used by init and all other processes).


Personally, I still do not understand the need to have a
cred reference per thread, the only thing that makes any
sense about that is to optimize the degenerate case of a
daemon that makes calls as another ID, on behalf of a lot
of users (or, sequentially, at least, different users).
One example of such a program would be SAMBA (but *not*
NFS, due to access semantics on objects based on path
component access exclusion by credential not being an
effective mechanism for NFS file handles).

I think that you would need to have [gs]et[ug]id() be on a
per thread basis for this to be an efficiency, and I think
trying to do this pessimizes everything else.

My gut tells me that creds should be per process, and
that the references to them should be taken sparingly,
and then only if a need can be justified, rather than
just in case some day.


Kirk at one time called vnodes the structure that ate the
kernel; he was wrong: it was creds.


Perhaps this dicsussion is enough impetus to justify
revisiting the atomic_t type definitions, which would
be useful as reference counted hold/release mechanisms
that would obviate the need for locks here?  This would
at least let you defer getting rid of the per thread
cred instances until later.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



RE: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 11-Feb-02 Julian Elischer wrote:
 here is the BDE version ready to commit.
 Extended to other architectures.
 
 Bruce, John, comments?
 
 As I was adding a prototype to ucred.h I stripped the __Ps of the others in
 that
 section
 (in the spirit of change it when editing it anyhow

Hmm, acquire_ucred (don't really like that name, maybe thread_updatecred(td)
which can use td_proc to get the proc) probably should be declared in
sys/proc.h.  Well, maybe not, sys/ucred.h is probably fine.  But it's
implementation should then be in kern_prot.c along with all the other ucred
related functions. :)
 
Also, please make the comment above the function into a complete sentence and
capitalize appropriately, etc. as per style(9) just to be pedantic.  I guess
removing __P() as you go is ok if that spirit is what the -arch thread is
desired.  Personally I thought it should be the other way around just like we
don't mix whitespace commits with code commits to avoid obfuscating function
changes with style changes.  IMO, just commit to ucred.h blowing away __P()
first, then commit your functional changes with the rest.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 11-Feb-02 Alfred Perlstein wrote:
 * Julian Elischer [EMAIL PROTECTED] [020211 15:00] wrote:
 In the current world, when the thread enters userland, it does:
 
 lock giant
 crfree() (which includes mutexes)
 unlock giant
 
 This isn't needed afaik.

Yes, calling free() without Giant is about as good as calling fdrop() without
Giant Alfred. :)

 if there are ASTs it does this once again for each AST waiting as well.
 
 And on the way into the system it does:
 lock process
 crhold() (which includes mutex ops)
 unlock process
 
 This isn't needed, at least afaik.

Not strictly for the comparison as Julian and Terry pointed out since the race
can occur anyway (i.e., you don't need the lock to see if p_ucred changed),
however, if you are actually doing a crhold(), you want to make sure p_ucred
isn't stale, so you need the proc lock.

 so if there is a single AST (not uncommon) it does on a system call, 4 to
 6 locks and 4 to 6 unlocks just to get a reference on the ucred it already
 had a reference on. By not freeing it when going to userland, and checking
 if it is the right one when returning to the kernel, we replace that with
 a pointer comparison (well maybe 2) 99.999% of the time.
 
 John still wants to free it if INVARIANTS is on so he canh trap on
 inapropriate access. I'm not sure it's needed but am willing to do so..
 
 This makes little sense to me.
 
 Maybe I'm missing something, but by virtue of ownership we don't
 have to worry about the ucred's refcount on entry into the kernel
 because it is the owner and no one else is allowed to change our
 privledges besideds ourselves via set[ug]id().

Umm, do you understand the entire thread ucred reference at all?  What if you
have two threads on two different CPU's from teh same process and one changes
your ucred out from under you while you are handlign a syscall?  The idea here
is to ensure that a thread has a consistent ucred for an entire user trap or
syscall to avoid races involving credentials.

 Therefore the additional hold on entry is completely useless no
 matter what and with that the release on exit is also useless.

No, you just haven't been keeping up.  I added td_ucred at least a month or so
ago and it was discussed on -arch before it went in.  I have a big honking
patch to test when I get back from BSDCon that changes the kernel to use
td_ucred almost everywhere instead of p_ucred so that we actualyl use the
per-thread ucred (which will be needed before we can stop being a 1:1:1:1
system) and also means credential ops such as suser(), and p_can* don't need
locks for the current thread.  (p_can* still need locks for the target process).

 -Alfred

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Julian Elischer wrote:
  The proclock is needed to get the reference,
 guarding against other threads, and giant is needed fo rnot to free it
 because if it reaches a refcount of 0 it needs to call free(). (which john
 assures me needs Giant at this time).
 We could avoid the proclock with judicious use of an atomic refcount
 incrementing method.

_No_!  The proc lock is protecting p_ucred, it can't go away!  What _can_ go
away is the per-ucred mutex to protect the refcount if we ever revive the
refcount API.

 When Giant goes away it won't be so bad but it will STILL be quicker to
 not drop it across userland.

Yes.  Actually, calling free() can still be rather expensive even when Giant is
gone.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Terry Lambert wrote:
 Julian Elischer wrote:
  This makes little sense to me.
 
  Maybe I'm missing something, but by virtue of ownership we don't
  have to worry about the ucred's refcount on entry into the kernel
  because it is the owner and no one else is allowed to change our
  privledges besideds ourselves via set[ug]id().
 
 multiple threads can do it..
 
  The proclock is needed to get the reference,
 guarding against other threads, and giant is needed fo rnot to free it
 because if it reaches a refcount of 0 it needs to call free(). (which john
 assures me needs Giant at this time).
 We could avoid the proclock with judicious use of an atomic refcount
 incrementing method.
 
 When Giant goes away it won't be so bad but it will STILL be quicker to
 not drop it across userland.
 
 The multiple threads argument is bogus, since the calls
 to [gs]et[ug]id() are on a per process, not a per thread
 basis.

Thread 1 makes a syscall that blocks.  Say it blocks in one of 3 VOP's all of
which need a credential.  Thread 2 changes the credentials of the process. 
Thread 1 now resumes assuming that say a VADMIN or VACCESS suceeded with the
old cred that may not be valid any longer and performs VOP's with the now newer
credential (which it may even read a stale value of w/o a lock thus using some
random memory for the creds) to do its other VOP's which may now fail weirdly.

The idea of per-thread ucred references is so that a thread will have the same
credentials for the lifetime of a syscall so that the entire syscall is self
consistent.  It also means that except for when we update the ucred reference,
we don't need locks to access thread credentials since the thread references
are to read-only credentials.  We discussed this on -arch _months_ ago and
everyone agreed with it then.

 Perhaps this dicsussion is enough impetus to justify
 revisiting the atomic_t type definitions, which would
 be useful as reference counted hold/release mechanisms
 that would obviate the need for locks here?  This would
 at least let you defer getting rid of the per thread
 cred instances until later.

All having the refcount_t and other refcount_* functions would do is let us get
rid of the per-ucred mutex (actually a pool mutex, so the overhead per ucred is
just a pointer right now).  It wouln't change the fact that we need a lock to
make sure p_ucred is up to date before we read a value we need to depend on or
actually use.

 -- Terry

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



RE: ucred holding patch, BDE version

2002-02-11 Thread Julian Elischer



On Mon, 11 Feb 2002, John Baldwin wrote:

 
 On 11-Feb-02 Julian Elischer wrote:
  here is the BDE version ready to commit.
  Extended to other architectures.
  
  Bruce, John, comments?
  
  As I was adding a prototype to ucred.h I stripped the __Ps of the others in
  that
  section
  (in the spirit of change it when editing it anyhow
 
 Hmm, acquire_ucred (don't really like that name, maybe thread_updatecred(td)
 which can use td_proc to get the proc) probably should be declared in
 sys/proc.h.  Well, maybe not, sys/ucred.h is probably fine.  But it's
 implementation should then be in kern_prot.c along with all the other ucred
 related functions. :)

I guess so.
The name requires changing anyhow as it was pointed out to me that Bruce
mis-spelled acquire and I didn't notice.

  
 Also, please make the comment above the function into a complete sentence and
 capitalize appropriately, etc. as per style(9) just to be pedantic.  I guess
 removing __P() as you go is ok if that spirit is what the -arch thread is
 desired.  Personally I thought it should be the other way around just like we
 don't mix whitespace commits with code commits to avoid obfuscating function
 changes with style changes.  IMO, just commit to ucred.h blowing away __P()
 first, then commit your functional changes with the rest.

hmmm I am completely confused as to which way we ended up deciding then..
:-)

 
 -- 
 
 John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-current in the body of the message
 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Julian Elischer



On Mon, 11 Feb 2002, Terry Lambert wrote:

 Julian Elischer wrote:
   This makes little sense to me.
  
   Maybe I'm missing something, but by virtue of ownership we don't
   have to worry about the ucred's refcount on entry into the kernel
   because it is the owner and no one else is allowed to change our
   privledges besideds ourselves via set[ug]id().
  
  multiple threads can do it..
  
   The proclock is needed to get the reference,
  guarding against other threads, and giant is needed fo rnot to free it
  because if it reaches a refcount of 0 it needs to call free(). (which john
  assures me needs Giant at this time).
  We could avoid the proclock with judicious use of an atomic refcount
  incrementing method.
  
  When Giant goes away it won't be so bad but it will STILL be quicker to
  not drop it across userland.
 
 The multiple threads argument is bogus, since the calls
 to [gs]et[ug]id() are on a per process, not a per thread
 basis.

there is no such thing as a per process syscall.
Two threads can always do the same syscall at the same time.
there needs to be a proc-lock to stop it from becoming
chaotic in there. In actual fact, since you cannot alter a cred
but only replace that which the process points to it's not 
quite that bad, but you need to either lock it or have atomic
reference-counting that can handle the possibility that
the cred could have bee decremented to 0 by another thread just before
you checked it.


 
 An argument which *may* not be bogus (I am unconvinced) is
 that creds are immutable once instanced, and that the calls
 to [gs]et[ug]id() instance a new cred and replace, rather
 than changing an existing cred (this logically follows from
 credential inheritance, or the first set call would change
 the cred used by init and all other processes).
 
 
 Personally, I still do not understand the need to have a
 cred reference per thread, the only thing that makes any
 sense about that is to optimize the degenerate case of a
 daemon that makes calls as another ID, on behalf of a lot
 of users (or, sequentially, at least, different users).
 One example of such a program would be SAMBA (but *not*
 NFS, due to access semantics on objects based on path
 component access exclusion by credential not being an
 effective mechanism for NFS file handles).

the cred that is in force at  the time that the syscall STARTS
is used for the full syscall otherwise you could have
one cred used for the first part of a syscall and a completely 
differnet one used for the secnd part of a syscall.

 
 I think that you would need to have [gs]et[ug]id() be on a
 per thread basis for this to be an efficiency, and I think
 trying to do this pessimizes everything else.
 
 My gut tells me that creds should be per process, and
 that the references to them should be taken sparingly,
 and then only if a need can be justified, rather than
 just in case some day.

creads can only be changed per process but the threads only pick
up the change on next syscall startup.

 
 
 Kirk at one time called vnodes the structure that ate the
 kernel; he was wrong: it was creds.

I believe it was Mike Karels.

 
 
 Perhaps this dicsussion is enough impetus to justify
 revisiting the atomic_t type definitions, which would
 be useful as reference counted hold/release mechanisms
 that would obviate the need for locks here?  This would
 at least let you defer getting rid of the per thread
 cred instances until later.

I've made that point before and I believe that jhb has said he would like
such primatives.

 
 -- Terry
 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



RE: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 11-Feb-02 Julian Elischer wrote:
 here is the BDE version ready to commit.
 Extended to other architectures.
 
 Bruce, John, comments?
 
 As I was adding a prototype to ucred.h I stripped the __Ps of the others in
 that
 section
 (in the spirit of change it when editing it anyhow

Hmm, acquire_ucred (don't really like that name, maybe thread_updatecred(td)
which can use td_proc to get the proc) probably should be declared in
sys/proc.h.  Well, maybe not, sys/ucred.h is probably fine.  But it's
implementation should then be in kern_prot.c along with all the other ucred
related functions. :)
 
Also, please make the comment above the function into a complete sentence and
capitalize appropriately, etc. as per style(9) just to be pedantic.  I guess
removing __P() as you go is ok if that spirit is what the -arch thread is
desired.  Personally I thought it should be the other way around just like we
don't mix whitespace commits with code commits to avoid obfuscating function
changes with style changes.  IMO, just commit to ucred.h blowing away __P()
first, then commit your functional changes with the rest.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Julian Elischer wrote:
  The proclock is needed to get the reference,
 guarding against other threads, and giant is needed fo rnot to free it
 because if it reaches a refcount of 0 it needs to call free(). (which john
 assures me needs Giant at this time).
 We could avoid the proclock with judicious use of an atomic refcount
 incrementing method.

_No_!  The proc lock is protecting p_ucred, it can't go away!  What _can_ go
away is the per-ucred mutex to protect the refcount if we ever revive the
refcount API.

 When Giant goes away it won't be so bad but it will STILL be quicker to
 not drop it across userland.

Yes.  Actually, calling free() can still be rather expensive even when Giant is
gone.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 11-Feb-02 Alfred Perlstein wrote:
 * Julian Elischer [EMAIL PROTECTED] [020211 15:00] wrote:
 In the current world, when the thread enters userland, it does:
 
 lock giant
 crfree() (which includes mutexes)
 unlock giant
 
 This isn't needed afaik.

Yes, calling free() without Giant is about as good as calling fdrop() without
Giant Alfred. :)

 if there are ASTs it does this once again for each AST waiting as well.
 
 And on the way into the system it does:
 lock process
 crhold() (which includes mutex ops)
 unlock process
 
 This isn't needed, at least afaik.

Not strictly for the comparison as Julian and Terry pointed out since the race
can occur anyway (i.e., you don't need the lock to see if p_ucred changed),
however, if you are actually doing a crhold(), you want to make sure p_ucred
isn't stale, so you need the proc lock.

 so if there is a single AST (not uncommon) it does on a system call, 4 to
 6 locks and 4 to 6 unlocks just to get a reference on the ucred it already
 had a reference on. By not freeing it when going to userland, and checking
 if it is the right one when returning to the kernel, we replace that with
 a pointer comparison (well maybe 2) 99.999% of the time.
 
 John still wants to free it if INVARIANTS is on so he canh trap on
 inapropriate access. I'm not sure it's needed but am willing to do so..
 
 This makes little sense to me.
 
 Maybe I'm missing something, but by virtue of ownership we don't
 have to worry about the ucred's refcount on entry into the kernel
 because it is the owner and no one else is allowed to change our
 privledges besideds ourselves via set[ug]id().

Umm, do you understand the entire thread ucred reference at all?  What if you
have two threads on two different CPU's from teh same process and one changes
your ucred out from under you while you are handlign a syscall?  The idea here
is to ensure that a thread has a consistent ucred for an entire user trap or
syscall to avoid races involving credentials.

 Therefore the additional hold on entry is completely useless no
 matter what and with that the release on exit is also useless.

No, you just haven't been keeping up.  I added td_ucred at least a month or so
ago and it was discussed on -arch before it went in.  I have a big honking
patch to test when I get back from BSDCon that changes the kernel to use
td_ucred almost everywhere instead of p_ucred so that we actualyl use the
per-thread ucred (which will be needed before we can stop being a 1:1:1:1
system) and also means credential ops such as suser(), and p_can* don't need
locks for the current thread.  (p_can* still need locks for the target process).

 -Alfred

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Alfred Perlstein

* Julian Elischer [EMAIL PROTECTED] [020211 15:00] wrote:
 In the current world, when the thread enters userland, it does:
 
 lock giant
 crfree() (which includes mutexes)
 unlock giant

This isn't needed afaik.

 if there are ASTs it does this once again for each AST waiting as well.
 
 And on the way into the system it does:
 lock process
 crhold() (which includes mutex ops)
 unlock process

This isn't needed, at least afaik.

 so if there is a single AST (not uncommon) it does on a system call, 4 to
 6 locks and 4 to 6 unlocks just to get a reference on the ucred it already
 had a reference on. By not freeing it when going to userland, and checking
 if it is the right one when returning to the kernel, we replace that with
 a pointer comparison (well maybe 2) 99.999% of the time.
 
 John still wants to free it if INVARIANTS is on so he canh trap on
 inapropriate access. I'm not sure it's needed but am willing to do so..

This makes little sense to me.

Maybe I'm missing something, but by virtue of ownership we don't
have to worry about the ucred's refcount on entry into the kernel
because it is the owner and no one else is allowed to change our
privledges besideds ourselves via set[ug]id().

Therefore the additional hold on entry is completely useless no
matter what and with that the release on exit is also useless.

-Alfred

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Julian Elischer



On Mon, 11 Feb 2002, Alfred Perlstein wrote:

 * Julian Elischer [EMAIL PROTECTED] [020211 15:00] wrote:
  In the current world, when the thread enters userland, it does:
  
  lock giant
  crfree() (which includes mutexes)
  unlock giant
 
 This isn't needed afaik.

Argue with jhb.

 
  if there are ASTs it does this once again for each AST waiting as well.
  
  And on the way into the system it does:
  lock process
  crhold() (which includes mutex ops)
  unlock process
 
 This isn't needed, at least afaik.

see below.


 
  so if there is a single AST (not uncommon) it does on a system call, 4 to
  6 locks and 4 to 6 unlocks just to get a reference on the ucred it already
  had a reference on. By not freeing it when going to userland, and checking
  if it is the right one when returning to the kernel, we replace that with
  a pointer comparison (well maybe 2) 99.999% of the time.
  
  John still wants to free it if INVARIANTS is on so he canh trap on
  inapropriate access. I'm not sure it's needed but am willing to do so..
 
 This makes little sense to me.
 
 Maybe I'm missing something, but by virtue of ownership we don't
 have to worry about the ucred's refcount on entry into the kernel
 because it is the owner and no one else is allowed to change our
 privledges besideds ourselves via set[ug]id().

multiple threads can do it..


 The proclock is needed to get the reference,
guarding against other threads, and giant is needed fo rnot to free it
because if it reaches a refcount of 0 it needs to call free(). (which john
assures me needs Giant at this time).
We could avoid the proclock with judicious use of an atomic refcount
incrementing method.

When Giant goes away it won't be so bad but it will STILL be quicker to
not drop it across userland.

 
 Therefore the additional hold on entry is completely useless no
 matter what and with that the release on exit is also useless.
 
 -Alfred
 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Julian Elischer



On Mon, 11 Feb 2002, John Baldwin wrote:

 
 On 12-Feb-02 Julian Elischer wrote:
   The proclock is needed to get the reference,
  guarding against other threads, and giant is needed fo rnot to free it
  because if it reaches a refcount of 0 it needs to call free(). (which john
  assures me needs Giant at this time).
  We could avoid the proclock with judicious use of an atomic refcount
  incrementing method.
 
 _No_!  The proc lock is protecting p_ucred, it can't go away!  What _can_ go
 away is the per-ucred mutex to protect the refcount if we ever revive the
 refcount API.

hmm ok Alfred, here's the way I see it..
You are never permitted to CHANGE a cred.
You ALWAYS allocate another and switch them.
When you switch them you decrement the refcount of the old one.
If someone else takes a reference on it at the same moment that you 
drop it, then the order is important down to the bus-cycle
as to whether it gets freed or not. We already know that dereferencing it
from the proc structure is not important, because a stale ucred pointer
is only preventable from the userland. 
All that is important  is that the pointer is a REAL pointer to a cred
and that it is not allowed to go to 0 references in its way to 1
reference.
An atomic reference count increment that checks
against 0 would be part of it but might not be enough.
I think we also need to have a lock on something because
it might get freed and used for something else that happens to place a 1
in that location inbetween the time that p-p_ucred is read
and the refcount is decremented.

As an asside:
I think that in NT they may have refcounts in the same location in all
structures as I think they derived all their structures from a bas class
that has ref counts. In that case it WOULD have a 1 in that location if
it were re-used. (It's been a long time since I saw NT code so I may be
wrong)

 
  When Giant goes away it won't be so bad but it will STILL be quicker to
  not drop it across userland.
 
 Yes.  Actually, calling free() can still be rather expensive even when Giant is
 gone.
 
 -- 
 
 John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-current in the body of the message
 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Terry Lambert

John Baldwin wrote:
 Yes, calling free() without Giant is about as good as calling fdrop() without
 Giant Alfred. :)

Alfred would be right, for per processor memory pools.  8-).

  And on the way into the system it does:
  lock process
  crhold() (which includes mutex ops)
  unlock process
 
  This isn't needed, at least afaik.
 
 Not strictly for the comparison as Julian and Terry pointed out since the race
 can occur anyway (i.e., you don't need the lock to see if p_ucred changed),
 however, if you are actually doing a crhold(), you want to make sure p_ucred
 isn't stale, so you need the proc lock.

No.  If you _depend_ on the frequency of change being low,
you can do this with only atomic reference counts.  See the
pseudo code in my other posting, in direct response to you.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Terry Lambert

Julian Elischer wrote:
  The multiple threads argument is bogus, since the calls
  to [gs]et[ug]id() are on a per process, not a per thread
  basis.
 
 there is no such thing as a per process syscall.
 Two threads can always do the same syscall at the same time.
 there needs to be a proc-lock to stop it from becoming
 chaotic in there. In actual fact, since you cannot alter a cred
 but only replace that which the process points to it's not
 quite that bad, but you need to either lock it or have atomic
 reference-counting that can handle the possibility that
 the cred could have bee decremented to 0 by another thread just before
 you checked it.

I would argue that:

1)  There are a small number of system calls where this
is true.

2)  It's worth doing the synchornization in-kernel for
the process alone, where this is the case.

The problem I have with the crd locking is that each process
does not do a gratuitous clone of the active cred before doing
its own thing (if you will remember, I suggested this in the
context of the cred reference count overflow bug, back when I
found the problem last year).

The upshot of this is that the lock will stall anyone using
that cred.  This argues that creds should be, minimally,
per process, if not per CPU, instead of shared references
smeared all over the place, and locked on each reference,
even though it's not possible for a cred to be changed at
all out from under you -- only replaced wholesale, since
once instanced, a cred may not be changed.

Ask John and Robert about the permissability of changing
cred contents, and the NFS changes that resulted (fairly)
recently.

Personally, I think that cred changes are rare enough,
even in the degenerate case, that it's worth taking the
synchronization event as an intraprocess global IPI,
rather than using a lock.


  Personally, I still do not understand the need to have a
  cred reference per thread, the only thing that makes any
  sense about that is to optimize the degenerate case of a
  daemon that makes calls as another ID, on behalf of a lot
  of users (or, sequentially, at least, different users).
  One example of such a program would be SAMBA (but *not*
  NFS, due to access semantics on objects based on path
  component access exclusion by credential not being an
  effective mechanism for NFS file handles).
 
 the cred that is in force at  the time that the syscall STARTS
 is used for the full syscall otherwise you could have
 one cred used for the first part of a syscall and a completely
 differnet one used for the secnd part of a syscall.

Again: the model permits dropping, but not gaining, of
priviledges.  THe worst case failure, then, is:

1)  A call starts on a thread without proper program
level stall synchornization between threads.

2)  A subsequent thread drops priviledges, and this
drop is unprotected by the stall.

3)  The call completes, but fails during completion
for lack of privledges present when the call was
started, because the drop was not stalled.

Frankly, I don't see this being a problem: badly written
code with a race condition will occasionally lose the race,
rather than being implicitly protected (an assumption that
is not portable, if it is depended upon), and the code will
end up being fixed by its author.


  I think that you would need to have [gs]et[ug]id() be on a
  per thread basis for this to be an efficiency, and I think
  trying to do this pessimizes everything else.
 
  My gut tells me that creds should be per process, and
  that the references to them should be taken sparingly,
  and then only if a need can be justified, rather than
  just in case some day.
 
 creads can only be changed per process but the threads only pick
 up the change on next syscall startup.

I think this is the fundamental misunderstanding: creds
never change.  The can only be replaced, and then only
with a cred of equal or lesser priviledge.


  Kirk at one time called vnodes the structure that ate the
  kernel; he was wrong: it was creds.
 
 I believe it was Mike Karels.

Whatever.  It's creds now.


  Perhaps this dicsussion is enough impetus to justify
  revisiting the atomic_t type definitions, which would
  be useful as reference counted hold/release mechanisms
  that would obviate the need for locks here?  This would
  at least let you defer getting rid of the per thread
  cred instances until later.
 
 I've made that point before and I believe that jhb has said he would like
 such primatives.

I'm still not convinced that it's necessary to lock what
you are trying to lock, but I certainly agree on atomic_t.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread Terry Lambert

John Baldwin wrote:
  The multiple threads argument is bogus, since the calls
  to [gs]et[ug]id() are on a per process, not a per thread
  basis.
 
 Thread 1 makes a syscall that blocks.  Say it blocks in one of 3 VOP's all of
 which need a credential.  Thread 2 changes the credentials of the process.
 Thread 1 now resumes assuming that say a VADMIN or VACCESS suceeded with the
 old cred that may not be valid any longer and performs VOP's with the now newer
 credential (which it may even read a stale value of w/o a lock thus using some
 random memory for the creds) to do its other VOP's which may now fail weirdly.

Then the combination of calls fails.  There is no case where
it will succeed when previously it would not.  I would argue
that the failure is a lost race condition that exists in the
code anyway, and that this approach merely makes the race
failure more reliably reproducible.  8-).

I think you could fix this case, in any event, by saving a
reference to the credential at the start (or, better, as a
result of the sleep, when you know it's going to happen),
any time you are going to split a system call across multiple
potential blocking operations.  That certainly is *NOT* as
common to all system calls as this discussion implies.


 The idea of per-thread ucred references is so that a thread will have the same
 credentials for the lifetime of a syscall so that the entire syscall is self
 consistent.  It also means that except for when we update the ucred reference,
 we don't need locks to access thread credentials since the thread references
 are to read-only credentials.  We discussed this on -arch _months_ ago and
 everyone agreed with it then.

As long as the pointer updates are atomic, you don't need to
hold a lock to reference it anyway.  You don't care over the
update, since it's a drop of priviledge.


  Perhaps this dicsussion is enough impetus to justify
  revisiting the atomic_t type definitions, which would
  be useful as reference counted hold/release mechanisms
  that would obviate the need for locks here?  This would
  at least let you defer getting rid of the per thread
  cred instances until later.
 
 All having the refcount_t and other refcount_* functions would do is let us get
 rid of the per-ucred mutex (actually a pool mutex, so the overhead per ucred is
 just a pointer right now).  It wouln't change the fact that we need a lock to
 make sure p_ucred is up to date before we read a value we need to depend on or
 actually use.

Not true.  You can take a reference to it, and the reference
is guaranteed to not change out from under you, so long as
it is counted.  Use of a stack variable to store the value
over the count increment allows a compare after the increment,
after which a retry is possible (if there was a race during
the reference taking).  Thus there is no need for a lock, e.g.:

cred *
addreftocred(cred **cpp)
{
cred *rcp;  /* stack variable */

again:
rcp = *cp;
atomic_plusplus( rcp-count);   /* magic atomic crap */
if( rcp != *cp) {
atomic_mimusminus( rcp-count);
goto again;
}

return( rcp);
}

This depends on the rarity of credential changes; in the
degenerate case, where there is a credential change between
each normal system call that is not a credential change, the
overhead is immense.  But people who write code like that
can be safely punished without fear.  8-).

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Julian Elischer wrote:
 
 
 On Mon, 11 Feb 2002, John Baldwin wrote:
 
 
 On 12-Feb-02 Julian Elischer wrote:
   The proclock is needed to get the reference,
  guarding against other threads, and giant is needed fo rnot to free it
  because if it reaches a refcount of 0 it needs to call free(). (which john
  assures me needs Giant at this time).
  We could avoid the proclock with judicious use of an atomic refcount
  incrementing method.
 
 _No_!  The proc lock is protecting p_ucred, it can't go away!  What _can_ go
 away is the per-ucred mutex to protect the refcount if we ever revive the
 refcount API.
 
 hmm ok Alfred, here's the way I see it..

This is jhb, not alfred.

 You are never permitted to CHANGE a cred.
 You ALWAYS allocate another and switch them.
 When you switch them you decrement the refcount of the old one.
 If someone else takes a reference on it at the same moment that you 
 drop it, then the order is important down to the bus-cycle
 as to whether it gets freed or not. We already know that dereferencing it
 from the proc structure is not important, because a stale ucred pointer
 is only preventable from the userland. 

NO, no!  You only know it is stale in that case cause you are comparing a
non-stale known-good pointer to p_ucred.  If p_ucred is stale, then it is equal
to td_ucred which still points to a valid ucred so it is ok in that particular
case.  If p_ucred is stale but points to a new ucred, getting the proc lock
around crhold() ensures that you will gain a reference to the really correct
value that is guaranteed to point to a correct value.   Don't think in terms of
bus cycles.  On non-i386 architectures, a write can sit in a store buffer
waiting until it is pushed out by a memory barrier or for the CPU to get around
to posting it.  You can't assume that a write will be visible to other CPU's
soon.

 All that is important  is that the pointer is a REAL pointer to a cred
 and that it is not allowed to go to 0 references in its way to 1
 reference.

But if it's a stale pointer, it's not a pointer to a ucred.  That memory could
have been free'd and now be a mbuf header.  Now you go try to increment the
refcount of a mbuf header.  Bad juju here. :)  Trust me, my jhb_proc kernels
were locking up cause I missed one instance of setting p_args to NULL during
exec so taht during wait we would trash random data memory when we tried to
indirect the stale p_args pointer to dec the refcount.

The problem is that you actually have 2 locks involved for 2 different things
here during your crhold()'s:

 - proc lock which protects p_ucred in two ways:
   - 1) it ensures that the ucred has at least one reference that doesn't go
away so it's ok for us to deref the poitner and increment the count
   - 2) it ensures that the value of the pointer we read is up to date
 - ucred mutex which protects the ucred refcount

Only the second mutex can be replaced by pure atomic operations in theory.  The
first one cannot.  In your comparison test, you aren't dereferencing p_ucred,
so if it is stale, that is ok due to other issues.  However, when you do
crhold(p-p_ucred), you dereference the pointer, so it better darn well be
pointing at a ucred and not a mbuf.

 An atomic reference count increment that checks
 against 0 would be part of it but might not be enough.
 I think we also need to have a lock on something because
 it might get freed and used for something else that happens to place a 1
 in that location inbetween the time that p-p_ucred is read
 and the refcount is decremented.

That is what the proc lock is doing. :-P
 
 As an asside:
 I think that in NT they may have refcounts in the same location in all
 structures as I think they derived all their structures from a bas class
 that has ref counts. In that case it WOULD have a 1 in that location if
 it were re-used. (It's been a long time since I saw NT code so I may be
 wrong)

That would involve great evil.  Let's not do that.  How do you know the
refcount hasn't been adjusted on some other structure anyways that you have a
stale pointer to?  Maybe that current structure is just now being freed?

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Terry Lambert wrote:
 Julian Elischer wrote:
  The multiple threads argument is bogus, since the calls
  to [gs]et[ug]id() are on a per process, not a per thread
  basis.
 
 there is no such thing as a per process syscall.
 Two threads can always do the same syscall at the same time.
 there needs to be a proc-lock to stop it from becoming
 chaotic in there. In actual fact, since you cannot alter a cred
 but only replace that which the process points to it's not
 quite that bad, but you need to either lock it or have atomic
 reference-counting that can handle the possibility that
 the cred could have bee decremented to 0 by another thread just before
 you checked it.
 
 I would argue that:
 
 1)There are a small number of system calls where this
   is true.
 
 2)It's worth doing the synchornization in-kernel for
   the process alone, where this is the case.
 
 The problem I have with the crd locking is that each process
 does not do a gratuitous clone of the active cred before doing
 its own thing (if you will remember, I suggested this in the
 context of the cred reference count overflow bug, back when I
 found the problem last year).

... which has nothing to do with the problem at hand as far as I can tell. 
Well, if we did, say, embed a cred in each process rather than sharing them,
then we wouldn't need to protect p_ucred with a lock, but now we would need to
protect all the contents of the cred with a lock, or go with the IPI scheme
(yuck, IPI's are very uncheap).

 The upshot of this is that the lock will stall anyone using
 that cred.  This argues that creds should be, minimally,
 per process, if not per CPU, instead of shared references
 smeared all over the place, and locked on each reference,
 even though it's not possible for a cred to be changed at
 all out from under you -- only replaced wholesale, since
 once instanced, a cred may not be changed.

What in the world are you talking about Terry.  Have you read the code?
The lock _only_ protects the reference count, nothing else!  Once I commit the
code to make sure that we use p_ucred properly when updating credentials then I
can commit the code to chagne all of suser(), p_can*() to use the per-thread
ucred.  Since teh per-thread ucred is immutable, and since td_ucred is only
ever touchced by curthread, thsi involves NO LOCKS for ANY calls to suser() or
p_can*() EXCEPT in the few syscalls that update credentials.

We went over this _months_ ago.  I direct you to the archives of -arch please.

 Personally, I think that cred changes are rare enough,
 even in the degenerate case, that it's worth taking the
 synchronization event as an intraprocess global IPI,
 rather than using a lock.

Egads.  That still doesn't fix the problem of some syscall changing its creds
to get weird behavior halfway through a syscall.  You still need locks if two
threads call setuid() at the same time.   Sure it's a program bug but we can't
have the kernel panic over it.

 creads can only be changed per process but the threads only pick
 up the change on next syscall startup.
 
 I think this is the fundamental misunderstanding: creds
 never change.  The can only be replaced, and then only
 with a cred of equal or lesser priviledge.

Well, Robert wasn't very comfortable with that change, plus it either requires
a horribly expensive and ugly IPI, or it requires lots of lock acquires to read
p_ucred that we wouldn't need otherwise.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: ucred holding patch, BDE version

2002-02-11 Thread John Baldwin


On 12-Feb-02 Terry Lambert wrote:
 Then the combination of calls fails.  There is no case where
 it will succeed when previously it would not.  I would argue
 that the failure is a lost race condition that exists in the
 code anyway, and that this approach merely makes the race
 failure more reliably reproducible.  8-).
 
 I think you could fix this case, in any event, by saving a
 reference to the credential at the start (or, better, as a
 result of the sleep, when you know it's going to happen),
 any time you are going to split a system call across multiple
 potential blocking operations.  That certainly is *NOT* as
 common to all system calls as this discussion implies.
 
 
 The idea of per-thread ucred references is so that a thread will have the
 same
 credentials for the lifetime of a syscall so that the entire syscall is self
 consistent.  It also means that except for when we update the ucred
 reference,
 we don't need locks to access thread credentials since the thread references
 are to read-only credentials.  We discussed this on -arch _months_ ago and
 everyone agreed with it then.
 
 As long as the pointer updates are atomic, you don't need to
 hold a lock to reference it anyway.  You don't care over the
 update, since it's a drop of priviledge.

Ok, I'll use really small words and diagrams:

thread's 1 and 2 belong to process p
p-p_ucred starts pointing at memory X
CPU 1 is executing thread 1
CPU 2 is executing thread 2

CPU 3 is executing some other random code in the kernel
At start both thread 1 and thread 2 have td_ucred == X

CPU 1   CPU 2   CPU 3
seteuid(), p_ucred  in userland
now points to Y
...
seteuid(), p_ucred
now points to Z
...
some syscall, Y's
only remaining
ref is free'd
... allocates some
other kernel
structure at Y
executes any syscall
because on non-i386
arch's, the value of
p_ucred may still be
sitting in the store
buffer and we aren't
using a lock, we may
get any of the values
X, Y, or Z for p_ucred
when we test to see
if it needs updating.
If we get X, we keep
using the old reference
(this is the acceptable
race from before).  If
we get Z, then we gain
a reference to the new
ucred.  If we get Y,
then we increment the
refcount of the ucred
formerly at Y, in fact
corrupting the memory
allocated by CPU 3!

*BOOM*

Is that simple enough?

  Perhaps this dicsussion is enough impetus to justify
  revisiting the atomic_t type definitions, which would
  be useful as reference counted hold/release mechanisms
  that would obviate the need for locks here?  This would
  at least let you defer getting rid of the per thread
  cred instances until later.
 
 All having the refcount_t and other refcount_* functions would do is let us
 get
 rid of the per-ucred mutex (actually a pool mutex, so the overhead per ucred
 is
 just a pointer right now).  It wouln't change the fact that we need a lock
 to
 make sure p_ucred is up to date before we read a value we need to depend on
 or
 actually use.
 
 Not true.  You can take a reference to it, and the reference
 is guaranteed to not change out from under you, so long as
 it is counted.  Use of a stack variable to store the value
 over the count increment allows a compare after the increment,
 after which a retry is possible (if there was a race during
 the reference taking).  Thus there is no need for a lock, e.g.:
 
   cred *
   addreftocred(cred **cpp)
   {
   cred *rcp;  /* stack variable */
 
   again:
   rcp = *cp;
   atomic_plusplus( rcp-count);   /* magic atomic crap */
   if( rcp != *cp) {
   atomic_mimusminus( rcp-count);
   goto again;
   }
 
   return( rcp);
   }
 
 This depends on the rarity of credential changes; in the
 degenerate case, where there is a credential change between
 each normal system call that is not a credential change, the
 overhead is immense.  But people who write code like that
 can be safely punished without fear.  8-).

This code is still broken.  How do you know that the value you are reading from
cp (well, cpp, looks like you have a typo :-P) isn't stale and that your
increment isn't corrupting something?