I am a little surprised that you are having trouble adding users even to
large groups, but I don't know what kind of load your servers are seeing.

When I debugged this code while testing large groups, my biggest problem
was deleting them.  There is significant code in there to try to
break deletion operations into a series of sub-transactions each of
which is relatively small.  This is necessary because Ubik limits the
number of pages (1K blocks of the database) which can be modified in a
single transaction.  Even with these limits, however, I found that with
a large enough group and a heavy enough load eventually the OS would
swap out the ptserver long enough so that it would lose quorum.
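
As a rough illustration of the sub-transaction idea, here is a minimal
Python sketch.  It is not the ptserver code (which is C on top of Ubik);
the function names, the `commit` callback, and the per-transaction limit
are all hypothetical, and the limit is counted in entries rather than in
modified database pages to keep the example short.

```python
def delete_group_in_batches(members, max_entries_per_txn):
    """Yield slices of `members`, each small enough for one transaction.

    `max_entries_per_txn` is a hypothetical stand-in for the real limit,
    which is on modified database pages, not on entries.
    """
    if max_entries_per_txn < 1:
        raise ValueError("max_entries_per_txn must be at least 1")
    for start in range(0, len(members), max_entries_per_txn):
        yield members[start:start + max_entries_per_txn]


def delete_group(members, max_entries_per_txn, commit):
    """Remove all members using one small transaction per batch.

    `commit` is a hypothetical callback standing in for "begin a
    transaction, remove these entries, end the transaction".
    Returns the number of transactions used.
    """
    txn_count = 0
    for batch in delete_group_in_batches(members, max_entries_per_txn):
        commit(batch)
        txn_count += 1
    return txn_count
```

Each batch commits on its own, so no single transaction touches more
than a bounded amount of the database, at the cost of the deletion as a
whole no longer being atomic.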

The actual cause of the problem was losing the Rx connection to the
other quorum members.  This arises because the LWP facility used in AFS3
is non-preemptive.  So if some operation is long running (because it is
CPU bound or paged out) the Rx thread in the ptserver process, which
sends keep-alives, may not get to run.  This can cause loss of quorum or
loss of the connection to the client (i.e. the guy running "pts").
Either way it appears as an intermittent failure.
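
The starvation effect can be shown with a toy simulation of cooperative
scheduling.  This is only a sketch in Python using generators and a
virtual clock; it does not use LWP or Rx, and the task names, the time
units, and the KEEPALIVE_TIMEOUT value are assumptions made up for the
example.

```python
KEEPALIVE_TIMEOUT = 30  # hypothetical value, not the real Rx setting


def keepalive_task():
    """Sends one keep-alive each time it is scheduled (cost: 1 time unit)."""
    while True:
        yield ("keepalive", 1)


def worker_task(slice_len):
    """Runs for `slice_len` time units before voluntarily yielding."""
    while True:
        yield ("work", slice_len)


def run_cooperative(tasks, total_time):
    """Round-robin scheduler with no preemption.

    A task keeps the CPU until it yields; the yielded cost is added to a
    virtual clock.  Returns the clock values at which keep-alives ran.
    """
    clock = 0
    keepalive_times = []
    while clock < total_time:
        for task in tasks:
            kind, cost = next(task)
            clock += cost
            if kind == "keepalive":
                keepalive_times.append(clock)
    return keepalive_times


def max_gap(times):
    """Longest interval between consecutive keep-alives."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))


# A worker that yields often leaves short gaps between keep-alives;
# one that holds the CPU for 60 units pushes the gap past the timeout.
short_gaps = max_gap(run_cooperative([keepalive_task(), worker_task(5)], 300))
long_gaps = max_gap(run_cooperative([keepalive_task(), worker_task(60)], 300))
```

In this model the keep-alive task is never at fault: it simply does not
get scheduled until the long-running task gives up the processor.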

All this said, I'm afraid that it is true that a heavily loaded server,
especially one that is doing a lot of paging or even swapping, can run
into trouble in this area.

Let me further amplify on Tony Mauro's comment about the 5000 member
limit.  This is really just a sanity check on the bulk get membership
list call.  It will not allocate memory for a list longer than 5000
elements when marshaling the response to this call.  So this is really
just a limit on the "pts mem" command.  The GetCPS routine will continue
to notice that all the users are members of the group (of course the
GetCPS call probably has the same 5000 element limit too, but that's a
different problem).
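
The distinction between the listing limit and actual membership can be
sketched as follows.  This is an illustration in Python, not the real
RPC code: the function names and the exception are hypothetical, and
raising an error when the list is too long is an assumption about how
the refusal shows up.  Only the 5000 figure comes from the discussion
above.

```python
MAX_LIST_ELEMENTS = 5000  # the sanity limit discussed above


class ListTooLongError(Exception):
    """Hypothetical error for a membership list over the sanity limit."""


def marshal_member_list(member_ids, limit=MAX_LIST_ELEMENTS):
    """Build the reply for a bulk membership listing.

    Refuses to allocate a reply longer than `limit` elements.
    """
    if len(member_ids) > limit:
        raise ListTooLongError(
            "list has %d elements, limit is %d" % (len(member_ids), limit)
        )
    return list(member_ids)


def is_member(group, user_id):
    """Membership test; it never builds the full list, so no limit applies."""
    return user_id in group
```

So a group with more than 5000 members still has all of its members in
this model; it is only the request for the complete list that fails.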

My vague recollection is that I tested this code up to between 5K and
10K before running afoul of the above mentioned problems and giving up
on further refinements.  As you can see from the foregoing, your mileage
may vary.

Ted Anderson
