On Wed, 02 Mar 2005 11:11:36 -0500, Jeffrey Hutzelman <[EMAIL PROTECTED]> wrote: > > > On Wednesday, March 02, 2005 10:55:09 AM -0500 Tom Keiser > <[EMAIL PROTECTED]> wrote: > > > I've been profiling the pthreaded fileserver for a while now, and one > > thing continues to bother me. We call the time() syscall so > > frequently that it's often 20% of our total execution time. > > There are plenty of hacks to reduce this overhead, but every one I > > know of will introduce some amount of non-determinism. Is that an > > acceptable risk? > > > > Having a thread in a nanosleep; gettimeofday loop might be acceptable, > > but during periods of high load, you couldn't guarantee that your > > global epoch time variable would be _monotonically_ increasing. I > > suppose a combination of a SIGALRM handler and gettimeofday; setitimer > > might mitigate that problem to some extent. Anyone have any better > > suggestions on how to reduce this overhead? > > > > Certainly there are parts of the code that should continue to call > > time() directly, but how much do we care if we accept a kerberos > > ticket that expired a few milliseconds ago, when acceptable clock > > drift is many orders of magnitude larger? > > I'm not sure where are all the places we are currently actually calling > time(). I can think of four likely situations in which we care about time: > > (1) validating authentication > (2) ubik voting > (3) file timestamps > (4) various timing-sensitive bits in rx > > For (1) and (2) we don't care about sub-second accuracy. This is pretty > much true of any place we currently call time(), which returns an integer > number of seconds. > > For (2), we _do_ care about being accurate within a few seconds. There is > a certain small amount of skew allowed in the voting protocol, but it is > intended to deal mostly with potential network latency, and throwing it > away on clock skew can hurt. >
Yeah, I figured ubik would not be a good candidate for this kind of optimization. > For (2) and (3) we _do_ care that the clock we actually use is > monotonically increasing. As it stands right now, some fairly unhappy > things can happen if the system clock is stepped backwards for some reason. > When you say stepped backwards, do you mean with respect to previous values of a userspace "clock", or with respect to the real system clock? I can certainly guarantee that a userspace clock is non-decreasing, but I cannot guarantee a non-negative difference with respect to the system clock at any given time. > As for (4), there are any number of things in rx that are timing sensitive, > but for the most part they don't work by calling time(). > According to the DTrace pid provider, these are the top ten callers of time() over a few second period on a 1.3.78 fileserver: 11 fileserver`rxi_ReadProc+0x2b0 13 fileserver`TryAttach+0x80 15 fileserver`rxi_SendXmitList+0x2a4 16 fileserver`common_StoreData64+0x83c 21 fileserver`VGetVnode_r+0x130 29 fileserver`VGetVnode_r+0xad8 151 fileserver`rx_ReadvProc+0x40 5966 fileserver`rxi_ListenerProc+0x118 7969 fileserver`rxi_SendAck+0x5c0 23760 fileserver`rxi_FillReadVec+0x10c For rxi_FillReadVec at least, it's not rx, but the rxkad security object's op_CheckPacket member, which DTrace misses since its address is loaded from a function pointer. I have full stack traces for ~38000 calls to time(). If anyone wants to see more than just the immediate caller, I can post them somewhere. -- Tom Keiser [EMAIL PROTECTED] _______________________________________________ OpenAFS-devel mailing list [email protected] https://lists.openafs.org/mailman/listinfo/openafs-devel
