On Thu, 1 Aug 2002, Green, Paul wrote:

> Andrew Bartlett [mailto:[EMAIL PROTECTED]] wrote:
> > Richard Sharpe wrote:
> > >
> > > Hi,
> > >
> > > I looked at this issue, and it looks possible to accumulate
> > > the timeouts that have occurred in receive_message_or_smb and
> > > count those up.
> > >
> > > Given that the resolution of the dead time parameter is in
> > > minutes, this would seem not to get too far out of whack.
> >
> > Sounds like a fair optimization to me - assuming that's what
> > it is for.
>
> Pardon my skepticism, but is there any evidence that calling gettimeofday
> from this location in the code is actually contributing in any material way
> to the performance of Samba? Any measurements? If there isn't, then you
> are just optimizing by guess and by golly, and could be (almost
> certainly will be) introducing a maintenance headache, or an unwitting
> platform dependency, by trying to second-guess them. You could also make
> operating Samba from within a debugger (or during profiling) rather touchy,
> since your "accumulating" probably won't work in the face of breakpoints.
> If this code were mine, I'd insist that you prove to me that this change
> would result in a gain of at least 3% in performance. I seriously doubt
> whether you could reach this bar. Why? Well, in all probability the TCP
> stack makes multiple time calls all on its own for every packet, and the cost
> of sending the data probably far outweighs the cost of reading the clock, so
> I think this time call is lost in the noise.
OK, I think you are wise to be skeptical :-) I would only point out that while
I have not profiled Samba in that area, and it would not be hard to do
(perhaps a project for today, as I will be adding trace points for our
internal tracing tool), there are two salient points to be added:

1. The time calls in the stack are in the kernel and, while I haven't
   checked, on FreeBSD possibly just read the internal processor cycle
   count and scale it, which is a much cheaper operation.

2. We recently converted a lot of tracing code to use cycle counts rather
   than calling gettimeofday, because, guess what, all those system calls
   were having a big impact.

3. (Nobody needs a Spanish Inquisition.) When I was profiling the sendfile
   patches recently, the stat call in userland (instead of being in
   sendfile) made a measurable difference, despite the fact that the
   vnode/inode should already have been cached! I would claim that
   gettimeofday would introduce a cost comparable to stat'ing a file that
   was recently opened and thus had its vnode/inode cached.

However, I agree with your comments below on how to go about optimizations.

> <soapbox>
>
> (1) Operating system engineers know that their time routines are going to
> get heavily called and generally try to optimize them. We certainly try on
> our OS. (2) This routine has fairly high resolution. If you don't need
> this level of resolution, you might use time(), which is probably cheaper,
> and certainly no more expensive (and POSIX-compliant, whereas gettimeofday()
> is not in POSIX-96). (3) Optimize based on measurements, not on reading code.
>
> </soapbox>
>
> Let me just add that I've wasted all too many working days in my career
> trying to optimize code by inspection. When I actually take the time to run
> a benchmark and then optimize the hot spots, I get much better results in
> much less (human) time.
>
> Thanks.
> PG
> --
> Paul Green, Senior Technical Consultant, Stratus Computer, Inc.
> Voice: +1 978-461-7557; FAX: +1 978-461-3610; Video on request.

--
Regards
-----
Richard Sharpe, [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
