On Sat, Feb 22, 2003 at 11:17:05PM -0500, Gianni Johansson wrote:
> Matthew Toseland wrote:
>
> > The diagnostic var "routingTime" is a measure of how long it takes to
> > route a request. It was assumed not to include any network activity -
> > for example, we chop out the time taken by sending the Accepted
> > message. It turns out that routingTime can include a DNS lookup. Since
> > routingTime is used as an indicator of system load - if routingTime
> > goes above a certain value, the node assumes it is overloaded - this
> > is a bit of a problem. Solutions:
> >
> > A. Use tickerDelay in the load calculation, instead of routingTime, or
> > B. Make routingTime not include the DNS lookup. This would involve
> >    layer violations.
> >
> > Any immediate objections to option A?
>
> No. Fine with me.
>
> offtopic:
> Have you actually instrumented the code to determine that the DNS
> lookup is responsible for the delay?

Yup. In many cases anyway.
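For the archives, the instrumentation amounts to wrapping the blocking lookup in a millisecond timer. This is an illustrative sketch only - the class and method names are invented for this mail, not Freenet's actual code:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch: time a (potentially blocking) DNS lookup with millisecond
// precision, the same way one would instrument routingTime's components.
public class DnsTimer {
    // Resolves host and returns the elapsed wall-clock time in
    // milliseconds, or -1 if the lookup failed.
    static long timeLookup(String host) {
        long start = System.currentTimeMillis();
        try {
            InetAddress addr = InetAddress.getByName(host); // may block on DNS
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(host + " -> " + addr.getHostAddress()
                    + " in " + elapsed + "ms");
            return elapsed;
        } catch (UnknownHostException e) {
            return -1; // lookup failed
        }
    }

    public static void main(String[] args) {
        timeLookup("localhost"); // resolves locally, should be fast
    }
}
```

Against a cold remote name, the same wrapper is what shows the multi-second delays mentioned below.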
> I poked around a while ago trying to understand the choppiness of the
> routingTime values and it looked to me like routingTime peaks were
> correlated with the execution of the routing table persistence
> checkpoint.

Hmm. Interesting. If this is really a problem (a start would be to log the
time taken to flush the DataObjectRoutingStore... my coarse logging tells
me that on my larger node, which has 1,299 keys in its routing table, it
always finishes in the same second or the next second, but I would want to
time it with millisecond precision), we may be able to do some sort of
semi-shallow copy like we do in NativeFSDirectory (I have a 10GB store
with index enabled; it only holds the lock for 60ms to copy the hash
table).

> Maybe there's lock contention with the file system lock somewhere...

The routing table persistence checkpoint does not talk to the filesystem
at all (currently).

> > One interesting angle: I have seen 4000 millisecond delays apparently
> > caused by this. We currently do not cache DNS at all, so we will have
> > this delay fairly frequently, although hopefully the resolution will
> > be cached by the local nameserver... I am inclined to cache DNS
> > lookups for at least 5 minutes now that we have manual control over it
> > (we may have some trouble making this a config option because of
> > static init order issues). The more radical solution would be what I
> > suggested in another mail - to have the origin node do its own DNS, if
> > necessary, and convert it to raw IP addresses (and ARKs) for Freenet.
> > If supporting multi-homed nodes is a priority, we can code direct
> > support for it. Anyway, since the ARK lookup code doesn't kick in for
> > 15 minutes, it seems reasonable to set the DNS cache to 5 minutes...
> > I'm not sure how much difference it will make, though. Something to
> > bear in mind for the future, certainly - DNS lookups block a thread.
> > For 0.5.1, we should do either A or B, and I'm strongly leaning
> > towards A.
> > Everything except for A, B and possibly tweaking the DNS caching
> > parameters should not be done before 0.5.2 anyway.
> >
> > --
> > Matthew Toseland
> > [EMAIL PROTECTED]/[EMAIL PROTECTED]
> > Full time freenet hacker.
> > http://freenetproject.org/
> > Freenet Distribution Node (temporary) at
> > http://80-192-4-23.cable.ubr09.na.blueyonder.co.uk:8889/Sv~JmJR8CUk/
> > ICTHUS.

-- 
Matthew Toseland
[EMAIL PROTECTED]/[EMAIL PROTECTED]
Full time freenet hacker.
http://freenetproject.org/
Freenet Distribution Node (temporary) at
http://80-192-4-23.cable.ubr09.na.blueyonder.co.uk:8889/GptQvHy-Ap8/
ICTHUS.
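P.S. For the archives: the 5-minute cache discussed above could be as simple as a TTL map in front of InetAddress.getByName(). A minimal sketch, with names invented for this mail rather than taken from Freenet's actual code:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

// Sketch: cache DNS results for 5 minutes so the blocking
// InetAddress.getByName call happens at most once per TTL window per host.
public class DnsCache {
    private static final long TTL_MS = 5 * 60 * 1000; // 5 minutes

    private static class Entry {
        final InetAddress address;
        final long expires; // wall-clock expiry time in ms
        Entry(InetAddress address, long expires) {
            this.address = address;
            this.expires = expires;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    // Resolve hostname, consulting the cache first. Only a cache miss (or
    // an expired entry) pays the cost of a real, possibly blocking lookup.
    public synchronized InetAddress resolve(String host)
            throws UnknownHostException {
        long now = System.currentTimeMillis();
        Entry e = cache.get(host);
        if (e != null && e.expires > now) {
            return e.address; // fresh cached result, no network activity
        }
        InetAddress addr = InetAddress.getByName(host); // may block on DNS
        cache.put(host, new Entry(addr, now + TTL_MS));
        return addr;
    }

    public static void main(String[] args) throws UnknownHostException {
        DnsCache cache = new DnsCache();
        System.out.println("localhost -> "
                + cache.resolve("localhost").getHostAddress());
    }
}
```

Note this deliberately caches per-host rather than per-lookup-site, which sidesteps the static init order issue for everything except choosing the TTL itself.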
