On Jan 24, 2008, at 11:45 AM, Robert Hailey wrote:


On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote:

On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote:
On Wednesday 09 January 2008 17:14, Robert Hailey wrote:

I have reverted r16886, as it appears to be based on a misunderstand
of how requests work against the topology of the network (r16980).

I was rather hoping to be talked into accepting 16886 ... we should
at least
have some logging in such cases IMHO.

Requests really shouldn't be taking that long - maybe it's related
to the HTL
problem, maybe we have such a perverse network topology that we are
resetting
HTL time after time after time, I dunno, simulations would be
interesting.

I'm thinking that message queue priorities will obsolete this problem,
as the path for responses to requests will solidify nearly
immediately. Which is to say, we will then see mostly fetch-timeouts,
not accepted/fatal-timeouts.

Well, I do think that this problem *generally* has gone away. A large part of the timeouts may have been request coalescing deadlocks. In my logs, I no longer see that "requestsender took to long to respond to requestor (+2m)", but when I do see that log statement fire, it is huge!

Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler, RequestSender for UID 5637402349040790252, ERROR):
requestsender took too long to respond to requestor (16m10s/3)
Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler, RequestSender for UID 98827504771122964, ERROR):
requestsender took too long to respond to requestor (16m8s/3)
Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler, RequestSender for UID 774454676209630, ERROR):
requestsender took too long to respond to requestor (16m8s/3)
Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler, RequestSender for UID 7341907878853950087, ERROR):
requestsender took too long to respond to requestor (34m33s/4)

Half an hour for one request? Good night!

This may be an effect of having an incredibly slow peer (although even sendSync() should not block nearly that long... so I can't imagine why), I'll keep watching it.

--
Robert Hailey


Relatedly, I have a patch which scans the running uids every 20 minutes, logging and removing those found to still be running (longer than period between checks; 20 minutes). I intended this to make sure that no UIDs were leaked from the recent thread optimization (r17190). I'm hesitant to commit it as it is a bit of a kludge rather than a fix (not leaking the uids), and if requests are taking that long it would interfere with loop detection.

Interestingly, of the four uid trackers {CHK/SSK}{request/insert}, only SSK-inserts have NOT come up in the stale-uid scan logging; so this is not confined to requestSender/r16886.

--
Robert Hailey




_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Reply via email to