On Thursday 24 January 2008 17:45, Robert Hailey wrote: > > On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote: > > > On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote: > >> On Wednesday 09 January 2008 17:14, Robert Hailey wrote: > >>> > >>> I have reverted r16886, as it appears to be based on a misunderstand > >>> of how requests work against the topology of the network (r16980). > >> > >> I was rather hoping to be talked into accepting 16886 ... we should > >> at least > >> have some logging in such cases IMHO. > >> > >> Requests really shouldn't be taking that long - maybe it's related > >> to the HTL > >> problem, maybe we have such a perverse network topology that we are > >> resetting > >> HTL time after time after time, I dunno, simulations would be > >> interesting. > > > > I'm thinking that message queue priorities will obsolete this problem, > > as the path for responses to requests will solidify nearly > > immediately. Which is to say, we will then see mostly fetch-timeouts, > > not accepted/fatal-timeouts. > > Well, I do think that this problem *generally* has gone away. A large > part of the timeouts may have been request coalescing deadlocks. In my > logs, I no longer see that "requestsender took to long to respond to > requestor (+2m)", but when I do see that log statement fire, it is huge! > > Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler, RequestSender > for UID 5637402349040790252, ERROR): > requestsender took too long to respond to requestor (16m10s/3) > Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler, RequestSender > for UID 98827504771122964, ERROR): > requestsender took too long to respond to requestor (16m8s/3) > Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler, RequestSender > for UID 774454676209630, ERROR): > requestsender took too long to respond to requestor (16m8s/3) > Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler, RequestSender > for UID 7341907878853950087, ERROR): > requestsender took too long to respond to requestor (34m33s/4) > > Half an hour for one request? Good night!
This is suspicious, they are all roughly the same period except the last. I
suggest you set log level minor and investigate what happened by searching
for the UID.
>
> Relatedly, I have a patch which scans the running uids every 20
> minutes, logging and removing those found to still be running (longer
> than period between checks; 20 minutes). I intended this to make sure
> that no UIDs were leaked from the recent thread optimization (r17190).
> I'm hesitant to commit it as it is a bit of a kludge rather than a fix
> (not leaking the uids), and if requests are taking that long it would
> interfere with loop detection.
>
> Interestingly, of the four uid trackers {CHK/SSK}{request/insert},
> only SSK-inserts have NOT come up in the stale-uid scan logging; so
> this is not confined to requestSender/r16886.
The other question here is not decrementing the HTL on retries if we have
decrementAtMax = false. I've just committed a patch for that.
I'll probably release alpha 2 from 1103...
pgpB22UzI25WZ.pgp
Description: PGP signature
_______________________________________________ Devl mailing list [email protected] http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
