On Tuesday 08 January 2008 22:57, Robert Hailey wrote:
>
> On Jan 8, 2008, at 3:33 PM, Matthew Toseland wrote:
> > On Saturday 05 January 2008 00:50, Robert Hailey wrote:
> >>
> >> Interestingly (now that I have got the simulator running), this
> >> 'general timeout' appears even in simulations between nodes on the
> >> same machine. Unless I coded something wrong, perhaps there is an
> >> added delay or missing response somewhere which is not obvious?
> >
> > Entirely possible. Fixing it would be better than an arbitrary cutoff
> > when we are still able to potentially find the data, and still have
> > enough HTL to do so.
>
> On Jan 8, 2008, at 2:27 PM, Matthew Toseland wrote:
> > On Friday 04 January 2008 18:32, Robert Hailey wrote:
> >>
> >> Apparently, until revision 16886 (so long as no single node times
> >> out), a node will take as long as necessary to exhaust routable
> >> peers, even long after the original requestor has given up on that
> >> node.
> >
> > Is there any evidence that this happens in practice? Surely the HTL
> > should prevent excessive searching in most cases?
>
> There is, in fact. The timeout itself (which I have been running on my
> node for a while) is evidence of the behaviour (which to me seems
> incorrect).
>
> Jan 08, 2008 20:03:41:146 (freenet.node.RequestSender, RequestSender
> for UID -3998139406700477577, ERROR): discontinuing non-local request
> search, general timeout (6 attempts, 3 overloads)
> ...
> Jan 08, 2008 20:12:21:226 (freenet.node.RequestSender, RequestSender
> for UID 60170596711015291, ERROR): discontinuing non-local request
> search, general timeout (1 attempts, 3 overloads)
Ouch. How common is this?
>
> You see, in the first log statement the node tried six peers before
> running out of time. In the second case (which occurs quite
> frequently), the node spent the entire 2 minutes (FETCH_TIMEOUT)
> waiting on a response from one node; if it were allowed to continue to
> the next node it could (65% of the time) spend another 2 minutes on
> just that node.
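To make sure we are talking about the same loop: roughly, the behaviour
looks like this (a sketch only, not the actual freenet.node.RequestSender;
the constant names and helper method are made up):

// Hypothetical sketch of the retry loop under discussion; not real
// Freenet code. FETCH_TIMEOUT_MS stands in for the per-peer wait,
// OVERALL_DEADLINE_MS for the new "general timeout" cutoff.
import java.util.List;

class RetryLoopSketch {
    static final long FETCH_TIMEOUT_MS = 120000;    // 2 minutes per peer
    static final long OVERALL_DEADLINE_MS = 240000; // assumed overall cutoff

    /** Returns true if some peer answered before the overall deadline. */
    static boolean route(List<String> routablePeers) {
        long start = System.currentTimeMillis();
        int attempts = 0;
        for (String peer : routablePeers) {
            if (System.currentTimeMillis() - start >= OVERALL_DEADLINE_MS) {
                // Without this check the loop keeps waiting up to
                // FETCH_TIMEOUT_MS on *each* remaining peer, long after
                // the original requestor has given up.
                System.err.println("general timeout (" + attempts + " attempts)");
                return false;
            }
            attempts++;
            if (waitForResponse(peer, FETCH_TIMEOUT_MS)) return true;
        }
        return false; // exhausted routable peers
    }

    // Stand-in for the real request/response exchange.
    static boolean waitForResponse(String peer, long timeoutMs) {
        return false; // in this sketch, every peer times out
    }
}

Prior to 16886 the elapsed-time check simply isn't there, which is the
cumulative-wait problem you describe.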
>
> >> To the best of my knowledge, none of the upstream nodes will respond
> >> with the LOOP rejection before then. And even well before the worst
> >> case, this effect can accrue across many nodes in the path.
> >
> > If the same request is routed to a node which is already running it,
> > it rejects it with RejectedLoop. If it's routed to a node which has
> > recently run it, it again rejects it. If it is a different request
> > for the same key, it may be coalesced.
>
> If you mean that the RECENTLY_FAILED mechanism would keep this in
> check... I see this idea in many places, but I cannot see where it is
> actually implemented. The only place I see that creates an
> FNPRecentlyFailed message is in RequestHandler (upon its RequestSender
> having received one).
It's part of the unfinished ULPRs system. It will be implemented after I have
opennet fully sorted out.
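To be concrete about the acceptance logic I described above, it amounts
to something like this (a rough sketch; the table names and coalescing
hook are mine, not Freenet's actual classes):

// Hedged sketch of per-node request acceptance: reject loops by UID,
// coalesce distinct requests for the same key. Not actual Freenet code.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AcceptSketch {
    enum Decision { ACCEPT, REJECTED_LOOP, COALESCED }

    final Set<Long> running = new HashSet<Long>();           // in flight
    final Set<Long> recentlyCompleted = new HashSet<Long>(); // last 10000 UIDs
    final Map<String, Long> byKey = new HashMap<String, Long>(); // key -> UID

    Decision accept(long uid, String key) {
        if (running.contains(uid) || recentlyCompleted.contains(uid))
            return Decision.REJECTED_LOOP; // the same request, seen again
        Long existing = byKey.get(key);
        if (existing != null)
            return Decision.COALESCED;     // different request, same key:
                                           // attach to the one already running
        running.add(uid);
        byKey.put(key, uid);
        return Decision.ACCEPT;
    }
}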
>
> Presently the node will RejectLoop only if the request is one of the
> last 10,000 completed requests. My node runs through that many
> requests in about 16 minutes. The logging above already shows that a
> request can last longer than 2 minutes on one peer (and most nodes
> have 20 peers). If you assume that a request takes 4 minutes per node
> (two peers; VERY optimistic), then it would take only 4 nodes ('near'
> each other) to generate a request live-lock (absent the HTL, the
> request would never drop from the network): each tries two of its
> other peers, and then the next node in the 4-chain.
Okay, this is a problem.
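Spelling out your arithmetic (all the constants come straight from your
paragraph; nothing here is measured by me):

// Back-of-the-envelope check of the live-lock window described above.
class LiveLockMath {
    public static void main(String[] args) {
        double tableSize = 10000;             // completed-request UIDs kept
        double requestsPerMin = 10000 / 16.0; // your observed turnover
        double retentionMin = tableSize / requestsPerMin; // ~16 minutes

        double perNodeMin = 4;  // two peers * 2 min each (optimistic)
        int nodesInCycle = 4;
        double cycleMin = nodesInCycle * perNodeMin;      // 16 minutes

        // If the request returns to the first node only after its UID has
        // been evicted from the last-10000 table, RejectedLoop never fires
        // and (ignoring HTL) the request circulates indefinitely.
        System.out.printf("retention %.0f min, cycle %.0f min -> %s%n",
                retentionMin, cycleMin,
                cycleMin >= retentionMin ? "live-lock possible"
                                         : "caught by the loop check");
    }
}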
>
> I do not think that this timeout I added is arbitrary. As I understand
> Ian's original networking theory, a request is not valid after the
> originator has timed out. In much the same way that a single node
> fatally timing out collapses the request chain, so too should a node
> 'taking too long' collapse it (as that node *IS* the one fatally
> timing out the chain).
Well, we can't easily inform downstream nodes of a request timing out. And we
can't include the updated timeout on each hop either (for security reasons:
a deadline that shrinks hop by hop would tell each node how long the request
has been running, and therefore roughly how far it is from the originator).
>
> But on the other hand, I do understand your point about the HTL, and
> that it would keep the request from continuing indefinitely; it just
> seems like it could be quite a waste of network resources. Certainly
> beyond the point in time where the requester has fatally timed out, no
> response should be sent back to the source (such responses could
> account for many of the unclaimed FIFO packets); or perhaps one should
> be sent only if the data is finally found (you mentioned ULPRs).
Maybe there is another reason for this behaviour.
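For what it's worth, the suppression you are suggesting would look
roughly like this (purely a sketch of the proposal; SOURCE_TIMEOUT_MS
and the method are hypothetical, nothing like this exists in the code):

// Sketch: once the downstream requester must have timed out, drop
// non-data responses rather than send them back (they would only sit
// in the unclaimed FIFO); still forward found data, which is the
// ULPR-ish case. All names here are hypothetical.
class LateResponseSketch {
    static final long SOURCE_TIMEOUT_MS = 120000; // assumed requester timeout

    static boolean shouldForward(long acceptedAtMs, boolean isDataFound) {
        long age = System.currentTimeMillis() - acceptedAtMs;
        if (age < SOURCE_TIMEOUT_MS) return true; // requester still waiting
        return isDataFound;                       // late, but data: worth sending
    }
}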
