Ian, please read this one. I think he makes sense. If so, we can remove the global estimator from the equation; with Martin's formula we shouldn't need it. This could also explain why we are seeing such high numbers of requests on some nodes: routing thinks they will react fastest, and they do, by rejecting...
Ed

On October 02, 2003 03:07 pm, Martin Stone Davis wrote:
> Ian Clarke wrote:
> > Jonathan Howard wrote:
> >> That shows what I say first and is dependent on (for working) as
> >> desired tSuccess < tFailure.
> >>
> >> (pSuccess + pFailure) = 1
> >> tSuccess ~= tFailure ->
> >
> > Actually no, tFailure should typically be much larger than tSuccess,
> > as it includes the estimated time to rerequest the data from somewhere
> > else (along with the time for the failure to occur).
> >
> >> I can't resist :-)
> >> Who is your money on, the tortoise or the hare?
> >
> > My money is on whichever is probabilistically going to result in some
> > data being returned soonest (which includes the time required for the
> > dog to fetch it if the hare comes back empty-handed).
>
> Mine too. We want to maximize the amount of data returned within a
> fixed amount of time. Unfortunately, StandardNodeEstimator.estimate()
> does not recognize the advantage of a consistently good but slow result
> over a consistently bad but fast result. Below I derive how to decide
> which "dog" to place your bets on. That is, how to calculate the
> expected number of successes per unit of time.
>
> Let's say that tSuccess is the number of time units it takes to return
> 1 data unit if successful. Imagine some long stretch of time t where we
> are repeatedly making requests of the node. That time would be filled
> by a bunch of intervals. Say nSuccess of them have length tSuccess, and
> nFailure have length tFailure (of course, it's more random than that,
> but I'm trying to keep it simple). Then,
>
> t = nSuccess*tSuccess + nFailure*tFailure
>
> We would expect the ratio of the number of success to failure intervals
> to be pSuccess/pFailure, so we can write
>
> nSuccess/nFailure = pSuccess/pFailure
>
> We want to get nSuccess/t, the number of successes per time unit.
> Combining the two equations above, we can derive:
>
> nSuccess/t = 1/(tSuccess + tFailure*pFailure/pSuccess)
>
> Let's take Jonathan's example:
> > Node 1 - pSuccess 0.1, tSuccess 1000, pFailure 0.9, tFailure 2000
> > estimate = 1900
> >
> > Node 2 - pSuccess 0.9, tSuccess 5000, pFailure 0.1, tFailure 10000
> > estimate = 5500
>
> For node 1, nSuccess/t = 1/(1000 + 0.9*2000/0.1) = 1/19000
> For node 2, nSuccess/t = 1/(5000 + 0.1*10000/0.9) = 1/6111
>
> So node 2 really is better than node 1, since it returns more data per
> unit of time.
>
> We might change estimate() to calculate nSuccess/t instead of
> pSuccess*tSuccess + pFailure*tFailure, but that would mean opting for
> nodes with higher estimates rather than nodes with lower estimates. So,
> to make the programming easier, calculate the reciprocal:
>
> t/nSuccess = tSuccess + tFailure*pFailure/pSuccess
>            = (pSuccess*tSuccess + pFailure*tFailure)/pSuccess
>            = current_estimate() / pSuccess
>
> Aha! The current calculation of estimate only needs a little
> modification: divide it by pSuccess.
>
> -Martin
>
> P.S. Jonathan, you were half right when you said to change
> > 174: estimate += pSuccess * tSuccess;
> to
> > estimate += tSuccess;
> since you also have to change estimate += pFailure*tFailure to
> estimate += pFailure*tFailure/pSuccess.
>
> _______________________________________________
> Devl mailing list
> [EMAIL PROTECTED]
> http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
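For what it's worth, Martin's arithmetic checks out. Here is a minimal sketch of the two estimates side by side on Jonathan's example numbers; the names are illustrative, not the actual StandardNodeEstimator API:

```java
// Sketch comparing the current estimate (expected time per request) with
// Martin's proposal (expected time per *successful* request).
// t/nSuccess = (pSuccess*tSuccess + pFailure*tFailure)/pSuccess
// Names are hypothetical; the real code lives in StandardNodeEstimator.
public class EstimateSketch {

    // Current estimator: plain expected time per request.
    static double currentEstimate(double pSuccess, double tSuccess,
                                  double pFailure, double tFailure) {
        return pSuccess * tSuccess + pFailure * tFailure;
    }

    // Martin's proposal: divide by pSuccess to get time per success.
    static double timePerSuccess(double pSuccess, double tSuccess,
                                 double pFailure, double tFailure) {
        return currentEstimate(pSuccess, tSuccess, pFailure, tFailure) / pSuccess;
    }

    public static void main(String[] args) {
        // Jonathan's example from the thread:
        // Node 1: rarely succeeds but responds fast; raw estimate 1900.
        double n1 = timePerSuccess(0.1, 1000, 0.9, 2000);  // 1900/0.1 = 19000
        // Node 2: usually succeeds but is slow; raw estimate 5500.
        double n2 = timePerSuccess(0.9, 5000, 0.1, 10000); // 5500/0.9 ~ 6111
        System.out.println("node1 t/nSuccess = " + n1);
        System.out.println("node2 t/nSuccess = " + n2);
        // Lower is better: node 2 wins despite its larger raw estimate,
        // matching the 1/19000 vs 1/6111 successes-per-time-unit figures.
    }
}
```

Note that the ranking flips: the current estimator prefers node 1 (1900 < 5500), while the per-success estimate prefers node 2 (6111 < 19000), which is exactly the behavior we want if routing should favor nodes that actually return data.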
