[freenet-dev] Re: Why we don't use pSearchFailed any more

Martin Stone Davis Fri, 05 Dec 2003 00:39:33 -0800

Toad wrote:

On Thu, Dec 04, 2003 at 06:20:35PM -0800, Martin Stone Davis wrote:
Toad wrote:
m0davis suggested that we should use pSearchFailed (or more detailed
stats), despite backoff. The reason we don't do this is that if we did,
a node in the RT would get a few queries and then its pSearchFailed
would increase, and subsequently it would not be routed to at all for a
very long time, because of its very high estimate. This is the reason
that the running averages are not appropriate for QRs and timeouts -
because they don't recover. If we did use them, we would have to also
have a mechanism to ensure that the nodes were contacted after they
ceased to be backed off, DESPITE their bad estimates. This was a
significant part of the justification behind instating backoff in the
first place.
Let me try to put my suggestion for reinstating pSearchFailed in context. Implicit in my derivation of estimate() was that all of the estimators used in the formula were as accurate as reasonably possible. The worse the estimators, the worse routing will be. Therefore, it is out of the question to be unrealistically optimistic about new nodes. For a completely new node, we should either use the global estimators, or ask other nodes for estimator data on it.
I don't see how what you are talking about is relevant to what I was
talking about :)
I oppose using a running average for pSearchFailed specifically because
it would not be accurate - most nodes have longish periods of rejecting
most queries, then recover. But our pSearchFailed would go really high,
and then we wouldn't route to the node.
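
To make the objection concrete, here is a toy sketch of a running average that never recovers. Everything in it is an assumption for illustration: the exponential update, the starting estimate of 0.1, and the fixed routing threshold (real routing compares estimates between nodes rather than using a cutoff).

```python
def simulate(ticks=100, alpha=0.2, threshold=0.5, reject_until=10):
    """Toy model: a node rejects every query until tick `reject_until`,
    then accepts everything. We only route to it while its estimate is
    below `threshold` (a hypothetical stand-in for "bad estimate")."""
    p_failed = 0.1  # assumed starting estimate
    sent = 0
    for tick in range(ticks):
        if p_failed >= threshold:
            # Estimate too high: no query is sent, so no new sample
            # arrives and the estimate can never come back down.
            continue
        sent += 1
        sample = 1.0 if tick < reject_until else 0.0
        p_failed = (1 - alpha) * p_failed + alpha * sample
    return sent, p_failed


sent, p_failed = simulate()
print(sent, p_failed)
```

With these made-up numbers the node gets a handful of queries, crosses the threshold while it is still rejecting, and is never tried again even though it would have recovered.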


Okay, good point.  I get it.  Skip to the bottom of this message for a
pretty-good (I hope) quick-fix solution.  The rest of this message is
my thinking along the way to a simple solution.

The most obvious thing to do is to adjust pQueryReject upwards immediately following a QR. After all, if we get a QR, we humans *believe* that the chance of another QR will be much higher immediately following it than if we query much later. Therefore, we should try to make the computer as smart as us in that regard.

What's the best way to model pQueryReject as a function of time after a QR? That is, pQueryReject(t). Certainly, pQueryReject(0)=1, since nodes QR for intervals and not instants of time. pQueryReject(infinity)=pQueryReject0 (some constant). The easiest thing to do is to let

pQueryReject(t)=1             if t<t0
               =pQueryReject0 if t>=t0

pQueryReject0 would come from a RoutingTimeEstimator.
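
A minimal sketch of that step model, assuming t is the time since the last QR and that t0 and pQueryReject0 are simply passed in (the function and parameter names are mine, not anything in the codebase):

```python
def p_query_reject_step(t, t0, p_query_reject0):
    """Step model: certain rejection until t0 after a QR, then the
    baseline estimate. `t` and `t0` are in the same time unit."""
    if t < t0:
        return 1.0
    return p_query_reject0


print(p_query_reject_step(2.0, 10.0, 0.2))
print(p_query_reject_step(10.0, 10.0, 0.2))
```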

The first trouble is, how do we come up with t0? The second trouble is, how is this any different from some form of backoff? In fact, t0 is a backoff amount in this case. So, here's a wild swing at a solution: Use Ian's backoff time as t0, but instead of backing off and then suddenly coming back, we back off and gradually come back. That is,

pQueryReject(t)=1-(1-pQueryReject0)*t/t0 if t<t0
               =pQueryReject0            if t>=t0
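
The gradual version can be sketched the same way. Again the names are hypothetical, and t0 is assumed to be positive and supplied by whatever computes the backoff time:

```python
def p_query_reject_ramp(t, t0, p_query_reject0):
    """Linear ramp: 1.0 right after a QR, falling to the baseline
    estimate at t0 and staying there. Assumes t >= 0 and t0 > 0."""
    if t >= t0:
        return p_query_reject0
    return 1.0 - (1.0 - p_query_reject0) * t / t0


for t in (0.0, 5.0, 10.0, 20.0):
    print(t, p_query_reject_ramp(t, 10.0, 0.2))
```

The two branches agree at t=t0, so the estimate has no jump when the backoff period ends.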

t0 comes from Ian's backoff. Estimating pQueryReject0 is easy when t>=t0. However, when t<t0 we have to somehow reduce the importance of any reports to the "ResponseTimeEstimator" s.t. there's no change if t=0 and we gradually get to the full change when t=t0.
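
One way to read "reduce the importance" is to scale each report by t/t0. The sketch below assumes a plain exponential running average rather than the real estimator, and the class name, alpha, and starting value are all invented for illustration:

```python
class WeightedRejectEstimator:
    """Running average of pQueryReject0 whose updates are scaled by how
    far we are through the backoff period (a guess at the idea above,
    not the actual ResponseTimeEstimator)."""

    def __init__(self, p0=0.1, alpha=0.2):
        self.p0 = p0        # assumed starting estimate
        self.alpha = alpha  # assumed smoothing factor

    def report(self, rejected, t, t0):
        # Weight is 0 at t=0 (no change) and 1 from t=t0 on (full change).
        weight = min(max(t / t0, 0.0), 1.0)
        sample = 1.0 if rejected else 0.0
        self.p0 += self.alpha * weight * (sample - self.p0)


est = WeightedRejectEstimator()
est.report(rejected=True, t=0.0, t0=10.0)   # ignored entirely
est.report(rejected=True, t=5.0, t0=10.0)   # counts at half weight
print(est.p0)
```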

Simpler idea: Let pQueryReject(t)=pQueryReject0. That is, don't back off. Instead, just refrain from changing pQueryReject0 right after a QR. We could do this in the same gradual way described above, where we change--or use a special--ResponseTimeEstimator, OR

SIMPLEST PLAN YET: We could just refrain from updating the estimator for pQueryReject for X (configurable) seconds after a QR. We would not reset the timer for X if we get another QR while waiting for the timer to run out. (A reasonable value for X would be the usual average time between getting a QR and getting a not-QR. When QR-based-on-bw was enabled, that value was around 5 minutes. Now, what is it... 5 seconds?)
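
Roughly, in code. This is only a sketch under assumptions: I assume the QR that starts the timer is itself recorded, the estimator is a plain exponential running average, and the caller passes in the current time in seconds; none of the names come from the codebase.

```python
class FrozenRejectEstimator:
    """Ignore all reports for `freeze_seconds` after a recorded QR.
    A QR arriving during the freeze does not restart the timer."""

    def __init__(self, freeze_seconds, p0=0.1, alpha=0.2):
        self.freeze_seconds = freeze_seconds  # the configurable X
        self.p0 = p0                          # assumed starting estimate
        self.alpha = alpha                    # assumed smoothing factor
        self.frozen_until = None

    def report(self, rejected, now):
        if self.frozen_until is not None and now < self.frozen_until:
            return  # still frozen: drop the report, leave the timer alone
        sample = 1.0 if rejected else 0.0
        self.p0 += self.alpha * (sample - self.p0)
        if rejected:
            self.frozen_until = now + self.freeze_seconds


est = FrozenRejectEstimator(freeze_seconds=5.0)
est.report(rejected=True, now=0.0)   # recorded, starts the freeze
est.report(rejected=True, now=1.0)   # dropped, timer still ends at 5.0
est.report(rejected=False, now=5.0)  # freeze over, recorded
print(est.p0)
```

A burst of QRs therefore moves the estimate once instead of driving it towards 1, which is the point of the plan.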

Anyhoo, just because it is the simplest doesn't mean it's the best. But, Toad, at least you should be able to implement it pretty easily, avoid the problems you pointed out, and improve estimate()'s accuracy.

-Martin


_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
