Ian Clarke wrote:

Martin Stone Davis wrote:

Ian Clarke wrote:

As I have said on a few occasions, the best measurement we could have of NGR's performance would be the mean difference between estimated and actual response times. This measures exactly how well NGR is doing the
job it is supposed to be doing, and can thus be used to gauge the effectiveness of modifications to the NGR algorithm in terms of their effect on routing.


Could you point me to something that justifies this? I don't understand how this can be true.


Well, it should really be self-evident if you look at what NGR is trying to do. NGR works by estimating which node will have the lowest response time, and routing to that node. Obviously, the difference between the estimated and actual response times will be lower if the estimate is better, so this difference, averaged over time, will tell us how well NGR is estimating response times, and therefore how optimal its routing decisions are likely to be.
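To make the proposed metric concrete, here is a minimal sketch of tracking it as a running mean absolute error. The class and method names are hypothetical, purely for illustration; this is not code from the Freenet source:

```python
class EstimationErrorTracker:
    """Running mean of |estimated - actual| response time, in ms."""

    def __init__(self):
        self.total_abs_error = 0.0
        self.count = 0

    def record(self, estimated_ms, actual_ms):
        # Accumulate the absolute difference so over- and
        # under-estimates don't cancel each other out.
        self.total_abs_error += abs(estimated_ms - actual_ms)
        self.count += 1

    def mean_abs_error(self):
        return self.total_abs_error / self.count if self.count else 0.0

tracker = EstimationErrorTracker()
tracker.record(estimated_ms=200, actual_ms=250)
tracker.record(estimated_ms=300, actual_ms=280)
print(tracker.mean_abs_error())  # (50 + 20) / 2 = 35.0
```

A smaller running value would mean NGR's estimates are tracking actual response times more closely.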

I think I see: you are measuring the variability between estimated and actual response times of each node in the RT? And you either take the absolute or square of each difference? Do you combine the numbers for the various nodes into a single number?


In any case, I agree that the absolute difference for each node will tend to be closer to zero if the estimate is closer to the actual average. However, even if the estimate perfectly matches the average, the average of the absolute differences will certainly not be zero. Another factor that comes into play is the variability of response times themselves as a function of location in the keyspace. A node that specializes in a region with more variability will then look "worse" than one that specializes in a region with less.
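A quick simulation illustrates this point: even a perfect estimate of the mean leaves a nonzero mean absolute error, and that floor grows with the spread of the actual response times. The numbers below are made up for illustration:

```python
import random

random.seed(0)

def mean_abs_error(estimate, samples):
    return sum(abs(estimate - s) for s in samples) / len(samples)

# Two keyspace "regions" with the same mean response time (500 ms)
# but different variability.
low_var = [random.gauss(500, 50) for _ in range(10000)]
high_var = [random.gauss(500, 200) for _ in range(10000)]

# The estimate 500 is exactly right on average in both cases, yet the
# mean absolute error is nonzero and roughly 4x larger in the noisy
# region (about sigma * sqrt(2/pi) for Gaussian samples).
print(mean_abs_error(500, low_var))   # ~40
print(mean_abs_error(500, high_var))  # ~160
```

So a node specializing in the noisy region would score "worse" on this metric even though its estimator is just as good.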

What's the utility of including that variability in your measure of accuracy?

I may have answered my own question here: You say that you are measuring whether NGR is doing the job it is supposed to be doing. It seems more precise to say you are measuring the difference between an omniscient node and our own node. The better our implementation of NGR, the more "omniscient" our node will seem.

I'm not saying it's a bad thing to measure; I'm just trying to clarify what you are measuring. My eHealth is designed to measure how smartly other nodes are utilizing our node, under the assumption that our estimates are accurate "enough". You are measuring how accurate our estimates are.

It seems like both of these are important. After all, I think it is clear that your measurement would not necessarily increase if, all of a sudden, another bug were introduced that caused all the nodes to start making stupid decisions again. Mine would reveal this problem (assuming the estimates remained as accurate). I would therefore propose that we calculate my "health" of requesting nodes IN ADDITION TO your "variability between predicted and actual times". Sound good?


From what I understand, for every key requested, you would record our estimate of tSuccess, and then subtract the actual time to success. You then take the average of all these differences.


Well, the average of the difference (i.e. never negative), yes.

But the difference could be negative. You mean you take the average of the absolute value of the differences? Or maybe get "statistical" and take the square.
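The two options behave differently: averaging absolute values gives each millisecond of error equal weight, while averaging squares penalizes occasional large misses much more heavily. A toy illustration with made-up numbers:

```python
# The same set of estimation errors (in ms), one of them an outlier.
errors = [10, 10, 10, 100]

# Mean absolute error: every ms of error counts equally.
mae = sum(abs(e) for e in errors) / len(errors)

# Root mean squared error: squaring makes the outlier dominate.
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5

print(mae)   # 32.5
print(rmse)  # ~50.7
```

Which is preferable depends on whether a few badly misestimated requests should count for more than many slightly misestimated ones.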

But so what? There's no reason to think that this average would be larger in build 6264 (where we picked the worst possible node) than in the later corrected versions. My eHealth, however, would measure such a difference.


I don't claim that this will allow us to judge the performance of all aspects of the node's operation, but it would allow us to measure how well NGR is making its routing decisions.

I think I agree. See above.

Ian.


-Martin


_______________________________________________ Devl mailing list [EMAIL PROTECTED] http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
