I made a couple of errors in my original post. I'll correct it below:

Martin Stone Davis wrote:

Tracy R Reed wrote:

On Sun, Nov 02, 2003 at 08:41:45PM +0000, Ian Clarke spake thusly:

With NGR finally starting to meet expectations, we need to think about what lies between now and our 0.6 release.



If you expect only a .08% psuccess and barely a hint of specialization, that's fine, but I rather expected a bit more. :) My psuccess is slowly creeping up (it was .02% for so long), so hopefully it's just a matter of time and the convergence is slow due to the bandwidth limitations we all have, but I am hoping for quite a bit more, especially in the routing area, which Freenet so depends on.


Is psuccess really a good measure of node/network health? IIRC, others have said it is not.

To measure our node's health, I suggest using the metrics I suggested in the 'Suggest change to "General Information" page' thread.

To measure network health (and to some extent, our specialization), I suggest calculating eEfficiency="Estimated efficiency of requesting nodes" as:

eEfficiency=(tEstimateWorstPossible-tEstimateAsIs)/(tEstimateWorstPossible-tEstimateBestPossible)


where
* tEstimateRandom (NOT USED IN FORMULA ABOVE, BUT USED BELOW) is the time we estimate another node would have to wait to get a success if they requested a random key from us (so the estimate must be a straight average across the whole keyspace);


* tEstimateAsIs is the same as tEstimateRandom except that the average is weighted by the distribution of keys that have been requested of us (it's a good idea to use both "instantaneous" and "smoothed" versions of this);

* tEstimateBestPossible is the same as tEstimateRandom except instead of using an average, we just pick that point in the keyspace that would take the least amount of time (and, no, don't cheat by picking a key that we actually have; instead, find the minimum point on an estimator function similar to ones we use to calculate estimator() for other nodes).

* tEstimateWorstPossible is the same as tEstimateBestPossible except that we pick the point that would take the most amount of time.
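To make the formula concrete, here is a minimal sketch of the eEfficiency calculation. The function name and parameters are illustrative only, not Freenet's actual API; it assumes the three estimates above have already been computed from our node's estimator function:

```python
def e_efficiency(t_best, t_worst, t_as_is):
    """Estimated efficiency of requesting nodes.

    t_best  -- tEstimateBestPossible: minimum of our estimator over the keyspace
    t_worst -- tEstimateWorstPossible: maximum of our estimator over the keyspace
    t_as_is -- tEstimateAsIs: estimator averaged, weighted by the keys
               actually requested of us

    Returns a value in [0, 1]; 1 means other nodes use our node perfectly.
    """
    if t_worst == t_best:
        # Flat estimator: every key is equally fast, so any request
        # pattern is trivially "perfect".
        return 1.0
    return (t_worst - t_as_is) / (t_worst - t_best)
```

Note the degenerate case: if the estimator is flat (best == worst), the formula would divide by zero, so the sketch just reports perfect efficiency there.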

If eEfficiency is 1, then we think other nodes are utilizing our node perfectly. If it's 0, then we think other nodes are probably using build 6264! =D

However, if tEstimateRandom were very close to tEstimateBestPossible, we would have to take "Estimated efficiency of requesting nodes" with a grain of salt, since then the network doesn't really need *our node* to be specialized.


Let's say eSpecializationRelevance="Estimated relevance of specialization" and calculate it as:

eSpecializationRelevance=(tEstimateRandom-tEstimateBestPossible)/tEstimateRandom


The closer eSpecializationRelevance is to 1, the more eEfficiency "matters", and the closer we want to say "health" is to eEfficiency. Conversely, the closer eSpecializationRelevance is to 0, the less eEfficiency "matters", and the closer we want to say "health" is to 1 (if it ain't sick, don't heal it!). Throwing all this into a formula, I propose we calculate the health of the network around our node, eHealth, as:

eHealth=eEfficiency^eSpecializationRelevance
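A minimal sketch of these last two formulas (function names are illustrative; e_eff stands for the eEfficiency value defined earlier):

```python
def e_specialization_relevance(t_random, t_best):
    """How much specialization matters around our node:
    (tEstimateRandom - tEstimateBestPossible) / tEstimateRandom.
    """
    if t_random == 0:
        # Degenerate estimator; specialization can't matter.
        return 0.0
    return (t_random - t_best) / t_random

def e_health(e_eff, e_spec_rel):
    """eHealth = eEfficiency ^ eSpecializationRelevance.

    As the exponent goes to 0, eHealth goes to 1 regardless of eEfficiency
    (efficiency stops mattering); as it goes to 1, eHealth approaches
    eEfficiency itself.
    """
    return e_eff ** e_spec_rel
```

The exponent form captures the "if it ain't sick, don't heal it" behavior: e.g. with eEfficiency = 0.5, eHealth is 1.0 when specialization is irrelevant and 0.5 when it fully matters.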

Is this a good metric? Thoughts?

-Martin

P.S. Of course, all of the estimates above would only be relevant with enough data to base the estimates on, so it would be good to show how many datapoints were used to calculate it (with perhaps a warning if the number is too low). Also, it would be good to provide "instantaneous" and "smoothed" versions of the estimates.
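For the "smoothed" versions mentioned above, one simple option (purely a sketch; the node could smooth however it likes) is an exponential moving average over incoming samples:

```python
def smoothed(prev, sample, alpha=0.05):
    """One exponential-moving-average step: a hypothetical way to maintain
    a "smoothed" estimate alongside the "instantaneous" one.

    alpha is the weight given to the newest sample; smaller alpha means
    slower, smoother convergence.
    """
    return (1 - alpha) * prev + alpha * sample
```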


_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl