Geometric mean is still vulnerable to this, just less so than arithmethic
mean. Besides, it is likely much more expensive computationally - or do
modern day chips come with instructions to calculate the Nth root (could
be, I haven't really examined their capabilities) ?
Is there any particular advantage geometric mean has over median ?
On Mon, 10 Apr 2006 20:06:25 -0700, Ian Clarke wrote:
> I believe another way to suppress the influence of outliers is to use a
> geometric mean, ie. rather than adding N numbers and dividing my N, you
> multiply the N numbers, and take the Nth root of the result.
>
> Ian.
>
> On 10 Apr 2006, at 18:02, John B?ckstrand wrote:
>
>> I was thinking about the mean ping times that are used for handling node
>> load as I understood it, and had the idea to use a median instead.
>>
>> Using an average means one single node can artificially push the avg
>> ping very high, unless you have more than 13 other active peers (the
>> influence from a 30s ping (is this the max, btw?) with 13 peers is
>> 2000ms, so even with ping=0 for the other 13 peers, the mean ping will
>> be 2000ms. This could be a very easy way to DoS the network, no? Just
>> create a very popular node (read: SkarX) and then answer pings/packets
>> very slowly.
>>
>> A median suppresses outliers, and outliers in a latency calculation will
>> mostly be, I suspect, those with high latency due to DoS:ing or just
>> having very bad connections (or absurd overutilization). Another form of
>> latency outliers is having local nodes, which could have very low
>> latency not affected by internet connection load.
>>
>> Anyway, attached is a patch to print the median aswell. I had guessed
>> that a few nodes would have "insanely" high ping and be one reason for
>> overloads, and that seems to be the case atleast on my darknet node,
>> which has more active connections. The difference isnt major or anythng
>> though: mean is 353ms, and median is 311ms, caused mainly by one node
>> with a latency of about 3000ms. I will try to graph these two aswell for
>> some extended period of time for my node.
>>
>> ---
>> John B?ckstrand
>> Index: src/freenet/node/NodePinger.java
>> =================================================================== ---
>> src/freenet/node/NodePinger.java (revision 8505) +++
>> src/freenet/node/NodePinger.java (working copy) @@ -1,5 +1,9 @@
>> package freenet.node;
>>
>> +import java.util.ArrayList;
>> +import java.util.Collections;
>> +import java.lang.Double;
>> +
>> import freenet.support.Logger;
>>
>> public class NodePinger implements Runnable {
>> @@ -30,20 +34,32 @@
>> void recalculateMean(PeerNode[] peers) {
>> int peerCount = 0;
>> double total = 1.0;
>> + ArrayList pingTimes = new ArrayList();
>> for(int i=0;i<peers.length;i++) {
>> PeerNode peer = peers[i];
>> if(!peer.isConnected()) continue;
>> peerCount++;
>> double avgPingTime = peer.averagePingTime();
>> + pingTimes.add(new Double(avgPingTime));
>> double avgThrottledPacketSendTime =
>> peer.throttledPacketSendAverage.currentValue();
>> double value = Math.max(avgPingTime,
>> avgThrottledPacketSendTime);
>> Logger.minor(this, "Peer: "+peer.getPeer()+",
>> avgPingTime:
>> "+avgPingTime+", avg throttled send time: "+avgThrottledPacketSendTime);
>> total *= value;
>> }
>> +
>> + Collections.sort(pingTimes);
>> +
>> + for(int i=0; i<pingTimes.size(); i++){ +
>> Logger.minor(this, "at " +
>> i + " " + pingTimes.get(i)); + }
>> +
>> if(peerCount > 0) {
>> total = Math.pow(total, 1.0 / peerCount); meanPing =
>> total;
>> - Logger.minor(this, "Mean ping: "+meanPing+"ms"); +
>> + double median = ((Double)
>> pingTimes.get((pingTimes.size()-1)/
>> 2)).doubleValue();
>> +
>> + Logger.minor(this, "Mean ping: "+meanPing+"ms, median:
>> " + median +
>> "ms");
>> }
>> }
>> }
>> _______________________________________________ Devl mailing list
>> Devl at freenetproject.org
>> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl