On Friday 28 June 2002 17:39, Mathew wrote:

See attached diffs + files.

I think this approach will work but I didn't check it in  because I haven't 
been able to test under overload yet.  I had to restart my node after a DS 
crash and it hasn't picked up much traffic.

-- gj  




> > On Thu, Jun 27, 2002 at 01:16:51AM -0400, Gianni Johansson wrote:
> > Hi Mathew:
> >
> > I am working on code to come up with a psmin value.
> >
> > Basically, I think you want to increase psmin every time you accept a
> > request while overloaded, and decrease it gradually with time when no new
> > requests are accepted.
>
> Arbitrary decisions:
> a) how much to increase by? Doubling (bounded by 1), and halving the
> difference to 1, are obvious options.
> b) how much to decrease by? Presumably the inverse of a) ?
> c) how often to decrease? Average length of request? Maximum length of
> request? Some arbitrary constant?
>
> > I hope to have it done in the next couple of days. I'm pretty busy
> > lately.
> >
> > -- gj
>
> An alternative possibility would be to say lets reject 1%* of requests at
> 60%* load, and 99%* of requests at 90%* load, draw an exponential through
> them, and every five minutes* recalculate a list of buckets in increasing
> order of ps(k), to use to determine whether a key is to be accepted.
> * = arbitrary constant :)
This approach sounds more principled but I'm not sure it's worth the 
work/complexity/overhead.

>
> We can't do this by using standard deviations, because freenet
> histograms don't exactly follow the Normal distribution. But with
> recalculation every five minutes it shouldn't be much of an overhead.
>

--
Freesite
(0.4) freenet:SSK@npfV5XQijFkF6sXZvuO0o~kG4wEPAgM/homepage//
? src/freenet/node/OverloadTriageStrategy.java
? src/freenet/node/SimpleTriageStrategy.java
Index: src/freenet/client/http/NodeStatusServlet.java
===================================================================
RCS file: /cvsroot/freenet/freenet/src/freenet/client/http/NodeStatusServlet.java,v
retrieving revision 1.34
diff -r1.34 NodeStatusServlet.java
835c835,837
< 	pw.println("Max: " + pMax);
---
> 	pw.println("Max         : " + pMax);
>         pw.println("Triage psmin: " + Node.overloadTriage.psmin());
> 
837a840
> 
Index: src/freenet/node/Node.java
===================================================================
RCS file: /cvsroot/freenet/freenet/src/freenet/node/Node.java,v
retrieving revision 1.27
diff -r1.27 Node.java
20c20
< import freenet.message.Request;
---
> import freenet.message.Request; 
935a936
>     static public final OverloadTriageStrategy overloadTriage = new SimpleTriageStrategy();
960a962,963
> 
> 
991,998c994,996
<         if ((maximumThreads == 0 ||
<              (activeJobs() < specializedThreadCutoff)) &&
< 	      ((((int)req.searchKey.getVal()[0]) & 0xff) ==
< 	      binMostSuccessful(true, false))
< 	     )
< 	{
< 	    if (inboundRequests != null)
< 	    {
---
>         if ( (maximumThreads == 0 || (activeJobs() < specializedThreadCutoff)) &&
>              overloadTriage.acceptUnderOverload(req)) {
> 	    if (inboundRequests != null) {
package freenet.node;

import freenet.message.Request;

/**
   This class implements a strategy for triaging incoming requests by
   using statistics about previously successful requests.

   The triaging is done by keeping a minimum success probability psmin
   which is compared to each incoming requests estimated success
   probability.  psmin is increased with every request accepted and
   decreased over time if no requests are accepted.

   This intent of this strategy is to have the node select requests
   more rigourously when the inbound request rate is higher, and more
   forgivingly when it is lower.

   NOTES:
   Let pmax = the highest known request success probability.
   Let pLastRequest = the psmin value at which the previous
                      request was accepted.

   0) Here's how I do the time decay:

   psmin = pLastRequest / 
          ( 1 + ( (currentTimeMs - lastRequestHandledTimeMs) / DECAY_PERIOD_MS)) 
        

   i.e. so if there are no requests accepted for DECAY_PERIOD_MS after
   the last request was accpeted, psmin should drop by 1/2.


   1) "Direct hits" under high load
   If psmin is within K_DIRECT_HIT_FUDGE_FACTOR of pmax and a request
   is accepted psmin is set to K_OVERSHOOT * pmax, effectively
   stopping all requests from being accepted until psmin time decays
   back down to psmax.

   Recovery time = DECAY_PERIOD_MS * (1 - (1 / K_OVERSHOOT))
  
   e.g.
   DECAY_PERIOD_MS = 60000ms, K_OVERSHOOT = 2.0
   => recovery time = 30000ms, 
   (30 seconds during which no requests will be accepted)
              
  
   2) Lack of synchronization 
   I intentionally didn't synchronize on the successful request stats
   because acceptUnderOverload may be called very often, and I was
   afraid of the cost of lock contention.

   I think that the errors introduced by races should be acceptable.
   If anyone sees a really pathalogical scenario, let me know.
   
**/

class SimpleTriageStrategy implements OverloadTriageStrategy {

    public final static float DECAY_PERIOD_MS = 60000.0f;
    public final static float K_OVERSHOOT = 2.0f;
    public final static float K_DIRECT_HIT_FUDGE_FACTOR = 0.01f;
    public final static float K_INCREMENT_FACTOR = 0.2f;

    private long lastRequestAcceptedTimeMs = 0;
    private float psmin = 0.0f;

    public boolean acceptUnderOverload(Request req) {
        int bin =((int)req.searchKey.getVal()[0]) & 0xff;
        
        float pSuccessPrevious = Node.pSuccess(bin, true, false);

        if ( pSuccessPrevious == Float.NaN || pSuccessPrevious <= 0.0f  ) {
            return false;
        }

        long now = System.currentTimeMillis();
        float timeWeightedPsmin = timeWeightedPsmin(now);

        if ( pSuccessPrevious >= timeWeightedPsmin) {
            lastRequestAcceptedTimeMs = now;
            float pMax = Node.pSuccess(Node.binMostSuccessful(true, false),
                                       true, false);

            if ( pMax == Float.NaN || pMax <= 0.0f  ) {
                return false;
            }
            
            // Note: deltap can be less than 0 if pMax changed 
            //       underneath us.
            float deltap = pMax - timeWeightedPsmin;
            System.err.println("SimpleTriageStrategy.acceptUnderOverload -- psmin:  " + timeWeightedPsmin +
                               " pMax: " + pMax +
                               " deltap: " + deltap);

            if (deltap / pMax  < K_DIRECT_HIT_FUDGE_FACTOR) {
                // If there was a direct hit with effective
                // psmin at close to the maximum success
                // probability overshoot to stop all requests for 
                // a short while. 
                //
                // The effective psmin eventually time decays back down to
                // below psmax.
                psmin = pMax * K_OVERSHOOT;
                System.err.println("SimpleTriageStrategy.acceptUnderOverload -- accepted, overshot " +
                                   psmin + " " + pMax);
            }
            else {
                // push psmin up above the value we just accepted
                // the request at.
                psmin = timeWeightedPsmin + K_INCREMENT_FACTOR * deltap;
                System.err.println("SimpleTriageStrategy.acceptUnderOverload -- accepted " +
                                   psmin + " " + pMax);

            }
            return true;
        }
        return false;
    }

    private final float timeWeightedPsmin(long now) {
        if (psmin == 0.0 || lastRequestAcceptedTimeMs == 0 ) {
            return 0.0f;
        }
        
        long deltat = now - lastRequestAcceptedTimeMs;
        
        return (float)(psmin * ( 1.0 / (1.0 + ((float)deltat/DECAY_PERIOD_MS))));
    }

    public final float psmin() { return timeWeightedPsmin( System.currentTimeMillis()); }
}





package freenet.node;

import freenet.message.Request;

public interface OverloadTriageStrategy {
    boolean acceptUnderOverload(Request req);
    float psmin();
}

Reply via email to