Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2012-08-02 Thread Matthew Toseland
On Thursday 01 Sep 2011 15:29:35 Evan Daniel wrote:
 On Thu, Sep 1, 2011 at 7:24 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  I like this proposal :)
 
  Is the documentation on the math of how to get the random routing to
  behave well sufficient? Let me know if it isn't. The MHMC routing math
  shouldn't be too complicated, but we want to be certain it's
  implemented correctly so that the data is sound.
 
  Do you have a metric for how clustered vs uniform a node's peers are?
 
 Maybe. It's a tougher problem than it looks at first glance, unless we
 have a somewhat reliable network size estimate available. I'll give it
 some more thought.
 
 If you want a qualitative, visual estimate, just change the peer
 location distribution graph on the stats page to have a logarithmic x
 axis. Basically, a node should have similar numbers of peers at
 distance 0.25 < d <= 0.5, and at 0.125 < d <= 0.25, etc. That is, bar
 n should graph the number of nodes at distance 2^(-n-2) < d <=
 2^(-n-1). That doesn't provide an answer you can use in code to
 evaluate and make decisions, but it should give better anecdotal
 evidence about problems.

So you/operhiem1 plan to do something with this now we have probe requests?
 
  What's MHMC?
 
 Metropolis-Hastings Monte Carlo. We currently use it to get the right
 probability distribution for location swaps. 

Do we? I thought the walks were purely random, and the probability of swapping 
was based solely on a comparison between the distances before and after?

 We should also use it for
 randomly routed requests. 

Indeed, this is now implemented.

 (For ones that need to be statistically
 random, anyway. It's probably not worth the performance cost for eg
 high-htl random routing of regular requests for security purposes.)
 It's mentioned in the bug report:
 https://bugs.freenetproject.org/view.php?id=3568
 And there's a code snippet on my flog, that's used in my simulator:
 freenet:USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/29/200909.xhtml
 (Entry for 20090926)
 
 Evan Daniel



Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2012-08-02 Thread Evan Daniel
On Thu, Aug 2, 2012 at 7:18 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Thursday 01 Sep 2011 15:29:35 Evan Daniel wrote:
 On Thu, Sep 1, 2011 at 7:24 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  I like this proposal :)
 
  Is the documentation on the math of how to get the random routing to
  behave well sufficient? Let me know if it isn't. The MHMC routing math
  shouldn't be too complicated, but we want to be certain it's
  implemented correctly so that the data is sound.
 
  Do you have a metric for how clustered vs uniform a node's peers are?

 Maybe. It's a tougher problem than it looks at first glance, unless we
 have a somewhat reliable network size estimate available. I'll give it
 some more thought.

 If you want a qualitative, visual estimate, just change the peer
 location distribution graph on the stats page to have a logarithmic x
 axis. Basically, a node should have similar numbers of peers at
 distance 0.25 < d <= 0.5, and at 0.125 < d <= 0.25, etc. That is, bar
 n should graph the number of nodes at distance 2^(-n-2) < d <=
 2^(-n-1). That doesn't provide an answer you can use in code to
 evaluate and make decisions, but it should give better anecdotal
 evidence about problems.

 So you/operhiem1 plan to do something with this now we have probe requests?

Yes, though probably not immediately.


  What's MHMC?

 Metropolis-Hastings Monte Carlo. We currently use it to get the right
 probability distribution for location swaps.

 Do we? I thought the walks were purely random, and the probability of 
 swapping was based solely on a comparison between the distances before and 
 after?

Absolutely. It's the mathematical technique used to prove that both
the path folding algorithm of opennet and the location swapping
algorithm of darknet will produce the desired results. The fact that
the formulas used in those proofs don't appear in the Freenet source
code is irrelevant, except that it contributes to Freenet code being
difficult to understand.
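
To make the connection concrete: the swap decision itself can be read as a Metropolis-Hastings acceptance step. A minimal sketch, with hypothetical names and ignoring details like shared peers - not the actual Freenet code:

class SwapSketch {
    // The darknet location swap viewed as a Metropolis-Hastings step: compare
    // the product of link distances before and after the proposed swap.
    static boolean shouldSwap(double locA, double[] peersOfA,
                              double locB, double[] peersOfB,
                              java.util.Random random) {
        double before = 1.0, after = 1.0;
        for (double p : peersOfA) { before *= distance(locA, p); after *= distance(locB, p); }
        for (double p : peersOfB) { before *= distance(locB, p); after *= distance(locA, p); }
        // Always swap if it shrinks the product of distances; otherwise swap
        // with probability before/after. That min(1, before/after) rule is the
        // Metropolis-Hastings acceptance.
        return after <= before || random.nextDouble() < before / after;
    }

    // Circular distance on the [0, 1) keyspace.
    static double distance(double a, double b) {
        double d = Math.abs(a - b);
        return Math.min(d, 1.0 - d);
    }
}

So the "comparison between the distances before and after" is exactly the acceptance ratio; the proof machinery just never shows up as code.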

Evan Daniel


Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-10 Thread Arne Babenhauserheide
Hi, 
At Sat, 03 Sep 2011 00:53:58 +0200,
Arne Babenhauserheide wrote:
 On Friday 2 September 2011 at 23:34:29, Matthew Toseland wrote:
   If the load balancer does not have some hidden delicacies, there is a
   very  simple check to see if my understanding is right.
  No, it takes into account that SSKs use very little bandwidth (1-2K versus
  32K).
 
 Bandwidth: 75 down, 85 up. 

I changed the upload bandwidth setting to 120 out (about my real output bandwidth), and 
it adjusted itself to 70 down and 78 up. This is about 10-20% more than with a 
90 kB/s limit, but far from using my 120 (it only uses 65%). It is less 
than the overshooting one, though → the load limiter is a bit too strong. 

I assume that, because CHKs and SSKs are treated the same, the load limiting also 
increased the number of CHKs, even though those are the requests that really need the bandwidth. 

If we have a mean transfer time of about 45 s-1 min for a 32 kB block on my node 
(with OLM: failed CHK time - successful time; with NLM it is likely a bit more 
complex), then the search time of about 1 minute in NLM for unsuccessful downloads 
(successful ones take 1:24 min for me) will block IO. If we assume that a successful 
transfer still takes 45 s, almost half the time is spent searching without any 
transfer → wasted bandwidth. 

(Disclaimer: the following is just an idea. Treat any “should” as “it might be 
a good idea to” - which is much longer and harder to read, so I go with 
“should”.) 

I assume the deeper issue, beyond just failing requests, is that waiting 
should not be accounted as bandwidth. A request should be accounted as 32 kB × 
its estimated success probability, so failing requests are essentially not counted 
(conceptually - the implementation is another matter). The limiter should then 
use a time-frame of about 2× the time to complete a request and try to fill 
that → since NLM has higher wait times, it also needs a bigger bandwidth-limiter 
window. If completion time can get to 3 minutes for bulk, the bandwidth-limiter 
window needs to be 6 minutes. 
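
To make that concrete, a rough sketch of the accounting I have in mind (hypothetical names and numbers, not a patch):

class AccountingSketch {
    // A request reserves bandwidth in proportion to its chance of actually
    // transferring data; the accounting window is twice the mean completion time.
    static final int BLOCK_BYTES = 32 * 1024;

    static boolean canAcceptRequest(double successProbability,
                                    double bytesAlreadyCommitted,
                                    double meanCompletionSeconds,
                                    double outputBytesPerSecond) {
        double cost = BLOCK_BYTES * successProbability;    // failing requests cost almost nothing
        double windowSeconds = 2 * meanCompletionSeconds;  // NLM waits longer => bigger window
        double budget = outputBytesPerSecond * windowSeconds;
        return bytesAlreadyCommitted + cost <= budget;
    }
}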

… one more reason (assuming a 2 min limiter window) why NLM used 
so little bandwidth: estimated 2 min search times with 1 min transfer time 
meant that essentially 2/3 of the allocated bandwidth was overbooking, 
because 2/3 of the transfers would not finish within the window, so their 
bandwidth reservation was carried over into the next period. This reduced the 
total number of running requests, which in turn increased the wait times (since 
fewer requests finished in a given time period). 

Conclusion: 


NLM changes some basic assumptions about the load
limiting. Because of that we need parameter tweaking to integrate
it cleanly. Likely that can only be done in a real network. We
already know that the network *does not break down* with NLM, so
live tweaking is possible.

Likely it also changes the assumptions for other parts, so these
will need some tweaking, too.

Toad spent years optimizing the parameters and helper parts to
work well with the assumptions from OLM. It’s natural that NLM —
which has slightly different characteristics — requires some
optimization outside of its main algorithms, too.


Best wishes, 
Arne

Besides: bad performance: if the inserter uses NLM and the downloader uses OLM, 
I think that might make it harder for the downloader to get the data (different 
routing). Worse: if some downloaders use NLM and some use OLM, the packets might 
take different paths (NLM has shorter paths), which is likely to affect 
caching negatively (the data needs to be cached along both paths). 

Besides 2: Regin has the theory that the thread scheduler was the bottleneck, 
because his threads always hit the limit of 500 threads. 

Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Ian Clarke
On Wed, Aug 31, 2011 at 8:00 AM, Matthew Toseland t...@amphibian.dyndns.org
 wrote:

 WE NEED MORE DATA.


Well, my gut tells me that our existing scheme is likely too complicated to
fix unless we are extremely fortuitous; however, I'm happy to be wrong about
that if others think that they have a good understanding of why we're having
problems and how to fix them.

So this is fine with me provided that it's not just data for data's sake, but
*actionable* data.  By this I mean that the data we collect should give us
clear information about what is and isn't working, and hopefully tell us how
to fix it.

But a bunch of metrics we have no idea how to interpret won't get us
anywhere.

Ian.

-- 
Ian Clarke
Founder, The Freenet Project
Email: i...@freenetproject.org

Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Arne Babenhauserheide
On Friday 2 September 2011 at 12:20:02, Ian Clarke wrote:
 On Wed, Aug 31, 2011 at 8:00 AM, Matthew Toseland t...@amphibian.dyndns.org
  wrote:
 
  WE NEED MORE DATA.

 Well, my gut tells me that our existing scheme is likely too complicated to
 fix unless we are extremely fortuitous, however I'm happy to be wrong about
 that if others think that they have a good understanding of why we're having
 problems and how to fix them.

If the load balancer does not have some hidden delicacies, there is a very
simple check to see if my understanding is right.

Since SSKs are mostly unsuccessful and are about 50% of the requests, the
bandwidth limiter essentially targets 50% of the bandwidth.

Setting my bandwidth to about 150% of my actual bandwidth should make it guess
my bandwidth more correctly, leaving 25% free for bursting¹.

Currently the mean bandwidth with NLM and AIMDs for me is about 50 kB/s on a
setting of 90kB/s outgoing.

My line can handle about 120kB/s outgoing.

So I set the bandwidth setting to 180kB/s.

If I am right, Freenet should then consume about 90kB/s on average.
If it stays at 50-60, that’s likely a limitation of my peers → no useful data
→ test would have to be done on a slower line or with more peers.
If it goes down or I get very many timeouts, then I'm likely wrong.
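
(In numbers: if the limiter effectively delivers about half the configured figure, a 180 kB/s setting should give roughly 180 × 0.5 = 90 kB/s of real output, i.e. 90/120 = 75% of my line, leaving the ~25% headroom for bursting mentioned above.)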

It would be nice if some other people could replicate that.

Note: I just disabled my testnet node to avoid skewing the data.

Best wishes,
Arne


Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Matthew Toseland
On Friday 02 Sep 2011 18:20:02 Ian Clarke wrote:
 On Wed, Aug 31, 2011 at 8:00 AM, Matthew Toseland t...@amphibian.dyndns.org
  wrote:
 
  WE NEED MORE DATA.
 
 Well, my gut tells me that our existing scheme is likely too complicated to
 fix unless we are extremely fortuitous, however I'm happy to be wrong about
 that if others think that they have a good understanding of why we're having
 problems and how to fix them.
 
 So this is fine with me provided that it's not just data for data's sake, but
 *actionable* data.  By this I mean that the data we collect should give us
 clear information about what is and isn't working, and hopefully tell us how
 to fix it.

As a matter of principle, empiricism requires experimental data.

But it's true that sometimes the metrics which are easiest to gather, and most 
closely related to our expected usage (i.e. good benchmarks), are hardest to 
theorise about.
 
 But a bunch of metrics we have no idea how to interpret won't get us
 anywhere.

Well, the following appear to be justifiable:
- We can detect whether the theory about network topology being messed up by 
local requests is correct. This is directly actionable: If a large proportion 
of nodes have this problem, we could try the "no path folding above HTL 16" 
rule. Of course, this might slow down local requests, and might even make it 
harder to bootstrap new nodes ...
- Throughput data ... I guess some of our existing tests probably give us this, 
e.g. the daily timings for insert/retrieve, which lately succeed but take 
rather a long time. I guess we probably have enough here, although arguably we 
should have a big-file inter-day test (current tests are small files for 
multiple days or big files for one day).
- Build-to-build comparisons: If we are going to deploy changes to the network 
we need to be able to compare one build to another. Otherwise, no matter how 
good our simulations (and odds are they won't be very good, getting the right 
level of accuracy is hard), we won't know whether it works in practice.



Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Matthew Toseland
On Friday 02 Sep 2011 19:31:14 Arne Babenhauserheide wrote:
 On Friday 2 September 2011 at 12:20:02, Ian Clarke wrote:
  On Wed, Aug 31, 2011 at 8:00 AM, Matthew Toseland t...@amphibian.dyndns.org
   wrote:
   
   WE NEED MORE DATA.
  
  Well, my gut tells me that our existing scheme is likely too complicated to
  fix unless we are extremely fortuitous, however I'm happy to be wrong about
  that if others think that they have a good understanding of why we're having
  problems and how to fix them.
 
 If the load balancer does not have some hidden delicacies, there is a very 
 simple check to see if my understanding is right. 
 
 Since SSKs are mostly unsuccessful and are about 50% of the requests, the 
 bandwidth limiter essentially targets 50% of the bandwidth. 

No, it takes into account that SSKs use very little bandwidth (1-2K versus 32K).
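
(Back of the envelope: if half the requests are SSKs at ~2K and half are CHKs at 32K, SSKs account for roughly 2/34, about 6% of the bytes, not 50%.)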



Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Matthew Toseland
On Friday 02 Sep 2011 18:20:02 Ian Clarke wrote:
 On Wed, Aug 31, 2011 at 8:00 AM, Matthew Toseland t...@amphibian.dyndns.org
  wrote:
 
  WE NEED MORE DATA.
 
 
 Well, my gut tells me that our existing scheme is likely too complicated to
 fix unless we are extremely fortuitous, however I'm happy to be wrong about
 that if others think that they have a good understanding of why we're having
 problems and how to fix them.

And my gut tells me that scrapping 6 months' work sucks, especially when 
current performance appears to be well below what it was when I started to 
deploy NLM. So we should try to salvage it, especially as anything new will 
probably take another 6 months to argue about, design, simulate, tweak and 
generally get right.

Funny how gut feelings can be so subjective! ;)
 
 So this is fine with me provided that it's not just data for data's sake, but
 *actionable* data.  By this I mean that the data we collect should give us
 clear information about what is and isn't working, and hopefully tell us how
 to fix it.
 
 But a bunch of metrics we have no idea how to interpret won't get us
 anywhere.



Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-02 Thread Arne Babenhauserheide
On Friday 2 September 2011 at 23:34:29, Matthew Toseland wrote:
  If the load balancer does not have some hidden delicacies, there is a
  very  simple check to see if my understanding is right.
 
  Since SSKs are mostly unsuccessful and are about 50% of the requests,
  the  bandwidth limiter essentially targets 50% of the bandwidth.
 
 No, it takes into account that SSKs use very little bandwidth (1-2K versus
 32K).

That might explain the data I see: 

Bandwidth: 75 down, 85 up. 

I expected 90 kB/s. 

Theory: my doubling of the claimed bandwidth seems to have overshot, which caused too 
many timeouts (that is also supported by the volatility of the bandwidth: 
oscillating rapidly between 50 and 100 - this is with AIMDs on). 

Best wishes, 
Arne


Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-01 Thread Matthew Toseland
On Wednesday 31 Aug 2011 15:05:51 Evan Daniel wrote:
 On Wed, Aug 31, 2011 at 9:00 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  On Monday 29 Aug 2011 19:28:50 Ian Clarke wrote:
  Ok, thinking about it, here is a proposal, or  rather, the beginning of a
  proposal.  I'm assuming that we get rid of NLM, fair sharing, and anything
  else intended to control load, and replace it with this.  We will 
  absolutely
  need to simulate this before we write a single line of code to deploy it.
 
  The core idea is that a node will include a floating point number in
  response to any kind of request showing how close that node is to being
  overloaded.  0.0 would mean its doing nothing, 1.0 would mean that it is
  completely saturated and must reject requests.  Clearly the goal is to 
  avoid
  getting anywhere near to 1.0.
 
  A node tracks several things:
 
 - What is the overall average load reported by responses this node has
 received
 - What is the average load reported by responses this node has received,
 per remote node
 - What is the average load reported by responses this node forwarded, 
  per
 remote node
 
  I think, given these metrics, we should be able to do the following:
 
 - Limit our overall rate of initiating local requests based on the 
  global
 average reported load
 - Limit our rate of local requests based on the average load of the
 connection to the peer it would need to be forwarded to
 - Detect when remote peers are abusing the system by disregarding load -
 as evidenced by a significantly higher average load in replies 
  forwarded to
 them
 
  Of course, lots of questions:
 
 - How do we compute the averages?  A decaying running average of some
 kind?  What time period?
 - How do we translate load into a desired rate of requests?
 - What criteria indicate that a peer is abusing the system?  What is the
 remedy?
 
  This is basically control theory stuff, and we'll need robust answers to
  these questions before we proceed (ideally with a theoretical rather than
  experimental foundation).
 
  One fundamental issue that hasn't been covered in detail yet:
 
  WE NEED MORE DATA.
 
  How can we achieve this?
 
  First, digger3's graphs are very helpful. But he can only very rarely show 
  statistically significant results between one build and another. We need a 
  lot more people gathering data for this. We can either just ask people, or 
  we can build a plugin that makes it easy (there is some manual setup at the 
  moment).
 
  Second, digger3's graphs don't address throughput. This is THE reason why 
  NLM was turned off: throughput was dramatically down. digger3's graphs are 
  based on the LongTermManySingleBlocksTest, which is based on inserting a 
  bunch of single blocks. It's true that each is timed but that is 
  irrelevant: What we need is a big insert, where other factors affecting 
  throughput can come into play.
 
  Third, there are those who say our problems are not related to load 
  management at all, they are related to the fact that Freenet nodes with a 
  lot of downloads tend to have a flat distribution of links. This is a 
  natural enough optimisation for their local usage, but if it happens across 
  the network, or even if it happens in clusters, it will wreak havoc. We 
  need some way to estimate how widespread this is. If it is common, we 
  should do something about it, for instance, not path folding until requests 
  reach HTL 16 (which is a good idea for security, but if this problem is 
  rare and not clustered it would reduce performance).
 
  The obvious option is to implement the random routed probes that Evan 
  suggested. These should eventually be able to replace the current probe 
  requests, and would return a lot less data and so be safer. They would be 
  routed randomly and then terminate on a node, and return a single value 
  from that node - in this case some metric of how spread out vs specialised 
  its peers are. Other versions would be able to estimate the network size by 
  a tag-and-release method that should be cheaper than the current probe 
  request based stuff (it would return an estimate of uptime and a unique ID 
  for the node which is kept private and used solely for that purpose), or 
  return the proportion of peers backed off. Another variant is to random 
  route and then do a request - this would allow us to determine whether 
  content is still retrievable and thus whether we want to reinsert it, or 
  might provide more representative statistics than bootstrapping a new node 
  every time (at least allowing us to separate problems with bootstrapping 
  from problems faced by the average node).
 
  IMHO some or all of the above are worth seriously considering before 
  deploying any new scheme. If we want to be empirical we need to measure the 
  effect of our changes on the real network, not only in simulation.
 
 I like this proposal :)

Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-01 Thread Evan Daniel
On Thu, Sep 1, 2011 at 7:24 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 I like this proposal :)

 Is the documentation on the math of how to get the random routing to
 behave well sufficient? Let me know if it isn't. The MHMC routing math
 shouldn't be too complicated, but we want to be certain it's
 implemented correctly so that the data is sound.

 Do you have a metric for how clustered vs uniform a node's peers are?

Maybe. It's a tougher problem than it looks at first glance, unless we
have a somewhat reliable network size estimate available. I'll give it
some more thought.

If you want a qualitative, visual estimate, just change the peer
location distribution graph on the stats page to have a logarithmic x
axis. Basically, a node should have similar numbers of peers at
distance 0.25 < d <= 0.5, and at 0.125 < d <= 0.25, etc. That is, bar
n should graph the number of nodes at distance 2^(-n-2) < d <=
2^(-n-1). That doesn't provide an answer you can use in code to
evaluate and make decisions, but it should give better anecdotal
evidence about problems.
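
To make the binning concrete, a minimal sketch of such a log-scale histogram (a hypothetical helper, not the existing stats-page code):

class PeerHistogramSketch {
    // Bar n counts peers whose circular keyspace distance d from our location
    // satisfies 2^(-n-2) < d <= 2^(-n-1). A roughly flat histogram suggests
    // the desired small-world mix of link distances.
    static int[] logDistanceHistogram(double myLocation, double[] peerLocations, int bars) {
        int[] counts = new int[bars];
        for (double loc : peerLocations) {
            double d = Math.abs(myLocation - loc);
            d = Math.min(d, 1.0 - d);                  // circular distance on [0, 1)
            for (int n = 0; n < bars; n++) {
                if (d > Math.pow(2, -n - 2) && d <= Math.pow(2, -n - 1)) {
                    counts[n]++;
                    break;
                }
            }
        }
        return counts;
    }
}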

 What's MHMC?

Metropolis-Hastings Monte Carlo. We currently use it to get the right
probability distribution for location swaps. We should also use it for
randomly routed requests. (For ones that need to be statistically
random, anyway. It's probably not worth the performance cost for eg
high-htl random routing of regular requests for security purposes.)
It's mentioned in the bug report:
https://bugs.freenetproject.org/view.php?id=3568
And there's a code snippet on my flog, that's used in my simulator:
freenet:USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/29/200909.xhtml
(Entry for 20090926)
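
For reference, the core of a Metropolis-Hastings corrected walk looks roughly like this (a sketch with made-up types, not the snippet from the flog entry):

class RandomWalkSketch {
    static class Node { java.util.List<Node> peers = new java.util.ArrayList<>(); }

    // A walk that samples nodes uniformly regardless of degree: propose a
    // uniformly chosen peer, accept with probability min(1, deg(current)/deg(candidate)),
    // otherwise stay put for that hop. Assumes every node has at least one peer.
    static Node randomWalk(Node start, int hops, java.util.Random random) {
        Node current = start;
        for (int i = 0; i < hops; i++) {
            Node candidate = current.peers.get(random.nextInt(current.peers.size()));
            if (random.nextDouble() * candidate.peers.size() < current.peers.size()) {
                current = candidate;   // accepted
            }                          // rejected: the hop is spent, but we stay
        }
        return current;
    }
}

Without the acceptance test a plain random walk over-samples high-degree nodes in proportion to their degree, which would bias any statistics gathered at the endpoint.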

Evan Daniel


Re: [freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-08-31 Thread Evan Daniel
On Wed, Aug 31, 2011 at 9:00 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Monday 29 Aug 2011 19:28:50 Ian Clarke wrote:
 Ok, thinking about it, here is a proposal, or  rather, the beginning of a
 proposal.  I'm assuming that we get rid of NLM, fair sharing, and anything
 else intended to control load, and replace it with this.  We will absolutely
 need to simulate this before we write a single line of code to deploy it.

 The core idea is that a node will include a floating point number in
 response to any kind of request showing how close that node is to being
 overloaded.  0.0 would mean its doing nothing, 1.0 would mean that it is
 completely saturated and must reject requests.  Clearly the goal is to avoid
 getting anywhere near to 1.0.

 A node tracks several things:

    - What is the overall average load reported by responses this node has
    received
    - What is the average load reported by responses this node has received,
    per remote node
    - What is the average load reported by responses this node forwarded, per
    remote node

 I think, given these metrics, we should be able to do the following:

    - Limit our overall rate of initiating local requests based on the global
    average reported load
    - Limit our rate of local requests based on the average load of the
    connection to the peer it would need to be forwarded to
    - Detect when remote peers are abusing the system by disregarding load -
    as evidenced by a significantly higher average load in replies forwarded 
 to
    them

 Of course, lots of questions:

    - How do we compute the averages?  A decaying running average of some
    kind?  What time period?
    - How do we translate load into a desired rate of requests?
    - What criteria indicate that a peer is abusing the system?  What is the
    remedy?

 This is basically control theory stuff, and we'll need robust answers to
 these questions before we proceed (ideally with a theoretical rather than
 experimental foundation).
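
A decaying running average here would most naturally be an exponentially weighted moving average, kept globally and per peer. A hypothetical sketch, just to pin down the shape of it:

class LoadAverage {
    // Exponentially weighted moving average of reported load values in [0.0, 1.0].
    // The smoothing factor alpha sets the effective time period.
    private final double alpha;            // e.g. 0.05 for slow decay
    private double value;
    private boolean initialised;

    LoadAverage(double alpha) { this.alpha = alpha; }

    void report(double load) {             // call for each response carrying a load value
        value = initialised ? alpha * load + (1.0 - alpha) * value : load;
        initialised = true;
    }

    double current() { return value; }
}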

 One fundamental issue that hasn't been covered in detail yet:

 WE NEED MORE DATA.

 How can we achieve this?

 First, digger3's graphs are very helpful. But he can only very rarely show 
 statistically significant results between one build and another. We need a 
 lot more people gathering data for this. We can either just ask people, or we 
 can build a plugin that makes it easy (there is some manual setup at the 
 moment).

 Second, digger3's graphs don't address throughput. This is THE reason why NLM 
 was turned off: throughput was dramatically down. digger3's graphs are based 
 on the LongTermManySingleBlocksTest, which is based on inserting a bunch of 
 single blocks. It's true that each is timed but that is irrelevant: What we 
 need is a big insert, where other factors affecting throughput can come into 
 play.

 Third, there are those who say our problems are not related to load 
 management at all, they are related to the fact that Freenet nodes with a lot 
 of downloads tend to have a flat distribution of links. This is a natural 
 enough optimisation for their local usage, but if it happens across the 
 network, or even if it happens in clusters, it will wreak havoc. We need some 
 way to estimate how widespread this is. If it is common, we should do 
 something about it, for instance, not path folding until requests reach HTL 
 16 (which is a good idea for security, but if this problem is rare and not 
 clustered it would reduce performance).

 The obvious option is to implement the random routed probes that Evan 
 suggested. These should eventually be able to replace the current probe 
 requests, and would return a lot less data and so be safer. They would be 
 routed randomly and then terminate on a node, and return a single value from 
 that node - in this case some metric of how spread out vs specialised its 
 peers are. Other versions would be able to estimate the network size by a 
 tag-and-release method that should be cheaper than the current probe request 
 based stuff (it would return an estimate of uptime and a unique ID for the 
 node which is kept private and used solely for that purpose), or return the 
 proportion of peers backed off. Another variant is to random route and then 
 do a request - this would allow us to determine whether content is still 
 retrievable and thus whether we want to reinsert it, or might provide more 
 representative statistics than bootstrapping a new node every time (at least 
 allowing us to separate problems with bootstrapping from problems faced by 
 the average node).

 IMHO some or all of the above are worth seriously considering before 
 deploying any new scheme. If we want to be empirical we need to measure the 
 effect of our changes on the real network, not only in simulation.



I like this proposal :)

Is the documentation on the math of how to get the random routing to
behave well sufficient? Let me know if it isn't. The MHMC routing math
shouldn't be too complicated, but we want to be certain it's
implemented correctly so that the data is sound.