Recent problems with the bandwidth limiter - it is inaccurate (a minor
problem), and it is the primary cause of the quite ludicrous message
send times we are seeing (a major problem) - lead me to the conclusion
that the best way forward is to reimplement it. Basic principles (a
better abstraction may be found for the implementation; this is the
behaviour I'm interested in):

We have N users of the bandwidth. Each gets preferential treatment over
a certain fraction of the whole. I want this so that we can separate
bandwidth used for trailers from bandwidth used for messages, and
ensure that messages are not starved by trailers.
We have K bytes per millisecond coming in. We store these in blocks of
L milliseconds, say 10. These can be drained a byte at a time, or in
larger quantities.
We must discard blocks _if unused_ after the expiry period - we cannot
accumulate bandwidth beyond this point.
While a block is new (younger than M milliseconds), we allocate its
bytes only to the user it is reserved for.
Once a block is older than M milliseconds, we allocate its bytes to
whoever wants them.
Once a block is older than E milliseconds, we delete it unused.
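The aging rules above can be sketched roughly as follows. This is a
hypothetical illustration in Java - the names (ChunkedLimiter, refill,
take) and the exact shape are mine, not existing Freenet code - keeping
only the owner-preference, aging and expiry behaviour:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the aged-chunk scheme: K bytes arrive per
// millisecond, banked in chunks covering L ms each. A chunk younger
// than M ms serves only the user it is reserved for; between M and E
// ms it serves anyone; past E ms it is discarded unused.
public class ChunkedLimiter {
    static final class Chunk {
        final long born;   // creation time, ms
        final int owner;   // user this chunk is reserved for
        long bytes;        // bytes still available in this chunk
        Chunk(long born, int owner, long bytes) {
            this.born = born; this.owner = owner; this.bytes = bytes;
        }
    }

    private final long k, l, m, e;   // K, L, M, E from the text
    private final Deque<Chunk> chunks = new ArrayDeque<>();

    public ChunkedLimiter(long k, long l, long m, long e) {
        this.k = k; this.l = l; this.m = m; this.e = e;
    }

    // Called every L ms: bank K*L bytes, reserved for the given user.
    public void refill(long now, int owner) {
        chunks.addLast(new Chunk(now, owner, k * l));
    }

    // Drop chunks older than E ms - they may not accumulate further.
    public void expire(long now) {
        while (!chunks.isEmpty() && now - chunks.peekFirst().born > e)
            chunks.removeFirst();
    }

    // Take up to 'wanted' bytes for 'user', draining oldest chunks
    // first. Young chunks (< M ms old) are only drained by their owner.
    public long take(int user, long wanted, long now) {
        expire(now);
        long got = 0;
        for (Chunk c : chunks) {
            if (got >= wanted) break;
            boolean young = now - c.born < m;
            if (young && c.owner != user) continue;
            long grab = Math.min(wanted - got, c.bytes);
            c.bytes -= grab;
            got += grab;
        }
        return got;
    }
}
```

Draining oldest chunks first keeps usage smooth: fresh bandwidth is held
back for its reserved user while anything near expiry gets used or
dropped.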
When we want to obtain some bandwidth, e.g. for writing, we ask the
limiter for so many bytes. It finds as many as possible, up to that
limit, from the chunks currently available. Then, unless it can return
immediately (because the required number of bytes is available _now_),
it either returns the number available, or pre-reserves some and
returns the time to sleep before claiming the rest (or we could
implement it as a callback - that would probably be more sensible with
the new design), depending on how we call it.
The code would have built-in logging of means over a specified period,
to eliminate the need for, and the inefficiency of,
DiagnosticsInputStream.
We may either grab some bandwidth up to what we need, try the write,
and return any unused bytes (analogous to the old code); or we may do
what we do now (do the write and then claim the bandwidth).
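The acquisition behaviour described above might look roughly like this
as an interface. All names here (BandwidthLimiter, Grant, acquireAsync)
are assumptions for illustration, not existing Freenet code:

```java
// Hypothetical shape for the acquisition API: the caller asks for
// bytes and either gets them immediately, gets however many are
// available, or gets a pre-reservation plus a time to sleep.
public interface BandwidthLimiter {
    // Result of an acquisition attempt. If sleepMillis == 0, 'granted'
    // bytes may be used right away; otherwise 'granted' bytes have
    // been pre-reserved and the caller should sleep before retrying.
    final class Grant {
        public final long granted;
        public final long sleepMillis;
        public Grant(long granted, long sleepMillis) {
            this.granted = granted; this.sleepMillis = sleepMillis;
        }
    }

    // Blocking-style call: take up to 'wanted' bytes for 'user'.
    Grant acquire(int user, long wanted);

    // Callback-style alternative, probably more sensible with the new
    // design: run 'onReady' once 'wanted' bytes have been reserved.
    void acquireAsync(int user, long wanted, Runnable onReady);
}

// Trivial always-full implementation, for illustration only.
class UnlimitedLimiter implements BandwidthLimiter {
    public Grant acquire(int user, long wanted) {
        return new Grant(wanted, 0);
    }
    public void acquireAsync(int user, long wanted, Runnable onReady) {
        onReady.run();
    }
}
```

A real implementation would return a nonzero sleepMillis (or defer the
callback) whenever the chunks on hand cannot cover the request.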

Gains:
* We can allocate some proportion to messages, and some proportion to
  trailers, safe in the knowledge that it will be poached later if it is
  needed. The percentages need not add up to 100%; we can allocate 10%
  to trailers and 10% to messages, and let the rest float free. The end
  result of this is that trailers and messages cannot starve each other
  beyond certain configurable limits. This should help to reduce message
  send times. This is the vital bit.
* Excess bandwidth is discarded after it is no longer usable without
  generating a spike. This means bwlimiting will be much smoother, a
  vital step towards the unofficial project goal of being able to run
  Unreal Tournament at the same time as a live (limited, maybe
  customized a bit - e.g. 1ms granularity, 100ms expiry) node!
* Cleaner code. Hopefully!

We would then have a separate limiter structure for the long-term
limiting. The main limiter would allocate bandwidth 80:20 between
trailers and messages (but messages could steal bandwidth from trailers
after a short time), and would expire unused bandwidth after 1 second,
probably using a granularity of 10ms. The auxiliary long-term average
limiter would use an expiry period of 1 week and a granularity of 1
hour (with a lot of bandwidth in each granule!). The latter would need
to be serialized if we are serious about it. Other limits could be
introduced at whatever granularity was required.
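As a rough illustration of the parameter choices above - the class and
constant names are hypothetical, only the numbers come from the text:

```java
// Hypothetical parameter choices for the two limiters described above.
public class LimiterConfig {
    // Main limiter: 80:20 split between trailers and messages, with
    // messages allowed to steal trailer bandwidth after a short time.
    public static final double TRAILER_FRACTION = 0.8;
    public static final double MESSAGE_FRACTION = 0.2;
    public static final long MAIN_GRANULARITY_MS = 10;    // 10ms granules
    public static final long MAIN_EXPIRY_MS = 1000;       // expire after 1s

    // Auxiliary long-term average limiter: very coarse granules, long
    // expiry; this one would need to be serialized across restarts.
    public static final long LONGTERM_GRANULARITY_MS =
            60L * 60 * 1000;                              // 1 hour
    public static final long LONGTERM_EXPIRY_MS =
            7L * 24 * 60 * 60 * 1000;                     // 1 week
}
```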

Because of the real-time nature of routing; because almost all nodes
use bandwidth limiting, often quite severe bandwidth limiting; because
of the strict (compared to current message send times) timeouts
throughout the code; and because of the failure of our attempts to
alleviate the problems with message send times (new diagnostic,
messageSendTime - target is 200ms, or at the very worst 1000ms; typical
values are 5 seconds to 30 seconds or more - the timeouts start at
4000ms...), I believe that this is of higher priority even than
NGRouting.

Other possible solutions for the present problems:
* Reject queries if the previous minute's (or 10 seconds', or, with
  some work, second's) bandwidth limit was exceeded by some proportion.
  In the current network, nodes will not generally get the message;
  with NGRouting, it would work a bit better. But it's still very
  coarse-grained.
* Don't limit (or pseudo-limit, i.e. don't make them wait, but let them
  slow down everyone else) message sending. The problem is that for many
  nodes message sending occupies most of their available bandwidth.
* Somehow use two Bandwidth objects, one for messages and one for trailers,
  and allow them to steal bandwidth up to some arbitrary limit from each
  other. Disadvantages include the fact that we have to decide how much
  bandwidth to keep for each one, that it might cause bandwidth usage
  spikes (by accumulating spare bandwidth over relatively long periods),
  and that it might be as difficult to implement and debug as a new
  limiter.
-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
