The observed problems:
* The bandwidth limiting code was broken. It has been fixed.
* Even after the fix, messages took very long to send. This was
  apparently mainly because messages were being starved by trailers. I
  made messages take priority over trailers, and limited trailer sends
  to the available bandwidth, when there was any (the allowance is
  refreshed 10 times per second, so the chunks won't be too small).
  This brought message send times down to consistently acceptable
  levels. I included a safeguard: after 2000ms, send jobs from trailers
  are promoted to the priority of messages (and sent to the end of that
  queue). Also, negotiations are now throttled, at a priority higher
  than messages.
* We are now seeing many instances of:

    Send died while transferring insert: CB 0x82 (CB_SEND_CONN_DIED), for
    freenet.node.states.request.TransferInsert:....

  and:

    freenet.node.states.request.TransferReply, QThread-592, NORMAL):
    Upstream node connection died for freenet.node.states.request.TransferReply:
     ......

We are also seeing very large numbers of trailing field sends on nodes
with the current code.
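For what it's worth, the per-tick allowance described above (trailer
sends limited to whatever bandwidth is left, refilled 10 times per
second) amounts to a token bucket. A minimal sketch in Java - all names
are invented for illustration, this is not Freenet's actual code:

```java
// Illustrative token-bucket limiter for trailer sends. The allowance is
// refilled once per 100ms tick (10 refills per second), so trailer
// chunks stay reasonably sized. Hypothetical class, not Freenet's code.
public class TrailerBandwidthLimiter {
    private final int bytesPerSecond;
    private long available;  // bytes left in the current allowance
    private long lastRefill; // time of the last refill, in ms

    public TrailerBandwidthLimiter(int bytesPerSecond, long now) {
        this.bytesPerSecond = bytesPerSecond;
        this.available = bytesPerSecond / 10; // one tick's worth to start
        this.lastRefill = now;
    }

    // Refill the allowance for each elapsed 100ms tick, capped at one
    // second's worth so idle periods don't accumulate huge bursts.
    private void refill(long now) {
        long ticks = (now - lastRefill) / 100;
        if (ticks > 0) {
            available = Math.min(bytesPerSecond,
                                 available + ticks * (bytesPerSecond / 10));
            lastRefill += ticks * 100;
        }
    }

    // Grant up to `wanted` bytes for a trailer send; 0 means "no
    // bandwidth right now", so the trailer waits for the next tick.
    public long grant(long wanted, long now) {
        refill(now);
        long granted = Math.min(wanted, available);
        available -= granted;
        return granted;
    }
}
```

With a 10000 bytes/sec limit, each 100ms tick hands out at most 1000
bytes to trailers, which is why the chunks stay a sensible size.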

These could well be caused by trailers having lower priority than
messages, and therefore being starved. In theory my safeguard should
prevent that. The timeouts on connections are not *that* bad... of
course, if the node is processing very many messages, transfers will
slow down to keep message send times low, but they should still
complete - what we see above is connections getting closed!
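To make the safeguard concrete, here is a rough sketch of the priority
scheme with the 2000ms promotion. Everything here (class names, the
two-queue layout) is invented for illustration; the real send queue
obviously differs:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative send scheduler: messages take priority over trailers,
// but a trailer job that has waited 2000ms is promoted to message
// priority (at the end of the message queue) so it cannot be starved
// indefinitely. Hypothetical names, not Freenet's actual classes.
public class PrioritySendQueue {
    static final long PROMOTE_AFTER_MS = 2000;

    static class SendJob {
        final boolean isTrailer;
        final long enqueuedAt; // ms
        SendJob(boolean isTrailer, long enqueuedAt) {
            this.isTrailer = isTrailer;
            this.enqueuedAt = enqueuedAt;
        }
    }

    private final Deque<SendJob> messages = new ArrayDeque<>();
    private final Deque<SendJob> trailers = new ArrayDeque<>();

    public void enqueue(SendJob job) {
        (job.isTrailer ? trailers : messages).addLast(job);
    }

    // Promote trailer jobs that have waited >= 2000ms to the back of
    // the message queue, then serve messages before trailers.
    public SendJob next(long now) {
        while (!trailers.isEmpty()
               && now - trailers.peekFirst().enqueuedAt >= PROMOTE_AFTER_MS) {
            messages.addLast(trailers.pollFirst());
        }
        if (!messages.isEmpty()) return messages.pollFirst();
        return trailers.pollFirst(); // null if nothing is queued
    }
}
```

Once promoted, a trailer job sits in the message queue, so messages that
arrive after it can no longer jump ahead of it.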

One possible solution:
* Implement multiplexing. Messages and trailers will be sent on the same
  connections, which will not time out.
* Don't limit the number of connections directly. But when NGRouting
  sees that a node is transferring everything slowly, it will take this
  into account in its routing decisions. Especially for large keys (we
  use the transfer rate and the size in the NGR decision, remember?), it
  will route away from heavily loaded nodes unless they are expected to
  be much much better at finding the given key.
* In the long run, we will want to implement non-blocking trailing field
  transfers. This will eliminate attacks that flood a node with very
  slow trailing field transfers to run it out of threads. It will also
  greatly improve nodes' load capacity, and allow us to use 100-thread
  splitfile downloads by default, with their legendary speeds, without
  overwhelming the node and pushing it into queryrejecting. However,
  there is no urgency on this matter.
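The NGRouting point - folding a node's observed transfer rate and the
key's size into the routing decision - can be illustrated with a toy
cost model. The names and the formula here are invented for
illustration, not the actual NGR estimator:

```java
// Toy illustration: expected total time for routing a request to a
// node is its estimated search time plus the time to transfer the key
// at that node's observed rate. A heavily loaded (slow-transferring)
// node loses out unless its search estimate is much better, and the
// effect grows with key size. Invented names, not Freenet's code.
public class RouteEstimate {
    public static double expectedTimeMs(double estSearchTimeMs,
                                        long keySizeBytes,
                                        double transferBytesPerMs) {
        return estSearchTimeMs + keySizeBytes / transferBytesPerMs;
    }
}
```

For a large key, a node transferring at 100 bytes/ms beats a node at
5 bytes/ms even if the slower node's search estimate is several seconds
better - which is exactly the "route away from heavily loaded nodes"
behaviour described above.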

In other words, we need to do exactly what we are doing, although maybe
we should implement multiplexing earlier than planned. Probably before
0.6, definitely before (or concurrent with, NOT AFTER) NIOv2. NGRouting
is probably a more urgent priority, for the reasons given above.

On Thu, Aug 07, 2003 at 11:45:29AM -0700, Ian Clarke wrote:
> The current bandwidth limiting mechanism seems to be causing serious
> problems.  It works by limiting the speed of transmission of data on a
> byte-per-byte basis.  Unfortunately this creates a situation where
> transfers of data occur more slowly, which means they take longer, which
> means that we have more concurrent transfers overall, which slows them
> down even further - and the cycle continues - with the entire Freenet 
> network getting bogged down in a web of extremely slow transfers.
> 
> The alternative is for a node to try to maximize the per-transfer 
> connection speed by rejecting new requests when the upstream connection 
> speed is maxed out.  Some claim that this is a terrible idea and will 
> screw up routing because it will be impossible to get a node to accept a 
> datarequest, but I disagree.  
> 
> Imagine you go to McDonalds and ask a server for some food, they take 
> your order.  Now, you didn't know, but that server is actually serving 
> 20 other people at the same time and consequently it takes you ages to 
> get your food.  Wouldn't it be better if that server said "Sorry Sir, 
> I'm really busy - please try another server".
> 
> In short, by making a node try to service its existing transfers as 
> quickly as it can, it gets them out of the way faster and can thus serve 
> just as many requests as a node which accepts all requests but takes 
> ages to serve each individual one.
> 
> Thoughts?
> 
> Ian.
> 
> -- 
> Ian Clarke                                                [EMAIL PROTECTED]
> Coordinator, The Freenet Project            http://freenetproject.org/
> Weblog                                     http://slashdot.org/~sanity/journal



-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
