On Jun 13, 2009, at 12:32 PM, Matthew Toseland wrote:

> We might want to wait until we have finished with the opennet  
> connection limit changes, but IMHO this is a good idea too.
>
> Basically, requests would have a flag, which is either bulk or  
> realtime.
>
> Currently, we only allow new requests if our current requests can be  
> completed within available bandwidth - assuming they all succeed -  
> within 90 seconds. It is very rare that they all succeed, in fact a  
> lot of requests fail, but across a route spanning 10 hops it is  
> likely there is one node which is bogged down with lots of transfers.
>
> For bulk requests, we increase the transfer threshold to 120 seconds  
> or maybe even 300 seconds. This will optimise throughput.
>
> For realtime requests, we reduce the transfer threshold to maybe 20  
> seconds, severely limiting the number of requests but ensuring they  
> all complete fast. Any incoming realtime transfer that takes more than  
> 20 seconds is turtled (at which point it becomes a bulk request). Data  
> blocks for realtime requests take precedence over data blocks for  
> bulk requests. We would need to ensure that the data for the bulk  
> requests does get transferred, and the realtime requests don't  
> constantly starve the bulk requests. This would require a token  
> bucket or something similar to limit the proportion of bandwidth  
> used by realtime requests, which would need to be relative to the  
> available/used bandwidth and not necessarily to the limit.
>
> Fproxy would use realtime requests. Persistent downloads would use  
> bulk requests. Big files being fetched in fproxy after asking the  
> user might use bulk requests.
>
> All this assumes the probability of a CHK request succeeding (in the  
> region of 10% at the moment) doesn't dramatically rise with Bloom  
> filter sharing. Maybe we should put it off until after that?

I would definitely put this off till after Bloom filter sharing, but I  
don't think I agree with the implementation.

A simpler solution...

The present *algorithm* is optimized for throughput, and rightly so, as  
traffic analysis becomes easier at low latencies. Go ahead and  
implement the realtime flag, but implement it as a transfer mode: one  
that temporarily suspends *all* other transfers. Thus we would have a  
true low-latency request.

If we find this suspension would make any of those transfers time out,  
then reject the incoming low-latency request.

You would have to check for this anyway; with two throttles (as you  
mentioned) there is still the potential for transfers to run over the  
'agreed' time budget, if the realtime traffic manages to add an extra  
30 seconds to a request.
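
Roughly, the acceptance check I have in mind would be something like  
the following (only a sketch of the idea; the class and method names  
are made up, not fred's actual API):

    import java.util.Collection;

    // Hypothetical stand-in for whatever we track per in-flight transfer.
    interface InFlightTransfer {
        long projectedFinishMillis(long now); // when we currently expect it to complete
        long deadlineMillis();                // when it would count as timed out
    }

    class RealtimeAdmission {
        /** Accept the incoming low-latency request only if suspending every
         *  other transfer for its duration would not push any of them past
         *  its deadline. */
        static boolean shouldAccept(Collection<InFlightTransfer> inFlight,
                                    long realtimeBytes, double outputBytesPerSec) {
            long pauseMillis = (long) (1000 * realtimeBytes / outputBytesPerSec);
            long now = System.currentTimeMillis();
            for (InFlightTransfer t : inFlight) {
                if (t.projectedFinishMillis(now) + pauseMillis > t.deadlineMillis())
                    return false; // the suspension would make this transfer time out
            }
            return true;
        }
    }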

Some simple care must be taken (both points are sketched in code below):
1) a rejected realtime request should not be cause for backoff (as that  
would adversely affect general transfers, and it is probably the fault  
of the requesting node anyway [for queueing too many normal transfers]).
2) enforce that we do not accept a second realtime request from a peer  
currently in a realtime transfer.
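
In code the two precautions might look something like this (again,  
hypothetical names, just to make the point concrete):

    // Hypothetical stand-in for the per-peer state; not fred's real PeerNode.
    interface Peer {
        boolean hasRealtimeTransferInFlight();
        void enterBackoff();
    }

    class RealtimePrecautions {
        // (1) Requesting side: a rejected realtime request is not grounds for
        //     backoff; it is probably our own fault for queueing too much.
        static void onRequestRejected(Peer peer, boolean realtimeFlag) {
            if (realtimeFlag)
                return;
            peer.enterBackoff();
        }

        // (2) Accepting side: at most one realtime transfer per peer at a time.
        static boolean mayAcceptRealtimeFrom(Peer peer) {
            return !peer.hasRealtimeTransferInFlight();
        }
    }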

-

The harder (and more general) solution would be to add a latency field  
to the request, carrying how "long" we are willing to wait for the  
transfer to complete.
When forwarding a low-latency request, we deduct the time elapsed since  
we received the message [and possibly the expected link/transfer time].  
Which is to say, its time requirement becomes harder to meet each time  
it is forwarded and fails, until eventually every node would reject a  
request whose remaining transfer time has reached zero.
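
Something like this, with 'budget' being the new latency field carried  
in the message (hypothetical names, just a sketch):

    // Hypothetical sketch of deducting elapsed time before forwarding.
    class LatencyBudget {
        /** Budget left for the next hop: what the message carried, minus the
         *  time we have already held it, minus (possibly) the expected
         *  link/transfer time. */
        static long remainingMillis(long budgetInMessage, long receivedAt,
                                    long expectedLinkAndTransferMillis) {
            long elapsed = System.currentTimeMillis() - receivedAt;
            return budgetInMessage - elapsed - expectedLinkAndTransferMillis;
        }

        /** A request whose budget has reached zero would be rejected by everyone. */
        static boolean mayForward(long budgetInMessage, long receivedAt,
                                  long expectedLinkAndTransferMillis) {
            return remainingMillis(budgetInMessage, receivedAt,
                                   expectedLinkAndTransferMillis) > 0;
        }
    }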

Then, to mirror your suggestion, start bulk requests at 90 seconds and  
low-latency requests at 20 seconds. Transfer packets "earliest deadline  
first."
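
"Earliest deadline first" is then just a priority queue over the queued  
blocks, e.g. (sketch only; PendingPacket is made up):

    import java.util.Comparator;
    import java.util.PriorityQueue;

    // Hypothetical: each queued block is tagged with the deadline of the
    // request it belongs to (bulk starting at 90s, realtime at 20s, less
    // whatever has already been used up upstream).
    class PendingPacket {
        final byte[] payload;
        final long deadlineMillis;
        PendingPacket(byte[] payload, long deadlineMillis) {
            this.payload = payload;
            this.deadlineMillis = deadlineMillis;
        }
    }

    class EdfSendQueue {
        private final PriorityQueue<PendingPacket> queue =
            new PriorityQueue<PendingPacket>(11, new Comparator<PendingPacket>() {
                public int compare(PendingPacket a, PendingPacket b) {
                    return Long.compare(a.deadlineMillis, b.deadlineMillis);
                }
            });
        void enqueue(PendingPacket p) { queue.add(p); }
        PendingPacket nextToSend()    { return queue.poll(); } // nearest deadline first
    }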

But this would require unfair queueing, which I believe requires token  
passing...

<begin plug for token passing>

While I agree with the bulk/realtime flag (link-level), I do think  
that your solution is going down the wrong path... basically  
duplicating an implementation whose performance is already  
questionable. I believe what you are trying to mend is the perceived  
user experience, which I believe rests squarely on the "ethernet-ness"  
of requests being accepted or rejected. Tokens (or... assurance of a  
request not being rejected) let us implement fair queueing; and if  
implemented right, that could include your idea of multiple target  
transfer times.

I strongly doubt that creating a second-but-tighter window would  
greatly improve the end-user experience. With the ethernet effect, and  
with fewer requests accepted (by design), the user would simply have a  
smaller chance that the page appears at all, though it would appear  
more quickly when it does. Rather than having 10% succeed within 90  
seconds, we might have 1% succeed within 20 seconds, likely pushing the  
perceived benefit (what you are really trying to solve) further away.

I suggest a move to 'request-tokens', be it explicit token granting/ 
passing or somehow implicit. This would also let us move forward with  
any kind of load balancing.

I have an incomplete sketch of how I think this should work. While  
simple in theory (arguably simpler than the current system), it would  
require a partial rewrite of the load/bandwidth system. Basically, a  
single request-granting thread would 'bless' peers in a round-robin  
fashion, skipping peers which already have full token buckets, are over  
their bandwidth throttle, etc.
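
A sketch of that granting thread (all names hypothetical; pacing  
against the global bandwidth budget is omitted):

    import java.util.List;

    // Hypothetical per-peer view; not fred's real classes.
    interface TokenPeer {
        boolean bucketFull();            // already holds its maximum of request tokens
        boolean overBandwidthThrottle(); // currently over its share of the throttle
        void grantRequestToken();        // give permission for one more request
    }

    class RequestTokenGranter implements Runnable {
        private final List<TokenPeer> peers;
        private int next = 0;
        RequestTokenGranter(List<TokenPeer> peers) { this.peers = peers; }
        public void run() {
            // Single thread 'blessing' peers round-robin, skipping peers with
            // full buckets or over their bandwidth throttle.
            while (!Thread.currentThread().isInterrupted()) {
                TokenPeer p = peers.get(next);
                next = (next + 1) % peers.size();
                if (!p.bucketFull() && !p.overBandwidthThrottle())
                    p.grantRequestToken();
                // (sleep/pace here according to how much capacity we have)
            }
        }
    }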

The primary concern I presently have is the case of peers which are  
rigged to not grant request tokens. For this I was thinking that a  
successful request would 'automatically' earn you a request token  
(implicit/token not passed?). Not sure... and that leads to the  
question of how a node sets the maximum bucket size for a peer (the  
number of allowable concurrent requests), which could be a constant, or  
could tend to be smaller towards 'leaf' nodes.
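
The implicit version might be no more than a per-peer bucket that  
refills on success (sketch only; the per-peer maximum could be a  
constant or smaller for leaf nodes):

    // Hypothetical sketch of the implicit 'success earns a token' idea.
    class RequestTokenBucket {
        private final int maxTokens; // max concurrent requests we allow this peer
        private int tokens;
        RequestTokenBucket(int maxTokens) {
            this.maxTokens = maxTokens;
            this.tokens = maxTokens;
        }
        synchronized boolean tryStartRequest() {
            if (tokens == 0) return false; // reject; peer has used its allowance
            tokens--;
            return true;
        }
        synchronized void onRequestCompletedSuccessfully() {
            if (tokens < maxTokens) tokens++; // the success 'automatically' earns one back
        }
    }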

--
Robert Hailey

