This was actually happening at quite low loads (< 40 connections over all 4 keynotes). Once adaptive throttles were disabled and other unrelated issues were fixed, the system had no obvious issues coping with higher loads, both in testing and at the conference itself (e.g. the peak of 159 keynote avatars). So I don't think it was a server bandwidth issue.
That said, it was somewhat strange behaviour as it affected only maybe 10-20% of connections. Once it did affect a connection (I saw this happening by logging downward adjustments, which one can still do with the console command "debug lludp throttles log 1"), the connection would not recover - at some point a bunch of expiries would reduce the throttle again. Connections seemed to be affected randomly - I experienced the issue myself at one point and I have pretty solid fibre.
You're right in that I don't know why this happened or why problematic connections stayed problematic instead of slowly recovering. Because of time constraints we had to disable adaptive throttling instead of investigating further. But I don't advocate doing this by default at all because, as you say, it's an important mechanism for congestion control.
I do plan to investigate further at some point, but it's time-consuming work and I'd really love to get a release out soon-ish. So for the moment I would like to tune the adaptation mechanism as you've mentioned, which I believe should probably be done anyway. Because of the nature of the problem, my plan is not to change the adaptation divisor but rather to adapt downwards only every 2 seconds or so while packets are expiring, rather than on every packet expiry. I believe this should still achieve the adaptation effect without massively penalising the connection if there has been a momentary connection issue or similar.
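
In code terms, the change I have in mind would look something like the sketch below. This is illustrative only - the class and member names and the minimum rate are made up for the example, not the actual LindenUDP code:

    using System;

    // Sketch only: rate-limit the downward adaptation so that a burst of
    // expiries halves the throttle once rather than many times.
    class AdaptiveThrottleSketch
    {
        const int MinDripRate = 1400;        // bytes/sec floor (made up)
        const int AdaptIntervalMs = 2000;    // halve at most once per ~2s

        int m_dripRate = 125000;             // current send rate, bytes/sec
        int m_lastAdaptTick = Environment.TickCount - AdaptIntervalMs;

        // Called whenever a reliable packet expires without being acked.
        public void OnPacketExpired()
        {
            int now = Environment.TickCount;

            // Ignore further expiries inside the window, so one momentary
            // glitch can't drive the throttle to the minimum.
            if (now - m_lastAdaptTick < AdaptIntervalMs)
                return;

            m_lastAdaptTick = now;
            m_dripRate = Math.Max(MinDripRate, m_dripRate / 2);
        }
    }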
On 26/11/14 02:39, Mic Bowman wrote:
As you mention... cutting the throttle by 50% was modeled on the TCP congestion control approach. It is very aggressive as a congestion control mechanism and certainly could be tuned.
That being said... do you know why the packets were considered un-acked? If it's because the simulator is having problems (which, given your description that it happens under load, seems to be the case) then we can probably do something more intelligent about throttling the overall simulator BW. That is... maybe the issue is the top end of the overall simulator bw, not the per-connection throttles.
Manual throttles & adaptive throttles are not exclusive. You can use both. Adaptive manages the top end, but the manual throttles set an absolute max.
--mic
On Tue, Nov 25, 2014 at 5:15 PM, Justin Clark-Casey <[email protected]> wrote:
Hi Mic (primarily),
Two years ago [1] we had a discussion about the enable_adaptive_throttles setting. Just for background, this is a setting that adapts the amount of data sent to the viewer depending on whether reliable packets sent from the simulator are acked or not. As such, it looks to make sure that a viewer which sets a downstream bandwidth higher than its network connection can cope with is not permanently hosed with too much data. We enabled it on an experimental basis [2].
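
For anyone following along, this is toggled in the LindenUDP client stack section of the config; something like the following in OpenSimDefaults.ini (section and option name from memory, so please double-check against your tree):

    [ClientStack.LindenUDP]
        ; adapt each client's send rate based on acks of reliable packets
        enable_adaptive_throttles = true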
As you said at the time, this is modelled on the congestion control approach used in TCP. I see that for TCP, the rate is halved on every unacked segment. In OpenSimulator, it's halved on every unacked reliable packet.
However, under fairly modest load conditions in the conference grid, I saw a behaviour where a sequence of packets would sometimes expire for a connection in a very short time period (< 1 sec). This would halve the throttle many times, in my observations right down to the absolute minimum. From the user's point of view this degraded behaviour considerably for an extended period of time, since the throttle takes quite a long time to grow again.
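
To put rough numbers on it: with one halving per expiry, ten expiries inside a second take a 1000 kbit/s throttle down to 1000 / 2^10, i.e. roughly 1 kbit/s - effectively the floor - from a single burst of loss.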
I didn't get much further with the diagnostics since a lack of time forced us to switch back to manual throttling instead (with 1 mbit per viewer and 400 mbit total on the keynotes). This seemed to work okay in testing and in the event itself. However, this leaves one vulnerable to the problem adaptive_throttles looks to tackle in the first place.
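
For reference, the manual fallback corresponded to something like the following (option names from memory, and if I recall correctly the *_bps values are in bytes/sec, so 1 mbit is ~125000 and 400 mbit is ~50000000):

    [ClientStack.LindenUDP]
        enable_adaptive_throttles = false
        client_throttle_max_bps = 125000      ; ~1 mbit per viewer
        scene_throttle_max_bps = 50000000     ; ~400 mbit across the region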
I'm still reading up about this stuff, but it strikes me that halving the throttle on every missed packet is much harsher than the TCP approach, since with UDP a whole sequence can expire at once rather than a single segment that is subsequently retried before another segment can be missed. One idea is to ignore all expiries in a certain period (e.g. the next 2 seconds) if an expired packet has already caused the throttle to be halved. Of course, this is a bit more complicated to do, but hopefully not too much so. What do you think? Any other ideas?
[1] http://opensimulator.org/pipermail/opensim-dev/2011-October/023017.html
[2] http://opensimulator.org/pipermail/opensim-dev/2011-October/023063.html
Best Regards,
--
Justin Clark-Casey (justincc)
OSVW Consulting
http://justincc.org
http://twitter.com/justincc
--
Justin Clark-Casey (justincc)
OSVW Consulting
http://justincc.org
http://twitter.com/justincc
_______________________________________________
Opensim-dev mailing list
[email protected]
http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev