and, just to be clear... did you have *both* adaptive and total bw throttles turned on?

the interaction between the two through the hierarchical token bucket is
another place where i was more than a little worried. i tested that with
network emulators under high load & it seemed to do what it was supposed
to do, but i wouldn't be surprised to find a timing issue.

--mic

On Mon, Dec 1, 2014 at 8:42 AM, Mic Bowman <[email protected]> wrote:

> one thing that i was concerned about when i put the throttles in place
> is the relationship between congestion control and packet sizes. if
> you're generating a large number of small, reliable packets that are
> being dropped, that could cause the congestion control to kick in more
> quickly. that would suggest an adjustment based on bytes sent rather
> than time (though both are probably appropriate).
>
> my biggest concern is that we start fixing by "stabbing in the dark".
> congestion control is particularly nasty in how it interacts, which is
> why i started with a well-known & long battle-tested algorithm. making
> random changes might fix one problem and introduce a half dozen others.
>
> i'm not in a position to help on the diagnosis until next week, if you
> can wait until then.
>
> --mic
>
> On Wed, Nov 26, 2014 at 4:04 PM, Justin Clark-Casey
> <[email protected]> wrote:
>
>> This was actually happening at quite low loads (< 40 connections over
>> all 4 keynotes). Once adaptive throttles were disabled and other
>> unrelated issues were fixed, the system had no obvious issues coping
>> with higher loads in both testing and the conference itself (e.g. the
>> 159 peak keynote avatars in the conference). So I don't think it was a
>> server bandwidth issue.
>>
>> That said, it was somewhat strange behaviour as it affected only maybe
>> 10-20% of connections. Once it did affect a connection (I saw this
>> happening by logging downward adjustments, which one can still do with
>> the console command "debug lludp throttles log 1"), the connection
>> would not recover - at some point a bunch of expiries would reduce the
>> throttle again. Connections seemed to be affected randomly - I
>> experienced the issue myself at one point and I have pretty solid
>> fibre.
>>
>> You're right in that I don't know why this happened or why problematic
>> connections stayed problematic instead of slowly recovering. Because
>> of time constraints we had to disable adaptive throttles instead of
>> investigating further. But I don't advocate doing this by default at
>> all because, as you say, it's an important mechanism for congestion
>> control.
>>
>> I do plan for further investigation to happen at some point, but it's
>> time-consuming work and I'd really love to get a release out soon-ish.
>> So for the moment I would like to tune the adaptation mechanism as
>> you've mentioned, which I believe should probably be done anyway.
>> Because of the nature of the problem, my plan would be not to change
>> the adaptation divisor but rather to adapt downwards only every 2
>> seconds or so if packets are expiring, rather than on every packet
>> expiry. I believe this should still achieve the adaptation effect
>> without massively penalising the connection if there has been a
>> momentary connection issue or similar.
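
A minimal sketch of the adjustment Justin proposes above - halving the
throttle at most once per two-second window while packets are expiring,
and growing it back gradually on acks - might look something like the
following. This is illustrative Python with invented names
(AdaptiveThrottle, on_packet_expired, and so on), not OpenSimulator's
actual C# LLUDP throttle code, and the constants are only placeholders.

    import time

    class AdaptiveThrottle:
        """Illustrative per-connection throttle: multiplicative decrease
        on packet expiry, additive increase on ack, with a cooldown so
        that a burst of expiries only halves the rate once."""

        def __init__(self, max_rate_bps, min_rate_bps=6250,
                     decrease_cooldown_secs=2.0, increase_step_bps=1000):
            self.max_rate = max_rate_bps        # manual/absolute maximum
            self.min_rate = min_rate_bps        # floor the rate never drops below
            self.cooldown = decrease_cooldown_secs
            self.step = increase_step_bps       # growth per acked packet
            self.rate = max_rate_bps            # current adapted rate
            self.last_decrease = float("-inf")  # time of the last halving

        def on_packet_expired(self):
            now = time.monotonic()
            # Ignore further expiries until the cooldown window has
            # passed, so one burst of expiries cannot halve the rate
            # many times over.
            if now - self.last_decrease < self.cooldown:
                return
            self.rate = max(self.min_rate, self.rate / 2)
            self.last_decrease = now

        def on_packet_acked(self):
            # Slow recovery back towards the manual maximum.
            self.rate = min(self.max_rate, self.rate + self.step)

With per-expiry halving, ten expiries arriving inside a second cut the
rate ten times and pin it at the floor; with the cooldown the same burst
halves it once, which is the effect described above.
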
>> On 26/11/14 02:39, Mic Bowman wrote:
>>
>>> As you mention... cutting the throttle by 50% was modeled on the TCP
>>> congestion control approach. It is very aggressive as a congestion
>>> control mechanism and certainly could be tuned.
>>>
>>> That being said... do you know why the packets were considered
>>> un-acked? If it's because the simulator is having problems (which,
>>> given your description that it happens under load, seems to be the
>>> case) then we can probably do something more intelligent about
>>> throttling overall simulator BW. That is... maybe the top end of the
>>> overall simulator bw is the problem, not the per-connection
>>> throttles.
>>>
>>> Manual throttles & adaptive throttles are not exclusive. You can use
>>> both. Adaptive manages the top end, but the manual throttles set an
>>> absolute max.
>>>
>>> --mic
>>>
>>> On Tue, Nov 25, 2014 at 5:15 PM, Justin Clark-Casey
>>> <[email protected]> wrote:
>>>
>>> Hi Mic (primarily),
>>>
>>> Two years ago [1] we had a discussion about the
>>> enable_adaptive_throttles setting. Just for background, this is a
>>> setting that adapts the amount of data sent to the viewer depending
>>> on whether reliable packets sent from the simulator are acked or
>>> not. As such, it looks to make sure that a viewer which sets a
>>> downstream bandwidth higher than its network connection can cope
>>> with is not permanently hosed with too much data. We enabled it on
>>> an experimental basis [2].
>>>
>>> As you said at the time, this is modelled on the congestion control
>>> approach used in TCP. I see that for TCP, the rate is halved on
>>> every unacked segment. In OpenSimulator, it's halved on every
>>> unacked reliable packet.
>>>
>>> However, under fairly modest load conditions in the conference grid,
>>> I saw a behaviour where, for some connections, a sequence of packets
>>> would expire in a very short time period (< 1 sec). This would halve
>>> the throttle many times, in my observations right down to the
>>> absolute minimum. From the user's point of view, behaviour degraded
>>> considerably for an extended period of time, since the throttle
>>> takes quite a long time to grow again.
>>>
>>> I didn't get much further with the diagnostics since a lack of time
>>> forced us to switch back to manual throttling instead (with 1 mbit
>>> per viewer and 400 mbit total on the keynotes). This seemed to work
>>> okay in testing and in the event itself. However, this leaves one
>>> vulnerable to the problem adaptive_throttles looks to tackle in the
>>> first place.
>>>
>>> I'm still reading up about this stuff, but it strikes me that
>>> halving the throttle on every missed packet is much harsher than the
>>> TCP approach, as with UDP a whole sequence can expire at once rather
>>> than a single segment that is subsequently retried before another
>>> segment can be missed.
>>>
>>> One idea is to ignore all expiries in a certain period (e.g. the
>>> next 2 seconds) if an expired packet has already caused the throttle
>>> to be halved. Of course, this is a bit more complicated to do but
>>> hopefully not too much so. What do you think? Any other ideas?
>>>
>>> [1] http://opensimulator.org/pipermail/opensim-dev/2011-October/023017.html
>>> [2] http://opensimulator.org/pipermail/opensim-dev/2011-October/023063.html
>>>
>>> Best Regards,
>>>
>>> --
>>> Justin Clark-Casey (justincc)
>>> OSVW Consulting
>>> http://justincc.org
>>> http://twitter.com/justincc
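
To make the relationship Mic describes above a little more concrete -
adaptive and manual throttles working together, with each connection
bounded by a manual maximum and all connections together bounded by an
overall simulator cap - here is an illustrative two-level token bucket
in the same Python pseudocode style. The class and parameter names are
invented for the sketch, and the figures are just the 1 mbit / 400 mbit
values mentioned in the thread; this is not the actual LLUDP
implementation.

    import time

    class TokenBucket:
        """Illustrative token bucket. A child bucket must also drain its
        parent, so per-connection buckets share an overall
        simulator-wide budget (a two-level hierarchy)."""

        def __init__(self, rate_bps, burst_bytes, parent=None):
            self.rate = rate_bps / 8.0      # refill rate in bytes/second
            self.burst = burst_bytes        # maximum accumulated tokens
            self.tokens = float(burst_bytes)
            self.parent = parent
            self.last_refill = time.monotonic()

        def _refill(self):
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + self.rate * (now - self.last_refill))
            self.last_refill = now

        def remove(self, nbytes):
            """Try to take nbytes; both this bucket and its parent (the
            simulator-wide cap) must have enough tokens for the send."""
            self._refill()
            if self.tokens < nbytes:
                return False
            if self.parent is not None and not self.parent.remove(nbytes):
                return False
            self.tokens -= nbytes
            return True

    # Overall simulator cap (the 400 mbit figure used on the keynotes) ...
    simulator_bucket = TokenBucket(400_000_000, burst_bytes=500_000)

    # ... and one per-connection bucket. Its rate is what the adaptive
    # mechanism would adjust downwards and upwards, never above the
    # manual 1 mbit per-viewer maximum.
    client_bucket = TokenBucket(1_000_000, burst_bytes=25_000,
                                parent=simulator_bucket)
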
