Yes, which is why I said discard them when new updates occur.

On Mon, Mar 28, 2011 at 12:03 PM, Melanie <[email protected]> wrote:
> For avatars yes. But prim updates can never be discarded, no matter
> how trivial, because they establish new persistent state.
>
> Melanie
>
> Dahlia Trimble wrote:
> > the viewer discards small changes anyway if avatar imposters are enabled
> >
> > On Mon, Mar 28, 2011 at 11:54 AM, Melanie <[email protected]> wrote:
> >
> >> No, we can't discard small changes. As the avatar comes closer, they
> >> would be seen out of place; e.g. someone building in the distance
> >> would move prims, and then you come closer to look and all the prims
> >> would be out of place.
> >>
> >> Melanie
> >>
> >> Dahlia Trimble wrote:
> >> > A couple of thoughts...
> >> >
> >> > Perhaps the resend timeout period could be a function of the
> >> > throttle setting and/or the measured packet acknowledgement time
> >> > per client (provided we measure it). That might prevent excessive
> >> > resend processing that may not be necessary.
> >> >
> >> > On the distance prioritization, could small changes in object
> >> > translations be discarded from the prioritization queues/resend
> >> > buffers for distant objects when new updates occur for those
> >> > objects? Small changes may not be noticeable from the viewer's
> >> > perspective anyway.
> >> >
> >> > On Mon, Mar 28, 2011 at 10:48 AM, Teravus Ovares <[email protected]> wrote:
> >> >
> >> >> Here are a few facts that I've personally discovered while working
> >> >> with LLClientView.
> >> >>
> >> >> 1. It has been noted that people with poor connections to the
> >> >> simulator consume more bandwidth and CPU, and have a generally
> >> >> worse experience. This has been tested and profiled extensively.
> >> >> It may seem like a small issue because what it's doing is so
> >> >> basic; however, the frequency with which this occurs is a real
> >> >> cause of performance issues.
> >> >>
> >> >> 2. It's also noted that the CPU used in these cases reduces the
> >> >> CPU available to the rest of the simulator, resulting in a lower
> >> >> quality of service for everyone else on the simulator. This has
> >> >> been seen in the profiling and has been qualitatively observed:
> >> >> a large number of users are connected and everything is OK, and
> >> >> then a 'problem connection' user connects and causes a wide range
> >> >> of issues.
> >> >>
> >> >> 3. It's also noted that lowering the outgoing UDP packet throttles
> >> >> beyond a certain point results in perpetual queuing and resends.
> >> >> This was tested using a throttle multiplier implemented last year
> >> >> by justincc; I'm not sure if the multiplier is still there. It's
> >> >> most easily seen with image packets. Again, I note that the
> >> >> packets are not rebuilt when going from the regular outbound queue
> >> >> to the resend queue. The resend queue is /supposed/ to be used to
> >> >> quickly get data that is essential to the client after attempting
> >> >> to send it once already. The UDP spec declares the maximum resend
> >> >> to be 2 times; however, there has been considerable debate on
> >> >> whether or not OpenSimulator should follow that specific
> >> >> specification item, leading to a configuration option to enable
> >> >> perpetual resends (implemented by Melanie). The configuration
> >> >> item was named something like 'reliable is important'. I'm not
> >> >> sure if the configuration item survived the many revisions, but I
> >> >> suspect that it did.
> >> >>
> >> >> 4. It's also noted that raising the packet throttles beyond what
> >> >> the connection can support results in resending almost every
> >> >> packet the maximum number of times before the limit is reached.
> >> >> This is easily reproducible by setting the connection (in the
> >> >> client) to the maximum and connecting, on a sub-par connection, to
> >> >> a region that you've never been to before. Before the client
> >> >> adjusts and requests a lower throttle setting, there is massive
> >> >> data loss and massive re-queuing.
> >> >>
> >> >> 5. The client tries to adjust the throttle settings based on
> >> >> network conditions. This can be observed by monitoring the packet
> >> >> that sets the throttles and dragging the bar to maximum. After a
> >> >> certain number of resends, the client will send the set-throttle
> >> >> packet with reduced settings (some argue that it doesn't do so
> >> >> fast enough).
> >> >>
> >> >> 6. A user who has connected previously to the simulator will use
> >> >> fewer resources than a user who has never connected to the
> >> >> simulator (this is mostly because of the image cache on the
> >> >> client). Any client that uses CAPS images will use fewer
> >> >> resources than one that uses LLUDP.
> >> >>
> >> >> When working with the packet queues, it's essential to understand
> >> >> those 6 observations. Even though the place where you tend to see
> >> >> the issues with queuing is the image queue over LLUDP, the
> >> >> principles apply to all of the UDP queues.
> >> >>
> >> >> Regards
> >> >>
> >> >> Teravus
> >> >>
> >> >> On Mon, Mar 28, 2011 at 1:00 PM, Mic Bowman <[email protected]> wrote:
> >> >> > Over the last several weeks, Dan Lake and I have been looking at
> >> >> > some of the networking performance issues in OpenSim. As always,
> >> >> > our concerns are with the problems caused by very complex scenes
> >> >> > with very large numbers of avatars. However, I think some of the
> >> >> > issues we have found will generally improve networking with
> >> >> > OpenSim.
> >> >> > Since this represents a fairly significant change in behavior
> >> >> > (though the number of lines of code is not great), I'm going to
> >> >> > put it into a separate branch for testing (called queuetest) in
> >> >> > the opensim git repository.
> >> >> >
> >> >> > We've found several problems with the current
> >> >> > networking/prioritization code.
> >> >> >
> >> >> > * Reprioritization is completely broken for SceneObjectParts. On
> >> >> > reprioritization, the current code looks up the localid stored
> >> >> > in the scene's Entities list, but since the scene does not store
> >> >> > the localid for SOPs, that lookup always fails. So the original
> >> >> > priority of the SOP continues to be used. This could be the
> >> >> > cause of some problems, since the initial prioritization assumes
> >> >> > position 128,128. I don't understand all the possible
> >> >> > ramifications, but suffice it to say, using the localid is
> >> >> > causing problems.
> >> >> > Fix: the scene entity is already stored in the update; just use
> >> >> > that instead of the localid.
> >> >> >
> >> >> > * We currently pull (by default) 100 entity updates from the
> >> >> > entity update queue and convert them into packets. Once
> >> >> > converted into packets, they are then queued again for
> >> >> > transmission. This is a bad thing. Under any kind of load, we've
> >> >> > measured the time in the packet queue to be up to many hundreds
> >> >> > or thousands of milliseconds (and to be highly variable). When
> >> >> > an object changes one property and then doesn't change it again,
> >> >> > the time in the packet queue is largely irrelevant. However, if
> >> >> > the object is continuously changing (an avatar changing
> >> >> > position, a physical object moving, etc.), then the conversion
> >> >> > from an entity update to a packet "freezes" the properties to be
> >> >> > sent. If the object is continuously changing, then with fairly
> >> >> > high probability the packet contains old data (the properties of
> >> >> > the entity from the point at which it was converted into a
> >> >> > packet).
> >> >> > The real problem is that, in theory, to improve the efficiency
> >> >> > of the packets (fill up each message) we are grabbing big chunks
> >> >> > of updates. Under load, that causes queuing at the packet layer,
> >> >> > which makes updates stale. That is... queuing at the packet
> >> >> > layer is BAD.
> >> >> > Fix: We implemented an adaptive algorithm for the number of
> >> >> > updates to grab with each pass. We set a target time of 200ms
> >> >> > for each iteration; that means we are trying to bound the
> >> >> > maximum age of any update in the packet queue to 200ms. The
> >> >> > adaptive algorithm looks a lot like TCP slow start: every time
> >> >> > we complete an iteration (flush the packet queue) in less than
> >> >> > 200ms, we increase the number of updates we take in the next
> >> >> > iteration linearly (add 5 to the count), and when we don't make
> >> >> > it back in 200ms, we drop the number we take multiplicatively
> >> >> > (cut the number in half). In our experiments with large numbers
> >> >> > of moving avatars, this algorithm works *very* well. The number
> >> >> > of updates taken per iteration stabilizes very quickly and the
> >> >> > response time is dramatically improved (no "snap back" on
> >> >> > avatars, for example). One difference from traditional slow
> >> >> > start: since the number of "static" items in the queue is very
> >> >> > high when a client first enters a region, we start with the
> >> >> > number of updates taken at 500.
> >> >> > That gets the static items out of the queue quickly (and delay
> >> >> > doesn't matter as much there), and the number taken is generally
> >> >> > stable before the login/teleport screen even goes away.
> >> >> >
> >> >> > * The current prioritization queue can lead to update
> >> >> > starvation. The prioritization algorithm dumps all entity
> >> >> > updates into a single ordered queue. Let's say you have several
> >> >> > hundred avatars moving around in a scene. Since we take a
> >> >> > limited number of updates from the queue in each iteration, we
> >> >> > will take only the updates for the "closest" (highest priority)
> >> >> > avatars. However, since those avatars continue to move, they are
> >> >> > re-inserted into the priority queue *ahead* of the updates that
> >> >> > were already there. So... unless the queue can be completely
> >> >> > emptied each iteration, or the priority of the "distant" (low
> >> >> > priority) avatars changes, those avatars will never be updated.
> >> >> > Fix: We converted the single priority queue into multiple
> >> >> > priority queues and use fair queuing to retrieve updates from
> >> >> > each. Here's how it works (more or less)... the current metrics
> >> >> > (all of the current prioritization algorithms use distance at
> >> >> > some point) compute a distance from the avatar/camera to an
> >> >> > object. We take the log of that distance and use it as the index
> >> >> > of the queue where we place the update. So close things go into
> >> >> > the highest priority queue and distant things go into the lowest
> >> >> > priority queue. Since the area covered by a priority queue grows
> >> >> > as the square of the radius, the distant (lowest priority)
> >> >> > queues will have the most objects, while the highest priority
> >> >> > queues will have a small number of objects. Inside each priority
> >> >> > queue, we order the updates by the time at which they entered
> >> >> > the queue. Then we pull a fixed number of updates from each
> >> >> > priority queue each iteration. The result is that local updates
> >> >> > get a high fraction of the outgoing bandwidth, but distant
> >> >> > updates are guaranteed to get at least "some" of the bandwidth.
> >> >> > No starvation. The prioritization algorithm we implemented is a
> >> >> > modification of "best avatar responsiveness" and "front back",
> >> >> > in that we use the root prim location for child prims, and the
> >> >> > priority of updates "in back of" the avatar is lower than that
> >> >> > of updates "in front". Our experiments show that the fair
> >> >> > queuing does drain the update queue AND continues to provide a
> >> >> > disproportionately high percentage of the bandwidth to "close"
> >> >> > updates.
> >> >> > One other note on this... we should be able to improve the
> >> >> > performance of reprioritization with this approach. If we know
> >> >> > the distance an avatar has moved, we only have to reprioritize
> >> >> > objects that might have changed priority queues. We haven't
> >> >> > implemented this yet but have some ideas for how to do it.
> >> >> >
> >> >> > * The resend queue is evil. When an update packet is sent
> >> >> > (update packets are marked reliable), it is moved to a queue to
> >> >> > await acknowledgement. If no acknowledgement is received in
> >> >> > time, the packet is retransmitted, the wait time is doubled, and
> >> >> > so on... What that means is that a resent packet in a rapidly
> >> >> > changing scene will often contain updates that are outdated.
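The log-distance fair-queuing scheme described above could look something like this Python sketch (invented names and band counts; the real queuetest code is C# and its constants may differ):

```python
import heapq
import math
from itertools import count

NUM_QUEUES = 12          # priority bands, indexed by log2(distance)
TAKE_PER_QUEUE = 20      # fixed share pulled from each band per iteration

_seq = count()           # global enqueue order: FIFO within each band
queues = [[] for _ in range(NUM_QUEUES)]

def enqueue(update, distance):
    """Index the band by the log of the avatar/camera-to-object
    distance, so close objects land in the high-priority bands."""
    band = min(NUM_QUEUES - 1, max(0, int(math.log2(max(distance, 1.0)))))
    heapq.heappush(queues[band], (next(_seq), update))

def take_batch():
    """Fair queuing: pull up to TAKE_PER_QUEUE updates from *every*
    band, oldest first, so distant objects are never starved."""
    batch = []
    for q in queues:
        for _ in range(min(TAKE_PER_QUEUE, len(q))):
            batch.append(heapq.heappop(q)[1])
    return batch
```

Because the band index is logarithmic, close objects occupy many narrow high-priority bands while the vast distant population shares a few wide low-priority bands, which is what gives "close" updates a disproportionate share of the bandwidth without starving the rest.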
> >> >> > That is, when we resend the packet, we are just resending old
> >> >> > data (and if you're having a lot of resends, that means you
> >> >> > already have a bad connection, and now you're filling it up with
> >> >> > useless data).
> >> >> > Fix: this isn't implemented yet (help would be appreciated)...
> >> >> > we think that instead of saving packets for resend, a better
> >> >> > solution would be to keep the entity updates that went into the
> >> >> > packet. If we don't receive an ack in time, we put the entity
> >> >> > updates back into the entity update queue (with the entry time
> >> >> > from their original enqueuing). That would ensure that we send
> >> >> > an update for the object AND that the data sent is the most
> >> >> > recent.
> >> >> >
> >> >> > * One final note... per-client bandwidth throttles seem to work
> >> >> > very well. However, our experiments with per-simulator throttles
> >> >> > were not positive: it appeared that a small number of clients
> >> >> > consumed all of the bandwidth available to the simulator and the
> >> >> > rest were starved. We haven't looked into this any more.
> >> >> >
> >> >> > So... feedback appreciated. There is some logging code
> >> >> > (disabled) in the branch; real data would be great. And help
> >> >> > testing: there are a number of attachment, delete, and other
> >> >> > operations that I'm not sure work correctly.
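The not-yet-implemented resend fix Mic proposes (remember which entity updates went into a packet and re-enqueue *them* on ack timeout, rather than retransmitting stale packet bytes) might be sketched like this. All names here are hypothetical, including the `reenqueue` hook back into the entity update queue:

```python
import time

pending_acks = {}   # sequence number -> (deadline, [(entry_time, entity_id)])

def on_packet_sent(seq, updates, timeout=1.0):
    """Record the entity updates carried by a reliable packet."""
    pending_acks[seq] = (time.monotonic() + timeout, updates)

def on_ack(seq):
    """Ack received: those updates no longer need resending."""
    pending_acks.pop(seq, None)

def expire_unacked(reenqueue):
    """On ack timeout, re-enqueue the entity updates with their original
    entry times, so the *latest* entity state gets serialized and sent
    instead of the old packet contents."""
    now = time.monotonic()
    expired = [s for s, (deadline, _) in pending_acks.items() if deadline <= now]
    for seq in expired:
        _, updates = pending_acks.pop(seq)
        for entry_time, entity_id in updates:
            reenqueue(entity_id, entry_time)
```

Keeping the original entry time preserves the update's place in the age-ordered priority queues while guaranteeing the re-sent data is fresh.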
> >> >> > --mic
_______________________________________________
Opensim-dev mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/opensim-dev
