+1 from me. For anyone getting into hosting regions, having this kind of information is critical for proper load balancing and for keeping a good picture of the health of all the servers.
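To make the AssetCache example from Teravus's mail (quoted below) a bit more concrete, here is a rough sketch of the idea in Python rather than C#. It is not OpenSimulator code: the names InstrumentedAssetCache, on_open_requests_changed, StatsReporter and update_pending_downloads are all made up for illustration, and a lock-protected counter stands in for the "interlocked int". The cache counts its open external requests and raises an event on every change, and a stats reporter subscribes to that event.

```python
import threading


class InstrumentedAssetCache:
    """Hypothetical stand-in for an AssetCache; not the real OpenSimulator class."""

    def __init__(self, fetch):
        self._fetch = fetch            # callable that hits the external data source
        self._lock = threading.Lock()  # plays the role of the interlocked int
        self._open_requests = 0
        self._listeners = []

    def on_open_requests_changed(self, handler):
        """Register a handler that is called with the new open-request count."""
        self._listeners.append(handler)

    def _adjust(self, delta):
        # Update the counter under the lock, then notify outside of it so a
        # slow handler cannot block other requests from being counted.
        with self._lock:
            self._open_requests += delta
            current = self._open_requests
        for handler in list(self._listeners):
            handler(current)

    def get(self, asset_id):
        self._adjust(+1)
        try:
            return self._fetch(asset_id)
        finally:
            # Decrement even if the external fetch raised.
            self._adjust(-1)


class StatsReporter:
    """Hypothetical stand-in for the stats reporter that feeds the client."""

    def __init__(self):
        self.pending_downloads = 0

    def update_pending_downloads(self, count):
        self.pending_downloads = count
```

Wiring it up would just be cache.on_open_requests_changed(stats.update_pending_downloads). The point is that the counter and the event are part of the cache's design from the start, so the reporter never has to reach into the cache's internals.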
On Fri, Nov 27, 2009 at 8:21 PM, Kyle <[email protected]> wrote:
> Implementation is over my head but as a former hardware technician & old
> enough to have troubleshot mainframes by tape reels and blinking red lights
> I lived by front panel indicators to help me quickly isolate issues and
> train operators on what trouble indicators to look for to catch issues
> before they caused downtime.
>
> So to me this is a must have to help diagnose and improve stability and
> uptime-Brilliant!....+1
>
> Kyle G
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Teravus Ovares
> Sent: Friday, November 27, 2009 9:10 PM
> To: [email protected]
> Subject: [Opensim-dev] Designing with Instrumentation in mind.
>
> Hey there,
>
> A while back, we had somewhat reasonable statistics being generated
> and presented to the client. They were not always accurate, but
> based on what I saw, I could, pretty much pin certain parts of the
> simulator as the limiting factor during load tests. I'd say, the
> number 1 reason that they were semi-accurate and not accurate.. in
> the past.. is because nobody ever thought about instrumentation
> during the functionality design. It was always 'tacked on later'.
> One example of this.. is the current AssetCache implementation.
> There's no way, currently, to know, at a glance.. how many
> external requests it has open. Additionally, it will be extremely
> difficult to put one in because of the way the objects are designed
> and accessed. To put one in, an event needs to be added to the
> IAssetService interface and each AssetCache implementation will need
> an interlocked int to count how many gets and puts it currently has
> open to the external data source as well as it's own event calling
> schedule. Then, the IAssetService property in Scene, (AssetService)
> will need an event handler.. which updates the values in
> SimStatsReporter in Scene (StatsReporter). This idea of external
> access resource instrumentation should really have been built in to
> the design of the AssetService.
>
> This last recent load test, there were no real statistics that I could
> use to determine what the limiting factor was.
> Time Dilation was pegged at 1.0.. even when the simulator was
> obviously struggling. Total Frame time (MS) was -50ms even when the
> simulation MS was 850ms and the Physics ms was 250ms, so the
> inconsistencies made it impossible to know what part of the simulator
> was struggling. Agent Updates were erratic.. sometimes high..
> sometimes low when the simulator was fine and when it was struggling.
> Pending Uploads and Downloads were always 0, so there was no way to
> know how well the simulator was downloading and uploading assets to
> and from the grid. Packet stats were non-existant, so there was no
> way to know how well the UDP handlers were faring under the load.
> When it crashed, it crashed with a mono based stack trace which
> pointed to out of memory errors, so the only way that you could,
> scientifically, find out what the issue is.. is to run a load test
> under a memory profiler. We know, that running a public load test
> under a memory profiler is quite impractical.
>
> To make something better, I need to know two things, where it is, and
> where I want it to be. How can we make OpenSimulator better if we
> don't have statistics that point to where we are currently?
>
> On that note, I propose that, when designing objects for functionality
> in OpenSimulator, that we also consider if the objects should be
> instrumented and, what would be the best way to go about instrumenting
> the objects. We should incorporate instrumentation into the design of
> the objects. Some of that instrumentation is appropriate for a
> client to see, some of it might not be. Consider that, many of them
> should be client facing and be included in the SimStats that get sent
> to the client.. so that we can have a reasonable idea of what's
> going on with a simulator at a glance. Also, in the design of the
> instrumentation, we make sure that the instrumentation is accurate and
> lightweight.
>
> The load test went reasonably... but, we didn't get half of the
> information on the simulator that we needed to be able to improve it.
>
>
> Please comment :) I look forward to hearing your responses.
>
> Regards
>
> Teravus
_______________________________________________
Opensim-dev mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/opensim-dev
