Hey there,

A while back, we had somewhat reasonable statistics being generated and presented to the client. They were not always accurate, but based on what I saw, I could usually pin down which parts of the simulator were the limiting factor during load tests. I'd say the number one reason they were semi-accurate rather than accurate is that nobody ever thought about instrumentation during the functionality design. It was always tacked on later.

One example of this is the current AssetCache implementation. There is currently no way to know, at a glance, how many external requests it has open. Worse, it will be extremely difficult to add that, because of the way the objects are designed and accessed. To add it, an event needs to be added to the IAssetService interface, and each AssetCache implementation will need an interlocked int to count how many gets and puts it currently has open to the external data source, as well as its own event-raising schedule. Then the IAssetService property in Scene (AssetService) will need an event handler which updates the values in the SimStatsReporter in Scene (StatsReporter). This kind of external-resource instrumentation should really have been built into the design of the AssetService.
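To make the shape of that concrete, here is a rough sketch of the pattern I mean. It's in Java only so it's easy to read on the list; in our actual C# the counter would be Interlocked.Increment/Decrement and the listener would be a .NET event on IAssetService. All names here are hypothetical, not real OpenSimulator identifiers:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical sketch: an asset service that counts its own open external
// requests and notifies a subscriber, so the stat is designed in rather
// than tacked on later.
class InstrumentedAssetService {
    // Interlocked-style counter of external requests currently in flight.
    private final AtomicInteger openRequests = new AtomicInteger(0);

    // Stand-in for a .NET event; the Scene's stats reporter would subscribe
    // here and push the value into its SimStats block.
    private volatile IntConsumer onOpenRequestsChanged = n -> { };

    void setOpenRequestsListener(IntConsumer listener) {
        onOpenRequestsChanged = listener;
    }

    byte[] get(String assetId) {
        // Count the request in before touching the external source,
        // and out again even if the fetch throws.
        onOpenRequestsChanged.accept(openRequests.incrementAndGet());
        try {
            return fetchFromExternalSource(assetId);
        } finally {
            onOpenRequestsChanged.accept(openRequests.decrementAndGet());
        }
    }

    // At-a-glance view: how many external requests are open right now?
    int getOpenRequests() { return openRequests.get(); }

    private byte[] fetchFromExternalSource(String assetId) {
        return new byte[0]; // placeholder for the actual grid asset fetch
    }
}
```

The point is that the counter and the notification live inside the service itself, so every implementation of the interface reports the same stat for free.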
In this last load test, there were no real statistics that I could use to determine what the limiting factor was. Time Dilation was pegged at 1.0 even when the simulator was obviously struggling. Total Frame Time (MS) was -50ms even when the Simulation MS was 850ms and the Physics MS was 250ms, so the inconsistencies made it impossible to know which part of the simulator was struggling. Agent Updates were erratic, sometimes high and sometimes low, both when the simulator was fine and when it was struggling. Pending Uploads and Downloads were always 0, so there was no way to know how well the simulator was downloading and uploading assets to and from the grid. Packet stats were non-existent, so there was no way to know how the UDP handlers were faring under the load. When it crashed, it crashed with a mono stack trace pointing to out-of-memory errors, so the only way to find out scientifically what the issue was would be to run a load test under a memory profiler, and we know that running a public load test under a memory profiler is quite impractical.

To make something better, I need to know two things: where it is, and where I want it to be. How can we make OpenSimulator better if we don't have statistics that tell us where we are currently?

On that note, I propose that when designing objects for functionality in OpenSimulator, we also consider whether the objects should be instrumented, and what the best way to instrument them would be. We should incorporate instrumentation into the design of the objects. Some of that instrumentation is appropriate for a client to see; some of it might not be. Still, many of these stats should be client facing and included in the SimStats that get sent to the client, so that we can get a reasonable idea of what's going on with a simulator at a glance. We should also make sure, as part of the design, that the instrumentation is accurate and lightweight.
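On the "accurate and lightweight" point, here is the kind of thing I have in mind, again sketched in Java with made-up names (in C# the counters would be Interlocked operations): the hot path pays only a single atomic increment, and the aggregation happens off to the side on the stats timer, not on every packet or frame.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of lightweight built-in instrumentation: the UDP
// handlers pay one lock-free atomic add per packet, and a periodic stats
// tick reads-and-resets the counters to feed something like SimStatsReporter.
class PacketStats {
    private final AtomicLong packetsIn = new AtomicLong();
    private final AtomicLong packetsOut = new AtomicLong();

    // Called from the inbound/outbound UDP handlers; no locks taken.
    void recordInbound()  { packetsIn.incrementAndGet(); }
    void recordOutbound() { packetsOut.incrementAndGet(); }

    // Called once per stats interval; returns {in, out} counts since the
    // last tick, resetting both so each interval is self-contained.
    long[] snapshotAndReset() {
        return new long[] { packetsIn.getAndSet(0), packetsOut.getAndSet(0) };
    }
}
```

Because the read-and-reset is atomic per counter, the per-interval numbers stay accurate under load without the handlers ever blocking on a stats lock.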
The load test went reasonably, but we didn't get half of the information about the simulator that we needed to be able to improve it.

Please comment :) I look forward to hearing your responses.

Regards,
Teravus

_______________________________________________
Opensim-dev mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/opensim-dev
