Also, for debugging network issues with OpenSim, WinGridProxy (included
in libomv) is much more appropriate than Wireshark. You can see the
content of all messages.
On 11/14/2014 8:05 AM, Diva Canto wrote:
On 11/14/2014 6:23 AM, Michael Heilmann wrote:
Thanks for the responses. I'll go into a little more detail:
We have been running several profilers against OpenSimulator on the
MOSES grid, and on my development machine. The tests were to examine
the loading on the server under several different loads, specifically
mesh and physics loads. What we found appears to be that no matter
what kind of load we placed on the region, even to the point of it
becoming unresponsive due to physics and mesh, the time spent on
scripting and physics was nowhere near the amount of time spent in
OpenSim.Region.ClientStack.LindenUDP once we had more than one or two
avatars logged in. We know from previous investigations at our
firewall that network traffic for OpenSim is not that heavy,
especially with low numbers of users.
If this is a problem, and you are running a recent-ish version of core
OpenSim, it sounds like some misconfiguration somewhere. Back in the
summer of 2013 we had a problem with the server running OSCC'13; the
kernel was configured to run in some sort of special mode that was
making everything run badly and unpredictably. We fixed the kernel
configuration, and suddenly things started running much more
smoothly-- I don't remember the details, but Nebadon may clarify things.
OpenSim these days can handle 50 people on a single simulator without
much trouble. If you look at figure 7 of my paper
(http://www.ics.uci.edu/~lopes/documents/summersim14/gabrielova_lopes_preprint.pdf)
you will see the quantification of "without much trouble." I suggest
that you reproduce my experimental conditions with pCamBot and check
whether your numbers are very different from ours. If they are very
different, then there's definitely something odd in your setup, as we
were able to reproduce these numbers on several machines. Feel free to
contact me directly for details about pCamBot configuration.
Bots aren't real viewers, but they are much better for measuring
things systematically and detecting problems and bottlenecks than
relying on real users driven by real people. The performance you get
with pCamBot will be correlated with the performance you get with real
users.
I ran several Wireshark captures against a Firestorm viewer logging
into the MOSES public grid ABWIS region, where we hold our office
hours. I saw that with our current configuration, all traffic
between the server and my client, with the exception of HTTP CAPS and
fsapi calls, was UDP traffic. This is not immediately concerning,
as we have simian serve our mesh and textures directly. The messages
are mostly binary information, so I could not examine them closely, but I
did see a lot of messages containing identical ASCII strings, such as
the name of my avatar.
Hard to say what you saw, but I bet those are the AgentUpdate messages
that I mentioned before. The viewer sends at least 10/sec. At points,
the viewer sends much more than 10/sec, up to 60/sec. Again, take a
look at my paper for understanding what those are, and how OpenSim
deals with them since OSCC'13.
As I said before, it would be nice to understand why the viewer is so
eager to blabber its status to the server when nothing is going on.
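
As a rough illustration of why status updates sent "when nothing is going
on" are cheap to discard, here is a minimal Python sketch of a filter that
drops any update identical to the last one accepted for that agent. This is
not taken from OpenSim's code: AgentState, UpdateFilter, and the fields
shown are hypothetical names, and the real AgentUpdate message carries more
than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentState:
    # Hypothetical subset of what a status update might carry.
    position: tuple
    rotation: tuple
    control_flags: int


class UpdateFilter:
    """Drops updates that repeat the last accepted state for an agent."""

    def __init__(self):
        self._last = {}    # agent id -> last accepted AgentState
        self.accepted = 0
        self.dropped = 0

    def accept(self, agent_id, state):
        # An update that changes nothing needs no further processing.
        if self._last.get(agent_id) == state:
            self.dropped += 1
            return False
        self._last[agent_id] = state
        self.accepted += 1
        return True


def simulate_idle_second(rate_hz=10):
    """Feed rate_hz identical updates from one idle avatar through the filter."""
    f = UpdateFilter()
    idle = AgentState((128.0, 128.0, 25.0), (0.0, 0.0, 0.0, 1.0), 0)
    for _ in range(rate_hz):
        f.accept("avatar-1", idle)
    return f.accepted, f.dropped
```

With an idle avatar sending 10 updates in one second,
simulate_idle_second(10) accepts the first update and drops the other nine.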
My primary concern is the amount of time spent handling networking,
not necessarily the networking itself. But there is at least a
portion of messages on the UDP pipeline that are either reliable, or
perhaps should be; and re-implementing a reliable transport over UDP
introduces load at the application layer, instead of letting a
low-level reliable transport such as TCP handle it. I went to
university with a guy who implemented a java networking library
completely over UDP, believing that it was faster than a normal TCP
socket; but he was neglecting that the networking hardware handles
the ACK and retransmission transparently, without the messages
needing to be handled manually by the application.
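
To make the application-layer cost concrete, here is a minimal Python
sketch of the bookkeeping a program has to do itself when it layers
reliability over UDP: sequence numbers, a table of unacknowledged packets,
and a periodic scan for timeouts. It does no real socket I/O and is not how
LindenUDP is implemented; ReliableSender and its timeout and max_retries
parameters are hypothetical names chosen for illustration.

```python
class ReliableSender:
    """Tracks unacknowledged packets and decides which ones to retransmit."""

    def __init__(self, timeout=0.5, max_retries=3):
        self.timeout = timeout          # seconds to wait for an ACK
        self.max_retries = max_retries  # resends allowed before giving up
        self._next_seq = 1
        self._pending = {}              # seq -> [payload, last_sent, retries]

    def send(self, payload, now):
        # Assign a sequence number and remember the packet until it is ACKed.
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = [payload, now, 0]
        return seq

    def ack(self, seq):
        # Returns True if the ACK matched a packet we were still waiting on.
        return self._pending.pop(seq, None) is not None

    def due_for_resend(self, now):
        # The application must run this scan itself, for every connection.
        resend = []
        for seq in sorted(self._pending):
            entry = self._pending[seq]
            if now - entry[1] < self.timeout:
                continue
            if entry[2] >= self.max_retries:
                del self._pending[seq]  # give up on this packet
                continue
            entry[1] = now
            entry[2] += 1
            resend.append(seq)
        return resend

    def pending_count(self):
        return len(self._pending)
```

With TCP, the equivalent of send, ack, and due_for_resend happens below the
socket API, so none of this state lives in the application.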
This may just be my opinion, but since I was going to be examining
the network stack anyway, and since a client-server system typically
benefits from a persistent reliable connection over which the
server can push important events to the client, I thought it would be a
good idea. The points about network throttling and QoS are taken,
but wouldn't they also typically affect the UDP stream? Working on
MOSES I have plenty of problems dealing with external users who
operate on restricted networks, and they cannot see traffic on ports
other than 80 and 443 without dealing with their own IT personnel. The
fact that it is HTTP over TCP instead of raw TCP makes no difference
once it is on a non-standard HTTP port.
I agree that it would be more prudent to look at improving the
websocket code and the http server, rather than replace it with a raw
TCP socket, especially given that there are multiple plugins, such as
jsonsimstats, that use the http functionality directly.
I hope that explains my position a little better. I would love to
hear if there are other plans/ideas in the community to address
time-sinks like this one; networking simply appears to us as a good
starting point to increase performance and scalability of the system.
_______________________________________________
Opensim-dev mailing list
[email protected]
http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev