I've been experiencing frequent lock-ups with OpenSim.exe that defy all attempts at diagnosis by elimination. It's been like this for several weeks, across various releases up to and including 8131. I'll briefly describe the symptoms here in the hope that someone might be able to point me in the right direction.

1. When it happens, the server is usually up and running, and it's much more likely to happen when two avs are logged in (I've not tried any more than two) than with one. It can also happen with only one, or with none; sometimes it happens while scripts are still loading after starting the server, sometimes after an hour or so of running. Sometimes it can be left on overnight with no logins and it'll be fine. A typical scenario would be one av being logged in without problems for quite a long time; when a second av logs in, the server locks up within seconds. This, however, is not a universal rule, just the most common of the many scenarios.

2. What happens is this: the region server stops responding at the command line, and nothing more will happen until it's killed off. Clients get logged off after timeout. On a system monitor, one virtual CPU (it's a P4, so single-core, but with hyper-threading there are two logical CPUs) is going flat-out at around 85-95%, while the second CPU idles at around the normal 30%. This persists until the process is killed. Memory usage continues as normal, at around 30-50% of the 2GB total.

3. The server is running in grid mode, with UGAIM services provided by OSGrid, on an Ubuntu 8.10 server with Mono 1.9.1. There are four regions serviced by one copy of OpenSim.exe. Scripting is XEngine, physics ODE/Meshmerizer. There are around 2200 prims in all, with perhaps 400 mostly idle scripts.

4. Eliminating each of the regions in turn seems to alleviate the problem to a certain degree - most of the time, any single region may be run with fewer problems than all four, but there is little pattern to this in the longer term. In the short term, it may appear that a particular region is at fault, but after a day or two the situation may change.

5. Disabling scripting doesn't seem to have much, if any, effect. No script is doing anything particularly exotic, and all are normally waiting for events.

6. Another thing which sometimes (but not always) helps is to remove the most recently rezzed prims.

7. I've tried reinstalling Ubuntu, and I've also moved the whole thing from one machine to another during this time. OpenSim has been upgraded roughly weekly to the latest stable version hosted by OSGrid.

It's hard to eliminate anything, because the situation changes so often; it may appear to run fine for several hours, with two avs making moderate use of the sims for building, etc., then it may barely run at all.

If any of this looks familiar to anyone, I'd be very grateful for any help. I think you can imagine how frustrating this has been for my partner and me, being almost completely unable to progress with our plans for building the regions into the land of our dreams. And it's particularly galling because of the lack of consistency - we never seem to be able to narrow down the cause of the problem.

-- 
John Hopkin
_______________________________________________
Opensim-users mailing list
https://lists.berlios.de/mailman/listinfo/opensim-users
