Re: [Opensim-users] How to get ROBUST to notify that it has finished setup?, (Fred Beckhusen)
I do much of this in DreamGrid. MySQL, Robust and Opensim each have a different way of showing that they are "up". You are correct that Linux spawns processes (not services, which are a Windows construct), and detecting that the process is running is not what you need. You need to know when the process is ready to use.

MySQL: MySQL is ready if you get a response back from a Select Version(); query. Socket connectivity to port 3306, or even a login to it, is not enough. My app counts it as a success if Response.length > 0. I don't check the version as I don't care, and it's NaN anyway. There may be no Robust or Region database at first boot. They will make the databases they need.

Robust: I used to just probe the Robust HTTP port for a 200 OK HTTP response. But when Robust is first building the database, this is not correct. A slow PC could take too long to make the Robust database, and a launched Opensim would fail to connect. Ubit added an HTTP response code to my code, which is mostly core Opensim. Robust is MIT licensed so feel free to use it. The source is in the Dreamworld repo at github.com/Outworldz/Dreamworld. gitk can help locate the spot of any changes to core. Ubit may be able to point you to the correct place.

DreamGrid uses a URL in my file Robust.vb, in function IsRobustRunning() As Boolean, which uses a parameter to ask for Robust readiness:

    Up = Client.DownloadString("http://" & Settings.PublicIP & ":" & Settings.HttpPort & "/index.php?version")

Typically, this would translate to http://127.0.0.1:8002/index.php?version

If the Up string contains "Opensim", I mark Robust as up and launch all Opensims sequentially (not in parallel, though you could fork and do so). I want to control the boot more closely. As soon as an Opensim is spawned I try to get back a Process ID, and then I launch another after checking the CPU and RAM. A PID can take anywhere from milliseconds to 10-15 seconds (possibly much longer) to get, so I have to wait for it.
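The Robust readiness probe described above (request the ?version URL, then look for "Opensim" in the body) can be sketched in Python roughly as follows. This is only an illustration under assumptions, not DreamGrid's code: the helper names fetch_text, is_robust_ready and wait_until are hypothetical, the match is made case-insensitive as a guess at what the response looks like, and the timeout and interval values are arbitrary.

```python
import http.client
import time
import urllib.error
import urllib.request


def fetch_text(url, timeout=5.0):
    """Return the response body as text, or "" if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return ""


def is_robust_ready(host="127.0.0.1", port=8002, fetch=fetch_text):
    """True when the version URL answers with a body mentioning OpenSim.

    Assumption: the check is case-insensitive; the original compares
    against the literal string "Opensim".
    """
    url = f"http://{host}:{port}/index.php?version"
    return "opensim" in fetch(url).lower()


def wait_until(check, timeout=300.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A launcher could call wait_until(is_robust_ready) and start the simulators only when it returns True. The fetch, sleep and clock parameters exist so the logic can be exercised without a running Robust.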
I need the PID to locate the instances later. You could use the PID file that Opensim makes on disk, but then you must know all the regions by parsing files, and delete the .pid file before the launch, and that brings in more headaches in detecting a running Opensim. Instances may be still shutting down, or they restarted, or never shut down, or were left running deliberately by the Grid owner. All those edge cases are where I spend a lot of my time. Tracking them requires a complicated state machine. Opensim is a hot mess.

I spawn them as a process, and so can detect an exit by using WithEvents and, if desired, restart them (typically initiated by a user request, as a region reboot should not be used; you need to exit and respawn the instance).

And detecting "Up" may be impossible. The Region Ready module exists to report that Opensim is up, but it's not always correct. If scripts are off, you get nothing from the module. Also, if Logins are disabled, there will never be a Region Ready. Worse, after Login Enabled is reported on the console and Region Ready says the Region scripts are done, you can still get a Teleport Denied for as long as a minute or more because the region is offline. And that even when Robust has it marked as Online! It's something deep in Opensim I still need to fix.

Opensim will be CPU bound at boot as it is extremely thread happy. MySQL will rarely go above a few percent CPU, even with 100% of the RAM in use and many regions launched at once. The CPU will get swamped, especially if maps are on. Maps can double RAM use and extend the CPU use for many minutes, versus the more normal few seconds it takes. DreamGrid gets the average CPU and RAM over 3-second periods, and stops spawning Opensim processes when CPU > 90% or memory > 90%. It begins to spawn again when the average drops below these thresholds. This leaves a bit of both left over, which makes the system much more responsive.
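The load-based throttling described above can be sketched as below. It is a minimal illustration, not DreamGrid's implementation: SpawnThrottle and launch_all are hypothetical names, the sample callable is assumed to return a (CPU percent, memory percent) pair from whatever source is available (for example the third-party psutil package), and a window of three one-second readings is only one reading of "3 second periods".

```python
from collections import deque


class SpawnThrottle:
    """Decide whether to launch the next instance from averaged load."""

    def __init__(self, sample, window=3, cpu_limit=90.0, mem_limit=90.0):
        self.sample = sample              # callable returning (cpu %, mem %)
        self.cpu_limit = cpu_limit
        self.mem_limit = mem_limit
        self.readings = deque(maxlen=window)

    def poll(self):
        """Take one reading and return (avg_cpu, avg_mem) over the window."""
        self.readings.append(self.sample())
        count = len(self.readings)
        avg_cpu = sum(r[0] for r in self.readings) / count
        avg_mem = sum(r[1] for r in self.readings) / count
        return avg_cpu, avg_mem

    def may_spawn(self):
        """False while average CPU or average memory is above its limit."""
        avg_cpu, avg_mem = self.poll()
        return avg_cpu <= self.cpu_limit and avg_mem <= self.mem_limit


def launch_all(regions, launch, throttle, sleep):
    """Launch regions one at a time, pausing while the host is saturated."""
    for region in regions:
        while not throttle.may_spawn():
            sleep(1.0)
        launch(region)
```

Because may_spawn() is consulted before every launch, spawning resumes by itself once the averages fall back under the limits.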
I also support Core Affinity for Opensim so you can select the cores to use, and I am experimenting with using just one core during boot and then using more as it becomes stable. Lots to play with here. I will be adding Core Affinity as a choice for Robust and MySQL soon.

--ooo--/\/\/\-|(--ooo--/\/\/\-|(--ooo
Fred K. Beckhusen
___
Opensim-users mailing list
Opensim-users@opensimulator.org
http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-users
Re: [Opensim-users] How to get ROBUST to notify that it has finished setup?
I don't use systemd for OpenSimulator, as I find it lacks the necessary handling for all that can go wrong with OpenSimulator and its interdependent services. It can also complicate things when you want to do maintenance or other work. Simply knowing a process is running is not sufficient given what may go wrong.

In my case I use a series of wrappers and file semaphores to control how OpenSimulator is started and shut down, both at the individual application level and for the entire grid. I generally run FSAssets as a separate service and the rest of Robust as another. For cases where there may be high levels of concurrency I separate Robust into more services.

If we consider my startup sequence for a grid with 2 Robust services, FSAssets and another for everything else (core), then my main wrapper would do the following for startup:

Start the grid
- check for an AUTORUN file semaphore for the grid
- if we do not see AUTORUN then abort any startup
- check if MySQL is running and accepting connections
- if MySQL is not OK then wait for 60 seconds and try again
- if MySQL has not been found to be OK after 5 minutes then generate a log message and return with a fail exit code
- execute the wrapper for FSAssets
- if exit code is ok then start Robust-core, otherwise generate a log message and return with a fail exit code
- if exit code is ok then loop through each simulator and run its wrapper for startup; if any exit code is not OK then generate a log message but continue
- if no not-OK exit codes were encountered then generate a log message for successful grid startup

Each of my OpenSim services has its own wrapper script, as follows.

FSAssets wrapper script:
- check to see if MySQL is running and I can make a simple query on the table to get the record count for assets (this will fail on the first-run startup but that is never done as part of an automated sequence in my case)
- if MySQL is not OK then generate a log message and return with a fail exit code
- start the FSAssets robust executable, in my case as a tmux session
- loop 10 times checking if we still have the new tmux session each second
- if during any check we no longer see our new tmux session we assume something went wrong, generate a log message and return with a fail exit code
- if after our 10 second loop we still have our tmux session then return with an OK exit code

Robust-core wrapper script:
- check to see if MySQL is running and I can make a simple query on the table to get the record count for user accounts (this will fail on the first-run startup but that is never done as part of an automated sequence in my case)
- if MySQL is not OK then generate a log message and return with a fail exit code
- start the core Robust executable, in my case as a tmux session
- loop 10 times checking if we still have the new tmux session each second
- if during any check we no longer see our new tmux session we assume something went wrong, generate a log message and return with a fail exit code
- if after our 10 second loop we still have our tmux session then return with an OK exit code

Simulator wrapper script:
- check for an AUTORUN file semaphore for the simulator
- if we do not see AUTORUN then return with an OK exit code (this is a skipped simulator and not an error)
- check if the main robust service is running by requesting the get_grid_info
- if we couldn't get the grid info or did not find our expected grid uri in the response then return with a fail exit code
- check to see if MySQL is running and our simulator schema exists (this is OK for the first-run startup since we always require our schema to be present to continue)
- if MySQL is not OK then generate a log message and return with a fail exit code
- start our simulator executable as a new tmux session
- loop 10 times checking if we still have the new tmux session each second
- if during any check we no longer see our new tmux session we assume something went wrong, generate a log message and return with a fail exit code
- if after our 10 second loop we still have our tmux session then return with an OK exit code

My grid shutdown process is very similar but in reverse order, without the DB or service checks, and with longer delays.

For startup the 10 second checks could be shortened, but I prefer to wait long enough that I know the OpenSimulator executable is not going to die due to an error. That usually happens in the first couple of seconds. In the case of simulators I often increase the 10 second loop to cover the typical time needed to start the scripts for that particular build, so I can better balance the load on the host and not have too many sims all trying to start their scripts at once.

Over the years this is what I have found to be the most flexible way to handle an OpenSimulator grid and its services while avoiding many of the errors that can happen along the way.
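The wrapper logic listed above could be sketched in Python along these lines. This is a rough illustration under assumptions rather than the actual scripts: every function name is hypothetical, the database check, service start and tmux session check are passed in as callables instead of being implemented, and the exit codes for a missing grid AUTORUN file and for a run with failed simulators are guesses, since the message does not state them.

```python
import os
import time

OK, FAIL = 0, 1


def wait_for_database(db_ok, retry_delay=60, max_wait=300, sleep=time.sleep):
    """Retry db_ok() every retry_delay seconds; give up after max_wait."""
    waited = 0
    while not db_ok():
        if waited >= max_wait:
            return False
        sleep(retry_delay)
        waited += retry_delay
    return True


def start_and_watch(start, session_alive, checks=10, interval=1,
                    sleep=time.sleep):
    """Start a service, then confirm its session survives the watch loop."""
    start()
    for _ in range(checks):
        sleep(interval)
        if not session_alive():
            return FAIL          # session vanished: assume startup failed
    return OK


def start_grid(autorun_path, db_ok, core_wrappers, simulator_wrappers,
               log=print, sleep=time.sleep):
    """Run the grid startup sequence; wrappers are (name, callable) pairs."""
    if not os.path.exists(autorun_path):
        return FAIL              # assumption: aborting is reported as a failure
    if not wait_for_database(db_ok, sleep=sleep):
        log("MySQL not available after 5 minutes")
        return FAIL
    for name, wrapper in core_wrappers:      # e.g. FSAssets, then Robust-core
        if wrapper() != OK:
            log(f"{name} failed to start")
            return FAIL
    failures = 0
    for name, wrapper in simulator_wrappers:
        if wrapper() != OK:
            log(f"simulator {name} failed to start")
            failures += 1        # log it, but continue with the next one
    if failures == 0:
        log("grid startup successful")
    return OK                    # assumption: simulator failures are not fatal
```

Each per-service wrapper would combine its own database query with start_and_watch, passing callables that create the tmux session and check that it still exists.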
Re: [Opensim-users] How to get ROBUST to notify that it has finished setup?
At 8:19 PM -0800 11/17/21, Dahlia Trimble wrote:
> You might be able to detect it by having a bash script check to see if the associated Robust sockets are active. I'm not sure if it means Robust has finished its startup but it's probably at least pretty close.

Or make an actual request on one of the ROBUST services: http://opensimulator.org/wiki/Services

Note about starting all simulators in parallel: this can cause a big spike in CPU. I allow for 30 seconds between simulators to smooth it out. Not sure systemd can do that.

-- Jeff
Re: [Opensim-users] How to get ROBUST to notify that it has finished setup?
You might be able to detect it by having a bash script check to see if the associated Robust sockets are active. I'm not sure if it means Robust has finished its startup but it's probably at least pretty close.
Re: [Opensim-users] How to get ROBUST to notify that it has finished setup?
Hi,

In absolute terms, it is just not possible to know when Robust is fully loaded, due to OpenSim's heavily multitasked nature.

In some configurations it can even be split into several processes, even on different machines.

Even Regions.

What we have is just the end of script loading. When that ends, the physics engine may still be processing meshes, for example.

Ubit

On 17-Nov-21 21:03, Gwyneth Llewelyn wrote:

Hi all,

I've been tinkering with my automation scripts under Ubuntu Linux 20.04.3 LTS, trying to get them fully integrated with systemd. It's tougher than I imagined!

My question is rather simple. OpenSim.ini lists a few options to run some scripts and/or send some notifications when the instance is fully loaded and operational (for instance, once the instance is fully loaded, you could check for the statistics API). These can be used for a variety of purposes, from simple notifications to a sysadmin to let them know that an instance has rebooted, to let users get some sort of feedback on which regions are up, etc. and so forth. These can also be used for system maintenance purposes as well.

I can't find anything similar for ROBUST, though — at least, not on the configuration files. The closest I could find was a reference to the 'console'. I'm assuming that this would technically allow a bash script to connect to ROBUST and perform some sort of check...? A bit, uh, 'clunky' but... I guess it's a possibility?

What are you using to signal that ROBUST has finished loading?

Thanks in advance!

- Gwyn

P.S. Some background notes, for those interested in understanding what I'm trying to accomplish and why I've been having some trouble. One of the great things about systemd (arguably one of the few...) is that it launches everything in parallel, as much as possible; the theory being that services will not need to block each other, which is what happened in early systems (which relied on a serial sequence of steps, each having to finish before the next one was launched).
This is great for launching all the OpenSim instances for the whole grid — they will load in parallel, and, since they're pretty much self-contained, they will happily get what they need from the database server, and — in theory! — finish faster than launching each instance, one by one (in practice, it's not so rosy, since the database server becomes the bottleneck... although it ought to be possible to fine-tune it to deal with so many requests in parallel).

However, there are two catches with this approach.

Firstly, if the MySQL database is not ready before ROBUST and/or the instances launch, OpenSim will assume a 'broken' or non-existing database connection, and gracefully fail, by asking for the Estate name and so forth — i.e. basically the instances will be up, but blocked. The good news is that there are several ways to check that MySQL is up and running (using some external scripts — ), so this can be checked before ROBUST or any of the OpenSim instances are launched.

Secondly — and the reason for this message to the list! — _if_ ROBUST hasn't launched yet, then none of the OpenSim instances will register themselves with the core grid services (including the asset server). I'm not quite sure if each instance, after failing in its attempts to contact ROBUST, will make any attempt at a later stage to re-check-in with it. If not, it effectively means a broken grid, where sections of it, on individual instances, will simply be isolated from the rest of the grid.

ROBUST is quite fast in loading everything — compared with the OpenSim instances, at least — which means that there is a good chance that it launches before the instances. But we cannot be sure that this actually happens.

Now, systemd has a way to generate a list (rather, a directed graph...) of dependencies. One can, indeed, make sure that ROBUST has already been launched *before* launching any of the instances.
But this won't help much in this case, because systemd is only able to check that the *process* has been launched — not if it's ready to accept requests. There are some tricks to achieve that, but most require some changes in the ROBUST code, and I'm not even sure that, running inside Mono, the C# code has any access to system calls. The alternative is to use scripts that check for other things — such as, say, a status page or a file that has been written somewhere — in order to deduce that something has not only been launched but is actively accepting requests. I know how to do that inside an OpenSim instance, but not on ROBUST.

--
"I'm not building a game. I'm building a new country."
-- Philip "Linden" Rosedale, interview to Wired, 2004-05-08
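One of the "tricks" for telling systemd that a service is ready, not merely started, is its notification protocol: a unit declared with Type=notify is treated as started only once a READY=1 datagram arrives on the Unix socket named in the NOTIFY_SOCKET environment variable. The sketch below shows how a wrapper script, rather than ROBUST itself, might send that message after its own readiness check passes. It is an assumption-laden illustration: the helper name sd_notify is hypothetical (it mirrors the libsystemd function of the same name), and the unit would probably also need NotifyAccess=all so that a message from a helper process is accepted.

```python
import os
import socket


def sd_notify(message="READY=1", env=os.environ):
    """Send a state message to systemd; return False if not run under it."""
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False                 # no notification socket was provided
    if path.startswith("@"):
        path = "\0" + path[1:]       # "@" marks an abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True
```

A wrapper would start ROBUST, poll whatever check it trusts (a status URL, or a get_grid_info request), and call sd_notify() once that check succeeds, so that units ordered after the ROBUST unit are only started from that point on.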