On 28 May 2012 19:10, Earl Lazarus <[email protected]> wrote:
> I'll try to be as succinct at possible:
>
> 1) We have developed a CPU intensive simulation that is to be run on an
>    SGE cluster.
> 2) Each simulation is a client that also requires an "environment" server
>    to be running on that host. The environment server is associated with
>    the physical environment (a location on the earth and a month). The
>    client treats the server as a function call, making a query and
>    waiting for a response, so the CPU impact of the server is minimal.
>    The environment servers would generally be started before the main
>    simulations are submitted to SGE and would be left running after each
>    of the main simulations end. The servers can communicate with multiple
>    clients needing the same environment representation. I might envision
>    the servers running continuously for a week while the user submits
>    hundreds of the Monte Carlo simulations, each taking 15 min of wall
>    clock time. When the user is finished with his "study", he shuts down
>    the servers.
> 2) Currently each host has a slot count equal to the number of CPUs (4).
> 3) There are other simulations already running under SGE on this same
>    cluster; i.e. there are lots of other users.
> 4) At the moment I have 4 flavors of server, each representing a
>    different physical environment. To get them up on each host will take
>    4 slots. If the host has only 4 CPUs, then it is saturated and no
>    clients can run.
> 5) If I up the slot count to 8, then I can have 4 clients and 4 servers
>    running. But the side effect is that if I am not running MY software,
>    then SGE can feed that host 8 CPU intensive simulations belonging to
>    someone else, thus oversubscribing the 4 CPUs by a factor of 2.
> 6) If only "slots" came in "flavors", then slots 5 thru 8 could only be
>    used by my servers and no one else.

They do, more or less. Here's one way to do it:

Use one queue for the servers with 4 slots, and another for everything
else, also with 4 slots. Specify -q normal (or whatever) in the global
sge_request file. When submitting a "server" job, just specify -q server
to override the sge_request file.

If this is just for you, you could add an ACL/userset on the server queue.
If you are after a more general facility, then possibly just set h_cpu low
on the server queue, so that anyone trying to run real jobs on it gets
them killed real quick.

The above may not work if you already have a complex queue setup for
other reasons.

William

>
> Any ideas?
> 1) I could ensure that I only run 2 flavors of server on a 4 CPU host,
>    leaving 2 slots for CPU intensive simulations to be run (either mine
>    or those of other users). But then I'm cutting down the throughput
>    since the servers use such a small amount of CPU resources.
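
For reference, the two-queue setup suggested above might look roughly like the fragment below. This is only a sketch, not a tested configuration. The queue names "normal" and "server" come from the suggestion itself; the userset name (envservers), the script names, and the h_cpu value are made up for illustration and would need adapting to the site.

```
# --- $SGE_ROOT/$SGE_CELL/common/sge_request (cluster-wide default requests) ---
# Every job goes to the "normal" queue unless the submitter passes its own -q.
-q normal

# --- Submitting jobs ---
# Environment server: -q on the command line overrides the sge_request default.
# start_env_server.sh is a hypothetical wrapper script.
qsub -q server start_env_server.sh

# Ordinary simulation: no -q given, so the default (-q normal) applies.
# run_simulation.sh is likewise hypothetical.
qsub run_simulation.sh

# --- Relevant attributes of the "server" queue (e.g. edited via: qconf -mq server) ---
slots       4
user_lists  envservers    # optional: restrict the queue to a userset (name assumed)
h_cpu       0:05:00       # optional alternative: low CPU-time limit (value illustrative)
```

With both queues at 4 slots per host, CPU-bound jobs are capped at 4 per host while the lightweight servers occupy the other 4 slots. If the h_cpu route is used, the limit still has to sit above the total CPU time a server accumulates over its lifetime, which could add up over a week-long study even when per-query cost is small.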
