On 28 May 2012 19:10, Earl Lazarus <[email protected]> wrote:
> I'll try to be as succinct as possible:
>
> 1) We have developed a CPU intensive simulation that is to be run on an SGE
> cluster.
> 2) Each simulation is a client that also requires an "environment" server to
> be running on that host.  The environment server
>    is associated with the physical environment (a location on the earth and
> a month).  The client treats the
> server as a function call, making a query and waiting for a response, so the
> CPU impact of the server is minimal.
>   The environment servers would generally be started before the main
> simulations are submitted to SGE and would be
>   left running after each of the main simulations end.  The servers can
> communicate with multiple clients needing the same
>   environment representation.  I might envision the servers running
> continuously for a week while the user submits hundreds of
>   the Monte Carlo simulations, each taking 15 min of wall clock time.  When
> the user is finished with his "study", he shuts down
> the servers.
> 2) Currently each host has a slot count equal to the number of CPUs (4).
> 3) There are other simulations already running under SGE on this same
> cluster; i.e. there are lots of other users.
> 4) At the moment I have 4 flavors of server, each representing a different
> physical environment.  To get them up
>    on each host will take 4 slots.  If the host has only 4 CPUs, then it is
> saturated and no clients can run.
> 5) If I up the slot count to 8, then I can have 4 clients and 4 servers
> running.  But the side effect is that if
>    I am not running MY software, then SGE can feed that host 8 CPU intensive
> simulations belonging to
>    someone else, thus oversubscribing the 4 CPUs by a factor of 2.
> 6) If only "slots" came in "flavors", then slots 5 thru 8 could only be used
> by my servers and no one else.

They do, more or less.  Here's one way to do it:

Use one queue for the servers with 4 slots and another for
everything else, also with 4 slots.  Specify -q normal (or whatever) in
the global sge_request file.

When submitting a "server" job just specify -q server to override the
sge_request file.
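
As a rough sketch of the two-queue setup described above: the queue names
normal.q and server.q and the job script start_env_server.sh are
placeholders I made up, and both queues are assumed to exist already. The
function only prints the commands, so they can be reviewed before anything
on the cluster is changed.

```shell
#!/bin/sh
# Sketch only: queue names and the job script are placeholders,
# not taken from the original post.
NORMAL_Q="normal.q"   # assumed queue for ordinary CPU-bound jobs
SERVER_Q="server.q"   # assumed queue reserved for the environment servers

print_queue_plan() {
    # Give each queue 4 slots (both queues must already exist).
    echo "qconf -mattr queue slots 4 $NORMAL_Q"
    echo "qconf -mattr queue slots 4 $SERVER_Q"
    # Make the normal queue the default via the global request file.
    echo "echo '-q $NORMAL_Q' >> \$SGE_ROOT/\$SGE_CELL/common/sge_request"
    # A server job overrides that default on the command line.
    echo "qsub -q $SERVER_Q start_env_server.sh"
}

print_queue_plan
```

Ordinary jobs submitted without -q then land in the normal queue, so the
4 CPU-bound slots per host stay as they are today.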

If this is just for you, you could add an ACL/userset on the server
queue.  If you are after a more general facility, then possibly just
set h_cpu low on the server queue so that anyone trying to run real
jobs on it gets them killed quickly.
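
Either restriction might look roughly like this. The userset name
envservers, the login name someuser, the queue name server.q and the
one-minute limit are all illustrative assumptions. The functions only
print the commands for review. Note that h_cpu applies to every job in
the queue, so the limit has to stay above the CPU time a server itself
accumulates while it runs.

```shell
#!/bin/sh
# Sketch only: every name and value below is a placeholder.
SERVER_Q="server.q"        # assumed name of the server queue
SERVER_USERS="envservers"  # assumed name of the access list (userset)

print_restrict_by_acl() {
    # $1: login name that should be allowed to use the server queue
    echo "qconf -au $1 $SERVER_USERS"
    # Only members of the access list may run jobs in the queue.
    echo "qconf -mattr queue user_lists $SERVER_USERS $SERVER_Q"
}

print_restrict_by_cpu() {
    # $1: hard CPU time limit in h:m:s
    echo "qconf -mattr queue h_cpu $1 $SERVER_Q"
}

print_restrict_by_acl someuser
print_restrict_by_cpu 0:1:0
```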

The above may not work if you already have a complex queue setup for
other reasons.

William

>
> Any ideas?
> 1) I could ensure that I only run 2 flavors of server on a 4 CPU host,
> leaving 2 slots for CPU intensive simulations
>    to be run (either mine or those of other users).  But then I'm cutting
> down the throughput since the servers
>    use such a small amount of CPU resources.

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
