I'll try to be as succinct as possible:

1) We have developed a CPU-intensive simulation that is to be run on an SGE
   cluster.
2) Each simulation is a client that also requires an "environment" server
   to be running on the same host. Each environment server is associated
   with one physical environment (a location on the earth and a month).
   The client treats the server as a function call, making a query and
   waiting for a response, so the CPU impact of the server is minimal.
   The environment servers would generally be started before the main
   simulations are submitted to SGE and would be left running after each
   of the main simulations ends. A server can communicate with multiple
   clients needing the same environment representation. I envision the
   servers running continuously for a week while the user submits hundreds
   of Monte Carlo simulations, each taking 15 min of wall clock time. When
   the user is finished with his "study", he shuts down the servers.
3) Currently each host has a slot count equal to its number of CPUs (4).
4) There are other simulations already running under SGE on this same
   cluster; i.e., there are lots of other users.
5) At the moment I have 4 flavors of server, each representing a different
   physical environment. Getting all of them up on each host takes 4 slots.
   If the host has only 4 slots (one per CPU), then it is saturated and no
   clients can run.
6) If I raise the slot count to 8, then I can have 4 clients and 4 servers
   running. But the side effect is that if I am not running MY software,
   SGE can feed that host 8 CPU-intensive simulations belonging to someone
   else, thus oversubscribing the 4 CPUs by a factor of 2.
7) If only "slots" came in "flavors", then slots 5 through 8 could be used
   only by my servers and by no one else.
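
The slot arithmetic described in the list above can be sketched in a few
lines of Python. This is only an illustrative model of the counting, not SGE
configuration; the names client_slots and oversubscription are made up for
this sketch, and it assumes each server holds exactly one slot while using
a negligible amount of CPU.

```python
CPUS_PER_HOST = 4    # from the post: each host has 4 CPUs
SERVER_FLAVORS = 4   # from the post: 4 environment servers per host


def client_slots(total_slots, servers=SERVER_FLAVORS):
    """Slots left for CPU-intensive jobs once each server holds one slot."""
    return max(total_slots - servers, 0)


def oversubscription(cpu_jobs, cpus=CPUS_PER_HOST):
    """Ratio of CPU-intensive jobs to physical CPUs on one host."""
    return cpu_jobs / cpus


# Slot count equal to CPU count: the servers take every slot.
print(client_slots(4))                     # 0 clients can run

# Slot count raised to 8, my servers running: 4 clients on 4 CPUs.
print(oversubscription(client_slots(8)))   # 1.0, CPUs fully used

# Slot count raised to 8, my servers absent: other users can fill all 8.
print(oversubscription(8))                 # 2.0, CPUs oversubscribed
```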

Any ideas?
1) I could ensure that I run only 2 flavors of server on a 4-CPU host,
   leaving 2 slots for CPU-intensive simulations (either mine or those of
   other users). But then I'm cutting down the throughput, even though the
   servers use such a small amount of CPU resources.
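
To put a rough number on that throughput loss, here is a small Python
sketch. It assumes every client occupies one slot for the full 15 minutes of
wall clock time mentioned above and that the slots are never idle; the
function name sims_per_hour is hypothetical.

```python
MINUTES_PER_SIM = 15   # from the post: wall clock time per simulation


def sims_per_hour(client_slots, minutes_per_sim=MINUTES_PER_SIM):
    """Simulations one host completes per hour with the given client slots."""
    return client_slots * 60 / minutes_per_sim


# 4 slots, 2 held by servers: only 2 clients run at a time.
print(sims_per_hour(2))   # 8.0 simulations per hour per host

# All 4 CPUs available to clients (servers not counted against slots).
print(sims_per_hour(4))   # 16.0 simulations per hour per host
```

Under these assumptions the workaround halves the per-host throughput, even
though the two idle CPUs are held by servers that barely use them.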
