While the original idea was to use some workflow for low-core-count jobs in a SLURM cluster, it ended up as a setup of a Virtual Cluster in (possibly) any queuing system. Whether such a setup is allowed may depend on the site policy, but it is at least a working scenario and can add features to an actual installation which are otherwise not available or not configured. On top of that it provides a kind of micro-scheduling inside the given allocation which is not available otherwise.
We got access to a SLURM-equipped cluster where one always gets complete nodes and is asked to avoid single serial jobs, or to pack them by scripting to fill the nodes. With the additional need for a workflow application (DRMAA-based) and array job dependencies, I got the idea to run a GridEngine instance as a Virtual Cluster inside the SLURM cluster to solve this. Essentially it's quite easy, as GridEngine offers:

- one can start SGE as a normal user (for a single-user setup per Virtual Cluster this is exactly appropriate)
- SGE supports independent configurations, i.e. each Virtual Cluster is an SGE_CELL
- configuration files can be plain text files ("classic" spooling), and hence are easily adjustable

After an untar of SGE somewhere like /YOUR/PATH/HERE/VC-common-installation/opt/sge (no need to install anything here), we need a planchet of a "classic" configuration put there, named "__SGE_PLANCHET__". Like the /tmp directory, everyone should be able to write at this level besides the "__SGE_PLANCHET__" itself (`chmod go=rwx,+t /YOUR/PATH/HERE/VC-common-installation/opt/sge`). To the planchet you can add items as needed, e.g. more PEs, complexes, queues, …

The enclosed script `multi-spawn.sh` gives an idea of what has to be done to start a Virtual Cluster, even several per user:

$ sbatch multi-spawn.sh

Regarding DRMAA, one doesn't need to run the workflow application on the login node or in a dedicated job; instead it is already part of the (SLURM) job itself (to be put in the application section of `multi-spawn.sh`).

===

While the planchet was created with 6.2u5, only a few steps are necessary to create one for your version of SGE: run each install_* script for qmaster and execd once. Essentially this will only create a configuration; choose "classic" as the spooling method (no need to add any exechost when you are asked for one — in fact, remove the one which was added afterwards, and from the @allhosts hostgroup too).
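For illustration, the copy-and-substitute step which `multi-spawn.sh` later performs on the planchet could be sketched like this (a hypothetical sketch — the function name is mine and the exact mechanics may differ from the attached script; the placeholder names are the ones used in the planchet):

```shell
#!/bin/bash
# Hypothetical sketch of the cell-instantiation step of multi-spawn.sh:
# copy the planchet to a per-job SGE_CELL and fill in the placeholders.

instantiate_cell () {
    # $1 = SGE installation directory, $2 = SLURM job id
    local sge_root=$1 cell="SGE_$2"

    # Each Virtual Cluster gets its own cell, named after the SLURM job.
    cp -r "$sge_root/__SGE_PLANCHET__" "$sge_root/$cell"

    # Replace the __FOO__ markers in every file of the fresh cell.
    grep -rl '__SGE_' "$sge_root/$cell" | while read -r f; do
        sed -i -e "s|__SGE_INSTALLATION__|$sge_root|g" \
               -e "s|__SGE_CELL__|$cell|g" "$f"
    done
}
```

In the real script this would be followed by starting the sge_qmaster on the first granted node, the sgeexecds on all of them (e.g. via `srun`), and then the workflow application itself.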
Then rename this created "default" configuration to "__SGE_PLANCHET__" and look in my planchet with `grep` for entries like __FOO__ (i.e. strings enclosed in double underscores). These have to be replaced there accordingly. `multi-spawn.sh` will then change them in a copy of the planchet to the names and locations of the actual SGE instance; i.e. each SGE_CELL also has its own spool directory. Notably, in sgemaster and sgeexecd:

SGE_ROOT=/usr/sge; export SGE_ROOT
SGE_CELL=default; export SGE_CELL

becomes:

SGE_ROOT=__SGE_INSTALLATION__; export SGE_ROOT
SGE_CELL=__SGE_CELL__; export SGE_CELL

===

You might need passphraseless `ssh` between the nodes, unless you start the remote daemons by `srun`. If this doesn't work either, a pseudo MPI application whose only duty is to start the sgeexecd on each involved node should do.

===

In case you want to log in interactively to one of the nodes which were granted to your Virtual Cluster, you need to:

$ source /YOUR/PATH/HERE/VC-common-installation/opt/sge/SGE_<SLURM_JOB_ID>/common/settings.sh

there to gain access to the SGE commands in the interactive shell for this particular Virtual Cluster. Therefore two mini functions `sge-set <SLURM_JOB_ID>` and `sge-done` are included to ease this. While this works on the nodes instantly, it's necessary to add the head node(s) of the SLURM cluster to the planchet beforehand as submit and/or admin hosts.

===

In case one wants to send emails, note that the default for GridEngine is the account on the login node, which is in this case an exechost of SLURM. Either a special setup there is necessary to receive email on an exechost, or always provide a full email address with GridEngine's "-M" option.

===

As every Virtual Cluster starts with job id 1, it might be helpful to create scratch directories (in a global prolog/epilog) named "${SLURM_JOB_ID}_$(basename ${TMPDIR})".
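Such a prolog step could be sketched like this (hypothetical — the function name and the base-directory argument are mine; only the naming scheme is from the text above):

```shell
#!/bin/bash
# Hypothetical sketch of a global prolog fragment: create a scratch
# directory combining the SLURM job id with the SGE job's $TMPDIR name,
# so the restarting job ids of different Virtual Clusters can't collide.

make_vc_scratch () {
    # $1 = site-wide scratch area (assumption: passed in by the prolog)
    local dir="$1/${SLURM_JOB_ID}_$(basename "$TMPDIR")"
    mkdir -p "$dir" && echo "$dir"
}
```

Used e.g. as `scratch=$(make_vc_scratch /scratch)` inside the prolog.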
If you always get full nodes, you won't have this problem for a local scratch directory in $TMPDIR, though.

===

BTW, did I mention it: no need to be root anywhere.

-- Reuti
Attachments:
multi-spawn.sh
__SGE_PLANCHET__.tgz
cluster.tgz
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users