Dear all,

We are in the process of designing a cluster to be used for
calculations on sensitive data (DNA from patients, etc.).  We would like
to be able to run jobs from different projects at the same time, and
naturally, the jobs should be shielded from each other.

One idea we are investigating is to use virtual machines running on the
cluster, and then roll back/restart the VMs between jobs.  In
particular, we are considering setting up a fixed number of VMs, matching
the physical hardware of the cluster, and telling Slurm to use those VMs
as its compute nodes.  We will only allocate one job to each VM at a
time, and have an EpilogSlurmctld script that rolls back the VM after the
job finishes.
(We might also have a PrologSlurmctld script that rolls back the VM
before a job starts, for extra security.  Alternatively, the epilog will
shut down the VM and the prolog will boot it from scratch.)
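
To make the idea concrete, here is a minimal sketch of what such an
EpilogSlurmctld script might look like.  It is only an illustration under
assumptions that are not part of the plan above: the VMs are libvirt
domains named after their Slurm node names, each has a clean snapshot
(here called "clean"), and slurmctld can reach the hypervisor through the
libvirt URI given in LIBVIRT_URI.  The names SNAPSHOT_NAME, LIBVIRT_URI,
expand_nodelist and build_revert_command are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical EpilogSlurmctld script: revert every VM a finished job used."""
import os
import subprocess
import sys

# Assumptions for illustration only: libvirt domains are named after the
# Slurm node names, and each domain has a known-clean snapshot.
SNAPSHOT_NAME = "clean"
LIBVIRT_URI = "qemu:///system"  # a remote URI is needed if the hypervisor is elsewhere


def expand_nodelist(nodelist):
    """Expand a Slurm node expression such as 'vm[01-03]' into host names."""
    result = subprocess.run(
        ["scontrol", "show", "hostnames", nodelist],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def build_revert_command(node, snapshot=SNAPSHOT_NAME, uri=LIBVIRT_URI):
    """Return the virsh command that reverts one VM and leaves it running."""
    return ["virsh", "-c", uri, "snapshot-revert", node, snapshot, "--running"]


def main():
    # slurmctld sets SLURM_JOB_NODELIST in the epilog's environment.
    nodelist = os.environ.get("SLURM_JOB_NODELIST", "")
    if not nodelist:
        return 0
    failed = False
    for node in expand_nodelist(nodelist):
        if subprocess.run(build_revert_command(node)).returncode != 0:
            print(f"failed to revert {node}", file=sys.stderr)
            failed = True
    return 1 if failed else 0


if __name__ == "__main__":
    status = main()
    if status:
        # Exit non-zero only when at least one VM could not be reverted.
        sys.exit(status)
```

Such a script would be referenced from slurm.conf through the
EpilogSlurmctld parameter.  A real deployment would presumably also need
to keep the node out of service until the revert has completed and
slurmd inside the VM has registered again.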

Does this sound like a good way to isolate jobs from each other?

Has anyone here done anything like this, or does anyone have
ideas/thoughts about how best to isolate jobs from each other?


-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
