Hi, I'm glad to announce an initial version of Slurm Simulator.
I have been working on this for the last few months and now have a version that is stable and deterministic enough. The point is to simulate a long trace of jobs, so no real jobs are executed. Instead, a simulation manager knows in advance how long each job will last.

The main goal is to offer this simulation mode without modifying the main Slurm code. This is accomplished with Slurm 2.1.9, and I think it could be migrated to newer Slurm versions easily as long as the main Slurm design does not change.

The simulation is based on the LD_PRELOAD mechanism: time-related and thread-related functions are intercepted, and wrappers are executed in their place for simulation purposes. Slurm is multithreaded and multiprocess software, so taking control of its execution is not simple at all. The main Slurm threads are executed in sequential order to achieve determinism. Other threads, related to job submission, job dispatch or job completion, run on their own, although the number of those threads created at the same time is limited. The simulation manager makes sure that threads "belonging" to a simulation cycle are executed within that cycle.

Several Slurm functionalities are not needed under simulation. Others have been modified for simplicity.

Simulation performance is good enough. Keeping such tight control over Slurm threads is not a problem, since it can all be done in milliseconds. The main problem is that normal Slurm scheduling under high load makes a simulator cycle (1 simulated second) last several real seconds. For a cluster with 3000 nodes / 12000 cores, executing 1000 jobs ranging from 30 seconds to an hour, with multifactor priority, and using an Intel Xeon with 8 cores to run the simulation, it takes ~1000 seconds to simulate 40000 seconds of cluster time (a speedup of about 40x). So a full-day simulation would take ~35 minutes.
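For readers unfamiliar with the LD_PRELOAD interception mentioned above, here is a minimal sketch of the idea in C. It is not taken from the patch: the file name, the `sim_now` variable and the `sim_set_time` / `sim_advance` hooks are hypothetical names chosen for illustration. It only replaces `time()` with a simulated clock; the actual simulator also wraps thread-related functions, which this sketch does not attempt.

```c
/* sim_time.c: hypothetical LD_PRELOAD shim, not taken from the Slurm patch.
 *
 * Build:  gcc -shared -fPIC sim_time.c -o libsimtime.so
 * Use:    LD_PRELOAD=./libsimtime.so ./some_program
 */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Simulated clock, in seconds since the epoch (starting value is arbitrary). */
static time_t sim_now = 1000;

/* Hypothetical hooks a simulation manager could call between cycles. */
void sim_set_time(time_t t)
{
    sim_now = t;
}

void sim_advance(time_t delta)
{
    assert(delta >= 0);   /* simulated time never moves backwards */
    sim_now += delta;
}

/* Same signature as libc time(). Calls resolve to this definition
 * because the preloaded library is searched before libc. */
time_t time(time_t *tloc)
{
    if (tloc != NULL)
        *tloc = sim_now;
    return sim_now;
}
```

With a shim like this preloaded, every `time(NULL)` call in the target program returns the simulated clock, so one call to `sim_advance(1)` corresponds to one simulated second. A wrapper that needs to pass some calls through to the original function would typically look it up with `dlsym(RTLD_NEXT, ...)`, and a multithreaded target would need locking around the shared clock; both are left out here to keep the sketch short.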
A patch for slurm-2.1.9 with a HOWTO and some configuration files can be obtained from http://www.bsc.es/plantillaA.php?cat_id=705

I'm looking forward to getting some feedback about the usefulness of this work.

Alejandro Lucero
