I'm working on integrating the Slurm simulator code into the main Slurm
code tree. The work also includes enhancements that make the simulator
easier to work with.

One really nice thing to have would be a job trace for simulation
obtained from a real production system. The idea is to create such a
trace from the database by specifying a start and an end point. It would
also create an initial state that takes the start point into account, so
some jobs could already be running when the simulation starts.
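
As a rough illustration of the window idea, here is a minimal Python
sketch. It is only a sketch under assumptions: the JobRecord fields and
the split_trace helper are hypothetical names made up for this example
and are not the actual job table columns or simulator code. It sorts
accounting records into jobs already running at the start point, jobs
still pending at the start point, and jobs submitted inside the window.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class JobRecord:
    # Hypothetical fields for illustration; not the real job table schema.
    job_id: int
    submit: int            # submit time, epoch seconds
    start: Optional[int]   # start time, None if the job never started
    end: Optional[int]     # end time, None if the job has not ended


def split_trace(
    jobs: List[JobRecord], t0: int, t1: int
) -> Tuple[List[JobRecord], List[JobRecord], List[JobRecord]]:
    """Classify jobs for a trace covering the window [t0, t1)."""
    running, pending, arriving = [], [], []
    for job in jobs:
        if job.submit >= t1:
            continue  # submitted after the window closes
        if job.end is not None and job.end <= t0:
            continue  # finished or cancelled before the window opens
        if job.submit >= t0:
            arriving.append(job)   # submitted during the window
        elif job.start is not None and job.start < t0:
            running.append(job)    # part of the initial state, running
        else:
            pending.append(job)    # part of the initial state, queued
    return running, pending, arriving


jobs = [
    JobRecord(1, submit=0, start=10, end=50),
    JobRecord(2, submit=50, start=60, end=150),
    JobRecord(3, submit=90, start=120, end=180),
    JobRecord(4, submit=130, start=140, end=300),
    JobRecord(5, submit=250, start=None, end=None),
    JobRecord(6, submit=80, start=None, end=95),
]
running, pending, arriving = split_trace(jobs, t0=100, t1=200)
print([j.job_id for j in running], [j.job_id for j in pending],
      [j.job_id for j in arriving])
```

The running and pending lists together would form the initial state, and
the arriving list would be replayed as submissions during the simulation.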

With the current job table description it is not possible to get such a
trace with a good level of detail. Some sbatch parameters, such as
--dependency, --extra-node-info, --constraint, --exclusive, --hold,
--licenses, ..., cannot be recovered just by looking at the database.

I think it would be good to have a text field recording the full sbatch
or srun parameters. Even if the simulator is not used, sometimes you
need to understand what happened to some jobs (maybe a user claims that
lower-priority jobs from other users overtook his job), and without this
information you cannot go further. I remember having this problem on a
cluster where dependencies were often used.
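
To show how such a text field could be used afterwards, here is a small
Python sketch. It assumes the full command line is stored as a single
string; the parse_submit_line helper and the example line are
hypothetical and only for illustration. The sketch understands long
options in the --name=value or bare --name form and stops at the first
token that does not start with a dash, so short options with separate
values are not handled.

```python
import shlex
from typing import Dict, Union


def parse_submit_line(line: str) -> Dict[str, Union[str, bool]]:
    """Extract long options from a recorded sbatch/srun command line.

    Limitation of this sketch: only --name=value and bare --name forms
    are recognised; parsing stops at the first non-option token, which
    is taken to be the script name.
    """
    options: Dict[str, Union[str, bool]] = {}
    tokens = shlex.split(line)
    for token in tokens[1:]:          # skip the command name itself
        if not token.startswith("-"):
            break                     # script name reached; rest are its args
        if token.startswith("--"):
            name, sep, value = token[2:].partition("=")
            options[name] = value if sep else True
    return options


# Hypothetical recorded line, as it might appear in the proposed field.
recorded = ('sbatch --dependency=afterok:1234 --exclusive '
            '--licenses=matlab:2 --constraint="bigmem&ib" job.sh --verbose')
opts = parse_submit_line(recorded)
print(opts)
```

With something like this, an administrator could check whether a job was
held back by a dependency or a license request rather than by priority.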

The change is not really hard, so if no one sees a problem with it I will
work on getting the functionality into the next Slurm version.


