Some time ago we used the slurmctld prolog for this.

2017-10-16 16:36 GMT+02:00 Ryan Richholt <ryanrichh...@gmail.com>:
> Thanks, that sounds like a good idea. A prolog script could also handle
> this, right? That way, if the node crashes while the job is running, it
> would still be saved.
>
> On Mon, Oct 16, 2017 at 3:20 AM Merlin Hartley <
> merlin-sl...@mrc-mbu.cam.ac.uk> wrote:
>
>> You could also use a simple epilog script to save the output of ‘scontrol
>> show job’ to a file/database.
>>
>> M
>>
>> --
>> Merlin Hartley
>> Computer Officer
>> MRC Mitochondrial Biology Unit
>> Cambridge, CB2 0XY
>> United Kingdom
>>
>> On 15 Oct 2017, at 20:49, Ryan Richholt <ryanrichh...@gmail.com> wrote:
>>
>> Is there any way to get the job command with sacct?
>>
>> For example, if I submit a job like this:
>>
>> $ sbatch testArgs.sh hey there
>>
>> I can get the full command from "scontrol show job":
>>
>> ...
>> Command=/home/rrichholt/scripts/testArgs.sh hey there
>> ...
>>
>> But that information is not available long-term with sacct.
>>
>> To explain why I would like this: I'm dealing with a workflow that
>> submits lots of jobs for different projects. Each job submits the same
>> script, but the first argument points to a different project directory.
>> When jobs fail, it's very hard to tell which project they were working
>> on, because the "scontrol show job" record only lasts for 300 seconds.
>> Sometimes jobs fail at night and I don't know until the next morning.
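If an epilog or prolog dumps `scontrol show job` to a file per job as suggested above, the Command line can be recovered from the saved record afterwards. A minimal sketch of that step in Python; the `job_command` helper is hypothetical (not part of Slurm), and the sample record is modeled on the output quoted in the thread:

```python
def job_command(scontrol_output):
    """Extract the Command= field from saved 'scontrol show job' output.

    The record is a series of Key=Value pairs; Command sits on its own
    line and may contain spaces (script path plus arguments), so we take
    everything after "Command=" rather than splitting on whitespace.
    """
    for line in scontrol_output.splitlines():
        line = line.strip()
        if line.startswith("Command="):
            return line[len("Command="):]
    return None  # record had no Command field

# Sample record fragment, shaped like the scontrol output quoted above.
record = """JobId=1234 JobName=testArgs.sh
   Command=/home/rrichholt/scripts/testArgs.sh hey there
   WorkDir=/home/rrichholt"""
print(job_command(record))  # -> /home/rrichholt/scripts/testArgs.sh hey there
```

With one such file per job ID, finding the project a failed job was working on becomes a simple grep over the archive, with no dependency on the 300-second window.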