Thanks, that sounds like a good idea. A prolog script could also handle this, right? That way, if the node crashes while the job is running, the job details would already be saved.
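
Roughly what I have in mind is below: a minimal, untested sketch rather than a finished script. It is written in Python, though a shell script would do equally well. It assumes Slurm sets SLURM_JOB_ID in the prolog environment and that "scontrol show job <id>" works from the node at that point. The archive directory and the JOB_RECORD_DIR variable are names I made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a Slurm prolog that saves `scontrol show job` output to a file."""
import os
import subprocess
import sys
from pathlib import Path

# Hypothetical archive location; any directory writable by the prolog user works.
DEFAULT_ARCHIVE_DIR = "/var/log/slurm/job_records"


def fetch_job_record(job_id, run=subprocess.run):
    """Return the text printed by `scontrol show job <job_id>`."""
    result = run(
        ["scontrol", "show", "job", str(job_id)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def save_job_record(job_id, archive_dir, run=subprocess.run):
    """Write the job record to <archive_dir>/<job_id>.txt and return the path."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    target = archive / f"{job_id}.txt"
    target.write_text(fetch_job_record(job_id, run=run))
    return target


def main():
    job_id = os.environ.get("SLURM_JOB_ID")
    if not job_id:
        # Not running under Slurm; nothing to record.
        return 0
    # JOB_RECORD_DIR is a made-up override for the archive directory.
    archive_dir = os.environ.get("JOB_RECORD_DIR", DEFAULT_ARCHIVE_DIR)
    try:
        save_job_record(job_id, archive_dir)
    except (OSError, subprocess.CalledProcessError) as exc:
        # A failing prolog can drain the node, so report the problem and carry on.
        print(f"job record not saved: {exc}", file=sys.stderr)
    return 0


if __name__ == "__main__":
    main()
```

If this works as I expect, it would be wired in through the Prolog= setting in slurm.conf, and the saved file would contain the Command= line for each job. One caveat: the Slurm prolog/epilog guide advises keeping these scripts short and cautions against calling Slurm commands from them, so this may not be appropriate on a busy cluster.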

On Mon, Oct 16, 2017 at 3:20 AM Merlin Hartley <[email protected]> wrote:

> You could also use a simple epilog script to save the output of ‘scontrol
> show job’ to a file/database.
>
> M
>
> --
> Merlin Hartley
> Computer Officer
> MRC Mitochondrial Biology Unit
> Cambridge, CB2 0XY
> United Kingdom
>
> On 15 Oct 2017, at 20:49, Ryan Richholt <[email protected]> wrote:
>
> Is there any way to get the job command with sacct?
>
> For example, if I submit a job like this:
>
> $ sbatch testArgs.sh hey there
>
> I can get the full command from "scontrol show job":
>
> ...
> Command=/home/rrichholt/scripts/testArgs.sh hey there
> ...
>
> But, that information is not available long-term with sacct.
>
> To explain why I would like this:
>
> I'm dealing with a workflow that submits lots of jobs for different
> projects. Each submits the same script, but the first argument points to a
> different project directory. When jobs fail, it's very hard to tell which
> project they were working on, because "scontrol show job" only lasts for
> 300 seconds. Sometimes they fail at night and I don't know until the next
> morning.
