Hi David,

We've implemented a very similar setup to yours, using "scontrol show job" and looking for the StdOut file. Not very elegant (a SPANK plugin might be the better option), but it works for us.
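A minimal sketch of that lookup, written as an EpilogSlurmctld script. This is illustrative only, not our exact script: the extract_stdout helper, the summary fields, and the assumption that scontrol is on PATH and SLURM_JOB_ID is set by slurmctld are all mine.

```shell
#!/bin/sh
# Hypothetical EpilogSlurmctld script: append a short job summary to the
# job's stdout file after it completes.

# "scontrol show job" prints space-separated Key=Value pairs; pull out the
# StdOut path (paths containing spaces are not handled by this sketch).
extract_stdout() {
    tr ' ' '\n' | awk -F= '$1 == "StdOut" { print $2 }'
}

if command -v scontrol >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID}" ]; then
    stdout_file=$(scontrol show job "${SLURM_JOB_ID}" | extract_stdout)

    # Only append if the job actually has a writable stdout file
    # (interactive jobs may not).
    if [ -n "$stdout_file" ] && [ -w "$stdout_file" ]; then
        {
            echo "=== Job ${SLURM_JOB_ID} summary ==="
            echo "Finished: $(date)"
            echo "Nodes: ${SLURM_JOB_NODELIST:-unknown}"
        } >> "$stdout_file"
    fi
fi
```

Wired up in slurm.conf with something like EpilogSlurmctld=/etc/slurm/epilog.slurmctld (path hypothetical), this runs once per job on the head node rather than once per allocated node.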
A couple of small points:

- We use EpilogSlurmctld rather than Epilog; the Epilog runs on each compute node, so it may not be the best fit for a singleton task like writing job stats (and that might explain why you get two "Hello World" entries in your output file). EpilogSlurmctld runs once, on the head node only.

- It's probably best to leave the slurm.epilog.clean file unmodified, as it is a standard file in Slurm and could be replaced by an upgrade, so you'd have to manually reproduce your edits to it (we've been bitten by that with RPM upgrades). Instead we've started to configure a completely new file as the Epilog (Epilog=.../slurm.epilog.local) for performing custom tasks (post-job node health checks in our case), but it explicitly calls slurm.epilog.clean at the end, as that script cleans up user processes. But anyhow, I guess that's moot if you instead use EpilogSlurmctld for your post-job stats writing. :)

Hope that helps.

Kind regards,
Paddy

On Mon, Jun 27, 2016 at 01:03:47AM -0700, Baker D.J. wrote:
> Hi Lyn,
>
> Thank you for your reply to my question. I've been exploring and
> experimenting with the prolog and epilog today. So, for example, I've put
> the following piece of code in my "final" epilog, epilog.clean:
>
> stdout=`/local/software/slurm/default/bin/scontrol show job ${SLURM_JOB_ID} | grep -i stdout | cut -f2 -d '='`
> #stdout=/local/software/slurm/default/etc/output
> echo 'Hello World from Epilog.clean' >>$stdout
>
> slurm.conf entry:
>
> Epilog=/local/software/slurm/default/etc/slurm.epilog.clean
>
> This does the job. There may be a more elegant way to do things, however this
> does work. I do, however, get the "Hello World" statement written twice
> in my job output. I assume that this final epilog will be executed once at
> the end of the job. Do you or anyone else on the forum understand why the
> statement is echoed twice?
>
> ...
> pi is approximately 3.1416009869231249, Error is 0.0000083333333318
> wall clock time = 0.883930
> Hello World from Epilog.clean
> Hello World from Epilog.clean
> ...
>
> Best regards,
> David
>
> From: Lyn Gerner [mailto:schedulerqu...@gmail.com]
> Sent: Thursday, June 23, 2016 6:56 PM
> To: slurm-dev <slurm-dev@schedmd.com>
> Subject: [slurm-dev] Re: Writing to job output files from prolog and epilog scripts
>
> Hi David,
>
> Be sure to note the special methods for setting env variables and writing to
> stdout from the task prolog, in the Prolog and Epilog Guide web page. In
> order to write job summary info (during one of your epilogs), you can acquire
> the stdout location from scontrol show job ${SLURM_JOB_ID} and just echo to it.
>
> Regards,
> Lyn
>
> On Thu, Jun 23, 2016 at 2:19 AM, Baker D.J. <d.j.ba...@soton.ac.uk> wrote:
> Hello,
>
> I'm sure that this question has been asked before, however I don't recall
> finding a satisfactory answer to it. We are investigating moving
> from Torque/Moab to Slurm on our HPC clusters. In our Torque prologue and
> epilogue scripts we write information into users' job output files: for
> example, where on the cluster the job executed (on which compute nodes) and
> how many resources the job used.
>
> I've set up some prototype Slurm prolog and epilog scripts and included
> some write (echo) statements, however I don't see any of the information in
> the job output files. Is writing information into output files much more tricky
> in Slurm, or have I missed something fundamental? Alternatively, are there
> other ways and means of doing this? Could someone please advise me?
>
> Best regards,
> David

--
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin,
Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/
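The wrapper-epilog idea described in the reply above could look like the following sketch. The health-check script name is hypothetical; only the slurm.epilog.clean path comes from the thread, and the final step hands off to the stock script so its user-process cleanup still runs.

```shell
#!/bin/sh
# Sketch of a site-local Epilog (e.g. slurm.epilog.local), configured in
# slurm.conf as: Epilog=/local/software/slurm/default/etc/slurm.epilog.local

# Site-specific post-job tasks go first, e.g. a node health check.
# (node_health_check is a hypothetical site script, not part of Slurm.)
if [ -x /usr/local/sbin/node_health_check ]; then
    /usr/local/sbin/node_health_check "${SLURM_JOB_ID}"
fi

# Always finish by running the unmodified stock epilog, which cleans up
# any leftover user processes on this node.
STOCK_EPILOG=/local/software/slurm/default/etc/slurm.epilog.clean
if [ -x "$STOCK_EPILOG" ]; then
    exec "$STOCK_EPILOG"
fi
```

Keeping the stock file untouched and chaining to it means an RPM upgrade can replace slurm.epilog.clean without clobbering the site customisations.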