Hello list, thank you for your thoughts and help. I found my problem to be caused by myself (as always).
I was using srun to copy my files, and in doing so I also copied the job's output file, which somehow left SLURM unable to log anything further to it, except the very end. I replaced the copy with rsync and excluded my .sbatch and .out files. Now the script runs like a charm.

Thank you guys.

Best regards,
Dennis

On 10.07.2017 at 10:13, Carlos Fenoy wrote:
> Re: [slurm-dev] Re: srun can't use variables in a batch script after upgrade
>
> Hi,
>
> any idea why the output of your job is not complete? There is nothing
> after "Copying files...". Does the /work/tants directory exist on all
> the nodes? The variable $SLURM_JOB_NAME is interpreted by bash, so srun
> only sees "srun -N2 -n2 rm -rf /work/tants/mpicopytest".
>
> Regards,
> Carlos
>
> On Mon, Jul 10, 2017 at 10:02 AM, Dennis Tants <[email protected]> wrote:
>
> Hello Loris,
>
> On 10.07.2017 at 07:39, Loris Bennett wrote:
> > Hi Dennis,
> >
> > Dennis Tants <[email protected]> writes:
> >
> >> Hi list,
> >>
> >> I am a little bit lost right now and would appreciate your help.
> >> We have a little cluster with 16 nodes running with SLURM and it is
> >> doing everything we want, except a few
> >> little things I want to improve.
> >>
> >> So that is why I wanted to upgrade our old SLURM 15.X (don't know the
> >> exact version) to 17.02.4 on my test machine.
> >> I just deleted the old version completely with 'yum erase slurm-*'
> >> (CentOS 7 btw.) and built the new version with rpmbuild.
> >> Everything went fine so I started configuring a new slurm[dbd].conf.
> >> This time I also wanted to integrate backfill instead of FIFO
> >> and also use accounting (just to know which person uses the most
> >> resources). Because we had no databases yet I started
> >> slurmdbd and slurmctld without problems.
> >>
> >> Everything seemed fine with a simple mpi hello world test on one and two
> >> nodes.
> >> Now I wanted to enhance the script a bit more and include working in the
> >> local directory of the nodes, which is /work.
> >> To get everything up and running I used the script which I attached for
> >> you (it also includes the output after running the script).
> >> It should basically just copy all data to /work/tants/$SLURM_JOB_NAME
> >> before doing the mpi hello world.
> >> But it seems that srun does not know $SLURM_JOB_NAME even though it is
> >> there.
> >> /work/tants belongs to the correct user and has rwx permissions.
> >>
> >> So did I just configure something wrong or what happened here? Nearly
> >> the same example is working on our cluster with
> >> 15.X. The script is only for testing purposes, that's why there are so
> >> many echo commands in there.
> >> If you see any mistake or can recommend better configurations I would
> >> gladly hear them.
> >> Should you need any more information I will provide it.
> >> Thank you for your time!
> > Shouldn't the variable be $SBATCH_JOB_NAME?
> >
> > Cheers,
> >
> > Loris
> >
>
> when I use "echo $SLURM_JOB_NAME" it will tell me the name I specified
> with #SBATCH -J.
> It is not working with srun in this version (it was working in 15.x).
>
> However, when I now use "echo $SBATCH_JOB_NAME" it is just a blank
> variable. As told by someone from the list,
> I used the command "env" to verify which variables are available. This
> list includes SLURM_JOB_NAME
> with the name I specified. So $SLURM_JOB_NAME shouldn't be a problem.
>
> Thank you for your suggestion though.
> Any other hints?
>
> Best regards,
> Dennis
>
> --
> Dennis Tants
> Trainee: IT specialist for systems integration
>
> ZARM - Zentrum für angewandte Raumfahrttechnologie und Mikrogravitation
> ZARM - Center of Applied Space Technology and Microgravity
>
> Universität Bremen
> Am Fallturm
> 28359 Bremen, Germany
>
> Phone: 0421 218 57940
> Email: [email protected]
>
> www.zarm.uni-bremen.de
>
> --
> Carles Fenoy

--
Dennis Tants
Trainee: IT specialist for systems integration

ZARM - Zentrum für angewandte Raumfahrttechnologie und Mikrogravitation
ZARM - Center of Applied Space Technology and Microgravity

Universität Bremen
Am Fallturm
28359 Bremen, Germany

Phone: 0421 218 57940
Email: [email protected]

www.zarm.uni-bremen.de
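
P.S. For anyone who finds this thread later: the fix described at the top (copy with rsync, but leave out the .sbatch and .out files) could look roughly like the sketch below. This is not the script that was attached to the original mail. The function name stage_job_files is made up for illustration, the two-node srun line only mirrors the "-N2 -n2" quoted above, and using $SLURM_SUBMIT_DIR as the source directory is an assumption.

```shell
#!/bin/bash
# Sketch only: stage job inputs into a node-local directory while skipping
# the batch script and Slurm's output file, so the live .out file is never
# touched by the copy.

# stage_job_files is a hypothetical helper name, not part of Slurm.
stage_job_files() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    # The trailing slash on the source copies the directory's contents,
    # not the directory itself.
    rsync -a --exclude='*.sbatch' --exclude='*.out' "$src"/ "$dest"/
}

# Only inside a running job: do the same copy once on each of two nodes.
# Assumption: the job was submitted from the directory holding the inputs.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun -N2 -n2 rsync -a --exclude='*.sbatch' --exclude='*.out' \
        "$SLURM_SUBMIT_DIR"/ "/work/tants/$SLURM_JOB_NAME"/
fi
```

Be aware that the pattern '*.out' also matches a compiled a.out, so it may need narrowing if binaries with that suffix have to be copied as well.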

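
P.P.S. Carlos's point that bash interprets $SLURM_JOB_NAME before srun ever sees it can be checked without a cluster. In the small sketch below, show_args is a hypothetical stand-in for srun that only prints the arguments it receives, and the fallback job name is the one quoted in the thread.

```shell
#!/bin/bash
# show_args is a made-up stand-in for srun: it prints each argument it
# receives on its own line, wrapped in brackets.
show_args() {
    printf '[%s]\n' "$@"
}

# Outside a real job the variable is unset, so fall back to the job name
# quoted in the thread.
SLURM_JOB_NAME="${SLURM_JOB_NAME:-mpicopytest}"

# Double quotes: bash substitutes the value before the command starts,
# so the command receives the finished path.
show_args rm -rf "/work/tants/$SLURM_JOB_NAME"

# Single quotes: the literal text is handed over unchanged.
show_args rm -rf '/work/tants/$SLURM_JOB_NAME'
```

The first call receives the expanded path and the second receives the literal string, which is why an unquoted or double-quoted $SLURM_JOB_NAME on an srun line inside the batch script is resolved by the batch shell itself.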