Hi,

David Bigagli <[email protected]> writes:
> can we see what script is broken? You may want to review #991 to
> get the full picture.

We have a small shell script called "jobsh" that has been designed to behave similarly to RSH/SSH. When run non-interactively it uses -u, which has worked fine since at least Slurm 2.4. Obviously not real-world examples, but things like this work:

  rsync -e jobsh ...            - rsync over srun
  GIT_SSH=jobsh git clone ...   - clone git repo over srun

In Slurm 14.11 this breaks, since -u now injects carriage return characters into the output. Running without --unbuffered in 14.11 appears to give the behavior that --unbuffered gave in previous versions.

So for this use case we could change our script to check the Slurm version and remove -u if on 14.11. But that is the kind of change that we should not have to make when we rely on an option that has worked for 10+ years.

Bug #991 appears to be about a mismatch between observed behavior and documentation. The problem is that the documentation was then assumed to be wrong. The documentation actually described pretty accurately how srun has "always" done buffering.

I can't find any mention anywhere of why the behavior was changed. Maybe the change was made unintentionally? That would explain the lack of updated documentation. There also appears to be no way at all to get the old behavior with 14.11?

The line buffering actually makes a lot of sense as default behavior. There are sadly plenty of codes out there that have rather bad output routines. With srun doing line buffering, at least output from multiple processes doesn't end up on the same line.

Here is a stupid example that shows how srun used to behave, compared to now.

--- srun-buffering-test ----------------
#!/bin/bash
echo -n "$(hostname): "
i=1
while [ $i -lt 20 ]; do
    echo -n "$i"
    [ $((i%10)) == 0 ] && echo -ne "\n$(hostname): "
    sleep 0.5
    ((i++))
done
echo ''
----------------------------------------

--- srun-buffering-test.job ------------
#!/bin/bash
#SBATCH -N2 -n2 -t 15
#SBATCH -o srun-buffering-test.out
srun --version
set -x
srun ./srun-buffering-test
srun -u ./srun-buffering-test
----------------------------------------

Running this job on previous versions (tested on 2.4, 2.6, 14.03) results in output like this:

--- srun-buffering-test.out ------------
slurm 2.6.4
+ srun ./srun-buffering-test
a3: 12345678910
a4: 12345678910
a3: 111213141516171819
a4: 111213141516171819
+ srun -u ./srun-buffering-test
a3: a4: 11223344556677889910
a4: 10
a3: 111112121313141415151616171718181919
----------------------------------------

With 14.11 both srun and srun -u give the scrambled output, the difference being that with -u line breaks are also changed into CR+LF (which might not be shown by most e-mail clients, however):

--- srun-buffering-test.out ------------
slurm 14.11.4
+ srun ./srun-buffering-test
n549: 1n550: 1223344556677889910
n549: 10
n550: 111112121313141415151616171718181919
+ srun -u ./srun-buffering-test
n549: 1n550: 1223344556677889910
n549: 10
n550: 111112121313141415151616171718181919
----------------------------------------

I would still prefer that this change be reverted completely. At the very least you need to restore the previous -u,--unbuffered behavior.

If the current default behavior is kept, then -u,--unbuffered should simply do nothing. A --line-buffered option to get the old behavior would also be nice in that case.

Regards,
Pär Lindfors, NSC
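
For illustration, the version-check workaround mentioned earlier (dropping -u only on 14.11) might look roughly like the sketch below. This is not the actual jobsh script: the function name srun_unbuffered_flag is hypothetical, the version string format "slurm X.Y.Z" is assumed from the `srun --version` lines in the test output, and only 14.11 is treated as affected because it is the only version reported here.

```shell
#!/bin/bash
# Hypothetical helper (not taken from jobsh): decide whether to pass -u
# to srun, given a version line such as "slurm 14.11.4".
# Assumption: `srun --version` prints "slurm X.Y.Z", as in the output above.
srun_unbuffered_flag() {
    local version_line="$1"
    local version="${version_line#slurm }"   # strip the leading "slurm "

    case "$version" in
        14.11|14.11.*)
            # -u injects CR characters on 14.11, so leave it out there.
            echo ""
            ;;
        *)
            # Every other version is assumed to have the old -u behavior.
            echo "-u"
            ;;
    esac
}

# Intended use inside a wrapper script (assumes srun is on PATH):
#   flag=$(srun_unbuffered_flag "$(srun --version)")
#   srun $flag "$@"
```

The flag is deliberately left unquoted in the usage comment so that an empty result expands to no argument at all.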
