Dear all, we see messages like this:
grep "wrong node" /var/log/slurm/slurmctld.log
[2015-02-04T04:27:05.591] error: Registered job 11923579.0 on wrong node taurusi3033
[2015-02-04T04:27:05.591] error: Registered job 11900205.4294967294 on wrong node taurusi3033
[2015-02-04T04:27:05.591] error: Registered job 11925038.0 on wrong node taurusi3033
[2015-02-04T08:59:23.360] error: Registered job 11923729.0 on wrong node taurusi3019
[2015-02-04T09:23:23.143] error: Registered job 11923729.0 on wrong node taurusi3107
[2015-02-04T11:01:58.993] error: Batch completion for job 11923075 sent from wrong node (taurusi3178 rather than taurusi3084), ignored request
[2015-02-04T11:28:31.198] error: Batch completion for job 11925657 sent from wrong node (taurusi3137 rather than taurusi1235), ignored request
[2015-02-04T12:17:06.055] error: Registered job 11925657.0 on wrong node taurusi3137

What can possibly have gone wrong here? I have no clue! (Slurm 14.11.03)

Thank you
Ulf

--
___________________________________________________________________
Dr. Ulf Markwardt
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany
Phone: (+49) 351/463-33640
WWW: http://www.tu-dresden.de/zih
