Ulf, I would verify that slurm.conf is identical on every node.
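
A rough sketch of one way to check this, in case it helps. It compares one checksum per node and lists the nodes that disagree with the majority. The helper name conf_mismatch is made up for illustration, and the path /etc/slurm/slurm.conf and passwordless ssh to the nodes are assumptions about your site, so adjust as needed.

```shell
# conf_mismatch: hypothetical helper, not part of Slurm.
# Reads "node checksum" lines on stdin and prints, sorted, every node
# whose checksum differs from the most common checksum.
conf_mismatch() {
    awk '
        { sum[$1] = $2; count[$2]++ }
        END {
            best = ""
            # pick the checksum reported by the largest number of nodes
            for (h in count)
                if (best == "" || count[h] > count[best]) best = h
            # report every node that does not match it
            for (n in sum)
                if (sum[n] != best) print n
        }
    ' | sort
}

# Example use (assumes ssh access and the config path below):
#   for n in taurusi3033 taurusi3019 taurusi3107; do
#       printf '%s %s\n' "$n" \
#           "$(ssh "$n" md5sum /etc/slurm/slurm.conf | cut -d' ' -f1)"
#   done | conf_mismatch
```

Empty output means all checked nodes report the same checksum. If two checksums are equally common, the choice of "majority" is arbitrary, so in that case compare the files by hand.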

On February 4, 2015 3:41:35 AM PST, Ulf Markwardt <[email protected]> 
wrote:
>Dear all, 
>
>we see messages like this:
>
> grep "wrong node" /var/log/slurm/slurmctld.log
>
>[2015-02-04T04:27:05.591] error: Registered job 11923579.0 on wrong
>node taurusi3033
>[2015-02-04T04:27:05.591] error: Registered job 11900205.4294967294 on
>wrong node taurusi3033
>[2015-02-04T04:27:05.591] error: Registered job 11925038.0 on wrong
>node taurusi3033
>[2015-02-04T08:59:23.360] error: Registered job 11923729.0 on wrong
>node taurusi3019
>[2015-02-04T09:23:23.143] error: Registered job 11923729.0 on wrong
>node taurusi3107
>[2015-02-04T11:01:58.993] error: Batch completion for job 11923075 sent
>from wrong node (taurusi3178 rather than taurusi3084), ignored request
>[2015-02-04T11:28:31.198] error: Batch completion for job 11925657 sent
>from wrong node (taurusi3137 rather than taurusi1235), ignored request
>[2015-02-04T12:17:06.055] error: Registered job 11925657.0 on wrong
>node taurusi3137
>
>What can possibly have gone wrong here? I have no clue!
>(Slurm 14.11.03)
>
>Thank you
>Ulf
>
>-- 
>___________________________________________________________________
>Dr. Ulf Markwardt
>
>Technische Universität Dresden
>Center for Information Services and High Performance Computing (ZIH)
>01062 Dresden, Germany
>
>Phone: (+49) 351/463-33640      WWW:  http://www.tu-dresden.de/zih
