Read the slurm.conf manual, specifically the parameters that start with Node (NodeName, NodeHostname, NodeAddr). They discuss this situation.
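As a rough illustration of what those parameters allow: NodeName is the name Slurm itself uses for a node, while NodeHostname can be set separately and defaults to NodeName. One possible approach is to keep the short names as NodeName and list the FQDN as NodeHostname, one line per node, so that no hostlist expression has to carry a domain suffix. The Python sketch below only generates such lines. The helper name `node_lines` is made up for this example, and whether this layout actually silences the `find_node_record` error on your cluster is an assumption I have not tested.

```python
def node_lines(prefix, first, last, domain, width=3, attrs=""):
    """Build one slurm.conf node line per host (hypothetical helper).

    NodeName keeps the short name; NodeHostname carries the FQDN.
    """
    lines = []
    for i in range(first, last + 1):
        short = f"{prefix}{i:0{width}d}"  # zero-padded, e.g. node306
        line = f"NodeName={short} NodeHostname={short}.{domain}"
        if attrs:
            line += f" {attrs}"
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Hardware attributes copied from the NodeName line in the original post.
    attrs = ("CPUs=16 SocketsPerBoard=2 CoresPerSocket=8 "
             "ThreadsPerCore=1 RealMemory=64100")
    for line in node_lines("node", 1, 332, "yourdomain.com", attrs=attrs):
        print(line)
```

Run as a script, this prints 332 lines that could be reviewed and pasted into slurm.conf in place of the single hostlist line. Check the result against the man page for your Slurm version before using it.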
--
Ryan Novosielski - novos...@rutgers.edu
Sr. Technologist - 973/972.0922 (2x0922), RBHS Campus
Office of Advanced Research Computing - MSB C630, Newark
Rutgers, the State University of NJ

On Apr 15, 2017, at 11:47, Jianwen Wei <wei.jian...@gmail.com> wrote:

Hi,

I used *short* hostnames (say node306) on all my compute nodes and in my SLURM settings before, and that worked well. However, error messages appear in /var/log/slurmctld.log now that I have set FQDNs for the compute nodes.

[2017-04-15T22:50:06.149] error: find_node_record: lookup failure for node306.yourdomain.com

On node306:

$ hostname
node306.yourdomain.com
$ hostname -s
node306
$ hostname -f
node306.yourdomain.com

In /etc/slurm/slurm.conf, short names are used, since FQDNs prevent the use of hostlist expressions. That is, "node[001-332].yourdomain.com" is invalid.

NodeName=node[001-332] CPUs=16 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=64100

So far, SLURM works fine despite the error message appearing in the log every 10 minutes. I would appreciate any suggestions on this issue.

Best,
Jianwen