On 05/04/16 09:01, Mehdi Acheli wrote:

> Everything in slurm is working fine. I can issue jobs and see the state
> of the eight nodes as Idle. However, when I try to connect to a compute
> node with a user, even if he has a job running on, I get rejected. The
> log shows that the pam module is unable to contact slurm controller and
> the error is returned by slurm_load_job_user. But the squeue command
> which uses the same routine as I read works fine on the same node...

Any chance you could set up a situation where a user has a job on a
particular node and then send:

1) "scontrol show slurmd" from the compute node
2) the PAM logs that show the failure

Also, is this with pam_slurm or pam_slurm_adopt?

All the best,
Chris
--
Christopher Samuel        Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: [email protected] Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/      http://twitter.com/vlsci
