I see the following line in slurmctld.log:

Node w51 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.

I will sync the conf files and retry.

On Tue, Feb 22, 2011 at 3:15 PM, Danny Auble <[email protected]> wrote:

> Is there anything of interest in the slurmctld log? How about the slurmd
> log on the node running the job?
>
> On 02/22/11 15:14, Paul Thirumalai wrote:
>
> But new jobs are still getting stuck in CG state.
>
> On Tue, Feb 22, 2011 at 3:13 PM, Paul Thirumalai <[email protected]> wrote:
>
>> I am using slurm 2.2. I restarted slurmctld using the -c option and
>> squeue now does not show any new jobs.
>>
>> On Tue, Feb 22, 2011 at 3:11 PM, Jerry Smith <[email protected]> wrote:
>>
>>> Have you tried restarting slurmctld? We had this issue back a few
>>> revisions, but it seemed to go away with a newer rev (though I
>>> couldn't tell you which one).
>>>
>>> Have you tried setting the state to RESUME for the nodes?
>>>
>>> What version of Slurm are you running?
>>>
>>> Jerry
>>>
>>> Paul Thirumalai wrote:
>>>
>>> I am not using epilog while launching these jobs. The jobs are simple
>>> python scripts that run the hostname command and put the output in a
>>> file that is provided on the command line.
>>>
>>> I can see that the file was written. This tells me that the job
>>> completed. When I log in to the node I don't see the process running.
>>> However, squeue still tells me that the job is in CG state.
>>>
>>> I stopped all the slurm daemons and restarted them, but the state of
>>> the job is still CG and it shows up in squeue.
>>>
>>> On Tue, Feb 22, 2011 at 2:45 PM, Paul Thirumalai <[email protected]> wrote:
>>>
>>>> Actually I just figured out that all jobs seem to be stuck in
>>>> COMPLETING state. I am now reading
>>>> https://computing.llnl.gov/linux/slurm/faq.html#comp
>>>>
>>>> I will continue to troubleshoot. If I run into issues, I will repost
>>>> to this thread.
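
For anyone hitting the same "different slurm.conf" warning: one rough way to confirm a mismatch before syncing is to compare checksums of the controller's slurm.conf and a copy fetched from the node. The sketch below is only an illustration. The function names are made up for this example, it assumes both files are readable locally, and SHA-256 is used purely as a byte-for-byte comparison (it is not claimed to be the hash slurmctld computes internally).

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def confs_match(controller_conf, node_conf):
    """Return True if the two config files are byte-for-byte identical.

    Both arguments are local paths; the node's file is assumed to have
    been copied over beforehand (for example with scp).
    """
    return file_digest(controller_conf) == file_digest(node_conf)
```

Calling `confs_match("slurm.conf", "w51-slurm.conf")` (both file names hypothetical) returns False if the copies differ at all, including whitespace or comment differences, so a False result means the files are not identical rather than that they are functionally different.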
