Syncing slurm.conf on all nodes fixed the issue. Thanks, all, for your help.
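
For anyone who hits the same warning, here is a minimal sketch of one way to spot which nodes have a stale slurm.conf before restarting anything. It assumes you have already copied each node's slurm.conf to a local directory (for example with scp); the function names, node labels, and paths are hypothetical and not part of Slurm. It compares raw file bytes and does not attempt to reproduce whatever hash slurmctld computes internally.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_mismatched(reference, copies):
    """Return the labels in `copies` whose file differs from `reference`.

    `reference` is the path to the controller's slurm.conf.
    `copies` maps a label (e.g. a node name) to the path of that node's
    collected slurm.conf copy. Labels and paths are hypothetical.
    """
    expected = file_digest(reference)
    return sorted(
        name for name, path in copies.items()
        if file_digest(path) != expected
    )
```

Calling find_mismatched("ctl/slurm.conf", {"w51": "w51/slurm.conf"}) returns the list of node labels whose copy differs from the controller's file. Because the comparison is byte for byte, even a whitespace-only difference is reported as a mismatch.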

On Tue, Feb 22, 2011 at 3:19 PM, Paul Thirumalai
<[email protected]> wrote:

> I see the following line in slurmctld.log
>
>  Node w51 appears to have a different slurm.conf than the slurmctld.  This
> could cause issues with communication and functionality.  Please review both
> files and make sure they are the same.  If this is expected ignore, and set
> DebugFlags=NO_CONF_HASH in your slurm.conf.
>
> I will sync the conf files and retry
>
>
> On Tue, Feb 22, 2011 at 3:15 PM, Danny Auble <[email protected]> wrote:
>
>>  Is there anything of interest in the slurmctld log?  How about the
>> slurmd log on the node running the job?
>>
>>
>> On 02/22/11 15:14, Paul Thirumalai wrote:
>>
>> But new jobs are still getting stuck in CG state.
>>
>> On Tue, Feb 22, 2011 at 3:13 PM, Paul Thirumalai <
>> [email protected]> wrote:
>>
>>> I am using Slurm 2.2. I restarted slurmctld using the -c option, and squeue
>>> now does not show any new jobs.
>>>
>>>
>>> On Tue, Feb 22, 2011 at 3:11 PM, Jerry Smith <[email protected]> wrote:
>>>
>>>>  Have you tried restarting slurmctld?  We had this issue a few
>>>> revisions back, but it seemed to go away with a newer rev (though I couldn't
>>>> tell you which one).
>>>>
>>>> Have you tried setting the state to RESUME for the nodes?
>>>>
>>>> What version of Slurm are you running?
>>>>
>>>> Jerry
>>>>
>>>> Paul Thirumalai wrote:
>>>>
>>>> I am not using an epilog when launching these jobs. The jobs are simple
>>>> Python scripts that run the hostname command and put the output in a file
>>>> that is provided on the command line.
>>>>
>>>>  I can see that the file was written. This tells me that the job completed.
>>>> When I log in to the node I don't see the process running. However, squeue
>>>> still tells me that the job is in CG state.
>>>>
>>>>  I stopped all the Slurm daemons and restarted them, but the state of
>>>> the job is still CG and it still shows up in squeue.
>>>>
>>>> On Tue, Feb 22, 2011 at 2:45 PM, Paul Thirumalai <
>>>> [email protected]> wrote:
>>>>
>>>>> Actually I just figured out that all jobs seem to be stuck in
>>>>> COMPLETING state. I am now reading
>>>>> https://computing.llnl.gov/linux/slurm/faq.html#comp
>>>>>
>>>>>
>>>>>
>>>>>  I will continue to troubleshoot. If I run into issues, I will repost
>>>>> to this thread.
>>>>>
>>>>
>>>>
>>>
>>
>
