Hi Emily,

Thank you for the information.

How can I prevent the node from entering the DRAIN state?
I didn't set its state to DRAIN.

Thank you in advance

Regards,

Husen



On Fri, Apr 8, 2016 at 8:16 PM, E.M. Dragowsky <[email protected]> wrote:

> Hi, Husen --
>
> The DRAIN state means the node is not available for jobs, at least as far
> as I understand from the documentation describing scontrol:
>
> If you want to remove a node from service, you typically want to set its
> state to "DRAIN".
>
> Cheers,
> ~ Emily
>
> ----------------------------------
> E.M. Dragowsky, Ph.D.
> ITS -- Research Computing
> Case Western Reserve University
> (216) 368-0082
>
> On Fri, Apr 8, 2016 at 8:47 AM, Husen R <[email protected]> wrote:
>
>> Hello Remi,
>>
>> Thank you for your reply.
>>
>> Here is the output of 'sinfo' and 'sinfo -R' respectively:
>>
>> pro@head-node:~$ sinfo
>> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
>> comeon*      up      30:00      1  drain head-node
>> pro@head-node:~$ sinfo -R
>> REASON               USER      TIMESTAMP           NODELIST
>> batch job complete f root      2016-04-08T16:16:38 head-node
>>
>> The state of my node is drain. I don't understand why the resources are
>> not available. Currently, I am not running any resource-hungry application
>> on that node.
>>
>> Regards,
>>
>>
>> Husen
>>
>>
>> On Fri, Apr 8, 2016 at 7:23 PM, Rémi Palancher <[email protected]> wrote:
>>
>>>
>>> On 08/04/2016 13:39, Husen R wrote:
>>>
>>>> [...]
>>>> pro@head-node:/mirror/source$ squeue
>>>>   JOBID  PARTITION    NAME  USER  ST  TIME  NODES  NODELIST(REASON)
>>>>      70     comeon  MatMul   pro  PD  0:00      1  (Resources)
>>>>      71     comeon  MatMul   pro  PD  0:00      1  (Resources)
>>>>      72     comeon  MatMul   pro  PD  0:00      1  (Resources)
>>>>
>>>
>>> In the last column, squeue gives the reason why the jobs are pending.
>>> "Resources" means there are not enough resources available to run the jobs.
>>>
>>> Check the state of your nodes using `sinfo`.
>>>
>>> Best,
>>> Rémi
>>>
>>
>>
>

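For anyone following the thread, the check discussed above (look at sinfo, find the drained node, return it to service) can be sketched roughly as below. This is only an illustration under a few assumptions: the helper name drained_nodes is made up for this example, the sample text is the sinfo output quoted earlier in the thread, and the scontrol command is printed rather than executed. As far as I know, a drained node stays drained until an administrator resumes it, so the reason shown by 'sinfo -R' should be investigated first.

```shell
#!/bin/sh
# Sketch: find nodes that sinfo reports as drained and print (not run) the
# scontrol command that would return each one to service.

# drained_nodes (hypothetical helper name) reads default-format `sinfo`
# output on stdin and prints the NODELIST field of every data row whose
# STATE field starts with "drain".
drained_nodes() {
    awk 'NR > 1 && $5 ~ /^drain/ { print $6 }'
}

# Sample copied from the sinfo output quoted in this thread.
sample='PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
comeon*      up      30:00      1  drain head-node'

printf '%s\n' "$sample" | drained_nodes | while read -r node; do
    # Printed only. On a real cluster this would be run as root or as the
    # Slurm administrator, after the cause of the drain has been fixed.
    echo "scontrol update NodeName=$node State=RESUME"
done
```

On a live system the sample would be replaced by piping real output, e.g. `sinfo | drained_nodes`, and `scontrol show node head-node` can be used to see the full drain reason before resuming.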