Thank you, Trey, it works.

I put this configuration on the controller and it works:

> # COMPUTE NODES
> NodeName=JGNODE[1-1] CPUs=1 State=UNKNOWN
> #PartitionName=debug Nodes=JGNODE1 Default=yes MaxTime=INFINITE
> PartitionName=CLUSTER Default=yes State=UP nodes=JGNODE[1-1]
>

But now I have a problem with "munge.socket.2" on the controller and the
nodes:

> Munge encode failed: Failed to access "/var/run/munge/munge.socket.2"
>
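
A minimal sketch of a first check for the error above, assuming the socket
path is the one shown in the message. The function name check_munge_socket
is hypothetical and only for illustration; it tests whether the path exists
and is a socket, nothing more.

```shell
#!/bin/sh
# Hypothetical helper: report the state of the munge socket path.
# Returns 0 if the path is a socket, 1 if it exists but is not a socket,
# 2 if it does not exist at all.
check_munge_socket() {
    if [ -S "$1" ]; then
        echo "ok: $1 is a socket"
        return 0
    elif [ -e "$1" ]; then
        echo "problem: $1 exists but is not a socket"
        return 1
    else
        echo "missing: $1 not found (is munged running?)"
        return 2
    fi
}

# Possible follow-up checks, to be run by hand on the controller and on
# each node (left as comments because they need munge installed):
#   service munge status     # is the daemon running?
#   munge -n | unmunge       # local encode/decode round trip
```

To use it, call check_munge_socket /var/run/munge/munge.socket.2 on the
controller and on each node. If the socket is missing, the munge daemon is
probably not running on that host.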

P.S. I am using CentOS 6.6 x86.
Best,
Góis



2015-04-21 21:43 GMT+01:00 Trey Dockendorf <[email protected]>:

>  Your node definition doesn't match what you assigned the partition
> 'debug'.  You probably want NodeName=JGNODE1 instead of NodeName=JGHCSLURM.
>
> - Trey
>
> =============================
>
> Trey Dockendorf
> Systems Analyst I
> Texas A&M University
> Academy for Advanced Telecommunications and Learning Technologies
> Phone: (979)458-2396
> Email: [email protected]
> Jabber: [email protected]
>
> On Tue, Apr 21, 2015 at 3:31 PM, Jorge Góis <[email protected]> wrote:
>
>>  Hi guys.
>> I was testing slurm-14.11.5 and am now using slurm-15.08.0-0pre3.
>>
>> But in both versions I get this error:
>>
>>> slurmctld: error: find_node_record: lookup failure for JGNODE1
>>> slurmctld: error: build_part_bitmap: invalid node name JGNODE1
>>> slurmctld: fatal: Invalid node names in partition CLUSTER
>>>
>>
>> On node JGNODE1 I have slurmd -Dvvv running and get this message:
>>
>>> slurmd: debug2: Error connecting slurm stream socket at 192.168.1.1:6817:
>>> Connection refused
>>> slurmd: debug:  Failed to contact primary controller: Connection refused
>>>
>> But that is normal, because slurmctld doesn't start.
>>
>> In slurm.conf on the controller I have these lines:
>> ...
>> # COMPUTE NODES
>> NodeName=JGHCSLURM CPUs=1 State=UNKNOWN
>> PartitionName=debug Nodes=JGNODE1 Default=yes MaxTime=INFINITE
>> #PartitionName=CLUSTER Default=yes State=UP nodes=JGNODE[1-1]
>>
>>
>> HC can ping JGNODE1, and /etc/hosts has its IP, FQDN and name.
>>
>> I made a simple script to test gethostname() and to resolve the name
>> JGNODE1.
>>
>> Can you help me find the problem?
>>
>> Best,
>>   jG0|s
>>
>>
>>
>
