Sorry, noob problem: I forgot to start munge.
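
For anyone who hits the same "Failed to access" error, here is a rough sketch of a check that can be run on the controller and on each node before starting Slurm. The default socket path is copied from the error message. The function name check_munge_socket is made up for this example, and the commented service/chkconfig lines assume the packaged SysV init script on CentOS 6 is called "munge", which may differ on other installs.

```shell
#!/bin/sh
# Sketch only: report whether the munge socket is present.
# The default path comes from the slurm error message; adjust if yours differs.
MUNGE_SOCKET_DEFAULT="/var/run/munge/munge.socket.2"

# check_munge_socket is a hypothetical helper, not part of munge or slurm.
# It returns 0 if the given path exists and is a socket, 1 otherwise.
check_munge_socket() {
    sock="${1:-$MUNGE_SOCKET_DEFAULT}"
    if [ -S "$sock" ]; then
        echo "ok: $sock"
        return 0
    fi
    echo "missing: $sock (is munged running?)"
    return 1
}

# Typical use as root (assumption: the init script is named "munge"):
#   service munge start        # start the daemon now
#   chkconfig munge on         # start it at boot
#   check_munge_socket && munge -n | unmunge   # encode/decode a test credential
```

If the check reports the socket as missing, starting the munge daemon on that host is the first thing to try; slurmctld and slurmd can then be restarted.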

Best,
Gois



2015-04-21 21:57 GMT+01:00 Jorge Góis <[email protected]>:

>  Thank you, Trey, it works.
>
> I put this config on the controller and it works:
>
>> # COMPUTE NODES
>> NodeName=JGNODE[1-1] CPUs=1 State=UNKNOWN
>> #PartitionName=debug Nodes=JGNODE1 Default=yes MaxTime=INFINITE
>> PartitionName=CLUSTER Default=yes State=UP nodes=JGNODE[1-1]
>>
>
> But now I have a problem with "munge.socket.2" on the controller and the
> nodes.
>
>> Munge encode failed: Failed to access "/var/run/munge/munge.socket.2"
>>
>
> P.S. I'm using CentOS 6.6 x86.
> Best,
> Góis
>
>
>
> 2015-04-21 21:43 GMT+01:00 Trey Dockendorf <[email protected]>:
>
>>  Your node definition doesn't match what you assigned the partition
>> 'debug'.  You probably want NodeName=JGNODE1 instead of NodeName=JGHCSLURM.
>>
>> - Trey
>>
>> =============================
>>
>> Trey Dockendorf
>> Systems Analyst I
>> Texas A&M University
>> Academy for Advanced Telecommunications and Learning Technologies
>> Phone: (979)458-2396
>> Email: [email protected]
>> Jabber: [email protected]
>>
>> On Tue, Apr 21, 2015 at 3:31 PM, Jorge Góis <[email protected]> wrote:
>>
>>>  Hi guys.
>>> I was testing slurm-14.11.5 and am now using slurm-15.08.0-0pre3.
>>>
>>> But in both versions I get this error:
>>>
>>>> slurmctld: error: find_node_record: lookup failure for JGNODE1
>>>> slurmctld: error: build_part_bitmap: invalid node name JGNODE1
>>>> slurmctld: fatal: Invalid node names in partition CLUSTER
>>>>
>>>
>>> On node JGNODE1 I have slurmd -Dvvv running, and it shows this message:
>>>
>>>> slurmd: debug2: Error connecting slurm stream socket at
>>>> 192.168.1.1:6817: Connection refused
>>>> slurmd: debug:  Failed to contact primary controller: Connection refused
>>>>
>>> But that is normal, because slurmctld doesn't start.
>>>
>>> In slurm.conf on the controller I have these lines:
>>> ...
>>> # COMPUTE NODES
>>> NodeName=JGHCSLURM CPUs=1 State=UNKNOWN
>>> PartitionName=debug Nodes=JGNODE1 Default=yes MaxTime=INFINITE
>>> #PartitionName=CLUSTER Default=yes State=UP nodes=JGNODE[1-1]
>>>
>>>
>>> HC can ping JGNODE1, and has its IP, FQDN and name in /etc/hosts.
>>>
>>> I made a simple script to test gethostname() and resolve the name
>>> JGNODE1.
>>>
>>> Can you help me find the problem?
>>>
>>> Best,
>>>   jG0|s
>>>
>>>
>>>
>>
>
