Yeah, I'm surprised by that - they used to build with --with-tm, as it only
activates if/when it finds itself in the appropriate environment. No harm in
building the support, so the distros always built with all the RM (resource
manager) components. No idea why this happened - you might mention it to them,
as I suspect it was an error/oversight.
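
For reference, enabling it at build time is just a matter of pointing
configure at the PBS installation - something along these lines (the
install path here is only an example, not the distro's actual layout):

  ./configure --with-tm=/opt/pbs ...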



> On Jan 18, 2022, at 3:05 PM, Crni Gorac via users <users@lists.open-mpi.org> 
> wrote:
> 
> Indeed, I realized in the meantime that changing the hostfile to:
> --------------------
> node1 slots=1
> node2 slots=1
> --------------------
> works as I expected.
> 
> Thanks once again for the clarification, got it now.  I'll see if we can
> live with this (the job submission scripts are mostly automatically
> generated from an auxiliary, site-specific shell script, and I can change
> that script to simply add "slots=1" to each line of the hostfile generated
> by PBS before passing it to mpirun), but it's a pity that tm support is
> not included in these pre-built OpenMPI installations.
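> 
> For reference, the change I have in mind is something along these lines
> (an untested sketch; the hostfile name is just a placeholder):
> 
> sed 's/$/ slots=1/' "$PBS_NODEFILE" > hostfile.slots
> mpirun -n 2 -hostfile hostfile.slots ./foo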
> 
> On Tue, Jan 18, 2022 at 11:56 PM Ralph Castain via users
> <users@lists.open-mpi.org> wrote:
>> 
>> The hostfile isn't being ignored - it is doing precisely what it is
>> supposed to do (and is documented to do). The problem is that without tm
>> support, we don't read the external allocation. So we use the hostfile to
>> identify the hosts, and then we discover the #slots on each host as being
>> the #cores on that node.
>> 
>> In contrast, the -host option is doing what it is supposed to do - it
>> assigns one slot for each mention of the hostname. You can increase the
>> slot allocation using the colon qualifier - e.g., "-host node1:5" assigns
>> 5 slots to node1.
>> 
>> If tm support is included, then we read the PBS allocation and see one slot 
>> on each node - and launch accordingly.
>> 
>> 
>>> On Jan 18, 2022, at 2:44 PM, Crni Gorac via users 
>>> <users@lists.open-mpi.org> wrote:
>>> 
>>> OK, I just checked and you're right: both processes get run on the first
>>> node.  So it seems that the "-hostfile" option to mpirun, which in my
>>> case refers to a file properly listing the two nodes, like:
>>> --------------------
>>> node1
>>> node2
>>> --------------------
>>> is ignored.
>>> 
>>> I also tried logging in to node1 and launching with mpirun directly,
>>> without PBS, and the same thing happened.  However, if I specify the
>>> "-host" option instead, then the ranks get started on different nodes,
>>> and it all works properly.  Then I tried the same from within the PBS
>>> script, and it worked.
>>> 
>>> Thus, to summarize, instead of:
>>> mpirun -n 2 -hostfile $PBS_NODEFILE ./foo
>>> one should use:
>>> mpirun -n 2 --host node1,node2 ./foo
>>> 
>>> Rather strange, but the important thing is that it works somehow.  Thanks
>>> for your help!
>>> 
>>> On Tue, Jan 18, 2022 at 10:54 PM Ralph Castain via users
>>> <users@lists.open-mpi.org> wrote:
>>>> 
>>>> Are you launching the job with "mpirun"? I'm not familiar with that cmd 
>>>> line and don't know what it does.
>>>> 
>>>> The most likely explanation is that the mpirun from the prebuilt
>>>> versions doesn't have TM support, and therefore doesn't understand the
>>>> 1ppn directive in your cmd line. My guess is that you are using the ssh
>>>> launcher - in that case you should wind up with two procs on the first
>>>> node, and those envars would then be correct. If you are seeing one proc
>>>> on each node, then something is wrong.
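>>>> 
>>>> One quick way to check whether a given install was built with TM support
>>>> - just a suggestion, I haven't verified it against the MLNX builds - is
>>>> to look for the tm components in the ompi_info output:
>>>> 
>>>> ompi_info | grep tm
>>>> 
>>>> The tm "ras" and "plm" components only show up when TM support was
>>>> compiled in.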
>>>> 
>>>> 
>>>>> On Jan 18, 2022, at 1:33 PM, Crni Gorac via users 
>>>>> <users@lists.open-mpi.org> wrote:
>>>>> 
>>>>> I have one process per node; here is the corresponding line from my job
>>>>> submission script (with compute nodes named "node1" and "node2"):
>>>>> 
>>>>> #PBS -l 
>>>>> select=1:ncpus=1:mpiprocs=1:host=node1+1:ncpus=1:mpiprocs=1:host=node2
>>>>> 
>>>>> On Tue, Jan 18, 2022 at 10:20 PM Ralph Castain via users
>>>>> <users@lists.open-mpi.org> wrote:
>>>>>> 
>>>>>> Afraid I can't understand your scenario - when you say you "submit a
>>>>>> job" to run on two nodes, how many processes are you running on each
>>>>>> node?
>>>>>> 
>>>>>> 
>>>>>>> On Jan 18, 2022, at 1:07 PM, Crni Gorac via users 
>>>>>>> <users@lists.open-mpi.org> wrote:
>>>>>>> 
>>>>>>> I'm using OpenMPI 4.1.2 from the MLNX_OFED_LINUX-5.5-1.0.3.2
>>>>>>> distribution, and have PBS 18.1.4 installed on my cluster (the
>>>>>>> cluster nodes are running CentOS 7.9).  When I try to submit a job
>>>>>>> that will run on two nodes in the cluster, both ranks get
>>>>>>> OMPI_COMM_WORLD_LOCAL_SIZE set to 2 instead of 1, and their
>>>>>>> OMPI_COMM_WORLD_LOCAL_RANK values are set to 0 and 1 instead of both
>>>>>>> being 0.  At the same time, the hostfile generated by PBS
>>>>>>> ($PBS_NODEFILE) properly lists both nodes.
>>>>>>> 
>>>>>>> I've tried OpenMPI 3 from HPC-X, and the same thing happens there
>>>>>>> too.  However, when I build OpenMPI myself (the notable difference
>>>>>>> from the above-mentioned pre-built MPI versions is that I use the
>>>>>>> "--with-tm" option to point to my PBS installation), then
>>>>>>> OMPI_COMM_WORLD_LOCAL_SIZE and OMPI_COMM_WORLD_LOCAL_RANK are set
>>>>>>> properly.
>>>>>>> 
>>>>>>> I'm not sure how to debug the problem, or whether it is possible to
>>>>>>> fix it at all with a pre-built OpenMPI version, so any suggestions
>>>>>>> are welcome.
>>>>>>> 
>>>>>>> Thanks.
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

