Re: [OMPI users] Issues with different IB adapters and openmpi 2.0.2

2017-02-28 Thread r...@open-mpi.org
The root cause is that the nodes are treated as “heterogeneous”: the 
difference in HCAs leads to a difference in selection logic. For scalability 
reasons, we don’t circulate the choice of PML, as that isn’t something mpirun 
can “discover” and communicate.

One option we could pursue is to provide a mechanism by which we add the HCAs 
to the topology “signature” sent back by each daemon. That would allow us to 
detect the difference and then ensure that the PML selection is included in 
the circulated wireup data, so the system can at least warn you of the problem 
instead of hanging silently.
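
For diagnosis, a plausible (unverified) explanation of the cm-vs-ob1 split: 
the QLogic IBA7220 HCAs are normally driven through the PSM MTL, which makes 
the cm PML the preferred choice on font2/font3, while the older Mellanox 
InfiniHost III is only supported by the openib BTL, leaving ob1 as the only 
viable PML on font1. A quick way to compare what each node can actually 
select (assuming ompi_info is in the PATH on every node and the font* 
hostnames resolve):

$ ssh font1 'ompi_info | grep -E "MCA (pml|mtl|btl)"'
$ ssh font2 'ompi_info | grep -E "MCA (pml|mtl|btl)"'

If font2/font3 list the psm MTL but font1 does not, that would account for 
the differing PML selections.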


> On Feb 28, 2017, at 10:38 AM, Orion Poplawski  wrote:
> 
> On 02/27/2017 05:19 PM, Howard Pritchard wrote:
>> Hi Orion
>> 
>> Does the problem occur if you only use font2 and font3?  Do you have MXM
>> installed on the font1 node?
> 
> No, running across font2/3 is fine.  No idea what MXM is.
> 
>> The 2.x series uses PMIx, and it could be that this is impacting the PML
>> sanity check.
>> 
>> Howard
>> 
>> 
>> Orion Poplawski wrote on Mon, 27 Feb 2017 at 14:50:
>> 
>>We have a couple nodes with different IB adapters in them:
>> 
>>font1/var/log/lspci:03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] [15b3:6274] (rev 20)
>>font2/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
>>font3/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
>> 
>>With 1.10.3 we saw the following errors with mpirun:
>> 
>>[font2.cora.nwra.com:13982 ] [[23220,1],10] selected pml cm, but peer
>>[[23220,1],0] on font1 selected pml ob1
>> 
>>which crashed MPI_Init.
>> 
>>We worked around this by passing "--mca pml ob1".  I notice now that with
>>openmpi 2.0.2 without that option I no longer see errors, but the MPI program
>>hangs shortly after startup.  Re-adding the option makes it work, so I'm
>>assuming the underlying problem is still the same, but openmpi appears to
>>have stopped alerting me to the issue.
>> 
>>Thoughts?
>> 
> 
> 
> -- 
> Orion Poplawski
> Technical Manager  720-772-5637
> NWRA, Boulder/CoRA Office FAX: 303-415-9702
> 3380 Mitchell Lane   or...@nwra.com
> Boulder, CO 80301   http://www.nwra.com

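
Regarding the "--mca pml ob1" workaround above: besides passing it on the 
mpirun command line, the same MCA setting can be made persistent so every 
job on the mixed nodes picks it up. A sketch (my_app is a placeholder; a 
system-wide file under the installation prefix also works):

$ mpirun --mca pml ob1 -np 16 ./my_app                 # per invocation
$ export OMPI_MCA_pml=ob1                              # per shell/session
$ echo "pml = ob1" >> $HOME/.openmpi/mca-params.conf   # per user, persistent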


Re: [OMPI users] Using OpenMPI / ORTE as cluster aware GNU Parallel

2017-02-28 Thread Mark Santcroos
Hi Brock, Angel, Reuti,



You might want to look at a tool we developed:
http://radical-cybertools.github.io/radical-pilot/index.html

This was actually one of the drivers for isolating the persistent ORTE DVM 
that's being discussed in this thread.

With RADICAL-Pilot you can use a Python API to launch an ORTE DVM on a 
computational resource and then run tasks on top of that.

Happy to answer questions off-list.



Regards,

Mark
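
To make the "cluster-aware GNU Parallel" pattern from the subject line 
concrete, the same effect can be sketched with the ORTE tools directly 
(assumptions: orte-dvm accepts --report-uri by analogy with mpirun, task.sh 
is a placeholder for the per-item work, and waiting for the DVM to come up 
is omitted for brevity):

orte-dvm --report-uri /tmp/dvm.uri &                     # start one persistent DVM
for i in $(seq 1 100); do                                # fan small tasks onto it
    orte-submit --hnp file:/tmp/dvm.uri -n 1 ./task.sh $i &
done
wait                                                     # wait for all submissions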


Re: [OMPI users] State of the DVM in Open MPI

2017-02-28 Thread r...@open-mpi.org
Hi Reuti

The DVM in master seems to be fairly complete, but several organizations are in 
the process of automating tests for it so it gets more regular exercise.

If you are using the version in OMPI 2.x, that is an early prototype - we 
haven’t updated the code in the release branches. The more production-ready 
version will be in 3.0, and we’ll start supporting it there.

Meantime, we do appreciate any suggestions and bug reports as we polish it up.


> On Feb 28, 2017, at 2:17 AM, Reuti  wrote:
> 
> Hi,
> 
> Only by reading recent posts did I become aware of the DVM. This would be a 
> welcome feature for our setup*. But I see that not all options work as 
> expected - is it still a work in progress, or should everything work as 
> advertised?
> 
> 1)
> 
> $ soft@server:~> orte-submit -cf foo --hnp file:/home/reuti/dvmuri -n 1 touch /home/reuti/hacked
> 
> Open MPI has detected that a parameter given to a command line
> option does not match the expected format:
> 
>  Option: np
>  Param:  foo
> 
> ==> The given option is -cf, not -np
> 
> 2)
> 
> According to `man orte-dvm` there are -H, -host, --host, -machinefile, and 
> -hostfile options, but none of them seem operational (Open MPI 2.0.2). A 
> hostlist provided by SGE is honored, though.
> 
> -- Reuti
> 
> 
> *) We run Open MPI jobs inside SGE. This works fine. Some applications invoke 
> several `mpiexec` calls during their execution and rely on temporary files 
> they created in earlier steps. While this works fine on one and the same 
> machine, it fails when SGE grants slots on several machines, as the scratch 
> directories created by `qrsh -inherit …` vanish once the `mpiexec` call on 
> that particular node finishes (rather than at the end of the complete job). 
> I can mimic persistent scratch directories in SGE for a complete job, but 
> invoking the DVM beforehand and shutting it down later (either by hand in 
> the job script or by SGE killing all remains at the end of the job) might 
> be more straightforward (it looks like `orte-dvm` is started by 
> `qrsh -inherit …` too).
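
To make the workflow described above concrete, a sketch of a job script that 
keeps one DVM alive for all steps (assumptions: orte-dvm accepts --report-uri 
by analogy with mpirun, the parallel environment name is site-specific, 
step1/step2 are placeholders, and the wait for the URI file is deliberately 
crude):

#!/bin/bash
#$ -pe orte 16
# Start one persistent DVM for the whole job so per-node scratch
# directories survive across the individual submission calls.
orte-dvm --report-uri $TMPDIR/dvm.uri &
DVM_PID=$!
while [ ! -s $TMPDIR/dvm.uri ]; do sleep 1; done   # wait for the URI file
orte-submit --hnp file:$TMPDIR/dvm.uri -n 16 ./step1
orte-submit --hnp file:$TMPDIR/dvm.uri -n 16 ./step2
kill $DVM_PID   # or let SGE clean up any remains at the end of the job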

