Re: [OMPI users] Checkpoint with blcr

2017-05-19 Thread Omar Andrés Zapata Mesa
Thanks Jeff On Fri, May 19, 2017 at 4:59 PM, Jeff Squyres (jsquyres) wrote: > Open MPI v2.1.x does not support checkpoint restart; it was unmaintained > and getting stale, so it was removed. > > Looks like we forgot to remove the cr MPI extension from the v2.1.x > release

Re: [OMPI users] Checkpoint with blcr

2017-05-19 Thread Jeff Squyres (jsquyres)
Open MPI v2.1.x does not support checkpoint restart; it was unmaintained and getting stale, so it was removed. Looks like we forgot to remove the cr MPI extension from the v2.1.x release series when we removed the rest of the checkpoint restart support. Sorry for the confusion. > On May

[OMPI users] Checkpoint with blcr

2017-05-19 Thread Omar Andrés Zapata Mesa
Dear all, I am trying to compile ompi 2.1.0 with support for checkpoint using blcr. openmpi-2.1.0# ./configure --enable-mpi-ext=cr --- MPI Extension cr configure: WARNING: Requested "cr" MPI cr, but cannot build it configure: WARNING: because fault tolerance is not enabled. configure: WARNING:

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Allan Overstreet
Below are the results from the ibnetdiscover command. This command was run from node smd. # # Topology file: generated on Fri May 19 15:59:47 2017 # # Initiated from node 0002c903000a0a32 port 0002c903000a0a34 vendid=0x8f1 devid=0x5a5a sysimgguid=0x8f105001094d3

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Elken, Tom
" i do not think btl/openib can be used with QLogic cards (please someone correct me if i am wrong)" You are wrong :) . The openib BTL is the best one to use for interoperability between QLogic and Mellanox IB cards. The Intel True Scale (the continuation of the QLogic IB product line) Host SW

Re: [OMPI users] MPI the correct solution?

2017-05-19 Thread Reuti
As I think it's not relevant to Open MPI itself, I answered in PM only. -- Reuti > On 18.05.2017 at 18:55, do...@mail.com wrote: > > On Tue, 9 May 2017 00:30:38 +0200 > Reuti wrote: >> Hi, >> >> On 08.05.2017 at 23:25, David Niklas wrote: >> >>> Hello, >>> I

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Yes, using "-pami_noib" solves the problem; I missed the previous message. Thank you so much for the support. 2017-05-19 11:12 GMT+02:00 John Hearns via users : > I am not sure I agree with that. > (a) the original error message from Gabriele was quite clear - the MPI >

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread John Hearns via users
P.S. One takeaway for everyone working with MPI: turn up the error logging or debug level, then PAY ATTENTION to the error messages. I have spent a LOT of my time doing just that - with OpenMPI and with Intel MPI over Omnipath and other interconnects in the dim and distant past. The guy or girl

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread John Hearns via users
I am not sure I agree with that. (a) the original error message from Gabriele was quite clear - the MPI could not find an interface card which was up, so it would not run. (b) Nysal actually pointed out the solution which looks good - after reading the documentation: use pami_noib (c) Having

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Ok Gilles, the output of mpirun --mca pml ^pami --mca btl_base_verbose 100 is attached. 2017-05-19 10:05 GMT+02:00 Gilles Gouaillardet : > Gabriele, > > > i am sorry, i really meant > > mpirun --mca pml ^pami --mca btl_base_verbose 100 ... > > > Cheers, > > Gilles > > On

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread r...@open-mpi.org
If I might interject here before lots of time is wasted. Spectrum MPI is an IBM -product- and is not free. What you are likely running into is that their license manager is blocking you from running, albeit without a really nice error message. I’m sure that’s something they are working on. If

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Hi Gilles, attached is the output of: mpirun --mca btl_base_verbose 100 -np 2 ... 2017-05-19 9:43 GMT+02:00 Gilles Gouaillardet : > Gabriele, > > > can you run > > mpirun --mca btl_base_verbose 100 -np 2 ... > > > so we can figure out why neither sm nor vader is used? > > > Cheers, >

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
Gabriele, i am sorry, i really meant mpirun --mca pml ^pami --mca btl_base_verbose 100 ... Cheers, Gilles On 5/19/2017 4:28 PM, Gabriele Fatigati wrote: Using: mpirun --mca pml ^pami --mca pml_base_verbose 100 -n 2 ./prova_mpi I attach the output 2017-05-19 9:16 GMT+02:00 John Hearns

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Nysal Jan K A
hi Gabriele, You can check some of the available options here - https://www.ibm.com/support/knowledgecenter/en/SSZTET_10.1.0/smpi02/smpi02_interconnect.html The "-pami_noib" option might be of help in this scenario. Alternatively, on a single node, the vader BTL can also be used. Regards --Nysal

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Nathan Hjelm
Add --mca btl self,vader -Nathan > On May 19, 2017, at 1:23 AM, Gabriele Fatigati wrote: > > Oh no, by using two procs: > > > findActiveDevices Error > We found no active IB device ports > findActiveDevices Error > We found no active IB device ports >

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
Gabriele, can you run mpirun --mca btl_base_verbose 100 -np 2 ... so we can figure out why neither sm nor vader is used? Cheers, Gilles On 5/19/2017 4:23 PM, Gabriele Fatigati wrote: Oh no, by using two procs: findActiveDevices Error We found no active IB device ports findActiveDevices

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread John Hearns via users
BTLs attempted: self That should only allow a single process to communicate with itself. On 19 May 2017 at 09:23, Gabriele Fatigati wrote: > Oh no, by using two procs: > > > findActiveDevices Error > We found no active IB device ports > findActiveDevices Error > We

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Using: mpirun --mca pml ^pami --mca pml_base_verbose 100 -n 2 ./prova_mpi I attach the output 2017-05-19 9:16 GMT+02:00 John Hearns via users : > Gabriele, > as Gilles says if you are running within a single host system, you do not > need the pami layer. > Usually

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread John Hearns via users
Allan, remember that Infiniband is not Ethernet. You don't NEED to set up IPoIB interfaces. Two diagnostics for you to run, please: ibnetdiscover and ibdiagnet. Please let us have the results of ibnetdiscover. On 19 May 2017 at 09:25, John Hearns wrote: > Gilles,

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread John Hearns via users
Gilles, Allan, if the host 'smd' is acting as a cluster head node, it does not need to have an Infiniband card. So you should be able to run jobs across the other nodes, which have QLogic cards. I may have something mixed up here, if so I am sorry. If you also want to run jobs on the smd

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Oh no, by using two procs: findActiveDevices Error We found no active IB device ports findActiveDevices Error We found no active IB device ports -- At least one pair of MPI processes are unable to reach each other for MPI

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Hi Gilles, using your command with one MPI process I get: findActiveDevices Error We found no active IB device ports Hello world from rank 0 out of 1 processors So it seems to work apart from the error message. 2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet : > Gabriele, > > > so

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread John Hearns via users
Gabriele, as Gilles says, if you are running within a single host system, you do not need the pami layer. Usually you would use the btls sm,self, though I guess 'vader' is the more up-to-date choice. On 19 May 2017 at 09:10, Gilles Gouaillardet wrote: > Gabriele, > > > so

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet
Allan, i just noted smd has a Mellanox card, while other nodes have QLogic cards. mtl/psm works best for QLogic while btl/openib (or mtl/mxm) work best for Mellanox, but these are not interoperable. also, i do not think btl/openib can be used with QLogic cards (please someone correct me

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
Gabriele, so it seems pml/pami assumes there is an infiniband card available (!) i guess IBM folks will comment on that shortly. meanwhile, you do not need pami since you are running on a single node mpirun --mca pml ^pami ... should do the trick (if it does not work, can you run and post the

[OMPI users] Fwd: IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
-- Forwarded message -- From: Gabriele Fatigati Date: 2017-05-19 9:07 GMT+02:00 Subject: Re: [OMPI users] IBM Spectrum MPI problem To: John Hearns If I understand correctly, when I launch mpirun it tries to use Infiniband by default, but

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Hi John, Infiniband is not used; there is a single node on this machine. 2017-05-19 8:50 GMT+02:00 John Hearns via users : > Gabriele, please run 'ibv_devinfo' > It looks to me like you may have the physical interface cards in these > systems, but you do not have the

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread John Hearns via users
Gabriele, please run 'ibv_devinfo' It looks to me like you may have the physical interface cards in these systems, but you do not have the correct drivers or libraries loaded. I have had similar messages when using Infiniband on x86 systems - which did not have libibverbs installed. On 19 May

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gabriele Fatigati
Hi Gilles, using your command: [openpower:88536] mca: base: components_register: registering framework pml components [openpower:88536] mca: base: components_register: found loaded component pami [openpower:88536] mca: base: components_register: component pami register function successful

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet
Allan, - on which node is mpirun invoked ? - are you running from a batch manager ? - is there any firewall running on your nodes ? - how many interfaces are part of bond0 ? the error is likely occurring when wiring-up mpirun/orted what if you mpirun -np 2 --hostfile nodes --mca

[OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Allan Overstreet
I am experiencing many different errors with openmpi version 2.1.1. I have had a suspicion that this might be related to the way the servers were connected and configured. Regardless, below is a diagram of how the servers are configured. __ _