Re: [OMPI users] [External] Re: mpirun on Kubuntu 20.4.1 hangs

2020-10-22 Thread Prentice Bisbal via users
 Could SELinux or AppArmor be active by default for a new install and 
be causing this problem?


Prentice

On 10/21/20 12:22 PM, Jorge SILVA via users wrote:


Hello Gus,

 Thank you for your answer..  Unfortunately my problem is much more 
basic. I  didn't try to run the program in both computers , but just 
to run something in one computer. I just installed the new OS an 
openmpi in two different computers, in the standard way, with the same 
result.


For example:

In kubuntu20.4.1 LTS with openmpi 4.0.3-0ubuntu

jorge@gcp26:~/MPIRUN$ cat hello.f90
 print*,"Hello World!"
end
jorge@gcp26:~/MPIRUN$ mpif90 hello.f90 -o hello
jorge@gcp26:~/MPIRUN$ ./hello
 Hello World!
jorge@gcp26:~/MPIRUN$ mpirun -np 1 hello <---here the program hangs 
with no output

^C^Cjorge@gcp26:~/MPIRUN$

The mpirun task sleeps with no output, and only twice ctrl-C ends the 
execution  :


jorge   5540  0.1  0.0  44768  8472 pts/8    S+ 17:54   0:00 
mpirun -np 1 hello


In kubuntu 18.04.5 LTS with openmpi 2.1.1, of course, the same program 
gives


jorge@gcp30:~/MPIRUN$ cat hello.f90
 print*, "Hello World!"
 END
jorge@gcp30:~/MPIRUN$ mpif90 hello.f90 -o hello
jorge@gcp30:~/MPIRUN$ ./hello
 Hello World!
jorge@gcp30:~/MPIRUN$ mpirun -np 1 hello
 Hello World
jorge@gcp30:~/MPIRUN$


Even just typing mpirun hangs without the usual error message.

Are there any changes between the two versions of openmpi that I 
miss?  Some package lacking to mpirun ?


Thank you again for your help

Jorge


Le 21/10/2020 à 00:20, Gus Correa a écrit :

Hi Jorge

You may have an active firewall protecting either computer or both,
and preventing mpirun to start the connection.
Your /etc/hosts file may also not have the computer IP addresses.
You may also want to try the --hostfile option.
Likewise, the --verbose option may also help diagnose the problem.

It would help if you send the mpirun command line, the hostfile (if any),
error message if any, etc.


These FAQs may help diagnose and solve the problem:

https://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
https://www.open-mpi.org/faq/?category=running#mpirun-hostfile
https://www.open-mpi.org/faq/?category=running

I hope this helps,
Gus Correa

On Tue, Oct 20, 2020 at 4:47 PM Jorge SILVA via users 
mailto:users@lists.open-mpi.org>> wrote:


Hello,

I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two
different
computers in the standard way. Compiling with mpif90 works, but
mpirun
hangs with no output in both systems. Even mpirun command without
parameters hangs and only twice ctrl-C typing can end the sleeping
program. Only  the command

 mpirun --help

gives the usual output.

Seems that is something related to the terminal output, but the
command
worked well for Kubuntu 18.04. Is there a way to debug or fix this
problem (without re-compiling from sources, etc)? Is it a known
problem?

Thanks,

  Jorge



[OMPI users] Is ready send mode supported? Or can I force eager protocol for a subset of messages?

2020-10-22 Thread Oskar Lappi via users

Hi, a performance question,

I have a distributed stencil loop that's sending several tens of 
slightly larger messages every iteration, I post double buffered 
receives at initialization and immediately after a receive request is 
completed.
I can therefore prove that the receive is posted on the other side of a 
send before it is sent, and would like to use the ready send mode to be 
able to shave off the overhead of rendezvous.
Some other (setup) parts of the program use synchronous sends that I 
can't prove this for.


My question is: is ready send mode "supported", that is to say does it 
take advantage of the fact that I've proved it can be used and performs 
an eager send every time? Or does this depend on the underlying  component?


Follow-up question: if ready send mode can't force an eager protocol, 
how could I do that? And can I verify which protocol is being used somehow?


We're using Open MPI 4.0.3 and UCX for GPUDirect RDMA communication on 
Mellanox Infiniband.


Best,
Oskar

oskar.la...@abo.fi


Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-22 Thread Joseph Schuchart via users

Hi Jorge,

Can you try to get a stack trace of mpirun using the following command 
in a separate terminal?


sudo gdb -batch -ex "thread apply all bt" -p $(ps -C mpirun -o pid= | 
head -n 1)


Maybe that will give some insight where mpirun is hanging.

Cheers,
Joseph

On 10/21/20 9:58 PM, Jorge SILVA via users wrote:

Hello Jeff,

The  program is not executed, seems waits for something to connect with 
(why twice ctrl-C ?)


jorge@gcp26:~/MPIRUN$ mpirun -np 1 touch /tmp/foo
^C^C

jorge@gcp26:~/MPIRUN$ ls -l /tmp/foo
ls: impossible d'accéder à '/tmp/foo': Aucun fichier ou dossier de ce type

no file  is created..

In fact, my question was if are there differences in mpirun usage  
between these versions..  The


mpirun -help

gives a different output as expected, but I  tried a lot of options 
without any success.



Le 21/10/2020 à 21:16, Jeff Squyres (jsquyres) a écrit :
There's huge differences between Open MPI v2.1.1 and v4.0.3 (i.e., 
years of development effort); it would be very hard to categorize them 
all; sorry!


What happens if you

    mpirun -np 1 touch /tmp/foo

(Yes, you can run non-MPI apps through mpirun)

Is /tmp/foo created?  (i.e., did the job run, and mpirun is somehow 
not terminating)




On Oct 21, 2020, at 12:22 PM, Jorge SILVA via users 
mailto:users@lists.open-mpi.org>> wrote:


Hello Gus,

 Thank you for your answer..  Unfortunately my problem is much more 
basic. I  didn't try to run the program in both computers , but just 
to run something in one computer. I just installed the new OS an 
openmpi in two different computers, in the standard way, with the 
same result.


For example:

In kubuntu20.4.1 LTS with openmpi 4.0.3-0ubuntu

jorge@gcp26:~/MPIRUN$ cat hello.f90
 print*,"Hello World!"
end
jorge@gcp26:~/MPIRUN$ mpif90 hello.f90 -o hello
jorge@gcp26:~/MPIRUN$ ./hello
 Hello World!
jorge@gcp26:~/MPIRUN$ mpirun -np 1 hello <---here  the program hangs 
with no output

^C^Cjorge@gcp26:~/MPIRUN$

The mpirun task sleeps with no output, and only twice ctrl-C ends the 
execution  :


jorge   5540  0.1  0.0 44768  8472 pts/8    S+   17:54   0:00 
mpirun -np 1 hello


In kubuntu 18.04.5 LTS with openmpi 2.1.1, of course, the same 
program gives


jorge@gcp30:~/MPIRUN$ cat hello.f90
 print*, "Hello World!"
 END
jorge@gcp30:~/MPIRUN$ mpif90 hello.f90 -o hello
jorge@gcp30:~/MPIRUN$ ./hello
 Hello World!
jorge@gcp30:~/MPIRUN$ mpirun -np 1 hello
 Hello World
jorge@gcp30:~/MPIRUN$


Even just typing mpirun hangs without the usual error message.

Are there any changes between the two versions of openmpi that I 
miss?  Some package lacking to mpirun ?


Thank you again for your help

Jorge


Le 21/10/2020 à 00:20, Gus Correa a écrit :

Hi Jorge

You may have an active firewall protecting either computer or both,
and preventing mpirun to start the connection.
Your /etc/hosts file may also not have the computer IP addresses.
You may also want to try the --hostfile option.
Likewise, the --verbose option may also help diagnose the problem.

It would help if you send the mpirun command line, the hostfile (if 
any),

error message if any, etc.


These FAQs may help diagnose and solve the problem:

https://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
https://www.open-mpi.org/faq/?category=running#mpirun-hostfile
https://www.open-mpi.org/faq/?category=running

I hope this helps,
Gus Correa

On Tue, Oct 20, 2020 at 4:47 PM Jorge SILVA via users 
mailto:users@lists.open-mpi.org>> wrote:


Hello,

I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two
different
computers in the standard way. Compiling with mpif90 works, but
mpirun
hangs with no output in both systems. Even mpirun command without
parameters hangs and only twice ctrl-C typing can end the sleeping
program. Only  the command

 mpirun --help

gives the usual output.

Seems that is something related to the terminal output, but the
command
worked well for Kubuntu 18.04. Is there a way to debug or fix this
problem (without re-compiling from sources, etc)? Is it a known
problem?

Thanks,

  Jorge




--
Jeff Squyres
jsquy...@cisco.com 



Re: [OMPI users] Anyone try building openmpi 4.0.5 w/ llvm 11

2020-10-22 Thread Gilles Gouaillardet via users
Alan,

thanks for the report, I addressed this issue in
https://github.com/open-mpi/ompi/pull/8116

As a temporary workaround, you can apply the attached patch.

FWIW, f18 (shipped with LLVM 11.0.0) is still in development and uses
gfortran under the hood.

Cheers,

Gilles

On Wed, Oct 21, 2020 at 12:44 AM Alan Wild via users
 wrote:
>
> More specifically building the new “flang” compiler and compiling openmpi 
> with the combination of clang/flang rather than clang/gfortran.
>
> Configure is passing (including support for 16 byte REAL and COMPLEX types.  
> However there is one file that uses REAL128 and CONPLE(REAL128) and I’m 
> getting type in supported KIND=-1 errors.
>
> (Quick aside I’m surprised that the one file is using the ISO_FORTRAN_ENV 
> type names but configure is only checking for the non-standard REAL*# names.  
> Feels like an oversight to me.)
>
> From what I’ve been able to find (and reading their iso_fortran_env.f90 
> module file the compiler should support the two types. I barely know enough 
> FORTRAN to be dangerous (took one semester of F77 in like 1996) so I’m not 
> really sure what I’m looking at here or what to try next.
>
> I would really like to provide an openmpi build to my users that is a “pure 
> LLVM” build.
>
> -Alan
> --
> a...@madllama.net http://humbleville.blogspot.com


configure.diff
Description: Binary data