Re: [OMPI users] Building with thread support on Windows?

2011-09-21 Thread Shiqing Fan

Hi Bjorn,

Unfortunately, the current version of Open MPI for Windows supports 
neither POSIX nor Solaris threads.


However, work on MinGW support is proceeding; this will make it possible 
to build Open MPI on Windows with the GNU compilers and may provide 
partial pthread support, but it still needs a lot of testing.



Regards,
Shiqing



On 2011-09-21 8:33 PM, Björn Regnström wrote:
I am building with VS 2008, using the compiler (cl) and the standard 
libraries that go with it, including the Windows thread library. I have 
noted that ompi_info requires either POSIX or Solaris threads to report 
that Open MPI has thread support. Do I need to change the thread library 
and/or do I need another compiler?

Regards,
Bjorn Regnstrom

On Wednesday, 2011-09-21 at 17:32, Tim Prince wrote:

On 9/21/2011 11:18 AM, Björn Regnström wrote:
> Hi,
>
> I am trying to build Open MPI 1.4.3 with thread support on Windows. A
> trivial test program runs if it calls MPI_Init or
> MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
> with required=0 but hangs if required>0. ompi_info for my build reports
> that there is no thread support but MPI_Init_thread returns
> provided==required.
>
> The only change in the CMake configuration was to check
> OMPI_ENABLE_MPI_THREADS.
> Is there anything else that needs to be done with the configuration?
>
> I have built 1.4.3 with thread support on several Linuxes and Mac, and
> it works fine there.
>
Not all Windows compilers work well enough with all threading models for
you to expect satisfactory results; in particular, the compilers and
thread libraries you use on Linux may not be adequate for Windows thread
support.


-- 
Tim Prince

___
users mailing list
us...@open-mpi.org 
http://www.open-mpi.org/mailman/listinfo.cgi/users






--
---
Shiqing Fan
High Performance Computing Center Stuttgart (HLRS)
Tel: ++49(0)711-685-87234  Nobelstrasse 19
Fax: ++49(0)711-685-65832  70569 Stuttgart
http://www.hlrs.de/organization/people/shiqing-fan/
email: f...@hlrs.de



Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote:

>> What happens if you run 2 ibv_rc_pingpong's on each node?  Or N 
>> ibv_rc_pingpongs?
> 
> With 11 ibv_rc_pingpong's
> 
> http://pastebin.com/85sPcA47
> 
> Code to do that => https://gist.github.com/1233173
> 
> Latencies are around 20 microseconds.

This seems to imply that the network is to blame for the higher latency...?

I.e., if you run the same pattern with MPI processes and get 20us latency, that 
would tend to imply that the network itself is not performing well with that IO 
pattern.

> My job seems to do well so far with ofud !
> 
> [sboisver12@colosse2 ray]$ qstat
> job-ID   prior     name        user        state  submit/start at      queue          slots  ja-task-ID
> --------------------------------------------------------------------------------------------------------
> 3047460  0.55384   fish-Assem  sboisver12  r      09/21/2011 15:02:25  med@r104-n58   256

I would still be suspicious -- ofud is not well tested, and it can definitely 
hang if there are network drops.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
> What happens if you run 2 ibv_rc_pingpong's on each node?  Or N 
> ibv_rc_pingpongs?

With 11 ibv_rc_pingpong's

http://pastebin.com/85sPcA47

Code to do that => https://gist.github.com/1233173

Latencies are around 20 microseconds.




My job seems to do well so far with ofud !


[sboisver12@colosse2 ray]$ qstat
job-ID   prior     name        user        state  submit/start at      queue          slots  ja-task-ID
--------------------------------------------------------------------------------------------------------
3047460  0.55384   fish-Assem  sboisver12  r      09/21/2011 15:02:25  med@r104-n58   256




> 
> From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of 
> Jeff Squyres [jsquy...@cisco.com]
> Sent: September 21, 2011 15:28
> To: Open MPI Users
> Subject: Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI 
> 1.4.3, Mellanox Infiniband and 256 MPI ranks
> 
> On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote:
> 
>> Meanwhile, I contacted some people at SciNet, which is also part of Compute 
>> Canada.
>>
>> They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl 
>> self,ofud to use the ofud BTL instead of openib for OpenFabrics transport.
>>
>> This worked quite well -- I got a low latency of 35 microseconds. Yay!
> 
> That's still pretty terrible.
> 
> Per your comments below, yes, ofud was never finished.  I believe it doesn't 
> have retransmission code in there, so if anything is dropped by the network 
> (which, in a congested/busy network, there will be drops), the job will 
> likely hang.
> 
> The ofud and openib BTLs should have similar latencies.  Indeed, openib 
> should actually have slightly lower HRT ping-pong latencies because of 
> protocol and transport differences between the two.
> 
> The openib BTL should give about the same latency as the ibv_rc_pingpong, 
> which you cited at about 11 microseconds (I assume there must be multiple 
> hops in that IB network to be that high), which jibes with your "only 1 
> process sends" RAY network test (http://pastebin.com/dWMXsHpa).
> 
> It's not uncommon for latency to go up if multiple processes are all banging 
> on the HCA, but it shouldn't go up noticeably if there's only 2 processes on 
> each node doing simple ping-pong tests, for example.
> 
> What happens if you run 2 ibv_rc_pingpong's on each node?  Or N 
> ibv_rc_pingpongs?
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


Re: [OMPI users] Typo in MPI_Cart_coords man page

2011-09-21 Thread Jeff Squyres
Fixed in the trunk; thanks!

On Sep 19, 2011, at 3:14 PM, Jeremiah Willcock wrote:

> The bottom of the MPI_Cart_coords man page (in SVN trunk as well as some 
> releases) states:
> 
> The inverse mapping, rank-to-coordinates translation is provided by 
> MPI_Cart_coords.
> 
> Although that is true, we are already in the man page for MPI_Cart_coords, 
> and so the inverse here should be the mapping from coordinates to rank.
> 
> -- Jeremiah Willcock
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote:

> Meanwhile, I contacted some people at SciNet, which is also part of Compute 
> Canada. 
> 
> They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl 
> self,ofud to use the ofud BTL instead of openib for OpenFabrics transport.
> 
> This worked quite well -- I got a low latency of 35 microseconds. Yay!

That's still pretty terrible.

Per your comments below, yes, ofud was never finished.  I believe it doesn't 
have retransmission code in there, so if anything is dropped by the network 
(which, in a congested/busy network, there will be drops), the job will likely 
hang.

The ofud and openib BTLs should have similar latencies.  Indeed, openib should 
actually have slightly lower HRT ping-pong latencies because of protocol and 
transport differences between the two.

The openib BTL should give about the same latency as the ibv_rc_pingpong, which 
you cited at about 11 microseconds (I assume there must be multiple hops in 
that IB network to be that high), which jibes with your "only 1 process sends" 
RAY network test (http://pastebin.com/dWMXsHpa).

It's not uncommon for latency to go up if multiple processes are all banging on 
the HCA, but it shouldn't go up noticeably if there's only 2 processes on each 
node doing simple ping-pong tests, for example.

What happens if you run 2 ibv_rc_pingpong's on each node?  Or N 
ibv_rc_pingpongs?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
Hi Yevgeny,


You are right about comparing apples with apples.

But MVAPICH2 is not installed on colosse, which is in the CLUMEQ consortium, a 
part of Compute Canada.


Meanwhile, I contacted some people at SciNet, which is also part of Compute 
Canada. 


They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl 
self,ofud to use the ofud BTL instead of openib for OpenFabrics transport.


This worked quite well -- I got a low latency of 35 microseconds. Yay!


See http://pastebin.com/VpAd1NrK for the Grid Engine submission script and for 
the Ray latency output.





With Open-MPI 1.4.3, gcc 4.4.2 and --mca btl self,ofud, the job hangs 
somewhere before Ray starts, I presume, because there is nothing in the 
standard output and nothing in the standard error.

One thing I noticed is that the load on a given node is 7, not 8, which is 
strange because there should, in theory, be 8 instances of Ray on each node.

See http://pastebin.com/gVMjQ9Ra





According to the Open-MPI mailing list, ofud "was never really finished".

See http://www.open-mpi.org/community/lists/users/2010/12/14977.php


Could that unfinished status explain why it works with the Intel compiler but 
not with the GNU compiler?


libibverbs is used on colosse, if that matters.



  Sébastien

 http://github.com/sebhtml/ray

> 
> From: Yevgeny Kliteynik [klit...@dev.mellanox.co.il]
> Sent: September 20, 2011 08:14
> To: Open MPI Users
> Cc: Sébastien Boisvert
> Subject: Re: [OMPI users] Latency of 250 microseconds with Open-MPI 1.4.3, 
> Mellanox Infiniband and 256 MPI ranks
> 
> Hi Sébastien,
> 
> If I understand you correctly, you are running your application on two
> different MPIs on two different clusters with two different IB vendors.
> 
> Could you make a comparison more "apples to apples"-ish?
> For instance:
> - run the same version of Open MPI on both clusters
> - run the same version of MVAPICH on both clusters
> 
> 
> -- YK
> 
> On 18-Sep-11 1:59 AM, Sébastien Boisvert wrote:
>> Hello,
>>
>> Open-MPI 1.4.3 on Mellanox Infiniband hardware gives a latency of 250 
>> microseconds with 256 MPI ranks on super-computer A (name is colosse).
>>
>> The same software gives a latency of 10 microseconds with MVAPICH2 and 
>> QLogic Infiniband hardware with 512 MPI ranks on super-computer B (name is 
>> guillimin).
>>
>>
>> Here are the relevant information listed in 
>> http://www.open-mpi.org/community/help/
>>
>>
>> 1. Check the FAQ first.
>>
>> done !
>>
>>
>> 2. The version of Open MPI that you're using.
>>
>> Open-MPI 1.4.3
>>
>>
>> 3. The config.log file from the top-level Open MPI directory, if available 
>> (please compress!).
>>
>> See below.
>>
>> Command file: http://pastebin.com/mW32ntSJ
>>
>>
>> 4. The output of the "ompi_info --all" command from the node where you're 
>> invoking mpirun.
>>
>> ompi_info -a on colosse: http://pastebin.com/RPyY9s24
>>
>>
>> 5. If running on more than one node -- especially if you're having problems 
>> launching Open MPI processes -- also include the output of the "ompi_info -v 
>> ompi full --parsable" command from each node on which you're trying to run.
>>
>> I am not having problems launching Open-MPI processes.
>>
>>
>> 6. A detailed description of what is failing.
>>
>> Open-MPI 1.4.3 on Mellanox Infiniband hardware gives a latency of 250 
>> microseconds with 256 MPI ranks on super-computer A (name is colosse).
>>
>> The same software gives a latency of 10 microseconds with MVAPICH2 and 
>> QLogic Infiniband hardware on 512 MPI ranks on super-computer B (name is 
>> guillimin).
>>
>> Details follow.
>>
>>
>> I am developing a distributed genome assembler that runs with the 
>> message-passing interface (I am a PhD student).
>> It is called Ray. Link: http://github.com/sebhtml/ray
>>
>> I recently added the option -test-network-only so that Ray can be used to 
>> test the latency. Each MPI rank has to send 10 messages (4000 bytes 
>> each), one by one.
>> The destination of any message is picked up at random.
>>
>>
>> On colosse, a super-computer located at Laval University, I get an average 
>> latency of 250 microseconds with the test done in Ray.
>>
>> See http://pastebin.com/9nyjSy5z
>>
>> On colosse, the hardware is Mellanox Infiniband QDR ConnectX and the MPI 
>> middleware is Open-MPI 1.4.3 compiled with gcc 4.4.2.
>>
>> colosse has 8 compute cores per node (Intel Nehalem).
>>
>>
>> Testing the latency with ibv_rc_pingpong on colosse gives 11 microseconds.
>>
>>local address:  LID 0x048e, QPN 0x1c005c, PSN 0xf7c66b
>>remote address: LID 0x018c, QPN 0x2c005c, PSN 0x5428e6
>> 8192000 bytes in 0.01 seconds = 5776.64 Mbit/sec
>> 1000 iters in 0.01 seconds = 11.35 usec/iter
>>
>> So I know that the Infiniband has a correct latency between two HCAs because 
>> of the output of ibv_rc_pingpong.
>>
>>
>>
>> Adding the parameter --mca btl_openib_verbose 1 to mpirun shows that 

Re: [OMPI users] Building with thread support on Windows?

2011-09-21 Thread Björn Regnström
I am building with VS 2008, using the compiler (cl) and the standard
libraries that go with it, including the Windows thread library. I have
noted that ompi_info requires either POSIX or Solaris threads to report
that Open MPI has thread support. Do I need to change the thread library
and/or do I need another compiler?
 
Regards,
Bjorn Regnstrom

On Wednesday, 2011-09-21 at 17:32, Tim Prince wrote:

On 9/21/2011 11:18 AM, Björn Regnström wrote:
> Hi,
>
> I am trying to build Open MPI 1.4.3 with thread support on Windows. A
> trivial test program runs if it calls MPI_Init or
> MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
> with required=0 but hangs if required>0. ompi_info for my build reports
> that there is no thread support but MPI_Init_thread returns
> provided==required.
>
> The only change in the CMake configuration was to check
> OMPI_ENABLE_MPI_THREADS.
> Is there anything else that needs to be done with the configuration?
>
> I have built 1.4.3 with thread support on several Linuxes and Mac, and
> it works fine there.
>
Not all Windows compilers work well enough with all threading models for
you to expect satisfactory results; in particular, the compilers and
thread libraries you use on Linux may not be adequate for Windows thread
support.

-- 
Tim Prince
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] EXTERNAL: Re: Question about compilng with fPIC

2011-09-21 Thread Tim Prince

On 9/21/2011 12:22 PM, Blosch, Edwin L wrote:

Thanks Tim.

I'm compiling source units and linking them into an executable.  Or perhaps you 
are talking about how OpenMPI itself is built?  Excuse my ignorance...

The source code units are compiled like this:
/usr/mpi/intel/openmpi-1.4.3/bin/mpif90 -D_GNU_SOURCE -traceback -align -pad 
-xHost -falign-functions -fpconstant -O2 -I. 
-I/usr/mpi/intel/openmpi-1.4.3/include -c ../code/src/main/main.f90

The link step is like this:
/usr/mpi/intel/openmpi-1.4.3/bin/mpif90 -D_GNU_SOURCE -traceback -align -pad -xHost 
-falign-functions -fpconstant -static-intel -o ../bin/
-lstdc++

OpenMPI itself was configured like this:
./configure --prefix=/release/cfd/openmpi-intel --without-tm --without-sge 
--without-lsf --without-psm --without-portals --without-gm --without-elan 
--without-mx --without-slurm --without-loadleveler 
--enable-mpirun-prefix-by-default --enable-contrib-no-build=vt 
--enable-mca-no-build=maffinity --disable-per-user-config-files 
--disable-io-romio --with-mpi-f90-size=small --enable-static --disable-shared 
CXX=/appserv/intel/Compiler/11.1/072/bin/intel64/icpc 
CC=/appserv/intel/Compiler/11.1/072/bin/intel64/icc 'CFLAGS=  -O2' 'CXXFLAGS=  
-O2' F77=/appserv/intel/Compiler/11.1/072/bin/intel64/ifort 
'FFLAGS=-D_GNU_SOURCE -traceback  -O2' 
FC=/appserv/intel/Compiler/11.1/072/bin/intel64/ifort 'FCFLAGS=-D_GNU_SOURCE 
-traceback  -O2' 'LDFLAGS= -static-intel'

ldd output on the final executable gives:
 linux-vdso.so.1 =>   (0x7fffb77e7000)
 libstdc++.so.6 =>  /usr/lib64/libstdc++.so.6 (0x2b2e2b652000)
 libibverbs.so.1 =>  /usr/lib64/libibverbs.so.1 (0x2b2e2b95e000)
 libdl.so.2 =>  /lib64/libdl.so.2 (0x2b2e2bb6d000)
 libnsl.so.1 =>  /lib64/libnsl.so.1 (0x2b2e2bd72000)
 libutil.so.1 =>  /lib64/libutil.so.1 (0x2b2e2bf8a000)
 libm.so.6 =>  /lib64/libm.so.6 (0x2b2e2c18d000)
 libpthread.so.0 =>  /lib64/libpthread.so.0 (0x2b2e2c3e4000)
 libc.so.6 =>  /lib64/libc.so.6 (0x2b2e2c60)
 libgcc_s.so.1 =>  /lib64/libgcc_s.so.1 (0x2b2e2c959000)
 /lib64/ld-linux-x86-64.so.2 (0x2b2e2b433000)

Do you see anything that suggests I should have been compiling the application 
and/or OpenMPI with -fPIC?

If you were building any OpenMPI shared libraries, those should use 
-fPIC. configure may have made the necessary additions. If your 
application had shared libraries, you would require -fPIC, but 
apparently you had none.  The shared libraries you show presumably 
weren't involved in your MPI or application build, and you must have 
linked in static versions of your MPI libraries, where -fPIC wouldn't be 
required.



--
Tim Prince


Re: [OMPI users] Question about compilng with fPIC

2011-09-21 Thread Blosch, Edwin L
Follow-up:  I misread the coding, so now I think mpi_iprobe is probably not 
being used for this case.  I'll have to pin the blame somewhere else.  -fPIC 
definitely fixes the problem, as I tried removing -mcmodel=medium and it still 
worked.   Our usual communication pattern is mpi_irecv, mpi_isend, mpi_waitall; 
perhaps there is something unhealthy in the semantics there.

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Blosch, Edwin L
Sent: Wednesday, September 21, 2011 10:44 AM
To: Open MPI Users
Subject: EXTERNAL: [OMPI users] Question about compilng with fPIC

Follow-up to a mislabeled thread:  "How could OpenMPI (or MVAPICH) affect 
floating-point results?"

I have found a solution to my problem, but I would like to understand the 
underlying issue better.

To rehash: An Intel-compiled executable linked with MVAPICH runs fine; linked 
with OpenMPI fails.  The earliest symptom I could see was some strange 
difference in numerical values of quantities that should be unaffected by MPI 
calls.  Tim's advice guided me to assume memory corruption. Eugene's advice 
guided me to explore the detailed differences in compilation.  

I observed that the MVAPICH mpif90 wrapper adds -fPIC.

I tried adding -fPIC and -mcmodel=medium to the compilation of the 
OpenMPI-linked executable.  Now it works fine. I haven't tried without 
-mcmodel=medium, but my guess is -fPIC did the trick.

Does anyone know why compiling with -fPIC has helped?  Does it suggest an 
application problem or an OpenMPI problem?

To note: This is an Infiniband-based cluster.  The application does pretty 
basic MPI-1 operations: send, recv, bcast, reduce, allreduce, gather, 
isend, irecv, waitall.  There is one task that uses iprobe with MPI_ANY_TAG, 
but this task is only involved in certain cases (including this one). 
Conversely, cases that do not call iprobe have not yet been observed to crash.  
I am deducing that this function is the problem.

Thanks,

Ed

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Blosch, Edwin L
Sent: Tuesday, September 20, 2011 11:46 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect 
floating-point results?

Thank you for this explanation.  I will assume that my problem here is some 
kind of memory corruption.


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Tim Prince
Sent: Tuesday, September 20, 2011 10:36 AM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect 
floating-point results?

On 9/20/2011 10:50 AM, Blosch, Edwin L wrote:

> It appears to be a side effect of linkage that is able to change a 
> compute-only routine's answers.
>
> I have assumed that max/sqrt/tiny/abs might be replaced, but some other kind 
> of corruption may be going on.
>

Those intrinsics have direct instruction set translations which 
shouldn't vary from -O1 on up nor with linkage options nor be affected 
by MPI or insertion of WRITEs.

-- 
Tim Prince
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] EXTERNAL: Re: Question about compilng with fPIC

2011-09-21 Thread Blosch, Edwin L
Thanks Tim.

I'm compiling source units and linking them into an executable.  Or perhaps you 
are talking about how OpenMPI itself is built?  Excuse my ignorance...

The source code units are compiled like this:
/usr/mpi/intel/openmpi-1.4.3/bin/mpif90 -D_GNU_SOURCE -traceback -align -pad 
-xHost -falign-functions -fpconstant -O2 -I. 
-I/usr/mpi/intel/openmpi-1.4.3/include -c ../code/src/main/main.f90

The link step is like this:
/usr/mpi/intel/openmpi-1.4.3/bin/mpif90 -D_GNU_SOURCE -traceback -align -pad 
-xHost -falign-functions -fpconstant -static-intel -o ../bin/  -lstdc++

OpenMPI itself was configured like this:
./configure --prefix=/release/cfd/openmpi-intel --without-tm --without-sge 
--without-lsf --without-psm --without-portals --without-gm --without-elan 
--without-mx --without-slurm --without-loadleveler 
--enable-mpirun-prefix-by-default --enable-contrib-no-build=vt 
--enable-mca-no-build=maffinity --disable-per-user-config-files 
--disable-io-romio --with-mpi-f90-size=small --enable-static --disable-shared 
CXX=/appserv/intel/Compiler/11.1/072/bin/intel64/icpc 
CC=/appserv/intel/Compiler/11.1/072/bin/intel64/icc 'CFLAGS=  -O2' 'CXXFLAGS=  
-O2' F77=/appserv/intel/Compiler/11.1/072/bin/intel64/ifort 
'FFLAGS=-D_GNU_SOURCE -traceback  -O2' 
FC=/appserv/intel/Compiler/11.1/072/bin/intel64/ifort 'FCFLAGS=-D_GNU_SOURCE 
-traceback  -O2' 'LDFLAGS= -static-intel'

ldd output on the final executable gives: 
linux-vdso.so.1 =>  (0x7fffb77e7000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x2b2e2b652000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x2b2e2b95e000)
libdl.so.2 => /lib64/libdl.so.2 (0x2b2e2bb6d000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x2b2e2bd72000)
libutil.so.1 => /lib64/libutil.so.1 (0x2b2e2bf8a000)
libm.so.6 => /lib64/libm.so.6 (0x2b2e2c18d000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x2b2e2c3e4000)
libc.so.6 => /lib64/libc.so.6 (0x2b2e2c60)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x2b2e2c959000)
/lib64/ld-linux-x86-64.so.2 (0x2b2e2b433000)

Do you see anything that suggests I should have been compiling the application 
and/or OpenMPI with -fPIC?

Thanks

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Tim Prince
Sent: Wednesday, September 21, 2011 10:53 AM
To: us...@open-mpi.org
Subject: EXTERNAL: Re: [OMPI users] Question about compilng with fPIC

On 9/21/2011 11:44 AM, Blosch, Edwin L wrote:
> Follow-up to a mislabeled thread:  "How could OpenMPI (or MVAPICH) affect 
> floating-point results?"
>
> I have found a solution to my problem, but I would like to understand the 
> underlying issue better.
>
> To rehash: An Intel-compiled executable linked with MVAPICH runs fine; linked 
> with OpenMPI fails.  The earliest symptom I could see was some strange 
> difference in numerical values of quantities that should be unaffected by MPI 
> calls.  Tim's advice guided me to assume memory corruption. Eugene's advice 
> guided me to explore the detailed differences in compilation.
>
> I observed that the MVAPICH mpif90 wrapper adds -fPIC.
>
> I tried adding -fPIC and -mcmodel=medium to the compilation of the 
> OpenMPI-linked executable.  Now it works fine. I haven't tried without 
> -mcmodel=medium, but my guess is -fPIC did the trick.
>
> Does anyone know why compiling with -fPIC has helped?  Does it suggest an 
> application problem or an OpenMPI problem?
>
> To note: This is an Infiniband-based cluster.  The application does pretty 
> basic MPI-1 operations: send, recv, bcast, reduce, allreduce, gather, 
> isend, irecv, waitall.  There is one task that uses iprobe with MPI_ANY_TAG, 
> but this task is only involved in certain cases (including this one). 
> Conversely, cases that do not call iprobe have not yet been observed to 
> crash.  I am deducing that this function is the problem.
>

If you are making a .so, the included .o files should be built with 
-fPIC or similar. Ideally, the configure and build tools would enforce this.

-- 
Tim Prince
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] Question about compilng with fPIC

2011-09-21 Thread Tim Prince

On 9/21/2011 11:44 AM, Blosch, Edwin L wrote:

Follow-up to a mislabeled thread:  "How could OpenMPI (or MVAPICH) affect 
floating-point results?"

I have found a solution to my problem, but I would like to understand the 
underlying issue better.

To rehash: An Intel-compiled executable linked with MVAPICH runs fine; linked 
with OpenMPI fails.  The earliest symptom I could see was some strange 
difference in numerical values of quantities that should be unaffected by MPI 
calls.  Tim's advice guided me to assume memory corruption. Eugene's advice 
guided me to explore the detailed differences in compilation.

I observed that the MVAPICH mpif90 wrapper adds -fPIC.

I tried adding -fPIC and -mcmodel=medium to the compilation of the 
OpenMPI-linked executable.  Now it works fine. I haven't tried without 
-mcmodel=medium, but my guess is -fPIC did the trick.

Does anyone know why compiling with -fPIC has helped?  Does it suggest an 
application problem or an OpenMPI problem?

To note: This is an Infiniband-based cluster.  The application does pretty 
basic MPI-1 operations: send, recv, bcast, reduce, allreduce, gather, 
isend, irecv, waitall.  There is one task that uses iprobe with MPI_ANY_TAG, 
but this task is only involved in certain cases (including this one). 
Conversely, cases that do not call iprobe have not yet been observed to crash.  
I am deducing that this function is the problem.



If you are making a .so, the included .o files should be built with 
-fPIC or similar. Ideally, the configure and build tools would enforce this.


--
Tim Prince


[OMPI users] Question about compilng with fPIC

2011-09-21 Thread Blosch, Edwin L
Follow-up to a mislabeled thread:  "How could OpenMPI (or MVAPICH) affect 
floating-point results?"

I have found a solution to my problem, but I would like to understand the 
underlying issue better.

To rehash: An Intel-compiled executable linked with MVAPICH runs fine; linked 
with OpenMPI fails.  The earliest symptom I could see was some strange 
difference in numerical values of quantities that should be unaffected by MPI 
calls.  Tim's advice guided me to assume memory corruption. Eugene's advice 
guided me to explore the detailed differences in compilation.  

I observed that the MVAPICH mpif90 wrapper adds -fPIC.

I tried adding -fPIC and -mcmodel=medium to the compilation of the 
OpenMPI-linked executable.  Now it works fine. I haven't tried without 
-mcmodel=medium, but my guess is -fPIC did the trick.

Does anyone know why compiling with -fPIC has helped?  Does it suggest an 
application problem or an OpenMPI problem?

To note: This is an Infiniband-based cluster.  The application does pretty 
basic MPI-1 operations: send, recv, bcast, reduce, allreduce, gather, 
isend, irecv, waitall.  There is one task that uses iprobe with MPI_ANY_TAG, 
but this task is only involved in certain cases (including this one). 
Conversely, cases that do not call iprobe have not yet been observed to crash.  
I am deducing that this function is the problem.

Thanks,

Ed

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Blosch, Edwin L
Sent: Tuesday, September 20, 2011 11:46 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect 
floating-point results?

Thank you for this explanation.  I will assume that my problem here is some 
kind of memory corruption.


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Tim Prince
Sent: Tuesday, September 20, 2011 10:36 AM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect 
floating-point results?

On 9/20/2011 10:50 AM, Blosch, Edwin L wrote:

> It appears to be a side effect of linkage that is able to change a 
> compute-only routine's answers.
>
> I have assumed that max/sqrt/tiny/abs might be replaced, but some other kind 
> of corruption may be going on.
>

Those intrinsics have direct instruction set translations which 
shouldn't vary from -O1 on up nor with linkage options nor be affected 
by MPI or insertion of WRITEs.

-- 
Tim Prince
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] Building with thread support on Windows?

2011-09-21 Thread Tim Prince

On 9/21/2011 11:18 AM, Björn Regnström wrote:

Hi,

I am trying to build Open MPI 1.4.3 with thread support on Windows. A
trivial test program runs if it calls MPI_Init or
MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
with required=0 but hangs if required>0. ompi_info for my build reports
that there is no thread support but MPI_Init_thread returns
provided==required.

The only change in the CMake configuration was to check
OMPI_ENABLE_MPI_THREADS.
Is there anything else that needs to be done with the configuration?

I have built 1.4.3 with thread support on several Linuxes and Mac, and it
works fine there.

Not all Windows compilers work well enough with all threading models for 
you to expect satisfactory results; in particular, the compilers and 
thread libraries you use on Linux may not be adequate for Windows thread 
support.



--
Tim Prince


[OMPI users] Building with thread support on Windows?

2011-09-21 Thread Björn Regnström
Hi,

I am trying to build Open MPI 1.4.3 with thread support on Windows. A
trivial test program runs if it calls MPI_Init or
MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
with required=0 but hangs if required>0. ompi_info for my build reports
that there is no thread support but MPI_Init_thread returns
provided==required.

The only change in the CMake configuration was to check
OMPI_ENABLE_MPI_THREADS. 
Is there anything else that needs to be done with the configuration? 

I have built 1.4.3 with thread support on several Linuxes and Mac, and it
works fine there.

Regards,
Bjorn Regnstrom