Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Jonathan Dursi

On 23 May 9:37PM, Jonathan Dursi wrote:


On the other hand, it works everywhere if I pad the rcounts array with
an extra valid value (0 or 1, or for that matter 783), or replace the
allgatherv with an allgather.


.. and it fails with 7 even where it worked (but succeeds with 8) if I 
pad rcounts with an extra invalid value which should never be read.


Should the recvcounts[] parameter test in allgatherv.c loop up to 
size = ompi_comm_remote_size(comm), as is done in alltoallv.c, rather than 
ompi_comm_size(comm)?  That seems to avoid the problem.
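
For illustration, the change would look roughly like this -- a sketch only, 
not the actual Open MPI source:

   /* Sketch of the suggested recvcounts[] check: on an intercommunicator
    * the receive counts describe the *remote* group, so bound the loop
    * by the remote size (as alltoallv.c does) rather than by
    * ompi_comm_size(comm). */
   int i, size, err = MPI_SUCCESS;
   size = ompi_comm_remote_size(comm);
   for (i = 0; i < size; ++i) {
       if (recvcounts[i] < 0) {
           err = MPI_ERR_COUNT;
           break;
       }
   }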


   - Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca


Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Bennet Fauber
In case it is helpful to those who may not have the Intel compilers, these 
are the libraries against which the two executables of Lisandro's 
allgather.c get linked:



with Intel compilers:
=
$ ldd a.out
linux-vdso.so.1 =>  (0x7fffb7dfd000)
libmpi.so.0 => 
/home/software/rhel5/openmpi-1.4.3/intel-11.0/lib/libmpi.so.0 (0x2b460cec2000)
libopen-rte.so.0 => 
/home/software/rhel5/openmpi-1.4.3/intel-11.0/lib/libopen-rte.so.0 
(0x2b460d198000)
libopen-pal.so.0 => 
/home/software/rhel5/openmpi-1.4.3/intel-11.0/lib/libopen-pal.so.0 
(0x2b460d40)
libdl.so.2 => /lib64/libdl.so.2 (0x003f6a20)
libnsl.so.1 => /lib64/libnsl.so.1 (0x003f6fe0)
libutil.so.1 => /lib64/libutil.so.1 (0x003f7460)
libm.so.6 => /lib64/libm.so.6 (0x003f6a60)
libpthread.so.0 => /lib64/libpthread.so.0 (0x003f6ae0)
libc.so.6 => /lib64/libc.so.6 (0x003f69e0)
libimf.so => /usr/caen/intel-11.0/fc/11.0.074/lib/intel64/libimf.so 
(0x2b460d69f000)
libsvml.so => /usr/caen/intel-11.0/fc/11.0.074/lib/intel64/libsvml.so 
(0x2b460d9f6000)
libintlc.so.5 => 
/usr/caen/intel-11.0/fc/11.0.074/lib/intel64/libintlc.so.5 (0x2b460dbb3000)
libgcc_s.so.1 => /home/software/rhel5/gcc/4.6.2/lib64/libgcc_s.so.1 
(0x2b460dcf)
/lib64/ld-linux-x86-64.so.2 (0x003f69a0)
=


with GCC 4.6.2
=
$ ldd a.out
linux-vdso.so.1 =>  (0x7fff93dfd000)
libmpi.so.0 => 
/home/software/rhel5/openmpi-1.4.4/gcc-4.6.2/lib/libmpi.so.0 (0x2ab3ba523000)
libopen-rte.so.0 => 
/home/software/rhel5/openmpi-1.4.4/gcc-4.6.2/lib/libopen-rte.so.0 
(0x2ab3ba7cf000)
libopen-pal.so.0 => 
/home/software/rhel5/openmpi-1.4.4/gcc-4.6.2/lib/libopen-pal.so.0 
(0x2ab3baa1d000)
libdl.so.2 => /lib64/libdl.so.2 (0x003f6a20)
libnsl.so.1 => /lib64/libnsl.so.1 (0x003f6fe0)
libutil.so.1 => /lib64/libutil.so.1 (0x003f7460)
libm.so.6 => /lib64/libm.so.6 (0x003f6a60)
libpthread.so.0 => /lib64/libpthread.so.0 (0x003f6ae0)
libc.so.6 => /lib64/libc.so.6 (0x003f69e0)
/lib64/ld-linux-x86-64.so.2 (0x003f69a0)
=

-- bennet
--
East Hall Technical Services
Mathematics and Psychology Research Computing
University of Michigan
(734) 763-1182


Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Jonathan Dursi
Fails for me with 1.4.3 with gcc, but works with intel; works with 1.4.4 
with gcc or intel; fails with 1.5.5 with either.   Succeeds with intelmpi.


On the other hand, it works everywhere if I pad the rcounts array  with 
an extra valid value (0 or 1, or for that matter 783), or replace the 
allgatherv with an allgather.
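
A rough sketch of that padding workaround (buffer names and counts are 
illustrative, not taken from the actual test):

   /* Over-allocate rcounts by one entry so that a stray read of
    * rcounts[remote_size] -- which should never happen -- still hits
    * valid memory.  Everything here is illustrative. */
   int i;
   int *rcounts = malloc((remote_size + 1) * sizeof(int));
   for (i = 0; i < remote_size; ++i)
       rcounts[i] = 1;
   rcounts[remote_size] = 0;   /* padding entry; any valid count works */
   MPI_Allgatherv(sendbuf, 1, MPI_INT,
                  recvbuf, rcounts, displs, MPI_INT, intercomm);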


  - Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca


Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Bennet Fauber

On Wed, 23 May 2012, Lisandro Dalcin wrote:


On 23 May 2012 19:04, Jeff Squyres  wrote:

Thanks for all the info!

But still, can we get a copy of the test in C?  That would make it 
significantly easier for us to tell if there is a problem with Open MPI -- 
mainly because we don't know anything about the internals of mpi4py.


FYI, this test ran fine with previous (but recent, let's say 1.5.4)
OpenMPI versions, but fails with 1.6. The test also runs fine with
MPICH2.


I compiled the C example Lisandro provided using openmpi/1.4.3 compiled 
against the Intel 11.0 compilers, and it ran the first time.  I then 
recompiled using gcc 4.6.2 and openmpi 1.4.4, and it provided the 
following errors:


$ mpirun -np 5 a.out
[hostname:6601] *** An error occurred in MPI_Allgatherv
[hostname:6601] *** on communicator
[hostname:6601] *** MPI_ERR_COUNT: invalid count argument
[hostname:6601] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--
mpirun has exited due to process rank 4 with PID 6601 on
node hostname exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

I then recompiled using the Intel compilers, and it runs without error 10 
out of 10 times.


I then recompiled using the gcc 4.6.2/openmpi 1.4.4 combination, and it 
fails consistently.


On the second and subsequent tries, it provides the following additional 
errors:


$ mpirun -np 5 a.out
[hostname:7168] *** An error occurred in MPI_Allgatherv
[hostname:7168] *** on communicator
[hostname:7168] *** MPI_ERR_COUNT: invalid count argument
[hostname:7168] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--
mpirun has exited due to process rank 2 with PID 7168 on
node hostname exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--
[hostname:07163] 1 more process has sent help message help-mpi-errors.txt / 
mpi_errors_are_fatal
[hostname:07163] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
help / error messages

Not sure if that information is helpful or not.

I am still completely puzzled why the number 5 is magic.

-- bennet

Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread Jeff Squyres
On May 23, 2012, at 6:20 PM, marco atzeri wrote:

> ~ 90% of the time we have mismatch problems between upstream and
> cygwin on autoconf/automake/libtool versions that are not cygwin
> aware or updated.

Ok, fair enough.

I'd be curious if you actually need to do this with Open MPI -- we use very 
recent versions of the GNU Autotools to bootstrap our tarballs.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread marco atzeri

On 5/23/2012 11:20 PM, Jeff Squyres wrote:

On May 23, 2012, at 9:53 AM, marco atzeri wrote:


experience says that autoreconf is a good approach on cygwin,
it is almost standard on our package build procedure.


I'm still curious: why?  (I'm *assuming* that you're building from an official 
Open MPI tarball -- is that incorrect?)

I ask because we've already run autoreconf, meaning that official Open MPI 
tarballs are fully bootstrapped and do not need to have autogen (i.e., 
ultimately autoreconf) re-run on them.

Specifically: I'm unaware of a reason why you should need to re-run autogen 
(autoreconf) on an otherwise-unaltered Open MPI that was freshly extracted from 
a tarball.  Does something happen differently if you *don't* re-run autogen 
(autoreconf)?

Re-running autogen shouldn't be causing you any problems, of course -- this is 
just my curiosity asserting itself...



Hi Jeff,
~ 90% of the time we have mismatch problems between upstream and
cygwin on autoconf/automake/libtool versions that are not cygwin
aware or updated.

As a safe approach, we prefer to apply "autoreconf -i -f" by default when
building binary packages.

see cygautoreconf on
http://cygwin-ports.svn.sourceforge.net/viewvc/cygwin-ports/cygport/trunk/README

Regards
Marco




Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Jeff Squyres
Thanks for all the info!

But still, can we get a copy of the test in C?  That would make it 
significantly easier for us to tell if there is a problem with Open MPI -- 
mainly because we don't know anything about the internals of mpi4py.


On May 23, 2012, at 5:43 PM, Bennet Fauber wrote:

> Thanks, Ralph,
> 
> On Wed, 23 May 2012, Ralph Castain wrote:
> 
>> I don't honestly think many of us have any knowledge of mpi4py. Does this 
>> test work with other MPIs?
> 
> The mpi4py developers have said they've never seen this using mpich2.  I have 
> not been able to test that myself.
> 
>> MPI_Allgather seems to be passing our tests, so I suspect it is something in 
>> the binding. If you can provide the actual test, I'm willing to take a look 
>> at it.
> 
> The actual test is included in the install bundle for mpi4py, along with the 
> C source code used to create the bindings.
> 
>   http://code.google.com/p/mpi4py/downloads/list
> 
> The install is straightforward and simple.  Unpack the tarball, make sure 
> that mpicc is in your path
> 
>   $ cd mpi4py-1.3
>   $ python setup.py build
>   $ python setup.py install --prefix=/your/install
>   $ export PYTHONPATH=/your/install/lib/pythonN.M/site-packages
>   $ mpirun -np 5 python test/runtests.py \
>--verbose --no-threads --include cco_obj_inter
> 
> where N.M are the major.minor numbers of your python distribution.
> 
> What I find most puzzling is that, maybe, 1 out of 10 times it will run to 
> completion with -np 5, and it runs with all other numbers of processors I've 
> tested always.
> 
>   -- bennet
> 
>> On May 23, 2012, at 2:52 PM, Bennet Fauber wrote:
>> 
>>> I've installed the latest mpi4py-1.3 on several systems, and there is a 
>>> repeated bug when running
>>> 
>>> $ mpirun -np 5 python test/runtests.py
>>> 
>>> where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
>>> openmpi-1.3.
>>> 
>>> It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 
>>> 7, 8, 9, 10, 11, and 12.
>>> 
>>> There is a thread on this at
>>> 
>>> http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973
>>> 
>>> where others report being able to replicate, too.
>>> 
>>> The compiler used first was gcc-4.6.2, with openmpi-1.4.4.
>>> 
>>> These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and 
>>> versions of openmpi 1.3.0 and 1.4.4.
>>> 
>>> Lisandro who is the primary developer of mpi4py is able to replicate on 
>>> Fedora 16.
>>> 
>>> Someone else is able to reproduce with
>>> 
>>> [ quoting from the groups.google.com page... ]
>>> ===
>>> It also happens with the current hg version of mpi4py and
>>> $ rpm -qa openmpi gcc python
>>> python-2.7.3-6.fc17.x86_64
>>> gcc-4.7.0-5.fc17.x86_64
>>> openmpi-1.5.4-5.fc17.1.x86_64
>>> ===
>>> 
>>> So, I believe this is a bug to be reported.  Per the advice at
>>> 
>>> http://www.open-mpi.org/community/help/bugs.php
>>> 
>>> If you feel that you do have a definite bug to report but are
>>> unsure which list to post to, then post to the user's list.
>>> 
>>> Please let me know if there is additional information that you need to 
>>> replicate.
>>> 
>>> Some output is included below the signature in case it is useful.
>>> 
>>> -- bennet
>>> --
>>> East Hall Technical Services
>>> Mathematics and Psychology Research Computing
>>> University of Michigan
>>> (734) 763-1182
>>> 
>>> On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7
>>> 
>>> $ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads 
>>> --include cco_obj_inter
>>> [0...@sirocco.math.lsa.umich.edu] Python 2.7 
>>> (/home/bennet/epd7.2.2/bin/python)
>>> [0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
>>> [0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
>>> (build/lib.linux-x86_64-2.7/mpi4py)
>>> [1...@sirocco.math.lsa.umich.edu] Python 2.7 
>>> (/home/bennet/epd7.2.2/bin/python)
>>> [1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
>>> [1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
>>> (build/lib.linux-x86_64-2.7/mpi4py)
>>> [2...@sirocco.math.lsa.umich.edu] Python 2.7 
>>> (/home/bennet/epd7.2.2/bin/python)
>>> [2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
>>> [2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
>>> (build/lib.linux-x86_64-2.7/mpi4py)
>>> [3...@sirocco.math.lsa.umich.edu] Python 2.7 
>>> (/home/bennet/epd7.2.2/bin/python)
>>> [3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
>>> [3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
>>> (build/lib.linux-x86_64-2.7/mpi4py)
>>> [4...@sirocco.math.lsa.umich.edu] Python 2.7 
>>> (/home/bennet/epd7.2.2/bin/python)
>>> [4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
>>> [4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
>>> (build/lib.linux-x86_64-2.7/mpi4py)
>>> testAllgather 

Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Bennet Fauber

Jeff,

Well, not really, since the test is written in Python ;-)

The mpi4py source code is at

http://code.google.com/p/mpi4py/downloads/list

but I'm not sure what else I can provide.

I'm more the reporting middleman here.  I'd be happy to try to connect you 
and the developer of mpi4py.  It seems like openmpi should work regardless 
of what -np value is used, which is what puzzles both me and the mpi4py 
developers.
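
For reference, here is a minimal sketch of the kind of C reproducer being 
asked for -- an MPI_Allgatherv over an intercommunicator with two groups of 
unequal size.  It is a reconstruction for illustration only, not Lisandro's 
actual allgather.c:

   #include <mpi.h>
   #include <stdio.h>
   #include <stdlib.h>

   /* Run with at least 2 processes, e.g. mpirun -np 5 ./a.out */
   int main(int argc, char **argv)
   {
       int rank, size, color, remote_leader, remote_size, sendval, i;
       int *recvbuf, *rcounts, *displs;
       MPI_Comm local, inter;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);

       /* Split into two groups; with an odd -np the groups differ in size. */
       color = (rank < size / 2) ? 0 : 1;
       MPI_Comm_split(MPI_COMM_WORLD, color, rank, &local);

       /* Build an intercommunicator between the two groups. */
       remote_leader = (color == 0) ? size / 2 : 0;
       MPI_Intercomm_create(local, 0, MPI_COMM_WORLD, remote_leader, 0, &inter);

       MPI_Comm_remote_size(inter, &remote_size);

       /* Each process contributes one int; receive one from each remote rank. */
       sendval = rank;
       recvbuf = malloc(remote_size * sizeof(int));
       rcounts = malloc(remote_size * sizeof(int));
       displs  = malloc(remote_size * sizeof(int));
       for (i = 0; i < remote_size; ++i) { rcounts[i] = 1; displs[i] = i; }

       MPI_Allgatherv(&sendval, 1, MPI_INT,
                      recvbuf, rcounts, displs, MPI_INT, inter);

       printf("rank %d: ok\n", rank);
       free(recvbuf); free(rcounts); free(displs);
       MPI_Comm_free(&inter);
       MPI_Comm_free(&local);
       MPI_Finalize();
       return 0;
   }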


-- bennet

On Wed, 23 May 2012, Jeff Squyres wrote:


Can you provide us with a C version of the test?

On May 23, 2012, at 4:52 PM, Bennet Fauber wrote:


I've installed the latest mpi4py-1.3 on several systems, and there is a 
repeated bug when running

$ mpirun -np 5 python test/runtests.py

where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
openmpi-1.3.

It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 7, 
8, 9, 10, 11, and 12.

There is a thread on this at

http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973

where others report being able to replicate, too.

The compiler used first was gcc-4.6.2, with openmpi-1.4.4.

These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and 
versions of openmpi 1.3.0 and 1.4.4.

Lisandro who is the primary developer of mpi4py is able to replicate on Fedora 
16.

Someone else is able to reproduce with

[ quoting from the groups.google.com page... ]
===
It also happens with the current hg version of mpi4py and
$ rpm -qa openmpi gcc python
python-2.7.3-6.fc17.x86_64
gcc-4.7.0-5.fc17.x86_64
openmpi-1.5.4-5.fc17.1.x86_64
===

So, I believe this is a bug to be reported.  Per the advice at

http://www.open-mpi.org/community/help/bugs.php

If you feel that you do have a definite bug to report but are
unsure which list to post to, then post to the user's list.

Please let me know if there is additional information that you need to 
replicate.

Some output is included below the signature in case it is useful.

-- bennet
--
East Hall Technical Services
Mathematics and Psychology Research Computing
University of Michigan
(734) 763-1182

On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7

$ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads 
--include cco_obj_inter
[0...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[1...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[2...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[3...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[4...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ...
[ hangs ]

RHEL5
===
$ python
Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-
unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --prefix=/home/software/rhel6/
gcc/4.7.0 --with-mpfr=/home/software/rhel6/gcc/mpfr-3.1.0/ --with-mpc=/
home/software/rhel6/gcc/mpc-0.9/ --with-gmp=/home/software/rhel6/gcc/
gmp-5.0.5/ --disable-multilib
Thread model: posix
gcc version 4.7.0 (GCC)

$ mpirun -np 5 python test/runtests.py --verbose --no-threads --include 
cco_obj_inter
[4...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[4...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[4...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
[2...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[2...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[2...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)

Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Bennet Fauber

Thanks, Ralph,

On Wed, 23 May 2012, Ralph Castain wrote:

I don't honestly think many of us have any knowledge of mpi4py. Does 
this test work with other MPIs?


The mpi4py developers have said they've never seen this using mpich2.  I 
have not been able to test that myself.


MPI_Allgather seems to be passing our tests, so I suspect it is 
something in the binding. If you can provide the actual test, I'm 
willing to take a look at it.


The actual test is included in the install bundle for mpi4py, along with 
the C source code used to create the bindings.


http://code.google.com/p/mpi4py/downloads/list

The install is straightforward and simple.  Unpack the tarball, make sure 
that mpicc is in your path


$ cd mpi4py-1.3
$ python setup.py build
$ python setup.py install --prefix=/your/install
$ export PYTHONPATH=/your/install/lib/pythonN.M/site-packages
$ mpirun -np 5 python test/runtests.py \
 --verbose --no-threads --include cco_obj_inter

where N.M are the major.minor numbers of your python distribution.

What I find most puzzling is that maybe 1 out of 10 times it will run to 
completion with -np 5, whereas it always runs with every other number of 
processes I've tested.


-- bennet


On May 23, 2012, at 2:52 PM, Bennet Fauber wrote:


I've installed the latest mpi4py-1.3 on several systems, and there is a 
repeated bug when running

$ mpirun -np 5 python test/runtests.py

where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
openmpi-1.3.

It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 7, 
8, 9, 10, 11, and 12.

There is a thread on this at

http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973

where others report being able to replicate, too.

The compiler used first was gcc-4.6.2, with openmpi-1.4.4.

These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and 
versions of openmpi 1.3.0 and 1.4.4.

Lisandro who is the primary developer of mpi4py is able to replicate on Fedora 
16.

Someone else is able to reproduce with

[ quoting from the groups.google.com page... ]
===
It also happens with the current hg version of mpi4py and
$ rpm -qa openmpi gcc python
python-2.7.3-6.fc17.x86_64
gcc-4.7.0-5.fc17.x86_64
openmpi-1.5.4-5.fc17.1.x86_64
===

So, I believe this is a bug to be reported.  Per the advice at

http://www.open-mpi.org/community/help/bugs.php

If you feel that you do have a definite bug to report but are
unsure which list to post to, then post to the user's list.

Please let me know if there is additional information that you need to 
replicate.

Some output is included below the signature in case it is useful.

-- bennet
--
East Hall Technical Services
Mathematics and Psychology Research Computing
University of Michigan
(734) 763-1182

On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7

$ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads 
--include cco_obj_inter
[0...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[1...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[2...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[3...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
[4...@sirocco.math.lsa.umich.edu] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
[4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ...
[ hangs ]

RHEL5
===
$ python
Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-
unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure 

Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Jeff Squyres
Can you provide us with a C version of the test?

On May 23, 2012, at 4:52 PM, Bennet Fauber wrote:

> I've installed the latest mpi4py-1.3 on several systems, and there is a 
> repeated bug when running
> 
>   $ mpirun -np 5 python test/runtests.py
> 
> where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
> openmpi-1.3.
> 
> It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 
> 7, 8, 9, 10, 11, and 12.
> 
> There is a thread on this at
> 
> http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973
> 
> where others report being able to replicate, too.
> 
> The compiler used first was gcc-4.6.2, with openmpi-1.4.4.
> 
> These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and 
> versions of openmpi 1.3.0 and 1.4.4.
> 
> Lisandro who is the primary developer of mpi4py is able to replicate on 
> Fedora 16.
> 
> Someone else is able to reproduce with
> 
> [ quoting from the groups.google.com page... ]
> ===
> It also happens with the current hg version of mpi4py and
> $ rpm -qa openmpi gcc python
> python-2.7.3-6.fc17.x86_64
> gcc-4.7.0-5.fc17.x86_64
> openmpi-1.5.4-5.fc17.1.x86_64
> ===
> 
> So, I believe this is a bug to be reported.  Per the advice at
> 
>   http://www.open-mpi.org/community/help/bugs.php
> 
>   If you feel that you do have a definite bug to report but are
>   unsure which list to post to, then post to the user's list.
> 
> Please let me know if there is additional information that you need to 
> replicate.
> 
> Some output is included below the signature in case it is useful.
> 
>   -- bennet
> --
> East Hall Technical Services
> Mathematics and Psychology Research Computing
> University of Michigan
> (734) 763-1182
> 
> On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7
> 
> $ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads 
> --include cco_obj_inter
> [0...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [1...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [2...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [3...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [4...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ...
> [ hangs ]
> 
> RHEL5
> ===
> $ python
> Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
> [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
> 
> $ gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-
> unknown-linux-gnu/4.7.0/lto-wrapper
> Target: x86_64-unknown-linux-gnu
> Configured with: ../gcc-4.7.0/configure --prefix=/home/software/rhel6/
> gcc/4.7.0 --with-mpfr=/home/software/rhel6/gcc/mpfr-3.1.0/ --with-mpc=/
> home/software/rhel6/gcc/mpc-0.9/ --with-gmp=/home/software/rhel6/gcc/
> gmp-5.0.5/ --disable-multilib
> Thread model: posix
> gcc version 4.7.0 (GCC)
> 
> $ mpirun -np 5 python test/runtests.py --verbose --no-threads --include 
> cco_obj_inter
> [4...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [4...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
> [4...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
> [2...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [2...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
> [2...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
> [1...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [1...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
> [1...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
> [0...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [0...@host-rh6.engin.umich.edu] 

Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread Jeff Squyres
On May 23, 2012, at 9:53 AM, marco atzeri wrote:

> experience says that autoreconf is a good approach on cygwin,
> it is almost standard on our package build procedure.

I'm still curious: why?  (I'm *assuming* that you're building from an official 
Open MPI tarball -- is that incorrect?)

I ask because we've already run autoreconf, meaning that official Open MPI 
tarballs are fully bootstrapped and do not need to have autogen (i.e., 
ultimately autoreconf) re-run on them.

Specifically: I'm unaware of a reason why you should need to re-run autogen 
(autoreconf) on an otherwise-unaltered Open MPI that was freshly extracted from 
a tarball.  Does something happen differently if you *don't* re-run autogen 
(autoreconf)?

Re-running autogen shouldn't be causing you any problems, of course -- this is 
just my curiosity asserting itself...

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Ralph Castain
I don't honestly think many of us have any knowledge of mpi4py. Does this test 
work with other MPIs?

MPI_Allgather seems to be passing our tests, so I suspect it is something in 
the binding. If you can provide the actual test, I'm willing to take a look at 
it.


On May 23, 2012, at 2:52 PM, Bennet Fauber wrote:

> I've installed the latest mpi4py-1.3 on several systems, and there is a 
> repeated bug when running
> 
>   $ mpirun -np 5 python test/runtests.py
> 
> where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
> openmpi-1.3.
> 
> It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 
> 7, 8, 9, 10, 11, and 12.
> 
> There is a thread on this at
> 
> http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973
> 
> where others report being able to replicate, too.
> 
> The compiler used first was gcc-4.6.2, with openmpi-1.4.4.
> 
> These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and 
> versions of openmpi 1.3.0 and 1.4.4.
> 
> Lisandro who is the primary developer of mpi4py is able to replicate on 
> Fedora 16.
> 
> Someone else is able to reproduce with
> 
> [ quoting from the groups.google.com page... ]
> ===
> It also happens with the current hg version of mpi4py and
> $ rpm -qa openmpi gcc python
> python-2.7.3-6.fc17.x86_64
> gcc-4.7.0-5.fc17.x86_64
> openmpi-1.5.4-5.fc17.1.x86_64
> ===
> 
> So, I believe this is a bug to be reported.  Per the advice at
> 
>   http://www.open-mpi.org/community/help/bugs.php
> 
>   If you feel that you do have a definite bug to report but are
>   unsure which list to post to, then post to the user's list.
> 
> Please let me know if there is additional information that you need to 
> replicate.
> 
> Some output is included below the signature in case it is useful.
> 
>   -- bennet
> --
> East Hall Technical Services
> Mathematics and Psychology Research Computing
> University of Michigan
> (734) 763-1182
> 
> On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7
> 
> $ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads 
> --include cco_obj_inter
> [0...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [1...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [2...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [3...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> [4...@sirocco.math.lsa.umich.edu] Python 2.7 
> (/home/bennet/epd7.2.2/bin/python)
> [4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
> [4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
> (build/lib.linux-x86_64-2.7/mpi4py)
> testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
> (test_cco_obj_inter.TestCCOObjInter) ...
> [ hangs ]
> 
> RHEL5
> ===
> $ python
> Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
> [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
> 
> $ gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-
> unknown-linux-gnu/4.7.0/lto-wrapper
> Target: x86_64-unknown-linux-gnu
> Configured with: ../gcc-4.7.0/configure --prefix=/home/software/rhel6/
> gcc/4.7.0 --with-mpfr=/home/software/rhel6/gcc/mpfr-3.1.0/ --with-mpc=/
> home/software/rhel6/gcc/mpc-0.9/ --with-gmp=/home/software/rhel6/gcc/
> gmp-5.0.5/ --disable-multilib
> Thread model: posix
> gcc version 4.7.0 (GCC)
> 
> $ mpirun -np 5 python test/runtests.py --verbose --no-threads --include 
> cco_obj_inter
> [4...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [4...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
> [4...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
> [2...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> [2...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
> [2...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
> [1...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
> 

[OMPI users] possible bug exercised by mpi4py

2012-05-23 Thread Bennet Fauber
I've installed the latest mpi4py-1.3 on several systems, and there is a 
repeated bug when running


$ mpirun -np 5 python test/runtests.py

where it throws an error on mpigather with openmpi-1.4.4 and hangs with 
openmpi-1.3.


It runs to completion and passes all tests when run with -np of 2, 3, 4, 
6, 7, 8, 9, 10, 11, and 12.


There is a thread on this at

http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973

where others report being able to replicate, too.

The compiler used first was gcc-4.6.2, with openmpi-1.4.4.

These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers 
and versions of openmpi 1.3.0 and 1.4.4.


Lisandro who is the primary developer of mpi4py is able to replicate on 
Fedora 16.


Someone else is able to reproduce with

[ quoting from the groups.google.com page... ]
===
It also happens with the current hg version of mpi4py and
$ rpm -qa openmpi gcc python
python-2.7.3-6.fc17.x86_64
gcc-4.7.0-5.fc17.x86_64
openmpi-1.5.4-5.fc17.1.x86_64
===

So, I believe this is a bug to be reported.  Per the advice at

http://www.open-mpi.org/community/help/bugs.php

If you feel that you do have a definite bug to report but are
unsure which list to post to, then post to the user's list.

Please let me know if there is additional information that you need to 
replicate.


Some output is included below the signature in case it is useful.

-- bennet
--
East Hall Technical Services
Mathematics and Psychology Research Computing
University of Michigan
(734) 763-1182

On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7

$ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose 
--no-threads --include cco_obj_inter
[0...@sirocco.math.lsa.umich.edu] Python 2.7 
(/home/bennet/epd7.2.2/bin/python)

[0...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[0...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
(build/lib.linux-x86_64-2.7/mpi4py)
[1...@sirocco.math.lsa.umich.edu] Python 2.7 
(/home/bennet/epd7.2.2/bin/python)

[1...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[1...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
(build/lib.linux-x86_64-2.7/mpi4py)
[2...@sirocco.math.lsa.umich.edu] Python 2.7 
(/home/bennet/epd7.2.2/bin/python)

[2...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[2...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
(build/lib.linux-x86_64-2.7/mpi4py)
[3...@sirocco.math.lsa.umich.edu] Python 2.7 
(/home/bennet/epd7.2.2/bin/python)

[3...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[3...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
(build/lib.linux-x86_64-2.7/mpi4py)
[4...@sirocco.math.lsa.umich.edu] Python 2.7 
(/home/bennet/epd7.2.2/bin/python)

[4...@sirocco.math.lsa.umich.edu] MPI 2.0 (Open MPI 1.3.0)
[4...@sirocco.math.lsa.umich.edu] mpi4py 1.3 
(build/lib.linux-x86_64-2.7/mpi4py)
testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ... testAllgather 
(test_cco_obj_inter.TestCCOObjInter) ...

[ hangs ]

RHEL5
===
$ python
Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-
unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --prefix=/home/software/rhel6/
gcc/4.7.0 --with-mpfr=/home/software/rhel6/gcc/mpfr-3.1.0/ --with-mpc=/
home/software/rhel6/gcc/mpc-0.9/ --with-gmp=/home/software/rhel6/gcc/
gmp-5.0.5/ --disable-multilib
Thread model: posix
gcc version 4.7.0 (GCC)

$ mpirun -np 5 python test/runtests.py --verbose --no-threads --include 
cco_obj_inter
[4...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[4...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[4...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
[2...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[2...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[2...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
[1...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[1...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[1...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
[0...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[0...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[0...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
[3...@host-rh6.engin.umich.edu] Python 2.6 (/usr/bin/python)
[3...@host-rh6.engin.umich.edu] MPI 2.1 (Open MPI 1.6.0)
[3...@host-rh6.engin.umich.edu] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
testAllgather 

Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread marco atzeri

On 5/23/2012 2:08 PM, Ralph Castain wrote:

Add "libompitrace" to your enable-contrib-no-build list. There is likely a 
missing include in there, but you don't need that lib to run. We'll take a look at it.



thanks.
The build was fine, and almost all of the checks passed.

from "grep -i pass openmpi-1.6-1-check.log"

PASS: predefined_gap_test.exe
File opened with dladvise_local, all passed
PASS: dlopen_test.exe
All 2 tests passed
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_barrier.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_barrier_noinline.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_spinlock.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_spinlock_noinline.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_math.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_math_noinline.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_cmpset.exe
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_cmpset_noinline.exe
All 8 tests passed
All 0 tests passed
All 0 tests passed
/pub/devel/openmpi/openmpi-1.6-1/src/openmpi-1.6/test/datatype/opal_datatype_test.c:533:5: 
warning: passing argument 1 of ‘test_create_blacs_type1’ discards 
qualifiers from pointer target type
/pub/devel/openmpi/openmpi-1.6-1/src/openmpi-1.6/test/datatype/opal_datatype_test.c:534:5: 
warning: passing argument 1 of ‘test_create_blacs_type2’ discards 
qualifiers from pointer target type
/pub/devel/openmpi/openmpi-1.6-1/src/openmpi-1.6/test/datatype/opal_ddt_lib.c:155:5: 
warning: passing argument 4 of ‘opal_datatype_create_struct’ from 
incompatible pointer type

decode [PASSED]
PASS: opal_datatype_test.exe
PASS: checksum.exe
PASS: position.exe
decode [PASSED]
PASS: ddt_test.exe

decode [NOT PASSED]   <<< ?

PASS: ddt_raw.exe
All 5 tests passed
SUPPORT: OMPI Test Passed: opal_path_nfs(): (0 tests)
PASS: opal_path_nfs.exe
1 test passed



The detailed log is no clearer; I cannot tell whether the test passed or not.



#
 * TEST INVERSED VECTOR
 #

raw extraction in 0 microsec


#
 * TEST STRANGE DATATYPE
 #


Strange datatype BEFORE COMMIT

Strange datatype AFTER COMMIT
raw extraction in 0 microsec


#
 * TEST UPPER TRIANGULAR MATRIX (size 100)
 #

raw extraction in 0 microsec
Example 3.26 type1 correct
Example 3.26 type1 correct
Example 3.26 type2 correct
type3 correct
hindexed ok
indexed ok
hvector ok
vector ok


#
 * TEST UPPER MATRIX
 #

test upper matrix
complete raw in 0 microsec
decode [NOT PASSED]


#
 * TEST MATRIX BORDERS
 #



#
 * TEST CONTIGUOUS
 #

test contiguous (alignement)


#
 * TEST STRUCT
 #

test struct
>><<
>><<
>><<
>><<
 Contiguous data-type (MPI_DOUBLE)
raw extraction in 0 microsec
>><<
>><<
Contiguous multiple data-type (4500*1)
raw extraction in 0 microsec
Contiguous multiple data-type (450*10)
raw extraction in 0 microsec
Contiguous multiple data-type (45*100)
raw extraction in 0 microsec
Contiguous multiple data-type (100*45)
raw extraction in 0 microsec
Contiguous multiple data-type (10*450)
raw extraction in 0 microsec
Contiguous multiple data-type (1*4500)
raw extraction in 0 microsec
>><<
>><<
Vector data-type (450 times 10 double stride 11)
raw extraction in 0 microsec
>><<
>><<
Vector data-type (450 times 10 double stride 11)
raw extraction in 0 microsec
>><<
>><<
raw extraction in 1000 microsec
>><<
>><<
raw extraction in 0 microsec
>><<
>><<
raw extraction in 2000 microsec
>><<
>><<
raw extraction in 0 microsec

Re: [OMPI users] Shared Memory - Eager VS Rendezvous

2012-05-23 Thread Simone Pellegrini

On 05/23/2012 03:05 PM, Jeff Squyres wrote:

On May 23, 2012, at 6:05 AM, Simone Pellegrini wrote:


If process A sends a message to process B and the eager protocol is used then I 
assume that the message is written into a shared memory area and picked up by 
the receiver when the receive operation is posted.

Open MPI has a few different shared memory protocols.

For short messages, they always follow what you mention above: CICO.

For large messages, we either use a pipelined CICO (as you surmised below) or 
use direct memory mapping if you have the Linux knem kernel module installed.  
More below.


When the rendezvous is utilized however the message still need to end up in the 
shared memory area somehow. I don't think any RDMA-like transfer exists for 
shared memory communications.

Just to clarify: RDMA = Remote Direct Memory Access, and the "remote" usually 
refers to a different physical address space (e.g., a different server).

In Open MPI's case, knem can use a direct memory copy between two processes.


Therefore you need to buffer this message somehow, however I   assume that 
you don't buffer the whole thing but use some type of pipelined protocol so 
that you reduce the size of the buffer you need to keep in the shared memory.

Correct.  For large messages, when using CICO, we copy the first fragment and 
the necessary meta data to the shmem block.  When the receiver ACKs the first 
fragment, we pipeline CICO the rest of the large message through the shmem 
block.  With the sender and receiver (more or less) simultaneously writing and 
reading to the circular shmem block, we probably won't fill it up -- meaning 
that the sender hypothetically won't need to block.

I'm skipping a bunch of details, but that's the general idea.


Is it completely wrong? It would be nice if someone could point me somewhere I 
can find more details about this. In the OpenMPI tuning page there are several 
details regarding the protocol utilized for IB but very little for SM.

Good point.  I'll see if we can get some more info up there.


I think I found the answer to my question on Jeff Squyres' blog:
http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport-part-2/

However now I have a new question, how do I know if my machine uses the 
copyin/copyout mechanism or the direct mapping?

You need the Linux knem module.  See the OMPI README and do a text search for 
"knem".


Thanks a lot for the clarification.
However, I still have a hard time explaining the following phenomenon.

I have a very simple code performing a ping/pong between 2 processes 
which are allocated on the same computing node. Each process is bound to 
a different CPU via affinity settings.
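
A minimal sketch of that kind of ping/pong (the cache priming/invalidation 
and the affinity setup of the real benchmark are omitted; message size and 
iteration count are illustrative):

   #include <mpi.h>
   #include <stdio.h>
   #include <string.h>

   #define NITER   1000
   #define MSGSIZE (64 * 1024)

   int main(int argc, char **argv)
   {
       int rank, i;
       static char buf[MSGSIZE];
       double t0;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       memset(buf, 0, sizeof(buf));

       MPI_Barrier(MPI_COMM_WORLD);
       t0 = MPI_Wtime();
       for (i = 0; i < NITER; ++i) {
           if (rank == 0) {            /* ping */
               MPI_Send(buf, MSGSIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
               MPI_Recv(buf, MSGSIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
           } else if (rank == 1) {     /* pong */
               MPI_Recv(buf, MSGSIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               MPI_Send(buf, MSGSIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
           }
       }
       if (rank == 0)
           printf("avg round trip: %.2f us\n",
                  (MPI_Wtime() - t0) * 1e6 / NITER);
       MPI_Finalize();
       return 0;
   }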


I perform this operation with 3 cache scenarios:
1) The cache is completely invalidated before the send/recv (both at the 
sender and the receiver side).
2) The cache is preloaded before the send/recv operation, and the lines are 
in the "exclusive" state.
3) The cache is preloaded before the send/recv operation, but this time the 
cache lines are in the "modified" state.


Now scenario 2 has a speedup over scenario 1, as expected.  However, 
scenario 3 is much slower than 1.  I observed this for both knem and xpmem.
I assume something is forcing the modified cache lines to be written back 
to memory before the copy is performed, probably because the segment is 
assigned to a volatile pointer, so whatever is in cache has to be flushed 
to main memory first.


Instead, when the OpenMPI CICO protocol is used, scenarios 2 and 3 have the 
exact same speedup over 1.  Therefore I assume that in this case nothing 
forces the write-back of dirty cache lines.  I have been puzzling over this 
since yesterday, and it is quite difficult to understand without knowing 
all the internal details.


Is this expected behaviour for you as well, or do you find it surprising? :)

cheers, Simone







Re: [OMPI users] Shared Memory - Eager VS Rendezvous

2012-05-23 Thread Gutierrez, Samuel K

On May 23, 2012, at 7:05 AM, Jeff Squyres wrote:

> On May 23, 2012, at 6:05 AM, Simone Pellegrini wrote:
> 
>>> If process A sends a message to process B and the eager protocol is used 
>>> then I assume that the message is written into a shared memory area and 
>>> picked up by the receiver when the receive operation is posted. 
> 
> Open MPI has a few different shared memory protocols.
> 
> For short messages, they always follow what you mention above: CICO.
> 
> For large messages, we either use a pipelined CICO (as you surmised below) or 
> use direct memory mapping if you have the Linux knem kernel module installed. 
>  More below.
> 
>>> When the rendezvous is utilized however the message still need to end up in 
>>> the shared memory area somehow. I don't think any RDMA-like transfer exists 
>>> for shared memory communications. 
> 
> Just to clarify: RDMA = Remote Direct Memory Access, and the "remote" usually 
> refers to a different physical address space (e.g., a different server).  
> 
> In Open MPI's case, knem can use a direct memory copy between two processes.

In addition, the vader BTL (XPMEM BTL) also provides similar functionality - 
provided the XPMEM kernel module and user library are available on the system.

Based on my limited experience with the two, I've noticed that knem is 
well-suited for Intel architectures, while XPMEM is best for AMD architectures.

Samuel K. Gutierrez
Los Alamos National Laboratory

> 
>>> Therefore you need to buffer this message somehow, however I   assume 
>>> that you don't buffer the whole thing but use some type of pipelined 
>>> protocol so that you reduce the size of the buffer you need to keep in the 
>>> shared memory. 
> 
> Correct.  For large messages, when using CICO, we copy the first fragment and 
> the necessary meta data to the shmem block.  When the receiver ACKs the first 
> fragment, we pipeline CICO the rest of the large message through the shmem 
> block.  With the sender and receiver (more or less) simultaneously writing 
> and reading to the circular shmem block, we probably won't fill it up -- 
> meaning that the sender hypothetically won't need to block.
> 
> I'm skipping a bunch of details, but that's the general idea.
> 
>>> Is it completely wrong? It would be nice if someone could point me 
>>> somewhere I can find more details about this. In the OpenMPI tuning page 
>>> there are several details regarding the protocol utilized for IB but very 
>>> little for SM. 
> 
> Good point.  I'll see if we can get some more info up there.
> 
>> I think I found the answer to my question on Jeff Squyres  blog:
>> http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport-part-2/
>> 
>> However now I have a new question, how do I know if my machine uses the 
>> copyin/copyout mechanism or the direct mapping? 
> 
> You need the Linux knem module.  See the OMPI README and do a text search for 
> "knem".
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread marco atzeri

On 5/23/2012 3:19 PM, Jeff Squyres (jsquyres) wrote:

Just curious - are you running autogen for any particular reason?

I don't know how much Cygwin testing we've done.

Sent from my phone. No type good.



experience says that autoreconf is a good approach on cygwin,
it is almost standard on our package build procedure.

As autogen is performing the same action I see no reason to bypass it
and run a standard autoreconf.

Regards
Marco




Re: [OMPI users] [EXTERNAL] Re: mpicc link shouldn't add -ldl and -lhwloc

2012-05-23 Thread Barrett, Brian W
On 5/22/12 10:36 PM, "Orion Poplawski"  wrote:

>On 05/22/2012 10:34 PM, Orion Poplawski wrote:
>> On 05/21/2012 06:15 PM, Jeff Squyres wrote:
>>> On May 15, 2012, at 10:37 AM, Orion Poplawski wrote:
>>>
 $ mpicc -showme:link
 -pthread -m64 -L/usr/lib64/openmpi/lib -lmpi -ldl -lhwloc

 -ldl and -lhwloc should not be listed. The user should only link
 against libraries that they are using directly, namely -lmpi, and
 they should explicitly add -ldl and -lhwloc if their code directly
 uses those libraries. There does not appear to be any references to
 libdl or libhwloc symbols in the openmpi headers which is the other
 place it could come in.
>>>
>>> I just read this a few times, and I admit that I'm a little confused.
>>>
>>> libmpi does use symbols from libdl; we use it to load OMPI's plugins.
>>> So I'm not sure why you're saying we shouldn't -ldl in the wrapper
>>> compiler...?
>>>
>>> libhwloc might be a little questionable here -- I'll have to check to
>>> see whether 1.6 uses hwloc only in a plugin or whether it's used in
>>> the base library (I don't remember offhand).
>>>
>>
>> But libmpi is already linked against libdl and libhwloc. The wrapper
>> libraries are added when linking user code. But unless a user's code
>> directly uses libdl or libhwloc they don't need to link to those
>>libraries.
>>
>
>I should add the caveat that they are need when linking statically, but
>not when using shared libraries.

And therein lies the problem.  We have a number of users who build Open
MPI statically and even some who build both static and shared libraries in
the same build.  We've never been able to figure out a reasonable way to
guess if we need to add -lhwloc or -ldl, so we add them.  It's better to
list them and have some redundant dependencies (since you have to have the
library anyways) than to not list them and have odd link errors.

We're open to better suggestions, but keep in mind that they have to be
portable and not require the user to change their build systems (ie, we
can't just depend on libtool to do everything for us).

Brian

-- 
  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories








Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread Jeff Squyres (jsquyres)
Just curious - are you running autogen for any particular reason?

I don't know how much Cygwin testing we've done. 

Sent from my phone. No type good. 

On May 23, 2012, at 6:09 AM, "Ralph Castain"  wrote:

> Add "libompitrace" to your enable-contrib-no-build list. There is likely a 
> missing include in there, but you don't need that lib to run. We'll take a 
> look at it.
> 
> 
> On May 23, 2012, at 12:58 AM, marco atzeri wrote:
> 
>> I am trying to build openmpi-1.6 for cygwin with dynamic libs
>> 
>> -
>> ./autogen.sh
>> cd build_dir
>> source_dir/configure \
>>  LDFLAGS="-Wl,--export-all-symbols -no-undefined"  \
>>  --disable-mca-dso \
>>  --without-udapl \
>>  --enable-cxx-exceptions \
>>  --enable-mpi-threads \
>>  --enable-progress-threads \
>>  --with-threads=posix \
>>  --without-cs-fs \
>>  --enable-heterogeneous \
>>  --with-mpi-param_check=always \
>>  --enable-contrib-no-build=vt \
>> --enable-mca-nobuild=memory_mallopt,paffinity,installdirs-windows,timer-windows,shmem-sysv
>> make
>> -
>> 
>> the build stop here :
>> CCLD   libompitrace.la
>> Creating library file: .libs/libompitrace.dll.a.libs/abort.o: In function 
>> `MPI_Abort':
>> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32:
>>  undefined reference to `_ompi_mpi_comm_world'
>> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32:
>>  undefined reference to `_PMPI_Comm_rank'
>> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:33:
>>  undefined reference to `_PMPI_Comm_get_name'
>> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:38:
>>  undefined reference to `_PMPI_Abort'
>> 
>> I do not find "mpi_mpi_comm_world" defined in any of the
>> already built objects, except
>> 
>> ./ompi/communicator/.libs/comm_init.o
>> 0200 C _ompi_mpi_comm_world
>> 
>> and on libmpi.dll.a
>> 
>> d002278.o:
>>  i .idata$4
>>  i .idata$5
>>  i .idata$6
>>  i .idata$7
>>  t .text
>>U __head_cygmpi_1_dll
>>  I __imp__ompi_mpi_comm_world
>>  I __nm__ompi_mpi_comm_world
>> 
>> 
>> Hint ?
>> 
>> Marco
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Shared Memory - Eager VS Rendezvous

2012-05-23 Thread Jeff Squyres
On May 23, 2012, at 6:05 AM, Simone Pellegrini wrote:

>> If process A sends a message to process B and the eager protocol is used 
>> then I assume that the message is written into a shared memory area and 
>> picked up by the receiver when the receive operation is posted. 

Open MPI has a few different shared memory protocols.

For short messages, they always follow what you mention above: CICO.

For large messages, we either use a pipelined CICO (as you surmised below) or 
use direct memory mapping if you have the Linux knem kernel module installed.  
More below.

>> When the rendezvous is utilized however the message still need to end up in 
>> the shared memory area somehow. I don't think any RDMA-like transfer exists 
>> for shared memory communications. 

Just to clarify: RDMA = Remote Direct Memory Access, and the "remote" usually 
refers to a different physical address space (e.g., a different server).  

In Open MPI's case, knem can use a direct memory copy between two processes.  

>> Therefore you need to buffer this message somehow, however I   assume 
>> that you don't buffer the whole thing but use some type of pipelined 
>> protocol so that you reduce the size of the buffer you need to keep in the 
>> shared memory. 

Correct.  For large messages, when using CICO, we copy the first fragment and 
the necessary meta data to the shmem block.  When the receiver ACKs the first 
fragment, we pipeline CICO the rest of the large message through the shmem 
block.  With the sender and receiver (more or less) simultaneously writing and 
reading to the circular shmem block, we probably won't fill it up -- meaning 
that the sender hypothetically won't need to block.

I'm skipping a bunch of details, but that's the general idea.
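
As a rough illustration of how that shows up at the application level, the 
sketch below assumes the small message falls under the eager limit and the 
4 MB one does not (the real threshold is an MCA parameter and varies by 
transport and release):

   #include <mpi.h>
   #include <stdio.h>
   #include <unistd.h>

   #define BIG (4 * 1024 * 1024)

   int main(int argc, char **argv)
   {
       int rank;
       static char buf[BIG];
       double t0;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       if (rank == 0) {
           /* Small message: typically buffered eagerly, so the send
            * returns even though the receive is not yet posted. */
           t0 = MPI_Wtime();
           MPI_Send(buf, 64, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
           printf("small send returned after %.2f s\n", MPI_Wtime() - t0);

           /* Large message: rendezvous, so the send completes only once
            * the receiver has posted its (delayed) receive. */
           t0 = MPI_Wtime();
           MPI_Send(buf, BIG, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
           printf("large send returned after %.2f s\n", MPI_Wtime() - t0);
       } else if (rank == 1) {
           sleep(2);   /* delay posting the receives */
           MPI_Recv(buf, 64,  MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
           MPI_Recv(buf, BIG, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
       }
       MPI_Finalize();
       return 0;
   }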

>> Is it completely wrong? It would be nice if someone could point me somewhere 
>> I can find more details about this. In the OpenMPI tuning page there are 
>> several details regarding the protocol utilized for IB but very little for 
>> SM. 

Good point.  I'll see if we can get some more info up there.

> I think I found the answer to my question on Jeff Squyres  blog:
> http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport-part-2/
> 
> However now I have a new question, how do I know if my machine uses the 
> copyin/copyout mechanism or the direct mapping? 

You need the Linux knem module.  See the OMPI README and do a text search for 
"knem".

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] openmpi-1.6 + Intel compilers

2012-05-23 Thread Christophe Peyret
Hello,

I have built openmpi-1.6 for MacOSX Lion 10.7.4 with intel compilers v12.1 
(icc,icpc,ifort)

When I try to use the mpif90 wrapper, I get the error message: ifort: 
error #10104: unable to open '-commons'

When I compare "mpif90 -showme" from version 1.5.4 and version 1.6, I find:


/opt/openmpi-1.5.4/bin/mpif90 -showme
/opt/intel/composerxe/bin/ifort -I/opt/openmpi-1.5.4/include 
-I/opt/openmpi-1.5.4/lib -L/opt/openmpi-1.5.4-v12/lib -lmpi_f90 -lmpi_f77 -lmpi

/opt/openmpi-1.6/bin/mpif90 -showme
/opt/intel/composerxe/bin/ifort -I/opt/openmpi-1.6/include 
-Wl,-commons,use_dylibs -I/opt/openmpi-1.6/lib -L/opt/openmpi-1.6/lib -lmpi_f90 
-lmpi_f77 -lmpi -lm

What do the options -Wl,-commons,use_dylibs do? Can I remove them, and how?

--
Christophe Peyret
ONERA - DSNA - PS3A

http://www.onera.fr/dsna/couplage-methodes-aeroacoustiques
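
One way to narrow this down is to expand the wrapper by hand and retry the link without the suspect flag.  A minimal sketch, assuming the 1.6 paths shown by -showme above; prog.f90 is just a placeholder source file:

$ mpif90 -showme:link    # print only the link-time part of the wrapper
$ ifort prog.f90 -I/opt/openmpi-1.6/include -I/opt/openmpi-1.6/lib \
        -L/opt/openmpi-1.6/lib -lmpi_f90 -lmpi_f77 -lmpi -lm

If that links cleanly, the new -Wl,-commons,use_dylibs flag is the culprit.  The wrapper's default flags normally live in share/openmpi/mpif90-wrapper-data.txt under the install prefix, which is where a persistent change would go -- verify the file name on your install before editing it.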







Re: [OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread Ralph Castain
Add "libompitrace" to your enable-contrib-no-build list. There is likely a 
missing include in there, but you don't need that lib to run. We'll take a look 
at it.
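
Concretely, that means changing one option in the configure invocation quoted below -- a sketch, with everything else left exactly as reported:

    --enable-contrib-no-build=vt \
becomes
    --enable-contrib-no-build=vt,libompitrace \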


On May 23, 2012, at 12:58 AM, marco atzeri wrote:

> I am trying to build openmpi-1.6 for cygwin with dynamic libs
> 
> -
> ./autogen.sh
> cd build_dir
> source_dir/configure \
>   LDFLAGS="-Wl,--export-all-symbols -no-undefined"  \
>   --disable-mca-dso \
>   --without-udapl \
>   --enable-cxx-exceptions \
>   --enable-mpi-threads \
>   --enable-progress-threads \
>   --with-threads=posix \
>   --without-cs-fs \
>   --enable-heterogeneous \
>   --with-mpi-param_check=always \
>   --enable-contrib-no-build=vt \
> --enable-mca-nobuild=memory_mallopt,paffinity,installdirs-windows,timer-windows,shmem-sysv
> make
> -
> 
> the build stops here:
>  CCLD   libompitrace.la
> Creating library file: .libs/libompitrace.dll.a
> .libs/abort.o: In function `MPI_Abort':
> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32:
>  undefined reference to `_ompi_mpi_comm_world'
> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32:
>  undefined reference to `_PMPI_Comm_rank'
> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:33:
>  undefined reference to `_PMPI_Comm_get_name'
> /pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:38:
>  undefined reference to `_PMPI_Abort'
> 
> I do not find "ompi_mpi_comm_world" defined in any of the
> already built objects, except
> 
> ./ompi/communicator/.libs/comm_init.o
> 0200 C _ompi_mpi_comm_world
> 
> and on libmpi.dll.a
> 
> d002278.o:
>  i .idata$4
>  i .idata$5
>  i .idata$6
>  i .idata$7
>  t .text
> U __head_cygmpi_1_dll
>  I __imp__ompi_mpi_comm_world
>  I __nm__ompi_mpi_comm_world
> 
> 
> Hint ?
> 
> Marco
> 




Re: [OMPI users] Shared Memory - Eager VS Rendezvous

2012-05-23 Thread Simone Pellegrini

I think I found the answer to my question on Jeff Squyres' blog:
http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport-part-2/

However, now I have a new question: how do I know whether my machine uses the 
copy-in/copy-out mechanism or direct mapping?


Assuming that I am running OpenMPI 1.5.x on top of a Linux 2.6.32 kernel?


cheers, Simone

On 05/22/2012 05:29 PM, Simone Pellegrini wrote:

Dear all,
I would like to confirm my assumptions about how OpenMPI implements the 
rendezvous protocol for shared memory.


If process A sends a message to process B and the eager protocol is 
used then I assume that the message is written into a shared memory 
area and picked up by the receiver when the receive operation is posted.


When the rendezvous is utilized, however, the message still needs to end 
up in the shared memory area somehow. I don't think any RDMA-like 
transfer exists for shared memory communications. Therefore you need 
to buffer this message somehow; however, I assume that you don't buffer 
the whole thing but use some type of pipelined protocol so that you 
reduce the size of the buffer you need to keep in the shared memory.


Is it completely wrong? It would be nice if someone could point me 
somewhere I can find more details about this. In the OpenMPI tuning 
page there are several details regarding the protocol utilized for IB 
but very little for SM.


thanks in advance,
Simone P.






Re: [OMPI users] MPI_COMPLEX16

2012-05-23 Thread David Singleton

On 05/23/2012 07:30 PM, Patrick Le Dot wrote:

David Singleton  writes:




I should have checked earlier - same for MPI_COMPLEX and MPI_COMPLEX8.

David

On 04/27/2012 08:43 AM, David Singleton wrote:


Apologies if this has already been covered somewhere. One of our users
has noticed that MPI_COMPLEX16 is flagged as an invalid type in 1.5.4
but not in 1.4.3 while MPI_DOUBLE_COMPLEX is accepted for both. This is
with either gfortran or intel-fc.
...


Hi,

I hit the same problem: MPI_COMPLEX8 and MPI_COMPLEX16 were available
in v1.4 but were removed in v1.5 and I don't understand why, except that
these types are not in the standard...

I have a patch to reintroduce them, so let me know what you think.



I would very much appreciate seeing that patch.

Thanks
David



Re: [OMPI users] MPI_COMPLEX16

2012-05-23 Thread Patrick Le Dot
David Singleton  anu.edu.au> writes:

> 
> 
> I should have checked earlier - same for MPI_COMPLEX and MPI_COMPLEX8.
> 
> David
> 
> On 04/27/2012 08:43 AM, David Singleton wrote:
> >
> > Apologies if this has already been covered somewhere. One of our users
> > has noticed that MPI_COMPLEX16 is flagged as an invalid type in 1.5.4
> > but not in 1.4.3 while MPI_DOUBLE_COMPLEX is accepted for both. This is
> > with either gfortran or intel-fc.
> > ...

Hi,

I hit the same problem: MPI_COMPLEX8 and MPI_COMPLEX16 were available
in v1.4 but were removed in v1.5 and I don't understand why, except that
these types are not in the standard...

I have a patch to reintroduce them, so let me know what you think.

Thanks,
Patrick




[OMPI users] openmpi-1.6 undefined reference

2012-05-23 Thread marco atzeri

I am trying to build openmpi-1.6 for cygwin with dynamic libs

-
./autogen.sh
cd build_dir
source_dir/configure \
   LDFLAGS="-Wl,--export-all-symbols -no-undefined"  \
   --disable-mca-dso \
   --without-udapl \
   --enable-cxx-exceptions \
   --enable-mpi-threads \
   --enable-progress-threads \
   --with-threads=posix \
   --without-cs-fs \
   --enable-heterogeneous \
   --with-mpi-param_check=always \
   --enable-contrib-no-build=vt \
   --enable-mca-nobuild=memory_mallopt,paffinity,installdirs-windows,timer-windows,shmem-sysv
make
-

the build stops here:
  CCLD   libompitrace.la
Creating library file: .libs/libompitrace.dll.a
.libs/abort.o: In function `MPI_Abort':
/pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32: 
undefined reference to `_ompi_mpi_comm_world'
/pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:32: 
undefined reference to `_PMPI_Comm_rank'
/pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:33: 
undefined reference to `_PMPI_Comm_get_name'
/pub/devel/openmpi/openmpi-1.6-2/src/openmpi-1.6/ompi/contrib/libompitrace/abort.c:38: 
undefined reference to `_PMPI_Abort'


I do not find "ompi_mpi_comm_world" defined in any of the
already built objects, except

./ompi/communicator/.libs/comm_init.o
0200 C _ompi_mpi_comm_world

and on libmpi.dll.a

d002278.o:
 i .idata$4
 i .idata$5
 i .idata$6
 i .idata$7
 t .text
 U __head_cygmpi_1_dll
 I __imp__ompi_mpi_comm_world
 I __nm__ompi_mpi_comm_world


Hint ?

Marco
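
For anyone reproducing this, the symbol checks above boil down to something like the following; the paths are guesses based on the build log and may need adjusting to wherever the objects actually land in your build tree:

$ nm ompi/communicator/.libs/comm_init.o | grep ompi_mpi_comm_world
$ nm ompi/.libs/libmpi.dll.a | grep -i -e PMPI_Abort -e ompi_mpi_comm_world

If the PMPI_ profiling symbols never appear in the import library, libompitrace has nothing to resolve against, which is consistent with simply excluding it from the build as suggested elsewhere in this thread.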



Re: [OMPI users] mpicc link shouldn't add -ldl and -lhwloc

2012-05-23 Thread Orion Poplawski

On 05/22/2012 10:34 PM, Orion Poplawski wrote:

On 05/21/2012 06:15 PM, Jeff Squyres wrote:

On May 15, 2012, at 10:37 AM, Orion Poplawski wrote:


$ mpicc -showme:link
-pthread -m64 -L/usr/lib64/openmpi/lib -lmpi -ldl -lhwloc

-ldl and -lhwloc should not be listed. The user should only link
against libraries that they are using directly, namely -lmpi, and
they should explicitly add -ldl and -lhwloc if their code directly
uses those libraries. There do not appear to be any references to
libdl or libhwloc symbols in the openmpi headers, which is the other
place it could come in.


I just read this a few times, and I admit that I'm a little confused.

libmpi does use symbols from libdl; we use it to load OMPI's plugins.
So I'm not sure why you're saying we shouldn't -ldl in the wrapper
compiler...?

libhwloc might be a little questionable here -- I'll have to check to
see whether 1.6 uses hwloc only in a plugin or whether it's used in
the base library (I don't remember offhand).



But libmpi is already linked against libdl and libhwloc. The wrapper
libraries are added when linking user code. But unless a user's code
directly uses libdl or libhwloc they don't need to link to those libraries.



I should add the caveat that they are need when linking statically, but 
not when using shared libraries.


--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA/CoRA Division    FAX: 303-415-9702
3380 Mitchell Lane  or...@cora.nwra.com
Boulder, CO 80301  http://www.cora.nwra.com
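
To see the distinction concretely: in the shared case libmpi.so records its own dependencies, so nothing is lost by leaving -ldl and -lhwloc off the user's link line.  The path below is the one from the -showme output earlier in the thread; the exact NEEDED entries depend on how the package was built:

$ readelf -d /usr/lib64/openmpi/lib/libmpi.so | grep NEEDED
$ ldd /usr/lib64/openmpi/lib/libmpi.so | grep -e libdl -e libhwloc

A fully static link carries no such metadata, which is why the extra -l flags only matter there.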


Re: [OMPI users] mpicc link shouldn't add -ldl and -lhwloc

2012-05-23 Thread Orion Poplawski

On 05/21/2012 06:15 PM, Jeff Squyres wrote:

On May 15, 2012, at 10:37 AM, Orion Poplawski wrote:


$ mpicc -showme:link
-pthread -m64 -L/usr/lib64/openmpi/lib -lmpi -ldl -lhwloc

-ldl and -lhwloc should not be listed.  The user should only link against 
libraries that they are using directly, namely -lmpi, and they should 
explicitly add -ldl and -lhwloc if their code directly uses those libraries. 
There do not appear to be any references to libdl or libhwloc symbols in the 
openmpi headers, which is the other place it could come in.


I just read this a few times, and I admit that I'm a little confused.

libmpi does use symbols from libdl; we use it to load OMPI's plugins.  So I'm 
not sure why you're saying we shouldn't -ldl in the wrapper compiler...?

libhwloc might be a little questionable here -- I'll have to check to see 
whether 1.6 uses hwloc only in a plugin or whether it's used in the base 
library (I don't remember offhand).



But libmpi is already linked against libdl and libhwloc.  The wrapper 
libraries are added when linking user code.  But unless a user's code 
directly uses libdl or libhwloc they don't need to link to those libraries.


--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA/CoRA DivisionFAX: 303-415-9702
3380 Mitchell Lane  or...@cora.nwra.com
Boulder, CO 80301  http://www.cora.nwra.com