Re: [OMPI devel] orte_ns_base_select failed: returned value -1 instead of ORTE_SUCCESS

2008-01-31 Thread Ralph Castain
Hmmm...well, my bad. There does indeed appear to be something funny going on
with Leopard. No idea what - it used to work fine. I haven't tested it in
awhile though - I've been test building regularly on Leopard, but running on
Tiger (I misspoke earlier).

For now, I'm afraid you can't run on Leopard. Have to figure it out later
when I have more time.

Ralph


> -- Forwarded Message
>> From: Aurélien Bouteiller 
>> Reply-To: Open MPI Developers 
>> Date: Thu, 31 Jan 2008 02:18:27 -0500
>> To: Open MPI Developers 
>> Subject: Re: [OMPI devel] orte_ns_base_select failed: returned value -1
>> instead of ORTE_SUCCESS
>> 
>> I tried using a fresh trunk; the same problem occurred. Here is the
>> complete configure line. I am using libtool 1.5.22 from fink.
>> Otherwise everything is standard OS 10.5.
>> 
>>$ ../trunk/configure --prefix=/Users/bouteill/ompi/build --enable-
>> mpirun-prefix-by-default --disable-io-romio --enable-debug --enable-
>> picky --enable-mem-debug --enable-mem-profile --enable-visibility --
>> disable-dlopen --disable-shared --enable-static
>> 
>> The error message generated by abort contains garbage (line numbers do
>> not match anything in .c files and according to gdb the failure does
>> not occur during ns initialization). This looks like a heap corruption
>> or something as bad.
>> 
>> orterun (argc=4, argv=0xb81c) at ../../../../trunk/orte/tools/
>> orterun/orterun.c:529
>> 529 cb_states = ORTE_PROC_STATE_TERMINATED |
>> ORTE_PROC_STATE_AT_STG1;
>> (gdb) n
>> 530 rc = orte_rmgr.spawn_job(apps, num_apps, , 0, NULL,
>> job_state_callback, cb_states, );
>> (gdb) n
>> 531 while (NULL != (item = opal_list_remove_first()))
>> OBJ_RELEASE(item);
>> (gdb) n
>> ** Stepping over inlined function code. **
>> 532 OBJ_DESTRUCT();
>> (gdb) n
>> 534 if (orterun_globals.do_not_launch) {
>> (gdb) n
>> 539 OPAL_THREAD_LOCK(_globals.lock);
>> (gdb) n
>> 541 if (ORTE_SUCCESS == rc) {
>> (gdb) n
>> 542 while (!orterun_globals.exit) {
>> (gdb) n
>> 543 opal_condition_wait(_globals.cond,
>> (gdb) n
>> [grosse-pomme.local:77335] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
>> file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/
>> oob_base_init.c at line 74
>> 
>> Aurelien
>> 
>> 
>> On Jan 30, 2008 at 17:18, Ralph Castain wrote:
>> 
>>> Are you running on the trunk, or an earlier release?
>>> 
>>> If the trunk, then I suspect you have a stale library hanging
>>> around. I
>>> build and run statically on Leopard regularly.
>>> 
>>> 
>>> On 1/30/08 2:54 PM, "Aurélien Bouteiller" 
>>> wrote:
>>> 
 I get a runtime error in static build on Mac OS 10.5 (automake 1.10,
 autoconf 2.60, gcc-apple-darwin 4.01, libtool 1.5.22).
 
 The error does not occur in dso builds, and everything seems to work
 fine on Linux.
 
 Here is the error log.
 
 ~/ompi$ mpirun -np 2 NetPIPE_3.6/NPmpi
 [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
 file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/
 oob_base_init.c at line 74
 [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
 file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/ns/proxy/
 ns_proxy_component.c at line 222
 [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Error in file /
 SourceCache/openmpi/openmpi-5/openmpi/orte/runtime/orte_init_stage1.c
 at line 230
 --
 It looks like orte_init failed for some reason; your parallel
 process is
 likely to abort.  There are many reasons that a parallel process can
 fail during orte_init; some of which are due to configuration or
 environment problems.  This failure appears to be an internal
 failure;
 here's some additional information (which may only be relevant to an
 Open MPI developer):
 
   orte_ns_base_select failed
   --> Returned value -1 instead of ORTE_SUCCESS
 
 --
 --
 It looks like MPI_INIT failed for some reason; your parallel
 process is
 likely to abort.  There are many reasons that a parallel process can
 fail during MPI_INIT; some of which are due to configuration or
 environment
 problems.  This failure appears to be an internal failure; here's
 some
 additional information (which may only be relevant to an Open MPI
 developer):
 
   ompi_mpi_init: orte_init_stage1 failed
   --> Returned "Error" (-1) instead of "Success" (0)
 --
 *** An error occurred in MPI_Init
 *** before MPI was initialized
 *** MPI_ERRORS_ARE_FATAL (goodbye)

Re: [OMPI devel] vt compiler warnings and errors

2008-01-31 Thread Jeff Squyres
Ah -- I didn't notice this before -- do you have a configure script  
committed to SVN?  If so, this could be the problem.


Whether what Tim sees happens or not will depend on the timestamps  
that SVN puts on configure and all of the files dependent upon  
configure (Makefile.in, Makefile, etc.) in the VT tree. If some of  
them have "bad" timestamps, then the dependencies in the Makefiles can  
end up re-running VT's configure, re-creating configure, and so on.


Is there a way to get OMPI's autogen to also autogen the VT software?   
This would ensure one, consistent set of timestamps (not dependent  
upon what timestamps SVN wrote to your filesystem for these sensitive  
files).




On Jan 31, 2008, at 12:36 PM, Matthias Jurenz wrote:


Hi Tim,

That seems wrong to me, too. I could not reproduce this on my  
computer.
The VT integration comes with its own configure script, which is not  
created by OMPI's autogen.sh.
I don't really have an idea what's going wrong... I suppose the  
problem is that you are using a different version of the Autotools  
than the one I used to bootstrap VT. The VT configure script was  
created with the following versions of the Autotools:

autoconf 2.61, automake 1.10, libtool 1.5.24.

Which versions of the Autotools are you using to bootstrap Open MPI?


Matthias


On Thu, 2008-01-31 at 08:09 -0500, Tim Prins wrote:


Hi Matthias,

I just noticed something else that seems odd. On a fresh checkout,  
I did an autogen and configure. Then I typed 'make clean'. Things  
seem to progress normally, but once it gets to  
ompi/contrib/vt/vt/extlib/otf, a new configure script gets run.

Specifically:
[tprins@sif test]$ make clean

Making clean in otf
make[5]: Entering directory
`/san/homedirs/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf'
  cd . && /bin/sh
/u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run
automake-1.10 --gnu
cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/ 
missing

--run autoconf
/bin/sh ./config.status --recheck
running CONFIG_SHELL=/bin/sh /bin/sh ./configure  --with-zlib-lib=-lz
--prefix=/usr/local --exec-prefix=/usr/local --bindir=/usr/local/bin
--libdir=/usr/local/lib --includedir=/usr/local/include
--datarootdir=/usr/local/share/vampirtrace
--datadir=${prefix}/share/${PACKAGE_TARNAME}
--docdir=${prefix}/share/${PACKAGE_TARNAME}/doc --cache-file=/dev/ 
null
--srcdir=. CXXFLAGS=-g -Wall -Wundef -Wno-long-long -finline- 
functions

-pthread LDFLAGS=  LIBS=-lnsl -lutil  -lm  CPPFLAGS=  CFLAGS=-g -Wall
-Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes
-Wstrict-prototypes -Wcomment -pedantic
-Werror-implicit-function-declaration -finline-functions
-fno-strict-aliasing -pthread FFLAGS=  --no-create --no-recursion
checking build system type... x86_64-unknown-linux-gnu



Not sure if this is expected behavior, but it seems wrong to me.

Thanks,

Tim

Matthias Jurenz wrote:
> Hello,
>
> All three VT-related errors which MTT reported should be fixed now.
>
> 516:
> The fix from George Bosilca this morning should work on MacOS PPC.
> Thanks!
>
> 517:
> The compile error occurred due to a missing header include.
> Furthermore, the compiler warnings should also be fixed.
>
> 518:
> I have added a check for whether MPI I/O is available, and added the
> corresponding VT configure option to enable/disable MPI I/O support.
> For this I used the variable "define_mpi_io" from
> 'ompi/mca/io/configure.m4'. Is that o.k., or should I use another
> variable?
>
>
> Matthias
>
>
> On Tue, 2008-01-29 at 09:19 -0500, Jeff Squyres wrote:
>> I got a bunch of compiler warnings and errors with VT on the PGI
>> compiler last night -- my mail client won't paste it in nicely.  :-(

>>
>> See these MTT reports for details:
>>
>> - On Absoft systems:
>>http://www.open-mpi.org/mtt/index.php?do_redir=516
>> - On Cisco systems:
>>With PGI compilers:
>>http://www.open-mpi.org/mtt/index.php?do_redir=517
>>With GNU compilers:
>>http://www.open-mpi.org/mtt/index.php?do_redir=518
>>
>> The output may be a bit hard to read -- for MTT builds, we separate
>> the stdout and stderr into 2 streams.  So you kinda have to merge them
>> in your head; sorry...
>>
> --
> Matthias Jurenz,
> Center for Information Services and
> High Performance Computing (ZIH), TU Dresden,
> Willersbau A106, Zellescher Weg 12, 01062 Dresden
> phone +49-351-463-31945, fax +49-351-463-37773
>
>
>  


>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Matthias Jurenz,
Center for Information Services and
High Performance Computing (ZIH), TU Dresden,
Willersbau A106, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-31945, fax +49-351-463-37773

Re: [OMPI devel] vt compiler warnings and errors

2008-01-31 Thread Matthias Jurenz
Hi Tim,

That seems wrong to me, too. I could not reproduce this on my computer.
The VT integration comes with its own configure script, which is not
created by OMPI's autogen.sh.
I don't really have an idea what's going wrong... I suppose the problem
is that you are using a different version of the Autotools than the one
I used to bootstrap VT. The VT configure script was created with the
following versions of the Autotools:

autoconf 2.61, automake 1.10, libtool 1.5.24.

Which versions of the Autotools are you using to bootstrap Open MPI?


Matthias


On Thu, 2008-01-31 at 08:09 -0500, Tim Prins wrote:

> Hi Matthias,
> 
> I just noticed something else that seems odd. On a fresh checkout, I did 
> an autogen and configure. Then I typed 'make clean'. Things seem to 
> progress normally, but once it gets to ompi/contrib/vt/vt/extlib/otf, a 
> new configure script gets run.
> 
> Specifically:
> [tprins@sif test]$ make clean
> 
> Making clean in otf
> make[5]: Entering directory 
> `/san/homedirs/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf'
>   cd . && /bin/sh 
> /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run 
> automake-1.10 --gnu
> cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing 
> --run autoconf
> /bin/sh ./config.status --recheck
> running CONFIG_SHELL=/bin/sh /bin/sh ./configure  --with-zlib-lib=-lz 
> --prefix=/usr/local --exec-prefix=/usr/local --bindir=/usr/local/bin 
> --libdir=/usr/local/lib --includedir=/usr/local/include 
> --datarootdir=/usr/local/share/vampirtrace 
> --datadir=${prefix}/share/${PACKAGE_TARNAME} 
> --docdir=${prefix}/share/${PACKAGE_TARNAME}/doc --cache-file=/dev/null 
> --srcdir=. CXXFLAGS=-g -Wall -Wundef -Wno-long-long -finline-functions 
> -pthread LDFLAGS=  LIBS=-lnsl -lutil  -lm  CPPFLAGS=  CFLAGS=-g -Wall 
> -Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes 
> -Wstrict-prototypes -Wcomment -pedantic 
> -Werror-implicit-function-declaration -finline-functions 
> -fno-strict-aliasing -pthread FFLAGS=  --no-create --no-recursion
> checking build system type... x86_64-unknown-linux-gnu
> 
> 
> 
> Not sure if this is expected behavior, but it seems wrong to me.
> 
> Thanks,
> 
> Tim
> 
> Matthias Jurenz wrote:
> > Hello,
> > 
> > All three VT-related errors which MTT reported should be fixed now.
> > 
> > 516:
> > The fix from George Bosilca this morning should work on MacOS PPC. 
> > Thanks!
> > 
> > 517:
> > The compile error occurred due to a missing header include.
> > Furthermore, the compiler warnings should also be fixed.
> > 
> > 518:
> > I have added a check for whether MPI I/O is available, and added the
> > corresponding VT configure option to enable/disable MPI I/O support.
> > For this I used the variable "define_mpi_io" from
> > 'ompi/mca/io/configure.m4'. Is that o.k., or should I use another
> > variable?
> > 
> > 
> > Matthias
> > 
> > 
> > On Tue, 2008-01-29 at 09:19 -0500, Jeff Squyres wrote:
> >> I got a bunch of compiler warnings and errors with VT on the PGI  
> >> compiler last night -- my mail client won't paste it in nicely.  :-(
> >>
> >> See these MTT reports for details:
> >>
> >> - On Absoft systems:
> >>http://www.open-mpi.org/mtt/index.php?do_redir=516
> >> - On Cisco systems:
> >>With PGI compilers:
> >>http://www.open-mpi.org/mtt/index.php?do_redir=517
> >>With GNU compilers:
> >>http://www.open-mpi.org/mtt/index.php?do_redir=518
> >>
> >> The output may be a bit hard to read -- for MTT builds, we separate  
> >> the stdout and stderr into 2 streams.  So you kinda have to merge them  
> >> in your head; sorry...
> >>
> > --
> > Matthias Jurenz,
> > Center for Information Services and
> > High Performance Computing (ZIH), TU Dresden,
> > Willersbau A106, Zellescher Weg 12, 01062 Dresden
> > phone +49-351-463-31945, fax +49-351-463-37773
> > 
> > 
> > 
> > 
> 
> 

--
Matthias Jurenz,
Center for Information Services and 
High Performance Computing (ZIH), TU Dresden, 
Willersbau A106, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-31945, fax +49-351-463-37773




Re: [OMPI devel] SnapC

2008-01-31 Thread Josh Hursey
So the ompi-checkpoint command connects with the Global Coordinator in  
the SnapC 'full' component. The Global Coordinator lives in the HNP  
(mpirun/orterun) as determined by the 'full' component. As a result,  
to start a checkpoint, ompi-checkpoint must connect to the HNP.


From a user's standpoint, ompi-checkpoint is typically run from the  
same machine where mpirun was started. So it made the most  
sense to have these two connect to each other, especially if we ask  
the user to provide the PID of the mpirun process to checkpoint.


That being said, with the proper changes to 'full' (or with a new  
SnapC component), ompi-checkpoint could issue the checkpoint request  
to any process in the MPI job [orterun, orted, application processes]  
and have the correct things happen.


I have received one request for this functionality, but have not had  
the time yet to dig into it.


Does that help?

Cheers,
Josh

On Jan 31, 2008, at 9:51 AM, Leonardo Fialho wrote:


Hi all (and Josh),

Why does ompi-checkpoint have to contact the HNP specifically? If I use
another process to start the snapshot coordinator, it apparently works
fine, no?

PS: I prefer to send this message to the list... to keep it in the
history for future use...

--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478






Re: [MTT devel] Reporter Slowness

2008-01-31 Thread Josh Hursey
Ok, so the script is done. It took a bit longer than I had expected,  
but once it finished, things sped back up ('24 hours' of data in 6  
sec). There are a few more maintenance operations I want to run which  
will help out a bit more, but I'll push those to this weekend.


Thanks for your patience, and let me know if it feels sluggish again.  
So as of this email things should be back to normal.


Cheers,
Josh

On Jan 30, 2008, at 5:09 PM, Josh Hursey wrote:


I've started the script running.

Below is a short version, and a trilogy of the gory details. I wanted
to write up the details so that if it ever happens again to us (or
someone else), they can see what we did to fix it.


The Short Version:
--
The Slowness(tm) was caused by the recent shifting of data in the
database to resolve the partition table problems seen earlier this
month.

The bad news is that it will take about 14 hours to finish.

The good news is that I confirmed that this will fix the performance
problem that we are seeing. In the small run this technique reduced the
'24 hour' query execution time from ~40 sec back down to ~8 sec.

This may slow down client submits this evening, but should not prevent
them from being able to submit. The 'DELETE' operations do not require
an exclusive lock, so the 'INSERT' operations should proceed fine
concurrently. The 'INSERT' operations will need to be blocked while
the 'VACUUM FULL' operation is in progress, since it *does* require an
exclusive lock. The 'INSERT' operations will proceed normally once
this lock is released, resulting in a temporary slowdown for clients
that submit during these windows of time (about 20 min or so).



The Details: Part 1: What I did earlier this week:
(more than you wanted to know, for posterity purposes)
--
The original problem was that the master partition tables accidentally
started storing data because I forgot to load the 2008 partition
tables into the database before the first of the year. :( So we loaded
the partition tables, but we still needed to move the misplaced data.

To move the misplaced data we have to duplicate the row (so it is
stored properly this time), but we also need to take care in assigning
row IDs to the duplicate rows. We cannot give the dup'ed rows the same
ID or we will be unable to differentiate the original and the dup'ed
row. So I created a dummy table for mpi_install/test_build/test_run to
translate between the orig row ID and the dup'ed row ID. I used the
nextval on the sequence to populate the values for the dup'ed rows in
the dummy table.

Now that I had the translation, I joined the dummy table with its
corresponding master table (e.g. "mpi_install join mpi_install_dummy
on mpi_install.mpi_install_id = mpi_install_dummy.orig_id"), and
instead of selecting the original ID from the dummy table I selected
the new dup'ed ID. I inserted this selection back into the
mpi_install table. (Cool little trick that PostgreSQL lets you get
away with sometimes.)

Once I had duplicated all of the affected rows, I updated all
references to the original IDs and set them to the duplicated IDs in
the test_build/test_run tables. This removed all internal references
to the original IDs and replaced them with the duplicates, so we
retain the integrity of the data.

Once I had verified that no table references the original rows, I
deleted those rows from the mpi_install/test_build/test_run tables.



The Details: Part 2: What I forgot to do:
-
When rows are deleted from PostgreSQL the disk space used continues to
be reserved for this table, and is not reclaimed unless you 'VACUUM
FULL' this table. PostgreSQL does this for many good reasons which are
described in their documentation. However in the case of the master
partition tables we want them to release all of their disk space since
we should never be storing data in this particular table.

I did a 'VACUUM FULL' on the mpi_install and test_build tables
originally, but did not do it on the test_run table since this
operation requires an exclusive lock on the table and can take a long
time to finish. Further, I only completed about 1% of the deletions for
test_run before I stopped this operation, choosing to wait for the
weekend since it will take a long time to complete.

Deleting only part of the test_run master table (which contained
about 1.2 million rows) caused the queries on this table to slow
down considerably. The Query Planner estimated the execution of the
'24 hour' query at 322,924 and it completed in about 40 seconds. I ran
'VACUUM FULL test_run' which only Vacuums the master table, and then
re-ran the query. This time the Query Planner estimated the execution
at 151,430 and it completed in about 8 seconds.



The Details: Part 3: What I am doing now:
-
Currently I am deleting the rest of the old rows from test_run. There
are approx. 1.2 million rows, and 

[OMPI devel] SnapC

2008-01-31 Thread Leonardo Fialho
Hi all (and Josh),

Why does ompi-checkpoint have to contact the HNP specifically? If I use
another process to start the snapshot coordinator, it apparently works
fine, no?

PS: I prefer to send this message to the list... to keep it in the
history for future use...

-- 
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478



Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-31 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 06:48:54PM +0100, Adrian Knoth wrote:

> > What is the real issue behind this whole discussion?
> Hanging connections.
> I'll have a look at it tomorrow.

To everybody who's interested in BTL-TCP, especially George and (to a
minor degree) rhc:

I've integrated something I call "magic address selection code".
See the comments in r17348.

Can you check

   https://svn.open-mpi.org/svn/ompi/tmp-public/btl-tcp

if it's working for you? Read: multi-rail TCP, FNN, whatever is
important to you?


The code is a proof of concept and could use a little tuning (if it's
working at all; over here, it satisfies all tests).

I vaguely remember that at least Ralph doesn't like

   int a[perm_size * sizeof(int)];

where perm_size is dynamically evaluated (read: array size is runtime
dependent)

There are also some large arrays, search for MAX_KERNEL_INTERFACE_INDEX.
Perhaps it's better to replace them with an appropriate OMPI data
structure. I don't know what fits best, you guys know the details...
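For what it's worth, a minimal sketch of the heap-based alternative to the runtime-sized stack array (the function and variable names here are hypothetical, not taken from the actual branch):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for a runtime-sized stack array such as
 *     int a[perm_size * sizeof(int)];
 * Note that the original also sizes the array in perm_size * sizeof(int)
 * *elements*, not bytes.  A heap allocation sized in elements avoids both
 * the runtime-dependent stack frame and the over-allocation, and lets the
 * caller detect allocation failure. */
static int *alloc_perm(size_t perm_size)
{
    int *a = malloc(perm_size * sizeof(*a));
    if (NULL != a) {
        memset(a, 0, perm_size * sizeof(*a));  /* zero-initialize */
    }
    return a;  /* caller must free() */
}
```

An OMPI-native container would work equally well; the point is only that the storage size is decided at runtime without growing the stack.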


So please give the code a try, and if it's working, feel free to cleanup
whatever is necessary to make it the OMPI style or give me some pointers
what to change.


I'd like to point to Thomas' diploma thesis. The PDF explains the theory
behind the code; it's like a rationale. Unfortunately, the PDF has some
typos, but I guess you'll get the idea. It's a graph matching algorithm,
Chapter 3 covers everything in detail:

 http://cluster.inf-ra.uni-jena.de/~adi/peiselt-thesis.pdf


HTH

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI devel] 32 bit udapl warnings

2008-01-31 Thread Gleb Natapov
On Thu, Jan 31, 2008 at 08:45:54AM -0500, Don Kerr wrote:
> This was brought to my attention once before but I don't see this 
> message so I just plain forgot about it. :-(
> uDAPL defines its pointers as uint64, "typedef DAT_UINT64 DAT_VADDR", 
> and pval is a "void *" which is why the message comes up.  If I remove 
> the cast I believe I get a different warning and I just haven't stopped 
> to think of a way around this.
dat_pointer = (DAT_VADDR)(uintptr_t)void_pointer;

This is not just a warning. This is a real bug. If the MSB of a void
pointer is 1, it will be sign-extended.
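A small C sketch of the hazard, with a 32-bit integer standing in for a 32-bit pointer (the DAT_VADDR typedef here is a local stand-in for illustration, not the real uDAPL header):

```c
#include <stdint.h>

/* Local stand-in for uDAPL's "typedef DAT_UINT64 DAT_VADDR". */
typedef uint64_t DAT_VADDR;

/* Widening through a *signed* 32-bit intermediate sign-extends whenever
 * the MSB is 1 -- the bug described above for an ill-chosen cast chain
 * on a 32-bit platform. */
static DAT_VADDR widen_via_signed(uint32_t addr)
{
    return (DAT_VADDR)(int32_t)addr;   /* buggy: sign-extends */
}

/* Widening through an unsigned intermediate (uintptr_t for a real void
 * pointer, i.e. (DAT_VADDR)(uintptr_t)pval) zero-extends, which is what
 * an address needs. */
static DAT_VADDR widen_via_unsigned(uint32_t addr)
{
    return (DAT_VADDR)addr;            /* correct: zero-extends */
}
```

With the MSB set (e.g. 0x80000000), the signed path yields 0xFFFFFFFF80000000 while the unsigned path yields 0x0000000080000000; for addresses below 2 GB the two agree, which is why the bug can hide for a long time.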

> 
> Tim Prins wrote:
> > Hi,
> >
> > I am seeing some warnings on the trunk when compiling udapl in 32 bit 
> > mode with OFED 1.2.5.1:
> >
> > btl_udapl.c: In function 'udapl_reg_mr':
> > btl_udapl.c:95: warning: cast from pointer to integer of different size
> > btl_udapl.c: In function 'mca_btl_udapl_alloc':
> > btl_udapl.c:852: warning: cast from pointer to integer of different size
> > btl_udapl.c: In function 'mca_btl_udapl_prepare_src':
> > btl_udapl.c:959: warning: cast from pointer to integer of different size
> > btl_udapl.c:1008: warning: cast from pointer to integer of different size
> > btl_udapl_component.c: In function 'mca_btl_udapl_component_progress':
> > btl_udapl_component.c:871: warning: cast from pointer to integer of 
> > different size
> > btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_write_eager':
> > btl_udapl_endpoint.c:130: warning: cast from pointer to integer of 
> > different size
> > btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_finish_max':
> > btl_udapl_endpoint.c:775: warning: cast from pointer to integer of 
> > different size
> > btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_post_recv':
> > btl_udapl_endpoint.c:864: warning: cast from pointer to integer of 
> > different size
> > btl_udapl_endpoint.c: In function 
> > 'mca_btl_udapl_endpoint_initialize_control_message':
> > btl_udapl_endpoint.c:1012: warning: cast from pointer to integer of 
> > different size
> >
> >
> > Thanks,
> >
> > Tim
> >   

--
Gleb.


Re: [OMPI devel] 32 bit udapl warnings

2008-01-31 Thread Don Kerr
This was brought to my attention once before but I don't see this 
message so I just plain forgot about it. :-(
uDAPL defines its pointers as uint64, "typedef DAT_UINT64 DAT_VADDR", 
and pval is a "void *" which is why the message comes up.  If I remove 
the cast I believe I get a different warning and I just haven't stopped 
to think of a way around this.


Tim Prins wrote:

Hi,

I am seeing some warnings on the trunk when compiling udapl in 32 bit 
mode with OFED 1.2.5.1:


btl_udapl.c: In function 'udapl_reg_mr':
btl_udapl.c:95: warning: cast from pointer to integer of different size
btl_udapl.c: In function 'mca_btl_udapl_alloc':
btl_udapl.c:852: warning: cast from pointer to integer of different size
btl_udapl.c: In function 'mca_btl_udapl_prepare_src':
btl_udapl.c:959: warning: cast from pointer to integer of different size
btl_udapl.c:1008: warning: cast from pointer to integer of different size
btl_udapl_component.c: In function 'mca_btl_udapl_component_progress':
btl_udapl_component.c:871: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_write_eager':
btl_udapl_endpoint.c:130: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_finish_max':
btl_udapl_endpoint.c:775: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_post_recv':
btl_udapl_endpoint.c:864: warning: cast from pointer to integer of 
different size
btl_udapl_endpoint.c: In function 
'mca_btl_udapl_endpoint_initialize_control_message':
btl_udapl_endpoint.c:1012: warning: cast from pointer to integer of 
different size



Thanks,

Tim
  


Re: [OMPI devel] vt compiler warnings and errors

2008-01-31 Thread Tim Prins

Hi Matthias,

I just noticed something else that seems odd. On a fresh checkout, I did 
an autogen and configure. Then I typed 'make clean'. Things seem to 
progress normally, but once it gets to ompi/contrib/vt/vt/extlib/otf, a 
new configure script gets run.


Specifically:
[tprins@sif test]$ make clean

Making clean in otf
make[5]: Entering directory 
`/san/homedirs/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf'
 cd . && /bin/sh 
/u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run 
automake-1.10 --gnu
cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing 
--run autoconf

/bin/sh ./config.status --recheck
running CONFIG_SHELL=/bin/sh /bin/sh ./configure  --with-zlib-lib=-lz 
--prefix=/usr/local --exec-prefix=/usr/local --bindir=/usr/local/bin 
--libdir=/usr/local/lib --includedir=/usr/local/include 
--datarootdir=/usr/local/share/vampirtrace 
--datadir=${prefix}/share/${PACKAGE_TARNAME} 
--docdir=${prefix}/share/${PACKAGE_TARNAME}/doc --cache-file=/dev/null 
--srcdir=. CXXFLAGS=-g -Wall -Wundef -Wno-long-long -finline-functions 
-pthread LDFLAGS=  LIBS=-lnsl -lutil  -lm  CPPFLAGS=  CFLAGS=-g -Wall 
-Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes 
-Wstrict-prototypes -Wcomment -pedantic 
-Werror-implicit-function-declaration -finline-functions 
-fno-strict-aliasing -pthread FFLAGS=  --no-create --no-recursion

checking build system type... x86_64-unknown-linux-gnu



Not sure if this is expected behavior, but it seems wrong to me.

Thanks,

Tim

Matthias Jurenz wrote:

Hello,

All three VT-related errors which MTT reported should be fixed now.

516:
The fix from George Bosilca this morning should work on MacOS PPC. 
Thanks!


517:
The compile error occurred due to a missing header include.
Furthermore, the compiler warnings should also be fixed.

518:
I have added a check for whether MPI I/O is available, and added the
corresponding VT configure option to enable/disable MPI I/O support.
For this I used the variable "define_mpi_io" from
'ompi/mca/io/configure.m4'. Is that o.k., or should I use another
variable?


Matthias


On Tue, 2008-01-29 at 09:19 -0500, Jeff Squyres wrote:
I got a bunch of compiler warnings and errors with VT on the PGI  
compiler last night -- my mail client won't paste it in nicely.  :-(


See these MTT reports for details:

- On Absoft systems:
   http://www.open-mpi.org/mtt/index.php?do_redir=516
- On Cisco systems:
   With PGI compilers:
   http://www.open-mpi.org/mtt/index.php?do_redir=517
   With GNU compilers:
   http://www.open-mpi.org/mtt/index.php?do_redir=518

The output may be a bit hard to read -- for MTT builds, we separate  
the stdout and stderr into 2 streams.  So you kinda have to merge them  
in your head; sorry...



--
Matthias Jurenz,
Center for Information Services and
High Performance Computing (ZIH), TU Dresden,
Willersbau A106, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-31945, fax +49-351-463-37773








[OMPI devel] 32 bit udapl warnings

2008-01-31 Thread Tim Prins

Hi,

I am seeing some warnings on the trunk when compiling udapl in 32 bit 
mode with OFED 1.2.5.1:


btl_udapl.c: In function 'udapl_reg_mr':
btl_udapl.c:95: warning: cast from pointer to integer of different size
btl_udapl.c: In function 'mca_btl_udapl_alloc':
btl_udapl.c:852: warning: cast from pointer to integer of different size
btl_udapl.c: In function 'mca_btl_udapl_prepare_src':
btl_udapl.c:959: warning: cast from pointer to integer of different size
btl_udapl.c:1008: warning: cast from pointer to integer of different size
btl_udapl_component.c: In function 'mca_btl_udapl_component_progress':
btl_udapl_component.c:871: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_write_eager':
btl_udapl_endpoint.c:130: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_finish_max':
btl_udapl_endpoint.c:775: warning: cast from pointer to integer of 
different size

btl_udapl_endpoint.c: In function 'mca_btl_udapl_endpoint_post_recv':
btl_udapl_endpoint.c:864: warning: cast from pointer to integer of 
different size
btl_udapl_endpoint.c: In function 
'mca_btl_udapl_endpoint_initialize_control_message':
btl_udapl_endpoint.c:1012: warning: cast from pointer to integer of 
different size



Thanks,

Tim


Re: [OMPI devel] orte_ns_base_select failed: returned value -1 instead of ORTE_SUCCESS

2008-01-31 Thread Aurélien Bouteiller
I tried using a fresh trunk; the same problem occurred. Here is the  
complete configure line. I am using libtool 1.5.22 from fink.  
Otherwise everything is standard OS 10.5.


  $ ../trunk/configure --prefix=/Users/bouteill/ompi/build --enable- 
mpirun-prefix-by-default --disable-io-romio --enable-debug --enable- 
picky --enable-mem-debug --enable-mem-profile --enable-visibility -- 
disable-dlopen --disable-shared --enable-static


The error message generated by abort contains garbage (line numbers do  
not match anything in .c files and according to gdb the failure does  
not occur during ns initialization). This looks like a heap corruption  
or something as bad.


orterun (argc=4, argv=0xb81c) at ../../../../trunk/orte/tools/ 
orterun/orterun.c:529
529	cb_states = ORTE_PROC_STATE_TERMINATED |  
ORTE_PROC_STATE_AT_STG1;

(gdb) n
530	rc = orte_rmgr.spawn_job(apps, num_apps, , 0, NULL,  
job_state_callback, cb_states, );

(gdb) n
531	while (NULL != (item = opal_list_remove_first()))  
OBJ_RELEASE(item);

(gdb) n
** Stepping over inlined function code. **
532 OBJ_DESTRUCT();
(gdb) n
534 if (orterun_globals.do_not_launch) {
(gdb) n
539 OPAL_THREAD_LOCK(_globals.lock);
(gdb) n
541 if (ORTE_SUCCESS == rc) {
(gdb) n
542 while (!orterun_globals.exit) {
(gdb) n
543 opal_condition_wait(_globals.cond,
(gdb) n
[grosse-pomme.local:77335] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in  
file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/ 
oob_base_init.c at line 74


Aurelien


On Jan 30, 2008 at 17:18, Ralph Castain wrote:


Are you running on the trunk, or an earlier release?

If the trunk, then I suspect you have a stale library hanging  
around. I

build and run statically on Leopard regularly.


On 1/30/08 2:54 PM, "Aurélien Bouteiller"   
wrote:



I get a runtime error in static build on Mac OS 10.5 (automake 1.10,
autoconf 2.60, gcc-apple-darwin 4.01, libtool 1.5.22).

The error does not occur in dso builds, and everything seems to work
fine on Linux.

Here is the error log.

~/ompi$ mpirun -np 2 NetPIPE_3.6/NPmpi
[grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/
oob_base_init.c at line 74
[grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/ns/proxy/
ns_proxy_component.c at line 222
[grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Error in file /
SourceCache/openmpi/openmpi-5/openmpi/orte/runtime/orte_init_stage1.c
at line 230
--
It looks like orte_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal  
failure;

here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ns_base_select failed
  --> Returned value -1 instead of ORTE_SUCCESS

--
--
It looks like MPI_INIT failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems.  This failure appears to be an internal failure; here's  
some

additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init_stage1 failed
  --> Returned "Error" (-1) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)



--
Dr. Aurélien Bouteiller
Sr. Research Associate - Innovative Computing Laboratory
Suite 350, 1122 Volunteer Boulevard
Knoxville, TN 37996
865 974 6321








