Re: [OMPI devel] More VT warnings

2008-02-01 Thread Ralf Wildenhues
* Tim Prins wrote on Fri, Feb 01, 2008 at 04:09:31PM CET:
> 
> Note that this indicates that the file vt_metric_papi.c is being 
> compiled *3* times. I am not using a parallel make here. Any ideas why 
> it is compiling 3 times?

The file is listed as a source file for four different libraries, and
per-target CFLAGS are used for these.  With per-target flags, Automake
compiles the file separately for each library.  Between one and four of
these libraries are actually built, depending on decisions made at
configure time.

Cheers,
Ralf


Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-02-01 Thread Tim Prins

Adrian,

For the most part this seems to work for me. But there are a few issues. 
I'm not sure which are introduced by this patch, and whether some may be 
expected behavior. But for completeness I will point them all out. 
First, let me explain I am working on a machine with 3 tcp interfaces, 
lo, eth0, and ib0. Both eth0 and ib0 connect all the compute nodes.


1. There are some warnings when compiling:
btl_tcp_proc.c:171: warning: no previous prototype for 'evaluate_assignment'
btl_tcp_proc.c:206: warning: no previous prototype for 'visit'
btl_tcp_proc.c:224: warning: no previous prototype for 'mca_btl_tcp_initialise_interface'

btl_tcp_proc.c: In function `mca_btl_tcp_proc_insert':
btl_tcp_proc.c:304: warning: pointer targets in passing arg 2 of `opal_ifindextomask' differ in signedness
btl_tcp_proc.c:313: warning: pointer targets in passing arg 2 of `opal_ifindextomask' differ in signedness

btl_tcp_proc.c:389: warning: comparison between signed and unsigned
btl_tcp_proc.c:400: warning: comparison between signed and unsigned
btl_tcp_proc.c:401: warning: comparison between signed and unsigned
btl_tcp_proc.c:459: warning: ISO C90 forbids variable-size array `a'
btl_tcp_proc.c:459: warning: ISO C90 forbids mixed declarations and code
btl_tcp_proc.c:465: warning: ISO C90 forbids mixed declarations and code
btl_tcp_proc.c:466: warning: comparison between signed and unsigned
btl_tcp_proc.c:480: warning: comparison between signed and unsigned
btl_tcp_proc.c:485: warning: comparison between signed and unsigned
btl_tcp_proc.c:495: warning: comparison between signed and unsigned

2. If I exclude all my tcp interfaces, the connection fails properly, 
but I do get a malloc request for 0 bytes:
[tprins@odin examples]$ mpirun -mca btl tcp,self  -mca btl_tcp_if_exclude
eth0,ib0,lo -np 2 ./ring_c

malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)


3. If the exclude list does not contain 'lo', or the include list 
contains 'lo', the job hangs when using multiple nodes:
[tprins@odin examples]$ mpirun -mca btl tcp,self  -mca
btl_tcp_if_exclude ib0 -np 2 -bynode ./ring_c

Process 0 sending 10 to 1, tag 201 (2 processes in ring)
[odin011][1,0][btl_tcp_endpoint.c:619:mca_btl_tcp_endpoint_complete_connect] 
connect() failed: Connection refused (111)


[tprins@odin examples]$ mpirun -mca btl tcp,self  -mca 
btl_tcp_if_include eth0,lo -np 2 -bynode ./ring_c

Process 0 sending 10 to 1, tag 201 (2 processes in ring)
[odin011][1,0][btl_tcp_endpoint.c:619:mca_btl_tcp_endpoint_complete_connect] 
connect() failed: Connection refused (111)
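
Most of the warnings in item 1 fall into a few classes that are usually cheap to silence. What follows is only a minimal sketch, not code from btl_tcp_proc.c: the names count_matches, store_mask and fetch_mask are hypothetical. File-local helpers are declared static so no separate prototype is expected, the loop index uses the same unsigned type as its bound, and the variable passed by pointer has the type the callee expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. Declaring it static gives it internal linkage,
 * so the compiler no longer expects a previous prototype for it. */
static size_t count_matches(const uint32_t *values, size_t n, uint32_t key)
{
    size_t i;        /* same unsigned type as n: no signed/unsigned comparison */
    size_t hits = 0;

    assert(values != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        if (values[i] == key) {
            ++hits;
        }
    }
    return hits;
}

/* Hypothetical callee that expects a pointer to an unsigned value. */
static void store_mask(uint32_t *out, uint32_t mask)
{
    *out = mask;
}

/* Passing an int* where a uint32_t* is expected triggers the
 * "pointer targets differ in signedness" warning. Declaring the
 * variable with the callee's type avoids it without a cast. */
static uint32_t fetch_mask(uint32_t mask)
{
    uint32_t result = 0;    /* instead of: int result; */

    store_mask(&result, mask);
    return result;
}
```

All declarations sit at the top of their blocks, which also keeps a C90 compiler quiet about mixed declarations and code.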



However, the great news about this patch is that it appears to fix 
https://svn.open-mpi.org/trac/ompi/ticket/1027 for me.


Hope this helps,

Tim



Adrian Knoth wrote:

On Wed, Jan 30, 2008 at 06:48:54PM +0100, Adrian Knoth wrote:


What is the real issue behind this whole discussion?

Hanging connections.
I'll have a look at it tomorrow.


To everybody who's interested in BTL-TCP, especially George and (to a
minor degree) rhc:

I've integrated something that I call "magic address selection code".
See the comments in r17348.

Can you check

   https://svn.open-mpi.org/svn/ompi/tmp-public/btl-tcp

if it's working for you? Read: multi-rail TCP, FNN, whatever is
important to you?


The code is a proof of concept and could use a little tuning (if it's
working at all; over here, it passes all tests).

I vaguely remember that at least Ralph doesn't like

   int a[perm_size * sizeof(int)];

where perm_size is dynamically evaluated (read: array size is runtime
dependent)
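
One way to avoid the variable-length array is a heap allocation. This is a minimal sketch under the assumption that only perm_size ints are needed (in that case the `* sizeof(int)` in the declaration above reserves more elements than necessary). The helper name make_identity_permutation is hypothetical, and real OMPI code may well prefer its own allocation wrappers.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated array holding the identity
 * permutation 0..perm_size-1, or NULL on allocation failure or if
 * perm_size is 0. The caller owns the result and must free() it. */
static int *make_identity_permutation(size_t perm_size)
{
    int *a;
    size_t i;

    assert(perm_size <= (size_t)INT_MAX);   /* indices must fit in an int */
    if (perm_size == 0) {
        return NULL;            /* also avoids a zero-byte malloc request */
    }
    a = malloc(perm_size * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (i = 0; i < perm_size; ++i) {
        a[i] = (int)i;
    }
    return a;
}
```

The size is still decided at run time, but the declarations stay at the top of the block and the array no longer lives on the stack.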

There are also some large arrays, search for MAX_KERNEL_INTERFACE_INDEX.
Perhaps it's better to replace them with an appropriate OMPI data
structure. I don't know what fits best, you guys know the details...


So please give the code a try, and if it's working, feel free to clean up
whatever is necessary to bring it in line with OMPI style, or give me some
pointers on what to change.


I'd like to point to Thomas' diploma thesis. The PDF explains the theory
behind the code and serves as a rationale. Unfortunately, the PDF has some
typos, but I guess you'll get the idea. It's a graph matching algorithm;
Chapter 3 covers everything in detail:

 http://cluster.inf-ra.uni-jena.de/~adi/peiselt-thesis.pdf
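
To give a rough idea of what such a matching does, here is a minimal sketch. It is not the code from r17348 and not the algorithm as written in the thesis: it simply tries every assignment of local to remote interfaces and keeps the one with the highest total weight. The weight matrix, the N_IF limit and the names search and best_assignment are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

#define N_IF 4   /* hypothetical upper bound on interfaces per side */

/* Recursively try every unused remote interface for local interface `row`. */
static void search(int n, int w[N_IF][N_IF], int row, int used[N_IF],
                   int cur[N_IF], int cur_sum, int best[N_IF], int *best_sum)
{
    int col, i;

    if (row == n) {
        /* Complete assignment: remember it if it beats the best so far. */
        if (cur_sum > *best_sum) {
            *best_sum = cur_sum;
            for (i = 0; i < n; ++i) {
                best[i] = cur[i];
            }
        }
        return;
    }
    for (col = 0; col < n; ++col) {
        if (!used[col]) {
            used[col] = 1;
            cur[row] = col;
            search(n, w, row + 1, used, cur, cur_sum + w[row][col],
                   best, best_sum);
            used[col] = 0;
        }
    }
}

/* Fills best[i] with the remote index chosen for local interface i and
 * returns the total weight of that assignment. */
static int best_assignment(int n, int w[N_IF][N_IF], int best[N_IF])
{
    int used[N_IF] = {0};
    int cur[N_IF] = {0};
    int best_sum = INT_MIN;

    assert(n >= 1 && n <= N_IF);
    search(n, w, 0, used, cur, 0, best, &best_sum);
    return best_sum;
}
```

The search visits n! assignments, which is only tolerable because the number of interfaces per node is small; w[i][j] would hold some quality score for pairing local interface i with remote interface j.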


HTH





Re: [OMPI devel] VT in trunk + how to disable

2008-02-01 Thread Jeff Squyres

I think my position is about the same as Terry's.

I also think we have a precedent for building everything that is  
possible and letting the user choose at run-time what they want to  
do.  My $0.02 is that it's easier to tell random users (and  
customers!) "yes, OMPI should have built that for you by default; you  
use it like this..." vs. "No, sorry, you need to go re-install OMPI to  
have feature X."


We developers are probably a bit more sensitive to this issue since it
makes builds longer (and we re-build all the time).  But remember that
most people install OMPI only a small number of times -- so build time
is less of an issue for them.


(I'm assuming that at least one of your motivations for asking was the  
longer build time...?)



On Feb 1, 2008, at 10:17 AM, Terry Dontje wrote:


Josh Hursey wrote:

Should the default be to *disable* vampirtrace?

I mention this since, I assume, most people do not depend on this
tool for every Open MPI install. Meaning that Open MPI does not
require this integration for correct MPI functionality unlike
something like ROMIO [example of opt-out functionality which is 3rd
party].

So I would suggest to the group that vampirtrace be an opt-in
functionality.

What do others think?

I am not completely against disabling it as a default.  However, once it
builds consistently having it enabled by default shouldn't really cause
any problems for those not directly using it (well outside of more time
to compile).   I imagine changing the default probably would help ORTE
move forward but then I wonder if we will run into issues of the vampire
stuff not being able to resolve their issues because of ORTE problems
put back to the trunk.

--td

-- Josh

On Jan 28, 2008, at 9:59 AM, Andreas Knüpfer wrote:



Hi everybody,

the vampirtrace integration arrived at the trunk today. There seems to be one
issue already, but we'll fix this asap.

As a general hint, this is how to completely disable anything we integrated:

   configure --enable-contrib-no-build=vt ...

Then again, we'd like to see all the issues you may encounter and fix them.

Best regards, Andreas

--
Dipl. Math. Andreas Knuepfer,
Center for Information Services and
High Performance Computing (ZIH), TU Dresden,
Willersbau A114, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-38323, fax +49-351-463-37773
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel







--
Jeff Squyres
Cisco Systems




Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Jeff Squyres

On Feb 1, 2008, at 5:35 AM, Ralf Wildenhues wrote:


These files do not belong in SVN, they are generated by aclocal:
 ompi/contrib/vt/vt/extlib/otf/aclocal.m4
 ompi/contrib/vt/vt/aclocal.m4



I think both of these have their own configure scripts, meaning that  
they were autoconfed/automaked/whatever before they were put into OMPI.


And in hindsight, this fits in with exactly what our original goal  
was: take a VT tarball and dump it into OMPI's SVN.  Doh!


So I think the question still remains: can we hook VT's autoconf (et  
al.) requirements into the top-level autogen.sh so that the trunk copy  
of vt doesn't have configure/aclocal.m4/etc. and OMPI's top-level  
autogen.sh will create them?


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] VT in trunk + how to disable

2008-02-01 Thread Josh Hursey

Should the default be to *disable* vampirtrace?

I mention this since, I assume, most people do not depend on this  
tool for every Open MPI install. Meaning that Open MPI does not  
require this integration for correct MPI functionality unlike  
something like ROMIO [example of opt-out functionality which is 3rd  
party].


So I would suggest to the group that vampirtrace be an opt-in  
functionality.


What do others think?

-- Josh

On Jan 28, 2008, at 9:59 AM, Andreas Knüpfer wrote:


Hi everybody,

the vampirtrace integration arrived at the trunk today. There seems to be one
issue already, but we'll fix this asap.

As a general hint, this is how to completely disable anything we integrated:

   configure --enable-contrib-no-build=vt ...

Then again, we'd like to see all the issues you may encounter and fix them.


Best regards, Andreas

--
Dipl. Math. Andreas Knuepfer,
Center for Information Services and
High Performance Computing (ZIH), TU Dresden,
Willersbau A114, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-38323, fax +49-351-463-37773





Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Ralf Wildenhues
* Jeff Squyres wrote on Thu, Jan 31, 2008 at 07:10:36PM CET:
> Ah -- I didn't notice this before -- do you have a configure script  
> committed to SVN?  If so, this could be the problem.

> > On Do, 2008-01-31 at 08:09 -0500, Tim Prins wrote:
[...]
> >> [tprins@sif test]$ make clean
> >> 
> >> Making clean in otf
> >> make[5]: Entering directory
> >> `/san/homedirs/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf'
> >>   cd . && /bin/sh
> >> /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run
> >> automake-1.10 --gnu
> >> cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing
> >> --run autoconf
[...]

These files do not belong in SVN, they are generated by aclocal:
  ompi/contrib/vt/vt/extlib/otf/aclocal.m4
  ompi/contrib/vt/vt/aclocal.m4

Cheers,
Ralf


Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Andreas Knüpfer
Hi everybody,

now this is an interesting effect. 

After a fresh checkout, all files carry the current time as their timestamp,
don't they? Or is the timestamp explicitly saved somewhere?

Could it be that this is newer than Tim's local time yesterday? Maybe the
system time is not set to UTC or something like that? If so, then it should
be possible to reproduce this today. Could you give it a try, Tim?

Another cause could be slight differences in the files' timestamps because one
is checked out earlier than the other. However, OTF's configure already ran
during the first global configure. Therefore, all files' timestamps should be
correct after this. So I don't believe in this explanation.

What do you think?


-- 
Dipl. Math. Andreas Knuepfer, 
Center for Information Services and 
High Performance Computing (ZIH), TU Dresden, 
Willersbau A114, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-38323, fax +49-351-463-37773

