Re: [OMPI devel] problem in runing MPI job through XGrid

2007-10-09 Thread Brian Barrett

On Oct 4, 2007, at 3:06 PM, Jinhui Qin wrote:

sib:sharcnet$ mpirun -n 3 ~/openMPI_stuff/Hello

Process 0.1.1 is unable to reach 0.1.2 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.



This is very odd -- it looks like two of the processes don't think  
they can talk to each other.  Can you try running with:


  mpirun -n 3 -mca btl tcp,self 

If that fails, then the next piece of information that would be  
useful is the IP addresses and netmasks for all the nodes in your  
cluster.  We have some logic in our TCP communication system that can  
cause some interesting results for some network topologies.
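
As a rough illustration of why the addresses and netmasks matter, the sketch below checks whether two IPv4 addresses fall in the same subnet under a given netmask. This is only an assumed, simplified example of a netmask-based reachability test, not Open MPI's actual TCP code; the function name same_subnet and the sample addresses are hypothetical.

```c
#include <arpa/inet.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when the two dotted-quad IPv4
 * addresses share a subnet under the given netmask. Returns false on
 * a mismatch or if any of the three strings fails to parse. */
static bool same_subnet(const char *a, const char *b, const char *mask)
{
    struct in_addr ia, ib, im;

    if (inet_pton(AF_INET, a, &ia) != 1 ||
        inet_pton(AF_INET, b, &ib) != 1 ||
        inet_pton(AF_INET, mask, &im) != 1) {
        return false;
    }

    /* All three values are in network byte order, so a bitwise AND
     * compares the network portions directly. */
    return (ia.s_addr & im.s_addr) == (ib.s_addr & im.s_addr);
}
```

For example, same_subnet("192.168.1.10", "192.168.2.20", "255.255.255.0") is false, so a scheme that only pairs interfaces on a common subnet would treat those two peers as unreachable even though routing between them might work.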


Just to verify it's not an XGrid problem, you might want to try  
running with a hostfile -- I think you'll find that the results are  
the same, but it's always good to verify.


Brian


Re: [OMPI devel] RFC: Remove opal message buffer

2007-10-09 Thread George Bosilca
That was long ago, on the first draft of the ORTE. Completely useless  
by now, so go ahead and remove it.


  george.

On Oct 8, 2007, at 10:01 AM, Tim Prins wrote:


WHAT: Remove the opal message buffer code

WHY: It is not used

WHERE: Remove references from opal/mca/base/Makefile.am and
opal/mca/base/base.h
svn rm opal/mca/base/mca_base_msgbuf*

WHEN: After timeout

TIMEOUT: COB, Wednesday October 10, 2007



I ran into this code accidentally while looking at other things. It
looks like it was originally designed to be our data packing/unpacking
system, but we now use the dss for that.

A couple of greps through the code do not find anyone who actually
uses this functionality. So, to reduce future confusion and excess
code, I would like to remove it.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] VampirTrace integration / bundling 3rd party software in OMPI

2007-10-09 Thread Jeff Squyres

On Oct 9, 2007, at 4:07 PM, Edgar Gabriel wrote:


One of my big problems with this idea is that we lose the concept of
shipping a single unit of Open MPI.  If someone sends us a bug report
concerning VT, we no longer have a solid idea of what version they
are running because they may have replaced the one inside their Open
MPI software.


well, this issue could be however resolved, if ompi_info and friends
would have a way to report the precise version number for VT, isn't it?


I don't quite know how to do it yet, but I agree that ompi_info  
should show the following for each 3rd party package:


- whether we are using the internally bundled package or not
- the version of the internally bundled package

I'll muck around with the libnbc integration to figure this stuff out.


Without having any strong feelings one way or the other, I think that
the functionality is great from the end user's perspective. Just my 0.02$...


It makes me very, very nervous.  When we ship Open MPI, we test it  
and have a good feel for what works and what does not.  If a user  
changes something inside their installed Open MPI, all bets are off  
on whether it will work or not.  Some users will get it right, some  
will not (so you have to assume that they will not).


I think it is far safer to have the user download VT outside of OMPI  
and --disable-vt, or --enable-vt=/path/to/somewhere/else.  If we  
*replace* what is in the user's expanded tarball, then they cannot  
revert to what came out of the tarball without re-expanding the  
tarball (i.e., "make clean" and "make distclean" and whatnot do not  
revert back to the real original state -- this is contrary to the  
philosophy of those Automake targets).


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] VampirTrace integration / bundling 3rd party software in OMPI

2007-10-09 Thread Edgar Gabriel



Jeff Squyres wrote:


Is this in the production VT, or is this OMPI-specific functionality?

If it's OMPI-specific functionality, I would vote to not have it.

One of my big problems with this idea is that we lose the concept of  
shipping a single unit of Open MPI.  If someone sends us a bug report  
concerning VT, we no longer have a solid idea of what version they  
are running because they may have replaced the one inside their Open  
MPI software.


well, this issue could be however resolved, if ompi_info and friends 
would have a way to report the precise version number for VT, isn't it?


Without having any strong feelings one way or the other, I think that
the functionality is great from the end user's perspective. Just my 0.02$...


Thanks
Edgar




Running an external VT install with OMPI is a different thing; in that
case it is easy enough to tell that someone is using an external VT
rather than the included one.  But if the user is able to arbitrarily
(and perhaps accidentally) change the included VT, this becomes
problematic for support and maintenance.


- about the two vampirtrace-specific spots in the .m4 files: they correspond
to two tasks: firstly, decide if you want vampirtrace at all (or if you might
want to update) and secondly, passing configure options to vampirtrace.
we need to do the first before the second, of course. maybe we can move
everything to "our" .m4 file, let me check ...


I would think that all OMPI-specific VT functionality should be in
one .m4 file.  Per my other mail, I think it should be in
contrib/vt/configure.m4.  This makes a nice, clean separation of m4
functionality and keeps it self-contained in the contrib/vt/ tree.


- btw: so far the vampirtrace distribution tarball is brought to openmpi
under ./tracing/vampirtrace with no modifications


Excellent.  That makes things considerably easier.


- the mpicc-vt (and friends) compiler wrappers: this is not part of
vampirtrace but a new thing that only makes sense together with openmpi.
therefore, they stay next to 'mpicc' and all others. in fact we're following
an earlier suggestion from you, Jeff: 'mpicc-vt' is just like 'mpicc' but
calls the 'vtcc' compiler wrapper instead of 'cc'.

this makes everything much simpler, because we can handle all special cases in
vtcc. the wrapper config for 'mpicc-vt' is almost a mere copy of mpicc's one.
therefore, I'd like to keep them where they are right now. is this o.k. with
everyone?


I like the idea of mpicc-vt (etc.) wrappers, but again, I think they  
should be consolidated in the contrib/vt tree.  There's no technical  
reason they need to be in the wrappers directory.


More specifically, I am uncomfortable with importing 3rd party  
packages that touch a whole bunch of places in the OMPI tree.  I am  
much more comfortable with 3rd party packages being self-contained.


I hope to have the libnbc integration done either this week or next  
as an example.  We're still far enough away from v1.3 release that  
this does not impact any release plans with VT.




--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335


Re: [OMPI devel] Module Design Concept

2007-10-09 Thread Richard Graham
One of the assumptions about the MTLs is that only a given MTL can handle
the message matching for communications.  This is done to accommodate
MPI-like network stacks that also handle the MPI message matching, and
which often do not expose the internal data structures used for matching.
Open MPI's point-to-point selection currently forces the choice of a
single PML, and if CM is chosen, only a single MTL.  Under those
constraints, any MTL-internal structs can be kept within the scope of the
MTL, without polluting the global namespace.
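
To make the scoping point concrete, here is a minimal sketch of queue handles kept inside a module struct instead of in file-level globals. Everything in it is hypothetical and simplified for illustration (example_mtl_module_t, msg_queue_t, and the helper functions are invented names); it is not the portals MTL code, and a real component would presumably use Open MPI's own list classes.

```c
#include <stddef.h>

/* Hypothetical queue entry; a real MTL would track far more state. */
typedef struct msg_entry {
    int tag;
    struct msg_entry *next;
} msg_entry_t;

/* Hypothetical singly linked FIFO queue. */
typedef struct {
    msg_entry_t *head;
    msg_entry_t *tail;
    size_t length;
} msg_queue_t;

/* Hypothetical module struct: the queues live in the module instance,
 * so no queue symbol is exported into the global namespace. */
typedef struct {
    msg_queue_t posted;
    msg_queue_t unexpected;
} example_mtl_module_t;

static void queue_init(msg_queue_t *q)
{
    q->head = NULL;
    q->tail = NULL;
    q->length = 0;
}

static void example_mtl_module_init(example_mtl_module_t *m)
{
    queue_init(&m->posted);
    queue_init(&m->unexpected);
}

/* Append an entry at the tail of the queue. */
static void queue_push(msg_queue_t *q, msg_entry_t *e)
{
    e->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = e;
    } else {
        q->head = e;
    }
    q->tail = e;
    q->length++;
}

/* Remove and return the first entry with a matching tag, or NULL. */
static msg_entry_t *queue_match(msg_queue_t *q, int tag)
{
    msg_entry_t *prev = NULL;
    for (msg_entry_t *cur = q->head; cur != NULL; prev = cur, cur = cur->next) {
        if (cur->tag != tag) {
            continue;
        }
        if (prev != NULL) {
            prev->next = cur->next;
        } else {
            q->head = cur->next;
        }
        if (q->tail == cur) {
            q->tail = prev;
        }
        cur->next = NULL;
        q->length--;
        return cur;
    }
    return NULL;
}
```

Every function takes the queue (or module) it operates on as an argument, so the matching state stays private to whichever MTL owns the module instance and nothing outside the component can name it.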

Rich


On 10/8/07 5:09 PM, "Sajjad Tabib" wrote:

> 
> Hi, 
> 
> I'm implementing a new MTL component that uses message queues to keep track of
> posted and unexpected messages. I intended to do this by creating two global
> queues, one for posted and one for unexpected, until I found that the portals
> MTL uses a different approach in their queue implementation. The portals code
> uses handles to the queues from inside their mca_mtl_portals_module_t to post
> messages. I couldn't help but wonder, why are the queue handles here? What are
> the design implications of defining these handle queues in this module struct
> rather than globally defining them?
> I'm an Open MPI newbie and sort of confused on the modular approach taken here
> and was hoping somebody could point out the pros and the cons of the two
> approaches. I guess my next question would be: In general, what would you put
> into a module struct and what wouldn't you?
> I will appreciate any pointers that you could give me to help me understand
> this concept. 
> 
> Thanks in advance,
> 
> Sajjad Tabib
> 




Re: [OMPI devel] Any info regarding sm availible?

2007-10-09 Thread Jeff Squyres

On Oct 3, 2007, at 12:43 PM, Torje Henriksen wrote:

I was wondering if you could point me to any information regarding these
components. I got the source code of course, but other than that.


I believe that there are no docs on how the sm btl works, but the  
people who wrote it are on this list.


I would also like to ask some more or less specific questions about these
and probably other parts of Open MPI. Is this the right place for such
questions?


Yes it is.  Sometimes we get busy (e.g., many of us have been at an  
MPI conference and a follow-on OMPI engineering meeting for the past  
1.5 weeks, so we've been a bit slackful on the email lists...), so if  
we don't answer your question within a few days, don't hesitate to  
ping us again...


Thanks for your time, open mpi seems like a very nice project, and it's
fun to be able to mess around with it :)


Thanks!  Hopefully it'll be helpful to you.

--
Jeff Squyres
Cisco Systems