Thanks a lot. You and Adrian have cleared up a lot of concepts for me, and it is
time for me to develop a functional framework. I will get back to you guys once I
am done with the framework and have more problems and/or conceptual issues.

The mca_btl_tcp_addr_t issue was resolved, as you correctly pointed out. I didn't
go into the details, but I think I must have corrupted the code somewhere.
A fresh tarball, followed by configure and "make all install", did the trick.

Best Regards,
Muhammad Atif

----- Original Message ----
From: Jeff Squyres <jsquy...@cisco.com>
To: Open MPI Developers <de...@open-mpi.org>
Sent: Saturday, January 19, 2008 11:54:09 AM
Subject: Re: [OMPI devel] btl tcp port to xensocket


On Jan 17, 2008, at 7:08 PM, Muhammad Atif wrote:

> Thanks again. Nope, at the moment I am doing the lame stuff, i.e.,
> simply changing the tcp code. So I have not created another btl
> component. I know it's not the recommended approach, but I just wanted to
> try before committing.

That makes perfect sense.  Ok, so you're not running into a component  
name collision within the modex; that's good.

> Apart from xensocket specific stuff, all that I have done inside the
> btl/tcp code is to change the structure
>
>  struct  mca_btl_tcp_addr_t {
>     struct in_addr addr_inet;     /**< IPv4 address in network byte order */
>     in_port_t      addr_port;     /**< listen port */
>     unsigned short addr_inuse;    /**< local meaning only */
>     int            xs_domU_ref;   /**< xs: domU memory reference */
> };
>
> I wanted this structure to be passed on to all peers through
> component exchange (modex send/recv).  This way I have the normal
> socket listen port, its address and the xensocket memory reference (it's
> not complete as it is missing some other info, but let's stick to
> basic stuff).

Sounds reasonable.

> The second question is regarding btl tcp recv. I have seen a couple
> of emails with some explanation specific to that particular user but
> cannot seem to answer this question (ref to previous email).

 > Second question is regarding the receive part of Open MPI. In my
 > understanding, once the Recv API is called, control goes through the PML
 > layer and everything initializes there. However, I am unable to get
 > a lock on the layer/file/function where the receive socket polling
 > is done. There are callbacks, but where or how exactly does Open MPI
 > know that a message has in fact arrived? Any pointer will do :)

All file descriptor processing is handled by libevent down in opal.
libevent is a third-party library that we imported into Open MPI (and
modified a bit) that handles generic fd issues.  For example, we
register fd's with libevent and tell libevent that we want callbacks
when the fd is ready for reading or writing (depending on the context).

libevent's event loop is invoked by opal_progress(), which is called
in lots of places.  Hence, the tcp btl can be called back whenever
opal_progress() is invoked, because opal_progress() will invoke
libevent, and if any socket fd's that the tcp btl registered are
ready for reading, or if there are pending writes on some
socket fd's and those fd's are ready for writing, their callbacks will
be invoked.
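
As a rough illustration of the register-then-callback pattern described
above, here is a minimal sketch built on plain POSIX poll() rather than on
libevent itself. None of these names (event_register, progress_once,
on_readable) exist in Open MPI or libevent; they are hypothetical stand-ins
for "register an fd with a callback" and "one pass of the progress loop".

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_EVENTS 16   /* hypothetical table size for this sketch */

/* Hypothetical callback type: invoked when the fd becomes ready. */
typedef void (*event_cb_t)(int fd, void *ctx);

struct event_entry {
    int        fd;
    short      events;  /* POLLIN and/or POLLOUT */
    event_cb_t cb;
    void      *ctx;
};

static struct event_entry registry[MAX_EVENTS];
static size_t             registry_len = 0;

/* Register interest in an fd. Returns 0 on success, -1 if the table is full. */
int event_register(int fd, short events, event_cb_t cb, void *ctx)
{
    assert(cb != NULL);
    if (registry_len == MAX_EVENTS) {
        return -1;
    }
    registry[registry_len++] = (struct event_entry){ fd, events, cb, ctx };
    return 0;
}

/* One non-blocking pass over all registered fds (the role opal_progress()
 * plays in the description above). Returns the number of callbacks fired,
 * or -1 if poll() fails. */
int progress_once(void)
{
    struct pollfd pfds[MAX_EVENTS];
    int fired = 0;

    for (size_t i = 0; i < registry_len; i++) {
        pfds[i].fd      = registry[i].fd;
        pfds[i].events  = registry[i].events;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)registry_len, 0) < 0) {
        return -1;
    }
    for (size_t i = 0; i < registry_len; i++) {
        if (pfds[i].revents & registry[i].events) {
            registry[i].cb(registry[i].fd, registry[i].ctx);
            fired++;
        }
    }
    return fired;
}

/* Example receive callback: read one byte and count it. */
void on_readable(int fd, void *ctx)
{
    char byte;
    int *count = ctx;

    if (read(fd, &byte, 1) == 1) {
        (*count)++;
    }
}
```

In this sketch nothing happens until progress_once() is called, which
mirrors the point above: the callback only fires when the progress loop
runs and finds the fd ready.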

Make sense?

> PS: I would love if you do some explanation of modex recv as well. ;)
> Thanks for all the support you guys are giving.

I think Adrian was referring to how the modex works.  Remember that
the modex send is just a local memcpy; all the modex data is then
glommed up into a single network send communication later.  After
that, each process gets a big network message with *everyone's* modex
data, which is then split up and categorized by component and sender.
The modex receive is then another memcpy.
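
The three phases above (local copy on send, one bulk exchange, local copy
on receive) can be sketched in a single process as follows. This is only
an illustration under stated assumptions: the sketch_modex_* functions,
NPROCS and MAX_BLOB are hypothetical, the "exchange" is a plain memcpy
standing in for the real network collective, and only one component is
modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPROCS    2    /* hypothetical number of peers */
#define MAX_BLOB 64    /* hypothetical per-peer payload limit */

/* One peer's modex payload for a single component. */
struct modex_slot {
    size_t        len;
    unsigned char data[MAX_BLOB];
};

static struct modex_slot staged[NPROCS];     /* filled by the local "send" */
static struct modex_slot exchanged[NPROCS];  /* visible after the exchange */

/* "Send": only copies into a local staging buffer; no network traffic. */
int sketch_modex_send(int rank, const void *buf, size_t len)
{
    assert(rank >= 0 && rank < NPROCS);
    if (len > MAX_BLOB) {
        return -1;
    }
    memcpy(staged[rank].data, buf, len);
    staged[rank].len = len;
    return 0;
}

/* Stand-in for the single bulk exchange of everyone's staged data. */
void sketch_modex_exchange(void)
{
    memcpy(exchanged, staged, sizeof(staged));
}

/* "Receive": copies the peer's blob out. The length reported is whatever
 * the sender published, not what the receiver expects. */
int sketch_modex_recv(int peer, void *buf, size_t cap, size_t *len)
{
    assert(peer >= 0 && peer < NPROCS);
    if (exchanged[peer].len > cap) {
        return -1;
    }
    memcpy(buf, exchanged[peer].data, exchanged[peer].len);
    *len = exchanged[peer].len;
    return 0;
}
```

Note that the receiver learns the payload size from the sender's side,
which is why a stale build on either end shows up as an unexpected size
on receive.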

So as to why you're still getting sizeof(mca_btl_tcp_addr_t)==8 in the
tcp modex receiver, the only thing I can think of is that you somehow
didn't recompile properly.  Did you try making clean in the tcp btl
dir and then a "make all" to ensure that everything recompiled
properly with your modified struct in btl_tcp_addr.h?  Normally, the
build system should take care of such dependencies, but...
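
To make the size mismatch concrete, here is a small sketch comparing the
two layouts. The "before" struct is an assumption reconstructed from this
thread (the quoted struct minus xs_domU_ref); the struct names and
modex_size_matches() are hypothetical and not part of Open MPI. On common
platforms with a 4-byte int one would expect the sizes to be 8 and 12,
but that depends on the ABI.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Assumed layout before the change (the quoted struct minus xs_domU_ref). */
struct addr_before {
    struct in_addr addr_inet;
    in_port_t      addr_port;
    unsigned short addr_inuse;
};

/* Layout after adding the xensocket memory reference. */
struct addr_after {
    struct in_addr addr_inet;
    in_port_t      addr_port;
    unsigned short addr_inuse;
    int            xs_domU_ref;
};

/* Sanity check that the modified struct really is larger. */
void check_layout_grew(void)
{
    assert(sizeof(struct addr_after) > sizeof(struct addr_before));
}

/* Returns nonzero when a received modex blob has the size a receiver
 * built against the modified header would expect. */
int modex_size_matches(size_t received_len)
{
    return received_len == sizeof(struct addr_after);
}
```

A receiver that still sees the old size is being handed data from a
sender (or is itself an object file) compiled against the old header,
which is consistent with the stale-build explanation above.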

-- 
Jeff Squyres
Cisco Systems

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






      