Okay guys, with all your support and help in understanding the ompi
architecture, I was able to get Xensocket to work. Only minor changes to the
xensocket kernel module were needed to make it compatible with libevent. The
results I am getting are bad, but I am sure that is because I still have to
clean up the code. At least my results have improved over the native
netfront-netback of Xen for messages larger than 1 MB.

I started by making minor changes to the TCP btl, but that does not seem to be
the best way: the changes are quite large, and it is better to have a separate,
dedicated btl for xensockets. As you guys might be aware, Xen supports live
migration, so now I have one stupid question. My understanding so far is that a
btl component is initialized only once. The scenario here is that my guest OS
is migrated from one physical node to another, and the communicating processes
are now on the same physical host; they should then abandon the TCP btl and
switch to the Xensocket btl. I am sure this would not happen out of the box,
but is it possible without making heavy changes to the Open MPI architecture?
With the current design, I am running a mix of tcp and xensocket btls, and the
endpoints check periodically whether they are on the same physical host. This
has quite a big penalty in terms of time.

Another question (good thing I am using email, otherwise you guys would beat
the hell outta me, it's such a basic question): I am not able to trace the
MPI_Recv(...) API call and similar calls. Inside MPI_Recv(..) we call
rc = MCA_PML_CALL(recv(buf, count ... ). This goes through the macro, and
pml.recv(..) gets invoked (mca_pml_base_module_recv_fn_t pml_recv;). Where can
I find the actual function? I get totally lost when trying to pinpoint what
exactly is happening. Basically, I am looking for the place where the tcp btl
recv gets called with all the goodies and parameters that were passed by the
MPI programmer. I hope I have made my question understandable.
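
The dispatch pattern in question, a macro that expands to a call through a
function-pointer field of a module structure, can be sketched in isolation.
Everything below is a toy model with hypothetical names (toy_pml_module_t,
TOY_PML_CALL, toy_ob1_recv, toy_select_component); it is not Open MPI source.
It only illustrates why the "actual function" is whichever one the selected
component stored in the pointer at initialization time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function-pointer type, standing in for a recv entry point. */
typedef int (*toy_recv_fn_t)(void *buf, int count);

/* Hypothetical module structure: one field per operation. */
typedef struct {
    toy_recv_fn_t pml_recv;
} toy_pml_module_t;

/* Global module instance, filled in when a component is selected. */
toy_pml_module_t toy_pml;

/* Token pasting turns TOY_PML_CALL(recv(a, b)) into toy_pml.pml_recv(a, b). */
#define TOY_PML_CALL(a) toy_pml.pml_ ## a

/* Hypothetical concrete implementation supplied by one component. */
static int toy_ob1_recv(void *buf, int count)
{
    memset(buf, 0x2a, (size_t)count);  /* pretend data arrived */
    return count;
}

/* Selection step: the component installs its function in the module. */
void toy_select_component(void)
{
    toy_pml.pml_recv = toy_ob1_recv;
}

/* Stand-in for the top-level API call that only sees the macro. */
int toy_mpi_recv(void *buf, int count)
{
    assert(toy_pml.pml_recv != NULL);  /* a component must be selected first */
    return TOY_PML_CALL(recv(buf, count));
}
```

Calling toy_select_component() and then toy_mpi_recv(buf, 4) ends up in
toy_ob1_recv. In a real source tree, searching for the place where the pointer
field is assigned is usually the quickest route to the concrete function.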

Best Regards,
Muhammad Atif

----- Original Message ----
From: Brian W. Barrett <brbar...@open-mpi.org>
To: Open MPI Developers <de...@open-mpi.org>
Sent: Wednesday, February 6, 2008 2:57:31 AM
Subject: Re: [OMPI devel] xensocket - callbacks through OPAL/libevent

On Mon, 4 Feb 2008, Muhammad Atif wrote:

> I am trying to port xensockets to openmpi. In principle, I have the 
> framework and everything, but there seems to be a small issue: I cannot 
> get libevent (or OPAL) to give callbacks for receive (or send) for 
> xensockets. I have tried to implement native code for xensockets with 
> the libevent library, and hit the same issue: no callbacks! With normal 
> sockets, callbacks do come easily.
>
> So the question is, do the socket/file descriptors have to have some 
> special mechanism attached to them to support callbacks for 
> libevent/opal, i.e. some structure/magic? Maybe the developers of 
> xensockets did not add that callback/interrupt thing at the time of 
> creation. Xensockets is open source, but my knowledge about these issues 
> is limited. So I thought some pointer in the right direction might be 
> useful.

Yes and no :).  As you discovered, the OPAL interface just repackages a 
library called libevent to handle its socket multiplexing.  Libevent can 
use a number of different mechanisms to look for activity on sockets, 
including select() and poll() calls.  On Linux, it will generally use 
poll().  poll() requires some kernel support to do its thing, so if 
Xensockets doesn't implement the right magic to trigger poll() events, 
then libevent won't work for Xensockets.  There's really nothing you can 
do from the Open MPI front to work around this issue -- it would have to 
be fixed as part of Xensockets.
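
One way to see whether a descriptor type triggers poll() events at all is to
test it from user space, outside libevent. The sketch below is only an
illustration: fd_reports_readable and demo_pipe_readiness are hypothetical
helper names, and an ordinary pipe stands in for a Xensocket descriptor, since
the Xensocket setup calls are not shown in this thread.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Returns 1 if poll() reports POLLIN on fd within timeout_ms,
 * 0 if it does not (timeout, or only error/hangup flags set),
 * and -1 if poll() itself fails.
 */
int fd_reports_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);  /* poll() silently ignores negative descriptors */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0) {
        return -1;
    }
    if (rc == 0) {
        return 0;
    }
    return (pfd.revents & POLLIN) ? 1 : 0;
}

/*
 * Demonstration with a pipe: the read end is not readable before a
 * write and becomes readable afterwards. Returns 0 if both hold,
 * 1 if the readiness checks disagree, -1 on a system call failure.
 */
int demo_pipe_readiness(void)
{
    int fds[2];
    int before, after;

    if (pipe(fds) != 0) {
        return -1;
    }
    before = fd_reports_readable(fds[0], 0);
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    after = fd_reports_readable(fds[0], 100);
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : 1;
}
```

If the same check on a device descriptor never returns 1 even though data is
known to be waiting, the driver is probably not signalling readiness to
poll(), and the fix belongs in the kernel module.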

> Second question is, what if we cannot have the callbacks. What is the 
> recommended way to implement the btl component for such a device? Do we 
> need to do this with event timers?

Have a look at any of the BTLs that isn't TCP -- none of them use libevent 
callbacks for progress.  Instead, they provide a progress function as part 
of the BTL interface, which is called on a regular basis whenever progress 
needs to be made.
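
The progress-function idea can be sketched as a small polling model. This is
an assumption-laden toy, not the Open MPI BTL interface: toy_device_t,
toy_progress_fn_t, toy_device_progress and toy_wait_for are hypothetical names
invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device that has to be polled. */
typedef struct {
    int pending;    /* messages waiting in the device */
    int delivered;  /* messages handed to the upper layer */
} toy_device_t;

/* Hypothetical progress hook: returns the number of completed events. */
typedef int (*toy_progress_fn_t)(void *ctx);

/* Drain everything currently pending; no callbacks or interrupts involved. */
static int toy_device_progress(void *ctx)
{
    toy_device_t *dev = ctx;
    int completed = 0;

    while (dev->pending > 0) {
        dev->pending--;
        dev->delivered++;
        completed++;
    }
    return completed;
}

/*
 * Caller side: keep invoking the progress hook until the wanted number
 * of events has completed or max_spins calls have been made.
 * Returns the number of events completed.
 */
int toy_wait_for(toy_progress_fn_t progress, void *ctx,
                 int wanted, int max_spins)
{
    int done = 0;
    int i;

    assert(progress != NULL);
    for (i = 0; i < max_spins && done < wanted; i++) {
        done += progress(ctx);
    }
    return done;
}
```

The point of the pattern is that the transport never has to wake anyone up:
whoever is waiting for a message calls the progress hook, so a device without
poll() support can still be driven.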

Brian






      