On Mar 9, 2008, at 6:13 AM, Muhammad Atif wrote:
Okay guys... with all your support and help in understanding the OMPI architecture, I was able to get Xensocket to work. Only minor changes to the Xensocket kernel module were needed to make it compatible with libevent. The results I am getting are bad, but I am sure I still have to clean up the code. At least my results have already improved over Xen's native netfront/netback for messages larger than 1 MB.
Great! Be aware that we are in the process of updating the version of libevent that is included in Open MPI. As part of this process, we are re-enabling the more scalable fd-monitoring mechanisms (such as epoll and friends). Do you know if xensockets play nicely with epoll?
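If you want a quick way to find out, the check is cheap to do outside of OMPI entirely. Here's a standalone sketch using the plain Linux epoll API (nothing OMPI-specific; swap in however you create a xensocket fd) -- epoll_ctl() fails with EPERM for descriptor types that don't support polling:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* stand-in fd -- substitute a xensocket descriptor here */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int epfd = epoll_create(8);
    struct epoll_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN | EPOLLOUT;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        /* EPERM means this fd type does not support polling */
        printf("epoll_ctl failed: %s\n", strerror(errno));
    } else {
        printf("fd plays nicely with epoll\n");
    }
    close(epfd);
    close(fd);
    return 0;
}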
I started by making minor changes in the TCP BTL, but it seems that is not the best way: the changes are quite large, and it is better to have a separate, dedicated BTL for xensockets. As you guys might be aware, Xen supports live migration, so now I have one stupid question. My knowledge so far suggests that a BTL component is initialized only once.
Correct.
The scenario here is: if my guest OS is migrated from one physical node to another, the communicating processes may find that they are now on one physical host, and they should abandon the TCP BTL and use the Xensocket BTL instead. I am sure this would not happen out of the box, but is it possible without making heavy changes to the Open MPI architecture? With the current design, I am running a mix of TCP and Xensocket BTLs, and the endpoints check periodically whether they are on the same physical host. This has quite a big penalty in terms of time.
Josh Hursey has been doing much of the checkpoint/restart and migration work -- I'll let him answer this...
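In the meantime, if the periodic "are we on the same physical host?" test is what's costing you, one common trick is to cache the answer per endpoint and only re-run the real test on a coarse timer (or after a migration event). A rough sketch -- the type and function names below are made up for illustration, not existing OMPI symbols:

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical per-endpoint state -- not an existing OMPI struct. */
typedef struct {
    bool same_host;      /* cached result of the expensive test */
    time_t last_checked; /* when we last ran the real test */
} xen_endpoint_t;

/* Placeholder for the expensive test; real code would compare
 * physical-host identifiers obtained from the hypervisor. */
static bool xen_same_host_check(xen_endpoint_t *ep)
{
    (void) ep;
    return false;
}

#define XEN_RECHECK_INTERVAL 5 /* seconds between real re-tests */

static bool endpoint_same_host(xen_endpoint_t *ep)
{
    time_t now = time(NULL);
    if (now - ep->last_checked >= XEN_RECHECK_INTERVAL) {
        ep->same_host = xen_same_host_check(ep);
        ep->last_checked = now;
    }
    return ep->same_host; /* fast path: just read the cache */
}

int main(void)
{
    xen_endpoint_t ep = { false, 0 };
    printf("same host: %d\n", endpoint_same_host(&ep));
    return 0;
}

The point is just that the expensive test runs once per interval instead of on every send, so the common case is a cheap struct read.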
Another question (good thing I am using email, otherwise you guys would beat the hell outta me -- it's such a basic question): I am not able to trace the MPI_Recv(...) API call and calls like it. In the code of MPI_Recv(...) we make the call rc = MCA_PML_CALL(recv(buf, count, ...)). This goes through the macro, and pml.recv(...) gets invoked (mca_pml_base_module_recv_fn_t pml_recv). Where can I find the actual function? I get totally lost trying to pinpoint what exactly is happening. Basically, I am looking for the place where the TCP BTL recv gets called with all the goodies and parameters that were passed by the MPI programmer. I hope I have made my question understandable.
Sorry about all the function pointers -- it's how we have to do this because of all the plugins...
In the OB1 case, it goes to mca_pml_ob1_recv() (and mca_pml_ob1_irecv() for the non-blocking case). See ompi/mca/pml/ob1/pml_ob1.c for the big function table that is passed back out of the OB1 module. This pattern is repeated for most/all components in OMPI: when a component is initialized, it passes back a table of function pointers for its module that the upper-level code can call.
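If it helps, here is the pattern in miniature -- deliberately simplified, not the actual OMPI structs or macro (the real definitions live in ompi/mca/pml/pml.h):

#include <stdio.h>

/* A recv-like signature, standing in for
 * mca_pml_base_module_recv_fn_t. */
typedef int (*pml_recv_fn_t)(void *buf, int count);

/* The "module": a table of function pointers. */
typedef struct {
    pml_recv_fn_t pml_recv;
} pml_module_t;

/* One component's (e.g., OB1's) implementation of recv. */
static int ob1_recv(void *buf, int count)
{
    printf("ob1_recv: buf=%p count=%d\n", buf, count);
    return 0;
}

/* Component init hands its function table back to the upper layer. */
static pml_module_t *ob1_component_init(void)
{
    static pml_module_t ob1_module = { ob1_recv };
    return &ob1_module;
}

/* The selected module; MCA_PML_CALL-style dispatch goes through it. */
static pml_module_t *mca_pml;
#define MCA_PML_CALL(call) mca_pml->pml_##call

int main(void)
{
    char buf[16];
    mca_pml = ob1_component_init();     /* component selection */
    return MCA_PML_CALL(recv(buf, 16)); /* lands in ob1_recv() */
}

So when you hit MCA_PML_CALL(recv(...)) in MPI_Recv(), the thing to grep for is where the selected module's table gets filled in -- for OB1, that table points at mca_pml_ob1_recv().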
--
Jeff Squyres
Cisco Systems