Hello everyone, I'm a regular OpenMPI user but I'm new to the strange world of development and hence this mailing list. I'm currently working on a project that involves OpenMPI and I was wondering if I might get some guidance and pointers in the right direction. In brief, my project is trying to make an efficient transport module in OpenMPI for use in virtualised environments (specifically Xen). At the moment, Xen has a very inefficient way of inter-domain communication, and although a myriad of solutions have appeared, they are all usually quite application specific. I'm basing my project on a paper I read about efficient message passing in interdomain virtualised guests. The problem I'm having is jumping into the OpenMPI code. I've read two papers I found on the homepage: "Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation" and "TEG: A High-Performance, Scalable, Multi-Network Point-to-Point Communications Methodology" which gave me some insight about the MCA, PML and PTL. However, I'm finding it quite difficult to get a foothold into the codebase and I'm wondering if anyone might be able to point me to a guide or some documentation that might help get me started.
I have a vague idea that I'm going to need to make a specific PML that uses my own PTL for inter-domain Xen guests. Should the PTL fail (e.g. if the guest migrated to another physical machine), the PML would then switch to TCP/IP instead. Migrating back and switching back to my own PTL sounds trickier. I've read there's an out of band communication socket between all guests for this sort of thing, but once again I'm a bit lost in the detail. I'm very eager to do this project well and contribute to the OpenMPI community, and if anyone has some advice or pointers I'd really appreciate it. Kind regards Tim