My module is close to completion (though I still need to fix some shared-memory issues before I can start testing, but that's a different thread). I'm trying to understand how exactly the fragments are returned to the application once they are received.

In btl_tcp.c, the function mca_btl_tcp_get() seems to be unused... and it calls mca_btl_tcp_endpoint_send().
I've stumbled upon the following snippet (btl_tcp_endpoint.c:715):

                btl_endpoint->endpoint_recv_frag = NULL;
                if( MCA_BTL_TCP_HDR_TYPE_SEND == frag->hdr.type ) {
                    mca_btl_active_message_callback_t* reg;
                    reg = mca_btl_base_active_message_trigger + frag->hdr.base.tag;
                    reg->cbfunc(&frag->btl->super, frag->hdr.base.tag, &frag->base, reg->cbdata);
                }
This calls a callback function, which I assume notifies the upper layer of a message, but this is only for MCA_BTL_TCP_HDR_TYPE_SEND.
What about MCA_BTL_TCP_HDR_TYPE_PUT?
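
For reference, here is my rough sketch of what a callback registered
behind mca_btl_base_active_message_trigger[tag] might look like, matching
the four arguments passed to cbfunc above (the name my_upper_layer_recv_cb
and the des_dst usage are my own reading of btl.h, not actual OMPI code):

    #include "ompi/mca/btl/btl.h"

    /* sketch only: signature per mca_btl_base_module_recv_cb_fn_t */
    static void my_upper_layer_recv_cb(struct mca_btl_base_module_t *btl,
                                       mca_btl_base_tag_t tag,
                                       mca_btl_base_descriptor_t *descriptor,
                                       void *cbdata)
    {
        /* for a received SEND fragment, des_dst describes the payload */
        mca_btl_base_segment_t *segment = descriptor->des_dst;
        void *payload = segment->seg_addr.pval;
        size_t length = segment->seg_len;

        /* hand (payload, length) off to the matching logic */
        (void)btl; (void)tag; (void)cbdata; (void)payload; (void)length;
    }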

Thanks,
Alex

On 03/04/2012 02:54 AM, George Bosilca wrote:
On Mar 3, 2012, at 18:18 , Alex Margolin wrote:

I've figured out that what I really need is to write my own BTL component, rather 
than trying to manipulate the existing TCP one. I've started writing it using 
the 1.5.5rc3 tarball and some PDFs from 2006 I found on the website (anything 
else I can look at? TCP is much more complicated than what I'm writing). I 
think I'm getting the hang of it, but I still have some questions about 
terminology for the component implementation:

The basic data structures for routing fragments are components, modules, 
interfaces and endpoints, right?
Are you trying to route fragments through intermediary nodes? If yes, then I 
might have a patch somewhere supporting routing for send/recv protocols.

So, if I have 3 nodes, each with 2 interfaces (each having one constant IP), 
and I'm running 2 processes total, I'll have... 1 component, 2 modules, 4 
interfaces (2 per module) and 4 addresses?
What about "links" (as in the "num_of_links" component struct member)? What 
does it count?
The number of sockets to be opened per device. In some cases (for example, when 
there is a hypervisor) a single socket is not enough to use the device 
fully. If I remember correctly, on the PS3 three sockets were needed to get the 
900 Mb/s out of the 1 Gb Ethernet link.
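(For what it's worth, the TCP BTL exposes this as the btl_tcp_links MCA 
parameter, so something like "mpirun --mca btl_tcp_links 3 ..." should set 
it, if memory serves.)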

ompi_modex_send - Is it supposed to share the addresses of all the running 
processes before they start? Suppose I assume one NIC per machine. Can I just 
send an array of mca_btl_tcp_addr_t, so that every process finds the one 
belonging to it by some index (its rank?)? I saw the ompi_modex_recv() call in 
_proc.c, and it seems that every proc instance reads the entire sent buffer 
anyway.
Right, the modex is used to exchange the "business card" of each process.
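
In skeleton form the pattern is roughly this (a sketch against the
1.5-era modex API; everything named "mybtl" is a placeholder, not real
OMPI code):

    #include <stdint.h>
    #include "ompi/mca/btl/btl.h"
    #include "ompi/proc/proc.h"
    #include "ompi/runtime/ompi_module_exchange.h"

    typedef struct { uint32_t addr; uint16_t port; } mybtl_addr_t;  /* placeholder */
    extern mca_btl_base_component_t mca_btl_mybtl_component;        /* placeholder */

    /* component init: publish this process's addresses (its "business card") */
    static int mybtl_publish(mybtl_addr_t *addrs, size_t count)
    {
        return ompi_modex_send(&mca_btl_mybtl_component.btl_version,
                               addrs, count * sizeof(mybtl_addr_t));
    }

    /* add_procs(): fetch one peer's card; each process receives the
       peer's whole buffer and picks out the entries it needs */
    static int mybtl_lookup(ompi_proc_t *proc, mybtl_addr_t **addrs, size_t *size)
    {
        return ompi_modex_recv(&mca_btl_mybtl_component.btl_version,
                               proc, (void **)addrs, size);
    }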

Sorry for flooding you all with questions; I hope I'm not way off here. I hope 
I'll finish writing something by the end of next week (I'm working on this 
after hours, not full time), with the purpose of submitting it as a 
contribution to Open MPI.
Looking forward to it.

george.

Appreciate your help so far,
Alex

On 03/02/2012 09:26 PM, Jeffrey Squyres wrote:
Give your BTL a progress function.  It'll get called quite frequently.

Look at the "progress" section in btl.h.  Progress threads don't work yet, but 
the btl_progress function will get called by the PML quite frequently.  It's how BTL's 
like openib progress their outstanding message passing.
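
In skeleton form (anything named "mybtl" is a placeholder):

    /* hung off the btl_progress member of the component struct;
       called by the PML on every progress pass, so it must be cheap
       when there is nothing to do */
    static int mca_btl_mybtl_component_progress(void)
    {
        int completed = 0;

        /* non-blocking poll of the underlying library:
           - for each finished receive, look up
             mca_btl_base_active_message_trigger[tag] and call cbfunc()
           - for each finished send, call the descriptor's des_cbfunc() */

        return completed;  /* number of events progressed this call */
    }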



On Mar 2, 2012, at 2:22 PM, Alex Margolin wrote:

On 03/02/2012 04:33 PM, Jeffrey Squyres wrote:
Note that the OMPI 1.4.x series is about to be retired.  If you're doing new 
stuff, I'd advise you to be working with the Open MPI SVN trunk.  In the trunk, 
we've changed how we build libevent, so if you're adding to it, you probably 
want to be working there for max forward-compatibility.

That being said:

I know trying to replace poll() seems like I'm doing something very wrong, but 
I want to poll on events without a valid Linux file descriptor (and on existing 
events, specifically sockets, at the same time), and I see no other way. 
Obviously, my poll2 calls the Linux poll() in most cases.
What exactly are you trying to do?  OMPI has some internal hooks for 
non-fd-or-event-based progress.  Indeed, libevent is typically called with 
fairly low frequency (e.g., if you're running with OpenFabrics or some other 
high-speed/not-fd-based networking interconnect).
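
(One such hook is opal_progress_register() from
opal/runtime/opal_progress.h; a sketch, where my_library_poll() is a
made-up stand-in for your event source:)

    #include "opal/runtime/opal_progress.h"

    /* polled on every pass through opal_progress() */
    static int my_library_poll(void)
    {
        int handled = 0;
        /* non-blocking check of the custom event source; service
           whatever completed and count it */
        return handled;  /* the count lets opal_progress adapt its polling */
    }

    /* during component startup: opal_progress_register(my_library_poll); */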
I'm trying to create a new BTL module. I've written an adapter from my library 
to TCP, so I've implemented socket/connect/accept/send/recv... Now I've taken 
the TCP BTL module and cloned it, replacing the relevant calls with mine. My 
only problem is with poll, which is not in the MCA (at least in 1.4.x).
I've implemented poll() and select(), but that's not great, because my events 
are not based on valid Linux file descriptors; I can poll all my events at 
the same time, but not in conjunction with real FDs, unfortunately.
Can you give me some pointers as to where to look in the Open MPI (1.5?) source 
code to implement this properly?

Thanks,
Alex
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel