Re: [ofiwg] [libfabric-users] TX/RX data structures and data processing mode

Hefty, Sean Fri, 16 Mar 2018 09:51:47 -0700

Copying ofiwg -- that mailing list is better suited for your questions.

> My group is working on implementing a new libfabric provider for our HPC
> interconnect. Our current main goal is to run MPICH and OpenMPI over
> this provider.


welcome!

> The problem is that this NIC doesn't have any software or hardware rx/tx
> queues for send/recv operations. We've decided to implement them at the
> libfabric provider level. So I'm looking for a data structure for queue
> storage and processing.
> 
> I took a look at the sockets provider code. As far as I understand, tx_ctx
> stores pointers to all the information (flags, data, src_address, etc.)
> about every message to send in a ring buffer, but rx_ctx stores every
> rx_entry in a doubly linked list. What was the motivation for using
> different data structures for the tx and rx queues?

Please look at the code in prov/util for help.  The sockets provider was 
designed as a development tool, so I wouldn't recommend trying to copy its 
implementation.

The udp provider is a good place to start for how to construct a very simple 
software provider.  You may also want to scan the include/ofi_xxx.h files for 
helpful abstractions.  There's a slightly out of date document in 
docs/providers that describes what's available.  ofi_list.h and ofi_mem.h both 
have useful abstractions.

> Maybe you can give advice on the implementation of queues or give some
> useful information on this topic?

If you are attempting to implement reliable-datagram semantics, then the use of 
lists may be better than a queue.  Messages may complete out of order when 
targeting different peers.
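
To illustrate the point about lists: below is a minimal, self-contained sketch 
of a pending-operation list from which an entry can be unlinked from the middle 
when its peer completes first.  All names here (pend_entry, pend_list, 
pend_find, and so on) are hypothetical and are not libfabric API; a real 
provider would more likely build on the list helpers in ofi_list.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending-operation entry; not a libfabric type. */
struct pend_entry {
    struct pend_entry *prev, *next;
    uint64_t peer;  /* destination address of the message */
    uint64_t seq;   /* per-peer sequence number */
};

/* Circular doubly linked list with a sentinel head. */
struct pend_list {
    struct pend_entry head;
};

void pend_init(struct pend_list *l)
{
    l->head.prev = l->head.next = &l->head;
}

int pend_empty(const struct pend_list *l)
{
    return l->head.next == &l->head;
}

/* Append in posting order. */
void pend_push_tail(struct pend_list *l, struct pend_entry *e)
{
    assert(l != NULL && e != NULL);
    e->prev = l->head.prev;
    e->next = &l->head;
    l->head.prev->next = e;
    l->head.prev = e;
}

/* O(1) unlink from any position; this is what a ring buffer cannot do. */
void pend_remove(struct pend_entry *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->prev = e->next = e;
}

/* Linear search for the entry matching an acknowledgement from a peer. */
struct pend_entry *pend_find(struct pend_list *l, uint64_t peer, uint64_t seq)
{
    struct pend_entry *e;

    for (e = l->head.next; e != &l->head; e = e->next) {
        if (e->peer == peer && e->seq == seq)
            return e;
    }
    return NULL;
}
```

If messages are posted to peer 1, then peer 2, then peer 1 again, and peer 2 
acknowledges first, pend_find() followed by pend_remove() takes that entry out 
without disturbing the order of the entries still waiting on peer 1.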

Depending on your provider, you may also be able to take advantage of the 
utility providers.  RxM will implement reliable-datagram support over 
reliable-connections.  That is functional today.  RxD targets reliable-datagram 
over unreliable-datagram.  That is a work in progress, however.

> The second problem is about a suitable progress model. For CPU
> performance reasons I want to choose FI_PROGRESS_MANUAL as the primary
> mode for processing asynchronous requests, but I do not
> quite understand how an application thread provides data progress. For
> example, is it enough for the MPI implementation to call fi_cq_read()
> whenever it wants to make progress?

Yes, a call to fi_cq_read() from the app must be sufficient to drive progress.  
Note that in manual progress mode the app expects this to hold even when no 
completions are expected.
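
As a rough provider-side sketch of what that implies: the CQ read path runs 
the progress engine on every call, before it checks for completions.  The 
names below (demo_cq, demo_progress, demo_cq_read, DEMO_EAGAIN) are 
hypothetical and only stand in for a real provider's structures; the one detail 
taken from libfabric itself is that fi_cq_read() returns -FI_EAGAIN when no 
completions are available.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EAGAIN   11  /* stand-in for FI_EAGAIN; value is illustrative */
#define DEMO_CQ_DEPTH 8

/* Hypothetical completion queue; not a libfabric type. */
struct demo_cq {
    int entries[DEMO_CQ_DEPTH];  /* ids of completed operations */
    size_t head, tail;           /* ring indices; head == tail means empty */
    int inflight;                /* work finished by the NIC, not yet processed */
};

/* Progress engine: turn finished work into completion entries. */
void demo_progress(struct demo_cq *cq)
{
    while (cq->inflight > 0 && cq->tail - cq->head < DEMO_CQ_DEPTH) {
        cq->entries[cq->tail % DEMO_CQ_DEPTH] = cq->inflight;
        cq->tail++;
        cq->inflight--;
    }
}

/* Read up to count completions, driving progress first. */
long demo_cq_read(struct demo_cq *cq, int *buf, size_t count)
{
    size_t n = 0;

    assert(cq != NULL);

    /* Always make progress, even if the caller expects no completions. */
    demo_progress(cq);

    while (n < count && cq->head != cq->tail) {
        buf[n++] = cq->entries[cq->head % DEMO_CQ_DEPTH];
        cq->head++;
    }
    return n ? (long)n : -DEMO_EAGAIN;
}
```

With this shape, an MPI library that polls the CQ in its own progress loop 
keeps the provider moving without any provider-owned thread.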

- Sean
_______________________________________________
ofiwg mailing list
[email protected]
http://lists.openfabrics.org/mailman/listinfo/ofiwg