I'm looking at the topic-netmalloc item in the TODO file. Here is my 
understanding of the problem; if somebody (Steve/Andrew) could confirm or 
correct what follows, that would be much appreciated:

For the APIs token_send, mcast_flush_send and mcast_noflush_send, the message 
data is allocated on the stack in totemsrp.c. For the UDP Multicast and UDP 
Unicast drivers this data is simply transmitted using sendmsg(), but for the 
Infiniband driver it must first be copied into a separate buffer that has been 
registered with libibverbs. In the case of the active rrp algorithm this copy 
potentially happens multiple times. We want to eliminate the copy by creating 
the message in a buffer supplied by the driver instead of on the stack. The 
driver would then be responsible for freeing the buffer once the packet has 
been transmitted.
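
To make the proposal concrete, here is a rough sketch of what I have in mind. 
All of the names below (net_driver, buffer_alloc, token_send_sketch, 
token_msg) are made up for illustration and are not taken from the existing 
totem code. The stand-in driver just uses malloc(); the assumption is that an 
Infiniband driver would instead hand out memory from a region it has already 
registered with libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver interface; these names do not exist in the totem code. */
struct net_driver {
    /* Returns a buffer the driver is able to transmit from directly. */
    void *(*buffer_alloc)(struct net_driver *drv, size_t len);
    /* Takes ownership of buf and releases it once transmission is done. */
    int (*send)(struct net_driver *drv, void *buf, size_t len);
    size_t bytes_sent;        /* bookkeeping for this example only */
    int buffers_outstanding;  /* bookkeeping for this example only */
};

/* UDP-style stand-in: plain heap memory. An Infiniband driver would
 * (by assumption) return memory from a pre-registered region instead. */
static void *udp_buffer_alloc(struct net_driver *drv, size_t len)
{
    void *buf = malloc(len);
    if (buf != NULL) {
        drv->buffers_outstanding++;
    }
    return buf;
}

/* A real UDP driver would call sendmsg() here; this stub only records
 * the length and then frees the buffer, as the driver owns it now. */
static int udp_send(struct net_driver *drv, void *buf, size_t len)
{
    assert(drv->buffers_outstanding > 0);
    drv->bytes_sent += len;
    free(buf);
    drv->buffers_outstanding--;
    return 0;
}

/* Illustrative message layout, not the real token structure. */
struct token_msg {
    unsigned int seq;
    unsigned int ring_id;
};

/* Build the message in driver-supplied memory rather than on the stack,
 * so the driver never has to copy it before transmitting. */
static int token_send_sketch(struct net_driver *drv, unsigned int seq)
{
    struct token_msg *msg = drv->buffer_alloc(drv, sizeof(*msg));
    if (msg == NULL) {
        return -1;
    }
    memset(msg, 0, sizeof(*msg));
    msg->seq = seq;
    return drv->send(drv, msg, sizeof(*msg));
}
```

The caller never frees the buffer itself; ownership passes to the driver at 
send time, which matches the "driver frees after transmit" idea above.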

Assuming that I'm on essentially the right track here, a couple of questions 
arise:
 * If the rrp algorithm is active or passive, the totemrrp_instance contains 
multiple totemnet_instances. Is it valid to assume that all net instances 
within a given rrp instance are of the same type (i.e. all Infiniband or all 
UDP), or can they be mixed?
 * I'm not familiar with how libibverbs works; is it legitimate to use the same 
buffer to send multiple packets (with the same content) that are 'in-flight' at 
the same time? I'm assuming yes, since all of the headers that differ between 
packets are outside of that buffer and I can't think of a reason any lower 
layers would need to modify it. There would need to be some sort of reference 
count on the buffer, but that should not be too hard to implement.
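
For the reference count, something along these lines is what I am picturing. 
This is only a sketch: shared_buf and its helpers are hypothetical names, the 
freed_flag field exists purely so the example can observe the release, and the 
plain int counter assumes all sends and completions are handled from a single 
thread (it would need an atomic or a lock otherwise).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted packet buffer. */
struct shared_buf {
    int refs;             /* number of holders, including in-flight sends */
    int *freed_flag;      /* example only: set to 1 when the buffer is freed */
    size_t len;           /* size of the payload area */
    unsigned char data[]; /* message is built directly in here */
};

/* Allocate a buffer holding one reference, owned by the caller. */
static struct shared_buf *shared_buf_alloc(size_t len, int *freed_flag)
{
    struct shared_buf *b = malloc(sizeof(*b) + len);
    if (b == NULL) {
        return NULL;
    }
    b->refs = 1;
    b->freed_flag = freed_flag;
    b->len = len;
    return b;
}

/* Take one extra reference, e.g. once per interface the packet goes out on. */
static void shared_buf_get(struct shared_buf *b)
{
    assert(b->refs > 0);
    b->refs++;
}

/* Drop one reference, e.g. from a send-completion handler. The buffer is
 * freed when the last holder lets go. */
static void shared_buf_put(struct shared_buf *b)
{
    assert(b->refs > 0);
    if (--b->refs == 0) {
        if (b->freed_flag != NULL) {
            *b->freed_flag = 1;
        }
        free(b);
    }
}
```

With two interfaces, the sender would take one reference per interface before 
posting the sends and then drop its own; the buffer goes away only after both 
completions have been processed.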

thanks!
- Zane.