I'm looking at the topic-netmalloc item in the TODO file. Here is my understanding of the problem; if somebody (Steve/Andrew) could confirm or correct what follows, that would be much appreciated:

For the APIs token_send, mcast_flush_send and mcast_noflush_send the message data is allocated on the stack in totemsrp.c. For the UDP Multicast and UDP Unicast drivers this data is simply transmitted using sendmsg(), but for the Infiniband driver it must be copied into a separate buffer that has been registered with libibverbs. In the case of the active rrp algorithm this copy potentially happens multiple times.

We want to eliminate the copy by creating the message in a buffer supplied by the driver instead of on the stack. The driver would be responsible for freeing the buffer once the packet was transmitted.

Assuming that I'm on essentially the right track here, a couple of questions arise:

* If the rrp algorithm is active or passive, the totemrrp_instance contains multiple totemnet_instances. Is it valid to assume that all net instances within a given rrp instance are of the same type (i.e. all Infiniband or all UDP), or can they be mixed?

* I'm not familiar with how libibverbs works; is it legitimate to use the same buffer to send multiple packets (with the same content) that are 'in-flight' at the same time? I'm assuming yes, since all of the headers that differ between packets are outside of that buffer and I can't think of a reason any lower layers would need to modify it. There would need to be some sort of reference count on the buffer, but that should not be too hard to implement.

thanks!

- Zane.

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
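
P.S. To make the reference-count idea concrete, here is a rough sketch of what I have in mind. All of the names (tx_buf, tx_buf_alloc, tx_buf_get, tx_buf_put) are hypothetical and do not exist in the tree. malloc() merely stands in for whatever the driver would really hand out; for Infiniband that would presumably be memory registered with libibverbs. It also assumes everything runs in a single thread, so the count is not atomic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted transmit buffer (not an existing type). */
struct tx_buf {
	unsigned int refcount;	/* one per owner or in-flight send */
	size_t len;		/* bytes of message data currently in use */
	size_t capacity;	/* bytes available in data[] */
	unsigned char data[];	/* the message is built directly in here */
};

/*
 * Driver-side allocation. ASSUMPTION: malloc() is a placeholder; a real
 * Infiniband driver would return memory it has already registered.
 * The caller starts out holding one reference.
 */
static struct tx_buf *tx_buf_alloc(size_t capacity)
{
	struct tx_buf *buf = malloc(sizeof(*buf) + capacity);

	if (buf == NULL) {
		return NULL;
	}
	buf->refcount = 1;
	buf->len = 0;
	buf->capacity = capacity;
	return buf;
}

/* Take an extra reference, e.g. once per interface the packet is queued on. */
static void tx_buf_get(struct tx_buf *buf)
{
	buf->refcount++;
}

/*
 * Drop a reference, e.g. from a send-completion handler.
 * Returns 1 if this was the last reference and the buffer was freed,
 * 0 if other sends still hold it.
 */
static int tx_buf_put(struct tx_buf *buf)
{
	assert(buf->refcount > 0);
	if (--buf->refcount == 0) {
		free(buf);
		return 1;
	}
	return 0;
}
```

With the active rrp algorithm, the sender would build the message once in buf->data, call tx_buf_get() for each interface it queues the packet on, and drop its own reference afterwards; the buffer goes away when the last send completes.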
