Hello... I'm a student at the University of British Columbia working on
creating an SCTP BTL for Open MPI. I have a simple implementation
working that uses SCTP's one-to-one style sockets for sending messages.
The same writev()/readv() calls that are used in the TCP BTL are used in
this new BTL.
The next step is to allow large messages (300K+) to be sent over the
SCTP socket. Right now, I'm fragmenting the iovec pointing to the
message at the BTL level. Repeated calls to writev() are then made, with
the first call sending header information and a part of the message,
followed by sends of nothing but message data.
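To make the fragmentation concrete, here is a rough sketch of the idea
(illustrative only, not the actual BTL code). The helper name
next_chunk(), its parameters, and the max_chunk cap are all made up for
this example; it just builds the iovec for the next writev() call, with
the header included only on the first call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper (not Open MPI code): fill `out` with the iovec for
 * the next writev() call. `offset` is the number of payload bytes already
 * covered by earlier calls; `max_chunk` caps the payload bytes per call.
 * Returns the number of iovec entries filled in (0, 1, or 2). */
static int next_chunk(struct iovec out[2],
                      void *hdr, size_t hdr_len,
                      char *msg, size_t msg_len,
                      size_t offset, size_t max_chunk)
{
    int n = 0;
    size_t remaining, len;

    assert(offset <= msg_len && max_chunk > 0);
    remaining = msg_len - offset;
    len = remaining < max_chunk ? remaining : max_chunk;

    if (offset == 0) {            /* first call: header goes out first */
        out[n].iov_base = hdr;
        out[n].iov_len = hdr_len;
        n++;
    }
    if (len > 0) {                /* payload slice for this call */
        out[n].iov_base = msg + offset;
        out[n].iov_len = len;
        n++;
    }
    return n;
}
```

The caller would advance offset by the payload bytes that writev()
actually accepted, and call next_chunk() again until offset reaches
msg_len.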
My concern is that if a send is interrupted partway through the message
and then resumed, it will attempt to resend the message data from the
beginning, because the mca_btl_sctp_frag_t pointer is only aware of the
original, unfragmented iovec. I'm wondering whether extending the array
of iovec structures contained in the frag to hold the properly
fragmented message would cause havoc in the middleware. Currently the
array has 5 entries (as is the case for the TCP BTL). Would extending it
beyond 5, to allow proper bookkeeping in the event of an interrupted
send, create problems?
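For reference, the kind of bookkeeping I have in mind looks roughly like
the sketch below. It is only an illustration under my own assumptions:
iov_advance() is a hypothetical helper, not an existing Open MPI
function. After a short writev(), it consumes the bytes that were sent
from the front of the iovec array, so that a resumed send continues
where the last one stopped instead of starting over.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper (not Open MPI code): consume `sent` bytes from the
 * front of an iovec array, in place. Returns the index of the first entry
 * that still has unsent data, or `cnt` when everything has been sent. */
static int iov_advance(struct iovec *iov, int cnt, size_t sent)
{
    int i = 0;

    /* Skip every entry that was written out completely. */
    while (i < cnt && sent >= iov[i].iov_len) {
        sent -= iov[i].iov_len;
        iov[i].iov_len = 0;
        i++;
    }
    if (i < cnt) {
        /* Partially written entry: move its base past the sent bytes. */
        iov[i].iov_base = (char *)iov[i].iov_base + sent;
        iov[i].iov_len -= sent;
    } else {
        /* The caller must not report more bytes than the array held. */
        assert(sent == 0);
    }
    return i;
}
```

On resume, the next call would be writev(sd, iov + idx, cnt - idx), where
idx is the value returned by iov_advance() and sd is the socket.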
Any ideas on this matter would be greatly appreciated.
--
Karol Mroz
km...@cs.ubc.ca