Jon raises some good questions about this proposal, but I think they can be addressed.
I think the issue of message fragmentation (which I had neglected to consider) can be handled by ensuring the optimization is not done for fragmented messages. As far as I can tell, the only messages for which the source droppable and destination droppable bits are defined are data messages (user 0 to 3) and CONN_MANAGER messages (user 8), meaning message fragments will always be transmitted in a reliable manner -- even if they contain unreliable application data. Consequently, the optimization will benefit SOCK_DGRAM messages that are less than 1500 bytes (or 9KB, if jumbo frames are supported); larger messages will still be transmitted in a reliable manner, but since they will incur a performance hit due to reassembly anyway, I'm not sure how much difference the "no copy" optimization would have made.

I agree that, in the long run, we probably need to modify the link protocol to allow fragmented messages to be sent unreliably. If we eventually implement the "routed links" concept that we discussed earlier this year (i.e. having a single pair of link endpoints carrying all traffic between a pair of nodes, even if multiple bearers are used or if the nodes are not directly connected), it would simplify the job of identifying and cleaning up messages for which fragments have been lost, because there would never be more than one partially reassembled message per link endpoint at any time.

As for testing, we haven't done much yet, since we only coded enough of the change to prove that there were measurable gains to be had by avoiding the copy operation. Before proceeding further, I wanted to get your feedback -- which has been most useful! Our next step will be to code up the rest of the change and test it more fully.
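The restriction described above (skip the copy only for unfragmented, droppable data messages) might be sketched roughly as follows. This is only an illustration under stated assumptions, not TIPC code: `struct msg_info`, its fields, and `can_send_without_copy()` are hypothetical names, the size test against the link MTU stands in for "would this message be fragmented", and treating only users 0 to 3 as eligible (leaving CONN_MANAGER on the reliable path) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a message; NOT TIPC's real header layout. */
struct msg_info {
    unsigned int user;    /* message user; 0..3 are data messages per the text */
    bool src_droppable;   /* source droppable bit */
    size_t size;          /* total message size in bytes */
};

/*
 * Hypothetical helper: returns true if the message could take the
 * "no copy" path, i.e. be sent without being cloned and without being
 * held in the link's unacknowledged message queue.
 */
bool can_send_without_copy(const struct msg_info *m, size_t link_mtu)
{
    if (m->user > 3)          /* assumption: only data messages qualify */
        return false;
    if (!m->src_droppable)    /* reliable traffic keeps the existing path */
        return false;
    if (m->size > link_mtu)   /* would be fragmented: stay reliable */
        return false;
    return true;
}
```

With a 1500-byte MTU, a 1000-byte source-droppable data message would qualify under this sketch, while a 66000-byte one would not, since it would have to be fragmented.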
I anticipate that the impact of a high link loss rate with this change in place will be essentially the same as before, as we will be replacing the previous skb_clone() operation needed to resend a message that is sitting in a link's unacknowledged message queue with an alloc_skb() operation needed to create a message that isn't in the queue.

Regards,
Al

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jon Paul Maloy
Sent: Thursday, June 21, 2007 1:30 PM
To: Stephens, Allan
Cc: tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] TIPC Performance Improvement Idea for SOCK_DGRAM

Hi,

I am not completely happy with your proposal, but I cannot see any other way to achieve it with full backwards compatibility, which I admit must be a main objective. But I do think we should at the same time make a decision about where we want to go with this, and prepare the code accordingly. Once again we need to make use of the version information from the other end, and send the packets according to the following:

    if (remote_version < 1.7.xx) {
        allans_current_proposal();
    } else {
        long_term_solution();
    }

My proposal is not as intrusive as Allan seems to think, but it does involve both endpoints.

Brief summary of my proposal: We send each packet without sequence number, without adding it to the send queue, and without cloning the buffer. We also set the non-sequence bit, so that the receiving side can catch the packets and treat them separately.

The tricky part here is that we currently allow users to send source-droppable 66k messages, so we will need to provide support for fragmentation/defragmentation. I think this may be a serious problem with both proposals. With my proposal we would probably need to use a second sequence-number series and a separate fragment reception queue per link to be able to defragment. With Allan's proposal I also see problems.
What if a fragment in the middle of the message is lost, and a dummy-packet is retransmitted? It will never find its way to the de-fragmentation queue, so the unfinished message will be lingering until it is eventually cleaned up by the timer. Have you tested this Allan? What if the loss-frequency is very high? This must be investigated.

///jon

Stephens, Allan wrote:
> Jon Maloy had previously suggested modifying TIPC's link protocol so
> source droppable messages are sent out-of-band (i.e. without sequence
> number information), but this would involve a more substantial change
> to the link protocol that would affect both ends of the link. We may
> still want to go this route in the future, but I'm trying to see what
> can be done in the short term to address the performance concerns that
> have been raised by numerous TIPC users ...
>
> Regards,
> Al
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> tipc-discussion mailing list
> tipc-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion