Jon raises some good questions about this proposal, but I think they can
be addressed.

I think the issue of message fragmentation (which I had neglected to
consider) can be handled by ensuring the optimization is not done for
fragmented messages.  As far as I can tell, the only messages for which
the source droppable and destination droppable bits are defined are data
messages (user 0 to 3) and CONN_MANAGER messages (user 8), meaning
message fragments will always be transmitted in a reliable manner anyway
-- even if they contain unreliable application data.  Consequently, the
optimization will benefit SOCK_DGRAM messages that are less than 1500
bytes (or 9KB, if jumbo frames are supported); larger messages will
still be transmitted in a reliable manner, but since they already incur
a performance hit due to reassembly, I'm not sure how much difference
the "no copy" optimization would have made for them.
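
To make the condition concrete, here is a rough sketch of the check I
have in mind. It is not actual TIPC code: the struct, its fields, and
can_skip_copy() are hypothetical names, and I am assuming the decision
is keyed on the source droppable bit and on the message fitting within
the link MTU (passed in as a parameter rather than hard-coded).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a message; not the real TIPC header. */
struct msg_info {
    unsigned int user;   /* message user: 0..3 data, 8 CONN_MANAGER */
    bool src_droppable;  /* source droppable bit */
    size_t size;         /* total message size in bytes */
};

/* True if the droppable bits are defined for this message user. */
static bool droppable_bits_defined(unsigned int user)
{
    return user <= 3 || user == 8;
}

/*
 * Decide whether the "no copy" path may be used. A message larger than
 * the link MTU would be fragmented, so it stays on the reliable path.
 */
static bool can_skip_copy(const struct msg_info *m, size_t link_mtu)
{
    if (!droppable_bits_defined(m->user))
        return false;
    if (!m->src_droppable)
        return false;
    return m->size <= link_mtu;
}
```

With a 1500 byte MTU, a 1000 byte source droppable data message would
take the "no copy" path, while a 66000 byte one would not.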

I agree that, in the long run, we probably need to modify the link protocol
to allow fragmented messages to be sent unreliably.  If we eventually
implement the "routed links" concept that we discussed earlier this year
(i.e. having a single pair of link endpoints carrying all traffic
between a pair of nodes, even if multiple bearers are used or if the
nodes are not directly connected), it would simplify the job of
identifying and cleaning up messages for which fragments have been lost
because there would never be more than one partially reassembled message
per link endpoint at any time.
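
The cleanup idea could look roughly like the sketch below. This is only
an illustration, not existing TIPC code. It assumes that each fragment
carries a message id, a fragment index, and a fragment count, that
fragments arrive in order on the link, and that an endpoint holds at
most one partial message. All names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-endpoint state: at most one partial message. */
struct reasm_slot {
    bool active;            /* a partial message is being reassembled */
    unsigned int msg_id;    /* id of that message */
    unsigned int next_frag; /* index of the next fragment expected */
    unsigned int nfrags;    /* total number of fragments expected */
};

enum reasm_result { REASM_PENDING, REASM_COMPLETE, REASM_DROPPED };

/* Feed one received fragment into the endpoint's single slot. */
static enum reasm_result reasm_input(struct reasm_slot *s,
                                     unsigned int msg_id,
                                     unsigned int frag,
                                     unsigned int nfrags)
{
    if (frag == 0) {
        if (nfrags == 0)
            return REASM_DROPPED;
        /* A first fragment restarts reassembly; any stale partial
         * message left in the slot is implicitly discarded. */
        s->active = true;
        s->msg_id = msg_id;
        s->next_frag = 0;
        s->nfrags = nfrags;
    }

    /* A gap or a stray fragment means a fragment was lost: give up
     * on the partial message right away, no timer needed. */
    if (!s->active || s->msg_id != msg_id || s->next_frag != frag) {
        s->active = false;
        return REASM_DROPPED;
    }

    s->next_frag++;
    if (s->next_frag == s->nfrags) {
        s->active = false;
        return REASM_COMPLETE;
    }
    return REASM_PENDING;
}
```

Because there is only one slot, a lost fragment is noticed as soon as
the next fragment shows up, and the slot is immediately reusable.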

As for testing, we haven't done much yet since we only coded enough of
the change to prove that there were measurable gains to be had by
avoiding the copy operation.  Before proceeding further, I wanted to get
your feedback -- which has been most useful!  Our next step will be to
code up the rest of the change & test it more fully.  I anticipate that
the impact of a high link loss rate with this change in place will be
essentially the same as before, as we will be replacing the previous
skb_clone() operation needed to resend a message that is sitting in a
link's unacknowledged message queue with an skb_alloc() operation needed
to create a message that isn't in the queue.

Regards,
Al

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Jon Paul Maloy
Sent: Thursday, June 21, 2007 1:30 PM
To: Stephens, Allan
Cc: tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] TIPC Performance Improvement Idea for
SOCK_DGRAM

Hi,
I am not completely happy with your proposal, but I cannot see any
other way to achieve it with full backwards compatibility, which I admit
must be a main objective.
But I do think we should at the same time make a decision about where we
want to go with this, and prepare the code accordingly.

Once again we need to make use of the version information from the other
end, and send the packets according to the following:

if (remote_version < 1.7.xx){
    allans_current_proposal();
}else{
    long_term_solution();
}

My proposal is not as intrusive as Allan seems to think, but it does
involve both endpoints.

Brief summary of my proposal:
We send each packet without sequence number, without adding it to the
send queue, and without cloning the buffer. We also set the
non-sequence bit, so that the receiving side can catch the packets and
treat them separately.
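
As a rough illustration of the difference between the two send paths,
here is a small user-space model. It is a sketch only: the structs and
function names are hypothetical and do not correspond to the real TIPC
link code, and the hand-off to the bearer is left as a comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet and link models, for illustration only. */
struct pkt {
    bool non_seq;        /* non-sequence bit */
    unsigned int seqno;  /* link sequence number (unused if non_seq) */
    struct pkt *next;    /* send queue linkage */
};

struct link {
    unsigned int next_seqno;
    struct pkt *sendq_head;
    struct pkt *sendq_tail;
    size_t sendq_len;
};

/* Reliable path: number the packet and keep it queued until acked. */
static void link_send_reliable(struct link *l, struct pkt *p)
{
    p->non_seq = false;
    p->seqno = l->next_seqno++;
    p->next = NULL;
    if (l->sendq_tail)
        l->sendq_tail->next = p;
    else
        l->sendq_head = p;
    l->sendq_tail = p;
    l->sendq_len++;
    /* a clone of p would be handed to the bearer here */
}

/* Proposed path: no sequence number, no queueing, no clone. */
static void link_send_droppable(struct link *l, struct pkt *p)
{
    (void)l;             /* link state is deliberately left untouched */
    p->non_seq = true;
    p->seqno = 0;
    p->next = NULL;
    /* p itself would be handed to the bearer here */
}
```

The point is that the droppable path leaves the sequence number series
and the send queue exactly as they were.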

The tricky part here is that we currently allow users to send
source-droppable 66k messages, so we will need to provide support for
fragmentation/defragmentation. I think this may be a serious problem
with both proposals. With my proposal we would probably need to use a
second sequence-number series and a separate fragment reception queue
per link to be able to defragment.
With Allan's proposal I also see problems. What if a fragment in the
middle of the message is lost, and a dummy-packet is retransmitted?
It will never find its way to the de-fragmentation queue, so the
unfinished message will be lingering until it is eventually cleaned up
by the timer. Have you tested this, Allan? What if the loss-frequency is
very high? This must be investigated.

///jon


Stephens, Allan wrote:
> Jon Maloy had previously suggested modifying TIPC's link protocol so 
> source droppable messages are sent out-of-band (i.e. without sequence 
> number information), but this would involve a more substantial change 
> to the link protocol that would affect both ends of the link.  We may 
> still want to go this route in the future, but I'm trying to see what 
> can be done in the short term to address the performance concerns that
> have been raised by numerous TIPC users ...
>
> Regards,
> Al
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> tipc-discussion mailing list
> tipc-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
>   

