Hi Jon,

As mentioned in the last meeting, we got a report from customer that TIPC has 
some latency increased (~ 30-50%) in message streaming, it's after backporting 
some recent patches. By only disabling/enabling the Nagle feature, we are now 
able to confirm this Nagle patch causing the latency issue.

We may need to turn this feature off by default (i.e. initializing 
'tsk->nodelay = true') but likely no one will try to turn it on again by 
setting 'TIPC_NODELAY' = false ☹. However, I wonder if that increase in latency 
is expected, since we declared that it must be < 1 RTT...?
Now when testing with our benchmark tool, I can indeed observe a large delay, 
some messages have been bundled and queued in the socket backlog for hundreds 
of milliseconds before they could be sent out when the required 'ACK' came. I 
have some concerns here:

1) On the receiving side, the required 'ACK' is only done when application 
actually gets the whole data. Even if only part of the last bundle is received, 
we will not do the 'ACK' as required. Why do we need to do that so strictly? 
This will cause some latency since the receiver doesn't always get the message 
as quick as the sender. I think we can separate it from the traditional 
'CONN_ACK' which is for the receiver buffer flow control, the Nagle 'ACK' 
should be done as soon as possible i.e. just after the link layer that 
indicates the receiving side has got the bundles, so the sending side can push 
the next ones if any?

2) The 'tsk->oneway' doesn't really make sense as it is only reset after the 
peer sends back a data message. In case there is only one side sending data on 
a socket (like our benchmark tool), the counter just keeps increasing, so we 
never leave the Nagle mode after entering... Do we need to force it to leave 
the Nagel mode somehow, for example: when receiving the Nagle 'ACK' without any 
bundles to be pushed?

3) Do you have any other ideas that could cause the latency increase?

BR/Tuong

-----Original Message-----
From: David Miller <da...@davemloft.net> 
Sent: Thursday, October 31, 2019 2:17 AM
To: jon.ma...@ericsson.com
Cc: net...@vger.kernel.org; l...@redhat.com; eduma...@google.com; 
tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] [net-next 1/1] tipc: add smart nagle feature

From: Jon Maloy <jon.ma...@ericsson.com>
Date: Wed, 30 Oct 2019 14:00:41 +0100

> We introduce a feature that works like a combination of TCP_NAGLE and
> TCP_CORK, but without some of the weaknesses of those. In particular,
> we will not observe long delivery delays because of delayed acks, since
> the algorithm itself decides if and when acks are to be sent from the
> receiving peer.
> 
> - The nagle property as such is determined by manipulating a new
>   'maxnagle' field in struct tipc_sock. If certain conditions are met,
>   'maxnagle' will define max size of the messages which can be bundled.
>   If it is set to zero no messages are ever bundled, implying that the
>   nagle property is disabled.
> - A socket with the nagle property enabled enters nagle mode when more
>   than 4 messages have been sent out without receiving any data message
>   from the peer.
> - A socket leaves nagle mode whenever it receives a data message from
>   the peer.
> 
> In nagle mode, messages smaller than 'maxnagle' are accumulated in the
> socket write queue. The last buffer in the queue is marked with a new
> 'ack_required' bit, which forces the receiving peer to send a CONN_ACK
> message back to the sender upon reception.
> 
> The accumulated contents of the write queue is transmitted when one of
> the following events or conditions occur.
> 
> - A CONN_ACK message is received from the peer.
> - A data message is received from the peer.
> - A SOCK_WAKEUP pseudo message is received from the link level.
> - The write queue contains more than 64 1k blocks of data.
> - The connection is being shut down.
> - There is no CONN_ACK message to expect. I.e., there is currently
>   no outstanding message where the 'ack_required' bit was set. As a
>   consequence, the first message added after we enter nagle mode
>   is always sent directly with this bit set.
> 
> This new feature gives a 50-100% improvement of throughput for small
> (i.e., less than MTU size) messages, while it might add up to one RTT
> to latency time when the socket is in nagle mode.
> 
> Acked-by: Ying Xue <ying....@windreiver.com>
> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>

Applied, thanks Jon.


_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to