On 4/14/20 5:57 AM, Tuong Tong Lien wrote:
Hi Jon,

As mentioned in the last meeting, we got a report from a customer that TIPC
latency has increased (~30-50%) in message streaming after backporting some
recent patches. By simply disabling/enabling the Nagle feature, we have now been
able to confirm that this Nagle patch is causing the latency issue.

We may need to turn this feature off by default (i.e. initialize 'tsk->nodelay =
true'), but then likely no one will ever turn it on again by setting
'TIPC_NODELAY' to false ☹. However, I wonder whether that increase in latency is
expected, since we declared that it must be < 1 RTT...?
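
For reference, an application would toggle it per socket roughly like this
(minimal user-space sketch; it assumes <linux/tipc.h> already exports
TIPC_NODELAY as added by this patch, error handling mostly trimmed):

#include <stdio.h>
#include <sys/socket.h>		/* socket(), setsockopt(), AF_TIPC */
#include <linux/tipc.h>		/* TIPC_NODELAY (added by this patch) */

#ifndef SOL_TIPC
#define SOL_TIPC 271		/* TIPC socket option level, as in include/linux/socket.h */
#endif

int main(void)
{
	int sd = socket(AF_TIPC, SOCK_STREAM, 0);
	int nodelay = 0;	/* 0 = allow Nagle bundling, nonzero = disable it (like TCP_NODELAY) */

	if (sd < 0) {
		perror("socket");
		return 1;
	}
	if (setsockopt(sd, SOL_TIPC, TIPC_NODELAY, &nodelay, sizeof(nodelay)))
		perror("setsockopt(TIPC_NODELAY)");
	return 0;
}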
Now, when testing with our benchmark tool, I can indeed observe a large delay:
some messages are bundled and queued in the socket backlog for hundreds of
milliseconds before they can be sent out, once the required 'ACK' finally
arrives. I have some concerns here:

1) On the receiving side, the required 'ACK' is only sent when the application
actually reads the whole data. Even if only part of the last bundle has been
received, we do not send the 'ACK' as required. Why do we need to be so strict
about that? It adds latency, since the receiver does not always read the message
as quickly as the sender produces it. I think we can separate it from the
traditional 'CONN_ACK', which is for receiver buffer flow control; the Nagle
'ACK' should be sent as soon as possible, i.e. right after the link layer
indicates that the receiving side has got the bundles, so that the sending side
can push the next ones, if any?
Yes, I now remember that I thought about this myself; that we could send the ACK already in tipc_sk_filter_connect(), but I deemed that slightly more complex and assumed that sending the ACK in tipc_sk_recvstream() would be fast enough. I was clearly wrong in that decision. My concern was also that the RTT could become too short, so that fewer messages would be bundled, with lower throughput as a result.

Obviously we should try this change and see what it brings. The longer latency is unacceptable, so we must be ready to sacrifice some throughput in this case if necessary. We could even send the ACK earlier, directly from where the link adds the message to the inputq, but that would cost more CPU cycles, and I doubt it would make much difference.
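
Something along these lines should be enough for a first try (just a sketch, untested; I assume the accessor for the new 'ack_required' bit is called msg_ack_required(), and tipc_sk_send_ack() is our existing CONN_ACK helper in socket.c):

	/* In tipc_sk_filter_connect(), after the normal connection state
	 * checks: answer a requested Nagle ACK immediately, so the sender
	 * can release its next pending bundle instead of waiting until the
	 * application reads the data in tipc_sk_recvstream().
	 */
	if (msg_ack_required(buf_msg(skb)))
		tipc_sk_send_ack(tsk);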


2) The 'tsk->oneway' counter doesn't really make sense, as it is only reset after
the peer sends back a data message. If only one side is sending data on a socket
(as with our benchmark tool), the counter just keeps increasing, so we never
leave Nagle mode after entering it...
That is the point. Why would we want to leave nagle mode if traffic only goes in one direction?
Do we need to force it to leave Nagle mode somehow, for example when receiving
the Nagle 'ACK' without any bundles left to be pushed?
In practice we do that already. When we receive an ACK and the send queue is empty, we enter the state "no outstanding acks", and the next message will be sent immediately, with the need_ack bit set. But we are still in nagle mode for the following message, as we should be.
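
To spell out the sender side of that sequence (pseudocode sketch only; names like 'ack_outstanding' are placeholders, not the actual struct tipc_sock fields):

	if (nagle_active && ack_outstanding) {
		/* An earlier buffer with the ack_required bit is still
		 * unacknowledged: accumulate this message in the write queue.
		 */
		append_to_write_queue(msg);
	} else {
		/* "No outstanding acks": send at once and request a new ACK,
		 * so that the following messages can be bundled behind it.
		 */
		set_ack_required(msg);
		send_now(msg);
		ack_outstanding = true;
	}
	/* When the peer's ACK arrives, we flush whatever has accumulated and
	 * clear ack_outstanding, but the socket stays in Nagle mode.
	 */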

3) Do you have any other ideas that could cause the latency increase?
No. And I think your diagnosis makes perfect sense. I suggest you go ahead and move the sending of the ACK to tipc_sk_filter_connect() and observe the result.

BR
///jon



BR/Tuong

-----Original Message-----
From: David Miller <da...@davemloft.net>
Sent: Thursday, October 31, 2019 2:17 AM
To: jon.ma...@ericsson.com
Cc: net...@vger.kernel.org; l...@redhat.com; eduma...@google.com; 
tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] [net-next 1/1] tipc: add smart nagle feature

From: Jon Maloy <jon.ma...@ericsson.com>
Date: Wed, 30 Oct 2019 14:00:41 +0100

We introduce a feature that works like a combination of TCP_NAGLE and
TCP_CORK, but without some of the weaknesses of those. In particular,
we will not observe long delivery delays because of delayed acks, since
the algorithm itself decides if and when acks are to be sent from the
receiving peer.

- The nagle property as such is determined by manipulating a new
   'maxnagle' field in struct tipc_sock. If certain conditions are met,
   'maxnagle' will define max size of the messages which can be bundled.
   If it is set to zero no messages are ever bundled, implying that the
   nagle property is disabled.
- A socket with the nagle property enabled enters nagle mode when more
   than 4 messages have been sent out without receiving any data message
   from the peer.
- A socket leaves nagle mode whenever it receives a data message from
   the peer.

In nagle mode, messages smaller than 'maxnagle' are accumulated in the
socket write queue. The last buffer in the queue is marked with a new
'ack_required' bit, which forces the receiving peer to send a CONN_ACK
message back to the sender upon reception.

The accumulated contents of the write queue are transmitted when one of
the following events or conditions occurs.

- A CONN_ACK message is received from the peer.
- A data message is received from the peer.
- A SOCK_WAKEUP pseudo message is received from the link level.
- The write queue contains more than 64 1k blocks of data.
- The connection is being shut down.
- There is no CONN_ACK message to expect. I.e., there is currently
   no outstanding message where the 'ack_required' bit was set. As a
   consequence, the first message added after we enter nagle mode
   is always sent directly with this bit set.

This new feature gives a 50-100% improvement of throughput for small
(i.e., less than MTU size) messages, while it might add up to one RTT
to latency time when the socket is in nagle mode.
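
Condensed, the transmit trigger above amounts to a single predicate, roughly
like this (illustrative names only, not the actual fields in struct tipc_sock):

#include <stdbool.h>

struct nagle_view {			/* placeholder, not the real struct tipc_sock */
	bool conn_ack_received;
	bool data_msg_received;
	bool sock_wakeup_rcvd;
	int  backlog_1k_blocks;
	bool shutting_down;
	bool ack_outstanding;
};

static bool nagle_flush_needed(const struct nagle_view *s)
{
	return s->conn_ack_received   ||	/* CONN_ACK arrived from the peer      */
	       s->data_msg_received   ||	/* data message arrived from the peer  */
	       s->sock_wakeup_rcvd    ||	/* SOCK_WAKEUP came from the link level */
	       s->backlog_1k_blocks > 64 ||	/* write queue holds > 64 1k blocks    */
	       s->shutting_down       ||	/* connection is being shut down       */
	       !s->ack_outstanding;		/* no 'ack_required' message in flight */
}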

Acked-by: Ying Xue <ying....@windreiver.com>
Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>
Applied, thanks Jon.


_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


