Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-11-04 Thread Bendik Rønning Opstad
On Monday, November 02, 2015 09:37:54 AM David Laight wrote:
> From: Bendik Rønning Opstad
> > Sent: 23 October 2015 21:50
> > RDB is a mechanism that enables a TCP sender to bundle redundant
> > (already sent) data with TCP packets containing new data. By bundling
> > (retransmitting) already sent data with each TCP packet containing new
> > data, the connection will be more resistant to sporadic packet loss
> > which reduces the application layer latency significantly in congested
> > scenarios.
> 
> What sort of traffic flows do you expect this to help?

As mentioned in the cover letter, RDB is aimed at reducing the
latencies for "thin-stream" traffic often produced by
latency-sensitive applications. This blog post describes RDB and the
underlying motivation:
http://mlab.no/blog/2015/10/redundant-data-bundling-in-tcp

Further information is available in the links referred to in the blog
post.

> An ssh (or similar) connection will get additional data to send,
> but that sort of data flow needs Nagle in order to reduce the
> number of packets sent.

Whether an application needs to reduce the number of packets sent
depends on whom you ask. If low latency is a high priority for the
application, it may need to increase the number of packets sent by
disabling Nagle in order to reduce the segments' sojourn times on the
sender side.

As for SSH clients, it seems OpenSSH disables Nagle for interactive
sessions.
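
For reference, disabling Nagle from the application is a one-line
setsockopt; a minimal sketch (the helper name is just for illustration):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Disable Nagle on a connected TCP socket so that small writes are
     * transmitted immediately instead of being held back for coalescing. */
    static int disable_nagle(int sockfd)
    {
            int one = 1;

            return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY,
                              &one, sizeof(one));
    }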

> OTOH it might benefit from including unacked data if the Nagle
> timer expires.
> Being able to set the Nagle timer on a per-connection basis
> (or maybe using something based on the RTT instead of 2 secs)
> might make packet loss less problematic.

There is no timer for Nagle? The current (Minshall-variant)
implementation holds back a small segment as long as the previously
transmitted packet was small and has not yet been ACKed.
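
To be precise about the rule, here is a simplified model of that check
(not the kernel code itself): a new small segment is held back only
while the most recently sent small segment is still unacknowledged.

    #include <stdbool.h>
    #include <stdint.h>

    /* Simplified model of the Minshall variant of Nagle.
     * snd_sml: end sequence number of the last small segment sent.
     * snd_una: first unacknowledged sequence number.
     * A new small segment may be sent once no previously sent small
     * segment is still outstanding. */
    static bool small_segment_allowed(uint32_t snd_sml, uint32_t snd_una)
    {
            /* signed difference handles sequence number wrap-around */
            return (int32_t)(snd_sml - snd_una) <= 0;
    }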

> Data flows that already have Nagle disabled (probably anything that
> isn't command-response and isn't unidirectional bulk data) are
> likely to generate a lot of packets within the RTT.

How many packets such applications need to transmit for optimal
latency varies greatly. Packets per RTT is not a very useful metric in
this regard, considering the strict dependency on the RTT.

This is why we propose a dynamic packets-in-flight limit (DPIFL) that
indirectly relies on the application's write frequency, i.e. how often
the application performs write system calls. This limit is used to
ensure that only applications that write data less frequently than a
certain limit may utilize RDB.
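
A rough sketch of the idea (the names and the averaging below are
illustrative only, not the code in the patch): track a smoothed
inter-transmission time per connection and permit bundling only while
it stays above a configured minimum.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-connection state for the dynamic limit
     * (initialisation is omitted for brevity). */
    struct rdb_itt_state {
            uint64_t avg_itt_us;    /* smoothed inter-transmission time */
            uint64_t last_tx_us;    /* time of the previous transmission */
    };

    /* Update the average on every transmission and decide whether the
     * flow is "thin" enough to be allowed to bundle. min_itt_us would
     * correspond to the proposed 10 ms lower bound. */
    static bool rdb_may_bundle(struct rdb_itt_state *s, uint64_t now_us,
                               uint64_t min_itt_us)
    {
            uint64_t itt = now_us - s->last_tx_us;

            s->last_tx_us = now_us;
            /* exponentially weighted average: 7/8 old + 1/8 new sample */
            s->avg_itt_us = s->avg_itt_us - (s->avg_itt_us >> 3) + (itt >> 3);

            return s->avg_itt_us >= min_itt_us;
    }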

> Resending unacked data will just eat into available network bandwidth
> and could easily make any congestion worse.
>
> I think that means you shouldn't resend data more than once, and/or
> should make sure that the resent data isn't a significant overhead
> on the packet being sent.

It is important to remember what type of traffic flows we are
discussing. The applications RDB aims to help produce
application-limited flows that transmit small amounts of data, both in
terms of payload per packet and packets per second.

Analysis of traces from latency-sensitive applications producing
traffic with thin-stream characteristics shows inter-transmission
times ranging from a few ms (typically 20-30 ms on average) to many
hundred ms.
(http://mlab.no/blog/2015/10/redundant-data-bundling-in-tcp/#thin_streams)

Increasing the amount of transmitted data will certainly contribute to
congestion to some degree, but it is not (necessarily) an unreasonable
trade-off considering the relatively small amounts of data such
applications transmit compared to greedy flows.

RDB does not cause more packets to be sent through the network, as it
uses available "free" space in packets already scheduled for
transmission. With a bundling limit of only one previous segment, the
payload bandwidth requirement is at most doubled; accounting for
headers, the relative increase is smaller.
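
To make the "free space" point concrete, here is a toy sketch of the
bundling step (illustrative only, not the patch code): the previously
sent, still unacked segment is copied in front of the new data only if
everything fits within one MSS, so the number of packets on the wire
is unchanged.

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Bundle at most one previously sent (still unacked) segment ahead
     * of the new data, and only when the combined payload fits in one
     * MSS. Returns true if bundling was performed. */
    static bool rdb_try_bundle_one(const char *prev, size_t prev_len,
                                   const char *new_data, size_t new_len,
                                   char *pkt, size_t mss, size_t *pkt_len)
    {
            if (prev_len == 0 || prev_len + new_len > mss)
                    return false;   /* nothing to bundle or no room */

            memcpy(pkt, prev, prev_len);                /* redundant bytes */
            memcpy(pkt + prev_len, new_data, new_len);  /* new bytes */
            *pkt_len = prev_len + new_len;
            return true;
    }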

Even after doubling the bandwidth requirement of an application that
produces relatively little data, we still end up with a low bandwidth
requirement. The suggested lower bound on the inter-transmission time
is 10 ms, meaning that an application which writes data more
frequently than every 10 ms (on average) will not be allowed to
utilize RDB.
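
As a rough worked example, using figures similar to those above
(120-byte segments, one write every 30 ms on average):

    without RDB:                    120 bytes / 30 ms  ~= 4 kB/s of payload
    with RDB (one bundled segment): 240 bytes / 30 ms  ~= 8 kB/s of payload

The number of packets and the per-packet TCP/IP header overhead are
the same in both cases, so the total on-the-wire increase stays below
a factor of two.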

To what degree RDB affects competing traffic will of course depend on
the link capacity and the number of simultaneous flows utilizing RDB.
We have performed tests to assess how RDB affects competing traffic.
In one of the test scenarios, 10 RDB-enabled thin streams and 10
regular TCP thin streams compete against 5 greedy TCP flows over a
shared bottleneck limited to 5 Mbit/s. The results from this test show
that by bundling only one previous segment with each packet (segment
size: 120 bytes), the effect on the competing thin-stream traffic is
modest.
(http://mlab.no/blog/2015/10/redundant-data-bundling-in-tcp/#latency_test_with_cross_traffic).

Also relevant to the discussion is the paper "Reducing web latency:
the virtue of gentle aggression" (Flach et al., SIGCOMM 2013).

RE: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-11-02 Thread David Laight
From: Bendik Rønning Opstad
> Sent: 23 October 2015 21:50
> RDB is a mechanism that enables a TCP sender to bundle redundant
> (already sent) data with TCP packets containing new data. By bundling
> (retransmitting) already sent data with each TCP packet containing new
> data, the connection will be more resistant to sporadic packet loss
> which reduces the application layer latency significantly in congested
> scenarios.

What sort of traffic flows do you expect this to help?

An ssh (or similar) connection will get additional data to send,
but that sort of data flow needs Nagle in order to reduce the
number of packets sent.
OTOH it might benefit from including unacked data if the Nagle
timer expires.
Being able to set the Nagle timer on a per-connection basis
(or maybe using something based on the RTT instead of 2 secs)
might make packet loss less problematic.

Data flows that already have Nagle disabled (probably anything that
isn't command-response and isn't unidirectional bulk data) are
likely to generate a lot of packets within the RTT.
Resending unacked data will just eat into available network bandwidth
and could easily make any congestion worse.

I think that means you shouldn't resend data more than once, and/or
should make sure that the resent data isn't a significant overhead
on the packet being sent.

David




RE: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-11-02 Thread David Laight
From: Bendik Rønning Opstad
> Sent: 29 October 2015 22:54
...
> > > > The semantics of the tp->nonagle bits are already a bit complex. My
> > > > sense is that having a setsockopt of TCP_RDB transparently modify the
> > > > nagle behavior is going to add more extra complexity and unanticipated
> > > > behavior than is warranted given the slight possible gain in
> > > > convenience to the app writer. What about a model where the
> > > > application user just needs to remember to call
> > > > setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> > > > sensible? I see your nice tests at
> > > >
> > > >   https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> > > > are already doing that. And my sense is that likewise most
> > > > well-engineered "thin stream" apps will already be using
> > > > setsockopt(TCP_NODELAY). Is that workable?
> 
> This is definitely workable. I agree that it may not be an ideal solution to
> have TCP_RDB disable Nagle; however, it would be useful to have a way to
> easily enable RDB and disable Nagle.

If enabling RDB disables Nagle, then what happens when you turn RDB back off?

David



Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-29 Thread Bendik Rønning Opstad
On Monday, October 26, 2015 02:58:03 PM Yuchung Cheng wrote:
> On Mon, Oct 26, 2015 at 2:35 PM, Andreas Petlund  wrote:
> > > On 26 Oct 2015, at 15:50, Neal Cardwell  wrote:
> > > 
> > > On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad wrote:
> > >> @@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> > > ...
> > > 
> > >> +   case TCP_RDB:
> > >> +   if (val < 0 || val > 1) {
> > >> +   err = -EINVAL;
> > >> +   } else {
> > >> +   tp->rdb = val;
> > >> +   tp->nonagle = val;
> > > 
> > > The semantics of the tp->nonagle bits are already a bit complex. My
> > > sense is that having a setsockopt of TCP_RDB transparently modify the
> > > nagle behavior is going to add more extra complexity and unanticipated
> > > behavior than is warranted given the slight possible gain in
> > > convenience to the app writer. What about a model where the
> > > application user just needs to remember to call
> > > setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> > > sensible? I see your nice tests at
> > > 
> > >   https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> > > are already doing that. And my sense is that likewise most
> > > well-engineered "thin stream" apps will already be using
> > > setsockopt(TCP_NODELAY). Is that workable?

This is definitely workable. I agree that it may not be an ideal solution to
have TCP_RDB disable Nagle; however, it would be useful to have a way to
easily enable RDB and disable Nagle.

> > We have been discussing this a bit back and forth. Your suggestion would
> > be the right thing to keep the nagle semantics less complex and to
> > educate developers in the intrinsics of the transport.
> > 
> > We ended up choosing to implicitly disable nagle since it
> > 1) is incompatible with the logic of RDB.
> > 2) leaving it up to the developer to read the documentation and register
> > the line saying that "failing to set TCP_NODELAY will void the RDB
> > latency gain" will increase the chance of misconfigurations leading to
> > deployment with no effect.
> > 
> > The hope was to help both the well-engineered thin-stream apps and the
> > ones deployed by developers with less detailed knowledge of the
> > transport.
> but would RDB be voided if this developer turns on RDB then turns on
> Nagle later?

It would (to a large degree), but I believe that's ok? The intention behind
also disabling Nagle is not to remove control from the application writer, so
if TCP_RDB disables Nagle, they should not be prevented from explicitly
enabling Nagle after enabling RDB.

The idea is to make it as easy as possible for the application writer, and
since Nagle is on by default, it makes sense to change this behavior when the
application has indicated that it values low latencies.

Would a solution with multiple option values to TCP_RDB be acceptable? E.g.
0 = Disable
1 = Enable RDB
2 = Enable RDB and disable Nagle

If the sysctl tcp_rdb accepts the same values, setting the sysctl to 2 would
make it possible to use and test RDB (with Nagle off) on applications that
haven't explicitly disabled Nagle, which would make the sysctl tcp_rdb even
more useful.

Instead of having TCP_RDB modify Nagle, would it be better/acceptable to have a
separate socket option (e.g. TCP_THIN/TCP_THIN_LOW_LATENCY) that enables RDB and
disables Nagle? e.g.
0 = Use default system options?
1 = Enable RDB and disable Nagle

This would separate the modification of Nagle from the TCP_RDB socket option and
make it cleaner?

Such an option could also enable other latency-reducing options like
TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK:
2 = Enable RDB, TCP_THIN_LINEAR_TIMEOUTS, TCP_THIN_DUPACK, and disable Nagle
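
To illustrate how an application would opt in under such a scheme, here is a
sketch using the proposed multi-valued option (TCP_RDB and the value 2 are of
course only proposals from this thread, not existing options, and the numeric
define is an arbitrary placeholder):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    #ifndef TCP_RDB
    #define TCP_RDB 99      /* placeholder; the RFC patch defines the real value */
    #endif

    /* Request RDB with Nagle implicitly disabled (proposed value 2). */
    static int enable_rdb_low_latency(int sockfd)
    {
            int val = 2;

            return setsockopt(sockfd, IPPROTO_TCP, TCP_RDB,
                              &val, sizeof(val));
    }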

Bendik



Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-27 Thread Jonas Markussen
On 26 Oct 2015, at 22:58, Yuchung Cheng  wrote:
> but would RDB be voided if this developer turns on RDB then turns on
> Nagle later?

The short answer is "kind of".

My understanding is that Nagle will delay segments until they're
either MSS-sized or until segments "down the pipe" are acknowledged.

As RDB isn't able to bundle if the payload is more than MSS/2, only an
application that sends data less frequently than once per RTT would
still theoretically benefit from RDB even if Nagle is on.

However, in my opinion this is a scenario where Nagle itself is void:

If you transmit less frequently than once per RTT, enabling Nagle
makes no difference.

If you transmit more frequently than once per RTT, enabling Nagle
makes RDB void.

-Jonas


Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-26 Thread Yuchung Cheng
On Mon, Oct 26, 2015 at 2:35 PM, Andreas Petlund  wrote:
>
>
> > On 26 Oct 2015, at 15:50, Neal Cardwell  wrote:
> >
> > On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad
> >  wrote:
> >> @@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> > ...
> >> +   case TCP_RDB:
> >> +   if (val < 0 || val > 1) {
> >> +   err = -EINVAL;
> >> +   } else {
> >> +   tp->rdb = val;
> >> +   tp->nonagle = val;
> >
> > The semantics of the tp->nonagle bits are already a bit complex. My
> > sense is that having a setsockopt of TCP_RDB transparently modify the
> > nagle behavior is going to add more extra complexity and unanticipated
> > behavior than is warranted given the slight possible gain in
> > convenience to the app writer. What about a model where the
> > application user just needs to remember to call
> > setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> > sensible? I see your nice tests at
> >
> >   
> > https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> >
> > are already doing that. And my sense is that likewise most
> > well-engineered "thin stream" apps will already be using
> > setsockopt(TCP_NODELAY). Is that workable?
>
> We have been discussing this a bit back and forth. Your suggestion would be 
> the right thing to keep the nagle semantics less complex and to educate 
> developers in the intrinsics of the transport.
>
> We ended up choosing to implicitly disable nagle since it
> 1) is incompatible with the logic of RDB.
> 2) leaving it up to the developer to read the documentation and register the 
> line saying that "failing to set TCP_NODELAY will void the RDB latency gain" 
> will increase the chance of misconfigurations leading to deployment with no 
> effect.
>
> The hope was to help both the well-engineered thin-stream apps and the ones 
> deployed by developers with less detailed knowledge of the transport.
but would RDB be voided if this developer turns on RDB then turns on
Nagle later?

>
> -Andreas
>


Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-26 Thread Andreas Petlund

> On 26 Oct 2015, at 15:50, Neal Cardwell  wrote:
> 
> On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad
>  wrote:
>> @@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> ...
>> +   case TCP_RDB:
>> +   if (val < 0 || val > 1) {
>> +   err = -EINVAL;
>> +   } else {
>> +   tp->rdb = val;
>> +   tp->nonagle = val;
> 
> The semantics of the tp->nonagle bits are already a bit complex. My
> sense is that having a setsockopt of TCP_RDB transparently modify the
> nagle behavior is going to add more extra complexity and unanticipated
> behavior than is warranted given the slight possible gain in
> convenience to the app writer. What about a model where the
> application user just needs to remember to call
> setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> sensible? I see your nice tests at
> 
>   
> https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> 
> are already doing that. And my sense is that likewise most
> well-engineered "thin stream" apps will already be using
> setsockopt(TCP_NODELAY). Is that workable?

We have been discussing this a bit back and forth. Your suggestion would be the 
right thing to keep the nagle semantics less complex and to educate developers 
in the intrinsics of the transport.

We ended up choosing to implicitly disable nagle since it 
1) is incompatible with the logic of RDB.
2) leaving it up to the developer to read the documentation and register the 
line saying that "failing to set TCP_NODELAY will void the RDB latency gain" 
will increase the chance of misconfigurations leading to deployment with no 
effect.

The hope was to help both the well-engineered thin-stream apps and the ones 
deployed by developers with less detailed knowledge of the transport.

-Andreas



Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-26 Thread Neal Cardwell
On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad
 wrote:
>@@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
...
> +   case TCP_RDB:
> +   if (val < 0 || val > 1) {
> +   err = -EINVAL;
> +   } else {
> +   tp->rdb = val;
> +   tp->nonagle = val;

The semantics of the tp->nonagle bits are already a bit complex. My
sense is that having a setsockopt of TCP_RDB transparently modify the
nagle behavior is going to add more extra complexity and unanticipated
behavior than is warranted given the slight possible gain in
convenience to the app writer. What about a model where the
application user just needs to remember to call
setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
sensible? I see your nice tests at

   
https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b

are already doing that. And my sense is that likewise most
well-engineered "thin stream" apps will already be using
setsockopt(TCP_NODELAY). Is that workable?

neal