Re: A buggy behavior for Linux TCP Reno and HTCP
On Mon, 24 Jul 2017 16:41:12 -0700 Yuchung Cheng wrote:
> On Mon, Jul 24, 2017 at 11:29 AM, Neal Cardwell wrote:
>> On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng wrote:
>>> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell wrote:
>>>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell wrote:
>>>>> ...
>>>>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>>>>> nice symmetry:
>>>>>
>>>>> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
>>>>> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>>>>>   upon undo)
>>>>>
>>>>> That sounds appealing to me. WDYT?
>>>>
>>>> And, I should add, if we go with the tp->prior_cwnd approach, then we
>>>> can have a single "default"/CUBIC-style undo function, instead of 15
>>>> separate but identical implementations...
>>> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
>>> nice consolidation work.
>>
>> Yes, exactly.
>>
>> Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:
>>
>> tcp_bic.c:188:       return max(tp->snd_cwnd, ca->loss_cwnd);
>> tcp_cubic.c:378:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_dctcp.c:318:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_highspeed.c:165: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_illinois.c:309:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_nv.c:190:        return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_scalable.c:50:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
>> tcp_veno.c:210:      return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
>> tcp_yeah.c:232:      return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);
>>
>> And if we fix this bug in tcp_reno_undo_cwnd() by referring to
>> ca->loss_cwnd then we will add another 6 like this.
>>
>> So my proposal would be
>>
>> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
>> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>>   upon undo)
>>
>> Actually, now that I re-read the code, we already do have a
>> prior_cwnd, which is used for the PRR code, and already set upon
>> entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
>> can do something like:
>>
>> diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
>> index fde983f6376b..c2b174469645 100644
>> --- a/net/ipv4/tcp_cong.c
>> +++ b/net/ipv4/tcp_cong.c
>> @@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
>>  {
>>         const struct tcp_sock *tp = tcp_sk(sk);
>>
>> -       return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
>> +       return max(tp->snd_cwnd, tp->prior_cwnd);
>>  }
>>  EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 2920e0cb09f8..ae790a84302d 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
>>             !after(tp->high_seq, tp->snd_una) ||
>>             (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
>>                 tp->prior_ssthresh = tcp_current_ssthresh(sk);
>> +               tp->prior_cwnd = tp->snd_cwnd;
>>                 tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
>>                 tcp_ca_event(sk, CA_EVENT_LOSS);
>>                 tcp_init_undo(tp);
>>
>> And then change all the CC modules but BBR to use the
>> tcp_reno_undo_cwnd() instead of their own custom undo code.
>>
>> WDYT?
> Looks reasonable. But we might want to eventually refactor TCP undo
> code: the stats changes (prior_ssthresh, prior_cwnd, undo_marker,
> undo_retrans) are scattered in different helpers, making the code hard
> to audit.

I like having common code as much as possible; having per-CC undo means
more variations and sources of errors.
Re: A buggy behavior for Linux TCP Reno and HTCP
On Mon, Jul 24, 2017 at 11:29 AM, Neal Cardwell wrote:
> On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng wrote:
>> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell wrote:
>>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell wrote:
>>>> ...
>>>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>>>> nice symmetry:
>>>>
>>>> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
>>>> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>>>>   upon undo)
>>>>
>>>> That sounds appealing to me. WDYT?
>>>
>>> And, I should add, if we go with the tp->prior_cwnd approach, then we
>>> can have a single "default"/CUBIC-style undo function, instead of 15
>>> separate but identical implementations...
>> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
>> nice consolidation work.
>
> Yes, exactly.
>
> Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:
>
> tcp_bic.c:188:       return max(tp->snd_cwnd, ca->loss_cwnd);
> tcp_cubic.c:378:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_dctcp.c:318:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_highspeed.c:165: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_illinois.c:309:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_nv.c:190:        return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_scalable.c:50:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_veno.c:210:      return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
> tcp_yeah.c:232:      return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);
>
> And if we fix this bug in tcp_reno_undo_cwnd() by referring to
> ca->loss_cwnd then we will add another 6 like this.
>
> So my proposal would be
>
> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>   upon undo)
>
> Actually, now that I re-read the code, we already do have a
> prior_cwnd, which is used for the PRR code, and already set upon
> entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
> can do something like:
>
> diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> index fde983f6376b..c2b174469645 100644
> --- a/net/ipv4/tcp_cong.c
> +++ b/net/ipv4/tcp_cong.c
> @@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
>  {
>         const struct tcp_sock *tp = tcp_sk(sk);
>
> -       return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
> +       return max(tp->snd_cwnd, tp->prior_cwnd);
>  }
>  EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 2920e0cb09f8..ae790a84302d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
>             !after(tp->high_seq, tp->snd_una) ||
>             (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
>                 tp->prior_ssthresh = tcp_current_ssthresh(sk);
> +               tp->prior_cwnd = tp->snd_cwnd;
>                 tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
>                 tcp_ca_event(sk, CA_EVENT_LOSS);
>                 tcp_init_undo(tp);
>
> And then change all the CC modules but BBR to use the
> tcp_reno_undo_cwnd() instead of their own custom undo code.
>
> WDYT?

Looks reasonable. But we might want to eventually refactor TCP undo
code: the stats changes (prior_ssthresh, prior_cwnd, undo_marker,
undo_retrans) are scattered in different helpers, making the code hard
to audit.

> neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng wrote:
> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell wrote:
>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell wrote:
>>> ...
>>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>>> nice symmetry:
>>>
>>> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
>>> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>>>   upon undo)
>>>
>>> That sounds appealing to me. WDYT?
>>
>> And, I should add, if we go with the tp->prior_cwnd approach, then we
>> can have a single "default"/CUBIC-style undo function, instead of 15
>> separate but identical implementations...
> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
> nice consolidation work.

Yes, exactly.

Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:

tcp_bic.c:188:       return max(tp->snd_cwnd, ca->loss_cwnd);
tcp_cubic.c:378:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_dctcp.c:318:     return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_highspeed.c:165: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_illinois.c:309:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_nv.c:190:        return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_scalable.c:50:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_veno.c:210:      return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
tcp_yeah.c:232:      return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);

And if we fix this bug in tcp_reno_undo_cwnd() by referring to
ca->loss_cwnd then we will add another 6 like this.

So my proposal would be

- tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
- tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
  upon undo)

Actually, now that I re-read the code, we already do have a
prior_cwnd, which is used for the PRR code, and already set upon
entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
can do something like:

diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index fde983f6376b..c2b174469645 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
 {
        const struct tcp_sock *tp = tcp_sk(sk);
 
-       return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
+       return max(tp->snd_cwnd, tp->prior_cwnd);
 }
 EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2920e0cb09f8..ae790a84302d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
            !after(tp->high_seq, tp->snd_una) ||
            (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
                tp->prior_ssthresh = tcp_current_ssthresh(sk);
+               tp->prior_cwnd = tp->snd_cwnd;
                tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
                tcp_ca_event(sk, CA_EVENT_LOSS);
                tcp_init_undo(tp);

And then change all the CC modules but BBR to use the
tcp_reno_undo_cwnd() instead of their own custom undo code.

WDYT?

neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell wrote:
> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell wrote:
>> On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng wrote:
>>> On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell wrote:
>>>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu wrote:
>>>>>
>>>>> Hi Yuchung,
>>>>>
>>>>> This test scenario is only one example to trigger this bug. In
>>>>> general, as long as cwnd < 4, the undo function has this bug.
>>>>
>>>> Yes, personally I agree that this seems like an issue that is general
>>>> enough to be worth fixing. In the sense that, if cwnd < 4, then we may
>>>> well be very congested. So we don't want to get hit by this bug wherein
>>>> an undo of a loss recovery can cause cwnd to suddenly jump (from 1, 2,
>>>> or 3) up to 4.
>>>>
>>>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have
>>>> this bug.
>>>>
>>>> I guess in my mind the only question is whether we want to add a
>>>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>>>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
>>> I would prefer the former b/c loss_cwnd may not be universal TCP
>>> state, just like ssthresh carries no meaning in some CC (bbr). It also
>>> seems more consistent with the recent change on undo
>>>
>>> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
>>> Author: Florian Westphal
>>> Date:   Mon Nov 21 14:18:38 2016 +0100
>>>
>>>     tcp: make undo_cwnd mandatory for congestion modules
>>>
>>
>> You are certainly right that it is more pure to keep a CC detail like
>> that inside the CC module.
>>
>> But it's a bit sad to me that we have 9 separate identical
>> implementations of a cwnd undo function, and that approach would add 6
>> more.
>>
>> We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
>> all CC modules use ssthresh.
>>
>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>> nice symmetry:
>>
>> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
>> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>>   upon undo)
>>
>> That sounds appealing to me. WDYT?
>
> And, I should add, if we go with the tp->prior_cwnd approach, then we
> can have a single "default"/CUBIC-style undo function, instead of 15
> separate but identical implementations...

you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
nice consolidation work.

> neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell wrote:
> On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng wrote:
>> On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell wrote:
>>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu wrote:
>>>>
>>>> Hi Yuchung,
>>>>
>>>> This test scenario is only one example to trigger this bug. In
>>>> general, as long as cwnd < 4, the undo function has this bug.
>>>
>>> Yes, personally I agree that this seems like an issue that is general
>>> enough to be worth fixing. In the sense that, if cwnd < 4, then we may
>>> well be very congested. So we don't want to get hit by this bug wherein
>>> an undo of a loss recovery can cause cwnd to suddenly jump (from 1, 2,
>>> or 3) up to 4.
>>>
>>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have
>>> this bug.
>>>
>>> I guess in my mind the only question is whether we want to add a
>>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
>> I would prefer the former b/c loss_cwnd may not be universal TCP
>> state, just like ssthresh carries no meaning in some CC (bbr). It also
>> seems more consistent with the recent change on undo
>>
>> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
>> Author: Florian Westphal
>> Date:   Mon Nov 21 14:18:38 2016 +0100
>>
>>     tcp: make undo_cwnd mandatory for congestion modules
>>
>
> You are certainly right that it is more pure to keep a CC detail like
> that inside the CC module.
>
> But it's a bit sad to me that we have 9 separate identical
> implementations of a cwnd undo function, and that approach would add 6
> more.
>
> We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
> all CC modules use ssthresh.
>
> What if we call the field tp->prior_cwnd? Then at least we'd have some
> nice symmetry:
>
> - tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
> - tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
>   upon undo)
>
> That sounds appealing to me. WDYT?

And, I should add, if we go with the tp->prior_cwnd approach, then we
can have a single "default"/CUBIC-style undo function, instead of 15
separate but identical implementations...

neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng wrote:
> On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell wrote:
>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu wrote:
>>>
>>> Hi Yuchung,
>>>
>>> This test scenario is only one example to trigger this bug. In
>>> general, as long as cwnd < 4, the undo function has this bug.
>>
>> Yes, personally I agree that this seems like an issue that is general
>> enough to be worth fixing. In the sense that, if cwnd < 4, then we may
>> well be very congested. So we don't want to get hit by this bug wherein
>> an undo of a loss recovery can cause cwnd to suddenly jump (from 1, 2,
>> or 3) up to 4.
>>
>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have
>> this bug.
>>
>> I guess in my mind the only question is whether we want to add a
>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
> I would prefer the former b/c loss_cwnd may not be universal TCP
> state, just like ssthresh carries no meaning in some CC (bbr). It also
> seems more consistent with the recent change on undo
>
> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
> Author: Florian Westphal
> Date:   Mon Nov 21 14:18:38 2016 +0100
>
>     tcp: make undo_cwnd mandatory for congestion modules
>

You are certainly right that it is more pure to keep a CC detail like
that inside the CC module.

But it's a bit sad to me that we have 9 separate identical
implementations of a cwnd undo function, and that approach would add 6
more.

We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
all CC modules use ssthresh.

What if we call the field tp->prior_cwnd? Then at least we'd have some
nice symmetry:

- tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
- tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored
  upon undo)

That sounds appealing to me. WDYT?

neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell wrote:
> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu wrote:
>>
>> Hi Yuchung,
>>
>> This test scenario is only one example to trigger this bug. In
>> general, as long as cwnd < 4, the undo function has this bug.
>
> Yes, personally I agree that this seems like an issue that is general
> enough to be worth fixing. In the sense that, if cwnd < 4, then we may
> well be very congested. So we don't want to get hit by this bug wherein
> an undo of a loss recovery can cause cwnd to suddenly jump (from 1, 2,
> or 3) up to 4.
>
> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have
> this bug.
>
> I guess in my mind the only question is whether we want to add a
> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
> issue (i.e. make every CC module handle it the way CUBIC does), or (my
I would prefer the former b/c loss_cwnd may not be universal TCP
state, just like ssthresh carries no meaning in some CC (bbr). It also
seems more consistent with the recent change on undo

commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
Author: Florian Westphal
Date:   Mon Nov 21 14:18:38 2016 +0100

    tcp: make undo_cwnd mandatory for congestion modules

> preference) just add a tp->loss_cwnd field so we can use shared code in
> tcp_reno_undo_cwnd() to get this right across all CC modules.
>
> neal
Re: A buggy behavior for Linux TCP Reno and HTCP
On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu wrote:
>
> Hi Yuchung,
>
> This test scenario is only one example to trigger this bug. In
> general, as long as cwnd < 4, the undo function has this bug.

Yes, personally I agree that this seems like an issue that is general
enough to be worth fixing. In the sense that, if cwnd < 4, then we may
well be very congested. So we don't want to get hit by this bug wherein
an undo of a loss recovery can cause cwnd to suddenly jump (from 1, 2,
or 3) up to 4.

Seems like any of the several CCs that use tcp_reno_undo_cwnd() have
this bug.

I guess in my mind the only question is whether we want to add a
tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
issue (i.e. make every CC module handle it the way CUBIC does), or (my
preference) just add a tp->loss_cwnd field so we can use shared code in
tcp_reno_undo_cwnd() to get this right across all CC modules.

neal
Re: A buggy behavior for Linux TCP Reno and HTCP
Hi Yuchung,

This test scenario is only one example to trigger this bug. In general,
as long as cwnd < 4, the undo function has this bug.

This would not be a problem for a normal network, but it might be an
issue if the network is highly congested (e.g., many TCP flows each
with cwnd < 4). In this case, the bug may mistakenly double the sending
rate of each flow, making a highly congested network even more
congested, similar to congestion collapse. This is actually why we need
congestion control algorithms in the first place.

Thanks
Lisong

On 7/21/2017 12:59 PM, Yuchung Cheng wrote:
> On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun wrote:
>> Hi Yuchung,
>>
>> Sorry for the confusion. The test case was adapted from an old DSACK
>> test case (i.e., forgot to remove something).
>>
>> Attached is a new and simple one. Thanks
> Note that the test scenario is fairly rare IMO: the connection first
> experiences timeouts, then its retransmission gets acked, then the
> original packets get acked (ack w/ val 1400 ecr 130). It can be really
> long reordering, or reordering plus packet loss.
>
> The Linux undo state machines may not handle this perfectly, but it's
> probably not worth extra state for such rare events.
>
>> On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng wrote:
>>> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun wrote:
>>>> Hi there,
>>>>
>>>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>>>> bandwidth or highly congested network environments.
>>>>
>>>> In a simple word, their undo functions may mistakenly double the cwnd,
>>>> leading to a more aggressive behavior in a highly congested scenario.
>>>>
>>>> The detailed reason:
>>>>
>>>> The current reno undo function assumes cwnd halving (and thus doubles
>>>> the cwnd), but it doesn't consider the corner case that ssthresh is
>>>> clamped to a minimum of 2.
>>>>
>>>> e.g.,
>>>>                      cwnd  ssthresh
>>>> An initial state:      2      5
>>>> A spurious loss:       1      2
>>>> Undo:                  4      5
>>>>
>>>> Here the cwnd after undo is twice what it was before the spurious
>>>> loss. Attached is a simple script to reproduce it.
>>> the packetdrill script is a bit confusing: it disables SACK but then
>>> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
>>> the sender isn't technically going through a fast recovery...
>>>
>>> could you provide a better test?
>>>
>>>> A similar reason for HTCP, so we recommend storing the cwnd on loss
>>>> in the .ssthresh implementation and restoring it again in .undo_cwnd
>>>> for the TCP Reno and HTCP implementations.
>>>>
>>>> Thanks
Re: A buggy behavior for Linux TCP Reno and HTCP
On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun wrote:
> Hi Yuchung,
>
> Sorry for the confusion. The test case was adapted from an old DSACK
> test case (i.e., forgot to remove something).
>
> Attached is a new and simple one. Thanks

Note that the test scenario is fairly rare IMO: the connection first
experiences timeouts, then its retransmission gets acked, then the
original packets get acked (ack w/ val 1400 ecr 130). It can be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.

> On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng wrote:
>> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun wrote:
>>> Hi there,
>>>
>>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>>> bandwidth or highly congested network environments.
>>>
>>> In a simple word, their undo functions may mistakenly double the cwnd,
>>> leading to a more aggressive behavior in a highly congested scenario.
>>>
>>> The detailed reason:
>>>
>>> The current reno undo function assumes cwnd halving (and thus doubles
>>> the cwnd), but it doesn't consider the corner case that ssthresh is
>>> clamped to a minimum of 2.
>>>
>>> e.g.,
>>>                      cwnd  ssthresh
>>> An initial state:      2      5
>>> A spurious loss:       1      2
>>> Undo:                  4      5
>>>
>>> Here the cwnd after undo is twice what it was before the spurious
>>> loss. Attached is a simple script to reproduce it.
>> the packetdrill script is a bit confusing: it disables SACK but then
>> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
>> the sender isn't technically going through a fast recovery...
>>
>> could you provide a better test?
>>
>>> A similar reason for HTCP, so we recommend storing the cwnd on loss
>>> in the .ssthresh implementation and restoring it again in .undo_cwnd
>>> for the TCP Reno and HTCP implementations.
>>>
>>> Thanks
Re: A buggy behavior for Linux TCP Reno and HTCP
Hi Yuchung,

Sorry for the confusion. The test case was adapted from an old DSACK
test case (i.e., forgot to remove something).

Attached is a new and simple one. Thanks

On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng wrote:
> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun wrote:
>> Hi there,
>>
>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>> bandwidth or highly congested network environments.
>>
>> In a simple word, their undo functions may mistakenly double the cwnd,
>> leading to a more aggressive behavior in a highly congested scenario.
>>
>> The detailed reason:
>>
>> The current reno undo function assumes cwnd halving (and thus doubles
>> the cwnd), but it doesn't consider the corner case that ssthresh is
>> clamped to a minimum of 2.
>>
>> e.g.,
>>                      cwnd  ssthresh
>> An initial state:      2      5
>> A spurious loss:       1      2
>> Undo:                  4      5
>>
>> Here the cwnd after undo is twice what it was before the spurious
>> loss. Attached is a simple script to reproduce it.
> the packetdrill script is a bit confusing: it disables SACK but then
> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
> the sender isn't technically going through a fast recovery...
>
> could you provide a better test?
>
>> A similar reason for HTCP, so we recommend storing the cwnd on loss
>> in the .ssthresh implementation and restoring it again in .undo_cwnd
>> for the TCP Reno and HTCP implementations.
>>
>> Thanks

[Attachment: TSundo-2-1-4.pkt (binary data)]
Re: A buggy behavior for Linux TCP Reno and HTCP
On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun wrote:
> Hi there,
>
> We find a buggy behavior when using Linux TCP Reno and HTCP in low
> bandwidth or highly congested network environments.
>
> In a simple word, their undo functions may mistakenly double the cwnd,
> leading to a more aggressive behavior in a highly congested scenario.
>
> The detailed reason:
>
> The current reno undo function assumes cwnd halving (and thus doubles
> the cwnd), but it doesn't consider the corner case that ssthresh is
> clamped to a minimum of 2.
>
> e.g.,
>                      cwnd  ssthresh
> An initial state:      2      5
> A spurious loss:       1      2
> Undo:                  4      5
>
> Here the cwnd after undo is twice what it was before the spurious
> loss. Attached is a simple script to reproduce it.

the packetdrill script is a bit confusing: it disables SACK but then
the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
the sender isn't technically going through a fast recovery...

could you provide a better test?

> A similar reason for HTCP, so we recommend storing the cwnd on loss
> in the .ssthresh implementation and restoring it again in .undo_cwnd
> for the TCP Reno and HTCP implementations.
>
> Thanks
A buggy behavior for Linux TCP Reno and HTCP
Hi there,

We found buggy behavior when using Linux TCP Reno and HTCP in low
bandwidth or highly congested network environments.

In short, their undo functions may mistakenly double the cwnd, leading
to more aggressive behavior in an already highly congested scenario.

The detailed reason:

The current reno undo function assumes the cwnd was halved on loss (and
thus doubles it on undo), but it doesn't account for the corner case
that ssthresh is clamped to a minimum of 2.

e.g.,
                   cwnd  ssthresh
An initial state:    2      5
A spurious loss:     1      2
Undo:                4      5

Here the cwnd after undo is twice what it was before the spurious loss.
Attached is a simple script to reproduce it.

For a similar reason this affects HTCP, so we recommend storing the
cwnd on loss in the .ssthresh implementation and restoring it again in
.undo_cwnd for the TCP Reno and HTCP implementations.

Thanks

[Attachment: undo-2-1-4.pkt (binary data)]