Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-24 Thread Stephen Hemminger
On Mon, 24 Jul 2017 16:41:12 -0700
Yuchung Cheng  wrote:

> On Mon, Jul 24, 2017 at 11:29 AM, Neal Cardwell  wrote:
> > On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng  wrote:  
> >> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell  
> >> wrote:  
> >>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  
> >>> wrote:  
> > ...  
>  What if we call the field tp->prior_cwnd? Then at least we'd have some
>  nice symmetry:
> 
>  - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon 
>  undo)
>  - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
>  restored upon undo)
> 
>  That sounds appealing to me. WDYT?  
> >>>
> >>> And, I should add, if we go with the tp->prior_cwnd approach, then we
> >>> can have a single "default"/CUBIC-style undo function, instead of 15
> >>> separate but identical implementations...  
> >> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
> >> nice consolidation work.  
> >
> > Yes, exactly.
> >
> > Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:
> >
> > tcp_bic.c:188:  return max(tp->snd_cwnd, ca->loss_cwnd);
> > tcp_cubic.c:378:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_dctcp.c:318:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_highspeed.c:165:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_illinois.c:309: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_nv.c:190:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_scalable.c:50:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> > tcp_veno.c:210: return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
> > tcp_yeah.c:232: return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);
> >
> > And if we fix this bug in tcp_reno_undo_cwnd() by referring to
> > ca->loss_cwnd then we will add another 6 like this.
> >
> > So my proposal would be
> >
> > - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
> > - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
> >restored upon undo)
> >
> > Actually, now that I re-read the code, we already do have a
> > prior_cwnd, which is used for the PRR code, and already set upon
> > entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
> > can do something like:
> >
> > diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> > index fde983f6376b..c2b174469645 100644
> > --- a/net/ipv4/tcp_cong.c
> > +++ b/net/ipv4/tcp_cong.c
> > @@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
> >  {
> > const struct tcp_sock *tp = tcp_sk(sk);
> >
> > -   return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
> > +   return max(tp->snd_cwnd, tp->prior_cwnd);
> >  }
> >  EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 2920e0cb09f8..ae790a84302d 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
> > !after(tp->high_seq, tp->snd_una) ||
> > (icsk->icsk_ca_state == TCP_CA_Loss && 
> > !icsk->icsk_retransmits)) {
> > tp->prior_ssthresh = tcp_current_ssthresh(sk);
> > +   tp->prior_cwnd = tp->snd_cwnd;
> > tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
> > tcp_ca_event(sk, CA_EVENT_LOSS);
> > tcp_init_undo(tp);
> >
> > And then change all the CC modules but BBR to use the
> > tcp_reno_undo_cwnd() instead of their own custom undo code.
> >
> > WDYT?  
> Looks reasonable. But we might want to eventually refactor TCP undo
> code: the stats changes (prior_ssthresh, prior_cwnd, undo_marker,
> undo_retrans) are scattered in different helpers, making the code hard
> to audit.

I like having common code as much as possible;
having per-CC undo means more variations and more sources of error.



Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-24 Thread Yuchung Cheng
On Mon, Jul 24, 2017 at 11:29 AM, Neal Cardwell  wrote:
> On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng  wrote:
>> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell  wrote:
>>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  
>>> wrote:
> ...
 What if we call the field tp->prior_cwnd? Then at least we'd have some
 nice symmetry:

 - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
 - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
 restored upon undo)

 That sounds appealing to me. WDYT?
>>>
>>> And, I should add, if we go with the tp->prior_cwnd approach, then we
>>> can have a single "default"/CUBIC-style undo function, instead of 15
>>> separate but identical implementations...
>> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
>> nice consolidation work.
>
> Yes, exactly.
>
> Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:
>
> tcp_bic.c:188:  return max(tp->snd_cwnd, ca->loss_cwnd);
> tcp_cubic.c:378:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_dctcp.c:318:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_highspeed.c:165:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_illinois.c:309: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_nv.c:190:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_scalable.c:50:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
> tcp_veno.c:210: return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
> tcp_yeah.c:232: return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);
>
> And if we fix this bug in tcp_reno_undo_cwnd() by referring to
> ca->loss_cwnd then we will add another 6 like this.
>
> So my proposal would be
>
> - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
> - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
>restored upon undo)
>
> Actually, now that I re-read the code, we already do have a
> prior_cwnd, which is used for the PRR code, and already set upon
> entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
> can do something like:
>
> diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> index fde983f6376b..c2b174469645 100644
> --- a/net/ipv4/tcp_cong.c
> +++ b/net/ipv4/tcp_cong.c
> @@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
>  {
> const struct tcp_sock *tp = tcp_sk(sk);
>
> -   return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
> +   return max(tp->snd_cwnd, tp->prior_cwnd);
>  }
>  EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 2920e0cb09f8..ae790a84302d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
> !after(tp->high_seq, tp->snd_una) ||
> (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
> tp->prior_ssthresh = tcp_current_ssthresh(sk);
> +   tp->prior_cwnd = tp->snd_cwnd;
> tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
> tcp_ca_event(sk, CA_EVENT_LOSS);
> tcp_init_undo(tp);
>
> And then change all the CC modules but BBR to use the
> tcp_reno_undo_cwnd() instead of their own custom undo code.
>
> WDYT?
Looks reasonable. But we might want to eventually refactor the TCP undo
code: the state changes (prior_ssthresh, prior_cwnd, undo_marker,
undo_retrans) are scattered across different helpers, making the code hard
to audit.

>
> neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-24 Thread Neal Cardwell
On Mon, Jul 24, 2017 at 2:17 PM, Yuchung Cheng  wrote:
> On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell  wrote:
>> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  wrote:
...
>>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>>> nice symmetry:
>>>
>>> - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
>>> - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
>>> restored upon undo)
>>>
>>> That sounds appealing to me. WDYT?
>>
>> And, I should add, if we go with the tp->prior_cwnd approach, then we
>> can have a single "default"/CUBIC-style undo function, instead of 15
>> separate but identical implementations...
> you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
> nice consolidation work.

Yes, exactly.

Right now we have 9 modules that have identical tcp_foo_cwnd_undo functions:

tcp_bic.c:188:  return max(tp->snd_cwnd, ca->loss_cwnd);
tcp_cubic.c:378:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_dctcp.c:318:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_highspeed.c:165:return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_illinois.c:309: return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_nv.c:190:   return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_scalable.c:50:  return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
tcp_veno.c:210: return max(tcp_sk(sk)->snd_cwnd, veno->loss_cwnd);
tcp_yeah.c:232: return max(tcp_sk(sk)->snd_cwnd, yeah->loss_cwnd);

And if we fix this bug in tcp_reno_undo_cwnd() by referring to
ca->loss_cwnd then we will add another 6 like this.

So my proposal would be:

- tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
- tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored upon undo)

Actually, now that I re-read the code, we already do have a
prior_cwnd, which is used for the PRR code, and already set upon
entering CA_Recovery. So if we set prior_cwnd for CA_Loss, perhaps we
can do something like:

diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index fde983f6376b..c2b174469645 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -456,7 +456,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
 {
const struct tcp_sock *tp = tcp_sk(sk);

-   return max(tp->snd_cwnd, tp->snd_ssthresh << 1);
+   return max(tp->snd_cwnd, tp->prior_cwnd);
 }
 EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2920e0cb09f8..ae790a84302d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1951,6 +1951,7 @@ void tcp_enter_loss(struct sock *sk)
!after(tp->high_seq, tp->snd_una) ||
(icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
tp->prior_ssthresh = tcp_current_ssthresh(sk);
+   tp->prior_cwnd = tp->snd_cwnd;
tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
tcp_ca_event(sk, CA_EVENT_LOSS);
tcp_init_undo(tp);

And then change all the CC modules but BBR to use
tcp_reno_undo_cwnd() instead of their own custom undo code.

WDYT?

neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-24 Thread Yuchung Cheng
On Sun, Jul 23, 2017 at 7:37 PM, Neal Cardwell  wrote:
> On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  wrote:
>> On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng  wrote:
>>> On Fri,  Jul 21, 2017 at 1:46 PM, Neal Cardwell  
>>> wrote:
 On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>
> Hi Yuchung,
>
> This test scenario is only one example to trigger this bug. In general, as
> long as cwnd <4, the undo function has this bug.


 Yes, personally I agree that this seems like an issue that is general 
 enough
 to be worth fixing. In the sense that, if cwnd <4, then we may well be very
 congested. So we don't want to get hit by this bug wherein an undo of a 
 loss
 recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.

 Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
 bug.

 I guess in my mind the only question is whether we want to add a
 tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
 issue (i.e. make every CC module handle it the way CUBIC does), or (my
>>> I would prefer the former b/c loss_cwnd may not be universal TCP
>>> state, just like ssthresh carries no meaning in some CC (bbr). It also
>>> seems also more consistent with the recent change on undo
>>>
>>> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
>>> Author: Florian Westphal 
>>> Date:   Mon Nov 21 14:18:38 2016 +0100
>>>
>>> tcp: make undo_cwnd mandatory for congestion modules
>>>
>>
>> You are certainly right that it is more pure to keep a CC detail like
>> that inside the CC module.
>>
>> But it's a bit sad to me that we have 9 separate identical
>> implementations of a cwnd undo function, and that approach would add 6
>> more.
>>
>> We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
>> all CC modules use ssthresh.
>>
>> What if we call the field tp->prior_cwnd? Then at least we'd have some
>> nice symmetry:
>>
>> - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
>> - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
>> restored upon undo)
>>
>> That sounds appealing to me. WDYT?
>
> And, I should add, if we go with the tp->prior_cwnd approach, then we
> can have a single "default"/CUBIC-style undo function, instead of 15
> separate but identical implementations...
you mean all CC modules share one ca_ops->undo_cwnd function? sounds a
nice consolidation work.

>
> neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-23 Thread Neal Cardwell
On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  wrote:
> On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng  wrote:
>> On Fri,  Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
>>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:

 Hi Yuchung,

 This test scenario is only one example to trigger this bug. In general, as
 long as cwnd <4, the undo function has this bug.
>>>
>>>
>>> Yes, personally I agree that this seems like an issue that is general enough
>>> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
>>> congested. So we don't want to get hit by this bug wherein an undo of a loss
>>> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>>>
>>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
>>> bug.
>>>
>>> I guess in my mind the only question is whether we want to add a
>>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
>> I would prefer the former b/c loss_cwnd may not be universal TCP
>> state, just like ssthresh carries no meaning in some CC (bbr). It also
>> seems also more consistent with the recent change on undo
>>
>> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
>> Author: Florian Westphal 
>> Date:   Mon Nov 21 14:18:38 2016 +0100
>>
>> tcp: make undo_cwnd mandatory for congestion modules
>>
>
> You are certainly right that it is more pure to keep a CC detail like
> that inside the CC module.
>
> But it's a bit sad to me that we have 9 separate identical
> implementations of a cwnd undo function, and that approach would add 6
> more.
>
> We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
> all CC modules use ssthresh.
>
> What if we call the field tp->prior_cwnd? Then at least we'd have some
> nice symmetry:
>
> - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
> - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
> restored upon undo)
>
> That sounds appealing to me. WDYT?

And, I should add, if we go with the tp->prior_cwnd approach, then we
can have a single "default"/CUBIC-style undo function, instead of 15
separate but identical implementations...

neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-23 Thread Neal Cardwell
On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng  wrote:
> On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>>>
>>> Hi Yuchung,
>>>
>>> This test scenario is only one example to trigger this bug. In general, as
>>> long as cwnd <4, the undo function has this bug.
>>
>>
>> Yes, personally I agree that this seems like an issue that is general enough
>> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
>> congested. So we don't want to get hit by this bug wherein an undo of a loss
>> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>>
>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
>> bug.
>>
>> I guess in my mind the only question is whether we want to add a
>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
> I would prefer the former b/c loss_cwnd may not be universal TCP
> state, just like ssthresh carries no meaning in some CC (bbr). It also
> seems also more consistent with the recent change on undo
>
> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
> Author: Florian Westphal 
> Date:   Mon Nov 21 14:18:38 2016 +0100
>
> tcp: make undo_cwnd mandatory for congestion modules
>

You are certainly right that it is more pure to keep a CC detail like
that inside the CC module.

But it's a bit sad to me that we have 9 separate identical
implementations of a cwnd undo function, and that approach would add 6
more.

We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
all CC modules use ssthresh.

What if we call the field tp->prior_cwnd? Then at least we'd have some
nice symmetry:

- tp->snd_cwnd, which is saved in tp->prior_cwnd (and restored upon undo)
- tp->snd_ssthresh, which is saved in tp->prior_ssthresh (and restored upon undo)

That sounds appealing to me. WDYT?

neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Yuchung Cheng
On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>>
>> Hi Yuchung,
>>
>> This test scenario is only one example to trigger this bug. In general, as
>> long as cwnd <4, the undo function has this bug.
>
>
> Yes, personally I agree that this seems like an issue that is general enough
> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
> congested. So we don't want to get hit by this bug wherein an undo of a loss
> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>
> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
> bug.
>
> I guess in my mind the only question is whether we want to add a
> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
> issue (i.e. make every CC module handle it the way CUBIC does), or (my
I would prefer the former b/c loss_cwnd may not be universal TCP
state, just like ssthresh carries no meaning in some CCs (bbr). It also
seems more consistent with the recent change to undo:

commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
Author: Florian Westphal 
Date:   Mon Nov 21 14:18:38 2016 +0100

tcp: make undo_cwnd mandatory for congestion modules

> preference) just add a tp->loss_cwnd field so we can use shared code in
> tcp_reno_undo_cwnd() to get this right across all CC modules.
>
> neal
>


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Neal Cardwell
On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>
> Hi Yuchung,
>
> This test scenario is only one example to trigger this bug. In general, as 
> long as cwnd <4, the undo function has this bug.

Yes, personally I agree that this seems like an issue that is general
enough to be worth fixing, in the sense that if cwnd < 4 then we may
well be very congested. So we don't want to get hit by this bug,
wherein an undo of a loss recovery can cause cwnd to suddenly jump
(from 1, 2, or 3) up to 4.

It seems like all of the CCs that use tcp_reno_undo_cwnd() have this bug.

I guess in my mind the only question is whether we want to add a
tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle
this issue (i.e. make every CC module handle it the way CUBIC does),
or (my preference) just add a tp->loss_cwnd field so we can use shared
code in tcp_reno_undo_cwnd() to get this right across all CC modules.

neal


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Lisong Xu

Hi Yuchung,

This test scenario is only one example that triggers this bug. In
general, as long as cwnd < 4, the undo function has this bug.


This would not be a problem for a normal network. But it might be an issue
if the network is highly congested (e.g., many TCP flows each with a small
cwnd < 4). In that case, the bug may mistakenly double the
sending rate of each flow, making a highly congested network even more
congested, similar to congestion collapse. This is actually why we
need congestion control algorithms in the first place.


Thanks
Lisong

On 7/21/2017 12:59 PM, Yuchung Cheng wrote:

On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun  wrote:

Hi Yuchung,

Sorry for the confusion.  The test case was adapted from an old DSACK
test case (i.e., forget to remove something).

Attached is a new and simple one. Thanks

Note that the test scenario is fairly rare IMO: the connection first
experience timeouts, then its retransmission got acked, then the
original packets get Acked (ack w/ val 1400 ecr 130). It can be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.




On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:

On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:

Hi there,

We find a buggy behavior when using Linux TCP Reno and HTCP in low
bandwidth or highly congested network environments.

In a simple word, their undo functions may mistakenly double the cwnd,
leading to a more aggressive behavior in a highly congested scenario.


The detailed reason:

The current reno undo function assumes cwnd halving (and thus doubles
the cwnd), but it doesn't consider a corner case condition that
ssthresh is at least 2.

e.g.,
                  cwnd  ssthresh
An initial state:   2      5
A spurious loss:    1      2
Undo:               4      5

Here the cwnd after undo is two times as that before undo. Attached is
a simple script to reproduce it.

the packetdrill script is a bit confusing: it disables SACK but then
the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
the sender isn't technically going through a fast recovery...

could you provide a better test?


A similar reason for HTCP, so we recommend to store the cwnd on loss
in .ssthresh implementation and restore it again in .undo_cwnd for TCP
Reno and HTCP implementations.

Thanks





Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-21 Thread Yuchung Cheng
On Thu, Jul 20, 2017 at 2:28 PM, Wei Sun  wrote:
> Hi Yuchung,
>
> Sorry for the confusion.  The test case was adapted from an old DSACK
> test case (i.e., forget to remove something).
>
> Attached is a new and simple one. Thanks
Note that the test scenario is fairly rare IMO: the connection first
experiences timeouts, then its retransmissions get acked, and then the
original packets get acked (the ack w/ val 1400 ecr 130). It could be really
long reordering, or reordering plus packet loss.

The Linux undo state machines may not handle this perfectly, but it's
probably not worth extra state for such rare events.

>
>
>
> On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:
>> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:
>>> Hi there,
>>>
>>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>>> bandwidth or highly congested network environments.
>>>
>>> In a simple word, their undo functions may mistakenly double the cwnd,
>>> leading to a more aggressive behavior in a highly congested scenario.
>>>
>>>
>>> The detailed reason:
>>>
>>> The current reno undo function assumes cwnd halving (and thus doubles
>>> the cwnd), but it doesn't consider a corner case condition that
>>> ssthresh is at least 2.
>>>
>>> e.g.,
>>>                   cwnd  ssthresh
>>> An initial state:   2      5
>>> A spurious loss:    1      2
>>> Undo:               4      5
>>>
>>> Here the cwnd after undo is two times as that before undo. Attached is
>>> a simple script to reproduce it.
>> the packetdrill script is a bit confusing: it disables SACK but then
>> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
>> the sender isn't technically going through a fast recovery...
>>
>> could you provide a better test?
>>
>>>
>>> A similar reason for HTCP, so we recommend to store the cwnd on loss
>>> in .ssthresh implementation and restore it again in .undo_cwnd for TCP
>>> Reno and HTCP implementations.
>>>
>>> Thanks


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-20 Thread Wei Sun
Hi Yuchung,

Sorry for the confusion.  The test case was adapted from an old DSACK
test case (i.e., we forgot to remove something).

Attached is a new and simple one. Thanks



On Wed, Jul 19, 2017 at 2:31 PM, Yuchung Cheng  wrote:
> On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:
>> Hi there,
>>
>> We find a buggy behavior when using Linux TCP Reno and HTCP in low
>> bandwidth or highly congested network environments.
>>
>> In a simple word, their undo functions may mistakenly double the cwnd,
>> leading to a more aggressive behavior in a highly congested scenario.
>>
>>
>> The detailed reason:
>>
>> The current reno undo function assumes cwnd halving (and thus doubles
>> the cwnd), but it doesn't consider a corner case condition that
>> ssthresh is at least 2.
>>
>> e.g.,
>>                   cwnd  ssthresh
>> An initial state:   2      5
>> A spurious loss:    1      2
>> Undo:               4      5
>>
>> Here the cwnd after undo is two times as that before undo. Attached is
>> a simple script to reproduce it.
> the packetdrill script is a bit confusing: it disables SACK but then
> the client returns ACK w/ SACKs, also 3 dupacks happen after RTO so
> the sender isn't technically going through a fast recovery...
>
> could you provide a better test?
>
>>
>> A similar reason for HTCP, so we recommend to store the cwnd on loss
>> in .ssthresh implementation and restore it again in .undo_cwnd for TCP
>> Reno and HTCP implementations.
>>
>> Thanks


TSundo-2-1-4.pkt
Description: Binary data


Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-19 Thread Yuchung Cheng
On Tue, Jul 18, 2017 at 2:36 PM, Wei Sun  wrote:
> Hi there,
>
> We find a buggy behavior when using Linux TCP Reno and HTCP in low
> bandwidth or highly congested network environments.
>
> In a simple word, their undo functions may mistakenly double the cwnd,
> leading to a more aggressive behavior in a highly congested scenario.
>
>
> The detailed reason:
>
> The current reno undo function assumes cwnd halving (and thus doubles
> the cwnd), but it doesn't consider a corner case condition that
> ssthresh is at least 2.
>
> e.g.,
>                   cwnd  ssthresh
> An initial state:   2      5
> A spurious loss:    1      2
> Undo:               4      5
>
> Here the cwnd after undo is two times as that before undo. Attached is
> a simple script to reproduce it.
The packetdrill script is a bit confusing: it disables SACK, but then
the client returns ACKs w/ SACKs; also, 3 dupacks happen after the RTO,
so the sender isn't technically going through fast recovery...

Could you provide a better test?

>
> A similar reason for HTCP, so we recommend to store the cwnd on loss
> in .ssthresh implementation and restore it again in .undo_cwnd for TCP
> Reno and HTCP implementations.
>
> Thanks


A buggy behavior for Linux TCP Reno and HTCP

2017-07-18 Thread Wei Sun
Hi there,

We found buggy behavior when using Linux TCP Reno and HTCP in
low-bandwidth or highly congested network environments.

Put simply, their undo functions may mistakenly double the cwnd,
leading to more aggressive behavior in an already congested scenario.


The detailed reason:

The current Reno undo function assumes the cwnd was halved on loss (and
thus doubles it on undo), but it doesn't consider the corner case that
ssthresh is clamped to a minimum of 2.

e.g.,
                  cwnd  ssthresh
An initial state:   2      5
A spurious loss:    1      2
Undo:               4      5

Here the cwnd after the undo is twice the cwnd before the loss. Attached is
a simple script to reproduce it.

A similar issue applies to HTCP, so we recommend storing the cwnd at loss
time in the .ssthresh implementation and restoring it in .undo_cwnd for the
TCP Reno and HTCP implementations.

Thanks


undo-2-1-4.pkt
Description: Binary data