On Tue, Sep 19, 2017 at 4:04 AM, Oleksandr Natalenko
<oleksa...@natalenko.name> wrote:
> Hi.
>
> 18.09.2017 23:40, Yuchung Cheng wrote:
>>
>> I assume this kernel does not have the patch that Neal proposed in his
>> first reply?
>
>
> Correct.
>
>> The main warning needs to be triggered by another peculiar SACK that
>> kicks the sender into recovery again (after undo). Please let it run
>> longer if possible to see if we can get both. But the new data does
>> indicate that we can (validly) be in CA_Open with retrans_out > 0.
>
>
> OK, here it is:
>
> ===
> » LC_TIME=C jctl -kb | grep RIP
> …
> Sep 19 12:54:03 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> Sep 19 12:54:22 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> Sep 19 12:54:25 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> Sep 19 12:56:00 defiant kernel: RIP: 0010:tcp_fastretrans_alert+0x7c8/0x990
> Sep 19 12:57:07 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> Sep 19 12:57:14 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> Sep 19 12:58:04 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
> …
> ===
>
> Note timestamps — two types of warning are distant in time, so didn't happen
> at once.
>
> While still running this kernel, anything else I can check for you?
Thanks. Based on all the experiments you did, I believe some code path
other than the one in my hypothesis is causing the warning:
1) Neal's proposed F-RTO fix didn't work
2) The main warning is not being triggered together with the newly
instrumented warning in undo
3) Disabling RACK stopped the warning
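
Point 2 can be checked mechanically against the log excerpt quoted above.
The following is only a minimal sketch in Python, assuming journal lines in
exactly the format shown in that excerpt; the helper names parse_rip_lines
and min_gap_between are made up for illustration, and the year is an
assumption taken from the date of this thread (the log lines carry none).

```python
import re
from datetime import datetime

# The seven RIP lines copied from the log excerpt quoted above.
LOG = """\
Sep 19 12:54:03 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
Sep 19 12:54:22 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
Sep 19 12:54:25 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
Sep 19 12:56:00 defiant kernel: RIP: 0010:tcp_fastretrans_alert+0x7c8/0x990
Sep 19 12:57:07 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
Sep 19 12:57:14 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
Sep 19 12:58:04 defiant kernel: RIP: 0010:tcp_undo_cwnd_reduction+0xbd/0xd0
"""

# Captures the timestamp and the symbol name that follows "RIP: <segment>:".
RIP_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: RIP: \w+:(\w+)\+"
)


def parse_rip_lines(text, year=2017):
    """Return (timestamp, symbol) pairs for every RIP line in text.

    The year is an assumption: the journal lines do not include one.
    """
    events = []
    for line in text.splitlines():
        m = RIP_RE.match(line)
        if not m:
            continue  # skip anything that is not a RIP line
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group(2)))
    return events


def min_gap_between(events, sym_a, sym_b):
    """Smallest gap in seconds between any sym_a and any sym_b event."""
    a = [t for t, s in events if s == sym_a]
    b = [t for t, s in events if s == sym_b]
    if not a or not b:
        return None  # one of the two warnings never fired
    return min(abs((x - y).total_seconds()) for x in a for y in b)


if __name__ == "__main__":
    events = parse_rip_lines(LOG)
    gap = min_gap_between(
        events, "tcp_undo_cwnd_reduction", "tcp_fastretrans_alert"
    )
    print(f"closest undo/fastretrans warnings are {gap:.0f} s apart")
```

For the seven lines quoted, the closest pair is 67 seconds apart
(12:56:00 vs 12:57:07), which is consistent with the two warnings not firing
together.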

We haven't been able to pin down exactly which one yet, so we'll do a bit
of code auditing first to find more suspects.
