Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-08 Thread Jonathan T. Looney
On Fri, Jun 8, 2018 at 12:37 PM, hiren panchasara <
hi...@strugglingcoder.info> wrote:
> ps: I know we are not killing the default stack as of yet and just
> stopping active maintenance of it but just wanted to raise these
> (probably obvious) points.

We absolutely are not stopping active maintenance of the "default" (soon to
be renamed "base") TCP stack. RACK is an alternative which might be useful
for some people. But, there is no suggestion that we should in any way
change the priority we give to maintaining the base TCP stack anytime in
the near future.

Jonathan
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-08 Thread hiren panchasara
On 06/07/18 at 08:07P, Matthew Macy wrote:
> >
> > Okay. I believe there might be situations where we may want to still
> > keep the 'default' stack alive. I know Windows doesn't yet use RACK when
> > rtt is lesser than 10ms (or something like that), as an example.
> >
> 
> Is there any reason RACK wouldn't work for tracking 10us RTTs? If we
> know we know the peer doesn't do delack or have enough data in flight
> and the other stack doesn't have broken LRO, could we just use this in
> lieu of high resolution timestamps?

I believe the issue is both ends having fine-grained timers for RACK to
be able to do the right thing. If timer resolution ends up being coarser
than rtt, just depending on rack may be problematic. I know 10ms is
doable on most systems but just using this as an example that we
probably want non-rack ('default') stack to be around a little longer
and possibly with the enhancements that we can easily extract out to be
shared among all the stacks.

Also RACK needing pacing which requires but more CPU so question for us
could be that do we want to keep the 'default' stack around for machines
that can't take that extra CPU hit.

SACK is inefficient in default stack and PRR could be super useful as
"psuedo" pacing mechanism and could help recover faster but at the end
of the day it all depends on someone with time/energy/motivation to
maintain the 'default' stack with all shiny things that appear in
non-default stacks.

cheers,
Hiren

ps: I know we are not killing the default stack as of yet and just
stopping active maintenance of it but just wanted to raise these
(probably obvious) points.


pgpd1Qw8EoOpa.pgp
Description: PGP signature


Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-07 Thread Matthew Macy
>
> Okay. I believe there might be situations where we may want to still
> keep the 'default' stack alive. I know Windows doesn't yet use RACK when
> rtt is lesser than 10ms (or something like that), as an example.
>


Is there any reason RACK wouldn't work for tracking 10us RTTs? If we
know we know the peer doesn't do delack or have enough data in flight
and the other stack doesn't have broken LRO, could we just use this in
lieu of high resolution timestamps?

-M
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-07 Thread hiren panchasara
On 06/07/18 at 08:58P, Randall Stewart wrote:
> 
> 
> > On Jun 7, 2018, at 6:01 PM, hiren panchasara  
> > wrote:
> > 
> > On 06/07/18 at 06:18P, Randall Stewart wrote:
> >> Author: rrs
> >> Date: Thu Jun  7 18:18:13 2018
> >> New Revision: 334804
> >> URL: https://svnweb.freebsd.org/changeset/base/334804
> >> 
> >> Log:
> >>  This commit brings in a new refactored TCP stack called Rack.
> >>  Rack includes the following features:
> >>   - A different SACK processing scheme (the old sack structures are not 
> >> used).
> >>   - RACK (Recent acknowledgment) where counting dup-acks is no longer done
> >>  instead time is used to knwo when to retransmit. (see the I-D)
> >>   - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt
> >>  to try not to take a retransmit time-out. (see the I-D)
> >>   - Burst mitigation using TCPHTPS
> >>   - PRR (partial rate reduction) see the RFC.
> >> 
> >>  Once built into your kernel, you can select this stack by either
> >>  socket option with the name of the stack is "rack" or by setting
> >>  the global sysctl so the default is rack.
> >> 
> >>  Note that any connection that does not support SACK will be kicked
> >>  back to the "default" base  FreeBSD stack (currently known as "default").
> >> 
> >>  To build this into your kernel you will need to enable in your
> >>  kernel:
> >> makeoptions WITH_EXTRA_TCP_STACKS=1
> >> options TCPHPTS
> >> 
> >>  Sponsored by: Netflix Inc.
> >>  Differential Revision:https://reviews.freebsd.org/D15525
> >> 
> >> Added:
> >>  head/sys/modules/tcp/rack/
> >>  head/sys/modules/tcp/rack/Makefile   (contents, props changed)
> >>  head/sys/netinet/tcp_stacks/rack.c   (contents, props changed)
> >>  head/sys/netinet/tcp_stacks/rack_bbr_common.h   (contents, props changed)
> >>  head/sys/netinet/tcp_stacks/sack_filter.c   (contents, props changed)
> >>  head/sys/netinet/tcp_stacks/sack_filter.h   (contents, props changed)
> >>  head/sys/netinet/tcp_stacks/tcp_rack.h   (contents, props changed)
> >> Modified:
> >>  head/sys/kern/uipc_sockbuf.c
> >>  head/sys/modules/tcp/Makefile
> >>  head/sys/netinet/tcp.h
> >>  head/sys/netinet/tcp_log_buf.h
> >>  head/sys/netinet/tcp_output.c
> >>  head/sys/netinet/tcp_stacks/fastpath.c
> >>  head/sys/netinet/tcp_timer.c
> >>  head/sys/netinet/tcp_timer.h
> >>  head/sys/netinet/tcp_var.h
> >>  head/sys/sys/mbuf.h
> >>  head/sys/sys/queue.h
> >>  head/sys/sys/sockbuf.h
> >>  head/sys/sys/time.h
> > 
> > I thought we'd have more time to review/test this. Looks like BSDCan
> > commit-spree in effect. :-)
> 
> The Phabricator review has been up since May 22nd. Thats over 2.5 weeks,
> this was also discussed on the Thursday conference calls.

Fair enough. (I am out of touch a little so shouldn't really complain ;-))
> > 
> > A few questions:
> > 1) Does RACK work reliably without HPTS? If yes, has that config been
> > tested?
> > 
> No it requires the pacer.
> 
> > 2) It looks like PRR is tied to RACK. Why did we go that route?
> > Shouldn't it be easily used with the 'default' stack also?
> > 
> 
> It is what I developed.. and I had no desire to work with the default stack. 
> That
> is a fifth rail that no one wants touched.
> 
> > 3) Can new SACK be used with the traditional stack?
> 
> Well if you want to rework the base stack you might be able to do that :)
> 
> It would be quite some effort.. I think Robert wants eventually the old
> stack to be de-composed and then slowly work at getting more common
> code between them until eventually you can have a diff and somehow
> figure out how to integrate the two.

Okay. I believe there might be situations where we may want to still
keep the 'default' stack alive. I know Windows doesn't yet use RACK when
rtt is lesser than 10ms (or something like that), as an example.

Such optimizations (PRR, better SACK) should be made available to that
also if we see non-RACK having a viable future.

> 
> > 
> > 4) Where should manpage like info for RACK go? a new man-page or
> > extending tcp(4)? Info like how to enable system-wide or per socket
> > should go here.
> > 
> 
> The enable/disable or per-socket I think is in with the pluggable stack
> stuff. We might want a Rack man page.. have to think about it.
> 
> > 5) Any perf numbers to go along with this commit? Synthetic or
> > production numbers showing improvements in transfer speed or any other
> > impact on CPU usage (specially with HPTS) that you can share?
> > 
> 
> CPU will be more but we see close to a drop in rebuffers by about 12% I am 
> told.

Great. Thanks.
> 
> > 6) In your testing, have you found cases where RACK does poorly compared
> > to the 'default' stack? Any recommendations on when should RACK be
> > enabled? (Something like this could go in the manpage.)
> 
> Nope. 

Okay. I assuming the 2 places this got tested at (NFLX and LLNW) would have
provided enough variety in terms of traffic types.

Thanks again,
Hiren


pgp1Ak9Vqu90k.pgp
Description: 

Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-07 Thread Randall Stewart via svn-src-all



> On Jun 7, 2018, at 6:01 PM, hiren panchasara  
> wrote:
> 
> On 06/07/18 at 06:18P, Randall Stewart wrote:
>> Author: rrs
>> Date: Thu Jun  7 18:18:13 2018
>> New Revision: 334804
>> URL: https://svnweb.freebsd.org/changeset/base/334804
>> 
>> Log:
>>  This commit brings in a new refactored TCP stack called Rack.
>>  Rack includes the following features:
>>   - A different SACK processing scheme (the old sack structures are not 
>> used).
>>   - RACK (Recent acknowledgment) where counting dup-acks is no longer done
>>  instead time is used to knwo when to retransmit. (see the I-D)
>>   - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt
>>  to try not to take a retransmit time-out. (see the I-D)
>>   - Burst mitigation using TCPHTPS
>>   - PRR (partial rate reduction) see the RFC.
>> 
>>  Once built into your kernel, you can select this stack by either
>>  socket option with the name of the stack is "rack" or by setting
>>  the global sysctl so the default is rack.
>> 
>>  Note that any connection that does not support SACK will be kicked
>>  back to the "default" base  FreeBSD stack (currently known as "default").
>> 
>>  To build this into your kernel you will need to enable in your
>>  kernel:
>> makeoptions WITH_EXTRA_TCP_STACKS=1
>> options TCPHPTS
>> 
>>  Sponsored by:   Netflix Inc.
>>  Differential Revision:  https://reviews.freebsd.org/D15525
>> 
>> Added:
>>  head/sys/modules/tcp/rack/
>>  head/sys/modules/tcp/rack/Makefile   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/rack.c   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/rack_bbr_common.h   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/sack_filter.c   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/sack_filter.h   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/tcp_rack.h   (contents, props changed)
>> Modified:
>>  head/sys/kern/uipc_sockbuf.c
>>  head/sys/modules/tcp/Makefile
>>  head/sys/netinet/tcp.h
>>  head/sys/netinet/tcp_log_buf.h
>>  head/sys/netinet/tcp_output.c
>>  head/sys/netinet/tcp_stacks/fastpath.c
>>  head/sys/netinet/tcp_timer.c
>>  head/sys/netinet/tcp_timer.h
>>  head/sys/netinet/tcp_var.h
>>  head/sys/sys/mbuf.h
>>  head/sys/sys/queue.h
>>  head/sys/sys/sockbuf.h
>>  head/sys/sys/time.h
> 
> I thought we'd have more time to review/test this. Looks like BSDCan
> commit-spree in effect. :-)

The Phabricator review has been up since May 22nd. Thats over 2.5 weeks,
this was also discussed on the Thursday conference calls.
> 
> A few questions:
> 1) Does RACK work reliably without HPTS? If yes, has that config been
> tested?
> 
No it requires the pacer.

> 2) It looks like PRR is tied to RACK. Why did we go that route?
> Shouldn't it be easily used with the 'default' stack also?
> 

It is what I developed.. and I had no desire to work with the default stack. 
That
is a fifth rail that no one wants touched.

> 3) Can new SACK be used with the traditional stack?

Well if you want to rework the base stack you might be able to do that :)

It would be quite some effort.. I think Robert wants eventually the old
stack to be de-composed and then slowly work at getting more common
code between them until eventually you can have a diff and somehow
figure out how to integrate the two.

> 
> 4) Where should manpage like info for RACK go? a new man-page or
> extending tcp(4)? Info like how to enable system-wide or per socket
> should go here.
> 

The enable/disable or per-socket I think is in with the pluggable stack
stuff. We might want a Rack man page.. have to think about it.



> 5) Any perf numbers to go along with this commit? Synthetic or
> production numbers showing improvements in transfer speed or any other
> impact on CPU usage (specially with HPTS) that you can share?
> 

CPU will be more but we see close to a drop in rebuffers by about 12% I am told.

> 6) In your testing, have you found cases where RACK does poorly compared
> to the 'default' stack? Any recommendations on when should RACK be
> enabled? (Something like this could go in the manpage.)

Nope. 

R

> 
> Glad to finally see this in -head!
> 
> Cheers,
> Hiren


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-07 Thread hiren panchasara
On 06/07/18 at 06:18P, Randall Stewart wrote:
> Author: rrs
> Date: Thu Jun  7 18:18:13 2018
> New Revision: 334804
> URL: https://svnweb.freebsd.org/changeset/base/334804
> 
> Log:
>   This commit brings in a new refactored TCP stack called Rack.
>   Rack includes the following features:
>- A different SACK processing scheme (the old sack structures are not 
> used).
>- RACK (Recent acknowledgment) where counting dup-acks is no longer done
>   instead time is used to knwo when to retransmit. (see the I-D)
>- TLP (Tail Loss Probe) where we will probe for tail-losses to attempt
>   to try not to take a retransmit time-out. (see the I-D)
>- Burst mitigation using TCPHTPS
>- PRR (partial rate reduction) see the RFC.
>   
>   Once built into your kernel, you can select this stack by either
>   socket option with the name of the stack is "rack" or by setting
>   the global sysctl so the default is rack.
>   
>   Note that any connection that does not support SACK will be kicked
>   back to the "default" base  FreeBSD stack (currently known as "default").
>   
>   To build this into your kernel you will need to enable in your
>   kernel:
>  makeoptions WITH_EXTRA_TCP_STACKS=1
>  options TCPHPTS
>   
>   Sponsored by:   Netflix Inc.
>   Differential Revision:  https://reviews.freebsd.org/D15525
> 
> Added:
>   head/sys/modules/tcp/rack/
>   head/sys/modules/tcp/rack/Makefile   (contents, props changed)
>   head/sys/netinet/tcp_stacks/rack.c   (contents, props changed)
>   head/sys/netinet/tcp_stacks/rack_bbr_common.h   (contents, props changed)
>   head/sys/netinet/tcp_stacks/sack_filter.c   (contents, props changed)
>   head/sys/netinet/tcp_stacks/sack_filter.h   (contents, props changed)
>   head/sys/netinet/tcp_stacks/tcp_rack.h   (contents, props changed)
> Modified:
>   head/sys/kern/uipc_sockbuf.c
>   head/sys/modules/tcp/Makefile
>   head/sys/netinet/tcp.h
>   head/sys/netinet/tcp_log_buf.h
>   head/sys/netinet/tcp_output.c
>   head/sys/netinet/tcp_stacks/fastpath.c
>   head/sys/netinet/tcp_timer.c
>   head/sys/netinet/tcp_timer.h
>   head/sys/netinet/tcp_var.h
>   head/sys/sys/mbuf.h
>   head/sys/sys/queue.h
>   head/sys/sys/sockbuf.h
>   head/sys/sys/time.h

I thought we'd have more time to review/test this. Looks like BSDCan
commit-spree in effect. :-)

A few questions:
1) Does RACK work reliably without HPTS? If yes, has that config been
tested?

2) It looks like PRR is tied to RACK. Why did we go that route?
Shouldn't it be easily used with the 'default' stack also?

3) Can new SACK be used with the traditional stack?

4) Where should manpage like info for RACK go? a new man-page or
extending tcp(4)? Info like how to enable system-wide or per socket
should go here.

5) Any perf numbers to go along with this commit? Synthetic or
production numbers showing improvements in transfer speed or any other
impact on CPU usage (specially with HPTS) that you can share?

6) In your testing, have you found cases where RACK does poorly compared
to the 'default' stack? Any recommendations on when should RACK be
enabled? (Something like this could go in the manpage.)

Glad to finally see this in -head!

Cheers,
Hiren


pgpeZ4e_GHrRL.pgp
Description: PGP signature


svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys

2018-06-07 Thread Randall Stewart
Author: rrs
Date: Thu Jun  7 18:18:13 2018
New Revision: 334804
URL: https://svnweb.freebsd.org/changeset/base/334804

Log:
  This commit brings in a new refactored TCP stack called Rack.
  Rack includes the following features:
   - A different SACK processing scheme (the old sack structures are not used).
   - RACK (Recent acknowledgment) where counting dup-acks is no longer done
  instead time is used to knwo when to retransmit. (see the I-D)
   - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt
  to try not to take a retransmit time-out. (see the I-D)
   - Burst mitigation using TCPHTPS
   - PRR (partial rate reduction) see the RFC.
  
  Once built into your kernel, you can select this stack by either
  socket option with the name of the stack is "rack" or by setting
  the global sysctl so the default is rack.
  
  Note that any connection that does not support SACK will be kicked
  back to the "default" base  FreeBSD stack (currently known as "default").
  
  To build this into your kernel you will need to enable in your
  kernel:
 makeoptions WITH_EXTRA_TCP_STACKS=1
 options TCPHPTS
  
  Sponsored by: Netflix Inc.
  Differential Revision:https://reviews.freebsd.org/D15525

Added:
  head/sys/modules/tcp/rack/
  head/sys/modules/tcp/rack/Makefile   (contents, props changed)
  head/sys/netinet/tcp_stacks/rack.c   (contents, props changed)
  head/sys/netinet/tcp_stacks/rack_bbr_common.h   (contents, props changed)
  head/sys/netinet/tcp_stacks/sack_filter.c   (contents, props changed)
  head/sys/netinet/tcp_stacks/sack_filter.h   (contents, props changed)
  head/sys/netinet/tcp_stacks/tcp_rack.h   (contents, props changed)
Modified:
  head/sys/kern/uipc_sockbuf.c
  head/sys/modules/tcp/Makefile
  head/sys/netinet/tcp.h
  head/sys/netinet/tcp_log_buf.h
  head/sys/netinet/tcp_output.c
  head/sys/netinet/tcp_stacks/fastpath.c
  head/sys/netinet/tcp_timer.c
  head/sys/netinet/tcp_timer.h
  head/sys/netinet/tcp_var.h
  head/sys/sys/mbuf.h
  head/sys/sys/queue.h
  head/sys/sys/sockbuf.h
  head/sys/sys/time.h

Modified: head/sys/kern/uipc_sockbuf.c
==
--- head/sys/kern/uipc_sockbuf.cThu Jun  7 18:06:01 2018
(r334803)
+++ head/sys/kern/uipc_sockbuf.cThu Jun  7 18:18:13 2018
(r334804)
@@ -1283,6 +1283,55 @@ sbsndptr(struct sockbuf *sb, u_int off, u_int len, u_i
return (ret);
 }
 
+struct mbuf *
+sbsndptr_noadv(struct sockbuf *sb, uint32_t off, uint32_t *moff)
+{
+   struct mbuf *m;
+
+   KASSERT(sb->sb_mb != NULL, ("%s: sb_mb is NULL", __func__));
+   if (sb->sb_sndptr == NULL || sb->sb_sndptroff > off) {
+   *moff = off;
+   if (sb->sb_sndptr == NULL) {
+   sb->sb_sndptr = sb->sb_mb;
+   sb->sb_sndptroff = 0;
+   }
+   return (sb->sb_mb);
+   } else {
+   m = sb->sb_sndptr;
+   off -= sb->sb_sndptroff;
+   }
+   *moff = off;
+   return (m);
+}
+
+void
+sbsndptr_adv(struct sockbuf *sb, struct mbuf *mb, uint32_t len)
+{
+   /*
+* A small copy was done, advance forward the sb_sbsndptr to cover
+* it.
+*/
+   struct mbuf *m;
+
+   if (mb != sb->sb_sndptr) {
+   /* Did not copyout at the same mbuf */
+   return;
+   }
+   m = mb;
+   while (m && (len > 0)) {
+   if (len >= m->m_len) {
+   len -= m->m_len;
+   if (m->m_next) {
+   sb->sb_sndptroff += m->m_len;
+   sb->sb_sndptr = m->m_next;
+   }
+   m = m->m_next;
+   } else {
+   len = 0;
+   }
+   }
+}
+
 /*
  * Return the first mbuf and the mbuf data offset for the provided
  * send offset without changing the "sb_sndptroff" field.

Modified: head/sys/modules/tcp/Makefile
==
--- head/sys/modules/tcp/Makefile   Thu Jun  7 18:06:01 2018
(r334803)
+++ head/sys/modules/tcp/Makefile   Thu Jun  7 18:18:13 2018
(r334804)
@@ -7,10 +7,12 @@ SYSDIR?=${SRCTOP}/sys
 
 SUBDIR=\
${_tcp_fastpath} \
+${_tcp_rack} \
${_tcpmd5} \
 
 .if ${MK_EXTRA_TCP_STACKS} != "no" || defined(ALL_MODULES)
 _tcp_fastpath= fastpath
+_tcp_rack= rack
 .endif
 
 .if (${MK_INET_SUPPORT} != "no" || ${MK_INET6_SUPPORT} != "no") || \

Added: head/sys/modules/tcp/rack/Makefile
==
--- /dev/null   00:00:00 1970   (empty, because file is newly added)
+++ head/sys/modules/tcp/rack/Makefile  Thu Jun  7 18:18:13 2018
(r334804)
@@ -0,0 +1,24 @@
+#
+# $FreeBSD$
+#
+
+.PATH: ${.CURDIR}/../../../netinet/tcp_stacks
+
+STACKNAME=